INTRODUCTION TO QUANTUM COMPUTATION AND INFORMATION
Editors
Hoi-Kwong Lo, MagiQ Technologies, Inc., New York
Sandu Popescu, University of Bristol & BRIMS, Hewlett-Packard Laboratories, Bristol
Tim Spiller, Hewlett-Packard Laboratories, Bristol
World Scientific: New Jersey, London, Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farrer Road, Singapore 912805. USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661. UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Library of Congress Cataloging-in-Publication Data: Lo, Hoi-Kwong. Introduction to Quantum computation and information / Hoi-Kwong Lo & Tim Spiller, Sandu Popescu. p. cm. Includes bibliographical references. ISBN 981023399X. ISBN 981024410X (pbk). 1. Quantum computers. I. Spiller, Tim. II. Popescu, Sandu. III. Title. QA76.889.L6 1998 004.1--dc21 98-31095 CIP
British Library Cataloguing-in-Publication Data: A catalogue record for this book is available from the British Library
First published 1998. Reprinted 1999, 2000
Copyright © 1998 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
Printed in Singapore by Uto-Print
FOREWORD
MICHAEL BERRY
H. H. Wills Physics Laboratory, University of Bristol, Tyndall Avenue, Bristol BS8 1TL, United Kingdom
In the nineteenth century, life was transformed by the conscious application of classical mechanics, in the form of Newton’s equations (and, later, thermodynamics) to the engines of the industrial revolution. In this century, a similar transformation has been wrought by electromagnetism, in generating and distributing electric power and communicating words and pictures across the world at the speed of light, in what should be seen as the conscious application of Maxwell’s equations. It is easy to predict that in the twenty-first century it will be quantum mechanics that influences all our lives.

There is a sense in which quantum mechanics is already having profound effects. Leon Lederman claims that a large part of the gross national product of the industrialized countries stems from quantum mechanics. I suppose he is referring to transistors (the ‘fundamental particles’ of modern electronics) that depend on properties of semiconducting materials designed by applying quantum mechanics to electrons in solids, and to lasers, where the Bose-Einstein statistics of identical particles generates coherent avalanches of photons to read the bar-codes in our supermarkets and guide delicate surgery in our eyes.

The most dramatic influences are, however, likely to come from the deliberate manipulation of entangled states. These arise naturally when Thomas Young’s superposition principle, familiar throughout wave physics since 1800, is generalized to states of more than one particle. Entanglement lies at the heart of the microscopic world as described by quantum mechanics, and makes it weirdly different from the world of our immediate experience. Schrödinger invented the idea of entanglement more than sixty years ago (although a version of it had appeared before, in the quantum states of identical particles); but only now, in a remarkable flowering of fin de siècle quantum mechanics, are its full implications being thoroughly and energetically explored.
There is much to understand; even a convincing measure of entanglement is lacking. In an entangled state of several particles, measurements on one particle can affect all the others, even if they are too far apart for a causal influence to propagate between them. These nonlocal (but not relativity-violating) actions are being incorporated into proposals for technologies that were hardly imagined twenty years ago. Most of the attention is being focused on quantum
computing, quantum cryptography, and teleportation. In quantum computing, information is manipulated not discretely, in the classical way, as a series of zeros and ones (bits), but as continuous superpositions (qubits) where the number of possibilities is vastly greater. In effect, many computations are performed simultaneously, and calculations that would be intractable classically (for example, factoring large integers) become feasible quantally. In this way, the theory of computation (indeed information science itself) is becoming a branch of physics rather than mathematics.

Easy factorization would destroy one of the commonly-used methods of encryption. However, the same entanglement employed in quantum computing makes possible the development of unbreakable shared codes, incorporating the intrinsic randomness of quantum mechanics. To my mind, this particular emphasis in the application of fundamental physics is depressing, because I regard the obsession with secrets in public life (as opposed to a commendable discretion about private matters) as one of the less attractive preoccupations of our fellow human beings.

Teleportation is the dissolution of an object in one place in a way that enables it to be perfectly reconstituted elsewhere. On the finest scale, the most complete specification of an object is its quantum state, but complete knowledge of that cannot be had: measurement of one property of the state irrevocably destroys information about complementary properties. However, suitable measurements can entangle the teleportee (whose state is unknown) with one of a previously entangled pair of systems, and thereby transfer the state to the other member of the pair, however far away that is, where another measurement (requiring information sent conventionally) can reconstruct it.

Each of these projects is visionary (technological fantasy, some say) but the principles have been demonstrated experimentally.
The big obstacle to further developments is ‘decoherence’, in which uncontrolled effects of the environment scramble the delicate phase correlations that embody quantum entanglement. Decoherence has plagued wave physics from its beginnings; it is what made Thomas Young’s superpositions, in his double-slit experiments with light, so hard to create and maintain (his experiments were carried out with candles!). States of many particles are even more fragile. Recent work suggests that the effects of decoherence might be reduced or eliminated by cleverly correcting errors as they arise.

This book is a record of these modern developments, a self-contained pedagogical account (perhaps the first) written by the world’s leading experts. Most of the chapters were ‘battle-tested’ in a series of lectures at Hewlett-Packard’s Basic Research Institute in the Mathematical Sciences (BRIMS) in Bristol, United Kingdom. That the lectures were sponsored by Hewlett-Packard indicates the intense industrial interest in a branch of theoretical, and, increasingly, experimental, physics that optimists (including me) believe is also a nascent technology.
PREFACE
HOI-KWONG LO
MagiQ Technologies, Inc., 275 Seventh Avenue, 26th Floor, New York, NY 10001-6708, USA
TIMOTHY P. SPILLER
Hewlett-Packard Laboratories, Bristol, Filton Road, Stoke Gifford, Bristol BS34 8QZ, United Kingdom
SANDU POPESCU
University of Bristol, Tyndall Avenue, Bristol BS8 1TL, United Kingdom and BRIMS, Hewlett-Packard Laboratories, Bristol, Filton Road, Stoke Gifford, Bristol BS34 8QZ, United Kingdom
About This Book: This book is an introduction to quantum information and what can be done with it: computation, teleportation and cryptography. Following the pioneering work by Richard Feynman, David Deutsch and others in the 1980s, there has been an explosive growth in the research activities in the whole area. It is now known that information processing based on the principles of quantum mechanics can be profoundly different and, in some cases, much more powerful than that based on classical mechanics. Apart from technological interests, quantum computation and related subjects also provide a concrete arena for investigating fundamental issues in quantum mechanics. Numerous exciting results and developments have emerged, experimental as well as theoretical, and many of them are discussed in this book. Topics covered include the non-locality of quantum mechanics, quantum teleportation, quantum computation, quantum cryptography, quantum error correction and fault-tolerant quantum computation, as well as some experimental aspects of quantum computation and quantum cryptography. This book aims to be a self-contained overview. Only knowledge of basic quantum mechanics is assumed. Density matrices, irreversibility, entanglement and the like are discussed rather than simply taken as read. Concepts and
ideas from computational complexity theory, cryptology and error correcting codes are also introduced as needed. While important basic results are presented in detail, some more involved technical details are excluded. (Extensive reference lists are given with each chapter for readers who wish to go further.) The book should be well-suited for a wide audience, ranging from beginning graduate students to advanced researchers. Clearly the whole field of quantum information is still developing; no doubt as you are reading this, researchers somewhere in the world will be endeavouring to push the boundaries beyond those reported here. Nevertheless, the contents of this book should continue to provide a good primer for the further exciting breakthroughs that will surely follow in years to come.

Although the focus of industrial research laboratories is clearly on the technology and products of the near future, it also pays to invest some effort towards potential new long term technology, through basic and “blue skies” research. In 1994 Hewlett-Packard set up BRIMS, the Basic Research Institute in the Mathematical Sciences,[a] which is part of Hewlett-Packard Laboratories Bristol (UK), the European arm of the Company’s corporate research structure. Members of BRIMS undertake basic research in various areas of mathematics and theoretical physics within this industrial setting. In addition, fundamental research in mathematics and physics is also carried out at Bristol in the Mathematics, Cryptography and Security Group,[b] another part of the main HP Laboratories, in conjunction with their applied and consultative activities. The future potential for a whole new quantum information technology is clearly very exciting (a point also highlighted by Michael Berry in his Foreword). Quantum information and computing are thus two of the basic research themes currently being pursued in BRIMS and the Mathematics, Cryptography and Security Group.
This book is based on a series of lectures organized by the three editors under the auspices of BRIMS, from November 1996 to April 1997. In addition to speakers writing chapters based on their lectures, other contributions have been included. This widens the scope of the book and thus presents a good balanced overview of quantum information processing in general.
About The Editors: Hoi-Kwong Lo is Chief Scientist and Senior Vice President, Research and Development of MagiQ Technologies, Inc.,[c] New York, a company founded in 1999 that focuses on the commercialization of quantum information technology in addition to basic research on the subject. From 1996 to 1999, he worked at Hewlett-Packard Labs, Bristol (UK), first as a Research Consultant and subsequently as a Senior Member of Technical Staff. From 1994 to 1996, he was a Member of the School of Natural Sciences of the Institute for Advanced Study, Princeton (USA). He obtained his B.A. in Mathematics (Triple First Class Honour) from Trinity College, Cambridge University (UK) in 1989 and his M.S. and Ph.D. in Physics from the California Institute of Technology, Pasadena (USA) in 1991 and 1994 respectively.

Tim Spiller is currently a Technical Consultant in the Mathematics, Cryptography and Security Group, Hewlett-Packard Laboratories, Bristol (UK) and Coordinator of QUIPROCONE, the European Network of Excellence for Quantum Information Processing and Communication.[d] He was formerly a UK Science and Engineering Research Council Research Fellow and a Royal Society Research Fellow in the Physics Department at the University of Sussex (UK). He got his B.A. in Physics from the University of Oxford (UK) in 1980 and his Ph.D. from the University of Durham (UK) in 1984.

Sandu Popescu currently holds a joint appointment as Professor of Physics at the University of Bristol and member of Hewlett-Packard’s Basic Research Institute in the Mathematical Sciences (BRIMS) at Hewlett-Packard Laboratories, Bristol (UK). From 1996 to 1999 he was Hewlett-Packard Senior Research Fellow and University Reader in Quantum Mechanics at the Isaac Newton Institute, University of Cambridge (UK). He was formerly a Postdoctoral Fellow at the Free University of Bruxelles (Belgium) and at Boston University (USA). He got his B.A. in 1980 and M.Sc. in 1981, both at University of Bucharest, Romania and his Ph.D. in Physics from Tel-Aviv University (Israel) in 1991.

[a] More information can be found at http://www-uk.hpl.hp.com/brims/
[b] More information can be found at http://www-uk.hpl.hp.com/mcs
[c] More information can be found at http://www.magiqtech.com
Acknowledgments
We thank all the authors for their considerable time and effort spent on writing up their manuscripts. We also thank Michael Berry for writing the Foreword to this book. We have benefitted from the generous support of both BRIMS and the Mathematics, Cryptography and Security Group at Hewlett-Packard Laboratories, Bristol, during the staging of the lectures and the preparation of the book. The help and service provided by World Scientific Publishing is also gratefully acknowledged.
[d] More information can be found at http://www.quiprocone.org
CONTENTS
Foreword
Preface
Basic Elements of Quantum Information Technology (Timothy P. Spiller)
The Joy of Entanglement (Sandu Popescu and Daniel Rohrlich)
Quantum Information and Its Properties (Richard Jozsa)
Quantum Cryptology (Hoi-Kwong Lo)
Experimental Quantum Cryptography (Hugo Zbinden)
Quantum Computation: An Introduction (Adriano Barenco)
Quantum Error Correction (Andrew M. Steane)
Fault-Tolerant Quantum Computation (John Preskill)
Quantum Computers, Error-Correction and Networking: Quantum Optical Approaches (Thomas Pellizzari)
Quantum Computation with Nuclear Magnetic Resonance (Isaac L. Chuang)
Future Directions for Quantum Information Theory (Charles H. Bennett)
BASIC ELEMENTS OF QUANTUM INFORMATION TECHNOLOGY
TIMOTHY P. SPILLER
Hewlett-Packard Laboratories, Bristol, Filton Road, Stoke Gifford, Bristol BS34 8QZ, United Kingdom

The marriage of quantum physics and information technology has the potential to generate radically new information processing devices. Examples are quantum cryptosystems, which provide guaranteed secure communication, and quantum computers, which manipulate data quantum mechanically and could thus solve some problems currently intractable to conventional (classical) computation. This introductory chapter serves two purposes. Firstly, I discuss some of the basic aspects of quantum physics which underpin quantum information technology (QIT). These will be used (and in some cases further expanded upon) in subsequent chapters. Secondly, and as a lead into the whole book, I outline some of the ideas of QIT and its possible uses.
1 Introduction
Information technology (IT) can feed off quantum physics in two ways, which might loosely be termed evolutionary and revolutionary. Both are potentially very important and each one forms a currently very active and exciting research field. In the evolutionary work, quantum physics is essentially employed as a tool, so it is possible to understand and appreciate a good deal of its impact without having to get to grips with the theory itself. Conversely, in the revolutionary work quantum mechanics plays the lead role. Some knowledge of what it is about is therefore required to get a feel for the dramatic new possibilities which arise. The various chapters in this book introduce and discuss in some depth these developing areas, such as quantum cryptography and quantum computing. As will be seen, some of the most fundamental and interesting aspects of quantum mechanics play centre stage. As a primer, this chapter contains some basic discussion of these topics, in addition to an overview of some areas of QIT. Readers who have already consumed such hors-d’œuvres may care to go straight to the entrées in the later chapters.

In the evolutionary IT work, quantum physics is basically used to better understand and thus improve existing technology. For example, the development of smaller and faster silicon or other semiconducting devices benefits from the understanding of the quantum behaviour of electrons in such materials. A bit more radical would be the replacement of silicon transistors by superconducting Josephson junction devices.[a] Nevertheless, intrinsically quantum in nature though superconductors may be, this would still not constitute a fundamentally new technology. The superconducting benefit here would be faster digital switching and lower power consumption. However, the logical operations performed, the manipulations of the physical bits in these devices, are no different from those of existing devices. These familiar logical operations still obey the laws of classical physics, as they always have.

A genuinely radical development comes if quantum physics impacts on information technology in a second and rather different way. Instead of improved versions of what we have already, consider devices which actually process information (perform logical operations) according to the laws of quantum physics. Such devices, which would be part of a new quantum information technology (QIT), are fundamentally different from their classical counterparts. Quite unlike billiard balls, fundamental particles such as electrons can exhibit wave-like interference phenomena and two (or in principle more) of them can be intimately entangled. In a similar way, machines which store, process and transmit information (usually in the form of bits) quantum mechanically can do things with it that would appear totally out of character, or even impossible, for a classical machine.

Of course, it's not that easy; if it were, QIT would probably have been around for a good many years by now. The problem is that measuring electrons generally shakes them up, destroying interference and entanglement. Worse still, such disruptive effects may occur whether you like it or not; they may arise from unavoidable interactions with other systems. Such behaviour is part of quantum physics in general (it is not peculiar to electrons) and so forms a barrier to the development of any realization of QIT.
Indeed, it is not at all clear that the decohering interactions with other systems can be avoided, and so a few years back there was substantial pessimism for practical QIT. However, recent remarkable work on quantum error correction has shown that the development barrier is not insurmountable[b] in some cases. This is why QIT is a growing and active research field.[c]

[a] This has been tried essentially twice, by IBM and by the MITI project in Japan, but without commercial success. A third (and perhaps the final) attempt is in progress, using the new approach of rapid single flux quantum technology [1]. It remains to be seen how this will fare.
[b] Or, at least, it is possible to tunnel through it ...
[c] In addition to the discussions and references presented in this book, there exist review articles on the subject [2-4] which contain extensive lists of further references. Quantum information can also be found at web sites [5].

Even if it develops as well as current researchers can best imagine, QIT is not going to revolutionize the electronics industry in the sense of ousting existing IT. Rather, it will create new business opportunities which will grow alongside the existing ones. Instead of replacing your PC with a quantum version that merely outperforms your old one at the same tasks, it seems rather more likely that you will buy a quantum attachment, or a whole new machine, which actually does things your (or indeed any) classical machine simply cannot.

The two most well-known and researched examples of quantum information processors to date are a quantum cryptosystem and a quantum factoring computer. The former enables guaranteed secure communication between two parties. The latter would enable a large composite integer to be factored, a problem which is essentially intractable on any classical computer. (It is a computationally simple task to multiply together two very large prime numbers $p$ and $q$ to obtain their (composite) product $N = pq$. However, it is exceedingly difficult to find the factors $p$ and $q$ if you are given just $N$.) The hardness of factoring forms the basis for public key cryptosystems such as RSA; these are very widely used today so the cracking of the factoring problem would have major implications.[d] I’ll briefly use these two examples, cryptography and computation, to highlight how fundamental features of quantum physics come into play for QIT. First, though, we need the quantum ingredients.
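As a rough illustration of why factoring matters for RSA, the sketch below runs textbook RSA with deliberately tiny primes. The specific values (p = 61, q = 53, e = 17, m = 65) and all function names are assumptions chosen for illustration; real systems use primes hundreds of digits long, together with padding schemes that are omitted here.

```python
from math import gcd


def make_keys(p, q, e):
    """Return (N, e, d) for primes p, q and a public exponent e."""
    N = p * q
    x = (p - 1) * (q - 1)
    if gcd(e, x) != 1:
        raise ValueError("e must be co-prime with (p-1)(q-1)")
    d = pow(e, -1, x)  # modular inverse of e mod x (needs Python 3.8+)
    return N, e, d


def encrypt(m, e, N):
    """Sender's step: f = m^e mod N, using only the public key."""
    return pow(m, e, N)


def decrypt(f, d, N):
    """Receiver's step: m = f^d mod N, using the private exponent d."""
    return pow(f, d, N)


def factor_by_trial_division(N):
    """Eavesdropper's brute force: fine for toy N, hopeless for large N."""
    k = 2
    while k * k <= N:
        if N % k == 0:
            return k, N // k
        k += 1
    raise ValueError("N has no non-trivial factors")


# Toy parameters, assumed purely for illustration.
N, e, d = make_keys(61, 53, 17)
m = 65
f = encrypt(m, e, N)
recovered = decrypt(f, d, N)
factors = factor_by_trial_division(N)
```

Multiplying p and q, and the modular exponentiations, are cheap at any key size. Only `factor_by_trial_division` scales badly, since its loop runs up to roughly the square root of N. That asymmetry is what an efficient factoring machine would remove.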
2 Quantum mechanics
There are five important elements of quantum mechanics which feature highly in quantum information processing.
2.1 Superposition states

Quantum systems have a much richer and more interesting existence than their classical counterparts. A single bit, the very basic building block of any classical information processor, only has a choice between two possible states, 0 or 1. It is always in one state or the other. However, a single quantum bit, or qubit, has the luxury of an infinite choice of so-called superposition states. Nature allows it to have a part corresponding to 0 and a part corresponding to 1 at the same time, analogous to the way a musical note contains various harmonic frequencies.[e] Picture it as a classical bit being only black or white, but a qubit having every colour you like, if this helps.

[d] For example, RSA [6] operates roughly as follows: A user wishing to receive secret messages uses two large primes $p$ and $q$ and publicly declares an encryption key of $N = pq$ and a suitable random number $e$, which is co-prime with $x = (p-1)(q-1)$. A sender encrypts their message $m$ to $f = m^e \bmod N$ and transmits $f$. Knowing the primes $p$ and $q$, it is mathematically easy for the receiver to decrypt this by evaluating $f^d \bmod N$, where $d = e^{-1} \bmod x$. However, it is extremely difficult for an eavesdropper to decrypt the transmission because they only know $N$ and not its factors.

In mathematical terms, the state of a quantum system (which is usually denoted by $|\psi\rangle$) is a vector in an abstract Hilbert space of possible states for the system. The space for a single qubit is spanned by a basis consisting of the two possible classical states, denoted by $|0\rangle$ and $|1\rangle$. This means that any state of a qubit can be decomposed into the superposition

$$|\psi\rangle = a|0\rangle + b|1\rangle \tag{1}$$
with suitable choices of the complex coefficients $a$ and $b$. A familiar representation of the basis uses the orthogonal 2D unit vectors $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$; in this case $|\psi\rangle$ is represented by the vector $\begin{pmatrix} b \\ a \end{pmatrix}$. The value of a qubit in state $|\psi\rangle$ is uncertain; if you measure such a qubit, you cannot be sure in advance what result you will get. Quantum mechanics just gives the probabilities, from the overlaps[f] between $|\psi\rangle$ and the possible outcomes, rules due originally to Max Born. Thus the probability of getting 0 is $|\langle 0|\psi\rangle|^2 = |a|^2$ and that for 1 is $|\langle 1|\psi\rangle|^2 = |b|^2$. (Quantum states are therefore normalized; $\langle\psi|\psi\rangle = (b^*\ a^*)\begin{pmatrix} b \\ a \end{pmatrix} = 1$ and the probabilities sum to unity.) Quantum mechanics also tells you that (assuming the system is not absorbed or totally destroyed by the action of measurement) the qubit state of Eq. 1 suffers a projection to $|0\rangle$ ($|1\rangle$) when you get the result 0 (1). There is clearly something intrinsically irreversible about a measurement. In fact, this is not peculiar to measurement interactions and I discuss irreversibility more generally in Sec. 2.4.

As a qubit has a basis of two states, a full system of $m$ qubits has a basis of $2^m$ states. These could represent the binary values from 0 to $2^m - 1$. A classical computer with an $m$-bit input register can clearly only be prepared in one of these possible states and so calculations with different inputs have to be run as separate computations. However, scaling up the superposition principle of Eq. 1 to a machine with an input register of $m$ qubits, a carefully constructed quantum computer (the reason for this qualifier will become apparent later) is thus allowed to exist in a superposition of all its possible classical binary states. This means that it could perform a single computation with its input set to a superposition of all possible classical inputs! This so-called quantum parallelism is the basis for being able to solve some problems much more quickly with a quantum processor.

[e] Another (mathematically correct; it is sometimes called the Poincaré or Bloch sphere) analogy is to think of a globe. A classical bit can only sit at the north or the south pole, whereas a qubit is allowed to reside at any point on the surface.
[f] $\langle\phi|\psi\rangle$ is the inner product between the two states; $\langle\phi|$ follows from $|\phi\rangle$ by transposition and complex conjugation (*), which together form Hermitian conjugation ($\dagger$). In the vector representation $\langle\psi|$ is given by $(b^*\ a^*)$ and the inner product is the familiar scalar/dot product.
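The Born rule, the projection on measurement and the $2^m$ scaling described in this section can be sketched in a few lines of plain Python. The function names and the example amplitudes are assumptions made for illustration only; the post-measurement state is returned without its physically irrelevant global phase.

```python
import math
import random


def normalize(a, b):
    """Rescale amplitudes so that |a|^2 + |b|^2 = 1."""
    norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    return a / norm, b / norm


def born_probabilities(a, b):
    """Probabilities of reading 0 and 1 from the state a|0> + b|1>."""
    return abs(a) ** 2, abs(b) ** 2


def measure(a, b, rng=random):
    """Sample an outcome and return it with the projected state (a, b)."""
    p0, _ = born_probabilities(a, b)
    if rng.random() < p0:
        return 0, (1 + 0j, 0j)   # projected onto |0>
    return 1, (0j, 1 + 0j)       # projected onto |1>


def uniform_register(m):
    """Amplitudes of an equal superposition over all 2**m binary inputs."""
    dim = 2 ** m
    return [1 / math.sqrt(dim)] * dim


# Example amplitudes (an assumption): equal weights with a relative phase.
a, b = normalize(1 + 0j, 1j)
p0, p1 = born_probabilities(a, b)
outcome, post_state = measure(a, b, random.Random(0))
register = uniform_register(3)
```

Note that `uniform_register(m)` needs a list of length $2^m$ to describe the state classically, whereas the quantum register holding it has only $m$ qubits. A single measurement still returns just one of the $2^m$ values, which is one reason a quantum computer has to be carefully constructed to exploit the superposition.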
2.2 Entanglement

Quantum systems are weird! Even with just two qubits, a strange and remarkable property of quantum systems raises its head. Two qubits (labelled A and B) have a basis of four states, which could be written as $|0\rangle_A|0\rangle_B$, $|0\rangle_A|1\rangle_B$, $|1\rangle_A|0\rangle_B$ and $|1\rangle_A|1\rangle_B$. (The total Hilbert space for a number of systems is given by the direct product of the individual Hilbert spaces, often denoted by the symbol $\otimes$. A complete state vector is thus a direct product of individual ones. Some authors choose to make this explicit; others, such as myself, take it as read, so that $|0\rangle_A|0\rangle_B$ denotes $|0\rangle_A \otimes |0\rangle_B$.) Consider a superposition state of just two of the basis states,

$$|\psi\rangle_{AB} = 2^{-1/2}\left(|0\rangle_A|0\rangle_B + |1\rangle_A|1\rangle_B\right). \tag{2}$$
There is no way that this state can be rewritten in the factored form $|\psi\rangle_A|\chi\rangle_B$, for any crafty choice of $|\psi\rangle_A$ and $|\chi\rangle_B$. Such a form would imply that qubits A and B have definite quantum states (in their individual Hilbert spaces), independent from their partner. For states like Eq. 2 this is not so; there exists an intimate entanglement between the two. Neither has a state of its own.

Entanglement plays a very important role for QIT. As such, it deserves a sizeable discussion, and this is what it gets, from Sandu Popescu and Daniel Rohrlich in Chapter Two. Entanglement between two qubits, such as in Eq. 2, is well understood. However, there are still open questions on entanglement between many qubits and cases where the overall system is impure, and so has finite entropy. I give some discussion of impurity in Sec. 2.4; as will be seen, entanglement with other degrees of freedom generates entropy for the system of interest.

It is well known that the spatial separation of systems A and B when they are in an entangled state like Eq. 2 has remarkable consequences. Albert Einstein, Boris Podolsky and Nathan Rosen [7] started the ball rolling in 1935; John Bell [8] took it up in the sixties and proved his famous theorem, in effect that quantum mechanics as a theory is non-local. Numerous interesting and important further developments have followed in the last decade or so. One thing that cannot be done with the non-locality of spatially separated entangled systems (often called EPR pairs) is “faster-than-light” signalling; the irreversibility of quantum measurement ensures this. However, shared entanglement can be used for the teleportation [9] of (unknown) quantum states and for superdense coding [10, 11]. These subject areas, and general aspects of quantum information theory, are addressed by Richard Jozsa in Chapter Three.
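The claim that the state of Eq. 2 cannot be factored can be checked numerically. The sketch below uses a standard fact for two qubits: a pure state with amplitudes $c_{ij}$ on $|i\rangle_A|j\rangle_B$ is a product state exactly when $c_{00}c_{11} - c_{01}c_{10} = 0$. It also computes the entropy of qubit A alone, in bits. All function names and the two example states are assumptions made for illustration.

```python
import math


def is_product_state(c, tol=1e-12):
    """c[i][j] is the amplitude of |i>_A |j>_B; product iff det(c) = 0."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol


def reduced_density_matrix_A(c):
    """Trace out qubit B: rho_A[i][k] = sum_j c[i][j] * conj(c[k][j])."""
    return [[sum(c[i][j] * complex(c[k][j]).conjugate() for j in range(2))
             for k in range(2)] for i in range(2)]


def entropy_bits(rho):
    """Entropy of a 2x2 density matrix in bits, via its two eigenvalues."""
    t = complex(rho[0][0] + rho[1][1]).real
    det = complex(rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(t * t - 4.0 * det, 0.0))
    eigenvalues = ((t + disc) / 2.0, (t - disc) / 2.0)
    return -sum(p * math.log2(p) for p in eigenvalues if p > 1e-15)


s = 1 / math.sqrt(2)
bell = [[s, 0.0], [0.0, s]]       # the entangled state of Eq. 2
product = [[s, s], [0.0, 0.0]]    # |0>_A (|0>_B + |1>_B) / sqrt(2)

bell_entropy = entropy_bits(reduced_density_matrix_A(bell))
product_entropy = entropy_bits(reduced_density_matrix_A(product))
```

For the entangled state, qubit A on its own carries one full bit of entropy even though the pair as a whole is in a single definite state. For the product state, the entropy of A is zero. This is the sense in which neither entangled qubit has a state of its own.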
2.3 Reversible unitary evolution

An isolated quantum system evolves in a nice reversible manner. Schrödinger’s famous equation tells us how;

$$i\hbar\,\frac{\partial|\psi\rangle}{\partial t} = H|\psi\rangle. \tag{3}$$
Here $|\psi\rangle$ is the state of the system (which might be anything from a single qubit through to some complex interacting collection of degrees of freedom) and $H$ is the total Hamiltonian (the energy operator). It is important not to miss any parts of $H$, interactions with bits and pieces outside the defined system.[g] Provided none are omitted the system is “closed” and evolves according to Eq. 3. Formally, this can be integrated to give the state at any time

$$|\psi(t)\rangle = U|\psi(0)\rangle, \tag{4}$$

where the unitary operator[h] is given by $U = \exp\left[-\frac{i}{\hbar}\int_0^t dt'\,H\right]$. Clearly, such evolution can be reversed by application of $U^\dagger$. However, to make a rather less glib association with the familiar statistical mechanical idea of reversibility, it is helpful to consider a different description of quantum systems. This broader picture, which employs density operators (or matrices), can handle irreversibility and it will therefore be handy for the discussions later in the book where this plays a crucial role. Density operators can also incorporate classical probabilistic uncertainty (lack of information about a choice of state) as well as the quantum uncertainty implicit in superposition states such as the qubit states of Eqs. 1 and 2. Density operators thus play a crucial role in discussions of quantum information theory, and because of this also feature strongly in the next couple of chapters.

[g] These could simply be due to coupling with the surrounding vacuum electromagnetic field, or thermal contact with some other apparatus.
[h] A unitary operator is one whose Hermitian conjugate is its inverse, so $UU^\dagger = U^\dagger U = I$ where $I$ is the appropriate identity operator. Clearly $U$ in Eq. 4 must be unitary to conserve the total probability; $\langle\psi(t)|\psi(t)\rangle$ must equal $\langle\psi(0)|\psi(0)\rangle$. This is ensured because the total energy is an observable and $H$ is Hermitian; $H = H^\dagger$.

Consider a large number of identical and non-interacting quantum systems, where every member of this ensemble is in the quantum state $|\psi\rangle$. The whole ensemble can be described by a density operator,[i] given by

$$\rho = |\psi\rangle\langle\psi|. \tag{5}$$
In the vector representation of states, p is a density matrix-an ensemble of lbI2 ba* qubits each in state Eq. 1 is described by p = (I) (b* a * ) = (ab. The reason a direct statistical description of an ensemble is useful is that the entropy (per member of the ensemble) can be defined l 2 by
$$S = -k\,\mathrm{Trace}(\rho \ln \rho) , \tag{6}$$
where $k$ is Boltzmann's constant.[j] For any pure ensemble, where every member is in the same state (and so, from Eq. 5, $\rho^2 = \rho$), it is easy to show that the entropy vanishes. As every member is in the same state, there is no lack of knowledge, or “missing information,” about such an ensemble. The Schrödinger evolution for $\rho$ follows from Eq. 3;

$$\frac{\partial \rho}{\partial t} = -\frac{i}{\hbar}\,[H, \rho] . \tag{7}$$
As per Eq. 4, this can be integrated to give $\rho(t) = U\rho(0)U^\dagger$. It is straightforward to show that unitary Schrödinger evolution preserves the entropy, $\partial S/\partial t = 0$. This is why such evolution is called reversible. The meaning is the same as in thermodynamics; the entropy of an ensemble of closed quantum systems does not change as they evolve reversibly. It will be seen throughout this book that reversible evolution of systems is crucial for quantum information processing. Qubits have to evolve unitarily from place to place to move quantum information around. A quantum computer has to evolve reversibly, utilizing entanglement between many qubits, in order to perform tasks impossible for any classical machine. Of course, irreversibility does come into play. The only way to get answers out of quantum information processors is to make measurements. However, apart from these deliberate injections of irreversibility, interactions causing changes in entropy are essentially bad news, and need to be avoided or circumvented.

[i] In this approach, normalization gives $\langle\psi|\psi\rangle = \mathrm{Trace}(\rho) = 1$, where Trace denotes the sum of the diagonal elements. The expectation value of any observable quantity $O$, its average value over the ensemble, follows from $\langle O\rangle = \langle\psi|O|\psi\rangle = \mathrm{Trace}(\rho O)$.

[j] It is customary to give the entropy units of J K$^{-1}$ and to use natural logarithms when relating $S$ to irreversibility and the familiar thermodynamic entropy. Alternatively, it is usual to work with a dimensionless $S$ and to use logarithms base 2 in information theory, to relate $S$ to the Shannon entropy of a classical probability distribution. This is clearly nothing more than a change of units; pure ensembles have zero entropy in either case.
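The two entropy statements above (zero entropy for a pure ensemble, and no change of entropy under unitary evolution) can be checked numerically for a single qubit. What follows is only a minimal sketch: it works in units where $k = 1$, uses the $(|1\rangle, |0\rangle)$ ordering of the density matrix given above, and the rotation about $\sigma_x$ is simply one convenient choice of $U$, not something specified in the text. All function names are illustrative.

```python
import math

def density_matrix(a, b):
    """rho = |psi><psi| for amplitudes a (of |0>) and b (of |1>), ordered (|1>, |0>)."""
    v = [complex(b), complex(a)]
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def eigenvalues(rho):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return [(tr + disc) / 2.0, (tr - disc) / 2.0]

def entropy(rho):
    """S / k = -Trace(rho ln rho), taking 0 ln 0 = 0."""
    return -sum(p * math.log(p) for p in eigenvalues(rho) if p > 1e-12)

def rotation(theta):
    """An example unitary, U = exp(-i theta sigma_x / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def evolve(rho, u):
    """rho -> U rho U^dagger."""
    return matmul(matmul(u, rho), dagger(u))

pure = density_matrix(0.6, 0.8j)             # a pure ensemble
mixed = [[0.3 + 0j, 0j], [0j, 0.7 + 0j]]     # a non-pure, diagonal ensemble
u = rotation(1.234)

print("pure ensemble entropy:", entropy(pure))
print("mixed ensemble entropy before and after U:", entropy(mixed), entropy(evolve(mixed, u)))
```

The entropy only depends on the eigenvalues of $\rho$, which is why any unitary, not just the one chosen here, leaves it unchanged.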
Before moving to discuss such irreversibility, a little word of caution is in order regarding unitary operators. Generally in quantum information processing, it is handy to think of the sequences of unitary operators[k] which have to be applied to qubits to effect some desired process. However, theorists (myself included) should not get too cocky! Just because a $U$ can be defined on paper does not necessarily mean that it is easy to implement even under laboratory conditions, let alone out in the real world! If it is effected by some piece of Hamiltonian acting for some time, errors may occur. If the Hamiltonian contains some externally applied source (like an electromagnetic pulse), in reality this may not be exactly as per the blueprint. The timings may not be quite right. The evolution may still be unitary, but it may not be that due to the desired $U$. Despite the discrete bases of quantum systems like qubits, general states such as Eq. 1 contain continuous amplitudes $a$ and $b$. Incorrect unitary evolution [13,14] thus has some analogy with the occurrence of errors in classical analogue computing. Such problems cannot be ignored, as they will doubtless occur whenever QIT moves off the drawing board. It will be clear from the chapters by Andrew Steane on error correction and John Preskill on fault-tolerant computing that, over the last few years, theorists have not tried to sweep these problems under the carpet. Unitary errors, as well as non-unitary effects such as decoherence which are discussed in Sec. 2.4, have to be allowed for and then overcome, in order to effect successful quantum information processing. That this can be done at all frequently surprises people, often by an amount proportional to their advance knowledge of quantum physics!
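The point about timing errors can be made concrete with a small numerical sketch. The setup is an assumption chosen for illustration, not taken from the text: a qubit driven so that a pulse of the right length applies $U = \exp(-i\theta\sigma_x/2)$ with $\theta = \pi$ (a bit flip), while the real pulse runs a fraction `eps` too long. The evolution stays unitary, so the state stays normalized, but it is no longer the state the blueprint asked for.

```python
import math

def pulse(theta):
    """U = exp(-i theta sigma_x / 2) as a 2x2 matrix (an assumed example drive)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def apply(u, psi):
    return [u[0][0] * psi[0] + u[0][1] * psi[1],
            u[1][0] * psi[0] + u[1][1] * psi[1]]

def norm_sq(psi):
    return sum(abs(x) ** 2 for x in psi)

def overlap_sq(phi, psi):
    """|<phi|psi>|^2, the probability of finding psi in the intended state phi."""
    return abs(sum(complex(x).conjugate() * y for x, y in zip(phi, psi))) ** 2

psi0 = [1.0, 0.0]
target = apply(pulse(math.pi), psi0)          # the intended bit flip

for eps in (0.0, 0.01, 0.05):                 # hypothetical fractional timing errors
    actual = apply(pulse(math.pi * (1 + eps)), psi0)
    print(eps, norm_sq(actual), overlap_sq(target, actual))
```

For this particular drive the overlap works out to $\cos^2(\epsilon\pi/2)$, so the error varies continuously with the mistiming, which is the analogy with analogue computing drawn above.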
2.4 Irreversibility, measurement and decoherence
All quantum systems have a somewhat fragile existence. The only way to find out anything about a quantum state is to actually make a measurement on the system. The type of measurement you choose to make defines the set of possible results; the outcome of every measurement has to be one of these. The consequences of forcing the hand of a quantum system by measurement are that a single measurement is a truly random process and that the act of measurement imparts an irreversible change to the state of the system. Fragile superposition states collapse. Measuring the value of a qubit will always yield 0 or 1; the measurement projects any initial state to one or other of these. For an initial superposition state such as Eq. 1 this occurs randomly with respective probabilities[l] of $|a|^2$ and $|b|^2$. There is a corresponding irreversible change to the state as it jumps to $|0\rangle$ or $|1\rangle$. Irreversibility is only avoided in the special cases when the qubit is actually in state $|0\rangle$ or state $|1\rangle$ before measurement. The upshot is that you cannot infer the prior state of a quantum system from the outcome of a single measurement: if you get 0 you have no idea if the initial state was purely this, or if it was a superposition state containing a part of this. You cannot deduce the colour of a single qubit if you only see in black and white. This is not a question of experimental competence; it is a property of Nature.

The fragility of quantum states is the key to a quantum cryptosystem. Sending information encoded in individual qubits guarantees that any eavesdropper cannot read it in transit without leaving evidence of their tampering. They will always corrupt some of the data.

Measurement of a quantum system generally requires interaction with other degrees of freedom, external to those of the system of interest (and so not included in the system Hamiltonian $H$). Other forms of interaction exist, too. The trendy term for additional degrees of freedom coupled to a quantum system is the environment; a system so coupled is referred to as “open.” Its $H$ does not tell the whole story. The irreversible nature of interactions with environments can be seen by looking at the entropy for some relevant examples.

[k] At the most basic and universal level, these have to be on individuals and on pairs of qubits, although it may be convenient to think of more complicated many-qubit unitary “gates” which in principle break down into these basic operations.

1. Measurement: Since measuring the values of qubits projects them according to the Born rule, an initially pure ensemble $\rho_i = |\psi\rangle\langle\psi|$ with $|\psi\rangle$ given by Eq. 1 ends up as the weighted sum of pure ensembles

$$\rho_f = |a|^2\,|0\rangle\langle 0| + |b|^2\,|1\rangle\langle 1| \tag{8}$$
after measurement.[m] This is clearly not pure[n] ($\rho_f^2 \neq \rho_f$) and the entropy (Eq. 6) has increased from zero to $S = -k\left(|a|^2 \ln |a|^2 + |b|^2 \ln |b|^2\right)$.
[l] In the globe picture, measurement forces a qubit to jump at random to one of the poles, with a probability proportional to the square of the cosine of half the zenithal angle $\theta$ to that pole. In Eq. 1 the amplitudes can be parametrized in terms of this and the azimuthal angle $\phi$: $a = \exp(i\phi)\cos(\theta/2)$; $b = \sin(\theta/2)$.

[m] In the matrix representation this is simply $\rho_f = \begin{pmatrix} |b|^2 & 0 \\ 0 & |a|^2 \end{pmatrix}$.

[n] Extending the globe picture, a non-pure ensemble of qubits is represented by a point somewhere inside the surface. In particular, diagonal ensembles such as that of Eq. 8 lie on the axis joining the poles.

2. Decoherence: Although entanglement within a large complex system is vital for quantum computing, additional entanglement with environment degrees of freedom is a real nuisance and causes unwanted irreversibility. This can be seen even in the simple example of the decoherence of an EPR pair. Consider a total pair-environment (AB-e) system initially in a state $|\Psi\rangle = |\psi\rangle_{AB}|e\rangle$, with $|\psi\rangle_{AB}$ given by Eq. 2. Suppose that an interaction with the environment generates the additional entanglement of Eq. 9, in which each term of the pair state is accompanied by its own environment state, $|e_0\rangle$ or $|e_1\rangle$.
The qubit values determine the new environment states. If the environment contains many degrees of freedom, these states will almost certainly be orthogonal (or very nearly so), $\langle e_0|e_1\rangle = 0$. Note that the environment need only couple to one or other of the pair (A or B) to do this. Clearly the total final density operator $\tilde\rho = |\Psi\rangle_f\langle\Psi|_f$ is still pure;[o] however, this is not the point. Anyone trying to use the EPR pair for quantum information processing will not be using the environment as well; they may not even be aware of its intervention. The system as far as they are concerned is just A and B. The reduced density operator, describing an ensemble of such EPR pairs alone, is found by tracing over the environment[p] to give the ensemble of Eq. 10.
This ensemble is not pure and has finite entropy of $S = k\ln 2$. When the environment is a large complex system containing many degrees of freedom, entanglement with it (once generated) can to all intents and purposes never be unwound. In such cases the EPR pairs effectively undergo irreversible decoherence. Clearly a similar effect can occur with much more complex systems of interest and, indeed, it will be much more likely (in effect, occur much more quickly) when there are many more components to the system (each able to couple to the environment), compared to the two of an EPR pair.
3. Thermal equilibrium: Consider the energy eigenstates of the system of interest, $H|E_j\rangle = E_j|E_j\rangle$. Independent of how it starts off (if it is being used for information processing it will be in some carefully prepared and supposedly unitarily evolving state), if the system makes contact with an environment at temperature $T$ and attains thermal equilibrium, it decoheres. An ensemble of such systems is described by the equilibrium density operator

$$\rho_{eq} = Z^{-1}\sum_j \exp(-E_j/kT)\,|E_j\rangle\langle E_j| . \tag{11}$$

The exponential probabilities are the well known Boltzmann factors and $Z$ is the normalizing partition function $\sum_j \exp(-E_j/kT)$. $\rho_{eq}$ is clearly not pure, with an entropy of $S = \bar{E}/T + k\ln Z$, where $\bar{E}$ is the average system energy, the expectation value $\mathrm{Trace}(\rho_{eq}H)$.

[o] The total system of EPR pair plus environment is closed, because there are no additional degrees of freedom coupled to this.

[p] $\mathrm{Trace}_e(O)$ is effected by $\sum_k \langle k|O|k\rangle$, where the (many) states $|k\rangle$ are a complete orthonormal basis for the environment. The states in Eq. 9 decompose as $|e_i\rangle = \sum_k a_i(k)|k\rangle$ for $i = 0, 1$ and their orthogonality constrains the coefficients to obey $\sum_k a_i^*(k)\,a_j(k) = \delta_{ij}$.
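The entropies quoted in the three examples above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions, in units where $k = 1$: the EPR pair is taken to be $2^{-1/2}(|0\rangle_A|1\rangle_B - |1\rangle_A|0\rangle_B)$, which is an assumed form since Eq. 2 is not reproduced here; the environment states are passed in as plain amplitude lists; and the thermal example uses a handful of equally spaced levels chosen only for illustration. The function names are hypothetical.

```python
import math

def entropy(probs):
    """S / k = -sum p ln p over the eigenvalues of a density operator (0 ln 0 = 0)."""
    return -sum(p * math.log(p) for p in probs if p > 1e-15)

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return [(tr + disc) / 2.0, (tr - disc) / 2.0]

# 1. Measurement: the diagonal ensemble has eigenvalues |a|^2 and |b|^2.
def measured_entropy(a, b):
    return entropy([abs(a) ** 2, abs(b) ** 2])

# 2. Decoherence: trace the environment out of (|01>|e0> - |10>|e1>)/sqrt(2).
def decohered_pair_entropy(e0, e1):
    root2 = math.sqrt(2.0)
    branches = [[x / root2 for x in e0], [-x / root2 for x in e1]]
    # Reduced density matrix in the (|01>, |10>) subspace: sum over environment index.
    rho = [[sum(x * complex(y).conjugate() for x, y in zip(bi, bj)) for bj in branches]
           for bi in branches]
    return entropy(eigenvalues_2x2(rho))

# 3. Thermal equilibrium: Boltzmann probabilities, and the check S/k = E/kT + ln Z.
def thermal_entropy(energies, kT):
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    return entropy(probs), mean_energy / kT + math.log(z)

print("measurement, |a|^2 = |b|^2 = 1/2:", measured_entropy(2 ** -0.5, 2 ** -0.5))
print("EPR pair, orthogonal environment states:", decohered_pair_entropy([1.0, 0.0], [0.0, 1.0]))
print("EPR pair, identical environment states:", decohered_pair_entropy([1.0, 0.0], [1.0, 0.0]))
print("thermal ensemble, direct and via E/kT + ln Z:", thermal_entropy([j + 0.5 for j in range(6)], 1.0))
```

Passing environment states that are only partly orthogonal gives an entropy between zero and $\ln 2$, which is one way to see why orthogonality of $|e_0\rangle$ and $|e_1\rangle$ is the condition that matters.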
Generally speaking, irreversibility such as that in the latter two examples has to be stopped from biting before some desired unitary quantum evolution of the system has been completed. This is the really crucial point. Although simple illustrations of irreversibility such as those just given are useful for thinking about the interactions and processes likely to generate decoherence, they don't answer the vital question: How does the typical decoherence time for a system (the inverse of the characteristic rate at which entropy grows) compare to the time needed to accomplish some useful unitary process? Actual time evolution is relevant for this, so here is another cautionary reason for not simply abstracting QIT to a list of unitary operations to be applied to a bag of qubits. The total time these operations take to run in practice is extremely important; it has to fall inside the decoherence time for that particular system. Simple error correction, or more sophisticated fault-tolerance, will more than likely lengthen the time of the desired unitary process. To gain payback, the increase in the effective decoherence time has to outstrip this.

A comprehensive discussion of the time evolution of open quantum systems, a vast subject in itself [15,16], is clearly beyond the scope of this chapter. However, a simple introductory example is worthwhile, especially as this provides a useful model for some of the decoherence processes which occur in quantum systems relevant for QIT. Starting with a complete system plus environment, it is possible to write down (at least formally) a very general expression for the evolution of $\rho = \mathrm{Trace}_e(\tilde\rho)$, assuming that the total coupled system (with density operator $\tilde\rho$) is closed. This is extremely complicated [15,16]; for starters it contains memory effects. The history of $\rho$ has some say in its rate of change.
Neglecting these (this is the Markovian approximation) and assuming that the interaction between the system and environment is weak enough for the Born approximation to work, it is possible to give a simple model (so-called master) equation for the system alone, Eq. 12.
To re-emphasise, $H$ is just the Hamiltonian of the system of interest. The operators $L_m$ (which also act in the Hilbert space of this system) are the leftovers of the interaction with the environment after this has been traced out. These modify the unitary Schrödinger evolution and generate irreversibility. The irreversible examples can be illustrated within this simple framework:

1. Measurement: Measuring the value of a qubit can be modelled using Eq. 12 with a single operator $L = \eta^{1/2}B$, where $B$ is the bit value operator. In the matrix representation, $B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and the solution is $\rho(t) = \begin{pmatrix} |b|^2 & ba^* \exp(-\eta t) \\ ab^* \exp(-\eta t) & |a|^2 \end{pmatrix}$, for a pure initial $\rho(0)$ constructed from Eq. 1. (This is in an interaction picture, setting $H = 0$.) Clearly $\rho(t)$ approaches the Eq. 8 result at large times and the rate of approach is set by the strength of the measurement interaction $\eta$. If the measurement is to look like a sharp “projection” at some time-scale, $\eta^{-1}$ must be very short in these units.
2. Decoherence: Qubits are frequently modelled as spin-1/2 systems and, indeed, in some cases this is an appropriate physical picture (in addition to a mathematical one). Often used operators are the Pauli matrices for the components of the spin,

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{13}$$

Such a spin qubit subject to isotropic noise can be modelled using Eq. 12 with three operators $L_1 = \kappa^{1/2}\sigma_x$, $L_2 = \kappa^{1/2}\sigma_y$ and $L_3 = \kappa^{1/2}\sigma_z$. Besides the entropy, another quantity which can measure irreversibility and is often useful in discussions of quantum information theory is the fidelity, $f = \langle\psi_i|\rho(t)|\psi_i\rangle$, which compares the initial pure state $|\psi_i\rangle$ with the density operator at later times. The decoherence of a pure initial ensemble subject to isotropic noise is demonstrated by the decaying fidelity. The solution to Eq. 12 for any pure initial ensemble gives

$$f(t) = \tfrac{1}{2}\left(1 + \exp(-4\kappa t)\right) . \tag{14}$$
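The fidelity decay law can be checked by integrating a master equation numerically. The sketch below assumes the commonly used Lindblad form, $\partial\rho/\partial t = -\frac{i}{\hbar}[H,\rho] + \sum_m \left(L_m\rho L_m^\dagger - \tfrac12 L_m^\dagger L_m\rho - \tfrac12\rho L_m^\dagger L_m\right)$, with $H = 0$; that this is the normalization intended for Eq. 12 is an assumption, not something taken from the text. The integrator is a plain fourth-order Runge-Kutta step, and all names and the values of `kappa` and `t` are illustrative.

```python
import math

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(x, y, s=1.0):
    """Return x + s * y."""
    return [[x[i][j] + s * y[i][j] for j in range(2)] for i in range(2)]

def dag(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rhs(rho, kappa):
    """d(rho)/dt for H = 0 and L_m = sqrt(kappa) * sigma_m in the assumed Lindblad form."""
    out = [[0j, 0j], [0j, 0j]]
    for s in (SX, SY, SZ):
        sd = dag(s)
        sds = mul(sd, s)
        out = add(out, mul(mul(s, rho), sd), kappa)
        out = add(out, mul(sds, rho), -0.5 * kappa)
        out = add(out, mul(rho, sds), -0.5 * kappa)
    return out

def evolve(rho, kappa, t, steps=400):
    """Fourth-order Runge-Kutta integration of the master equation from 0 to t."""
    h = t / steps
    for _ in range(steps):
        k1 = rhs(rho, kappa)
        k2 = rhs(add(rho, k1, h / 2), kappa)
        k3 = rhs(add(rho, k2, h / 2), kappa)
        k4 = rhs(add(rho, k3, h), kappa)
        incr = add(add(k1, k2, 2.0), add(k4, k3, 2.0))   # k1 + 2 k2 + 2 k3 + k4
        rho = add(rho, incr, h / 6)
    return rho

def fidelity(psi, rho):
    """f = <psi| rho |psi>."""
    return sum(complex(psi[i]).conjugate() * rho[i][j] * psi[j]
               for i in range(2) for j in range(2)).real

psi = [0.6, 0.8j]                                   # an arbitrary pure initial state
rho0 = [[psi[i] * complex(psi[j]).conjugate() for j in range(2)] for i in range(2)]
kappa, t = 0.3, 1.0

f_numeric = fidelity(psi, evolve(rho0, kappa, t))
f_eq14 = 0.5 * (1 + math.exp(-4 * kappa * t))
print(f_numeric, f_eq14)
```

Because the noise is isotropic, the result does not depend on which pure state is chosen for `psi`; changing it is a quick way to confirm that.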
This decay is due to that of the off-diagonal pieces of $\rho(t)$. Such damping is a generic feature of decoherence. At this level of description, then, for any physical realization of a qubit it is vital to identify the appropriate environment coupling in order to gauge the decoherence time ($\sim\kappa^{-1}$) of the system.[q]

[q] As an example, suppose that the spin has a magnetic moment $\mu$ and the environment is a simple white noise magnetic field $\mathbf{B}(t)$ with each component having a time-correlation of $B_i(t)B_i(s) = B_0^2\,\tau\,\delta(t - s)$, defined by a characteristic field $B_0$ and a time $\tau$. The coupling is then $\kappa \sim \mu^2 B_0^2 \tau/\hbar^2$, which identifies it in terms of physical system and environment parameters.

3. Thermal equilibrium: A photon mode (or any other harmonic oscillator degree of freedom) coupled to a thermal bath can be modelled by Eq. 12 with two operators: $L_1 = [(\bar{n} + 1)\omega/Q]^{1/2}\,a$ and $L_2 = [\bar{n}\omega/Q]^{1/2}\,a^\dagger$. Here $\omega$ is the frequency, $a$ ($a^\dagger$) is the photon annihilation (creation) operator (so $H = \hbar\omega(a^\dagger a + \tfrac12)$), $Q$ is the environment quality factor and $\bar{n} = [\exp(\hbar\omega/kT) - 1]^{-1}$ is the thermal equilibrium photon number. For any starting condition, the photon number, $a^\dagger a$, evolves according to

$$\mathrm{Trace}\left(\rho(t)\,a^\dagger a\right) = \bar{n} + \left[\mathrm{Trace}\left(\rho(0)\,a^\dagger a\right) - \bar{n}\right]\exp(-\omega t/Q) \tag{15}$$

as the density operator diagonalizes to Eq. 11 with photon number eigenstates and $E_j = (j + \tfrac12)\hbar\omega$. Although the temperature $T$ determines the final photon number, the time-scale for evolution to this is $Q/\omega$. More often than not, the time-scale of a desirable unitary process will be set by the characteristic quantum frequency of the system; here this time is $\omega^{-1}$. Comparing these, it is clear that high-$Q$ systems are needed for QIT.

These last two very simple examples of decoherence illustrate the sorts of effects that have to be avoided for successful quantum information processing. Any real system will always have some level of environment coupling, so in practice this needs to be identified and a decent estimate of the relevant decoherence time made. If this compares favourably with the time-scale of the unitary process to be run, all is well and good; if it doesn't, you have trouble.

I have focussed on the density operator approach to irreversibility as it is the standard one, giving a nice elegant description of ensemble average behaviour. However, QIT puts an emphasis on individual quantum systems, so it is worth pursuing this viewpoint a little. Certainly, it should be (made) clear that, apart from the pure ensemble case where every member is in the same state $|\psi\rangle$, a density operator does not identify with a unique ensemble. A simple example is Eq. 8 with $|a|^2 = |b|^2 = 1/2$. It could be that this ensemble is the output of bit value measurement apparatus, so each qubit is in state $|0\rangle$ or $|1\rangle$. However, instead it could be the result of decoherence due to isotropic noise, with all possible qubit states equally probable.[r] Both ensembles have the same density operator. Of course, you can't distinguish between these cases, or other possibilities, by making measurements; however, your actions can be detected!

[r] In the globe picture, there are many different ways of distributing individual points over the surface to achieve the same ensemble average point in the interior.

Suppose that somebody prepares one ensemble by encoding a long random bit string into a string of qubits using the states $(|0\rangle, |1\rangle)$ and a second by doing the same but using the alternative basis states $(|\bar{0}\rangle, |\bar{1}\rangle)$ defined by[s]
$$|\bar{0}\rangle = 2^{-1/2}\left(|0\rangle + |1\rangle\right) , \qquad |\bar{1}\rangle = 2^{-1/2}\left(|1\rangle - |0\rangle\right) . \tag{16}$$
In the richer state space of a qubit, these are equally good for representing zero and one; hence the labelling. Without the additional information regarding the ensembles' preparation, you simply have them both described by the same density operator, $\rho = \tfrac12\left(|0\rangle\langle 0| + |1\rangle\langle 1|\right)$. If this diagonal form tempts you into thinking that you can measure in the $(|0\rangle, |1\rangle)$ basis and leave the qubits untouched, you will be mistaken, because this works for the first ensemble but not the second.[t] Don't take it personally, though, because this inability to “eavesdrop” on both ensembles is fundamental to quantum physics and forms the basis of quantum cryptography. This is outlined in Sec. 3 and discussed in more depth later in the book.
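The claim that the two ensembles share one density operator, yet respond differently to the same measurement, can be checked directly. This is a minimal sketch with real amplitudes, written here in the order (amplitude of $|0\rangle$, amplitude of $|1\rangle$), which is a convention chosen for the example. The quantity `mean_overlap_after_measurement` is an illustrative name for the average overlap between a member's original state and the state it is left in by a projective measurement.

```python
import math

R = 1 / math.sqrt(2)
ZERO, ONE = [1.0, 0.0], [0.0, 1.0]
ZERO_BAR = [R, R]      # 2^{-1/2} (|0> + |1>)
ONE_BAR = [-R, R]      # 2^{-1/2} (|1> - |0>)

def projector(psi):
    """|psi><psi| for real amplitudes."""
    return [[psi[i] * psi[j] for j in range(2)] for i in range(2)]

def ensemble_rho(states):
    """Density matrix of an ensemble with equal numbers of each listed state."""
    n = len(states)
    return [[sum(projector(s)[i][j] for s in states) / n for j in range(2)]
            for i in range(2)]

def mean_overlap_after_measurement(psi, basis):
    """Outcome e occurs with probability <e|psi>^2 and leaves the state e,
    whose overlap with the original is again <e|psi>^2."""
    return sum(sum(x * y for x, y in zip(e, psi)) ** 4 for e in basis)

rho_first = ensemble_rho([ZERO, ONE])
rho_second = ensemble_rho([ZERO_BAR, ONE_BAR])
print(rho_first, rho_second)

for member in (ZERO, ONE, ZERO_BAR, ONE_BAR):
    print(member, mean_overlap_after_measurement(member, [ZERO, ONE]))
```

Members of the first ensemble come through a $(|0\rangle, |1\rangle)$ measurement untouched, while members of the second are left in their original state only half the time on average, even though no measurement on the ensembles can tell the two density operators apart.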
Any up to date work on open quantum systems should at least mention the quantum state approaches [16,18-20], as alternatives to the ensemble density operator view. These have developed considerably over the last decade. They all describe an individual member of an open ensemble by a state $|\psi\rangle$, which evolves stochastically; essentially there are extra bits to Eq. 3 which model the effect of the environment. The evolution is such that the average of $|\psi\rangle\langle\psi|$ over the stochastic variables gives density operator evolution consistent with the appropriate master equation. Various approaches, unravellings of the master equation, exist, such as quantum trajectories [16], quantum state diffusion [18,19] and quantum jumps [20]. The most appropriate one to use generally depends on the system and environment. Their virtue is that they are able to produce pictures of individual quantum systems, in keeping with the language we often use to describe them (projective measurements actually happen and thermal systems hop continually) and underpinned by the correct statistics. Such methods have proved extremely useful in mainstream quantum optics and so it seems likely that they will prove to be a very handy bag of tools for QIT modellers as well.
[s] On the globe, these are two points diametrically opposed on the equator. In the second basis viewpoint, these are now regarded as the poles. The original poles thus lie on the new equator, and there is complete symmetry between the two basis viewpoints.

[t] Clearly you should not have been tempted, since the density operator can be rewritten $\rho = \tfrac12\left(|\bar{0}\rangle\langle\bar{0}| + |\bar{1}\rangle\langle\bar{1}|\right)$.
2.5 No cloning
Quantum systems lead a rather private existence. It is physically impossible to copy the state of a quantum system to a second identical one, leaving the original untouched. This is really a consequence of what has already been discussed; nevertheless, given its importance for QIT, it is worth stressing. From the fragility of quantum states, it is clear that simply measuring a system and then also placing the second system in the outcome state is useless: in general (and a copier has to work generally), neither will be in the original state. Alternatively, you might think that some subtle quantum coherent process, which preserves superposition states, could be devised to realize the cloning. Not so! Once again this is a property of Nature and not down to our hamfistedness. Even if a unitary operator $U_c$ acting on systems A and B can be arranged to copy the basis states of A onto some initial state $|i\rangle_B$, so $U_c|0\rangle_A|i\rangle_B = |0\rangle_A|0\rangle_B$ and $U_c|1\rangle_A|i\rangle_B = |1\rangle_A|1\rangle_B$, it is clear that with the superposition state of Eq. 1 the result is

$$U_c|\psi\rangle_A|i\rangle_B = a\,|0\rangle_A|0\rangle_B + b\,|1\rangle_A|1\rangle_B . \tag{17}$$

This entangled state is certainly not equal to $|\psi\rangle_A|\psi\rangle_B$, so the copying does not work in general.

The “no-cloning theorem” [21,22] for quantum states has important implications for us. Eavesdroppers are thus unable to use cloning to try and beat a quantum cryptosystem by copying each qubit of a transmission. Similarly, it will not be possible to run off a copy of the state of a quantum computer part way through a computation, to use as backup in the event of subsequent errors. This simple approach to error correction is no good. Given this, it is not immediately obvious that any sort of quantum error correction is possible. However, as you will see later in the book, recent remarkable research has shown that some forms of error correction and prolongation of quantum coherence can be done.
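The failure of the would-be copier on superposition states can be seen in a few lines. As an assumption for illustration, the copying unitary is taken to be a controlled-NOT acting on a blank second qubit in state $|0\rangle$; this is one unitary that does copy the two basis states, and the text does not single out any particular one. Names such as `clone_attempt_overlap` are hypothetical.

```python
def kron(u, v):
    """Tensor product of two state vectors, first system varying slowest."""
    return [x * y for x in u for y in v]

def cnot(state):
    """Flip B when A is 1; basis order |00>, |01>, |10>, |11> (A then B)."""
    return [state[0], state[1], state[3], state[2]]

def overlap_sq(u, v):
    return abs(sum(complex(x).conjugate() * y for x, y in zip(u, v))) ** 2

def clone_attempt_overlap(a, b):
    """Overlap between the copier output a|00> + b|11> and the wanted |psi>|psi>."""
    psi = [a, b]                     # amplitudes of |0> and |1>
    blank = [1.0, 0.0]               # assumed initial state of B
    copied = cnot(kron(psi, blank))
    wanted = kron(psi, psi)
    return overlap_sq(wanted, copied)

print("basis state |0>:", clone_attempt_overlap(1.0, 0.0))
print("basis state |1>:", clone_attempt_overlap(0.0, 1.0))
print("superposition a = 0.6, b = 0.8:", clone_attempt_overlap(0.6, 0.8))
```

The overlap is exactly one only for the two basis states; for any genuine superposition the output is the entangled state of Eq. 17 rather than two independent copies.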
3 Quantum cryptography

3.1 The idea
The only cipher which is known to be mathematically secure is the one-time pad (or Vernam cipher, after Gilbert Vernam). Current public key cryptosystems rely on the assumed mathematical difficulty of certain operations (such as factoring in the case of RSA [6]); they are thus unable to guarantee security.[u] A one-time pad requires a random bit string, the key, as long as the secret message to be communicated. The key must be known only by the sender (Alice) and the receiver (Bob); “one-time” refers to the fact that any key should be used only once. To encode the message, Alice simply adds modulo 2 each bit of the key to its corresponding bit in the message. To decode, Bob simply repeats this procedure. Provided that an eavesdropper, Eve, has no information about the key, she cannot decipher the encoded message. To her it will look like a random string of bits; she needs the key to crack the encoding. The security of the message thus reduces to the security of the key.

[u] I suppose that people do not worry too much about this as it is assumed that the cracking of any such hard problems will happen in the academic and research community, and so will become public knowledge. Perhaps we should get suspicious if some of the top number theorists stop publishing and buy big yachts ...?

Herein lies a problem, though, because if Alice and Bob share the key as ordinary classical information they cannot be sure that nobody else has shared their supposedly secret key. In principle, Eve can read a classical key without leaving any evidence at all of her snooping. The impact of quantum physics is to solve this problem. If Alice and Bob use qubits, they can establish a shared key which they can be sure is known only to them. They then have a guaranteed secure quantum cryptosystem because the irreversibility of quantum measurement (this is the only way Eve can examine the qubits) ensures that Eve cannot snoop without leaving evidence of this. A wider discussion of the whole area of quantum cryptology is given by Hoi-Kwong Lo in Chapter Four; this short section just introduces the original idea and approach.

Alice and Bob need a “quantum channel”, along which qubits are sent, and a form of conventional public communication channel such as a broadcast radio system. The fundamental requirement of the public channel is that Eve cannot block all the transmissions and then replace them with her own spoof messages; if she could, she could break the security. The sacrifice of Alice and Bob's spoof-proofing is that Eve can hear their public transmissions without any effort and without revealing her presence.

Where Eve has to attempt to listen in is on the quantum transmissions; this is where she gets caught. Suppose first that Alice simply sends to Bob a random string of qubits with states $|0\rangle$ or $|1\rangle$. Knowing that he will be getting these states, Bob can measure them without introducing any irreversibility. Apparatus shortcomings aside, he gets perfect results and he and Alice have a shared random bit string. The trouble with this is that Eve can do the same! In this case the quantum channel is effectively being used classically, so Eve can listen in without being detected. To combat this, Alice uses a second pair of states, in quantum language a second basis. These are the states $|\bar{0}\rangle$ and $|\bar{1}\rangle$ defined in Eq. 16, superpositions of the other basis states $|0\rangle$ and $|1\rangle$ (and vice versa). Bob now has a problem because if Alice sends at random one of the four states $|0\rangle$, $|1\rangle$, $|\bar{0}\rangle$ and $|\bar{1}\rangle$, he does not know what to measure! He therefore chooses at random to measure projecting onto the $(|0\rangle, |1\rangle)$ basis, or onto the $(|\bar{0}\rangle, |\bar{1}\rangle)$ basis. Half the time he will be okay, but half the time he will choose to measure a state which is a superposition, as seen in the basis in which he is measuring. These states will be irreversibly corrupted by Bob's quantum measurement, and so must be discarded. This is done by him telling Alice publicly the sequence of measurements he made (but not the results!); she then identifies which data are to be kept (which is called the raw quantum transmission, RQT) and communicates this back to Bob. On average they sacrifice half of the transmission; however, their gain is that they confound Eve.

Eve has a problem when Alice uses four states, the same problem as Bob. She does not know what to measure, so essentially all she can do is the same as Bob, and guess. Consider just the RQT, just the data kept by Alice and Bob. For half of this Eve will guess wrongly, and measure in the opposite basis to that used by Alice and Bob. The irreversibility of quantum measurement ensures that Eve corrupts all these qubits en route to Bob. He has equal chances of recovering the correct value (sent by Alice), or getting an error, when he measures such a corrupt qubit. Eve therefore corrupts one quarter of the RQT that she intercepts; quantum physics guarantees this.

Besides this original protocol, other quantum cryptographic protocols and procedures now exist. However, they all essentially rely on the same idea: Force Eve to undertake some guesswork as to what to measure and quantum mechanics will ensure that she leaves evidence, in the form of errors in the RQT. By public sacrifice of a sample of the RQT, Alice and Bob can thus determine how much of this has been intercepted. If it is the lot, they bin it and try again.
However, at least they know; this is why it is best to use the quantum channel to establish a secret key, rather than send the actual message. If only a part of the RQT has been read, Alice and Bob can find and eliminate the errors, and then distil from the correct data a smaller secret shared string which forms their final cryptographic key [23]. All this is done by public discussions. Even if Eve knows some of the RQT, she will still know essentially nothing about the final key. For example [23], if Eve corrupts 4% of a 2000 bit RQT, Alice and Bob are able to distil a 754 bit key, about which Eve knows much less than one bit.
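The four-state scheme described above (commonly known as BB84) and the one-quarter error rate caused by an eavesdropper can be illustrated with a small classical simulation. This is a sketch under simplifying assumptions: perfect apparatus, an Eve who measures every qubit in a randomly guessed basis and resends what she finds, and a measurement rule that returns the prepared bit when the bases match and a fair coin flip otherwise. Error elimination and key distillation are not modelled, and all names are illustrative.

```python
import random

def measure(bit, prep_basis, meas_basis, rng):
    """Matching bases return the prepared bit; otherwise the outcome is random."""
    return bit if prep_basis == meas_basis else rng.randrange(2)

def run(n, eavesdrop, seed=0):
    """Send n qubits and return the bits Alice and Bob keep (the RQT)."""
    rng = random.Random(seed)
    kept_alice, kept_bob = [], []
    for _ in range(n):
        bit, basis_a = rng.randrange(2), rng.randrange(2)
        state_bit, state_basis = bit, basis_a
        if eavesdrop:
            basis_e = rng.randrange(2)                   # Eve has to guess a basis
            state_bit = measure(state_bit, state_basis, basis_e, rng)
            state_basis = basis_e                        # she resends what she measured
        basis_b = rng.randrange(2)                       # Bob guesses too
        result = measure(state_bit, state_basis, basis_b, rng)
        if basis_b == basis_a:                           # public discussion: keep matches
            kept_alice.append(bit)
            kept_bob.append(result)
    return kept_alice, kept_bob

def error_rate(x, y):
    return sum(p != q for p, q in zip(x, y)) / len(x)

def one_time_pad(bits, key):
    """Add the key to the message modulo 2; applying it twice recovers the message."""
    return [m ^ k for m, k in zip(bits, key)]

alice, bob = run(20000, eavesdrop=False)
print("fraction kept:", len(alice) / 20000, "errors without Eve:", error_rate(alice, bob))

alice_e, bob_e = run(20000, eavesdrop=True)
print("errors with Eve intercepting everything:", error_rate(alice_e, bob_e))

message = [1, 0, 1, 1, 0, 0, 1, 0]
cipher = one_time_pad(message, alice)        # Alice encodes with her copy of the key
print("decoded:", one_time_pad(cipher, bob)) # Bob decodes with his
```

Comparing a publicly sacrificed sample of the kept bits is what lets Alice and Bob estimate this error rate in practice; here the whole RQT is compared only because the simulation has access to both sides.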
3.2 Experiments

Quantum cryptography is not just a pipe-dream of theoreticians. This is one area of QIT which has made it off the drawing board. There is total consensus in the field that photons (quanta of light) are the best qubits for this purpose. All the working systems use them; their polarizations or phases are used as the bit values. The first prototype ran in 1989 at IBM,[v] [23] over a short distance under laboratory conditions. Since then, a number of groups have advanced the technology and produced much more practical systems. Examples are those of Nicolas Gisin's group [25,26] (GAP in Geneva, Switzerland), John Rarity's group [27] (at DERA, Malvern, UK), Paul Townsend's group [28] (at BT Laboratories, Ipswich, UK), Jim Franson's group [29,30] (at Johns Hopkins University in the US) and Richard Hughes' group [31,32] (at Los Alamos National Laboratory in the US). Rarity's group have used entangled photons, and in 1994 demonstrated a violation of a Bell inequality down 4 km of optical fibre! Townsend's group have run a quantum cryptosystem simultaneously with conventional data transmission down an installed fibre, using wavelength-division multiplexing! GAP borrowed 23 km of Swiss Telecom optical fibre which runs under Lake Geneva, and ran a quantum cryptosystem down this! They have also extended the Bell inequality violation distance [33] up to 10.9 km! Franson's and Hughes' groups have run quantum cryptosystems in free space, down lit corridors and outside in bright daylight! Between them, all these groups have addressed and solved many of the problems which lie between prototype and real, practical systems. An example benchmark is the GAP result, establishing a 20 kbit key at about 0.5 Hz over 23 km. Much higher data rates have been achieved for shorter bursts or shorter distances. A detailed discussion on experimental quantum cryptography (with a broader spectrum of references) is given in Chapter Five by Hugo Zbinden.
4 Quantum computing
Whereas irreversibility is what enables quantum cryptography, it may end up being the insurmountable hurdle for useful quantum computing. Decoherence of any of the qubit components of a quantum computer may trash the running of the whole unitary algorithm. Apart from measurements designed into a quantum computation, which may well be made right at the end, to reveal the result, irreversibility means trouble. If you keep opening the oven door to see what is happening, or the door fits badly so heat leaks to the environment, your soufflé will flop.

[v] Charles Bennett at IBM also holds the first patent for quantum cryptosystems [24]; further refinements are also patented.

Quantum computing gets its potential power from initial superposition states evolving reversibly and generating entanglement between the many components of a quantum machine. The $2^m$ possible states of an $m$-bit classical register form a suitable basis, so an $m$-qubit register can be placed in a superposition of all these states. This is why certain problems may be solved “exponentially faster” by a quantum machine, in comparison to any classical machine. For a problem whose solution requires some property of the results of all $2^m$ different calculations, these have to be calculated separately in the classical case. On the other hand, if some clever manipulation can be performed on a quantum computer state (which has evolved to contain $2^m$ parts, corresponding to all the classical results), to yield the collective property in just one run, the solution of such a problem can be obtained with exponentially less effort! Chapter Six, by Adriano Barenco, discusses details of quantum computing, from the gates and networks needed through to the types of algorithms which can usefully be run on such machines.

As decoherence rubbishes nice reversible unitary evolution, and this is vital for quantum computation, the effects of the environment have to be held at bay for the duration of any computation. Unfortunately, decoherence bites harder at bigger, more complex, quantum systems. Roughly speaking, a composite of $n$ quantum systems decoheres $n$ times more quickly than one of the individual members.[w] Given this, it seems unlikely that careful shielding of a quantum computer alone will render it able to perform useful calculations. Some form of active state stabilization, to preserve unitarity and prevent errors, will almost certainly be required as a useful computer will contain $n \gg 1$ qubits. Despite the “no-cloning” theorem, this can be done.
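The statement that an $m$-qubit register can hold a superposition of all $2^m$ classical register states is easy to make concrete. The sketch below builds the equal superposition by placing each qubit in $2^{-1/2}(|0\rangle + |1\rangle)$ and taking the tensor product; this particular preparation is an assumption chosen for simplicity, and the function name is illustrative.

```python
import math

def uniform_register(m):
    """Amplitudes of an m-qubit register with every qubit in 2^{-1/2}(|0> + |1>)."""
    plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]
    state = [1.0]
    for _ in range(m):
        # Tensor in one more qubit: the number of amplitudes doubles each time.
        state = [x * y for x in state for y in plus]
    return state

for m in (1, 2, 3, 10):
    state = uniform_register(m)
    print(m, len(state), state[0], sum(x * x for x in state))
```

The number of amplitudes a classical simulation has to store doubles with every added qubit, which is the counting behind the phrase “exponentially faster”; it says nothing by itself about which problems can actually exploit it.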
The basic idea behind the procedure is the same as in classical error correction: build in redundancy and use this to protect against (some level of) errors. However, the implementations are more subtle because, on top of cloning being outlawed, the richer space of quantum states contains a greater variety of errors, in comparison to the simple bit-flip errors which can occur with classical bits. The first developments, independently by Peter Shor and Andrew Steane, showed how a number of qubits could simply have their decoherence time lengthened, by encoding them into a greater number.34-37 Essentially, the entropy which arises from the interactions of all the qubits with the environment is massaged into just the redundant ones, leaving the important ones unscathed. Although fine for the storage of quantum information, this is inadequate for computation, where information is manipulated by interactions between qubits as the system evolves. More recently, Shor,38 Steane 39 and others have addressed this problem, and shown that in principle quantum computations can be performed in a fault tolerant manner. A sample result is that, provided the error probability for a single quantum operation between two qubits is sufficiently small, a coherent computation of $10^{12}$ such steps and involving around 80 qubits should be possible. Andrew Steane (Chapter Seven) and John Preskill (Chapter Eight) discuss the important topics of error correction and fault tolerance.
ʷ Handwaving: In Sec. 2.4 it was seen that coherence dies like $\exp(-Kt)$ for a qubit, where $K$ characterizes the coupling to the environment. Taking such a factor for each of n qubits, the effective decoherence time of the whole system is reduced to $\sim (nK)^{-1}$.
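The classical idea referred to above, building in redundancy to protect against some level of errors, can be illustrated with a three-bit repetition code. This is a minimal sketch of the classical analogue only, not of the quantum codes of Shor and Steane; the function names and the independent bit-flip error model are assumptions made for illustration.

```python
from itertools import product


def encode(bit):
    """Encode one classical bit as three identical copies (repetition code)."""
    return [bit, bit, bit]


def decode(codeword):
    """Majority vote: recovers the bit whenever at most one copy was flipped."""
    return 1 if sum(codeword) >= 2 else 0


def logical_error_probability(p):
    """Probability that majority voting fails, assuming each copy is
    flipped independently with probability p (an illustrative error model)."""
    total = 0.0
    for flips in product([0, 1], repeat=3):
        received = [b ^ f for b, f in zip(encode(0), flips)]
        if decode(received) != 0:
            weight = 1.0
            for f in flips:
                weight *= p if f else (1.0 - p)
            total += weight
    return total


if __name__ == "__main__":
    for p in (0.1, 0.01):
        print(p, logical_error_probability(p))
```

For small p the failure probability is roughly 3p², which is smaller than p itself; the redundancy helps only when the individual error rate is already low.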
4.1 Examples
Here's a list of things you might do with a quantum computer. This may not yet convince you to go and put a down payment on a machine, but it should whet your appetite for the later chapters of this book.
• Factoring: Given the classical cryptographic importance of factoring, the most well-known example to date of a quantum algorithm is Peter Shor's factoring algorithm.40 Factoring of a large composite integer N is not proven to be intractable classically, but to date no good algorithms for this exist.ˣ Shor's algorithm works by turning the problem into that of finding the (very large) period r of a periodic function.ʸ Given r, it is a bit of elementary number theory to deduce factors of N; finding r is the hard bit. At least it is hard classically, because it requires a very large number of calculations, to plot the function and read off its period. Quantum mechanically, all these calculations can be performed in parallel. The clever manipulation is then to transform the final state, by applying a discrete Fourier transform, to one where a single measurement will then yield r. (Actually, this is not quite true: there is a probabilistic element, so a few runs are needed, but not very many.)
• Simulation: A quantum computer would be an excellent basic research tool. It is hard to squash a sizeable Hilbert space into ordinary memory, so simulating complex interacting quantum systems on a conventional computer is really hard work. Simulating them 41 on an actual quantum machine would be much easier! Nuclear physicists, material scientists, molecular chemists and many others would queue up for time on a quantum computer, to investigate novel systems, regimes and materials inaccessible with classical modelling tools. An interesting simulation, which may soon be realized as it only needs a few qubits, is that of quantum chaos.42
• Searching and estimation: A classical search of a random list of M items to find a particular one requires the examination of at least M/2 of them to have a 50% success probability. Lov Grover has shown 43 how a quantum search could find an item in only $O(M^{1/2})$ steps. In effect, using superposition states enables the examination of multiple items simultaneously. This speeds up the search, although in this case not exponentially. A similar square root improvement over classical algorithms for estimating the median of M data can be achieved in the quantum case.44
• Frequency standard: As the first working quantum machines will certainly consist of only a few interacting qubits, it would be nice to find something useful that can be done with such a simple system. A possibility is to use the ideas developed for quantum error correction in something other than a computer. A frequency standard effectively relies on the coherent oscillation of a pure atomic quantum state, so it is limited by decoherence as the atom/ion interacts with its environment.ᶻ The problem is subtle. It is not simply one of preserving a static state; the oscillation cannot be ignored in a frequency standard! The errors are harder to remove from a time-varying state. Nevertheless, it seems that some entanglement between ions has potential benefit.45
ˣ For example, the factoring of a 130 decimal digit number took 500 MIPS years of computer effort, and a big supercomputer crunch at the end! (An average workstation runs at around 10 MIPS, million instructions per second.)
ʸ $f_N(x) = y^x \bmod N$, where y is an integer coprime with N. This is a periodic function of the variable x, so $f_N(x) = f_N(x + r)$.
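The "bit of elementary number theory" in the factoring example can be sketched classically. The code below finds the period r of y^x mod N by brute force (the step a quantum computer would speed up) and then deduces factors from r using greatest common divisors. It is a toy illustration for small N, not Shor's quantum algorithm; the function names are hypothetical and the gcd recipe is the standard textbook one rather than something spelled out in this chapter.

```python
from math import gcd


def find_period(y, N):
    """Smallest r > 0 with y**r % N == 1, found by brute force.
    This is the classically hard step for large N."""
    if N < 2 or gcd(y, N) != 1:
        raise ValueError("need N >= 2 and y coprime with N")
    value, r = y % N, 1
    while value != 1:
        value = (value * y) % N
        r += 1
    return r


def factors_from_period(y, N):
    """Deduce two nontrivial factors of N from the period of y**x % N.
    Returns None when this choice of y is unlucky (odd period, or
    y**(r/2) congruent to -1 mod N); another y must then be tried."""
    r = find_period(y, N)
    if r % 2 == 1:
        return None
    half = pow(y, r // 2, N)
    if half == N - 1:
        return None
    return tuple(sorted((gcd(half - 1, N), gcd(half + 1, N))))


if __name__ == "__main__":
    print(find_period(7, 15), factors_from_period(7, 15))
```

With N = 15 and y = 7 the period is 4, and the two gcds give the factors 3 and 5. The unlucky cases are one reason a few runs may be needed, in addition to the probabilistic element of the quantum measurement mentioned in the text.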
ᶻ The dominant effect is dephasing, which can be described for a two state atom/ion by a single environment operator $\sigma_z$ in the model given in Sec. 2.4.
4.2 Experiments
Whereas quantum cryptography relies on the independent behaviour of a string of non-interacting photon qubits, interactions between qubits are a must for quantum computation. There are a number of candidate systems currently being researched. There is no clear favourite as yet, to mirror the use of photons for cryptography. Those jostling for position are:
1. Ions/atoms in an electromagnetic trap, interacting through their quantum vibrational motion. Their internal energy levels form qubits and external laser fields can be coupled to these.
2. Atoms in beams, interacting electromagnetically with cavity or travelling photons. Cavity photon number states and atomic levels (Rydberg or optical) form qubits; external fields (microwave or optical) can be coupled in.
3. Electrons in quantum dots, interacting electrostatically or possibly magnetically. The discrete levels of the confined electrons form qubits (or possibly qunits) and they couple readily to external fields.
4. Spin systems, interacting through their magnetic moments. These might be in a regular lattice 46,47 or, at a smaller scale, different spins within a large molecule, the so-called NMR quantum computing.48,49 A static external field separates out discrete spin levels for qubits. Time dependent fields can be applied to manipulate the system; in particular, in the NMR case the technology for doing this is very well developed.
5. Superconducting systems, interacting through the quantum motion of electric charges 50,51 or magnetic flux.52 Such systems also have discrete levels and can be probed with external currents, voltages and fluxes.
Numbers 1 and 2 originally took the lead in experiments. David Wineland's group 53 at NIST in Colorado demonstrated a quantum logic operation between two qubits, with a single ion in an ion trap. Jeff Kimble's group 54 at Caltech demonstrated atom-photon cavity interactions which could form the basis of a similar quantum gate, and Jim Franson has proposed a technique 55 for enhancing two-photon non-linear phase shifts (again, the basis for a two-qubit gate). More recently, Wineland's group have cooled two ions to their collective motional ground state 56 and prepared two ions in chosen entangled states.57
Although ions, atoms and photons claimed the first breakthroughs for quantum gates, there is now growing interest in NMR quantum computing.48,49 Experimentally, this started with the demonstration 58 of a GHZ 59 three-particle entangled state. (Eq. 9 is an example of such a state, if the environment is simply taken to be a third qubit.) Since then, examples of simple quantum algorithms run with just a few qubits have begun to appear. The Deutsch-Jozsa algorithm 60 determines whether a function is constant or balanced, analogous to seeing if a coin is biased (having two HEADS or two TAILS) or fair (having HT or TH), and does so faster than any classical algorithm. This has been implemented by two experimental groups.61,62 The Grover search algorithm 43 has also been implemented on a two-qubit computer.63 The simplest form of quantum error correction is the encoding of one qubit into three, to protect against phase errors (see footnote z); this has also been demonstrated.64 One potential problem with all this liquid state NMR work (which uses non-pure ensembles of molecules) is its scalability to larger numbers of qubits. An effective purification algorithm does exist; 65 however, it will be a huge experimental challenge to find sufficient operational NMR qubits to make this algorithm usable.
Another point worth noting is that, compared to the other options, the construction of (at least some) complex quantum dot and superconducting systems will probably be rather easier, since fabrication expertise already exists here due to their use in conventional IT. Semiconducting and superconducting systems could thus be crucial ingredients of future QIT.
An interesting hybrid approach, which could utilize the advantages of long nuclear spin coherence times and fabrication techniques, is the idea of effective solid state NMR.46,47 Rather than using the ensemble techniques of liquid state NMR, this would involve addressing individual spins in a lattice. Bruce Kane has given a detailed analysis 46 for the use of phosphorus spins in silicon. There are tough fabrication challenges to overcome if such a machine is to be built. However, these are comparable with the challenges faced by the next generation of conventional electronics.
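The two-qubit Grover search mentioned above is small enough to follow by hand. The sketch below simulates it classically with a list of amplitudes: an oracle flips the sign of the marked item, then every amplitude is reflected about the mean. The function name and the "inversion about the mean" formulation are illustrative choices, and this is a state-vector simulation, not a description of the NMR experiment.

```python
from math import sqrt


def grover_probabilities(num_items, marked, iterations):
    """Simulate Grover iterations on a list of real amplitudes and
    return the measurement probability of each item."""
    amps = [1.0 / sqrt(num_items)] * num_items  # uniform superposition
    for _ in range(iterations):
        amps[marked] = -amps[marked]            # oracle: flip the marked sign
        mean = sum(amps) / num_items
        amps = [2.0 * mean - a for a in amps]   # inversion about the mean
    return [a * a for a in amps]


if __name__ == "__main__":
    # Two qubits index four items; a single iteration is enough here.
    print(grover_probabilities(4, marked=2, iterations=1))
```

For four items a single iteration puts all of the probability on the marked item, whereas a classical search would need to examine two items on average to reach 50% success.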
The quantum computers operated to date are clearly way short of being useful, of actually doing something that we cannot achieve with a conventional machine. However, it is important to realize that lots of two-qubit gates (kept quantum coherent), à la those which exist already, will be sufficient to build any quantum processor. More complex quantum gates may appear, but they are not necessary. It is known theoretically that pretty well any two-qubit gate is universal. Add in coherent operations applied to a single qubit (to fundamental quantum physicists these are really old hat compared to two-qubit gates) and you have all the ingredients you need to build any quantum processor. Consequently, blueprints have already been drawn up for devices such as Shor's factoring machine.
For ordinary classical (irreversible) computing, in principle just three basic gates are needed to build any processor. Nevertheless, real machines usually contain many rather more complicated gates, because it is more practical and convenient to build them this way. If real quantum machines develop, they may well follow suit, using more complex tailored gates rather than being made entirely from universal building blocks.
Given the early stages of experimental work on quantum computing, for the meantime basic research in all of the QIT building block areas (including potential new ones) is likely to contribute to the ultimate goal of useful technology. Current practical quantum computing research areas are reviewed later in this book; Thomas Pellizzari discusses optical and ionic systems in Chapter Nine and Isaac Chuang discusses NMR systems in Chapter Ten.
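As a small illustration of the building blocks discussed above, one single-qubit operation followed by one two-qubit gate already produces an entangled state. The sketch below applies a Hadamard gate to the first qubit and then a controlled-NOT to the state |00>. The choice of these two particular gates, the amplitude ordering (|00>, |01>, |10>, |11>) and the function names are assumptions made for the example; the chapter itself does not single out specific gates.

```python
from math import sqrt


def hadamard_on_first(state):
    """Apply a Hadamard gate to the first qubit of a two-qubit state
    given as amplitudes [a00, a01, a10, a11]."""
    a00, a01, a10, a11 = state
    s = 1.0 / sqrt(2.0)
    return [s * (a00 + a10), s * (a01 + a11),
            s * (a00 - a10), s * (a01 - a11)]


def cnot_first_controls_second(state):
    """Controlled-NOT: flip the second qubit when the first qubit is 1,
    which swaps the amplitudes of |10> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


if __name__ == "__main__":
    start = [1.0, 0.0, 0.0, 0.0]  # the state |00>
    print(cnot_first_controls_second(hadamard_on_first(start)))
```

The result has equal amplitudes on |00> and |11> and none on |01> or |10>, so the two qubits are perfectly correlated even though neither has a definite value on its own.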
5 Summary and comments
Here are a few comments to take forward for the rest of the book. At the end, in Chapter Eleven, Charles Bennett discusses aspects of the future of QIT, some open questions and how the field may develop.
• Quantum physics has the potential to generate both evolutionary and revolutionary developments in information technology. Expect evolutionary improvements to conventional logical processing to have shorter lead times than those for the emergence of radically new forms of processor.
• The intrinsic irreversibility of quantum measurement enables guaranteed secure communications. Eavesdroppers cannot intercept quantum transmissions without corrupting some of the data, thus exposing themselves. Quantum cryptosystems use secret keys, shared quantum mechanically, as one-time pads.
• Quantum cryptosystems work in the real world, not just in sanitized laboratories. A benchmark key is 20 kbit, established down 23 km of optical fibre under Lake Geneva at 0.5 Hz. Much higher ($\sim 10^3$) bit rates have been achieved in shorter bursts.
• Quantum systems can exist in superposition states, which simultaneously contain parts corresponding to different classical states. A complex quantum machine could thus process an exponentially large number of classical calculations in one run. Problems like factoring would be tractable with quantum parallelization.
• Complex quantum systems lose their coherence much more quickly than simple ones. Decoherence destroys quantum parallelism, generating errors. Despite the “no-cloning” theorem, quantum error correction is possible, massaging errors and entropy out of systems and prolonging their unitary life.
• Some individual quantum gates have been made and some very simple quantum algorithms have been run with few-qubit NMR systems. Roughly 2000 qubits (plus many more for error correction), coherent as they interact, would be needed to factor a 400 bit number. This is a big challenge for the future.
• There has been a lot of excitement in the media about NMR quantum computing. However, the idea that we will soon be doing serious quantum computing with our coffee is probably mostly froth. Nevertheless, research in this practical area, along with that on ion traps, atomic beams, photons, quantum dots and superconductors, is at a very interesting stage.
• Research is on-going into uses for processors containing just a handful of qubits. Coherently manipulating entanglement in these systems is the goal; this may have applications to frequency standards and in quantum simulations, as well as being of tremendous fundamental importance. Of course, the search is also still on for other useful quantum algorithms, additional to Shor's, which would run on bigger machines.
• In addition to the practical interest in QIT, the fields of quantum information and computing provide a new arena for testing and understanding fundamental questions in quantum mechanics. For example, they have helped stimulate experimentalists to master the mapping out of actual quantum states of light, atoms and molecules, and encouraged theorists to delve deeper into quantum entanglement and separability.
• Quantum information technology seems unlikely to displace large areas of existing IT and more likely to emerge alongside it, defining new applications and markets. Given the might of the current industry, the short term payback will therefore almost certainly come from evolutionary quantum-assisted developments. However, given the successes at the basic research level over the last few years, it seems clear that future research efforts should be spread across the whole spectrum, rather than simply being focussed on evolutionary short term goals.
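The exponential growth behind the summary's remarks on superposition can be made concrete by counting how much classical memory a full description of an m-qubit state would need. This is a rough sketch: the figure of 16 bytes per complex amplitude (two 64-bit floats) and the function name are assumptions made for illustration.

```python
def state_vector_bytes(m, bytes_per_amplitude=16):
    """Memory needed to store all 2**m complex amplitudes of an m-qubit
    state, assuming a fixed number of bytes per amplitude."""
    return (2 ** m) * bytes_per_amplitude


if __name__ == "__main__":
    for m in (10, 30, 50):
        print(m, "qubits:", state_vector_bytes(m) / 2 ** 30, "GiB")
```

Under these assumptions 30 qubits already require 16 GiB, and each additional qubit doubles the requirement, which is why simulating complex quantum systems on a conventional computer is such hard work.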
References
1. K. K. Likharev, Physics World, vol. 10, no. 5, 39 (May 1997).
2. A. Barenco, Contemporary Physics 37, 375 (1996).
3. A. K. Ekert and R. Jozsa, Rev. Mod. Phys. 68, 733 (1996).
4. T. P. Spiller, Proceedings of the IEEE, vol. 84, no. 12, 1719 (1996).
5. http://eve.physics.ox.ac.uk/QChome.html
   http://vesta.physics.ucla.edu/~smolin/index.html
   http://feynman.stanford.edu/qcomp/
   http://www.iro.umontreal.ca/labs/theorique/index_en.html
   http://xxx.lanl.gov/archive/quant-ph
   http://www.lsr.ph.ic.ac.uk/TQO/EPSRC/
6. R. Rivest, A. Shamir and L. Adleman, “On Digital Signatures and Public Key Cryptosystems,” MIT Laboratory for Computer Science Technical Report, MIT/LCS/TR-212 (January 1979).
7. A. Einstein, B. Podolsky and N. Rosen, Physical Review 47, 777 (1935).
8. J. S. Bell, Physics 1, 195 (1964); Rev. Mod. Phys. 38, 447 (1966).
9. C. H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. K. Wootters, Phys. Rev. Lett. 70, 1895 (1993).
10. C. H. Bennett and S. J. Wiesner, Phys. Rev. Lett. 69, 2881 (1992).
11. B. Schumacher, Phys. Rev. A 51, 2738 (1995).
12. J. von Neumann, Mathematical foundations of quantum mechanics, Ch. 5 (Princeton University Press, 1955).
13. R. Landauer, Phys. Lett. A 217, 188 (1996).
14. A. Peres, Phys. Rev. A 32, 3266 (1985).
15. U. Weiss, Quantum dissipative systems, Series in Modern Condensed Matter Physics Vol. 2 (World Scientific Press, 1993). (This book also contains a large list of very useful references.)
16. H. Carmichael, An open systems approach to quantum optics, Lecture Notes in Physics m18 (Springer-Verlag Press, 1993).
17. M. Sargent III, M. O. Scully and W. E. Lamb, Jr., Laser Physics, Ch. 16 (Addison-Wesley Press, 1974).
18. N. Gisin and I. C. Percival, J. Phys. A 25, 5677 (1992).
19. L. Diosi, N. Gisin and W. T. Strunz, “Non-Markovian Quantum State Diffusion,” preprint quant-ph/9803062.
20. M. B. Plenio and P. L. Knight, “The quantum jump approach to dissipative dynamics in quantum optics,” preprint quant-ph/9702007, to appear in Rev. Mod. Phys.
21. W. K. Wootters and W. H. Zurek, Nature 299, 802 (1982).
22. D. Dieks, Phys. Lett. A 92, 271 (1982).
23. C. H. Bennett, F. Bessette, G. Brassard, L. Salvail and J. Smolin, J. Cryptology 5, 3 (1992).
24. C. H. Bennett, International Business Machines Corporation, “Interferometric quantum cryptographic key distribution system,” United States Patent Number 5,307,410 (April 26, 1994).
25. A. Muller, H. Zbinden and N. Gisin, Europhys. Lett. 33, 335 (1996).
26. H. Zbinden, J. D. Gautier, N. Gisin, B. Huttner, A. Muller and W. Tittel, Electron. Lett. 33, 586 (1997).
27. P. R. Tapster, J. G. Rarity and P. C. M. Owens, Phys. Rev. Lett. 73, 1923 (1994).
28. P. D. Townsend, Electronics Letters 33, 188 (1997).
29. J. D. Franson and B. C. Jacobs, “Quantum cryptography without optical fibers,” in QELS '96, paper presented at the Quantum Electronics and Laser Science Conference, Vol. 10, 1996 Technical Digest Series, Conference Edition (IEEE Cat. No. 96CH35902).
30. B. C. Jacobs and J. D. Franson, Opt. Lett. 21, 1854 (1996).
31. W. T. Buttler, R. J. Hughes, P. G. Kwiat, G. G. Luther, G. L. Morgan, J. E. Nordholt, C. G. Peterson and C. M. Simmons, “Free-Space Quantum Key Distribution,” preprint quant-ph/9801006, to appear in Phys. Rev. A (1998).
32. W. T. Buttler, R. J. Hughes, P. G. Kwiat, S. K. Lamoreaux, G. G. Luther, G. L. Morgan, J. E. Nordholt, C. G. Peterson and C. M. Simmons, “Practical free-space quantum key distribution over 1 km,” preprint quant-ph/9805071.
33. W. Tittel, J. Brendel, H. Zbinden and N. Gisin, “Violation of Bell inequalities by photons more than 10 km apart,” preprint quant-ph/9806043.
34. P. W. Shor, Phys. Rev. A 52, 2493 (1995).
35. A. R. Calderbank and P. W. Shor, Phys. Rev. A 54, 1098 (1996).
36. A. M. Steane, Proc. R. Soc. Lond. A 452, 2551 (1996).
37. A. M. Steane, Phys. Rev. Lett. 77, 793 (1996).
38. P. W. Shor, “Fault-tolerant quantum computation,” in Proc. 37th Symp. on Foundations of Computer Science (IEEE Computer Society Press, 1996).
39. A. M. Steane, Phys. Rev. Lett. 78, 2252 (1997).
40. P. W. Shor, “Algorithms for Quantum Computation: Discrete Log and Factoring,” in Proc. 35th IEEE Symp. on Foundations of Computer Science, ed. S. Goldwasser (IEEE Computer Society Press, 1994).
41. S. Lloyd, Science 273, 1073 (1996).
42. R. Schack, “Using a quantum computer to investigate quantum chaos,” preprint quant-ph/9705016.
43. L. K. Grover, “A fast quantum mechanical algorithm for database search,” in Proc. 28th Annual ACM Symposium on the Theory of Computing (STOC) (1996); Phys. Rev. Lett. 79, 325 (1997).
44. L. K. Grover, “A fast quantum mechanical algorithm for estimating the median,” preprint quant-ph/9607024; Bell Labs Technical Memorandum No. ITD-96-30115J.
45. S. F. Huelga, C. Macchiavello, T. Pellizzari, A. K. Ekert, M. B. Plenio and J. I. Cirac, Phys. Rev. Lett. 79, 3865 (1997).
46. B. E. Kane, Nature 393, 133 (1998).
47. H. Wei, X. Xue and S. D. Morgera, “NMR Quantum Automata in Doped Crystals,” preprint quant-ph/9805059.
48. N. A. Gershenfeld and I. L. Chuang, Science 275, 350 (1997).
49. D. G. Cory, A. F. Fahmy and T. F. Havel, Proc. Nat. Acad. Sci. 94 (5), 1634 (1997); D. G. Cory, M. D. Price, A. F. Fahmy and T. F. Havel, Physica D (in press), preprint quant-ph/9709001.
50. A. Shnirman, G. Schön and Z. Hermon, Phys. Rev. Lett. 79, 2371 (1997).
51. A. Shnirman and G. Schön, “Quantum Measurements Performed with a Single-Electron Transistor,” preprint cond-mat/9801125.
52. M. F. Bocko, A. M. Herr and M. J. Feldman, IEEE Trans. on Appl. Superconductivity 7, 3638 (1997).
53. C. Monroe, D. M. Meekhof, B. E. King, W. M. Itano and D. J. Wineland, Phys. Rev. Lett. 75, 4714 (1995).
54. Q. A. Turchette, C. J. Hood, W. Lange, H. Mabuchi and H. J. Kimble, Phys. Rev. Lett. 75, 4710 (1995).
55. J. D. Franson, Phys. Rev. Lett. 78, 3852 (1997).
56. B. E. King, C. S. Wood, C. J. Myatt, Q. A. Turchette, D. Leibfried, W. M. Itano, C. Monroe and D. J. Wineland, “Cooling the Collective Motion of Trapped Ions to Initialize a Quantum Register,” preprint quant-ph/9803023.
57. Q. A. Turchette, C. S. Wood, B. E. King, C. J. Myatt, D. Leibfried, W. M. Itano, C. Monroe and D. J. Wineland, “Deterministic entanglement of two trapped ions,” preprint quant-ph/9806012.
58. R. Laflamme, E. Knill, W. H. Zurek, P. Catasti and S. V. S. Mariappan, “NMR GHZ,” preprint quant-ph/9709025.
59. D. M. Greenberger, M. Horne and A. Zeilinger, in Bell's theorem, quantum mechanics and conceptions of the universe, ed. M. Kafatos (Kluwer Press, 1989).
60. D. Deutsch and R. Jozsa, Proc. Roy. Soc. Lond. A 439, 553 (1992).
61. I. L. Chuang, L. M. K. Vandersypen, X. Zhou, D. W. Leung and S. Lloyd, Nature 393, 143 (1998).
62. J. A. Jones and M. Mosca, “Implementation of a Quantum Algorithm to Solve Deutsch's Problem on a Nuclear Magnetic Resonance Quantum Computer,” preprint quant-ph/9801027, to appear in J. Chem. Phys. (1998).
63. J. A. Jones, M. Mosca and R. H. Hansen, Nature 393, 344 (1998).
64. D. G. Cory, W. Mass, M. Price, E. Knill, R. Laflamme, W. H. Zurek, T. F. Havel and S. S. Somaroo, “Experimental Quantum Error Correction,” preprint quant-ph/9802018.
65. L. J. Schulman and U. Vazirani, “Scalable NMR Quantum Computation,” preprint quant-ph/9804060.
THE JOY OF ENTANGLEMENT
SANDU POPESCU
University of Bristol, Tyndall Avenue, Bristol BS8 1TL, United Kingdom and BRIMS, Hewlett-Packard Laboratories, Filton Road, Stoke Gifford, Bristol BS34 8QZ, UK
DANIEL ROHRLICH
School of Physics and Astronomy, Tel Aviv University, Ramat Aviv 69978 Tel Aviv, Israel
For six decades, physicists have broken their heads over quantum entanglement. But by now we have learned to do more than break our heads over it. This review explains what is so baffling about entanglement and also what we can do with it. Entanglement is a resource for teleporting quantum states and constructing unbreakable codes, a resource that we can extract, purify, distribute and consume. The applications of entanglement lead us to develop new conceptual tools and adapt old ones, in particular the concept of entropy, which helps us exploit entanglement efficiently. Here we define entanglement and show, via the Einstein-Podolsky-Rosen (EPR) paradox and Bell's inequality, how it implies quantum nonlocality. We draw a parallel between reversible heat engines and reversible transformations among entangled states. This parallel leads to the “entropy of entanglement” as the measure of entanglement of bipartite pure states. We also discuss the entanglement of density matrices.
In 1935, Einstein, Podolsky and Rosen (EPR) presented a paradox that still baffles and surprises us today. Consider two particles that once interacted but are remote from one another now and do not interact. Although they do not interact, they are still entangled if their quantum state does not factor into a product of states of each particle. Entangled particles have correlated properties, and these correlations are at the heart of the paradox. Einstein, Podolsky and Rosen claimed that quantum mechanics is incomplete. They based their claim on the correlated properties of entangled particles. In addition, they made an entirely reasonable assumption: measurements on one particle do not affect the results of measurements on the other particle. They never dreamed that this assumption would prove wrong. But, as Bell put it, “The reasonable thing just doesn't work.” The assumption is inconsistent with quantum mechanics and with experiment. So today, the EPR paradox is more paradoxical than ever and generations of physicists have broken their heads over it.
Here we explain what makes entanglement so baffling and surprising. But we do not break our heads over it; we take a more positive approach to entanglement. After decades in which everyone talked about entanglement but no one did anything about it, physicists have begun to do things with entanglement. Here we treat entanglement as a resource that allows us to teleport quantum states and construct unbreakable codes, a resource that we can extract, purify, distribute and consume. The applications of entanglement lead us to develop new conceptual tools and to adapt old ones, in particular the concept of entropy. Like Carnot, we face fundamental questions about how to use this resource most efficiently, and the concept of entropy helps us exploit entanglement efficiently just as it helps us exploit energy efficiently.
This chapter consists of seven sections. The first section introduces and defines entanglement. Sec. 2 presents the EPR claim and Sec. 3 shows why it does not work. Sec. 4 shows how quantum nonlocality arises in the entanglement of remote systems. Sec. 5 is about converting entanglement from one form to another. Sec. 6 draws a parallel between reversible heat engines and reversible transformations among entangled states and derives the “entropy of entanglement” as the measure of entanglement of pure states. Sec. 7 discusses the entanglement of density matrices. The Appendix contains a derivation of the Clauser-Horne-Shimony-Holt (CHSH) inequality.
1 What is entanglement?
Quantum mechanics builds systems out of subsystems in a remarkable, holistic way. The states of the subsystems do not determine the state of the system. Schrödinger,3 commenting on the EPR paper in 1935, the year it appeared, coined the term entanglement for this aspect of quantum mechanics. In this section, we discuss entanglement mathematically, starting from a precise definition of entanglement.
Consider a system consisting of two subsystems. Quantum mechanics associates to each subsystem a Hilbert space. Let $H_A$ and $H_B$ denote these two Hilbert spaces; let $|i\rangle_A$ (where $i = 1, 2, \ldots$) represent a complete orthonormal basis for $H_A$, and $|j\rangle_B$ (where $j = 1, 2, \ldots$) a complete orthonormal basis for $H_B$. Quantum mechanics associates to the system, i.e., the two subsystems taken together, the Hilbert space $H_A \otimes H_B$, namely the Hilbert space spanned by the states $|i\rangle_A \otimes |j\rangle_B$. In the following, we will drop the tensor product symbol $\otimes$ and write $|i\rangle_A \otimes |j\rangle_B$ as $|i\rangle_A |j\rangle_B$, and so on. Any linear combination of the basis states $|i\rangle_A |j\rangle_B$ is a state of the system, and any state $|\Psi\rangle_{AB}$ of the system can be written

$$|\Psi\rangle_{AB} = \sum_{i,j} c_{ij}\, |i\rangle_A |j\rangle_B , \tag{1}$$

where the $c_{ij}$ are complex coefficients; we take $|\Psi\rangle_{AB}$ to be normalized, hence

$$\sum_{i,j} |c_{ij}|^2 = 1 . \tag{2}$$

• A special case of Eq. 1 is a direct product state in which $|\Psi\rangle_{AB}$ factors into (a tensor product of) a normalized state $|\psi^{(A)}\rangle_A = \sum_i c_i^{(A)} |i\rangle_A$ in $H_A$ and a normalized state $|\psi^{(B)}\rangle_B = \sum_j c_j^{(B)} |j\rangle_B$ in $H_B$:

$$|\Psi\rangle_{AB} = |\psi^{(A)}\rangle_A\, |\psi^{(B)}\rangle_B . \tag{3}$$

Not every state in $H_A \otimes H_B$ is a product state. Take, for example, the state $(|1\rangle_A |1\rangle_B + |2\rangle_A |2\rangle_B)/\sqrt{2}$; if you try to write it as a direct product of states of $H_A$ and $H_B$, you will find that you cannot.
• If $|\Psi\rangle_{AB}$ is not a product state, we say that it is entangled.
Comparing Eqs. 1 - 3, we see that I*)AB is a product state only if it satisfies the following constraint: there must be complex coefficients ciA) and c y ) such that for every i and j , cij = ci( A )cj( B ) . The problem is that in general it is not immediately clear, from a look at the coefficients cij, whether they satisfy this constraint; it takes some work to check whether they do. This practical, technical problem has a solution, which we now present. If it does not interest you right now, we suggest that you skip to the last paragraph of this section. We can determine whether IQ)AB is entangled by the following method (see also the chapter by Jozsa in this volume). Consider an operator OA that acts only in the Hilbert space H A . The expectation value of OA in the state I*)AB is
where trA and trB denote traces over states of the subsystems A and B respecdenote the adjoint of I*)AB. The expression in square tively; we let AB(QI brackets, trBl*)ABAB(*l, is the reduced density matrix P A for the subsystem A. It satisfies trAP.4 = 1 (since I*) A B is a normalized state). If \!P)AB is a
32
Introduction to Q u a n t u m Computation and Information
product state, say ( \ ~ ) A B= I $ J ( ~ ) ) A I G ( ~ then ) ) B , p i = P A , because P A is simply the projection operator l $ J ( A ) ) ~ ~ ( I.G (We A ) will now prove the converse: if p i = P A , then I \ ~ ) A B is a product state. We begin the proof by diagonalizing P A . If P A has more than one nonzero term on the diagonal, then p i # P A . (Hence IS)ABis entangled; we have shown that p i = P A for I \ ~ ) A B a product state.) So suppose that there is only one nonzero term on the diagonal, so that p i = P A . Let I 4 i ) ~be a basis of eigenstates of P A , with 141)~as the eigenstate with nonzero eigenvalue. We rewrite I!P)AB as follows:
Now we ask, what is the expectation value of the projector I + ~ ) A ~ ( 4 i ifI i # l? It must be zero; hence cij for i # 1 vanishes, and 1 9 )is~a ~ product state. So p i = P A if and only if 1 9 )is~a ~ product state. This result leads to a convenient expression for [ ! P ) A B , as follows. If we normalize the states j
and denote the normalized states ,+I.): then the 145)~are orthogonal. The proof is that P A = trB[I\k)AB A B ( ! P l ] must be diagonal in the I d i ) ~basis, hence B ( ~ : I ~ ; ) B must vanish unless i = j . We conclude, therefore, that any state 1 9 )of~two ~ subsystems A and B can be reduced to the form IWAB =
Cdi14i)aW)B.
(7)
i
We can even choose the coefficients di to be real and positive, by absorbing any phase factors in the definitions of the bases. Eq. 7, known as the Schmidt decomposition, is very useful in the study of entanglement. Any state I!P)AB of the two subsystems A and B has a Schmidt decomposition, as in Eq. 7. If the sum in Eq. 7 contains just one term, I S),, is a product state; otherwise, IS)ABis an entangled state. Suppose that I!P)AB is entangled. If we look at subsystem A in isolation, we find it in a mixed state, with a probability ldiI2 to be in the state I $ ~ ) A . This is also the probability for the subsystem B to be in the state ( 4 : ) ~The . reduced density matrices for the subsystems yield these probabilities. What they do not yield is the fact that whenever subsystem A is in the state I $ ~ ) A , subsystem B is in the state 14:)~. That is, the states of the two subsystems do not account for the correlations between measurements on the subsystems. These correlations, which
the mixed states of A and B leave out, are contained in the state $|\Psi\rangle_{AB}$ of the combined system. Thus, whenever two or more subsystems are entangled, the state of the combined system contains more than the states of the subsystems combined, as we remarked at the beginning of this section.

2 Hidden variables?
Quantum mechanics is not deterministic, in the following sense: if we prepare two identical systems in the same state, and perform the same measurement on each, the results of the measurement may not be the same. This indeterminism is fundamental, because (according to quantum mechanics) the initial quantum states were truly identical. Einstein never liked the indeterminism of quantum mechanics; he claimed that the initial quantum states were not truly identical. That is, the quantum state of a system does not completely describe it; there are additional, “hidden” variables that are missing from the quantum state. The EPR paper applied the correlations between subsystems in an entangled state to argue convincingly for Einstein’s claim. Here we present a version of the EPR argument; in the next section we show how it leads to trouble. Let’s start with something simple: a singlet state of two nonidentical spin-1/2 particles:

$$|\Psi\rangle_{AB} = \frac{1}{\sqrt{2}}\left[\,|\uparrow\rangle_A |\downarrow\rangle_B - |\downarrow\rangle_A |\uparrow\rangle_B\,\right] u(x_A)\, v(x_B). \tag{8}$$
In Eq. 8, $|\uparrow\rangle_A$, $|\downarrow\rangle_A$ and $|\uparrow\rangle_B$, $|\downarrow\rangle_B$ represent the spin states of the two particles (polarized along a common z-axis) and $x_A$ and $x_B$ their respective coordinates. Consider measurements of the spin of each particle along the z-axis. Quantum mechanics makes two predictions:

• the results $|\uparrow\rangle_A$ and $|\downarrow\rangle_A$ are equally probable, and the results $|\uparrow\rangle_B$ and $|\downarrow\rangle_B$ are equally probable;

• the measured z-components of the spins show perfect anticorrelation.

These are the only quantum predictions regarding the results of the measurements.
So far we have made no assumption about the locations of these particles. But now let us assume that the wave functions $u(x_A)$ and $v(x_B)$ are localized to regions that are disjoint and far apart. Let two friends, Alice and Bob, help us with the measurements on this entangled state. Alice measures the z-component of spin of particle A and Bob measures the z-component of spin
of particle B; Alice’s measurements are spacelike separated from Bob’s. Although Alice and Bob cannot compare their results immediately, they always ultimately find that their results are perfectly anticorrelated. But this perfect anticorrelation is very strange. Suppose that (in some Lorentz frame) Alice makes her measurement after Bob has already measured. How can Alice’s particle know what result Bob obtained? And if we say that it doesn’t know what result Bob obtained, we can choose a Lorentz frame in which Bob makes his measurement after Alice has already measured. How can Bob’s particle know what result Alice obtained? In short, how can the results be perfectly anticorrelated? Consider two mutually exclusive possibilities. The first possibility is that the results of both Alice’s and Bob’s respective measurements are determined locally. That is, Alice’s particle, together with its immediate environment, completely determine the result of Alice’s measurement (and likewise for Bob). Each result arises locally, and the fact that the results are anticorrelated is due to the past history of the two particles. The second possibility is that Alice’s particle, together with its immediate environment, do not completely determine the result of Alice’s measurement (and likewise for Bob). For example, there could be some randomness in the results. But then Alice and Bob would not see the perfect anticorrelation that they do see, unless the result of Alice’s measurement could influence the result of Bob’s, or vice versa. Since a spacelike interval separates the two measurements, such an influence seems incompatible with the theory of relativity, which does not allow influences to propagate faster than light. Thus, it seems that the results of both measurements are determined before the measurements (by some hidden variables) and quantum mechanics is incomplete, as EPR claimed. 
These hidden variables are local because their local interaction with a measuring device determines the measurement result. To be explicit, let us assume that the particles reach Alice and Bob with prepared answers to the questions that Alice and Bob ask. The answers are opposite, so the results of the measurements are anticorrelated. The EPR claim is reasonable. But “the reasonable thing just doesn’t work”, as we see in the next section.
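The “prepared answers” picture is easy to simulate for this single measurement setting. The sketch below is an illustration under stated assumptions, not a model proposed in the text: each pair is given a random hidden answer at the source, the partner carries the opposite answer, and Alice and Bob simply read the answers off locally. The function name and the seed are hypothetical.

```python
import random

def run_local_plan_pairs(n_pairs, seed=0):
    """Simulate pairs that leave the source with prepared, opposite answers
    (+1 or -1) to the question 'what is your z-component of spin?'."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_pairs):
        answer_a = rng.choice((1, -1))   # hidden variable fixed at the source
        answer_b = -answer_a             # the partner carries the opposite answer
        results.append((answer_a, answer_b))
    return results

results = run_local_plan_pairs(10_000)
anticorrelated = all(a * b == -1 for a, b in results)
fraction_up_alice = sum(1 for a, _ in results if a == 1) / len(results)
print(anticorrelated, fraction_up_alice)
```

Such a model reproduces both quantum predictions for z-axis measurements on the singlet: equal probabilities for each observer and perfect anticorrelation. It says nothing yet about other measurement choices.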
3 A thought experiment
To exhibit the baffling nonlocality of quantum mechanics (what Bell first showed), we present a thought experiment due to Greenberger, Horne and Zeilinger (GHZ). It shows, dramatically and directly, how the EPR claim breaks down. We ask Alice, Bob and their sidekick Claire to help us with an
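experiment on a state $|\Psi_{GHZ}\rangle$ for three distinguishable spin-1/2 particles:

$$|\Psi_{GHZ}\rangle = \frac{1}{\sqrt{2}}\left[\,|\uparrow\rangle_A|\uparrow\rangle_B|\uparrow\rangle_C - |\downarrow\rangle_A|\downarrow\rangle_B|\downarrow\rangle_C\,\right]. \tag{9}$$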
(Here, as in the previous section, $|\uparrow\rangle$ and $|\downarrow\rangle$ represent spin parallel and antiparallel, respectively, to the z-axis.) We assume (implicitly, since spatial wave functions don't appear in Eq. 9) that each particle is in a localized state remote from the other particles. We prepare an ensemble of particles in this state, and ask Alice, Bob and Claire to each take a particle (A, B and C, respectively) in each triplet. All three measure $\sigma_x$ on some of the particles at their disposal, $\sigma_y$ on the others. Alice's measurements are spacelike separated from Bob's and Claire's, and Bob's measurements are spacelike separated from Claire's. Consider two special cases: when two of the friends measure $\sigma_y$ and the third measures $\sigma_x$ on a triplet, and when all three of them measure $\sigma_x$ on a triplet. It just so happens that $|\Psi_{GHZ}\rangle$ is an eigenstate of the three operator products $\sigma_x^A\sigma_y^B\sigma_y^C$, $\sigma_y^A\sigma_x^B\sigma_y^C$ and $\sigma_y^A\sigma_y^B\sigma_x^C$ with eigenvalue 1 and is also an eigenstate of $\sigma_x^A\sigma_x^B\sigma_x^C$ with eigenvalue $-1$. (Here $\sigma_x^A$ operates on Alice's spin, $\sigma_x^B$ operates on Bob's, etc.) Thus if Alice, Bob and Claire all measure $\sigma_x$, they may obtain $-1,-1,-1$ or $-1,1,1$ or $1,-1,1$ or $1,1,-1$ respectively; if two measure $\sigma_y$ and the third measures $\sigma_x$, they may obtain $1,1,1$ or $1,-1,-1$ or $-1,1,-1$ or $-1,-1,1$ respectively; only these results are possible. It is only reasonable to assume that these correspondences among the measurements of Alice, Bob and Claire arise from properties of the particle triplets that exist before the measurements. Otherwise, the three particles must be superluminally gossiping about what Alice, Bob and Claire choose to measure. So let us assume that the particles in each triplet come prepared to answer any question that Alice, Bob and Claire may ask, without coordinating their answers at the last minute.
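The eigenvalue claims can be verified by direct calculation. The sketch below is one possible check in plain Python, assuming the GHZ state $(|\uparrow\uparrow\uparrow\rangle - |\downarrow\downarrow\downarrow\rangle)/\sqrt{2}$ and the usual Pauli conventions ($\sigma_y|\uparrow\rangle = i|\downarrow\rangle$, $\sigma_y|\downarrow\rangle = -i|\uparrow\rangle$). The dictionary representation and the function names are illustrative choices, not taken from the text.

```python
import math

def apply_pauli(state, axis, k):
    """Apply sigma_x ('x') or sigma_y ('y') to spin k of a state stored as
    {(bit_A, bit_B, bit_C): amplitude}, with bit 0 = up and bit 1 = down."""
    out = {}
    for bits, amp in state.items():
        flipped = bits[:k] + (1 - bits[k],) + bits[k + 1:]
        if axis == "y":
            # sigma_y |up> = i |down>,  sigma_y |down> = -i |up>
            amp = amp * (1j if bits[k] == 0 else -1j)
        out[flipped] = out.get(flipped, 0) + amp
    return out

def apply_product(state, axes):
    """Apply one Pauli operator per spin, e.g. axes = 'xyy'."""
    for k, axis in enumerate(axes):
        state = apply_pauli(state, axis, k)
    return state

def eigenvalue(state, axes):
    """Return +1 or -1 if the state is an eigenstate of the product, else None."""
    new = apply_product(state, axes)
    keys = set(state) | set(new)
    for lam in (1, -1):
        if all(abs(new.get(b, 0) - lam * state.get(b, 0)) < 1e-12 for b in keys):
            return lam
    return None

r = 1 / math.sqrt(2)
ghz = {(0, 0, 0): r, (1, 1, 1): -r}

for axes in ("xyy", "yxy", "yyx", "xxx"):
    print(axes, eigenvalue(ghz, axes))
```

The three mixed products return +1 and the all-$\sigma_x$ product returns $-1$, which is why the product of the three measured results is fixed in each of the two special cases.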
That is, each particle in each triplet carries a local plan that prepares its answers to Alice's, Bob's or Claire's questions; and the local plan ensures that all the answers are consistent with the predictions of quantum mechanics. In the GHZ experiment, such a local plan must contain (at least) six entries: two entries for the two possible measurements of each of the three observers. Here is an example of a local plan, in which $s_x^A$ denotes what a measurement of $\sigma_x^A$ yields, and so on:

$$s_x^A = -1, \quad s_x^B = 1, \quad s_x^C = -1, \qquad s_y^A = 1, \quad s_y^B = -1, \quad s_y^C = 1. \tag{10}$$

However, this plan is not satisfactory, because if the measurements are $\sigma_x^A$, $\sigma_x^B$
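and $\sigma_x^C$, the product of their measurements is not $-1$ as quantum mechanics requires. Guess what? No local plan is satisfactory. The proof is quite short. The predictions of quantum mechanics impose four constraints:

$$\begin{aligned} s_x^A\, s_y^B\, s_y^C &= 1,\\ s_y^A\, s_x^B\, s_y^C &= 1,\\ s_y^A\, s_y^B\, s_x^C &= 1,\\ s_x^A\, s_x^B\, s_x^C &= -1. \end{aligned} \tag{11}$$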
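Because a local plan has only six entries, each equal to $\pm 1$, the claim can also be cross-checked by brute force. The sketch below enumerates all 64 plans and tests the constraints stated above (each mixed product of one $\sigma_x$ and two $\sigma_y$ results equal to $+1$, and the all-$\sigma_x$ product equal to $-1$). The ordering of the entries in each tuple and the function names are assumptions made for this illustration.

```python
from itertools import product

def mixed_constraints_hold(plan):
    """The three constraints with one sigma_x and two sigma_y measurements."""
    s_xA, s_yA, s_xB, s_yB, s_xC, s_yC = plan
    return (s_xA * s_yB * s_yC == 1 and
            s_yA * s_xB * s_yC == 1 and
            s_yA * s_yB * s_xC == 1)

def all_x_constraint_holds(plan):
    """The constraint for three sigma_x measurements."""
    s_xA, _, s_xB, _, s_xC, _ = plan
    return s_xA * s_xB * s_xC == -1

# Every assignment of +1/-1 to (s_xA, s_yA, s_xB, s_yB, s_xC, s_yC)
all_plans = list(product((1, -1), repeat=6))
pass_mixed = [p for p in all_plans if mixed_constraints_hold(p)]
pass_all = [p for p in pass_mixed if all_x_constraint_holds(p)]

print(len(all_plans), len(pass_mixed), len(pass_all))
```

Eight plans satisfy the three mixed constraints, and every one of them gives $+1$ for the all-$\sigma_x$ product, so none satisfies all four.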
If we multiply together the first three lines of Eq. 11 and use $(s_y^A)^2 = (s_y^B)^2 = (s_y^C)^2 = 1$, we obtain

$$s_x^A\, s_x^B\, s_x^C = 1, \tag{12}$$

which contradicts the fourth line of Eq. 11. The “reasonable thing” just doesn’t work. No one has yet performed the GHZ thought experiment as an actual experiment, but by now you know that you must choose between what is “reasonable”, on the one hand, and quantum mechanics, on the other; and you, too, can break your head on this paradox. But here we emphasize the pleasures, and not the paradox, of entanglement. That’s why the title of this chapter is “The Joy of Entanglement”.

4 Nonlocality
Why is the EPR claim “the reasonable thing”? It is reasonable insofar as it follows from a principle that has long guided physicists: the principle of locality, that is, no action at a distance. Locality implies that particles cannot communicate over spacelike separations. The GHZ thought experiment, however, shows that, in some sense, particles do communicate over spacelike separations. Hence quantum mechanics is nonlocal. In Sect. 2, we stated that such nonlocality seems incompatible with the theory of relativity. Now we have to look more carefully at the apparent incompatibility. Can Alice, Bob and Claire use the GHZ experiment to exchange superluminal signals? Let Alice, say, measure the spin component of her particle along an arbitrary axis. By inspection of the state $|\Psi_{GHZ}\rangle$, we see that the two possible results of her measurement have probability 1/2, regardless of what the others choose to measure. The same is true of Bob and Claire. The friends cannot exchange superluminal messages by making measurements on
an ensemble of prepared GHZ triplets, because they cannot affect the probabilities of each others’ results; hence neither Alice, Bob nor Claire will ever see an effect in their measurements. The particles may communicate, but Alice, Bob and Claire cannot. Quantum nonlocality violates the spirit, but not the letter, of relativity theory.

The GHZ thought experiment is impressive, but it concerns the very special state $|\Psi_{GHZ}\rangle$. Also, it remains a thought experiment since no one has made an actual experiment out of it. Can we make a more general (and testable) statement about nonlocality in quantum mechanics? Indeed we can: every entangled state, of any number n of remote subsystems, is nonlocal! That is, given any entangled state, our friends (Alice, Bob, Claire, etc.) can obtain nonlocal correlations, that is, correlations that are inconsistent with the EPR assumption of local hidden variables. They just have to make the right measurements. The only states that are totally consistent with the EPR assumption are direct product states. The proof of this statement involves the Clauser, Horne, Shimony and Holt (CHSH) inequality, a generalization of Bell’s inequality. The CHSH inequality is an important experimental and theoretical tool for the study of nonlocality. We present a proof of the CHSH inequality in the Appendix. Furthermore, the statement follows from the results of the next section. There we show how different entangled states of two systems can be interconverted.

Since all entangled states (of remote systems) are nonlocal, we must ask, once again, whether quantum mechanics and relativity theory are compatible. The answer is that they are compatible: although quantum correlations can be nonlocal, they cannot be used for superluminal communication! The reason is that measurements on an isolated system depend only on the reduced density matrix for that system.
The reduced density matrix for a system is independent of measurements made on another system, even if the systems are entangled. The most thorough experimental demonstration to date of nonlocal correlations is due to Aspect and coworkers. The experiment involved photons in a singlet state. The results were consistent with quantum mechanics and violated a form of the Bell-CHSH inequality by five standard deviations. However, the results have not convinced everyone.
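The statement that one observer cannot affect the other's probabilities can be illustrated for the singlet. In the sketch below, which is an illustration rather than a general proof, Alice always measures along z while Bob measures along an axis tilted by an angle `theta` from z in the x-z plane. The parametrization of Bob's basis and the function names are assumptions of this example.

```python
import math

def joint_probabilities(c, theta):
    """P[a][b] for Alice measuring along z and Bob along an axis tilted by
    theta from z; c[a][b] are real amplitudes in the z basis (0 = up, 1 = down)."""
    bob_basis = [(math.cos(theta / 2), math.sin(theta / 2)),    # Bob's result +1
                 (-math.sin(theta / 2), math.cos(theta / 2))]   # Bob's result -1
    return [[abs(sum(vec[b] * c[a][b] for b in range(2))) ** 2
             for vec in bob_basis] for a in range(2)]

def alice_marginal(c, theta):
    """Alice's probabilities for up and down, summed over Bob's results."""
    return [sum(row) for row in joint_probabilities(c, theta)]

r = 1 / math.sqrt(2)
singlet = [[0.0, r], [-r, 0.0]]

for theta in (0.0, 0.3, math.pi / 2, 2.0, math.pi):
    print(round(theta, 3), alice_marginal(singlet, theta))
```

Alice's marginal probabilities stay at 1/2 for every choice of `theta`, even though the joint probabilities change with Bob's axis. Only when the two compare results do the correlations become visible.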
5 Manipulating entanglement
Entanglement, once a curiosity, appears today to be a resource. Using entangled pairs of spins, Alice and Bob can teleport an arbitrary quantum state [11] (see the chapter by Jozsa in this volume), and they can construct an unbreakable code [12] (see the chapter by Lo in this volume). The teleportation protocol
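requires spins in the singlet state of Eq. 8. But suppose Alice and Bob share, not a singlet state, but an entangled state $|\Psi_\alpha\rangle$:

$$|\Psi_\alpha\rangle = \alpha\,|\uparrow\rangle_A|\uparrow\rangle_B + (1-\alpha^2)^{1/2}\,|\downarrow\rangle_A|\downarrow\rangle_B, \tag{13}$$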
with $\alpha$ real. (We suppress the spatial wave function of the pairs.) $|\Psi_\alpha\rangle$ looks very different from Eq. 8; $|\Psi_\alpha\rangle$ contains terms $|\uparrow\rangle_A|\uparrow\rangle_B$ and $|\downarrow\rangle_A|\downarrow\rangle_B$ while Eq. 8 contains terms $|\uparrow\rangle_A|\downarrow\rangle_B$ and $|\downarrow\rangle_A|\uparrow\rangle_B$; also their relative sign is different. Yet these differences are insignificant. Alice could flip her spin or Bob could flip his, and then their spins would be antiparallel. (Note that Alice can flip her spin without knowing its state, just by, say, a $\pi$ rotation around the x-axis; the same is true of Bob.) Also, they can adjust the relative phase of the terms in $|\Psi_\alpha\rangle$ by briefly applying a magnetic field, parallel to the z-axis, to their spins. These are local operations. They belong to a general class of local operations, local unitary transformations. Local unitary transformations are unitary transformations that operate only on Alice’s system or only on Bob’s (or products of these). If Alice and Bob share any entangled state of two spins, they can put it in the form $|\Psi_\alpha\rangle$ by using local unitary operations. (Indeed, $|\Psi_\alpha\rangle$ is a special case of the Schmidt decomposition, Eq. 7.)

As long as $\alpha = 1/\sqrt{2}$, Alice and Bob can use $|\Psi_\alpha\rangle$ to teleport an arbitrary spin state. If, however, $\alpha \neq 1/\sqrt{2}$, teleportation will not be reliable. Errors will show up. Bob may not receive exactly the state that Alice sent. Yet suppose that Alice and Bob cannot tolerate errors. That is, suppose that Bob cannot use the teleported state unless he is sure that it is exactly the state that Alice sent. If Alice and Bob share only the state $|\Psi_\alpha\rangle$, what can they do? They can apply local filtering [13]. Suppose that $\alpha > 1/\sqrt{2}$, i.e., $\alpha$ is too large. Either Alice or Bob can reduce $\alpha$ as follows. Let Alice, say, run her spin through a selective filter that never absorbs the state $|\downarrow\rangle_A$, but sometimes absorbs the state $|\uparrow\rangle_A$. We denote the initial state of the filter by $|0\rangle$ and represent local filtering by a unitary operator U that sends

$$\begin{aligned} |\uparrow\rangle_A|0\rangle &\to \epsilon_1\,|\uparrow\rangle_A|0\rangle + \epsilon_2\,|\uparrow\rangle_A|1\rangle,\\ |\downarrow\rangle_A|0\rangle &\to |\downarrow\rangle_A|0\rangle. \end{aligned} \tag{14}$$
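Here $|1\rangle$ represents the state of the filter if it absorbs Alice’s spin, and $|\epsilon_1|^2 + |\epsilon_2|^2 = 1$. Eq. 14 is consistent with the requirement that U be unitary. After Alice runs her spin through the filter, the combined state of the two spins and the filter is

$$\left[\epsilon_1\alpha\,|\uparrow\rangle_A|\uparrow\rangle_B + (1-\alpha^2)^{1/2}\,|\downarrow\rangle_A|\downarrow\rangle_B\right]|0\rangle + \epsilon_2\alpha\,|\uparrow\rangle_A|\uparrow\rangle_B|1\rangle. \tag{15}$$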
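As a numerical sanity check on Eqs. 14 and 15, the sketch below applies the filter to $|\Psi_\alpha\rangle$ and returns the probability that the spin is not absorbed, together with the renormalized coefficients of the state that survives. It assumes real $\alpha$ and real $\epsilon_1$; the function name and the sample value $\alpha = 0.8$ are illustrative. The particular choice $\epsilon_1 = (1-\alpha^2)^{1/2}/\alpha$ used here is the one that makes the two surviving coefficients equal.

```python
import math

def procrustean_filter(alpha, eps1):
    """Filter applied to alpha|up,up> + sqrt(1 - alpha^2)|down,down>.
    Returns (probability that the filter does NOT absorb the spin,
             normalized (up-up, down-down) coefficients of the surviving state)."""
    beta = math.sqrt(1 - alpha ** 2)
    eps2_sq = 1 - eps1 ** 2              # unitarity: eps1^2 + eps2^2 = 1
    p_absorbed = alpha ** 2 * eps2_sq    # weight of the branch with the filter in |1>
    p_pass = 1 - p_absorbed
    norm = math.sqrt(p_pass)
    return p_pass, (eps1 * alpha / norm, beta / norm)

alpha = 0.8
eps1 = math.sqrt(1 - alpha ** 2) / alpha   # equalizes the two coefficients
p_pass, (up_up, down_down) = procrustean_filter(alpha, eps1)
print(p_pass, up_up, down_down)
```

With $\alpha = 0.8$ the filter passes the pair with probability $2(1-\alpha^2) = 0.72$, and the surviving state has equal coefficients $1/\sqrt{2}$.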
Now Alice looks in the filter. The chance is $|\alpha\epsilon_2|^2$ that she finds her spin there. If she does not find her spin in the filter, however, she knows (and
informs Bob) that the state of the two spins is given by the bracketed term in Eq. 15, up to normalization. In particular, if we choose $\epsilon_1 = (1-\alpha^2)^{1/2}/\alpha$, the state of the two spins will now be equivalent to a singlet state, and suitable for exact teleportation. So Alice and Bob have a chance $1 - |\alpha\epsilon_2|^2 = 2(1-\alpha^2)$ of producing a state that allows exact teleportation. Of course, they lose all the entanglement in the initial state $|\Psi_\alpha\rangle$ if the filter absorbs the $|\uparrow\rangle_A$ state, but if they are lucky, Alice and Bob get an entangled state that allows faithful teleportation.

If Alice and Bob share only one pair of entangled spins, local filtering is the best they can do. But what if they share many pairs in the state $|\Psi_\alpha\rangle$? Should they locally filter the pairs, one by one? Even in tailoring, there are economies of scale. A careful tailor can cut more clothes from one large piece of cloth than from many small pieces of cloth (with the same total area); less of the large piece of cloth goes to waste. Yet in local filtering, Alice and Bob throw away the part of $|\Psi_\alpha\rangle$ that doesn't fit into a singlet. Local filtering is also known as the Procrustean method of making a singlet, after Procrustes, the cruel giant of Greek myth who chopped or stretched guests to the size of his bed. Let us see whether Alice and Bob can do better; whether, in cutting out singlet pairs (as in cutting out clothes) there are economies of scale. To do better, Alice and Bob must apply collective operations to their entangled pairs: they must operate on the pairs together and not one by one. Suppose Alice and Bob share two pairs in the entangled state $|\Psi_\alpha\rangle$. The state of two pairs is

$$\left[\alpha\,|\uparrow\rangle_A|\uparrow\rangle_B + (1-\alpha^2)^{1/2}\,|\downarrow\rangle_A|\downarrow\rangle_B\right] \otimes \left[\alpha\,|\uparrow\rangle_{A'}|\uparrow\rangle_{B'} + (1-\alpha^2)^{1/2}\,|\downarrow\rangle_{A'}|\downarrow\rangle_{B'}\right], \tag{16}$$
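where A and A' refer to Alice's spins, and B and B' refer to Bob's. Expanding Eq. 16, we obtain

$$\alpha^2\,|\uparrow\rangle_A|\uparrow\rangle_B|\uparrow\rangle_{A'}|\uparrow\rangle_{B'} + (1-\alpha^2)\,|\downarrow\rangle_A|\downarrow\rangle_B|\downarrow\rangle_{A'}|\downarrow\rangle_{B'} + \alpha(1-\alpha^2)^{1/2}\left[\,|\uparrow\rangle_A|\uparrow\rangle_B|\downarrow\rangle_{A'}|\downarrow\rangle_{B'} + |\downarrow\rangle_A|\downarrow\rangle_B|\uparrow\rangle_{A'}|\uparrow\rangle_{B'}\right] \tag{17}$$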
for the state of the two pairs. Now let Bob, say, make a (local) measurement of total z-component of spin. If Bob measures $\sigma_z^B + \sigma_z^{B'}$, the result can be either 2, $-2$, or zero. (If Alice measures $\sigma_z^A + \sigma_z^{A'}$, the result is the same. Hence Alice and Bob do not even need to communicate.) Suppose the result is 0; the probability of this result is $2\alpha^2(1-\alpha^2)$. The state of the spins after the measurement is the bracketed term in Eq. 17. If we now define

$$|\Uparrow\rangle_A \equiv |\uparrow\rangle_A|\downarrow\rangle_{A'}, \quad |\Downarrow\rangle_A \equiv |\downarrow\rangle_A|\uparrow\rangle_{A'}, \quad |\Uparrow\rangle_B \equiv |\uparrow\rangle_B|\downarrow\rangle_{B'}, \quad |\Downarrow\rangle_B \equiv |\downarrow\rangle_B|\uparrow\rangle_{B'}, \tag{18}$$
we see that the state of the spins after the measurement is equivalent to a singlet, and allows exact teleportation. Actually, this way of making a singlet out of two pairs is too inefficient to pay. But the collective method, unlike the Procrustean method, gets more and more efficient as Alice and Bob apply it to more and more pairs. Bennett, Bernstein, Popescu and Schumacher [13] showed that Alice and Bob can obtain n singlets from k pairs of spins in the state $|\Psi_\alpha\rangle$, and as n, k become large, the ratio n/k approaches the limit

$$\lim_{n,k\to\infty} \frac{n}{k} = E(|\Psi_\alpha\rangle) = -\alpha^2 \log_2 \alpha^2 - (1-\alpha^2)\log_2(1-\alpha^2). \tag{19}$$
$E(|\Psi_\alpha\rangle)$ is called the entropy of entanglement, and equals the Shannon entropy of the squares of the coefficients of the Schmidt decomposition, Eq. 7. We shall see, in the next section, that Eq. 19 represents the highest possible yield of singlets from pairs in the state $|\Psi_\alpha\rangle$. The entropy of entanglement equals 1 if $\alpha = 1/\sqrt{2}$ (i.e., if $|\Psi_\alpha\rangle$ is equivalent to a singlet) and equals 0 for a product state; the yield in Eq. 19 is always greater than or equal to that of local filtering. For any entangled state $|\Psi_\alpha\rangle$, if Alice and Bob have a large enough ensemble of pairs in the state $|\Psi_\alpha\rangle$, they can extract singlets; Eq. 19 tells how many they can extract. The fact that Alice and Bob can do so gives us insight into the statement (in the last section) that every entangled state is nonlocal. Since Alice and Bob can extract singlets from an ensemble of spins in any entangled state, and since the nonlocality of the singlet state is already known, it must be that $|\Psi_\alpha\rangle$ is nonlocal as well; Alice and Bob could never obtain nonlocal states from local states by using only local interactions.

What about going the other way? Can Alice and Bob manufacture pairs of spins in the state $|\Psi_\alpha\rangle$ out of singlets, using only local operations? Indeed, they can. For example, Alice can manufacture, in her laboratory, pairs in the state $|\Psi_\alpha\rangle$; she then teleports one spin out of each entangled pair to Bob. In this way, Alice uses up one singlet pair for every spin that she teleports to Bob, so Alice and Bob use up k singlets to produce k pairs in the state $|\Psi_\alpha\rangle$. On the other hand, from k pairs in the state $|\Psi_\alpha\rangle$ they can only recover
$n < k$ singlet pairs. So this is not an efficient way to produce pairs in the state $|\Psi_\alpha\rangle$. However, Alice can teleport the pairs more efficiently using a method called quantum data compression. (See the chapter by Jozsa in this volume.) The idea behind quantum data compression is the following. Alice has to teleport k spins, i.e. a state in a $2^k$-dimensional Hilbert space. But the effective dimension of the Hilbert space is much smaller than $2^k$, because the k spins have a common source. For example, in the state $|\Psi_\alpha\rangle$ with $\alpha > 1/\sqrt{2}$, $|\uparrow\rangle_B$ is more likely than $|\downarrow\rangle_B$, so a sequence with every spin in the state $|\uparrow\rangle_B$ is much more likely than a sequence with every spin in the state $|\downarrow\rangle_B$; still more likely are sequences with most, but not all, spins in the state $|\uparrow\rangle_B$. In fact, the effective dimension of the Hilbert space approaches $2^n$, rather than $2^k$; that is, Alice can actually teleport the k spins to Bob without using more than the n singlets that Alice and Bob can obtain from k pairs in the state $|\Psi_\alpha\rangle$. In a word, the conversion of pairs in the state $|\Psi_\alpha\rangle$ into singlets is reversible. The reversibility of these entanglement manipulations is highly significant, as the next section shows.

6 Thermodynamics and entanglement
Alice and Bob can, using local operations, reduce any pure entangled state of two spins to the state $|\Psi_\alpha\rangle$, and the only parameter in $|\Psi_\alpha\rangle$ is $\alpha$. Plausibly, the closer $\alpha$ is to $1/\sqrt{2}$, the more entangled is the state $|\Psi_\alpha\rangle$. But even if $\alpha$ increases with increasing entanglement of $|\Psi_\alpha\rangle$, does $\alpha$ measure entanglement? There have been numerous proposed measures for entanglement [15], but is one of them the measure? This question gives us a chance to approach entanglement in a different way from the last section. In this section, we show that the entropy of entanglement, $E(|\Psi_\alpha\rangle)$ of Eq. 19, is the unique measure of entanglement for pure states [16]. Although Eq. 19 defines the entropy of entanglement, we will not refer to it until the end of this section. Until then, forget that we defined $E(|\Psi_\alpha\rangle)$ in the last section.

When Einstein searched for a universal formal principle from which to derive a new mechanics (namely, special relativity) he took for inspiration a general principle of thermodynamics: The laws of nature are such that it is impossible to construct a perpetuum mobile. This general principle (the second law) enabled Carnot to show that all reversible heat engines operating between given temperatures $T_1$ and $T_2$ are equally efficient. Consider two reversible heat engines; suppose that both absorb heat $Q_1$ at $T_1$ and expel heat $Q_2$ at $T_2$, but one does work $W$, and the other does work $W' > W$, per cycle. The first engine, if run in reverse, is a refrigerator (it absorbs heat $Q_2$ at $T_2$ and expels heat $Q_1$ at $T_1$) and requires only work $W$ per cycle. Thus
the two engines together could provide $W' - W$ in work per cycle without changing their environment. Such a conclusion contradicts the second law, so both engines must do the same work: $W = W'$.

We can draw an analogy with entanglement, as follows: The laws of nature are such that it is impossible to create (or increase) entanglement between remote quantum systems by local operations. Quantum mechanics does not allow local operations to create such entanglement, although they may preserve or destroy entanglement. So this general principle is analogous to the second law of thermodynamics. Furthermore, a reversible manipulation of entanglement (any reversible transformation, consisting only of local operations, that transforms one entangled state into another) is analogous to a reversible heat engine. Suppose that Alice and Bob share k pairs of systems in an entangled state, and that, by local operations only, they can transform the entanglement to n pairs of systems in a different entangled state. Since Bob and Alice have access to other systems that are not initially entangled, k and n may be different. Even if $n > k$, there need be no contradiction with the general principle that it is impossible to create entanglement by local operations, because the state of the n pairs may be less entangled than the state of the original k pairs. If Alice and Bob can transform k pairs in one entangled state into n pairs in another entangled state without destroying any entanglement, then any measure of entanglement must assign the same entanglement to the k initial pairs and the n final pairs. But did they not destroy any entanglement? That is, could Alice and Bob apply a more efficient set of local operations to obtain the same number n of final pairs from a smaller number $k' < k$ of initial pairs? The answer is that they cannot, if both transformations are reversible.
For if it were possible to transform $k'$ of the initial pairs into n of the final pairs by a different transformation, Alice and Bob could then reverse the first transformation and transform the n pairs in the final state to k pairs in the initial entangled state. In doing so, they would have added $k - k'$ entangled pairs to their initial supply, contradicting the general principle that it is impossible to create entanglement by local operations. Thus $k' = k$.

So far we have discussed these reversible local transformations of entanglement abstractly. Now we recall that we encountered them concretely in the last section. (In the last section the systems were spins; but the Hilbert space for any system can be embedded in a tensor product of spin Hilbert spaces, so without loss of generality we can let the systems be spins.) Alice and Bob can transform k pairs in an entangled state $|\Psi_\alpha\rangle$ into n pairs in a singlet state, using only local transformations; the transformation is reversible when the number of pairs becomes arbitrarily large. That is, the ratio n/k
tends to a constant in the limit $k \to \infty$. We can then assign, to k systems in a pure entangled state $|\Psi_\alpha\rangle$, the same measure of entanglement as n singlet pairs. Thus the problem of defining a measure of entanglement for k pure states reduces to the problem of defining a measure of entanglement for n singlets. At first, it might seem that many such measures, such as $n$, $n^2$ and $e^n$, would be admissible. But actually, the measure must be proportional to n. The reason is that the transformations under consideration are reversible only when the number of systems becomes arbitrarily large. Indeed, the ratio n/k nearly always tends to an irrational number, and if the number is irrational, we can never reversibly transform n singlets into a finite number k of systems in the state $|\Psi_\alpha\rangle$. Reversibility requires us to go to the limit of infinite n, and for infinite n there is no way to define total entanglement. We can only define entanglement per system. Here too, thermodynamics provides the formal principle: the thermodynamic limit requires us to define intensive quantities. Likewise, the measure of entanglement must be intensive, i.e. the measure of entanglement of n singlets must be proportional to n. It follows that the measure of entanglement for pure states is unique (up to a constant factor). Since the measure of entanglement of k systems $|\Psi_\alpha\rangle$ approaches the measure of entanglement of n pairs in a singlet state, and since the measure is intensive, we have $k\,E(|\Psi_\alpha\rangle) = n$, where E denotes the measure, and the measure of entanglement of a singlet state is 1. Thus

$$E(|\Psi_\alpha\rangle) = \lim_{n,k\to\infty} \frac{n}{k}. \tag{20}$$
Now Eq. 19 shows that this limit is equal to the entropy of entanglement of $|\Psi_\alpha\rangle$; so the measure of entanglement of $|\Psi_\alpha\rangle$ must equal its entropy of entanglement (as we have anticipated in the notation) up to a conventional proportionality constant, measuring the entanglement of a singlet pair, that we set to 1. In analogy with qubit, which denotes a single quantum bit of information, the word ebit denotes a single entangled bit. Eq. 19 measures the entanglement of $|\Psi_\alpha\rangle$ in ebits.
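Eq. 19 is straightforward to evaluate. The sketch below computes the entropy of entanglement of $|\Psi_\alpha\rangle$ in ebits and, for comparison, the success probability $2(1-\alpha^2)$ of local filtering quoted in Sect. 5, read here as the expected number of singlets per pair. That formula is used only for $\alpha \geq 1/\sqrt{2}$, the case treated in the text. The function names and the sample values of $\alpha$ are illustrative.

```python
import math

def entropy_of_entanglement(alpha):
    """Shannon entropy, in bits, of the probabilities alpha^2 and 1 - alpha^2."""
    total = 0.0
    for p in (alpha ** 2, 1 - alpha ** 2):
        if p > 0:                      # the term 0 * log2(0) is taken as 0
            total -= p * math.log2(p)
    return total

def filtering_yield(alpha):
    """Probability 2(1 - alpha^2) that local filtering produces a singlet;
    only meaningful here for alpha >= 1/sqrt(2)."""
    return 2 * (1 - alpha ** 2)

for alpha in (1 / math.sqrt(2), 0.8, 0.9, 0.99, 1.0):
    print(round(alpha, 4), entropy_of_entanglement(alpha), filtering_yield(alpha))
```

For $\alpha = 1/\sqrt{2}$ both quantities equal 1 and for $\alpha = 1$ both vanish; in between, the entropy of entanglement exceeds the filtering yield, in line with the economies of scale of the collective method.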
7 Entangled density matrices

In Sec. 1 we noted that the subsystems of a system in a pure entangled state $|\Psi\rangle_{AB}$ are in mixed states. Suppose Alice and Bob share pairs of spin-1/2 particles in a pure singlet state; Alice gets one spin in each pair and Bob gets the other. If Bob makes no measurement, the pairs remain in a pure singlet state, but Alice’s spins are in a mixed state. If, however, Bob measures $\sigma_z^B$ on his spins, he leaves Alice with an equal mixture of spins with $\sigma_z^A = \pm 1$.
Relativistic causality implies that Alice cannot distinguish between these two mixtures, because otherwise Bob could send superluminal messages to Alice by choosing whether or not to measure $\sigma_z^B$. But we already know (from Sect. 4) that quantum correlations do not allow superluminal signalling. Hence for Alice, the two mixtures are equivalent and correspond to the same density matrix.

So far we have assumed that there is a pure state $|\Psi\rangle_{AB}$ for the overall system. What if the overall system is itself in a mixed state? For example, Alice and Bob could share entangled pairs of spins, with half of the pairs in the singlet state of Eq. 8 and the other pairs in the triplet state

$$\frac{1}{\sqrt{2}}\left[\,|\uparrow\rangle_A|\downarrow\rangle_B + |\downarrow\rangle_A|\uparrow\rangle_B\,\right]. \tag{21}$$
(We suppress the spatial wave function of the pairs.) If they don’t know which pairs are in which state, they have a mixed state. We emphasize that a mixed state is a classical mixture of quantum states. Alice and Bob lack information about their pairs, but someone else could, perhaps, supply the information; perhaps Claire prepared the mixture and still remembers which pairs are which. Since the mixture is classical, we might expect questions about mixed states to be easy, once we can answer the same questions for pure states. Yet some of the central questions that we can answer about pure states remain open when we come to mixed states. For any practical applications we must answer these questions, because in practice, pure states are apt to decohere quickly and turn into mixed states, by interacting and entangling with the environment.

For any pure state, we can say whether it is entangled or not. For mixed states, we do not yet know how to do so. We know how to define an entangled mixed state: a mixed state that cannot be written as a mixture of direct product states is entangled. But it is difficult to apply this definition directly. To see the difficulty, let us consider an example. Suppose again that Alice and Bob share equally many pairs of spins in the singlet state and in the triplet state, and they don’t know which pairs are which. The density matrix for this mixed state is

$$\rho = \tfrac{1}{2}\,|\uparrow\rangle_A|\downarrow\rangle_B\;{}_A\langle\uparrow|\,{}_B\langle\downarrow| \;+\; \tfrac{1}{2}\,|\downarrow\rangle_A|\uparrow\rangle_B\;{}_A\langle\downarrow|\,{}_B\langle\uparrow| \,; \tag{22}$$
that is, this mixture of entangled states is equivalent to an equal mixture of the direct product states $|\uparrow\rangle_A|\downarrow\rangle_B$ and $|\downarrow\rangle_A|\uparrow\rangle_B$. Are these two mixtures indeed
indistinguishable for Alice and Bob? Indeed they are, for suppose Alice, Bob and Claire initially share the entangled three-particle state $|\Psi_{ABC}\rangle$:

$$|\Psi_{ABC}\rangle = \frac{1}{\sqrt{2}}\left[\,|\uparrow\rangle_A|\downarrow\rangle_B|\uparrow\rangle_C + |\downarrow\rangle_A|\uparrow\rangle_B|\downarrow\rangle_C\,\right]. \tag{23}$$
If Claire measures $\sigma_x^C$ on her spin(s), she leaves Alice and Bob with spins in an equal mixture of entangled singlet and triplet states, while if she measures $\sigma_z^C$, she leaves them with spins in an equal mixture of the two product states. Now relativistic causality implies that Alice and Bob cannot distinguish between these two mixtures, because otherwise Claire could send superluminal messages to Alice and Bob by her choice of what to measure. But quantum correlations do not allow superluminal signalling, so the two mixtures are equivalent. Although Alice and Bob can prepare the mixture by using entangled states, it is not an entangled mixture, because they can also prepare it using direct product states.

So far, this difficulty has prevented a generalization of results for pure states to density matrices. However, there are some results concerning density matrices of small dimension. For mixed states of two spin-1/2 particles, a necessary and sufficient condition for a density matrix to be entangled is known. (It is not a necessary condition for mixtures of higher spin states.) The question remains open: given an arbitrary density matrix, how can we tell if it is entangled or not?

We can manipulate pure entangled states, to concentrate and dilute entanglement. Can we do the same with mixed states? Here, too, there are partial results. Bennett et al. have shown how to purify some entangled mixed states and even how to extract singlets from them [19]. And here too, the big question remains open. In the last section, we noted that the concentration and dilution of entanglement, in pure states, are reversible processes. Reversibility is crucial for the derivation of the measure of entanglement. When it comes to mixtures, we do not know whether the corresponding processes are reversible. Is it possible to obtain as many singlets from a mixture, asymptotically, as it takes to create the mixture?
Until we can answer this question, we have at least two measures of entanglement for mixtures: entropy of distillation$^{19}$ (how many singlets we can purify from a mixture) and entropy of formation$^{19}$ (the least number of singlets required to produce a mixture).

We conclude with a brief restatement of our main point. The modern view of entanglement treats entanglement as a resource, like (free) energy. Like energy, entanglement can be consumed, distributed and converted to different
forms; like energy it obeys thermodynamic principles. Alongside the results we present or cite in this chapter are many that we had no space to mention; and more numerous yet are the many results that wait to be discovered.

Appendix

The CHSH inequality concerns measurements that Alice and Bob make on an ensemble of pairs of systems all prepared in the same initial state. Let us suppose that each pair of systems carries a definite local plan, which we denote $\lambda$. The local plan contains prepared responses to any question that Alice and Bob might ask - it is a crib sheet for the pair of systems. Let $\rho(\lambda)\,d\lambda$ be the fraction of pairs in the ensemble that carry the local plan $\lambda$. We normalize $\rho(\lambda)$ so that

$$\int d\lambda\,\rho(\lambda) = 1, \tag{24}$$

with the integration over all $\lambda$. Let $A$ and $A'$ denote measurements that Alice can make, and $B$ and $B'$ denote measurements that Bob can make. Given a local plan $\lambda$, let $P(A;a;\lambda)$ be the probability that a measurement $A$ yields the result $a$. Similarly, let $P(A,B;a,b;\lambda)$ be the probability that measurements $A$ and $B$ (on the same pair) yield results $a$ and $b$, respectively. Since these probabilities arise from the same local plan, $P(A,B;a,b;\lambda)$ factorizes:

$$P(A,B;a,b;\lambda) = P(A;a;\lambda)\,P(B;b;\lambda). \tag{25}$$

Now let $P(A,B;a,b)$ be the probability that measurements of $A$ and $B$ on a pair yield $a$ and $b$, respectively. It is the average of $P(A,B;a,b;\lambda)$ weighted by $\rho(\lambda)$, i.e.

$$P(A,B;a,b) = \int d\lambda\,\rho(\lambda)\,P(A,B;a,b;\lambda). \tag{26}$$
We define the correlation between measurements $A$ and $B$ to be

$$C(A,B) = \sum_{i,j} a_i b_j\,P(A,B;a_i,b_j), \tag{27}$$

where $a_i$ and $b_j$ are possible results of measurements $A$ and $B$, respectively. Assume $-1 \le a_i, b_j \le 1$. We will prove that a combination $S_{CHSH}$ of correlation functions,

$$S_{CHSH}(A,A';B,B') = C(A,B) + C(A',B) + C(A,B') - C(A',B'), \tag{28}$$

is bounded above and below:

$$-2 \le C(A,B) + C(A',B) + C(A,B') - C(A',B') \le 2. \tag{29}$$
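The bound of Eq. 29 can be checked by brute force in the simplest case. The following is a minimal sketch, not part of the original text: it assumes local plans that fix each result to exactly +1 or -1 (any distribution over such plans is an average of these cases), and for comparison it evaluates the same combination for the singlet correlation $-\hat{a}\cdot\hat{b}$ with four coplanar directions separated by $\pi/4$, as discussed at the end of this Appendix. The function names are illustrative only.

```python
import itertools
import math


def chsh(c_ab, c_apb, c_abp, c_apbp):
    """Combination of Eq. 28: C(A,B) + C(A',B) + C(A,B') - C(A',B')."""
    return c_ab + c_apb + c_abp - c_apbp


def deterministic_plan_values():
    """S_CHSH for every local plan that fixes a, a', b, b' to +1 or -1."""
    values = set()
    for a, ap, b, bp in itertools.product((-1, 1), repeat=4):
        values.add(chsh(a * b, ap * b, a * bp, ap * bp))
    return values


def singlet_chsh(angle_a, angle_ap, angle_b, angle_bp):
    """S_CHSH for coplanar directions, assuming C_Q = -cos(angle between them)."""
    def corr(x, y):
        return -math.cos(x - y)
    return chsh(corr(angle_a, angle_b), corr(angle_ap, angle_b),
                corr(angle_a, angle_bp), corr(angle_ap, angle_bp))


local_values = deterministic_plan_values()
# Directions in the order a', b, a, b' with pi/4 between neighbours.
quantum_value = singlet_chsh(angle_a=math.pi / 2, angle_ap=0.0,
                             angle_b=math.pi / 4, angle_bp=3 * math.pi / 4)
print(sorted(local_values), abs(quantum_value))
```

Every deterministic plan gives exactly +2 or -2, so any mixture of them stays within the bound, whereas the singlet value has magnitude $2\sqrt{2}$.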
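To prove Eq. 29, we fix $\lambda$ and look at the sum of products

$$a_i P(A;a_i;\lambda)\,[\,b_j P(B;b_j;\lambda) + b'_j P(B';b'_j;\lambda)\,] + a'_i P(A';a'_i;\lambda)\,[\,b_j P(B;b_j;\lambda) - b'_j P(B';b'_j;\lambda)\,]. \tag{30}$$

Each term in brackets is bounded in magnitude by 2, and their sum and difference are also bounded by 2. Since the absolute values of $a_i P(A;a_i;\lambda)$ and $a'_i P(A';a'_i;\lambda)$ are bounded by 1, the absolute value of the sum is bounded by 2:

$$-2 \le a_i b_j P(A;a_i;\lambda)P(B;b_j;\lambda) + a_i b'_j P(A;a_i;\lambda)P(B';b'_j;\lambda) + a'_i b_j P(A';a'_i;\lambda)P(B;b_j;\lambda) - a'_i b'_j P(A';a'_i;\lambda)P(B';b'_j;\lambda) \le 2. \tag{31}$$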
Summing over $i$ and $j$, multiplying by $\rho(\lambda)$ and integrating over $\lambda$, we obtain the CHSH inequality, Eq. 29.

Quantum correlations violate the CHSH inequality. For two spin-1/2 particles in the singlet state of Eq. 8, we may choose $A = \hat{a}\cdot\sigma$, $A' = \hat{a}'\cdot\sigma$, $B = \hat{b}\cdot\sigma$ and $B' = \hat{b}'\cdot\sigma$, where $\hat{a}$, $\hat{a}'$, $\hat{b}$ and $\hat{b}'$ are unit vectors in space. The quantum correlation function $C_Q(A,B)$ is equal to $-\hat{a}\cdot\hat{b}$, and so on. If we apply the CHSH inequality to the case where $\hat{a}$, $\hat{a}'$, $\hat{b}$ and $\hat{b}'$ lie in the plane with $\pi/4$ radians separating $\hat{a}'$ and $\hat{b}$, $\hat{b}$ and $\hat{a}$, and $\hat{a}$ and $\hat{b}'$, we obtain a magnitude of $2\sqrt{2}$ for the quantum version of $S_{CHSH}(A,A';B,B')$. Hence quantum correlations cannot arise from local plans. If the state of the ensemble is the general state $|\Psi_\alpha\rangle$ of Eq. 13, then the four unit vectors still lie in a plane, with $\hat{a}'$ and $\hat{a}$ perpendicular, but the angle between $\hat{b}$ and $\hat{a}$ and between $\hat{a}$ and $\hat{b}'$ becomes $\arctan[2\alpha(1-\alpha^2)^{1/2}]$ for the maximal quantum version of $S_{CHSH}(A,A';B,B')$, namely$^{6}$ $2[1+4\alpha^2(1-\alpha^2)]^{1/2}$.

References
1. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
2. J. Bernstein, Quantum Profiles (Princeton: Princeton U. Press) 1991, p. 84.
3. E. Schrödinger, Proc. Camb. Phil. Soc. 31, 555 (1935).
4. J. S. Bell, Physics 1, 195 (1964).
5. D. M. Greenberger, M. Horne, and A. Zeilinger, in Bell's Theorem, Quantum Theory, and Conceptions of the Universe, ed. M. Kafatos (Dordrecht: Kluwer Academic), 1989, p. 69; see also D. M. Greenberger, M. A. Horne, A. Shimony, and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).
6. N. Gisin, Phys. Lett. A 154, 201 (1991); N. Gisin and A. Peres, Phys. Lett. A 162, 15 (1992); S. Popescu and D. Rohrlich, Phys. Lett. A 166, 293 (1992).
7. J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, Phys. Rev. Lett. 23, 880 (1969).
8. G. C. Ghirardi, A. Rimini and T. Weber, Lett. Nuovo Cim. 27, 263 (1980).
9. A. Aspect, J. Dalibard, and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
10. E. Santos, Phys. Rev. Lett. 66, 1388 (1991).
11. C. H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. K. Wootters, Phys. Rev. Lett. 70, 1895 (1993).
12. C. H. Bennett and G. Brassard, Proc. of IEEE Int. Conf. on Comp., Sys. and Sig. Proc., Bangalore, India, 175 (1984); A. Ekert, Phys. Rev. Lett. 67, 661 (1991).
13. C. H. Bennett, H. J. Bernstein, S. Popescu and B. Schumacher, Phys. Rev. A 53, 2046 (1996).
14. R. Jozsa and B. Schumacher, J. Mod. Optics 41, 2343 (1994); B. Schumacher, Phys. Rev. A 51, 2738 (1995).
15. A. Shimony, in The Dilemma of Einstein, Podolsky and Rosen - 60 Years Later (Annals of the Israel Physical Society, 12), A. Mann and M. Revzen, eds., Institute of Physics Publishing, 1996.
16. S. Popescu and D. Rohrlich, Phys. Rev. A 56, Rapid Comm. R3319 (1997).
17. A. Einstein, Autobiographical Notes, trans. and ed. by P. A. Schilpp (Chicago: Open Court Pub. Co.) 1979, pp. 50-51.
18. A. Peres, Phys. Rev. Lett. 77, 1413 (1996); M. Horodecki, P. Horodecki and R. Horodecki, Phys. Lett. A 223, 1 (1996).
19. C. H. Bennett, G. Brassard, S. Popescu, B. Schumacher, J. A. Smolin and W. K. Wootters, Phys. Rev. Lett. 76, 722 (1996).
QUANTUM INFORMATION AND ITS PROPERTIES

RICHARD JOZSA
School of Mathematics and Statistics, University of Plymouth, Plymouth, Devon PL4 8AA, England

We provide a general introduction to the concept of quantum information, discussing its relation to quantum measurement theory and properties of entanglement. We give an account of some of its basic features, notably quantum dense coding, quantum teleportation and the compression of quantum information. We include an explanation of the formalism of density matrices and the concept of von Neumann entropy, which play a fundamental role in much of quantum information theory.
1 Quantum Information - What is it?
One of the most fascinating aspects of recent work in fundamental quantum theory is the emergence of a new notion, the concept of quantum information, which is quite distinct from its classical counterpart. It provides a new perspective for all foundational and interpretational issues and highlights new essential differences between classical and quantum theory. In this chapter we will introduce this concept and develop a selection of its basic properties.

We may think of classical information as being embodied in the state of a physical system which has been prepared in a state unknown to us (the receiver). By performing a measurement to identify the state (which is always possible in principle in classical physics) we acquire the information. The simplest such situation consists of a physical system, called a bit, which is prepared in one of two possible states, denoted 0 and 1. We often allow the receiver to have a priori probabilistic knowledge of the state. For example in the case of a bit, the receiver will know ahead of time that the state will be 0 (respectively 1) with probability $p_0$ (respectively $p_1$). In this scenario the profound and beautiful theory of Shannon$^{1,2,3}$ gives a precise mathematical quantification of the intuitive concept of information, leading to extensive developments of great theoretical and practical interest.

In the above description we have emphasised the role of physical theory in the concept of information and we will now consider the analogous situation in the context of quantum theory. Thus “quantum information” is embodied in a given unknown quantum state. We will see that this apparently natural generalisation of the classical situation will differ dramatically from its classical counterpart. The simplest non-trivial quantum system is a 2-level system and
we will use the term “qubit” (introduced by Schumacher$^{15}$) to refer to a 2-level system with a chosen preferred orthonormal basis denoted $\{|0\rangle, |1\rangle\}$. The general state of a qubit may be labelled by two real parameters $\theta$ and $\phi$:

$$|\psi\rangle = \cos\theta\,|0\rangle + e^{i\phi}\sin\theta\,|1\rangle \tag{1}$$
Thus we can apparently encode an arbitrarily large amount of classical information into the state of just one qubit (by coding the information into the sequence of digits of $\theta$ and $\phi$). However in contrast to classical physics, quantum measurement theory places severe limitations on the amount of information we can obtain about the identity of a given quantum state by performing any conceivable measurement on it. Thus most of the quantum information is “inaccessible” but it is still useful - for example it is necessary in its totality to correctly predict any future evolution of the state and to carry out the processes of quantum computation.

This phenomenon of inaccessibility has no classical analogue. It allows quantum information to possess further curious properties which make it even qualitatively different from classical information. For example, unlike classical information, quantum information cannot be copied,$^{4,5}$ i.e. given an unknown quantum state $|\psi\rangle$ there is no process that will accept $|\psi\rangle$ and a standard state $|0\rangle$ as input and produce two copies of $|\psi\rangle$ as output. Indeed if we could make many copies of $|\psi\rangle$ we could determine the state by measuring probability distributions of measurement outcomes as precisely as desired. Also in some processes of information transmission it appears most natural to interpret quantum information as propagating backwards in time. For classical information this would entail paradoxical violations of causality but in the quantum case these can be avoided, broadly speaking, since the information is not accessible enough for the violations to occur! We will see an example of this later in our discussion of quantum teleportation.

Let us now give a more quantitative discussion of the inaccessibility of quantum information.
We may think of a quantum measurement as an attempt to represent quantum information in terms of classical information (of the measurement outcomes) and as such, we can quantify the inaccessibility in terms of Shannon's information theory. According to this theory if a classical system $X$ is known to be in one of the states $x_1, x_2, \ldots, x_n$ with prior probabilities $p_1, p_2, \ldots, p_n$ then the amount of information gained in identifying the state is given by the Shannon entropy of the probability distribution:

$$H(p_1, \ldots, p_n) = -\sum_i p_i \log_2 p_i \tag{2}$$
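As a concrete illustration of the Shannon entropy of a distribution $p_1, \ldots, p_n$, here is a minimal sketch (not from the original text). It uses base-2 logarithms so that the result is in bits, and adopts the usual convention that terms with $p_i = 0$ contribute nothing; the function name is illustrative.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits: the sum of -p_i * log2(p_i) over nonzero p_i."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(-p * math.log2(p) for p in probs if p > 0)


fair_bit = shannon_entropy([0.5, 0.5])        # uniform over 2 states
known_state = shannon_entropy([1.0, 0.0])     # state already known
uniform_four = shannon_entropy([0.25] * 4)    # uniform over 4 states
print(fair_bit, known_state, uniform_four)
```

A certain outcome carries no information, while a uniform distribution over $n$ states gives $\log_2 n$ bits.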
Equivalently we may think of $H$ as the amount of uncertainty in the state before its identification. It may be shown that for any distribution with $n$ possible outcomes

$$0 \le H \le \log_2 n.$$

The minimum value occurs precisely when one of the $p_i$'s is 1 and all others are zero, i.e. we already know the state and no information is gained in identifying it. The maximum value $\log_2 n$ occurs when the distribution is uniform (i.e. $p_i = 1/n$ for all $i$) so we have no prior bias at all about the identity of the state. The unit of information is the bit. A 2-state classical system has a maximum information capacity of $H(\frac{1}{2}, \frac{1}{2}) = 1$ bit.

Suppose more generally that we do not perfectly identify the state of $X$ but merely obtain further probabilistic information about it. Thus we perform a measurement $Y$ with outcomes $y_1, y_2, \ldots, y_m$ and for each $x_i$ we know the conditional probability distribution $p(y_j|x_i)$ of obtaining the various $Y$ outcomes for each fixed state of $X$. How much information does the seen value of $Y$ give about the identity of $X$? Well, we have the unconditional distribution $p(y_j) = \sum_i p(y_j|x_i)\,p(x_i)$ and using Bayes' rule with the prior probabilities $p(x_i)$ and $p(y_j|x_i)$ we can compute $p(x_i|y_j)$. Let $H(X)$ denote the Shannon entropy of the distribution $\{p(x_i)\}$ and for each fixed $j$ let $H(X|y_j)$ denote the Shannon entropy of $\{p(x_i|y_j)\}$. Finally let $H(X|Y) = \sum_j p(y_j)\,H(X|y_j)$. Now $H(X)$ represents the initial uncertainty in the state of $X$ and $H(X|y_j)$ represents the residual uncertainty given that outcome $y_j$ was seen. Thus on average the residual uncertainty will be $H(X|Y)$. Hence on average the amount of information gained about the identity of $X$ is

$$I(X:Y) = H(X) - H(X|Y).$$

It is interesting to note that $I(X:Y)$ is in fact symmetric in $X$ and $Y$, i.e. given any two probabilistically related random variables $X$ and $Y$,

$$I(X:Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = I(Y:X).$$
However there appears to be no intuitive interpretation of this symmetry.

Let us now return to the question of inaccessibility of quantum information. Let $|x_1\rangle, |x_2\rangle, \ldots, |x_n\rangle$ with probabilities $p_1, p_2, \ldots, p_n$ be any chosen distribution $X$ of states of a qubit. Suppose that a qubit is prepared in a state according to this distribution and let $Y$ be any quantum measurement on the qubit with possible outcomes $y_1, y_2, \ldots, y_m$. (Here we envisage any possible quantum measurement including POVMs$^{14}$ which may have an arbitrarily large number of outcomes.) Standard quantum measurement theory provides a formula for $p(y_j|x_i)$ and we can compute $I(X:Y)$ as above, quantifying the amount of information that the measurement provides about the identity of the qubit's state.
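The computation of $I(X:Y)$ described above can be sketched as follows. This is an illustrative example rather than anything from the original text: the two signal states $|0\rangle$ and $(|0\rangle + |1\rangle)/\sqrt{2}$, the choice of a measurement in the $\{|0\rangle, |1\rangle\}$ basis, and all function names are assumptions made for the demonstration. The conditional probabilities come from the standard Born rule, and the two functions evaluate the information both ways round to exhibit the symmetry.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; terms with p = 0 contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


def mutual_information(prior, channel):
    """I(X:Y) = H(X) - H(X|Y), with channel[i][j] = p(y_j | x_i)."""
    n, m = len(prior), len(channel[0])
    p_y = [sum(prior[i] * channel[i][j] for i in range(n)) for j in range(m)]
    h_x_given_y = 0.0
    for j in range(m):
        if p_y[j] > 0:
            # Bayes' rule: p(x_i | y_j) = p(y_j | x_i) p(x_i) / p(y_j)
            posterior = [prior[i] * channel[i][j] / p_y[j] for i in range(n)]
            h_x_given_y += p_y[j] * entropy(posterior)
    return entropy(prior) - h_x_given_y


def mutual_information_swapped(prior, channel):
    """The same quantity computed as H(Y) - H(Y|X)."""
    n, m = len(prior), len(channel[0])
    p_y = [sum(prior[i] * channel[i][j] for i in range(n)) for j in range(m)]
    h_y_given_x = sum(prior[i] * entropy(channel[i]) for i in range(n))
    return entropy(p_y) - h_y_given_x


def born_channel(states):
    """p(y_j | x_i) for a measurement in the {|0>, |1>} basis."""
    return [[abs(amp) ** 2 for amp in state] for state in states]


s = 1 / math.sqrt(2)
states = [(1.0, 0.0), (s, s)]   # |0> and (|0> + |1>)/sqrt(2), non-orthogonal
prior = [0.5, 0.5]
channel = born_channel(states)
info = mutual_information(prior, channel)
info_swapped = mutual_information_swapped(prior, channel)
print(info, info_swapped)
```

Although one full bit was used to choose between the two signal states, this particular measurement recovers only about 0.31 bits, because the states are not orthogonal.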
A fundamental theorem of Holevo$^{6,7}$ states that for any choice of $X$ and $Y$ the value of $I(X:Y)$ can never exceed one bit! (Actually Holevo's theorem gives a better upper bound - the von Neumann entropy $S$ of the distribution of states $X$ - and $S$ never exceeds 1 for a qubit. We will give a discussion of von Neumann entropy later.) For example we may code $\log_2 n$ bits of classical information into a qubit by preparing it equiprobably in one of $n$ different states (which are necessarily non-orthogonal for $n > 2$) but the indistinguishability of these states allows only at most one bit of information about the input to be extracted. More generally for an $n$-level system $I(X:Y) \le \log_2 n$ so although we may attempt to code a vast amount of information into a quantum state the amount of information that can be read out never exceeds the maximum capacity of a classical system with the same number of levels.

There is another more intrinsically quantum mechanical sense in which quantum states can embody vastly more “information” than classical states. This arises essentially through the non-classical feature of quantum entanglement and it is entirely independent of the previous issue of precision of real parameters and the continuum of different superpositions being available in the state space. Consider a state of $n$ qubits. According to the laws of quantum mechanics the state space of a composite system is given by the tensor product of the spaces of its constituent parts so that for $n$ qubits the state space has $2^n$ dimensions. Thus the information required to describe a general state grows exponentially with $n$ - generally we will have $O(2^n)$ superposition components even if the amplitudes are restricted to a simple set of numbers of finite precision. By contrast in classical physics the state space for a composite system is given merely by the Cartesian product of the constituent spaces.
Thus although the number of possible states grows exponentially with $n$, the description of any particular state grows only linearly with $n$, being generally $n$ times the amount of information needed to describe a single 2-level system. In classical physics we are allowed only product states of composite systems whereas in quantum physics the phenomenon of entanglement - i.e. the possibility of superpositions of product states - gives rise to an exponential increase in the information necessary to specify a general state.

From our information-theoretic point of view, natural physical evolution may be considered as the processing of the information content of a quantum state. Generally this information content will grow exponentially as the number of qubits that become entangled together may grow linearly in time starting from a simple specified starting state (e.g. if we consider a process which develops in discrete steps with each step entangling an extra qubit into the state). Thus a full description of the evolution by any classical means would generally involve an exponential growth in the classical representation
of the state and a consequent exponential slowdown in calculating the evolution from the starting state, as compared to the actual evolution of the quantum system performed by Nature. This astonishing ability of Nature to process information exponentially faster than any known classical means may be exploited for useful computational tasks and it forms the basis for the subject of quantum computation.$^{8,9,10}$ However as a result of the inherent inaccessibility of quantum information, the final product of this amazing feat of processing remains largely hidden from view! Fortunately this does not render it ineffective - by quantum measurements we may obtain small amounts of information about the overall final state which would still take an exponential amount of computing effort to obtain by classical calculations.$^{10}$

In summary, our information-theoretic perspective points to an extraordinary new distinction between classical and quantum physics. According to the laws of quantum mechanics physical evolution requires the processing of a vast amount of information at a rate that cannot be matched in real time by any classical means. Furthermore most of the resulting processed information remains in principle inaccessible to any measurement and only a small amount can be read out in classical terms! From the classical perspective of our macroscopic world this is a bizarre foundation for a fundamental physical theory!

The phenomenon of quantum entanglement features predominantly in every aspect of quantum information theory. In this chapter we will focus on three properties: quantum dense coding, quantum teleportation and the compression of quantum information. Along the way we will give an introduction to the formalism of density matrices and the concept of von Neumann entropy which are important basic tools in the study of quantum information generally.
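To make the counting argument of this section concrete, the following minimal sketch (not from the original text) builds an $n$-qubit product state with the tensor (Kronecker) product and contrasts the $2n$ numbers that specify it with the $2^n$ amplitudes of a general state. The helper names are illustrative, and the product test used for two qubits ($a_{00}a_{11} = a_{01}a_{10}$) is a standard criterion assumed here rather than taken from the text.

```python
import math


def kron(u, v):
    """Tensor (Kronecker) product of two state vectors given as lists."""
    return [x * y for x in u for y in v]


def product_state(single_qubit_states):
    """Joint state of several qubits that are each in their own pure state."""
    state = [1.0]
    for qubit in single_qubit_states:
        state = kron(state, qubit)
    return state


def is_product_two_qubit(state, tol=1e-12):
    """A two-qubit state (a00, a01, a10, a11) is a product iff a00*a11 == a01*a10."""
    a00, a01, a10, a11 = state
    return abs(a00 * a11 - a01 * a10) < tol


s = 1 / math.sqrt(2)
zero = [1.0, 0.0]
plus = [s, s]

n = 10
joint = product_state([plus] * n)
numbers_for_product = 2 * n      # two amplitudes per qubit
numbers_for_general = len(joint) # 2**n amplitudes

entangled = [s, 0.0, 0.0, s]     # (|00> + |11>)/sqrt(2)
print(numbers_for_product, numbers_for_general,
      is_product_two_qubit(kron(plus, zero)), is_product_two_qubit(entangled))
```

For ten qubits the product description needs 20 amplitudes while a general state needs 1024, and the entangled two-qubit state fails the product test.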
2 Quantum Dense Coding
We stated above that the capacity of a single qubit for communicating classical information cannot exceed one bit. To realise this maximum capacity the sender (Alice) prepares the qubit in one of the basis states $|0\rangle$ or $|1\rangle$ with equal prior probability. The receiver (Bob) can then distinguish these orthogonal states perfectly by a suitable measurement and acquire one bit of information. According to Holevo's theorem if Alice attempts to code more information in the qubit, generally using non-orthogonal signal states, then Bob will be unable to distinguish them well enough to get more than one bit of information about the identity of the signal. Remarkably however, if the qubit is entangled with another qubit then it may be used to communicate two bits of classical information from Alice to Bob. This is the process of quantum dense coding
devised by Bennett and Wiesner,$^{12}$ giving a doubling of information capacity of a single qubit through entanglement with a second qubit. Suppose that Alice and Bob are distantly separated in space but they share a maximally entangled pair of qubits, say the EPR state

$$|\psi^-\rangle = \frac{1}{\sqrt{2}}\,(|0\rangle|1\rangle - |1\rangle|0\rangle) \tag{3}$$
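Here the first particle is always with Alice and the second with Bob, which we could emphasise with subscripts, writing the state as $\frac{1}{\sqrt{2}}(|0\rangle_A|1\rangle_B - |1\rangle_A|0\rangle_B)$. It is a well known feature of quantum measurement theory$^{11}$ that the sharing of entanglement cannot by itself enable Alice to communicate information to Bob. Any such communication process utilising the entanglement must be accompanied by some other transmission from Alice to Bob. Consider now the four states:

$$|\psi^-\rangle = \frac{1}{\sqrt{2}}\,(|0\rangle|1\rangle - |1\rangle|0\rangle) \tag{4}$$

$$|\psi^+\rangle = \frac{1}{\sqrt{2}}\,(|0\rangle|1\rangle + |1\rangle|0\rangle) \tag{5}$$

$$|\phi^-\rangle = \frac{1}{\sqrt{2}}\,(|0\rangle|0\rangle - |1\rangle|1\rangle) \tag{6}$$

$$|\phi^+\rangle = \frac{1}{\sqrt{2}}\,(|0\rangle|0\rangle + |1\rangle|1\rangle) \tag{7}$$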
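These four states are each maximally entangled and together they form an orthonormal basis - called the Bell basis - for states of two qubits. The main point here is that each of these states can be prepared from the state $|\psi^-\rangle$ by Alice alone performing purely local operations on her particle. Indeed consider the four 1-qubit unitary operations (written in the $\{|0\rangle, |1\rangle\}$ basis):

$$U_{00} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad U_{01} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad U_{10} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad U_{11} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{8}$$

which respectively act on the qubit basis by

$$U_{00}: |0\rangle \to |0\rangle,\ |1\rangle \to |1\rangle; \quad U_{01}: |0\rangle \to |0\rangle,\ |1\rangle \to -|1\rangle; \quad U_{10}: |0\rangle \to |1\rangle,\ |1\rangle \to |0\rangle; \quad U_{11}: |0\rangle \to -|1\rangle,\ |1\rangle \to |0\rangle.$$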
Figure 1: Quantum dense coding. The figure shows a spacetime diagram with space horizontally and time increasing up the page. At time $t_0$ an EPR pair is created and distributed to Alice and Bob. Alice also has two bits $ij$. At time $t_1$ she applies $U_{ij}$ to her particle and sends it to Bob. On reception at $t_2$ Bob performs a Bell measurement and reads out the 2-bit message $ij$.
It is easily verified that if Alice applies one of these transformations to her particle of $|\psi^-\rangle$ then the resulting state will be $|\psi^-\rangle$, $|\psi^+\rangle$, $|\phi^-\rangle$ or $|\phi^+\rangle$ respectively. Thus to communicate two bits $ij$ ($i$, $j$ being 0 or 1) to Bob, Alice applies $U_{ij}$ to her qubit and sends this single qubit to Bob. On receiving it, Bob performs a Bell measurement (distinguishing the four Bell states) on the joint state of the two particles and reliably reads out the value of $ij$. This is illustrated in Fig. 1.
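The dense coding protocol can be simulated directly on the four amplitudes of the two-qubit state. The sketch below is illustrative and not from the original text. It assumes the operations $U_{00} = I$, $U_{01} = \mathrm{diag}(1,-1)$, $U_{10}$ the bit flip, and $U_{11}$ the bit flip combined with a sign change; these are fixed only up to an overall sign by the requirement that they map $|\psi^-\rangle$ to the four Bell states in the order of Eqs. 4 to 7. All identifiers are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# Two-qubit amplitudes ordered |00>, |01>, |10>, |11> (Alice's qubit first).
BELL = {
    (0, 0): [0.0, S, -S, 0.0],   # psi-
    (0, 1): [0.0, S, S, 0.0],    # psi+
    (1, 0): [S, 0.0, 0.0, -S],   # phi-
    (1, 1): [S, 0.0, 0.0, S],    # phi+
}

# Assumed form of the local operations (signs are a convention).
U = {
    (0, 0): [[1, 0], [0, 1]],
    (0, 1): [[1, 0], [0, -1]],
    (1, 0): [[0, 1], [1, 0]],
    (1, 1): [[0, 1], [-1, 0]],
}


def apply_to_alice(u, state):
    """Apply a 2x2 matrix to Alice's (first) qubit of a two-qubit state."""
    new = [0.0] * 4
    for a_out in range(2):
        for a_in in range(2):
            for b in range(2):
                new[2 * a_out + b] += u[a_out][a_in] * state[2 * a_in + b]
    return new


def bell_measure(state):
    """Return the most probable Bell outcome and its probability."""
    probs = {bits: sum(x * y for x, y in zip(bell, state)) ** 2
             for bits, bell in BELL.items()}
    best = max(probs, key=probs.get)
    return best, probs[best]


def dense_code(bits):
    """Alice encodes two bits on her half of psi-; Bob decodes both."""
    sent = apply_to_alice(U[bits], BELL[(0, 0)])
    return bell_measure(sent)


decoded = {bits: dense_code(bits) for bits in BELL}
print(decoded)
```

In each case the Bell measurement outcome matches the two bits Alice encoded with probability 1, even though only one qubit travelled from Alice to Bob.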
3 Quantum Teleportation
Suppose that Alice has a qubit in state $|\psi\rangle$ (whose identity is in general unknown to her) and she wishes to transfer this state to Bob, i.e. to communicate the quantum information of $|\psi\rangle$ to Bob. For example $|\psi\rangle$ may be a crucial state half way through a quantum computation and Bob is to complete the computation on his quantum computer. If Alice attempts to identify the state, then as discussed in Sec. 1, she will irrevocably destroy most of the quantum information, rendering the initial computational effort useless. If Alice knew the identity of the state $|\psi\rangle$ she could send Bob a (classical) description of it by conventional classical means. Bob could then reconstruct $|\psi\rangle$ in his laboratory. However this is a very inefficient means of communicating the state of just one qubit (the quantum analogue of one bit) as Alice will need to send a vast amount of classical information (an accurate specification of $\theta$ and $\phi$ in Eq. 1) if Bob is to get a reasonably accurate reconstruction of $|\psi\rangle$. Alice could always place her qubit in a secure box shielding it from a possibly noisy environment and send the qubit itself intact across space to Bob. But is there any other way of communicating the full quantum information of $|\psi\rangle$?

Remarkably quantum entanglement can serve as a channel for the transmission of quantum information. This is the process called quantum teleportation$^{19}$ which we now describe. Suppose that Alice and Bob share some quantum entanglement in the form of the EPR state $|\psi^-\rangle$ in Eq. 3 (just as in the dense coding scenario). Alice also possesses an extra qubit in the (unknown) state $|\psi\rangle$. As before we will use subscripts A and B for the systems which comprise $|\psi^-\rangle$ and the subscript C for Alice's extra qubit. The systems A and C are thus in Alice's possession and B is in Bob's possession. In components we write
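$$|\psi\rangle_C = a\,|0\rangle_C + b\,|1\rangle_C$$

and the overall state of the three qubits is

$$|\Psi\rangle_{ABC} = |\psi\rangle_C\,|\psi^-\rangle_{AB} = \frac{1}{\sqrt{2}}\,(a|0\rangle_C + b|1\rangle_C)(|0\rangle_A|1\rangle_B - |1\rangle_A|0\rangle_B).$$

A simple algebraic rearrangement of this expression in terms of the Bell states of CA given in Eqs. 4, 5, 6 and 7 yields

$$|\Psi\rangle_{ABC} = \frac{1}{2}\Big\{ |\psi^-\rangle_{CA}\,(-a|0\rangle_B - b|1\rangle_B) + |\psi^+\rangle_{CA}\,(-a|0\rangle_B + b|1\rangle_B) + |\phi^-\rangle_{CA}\,(b|0\rangle_B + a|1\rangle_B) + |\phi^+\rangle_{CA}\,(-b|0\rangle_B + a|1\rangle_B) \Big\}$$

Thus if Alice performs a Bell measurement on her two particles CA then regardless of the identity of $|\psi\rangle_C$, each outcome will occur with equal probability $\frac{1}{4}$. Hence this measurement gives Alice no information at all about the identity of the state and the resulting state of Bob's particle will be respectively

$$-(a|0\rangle + b|1\rangle), \quad -a|0\rangle + b|1\rangle, \quad b|0\rangle + a|1\rangle, \quad -b|0\rangle + a|1\rangle, \tag{10}$$

that is, $\pm U_{ij}|\psi\rangle$ for $ij = 00, 01, 10, 11$,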
where $U_{ij}$ are given in Eq. 8. The two bits $ij$ label the four possible outcomes of the Bell measurement. Now the crucial observation is that in each case the state of Bob's particle is related to $|\psi\rangle$ by a fixed unitary transformation ($U_{ij}$, up to an overall sign of no physical significance) independent of the identity of $|\psi\rangle$. Thus if Alice communicates to Bob the two bits $ij$ of classical information (i.e. her actual Bell measurement outcome) then Bob will be able to apply the corresponding inverse transformation $U_{ij}^{-1}$ to his particle, restoring it to state $|\psi\rangle_B$ in every case. Hence by utilising prior shared entanglement Alice is able to communicate to Bob the full quantum information of $|\psi\rangle$ by transmitting merely two bits of classical information to him!

As a result of the process, the initial shared entanglement is destroyed and Alice learns nothing whatever about the identity of $|\psi\rangle$. Indeed a refined version of the “no-copying” property of quantum information (mentioned in Sec. 1) asserts that no quantum process can reveal any information about the identity of a general state without disturbing it in some irrevocable way. In the process of teleportation, Bob is left with a perfect instance of $|\psi\rangle$ and hence no participants can gain any further information about its identity.
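The teleportation protocol can be checked numerically for a sample input state. The sketch below is illustrative and not from the original text. It orders the three qubits as C, A, B, projects C and A onto each Bell state in turn, and applies the inverse of an assumed $U_{ij}$ (the same sign convention as in the dense coding discussion, an assumption rather than a confirmed reading of Eq. 8) to Bob's qubit. The sample amplitudes 0.6 and 0.8i and all identifiers are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# Bell states of two qubits, amplitudes ordered |00>, |01>, |10>, |11>.
BELL = {
    (0, 0): [0.0, S, -S, 0.0],   # psi-
    (0, 1): [0.0, S, S, 0.0],    # psi+
    (1, 0): [S, 0.0, 0.0, -S],   # phi-
    (1, 1): [S, 0.0, 0.0, S],    # phi+
}

# Assumed form of the local operations (signs are a convention).
U = {
    (0, 0): [[1, 0], [0, 1]],
    (0, 1): [[1, 0], [0, -1]],
    (1, 0): [[0, 1], [1, 0]],
    (1, 1): [[0, 1], [-1, 0]],
}


def teleport(psi):
    """Return {outcome ij: (probability, fidelity of Bob's corrected qubit)}."""
    alpha, beta = psi
    # Joint state |psi>_C |psi->_AB with index 4*c + 2*a + b.
    joint = [amp_c * amp_ab for amp_c in (alpha, beta) for amp_ab in BELL[(0, 0)]]
    results = {}
    for bits, bell in BELL.items():
        # Project qubits C, A onto this Bell state (its amplitudes are real),
        # leaving Bob's unnormalised qubit.
        bob = [sum(bell[ca] * joint[2 * ca + q] for ca in range(4))
               for q in range(2)]
        prob = sum(abs(x) ** 2 for x in bob)
        bob = [x / math.sqrt(prob) for x in bob]
        # The inverse of a real orthogonal U_ij is its transpose.
        u = U[bits]
        fixed = [u[0][q] * bob[0] + u[1][q] * bob[1] for q in range(2)]
        overlap = alpha.conjugate() * fixed[0] + beta.conjugate() * fixed[1]
        results[bits] = (prob, abs(overlap) ** 2)
    return results


outcomes = teleport((0.6 + 0j, 0.8j))
print(outcomes)
```

Each of the four outcomes occurs with probability 1/4 and, after Bob's correction, his qubit matches the input state (fidelity 1) up to an unobservable overall phase.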
This process of quantum teleportation has various notable features. Once Alice and Bob are in possession of their shared entanglement it is entirely unaffected by any noise in the spatial environment between them. Thus teleportation achieves perfect transmission of delicate quantum information across a noisy environment assuming that classical information is robust and easy to protect against noise (as it is). Also the entanglement of $|\psi^-\rangle$ is independent of the spatial location of Bob relative to Alice so that Bob may travel around and Alice can transfer the quantum information without even knowing his location - she needs only to broadcast the two bit information of her Bell measurement outcome, say by publishing it in a newspaper advertisement.

Quantum teleportation is described diagrammatically in Fig. 2 which highlights its most enigmatic feature. The question is this: Alice succeeds in transferring the quantum state $|\psi\rangle$ to Bob by sending him just two bits of classical information. Clearly these two bits are vastly inadequate to specify the state $|\psi\rangle$ so how does the remaining information get across to Bob? What carries it? What route does it take? In Fig. 2 there is clearly only one other route connecting Alice to Bob (apart from the channel carrying the two classical bits) - it runs backwards in time from Alice to the creation of the EPR pair and then forwards in time to Bob. Hence we must conclude that most of the quantum information of $|\psi\rangle$ was propagated along this route, firstly backwards in time$^{13}$ and then forwards to Bob.

This raises some intriguing interpretational questions. With the above routing, we can deduce that at times between $t_0$ and $t_1$ most of the quantum information was already well on its way to Bob (in his arm of the EPR pair) even though Alice had not yet performed her measurement. Indeed she may not yet have decided to transmit anything to Bob or even have been born yet!
For classical information this situation would lead to paradoxes and contradictions of causality but for quantum information these do not arise, basically because of its inherent inaccessibility. However at the (unobservable and inaccessible) level of the identity of pure state outcomes of a measurement (prescribed by the collapse postulate of quantum measurement theory) there remain considerable difficulties$^{14}$ in assigning an unambiguous state to the system at each time. This applies particularly to a component of an entangled system when a measurement is performed on another spatially separated component since there is no uniquely defined notion of simultaneity for spatially separated events.

It may be shown that for all physical purposes, in Bob's laboratory his original EPR particle is completely indistinguishable from an equal probabilistic mixture of the four states Eq. 10. Thus until the two bits of classical information arrive from Alice, the quantum information (which flowed partly backwards in time) is utterly useless and completely indistinguishable from the situation in which Alice performed no measurement at all. Since the two bits must travel by conventional means, all potential problems of causality are avoided. In the next section we will elaborate on the particular kind of inaccessibility of quantum information involved here, as described by the formalism of density matrices. This will also be a key ingredient in our discussion of quantum data compression later.

Figure 2: Quantum teleportation. The figure shows a spacetime diagram just as in Fig. 1 with the EPR pair created at $t_0$. Alice also has an input qubit. At time $t_1$ Alice performs a Bell measurement on the joint state of her input qubit and her EPR particle and sends the outcome $ij$ to Bob. On reception at $t_2$ Bob applies $U_{ij}^{-1}$ to his particle which is then guaranteed to be in the same state as Alice's original input qubit.

Thus shared entanglement is a most valuable resource since it provides a channel for the reliable communication of quantum information. In general the initial distribution of EPR particles to Alice and Bob will occur through a noisy environment so that the resulting joint state will be somewhat corrupted from $|\psi^-\rangle$. However it may be shown$^{16,17,26}$ that by purely local operations (i.e. in Alice's and Bob's laboratories separately) and only classical communication between Alice and Bob, a large number of such corrupted pairs may be “purified” to yield a smaller number of arbitrarily pure EPR states. This leads to a means of reliably transmitting quantum information through a noisy environment - Alice herself is the creator of the EPR pairs and she sends many halves to Bob which become partially corrupted along the way. Subsequently Alice and Bob purify the pairs (requiring only classical communication) and use the resulting pure $|\psi^-\rangle$ states for the teleportation of quantum information (requiring only further classical communication). Bennett, DiVincenzo, Smolin and Wootters$^{17}$ have given a most interesting development of these ideas and compared them to the theory of quantum error correcting codes, which provides an alternative way of protecting quantum information from environmental noise during the course of its transmission.
4 Density Matrices
Let us consider a situation which combines quantum uncertainty (i.e. the inaccessibility of the quantum information in a state) with classical probabilistic uncertainty in the choice of the state. We will see that these two sources of uncertainty combine to admit a most elegant and useful mathematical description. Consider a mixture of pure states $|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_n\rangle$ which have been prepared with prior probabilities $p_1, p_2, \ldots, p_n$. Any possible physical measurement on this mixture must amount to the estimation of the average value $\langle O \rangle$ of some observable $O$. According to standard quantum measurement theory we have

$$\langle O \rangle = \sum_i p_i\,\langle\psi_i|O|\psi_i\rangle = \sum_i p_i\,\mathrm{trace}\big(|\psi_i\rangle\langle\psi_i|\,O\big) = \mathrm{trace}\Big(\sum_i p_i\,|\psi_i\rangle\langle\psi_i|\;O\Big) \tag{11}$$
Here $\langle\psi|$ denotes the conjugate transpose of the state $|\psi\rangle$ and $|\psi\rangle\langle\psi|$ is the outer product, giving a Hermitian operator on the state space. If $|\psi\rangle$ is normalised then this is just the projection operator onto the one dimensional subspace spanned by $|\psi\rangle$. For any operator $A$, trace $A$ denotes the sum of the diagonal entries of the matrix of $A$ relative to any chosen orthonormal basis $\{|e_i\rangle\}$:
\[ \mathrm{trace}\, A = \sum_i \langle e_i| A |e_i\rangle \]
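This is independent of the choice of orthonormal basis. If $A$ is Hermitian then we may choose $\{|e_i\rangle\}$ to be a basis of orthonormal eigenstates and trace $A$ is thus given by the sum of the eigenvalues of $A$. As an example consider a state $|\psi\rangle$ of one qubit given in components by
\[ |\psi\rangle = \begin{pmatrix} a \\ b \end{pmatrix} \]
Then
\[ \langle\psi| = \text{conjugate transpose} = (a^* \;\; b^*) \]
and
\[ |\psi\rangle\langle\psi| = \begin{pmatrix} aa^* & ab^* \\ ba^* & bb^* \end{pmatrix} \]
Note also that for any two states
\[ |\psi\rangle = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad |\phi\rangle = \begin{pmatrix} c \\ d \end{pmatrix} \qquad \text{we have} \qquad |\psi\rangle\langle\phi| = \begin{pmatrix} ac^* & ad^* \\ bc^* & bd^* \end{pmatrix} \]
so that
\[ \mathrm{trace}\, |\psi\rangle\langle\phi| = ac^* + bd^* = \langle\phi|\psi\rangle \]
which we have used in Eq. 11.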
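The two facts just used, that the trace of an outer product is an inner product and that the average value in Eq. 11 can be computed either state by state or from a single averaged operator, are easy to check numerically. The following is a minimal sketch in Python (standard library only); the particular states, the probabilities and the choice of the Pauli X matrix as the observable are illustrative assumptions and do not come from the text.

```python
import math

def normalise(v):
    """Rescale a list of (possibly complex) components to unit length."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]

def outer(psi, phi):
    """Matrix of |psi><phi|: entry (r, c) is psi[r] * conj(phi[c])."""
    return [[a * b.conjugate() for b in phi] for a in psi]

def inner(phi, psi):
    """Inner product <phi|psi>."""
    return sum(b.conjugate() * a for a, b in zip(psi, phi))

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(op, psi):
    """Column vector op|psi>."""
    return [sum(op[r][c] * psi[c] for c in range(len(psi))) for r in range(len(op))]

# Two arbitrary (assumed) qubit states.
psi = normalise([1, 1j])
phi = normalise([2, 1])

# trace |psi><phi| compared with <phi|psi>.
trace_of_outer = trace(outer(psi, phi))
inner_product = inner(phi, psi)

# A two-state mixture and an example observable (Pauli X), both assumed.
probs = [0.3, 0.7]
states = [psi, phi]
observable = [[0, 1], [1, 0]]

# Average value as the weighted sum of <psi_i| O |psi_i> ...
avg_direct = sum(p * inner(s, apply(observable, s)) for p, s in zip(probs, states))

# ... and as the trace of (sum_i p_i |psi_i><psi_i|) O.
rho = [[sum(p * outer(s, s)[r][c] for p, s in zip(probs, states))
        for c in range(2)] for r in range(2)]
avg_trace = trace(matmul(rho, observable))

print(abs(trace_of_outer - inner_product) < 1e-12)
print(abs(avg_direct - avg_trace) < 1e-12)
```

Only the averaged operator `rho` enters the second computation, which is the point made by Eq. 11: the individual states and probabilities are not needed separately.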
Eq. 11 shows that any physically observable property of our mixture of states depends on the constituent states $|\psi_i\rangle$ and probabilities $p_i$ only through the combination
\[ \rho = \sum_i p_i\, |\psi_i\rangle\langle\psi_i| \qquad (12) \]
i.e. the average projection operator of all the constituent states. $\rho$ is called the density matrix of the mixture or a mixed state (representing the probabilistic mixture of pure states). It is most interesting to note that a given density matrix can arise from many different mixtures of pure states [18]. For example the following mixtures of qubit states all have $\rho = \frac{1}{2}I$ (where $I$ is the $2 \times 2$ identity matrix): (a) any pair of orthogonal states taken with equal probabilities $\frac{1}{2}$, (b) three spin $\frac{1}{2}$ states 120° apart taken with equal probabilities $\frac{1}{3}$, (c) a uniform (continuous) distribution of all possible states of a qubit, (d) the states $\frac{1}{\sqrt{281}}(9,\, 10i\sqrt{2})$, $\frac{1}{\sqrt{194}}(12,\, 5i\sqrt{2})$ and $\frac{1}{\sqrt{17}}(3i,\, 2\sqrt{2})$ taken with respective probabilities 281/900, 97/450 and 17/36, and many more examples which may include any prescribed number of states. Remarkably, according to Eq. 11 these different mixtures behave identically under any physical investigation. Thus for example suppose we have a source emitting spin half states and either (i) each state is randomly chosen to be +z spin or -z spin with equal probability of a half, or (ii) each state is one of the states in (d) above with the given probabilities. Then it is, in principle, impossible to distinguish (i) from (ii) by any physical means. This is another manifestation of the inherent inaccessibility of quantum information.

Mixed states also arise in another context, as a consequence of entanglement. Suppose we have a composite system AB which is in a pure entangled state of the two parts A and B. Then as far as local measurements on A alone are concerned, the subsystem A cannot be described by any pure state. However it may always be described by a mixed state of the sort we considered above. Technically, a local operation $O_A$ on A may be viewed as the operation $O_A \otimes I_B$ on the whole system AB where $I_B$ is the identity operation on B. Now if $|\psi_{AB}\rangle$ is the given pure entangled state then
\[ \langle O_A \rangle = \langle\psi_{AB}|\, O_A \otimes I_B \,|\psi_{AB}\rangle = \mathrm{trace}_{AB}\big( |\psi_{AB}\rangle\langle\psi_{AB}| \; O_A \otimes I_B \big) \]
(where the subscripts on the trace denote the systems over which the trace is taken.) If we express this trace in a product basis of the joint state space then we can perform the trace over the B space explicitly to leave a trace only over
the state space of A, giving an expression of the form:
\[ \langle O_A \rangle = \mathrm{trace}_A\, ( \rho_A\, O_A ) \qquad (13) \]
where
\[ \rho_A = \sum_b \langle b|\psi_{AB}\rangle \langle\psi_{AB}|b\rangle \]
Here $\{|b\rangle\}$ is an orthonormal basis of the state space of B and the expression $\langle b|\psi_{AB}\rangle$ is a vector in the state space of A. $\rho_A$ constructed in this way is called the partial trace of $|\psi_{AB}\rangle\langle\psi_{AB}|$ over B. According to Eq. 13 $\rho_A$ fully characterises the result of any possible local physical observation on the subsystem A of the entangled system AB.

There is a close relationship between the two occurrences of density matrices, as representing probabilistic mixtures of pure states and as representing subsystems of entangled systems. Consider a probabilistic mixture of pure states $|\psi_1\rangle, \ldots, |\psi_n\rangle$ with probabilities $p_1, \ldots, p_n$ embodied in a system A and a second set of orthonormal states $|e_1\rangle, \ldots, |e_n\rangle$ embodied in a system B. Consider the entangled state
\[ |\psi_{AB}\rangle = \sum_{i=1}^{n} \sqrt{p_i}\, |\psi_i\rangle |e_i\rangle \qquad (14) \]
If we take the partial trace over the subsystem B then the resulting mixed state of subsystem A is given precisely by the mixture of states $|\psi_i\rangle$ with probabilities $p_i$:
\[ \rho_A = \mathrm{trace}_B\, |\psi_{AB}\rangle\langle\psi_{AB}| = \sum_i p_i\, |\psi_i\rangle\langle\psi_i| \]
(using the orthonormality conditions $\langle e_i|e_j\rangle = \delta_{ij}$ in calculating the partial trace). If instead we perform a measurement on subsystem B which distinguishes the basis states $|e_i\rangle$ then for each possible outcome $i$, the state of A will be collapsed into the corresponding pure state $|\psi_i\rangle$. The outcome $i$ (and hence also the state of A) occurs with probability $p_i$ i.e. as a result of the measurement, the description of A is changed from being a subsystem of an entangled system $|\psi_{AB}\rangle$ to being a probabilistic mixture of pure states. Now, we may consider B as being very remotely distant from A so that, in the absence of unphysical effects such as superluminal communication, the change of the description of the state of A due to the measurement at B can have no effect
whatever on any physical process at A. This leads to an alternative demonstration that different mixtures of pure states with the same density matrix are physically indistinguishable: let $|\xi_1\rangle, \ldots, |\xi_r\rangle$ with probabilities $q_1, \ldots, q_r$ be any mixture of pure states with density matrix $\rho_A$:
\[ \rho_A = \sum_{a=1}^{r} q_a\, |\xi_a\rangle\langle\xi_a| = \sum_{i=1}^{n} p_i\, |\psi_i\rangle\langle\psi_i| \]
Then it may be shown [18] that starting from the entangled state $|\psi_{AB}\rangle$ in Eq. 14 (or any other entangled state of AB having $\rho_A$ as the reduced state of subsystem A) there is a measurement on system B alone whose outcomes result in A being collapsed to the states $|\xi_a\rangle$ with probabilities $q_a$. Thus an observer at B may, at his whim, remotely prepare the state of A as being any chosen mixture of pure states corresponding to the density matrix $\rho_A$. Hence (in the absence of instantaneous communication) these mixtures must all be physically indistinguishable.

Let us now return to the process of quantum teleportation. By computing the partial trace of the EPR state over Alice’s particle we see that Bob’s EPR particle is described by the mixed state $\frac{1}{2}I$ (for any local operations in his laboratory). Furthermore using Eq. 12, $\frac{1}{2}I$ is also seen to be the density matrix of an equal mixture of the four states in Eq. 10 (for any choice of $|\psi\rangle$). This is the mixture that arises as a result of Alice’s measurement. Thus before the information of $ij$ arrives, Bob cannot distinguish in any physical way between the unadulterated EPR particle and the post Bell measurement mixture. Hence no violation of causality can occur if we claim that Bob’s particle, in fact, corresponded to an equal mixture of the four states in Eq. 10 before $t_1$.
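Two statements from this section can be checked with a few lines of arithmetic: that quite different mixtures of qubit states share the density matrix $\frac{1}{2}I$, and that the partial trace over B of the entangled state of Eq. 14 reproduces the density matrix of the original mixture. The sketch below uses only the Python standard library. It assumes that "three states 120° apart" means three real states whose Bloch vectors are 120° apart in one plane, and the function names are hypothetical helpers introduced for illustration.

```python
import math

def density(probs, states):
    """rho = sum_i p_i |psi_i><psi_i| for kets given as component lists."""
    d = len(states[0])
    return [[sum(p * s[r] * s[c].conjugate() for p, s in zip(probs, states))
             for c in range(d)] for r in range(d)]

def bloch_state(theta):
    """Real qubit state whose Bloch vector makes angle theta with the z axis."""
    return [math.cos(theta / 2), math.sin(theta / 2)]

def purification(probs, states):
    """Coefficients M[a][i] of sum_i sqrt(p_i) |psi_i>|e_i> in the product basis."""
    d = len(states[0])
    return [[math.sqrt(p) * s[a] for p, s in zip(probs, states)] for a in range(d)]

def partial_trace_over_B(M):
    """rho_A[r][c] = sum_i M[r][i] * conj(M[c][i]), using <e_i|e_j> = delta_ij."""
    d, n = len(M), len(M[0])
    return [[sum(M[r][i] * M[c][i].conjugate() for i in range(n))
             for c in range(d)] for r in range(d)]

def is_half_identity(rho, tol=1e-12):
    return all(abs(rho[r][c] - (0.5 if r == c else 0.0)) < tol
               for r in range(2) for c in range(2))

# Mixture (a): two orthogonal states with probability 1/2 each.
probs_a = [0.5, 0.5]
states_a = [[1.0, 0.0], [0.0, 1.0]]

# Mixture (b): three states 120 degrees apart, probability 1/3 each.
probs_b = [1 / 3] * 3
states_b = [bloch_state(2 * math.pi * k / 3) for k in range(3)]

rho_a = density(probs_a, states_a)
rho_b = density(probs_b, states_b)

# Entangle mixture (b) with three orthonormal states of B, then trace B out.
rho_from_entangled = partial_trace_over_B(purification(probs_b, states_b))

print(is_half_identity(rho_a), is_half_identity(rho_b),
      is_half_identity(rho_from_entangled))
```

All three matrices agree, so no measurement on A alone can reveal which preparation was used or whether A is half of an entangled pair.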
5 Compression of Information
Consider a situation in which Alice is generating a sequence of signals where each signal is either zero or one, chosen randomly and independently with prior probabilities $p_0 = 0.2$ and $p_1 = 0.8$. Suppose that she wishes to communicate the sequence of signals to Bob. Clearly she can achieve this by sending one bit per signal (i.e. the signal value itself) but can she (reliably) communicate the information more economically? According to Shannon’s source coding theorem [1,2,3] she can in this example communicate the information to Bob using only 0.722 bits per signal (e.g. 722 bits per 1000 signals) in such a way that Bob will obtain the information with arbitrarily low probability of error for all sufficiently long strings. The value 0.722 is the Shannon entropy (cf. Eq. 2) of the prior probability distribution $\{0.2, 0.8\}$. Any further compression beyond
$H(0.2, 0.8)$ bits per signal will necessarily result in high probability of an error in all sufficiently long strings. Alice achieves the compression by “block coding”: instead of considering signals individually she takes long strings of $K$ signals jointly and re-codes them as shorter sequences (of $0.722K$ bits in our example). On reception, Bob is able to reconstruct the original sequence with arbitrarily high probability by a suitable decoding operation. It is remarkable and perhaps counterintuitive at first sight that such compression can be possible at all! In our example the signal ‘one’ is more likely to occur than ‘zero’ so in this sense Bob has some information about the signal’s identity even before it is sent (he could make a guess which is correct more often than not). Hence sending just the string of signals directly includes some redundancy and our coding scheme may be intuitively thought of as eliminating this redundancy, sending a shorter string containing only the information that Bob does not already have. Thus compression will be possible for any non-uniform prior probability distribution of signals.

For later comparison with the quantum case, it will be useful to have a more precise statement of Shannon’s source coding theorem. Suppose that Alice is generating signals $a_1, \ldots, a_n$ with prior probabilities $p_1, \ldots, p_n$. Let $H$ be the Shannon entropy of this distribution. (Note that if Alice sends the signals directly she will need $\log_2 n$ bits per signal to send $n$ distinct possible names, and recall that $H \le \log_2 n$ for any distribution.) A coding/decoding scheme is given by the following: for each (sufficiently large) block of length $K$, Alice has a (coding) operation for converting any block $\mathbf{a}_K = a_{i_1} \ldots a_{i_K}$ of $K$ signals ($K \log_2 n$ bits) into a shorter string $\sigma_K$ of $\lambda(K) K \log_2 n$ bits (where $0 \le \lambda(K) \le 1$) and Bob has a (decoding) operation for converting any $\lambda(K) K \log_2 n$ bit string $\sigma_K$ into a sequence $\mathbf{a}'_K = a_{j_1} \ldots a_{j_K}$ of $K$ signals. The coding and decoding operations may involve probabilistic processes and generally $\mathbf{a}'_K \neq \mathbf{a}_K$ (although we will want the probability of equality here to be as high as possible for reliable communication). The fidelity of the coding/decoding scheme, for signal strings of length $K$, is defined as follows. Each of the $n^K$ possible strings $\mathbf{a}_K = a_{i_1} \ldots a_{i_K}$ has a prior probability $p(\mathbf{a}_K) = p_{i_1} \cdots p_{i_K}$ and then
\[ F_K = \sum_{\text{all } \mathbf{a}_K} p(\mathbf{a}_K) \cdot \Big( \text{probability that Bob decodes to get } \mathbf{a}_K \text{ given that Alice sent } \mathbf{a}_K \Big) \qquad (15) \]
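The entropy value of 0.722 bits quoted above, and the way the fidelity $F_K$ behaves on either side of it, can be illustrated with a short calculation. The sketch below (Python, standard library only) assumes one particular simple scheme that is not spelled out in the text: with a budget of so many bits per signal, Alice gives a distinct codeword to as many of the most probable $K$-strings as the budget allows, and every other string is decoded wrongly. For such a scheme $F_K$ is just the total probability of the strings that received a codeword. The function names are hypothetical.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def block_fidelity(p_zero, K, bits_per_signal):
    """F_K for the 'label the most probable strings' scheme described above.

    Works with ordinary floats, so K should stay at or below roughly 1000.
    """
    budget = 2 ** math.floor(bits_per_signal * K)  # number of codewords
    p_one = 1.0 - p_zero
    # All strings with k zeros share one probability; there are C(K, k) of them.
    classes = sorted(
        ((p_zero ** k * p_one ** (K - k), math.comb(K, k)) for k in range(K + 1)),
        reverse=True,  # most probable strings first
    )
    fidelity = 0.0
    for prob, count in classes:
        used = min(count, budget)
        fidelity += used * prob
        budget -= used
        if budget == 0:
            break
    return fidelity

H = shannon_entropy([0.2, 0.8])
K = 1000
f_above = block_fidelity(0.2, K, H + 0.05)  # slightly more than H bits per signal
f_below = block_fidelity(0.2, K, H - 0.05)  # slightly fewer than H bits per signal

print(round(H, 3))
print(f_above > 0.9, f_below < 0.1)
```

With a margin of 0.05 bits per signal and blocks of 1000 signals the fidelity is already close to 1 above the entropy and close to 0 below it, which is the dichotomy that the source coding theorem makes precise in the limit of large $K$.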
Clearly $F_K$ is a measure of the reliability of the coding/decoding scheme to communicate signal strings of length $K$. We can now state:
Theorem (Shannon’s Source Coding Theorem). For any $\epsilon, \delta > 0$:

(a) If we have $H + \delta$ bits per signal available then there is a coding/decoding scheme with $F_K > 1 - \epsilon$ for all sufficiently large $K$.

(b) If we have $H - \delta$ bits per signal available then for any coding/decoding scheme we will have $F_K < \epsilon$ for all sufficiently large $K$. $\Box$

There is a natural quantum analogue of the scenario described above (first considered by Schumacher [15]). Suppose that Alice has a source which generates a sequence of quantum states. Each state is either $|\psi_0\rangle$ or $|\psi_1\rangle$ (a pair of specified qubit states, generally non-orthogonal) chosen randomly and independently with prior probabilities $p_0 = 0.2$ and $p_1 = 0.8$ respectively. Alice wishes to communicate the sequence of quantum states to Bob. Clearly she can achieve this by sending one qubit per state (by sending the states themselves) but can she do it more economically, sending a smaller number of Hilbert space dimensions (than 2) per signal? Let $\rho$ be the density matrix defined by the source:
\[ \rho = p_0\, |\psi_0\rangle\langle\psi_0| + p_1\, |\psi_1\rangle\langle\psi_1| \]
and let $S(\rho)$ be the von Neumann entropy of $\rho$, a concept which will be defined in the next section. For the present it will suffice to note that $S$ is a kind of quantum analogue of the Shannon entropy $H$ and that
\[ S(\rho) \le H(p_0, p_1) \qquad (16) \]
for any distribution. Also in Eq. 16 the von Neumann entropy of the source equals the Shannon entropy of the prior probability distribution if and only if the quantum states are all mutually orthogonal. We will see (cf. Schumacher compression later) that Alice may compress the quantum information represented by the sequence of qubit states to $S(\rho)$ qubits per state and no further while achieving arbitrarily reliable transmission of the quantum information to Bob. Thus compression beyond the classical limit of $H(p_0, p_1)$ is generally possible and we may think of this as being due to an extra “quantum redundancy” arising from the non-distinguishability of non-orthogonal states.

In the classical context Alice may always read the input bit string and use that knowledge in her compression scheme. However in the quantum context, because of the inaccessibility of quantum information, we can distinguish two distinct situations:

(a) Blind compression: Alice is fed the sequence of states from an external source. She does not know the identities of the individual states in the
sequence (and she cannot determine this reliably). Thus her compression scheme can be based only on the known prior distribution and the identities of corresponding states.

(b) Visible compression: Alice knows the identity of each individual state in the input sequence e.g. she may prepare the states herself in accordance with the required probability distribution.
As an example of a visible compression scheme, Alice may do the following: instead of sending quantum information to Bob she merely sends him the classical information of the string of 0’s and 1’s which gives the identity of the states in the sequence. On reception Bob builds the corresponding states in his laboratory. According to Shannon’s theorem this classical information may be compressed to $H(p_0, p_1)$ bits per state. However this scheme is not optimal (and generally very inefficient) for two reasons. Firstly according to Eq. 16 the compression to $S(\rho)$ qubits per state is better than $H(p_0, p_1)$ bits per state (considering that the communication of one bit requires the transmission of one qubit, using a chosen basis of orthonormal states). Indeed the scheme does not exploit the quantum redundancy. Secondly, if we had more signal states $|\psi_1\rangle, \ldots, |\psi_n\rangle$ which were still all qubit states, with equal prior probabilities $p_1 = \ldots = p_n = \frac{1}{n}$ say, then Alice would need $H(\frac{1}{n}, \ldots, \frac{1}{n}) = \log_2 n$ bits per signal, which can be arbitrarily large, whereas just sending the states themselves requires only 1 qubit per signal. In terms of the inaccessibility of quantum information, Alice is sending Bob a large amount of extra information (the identity of the states in the string) that he could not get if he were given the full correct sequence of quantum states.

One may expect that visible compression would be more efficient than blind compression since Alice has more information about the input sequence at her disposal. However, remarkably, it is a consequence of Schumacher’s result that there is no difference: the limit of high fidelity compression in both cases is $S(\rho)$ qubits per signal provided that the signal states are all pure states. This surprising equivalence between blind and visible optimal compression is not expected to persist in the case where the signals may be mixed states.
6 Von Neumann Entropy
Before entering into a more precise discussion of Schumacher compression we will introduce the key concept of the von Neumann entropy $S$ of a mixture of quantum states. This concept was first defined by von Neumann [20] in 1927, motivated by thermodynamic considerations, and he used it to prove the irreversibility of the quantum measurement process.
By way of motivation, we wish to associate an entropy or information content to an ensemble of quantum states $|\psi_1\rangle, \ldots, |\psi_n\rangle$ with given prior probabilities $p_1, \ldots, p_n$. In classical physics the entropy or information content of a probabilistic distribution of classical states is quantified by the Shannon function given in Eq. 2. This formula was already well established in classical thermodynamics before the advent of Shannon’s information theory. We would wish our quantum formula to be a suitable generalisation of the classical one, reducing to it in the special case of orthogonal states. Now in the quantum context we have seen (in Eqs. 11 and 12) that as far as any measurable physical property is concerned, a mixture of quantum states is fully characterised by its density matrix $\rho = \sum_i p_i\, |\psi_i\rangle\langle\psi_i|$. Furthermore any other mixture with the same density matrix is physically equivalent. With this in mind, we now use the following result:

Theorem. For any density matrix $\rho$ let $\lambda_1, \ldots, \lambda_d$ be its eigenvalues and let $|\lambda_1\rangle, \ldots, |\lambda_d\rangle$ be corresponding eigenvectors. Then

(a) $\{\lambda_1, \ldots, \lambda_d\}$ is always a classical probability distribution i.e. $\sum_i \lambda_i = 1$ and each $\lambda_i$ is real and non-negative.

(b) The eigenvectors $|\lambda_i\rangle$ may be chosen to be orthonormal.

(c) $\rho$ is the density matrix of the mixture of states $|\lambda_i\rangle$ taken with prior probabilities $\lambda_i$ respectively i.e. $\rho = \sum_i \lambda_i\, |\lambda_i\rangle\langle\lambda_i|$. $\Box$
These are all standard results and proofs may be found for example in von Neumann [20] or Peres [14]. For (a) we note that since $\rho$ is a convex sum of projection operators, it is a positive Hermitian matrix with trace 1 (as any one dimensional projection is positive Hermitian with trace 1). This immediately gives the eigenvalue properties stated in (a). (b) is a well known property of eigenvectors of any Hermitian matrix and (c) is just the spectral decomposition of the Hermitian matrix $\rho$. The mixture in (c) comprises orthogonal states so that we would expect its entropy to be given by the classical formula $-\sum_i \lambda_i \log_2 \lambda_i$. Furthermore this mixture has density matrix $\rho$ so we arrive at the formula
\[ S(\rho) = -\sum_i \lambda_i \log_2 \lambda_i \qquad (17) \]
for the entropy of any mixture of quantum states with density matrix $\rho$. Here $\lambda_i$ are the eigenvalues of $\rho$ and Eq. 17 may be written alternatively as
\[ S(\rho) = -\mathrm{trace}\, \rho \log_2 \rho \qquad (18) \]
defining the von Neumann entropy of $\rho$. (Here we have used the fact that $\lambda_i \log_2 \lambda_i$ are the eigenvalues of $\rho \log_2 \rho$ and the trace gives the sum of the eigenvalues.) Eq. 17 shows that for any mixed state in a Hilbert space of dimension $d$ we have
\[ S(\rho) \le \log_2 d \qquad (19) \]
since $\log_2 d$ is the maximum possible value of the Shannon entropy of any distribution with at most $d$ non-zero probabilities.

The quantity $S$ plays a fundamental role in many aspects of quantum information theory, independently of its intuitive motivation above. It is interesting to note that in classical physics the entropy can be interpreted as the amount of information gained by identifying the state (from the known prior distribution). However this is no longer true in the quantum case as the state generally cannot be identified. Holevo’s theorem states that the quantum (von Neumann) entropy $S(\rho)$ provides an upper bound on the amount of information we may obtain about the identity of the state but this bound is generally not very tight. Thus Holevo’s theorem does not provide a precise information theoretic interpretation of von Neumann entropy. Schumacher’s result (below) gave the first such characterisation albeit in terms of quantum rather than classical information (cf. Jozsa [21]). Because of this interpretation we take the unit of $S$ to be the qubit (analogous to the unit of bit for the Shannon entropy $H$).

As an example of computing the von Neumann entropy consider the mixture of quantum states
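\[ |\psi_0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |\psi_1\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
taken with prior probabilities $p_0 = p_1 = \frac{1}{2}$. The density matrix is
\[ \rho = \frac{1}{2}\, |\psi_0\rangle\langle\psi_0| + \frac{1}{2}\, |\psi_1\rangle\langle\psi_1| = \frac{1}{4} \begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix} \]
The eigenvalues are $\cos^2 \frac{\pi}{8}$ and $\sin^2 \frac{\pi}{8}$, so
\[ S(\rho) = -\mathrm{trace}\, \rho \log_2 \rho = H\big(\cos^2 \tfrac{\pi}{8},\, \sin^2 \tfrac{\pi}{8}\big) = 0.601 \text{ qubits} \]
Thus for a probabilistic sequence of these states, no classical compression (of the state names) is possible since $H(\frac{1}{2}, \frac{1}{2}) = 1$ but according to Schumacher’s theorem, long sequences can be reliably transmitted using only 0.601 qubits per state, exploiting the quantum redundancy inherent in the non-orthogonality of $|\psi_0\rangle$ and $|\psi_1\rangle$.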
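The value of 0.601 qubits quoted in this example can be reproduced directly from Eq. 17. The sketch below (Python, standard library only) assumes the two signal states are $(1, 0)$ and $\frac{1}{\sqrt{2}}(1, 1)$, a pair chosen here because it yields the eigenvalues $\cos^2\frac{\pi}{8}$ and $\sin^2\frac{\pi}{8}$ stated in the text. It obtains the eigenvalues of the $2 \times 2$ density matrix from its trace and determinant, and the helper names are hypothetical.

```python
import math

def density_2x2(probs, states):
    """rho = sum_i p_i |psi_i><psi_i| for qubit kets given as component lists."""
    return [[sum(p * s[r] * s[c].conjugate() for p, s in zip(probs, states))
             for c in range(2)] for r in range(2)]

def eigenvalues_2x2_hermitian(m):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    gap = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 + gap, tr / 2 - gap

def entropy_bits(probs):
    """-sum p log2 p, skipping (numerically) zero probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-15)

# Assumed signal states and equal prior probabilities.
psi0 = [1.0, 0.0]
psi1 = [1 / math.sqrt(2), 1 / math.sqrt(2)]
probs = [0.5, 0.5]

rho = density_2x2(probs, [psi0, psi1])
eigs = eigenvalues_2x2_hermitian(rho)

S = entropy_bits(eigs)        # von Neumann entropy of the mixture, Eq. 17
H_names = entropy_bits(probs)  # Shannon entropy of the state names

print(round(S, 3), H_names)
```

The gap between `H_names` (one bit per state) and `S` is the "quantum redundancy" referred to in the text: it vanishes only if the two signal states are made orthogonal.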
7 Schumacher Compression of Quantum Information

To state our fundamental result we first need a notion of fidelity for quantum states (analogous to the classical formula Eq. 15) which will provide a quantitative measure of reliability of any proposed coding/decoding (or compression/decompression) scheme. Let $|\sigma\rangle$ be any pure state and $\rho$ be any mixed state. We define the fidelity between $\rho$ and $|\sigma\rangle$ to be
\[ F = \langle\sigma|\, \rho\, |\sigma\rangle \]
This is just the probability that $\rho$ passes the test of “being $|\sigma\rangle$” i.e. it is the average value of the observable which has value 1 in the subspace spanned by $|\sigma\rangle$ and 0 in the orthogonal subspace (this observable being $|\sigma\rangle\langle\sigma|$, the projector onto the subspace of $|\sigma\rangle$).

Now suppose that we have a source of quantum states $|\psi_1\rangle, \ldots, |\psi_n\rangle$ with prior probabilities $p_1, \ldots, p_n$ respectively. Each string $|\Psi_{i_1 \ldots i_K}\rangle = |\psi_{i_1}\rangle \ldots |\psi_{i_K}\rangle$ of length $K$ occurs with probability $p_{i_1 \ldots i_K} = p_{i_1} \cdots p_{i_K}$. Suppose the $K$-string is compressed and decompressed to yield (in general) a mixed state $\rho_{i_1 \ldots i_K}$. Then the average block fidelity of the compression/decompression procedure, for blocks of length $K$, is defined to be
\[ F_K = \sum_{\text{all } K\text{-strings}} p_{i_1 \ldots i_K}\, \langle\Psi_{i_1 \ldots i_K}|\, \rho_{i_1 \ldots i_K}\, |\Psi_{i_1 \ldots i_K}\rangle \qquad (20) \]
This is our quantum analogue of Eq. 15.

Theorem (Schumacher’s Quantum Source Coding Theorem). Suppose that a quantum source produces signal states $|\psi_1\rangle, \ldots, |\psi_n\rangle$ with prior probabilities $p_1, \ldots, p_n$ respectively. Let $\rho = \sum_i p_i\, |\psi_i\rangle\langle\psi_i|$ be the density matrix corresponding to the source. Then for any $\epsilon, \delta > 0$:

(a) If $S(\rho) + \delta$ qubits per signal are available, then for all sufficiently large $K$ we can transmit blocks of $K$ signals with average block fidelity $> 1 - \epsilon$.

(b) If $S(\rho) - \delta$ qubits per signal are available, then for all sufficiently large $K$, blocks of $K$ signals will be transmitted with average block fidelity $< \epsilon$ (for any choice of coding/decoding scheme). $\Box$

The full proof of this theorem may be found in the literature [15,22,23]. Here we will outline the essential basis of an optimal coding/decoding scheme which achieves the compression in (a) and thus proves this part of the theorem. We remark that the theorem is valid for either blind or visible coding - the optimal coding scheme that we describe is in fact a blind procedure, depending on the source only through its density matrix $\rho$.
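To make the fidelity between a pure state and a mixed state, as defined at the start of this section, a little more concrete, the following sketch (Python, standard library only) evaluates it in three cases: the state itself, a noisy version of it, and an orthogonal state. The target state and the noise model, a mixture of the target with the maximally mixed state with weight `q`, are assumptions made purely for illustration.

```python
import math

def fidelity(sigma, rho):
    """F = <sigma| rho |sigma> for a normalised pure state sigma."""
    d = len(sigma)
    value = sum(sigma[r].conjugate() * rho[r][c] * sigma[c]
                for r in range(d) for c in range(d))
    return value.real

def projector(state):
    """|state><state| as a nested list."""
    return [[a * b.conjugate() for b in state] for a in state]

# An assumed target state and the state orthogonal to it.
sigma = [1 / math.sqrt(2), 1j / math.sqrt(2)]
sigma_perp = [1 / math.sqrt(2), -1j / math.sqrt(2)]

# Assumed noise: with weight q the state is replaced by the maximally mixed state.
q = 0.2
P = projector(sigma)
rho_noisy = [[(1 - q) * P[r][c] + (q / 2 if r == c else 0.0)
              for c in range(2)] for r in range(2)]

f_exact = fidelity(sigma, P)                      # the state passes its own test
f_noisy = fidelity(sigma, rho_noisy)              # equals 1 - q/2 for this noise
f_orth = fidelity(sigma, projector(sigma_perp))   # an orthogonal state always fails

print(round(f_exact, 6), round(f_noisy, 6), round(f_orth, 6))
```

The average block fidelity of Eq. 20 is this same quantity evaluated for each signal block against its decompressed version and then averaged over the source probabilities.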
The essential ingredient in the quantum compression scheme is the notion of the typical subspace $\Lambda(K)$ of the source. Consider again the illustrative example of two states in two dimensions, $|\psi_0\rangle$ and $|\psi_1\rangle$ with prior probabilities $p_0$ and $p_1$ respectively. Let $\rho = p_0\, |\psi_0\rangle\langle\psi_0| + p_1\, |\psi_1\rangle\langle\psi_1|$ be the corresponding density matrix. The $2^K$ signal strings $|\Psi_{i_1 \ldots i_K}\rangle = |\psi_{i_1}\rangle \ldots |\psi_{i_K}\rangle$ of length $K$ are generally linearly independent and occupy a space of $2^K$ dimensions i.e. $K$ qubits. They also have associated probabilities $p_{i_1 \ldots i_K} = p_{i_1} \cdots p_{i_K}$ and when these are taken into account they do not occupy the $2^K$ dimensions equally, in a remarkable way: as $K$ increases they tend to “pancake out” more and more around a smaller subspace $\Lambda(K)$ of dimension $2^{K S(\rho)}$ in the following sense. For any $\epsilon, \delta > 0$ and all sufficiently large $K$ there is a subspace $\Lambda(K)$ of dimension $2^{K(S(\rho)+\delta)}$ in the $2^K$ dimensional space of $K$ signals such that
\[ \sum_{\text{all } K\text{-strings}} p_{i_1 \ldots i_K}\, \langle\Psi_{i_1 \ldots i_K}|\, \Pi_{\Lambda(K)}\, |\Psi_{i_1 \ldots i_K}\rangle > 1 - \epsilon \qquad (21) \]
(where $\Pi_{\Lambda(K)}$ denotes the projection operator onto $\Lambda(K)$) i.e. any $K$-string of signals generated by the source will be found to lie in the subspace $\Lambda(K)$ with high probability $1 - \epsilon$ and those states $|\Psi_{i_1 \ldots i_K}\rangle$ which lie far from $\Lambda(K)$ must all occur with very low probability. Note that the subspace $\Lambda(K)$ of dimension $2^{K(S(\rho)+\delta)}$ is generally exponentially small inside the total space (as we have $S(\rho) \le 1$ for any qubit source). We can also say that $\Lambda(K)$ is the smallest possible subspace with the property that it captures essentially all of the signal strings of length $K$, in the following sense. For every subspace $C$ of dimension $2^{K(S(\rho)-\delta)}$ the average probability that strings of length $K$ lie in $C$ is less than $\epsilon$ for all sufficiently large $K$.

The above discussion generalises readily to a source which emits a distribution of $r$ possible states $|\psi_1\rangle, \ldots, |\psi_r\rangle$, each of which is a state in $d$ dimensions. In this case, the $r^K$ signal sequences of length $K$ span a space of $d^K = 2^{K \log_2 d}$ dimensions and the typical subspace has dimension $2^{K(S(\rho)+\delta)}$ which is generally exponentially small by Eq. 19.

The above properties of $\Lambda(K)$ follow readily from well known properties of typical sequences in classical information theory which we now summarise. This also provides a prescription for constructing the typical subspace $\Lambda(K)$. Let $P = \{p_1, \ldots, p_m\}$ be any classical probability distribution. Let
\[ P^K = \{\, p_{i_1} \cdots p_{i_K} : 1 \le i_1, \ldots, i_K \le m \,\} \]
be the set (of size $m^K$) of all products of $K$-sequences of the $p_i$’s. Here products are kept distinct even if they give the same value i.e. $P^K$ is the
probability distribution of $K$ independent trials of $P$. The total probability of any subset $A \subseteq P^K$ is defined to be the sum of all the members of $A$. Let $H = H(p_1, \ldots, p_m)$ be the Shannon entropy of $P$. Then we have:

Theorem of Typical Sequences. For all $\epsilon, \delta > 0$ and for all sufficiently large $K$ there is a subset $C_K \subset P^K$ of size $2^{K(H+\delta)}$ such that

(a) The total probability of $C_K$ is greater than $1 - \epsilon$ (so also the total probability of the remaining members $U_K = P^K - C_K$ is less than $\epsilon$).

(b) Any subset of $P^K$ with less than $2^{K(H-\delta)}$ members has total probability less than $\epsilon$. $\Box$

Any such subset $C_K$ is called a set of typical sequences, and its complement $U_K$, a set of atypical sequences. Note that any set of typical sequences has exponentially small size inside the set of all sequences yet it contains almost all of the weight of the probability distribution $P^K$. According to (b) $2^{KH}$ is the critical smallest possible size of any subset with this property. A subset of typical sequences with the property (a) is not unique. We may choose $C_K$ to be the set of $2^{K(H+\delta)}$ largest members of $P^K$ but alternatively (and often more conveniently for further developments) it can be shown that we may require $C_K$ to satisfy the additional property [3] that each typical sequence is approximately equally likely, in the sense that
\[ 2^{-K(H+\delta)} < l < 2^{-K(H-\delta)} \qquad \text{for all } l \in C_K \]
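The content of the theorem of typical sequences can be seen numerically for a two-outcome distribution. The sketch below (Python, standard library only) takes the distribution $\{0.2, 0.8\}$ used earlier in the chapter and collects the $K$-sequences whose probability lies between $2^{-K(H+\delta)}$ and $2^{-K(H-\delta)}$, grouping sequences by the number of times the less likely outcome occurs. The block length $K = 1000$ and margin $\delta = 0.05$ are illustrative choices, and the function name is hypothetical.

```python
import math

def typical_set_stats(p_zero, K, delta):
    """Entropy, size and total probability of the approximately equiprobable typical set."""
    p_one = 1.0 - p_zero
    H = -(p_zero * math.log2(p_zero) + p_one * math.log2(p_one))
    size, total = 0, 0.0
    for k in range(K + 1):  # k = number of zeros in the sequence
        log_prob = k * math.log2(p_zero) + (K - k) * math.log2(p_one)
        if -K * (H + delta) < log_prob < -K * (H - delta):
            count = math.comb(K, k)  # sequences with exactly k zeros
            size += count
            total += count * 2.0 ** log_prob
    return H, size, total

K, delta = 1000, 0.05
H, size, total = typical_set_stats(0.2, K, delta)

fraction_of_all = size / 2 ** K             # share of all 2^K sequences
within_bound = size < 2 ** (K * (H + delta))

print(total > 0.9, within_bound, fraction_of_all < 1e-50)
```

The typical set carries most of the probability while being a vanishingly small fraction of all $2^K$ sequences, which is exactly the situation exploited by both classical block coding and the quantum typical subspace. For much shorter blocks the captured probability is noticeably lower, which is why the theorem is stated for sufficiently large $K$.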
Returning now to the quantum source coding theorem let $\rho$ be the density matrix of the source (of $d$ dimensional states) and let $P = \{\lambda_1, \ldots, \lambda_d\}$ be the set of eigenvalues of $\rho$. Then the density matrix $\rho_K$ of $K$-sequences of signals is just
\[ \rho_K = \underbrace{\rho \otimes \cdots \otimes \rho}_{K \text{ times}} \]
in $2^{K \log_2 d}$ dimensions. The eigenvalues of $\rho_K$ are precisely the set $P^K$ and the Shannon entropy $H$ of $P$ is just $S(\rho)$ (by the definition of $S$, Eq. 17). Now consider the density matrix $\rho_K$ written in the basis of its eigenstates. It will be a diagonal matrix with the $2^{K \log_2 d}$ members of $P^K$ on the diagonal. Choose a typical subset $C_K \subset P^K$ and select the $2^{K(S(\rho)+\delta)}$ typical elements from the list on the diagonal. Set all the remaining (atypical) elements to zero. According to the theorem of typical sequences, this will have a negligible effect on $\rho_K$ and the resulting modified density matrix $\tilde{\rho}_K$ will be supported on a subspace of dimension $2^{K(S(\rho)+\delta)}$. This gives the construction of the typical subspace $\Lambda(K)$ for the source: it is the span of all eigenstates of $\rho_K$ belonging
to the typical eigenvalues i.e. the eigenvalues in $C_K \subset P^K$. The claimed properties of the typical subspace then follow readily from the theorem of typical sequences.

The above discussion leads to the following explicit scheme for optimal high fidelity compression of quantum information. Given a source with density matrix $\rho$, and (sufficiently large) block size $K$, determine the typical subspace $\Lambda(K)$ of dimension $2^{K(S(\rho)+\delta)}$ as described above. Let $U$ be any chosen unitary transformation (in dimension $d^K = 2^{K \log_2 d}$) which rotates $\Lambda(K)$ into a standard configuration, say, occupying the first $K(S(\rho)+\delta)$ qubits of a row of $K \log_2 d$ qubits and having $|0\rangle$ in each of the remaining $R = K \log_2 d - K(S(\rho)+\delta)$ qubits. Now if $|\Psi_K\rangle$ is a block of $K$ signals generated by the source, let $\tilde{\chi}_K$ be the mixed state obtained by discarding the last $R$ qubits of $U|\Psi_K\rangle$ and replacing them with $R$ qubits each in state $|0\rangle$. Note that $\tilde{\chi}_K$ will generally be a mixed state since the initial retained $K(S(\rho)+\delta)$ qubits will be entangled slightly with the discarded $R$ qubits. Because of the characteristic property Eq. 21 of $\Lambda(K)$ we see that, on average, $|\Psi_K\rangle$ will have arbitrarily large ($> 1 - \epsilon$) weight in $\Lambda(K)$ so the average fidelity between $U|\Psi_K\rangle$ and $\tilde{\chi}_K$ will be correspondingly large.

Now Alice sends to Bob the first $K(S(\rho)+\delta)$ qubits of $U|\Psi_K\rangle$ (discarding the last $R$ qubits so she has compressed $K \log_2 d$ qubits to $K(S(\rho)+\delta)$ qubits). Upon receiving them, Bob decompresses the quantum information by attaching $R$ extra qubits each in state $|0\rangle$ (giving the state $\tilde{\chi}_K$) and applying $U^{-1}$. This will yield $|\Psi_K\rangle$ with high average fidelity. Thus we have shown how to achieve the compression given in part (a) of the quantum source coding theorem. It is more difficult to prove (b) i.e. that no further compression is possible if we are to maintain high average fidelity for arbitrarily long sequences of signals. For this we need to consider an expression for the most general possible coding/decoding scheme and the details may be found in the literature, especially Barnum et al. [23]
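The typical subspace construction can be tried out on a small block of the two-state source from the von Neumann entropy example. The sketch below (Python, standard library only) is an illustration under stated assumptions, not the general scheme: the signal states are taken to be $(1, 0)$ and $\frac{1}{\sqrt{2}}(1, 1)$ with equal probabilities, the block length is only $K = 8$, and the "typical" subspace is simply taken to be the span of those product eigenvectors of $\rho \otimes \cdots \otimes \rho$ that contain at most two copies of the small-eigenvalue eigenvector. It computes the left hand side of Eq. 21 string by string and compares it with the sum of the selected eigenvalues.

```python
import itertools
import math

# Assumed signal states (as in the entropy example) and prior probabilities.
SIGNALS = [(1.0, 0.0), (1 / math.sqrt(2), 1 / math.sqrt(2))]
PROBS = [0.5, 0.5]

# Eigenvalues and eigenvectors of rho = (1/4) [[3, 1], [1, 1]].
ANGLE = math.pi / 8
EIGVALS = [math.cos(ANGLE) ** 2, math.sin(ANGLE) ** 2]
EIGVECS = [(math.cos(ANGLE), math.sin(ANGLE)), (-math.sin(ANGLE), math.cos(ANGLE))]

# OVERLAP[b][i] = |<lambda_b | psi_i>|^2 (all components are real here).
OVERLAP = [[(e[0] * s[0] + e[1] * s[1]) ** 2 for s in SIGNALS] for e in EIGVECS]

def typical_labels(K, max_ones):
    """Labels of product eigenvectors with at most max_ones small-eigenvalue factors."""
    return [b for b in itertools.product((0, 1), repeat=K) if sum(b) <= max_ones]

def weight_in_subspace(signal_string, labels):
    """<Psi| Pi |Psi> for a product signal state and the span of the labelled eigenvectors."""
    return sum(math.prod(OVERLAP[b][i] for b, i in zip(label, signal_string))
               for label in labels)

def average_weight(K, max_ones):
    """Left hand side of Eq. 21 together with the dimension of the chosen subspace."""
    labels = typical_labels(K, max_ones)
    total = 0.0
    for string in itertools.product(range(len(SIGNALS)), repeat=K):
        p = math.prod(PROBS[i] for i in string)
        total += p * weight_in_subspace(string, labels)
    return total, len(labels)

K, MAX_ONES = 8, 2
avg_weight, dim = average_weight(K, MAX_ONES)
qubits_sent = math.ceil(math.log2(dim))

# The same quantity from the eigenvalues alone: trace of rho_K restricted to the subspace.
eigen_sum = sum(math.comb(K, k) * EIGVALS[0] ** (K - k) * EIGVALS[1] ** k
                for k in range(MAX_ONES + 1))

print(dim, qubits_sent, round(avg_weight, 4), abs(avg_weight - eigen_sum) < 1e-9)
```

Here the subspace has dimension 37 and so fits into 6 of the 8 qubits, while retaining roughly nine tenths of the weight of the signal blocks. Since a block with weight $w$ in the subspace is recovered with fidelity at least $w^2$, longer blocks (for which the weight approaches 1 at a rate approaching $S(\rho)$ qubits per signal) are what make the scheme reliable.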
8 Conclusions
We have seen that the phenomenon of entanglement plays a fundamental role in all aspects of quantum information theory that we have discussed. This includes the extraordinary capability of a quantum system to embody information (albeit largely inaccessible) far more efficiently than any classical system of a corresponding size, and the process of quantum teleportation which may be used to transmit quantum information reliably through a noisy environment if we have procedures for purifying entanglement by local operations and classical communication [16,17,26]. We have shown how quantum information may
be compressed by a factor given by the von Neumann entropy of the source distribution, giving an elegant quantum analogue of Shannon’s source coding theorem. Another important aspect of quantum information theory, which is discussed elsewhere in this volume [24], is the subject of quantum error correcting codes and this also relies heavily on properties of entanglement. Many immediate further developments of these basic results are the subject of much current research, for example, the question of quantifying the capacity of a noisy quantum channel for the reliable transmission of classical and quantum information, the problem of quantifying the maximum amount of pure entanglement that can be ‘distilled’ out of given mixed states shared by Alice and Bob, and determining the limit of high fidelity compression for mixed states. Ultimately we would wish to develop a theory of quantum information as extensive as Shannon’s classical information theory which would provide a fertile new conceptual framework for interpreting and understanding all aspects of quantum physics.
Acknowledgments This work was supported in part by the European TMR Research Network ERB-FMRX-CT96-0087.
References
1. C. E. Shannon and W. Weaver, The Mathematical Theory of Communication (University of Illinois Press, 1949).
2. R. B. Ash, Information Theory (Interscience, New York, 1965).
3. T. M. Cover and J. A. Thomas, Elements of Information Theory (John Wiley and Sons, New York, 1991).
4. W. K. Wootters and W. H. Zurek, Nature 299, 802 (1982).
5. T. P. Spiller, this volume.
6. A. S. Holevo, Probl. Inf. Transm. (USSR) 9, 177 (1973).
7. B. Schumacher, in Complexity, Entropy and the Physics of Information, edited by W. H. Zurek (Addison-Wesley, Redwood City, CA, 1990).
8. A. Ekert and R. Jozsa, Rev. Mod. Phys. 68, 733 (1996).
9. A. Barenco, this volume.
10. R. Jozsa, Entanglement and Quantum Computation, in The Geometric Universe, eds. S. Huggett, L. Mason, K. P. Tod, S. T. Tsou and N. M. J. Woodhouse (Oxford University Press, 1998).
11. P. H. Eberhard and R. R. Ross, Found. Phys. Lett. 2, 127 (1989).
12. C. H. Bennett and S. J. Wiesner, Phys. Rev. Lett. 69, 2881 (1992).
13. The idea of backwards in time propagation of information also applies in quantum dense coding and it is attributed to B. Schumacher and C. H. Bennett?2
14. A. Peres, Quantum Theory: Concepts and Methods (Kluwer, 1993).
15. B. Schumacher, Phys. Rev. A 52, 2738 (1995).
16. C. H. Bennett, G. Brassard, S. Popescu, B. Schumacher, J. A. Smolin and W. K. Wootters, Phys. Rev. Lett. 76, 722 (1996).
17. C. H. Bennett, D. DiVincenzo, J. A. Smolin and W. K. Wootters, Phys. Rev. A 54, 3824 (1996).
18. L. P. Hughston, R. Jozsa and W. K. Wootters, Phys. Lett. A 183, 14 (1993).
19. C. H. Bennett, G. Brassard, C. Crepeau, R. Jozsa, A. Peres and W. K. Wootters, Phys. Rev. Lett. 70, 1895 (1993).
20. J. von Neumann, Mathematical Foundations of Quantum Mechanics, English translation by R. Beyer (Princeton University Press, 1955).
21. R. Jozsa, in Quantum Computing, Communication and Measurement: Proceedings of QCM’96, edited by O. Hirota, A. S. Holevo and C. M. Caves (Plenum Press, 1997).
22. R. Jozsa and B. Schumacher, J. Mod. Optics 41, 2343 (1994).
23. H. Barnum, C. Fuchs, R. Jozsa and B. Schumacher, Phys. Rev. A 54, 4707 (1996).
24. A. M. Steane, this volume.
25. H.-K. Lo, this volume.
26. S. Popescu and D. Rohrlich, this volume.
QUANTUM CRYPTOLOGY [a]

HOI-KWONG LO
MagiQ Technologies, Inc., 275 Seventh Avenue, 26th Floor, New York, NY 10001-6708, USA

The contest between code-makers and code-breakers has been going on for thousands of years. Recently, quantum mechanics has made a remarkable entry in the field. On the one hand, it has been rigorously proven that quantum cryptography can provide absolute security for communications between two users. On the other hand, code-breakers in possession of a quantum computer can easily break popular encryption schemes such as RSA and the Data Encryption Standard (DES), which are essentially intractable with any classical computer. Here I survey these recent developments.
1 Introduction
Coded messages have a long history in military applications [1]. With the proliferation of the Internet and electronic mail, the importance of achieving secrecy in communications by cryptography [2], the art of using coded messages, is growing each day. Amazingly, quantum mechanics has now provided the foundation stone for a new approach to cryptography: quantum cryptography [3]. It has been claimed that quantum cryptography can solve many problems that are impossible from the perspective of conventional cryptography. Here I survey the physical principles behind quantum cryptography together with its triumphs and defeats. This is followed by a discussion on the power of quantum computers in code-breaking. Finally, I give some thoughts for the future.

2 Novel Properties of Quantum Information
In my opinion, the essence of quantum cryptography can be understood by considering a single question: Given a single photon [b] in one of the four possible polarizations (horizontal, vertical, 45 degrees and 135 degrees), can you distinguish between these four possibilities with certainty? Surprisingly, the answer is no. This is due to the novel properties [4] of quantum information. First, there is a physical law in quantum mechanics known as the quantum ‘no-cloning’ theorem, which states that an unknown quantum state cannot be cloned. Second, given a quantum system prepared in one of two prescribed non-orthogonal states, any attempt to distinguish between the two possibilities necessarily leads to disturbance. Third, a measurement on an arbitrary unknown quantum state is an irreversible process which introduces disturbance to the state. As a result of these three properties, passive monitoring of quantum signals is impossible. Therefore, eavesdropping on quantum channels necessarily disturbs the signal and is exceedingly likely to be detected. In what follows, I will discuss these three properties [4] in more detail.

[a] Cryptology is the art of secure communications. It consists of cryptography, the art of code-making, and cryptanalysis, the art of code-breaking.
[b] Just as matter is made up of indivisible atoms, light is made up of photons, which are indivisible without a change of frequency. A photon is the smallest unit or quantum of light, which can be thought of as a tiny, oscillating electromagnetic field. The direction of the electric oscillation is known as its polarization, which can be probed by using a polarizer or a calcite crystal.

2.1 Quantum No-Cloning Theorem
Owing to the linearity of quantum mechanics, there is a quantum no-cloning theorem [5] which states that an unknown quantum state cannot be copied [c]. A proof by contradiction goes as follows. Suppose the contrary. Then a quantum copying machine exists and can copy an unknown state. Considering the unitary evolution of the composite system with the two orthogonal states $|0\rangle$ and $|1\rangle$ respectively as the input, one finds that

$$|0\rangle \otimes |u\rangle \to |0\rangle \otimes |0\rangle \otimes |v_0\rangle \tag{1}$$

$$|1\rangle \otimes |u\rangle \to |1\rangle \otimes |1\rangle \otimes |v_1\rangle \tag{2}$$

where $|u\rangle$ is the initial state [d] of the copying machine, and $|v_0\rangle$ and $|v_1\rangle$ are the final states of the system excluding the original and the duplicate. $|v_0\rangle$ and $|v_1\rangle$ may be non-orthogonal. Now suppose that the input is, in fact, a linear superposition $a|0\rangle + b|1\rangle$ ($a, b \neq 0$) of the two orthogonal states. Then by the linearity of quantum mechanics, one obtains from Eqs. 1 and 2 that

$$(a|0\rangle + b|1\rangle) \otimes |u\rangle \to a|0\rangle \otimes |0\rangle \otimes |v_0\rangle + b|1\rangle \otimes |1\rangle \otimes |v_1\rangle \tag{3}$$

Notice that the state of the original is now entangled with the duplicate. However, for quantum cloning the resulting state should be a direct product

$$(a|0\rangle + b|1\rangle) \otimes (a|0\rangle + b|1\rangle) \otimes |v'\rangle \tag{4}$$

instead. Since

$$a|0\rangle \otimes |0\rangle \otimes |v_0\rangle + b|1\rangle \otimes |1\rangle \otimes |v_1\rangle \neq (a|0\rangle + b|1\rangle) \otimes (a|0\rangle + b|1\rangle) \otimes |v'\rangle \tag{5}$$

whenever $a, b \neq 0$ [e], one concludes that an unknown quantum state cannot be cloned [f].

[c] Andy Steane says: “Even though one can clone a sheep, one cannot clone a single photon.”
[d] $|u\rangle$ is independent of the input state ($|0\rangle$ or $|1\rangle$) because the copying machine is assumed to have no prior knowledge of the state.
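The inequality in Eq. 5 can be checked numerically for a simple case such as a = b = 1/√2. The following is only a minimal sketch and not part of the original proof. As a simplifying assumption that favours the copier, the machine states |v0>, |v1> and |v'> are taken to be identical and are factored out, so that only the original and the duplicate are compared. All function names are illustrative.

```python
import math


def kron(x, y):
    """Tensor product of two state vectors given as lists of amplitudes."""
    return [xi * yj for xi in x for yj in y]


def inner(x, y):
    """Inner product <x|y> of two state vectors."""
    return sum(complex(xi).conjugate() * yi for xi, yi in zip(x, y))


KET0 = [1.0, 0.0]
KET1 = [0.0, 1.0]


def linear_copier_output(a, b):
    """State forced by linearity from Eqs. (1) and (2): a|0>|0> + b|1>|1>.

    Assumption: the machine states |v0> and |v1> are equal and factored out.
    """
    s00 = kron(KET0, KET0)
    s11 = kron(KET1, KET1)
    return [a * p + b * q for p, q in zip(s00, s11)]


def ideal_clone(a, b):
    """State a perfect cloner would need: (a|0> + b|1>) tensor (a|0> + b|1>)."""
    psi = [a, b]
    return kron(psi, psi)


def clone_fidelity(a, b):
    """Squared overlap between the ideal clone and the linear copier output."""
    return abs(inner(ideal_clone(a, b), linear_copier_output(a, b))) ** 2


if __name__ == "__main__":
    amp = 1 / math.sqrt(2)
    print(clone_fidelity(1.0, 0.0))  # a basis state is copied faithfully
    print(clone_fidelity(amp, amp))  # an equal superposition is not
```

For a basis state the two vectors coincide, while for the equal superposition the squared overlap is 1/2, so the two sides of Eq. 5 are indeed different states under the stated assumption.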
2.2 Information Gain & Disturbance
Another unusual property of quantum mechanics is that, in any attempt to distinguish between two non-orthogonal states, information gain is possible only at the expense of introducing disturbance to the signal. A proof goes as follows. Suppose one is given a particle in one of two possible non-orthogonal states $|\phi\rangle$ and $|\psi\rangle$. The most general evolution involves the attachment of an ancillary quantum system, say in a prescribed state $|u\rangle$, and a unitary transformation of the composite system. Assuming that the evolution leaves the state of the particle unchanged, one finds that

$$|\phi\rangle \otimes |u\rangle \to |\phi\rangle \otimes |v\rangle$$

$$|\psi\rangle \otimes |u\rangle \to |\psi\rangle \otimes |v'\rangle$$

where $|v\rangle$ and $|v'\rangle$ denote the final states of the ancilla in the two situations. Since the inner product is preserved by unitary transformations, one takes the inner product between the above two equations and finds that

$$\langle\psi|\phi\rangle \langle u|u\rangle = \langle\psi|\phi\rangle \langle v'|v\rangle$$

$$\langle\psi|\phi\rangle = \langle\psi|\phi\rangle \langle v'|v\rangle$$

$$\langle v'|v\rangle = 1$$

where the last line follows from the fact that $\langle\psi|\phi\rangle \neq 0$ for non-orthogonal states. Therefore, one concludes that $|v\rangle$ is the same as $|v'\rangle$. In other words, any process that causes no disturbance to any two non-orthogonal states must give no information in distinguishing between the two. Thus, information gain in distinguishing between two non-orthogonal states is possible only at the expense of disturbing the state of the system.

These two properties, the quantum no-cloning theorem and the tradeoff between information gain and disturbance, imply that, given a photon in one of the four polarizations (horizontal, vertical, 45-degree and 135-degree), there is no way to distinguish between the four possibilities with certainty.

[e] This can be verified by considering a simple example, say $a = b = 1/\sqrt{2}$.
[f] The above discussion shows that cloning violates the linearity of quantum mechanics. Since unitary transformations are linear, cloning also violates unitarity. Furthermore, cloning violates causality. Historically, it was suggested by Herbert that cloning can be used to transmit signals faster than the speed of light. Suppose Alice and Bob share an EPR pair (see Sec. 5.2) of photons. If Alice would like to send a ‘0’, she measures the polarization of her photon along the rectilinear basis. If she would like to send a ‘1’, she measures it along the diagonal basis. Now her measurement will project Bob’s photon into one of the four possible polarizations (vertical, horizontal, 45-degree and 135-degree). If cloning were possible, immediately after Alice’s measurement Bob could generate a sequence of photons all in one of the four possible polarizations. Bob could determine the polarization of his photons and thus the basis measured by Alice immediately, thus implying transmission of signals faster than the speed of light.

2.3 Irreversibility of Measurements
One might ask: What if one makes a measurement and copies the result of the measurement? Doesn’t it allow one to make copies? The answer is no because measurements generally disturb the state of an object under observation. Consequently, the result of a measurement is generally different from the initial state and the copying will be unfaithful. To understand this point, it suffices to consider the above example of a photon in one of the four possible polarizations [g]. A birefringent calcite crystal can be used to distinguish with certainty between horizontally and vertically polarized photons. As shown in Fig. 1a, horizontally polarized photons pass straight through whereas in Fig. 1b vertically polarized photons are deflected to a new path. Photons originally in these two polarizations are, therefore, deterministically routed. However, the law of quantum mechanics says that if a photon polarized in some other direction enters the crystal (Fig. 1c), it will have some probability of going into either beam. It will then be re-polarized according to which beam it goes into and permanently forget its original polarization. For instance, a diagonally (i.e., 45- or 135-degree) polarized photon is equally likely to go into either beam, revealing nothing about its original polarization.

If a photon is known to be rectilinearly (horizontally or vertically) polarized, then by a simple modification (adding two detectors, such as photo-multiplier tubes, that can record single photons along the two paths) an observer Bob can reliably distinguish between the two possibilities. This setup will, however, randomize the polarizations of diagonal (45- or 135-degree) photons, thus failing to distinguish between the two possibilities. In order to distinguish between diagonal photons, one should rotate the whole apparatus (calcite crystal and detectors) by 45 degrees. The rotated apparatus is, however, powerless in distinguishing between vertical and horizontal photons.
[g] The discussion here is based on an excellent exposition in Ref. 3.

Figure 1: A calcite crystal is used to distinguish between horizontal and vertical photons. (a) Horizontally polarized photons pass straight through. (b) Vertically polarized photons are deflected to a new path. (c) Diagonally polarized photons have equal probability of coming out vertically or horizontally polarized.
In conclusion, when a photon in one of the four polarizations (horizontal, vertical, 45-degree and 135-degree) is received, a naive process of measure-and-copy will disturb the signal and fail to distinguish between the four possibilities: A measurement that distinguishes rectilinear photons will disturb diagonal photons. Similarly, a measurement that distinguishes diagonal photons will disturb rectilinear photons. As the last two subsections demonstrate, this fundamental limitation in distinguishing between non-orthogonal states is due to the basic principles of quantum mechanics and thus it applies not only to the particular measuring apparatus described here, but also to any measuring apparatus.

I remark that these three novel properties of quantum information, namely (1) no cloning, (2) information gain implies disturbance and (3) measurements are irreversible, are closely related. Indeed, the first and third properties can be regarded as corollaries of the second. It would, thus, be interesting to work out a quantitative theory of the second property.
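The behaviour of the calcite-crystal apparatus described in this section can be summarized by the standard quantum rule that a photon polarized at an angle θ to the apparatus axis emerges polarized along that axis with probability cos²θ. The sketch below only illustrates that rule. The function name and the angle convention (horizontal = 0 degrees, vertical = 90 degrees, apparatus axis at 0 degrees unless rotated) are assumptions made for the example.

```python
import math


def beam_probabilities(photon_deg, apparatus_deg=0.0):
    """Return (p_along, p_across) for a single polarized photon.

    p_along is the probability that the photon emerges polarized along the
    apparatus axis; p_across is the probability that it emerges polarized
    perpendicular to that axis.
    """
    delta = math.radians(photon_deg - apparatus_deg)
    p_along = math.cos(delta) ** 2
    return p_along, 1.0 - p_along


if __name__ == "__main__":
    for apparatus in (0.0, 45.0):  # unrotated and rotated apparatus
        for photon in (0.0, 90.0, 45.0, 135.0):
            p_along, p_across = beam_probabilities(photon, apparatus)
            print(f"apparatus {apparatus:5.1f}  photon {photon:5.1f}  "
                  f"along {p_along:.2f}  across {p_across:.2f}")
```

With the apparatus unrotated, horizontal and vertical photons are routed deterministically while the two diagonal polarizations each give even odds. Rotating the apparatus by 45 degrees exchanges the roles of the rectilinear and diagonal photons.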
3 An Illustrative Example: Quantum Money
It was first appreciated by Stephen Wiesner that quantum mechanics may be useful for cryptography. In a seminal manuscript written in about 1970, which remained unpublished until 1983, Wiesner showed that quantum mechanics can, in principle, be used to make bank notes that are physically impossible to counterfeit. The idea is that, in addition to a unique serial number on a bank note, one stores on it a sequence of isolated two-state physical systems. For instance, one can imagine trapping photons with perfectly reflecting mirrors. Each of the trapped photons should be randomly and independently chosen to be in one of the four polarizations (vertical, horizontal, 45-degree and 135-degree). In the bank, a record of the serial numbers together with the actual polarizations is kept. See Fig. 2.

Figure 2: In addition to a serial number, a sequence of single photons is kept in a bank note. The polarizations of those photons are a secret which is kept in the bank record.

[h] Actually, it is more appropriate to call it a quantum cheque because a verification step with the bank is needed for each transaction.

Now the key point is that the polarization basis (rectilinear or diagonal) used for each photon is kept secret. When a customer deposits a bank note, the bank with its knowledge of the polarization basis can verify the polarizations of the sequence of photons without introducing any disturbance. In contrast, a counterfeiter who is ignorant of the polarization basis has absolutely no way of counterfeiting a bank note faithfully. For illustration, let me consider a simple measure-and-copy strategy. Suppose that, for each photon, a counterfeiter simply chooses one of the two (rectilinear or diagonal) bases to perform a measurement and makes copies according to his measurement result. There is a probability 1/2 that a wrong basis is chosen, in which case the polarization of the photon will be randomized. Each of those randomized photons has only a probability 1/2 of passing the bank’s subsequent verification step. For each photon a measure-and-copy strategy, therefore, gives a total probability 1/2 + (1/2)(1/2) = 3/4 of success for the counterfeiter. If the total number of photons in each bank note is N, a duplicate has only a probability (3/4)^N of passing the bank’s verification step. When N is large, this probability becomes exponentially small. For this reason, a measure-and-copy strategy fails miserably for counterfeiting quantum money. The security of quantum money against more sophisticated counterfeiting strategies is guaranteed by the quantum no-cloning theorem.

Wiesner’s work was so far ahead of his time that it was largely ignored in the 1970s. However, in the 1980s and 90s, various quantum cryptographic protocols including quantum key distribution (QKD) were proposed. Before I come to them, I shall first introduce the subject of cryptography.
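The counterfeiter's success probability for the measure-and-copy strategy discussed in this section can be reproduced with a short calculation and a toy simulation. This is a sketch under idealized assumptions: each photon is modelled as a (basis, bit) pair, a measurement in the preparation basis returns the bit with certainty, and a measurement in the other basis returns a uniformly random bit. The names `forgery_pass_probability`, `measure` and `simulate_forgery` are illustrative.

```python
import random
from fractions import Fraction

BASES = ("rectilinear", "diagonal")


def forgery_pass_probability(n_photons):
    """Exact probability (3/4)^N that a measure-and-copy forgery passes."""
    per_photon = Fraction(1, 2) + Fraction(1, 2) * Fraction(1, 2)
    return per_photon ** n_photons


def measure(photon, basis, rng):
    """Idealized measurement of a photon modelled as a (basis, bit) pair."""
    prepared_basis, bit = photon
    if basis == prepared_basis:
        return bit            # correct basis: outcome is certain
    return rng.randrange(2)   # wrong basis: outcome is random


def simulate_forgery(n_photons, rng):
    """Return True if a forged note with n_photons passes verification."""
    for _ in range(n_photons):
        original = (rng.choice(BASES), rng.randrange(2))
        # The counterfeiter guesses a basis, measures, and prepares a copy.
        guess = rng.choice(BASES)
        copy = (guess, measure(original, guess, rng))
        # The bank measures the copy in the recorded basis.
        if measure(copy, original[0], rng) != original[1]:
            return False
    return True


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(n, float(forgery_pass_probability(n)))
    rng = random.Random(2024)
    trials = 20000
    passed = sum(simulate_forgery(1, rng) for _ in range(trials))
    print("simulated single-photon pass rate:", passed / trials)
```

The exact value for a single photon is 3/4, and the simulated rate should scatter around it. For a note carrying many photons the product of the per-photon probabilities quickly becomes negligible.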
4 Cryptography
Suppose a sender, Alice, would like to send a receiver, Bob, a message. A basic problem in cryptography is to make sure that an evil eavesdropper, Eve, cannot read it. (See Fig. 3.) This can be done by encryption. The idea is to scramble the message so that it becomes unintelligible to anyone except the intended recipient. In modern cryptography, the encryption algorithm itself is public information and the security rests on the users’ knowledge of a secret string of information, known as the ‘key’. Everyone can make copies of the encrypted message, but only the intended recipient who possesses the correct key can unlock from it the original message. See Fig. 4a.

Figure 3: Alice sends a message to Bob through a channel while the eavesdropper Eve is listening to their conversation.

If Alice and Bob share a key of the same length as the message, a perfectly secure scheme of communications is the one-time pad shown in Fig. 4b. It was invented by Vernam in 1918. For ease of discussion, the message is converted to binary. Suppose both the sender and the receiver possess a copy of a random sequence of 0’s and 1’s. The sender Alice can encode a message by combining the message and the key using the exclusive OR operation bitwise. See Fig. 4b. In other words, each message bit is flipped if and only if the corresponding key bit is 1. The encrypted form of the message is then transmitted to Bob. Bob decodes by combining the encrypted message and the key with a similar bitwise application of the exclusive OR operation. The one-time pad is secure because the encrypted message, being formed by the exclusive OR of the message with the random secret key, is itself totally random. Anyone intercepting the message and not having the encryption key knows that the message exists and how long it is, but will not be able to know anything about its meaning.

It is crucial to the security of the one-time pad that the length of the key is the same as the message. In other words, the key in a one-time pad should never be re-used [i]. Otherwise, an eavesdropper Eve can reduce her ignorance of the message to that of the key. For instance, if two messages $m_1$ and $m_2$ are encoded by the same one-time pad $k$, then the encoded messages will be $c_1 = k + m_1$ and $c_2 = k + m_2$ modulo 2. Hence, Eve can add the two encoded messages $c_1$ and $c_2$ to get $m_1 + m_2$ modulo 2, thus learning partial information about the messages.

Figure 4: (a) A message is encrypted by Alice using a key into a ‘cipher-text’, which is unintelligible to the eavesdropper. Bob, sharing the same key with Alice, can, however, decrypt the cipher-text to recover the original message. (b) One-time pad.

So, what is the catch with the one-time pad? The catch is the following: The above discussion presupposes the possession of a common secret key by Alice and Bob. In practice, Alice and Bob need a second channel to transmit the key. A key problem in conventional cryptography is the key distribution problem. In classical physics, an evil eavesdropper can always passively monitor the key distribution channel and make copies of the transmitted key. Consequently, she can decode the message successfully. Worse still, there is, in principle, no way for the users to detect such a passive eavesdropping attack.

In conventional cryptography, the key distribution problem can be solved through either (1) trusted couriers or (2) ‘public key’ schemes [j]. At the conceptual level, both methods are unsatisfactory. In the first case, the danger of the defection or capture of couriers by the adversaries cannot be under-estimated. In the second case, the security of public key schemes is based on computational assumptions, i.e., on the difficulty of solving certain hard problems such as the factoring of large integers and the ‘discrete logarithm problem’. [See Appendix A for a discussion on RSA, which is the most popular public key crypto-system. The security of RSA is based on the difficulty of factoring large integers.] These computational assumptions may be defeated by exhaustive computer analysis or by the discovery of better algorithms for solving the problems on which they are based. For instance, Shor [10] has constructed efficient quantum algorithms for both factoring [k] and the ‘discrete logarithm problem’. Therefore, if a quantum computer is ever built, many public key crypto-systems in use today will become unsafe. Worse still, this will lead to a retroactive total security break with catastrophic consequences.

Ironically, quantum mechanics also comes to the rescue. As remarked earlier, an attack that is notoriously difficult to defeat in conventional cryptography is passive eavesdropping. The strength of this attack lies in the ability of the eavesdropper Eve to make identical copies of the transmitted messages in order to perform extensive subsequent computer analysis off-line. In conventional cryptography there is, in principle, nothing to prevent this attack. In contrast, the quantum ‘no-cloning’ theorem forbids passive eavesdropping. As discussed in Sec. 2, information gain generally leads to disturbance. Consequently, eavesdropping on a quantum channel will almost surely be detected due to the disturbance introduced to the signals. This is the basic idea behind quantum key distribution, a subject that I will come to in the next section.

[i] Encryption schemes with key lengths shorter than those of the messages also exist and are widely used. They do not give perfect security.
[j] So far, I have assumed that the encryption key is the same as the decryption key. As shown in Fig. 4a, one can think of such a ‘symmetric’ algorithm as a safe and the key as the combination. “Someone who knows the combination can open the safe, put a document inside and close it again. Someone else with the combination can open the safe and take the document out. Anyone without the combination is forced to learn safe-cracking.” As the sender and the receiver must agree on a secret key in using a symmetric key algorithm, the key distribution problem is inevitable. However, there exist schemes in which the sender and the receiver do not need to agree on a secret key before they send messages. Indeed, in 1976 W. Diffie and M. Hellman [9] invented public key cryptography. In a public key crypto-system, two different keys are used. The encryption key is made public whereas the decryption key is kept private. It is supposed to be computationally hard to deduce the decryption key from the encryption key. Therefore, one can think of a public key crypto-system as a mailbox. Everyone can easily put mail in it, but getting the mail out is much harder unless one has the (secret) private key. Public key crypto-systems avoid the key distribution problem, but their security is based on some unproven computational assumptions.
[k] See Chapter Six for details on Shor’s efficient quantum algorithm for factoring.
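The one-time pad of Fig. 4b, and the danger of re-using its key described earlier in this section, can be illustrated with a few lines of code. This is a minimal sketch: messages are assumed to be lists of bits, the helper names `random_key` and `xor_bits` are illustrative, and the example messages are arbitrary.

```python
import secrets


def random_key(n_bits):
    """Generate a key of n_bits random bits."""
    return [secrets.randbits(1) for _ in range(n_bits)]


def xor_bits(bits, key):
    """Bitwise exclusive OR; used for both encryption and decryption."""
    if len(bits) != len(key):
        raise ValueError("a one-time pad needs a key as long as the message")
    return [b ^ k for b, k in zip(bits, key)]


if __name__ == "__main__":
    m1 = [0, 1, 1, 0, 1, 0]
    m2 = [1, 1, 0, 0, 0, 1]
    k = random_key(len(m1))

    c1 = xor_bits(m1, k)           # Alice encrypts
    print(xor_bits(c1, k) == m1)   # Bob decrypts with the same key

    c2 = xor_bits(m2, k)           # key re-use, which should never happen
    # Eve adds the two cipher-texts and obtains m1 + m2 modulo 2
    # without ever knowing the key.
    print(xor_bits(c1, c2) == xor_bits(m1, m2))
```

Decryption works because applying the same key twice cancels it, and the same cancellation is what leaks the sum of the two messages when a key is used twice.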
5 Quantum Key Distribution (QKD)
Quantum key distribution (QKD) cannot prevent eavesdropping. However, it can detect eavesdropping. If eavesdropping is found (from the abnormally high error rate), the transmitted random string of numbers is discarded. On the other hand, if the error rate is sufficiently small, the two users have the peace of mind that the transmitted random string of numbers is most likely to be secure and can be used as a secure key for subsequent communications. Notice that, even in the case when the error rate is large, no useful information is leaked to the eavesdropper. This is because, in this case, the string is simply discarded. Alice and Bob postpone sending any valuable information until the security of the key is ascertained. Notice that there is nothing, in principle, to prevent an adversary from jamming a quantum channel. In this case, the two users will be forced to abandon using the key distribution channel for the time being. However, the big advantage of QKD is to avoid a false sense of security. When substantial eavesdropping has occurred, the two users of a QKD scheme will be exceedingly unlikely to be fooled into believing the security of the key.
5.1 Bennett and Brassard's Scheme (BB84)

Various schemes for QKD have been proposed. For simplicity, I will consider mainly the first and best-known QKD scheme, BB84, proposed by Bennett and Brassard [11] in 1984. The idea of the BB84 scheme is not for Alice to prepare a particular key and send it to Bob. Heuristically, Alice and Bob each independently generate a random string of numbers. Afterwards, they go through some public discussion to decide on the key.

Two channels between Alice and Bob are needed for BB84. First of all, a classical communications channel is needed. It is assumed to be public but unjammable [l]. In other words, while anyone can read all the transmitted messages, no one can alter the messages sent by Alice or Bob. In peacetime, the New York Times or the BBC Radio would be good approximations to an unjammable classical channel. This classical channel will be useful for public discussion between Alice and Bob. (See below.) Second, a quantum communications channel is needed. Experiments have been done in free air [12, 14] and on optical fibers [15]. Ground to satellite experiments [16] have been proposed. The quantum channel is assumed to be insecure and the eavesdropper can manipulate the quantum signals in any way she desires.

Let me introduce a refined procedure of the BB84 scheme. Suppose Alice and Bob would like to establish a secret key. Before the execution of the protocol, Alice and Bob first decide on the maximal acceptable error rate $e_{\max}$ for the transmission [m]. Referring to Figs. 5 and 6, the steps of BB84 are as follows:

(1) Alice sends Bob a sequence of photons, each of which is chosen randomly and independently to be in one of the four polarizations (horizontal, vertical, 45 degrees and 135 degrees). (Fig. 5, Step 1.)
(2) For each photon, Bob randomly chooses either the rectilinear or the diagonal basis to perform a measurement. (Fig. 5, Step 2.)
(3) Bob records the bases used and the results of the measurements. (Fig. 5, Step 3.)
(4) Subsequently, Bob announces his bases (but not the results) publicly through the public unjammable channel that he shares with Alice. (Fig. 5, Step 4.) Notice that it is crucial that Bob publicly announces his basis of measurement only after the measurement is made. This ensures that the eavesdropper, Eve, does not know the right basis during eavesdropping. If Bob were to announce his basis before the measurement, Eve could simply eavesdrop along the announced basis without being detected.
(5) Alice tells Bob which measurements were done in the correct bases. (Fig. 5, Step 5.)
(6) Alice and Bob divide up their polarization data into four classes according to the bases used by them. See Fig. 7. In cases (a) and (b), Bob has performed the wrong type of measurement (i.e., Alice and Bob have used different bases). They should throw away those polarization data. On the other hand, in cases (c) and (d), Bob has performed the correct type of measurement (i.e., Alice and Bob have used the same bases).

[l] An unjammable channel is, in principle, impossible to achieve. If one allows the eavesdropper to attack the classical channel, some form of authentication process must be implicitly used in order to verify that the two users are talking to each other rather than to an eavesdropper in disguise. Notice that authentication is needed even in conventional key distribution schemes. It can be done only if the two users initially share some small amount of secret information. If Alice and Bob have seen each other before, the information can be their outward appearances. In the case that they have not met before, it can be a short secret password. There are information-theoretically secure authentication schemes. Notice that without sharing some secret information or an unjammable channel with Bob, it is totally symmetric whether Alice is talking to Bob or to an enemy Eve and it would be impossible for her to distinguish between the two cases. Barring unjammable channels, what QKD can achieve is only to expand this initially shared key information. Perhaps, a more appropriate name for QKD is quantum key expansion. Conventional methods for key expansion are necessarily insecure because a passive eavesdropper can always make copies of the communications and crack the key expansion scheme off-line by exhaustive computing analysis. The quantum no-cloning theorem forbids such a passive eavesdropping attack in quantum key expansion schemes.
[m] In current experiments something like $e_{\max}$ = 1% is reasonable.
Figure 5: Procedure of the BB84 scheme for quantum key distribution. Step 1: Alice picks polarizations randomly. Step 3: Bob records his basis and measurement results. Step 4: Bob announces his basis publicly. Step 5: Alice tells Bob if he has chosen the correct basis. Step 6: Test for tampering, error correction and privacy amplification.

Figure 6: A sequence of photons is sent in the BB84 scheme. For each photon, Alice chooses its polarization randomly from horizontal, vertical, 45-degree and 135-degree. Bob then randomly chooses the rectilinear or diagonal basis to perform a measurement. He writes down the result of his measurement. Alice and Bob publicly compare their bases. Whenever they have used the same basis, they can convert their polarization data into a single raw bit. Of course, they need to test for tampering and go through error correction and privacy amplification as described in the text.

Figure 7: Alice and Bob divide their polarization data into four cases, (a) to (d), according to the bases used by them.
Notice that if no eavesdropping has occurred, all the photons that are measured by Bob in the correct bases should give the same polarizations as prepared by Alice. Bob can determine those polarizations by his own detector without any communications from Alice. Therefore, Alice and Bob can use those polarization data as their raw key. Of course, before they proceed any further, they should sacrifice a small number of those photons to test for eavesdropping. For instance, they can do the following:

(7) Alice and Bob randomly pick a fixed number, say $m_1$, of photons from case (c) and compute the experimental error rate, $e_1$. Similarly, they randomly pick $m_2$ photons from case (d) and compute the experimental error rate, $e_2$ [n]. If either $e_1$ or $e_2$ is larger than the maximal tolerable error rate $e_{\max}$, either substantial eavesdropping has occurred or the channel is unexpectedly noisy. Alice and Bob should, therefore, discard all the data and start with a fresh batch of photons. On the other hand, if both $e_1$ and $e_2$ are smaller than $e_{\max}$, they proceed to step 8.
(8) Reconciliation and privacy amplification: Alice and Bob can independently convert the polarizations of the remaining photons into a raw key by, for example, regarding a horizontal or 45-degree photon as denoting a ‘0’ and a vertical or 135-degree photon a ‘1’.

There are still two problems with the raw key that Alice and Bob share, namely noise and leakage of information to Eve. Indeed, the raw key that Alice has may differ slightly from that of Bob. Moreover, Eve may have partial information on the raw key. A realistic scheme must include error correction and privacy amplification [12, 17], the distillation of a perfectly secret key out of a raw key that Eve may have partial knowledge of. A proof of ultimate security against all possible types of attacks [o] is a delicate and difficult problem. See Sec. 6 for a discussion. I relegate an elementary discussion of error correction and privacy amplification to Appendix B.
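Steps (1) to (7) above can be sketched as a toy simulation. The sketch rests on several simplifying assumptions that are not part of the protocol description: the channel is noiseless, each photon is modelled as a (basis, bit) pair with the convention of step (8) for the bits, a measurement in the wrong basis gives a random bit, the only attack considered is a simple intercept-and-resend by Eve, and a single combined error rate is estimated instead of separate values for cases (c) and (d). Error correction and privacy amplification are not modelled, and all function names are illustrative.

```python
import random

BASES = ("+", "x")  # rectilinear and diagonal


def measure(photon, basis, rng):
    """Idealized measurement of a photon modelled as a (basis, bit) pair."""
    prepared_basis, bit = photon
    return bit if basis == prepared_basis else rng.randrange(2)


def bb84_raw_key(n_photons, rng, eavesdrop=False):
    """Run steps (1) to (6) and return Alice's and Bob's sifted bits."""
    alice_bits, bob_bits = [], []
    for _ in range(n_photons):
        # Step 1: Alice picks a random polarization (basis and bit).
        a_basis, a_bit = rng.choice(BASES), rng.randrange(2)
        photon = (a_basis, a_bit)
        if eavesdrop:
            # Assumed attack: Eve measures in a random basis and resends.
            e_basis = rng.choice(BASES)
            photon = (e_basis, measure(photon, e_basis, rng))
        # Steps 2 and 3: Bob measures in a random basis and records it.
        b_basis = rng.choice(BASES)
        b_bit = measure(photon, b_basis, rng)
        # Steps 4 to 6: keep only the cases where the bases agree.
        if a_basis == b_basis:
            alice_bits.append(a_bit)
            bob_bits.append(b_bit)
    return alice_bits, bob_bits


def estimate_error_rate(alice_bits, bob_bits, sample_size, rng):
    """Step 7: compare a random sample of the sifted bits publicly."""
    positions = rng.sample(range(len(alice_bits)), sample_size)
    errors = sum(alice_bits[i] != bob_bits[i] for i in positions)
    return errors / sample_size


if __name__ == "__main__":
    rng = random.Random(7)
    for eve in (False, True):
        a, b = bb84_raw_key(20000, rng, eavesdrop=eve)
        rate = estimate_error_rate(a, b, 2000, rng)
        print("eavesdropping:", eve, "sifted bits:", len(a), "error rate:", rate)
```

In this idealized model the error rate is zero without eavesdropping. Under the assumed intercept-and-resend attack Eve picks the wrong basis half of the time and Bob's result is then random, so about one quarter of the sifted bits disagree, which is far above a threshold such as 1%.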
5.2 Other Schemes Even though BB84 solves the key distribution problem, it does not solve the key storage problem: Once Alice and Bob have established their classical key, they must store it before it is used. In principle, an eavesdropper may break nml and m2 are chosen to be large enough for accurate estimation of the true error rates of the transmission. A simple protocol may take ml = m2. In the original BB84,cases (c) and (d) are combined to estimate a single error rate. OIn the most general attack, namely the joint attack, Eve regards the whole sequence of photons as a single entity and couples it with an ancilla and evolves the combined system. Afterwards, she keeps her ancilla and listens t o the public discussion between Alice and Bob before deciding on what information to extract from her ancilla.
into their laboratories to steal it. Ekert [18] proposed an Einstein-Podolsky-Rosen-based scheme which solves the key storage problem. The well-known Einstein-Podolsky-Rosen (EPR) effect occurs when a pair of entangled (i.e., quantum mechanically correlated) photons is emitted from a source. The entanglement may arise out of conservation of angular momentum. As a result, each photon is in an undefined polarization. Yet, the two photons always give opposite polarizations when measured along the same basis. For example, if Alice and Bob both measure along the rectilinear basis, their photons are each equally likely to be horizontally or vertically polarized. But if Alice’s photon is horizontal, Bob’s will certainly be vertical and vice versa. A simplified version of Ekert’s scheme goes as follows: A source emits such pairs of entangled photons. Alice and Bob each keep a member of each pair. They measure some of their polarizations immediately to test for eavesdropping. The remainder is stored without being measured. When they need to use the key, they measure and compare some of the stored pairs. If no tampering has occurred, the polarizations of the two members of each pair should be opposite. They verify that, for the test pairs, this is indeed the case. They can then measure the polarizations of the remainder randomly and independently along two bases and subsequently go through privacy amplification in the same way as in BB84.
Another interesting QKD scheme, B92, was proposed in 1992 by Bennett [20], who showed that any two non-orthogonal states suffice to distribute a key. Suppose a photon is chosen randomly from two non-orthogonal polarizations, say $|u_0\rangle$ and $|u_1\rangle$. Let me consider the projections $P_{\mathrm{not}0} = 1 - |u_0\rangle\langle u_0|$ and $P_{\mathrm{not}1} = 1 - |u_1\rangle\langle u_1|$. Notice that $P_{\mathrm{not}0}|u_0\rangle = 0$.
Therefore, if a measurement of $P_{\mathrm{not}0}$ gives an eigenvalue 1, Bob can be sure that the state before the observation must be $|u_1\rangle$. On the other hand, if a measurement of $P_{\mathrm{not}0}$ gives an eigenvalue 0, the initial state may be either $|u_0\rangle$ or $|u_1\rangle$. The procedure of B92 goes as follows: Alice sends a random sequence of photons to Bob, using $|u_0\rangle$ to represent a 0 and $|u_1\rangle$ to represent a 1. Bob performs a random measurement of either $P_{\mathrm{not}0}$ or $P_{\mathrm{not}1}$. Bob publicly announces the eigenvalue of his measurement for each photon, but not the type of measurement that he has performed. Alice and Bob discard all the instances when the eigenvalue is found to be 0. Notice that, in the absence of noise, when the eigenvalue is 1, the type of measurement performed by Bob will tell him the bit chosen by Alice. The eigenvalue 1 should appear with a probability $(1 - |\langle u_0|u_1\rangle|^2)/2$. In this case, they share a common bit. Of course, just like in BB84, they need to test for tampering. They can do so by selecting and sacrificing a subset of photons for the case when the eigenvalue is 1 to check
that their sub-strings agree with each other. Besides, they also need to check that the proportion of 1’s is, indeed, a fraction around $(1 - |\langle u_0|u_1\rangle|^2)/2$. A malicious Eve who measures the signals in transit using an apparatus similar to Bob’s and destroys them whenever the measurement outcome is 0 will decrease the proportion of 1’s in Bob’s result and thus be caught. (See also Ref. 21.)
Other QKD schemes have also been proposed. Townsend and collaborators have discussed a practical implementation of quantum cryptography in a communications network with many users [22]. A quantum cryptographic network based on quantum memories was proposed [23]. Goldenberg and Vaidman [24] showed that, rather surprisingly, orthogonal states can be used for QKD. The basic idea there is for Alice to split the quantum signal into two pieces and send them to Bob one at a time. Alice transmits the second signal only after Bob’s notification of his reception of the first signal. A proposal to use quantum cryptography without public announcement of bases has also been made [25].
Finally, efficient schemes for quantum cryptography have also been introduced [26, 27]. The key idea can be understood in the example of BB84. Instead of choosing the two bases with equal probability as in the original BB84, suppose each of Alice and Bob chooses the two bases with probabilities $\epsilon$ and $1 - \epsilon$ respectively, where $\epsilon < 1/2$ is a parameter that is publicly announced beforehand. Then the probability that Alice and Bob use the same basis is $(1 - \epsilon)^2 + \epsilon^2$, which tends to 1 as $\epsilon$ tends to zero. Hence, the improved scheme is more efficient than BB84. The security of the scheme is guaranteed by a refined error analysis: the error rates of the two bases should be individually estimated in step (7) of Sec. 5.1, and both error rates should be required to be small. The probability $\epsilon$ can never be zero, though. In order to have enough photons to estimate the two error rates reliably, the constraint $N\epsilon^2 \geq m_0$ has to be satisfied. Here $N$ is the total number of photons transmitted and $m_0$ is some fixed number.
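The probabilities quoted for B92 and for the efficient variant of BB84 are easy to evaluate numerically. The sketch below is only a convenience for checking numbers; the function names are hypothetical, and `overlap` is assumed to be the magnitude of the inner product of the two non-orthogonal states.

```python
def b92_conclusive_probability(overlap):
    """Probability that Bob's measurement gives eigenvalue 1 in B92.

    overlap is the magnitude of the inner product of the two signal states.
    """
    return (1.0 - overlap ** 2) / 2.0


def same_basis_probability(eps):
    """Probability that Alice and Bob use the same basis when each picks
    the two bases with probabilities eps and 1 - eps."""
    return (1.0 - eps) ** 2 + eps ** 2


def min_photons(eps, m0):
    """Smallest integer N that satisfies N * eps**2 >= m0."""
    n = int(m0 / eps ** 2)
    while n * eps ** 2 < m0:
        n += 1
    return n
```

For equal basis probabilities (`eps = 0.5`) the same-basis probability is 1/2, as in the original BB84, while `eps = 0.1` gives about 0.82. The price is the larger number of photons needed to satisfy the constraint on the rarely used basis.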
6 Is Quantum Key Distribution Really Secure?
The most important question in QKD is how secure it really is. In practice, Alice and Bob’s devices are imperfect and they share a noisy quantum channel. A rigorous proof of the security of a QKD scheme would require the explicit construction of a procedure to constrain the eavesdropper Eve’s information on the final key to an exponentially small amount $e^{-k}$, where $k > 0$ is the security parameter chosen by Alice and Bob. The security of QKD was a notoriously hard problem. A big challenge is to demonstrate the security against the most general type of attacks, namely joint attacks: Instead of measuring the particles in transit from Alice to Bob one by one, Eve has the option of treating the whole
sequence of quantum signals as a single entity. She then couples this entity with her probe and evolves the combined system by a unitary transformation chosen by her. Since the particles are now generally entangled with each other, classical intuitions such as the laws of large numbers appear to be invalid. Notice also that Eve attempts to mask her presence by attributing the errors due to her eavesdropping attack to the normal transmission noise. There has been much work on the security of QKD [28, 29, 30, 31, 32]. In particular, its security against a restricted class of attacks, the so-called collective attacks, has been discussed [33]. A rigorous proof of security of QKD has recently been given by Lo and Chau [34], who have demonstrated that, given quantum computers, quantum key distribution over an arbitrarily long distance can be made unconditionally secure. To give a flavor of the proof, I remark that there are two major insights.
6.1 Fault-Tolerant Quantum Computation
The first insight is that, by using the general theory of fault-tolerant quantum computation, one can reduce the proof of security in the noisy case (i.e., imperfect devices, noisy channels, storage errors, etc.) to the error-free case. More specifically, to overcome transmission errors in the channel, ‘relay stations’ are set up and quantum error correction is performed during the transmission. Moreover, thanks to fault-tolerant quantum computation, computational operations can be regarded as performed on the encoded form of the signal (the so-called logical qubits), rather than the physical qubits. The ‘threshold result’ of fault-tolerant quantum computation states that, by encoding signals and performing operations on them in the encoded form, one can overcome all sources of errors in quantum information processing provided that the error rates are sufficiently small [35, 36, 37, 38, 39].
6.2 Error-Free Case
With the above discussion in mind, I shall consider only the error-free case. In the most general eavesdropping strategy, Eve is in charge of preparing the state of the N pairs shared between Alice and Bob. She claims that they are perfect EPR pairs. Alice and Bob will be happy to sacrifice a small number, say m, of those pairs to verify Eve’s claim. If the m tested pairs fail the test, all the N pairs are discarded. On the other hand, if the m pairs pass the test, the remaining N - m pairs will be accepted as singlets and used to generate the key (e.g., by measuring all particles along the z axis). The goal of the verification is for Alice and Bob to make sure that Eve has a very small probability of cheating successfully. By cheating successfully, we mean that
the m tested pairs pass the verification test and yet some of the remaining N - m pairs, when measured, are shown to be non-singlets. [Of course, Alice and Bob will not actually come together to determine whether the remaining pairs are singlets or not. However, this is a very useful thought experiment: As it turns out, the security of the quantum verification scheme will automatically guarantee the security of the corresponding quantum key distribution scheme.] Notice that, essentially, what Alice and Bob are trying to do is to distinguish singlets from triplets. I shall remark without proof that there is no way to do so with certainty by using local operations and classical communications alone. However, the good news is that one can do so with a very high probability.
6.3 Bell Basis
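Let me introduce the so-called Bell basis, $\Psi^{\pm}$ and $\Phi^{\pm}$, where
$\Psi^{\pm} = \frac{1}{\sqrt{2}} \left( |\uparrow\downarrow\rangle \pm |\downarrow\uparrow\rangle \right), \qquad \Phi^{\pm} = \frac{1}{\sqrt{2}} \left( |\uparrow\uparrow\rangle \pm |\downarrow\downarrow\rangle \right).$
Using the convention of Ref. 41, the Bell basis vectors are represented by two classical bits:
$\Phi^{+} = 00 = 0, \qquad \Psi^{+} = 01 = 1, \qquad \Phi^{-} = 10 = 2, \qquad \Psi^{-} = 11 = 3.$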
Notice that all the basis vectors in the Bell basis are highly entangled and are, hence, non-classical. Yet, as shown by Bennett, DiVincenzo, Smolin and Wootters (BDSW) [41], this basis is highly useful for giving a classical interpretation to some quantum computation steps, which can be performed with only local operations and classical communications. More concretely, for N pairs of spin-1/2 particles, the basis vectors are represented by strings of length 2N. BDSW gave an explicit procedure [41] for Alice and Bob to compute the parity of any chosen subset of their string by using local operations and classical communications only. The parity will be given by the outcomes of measurements performed on a single pair, which has to be discarded afterwards. Instead of describing the procedure of the quantum computation, I will simply focus on the classical interpretation. Example: Consider the string $z_0 = 110101$.
Alice may use a random string, say $s = 011001$, to represent a random subset that she has chosen. The parity of this random subset is given by
$z_0 \cdot s \pmod 2 = 1 \times 0 + 1 \times 1 + 0 \times 1 + 1 \times 0 + 0 \times 0 + 1 \times 1 \pmod 2 = 0 \pmod 2.$
Note that any two different strings will give the same parity for a randomly chosen subset with a probability of only 1/2. For example, consider $z_1 = 010101$, which happens to differ from $z_0$ in only the first bit. Then, for any string $s_1 = a_1 a_2 \cdots a_6$, $z_0 \cdot s_1 = z_1 \cdot s_1 \pmod 2$ if $a_1 = 0$ and $z_0 \cdot s_1 \neq z_1 \cdot s_1 \pmod 2$ if $a_1 = 1$. In the context of classical information theory, iterative computation of the parities of randomly chosen subsets is a very efficient way to check whether two strings are the same. The probability for two different strings to give the same answers for m iterations is no more than $2^{-m}$.
Let me return to the quantum case. In the absence of Eve, the string shared by Alice and Bob should be all 1’s. Let me now introduce Eve. If Eve were to prepare a state represented by a basis vector, classical information theory carries over completely. Thus, Alice and Bob could easily verify its identity by iterative applications of the determination of the parity of a random subset of the string (and throwing away a pair after each iteration) [41]. As in the classical case, with m iterations the probability of Eve cheating successfully is known to be less than $2^{-m}$.
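The classical parity check described above can be sketched as follows. The function names are hypothetical, and the sketch covers only the classical comparison of two strings: it does not model the quantum procedure of BDSW, nor the discarding of one pair after each iteration.

```python
import random


def parity(bits, subset):
    """Parity (mod 2) of the bits of `bits` selected by the 1s in `subset`."""
    return sum(int(b) * int(s) for b, s in zip(bits, subset)) % 2


def strings_agree(x, y, rounds, rng):
    """Compare x and y through the parities of `rounds` random subsets.

    Returns False as soon as one parity differs. Two different strings
    survive a single round with probability 1/2.
    """
    for _ in range(rounds):
        subset = [rng.randint(0, 1) for _ in x]
        if parity(x, subset) != parity(y, subset):
            return False
    return True
```

With the strings of the example, `parity("110101", "011001")` and `parity("010101", "011001")` are both 0, because the subset string has a 0 in the first position, whereas the subset `"100000"` distinguishes the two strings.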
6.4 Reduction from Quantum to Classical
The big problem is: Will such a verification scheme work for a general eavesdropping strategy? As noted earlier, in the most general eavesdropping strategy, Eve prepares the state for Alice and Bob. The pairs may be entangled among themselves as well as with a probe in Eve’s hands. Since any mixed state can be represented by a pure state by including the quantum die explicitly, we shall, without loss of generality, consider that Eve prepares a pure state
$|u\rangle = \sum_{i_1, i_2, \ldots, i_N, j} \alpha_{i_1, i_2, \ldots, i_N, j} \, |i_1, i_2, \ldots, i_N\rangle \otimes |j\rangle, \qquad (12)$
where $i_k$ denotes the state of the k-th pair and runs from 0 to 3, and the $|j\rangle$’s form an orthonormal basis for the ancilla. Each state $|u\rangle$ represents a particular cheating strategy chosen by Eve. For any $|u\rangle$, one can compute the probability for Eve to cheat successfully against the random-hashing quantum verification scheme. The second insight of Lo and Chau is reduction. The idea is to reduce the proof of security of the noiseless quantum scheme to that of a classical scheme. Given any cheating strategy by Eve, let us imagine that Eve prepares the state $|u\rangle$ and
then measures along the Bell basis for the pairs and along the $|j\rangle$’s for her probe before sending the N pairs to Alice and Bob. After the measurement, the whole problem is classical. Therefore, Eve has mapped a cheating strategy against the random-hashing quantum verification scheme into a cheating strategy against a classical verification scheme, namely the random-hashing classical verification scheme. More importantly, it is a simple exercise to show that the probability for Eve to cheat successfully remains unchanged under such a measurement. The chief reason is that whether Eve can cheat successfully or not against the fully quantum verification scheme can be determined most conveniently by considering a hypothetical measurement along a single basis, the Bell basis, to see if the remaining pairs are singlets or not. This reduction result, that a quantum verification scheme has a classical interpretation, is highly surprising. However, it is a simple (but important) extension of the classical interpretation proposed by BDSW [41]. Note that, paradoxically, it is with respect to a highly non-classical basis, the Bell basis, that the fully quantum problem becomes classical. Now, as discussed in the paragraphs above, no cheating strategy by Eve against the classical scheme can succeed with a probability greater than $2^{-m}$. The invariance of the probability under Eve’s measurement must imply that no cheating strategy by Eve against the quantum scheme can succeed with a probability greater than $2^{-m}$. This completes the proof of the security of the quantum key distribution scheme. Q.E.D.
6.5 Remarks
The execution of Lo and Chau’s secure quantum key distribution scheme requires quantum computers. Building such computers is a technological feat that is far beyond our current technology. Therefore, for the time being their scheme is a proof of principle rather than a practical tool. An important question is, therefore, the security of quantum key distribution schemes that do not require quantum computers. Examples here include BB84 and Ekert’s schemes. Lo and Chau’s proof of security of quantum key distribution also applies to single-particle-based schemes. The point is that Alice is allowed to perform her measurements on her part of the particles before sending out those for Bob. To ensure security, she only needs to withhold the information on her measurement basis and outcome until Bob acknowledges the receipt of his particles. There are other proofs of security of QKD [42, 43, 44], particularly by Mayers, that are based on standard BB84. Some of them have the advantage of not requiring a quantum computer, but are more complex.
7 Practical Considerations
This section provides the background for the chapter by Zbinden on experimental quantum cryptography. It may be skipped on first reading. Quantum key distribution is not just a theoretical subject. The first experimental demonstration of the feasibility of quantum key distribution was done in open air over 32 cm [12]. By now, experiments over 20 km of optical fibers [15] as well as 205 m of free air [14] have been performed. Besides, there have been proposals for performing quantum key distribution experiments from the ground to a satellite [16]. Such a capability is of immense value for re-programming satellites currently in orbit around the earth as well as for long distance relay of cryptographic keys via satellites. These exciting experiments will be the subject matter of the next chapter. Here I will give some simple practical considerations for the experiments.
7.1 Photon Source
As it turns out, it is difficult to prepare single photon sources. Most of the current experiments are, therefore, done with faint light pulses, rather than single photons. On average, there can be only about 1/10 photon per pulse. Even so, there are still some chances of having two or more photons. This gives rise to a new eavesdropping strategy. Eve may use a beam-splitter to try to divide up the beam into two pieces, measuring the state of one beam and sending the second to Bob. Notice that such an attack is possible only when the beam contains more than one photon and is, therefore, divisible. By using very weak light pulses, the probability of success of the beam-splitting attack can be kept small. Hence, Alice and Bob can put some bounds on the information leakage to Eve due to such an attack and use privacy amplification to distill a perfectly secure key as discussed in Sec. 5.1.
A warning is in order. In the case of ground-to-satellite quantum cryptography, Bob's photon collection efficiency can be as low as $10^{-4}$. Such a low collection efficiency will, in principle, allow Eve to break the quantum key distribution scheme completely by using the beam-splitter attack [45]. Assuming a Poisson distribution for the number of photons emitted by the source, if $p(\text{one photon}) \approx 0.1$, then $p(\text{two photons}) \approx 0.005$. This means that Eve can obtain 5% of the quantum signals by using a beam-splitter attack right at the source. Suppose further that the collection efficiency of Bob is actually lower than 5%. Eve can, in theory, block all signals that she fails to beam-split and then make use of a hypothetical perfect channel to re-send to Bob some of the signals that she succeeds in beam-splitting. By doing so, she has a perfect copy of what is received by Bob. By symmetry, Eve will have perfect knowledge of the
key generated by Alice and Bob. We see from the above paragraph that it is of utmost importance to use better single-photon sources in QKD. An example of a better source is EPR pairs from so-called parametric down conversion experiments. When a photon passes through a non-linear crystal, it can be converted into two entangled photons of lower frequencies (see footnote p). One of the two photons can, then, be used as a trigger to signal the creation of at least one EPR pair. Since the input to the non-linear crystal is often a faint laser pulse rather than a single photon, parametric down conversion still gives two or more EPR pairs with non-zero probabilities. However, the case of having no photon pairs can be eliminated from consideration due to non-triggering of the sender’s device. This helps to cut down an important source of error in the experiment, the photon dark count rates, which will be introduced in Sec. 7.4. I remark that other methods [46, 47], such as carefully tailored atomic emission in cavity quantum electrodynamics [46], may give a still better photon source in future. Finally, one should note that the beam-splitter attack is just one of the many possible attacks against quantum key distribution with faint light sources [46]. Even if one can show that such a quantum key distribution system is secure against the beam-splitter attack, it is still conceivable that some more subtle attacks can break the system completely. This subject deserves future investigation.
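The photon-number estimate used in the warning above can be reproduced with a short calculation. This is a sketch under the stated assumption of a Poisson source; the function names and the bisection solver are illustrative choices, not something taken from the experiments.

```python
import math


def poisson(n, mean):
    """Probability of n photons in a pulse with the given mean photon number."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


def mean_for_single_photon_probability(p_one):
    """Solve mean * exp(-mean) = p_one on the small-mean branch by bisection.

    Assumes 0 < p_one < 1/e, so that a solution exists with mean below 1.
    """
    lo, hi = 0.0, 1.0  # mean * exp(-mean) is increasing on [0, 1]
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if mid * math.exp(-mid) < p_one:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For a single-photon probability of 0.1 the mean photon number comes out slightly above 0.11, and `poisson(2, mean)` lies between 0.005 and 0.006, in line with the rough figures quoted above. The ratio of two-photon to one-photon pulses is `mean / 2`, which is the origin of the estimate of about 5%.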
7.2 Coding Schemes
There are two main types of coding schemes in experimental quantum cryptography: polarization coding and phase coding. The idea of phase coding is to send a photon into two different arms of an interferometer. The two paths then represent two orthogonal states in the coding scheme. By passing a photon through a 50-50 beam-splitter (i.e., a half-reflecting mirror), one can launch it into a coherent superposition of the two paths:
$|u\rangle = \frac{1}{\sqrt{2}} |\text{path 1}\rangle + \frac{1}{\sqrt{2}} |\text{path 2}\rangle. \qquad (13)$
One can encode information by introducing a phase difference $\phi$ between the two paths:
$|u\rangle = \frac{1}{\sqrt{2}} |\text{path 1}\rangle + \frac{e^{i\phi}}{\sqrt{2}} |\text{path 2}\rangle. \qquad (14)$
By picking $\phi$ randomly between 0 and $\pi/2$, the scheme is equivalent to the B92 scheme introduced in the last section. Bob can read off information using a similar interferometer. See the next chapter for details.
[Footnote p: This does not violate the fact that a photon cannot be divided without a change in frequency.]
7.3 Frequency
Commercial single-photon counting modules employing silicon avalanche photo-diodes (APDs) are available around wavelengths of 800 nm. Such devices have high efficiencies (about 50%) and low noise rates. Unfortunately, the losses in optical fibers are quite high (2 dB/km) in this wavelength range. Therefore, for long-distance optical fiber experiments, it is preferable to use commercial Telecom wavelengths, either 1300 nm or 1550 nm, where the losses are 0.35 dB/km and 0.2 dB/km respectively. At such wavelengths, no efficient commercial single-photon counting modules are available and cooled Ge or InGaAs avalanche photo-diodes have to be built in the laboratories.
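To see why the attenuation figures matter, one can convert them into the fraction of photons that survive a given length of fiber. The sketch below uses only the loss values quoted above; the function name and the 20 km example length are assumptions for illustration, and detector efficiency is deliberately left out.

```python
# Fiber attenuation in dB/km at the wavelengths (nm) quoted in the text.
LOSSES_DB_PER_KM = {800: 2.0, 1300: 0.35, 1550: 0.2}


def transmission(loss_db_per_km, length_km):
    """Fraction of photons surviving a fiber of the given length."""
    return 10.0 ** (-loss_db_per_km * length_km / 10.0)


def survival_table(length_km):
    """Surviving fraction at each wavelength for a fiber of the given length."""
    return {wavelength: transmission(loss, length_km)
            for wavelength, loss in LOSSES_DB_PER_KM.items()}
```

Over 20 km the total loss is 40 dB at 800 nm, so only one photon in ten thousand survives, whereas at 1550 nm the loss is 4 dB and roughly 40% of the photons arrive.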
7.4 Noise
Even when the same basis is used by both Alice and Bob, the transmitted data of Alice and Bob may still be different because of various sources of errors. One of them is the dark counts in the detector: A detector may click accidentally even when there are no photons. To eliminate this source of error, the clicking of the detector is ignored unless it falls into specific time windows when a photon pulse is expected to arrive. Incidentally, an advantage of a parametric down conversion EPR source over a weak light pulse is that a member of the EPR pair will provide the ‘trigger’ to the sender Alice’s detector. Only then will Bob consider his data. Therefore, the receiver Bob will discard the dark counts in cases when there is no triggering. Other sources of errors will be discussed in more detail in the next chapter.
8 Beyond Quantum Key Distribution?
Besides quantum key distribution, other applications of quantum cryptography have also been proposed. The underlying theme of those applications is the protection of private information during public discussion. “In this scenario, there are no enemies, but you must negotiate with everyone and you don’t entirely trust them,” Charles Bennett says. Indeed, there have been reports [48] of fake teller machines stealing PINs (Personal Identification Numbers) from customers. Next time you type your PIN into an unknown teller machine, maybe you should worry about this possibility. To solve this problem, it would be useful to have some means of identification without revealing the
actual password, i.e., of checking whether the customer’s private password x matches the password y stored by the machine without revealing x itself. More generally, in a two-party secure computation, Alice has a private input x and Bob a private input y. Alice would like to help Bob to compute a prescribed (i.e., public) function f(x, y) without revealing anything to Bob about x more than what follows logically from f(x, y) and y. Either trusted intermediaries or computational assumptions may be used to achieve two-party secure computations. In the first case, Alice and Bob send their private inputs to a trusted third party (or a machine) Charles, who performs the computation for them and tells them the result afterwards. Of course, the problem here is that Charles may cheat by telling one party the other party’s input. In the second case, assumptions such as the hardness of factoring large integers can be used. However, an adversary may crack such a system by exhaustive computer analysis or by more efficient algorithms. In particular, an adversary with a quantum computer can use Shor’s algorithm [10] to factor large integers efficiently. See footnote k. The impossibility of unconditionally secure schemes for two-party secure computations in conventional cryptography has sparked much interest in quantum protocols. Until recently, there had been a widespread belief that quantum two-party secure computations can be made unconditionally secure. However, this optimism was recently shattered [54] following the demonstration of the insecurity of quantum bit commitment by Mayers [55, 56] and also by Lo and Chau [57, 58]. For a review, see Ref. 59. This is a severe setback to quantum cryptography. In what follows, I will introduce the concept of bit commitment, describe a simple quantum bit commitment scheme and explain why unconditionally secure quantum bit commitment is impossible.
8.1 Bit Commitment
Bit commitment is a crucial primitive for implementing secure computations (see footnote q).
[Footnote q: Yao [60] has shown that quantum bit commitment can be used to implement quantum oblivious transfer. Besides, it has been shown by Kilian [61] that in conventional cryptography oblivious transfer can be used to achieve two-party secure computations. These two results combined together seem to suggest that quantum bit commitment leads directly to unconditionally secure two-party secure computations, thus achieving what is impossible from the perspective of conventional cryptography.]
A bit commitment scheme involves two parties, a sender, Alice, and a receiver, Bob. It is executed in two steps: 1) the commit phase and 2) the opening phase. In the commit phase, Alice chooses a bit (b = 0 or 1) and commits it to Bob. That is, she gives a piece of evidence to Bob that she has chosen a bit and that she cannot change it. At that moment, the scheme should prevent
Bob from learning the value of the bit from that evidence. At a later time, however, Alice and Bob must be able to execute the opening phase in which Alice opens the commitment. That is, she tells Bob which bit she has chosen and proves to him that this is indeed the genuine bit that she chose during the commit phase. As an example of bit commitment, Alice writes down her bit on a piece of paper, places it in a box and locks it. She then hands over the box to Bob. Now she can no longer change her mind about the value of the bit. Meanwhile, Bob, without the key to the lock, cannot learn the value of the committed bit himself. At a later time, Alice gives the key to Bob, who opens the box to recover the value of the committed bit. Unfortunately, the security of this scheme relies solely on the physical security of the box and the lock. Therefore, it is not applicable in the electronic age.
What is cheating? Both Alice and Bob may attempt to cheat. On the one hand, a dishonest Bob tries to find out the value of the bit before the opening phase. On the other hand, a dishonest Alice may choose 0 during the commit phase and yet in the opening phase claim that it was 1 that she had in mind. For a bit commitment scheme to be secure, both forms of cheating must be foiled. On first reading, readers who are not interested in the proof of the impossibility of quantum bit commitment may skip Secs. 8.2 to 8.5.
8.2 A Sample Quantum Bit Commitment Scheme
For concreteness, I will describe a simple quantum bit commitment scheme proposed by Bennett and Brassard. As before, photons in four possible polarizations, horizontal (0 degrees), vertical (90 degrees), 45 degrees and 135 degrees, are used. If Alice has 0 in mind, she sends a sequence of photons chosen randomly from the rectilinear basis, i.e., each photon is independently and randomly chosen from horizontal and vertical polarizations. On the other hand, if Alice has 1 in mind, she sends a sequence of photons chosen randomly from the diagonal basis, i.e., each photon is independently and randomly chosen from 45-degree and 135-degree polarizations. Notice that, independent of the value of the bit chosen by Alice, the density matrix $\rho$ of the entire sequence of photons received by Bob is the same. It is just the tensor product of the density matrices of the individual photons. Indeed, $\rho = \rho_{\text{single}} \otimes \rho_{\text{single}} \otimes \cdots \otimes \rho_{\text{single}}$ with
$\rho_{\text{single}} = \frac{1}{2} \left( |0^\circ\rangle\langle 0^\circ| + |90^\circ\rangle\langle 90^\circ| \right) = \frac{1}{2} \left( |45^\circ\rangle\langle 45^\circ| + |135^\circ\rangle\langle 135^\circ| \right) = \frac{1}{2} I, \qquad (15)$
where I is the two-dimensional identity matrix. Consequently, there is absolutely no way for Bob to learn Alice’s committed bit. What an honest Bob should do is, for each photon, to choose randomly between the rectilinear and diagonal bases to measure its polarization. During the opening phase, Alice tells Bob her committed bit and the polarizations of all the photons. Bob accepts Alice’s committed bit if her announced polarizations are consistent with his measurement results. Suppose, for instance, N photons are transmitted and Alice opens the commitment by telling Bob that she has committed to a 0. Since Bob has chosen the two bases at random, Bob would have performed measurements along the rectilinear basis for an average of N/2 photons. For those photons, Bob can then check if Alice’s announced polarizations are the same as what he has got from his measurements. If the answer is yes, he believes that Alice is honest. Otherwise, Alice must be cheating. I remark that a naive cheating strategy by Alice is likely to be caught by Bob. Suppose Alice prepares a sequence of rectilinear photons and claims that they are diagonally polarized during the opening phase. For an average of N/2 photons that Bob has measured along the diagonal basis, Alice’s photons give random results to Bob’s detector. Now Alice has to blindly guess those results. The probability that she will be successful is, therefore, approximately $(1/2)^{N/2}$.
A fatal problem in Bennett and Brassard’s scheme, as noted by the inventors themselves in their paper, is that it is insecure against an Einstein-Podolsky-Rosen (EPR) type of attack. Recall from Sec. 5.2 that an EPR correlated pair of photons always shows opposite polarizations when measured along the same basis. For instance, when measured along the rectilinear basis, if one photon is horizontal, the other will necessarily be vertical and vice versa.
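The claim that Bob sees the same density matrix for either committed bit, and the estimate of the naive cheating probability, can both be checked numerically. The sketch below represents a linear polarization at a given angle as a real two-component vector; the function names are hypothetical and the code uses plain nested lists rather than a linear algebra library.

```python
import math


def projector(angle_deg):
    """2x2 projector onto linear polarization at the given angle."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def equal_mixture(angle_a, angle_b):
    """Density matrix of an equal mixture of two linear polarizations."""
    pa, pb = projector(angle_a), projector(angle_b)
    return [[(pa[i][j] + pb[i][j]) / 2.0 for j in range(2)] for i in range(2)]


def naive_cheat_success(n_photons):
    """Approximate chance that Alice guesses about N/2 random outcomes."""
    return 0.5 ** (n_photons / 2.0)
```

Both `equal_mixture(0, 90)` and `equal_mixture(45, 135)` evaluate, up to rounding, to half the identity matrix, so a single photon carries no information about the basis. For N = 100 photons the naive cheating probability is `0.5 ** 50`, which is below one part in $10^{15}$.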
Suppose that each of the photons sent by Alice is, in fact, a member of an EPR pair and that Alice keeps the other member herself. Alice decides on the value of her bit only during the opening phase. If she decides it to be 0, she performs her measurement along the rectilinear basis. On the other hand, if she decides it to be 1, she performs her measurement along the diagonal basis instead. This strategy will totally fool Bob and defeat the security requirement of the
scheme: During the commit phase, Bob’s photons are described by a density matrix $\rho = \rho_{\text{single}} \otimes \rho_{\text{single}} \otimes \cdots \otimes \rho_{\text{single}}$ with $\rho_{\text{single}}$ given by Eq. (15), just like in an honest protocol. Yet, for each pair of photons shared between Alice and Bob, the EPR paradox allows Alice’s photon to give the opposite polarization to that of Bob’s whenever the two are measured along the same basis. There is no way for Bob to defeat such an attack. While this EPR type of attack was well-known, its power and generality were not fully appreciated. Indeed, after Bennett and Brassard’s scheme, many other quantum bit commitment protocols had been proposed. Until recently, it had been widely claimed that unconditionally secure quantum bit commitment is possible. The fatal flaw of all those schemes was independently discovered by Mayers [55] and by Lo and Chau [57]. By now, it has been shown that unconditionally secure quantum bit commitment is impossible. I will sketch the key point of the argument here.
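The anti-correlation that the EPR attack relies on can be verified directly for a polarization singlet. The sketch assumes the shared state is $(|HV\rangle - |VH\rangle)/\sqrt{2}$ and that each party projects onto a linear polarization at a chosen angle; the function names are hypothetical.

```python
import math


def pol(angle_deg):
    """Amplitudes (H, V) of a linear polarization at the given angle."""
    theta = math.radians(angle_deg)
    return (math.cos(theta), math.sin(theta))


def singlet_joint_probability(alice_angle, bob_angle):
    """Probability that Alice finds alice_angle and Bob finds bob_angle
    when they share the state (|HV> - |VH>) / sqrt(2)."""
    a, b = pol(alice_angle), pol(bob_angle)
    amplitude = (a[0] * b[1] - a[1] * b[0]) / math.sqrt(2.0)
    return amplitude ** 2
```

Identical outcomes never occur in either basis (`singlet_joint_probability(0, 0)` and `singlet_joint_probability(45, 45)` vanish up to rounding), while each pair of opposite outcomes has probability 1/2. Alice can therefore postpone her choice of basis until the opening phase and still announce polarizations consistent with every photon that Bob measured in that basis.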
8.3 Unconditionally Secure Quantum Bit Commitment Is Impossible

Recall the two security requirements for bit commitment: (A) Bob cannot learn the value of the bit $b$ during the commit phase; and yet (B) Alice cannot change it during the opening phase. I now show that they are inconsistent. If Bob cannot learn the value of the committed bit, then Alice can almost always cheat successfully (i.e., she can change her bit from 0 to 1 during the opening phase without being caught by Bob) even if Bob has a quantum computer. It is then quite clear that she can also cheat against a Bob without a quantum computer. Consequently, quantum bit commitment is always insecure.

Here is the proof. Imagine that both Alice and Bob use quantum computers to execute a quantum bit commitment scheme. (See footnote r.) At the beginning, Alice chooses her committed bit $b = 0$ or 1 and inputs the state $|0\rangle$ or $|1\rangle$ accordingly. Alice and Bob are supposed to go through a multi-step procedure of sending classical and quantum signals to and fro as well as performing local unitary transformations, attaching ancilla and performing measurements in each step. With quantum computers, they preserve the coherence of the state under manipulation perfectly. One can then argue that all actions (classical (see footnote s) and quantum communications, unitary transformations, measurements and attachments of ancilla) by Alice and Bob can be regarded as a unitary transformation applied to the input state. The basic idea of this point was noted in Ref. 56. A more concrete discussion was made in Ref. 58. For a review, see Ref. 59. (See footnote t.) Therefore, at the end of the commit phase, their composite quantum system $H_A \otimes H_B$ [where $H_A$ ($H_B$ respectively) is the Hilbert space of Alice's (Bob's respectively) quantum machine] is in a pure state $|\psi_0\rangle$ or $|\psi_1\rangle$ depending on the value of $b$. Now the security requirement (A), that Bob cannot learn the value of $b$, implies that Bob's quantum machine is described by essentially the same density matrix (see footnote u) independent of the value of $b$, i.e.,

$$\mathrm{Tr}_A |\psi_0\rangle\langle\psi_0| = \rho^B_0 = \rho^B_1 = \mathrm{Tr}_A |\psi_1\rangle\langle\psi_1|, \qquad (16)$$

where $\mathrm{Tr}_A$ denotes the partial trace operation over the subsystem $A$ controlled by Alice. Notice that I am considering the state of the whole quantum machine of Bob rather than its individual components. This greatly simplifies my discussion and avoids fallacies in classical reasoning. But then, there is a mathematical result (see below) which says that $|\psi_0\rangle$ and $|\psi_1\rangle$ are related by a local unitary transformation by Alice alone, i.e., $|\psi_1\rangle = U_A |\psi_0\rangle$ for some $U_A$ acting on $H_A$ only. Consequently, Alice can cheat successfully by executing the protocol for $b = 0$ during the commit phase. It is only at the beginning of the opening phase that she makes up her mind. If she decides $b$ to be zero, of course, she can execute the protocol honestly. If she decides $b$ to be one instead, she simply applies $U_A$ to her state to change it from $|\psi_0\rangle$ to $|\psi_1\rangle$ and executes the protocol for $b = 1$ instead. Since $U_A$ is a local unitary transformation on Alice's machine, she can clearly apply it without Bob's help and there is no way for Bob to defeat such cheating. Therefore, unconditionally secure quantum bit commitment is impossible.

Footnote r. Any bit commitment procedure followed by Bob can be re-phrased as one in which Bob does have a quantum computer but simply fails to make full use of it. Therefore, by showing that Alice can defeat a Bob who makes full use of his quantum computer, Mayers and also Lo and Chau proved that all bit commitment schemes based on quantum mechanics (classical, quantum, or quantum but with measurements) are insecure. This discussion is in the spirit of the Everett interpretation, which asserts that any quantum mechanical experiment (e.g. the execution of a bit commitment scheme) can be described by the unitary evolution of a pure state wave function, followed by a measurement. The essence is that all measurements can be delayed to the very end of a quantum mechanical experiment. Therefore, there is no need to consider decoherence in the intermediate steps. This important observation greatly simplifies the proof of the impossibility of unconditionally secure quantum bit commitment.

Footnote s. Any classical communications may be regarded as a special case of quantum communications and there is no need to distinguish between the two.

Footnote t. This footnote elaborates on the point made in footnote r. Let $H_A$ and $H_B$ denote the Hilbert spaces of Alice's and Bob's quantum machines respectively and let $H_C$ be the Hilbert space of the quantum communications channel. Consider the combined Hilbert space $H = H_A \otimes H_B \otimes H_C$. In the beginning, Alice chooses the bit $b$ to be zero or one and prepares the state $|0\rangle$ or $|1\rangle$ accordingly. Bob always prepares the same fixed state. Alice and Bob now take turns to perform operations (including measurements, unitary transformations and attachment of ancilla) on the system. The key point to note is that the operation applied at each step can be regarded as a unitary transformation. Indeed, one can imagine that Alice and Bob have quantum computers. It is then well known that all operations can be done without any actual measurement. See Refs. 58, 59 for details. In other words, in each step, a party $D \in \{A, B\}$ applies a unitary transformation on $H_D \otimes H_C$, which can then be regarded as a unitary transformation on $H$. Therefore, the whole procedure of the commit phase can be regarded as a product of unitary transformations, which is thus a unitary transformation, applied to the initial state. Hence, the final state can be considered as pure.

Footnote u. The case when $\rho^B_0$ and $\rho^B_1$ are slightly different will be briefly discussed later. The physics there is essentially the same.

8.4 Schmidt Decomposition
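All that is left to prove is the existence of the $U_A$ used in the last paragraph. For this, I need a mathematical result, the Schmidt decomposition (Ref. 62). The following discussion is largely based on Ref. 62. Given Hilbert spaces $H_A$ and $H_B$ with dimensions $p$ and $q$ respectively, consider a normalized state $|\Psi\rangle$ in $H_A \otimes H_B$. Let $\rho = |\Psi\rangle\langle\Psi|$ be the density matrix and $\rho^A = \mathrm{Tr}_B\,\rho$ and $\rho^B = \mathrm{Tr}_A\,\rho$ be the reduced density matrices. Then the Schmidt decomposition theorem says that $|\Psi\rangle$ can be written as

$$|\Psi\rangle = \sum_{i=1}^{r} \sqrt{\lambda_i}\, |a_i\rangle \otimes |b_i\rangle,$$

where $|a_i\rangle$ ($|b_i\rangle$ respectively) are orthonormal eigenstates of $\rho^A$ ($\rho^B$ respectively), and $r \le \min(p, q)$ is the total dimension of the non-zero eigenspaces of $\rho^A$.

The proof goes as follows: Let me write $|\Psi\rangle$ in terms of the orthonormal eigenbasis $|a_1\rangle, |a_2\rangle, \ldots, |a_p\rangle$ of $\rho^A$ as

$$|\Psi\rangle = \sum_{i=1}^{p} |a_i\rangle \otimes |b'_i\rangle,$$

where the $|b'_i\rangle$'s are not necessarily orthogonal. Tracing over $H_B$, one finds

$$\rho^A = \sum_{i=1}^{p} \sum_{j=1}^{p} \langle b'_j | b'_i \rangle\, |a_i\rangle\langle a_j| .$$

On the other hand, since the $|a_i\rangle$'s are the eigenstates of $\rho^A$, one must have

$$\rho^A = \sum_{i=1}^{p} \lambda_i\, |a_i\rangle\langle a_i| ,$$

where the $\lambda_i$'s are the eigenvalues of $\rho^A$. Equating these two equations, one finds $\langle b'_i | b'_j \rangle = \lambda_i \delta_{ij}$. In particular, $|b'_i\rangle = 0$ whenever $\lambda_i = 0$, so only the $r$ terms with non-zero eigenvalues survive (label them $i = 1, \ldots, r$). Hence, $|b_i\rangle = \lambda_i^{-1/2} |b'_i\rangle$ is an orthonormal set in $H_B$ and

$$|\Psi\rangle = \sum_{i=1}^{r} \sqrt{\lambda_i}\, |a_i\rangle \otimes |b_i\rangle .$$

Now, by taking the trace over $H_A$, it is easy to see that

$$\rho^B = \sum_{i=1}^{r} \lambda_i\, |b_i\rangle\langle b_i| .$$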
Therefore, $|b_i\rangle$ is an eigenvector of $\rho^B$ corresponding to the eigenvalue $\lambda_i$. Q.E.D.

Let me apply this result to quantum bit commitment. At the end of the commit phase, if $b = 0$, the wavefunction $|\psi_0\rangle$ can be written in Schmidt decomposition as

$$|\psi_0\rangle = \sum_{i=1}^{r} \sqrt{\lambda_i}\, |a_i\rangle \otimes |b_i\rangle .$$

Similarly, if $b = 1$, the wavefunction $|\psi_1\rangle$ can be written as

$$|\psi_1\rangle = \sum_{i=1}^{r} \sqrt{\lambda'_i}\, |a'_i\rangle \otimes |b'_i\rangle .$$

The first security requirement, that Bob does not know the bit, demands that ideally Bob has the same density matrix for $b = 0$ and $b = 1$. That is,

$$\sum_{i=1}^{r} \lambda_i\, |b_i\rangle\langle b_i| = \sum_{i=1}^{r} \lambda'_i\, |b'_i\rangle\langle b'_i| .$$

Without much loss of generality (see footnote v), this implies that $\lambda_i = \lambda'_i$ and $|b_i\rangle = |b'_i\rangle$. In other words,

$$|\psi_1\rangle = \sum_{i=1}^{r} \sqrt{\lambda_i}\, |a'_i\rangle \otimes |b_i\rangle . \qquad (26)$$

Observe that the only possible difference between $|\psi_0\rangle$ and $|\psi_1\rangle$ lies in the eigenvectors $|a_i\rangle$ and $|a'_i\rangle$ (of $\rho^A_0$ and $\rho^A_1$ respectively). Let me consider the unitary transformation $U_A$ that maps $|a_i\rangle$ to $|a'_i\rangle$. As asserted, it acts on Alice's quantum machine $H_A$ only and yet maps $|\psi_0\rangle$ to $|\psi_1\rangle$. This shows the existence of a cheating unitary transformation $U_A$ and thus completes the proof of insecurity of ideal quantum bit commitment, where Bob's density matrices $\rho^B_0$ and $\rho^B_1$ corresponding to $b = 0$ and $b = 1$ are exactly the same.

Footnote v. Here I assume that the eigenvalues are non-degenerate. The case of having degenerate eigenvalues can be dealt with in a similar manner.
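The construction of the cheating unitary can be illustrated numerically. The sketch below is an assumption-laden toy, not the general proof: it takes $p = q = 3$, full-rank states with non-degenerate Schmidt coefficients, and obtains the Schmidt decomposition from NumPy's singular value decomposition. The state `psi1` is a stand-in for "the protocol run with $b = 1$", built so that Bob's reduced density matrix matches that of `psi0`; the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(dim):
    """Draw a unitary matrix from the QR decomposition of a Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    unitary, _ = np.linalg.qr(z)
    return unitary

def schmidt(psi, p, q):
    """Schmidt-decompose psi in H_A (dim p) x H_B (dim q) via the SVD."""
    coeff = psi.reshape(p, q)          # psi = sum_ij coeff[i, j] |i>_A |j>_B
    u, s, vh = np.linalg.svd(coeff)
    # Columns of u are |a_i>, rows of vh are |b_i>, s[i] = sqrt(lambda_i).
    return u, s, vh

def reduced_b(psi, p, q):
    """Bob's reduced density matrix Tr_A |psi><psi|."""
    coeff = psi.reshape(p, q)
    return coeff.T @ coeff.conj()

def cheating_unitary(psi0, psi1, p, q):
    """Build U_A with (U_A x I)|psi0> = |psi1>, assuming equal rho^B and p == q."""
    u0, _, vh0 = schmidt(psi0, p, q)
    u1, _, vh1 = schmidt(psi1, p, q)
    # Eigenvectors of rho^B are fixed only up to a phase; absorb it into U_A.
    phases = np.array([np.vdot(vh0[i], vh1[i]) for i in range(p)])
    return u1 @ np.diag(phases) @ u0.conj().T

p = q = 3
psi0 = rng.normal(size=p * q) + 1j * rng.normal(size=p * q)
psi0 /= np.linalg.norm(psi0)
# Stand-in for the b = 1 state: same rho^B, different state on Alice's side.
psi1 = np.kron(random_unitary(p), np.eye(q)) @ psi0

u_a = cheating_unitary(psi0, psi1, p, q)
psi_cheat = np.kron(u_a, np.eye(q)) @ psi0
print(np.allclose(psi_cheat, psi1))
```

The point of the sketch is that `cheating_unitary` only uses the two Schmidt decompositions, acts on Alice's factor alone, and still reproduces `psi1` from `psi0`.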
8.5 Non-ideal Quantum Bit Commitment

However, in general, one can allow $\rho^B_0$ and $\rho^B_1$ to be slightly different. This will only give Bob a small probability of distinguishing between 0 and 1. Using the concept of fidelity (Ref. 63), it has been shown rigorously by Mayers (Refs. 55, 56; see also Refs. 57, 58) that even then Alice can almost always cheat successfully. Therefore, one concludes that even non-ideal quantum bit commitment is impossible. For a review on quantum bit commitment, see Ref. 59.
8.6 Aftermath of the Fall of Quantum Bit Commitment

In conclusion, the fatal flaw in quantum bit commitment protocols is that they all involve an implicit assumption that some measurements are performed by the two users. However, with quantum computers and quantum storage devices, a cheater, Alice, can almost always cheat successfully with entanglement. The significance of this discovery lies in its generality: Not only existing quantum bit commitment schemes, but also any quantum bit commitment scheme that one can possibly devise, are necessarily insecure. Furthermore, as noted in footnote r, this 'no-go theorem' applies not only to fully quantum bit commitment schemes, but to all bit commitment schemes based on quantum mechanics: classical, quantum, and quantum but with measurements. Moreover, one cannot bypass this 'no-go theorem' by assuming that the decoherence time involved is short. This is because a cheater can, in principle, perform quantum error correction (Ref. 64) and fault-tolerant quantum computation (Refs. 35, 36, 37, 38, 39) to defeat decoherence.

Following the surprising discovery of the insecurity of quantum bit commitment, other quantum protocols such as ideal quantum coin tossing, quantum 'one-out-of-two oblivious transfer' and 'one-sided' two-party secure computation (see footnote w) were also shown (Ref. 54) to be impossible. By now, the big hope of unconditionally secure two-party computations has been totally shattered.
Footnote w. A one-sided two-party secure computation allows only one of the two parties to learn the result $f(x, y)$. In other words, Alice with a private input $x$ and Bob with a private input $y$ cooperate to compute a prescribed function $f(x, y)$ in such a way that at the end of the computation, 1) Alice learns nothing about $y$ or $f(x, y)$; 2) Bob learns $f(x, y)$; and 3) Bob learns nothing about $x$ except for what logically follows from $y$ and $f(x, y)$.
There is, however, an important caveat in what I am saying. Even though quantum two-party secure computations are impossible in theory, they may still be possible in practice. The point is that, to break those quantum protocols, a cheater generally needs a quantum computer. Therefore, quantum cryptographic protocols allow one to replace classical computational assumptions by quantum computational assumptions. Since it is a huge technological challenge to actually build a quantum computer, quantum two-party computations may still have practical value. Hrubý (Ref. 65) has worked on a quantum smart card for identification purposes. Finally, it cannot be over-emphasized that those 'no-go theorems' do not apply to quantum key distribution or quantum money. Quantum cryptography should remain a fertile and challenging subject in the foreseeable future. This is particularly so in view of the recent dramatic advances in experiments.
8.7 Why did people wrongly believe that quantum bit commitment is possible?

I remark that the Einstein-Podolsky-Rosen attack is a subtle attack with no classical analogue. The implication of this attack is that, in quantum bit commitment, while Alice can open the bit as either 0 or 1, Alice cannot open the bit as both 0 and 1. The point is: to open a bit as a 0 to Bob, Alice is generally required to perform a measurement. Since measurements generally lead to disturbance, this will prevent Alice from opening the bit as a 1. In other words, in quantum bit commitment the information to open the bit as both 0 and 1 is never available to a cheating Alice. This effect is a reincarnation of the Einstein-Podolsky-Rosen paradox. It is highly paradoxical from the perspective of classical information theory and it highlights the basic flaw in the long-standing belief in the security of quantum bit commitment. All previously claimed secure quantum bit commitment schemes implicitly assume that some measurements have been performed by the two users. By using a quantum computer and delaying the measurements to the very end, a user can, in principle, break all quantum bit commitment schemes.

9 Quantum Cryptanalysis
This section may be skipped on first reading. The subject of cryptology consists of two parts-cryptography, the art of code-making, and cryptanalysis, the art of code-breaking. In this section, I will turn to cryptanalysis. As remarked earlier, the cheating strategy in quantum bit commitment generally requires a quantum computer. The insecurity of quantum bit commitment, therefore, demonstrates the power of a quantum computer in cryptanalysis
against quantum cryptography. Here I remark that a quantum computer is also a powerful weapon for cryptanalysis against conventional cryptography. This is so because quantum computers can crack a number of hard problems that underlie the security of many conventional crypto-systems. For instance, Shor has devised efficient quantum algorithms for factoring and for the so-called 'discrete logarithm problem' (Ref. 10). Boneh and Lipton (Ref. 66) have generalized Shor's algorithm to attack any crypto-system with a 'hidden linear form'. In particular, even the discrete logarithm problem in 'elliptic curves' can be solved efficiently by a quantum computer. In conclusion, if a quantum computer is ever built, many widely used public key crypto-systems will be unsafe.

What is even more worrying is the fact that this total security break by quantum computers is retroactive. By keeping copies of the current transmission, an eavesdropper can, in principle, wait for the construction of a quantum computer (or the invention of more efficient classical algorithms) in future to decode any top secret message encoded by those breakable public key schemes. Since technological advances in computing power are notoriously difficult to predict, the risk involved in using conventional cryptography for long-term secrets is inherently non-zero. In this respect, quantum cryptography offers the great advantage of avoiding this retroactive security break.

What about private key or symmetric systems? Grover's efficient algorithm (Ref. 68) for database search can reduce the time needed for exhaustive key search from $O(N)$ to $O(\sqrt{N})$, where $N$ is the total number of possible keys. For instance, it can speed up million-fold the exhaustive key search against DES (Data Encryption Standard), the most popular computer encryption algorithm. The successful construction of a large scale quantum computer would be the end of DES. (See footnote x.) To foil Grover's attack, one needs to use a new scheme with double the key length.
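The square-root speed-up and the key-doubling remedy can be made concrete with a little arithmetic. The sketch below assumes the standard textbook estimate of about $(\pi/4)\sqrt{N}$ Grover iterations for a single marked key, an average of $N/2$ classical trials, and a 56-bit DES key; the function names are illustrative and nothing here models the cost of a single quantum iteration.

```python
import math

def classical_trials(key_bits):
    """Average number of keys tried in a classical exhaustive search."""
    return 2 ** (key_bits - 1)

def grover_iterations(key_bits):
    """Approximate Grover iterations, (pi/4) * sqrt(N), for N = 2**key_bits keys."""
    n_keys = 2 ** key_bits
    return math.floor(math.pi / 4 * math.sqrt(n_keys))

for bits in (56, 112):
    print(bits, classical_trials(bits), grover_iterations(bits))
```

Under these assumptions a 56-bit key needs on the order of $2^{28}$ quantum iterations, while doubling the key length to 112 bits pushes the quantum cost back to the order of $2^{56}$, comparable to the classical cost of attacking the original key.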
In conclusion, quantum computation can have a potentially shattering effect on cryptography. To many conventional cryptographers, this is an unwelcome possibility that is too catastrophic to ignore. For a review on quantum algorithms including Shor's and Grover's, see chapter six.

Footnote x. As a side remark, a quantum computer can also solve the 'collision problem' more efficiently. Let me first introduce the latter. Given a function $F : X \to Y$, the collision problem is to find a collision in $F$, i.e., two distinct elements $x_0$ and $x_1$ in $X$ such that $F(x_0) = F(x_1)$, assuming that such a pair exists. This problem is important in cryptography because it is commonly assumed that the collision problem is computationally infeasible for a class of functions known as hash functions. Indeed, a brute force attack known as the birthday attack requires $O(\sqrt{N})$ evaluations of the function for a two-to-one function, where $N = |X|$. (The name birthday attack comes from the fact that on average it requires a group of fewer than 30 persons to find a pair of persons having the same birthday. The key point is that there are $n(n-1)/2$ pairs to consider in a group of $n$ persons.) However, using Grover's algorithm as a subroutine, Brassard, Høyer and Tapp (Ref. 70) have found a quantum algorithm that finds collisions in arbitrary $r$-to-one functions after only $O(\sqrt[3]{N/r})$ expected evaluations of the function. Furthermore, there also exist some super-fast quantum algorithms (Refs. 71, 72) for complex quantum queries. Their impact on cryptography is, however, unclear to me.

10 Thoughts For The Future
Many challenging questions in quantum cryptology remain to be answered. Let me mention a few here.

At the conceptual level, there are now solid foundations to quantum cryptography. On the one hand, quantum key distribution has been proven to be secure. (See Sec. 6.) On the other hand, quantum bit commitment has been shown to be impossible due to cheating by using the EPR effect. (See subsection 8.3.) The important conceptual questions are: What is the exact boundary to the power of quantum cryptography? And why is there such a boundary? An important concrete line of research here is to quantify the partial security offered by quantum identification schemes. As discussed in Sec. 8.6, such schemes cannot offer perfect security.

Another question is multi-party secure computations. Under the assumption that a majority of the people involved are honest, unconditionally secure schemes of multi-party computations exist even in conventional cryptography. The big question here is whether one can improve on those classical results by using quantum mechanics.

As discussed in Sec. 6, given quantum computers, quantum key distribution can be made unconditionally secure over arbitrarily long distances. However, full-blown quantum computers are far beyond current technology. At a phenomenological level, it would be interesting to see if quantum error correction, a subject to be introduced in chapter seven, can be used in practice to increase the range of quantum key distribution from the state-of-the-art tens of kilometers to a futuristic range of thousands of kilometers. Some work along this line has already been started. This would be an important milestone in the feasibility study of a practical quantum key distribution system. For experimental quantum cryptography, the proposed ground to satellite experiment is a major challenge.

Quantum key distribution does offer some advantages over its conventional counterpart. As noted in Sec. 4, it can avoid the retroactive total security break which haunts its classical counterpart. John Smolin has also emphasized that classical cryptographic systems can, in principle, have fatal security loopholes that are hidden from users. Indeed, they might send out miniature robots to the adversaries to disclose all the secrets. A homemade quantum cryptographic system can prevent hidden security loopholes. It should also be noted that a conventional optical fiber can be divided in the wavelength domain into, for example, a quantum channel at 1300 nm and a conventional high-bandwidth channel operating at 1550 nm. This means that quantum key distribution can, in principle, be used during non-rush hours without seriously affecting the transmission of classical (telephone) messages. Improvements in photon sources, transmission channels and detector technology as well as the future developments of conventional cryptography will ultimately determine the competitiveness of quantum cryptography against its conventional counterparts in military and commercial applications.

Finally, I remark that quantum cryptology is an integrated component of the general field of quantum information processing, whose ultimate goal is the unification of quantum mechanics with subjects such as information theory, computer science and cryptology. Exciting unexpected developments will most likely arise out of the interplay of the concepts from quantum cryptology, quantum computing and quantum information and out of inspirations from the classical theory. A closer look at those related subjects may, therefore, give new insights to the development of quantum cryptology. This point is well illustrated by the history of quantum cryptography itself. While quantum key distribution was born before much work on quantum computation, as discussed in Sec. 6 its security (Ref. 34) has been rigorously established rather recently. Such a proof makes use of sophisticated concepts such as quantum error correction and fault-tolerant quantum computation, which have been developed only in recent years.

11 What Quantum Cryptography Is Telling Us About Quantum Mechanics?
Quantum cryptography provides a concrete arena for investigating fundamental problems in quantum mechanics. As an example, consider the dichotomy between quantum mechanics and classical physics: While our everyday experience is described by classical physics and probability theory, experiments tell us that microscopic phenomena should be described by quantum mechanics. In the most general case, we use a hybrid description with classical measurement outcomes and a quantum state under investigation. The big question is: What is the relationship between the three descriptions, namely the classical probabilistic description, the fully quantum mechanical description, and the hybrid description? The first lesson from quantum cryptography is that a hybrid or classical description can always be replaced by a fully quantum mechanical description, i.e. a unitary transformation followed by a measurement at the very end. As in the cheating strategy against quantum bit commitment discussed in Sec. 8, this is done by delaying all measurements to the very end.
From the Einstein-Podolsky-Rosen paradox, we learn that some quantum mechanical experiments do not have classical explanations. One might wonder when a reduction from a quantum scheme to a classical scheme will be possible. Indeed, the proof of security of quantum key distribution makes essential use of this reduction idea. (See Sec. 6.) The key insight there is coarse-graining. It is shown that, whenever the coarse-grained projection operators of some quantum mechanical experiment can be written as the coarse-grained projection operators along a single basis, then the probabilities associated with those operators have classical interpretations. This leads to tremendous simplification as classical probability theory can then be used to address the original quantum mechanical experiment.
Acknowledgments
I am indebted to the generous support from colleagues and mentors at the California Institute of Technology, Pasadena, USA, at the Institute for Advanced Study, Princeton, USA, and at the Hewlett-Packard Laboratories, Bristol, U.K. Much of this chapter is based on other sources, in particular, an enlightening talk given by C. H. Bennett. Stimulating discussions on the subject with numerous colleagues including M. Ardehali, C. H. Bennett, M. Ben-Or, A. Bodor, G. Brassard, S. L. Braunstein, R. Cleve, C. Crépeau, D. P. DiVincenzo, A. K. Ekert, L. Goldenberg, D. Gottesman, J. Hrubý, R. J. Hughes, R. Jozsa, A. Kent, J. Kilian, J. S. Kim, H. J. Kimble, M. Knill, N. Lütkenhaus, D. Mayers, T. Mor, S. Popescu, J. Preskill, J. G. Rarity, P. Shor, J. Smolin, T. Toffoli, L. Vaidman and A. C.-C. Yao are gratefully acknowledged. I particularly thank H. F. Chau for fruitful collaborations on the subject, and D. W. Leung, T. Spiller and H. Zbinden for helpful comments during the preparation of this chapter. I dedicate this chapter to the memory of my parents. This paper was written while the author was working at Hewlett-Packard Labs, Bristol (UK).

Appendix A: RSA Public Key Crypto-system

The most well-known public key encryption scheme was invented by Rivest, Shamir and Adleman. The security of RSA is based on the difficulty of factoring large numbers. A user, say Bob, first chooses two large primes $p$ and $q$ and computes $N = pq$. He then randomly chooses the encryption key $e$ such that $e$ and $(p-1)(q-1)$ have no common factors. Afterwards, he computes the unique decryption key, $d$, such that

$$ed = 1 \ [\mathrm{mod}\ (p-1)(q-1)]. \qquad (27)$$
This computation can be done efficiently by the Euclidean algorithm. Now $e$ and $N$ are made public: They can be published in a public key directory in the same manner as a telephone directory. The decryption key, $d$, must be kept secret. As $p$ and $q$ are no longer needed, they can be discarded, but never revealed. Suppose a person Alice, who may or may not have met Bob before, would like to send Bob a message $m$ (mod $N$). She can do so by raising it to the power $e$, i.e.,

$$c = m^e \pmod{N} \qquad (28)$$

and sending $c$ to Bob. Bob can recover the message $m$ by raising $c$ to the power $d$. This is because, from elementary number theory, $m^{(p-1)(q-1)} = 1 \pmod{N}$ for any $m$ coprime to $N$ and, therefore, writing $ed = k(p-1)(q-1) + 1$ for some integer $k$,

$$c^d = m^{ed} = m^{k(p-1)(q-1)+1} = \left[m^{(p-1)(q-1)}\right]^k \times m = m \pmod{N}. \qquad (29)$$
For a long message, Alice may, for example, expand it in powers of $N$ and encrypt each entry in the $N$-ary expansion individually. An eavesdropper Eve who knows neither $d$ nor the factorization of $N$ will generally have a hard time deducing $m$ from $c$, $e$ and $N$ alone. On the other hand, if Eve can factor $N$ into $p$ times $q$, then she can trivially find the decryption key $d$ by using the Euclidean algorithm with $e$ and $(p-1)(q-1)$ as the inputs.
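The key generation, encryption and decryption steps above can be sketched in a few lines. This is a toy illustration only: the primes $p = 61$, $q = 53$, the exponent $e = 17$ and the message $m = 65$ are assumed example values far too small for real use, and the function names are hypothetical. The extended Euclidean algorithm is written out explicitly because it is the step the text relies on, both for Bob and for an Eve who has factored $N$.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y

def decryption_key(e, p, q):
    """Solve e*d = 1 mod (p-1)(q-1) for d, as in Eq. (27)."""
    phi = (p - 1) * (q - 1)
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e must share no factor with (p-1)(q-1)")
    return x % phi

def encrypt(m, e, n):
    """Eq. (28): c = m^e mod N."""
    return pow(m, e, n)

def decrypt(c, d, n):
    """Eq. (29): m = c^d mod N."""
    return pow(c, d, n)

# Toy parameters (assumed for illustration).
p, q, e = 61, 53, 17
n = p * q
d = decryption_key(e, p, q)
message = 65
cipher = encrypt(message, e, n)
print(d, cipher, decrypt(cipher, d, n))
```

Anyone who learns $p$ and $q$ can call `decryption_key` exactly as Bob does, which is why the security of the scheme rests on the difficulty of factoring $N$.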
Appendix B: Error Correction and Privacy Amplification

Here I review a simple but non-optimal procedure for error correction and privacy amplification as introduced in Ref. 12. Recall that Alice and Bob's polarization data may be different due to noise and eavesdropping by Eve. Upon the completion of the quantum transmission, Alice and Bob need to exchange public messages in order to reconcile the difference between their data. I will assume that Eve can listen to all public discussion. Therefore, Alice and Bob should make sure that the public discussion reveals as little information as possible on their data. A simple scheme of reconciliation is for Alice and Bob to first agree on a random permutation of the bit positions in their strings. They then partition their string into blocks of size $k$ such that each block is highly unlikely to contain more than one error. For each block, Alice and Bob compare its parity publicly. If the parities computed by Alice and Bob respectively are the same, a block is tentatively accepted as correct. If the parities are different, a binary search will now be applied to the block. This will disclose $\log_2 k$ bits of parities about the sub-blocks before the error is finally located and corrected.
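The parity comparison and binary search for a single block can be sketched as follows. This is a simplified illustration under stated assumptions: the block contains exactly one error, the random permutation of bit positions has already been applied, and the discarding of bits to limit Eve's information is left out. The function names are hypothetical.

```python
def parity(bits):
    """Parity (sum modulo 2) of a list of bits."""
    return sum(bits) % 2

def locate_error(alice_block, bob_block):
    """Binary search for a differing position.

    Returns (index, number of sub-block parities disclosed).
    Assumes the two blocks differ in an odd number of positions.
    """
    lo, hi = 0, len(alice_block)
    disclosed = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        disclosed += 1                      # one parity announced publicly
        if parity(alice_block[lo:mid]) != parity(bob_block[lo:mid]):
            hi = mid                        # error lies in the first half
        else:
            lo = mid                        # error lies in the second half
    return lo, disclosed

def reconcile_block(alice_block, bob_block):
    """Return Bob's block, corrected if the block parities disagree."""
    corrected = list(bob_block)
    if parity(alice_block) != parity(corrected):
        index, _ = locate_error(alice_block, corrected)
        corrected[index] ^= 1
    return corrected

alice = [1, 0, 1, 1, 0, 0, 1, 0]
bob = list(alice)
bob[5] ^= 1                                 # a single transmission error
print(locate_error(alice, bob), reconcile_block(alice, bob) == alice)
```

For a block of size $k = 8$ the search announces three sub-block parities, matching the $\log_2 k$ count in the text.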
To prevent Eve from gaining information through the public discussion, Alice and Bob should discard the last bit of each block or sub-block whose parity has been announced. Notice that if two or more errors occur in the same block, some of them may remain undetected. To correct those errors, random permutation and block parity disclosure (with increasing block size) is repeated several times. Once Alice and Bob have reached the stage in which there are probably only a few errors left, it will be inefficient for them to continue the block parity disclosure process. Therefore, a new process is now adopted: they can apply an iterative process of comparing the parity of a publicly chosen random subset of their data. Whenever there is some disagreement between their strings, the random subset parities will disagree with probability 1/2. If a disagreement is found, a bisective search is applied to locate and correct the error. As before, the last bit of each set whose parity is announced should be discarded in order to prevent Eve from getting additional information from the public discussion. This iterative process is repeated until Alice and Bob fail to find any disagreement in many (say 20) consecutive comparisons. In this case, it is highly likely that they share the same string. This completes the process of error correction.

Alice and Bob can now convert their polarization data into a raw key. The remaining problem is that Eve may have partial information on this raw key. Therefore, Alice and Bob perform privacy amplification, i.e., they distill a shorter but perfectly secure key from such a partly secure raw key. Bennett et al. presented a procedure for achieving this distillation process against a special class of eavesdropping strategies: Suppose that there are $n$ bits in the raw key and Eve has at most $l$ deterministic bits of information about it. A hash function $h$ should be chosen randomly from an appropriate class of functions $\{0,1\}^n \to \{0,1\}^{n-l-s}$, where $s > 0$. At the end, the raw key $x$ will be mapped into $h(x)$ such that Eve's expected information on it is less than $2^{-s}/\ln 2$ bit. Alice and Bob can now each compute the value $h(x)$ and keep it as a secret key for subsequent communications.

References
1. D. Kahn, The Codebreakers: The Story of Secret Writing, New York, Macmillan Publishing Co., 1967.
2. B. Schneier, Applied Cryptography, New York, John Wiley and Sons, Inc., 1996.
3. For an excellent but perhaps outdated review, see C. H. Bennett, G. Brassard and A. K. Ekert, Sci. Am. (Oct. 1992), 50.
4. The following discussion is based on a talk delivered by C. H. Bennett at Princeton University around 1996.
5. W. K. Wootters and W. Zurek, Nature 299, 802 (1982); D. Dieks, Phys. Lett. A 92, 271 (1982).
6. N. Herbert, Found. Physics 12, 1171 (1982).
7. C. A. Fuchs, Los Alamos preprint archive quant-ph/9611010.
8. S. Wiesner, Sigact News 15, 78 (1983).
9. W. Diffie and M. E. Hellman, IEEE Transactions on Information Theory, v. IT-22, n. 6, 644 (1976).
10. P. W. Shor, "Algorithms for Quantum Computation: Discrete Logarithms and Factoring," in Proceedings 35th Annual Symposium on Foundations of Computer Science, (USA, Nov. 1994), IEEE Press; P. W. Shor, SIAM J. Computing 26, 1484-1509 (1997).
11. C. H. Bennett and G. Brassard, "Quantum cryptography: Public key distribution and coin tossing," in Proceedings of IEEE International Conference on Computers, Systems, and Signal Processing, p. 175-179. IEEE, 1984.
12. C. H. Bennett, F. Bessette, G. Brassard, L. Salvail and J. Smolin, J. Cryptology 5, 3 (1992).
13. M. N. Wegman and J. L. Carter, Journal of Computer and System Sciences 22, 265 (1981).
14. B. C. Jacobs and J. D. Franson, Opt. Lett. 21, 1854 (1996); W. T. Buttler et al., Phys. Rev. A 57, 2379 (1998); W. T. Buttler et al., Los Alamos preprint archive quant-ph/9805071.
15. P. D. Townsend, J. G. Rarity, P. R. Tapster, Electronics Letters 29, no. 14, 1291 (1993); A. Muller, H. Zbinden and N. Gisin, Nature 378, 449 (1995); R. J. Hughes et al., in Advances in Cryptology: Proceedings of Crypto '96, Lecture Notes in Computer Science, Vol. 1109, Springer-Verlag, Berlin, p. 329.
16. B. C. Jacobs and J. D. Franson, "Feasibility of Global Systems for Quantum Cryptography," to be published. R. Hughes has also made a similar proposal.
17. C. H. Bennett, G. Brassard and J.-M. Robert, SIAM J. Computing 17, 210 (1988).
18. A. K. Ekert, Phys. Rev. Lett. 67, 661 (1991).
19. A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935); J. S. Bell, Physics 1, 195 (1964); J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, Phys. Rev. Lett. 23, 880 (1969). These three papers are reprinted in J. A. Wheeler and W. H. Zurek, Quantum Theory and Measurement (Princeton University Press, Princeton, 1983), p. 138, p. 403 and p. 409 respectively.
20. C. H. Bennett, Phys. Rev. Lett. 68, 3121 (1992).
21. M. Koashi and N. Imoto, Phys. Rev. Lett. 77, 2137 (1996).
22. P. D. Townsend, Nature 385, 47 (1997).
23. E. Biham, B. Huttner, and T. Mor, Phys. Rev. A 54, 2651 (1996).
24. L. Goldenberg and L. Vaidman, Phys. Rev. Lett. 75, 1239 (1995). A comment to this paper appeared in A. Peres, Phys. Rev. Lett. 77, 3264 (1996). A reply was made in L. Goldenberg and L. Vaidman, Phys. Rev. Lett. 77, 3265 (1996). The concept was clarified in T. Mor, Phys. Rev. Lett. 80, 3137 (1998).
25. W. Y. Hwang, I. G. Koh, and Y. D. Han, Los Alamos preprint archive quant-ph/9702009.
26. H.-K. Lo and H. F. Chau, "Quantum Cryptographic System with Reduced Data Loss," US patent No. 5,732,139 (granted March 24, 1998).
27. M. Ardehali, G. Brassard, H. F. Chau, and H.-K. Lo, Los Alamos preprint archive quant-ph/9803007.
28. N. Lütkenhaus, Phys. Rev. A 54, 97 (1996).
29. C. H. Bennett, T. Mor, and J. A. Smolin, Phys. Rev. A 54, 2675 (1996).
30. E. Biham and T. Mor, Phys. Rev. Lett. 78, 2256 (1997).
31. C. A. Fuchs, N. Gisin, R. B. Griffiths, C.-S. Niu, and A. Peres, Phys. Rev. A 56, 1163 (1997).
32. R. B. Griffiths and C.-S. Niu, Phys. Rev. A 56, 1173 (1997).
33. E. Biham, M. Boyer, G. Brassard, J. van de Graaf, and T. Mor, "Security of quantum key distribution against all collective attacks," Los Alamos preprint archive quant-ph/9801022 (1998).
34. H.-K. Lo and H. F. Chau, available at Los Alamos preprint archive quant-ph/9803006 (1998).
35. P. W. Shor, in Proc. 37th Symposium on Foundations of Computer Science, IEEE Computer Society Press, 1996, p. 56, also Los Alamos preprint archive quant-ph/9605011.
36. A. Yu. Kitaev, "Quantum error correction with imperfect gates," preprint (1996).
37. E. Knill, R. Laflamme, and W. Zurek, "Resilient quantum computation," Science 279, 342-345 (1998).
38. D. Aharonov and M. Ben-Or, "Fault-tolerant quantum computation with constant error," Proceedings of the 29th Annual ACM Symposium on the Theory of Computing. Also available in Los Alamos preprint archive quant-ph/9611025.
39. For a review, see for example, chapter 8 by John Preskill and J. Preskill, Proc. Roy. Soc. (London) A 454, 385 (1998).
118 Introduction to Quantum Computation and Information
40. H.-J. Briegel, W. Dur, J . I. Cirac, and P. Zoller, “Quantum repeaters for communication”, Los Alamos preprint archive quant-ph/9803056. 41. C. H. Bennett, D. P. DiVincenzo, J. A. Smolin, and W. K. Wootters, Phys. Rev. A 54, 3824-3851 (1996). 42. D. Deutsch, A. Ekert, R. Jozsa, C. Macchiavello, S. Popescu, and A. Sanpera, Phys. Rev. Lett. 77, 2818 (1996). 43. D. Mayers, in Advances tn Cryptology: Proceedings of Crypto’ 9 6 , Lecture Notes in Computer Science, Vol. 1109 (Springer-Verlag, Berlin, 1996) p. 343. 44. D. Mayers, “Unconditional security in Quantum Cryptography”, Los Alamos preprint archive quant-ph/9802025, version 4, 15 Sept., 1998. 45. N. Lutkenhaus, in discussion with R. J. Hughes, Torino workshop on quantum information and computation (1997). 46. H. J. Kimble, Private Communications. 47. J . S. Kim, Private Communications. 48. “One Less Thing to Believe in: Fraud at Fake Cash Machine,” New Yorlc Times, 13 May 1993, pp. A1 and B9 as cited in Ref. 49. 49. C. Crkpeau and L. Salvail, in Advances in Cryptology: Proceedings of Eurocrypto ’95, (Springer-Verlag) 133. 50. G. Brassard and C. Crkpeau, “Quantum bit commitment and coin tossing protocols,” in Advances an Cryptology: Proceedings of Crypto ’90, Lecture Notes in Computer Science, Vol. 537, p. 49-61. Springer-Verlag, 1991. 51. B. Huttner, N. Imoto and S. M. Barnett, Journal of Nonlinear Optical Physics and Materials 5 , No. 4 (1996) 823. 52. G. Brassard, C. Crkpeau, R. Jozsa and D. Langlois, “A quantum bit
commkmen‘ scheme prouab\y unb~ea!b\eby ’00thpax%ies in PTQCWA53.
54. 55. 56. 57. 58.
ings of the 34th annual IEEE Symposium on the Foundation of Computer Science, Nov. 1993, p.362-371. C. H. Bennett, G. Brassard, C. Crkpeau, and M.-H. Skubiszewska, “Practical quantum oblivious transfer,” in Advances in Cryptology: Proceedings of Crypto ’91, Lecture Notes in Computer Science, Vol. 576, p. 35 1-366. Springer-Verlag , 1992. H.-K. Lo, Phys. Rev. A 5 6 , 1154 (1997). D. Mayers, “The trouble with quantum bit commitment,” Los Alamos preprint archive quant-ph/9603015, t o be published. D. Mayers, Phys. Rev. Lett. 7 8 , 3414 (1997). H.-K. Lo and H. F. Chau, Phys. Rev. Lett. 7 8 , 3410 (1997). H.-K. Lo and H. F. Chau, in Proceedings of the Fourth Workshop on Physics and Computation, PhysComp ’ 96, Boston 1996 (New England
Quantum Cryptology
59. 60. 61. 62. 63. 64. 65.
66. 67. 68.
69. 70. 71. 72. 73.
74. 75.
119
Complex Systems Institute, Boston, 1996), p. 76, also Los Alamos preprint archive quant-ph/9605026,full paper to appear in a special issue of Physlca D. H. F. Chau and H.-K. Lo, Fort. der Phys. 46, 325 (1998). A. C.-C. Yao, in Proceedings of 26th Annual ACM Symposium on the Theory of Computing, 1995, p. 67. J. Kilian in Proceedings of 1988 ACM Annual Symposium on Theory of Computing, (May, 1988), p. 20. See, for example, the Appendix of L. P. Hughston, R. Jozsa and W. K. Wootters, Phys. Lett. A183,14 (1993). R. Jozsa, J. Mod. Opt. 41, 2315 (1994). P. W. Shor, Phys. Rev. A 52,R2493 (1995); A. M. Steane, Phys. Rev. Lett. 77,793 (1996). See also chapter 7 by Andy Steane. J. Hrubf, in Proceedings of International Conference on Cryptography: Policy and Algorithms, Lecture Notes in Computer Science, Vol. 1029, Springer-Verlag, 1995, p. 282. D. Boneh and R. J. Lipton, in Advances an Cryptology: Proceedings of Crypto’ 95, Lecture Notes in Computer Sciences Vol. 963, 424 (1995). A. Yu. Kitaev, Los Alamos preprint archive quant-ph/9511026. L. K. Grover, in Proceedings of 28th Annual ACM Symposium on the Theory of Computing (ACM Press, New York, 1996), p. 212; Phys. Rev. Lett. 79,325 (1997). For a survey article on the impact of Grover’s algorithm on cryptanalysis, see G. Brassard, Science 275,627 (1997). G . Brassard, P. Hoyer and A. Tapp, Los Alamos preprint archive quantph/9705002. B. M. Terhal and J . A. Smolin, Los Alamos preprint archive quantp h/9705041. L. K. Grover, Phys. Rev. Lett. 79,4709 (1997). See, for example, T. Rabin and M. Ben-Or, “Verifiable Secret Sharing and Multi-party Protocols with Honest Majority,” in Proceedings of the ACM Symposium on the Theory of Computing, 1989, p. 73-85. H.-J. Briegel, W. Diir, S. J . van Enk, J. I. Cirac, and P. Zoller, Los Alamos preprint archive quant-ph/9712027. R. L. Rivest, A. Shamir and L. M. Adleman, Communications of the ACM 21, # 2, 120 (1978).
EXPERIMENTAL QUANTUM CRYPTOGRAPHY

HUGO ZBINDEN
Group of Applied Physics, University of Geneva, CH-1211, Geneva 4, Switzerland

In this chapter different experimental Quantum Cryptography set-ups based on optical fibres are presented. Performance and technical problems are discussed and compared. The transmission distances and error rates are essentially limited by the photon counters that are presently available. Possible eavesdropping strategies, the use of single photon sources and open air experiments are briefly discussed.
1 Introduction
This chapter discusses possible experimental realizations of the different quantum key distribution protocols presented in the preceding chapter. In all practical set-ups photons are the carriers of the information. The quantum information is encoded either in the phase or in the polarization of the photons. In the first experimental demonstration of quantum cryptography (QC) in 1992, the photons travelled over 30 cm in air [1]. Then, for many practical reasons, telecom fibres were used as channels in most of the following experiments [2-5]. We will therefore focus on fibre optical implementations and related problems. However, open air set-ups are still investigated with the ambitious goal of establishing secure connections to satellites. Results of open air experiments will be presented briefly in Sec. 8 at the end of the chapter.

In the next two sections standard fibre optic set-ups for the polarization coding and phase coding schemes are presented. Unfortunately, both schemes need a continuous alignment of the set-up. In polarization-based QC systems, the polarization has to be kept stable over tens of kilometres, in order to keep the polarizers at the emitter's and at the receiver's ends aligned. In fact, due to the effect of the environment the output polarization fluctuates randomly on a time scale of minutes. Therefore, these systems have to compensate actively for changes of the outcoming polarization. These fluctuations are generally slow enough that automatic tracking would be feasible. Interferometric QC systems are usually based on two unbalanced Mach-Zehnder interferometers, one at each end. Since two interfering pulses do not follow the same path within the two interferometers, the difference in arm lengths must be kept stable to a fraction of a wavelength for both interferometers, in order to obtain high visibility. Consequently, every few seconds, one interferometer has to be adjusted to the other to compensate for thermal drifts. In Sec. 4 an interferometric system with Faraday mirrors is presented [6]. This phase-coding set-up needs neither alignment of the interferometer nor polarization control, and therefore considerably facilitates the experiment. Advantages and disadvantages of all set-ups are briefly discussed.

The performance of any QC system is limited by the losses in the fibre and the noise of the single photon counters. The losses in optical fibres are typically around 2 dB/km at 800 nm, 0.35 dB/km in the 1300 nm telecom window and 0.2 dB/km in the 1550 nm telecom window. Hence, at 1300 nm the bit rate is reduced by a factor of ten after about 30 km. At this wavelength Ge or InGaAs avalanche photo diodes (APD) have to be used, instead of commercial silicon photon counting modules. This means lower detection efficiencies and higher dark count rates, hence lower bit rates and higher error rates. In Sec. 5 the sources of errors are defined, and limits for transmission length, bit rates and error rates for the different wavelengths are estimated. In fact, the noise of the available photon counters in the near infrared is one major problem of experimental QC and ultimately limits the transmission distance. Therefore we have a closer look at actual detector performances. The performance of APDs can be considerably improved using fast active biasing electronics. More details about single photon counting can be found in the Appendix. At the end of Sec. 5 a short review of the realized experiments is given. In Sec. 6 advantages and disadvantages of using weak laser pulses or photon pairs created by parametric downconversion are discussed. Finally, in Sec. 7, the susceptibility of the different set-ups to different eavesdropping strategies is briefly discussed.

2 Standard polarization coding set-up
Let us have a look at the experimental polarization coding set-up [2,3] based on the four-state protocol BB84. In all presented set-ups, faint laser pulses are used. Alice's light source is hence a pulsed semiconductor laser. Four different polarization states are created by an electro-optic polarization controller. Alternatively, as in the set-up shown in Fig. 1, four lasers with polarizers oriented at 0°, 90°, 45° and 135° are used, followed by three passive couplers (see footnote a). The lasers fire at random, at a given pulse rate $\nu$. The laser pulses are attenuated to an average number of photons per pulse $\mu$ well below 1 ($\mu = 0.1$, say), to keep the probability of obtaining more than one photon per pulse small (see Sec. 5). Bob randomly selects the polarization analyzer basis. This is most easily done by introducing a passive optical coupler that directs the photon to one of two polarizing beam splitters (PBS) oriented at 45° to each other. The photons are then detected with a photon counter, and acquisition electronics collects the data.

Footnote a: In this case one has to make sure that Eve cannot find out which laser has fired from differences in spectrum or timing.

Figure 1: Scheme of a polarization coding QC set-up. PBS denotes a polarization beam splitter, with the axis alignment defined by the arrow.

After the measurement Alice and Bob publicly compare the chosen bases (0°/90° or 45°/135°) of emission and detection, without revealing the polarization states transmitted and measured. Incompatible measurements are disregarded. With the other results a secret key can be established by interpreting 0° and 45° as bit 1, and 90° and 135° as bit 0. If, for example, Eve uses a simple intercept-resend strategy (i.e. measuring the polarization of each photon and sending on one polarized as per her result), she introduces an error of 25%. This can easily be detected by Alice and Bob, simply by comparing a sample of their key.

The axes of the polarizers at Alice and Bob must be aligned and kept aligned. This is the main specific difficulty for a fibre optic implementation of the polarization scheme. This difficulty is threefold:

1. The first problem is a topological one related to the transport of a vector along a curve. Since the path taken by the light in the optical fibre is three dimensional, its polarization rotates by an angle related to Berry's phase. This effect does not limit the distance or the quality of the transmission, if the fibre link is stable. However, an aerial cable or a cable subject to strong vibrational perturbations is not suitable.
2. The second difficulty arises from the intrinsic birefringence of optical fibres. Changes in the mechanical stress that causes birefringence will change the state of polarization at the output of the fibre. However, these changes are usually quite slow, i.e. of the order of tens of minutes, depending on the mechanical and thermal stability of the environment. Another effect of the birefringence is polarization mode dispersion (PMD) [10]. An optical cable behaves as a random concatenation of pieces of birefringent fibre. For long distances, the result is a spread of the pulses that grows with the square root of the length, in the same way as a random walk. To prevent depolarization of the light pulses, lasers with a coherence time larger than the polarization mode delay must be used. This is not a real limitation, since typical PMD values are between 0.1 ps km^{-1/2} and 1 ps km^{-1/2} and DFB semiconductor lasers feature more than 1 ns coherence time. However, PMD does limit the minimal usable pulse length of the laser.
3. A third potential problem is polarization dependent losses in optical components, which could arise in Passive Optical Networks. In this case the relation between the polarization state at the input and the output of the optical link is no longer unitary [11].

To summarize, polarization instabilities are mainly due to mechanical stresses and temperature variations. This requires the optical fibre to be kept as stable as possible. In an experiment on installed telecom fibres, a polarization separation of 23 dB over 23 km was achieved. The stability of the polarization alignment in the field experiment was excellent most of the time, and measurements could be performed for an hour without realigning the system. However, from time to time there were fast polarization instabilities, i.e. a rotation of the polarization by $2\pi$ happened within a few seconds. Hence a practical device needs an automated polarization controller to compensate for the fluctuations due to thermal and mechanical disturbances of the fibre. For this purpose, from time to time the laser will be switched to cw operation or the attenuation turned down to increase the signal. The polarization controller, consisting of Pockels cells (1 ns response time) or liquid crystals (20 ms response time, suitable for good conditions), would minimize the count rate of the detector that detects the orthogonal polarization. The polarization controller will be situated with Alice if it has considerable loss; otherwise it will be with Bob, for simpler and faster feedback. Of course, the polarization must also be kept stable in the fibres before the final coupler at Alice and after the first coupler at Bob. No continuous alignment should be necessary if the fibres are carefully fixed. These elements can also be constructed using bulk optics, which would guarantee high polarization stability.
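The 25% error quoted earlier in this section for the intercept-resend strategy can be checked by direct enumeration. The following is a minimal sketch, assuming ideal polarizers (cos² law), single photons and no channel noise; the function names are illustrative and not part of any experimental software.

```python
import math
from itertools import product

# Polarization angles (degrees) for each (basis, bit) pair, following the text:
# 0 deg and 45 deg encode bit 1, 90 deg and 135 deg encode bit 0.
ANGLE = {(0, 1): 0.0, (0, 0): 90.0, (1, 1): 45.0, (1, 0): 135.0}


def outcome_probability(photon_deg, basis, bit):
    """Probability that a photon polarized at photon_deg is found in the
    state (basis, bit) when analyzed in that basis (cos^2 law)."""
    delta = math.radians(photon_deg - ANGLE[(basis, bit)])
    return math.cos(delta) ** 2


def intercept_resend_error():
    """Average error rate in the sifted key when Eve measures every photon
    in a random basis and resends a photon polarized as per her result."""
    error = 0.0
    weight = 0.0
    # Alice picks a random basis and bit, Eve a random basis. Bob's basis is
    # set equal to Alice's, because only compatible measurements are kept.
    for a_basis, a_bit, e_basis in product((0, 1), repeat=3):
        prior = 1.0 / 8.0
        sent = ANGLE[(a_basis, a_bit)]
        for e_bit in (0, 1):
            p_eve = outcome_probability(sent, e_basis, e_bit)
            resent = ANGLE[(e_basis, e_bit)]
            p_wrong = outcome_probability(resent, a_basis, 1 - a_bit)
            error += prior * p_eve * p_wrong
            weight += prior * p_eve
    return error / weight


print(f"intercept-resend error rate: {intercept_resend_error():.3f}")
```

Eve chooses the wrong basis in half of the cases, and in those cases Bob obtains the wrong bit with probability 1/2, which gives the factor 1/4.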
3 Standard phase coding set-up
A standard phase coding set-up is shown in Fig. 2 [5,7,12]. There are two identical unbalanced Mach-Zehnder interferometers, i.e. interferometers with two paths of different length, one at Alice's location and one at Bob's. The path length difference is larger than the coherence length of the photons, to prevent interference at Alice's output beamsplitter. However, pulses taking the short path in Alice's interferometer and the long one in Bob's will interfere with pulses taking the long path in Alice's and the short one in Bob's. Non-interfering pulses taking the short arm twice or the long arm twice can be discarded, since they arrive in another time window at the detectors.

Figure 2: Scheme of a standard phase coding QC set-up. PM denotes a phase modulator and D0 and D1 are the photon detectors.

The four-state protocol BB84 can be readily implemented. In one arm, Alice randomly applies phase shifts of 0, $\pi/2$, $\pi$ or $3\pi/2$. Bob chooses a basis by applying a phase shift of 0 or $\pi/2$. If compatible bases have been chosen, i.e. the phase difference is 0 or $\pi$, the outcome at the detectors is deterministic. Hence a secret key can be established by interpreting 0 and $\pi/2$ as bit 1, and $\pi$ and $3\pi/2$ as bit 0. Again, incompatible measurements are disregarded.

The major disadvantage of phase coding set-ups is that the two interferometers have to be adjusted for equal path difference. This can be done by controlling the width of an air gap with a piezoelectric transducer in one arm of one interferometer. This adjustment has to be repeated from time to time to compensate for thermal drifts. Depending on the thermal stabilization of the interferometers, this time interval can vary from a few seconds to many minutes. The adjustment can be performed with intense light pulses in the first case or by continuously analyzing the error rate in the latter.

The two pulses at Bob's output beamsplitter interfere perfectly only if they are in the same polarization state. This could be obtained by adjusting just one polarization controller in one of the arms of Bob's interferometer. In practice, there is a passive polarization controller (consisting, for example, of three small fibre loops acting like two quarter wave retarders and one half wave retarder) in one arm of each interferometer.
They are adjusted in such a manner that an arbitrary input polarization state of an incoming pulse undergoes the same transformation in both arms of the interferometer. Thus the two interfering pulses experience the same polarization transformation in Alice's interferometer, the same, though randomly changing, polarization transformation in the fibre link, and finally the same polarization transformation in Bob's interferometer. Therefore, their output polarizations will always be identical; hence these pulses will interfere completely.

Unfortunately, integrated fibre optic phase modulators have polarization dependent losses and phase shifts. Therefore, to eliminate intensity and phase fluctuations in one arm of Bob's interferometer the incoming polarization state must also be controlled. So an additional active polarization controller that compensates for the polarization transformation of the link is needed, as in a polarization based set-up.

With all these polarization controllers installed, we can now replace the input coupler at Bob's by a polarization splitter and make sure that the two pulses leaving Alice are orthogonally polarized. In this way we can ensure that all pulses take interfering paths (short-long and long-short); this increases our bit rate by a factor of four. Since orthogonal polarizations do not interfere, the delay loops in the two interferometers are in principle no longer necessary. So let us ignore the delay loops for a while and assume that the two interferometers are balanced. Alice's Mach-Zehnder interferometer with the phase shifter can now simply be regarded as a polarization rotator. Bob's apparatus is then a polarization analyzer. The interferometric set-up is finally equivalent to the polarization coding scheme.

In practice, however, it is advantageous to keep the delay loops. In the case of poor polarization alignment, the polarizations of the two pulses may no longer be parallel and perpendicular to the axis of the PBS. So, a small fraction of the pulses will go through the wrong arm. Since the incoming pulses are orthogonally polarized, this loss will be the same for both pulses. Moreover, due to the delay loops, the fractions taking the wrong arm will not interfere. Hence bad polarization adjustment just introduces some loss, reducing the bit rate. However, it does not increase the error rate.

To summarize, like the polarization coding set-up, the phase coding set-up demands polarization control. The main disadvantage is that in each Mach-Zehnder interferometer the path length differences must be actively balanced. An advantage is that fast integrated phase modulators are commercially available. The fringe visibility obtained with phase coding is typically 0.985 [12].

4 QC using Faraday mirrors
We have seen that standard polarization coding, as well as standard phase coding, demands continuous alignment of the set-up. In this section an autobalanced QC set-up based on an interferometer with Faraday mirrors is discussed.

Let us have a closer look at the QC scheme [6,13] depicted in Fig. 3, disregarding the Faraday rotators (FR) for the moment. Their crucial effect will be explained later. In principle Bob has a very unbalanced Michelson interferometer (beamsplitter C2) with one long arm going all the way to Alice. The laser pulse impinging on C2 is split into two pulses, P1 and P2. P2 propagates through the short arm first (mirror M2 then M1) and then travels to Alice and back, whereas P1 propagates first to Alice and then passes through the short arm. As both pulses run exactly the same path length, they interfere maximally at C2 (disregarding polarization for the time being).

Figure 3: Experimental set-up of an interferometric QC system with Faraday mirrors. C1, C2 and C3 are fibre optic couplers; M1, M2 and M3 are Faraday mirrors (ordinary mirrors in combination with Faraday rotators, FR); the PMs are phase modulators; A is an attenuator; D0 is a photon counter and DA a photodiode; T is an optional trigger output; SRS is a delay generator; the FGs are function generators; & denotes an "and-gate."

To encode their bits, Alice acts with her phase modulator (PM) only on P2 (phase shift $\phi_a$), whereas Bob lets pulse P2 pass unaltered and modulates the phase of P1 (phase shift $\phi_b$). If no phase shifts are applied or if the difference $\phi_a - \phi_b = 0$, then the interference will be constructive. On the contrary, when $\phi_a - \phi_b = \pi$, the interference will be destructive and no light will be detected by detector D0.

Since the interfering pulses travel the same path, the interferometer is automatically aligned. The visibility of the fringes is also independent of the splitting ratio of C2. However, the visibility does depend strongly on the polarization states of the interfering pulses. Let $M_{\mathrm{in}}$ be the vector which fixes the point on the Poincaré sphere representing the polarization state of the incoming laser pulse at C2. Points representing linear polarization states lie on the equator of the sphere and those representing right and left circular polarized light sit at the north and south poles, respectively. All other points on the sphere correspond to elliptically polarized light. The polarization states of the interfering pulses
P1 and P2 are given by

$$M_{1,\mathrm{out}} = R_1 R_2 R_3 M_{\mathrm{in}} \qquad (1)$$

$$M_{2,\mathrm{out}} = R_3 R_1 R_2 M_{\mathrm{in}} \qquad (2)$$
where $R_i$ is the matrix describing the polarization rotation in a round trip path to mirror $\mathrm{M}_i$. Since rotation operators do not commute, these two operations are in general not identical, hence the two outcoming polarizations are not parallel. This is where the Faraday mirrors (FM) enter the game. A FM is composed of a 45° Faraday rotator and a mirror. A light pulse with an arbitrary polarization injected into a fibre terminated by a FM will come back exactly orthogonally polarized, regardless of the polarization transformations in the fibre due to induced birefringence. Orthogonal polarization states are represented by antiparallel vectors on the Poincaré sphere. Hence a round trip path in any fibre terminated with a FM will lead to a polarization transformation $R = -1$. This is true because there are no significant mechanical or thermal variations during the time of flight of the photons, which is, for example, 300 µs for a 30 km link. However, this applies only if there is no Faraday rotation inside the fibre. In fact, although the Verdet constant of a standard optical fibre is low, the Faraday rotation due to the geomagnetic field may not be completely neglected for optical fibres of several tens of kilometres (see footnote d); in such cases $R_3 \neq -1$. However, with $R_1 = R_2 = -1$ we obtain $M_{1,\mathrm{out}} = R_3 M_{\mathrm{in}} = M_{2,\mathrm{out}}$. So in principle, the Faraday rotator in front of M3 would not be necessary, but it is retained in order to get rid of the polarization dependence of the integrated phase shifter.

To quantify the performance of the interferometer, the ratio of the count rates for constructive and destructive interference is measured. In practice, this ratio is obtained by measuring the reduction of the attenuation (A) that leads to the same count rate for destructive interference as for constructive interference. An extinction of 30 dB was achieved over 23 km [6]. When one Faraday mirror is replaced by an ordinary mirror, the extinction fluctuates strongly and can be reduced to 20 dB. If two Faraday mirrors are removed, essentially no interference is visible.

Footnote b: Faraday discovered that a constant magnetic field can generate a circular birefringence in isotropic media (the Faraday effect). Hence a linear polarization is rotated by an angle $\alpha$ given by $\alpha = V H l$, where $V$ is the Verdet constant of the material, $H$ the magnetic field and $l$ the path length. A Faraday rotation by an angle $\alpha$ is represented on the Poincaré sphere by a rotation of the polarization vector around the north-south polar axis by an angle of $2\alpha$.

Footnote c: This description of Faraday mirrors requires that after a reflection, one switches from a right handed to a left handed reference frame, or vice versa; hence one changes the hemisphere on the Poincaré sphere. This is no problem as long as the interfering paths each undergo the same number (parity) of reflections.

Footnote d: The horizontal component of the geomagnetic field $H = B/\mu_0$ is 17 A/m in Geneva, and the Verdet constant of silica is ca. 0.6 O/A at 1300 nm. Therefore, the polarization is turned by about twice 1' per km displacement in the north-south direction. However, polarization mode coupling strongly reduces this effect.
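The orthogonality property of the Faraday mirror can be illustrated numerically with Jones matrices instead of the Poincaré sphere. The sketch below rests on several assumptions that are not taken from the text: the fibre is lossless and reciprocal and is modelled by a random unitary 2x2 matrix; all fields are written in one fixed laboratory frame, so that the backward pass through the fibre is the transpose of the forward pass; an ordinary mirror is the identity in that frame; and the 45° Faraday rotator turns the field in the same sense on both passes. In this convention the returning light is orthogonal to the input when the bilinear product of input and output Jones vectors vanishes.

```python
import cmath
import math
import random


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def transpose(a):
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))


def apply(a, v):
    return tuple(a[i][0] * v[0] + a[i][1] * v[1] for i in range(2))


def random_su2(rng):
    """Random lossless fibre: a unitary 2x2 Jones matrix with determinant 1."""
    theta = rng.uniform(0.0, math.pi / 2)
    a = cmath.exp(1j * rng.uniform(0.0, 2 * math.pi)) * math.cos(theta)
    b = cmath.exp(1j * rng.uniform(0.0, 2 * math.pi)) * math.sin(theta)
    return ((a, b), (-b.conjugate(), a.conjugate()))


def rotation(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return ((c, -s), (s, c))


def round_trip(fibre, mirror):
    """Forward pass, reflection, then backward pass (transpose: assumes a
    reciprocal fibre described in a fixed laboratory frame)."""
    return matmul(transpose(fibre), matmul(mirror, fibre))


def overlap_with_input(jones, e_in):
    """Magnitude of the bilinear product e_in . (J e_in); zero means that
    the returning light is orthogonally polarized (assumed convention)."""
    e_out = apply(jones, e_in)
    return abs(e_in[0] * e_out[0] + e_in[1] * e_out[1])


FARADAY_MIRROR = matmul(rotation(45), rotation(45))  # two passes of 45 deg
ORDINARY_MIRROR = rotation(0)                        # identity in this frame

rng = random.Random(1)
worst_fm = 0.0
worst_plain = 0.0
for _ in range(1000):
    fibre = random_su2(rng)
    chi = rng.uniform(0.0, math.pi / 2)
    phase = cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))
    e_in = (math.cos(chi), phase * math.sin(chi))
    worst_fm = max(
        worst_fm, overlap_with_input(round_trip(fibre, FARADAY_MIRROR), e_in))
    worst_plain = max(
        worst_plain, overlap_with_input(round_trip(fibre, ORDINARY_MIRROR), e_in))

print(f"largest overlap with Faraday mirror:  {worst_fm:.2e}")
print(f"largest overlap with ordinary mirror: {worst_plain:.2e}")
```

With the Faraday mirror the overlap stays at the level of rounding errors for every random fibre and input state, whereas with an ordinary mirror it depends on the birefringence and can approach 1.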
The 2-state protocol B92 [14] has to be applied with the above set-up. To implement the 4-state protocol [7], another coupler and detector have to be inserted. The key exchange in the B92 protocol proceeds as follows. Alice and Bob choose at random 0 or $\pi$ phase shifts, defined as bit values 0 and 1. Since very weak pulses are used, in most cases no photon will be detected in D0. If a detection (i.e. constructive interference) occurred, Alice and Bob know that they applied the same phase shift, and they register the same bit value.

In our interferometric set-up the pulses leaving Bob carry no phase information. The information is in the phase difference of the two pulses P1 and P2 leaving Alice. The attenuator (A) is set such that the weaker pulse P2, which has already passed through Bob's delay line, has 0.05 photons on average when leaving Alice. The information that Eve could obtain depends on the number of photons in the weaker pulse. Therefore, to measure the phase difference, she must attenuate P1 to the intensity of P2 in order to obtain complete interference. She actually performs the same measurement as Bob does. More generally, such a measurement can be called a Loss Induced Generalized Measurement [14]. Consequently, 0.05 photons in the weaker pulse is equivalent to an average number of $\mu = 0.1$ for the pulse pair. Of course, this reasoning also applies to the standard time multiplexed interferometer set-up (Fig. 2), where the two pulses may also have different intensities.
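The key exchange described above can be sketched as a small simulation. It is only an illustration under simplifying assumptions: the two pulses have equal amplitude and perfect visibility, dark counts are ignored, and `mu_eff` is a hypothetical parameter standing for the mean number of detected photons per pulse pair at fully constructive interference (source intensity, transmission and detector efficiency combined).

```python
import math
import random


def d0_intensity(phi_a, phi_b):
    """Relative intensity at detector D0 for two equal-amplitude pulses:
    1 for constructive interference, 0 for destructive interference."""
    return math.cos((phi_a - phi_b) / 2.0) ** 2


def b92_round(n_pulses, mu_eff, rng):
    """Simulate n_pulses exchanges and return the bits registered by Alice
    and by Bob for the pulses that produced a detection in D0."""
    alice_key, bob_key = [], []
    for _ in range(n_pulses):
        bit_a = rng.randint(0, 1)  # Alice applies a phase of 0 or pi
        bit_b = rng.randint(0, 1)  # Bob applies a phase of 0 or pi
        intensity = d0_intensity(bit_a * math.pi, bit_b * math.pi)
        # Poissonian light: probability of at least one detected photon.
        p_click = 1.0 - math.exp(-mu_eff * intensity)
        if rng.random() < p_click:
            alice_key.append(bit_a)
            bob_key.append(bit_b)
    return alice_key, bob_key


rng = random.Random(0)
alice_key, bob_key = b92_round(100_000, 0.01, rng)
print(f"detections: {len(alice_key)}, keys identical: {alice_key == bob_key}")
```

A detection only occurs when both phase shifts are equal, so the registered bits agree; most pulses give no detection at all, as stated in the text.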
The great advantage of this set-up is of course that no continuous alignment is needed. It is also noteworthy that the timing of Alice's apparatus can be pre-adjusted in the lab, and will not change, even if the apparatus is connected to another fibre to communicate with a third party. The timing of Bob's apparatus, especially of his photon counter, has to be adjusted once for every link. A problem of the set-up is that reflections lead to parasitic pulses impinging on Bob's photon counter. Of course these pulses can be discriminated by the short detection window. Going to pulse frequencies higher than the inverse of the round-trip time implies that several pulse pairs are on the way between Alice and Bob at the same time, which makes this discrimination more difficult. Moreover, parasitic pulses arriving at the detector immediately before a detection window can increase the dark count rate.
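To get a feeling for the pulse frequency at which several pulse pairs start to share the link, the round-trip time can be estimated from the link length. This is a back-of-the-envelope sketch; the group index of 1.5 is an assumed, illustrative value for a silica fibre and is not taken from the text.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s
GROUP_INDEX = 1.5         # assumed group index of the fibre (illustrative)


def round_trip_time(link_km, group_index=GROUP_INDEX):
    """Time of flight Bob -> Alice -> Bob in seconds."""
    return 2.0 * link_km * 1e3 * group_index / C_VACUUM


def max_single_pair_rate(link_km, group_index=GROUP_INDEX):
    """Highest pulse rate (Hz) with at most one pulse pair in flight."""
    return 1.0 / round_trip_time(link_km, group_index)


for km in (10, 23, 30):
    t = round_trip_time(km)
    rate_khz = max_single_pair_rate(km) / 1e3
    print(f"{km:3d} km: round trip {t * 1e6:6.1f} us, max rate {rate_khz:5.2f} kHz")
```

With these assumptions the round trip over a 30 km link takes about 0.3 ms, so pulse rates above a few kHz already put more than one pulse pair on the line.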
5 The performance of a QC-setup: transmission length, data rate and quantum bit error rate
The transmission length, the data rate and the quantum bit error rate are the three values of interest for a QC set-up. In this section we will discuss how these values depend on the wavelength used and on the performance of the corresponding detector. They also depend on the average number of photons per pulse. We will compare the use of attenuated laser pulses and two-photon sources. Let us consider a QC set-up with a laser pulse rate $\nu$. The average number of photons per pulse at the output of Alice is $\mu$. The total transfer efficiency $\eta_t$ between the output of Alice and the detector can be expressed as

$$\eta_t = 10^{-(L_f l + L_B)/10} \qquad (3)$$

where $L_f$ represents the fibre losses in dB/km, $l$ is the length of the link in km and $L_B$ represents the internal losses at Bob's end in dB. Finally we have a detector with an efficiency $\eta_d$. Hence the raw data rate $R$, i.e. the number of exchanged bits per second before any error correction, is given by

$$R = q \mu \nu \eta_t \eta_d \qquad (4)$$

Here $q$ is a systematic factor smaller than (or equal to) 1/2, depending upon the chosen implementation. For example, in the case of the polarization scheme of Fig. 1, $q$ equals the maximum value 1/2: in half of the cases the chosen bases are not compatible and the corresponding detections have to be neglected. For the phase scheme in Fig. 2, without PBS, $q$ would be 1/8. The raw bit rate $R$ will be further reduced when error correction and privacy amplification (see chapter 4) are applied, by an amount dependent upon the error rate and the algorithms used. The error is generally expressed as the ratio of wrong bits to the total number of detected bits. We call this quantity the quantum bit error rate (QBER; see footnote e). It is equivalent to the ratio of the probability to get a false detection to the total probability of detection per pulse,

$$\mathrm{QBER} = \frac{p_{\mathrm{opt}}\, p_{\mathrm{phot}} + p_{\mathrm{dark}}}{p_{\mathrm{phot}} + 2\, p_{\mathrm{dark}}} \approx p_{\mathrm{opt}} + \frac{p_{\mathrm{dark}}}{p_{\mathrm{phot}}} = \mathrm{QBER}_{\mathrm{opt}} + \mathrm{QBER}_{\mathrm{det}} \qquad (5)$$

$$p_{\mathrm{phot}} = \mu\, \eta_t\, \eta_d, \qquad p_{\mathrm{dark}} = n_{\mathrm{dark}}\, \Delta T \qquad (6)$$
$^{e}$ Physicists often call this quantity the bit error rate (BER). In telecommunications the BER is commonly used for the total error in a transmission and is many orders of magnitude smaller than in QC, where it is of the order of 1%. Of course, this does not correspond to the final error in the message, since error correction will be applied. However, to prevent any confusion for telecoms specialists, we use the term QBER for the ratio of wrong bits to detected bits. Note that in theoretical papers about eavesdropping, the QBER introduced by Eve is often called the disturbance ($D$).
Table 1: $QBER_{opt}$ for different QC set-ups.

                polarization coding    phase coding    Faraday mirror
QBER_opt        0.75%                  0.5%            0.15%
where $p_{dark}$, $p_{phot}$ and $p_{opt}$ are the probabilities to get a dark count, to detect a photon and that a photon went to an erroneous detector, respectively. $n_{dark}$ is the dark count rate of the detector and $\Delta\tau$ is the detection time window, so that $p_{dark} = n_{dark}\Delta\tau$. This expression for QBER applies for a set-up with two detectors. Since a dark count will with a 50% chance not lead to an error, but just to an additional count, there is a factor two in the denominator of Eq. 5, but not in the numerator. Note that QBER is independent of the factor $q$ in Eq. 4, since we do not consider errors when incompatible bases are used. The QBER consists of two parts. The first part is what we call $QBER_{opt}$, that is the fraction of photons $p_{opt}$ whose polarization or phase is erroneously determined, i.e. the fraction of photons that end up in the wrong detector. This is mainly due to depolarization and to poor polarization alignment, or due to the limited visibility of the interferometers. $p_{opt}$ can easily be determined by measuring the polarization ratio, the extinction ratio or the classical fringe visibility $V$:

$$p_{opt} = \frac{I_{min}}{I_{max} + I_{min}} \approx \frac{I_{min}}{I_{max}} \approx \frac{1 - V}{2} \qquad (7)$$
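The quantities of this section lend themselves to a quick numerical check. The Python sketch below evaluates the transfer efficiency (Eq. 3), the raw rate (Eq. 4) and a two-detector QBER. It assumes that the detection probability per pulse is p_phot = μ η_t η_d and that the QBER has the form (p_opt p_phot + p_dark)/(p_phot + 2 p_dark), which is how the text describes Eq. 5. The function names and the example link (2.5 km at 0.35 dB/km, η_d = 20%, p_dark = 3.3 × 10^-6, no internal loss at Bob's end) are illustrative assumptions.

```python
def transfer_efficiency(fibre_loss_db_per_km, length_km, bob_loss_db=0.0):
    """Eq. 3: eta_t = 10^(-(L_f * l + L_B) / 10)."""
    return 10 ** (-(fibre_loss_db_per_km * length_km + bob_loss_db) / 10)


def raw_rate(q, mu, pulse_rate_hz, eta_t, eta_d):
    """Eq. 4: R = q * mu * nu * eta_t * eta_d, in bits per second."""
    return q * mu * pulse_rate_hz * eta_t * eta_d


def qber(p_opt, p_dark, mu, eta_t, eta_d):
    """Two-detector QBER, assuming p_phot = mu * eta_t * eta_d."""
    p_phot = mu * eta_t * eta_d
    return (p_opt * p_phot + p_dark) / (p_phot + 2 * p_dark)


def p_opt_from_visibility(visibility):
    """Eq. 7: fraction of photons reaching the wrong detector."""
    return (1 - visibility) / 2


# Illustrative link: 1300 nm fibre, 2.5 km, nu = 10 MHz, mu = 0.1, q = 0.5.
eta_t = transfer_efficiency(0.35, 2.5)
rate = raw_rate(q=0.5, mu=0.1, pulse_rate_hz=10e6, eta_t=eta_t, eta_d=0.2)
qber_det = qber(p_opt=0.0, p_dark=3.3e-6, mu=0.1, eta_t=eta_t, eta_d=0.2)

print(f"eta_t    = {eta_t:.3f}")
print(f"R        = {rate / 1e3:.0f} kHz")
print(f"QBER_det = {100 * qber_det:.3f} %")
```

Under these assumptions the sketch gives a raw rate of roughly 82 kHz and a detector-limited QBER of about 0.02%.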
In Table 1 the experimentally achieved $QBER_{opt}$ values for the three set-ups presented are summarized. The second part of Eq. 5, $QBER_{det}$, due to the dark count rate of the photon counters, is proportional to $\Delta\tau$. Hence a good detector must not only be efficient and have a small dark count rate, but it should also have a small time jitter, at least smaller than the pulse length of the laser diodes. Photon counting is achieved by using avalanche photodiodes (APDs) in the so-called Geiger mode. The performance of APDs can be improved if the bias voltage is only applied at the times when a photon is expected (active biasing or gating). In the telecom windows at 800 nm and 1300 nm, Si APDs and Ge or InGaAs APDs are used, respectively. In the 1550 nm window, InGaAs APDs could be used at higher temperature, however with a considerably increased dark count rate. For every diode a trade-off between high efficiency and low noise has to be found by choosing the appropriate bias voltage. In the appendix, photon counting with APDs is discussed in more detail. In Table 2 the performances of actual APDs are summarized.

Table 2: Photon counting performance of APDs. The Ge and the InGaAs diodes were actively biased with 2.6 ns pulses (see appendix).

The $QBER_{det}$ depends not only on the performance of the detector as shown above, but also on the transfer efficiency $\eta_t$ and the photon number per pulse. The transfer efficiency is essentially determined by the fibre losses, which must be considered as given. For the mean photon number, however, a trade-off between security and bit rate has to be found. The photon number of a laser (coherent state) is Poisson distributed, with parameter $\mu$. The probability to have exactly $n$ photons in a pulse is thus given by
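$$P(n) = \frac{\mu^n}{n!} \exp(-\mu) \qquad (9)$$

The mean number of photons per pulse is therefore

$$\langle n \rangle = \sum_{n=0}^{\infty} n P(n) = \mu$$

This is chosen to satisfy $\mu < 1$. The probability to have more than one photon in a pulse is therefore:

$$P(n \ge 2) = 1 - P(0) - P(1) = 1 - \exp(-\mu)(1 + \mu) \approx \frac{\mu^2}{2} + O(\mu^3)$$

The probability to have more than one photon in non-empty pulses is

$$P(n \ge 2 \mid n \ge 1) = \frac{1 - \exp(-\mu)(1 + \mu)}{1 - \exp(-\mu)} \approx \frac{\mu}{2} \qquad (11)$$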
So in the case of $\mu$ = 0.1, 0.2 and 1, the probability for more than one photon in non-empty pulses is 5%, 10% and 42%, respectively. The question of which $\mu$ to choose is not easy to answer. It depends on the algorithms used for privacy amplification, the measured QBER and, of course, the eavesdropping techniques Eve is supposed to possess (see Sec. 7). Relatively high values of $\mu$ will on the one hand increase the raw bit rate and reduce the QBER, and hence the necessary error correction. On the other hand, this bit rate will then be reduced due to the required privacy amplification procedure. Most recent experiments have used $\mu$ = 0.1 or $\mu$ = 0.2. The possibility to work with single photon number states is discussed in the next section. In Table 3 the $QBER_{det}$ and $R$ for the different wavelengths with corresponding detector performance are summarized for different fibre lengths. Note that the wavelength of 800 nm is a good choice only for short distances up to a few kilometres, taking advantage of the efficient and commercially available Si detectors. The disadvantage is that fibres and modulators are generally conceived for the longer telecom wavelengths. For medium distances up to 25 km, Peltier cooled InGaAs could be used. For longer distances liquid nitrogen (LN2) cooling is still necessary. However, with only slightly improved performance of the detectors at 177 K, 1550 nm will clearly be preferable for long distance QC over 100 km or more. Using single photon states ($\mu = 1$),$^{f}$ all values of $QBER_{det}$ could in principle be reduced by a factor of 10. According to recent calculations QC could be performed securely with QBER up to 15%.$^{15}$ In the last row of Table 3, the maximum length of the link leading to this QBER is calculated (neglecting $QBER_{opt}$). The limit for 1550 nm is around 120 km using real single photon states ($\mu = 1$), a limit that, once again, depends strongly on the detector performance and thus on how this may develop in the future. Of course, the obtained raw bit rates will be further reduced by error correction and privacy amplification, depending on the corresponding QBER.

$^{f}$ Note that these have one photon per pulse, so $\mu = 1$, but that they are not a coherent state with the Poisson distribution of Eq. 9. Rather, $P(n) = \delta_{n1}$, and there is only an amplitude for the Fock state $|1\rangle$.
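The multi-photon probabilities discussed above follow directly from the Poisson distribution and are easy to evaluate. The sketch below is a minimal check; the function names are illustrative and not part of the original text. For μ = 0.1 and 0.2 the conditional probability comes out close to 5% and 10%. For μ = 1 it evaluates to about 42%, while the share of non-empty pulses containing exactly one photon is about 58%.

```python
import math


def poisson(n, mu):
    """Probability of exactly n photons in a pulse with mean photon number mu."""
    return math.exp(-mu) * mu ** n / math.factorial(n)


def p_multi(mu):
    """Probability of two or more photons in a pulse."""
    return 1 - poisson(0, mu) - poisson(1, mu)


def p_multi_given_nonempty(mu):
    """Probability of two or more photons, given that the pulse is not empty."""
    return p_multi(mu) / (1 - poisson(0, mu))


for mu in (0.1, 0.2, 1.0):
    print(f"mu = {mu}: P(n>=2) = {p_multi(mu):.4f}, "
          f"P(n>=2 | n>=1) = {p_multi_given_nonempty(mu):.3f}")
```

For small μ the two quantities approach μ²/2 and μ/2, which is why halving μ roughly halves the fraction of non-empty pulses that are vulnerable to a beam-splitting attack.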
Table 3: Quantum bit error rates (QBER) and raw data rates ($R$) for different wavelengths and detector performance for three different fibre lengths with $\nu$ = 10 MHz, $\mu$ = 0.1 (or as indicated) and $q$ = 0.5.

wavelength (detector)       800 nm (Si)       1300 nm InGaAs                  1550 nm (InGaAs)
temperature                 Peltier cooled    77 K            77 K            177 K           177 K
eta_d                       50%               20%             30%             10%             2%
p_dark                      1 x 10^-8         3.3 x 10^-6     10 x 10^-6      20 x 10^-6      10 x 10^-6
L (dB/km)                   2.0               0.35            0.35            0.35            0.2
2.5 km:  QBER_det           0.00006%          0.02%           0.04%           0.25%           0.56%
         R                  79 kHz            82 kHz          123 kHz         27 kHz          8.9 kHz
25 km:   QBER_det           0.2%              0.12%           0.25%           1.5%            1.6%
         R                  25 Hz             13 kHz          20 kHz          6.7 kHz         3.2 kHz
50 km:   QBER_det           -                 0.93%           1.9%            11%             5%
         R                  -                 1.8 kHz         2.7 kHz         0.89 kHz        1 kHz
max. length (QBER = 15%)    29 km             84 km           76 km           54 km           74 km
         R                  0.3 Hz            110 Hz          0.33 kHz        0.67 kHz        333 Hz
max. length (mu = 1)        34 km             113 km          104 km          82 km           124 km
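The maximum link lengths in the last rows of Table 3 can be reproduced by inverting the detector-limited error rate. The sketch below assumes QBER_det = p_dark / (μ η_t η_d) with η_t from Eq. 3 and no internal loss at Bob's end. The function name is illustrative, and the example values (η_d = 2%, p_dark = 10^-5, 0.2 dB/km) are taken as an assumed reading of the last 1550 nm column.

```python
import math


def max_length_km(p_dark, eta_d, fibre_loss_db_per_km,
                  mu=0.1, qber_max=0.15, bob_loss_db=0.0):
    """Longest link for which QBER_det stays at or below qber_max.

    Assumes QBER_det = p_dark / (mu * eta_t * eta_d) and
    eta_t = 10^(-(L_f * l + L_B) / 10). A negative result means the
    threshold is already exceeded at zero length.
    """
    eta_t_min = p_dark / (qber_max * mu * eta_d)
    total_loss_db = -10 * math.log10(eta_t_min)
    return (total_loss_db - bob_loss_db) / fibre_loss_db_per_km


faint_pulse = max_length_km(p_dark=1e-5, eta_d=0.02, fibre_loss_db_per_km=0.2)
single_photon = max_length_km(p_dark=1e-5, eta_d=0.02,
                              fibre_loss_db_per_km=0.2, mu=1.0)

print(f"mu = 0.1: {faint_pulse:.0f} km")
print(f"mu = 1.0: {single_photon:.0f} km")
```

With these inputs the two results are about 74 km and 124 km. Going from μ = 0.1 to μ = 1 buys exactly 10 dB of additional loss budget, which corresponds to 50 km at 0.2 dB/km.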
So, the above-mentioned trade-off between efficiency and noise of the detector depends not only on the transmission length, but also on the error correction algorithms. Note that only at short distances is $QBER_{det}$ smaller than a typical $QBER_{opt}$ of around 0.5%. It is clearly the performance of the existing detectors that limits the QBER and the maximum transmission distance of QC. All three fibre optical QC schemes discussed in the preceding sections have been experimentally realized by different groups in the USA, in the UK and in Switzerland. Franson et al.$^{3}$ demonstrated a 5 kHz data rate over 1 km at 0.4% QBER ($\mu = 0.2$) using the polarization scheme with a 633 nm HeNe laser. Muller et al.$^{2}$ showed the feasibility of this scheme at 1300 nm over 23 km of installed telecom fibres, by demonstrating a polarization separation of 23 dB. The phase coding scheme has been demonstrated by Marand et al.$^{4}$ over 21.8 km and 30 km with 700 Hz and 260 Hz bit rates at 2.8% and 4% QBER ($\mu = 0.2$), respectively. Hughes et al.$^{16}$ have even performed an experiment over 48 km. Townsend showed the compatibility of the phase coding set-up with a multi-user optical fibre network.$^{17}$ The Faraday mirror set-up was finally realized by the Swiss group over 23 km, featuring excellent stability and a very low QBER of 1.3%, however at a low bit rate of about 1 Hz.$^{6}$ Hence, the technical feasibility of QC has been successfully demonstrated several times. The commercial feasibility will depend on technological progress, especially in the field of photon counters. But even more important will be the development of computer performance and the related demand for an alternative to public key cryptosystems. As soon as there is a market for QC, commercial QC systems will be available. This, however, will be the case in the next millennium.
6
QC with single photon sources
A QC set-up exploiting quantum mechanical single-mode number states $|1\rangle$, containing exactly one photon, would have two major advantages. On the one hand, the pulses have by definition a zero probability of containing two or more photons. Hence, one possible leak of information is definitely closed (Sec. 7). On the other hand, the $QBER_{det}$ will be reduced by a factor of ten with respect to pulses with $\mu = 0.1$. Unfortunately, number states are very hard to realize experimentally. However, a good approximation of a single photon state can be realized with two-photon sources based on parametric down-conversion (see e.g. Ref. 18). Such a source consists essentially of a pump laser and a nonlinear crystal. The nonlinear crystal transforms a pump photon ($\lambda_p$) into two simultaneously emitted photons ($\lambda_1$, $\lambda_2$). Energy and momentum are conserved, hence $1/\lambda_p = 1/\lambda_1 + 1/\lambda_2$. A 10 mW pump laser can typically produce about $10^6$ photon pairs per second with a bandwidth of approximately 10 nm, depending on the crystal and the phase matching conditions. Parametric down-conversion can be used to create an entangled state and to perform QC based on a Bell experiment.$^{19-21}$ Here, we discuss only the application of such a source to produce a single photon state, where one photon just serves as a trigger for the presence of its twin. The emission of photon pairs is random. Taking into account our time resolution, which is much longer than the coherence time of the photon pairs, the photon pair number can be considered as Poisson distributed, like the photon number of attenuated laser pulses. For a 1 MHz pair production rate the probability to have a second photon originating from another pair in a 1 ns time window is 0.05%. This is equal to the probability of finding more than one photon in a non-empty laser pulse with $\mu = 0.001$. The bit rate is as high as for 10 MHz pulses with $\mu = 0.1$, but the $QBER_{det}$ is decreased by a factor 10 according to Eq. 6. Some important experimental problems oppose these obvious advantages:

1. The first problem is the short coherence time of the photons. A bandwidth of 10 nm at 1310 nm corresponds to a coherence time of about 0.5 ps. So, the polarization mode dispersion, in the range of 0.1 ps km$^{-1/2}$ to 1 ps km$^{-1/2}$, leads to a depolarization of the photon after a short distance. Hence, in particular, the polarization coding set-up becomes impracticable. Alternatively, the bandwidth of the photons has to be reduced at the expense of the production rate.

2. The losses at Alice’s apparatus become important, in contrast to the realizations with faint laser pulses. In particular, single photons cannot be exploited with the Faraday mirror set-up due to the high losses in a round trip Bob-Alice-Bob.

3. The pulses will not be regularly distributed in time, complicating the triggering, the random number generation and the recording of the counts.

4. The two-photon source is more complicated than a simple diode laser. The wavelength of the trigger photon is chosen in the detection range of high efficiency and low noise Si detectors, so $\lambda_1 < 1000$ nm. With $\lambda_2 = 1310$ nm, a green laser is necessary (instead of a diode laser), e.g. a frequency doubled Nd:YAG laser: 532 nm → 1310 nm + 896 nm.
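Two of the numbers quoted in this list can be checked with a few lines of Python. The sketch below uses the energy conservation relation 1/λ_p = 1/λ_1 + 1/λ_2 to find the trigger wavelength for a 532 nm pump and a 1310 nm signal photon. It also estimates the coherence time from the bandwidth using τ ≈ λ²/(c Δλ), which is an order-of-magnitude assumption rather than a definition taken from the text. The function names are illustrative.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def twin_wavelength_nm(pump_nm, signal_nm):
    """Wavelength of the second photon from 1/l_p = 1/l_1 + 1/l_2."""
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)


def coherence_time_ps(centre_nm, bandwidth_nm):
    """Rough coherence time tau ~ lambda^2 / (c * delta_lambda), in ps."""
    centre_m = centre_nm * 1e-9
    bandwidth_m = bandwidth_nm * 1e-9
    return centre_m ** 2 / (C_M_PER_S * bandwidth_m) * 1e12


trigger_nm = twin_wavelength_nm(pump_nm=532.0, signal_nm=1310.0)
tau_ps = coherence_time_ps(centre_nm=1310.0, bandwidth_nm=10.0)

print(f"trigger photon wavelength: {trigger_nm:.0f} nm")
print(f"coherence time: {tau_ps:.2f} ps")
```

The trigger photon lands near 896 nm, inside the range of Si detectors, and the coherence time is a little over 0.5 ps, comparable to the polarization mode dispersion accumulated over a few kilometres of fibre.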
In conclusion, single photon sources can potentially increase the performance of a QC system. However, the current experimental difficulties imply that set-ups applying faint laser pulses are to be preferred for the time being.
7
Practical eavesdropping
We have seen that the simplest beam-splitting attack of Eve can be prevented using weak pulses with e.g. $\mu = 0.1$. More elaborate strategies are analyzed in some of the references. We would like to discuss another strategy that Eve could follow in the case of the Faraday mirror set-up. She could cut the fibre and try to measure actively the phase or polarization settings applied by Alice. Eve could mount an interferometer similar to Bob’s and measure with intense light pulses the phase shifts applied by Alice. Then, she can apply the same phase shifts to the pulses received from Bob and send them back to him, as if she were Alice. However, as Alice attenuates the incoming pulses by more than 40 dB down to the 0.1 photon level before sending them back, Eve is forced to send intense pulses to Alice, which can be detected by the detector DA, inserted for this purpose. However, by assumption, Eve has perfect technology at her disposal. Therefore, she could for example try to sense Alice’s phase with a very short pulse beyond the bandwidth of DA. Alice, in return, could prevent such an intrusion with a narrow line filter. Probably any kind of intrusion could be prevented with the appropriate means, but security would no longer be guaranteed solely by the fundamental laws of quantum mechanics. In fact, all the other QC schemes face the same problem. In the standard phase scheme the position of the phase shifter could be sensed interferometrically using small reflections at Alice’s or Bob’s ends. Or, hypothetically, Eve might devise an optical technique to find out which laser fired or which detector clicked in the polarization scheme proposed above. In these set-ups, optical isolators could be introduced, in contrast to the Faraday mirror set-up. We cannot discuss all possible strategies of Eve and the technical means to fight them. A general assumption implicit in all discussions of QC security is that Alice’s and Bob’s offices are absolutely safe. This is a reasonable assumption, necessary also for all other cryptosystems. However, as illustrated by the above discussion, care should be given to the fact that the fibre optic quantum channel provides a potential entrance gate for malevolent intruders. In yet another eavesdropping strategy, which applies only to the two-state system,$^{14}$ Eve interrupts the transmission and measures as many pulses as possible. She sends to Bob only the pulses for which she obtained the phase or the polarization. To prevent this, Bob has to introduce another detector to monitor the stronger pulse P1 to make sure that Eve cannot suppress this pulse. If Eve suppresses only the weak one, because she did not get the phase information, the strong pulse alone will introduce 50% error on detector D0. This causes a serious problem with the Faraday set-up.
To render the power of P2 measurable by a conventional detector, the losses of Bob’s delay line could be increased. The attenuation applied at Alice’s side would be reduced by the same amount. This damping applies also to pulses needed by Eve to spy on Alice’s phase, applying the strategy mentioned above. With the laser power and the detectors at our disposal, it is not possible to monitor the pulse P2 at Bob’s end, and hence Eve’s spying pulse, at the same time (even with an appropriate choice of the splitting ratio of the couplers C2 and C3, presently 3 dB couplers). Hence, for longer distances only the four-state protocol can guarantee absolute security with the Faraday mirror set-up. In the four-state protocol BB84 the eavesdropping strategy mentioned in the previous paragraph fails because Eve would introduce errors when she chooses the wrong basis. However, let us suppose that Eve has a lossless line and a way to sense how many photons are in the pulse. For $\mu = 0.1$ there is about a 5% chance to have two photons in a non-empty pulse. In these cases Eve could let one photon pass and store the other until Alice and Bob publicly communicate their bases, and so get full information on this bit. Eve would then send only these pulses to Bob, and block the others. Bob would not notice Eve’s presence, since he expects considerable losses in his line. In conclusion, as a function of $\mu$ and the losses in the line, Eve could gain a considerable fraction of the information. Again this could be prevented by measuring the intensity of a stronger pulse, to force Eve to send a pulse every time.$^{22}$ QC with correlated photon pairs would have the advantage that, since in this case real single photon states are used, all strategies dealing with the fraction of pulses containing more than one photon must fail. Unfortunately, the self-aligning set-up with Faraday mirrors is not suited for such a photon source, due to the high losses in a complete round trip. In practice, a trade-off has to be found between the complexity, hence the price, and the absolute security of the set-up. Let us just mention in this context that since the interferometer in the Faraday mirror set-up is not stabilized, the absolute phase difference between the pulses P1 and P2 will fluctuate randomly, rendering Eve’s job very hard. This contrasts with the standard phase coding set-up, where the intense pulses sent by Alice to adjust Bob’s interferometer can also be used by Eve to adjust hers.
8
Open air QC
In this section we discuss briefly the possibility of transmitting a key through open air. Such a technique could be applied to satellite-earth communications. For free space cryptography, polarization coding is chosen, since the atmosphere has a negligible birefringence. So, a typical set-up would be similar to that in Fig. 1. At Alice’s end, instead of being coupled into the fibre link, the laser beam is expanded to have a small divergence. At Bob’s end the light is collected by a telescope, injected into a fibre and directed to the analyzer. The wavelength is preferably between 770 nm and 800 nm, where the atmospheric transmission and the Si-detector efficiency are both high. A relatively strong laser pulse some 100 ns before the photon triggers Bob’s detector. This trigger pulse may have another wavelength and will be detected by an independent detector, in order to prevent saturation of the photon counters. Obviously, open air cryptography faces essentially two problems: the efficient collection of the emitted photons and the suppression of parasitic light. Although stray light is a problem, naive considerations may overestimate its effect. The fact that we are working with short laser pulses offers us the possibility to use temporal and spectral discrimination. Coupling the light after the telescope into a single mode fibre gives an excellent spatial discrimination.
This effect is independent of the telescope diameter, since an increase of the aperture goes with a corresponding improvement of the angular resolution. A simple estimate therefore shows that this problem can be reasonably well solved. Let us suppose that we have an isotropic spectral background radiance of $10^{-2}$ W/(m$^2$ nm sr) at 800 nm. (This corresponds to the spectral radiance of a clear zenith sky with a sun elevation of 77°.$^{25}$) The divergence $\theta$ of a Gaussian beam with radius $w_0$ is given by $\theta = \lambda/(w_0 \pi)$. The product of beam (telescope) cross-section and solid angle is therefore $\pi w_0^2 \cdot \pi \theta^2 = \lambda^2$. With an interference filter of 1 nm width we obtain a power on the detector of $6 \times 10^{-15}$ W, corresponding to $2 \times 10^4$ photons per second, hence $2 \times 10^{-5}$ photons per ns time window. For comparison, this is about the $p_{dark}$ of InGaAs detectors. Under good environmental conditions, the transfer efficiency is not limited by the atmospheric attenuation, which may be as low as 0.01 dB/km. Essentially, it depends on the divergence of the output beam, hence on the diameter of the beam expander and the diameter of the receiving telescope. A 50 mm beam expander produces a beam divergence of about 0.02 mrad (at 770 nm). So, at a distance of 10 km the beam has a diameter of 40 cm and a 10 cm telescope would theoretically be able to collect 1/16 of the photons. Atmospheric turbulence limits the angular resolution of telescopes to roughly 1″, i.e. 5 μrad. A beam with 5 μrad divergence has a diameter of 3 m at the 300 km orbit of a satellite. In this case a 10 cm telescope would still capture 1% of the pulses. This leads to a $QBER_{det}$ of 4%, with $p_{dark}$ of $2 \times 10^{-5}$, $\mu = 0.1$ and $\eta_d = 50\%$. Hence QC between a satellite and earth is in principle feasible, with reasonable QBER and bit rates. The main problem of working with such small beam diameters will be the tracking of the satellite by the ground station at the required speed and precision. In particular, if the satellite is supposed to act as a large mirror to allow QC between two terrestrial stations, alignment would appear to be impossible. So, free space QC could be used to communicate secret data from a satellite to a ground station. Or, the satellite can act as an intermediate station, but will in this case share the code with Alice and Bob. The feasibility of free space QC has already been demonstrated$^{26}$ under daylight conditions over about 150 m with $R$ = 1 kHz and a QBER of 2%, and more recently by another group$^{27}$ over 205 m.
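The geometric estimates of this section can be sketched numerically. The Python example below computes the Gaussian beam divergence θ = λ/(π w_0), the fraction of a diverged beam collected by a telescope, and the background photon number per detection window for a single spatial mode (cross-section times solid angle equal to λ²). Several inputs are assumptions chosen for illustration: a waist radius of 12.5 mm, which reproduces the quoted divergence of about 0.02 mrad at 770 nm; a sky radiance of 10^-2 W/(m² nm sr), which reproduces the quoted order of 2 × 10^4 background photons per second; and a 1 ns window. The function names are hypothetical.

```python
import math

PLANCK_J_S = 6.62607015e-34
C_M_PER_S = 299_792_458.0


def divergence_rad(wavelength_m, waist_radius_m):
    """Half-angle divergence of a Gaussian beam, theta = lambda / (pi * w0)."""
    return wavelength_m / (math.pi * waist_radius_m)


def collected_fraction(beam_diameter_m, telescope_diameter_m):
    """Simple area ratio, treating the beam as uniformly filled."""
    return min(1.0, (telescope_diameter_m / beam_diameter_m) ** 2)


def background_photons(radiance_w_m2_nm_sr, wavelength_m, filter_nm, window_s):
    """Stray photons per window coupled into one spatial mode (A * Omega = lambda^2)."""
    power_w = radiance_w_m2_nm_sr * wavelength_m ** 2 * filter_nm
    photon_energy_j = PLANCK_J_S * C_M_PER_S / wavelength_m
    return power_w / photon_energy_j * window_s


theta = divergence_rad(770e-9, 12.5e-3)
beam_diameter = 2 * theta * 10e3          # diameter after 10 km, initial size neglected
fraction = collected_fraction(beam_diameter, 0.10)
stray = background_photons(1e-2, 800e-9, filter_nm=1.0, window_s=1e-9)

print(f"divergence: {theta * 1e3:.3f} mrad")
print(f"beam diameter at 10 km: {beam_diameter * 100:.0f} cm")
print(f"collected fraction: {fraction:.3f}")
print(f"stray photons per ns window: {stray:.1e}")
```

Under these assumptions the beam is roughly 40 cm wide after 10 km, a 10 cm telescope collects about 1/16 of it, and the stray light contributes a few times 10^-5 photons per nanosecond window.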
Appendix: The performance of photon counters

Photon counting in the near infrared can be achieved with liquid nitrogen (LN2) cooled Ge avalanche photodiodes, working in the passively quenched Geiger mode.$^{28}$ In this mode the diodes are driven above breakdown, i.e. the bias voltage is so high that one electron-hole pair created by an absorbed photon will be able to produce an avalanche of thousands of carriers. The avalanche only stops when the created current through the resistance in series with the diode lowers the applied voltage below the breakdown value. The noise in such detectors is due to carriers generated in the detector volume by causes other than an incident photon (so-called dark counts). These carriers can be created thermally or by band-to-band tunnelling processes, or they can be emitted from trapping levels that were populated in previous avalanches (after-pulsing). The quantum efficiency $\eta_d$ and the dark count rate $n_{dark}$ both increase with increasing bias voltage $U_{bias}$. To obtain a low $QBER_{det}$, a trade-off between high efficiency and low noise has to be found. For LN2 cooled Ge diodes the thermal contribution can be neglected and the dark counts mainly consist of tunnelled electrons and “after-pulses,” the latter being more important if the total current through the device is large.$^{29}$ The after-pulse rate decreases almost exponentially with a time constant of about 200 ns. This fact opens the door to a further reduction of the dark count rate: if the diode is biased only immediately before a photon is expected, no spontaneous avalanches can occur before the detection and consequently no after-pulses will fall into the detection time interval. The so-called active biasing electronic circuit profits from this. The bias voltage of the diode is the sum of a DC part well below threshold and a short (say, 2 ns) almost rectangular pulse of several volts amplitude that pushes the diode over breakdown at the time when the photon is expected. This increases the efficiency considerably, without excessively increasing the noise. Moreover, the time jitter is reduced to a value below 100 ps. The short bias pulse induces a parasitic signal. A discriminator in combination with a temporal coincidence window recovers the true avalanche signal from this parasitic signal.
A time-to-amplitude converter followed by a window discriminator of 300 ps width allows further reduction of the noise level. Quite recently InGaAs APDs have also been used for single photon counting$^{30}$ and one QC experiment has been performed with such detectors. Fig. 4 shows the noise as a function of the efficiency at 1300 nm for actively biased and passively quenched Ge and InGaAs diodes (from Ref. 31). One can easily see that active biasing considerably decreases the noise with respect to passive quenching. For higher efficiencies, actively biased InGaAs diodes show smaller noise than Ge diodes. On the other hand, passive quenching is not suitable for InGaAs diodes. The quantum efficiency at 1550 nm is very low at 77 K and increases with higher temperature, as does the noise. At −100 °C, a temperature at the limit of Peltier cooling, InGaAs diodes exhibit increased but still quite reasonable noise levels. That is, $p_{dark} = 2 \times 10^{-5}$ for $\eta_d = 10\%$ and $p_{dark} = 1 \times 10^{-5}$ for $\eta_d = 2\%$ at 1300 and 1550 nm, respectively, so there is legitimate hope that such diodes will be practical without LN2 cooling and open the second telecom window at 1550 nm. See Ref. 31 for more details on photon counting with InGaAs diodes. For comparison, commercial silicon single photon counting modules have about 50% efficiency at 800 nm with extremely low dark count rates of down to 10 Hz.

Figure 4: The probability to get a dark count per pulse, $p_{dark}$, against the quantum efficiency $\eta_d$ for Ge (NEC NDL5131) and InGaAs (Fujitsu FPD9WlKS) APDs. Comparison of active gating and normal passive quenching mode.

References
1. C. H. Bennett, F. Bessette, G. Brassard, L. Salvail and J. Smolin, J. Crypt. 5, 3 (1992).
2. A. Muller, H. Zbinden and N. Gisin, Europhys. Lett. 33, 335 (1996).
3. J. D. Franson and B. C. Jacobs, Electron. Lett. 31, 232 (1995).
4. Ch. Marand and P. D. Townsend, Opt. Lett. 20, 1695 (1995).
5. R. J. Hughes, G. G. Luther, G. L. Morgan, C. G. Peterson and C. Simmons, Lecture Notes in Computer Science 1109, 329 (1996).
6. H. Zbinden, J. D. Gautier, N. Gisin, B. Huttner, A. Muller and W. Tittel, Electron. Lett. 33, 586 (1997).
7. C. H. Bennett and G. Brassard, “Quantum Cryptography: Public key distribution and coin tossing,” p. 175 in Proc. Int. Conf. Computer Systems and Signal Processing, (Bangalore 1984).
8. R. Y. Chiao and Y. S. Wu, Phys. Rev. Lett. 57, 933 (1986).
9. N. Gisin, R. Passy, J. C. Bishoff and B. Perny, IEEE Phot. Technol. Lett. 5, 819 (1993).
10. N. Gisin, J. P. Pellaux and J. P. Von Der Weid, IEEE J. Lightwave Technology 9, 821 (1991).
11. B. Huttner, A. Muller, J. D. Gautier, H. Zbinden and N. Gisin, Phys. Rev. A 54, 3783 (1996).
12. P. D. Townsend, J. G. Rarity and P. R. Tapster, Electron. Lett. 29, 1291 (1993).
13. A. Muller, T. Herzog, B. Huttner, W. Tittel, H. Zbinden and N. Gisin, Appl. Phys. Lett. 70, 793 (1997); US patent pending.
14. C. H. Bennett, Phys. Rev. Lett. 68, 3121 (1992).
15. C. A. Fuchs, N. Gisin, R. B. Griffiths, C.-S. Niu and A. Peres, “Optimal eavesdropping in quantum cryptography,” eprint quant-ph/9701039, submitted to Phys. Rev. A, (1997).
16. R. J. Hughes,
17. P. D. Townsend, Nature 385, 47 (1997).
18. C. L. Tang, “Spontaneous and Stimulated Parametric Processes,” in Quantum Electronics, eds. H. Rabin and C. L. Tang (Academic Press, New York, 1975).
19. A. K. Ekert, Phys. Rev. Lett. 67, 661 (1991).
20. C. H. Bennett, G. Brassard and N. D. Mermin, Phys. Rev. Lett. 68, 557 (1992).
21. A. K. Ekert, J. G. Rarity, P. R. Tapster and G. M. Palma, Phys. Rev. Lett. 69, 1293 (1992).
22. B. Huttner, N. Imoto, N. Gisin and T. Mor, Phys. Rev. A 51, 1863 (1995).
23. J. Cirac and N. Gisin, “Coherent eavesdropping strategies for the 4-state quantum cryptography protocol,” eprint quant-ph/9702002, to appear in Phys. Lett. A (1997).
24. A. K. Ekert, B. Huttner, G. M. Palma and A. Peres, Phys. Rev. A 50, 1047 (1994).
25. W. G. Driscoll (ed.), Handbook of Optics, (McGraw-Hill, New York, 1978).
26. B. C. Jacobs and J. D. Franson, Opt. Lett. 21, 1854 (1996).
27. W. T. Buttler, R. J. Hughes, P. G. Kwiat, G. G. Luther, G. L. Morgan, J. E. Nordholt, C. G. Peterson and C. M. Simmons, “Free-Space Quantum Key Distribution,” eprint quant-ph/9801006, to appear in Phys. Rev. A (1998).
28. P. C. M. Owens, J. G. Rarity, P. R. Tapster, D. Knight and P. D. Townsend, Appl. Opt. 33, 6895 (1994).
29. A. Lacaita, P. A. Francese, F. Zappa and S. Cova, Appl. Opt. 33, 6902 (1994).
30. A. Lacaita, F. Zappa, S. Cova and P. Lovati, Appl. Opt. 35, 2986 (1996).
31. G. Ribordy, J. D. Gautier, H. Zbinden and N. Gisin, “Performance of InGaAs/InP avalanche photodiodes as gated-mode photon counters,” preprint submitted to Appl. Opt. (1997).
QUANTUM COMPUTATION: AN INTRODUCTION

ADRIANO BARENCO
Clarendon Laboratory, University of Oxford, Parks Road, Oxford OX1 3PU, United Kingdom

When applied to computation, quantum mechanics opens completely new perspectives; by exploiting quantum mechanical features such as entanglement or superposition, one can solve some problems much more efficiently than with classical computers. One of the most striking examples is the factoring problem. This task is believed to be out of reach of classical computers as soon as the number of digits in the number to factor exceeds a certain limit. In this contribution, we present the basic notions of quantum computation and review the most common quantum algorithms discovered so far.
1
Computation and Physics
The theory of computation has long been considered a completely theoretical field, detached from physics. Nevertheless, pioneers such as Turing, Church, Post and Gödel were able, by intuition alone, to capture the correct physical picture, but since their work did not refer explicitly to physics, it has for a long time been falsely assumed that the foundations of the theory of classical computation were self-evident and purely abstract. Only in the last two decades were questions about the physics of computation asked and consistently answered.$^{1}$ These later developments led to a complete and thorough understanding of the physical limits of classical computers; however, they were concerned only with the classical theory of computation, for which the computing device is supposed to obey the laws of classical physics. This is fine as long as one asks questions about computers we have now: any computer that was ever built, from the oldest abacus to the latest supercomputer, behaves indeed in a classical fashion. Nevertheless, we live in a quantum world and quantum objects tend to behave quite differently from classical ones. So what about quantum ... computers? Despite early suggestions$^{2,3}$ that “something new” may exist when computers are enabled to behave in a quantum mechanical way, it was not until the seminal work of Deutsch in 1985$^{4}$ that the foundations of quantum computation were laid and properly formalised. In his article, Deutsch considers the situation where computers behave like quantum systems and can enter highly non-classical states. These quantum computers could, for instance, exist in a superposition of states. Each state could follow coherently a distinct computational path and interfere to produce a final output. This “quantum parallelism”, achieved in a single piece of hardware, outstrips by far any parallelism that can be thought of in classical computers, thus potentially providing quantum computers with unprecedented power. Indeed, it took another decade to gain clear evidence of the power of quantum computers and to exhibit specific problems that were intractable on classical computers but that could be solved efficiently on a quantum one. The most striking example is the factorisation problem. Shor has shown recently that using a quantum algorithm (i.e. an algorithm that runs on a quantum computer) it is possible to factor large integers efficiently. Factorisation is believed to be intractable (or at best extremely difficult and time-consuming) on any classical computer, and Shor’s algorithm shows for the first time that the class of problems accessible to quantum computers includes problems that (so far) cannot be handled efficiently by classical devices. In fact, factorisation is not purely of academic interest: it is the problem which underpins the security of many classical public key cryptosystems. For example, RSA, the most popular public key cryptosystem (named after its three inventors, Rivest, Shamir and Adleman), gets its security from the difficulty of factoring large numbers. Hence for the purpose of cryptanalysis the experimental realisation of quantum computation is a most interesting issue. The growing interest in the field during these last years is backed up by the enormous experimental progress made in testing the fundamentals of quantum mechanics. In the last decade or two, it has become possible to isolate and study single microscopic quantum systems, giving new insights into the meaning of quantum mechanics, opening new horizons of research and above all giving the possibility to test fundamental ideas such as those involved in quantum computation.
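The link between factoring and the security of RSA can be made concrete with a toy example. The Python sketch below builds an RSA key pair from two deliberately tiny primes and then shows that anyone who can factor the public modulus can rebuild the private exponent and read the message. The primes, the exponent, the message and the helper name are illustrative assumptions; real keys use primes hundreds of digits long, for which trial division is hopeless. The three-argument `pow` with exponent -1 requires Python 3.8 or later.

```python
from math import gcd


def factor_by_trial_division(n):
    """Return a non-trivial factor pair (p, q) of n, or None if n is prime."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 1
    return None


# Key generation with toy primes (assumed values, far too small to be secure).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, must be coprime to phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)            # private exponent: inverse of e modulo phi

# Encryption with the public key, decryption with the private key.
message = 65
ciphertext = pow(message, e, n)
assert pow(ciphertext, d, n) == message

# An attacker who can factor n rebuilds the private exponent.
p_found, q_found = factor_by_trial_division(n)
d_found = pow(e, -1, (p_found - 1) * (q_found - 1))
recovered = pow(ciphertext, d_found, n)
print("attacker recovers:", recovered)
```

The only secret the attacker needs is the factorisation of `n`, which is why an efficient factoring algorithm would undermine this class of cryptosystems.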
2 Complexity theory
The purpose of complexity theory is to provide tools for classifying computational problems according to the physical resources needed to solve them. The physical resources under consideration are usually time and memory (often referred to as space). The classification should not depend on a particular computational model as long as the physical framework is fixed, but should only measure the intrinsic difficulty of a problem. The size of a problem is the total number of bits needed to specify the input of the problem entirely. For instance, the problem of factoring an L-digit integer has a size of approximately $\log_2(10)\,L$, i.e. the number of bits necessary to represent an L-digit integer. An algorithm is a computational procedure that takes a variable input of
size L and produces after a time t the output of a problem. While running, the algorithm may use up to s bits of memory space to store temporary information. The running time of an algorithm is usually measured as the number of fundamental steps needed to obtain a result. On modern classical computers, a fundamental step can be taken to mean a floating point operation; on quantum computers, it is common to measure the time as the number of one-bit and two-bit gates needed to complete the algorithm. The space is usually taken as the maximum amount of memory used at any given time during the completion of the algorithm. Clearly, we are interested in algorithms that minimise the running time or the space needed and, from all the algorithms at hand to solve a given problem, we will always focus on the best one. Complexity theory enables us to classify problems according to the asymptotic amount of resources needed (space or time) as the size of the input L goes to infinity. Thus the exact running time or the exact memory requirements of an algorithm on a particular input (both usually quite difficult to compute) are of no direct interest; only the asymptotic behaviour matters.
2.1 Asymptotic notation
The order notation is very handy when discussing the behaviour of algorithms. We will write $f(L) = O(g(L))$ if there exist a positive constant $c$ and a positive integer $L_0$ such that $0 \le f(L) \le c\,g(L)$ for $L > L_0$. In other words, for L sufficiently big, f is always smaller than g up to a fixed factor. As examples, $f(L) = 10^{-23}L^{1000} + 10^{23}/L = O(L^{1000})$ and $f(L) = 10^{23}L^{1000} + 10^{-23}\,2^{L} = O(2^{L})$. In the first example the polynomial term dominates for large enough L; in the second example, the second (exponential) term eventually dominates. Most of the time, it will be sufficient for us to distinguish exponential running times ($O(e^{L})$) from polynomial ones ($O(L^{k})$, k fixed). Distinguishing between different polynomials becomes important when trying to optimise certain algorithms.
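To make the second example concrete, the following sketch searches for the input size at which the exponential term actually overtakes the polynomial one. It is only an illustration: the function names are hypothetical, and exact integer arithmetic is used so that no floating-point overflow occurs. The crossover turns out to lie at an input size of the order of $10^4$, which shows how late "eventually" can be.

```python
def exponential_wins(L):
    """True if 10^-23 * 2^L exceeds 10^23 * L^1000.

    Both sides are multiplied by 10^23 so that only integers are compared:
    2^L > 10^46 * L^1000.
    """
    return (1 << L) > 10 ** 46 * L ** 1000


def first_crossover(limit=100_000):
    """Smallest L >= 1 for which the exponential term dominates, or None."""
    for L in range(1, limit):
        if exponential_wins(L):
            return L
    return None


crossover = first_crossover()
print("exponential term dominates from L =", crossover)
```

Below the crossover the "negligible" polynomial term is by far the larger of the two, which is why the order notation only makes statements about sufficiently large L.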
2.2 Example: addition
Let us consider the problem of adding two L-bit numbers. The usual way to perform the sum requires L single-bit additions. Thus the running time of the algorithm is O(L). To store the result, L bits plus one overflow bit (which is also used during the addition as a carry bit) are needed; thus the total number of bits needed is L + 1, and the asymptotic behaviour is O(L) as well.
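The counting argument above can be sketched in a few lines. This is a minimal illustration, not an optimised adder: the helper names (`to_bits`, `from_bits`, `add_bits`) are hypothetical, and bits are stored least significant first.

```python
def to_bits(value, length):
    """Binary digits of `value`, least significant bit first, padded to `length`."""
    return [(value >> i) & 1 for i in range(length)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def add_bits(a_bits, b_bits):
    """Add two L-bit numbers with a ripple carry.

    Returns (result_bits, steps): L + 1 result bits (the last one is the
    overflow/carry bit) and the number of single-bit additions performed.
    """
    assert len(a_bits) == len(b_bits)
    carry = 0
    result = []
    steps = 0
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        result.append(total % 2)   # sum bit for this position
        carry = total // 2         # carry into the next position
        steps += 1
    result.append(carry)           # overflow bit
    return result, steps


bits, steps = add_bits(to_bits(13, 4), to_bits(11, 4))
print(from_bits(bits), steps, len(bits))
```

For two 4-bit inputs the loop runs 4 times and produces 5 bits, matching the O(L) time and L + 1 space count in the text.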
2.3 Complexity classes
A polynomial-time algorithm is an algorithm whose worst-case running time is of the order $O(L^k)$, where L is the size of the input and k a constant. An exponential-time algorithm is any algorithm that is not polynomial-time. (The definition of polynomial algorithms is generally well accepted; in the case of exponential algorithms, slightly different definitions exist, and the one above will suffice for our purpose [7].) Similar definitions can be made for polynomial-space and exponential-space algorithms, where space (i.e. memory) is the physical resource under consideration. Loosely speaking, polynomial or exponential behaviour determines whether or not a certain algorithm is efficient. In general, polynomial-time algorithms are said to be efficient, whereas exponential ones are said to be intractable. It is important to note that many algorithms may exist for the same problem, some of which can be polynomial and others exponential. In any case, only the best one is considered. A decision problem is a problem whose output is YES or NO. The complexity class P is the set of all the decision problems that are solvable in polynomial time. A general problem is in P if it can be “packed” in a polynomial decision problem. “Packing” means finding a corresponding decisional problem that one needs to apply only a polynomial number of times to solve the original problem. Let us illustrate this “packing” procedure with a simple example: the problem of adding two L-digit numbers $n_1$ and $n_2$ is not a decisional problem. However, it can be solved (by bracketing the result) with a polynomial number (in L) of applications of the problem “is $n_1 + n_2 \le z$?”. This latter problem is a decisional problem which can be solved in polynomial time (by adding $n_1$ and $n_2$ and comparing z to their sum!). In fact, less formally, the class P usually refers to any type of polynomial-time problem (decisional or not). The space-P class is defined in an analogous way when considering memory as the main resource.
The class NP is the class of all decision problems for which a YES answer can be found in polynomial time using some additional information, called a certificate. In the same way, we will loosely say that a problem is in NP if its solution can be found in polynomial time using an additional piece of information. One should note that the extension of the class NP to non-decisional problems is an abuse of terminology. This extension blurs the details of the real structure of complexity classes. Refer to the reference section for a thorough and rigorous discussion of complexity classes. Factorisation of a number N = pq (p and q primes) is an example of an NP
problem that is not known to be in P. There is no known classical algorithm that returns the factors of such a number in polynomial time (the best known algorithms are subexponential [9]). However, given one of the factors of the number (as a certificate), one can find the other through a simple division.
2.4 Remark
One should realise that this classification is not static. A problem not known to be in P today may be reclassified tomorrow if a polynomial algorithm is discovered. The boundaries between complexity classes are not fixed, and in fact one of the major open questions in classical complexity theory is to know whether or not P=NP.
2.5 Randomised algorithms
So far, we have assumed deterministic algorithms; these algorithms follow the same sequence of operations each time they are executed on the same input. Let us turn to randomised algorithms. For these algorithms, random choices are introduced in one or several steps of the algorithm. As a result, the algorithm can fail to return the correct answer. This does not mean that randomised algorithms should be discarded: if the probability of failure of the algorithm is of the same order as, say, that of the hardware on which the algorithm is executed, then clearly randomised algorithms are of great appeal.
2.6 Amplifying the probability of success
Consider a randomised polynomial algorithm that runs successfully with probability $1 - \epsilon$. As we will see later, the quantum factorisation algorithm is an example of such an algorithm: it produces a candidate factor of N which can be checked by a trial division. There is an associated probability $\epsilon$ that the candidate is not a factor of N. If $\epsilon$ is independent of the size of the input, or depends only polynomially on the size, then by repeating the computation k times, we get probability $1 - \epsilon^k$ of having at least one success. This can be made arbitrarily close to 1 by choosing a fixed k sufficiently large (cf. Fig. 1). The critical point here is that k does not have to increase exponentially with the size of the input. If $1 - \epsilon$ is fixed for any input, then a single k will give the same final probability of success for any input. If $1 - \epsilon$ decreases as 1/polynomial(size), the number of repetitions k needs to increase only polynomially. Most of the quantum algorithms discovered so far fall in this category.
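The amplification argument can be checked numerically with a short sketch. The function names below are hypothetical and the runs are assumed to be independent, which is what the formula $1 - \epsilon^k$ presupposes.

```python
def success_after(k, eps):
    """Probability of at least one success in k independent runs,
    each of which fails with probability eps."""
    return 1 - eps ** k


def repetitions_needed(eps, target):
    """Smallest k such that success_after(k, eps) >= target.

    Assumes 0 < eps < 1 and 0 < target < 1, so the loop terminates.
    """
    k = 1
    while success_after(k, eps) < target:
        k += 1
    return k


for eps in (0.9, 0.5, 0.1):
    print(eps, repetitions_needed(eps, 0.999))
```

Even for a poor single-run failure probability such as 0.9, the number of repetitions needed for a fixed target depends only on $\epsilon$ and the target, not on the size of the input.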
Figure 1: Probability of success after k repetitions, plotted against the number of repetitions k (up to 15). The plot shows the probability of having at least one successful run after k runs of a randomised algorithm with probability of success $1 - \epsilon$. The various curves illustrate different values of $\epsilon$, ranging from $\epsilon = 0.9$ (bottom curve) to $\epsilon = 0.1$ (top curve).
2.7 Quantum vs. classical complexity classes
The classification in complexity classes depends on the physical realisation of the computer. For instance, a parallel computer, which can adjust the number of processors to the size of the problem, will be able to solve exponentially difficult problems in polynomial time. To do so, the number of available processors must grow at the same rate as the complexity of the problem. For exponential problems, this is clearly an unphysical assumption (the physical size of the computer would have to grow exponentially). However, for this (unphysical) model of computation, we can define a class P, which includes problems that lie outside the class P defined for the model of computation that does not allow for exponential parallelism. This example illustrates the fact that a classification according to the complexity of the problems depends on the physics of the computation model. However, within the same physical paradigm, the classification in complexity classes does not depend on a particular physical model. In classical physics, this principle was formulated by Church, Post and Turing in the late thirties. In other words, this principle states that if a problem can be solved in polynomial time on a given computer, another computer will also require only polynomial time. In the same way, if a problem requires exponential time on machine X, then the next generation Y of machines will also need exponential time. This is of course true as long as machine Y “relies” on the same
physics as machine X. As we will see later, different physics can mean different complexity classes. In particular, a machine Y that uses quantum mechanics can in principle solve in polynomial time problems that require (or appear to require) exponential time on the classical machine X.
3 Fundamental definitions
To understand what makes quantum computers different from classical machines, let us consider the fundamental elements of a quantum machine.
3.1 Bits and Registers
At the heart of a quantum computer lies the quantum bit [10], or simply qubit, the natural extension of the classical notion of bit. Instead of a simple two-state system that can be either in state 0 or in state 1, a qubit is a quantum two-level system that, in addition to the two eigenstates $|0\rangle$ and $|1\rangle$ (the labels are here a mere convention, but often correspond to the ground and excited states of the two-level system), can be set in any superposition of the form
$$ |\psi\rangle = c_0|0\rangle + c_1|1\rangle, \qquad |c_0|^2 + |c_1|^2 = 1. \tag{1} $$
Any quantum two-level system is a potential candidate for a qubit, but to help construct a mental picture, it is a good idea to carry a concrete, albeit somewhat idealised, physical example of a qubit. In the following it will be useful to think of a qubit as a spin-1/2 particle. $|0\rangle$ and $|1\rangle$ will correspond respectively to the spin-down and spin-up eigenstates (along a prearranged axis of quantisation, usually set by a constant external magnetic field). Although a qubit can be prepared in an infinite number of different quantum states (by choosing different complex coefficients $c_i$ in Eq. 1), it cannot be used to transmit more than one bit of information. This is because no detection process can reliably differentiate between nonorthogonal states [11]. However, qubits (and more generally information encoded in quantum systems) can be used in systems developed for quantum cryptography [12], quantum teleportation [13] or quantum dense coding [14]. In this last example a single qubit appears to carry two bits of classical information, but in fact two qubits are involved in this process. The problem of measuring a quantum system is a central one in quantum theory, and much attention has been and is still devoted to this subject [15]. In a classical computer, it is possible in principle to inquire at any time (and without disturbing the computer) about the state of any bit in the memory. In a quantum computer, the situation is different. Qubits can be in superposed states, or can even be entangled with each other, and the mere act
of measuring the quantum computer alters its state. Performing a measurement on a qubit in a state given by Eq. 1 will return 0 with probability $|c_0|^2$ and 1 with probability $|c_1|^2$; but, more importantly, the state of the qubit after the measurement (the post-measurement state) will be $|0\rangle$ or $|1\rangle$ (depending on the outcome), and not $c_0|0\rangle + c_1|1\rangle$. With our spins, it is convenient to think of the measuring apparatus as a Stern-Gerlach device into which the qubits are sent when we want to measure them. When measuring a state of the form of Eq. 1, outcomes 0 and 1 will be recorded with probabilities $|c_0|^2$ and $|c_1|^2$ on the respective detectors. We will call a collection of qubits a quantum register, or simply a register. As in the classical case, it can be used to encode more complicated information. For instance, the binary form of 6 is 110, and loading a quantum register with this value is done by preparing three qubits in the state $|1\rangle \otimes |1\rangle \otimes |0\rangle$. In the following, we use a more compact notation: $|a\rangle$ stands for the direct product $|a_{n-1}\rangle \otimes |a_{n-2}\rangle \ldots |a_1\rangle \otimes |a_0\rangle$, which denotes a quantum register prepared with the value $a = 2^0 a_0 + 2^1 a_1 + \ldots + 2^{n-1} a_{n-1}$. Two states $|a\rangle$ and $|b\rangle$ are orthogonal as soon as $a \neq b$:
$$ \langle a | b \rangle = \langle a_0 | b_0 \rangle \langle a_1 | b_1 \rangle \ldots \langle a_{n-1} | b_{n-1} \rangle, \tag{2} $$
and if $a \neq b$ at least one of the terms on the r.h.s. of the above expression is zero, so that $\langle a | b \rangle = 0$. For an n-bit register, the most general state can be written as
$$ |\psi\rangle = \sum_{x=0}^{2^n-1} c_x |x\rangle. \tag{3} $$
Note that this state describes the situation in which several different values of the register are present simultaneously; just as in the case of the qubit, there is no classical counterpart to this situation, and there is no way to gain a complete knowledge of the state of such a register through a single measurement. With our spin picture in mind, measuring the state of a register is done by passing the various spins that form the register one by one through a Stern-Gerlach apparatus and recording the results. For instance, a two-bit register initially prepared in the state $|\psi\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |3\rangle)$, i.e. $\frac{1}{\sqrt{2}}(|0\rangle|0\rangle + |1\rangle|1\rangle)$, will with equal probability result in either two successive clicks in the up-detector or two successive clicks in the down-detector. The post-measurement state will be either $|0\rangle$ or $|3\rangle$, depending on the outcome. A record of a click-up followed by a click-down, or the opposite (click-down followed by click-up), signals an experimental or a preparation error, because neither $|2\rangle = |1\rangle|0\rangle$ nor $|1\rangle = |0\rangle|1\rangle$ appears in $|\psi\rangle$.
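The register picture above can be sketched with plain lists of amplitudes. This is a minimal illustration under stated assumptions: a register state is stored as a list of $2^n$ amplitudes indexed by the register value, the helper names are hypothetical, and the Stern-Gerlach apparatus is replaced by a pseudo-random draw.

```python
import math
import random


def basis_state(value, n):
    """Amplitudes of the n-qubit register state |value>."""
    state = [0.0] * (2 ** n)
    state[value] = 1.0
    return state


def inner(a, b):
    """Inner product <a|b> of two register states."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def probabilities(state):
    """Probability |c_x|^2 of reading each register value x."""
    return [abs(c) ** 2 for c in state]


def measure(state, rng=random):
    """Sample an outcome; return it with the post-measurement state."""
    n = len(state).bit_length() - 1
    outcome = rng.choices(range(len(state)), weights=probabilities(state))[0]
    return outcome, basis_state(outcome, n)


# The two-bit example from the text: (|0> + |3>) / sqrt(2).
s = 1 / math.sqrt(2)
psi = [s, 0.0, 0.0, s]
print(probabilities(psi))
print(measure(psi)[0])
```

Only the values 0 and 3 are ever returned, each with probability 1/2, and after the draw the register is left in the corresponding basis state rather than in the original superposition.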
3.2 Gates
In a classical computer, the processing of information is done by logic gates. A logic gate maps the state of its input bits into another state according to a truth table. The simplest non-trivial classical gate is the NOT gate, a one-bit gate which negates the state of the input bit: 0 becomes 1 and vice versa. The corresponding quantum gate is implemented via a unitary operation that evolves the basis states into the corresponding states according to the same truth table. For instance, the quantum version of the classical NOT is the unitary operation $U_{NOT}$ such that
$$ U_{NOT}|0\rangle = |1\rangle, \qquad U_{NOT}|1\rangle = |0\rangle. \tag{4} $$
In quantum mechanics, the notion of gate can be extended to operations that have no classical counterpart. For instance, the operation $U_A$ that evolves
$$ |0\rangle \rightarrow \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \qquad |1\rangle \rightarrow \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle) \tag{5} $$
defines a perfectly “legal” quantum gate. Note that it evolves “classical” states into superpositions and therefore cannot be regarded as classical. This gate is of great utility: take an n-bit quantum register initially in the state $|0\rangle$ and apply the gate $U_A$ to every single qubit of the register. The resulting state is
$$ |\psi\rangle = U_A \otimes U_A \otimes \cdots \otimes U_A\,|00\ldots0\rangle = \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} |x\rangle. \tag{6} $$
Thus, with a linear number of operations (i.e. n applications of $U_A$) we have generated a register state that contains an exponential ($2^n$) number of distinct terms. It is quite remarkable that using quantum registers, n elementary operations can generate a state containing all $2^n$ possible numerical values of the register. In contrast, in classical registers n elementary operations can only prepare one state of the register, representing one specific number. It is this ability to create quantum superpositions which makes “quantum parallel processing” possible. If, after preparing the register in a coherent superposition of several numbers, all subsequent computational operations are unitary and linear (i.e. preserve the superpositions of states), then with each computational step the computation is performed simultaneously on all the numbers present in the superposition.
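The claim that n one-qubit operations produce $2^n$ terms can be checked on a small classical simulation. The sketch below is an illustration only: it assumes the matrix form of $U_A$ given by the usual convention for this gate, stores the register as a list of $2^n$ amplitudes, and uses hypothetical helper names. A classical simulation of this kind needs memory exponential in n, which is precisely what the quantum register avoids.

```python
import math

S = 1 / math.sqrt(2)
U_A = [[S, S], [S, -S]]      # |0> -> (|0>+|1>)/sqrt2, |1> -> (|0>-|1>)/sqrt2
U_NOT = [[0, 1], [1, 0]]     # |0> -> |1>, |1> -> |0>


def apply_one_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` (0 = least significant bit)."""
    new = [0.0] * len(state)
    for index, amp in enumerate(state):
        bit = (index >> target) & 1
        for out_bit in (0, 1):
            new_index = (index & ~(1 << target)) | (out_bit << target)
            new[new_index] += gate[out_bit][bit] * amp
    return new


def uniform_superposition(n):
    """Start from |00...0> and apply U_A once to each of the n qubits."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0
    for qubit in range(n):
        state = apply_one_qubit_gate(state, U_A, qubit)
    return state


state = uniform_superposition(3)
print(len(state), state[0])
```

With n = 3 the three gate applications leave all 8 amplitudes equal to $2^{-3/2}$.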
Gates are often represented graphically. Fig. 2 gives an example. Qubits are represented by “wires” (whose only function is to transfer a quantum state from the output of one gate to the input of another one).
3.3 Functions
Let us next describe how quantum computers deal with functions. Consider a function
$$ f : \{0, 1, \ldots, 2^m - 1\} \rightarrow \{0, 1, \ldots, 2^n - 1\}, \tag{7} $$
where m and n are positive integers. A classical device computes f by evolving each labelled input $0, 1, \ldots, 2^m - 1$ into its respective labelled output $f(0), f(1), \ldots, f(2^m - 1)$. Quantum computers, due to the unitary (and therefore reversible) nature of their evolution, compute functions in a slightly different way. Indeed, it is not directly possible to compute a function f by a unitary operation that evolves $|x\rangle$ into $|f(x)\rangle$: if f is not a one-to-one mapping (i.e. if $f(x) = f(y)$ for some $x \neq y$), then two orthogonal kets $|x\rangle$ and $|y\rangle$ would be evolved into the same ket $|f(x)\rangle = |f(y)\rangle$, thus violating unitarity. One way to compute functions which are not one-to-one mappings, while preserving the reversibility of computation, is by keeping a record of the input. To achieve this, a quantum computer uses two registers: the first register to store the input data, the second one for the output data. Each possible input x is represented by $|x\rangle$, the quantum state of the first register. Analogously, each possible output $y = f(x)$ is represented by $|y\rangle$, the quantum state of the second register. States corresponding to different inputs and different outputs are orthogonal: $\langle x | x' \rangle = \delta_{xx'}$, $\langle y | y' \rangle = \delta_{yy'}$. The function evaluation is then determined by a unitary evolution operator $U_f$ that acts on both registers:
$$ U_f\,|x\rangle|0\rangle = |x\rangle|f(x)\rangle. \tag{8} $$
It has been shown that as far as computational complexity is concerned, a reversible function evaluation, i.e. one that keeps track of the input, is as good as a regular, irreversible evaluation. This means that if a given function can be computed in polynomial time, it can also be computed in polynomial time using a reversible computation. The main difference is that in general a reversible evaluation requires “garbage” or “work” bits that keep the information needed to undo the computation. The computations we are considering here are not only reversible but also quantum, and we can do much more than computing values of f(x) one by one. We can prepare a superposition of all input values as a single state and
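by running the computation $U_f$ only once, we can compute all of the $2^m$ values $f(0), \ldots, f(2^m - 1)$:
$$ |\psi\rangle = U_f \left( \frac{1}{2^{m/2}} \sum_{x=0}^{2^m-1} |x\rangle \right) |0\rangle = \frac{1}{2^{m/2}} \sum_{x=0}^{2^m-1} |x\rangle|f(x)\rangle. \tag{9} $$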
It looks too good to be true, so where is the catch? How much information about f does the state $|\psi\rangle$ contain? As we would expect, no quantum measurement can extract all of the $2^m$ values $f(0), f(1), \ldots, f(2^m - 1)$ from $|\psi\rangle$. Imagine, for instance, performing a measurement on the first register of $|\psi\rangle$. Quantum mechanics enables us to infer several facts:
• Since each value x appears with the same complex amplitude in the first register of state $|\psi\rangle$, the outcome of the measurement is equiprobable and can be any value ranging from 0 to $2^m - 1$.
• Assuming that the result of the measurement is $|j\rangle$, the post-measurement state of the two registers (i.e. the state of the registers after the measurement) is $|\phi\rangle = |j\rangle|f(j)\rangle$. Thus a subsequent measurement on the second register would yield with certainty the result $f(j)$, and no additional information about f can be gained.
However, there are more subtle measurements that provide us with information about joint properties of all the output values f(x) such as, for example, the periodicity of f.
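The two facts listed above can be illustrated with a small classical sketch. It assumes the two-register state is stored as a dictionary mapping pairs (x, y) to amplitudes, and the function names are hypothetical; the example function f is arbitrary and chosen only because it is not one-to-one.

```python
import math
import random


def evaluate_on_superposition(f, m):
    """State 2^(-m/2) * sum_x |x>|f(x)>, stored as {(x, f(x)): amplitude}."""
    amp = 1 / math.sqrt(2 ** m)
    return {(x, f(x)): amp for x in range(2 ** m)}


def measure_first_register(state, rng=random):
    """Measure the first register; return (j, post-measurement state)."""
    weights = {}
    for (x, _), amp in state.items():
        weights[x] = weights.get(x, 0.0) + abs(amp) ** 2
    values = list(weights)
    j = rng.choices(values, weights=[weights[x] for x in values])[0]
    norm = math.sqrt(weights[j])
    post = {key: amp / norm for key, amp in state.items() if key[0] == j}
    return j, post


def f(x):
    return (x * x) % 8   # example function, not one-to-one


psi = evaluate_on_superposition(f, 3)
j, post = measure_first_register(psi)
print(j, post)
```

After the measurement the dictionary contains the single entry (j, f(j)) with amplitude 1: one run of $U_f$ followed by this kind of measurement reveals only one value of f.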
3.4 Quantum networks
Like classical computers, quantum computers can be built out of logic gate networks. A quantum network is a quantum computing device consisting of quantum logic gates whose computational steps are synchronised in time. The outputs of some of the gates are connected by wires to the inputs of others. Complex unitary operations can be implemented as a network consisting of several quantum gates. The graphical representation of quantum networks is straightforward. Fig. 2 is an example of a network consisting of two consecutive applications of the same single-qubit gate on a qubit. Further examples of quantum networks will be given in the next section. The question of how complex unitary operations can be effected through networks of gates acting only on a few qubits will be addressed later.
Figure 2: Simple example of a quantum network. Quantum networks are read from left to right. Boxes represent quantum gates (i.e. unitary operations). Black lines denote “transmission lines” that transfer the state of a quantum bit from one part of a network to another. M denotes a measurement. In this network, a qubit is initially prepared in state $|0\rangle$ (on the left) and undergoes the same gate operation twice before being measured.
4 Simple algorithm: Deutsch’s problem
Deutsch’s problem [4] is the first and one of the simplest quantum algorithms so far. Nevertheless, it illustrates some of the crucial elements of quantum computation. Consider all one-bit functions f from {0, 1} to {0, 1}. There are only four possibilities, namely
$$ \begin{aligned} f_1(0) &= f_1(1) = 0, \\ f_2(0) &= f_2(1) = 1, \\ f_3(0) &= 0 \text{ and } f_3(1) = 1, \\ f_4(0) &= 1 \text{ and } f_4(1) = 0. \end{aligned} \tag{10} $$
Given an unknown function f among the four possibilities, the problem consists in determining whether the function is constant ($f_1$ or $f_2$) or balanced ($f_3$ or $f_4$). Intuitively, the best classical strategy is clearly to evaluate f on inputs 0 and 1, and compare the results. Therefore one requires two evaluations of the function f to answer the question.
4.1 Algorithm
The first quantum algorithm [4] proposed by Deutsch is randomised: it fails to produce a conclusive answer with probability 50%, but when the algorithm succeeds, only one evaluation is necessary. In the following we propose a variation [16] of the original algorithm that succeeds all the time with only one evaluation. Two quantum bits are needed. The algorithm proceeds as follows (the corresponding quantum network is given in Fig. 3):
• The first qubit is initially set in state $|0\rangle$ and the second qubit in state $|1\rangle$. The total state is $|01\rangle$.
Figure 3: Network for Deutsch’s problem. Note that the operator $U_f$ is applied only once, amounting to querying the function f a single time.
• The first step consists in applying the gate $U_A$ to each qubit. This leaves the two qubits in the state
$$ \frac{1}{2}\,(|0\rangle + |1\rangle)(|0\rangle - |1\rangle). \tag{11} $$
• Next one computes the function f on this superposition. This is done through a two-bit gate $U_f$, which is completely defined by its action on the basis vectors:
$$ |i, j\rangle \rightarrow |i, j \oplus f(i)\rangle. \tag{12} $$
Here i, j = 0, 1 and $\oplus$ denotes addition modulo 2.
• The last operation consists in applying the gate $U_A$ once more to each qubit.
One can easily check that the final state of the two qubits is
$$ \begin{cases} \phantom{-}|01\rangle & \text{if } f = f_1 \\ -|01\rangle & \text{if } f = f_2 \\ \phantom{-}|11\rangle & \text{if } f = f_3 \\ -|11\rangle & \text{if } f = f_4. \end{cases} \tag{13} $$
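The steps above can be verified with a direct simulation of the two-qubit state. The following is a sketch under stated assumptions: the state is a list of four amplitudes indexed as 2i + j for the basis vector $|i, j\rangle$, $U_A$ is taken in its usual matrix form, and the function names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
U_A = [[S, S], [S, -S]]


def apply_ua_to_both(state):
    """Apply U_A to each of the two qubits; amplitudes indexed as 2*i + j."""
    new = [0.0] * 4
    for i in (0, 1):
        for j in (0, 1):
            for p in (0, 1):
                for q in (0, 1):
                    new[2 * p + q] += U_A[p][i] * U_A[q][j] * state[2 * i + j]
    return new


def apply_uf(state, f):
    """U_f: |i, j> -> |i, j XOR f(i)>."""
    new = [0.0] * 4
    for i in (0, 1):
        for j in (0, 1):
            new[2 * i + (j ^ f(i))] += state[2 * i + j]
    return new


def deutsch(f):
    """Return ('constant' or 'balanced', final state) using one call to U_f."""
    state = [0.0, 1.0, 0.0, 0.0]                   # |01>
    state = apply_ua_to_both(state)
    state = apply_uf(state, f)                     # the single query
    state = apply_ua_to_both(state)
    p_first_is_one = state[2] ** 2 + state[3] ** 2
    return ("balanced" if p_first_is_one > 0.5 else "constant"), state


def f1(i): return 0
def f2(i): return 1
def f3(i): return i
def f4(i): return 1 - i


for f in (f1, f2, f3, f4):
    kind, final = deutsch(f)
    print(f.__name__, kind, [round(a, 6) for a in final])
```

The printed final states reproduce Eq. (13) up to rounding, including the overall minus signs for $f_2$ and $f_4$, which have no effect on the measurement outcome.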
Therefore a final measurement on the first qubit will reveal whether the function is constant (outcome 0) or balanced (outcome 1). One can make several remarks about this algorithm. First, the quantum algorithm enables the classification of the unknown function f with a single evaluation, whereas classically two evaluations are necessary. This is clearly not an exponential difference between the classical and quantum case (the notion
of exponential speed-up does not apply to this problem, since the problem is of fixed size and does not scale up). However, it illustrates the crucial role played by quantum mechanics. Quantum superposition and the linearity of quantum mechanics are crucial: when $U_f$ is applied to the state of Eq. (11), f is effectively computed on the four states of the superposition simultaneously. Quantum interference is also essential: the last two $U_A$ gates result in an interference of various parts of the superposition. It is also important to note that nowhere do we learn about either f(0) or f(1); the algorithm only reveals information about a global property of f, i.e. whether it is a balanced or a constant function.
5 Simon’s problem
Simon’s problem can be phrased in the following way. Consider an integer function f from $[0 \ldots 2^N - 1]$ to $[0 \ldots 2^N - 1]$. Any input of f can be put in a one-to-one correspondence with a binary vector $\mathbf{x} = (x_0, \ldots, x_{N-1})$, where $x_i = 0, 1$. Similarly, the output $\mathbf{y} = f(\mathbf{x})$ is also a vector of N binary values. (We could have used an integer $x = \sum_i 2^i x_i$ instead of $\mathbf{x}$, but the binary vector notation is more handy in this context.) Let us assume that f has the following properties:
• f belongs to P, i.e. it can be computed in polynomial time on the computer at hand.
• either f is a bijective map, or, if f is not bijective, then for distinct inputs $f(\mathbf{x}) = f(\mathbf{y}) \Leftrightarrow \mathbf{x} = \mathbf{y} \oplus \mathbf{c}$, where $\mathbf{c}$ is some constant non-zero binary vector. $\oplus$ denotes the XOR operation (addition modulo 2) and in this context $\mathbf{x} \oplus \mathbf{c} = (x_1 \oplus c_1, \ldots, x_N \oplus c_N)$.
These properties amount to saying that either f is bijective, or that it has a “period” $\mathbf{c}$. Given an unknown function fulfilling the above properties, the problem consists of finding whether f is bijective or periodic, and in the latter case determining the period. Intuitively, it is quite clear that solving the problem classically requires on average an exponential (in N) number of evaluations of the function. Refer to the original work [17] for a rigorous proof. Quantum mechanically, Simon showed how to solve this problem with a polynomial number of evaluations. To describe the algorithm, one needs to introduce a unitary transform, the Hadamard transform, which is extensively used in many algorithms.
Figure 4: a) Network for Simon’s algorithm. b) The Hadamard transform $U_H$ consists in applying to each qubit the quantum gate $U_A$.
5.1 Hadamard transform
The Hadamard transform on one qubit has already been introduced in Sect. 3.2; it corresponds to the gate $U_A$ of Eq. 5. The generalisation to N qubits consists in applying the gate $U_A$ to each qubit. In section 3.2, it was already shown that the action of the Hadamard transform on an initial state $|0 \ldots 0\rangle$ results in an equally weighted superposition of all possible states of the N qubits. More generally, one can check that if the initial state of the N qubits is given by $|\mathbf{x}\rangle$, then
$$ U_H |\mathbf{x}\rangle = \frac{1}{2^{N/2}} \sum_{\mathbf{y}} (-1)^{\mathbf{x} \cdot \mathbf{y}} |\mathbf{y}\rangle, \tag{14} $$
where the sum spans all possible binary vectors $\mathbf{y}$ and $\mathbf{x} \cdot \mathbf{y} = (x_1 y_1 + \ldots + x_N y_N) \bmod 2$ is the scalar product of $\mathbf{x}$ and $\mathbf{y}$.
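The "one can check" step for the Hadamard transform can be carried out numerically for small N. The sketch below is an illustration under stated assumptions: binary vectors are stored as integers (bit i is the i-th component), $U_A$ is taken in its usual matrix form, and the helper names are hypothetical.

```python
import math


def dot_mod2(x, y):
    """Scalar product modulo 2 of two binary vectors stored as integers."""
    return bin(x & y).count("1") % 2


def hadamard_transform(state, n):
    """Apply the gate U_A to each of the n qubits of `state`."""
    s = 1 / math.sqrt(2)
    for qubit in range(n):
        new = [0.0] * len(state)
        for index, amp in enumerate(state):
            bit = (index >> qubit) & 1
            with_zero = index & ~(1 << qubit)
            with_one = index | (1 << qubit)
            new[with_zero] += s * amp                    # component on |0>
            new[with_one] += (-s if bit else s) * amp    # component on |1>
        state = new
    return state


def transformed_basis_state(x, n):
    """Hadamard transform of the basis state |x> on n qubits."""
    state = [0.0] * (2 ** n)
    state[x] = 1.0
    return hadamard_transform(state, n)


n = 3
amplitudes = transformed_basis_state(0b101, n)
print([round(a * 2 ** (n / 2)) for a in amplitudes])
```

The rescaled amplitudes printed at the end are the signs $(-1)^{\mathbf{x} \cdot \mathbf{y}}$ for each y, which is the pattern that the interference step of Simon's algorithm exploits.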
5.2 Simon’s algorithm
Simon’s algorithm proceeds as follows (the corresponding network is given in Fig. 4):
• Start with two registers of N qubits, both in state $|0\rangle$. The total state is $|0\rangle|0\rangle$, where each $|0\rangle$ represents N qubits ($|000 \ldots 0\rangle$).
• Apply the Hadamard transform to the first register. The resulting state is
$$ \frac{1}{2^{N/2}} \sum_{\mathbf{x}} |\mathbf{x}\rangle|0\rangle. \tag{15} $$
• Apply to the registers a unitary operator $U_f$ whose action is defined by
$$ U_f\,|\mathbf{x}\rangle|\mathbf{y}\rangle = |\mathbf{x}\rangle|\mathbf{y} \oplus f(\mathbf{x})\rangle. \tag{16} $$
Since f is in P, it is possible to show that such an operator can be implemented by a network that contains a number of gates polynomial in N. The resulting state is
$$ \frac{1}{2^{N/2}} \sum_{\mathbf{x}} |\mathbf{x}\rangle|f(\mathbf{x})\rangle. \tag{17} $$
• Apply the Hadamard transform a second time, to the first register. The state becomes
$$ \frac{1}{2^{N}} \sum_{\mathbf{x}} \sum_{\mathbf{y}} (-1)^{\mathbf{x} \cdot \mathbf{y}} |\mathbf{y}\rangle|f(\mathbf{x})\rangle. \tag{18} $$
• Perform a measurement on the second and then the first register. (The order in which the measurements are effected is irrelevant; we just specify it for pedagogical reasons. In fact, one can even show that measuring the second register is not necessary.)
Let us now consider what happens if the function f is a bijective function. In that case, the outcome of the measurement on the second register is a random number between 0 and $2^N - 1$ (or, to stick with the notation used above, a random binary vector $\mathbf{x}' = f(\mathbf{k})$). After this measurement, the total state is given by
$$ \frac{1}{2^{N/2}} \sum_{\mathbf{y}} (-1)^{\mathbf{k} \cdot \mathbf{y}} |\mathbf{y}\rangle|f(\mathbf{k})\rangle. \tag{19} $$
A successive measurement on the first register will also return a random binary vector. Therefore, if the function is bijective, the outcome of the two measurements will be a random binary vector. Let us examine the case when the function is periodic. A measurement on the second register will not yield an arbitrary binary vector, but only a vector $\mathbf{x}'$ for which there exists $\mathbf{k}$ such that $\mathbf{x}' = f(\mathbf{k})$. Since the function is periodic, we know by definition that for $\mathbf{k} \oplus \mathbf{c}$ we also have $\mathbf{x}' = f(\mathbf{k} \oplus \mathbf{c})$. Therefore the state after the measurement on the second register is
$$ \frac{1}{2^{(N+1)/2}} \sum_{\mathbf{y}} (-1)^{\mathbf{k} \cdot \mathbf{y}} \left( 1 + (-1)^{\mathbf{c} \cdot \mathbf{y}} \right) |\mathbf{y}\rangle|f(\mathbf{k})\rangle. \tag{20} $$
Since the scalar product between the unknown period $\mathbf{c}$ and a binary vector $\mathbf{y}$ can only be 0 or 1 (we work in modulo 2 arithmetic), the complex amplitude $(1 + (-1)^{\mathbf{c} \cdot \mathbf{y}})$ can only be 0 (when $\mathbf{c} \cdot \mathbf{y} = 1$) or 2 (when $\mathbf{c} \cdot \mathbf{y} = 0$). Therefore a measurement on the first register can only yield vectors $\mathbf{y}$ such that $\mathbf{c} \cdot \mathbf{y} = 0$ (i.e. vectors for which the complex amplitude is non-zero). We can conclude that, if the function is periodic, the measurement of the first register yields a binary vector $\mathbf{y}$ for which $\mathbf{c} \cdot \mathbf{y} = 0$. Simon's algorithm consists in repeating this whole procedure a number of times m until at least N different outcomes have been obtained (since the outcomes are probabilistic, we can have "duplicates", and therefore we may have to effect more than N runs to obtain N different outcomes). From each run, we retain only the value of the first register. After m runs we have a collection of N binary vectors $\mathbf{y}_i$, $i = 1 \ldots N$. By solving the system of N simultaneous equations given by $\{\mathbf{y}_i \cdot \mathbf{c}' = 0 \bmod 2 \mid i = 1 \ldots N\}$, one can deduce a binary vector $\mathbf{c}'$. If the function f is periodic, then every $\mathbf{y}_i$ satisfies $\mathbf{y}_i \cdot \mathbf{c} = 0$, and therefore the $\mathbf{c}'$ obtained by solving the system of equations is the actual period $\mathbf{c}$ of the function. This is checked by verifying that $f(\mathbf{x}) = f(\mathbf{x} \oplus \mathbf{c})$ for an arbitrary $\mathbf{x}$. On the other hand, if f is bijective, then one obtains an arbitrary binary vector $\mathbf{c}'$ that will not fulfil the identity $f(\mathbf{x}) = f(\mathbf{x} \oplus \mathbf{c}')$ for any $\mathbf{x}$.
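The classical post-processing described above can be sketched as follows. Several assumptions are made and should be read as such: the quantum runs are replaced by a classical stand-in (`sample_y`) that draws y uniformly among the vectors with $\mathbf{c} \cdot \mathbf{y} = 0$, which is the outcome distribution derived in the text for a periodic f; binary vectors are stored as integers; all function names are hypothetical; and the loop stops as soon as the collected equations determine a unique non-zero solution (N - 1 linearly independent vectors), a slightly different stopping rule from counting N different outcomes.

```python
import random


def dot_mod2(x, y):
    """Scalar product modulo 2 of two binary vectors stored as integers."""
    return bin(x & y).count("1") % 2


def sample_y(c, n, rng):
    """Classical stand-in for one run: uniform y with c.y = 0 (mod 2)."""
    while True:
        y = rng.randrange(2 ** n)
        if dot_mod2(y, c) == 0:
            return y


def solve_period(samples, n):
    """Gaussian elimination over GF(2).

    Returns the unique non-zero c' with y.c' = 0 for every sample,
    or None if the samples do not yet determine it.
    """
    pivots = {}                       # pivot column -> fully reduced row
    for y in samples:
        for col, row in pivots.items():
            if (y >> col) & 1:
                y ^= row
        if y:
            col = y.bit_length() - 1
            for other in list(pivots):
                if (pivots[other] >> col) & 1:
                    pivots[other] ^= y
            pivots[col] = y
    if len(pivots) != n - 1:
        return None
    free = next(col for col in range(n) if col not in pivots)
    c = 1 << free                     # set the free variable to 1
    for col, row in pivots.items():
        if (row >> free) & 1:
            c |= 1 << col
    return c


def find_period(c, n, seed=0):
    """Collect samples until the period is determined; return (c', runs)."""
    rng = random.Random(seed)
    samples = []
    while True:
        samples.append(sample_y(c, n, rng))
        guess = solve_period(samples, n)
        if guess is not None:
            return guess, len(samples)


print(find_period(0b1011, 4))
```

The elimination step takes a number of bit operations polynomial in N, in line with the efficiency argument; only the sampling stand-in is specific to this classical illustration.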
5.3 Efficiency of the algorithm
The complexity of the algorithm is readily analysed. First let us observe that the algorithm is a randomised one: in the first part, one may obtain duplicates, i.e. the outcome of the measurement of the registers may be identical on two different runs. However, the number of possible outcomes for the second register is $2^N$ when the function is bijective and $2^{N-1}$ when it is periodic; therefore, the probability of not obtaining N different outcomes when trying m > N times decreases exponentially with m - N. On average one therefore needs a polynomial number of repetitions to obtain N different binary vectors $\mathbf{y}$. Each repetition is a polynomial task, as one can check from the network. Finally, deducing the vector $\mathbf{c}'$ requires solving a system of N linear equations; this can be effected in polynomial time on a classical computer. Thus the algorithm is globally polynomial.
6 Shor's factorisation algorithm
In 1994, Shor [5] showed that a quantum computer could in principle factor large composite integers in polynomial time. Shor used the fact that the factorisation problem can be related to finding periods of certain functions. In particular one can show [18] that finding the two prime factors p and q of N = pq boils down to finding the period of the function

$$f_{a,N}(x) = a^x \bmod N \tag{21}$$
where a is any randomly chosen number smaller than N which is coprime with N, i.e. which has no common factors with N. (If a is not coprime with N, then a factor of N is trivially found by computing the greatest common divisor of a and N using the Euclidean algorithm.) The function in Eq. 21 gives the remainder after the division of $a^x$ by N. This function is periodic [19] with period r, which depends on a and N. Knowing the period r of $f_{a,N}$, we can factor N provided that r is even and $a^{r/2} \not\equiv -1 \pmod N$. When a is chosen randomly between 0 and N the two conditions are satisfied with probability greater than 1/2. The factors of N are then given by $\gcd(a^{r/2} \pm 1, N)$, the greatest common divisor of $a^{r/2} \pm 1$ and N. For this last calculation, an easy and very efficient algorithm has been known since 300 BC. The algorithm, known as the Euclidean algorithm, runs in polynomial time on a classical computer. Thus, the problem of factoring big numbers reduces to the related task of finding the period of a function.

6.1 Example
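The reduction from factoring to period finding can also be tried out on a classical machine for small numbers. The sketch below is a minimal illustration under stated assumptions: the helper names `period` and `factors_from_period` are hypothetical, and the period is found by brute force, which takes a time exponential in the size of N and is precisely the step a quantum computer is meant to replace.

```python
from math import gcd


def period(a, N):
    """Smallest r > 0 with a**r = 1 (mod N), found by brute force."""
    if gcd(a, N) != 1:
        raise ValueError("a must be coprime with N")
    r, value = 1, a % N
    while value != 1:
        value = (value * a) % N
        r += 1
    return r


def factors_from_period(a, N):
    """Return the pair gcd(a^(r/2) -/+ 1, N), or None when this choice of a fails."""
    r = period(a, N)
    if r % 2 == 1:
        return None                  # r must be even
    half = pow(a, r // 2, N)         # a^(r/2) mod N
    if half == N - 1:
        return None                  # a^(r/2) = -1 (mod N)
    return gcd(half - 1, N), gcd(half + 1, N)


print(factors_from_period(7, 15))    # (3, 5)
print(factors_from_period(14, 15))   # None
```

Reducing $a^{r/2}$ modulo N before taking the greatest common divisor does not change the result, since gcd(k, N) = gcd(k mod N, N).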
To see how this method works, let us illustrate it with a very simple example. Let us try to factor N = 15. Firstly we select a such that gcd(a, N) = 1, for instance a = 7 (with N = 15, a could be any number from the set {2, 4, 7, 8, 11, 13, 14}). The values of $f_{7,15}(x) = 7^x \bmod 15$ for x = 0, 1, 2, 3, 4, 5, 6, ... are 1, 7, 4, 13, 1, 7, 4, ... respectively, and clearly the period here is r = 4. Then $a^{r/2} = 7^2 = 49$, and by computing the greatest common divisors $\gcd(a^{r/2} \pm 1, N)$ we find the two factors of 15: gcd(48, 15) = 3 and gcd(50, 15) = 5. The periods of $f_{a,15}(x)$ for the values a in the set {2, 4, 7, 8, 11, 13, 14} are respectively {4, 2, 4, 4, 2, 4, 2}, and in this particular example any choice of a except a = 14 leads to the correct result. For a = 14 we obtain r = 2, $a^{r/2} \equiv -1 \pmod{15}$, and the method fails. Every step of this method, except finding r, can be performed in polynomial time on a classical computer. Unfortunately, finding r is actually as time consuming as finding factors of N by the trial division method, since on
average it requires us to evaluate $f_{a,N}(x)$ a number of times exponential in L (where L is the size of the number we want to factor, $L \simeq \log_2 N$); however, if we employ quantum computation, r can be evaluated very efficiently. Shor describes a quantum algorithm which provides the period r of a function and which runs efficiently (i.e. in polynomial time) on a quantum computer. Let us now outline the main features of this algorithm.

6.2 Quantum algorithm

As was pointed out in Sec. 3.3, quantum mechanics enables us to compute a function f for different values by just applying the corresponding unitary operator $U_f$ on a register previously set in a superposition of orthogonal states. Let us play this game for the function $f_{a,N}(x)$. Since the result cannot be larger than N, the output register, as defined in Sec. 3.3, should have at least L qubits. For reasons that will become clear later, we will consider an input register of 2L qubits. Both registers are initially loaded with the value 0 and the total initial state is

$$|0\rangle|0\rangle \tag{22}$$

(cf. Fig. 5). We first set the input register into an equally weighted superposition of all possible states, from 0 to $2^{2L} - 1$ ($2^{2L} \simeq N^2$), by applying the gate $U_A$ (defined in Sec. 3.2) on each qubit of the input register:

$$\frac{1}{2^L}\sum_{x=0}^{2^{2L}-1} |x\rangle|0\rangle. \tag{23}$$
Applying the operator $U_{f_{a,N}}$ to this state, we obtain

$$\frac{1}{2^L}\sum_{x=0}^{2^{2L}-1} |x\rangle|f_{a,N}(x)\rangle. \tag{24}$$
At this stage, all the possible values of $f_{a,N}$ are encoded in the state of the second register, but as was pointed out earlier, they are not all accessible at the same time. On the other hand, we are not interested in the values themselves, but only in the periodicity of the function. Let us now proceed with an example to see how this periodicity can be efficiently retrieved. Taking the same values as in the example of the previous section (N = 15 and a = 7), the state of the two registers after applying $U_{f_{7,15}}$ is

$$\frac{1}{2^L}\Big(|0\rangle|1\rangle + |1\rangle|7\rangle + |2\rangle|4\rangle + |3\rangle|13\rangle + |4\rangle|1\rangle + |5\rangle|7\rangle + |6\rangle|4\rangle + \cdots\Big). \tag{25}$$
Figure 5: Network for Shor's algorithm. Only the "active" qubits are represented: the computation of the modular exponentiation $f_{a,N}$ is implemented in a reversible fashion in the network $U_{f_{a,N}}$. Such reversible implementations usually require "work" qubits; these qubits are initialised in a known state (in general $|0\rangle$) and are used to store (or keep) the intermediate information needed to make the operation reversible. See Bennett for a detailed discussion. In the case of the function $f_{a,N}$, it has been shown [20] that the number of work qubits can be reduced to $O(L)$.
At this point, we perform a measurement on the second register. In Eq. 25, the second register encodes only the four different values 1, 4, 7 or 13, and therefore any other measurement outcome is impossible, unless an experimental error has occurred. The state of the second register after a measurement with outcome j is $|j\rangle$. For the first register the situation is a bit more delicate: the post-measurement state of the first register will be an equally weighted superposition of the states in Eq. 25 for which the second register was in state $|j\rangle$. Table 1 sums up the possible outcomes and the post-measurement states of the two registers. We forget now about the second register and focus only on the state of the first one. Let us imagine for a while that quantum mechanics enables us to dictate the outcome of the measurement we perform on the second register. Imagine that we decide to always obtain, say, 4. In this case, we would be able to prepare at will the quantum computer in the state

$$\frac{1}{2^{L-1}}\Big(|2\rangle + |6\rangle + |10\rangle + |14\rangle + \cdots\Big)|4\rangle. \tag{26}$$
Returning to the normal rules of quantum mechanics, we could now perform a measurement on the first register and obtain with equal probability any of the values 2, 6, 10, 14, etc. Repeating the procedure from the beginning two more times, we could, with a very high probability, obtain two other distinct values, which would enable us to find the period very easily: if in these three
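Table 1: Possible outcomes of the measurement performed on the second register for the state of the form $\frac{1}{2^L}\sum_{x=0}^{2^{2L}-1}|x\rangle|f_{a,N}(x)\rangle$, with a = 7 and N = 15. The post-measurement state (up to normalisation) and the offset l are also given for each possible outcome.

outcome    post-measurement state                                               offset l
1          $(|0\rangle + |4\rangle + |8\rangle + |12\rangle + \cdots)|1\rangle$      0
7          $(|1\rangle + |5\rangle + |9\rangle + |13\rangle + \cdots)|7\rangle$      1
4          $(|2\rangle + |6\rangle + |10\rangle + |14\rangle + \cdots)|4\rangle$     2
13         $(|3\rangle + |7\rangle + |11\rangle + |15\rangle + \cdots)|13\rangle$    3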
successive runs, we obtain for instance 22, 2 and 10, the period is easily found to be gcd(22 - 10, 10 - 2) = 4. Unfortunately, dictating the result of a measurement on the second register violates the rules of quantum mechanics. Measurement outcomes are probabilistic, and in our example, each allowed outcome (1, 4, 7 or 13) is equiprobable. In this particular case, we could repeat the experiment a few times and retain only the runs for which the outcome is 4. However, the notion of efficiency is defined for asymptotic behaviour and not for particular cases. The real question is to know how this technique will perform for increasing N. In a general case, when the period is r, there are r possible different outcomes. Since r also grows exponentially with L (the size of N), the approach that consists of repeating the quantum computation over and over again until measuring in the second register at least three times the same value (in order then to perform a measurement on the first register and find the period via a greatest common divisor calculation) is inefficient. An additional ingredient is needed to make the quantum algorithm polynomially efficient. Whatever the outcome of the measurement, the first register is left in an equally weighted superposition of the form

$$\mathcal{N}\sum_{j \geq 0} |jr + l\rangle, \tag{27}$$

with r being the period of $f_{a,N}(x)$, l an offset value and $\mathcal{N}$ an appropriate normalisation factor; the sum runs over all j for which $jr + l < 2^{2L}$ (cf. Fig. 6a and Table 1). Regardless of the outcome of the measurement, the period r of the function $f_{a,N}$ is always reflected in the post-measurement state of the first register. However, it is not readily accessible, as the offset l depends probabilistically on the outcome of the measurement. It is
nevertheless possible to get rid of this offset by using a quantum equivalent of the classical Fast Fourier Transform. This operation is known as the Discrete Fourier Transform (DFT).
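The post-measurement states of the example (N = 15, a = 7) can be enumerated classically for a small register. The sketch below is only an illustration; the function name `post_measurement_states` is hypothetical and the input register size of 8 qubits is an assumption matching 2L = 8. It groups the inputs x by the value stored in the second register, so that each group lists the basis states that survive a given measurement outcome.

```python
from collections import defaultdict


def post_measurement_states(a, N, n_input_qubits):
    """Map each outcome f(x) = a**x mod N to the list of inputs x that produce it."""
    groups = defaultdict(list)
    for x in range(2 ** n_input_qubits):
        groups[pow(a, x, N)].append(x)
    return dict(groups)


states = post_measurement_states(7, 15, 8)
for outcome, xs in states.items():
    offset, spacing = xs[0], xs[1] - xs[0]
    print(f"outcome {outcome:2d}: x = {xs[:4]} ..., offset {offset}, spacing {spacing}")
```

Every group has the same spacing r = 4 between consecutive values of x; only the offset changes with the outcome, which is why a transformation that is insensitive to the offset is needed.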
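6.3 Discrete Fourier Transform

Consider the unitary operation $U_{DFT}$ that acts on a quantum register and effects

$$U_{DFT}: |x\rangle \longmapsto \frac{1}{2^L}\sum_{y=0}^{2^{2L}-1} \exp\!\left(2\pi i\,\frac{xy}{2^{2L}}\right)|y\rangle, \tag{28}$$

where 2L is the size of the register. The reason for calling this particular unitary transformation the discrete Fourier transform becomes obvious when we notice that in the transformation

$$U_{DFT}: \sum_{x=0}^{2^{2L}-1} c_x|x\rangle \longmapsto \sum_{y=0}^{2^{2L}-1} \tilde{c}_y|y\rangle \tag{29}$$

the coefficients $\tilde{c}_y$ are the discrete Fourier transform of the $c_x$, i.e.

$$\tilde{c}_y = \frac{1}{2^L}\sum_{x=0}^{2^{2L}-1} \exp\!\left(2\pi i\,\frac{xy}{2^{2L}}\right)c_x. \tag{30}$$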
The strength of the DFT lies in the fact that when it acts on a periodic state of the form of Eq. 27, it will wipe out the offset l, and transform it into a phase factor that does not affect the probabilistic outcome of a later measurement on the register. Appendix A shows how the DFT on a 2L-bit register maps a state of period r into a state of period $2^{2L}/r$. When r divides exactly $2^{2L}$, the resulting state has a nice closed form (cf. Fig. 6):

$$U_{DFT}: \sqrt{\frac{r}{2^{2L}}}\sum_{j=0}^{2^{2L}/r-1}|jr + l\rangle \longmapsto \frac{1}{\sqrt{r}}\sum_{j=0}^{r-1}\exp\!\left(2\pi i\,\frac{lj}{r}\right)\left|j\,\frac{2^{2L}}{r}\right\rangle. \tag{31}$$
A more careful analysis is required when r does not divide $2^{2L}$ (see Appendix A). Even in this more general case, the DFT retains the features illustrated in the particular situation above: it "inverts" the periodicity of the input ($r \to 2^{2L}/r$) and it has this translation invariance property which washes out the offset l (Fig. 6b). Thus, by effecting $U_{DFT}$ on states of the form of Eq. 27 with different l, we always end up with a state for which neither the outcome of a measurement, nor its probability, depends on l anymore.
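The translation invariance can be checked numerically on a small register. The following sketch is an illustration only; the function names are hypothetical, and it assumes a register of 8 qubits (256 basis states) and a period r = 4 that divides the register size exactly. It applies the unitary DFT defined above to the periodic state for each offset l and lists the outcomes y that have a non-zero probability.

```python
import cmath


def dft(amplitudes):
    """Unitary DFT: c~_y = Q**-0.5 * sum_x exp(2*pi*i*x*y/Q) * c_x."""
    Q = len(amplitudes)
    return [
        sum(c * cmath.exp(2j * cmath.pi * x * y / Q)
            for x, c in enumerate(amplitudes) if c != 0) / Q ** 0.5
        for y in range(Q)
    ]


def periodic_state(Q, r, offset):
    """Equally weighted superposition of |offset + j*r> (r must divide Q, offset < r)."""
    terms = Q // r
    return [terms ** -0.5 if x % r == offset else 0.0 for x in range(Q)]


def peaks(Q, r, offset, tol=1e-9):
    """Outcomes y with non-negligible probability after the DFT, with their probabilities."""
    probabilities = [abs(c) ** 2 for c in dft(periodic_state(Q, r, offset))]
    return {y: round(p, 6) for y, p in enumerate(probabilities) if p > tol}


for l in range(4):
    print(l, peaks(256, 4, l))
```

For every offset the only possible outcomes are the multiples of 256/4 = 64, each with probability 1/4; the offset survives only as a phase, which a measurement does not reveal.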
Figure 6: The DFT on an input state of the form $\sum_x c_x|x\rangle$ results in $\sum_y \tilde{c}_y|y\rangle$. Panel (a), "Initial state", shows $|c_x|^2$ against x; panel (b), "Fourier transform", shows the transformed coefficients against y. When the input state is periodic, as in (a), the effect of the DFT is to eliminate the possible offset l and to invert the period $r \to 2^L/r$ (b). In the figure L = 8, l = 3 and r = 4. In this particular case the period r divides exactly $2^L$, resulting thus in a "clean" transformation. Appendix A describes a more general case where r does not divide $2^L$; in this situation a slight spread occurs in the peaks of the Fourier transformed state (cf. Fig. 13).
In the previous section, we showed how to construct a state of a register with a periodic superposition and an arbitrary offset. Combining this method with a DFT, it is now possible to retrieve efficiently the "inverted" period $2^{2L}/r$ (cf. Fig. 5), from it the period r, and finally the factors of N.
6.4 Efficiency of the algorithm

Shor's algorithm is a randomised algorithm which runs successfully only with probability $1 - \epsilon$. We know when it is successful: it produces a candidate factor of N which can be checked by a trial division to see whether the result is indeed a factor or not. This check can be effected in polynomial time as it just involves a division. The randomness in the algorithm is due to certain mathematical results concerning the distribution of prime and coprime numbers. For instance, for some values of the initial number a, the algorithm will fail, even if a is coprime with N. Also, if we abandon the assumption that r divides $2^{2L}$ (almost certainly it will not; this was adopted in the previous section only for pedagogical purposes), the DFT of $c_x$ will not produce sharp maxima as in Fig. 6. This may contribute to possible errors while reading y from the register. Subsequent estimation of r is calculated using additional mathematical approximation techniques (continued fraction expansion, see for instance [21]). Apart from the fact that Shor's algorithm is a randomised one, every step of it is efficient. Computing the function $f_{a,N}$ can be done in time [20, 22] $O(L^3)$ using memory $O(L)$. Moreover, the DFT operation needed to retrieve the periodicity is also an efficient operation which can be effected in $O(L \log L)$ operations. Since the whole calculation needs to be repeated only a polynomial number of times, the algorithm is globally polynomial.
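The continued fraction step mentioned above can be illustrated with Python's standard `fractions` module, whose `Fraction.limit_denominator` method returns the closest fraction with a bounded denominator. This is a sketch under stated assumptions rather than the procedure of the cited references: the function names are hypothetical, Q stands for the register size $2^{2L}$, and the measured value y is assumed to lie next to a multiple of Q/r.

```python
from fractions import Fraction


def period_candidate(y, Q, N):
    """Denominator of the best approximation k/r of y/Q with r < N."""
    return Fraction(y, Q).limit_denominator(N - 1).denominator


def is_period(a, r, N):
    """Check a candidate: r is a multiple of the period iff a**r = 1 (mod N)."""
    return pow(a, r, N) == 1


# N = 21, a = 2 has period 6, which does not divide Q = 2**10.
N, a, Q = 21, 2, 1024
for y in (171, 341, 853):
    r = period_candidate(y, Q, N)
    print(y, r, is_period(a, r, N))
```

A measured y close to kQ/r with k and r sharing a common factor (here y = 341, close to 2Q/6 = Q/3) yields only a divisor of the period; the check fails and the computation is simply repeated.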
6.5 Discrete logarithms

In his original paper [5], Shor also showed how to solve the discrete log problem. Given a prime integer p and an integer g coprime with p, the discrete log of a number x with respect to p and g is the integer r such that $g^r = x \bmod p$. (For a rigorous formulation of the discrete log problem see Hardy and Wright [19].) The quantum algorithm to solve this problem is very similar to the factoring algorithm.
7 Grover's searching algorithm

In 1996 Grover [23] considered the problem of database search and proposed an algorithm which is significantly more efficient than any classical algorithm, though not exponentially better. Consider the problem of finding someone's name in a telephone book, knowing only their phone number. Classically one needs to look up on average N/2 numbers (where N is the total number of entries in the book). Grover showed that a quantum computer could perform this task in $O(\sqrt{N})$ steps, hence a speed-up of $O(\sqrt{N})$. The problem can be formalised in the following way. Consider $N = 2^L$ different states, which we label $S_0, S_1, \ldots, S_{N-1}$, and a condition $C_v$. We suppose that only one state fulfils the condition $C_v$ (i.e. $C_v(S_v) = 1$, $C_v(S_i) = 0\ \forall i \neq v$); the goal is to identify $S_v$ while minimising the number of evaluations of the condition $C_v$.
7.1 Quantum Algorithm

Quantum mechanically, the algorithm proceeds as follows (cf. Fig. 7):

• Start with an L-qubit register in state $|0\rangle$. Each possible state $S_x$ is represented by the corresponding ket $|x\rangle$.

• Apply the Hadamard transform $U_H$ on the register. This leaves an equally weighted superposition

$$\frac{1}{\sqrt{2^L}}\sum_{x=0}^{2^L-1}|x\rangle. \tag{32}$$

At this stage, a measurement on the register will yield with equal probability a state $|x\rangle$, meaning that any $S_x$ is a candidate answer to the problem.

• Repeat $O(\sqrt{N})$ times the following operation (the exact number of times the operation needs to be repeated will be discussed later):

1. Apply the operator $U_{C_v}$ defined by

$$U_{C_v}|v\rangle = -|v\rangle, \qquad U_{C_v}|x\rangle = |x\rangle \quad (x \neq v). \tag{33}$$

We can view this operator as a black box given to us; the black box "recognises" the state $|v\rangle$ and modifies the phase accordingly, but we have no way to "open" the box and see what the parameter v is.

2. Apply the gate $U_D$ defined as

$$U_D = U_H U_{C_0} U_H, \tag{34}$$

where $U_{C_0}$ is given by

$$U_{C_0}|0\rangle = |0\rangle, \qquad U_{C_0}|x\rangle = -|x\rangle \quad (x \neq 0). \tag{35}$$

• After the $O(\sqrt{N})$ iterations, perform a measurement on the register. The outcome of the measurement is $|v\rangle$ with probability at least 50%.
The algorithm is a randomised one, but by repeating it several times, one can boost the probability of actually discovering $S_v$. Let us see how the algorithm works. The central part of the algorithm is the iteration of the operations $U_{C_v}$ and $U_D$. Let us see their effect on a given state. The following explanation is based on an article by Grover [24]. Let us start by noticing that at any stage in the algorithm, the coefficients of the quantum state of the register are real. (This does not mean that the coefficients are always real. Because of the physical implementation of
Figure 7: Grover's network. When the database contains only one marked element, the unitary operation in the dashed box should be repeated $O(\sqrt{N})$ times. A different number of repetitions is required when the database contains more marked elements.
the operators, it may be that at some stage during the application of one operator the coefficients become complex. But in between each application, the coefficients are real.) It is therefore possible to discuss the action of $U_{C_v}$ and $U_D$ with a very simple graphical representation. The action of $U_{C_v}$ is to reverse the sign of the amplitude of $|v\rangle$ in the total superposition, cf. Fig. 9. The action of the sequence $U_H U_{C_0} U_H$ is a bit more subtle. It amounts to what Grover calls an "inversion about the average". Let us define $\bar{a}$ as the average value of the (real) coefficients of the state $|\Psi\rangle$ of the register, so

$$\bar{a} = \frac{1}{2^L}\sum_{i=0}^{2^L-1} a_i, \tag{36}$$

where

$$|\Psi\rangle = \sum_{i=0}^{2^L-1} a_i|i\rangle. \tag{37}$$
The operation $U_D$ increases (decreases) the amplitude of each state in the superposition so that after the operation it is as much above (below) $\bar{a}$ as it was below (above) $\bar{a}$ before the operation (cf. Fig. 8). This property can be readily verified. Keeping this in mind, one can understand how the algorithm works. The combined effect of $U_{C_v}$ and $U_D$ is to boost the amplitude of the state $|v\rangle$ while reducing that of every other state. Fig. 9 illustrates the first iteration of Grover's algorithm on the state of Eq. 32. At the end of the first iteration, the amplitude of the state $|v\rangle$ has become slightly larger than the amplitude of the rest of the states in the superposition. In fact, it is quite easy to show that after each iteration, the total
Figure 8: Effect of the sequence of unitary operators $U_H U_{C_0} U_H$ on an arbitrary quantum state with real amplitudes. The top graph represents the initial (real) amplitudes of the quantum state. After applying the sequence (bottom figure), each amplitude is as much above (below) the average as it was below (above) before the operation.
state can be written in the form

$$|\Psi\rangle = a_v|v\rangle + a'\sum_{x \neq v}|x\rangle. \tag{38}$$

Thus the quantum superposition is evenly weighted, apart from state $|v\rangle$, whose amplitude departs from the amplitude of the rest of the states as the algorithm is iterated. (Clearly $\bar{a} = (1 - 2^{-L})a' + 2^{-L}a_v \geq a'$.) Boyer et al. [25] were able to derive an analytical expression for $a_v$ for each iteration:

$$a_v(j) = \sin\big((2j + 1)\theta\big) \quad \text{with} \quad \sin^2(\theta) = 1/2^L. \tag{39}$$
In the above equation, j designates the iteration. For j = 0 (no iteration), the amplitude of $|v\rangle$ is $1/\sqrt{2^L}$, and the state is an equally weighted superposition (initial state). As j increases, the amplitude $a_v(j)$ increases as well, and peaks after roughly $j_{peak} \simeq \frac{\pi}{4}\sqrt{2^L}$ iterations (cf. Fig. 10). If we keep on iterating, the amplitude starts to decrease again to become negligible after $j_{negl} \simeq \frac{\pi}{2}\sqrt{2^L}$ iterations. Thus one has to be careful as to how many iterations are actually carried out. Twice as much work as is actually needed and the probability of success becomes almost zero!
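The oscillation of the amplitude can be reproduced with a direct classical simulation of the real amplitudes. The sketch below is illustrative only: the function names are hypothetical, the phase flip and the inversion about the average are applied as plain arithmetic on a list of $2^L$ numbers (so the cost is exponential in L), and the result is compared with the closed form $\sin((2j+1)\theta)$ quoted above.

```python
import math


def grover_amplitudes(L, v, iterations):
    """Real amplitudes after the given number of Grover iterations on L qubits."""
    n = 2 ** L
    amps = [1 / math.sqrt(n)] * n            # equally weighted superposition
    for _ in range(iterations):
        amps[v] = -amps[v]                   # U_{C_v}: flip the sign of the marked state
        mean = sum(amps) / n
        amps = [2 * mean - a for a in amps]  # U_D: inversion about the average
    return amps


def predicted_amplitude(L, j):
    """Closed form sin((2j + 1) * theta) with sin(theta)**2 = 1 / 2**L."""
    theta = math.asin(math.sqrt(1 / 2 ** L))
    return math.sin((2 * j + 1) * theta)


L, v = 10, 5
j_peak = round(math.pi / 4 * math.sqrt(2 ** L))
for j in (0, j_peak, 2 * j_peak):
    simulated = grover_amplitudes(L, v, j)[v]
    print(j, round(simulated ** 2, 4), round(predicted_amplitude(L, j) ** 2, 4))
```

With L = 10 the success probability is close to one after 25 iterations and close to zero again after 50, in line with the warning above about iterating for too long.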
Figure 9: First iteration of Grover's algorithm. (a) After the initial $U_H$ gate, the state is in an equally weighted (real) superposition of all possible states of the quantum register. (b) The operation $U_{C_v}$ reverses the sign of the amplitude associated to state $|v\rangle$. As a result the average amplitude value $\bar{a}$ is slightly below $1/\sqrt{2^L}$. (c) After applying the sequence $U_H U_{C_0} U_H$, the amplitudes are inverted about the average. As a result, the relative amplitude of $|v\rangle$ has increased significantly.
Figure 10: Oscillations in Grover's algorithm, plotted against the number of iterations j. With no iterations (j = 0) the probability of measuring the correct state $|v\rangle$ is very low ($\simeq 1/2^L$). The probability increases as more iterations are effected until it reaches a peak for $O(\sqrt{N})$ iterations. If more iterations are carried out, then the probability decreases again to reach the initial level, before starting to increase again. In this example L = 10. Note that the exact number of iterations that maximises the probability depends on the number of marked elements in the database [25].
7.2 Remarks

Grover's algorithm can be extended when more than one element satisfies the condition $C_v$. In this case, even fewer iterations are needed, but to optimise the probability of success, one needs to have prior knowledge of the number of "marked" elements [25]. An important point about Grover's algorithm is that the algorithm is asymptotically optimal. This was proved by Bennett et al. [26]. Grover has also extended his ideas to other problems. He showed that problems such as estimation of the mean [27] or the median [28] could be solved with similar techniques. The previous sections by no means constitute an extensive inventory of every single quantum algorithm proposed so far: Kitaev [29] has proposed an alternative polynomial method to perform factoring on a quantum computer. Brassard et al. [30] have built on Grover's algorithm to solve the collision problem faster than in the classical case.
7.3 Beyond Deutsch, Simon, Shor and Grover
In the previous sections, we have reviewed most of the algorithms discovered so far where quantum computers perform better than classical ones. We have, however, omitted a problem of particular interest to physicists: quantum computers can simulate quantum mechanical problems efficiently. It was the difficulty of simulating quantum mechanics on classical machines that sparked Feynman's initial interest in using quantum mechanics to perform computation. By observing that the Hilbert space of a quantum system grows exponentially
with the number of particles in the problem, Feynman suggested that computers that make use of an exponentially large Hilbert space could maybe outperform classical computers. As we have seen in the sections above, this insight was actually correct. In fact several people [31] have shown that a quantum computer could be used to simulate efficiently a quantum problem; when the number of particles involved in the problem increases, the corresponding scaling in the computer resources needs only to be polynomial in the number of particles, unlike the classical case where exponential resources are needed.

Many questions about the power of quantum computation are still open. It is not known whether any NP problem can be solved efficiently with a quantum computer, although we have some convincing arguments [26] showing that quantum computers cannot solve efficiently NP-complete problems. (NP-complete problems are a subclass of NP problems such that if any NP-complete problem can be solved in polynomial time, then any NP problem can be solved in polynomial time. In other words, finding a polynomial algorithm for a single NP-complete problem is equivalent to showing that the class NP is equal to the class P.) At present, we do not know where exactly the power of quantum computation extends beyond the boundaries of the class P.

8 Towards Quantum Networks
In the previous sections we have specified unitary operations by describing how they affect the state of the registers on which they act. We have not given any indications of how to implement them. These operations are usually quite complex. For instance the DFT on a register of L qubits is an operation that acts on a $2^L$-dimensional state; the mere task of writing down its matrix would take an exponential time in L. Deutsch [32] described quantum networks as a possible way to effect complex unitary operations. Quantum networks are composed of elementary logic gates connected together by wires. The fundamental idea underlying quantum networks is to decompose complex unitary operations acting on several qubits into a sequence of simple one- and two-bit gates. Other paradigms to implement quantum computation involve for instance quantum cellular automata, but so far they have not proved to be tools as valuable as the idea of quantum networks. We will not discuss them here, and the interested reader can refer to the literature on the subject [33]. Deutsch showed that there exists a universal three-bit quantum gate from which any quantum computation, i.e. any unitary operation on any finite number of qubits, can be built by a suitable network consisting only of copies of this gate. This result has been improved upon since then [34] and we now
Figure 11: Network effecting a DFT on a four-bit register. The phases that appear in the operations $U_B(\phi_{jk})$ are related to the "distance" of the qubits upon which $U_B$ acts, namely $\phi_{jk} = \pi/2^{k-j}$. The network should be read from the left to the right: first the gate $U_A$ is effected on the qubit $a_3$, then $U_B(\phi_{23})$ on $a_2$ and $a_3$, and so on.
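know that almost any non-trivial two-bit gate is universal [35]. Much attention has also been devoted to the efficient construction of more complex quantum gates [36] and to specific networks, such as the one that effects the modular exponentiation required in the first part of the factorisation algorithm [20]. In the following, I will illustrate how the quantum discrete Fourier transform discussed above can be implemented as a network consisting of only one- and two-bit gates. Consider again the one-bit gate $U_A$ of Sec. 3.2 performing the unitary operation

$$U_A: |0\rangle \longmapsto \frac{1}{\sqrt{2}}\big(|0\rangle + |1\rangle\big), \qquad |1\rangle \longmapsto \frac{1}{\sqrt{2}}\big(|0\rangle - |1\rangle\big). \tag{40}$$

Consider also the two-bit gate $U_B(\phi)$ acting on qubits $q_1$ and $q_2$ and performing the operation

$$U_B(\phi) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & e^{i\phi} \end{pmatrix} \quad \text{in the basis } \{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}. \tag{41}$$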
The diagram on the right provides a schematic representation of the gate acting on qubits $q_1$ and $q_2$. The gate $U_B(\phi)$ performs a conditional phase shift, i.e. a multiplication by a phase factor $e^{i\phi}$ only if the two qubits are both in their $|1\rangle$ state. The three other basis states are unaffected. The DFT on a register of any size can be implemented using only these two gates. For example, consider a four-bit register with qubits $a_0, \ldots, a_3$. The network in Fig. 11 follows step by step the classical algorithm of a DFT [37] and performs the operation

$$|x\rangle \longmapsto \frac{1}{\sqrt{2^4}}\sum_{c=0}^{2^4-1}\exp\!\left(2\pi i\,\frac{xc}{2^4}\right)|b\rangle, \tag{42}$$

where $|b\rangle$ represents the value c read reversing the order of the bits, i.e.

$$b = \sum_{i=0}^{3} 2^i c_{3-i} \quad \text{with } c_k \text{ given by} \quad c = \sum_{k=0}^{3} 2^k c_k. \tag{43}$$
A trivial extension of the network following the same sequence pattern of gates on L qubits gives the general DFT. In this case the transformation requires L operations $U_A$ and $L(L-1)/2$ operations $U_B(\phi)$, in total $L(L+1)/2$ elementary operations. Thus the quantum DFT can be performed efficiently. Moreover, it can even be simplified [38]: in a general network for DFT, the operations $U_B(\phi)$ that involve distant qubits $a_j$ and $a_k$, i.e. qubits for which $|j - k|$ is large (and therefore $\phi = \pi/2^{k-j}$ approaches zero), are close to the identity. Therefore, when performing the quantum DFT on registers of size L, one can neglect operations $U_B(\phi)$ on distant qubits (more precisely on qubits $a_j$ and $a_k$ for which $|j - k| > \log_2(L) + 2$) and still retrieve the periodicity of the coefficients $c_x$ with high probability [39]. The network of gates for the quantum DFT enables the efficient implementation of the second part of Shor's algorithm. The first part requires an efficient quantum evaluation of the function $f_{a,N}(x) = a^x \bmod N$. The computation of $f_{a,N}(x)$ is "easy", i.e. the number of elementary operations does not grow faster than a polynomial in the size of the input. The respective network is constructed by combining networks which perform additions and multiplications in a reversible and unitary way [20].
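The gate sequence described for Fig. 11 can be checked with a small state-vector simulation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, $U_A$ is taken to be the Hadamard-type gate, $U_B(\phi)$ the conditional phase shift with $\phi_{jk} = \pi/2^{k-j}$, and qubit $a_k$ is stored as bit k of the basis-state index. It verifies that the network maps a basis state to the DFT with the output read in bit-reversed order.

```python
import cmath
import math


def apply_hadamard(state, q):
    """U_A on qubit q (bit q of the basis-state index)."""
    new = [0j] * len(state)
    for idx, amp in enumerate(state):
        if amp == 0:
            continue
        bit = (idx >> q) & 1
        new[idx & ~(1 << q)] += amp / math.sqrt(2)
        new[idx | (1 << q)] += (-amp if bit else amp) / math.sqrt(2)
    return new


def apply_cphase(state, j, k, phi):
    """U_B(phi): multiply by exp(i*phi) only when qubits j and k are both 1."""
    return [amp * cmath.exp(1j * phi) if (idx >> j) & 1 and (idx >> k) & 1 else amp
            for idx, amp in enumerate(state)]


def dft_network(state, L):
    """U_A on the highest qubit, then its conditional phases, and so on downwards."""
    for k in reversed(range(L)):
        state = apply_hadamard(state, k)
        for j in reversed(range(k)):
            state = apply_cphase(state, j, k, math.pi / 2 ** (k - j))
    return state


def bit_reverse(c, L):
    """Value of c read with its L bits in reversed order."""
    return int(format(c, f"0{L}b")[::-1], 2)


L, x = 4, 5
basis_state = [0j] * 2 ** L
basis_state[x] = 1
output = dft_network(basis_state, L)
expected = [cmath.exp(2j * cmath.pi * x * c / 2 ** L) / 2 ** (L / 2) for c in range(2 ** L)]
print(max(abs(output[bit_reverse(c, L)] - expected[c]) for c in range(2 ** L)))
```

The network uses L gates $U_A$ and $L(L-1)/2$ gates $U_B$, so for L = 4 it contains 10 elementary operations, in agreement with the count $L(L+1)/2$.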
9 Coupling with the environment: the decoherence problem
In order to perform a successful quantum computation, one has to maintain a coherent unitary evolution until the completion of the computation. Technically it is not possible to ensure that a quantum register is completely isolated from the environment. This remnant coupling induces decay and decoherence processes, both of which drastically reduce the performance of a quantum computer, even when the coupling is very weak. Decay is a process by which a quantum system dissipates energy in the environment. For a spin, it is for instance a transition $|1\rangle \to |0\rangle$, accompanied by the emission of a photon of appropriate wavelength. Decoherence is a subtler phenomenon that involves no exchange of energy with the environment [40]. Its effect is to scramble the relative phase of the various parts of a quantum superposition. Decoherence occurs in most cases on a much faster time scale than decay, and therefore we will focus on this kind of process. Decoherence can be more easily understood if we formalise it in the language of density operators, rather than in the more familiar Dirac state notation. When a quantum system is in a pure state, it can be equivalently described by a ket $|\psi\rangle$ or by a density operator $\rho = |\psi\rangle\langle\psi|$. The characteristic effect of decoherence is to destroy the off-diagonal elements of the density operator; the system evolves into a "mixed state" [41] for which the ket notation alone is no longer suitable. To see how this can affect quantum computers, let us first consider the very simple situation in which a qubit initially in the state $|0\rangle$ undergoes successively and without decoherence two operations $U_A$ (as introduced in Sec. 3.2):

$$|\psi_{in}\rangle = |0\rangle \xrightarrow{U_A} \frac{1}{\sqrt{2}}\big(|0\rangle + |1\rangle\big) \xrightarrow{U_A} |0\rangle = |\psi_{fin}\rangle. \tag{44}$$
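In a density matrix formulation, this sequence can be written (in the basis $\mathcal{B} = \{|0\rangle, |1\rangle\}$)

$$\rho_{in} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \xrightarrow{U_A} \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \xrightarrow{U_A} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \rho_{fin}. \tag{45}$$

A measurement of the final state would yield 0 with probability one. Let us suppose now that decoherence occurs in between the two operations $U_A$ and wipes out completely the off-diagonal elements. (This is of course an oversimplification, and one should rather picture decoherence as a continuous process that progressively eliminates the off-diagonal elements.) In this case, the sequence of operations reads

$$\rho_{in} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \xrightarrow{U_A} \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \xrightarrow{\text{decoherence}} \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \xrightarrow{U_A} \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \rho_{fin} \tag{46}$$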
and $\rho_{fin}$ no longer represents a qubit in the state $|0\rangle$, but rather a statistical mixture of the states $|0\rangle$ and $|1\rangle$. Performing a measurement on the qubit would now return either 0 or 1 with equal probability; thus decoherence affects the probability distribution of the possible outcomes of a computation. The onset of decoherence is actually more complex, and to a large extent depends on the physical situation. In a typical case, for a quantum computer of S qubits which interacts with an environment in thermal equilibrium, the off-diagonal elements of the density matrix decay exponentially fast at a rate $\gamma S$ [42]:

$$\rho_{ij}(t) \sim \rho_{ij}(0)\,e^{-\gamma S t}, \tag{47}$$
where $\gamma = 1/T_{dec}$ is a constant that describes the coupling of a single qubit with the environment: the stronger the coupling, the higher $\gamma$ and the smaller the decoherence time $T_{dec}$. For an efficient computation, we have seen that both S and the total computation time $t_{tot}$ required to complete the algorithm should not grow faster than a polynomial in the size L of the problem, so that one can write

$$S \sim L^{\alpha}, \qquad t_{tot} \sim L^{\beta}\,t_{elem}, \tag{48}$$

where $t_{elem}$ is the characteristic time needed to perform a single elementary computational step of the algorithm. From this, it is then possible to show that the probability P of measuring the right answer at the end of the quantum computation decreases exponentially with S and $t_{tot}$, and hence with $L^{\alpha+\beta}$:

$$P \sim e^{-\gamma S t_{tot}} = e^{-\gamma\,t_{elem} L^{\alpha+\beta}}. \tag{49}$$
This can be illustrated in a very simple situation. Consider performing a DFT on a register of L qubits that encodes a superposition of period r = 4. Without decoherence, we expect, according to Eq. 31, to measure with equal probability either $|0\rangle$, $|2^{L-2}\rangle$, $|2 \cdot 2^{L-2}\rangle$ or $|3 \cdot 2^{L-2}\rangle$. The measurement outcome will be affected by decoherence and Fig. 12 illustrates how the diagonal elements of the density matrix of the state (i.e. the probability outcomes of a measurement) behave. The calculation is repeated for different L with the same amount of decoherence (given by a fixed $\gamma$). To obtain at least one successful computation, one needs to run the computer on average $1/P = e^{\gamma\,t_{elem} L^{\alpha+\beta}}$ times. Thus the problem becomes exponentially difficult as soon as some decoherence is present. From the complexity point of view, the magnitude of $\gamma$ has no relevance: as soon as there is some coupling with the environment and $\gamma$ is non-zero, any computation becomes inefficient. It is, however, quite clear that for small $\gamma$ (long decoherence time $T_{dec} = 1/\gamma$), it is possible to effect some quantum operations before decoherence takes its toll. Technological progress in isolating quantum computers from the environment and reducing decoherence will increase the largest number that can be factored by such computers. The requirement for a coherent computation to be completed within the decoherence time can be written as

$$t_{tot} \sim L^{\beta}\,t_{elem} < \frac{T_{dec}}{L^{\alpha}}. \tag{50}$$
[Figure 12: four panels a), b), c) and d) plotting $|c_y|$ against $y$ for registers of increasing length, and a central plot against the length of the register $L$.]
Figure 12: Numerical simulations that mimic the effect of decoherence on the result of a DFT. The initial state (not shown) is a periodic state with $T = 4$ on a register of varying size (see Fig. 6 for the case $L = 8$). Without decoherence, the resulting state should have only 4 components (Fig. 6b). The coupling with the environment induces errors (a-d), and reduces the probability of measuring the right answer. For a fixed $\gamma$ and increasing $L$, the probability of getting the right peak decreases in a characteristic exponential way (central plot). The four plots a), b), c) and d) show the diagonal elements of the density matrix of the output. The intensity of the four principal peaks decreases as $L$ increases, and the probability of measuring a correct result decreases in an exponential fashion (central plot). Note that the scales on the vertical axis are adjusted for graphs c) and d).
The right-hand side is the characteristic decoherence time of $L^{\alpha}$ qubits. From the above it follows that
$$L < \left( \frac{T_{dec}}{t_{elem}} \right)^{1/(\alpha+\beta)}.$$
With the best implementation of the factorisation algorithm $\alpha = 1$ and $\beta = 3$ [20], hence the size of the largest number that can be factored is bounded by $L < (T_{dec}/t_{elem})^{1/4}$. This is an optimistic estimate in which only decoherence is taken into account. A careful analysis shows that this bound is dramatically reduced when decay phenomena (such as spontaneous emission) are also included [43]. The ratio $M = T_{dec}/t_{elem}$ is a useful figure of merit for comparing different technologies. It tells us, very approximately, how many elementary operations can in principle be performed on a single qubit before it decoheres.
Decoherence ultimately cuts out any "exponential" speed-up that we may gain from using quantum computers and quantum algorithms, simply because to overcome it, we have to run the same computation over and over again an exponential number of times (until we get a correct answer).
Fortunately, this is not the end of the story. Classical computers suffer from similar problems, and yet one tends to agree that classical computers are (in general!) reliable. This is because in the classical situation we have efficient ways to fight errors. For existing computers, error-correcting codes have been designed that are exponentially effective and that can handle and control possible errors. (In fact, in classical computers, components are so reliable that hardly any error-correcting code is used; error detection is used instead. Error-correcting codes are, however, widely used in classical communication when noise occurs in the transmission channel.) These techniques can be adapted to tame decoherence in quantum computers [44]. They are based on redundancy (i.e. several bits encoding one bit of information) and, more importantly, on a periodic monitoring of the state of the computer. This involves measuring the state of part of the computer, diagnosing an error and then correcting it. This topic will be discussed extensively in the next two chapters.
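As a rough numerical illustration of the decoherence estimate above, the following sketch evaluates the success probability $P \sim e^{-\gamma t_{elem} L^{\alpha+\beta}}$ and the bound $L < (T_{dec}/t_{elem})^{1/(\alpha+\beta)}$. The function names and the sample figure of merit $M = 10^4$ are our own illustrative assumptions; only the scaling laws come from the text.

```python
import math

def success_probability(L, gamma, t_elem, alpha=1.0, beta=3.0):
    """P ~ exp(-gamma * t_elem * L**(alpha + beta)), the scaling quoted in the text."""
    return math.exp(-gamma * t_elem * L ** (alpha + beta))

def largest_register(T_dec, t_elem, alpha=1.0, beta=3.0):
    """Upper bound L < (T_dec / t_elem)**(1 / (alpha + beta))."""
    return (T_dec / t_elem) ** (1.0 / (alpha + beta))

if __name__ == "__main__":
    M = 1.0e4  # hypothetical figure of merit T_dec / t_elem (not from the text)
    print("bound on L:", largest_register(M, 1.0))
    for L in (4, 8, 16):
        # gamma = 1 / T_dec, with t_elem taken as the unit of time
        print(L, success_probability(L, gamma=1.0 / M, t_elem=1.0))
```

With $\alpha + \beta = 4$, improving the figure of merit by a factor of $10^4$ raises the bound on $L$ only by a factor of 10, which is why the estimate is so restrictive.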
Acknowledgements
I wish to thank Hoi-Kwong Lo for helpful comments on early drafts of the manuscript and M. Mosca for clarifying some arcane details of complexity theory. I acknowledge the financial support of the Swiss National Fund for Scientific Research.
[Figure 13: panel a) Initial state, plotted against $x$; panel b) Fourier transform, $|c_y|$ plotted against $y$ with the multiples of $2^L/T$ marked; panel c) inset showing the region $y = 95$ to $110$.]
Figure 13: Same as Fig. 6, but in this case $T = 5$ does not divide exactly $2^L = 256$. This results in a broadening of the peaks of the output state ($|c_y|$). Nevertheless, it can be shown that, when effecting a measurement on this state, the closest integers to multiples of $2^L/T$ are the most likely outcomes. This is illustrated in the inset: 102 (local maximum) is the closest integer to $2 \times 2^8/5 = 102.4$. (Note that normally one plots $|c_y|^2$, i.e., the actual probabilities, but here $|c_y|$ is plotted to emphasise the spread of the peaks.)
Appendix A: Discrete Fourier Transform
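Let us consider the simplified situation where the period $T$ divides $2^L$ exactly. A register in a periodic state is given, for instance, by Eq. 27, which we rewrite as
$$|\phi_{in}\rangle = \frac{1}{\sqrt{G}} \sum_{j=0}^{G-1} |jT + l\rangle,$$
with $G = 2^L/T$ and $l$ the offset of the periodic state. Performing a DFT on $|\phi_{in}\rangle$ gives
$$|\phi_{out}\rangle = \sum_{y=0}^{2^L-1} c_y\, |y\rangle,$$
where the amplitude $c_y$ is
$$c_y = \frac{\sqrt{T}}{2^L}\, e^{2\pi i l y/2^L} \left[ \sum_{j=0}^{G-1} e^{2\pi i j T y/2^L} \right].$$
The term in the square bracket on the r.h.s. is zero unless $y$ is a multiple of $2^L/T$, in which case it equals $G$:
$$c_y = \begin{cases} \exp(2\pi i l y/2^L)/\sqrt{T} & \text{if } y \text{ is a multiple of } 2^L/T \\ 0 & \text{otherwise.} \end{cases} \qquad (55)$$
Therefore, in the particular case when $T$ divides $2^L$ exactly, $|\phi_{out}\rangle$ can be written
$$|\phi_{out}\rangle = \sum_{j=0}^{T-1} c_{j 2^L/T}\, |j\, 2^L/T\rangle.$$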
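The case in which $T$ divides $2^L$ exactly can be checked numerically. The sketch below is a minimal illustration assuming NumPy; the helper names `periodic_state` and `dft` and the chosen offset are ours. It builds a periodic state on $L = 8$ qubits, applies a unitary DFT, and locates the non-zero amplitudes; it then repeats the calculation for $T = 5$, the situation of Fig. 13.

```python
import numpy as np

def periodic_state(L, T, offset=0):
    """Equal superposition of |offset>, |offset + T>, |offset + 2T>, ... on L qubits."""
    amp = np.zeros(2 ** L, dtype=complex)
    amp[offset::T] = 1.0
    return amp / np.linalg.norm(amp)

def dft(amp):
    """Unitary DFT. NumPy uses exp(-2*pi*i*...), the opposite sign to the text;
    this changes phases only, not the magnitudes |c_y| examined here."""
    return np.fft.fft(amp, norm="ortho")

# T = 4 divides 2^8 = 256: four peaks of magnitude 1/sqrt(T) are expected.
c4 = np.abs(dft(periodic_state(L=8, T=4, offset=1)))
peaks4 = np.flatnonzero(c4 > 1e-9)

# T = 5 does not divide 256: the peaks broaden, but the largest amplitude
# near 2 * 256 / 5 = 102.4 still sits on the closest integer.
c5 = np.abs(dft(periodic_state(L=8, T=5)))
best_near_102 = 95 + int(np.argmax(c5[95:111]))

if __name__ == "__main__":
    print(peaks4, c4[peaks4])
    print(best_near_102)
```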
A more elaborate analysis is actually required when the period $r$ does not divide $2^L$ exactly. In this case, after effecting the DFT, the coefficients $c_y$ are peaked on the closest integers to the multiples of $2^L/r$ (cf. Fig. 13). These peaks have a spread that decreases exponentially with $L$ (hence the reason to choose, in the factorisation algorithm, the size of the first register to be $2L$). A careful analysis of this case has been given by Ekert and Jozsa [21].

References
1. R. Landauer, IBM J. Res. Dev. 5, 183 (1961); C. H. Bennett, IBM J. Res. Dev. 17, 525 (1973); C. H. Bennett, Int. J. Theor. Phys. 21, 905 (1982); C. H. Bennett, SIAM J. Comput. 18(4), 766 (1989).
2. P. Benioff, J. Stat. Phys. 29, 515 (1982).
3. R. P. Feynman, Int. J. Theor. Phys. 21, 467 (1982).
4. D. Deutsch, Proc. R. Soc. Lond. A 400, 97 (1985).
5. P. W. Shor, p. 124 in Proceedings of the 35th Annual Symposium on the Foundations of Computer Science, ed. S. Goldwasser (IEEE Computer Society Press, Los Alamitos, CA, 1994).
6. R. Rivest, A. Shamir and L. Adleman, "On Digital Signatures and Public-Key Cryptosystems," MIT Laboratory for Computer Science Technical Report, MIT/LCS/TR-212 (January 1979).
7. M. Mosca, private communication.
8. A. J. Menezes, P. C. van Oorschot and S. A. Vanstone, Handbook of Applied Cryptography (CRC Press, 1997); see also D. Welsh, Codes and Cryptography (Clarendon Press, Oxford, 1988) and H. S. Wilf, Algorithms and Complexity (Prentice-Hall, Englewood Cliffs / Prentice-Hall International, London, 1986).
9. A. K. Lenstra, H. W. Lenstra Jr., M. S. Manasse and J. M. Pollard, p. 564 in Proc. 22nd ACM Symposium on the Theory of Computing (1990).
10. B. Schumacher, Phys. Rev. A 51, 2738 (1995).
11. A. S. Holevo, Problemy Peredachi Informatsii 9, 3 (1979) (this journal is translated by IEEE under the title Problems of Information Transfer); E. B. Davies, IEEE Trans. Inform. Theory IT 24, 596 (1978); C. A. Fuchs and C. M. Caves, Phys. Rev. Lett. 73, 3047 (1994).
12. S. Wiesner, Sigact News 15(1), 78 (1983); C. H. Bennett and G. Brassard, p. 175 in Proceedings of the IEEE International Conference on Computers, Systems, and Signal Processing, Bangalore, India (IEEE, New York, 1984); A. K. Ekert, Phys. Rev. Lett. 71, 4287 (1993). For a review of the field, see also R. J. Hughes, D. M. Alde, P. Dyer, G. G. Luther, G. L. Morgan and M. Schauer, Contemporary Physics 36(3), 149 (1995); S. J. D. Phoenix and P. D. Townsend, Contemporary Physics 36(3), 165 (1995).
13. C. H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. K. Wootters, Phys. Rev. Lett. 70, 1895 (1993).
14. C. H. Bennett and S. Wiesner, Phys. Rev. Lett. 69, 2881 (1992).
15. A. Peres, Quantum Theory: Concepts and Methods (Kluwer, 1993).
16. R. Cleve, A. K. Ekert, C. Macchiavello and M. Mosca, "Quantum Algorithms Revisited," preprint quant-ph/9708016.
17. D. S. Simon, p. 116 in Proceedings of the 35th Annual Symposium on the Foundations of Computer Science, ed. S. Goldwasser (IEEE Computer Society Press, Los Alamitos, CA, 1994).
18. G. L. Miller, Journal of Computer Science 13, 300 (1976).
19. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers (4th edition, Oxford University Press, 1965).
20. V. Vedral, A. Barenco and A. K. Ekert, Phys. Rev. A 54, 147 (1996).
21. A. K. Ekert and R. Jozsa, Rev. Mod. Phys. 68, 733 (1996).
22. D. Beckman, A. Chari, S. Devabhaktuni and J. Preskill, Phys. Rev. A 54, 1034 (1996).
23. L. K. Grover, p. 212 in Proceedings, 28th Annual ACM Symposium on the Theory of Computing (STOC) (May 1996), preprint quant-ph/9605043.
24. L. K. Grover, Phys. Rev. Lett. 79, 325 (1997).
25. M. Boyer, G. Brassard, P. Høyer and A. Tapp, p. 36 in Proceedings of the Fourth Workshop on Physics and Computation (PhysComp '96) (1996), preprint quant-ph/9605034.
26. C. H. Bennett, E. Bernstein, G. Brassard and U. Vazirani, "Strengths and weaknesses of quantum computing," preprint quant-ph/9701001, to appear in SIAM Journal on Computing (special issue on quantum computing).
27. L. K. Grover, "A fast quantum mechanical algorithm for estimating the median," preprint quant-ph/9607024.
28. L. K. Grover, "Quantum Telecomputation," preprint quant-ph/9704012.
29. A. Y. Kitaev, "Quantum measurements and the Abelian Stabilizer Problem," preprint quant-ph/9511026. See also R. Jozsa, "Quantum Algorithms and the Fourier Transform," preprint quant-ph/9707033.
30. G. Brassard, P. Høyer and A. Tapp, "Quantum Algorithm for the Collision Problem," preprint quant-ph/9705002.
31. D. Deutsch, unpublished; C. Zalka, "Efficient Simulation of Quantum Systems by Quantum Computers," preprint quant-ph/9603026; S. Wiesner, "Simulations of Many-Body Quantum Systems by a Quantum Computer," preprint quant-ph/9603028; D. S. Abrams and S. Lloyd, Phys. Rev. Lett. 79, 2586 (1997). Additional references can be found at the quant-ph archive.
32. D. Deutsch, Proc. R. Soc. Lond. A 425, 73 (1989).
33. N. Margolus, in Complexity, Entropy, and the Physics of Information, ed. W. H. Zurek (Addison-Wesley, 1990); M. Biafore, MIT Ph.D. Thesis (1993).
34. A. Barenco, Proc. R. Soc. Lond. A 449, 679 (1995); D. P. DiVincenzo, Phys. Rev. A 50, 1015 (1995); T. Sleator and H. Weinfurter, Phys. Rev. Lett. 74, 4087 (1995).
35. D. Deutsch, A. Barenco and A. K. Ekert, Proc. R. Soc. Lond. A 449, 669 (1995); S. Lloyd, Phys. Rev. Lett. 75, 346 (1995).
36. A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. Smolin and H. Weinfurter, Phys. Rev. A 52, 3457 (1995); J. A. Smolin and D. P. DiVincenzo, Phys. Rev. A 53, 2855 (1996); A. Barenco, D. Deutsch, A. K. Ekert and R. Jozsa, Phys. Rev. Lett. 74, 4083 (1995).
37. D. E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms (Addison-Wesley, 1981).
38. D. Coppersmith, IBM Research Report No. RC19642 (1994); D. Deutsch, unpublished.
39. A. Barenco, A. K. Ekert, P. Törmä and K.-A. Suominen, Phys. Rev. A 54, 139 (1996).
40. See, for instance, W. H. Zurek, Physics Today 44(10), 36 (1991).
41. See, for example, C. Cohen-Tannoudji, B. Diu and F. Laloë, Quantum Mechanics (Hermann and John Wiley & Sons, 1977).
42. W. G. Unruh, Phys. Rev. A 51, 992 (1995); G. M. Palma, K.-A. Suominen and A. K. Ekert, Proc. R. Soc. Lond. A 452, 567 (1996). This phenomenon has been known for a while in other contexts; see, for instance, S. M. Barnett and P. L. Knight, Phys. Rev. A 33, 2444 (1986) for an example in quantum optics.
43. M. B. Plenio and P. L. Knight, Phys. Rev. A 53, 2986 (1996).
44. P. W. Shor, Phys. Rev. A 52, R2493 (1995); A. M. Steane, Phys. Rev. Lett. 77, 793 (1996); A. M. Steane, Proc. Roy. Soc. Lond. A 452, 2551 (1996). See also the abundant subsequent literature posted on this topic at the quant-ph archive.
QUANTUM ERROR CORRECTION

ANDREW M. STEANE
Department of Atomic and Laser Physics, Clarendon Laboratory, Parks Road, Oxford OX1 3NP, United Kingdom

Quantum error correction (QEC) is a central component of quantum information theory. It is a powerful general method to restore the state of a quantum system after it has been subject to noise. The theory of quantum error correction is introduced, starting from the principles of error correction for classical communication channels. The treatment concentrates on the construction of quantum error correcting codes, and on the syndrome extraction operation, by which the quantum state is recovered after errors have occurred.
1 Introduction

Quantum error correction (QEC) comes from the marriage of quantum mechanics with the classical theory of error correcting codes. Classical error correction is a central part of classical information theory, and quantum error correction is a central part of quantum information theory. Both are concerned with the fundamental problem of communication in the presence of noise. Here the 'communication' might be between two physically separated people, but as usual in information theory it includes the case of information storage during some finite time.
Error correction is especially important in quantum computers, because efficient quantum algorithms make use of large scale quantum interference, which is very fragile, i.e. sensitive to imprecision in the computer and to unwanted coupling between the computer and the rest of the world. This makes large scale quantum computation so difficult as to be practically impossible unless error correction methods are used. Indeed, before quantum error correction was discovered, it seemed that this fragility would forbid large scale quantum computation. The ability to correct a quantum computer efficiently without disturbing the coherence of the computation is highly non-intuitive, and was thought more or less impossible. The discovery of powerful error correction methods therefore caused much excitement, since it converted large scale quantum computation from a practical impossibility to a possibility.
The first quantum error correcting codes were discovered independently by Shor [31] and Steane. Shor proved that 9 qubits could be used to protect a single qubit against general errors, while Steane described a general code construction whose simplest example does the same job using 7 qubits (see Sec. 6.2). A general theory of quantum error correction dates from subsequent papers of Calderbank and Shor and Steane [36] in which general code constructions, existence proofs, and correction methods were given. Knill and Laflamme [20] provided a more general theoretical framework, describing requirements for quantum error correcting codes (Sec. 5.1), and measures of the fidelity of corrected states. They, and independently Ekert and Macchiavello [11], derived the quantum Hamming bound (Sec. 5.3). The close relationship between entanglement purification and quantum error correction was discussed by Bennett et al. They, and independently Laflamme et al., discovered the perfect 5 qubit code (Sec. 6.4). The important concept of the stabilizer (Sec. 6.4) is due to Gottesman [13] and independently Calderbank et al.; this permitted many new codes to be discovered. The quantum MacWilliams identities were discovered by Shor and Laflamme; these provide further important constraints on codes and fidelity measures. A recursive coding and correction method, called concatenated coding, was introduced by Knill and Laflamme; this uses more quantum resources [39] but permits communication over arbitrarily long times or distances. Van Enk et al. have discussed quantum communication over noisy channels using a realistic model of trapped atoms and high-quality optical cavities.
This introduction to QEC will concentrate on the essential ideas of QEC and on the construction and use of quantum error correcting codes. There will not be space to discuss fidelity measures or the quantum channel capacity. These have been investigated by many authors, and important questions remain open in this area. A further important subject is that of fault-tolerant methods; see the next chapter by Preskill [27] and references therein.

2 Three bit code
We will begin by analysing in detail the workings of the simplest quantum error correcting code. Exactly what is meant by a quantum error correcting code will become apparent. Suppose a source A wishes to transmit quantum information via a noisy communication channel to a receiver B. Obviously the channel must be noisy in practice since no channel is perfectly noise-free. However, in order to do better than merely sending quantum bits down the channel, we must know something about the noise in the channel. For this introductory section, the following properties will be assumed: the noise acts on each qubit independently, and for a given qubit has an effect chosen at random between leaving the qubit's state unchanged (probability $1 - p$) and applying a Pauli $\sigma_x$ operator (probability $p < 1/2$). The simplest quantum error correction method is summarised in Fig. 1.
[Figure 1: circuit diagram in which the input qubit $|\phi\rangle$ passes through the stages encode, channel (noise), recover and decode, yielding $|\phi\rangle$ at the output.]
Figure 1: Simple example illustrating the principles of quantum error correction. Alice wishes to transmit a single-qubit state $|\phi\rangle = a|0\rangle + b|1\rangle$ to Bob through a channel which introduces $\sigma_x$ errors ($|0\rangle \leftrightarrow |1\rangle$) randomly. Alice prepares two further qubits in the state $|0\rangle$, represented by a small circle. She then encodes her single qubit into a joint state of three qubits, by two controlled-NOT operations. These three qubits are sent to Bob. At the receiving end, Bob recovers the joint state by extracting a syndrome, and correcting on the basis of this syndrome. The correction is a $\sigma_x$ operation applied to one (or none) of the qubits. Finally, a decoding operation disentangles one qubit from the others, giving Bob a single qubit in the state $|\phi\rangle$ with probability $1 - O(p^2)$.
We adopt the convention of calling the source Alice and the receiver Bob. The state of any qubit which Alice wishes to transmit can be written without loss of generality $a|0\rangle + b|1\rangle$. Alice prepares two further qubits in the state $|0\rangle$, so the initial state of all three is $a|000\rangle + b|100\rangle$. Alice now operates a controlled-NOT gate from the first qubit to the second, producing $a|000\rangle + b|110\rangle$, followed by a controlled-NOT gate from the first qubit to the third, producing $a|000\rangle + b|111\rangle$. Finally, Alice sends all three qubits down the channel. Bob receives the three qubits, but they have been acted on by the noise in the channel. Their state is one of the following:
$$\begin{array}{ll}
\text{state} & \text{probability} \\
a|000\rangle + b|111\rangle & (1-p)^3 \\
a|100\rangle + b|011\rangle & p(1-p)^2 \\
a|010\rangle + b|101\rangle & p(1-p)^2 \\
a|001\rangle + b|110\rangle & p(1-p)^2 \\
a|110\rangle + b|001\rangle & p^2(1-p) \\
a|101\rangle + b|010\rangle & p^2(1-p) \\
a|011\rangle + b|100\rangle & p^2(1-p) \\
a|111\rangle + b|000\rangle & p^3
\end{array} \qquad (1)$$
Bob now introduces two more qubits of his own, prepared in the state $|00\rangle$. This extra pair of qubits, referred to as an ancilla, is not strictly necessary, but makes error correction easier to understand and becomes necessary when fault-tolerant methods are needed. Bob uses the ancilla to gather information about the noise. He first carries out controlled-NOTs from the first and second received qubits to the first ancilla qubit, then from the first and third received qubits to the second ancilla qubit. The total state of all five qubits is now
$$\begin{array}{ll}
\text{state} & \text{probability} \\
(a|000\rangle + b|111\rangle)\,|00\rangle & (1-p)^3 \\
(a|100\rangle + b|011\rangle)\,|11\rangle & p(1-p)^2 \\
(a|010\rangle + b|101\rangle)\,|10\rangle & p(1-p)^2 \\
(a|001\rangle + b|110\rangle)\,|01\rangle & p(1-p)^2 \\
(a|110\rangle + b|001\rangle)\,|01\rangle & p^2(1-p) \\
(a|101\rangle + b|010\rangle)\,|10\rangle & p^2(1-p) \\
(a|011\rangle + b|100\rangle)\,|11\rangle & p^2(1-p) \\
(a|111\rangle + b|000\rangle)\,|00\rangle & p^3
\end{array} \qquad (2)$$
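The ancilla states in the list above can be reproduced with a few lines of classical bookkeeping, because the $\sigma_x$ errors only permute computational basis states. The following is a minimal sketch (the function name `syndrome` is ours): it computes the two parities gathered by Bob's controlled-NOTs for both branches of the superposition, and checks that they agree, which is why measuring the ancilla reveals nothing about $a$ and $b$.

```python
from itertools import product

def syndrome(bits):
    """Parities collected by the ancilla: (q1 XOR q2, q1 XOR q3)."""
    q1, q2, q3 = bits
    return (q1 ^ q2, q1 ^ q3)

if __name__ == "__main__":
    for error in product((0, 1), repeat=3):
        branch_a = error                         # |000> after the error
        branch_b = tuple(1 ^ e for e in error)   # |111> after the same error
        # Both branches give the same ancilla state.
        assert syndrome(branch_a) == syndrome(branch_b)
        print(error, syndrome(branch_a))
```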
Bob measures the two ancilla qubits in the basis $\{|0\rangle, |1\rangle\}$. This gives him two classical bits of information. This information is called the error syndrome, since it helps to diagnose the errors in the received qubits. Bob's next action is as follows:

measured syndrome    action
00                   do nothing
01                   apply $\sigma_x$ to third qubit
10                   apply $\sigma_x$ to second qubit
11                   apply $\sigma_x$ to first qubit

Suppose for example that Bob's measurements give 10 (i.e. the ancilla state is projected onto $|10\rangle$). Examining Eq. 2, we see that the state of the received qubits must be either $a|010\rangle + b|101\rangle$ (probability $p(1-p)^2$) or $a|101\rangle + b|010\rangle$ (probability $p^2(1-p)$). Since the former is more likely, Bob corrects the state by applying a Pauli $\sigma_x$ operator to the second qubit. He thus obtains either $a|000\rangle + b|111\rangle$ (most likely) or $a|111\rangle + b|000\rangle$. Finally, to extract the qubit which Alice sent, Bob applies controlled-NOT from the first qubit to the second and third, obtaining either $(a|0\rangle + b|1\rangle)|00\rangle$ or $(a|1\rangle + b|0\rangle)|00\rangle$. Therefore Bob has either the exact qubit sent by Alice, or Alice's qubit operated on by $\sigma_x$. Bob does not know which he has, but the important point is that the method has a probability of success greater than $1 - p$. The correction is designed to succeed whenever either no or just one qubit is corrupted by the channel, which are the most likely possibilities. The failure probability is the probability that at least two qubits are corrupted by the channel, which is $3p^2(1-p) + p^3 = 3p^2 - 2p^3$, i.e. less than $p$ (as long as $p < 1/2$).
To summarise, Alice communicates a single general qubit by expressing its state as a joint state of three qubits, which are then sent to Bob. Bob first applies error correction, then extracts a single qubit state. The probability that he fails to obtain Alice's original state is $O(p^2)$, whereas it would have been $O(p)$ if no error correction method had been used. We will see later that with more qubits the same basic ideas lead to much more powerful noise suppression, but it is worth noting that we already have quite an impressive result: by using just three times as many qubits, we reduce the error probability by a factor $1/(3p)$, i.e. a factor $\sim 30$ for $p = 0.01$, $\sim 300$ for $p = 0.001$, and so on.
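The failure probability of the three bit code can be confirmed by enumerating all eight error patterns. The sketch below is illustrative only (the names `CORRECTION` and `failure_probability` are ours): it applies the syndrome-based correction rule described above and sums the probabilities of the patterns for which a residual error survives.

```python
from itertools import product

# Correction rule: syndrome -> qubit flipped by Bob, written as a bit pattern.
CORRECTION = {
    (0, 0): (0, 0, 0),  # do nothing
    (0, 1): (0, 0, 1),  # flip third qubit
    (1, 0): (0, 1, 0),  # flip second qubit
    (1, 1): (1, 0, 0),  # flip first qubit
}

def syndrome(error):
    e1, e2, e3 = error
    return (e1 ^ e2, e1 ^ e3)

def failure_probability(p):
    """Total probability of the error patterns that Bob's correction gets wrong."""
    total = 0.0
    for error in product((0, 1), repeat=3):
        fix = CORRECTION[syndrome(error)]
        residual = tuple(e ^ f for e, f in zip(error, fix))
        if residual != (0, 0, 0):  # residual is 111: a sigma_x on the decoded qubit
            weight = sum(error)
            total += p ** weight * (1 - p) ** (3 - weight)
    return total

if __name__ == "__main__":
    for p in (0.01, 0.001):
        print(p, failure_probability(p), p / failure_probability(p))
```

The last printed column is the improvement factor over sending a bare qubit, which is close to $1/(3p)$ for small $p$.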
3 Binary fields and discrete vector spaces
In order to generalise the above ideas, we will need to understand the theory of classical error correcting codes, and this section provides some mathematical preliminaries. Classical error correction is concerned with classical bits, not quantum states. The mathematical treatment is based on the fact that linear algebraic operations such as addition and multiplication can be consistently defined using finite rather than infinite sets of integers, by using modular arithmetic. The simplest case, that of modulo 2 arithmetic, will cover almost everything in this article. The addition operation is defined by $0 + 0 = 0$, $0 + 1 = 1 + 0 = 1$, $1 + 1 = 0$. The set $\{0, 1\}$ is a group under this operation, since 0 is the identity element, both elements are their own inverse and the operation is associative. The set $\{0, 1\}$ is also closed under multiplication, with identity element 1, and its non-zero elements form a group under multiplication. Furthermore, we can also define division (except division by zero) and subtraction, and the commutative and distributive laws hold. These properties together define a finite field, also called a Galois field. Thus the set $\{0, 1\}$ is referred to as the field GF(2), where addition and multiplication are as defined.
A string of $n$ bits is considered to be a vector of $n$ components, for example 011 is the vector $(0, 1, 1)$. Vector addition is carried out by the standard method of adding components, for example $(0,1,1) + (1,0,1) = (0+1, 1+0, 1+1) = (1,1,0)$. It is easy to see that this operation is equivalent to the exclusive-or operation $\oplus$ carried out bitwise between the binary strings: $011 \oplus 101 = 110$. Note that $u + u = 0$ and $u - v = u + v$ (prove by adding $v$ to both sides). We can define the inner product (or scalar product) by the standard rule of multiplying corresponding components, and summing the results: $(1,1,0,1) \cdot (1,0,0,1) = 1 + 0 + 0 + 1 = 0$. Note that all the arithmetic is done by the rules of the Galois field, so the final answer is only ever 0 or 1. The inner product is also called a parity check or check sum since it indicates whether the second vector 'satisfies' the parity check specified by the first vector (or equivalently whether the first vector satisfies the parity check specified by the second). To satisfy a parity check $u$, a vector $v$ must have an even number of 1's at the positions (coordinates) specified by the 1's in $u$. If $u$ and $v$ are row vectors then $u \cdot v = u v^T$ where $T$ is the transpose operation.
The number of non-zero components of a binary vector $u$ is important in what follows, and is called the weight (or Hamming weight), written wt($u$). For example, wt(0001101) = 3. The number of places (coordinates) where two vectors differ is called the Hamming distance between the vectors; the distance between $u$ and $v$ is equal to wt($u + v$).
There are $2^n$ vectors of $n$ components (stated in other language, there are $2^n$ $n$-bit words). This set of vectors forms a linear vector space, sometimes called Hamming space. It is a discrete vector space since vector components are only ever equal to 0 or 1. The vectors point to the vertices of a square lattice in $n$ dimensions. The space is spanned by any set of $n$ linearly independent vectors. The most obvious set which spans the space is $\{1000\ldots00,\ 0100\ldots00,\ 0010\ldots00,\ \ldots,\ 0000\ldots01\}$.
There are subspaces within Hamming space. A linear subspace $C$ is any set of vectors which is closed under addition, i.e. $u + v \in C\ \ \forall u, v \in C$. For example the set 0000, 0011, 1100, 1111 is a $2^2$ linear subspace of the $2^4$ Hamming space. A linear subspace containing $2^k$ vectors is spanned by $k$ linearly independent vectors (for example 0011 and 1100 in the case just given). Any linear subspace is thus completely specified by its generator matrix $G$, which is just the matrix whose $k$ rows are any $k$ vectors which span the space. We can always linearly combine rows to get an equivalent generator matrix, for example
$$G = \begin{pmatrix} 0011 \\ 1100 \end{pmatrix} = \begin{pmatrix} 0011 \\ 1111 \end{pmatrix} \qquad (3)$$
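The operations introduced in this section are easy to experiment with. The sketch below is a minimal illustration using plain Python tuples; all function names are ours. It implements addition, inner product, weight and distance over GF(2), and confirms that the two generator matrices of Eq. 3 span the same subspace.

```python
from itertools import product

def vec(bits):
    """Turn a string such as '0011' into a tuple of integers."""
    return tuple(int(ch) for ch in bits)

def add(u, v):
    """Component-wise addition modulo 2 (bitwise exclusive-or)."""
    return tuple(a ^ b for a, b in zip(u, v))

def dot(u, v):
    """Inner product (parity check) modulo 2."""
    return sum(a & b for a, b in zip(u, v)) % 2

def weight(u):
    """Hamming weight: the number of non-zero components."""
    return sum(u)

def distance(u, v):
    """Hamming distance, equal to wt(u + v)."""
    return weight(add(u, v))

def span(generators):
    """All linear combinations over GF(2) of the rows of a generator matrix."""
    words = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        word = (0,) * len(generators[0])
        for c, g in zip(coeffs, generators):
            if c:
                word = add(word, g)
        words.add(word)
    return words

if __name__ == "__main__":
    print(add(vec("011"), vec("101")))
    print(dot(vec("1101"), vec("1001")))
    print(span([vec("0011"), vec("1100")]) == span([vec("0011"), vec("1111")]))
```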
The minimum distance $d$ of a subspace is the smallest Hamming distance between any two members of the subspace. If the two closest vectors are $u$ and $v$, then $d = \mathrm{wt}(u + v)$. For the case of a linear space, $w = u + v$ is also a member of the space. From this we deduce that the minimum distance of a linear space is equal to the smallest weight of a non-zero member of the space. This fact is useful in calculating the value of $d$, since it is much easier to find the minimum weight than to evaluate all the distances in the space.
Now, if $u \cdot v = 0$ and $u \cdot w = 0$ then $u \cdot (v + w) = 0$. From this it follows that if all the rows of a generator satisfy the parity check $u$, then so do all the vectors in the subspace. Any given parity check $u$ divides Hamming space exactly in half, into those vectors which satisfy $u$ and those that do not. Therefore, the $2^k$ vectors of a linear subspace in $2^n$ Hamming space can satisfy at most $n - k$ linearly independent parity checks. These parity checks together form the parity check matrix $H$, which is another way to define the linear subspace. $H$ has $n$ columns and $n - k$ rows. For any given subspace, the check and generator matrices are related by
$$H G^T = 0 \qquad (4)$$
where $G^T$ is the transpose of $G$, and 0 is the $(n - k) \times k$ zero matrix. The simple error correction method described in the previous section is based around the very simple binary vector space 000, 111. Its generator matrix is $G = (111)$ and the parity check matrix is
$$H = \begin{pmatrix} 110 \\ 101 \end{pmatrix} \qquad (5)$$
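The relation $H G^T = 0$ for this small code can be checked directly. The sketch below assumes NumPy and takes the parity check matrix to have rows 110 and 101, i.e. the two parities (first with second qubit, first with third) used for syndrome extraction in Sec. 2; the helper name `syndrome` is ours. It also shows that multiplying $H$ by an error vector reproduces the syndromes of that section.

```python
import numpy as np

G = np.array([[1, 1, 1]])      # generator matrix of the space {000, 111}
H = np.array([[1, 1, 0],       # parity of bits 1 and 2
              [1, 0, 1]])      # parity of bits 1 and 3

def syndrome(error):
    """H e^T modulo 2 for an error vector e."""
    return tuple(int(x) for x in (H @ np.array(error)) % 2)

if __name__ == "__main__":
    print((H @ G.T) % 2)       # every entry should be zero
    for e in [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]:
        print(e, syndrome(e))
```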
A useful relationship enables us to derive each of $H$ and $G$ from the other. If $G$ can be converted to the form $G = (I_k, A)$ where $I_k$ is the $k \times k$ identity matrix, and $A$ is the rest of $G$ ($A$ is a $k \times (n - k)$ matrix), then $H = (A^T, I_{n-k})$.
The last concept which we will need in what follows is that of the dual. The dual space $C^{\perp}$ is the set of all vectors $u$ which have zero inner product with all vectors in $C$, $u \cdot v = 0\ \ \forall v \in C$. It is simple to deduce that the parity check matrix of $C$ is the generator matrix of $C^{\perp}$ and vice versa. If $H = G$ then $C = C^{\perp}$; such spaces are termed self-dual.
The notation $(n, m, d)$ is a short-hand for a set of $m$ $n$-bit vectors having minimum distance $d$. For linear vector spaces, the notation $[n, k, d]$ is used, where $k$ is now the dimension of the vector space, so it contains $2^k$ vectors.
Let us conclude this section with another example of a linear binary vector space which will be important in what follows. It is a [7,4,3] space discovered by Hamming [15]. The generator matrix is
$$G = \begin{pmatrix} 1010101 \\ 0110011 \\ 0001111 \\ 1110000 \end{pmatrix} \qquad (6)$$
so the sixteen members of the space are
$$\begin{array}{cccc}
0000000 & 1010101 & 0110011 & 1100110 \\
0001111 & 1011010 & 0111100 & 1101001 \\
1110000 & 0100101 & 1000011 & 0010110 \\
1111111 & 0101010 & 1001100 & 0011001
\end{array} \qquad (7)$$
These have been written in the following order: first the zero vector, then the first row of $G$. Next add the second row of $G$ to the two vectors so far obtained, then add the third row to the four vectors previously obtained, and so on. We can see at a glance that the minimum distance is 3 since the minimum non-zero weight is 3. The parity check matrix is
$$H = \begin{pmatrix} 1010101 \\ 0110011 \\ 0001111 \end{pmatrix} \qquad (8)$$
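The properties of this [7,4,3] space can be verified by enumeration. The sketch below assumes NumPy and takes the four generators 1010101, 0110011, 0001111 and 1110000, which reproduce the sixteen members listed in Eq. 7; the variable names are ours.

```python
import numpy as np
from itertools import product

G = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1],
              [1, 1, 1, 0, 0, 0, 0]])
H = G[:3]  # the parity check matrix consists of the first three rows of G

def row_space(M):
    """All linear combinations over GF(2) of the rows of M."""
    return {tuple(int(x) for x in (np.array(c) @ M) % 2)
            for c in product((0, 1), repeat=M.shape[0])}

codewords = row_space(G)
dual = row_space(H)  # H generates the dual space
min_weight = min(sum(w) for w in codewords if any(w))

if __name__ == "__main__":
    print(len(codewords), min_weight)
    print((H @ G.T) % 2)
    print(dual <= codewords)
```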
It is simple to confirm that $H G^T = 0$. Note also that since $H$ is made of rows of $G$, this code contains its dual: $C^{\perp} \subset C$.

4 Classical error correction
Classical error correction is a large subject; a full introduction may be found in many readily available textbooks. In order to keep the present discussion reasonably self-contained, a minimal set of ideas is given here. These will be sufficient to guide us in the construction of quantum error correcting codes.
Classical communication can be considered without loss of generality to consist in the communication of strings of binary digits, i.e. the binary vectors introduced in the previous section. A given binary vector, also called a binary word, which we wish to send from A to B, is called a message. A noisy communication channel will corrupt the message, but since the message $u$ is a binary vector the only effect the noise can have is to change it to some other binary vector $u'$. The difference $e = u' - u$ is called the error vector. Error correction consists in deducing $u$ from $u'$.

4.1 Error correcting code
A classical error correcting code is a set of words, that is, a binary vector space. It need not necessarily be linear, though we will be concerned almost exclusively with linear codes. Each error correcting code $C$ allows correction of a certain set $S \equiv \{e_i\}$ of error vectors. The correctable errors are those which satisfy
$$u + e_i \neq v + e_j \quad \forall u, v \in C\ (u \neq v) \qquad (9)$$
The case of no error, $e = 0$, is included in $S$, so that error-free messages are 'correctable'. To achieve error correction, we use the fact that each message $u$ is corrupted to $u' = u + e$ for some $e \in S$. However, the receiver can deduce $u$ unambiguously from $u + e$ since from Eq. 9, no other message $v$ could have been corrupted to $u + e$, as long as the channel only generates correctable error vectors. In practice a noisy channel causes both correctable and uncorrectable errors, and the problem is to match the code to the channel, so that the errors most likely to be generated by the channel are those the code can correct.
Let us consider two simple examples. First, suppose the channel is highly noisy, but noise occurs in bursts, always affecting pairs of bits rather than one bit at a time. In this case we use the simple code $C = \{00, 01\}$. Longer messages are sent bit by bit using the code. The possible error vectors (those which the channel can produce) are $\{00, 11\}$, and in this example this is also a set of correctable errors: the receiver interprets 00 or 11 as the message 00, and 01 or 10 as the message 01. Therefore error correction always works perfectly! This illustrates the fact that we can always take advantage of structure in the noise (here, pairs of bits being equally affected) in order to circumvent it.
Next suppose the noise affects each bit independently, with a fixed error probability $p < 1/2$. This noise has less structure than that we just considered, but it still has some predictable features, the most important being that the most likely error vectors are those with the smallest weight. We use the code $C = \{000, 111\}$. The errors which the channel can produce are, in order of decreasing probability, $\{000, 001, 010, 100, 011, 101, 110, 111\}$. With $n = 3$ and $m = 2$, the set of correctable errors can have at most $2^n/m = 2^3/2 = 4$ members, and these are $\{000, 001, 010, 100\}$. This is the classical equivalent of the quantum code described in Sec. 2.
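The condition of Eq. 9 is simple to test by brute force for small codes. The following sketch (function names are ours) checks both examples just discussed, and shows that enlarging the error set of the code $\{000, 111\}$ with a weight-2 vector breaks correctability.

```python
from itertools import combinations, product

def vec(bits):
    """Turn a string such as '011' into a tuple of integers."""
    return tuple(int(ch) for ch in bits)

def add(u, v):
    """Component-wise addition modulo 2."""
    return tuple(a ^ b for a, b in zip(u, v))

def is_correctable(code, errors):
    """True if u + e_i != v + e_j for all distinct u, v in the code (Eq. 9)."""
    for u, v in combinations(code, 2):
        for e_i, e_j in product(errors, repeat=2):
            if add(u, e_i) == add(v, e_j):
                return False
    return True

burst_code = [vec("00"), vec("01")]
burst_errors = [vec("00"), vec("11")]

repetition_code = [vec("000"), vec("111")]
single_errors = [vec("000"), vec("001"), vec("010"), vec("100")]

if __name__ == "__main__":
    print(is_correctable(burst_code, burst_errors))
    print(is_correctable(repetition_code, single_errors))
    print(is_correctable(repetition_code, single_errors + [vec("011")]))
```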
4.2 Minimum distance coding

The noisy channel just described is called the binary symmetric channel. This is a binary channel (the only kind we are considering) in which the noise affects each bit sent down the channel independently. It is furthermore symmetric, meaning that the channel causes errors $0 \rightarrow 1$ and $1 \rightarrow 0$ with equal probability. If $n$ bits are sent down a binary symmetric channel, the probability that $m$ of them are flipped ($0 \leftrightarrow 1$) is the probability for $m$ independent events in $n$ opportunities:

$$C(n,m)\, p^m (1-p)^{n-m} \qquad (10)$$

where the binomial coefficient $C(n,m) = n!/(m!(n-m)!)$. The binary symmetric channel is important because other types of noise can be treated as a combination of a structured component and a random component. The structured component can be tackled in other ways, and the correction of the random component can be converted to a problem equivalent to that of the binary symmetric channel.
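As a quick numerical illustration of Eq. 10, the sketch below evaluates the distribution of the number of flipped bits. The function name and the values n = 3, p = 0.1 are assumptions chosen for illustration only.

```python
from math import comb

def flip_probability(n, m, p):
    """Eq. 10: probability that exactly m of n bits are flipped."""
    return comb(n, m) * p ** m * (1 - p) ** (n - m)

# Distribution over m = 0..3 flipped bits for three bits and p = 0.1 (illustrative values).
distribution = [flip_probability(3, m, 0.1) for m in range(4)]
```

The probabilities sum to one, and for p < 1/2 the low-weight outcomes dominate, which is why codes for this channel are designed to correct errors of small weight.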
To code for the binary symmetric channel, we clearly need a code in which error vectors of small weight are correctable, since these are the most likely ones. A code which corrects all error vectors of weight up to and including $t$ is called a $t$-error correcting code. A simple but important observation is the following:

A code of minimum distance $d$ can correct all error vectors of weight less than or equal to $t$ if and only if $d > 2t$.

Proof: if $d = \mathrm{wt}(u + v) \leq 2t$ then there exist error vectors $e_1, e_2$ of weight $\leq t$ such that $\mathrm{wt}(u + v + e_1 + e_2) = 0$, which implies $u + e_1 = v + e_2$, so correction is impossible. Also, if $\mathrm{wt}(u + v + e_1 + e_2) \neq 0$ for all vectors $e_1, e_2$ of weight up to $t$, then $\mathrm{wt}(u + v) > 2t$ for all $u, v \in C$, so $d > 2t$. Conversely, if $d > 2t$ then $\mathrm{wt}(u + v + e_1 + e_2) \geq \mathrm{wt}(u + v) - \mathrm{wt}(e_1 + e_2) \geq d - 2t > 0$ for all distinct $u, v \in C$, so no two corrupted codewords coincide and correction is possible.

This argument shows that a good set of codes for the binary symmetric channel are those of high minimum distance.
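The relation between minimum distance and correction ability can be sketched in a few lines. The helper names below are illustrative and not taken from the text; codewords are written as bit strings.

```python
from itertools import combinations

def hamming_distance(u, v):
    """Number of positions in which the bit strings u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def minimum_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    return min(hamming_distance(u, v) for u, v in combinations(code, 2))

def correctable_weight(code):
    """Largest t with d > 2t, that is t = (d - 1) // 2."""
    return (minimum_distance(code) - 1) // 2
```

For the code {000, 111} this gives d = 3 and t = 1. For the burst-noise code {00, 01} it gives d = 1 and t = 0, which is consistent with the text: that code works only because of the special structure of the noise, not because of its minimum distance.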
4.3 Bounds on the size of codes

In order to communicate $k$ bits of information using an error correcting code, $n > k$ bits must be sent down the channel. The ratio $k/n$, called the rate of the code, gives a measure of the cost of error correction. Various bounds on $k/n$ can be deduced.

In a Hamming space of $n$ dimensions, that is, one consisting of $n$-bit vectors, there are $C(n,t)$ error vectors of weight $t$. Any member of a $t$-error correcting code has $\sum_{i=0}^{t} C(n,i)$ erroneous versions (including the error-free version) which must all be distinct from the other members of the code and their erroneous versions if error correction is to be possible. They are also distinct from each other because

$$e_1 \neq e_2 \;\Rightarrow\; u + e_1 \neq u + e_2. \qquad (11)$$

However, there are only $2^n$ possible vectors in the whole Hamming space, therefore the number of vectors $m$ in $n$-bit $t$-error correcting codes is limited by the Hamming bound [15,16]

$$m \sum_{i=0}^{t} C(n,i) \leq 2^n. \qquad (12)$$

For linear codes $m = 2^k$ so the Hamming bound becomes

$$\sum_{i=0}^{t} C(n,i) \leq 2^{n-k}. \qquad (13)$$

From this one may deduce in the limit of large $n, k, t$:

$$\frac{k}{n} \leq 1 - H\!\left(\frac{t}{n}\right)(1 - \eta) \qquad (14)$$

where $\eta \rightarrow 0$ as $n \rightarrow \infty$, and $H(x)$ is the entropy function

$$H(x) \equiv -x \log_2 x - (1-x)\log_2(1-x). \qquad (15)$$
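The Hamming bound for finite n and the entropy function of Eq. 15 are easy to evaluate numerically. The sketch below uses illustrative function names that do not appear in the text, and it sets the small correction factor η to zero in the asymptotic limit.

```python
from math import comb, log2

def entropy(x):
    """Entropy function of Eq. 15, with the convention H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def max_codewords(n, t):
    """Largest number of codewords m allowed by the Hamming bound for an n-bit t-error correcting code."""
    return 2 ** n // sum(comb(n, i) for i in range(t + 1))

def asymptotic_rate_limit(t_over_n):
    """Large-n limit of the Hamming bound on k/n, with eta set to zero."""
    return 1 - entropy(t_over_n)
```

For example, `max_codewords(7, 1)` gives 16, the size of the [7,4,3] Hamming code, and `max_codewords(3, 1)` gives 2, the size of the code {000, 111}.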
The Hamming bound makes precise the intuitively obvious fact that error correcting codes cannot achieve arbitrarily high rate $k/n$ while still retaining their correction ability. As yet we have no definite guide as to whether good codes exist in general. A very useful result is the Gilbert-Varshamov bound: it can be shown [24] that for given $n, d$, there exists a linear $[n,k,d]$ code provided

$$2^k \sum_{i=0}^{d-2} C(n-1,i) < 2^n. \qquad (16)$$

In the limit of large $n, k, d$, and putting $t = d/2$, this becomes

$$\frac{k}{n} \geq 1 - H\!\left(\frac{2t}{n}\right)(1 - \eta) \qquad (17)$$

where $\eta \rightarrow 0$ as $n \rightarrow \infty$. It can be shown that there exists an infinite sequence of $[n,k,d]$ codes satisfying Eq. 17 with $d/n \geq \delta$ if $0 \leq \delta < 1/2$. The Gilbert-Varshamov bound necessarily lies below the Hamming bound, but it is an important result because it shows that error correction can be very powerful.

In the binary symmetric channel, for large $n$ the probability distribution of all the possible error vectors is strongly peaked around error vectors of weight close to the mean $np$ (see Eq. 10), where $p$ is the error probability per bit. This is an example of the law of large numbers. Therefore as long as $t > np(1 + \eta)$, where $\eta \ll 1$, error correction is almost certain to succeed in the limit $n \rightarrow \infty$. The Gilbert-Varshamov bound tells us that this can be achieved without the need for codes of vanishingly small rate.

Another result on coding limitations is Shannon's theorem, which states that codes exist whose average performance is close to that of the Hamming bound:

Shannon's theorem: If the rate $k/n$ is less than the channel capacity, and $n$ is sufficiently large, then there exists a binary code allowing transmission with arbitrarily low failure probability.

The failure probability is the probability that an uncorrectable error will occur; the capacity of the binary symmetric channel is $1 - H(p)$. Shannon's theorem is 'close to' the Hamming bound in the sense that we would expect a correction ability $t > np$ to be required to ensure success in a binary symmetric channel, which implies $k/n < 1 - H(p)$ in the Hamming bound; Shannon's theorem assures us that codes exist which get arbitrarily close to this. The codes whose existence is implied by Shannon's theorem are not necessarily convenient to use, however.
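The three quantitative statements of this subsection can be explored numerically. The sketch below is an illustration under stated assumptions, not a construction from the text: the function names are invented here, and the choice t = 1.5 np in the usage note is an arbitrary example of a correction ability slightly above the mean error weight.

```python
from math import comb, log2

def entropy(x):
    """Entropy function of Eq. 15, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def bsc_capacity(p):
    """Capacity 1 - H(p) of the binary symmetric channel."""
    return 1 - entropy(p)

def gilbert_varshamov_k(n, d):
    """Largest k for which the condition of Eq. 16 holds (requires d >= 2)."""
    if d < 2:
        raise ValueError("d must be at least 2")
    volume = sum(comb(n - 1, i) for i in range(d - 1))  # sum over i = 0 .. d-2
    k = 0
    while 2 ** (k + 1) * volume < 2 ** n:
        k += 1
    return k

def success_probability(n, t, p):
    """Probability that at most t of n bits are flipped (Eq. 10 summed over weights 0..t)."""
    return sum(comb(n, m) * p ** m * (1 - p) ** (n - m) for m in range(t + 1))
```

With p = 0.1 and t = 1.5 np, `success_probability(100, 15, 0.1)` is already above 0.9 and `success_probability(1000, 150, 0.1)` is much closer to one, in line with the law of large numbers argument. `gilbert_varshamov_k(7, 3)` returns 4, so for these parameters the Gilbert-Varshamov condition guarantees a code as large as the [7,4,3] Hamming code.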
4.4 Linear codes, error syndrome

The importance of linear codes is chiefly in their convenience, especially the speed with which any erroneous received word can be corrected. They are also highly significant when we come to generalise classical error correction to quantum error correction. So far the only error correction method we have mentioned is the simple idea of a look-up table, in which a received vector $w$ is compared with all the code vectors $u \in C$ and their erroneous versions $u + e$, $e \in S$, until a match $w = u + e$ is found, in which case the vector is corrected to $u$. This method makes inefficient use of either time or memory resources since there are $2^n$ vectors $u + e$. For linear codes, a great improvement is to calculate the error syndrome $s$ given by

$$s = H w^T \qquad (18)$$

where $H$ is the parity check matrix. Since $H$ is an $(n-k) \times n$ matrix, and $w$ is an $n$ bit row vector, the syndrome $s$ is an $n-k$ bit column vector. The transpose of $w$ is needed to allow the normal rules of matrix multiplication, though the notation $H \cdot w$ is sometimes also used. Consider

$$s = H(u + e)^T = Hu^T + He^T = He^T \qquad (19)$$
where we used Eq. 4 for the second step. This shows that the syndrome depends only on the error vector, not on the transmitted word. If we could deduce the error from the syndrome, which will be shown next, then we only have $2^{n-k}$ syndromes to look up, instead of $2^n$ erroneous words. Furthermore, many codes can be constructed in such a way that the error vector can be deduced from the syndrome by analytical methods, such as the solution of a set of simultaneous equations.

Proof that the error can be deduced from the syndrome. The parity check matrix consists of $n-k$ linearly independent parity check vectors. Each check vector divides Hamming space in half, into those vectors which satisfy the check, and those which do not. Hence, there are exactly $2^k$ vectors in Hamming space which have any given syndrome $s$. Using Eq. 19, these vectors must be the vectors $u + e_s$, where $u$ is one of the $2^k$ members of the code, and hence $e_s$ is a unique error vector associated with the syndrome $s$.

To conclude, let us consider an example using the [7,4,3] Hamming code given at the end of Sec. 3. Since this code has minimum distance 3, it is a single error correcting code. The number of code vectors is limited by the Hamming bound to $2^7/(C(7,0) + C(7,1)) = 2^7/(1+7) = 2^4$. Since there are indeed $2^4$ vectors the code saturates the bound; such codes are called perfect. The set of correctable errors is $\{0000000, 0000001, 0000010, 0000100, 0001000, 0010000, 0100000, 1000000\}$. Suppose the message is 0110011 and the error is 0100000. The received word is 0110011 + 0100000 = 0010011. The syndrome, using Eq. 8, is

$$H(0010011)^T = (010)^T \qquad (20)$$
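The syndrome calculation of Eq. 20 can be reproduced with a small look-up table. This is a minimal sketch: Eq. 8 is not reproduced in this part of the chapter, so the parity check matrix below is an assumed form, chosen to be consistent with the syndrome quoted in Eq. 20, and the function names are illustrative.

```python
# Assumed parity check matrix of the [7,4,3] Hamming code (one string per check).
H = ["0001111", "0110011", "1010101"]

def syndrome(word):
    """Return H w^T as a bit string, one bit per parity check."""
    return "".join(
        str(sum(int(h) & int(w) for h, w in zip(row, word)) % 2) for row in H
    )

def add(u, v):
    """Bitwise sum (XOR) of two bit strings."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(u, v))

# Look-up table from syndrome to correctable error: the zero vector and all weight-1 vectors.
ERRORS = ["0000000"] + ["0" * i + "1" + "0" * (6 - i) for i in range(7)]
TABLE = {syndrome(e): e for e in ERRORS}

def correct(word):
    """Deduce the error from the syndrome and remove it."""
    return add(word, TABLE[syndrome(word)])
```

The table has $2^{n-k} = 8$ entries, one per syndrome, rather than one entry per erroneous word. With the matrix assumed above, `syndrome("0010011")` evaluates to `"010"` and `correct("0010011")` returns `"0110011"`.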
The only word of weight $\leq 1$ which fails the second parity check and passes the others is 0100000, so this is the deduced error. We thus deduce the message to be 0010011 − 0100000 = 0110011, which is correct.

5 Basic principles of quantum error correction
The basic ingredients of the theory of quantum error correction are the quantum states we wish to consider, and the noise processes we wish to correct. There are many ways to present the theory. For example, a general noise process can be described either in terms of its effect on the density matrix of the system under consideration, or as a change in the joint state of a system and its environment, in which case a density matrix treatment is not required since we can always consider the total quantum state vector of system plus environment (sometimes called a purification of the system). Suppose a system q which is not initially entangled with its environment undergoes interaction with the environment. Three equivalent ways to treat such a process are as follows.

1st method:

$$\rho_i \rightarrow \rho_f = \sum_s A_s \rho_i A_s^{\dagger} \qquad (21)$$

where $\rho_i$ ($\rho_f$) is the initial (final) reduced density matrix of q. The same interaction can be described by more than one set of operators $A_s$; a useful choice of $A_s$ leads to $\sum_s A_s^{\dagger} A_s = I$.

2nd method:

$$|e\rangle |\phi\rangle \rightarrow \sum_s |e_s\rangle M_s |\phi\rangle \qquad (22)$$

where $|e\rangle$ ($|\phi\rangle$) is the initial state of the environment (the system), the final environment states $|e_s\rangle$ are not necessarily orthogonal or normalised, and the operators $M_s$ acting on the system are unitary. Tensor product symbols $\otimes$ are not written in order to keep the equations uncluttered.

3rd method:
where the operators Z and $E_s$, not necessarily unitary, act on the environment.

The first and second methods have been compared by Knill and Laflamme; the second and third only differ by a small change of emphasis. The second method (Eq. 22) will be adopted here.

The essential idea of quantum error correction is as follows. We assume that the environment cannot be controlled, in which case a process such as Eq. 22 is irreversible. There are two ways to tackle this problem. The first is to identify states of the system which are eigenstates of the interaction with the environment,

$$M_s |\phi\rangle = |\phi\rangle \qquad (24)$$

for all the operators $M_s$ in Eq. 22. This is not normally called quantum error correction, though it is closely related to it. The second method is quantum error correction. This consists in coupling the system q to another system called an ancilla a, prepared in some known state $|0\rangle_a$. The unitary interaction $A$ between q and a is carefully arranged to have the following property:

$$A\left(|0\rangle_a M_s |\phi\rangle\right) = |s\rangle_a M_s |\phi\rangle \qquad (25)$$

where the states $|s\rangle_a$ of the ancilla are orthonormal (but see later), for all $M_s$ appearing in Eq. 22, where this is possible, or else for the dominant terms in Eq. 22. This interaction is termed syndrome extraction; the syndrome $s$ will give us information about the noise. It is highly significant that the state of the ancilla $|s\rangle_a$ depends on the noise, but not on the quantum state $|\phi\rangle$ to be corrected. Applying $A$ to the noisy state on the right hand side of Eq. 22, we obtain

$$A \sum_s |e_s\rangle |0\rangle_a M_s |\phi\rangle = \sum_s |e_s\rangle |s\rangle_a M_s |\phi\rangle. \qquad (26)$$

Since the states $|s\rangle_a$ are orthonormal, we can measure the ancilla in the basis $\{|s\rangle_a\}$ and something rather wonderful happens: the whole state is projected onto

$$|e_s\rangle |s\rangle_a M_s |\phi\rangle \qquad (27)$$
for some particular value of $s$, and furthermore we know $s$ (the result of the measurement). By hypothesis, $s$ is in one to one correspondence with $M_s$, therefore we can deduce $M_s$ from $s$, and apply the corrective operation $M_s^{-1}$ to q, producing the final state

$$|e_s\rangle |s\rangle_a |\phi\rangle. \qquad (28)$$
The state of q has now been corrected, the state of the environment is immaterial, and we can re-prepare a in $|0\rangle_a$ for further use. Actually, the requirement on $s$ is slightly less restrictive than was just stated: rather than deducing $M_s$ exactly from $s$, it is sufficient that for each final ancilla state $|s\rangle_a$, an operation can be identified which corrects the system and disentangles it from the environment (cf. Eq. 31).

It is not strictly necessary to measure the ancilla, since after syndrome extraction one could arrange a further unitary interaction $C$ between q and a, defined by $C\left(|s\rangle_a |\phi\rangle\right) = |s\rangle_a M_s^{-1} |\phi\rangle$. The final state would then be $|\phi\rangle \sum_s |e_s\rangle |s\rangle_a$. The entanglement between q and the environment is transferred to an entanglement between a and the environment. However, measurement of the ancilla has practical advantages compared to the use of $C$ (or $CA$). The complete unitary operation

$$R\left(|\alpha\rangle M_s |\phi\rangle\right) = |\alpha_s\rangle |\phi\rangle \qquad (29)$$

is called recovery, where $|\alpha\rangle$ and $|\alpha_s\rangle$ are states of any relevant systems other than q, including the environment, an ancilla, and any measuring apparatus involved. For any given noise, it is only possible to recover a restricted set of states $|\phi\rangle$ of the system we wish to correct. The central task of the theory of quantum error correction is to identify such states and the respective recovery operator. The states $|\alpha_s\rangle$ on the right hand side of Eq. 29 must not depend on $|\phi\rangle$, in order that if two states $|\phi_1\rangle$ and $|\phi_2\rangle$ can both be recovered (by the same $R$), then so can any linear combination. With this property it is sufficient to find an orthonormal set of recoverable states in order to have a recoverable Hilbert space. This is a subspace of the total Hilbert space available to the system. The recovery operator is usually obtained by finding the syndrome extraction operator.
5.1 Conditions for quantum error correcting codes

A quantum error correcting code is an orthonormal set of states $\{|u\rangle\}$, called quantum codewords, for which recovery is possible after noise consisting of any combination (in the sense of Eq. 22) of error operators from the set $S = \{M_s\}$. The set $S$ is the set of correctable errors; it includes the identity operator. The requirements for quantum codewords which correspond to the requirement of Eq. 9 for classical codes are [20], for $\langle u|v\rangle = 0$,

$$\langle u| M_i^{\dagger} M_j |v\rangle = 0, \qquad (30)$$

$$\langle u| M_i^{\dagger} M_j |u\rangle = \langle v| M_i^{\dagger} M_j |v\rangle. \qquad (31)$$

The first condition says that to be correctable, an error acting on one codeword must not produce a state which overlaps with another codeword, or with an erroneous version of another codeword. This is what we would expect intuitively. Proof:

$$R|\alpha\rangle M_i |u\rangle = |\alpha_i\rangle |u\rangle, \qquad R|\alpha\rangle M_j |v\rangle = |\alpha_j\rangle |v\rangle$$

$$\Rightarrow\; \langle u| M_i^{\dagger} \langle\alpha| R^{\dagger} R |\alpha\rangle M_j |v\rangle = \langle\alpha_i|\alpha_j\rangle \langle u|v\rangle = 0$$

$$\Rightarrow\; \langle u| M_i^{\dagger} M_j |v\rangle = 0.$$
The second condition, Eq. 31, is surprising since it permits one erroneous version of a codeword $|u\rangle$ to overlap with another, as long as all the other codewords when subject to the same error have the same overlap with their respective other erroneous version. This is not possible in classical error correction because of Eq. 11. Proof of Eq. 31:

$$R|\alpha\rangle M_i |u\rangle = |\alpha_i\rangle |u\rangle$$

$$\Rightarrow\; \langle u| M_i^{\dagger} \langle\alpha| R^{\dagger} R |\alpha\rangle M_j |u\rangle = \langle\alpha_i|\alpha_j\rangle$$

$$\Rightarrow\; \langle u| M_i^{\dagger} M_j |u\rangle = \langle\alpha_i|\alpha_j\rangle.$$

The same result is obtained starting from $|v\rangle$, from which Eq. 31 is derived. Note that this means the syndrome $s$ need not be in one-to-one correspondence with the error $M_s$ in Eq. 26.

We have shown that Eqs. 30 and 31 are necessary to allow recovery, Eq. 29. By reversing the derivations it can be shown that they are also sufficient [20]. Equation 31 is satisfied if

$$\langle u| M_i^{\dagger} M_j |u\rangle = 0 \qquad (32)$$

for all the codewords. Codes which satisfy this more restrictive condition will be termed orthogonal. They are easier to manipulate, and we will be concerned mostly with such codes. N.B. Orthogonal codes are sometimes termed nondegenerate.
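For a small code the conditions of Eqs. 30 to 32 can be checked by direct enumeration. The sketch below is an illustration under stated assumptions: it takes the three-qubit code with codewords |000⟩ and |111⟩ and bit-flip errors only, represents a state as a dictionary from basis-state bit strings to amplitudes, and reads Eq. 32 as applying to pairs of distinct error operators. All function names are invented for this example.

```python
def apply_x(e, state):
    """Apply the bit-flip operator X_e to a state given as {bitstring: amplitude}."""
    return {
        "".join(str(int(a) ^ int(b)) for a, b in zip(bits, e)): amp
        for bits, amp in state.items()
    }

def inner(a, b):
    """Inner product <a|b> of two states."""
    return sum(amp.conjugate() * b.get(bits, 0) for bits, amp in a.items())

CODEWORDS = [{"000": 1.0}, {"111": 1.0}]

def overlap(ei, ej, u, v):
    """<u| M_i^dagger M_j |v> for M_i = X_ei and M_j = X_ej."""
    return inner(apply_x(ei, u), apply_x(ej, v))

def satisfies_eq30(errors):
    """Erroneous versions of different codewords never overlap."""
    return all(overlap(ei, ej, u, v) == 0
               for ei in errors for ej in errors
               for u in CODEWORDS for v in CODEWORDS if u is not v)

def satisfies_eq31(errors):
    """The overlap of two erroneous versions is the same for every codeword."""
    return all(len({overlap(ei, ej, u, u) for u in CODEWORDS}) == 1
               for ei in errors for ej in errors)

def is_orthogonal(errors):
    """Eq. 32, read here as a condition on distinct error operators."""
    return all(overlap(ei, ej, u, u) == 0
               for ei in errors for ej in errors if ei != ej
               for u in CODEWORDS)
```

With the error set `["000", "100", "010", "001"]` all three functions return `True`. Adding the weight-2 error `"110"` makes `satisfies_eq30` fail, because flipping two bits of |000⟩ and one bit of |111⟩ can give the same state.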
5.2 Qubit errors

The basic principles of quantum error correction have so far been presented without reference to qubits, with the exception of the introductory Sec. 2. This is to emphasise that quantum error correction need not necessarily be considered in terms of qubits. However, to find general code constructions, and to take advantage of classical error correction theory, a treatment in terms of qubits is by far the best choice. From now on we will consider systems such as quantum computers or quantum communication channels which consist of a set of qubits. Obviously any quantum system can be mathematically treated as a set of qubits, but we wish to consider noise processes which affect different qubits independently, and this is only realistic for systems in which the qubits are physically separated, though they may still interact with one another.

Any interaction between a single qubit and its environment can be written

$$|e\rangle \left(a|0\rangle + b|1\rangle\right) \rightarrow a\left(c_{00}|e_{00}\rangle|0\rangle + c_{01}|e_{01}\rangle|1\rangle\right) + b\left(c_{10}|e_{10}\rangle|1\rangle + c_{11}|e_{11}\rangle|0\rangle\right) \qquad (33)$$

where $|e_{..}\rangle$ denotes states of the environment (not necessarily orthogonal) and $c_{..}$ are coefficients depending on the noise. A significant insight is that this general interaction can be written

$$|e\rangle|\phi\rangle \rightarrow \left(|e_I\rangle I + |e_X\rangle X + |e_Y\rangle Y + |e_Z\rangle Z\right)|\phi\rangle \qquad (34)$$

where $|\phi\rangle = a|0\rangle + b|1\rangle$ is the initial state of the qubit, and $|e_I\rangle = c_{00}|e_{00}\rangle + c_{10}|e_{10}\rangle$, $|e_X\rangle = c_{01}|e_{01}\rangle + c_{11}|e_{11}\rangle$, and so on. The operators $I, X, Y, Z$ are a set of single-qubit error operators. In the basis $\{|0\rangle, |1\rangle\}$ they are

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (35)$$
$I$ is the identity, $X$ is the Pauli $\sigma_x$ operator, $Z$ is the Pauli $\sigma_z$ operator, and $Y = XZ = -i\sigma_y$. The general problem of error correction of qubits is thus reduced [36] to the problem of correcting 'bit flip' errors ($X$), or 'phase' errors ($Z$) or both ($Y$).

Equations 22 and 34 are similar, and indeed a general noise process for a set of many qubits can be written in the form Eq. 22 such that each error operator $M_s$ is a tensor product of single-qubit operators taken from the set $\{I, X, Y, Z\}$. For example, an error operator for a system of 5 qubits is

where the subscripts indicate which qubit the $\{I, X, Y, Z\}$ operator acts on. The following notation will also be useful:

$$M = X_x Z_z \qquad (37)$$

where $x$ and $z$ are binary vectors which indicate where the $X$ and $Z$ operators appear in $M$. For example

The weight of a classical error vector was defined to be the number of non-zero elements, which indicates how many bits are flipped by such an error. Similarly, we define the weight of a quantum error operator of the general form illustrated in Eq. 36 to be the number of elements in the product which are not equal to $I$. If $M = X_x Z_z$ then the weight of $M$ is the weight of the binary vector obtained from the bitwise OR of $x$ and $z$.

When noise acts on different qubits independently, the most important terms in Eq. 22 are those where the error $M_s$ has the smallest weight. Therefore the quantum codes suitable for correcting independent noise are those which correct all errors of weight up to a certain maximum. This is the equivalent of minimum distance coding in classical error correction.
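The binary-vector notation of Eq. 37 and the definition of weight can be sketched as follows. The operator string `"IXIYZ"` used in the usage note is a hypothetical example chosen for illustration; it is not necessarily the example given in the original equations, and the function names are likewise invented here.

```python
def to_xz(pauli):
    """Convert a string such as 'IXIYZ' to the binary vectors (x, z), using Y = XZ."""
    x = "".join("1" if p in "XY" else "0" for p in pauli)
    z = "".join("1" if p in "ZY" else "0" for p in pauli)
    return x, z

def weight(x, z):
    """Weight of X_x Z_z: number of ones in the bitwise OR of x and z."""
    return sum(a == "1" or b == "1" for a, b in zip(x, z))
```

For the hypothetical operator with $X$ on qubit 2, $Y$ on qubit 4 and $Z$ on qubit 5, `to_xz("IXIYZ")` returns `("01010", "00011")`, and the weight is 3 because three of the five factors differ from the identity.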
5.3 Quantum Hamming bound
A $t$-error correcting quantum code is defined to be a code for which all errors of weight less than or equal to $t$ are correctable. Since there are 3 possible single-qubit errors, the number of error operators of weight $t$ acting on $n$ qubits is $3^t C(n,t)$. Therefore a $t$-error correcting code must be able to correct $\sum_{i=0}^{t} 3^i C(n,i)$ error operators. For orthogonal codes (i.e. those satisfying Eq. 32 as well as Eqs. 30 and 31), every codeword $|u\rangle$ and all its erroneous versions $M|u\rangle$ must be orthogonal to every other codeword and all its erroneous versions. All these orthogonal states can only fit into the $2^n$-dimensional Hilbert space of $n$ qubits if

$$m \sum_{i=0}^{t} 3^i C(n,i) \leq 2^n. \qquad (39)$$

This bound is known as the quantum Hamming bound. For $m = 2^k$ and large $n, t$ it becomes

$$\frac{k}{n} \leq 1 - \frac{t}{n} \log_2 3 - H(t/n). \qquad (40)$$
Figure 2: Bounds on code rates. The curves show the code rate $k/n$ as a function of $t/n$ for $t$-error correcting codes, in the limit $n \rightarrow \infty$. Full curve: quantum Hamming bound (Eq. 40); this is an upper limit for orthogonal codes, in which every correctable error is associated with an orthogonal syndrome state. Dashed curve: Gilbert-Varshamov type bound (Eq. 56) for CSS codes: codes exist with at least this rate. Dotted curve: Hamming bound for CSS codes, $k/n \leq 1 - 2H(t/n)$.
This result is shown in Fig. 2. The rate $k/n$ falls to zero at $t/n \simeq 0.18929$. What is the smallest single-error correcting orthogonal quantum code? A code with $m = 2$ codewords represents a Hilbert space of one qubit (it 'encodes' a single qubit). Putting $m = 2$ and $t = 1$ in the quantum Hamming bound, we have $1 + 3n \leq 2^{n-1}$ which is saturated by $n = 5$, and indeed a 5-qubit code exists (see Sec. 6.4).
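Both numerical statements in this paragraph can be checked directly from the quantum Hamming bound. The following sketch uses illustrative function names and a simple bisection; it assumes the asymptotic form of Eq. 40 and the entropy function of Eq. 15.

```python
from math import comb, log2

def entropy(x):
    """Entropy function of Eq. 15, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def quantum_hamming_ok(n, k, t):
    """True if m = 2^k codewords on n qubits satisfy the quantum Hamming bound."""
    return 2 ** k * sum(3 ** i * comb(n, i) for i in range(t + 1)) <= 2 ** n

def smallest_n(k, t):
    """Smallest number of qubits not excluded by the bound."""
    n = k
    while not quantum_hamming_ok(n, k, t):
        n += 1
    return n

def asymptotic_rate(x):
    """Right hand side of Eq. 40 as a function of x = t/n."""
    return 1 - x * log2(3) - entropy(x)

def zero_rate_point(lo=0.01, hi=0.5, steps=60):
    """Bisection for the value of t/n at which the rate of Eq. 40 reaches zero."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if asymptotic_rate(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

`smallest_n(1, 1)` returns 5, and `zero_rate_point()` agrees with the quoted value 0.18929 to the precision given in the text. The bisection relies on the rate decreasing monotonically over the search interval, which holds for $t/n < 3/4$.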
6 Code construction and syndrome extraction
The quantum Hamming bound, like the classical Hamming bound, tells us what cannot be done, but not what can be done. To establish the possibilities of quantum error correction, we need to prove the existence of powerful codes, and, preferably, to present specific examples. Also, we must be explicit about the recovery procedure.

6.1 Quasi-classical codes
Suppose first of all that the noise only includes error operators composed of tensor products of $I$ and $X$, so $M_s = X_e$. This mimics the errors occurring in the binary vector spaces of classical error correction. A suitable quantum error correcting code in this situation is the set of codewords $\{|u \in C\rangle\}$ where $C$ is a classical $t$-error correcting code, and we use the standard notation $|00101\rangle \equiv |0\rangle|0\rangle|1\rangle|0\rangle|1\rangle$. It is easy to see that the conditions of Eqs. 30 and 32 are satisfied for all error operators of weight $\leq t$, since $X_e|u\rangle = |u + e\rangle$, in which $e$ and $u + e$ are binary vectors. The recovery in this case was illustrated in Sec. 2.

Suppose $C$ is an $[n,k,d]$ linear code. We introduce an ancilla of $n-k$ qubits, prepared in $|0\rangle$. To evaluate each classical parity check we perform a sequence of controlled-NOT (= XOR) operations from qubits in q to a single qubit in a. The single qubit in a is the target, and the qubits in q which act as control bits are those specified by the 1's in the $n$-bit parity check vector. The $n-k$ parity checks specified by the classical parity check matrix $H$ are thus evaluated on the $n-k$ qubits in a. The notation $\mathrm{XOR}^{(H)}_{q,a}$ will be used for this unitary interaction between q and a; an example is shown in Fig. 3. The syndrome extraction operation is

$$\mathrm{XOR}^{(H)}_{q,a}\left(|0\rangle_a |u + e\rangle\right) = |He^T\rangle_a |u + e\rangle \qquad (41)$$

where we used Eq. 19. The fact that the classical syndrome is independent of the message is now highly significant, for if the initial state of q is any superposition $\sum_{u \in C} a_u |u\rangle$, the syndrome extraction does not leave q and a entangled:

$$\mathrm{XOR}^{(H)}_{q,a}\left(|0\rangle_a \sum_{u \in C} a_u |u + e\rangle\right) = |He^T\rangle_a \sum_{u \in C} a_u |u + e\rangle. \qquad (42)$$

Figure 3: Syndrome extraction operation for [[7,1,3]] CSS code. A control with several NOTs represents several controlled-NOT operations with the same control qubit. Each of the three qubits of the ancilla begins in $|0\rangle$ and finishes in $|0\rangle$ or $|1\rangle$ according as the relevant parity check in $H$ (Eq. 8) is or is not satisfied. A further ancilla of three more qubits is required to complete the syndrome, using a similar network together with $R$ operations, see Eq. 49.
Recovery is completed by measuring a, deducing $e$ from $He^T$, and applying $X_e^{-1} = X_e$ to q.

The noise process just considered is quite unusual in practice. However, it is closely related to a common type of noise, namely phase decoherence. Phase decoherence takes the form of Eq. 22 with error operators consisting of tensor products of $I$ and $Z$, so $M_s = Z_e$. The quantum error correcting code for this case is now simple to find because $Z = RXR$, where $R$ is the Hadamard or basis change operation

$$R = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \qquad (43)$$

The letter $H$ is often used for this operator, but $R$ is adopted here to avoid confusion with the parity check matrix. We will use $R = R_1 R_2 \cdots R_n$ to denote Hadamard rotation of all the $n$ qubits in q.
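The single-qubit identities used here can be verified with explicit 2×2 matrices. The sketch below is a plain-Python check with an illustrative helper `matmul`; the matrices are those given in the text for $I$, $X$, $Z$ and $R$, with $Y$ formed as the product $XZ$.

```python
from math import sqrt

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
Y = matmul(X, Z)  # Y = XZ
R = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]

RXR = matmul(R, matmul(X, R))
```

`RXR` agrees with `Z` and `matmul(R, R)` agrees with `I` up to rounding, so a phase error in the computational basis is a bit flip in the Hadamard-rotated basis, which is the fact exploited in the rest of this subsection.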
The quantum codewords in the case of phase noise are $|c_u\rangle = R|u \in C\rangle$ where $C$ is a classical error correcting code. An ancilla is prepared in the state $|0\rangle_a$, and the syndrome extraction operation is $R\,\mathrm{XOR}^{(H)}_{q,a} R$ where $H$ is the parity check matrix of $C$. This can be understood as exactly the same code and extraction as the previous example, only now all operations are carried out in the basis $\{R|0\rangle, R|1\rangle\}$ instead of $\{|0\rangle, |1\rangle\}$. A formal proof that the syndrome extraction works is as follows. An error acting on a codeword produces

$$Z_e\left(R|u\rangle\right) = R X_e |u\rangle = R|u + e\rangle. \qquad (44)$$

Now introduce the ancilla and perform syndrome extraction:

$$R\,\mathrm{XOR}^{(H)}_{q,a} R \left(|0\rangle_a R|u + e\rangle\right) = R\,\mathrm{XOR}^{(H)}_{q,a}\left(|0\rangle_a |u + e\rangle\right) \qquad (45)$$

$$= |He^T\rangle_a R|u + e\rangle \qquad (46)$$

where we use the fact that $R$ does not operate on a. The error vector $e$ is deduced from $He^T$, and the corrective operation $Z_e$ is applied, returning q to $R|u\rangle$.

The simplest example of a phase error correcting code is a single error correcting code using three qubits. The two quantum codewords are

$$R|000\rangle = |000\rangle + |001\rangle + |010\rangle + |011\rangle + |100\rangle + |101\rangle + |110\rangle + |111\rangle,$$

$$R|111\rangle = |000\rangle - |001\rangle - |010\rangle + |011\rangle - |100\rangle + |101\rangle + |110\rangle - |111\rangle,$$

where the normalisation factor $1/\sqrt{8}$ has been omitted. An equivalent code is one using codewords $R(|000\rangle \pm |111\rangle)$, see Steane [36] for a thorough analysis.
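The three-qubit phase code can be simulated with a few lines of state-vector bookkeeping. This is a minimal sketch under stated assumptions: the parity checks 110 and 011 are one possible choice of check matrix for C = {000, 111}, the amplitudes 0.6 and 0.8 in the usage note are arbitrary, only a single error operator is applied (so the ancilla ends in a definite state), and all names are illustrative. A state is a dictionary from (q, a) bit-string pairs to amplitudes.

```python
from itertools import product
from math import sqrt

N = 3
CHECKS = ["110", "011"]  # assumed parity check matrix H of C = {000, 111}
BASIS = ["".join(b) for b in product("01", repeat=N)]

def xor_bits(u, v):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(u, v))

def dot(u, v):
    """Inner product of two bit strings modulo 2."""
    return sum(int(a) & int(b) for a, b in zip(u, v)) % 2

def syndrome_of(bits):
    return "".join(str(dot(row, bits)) for row in CHECKS)

def hadamard_q(state):
    """Apply R = R_1 R_2 ... R_n to register q; the ancilla is untouched."""
    out = {}
    for (q, a), amp in state.items():
        for y in BASIS:
            out[(y, a)] = out.get((y, a), 0.0) + amp * (-1) ** dot(q, y) / sqrt(2 ** N)
    return {key: amp for key, amp in out.items() if abs(amp) > 1e-12}

def xor_extract(state):
    """XOR^(H): controlled-NOTs from q that add H q^T into the ancilla."""
    return {(q, xor_bits(a, syndrome_of(q))): amp for (q, a), amp in state.items()}

def apply_z(e, state):
    """Z_e multiplies the basis state |q> by (-1)^(e.q)."""
    return {(q, a): amp * (-1) ** dot(e, q) for (q, a), amp in state.items()}

def encode(alpha, beta):
    """alpha R|000> + beta R|111>, with the ancilla in |00>."""
    return hadamard_q({("000", "00"): alpha, ("111", "00"): beta})

ERRORS = ["000", "100", "010", "001"]
TABLE = {syndrome_of(e): e for e in ERRORS}

def correct(state):
    """Extract the syndrome with R XOR^(H) R, read the ancilla, undo the error."""
    extracted = hadamard_q(xor_extract(hadamard_q(state)))
    (s,) = {a for (_, a) in extracted}  # single error: the ancilla is in a definite state
    fixed = apply_z(TABLE[s], extracted)
    return s, {(q, "00"): amp for (q, _), amp in fixed.items()}  # ancilla re-prepared
```

For example, `correct(apply_z("010", encode(0.6, 0.8)))` returns the syndrome of the error 010 together with the original encoded state, whatever the amplitudes. `encode(0.0, 1.0)` reproduces the sign pattern of $R|111\rangle$ given above, each amplitude being $(-1)^{\mathrm{wt}(y)}/\sqrt{8}$.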
6.2 CSS codes

We now turn to quite general types of noise, where the error operators include $X$, $Y$ and $Z$ terms. The code construction and correction method discovered by Calderbank, Shor and Steane [35,36] works by separately correcting the $X$ and $Z$ errors contained in a general error operator $M_s = X_x Z_z$. The key to the code construction is the "dual code theorem" [35]:

$$R \sum_{i \in C} |i\rangle = \sum_{i \in C^{\perp}} |i\rangle. \qquad (47)$$

The normalisation has been omitted from this expression in order to make apparent the significant features; normalisation factors will be dropped hereafter since they do not affect the argument. The content of Eq. 47 is that if we form a state by superposing all the members of a linear classical code, then the Hadamard transformed state is a superposition of all the members of the dual code.

We can correct both $X$ and $Z$ errors by using states like those in Eq. 47, as long as both $C$ and $C^{\perp}$ have good error correction abilities. The codewords of a Calderbank, Shor and Steane (CSS) code are

$$|c_u\rangle = \sum_{i \in C^{\perp}} |i + u\rangle \qquad (48)$$

where $u \notin C^{\perp}$, $u \in C$ and $C^{\perp} \subset C$. If $C = [n,k,d]$ then $C^{\perp} = [n, n-k, d^{\perp}]$. The number of linearly independent $u$ which generate new states is therefore $k - (n-k) = 2k - n$. We can construct $2^{2k-n}$ orthonormal quantum codewords, which represents a Hilbert space large enough to store $2k - n$ qubits. We will show that the resulting code can correct all errors of weight less than $d/2$. The parameters of the quantum code are thus $[[n, 2k-n, d]]$.

To correct a CSS code obtained from a classical code $C = [n,k,d]$, $C^{\perp} \subset C$, we introduce two ancillas a(x) and a(z), each of $n-k$ qubits, prepared in the state $|0\rangle$. The syndrome extraction operation is

$$\left(R\,\mathrm{XOR}^{(H)}_{q,a(z)} R\right) \mathrm{XOR}^{(H)}_{q,a(x)} \qquad (49)$$
where $H$ is the check matrix of $C$. The proof that this works correctly for all errors $M_s = X_x Z_z$ of weight less than $d/2$ is left as an exercise for the reader. It is straightforward through use of the relations

$$X_x Z_z = (-1)^{x \cdot z} Z_z X_x \qquad (50)$$

$$\mathrm{XOR}^{(H)}_{q,a} Z_z = Z_z\, \mathrm{XOR}^{(H)}_{q,a} \qquad (51)$$

$$R X_s = Z_s R \qquad (52)$$

$$R \sum_{i \in C^{\perp}} |i + u\rangle = \sum_{i \in C} (-1)^{i \cdot u} |i\rangle \qquad (53)$$

where the latter follows from Eqs. 47 and 52.

The simplest CSS code is obtained from the [7,4,3] Hamming code given at the end of Sec. 3. This is single-error correcting and contains its dual, and therefore leads to a single-error correcting quantum code of parameters [[7,1,3]]. The two codewords are

$$|c_0\rangle = |0000000\rangle + |1010101\rangle + |0110011\rangle + |1100110\rangle + |0001111\rangle + |1011010\rangle + |0111100\rangle + |1101001\rangle, \qquad (54)$$

$$|c_1\rangle = X_{1111111} |c_0\rangle. \qquad (55)$$

The syndrome extraction operation for this code is illustrated in Fig. 3.
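The ingredients of the [[7,1,3]] construction can be checked by enumeration. In the sketch below the parity check matrix is an assumed form of Eq. 8 (which is not reproduced in this part of the chapter), chosen so that its row space matches the eight basis states of the codeword $|c_0\rangle$; the function names are illustrative, and normalisation is ignored as in the text.

```python
from itertools import product

H = ["0001111", "0110011", "1010101"]  # assumed parity check matrix of the [7,4,3] code
ALL = ["".join(b) for b in product("01", repeat=7)]

def add(u, v):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(u, v))

def dot(u, v):
    return sum(int(a) & int(b) for a, b in zip(u, v)) % 2

def span(rows):
    """All linear combinations (mod 2) of the given rows."""
    vectors = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        v = "0" * len(rows[0])
        for c, row in zip(coeffs, rows):
            if c:
                v = add(v, row)
        vectors.add(v)
    return vectors

C = {w for w in ALL if all(dot(h, w) == 0 for h in H)}  # the classical code
C_dual = span(H)                                        # its dual, generated by H

def hadamard_support(code):
    """Basis states with non-zero amplitude in R applied to the sum of all |i>, i in code."""
    return {y for y in ALL if sum((-1) ** dot(i, y) for i in code) != 0}

c0 = sorted(C_dual)                             # basis states appearing in |c_0>
c1 = sorted(add(w, "1111111") for w in C_dual)  # basis states appearing in |c_1>
```

The sets `C` and `C_dual` have 16 and 8 members, `C_dual` is contained in `C`, and `hadamard_support(C)` equals `C_dual`, which is the content of the dual code theorem for this example. The lists `c0` and `c1` have no basis state in common, so the two codewords are orthogonal.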
6.3 Good quantum codes exist

The CSS construction allows a Gilbert-Varshamov bound to be obtained for quantum codes. It can be shown [5] that there exists an infinite sequence of classical codes which contain their dual and satisfy the Gilbert-Varshamov bound, Eq. 16. Therefore there exists an infinite sequence of quantum $[[n, K, d]]$ CSS codes provided Eq. 16 is satisfied with $k = (K + n)/2$. In the limit of large $n, k, t$ this becomes [5,36]

$$\frac{K}{n} \geq 1 - 2H(2t/n) \qquad (56)$$

where the usual factor $(1 - \eta)$ has been suppressed. The bound of Eq. 56 is indicated on Fig. 2.
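To get a feeling for Eq. 56, one can evaluate the guaranteed rate and locate the point at which it falls to zero. The sketch below uses illustrative function names and a simple bisection; it relies on the entropy function of Eq. 15.

```python
from math import log2

def entropy(x):
    """Entropy function of Eq. 15, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def css_gv_rate(x):
    """Right hand side of Eq. 56 as a function of x = t/n."""
    return 1 - 2 * entropy(2 * x)

def css_zero_rate_point(lo=1e-6, hi=0.25, steps=60):
    """Bisection for the value of t/n at which the rate of Eq. 56 reaches zero."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if css_gv_rate(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

The guaranteed rate stays positive up to $t/n \approx 0.055$, so CSS codes of non-vanishing rate exist that correct a fixed fraction of qubit errors. This lies well below the point $t/n \simeq 0.18929$ at which the quantum Hamming bound reaches zero.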
6.4 Stabilizer

A more general theory of quantum code construction can be built around the concept of the stabilizer, introduced by Gottesman [13] and Calderbank et al. Notice that if $u \cdot v = 0$ then $Z_v|u\rangle = |u\rangle$. That is, if $u$ satisfies the parity check $v$ then $|u\rangle$ is an eigenstate of the operator $Z_v$. Also, $R|u\rangle$ will be an eigenstate of the operator $X_v$. Another way to specify a CSS code is therefore to describe it as a set of $2^K$ orthonormal states which are simultaneous eigenstates of $n - K$ linearly independent error operators. These are the $n - k$ operators $Z_{v \in H}$ and the $n - k$ operators $X_{v \in H}$, where $H$ is the parity check matrix, with $K = 2k - n$. This set of $n - K$ linearly independent operators generates a group $\mathcal{H}$ of $2^{n-K}$ different operators which together form the stabilizer of the quantum code. The word stabilizer can refer either to $\mathcal{H}$ or to any set of $n - K$ operators which generates $\mathcal{H}$; we will use it mostly in the latter sense. Let us introduce the shorthand [6] $X_x Z_z \equiv (x|z)$, then the stabilizer of a CSS code can be written

$$\mathcal{H} = \left(\begin{array}{c|c} H & 0 \\ 0 & H \end{array}\right) \qquad (57)$$

where $H$ is the $(n-k) \times n$ parity check matrix of the underlying classical code. We can immediately generalise this to more general stabilizers

$$\mathcal{H} = (H_x | H_z) \qquad (58)$$

a In fact, Steane [36,37] gave a slightly more general construction of the form
where $H_x$ and $H_z$ are $(n-K) \times n$ binary matrices. The syndrome extraction operation for such a stabilizer code is

where we use an $n - K$ bit ancilla prepared in $|0\rangle_a$. A generator $\mathcal{G} = (G_x|G_z)$ having $n + K$ rows is related to $\mathcal{H}$ by

$$H_x G_z^T + H_z G_x^T = 0. \qquad (60)$$

This means that $\mathcal{H}$ can be obtained from $\mathcal{G}$ by swapping the $X$ and $Z$ parts, and extracting the dual of the resulting $(n + K) \times 2n$ binary matrix. The quantum code is $t$-error correcting if $\mathcal{G}$ generates operators of minimum weight $> 2t$. Calderbank et al. [6] show that in order for $\mathcal{H}$ to be the stabilizer of a quantum error correcting code, it must satisfy the special condition

$$H_x H_z^T + H_z H_x^T = 0. \qquad (61)$$
Comparing this with Eqs. 60 and 57, it is seen that this is the generalisation of the condition $C^{\perp} \subset C$ for CSS codes.

The problem of quantum code construction is greatly simplified by the framework sketched above. Further properties of the stabilizer $\mathcal{H}$ are discussed elsewhere. Quantum networks to encode and decode $K$ qubits in and out of $n$-qubit codewords can be easily derived.

Let us conclude by examining two example codes. A [[5,1,3]] code is given by [3,22,6]

$$\mathcal{H} = \left(\begin{array}{c|c} 11000 & 00101 \\ 01100 & 10010 \\ 00110 & 01001 \\ 00011 & 10100 \end{array}\right).$$

The two codewords are

$$|c_0\rangle = |00000\rangle + |11000\rangle + |01100\rangle + |00110\rangle + |00011\rangle + |10001\rangle - |10100\rangle - |01010\rangle - |00101\rangle - |10010\rangle - |01001\rangle - |11110\rangle - |01111\rangle - |10111\rangle - |11011\rangle - |11101\rangle$$

and $|c_1\rangle = X_{11111} |c_0\rangle$.
It is simple to check that the codewords are eigenstates of the stabilizer. To show that this is a single error correcting code, one can either [22] examine the syndrome extracted for every error vector of weight $\leq 1$, or obtain $\mathcal{G}$ from Eq. 60 and confirm that it generates operators of minimum weight 3.

A [[8,3,3]] code is given by [13,6,37,38]

$$\mathcal{H} = \left(\begin{array}{c|c} 11111111 & 00000000 \\ 00000000 & 11111111 \\ 00001111 & 00110011 \\ 00110011 & 01010101 \\ 01010101 & 00111100 \end{array}\right).$$

The generator is

$$\mathcal{G} = \left(\begin{array}{c|c} 11111111 & 00000000 \\ 00001111 & 00000000 \\ 00110011 & 00000000 \\ 01010101 & 00000000 \\ 00000000 & 11111111 \\ 00000000 & 00001111 \\ 00000000 & 00110011 \\ 00000000 & 01010101 \\ 00000011 & 00000101 \\ 00000101 & 00010001 \\ 00010001 & 00000110 \end{array}\right).$$

Each codeword can be written as a superposition of 16 states; further details are given by Gottesman [13] and Steane. This code is the first of an infinite set of quantum codes based on classical Reed-Muller codes.
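The stated properties of the two example stabilizers can be tested mechanically in the $(x|z)$ notation. This is a minimal sketch with illustrative names: two operators commute when $x_a \cdot z_b + z_a \cdot x_b$ is even, the syndrome of an error is the list of generators with which it anticommutes, and the amplitudes of $|c_0\rangle$ for the five-qubit code are entered as a reconstruction that the eigenstate check itself confirms.

```python
H5 = [("11000", "00101"), ("01100", "10010"), ("00110", "01001"), ("00011", "10100")]
H8 = [("11111111", "00000000"), ("00000000", "11111111"), ("00001111", "00110011"),
      ("00110011", "01010101"), ("01010101", "00111100")]

def dot(u, v):
    return sum(int(a) & int(b) for a, b in zip(u, v)) % 2

def xor_bits(u, v):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(u, v))

def anticommute(a, b):
    """1 if the operators (ax|az) and (bx|bz) anticommute, 0 if they commute."""
    return (dot(a[0], b[1]) + dot(a[1], b[0])) % 2

def satisfies_eq61(stabilizer):
    """All generators commute with each other, which is the content of Eq. 61."""
    return all(anticommute(g, h) == 0 for g in stabilizer for h in stabilizer)

def single_qubit_errors(n):
    zero = "0" * n
    for j in range(n):
        unit = "0" * j + "1" + "0" * (n - j - 1)
        yield from ((unit, zero), (zero, unit), (unit, unit))  # X_j, Z_j, Y_j

def syndrome(stabilizer, error):
    return tuple(anticommute(g, error) for g in stabilizer)

def distinguishes_single_errors(stabilizer, n):
    """Every error of weight 1 has a non-zero syndrome different from all the others."""
    syndromes = [syndrome(stabilizer, e) for e in single_qubit_errors(n)]
    return len(set(syndromes)) == len(syndromes) and all(any(s) for s in syndromes)

def apply(op, state):
    """Apply X_x Z_z (Z acting first) to a state given as {bitstring: amplitude}."""
    x, z = op
    return {xor_bits(q, x): amp * (-1) ** dot(z, q) for q, amp in state.items()}

# Unnormalised amplitudes of |c_0> for the [[5,1,3]] code (reconstructed signs).
PLUS = ["00000", "11000", "01100", "00110", "00011", "10001"]
MINUS = ["10100", "01010", "00101", "10010", "01001",
         "11110", "01111", "10111", "11011", "11101"]
C0 = {q: 1 for q in PLUS}
C0.update({q: -1 for q in MINUS})
```

Both `H5` and `H8` pass `satisfies_eq61`, and both give distinct non-zero syndromes for all single-qubit errors (15 errors for five qubits, 24 for eight). Applying each generator in `H5` to `C0` returns `C0` unchanged, so $|c_0\rangle$ is a simultaneous eigenstate with eigenvalue +1. The generator matrix $\mathcal{G}$ is not tested here.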
7 Further development
This article has provided an introduction t o quantum error correction, but this is a large subject and much material has of necessity been ommitted. This concluding section will give a guide t o other work in this area, as well as avenues for future development. The relationship between quantum error correction and the capacity of the quantum channel is a n important subject which has only been briefly touched on here. The question is not yet fully settled, though much progress has been made!,36>20>3>23>29,1 In classical error correction, a t-error correcting code enables information t o be communicated through a binary symmetric
channel with overall failure probability $O(p^{t+1})$, where the prefactor may be large ($\sim C(n, t+1)$). If $\epsilon$ is a parameter indicating the noise level per qubit in a quantum channel, then a t-error correcting quantum code enables the transmitted state to be recovered with fidelity $1 - O(\epsilon^{t+1})$. However, $\epsilon$ may be either a quantum amplitude or a probability, depending on exactly what is assumed about the noise. The relationship between quantum error correction and entanglement purification protocols has been examined in detail by Bennett et al. These authors describe a method of random parity checking, termed by them 'hashing', which realises the quantum Hamming bound in the limit of large n. It is known that the quantum Hamming bound can in principle be exceeded through the use of non-orthogonal coding methods which make use of Eq. 31. A small improvement on the quantum Hamming bound has been found through the use of a variation on hashing [34], and recently a quantum code better than any stabilizer code has been discovered [28]. Several large or infinite classes of stabilizer codes have been described elsewhere [13, 14, 37, 38, 6]. The fact that the group {I, X, Y, Z} has four members makes quantum coding especially susceptible to an analysis in terms of GF(4) rather than GF(2), and this enables many codes to be discovered. A basic result in classical error correction is the MacWilliams identity, which relates the weight distributions of dual codes. The corresponding relation has been discovered for quantum codes [33]. This is a remarkable result which also yields a relationship between different fidelity measures. A recursive coding and correction method, known as concatenated coding, was introduced by Knill and Laflamme [21]. This enables the most likely errors to be corrected more frequently than the less likely ones. For the purpose of quantum communication and results related to channel capacity, it is important to use the most efficient codes possible, that is those of highest rate k/n.
However, for use in quantum computers the simplicity of construction of the code is also important, especially when fault tolerant methods are needed (see the chapter by Preskill). For large scale quantum computation, fault tolerant methods based on quantum codes are probably the best prospect. However, the experimental techniques required to manipulate quantum codewords are formidable, so for the simpler aim of quantum communication, a method based on quantum teleportation is probably best. A promising method to establish the required entanglement through a realistic noisy channel is described by van Enk et al. [12]. All the above mentioned ideas are being actively investigated. Whereas the basic concepts of quantum error correction are now well established, many of the limits on exactly what can be achieved remain uncertain.
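The classical scaling quoted earlier in this section, a failure probability of order $p^{t+1}$ with a prefactor of roughly C(n, t+1), can be illustrated numerically. This is a sketch under simple assumptions that are not spelled out in the text: bits flip independently with probability p, and decoding is taken to fail exactly when more than t of the n bits flip. The function names are illustrative.

```python
from math import comb

def block_failure_probability(n, t, p):
    """Probability that more than t of n independent bits are flipped."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(t + 1, n + 1))

def leading_term(n, t, p):
    """Lowest-order estimate C(n, t+1) * p^(t+1)."""
    return comb(n, t + 1) * p**(t + 1)

# Example: a single-error-correcting code on 7 bits (n = 7, t = 1).
n, t, p = 7, 1, 1e-3
exact = block_failure_probability(n, t, p)
approx = leading_term(n, t, p)
print(exact, approx, exact / approx)
```

With these numbers the prefactor is C(7, 2) = 21, so the estimate is 21 p^2; halving p reduces the failure probability by close to a factor of four, which is the quadratic scaling expected for t = 1.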
Acknowledgments
The author is supported by the Royal Society and by St Edmund Hall, Oxford. I have profited from discussions on quantum error correction with many people, of whom I would like to thank especially R. Laflamme, E. Knill, P. Shor, S. Lloyd, A. Ekert, D. Gottesman and J. Preskill.

References
1. H. Barnum, C. A. Fuchs, R. Jozsa and B. Schumacher, “A general fidelity limit for quantum channels,” preprint quant-ph/9603014.
2. C. H. Bennett, G. Brassard, S. Popescu, B. Schumacher, J. A. Smolin and W. K. Wootters, Phys. Rev. Lett. 76, 722 (1996).
3. C. H. Bennett, D. P. DiVincenzo, J. A. Smolin and W. K. Wootters, Phys. Rev. A 54, 3825 (1996).
4. A. Berthiaume, D. Deutsch and R. Jozsa, “The stabilisation of quantum computation,” p. 60 in Proceedings of the Workshop on Physics and Computation, PhysComp 94, (Los Alamitos: IEEE Computer Society Press 1994).
5. A. R. Calderbank and P. W. Shor, Phys. Rev. A 54, 1098 (1996).
6. A. R. Calderbank, E. M. Rains, P. W. Shor and N. J. A. Sloane, Phys. Rev. Lett. 78, 405 (1997).
7. A. R. Calderbank, E. M. Rains, P. W. Shor and N. J. A. Sloane, “Quantum error correction via codes over GF(4),” preprint quant-ph/9608006.
8. I. L. Chuang, R. Laflamme, P. W. Shor and W. H. Zurek, Science 270, 1633 (1995).
9. R. Cleve and D. Gottesman, “Efficient computations of encodings for quantum error correction,” preprint quant-ph/9607030.
10. D. Deutsch, A. K. Ekert, R. Jozsa, C. Macchiavello, S. Popescu, and A. Sanpera, Phys. Rev. Lett. 77, 2818 (1996).
11. A. K. Ekert and C. Macchiavello, Phys. Rev. Lett. 77, 2585 (1996).
12. S. J. van Enk, J. I. Cirac and P. Zoller, Phys. Rev. Lett. 78, 4293 (1997).
13. D. Gottesman, Phys. Rev. A 54, 1862 (1996).
14. D. Gottesman, “Pasting quantum codes,” preprint quant-ph/9607027.
15. R. W. Hamming, Bell Syst. Tech. J. 29, 147 (1950).
16. R. W. Hamming, Coding and information theory, 2nd ed., (Prentice-Hall, Englewood Cliffs 1986).
17. R. Hill, A first course in coding theory, (Clarendon Press, Oxford 1986).
18. D. S. Jones, Elementary information theory, (Clarendon Press, Oxford 1979).
19. R. Jozsa, J. Mod. Optics 41, 2315 (1994).
20. E. Knill and R. Laflamme, Phys. Rev. A 55, 900 (1997).
21. E. Knill and R. Laflamme, “Concatenated quantum codes,” preprint quant-ph/9608012.
22. R. Laflamme, C. Miquel, J. P. Paz and W. H. Zurek, Phys. Rev. Lett. 77, 198 (1996).
23. S. Lloyd, Phys. Rev. A 55, 1613 (1997).
24. F. J. MacWilliams and N. J. A. Sloane, The theory of error correcting codes, (Elsevier Science, Amsterdam 1977).
25. G. M. Palma, K.-A. Suominen and A. K. Ekert, Proc. Roy. Soc. Lond. A 452, 567 (1996).
26. M. B. Plenio and P. L. Knight, Proc. R. Soc. Lond. A 453, 2017 (1997).
27. J. Preskill, “Reliable quantum computers,” preprint quant-ph/9705031.
28. E. M. Rains, R. H. Hardin, P. W. Shor and N. J. A. Sloane, Phys. Rev. Lett. 79, 953 (1997).
29. B. W. Schumacher and M. A. Nielsen, Phys. Rev. A 54, 2629 (1996).
30. C. E. Shannon, Bell Syst. Tech. J. 27, 379; also 623 (1948).
31. P. W. Shor, Phys. Rev. A 52, R2493 (1995).
32. P. W. Shor, “Fault tolerant quantum computation,” in Proc. 37th Symp. on Foundations of Computer Science, pp. 56-65, (IEEE Computer Society Press, 1996); preprint quant-ph/9605011.
33. P. W. Shor and R. Laflamme, “Quantum analog of the MacWilliams identities in classical coding theory,” preprint quant-ph/9610040.
34. P. W. Shor and J. A. Smolin, “Quantum error correcting codes need not completely reveal the error syndrome,” preprint quant-ph/9604006.
35. A. M. Steane, Phys. Rev. Lett. 77, 793 (1996).
36. A. M. Steane, Proc. Roy. Soc. Lond. A 452, 2551 (1996).
37. A. M. Steane, Phys. Rev. A 54, 4741 (1996).
38. A. M. Steane, “Quantum Reed-Muller codes,” submitted to IEEE Trans. Inf. Theory, preprint quant-ph/9608026.
39. A. M. Steane, “Space, time and noise requirements for reliable quantum computing,” preprint quant-ph/9708021.
40. A. M. Steane, “Quantum Computing,” Rep. Prog. Phys. (to be published), preprint quant-ph/9708022.
41. W. G. Unruh, Phys. Rev. A 51, 992 (1995).
FAULT-TOLERANT QUANTUM COMPUTATION
JOHN PRESKILL
California Institute of Technology, Pasadena, CA 91125, USA
The discovery of quantum error correction has greatly improved the long-term prospects for quantum computing technology. Encoded quantum information can be protected from errors that arise due to uncontrolled interactions with the environment, or due to imperfect implementations of quantum logical operations. Recovery from errors can work effectively even if occasional mistakes occur during the recovery procedure. Furthermore, encoded quantum information can be processed without serious propagation of errors. In principle, an arbitrarily long quantum computation can be performed reliably, provided that the average probability of error per quantum gate is less than a certain critical value, the accuracy threshold. It may be possible to incorporate intrinsic fault tolerance into the design of quantum computing hardware, perhaps by invoking topological Aharonov-Bohm interactions to process quantum information.
1 The need for fault tolerance
Quantum computers appear to be capable, at least in principle, of solving certain problems far faster than any conceivable classical computer [1-3]. In practice, though, quantum computing technology is still in its infancy. While a practical and useful quantum computer may eventually be constructed, we cannot clearly envision at present what the hardware of that machine will be like. Nevertheless, we can be quite confident that any practical quantum computer will incorporate some type of error correction into its operation. Quantum computers are far more susceptible to making errors than conventional digital computers, and some method of controlling and correcting those errors will be needed to prevent a quantum computer from crashing. The most formidable enemy of the quantum computer is decoherence. We know how to prepare a quantum state of a cat that is a superposition of a dead cat and a live cat, but we never observe such macroscopic superpositions because they are very unstable. No real cat can be perfectly isolated from its environment. The environment measures the cat, in effect, immediately projecting it onto a state that is completely alive or completely dead. A quantum computer may not be as complex as a cat, but it is a complicated quantum system, and like a cat it inevitably interacts with the environment. The information stored in the computer decays, resulting in errors and the failure of the computation. Can we protect a quantum computer from the debilitating effects of decoherence? And decoherence is not our only enemy. Even if we were able to achieve
excellent isolation of our computer from the environment, we could not expect to execute quantum logic gates with perfect accuracy. As with an analog classical computer, the errors in the quantum gates form a continuum. Small errors in the gates can accumulate over the course of a computation, eventually causing failure, and it is not obvious how to correct these small errors. Can we prevent the catastrophic accumulation of the small gate errors? The future prospects for quantum computing received a tremendous boost from the discovery that quantum error correction is really possible in principle (see the preceding chapter by A. Steane). But this discovery in itself is not sufficient to ensure that a noisy quantum computer can perform reliably. To carry out a quantum error-correction protocol, we must first encode the quantum information we want to protect, and then repeatedly perform recovery operations that reverse the errors that accumulate. But encoding and recovery are themselves complex quantum computations, and errors will inevitably occur while we perform these operations. Thus, we need to find methods for recovering from errors that are sufficiently robust to succeed with high reliability even when we make some errors during the recovery step. Furthermore, to operate a quantum computer, we must do more than just store quantum information; we must process the information. We need to be able to perform quantum gates, in which two or more encoded qubits come together and interact with one another. If an error occurs in one qubit, and then that qubit interacts with another through the operation of a quantum gate, the error is likely to spread to the second qubit. We must design our gates to minimize the propagation of error. Incorporating quantum error correction will surely complicate the operation of a quantum computer. To establish the redundancy needed to protect against errors, the number of elementary qubits will have to rise.
Performing gates on encoded information, and inserting periodic error-recovery steps, will slow the computation down. Because of this necessary increase in the complexity of the device, it is not a priori obvious that error correction will really improve its performance. A device that works effectively even when its elementary components are imperfect is said to be fault tolerant. This chapter is devoted to the theory of fault-tolerant quantum computation. We will address the issues and questions summarized above. In fact, similar issues also arise in the theory of fault-tolerant classical computation. Because existing silicon-based circuitry is so remarkably reliable, fault-tolerance is not essential to the operation of modern digital computers. Even so, the study of fault-tolerant classical computing has a distinguished history. In 1952, Von Neumann [13] suggested improving the reliability of a circuit with noisy gates by executing each gate many times, and using majority voting. He concluded that if the gate failures are statistically independent, and the probability of failure per gate is small enough, then any computation can be performed with reasonable reliability. One shortcoming of Von Neumann’s analysis was that he assumed perfect transmission of bits through the “wires” connecting the gates [a]. Going beyond this assumption proved difficult, but was eventually achieved in 1983 by Gács, who described a universal cellular automaton with a hierarchical organization that can be maintained by local operations in the presence of noise, without any need for direct nonlocal communication among the components. It is an interesting question whether a quantum system can similarly maintain a complex hierarchical structure, but we will not be so ambitious as to address this question here. Because we are interested in the limitations imposed by noise on the processing of quantum information, we will classify our gates into classical and quantum, and we will assume that the classical gates can be executed with perfect accuracy and as quickly as necessary [b]. This assumption will be well justified as long as the clock speed and accuracy of our classical computer far exceed those of the quantum computer. After reviewing the features of a particular quantum error-correcting code (Steane’s 7-qubit code [12]) in Sec. 2, we assemble the ingredients of fault-tolerant recovery in Sec. 3. Errors that occur during recovery can further damage the encoded quantum information; hence recovery must be implemented carefully to be effective. Ancilla qubits are used to measure an error syndrome that diagnoses the errors in the encoded data block, and we must minimize the propagation of errors from the ancilla to the data. Methods for controlling error propagation during recovery (proposed by Peter Shor [15] and Andrew Steane [16]) are described.
Fault-tolerant processing of quantum information is the subject of Sec. 4. The central challenge is to construct a universal set of quantum gates that can act on the encoded data blocks without introducing an excessive number of errors. Some schemes for universal computation (due to Peter Shor [15] and Daniel Gottesman [17]) are outlined. Once the elementary gates of our quantum computer are sufficiently reliable, we can perform fault-tolerant quantum gates on encoded information, along with fault-tolerant error recovery, to improve the reliability of the device.

[a] This problem is serious because Von Neumann’s circuits cannot be realized in three-dimensional space with wires of bounded length; one would expect the probability of a transmission error to approach unity as the wire becomes arbitrarily long.
[b] However, when we consider error recovery with quantum codes of arbitrarily large block size, we will insist that the amount of classical processing to be performed remains bounded by a polynomial in the block size.
But for any fixed quantum code, or even for most infinite classes that contain codes of arbitrarily large block size, these procedures will eventually fail if we attempt a very long computation. However, it is shown in Sec. 5 that there is a special class of codes (concatenated codes) which enable us to perform longer and longer quantum computations reliably, as we increase the block size at a modest rate [18-24]. Invoking concatenated codes we can establish an accuracy threshold for quantum computation; once our hardware meets a specified standard of accuracy, quantum error-correcting codes and fault-tolerant procedures enable us to perform arbitrarily long quantum computations with arbitrarily high reliability. This result is roughly analogous to Von Neumann’s conclusion regarding classical fault-tolerance, while the hierarchical structure of concatenated coding is reminiscent of the Gács construction. We outline an estimate of the accuracy threshold, given assumptions about the errors that are enumerated in Sec. 6. With the development of fault tolerant methods, we now know that it is possible in principle for the operator of a quantum computer to actively intervene to stabilize the device against errors in a noisy (but not too noisy) environment. In the long term, though, fault tolerance might be achieved in practical quantum computers by a rather different route: with intrinsically fault-tolerant hardware. Such hardware, designed to be impervious to localized influences, could be operated relatively carelessly, yet could still store and process quantum information robustly. The topic of Sec. 7 is a scheme for fault-tolerant hardware envisioned by Alexei Kitaev, in which the quantum gates exploit nonabelian Aharonov-Bohm interactions among distantly separated quasiparticles in a suitably constructed two-dimensional spin system.
Though the laboratory implementation of Kitaev’s idea may be far in the future, his work offers a new slant on quantum fault tolerance that shuns the analysis of abstract quantum circuits, in favor of new physics principles that might be exploited in the reliable processing of quantum information.
The claims made in this chapter about the potential for the fault-tolerant manipulation of complex quantum states may seem grandiose from the perspective of present-day technology. Surely, we have far to go before devices are constructed that can, say, exploit the accuracy threshold for quantum computation. Nevertheless, I feel strongly that recent work relating to quantum error correction will have an enduring legacy. Theoretical quantum computation has developed at a spectacular pace over the past three years. If, as appears to be the case, the quantum classification of computational complexity differs from the classical classification, then no conceivable classical computer can accurately predict the behavior of even a modest number of qubits (of order 100). Perhaps, then, relatively small quantum systems will have far greater potential
than we now suspect to surprise, baffle, and delight us. Yet this potential could never be realized were we unable to protect such systems from the destructive effects of noise and decoherence. Thus the discovery of fault-tolerant methods for quantum error recovery and quantum computation has exceptionally deep implications, both for the future of experimental physics and for the future of technology. The theoretical advances have illuminated the path toward a future in which intricate quantum systems may be persuaded to do our bidding.

2 Quantum error correction: the 7-qubit code
To see how quantum error correction is possible, it is very instructive to study a particular code. A simple and important example of a quantum error-correcting code is the 7-qubit code devised by Andrew Steane [11, 12]. This code enables us to store one qubit of quantum information (an arbitrary state in a two-dimensional Hilbert space) using altogether 7 qubits (by embedding the two-dimensional Hilbert space in a space of dimension $2^7$). Steane's code is actually closely related to a familiar classical error-correcting code, the [7,4,3] Hamming code [26]. To understand why Steane's code works, it is important to first understand the classical Hamming code. The Hamming code uses a block of 7 bits to encode 4 bits of classical information; that is, there are $16 = 2^4$ strings of length 7 that are the valid codewords. The codewords can be characterized by a parity check matrix

$$H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix}. \qquad (1)$$

Each valid codeword is a 7-bit string $v_{\rm code}$ that satisfies

$$H v_{\rm code} = 0 \pmod 2; \qquad (2)$$

that is, the matrix H annihilates each codeword in mod 2 arithmetic. Since $Z_2 = \{0, 1\}$ is a (finite) field, familiar results of linear algebra apply here. H has three linearly independent rows and its kernel is spanned by four linearly independent column vectors. The 16 valid codewords are obtained by taking all possible linear combinations of these four strings, with coefficients chosen from {0, 1}. Now suppose that $v_{\rm code}$ is an (unknown) valid codeword, and that a single (unknown) error occurs: one of the seven bits flips. We are assigned the task of determining which bit flipped, so that the error can be corrected. This trick
can be performed by applying the parity check matrix to the string. Let $e_i$ denote the string with a one in the ith place, and zeros elsewhere. Then when the ith bit flips, $v_{\rm code}$ becomes $v_{\rm code} + e_i$. If we apply H to this string we obtain

$$H (v_{\rm code} + e_i) = H e_i \qquad (3)$$
(because H annihilates $v_{\rm code}$), which is just the ith column of the matrix H. Since all of the columns of H are distinct, we can infer i; we have learned where the error occurred, and we can correct the error by flipping the ith bit back. Thus, we can recover the encoded data unambiguously if only one bit flips; but if two or more different bits flip, the encoded data will be damaged. It is noteworthy that the quantity $H e_i$ reveals the location of the error without telling us anything about $v_{\rm code}$; that is, without revealing the encoded information. Steane’s code generalizes this sort of classical error-correcting code to a quantum error-correcting code. The code uses a 7-qubit “block” to encode one qubit of quantum information, that is, we can encode an arbitrary state in a two-dimensional Hilbert space spanned by two states: the “logical zero” $|0\rangle_{\rm code}$ and the “logical one” $|1\rangle_{\rm code}$. The code is designed to enable us to recover from an arbitrary error occurring in any of the 7 qubits in the block. What do we mean by an arbitrary error? The qubit in question might undergo a random unitary transformation, or it might decohere by becoming entangled with states of the environment. Suppose that, if no error occurs, the qubit ought to be in the state $a|0\rangle + b|1\rangle$. (Of course, this particular qubit might be entangled with others, so the coefficients a and b need not be complex numbers; they can be states that are orthogonal to both $|0\rangle$ and $|1\rangle$, which we assume are unaffected by the error.) Now if the qubit is afflicted by an arbitrary error, the resulting state can be expanded in the form:
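$$\begin{aligned} a|0\rangle + b|1\rangle \longrightarrow {} & (a|0\rangle + b|1\rangle) \otimes |A_{\text{no error}}\rangle_{\rm env} \\ & + (a|1\rangle + b|0\rangle) \otimes |A_{\text{bit-flip}}\rangle_{\rm env} \\ & + (a|0\rangle - b|1\rangle) \otimes |A_{\text{phase-flip}}\rangle_{\rm env} \\ & + (a|1\rangle - b|0\rangle) \otimes |A_{\text{both errors}}\rangle_{\rm env}, \end{aligned} \qquad (4)$$

where each $|A\rangle_{\rm env}$ denotes a state of the environment. We are making no particular assumption about the orthogonality or normalization of the states; so Eq. 4 entails no loss of generality. We conclude that the evolution of the qubit can be expressed as a linear combination of four possibilities: (1)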
[c] Though, of course, the combined evolution of qubit plus environment is required to be unitary.
Figure 1: Diagrammatic notation for the NOT gate, the XOR (controlled-NOT) gate, and the Toffoli (controlled-controlled-NOT) gate.
no error occurs, (2) the bit flip $|0\rangle \leftrightarrow |1\rangle$ occurs, (3) the relative phase of $|0\rangle$ and $|1\rangle$ flips, (4) both a bit flip and a phase flip occur. Now it is clear how a quantum error-correcting code should work. By making a suitable measurement, we wish to diagnose which of these four possibilities actually occurred. Of course, in general, the state of the qubit will be a linear combination of these four states, but the measurement should project the state onto the basis used in Eq. 4. We can then proceed to correct the error by applying one of the four unitary transformations:

$$\mathbf{1}, \quad X \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Z \equiv \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad Y \equiv X \cdot Z \qquad (5)$$
(and the measurement outcome will tell us which one to apply). By applying this transformation, we restore the qubit to its intended value, and completely disentangle the quantum state of the qubit from the state of the environment. It is essential, though, that in diagnosing the error, we learn nothing about the encoded quantum information, for to find out anything about the coefficients a and b in Eq. 4 would necessarily destroy the coherence of the qubit. If we use Steane’s code, a measurement meeting these criteria is possible. The logical zero is the equally weighted superposition of all of the even weight codewords of the Hamming code (those with an even number of 1’s),
$$|0\rangle_{\rm code} = \frac{1}{\sqrt{8}} \Big( |0000000\rangle + |0001111\rangle + |0110011\rangle + |0111100\rangle + |1010101\rangle + |1011010\rangle + |1100110\rangle + |1101001\rangle \Big), \qquad (6)$$

Figure 2: Computation of the bit-flip syndrome for Steane’s 7-qubit code. Repeating the computation in the rotated basis diagnoses the phase-flip errors. To make the procedure fault tolerant, each ancilla qubit must be replaced by four qubits in a suitable state.
and the logical 1 is the equally weighted superposition of all of the odd weight codewords of the Hamming code (those with an odd number of 1’s),

$$|1\rangle_{\rm code} = \frac{1}{\sqrt{8}} \Big( |1111111\rangle + |1110000\rangle + |1001100\rangle + |1000011\rangle + |0101010\rangle + |0100101\rangle + |0011001\rangle + |0010110\rangle \Big). \qquad (7)$$

Since all of the states appearing in Eq. 6 and Eq. 7 are Hamming codewords, it is easy to detect a single bit flip in the block by doing a simple quantum computation, as illustrated in Fig. 2 (using notation defined in Fig. 1). We augment the block of 7 qubits with 3 ancilla bits [d], and perform the unitary operation:

$$|v\rangle \otimes |0\rangle_{\rm anc} \longrightarrow |v\rangle \otimes |Hv\rangle_{\rm anc}, \qquad (8)$$

where H is the Hamming parity check matrix, and $|\cdot\rangle_{\rm anc}$ denotes the state of the three ancilla bits. If we assume that only a single one of the 7 qubits in the block is in error, measuring the ancilla projects that qubit onto either a state with a bit flip or a state with no flip (rather than any nontrivial superposition of the two). If the bit does flip, the measurement outcome diagnoses which bit was affected, without revealing anything about the quantum information encoded in the block.

[d] To make the procedure fault-tolerant, we will need to increase the number of ancilla bits as discussed in Sec. 3.

But to perform quantum error correction, we will need to diagnose phase errors as well as bit flip errors. To accomplish this, we observe (following Steane [11, 12]) that we can change the basis for each qubit by applying the Hadamard rotation

$$R = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \qquad (9)$$

Then phase errors in the $|0\rangle$, $|1\rangle$ basis become bit flip errors in the rotated basis

$$|\tilde{0}\rangle \equiv \frac{1}{\sqrt{2}} \big( |0\rangle + |1\rangle \big), \qquad |\tilde{1}\rangle \equiv \frac{1}{\sqrt{2}} \big( |0\rangle - |1\rangle \big). \qquad (10)$$
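The statement that a phase flip in one basis is a bit flip in the rotated basis amounts to the matrix identity R Z R = X, where X and Z are the labels used here (as an assumption of this sketch) for the bit-flip and phase-flip matrices. A quick numerical check, written with plain Python lists so that it needs no external libraries:

```python
from math import sqrt, isclose

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b):
    """Entrywise comparison with a small tolerance for rounding."""
    return all(isclose(a[i][j], b[i][j], abs_tol=1e-12)
               for i in range(2) for j in range(2))

s = 1 / sqrt(2)
R = [[s, s], [s, -s]]      # Hadamard rotation of Eq. 9
X = [[0, 1], [1, 0]]       # bit flip (label assumed for this sketch)
Z = [[1, 0], [0, -1]]      # phase flip (label assumed for this sketch)
IDENTITY = [[1, 0], [0, 1]]

RZR = matmul(matmul(R, Z), R)   # phase flip seen from the rotated basis
RXR = matmul(matmul(R, X), R)   # bit flip seen from the rotated basis

print(close(RZR, X), close(RXR, Z), close(matmul(R, R), IDENTITY))
```

Because R is its own inverse, conjugating by R simply exchanges the two kinds of error, which is why a code that diagnoses bit flips in both bases handles phase flips as well.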
It will therefore be sufficient if our code is able to diagnose bit flip errors in this rotated basis. But if we apply the Hadamard rotation to each of the 7 qubits, then Steane’s logical 0 and logical 1 become in the rotated basis

$$\begin{aligned} |\tilde{0}\rangle_{\rm code} &= \frac{1}{4} \sum_{v \in \text{Hamming}} |v\rangle = \frac{1}{\sqrt{2}} \big( |0\rangle_{\rm code} + |1\rangle_{\rm code} \big), \\ |\tilde{1}\rangle_{\rm code} &= \frac{1}{4} \sum_{v \in \text{Hamming}} (-1)^{{\rm wt}(v)} |v\rangle = \frac{1}{\sqrt{2}} \big( |0\rangle_{\rm code} - |1\rangle_{\rm code} \big) \end{aligned} \qquad (11)$$

(where wt(v) denotes the weight of v). The key point is that $|\tilde{0}\rangle_{\rm code}$ and $|\tilde{1}\rangle_{\rm code}$, like $|0\rangle_{\rm code}$ and $|1\rangle_{\rm code}$, are superpositions of Hamming codewords. Hence, in the rotated basis, as in the original basis, we can perform the Hamming parity check to diagnose bit flips, which are phase flips in the original basis. Assuming that only one qubit is in error, performing the parity check in both bases completely diagnoses the error, and enables us to correct it. In the above description of the error correction scheme, I assumed that the error affected only one of the qubits in the block. Clearly, this assumption as stated is not realistic; all of the qubits will typically become entangled with the environment to some degree. However, as we have seen, the procedure for determining the error syndrome will typically project each qubit onto a state in which no error has occurred. For each qubit, there is a non-zero probability of an error, assumed small, which we’ll call $\epsilon$. Now we will make a very important assumption: that the errors acting on different qubits in the same block are completely uncorrelated with one another. Under this assumption, the probability of two errors is of order $\epsilon^2$, and so is much smaller than the probability of a single error if $\epsilon$ is small enough. So, to order $\epsilon$ accuracy, we can safely confine our attention to the case where at most one qubit per block is in error. (In fact, to reach this conclusion, we do not really require that errors
acting on different qubits be completely uncorrelated. If all qubits are exposed to the same weak magnetic field, so that each has a probability $\epsilon$ of flipping over, that would be okay because the probability that two spins flip over is order $\epsilon^2$. What would cause trouble is a process occurring with probability of order $\epsilon$ that flips two spins at once.) But in the (unlikely) event of two errors occurring in the same block of the code, our recovery procedure will typically fail. If two bits flip in the same block, then the Hamming parity check will misdiagnose the error. Recovery will restore the quantum state to the code subspace, but the encoded information in the block will undergo the bit flip

$$|0\rangle_{\rm code} \to |1\rangle_{\rm code}, \qquad |1\rangle_{\rm code} \to |0\rangle_{\rm code}. \qquad (12)$$
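The classical half of this failure mode can be seen in a few lines. The sketch below assumes the parity check matrix whose ith column is the number i written in binary, which is consistent with the codewords listed in Eq. 6 and Eq. 7; the function names are illustrative. One flipped bit is repaired, whereas two flipped bits are "repaired" into a different codeword of opposite parity, which is the logical bit flip of Eq. 12.

```python
# Parity check matrix; column i (counting from 1) is i written in binary.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def syndrome(word):
    """H applied to the 7-bit word, in mod 2 arithmetic."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def correct(word):
    """Flip the single bit that the syndrome points to, if any."""
    s = syndrome(word)
    position = 4 * s[0] + 2 * s[1] + s[2]   # read the syndrome as a binary number
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed

codeword = [0, 0, 0, 1, 1, 1, 1]   # even weight, so it appears in the logical zero

one_flip = list(codeword)
one_flip[2] ^= 1                   # a single error on bit 3

two_flips = list(codeword)
two_flips[0] ^= 1                  # errors on bits 1 and 6
two_flips[5] ^= 1

after_one = correct(one_flip)
after_two = correct(two_flips)

print(after_one == codeword)                      # single error is undone
print(after_two, syndrome(after_two), sum(after_two) % 2)
```

In the second case the syndrome points at bit 7, so three bits end up changed in total. The result passes the parity check but has odd weight, so a block that started in the logical zero is read as the logical one.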
Similarly, if there are two phase errors in the same block, these are two bit flip errors in the rotated basis, so that after recovery the block will have undergone a bit flip in the rotated basis, or in the original basis the phase flip

$$|0\rangle_{\rm code} \to |0\rangle_{\rm code}, \qquad |1\rangle_{\rm code} \to -|1\rangle_{\rm code}. \qquad (13)$$
(If one qubit in the block has a phase error, and another one has a bit flip error, then recovery will be successful.) Thus we have seen that Steane’s code can enhance the reliability of stored quantum information. Suppose that we want to store one qubit in an unknown pure state $|\psi\rangle$. Due to imperfections in our storage device, the state $\rho_{\rm out}$ that we recover will have suffered a loss of fidelity:

$$F \equiv \langle\psi|\rho_{\rm out}|\psi\rangle = 1 - \epsilon. \qquad (14)$$
But if we store the qubit using Steane’s 7-qubit block code, if each of the 7 qubits is maintained with fidelity $F = 1 - \epsilon$, if the errors on the qubits are uncorrelated, and if we can perform error recovery, encoding, and decoding flawlessly (more on this below), then the encoded information can be maintained with an improved fidelity $F = 1 - O(\epsilon^2)$. A qubit in an unknown state can be encoded using the circuit shown in Fig. 3. It is easiest to understand how the encoder works by using an alternative expression for the Hamming parity check matrix,

$$H = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 1 & 0 \end{pmatrix}. \qquad (15)$$
(This form of H is obtained from the form in Eq. 1 by permuting the columns, which is just a relabeling of the bits in the block.) The even subcode of the
Figure 3: An encoding circuit for Steane’s 7-qubit code.
Hamming code is actually the space spanned by the rows of H ; so we see that (in this representation of H ) the first three bits of the string completely characterize the data represented in the subcode. The remaining four bits are the parity bits that provide the redundancy needed to protect against errors. When encoding the unknown state al0) +bll), the encoder first uses two XOR’s to prepare the state a10000000) b10000111),a superposition of even and odd Hamming codewords. The rest of the circuit adds I0)code t o this state: the Hadamard (R) rotations prepare an equally weighted superposition of all eight possible values for the first three bits in the block, and the remaining XOR gates switch on the parity bits dictated by H . We will also want to be able t o measure the encoded qubit, say by projecting onto the orthogonal basis {IO)code, Il)code}. If we don’t mind destroying the encoded block when we make the measurement, then it is sufficient to measure each of the seven qubits in the block by projecting onto the basis {lo), ll)}; we then perform classical error correction on the measurement outcomes to obtain a Hamming codeword. The parity of that codeword is the value of the logical qubit. (The classical error correction step provides protection against measurement errors. For example, if the block is in the state IO)code, then two independent errors would have to occur in the measurement of the elementary qubits for the measurement of the logical qubit to yield the incorrect value
+
11)code.)
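The classical bookkeeping in this passage can be checked directly. The following is a minimal Python sketch, assuming the matrix of Eq. 15; the helper names (`syndrome`, `row_space`, `correct`, `measure_logical`) are illustrative and do not come from the text. It enumerates the Hamming codewords, lets one confirm that the even subcode is the row space of $H$, and models the destructive readout as classical correction followed by a parity.

```python
from itertools import product

# Parity check matrix of Eq. 15 (all arithmetic is mod 2).
H = [
    (1, 0, 0, 1, 0, 1, 1),
    (0, 1, 0, 1, 1, 0, 1),
    (0, 0, 1, 1, 1, 1, 0),
]

def syndrome(word):
    """Return the 3-bit Hamming syndrome of a 7-bit word."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def row_space():
    """All 8 mod-2 combinations of the rows of H."""
    words = set()
    for coeffs in product((0, 1), repeat=3):
        word = tuple(
            sum(c * row[i] for c, row in zip(coeffs, H)) % 2 for i in range(7)
        )
        words.add(word)
    return words

def correct(word):
    """Classical correction: flip the bit whose column of H equals the syndrome."""
    s = syndrome(word)
    if s == (0, 0, 0):
        return tuple(word)
    columns = [tuple(row[i] for row in H) for i in range(7)]
    i = columns.index(s)
    return tuple(b ^ (j == i) for j, b in enumerate(word))

def measure_logical(outcomes):
    """Destructive readout: correct the 7 measured bits, then take the parity."""
    return sum(correct(outcomes)) % 2

codewords = [w for w in product((0, 1), repeat=7) if syndrome(w) == (0, 0, 0)]
even = {w for w in codewords if sum(w) % 2 == 0}
```

With these definitions, `even == row_space()` holds, `measure_logical((0, 0, 0, 0, 1, 0, 0))` repairs the single flipped bit and returns 0, and the undamaged odd codeword `(0, 0, 0, 0, 1, 1, 1)` returns 1.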
In applications to quantum computation, we will need to perform a measurement that projects onto $\{|0\rangle_{\rm code}, |1\rangle_{\rm code}\}$ without destroying the block. This task is accomplished by copying the parity of the block onto an ancilla qubit, and then measuring the ancilla. A circuit that performs a nondestructive measurement of the code block (in the case where the parity check matrix is as in Eq. 15) is shown in Fig. 4. The measurement is nondestructive in the sense that it preserves the code subspace; it does, of course, “destroy” a coherent superposition $a|0\rangle_{\rm code} + b|1\rangle_{\rm code}$ by collapsing the state to either $|0\rangle_{\rm code}$ (with probability $|a|^2$) or $|1\rangle_{\rm code}$ (with probability $|b|^2$).

Figure 4: Destructive and nondestructive measurement of the logical qubit.

Steane’s 7-qubit code can recover from only a single error in the code block, but better codes can be constructed that can protect the information from up to $t$ errors within a single block, so that the encoded information can be maintained with a fidelity $F = 1 - O(\epsilon^{t+1})$ [12, 9, 28-31]. The current status of quantum coding theory is reviewed by Steane in this volume.

The key conceptual insight that makes quantum error correction possible is that we can fight entanglement with entanglement. Entanglement can be our enemy, since entanglement of our device with the environment can conceal quantum information from us, and so cause errors. But entanglement can also be our friend: we can encode the information that we want to protect in entanglement, that is, in correlations involving a large number of qubits. This information, then, cannot be accessed if we measure just a few qubits. By the same token, the information cannot be damaged if the environment interacts with just a few qubits.

Furthermore, we have learned that, although the quantum computer is in a sense an analog device, we can digitalize the errors that it makes. We deal with small errors by making appropriate measurements that project the state of our quantum computer onto either a state where no error has occurred, or a state with a large error, which can then be corrected with familiar methods. And we have seen that it is possible to measure the errors without measuring the data: we can acquire information about the precise nature of the error without acquiring any information about the quantum information encoded in our device (which would result in decoherence and failure of our computation).

All quantum error correcting codes make use of the same fundamental strategy: a small subspace of the Hilbert space of our device is designated as the code subspace. This space is carefully chosen so that all of the errors that we want to correct move the code space to mutually orthogonal error subspaces. We can make a measurement after our system has interacted with the environment that tells us in which of these mutually orthogonal subspaces the system resides, and hence infer exactly what type of error occurred. The error can then be repaired by applying an appropriate unitary transformation.

3 Fault-tolerant recovery
In our discussion so far, we have assumed that we can encode quantum information and perform recovery from errors without making any mistakes. But, of course, error recovery will not be flawless. Recovery is itself a quantum computation that will be prone to error. If the probability of error for each bit in our code block is $\epsilon$, then it is reasonable to suppose that each quantum gate that we employ in the recovery procedure has a probability of order $\epsilon$ of introducing an error (or that “storage errors” occur with probability of order $\epsilon$ during recovery). If our recovery procedure is carelessly designed, then the probability that the procedure fails (e.g., because two errors occur in the same block) may be of order $\epsilon$. Then we have derived no benefit from using a quantum error-correcting code; in fact, the probability of error per data qubit is even higher than without any coding. So we are obligated to consider systematically all the possible ways that recovery might fail with a probability of order $\epsilon$, and to ensure that they are all eliminated. Only then is our procedure fault tolerant, and only then is coding guaranteed to pay off once $\epsilon$ is small enough.

3.1 The Back Action Problem
One serious concern is propagation of error. If an error occurs in one qubit, and then we apply a gate in which that qubit interacts with another, the error is likely to spread to the second qubit. We need to be careful to contain the infection, or at least we must strive to prevent two errors from appearing in a single block.

In performing error recovery, we repeatedly use the two-qubit XOR gate. This gate can propagate errors in two different ways. First, it is obvious that if a bit flip error occurs in one qubit, and that qubit is then used as the source qubit of an XOR gate, then the bit flip will propagate “forward” to the target qubit. The second type of error propagation is more subtle, and can be understood using the identity represented in Fig. 5: if we perform a rotation of basis with a Hadamard gate on both qubits, then the source and the target of the XOR gate are interchanged. Recalling that this change of basis also interchanges a bit flip error with a phase error, we infer that if a phase error occurs in one qubit, and that qubit is then used as the target qubit of an XOR gate, then the error will propagate “backward” to the source qubit.

Figure 5: A useful identity. The source and the target of an XOR gate are interchanged if we perform a change of basis with Hadamard rotations.

Figure 6: Bad and good versions of syndrome measurement. The bad circuit uses the same ancilla bit several times; the good circuit uses each ancilla bit only once.

We can now see that the circuit shown in Fig. 2 is not fault tolerant. The trouble is that a single ancilla qubit is used as a target for four successive XOR gates. If just a single phase error occurs in the ancilla qubit at some stage, that one error can feed back to two or more of the qubits in the data block. The result is that a block phase error may occur with a probability of order $\epsilon$, which is not acceptable.

To reduce the failure probability to order $\epsilon^2$, we must modify the recovery circuit so that each ancilla qubit couples to no more than one qubit within the code block. One way to do this is to expand the ancilla from one bit to four, with each bit the target of a single XOR gate, as in Fig. 6. We can then measure all four ancilla bits. The bit of the syndrome that we are seeking is the parity of the four measured bits. In effect, we have copied from the data block to the ancilla some information about the error that occurred, and we read that information when we measure the ancilla.

But this procedure is still not adequate as it stands, because we have copied too much information. The circuit entangles the ancilla with the error that has occurred in the data, which is good, but it also entangles the ancilla with the encoded data itself, which is bad. The measurement of the ancilla destroys the carefully prepared superposition of basis states in the expressions Eqs. 6 and 7 for $|0\rangle_{\rm code}$ and $|1\rangle_{\rm code}$. For example, suppose we are measuring the first bit of the syndrome as in Fig. 2, but with the ancilla expanded from one bit to four. In effect, then, we are measuring the last four bits of the block. If we obtain the measurement result, say, $|0000\rangle_{\rm anc}$, then we have projected $|0\rangle_{\rm code}$ to $|0000000\rangle$ and $|1\rangle_{\rm code}$ to $|1110000\rangle$; the codewords have lost all protection against phase errors.

3.2 Preparing the Ancilla

We need to modify the recovery procedure further, preserving its good features while eliminating its bad features. We want to copy onto our ancilla the information about the errors in the data block, without feeding multiple phase errors into the data, and without destroying the coherence of the data. To meet this goal, we must prepare an appropriate state of the ancilla before the error syndrome computation begins. This state is chosen so the outcome of the ancilla measurement will reveal the information about the errors without revealing anything about the state of the data.

One way to meet this criterion was suggested by Peter Shor. The Shor state that he proposed is a state of four ancilla bits that is an equally weighted superposition of all even weight strings:

$$|{\rm Shor}\rangle_{\rm anc} = \frac{1}{\sqrt{8}} \sum_{{\rm even}\ v} |v\rangle_{\rm anc} . \qquad (16)$$
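As a small numerical check of this definition, the sketch below applies a Hadamard rotation to each qubit of the four-bit “cat state” $(|0000\rangle + |1111\rangle)/\sqrt{2}$, which is how the text (Sec. 3.3) says the Shor state is completed, and lets one confirm that the result has equal amplitude on the eight even weight strings and none on the odd ones. The dictionary representation and the name `hadamard_all` are assumptions made for this illustration only.

```python
from itertools import product
from math import sqrt

def hadamard_all(state, n):
    """Apply a Hadamard to each of n qubits.

    `state` maps bit tuples to real amplitudes; missing keys mean amplitude 0.
    """
    out = {}
    for x in product((0, 1), repeat=n):
        amp = 0.0
        for y, a in state.items():
            # Each Hadamard contributes a sign (-1)^(x_i * y_i).
            sign = -1 if sum(xi & yi for xi, yi in zip(x, y)) % 2 else 1
            amp += sign * a
        out[x] = amp / sqrt(2) ** n
    return out

cat = {(0, 0, 0, 0): 1 / sqrt(2), (1, 1, 1, 1): 1 / sqrt(2)}
shor = hadamard_all(cat, 4)

# Strings that carry nonzero amplitude in the rotated state.
support = sorted(x for x, a in shor.items() if abs(a) > 1e-12)
```

Here `support` contains exactly the eight even weight strings, each with amplitude $1/\sqrt{8}$.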
To compute each bit of the syndrome, we prepare the ancilla in a Shor state, perform four XOR gates (with appropriate qubits in the data block as the sources and the four bits of the Shor state as the targets), and then measure the ancilla state. If the syndrome bit we are computing is trivial, then the computation adds an even weight string to the Shor state, which leaves it unchanged; if the syndrome bit is nontrivial, the Shor state is transformed to the equally weighted superposition of odd weight strings. Thus, the parity of the measurement result reveals the value of the syndrome bit, but no other information about the state of the data block can be extracted from the measurement; we have found a way to extract the syndrome without damaging the codewords. (The particular string of specified parity that we find in the measurement is selected at random, and has nothing to do with the state of the data block.)

Figure 7: (a) The procedure for computing one bit of the bit-flip error syndrome, shown schematically. The Hadamard gate applied to the “cat state” completes the preparation of the Shor state, as discussed in Sec. 3.3. Both the XOR gate and the Hadamard gate in the diagram actually represent four gates performed in parallel. (b) The procedure for computing one bit of the phase-flip error syndrome, shown schematically. It is the same as (a), but applied to the data in the Hadamard rotated basis. (c) A circuit equivalent to (b), simplified by using the identity in Fig. 5.

There are altogether 6 syndrome bits (3 to diagnose bit-flip errors and 3 to diagnose phase-flip errors), so the syndrome measurement uses 24 ancilla bits prepared in 6 Shor states, and 24 XOR gates. One way to obtain the phase-flip syndrome would be to first apply 7 parallel $R$ gates to the data block to rotate the basis, then to apply the XOR gates as in Fig. 2 (but with the ancilla expanded into a Shor state), and finally to apply 7 $R$ gates to rotate the data back. However, we can use the identity represented in Fig. 5 to improve this procedure. By reversing the direction of the XOR gates (that is, by using the ancilla as the source and the data as the target), we can avoid applying the $R$ gates to the data, and hence can reduce the likelihood of damaging the data with faulty gates, as shown in Fig. 7.

Figure 8: Construction and verification of the Shor state. If the measurement outcome is 1, then the state is discarded and a new Shor state is prepared.

Another way to prepare the ancilla was proposed by Andrew Steane. His 7-qubit ancilla state is the equally weighted superposition of all Hamming codewords:

$$|{\rm Steane}\rangle_{\rm anc} = \frac{1}{4} \sum_{{\rm Hamming}\ v} |v\rangle_{\rm anc} . \qquad (17)$$

(This state can also be expressed as $(|0\rangle_{\rm code} + |1\rangle_{\rm code})/\sqrt{2}$, and can be obtained by applying the bitwise Hadamard rotation to the state $|0\rangle_{\rm code}$.) To compute the bit-flip syndrome, we XOR each qubit of the data block into the corresponding qubit of the ancilla, and measure the ancilla. Applying the Hamming parity-check matrix $H$ to the classical measurement outcome, we extract the bit-flip syndrome. As with Shor’s method, this procedure “copies” the data onto the ancilla, where the state of the ancilla has been carefully chosen to ensure that only the information about the error can be read by measuring the ancilla. For example, if there is no error, the particular string that we find in the measurement is a randomly selected Hamming codeword and tells us nothing about the state of the data. The same procedure is carried out in the rotated basis to find the phase-flip syndrome. The Steane method has the advantage over the Shor procedure that only 14 ancilla bits and 14 XOR gates are needed. But it also has the disadvantage that the ancilla preparation is more complex, so that the ancilla is somewhat more prone to error.
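The claim that the ancilla measurement reveals the syndrome and nothing else can be illustrated with classical bookkeeping. The sketch below is not a quantum simulation: it follows each computational-basis branch of the data and ancilla separately and collects the set of possible measurement results. It assumes the parity check matrix in the form of Eq. 15 (Fig. 2 uses the form of Eq. 1), and the function names are hypothetical.

```python
from itertools import product

# Hamming parity check in the form of Eq. 15 (an assumption for this sketch).
H = [
    (1, 0, 0, 1, 0, 1, 1),
    (0, 1, 0, 1, 1, 0, 1),
    (0, 0, 1, 1, 1, 1, 0),
]

def parity_check(word):
    """Apply H to a 7-bit string (mod 2)."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def xor(a, b):
    return tuple(x ^ y for x, y in zip(a, b))

hamming = [w for w in product((0, 1), repeat=7) if parity_check(w) == (0, 0, 0)]
even4 = [s for s in product((0, 1), repeat=4) if sum(s) % 2 == 0]

def shor_outcomes(data, row):
    """Possible 4-bit results when the data bits in the support of one row
    of H are XORed into a Shor state (one result per ancilla branch)."""
    support = [i for i, h in enumerate(row) if h]
    picked = tuple(data[i] for i in support)
    return {xor(s, picked) for s in even4}

def steane_outcomes(data):
    """Possible 7-bit results when the data block is XORed bitwise into a
    Steane state (one result per ancilla branch)."""
    return {xor(w, data) for w in hamming}

# A data block with a single bit flip on the sixth qubit.
error = (0, 0, 0, 0, 0, 1, 0)
steane_sets = {frozenset(steane_outcomes(xor(v, error))) for v in hamming}
```

For this error, `steane_sets` has a single element: the set of possible outcomes is the same for every codeword `v`, and applying `parity_check` to any outcome returns the syndrome of `error`. The same holds for `shor_outcomes`, where the parity of each 4-bit result equals the corresponding syndrome bit.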
3.3 Verifying the Ancilla
As we continue with our program to sniff out all the ways in which a recovery failure could result from a single error, we notice another potential problem. Due to error propagation, a single error that occurs during the preparation of the Shor state or Steane state could cause two phase errors in this state, and these can both propagate to the data if the faulty ancilla is used for syndrome measurement. Our procedure is not yet fault tolerant. Therefore the state of the ancilla must be tested for multiple phase errors before it is used. If it fails the test, it should be discarded, and a new ancilla state should be constructed.

One way to construct and verify the Shor state is shown in Fig. 8. The first Hadamard gate and the first three XOR gates in this circuit prepare a “cat state” $(|0000\rangle + |1111\rangle)/\sqrt{2}$, a maximally entangled state of the four ancilla bits; the final four Hadamard gates rotate the cat state to the Shor state. But a single error occurring during the second or third XOR could result in two errors in the cat state (it might become $(|0011\rangle + |1100\rangle)/\sqrt{2}$). These two bit-flip errors in the cat state become two phase errors in the Shor state, which will feed back to cause a block phase error during syndrome measurement. But we notice that for all the ways that a single bad gate could cause two bit-flip errors in the cat state, the first and fourth bit of the cat state will have different values. Therefore, we add the last two XOR gates to the circuit (followed by a measurement) to verify whether these two bits of the cat state agree. If verification succeeds, we can proceed with syndrome measurement secure in the knowledge that the probability of two phase errors in the Shor state is of order $\epsilon^2$. If verification fails, we can throw away the cat state and try again.

Of course, a single error in the preparation circuit could also result in two phase errors in the cat state and hence two bit-flip errors in the Shor state; we have made no attempt to check the Shor state for bit-flip errors. But bit-flip errors in the Shor state are much less troublesome than phase errors. Bit-flip errors cause the syndrome measurement to be faulty, but they do not feed back and damage the data.

If we use Steane’s method of syndrome measurement, we first employ the encoding circuit Fig. 3 (with the first two XOR gates eliminated) to construct $|0\rangle_{\rm code}$, and then apply a Hadamard gate to each qubit to complete the preparation of the Steane state. Again, a single error during encoding can cause two bit-flip errors in $|0\rangle_{\rm code}$, which become two phase errors in the Steane state, so that verification is required. We can verify by performing a nondestructive measurement of the state to ensure that it is $|0\rangle_{\rm code}$ (up to a single bit flip) rather than $|1\rangle_{\rm code}$.
Thus we prepare two blocks in the state $|0\rangle_{\rm code}$, perform a bitwise XOR from the first block to the second, and then measure the second block destructively. We can apply classical Hamming error correction to the measurement outcome, to correct one possible bit-flip error, and identify the measured block as either $|0\rangle_{\rm code}$ or $|1\rangle_{\rm code}$. If the result is $|0\rangle_{\rm code}$, then the other block has passed inspection. If the result is $|1\rangle_{\rm code}$, then we suspect that the other block is faulty, and we flip that block to fix it. However, this verification procedure is not yet trustworthy, because it might have been the block that we measured that was actually faulty, rather than the block we were trying to check. Hence we must repeat the verification step. If the measured block yields the same result twice in a row, the check may be deemed reliable. What if we get a different result the second time? Then we don’t know whether to flip the block we are checking or to leave it alone. We could try one more time, to break the tie, but this is not really necessary; in fact, if the two verification attempts give conflicting results, it is safe to do nothing. Because the results conflict, we know that one of the measured blocks was faulty. Therefore, the probability that the block to be checked is also faulty is of order $\epsilon^2$ and can be neglected. With this verification procedure, we have managed to construct a Steane state such that the probability of multiple phase errors (which would feed back to the data during syndrome measurement) is of order $\epsilon^2$.
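The cat-state check described at the start of this section can be explored by tracking bit flips through the preparation circuit. The sketch below rests on two loud assumptions, since Fig. 8 itself is not reproduced here: the three preparation XORs form a chain (qubit 0 to 1, 1 to 2, 2 to 3), and a faulty XOR leaves a bit flip on its source, its target, or both. Faults in the Hadamard gates and in the verification gates themselves are not modeled, and all names are illustrative.

```python
# Assumed layout: a chain of three XOR gates acting after a Hadamard on qubit 0.
PREP_XORS = [(0, 1), (1, 2), (2, 3)]

def propagate(flips, gates):
    """Push a pattern of bit flips through XOR gates: a flip on the source
    of a gate is copied onto its target."""
    flips = list(flips)
    for source, target in gates:
        flips[target] ^= flips[source]
    return tuple(flips)

def effective_weight(flips):
    """Flipping all four bits maps the cat state to itself, so a pattern and
    its complement describe the same damaged state."""
    w = sum(flips)
    return min(w, 4 - w)

def single_faults():
    """Yield (gate index, final flip pattern) for each bit-flip fault that one
    bad XOR can leave on its source, its target, or both."""
    for k, (s, t) in enumerate(PREP_XORS):
        for fs, ft in ((1, 0), (0, 1), (1, 1)):
            flips = [0, 0, 0, 0]
            flips[s], flips[t] = fs, ft
            yield k, propagate(flips, PREP_XORS[k + 1:])

def verification_bit(flips):
    """Result of comparing the first and fourth bits (1 means discard)."""
    return flips[0] ^ flips[3]

report = [
    (k, f, effective_weight(f), verification_bit(f)) for k, f in single_faults()
]
```

Under these assumptions, the entries of `report` with effective weight 2 all come from the second or third XOR, all correspond to the pattern `(0, 0, 1, 1)`, and all have verification bit 1, so the check rejects them. Some harmless single-flip patterns are rejected as well, which only costs a retry.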
3.4 Verifying the Syndrome
A single bit-flip error in the ancilla will result in a faulty syndrome. The error could arise because the ancilla was prepared incorrectly, or because an error occurred during the syndrome computation. The latter case is especially dangerous, because a single error, occurring with a probability of order $\epsilon$, could produce a fault in both the data block and the ancilla. This might happen because a bad XOR gate causes errors in both its source and target qubits, or because an error in the data block that occurred during syndrome measurement is later propagated forward to the ancilla by an XOR. In such cases, were we to accept the faulty syndrome and act to reverse the error, we would actually introduce a second error into the data block. So our procedure is still not fully fault tolerant; a scenario arising with a probability of order $\epsilon$ can fatally damage the encoded data.

We must therefore find a way to ensure that the syndrome is more reliable. The obvious way to do this is to repeat the syndrome measurement. It is not necessary to repeat if the syndrome measurement is trivial (indicates no error); though there actually might be an error in the data that we failed to detect, we need not worry that we will make things worse, because if we accept the syndrome we will take no action. If on the other hand the syndrome indicates an error, then we measure the syndrome a second time. If we obtain the same result again, it is safe to accept the syndrome and proceed with recovery, because there is no way occurring with a probability of order $\epsilon$ to obtain the same (nontrivial) faulty syndrome twice in a row. If the first two syndrome measurements do not agree, then we could continue to measure the syndrome until we finally obtain the same result twice in a row, a result that can be trusted. Alternatively, we could choose to do nothing, until an error is reliably detected in a future round of error correction. At least in that event we will not make things worse by compounding the error, and if there is really an error in the data, we will probably detect it next time.

(There are also other ways to increase our confidence in the syndrome. For example, instead of repeating the measurement of the entire syndrome, we could compute some additional redundant syndrome bits, and subject the computed bits to a parity check. If there is an error in the syndrome, this method will usually detect the error; thus if the parity check passes, the syndrome is likely to be correct [32, 24].)

Finally, we have assembled all the elements of a fault-tolerant error recovery procedure. If we take all the precautions described above, then recovery will fail only if two independent errors occur, so the probability of an error occurring that irrevocably damages the encoded block will be of order $\epsilon^2$. A complete quantum circuit for Steane’s error correction is shown in Fig. 9. Note that the bit-flip and phase-flip syndrome measurements are each performed twice. The verification of the Steane states is also shown, but the encoding of these states is suppressed in the diagram.

Figure 9: The complete circuit for Steane error recovery. Encoded $|0\rangle$’s are prepared, then verified. The verified $|0\rangle$’s are used as ancillas to compute the bit-flip and phase-flip syndromes, which are both measured twice. The large circles indicate actions that are taken (conditioned on measurement outcomes) to repair the ancilla states, or in the final step, to repair the data block.

3.5 Measurement and Encoding
We will of course want to be able to measure our encoded qubits reliably. But we have already noted in Sec. 2 that destructive measurement of the code block is reliable if only one qubit in the block has a bit-flip error. If the probability of a flawed measurement is of order $\epsilon$ for a single qubit, then faulty measurements of the code block occur with probability of order $\epsilon^2$. Fault-tolerant nondestructive measurement can also be performed, as we have already noted in our discussion (Sec. 3.3) of the verification of the Steane state. An alternative procedure would be to use the nondestructive measurement depicted in Fig. 4 without any modification. Though the ancilla is the target of three successive XOR gates, phase errors feeding back into the block are not so harmful because they cannot change $|0\rangle_{\rm code}$ to $|1\rangle_{\rm code}$ (or vice versa). However, since a single bit-flip error (in either the data block or the ancilla qubit) can cause a faulty parity measurement, the measurement must be repeated (after bit-flip error correction) to ensure accuracy to order $\epsilon^2$. (We eschewed this procedure in our description of the verification of the Steane state to avoid the frustration of needing error correction to prepare the ancilla for error correction!)

We will often want to prepare known encoded quantum states, such as $|0\rangle_{\rm code}$. We already discussed in Sec. 3.3 above (in connection with preparation of the Steane state) how this encoding can be performed reliably. In fact, the encoding circuit is not actually needed. Whatever the initial state of the block, (fault-tolerant) error correction will project it onto the space spanned by $\{|0\rangle_{\rm code}, |1\rangle_{\rm code}\}$, and (verified) measurement will project out either $|0\rangle_{\rm code}$ or $|1\rangle_{\rm code}$. If the result $|1\rangle_{\rm code}$ is obtained, then the (bitwise) NOT operator can be applied to flip the block to the desired state $|0\rangle_{\rm code}$.

If we wish to encode an unknown quantum state, then we use the encoding circuit in Fig. 3. Again, because of error propagation, a single error during encoding may cause an encoding failure. In this case, since no measurement can verify the encoding, the fidelity of the encoded state will inevitably be $F = 1 - O(\epsilon)$. However, encoding may still be worthwhile, since it may enable us to preserve the state with a reasonable fidelity for a longer time than if the state had remained unencoded.

3.6 Other Codes
Both Shor’s and Steane’s schemes for fault-tolerant syndrome measurement have been described here only for the 7-qubit code, but they can be adapted to more complex codes that have the capability to recover from many errors. Syndrome measurement for more general codes is best described using the code stabilizer formalism. In this formalism, which is discussed in more detail in the chapter by Andrew Steane in this volume, a quantum error-correcting code is characterized as the space of simultaneous eigenstates of a set of commuting operators (the stabilizer generators). Each generator can be expressed as a product of operators that act on a single qubit, where the single-qubit operators are chosen from the set $\{I, X, Y, Z\}$ defined in Eq. 5. Each generator squares to the identity and has equal numbers of eigenvectors with eigenvalue $+1$ and $-1$, so that specifying its eigenvalue reduces the dimension of the space by half. If there are $n$ qubits in a block, and there are $n - k$ generators, then the code subspace has dimension $2^k$; there are $k$ encoded qubits.
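As a quick consistency check of this counting, offered as a worked illustration: each independent generator halves the dimension, so starting from the full $2^n$-dimensional Hilbert space of the block and fixing the eigenvalues of all $n - k$ generators leaves

$$\dim({\rm code}) = 2^n \left(\tfrac{1}{2}\right)^{n-k} = 2^k .$$

For Steane’s 7-qubit code, with $n = 7$ and six generators, this gives $2^7 / 2^6 = 2$, the two-dimensional space of a single encoded qubit.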
For example, Steane’s 7-qubit code is the space for which the six stabilizer generators

$$\begin{aligned} M_1 &= (IIIZZZZ), & M_4 &= (IIIXXXX), \\ M_2 &= (IZZIIZZ), & M_5 &= (IXXIIXX), \\ M_3 &= (ZIZIZIZ), & M_6 &= (XIXIXIX) \end{aligned} \qquad (18)$$

all have eigenvalue one. Comparing to Eq. 1, we see that the space with $M_1 = M_2 = M_3 = 1$ is spanned by codewords that satisfy the Hamming parity check. Recalling that a Hadamard change of basis interchanges $Z$ and $X$, we see that the space with $M_4 = M_5 = M_6 = 1$ is spanned by codewords that satisfy the Hamming parity check in the Hadamard-rotated basis. Indeed, the defining property of Steane’s code is that the Hamming parity check is satisfied in both bases.

The stabilizer generators are chosen so that every error operator that is to be corrected (also expressed as a product of the one-qubit operators $\{I, X, Y, Z\}$), and the product of any two distinct such error operators, anticommutes with at least one generator. Thus, every error changes the eigenvalues of some of the generators, and two independent errors always change the eigenvalues in distinct ways. This means that we obtain a complete error syndrome by measuring the eigenvalues of all the stabilizer generators.^e

^e Actually, it is also acceptable if the product of two independent error operators lies in the stabilizer. Then these two errors will have the same syndrome, but it won’t matter, because the two errors can also be repaired by the same action. Quantum codes that assign the same syndrome to more than one error operator are said to be degenerate.

Measuring a stabilizer generator $M$ is not difficult. First we perform an appropriate unitary change of basis on each qubit so that $M$ in the rotated basis is a product of $I$’s and $Z$’s acting on the individual qubits. (We rotate by

$$R = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \qquad (19)$$

for each qubit acted on by $X$ in $M$, and by

$$R' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix} \qquad (20)$$

for each qubit acted on by $Y$.) In this basis, the value of $M$ is just the parity of the bits for which $Z$’s appear. We can measure the parity (much as we did in our discussion of the 7-qubit code) by performing an XOR to the ancilla from each qubit in the block for which a $Z$ appears in $M$. Finally, we invert the change of basis. This procedure is repeated for each stabilizer generator until the entire syndrome is obtained.

We can make this procedure fault-tolerant by preparing the ancilla in a Shor state for each syndrome bit to be measured, where the number of bits in the Shor state is the weight of the corresponding stabilizer generator (the number of one-qubit operators that are not the identity). Each ancilla bit is the target of only a single XOR, so that multiple phase errors do not feed back into the data. The procedures discussed above for verifying the Shor state and the syndrome measurement can also be suitably generalized.

For complex codes that either encode many qubits or can correct many errors, this generalized Shor method uses many more ancilla qubits and many more quantum gates than are really necessary to extract an error syndrome. We can do considerably better by generalizing the Steane method. In the case of the 7-qubit code, Steane’s idea was that we can use one 7-bit ancilla to measure all of $M_1$, $M_2$, and $M_3$; we prepare an initial state of the ancilla that is an equally weighted superposition of all strings that satisfy the Hamming parity check (i.e., all words in the classical Hamming code), perform the appropriate XOR’s from the data block to the ancilla, measure all ancilla qubits, and finally apply the Hamming parity check to the measurement result. The three parity bits obtained are the measured eigenvalues of $M_1$, $M_2$, and $M_3$.
The ancilla preparation has been chosen so that no other information aside from these eigenvalues can be extracted from the measurement result; hence the coherence of our quantum codewords is not damaged by the procedure. This procedure evidently can be adapted to the simultaneous measurement of any set of operators where each can be expressed as a product of I’s and 2’s acting on the individual qubits. Given a list of k such n-qubit operators, we obtain a matrix Hz with Ic rows and n columns by replacing each I in the list by 0 and each Z by 1. We prepare the ancilla as the equally weighted superposition of all length-n strings that obey the H z parity check. Proceeding with the XOR’s and the ancilla measurement (and applying H z to the measurement result), we project a block of n qubits onto a simultaneous eigenstate of the n operators. Performing the same procedure in the Hadamard-rotated basis, we can simultaneously measure any set of operators where each is a product of I’s and X’s. Among the stabilizer generators there also might be operators that have the form M = where 2 is a product of 2’s acting on one set of qubits, and
Zx,
236
Introduction to Quantum Computation and Information
x
is a product of X’s acting on another set of qubits. Since the generator M must square t o the identity, the number of qubits acted on by the product Y of Z and X must be even. Hence 2 and X commute, and so can be simultaneously measured by the method described above. However, this measurement would give too much information; we want to measure the product of 2 and X rather than measure each separately. To make the measurement we want, we must further modify the ancilla. The ancilla should not be chosen to satisfy both the H Z parity check and the corresponding H X parity check. Rather it is prepared so that the H z and H X parity bits are correlated - the ancilla is a sum over strings such that either both parity bits are trivial or both bits are nontrivial. After the ancilla measurement, we sum the parity of the “2 measurement” and the ‘‘Xmeasurement” to obtain the eigenvalue of M . But the separate parities of the 2 and ‘‘meas~rernents~~ are entirely random and actually reveal nothing about the values of 2 or Now we can describe Steane’s method in its general form that can be applied t o any stabilizer code. If k logical qubits are encoded in a block of n qubits, then there are n - k independent stabilizer generators. With a list of these generators we associate a matrix
x
R=(Hz
x.
I Hx)
(21)
that has n - k rows and 2n columns. The positions of the 1’s in H z indicate the qubits that are acted on by Z in the listed generators, and the 1’s in H X indicate the qubits acted on by X ; if a 1 appears in the same position in both H z and H x , then the product Y = ZX acts on that qubit. A 2n-qubit ancilla is prepared in the generalized Steane state - the equally weighted superposition of all of the strings that satisfy the H parity check. Then the quantum circuit shown in Fig. 10 is executed, the ancilla qubits are measured, and H is applied t o the measurement result. The parity bits found are the eigenvalues of the stabilizer generators, which provide the complete error syndrome. The ancilla preparation has been designed so that no other information other than the syndrome can be extracted from the measurement result, and therefore the coherence of the quantum codewords is not damaged by the procedure. Each qubit in the code block is acted on by only two quantum gates in this procedure, the minimum necessary to detect both bit-flip and phase errors afflicting any qubit. Finally, we note that a different strategy for performing fault-tolerant error correction was described by K i t a e ~ ?He ~ invented a family of quantum errorcorrecting codes such that many errors within the code block can be corrected, but only four XOR gates are needed to compute each bit of the syndrome. In this case, even if we use just a single ancilla qubit for the computation of
Fault- Tolerant Quantum Computation 237
Steane
LMe
Figure 10: Circuit for Steane syndrome measurement, shown schematically. A 2n-qubit Steane state is used to find the syndrome for an n-qubit data block. Each XOR gate in the diagram represents n XOR gates performed in parallel.
each syndrome bit (rather than an expanded ancilla state like a Shor or Steane state), only a limited number of errors can feed back from the ancilla into the data. The code can then be chosen such that the typical number of errors fed back into the data during the syndrome computation is comfortably less than the maximum number of errors that the code can tolerate.
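The binary description of stabilizer generators used in Steane's general method can be made concrete with a short sketch. The code below builds the matrix (H_Z | H_X) for the 7-qubit code and computes the syndrome of a Pauli error as a set of parity bits. The particular column ordering of the Hamming parity-check matrix and the function names (`steane_check_matrix`, `pauli_to_bits`, `syndrome`) are assumptions made for this illustration; it models only the classical bookkeeping, not the fault-tolerant measurement circuit of Fig. 10.

```python
import numpy as np

# Parity-check matrix of the classical [7,4] Hamming code.
# The column ordering is one common convention, assumed here for illustration.
H_HAMMING = np.array([
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
], dtype=int)


def steane_check_matrix():
    """Return the (n - k) x 2n binary matrix (H_Z | H_X) for the 7-qubit code."""
    zeros = np.zeros_like(H_HAMMING)
    z_type = np.hstack([H_HAMMING, zeros])  # generators that are products of Z's
    x_type = np.hstack([zeros, H_HAMMING])  # generators that are products of X's
    return np.vstack([z_type, x_type])


def pauli_to_bits(pauli):
    """Map a Pauli string such as 'IIXYZII' to the binary vector (z | x)."""
    z = np.array([c in "ZY" for c in pauli], dtype=int)
    x = np.array([c in "XY" for c in pauli], dtype=int)
    return np.concatenate([z, x])


def syndrome(check, error):
    """Bit i is 1 exactly when the error anticommutes with generator i."""
    n = check.shape[1] // 2
    h_z, h_x = check[:, :n], check[:, n:]
    e_z, e_x = error[:n], error[n:]
    # A Z in the generator meets an X in the error (and vice versa).
    return (h_z @ e_x + h_x @ e_z) % 2


check = steane_check_matrix()
print(check.shape)                                # n - k = 6 rows, 2n = 14 columns
print(syndrome(check, pauli_to_bits("IIIIXII")))  # bit flip on the fifth qubit
```

With this column ordering, the three Z-type parity bits for a single bit flip spell out the position of the flipped qubit in binary, which is the familiar property of the Hamming code.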
4 Fault-tolerant quantum gates
We have seen that coding can protect quantum information. But we want to do more than store quantum information with high fidelity; we want to operate a quantum computer that processes the information. Of course, we could decode, perform a gate, and then re-encode, but that procedure would temporarily expose the quantum information to harm. Instead, if we want our quantum computer to operate reliably, we must be able to apply quantum gates directly to the encoded data, and these gates must respect the principles of fault tolerance if catastrophic propagation of error is to be avoided.
4.1 The 7-qubit code
In fact, with Steane’s 7-qubit code, there are a number of gates that can be easily implemented. Three single-qubit gates can all be applied bitwise; that is, applying these gates to each of the 7 qubits in the block implements the same gate acting on the encoded qubit. We have already seen in Eq. 11 that the Hadamard rotation R acts this way. The same is true for the NOT gate (since each odd parity Hamming codeword is the complement of an even parity
Figure 11: The transversal XOR gate, shown schematically. By XOR’ing each bit of the source block into the corresponding bit of the target block, we implement an XOR acting on the encoded qubits. The gate implementation is fault tolerant because each qubit in both code blocks is acted on by a single gate.
Figure 12: The measurement circuit used in the ancilla preparation step of Shor’s implementation of the Toffoli gate.
Hamming codeword^f), and the phase gate
$P = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$   (22)
(the odd Hamming codewords have weight ≡ 3 (mod 4) and the even codewords have weight ≡ 0 (mod 4), so we actually apply P^{-1} bitwise to implement P). The XOR gate can also be implemented bitwise; that is, by XOR’ing each bit of the source block into the corresponding bit of the target block, as in Fig. 11. This works because the even codewords form a subcode, while the odd codewords are its nontrivial coset. Thus there are simple fault-tolerant procedures for implementing the gates NOT, R, P, and XOR. But unfortunately, these gates do not by themselves form a universal set. To be able to perform any desired unitary transformation acting on encoded data (to arbitrary precision), we will need to make a suitable
^f Actually, we can implement the NOT acting on the encoded qubit with just 3 NOT’s applied to selected qubits in the block.
Figure 13: The fault-tolerant Toffoli gate. Each line represents a block of 7 qubits, and the gates are implemented transversally. For each measurement, the arrow points to the set of gates that is to be applied if the measurement outcome is 1; no action is taken if the outcome is 0.
addition to this set. Following Shor,15 we will add the 3-qubit Toffoli gate, which is implemented by the procedure shown in Fig. 13.^g
Shor's construction of the fault-tolerant Toffoli gate has two stages. In the first stage, three encoded ancilla blocks are prepared in a state of the form
$|A\rangle_{\rm anc} = \frac{1}{2} \sum_{a=0,1} \sum_{b=0,1} |a, b, ab\rangle_{\rm anc}$ .   (23)
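As a quick sanity check on Eq. 23, the following sketch builds the ancilla state for three bare (unencoded) qubits and confirms that it is exactly what a Toffoli gate produces when it acts on two qubits in an equal superposition and a third qubit set to 0. In that sense the ancilla stores the action of the Toffoli gate in a state. The helper `ket` and the variable names are hypothetical; nothing here models the encoding or the fault-tolerant preparation.

```python
import numpy as np


def ket(bits):
    """Computational basis vector for a tuple of bits (first bit most significant)."""
    index = int("".join(str(b) for b in bits), 2)
    vec = np.zeros(2 ** len(bits))
    vec[index] = 1.0
    return vec


# |A> = (1/2) * sum over a, b of |a, b, ab>
state_A = 0.5 * sum(ket((a, b, a & b)) for a in (0, 1) for b in (0, 1))

# Toffoli gate as an 8 x 8 permutation matrix: it swaps |110> and |111>.
toffoli = np.eye(8)
toffoli[[6, 7]] = toffoli[[7, 6]]

# Hadamards on the first two qubits of |000> give (1/2) * sum |a, b, 0>.
hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
plus_plus_zero = np.kron(np.kron(hadamard, hadamard), np.eye(2)) @ ket((0, 0, 0))

print(np.allclose(toffoli @ plus_plus_zero, state_A))
```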
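In the second stage, the ancilla interacts with three data blocks to complete the execution of the gate.
First, we will describe how the ancilla is prepared. To begin with, each of the three ancilla blocks is encoded in the state |0⟩_code. Bitwise Hadamard gates are applied to all three blocks to prepare the encoded state
$\frac{1}{\sqrt{8}} \sum_{a=0,1} \sum_{b=0,1} \sum_{c=0,1} |a, b, c\rangle_{\rm anc}$ .   (24)
We note that this state can be expressed as
$\frac{1}{\sqrt{2}} \left( |A\rangle_{\rm anc} + |B\rangle_{\rm anc} \right)$ ,  $|B\rangle_{\rm anc} = \frac{1}{2} \sum_{a=0,1} \sum_{b=0,1} |a, b, \overline{ab}\rangle_{\rm anc} = {\rm NOT}_3 |A\rangle_{\rm anc}$ ,   (25)
^g Knill et al. 19,20 describe an alternative way of completing the universal set of gates.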
where NOT_3 denotes a NOT gate acting on the third encoded qubit. In the remainder of the ancilla preparation, the three blocks are measured in the {|A⟩_anc, |B⟩_anc} basis; if the outcome |A⟩_anc is obtained, the preparation is complete; if |B⟩_anc is obtained, NOT_3 is applied to complete the procedure.
Now we must explain how the {|A⟩_anc, |B⟩_anc} measurement is carried out. Schematically, the measurement is done with the circuit shown in Fig. 12, where the Z_AB gate (conditioned on a control bit) flips the relative phase of |A⟩_anc and |B⟩_anc. We can see from Eqs. 23 and 25 that, in terms of the values a, b, and c of the three ancilla blocks, Z_AB applies the phase (-1)^{ab+c}. If the control bit is denoted x, then the gates we need to apply are (-1)^{xab} and (-1)^{xc}, the product of a three-bit phase gate and a two-bit phase gate. But a three-bit phase gate is as hard as a Toffoli gate, so we seem to be stuck. However, we can get around this impasse if the control block is chosen to be not an encoded qubit, but instead a (verified) 7-bit “cat state”
$\frac{1}{\sqrt{2}} \left( |0000000\rangle + |1111111\rangle \right)$ .
We do already know how to construct fault-tolerant two-bit and one-bit phase gates transversally. These can be promoted to the three-bit and two-bit gates that we need by simply conditioning all of the bitwise gates in the construction on the corresponding bits of the cat state. Finally, we apply the bitwise Hadamard rotation to the cat state and measure its parity to complete the execution of the measurement circuit Fig. 12. (We obtain the circuit in Fig. 13 by noting that, if the cat state is in the Hadamard rotated basis, the three-bit phase gate can be expressed as a Toffoli gate with the cat state as target; therefore one bitwise Toffoli gate is executed in our implementation of the measurement circuit.) Of course, the measurement is repeated to ensure accuracy.
Meanwhile, three data blocks have been waiting patiently for the ancilla to be ready. By applying three XOR gates and a Hadamard rotation, the state of the data and ancilla is transformed as
$\sum_{a=0,1} \sum_{b=0,1} |a, b, ab\rangle_{\rm anc} \, |x, y, z\rangle_{\rm data} \to \sum_{a=0,1} \sum_{b=0,1} \sum_{w=0,1} (-1)^{wz} \, |a, b, ab \oplus z\rangle_{\rm anc} \, |x \oplus a, y \oplus b, w\rangle_{\rm data}$ .   (28)
Now each data block is measured. If the measurement outcome is 0, no action is taken, but if the measurement outcome is 1, then a particular set of gates is applied to the ancilla, as shown in Fig. 13, to complete the implementation
of the Toffoli gate. Note that the original data blocks are destroyed by the procedure, and that what were initially the ancilla blocks become the new data blocks.
The important thing about this construction is that all of the steps have been carefully designed to adhere to the principles of fault tolerance and minimize the propagation of error. Thus, two independent errors must occur during the procedure in order for two errors to arise in any one of the data blocks.
That the fault-tolerant gates form a discrete set is a bit of a nuisance, but it is also an unavoidable feature of any fault-tolerant scheme. It would not make sense for the fault-tolerant gates to form a continuum, for then how could we possibly avoid making an error by applying the wrong gate, a gate that differs from the intended one by a small amount? Anyway, since our fault-tolerant gates form a universal set, they suffice for approximating any desired unitary transformation to any desired accuracy.
4.2 Other codes
Shor 15 described how to generalize this fault-tolerant set of gates to more complex codes that can correct more errors, and Gottesman 17,35 has described an even more general procedure that can be applied to any of the quantum stabilizer codes.
Gottesman’s construction begins with the observation that for any stabilizer code, there are fault-tolerant implementations of the single qubit gates X and Z acting on each encoded qubit. For a stabilizer code with block size n, recall from Sec. 3.6 that any “error operator” M (expressed as a tensor product of n matrices chosen from {I, X, Y, Z}) can be written in the form Z·X (a product of Z’s times a product of X’s), and so can be uniquely represented as a binary string of length 2n. If there are k logical qubits encoded in the block, then the stabilizer of the code is generated by n - k such operators. The error operators that commute with all elements of the stabilizer themselves form a group. The generators of this group are represented by binary strings of length 2n that are required to satisfy n - k independent binary conditions; therefore, there are n + k independent generators. Of these, n - k are the generators of the stabilizer, but there are in addition 2k independent error operators that do not lie in the stabilizer, but do commute with the stabilizer. These 2k operators preserve the code subspace but act nontrivially on the codewords, and therefore they can be interpreted as operations that act on the encoded logical qubits.
In fact, these 2k operators can be chosen as the single qubit operations $\bar{Z}_i$ and $\bar{X}_i$, where i = 1, 2, 3, ..., k labels the encoded qubits. We first note
that the n - k stabilizer generators can be extended to a maximal commuting set of n operators; the k additional operators may be identified as the $\bar{Z}_i$’s. We can choose the computational basis states in the code subspace to be the simultaneous eigenstates of all the $\bar{Z}_i$’s, with the +1 eigenvalue corresponding to the value 0, and the -1 eigenvalue to the value 1. Then $\bar{Z}_i$ flips the phase of qubit i. We may choose the remaining k generators, denoted $\bar{X}_i$, which commute with the stabilizer but not with all of the $\bar{Z}_i$’s, to obey the relations
$\bar{Z}_i \bar{X}_j = (-1)^{\delta_{ij}} \bar{X}_j \bar{Z}_i$ .   (29)
Since $\bar{X}_i$ anticommutes with $\bar{Z}_i$, it flips the eigenvalue of $\bar{Z}_i$, and hence the value of qubit i. All of these operations are performed by applying at most one single-qubit gate to each qubit in the block; therefore, these operations are surely fault tolerant.
We have also seen in Sec. 3.6 that it is possible to perform a fault-tolerant measurement of any error operator Z·X, and so in particular to measure each $\bar{X}_i$, $\bar{Y}_i$, and $\bar{Z}_i$ fault tolerantly. Gottesman 17 has shown that, if it is possible to perform a Toffoli gate (which is universal for the classical computations that preserve the set of computational basis states), then the single qubit gates X and Z, together with the ability to measure X, Y, and Z for any qubit, suffice for universal quantum computation. Hence, if we can show that a fault-tolerant Toffoli gate can be constructed acting on any three qubits, we will have completed the demonstration that universal fault-tolerant quantum computation is possible with any stabilizer code.
The construction of a fault-tolerant Toffoli gate is rather complicated, so it is best to organize the demonstration this way: Gottesman showed that in any stabilizer code, it is possible to apply a fault-tolerant XOR gate to any pair of qubits (whether or not the two qubits reside in the same code block). Using the XOR gate, plus the single qubit gates and measurements that we have already seen can be implemented fault-tolerantly, one can show that all of the gates needed in Shor’s construction of the Toffoli gate can be constructed. Thus, the fault-tolerant construction of the Toffoli gate can be carried out using any stabilizer code, and universal fault-tolerant quantum computation can be achieved.
While in principle any stabilizer code can be used for fault-tolerant quantum computing, some codes are better than others. For example, there is a 5-qubit code that can recover from one error, and Gottesman 17 has exhibited a universal set of fault-tolerant gates for this code.
But the gate implementation is quite complex. The 7-qubit Steane code requires a larger block, but it is much more convenient for computation. 36,37
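Since the argument above rests on Shor's Toffoli construction from Sec. 4.1, it may help to check its logic numerically. The sketch below simulates the second stage on six bare (unencoded) qubits: the ancilla starts in the state of Eq. 23, three XOR gates and one Hadamard rotation are applied, the data qubits are measured, and conditional gates are applied to the ancilla. The correction gates used here were worked out for this sketch from the measurement outcomes and are an assumption; they are not read off Fig. 13. All function names are hypothetical, and no encoding or noise is modeled.

```python
import itertools
import numpy as np

HAD = np.array([[1, 1], [1, -1]]) / np.sqrt(2)


def xor(psi, control, target):
    """XOR gate: flip `target` on the part of the state where `control` is 1."""
    out = psi.copy()
    idx = [slice(None)] * psi.ndim
    idx[control] = 1
    # Fixing the control index removes one axis, shifting later axes down by one.
    axis = target - 1 if target > control else target
    out[tuple(idx)] = np.flip(psi[tuple(idx)], axis=axis)
    return out


def flip(psi, qubit):
    """NOT gate on one qubit."""
    return np.flip(psi, axis=qubit)


def phase_flip(psi, *qubits):
    """Multiply by -1 where all listed qubits are 1 (Z for one qubit, controlled-Z for two)."""
    out = psi.copy()
    idx = [slice(None)] * psi.ndim
    for q in qubits:
        idx[q] = 1
    out[tuple(idx)] *= -1
    return out


def hadamard(psi, qubit):
    """Hadamard rotation on one qubit."""
    return np.moveaxis(np.tensordot(HAD, psi, axes=([1], [qubit])), 0, qubit)


def toffoli(psi):
    """Reference Toffoli on a 3-qubit state: |x, y, z> -> |x, y, z XOR xy>."""
    out = psi.copy()
    out[1, 1, :] = psi[1, 1, ::-1]
    return out


def ancilla_state():
    """The state (1/2) * sum over a, b of |a, b, ab>."""
    state = np.zeros((2, 2, 2), dtype=complex)
    for a, b in itertools.product((0, 1), repeat=2):
        state[a, b, a & b] = 0.5
    return state


def toffoli_via_ancilla(data, outcomes):
    """Return the final ancilla state and the probability of the given outcomes."""
    m1, m2, w = outcomes
    psi = np.multiply.outer(ancilla_state(), data)  # axes 0-2: ancilla, 3-5: data
    psi = xor(psi, 0, 3)      # first ancilla qubit into first data qubit
    psi = xor(psi, 1, 4)      # second ancilla qubit into second data qubit
    psi = xor(psi, 5, 2)      # third data qubit into third ancilla qubit
    psi = hadamard(psi, 5)    # rotate the third data qubit before measuring it
    anc = psi[:, :, :, m1, m2, w]       # project onto the measured data values
    prob = np.linalg.norm(anc) ** 2
    anc = anc / np.sqrt(prob)
    if m1:                    # assumed corrections, derived for this sketch
        anc = flip(xor(anc, 1, 2), 0)
    if m2:
        anc = flip(xor(anc, 0, 2), 1)
    if w:
        anc = phase_flip(phase_flip(anc, 2), 0, 1)
    return anc, prob
```

For any normalized three-qubit input `data` of shape `(2, 2, 2)`, each of the eight outcome triples should occur with probability 1/8, and in every case the returned ancilla state should equal `toffoli(data)`. The original data register is consumed by the measurement, in line with the remark that the ancilla blocks become the new data blocks.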
5 The accuracy threshold for quantum computation
Quantum error-correcting codes exist that can correct t errors, where t can be arbitrarily large. If we use such a code and we follow the principles of fault-tolerance, then an uncorrectable error will occur only if at least t + 1 independent errors occur in a single block before recovery is completed. So if the probability of an error occurring per quantum gate, or the probability of a storage error occurring per unit of time, is of order ε, then the probability of an error per gate acting on encoded data will be of order ε^{t+1}, which is much smaller if ε is small enough.
Indeed, it may seem that by choosing a code with t as large as we please we can make the probability of error per gate as small as we please, but this turns out not to be the case, at least not for most codes. The trouble is that as we increase t, the complexity of the code increases sharply, and the complexity of the recovery procedure correspondingly increases. Eventually we reach the point where it takes so long to perform recovery that it is likely that t + 1 errors will accumulate in a block before we can complete the recovery step, and the ability of the code to correct errors is thus compromised.
Suppose that the number of computational steps needed to perform the syndrome measurement increases with t like a power t^b. Then the probability that t + 1 errors accumulate before the measurement is complete will behave like
Block Error Probability $\sim (t^b \epsilon)^{t+1}$ ,   (30)
where ε is the probability of error per step. We may then choose t to minimize the error probability ($t \sim e^{-1} \epsilon^{-1/b}$, assuming t is large), obtaining
Minimum Block Error Probability $\sim \exp\left( -e^{-1} b \, \epsilon^{-1/b} \right)$ .   (31)
Thus if we hope to carry out altogether T cycles of error correction without any error occurring, then our gates must have an accuracy
$\epsilon \sim (\log T)^{-b}$ .   (32)
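The optimization behind Eqs. 30 and 31 is easy to check numerically. The sketch below searches over integer t for the minimum of the estimate (t^b ε)^{t+1} and compares the result with the large-t predictions t ~ e^{-1} ε^{-1/b} and log(minimum) ~ -e^{-1} b ε^{-1/b}. The function names and the sample values ε = 10^{-8}, b = 4 are assumptions chosen for illustration; since the formulas are order-of-magnitude estimates, only rough agreement should be expected.

```python
import math


def log_block_error(t, eps, b):
    """Natural log of (t**b * eps)**(t + 1), the estimate of Eq. 30."""
    return (t + 1) * (b * math.log(t) + math.log(eps))


def best_t(eps, b, t_max=10000):
    """Integer t that minimizes the estimate, found by direct search."""
    return min(range(1, t_max), key=lambda t: log_block_error(t, eps, b))


def predicted_t(eps, b):
    """Large-t prediction for the optimal t."""
    return math.exp(-1) * eps ** (-1.0 / b)


def predicted_log_minimum(eps, b):
    """Large-t prediction for the log of the minimum, as in Eq. 31."""
    return -math.exp(-1) * b * eps ** (-1.0 / b)


eps, b = 1e-8, 4  # illustrative values, not taken from the text
t_star = best_t(eps, b)
print(t_star, predicted_t(eps, b))
print(log_block_error(t_star, eps, b), predicted_log_minimum(eps, b))
```

Working with logarithms avoids floating-point underflow, since the block error probabilities involved are far smaller than the smallest representable double.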
Similarly, if we hope to perform a quantum computation with altogether T quantum gates, elementary gates of this prescribed accuracy are needed. In the procedure originally described by Shor,15 the power characterizing the complexity of the syndrome measurement is b = 4, and somewhat smaller values of b can be achieved with a better optimized procedure. The block size of the code used grows with t like t^2 (for the codes that Shor considered), so when the code is chosen to optimize the error probability, the block size is of order
Figure 14: Concatenated coding. Each qubit in the block, when inspected at higher resolution, is itself an encoded subblock.
(log T)^2. Certainly, the scaling described by Eq. 32 is much more favorable than the accuracy ε ~ T^{-1} that would be required were coding not used at all. But for any given accuracy, there is a limit to how long a computation can proceed until errors become likely.
This limitation can be overcome by using a special kind of code, a concatenated code. To understand the concept of a concatenated code, imagine that we are using Steane's quantum error-correcting code that encodes a single qubit as a block of 7 qubits. But if we look more closely at one of the 7 qubits in the block with better resolution, we discover that it is not really a single qubit, but another block of 7 encoded using the same Steane code as before. And when we examine one of the 7 qubits in this block with higher resolution, we discover that it too is really a block of 7 qubits. And so on. (See Fig. 14.) If there are altogether L levels to this hierarchy of concatenation, then a single qubit is actually encoded in a block of size 7^L. The reason that concatenation is useful is that it enables us to recover from errors more efficiently, by dividing and conquering. With this method, the complexity of error correction does not grow so sharply as we increase the error-correcting capacity of our quantum
code.
We have seen that Steane’s 7-qubit code can recover from one error. If the probability of error per qubit is ε, the errors are uncorrelated, and recovery is fault-tolerant, then the probability of a recovery failure is of order ε^2. If we concatenate the code to construct a block of size 7^2, then an error occurs in the block only if two of the subblocks of size 7 fail, which occurs with a probability of order (ε^2)^2. And if we concatenate again, then an error occurs only if two subblocks of size 7^2 fail, which occurs with a probability of order ((ε^2)^2)^2. If there are altogether L levels of concatenation, then the probability of an error is of order ε^{2^L}, while the block size is 7^L.
Now, if the error rate for our fundamental gates is small enough, then we can improve the probability of an error per gate by concatenating the code. If so, we can improve the performance even more by adding another level of concatenation. And so on. This is the origin of the accuracy threshold for quantum computation: if coding reduces the probability of error significantly, then we can make the error rate arbitrarily small by adding enough levels of concatenation. But if the error rates are too high to begin with, then coding will make things worse instead of better.
To analyze this situation, we must first adopt a particular model of the errors, and I will choose the simplest possible quasi-realistic model: uncorrelated stochastic errors.^h In each computational time step, each qubit in the device becomes entangled with the environment as in Eq. 4, but where we assume that the four states of the environment are mutually orthogonal, and that the three “error states” have equal norms. Thus the three types of errors (bit flip, phase flip, both) are assumed to be equally likely. The total probability of error in each time step is denoted ε_store.
In addition to these storage errors that afflict the “resting” qubits, there will also be errors that are introduced by the quantum gates themselves. For each type of gate, the probability of error each time the gate is implemented is denoted ε_gate (with independent values assigned to gates of each type). If the gate acts on more than one qubit (XOR or Toffoli), correlated errors may arise. In our analysis, we make the pessimistic assumption that an error in a multi-qubit gate always damages all of the qubits on which the gate acts; e.g., a faulty XOR gate introduces errors in both the source qubit and the target qubit. This assumption (among others) is made only to keep the analysis tractable. Under more realistic assumptions, we would find that somewhat higher error rates could be tolerated.
We can analyze the efficacy of concatenated coding by constructing a set of concatenation flow equations that describe how our error model evolves as we proceed from one level of concatenation to the next. For example, suppose
^h I will characterize the error model in more detail in Sec. 6.
we want to perform an XOR gate followed by an error recovery step on qubits encoded using the concatenated Steane code with L levels of concatenation (block size 7^L). The gate implementation and recovery can be described in terms of operations that act on subblocks of size 7^{L-1}. Thus, we can derive an expression for the probability of error ε^{(L)} for a gate acting on the full block in terms of the probability of error for gates acting on the subblocks. This expression is one of the flow equations. In principle, we can solve the flow equations to find the error probabilities “at level L” in terms of the parameters of the error model for the elementary qubits. We then study how the error probabilities behave as L becomes large. If all block error probabilities approach zero for L large, then the elementary error probabilities are “below the threshold.” Since our elementary error model may have many parameters, the threshold is really a codimension one surface in a high-dimensional space.
Steane’s method of syndrome measurement is particularly well suited for concatenated coding. All of the gates in the recovery circuit Fig. 9 can be executed transversally; if we perform the gates on the elementary qubits in the code block, then we are executing the very same gates on the encoded information in each block of size 7, each superblock of size 7^2, and so on. Similarly, when we measure the elementary qubits in the ancilla at the conclusion of the syndrome computation, then (after applying classical Hamming error correction to the qubits) we have also measured the encoded qubit in each block of 7, and (after applying Hamming error correction to the blocks) each superblock of 7^2, etc. We see then that the quantum data processing needed to extract a syndrome can be carried out at all levels of the concatenated code simultaneously.^i
After some relatively straightforward classical processing, we determine what single qubit gates need to be applied to all the elementary qubits in order to complete the recovery step on all levels at once.
Thus it is easy to see (at least conceptually) how the accuracy threshold can be estimated.38 At each level of the concatenated code, a block of 7 fails if there are errors in at least two of the subblocks that it contains. If p_L is the probability of an error in a block at level L, then the probability of an error in a block at level L + 1 is
$p_{L+1} \sim \binom{7}{2} p_L^2 + \cdots = 21 \, p_L^2 + \cdots$   (33)
(neglecting the terms of higher order in p_L), which will be smaller than p_L if p_L < 1/21. Therefore, if each elementary qubit has a probability of error
^i A destructive measurement of the encoded ancilla block can be carried out at all levels simultaneously. The procedure for measuring the block nondestructively (projecting the block onto |0⟩_code or |1⟩_code) is much more laborious; it must be carried out on one level at a time.
p_0 < 1/21, the error probability will be smaller at level 1, still smaller at level 2, and so on; the threshold value of p_0 is 1/21.
Suppose that we perform error correction every time we execute an XOR or single qubit gate. Roughly speaking, p_0 is the probability of error per data qubit when a cycle of error correction begins. To estimate the accuracy threshold, we follow the circuit Fig. 9 and add up the contributions to p_0 due to errors (including possible storage errors) that arose during recently executed quantum gates and that have not already been eliminated in a previous error correction cycle. We obtain an expression for p_0 in terms of the gate error and storage error probabilities that we can equate to 1/21 to find the threshold. Proceeding this way, assuming that storage errors are negligible, and that each single-qubit or XOR gate has the same error probability ε_gate, we 38 crudely estimate the threshold gate error rate as
$\epsilon_{\rm gate,0} \sim$ […]   (34)
Similarly, if gate errors are negligible, the estimated threshold storage error rate is
$\epsilon_{\rm store,0} \sim 6 \cdot$ […] .   (35)
The thresholds for gate and storage errors are essentially the same because the Steane method is well optimized for dealing with storage errors. The qubits are rarely idle; a gate acts on each one in almost every step. Hence, the storage accuracy requirement is considerably less stringent than in previous threshold estimates based on the Shor recovery method.
However, a more thorough analysis shows that, for several reasons, the actual threshold that can be inferred from the circuit Fig. 9 is somewhat lower than the estimates Eq. 34 and Eq. 35. The most serious caveat is that to perform recovery we must have a supply of well verified |0⟩_code states encoded at level L. A separate (and rather complicated) calculation is required to determine the threshold for reliable encoding. We also need to analyze Shor's implementation of the Toffoli gate to ensure that highly reliable Toffoli gates can be executed on the concatenated blocks.^j Finally, we must bound the higher-order contributions to the failure probability that have been dropped in Eq. 33 to obtain a rigorous result. The full analysis for this case has not yet been completed, but it seems conservative to guess that the final values of the storage and gate thresholds will exceed […]. Of course, it is possible
^j The elementary Toffoli gates are not required to be as accurate as the one and two-body gates - a Toffoli gate error rate of order 10^{-3} is acceptable, if the other error rates are sufficiently small. This finding is welcome, since Toffoli gates are more difficult to implement, and are likely to be less accurate in practice.
that with a better coding scheme and/or error recovery protocol a much higher value of the accuracy threshold could be established.
We should also ask how large a block size is needed to ensure a certain specified accuracy. Roughly speaking, if the threshold gate error rate is ε_0 and the actual elementary gate error rate is ε < ε_0, then concatenating the code L times will reduce the error rate to
$\epsilon^{(L)} \sim \epsilon_0 \left( \frac{\epsilon}{\epsilon_0} \right)^{2^L}$ ;   (36)
thus, to be reasonably confident that we can complete a computation with T gates without making an error we must choose the block size 7^L to be
block size $\sim \left[ \frac{\log \epsilon_0 T}{\log \epsilon_0 / \epsilon} \right]^{\log_2 7}$ .   (37)
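The threshold argument and the block-size estimate can be illustrated with a few lines of arithmetic. The sketch below iterates the leading-order flow of Eq. 33, compares it with the exact probability that at least two of seven independent subblocks fail, and finds the number of concatenation levels at which the estimate of Eq. 36 drops below 1/T. The function names and the sample values (ε_0 = 10^{-4}, ε = 10^{-5}, T = 10^9) are assumptions chosen for illustration, not figures taken from the text.

```python
import math


def next_level(p):
    """Leading-order flow of Eq. 33: a block fails if two of its 7 subblocks fail."""
    return math.comb(7, 2) * p ** 2


def exact_two_or_more(p, n=7):
    """Exact probability that at least two of n independent subblocks fail."""
    return 1 - (1 - p) ** n - n * p * (1 - p) ** (n - 1)


def levels_needed(eps, eps0, total_gates):
    """Smallest L with eps0 * (eps / eps0)**(2**L) below 1 / total_gates (Eq. 36)."""
    if eps >= eps0:
        raise ValueError("the error rate must be below the threshold")
    level = 0
    while eps0 * (eps / eps0) ** (2 ** level) >= 1.0 / total_gates:
        level += 1
    return level


def block_size(level):
    """Block size of the concatenated 7-qubit code with the given number of levels."""
    return 7 ** level


# Below 1/21 the flow shrinks the error probability; above it, the error grows.
for p0 in (0.01, 1 / 21, 0.06):
    print(p0, next_level(p0), exact_two_or_more(p0))

levels = levels_needed(eps=1e-5, eps0=1e-4, total_gates=1e9)
print(levels, block_size(levels))
```

Because the error estimate falls doubly exponentially in L while the block size grows only as 7^L, a modest number of levels suffices once ε is safely below ε_0; that is the content of the polylogarithmic scaling in Eq. 37.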
If the code that is concatenated has block size n and can correct t errors, the power log_2 7 ≈ 2.8 in Eq. 37 is replaced by log n / log(t + 1); this power approaches 2 for the family of codes considered by Shor, but could in principle approach 1 for codes.
When the error rates are below the accuracy threshold, it is also possible to maintain an unknown quantum state for an indefinitely long time. However, as we have already noted in Sec. 3.5, if the probability of a storage error per computational time step is ε, then the initial encoding of the state can be performed with a fidelity no better than F = 1 - O(ε). With concatenated coding, we can store unknown quantum information with reasonably good fidelity for an indefinitely long time, but we cannot achieve arbitrarily good fidelity.
Concatenation is an important theoretical construct, since it enables us to establish that arbitrarily long computations are possible. But unless the error rates are quite close to the threshold values, concatenated coding may not be the best way to perform a particular computation of given length. Indeed, a code chosen from the family originally described by Shor may turn out to be more efficient than the concatenated 7-bit code. Furthermore, the concatenated 7-bit code and Shor’s codes encode just a single qubit of quantum information in a rather large code block. But we saw in Sec. 4 that fault-tolerant quantum computation can be carried out using any stabilizer code, including codes that make more efficient use of storage space by encoding many qubits in a single block. If the reliability of our hardware is close to the accuracy threshold, then efficient codes will not work effectively. But as the hardware improves, we can use better codes, and so enhance the reliability of our quantum computer at a smaller cost in storage space.
6 Error models
A fault-tolerant scheme should be tailored to protect against the types of errors that are most likely to afflict a particular device. And any statement about what error rates are acceptable (like the estimate of the accuracy threshold that we have just outlined) is meaningless unless a model for the errors is carefully specified. Let's summarize some of the important assumptions about the error model that underlie our estimate of the accuracy threshold:
• Random errors. We have assumed that the errors have no systematic component.^k Errors that have random phases accumulate like a random walk, so that the probability of error accumulates roughly linearly with the number of gates applied. But if the errors have systematic phases, then the error amplitude can increase linearly with the number of gates applied. Hence, for our quantum computer to perform well, the rate for systematic errors must meet a more stringent requirement than the rate for random errors. Crudely speaking, if we assume that the systematic phases always conspire to add constructively, and if the accuracy threshold is ε_0 for the case of random errors, then the accuracy threshold would be of order (ε_0)^2 for (maximally conspiratorial) systematic errors. While systematic errors may pose a challenge to the quantum engineers of the future, they ought not to pose an insuperable obstacle. My attitude is that (1) even if our hardware is susceptible to making errors with systematic phases, these will tend to cancel out in the course of a reasonably long computation,40-42 and (2) since systematic errors can in principle be understood and eliminated, from a fundamental point of view it is more relevant to know the limitations on the performance of the machine that are imposed by the random errors.
• Uncorrelated errors. We have assumed that the errors are both spatially and temporally uncorrelated with one another. Thus when we say that the probability of error per qubit is (for example) ε ~ 10^{-5}, we actually mean that, given two specified qubits, the probability that errors afflict both is ε^2 ~ 10^{-10}. This is a very strong assumption. The really crucial requirement is that correlated errors affecting multiple qubits in the same code block are highly unlikely, since our coding schemes will fail if several errors occur in a single block. Future quantum engineers will face the challenge of designing devices such that qubits in the same block are very well isolated from one another.
^k Knill et al. 19 have demonstrated the existence of an accuracy threshold for much more general error models.
• Maximal parallelism. We have assumed that many quantum gates can be executed in parallel in a single time step. This assumption enables us to perform error recovery in all of our code blocks at once, and so is critical for controlling qubit storage errors. (Otherwise, if we added a level of concatenation to the code, each individual resting qubit would have to wait longer for us to get around to performing recovery, and would be more likely to fail.) If we ignore storage errors, then parallel operation is not essential in the analysis of the accuracy threshold, but it would certainly be desirable to speed up the computation.
• Error rate independent of number of qubits. We have assumed that the error rates do not depend on how many qubits are stored in our device. Implicitly, this is an assumption about the nature of the hardware. For example, it would not be a reasonable assumption if all of the qubits were stored in a single ion trap, and all shared the same phonon bus.43
• Gates can act on any pair of qubits. We have assumed that our machine is equipped with a set of fundamental gates that can be applied to any pair of stored qubits (or triplet of qubits, in the case of the Toffoli gate), irrespective of their proximity. In practice, there is likely to be a cost, both in processing time and error rate, of shuttling the qubits around so that a gate can act effectively on a particular pair. We leave it to the machine designer to choose an architecture that minimizes this cost. If gates can act only on neighboring qubits, there will still be a threshold,^l but it will be considerably lower.
• Fresh ancilla qubits. We have assumed that our computer has access to an adequate supply of fresh ancillary qubits. The ancilla qubits are used both to implement (Toffoli) gates and to perform error recovery. As the effects of random errors accumulate, entropy is generated, and the error recovery process flushes the entropy from the computing device into the ancilla registers. In principle, the computation can proceed indefinitely as long as fresh ancilla qubits are provided, but in practice we will want to clear the ancilla and reuse it. Erasing the ancilla will necessarily dissipate power and generate heat; thus cooling of the device will be required.
• No leakage errors. We have ignored the possibility of leakage. In our model of a quantum computer, each of our qubits lives in a two-dimensional Hilbert space, and we assumed that, when an error occurs, this qubit either becomes entangled with the environment or rotates in
Fault- Tolemnt Quantum Computation 251
Ancilla (0) -Measure Figure 15: A quantum leak detection circuit. Assuming that the XOR gate acts trivially if the data has leaked, then the outcome of the measurement is 0 if leakage has occurred, 1 otherwise.
the two-dimensional space in an unpredictable way. But there is another possible type of error, in which the qubit leaks out of the two-dimensional space into a larger spaceP4 To control leakage errors, we can repeatedly interrogate each qubit to test for leakage (for example, using the leakagedetection circuit shown in Fig. 15), without trying t o diagnose exactly what happened t o the leaked q ~ b i t If ? ~leakage has occurred, the qubit is damaged and must be discarded; we replace it with a fresh qubit in a standard state, say the state 10). Then we can perform conventional syndrome measurement, which will project the qubit onto a state such that the error can be reversed by a simple unitary transformation.m If concatenated coding is used, leakage detection need be implemented only at the lowest coding level. The detection circuit is quite simple, so allowing leakage errors does not have much effect on the accuracy threshold. The assumptions of our error model are sufficiently realistic t o provide reasonable guidance concerning how well a quantum computer can perform under noisy conditions. Suppose, for example, that we want our quantum computer to solve a hard factoring problem using Shor’s algorithm; what specifications must be met by the machine? With the best known classical factoring algorithm and the fastest existing machines, it takes a few months t o factor a 130 digit (432-bit) numberP6 To perform this task with Shor’s algorithm, we would need t o be able to store about 5.432 = 2160 qubits and t o perform about 38. (432)3 3 . lo9 Toffoli gated7. To have a reasonable chance of performing the computation with acceptable accuracy, we would want the probability of error per Toffoli gate to be less than about and the probability of a storage error per gate execution time to be less than about
[l] Of course, we can recycle it later.
[m] In fact, since we know before the syndrome measurement that the damaged qubit is in a particular position within the code block, we can apply a streamlined version of error correction designed to diagnose and reverse the error at that known position.
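The leak test of Fig. 15 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the gate sequence used here (XOR, NOT on the data, XOR, NOT on the data) is one circuit with the property described in the caption and is not claimed to be the exact circuit drawn in the figure; the label `LEAK` and the helper names are hypothetical, and every gate is assumed to act trivially on a leaked level.

```python
from math import sqrt

LEAK = 2  # hypothetical label for any level outside the qubit subspace {0, 1}

def xor_gate(data, anc):
    """XOR (controlled-NOT) from data onto ancilla; assumed trivial on a leaked level."""
    if data == 1:
        anc ^= 1
    return data, anc

def not_gate(data, anc):
    """NOT on the data qubit; assumed to leave a leaked level untouched."""
    if data in (0, 1):
        data ^= 1
    return data, anc

# Assumed gate sequence: exactly one of the two XORs fires for an unleaked qubit.
CIRCUIT = (xor_gate, not_gate, xor_gate, not_gate)

def run(state):
    """Apply the circuit to a state given as {(data, ancilla): amplitude}."""
    for gate in CIRCUIT:
        new_state = {}
        for (data, anc), amp in state.items():
            key = gate(data, anc)
            new_state[key] = new_state.get(key, 0) + amp
        state = new_state
    return state

def ancilla_one_probability(state):
    """Probability that measuring the ancilla yields 1."""
    return sum(abs(amp) ** 2 for (_, anc), amp in state.items() if anc == 1)

# An unleaked qubit in a superposition, and a leaked one, each with ancilla |0>.
healthy_out = run({(0, 0): 1 / sqrt(2), (1, 0): 1j / sqrt(2)})
leaked_out = run({(LEAK, 0): 1.0})

print(ancilla_one_probability(healthy_out))  # ancilla reads 1: no leak
print(ancilla_one_probability(leaked_out))   # ancilla reads 0: leak detected
print(healthy_out)                           # data amplitudes are undisturbed
```

In this sketch the ancilla ends in the same state for both data basis states, so the test reveals nothing about an unleaked qubit's encoded value; it only flags whether the qubit is still in the two-dimensional space.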
According to the concatenation flow equations for the 7-qubit code, [n] these error rates can be achieved for the encoded data if the error rates at the level of individual qubits are $\epsilon_{\rm store} \sim \epsilon_{\rm gate} \sim 10^{-6}$, and if 3 levels of concatenation are used, so that the size of the block encoding each qubit is $7^3 = 343$. Allowing for the additional ancilla qubits needed to implement gates and (parallelized) error correction, the total number of qubits required in the machine would be of order $10^6$.
When the storage error rate is fairly high, concatenation may be the most effective coding procedure. But if gate errors dominate (and if the gate error rate is not too close to the threshold), then other quantum codes give a better performance. For example, Steane [48] found that this same factoring problem could be solved by a quantum computer with $4 \cdot 10^5$ qubits and a gate error rate of order $10^{-5}$, using a code with block size 55 that can correct 5 errors. At lower error rates it is possible to use codes that make more efficient use of storage space by encoding many qubits in a single block.
Surely, a quantum computer with about a million qubits and an error rate per gate of about one in a million would be a very powerful and valuable device (assuming a reasonable processing speed). Of course, from the perspective of the current state of the technology, these numbers seem daunting. But in fact a machine that meets far less demanding specifications may still be very useful. [53] First of all, quantum computers can do other things besides factoring, and some of these other tasks (in particular quantum simulation [54]) might be accomplished with a less reliable or smaller device. Furthermore, our estimate of the accuracy threshold might be too conservative for a number of reasons. For example, the estimate was obtained under the assumption that phase and amplitude errors in the qubits are equally likely. With a more realistic error model better representing the error probabilities in an actual device, the error correction scheme could be better tailored to the error model, and a higher error rate could be tolerated. Also, even under the assumptions stated, the fault-tolerant scheme has not been definitively analyzed; with a more refined analysis, one can expect to find a somewhat higher accuracy threshold, perhaps considerably higher. Substantial improvements might also be attained by modifying the fault-tolerant scheme, either by finding a more efficient way to implement a universal set of fault-tolerant gates, or by finding a more efficient means of carrying out the measurement of the error syndrome. With various improvements, it would not be surprising [o] to find that a quantum computer could work effectively with a probability of error per gate, say, of order $10^{-3}$. An error rate of, say, $10^{-5}$ is surely ambitious, but not, perhaps, beyond the scope of what might be achievable in the future.
In any case, we now have a fair notion of how good the performance of a useful quantum computer will need to be. And that in itself represents enormous progress over just two years ago.
[n] This analysis [23] was actually carried out for the Shor method of syndrome measurement, rather than the Steane method invoked in our discussion of concatenated coding in Sec. 5.
[o] In fact, estimates of the accuracy threshold that are more optimistic than mine have been put forward by Zalka. [24]
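The counting behind the factoring example is simple enough to reproduce. The following is a minimal sketch that only restates the arithmetic quoted in the text (about 5 qubits per bit, about 38 Toffoli gates per cubed bit, and block size 7 per level of concatenation); the function name `shor_resources` and the dictionary keys are hypothetical, and the ancilla overhead that brings the total to order $10^6$ is deliberately not modeled.

```python
def shor_resources(bits, levels, block=7):
    """Restate the resource counts quoted in the text for a `bits`-bit number."""
    logical_qubits = 5 * bits          # qubits stored by Shor's algorithm
    toffoli_gates = 38 * bits ** 3     # Toffoli gates executed
    block_size = block ** levels       # physical qubits per encoded qubit
    return {
        "logical_qubits": logical_qubits,
        "toffoli_gates": toffoli_gates,
        "block_size": block_size,
        # Data qubits only; ancillas for gates and recovery come on top.
        "data_qubits": logical_qubits * block_size,
    }

est = shor_resources(bits=432, levels=3)
for name, value in est.items():
    print(f"{name:>15}: {value:,}")
```

The data blocks alone already account for roughly three quarters of a million physical qubits, which is consistent with a total of order $10^6$ once ancilla qubits are included.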
7 Topological Quantum Computation
7.1 Aharonov-Bohm Phenomena and Superselection Rules
Now that we know that quantum error correction is possible, it is important to broaden our perspective - we should strive to go beyond the analysis of abstract circuits and explore the potential physical contexts in which quantum information might be reliably stored and manipulated. In particular, we might hope to design quantum gates that are intrinsically fault tolerant, so that active intervention by the computer operator will not be required to protect the machine from noise. A significant step toward this goal has been taken recently by Alexei Kitaev; [25] this section is based on his ideas.
Topological concepts have a natural place in the discussion of quantum error correction and fault-tolerant computation. Topology concerns the “global” properties of an object that remain unchanged when we deform the object locally. The central idea of quantum error correction is to store and manipulate quantum information in a “global” form that is resistant to local disturbances. A fault-tolerant gate should be designed to act on this global information, so that the action it performs on the encoded data remains unchanged even if we deform the gate slightly; that is, even if the implementation of the gate is not perfect.
In seeking physical implementations of fault-tolerant quantum computation, then, we ask whether there are known systems in which physical interactions have a topological character. Indeed, topology is the essence of the Aharonov-Bohm effect. If an electron is transported around a perfectly shielded magnetic solenoid, its wave function acquires a phase $e^{ie\Phi}$, where $e$ is the electron charge and $\Phi$ is the magnetic flux enclosed by the solenoid. This Aharonov-Bohm phase is a topological property of the path traversed by the electron - it depends only on how many times the electron circumnavigates the solenoid, and is unchanged when the path is smoothly deformed. (See Fig. 16.)
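The topological character of the Aharonov-Bohm phase can be checked numerically. In this sketch the solenoid sits at the origin of the complex plane, a closed path is a list of complex points, and the phase is $e^{ie\Phi n}$ with $n$ the winding number of the path. The function names, the sample paths, and the value 0.7 chosen for the product $e\Phi$ are assumptions made for the example.

```python
import cmath
import math

def winding_number(path):
    """Winding number about the origin of a closed path of complex points.

    Sums the angle swept between consecutive points; each step is assumed
    to sweep less than half a turn, and the path must avoid the origin.
    """
    total = 0.0
    for z0, z1 in zip(path, path[1:] + path[:1]):
        total += cmath.phase(z1 / z0)
    return round(total / (2 * math.pi))

def ab_phase(path, e_phi):
    """Aharonov-Bohm phase exp(i * e*Phi * n) for a path winding n times."""
    return cmath.exp(1j * e_phi * winding_number(path))

n = 400
angles = [2 * math.pi * k / n for k in range(n)]
circle = [cmath.exp(1j * t) for t in angles]                           # encircles once
wobbly = [(1 + 0.3 * math.sin(7 * t)) * cmath.exp(1j * t) for t in angles]  # deformed
twice = [cmath.exp(2j * t) for t in angles]                            # encircles twice
outside = [3 + z for z in circle]                                      # misses the solenoid

for name, path in [("circle", circle), ("wobbly", wobbly),
                   ("twice", twice), ("outside", outside)]:
    print(name, winding_number(path), ab_phase(path, e_phi=0.7))
```

The circle and its deformed version give the same phase, the doubly wound path gives the square of that phase, and the path that does not enclose the solenoid gives no phase at all.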
We are thus led to contemplate a realization of quantum computation in which information is encoded in a form that can be measured and manipulated through Aharonov-Bohm interactions - topological interactions that are immune to local disturbances.
Figure 16: A topological interaction. The Aharonov-Bohm phase acquired by an electron that encircles a flux tube remains unchanged if the electron’s path is slightly deformed.
It is useful to reexpress this reasoning in the language of superselection rules. A superselection rule, as I am using the term here, arises (in a field theory or spin system defined in an infinite spatial volume) if Hilbert space decomposes into mutually orthogonal sectors, where each sector is preserved by any local operation. Perhaps the most familiar example is the charge superselection rule in quantum electrodynamics. An electric charge has an infinite range electric field. Therefore no local action can create or destroy a charge, for to destroy a charge we must also destroy the electric field lines extending to infinity, and no local procedure can accomplish this task.
The Aharonov-Bohm interaction is also an infinite range effect; the electron acquires an Aharonov-Bohm phase upon circling the solenoid no matter what its distance from the solenoid. So we may infer that no local operation can destroy a charge that participates in Aharonov-Bohm phenomena. If we consider two objects carrying such charges, widely separated and well isolated from other charged objects, then any process that changes the charge on either object would have to act coherently in the whole region containing the two charges. Thus, the charges are quite robust in the presence of localized disturbances; we can strike the particle with a hammer or otherwise abuse it without modifying the charges that it carries.
Following Kitaev, [25] we may envision a topological quantum computer, a device in which quantum information is encoded in the quantum numbers carried by quasiparticles that reside on a two-dimensional surface and have long-range Aharonov-Bohm interactions with one another. At zero temperature, an accidental exchange of quantum numbers between quasiparticles (an error) arises only due to quantum tunneling phenomena involving the virtual exchange of charged objects. The amplitude for such processes is of the order of $e^{-mL}$, where $m$ is the mass of the lightest charged object (in natural units), and $L$ is the distance between the two quasiparticles. If the quasiparticles are kept far apart, the probability of an error afflicting the encoded information will be extremely low.
At finite temperature $T$, there is an additional source of error, because an uncontrolled plasma of charged particles will inevitably be present, with a density proportional to the Boltzmann factor $e^{-\Delta/T}$, where $\Delta$ is the mass gap (not necessarily equal to the “curvature mass” $m$). Sometimes one of the plasma particles will slip unnoticed between two of our data-carrying particles, resulting in an exchange of charge and hence an error. To achieve an acceptably low error rate, then, we would need to keep the temperature well below the gap $\Delta$ (or else we would have to monitor the thermal plasma very faithfully).
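To get a feeling for the two exponential suppression factors just described, one can tabulate them. This is a rough sketch under assumptions that go beyond the text: all prefactors are set to one, the tunneling error probability is taken to be the squared amplitude $e^{-2mL}$, and the function names and target values are hypothetical.

```python
import math

def tunneling_error_probability(m, L):
    """Squared tunneling amplitude exp(-m*L), with the prefactor set to 1 (assumption)."""
    return math.exp(-2.0 * m * L)

def separation_for(m, target_probability):
    """Separation L (in units of 1/m) at which exp(-2*m*L) equals the target."""
    return math.log(1.0 / target_probability) / (2.0 * m)

def thermal_density_factor(gap, T):
    """Boltzmann factor exp(-gap/T) governing the density of the thermal plasma."""
    return math.exp(-gap / T)

def temperature_for(gap, target_factor):
    """Temperature at which the Boltzmann factor equals the target."""
    return gap / math.log(1.0 / target_factor)

L_needed = separation_for(m=1.0, target_probability=1e-12)
T_needed = temperature_for(gap=1.0, target_factor=1e-12)
print(f"separation for 1e-12 tunneling error: {L_needed:.1f} / m")
print(f"temperature for 1e-12 plasma factor:  {T_needed:.4f} * gap")
```

Because both factors are exponential, modest requirements suffice: in these units a separation of about 14 inverse masses, or a temperature of a few percent of the gap, already pushes the corresponding factor down to $10^{-12}$.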
7.2 The Fractional Quantum Hall Effect and Beyond
If our device is to be capable of performing interesting computations, the Aharonov-Bohm phenomena that it employs must be nonabelian. Only then will we be able to build up complex unitary transformations by performing many particle exchanges in succession. Such nonabelian Aharonov-Bohm effects can arise in systems with nonabelian gauge fields. Nature has been kind enough to provide us with some fundamental nonabelian gauge fields, but unfortunately not very many, and none of these seem to be suited for practical quantum computation. To realize Kitaev’s vision, then, we must hope that nonabelian Aharonov-Bohm effects can arise as complex collective phenomena in (two-dimensional electron or spin) systems that have only short-range fundamental interactions.
In fact, one of the most remarkable discoveries of recent decades has been that infinite range Aharonov-Bohm phenomena can arise in such systems, as revealed by the observation of the fractional quantum Hall effect. The electrons in quantum Hall systems are so highly frustrated that the ground state is an extremely entangled state with strong quantum correlations extending out over large distances. Hence, when one quasiparticle is transported around another, even when the quasiparticles are widely separated, the many-electron wave function acquires a nontrivial Berry phase (such as $e^{2\pi i/3}$). This Berry phase is indistinguishable in all its observable effects from an Aharonov-Bohm phase arising from a fundamental gauge field, and its experimental consequences are spectacular.
The Berry phases observed in quantum Hall systems are abelian (although there are some strong indications that nonabelian Berry phases can occur under the right conditions), and so are not very interesting from the viewpoint of quantum computation. But Kitaev [25] has described a family of simple spin systems with local interactions in which the existence of quasiparticles with nonabelian Berry phases can be demonstrated. (The Hamiltonian of the system so frustrates the spins that the ground state is a highly entangled state with infinite range quantum correlations.) These models are sufficiently simple (although unfortunately they require four-body interactions) that one can imagine a designer material that can be reasonably well described by one of Kitaev’s models. The crucial topological properties of the model are relatively insensitive to the precise microscopic details, so the task of the fabricator who “trims” the material may not be overly demanding. If furthermore it were possible to control the transport of individual quasiparticles (perhaps with a suitable magnetic tweezers), then the system could be operated as a fault-tolerant quantum computer.
Figure 17: A Kitaev spin model. Spins reside on the lattice links. The four spins that meet at a site or share a plaquette are coupled.
To construct his models, Kitaev considers a square lattice, with spins residing on each lattice link. The Hamiltonian is expressed as a sum of mutually commuting four-body operators, one for each site and one for each plaquette of the lattice. (See Fig. 17.) Because the terms are mutually commuting, it is simple to diagonalize the Hamiltonian by diagonalizing each term separately. The operators on sites resemble local gauge symmetries (acting independently at each site), and a state that minimizes these terms is invariant under the local symmetry, like the physical states that obey Gauss’s law in a gauge theory. The operators on plaquettes are like “magnetic flux” operators in a gauge theory, and these terms are minimized when the magnetic flux vanishes everywhere. The excitation spectrum includes states in which Gauss’s law is violated at isolated sites - these points are “electrically charged” quasiparticles - and states in which the magnetic flux is nonvanishing at isolated plaquettes - these are magnetic fluxon quasiparticles. The quantum entanglement of the ground state is such that a nontrivial Berry phase is associated with the transport of a charge around a flux - this phase is identical to the Aharonov-Bohm phase in the analog gauge theory.
These Aharonov-Bohm phenomena are stable even as we deform the Hamiltonian of the theory. Indeed, if the deformation is sufficiently small, we can study its effects using perturbation theory. But as long as the perturbations are local in space, topological effects are robust, since perturbation theory is just a sum over localized influences. Whatever destroys the long-range topological interactions must be nonperturbative in the deformation of the theory.
Two types of nonperturbative effects can be anticipated. The ground state of the theory might become a “flux condensate” with an indefinite number of magnetic excitations. In this event, there would be a long-range attractive interaction between charged particles and their antiparticles.
It would be impossible to separate charges, and there would be no long-range effects. In a gauge theory, this phenomenon would be called electric confinement. Alternatively, a condensate of electric quasiparticles might appear in the ground state. Then the magnetic excitations would be confined, and again the long-range Aharonov-Bohm effects would be destroyed. In a gauge theory, we would call this the Higgs phenomenon (or magnetic confinement). Thus, as we deform Kitaev’s Hamiltonian, we can anticipate that a phase boundary will eventually be encountered, beyond which either electric confinement or the Higgs phenomenon will occur. The size of the region enclosed by this boundary will determine how precisely a material will need to be fabricated in order to behave as Kitaev specifies. A particularly urgent question for the material designer is whether cleverly chosen two-body interactions might so frustrate a spin system as to produce a highly entangled ground state and nonabelian Aharonov-Bohm interactions among the quasiparticle excitations.
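The statement that the site and plaquette terms of the Hamiltonian commute can be checked directly in the simplest member of the family. The sketch below assumes the abelian case in which each link carries a two-state spin, each site operator is a product of Pauli $X$ on the four links meeting at the site, and each plaquette operator is a product of Pauli $Z$ on the four links around the plaquette; this specialization, the periodic boundary conditions, and all names in the code are assumptions made for the example (the nonabelian models discussed in the text use larger spins). Products of $X$ and products of $Z$ commute exactly when they overlap on an even number of links.

```python
from itertools import product

def edge(kind, x, y, size):
    """A link of the periodic square lattice: 'h' runs from (x, y) to (x+1, y),
    'v' runs from (x, y) to (x, y+1)."""
    return (kind, x % size, y % size)

def star(x, y, size):
    """The four links meeting at site (x, y); support of the site operator."""
    return {edge("h", x, y, size), edge("h", x - 1, y, size),
            edge("v", x, y, size), edge("v", x, y - 1, size)}

def plaquette(x, y, size):
    """The four links bounding the plaquette whose lower-left corner is (x, y)."""
    return {edge("h", x, y, size), edge("h", x, y + 1, size),
            edge("v", x, y, size), edge("v", x + 1, y, size)}

def overlap_sizes(size):
    """Set of all overlap sizes between site supports and plaquette supports."""
    cells = list(product(range(size), repeat=2))
    return {len(star(*s, size) & plaquette(*p, size)) for s in cells for p in cells}

def all_commute(size):
    """X-type site terms and Z-type plaquette terms commute iff every overlap is even."""
    return all(n % 2 == 0 for n in overlap_sizes(size))

print(overlap_sizes(4))  # a site and a plaquette share either 0 or 2 links
print(all_commute(4))
```

Terms of the same type commute trivially, so this overlap count is the only check needed in the two-state case.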
The fractional quantum Hall effect, and Kitaev’s models, teach a memorable lesson. We find gauge phenomena emerging as collective effects in systems with only short-range interactions. It is intriguing to speculate that the gauge symmetries known in Nature could have a similar origin.
7.3 Topological Interactions
As we have noted, in Kitaev’s spin models, there are two types of charges that can be carried by localized quasiparticles, which we may call “electric” and “magnetic” charges. In the simplest type of model, the “magnetic flux” carried by a particle can be labeled by an element of a finite group $G$, and “electric charges” are labeled by irreducible representations [p] of $G$. If a charged particle in the irreducible representation $D^{(\nu)}$, whose quantum numbers are encoded in an internal wavefunction $|\psi^{(\nu)}\rangle$, is carried around a flux labeled by group element $u \in G$, then the wavefunction is modified according to
$$|\psi^{(\nu)}\rangle \to D^{(\nu)}(u)\,|\psi^{(\nu)}\rangle . \qquad (38)$$
Exploiting this interaction, we can measure a magnetic flux by scattering a suitable charged particle off of the flux. For example, we could construct a Mach-Zehnder flux interferometer as shown in Fig. 18 that is sensitive to the relative phase acquired by the charged particle paths that pass to the left or right of the flux. If we balance the interferometer properly, we can distinguish between, say, two flux values $u_1, u_2 \in G$; the charged projectile will be detected emerging from one arm of the interferometer if the flux is $u_1$, and from the other arm if the flux is $u_2$. Of course, the interferometer we build will not be flawless, but the flux measurement can nevertheless be fault-tolerant - if we have many charged projectiles and perform the measurement repeatedly, we can determine the flux with very high statistical confidence.
If the two fluxes $u_1$ and $u_2$ belong to the same conjugacy class in $G$, then there is a symmetry relating the two fluxons, so that all local physics is indifferent to the value of the flux (see below). Therefore, a coherent superposition of fluxes
$$a|u_1\rangle + b|u_2\rangle \qquad (39)$$
will not readily decohere due to localized interactions with the environment. But the flux interferometer (operated repeatedly) will project the fluxon onto either of the flux eigenstates $|u_1\rangle$ (with probability $|a|^2$) or $|u_2\rangle$ (with probability $|b|^2$).
[p] There can also be “dyons” that carry both types of charge, and the classification of the charge carried by a dyon is somewhat subtle, but we will not need to discuss explicitly the properties of the dyons.
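The claim that a flawed interferometer still yields a reliable flux measurement when it is operated repeatedly can be made quantitative with a simple model. The sketch assumes that each projectile independently emerges from the wrong arm with probability `p`, that the flux is already in an eigenstate so repeated readings agree apart from these faults, and that we decide by majority vote; the independence assumption, the voting rule, and the function name are assumptions made for the example.

```python
from math import comb

def majority_error(p, n):
    """Probability that more than half of n independent readings are wrong.

    p is the chance that a single projectile exits the wrong arm;
    n is taken to be odd so that a majority always exists.
    """
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range((n + 1) // 2, n + 1))

for n in (1, 3, 11, 21, 51):
    print(f"{n:3d} projectiles: error {majority_error(0.1, n):.3e}")
```

Under these assumptions the probability of misidentifying the flux falls off exponentially with the number of projectiles, so even a fairly poor interferometer suffices if enough calibrated charges are available.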
Figure 18: A Mach-Zehnder interferometer for flux measurement, shown schematically. The flux to be measured is inserted inside. The test charge emerges from one arm if the flux has value $u_1$, the other arm if the flux has value $u_2$.
Figure 19: The flux exchange interaction. The flux labeled $u_1$ is carried from its original position (shaded) to its new position (unshaded), and then remeasured. The charged particle path shown that encircles the original position of the flux is topologically equivalent to a path that encircles the new position; hence the value of the flux changes from $u_1$ to $u_1' = u_2^{-1} u_1 u_2$.
Figure 20: The “pull-through” interaction. One flux pair is pulled through another. The outside flux is unmodified, but the inside flux is conjugated by the outside flux.
Now imagine that two fluxons have been carefully calibrated, so that one is known to carry the flux $u_1$ and the other the flux $u_2$. And suppose that the two vortices are carefully “exchanged” by carrying the first around the second as shown in Fig. 19, and that we subsequently remeasure the fluxes. Carrying a charged particle around the fluxon on the right, after the exchange, is topologically equivalent to carrying the charged particle around first the right fluxon, then the left fluxon, and finally the right fluxon in the opposite direction, before the exchange. We infer that the exchange modifies the quantum numbers of the fluxons according to
$$|u_1, u_2\rangle \to |u_2,\; u_2^{-1} u_1 u_2\rangle , \qquad (40)$$
a nontrivial interaction if the two fluxes fail to commute. Thus, noncommuting fluxes have interesting Aharonov-Bohm interactions of their own, even in the absence of any electric charges. Because carrying one flux around another can conjugate the value of the flux, two fluxons carrying conjugate fluxes must be regarded as indistinguishable particles. [61] An exchange of two such objects can modify their internal quantum numbers; we will refer to them as nonabelions, [62] indistinguishable particles in two dimensions that obey an exotic nonabelian variant of quantum statistics.
We will use the exchange interaction Eq. 40 as a fundamental logical operation in our Aharonov-Bohm quantum computer. However, it will actually be convenient to encode qubits in pairs of fluxons, where the total flux of the pair is trivial. [25] That is, we will consider fluxon-antifluxon pairs of the form $|u, u^{-1}\rangle$, but where the flux and antiflux are kept far enough apart from one another that an inadvertent exchange of quantum numbers between them is unlikely. To perform logic, we may pull one pair through another as shown in Fig. 20. Since the total flux that passes through the middle of the outside pair is trivial, this pair is not modified, but the inside fluxes are conjugated by the outside flux:
$$|u_1, u_1^{-1}\rangle\,|u_2, u_2^{-1}\rangle \to |u_2, u_2^{-1}\rangle\,|u_2^{-1} u_1 u_2,\; u_2^{-1} u_1^{-1} u_2\rangle , \qquad (41)$$
an operation that is evidently isomorphic to the effect of the exchange of single fluxes described by Eq. 40.
Using pairs instead of single fluxons has two advantages. First, since each pair has trivial total flux, the pairs do not interact unless one is pulled through another; therefore, we can easily shunt pairs around the device without inducing any unwanted interactions with distant pairs. Second, and more important, pairs can carry charges even if each member of the pair carries no charge. The charge of a pair can be measured, and this charge-measurement operation will be a crucial ingredient in the construction of a universal set of quantum gates.
The operation Eq. 41 can be regarded as a classical logic gate; it takes flux eigenstates to flux eigenstates. To perform interesting quantum computations, we will need to be able to prepare coherent superpositions of flux eigenstates. This is what we can accomplish by measuring the charge of a pair. Suppose that $u_0$ and $u_1 \in G$ are related by $u_1 = v^{-1} u_0 v$, for some $v \in G$. Then if we think of the flux eigenstates $|u_0, u_0^{-1}\rangle$ and $|u_1, u_1^{-1}\rangle$ as computational basis states, the effect of pulling either pair through a $|v, v^{-1}\rangle$ pair can be interpreted as a NOT or $X$ gate:
$$|u_0, u_0^{-1}\rangle \leftrightarrow |u_1, u_1^{-1}\rangle \qquad (42)$$
(see Fig. 21). But suppose we wish to prepare one of the states
$$|\pm\rangle = \frac{1}{\sqrt{2}}\left(|u_0, u_0^{-1}\rangle \pm |u_1, u_1^{-1}\rangle\right) . \qquad (43)$$
We can project a coherent superposition of $|u_0, u_0^{-1}\rangle$ and $|u_1, u_1^{-1}\rangle$ onto the $\{|\pm\rangle\}$ basis by scattering a $|v\rangle$ fluxon off the pair, or in other words by operating a charge interferometer, as in Fig. 22. When the $|v\rangle$ fluxon navigates around the pair, it acquires a trivial Aharonov-Bohm phase if the pair is in the state $|+\rangle$ and the nontrivial phase $-1$ if the pair is in the state $|-\rangle$. If the interferometer is properly balanced, then, the $|v\rangle$ projectile will be detected emerging from one arm of the interferometer if the pair is $|+\rangle$, and the other arm if the pair is $|-\rangle$. This is an example of charge measurement. Though the interferometer will not be perfect, charge measurement (like flux measurement) can be fault-tolerant, if we repeat the measurement enough times.
Figure 21: The NOT gate. Pulling a computational flux pair through a NOT pair flips the value of the encoded bit.
Figure 22: A Mach-Zehnder interferometer for charge measurement, shown schematically. The flux pair whose charge is to be measured is inserted inside. If the test NOT flux emerges from one arm, the $|+\rangle$ charge state has been prepared; if it emerges from the other arm, $|-\rangle$ has been prepared.
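The exchange and pull-through operations of this section act on flux labels purely by conjugation, so they are easy to experiment with. The following sketch represents fluxes as permutations (tuples giving the image of each point) and implements the conjugation $u \to v^{-1} u v$ used in the text. The choice of the permutation group on three objects, the product convention (rightmost factor applied first), and all function names are assumptions made for the example.

```python
def compose(p, q):
    """Product pq of two permutations: apply q first, then p."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(u, v):
    """The conjugated flux v^{-1} u v."""
    return compose(inverse(v), compose(u, v))

def exchange(u1, u2):
    """Exchange of two fluxons: (u1, u2) -> (u2, u2^{-1} u1 u2)."""
    return u2, conjugate(u1, u2)

def pull_through(inner, outer):
    """Pull-through of flux pairs: the outer pair is unchanged and both
    members of the inner pair are conjugated by the outer flux."""
    w = outer[0]
    return outer, (conjugate(inner[0], w), conjugate(inner[1], w))

a = (1, 0, 2)  # transposition of points 0 and 1
b = (0, 2, 1)  # transposition of points 1 and 2; a and b do not commute

print(exchange(a, b))   # noncommuting fluxes: the carried flux is conjugated
print(exchange(a, a))   # commuting fluxes: nothing happens beyond the swap
print(pull_through((a, inverse(a)), (b, inverse(b))))
```

Two features of the reconstruction are visible in this toy model: the product $u_1 u_2$ (the total flux) is unchanged by the exchange, and a pair with trivial total flux is mapped to another pair with trivial total flux by the pull-through.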
7.4 Universal Topological Computation
Working with fluxon pairs as computational basis states, we have seen how to perform the exchange (or “pull through”) operation Eq. 41, how to measure flux (using previously calibrated charges), and how to measure charge (using previously calibrated fluxes). We will also suppose that we are able to produce a large supply of vortex pairs. Local processes produce pairs that carry no charge or flux; a charge-zero pair with trivial flux has the form (up to normalization)
$$|\text{charge zero}\rangle = \sum_u |u, u^{-1}\rangle , \qquad (44)$$
where the sum ranges over a complete conjugacy class of $G$. Because this state is left invariant when conjugated by any element of $G$, it has trivial Aharonov-Bohm interactions with any flux, and so carries no detectable charge. After producing such a pair, we can perform flux measurement to project out one of the flux eigenstate pairs $|u, u^{-1}\rangle$. Performing many such measurements on many pairs, we can assemble a large reservoir of calibrated flux pairs that can be withdrawn as needed during the course of a computation.
But is our quantum computer universal - can we closely approximate any desired unitary transformation? To address this issue, we recall the result mentioned in Sec. 4.2: Universal classical computation, together with the ability to perform the single-qubit gates $X$ and $Z$, and the ability to measure $X$, $Y$, and $Z$, suffice for universal quantum computation. In fact, there are groups $G$ such that the operation Eq. 41 is sufficient for universal classical computation. We have found [65] that a Toffoli gate can be constructed from Eq. 41 if $G = A_5$, the group of even permutations on five objects. We may, for example, choose computational basis states with
$$u_0 = (125) , \quad u_1 = (234) ; \qquad (45)$$
that is, we choose our computational fluxes to be three-cycles with one object in common. Then a Toffoli gate can be constructed from a total of 16 elementary “pull-through” operations; six ancilla pairs are also used to catalyze this reaction. No Toffoli gate was found in any group smaller than $A_5$. [q] Since $A_5$ is also the smallest of the finite nonsolvable groups, it is tempting to conjecture that nonsolvability is a necessary condition for universal classical computation generated by conjugation. [r]
We have already remarked that an $X$ gate can be realized by pulling a computational vortex pair through the pair with flux $v$ such that $u_1 = v^{-1} u_0 v$; here we choose $v = (14)(35)$. It turns out that the $Z$ gate can be constructed with six pull-through steps and four ancilla pairs. Measuring $Z$ is the same as measuring flux, and we have already seen that $X$ measurement can be achieved by measuring the charge of a pair, specifically, by using a $v$ projectile in a charge interferometer. It only remains to verify that we can measure $Y$. Though $Y$ measurement cannot be carried out exactly in this scheme, it turns out that a controlled-$Y$ gate can be constructed from 31 pull-through steps, and using 7 ancilla pairs. Appealing to another trick invented by Kitaev, [67] we can use the controlled-$Y$ gate repeatedly to carry out $Y$-measurement to any desired accuracy. [s] Therefore, we have constructed a universal gate set using only the Aharonov-Bohm interactions of fluxes and charges; we have a fault-tolerant universal quantum computer.
Unfortunately, the spin model on which this construction is based is not so simple. Since the group $A_5$ has order 60, the Kitaev spin model that realizes this scenario has a 60-component spin residing at each lattice link (!). One hopes that a simpler implementation of universal Aharonov-Bohm computation will be found.
[q] Kitaev had reported earlier that universal classical computation is possible for $G = S_5$.
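The group-theoretic facts used in this section can be verified by brute force. The sketch below builds $A_5$ as the even permutations of five objects, checks that the NOT flux $v = (14)(35)$ conjugates $u_0 = (125)$ into $u_1 = (234)$ and back, confirms that $A_5$ has order 60 and equals its own commutator subgroup, and counts the conjugacy class of $u_0$ over which the sum in Eq. 44 runs. The helper names and the product convention (rightmost factor applied first) are assumptions made for the example; the Toffoli, $Z$, and controlled-$Y$ constructions themselves are not reproduced here.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations: apply q first, then p."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(u, v):
    """The conjugated flux v^{-1} u v."""
    return compose(inverse(v), compose(u, v))

def from_cycles(cycles, n=5):
    """Permutation of {1..n} in cycle notation -> 0-indexed image tuple."""
    perm = list(range(n))
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            perm[a - 1] = b - 1
    return tuple(perm)

def is_even(p):
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

def generated_subgroup(generators, identity):
    """Closure of a generating set under multiplication (finite groups only)."""
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

IDENTITY = tuple(range(5))
A5 = [p for p in permutations(range(5)) if is_even(p)]

u0 = from_cycles([(1, 2, 5)])
u1 = from_cycles([(2, 3, 4)])
v = from_cycles([(1, 4), (3, 5)])

# Commutator subgroup: generated by all a b a^{-1} b^{-1}.
commutators = {compose(compose(a, b), compose(inverse(a), inverse(b)))
               for a in A5 for b in A5}
derived = generated_subgroup(commutators, IDENTITY)

# Conjugacy class of the computational flux u0 within A5.
flux_class = {conjugate(u0, g) for g in A5}

print(len(A5), len(derived), len(flux_class))
print(conjugate(u0, v) == u1, conjugate(u1, v) == u0)
```

Because $v$ is its own inverse, the same pair implements the flip in both directions, which is what the $X$ gate of Eq. 42 requires.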
[r] A finite group is nonsolvable if it has a nontrivial subgroup whose commutator subgroup is itself. Barrington [66] also found evidence for a separation in the computational complexity of group multiplication for solvable vs. nonsolvable groups.
[s] Actually, measuring $Y$ (which has eigenvalues $\pm i$) using the controlled-$Y$ gate does not work, because the Kitaev method does not distinguish between eigenvalues related by complex conjugation. What we really construct is a controlled-$\omega Y$ gate where $\omega = e^{2\pi i/3}$.
7.5 Is Nature Fault Tolerant?
The discovery of quantum error correction and fault tolerance has so altered our thinking about quantum information that it is appropriate to wonder about the potential implications for fundamental physics. And in fact, a fundamental issue pertaining to loss of quantum information has puzzled the physics community for over twenty years. In 1975, Stephen Hawking [68] argued that quantum information is unavoidably lost when a black hole forms and then subsequently evaporates completely. The essence of the argument is very simple: because of the highly distorted causal structure of the black hole spacetime, the emitted radiation is actually on the same time slice as the collapsing body that disappeared behind the event horizon. If the quantum information that is initially encoded in the collapsing body is eventually to re-emerge encoded in the microstate of the emitted radiation, then that information must be in two places at once. In other words, the quantum information must be cloned, a known impossibility under the usual assumptions of quantum theory. Hawking concludes that not all physical processes can be governed by unitary time evolution; the laws of quantum theory need revision.
This argument is persuasive, but many physicists are very distrustful of the conclusion. Perhaps one reason for the skepticism is that it seems odd for Nature to tolerate just a little bit of information loss. [71] If processes involving black holes can destroy information, then one expects that information loss is unsuppressed at the Planck length scale $(G\hbar/c^3)^{1/2} \sim 10^{-33}$ cm, a scale where virtual black holes continually arise as quantum fluctuations. It becomes hard to understand why quantum information can be so readily destroyed at the Planck scale, yet is so well preserved at the much longer distance scales that we have been able to explore experimentally - violations of quantum mechanics, after all, have never been observed.
Our newly acquired understanding of fault-tolerant quantum computation provides us with a fresh and potentially fruitful way to think about this problem. In Kitaev’s spin models, we might imagine that localized processes that destroy quantum information are quite common. Yet were we to follow the evolution of the system with coarser resolution, tracking only the information encoded in the charges of distantly separated quasiparticles, we would observe unitary evolution to remarkable accuracy; we would detect no glimmer of the turmoil beneath the surface. [t] Likewise, it is tempting to speculate that Nature has woven fault tolerance into her design, shielding the quantum noise at the Planck scale from our view.
The discovery that quantum systems can be stabilized through suitable coding methods prompts us to ask the question: Is Nature fault tolerant? If so, then quantum mechanics may reign (to excellent accuracy) at intermediate length scales, but falter both at the Planck scale (where "errors" are common) and at macroscopic scales (where decoherence is rapid).
† Similar language could be used to characterize the performance of a concatenated code: errors are rare when we inspect the encoded information with poor resolution, but are seen to be much more common if we probe the code block at lower levels of concatenation.

Acknowledgments

This work has been supported in part by DARPA under Grant No. DAAH04-96-1-0386 administered by the Army Research Office, and by the Department of Energy under Grant No. DE-FG03-92-ER40701. I am grateful for helpful conversations and correspondence with Dorit Aharonov, David Beckman, John Cortese, Eric Dennis, David DiVincenzo, Jarah Evslin, Chris Fuchs, Sham Kakade, Alesha Kitaev, Manny Knill, Raymond Laflamme, Andrew Landahl, Seth Lloyd, Michael Nielsen, Walt Ogburn, Peter Shor, Andrew Steane, and Christof Zalka. I especially thank Daniel Gottesman for many fruitful discussions about fault-tolerant quantum computation.

References

1. R. P. Feynman, Int. J. Theor. Phys. 21, 467 (1982).
2. D. Deutsch, Proc. Roy. Soc. Lond. A 400, 96 (1985).
3. P. W. Shor, "Algorithms for quantum computation: discrete logarithms and factoring," pp. 124-134 in Proceedings of the 35th Annual Symposium on Foundations of Computer Science (Los Alamitos, CA, IEEE Press, 1994).
4. R. Landauer, Phil. Trans. R. Soc. Lond. 353, 367 (1995).
5. R. Landauer, Phys. Lett. A 217, 188 (1996).
6. R. Landauer, "Is quantum mechanically coherent computation useful?" in Proc. Drexel-4 Symposium on Quantum Nonintegrability-Quantum-Classical Correspondence, Philadelphia, PA, 8 September 1994, ed. D. H. Feng and B.-L. Hu (Boston, International Press, 1997).
7. W. G. Unruh, Phys. Rev. A 51, 992 (1995).
8. S. Haroche and J. M. Raimond, Phys. Today 49 (8), 51 (1996).
9. W. H. Zurek, Phys. Today 44, 36 (1991).
10. P. W. Shor, Phys. Rev. A 52, 2493 (1995).
11. A. M. Steane, Phys. Rev. Lett. 77, 793 (1996).
12. A. M. Steane, "Multiple particle interference and quantum error correction," Proc. Roy. Soc. Lond. A 452, 2551 (1996).
13. J. von Neumann, "Probabilistic logics and synthesis of reliable organisms from unreliable components," in Automata Studies, eds. C. E. Shannon and J. McCarthy (Princeton, Princeton University Press, 1956).
14. P. Gács, J. Comp. Sys. Sci. 32, 15 (1986).
15. P. Shor, "Fault-tolerant quantum computation," in Proceedings of the Symposium on the Foundations of Computer Science (Los Alamitos, CA, IEEE Press, 1996), preprint quant-ph/9605011.
16. A. M. Steane, Phys. Rev. Lett. 78, 2252 (1997).
17. D. Gottesman, "A theory of fault-tolerant quantum computation," preprint quant-ph/9702029.
18. E. Knill and R. Laflamme, "Concatenated quantum codes," preprint quant-ph/9608012.
19. E. Knill, R. Laflamme, and W. H. Zurek, "Accuracy threshold for quantum computation," preprint quant-ph/9610011.
20. E. Knill, R. Laflamme, and W. H. Zurek, "Resilient quantum computation: error models and thresholds," preprint quant-ph/9702058.
21. D. Aharonov and M. Ben-Or, "Fault tolerant quantum computation with constant error," preprint quant-ph/9611025.
22. A. Yu. Kitaev, "Quantum computing: algorithms and error correction," (preprint, in Russian, 1996).
23. J. Preskill, Proc. Roy. Soc. Lond. A 454, 385 (1998), preprint quant-ph/9705031.
24. C. Zalka, "Threshold estimate for fault tolerant quantum computing," preprint quant-ph/9612028.
25. A. Yu. Kitaev, "Fault-tolerant quantum computation by anyons," preprint quant-ph/9707021.
26. F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes (New York, North-Holland Publishing Company, 1977).
27. E. Knill and R. Laflamme, Phys. Rev. A 55, 900 (1997).
28. A. R. Calderbank and P. W. Shor, Phys. Rev. A 54, 1098 (1996).
29. D. Gottesman, Phys. Rev. A 54, 1862 (1996).
30. A. R. Calderbank, E. M. Rains, P. W. Shor, and N. J. A. Sloane, Phys. Rev. Lett. 78, 405 (1997).
31. A. R. Calderbank, E. M. Rains, P. W. Shor, and N. J. A. Sloane, "Quantum error correction via codes over GF(4)," preprint quant-ph/9608006.
32. J. Evslin, S. Kakade, and J. Preskill, unpublished (1996).
33. D. P. DiVincenzo and P. W. Shor, Phys. Rev. Lett. 77, 3260 (1996).
34. A. Yu. Kitaev, "Quantum error correction with imperfect gates," (preprint, 1996).
35. D. Gottesman, "Stabilizer codes and quantum error correction," Ph.D. thesis, California Institute of Technology, preprint quant-ph/9705052.
36. C. H. Bennett, D. P. DiVincenzo, J. Smolin, and W. K. Wootters, Phys. Rev. A 54, 3824 (1996).
37. R. Laflamme, C. Miquel, J. P. Paz, and W. H. Zurek, Phys. Rev. Lett. 77, 198 (1996).
38. D. Gottesman and J. Preskill, unpublished (1997).
39. D. Gottesman, J. Evslin, S. Kakade, and J. Preskill, unpublished (1996).
40. K. Obenland and A. M. Despain, "Simulation of factoring on a quantum computer architecture," in Proceedings of the 4th Workshop on Physics and Computation, Boston, November 22-24, 1996 (Boston, New England Complex Systems Institute, 1996).
41. K. Obenland and A. M. Despain, "Impact of errors on a quantum computer architecture," online preprint at http://www.isi.edu/acal/quantum/quantum_intro.html (1996).
42. C. Miquel, J. P. Paz, and W. H. Zurek, "Quantum computation with phase drift errors," preprint quant-ph/9704003.
43. J. I. Cirac and P. Zoller, Phys. Rev. Lett. 74, 4091 (1995).
44. M. B. Plenio and P. L. Knight, Proc. Roy. Soc. Lond. A 453, 2017 (1997).
45. M. Grassl, Th. Beth, and T. Pellizzari, Phys. Rev. A 56, 33 (1997).
46. A. K. Lenstra, J. Cowie, M. Elkenbracht-Huizing, W. Furmanski, P. L. Montgomery, D. Weber, and J. Zayer, "RSA factoring-by-web: the world-wide status," online document http://www.npac.syr.edu/factoring/status.html (1996).
47. D. Beckman, A. Chari, S. Devabhaktuni, and J. Preskill, Phys. Rev. A 54, 1034 (1996).
48. A. M. Steane, "Space, time, parallelism and noise requirements for reliable quantum computing," preprint quant-ph/9708021.
49. C. Monroe, D. M. Meekhof, B. E. King, W. M. Itano, and D. J. Wineland, Phys. Rev. Lett. 75, 4714 (1995).
50. Q. A. Turchette, C. J. Hood, W. Lange, H. Mabuchi, and H. J. Kimble, Phys. Rev. Lett. 75, 4710 (1995).
51. D. G. Cory, A. F. Fahmy, and T. F. Havel, "Nuclear magnetic resonance spectroscopy: an experimentally accessible paradigm for quantum computing," in Proceedings of the 4th Workshop on Physics and Computation (Boston, New England Complex Systems Institute, 1996).
52. N. Gershenfeld and I. Chuang, Science 275, 350 (1997).
53. J. Preskill, Proc. Roy. Soc. Lond. A 454, 469 (1998), preprint quant-ph/9705032.
54. S. Lloyd, Science 273, 1073 (1996).
55. R. Prange and S. Girvin, eds., The Quantum Hall Effect (New York, Springer-Verlag, 1987).
56. N. Read and E. Rezayi, "Quasiholes and fermionic zero modes of paired fractional quantum Hall states: the mechanism for nonabelian statistics," preprint cond-mat/9609079.
57. C. Nayak and F. Wilczek, "$2n$ quasihole states realize $2^{n-1}$-dimensional spinor braiding statistics in paired quantum Hall states," preprint cond-mat/9605145.
58. G. 't Hooft, Nucl. Phys. B 138, 1 (1978).
59. M. Alford, S. Coleman, and J. March-Russell, Nucl. Phys. B 351, 735 (1991).
60. F. A. Bais, Nucl. Phys. B 170, 32 (1980).
61. H.-K. Lo and J. Preskill, Phys. Rev. D 48, 4821 (1993).
62. G. Moore and N. Read, Nucl. Phys. B 360, 362 (1991).
63. M. G. Alford, K. Benson, S. Coleman, J. March-Russell, and F. Wilczek, Phys. Rev. Lett. 64, 1632 (1990).
64. J. Preskill and L. M. Krauss, Nucl. Phys. B 341, 50 (1990).
65. W. Ogburn and J. Preskill, unpublished (1997).
66. D. A. Barrington, J. Comp. Sys. Sci. 38, 150-164 (1989).
67. A. Yu. Kitaev, "Quantum measurements and the abelian stabilizer problem," preprint quant-ph/9511026.
68. S. W. Hawking, Phys. Rev. D 14, 2460 (1976).
69. D. Dieks, Phys. Lett. A 92, 271 (1982).
70. W. K. Wootters and W. H. Zurek, Nature 299, 802 (1982).
71. T. Banks, M. E. Peskin, and L. Susskind, Nucl. Phys. B 244, 125 (1984).
QUANTUM COMPUTERS, ERROR-CORRECTION AND NETWORKING: QUANTUM OPTICAL APPROACHES

THOMAS PELLIZZARI
Clarendon Laboratory, Department of Physics, University of Oxford, Parks Road, Oxford OX1 3PU, United Kingdom

Practical quantum computing is discussed, from the viewpoint of using ion traps or applications of quantum optics. A linear ion trap quantum computer coupled to external laser beams is described in detail, as is an alternative realization based on cavity quantum electrodynamics. An experimentally feasible approach to quantum error correction for ion trap systems is given. The possibility of quantum computer networking, using optical fibres to couple systems together coherently, is also discussed.

1 Introduction
Until recently the field of quantum computing was a purely theoretical one, and no experimental physicist would have seriously considered setting up an experiment to realize a quantum computer (QC) in the laboratory. During the last few years, however, the situation has changed dramatically due to some seminal experimental and theoretical developments in the field. The purpose of this chapter is to give an overview of some of the developments that have nourished the hope that QCs might eventually become a feasible technology.

The prospect of building faster computers is certainly not the only motivation for research into quantum computing. Many physicists agree that even a very small QC consisting of only a few quantum bits (commonly called qubits, the quantum analogue of classical bits) would in itself be a most significant development. Such a small-scale QC could be used to generate quantum states which have no analogue in the classical macroscopic world. Many of the gedanken experiments which have been proposed over almost a century of quantum mechanics could actually be realized as laboratory experiments. Quantum states could be generated that would decide fundamental quantum mechanical questions (see, e.g., [1,2]). Moreover, studies in the context of decoherence and quantum measurements could be made on a QC. This is important for bridging the gap between the microscopic quantum world and the macroscopic classical world. It has also been suggested that a QC could be used to generate quantum states that would enable improved high-precision spectroscopy [3]. This is relevant for improving time and frequency standards and for building more accurate atomic clocks, although a careful analysis has shown that the actual improvements are very small [4]. Finally, a QC might be used as a simulator of quantum systems [5].

The first promising hypothetical prototype quantum computer is the ion-trap QC proposed by Ignacio Cirac and Peter Zoller of the University of Innsbruck [6]. Their scheme is based on a string of ions stored in a linear ion trap. Computations are performed by manipulating the ions with laser beams. Initial experimental results have already been reported, and suggest that small-scale computing with ions in a trap is a feasible technology [8]. Several groups in Europe and the USA have set out to realize the ion-trap QC experimentally. At least five groups are currently conducting experiments along these lines. In Sec. 2 the ion-trap QC is described in some detail. Section 3 describes an alternative proposal for realizing a QC based on cavity quantum electrodynamics [9].

Building a QC is an extremely difficult task. The major obstacle is the coupling of the QC to the environment, which destroys the quantum mechanical properties of the system. It is impossible to avoid coupling of the QC to the environment completely because interaction with the system is necessary for initialization, execution of the desired quantum computation, and retrieval of results. The external interaction with the QC required by these processes inevitably affects any non-trivial quantum computation. Consequently, a major challenge facing researchers is the development of schemes to undo these unwanted effects. Recently, several schemes have been developed to correct for errors in quantum computations [10-12]. If non-trivial quantum computing is to become feasible, it will almost certainly rely heavily on such schemes. An experimentally feasible scheme for error correction for the ion-trap QC is discussed in Sec. 4.
Section 5 addresses the problem of quantum networking and the difficulties of setting up a network of several spatially separate QCs that are able to communicate with each other quantum mechanically. The need for solutions to these problems is particularly pressing because current technologies impose upper limits on the possible number of quantum bits in a single QC. These limits might be overcome by connecting several QCs together. Section 5 describes the two proposals which have been made so far to effect these connections [13,14]. These schemes are based on communicating quantum information via optical fibres.
Figure 1: Schematic representation of the ion trap QC.
2 The ion-trap quantum computer

2.1 Introduction
In early 1995 Ignacio Cirac and Peter Zoller of the University of Innsbruck proposed a model for realizing a quantum computer which differed from all preceding proposals in one crucial point: it was experimentally feasible [6,7]. Their proposal precipitated vigorous experimental activity in several research groups. Six months later Chris Monroe and co-workers from the NIST/Boulder facility in the US reported their first experimental results [8]. Even though their experiment was only a demonstration of a simplified version of the scheme, it showed the feasibility of the Cirac-Zoller QC. The success of the Cirac-Zoller QC is due to the fact that it is based on several technologies that already exist. The key technology employed, the cooling and trapping of ions, had already been developed for precision spectroscopy.

The basic principle of the Cirac-Zoller QC can be seen in Fig. 1. The quantum bits are stored in long-lived internal states of ions which are trapped in a linear radio-frequency Paul trap [15]. The necessary communication between the ions is mediated by the electrostatic repulsive interaction between them. Quantum gates are performed by temporarily transferring the quantum information stored in one of the ions to a collective vibrational excitation of the whole string of ions. In other words, vibrations induced in the string of ions are conditioned by the logical state of a particular ion. This makes the quantum information stored in a single ion available to all the others because all ions participate in the same vibrational motion. When this has been achieved, an operation on a different ion can be performed conditioned by the motional state of the string of ions. Finally the quantum information stored in the vibrational motion can be transferred back to the first ion. This method allows quantum gates to be executed involving arbitrary pairs of ions. All these operations are induced by laser beams which interact with individual ions.

The ion-trap proposal is attractive because it meets three distinct requirements of realistic quantum computing. Firstly, it offers a physical system to store the qubits reliably. The quantum bits are stored in long-lived electronic states of ions, and can persist for a considerable time. Experiments with single trapped ions have shown that quantum superpositions stored in hyperfine ground state sublevels can survive for several minutes [16]. Secondly, it provides a means of performing universal 2-bit quantum gates. By applying a universal quantum gate to different pairs of quantum bits, any arbitrary quantum computation can be realized [17-21]. The concept of using the vibration of the ions as a "bus for quantum information" is promising because the vibrational modes are only slightly damped. Thirdly, the proposal offers a physical mechanism to perform reliable measurements of the qubits. This can be effected through the quantum jump technique, by which the state of an ion can be determined with an efficiency of nearly 100% [22-24].
2.2 Atomic and ionic quantum bits

Atoms and ions make excellent storage media for quantum information. Although atoms generally have a large number of internal electronic levels, only the most stable levels, such as ground states and metastable states, can be used for storing quantum information. A qubit may be represented using an atom by selecting two stable electronic levels and identifying them with the logical values 0 and 1. For example, we may choose an atomic ground state $|g\rangle$ and a (metastable) excited state $|e\rangle$ separated by an energy $\hbar\omega_{eg}$. The identification with logical values might be as follows:

$$|0\rangle \equiv |g\rangle, \qquad |1\rangle \equiv |e\rangle .$$

Having made this identification, the next step towards quantum computing is to perform 1-bit quantum gates on the atom. This can be done by shining laser light on the atom.(a) The Hamiltonian operator which describes this interaction is given by the standard dipole interaction Hamiltonian [25]:

$$H_{\text{at-las}} = -\mathbf{d}\cdot\mathbf{E} .$$

(a) For the purposes of this example, it is assumed that the atomic transition $|g\rangle$-$|e\rangle$ corresponds to an optical transition, in which case lasers must be used to interact with the atom. In the case of microwave transitions, appropriate radio-frequency fields must be used.

Here $\mathbf{d}$ and $\mathbf{E}$ denote respectively the atomic dipole moment and the electrical field at the position of the atom. After introducing an important approximation (the rotating wave approximation) and restricting the Hamiltonian to the Hilbert space spanned by the two states $|g\rangle$ and $|e\rangle$, the full Hamiltonian of the system (consisting of the free atomic part and the interaction Hamiltonian operator) reads:
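$$\tilde{H} = H_{\text{at}} + H_{\text{at-las}}, \qquad H_{\text{at}} = \hbar\omega_{eg}\,|e\rangle\langle e|, \qquad H_{\text{at-las}} = \hbar\Omega\, e^{-i\omega_{\text{las}} t}\,|e\rangle\langle g| + \text{h.c.} \tag{1}$$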
h.c. denotes the Hermitian conjugate and $\Omega$ is the Rabi frequency. $\Omega$ depends on the intensity of the laser and the dipole moment of the transition. $\omega_{eg}$ and $\omega_{\text{las}}$ denote the atomic transition frequency and the laser frequency, respectively. Details of the derivation of Eq. 1 can be found in the standard textbooks on quantum optics [26,27].

Finally, a transformation is performed which removes the fast optical frequencies $\omega_{eg}$ and $\omega_{\text{las}}$ from the Hamiltonian of Eq. 1. Suppose $|\tilde{\Psi}(t)\rangle$ is a solution to the Schrödinger equation with the Hamiltonian $\tilde{H}$ of Eq. 1. This (Schrödinger picture) state vector $|\tilde{\Psi}(t)\rangle$ is transformed as follows: $|\Psi(t)\rangle = \exp\left(i\omega_{\text{las}} t\,|e\rangle\langle e|\right)|\tilde{\Psi}(t)\rangle$. The new transformed state vector obeys a Schrödinger equation with the Hamiltonian:

$$H = -\hbar\Delta\,|e\rangle\langle e| + \hbar\Omega\left(|e\rangle\langle g| + |g\rangle\langle e|\right) \tag{2}$$

The detuning $\Delta = \omega_{\text{las}} - \omega_{eg}$ denotes the difference between the laser and atomic frequencies. For the sake of simplicity it is assumed that the Rabi frequency is real. Note that there is no contribution to the Hamiltonian corresponding to the laser because the laser field is treated as a classical field. If, for example, we wished to perform a bit flip operation

$$|e\rangle\langle g| + |g\rangle\langle e| \tag{3}$$
this could be achieved by choosing the laser frequency $\omega_{\text{las}}$ equal to the atomic frequency $\omega_{eg}$ (corresponding to $\Delta = 0$). The electrical field of the laser will interact with the dipole moment of the atomic transition and induce a periodic exchange of energy between the two states $|g\rangle$ and $|e\rangle$. This behaviour is found by solving the Schrödinger equation corresponding to the Hamiltonian of Eq. 2 for the initial conditions $|\Psi(0)\rangle = |g\rangle$ and $|\Psi(0)\rangle = |e\rangle$. These two states evolve as a function of time as follows:

$$|\Psi(t)\rangle = \cos\Omega t\,|g\rangle + i\sin\Omega t\,|e\rangle \quad \text{for } |\Psi(0)\rangle = |g\rangle,$$
$$|\Psi(t)\rangle = \cos\Omega t\,|e\rangle - i\sin\Omega t\,|g\rangle \quad \text{for } |\Psi(0)\rangle = |e\rangle.$$

The interaction time of the laser with the atom needs to be chosen such that $\Omega t = \pi/2$. In summary, the following transformation will be realized:

$$|g\rangle \to i\,|e\rangle, \qquad |e\rangle \to -i\,|g\rangle . \tag{4}$$

Note that this is almost, but not quite, the operation of Eq. 3 we wish to perform. This is because, in addition to performing a bit flip, the quantum state picks up an unwanted phase, dependent upon the initial state. Therefore an additional operation that changes the phase conditioned by the state of the atom needs to be effected:

$$|g\rangle \to |g\rangle, \qquad |e\rangle \to \exp(-i\alpha)\,|e\rangle . \tag{5}$$

Here $\alpha$ is an arbitrary phase. This operation can be performed with the same interaction Hamiltonian of Eq. 2 by choosing the detuning $\Delta$ much larger than the Rabi frequency $\Omega$. Under this condition only the first term in Eq. 2 will be relevant. The transformation of Eq. 5 is realized by setting $\alpha = -\Delta t$. It can be shown that the two processes of Eqs. 4 and 5 are sufficient to perform any arbitrary unitary transformation, and thus any arbitrary quantum gate, on a single atomic qubit.
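As a small illustration of how the two processes combine, the sketch below takes the transformations of Eqs. 4 and 5 as given in the text, writes them as 2x2 matrices, and checks that a phase operation with $\alpha = \pi$ followed by the flip of Eq. 4 reproduces a pure bit flip up to an overall phase. The basis ordering (index 0 for $|g\rangle$, index 1 for $|e\rangle$), the choice $\alpha = \pi$, and all function names are assumptions made for this example only.

```python
import cmath
import math

# Basis order assumed in this sketch: index 0 = |g>, index 1 = |e>.
# Column j of each matrix holds the image of basis state j.

def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Eq. 4 as a matrix: |g> -> i|e>, |e> -> -i|g>.
FLIP_WITH_PHASE = [[0, -1j],
                   [1j, 0]]

# The ideal bit flip: |g> -> |e>, |e> -> |g>.
BIT_FLIP = [[0, 1],
            [1, 0]]

def phase_gate(alpha):
    """Eq. 5 as a matrix: |g> -> |g>, |e> -> exp(-i*alpha)|e>."""
    return [[1, 0],
            [0, cmath.exp(-1j * alpha)]]

def equal_up_to_global_phase(a, b, tol=1e-12):
    """Return True if a = exp(i*phi) * b for some real phi."""
    ratio = None
    for i in range(2):
        for j in range(2):
            if ratio is None and abs(b[i][j]) > tol:
                ratio = a[i][j] / b[i][j]
    if ratio is None or abs(abs(ratio) - 1.0) > tol:
        return False
    return all(abs(a[i][j] - ratio * b[i][j]) <= tol
               for i in range(2) for j in range(2))

# Phase operation first (alpha = pi), then the flip of Eq. 4.
corrected = matmul(FLIP_WITH_PHASE, phase_gate(math.pi))

if __name__ == "__main__":
    print(equal_up_to_global_phase(FLIP_WITH_PHASE, BIT_FLIP))
    print(equal_up_to_global_phase(corrected, BIT_FLIP))
```

Comparing only up to a global phase reflects the fact that an overall phase factor on the state of a single qubit has no observable consequence, whereas the relative phase between $|g\rangle$ and $|e\rangle$ does.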
2.3 Ion trapping and cooling technology

As mentioned above, the qubits in the ion-trap QC are arranged as a string of ions stored in a linear radio-frequency Paul trap [15]. This technology was developed (among other purposes) for ultra-high precision spectroscopy and to improve time and frequency standards. Such a trap is schematically depicted in Fig. 2. The device consists of four parallel rods. A voltage varying at radio frequency is applied to two opposing rods, while the other two are grounded. The high-frequency field confines the charged particles to the axis parallel to the rods in the centre of the trap. However, the particles can move freely along this axis. To confine the ions axially an electrostatic potential is applied to both ends. For reviews of ion trapping and cooling see, e.g., the article by Blatt [28].

Figure 2: Linear radio-frequency ion trap.

The trapping potential is designed so that the ions are tightly confined in two spatial directions. In one direction, however, the trapping potential is relatively flat. If more than one ion is stored in such a trap the individual particles experience two different kinds of forces. Firstly, the ions feel the trapping force. As mentioned above, this force is weak in one spatial direction and thus allows the particles to move reasonably freely in this direction. Secondly, since the ions are charged particles they experience a repulsive electrostatic force. The result of these two competing effects is that the ions can arrange themselves in a stable configuration with approximately equal distances between them. Such ion traps have already been built [29]. By shining laser light on the ions their fluorescence can be photographed as shown in Fig. 3. This experiment was carried out at the NIST facility in Boulder.

Figure 3: Photograph of a string of ions in a linear ion trap. By courtesy of D. J. Wineland (NIST/Boulder).

2.4 Collective modes of motion

Figure 4: Phonon modes of two ions in a linear trap: oscillation (a) in phase (centre-of-mass mode) and (b) out of phase.
Each of the ions shown in Fig. 3 could possibly represent a quantum bit. The distance between the ions is usually very large compared to the optical wavelength. Thus it is (relatively) straightforward to address each of the ions separately with laser beams and to perform 1-bit quantum gates as described in Sec. 2.2. However, 2-bit quantum gates are more complex. To execute 2-bit quantum gates an interaction between the ions is necessary, and in this proposal the interaction is provided by the electrostatic Coulomb repulsion. Because the ions arrange themselves in a linear crystal configuration, this situation can be described as a system of coupled harmonic oscillators, a situation which closely resembles the elementary introductory examples of phonons in many textbooks on solid state physics [30,31]. By diagonalizing the (classical) equations of motion one finds that the system possesses normal modes which correspond to collective motional states of the ions, each with a fixed energy (or frequency).

An example of this would be the case of two ions in a linear trap coupled by the Coulomb repulsion. In this system two normal modes exist. The energetically lowest is the centre-of-mass (CM) mode, corresponding to an oscillation of the two ions in phase (see Fig. 4a). The other mode corresponds to the two ions oscillating out of phase (Fig. 4b). In general, the number of normal modes is equal to the number of ions in the trap. The oscillation frequency of the CM vibrational mode coincides with the oscillation frequency $\omega_{\mathrm{CM}}$ of a single ion in the trap. It is important to point out that the next frequency is $\sqrt{3}\,\omega_{\mathrm{CM}}$ regardless of the number of ions.

2.5 Laser cooling to the motional ground state
For the present scheme only the CM mode is relevant. Prior to performing quantum computations, the string of ions must be cooled such that the ions perform very small oscillations around their equilibrium positions. In particular, the CM mode must be cooled to its quantum mechanical ground state. At such low temperatures the motion of the ions must be treated quantum mechanically and the normal modes can be treated as quantum mechanical harmonic oscillators. For the CM normal mode the energy distance between the quantized motional energy levels is given by $\hbar\omega_{\mathrm{CM}}$. In order to cool to the quantum mechanical ground state the thermal energy associated with this mode must obey the inequality $k_B T \ll \hbar\omega_{\mathrm{CM}}$, where $k_B$ denotes the Boltzmann constant. This is required because the scheme relies on using the CM mode as a temporary qubit and therefore it is essential that single quantum states can be populated.

Cooling to the quantum mechanical ground state can be achieved by using laser cooling techniques. Paradoxically, this technique reduces the temperature of atoms or ions by shining laser light on them. The simplest version of this counter-intuitive method can be described as follows. A laser shone on an atom gives rise to excitation by absorption of a photon from the laser field. The absorption process is accompanied by the transfer of momentum from the photon field to the atom. If photons were to be primarily absorbed when an atom is moving towards a laser, the atoms would slow down due to momentum conservation. This can be achieved by using the Doppler effect. With laser cooling techniques single trapped ions have already been cooled to zero-point energy [32]. Cooling to the quantum mechanical motional ground state of a string of ions has yet to be demonstrated.

2.6 Coupling internal levels to external motion
The possibility of transferring quantum information from the internal degrees of freedom of the ions to the external CM vibrational mode is essential for effecting quantum gates in the ion trap quantum computer. This transfer can be achieved by means of lasers. The reason for this is that the photons carry momentum. This transfer is explained in the following example.
Figure 5: Coupling between internal and external degrees of freedom. Basic level scheme consisting of two internal ionic levels $|g\rangle$ and $|e\rangle$ and two external motional levels $|0\rangle_{\mathrm{CM}}$ (ground state) and $|1\rangle_{\mathrm{CM}}$ (first excited state).
For the sake of simplicity, a single ion (modelled by two internal levels $|g\rangle$ and $|e\rangle$) is assumed to be trapped in a 1D harmonic potential. The energy difference between $|g\rangle$ and $|e\rangle$ is $\hbar\omega_{eg}$ and the frequency of the trapping potential is $\omega_{\mathrm{CM}}$. The level scheme of the combined system consisting of two internal levels and the first two vibrational quantum states is schematically depicted in Fig. 5. The vibrational basis states are denoted by $|n\rangle_{\mathrm{CM}}$, where $n$ is the number of vibrational quanta. It is assumed that $\omega_{eg} \gg \omega_{\mathrm{CM}}$. Note that if a laser is applied to the atom with a frequency $\omega_{\text{las}}$ equal to the atomic frequency $\omega_{eg}$, Rabi oscillations between $|g\rangle$ and $|e\rangle$ take place as described in Sec. 2.2 and the external motional state remains unchanged. However, if the laser frequency is tuned to one of the motional sidebands, the external degrees of freedom will be coupled to the internal ones. For example, if the laser is tuned to the first lower motional sideband $\omega_{\text{las}} = \omega_{eg} - \omega_{\mathrm{CM}}$, Rabi oscillations between the states $|g\rangle|n+1\rangle_{\mathrm{CM}}$ and $|e\rangle|n\rangle_{\mathrm{CM}}$ can take place. The corresponding Hamiltonian reads:

$$H = \frac{\hbar\eta\Omega}{2}\,|e\rangle\langle g|\,c + \text{h.c.} \tag{6}$$
$c$ denotes the annihilation operator of a vibrational quantum.* The parameter $\eta$ is called the Lamb-Dicke parameter, and is a measure of the width of the motional ground state wave function. For details of the derivation and the validity of this Hamiltonian see the paper by Cirac et al. [33]

* The annihilation operator $c$ applied to a phonon vibrational state decreases the phonon number by one. If applied to a number state $|n\rangle$, i.e. a vibrational state with exactly $n$ quanta in the phonon mode, we find: $c\,|n\rangle = \sqrt{n}\,|n-1\rangle$. On the other hand, the creation operator $c^\dagger$ increases the phonon number by one and results in: $c^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle$.

Suppose the ion is initially at rest and the internal states are in an arbitrary superposition $\alpha|g\rangle + \beta|e\rangle$, where $\alpha$ and $\beta$ are complex amplitudes. If a laser pulse of appropriate duration, tuned to the first lower motional sideband $\omega_{\text{las}} = \omega_{eg} - \omega_{\mathrm{CM}}$, were applied to the ion, the following transformation generated by the Hamiltonian of Eq. 6 would take place:

$$\left(\alpha|g\rangle + \beta|e\rangle\right)|0\rangle_{\mathrm{CM}} \;\to\; |g\rangle\left(\alpha|0\rangle_{\mathrm{CM}} - i\beta|1\rangle_{\mathrm{CM}}\right) \tag{7}$$

Note that the quantum state $|g\rangle|0\rangle_{\mathrm{CM}}$ remains unchanged. Thus the quantum information (encoded in the amplitudes $\alpha$ and $\beta$), which is initially stored in the internal levels, is transferred to the external motion.

2.7 2-bit quantum gates

All the prerequisites are now available which are necessary to explain the implementation of 2-bit quantum gates. Below, a quantum gate characterised by the following transformation is described:

$$\begin{aligned}
|0\rangle|0\rangle &\to |0\rangle|0\rangle \\
|0\rangle|1\rangle &\to |0\rangle|1\rangle \\
|1\rangle|0\rangle &\to |1\rangle|0\rangle \\
|1\rangle|1\rangle &\to -|1\rangle|1\rangle
\end{aligned} \tag{8}$$

This quantum gate changes the sign if, and only if, both qubits are in the logical state 1. This gate is very closely related to the famous controlled-NOT gate [34] because, in combination with arbitrary 1-bit gates, it is able to perform arbitrary quantum computations. To obtain a controlled-NOT gate, the Hadamard 1-bit quantum gate of Eq. 9 is applied to the target bit, the conditional sign-change gate of Eq. 8 is performed, and the Hadamard gate of Eq. 9 is applied to the target bit again. The Hadamard 1-bit gate is defined by:

$$|0\rangle \to \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right), \qquad |1\rangle \to \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right) \tag{9}$$

Figure 6: Three basic steps for performing a quantum gate on two ions: (i) Transfer of qubit i to the phonon mode; (ii) sign change; (iii) inverse of first step.
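The relation between the conditional sign-change gate and the controlled-NOT gate described above (Hadamard on the target, sign change, Hadamard on the target) can be checked with a few lines of matrix arithmetic. The sketch below is illustrative only: the basis ordering $|\text{control}\rangle|\text{target}\rangle$ with index $2\cdot\text{control} + \text{target}$, and the helper names, are assumptions of this example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

S = 1.0 / math.sqrt(2.0)
IDENTITY = [[1, 0], [0, 1]]
HADAMARD = [[S, S], [S, -S]]            # 1-bit Hadamard gate
SIGN_CHANGE = [[1, 0, 0, 0],            # conditional sign change:
               [0, 1, 0, 0],            # only |1>|1> acquires a minus sign
               [0, 0, 1, 0],
               [0, 0, 0, -1]]

# Hadamard acting on the target (second) bit only.
H_TARGET = kron(IDENTITY, HADAMARD)

# Hadamard on target, then sign change, then Hadamard on target again.
cnot = matmul(H_TARGET, matmul(SIGN_CHANGE, H_TARGET))

if __name__ == "__main__":
    for row in cnot:
        print([round(x, 12) for x in row])
```

The product leaves $|0\rangle|0\rangle$ and $|0\rangle|1\rangle$ unchanged and exchanges $|1\rangle|0\rangle$ with $|1\rangle|1\rangle$, which is the action of the controlled-NOT gate with the first bit as control.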
The quantum gate of Eq. 8 is realized in three steps, which are outlined below before being discussed in greater detail. Fig. 6 shows the necessary operations on the three quantum systems involved in the gate: ion i, ion j , and the CM phonon mode. Step (i): the qubit stored in the ith atom is transferred to the CM phonon mode by a laser pulse with an appropriate frequency and duration as described by transformation of Eq. 7. That is, if ion i is in the excited state le)i a vibration in the string of ions is induced. On the other hand, if ion i is in the ground state 1g)i the ions remain at rest. This ensures that the qubit is accessible to all the other ions. Since all ions participate in the CM motion they can all “see” the quantum information stored there. Step (ii): an operation is performed on the j t h ion conditioned by the quantum state of the CM phonon mode. This again requires an appropriate laser pulse. In this step of the gate the conditional sign change takes place. Step (iii): the first step is reversed in order to restore the qubit from the CM phonon mode to the ith ion. This step completes the quantum gate and leaves the CM phonon mode in its motional ground state 1 0 ) ~ The ~ . QC is now ready for further quantum gates. This is a simplifed explanation of the quantum gate. There now follows a more detailed description of the three steps. Step (i): The transfer of the ith qubit is achieved by performing a 7r/2pulse with a laser aimed at the ith ion and tuned to the first lower motional sideband (corresponding to ulas = weg - WCM ). If the duration of the laser pulse is timed correctly the ith ion and the CM mode will undergo the transformation of Eq. 7 while the other ions remain unchanged. Fig. 7a shows the two internal levels of ion i as well as the first two vibrational levels. To facilitate comprehension of the explanation below, Fig. 7b shows the transformation from the perspective of the j t h atom. 
In this diagram the internal levels of ion j are shown with the first two vibrational levels of the CM phonon mode.
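The three-step structure outlined above can be checked with a small numerical sketch. It is only an illustration under stated assumptions: the CM mode is truncated to zero or one phonon, the transfer of step (i) is assumed to carry a phase factor $-i$ (Eq. 7 is not reproduced here, so this phase is a guess), step (ii) is represented only by its net effect (a sign change of $|e\rangle_j|1\rangle_{CM}$), and all function names are hypothetical. Because step (iii) is the inverse of step (i), the result does not depend on the assumed phase.

```python
import numpy as np

G, E = 0, 1                       # internal levels |g>, |e>
DIM_I, DIM_J, DIM_CM = 2, 2, 2    # ion i, ion j, CM mode truncated to 0 or 1 phonon
DIM = DIM_I * DIM_J * DIM_CM

def idx(i, j, n):
    """Index of the basis state |i>_i |j>_j |n>_CM."""
    return (i * DIM_J + j) * DIM_CM + n

def transfer_step():
    """Step (i): |e>_i|0>_CM <-> |g>_i|1>_CM, with an ASSUMED phase factor -i."""
    U = np.eye(DIM, dtype=complex)
    for j in range(DIM_J):
        a, b = idx(E, j, 0), idx(G, j, 1)
        U[a, a] = U[b, b] = 0.0
        U[a, b] = U[b, a] = -1j
    return U

def sign_change_step():
    """Step (ii): net effect of the full Rabi cycle, |e>_j|1>_CM -> -|e>_j|1>_CM."""
    U = np.eye(DIM, dtype=complex)
    for i in range(DIM_I):
        k = idx(i, E, 1)
        U[k, k] = -1.0
    return U

def conditional_sign_gate():
    """Steps (i), (ii), (iii) in sequence; step (iii) is the inverse of step (i)."""
    U1 = transfer_step()
    U2 = sign_change_step()
    U3 = U1.conj().T
    return U3 @ U2 @ U1

def logical_block(U):
    """Restrict U to the states with an empty CM mode, ordered gg, ge, eg, ee."""
    basis = [idx(i, j, 0) for i in (G, E) for j in (G, E)]
    return U[np.ix_(basis, basis)]

gate = conditional_sign_gate()
print(np.round(logical_block(gate).real, 12))
```

In this sketch the logical block is diagonal with a single negative entry for the state in which both ions are excited, and the CM mode returns to its ground state for every logical input.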
Figure 7: First step of the two-qubit quantum gate in the ion-trap QC from the perspective of (a) ion i and (b) ion j . The relevant internal states of the ion are shown together with the first two vibrational states.
Before this step the CM mode is in its vacuum state and therefore only the two states $|g\rangle_j|0\rangle_{CM}$ and $|e\rangle_j|0\rangle_{CM}$ are populated. Note that in Fig. 7b a third internal level is depicted. This level is used for performing step (ii) and will be explained later. If the $i$th ion was in state $|g\rangle_i$ before step (i), nothing would happen in the level scheme shown in Fig. 7b. However, if the $i$th ion was in state $|e\rangle_i$, the states $|g\rangle_j|0\rangle_{CM}$ and $|e\rangle_j|0\rangle_{CM}$ would be transferred to $|g\rangle_j|1\rangle_{CM}$ and $|e\rangle_j|1\rangle_{CM}$, respectively.
Step (ii): The sign change on ion $j$, conditioned by the state of the CM phonon mode, now needs to be effected. This sign change is to be carried out only if ion $j$ is in the excited state $|e\rangle_j$ and the CM phonon mode is in the first excited state $|1\rangle_{CM}$. The corresponding level scheme is depicted in Fig. 8. The desired transformation is thus:
$$
\begin{array}{lcl}
|g\rangle_j|0\rangle_{CM} & \rightarrow & |g\rangle_j|0\rangle_{CM} \\
|g\rangle_j|1\rangle_{CM} & \rightarrow & |g\rangle_j|1\rangle_{CM} \\
|e\rangle_j|0\rangle_{CM} & \rightarrow & |e\rangle_j|0\rangle_{CM} \\
|e\rangle_j|1\rangle_{CM} & \rightarrow & -|e\rangle_j|1\rangle_{CM}
\end{array}
\qquad (10)
$$
Figure 8: Second step of the two-qubit quantum gate in the ion-trap QC: sign change through a full Rabi oscillation via an auxiliary state $|a\rangle_j|0\rangle_{CM}$.
For this step an auxiliary level $|a\rangle_j$ in ion $j$ is needed which is empty before and after the operation. This gives rise to a third set of states in Fig. 8. The operation of Eq. 10 is now performed by inducing Rabi oscillations between level $|e\rangle_j|1\rangle_{CM}$ and level $|a\rangle_j|0\rangle_{CM}$. This is achieved by shining a laser on ion $j$ which is tuned into resonance with this transition. After the initial state $|e\rangle_j|1\rangle_{CM}$ has performed a full cycle via state $|a\rangle_j|0\rangle_{CM}$, the sign of its amplitude will have changed, corresponding to the argument of the cos-function being equal to $\pi$. The other three initially populated states, $|g\rangle_j|0\rangle_{CM}$, $|g\rangle_j|1\rangle_{CM}$ and $|e\rangle_j|0\rangle_{CM}$, are not in resonance with any other state and therefore no Rabi oscillations, and thus no sign change, will take place.
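The sign change produced by a full Rabi cycle can be illustrated with a short sketch. It assumes a resonant two-level model for the pair $|e\rangle_j|1\rangle_{CM}$, $|a\rangle_j|0\rangle_{CM}$ with Hamiltonian $H = (\hbar\Omega/2)(|e,1\rangle\langle a,0| + \mathrm{h.c.})$, so that the amplitude of the initial state evolves as $\cos(\Omega t/2)$. The convention for $\Omega$, the numerical value chosen, and the function name are assumptions made for illustration.

```python
import numpy as np

def rabi_propagator(omega, t):
    """Propagator exp(-i H t / hbar) for H = (hbar * omega / 2) * sigma_x.

    Basis order: (|e>_j |1>_CM, |a>_j |0>_CM).
    """
    half = 0.5 * omega * t
    return np.array([[np.cos(half), -1j * np.sin(half)],
                     [-1j * np.sin(half), np.cos(half)]])

omega = 2 * np.pi * 1.0          # assumed Rabi frequency (arbitrary units)
t_full = 2 * np.pi / omega       # duration of one full Rabi cycle

start = np.array([1.0, 0.0], dtype=complex)             # |e>_j |1>_CM
after_half = rabi_propagator(omega, t_full / 2) @ start  # population in |a>_j |0>_CM
after_full = rabi_propagator(omega, t_full) @ start      # back in |e>_j |1>_CM

print(np.round(after_full, 12))
```

After half a cycle the population sits entirely in the auxiliary state, and after the full cycle the system is back in its initial state with the amplitude multiplied by $\cos\pi = -1$.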
Step (iii): The final step is to reverse step (i), in which the qubit stored in ion $i$ was transferred to the CM phonon mode. This reverse step can be carried out in complete analogy to step (i). In summary, these three steps demonstrate how a conditional sign-change gate on two qubits can be performed in an ion-trap QC. This gate, along with arbitrary 1-bit quantum gates, is sufficient to perform any quantum computation. The three necessary transformations are summarized in the following table:
The transformation table shows that the quantum gate of Eq. 8 is realized through the procedure described above by making the appropriate identifications of logical states with internal ionic levels.
2.8 The NIST experiment
A simplified version of a 2-bit quantum gate based on the ion-trap QC proposal has already been demonstrated experimentally at NIST in Boulder. In this experiment a single Be$^+$ ion was stored in an ion trap. Laser cooling was used to cool the ion to the motional ground state $|0\rangle_{CM}$ (for trapping and cooling technology see Sec. 2.3). This single ion was used to demonstrate a controlled-NOT gate. The target bit was stored in the internal levels $|g\rangle$ and $|e\rangle$ of the Be$^+$ ion. (Actually, two hyperfine ground-state Zeeman sublevels were used, which are separated by 1.250 GHz, a microwave transition. Rabi oscillations between these two levels can be induced with optical lasers by using a Raman transition involving a common excited state.) The control bit was stored in the ground state $|0\rangle_{CM}$ and the first excited state $|1\rangle_{CM}$ of the CM phonon mode. This meant that the first and the third steps of the scheme described in Sec. 2.7 were unnecessary because one of the quantum bits was already in the CM phonon mode.
The controlled-NOT gate performed at NIST functions in the following way. First the 1-bit quantum gate of Eq. 9 is applied to the internal levels of the ion as described in Sec. 2.2. The conditional sign-change gate of Eq. 8 is then performed by inducing a full Rabi cycle from level $|e\rangle|1\rangle_{CM}$ to level $|a\rangle|0\rangle_{CM}$ and back with a laser pulse of appropriate frequency and duration. This Rabi cycle performs the required sign change on the basis state identified with both qubits being in the logical 1 state. Finally, the operation of Eq. 9 is applied to the ion again, which completes the quantum gate.
In the NIST experiment, the four basis states were prepared and the controlled-NOT gate was applied. Afterwards, the state of the ion was measured with the quantum jump technique. The 2-bit quantum gate yielded the expected result with a reliability of 90%. The main causes of the 10% error margin will be discussed in Sec. 4.2.
Figure 9: Schematic representation of the basic elements of the cavity QED quantum computer.
3 The cavity quantum electrodynamics quantum computer
3.1 Introduction
The model system described in this section is based on cavity quantum electrodynamics (cavity QED). As in the ion-trap quantum computer, the quantum bits are represented by ions or atoms and are kept fixed in space by appropriate trapping techniques. However, the communication between the atoms is based on the exchange of photons. This proposal may serve as a useful tool to experiment with quantum logic based on photon exchange. The basic elements of the cavity QED QC are schematically depicted in Fig. 9. The atoms/ions which represent the qubits are trapped between two highly reflective mirrors. These mirrors are adjusted such that a single resonator mode is coupled to all atoms, thereby providing a means of communication between them.
The most obvious way to perform quantum logic gates would be simply to translate the steps used in the ion-trap QC to the present model. That is, in order to perform a 2-bit quantum gate, the quantum information from one atom would be transferred to the cavity mode by an appropriate laser pulse, which makes this quantum bit (literally) visible to all the other atoms. Then an appropriate conditional operation on the second atom would be effected dependent upon the state of the cavity. Finally, the first step would be "undone" by transferring the quantum bit back to the atom.
There are, however, problems with this simple strategy. First, it would require very accurate control of the coupling of the atoms to the cavity, which is not as simple as in the ion-trap QC, where the coupling can be controlled through the intensity of a laser field. Second, the simple strategy would require the temporary population of excited atomic states. This requires the use of dipole-allowed transitions in the optical regime, which are very unstable. Consequently, any significant population in the excited atomic state will give rise to unwanted decoherence effects.
The proposed method seeks to circumvent these problems by using a dark state technique. Dark states have a variety of applications in atomic and molecular physics, in laser cooling and in cavity QED. The principle of dark states can only be understood quantum mechanically and is based on destructive quantum interference. Section 3.3 gives a brief introduction to dark states. In the present model, dark states are used to perform quantum gates without populating the excited atomic states. This is a surprising possibility since the lasers and the cavity mode are in fact strongly coupled to the excited states.
3.2 The basic physical system
This section describes the basic physical model. Later amendments will be necessary, but they add nothing conceptually new to the basic model described here. The atoms are represented by three-level systems with two ground states $|a_0\rangle$ and $|a_1\rangle$ and an excited state $|b\rangle$, as depicted in Fig. 10. The excited atomic state decays with a decay rate $\gamma$, while the ground states are stable. While the QC is idle (i.e. while no quantum gates are being performed) the quantum information is stored in the stable ground states. As in the ion-trap QC, ground states provide a stable and reliable method for storage of quantum information. Both $|a_0\rangle$ and $|a_1\rangle$ are coupled to the excited state $|b\rangle$: the states $|a_0\rangle$ and $|b\rangle$ are coupled by a laser field, while $|a_1\rangle$ and $|b\rangle$ are coupled by the interaction with the single cavity mode. Ideally, the model would have a strong coupling strength $g$ of the atom to the cavity mode and a small decay rate $\gamma$. However, $g$ and $\gamma$ are not independent. In particular, it can be shown that the ratio $\gamma/g$ is proportional to the square root of the frequency $\omega$ of the transition. Thus it is desirable to use small frequencies such as microwave transitions. However, in this scheme it is essential that each atom can be addressed by laser light individually. This imposes a condition on the minimum distance of the atoms: their separation must be much larger than the wavelength of the laser field. This excludes the possibility of using microwave fields for exciting the atoms, as this would require a huge separation between them. Therefore optical frequencies must be used and the atomic decay rate $\gamma$ is a non-negligible quantity.
Figure 10: Basic three-level system for the cavity QED quantum computer and for quantum networking (see Sec. 5). One transition is coupled to a laser field, the other is coupled to a quantized cavity mode.
The coupling of the atomic states $|a_0\rangle$ and $|b\rangle$ to the (classical) laser field is described by a Hamiltonian similar to that given in Eq. 1. However, the cavity mode which is coupled to the states $|a_1\rangle$ and $|b\rangle$ must be treated quantum mechanically. This is because the quantum state of the cavity mode always has an extremely low photon number. Initially we assume that the photon number in the cavity is zero. Note that this condition can easily be satisfied for optical frequencies at room temperature, whereas for microwave frequencies cooling schemes are necessary. In quantum optics cavity modes are usually described by quantum mechanical harmonic oscillators. Here we are only interested in a single mode. The Hamiltonian which describes the interaction between the single, quantized cavity mode and the atomic transition $|a_1\rangle - |b\rangle$ is given by:
$$
H_{cav-atom} = \hbar g\, c^{\dagger} |a_1\rangle\langle b| + \hbar g^{*} c\, |b\rangle\langle a_1| . \qquad (12)
$$
Figure 11: Basic level scheme for explaining the concept of dark states: three-level system coupled to two laser fields.
$c$ and $c^{\dagger}$ denote the mode annihilation and creation operators. The Hamiltonian which governs this dynamic behaviour is very similar to the one describing an atom coupled to a collective vibrational excitation (see Eq. 6). The coupling strength between the atom and the mode is denoted by $g$. This simple interaction is called the Jaynes-Cummings model. The Hamiltonian has already been transformed to a rotating frame to eliminate fast optical frequencies. It is valid in this form only on resonance (i.e. when the frequency of the atomic transition equals the frequency of the cavity mode). Note also that an excitation of the atom is accompanied by a decrease of the photon number by one, as described by the second term in Eq. 12. The first term describes the opposite process. If the system was prepared in the excited atomic state and the cavity in the zero-photon state $|0\rangle$ (also called the vacuum state), Rabi oscillations would take place between the states $|b\rangle|0\rangle$ and $|a_1\rangle|1\rangle$. The first ket denotes the atomic state and the second ket is a number state of the cavity mode (i.e. a state which contains a precise number of photons). The frequency of these Rabi oscillations is given by the modulus of the coupling strength $g$. The following short description of dark states and adiabatic passage is a necessary preliminary to the description of the quantum gate.
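The vacuum Rabi oscillations between $|b\rangle|0\rangle$ and $|a_1\rangle|1\rangle$ can be reproduced with a minimal numerical sketch of the Hamiltonian of Eq. 12. The sketch assumes $\hbar = 1$, a cavity Fock space truncated at a small photon number, and an arbitrary complex value of $g$; the function names and the truncation `n_max` are hypothetical choices made for illustration. With these conventions the population of $|b\rangle|0\rangle$ evolves as $\cos^2(|g|t)$.

```python
import numpy as np

A1, B = 0, 1   # atomic basis order: |a1>, |b>

def jaynes_cummings(g, n_max=3):
    """H/hbar = g c^dag |a1><b| + g* c |b><a1| (Eq. 12, on resonance)."""
    dim_c = n_max + 1
    c = np.diag(np.sqrt(np.arange(1, dim_c)), k=1)       # annihilation operator
    lower = np.array([[0, 1], [0, 0]], dtype=complex)    # |a1><b|
    H = g * np.kron(lower, c.conj().T)                   # atom (x) cavity
    return H + H.conj().T

def excited_population(g, t, n_max=3):
    """Population of |b>|0> at time t, starting from |b>|0>."""
    dim_c = n_max + 1
    H = jaynes_cummings(g, n_max)
    psi0 = np.zeros(2 * dim_c, dtype=complex)
    psi0[B * dim_c + 0] = 1.0
    w, v = np.linalg.eigh(H)                             # H is Hermitian
    psi_t = v @ (np.exp(-1j * w * t) * (v.conj().T @ psi0))
    return abs(psi_t[B * dim_c + 0]) ** 2

g = 0.7 * np.exp(0.3j)           # assumed coupling strength (arbitrary units)
for t in (0.0, 0.4, np.pi / (2 * abs(g))):
    print(t, excited_population(g, t))
```

Only the modulus of $g$ enters the oscillation, in line with the statement above; at $t = \pi/(2|g|)$ the excitation has been transferred completely to the cavity photon.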
3.3 Dark states and adiabatic passage
Destructive interference in quantum mechanics causes counterintuitive behaviour of quantum states. The concept can be seen in the following simple example. (In the proposal for a quantum computer, a generalization will be necessary.) The model system consists of a single atom with ground states $|a_0\rangle$ and $|a_1\rangle$ and an excited state $|b\rangle$ (see Fig. 11). For the sake of simplicity both transitions $|a_0\rangle - |b\rangle$ and $|a_1\rangle - |b\rangle$ are coupled to classical laser fields. Thus the Hamiltonian which describes the dynamics of the system reads:
$$
H = \hbar\Omega_0 |b\rangle\langle a_0| + \hbar\Omega_1 |b\rangle\langle a_1| + \mathrm{h.c.} \qquad (13)
$$
Again, this Hamiltonian is valid on resonance and is written in a rotating frame to avoid high-frequency contributions. The symbols $\Omega_0$ and $\Omega_1$ denote Rabi frequencies. If only one of the two lasers was switched on, Rabi oscillations between one of the ground states and the excited state would take place. One would expect that if both lasers were switched on, a more complicated pattern of Rabi oscillations would take place. In general, this is in fact the case. However, for very special quantum states an interesting effect occurs: these quantum states, also called dark states, are completely decoupled from the laser interaction even though lasers coupled to both transitions are shone on the atom. That is, the atom remains in its initial superposition of ground states because the contributions to the excited state amplitude from both ground states interfere destructively. For the Hamiltonian given in Eq. 13 the dark state reads:
$$
|D\rangle = \mathcal{N} \left[ \Omega_1 |a_0\rangle - \Omega_0 |a_1\rangle \right] . \qquad (14)
$$
Here $\mathcal{N}$ is a normalization constant. It can be seen that $|D\rangle$ is an eigenstate of the Hamiltonian given by Eq. 13, corresponding to the eigenvalue 0. Thus the state remains unchanged, even in the presence of both laser fields.
Dark states have useful applications in conjunction with adiabatic passage. If an atom is prepared in a dark state and the Rabi frequencies are changed slowly, the dark state adjusts adiabatically to the new laser configuration. This process is called an adiabatic passage. For example, if the Rabi frequencies fulfill the inequality $\Omega_1 \gg \Omega_0$, the dark state of Eq. 14 is given by $|D\rangle \simeq |a_0\rangle$. If the Rabi frequencies were changed slowly to arrive at $\Omega_0 \gg \Omega_1$, the corresponding dark state would become $|D\rangle \simeq |a_1\rangle$ (up to an overall sign). An atom initially prepared in $|a_0\rangle$ would evolve (under adiabatic conditions) into state $|a_1\rangle$. Thus a population transfer could be achieved without having any population in the excited state $|b\rangle$ at any time.
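The dark state of Eq. 14 and the adiabatic passage described above can be explored with a short numerical sketch. It assumes $\hbar = 1$, real Rabi frequencies, Gaussian laser pulses, and no spontaneous decay; the pulse parameters (`peak`, `width`, `delay`), the step size and all function names are illustrative assumptions, not values taken from the text. The time evolution uses short piecewise-constant steps.

```python
import numpy as np

A0, A1, B = 0, 1, 2   # basis order: |a0>, |a1>, |b>

def hamiltonian(omega0, omega1):
    """Eq. 13 with hbar = 1 and real Rabi frequencies."""
    H = np.zeros((3, 3))
    H[B, A0] = H[A0, B] = omega0
    H[B, A1] = H[A1, B] = omega1
    return H

def dark_state(omega0, omega1):
    """Eq. 14: |D> proportional to omega1 |a0> - omega0 |a1>."""
    d = np.array([omega1, -omega0, 0.0])
    return d / np.linalg.norm(d)

def pulse(t, center, peak=40.0, width=1.0):
    """Gaussian pulse shape (assumed for illustration)."""
    return peak * np.exp(-((t - center) / width) ** 2)

def adiabatic_passage(delay=1.2, t_max=5.0, dt=0.002):
    """Start in |a0>; omega1 is switched on first, omega0 follows after `delay`."""
    psi = np.array([1.0, 0.0, 0.0], dtype=complex)
    max_excited = 0.0
    for t in np.arange(-t_max, t_max, dt):
        tm = t + 0.5 * dt                               # midpoint of the step
        H = hamiltonian(pulse(tm, +delay / 2), pulse(tm, -delay / 2))
        w, v = np.linalg.eigh(H)
        psi = v @ (np.exp(-1j * w * dt) * (v.conj().T @ psi))
        max_excited = max(max_excited, abs(psi[B]) ** 2)
    return psi, max_excited

# The dark state is annihilated by H for any pair of Rabi frequencies.
print(hamiltonian(0.3, 1.1) @ dark_state(0.3, 1.1))

psi_final, max_excited = adiabatic_passage()
print("population in |a1>:", abs(psi_final[A1]) ** 2)
print("largest population in |b>:", max_excited)
```

With these assumed parameters the laser coupling $|a_1\rangle - |b\rangle$ precedes the one coupling $|a_0\rangle - |b\rangle$, so the dark state starts close to $|a_0\rangle$ and ends close to $|a_1\rangle$, while the excited state is populated only weakly during the transfer.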
Note, however, that the ground states $|a_0\rangle$ and $|a_1\rangle$ are not coupled directly but only via the excited state $|b\rangle$. This has practical implications if the excited state $|b\rangle$ is short-lived: because the excited state is never populated, spontaneous emission does not deteriorate the transfer. This is the basis of a useful application of dark states in atomic physics. Adiabatic passage via dark states can also be used for building atomic beam splitters and for quantum state synthesis in the context of cavity QED, to give just two examples. Two new applications of adiabatic passage are described in this chapter: firstly, the cavity QED quantum computer described in this section, and secondly, a scheme for quantum networking between spatially separated QCs (see Sec. 5.3).
3.4 Multi-atom dark states and quantum state swapping
For the cavity QED quantum computer a generalization of the basic dark state idea is required. In Sec. 3.3 the notion of a dark state was introduced for a single atom (described by a three-level system) coupled to two classical fields. The $N$-atom dark state can now be considered. As in the first example, the atoms are described by three-level systems with two ground states $|a_0\rangle$ and $|a_1\rangle$ and an excited state $|b\rangle$. In order to achieve a coupling between the individual atoms, the transition $|a_1\rangle - |b\rangle$ is coupled to a single mode of an optical cavity as described in Sec. 3.2 (see Fig. 10). The other transition is coupled to classical laser fields which can be controlled independently for each atom. Thus the Hamiltonian for $N$ atoms interacting with the cavity mode and with $N$ laser fields consists of contributions from Eqs. 12 and 13:
$$
H = \hbar \sum_{i=1}^{N} \left[ \Omega_i |b\rangle_i\langle a_0| + g\, c^{\dagger} |a_1\rangle_i\langle b| + \mathrm{h.c.} \right] . \qquad (15)
$$
For the sake of simplicity, the example is restricted to the case $N = 2$. It can be verified very easily that for the Hamiltonian of Eq. 15 a family of dark states (and not only a single one as in the single-atom case) exists:
$$
\begin{array}{lcl}
|D_0\rangle & = & |a_1\rangle_1 |a_1\rangle_2 |0\rangle , \\
|D_k\rangle & = & \mathcal{N}_k \left[ \Omega_2 g\, |a_0\rangle_1 |a_1\rangle_2 |k-1\rangle + \Omega_1 g\, |a_1\rangle_1 |a_0\rangle_2 |k-1\rangle - \Omega_1 \Omega_2\, |a_1\rangle_1 |a_1\rangle_2 |k\rangle \right] \quad \mathrm{for}\ k \geq 1 .
\end{array}
$$
Here the first and second ket denote the first and second atom respectively, while the third ket denotes a number state of the cavity mode ($|n\rangle$ denotes a quantum state of the cavity mode with a photon number of exactly $n$; these states are also called Fock states). The first dark state $|D_0\rangle$ is decoupled from the interaction for energy conservation reasons, while the others are dark due to destructive quantum interference. In the present case only the first two dark states $|D_0\rangle$ and $|D_1\rangle$ are relevant. From the linearity of quantum mechanics it follows that any superposition of these states is also a dark state (as they are degenerate eigenvectors of the Hamiltonian of Eq. 15). It is now possible to perform an adiabatic passage by using a superposition of $|D_0\rangle$ and $|D_1\rangle$. Such an adiabatic passage can be used to transfer quantum states from one atom to another. For example, atom 1 is initially prepared in an arbitrary superposition of the two ground states $|a_0\rangle_1$ and $|a_1\rangle_1$, while atom 2 is prepared in state $|a_1\rangle_2$. It can be seen that if the Rabi frequency of the laser shining on the second atom is much larger than that of the laser interacting with the first ($\Omega_2 \gg \Omega_1$), the initial atomic state can be expressed in terms of the dark states $|D_0\rangle$ and $|D_1\rangle$ as follows:
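$$
\left[ \alpha |a_0\rangle_1 + \beta |a_1\rangle_1 \right] |a_1\rangle_2 |0\rangle = \alpha |D_1\rangle + \beta |D_0\rangle .
$$
$\alpha$ and $\beta$ denote arbitrary complex coefficients. A slow modification of the Rabi frequencies will preserve the initial superposition of dark states following the description given in Sec. 3.3. In particular, slowly changing from $\Omega_2 \gg \Omega_1$ to $\Omega_1 \gg \Omega_2$ will give rise to the following transformation:
$$
\left[ \alpha |a_0\rangle_1 + \beta |a_1\rangle_1 \right] |a_1\rangle_2 |0\rangle \Longrightarrow |a_1\rangle_1 \left[ \alpha |a_0\rangle_2 + \beta |a_1\rangle_2 \right] |0\rangle .
$$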
Thus quantum state swapping between the two atoms has been achieved. That is, the quantum state initially stored in the first atom is transferred to the second atom and vice versa. Most importantly, this quantum state swapping has been achieved without populating any of the (fragile) excited atomic states, thereby rendering the scheme insensitive to spontaneous emission. The next section demonstrates how quantum state swapping can be used for performing quantum logic gates.
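The state swapping can be illustrated with a numerical sketch for two atoms and one cavity mode. It assumes $\hbar = 1$, real couplings, the same coupling strength $g$ for both atoms, Gaussian laser pulses and no dissipation; the six-state basis, the pulse parameters and the function names are assumptions introduced here for illustration only. The laser on atom 2 is applied first and the laser on atom 1 follows after a delay.

```python
import numpy as np

# Basis |atom 1, atom 2, cavity photons>:
# 0: |a0,a1,0>  1: |a1,a0,0>  2: |a1,a1,1>  3: |b,a1,0>  4: |a1,b,0>  5: |a1,a1,0>

def hamiltonian(omega1, omega2, g):
    """Two-atom laser and cavity couplings (hbar = 1, real couplings)."""
    H = np.zeros((6, 6))
    H[3, 0] = omega1   # laser on atom 1: |a0>_1 <-> |b>_1
    H[4, 1] = omega2   # laser on atom 2: |a0>_2 <-> |b>_2
    H[3, 2] = g        # cavity photon absorbed by atom 1
    H[4, 2] = g        # cavity photon absorbed by atom 2
    return H + H.T

def dark_state_1(omega1, omega2, g):
    """|D_1> in the basis above."""
    d = np.array([omega2 * g, omega1 * g, -omega1 * omega2, 0.0, 0.0, 0.0])
    return d / np.linalg.norm(d)

def pulse(t, center, peak=40.0, width=1.0):
    """Gaussian pulse shape (assumed for illustration)."""
    return peak * np.exp(-((t - center) / width) ** 2)

def swap(alpha, beta, g=40.0, delay=1.2, t_max=5.0, dt=0.002):
    """Evolve [alpha|a0>_1 + beta|a1>_1]|a1>_2|0> through the adiabatic passage."""
    psi = np.zeros(6, dtype=complex)
    psi[0], psi[5] = alpha, beta
    max_excited = 0.0
    for t in np.arange(-t_max, t_max, dt):
        tm = t + 0.5 * dt
        H = hamiltonian(pulse(tm, +delay / 2), pulse(tm, -delay / 2), g)
        w, v = np.linalg.eigh(H)
        psi = v @ (np.exp(-1j * w * dt) * (v.conj().T @ psi))
        max_excited = max(max_excited, abs(psi[3]) ** 2 + abs(psi[4]) ** 2)
    target = np.zeros(6, dtype=complex)      # |a1>_1 [alpha|a0>_2 + beta|a1>_2] |0>
    target[1], target[5] = alpha, beta
    fidelity = abs(np.vdot(target, psi)) ** 2
    return fidelity, max_excited

fidelity, max_excited = swap(0.6, 0.8j)
print("swap fidelity:", fidelity)
print("largest excited-state population:", max_excited)
```

In this sketch the cavity mode is populated transiently, whereas the excited atomic states $|b\rangle_1$ and $|b\rangle_2$ carry only a small population, which is the property the scheme relies on.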
3.5 Performing a controlled-NOT gate
As in the case of the ion-trap QC, a two-bit quantum gate needs to be effected in order to perform arbitrary quantum computations. Here it is shown how to perform the controlled-NOT gate 34 which is characterized by the following transformation:
Figure 12: Two parallel three-level systems in each ion are necessary to perform quantum gates in the cavity QED quantum computer.
In order to perform a controlled-NOT gate between two atoms it is necessary that in each atom two three-level systems are available, as depicted in Fig. 12. Thus we have four ground states $|a_0\rangle$, $|a_0'\rangle$, $|a_1\rangle$ and $|a_1'\rangle$ and two excited states $|b\rangle$ and $|b'\rangle$. The states $|z\rangle$ and $|z'\rangle$, where $z = a_0, a_1, b$, are energetically degenerate. These levels may be, for example, degenerate Zeeman sublevels. Thus the classical laser field simultaneously couples to the transitions $|a_0\rangle - |b\rangle$ and $|a_0'\rangle - |b'\rangle$. Similarly, the cavity mode couples to the transitions $|a_1\rangle - |b\rangle$ and $|a_1'\rangle - |b'\rangle$. Note that it is not important that the Rabi frequencies and coupling strengths are the same for both transitions. A controlled-NOT gate can be performed with two atoms of this structure, as shown in Fig. 12. The logical values 0 and 1 are identified with atomic ground states as follows:
in the first atom and
in the second atom. (This asymmetry can be avoided by two single-atom operations before and after the quantum gate.) The controlled-NOT gate requires three steps. Firstly, an adiabatic passage is performed from $\Omega_2 \gg \Omega_1$ to $\Omega_1 \gg \Omega_2$. In practice this can be achieved by applying two laser pulses to the two atoms. The laser pulses have the same shape as a function of time (a Gaussian, for example), but the pulse applied to atom 1 is delayed with respect to the pulse applied to atom 2. The four logical basis states then undergo the following transformation:
This transformation corresponds to the adiabatic passage described above in Sec. 3.3, which takes place in parallel in the two three-level systems of the second atom. Note that this step has transferred the quantum state initially stored in atom 1 to atom 2. The two quantum bits are now locally encoded in the second atom, while there is no quantum information stored in the first atom. The second step of the quantum gate consists in performing a single-atom operation on the second atom. A laser pulse is applied which swaps the two states $|a_1\rangle_2$ and $|a_1'\rangle_2$ while leaving $|a_0\rangle_2$ and $|a_0'\rangle_2$ unchanged. In general this is not a difficult experimental problem and it can be achieved with the methods described in Sec. 2.2. The transformation which takes place in atom 2 reads:
Finally, the first step is reversed, which recovers the quantum bit originally stored in atom 1. In summary, the transformations corresponding to the three steps of the quantum gate read:
This corresponds to a controlled-NOT gate as defined in Eq. 16. In conjunction with arbitrary 1-qubit gates, which are easy to implement experimentally, any arbitrary 2-qubit gate can be realized. The generalisation to $N$ qubits is also straightforward. In order to prevent the atoms not involved in a particular quantum gate from interacting with the cavity mode, the quantum information in these atoms is stored in the levels $|a_0\rangle$ and $|a_0'\rangle$. When a controlled-NOT gate between a given pair of qubits is required, the atomic representation of the quantum information is changed by single-atom operations as in Eqs. 17 and 18. After the quantum gate the qubits are transferred back to $|a_0\rangle$ and $|a_0'\rangle$.
4 Experimentally feasible quantum error-correction
4.1 Introduction
Quantum computations are prone to errors and thus the development of powerful error-correction techniques will be crucial for the success of quantum computing. Quantum error-correction is significantly more difficult than classical error-correction. This is mainly because error correction requires measurements, which for qubits (in contrast to classical bits) will in general result in loss of quantum coherence and thus of quantum information. A classical bit, on the other hand, can assume only one of the two possible logical values 0 or 1. At any point during the computation the classical bit can be measured and error-correction steps can be applied conditioned by the outcome of the measurement. Quantum error-correction as a field of research came into existence in 1995. Peter Shor of AT&T Bell Labs and Andrew Steane of the University of Oxford independently discovered the first quantum codes. Their pioneering results spurred vigorous research activity, the main results of which are described elsewhere in this book. Due to the complexity of most error-correction techniques they are unlikely to be experimentally feasible in the foreseeable future. However, this section describes an error-correction scheme which was designed to correct for a key error of the ion-trap quantum computer. Because this scheme was developed for a particular physical situation, an implementation that is economical in terms of resources, and hence realizable, was possible.
4.2 Types of error in the ion-trap quantum computer
To assess the types of error which can take place in an ion-trap QC a broad distinction may be made between memory errors and computational errors.
Memory errors. If a state is prepared in a quantum register, it will, in general, not remain unchanged in time. This is because any realistic quantum register cannot be isolated perfectly from its environment. The interaction of the quantum register with the environment will invariably destroy the quantum coherence of superposition states. Quantum bits with single trapped ions have been realized in which quantum superpositions persisted for several minutes [16]. However, there are several effects which can lead to memory errors in the ion-trap QC. Examples are fluctuations of the fields that trap the ions and imperfections in the vacuum in which the ions are trapped. The latter can cause collisions with the background gas which will destroy the quantum superposition in the affected ion.
Computational errors. The second kind of error takes place only while quantum gates are being performed. Computational errors may be caused by a variety of factors. One type of error occurs when the transformations needed to perform quantum gates are not accurate. For example, in the ion-trap QC the duration of the pulses may not be precise, so that the intended gate is implemented only approximately. This type of error leaves quantum superpositions intact and does not give rise to decoherence. However, errors may accumulate and spoil the computation. There is also the possibility of decoherence during quantum gates. This kind of error can occur if the qubits involved in the gate are coupled to fluctuating fields or fragile auxiliary quantum systems. This type of error is predominant in the ion-trap quantum computer. Quantum gates require temporary excitation of the CM phonon mode, which is a much more fragile quantum system than the ionic ground states. The CM phonon mode undergoes damping and heating due to its coupling to the electrodes of the trap. Errors can also arise due to imperfect laser cooling. In addition, there are spontaneous emissions due to the coupling of the stable ground states to unstable excited states, which is necessary to perform quantum gates. There are also errors due to intensity and phase fluctuations of the lasers that manipulate the ions. In the NIST experiment the decoherence time was found to be of the order of a millisecond. This number has to be contrasted with the decoherence time for quantum memory mentioned above (several minutes) and with the time needed to perform quantum gates, which is approximately 50 microseconds.
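The time scales quoted above translate into a rough gate budget, which the following arithmetic sketch makes explicit. It takes "of the order of a millisecond" as exactly 1000 microseconds, which is an assumption made only for the sake of the estimate; the constant and function names are hypothetical.

```python
DECOHERENCE_TIME_US = 1000   # assumed: "of the order of a millisecond"
GATE_TIME_US = 50            # approximate duration of one quantum gate

def gates_within_coherence(decoherence_us, gate_us):
    """Number of sequential gates that fit into one decoherence time."""
    return decoherence_us // gate_us

print(gates_within_coherence(DECOHERENCE_TIME_US, GATE_TIME_US))  # prints 20
```

On this estimate only a few tens of sequential gates fit within the decoherence time during gate operation, which is why errors occurring during gates matter more here than memory errors.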
4.3 Correction of errors in the ion-trap quantum computer
The factors described in the previous section explain why the standard schemes for memory error-correction are not useful for the ion-trap QC. Applying these error-correction methods would only introduce additional errors, due to the quantum gates required by the error-correction procedure. This is not to say that error-correction schemes along these lines are not important on a long-term basis and for different physical implementations. What is needed is an error-correction scheme which corrects for errors that take place during the execution of quantum gates (computational errors). Recently, a generalization of quantum error correction has been proposed which is capable of correcting for errors during quantum gates. However, this fault-tolerant quantum computing method requires an extensive computational overhead and may not be applicable at all to prototype realizations like the ion-trap QC. This section describes a recently proposed error-correction scheme designed specifically for the ion-trap QC. The scheme corrects for an important source of errors during the execution of 2-bit quantum gates. Because the scheme is not intended to correct for the most general error, it can be implemented efficiently with regard to time and memory overhead. This scheme could be tested as soon as a prototype ion-trap QC becomes available.
Error model
There follows a simplified model for the errors in the CM phonon mode. Although not necessarily realistic, the model facilitates explanation of the scheme. The scheme, in turn, can be generalized to more realistic errors. In the simplest error model only decay of the CM phonon mode is included. If an error takes place the motional quantum number is decreased by one. This corresponds to a quantum jump modelled by the application of the CM mode annihilation operator $c$ followed by normalization [48]:
$$
|\psi\rangle \rightarrow \frac{c\,|\psi\rangle}{\| c\,|\psi\rangle \|} .
$$
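A minimal numerical sketch of such a quantum jump is given here for illustration. It assumes a CM mode truncated to a few phonon number states and uses hypothetical function names; the example amplitudes are arbitrary. The sketch applies the annihilation operator to a qubit stored in the zero- and one-phonon states and normalizes the result.

```python
import numpy as np

def annihilation(dim):
    """Annihilation operator c on a Fock space truncated to `dim` levels."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

def quantum_jump(state):
    """Apply c to the CM mode state and normalize the result."""
    out = annihilation(len(state)) @ state
    norm = np.linalg.norm(out)
    if norm == 0:
        raise ValueError("no phonon present: a decay jump cannot occur")
    return out / norm

# Qubit temporarily stored in the CM mode: alpha|0> + beta|1> (example amplitudes)
alpha, beta = 0.6, 0.8j
before = np.array([alpha, beta, 0.0], dtype=complex)
after = quantum_jump(before)

print(np.round(np.abs(after) ** 2, 12))   # phonon-number populations after the jump
```

Whatever the values of the two amplitudes (as long as the one-phonon amplitude is non-zero), the mode ends up in its ground state after the jump, so the amplitudes can no longer be recovered from it.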
Therefore, if such an error takes place during a 2-bit quantum gate, when one of the two qubits is temporarily swapped into the CM phonon mode, the corresponding quantum information is lost and the gate has failed.
Error detection
The scheme described in this section requires a means to measure, after the gate operation is complete, whether or not an error has taken place during the operation. This measurement has to be designed in such a way that if no error has occurred, the quantum state that has passed through the gate is not destroyed. In the present error model this can be done by using the second motional sideband whenever the first motional sideband was used in the original gate (see Sec. 2.7). For example, for step (i) described in Sec. 2.7 the laser is tuned to the first lower motional sideband $\omega_{las} = \omega_{eg} - \omega_{CM}$ in order to induce Rabi oscillations between the states $|e\rangle|0\rangle_{CM}$ and $|g\rangle|1\rangle_{CM}$. Here, instead, the laser is tuned to the second lower motional sideband $\omega_{las} = \omega_{eg} - 2\omega_{CM}$. This couples the states $|g\rangle|n+2\rangle_{CM}$ and $|e\rangle|n\rangle_{CM}$ (see Fig. 13). If the entire quantum gate is performed via the second motional sideband, the transformation table of Eq. 11 is modified by replacing the one-phonon states by two-phonon states: $|1\rangle_{CM} \rightarrow |2\rangle_{CM}$. During the quantum gate the CM phonon mode will thus be in a superposition of $|0\rangle_{CM}$ and $|2\rangle_{CM}$. If an error occurs the phonon annihilation operator is applied to this superposition. The amplitude of state $|0\rangle_{CM}$ vanishes while a quantum from the state $|2\rangle_{CM}$ is annihilated. After the quantum jump the CM phonon mode is in state $|1\rangle_{CM}$. The important feature of this operation is that this state is decoupled from the remaining laser pulses of the quantum gate because no resonant pair of levels is available. Therefore, if no further quantum jump takes place, the phonon mode remains in state $|1\rangle_{CM}$ until the end of the gate. Since, in the absence of errors, the phonon mode is in state $|0\rangle_{CM}$ after the gate, a measurement of $|1\rangle_{CM}$ indicates an error. This measurement can be made with an additional ion.
Figure 13: Level scheme with internal and external levels for error correction in the ion-trap QC. Solid arrow: first lower motional sideband; dashed arrow: second lower motional sideband. In the error correction scheme described here all operations involving the external motion have to be carried out via the second lower motional sideband. The wavy arrows indicate the decay processes which can take place within the present error model.
As an example, suppose that a two-qubit quantum gate of the type described in Sec. 2.7 is performed and the second lower motional sideband is used. If an error were to occur between step (ii) and step (iii), then the transformation table would read:
Figure 14: Quantum gate error correction: Schematic representation of the scheme (rows: ion i, CM, ion j; stages: encoding, phonon measurement). After each subgate a measurement determines if an error has taken place and the subgate is, if necessary, undone.
Of course, the resulting quantum state must be normalized after the error. But quantum information is lost because the amplitudes of the initial states $|g\rangle_i|g\rangle_j|0\rangle_{\mathrm{CM}}$ and $|g\rangle_i|e\rangle_j|0\rangle_{\mathrm{CM}}$ vanish.

Quantum gate error-correction

The implementation of error-correction is set out below. The key idea is that each logical quantum bit is encoded in two physical quantum bits located on the same ion. This means that the number of ions remains unchanged, but each ion has four levels to encode quantum information instead of two. The four levels of each ion are denoted by $|00\rangle$, $|01\rangle$, $|10\rangle$ and $|11\rangle$. The scheme is depicted schematically in Fig. 14. The first step of the process is to encode the
quantum bits which are involved in the quantum gate as follows:
The states $|\underline{0}\rangle$ and $|\underline{1}\rangle$ encode the logical states 0 and 1 redundantly. The encoding acts on each ion independently, and it is assumed for the sake of the example that it can be executed without error. This assumption is reasonable because errors are much more likely during two-bit quantum gates. The goal is to perform the conditional sign-change gate of Eq. 8 with error-correction on the states $|\underline{0}\rangle$ and $|\underline{1}\rangle$. This operation can be carried out by performing the sign-change gate on every pair of physical quantum bits which involves two ions, as shown in Fig. 14. Each of the four subgates now needs to be checked for errors. This can be done by measuring the phonon mode after each subgate. As mentioned earlier, a non-zero phonon number indicates an error. If an error has taken place, it turns out that all relevant amplitudes are still present, and the erroneous subgate can then be undone by a suitable transformation.

An example of the error-correction scheme is set out below, showing how it works in a sample situation. The example is restricted to one case; a general proof that the scheme works can be found in the original paper [45]. Suppose that an error has taken place between the second and the third steps of the first subgate. The effect of the error on the two qubits involved in this particular subgate can be read off from Eq. 19, and the other two qubits remain unchanged. Thus the four logical basis states after the first (erroneous) subgate would be:
This transformation demonstrates that all four initial amplitudes are still present and that local (i.e. single-ion) operations suffice to undo the gate and recover the initial state. However, the time at which the error has occurred
during the gate is not known. The transformation corresponding to an erroneous first subgate, where the error has taken place at an unknown time within the first subgate, reads [45]:
The parameters $\alpha$ and $\beta$ are unknown and depend on the time at which the error has taken place. However, a measurement of a qubit involved in the subgate (qubit three in this case) projects the system into a predictable state from which the initial state can be recovered by single-ion operations. After restoring the initial state the subgate can be tried again. It can be shown that this method works for all four subgates in a similar way. The scheme only fails if two errors occur during a subgate. This, however, is an effect of second order.

5 Quantum networking
5.1 Introduction

The final part of this chapter describes two methods which have been proposed to establish quantum communication between locally separated quantum computers. This is a particularly pressing problem because the number of qubits in the proposed model systems is limited. For example, the technology of linear radio-frequency Paul traps limits the number of ions that can be used in an ion-trap quantum computer. It therefore seems desirable to be able to set up a network of, say, ion-trap quantum computers to increase the number of qubits available for a quantum computation. At the time of writing two proposals have been put forward to accomplish this task [13,14]. These proposals are based on communicating quantum information stored in atoms (or ions) via optical fibres. The use of photons as the carrier of quantum information, instead of the mechanical transfer of atoms between distant sites, seems to be the more promising approach in terms of reliability of transfer. Of course, these methods are not error-free, and are susceptible to spontaneous emission of the
Figure 15: Schematic representation of the experimental setup used for the quantum networking schemes described here. The quantum information is stored in atoms. The atoms are coupled to cavities, which are, in turn, coupled to both ends of a fibre.
atoms, imperfections in the interface between the cavities and the optical fibres, and losses within the fibres. However, these shortcomings can be dealt with by means of appropriate design of the scheme or quantum error-correction.

The common features of the schemes described below are depicted in Fig. 15. Their goal is to exchange quantum information between locally separated qubits represented by atoms A and B. These atoms could be part of a quantum computer based on atomic qubits (see Secs. 2 and 3). The physical carriers of quantum information are single photons guided by an optical fibre. Because it is not feasible to couple the atoms directly to the fibre, a resonator is used to enhance the coupling of the atoms to the electromagnetic field. The resonator in turn is coupled to the fibre. This could be accomplished, for instance, by using the ends of the fibre directly as mirrors.

Quantum networking can be explained in the following way. Suppose atom (qubit) A is prepared in an arbitrary, unknown superposition of the logical states $|0\rangle$ and $|1\rangle$, and qubit B is prepared in a predefined state such as $|1\rangle$:
As usual, $\alpha$ and $\beta$ represent arbitrary complex coefficients. The aim is for the quantum states to be swapped after the process, i.e.:
Of course, quantum state swapping in the other direction, from B to A, can be accomplished by simply reversing the laser pulse sequence. Quantum state swapping, combined with the ability to perform arbitrary quantum computations in the local quantum computers, is sufficient to perform arbitrary quantum computations on the joint quantum computer.
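The net effect of the transfer on the two logical qubits can be sketched in a few lines of Python. This is only an illustration of the swap on state vectors, not of the physical process: the amplitude ordering, the helper names `tensor` and `swap`, and the numerical values of the coefficients are assumptions made for the example.

```python
# Minimal sketch: the net effect of the transfer on two logical qubits A and B.
# Amplitudes are stored in the order |00>, |01>, |10>, |11> (A first, B second).

def tensor(qubit_a, qubit_b):
    """Return the two-qubit state |a>|b> as a list of four amplitudes."""
    return [a * b for a in qubit_a for b in qubit_b]

def swap(state):
    """Exchange the roles of A and B: |01> <-> |10>, |00> and |11> unchanged."""
    s00, s01, s10, s11 = state
    return [s00, s10, s01, s11]

# Arbitrary normalized coefficients (illustrative values only).
alpha, beta = 0.6, 0.8j
one = [0, 1]                          # predefined state |1> of qubit B

before = tensor([alpha, beta], one)   # (alpha|0> + beta|1>)_A |1>_B
after = swap(before)                  # |1>_A (alpha|0> + beta|1>)_B
expected = tensor(one, [alpha, beta])
print(after == expected)
```

Because qubit B always starts in the fixed state $|1\rangle$, only the two basis states $|0\rangle_A|1\rangle_B$ and $|1\rangle_A|1\rangle_B$ ever occur as inputs, which is why the physical schemes below need to implement the swap on these two states only.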
Section 5.2 discusses a scheme for quantum networking put forward by Cirac and co-workers [13]. An alternative scheme proposed by the author is described in Sec. 5.3 [14].
5.2 Quantum networking with time-symmetrical wave packets

The method described in this section is an elegant and novel way of transferring quantum information between locally separated atoms, as depicted in Fig. 15 [13]. It is assumed that a qubit is to be transferred from site A to site B as described in Sec. 5.1 above. Initially the network is in a superposition of two basis states (see Eq. 20). Depending on the basis state of the atoms, the application of a laser pulse to atom A will either trigger the emission of a photon wave packet or have no effect. The explanation of these two possible outcomes is given below. Ideally, the wave packet would be completely absorbed at B, but this is complicated by the fact that it is partially reflected by the cavity on its arrival. The proposal seeks to overcome this problem by creating a symmetrical wave packet. The arrival of the wave packet at B coincides with the application (to B) of the time-reverse of the laser pulse that was applied to A. Because the process which takes place at B is an exact mirror image (in time) of the process at A, absorption will be complete.

A more detailed description is set out below. Each atom is modelled as a three-level system with two ground states $|a_0\rangle$ and $|a_1\rangle$ and an excited state $|b\rangle$, as depicted in Fig. 10. The laser field is coupled to the transition $|a_0\rangle - |b\rangle$, whereas the resonator field is coupled to the transition $|a_1\rangle - |b\rangle$. Some basic properties of atoms coupled to lasers and resonators have already been given in Sec. 3.2. Unlike the situation described in Sec. 3, the laser and cavity fields are now far detuned from their respective atomic resonances by a detuning $\Delta$. As a result, the excited state is never populated significantly for energy conservation reasons.
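The claim that a far-detuned excited state stays nearly empty can be checked numerically. The following is a minimal sketch under stated assumptions, not the model of the original work: it uses a rotating-frame Hamiltonian with $\hbar = 1$, takes the off-diagonal matrix elements directly as `omega` and `g` (prefactors depend on convention), folds the cavity photon number into the label of $|a_1\rangle$, neglects all decay, and picks illustrative parameter values.

```python
# Sketch: far-detuned three-level atom in the basis (|a0>, |b>, |a1>).
# Assumed rotating-frame Hamiltonian with hbar = 1:
#   H = delta |b><b| + omega (|a0><b| + h.c.) + g (|a1><b| + h.c.)
import math

omega = 1.0    # laser coupling on |a0> - |b>   (illustrative value)
g = 1.0        # cavity coupling on |a1> - |b>  (illustrative value)
delta = 50.0   # detuning, large compared with omega and g

def h_times(psi):
    """Apply H to the state vector psi = [c_a0, c_b, c_a1]."""
    c_a0, c_b, c_a1 = psi
    return [omega * c_b,
            omega * c_a0 + delta * c_b + g * c_a1,
            g * c_b]

def rk4_step(psi, dt):
    """One fourth-order Runge-Kutta step of i d(psi)/dt = H psi."""
    def f(v):
        return [-1j * x for x in h_times(v)]
    k1 = f(psi)
    k2 = f([p + 0.5 * dt * k for p, k in zip(psi, k1)])
    k3 = f([p + 0.5 * dt * k for p, k in zip(psi, k2)])
    k4 = f([p + dt * k for p, k in zip(psi, k3)])
    return [p + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]

# With equal couplings the population flops as sin^2(omega * g * t / delta),
# so nearly complete transfer is expected at t = pi * delta / (2 * omega * g).
t_transfer = math.pi * delta / (2.0 * omega * g)
steps = 40000
dt = t_transfer / steps

psi = [1.0 + 0j, 0j, 0j]          # atom starts in |a0>
max_excited = 0.0
for _ in range(steps):
    psi = rk4_step(psi, dt)
    max_excited = max(max_excited, abs(psi[1]) ** 2)

final_a1 = abs(psi[2]) ** 2
print(f"largest excited-state population: {max_excited:.4f}")
print(f"population transferred to |a1>:  {final_a1:.4f}")
```

The two couplings are chosen equal so that the light shifts of the two ground states coincide; with unequal couplings the same sketch shows incomplete transfer unless the shifts are compensated.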
In fact, the excited state can be eliminated from the model by a procedure called adiabatic elimination. This effectively leaves a system in which the two ground states $|a_0\rangle$ and $|a_1\rangle$ are coupled directly and Rabi oscillations take place between them. The effective Rabi frequency is determined by the product of the coupling strengths of the two electromagnetic fields to the atom. If the Rabi frequency of the laser is denoted by $\Omega$ and the coupling strength to the cavity by $g$, then the effective Rabi frequency will be:

$$\Omega_{\mathrm{eff}} = \frac{\Omega g}{\Delta}.$$

This formula holds under the assumption that the decay rate of the excited atomic state is much smaller than the detuning and that the Rabi oscillations
take place between zero- and one-photon states. Note that the effective Rabi frequency $\Omega_{\mathrm{eff}}$ can be controlled by changing just the laser Rabi frequency $\Omega$ and leaving the cavity coupling $g$ unchanged. Because of the coupling to the quantized cavity mode, oscillations in the cavity photon number and between atomic states will take place simultaneously. For example, if the system is initially prepared in state $|a_0\rangle$ and the cavity is in its vacuum state, then after a certain time (corresponding to a $\pi/2$-pulse) the system is found to be in the atomic state $|a_1\rangle$ and the cavity is in a one-photon state. On the other hand, if the atom is initially in the atomic state $|a_1\rangle$ (and the cavity is in the vacuum state), no excitation of the cavity mode can take place due to energy conservation and the cavity remains in the zero-photon state. The corresponding transformation table reads:
The logical values $|0\rangle$ and $|1\rangle$ are identified with the atomic levels $|a_0\rangle$ and $|a_1\rangle$, respectively. A $\pi/2$-pulse thus transfers the quantum information from the atom to the cavity. Because the coupling of the cavity photon to the fibre can be described by a decay rate $\kappa$, the field within the cavity decreases exponentially with a rate $\kappa$. At the same time, the electrical field in the fibre increases and a photon wave packet is formed which travels towards the cavity B. If the transformation of Eq. 21 takes place on a time scale which is fast compared to the leakage of the photons into the fibre, the photon wave packet will have a shape as shown in Fig. 16. The electrical field is strongest at the beginning of the wave packet and has an exponential tail. This is because the temporal behaviour of the intracavity photon number maps onto the spatial shape of the wave packet in the fibre.

After a certain time delay, determined by the length of the fibre, the wave packet will reach cavity B and the front of the wave packet will enter the cavity. However, a part of the wave packet will be reflected, exit the cavity and travel back towards A before the end of the wave packet has entered the cavity. Thus the photon is never entirely within the cavity. This, however, would be necessary to undo the transfer described in Eq. 21 and to restore the qubit in atom B.

The scheme proposed by Cirac and co-workers provides a simple and elegant solution to this problem. Instead of just performing a $\pi/2$-pulse by simply switching the laser field on and off to achieve quantum state transfer from the atom to the cavity, the temporal behaviour of the effective Rabi frequency $\Omega_{\mathrm{eff}}$ is tailored in a more sophisticated way. The frequency $\Omega_{\mathrm{eff}}$ is controlled so as to render the outgoing wave packet, which conveys the qubit from A to B, symmetrical in space (and time) (see Fig. 16). Details of the design of the effective
Figure 16: Wave packet travelling from sender to receiver. (a) Wave packet corresponding to the case that the photon in the cavity is created instantaneously (on the time scale at which the cavity field leaks into the fibre). (b) The cavity is excited gradually so as to generate a time-symmetrical wave packet.
Rabi frequency $\Omega_{\mathrm{eff}}$ as a function of $t$ which produces the symmetrical wave packet can be found in the original work. For this example it may suffice to note that, in order to render the wave packet time-symmetrical, $\Omega_{\mathrm{eff}}$ must not be switched instantaneously to its peak value but increased slowly, so that the maximum amplitude of the wave packet does not lie at its beginning. Assuming a function $\Omega_{\mathrm{eff}}(t)$ which produces a wave packet that is symmetrical in time and space, the following equation holds for the expectation value of the electrical field vector $\langle \mathcal{E}(x - ct) \rangle$ within the fibre:

$$\langle \mathcal{E}(x_0 + \epsilon) \rangle = \langle \mathcal{E}(x_0 - \epsilon) \rangle \qquad \forall \epsilon :\ 0 \le x_0 + \epsilon,\ x_0 - \epsilon \le x_{\max}.$$
Here $x$ and $t$ denote the spatial coordinate along the fibre and the time, respectively. The fibre extends from $x = 0$ to $x = x_{\max}$, and the wave packet is symmetrical about $x - ct = x_0$; $x_0$ can be set to 0 by choosing an appropriate origin of $t$. ($c$ denotes the speed of light within the fibre.) When the wave packet reaches the subsystem B, the qubit needs to be restored in the corresponding atom. The method used to achieve this is to run the emission process backwards in time with the sender system A replaced by the receiver system B. The time evolution of the electrical field created by the incoming
wave packet at the position of the receiver B is equivalent to the time-reversed evolution at the sender A. Of course, the time delay $x_{\max}/c$ (due to the finite speed of light) must be taken into account. Thus it is sufficient to apply the time reverse of the pulse $\Omega_{\mathrm{eff}}(t)$ to the receiver B (with the appropriate time delay):

$$\Omega^{B}_{\mathrm{eff}}(t) = \Omega^{A}_{\mathrm{eff}}(x_{\max}/c - t).$$

The superscripts A and B denote the subsystem to which the pulses are applied. In summary, the physical process at the receiver B is exactly the time reverse of the emission process at A and thus the quantum bit is restored in the receiving atom B.

5.3 Quantum networking via photonic dark states
Recently, the author proposed an alternative scheme to accomplish the task of quantum state swapping [14]. The experimental setup is identical to the one used for the scheme described previously in Sec. 5.2 (see Fig. 15). However, the physical process which is responsible for the quantum state transfer is quite different. In this scheme, the two cavities which provide the coupling between the atoms and the fibre are always in their zero-photon states throughout the whole transfer. This is achieved by means of dark-state techniques similar to those described in Sec. 3.3. The obvious advantage of having no photons in the cavities is that the scheme is not susceptible to cavity loss mechanisms and is therefore robust. Moreover, the scheme has the advantage that precise tailoring of the outgoing wave packet is not required. This is because the scheme is based on adiabatic passage (see Sec. 3.3). Consequently, the details of timing and shape of the pulses are not important as long as the process takes place adiabatically.

As in the scheme by Cirac et al. (see Sec. 5.2), the atoms are described as three-level atoms, and the transitions $|a_0\rangle - |b\rangle$ and $|a_1\rangle - |b\rangle$ are coupled to a laser field and to the cavity mode, respectively (see Fig. 10). Moreover, the two fields are detuned from resonance by $\Delta$ and it is assumed that adiabatic elimination techniques are applicable. In order to explain the scheme it is necessary to assume that the fibre is not described by a continuum of modes (the limit of an infinitely long fibre) but by a discrete spectrum of modes instead. It is also essential that both cavity modes are resonant with one of the fibre modes. The corresponding level scheme is shown in Fig. 17. Note that the excited atomic states have already been eliminated, and thus the states $|a_0\rangle^{A,B}|0\rangle^{A,B}_{\mathrm{cav}}$ couple directly to the states $|a_1\rangle^{A,B}|1\rangle^{A,B}_{\mathrm{cav}}$. Moreover, the states $|a_1\rangle^{A,B}|1\rangle^{A,B}_{\mathrm{cav}}$ couple to the fibre modes and in particular to the resonant mode. The Fock states of the resonant fibre mode are denoted by $|k\rangle_{\mathrm{fib}}$. The goal is
Figure 17: Level scheme for quantum networking with photonic dark states. The four levels on the left and the right side of the level scheme represent the energetically lowest levels of the combined atom-cavity systems A and B, respectively. Here only the atomic ground states and the cavity vacuum and first excited states are shown. The dashed arrow represents the transition which is effected by the laser and the cavity via the far detuned excited atomic state. The solid arrows represent the coupling to the fibre modes.
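to achieve the following transformation:

$$
\begin{aligned}
\left[|a_0\rangle^{A}|0\rangle^{A}_{\mathrm{cav}}\right] |\mathrm{vac}\rangle_{\mathrm{fib}} \left[|a_1\rangle^{B}|0\rangle^{B}_{\mathrm{cav}}\right] &\longrightarrow \left[|a_1\rangle^{A}|0\rangle^{A}_{\mathrm{cav}}\right] |\mathrm{vac}\rangle_{\mathrm{fib}} \left[|a_0\rangle^{B}|0\rangle^{B}_{\mathrm{cav}}\right], \\
\left[|a_1\rangle^{A}|0\rangle^{A}_{\mathrm{cav}}\right] |\mathrm{vac}\rangle_{\mathrm{fib}} \left[|a_1\rangle^{B}|0\rangle^{B}_{\mathrm{cav}}\right] &\longrightarrow \left[|a_1\rangle^{A}|0\rangle^{A}_{\mathrm{cav}}\right] |\mathrm{vac}\rangle_{\mathrm{fib}} \left[|a_1\rangle^{B}|0\rangle^{B}_{\mathrm{cav}}\right].
\end{aligned}
\tag{22}
$$

Here $|\mathrm{vac}\rangle_{\mathrm{fib}}$ denotes the quantum state in which all fibre modes are in their respective vacuum states. The atomic levels in both atoms are to be swapped if the sender subsystem A is initially in state $|a_0\rangle^{A}$. Otherwise the states are to remain unchanged. As in Sec. 5.2, the receiving atom is prepared in state $|a_1\rangle^{B}$. If the logical values $|0\rangle$ and $|1\rangle$ are identified with the atomic levels $|a_0\rangle$ and $|a_1\rangle$, then the transformation of Eq. 22 realises quantum state swapping. As mentioned above, this transformation is to be executed so that the two quantum states with non-zero cavity excitation, i.e. $|a_1\rangle^{A,B}|1\rangle^{A,B}_{\mathrm{cav}}$, are avoided. This enables the scheme to resist cavity losses. To show that the five-level system shown in Fig. 17 has a dark state, the interaction Hamiltonian (restricted to the relevant levels) is specified as:

$$
\begin{aligned}
H &= H_{A} + H_{B} + H_{A\text{-fib}} + H_{B\text{-fib}}, \\
H_{A,B} &= \Omega^{A,B}_{\mathrm{eff}} \left[|a_0\rangle^{A,B}|0\rangle^{A,B}_{\mathrm{cav}}\right] \left[|a_1\rangle^{A,B}|1\rangle^{A,B}_{\mathrm{cav}}\right]^{\dagger} + \mathrm{h.c.}, \\
H_{A,B\text{-fib}} &= f^{A,B} \left[|a_1\rangle^{A,B}|1\rangle^{A,B}_{\mathrm{cav}}|0\rangle_{\mathrm{fib}}\right] \left[|a_1\rangle^{A,B}|0\rangle^{A,B}_{\mathrm{cav}}|1\rangle_{\mathrm{fib}}\right]^{\dagger} + \mathrm{h.c.}
\end{aligned}
\tag{23}
$$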
This Hamiltonian is written in an interaction picture in order to avoid fast
optical frequencies. $f^{A,B}$ denotes the coupling strength of the cavity mode to the resonant mode of the fibre. The following quantum states are the relevant dark states:

$$
\begin{aligned}
|D_0\rangle &= \left[|a_1\rangle^{A}|0\rangle^{A}_{\mathrm{cav}}\right] |0\rangle_{\mathrm{fib}} \left[|a_1\rangle^{B}|0\rangle^{B}_{\mathrm{cav}}\right], \\
|D_1\rangle &\propto \left(f^{A}\,\Omega^{B}_{\mathrm{eff}}\right) \left[|a_0\rangle^{A}|0\rangle^{A}_{\mathrm{cav}}\right] |0\rangle_{\mathrm{fib}} \left[|a_1\rangle^{B}|0\rangle^{B}_{\mathrm{cav}}\right] \\
&\quad - \left(\Omega^{A}_{\mathrm{eff}}\,\Omega^{B}_{\mathrm{eff}}\right) \left[|a_1\rangle^{A}|0\rangle^{A}_{\mathrm{cav}}\right] |1\rangle_{\mathrm{fib}} \left[|a_1\rangle^{B}|0\rangle^{B}_{\mathrm{cav}}\right] \\
&\quad + \left(\Omega^{A}_{\mathrm{eff}}\,f^{B}\right) \left[|a_1\rangle^{A}|0\rangle^{A}_{\mathrm{cav}}\right] |0\rangle_{\mathrm{fib}} \left[|a_0\rangle^{B}|0\rangle^{B}_{\mathrm{cav}}\right].
\end{aligned}
$$

These quantum states are dark states because (i) they are eigenstates of the Hamiltonian of Eq. 23 and (ii) they do not contain excited cavity states. The dark state $|D_0\rangle$ is decoupled from the excited cavity states for energy conservation reasons. It is thus a trivial dark state. On the other hand, $|D_1\rangle$ is a non-trivial dark state in the sense that the excited cavity states are not populated due to destructive quantum interference.

In a very similar way to the cavity QED quantum computer, the required transformation of Eq. 22 can now be carried out by slowly (adiabatically) changing the Rabi frequencies $\Omega^{A,B}_{\mathrm{eff}}$. In particular, if the Rabi frequency $\Omega^{B}_{\mathrm{eff}}$ corresponding to the receiver subsystem B is much larger than $\Omega^{A}_{\mathrm{eff}}$, then the first term of the dark state $|D_1\rangle$ is dominant. In the opposite regime, $\Omega^{A}_{\mathrm{eff}} \gg \Omega^{B}_{\mathrm{eff}}$, the last term prevails. $|D_0\rangle$ is independent of the Rabi frequencies. Thus by slowly changing the Rabi frequencies from $\Omega^{B}_{\mathrm{eff}} \gg \Omega^{A}_{\mathrm{eff}}$ to $\Omega^{B}_{\mathrm{eff}} \ll \Omega^{A}_{\mathrm{eff}}$ the transformation of Eq. 22 is realized. In practice, this process is carried out by a sequence of laser pulses applied to the two atoms. The laser pulse acting on the receiver atom B precedes the laser pulse on the sender A and reaches its maximum value first.

Of course, the two cavity modes will couple not only to the resonant fibre mode but to neighbouring modes as well (see Fig. 17). It turns out that the "quality" of the dark state deteriorates with increasing detuning of the fibre mode from resonance. However, numerical simulations have shown that for realistic experimental parameters (see footnote f) the quality of the transfer is very good, even if there is a significant population in the neighbouring non-dark ("grey") states [45].
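The dark-state argument can be illustrated with a small numerical sketch. It is a toy model under stated assumptions rather than the simulation referred to in the text: only the resonant fibre mode is kept, so the system reduces to a five-level chain; the couplings are real, $\hbar = 1$, losses are ignored, and the Gaussian pulse shapes and all numerical values are chosen purely for illustration. Part (i) checks that the non-trivial dark state is annihilated by the Hamiltonian; part (ii) integrates the Schrödinger equation with the pulse on B preceding the pulse on A.

```python
# Sketch of the five-level chain of Fig. 17 (hbar = 1, real couplings).
# Basis order (all other fibre modes ignored):
#   0: |a0>A |0>cavA |0>fib |a1>B |0>cavB   (qubit excitation in atom A)
#   1: |a1>A |1>cavA |0>fib |a1>B |0>cavB   (photon in cavity A)
#   2: |a1>A |0>cavA |1>fib |a1>B |0>cavB   (photon in resonant fibre mode)
#   3: |a1>A |0>cavA |0>fib |a1>B |1>cavB   (photon in cavity B)
#   4: |a1>A |0>cavA |0>fib |a0>B |0>cavB   (qubit excitation in atom B)
import math

F_A = F_B = 1.0         # cavity-fibre couplings (illustrative values)
OMEGA_MAX = 1.0         # peak effective Rabi frequency (illustrative)
WIDTH = 40.0            # Gaussian pulse width
T_B, T_A = -24.0, 24.0  # pulse centres: the pulse on B comes first

def pulses(t):
    """Return (omega_A, omega_B) at time t."""
    om_a = OMEGA_MAX * math.exp(-((t - T_A) / WIDTH) ** 2)
    om_b = OMEGA_MAX * math.exp(-((t - T_B) / WIDTH) ** 2)
    return om_a, om_b

def h_times(psi, om_a, om_b):
    """Apply the nearest-neighbour chain Hamiltonian to psi."""
    c = [om_a, F_A, F_B, om_b]          # couplings between neighbours
    out = [0j] * 5
    for k in range(4):
        out[k] += c[k] * psi[k + 1]
        out[k + 1] += c[k] * psi[k]
    return out

def dark_state(om_a, om_b):
    """Unnormalized |D1>: no amplitude on the cavity states 1 and 3."""
    return [F_A * om_b, 0.0, -om_a * om_b, 0.0, om_a * F_B]

# (i) |D1> is annihilated by H for any pair of Rabi frequencies.
residual = max(abs(x) for x in h_times(dark_state(0.3, 0.7), 0.3, 0.7))

# (ii) Adiabatic passage: integrate i d(psi)/dt = H(t) psi with RK4.
def rk4_step(psi, t, dt):
    def f(v, s):
        return [-1j * x for x in h_times(v, *pulses(s))]
    k1 = f(psi, t)
    k2 = f([p + 0.5 * dt * k for p, k in zip(psi, k1)], t + 0.5 * dt)
    k3 = f([p + 0.5 * dt * k for p, k in zip(psi, k2)], t + 0.5 * dt)
    k4 = f([p + dt * k for p, k in zip(psi, k3)], t + dt)
    return [p + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]

t, t_end, dt = -200.0, 200.0, 0.02
psi = [1.0 + 0j, 0j, 0j, 0j, 0j]        # excitation starts in atom A
max_cavity = 0.0
while t < t_end:
    psi = rk4_step(psi, t, dt)
    t += dt
    max_cavity = max(max_cavity, abs(psi[1]) ** 2 + abs(psi[3]) ** 2)

transferred = abs(psi[4]) ** 2
print(f"|H D1| residual:            {residual:.2e}")
print(f"largest cavity population:  {max_cavity:.4f}")
print(f"population in atom B state: {transferred:.4f}")
```

In this toy model the fibre mode is populated transiently while the two cavity states stay almost empty, which is the feature that makes the scheme insensitive to cavity loss. Reversing the pulse order (A before B) in `pulses` is a simple way to see that the counter-intuitive ordering matters.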
Footnote f: It should be noted that the adiabatic elimination procedure gives rise to additional terms in the Hamiltonian of Eq. 23 which degrade the dark state. These terms have been omitted here. In practice, they can be avoided by additional compensating lasers [14].

6 Conclusions

It cannot be overemphasized that quantum computing is still in its infancy, and that the concepts and schemes presented in this chapter are only first steps towards the development of computationally useful QCs. At present nobody can predict when (and if) such technology will be developed. At this early stage it seems important to conduct research into different technologies because their respective potentials cannot be satisfactorily assessed at the present time. For example, ion-trap technology, which seems promising at the moment, is unlikely to remain the most attractive architecture for QCs. From a computational perspective, more algorithms need to be found to justify the effort of building QCs. Yet even without these, small-scale QCs might still be used to simulate quantum dynamics. Above all, it should be recognized that the development of QCs is not just dependent on quantifiable factors such as the schedules and resources of research and industry, but ultimately on the discovery of new physics.

Acknowledgements

I am grateful to my colleagues and friends from the Universities of Oxford and Innsbruck and from Wolfson College, especially to Ignacio Cirac, Artur Ekert, Jonathan Roberts and Peter Zoller, for stimulating discussions and fruitful collaboration. This work was supported by an Erwin-Schrödinger scholarship granted by the Austrian Science Fund.

References
1. J. S. Bell, Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press, Cambridge, 1987).
2. D. M. Greenberger et al., Am. J. Phys. 58, 1131 (1990).
3. D. J. Wineland et al., Phys. Rev. A 46, R6797 (1992); D. J. Wineland et al., Phys. Rev. A 50, 67 (1992).
4. S. F. Huelga, C. Macchiavello, T. Pellizzari, A. K. Ekert, M. B. Plenio and J. I. Cirac, Phys. Rev. Lett. 79, 3865 (1997).
5. S. Lloyd, Science 273, 1073 (1996).
6. J. I. Cirac and P. Zoller, Phys. Rev. Lett. 74, 4091 (1995).
7. A. M. Steane, Appl. Phys. B 67, 623 (1997).
8. C. Monroe et al., Phys. Rev. Lett. 75, 4714 (1995).
9. T. Pellizzari, S. A. Gardiner, J. I. Cirac and P. Zoller, Phys. Rev. Lett. 75, 3788 (1995).
10. P. W. Shor, Phys. Rev. A 52, R2493 (1995).
11. A. M. Steane, Phys. Rev. Lett. 77, 793 (1996).
12. E. Knill and R. Laflamme, Phys. Rev. A 55, 900 (1997).
13. J. I. Cirac et al., Phys. Rev. Lett. 78, 3221 (1997).
14. T. Pellizzari, Phys. Rev. Lett. 79, 5242 (1997).
15. W. Paul, p. 497 in Proceedings of the International School of Physics "Enrico Fermi", eds. E. Arimondo et al. (North Holland, Amsterdam, 1992).
16. J. J. Bollinger et al., IEEE Trans. Instrum. Meas. 40, 126 (1991).
17. S. Lloyd, Phys. Rev. Lett. 75, 346 (1995).
18. A. Barenco, Proc. R. Soc. London A 449, 679 (1996).
19. D. P. DiVincenzo, Phys. Rev. A 51, 1015 (1995).
20. T. Sleator and H. Weinfurter, Phys. Rev. Lett. 74, 4087 (1995).
21. D. Deutsch et al., Proc. Roy. Soc. Lond. A 449, 669 (1995).
22. W. Nagourney et al., Phys. Rev. Lett. 56, 2797 (1986).
23. J. C. Bergquist et al., Phys. Rev. Lett. 56, 1699 (1986).
24. Th. Sauter et al., Phys. Rev. Lett. 56, 1696 (1986).
25. C. Cohen-Tannoudji, J. Dupont-Roc and G. Grynberg, Atom-Photon Interactions. Basic Processes and Applications (Wiley, New York, 1992).
26. P. Meystre and M. Sargent III, Elements of Quantum Optics (Springer, New York, 1991).
27. D. F. Walls and G. J. Milburn, Quantum Optics (Springer, New York, 1994).
28. R. Blatt, p. 219 in Proc. 14th International Conference on Atomic Physics ICAP, ed. D. J. Wineland (American Institute of Physics, New York, 1995).
29. M. G. Raizen et al., Phys. Rev. A 45, 6493 (1992).
30. C. Kittel, Introduction to Solid State Physics (Wiley, New York, 1996).
31. N. W. Ashcroft and N. D. Mermin, Solid State Physics (Holt, Rinehart and Winston, Philadelphia, 1976).
32. F. Diedrich et al., Phys. Rev. Lett. 62, 403 (1989).
33. J. I. Cirac, A. S. Parkins, R. Blatt and P. Zoller, Adv. At. Mol. Opt. Phys. 37, 237 (1996).
34. A. Barenco, D. Deutsch, A. K. Ekert and R. Jozsa, Phys. Rev. Lett. 74, 4083 (1995).
35. J. Oreg, F. T. Hioe and J. H. Eberly, Phys. Rev. A 29, 690 (1984).
36. S. Schiemann, A. Kuhn, S. Steuerwald and K. Bergmann, Phys. Rev. Lett. 71, 3637 (1993).
37. P. Marte, P. Zoller and J. L. Hall, Phys. Rev. A 44, R4118 (1991); L. S. Goldner et al., Phys. Rev. Lett. 72, 997 (1994); J. Lawall and M. Prentiss, Phys. Rev. Lett. 72, 993 (1994).
38. A. Aspect et al., Phys. Rev. Lett. 61, 826 (1988).
39. A. S. Parkins et al., Phys. Rev. Lett. 71, 3095 (1993).
40. H. J. Kimble et al., p. 314 in Proceedings of ICAP 1994 (American Institute of Physics, New York, 1995).
41. E. T. Jaynes and F. W. Cummings, Proc. IEEE 51, 89 (1963).
42. I. I. Sobelman, Atomic Spectra and Radiative Transitions, 2nd ed. (Springer, Berlin, 1992).
43. C. W. Gardiner, Quantum Noise (Springer, Berlin, 1991).
44. N. J. Sloane and F. J. MacWilliams, The Theory of Error-Correcting Codes (North Holland, Amsterdam, 1977).
45. J. I. Cirac, T. Pellizzari and P. Zoller, Science 273, 5279 (1996).
46. R. Landauer, Proc. R. Soc. London A 353, 367 (1995).
47. P. W. Shor, p. 56 in Proceedings of the Symposium on the Foundations of Computer Science (IEEE Press, Los Alamitos, California, 1996), preprint quant-ph/9605011.
48. P. Zoller and C. W. Gardiner, in Proceedings of the Les Houches Lecture Session LXII: Quantum Fluctuations, eds. E. Giacobino and S. Renaud (North Holland, Amsterdam, 1996).
49. M. B. Plenio and P. L. Knight, Phys. Rev. A 53, 2986 (1996).
50. S. J. van Enk, J. I. Cirac and P. Zoller, Phys. Rev. Lett. 78, 4293 (1997).
QUANTUM COMPUTATION WITH NUCLEAR MAGNETIC RESONANCE

ISAAC L. CHUANG
IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120

Nuclear spins have long coherence times and are easily manipulated using radiofrequency pulses. Single molecules, with spin-spin couplings provided by electronic bonds, are thus attractive physical systems for performing quantum computation. However, observation of small nuclear induction signals typically requires a large number of molecules. This presents fundamental problems for quantum computation, which traditionally requires pure state inputs and strong measurements. Quantum computation with nuclear magnetic resonance was made possible by the invention of techniques which create effective pure states from thermal mixtures, and algorithmic steps which make strong measurements unnecessary. The theoretical basis for quantum computation with bulk materials, and experimental results demonstrating implementation of simple quantum gates and circuits, are described.
1 Introduction
A single molecule can be thought of as a very small computer, in the following way. Let the spin orientation of each nucleus represent a logical bit (for simplicity, let us assume spin-1/2 nuclei), and utilize the spin-spin couplings naturally provided by through-bond electronic interactions as logical interactions between the bits. Placed in a high magnetic field, the spins precess at different frequencies, depending on their local chemical environment, and by applying radio-frequency (RF) fields on resonance at these frequencies, different nuclei can be addressed and manipulated. By cooling the system down to the ground state, a known input state is prepared, and by applying RF pulses, logical operations are performed. Readout is accomplished by measuring the magnetic induction signal generated by the precessing spins.

Why is this computer interesting? Aside from its small size, it is particularly fascinating because, conceptually, it can perform computations which no comparable classical machine can do! Because the nuclei are naturally isolated from the electronic and vibrational mechanisms which can lead to decoherence, nuclear spins can exist in quantum superposition states which maintain their coherence for timescales from seconds to thousands of seconds. Nuclear spins can exist in entangled states, and unlike classical machines, the envisioned single molecule computer can perform quantum computations, which
can be exponentially faster compared to their classical counterparts. This interesting feature of quantum systems was first noted by Feynman [1], and our hypothetical single molecule computer provides an excellent illustration of his point. Consider a single spin-1/2 nucleus; its dynamics are described by three coupled differential equations (the Bloch equations), which determine the evolution of the $x$, $y$, and $z$ components of the spin. Now consider two spins. Fifteen coupled differential equations are required: six for the individual spin components $x_i$, $y_i$, and $z_i$ ($i = 1, 2$), and nine more for the "correlated" products $x_1x_2$, $x_1y_2$, ... (which represent entangled degrees of freedom). In general, when no semi-classical approximations are made, the evolution of an $N$-spin system is described by $O(4^N)$ differential equations! Imagine if we could somehow utilize an $N$-spin molecule to solve $O(4^N)$ differential equations. Theoretical studies of quantum computation have shown how this is possible for certain problems, like factoring and discrete logarithms. The experimental challenge is either to realize such machines, or to discover why they are impossible.

Nuclear magnetic resonance (NMR) is a well established field, dating back to the mid-1940s. Today it is a major analytical tool in synthetic chemistry, and is used to study the properties of solids, liquids, and gases. What has been most valuable is the ability of NMR to probe the local chemical environment of nuclei in a molecule, to aid in determination of molecular structure and composition. Ironically, long coherence times are usually undesirable in such applications, and in fact, Felix Bloch points out in his original 1946 paper on nuclear induction how useful it is to introduce paramagnetic ions to shorten relaxation times. NMR has not widely been thought of as a tool for computation; because of relatively weak interactions, typical computational clock cycles would be quite slow compared to modern computing machines.
However, this is less of an issue for quantum computation, where new (and much faster) algorithms are made possible by utilizing quantum properties. And although most NMR applications never explore more than the classical “bar magnet” behavior of single spins, it is now widely understood that the true behavior of coupled molecular spins is quantum, requiring full treatment of entanglement (“multiple quantum coherence” in traditional NMR [4,5]).

Early pioneers in quantum computation recognized that NMR is an attractive technique for implementing a quantum computer, given the long coherence times and mature technology. In fact, DiVincenzo noted that traditional “double resonance” experiments such as ENDOR (Electron-Nuclear Double Resonance) essentially perform quantum logic operations similar to the quantum version of the exclusive-OR gate. This experiment requires a handful of RF pulses to be applied. Modern NMR using state-of-the-art spectrometers routinely involves application of complex sequences of thousands of RF pulses to manipulate nuclei. These features, and the long coherence times easily obtained in experiments, pointed to the tremendous potential of NMR for quantum computation. However, two important problems had to be resolved first.

An ideal NMR quantum computer would be a single molecule, as described above; however, this is presently experimentally infeasible because the signal produced from a single precessing nucleus is too small to observe. Experimental efforts to perform single-molecule NMR using high-Q mechanical [8] and optical [9] detection methods provide hope for future technologies. Current commercial technology, however, requires samples containing a large number of molecules (milliliter volumes), and next-generation machines will probably offer nanoliter volume capability [10]. Such large samples contain enough nuclei to produce an aggregate signal strong enough to observe, but having multiple molecules presents a problem for our application.

Computation requires initial state preparation; for NMR quantum computation, this means somehow putting the nuclei in a known (and thus low-entropy) state. In an ensemble of N-spin molecules (bulk material) in equilibrium at room temperature, the spin-up (↑) and spin-down (↓) states are nearly equally populated, with an excess ground state population of only one part in $10^6$. Such a random state is unacceptable as a computational input. One solution would be to cool the nuclei into their ground state, perhaps by optical pumping or adiabatic demagnetization, but this is practically difficult.

The second issue is that the computational result of any individual molecular computer is inaccessible; only the net bulk magnetization, an ensemble average, is measured. This means that quantum algorithms which intrinsically utilize wavefunction collapse would not work.
From the standpoint of computer science, this bulk NMR quantum computer would be a single-instruction multiple-data (“SIMD”) machine, with each machine given only a nearly random input, and with the only accessible results being averages over all individual machine outputs.
Solutions to the input preparation and final readout problems were simultaneously found by two groups in 1997. Gershenfeld and Chuang (Ref. 12) noted that the thermal equilibrium state is not completely random, and found a method by which the inherent structure can be used to perform an N quantum bit (qubit) calculation using a molecule with N + O(log N) spins. For example, if each nucleus is spin-down with probability p > 1/2, then for N = 3 the ↓↓↓ state has the highest population, and the three states ↑↑↓, ↑↓↑, and ↓↑↑ have equal population. Together, these four states represent a manifold which behaves like a ground state two-qubit system, since the only signal comes from the excess population in ↓↓↓. A solution to the ensemble readout problem was also presented in Ref. 12.
Cory, Fahmy, and Havel (Ref. 13) discovered a different method to solve the state preparation problem; they noted that by applying RF pulses of different strengths to different parts of their ensemble of N-spin molecules, they could selectively erase the signal from all states except the ground state. For example, in the N = 2 case, this involved suppressing the ↓↑, ↑↓, and ↑↑ states, to obtain just the ↓↓ state signal. Later, Knill, Chuang, and Laflamme (Ref. 14) pointed out that instead of averaging over space, as done in Ref. 13, one can average over time, by doing successive experiments employing different RF pulses.
A fundamental limitation of all these approaches is an exponential reduction in the observed signal strength as N increases. This happens because the magnitude of the signal observed cannot be increased beyond the proportion of the desired state (e.g., the ground state) present in the initial thermal equilibrium state. Although each procedure produces a system which behaves as if it were prepared in the ground state, an effectively very cold state, this is done without actually performing physical cooling. Fortunately, there is an abundance of NMR signal available, and experimental success has been reported for one, two, and three qubit systems to date. The challenge will be to scale up beyond ten qubits or so.
We begin with a brief review of NMR physics in Sec. 2, and describe how NMR techniques can be used to perform logic operations in Sec. 3. We then present a general theoretical framework known as “state labeling,” which describes all the NMR quantum computing schemes for input state preparation, in Sec. 4. Experimental results demonstrating implementation of simple quantum gates and circuits are described in Sec. 5. Finally, prospects for the future are summarized.
Occasional notes will provide historical context and references to the literature, but this is not meant to be a thorough review. Rather, the treatment will be pedagogical, and the central goal will be to explain and illustrate the fundamental concepts.
2 Physics of Nuclear Magnetic Resonance
NMR is a huge field of study, primarily in chemistry, but originally in physics. We will summarize only a small portion of the physics of NMR that is relevant to quantum computation. For a complete treatment, many excellent reference books are available.
2.1 N spin system
The basic physical system used in NMR quantum computation is an N-spin system, such as that provided by a single molecule with N magnetically distinct nuclei. For simplicity, we shall consider only spin-1/2 systems here, although in general, higher spin systems may be useful as well. Typically, these molecules are dissolved in a liquid solvent, and contained in a test-tube which is placed in a strong magnetic field, $B_0$, oriented along the $\hat{z}$ axis (Fig. 1). In this field, the N spins in each molecule precess like small magnets about the $\hat{z}$ direction. Spins are also coupled to each other via a through-bond electronic interaction. For example, the free evolution of an N = 2 system is given by the Hamiltonian
$$\mathcal{H} = \hbar\omega_A I_{zA} + \hbar\omega_B I_{zB} + \hbar\omega_{AB} I_{zA} I_{zB}, \qquad (1)$$
where the last term gives to first order the spin-spin coupling. In this equation, $I_z$ is the spin angular momentum operator in the $\hat{z}$ direction. In general, $\omega_i = -\gamma_i B_0$, where $\gamma_i$ is the gyromagnetic ratio for spin i. For example, a two-carbon molecule in a 9.3 Tesla magnetic field would have carbon-13 resonance frequencies at $\omega/2\pi \approx 100$ MHz, and a proton in an 11.7 Tesla field would have $\omega/2\pi \approx 500$ MHz. Typically, carbon-carbon couplings are around 100 Hz, whereas proton-proton couplings are around 1 to 10 Hz. Importantly, the resonant frequencies also depend on the local chemical environment of the nuclei, which can shield (or increase) the local magnetic field, resulting in a lower or higher resonance frequency by an amount known as the chemical shift. This ranges from Hz to kHz, depending on the nuclei, molecular structure, and other factors.
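The two resonance frequencies quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the dictionary are hypothetical, the gyromagnetic ratios are standard tabulated values (not taken from this chapter), and chemical shifts are ignored.

```python
# Gyromagnetic ratios divided by 2*pi, in MHz per tesla.
# These are standard tabulated values, not numbers from the text.
GAMMA_OVER_2PI_MHZ_PER_T = {"1H": 42.577, "13C": 10.708}

def larmor_mhz(nucleus, b0_tesla):
    """Magnitude of the resonance frequency omega/2pi (MHz) of a bare nucleus."""
    return GAMMA_OVER_2PI_MHZ_PER_T[nucleus] * b0_tesla

carbon_mhz = larmor_mhz("13C", 9.3)    # carbon-13 in a 9.3 T magnet
proton_mhz = larmor_mhz("1H", 11.7)    # proton in an 11.7 T magnet
print(f"13C: {carbon_mhz:.1f} MHz, 1H: {proton_mhz:.1f} MHz")
```

Both values land within about half a percent of the round numbers (100 MHz and 500 MHz) used in the text.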
2.2 Thermal equilibrium state
The thermal equilibrium state of the system is described by the diagonal density matrix $\rho = e^{-\beta\mathcal{H}}/\mathrm{tr}(e^{-\beta\mathcal{H}})$, where $\beta = 1/k_BT$. Since $k_BT$ at room temperature is usually much larger than $\gamma\hbar B_0$ (by about $10^5$), this state is well approximated by $\rho \approx (1 - \beta\mathcal{H})/2^N$. For N = 1 this gives
and for N = 2 (and $\omega_A \approx 4\omega_B$),
Figure 1: Schematic diagram of an NMR apparatus.
This state may be interpreted as being a mixture of the four pure states $|00\rangle$, $|01\rangle$, $|10\rangle$, and $|11\rangle$ (we shall use 0 to represent the spin-down state ↓, and similarly for 1 and ↑). Of course, since the ensemble contains a macroscopic number of molecules, the microscopic system is probably actually composed quite differently; the same density matrix can be decomposed in an infinite number of ways into mixtures of pure states, but as Von Neumann originally pointed out (see, for example, Sakurai, Ref. 19), as long as only ensemble averages are measured there is no way to distinguish among them. It is also important to note that the part of $\rho$ proportional to the identity is unobservable in NMR experiments; only differences in populations of different states are ever measured. Mathematically, this is reflected by the fact that only traceless observables, such as the transverse magnetizations, are experimentally accessible quantities.
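To get a feeling for how good the high-temperature approximation of Sec. 2.2 is, the following sketch compares exact Boltzmann populations of a single spin with the first-order expansion. It assumes a single-spin Hamiltonian of the form $\hbar\omega I_z$ with levels at $\pm\hbar\omega/2$; the function names, the 100 MHz frequency, and the 300 K temperature are illustrative choices, not values prescribed by the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
K_B = 1.380649e-23       # Boltzmann constant, J/K

def thermal_populations(freq_hz, temp_k):
    """Exact Boltzmann populations of the two levels at -/+ hbar*omega/2."""
    x = HBAR * 2 * math.pi * freq_hz / (K_B * temp_k)    # hbar*omega / (k_B T)
    w_low, w_high = math.exp(+x / 2), math.exp(-x / 2)   # Boltzmann weights
    z = w_low + w_high                                   # partition function
    return w_low / z, w_high / z, x

def high_temp_populations(x):
    """First-order expansion (1 - beta*H)/2 evaluated on the two levels."""
    return 0.5 * (1 + x / 2), 0.5 * (1 - x / 2)

p_low, p_high, x = thermal_populations(100e6, 300.0)
a_low, a_high = high_temp_populations(x)
excess = p_low - p_high   # population difference; this is what produces signal
print(f"hbar*omega/kT = {x:.2e}, excess = {excess:.2e}")
```

The exact and approximate populations agree far below any measurable level, and the population difference (the only part that is not proportional to the identity) is a tiny fraction of the total.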
2.3 Spin manipulations
Spins are manipulated by applying a much smaller radio-frequency field, $B_1$, in the $\hat{x}$-$\hat{y}$ plane to excite the spins at their resonant frequencies $\omega_i$. Using the rotating wave approximation, we find that in the rotating frame the spin evolves under an effective field $B' = B_1\cos(\varphi)\hat{x} + B_1\sin(\varphi)\hat{y}$ (where $\varphi$ is the RF phase). Near resonance, even a small $B_1$ can cause large spin rotations, and by varying $\varphi$ and the magnitude of $B_1$, the rotation angle and axis can be controlled. Furthermore, by applying RF fields at different frequencies, different spins can be selectively excited.
Mathematically, this effect from a resonant perturbation is well known (Ref. 19), and a derivation is sketched here for completeness. Consider the Hamiltonian
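where $\sigma_i$ are the usual Pauli matrices. We shall use the fact that for $\sigma_\pm = (\sigma_x \pm i\sigma_y)/2$, $e^{i\xi\sigma_z}\sigma_\pm e^{-i\xi\sigma_z} = e^{\pm 2i\xi}\sigma_\pm$. Let the state of the system be $|\psi(t)\rangle$, and define the rotating frame state $|\chi(t)\rangle = e^{i\omega t\sigma_z/2}|\psi(t)\rangle$, such that the Schrödinger equation for the system, $i\hbar\partial_t|\psi(t)\rangle = \mathcal{H}|\psi(t)\rangle$, can be expressed as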
where the high frequency terms were dropped (the rotating wave approximation). This has the solution
$\delta t$ represents the integrated power of the applied resonant RF field. $|\chi(0)\rangle$ is a two-component spinor (Ref. 19), and the effect of $e^{i\delta t\sigma_x/2}$ is just a rotation of the state about the $\hat{x}$ axis by the angle $\delta t$. Single spin operations will be written, for example, as $R_{yA}(\theta) = e^{i\theta\sigma_{yA}/2}$, representing a rotation of spin A by angle $\theta$ around the $\hat{y}$ axis. Typical proton pulse lengths are about 5 microseconds for $\theta = 90°$, with standard NMR probes in a 500 MHz spectrometer, and an RF power of tens of Watts.
2.4 Magnetization readout
Applying a $\theta = \pi/2$ pulse to a thermal equilibrium state tips the nuclei from the $\hat{z}$ axis into the $\hat{x}$-$\hat{y}$ plane. There, they precess around $\hat{z}$, generating a magnetic induction signal known as the free induction decay (FID), that is captured by a pickup coil. This is mixed down to audio frequencies and digitized, giving a signal such as that shown in Fig. 2. The FID can be Fourier transformed to give a plot of the spectrum. Mathematically, the experimentally detected signal can be calculated from the system state $\rho$, since the transverse magnetization for a single spin is just
Figure 2: (left) Nuclear free induction decay signal from a sample of Alanine; and (right) frequency spectrum given by the Fourier transform of the FID, showing the resonance peaks of the three carbon nuclei.
where d is the volume density of the detected spin with gyromagnetic ratio $\gamma$. For example, for a spectrometer with a K-turn solenoidal coil with quality factor Q and area A, and magnetic flux $\Phi$,
$$V(t) = QK\,\frac{d\Phi}{dt}. \qquad (9)$$
Note that this voltage is complex; both its magnitude and phase can be measured using a quadrature detector, which is usually standard in modern spectrometers.
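The step from a digitized complex FID to a spectrum can be sketched with a synthetic signal. The example below is an assumption-laden illustration, not the processing used for Fig. 2: the signal is a single complex exponential with an $e^{-t/T_2}$ envelope, the offset frequency, decay time, and sampling parameters are made up, and a plain discrete Fourier transform is written out by hand to keep the block self-contained.

```python
import cmath
import math

def synth_fid(f0_hz, t2_s, dt_s, n):
    """Complex (quadrature) FID: one line at f0 with an exp(-t/T2) envelope."""
    return [cmath.exp(2j * math.pi * f0_hz * k * dt_s) * math.exp(-k * dt_s / t2_s)
            for k in range(n)]

def dft(signal):
    """Plain O(n^2) discrete Fourier transform of a complex sequence."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

dt = 1e-3    # hypothetical sampling interval (s)
n = 256      # number of samples
fid = synth_fid(f0_hz=100.0, t2_s=0.05, dt_s=dt, n=n)
spectrum = dft(fid)

# The resonance appears as a peak in |spectrum|; bin m sits at m / (n * dt) Hz.
peak_bin = max(range(n), key=lambda m: abs(spectrum[m]))
peak_hz = peak_bin / (n * dt)
print(f"peak near {peak_hz:.1f} Hz (bin spacing {1 / (n * dt):.2f} Hz)")
```

Because the signal is complex, positive and negative offset frequencies are distinguishable, which is what quadrature detection buys in practice. A faster decay (shorter $T_2$) would broaden the peak.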
2.5 Decoherence
A fundamental feature of the FID is an $e^{-t/T_2}$ exponential decay envelope. This results from the loss of relative phase coherence between different precessing magnetic moments in the sample, originating either from inhomogeneities such as a nonuniform magnetic field, or intrinsic homogeneous processes, such as the loss of phase information from individual spins to the lattice. In contrast to such processes, which do not involve any energy loss from the spin system, another fundamental process is the decay of the magnetization due to relaxation of the spin system back to thermal equilibrium. Physically, this process originates from the loss of energy from the spins to the lattice (or to the solvent). Mathematically, these decoherence processes may be written as a linear map on density matrices; in the single spin case, for an equilibrium state whose population $a_0$ differs from 1/2 by a term of order $\hbar\omega/2k_BT$, we have, for example,
Here, $T_1$ and $T_2$ are known as the spin-lattice and spin-spin relaxation times. They are both experimentally measurable parameters, and represent important time-scales which limit the lifetime of coherent spin superposition states, and thus also the extent of quantum computation possible with a given system. Generally, $T_2 \le T_1$. Moreover, because the spectral density of the noise power in the environment decreases rapidly with frequency, for low viscosity liquids, $T_2 \approx T_1$, while in viscous liquids and in solids, $T_1 \gg T_2$. Due to the long range of spin-spin interactions, in solids $T_2$ is usually very short, often sub-millisecond. In common organic liquids, because of motional narrowing, $T_1 \approx T_2 \approx$ 1 to 10 seconds, and in noble gases, $T_1$ can be on the order of tens of hours. The specific physical mechanisms which govern NMR relaxation timescales are beyond the present scope of discussion, but many thorough treatments are available in the literature.
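The two time-scales can be illustrated in the Bloch-vector picture, which is a standard phenomenological description rather than the density-matrix map written in the text. In the sketch below the function name and the numerical values are hypothetical; the magnetization is viewed in the rotating frame on resonance, so precession is left out and only relaxation acts.

```python
import math

def relax(mx, my, mz, t, t1, t2, m0=1.0):
    """Phenomenological Bloch relaxation after a time t (same units as t1, t2).

    Transverse components decay with T2; the longitudinal component
    recovers toward its equilibrium value m0 with T1.
    """
    decay = math.exp(-t / t2)
    mz_t = m0 + (mz - m0) * math.exp(-t / t1)
    return mx * decay, my * decay, mz_t

# Magnetization tipped fully into the transverse plane, then left alone.
# T1 = 7 s and T2 = 2 s are illustrative values only.
mx, my, mz = relax(1.0, 0.0, 0.0, t=2.0, t1=7.0, t2=2.0)
print(f"after 2 s: Mx = {mx:.3f}, Mz = {mz:.3f}")
```

After one $T_2$ the transverse signal has dropped to $1/e$ of its initial value, while the longitudinal magnetization is still far from equilibrium because $T_1$ is longer.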
2.6 Spin-spin interactions
Spin-spin interactions occur through two dominant mechanisms: direct dipolar coupling, and indirect through-bond electronic interactions. Dipolar coupling is described by an interaction Hamiltonian of the form
where $\hat{n}$ is the unit vector in the direction joining the two nuclei, and $\vec{I}$ is the magnetic moment vector. In a low viscosity liquid, dipolar interactions are rapidly averaged away; mathematically this is calculated by showing that the spherical average of $\mathcal{H}_D$ over $\hat{n}$ goes to zero. Through-bond electronic interactions are an indirect interaction; the magnetic field seen by one nucleus is perturbed by the state of its electronic cloud, which interacts with another nucleus through a Fermi contact interaction. It is also known simply as “J-coupling,” and takes on the form
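$$\mathcal{H}_J = \hbar J\,\vec{I}_1\cdot\vec{I}_2 = \hbar J I_{z1} I_{z2} + \frac{\hbar J}{2}\left(I_{+1}I_{-2} + I_{-1}I_{+2}\right), \qquad (14)$$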
where we have taken J as a scalar coupling constant for simplicity (in general it may be a tensor). This interaction is dominant in liquids, and furthermore, for weak couplings, or heteronuclear species (such that the matrix element of the $I_+I_- + I_-I_+$ term is small), an excellent approximation is given by
$$\mathcal{H}_J \approx \hbar J I_{zA} I_{zB}. \qquad (15)$$
This is the scalar interaction most often observed in liquids NMR, and we shall concentrate on it in our study of NMR quantum computation. We turn next to understanding how computation can be performed with liquids NMR systems.
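The statement in Sec. 2.6 that dipolar couplings average away under rapid isotropic tumbling can be checked numerically. The sketch below assumes the standard angular structure of the dipolar coupling, a bilinear form with coefficients $\delta_{ij} - 3 n_i n_j$ (prefactors omitted), and verifies by simple quadrature that the spherical average of $n_i n_j$ is $\delta_{ij}/3$. The function name and grid sizes are illustrative.

```python
import math

def sphere_average_nn(n_theta=200, n_phi=200):
    """Average of n_i * n_j over the unit sphere (midpoint rule in theta, phi)."""
    acc = [[0.0] * 3 for _ in range(3)]
    weight_sum = 0.0
    for a in range(n_theta):
        theta = math.pi * (a + 0.5) / n_theta
        w = math.sin(theta)                      # solid-angle weight
        for b in range(n_phi):
            phi = 2 * math.pi * (b + 0.5) / n_phi
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            weight_sum += w
            for i in range(3):
                for j in range(3):
                    acc[i][j] += w * n[i] * n[j]
    return [[acc[i][j] / weight_sum for j in range(3)] for i in range(3)]

avg = sphere_average_nn()
# Angular part of the dipolar coupling after averaging: delta_ij - 3 <n_i n_j>.
residual = [[(1.0 if i == j else 0.0) - 3.0 * avg[i][j] for j in range(3)]
            for i in range(3)]
worst = max(abs(v) for row in residual for v in row)
print(f"largest residual coupling coefficient: {worst:.1e}")
```

The residual vanishes up to the quadrature error, so only the scalar (J) coupling survives in a low viscosity liquid.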
3 Computation with NMR
An NMR system can be thought of as a computer in the sense that information is represented by nuclear spins, processing can be performed by applying RF pulses to manipulate the spins, and results are read out by measuring the free induction decay signal. The details of mapping a computation onto an NMR spin system are described in this section, beginning with logical operations, then readout. The problem of input state preparation is discussed in Sec. 4.
3.1 Logical operations
Computation requires the ability to implement any logical transformation. It is sufficient for this purpose to be able to implement a universal set of operations, from which all possible transformations can be generated. For example, any classical logical function can be reduced to a network of single-bit NOT gates and two-bit AND gates. An important result for quantum computation is that any unitary quantum transform can be implemented by the quantum generalizations of these devices, an arbitrary single-bit rotation, and the controlled-NOT gate that flips one qubit conditioned on the state of the other. These operations may be implemented by applying the appropriate RF pulses in an NMR system.
Consider first a single nuclear spin, such as a proton. As was previously shown in Sec. 2.3, single spin rotations $R_x(\theta)$ and $R_y(\theta)$ can be performed by applying RF pulses of appropriate power, duration, and phase. Furthermore, $R_z(\theta)$ is also possible, since $R_x(-\pi/2)R_y(\theta)R_x(\pi/2) = R_z(\theta)$.
Arbitrary single spin rotations are possible even in coupled spin systems. For example, in a system with the interaction Hamiltonian of Eq. 15, the coupled evolution gives rise to dynamics described by the operator
$$R_{zAB}(Jt) = e^{iJt I_{zA} I_{zB}} = \cos(Jt/4)\,I + i\sin(Jt/4)\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \qquad (16)$$
This is a coupled two-spin rotation. Typically, the J-coupling will be much weaker than the strength of the RF field, and furthermore it is reasonable to assume that the chemical shift $\omega_A - \omega_B$ is sufficiently large (or the system is heteronuclear), so that the spins can be addressed individually; thus, arbitrary single spin rotations can still be performed effectively.
Given the ability to perform arbitrary single bit operations, the next element required for being able to construct arbitrary unitary operations for quantum computation is a multiple-spin interaction such as the controlled-NOT (CNOT) operation. For the J-coupled two-spin system, a CNOT can be implemented as a controlled phase shift preceded and followed by rotations, given by the sequence $C_{AB} = R_{yA}(-90)\,R_{zB}(-90)\,R_{zA}(270 = -90)\,R_{zAB}(180)\,R_{yA}(90)$. This is shown schematically in Fig. 3. The corresponding rotation matrices can be multiplied to give
$$C_{AB} \propto \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix},$$
which is the CNOT operation (up to an irrelevant overall phase).
It turns out that a nontrivial operation in the NMR system is implementing “identity”, that is, making the spins cease evolution. This can be done using a common NMR tool known as refocusing, which is made possible by the fact that a 180° pulse on A about any axis $\hat{\phi}$ reverses the evolution of terms containing $I_{zA}$ in the Hamiltonian. Namely, $R_{\phi A}(180)R_{zA}(\omega_A t) = R_{zA}(-\omega_A t)R_{\phi A}(180)$,
Figure 3: (a) A controlled-NOT gate acting on two qubits, (b) the controlled-NOT gate implemented by a controlled phase shift gate (specified by a unitary matrix with diagonal elements {1, 1, 1, -1}) preceded and followed by $\pi/2$ rotations, and (c) the pulse sequence and spin orientations corresponding to the components in (b). Note that, unlike a conventional NMR selective population transfer sequence, extra refocusing is required to preserve the Bell basis exchange symmetry between A and B. Figure reproduced from Ref. 12 with permission. © 1997, American Assoc. for the Advancement of Science.
and $R_{\phi A}(180)R_{zAB}(Jt) = R_{zAB}(-Jt)R_{\phi A}(180)$. Refocusing also removes reversible broadening effects (such as spin interactions and magnetic field inhomogeneity). Repeated fast refocusing, known as decoupling, is useful because it completely stops the dynamics of the affected terms.
More generally, the basic requirement to be able to perform quantum computation is that the set of possible manipulations gives the ability to construct an arbitrary Hamiltonian. Lloyd has shown that nearly any interaction suffices for this purpose. A powerful theory for using this fact in NMR is average Hamiltonian theory. This provides another means for devising arbitrary unitary transforms in systems other than the simple scalar coupled spin system used in the examples here.
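The single-spin identity and the CNOT pulse sequence of Sec. 3.1 can be checked by direct matrix multiplication. The sketch below rests on several assumptions that should be read as conventions of this example only: rotations are $e^{i\theta\sigma/2}$, the coupled rotation is $e^{i\theta I_{zA}I_{zB}}$ with $I_z = \sigma_z/2$, spin A is the first tensor factor, and the sequence is taken as $R_{yA}(-90)R_{zB}(-90)R_{zA}(-90)R_{zAB}(180)R_{yA}(90)$. All helper names are hypothetical.

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker product; the first factor is spin A, the second is spin B."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def rot(sigma, angle_deg, divisor=2):
    """exp(i * angle * sigma / divisor), valid when sigma squared is the identity."""
    phi = math.radians(angle_deg) / divisor
    n = len(sigma)
    return [[math.cos(phi) * (i == j) + 1j * math.sin(phi) * sigma[i][j]
             for j in range(n)] for i in range(n)]

def chain(*ops):
    """Multiply operators in the order written (leftmost factor first)."""
    out = ops[0]
    for op in ops[1:]:
        out = matmul(out, op)
    return out

def max_abs_diff(u, v):
    return max(abs(u[i][j] - v[i][j]) for i in range(len(u)) for j in range(len(u)))

# R_x(-90) R_y(theta) R_x(90) = R_z(theta), checked at an arbitrary angle.
theta = 37.0
identity_error = max_abs_diff(chain(rot(SX, -90), rot(SY, theta), rot(SX, 90)),
                              rot(SZ, theta))

# CNOT sequence; sigma_z (x) sigma_z = 4 I_zA I_zB, hence divisor=4 below.
ZZ = kron(SZ, SZ)
sequence = chain(kron(rot(SY, -90), I2), kron(I2, rot(SZ, -90)),
                 kron(rot(SZ, -90), I2), rot(ZZ, 180, divisor=4),
                 kron(rot(SY, 90), I2))

# Basis order 00, 01, 10, 11 (A first): flip A when B is 1.
CNOT_FLIP_A = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]
phase = sequence[0][0]   # overall phase factor of the sequence
cnot_error = max_abs_diff(sequence, [[phase * x for x in row] for row in CNOT_FLIP_A])
print(identity_error, cnot_error, phase)
```

Under these conventions the sequence equals a controlled-NOT up to a global phase, with B as the control and A as the target. Relabeling which spin is written first turns the result into the familiar matrix form with the flip in the lower-right block.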
3.2 Readout
A tricky issue in performing quantum computation with NMR is readout of the result, because the system is an ensemble, rather than a single N-spin molecule. The problem is that the output of a typical quantum algorithm is a random number, whose distribution gives information which allows the problem to be solved. However, the average value of the random variable would give no relevant information, and this would be the output if the quantum algorithm were executed without modification on an NMR quantum computer.
A resolution to this problem was presented by Gershenfeld and Chuang (Ref. 12): simply append an additional computational step to the quantum algorithm to make its output deterministic. For example, Shor's algorithm produces a random rational number c/r, where c is an unknown integer, and r is the desired result (also an integer). c is nearly uniformly distributed, so the average value $\langle c/r \rangle$ gives no meaningful information. However, after calculating this fraction, each quantum computer can proceed further and perform a continued fraction expansion, and thus obtain r. This algorithm does not always work, but failure can easily be detected by plugging r back into the original problem (a discrete logarithm) for verification. Upon failure, the quantum computer can arrange not to output any result, so that the ensemble average measurement gives just the output $\langle r \rangle$. Similar modifications can be made to allow all known quantum algorithms to work on an NMR quantum computer.
Solving this readout problem was a key step in making quantum computation possible, in principle, with bulk systems like those used in NMR. The other problem, input state preparation, is discussed next.
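The classical post-processing step described above, recovering r from an approximation to c/r, can be sketched with Python's standard `fractions` module, whose `limit_denominator` method finds the closest fraction with a bounded denominator. The register size, the measured value, and the verification problem below are hypothetical illustrations, not data from any experiment.

```python
from fractions import Fraction

def recover_fraction(measured, max_denominator):
    """Closest fraction c/r to the measured value with r <= max_denominator."""
    frac = Fraction(measured).limit_denominator(max_denominator)
    return frac.numerator, frac.denominator

# Hypothetical readout: an 11-bit estimate (853/2048) of the fraction 5/12.
estimate = Fraction(853, 2048)
c, r = recover_fraction(estimate, max_denominator=45)

# Verification by plugging r back in, illustrated with a stand-in problem:
# 12 is the multiplicative order of 2 modulo 13.
verified = pow(2, r, 13) == 1
print(c, r, verified)
```

If c and r happen to share a common factor, the recovered denominator is only a divisor of r. This is one way the procedure can fail, and the verification step is what detects it.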
4 Theory of Bulk Quantum Computation
A significant problem for quantum computation with NMR is the fact that the input state is not pure. Rather, it is a thermal equilibrium state, with a density matrix which is nearly the identity, a very high entropy state. From a computational standpoint, a highly random input is very undesirable, and thus this problem was long regarded as insurmountable for NMR quantum computation. In this section, we describe the theoretical development which solved this input preparation problem. First, we present two examples, for three and two spin systems, then we summarize a general theory, known as state labeling, which encompasses all methods which have been developed to date.
4.1 Three spin example
Consider an ensemble of 24 three-spin systems, in thermal equilibrium, with the density matrix
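$$\rho \approx \begin{bmatrix}
6 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 4 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 4 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 4 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 2 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.$$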
The spin energies of each of the spins are nearly the same, so that the populations of the states $|000\rangle$, $|001\rangle$, ..., $|111\rangle$ are given by 2 times the number of 0’s in the state label. For example, there are six molecules in the $|000\rangle$ state, four in the $|100\rangle$ state, and so forth. This state is commonly encountered in NMR, where $k_BT \gg \hbar\omega$. The desired pure state for computation would be something more like
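$$\rho_{\mathrm{ideal}} = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix},$$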
but no unitary transform can convert $\rho$ to this form, because unitary transforms preserve eigenvalues. Nevertheless, we wish to construct a pure state using unitary transforms, since those are at our disposal. We now note that the mixed state
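$$\rho_2 \approx \begin{bmatrix} 6 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix} = 2I + 4\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad (20)$$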
has a very interesting property: a unitary transform U leaves $I$ alone, and transforms only the deviation from the identity. Mathematically, $U\rho_2U^\dagger = \alpha I + U\rho_\Delta U^\dagger$, where $\rho_\Delta = \rho_2 - \mathrm{Tr}(\rho_2)I/4$ is called the deviation density matrix corresponding to $\rho_2$. Furthermore, for traceless observables, the only observed signal comes from $\rho_\Delta$; for example, the $\hat{z}$ magnetization is $\mathrm{Tr}(\rho_2\sigma_z) = \mathrm{Tr}(\rho_\Delta\sigma_z)$, since $\mathrm{Tr}(\alpha I\sigma_z) = 0$. Thus, from the standpoint of transforms (computation) and measurement (readout), $\rho_\Delta$ behaves like a pure state, $|\psi\rangle\langle\psi|$. We call density matrices such as $\rho_2$ effective pure states. A unitary transform can be applied to transform the state $\rho$ to give
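$$\begin{bmatrix}
6 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 4 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 4 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 4
\end{bmatrix}.$$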
The upper left half-block of this matrix is the effective pure state $\rho_2$. It is distinguished from the lower half-block by the state of the third spin, which is $|0\rangle$ for the top and $|1\rangle$ for the bottom half. An alternative way of viewing this result is using an energy level diagram, as shown in Fig. 4. The unitary transform can be accomplished using controlled-NOT gates, or can be understood as simple population transfers between specific energy levels. This procedure produces from a three-spin system, a manifold of states which acts like a pure two-qubit system. Generalizing this procedure allows an n - O(log n) qubit system to be purified from an n-spin system. This technique, discovered by Gershenfeld and Chuang (Ref. 12), is called “logical labeling” because the state of a subset of spins is used to label a specific manifold of states of the remaining spins, such that when the label is in a specific state, the remaining
Figure 4: Energy levels and populations of a three spin system. The initial thermal distribution is shown by the empty circles, and the populations of the “purified” state with effectively pure states are shown with filled circles. Reproduced with permission. © 1998, Royal Soc. of London.
spins are in an effectively pure state. A systematic algorithm to perform logical labeling on n spins has been given. Fundamentally, logical labeling works because of structure present in the high-temperature thermal equilibrium state: it is not completely random. From a computational standpoint, logical labeling performs a kind of compression similar to Schumacher coding to obtain a set of low-entropy qubits useful for quantum computation.
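The three-spin example of Sec. 4.1 amounts to relabeling which basis state carries which population, and since a permutation of basis states is a unitary operation, it can be sketched with plain lists. The particular permutation below is one of several that produce the target populations; it is an illustrative choice, not necessarily the one realized by the controlled-NOT network in the original work.

```python
# Thermal populations of |000>, |001>, ..., |111> for the 24-molecule example.
THERMAL = [6, 4, 4, 2, 4, 2, 2, 0]

# SOURCE[new] = old: the basis state whose population ends up in state `new`.
# Any permutation of basis states is unitary; this one is an illustrative choice.
SOURCE = [0, 5, 6, 3, 7, 1, 2, 4]

labeled = [THERMAL[old] for old in SOURCE]

# The first half-block is the manifold selected by the label spin being 0.
block = labeled[:4]
background = min(block)                        # part proportional to the identity
deviation = [p - background for p in block]    # what is left is the signal

for index, population in enumerate(labeled):
    print(format(index, "03b"), population)
print("block =", block, "=", background, "* identity +", deviation)
```

The selected block is the identity plus an excess in a single state, which is exactly the effective pure state structure, while the eigenvalues (the multiset of populations) are unchanged, as they must be under a unitary transform.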
4.2 Two spin example
Preparation of a pure state from a mixed state inherently requires some nonunitary operation. In logical labeling, this is done by the measurement procedure, which selects the signal from a particular configuration labeled by a set of spins. An alternative technique, known as temporal labeling, was discovered by Knill, Chuang, and Laflamme (Ref. 14). It works as follows. Consider an ensemble of two-spin systems with the density matrix
Now suppose that we perform three experiments, each with this state as input.
First, we obtain $\rho_1 = C(\rho)$, where C is some quantum computation (a unitary transform). We then perform another experiment to get $\rho_2 = C(P_1(\rho))$, where, writing the populations of $\rho$ as a, b, c, and d,
$$P_1(\rho) = \begin{bmatrix} a & 0 & 0 & 0 \\ 0 & c & 0 & 0 \\ 0 & 0 & d & 0 \\ 0 & 0 & 0 & b \end{bmatrix} \qquad (23)$$
cyclically permutes the $|01\rangle$, $|10\rangle$, and $|11\rangle$ populations. This is a unitary operation that is straightforward to implement using, for example, controlled-NOT gates. Finally, we obtain $\rho_3 = C(P_2(\rho))$, where
$$P_2(\rho) = \begin{bmatrix} a & 0 & 0 & 0 \\ 0 & d & 0 & 0 \\ 0 & 0 & b & 0 \\ 0 & 0 & 0 & c \end{bmatrix}$$
is the inverse of $P_1$. We now sum the three results, to obtain $\rho_1 + \rho_2 + \rho_3 = C(\rho')$, where
$$\rho' = \begin{bmatrix} 3a & 0 & 0 & 0 \\ 0 & b+c+d & 0 & 0 \\ 0 & 0 & b+c+d & 0 \\ 0 & 0 & 0 & b+c+d \end{bmatrix} \qquad (25)$$
is an effective pure state! Despite the original input being a mixed state, the result of the computation will be as if a pure state had been input, $C(\rho') \sim C(|\psi\rangle\langle\psi|)$.
In this technique, non-unitary operations are accomplished by adding multiple results together, each of which is implemented using only unitary transforms. Because time is used to distinguish the different experiments, this is known as temporal labeling, or temporal averaging. The general technique involves transforms other than simple permutations; what is necessary is to efficiently average over all terms in the density matrix other than the ground state. Particularly efficient algorithms can be obtained for high temperature thermal equilibrium distribution inputs.
The technique discovered by Cory, Fahmy, and Havel (Ref. 13) for creating effective pure states is similar to temporal labeling. Instead of performing multiple sequential experiments, however, they do theirs simultaneously, and split the sample volume into spatially distinct locations, to each of which is applied a different unitary operation. Experimentally, this is possible by the use of gradient magnetic and RF fields. Summation is performed naturally, since the overall observed signal is just the sum of that from each volume element. We refer to their technique as spatial labeling.
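The averaging step of temporal labeling in Sec. 4.2 can be sketched on the populations alone, taking the computation C to be the identity so that only the preparation is visible. The populations below are hypothetical integers chosen for readability, and the direction of the cyclic permutation is an arbitrary choice of this example; either direction, together with its inverse, gives the same sum.

```python
def cyclic_permute(pops):
    """Cycle the last three populations; the first one stays in place."""
    a, b, c, d = pops
    return [a, c, d, b]

def inverse_permute(pops):
    """Inverse of cyclic_permute."""
    a, b, c, d = pops
    return [a, d, b, c]

# Hypothetical populations of |00>, |01>, |10>, |11> (arbitrary units).
pops = [7, 5, 4, 2]

runs = [pops, cyclic_permute(pops), inverse_permute(pops)]
total = [sum(column) for column in zip(*runs)]   # sum of the three experiments

background = total[1]                            # common value b + c + d
deviation = [value - background for value in total]
print("sum =", total, "=", background, "* identity +", deviation)
```

The summed populations are uniform except for the first state, so the observable part of the averaged input is proportional to a single pure state even though each individual run started from a mixed state.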
4.3 General State Labeling Theory
State labeling is a general theory which encompasses each of the three techniques described above for creating effective pure states from a mixed state. The idea is to take a given computation C, and to prepend it with an initial preparation step P and append it with a readout preparation step R, such that we have a new computation C’,
where $\alpha$ is a known proportionality constant, and R and P denote general quantum operations. In logical labeling, an ancilla state which acts as a label is introduced, and the preparation operation creates the state
$$P(\rho) = |0\rangle\langle 0| \otimes \alpha|\psi\rangle\langle\psi| + \sum_{k \neq 0} |k\rangle\langle k| \otimes \rho_k, \qquad (27)$$
where the $|0\rangle\langle 0|$ state of the ancilla labels the desired pure state of the computational spins and all of the other states are arbitrary. Following a computation that does not change the state of the label spins, $I_{\mathrm{label}} \otimes C$, a readout operation that projects on the ancilla
gives the desired result
In spatial and temporal labeling, the preparation operation is implemented by a sum over unitary operations,
which produces an effective pure state. The label k represents either degrees of freedom in space, or in time. The limitation to all these procedures is that the magnitude of the observed signal from the effective pure state cannot be larger than the proportion of the desired state in the original input. In particular, for an n-spin thermal equilibrium state, the deviation of the largest eigenvalue from 1 is
in the high temperature limit, with nearly equal spin resonance frequencies. The exponential decrease of this deviation is due simply to the exponentially small probability of finding a configuration in the ground state; if each spin is up or down with a Bernoulli p distribution, then the probability of the all spin-down configuration is $p^n$, which is exponentially small. The problem thus becomes one of exponentially decreasing signal to noise ratio as the number of spins is increased. Nevertheless, this limit can be countered by several experimental techniques, such as optical pumping (Ref. 30), electron polarization transfer, and electronic spin polarization readout. These increase the polarization of the initial state, and increase the signal to noise ratio of the output. With modern spectrometer technology, it is expected that such steps will not be needed until constructing systems of more than eight to ten spins. Present experiments can manipulate two to three spins robustly, as discussed next.
5 Experimental Results
In this section, we describe experimental results by Chuang, Gershenfeld, Kubinec, and Leung, which confirm the operation of nontrivial quantum circuits implemented using bulk spin resonance quantum computation. Data are presented which demonstrate the classical and quantum operation of controlled-NOT gates, using state tomography (Ref. 34) to completely characterize the density matrix. Only the main results are reviewed here; for further details the reader is referred to the literature.
5.1 Experimental system
The two-spin physical system used in these experiments was carbon-13 labeled chloroform (Fig. 5). A 0.5 milliliter, 200 millimolar sample was prepared with d6-acetone as a solvent, degassed, and flame sealed in a thin walled, high performance 5 mm nuclear magnetic resonance (NMR) sample tube. This sample was in liquid form, and experiments were performed at room temperature. The Hamiltonian for the carbon-hydrogen spin system can be modeled by single-spin chemical shifts and the scalar coupling interaction term as
where $\mathcal{H}_{\mathrm{env}}$ represents a coupling to an external reservoir, and $I_{zA} = \sigma_{zA}/2$ is the angular momentum operator in the $\hat{z}$ direction for spin A (the proton; B is the carbon). The reservoir includes small interactions with other nuclei such as the chlorine, which do not play a major role in the dynamics. It also includes higher order terms in the spin-spin coupling, which can be disregarded in the first-order model; the spin interaction is dominated by through-bond coupling mediated by electrons, rather than by direct dipole-dipole interaction between the nuclei, and in the liquid at high magnetic field the rapid molecular tumbling averages away all but the $I_{zA} I_{zB}$ J-coupling interaction.

Figure 5: Chloroform molecule.

Spectra were taken using Varian UNITYInova-500 and Bruker DRX-500 spectrometers with standard probes. The deuterium resonance in the solvent was used as the lock signal. The resonance frequencies of the two proton lines (in the DRX-500) were measured to be at $500135028.5 \pm 107.5$ Hz, and the carbon lines at $125767641.5 \pm 107.5$ Hz, with errors of $\pm 1$ Hz. The radio-frequency (RF) excitation carrier (and probe) frequencies were set at the midpoints of these peaks so that the chemical shift evolution could be neglected, leaving only the 215 Hz J-coupling. The resonance lines from the solvent were at least a kHz away, and did not play any role in the experiment.

The coherence times of the two spins were estimated by measuring $T_1$ and $T_2$ relaxation times, using standard inversion-recovery and Carr-Purcell-Meiboom-Gill pulse sequences. For the proton, it was found that $T_1 \approx 7$ sec and $T_2 \approx 2$ sec, and for carbon, $T_1 \approx 16$ sec and $T_2 \approx 0.2$ sec. The short carbon $T_2$ time is due to coupling with the three quadrupolar chlorine nuclei, which reduces the coherence time.
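The numbers quoted above fit together in a simple way, which the following sketch checks. It assumes that the two lines of each doublet sit 107.5 Hz on either side of the multiplet centre, so that their separation is the J-coupling, and that the coupling delay used in the gate sequences is 1/(2J); the function names are illustrative.

```python
def j_coupling_hz(half_splitting_hz: float) -> float:
    """J-coupling as the separation of the two doublet lines."""
    return 2.0 * half_splitting_hz


def coupling_delay_s(j_hz: float) -> float:
    """Free-evolution delay 1/(2J) in seconds."""
    return 1.0 / (2.0 * j_hz)


j_hz = j_coupling_hz(107.5)
print(j_hz)                            # separation of the doublet lines in Hz
print(coupling_delay_s(j_hz) * 1e3)    # delay in milliseconds
```

With the quoted splitting, the separation is 215 Hz and the delay is about 2.33 ms.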
5.2 State Tomography

As previously described, unlike a conventional computer in which the bits travel to the gates, in an NMR computer the bits are represented in the molecular spin states, and the gates are brought to them by RF pulses that effectively change the Hamiltonian. Analogous to the use of a classical logic analyzer to characterize a conventional computer, the quantum computer's state can be completely characterized by systematically applying 90° pulses to rotate each spin around the $\hat{x}$ and $\hat{y}$ axes to make observable all of the terms in the deviation of the density matrix from the identity. Applying this procedure to the thermal equilibrium state $\rho_{\mathrm{thermal}} = \exp(-\mathcal{H}/kT)/\mathcal{Z}$ yielded the data shown in Fig. 6. Each of the elements was measured from a single free induction decay without any averaging. As expected, all the off-diagonal elements are nearly zero, while the diagonal elements show the Boltzmann occupancies of the logical basis states, with relative populations of $a+b$ ($|00\rangle$), $a-b$ ($|01\rangle$), $-a+b$ ($|10\rangle$), and $-a-b$ ($|11\rangle$). The ratio $a/b = 3.98$ is fixed by the ratio of the gyromagnetic frequencies of the two nuclei, and was used to calibrate the relative strength of the carbon signal amplification and digitization circuitry to that of the proton. An error of about 5% was observed in the data, due primarily to imperfect calibration of pulse times and inhomogeneity of the magnetic field. $\rho_{\mathrm{thermal}}$ is the logical input state to the NMR computer, and represents a classical mixture of 00, 01, 10, and 11.
Figure 6: Experimentally measured deviation density matrix of a thermal state, in the logical basis $|00\rangle$, $|01\rangle$, $|10\rangle$, $|11\rangle$. The magnitudes of the elements are shown on the left, and the phases on the right. Phases are given in radians, and those of small elements have been suppressed for clarity (shown as empty squares). Peaks are labeled for later identification.
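The relative populations of the thermal state can be tabulated directly. The sketch below takes $a/b$ to be the ratio of the measured proton and carbon resonance frequencies and sets $b = 1$ in arbitrary units; both the normalization and the function name are assumptions made for illustration.

```python
PROTON_HZ = 500135028.5   # centre of the proton doublet quoted in the text
CARBON_HZ = 125767641.5   # centre of the carbon doublet quoted in the text


def deviation_populations(a: float, b: float) -> dict:
    """Relative diagonal elements of the thermal deviation density matrix."""
    return {"00": a + b, "01": a - b, "10": -a + b, "11": -a - b}


ratio = PROTON_HZ / CARBON_HZ                     # plays the role of a/b
populations = deviation_populations(ratio, 1.0)   # b = 1 in arbitrary units
print(round(ratio, 2))
print(populations)
```

The four deviation populations sum to zero, as they must for a traceless deviation from the identity, and are ordered from $|00\rangle$ (largest) to $|11\rangle$ (smallest).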
5.3 Gates

For the single bit rotation operation we use the Hadamard transform $H$. This can be implemented by addressing one of the spins with the semi-selective pulse sequence $YX^2$ (where, for example, $Y = R_y(90)$ is a rotation about the $\hat{y}$ axis by 90°, and $X^2 = R_x(180)$). Multiplying the rotation matrices shows that this has the correct form, up to an overall phase factor of $-i$,

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \qquad (33)$$
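The matrix product can be checked with a few lines of code. This is a minimal sketch under two assumptions that are not spelled out above: rotations are taken as $R_k(\theta) = \exp(-i\theta\sigma_k/2)$, and the sequence $YX^2$ is read in time order, so the $Y$ pulse acts first and the operator product is $X^2 \cdot Y$.

```python
import math


def matmul(p, q):
    """Multiply two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rot_x(theta):
    """R_x(theta) = exp(-i * theta * sigma_x / 2) (assumed convention)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def rot_y(theta):
    """R_y(theta) = exp(-i * theta * sigma_y / 2) (assumed convention)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]

# Y pulse first, then X^2: the later pulse stands to the left in the product.
sequence = matmul(rot_x(math.pi), rot_y(math.pi / 2))

# Largest deviation between the pulse sequence and -i times the Hadamard matrix.
residual = max(abs(sequence[i][j] - (-1j) * HADAMARD[i][j])
               for i in range(2) for j in range(2))
print(residual)
```

Under these conventions the residual vanishes to machine precision. With the opposite sign convention for rotations, the same check gives a phase factor of $+i$ for the reversed pulse order.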
In certain cases, this sequence can be simplified to omit the $X^2$ pulse. The matrix for $H$ can be understood as an analog to the truth table for a classical logic function; inputs are read as the column labels, and the outputs as the row labels. However, unlike a classical function, $H$ can create superposition outputs; for example, $H|0\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$.

A controlled-NOT gate inverts spin B only when spin A = 1, and does nothing if A = 0. A pulse sequence that performs this operation exactly was described in Sec. 3.1. Here, we shorten it to the sequence shown in Fig. 7. Multiplying the corresponding matrices gives the unitary transformation
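
$$\tilde{U}_{CN} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1-i & 0 & 0 & 0 \\ 0 & 1+i & 0 & 0 \\ 0 & 0 & 0 & -(1+i) \\ 0 & 0 & 1-i & 0 \end{pmatrix}, \qquad (34)$$
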
which differs from the exact controlled-NOT only in the relative phases. Applying this gate to the thermal state gives the result shown in Fig. 8.
Figure 7: Quantum circuit symbol for a controlled-NOT gate, and its implementation with RF pulses in our two-spin NMR system in the doubly rotating frame. Time proceeds from left to right; $\tau = 1/2J = 2.326$ milliseconds.

Figure 8: Experimental data showing the output from a controlled-NOT gate, with the expected conditional flip of the target qubit (exchange of 2 ↔ 3).
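How a short pulse sequence can act as a controlled-NOT up to phases can be sketched as follows. The sequence used here is an assumption, since the pulse program of Fig. 7 is not reproduced in the text: a 90° $\hat{y}$ pulse on B, free evolution under the J-coupling for $\tau = 1/2J$, then a 90° $\hat{x}$ pulse on B, with rotations $R_k(\theta) = \exp(-i\theta\sigma_k/2)$ and basis order $|AB\rangle$. Different pulse signs would change which phases appear, but not the permutation structure.

```python
import cmath
import math


def matmul(p, q):
    """Multiply two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker product of two square matrices (a acts on A, b on B)."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def rot_x(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def rot_y(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


IDENTITY = [[1, 0], [0, 1]]

# Evolution under 2*pi*J*I_zA*I_zB for a time 1/(2J): exp(-i*(pi/4)*z_A*z_B),
# where z = +1 for logical 0 and z = -1 for logical 1 (assumed labelling).
signs = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
coupling = [[cmath.exp(-1j * math.pi / 4 * za * zb) if i == j else 0
             for j in range(4)]
            for i, (za, zb) in enumerate(signs)]

# Time order: Y pulse on B, delay, X pulse on B (later operations to the left).
gate = matmul(kron(IDENTITY, rot_x(math.pi / 2)),
              matmul(coupling, kron(IDENTITY, rot_y(math.pi / 2))))

# Squared magnitudes give the classical truth table of the gate.
truth_table = [[round(abs(gate[i][j]) ** 2, 12) for j in range(4)]
               for i in range(4)]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
print(truth_table == CNOT)
```

The squared magnitudes reproduce the controlled-NOT truth table exactly, while the nonzero entries carry phases of the form $(1 \pm i)/\sqrt{2}$. This is the sense in which such a shortened sequence differs from the exact gate only in relative phases.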
5.4 Circuits
Having shown the controlled-NOT gate operating properly on a classical mixture of inputs, we tested its quantum operation in a simple circuit to create a mixture of entangled states from the thermal state. The most important examples of two-qubit entangled states are the maximally entangled Einstein-Podolsky-Rosen (EPR) states shown in Eq. 35. A quantum circuit which creates EPR states from pure states is shown in Fig. 9. The pulse sequence $U_{EPR}$ shown gives an equivalent mapping,

$$\begin{aligned} |00\rangle &\rightarrow |00\rangle + |11\rangle \\ |01\rangle &\rightarrow |01\rangle + |10\rangle \\ |10\rangle &\rightarrow |00\rangle - |11\rangle \\ |11\rangle &\rightarrow |01\rangle - |10\rangle \end{aligned} \qquad (35)$$

(up to a normalization and phase). The result of applying this sequence to the thermal state is shown in Fig. 10, confirming the expected amplitudes up to the uncertainty due to single-shot measurements and magnetization decay during the free induction decay.
Figure 9: (top) Quantum circuit, and (bottom) corresponding pulse program for creating an EPR state, using a Hadamard transform $H$ followed by a controlled-NOT gate.
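The action of the ideal circuit (a Hadamard transform on A followed by a controlled-NOT with A as control) on each basis state can be worked out explicitly. The sketch below uses the exact gates rather than the experimental pulse sequence, so the phases produced by the actual pulses may differ; the basis order $|AB\rangle$ and all names are assumptions made for illustration.

```python
import math


def matmul(p, q):
    """Multiply two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker product of two square matrices (a acts on A, b on B)."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]
IDENTITY = [[1, 0], [0, 1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

# Hadamard on A first, then the controlled-NOT.
u_epr = matmul(CNOT, kron(HADAMARD, IDENTITY))

LABELS = ["00", "01", "10", "11"]
# Column j of the circuit matrix holds the output amplitudes for input LABELS[j].
outputs = {LABELS[j]: [u_epr[i][j] for i in range(4)] for j in range(4)}
for label, amplitudes in outputs.items():
    print(label, amplitudes)
```

Each basis state is sent to a different maximally entangled state: the inputs 00 and 10 give the two combinations of $|00\rangle$ and $|11\rangle$, while 01 and 11 give the two combinations of $|01\rangle$ and $|10\rangle$.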
The first pulse in this sequence tips both spins into the $x$-$y$ plane, making their transverse magnetization observable in the pick-up coil. The second pulse tips the B spin out of the observable plane, but does not act directly on the A spin. Classically the precession of A would still be expected to be observable. But as Fig. 10 shows, the observable magnetization has been transformed into non-observable quantum coherences (off-diagonal elements). Because of the entanglement that has been created between spins A and B by the exchange coupling during the precession interval, operations on spin B do affect the state of spin A, in this case causing the magnetization for the mixture of states to coherently interfere and thereby cancel. This demonstrates the quantum operation of the controlled-NOT gate.

Figure 10: Deviation density matrix of an EPR mixture state created from a thermal mixture. The reverse diagonal represents the entangled states.

A quantum computer must be able to interconnect its logical gates while preserving coherence. To demonstrate this, two controlled-NOT gates were cascaded to implement a permutation operation, using the quantum circuit shown in Fig. 11, which performs the transformation
1: P : :I 0 0 - 1 0
Applying this sequence to the thermal input state gives the result shown in Fig. 12, properly performing a cyclic permutation on the populations.
5.5 Effective pure states

Arbitrary quantum computations require pure state inputs, not the thermal populations that have been shown in the preceding examples. But using the primitive operators that have been introduced, state labeling can be used to create effective pure states from the equilibrium thermal initial condition, as described in Sec. 4. While the system remains close to thermal equilibrium, these states transform exactly like pure states and hence can be used for universal quantum computation.
Figure 11: Quantum circuit to perform a permutation operation, and its pulse program implementation.

Figure 12: Experimentally measured deviation density matrices showing the effect of two cascaded controlled-NOT gates on a thermal state: cyclic permutation of the state labels, 2 → 4 → 3 → 2.
Temporal labeling [14] was used to create the effective pure state $|00\rangle$ from $\rho_{\mathrm{thermal}}$. Letting $P_2 = P_1^\dagger$ be the cyclic permutation in the direction opposite to $P_1$, and $P_0$ be the identity operator, summing the result of a quantum computation performed three times, each prepended by one of these permutation operators, projects out the pure state signal. Following the permutation operations by the EPR circuit, we obtain the transformation

$$\rho_{EPR} = \sum_{k=0}^{2} U_{EPR} P_k\, \rho_{\mathrm{thermal}}\, P_k^\dagger U_{EPR}^\dagger, \qquad (37)$$

which should give the deviation density matrix

$$\rho_{EPR} = a \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix}, \qquad (38)$$

having the signature of the Bell state $|00\rangle - |11\rangle$. The output expected from this network of three controlled-NOT gates and one Hadamard gate is confirmed in Fig. 13.

Figure 13: Experimentally observed Einstein-Podolsky-Rosen state $|00\rangle - |11\rangle$ created using temporal state labeling from a thermal ensemble and measured using quantum state tomography. Panels: magnitude and phase.
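Why summing over the three permutations projects out a pure-state signal can be seen from the populations alone. The sketch below assumes that the permutations cycle the populations of $|01\rangle$, $|10\rangle$ and $|11\rangle$ while leaving $|00\rangle$ fixed, and it ignores any phases the permutation operators may carry. The values of $a$ and $b$ are illustrative, and the result does not depend on the direction of the cycle.

```python
def thermal_populations(a: float, b: float) -> list:
    """Deviation populations in the order 00, 01, 10, 11."""
    return [a + b, a - b, -a + b, -a - b]


def cyclic_shift(populations: list, k: int) -> list:
    """Cycle the last three populations k times, leaving the first fixed."""
    head, tail = populations[0], populations[1:]
    k %= 3
    if k == 0:
        return list(populations)
    return [head] + tail[-k:] + tail[:-k]


def temporal_sum(populations: list) -> list:
    """Sum the populations over the identity and both cyclic permutations."""
    shifted = [cyclic_shift(populations, k) for k in range(3)]
    return [sum(column) for column in zip(*shifted)]


summed = temporal_sum(thermal_populations(4.0, 1.0))
print(summed)
```

For any $a$ and $b$ the sum is $4(a+b)$ on $|00\rangle$ minus $(a+b)$ times the identity. Since the identity part is unobservable, the summed signal behaves like that of the pure state $|00\rangle$.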
6 Conclusions

The experimental results described here represent the first steps taken to systematically investigate the use of NMR for quantum computation, through tomographic characterization of gates and cascaded gate circuits. The experimental data confirm the dynamical behavior expected for a quantum system.
These techniques have been extended to construct larger circuits, demonstrating the first experimental implementation of quantum algorithms. Groups at Stanford, MIT, Berkeley, and IBM Almaden [36,37] and, independently, at Oxford [38,39] have realized Grover's algorithm [40] on a search space of four elements, and the Deutsch-Jozsa algorithm [41] for a function on a domain of two elements. These first steps are encouraging signs of potential future progress, but it is still too early to draw definitive conclusions, particularly about the technological relevance of NMR quantum computation. Nevertheless, the results are intriguing, and motivate new experimental and theoretical questions. For example, how might bulk quantum computation ideas be applied to other physical systems, such as quantum dots? Are there state labeling schemes which do not incur exponential signal loss (in contrast to those discussed here)? How may optical pumping and other traditional NMR techniques be used for quantum computation? It is also very possible that quantum computation, particularly the recently discovered quantum error correction techniques [42], may be useful in NMR spectroscopy. While great challenges remain if quantum computation is ever to become practically useful, these preliminary steps mark a hopeful transition of the study of quantum computation from a theoretical to an experimental science.

References
1. R. P. Feynman, Int. J. Theor. Phys. 21 (6/7), 467 (1982).
2. P. W. Shor, SIAM J. Computing 26, 1484 (1997).
3. F. Bloch, Phys. Rev. 70 (7), 460 (1946).
4. L. Emsley and A. Pines, Lectures on Pulsed NMR (2nd Edition) (Societa Italiana di Fisica, CXXIII, Enrico Fermi International School of Physics, Villa Monastero, Italy, 1994).
5. R. R. Ernst, G. Bodenhausen and A. Wokaun, Principles of Nuclear Magnetic Resonance in One and Two Dimensions (Oxford University Press, Oxford, 1994).
6. S. Lloyd, Science 261, 1569 (1993).
7. D. P. DiVincenzo, Phys. Rev. A 50, 1015 (1995).
8. K. Wago, O. Zuger, R. Kendrick, C. S. Yannoni and D. Rugar, J. Vac. Sci. B 14, 1197 (1996).
9. J. Wrachtrup, A. Gruber, L. Fleury and C. von Borczyskowski, Chem. Phys. Lett. 267, 179 (1997).
10. D. L. Olson, T. L. Peck, A. G. Webb, R. L. Magin and J. V. Sweedler, Science 270, 1967 (1995).
11. D. Patterson and J. Hennessy, Computer Architecture, A Quantitative Approach (Morgan Kaufmann Publishers, San Francisco, 1996).
12. N. Gershenfeld and I. L. Chuang, Science 275, 350 (1997).
13. D. Cory, A. Fahmy, and T. Havel, Proc. Nat. Acad. Sci. 94 (5), 1634 (1997).
14. E. Knill, I. L. Chuang, and R. Laflamme, Phys. Rev. A 57, 3348 (1998).
15. W. Warren, Science 277, 1688 (1997).
16. A. Abragam, The Principles of Nuclear Magnetism (Clarendon Press, Oxford, 1961).
17. M. Goldman, Quantum Description of High-Resolution NMR in Liquids (Oxford Scientific Publications, London, 1988).
18. C. P. Slichter, Principles of Magnetic Resonance (Springer, Berlin, 1996).
19. J. J. Sakurai, Modern Quantum Mechanics (Addison-Wesley Publishing Company, Reading, Massachusetts, 1985).
20. T. E. Chupp, E. R. Oteiza, J. M. Richardson and T. R. White, Phys. Rev. A 38 (8), 3998 (1988).
21. D. P. DiVincenzo, "Two-bit gates are universal for quantum computation," in Workshop on Quantum Computing and Communication, Gaithersburg, MD, August 18-19 (1994).
22. S. Lloyd, Phys. Rev. Lett. 75, 346 (1995).
23. A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. W. Shor, T. Sleator, J. Smolin and H. Weinfurter, Phys. Rev. A 52, 3457 (1995).
24. G. D. Mateescu and A. Valeriu, 2D NMR: Density Matrix and Product Operator Treatment (Prentice Hall, Englewood Cliffs, 1993).
25. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers (Fourth Edition) (Oxford University Press, London, 1960).
26. A. K. Ekert and R. Jozsa, Rev. Mod. Phys. 68, 733 (1996).
27. D. Deutsch and R. Jozsa, Proc. R. Soc. Lond. A 439, 553 (1992).
28. D. Simon, p. 116 in Proc. 35th Annual Symposium on Foundations of Computer Science, Los Alamitos, CA (IEEE Computer Society Press, 1994).
29. A. Yu. Kitaev, "Quantum measurements and the Abelian stabilizer problem," LANL E-print quant-ph/9511026.
30. T. Walker and W. Happer, Rev. Mod. Phys. 69, 629 (1997).
31. I. L. Chuang, N. Gershenfeld, M. G. Kubinec and D. W. Leung, Proc. R. Soc. Lond. A 454, 447 (1998).
32. B. Schumacher, Phys. Rev. A 51, 2738 (1995).
33. R. Cleve and D. P. DiVincenzo, Phys. Rev. A 54, 2636 (1996).
34. U. Leonhardt, Phys. Rev. Lett. 74, 4101 (1995).
35. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
36. I. L. Chuang, L. Vandersypen, D. Leung, X. Zhou, and S. Lloyd, Nature 393, 143 (1998).
37. I. L. Chuang, N. Gershenfeld, and M. G. Kubinec, Phys. Rev. Lett. 80, 3408 (1998).
38. J. A. Jones, M. Mosca, and R. H. Hansen, Nature 393, 344 (1998).
39. J. A. Jones and M. Mosca, J. Chem. Phys. (Aug. 1998).
40. L. K. Grover, p. 212 in Proc. 28th Annual ACM Symposium on the Theory of Computation, New York, NY (ACM Press, 1996); L. K. Grover, Phys. Rev. Lett. 79, 325 (1997).
41. D. Deutsch and R. Jozsa, Proc. R. Soc. Lond. A 439, 553 (1992).
42. See, for example, A. M. Steane, Reports on Progress in Physics 61, 117 (1998); J. Preskill, Proc. R. Soc. Lond. A 454, 385 (1998).
FUTURE DIRECTIONS FOR QUANTUM INFORMATION THEORY

CHARLES H. BENNETT
IBM Research, Yorktown Heights, NY 10598, USA

Classical information and computation theory are now naturally viewed as a subset of a more comprehensive theory concerned with the transmission and processing of intact quantum states. The new theory is broadly concerned with the possible transformations of quantum states into one another, often in a multiparty setting, where part of a possibly entangled quantum system is held by each of several observers. Many open questions remain concerning the kind and quantity of resources that can be used to perform these transformations, including classical messages, shared entanglement, noiseless or noisy quantum communications channels, and (perhaps most fundamentally) direct physical interaction between the parties, and the extent to which various combinations of resources can be substituted for one another.
1 Introduction
Recent developments, described elsewhere in this book, have made it clear that to fully accommodate the kinds of information processing available in nature, classical notions need to be expanded to include information carriers capable of entanglement, superposition, and unitary evolution. Just as classical information processing can be reduced to a sequence of one- and two-bit Boolean operations ("gates"), so quantum information processing can be reduced to a sequence of one- and two-qubit unitary operations ("quantum gates") acting on qubits, i.e. elementary two-state quantum systems. As suggested in Fig. 1a, any transformation of a closed quantum system can be described by a unitary operation $U$ realizable by an array of one- and two-qubit gates. The most general processing of quantum information in an open system (mathematically, a completely positive linear map or superoperator [1]) can be described as such a unitary transformation on a larger number of qubits, including some extra constant inputs ("ancillas") and/or some extra outputs which are discarded into the environment. Within this formalism a classical bit can be defined without loss of generality as a qubit promised to be in one of two standard orthogonal states $|0\rangle$ and $|1\rangle$, identified with the Boolean values 0 and 1. A classical wire can be defined as one capable of carrying Boolean values reliably, but not superpositions. As is well known, this effect can be achieved (Fig. 1b) by having the input qubit interact, via a quantum XOR or controlled-NOT gate, with an ancillary qubit, initially in the $|0\rangle$ state, which is then discarded. In other words, a classical wire is a noisy quantum channel (i.e. a superoperator) consisting of a noiseless quantum wire with an eavesdropper or environment which monitors the state passing through via an XOR gate. This leaves $|0\rangle$ and $|1\rangle$ unharmed but collapses any superposition into a probabilistic mixture of 0 and 1.

Figure 1: a) Any unitary operation U on quantum data can be synthesized from the two-qubit XOR gate and one-qubit unitary operations (U). The most general treatment, or superoperator, that can be applied to quantum data is a unitary interaction with one or more 0 qubits, followed by discarding some of the qubits. Superoperators are typically irreversible. b) A classical wire (thick line) conducts 0 and 1 faithfully but not superpositions or entangled states. It may be defined as a quantum wire that interacts via an XOR with an ancillary 0 which is then discarded (shading indicates entanglement).
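The effect of a classical wire on a superposition can be traced through in a few lines. This is a minimal sketch, assuming an input qubit $\alpha|0\rangle + \beta|1\rangle$ and an ancilla prepared in $|0\rangle$; the function name and the amplitudes used below are illustrative.

```python
def classical_wire(alpha, beta):
    """Reduced density matrix of the input qubit after an XOR with a |0> ancilla.

    The XOR maps (alpha|0> + beta|1>)|0> to alpha|00> + beta|11>; discarding
    the ancilla corresponds to a partial trace over the second index.
    """
    # Joint amplitudes psi[x][y] for input qubit value x and ancilla value y.
    psi = [[alpha, 0], [0, beta]]
    return [[sum(psi[x][y] * psi[xp][y].conjugate() for y in range(2))
             for xp in range(2)]
            for x in range(2)]


# An illustrative superposition with |alpha|^2 = 0.36 and |beta|^2 = 0.64.
rho = classical_wire(0.6, 0.8)
print(rho)
```

The off-diagonal elements of the output vanish, so the superposition leaves the wire as a probabilistic mixture of 0 and 1 with weights $|\alpha|^2$ and $|\beta|^2$, while the Boolean inputs $|0\rangle$ and $|1\rangle$ pass through unchanged.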
2 Comparison of Classical and Quantum Information Processing

2.1 Computation
A number of the most important similarities and differences between quantum and classical information processing are catalogued in Table 1. The most famous difference between quantum and classical information processing occurs in a single-party setting. It is of course the dramatic quantum speedup of certain computations, the ability (Fig. 2) of a small quantum gate array to perform the same classical input-output mapping as a much larger classical array. The quantum gate array can be regarded as deriving its extra power from its ability to produce, preserve and operate on entangled states (gray); classical gates such as NAND, by contrast, destroy entanglement, so that at every intermediate step of a classical computation the state of the entire computer is a product of states of the individual parts. Open questions in quantum computation theory concern the power of quantum computers to speed up other classical computations besides the known examples, and the relation of quantum to classical computational complexity classes.

Table 1: Comparison of Classical and Quantum Information Processing.

Property | Classical | Quantum | Ref.
State representation | String of bits $x \in \{0,1\}^n$ | String of qubits $\psi = \sum_x c_x |x\rangle$ | 2,3
Computation primitives | Deterministic or stochastic one- and two-bit operations | One- and two-qubit unitary transformations | 4,5
Fault-tolerant computation | By classical fault-tolerant gate arrays | By quantum fault-tolerant gate arrays | 6,7
Quantum computational speedups | | Factoring: exponential speedup | 5
 | | Search: quadratic speedup | 5
 | | Black box iteration: no speedup | 8
Communication primitives | Transmitting a classical bit | Transmitting a classical bit; transmitting a qubit; sharing an EPR pair | 3,4
Source entropy | $H = -\sum_x p(x) \log p(x)$ | $S = -\mathrm{Tr}\, \rho \log \rho$ | 4
Noiseless coding techniques | Classical data compression | Quantum data compression | 3,4
 | | Entanglement concentration | 9
Error-correction techniques | Error-correcting codes | Quantum error-correcting codes | 6
 | | Entanglement distillation | 10,11
Noisy channel capacities | Classical capacity $C_1$ equals maximum mutual information through a single channel use | Classical capacity $C \ge C_1$ | 12
 | | Unassisted quantum capacity $Q \le C$ | 6
 | | Classically assisted quantum capacity $Q_2 \ge Q$ | 10,13
Entanglement-assisted communication | | Superdense coding; quantum teleportation | 3,4
Communication complexity | Bit communication cost of distributed computation | Qubit cost, or entanglement-assisted bit cost, can be less | 14
Secret crypto key agreement | Insecure against unlimited computing power, or if P=NP | Secure against general quantum attack and unlimited computing | 15
2-party bit commitment | | Insecure against attack by a quantum computer | 15
Digital signatures | | No known quantum realization |

Figure 2: Quantum speedup of classical computations.
2.2 Communication
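A broader range of questions remains unanswered in the distributed or multiparty setting, where two or more parties, often personified as Alice, Bob ..., each can perform arbitrary local quantum operations but are limited in their abilities to communicate with one another. For example they might only be able to communicate classically, or they might have a noisy quantum channel $\mathcal{N}$, but no noiseless quantum channel, at their disposal. In the latter case, the quantum capacity $Q(\mathcal{N})$ of the channel [13] may be defined as

$$Q(\mathcal{N}) = \lim_{\epsilon \to 0}\, \limsup_{n \to \infty} \left\{ \frac{n}{m} : \exists_{m,\mathcal{E},\mathcal{D}}\; \forall_{\psi \in \mathcal{H}_{2^n}}\; \langle\psi|\, \mathcal{D}\mathcal{N}^{\otimes m}\mathcal{E}(|\psi\rangle\langle\psi|)\, |\psi\rangle > 1 - \epsilon \right\}. \qquad (1)$$
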
Here $\mathcal{E}$ is an encoding superoperator from $n$ qubits to $m$ channel inputs and $\mathcal{D}$ is a decoding superoperator from $m$ channel outputs to $n$ qubits. In Fig. 3, $n = 3$ qubits are encoded into $m = 4$ uses of the noisy channel. The classical capacity $C(\mathcal{N})$ of a noisy quantum channel is defined by the same expression but with the universal quantification taken over all Boolean states $\psi \in \{|0\rangle, |1\rangle\}^n$, rather than all possible states $\psi \in \mathcal{H}_{2^n}$ of the $n$ qubits, because classical communication does not require superpositions of the Boolean states to be transmitted faithfully. For some channels the classical capacity $C$ is known to exceed the maximum mutual information that can be sent through a single use of the channel [12]. Both $Q$ and $C$ can be understood in terms of the block diagram of Fig. 3a, in which $n$ qubits are encoded, sent through $m$ instances of the channel, and then decoded to yield an asymptotically faithful approximation of the input state, either for all input states in the case of $Q$, or for all Boolean input states in the case of $C$. Another kind of quantum capacity, the so-called classically assisted capacity $Q_2(\mathcal{N})$, is defined in terms of a more complicated protocol (Fig. 3b) in which the sender Alice initially receives $n$ qubits, after which she and the receiver Bob can perform local quantum operations and exchange classical messages freely in both directions, interspersed with $m$ forward uses of the noisy quantum channel $\mathcal{N}$, with the goal of ultimately enabling Bob to output a faithful approximation of the $n$-qubit input state. The capacity $Q_2$ is then defined by a limiting expression like Eq. 1, but with the encoder/decoder combination $\mathcal{E}, \mathcal{D}$ replaced by an interactive protocol of the form of Fig. 3b. Clearly $Q_2(\mathcal{N}) \ge Q(\mathcal{N})$ for any quantum channel $\mathcal{N}$, and channels are known for which this inequality is strict; an open question is whether there are channels for which $Q_2 > C$. For most noisy quantum channels, none of the three capacities is known exactly, though upper and lower bounds are known [13]; notably, for channels such as the 50% depolarizing channel, no-cloning arguments require that $Q = 0$, while entanglement distillation protocols and teleportation give a positive lower bound for $Q_2$ [10].

Figure 3: Capacities of a noisy quantum channel.

Classical communication complexity deals with the number of bits of classical communication required by two or more parties to evaluate a publicly agreed-on function of several private inputs, one held by each party. For example if Alice and Bob each hold an $n$-bit string, it is known that $O(n)$ bits of communication are required for them to determine whether the bitwise AND of their strings is zero. Although the question and answer are classical, Buhrman et al. [14] have shown that if Alice and Bob are allowed to exchange qubits rather than bits, or if they are allowed to supplement their classical communication with prior entanglement (enabling them to teleport quantum data), the communication complexity falls to $O(\sqrt{n})$. Many other results of this sort are known [14].

2.3 Cryptography
Quantum cryptographic key distribution is a 3-party protocol involving quantum and classical communication between Alice and Bob, subject to eavesdropping by Eve, as shown in Fig. 4. In contrast to communication complexity, where the parties are cooperating toward a common goal, the adversarial nature of cryptography implies a nontrivial quantification over roles of the parties: secure key distribution requires the existence of a strategy for Alice and Bob such that for all strategies of Eve the desired outcome (either a shared key on which Eve has hardly any information, or no key, as indicated by the frown) occurs with high probability.

Figure 4: Quantum cryptographic key distribution.
Figure 5: General multipartite quantum state processing. Two or more parties (Alice, Bob ...) have the task of implementing some physically possible transformation $S$, i.e. a completely positive linear map from initial states $\rho$ to final states $S(\rho)$ of their multipartite system. If there were only one party this would always be possible (though maybe not easy in terms of the number of one- and two-qubit quantum gates required). But with several parties it may or may not be possible, depending on whether the parties can be trusted to cooperate, and on the kind and quantity of communications resources at their disposal. These include, from left to right, noiseless and noisy quantum channels, previously shared entanglement, classical communication (thick arrows) and the unitary evolution $U = e^{iHt}$ induced by some limited amount of interaction, with Hamiltonian $H$, between the parties. Finally, the need to dispose of waste classical information (symbolized as heat), generated by processes such as teleportation, may be viewed as a negative communications resource.
3 Open Problems
Figure 5 suggests the large variety of mostly unsolved problems that arise when one tries to analyze quantum information processing in a distributed setting. Clearly an initial-to-final state mapping $S$ is only possible in a multipartite setting if it would be possible in a single quantum computer. However, the effect of nonlocality and (in the case of cryptographic protocols) noncooperation among the parties greatly complicates the situation. As in classical information theory, some of the most important questions concern asymptotic performance of protocols in the limit of large $n$, for example the various kinds of channel capacity, and the asymptotic efficiency of distilling one quantum state from another, with the aid of classical communication. The intensity of research now underway offers hope for a greatly improved understanding of many of these questions in the next few years.
Acknowledgements

I wish to thank Sam Braunstein, Isaac Chuang, David DiVincenzo, Chris Fuchs, Daniel Gottesman, Sandu Popescu, John Smolin, Barbara Terhal, Ashish Thapliyal, and Bill Wootters for helpful discussions. Part of this work was completed during the 1998 Elsag-Bailey I.S.I. Foundation research meeting on quantum computation.
References

1. K. Kraus, States, Effects, and Operations: Fundamental Notions of Quantum Theory (Springer, Berlin, 1983); see also B. Schumacher, "Sending Entanglement through Noisy Channels," preprint quant-ph/9604023 (1996).
2. T. Spiller, this volume.
3. S. Popescu and D. Rohrlich, this volume.
4. R. Jozsa, this volume.
5. A. Barenco, this volume.
6. A. Steane, this volume.
7. J. Preskill, this volume.
8. Y. Ozhigov, "Quantum Computers Cannot Speed up Iterated Applications of a Black Box," preprint quant-ph/9712051 (1997); E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser, "A Limit on the Speed of Quantum Computation in Determining Parity," preprint quant-ph/9802045 (1998).
9. C. H. Bennett, H. J. Bernstein, S. Popescu, and B. Schumacher, Phys. Rev. A 53, 2046 (1996); H.-K. Lo and S. Popescu, "Concentrating entanglement by local actions-beyond mean values," preprint quant-ph/9707038 (1997).
10. C. H. Bennett, G. Brassard, B. Schumacher, S. Popescu, J. Smolin, and W. K. Wootters, Phys. Rev. Lett. 76, 722 (1996); D. Deutsch, A. Ekert, R. Jozsa, C. Macchiavello, S. Popescu, and A. Sanpera, Phys. Rev. Lett. 77, 2818 (1996), 80, 2022 (1998); C. H. Bennett, D. P. DiVincenzo, J. Smolin, and W. K. Wootters, Phys. Rev. A 54, 3824 (1996) (preprint quant-ph/9604024); M. Horodecki, P. Horodecki, and R. Horodecki, Phys. Rev. Lett. 78, 574 (1997) (preprint quant-ph/9607009).
11. V. Vedral, M. B. Plenio, M. A. Rippin, and P. L. Knight, Phys. Rev. Lett. 78, 2275 (1997); M. Horodecki, P. Horodecki, and R. Horodecki, Phys. Rev. Lett. 80, 5239 (1998) (preprint quant-ph/9801069); K. Zyczkowski, P. Horodecki, A. Sanpera, and M. Lewenstein, "On the volume of the set of mixed entangled states," preprint quant-ph/9804024 (1998).
12. A. S. Holevo, Probl. Peredachi Inform. 15, 3-11 (1979, in Russian); M. Sasaki, K. Kato, M. Izutsu, and O. Hirota, "A simple quantum channel having superadditivity of channel capacity," preprint quant-ph/9705043 (1997); C. A. Fuchs, Phys. Rev. Lett. 79, 1162 (1997).
13. S. Lloyd, "The capacity of the noisy quantum channel," preprint quant-ph/9604015 (1996); H. Barnum, J. Smolin, and B. Terhal, "The quantum capacity is properly defined without encodings," preprint quant-ph/9711032, to appear in Phys. Rev. A (1998); H. Barnum, M. A. Nielsen, and B. Schumacher, Phys. Rev. A 57, 4153 (1998) (preprint quant-ph/9702049).
14. R. Cleve and H. Buhrman, Phys. Rev. A 56, 1201 (1997) (preprint quant-ph/9704026); H. Buhrman, R. Cleve, and A. Wigderson, "Quantum vs. Classical Communication and Computation," preprint quant-ph/9802040, to appear in 30th Symposium on Theory of Computing (ACM Press, 1998); R. Beals, H. Buhrman, R. Cleve, M. Mosca, and R. de Wolf, "Quantum Lower Bounds by Polynomials," preprint quant-ph/9802049 (1998).
15. H.-K. Lo, this volume.