Foundations of Quantum Mechanics, an Empiricist Approach
Fundamental Theories of Physics An International Book Series on The Fundamental Theories of Physics: Their Clarification, Development and Application
Editor: ALWYN VAN DER MERWE, University of Denver, U.S.A.
Editorial Advisory Board:
JAMES T. CUSHING, University of Notre Dame, U.S.A.
GIANCARLO GHIRARDI, University of Trieste, Italy
LAWRENCE P. HORWITZ, Tel-Aviv University, Israel
BRIAN D. JOSEPHSON, University of Cambridge, U.K.
CLIVE KILMISTER, University of London, U.K.
PEKKA J. LAHTI, University of Turku, Finland
ASHER PERES, Israel Institute of Technology, Israel
EDUARD PRUGOVECKI, University of Toronto, Canada
TONY SUDBURY, University of York, U.K.
HANS-JÜRGEN TREDER, Zentralinstitut für Astrophysik der Akademie der Wissenschaften, Germany
Volume 127
Foundations of Quantum Mechanics, an Empiricist Approach by
Willem M. de Muynck Eindhoven University of Technology, The Netherlands
KLUWER ACADEMIC PUBLISHERS NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 0-306-48047-6
Print ISBN: 1-4020-0932-1
©2004 Kluwer Academic Publishers
New York, Boston, Dordrecht, London, Moscow
Print ©2002 Kluwer Academic Publishers, Dordrecht
All rights reserved.
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher.
Created in the United States of America
Visit Kluwer Online at: http://kluweronline.com
and Kluwer's eBookstore at: http://ebooks.kluweronline.com
For Yolanda, Catelijne and Sarah
Contents

Preface

1 Standard and generalized formalisms of quantum mechanics
1.1 Basic postulates of standard quantum mechanics
1.2 Some elements of quantum field theory
1.3 Simultaneous and joint measurement of compatible observables
1.3.1 Postulate of local commutativity
1.4 Mixtures
1.4.1 Density operators
1.4.2 Basic postulates for mixtures
1.4.3 Density operators as vectors in a linear space
1.5 Coupled systems
1.5.1 Subsystems
1.5.2 Correlation observables
1.5.3 Polar decomposition
1.5.4 Entangled states
1.6 Projection or reduction postulate
1.7 Uncertainty relations
1.7.1 Heisenberg inequality
1.7.2 Entropic uncertainty relations
1.8 Proposition calculus of standard quantum mechanics
1.8.1 Boolean lattice of classical propositions; objectivity
1.8.2 Propositions referring to a single quantum mechanical observable
1.8.3 Two or more compatible observables
1.8.4 Incompatible standard observables
1.8.5 States on the lattice of propositions
1.9 Generalized quantum mechanical observables
1.9.1 Quantum mechanical observables as positive operator-valued measures
1.9.2 Joint measurement of generalized observables
1.9.3 Naimark’s theorem
1.9.4 Phase observables
1.10 Correspondence with classical mechanics
1.10.1 Ehrenfest’s theorem
1.10.2 Dirac quantization
1.10.3 Classical statistical mechanics
1.11 Phase space representations
1.11.1 Introduction
1.11.2 Wigner-Weyl representation
1.11.3 Husimi representation
1.11.4 representations
1.11.5 Relation to operator ordering
1.11.6 Wigner’s theorem
1.11.7 The Schrödinger equation in the representation

2 Empiricist and realist interpretations of quantum mechanics
2.1 Introduction
2.2 Empiricist interpretation of quantum mechanics
2.2.1 Logical positivism/empiricism and empiricist interpretation
2.3 Realist interpretation of quantum mechanics
2.4 Empiricist or realist interpretation: which one to choose?
2.4.1 Logical positivism/empiricism and realist interpretation
2.4.2 The classical paradigm
2.4.3 Double role of the quantum mechanical observable
2.4.4 Interpretations of quantum field theory
2.4.5 Contextualistic-realist interpretation
2.5 Some consequences
2.5.1 Empiricist interpretation and generalized observables
2.5.2 Realist interpretation of quantum mechanics, and hidden variables
2.5.3 Interpretations and the classical limit

3 Quantum mechanical description of measurement, and the “measurement problem”
3.1 The (conventional) “measurement problem”
3.1.1 Schrödinger’s cat
3.1.2 Interpretations, the “measurement problem”, and the problem of quantum mechanical measurement
3.1.3 Three tentative answers
3.2 Quantum mechanical description of the measurement process
3.2.1 A simplified model
3.2.2 Von Neumann’s proof of consistency of projection and unitary evolution
3.2.3 “Orthodox” solution to the “measurement problem”
3.2.4 Measurements of first and second kind
3.2.5 The “measurement problem” for measurements of the second kind
3.2.6 Conditional preparation
3.2.7 Quantum jumps
3.3 Quantum mechanical description of the measurement process and POVMs
3.3.1 Possibility of POVMs
3.3.2 Pointer observables and POVMs
3.3.3 Measurement and empiricist interpretation
3.3.4 Conditional preparation and generalized observables
3.3.5 Generalized von Neumann projection for generalized observables
3.3.6 Measurement and information
3.4 Decoherence
3.4.1 Ergodicity
3.4.2 Environment-induced superselection
3.4.3 Spontaneous localization
3.4.4 General evolution equation for open systems
3.4.5 Critique of the decoherence solution to the problem of quantum measurement

4 The Copenhagen interpretation
4.1 Introduction
4.2 Completeness of quantum mechanics
4.2.1 Completeness in a wider sense
4.2.2 Completeness in a restricted sense
4.2.3 Entanglement of the ‘(in)completeness’ question with other issues; sources of confusion
4.3 The correspondence principle
4.3.1 Weak and strong forms of the correspondence principle
4.3.2 Strong form of the correspondence principle
4.3.3 Realism versus empiricism, and correspondence
4.3.4 Critique of the correspondence principle
4.4 Complementarity in a wider and in a restricted sense
4.5 ‘Thought experiments’
4.5.1 Diffraction of particles through a slit
4.5.2 The double-slit experiment
4.5.3 The microscope
4.6 Meaning of the ‘complementarity’ concept
4.6.1 Two meanings of ‘to determine’
4.6.2 Heisenberg’s disturbance theory of measurement
4.6.3 Complementarity according to Bohr
4.6.4 Particle-wave duality
4.6.5 Parallel and circular complementarity
4.6.6 Complementarity and the projection postulate
4.6.7 Complementarity and consciousness
4.7 Critique of the complementarity principle
4.7.1 Einstein
4.7.2 Margenau
4.7.3 Ballentine
4.7.4 Recent developments
4.8 Complementarity and empiricist interpretation

5 The Einstein-Podolsky-Rosen problem
5.1 Introduction
5.2 Formulation of the EPR problem in terms of physical quantities
5.2.1 The EPR reasoning
5.2.2 Discussion of the EPR reasoning
5.3 Bohr’s answer to EPR
5.3.1 Criticisms of Bohr’s answer to EPR
5.4 Formulation of the EPR problem in terms of state vectors
5.4.1 The EPR reasoning in terms of state vectors
5.4.2 Discussion of the state vector approach to EPR
5.4.3 Modified EPR experiments

6 Individual-particle and ensemble interpretations of quantum mechanics
6.1 Introduction
6.2 Probabilistic and statistical interpretations of quantum statistics
6.2.1 The ‘statistical’ interpretation
6.2.2 Individual-particle interpretation of pure states
6.2.3 Ensembles in the Copenhagen interpretation
6.3 Problems of an individual-particle interpretation
6.3.1 Spreading of the wave packet
6.3.2 Entangled states, and individual-particle interpretation
6.3.3 Entanglement, and (in)homogeneity of quantum ensembles
6.3.4 Disentanglement by means of projection?
6.4 To explain, or not to explain
6.4.1 Minimal interpretation
6.4.2 Explanation by means of observables?
6.4.3 Explanation by means of subensembles?
6.4.4 Explanation by means of projection?
6.5 The EPR festival of confusions
6.6 Modal interpretations
6.6.1 Copenhagen variant of the modal interpretation
6.6.2 Anti-Copenhagen variant of the modal interpretation

7 Generalized quantum mechanics
7.1 Introduction
7.2 Inefficient photon detection
7.2.1 POVM of an inefficient photon detection process
7.2.2 Comments on the nonideality relation
7.3 Quantum mechanical description of a double-slit experiment
7.3.1 Instationary approach
7.3.2 Stationary approach
7.4 Homodyne optical detection
7.5 Nonideal polarization measurement of a photon
7.6 Theory of nonideal measurement
7.6.1 Examples
7.6.2 Definition of a nonideal measurement
7.6.3 Invertibility
7.6.4 An alternative definition of nonideal measurement
7.6.5 Nonideal measurement of a standard observable
7.6.6 Linear dependence of elements of a NODI; coarsening and refinement of POVMs
7.6.7 Equivalence of POVMs
7.7 Partial ordering of nonideal measurements
7.7.1 Partial ordering of equivalence classes
7.7.2 Maximal and minimal POVMs
7.7.3 Maximality and completeness of observables
7.8 Measures of nonideality or inaccuracy
7.9 Joint nonideal measurement of two observables
7.9.1 Definition and examples
7.9.2 Measurement of a PVM as a joint nonideal measurement of incompatible observables
7.9.3 Wigner measures
7.9.4 From standard measurements to complete measurements
7.10 Complementarity in a joint nonideal measurement of two incompatible standard observables
7.10.1 Examples
7.10.2 Martens inequality
7.10.3 Discussion of the Martens inequality
7.10.4 Examples of generalized von Neumann projection
7.10.5 Complete measurements, and the Copenhagen interpretation

8 Applications of generalized quantum mechanics
8.1 The Arthurs-Kelly model
8.1.1 The Arthurs-Kelly model as a joint nonideal measurement of position and momentum
8.1.2 Wigner measure of the Arthurs-Kelly model
8.2 Neutron interferometry
8.2.1 Introduction
8.2.2 Interference and path observables
8.2.3 Joint nonideal measurement of interference and path observables
8.2.4 Complete neutron interference measurements
8.2.5 Analogy with optical interference
8.2.6 Absorber fluctuations
8.2.7 Accuracy of the interference measurement
8.3 Stern-Gerlach experiments
8.3.1 Introduction
8.3.2 The Stern-Gerlach experiment as a nonideal measurement of
8.3.3 Stern-Gerlach as a joint measurement of incompatible observables
8.4 Quantum optical experiments
8.4.1 Photon detection in the output ports of a partially reflecting mirror
8.4.2 Homodyne optical detection
8.4.3 ‘Four-port’ and ‘eight-port’ homodyne detection
8.4.4 Quantum tomography
8.5 Atomic beam interference experiments
8.5.1 Introduction
8.5.2 The Ramsey experiment
8.5.3 The Haroche-Ramsey experiment
8.5.4 The Davidovich-Haroche experiment
8.5.5 Informational aspects of the Davidovich-Haroche experiment
8.5.6 The Haroche-Ramsey experiment as a joint nonideal measurement of interference and path observables

9 The Bell inequality in quantum mechanics
9.1 Introduction
9.1.1 The EPR problem, standard and generalized
9.1.2 Bell’s inequality, and interpretations of quantum mechanics
9.1.3 Bell’s inequality and nonlocality
9.2 Derivation of the Bell inequality from the existence of a quadrivariate probability distribution
9.2.1 The BCHS inequality
9.2.2 Derivation of the Bell inequality from the BCHS inequality
9.2.3 Quadruples and joint probability distributions of measurement results
9.3 The Bell inequality in an empiricist interpretation; relation to joint nonideal measurement of incompatible observables
9.3.1 A generalized EPR-Bell experiment
9.3.2 Heisenberg measurement disturbance or nonlocal interaction?
9.3.3 Classical and quantum correlations
9.4 The Bell inequality in realist interpretations
9.4.1 Derivation from the ‘possessed values’ principle
9.4.2 The Bell inequality, and a contextualistic-realist interpretation
9.5 A Copenhagen-inspired empiricist approach
9.5.1 Stapp’s “nonlocality proof”
9.5.2 The possibility of additional assumptions
9.5.3 Additional assumptions induced by the Copenhagen interpretation
9.5.4 The assumption that identical individual preparations in different EPR-Bell experiments are possible
9.5.5 Discussion of the quantum mechanical non-reproducibility conjecture
9.6 Bell’s theorem without inequalities
9.6.1 Problem and derivation
9.6.2 Discussion of the Hardy-Stapp formulation

10 Subquantum or hidden-variables theories
10.1 Introduction
10.2 “No go” theorems
10.2.1 Von Neumann
10.2.2 Jauch and Piron
10.2.3 Kochen and Specker
10.3 Bohm’s hidden-variables theories
10.3.1 Introduction
10.3.2 Bohm’s causal theory
10.3.3 Bohm’s stochastic theory
10.3.4 Objections to Bohm’s theory
10.3.5 Concluding remarks
10.4 Quasi-objectivistic hidden-variables theories
10.4.1 Introduction
10.4.2 Quasi-objectivistic deterministic hidden-variables theories
10.4.3 Quasi-objectivistic stochastic theories
10.5 Quasi-objectivistic local hidden-variables theories, and Bell’s inequality
10.5.1 Why local hidden variables?
10.5.2 Bell’s theorem
10.5.3 The Clauser-Horne derivation
10.6 Non-quasi-objectivistic hidden-variables theories
10.6.1 Farewell to quasi-objectivism
10.6.2 Implications of non-quasi-objectivism
10.6.3 Contextual states for generalized measurements
10.6.4 Thermodynamic analogy

A Mathematical appendix
A.1 Position and momentum operators
A.2 Boson creation and annihilation operators
A.3 Some properties of boson creation and annihilation operators
A.4 Coherent states
A.5 Squeezed boson operators
A.6 Theorem on ordering of a positive operator and a projection operator
A.7 Theorem on overcomplete sets of vectors
A.8 Non-orthogonal (skew) projections
A.8.1 Non-orthogonal projections and bi-orthonormality
A.9 Direct product and tensor product of Hilbert spaces
A.10 Linear spaces of operators
A.11 Convexity
A.11.1 Convex functions
A.11.2 Quantum mechanical entropic inequalities
A.11.3 Convex subsets of a linear space
A.12 Measures
A.12.1 Measures on a set
A.12.2 Measures on a linear vector space; Gleason’s theorem
A.12.3 Operator-valued and functional-valued measures
A.12.4 Mathematical representation of (N)ODIs
A.13 Some properties of stochastic matrices

Bibliography
Index
Preface
“Concepts without percepts are empty; percepts without concepts are blind.” (I. Kant)
In this book old and new problems of the foundations of quantum mechanics are viewed from the new perspective provided by a generalization of the mathematical formalism of that theory encompassing so-called positive operator-valued measures. At its inception the standard formalism of quantum mechanics as developed by Dirac and von Neumann seemed to yield the natural mathematical framework for describing the microscopic world of atoms and subatomic objects. In particular, Hermitian operators seemed to be able to replace the phase space functions of classical mechanics as mathematical representations of physical quantities. For a large part of the twentieth century axiomatic systems were set up, based on this latter idea. Only in the second half of that century, starting with the pioneering work by Davies and Lewis, was it gradually realized that Hermitian operators constitute too narrow a framework to encompass all experiments possible in the atomic domain.
“The only justification for our concepts and system of concepts is that they serve to represent the complex of our experiences; beyond this they have no legitimacy.” (A. Einstein, The meaning of relativity)
The generalization of the mathematical framework meant here is often referred to as the ‘operational approach’. One objective of the present book is to demonstrate the crucial role the generalized formalism plays in fundamental issues as well as in practical applications, and to contribute to its development. The fundamental insight inherent in an acknowledgment of the necessity of the generalized formalism is that the interaction between microscopic object and measuring instrument should be duly taken into account when assessing the meaning of quantum mechanics. This insight might seem not to be new at all, since it was shared by the Copenhagen interpretation already at a very early stage. It should be realized, however, that this sharing is only a partial one because recognition of the necessity of a quantum mechanical description of the measurement interaction draws a dividing line between the Copenhagen and operational approaches. As will be seen from a thorough analysis of the Copenhagen principles of ‘correspondence’ and ‘complementarity’, these principles were tuned to the standard formalism, quantum mechanical measurement being thought not to be analyzable beyond a classical description of measurement phenomena.
“The argument is above all that the measuring instruments, if they are to serve as such, cannot properly be included in the domain of application of quantum mechanicsᵃ.” (N. Bohr, letter to Schrödinger, October 26, 1935)
ᵃ Quotations should be seen as characterizing their authors; the presented opinions are not necessarily shared by the author of this book.
Although the Copenhagen principles stressed the importance of the measurement arrangement, they missed the important point that transfer of information from a microscopic object to a measuring instrument is a microscopic process, to be described by quantum mechanics. It seems that in criticizing the Copenhagen interpretation Einstein was perfectly right when characterizing the professed unanalyzability of the measurement process as a “tranquilizing philosophy”. Now that Bohr’s dogmatic ban on applying quantum mechanics to quantum measurement has been ignored, it has become evident that such a paradigmatic ‘thought experiment’ as the double-slit experiment is even outside the domain of application of the standard formalism, requiring the generalized formalism for its description. It is hardly surprising, then, that a discussion of the ‘thought experiments’ remaining within the confines of the standard formalism has produced so many confusing and paradoxical conclusions. Clarifying these is a second objective of this book. In particular, the so-called “measurement problem” will be discussed at some length, revealing as a second shortcoming of the Copenhagen interpretation its neglect of a sufficient distinction between the physical procedures of ‘measurement’ and ‘preparation’.
“Physics is an attempt conceptually to grasp reality as it is thought independently of its being observed.” (A. Einstein, Autobiographical notes, in: “Albert Einstein: Philosopher-Scientist”, P.A. Schilpp, ed., The Library of Living Philosophers, 1949)
A third objective is to provide a critical assessment of interpretations of the quantum mechanical formalism. As is well known, Einstein’s main criticism of the Copenhagen interpretation was directed against its claim that quantum mechanics is a complete theory. However, what does ‘completeness’ mean? For Einstein and Bohr ‘completeness’ had very different meanings. For Einstein it was associated with a choice between an interpretation of the state vector either as a description of an individual object (‘completeness’) or as a description of an ensemble (‘incompleteness’). For Bohr the reason to claim ‘completeness’ had quite a different source, namely, the mutual exclusiveness of measurement arrangements of incompatible observables, causing momentum to be ill-defined in the context of a position measurement (and vice versa). This, once again, regards the interaction with a measuring instrument, thus connecting the ‘completeness’ question to the principles of ‘correspondence’ and ‘complementarity’.
“Quantum mechanics is not a theory about reality, it is a prescription for making the best possible predictions about the future if we have certain information about the past.” (G. ’t Hooft, Journ. Stat. Phys. 53, 323 (1988))
Physicists are generally reluctant to deal with interpretation, as they are used to associating it with metaphysical speculation. It should be realized, however, that without an interpretation the mathematical formalism of quantum mechanics would be just mathematics. In order to make it physics, an interpretation in the sense of a mapping of entities of the mathematical formalism into reality is indispensable. A physicist will have to specify the physical meaning of his formalism if he wants to be able to compare the results of his calculations with what happens in reality. Pragmatic approaches, in which such a meaning is not specified and the resulting vagueness is employed to circumvent problems, may appear to be successful in solving specific problems but will be detrimental in the end because they tend to prevent a consistent overall account. In clarifying a number of possible interpretations the generalization of the mathematical formalism referred to above will once again be illuminating by relaxing the urge to adhere to a particular interpretation suggested by the standard formalism.
“If one wants to clarify the meaning of the expression ‘position of the object’, for instance of an electron, one must provide certain experiments by means of which one can envisage the measurement of ‘the position of the electron’.” (W. Heisenberg, Zeitschr. f. Physik 43, 172 (1927))
With respect to the interpretation of the quantum mechanical formalism our main concern will not be the question of ‘completeness’ in the sense of whether the wave function describes the microscopic object either completely or incompletely; the question will rather be whether the wave function describes the microscopic object at all. This amounts to a choice between a realist interpretation and an empiricist one. In the first interpretation the mapping is into the microscopic world; Hermitian operators are thought to represent properties of microscopic objects, either in the objectivistic sense as Einstein would like to have it, or in the contextualistic one adhered to by Bohr. In an empiricist interpretation the mapping is into the (macroscopic) world of preparation and measurement devices, observables being seen as labels of measurement procedures; a description of the microscopic world is relegated to subquantum theories to be developed if need be. The difference between a realist and an empiricist interpretation of quantum mechanics might be characterized in terms of Plato’s allegory of the cave by asking whether the theory is either referring to the real (quantum) world, or just to the shadows on the wall actually observed by the cave dwellers. In an empiricist interpretation justice is done to the difference between the shadows and the real objects. The generalized formalism of quantum mechanics gives ample occasion to endorse the view that this theory does not yield Plato’s ideal description of reality but just describes the shadows projected onto our measuring instruments.
We should note here that interpretations cannot be proven. They can only be made plausible by showing that they satisfy reasonable criteria like absence of inconsistencies, both internally and with respect to experimental evidence. This implies the possibility of different interpretations, acceptable to different persons for different reasons. On the other hand, one interpretation may be more liable to paradoxical consequences than another. Our conclusion will be that virtually all the paradoxes plaguing quantum mechanics are caused by adhering to a realist interpretation, and can be circumvented by relaxing to an empiricist one. In particular, the extension of the domain of applicability of the theory by taking into account the generalized formalism is very helpful in reaching this conclusion.
“However the idea that quantum mechanics, our most fundamental physical theory, is exclusively even about the results of experiments would remain disappointing.” “To restrict quantum mechanics to be exclusively about piddling laboratory operations is to betray the great enterprise.” (J.S. Bell, Against “measurement”, in: Sixty Years of Uncertainty, A. Miller ed., Plenum, New York, 1990)
Is a choice for an empiricist interpretation of quantum mechanics a “betrayal of the great enterprise” as seems to be implied by John Bell’s assertion? It would be, if quantum mechanics were not only “our most fundamental physical theory” but also the most fundamental theory we will ever be able to think of, the “theory of everything”. It is not, if quantum mechanics is just a phenomenological theory describing certain phenomena occurring within its domain of application, but liable to be superseded by a still more advanced theory as new domains of experimentation are explored. Perhaps we can learn from history here. On earlier occasions physicists as well as philosophers have thought that they had reached the boundaries of knowledge, and were on the brink of “knowing the mind of God”. However, each time God turned out to be more sophisticated than man had ever dreamed of. Is there any reason to think that it will be different this time? Of course, we do not know. But it is far too early to assume that quantum mechanics gives all answers to our questions, the more so as the necessity of the generalization of the quantum mechanical formalism, referred to above, already demonstrates failure of the standard formalism to give a comprehensive account of our experience even at this moment.
“If we find the answer to that, it would be the ultimate triumph of human reason —for then we would know the mind of God.” (S.W. Hawking, A brief history of time, Bantam Books, 1988)
John Bell’s great enterprise seems to be inspired by the idea that the physicist’s task is to design a blueprint of reality. In view of the necessity of experimental testability, felt by many physicists to be a necessary condition to be satisfied by any scientific theory, the necessary interaction of object and measuring instrument seems to make such an endeavor self-defeating. As will be seen, even Einstein’s less exacting ideal that quantum mechanics yield a description of an objective reality, independent of any influence exerted by an observer (including his measuring instruments), cannot be upheld. It seems that on this count Bohr was right: knowledge about reality obtained from quantum mechanical measurements has only a contextual meaning. It is knowledge about a microscopic object as it is “colored” by its environment, of which the measuring instrument is an important component. Although the empiricist interpretation of quantum mechanics developed in this book is fundamentally different from the “orthodox” Copenhagen interpretation, its acknowledgement of the impossibility of ignoring the measurement arrangement might justify referring to it as a neo-Copenhagen interpretation.
“We are facing here the perennial question whether we physicists do not go beyond our competence when searching for philosophical truth. I believe we probably do.” (E.P. Wigner, Amer. Journ. of Phys. 31, 6 (1963))
It seems that at this moment realist interpretations of quantum mechanics are the more fashionable ones. Many-worlds interpretations, modal interpretations, Bohm’s interpretation and even the experimentalist’s classical way of speaking are working together to weave a picture of reality that is in agreement with the quantum mechanical formalism. Admittedly, it is not impossible that quantum mechanics describes some features of microscopic reality, analogously to the way the classical theory of rigid bodies describes some properties of billiard balls. However, billiard balls are not rigid bodies. They consist of atoms needing a different theory (viz. quantum mechanics) for their description. By the same token electrons are not wave functions, which in an empiricist interpretation just describe macroscopically observable features of a preparation procedure. It is not at all impossible that an attempt at understanding microscopic reality on the basis of a realist interpretation of quantum mechanics is comparable to an attempt at understanding the rigidity of a billiard ball on the basis of a model of densely packed rigid atoms. In particular, the widely discussed notions of ‘nonseparability’ and ‘nonlocality’ might very well turn out to be of this kind, attributing to microscopic reality properties of (macroscopic) phenomena observed in measuring instruments. The related problem of the Bell inequalities will be discussed in the last two chapters. Here, too, the generalized formalism of quantum mechanics will play its part in analyzing the problem of violation of the Bell inequalities, and in suggesting an alternative explanation to the widely accepted explanation of such a violation on the basis of ‘nonlocality’.
This book could not have been realized but for the inspiration and support on the part of students, friends and colleagues. First of all I would like to thank the Faculty of Applied Physics of Eindhoven University of Technology and the Department of Theoretical Physics for allowing me to pursue a subject seemingly removed so far from direct application. Hopefully, the insights obtained from a study of the generalized formalism will contribute to future developments of such advanced technologies as quantum computation and quantum communication which are nowadays contemplated as possibilities. In any case, a critical assessment of the way theory is interpreted in this field won’t hurt. On the other hand, it is my conviction that a technological environment can be very beneficial when studying interpretations of physical theories by stimulating a down-to-earth attitude. Awareness of practical realizability constantly reminds the philosopher of what it means to test a physical theory, thus stimulating an operationalist attitude and pushing him in the direction of an empiricist interpretation.
This book is an overview of research done in collaboration with many students. Of these I want to mention in particular Sander Santman and Peter Janssen who started off my interest in the foundations of quantum mechanics by being intrigued by the problem of joint measurement of position and momentum, thus establishing incompatibility of quantum mechanical observables as the driving force of my research. Over the years they have been succeeded by a large number of students of applied physics who did not have foundations as their main interest but were nevertheless sufficiently motivated to be engaged with it for some time. Many of them have made substantial contributions. Hans Martens should particularly be thanked here. His contributions evoke a suspicion that foundations might have looked different by now if he had been able to pursue his efforts in this field. I also want to thank Willy DeBaere for a fruitful collaboration, convincing me of the necessity to take into account subquantum theories in understanding quantum mechanics.
My special thanks are due to the participants of the Quantum club started by Jan Hilgevoord as an informal forum for discussion of the foundations of quantum mechanics more than 25 years ago. Like everyone else I started on a realist interpretation. In contrast to most participants I have decided that it will be more fruitful to view quantum mechanics in an empiricist way, thus more or less solving a conundrum, unresolved during more than 70 years of discussion, by cutting a Gordian knot. This book gives the reasons for such a decision. It is up to the reader to decide whether these reasons are convincing enough. Willem M. de Muynck Eindhoven June 2002
Chapter 1 Standard and generalized formalisms of quantum mechanics
1.1 Basic postulates of standard quantum mechanics
Let us start with an enumeration of the postulates (or axioms) of quantum mechanics, essentially to be found already in the very first textbooks of quantum mechanics, viz. those by Dirac [1] and by von Neumann [2]. The formalism based on these postulates will be referred to as the standard formalism. The corresponding mathematics is the mathematics of finite-dimensional or infinite-dimensional real or complex linear vector spaces (see footnote 1). We distinguish:

Quantum mechanical state

The basic idea of a quantum mechanical state is a unit vector in a Hilbert space $\mathcal{H}$ (see section 1.4 for a more general definition). This space can be infinite-dimensional. The state vector is denoted by $|\psi\rangle$. It should be normalized. Thus,

$$\langle\psi|\psi\rangle = 1. \qquad (1.1)$$

The rationale behind a representation of a quantum mechanical state by a vector in a linear (Hilbert) space is the superposition principle, stating that a normalized linear combination $|\psi\rangle = c_1|\psi_1\rangle + c_2|\psi_2\rangle$ of two (or more) state vectors is another possible state vector. In the superposition it is not necessary that $|\psi_1\rangle$ and $|\psi_2\rangle$ be orthogonal. A preparation of a quantum mechanical state will be referred to as a ‘quantum mechanical state preparation’ if it is necessary to be very precise.

Footnote 1: See e.g. Kreyszig [3]. In general our presentation will be a simplified one, using Dirac’s notation, and omitting most of the subtleties involved in infinite-dimensional spaces (see also van Eijndhoven and de Graaf [4]).

Quantum mechanical observable
In the standard formalism a quantum mechanical observable is represented by a Hermitian operator A on a Hilbert space containing the state vectors. In section 1.9 an extension of the standard formalism will be considered, in which a different (generalized) definition of a quantum mechanical observable will be given. The observables of the standard formalism are referred to as ‘standard observables’. The eigenvalues $a_m$ of A are the possible values the observable can have: the possible ‘individual measurement results’. The set of all values of an observable (the spectrum; see footnote 2) is sometimes referred to as the ‘value space’. When it is possible to do so without causing confusion (see, however, chapters 2 and 10) we shall often not distinguish between an observable and an operator (i.e. a mathematical quantity representing an observable). Analogously, a state vector will often be referred to as a state. However, it is not unwise to realize every now and then that a mathematical representation is not the same as the thing it is representing.

Quantum mechanical measurement
It is necessary to draw a clear distinction between an ‘individual quantum mechanical measurement’, having as individual measurement result an eigenvalue $a_m$ of observable A, and a ‘quantum mechanical measurement’ corresponding to a sequence of N individual quantum mechanical measurements, having (in the limit of very large N) a probability distribution $\{p_m\}$ as a quantum mechanical measurement result (see also (6.1)). Quantum mechanics is a statistical theory (see chapters 4 and 6 for a discussion of different interpretations), in general making assertions about probability distributions rather than predicting results of individual measurements.

Quantum mechanical preparation
We should also distinguish between an ‘individual quantum mechanical preparation’ and a ‘quantum mechanical preparation’, the quantum mechanical preparation referring only to sequences of individual preparations (cf. chapter 6). This allows for the possibility that a quantum mechanical preparation corresponding to a state vector $|\psi\rangle$ may be realized by individual quantum mechanical preparations yielding a statistical distribution of different measurement results $a_m$.

Note that the absence of a clear distinction between the notions of ‘individual quantum mechanical preparation’ and ‘quantum mechanical preparation’ within the mathematical formalism of quantum mechanics has been a main source of confusion. An acceptable way to circumvent this problem is to consider individual preparations corresponding to a state $|\psi\rangle$ to be identical in the sense that in a sequence of different realizations all macroscopic features of the preparation setup are identical, leaving open the possibility of microscopic differences not described by quantum mechanics.

Footnote 2: When dealing with infinite-dimensional Hilbert spaces a complication is the possibility of a continuous spectrum. In the notation we shall restrict ourselves mostly to observables with a discrete spectrum. Generalization to continuous spectra will always be possible, however.
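Expectation value

Let $p_m$ be the probability that we obtain measurement result $a_m$. Then the average value of A in a quantum mechanical measurement is given by the usual statistical expression

$$\langle A\rangle = \sum_m a_m p_m. \qquad (1.2)$$

This quantity is referred to as the expectation value. It is experimentally found in a sequence of identical (in the sense defined above) preparations by averaging the individual measurement results using the probabilities $p_m$ as relative weights. When the state is represented by state vector $|\psi\rangle$ the expectation value $\langle A\rangle$ is given by

$$\langle A\rangle = \langle\psi|A|\psi\rangle. \qquad (1.3)$$

If $|\psi\rangle$ is a linear superposition $|\psi\rangle = c_1|\psi_1\rangle + c_2|\psi_2\rangle$ the expectation value is given by

$$\langle A\rangle = |c_1|^2\langle\psi_1|A|\psi_1\rangle + |c_2|^2\langle\psi_2|A|\psi_2\rangle + c_1^* c_2\langle\psi_1|A|\psi_2\rangle + c_2^* c_1\langle\psi_2|A|\psi_1\rangle. \qquad (1.4)$$

The last two terms are often referred to as interference terms. They embody the typically quantum mechanical character expressed by the superposition principle. For this reason interference terms are especially important for the foundations of quantum mechanics.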
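The decomposition of the expectation value of a superposition into "diagonal" terms and interference terms can be checked numerically. The following is only a minimal sketch: the observable `A`, the states `psi1`, `psi2` and the coefficients are arbitrary illustrative choices, not taken from the text.

```python
import numpy as np

# Illustrative (hypothetical) data: a Hermitian observable and two
# normalized, non-orthogonal state vectors.
A = np.array([[1.0, 0.5],
              [0.5, -1.0]], dtype=complex)
psi1 = np.array([1.0, 0.0], dtype=complex)
psi2 = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)


def expectation(op, vec):
    """Return <vec|op|vec>, which is real for a Hermitian operator."""
    return np.vdot(vec, op @ vec).real


# Choose coefficients and rescale them so that the superposition is normalized.
c1, c2 = 1.0 + 0.0j, 1.0 + 1.0j
norm = np.linalg.norm(c1 * psi1 + c2 * psi2)
c1, c2 = c1 / norm, c2 / norm
psi = c1 * psi1 + c2 * psi2

direct = expectation(A, psi)
diagonal_terms = (abs(c1) ** 2 * expectation(A, psi1)
                  + abs(c2) ** 2 * expectation(A, psi2))
# The two cross terms are complex conjugates of each other (A is Hermitian),
# so their sum is twice the real part of one of them.
interference = 2 * (np.conj(c1) * c2 * np.vdot(psi1, A @ psi2)).real

print(direct, diagonal_terms + interference)
```

The two printed numbers should agree up to rounding, while `interference` itself is non-zero for this choice of coefficients.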
Note that the existence of the sum (1.2) is not self-evident because it hinges on the existence of the probabilities $p_m$. However, we shall follow here the usual practice of assuming that, by leaving unaltered the macroscopic characteristics of the measurement arrangements of the individual experiments, the microscopic circumstances are not changed in such a way that the existence of the probabilities is put in jeopardy (see also section 10.6.4).
Probability distribution

The probability $p_m$ of a discrete (eigen)value $a_m$ of observable A is given by (see footnote 3)

$$p_m = |\langle a_m|\psi\rangle|^2. \qquad (1.5)$$

Here $|a_m\rangle$ is the normalized eigenvector of A belonging to eigenvalue $a_m$. If A has more than one orthonormal eigenvector $|a_{m\alpha}\rangle$ belonging to the same eigenvalue $a_m$ then

$$p_m = \sum_\alpha |\langle a_{m\alpha}|\psi\rangle|^2. \qquad (1.6)$$

This expression can be written as

$$p_m = \langle\psi|E_m|\psi\rangle, \qquad (1.7)$$

in which operator $E_m$ is the orthogonal projection operator on the subspace spanned by the eigenvectors belonging to eigenvalue $a_m$. Since $\sum_m E_m = I$ it follows from the normalization (1.1) of state vector $|\psi\rangle$ that $\sum_m p_m = 1$. This implies that the quantum mechanical postulates assume that a quantum mechanical measurement is always successful, in the sense that an individual measurement does always yield an eigenvalue $a_m$.

In case of degeneracy the eigenvectors of A are not uniquely defined. It is possible to achieve a unique definition by choosing these vectors as the joint eigenvectors of A and one or more other operators, all operators mutually commuting. Such a mutually commuting set of operators is called a ‘complete set of commuting operators’. Its joint eigenvectors are mutually orthogonal. Projection operator $E_m$ can be given as the sum of the (one-dimensional) projections on each of the joint eigenvectors. In Dirac notation we denote $E_m = \sum_\alpha |a_{m\alpha}\rangle\langle a_{m\alpha}|$. If there is no degeneracy we have $E_m = |a_m\rangle\langle a_m|$. The relation between expectation value (1.2) and probabilities (1.7) is consistent with the spectral representation of operator A:

$$A = \sum_m a_m E_m, \qquad (1.8)$$

in which the subspace on which $E_m$ is projecting is many-dimensional in case of degeneracy.

Footnote 3: In case of a continuous spectrum (for instance, position observable Q) (1.5) is replaced by $|\langle q|\psi\rangle|^2 dq$, which is the probability that the measurement result is in interval $(q, q + dq)$. The quantity $|\langle q|\psi\rangle|^2$ is not a probability, but a probability density. The probability that the measurement result is exactly equal to $q$ is essentially zero.
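As an illustration of the role of the projection operators in case of degeneracy, the sketch below builds the spectral projectors of a small Hermitian matrix, computes the probabilities $\langle\psi|E_m|\psi\rangle$ and checks normalization and the spectral representation. The helper `spectral_projectors`, the matrix `A` and the state `psi` are hypothetical choices made for this example only; eigenvalues are grouped by rounding, which is an assumption adequate for well-separated spectra.

```python
import numpy as np


def spectral_projectors(op, decimals=9):
    """Return {a_m: E_m} for a Hermitian matrix, merging degenerate eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(op)
    projectors = {}
    for value, vec in zip(np.round(eigvals, decimals), eigvecs.T):
        E = projectors.setdefault(float(value), np.zeros(op.shape, dtype=complex))
        E += np.outer(vec, vec.conj())  # add the one-dimensional projection in place
    return projectors


# Hypothetical observable with eigenvalues 1, 1, 3, written in a rotated basis.
t = 0.3
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(t), -np.sin(t)],
              [0.0, np.sin(t), np.cos(t)]])
A = R @ np.diag([1.0, 1.0, 3.0]) @ R.T

psi = np.array([1.0, 2.0, 2.0], dtype=complex) / 3.0   # normalized state vector

E = spectral_projectors(A)
p = {a: np.vdot(psi, E_a @ psi).real for a, E_a in E.items()}

total_probability = sum(p.values())
mean_from_probabilities = sum(a * p_a for a, p_a in p.items())
mean_direct = np.vdot(psi, A @ psi).real
A_rebuilt = sum(a * E_a for a, E_a in E.items())
```

Here `E[1.0]` is a two-dimensional projection, and `mean_from_probabilities` coincides with `mean_direct`, in agreement with the consistency between the statistical average and the spectral representation.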
Standard deviation

Apart from the expectation value (1.3), in quantum mechanics a second quantity characterizing the probability distribution of measurement results of an observable plays an important role, viz. the standard deviation. This quantity is defined as

$$\Delta A = \sqrt{\langle A^2\rangle - \langle A\rangle^2}, \qquad (1.9)$$

in which expectation values are given by (1.3). It is a measure of the statistical spreading of the measurement results if observable A is measured in state $|\psi\rangle$. If $|\psi\rangle$ is an eigenvector of A, then $\Delta A = 0$. The standard deviation is a measure of the uncertainty to which predictions of the result of an individual measurement of observable A are subject.

In general, preparation and measurement are not carried out at the same time, but with a certain time delay $t - t_0$. This time dependence can be expressed in the formalism in two different ways:

Schrödinger picture

State vector $|\psi(t)\rangle$ varies with time. The time dependence is described by the Schrödinger equation

$$i\hbar\frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle, \qquad (1.10)$$

in which H is a Hermitian operator, the so-called Hamilton operator. The solution of (1.10) can formally be given as

$$|\psi(t)\rangle = U(t,t_0)|\psi(t_0)\rangle, \qquad (1.11)$$

in which $U(t,t_0)$ is a unitary operator, satisfying

$$i\hbar\frac{\partial}{\partial t}U(t,t_0) = H\,U(t,t_0), \qquad U(t_0,t_0) = I.$$

If H is independent of $t$ then

$$U(t,t_0) = e^{-iH(t-t_0)/\hbar}.$$

It is important to note that, even though quantum mechanics is a statistical theory, the Schrödinger equation is nevertheless a deterministic equation, in the sense that the solution at time $t_0$ uniquely determines the solution at all later times. Hence, although the individual measurement result of a measurement of observable A at time $t$ is not uniquely determined by the state vector $|\psi(t_0)\rangle$ of the quantum mechanical state at $t_0$, such a unique determination holds for the statistical distribution of the measurement results; the
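probability that a measurement of observable A at time $t$ yields measurement result $a_m$ is with certainty equal to

$$p_m(t) = \langle\psi(t)|E_m|\psi(t)\rangle.$$

The expectation value $\langle A\rangle(t)$ determined by a measurement of observable A at time $t$ is given by

$$\langle A\rangle(t) = \langle\psi(t)|A|\psi(t)\rangle, \qquad (1.14)$$

in which A is the same (time independent) operator as in (1.3).

Heisenberg picture

Using (1.11) it is possible to write (1.14) according to

$$\langle A\rangle(t) = \langle\psi(t_0)|A(t)|\psi(t_0)\rangle, \qquad (1.15)$$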
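in which

$$A(t) = U^\dagger(t,t_0)\,A\,U(t,t_0). \qquad (1.16)$$

In the Heisenberg picture expectation value (1.14) is represented in terms of a time independent state vector $|\psi(t_0)\rangle$ and a time dependent observable $A(t)$. The time dependence of $A(t)$ is described by the Heisenberg equation:

$$i\hbar\frac{dA(t)}{dt} = -[H, A(t)]_-, \qquad (1.17)$$

in which $[H, A(t)]_- = HA(t) - A(t)H$ is the commutator of operators H and $A(t)$. Operator $A(t)$ is independent of $t$ if $[H, A]_- = O$. Then A is a constant of the motion. If H is independent of $t$ the solution of (1.17) can formally be written as

$$A(t) = e^{iH(t-t_0)/\hbar}\,A\,e^{-iH(t-t_0)/\hbar}. \qquad (1.18)$$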
By
we get
In the early stages of the development of quantum mechanics the correspondence principle (cf. section 4.3) has served well to find quantum mechanical operators H and A from the classical expressions of classical Hamiltonians and other classical
physical quantities. However, it follows from the difference between the properties of the physical quantities of classical and quantum mechanics that this method has a restricted validity (see also section 1.10). In particular, the possibility of incompatibility of quantum mechanical observables is relevant here. Two standard observables are called compatible if and only if the corresponding operators A and B commute, i.e. $[A, B]_- = O$. The most fundamental example of incompatible observables is given by position and momentum observables Q and P, satisfying the canonical commutation relation (see footnote 4)

$$[Q, P]_- = i\hbar I. \qquad (1.19)$$
Some basic properties of the position and momentum operators are summarized in appendix A.1. Observables satisfying the canonical commutation relation are often called canonically conjugated or complementary.
1.2 Some elements of quantum field theory
Apart from time dependence, observables may also have position dependence. Such observables figure in field theories like quantum electrodynamics, in which, for instance, the energy density of the field is represented by a position and time dependent Hermitian operator $H(\mathbf{r},t)$. The energy at time $t$ within a region $V$ of three-dimensional space is then represented by the operator $\int_V d\mathbf{r}\,H(\mathbf{r},t)$. More generally, it is possible to attribute in a similar way Hermitian operators to regions of four-dimensional space-time, thus creating the possibility of making the theory relativistic. Restricting ourselves to a free electromagnetic field as an example, it is seen that the basic postulates of quantum field theory can be formulated in a way that is not essentially different from the one formulated above. Restricting ourselves to a polarized free field, in the quantum field theoretic description the electric field operator is represented as a sum over modes,

$$E(\mathbf{r},t) = i\sum_k \sqrt{\frac{\hbar\omega_k}{2\epsilon_0}}\left[a_k u_k(\mathbf{r})e^{-i\omega_k t} - a_k^\dagger u_k^*(\mathbf{r})e^{i\omega_k t}\right], \qquad (1.20)$$

in which $\omega_k$ is the eigenfrequency of mode $k$, and $u_k(\mathbf{r})$ is its spatial configuration (e.g. [5, 6, 7]). The functions $u_k(\mathbf{r})$ constitute a complete orthonormal set. The operators $a_k$ are annihilation operators (cf. appendix A.2), usually interpreted as annihilating a photon in mode $k$ (see also section 2.4.4). Together with their Hermitian adjoints, the creation operators $a_k^\dagger$, they satisfy the commutation relations

$$[a_k, a_{k'}^\dagger]_- = \delta_{kk'} I, \qquad [a_k, a_{k'}]_- = [a_k^\dagger, a_{k'}^\dagger]_- = O. \qquad (1.21)$$
Footnote 4: We shall often put $\hbar = 1$.
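In terms of the operators $a_k$ and $a_k^\dagger$ the Hamiltonian operator of the electromagnetic field is given by

$$H = \sum_k \hbar\omega_k\left(N_k + \tfrac{1}{2}\right). \qquad (1.22)$$

Here $N_k = a_k^\dagger a_k$ is the number operator (cf. (A.15)) of mode $k$. Defining operators $Q_k$ and $P_k$ according to

$$Q_k = \sqrt{\frac{\hbar}{2\omega_k}}\,(a_k + a_k^\dagger), \qquad P_k = -i\sqrt{\frac{\hbar\omega_k}{2}}\,(a_k - a_k^\dagger),$$

it follows from (1.21) that $Q_k$ and $P_k$ satisfy the canonical commutation relation (1.19) for position and momentum operators, operators corresponding to different values of $k$ being commutative. The Hamiltonian (1.22) can be expressed in terms of $Q_k$ and $P_k$ as

$$H = \frac{1}{2}\sum_k\left(P_k^2 + \omega_k^2 Q_k^2\right).$$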
From this expression it follows that the electromagnetic field can be considered as a set of non-interacting harmonic oscillators with position observables $Q_k$ and momentum observables $P_k$. We denote by $|\{q_k\}\rangle$ the joint eigenvectors of the commuting operators $Q_k$ for eigenvalues $q_k$. Restricting ourselves to one single mode $k$, i.e. one single (one-dimensional) harmonic oscillator, the (orthonormal) eigenvectors of the number operator $N_k$ are the number states $|n_k\rangle$, $n_k = 0, 1, 2, \ldots$ (cf. appendix A.2). These vectors span an infinite-dimensional Hilbert space $\mathcal{H}_k$. Usually they are interpreted as representing states comprising precisely $n_k$ photons in the relevant mode. For the complete field the joint eigenvectors of the number operators of the modes can be written as direct products $|n_1, n_2, \ldots\rangle = |n_1\rangle|n_2\rangle\cdots$. They are orthonormal:

$$\langle n_1, n_2, \ldots|n'_1, n'_2, \ldots\rangle = \prod_k \delta_{n_k n'_k}.$$

The tensor product Hilbert space spanned by these vectors is the so-called Fock space. Their projection operators satisfy

$$\sum_{n_1, n_2, \ldots} |n_1, n_2, \ldots\rangle\langle n_1, n_2, \ldots| = I.$$

The eigenvector $|0, 0, \ldots\rangle$ with eigenvalue $n_k = 0$ for all $k$ is the vacuum state. It satisfies $a_k|0, 0, \ldots\rangle = 0$ for each mode $k$.
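The single-mode harmonic oscillator structure can be explored with finite matrices. The sketch below is an illustration under stated assumptions: it truncates the number basis at `dim` states, uses arbitrary values for `hbar` and `omega`, and adopts the common unit-mass convention $Q = \sqrt{\hbar/2\omega}\,(a + a^\dagger)$, $P = -i\sqrt{\hbar\omega/2}\,(a - a^\dagger)$. Because of the truncation, the canonical commutation relation and the oscillator form of the Hamiltonian hold only on the subspace below the highest retained number state.

```python
import numpy as np

hbar, omega, dim = 1.0, 2.0, 6   # illustrative values; dim is the truncation size

# Annihilation operator in the number basis: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)
a_dag = a.conj().T
N = a_dag @ a                                   # number operator, diag(0, 1, ..., dim-1)

Q = np.sqrt(hbar / (2 * omega)) * (a + a_dag)   # position-like observable
P = -1j * np.sqrt(hbar * omega / 2) * (a - a_dag)   # momentum-like observable

comm = Q @ P - P @ Q                            # should equal i*hbar*I away from the cutoff
H = 0.5 * (P @ P + omega ** 2 * (Q @ Q))        # oscillator Hamiltonian in terms of Q and P

vacuum = np.zeros(dim, dtype=complex)
vacuum[0] = 1.0                                 # number state |0>

print(np.round(np.diag(comm), 12))
print(np.round(np.diag(H).real, 12))
```

The last diagonal element of `comm` deviates from $i\hbar$; this is an artifact of the finite truncation and not a property of the infinite-dimensional operators.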
1.3 Simultaneous and joint measurement of compatible observables

Two standard observables A and B are compatible if their commutator $[A, B]_- = O$. Let A have spectral representation (1.8), and let, analogously, B be given by

$$B = \sum_n b_n F_n.$$

According to the standard theory compatible observables are simultaneously measurable. Thus, in case of compatibility of A and B we have

$$[E_m, F_n]_- = O. \qquad (1.25)$$

Since $E_m$ and $F_n$ are orthogonal projection operators, it follows from (1.25) that operators $E_m F_n$ are orthogonal projection operators, too. Moreover,

$$\sum_{mn} E_m F_n = I.$$

Projection operator $E_m F_n$ projects onto the subspace spanned by the joint eigenvectors of A and B corresponding to eigenvalues $a_m$ and $b_n$, respectively. Analogously to (1.7),

$$p_{mn} = \langle\psi|E_m F_n|\psi\rangle \qquad (1.26)$$

is the probability of finding the pair $(a_m, b_n)$ in a simultaneous measurement of A and B.

The bivariate probability distribution $\{p_{mn}\}$ has a “classical” relation to the probability distributions of A and B separately:

$$\sum_n p_{mn} = \langle\psi|E_m|\psi\rangle = p_m, \qquad \sum_m p_{mn} = \langle\psi|F_n|\psi\rangle = p_n, \qquad (1.27)$$

i.e. the probability distributions of A and B are the marginals of the bivariate distribution $\{p_{mn}\}$. Conditional probabilities are also defined in the usual way. Thus,

$$p(b_n|a_m) = \frac{p_{mn}}{p_m}$$

is the probability of $b_n$ given that the A measurement yielded value $a_m$.

In principle a simultaneous measurement of two standard observables can be considered as a measurement of one single observable C with spectral representation

$$C = \sum_{mn} c_{mn} E_m F_n.$$
If $[A, B]_- \neq O$ then such an observable C cannot be found.
In the discussion given above, compatibility of A and B is the only important issue, and not so much the simultaneity of the measurements. The bivariate probability distribution (1.26) remains well-defined if A and B are two time dependent Heisenberg observables (cf. (1.16)) corresponding to different instants $t_1$ and $t_2$, provided that $[A(t_1), B(t_2)]_- = O$. What is of importance is that the measurement results $a_m$ and $b_n$ correspond to one and the same individual preparation, i.e. that they are measured on the same individual object. In order to emphasize this it is advisable to refer to such measurements as joint measurements rather than simultaneous ones. If A and B are compatible constants of the motion, then a joint measurement at different times yields the same individual measurement results as a simultaneous measurement.
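The bivariate distribution of two compatible observables, its marginals and the conditional probabilities can be computed directly from products of spectral projectors. The following sketch uses two commuting operators on a four-dimensional space; the operators `A`, `B` and the state `psi` are illustrative assumptions. Because both operators have eigenvalues $\pm 1$, their projectors can be written as $(I \pm A)/2$, a shortcut that does not hold for general spectra.

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
X = np.array([[0.0, 1.0], [1.0, 0.0]])

# Two commuting (compatible) observables on a four-dimensional Hilbert space.
A = np.kron(Z, I2)
B = np.kron(I2, X)

# For operators with eigenvalues +1 and -1 the spectral projectors are (I +/- op)/2.
E = {a: (np.eye(4) + a * A) / 2 for a in (+1, -1)}
F = {b: (np.eye(4) + b * B) / 2 for b in (+1, -1)}

psi = np.array([0.5, 0.5j, -0.1, 0.7], dtype=complex)
psi /= np.linalg.norm(psi)


def prob(projector, vec):
    """Probability <vec|projector|vec> for a normalized vector."""
    return np.vdot(vec, projector @ vec).real


p_joint = {(a, b): prob(E[a] @ F[b], psi) for a in E for b in F}
p_A = {a: prob(E[a], psi) for a in E}
p_B = {b: prob(F[b], psi) for b in F}
# Conditional probability of b given that the A measurement yielded a.
p_B_given_A = {(b, a): p_joint[(a, b)] / p_A[a] for a in E for b in F}
```

Summing `p_joint` over one index reproduces `p_A` or `p_B`, which is the "classical" marginal relation referred to above.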
1.3.1 Postulate of local commutativity

Let $\Omega_1$ and $\Omega_2$ be two regions of four-dimensional space-time. If $\Omega_2$ is (partially or completely) situated in the forward light cone of $\Omega_1$ then it is possible that the physical situation in $\Omega_2$ is influenced by what has happened in $\Omega_1$. In that case $\Omega_1$ and $\Omega_2$ are causally connected (cf. figure 1.1a, in which for the velocity of light $c = 1$ is taken). When $\Omega_1$ and $\Omega_2$ are outside each other’s forward light cones we call them ‘causally disjoint’ (figure 1.1b). According to the theory of relativity, events in causally disjoint regions cannot influence each other. In relativistic quantum mechanics this would have to hold for measurements which are carried out in causally disjoint regions, for instance, for measurements performed simultaneously in non-overlapping regions of three-dimensional space. In standard quantum mechanics mutual non-influencing of measurements finds its expression in compatibility of observables measured jointly. For this reason the requirement that measurements in causally disjoint regions do not disturb each other is expressed by the postulate of local commutativity, stating that local observables which are measured in causally disjoint regions of space-time should commute (e.g.
Emch [8]). At this moment all empirical data seem to corroborate this postulate. For this reason it may be considered a principle of quantum theory. Let operators A and B be local observables measured in causally disjoint regions. Then relations (1.27) precisely express what we mean by saying that measurements do not disturb each other; the marginal probability distributions $\{p_m\}$ and $\{p_n\}$ are equal to the probability distributions obtained if only A or B is measured, respectively. Evidently, the quantum mechanical measurement result of A is independent of the fact that B is also measured, and vice versa. If, instead of B, another observable $B' = \sum_{n'} b'_{n'} F'_{n'}$, incompatible with B, would be measured jointly with A, then, analogously to (1.26) and (1.27), we would get

$$p'_{mn'} = \langle\psi|E_m F'_{n'}|\psi\rangle$$

and

$$\sum_{n'} p'_{mn'} = \langle\psi|E_m|\psi\rangle = p_m.$$

Hence, probability distribution $\{p_m\}$ of A is independent of whether either B or $B'$ is measured jointly with A.
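The independence of the marginal distribution of A from the choice of the jointly measured observable can be illustrated with two qubits. In this sketch (all names and values are illustrative assumptions) observable A acts on the first qubit, while either B or an observable B' that is incompatible with B acts on the second; the state is deliberately entangled, so the joint distributions differ although the marginals of A coincide.

```python
import numpy as np

Z = np.diag([1.0, -1.0])
X = np.array([[0.0, 1.0], [1.0, 0.0]])

# Entangled two-qubit state cos(theta)|00> + sin(theta)|11> (illustrative choice).
theta = 0.4
psi = np.zeros(4, dtype=complex)
psi[0], psi[3] = np.cos(theta), np.sin(theta)


def projectors(op):
    """Spectral projectors of a 2x2 operator with eigenvalues +1 and -1."""
    return {s: (np.eye(2) + s * op) / 2 for s in (+1, -1)}


def joint_and_marginal(A_op, B_op):
    """Joint distribution of A (qubit 1) and B (qubit 2), and the marginal of A."""
    E, F = projectors(A_op), projectors(B_op)
    joint = {(a, b): np.vdot(psi, np.kron(E[a], F[b]) @ psi).real
             for a in E for b in F}
    marginal = {a: joint[(a, +1)] + joint[(a, -1)] for a in E}
    return joint, marginal


joint_B, marginal_with_B = joint_and_marginal(Z, Z)         # B  = Z on qubit 2
joint_Bp, marginal_with_Bp = joint_and_marginal(Z, X)       # B' = X on qubit 2
```

The two marginals agree, whereas `joint_B` and `joint_Bp` are different distributions; B and B' themselves do not commute.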
Incidentally, it should be noted that it is possible that measurements of compatible observables do not disturb each other even though the operators do not belong to causally disjoint regions of space-time. For instance, in case of coupled systems (cf. section 1.5) two arbitrary observables $A_1$ and $B_2$ of the two different subsystems are compatible, and, hence, satisfy (1.27). This remains valid even if the two systems interact with each other, and, hence, are in the same region. Here compatibility is a consequence of the fact that the measuring instrument for $A_1$ is supposed only to interact with system 1, and the measuring instrument for $B_2$ only with system 2. Of course, such measurement interactions would have to be very specific.

In the theory of generalized observables, to be introduced in section 1.9, joint measurability of observables (in the sense that a measurement procedure exists yielding a bivariate probability distribution having the probability distributions of the two observables as marginals) will be referred to as commeasurability. It will be demonstrated that for standard observables commutativity of the Hermitian operators is a necessary condition for this to be the case (cf. section 1.9.2).

The postulate of local commutativity should be distinguished from the general requirement of relativistic causality, implying that influences do not travel faster than the velocity of light. As a matter of fact, the postulate of local commutativity applies to joint measurements of local observables. Strictly speaking, it does not apply to the relation between a local preparation and a local measurement, which should satisfy an additional postulate (called ‘primitive causality’ by Haag and Schroer [9]) to satisfy relativistic causality. Hence, local commutativity is not sufficient to warrant that quantum mechanics be completely consistent with relativity theory (see also de Muynck [10]).
A controversy over the consistency of the postulate of ‘primitive causality’ with a requirement of positivity of the spectrum of the Hamiltonian H [11] should not be left unmentioned here. This controversy will not be discussed, however, because it does not seem to regard directly observable effects of a perceptible size (Ruijsenaars [12]). Relativistic causality can be assumed for all practical purposes. Nevertheless, it was demonstrated by de Muynck [10] that the formalism of quantum electrodynamics accommodates a certain nonlocality that can give rise to restrictions on the preparation of initial states, and may have indirect observational consequences (like e.g. the Lamb shift). Such restrictions are not uncommon in physics. They do exist also in classical electrodynamics and general relativity theory, as well as in non-relativistic theories like classical rigid body mechanics and thermodynamics (cf. section 10.6.4). An answer to the question of whether such nonlocalities are fundamentally ingrained in reality, or whether they just represent a restriction of the domain of application of the theory to such nonlocal events, hinges on one’s views with respect to the issue of the completeness of the theory. We shall be concerned with this issue in a large part of this book.

The question of relativistic causality plays an important role also in the so-called EPR problem, to be discussed in chapters 5, 9 and 10. It is important to note, however, that the EPR problem is not directly related to the above-mentioned controversy because, as will be seen, it is not a matter of any nonlocality of the mathematical formalism of quantum mechanics but a consequence of certain interpretations of that formalism (the formalism itself being assumed to be relativistically causal).
1.4 Mixtures

1.4.1 Density operators

In actual practice state preparations seldom satisfy the condition of section 1.1 that the prepared state corresponds to a state vector $|\psi\rangle$. In general the preparing instrument fluctuates in such a way that successive preparations may correspond to different states. Let $p_i$ be the relative frequency with which state $|\psi_i\rangle$ is prepared (hence, $\sum_i p_i = 1$). Then probability (1.5) that on measurement of observable A measurement result $a_m$ will be found, should be replaced by

$$p_m = \sum_i p_i |\langle a_m|\psi_i\rangle|^2.$$

Using Dirac notation $|\psi_i\rangle\langle\psi_i|$ for the projection operator on vector $|\psi_i\rangle$ this can be written as

$$p_m = \langle a_m|\rho|a_m\rangle, \qquad \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|. \qquad (1.32)$$

Here $\rho$ is a Hermitian operator on Hilbert space $\mathcal{H}$, the density operator or statistical operator, describing a mixture of the states $|\psi_i\rangle$. Note that in (1.32) the states $|\psi_i\rangle$ need not be mutually orthogonal. A state represented by a vector $|\psi\rangle$ can be represented by a density operator too, viz.

$$\rho = |\psi\rangle\langle\psi|. \qquad (1.33)$$

Such a state is called a pure state. It corresponds to a density operator that is a (one-dimensional) projection operator. Necessary and sufficient for $\rho$ being a pure state is that

$$\rho^2 = \rho. \qquad (1.34)$$

Both for pure states and mixtures we have

$$\mathrm{Tr}\,\rho = 1. \qquad (1.35)$$

In general a quantum mechanical preparation is not represented by a Hilbert space vector but by a non-negative operator satisfying (1.35). In general $\rho^2 \neq \rho$.
Although states $|\psi_i\rangle$ in (1.32) need not be mutually orthogonal, for every density operator it is possible to find a representation of the form (1.32) in which the states are orthogonal. Indeed, since $\rho$ is a trace-class operator [13], it has a discrete spectrum. Hence, each density operator can be represented according to

$$\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|, \qquad 0 \le p_i \le 1, \quad \sum_i p_i = 1, \qquad (1.36)$$

with $|\phi_i\rangle$ eigenvectors of $\rho$ corresponding to eigenvalues $p_i$. The boundaries of the set of eigenvalues follow from the fact that the operator satisfies $O \le \rho \le I$. As a consequence it is possible to interpret these eigenvalues as probabilities. The requirement $\sum_i p_i = 1$ is equivalent to condition (1.35). The difference between pure states and mixtures can be characterized using the concept of von Neumann entropy $S(\rho) = -\mathrm{Tr}\,\rho\ln\rho$ (1.37). It follows from (1.36) that

$$S(\rho) = -\sum_i p_i \ln p_i.$$

Hence, for every density operator $\rho$ we have

$$S(\rho) \ge 0.$$

Equality holds if and only if all eigenvalues of $\rho$ are either 0 or 1, i.e. in case of a pure state.
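A small numerical sketch may help to see how a mixture of non-orthogonal states is turned into its orthogonal (spectral) representation, and how purity and von Neumann entropy distinguish it from a pure state. The states, weights and helper names below are illustrative assumptions; the entropy is computed from the eigenvalues, discarding numerically vanishing ones (using the convention $0\ln 0 = 0$).

```python
import numpy as np


def ket(*amplitudes):
    """Return a normalized state vector built from the given amplitudes."""
    v = np.array(amplitudes, dtype=complex)
    return v / np.linalg.norm(v)


def von_neumann_entropy(rho, cutoff=1e-12):
    """Entropy -Tr(rho ln rho), evaluated from the eigenvalues of rho."""
    eigvals = np.linalg.eigvalsh(rho)
    eigvals = eigvals[eigvals > cutoff]          # convention: 0 ln 0 = 0
    return float(-np.sum(eigvals * np.log(eigvals)))


# Mixture of two non-orthogonal pure states with relative frequencies 1/4 and 3/4.
psi_a, psi_b = ket(1, 0), ket(1, 1)
weights = [0.25, 0.75]
rho = sum(p * np.outer(v, v.conj()) for p, v in zip(weights, [psi_a, psi_b]))

purity = np.trace(rho @ rho).real                # equals 1 only for a pure state
eigvals, eigvecs = np.linalg.eigh(rho)           # orthogonal representation of rho
rho_rebuilt = sum(p * np.outer(v, v.conj()) for p, v in zip(eigvals, eigvecs.T))
entropy = von_neumann_entropy(rho)
entropy_pure = von_neumann_entropy(np.outer(psi_a, psi_a.conj()))
```

The eigenvalues of `rho` differ from the preparation weights 1/4 and 3/4, which shows that the spectral representation is a different expansion of the same density operator.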
1.4.2 Basic postulates for mixtures
It is possible to formulate quantum mechanics in terms of density operators, pure states just corresponding to special cases satisfying (1.34). The postulates of section 1.1 should then be generalized in the following way:

State and preparation

The state of a quantum mechanical system is represented by a density operator $\rho$, i.e. a non-negative operator with eigenvalues between 0 and 1, satisfying (1.35):

$$O \le \rho \le I, \qquad \mathrm{Tr}\,\rho = 1. \qquad (1.38)$$

Also in the case of mixtures the terms ‘state’ and ‘density operator’ will often be used interchangeably. An interesting question is to what extent the representation (1.36) of a mixture can be interpreted in the sense that a quantum mechanical preparation of a mixture amounts to a mixture of preparations of pure states, each corresponding to a macroscopically different measurement arrangement (see section 6.2.3).

Observable and measurement

In the standard formalism these concepts remain unchanged. Hence, a quantum mechanical observable is represented by a Hermitian operator A (however, see section 1.9), and an individual measurement result corresponds to an eigenvalue $a_m$.

Expectation value

The expectation value $\langle A\rangle$ of A is given by

$$\langle A\rangle = \mathrm{Tr}\,\rho A. \qquad (1.39)$$

With (1.32) this follows directly from $\langle\psi_i|A|\psi_i\rangle = \mathrm{Tr}\,|\psi_i\rangle\langle\psi_i|A$.

Probability distribution

The probability $p_m$ of (eigen)value $a_m$ of observable A is given by (compare (1.7))

$$p_m = \mathrm{Tr}\,\rho E_m. \qquad (1.40)$$

Sometimes it is convenient to calculate $p_m$ using its Fourier transform:

$$\mathrm{Tr}\,\rho\,e^{i\xi A} = \sum_m p_m e^{i\xi a_m}.$$

Statistical spreading of measurement results

Much-used measures of the statistical spreading of measurement results are:
Standard deviation:

$$\Delta A = \sqrt{\mathrm{Tr}\,\rho A^2 - (\mathrm{Tr}\,\rho A)^2},$$

directly generalizing (1.9);

Shannon entropy:

$$H_{\{p_m\}} = -\sum_m p_m \ln p_m. \qquad (1.43)$$

These quantities vanish if $p_m = \delta_{m m_0}$ (i.e. if the probability distribution is concentrated in one single point of the spectrum); the Shannon entropy has a maximum (ln N, if N is the number of points of the spectrum) if the probability distribution is uniform (i.e. $p_m = 1/N$).
Evolution equations

Schrödinger picture

$$i\hbar\frac{d\rho(t)}{dt} = [H, \rho(t)]_-. \qquad (1.44)$$

Using representation (1.32) this equation follows directly from (1.10). Because of its similarity with the Liouville equation of classical statistical mechanics it is often referred to as the Liouville-von Neumann equation (see also section 1.10.3). Analogously to (1.18) we obtain for time independent H

$$\rho(t) = e^{-iH(t-t_0)/\hbar}\,\rho(t_0)\,e^{iH(t-t_0)/\hbar}.$$

Heisenberg picture

Since the observables of section 1.1 have remained the same, equation (1.17) is unchanged too. Thus,

$$i\hbar\frac{dA(t)}{dt} = -[H, A(t)]_-. \qquad (1.46)$$
Comparing (1.46) with (1.44) a remarkable similarity is seen. However, there is also a clear difference: the signs of the right-hand sides of the equalities are different. This difference is related to the fact that a density operator, although it is a Hermitian operator, does not have the physical significance of an observable. The similarity is a consequence of the physical equivalence of the two pictures. Time dependence of expectation value $\langle A\rangle(t)$ can be accommodated at will in the density operator or in the observable (superscripts S and H indicate that operators should be considered in Schrödinger or Heisenberg picture, respectively):

$$\langle A\rangle(t) = \mathrm{Tr}\,\rho^S(t)A^S = \mathrm{Tr}\,\rho^H A^H(t).$$

Here $\rho^H = \rho^S(t_0)$ and $A^S = A^H(t_0)$.
Unlike the Shannon entropy (1.43), the von Neumann entropy (1.37) is constant under unitary evolution.
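The equivalence of the two pictures and the different behaviour of the two entropies under unitary evolution can be checked on a two-level system. This is a sketch under assumptions: the Hamiltonian `H`, observable `A`, initial state `rho0` and time `t` are arbitrary illustrative choices, $\hbar$ is set to 1, and the evolution operator is built from the eigendecomposition of a time independent Hamiltonian.

```python
import numpy as np

hbar = 1.0
H = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)   # hypothetical Hamiltonian
A = np.diag([1.0, -1.0]).astype(complex)                # observable, diagonal in the basis used
rho0 = np.diag([0.9, 0.1]).astype(complex)              # initial mixture


def evolution_operator(hamiltonian, t):
    """U(t, 0) = exp(-i H t / hbar) for a time independent Hermitian H."""
    energies, vecs = np.linalg.eigh(hamiltonian)
    return vecs @ np.diag(np.exp(-1j * energies * t / hbar)) @ vecs.conj().T


def entropy(probabilities, cutoff=1e-12):
    """-sum p ln p over the non-vanishing probabilities."""
    p = np.asarray(probabilities, dtype=float)
    p = p[p > cutoff]
    return float(-np.sum(p * np.log(p)))


t = 0.4
U = evolution_operator(H, t)
rho_t = U @ rho0 @ U.conj().T          # Schrödinger picture: the state evolves
A_t = U.conj().T @ A @ U               # Heisenberg picture: the observable evolves

expectation_S = np.trace(rho_t @ A).real
expectation_H = np.trace(rho0 @ A_t).real

# von Neumann entropy (from eigenvalues) before and after the evolution.
vn_before = entropy(np.linalg.eigvalsh(rho0))
vn_after = entropy(np.linalg.eigvalsh(rho_t))
# Shannon entropy of the measurement distribution of A (diagonal of rho here).
shannon_before = entropy(np.diag(rho0).real)
shannon_after = entropy(np.diag(rho_t).real)
```

For this example the two expectation values coincide and the von Neumann entropy is unchanged, whereas the Shannon entropy of the distribution of measurement results of `A` increases.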
1.4.3 Density operators as vectors in a linear space

Expressions (1.39) and (1.40) have the form of Hilbert-Schmidt inner product (A.82). Using (1.36) it follows that

$$\mathrm{Tr}\,\rho^2 = \sum_i p_i^2 \le 1.$$

Hence, $\rho$ can be considered as a vector in Hilbert-Schmidt space. Since density operators are non-negative operators, satisfying (1.38), they constitute only a subset of this space. Since a convex combination of two density operators $\rho_a$ and $\rho_b$ is another density operator, this subset is convex (cf. appendix A.11.3). Thus, $\rho = \lambda\rho_a + (1-\lambda)\rho_b$, $0 \le \lambda \le 1$, represents a possible quantum mechanical state that is a mixture of the two states represented by $\rho_a$ and $\rho_b$ with relative weights $\lambda$ and $1-\lambda$, respectively. The convexity of the set of density operators reflects the convexity of the set of probability measures on a linear space (cf. appendix A.12.2). From appendix A.11.3 it directly follows that the extreme elements of the convex set of density operators are the projection operators on the one-dimensional subspaces of Hilbert space $\mathcal{H}$. Hence, the extreme elements correspond to the pure states (1.33). Expansion (1.32) is a convex combination (A.96) of extreme elements. By the spectral representation (1.36) of $\rho$ we see that every density operator can be written as a convex combination of extreme elements. It is important to note that in general this expansion is not unique (cf. appendix A.12.2).
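The non-uniqueness of the expansion in extreme elements is easily demonstrated for a two-dimensional Hilbert space. In the sketch below (helper names and numerical values are illustrative assumptions) the same density operator is obtained from two different equal-weight mixtures of pure states, and a convex combination of two pure states is verified to be a density operator again.

```python
import numpy as np


def projector(vec):
    """One-dimensional projection operator on the (normalized) vector."""
    vec = np.asarray(vec, dtype=complex)
    vec = vec / np.linalg.norm(vec)
    return np.outer(vec, vec.conj())


def is_density_operator(rho, tol=1e-12):
    """Check Hermiticity, non-negativity and unit trace."""
    if not np.allclose(rho, rho.conj().T):
        return False
    eigvals = np.linalg.eigvalsh(rho)
    return bool(eigvals.min() > -tol and abs(np.trace(rho).real - 1.0) < tol)


# Two different convex combinations of extreme elements ...
rho_z = 0.5 * projector([1, 0]) + 0.5 * projector([0, 1])
rho_x = 0.5 * projector([1, 1]) + 0.5 * projector([1, -1])
# ... and a convex combination of two non-orthogonal pure states.
lam = 0.3
rho_mix = lam * projector([1, 0]) + (1 - lam) * projector([1, 1])
```

Both `rho_z` and `rho_x` equal half the unit operator, so knowledge of the density operator alone does not reveal which pure states were mixed.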
1.5 Coupled systems
1.5.1 Subsystems

It is often necessary to study systems that are interacting with each other, or have interacted in the past. Consider a system consisting of subsystems 1 and 2, for instance an electron and a proton in a hydrogen atom. Apart from terms separately referring to each of the subsystems, the Hamilton operator H also contains interaction terms influencing the states of both subsystems. The state vector $|\Psi\rangle$ is a vector in tensor product space $\mathcal{H}_1 \otimes \mathcal{H}_2$ (cf. appendix A.9) of Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ of electron and proton, respectively. This state can be developed according to (A.79), using a complete orthonormal set of vectors $|\phi_m\rangle \otimes |\chi_n\rangle$ obtained by taking the direct product of complete orthonormal sets $\{|\phi_m\rangle\}$ in $\mathcal{H}_1$ and $\{|\chi_n\rangle\}$ in $\mathcal{H}_2$. Omitting tensor product notation if there is no danger of ambiguity, an arbitrary state vector can be written according to

$$|\Psi\rangle = \sum_{mn} c_{mn}|\phi_m\rangle|\chi_n\rangle, \qquad \sum_{mn}|c_{mn}|^2 = 1. \qquad (1.49)$$

A state of the general form (1.49) with $c_{mn} \neq a_m b_n$ (such that $|\Psi\rangle \neq |\psi_1\rangle|\psi_2\rangle$) is called an ‘entangled state’. The existence of entangled states is a direct consequence of the linear character of the Schrödinger equation, and the superposition principle. Hence, it is an essential feature of quantum mechanics.

What is the state of subsystem 1 if $|\Psi\rangle$ is the state vector of the coupled system? In order to define it we need the density operator introduced in section 1.4.1. Consider all possible standard observables of subsystem 1. These are represented by the Hermitian operators $A_1 \otimes I_2$, in which $A_1$ corresponds to an arbitrary observable of system 1, and $I_2$ is the unit operator on Hilbert space $\mathcal{H}_2$. The expectation values of these observables are given by

$$\langle\Psi|A_1 \otimes I_2|\Psi\rangle = \mathrm{Tr}_1\left[\left(\mathrm{Tr}_2|\Psi\rangle\langle\Psi|\right)A_1\right]. \qquad (1.50)$$

It can easily be verified that operator $\mathrm{Tr}_2|\Psi\rangle\langle\Psi|$ is a density operator, too, referring only to subsystem 1. It is denoted by

$$\rho_1 = \mathrm{Tr}_2|\Psi\rangle\langle\Psi|. \qquad (1.51)$$

Since $\langle A_1\rangle = \mathrm{Tr}_1\,\rho_1 A_1$, density operator $\rho_1$ evidently contains all information on state $|\Psi\rangle$ as far as subsystem 1 is concerned (disregarding all information on subsystem 2). For this reason this density operator must represent the state of subsystem 1. By taking the partial trace of density operator $|\Psi\rangle\langle\Psi|$ of the combined system over $\mathcal{H}_2$ we actually average out all information on subsystem 2. Only the information on subsystem 1 is preserved. Analogously, the density operator of subsystem 2 is given by $\rho_2 = \mathrm{Tr}_1|\Psi\rangle\langle\Psi|$. Using (1.49), density operator $\rho_1$ of subsystem 1 can easily be found as

$$\rho_1 = \sum_n |\tilde\psi_n\rangle\langle\tilde\psi_n|, \qquad |\tilde\psi_n\rangle = \sum_m c_{mn}|\phi_m\rangle.$$
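In general, vectors $|\tilde\psi_n\rangle$ are not normalized since

$$\langle\tilde\psi_n|\tilde\psi_n\rangle = \sum_m |c_{mn}|^2.$$

Putting

$$p_n = \sum_m |c_{mn}|^2, \qquad |\psi_n\rangle = p_n^{-1/2}|\tilde\psi_n\rangle,$$

it follows that

$$\rho_1 = \sum_n p_n|\psi_n\rangle\langle\psi_n|. \qquad (1.53)$$

Note that, since the normed vectors $|\psi_n\rangle$ need not be mutually orthogonal, in general (1.53) is not the spectral decomposition (1.36) of density operator $\rho_1$. Analogously, we find

$$\rho_2 = \sum_m q_m|\xi_m\rangle\langle\xi_m|, \qquad q_m = \sum_n |c_{mn}|^2, \quad |\xi_m\rangle = q_m^{-1/2}\sum_n c_{mn}|\chi_n\rangle,$$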
in which vectors are normalized. It is also noted here that, although the reduced density operators and are independent of the choice of bases and this does not hold for vectors and Choosing different bases and , we get
We then find
If vectors to a vector
are linearly independent, it follows that no vector
can be equal
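The partial trace described above can be illustrated numerically. The following is a minimal sketch, assuming a two-level system for each subsystem and storing the expansion coefficients of the state vector as a nested list `c[i][j]`; the function names are illustrative and do not come from the text.

```python
# Sketch: reduced density operator of subsystem 1 obtained from the expansion
# coefficients c[i][j] of the state vector in a product basis.
# All names here are illustrative assumptions, not notation from the book.

def reduced_density_1(c):
    """rho_1[i][k] = sum_j c[i][j] * conj(c[k][j]), i.e. the trace over subsystem 2."""
    n, m = len(c), len(c[0])
    return [[sum(c[i][j] * c[k][j].conjugate() for j in range(m))
             for k in range(n)]
            for i in range(n)]

def trace(rho):
    return sum(rho[i][i] for i in range(len(rho))).real

def purity(rho):
    """Tr rho^2; it equals 1 only if rho is a one-dimensional projection."""
    n = len(rho)
    return sum((rho[i][k] * rho[k][i]).real for i in range(n) for k in range(n))

s = 2 ** -0.5
product = [[1 + 0j, 0j], [0j, 0j]]        # a single product term
entangled = [[s + 0j, 0j], [0j, s + 0j]]  # two product terms with equal weights

rho_product = reduced_density_1(product)
rho_entangled = reduced_density_1(entangled)
print(purity(rho_product), purity(rho_entangled))
```

For the product state the reduced density operator is pure, whereas for the entangled state it is a mixture, in line with the remark that partial tracing averages out the information on subsystem 2.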
1.5.2 Correlation observables

It is possible to carry out a joint measurement of observable $A$ of system 1 and observable $B$ of system 2. Since these observables refer to different systems, they are compatible, and the corresponding operators commute. Therefore the theory of section 1.3 is applicable. Analogously to (1.26), the joint probability distribution of values $a_m$ of subsystem 1 and $b_n$ of subsystem 2 is given by

$$p(a_m, b_n) = \langle\Psi|\, E_m \otimes F_n \,|\Psi\rangle.$$

Here it is explicitly displayed that $E_m$ and $F_n$ are operators in different Hilbert spaces. Joint probability distribution $p(a_m, b_n)$ represents the statistical correlation of observables $A$ and $B$ of the two subsystems. We shall refer to observables of the type $\{E_m \otimes F_n\}$ as correlation observables. The spectral representations $\{E_m\}$ and $\{F_n\}$ of $A$ and $B$ satisfy $\sum_m E_m = I_1$ and $\sum_n F_n = I_2$, respectively. Hence, the marginal probability distributions $p(a_m) = \sum_n p(a_m, b_n) = \mathrm{Tr}_1\, \rho_1 E_m$ and $p(b_n) = \sum_m p(a_m, b_n) = \mathrm{Tr}_2\, \rho_2 F_n$ are expressions of the form (1.50) for observables of subsystem 1 and 2, respectively. From this we see that, for instance, expression $p(a_m)$ can refer both to a separate measurement of $A$ on subsystem 1, as well as to a simultaneous measurement of $A$ and $B$ while ignoring the measurement results obtained for observable $B$.

By the partial tracing (1.51) the information on subsystem 2 as well as the information on possible correlations of the observables of the two subsystems gets lost. This latter information is contained only in the joint probability distribution $p(a_m, b_n)$, not in the marginal distributions $p(a_m)$ and $p(b_n)$.

Partial tracing yields the density operators of the subsystems also if the total system is not represented by state vector (1.49) but by a density operator $\rho$, thus $\rho_1 = \mathrm{Tr}_2\, \rho$ and $\rho_2 = \mathrm{Tr}_1\, \rho$. The subsystems are uncorrelated if and only if

$$\rho = \rho_1 \otimes \rho_2. \tag{1.57}$$
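Correlation can be quantified by means of the von Neumann entropy (1.37), here denoted by $S(\rho) = -\mathrm{Tr}\, \rho \ln \rho$, which satisfies inequality (cf. appendix A.11.2)

$$S(\rho) \le S(\rho_1) + S(\rho_2).$$

If $\rho$ corresponds to a pure state this inequality is trivially satisfied because then $S(\rho) = 0$. Equality holds if and only if (1.57) is satisfied. The difference $S(\rho_1) + S(\rho_2) - S(\rho)$ is called mixing entropy, which is always non-negative. It is easily verified that

$$S(\rho_1) + S(\rho_2) - S(\rho) = \mathrm{Tr}\, \rho\, [\ln \rho - \ln(\rho_1 \otimes \rho_2)].$$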
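The relation between joint and marginal distributions, and the mixing entropy of a pure two-particle state, can be checked with a small numerical sketch. It assumes two-level subsystems and takes the measured observables to have the product-basis vectors as eigenvectors; the closed-form eigenvalues hold only for 2x2 Hermitian matrices, and all names are illustrative assumptions.

```python
import math

def reduced(c, subsystem):
    """Reduced density matrix of subsystem 1 or 2 for coefficients c[i][j]."""
    n, m = len(c), len(c[0])
    if subsystem == 1:
        return [[sum(c[i][j] * c[k][j].conjugate() for j in range(m))
                 for k in range(n)] for i in range(n)]
    return [[sum(c[i][j] * c[i][l].conjugate() for i in range(n))
             for l in range(m)] for j in range(m)]

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    t = (rho[0][0] + rho[1][1]).real
    d = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    r = math.sqrt(max(t * t - 4.0 * d, 0.0))
    return [(t - r) / 2.0, (t + r) / 2.0]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 1e-15)

def mixing_entropy_pure(c):
    """Entropy of rho_1 plus entropy of rho_2; the pure total state contributes 0."""
    s1 = entropy(eigenvalues_2x2(reduced(c, 1)))
    s2 = entropy(eigenvalues_2x2(reduced(c, 2)))
    return s1 + s2

c = [[0.6, 0.0], [0.0, 0.8]]                        # correlated pure state
joint = [[abs(x) ** 2 for x in row] for row in c]   # joint probabilities
marg1 = [sum(row) for row in joint]                 # marginal of subsystem 1
marg2 = [sum(joint[i][j] for i in range(2)) for j in range(2)]
print(joint, marg1, marg2, mixing_entropy_pure(c))
```

The marginals alone do not reveal that the outcomes are perfectly correlated; that information sits only in the joint distribution, and the mixing entropy is strictly positive for this state.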
1.5.3 Polar decomposition

If orthonormal sets $\{|\psi_i\rangle\}$ and $\{|\chi_j\rangle\}$ are arbitrary, then in general in expansion (1.49) all combinations of $i$ and $j$ can be present. Schmidt [14] has demonstrated that it is always possible to choose special orthonormal sets $\{|\alpha_i\rangle\}$ in $\mathcal{H}_1$ and $\{|\beta_i\rangle\}$ in $\mathcal{H}_2$ such that

$$|\Psi\rangle = \sum_i c_i\, |\alpha_i\rangle|\beta_i\rangle, \tag{1.59}$$

i.e. there are only diagonal terms. Expansion (1.59) is called the polar decomposition of $|\Psi\rangle$. The polar decomposition is important for certain applications of coupled systems in quantum mechanics (compare sections 6.3.2 and 6.6.1).
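In order to prove (1.59) use is made of the density operators $\rho_1$ and $\rho_2$ of subsystems 1 and 2. Let operator $\rho_1$ have spectral representation (cf. (1.36))

$$\rho_1 = \sum_i p_i\, |\alpha_i\rangle\langle\alpha_i|. \tag{1.60}$$

Taking in the expansion of $|\Psi\rangle$ for $\{|\psi_i\rangle\}$ the special orthonormal set $\{|\alpha_i\rangle\}$ we can write (1.49) as

$$|\Psi\rangle = \sum_i |\alpha_i\rangle |\tilde\beta_i\rangle, \qquad |\tilde\beta_i\rangle = \sum_j c_{ij}|\chi_j\rangle. \tag{1.61}$$

It now turns out that the set of vectors $\{|\tilde\beta_i\rangle\}$ is a special one too. To see this we determine $\rho_1$, using (1.61). This yields

$$\rho_1 = \mathrm{Tr}_2\, |\Psi\rangle\langle\Psi| = \sum_{ii'} \langle\tilde\beta_{i'}|\tilde\beta_i\rangle\, |\alpha_i\rangle\langle\alpha_{i'}|.$$

By comparison with (1.60) it immediately follows that

$$\langle\tilde\beta_{i'}|\tilde\beta_i\rangle = p_i\, \delta_{ii'}.$$

From this we get

$$|\beta_i\rangle = p_i^{-1/2}\, |\tilde\beta_i\rangle \ (p_i \neq 0), \qquad \langle\beta_{i'}|\beta_i\rangle = \delta_{ii'}.$$

Evidently, the states $|\beta_i\rangle$ constitute an (in general not complete) orthonormal set of vectors in $\mathcal{H}_2$. Insertion in (1.61) yields

$$|\Psi\rangle = \sum_{i,\; p_i \neq 0} \sqrt{p_i}\, |\alpha_i\rangle|\beta_i\rangle, \tag{1.64}$$

in which the summation over $i$ is restricted to those values of $i$ for which $p_i \neq 0$. Subsequently calculating $\rho_2$ from (1.64),

$$\rho_2 = \mathrm{Tr}_1\, |\Psi\rangle\langle\Psi| = \sum_i p_i\, |\beta_i\rangle\langle\beta_i|,$$

we see that vectors $|\beta_i\rangle$ are precisely the eigenvectors of $\rho_2$ corresponding to its non-vanishing eigenvalues (we, moreover, see that the non-vanishing eigenvalues of $\rho_2$ are precisely equal to those of $\rho_1$). Hence

$$\rho_2 |\beta_i\rangle = p_i\, |\beta_i\rangle.$$

Accordingly, vector $|\Psi\rangle$ can be written as

$$|\Psi\rangle = \sum_i \sqrt{p_i}\, |\alpha_i\rangle|\beta_i\rangle, \tag{1.66}$$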
where the restriction of the summation to the values of $i$ with $p_i \neq 0$ is now automatically realized. Expression (1.66) is the polar decomposition of $|\Psi\rangle$. Hence, the special representations in (1.59) precisely correspond to the eigenvectors of $\rho_1$ and $\rho_2$ belonging to the (non-vanishing) identical eigenvalues $p_i$ (eigenvectors belonging to eigenvalue 0 of $\rho_1$ or $\rho_2$ do not play any role in the decomposition).

If the spectra of $\rho_1$ and $\rho_2$ are degenerate, then the eigenvectors $|\alpha_i\rangle$ and $|\beta_i\rangle$ are not uniquely determined. For instance, maintaining $\rho_1$, it is possible to make another choice for its eigenvectors $|\alpha_i\rangle$. The polar decomposition (1.66) becomes different then. Without proof it is mentioned here that not only are the non-vanishing eigenvalues of $\rho_1$ and $\rho_2$ identical, but that they even have equal multiplicities, i.e. a non-vanishing eigenvalue $p_i$ has an equal number of eigenvectors of $\rho_1$ and $\rho_2$. This already follows from the fact that the polar decomposition is possible only on the basis of the one-to-one correspondence of eigenvectors $|\alpha_i\rangle$ and $|\beta_i\rangle$ expressed by (1.66). The possible non-uniqueness of the polar decomposition plays a role in certain interpretations of quantum mechanics (cf. section 6.3.2).
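The statement that the two reduced density operators share their non-vanishing eigenvalues can be checked numerically. The sketch below assumes two-level subsystems and an arbitrarily chosen real coefficient matrix; the helper names are illustrative, and the eigenvalue formula is specific to 2x2 Hermitian matrices.

```python
import math

def rho_1(c):
    """Trace over subsystem 2 of the pure state with coefficients c[i][j]."""
    return [[sum(c[i][j] * c[k][j].conjugate() for j in range(2))
             for k in range(2)] for i in range(2)]

def rho_2(c):
    """Trace over subsystem 1 of the same state."""
    return [[sum(c[i][j] * c[i][l].conjugate() for i in range(2))
             for l in range(2)] for j in range(2)]

def eigenvalues_2x2(rho):
    t = (rho[0][0] + rho[1][1]).real
    d = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    r = math.sqrt(max(t * t - 4.0 * d, 0.0))
    return [(t - r) / 2.0, (t + r) / 2.0]

# Normalized coefficients with non-diagonal terms (hypothetical example values).
c = [[0.8, 0.0], [0.36, 0.48]]

ev1 = eigenvalues_2x2(rho_1(c))
ev2 = eigenvalues_2x2(rho_2(c))
# Coefficients of the diagonal (polar) expansion are the square roots.
polar_coefficients = [math.sqrt(p) for p in ev1]
print(ev1, ev2, polar_coefficients)
```

Although the two reduced matrices differ entry by entry, their eigenvalues coincide, so the expansion with only diagonal terms uses one common set of weights.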
1.5.4 Entangled states

States of the form (1.49) or (1.59) play an important role in the discussion on the meaning of the quantum mechanical formalism. The nomenclature, referring to such states as entangled states, is sometimes meant to exhibit that subsystems 1 and 2 are so closely interrelated that it would be impossible to consider the subsystems as separate particles (so-called nonseparability, compare section 6.3.2). The underlying idea is that the particles would be separable if the state of the two-particle system were represented by a density operator of the form

$$\rho = \sum_i p_i\, |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|, \tag{1.67}$$

i.e. a density operator that is obtained from state vector (1.66) by omitting the non-diagonal terms. A more general non-entangled state is given by

$$\rho = \sum_i p_i\, \rho_{1i} \otimes \rho_{2i}, \tag{1.68}$$

in which $\rho_{1i}$ and $\rho_{2i}$ are arbitrary density operators of particles 1 and 2, respectively.

The discussion of the difference between states (1.66) and (1.67) is closely related to the interpretation of the quantum mechanical state vector, and is postponed till section 6.3.2. Here we restrict ourselves to the remark that, since every pure state of a two-particle system has a polar decomposition, all of these states are also entangled, unless the series (1.66) consists of one single term (in which case the two particles are statistically independent). Hence, entanglement is just expressing statistical
correlation between the subsystems if the total system is in a pure state. Unlike the correlation in a mixture as described by density operator (1.67) or (1.68), the correlation described by (1.66) is often experienced as problematic. Consequently, an important fraction of the literature on the foundations of quantum mechanics is devoted to a discussion of entangled states (see, for instance, the Einstein-Podolsky-Rosen problem, to be discussed in chapter 5, as well as the so-called “measurement problem” (chapter 3)). These problems illustrate the typically quantum mechanical (non-classical) character of these correlations, which, for this reason, are referred to as quantum correlations. The question of how to physically understand the difference between classical and quantum correlations is a central one also in this book.
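The difference between the correlation in an entangled pure state and in the mixture obtained by omitting the non-diagonal terms can be made concrete with a small sketch. It assumes two-level subsystems, a polar expansion with two equal coefficients in the standard basis, and measurement bases supplied as lists of vectors; all names are illustrative assumptions.

```python
def joint_pure(c, u, v):
    """Joint probabilities for the pure state sum_i c[i] |i>|i> when
    subsystem 1 is measured in basis u and subsystem 2 in basis v."""
    return [[abs(sum(c[i] * ua[i].conjugate() * vb[i].conjugate()
                     for i in range(len(c)))) ** 2
             for vb in v] for ua in u]

def joint_mixture(c, u, v):
    """Same measurement on the mixture sum_i |c[i]|^2 |i><i| (x) |i><i|,
    i.e. the pure state with its non-diagonal terms omitted."""
    return [[sum(abs(c[i]) ** 2 * abs(ua[i]) ** 2 * abs(vb[i]) ** 2
                 for i in range(len(c)))
             for vb in v] for ua in u]

s = 2 ** -0.5
c = [s, s]
standard = [[1.0, 0.0], [0.0, 1.0]]   # basis of the polar expansion
rotated = [[s, s], [s, -s]]           # an incompatible basis

pure_std = joint_pure(c, standard, standard)
mix_std = joint_mixture(c, standard, standard)
pure_rot = joint_pure(c, rotated, rotated)
mix_rot = joint_mixture(c, rotated, rotated)
print(pure_std, mix_std, pure_rot, mix_rot)
```

In the basis of the polar expansion both states give the same perfectly correlated distribution. In the rotated basis the pure state is still perfectly correlated, whereas the mixture yields a uniform, uncorrelated distribution.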
1.6 Projection or reduction postulate

Von Neumann [2] distinguishes two kinds of time evolution:

(a) Time evolution of an isolated object, i.e. an object that is not interacting with any other object, in particular not with a measuring instrument (it is allowed that another object exerts forces upon the system by means of external potentials, but these potentials are not treated dynamically but parametrically); these time evolutions are described by (1.10), (1.17) or (1.44).

(b) Time evolution of an object during a measurement of an observable.

The reason that the existence of two different time evolutions is thought to be necessary is that a measurement seems to cause the state vector of the object to change discontinuously, whereas a process described by the Schrödinger equation corresponds to a continuous change. Thus, let state vector $|\psi\rangle$ be given by

$$|\psi\rangle = \sum_m c_m\, |a_m\rangle, \tag{1.69}$$

in which vectors $|a_m\rangle$ are the (orthonormal) eigenvectors of the measured observable A (the spectrum of which is assumed to be non-degenerate here). Let us assume that the measurement result is $a_m$. According to von Neumann this means that after the measurement observable A must have this value with certainty, and that an immediately repeated measurement of A yields this same value with certainty (also Dirac [1]). This latter result is possible only if preceding the second measurement the state vector was not (1.69) but the vector $|a_m\rangle$. This seems to imply that the first measurement must have caused the discontinuous transition

$$|\psi\rangle \to |a_m\rangle. \tag{1.70}$$
The transition can be continuous only if the initial state is an eigenvector $|a_m\rangle$; in this case the state vector even remains unchanged. The process described by (1.70) is often referred to as von Neumann projection. The postulate stating that a quantum mechanical measurement causes the transition (1.70) is called the strong projection or reduction postulate⁵. In agreement with this postulate a quantum mechanical measurement of observable $A$ is sometimes considered as a kind of filter, splitting state vector $|\psi\rangle$ into components $c_m|a_m\rangle$ and allowing only the component corresponding to measurement result $a_m$ to pass (actually, the final state has to be normalized, thus yielding $E_m|\psi\rangle / \|E_m|\psi\rangle\|$ as the final state vector). For one-dimensional projectors $E_m = |a_m\rangle\langle a_m|$ this precisely yields $|a_m\rangle$.

⁵ Transition (1.70) has the property of idempotency, and, therefore, is a projection. The term ‘reduction’ refers to the reduction of the uncertainty with respect to the value of observable A, expressed by the transition. The notion of projection was first mentioned by Heisenberg [15].

As a reason for the projection postulate von Neumann [2] refers to the Compton effect. A photon is scattered by a particle, supposed to be initially in a momentum eigenstate corresponding to $\mathbf{p}_0$. Let the initial momentum of the photon be $\mathbf{k}_0$. Due to momentum conservation we find for the final momentum $\mathbf{k}$ of the photon that $\mathbf{k} = \mathbf{k}_0 + \mathbf{p}_0 - \mathbf{p}$, where $\mathbf{p}$ is the final momentum of the particle. Starting from the initial situation given above there are different possible final situations, depending on the angle at which the photon is scattered. The final state of the system photon+particle is a linear superposition of the possible states corresponding to the pairs $(\mathbf{p}, \mathbf{k})$, thus

$$|\Psi\rangle = \sum_{\mathbf{p}} c_{\mathbf{p}}\, |\mathbf{p}\rangle\, |\mathbf{k}_0 + \mathbf{p}_0 - \mathbf{p}\rangle. \tag{1.71}$$

However, by measuring the final momentum $\mathbf{p}$ of the particle we know for certain that the corresponding photon has momentum $\mathbf{k} = \mathbf{k}_0 + \mathbf{p}_0 - \mathbf{p}$. The idea is that the state of the photon must then also be given by the vector $|\mathbf{k}\rangle$. This would mean that the measurement of the momentum of the particle has changed state (1.71) in a discontinuous way.

Yet, the projection postulate is controversial (see, for instance, [16, 17]). We shall see in the following that von Neumann’s reasoning is not entirely cogent. That it is impossible that (1.70) is generally valid can immediately be seen by considering observables, like position and momentum, having a continuous spectrum. Such observables do not even have normalizable eigenvectors; hence, it is impossible that the final state of the measurement process is an eigenvector of such an observable. Moreover, there exist measurement procedures (like the ideal photon detector, cf. section 3.2.4) that cannot possibly satisfy the projection postulate (1.70). Therefore, a conclusion of this book will be that the projection postulate, at least in the form presented above, must be relegated to the realm of quantum mechanical folklore. Since, however, the postulate has played (and is still playing) an important role in the discussion of the foundations of quantum mechanics, it is necessary to duly pay attention to it (see also section 3.2.6). Here we briefly mention a number of objections to the projection postulate, advanced in the past.

A first objection is that the projection postulate unjustifiably attributes a special significance to measurement. Measurement seems to be just an ordinary physical interaction between object and measuring instrument. Therefore, it is incomprehensible why the usual rules of quantum mechanics would not be applicable, in the sense that, when interacting with a measuring instrument, the object would not be described by means of the continuous time evolution of a solution of a Schrödinger equation, but, instead, would undergo a discontinuous transition. Von Neumann [2] has already recognized this problem, and has tried to demonstrate consistency of his projection postulate with a quantum mechanical description of the interaction of object and measuring instrument. This problem, known as the (conventional) “measurement problem”, will be discussed within a larger context in chapter 3.

A second controversy with respect to the projection postulate stems from the question of when a quantum mechanical measurement is completed ((1.70) seems to make sense only in that case). As long as it is not uniquely ascertained that the measurement result is $a_m$ there exists, in agreement with (1.5), a probability $p_m = |c_m|^2$ for each of the measurement results $a_m$. For this reason (see also section 3.1) it is often assumed that the measurement procedure does not realize (1.70), but the transition

$$\rho = |\psi\rangle\langle\psi| \;\to\; \rho_f = \sum_m p_m E_m, \qquad p_m = |c_m|^2, \tag{1.72}$$

where $\rho$ is the density operator (1.33) of initial state $|\psi\rangle$ and $E_m$ is the one-dimensional projection operator onto the eigenvector of A corresponding to $a_m$. Hence, the measurement seems to transform pure state $|\psi\rangle$ into a mixture (compare (1.32)) of eigenstates of A. Transition (1.72) is in agreement with the so-called weak projection postulate. The suggestion raised by (1.72) is that the final state of the measurement process is one of the states $|a_m\rangle$, but that it is (still) unknown which one. Only the final observation would select the “actual” state (cf. section 4.6.6), thus yielding (1.70) as the ultimate result. When the initial state is not a pure state but a mixture, then (1.72) is generalized to

$$\rho \;\to\; \rho_f = \sum_m p_m E_m, \qquad p_m = \mathrm{Tr}\, \rho E_m. \tag{1.73}$$

This picture is in agreement with the idea of a von Neumann ensemble, to be discussed in section 6.2.3, which, however, has to be rejected as being problematic. Moreover, even though it is possible to think of measurement procedures at least approximately satisfying (1.73) (cf. section 3.2), it will turn out that realistic experiments in general do not satisfy the weak projection postulate. For this reason it
does not make sense to assume this postulate as a property that quantum mechanical measurements should have (see also section 6.6.2).

In the past the projection postulate in the form (1.73) has been criticized because of its restriction to non-degenerate spectra (e.g. Furry [18]), and the fact that the expression for $\rho_f$ in (1.73) does not seem to be applicable to more-dimensional projection operators because in case of degeneracy the eigenvalue of the observable does not determine the state vector in a unique way. For this reason Lüders [19] has proposed to generalize the weak projection postulate in the following way:

$$\rho \;\to\; \rho_f = \sum_m E_m\, \rho\, E_m, \tag{1.74}$$

where projection operator $E_m$ is now projecting onto the more-dimensional space spanned by all eigenvectors of A corresponding to eigenvalue $a_m$ (compare (1.6) and (1.7)). With (1.74) we have for arbitrary $|\psi\rangle$

$$|\psi\rangle\langle\psi| \;\to\; \rho_f = \sum_m E_m |\psi\rangle\langle\psi| E_m.$$

It was felt as satisfying that on the basis of (1.74) also in case of degeneracy the state does not change if $|\psi\rangle$ is an arbitrary eigenvector of A. This is in agreement with the idea of a quantum mechanical measurement as a filter (cf. section 1.6) in which during a measurement the state of the object changes as little as possible. For one-dimensional projections (1.72) and (1.73) are special cases of (1.74). However, for more-dimensional projections (1.73) does not agree with (1.74). In the generalization by Lüders the strong projection postulate (1.70) is replaced by

$$\rho \;\to\; \rho_{f,m} = \frac{E_m\, \rho\, E_m}{\mathrm{Tr}\, \rho E_m},$$

in which the normalization factor $\mathrm{Tr}\, \rho E_m$ is the probability that during the measurement the projection corresponding to $a_m$ takes place.

The strong projection postulate does not allow a simultaneous measurement of incompatible observables not possessing joint eigenvectors⁶. This fits in with the Copenhagen interpretation, in which the impossibility of a simultaneous measurement of incompatible observables is blamed on the fact that measurement arrangements of such observables are mutually exclusive (cf. section 4.5): indeed, it is impossible that a measurement procedure simultaneously projects the state vector of the object onto both an eigenvector of A and an eigenvector of B if A and B do not have joint eigenvectors (see also section 1.9.2).

⁶ It is possible that two observables have some (but not all) eigenvectors in common; in that case there is a partial compatibility; this possibility is left out of consideration here.

Yet, this way of connecting the projection postulate to the impossibility of a simultaneous measurement of incompatible observables has caused quite a lot of confusion. In many textbooks of quantum mechanics the so-called $\gamma$-microscope (a ‘thought’ experiment due to Heisenberg) is discussed as an introduction to the idea of complementarity (cf. section 4.5). This experiment is presented as a simultaneous measurement of position and momentum of a particle, i.e. of two incompatible observables. It, admittedly, is the purpose of Heisenberg’s illustration to demonstrate that the $Q$ and $P$ measurements disturb each other, and that these measurements do exclude each other in this sense. However, there is no mutual exclusion here on the basis of the projection postulate, which is not even applicable to this example because $Q$ and $P$ do not possess normalizable eigenvectors. It could justifiably be questioned what might be the final state of the particle in this measurement, and why this measurement should not satisfy the projection postulate, notwithstanding that this latter postulate is deemed applicable to any quantum mechanical measurement. We shall extensively discuss this problem in section 3.2.5. Its solution will once more demonstrate that the projection postulate cannot play a useful role in quantum mechanics as a prescription a quantum mechanical measurement should satisfy. The postulate has led to an unfortunate fixation of the attention on a type of quantum mechanical measurements, viz. measurements of the first kind (cf. section 3.2.4), that is far too restricted, leaving out of consideration, or even declaring impossible, interesting measurement procedures used in actual practice (on the grounds that they do not satisfy the projection postulate).
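The Lüders generalization of the strong projection postulate discussed in this section can be sketched for a degenerate eigenvalue. The example assumes a three-dimensional Hilbert space in which a hypothetical observable has a two-dimensional eigenspace; the function names are illustrative, and the code does not guard against outcomes of vanishing probability.

```python
import math

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

def project(eigenspace, psi):
    """Apply the projection operator built from an orthonormal basis
    of the eigenspace to the vector psi."""
    out = [0j] * len(psi)
    for e in eigenspace:
        amp = inner(e, psi)
        out = [o + amp * x for o, x in zip(out, e)]
    return out

def luders(eigenspace, psi):
    """Return (probability of the outcome, normalized projected vector).
    Assumes the probability is non-zero."""
    phi = project(eigenspace, psi)
    prob = inner(phi, phi).real
    return prob, [x / math.sqrt(prob) for x in phi]

e0, e1 = [1, 0, 0], [0, 1, 0]
degenerate = [e0, e1]            # two-dimensional eigenspace (hypothetical)
r2, r3 = 2 ** -0.5, 3 ** -0.5

# An eigenvector inside the degenerate eigenspace is left unchanged.
prob_a, out_a = luders(degenerate, [r2, r2, 0.0])
# A general vector is projected onto the eigenspace and renormalized.
prob_b, out_b = luders(degenerate, [r3, r3, r3])
print(prob_a, out_a, prob_b, out_b)
```

The first case illustrates why the rule was considered satisfying: a state that already lies in the eigenspace passes the filter unchanged, even though it is not one of the basis vectors used to build the projector.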
1.7 Uncertainty relations

1.7.1 Heisenberg inequality

The Heisenberg uncertainty relation, or Heisenberg inequality, is formulated in terms of standard deviation (1.9). The inequality is given by Heisenberg [15] (without proof) for position and momentum observables $Q$ and $P$, satisfying the canonical commutation relation $[Q, P]_- = i\hbar I$. For arbitrary normalized state vector $|\psi\rangle$ the Heisenberg inequality is given as

$$\Delta Q\, \Delta P \ge \frac{\hbar}{2}. \tag{1.77}$$

Later this relation has been generalized by Robertson [20] for arbitrary observables A and B (earlier Kennard [21] had proven the inequality for arbitrary canonically conjugated observables having a commutation relation similar to (1.19)). The inequality is then given by

$$\Delta A\, \Delta B \ge \frac{1}{2}\, \big| \langle [A, B]_- \rangle \big|. \tag{1.78}$$

This more general relation, of which (1.77) is a special case, is also often referred to as the Heisenberg uncertainty relation.
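Proof: The proof of (1.78) is simple. Since the norm of a vector is non-negative, we have

$$\left\| \left( \frac{A}{\langle A^2\rangle^{1/2}} - i\, \frac{B}{\langle B^2\rangle^{1/2}} \right) |\psi\rangle \right\|^2 \ge 0, \tag{1.79}$$

in which expectation values $\langle A^2\rangle$ and $\langle B^2\rangle$ are taken in state $|\psi\rangle$. Using the Hermiticity of A and B, this inequality can be written as

$$2 - \frac{i\, \langle [A, B]_- \rangle}{\langle A^2\rangle^{1/2} \langle B^2\rangle^{1/2}} \ge 0.$$

Since $\langle A^2\rangle$ and $\langle B^2\rangle$ are positive this implies

$$\langle A^2\rangle^{1/2} \langle B^2\rangle^{1/2} \ge \frac{i}{2}\, \langle [A, B]_- \rangle.$$

An analogous result, with the opposite sign of the right-hand side, is obtained if A and B are interchanged. Finally, replacing in this expression A by $A - \langle A\rangle$ and B by $B - \langle B\rangle$ directly yields (1.78). Using (1.32) it easily follows from (1.79) that (1.78) is also valid for mixtures, for which $\Delta A$ is given by (1.42), and $\langle [A, B]_- \rangle = \mathrm{Tr}\, \rho\, [A, B]_-$.

The Heisenberg inequality is a manifestation of the fact that quantum mechanical states cannot be dispersionless in the sense that all quantum mechanical quantities have a sharp value. A dispersionless state must necessarily be a pure state, with a density operator given by the one-dimensional projection operator (1.33). The observable $|\psi\rangle\langle\psi|$ has a sharp value. However, for each $|\psi\rangle$ a different observable can be found (for instance, a projection operator that is incompatible with $|\psi\rangle\langle\psi|$) for which the standard deviation is non-vanishing.

In older textbooks the ‘Heisenberg uncertainty relation’ is also referred to as ‘Heisenberg indeterminacy relation’, which, however, has a more ambiguous meaning. The difference between ‘indeterminacy’ and ‘uncertainty’ will play an important role in this book (compare sections 4.5 and 7.10.3). In order to avoid ambiguity we shall refer to inequality (1.78) as the Heisenberg inequality.

Minimal uncertainty states

A state for which in (1.78) equality holds is called a minimal uncertainty state. In this case it follows from (1.79) (interchanging A and B if necessary) that the vector $\left( \frac{A - \langle A\rangle}{\Delta A} - i\, \frac{B - \langle B\rangle}{\Delta B} \right) |\psi\rangle$ has a vanishing norm. Hence, this vector is the null vector, and

$$\left( \frac{A}{\Delta A} - i\, \frac{B}{\Delta B} \right) |\psi\rangle = \left( \frac{\langle A\rangle}{\Delta A} - i\, \frac{\langle B\rangle}{\Delta B} \right) |\psi\rangle.$$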
It follows that state vector $|\psi\rangle$ is an eigenvector of operator $A/\Delta A - iB/\Delta B$, which in general is non-Hermitian/non-normal. With $A := Q$, $B := -P$ it is seen that the coherent states of the electromagnetic field, defined in appendix A.4, are such minimal uncertainty states, with $\Delta Q = \Delta P$. From (A.48) and (A.44) it follows that the ‘squeezed’ coherent states are minimal uncertainty states, too, but with different spreadings of $Q$ and $P$. Coherent states (either ‘squeezed’ or not) are for this reason considered as the most “classical” states of the field.
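The generalized inequality for arbitrary observables can be checked numerically in the smallest non-trivial case. The sketch below assumes a two-level system with the Pauli matrices for x and y as the two observables and a state parametrized by two angles; these choices and all names are illustrative and do not come from the text.

```python
import cmath
import math

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expect(op, psi):
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def spread(op, psi):
    """Standard deviation of op in the normalized state psi."""
    mean = expect(op, psi).real
    var = expect(matmul(op, op), psi).real - mean ** 2
    return math.sqrt(max(var, 0.0))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def state(theta, phi):
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

def check(theta, phi):
    """Return (product of spreads, half the modulus of the mean commutator)."""
    psi = state(theta, phi)
    lhs = spread(SX, psi) * spread(SY, psi)
    rhs = 0.5 * abs(expect(commutator(SX, SY), psi))
    return lhs, rhs

print(check(math.pi / 3, math.pi / 4))  # strict inequality
print(check(math.pi / 3, 0.0))          # equality: a minimal uncertainty state
print(check(math.pi / 2, 0.0))          # eigenvector of SX: both sides vanish
```

The last case shows the state dependence of the lower bound: for an eigenvector of one of the observables the right-hand side vanishes, so the inequality carries no information there.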
1.7.2 Entropic uncertainty relations

Uffink and Hilgevoord [22] have stressed that the standard deviation (1.9), in terms of which the Heisenberg inequality is formulated, is not always the most appropriate measure for characterizing spreading of measurement results over the spectrum of A. In particular, if the probability distribution $p_m$ exhibits more internal structure than the customarily studied Gaussian-like distributions, it may be necessary to consider different measures if one wants to provide an adequate representation of the quantum mechanical uncertainty principle. As an alternative possibility (see also [23]) we consider here the Shannon entropy (1.43). An important advantage of the entropic measure over the standard deviation is that the former, in contrast to the latter, is independent of the values of the observable: only the probabilities play a role in this quantity. This is of great importance to the empiricist interpretation of quantum mechanics, to be discussed in chapter 2, in which the values of observables do not have a direct physical significance. This implies that the numbering of measurement results becomes irrelevant, i.e. instead of the numbering $m = 1, 2, 3, \ldots$ an arbitrary permutation is equally acceptable. As a consequence, points of the spectrum that are far apart in one permutation can be close in another one (and vice versa). Under such a renumbering standard deviations (1.9) change: a probability distribution consisting of two narrow peaks has a larger standard deviation if the peaks are situated far apart (figure 1.2a) than if they are close together as in figure 1.2b. If eigenvalues do not play a role, then such a difference is not desirable. The entropic measure (1.43) solves this problem.
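The contrast between the two spreading measures under a renumbering of measurement results can be illustrated by a short sketch. The six-point spectrum and the two distributions are hypothetical example data chosen to mimic two narrow peaks that are either far apart or adjacent.

```python
import math

def std_dev(values, probs):
    """Standard deviation of a discrete distribution over the given values."""
    mean = sum(a * p for a, p in zip(values, probs))
    return math.sqrt(sum(p * (a - mean) ** 2 for a, p in zip(values, probs)))

def shannon(probs):
    """Shannon entropy (natural logarithm); terms with p = 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

values = [1, 2, 3, 4, 5, 6]                  # hypothetical eigenvalues
far_apart = [0.5, 0.0, 0.0, 0.0, 0.0, 0.5]   # two peaks at the ends
close = [0.0, 0.0, 0.5, 0.5, 0.0, 0.0]       # same probabilities, renumbered

print(std_dev(values, far_apart), std_dev(values, close))
print(shannon(far_apart), shannon(close))
```

The standard deviation changes by a factor of five under the permutation, whereas the entropy is the same for both distributions.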
The value of the Shannon entropy is independent of any renumbering or renaming of measurement results $a_m$. For this reason this measure is particularly suited to quantify the extent to which a distribution deviates from the uniform one, while ignoring the special positions where the deviations from uniformity occur.

Deutsch [24] (see also Partovi [25], Kraus [26]) has derived the following inequality for the case that $\rho$ represents a pure state $|\psi\rangle$ and $\{E_m\}$ and $\{F_n\}$ are spectral representations of maximal Hermitian operators $A$ and $B$ (i.e. $E_m = |a_m\rangle\langle a_m|$ and $F_n = |b_n\rangle\langle b_n|$), such that $p_m = |\langle a_m|\psi\rangle|^2$ and $q_n = |\langle b_n|\psi\rangle|^2$:

$$H_A(\psi) + H_B(\psi) \ge 2 \ln \frac{2}{1 + \max_{mn} |\langle a_m|b_n\rangle|}, \tag{1.81}$$

in which $H_A(\psi) = -\sum_m p_m \ln p_m$ and $H_B(\psi) = -\sum_n q_n \ln q_n$.

Proof: Since $\sum_m p_m = 1$ and $\sum_n q_n = 1$ we have

$$H_A(\psi) + H_B(\psi) = -\sum_{mn} p_m q_n \ln (p_m q_n).$$

Consider in this expression a term with fixed $m$ and $n$. Deutsch demonstrated that

$$p_m q_n = |\langle a_m|\psi\rangle|^2\, |\langle b_n|\psi\rangle|^2 \le \tfrac{1}{4} \big( 1 + |\langle a_m|b_n\rangle| \big)^2$$

(this can be done, for instance, by means of the calculus of variations, varying the quantity $p_m q_n$ over the set of all normalized linear combinations of $|a_m\rangle$ and $|b_n\rangle$). From this (1.81) directly follows.

The sum of the entropic spreading measures of two incompatible observables has a lower bound, thus preventing that both are vanishing. This is quite analogous⁷ to the Heisenberg inequality (1.78). It is important that, in contrast to (1.78), in (1.81) this lower bound is independent of the choice of state $|\psi\rangle$. Consequently, it is impossible that the right-hand side vanishes for particular states, as is happening in (1.78) for eigenvectors of A or B. Because (1.78) is a property also of the state, (1.81) is a better representation of incompatibility of observables A and B than is provided by the Heisenberg relation.

⁷ The analogy can be made even stronger by taking as a spreading measure the quantity $\exp H_A$ instead of $H_A$.
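The measure (1.43) is a special case of the more general expression

$$M_r(p) = \Big( \sum_m p_m^{1+r} \Big)^{1/r}, \qquad r > -1,\ r \neq 0,$$

which is related to (1.43) according to

$$H = -\lim_{r \to 0} \ln M_r(p).$$

Maassen and Uffink [27] have derived the following inequality, given here without proof:

$$M_r(p)\, M_s(q) \le \max_{mn} |\langle a_m|b_n\rangle|^2, \qquad r \ge 0,\ s = -\frac{r}{2r+1}.$$

If $r \to 0$ this inequality reduces to

$$H_A(\psi) + H_B(\psi) \ge -2 \ln \max_{mn} |\langle a_m|b_n\rangle|, \tag{1.83}$$

constituting an improvement over relation (1.81). Using (A.93), inequalities (1.81) and (1.83) can be generalized to states represented by arbitrary density operators $\rho$. Thus, with $\rho = \sum_i w_i\, |\psi_i\rangle\langle\psi_i|$ we get from (A.93)

$$H_A(\rho) + H_B(\rho) \ge \sum_i w_i\, [H_A(\psi_i) + H_B(\psi_i)] \ge -2 \ln \max_{mn} |\langle a_m|b_n\rangle|. \tag{1.84}$$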
Inequalities (1.84) are not yet completely satisfying. It is possible that the two complete orthonormal sets $\{|a_m\rangle\}$ and $\{|b_n\rangle\}$ have a vector in common. In that case $\max_{mn} |\langle a_m|b_n\rangle| = 1$ and inequality (1.84) is trivially satisfied. It is possible, however, to sharpen this inequality in a way taking into account a possible partial compatibility of the spectral representations $\{E_m\}$ and $\{F_n\}$, in which a number of orthogonal eigenvectors of one observable spans the same subspace as an (equally large) number of orthogonal eigenvectors of the other observable. If this occurs, then the orthogonal projection operator $P$ on this subspace satisfies

$$[P, E_m]_- = [P, F_n]_- = O \quad \text{for all } m, n. \tag{1.85}$$

If there is more than one such (orthogonal) subspace, then there are more projection operators $P_k$, constituting the spectral representation of a standard observable that is compatible with both $A$ and $B$ (in case of complete incompatibility there is only one $P_k$, viz. $P = I$). The sharpened inequality is now given by [28]
The right-hand side of this expression can vanish only if all eigenvectors of $A$ and $B$ coincide.
Proof: The proof of (1.86) makes use of (A.93). Define
Since also is a density operator. We write it in the form (A.93):
Using (1.85) it can easily be seen that Hence, and
and yielding
Since and if and correspond to eigenvectors not lying in we can now apply (1.84) within each of these subspaces. This directly implies (1.86).
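The state-independent entropic bounds of Deutsch and of Maassen and Uffink discussed in this section can be compared numerically. The sketch below assumes a two-level system, two real orthonormal bases rotated by 45 degrees with respect to each other, and a scan over real state vectors only; the bound formulas are the standard published forms, and all names are illustrative assumptions.

```python
import math

def shannon(probs):
    return -sum(p * math.log(p) for p in probs if p > 1e-15)

def probabilities(basis, psi):
    """Squared moduli of the components of psi along the basis vectors."""
    return [abs(sum(e[i].conjugate() * psi[i] for i in range(len(psi)))) ** 2
            for e in basis]

def overlap_max(basis_a, basis_b):
    """Largest modulus of an inner product between vectors of the two bases."""
    return max(abs(sum(a[i].conjugate() * b[i] for i in range(len(a))))
               for a in basis_a for b in basis_b)

def deutsch_bound(c):
    return 2.0 * math.log(2.0 / (1.0 + c))

def maassen_uffink_bound(c):
    return -2.0 * math.log(c)

alpha = math.pi / 4
basis_a = [[1.0, 0.0], [0.0, 1.0]]
basis_b = [[math.cos(alpha), math.sin(alpha)],
           [-math.sin(alpha), math.cos(alpha)]]
c = overlap_max(basis_a, basis_b)

def entropy_sum(t):
    psi = [math.cos(t), math.sin(t)]
    return (shannon(probabilities(basis_a, psi))
            + shannon(probabilities(basis_b, psi)))

# Scan real states in steps of one degree and record the smallest entropy sum.
smallest = min(entropy_sum(k * math.pi / 180) for k in range(180))
print(smallest, maassen_uffink_bound(c), deutsch_bound(c))
```

In this example the smallest entropy sum found in the scan coincides with the Maassen and Uffink bound, while the Deutsch bound lies strictly below it.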
1.8 Proposition calculus of standard quantum mechanics

1.8.1 Boolean lattice of classical propositions; objectivity

Physical propositions of a classical system can be ordered in a Boolean lattice (e.g. Jauch [29]). This is a consequence of the fact that these propositions can be represented by subsets of a manifold, viz. classical phase space. Let $\Delta$ be such a subset. Then the corresponding proposition reads: “The object has such a position $q$ and such a momentum $p$ that the phase space point $(q, p)$ is in subset $\Delta$”. All physical quantities are functions of $q$ and $p$. Hence, specification of the phase space point automatically gives all these quantities their values.
The classical character of the lattice of propositions of classical mechanics is embodied by the unrestricted validity of the distributive laws

$$a \wedge (b \vee c) = (a \wedge b) \vee (a \wedge c), \qquad a \vee (b \wedge c) = (a \vee b) \wedge (a \vee c).$$

These relations warrant that a proposition like: “The particle is at position $q$ and has energy $E$”, which can be represented by the intersection of the subsets corresponding to the propositions: “The particle is at position $q$” and “The particle has energy $E$”, is meaningful. Unlike quantum mechanical propositions, to be discussed later, classical propositions are thought to refer to objective properties of the object, possessed by the object, independently of the question of whether an observation takes place to test whether the proposition is true or false. As a consequence all classical propositions are mutually compatible, i.e. each sublattice of the lattice of classical propositions is Boolean. Each proposition can in an unequivocal way be decided to be either true or false. And this can be done simultaneously for all propositions: in principle all classical quantities can be determined jointly.

It should be noted here that within the context of quantum mechanics ‘objectivity’ has a more specific meaning than the usual significance of ‘intersubjectivity’. By ‘objective’ it is not only meant that a proposition is valid ‘independently of the specific observer’, but that it is even valid independently of the specific observer including his measuring instruments (see also section 2.3). In classical physics this latter condition is more or less trivially satisfied, because within this domain of physics the measuring instrument is assumed to play no particular role. In quantum mechanics this is quite different, however (cf. section 4.4). Here observer and measuring instrument have to be clearly distinguished. In quantum theory the important question is not so much the separation of object and observer, but rather the separation between object and measuring instrument. Whether ‘objectivity’ in this latter sense is possible in quantum mechanics is one of the big issues of the interpretation of that theory.
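The Boolean structure of classical propositions can be illustrated by representing propositions as subsets of a coarse-grained phase space. The grid, the three example propositions and the quadratic 'energy' function below are hypothetical choices made only for this sketch.

```python
from itertools import product

# A coarse-grained stand-in for phase space: grid points (q, p).
phase_space = set(product(range(4), range(4)))

# Propositions are subsets of phase space (example choices, not from the text).
a = {(q, p) for (q, p) in phase_space if q == 1}              # "position is 1"
b = {(q, p) for (q, p) in phase_space if q * q + p * p <= 4}  # "energy at most 4"
c = {(q, p) for (q, p) in phase_space if p >= 2}              # "momentum at least 2"

# Meet is set intersection, join is set union.
distributive_1 = (a & (b | c)) == ((a & b) | (a & c))
distributive_2 = (a | (b & c)) == ((a | b) & (a | c))

complement_a = phase_space - a
print(distributive_1, distributive_2)
```

Because meet and join are ordinary set operations, the distributive laws hold for any choice of subsets, which is what makes joint propositions about position and energy meaningful in the classical case.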
1.8.2 Propositions referring to a single quantum mechanical observable

Quantum mechanical propositions are of the type: “When observable A is measured, an eigenvalue $a_m$ is found”, or, more generally, “When observable A is measured, a value is found in interval $\Delta$ of the spectrum of A”. This latter proposition is denoted by $a_\Delta$. We shall now briefly discuss a proposition calculus in which such propositions can be ordered. For a more extensive discussion⁸ see e.g. Jauch [29] or Piron [30].

⁸ We are restricting ourselves here to systems without superselection rules, i.e. coherent or irreducible systems in which the superposition principle is unrestrictedly valid.
Restricting ourselves first to the case of one single quantum mechanical standard observable $A$, it is not difficult to see that the propositions $a_\Delta$ constitute a Boolean lattice. As a matter of fact, the propositions correspond in a one-to-one manner with subsets $\Delta$ of the spectrum of A. The proposition: “The value of observable A is found both in $\Delta_1$ and in $\Delta_2$” therefore has no other meaning than: “The value of observable A is found in $\Delta_1 \cap \Delta_2$”. For this reason it is meaningful to define this latter proposition as the meet (intersection) of propositions $a_{\Delta_1}$ and $a_{\Delta_2}$:

$$a_{\Delta_1} \wedge a_{\Delta_2} = a_{\Delta_1 \cap \Delta_2}. \tag{1.88}$$

This proposition is true if $a_{\Delta_1}$ and $a_{\Delta_2}$ are both true. In an analogous way the join (union) of $a_{\Delta_1}$ and $a_{\Delta_2}$,

$$a_{\Delta_1} \vee a_{\Delta_2} = a_{\Delta_1 \cup \Delta_2}, \tag{1.89}$$

is defined as the proposition that is true if either $a_{\Delta_1}$ or $a_{\Delta_2}$ is true (or both): it corresponds to the subset $\Delta_1 \cup \Delta_2$ of the spectrum of A.

If $\Delta_1 \cap \Delta_2 = \emptyset$ (i.e. the intersection is the empty set), then the corresponding proposition will be false independently of the state of the object. The proposition that is identically false is denoted by O, the absurd proposition. Analogously, if $\Delta$ coincides with the whole spectrum of $A$, then it follows from our assumption that a quantum mechanical measurement is always successful (cf. section 1.1) that this proposition is always true. For this reason it is called the trivial proposition, and denoted by I. The complement $a'_\Delta$ of $a_\Delta$ can now be defined as the proposition corresponding to the complement $\Delta'$ of $\Delta$ in the spectrum of $A$. For all $a_\Delta$ we have:

$$a_\Delta \wedge a'_\Delta = O, \qquad a_\Delta \vee a'_\Delta = I.$$
Since
if
also
1.8.3 Two or more compatible observables

When we are simultaneously considering two or more observables A, B, C, then the Boolean character of the lattice of propositions does not change if all observables are mutually compatible. In that case the observables can be measured simultaneously, making meaningful propositions like: “In a simultaneous measurement of A, B and C values $a_m$, $b_n$ and $c_k$ are found”. Analogously to classical mechanics, in this special case the structure of the (Boolean) lattice of these propositions can be found by considering the spectrum as a more-dimensional manifold.
The proposition “When observable A is measured, a value is found in subset of the spectrum of A” is true if the state vector of the system is in the subspace of Hilbert space spanned by the eigenvectors corresponding to the eigenvalues in This subspace is characterized by the projection operator (cf. (1.6))
For this reason it is possible to represent the quantum mechanical propositions, discussed above, by subspaces of or by the corresponding projection operators. The complement is then represented by projection operator onto the orthogonal complement of Proposition (1.88) now corresponds to projection operator of the intersection of the subspaces corresponding to and Analogously, proposition (1.89) corresponds to the join of the subspaces that are involved (note that Due to the Boolean structure of the sublattice of propositions corresponding to one single observable does correspond to the projection operator By choosing in (1.91) the vectors as joint eigenvectors of A, B, and C, and by restricting the summation to those vectors belonging to certain subsets of the spectra, it is possible to extend in a Boolean way the proposition calculus to arbitrary subsets of the spectra of A, B, and C.
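The correspondence between subsets of the spectrum and projection operators can be sketched numerically. The following is a minimal illustration, not taken from the text: it assumes a hypothetical observable with four nondegenerate eigenvalues, represents each proposition by the projection operator onto the span of the selected eigenvectors, and uses the standard expressions for commuting projections (product for the meet, sum minus product for the join). The names `projector`, `meet`, `join`, `complement` and the subsets `delta_a`, `delta_b`, `delta_c` are illustrative choices.

```python
import numpy as np

def projector(subset, dim):
    """Projection operator onto the span of the eigenvectors labelled by `subset`."""
    P = np.zeros((dim, dim))
    for m in subset:
        P[m, m] = 1.0
    return P

def meet(P, Q):
    """Meet of two commuting projections: projector onto the intersection."""
    return P @ Q

def join(P, Q):
    """Join of two commuting projections: projector onto the span of both subspaces."""
    return P + Q - P @ Q

def complement(P):
    """Projector onto the orthogonal complement."""
    return np.eye(P.shape[0]) - P

dim = 4  # hypothetical observable with four eigenvalues, labelled 0..3
delta_a, delta_b, delta_c = {0, 1}, {1, 2}, {2, 3}
P_a, P_b, P_c = (projector(s, dim) for s in (delta_a, delta_b, delta_c))

# Lattice operations mirror set operations on the spectrum.
print(np.allclose(meet(P_a, P_b), projector(delta_a & delta_b, dim)))
print(np.allclose(join(P_a, P_b), projector(delta_a | delta_b, dim)))

# Distributive law, the defining property of a Boolean lattice.
lhs = meet(P_a, join(P_b, P_c))
rhs = join(meet(P_a, P_b), meet(P_a, P_c))
print(np.allclose(lhs, rhs))
```

All three checks succeed because the projection operators of one single observable (or of mutually compatible ones) are diagonal in a common basis and therefore commute.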
1.8.4 Incompatible standard observables By restricting ourselves to one single observable (or a set of mutually compatible ones, corresponding to commuting operators) up to now we encountered only aspects of the lattice of quantum mechanical propositions that were not different from those of a classical system. Typically quantum mechanical problems derive from the existence of incompatible observables corresponding to non-commuting operators. What could be the meaning of propositions (1.88) and (1.89) if and refer to incompatible observables that, allegedly, cannot be measured simultaneously? The solution, chosen by Jauch and Piron, is to associate (1.88) and (1.89) with the meet and join, respectively, of subspaces of also if and correspond to incompatible observables. Now and each correspond to a projection operator, and respectively, which in general do not commute. For such noncommuting projection operators the intersection of the two subspaces is represented by the projection operator (cf. Jauch [29])
In this approach the subspace of onto which this operator is projecting, corresponds to proposition Analogously, the union can be represented
by the projection operator of the orthogonal complement of the intersection of subspaces corresponding to projection operators and For this reason in this approach the lattice of all quantum mechanical propositions (as far as they are in agreement with standard Dirac-von Neumann axiomatics) is isomorphic to the lattice of subspaces of Hilbert space Birkhoff and von Neumann [31] have demonstrated that the lattice of subspaces of a finite-dimensional vector space is a so-called modular lattice, satisfying the distributive law only in a restricted sense, viz,
On the basis of the idea that quantum mechanical propositions correspond to subspaces of it is simple to determine, for instance, the lattice of propositions of the electron spin. It corresponds to the lattice of subspaces of a two-dimensional Hilbert space. Apart from propositions O and I this lattice contains only elements corresponding to one-dimensional subspaces of viz, the subspaces spanned by the eigenvectors and respectively, of all possible components of the spin observable (cf. figure 1.3). For each value of vectors and constitute an orthonormal basis of Hence, the corresponding propositions are each other’s complement. From figure 1.3 it is clear that the total lattice consists of (Boolean) sublattices of the propositions of one single spin component The meet of any two propositions corresponding to different spin components is the absurd proposition O. This reflects the mutual incompatibility of spin components in different directions if and the non-existence of joint eigenvectors: standard axiomatics does not allow that propositions regarding different spin components are simultaneously true, the reason being that they cannot be simultaneously measured. The example of the electron spin, presented above, suggests that the meaningfulness of proposition in which and refer to incompatible observables A and B, hinges on the existence of joint eigenspaces of the observables, spanned by eigenvectors of A as well as of B. As is evident from the spin example, incom-
patibility of observables restricts the existence of such eigenspaces due to the fact that projection operator (1.92) often vanishes. Yet, this is not always the case. It is possible that incompatible observables do have non-empty joint eigenspaces. The most simple example of this obtains if the incompatible observables A and B have one single eigenvector in common (for instance, is a common eigenfunction of the angular momentum operators and In this case there actually exists a certain measure of compatibility, related to the fact that i.e. and are compatible on the subspace spanned by the joint eigenvector. A less trivial example can be obtained if the common eigenspace has a larger dimension, and the eigenvectors of A spanning this subspace are different from those of B (the two-dimensional spin example, in which the same two-dimensional vector space (viz, is spanned by different orthonormal sets of eigenvectors of incompatible spin components, is of this kind). In more-dimensional systems this implies that non-trivial joint eigenspaces of A and B exist, notwithstanding for all state vectors (see figure 1.4, demonstrating that, because planes and have a one-dimensional intersection two arbitrary observables always have non-trivial joint eigenspaces). Hence, in this formalism meaningful joint propositions do exist for observables that are “truly” incompatible. Joint propositions of this kind even exist for operators (position) and P (momentum), being the paradigm of incompatible observables, satisfying commutation relation (1.19). Although projection operator (1.92) vanishes if both subsets and of the spectra of and P, respectively, are bounded, there exists a nontrivial common eigenspace if the complements and are bounded [32]. Yet, even the existence of such a non-trivial common eigenspace of incompatible observables A and B can be seen as a reflection of a certain kind of compatibility. 
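The situation of two planes with a one-dimensional intersection can be checked numerically. The sketch below assumes that expression (1.92) has Jauch's form of a limit of alternating products, $\lim_{n\to\infty}(P_a P_b)^n$, which is how the alternating-filter procedure of section 1.8.5 describes it. The two planes in three-dimensional real space are hypothetical choices made for illustration, and the large but finite power stands in for the limit.

```python
import numpy as np

def plane_projector(u, v):
    """Orthogonal projector onto the plane spanned by u and v."""
    Q, _ = np.linalg.qr(np.column_stack([u, v]))  # orthonormal basis of the plane
    return Q @ Q.T

e1, e2, e3 = np.eye(3)
P_a = plane_projector(e1, e2)        # the x-y plane
P_b = plane_projector(e1, e2 + e3)   # a tilted plane that shares the x axis

# The two projection operators do not commute ...
commutator = P_a @ P_b - P_b @ P_a
print(np.allclose(commutator, 0))

# ... yet the alternating product converges to the projector onto the
# one-dimensional intersection spanned by e1.
limit = np.linalg.matrix_power(P_a @ P_b, 200)
P_meet = np.outer(e1, e1)
print(np.allclose(limit, P_meet))
```

The first check fails and the second succeeds: the planes correspond to non-commuting projection operators, but their meet is nevertheless a non-trivial subspace.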
Thus, in the example of figure 1.4 it is possible to define two operators and by replacing all eigenvalues of A, corresponding to the eigenspace, by equal eigenvalues of (and analogously for B). Then it is clear that on (actually, due to the degeneracy of the spectra of and we can choose eigenvectors in subspaces and differently, in such a way that and have common eigenvectors in Since this proposition does not contain any information referring to the fact that the eigenvalues of A and B are distinct, it contains the same information on A and B as it contains on and This means that, essentially, this proposition is meaningful only because of a partial compatibility of A and B: some simultaneous information on incompatible observables A and B can be obtained even according to the standard formalism. Stated differently, it is possible that a non-Boolean lattice has a Boolean sublattice. For instance, in figure 1.4 the subspaces and constitute, together with O and I, a Boolean sublattice. Two propositions
and are called compatible if, together with their complements and they can be accommodated in a Boolean sublattice. This is denoted by In case of compatibility the corresponding subspaces and intersect each other in an orthogonal way, i.e.
where is defined as the orthogonal complement of In this case the corresponding commuting projection operators and satisfy the relations for meet and join given in section 1.8.3. The concept of compatibility has been used by Jauch and Piron to define weak modularity: A lattice is weakly modular if
i.e. compatibility in case of inclusion. Like in a modular lattice, satisfying (1.93), in a weakly modular lattice there is only a restricted validity of the distributive laws. In fact, the definition of weak modularity corresponds to a weakening of the concept of modularity, sufficient to allow this lattice to represent the structure of the subspaces of an infinite-dimensional vector space.
1.8.5 States on the lattice of propositions
Jauch [29] has defined a quantum mechanical state as a real-valued function $p(a)$ on the lattice of propositions satisfying the following requirements:
1. $0 \le p(a) \le 1$.
2. $p(O) = 0$, $p(I) = 1$.
3. $p(\bigvee_i a_i) = \sum_i p(a_i)$ if propositions $a_i$ correspond to disjoint subsets of the spectrum of one single observable.
4. For each sequence of propositions $a_i$ satisfying $p(a_i) = 1$ it follows that $p(\bigwedge_i a_i) = 1$.
5. If $a \neq O$, then a state exists such that $p(a) \neq 0$; if $a \neq b$, then a state exists such that $p(a) \neq p(b)$.
This definition is in agreement with the probability measure introduced in appendix A.12. Its meaning is evident: the state as the set of all probabilities of events corresponding to quantum mechanical propositions. Requirement 5 is an efficiency condition: propositions that are not experimentally distinguishable in any way, must be identified. Gleason’s theorem (A.109) states that, if propositions correspond to subspaces of a Hilbert space of dimension then a density operator (as defined in section 1.4) exists, such that in which is the projection operator on the corresponding subspace. This implies that, if the quantum mechanical propositions can indeed be associated with the subspaces of a Hilbert space, then the definition of a state on the lattice of quantum mechanical propositions, given above, is in agreement with the usual description of a quantum mechanical state by means of a density operator on a Hilbert space. Moreover, it follows that the density operator yields the most general description of a quantum mechanical state. To a certain extent for and the situation is analogous to in the sense that each state can be represented by a density operator. This follows from the fact that a Hilbert space can always be embedded in a higher-dimensional Hilbert space. For this is not the only possibility, however. For these values of there also exist non-quantum mechanical representations (so-called hidden-variables models, cf. chapter 10). For this is trivial, because this corresponds to the classical situation of a manifold; for this is demonstrated by Bell [33] (note that Bell’s model does not satisfy property 4). The importance of Gleason’s theorem is contained in the interpretation of (1.94) as a probability measure on a linear space. Whereas in property 3 the measure is considered only on each Boolean sublattice separately, is density operator constituting a relation between its values on incompatible subspaces. 
It is important to note that this relation is not an inherent property of a probability measure as such, but is just a consequence of the structure of the linear space on which the measure is defined. A derivation of (1.94), in which the assumption of linearity plays an important role, has already been given by von Neumann ([2], p. 313 ff.). We shall return to this in section 10.1.
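The way a single density operator ties together the probabilities on different Boolean sublattices can be illustrated with a short numerical sketch. It assumes that (1.94) is the usual trace rule (probability equals the trace of the density operator times the projection operator), and it uses a randomly generated state and randomly generated orthonormal bases in a three-dimensional Hilbert space. The helper names and the fixed random seed are illustrative choices, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only to make the sketch reproducible

def random_density_operator(dim):
    """A positive operator with unit trace (hypothetical example state)."""
    X = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = X @ X.conj().T  # positive by construction
    return rho / np.trace(rho).real

def random_orthonormal_basis(dim):
    """Columns of the returned matrix form an orthonormal basis."""
    X = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    Q, _ = np.linalg.qr(X)
    return Q

def ray_projectors(basis):
    """One-dimensional projection operators onto the basis vectors."""
    return [np.outer(basis[:, k], basis[:, k].conj()) for k in range(basis.shape[1])]

def probability(rho, P):
    """Trace rule: probability of the proposition represented by projector P."""
    return float(np.trace(rho @ P).real)

dim = 3
rho = random_density_operator(dim)
bases = [random_orthonormal_basis(dim) for _ in range(2)]
for basis in bases:
    probs = [probability(rho, P) for P in ray_projectors(basis)]
    print(np.round(probs, 4), round(sum(probs), 10))
```

Each basis plays the role of one Boolean sublattice. The probabilities add up to one within every basis, while the same operator `rho` fixes the values on all (mutually incompatible) bases at once.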
Jauch [29] has demonstrated that for a Boolean lattice as well as for the lattice of subspaces of a vector space property 4 is a consequence of the first three properties. Hence, when restricting ourselves to such lattices, property 4 could be left out of the definition. That this property has yet been preserved in the definition, is a consequence of the fact that Jauch’s intention is more general than just yielding a representation of the Hilbert space structure of quantum mechanics. It was his ambition to derive this Hilbert space structure from the structure of empirical propositions possible in the quantum mechanical domain. For this reason the quantum mechanical state defined above, must in the first place be seen as the probability that proposition is true if the corresponding observable is measured, rather than as a measure on the lattice of subspaces of a Hilbert space. Property 4 states that proposition is true with certainty (probability 1) if and are both true. If and are propositions corresponding to incompatible observables A and B, then Jauch proposes to realize a measurement of proposition by a measurement procedure in which observables A and B are alternately measured an infinite number of times, such that each time either proposition or is tested. In order to be able to describe such a procedure, it is necessary that the state be known in which the object is left behind after each A or B measurement. Assuming that the A and B measurements satisfy the strong projection postulate (cf. section 1.6), this state can be uniquely determined by the transitions and respectively. Carried out consecutively, this gives: Repeating this an infinite number of times expression (1.92) of the projection operator on the intersection subspace is obtained; evidently, under the given assumptions this operator corresponds to the action of an infinite sequence of alternating A and B filters. If state vector is in the intersection subspace, then i.e. 
both filters are completely transparent for this state, irrespective of the number of times each filter is applied (in this case property 4 holds). If then the intersection subspace is empty, and property 4 cannot be presupposed. In this case the alternating sequence of A and B filters is opaque. Jauch’s point of departure in characterizing the meet of two propositions is the above-mentioned idea of an infinite alternating filter rather than expression (1.92). The idea is that the Hilbert space structure of quantum mechanics must follow from the structure of the set of operational procedures for measuring quantum mechanical quantities. Property 4 is considered as a physical property of this latter set, valid because the operational procedures are satisfying it. Since Hilbert space has the same structure, it seems to yield a satisfactory representation of the operational procedures. For this reason Hilbert space has become a paradigm of quantum mechanics, this theory being virtually identified with Hilbert space theory. A word of caution is in order here, however. It is possible that more general structures than the Hilbert space structure exist, having the same property. A hint in this direction is
obtained when the possibility of superselection rules is taken into account. The possibility of superselection is an indication that it may be necessary to generalize the structure of the simple correspondence between quantum mechanical propositions and (projection operators of) subspaces of a Hilbert space. The possibility that the definition of a joint measurement of incompatible observables as an opaque filter may be inspired by expression (1.92), rather than the other way around, is another source of uneasiness with respect to the standard formalism. The question is whether this definition exhausts all operational possibilities of obtaining joint information on incompatible observables. In particular, in view of its reliance on von Neumann’s projection postulate, it is not clear that the idea of a filter is a necessary one at all. In order to obtain a reliable idea of what a quantum mechanical proposition is, it will be necessary to have a less abstract notion of quantum mechanical measurement. This will be considered in section 1.9. It will turn out that measurement within the domain of quantum mechanics cannot be made to fit into the standard formalism of Hilbert space theory as easily as assumed by Jauch, and that a generalized formalism will be necessary to encompass all measurements possible within this domain.
1.9 Generalized quantum mechanical observables
1.9.1 Quantum mechanical observables as positive operator-valued measures
In section 1.8.1 we started from standard Dirac-von Neumann axiomatics, in which the structure of quantum mechanical propositions regarding observable quantities is identified with the lattice of projection operators on subspaces of the Hilbert space containing the quantum mechanical state vectors. Gleason’s theorem (cf. section 1.8.5) then induced a generalization of the quantum mechanical state. It turned out that, next to states represented by state vectors (wave functions), states represented by density operators not satisfying (1.34) are possible. Gleason’s theorem demonstrates that this is the maximal generalization of the concept of a quantum mechanical state if propositions are associated with projection operators9.
9 In chapter 10 it will be seen that this does not imply that further generalization of the concept of ‘physical state’ would be excluded.
In the standard formalism probabilities are represented by expressions like (1.30), (1.31) or (1.94). It will now be demonstrated that in these representations an element is hidden that is not required by the quantum mechanical formalism. In particular in the form (1.94) it is evident that the probability can be represented as an inner product or a bounded linear functional, on a Hilbert space, viz.
Hilbert-Schmidt space (cf. appendix A.10 and section 1.4.3). Thus, in section 1.8.5 density operators can be considered as vectors in Hilbert-Schmidt space, whereas refers to a functional on that space. Strictly speaking it is sufficient to consider functionals that are not defined on the whole Hilbert-Schmidt space, but only on the convex subset of density operators (cf. appendix A.11.3). Then the quantum mechanical probability is an affine functional on this set (e.g. Holevo [34]), i.e. a functional satisfying
Since the density operators span the whole of Hilbert-Schmidt space it is possible to extend the affine functional in a unique way to a (bounded) functional on Hilbert-Schmidt space. Using this representation it is possible to find a generalization of the concept of a quantum mechanical observable as defined in section 1.1. For this purpose use is made of Riesz’s lemma, stating that each bounded linear functional on a Hilbert space can be represented as an inner product. This implies that a probability on Hilbert-Schmidt space must have a representation in the form of a Hilbert-Schmidt inner product: Here is an element of Hilbert-Schmidt space. Because of the requirement that is a probability it should satisfy
Expression (1.96) is very similar to (1.40) and (1.94). However, there is an essential difference: the operator need not be a projection operator. In the following (see section 3.3 and chapters 7 and 8) we shall repeatedly be confronted with the fact that the standard formalism, in which operator is a projection operator, is indeed too restricted to be able to accommodate all measurement procedures within the domain of application of quantum mechanics. Hence, the generalization of the definition of a quantum mechanical probability from (1.94) to (1.96) has an operational importance. The set of the probabilities over the different values of defines a probability distribution, corresponding to a measure on the spectrum of the observable (cf. appendix A. 12). Operators generate a positive operator-valued measure (POVM) (appendix A.12.3). Hence, the generalization of the quantum mechanical observable found here is nothing but the generalization of the projection-valued measure (PVM), determined by the projection operators of the spectral representation of a Hermitian operator, to the POVM. Evidently, the formalism of quantum mechanics allows a more general representation of probabilities than is considered in
the standard formalism presented in the previous sections. This generalization can also be characterized as an extension of the concept of an orthogonal decomposition of the identity operator to the generalized, generally non-orthogonal decomposition (NODI) defined in appendix A.12.3, i.e. apart from (1.97) we also have
but in general It will also turn out that it is necessary to drop the requirement, inherent in the standard formalism, that operators belonging to one single generalized observable, be mutually commutative: in general One purpose of this book is to show that the generalized concept of a quantum mechanical observable is necessary to be able to describe realistic experiments. Indeed, the standard formalism will turn out not to be able to describe certain experiments, even experiments that are essential to quantum mechanics. For an adequate characterization of quantum mechanical experiments the generalization from PVM to POVM is of crucial importance. This observation is important enough to deal with it at some greater length. As a matter of fact, only the standard formalism, in which quantum mechanical observables are represented by Hermitian operators, is generally presented in elementary quantum mechanics textbooks. As a consequence there is a suggestion that the whole quantum mechanical domain is covered by it. One might even think, with Heisenberg, that only those observations are experimentally possible that are allowed by the (standard) theory 10 . When only the standard formalism is taken into account, then Heisenberg’s maxim would imply the impossibility of measurements represented by POVMs that are not PVMs. Failing a definition of the concept of a quantum mechanical measurement that is independent of the standard formalism, this could stand in the way of a consideration of generalized measurements, seducing us into neglecting possible deviations from the standard formalism as irrelevant consequences of a sloppy kind of measurement. An example of this latter attitude is the quantum mechanical description of the detection of the intensity of an electromagnetic field using a realistic detector, i.e. a detector with efficiency smaller than 1. In section 7.2 we shall see that this measurement procedure is not represented by a PVM, but by a POVM. 
10 “It is the theory which decides what we can observe” (Heisenberg [35]).
Notwithstanding this, such detectors are used to test the standard formalism of quantum mechanics. Without any correction this, of course, would entail disagreement between experimental and theoretical outcomes: the measured intensity will in general be smaller than the intensity that would be measured by a detector that is 100% efficient. As will be seen in section 7.2, this latter detector, indeed, is measuring the
quantum mechanical photon number observable represented (for one single mode of the electromagnetic field) by the photon number operator Hence, for the efficient detector the standard formalism is valid. This is not the case, however, for an inefficient detector. Yet, for most physicists this has not been a reason to doubt the standard formalism. For a long time it has been standard practice to (successfully) correct the experimental measurement results by means of an efficiency factor so as to achieve agreement with the standard formalism. Under the influence of work by, a.o., Davies and Lewis [36, 37], Holevo [34] and Ludwig [38] (see also Kraus [39], Prugovecki [40], and Busch et al. [41, 42]) it has only recently been realized that the generalization of the concept of a quantum mechanical observable from PVM to POVM cannot be merely understood on the basis of correction factors, but that it constitutes a fundamental extension of the domain of application of quantum mechanics. Moreover, it will become clear that even basic issues of quantum mechanics like the so-called ‘thought experiments’, discussed in the early days of quantum mechanics in order to try to understand such basic concepts as ‘complementarity’ (cf. chapter 4), cannot be fully understood on the basis of the standard formalism, but need generalized observables for their formulation. The approach taking seriously the necessity of POVMs for a description of quantum mechanical measurements is often referred to as the operational approach.
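The difference between an efficient and an inefficient detector can be made concrete in a small numerical sketch. The treatment of section 7.2 is not reproduced here; the sketch assumes the standard binomial loss model, in which each of the n photons present is registered independently with probability equal to the efficiency, and it truncates the single-mode Fock space at a hypothetical maximum photon number. The function name `detector_povm`, the efficiency value and the example state are illustrative assumptions.

```python
import numpy as np
from math import comb

def detector_povm(eta, n_max):
    """POVM elements M_m (m = 0..n_max) of a photon counter with efficiency eta,
    on a Fock space truncated at n_max photons (binomial loss model, an assumption)."""
    dim = n_max + 1
    elements = []
    for m in range(dim):
        # Probability of registering m counts when n photons are present.
        diag = [comb(n, m) * eta**m * (1 - eta)**(n - m) if n >= m else 0.0
                for n in range(dim)]
        elements.append(np.diag(diag))
    return elements

eta, n_max = 0.6, 3
povm = detector_povm(eta, n_max)
ideal = detector_povm(1.0, n_max)

rho = np.diag([0.1, 0.2, 0.3, 0.4])  # hypothetical state, diagonal in photon number
number_op = np.diag(np.arange(n_max + 1, dtype=float))

probs = [np.trace(rho @ M) for M in povm]
mean_detected = sum(m * p for m, p in enumerate(probs))
mean_photons = np.trace(rho @ number_op)

print(np.allclose(sum(povm), np.eye(n_max + 1)))    # elements add up to the identity
print(np.allclose(povm[0] @ povm[0], povm[0]))      # is M_0 a projection?
print(np.allclose(ideal[2] @ ideal[2], ideal[2]))   # the efficient detector yields a PVM
print(mean_detected, eta * mean_photons)
```

Under this model the elements for efficiency below 1 are positive operators adding up to the identity but are not projection operators, whereas efficiency 1 reproduces the spectral projections of the photon number operator. The mean number of counts equals the efficiency times the mean photon number, which is the efficiency-factor correction mentioned above.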
1.9.2 Joint measurement of generalized observables For compatible standard observables joint measurement has been discussed in section 1.3. In the standard formalism commutativity (1.25) of the spectral representations of the corresponding Hermitian operators is necessary and sufficient for joint measurability. The PVM yielding probabilities (1.26) as its expectation values, corresponds to the spectral representation of the standard observable C defined in section 1.3. It can be interpreted as the observable corresponding to the joint measurement of the two standard observables A and B with spectral representations and respectively. This can be implemented by the observation that PVMs and can be obtained from PVM as its marginals (cf. (1.27)),
The concept of joint measurement is extended in the following way to generalized observables:
Joint measurement of POVMs Two observables, represented by POVMs and can be jointly measured if a measurement procedure exists, represented by a bivariate POVM
(more precisely11: a POVM generated by a NODI, with elements characterized by two indices) such that
If this is the case, then POVMs and are called commeasurable. The important point with respect to commeasurability of POVMs is that the measurement arrangement has two pointer scales, one for observable and one for observable (cf. figure 1.5). As already remarked in section 1.3, in this definition it is not so much the aspect of simultaneity that is important, but rather the aspect of jointness, i.e. the circumstance that each individual object is yielding a measurement result for both observables. This dictates the bivariate character of POVM Analogously to (1.26) the joint probability distribution is given by
It is important to note here that, in contrast to the standard formalism, the possibility of joint measurement of generalized observables does not hinge on commutativity of operators. This can be seen by considering a simple example (compare section 7.9.1 for physical background). Let E and F be two non-commuting Hermitian projection operators. Then the operators can be easily seen to generate a POVM if The operators can be ordered in the following way into a bivariate POVM:
11 In the following a strict distinction between a POVM and the NODI generating the POVM will be drawn only if this is necessary to avoid confusion. This implies a certain sloppiness in the mathematical terminology, which is accepted in order not to deviate too much from customary nomenclature.
The marginals of this bivariate POVM are and Since $[E, F] \neq 0$, operators of one POVM do not commute with those of the other POVM. Hence, there is no compatibility in this sense. Nevertheless there is commeasurability of and in the sense defined above. Note that and are not PVMs. Hence, they do not correspond to standard observables.
We now prove the following theorem (cf. de Muynck and van den Eijnde [43]):
Theorem: Two PVMs $\{E_m\}$ and $\{F_n\}$ are commeasurable if and only if $[E_m, F_n] = 0$ for all $m$, $n$. Their joint measurement is represented by POVM $\{R_{mn}\}$ with
$R_{mn} = E_m F_n$. (1.103)
Proof: Assume that Then
Since
is a POVM such that Because of the Theorem of appendix A.6 we then have
we also have
This implies
By multiplying this expression from left and from right by we get
and by using
Hence: and
Assume and
Then
is a POVM satisfying
This result is in complete agreement with the well-known theorem of the standard formalism (cf. section 1.3) that only compatible observables (corresponding to commuting PVMs) are jointly measurable. Result (1.103) is in agreement with (1.26). The foregoing theorem can be sharpened appreciably in the following way (de Muynck and Koelman [44]): Theorem: PVM and POVM are commeasurable if and only if POVM of the joint measurement is represented by
The
Proof: Like in the foregoing theorem it follows from Then it follows from
that that
For the same reason we also have
The proof of the converse is analogous to the corresponding proof of the foregoing theorem.
Evidently, commutativity of two POVMs is already required for joint measurability if only one of the two observables is a standard one. If the requirement of commutativity is not satisfied, joint measurability can only obtain if both observables are generalized ones. POVM (1.103), and the joint probabilities defined by it, could correspond to a measurement procedure in which is measured first in accordance with the Lüders projection procedure (1.76), followed by a measurement of (or an analogous procedure in which the order of the PVMs is interchanged). Of course, it is not possible to infer from this a general validity of the Lüders projection postulate. It is only possible to conclude that the concept of generalized observables does not exclude the possibility of such a procedure for the joint measurement of commuting PVMs.
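The claim made earlier in this section, that commeasurability of generalized observables does not hinge on commutativity, can be checked numerically. The sketch below is an illustration under stated assumptions and not necessarily the arrangement used in the text: it takes two non-commuting projection operators E and F on a two-dimensional space, a weight `lam` between 0 and 1, and orders the four operators `lam*E`, `(1-lam)*F`, `(1-lam)*(I-F)`, `lam*(I-E)` into a hypothetical bivariate array `R`. The state `rho` is likewise an arbitrary example.

```python
import numpy as np

def ray_projector(theta):
    """Projector onto the direction (cos theta, sin theta) in the plane."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

I2 = np.eye(2)
E = ray_projector(0.0)
F = ray_projector(np.pi / 4)   # does not commute with E
lam = 0.5                      # hypothetical weight, 0 <= lam <= 1

# Bivariate arrangement R[m, n]; the array has shape (2, 2, 2, 2).
R = np.array([[lam * E,              (1 - lam) * F],
              [(1 - lam) * (I2 - F), lam * (I2 - E)]])

M = R.sum(axis=1)   # marginal over the second index: M[m] = sum_n R[m, n]
N = R.sum(axis=0)   # marginal over the first index:  N[n] = sum_m R[m, n]

commutator = M[0] @ N[0] - N[0] @ M[0]
print(np.allclose(R.sum(axis=(0, 1)), I2))   # R is a decomposition of the identity
print(np.allclose(commutator, 0))            # do the marginals commute?
print(np.allclose(M[0] @ M[0], M[0]))        # is the marginal a PVM?

rho = np.array([[0.7, 0.2], [0.2, 0.3]])     # example density operator
p_joint = np.einsum('ij,mnji->mn', rho, R)   # joint probabilities Tr(rho R[m, n])
print(p_joint, p_joint.sum())
```

The marginals are POVMs that are not PVMs and do not commute with each other, yet a single bivariate POVM yields both of their probability distributions as marginals of one joint distribution.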
1.9.3 Naimark’s theorem
Naimark [45] (also Helstrom [46], section III.3, and Akhiezer and Glazman [47], Vol. II, p. 388) has demonstrated that it is possible to obtain a POVM on a Hilbert space $\mathcal{H}$ by means of orthogonal projection of a PVM on a Hilbert space $\mathcal{H}'$ containing $\mathcal{H}$ as a subspace. Thus:
Naimark’s theorem: Let $\{M_m\}$ be a POVM on $\mathcal{H}$. Then there exists a Hilbert space $\mathcal{H}' \supseteq \mathcal{H}$, a PVM $\{E_m\}$ on $\mathcal{H}'$ and an orthogonal projection operator P, such that
$M_m = P E_m P$.
We prove12 Naimark’s theorem for the case that Hilbert space $\mathcal{H}$ is finite-dimensional (see section 1.9.4 for an infinite-dimensional example).
Proof: We first prove the theorem for the case that the NODI generating POVM consists of N' (N' > N) elements, all of them up to a multiplicative constant equal to a projection operator onto a one-dimensional subspace of a so-called maximal POVM (compare section 7.7.2). Thus,
For the proof we make use of the theorem on eutactic stars proven in appendix A.7. As a matter of fact, eutactic stars precisely correspond to NODIs of the type considered here. Thus, on we have
Defining vectors
(not normalized to 1), then
or
12 A similar proof has been given by Peres ([48], p. 285).
This means that the Gram matrix is a projection, and that for this reason the condition of the theorem of appendix A.7 is satisfied. The theorem then straightforwardly yields as elements of the required PVM the projection operators onto the orthonormal basis of constructed in the theorem. For general POVMs the theorem follows directly from the foregoing, since each of its operators is Hermitian, and, hence, can be expressed in terms of its (unnormalized) eigenvectors according to
Then the NODI generates a maximal POVM. Hence, and the orthogonal decomposition of the identity generating a non-maximal PVM.
Naimark’s theorem is an important mathematical tool for deriving properties of POVMs (e.g. Kruszynski and de Muynck [49]). By some physicists the theorem is also deemed important because of the apparent possibility of reducing the problem of generalized observables to the standard formalism. Indeed, because of
a measurement of generalized observable in state can be replaced by a measurement of standard observable in state This appears to restore the universal validity of the standard formalism, silently attributed to it in most quantum mechanics textbooks. In the following chapters it will be seen that in a quantum mechanical measurement the measuring instrument plays an important role. An extension of Hilbert space of the object to the tensor product space (cf. appendix A.9) of the state vectors of object and measuring instrument (sometimes called an ‘ancilla’, e.g. Helstrom [46]) is a plausible one. This seems to yield a natural way to define as such a tensor product space, and relate a POVM of the object to a PVM of the measuring instrument or, perhaps, of the combined system of object and measuring instrument. It, indeed, is possible to do so by extending, if necessary, to a space having dimension integer, which can be interpreted as a tensor product of and an space of the ancilla. As will be discussed in section 3.2, in actual practice a measurement of a quantum mechanical observable
is often achieved by observing a pointer of a measuring instrument, quantum mechanically represented by a PVM, the so-called pointer observable. It would be very helpful if this latter PVM could be obtained by means of the Naimark extension. Unfortunately, the Naimark extension does not yield any indication with respect to the construction of a physical method for measuring a given POVM. In general the Naimark theorem yields a PVM corresponding to an observable of the combined system of object and measuring instrument rather than to a pointer observable of the measuring instrument. This considerably reduces the applicability of the theorem. The theorem does also not really reduce POVMs to PVMs, since the Naimark PVM refers (also) to the measuring instrument. So, this PVM does not represent a property of the microscopic object (see also section 2.2). In discussing in chapters 7 and 8 specific examples of quantum measurements we shall often rely on PVMs as pointer observables. These PVMs refer to properties of the measuring instrument instead of the microscopic object. As far as this object is concerned, the POVMs remain necessary for characterizing the probabilities, and Naimark’s theorem does not yield any contribution to a better understanding of it. Moreover, the theorem does not solve the problem of the joint measurement of incompatible observables. Thus, let and be two incompatible POVMs on and let there exist a bivariate on such that and with P and two projection operators as in Naimark’s theorem. If, moreover, we would have
then it would follow that
and the probability distributions of and could be obtained by measuring PVM This would reduce the problem of the joint measurement of the incompatible POVMs and to the measurement of PVM However, it follows from (1.104) that the ranges of projection operators P and must have a non-empty intersection containing Hilbert space Restricting state vectors to it follows from (1.104) and (1.105) that and However, this would imply that POVMs and are compatible on this subspace of This is in disagreement with our point of departure. This attempt at solving the problem of incompatibility can be compared with the attempt, discussed in section 1.8.4, to represent a joint measurement of incompatible PVMs by means of a projection operator of an intersection subspace. For truly incompatible observables, for which these intersection subspaces are all empty, this does not yield a solution. It seems that Naimark’s theorem does not bring the solution of the problem of joint measurement of incompatible observables any nearer. In section 7.9 this problem will be approached in a different way.
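The Naimark construction discussed in this section can be made concrete with a small numerical sketch. The example is not taken from the text: it assumes the three-outcome "trine" POVM on a real two-dimensional space, whose rank-one elements arise as projections of an orthonormal basis of a three-dimensional space. All function and variable names are illustrative.

```python
import math

# Trine POVM on a real 2-dimensional space (illustrative example, not from the text):
# M_k = |phi_k><phi_k| with three unnormalized vectors at 120 degrees.
angles = [2 * math.pi * k / 3 for k in range(3)]
phi = [(math.sqrt(2 / 3) * math.cos(a), math.sqrt(2 / 3) * math.sin(a)) for a in angles]

# Naimark extension: append a third component so that the extended vectors
# e_k form an orthonormal basis of a 3-dimensional space.
ext = [(x, y, 1 / math.sqrt(3)) for (x, y) in phi]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def povm_sum():
    """Return sum_k |phi_k><phi_k| as a 2x2 nested list (should be the identity)."""
    return [[sum(v[i] * v[j] for v in phi) for j in range(2)] for i in range(2)]

def povm_probs(psi):
    """Probabilities <psi|M_k|psi> of the POVM in the 2-dimensional space."""
    return [dot(v, psi) ** 2 for v in phi]

def pvm_probs(psi):
    """Probabilities of the PVM {|e_k><e_k|} for psi embedded as (psi, 0)."""
    embedded = (psi[0], psi[1], 0.0)
    return [dot(e, embedded) ** 2 for e in ext]

psi = (math.cos(0.3), math.sin(0.3))
print(povm_sum())
print(povm_probs(psi), pvm_probs(psi))
```

The two probability lists coincide for every state of the original space, while the third basis direction of the extended space has no counterpart in it. This is in line with the remark above that the Naimark PVM refers to a larger system than the object alone.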
1.9.4 Phase observables
As is well known [50, 51, 52], a standard phase observable (corresponding to a Hermitian operator that is canonically conjugate to the number observable N (A.15)) does not exist. However, it is possible to find POVMs representing generalized phase observables. One way to find such phase POVMs is by exploiting the idea of canonical conjugacy. Analogously to (A.10) it may be required that a phase observable be represented by a POVM satisfying
with the phase shift operator defined in appendix A.2, and V the number shift operator (A.19). It was essentially demonstrated by Leonhardt et al. [53] that (1.106) implies
with
given by (A.18). This should be compared with (A.11). The special solution (to be qualified in section 7.7.2 as a maximal POVM) might be referred to as the canonical phase observable. Since the vectors are not (Dirac) orthogonal for different values of we have a non-orthogonal decomposition of the identity (NODI) rather than an orthogonal one (cf. appendix A.12.3). The POVM generated by this NODI is the so-called Susskind-Glogower phase POVM [50]. Another way to arrive at the canonical phase observable (Barnett and Pegg [54]) is essentially based on the Naimark theorem (cf. section 1.9.3). Consider the space spanned by the number states (cf. appendix A.2) as a subspace of an extended space spanned by the orthogonal set in which the states with (not having a direct physical meaning) span the orthogonal complement of in In this latter space we can define the unitary number shift operator (compare (A. 19)) which, in agreement with Stone’s theorem, defines a Hermitian operator according to having the vectors
as a complete and (in the Dirac sense) orthonormal set of (improper) eigenvectors. Thus,
Now, let be the projection operator onto then
and the canonical phase observable is the Naimark projection of PVM
The construction of a phase observable by means of Naimark’s theorem illustrates the problems of the physical interpretation of this theorem, mentioned earlier. First, the physical relevance of vectors is unclear. Second, we do not know a measurement procedure for measuring PVM It is possible to circumvent the problem posed by the negative vectors by applying the method of a Naimark extension using an ancilla (Shapiro and Shepard [55]). On the basis of the number shift operators V and defined by (A.19) for object and ancilla, respectively, it is possible to define two commuting Hermitian operators, in which (and analogously for A complete and (in the Dirac sense) orthonormal set of joint eigenvectors of and is given by
Since if and (1.107), the Naimark projection of PVM (1.109) realized by the Susskind-Glogower POVM.
with given by essentially yields
Although this version of the Naimark construction evades negative vectors, it does not solve all problems mentioned above. It hinges on an operational way of jointly measuring the standard observables and which is not at all evident, even if the normal operator (having eigenvalues 0 and might be interpreted as a phase operator of the combined system of object+ancilla. An alternative definition of a phase observable, based on the ‘eight-port’ homodyning experiment performed by Noh, Fougères and Mandel [56] will be discussed in section 8.4.3.
1.10 Correspondence with classical mechanics
From the outset the correspondence between the quantum mechanical and the classical theory has attracted much attention. Quantum mechanics and classical mechanics are very different at first sight. This holds for the Schrödinger picture (cf. section 1.1), in which the emphasis is on the state vector, as well as for the Heisenberg picture, in which physical quantities are represented by operators rather than by scalar functions of $q$ and $p$. What relation might exist between evolution equation (1.17) of the quantum mechanical observables and the classical Hamilton equations

$$\frac{dq}{dt} = \frac{\partial H}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}, \tag{1.110}$$

in which $H = H(q,p)$ is the classical Hamiltonian of the system?
The question of the correspondence between quantum mechanics and classical mechanics has two different aspects: How can known classical formulae be used for generating valid quantum mechanical formulae? How can classical relations be derived from quantum mechanical ones (the classical limit)? Although these questions are not independent from each other, they are yet very different. The first question has a heuristic character: quantum mechanics cannot be derived from classical mechanics; at most the classical formulae can suggest what the quantum mechanical ones could look like. The standard answer “Replace the scalar quantities and by operators and P” turns out to be not unique if applied to quantities like (cf. section 1.10.2). The only thing we can do is examine the different possibilities, and check which one (if any) is in agreement with observation. The question of the classical limit of quantum mechanics is related to the question of whether quantum mechanics is also applicable to macroscopic objects. If this is the case, then it must be possible to find the classical equations by applying the quantum mechanical ones to objects with macroscopic masses, making some kind of approximation (like This question, too, is less easily answerable than is sometimes thought (e.g. Messiah [57], chapter VI.4). The latter question is discussed in sections 2.4.2 and 3.4. In the present chapter we are primarily interested in the first question.
1.10.1 Ehrenfest’s theorem
A preliminary answer to the question of the correspondence between the quantum and classical formalisms is yielded by Ehrenfest’s theorem. This theorem applies to the expectation values $\langle Q \rangle$ and $\langle P \rangle$ of the quantum mechanical position and momentum observables. It states that the evolution equations of these quantities are formally identical to the classical Hamilton equations of the corresponding classical quantities, i.e.

$$\frac{d\langle Q\rangle}{dt} = \left\langle \frac{\partial H}{\partial P} \right\rangle, \qquad \frac{d\langle P\rangle}{dt} = -\left\langle \frac{\partial H}{\partial Q} \right\rangle. \tag{1.111}$$

Of course, equations (1.111) are meaningful only if the differentiations with respect to the operators $Q$ and $P$ can be given a mathematical meaning. This, indeed, is the case if we restrict ourselves to systems with simple Hamiltonians of the form

$$H = \frac{P^2}{2m} + V(Q).$$
For such Hamilton operators it is easily verified that the commutators with $Q$ and $P$, respectively, can be found by means of the formal differentiations of H as given in (1.111), the validity of this latter relation following by applying (1.17) to observables $Q$ and $P$. For these systems this corroborates the formal correspondence of the classical and quantum mechanical descriptions exhibited in Ehrenfest’s theorem: the expectation values of observables $Q$ and $P$ behave classically, in the sense of satisfying the “classical” equations (1.111). Note that for the systems considered here this is an exact result of quantum mechanics. Hence, it has no bearing on the classical limit: in general the solutions of (1.111) differ from those of (1.110), even if the system is macroscopic (e.g. [58]). As will be seen in the next section, for systems having a less simple Hamiltonian (1.111) does not have an unambiguous meaning.
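The remark that the solutions of (1.111) generally differ from those of (1.110) can be illustrated numerically: the expectation value of the force is in general not the force evaluated at the expectation value of the position. The sketch below assumes a Hamiltonian of the kinetic-plus-potential type with the quartic potential V(q) = q^4/4 and a Gaussian position probability density; the potential, the density, and all names are illustrative choices, not taken from the text.

```python
import math

def gaussian_density(q, mean, sigma):
    return math.exp(-0.5 * ((q - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def expectation(func, mean, sigma, half_width=10.0, steps=4000):
    """Riemann-sum estimate of the integral of func(q) * density(q) dq."""
    h = 2 * half_width * sigma / steps
    total = 0.0
    for i in range(steps + 1):
        q = mean - half_width * sigma + i * h
        total += func(q) * gaussian_density(q, mean, sigma) * h
    return total

def force_quartic(q):
    """-dV/dq for the assumed potential V(q) = q**4 / 4."""
    return -q ** 3

mean, sigma = 1.0, 0.5
mean_force = expectation(force_quartic, mean, sigma)   # expectation value of -V'(Q)
force_at_mean = force_quartic(mean)                    # -V' evaluated at <Q>
print(mean_force, force_at_mean)
```

For a Gaussian density with mean q0 and width sigma the third moment is q0^3 + 3 q0 sigma^2, so the two numbers agree only in the limit of vanishing width. For a harmonic potential the force is linear in q and the discrepancy disappears.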
1.10.2 Dirac quantization
The similarity of the quantum mechanical equation (1.17) with the evolution equation of a classical quantity has particularly been emphasized by Dirac [1]. The classical equation can be written as
in which is the (classical) Poisson bracket of quantity and Hamiltonian If applied to the classical quantities and this yields the Hamilton equations (1.110). Comparison with (1.17) shows that the evolution equation of a quantum mechanical observable can be obtained from the evolution equation of a classical quantity by means of the substitution
in which the classical Poisson bracket is replaced by the quantum mechanical commutator This suggests the possibility of establishing a relation between classical mechanics and quantum mechanics by means of the substitution (1.112). For instance, by (1.112) the classical relation immediately gives the canonical commutation relation (1.19). The quantization procedure, establishing this commutation relation between and P, is referred to as the canonical quantization procedure. It implies the well-known representation of the momentum observable. The operators corresponding to some other classical quantities, like the components of
angular momentum etc., and the Hamilton operator can be obtained in this way. Yet, the fundamental difference between classical and quantum mechanics makes it rather improbable that the canonical quantization procedure can be an automatic recipe for a “derivation” of quantum mechanics from classical mechanics. There are several indications that this, indeed, is not the case. One result, pointing in this direction, is a theorem, proven by Groenewold [59] and van Hove [60], stating that it is impossible to find a mapping from the set of classical phase space functions to the set of quantum mechanical operators, satisfying the following requirements:
This implies that there does not exist an algebraic isomorphism between the Lie algebra of the classical quantities (i.e. functions of and with the Poisson bracket as a Lie product) and a Lie algebra of Hermitian operators with the commutator as a Lie product (e.g. Chernoff [61]). The proof is based on a demonstration of the non-uniqueness of mapping for instance, the classical quantity is mapped onto different Hermitian operators, depending on whether in the classical equality either the first or the second Poisson bracket is taken as point of departure of the quantization procedure.
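The non-uniqueness used in this proof can be checked with a short computation. The sketch below is an illustration under stated assumptions, not the argument of [59] or [60]: it takes units with hbar = 1, stores operators as polynomials in the ordered products Q^a P^b, uses the symmetrized operators (Q^2 P + P Q^2)/2 and (Q P^2 + P^2 Q)/2 as images of q^2 p and q p^2, and quantizes the classical quantity q^2 p^2 along the two routes q^2 p^2 = {q^3, p^3}/9 = {q^2 p, q p^2}/3. All names are hypothetical.

```python
from math import comb, factorial

HBAR = 1.0  # assumption: units with hbar = 1

def multiply(x, y):
    """Product of two operators given as {(a, b): coeff}, meaning sum coeff * Q^a P^b.

    Uses P^b Q^c = sum_k (-i hbar)^k k! C(b,k) C(c,k) Q^(c-k) P^(b-k),
    which follows from [Q, P] = i hbar.
    """
    out = {}
    for (a, b), cx in x.items():
        for (c, d), cy in y.items():
            for k in range(min(b, c) + 1):
                coeff = cx * cy * (-1j * HBAR) ** k * factorial(k) * comb(b, k) * comb(c, k)
                key = (a + c - k, b - k + d)
                out[key] = out.get(key, 0) + coeff
    return out

def add(x, y, scale=1.0):
    """Return x + scale * y."""
    out = dict(x)
    for key, c in y.items():
        out[key] = out.get(key, 0) + scale * c
    return out

def bracket(x, y):
    """Quantum counterpart of the Poisson bracket: [x, y] / (i hbar)."""
    comm = add(multiply(x, y), multiply(y, x), scale=-1.0)
    return {key: c / (1j * HBAR) for key, c in comm.items() if abs(c) > 1e-12}

Q = {(1, 0): 1.0}
P = {(0, 1): 1.0}
Q2, P2 = multiply(Q, Q), multiply(P, P)
Q3, P3 = multiply(Q2, Q), multiply(P2, P)

# Twice the symmetrized operators assumed as images of q^2 p and q p^2.
sym_q2p = add(multiply(Q2, P), multiply(P, Q2))   # Q^2 P + P Q^2
sym_qp2 = add(multiply(Q, P2), multiply(P2, Q))   # Q P^2 + P^2 Q

route_1 = {k: c / 9 for k, c in bracket(Q3, P3).items()}
route_2 = {k: c / 12 for k, c in bracket(sym_q2p, sym_qp2).items()}  # (1/3) * (1/2)**2

difference = {k: c for k, c in add(route_2, route_1, scale=-1.0).items() if abs(c) > 1e-12}
print(route_1)
print(route_2)
print(difference)
```

Both routes give the same terms in Q^2 P^2 and Q P, but the constant terms differ by hbar^2/3, so the two operators assigned to the single classical quantity q^2 p^2 are different.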
1.10.3 Classical statistical mechanics
The analogy of quantum mechanics and classical mechanics is especially clear in the following formulation of classical statistical mechanics. In the classical formalism the state is described by a probability measure (cf. appendix A. 12) on phase space. Such a measure induces a probability distribution on phase space13, satisfying:
Here the integration is carried out over the whole phase space. Restricting the integration to either or yields the classical probability distributions of and respectively:
13 In the notation we restrict ourselves to one single pair of canonical coordinates. Extension to an arbitrary number of coordinates is unproblematic.
The dispersionless state
is a special probability distribution, corresponding to the (non-statistical) trajectory in phase space of a classical particle if satisfies the classical Hamilton equation (1.110). Let this trajectory start at in point Then, denoting by the state (1.117) that is involved, we have
If at we have an arbitrary probability distribution then
The evolution equation satisfied by is the Liouville equation. This equation can be found by taking into account the time dependence of (1.117). We get:
If Dirac quantization (1.112) is applied, this equation is completely identical to the Liouville-von Neumann equation (1.44). Equation (1.119) could, hence, be seen as the Schrödinger picture of classical statistical mechanics. There exists a considerable analogy of classical statistical mechanics and quantum mechanics in the sense that there also is a Heisenberg picture. In order to see this we consider the expectation value of a classical quantity (like position, momentum, or energy) represented by a function on phase space. The classical expectation value is the quantity
This quantity should be compared with the quantum mechanical quantity (1.39). Using (1.118) we find the time dependence of as
With (1.117), after integration over
Putting
and
it follows from (1.121) that
this gives
This is the Heisenberg form of the expectation value, in which not probability distribution but the physical quantity is time dependent. The equation satisfied by follows directly from (1.122) as
It can be considered as the Heisenberg picture of classical statistical mechanics. Note that the equations of and are transformed into each other by the transformation This is completely analogous to equations (1.44) and (1.46).
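The equivalence of the two classical pictures can be checked numerically. The sketch below is an illustration under assumptions that are not in the text: a harmonic oscillator with H = (p^2 + q^2)/2, a Gaussian initial density centred at (1, 0), and the quantity f = q. In the Schrödinger-type calculation the density is transported along the flow; in the Heisenberg-type calculation the quantity is transported instead.

```python
import math

SIGMA = 0.5

def rho0(q, p):
    """Initial phase space density: a Gaussian centred at (q, p) = (1, 0)."""
    return math.exp(-((q - 1.0) ** 2 + p ** 2) / (2 * SIGMA ** 2)) / (2 * math.pi * SIGMA ** 2)

def flow(q, p, t):
    """Hamiltonian flow of H = (p**2 + q**2) / 2 over a time t."""
    return (q * math.cos(t) + p * math.sin(t), p * math.cos(t) - q * math.sin(t))

def phase_space_integral(func, half_width=5.0, n=80):
    """Riemann sum of func(q, p) over a square grid in phase space."""
    h = half_width / n
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            total += func(i * h, j * h)
    return total * h * h

def mean_q_schroedinger(t):
    # The density is transported along the flow: rho_t(x) = rho_0(flow_{-t}(x)).
    return phase_space_integral(lambda q, p: q * rho0(*flow(q, p, -t)))

def mean_q_heisenberg(t):
    # The quantity is transported instead: f_t(x) = f(flow_t(x)), here f = q.
    return phase_space_integral(lambda q, p: flow(q, p, t)[0] * rho0(q, p))

t = 0.7
print(mean_q_schroedinger(t), mean_q_heisenberg(t), math.cos(t))
```

Both calculations give the same expectation value, which for this initial density equals cos(t). The two integrands are related by reversing the direction of the flow, in keeping with the time-reversal relation between the two evolution equations noted above.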
Functional-valued measures in classical mechanics
If functions are continuous, it is possible to write classical probabilities in the form (1.120). Thus
This defines the classical analogue of a POVM (cf. section 1.9 and appendix A.12.3), viz, the functional-valued measure14 (FVM) where is an arbitrary point of phase space, and
(compare (A.118) through (A.119)). Other FVMs are obtained from this FVM by partitioning phase space into disjoint subsets such that coincides with the whole of phase space (cf. (A.122)). FVM consisting of the indicator functions of subsets defines the probabilities that for state function the phase space point is in
The partial ordering relation between FVMs, presented in appendix A.12.3, can be translated in terms of information regarding the state function as the partitioning is coarser, less information on is contained in the expectation values of the elements of the FVM through relation (1.125). In case of the FVM defined in (A.122) the information is even distorted: is the conditional probability that an event, taking place in is counted as happening in FVM is complete because its expectation values completely determine state function For a long time the existence of a complete FVM has been seen as a fundamental difference between classical and quantum mechanics. Indeed, the PVMs of the standard formalism are all incomplete: the probability distribution does not yield any information on the non-diagonal elements of density operator in the corresponding representation, not even if all projection operators are one-dimensional (and, hence, the standard observable is maximal).
14 This encompasses the concept of ‘measurable function’ so as to exhibit the analogy between classical and quantum mechanical formulations.
POVMs: bridging the gap between standard quantum mechanics and classical mechanics
It is important to note that the introduction of POVMs has brought a drastic change in this situation: in contrast to complete PVMs, complete POVMs do exist, in the sense that the probability distribution described by such a POVM contains sufficient information to completely determine the density operator (see also section 3.3.6). An example is the POVM in which are the coherent states defined in appendix A.4 (compare (A.37)). We shall see in section 1.11.3 that the probability distribution corresponding to this POVM completely determines the density operator (by (1.148) and (1.149), taking In chapter 8 a number of measurement procedures of observables corresponding to such complete POVMs will be discussed. Such measurements are clearly outside the domain of application of the standard formalism. Due to the existence of complete POVMs the generalization of the concept of a quantum mechanical observable in a certain sense bridges the gap between the classical theory (in which there are complete measurements too) and the standard formalism of quantum mechanics. For this reason it is not surprising that there is a close relation between POVMs and so-called phase space representations of quantum mechanics, in which it is attempted to formulate this latter theory analogously to classical statistical mechanics.
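The contrast between an incomplete PVM and a complete POVM can be seen in the simplest possible setting. The sketch below does not use the coherent-state POVM of the text. It assumes instead a two-level system whose density operator is parametrized by a Bloch vector r as (I + r.sigma)/2, and a four-outcome POVM with elements (I + n_k.sigma)/4 along tetrahedral directions n_k. These choices and all names are illustrative.

```python
import math

# Tetrahedral directions (illustrative choice; not the coherent-state POVM of the text).
S = 1 / math.sqrt(3)
DIRECTIONS = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pvm_probabilities(r):
    """Outcome probabilities of the PVM of sigma_z for Bloch vector r."""
    return [(1 + r[2]) / 2, (1 - r[2]) / 2]

def povm_probabilities(r):
    """Outcome probabilities Tr(rho M_k) for M_k = (I + n_k . sigma) / 4."""
    return [(1 + dot(n, r)) / 4 for n in DIRECTIONS]

def reconstruct(probs):
    """Recover the Bloch vector from the POVM probabilities: r = 3 sum_k p_k n_k."""
    return tuple(3 * sum(p * n[i] for p, n in zip(probs, DIRECTIONS)) for i in range(3))

r1 = (0.3, -0.4, 0.5)
r2 = (-0.6, 0.1, 0.5)   # same z component as r1, but a different state
print(pvm_probabilities(r1) == pvm_probabilities(r2))
print(reconstruct(povm_probabilities(r1)), reconstruct(povm_probabilities(r2)))
```

The PVM cannot distinguish the two states, because its probabilities depend only on the diagonal elements in the chosen representation. The four POVM probabilities, by contrast, fix all three components of the Bloch vector and hence the full density operator.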
1.11 Phase space representations
1.11.1 Introduction
Similarities and differences of classical and quantum mechanical statistics have always been important sources of inspiration. The question of whether it is possible to treat both kinds of statistics in an analogous way has given rise to representations of quantum mechanics known as phase space representations. In such interpretations
quantum mechanics is formulated as much as possible analogously to classical statistical mechanics [62]. This implies that quantum mechanical states and observables are represented by functions on a phase space, and that the quantum mechanical expectation values (1.39) are written in the form (1.120). Thus,
It is evident that, in order that such a representation be possible, requirements (1.113) on the correspondence of quantum mechanical operators and phase space functions will have to be weakened. It will turn out that in particular the product AB of operators A and B cannot be represented by the product of their phase space representatives, even if [A, B]- = O. Thus, in general, operator will not correspond to phase space function if operator A is represented by Another difference between phase space representations of quantum mechanics and classical statistical mechanics is that in general can take negative values (note, however, that the Husimi representation, to be discussed in section 1.11.3, is non-negative). Because of this, quantum mechanics is sometimes associated with negative probabilities. It is questionable, however, whether to the phase space function the physical significance of a probability distribution can be attributed. Moreover, the analogue of (1.116) need not always be satisfied. On the other hand, (1.115) will always be satisfied, because the operator A = I will always be represented by the function There exist many prescriptions for a correspondence of quantum mechanical operators and phase space functions, yielding different phase space representations. We shall discuss a number of these. In doing so we shall restrict ourselves to a system consisting of one single particle, for which the phase space is the two-dimensional space Although the representations of state vectors and observables can be very different, the representations are equivalent in the sense that the expectation values (i.e. the experimentally observable quantities) do not depend on the particular representation. This, too, is a reason to be cautious when attempting to attribute a direct physical significance to phase space functions.
1.11.2 Wigner-Weyl representation
Wigner [63] has found the following phase space representation for a state with density operator
Here is an eigenvector of position observable (as usual we have put The phase space function is known as the Wigner distribution. In principle
it contains the same information on the quantum mechanical state as does density operator This follows directly from the possibility of inverting the Fourier integral (1.127). The corresponding representation of operator A is:
A proof of (1.126) can be given as follows:
since It can directly be verified that the analogue of (1.115) is satisfied:
However, the analogue of (1.114) is not satisfied: in general the Wigner distribution is not positive definite (this can easily be seen by considering simple examples). For this reason it is called a quasi-probability distribution. It is also easy to demonstrate that the Wigner distribution satisfies relations analogous to (1.116):
In the last expression vectors are the eigenvectors of operator P defined by (A.5). This implies that the Wigner distribution has the probability distributions of the position and momentum observables as its marginals. Because of the desired analogy to classical mechanics this seemed to be a satisfying result. However, such an analogy cannot be perfect. The existence of negative probabilities must be seen as characteristic of the non-classical character of quantum mechanics. A reason to consider the (‘squeezed’) coherent states defined in appendices A.4 and A.5, as the most classical states of quantum mechanics is that these states have non-negative Wigner distributions: with (A.49) and (A.38) we find from (1.127)
For all values of and this is a positive function on phase space. For it is rotation symmetric around the point ; for the function is not rotation symmetric, but ‘squeezed’ in the direction (for ) or in the direction (for ).
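The statements that the Wigner distribution can take negative values and has the correct position marginal can be checked on simple examples. The sketch below rests on assumptions that are not fixed by the surviving text: units with hbar = m = omega = 1, the common convention W(q, p) = (1/2 pi) * integral dy psi*(q + y/2) psi(q - y/2) exp(i p y), and the two lowest harmonic oscillator eigenfunctions as test states. For real wave functions the exponential may be replaced by a cosine.

```python
import math

def psi_n(n, x):
    """Harmonic oscillator eigenfunctions for n = 0, 1 (units hbar = m = omega = 1)."""
    ground = math.pi ** -0.25 * math.exp(-0.5 * x * x)
    return ground if n == 0 else math.sqrt(2.0) * x * ground

def wigner(n, q, p, half_width=12.0, steps=600):
    """W(q, p) = (1/2pi) * integral dy psi(q + y/2) psi(q - y/2) cos(p y), for real psi."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        total += psi_n(n, q + y / 2) * psi_n(n, q - y / 2) * math.cos(p * y) * h
    return total / (2 * math.pi)

def position_marginal(n, q, half_width=8.0, steps=320):
    """Integral of W(q, p) over p; should reproduce |psi(q)|**2."""
    h = 2 * half_width / steps
    return sum(wigner(n, q, -half_width + i * h) * h for i in range(steps + 1))

print(wigner(0, 0.0, 0.0), wigner(1, 0.0, 0.0))
print(position_marginal(1, 1.0), psi_n(1, 1.0) ** 2)
```

The ground state, which is a coherent state, gives a positive value at the origin, whereas the first excited state gives a negative one. The position marginal of the first excited state nevertheless agrees with the squared modulus of its wave function.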
The representation (1.128) is referred to as the Wigner-Weyl representation, because it was earlier found by Weyl [64] on the basis of the idea that a correspondence between operators and phase space functions might be established by means of Fourier transformations in phase space. With the help of (A.21) it, indeed, can easily be seen that the Wigner distribution (1.127) and the function (1.128) can be written as
The Wigner-Weyl representation, as given above, treats states and observables in a slightly asymmetric way. A more symmetric representation is obtained by defining operators
By means of (A.25) these operators can be demonstrated to satisfy
Then and can be represented according to
All operators A (including density operators) can be expanded in terms of the operators according to (compare (A.26))
Because of (1.134) this is in complete agreement with (1.126), in which the phase space representations of and A are now given by and respectively. Expression (1.136) can be seen as an expansion of a Hilbert-Schmidt vector (cf. section 1.4.3) in terms of an (in the sense of Dirac) complete orthonormal set of vectors with expansion coefficients The set can be obtained by the unitary transformation (1.133) from the complete orthonormal set allowing a different expansion:
An advantage of expansion (1.136) is that the operators are Hermitian. Often the quantity is represented in terms of creation and annihilation operators and
1.11.3 Husimi representation
Probability distributions on phase space which, unlike the Wigner-Weyl distribution, are non-negative, do exist. Husimi [65] has found the following representation:
in which vector is given in the position representation by
Parameter can be chosen arbitrarily in interval Hence, there actually exists a whole class of representations, one for each value of Note that is closely related to the coherent states, defined in appendix A.4. We have (compare (A.38) and (A.49)):
i.e. apart from the phase factor exp states are precisely the ‘squeezed’ coherent states defined in appendix A.5. Thus:
Here the normalization constant differs by a factor of 2 from that of (A.50) because now the variables are and and Distribution (1.141) is nonnegative because the operator is non-negative. The marginals of are easily found as
The following relation between the Husimi distribution and the Wigner distribution is now proven:
Proof:
Inserting (1.127) into the right-hand side of (1.144) yields the threefold integral
In this expression the integration of can be executed. Then, after some rearrangements of the exponential terms, the coordinate transformation leads to the equality
This agrees with (1.139) when the expectation value is written in the position representation (1.140).
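The relation between the two distributions can be checked numerically in a special case. The sketch below assumes the widely used form of this relation for the symmetric (unsqueezed) choice of the parameter and hbar = 1, namely a Gaussian smoothing of the Wigner function with kernel exp(-(q - q')^2 - (p - p')^2)/pi. It also assumes the closed-form Wigner functions of the two lowest oscillator states. Normalization conventions may differ from those of the text, and all names are illustrative.

```python
import math

def wigner_fock(n, q, p):
    """Closed-form Wigner functions of the oscillator states n = 0, 1 (hbar = 1)."""
    r2 = q * q + p * p
    if n == 0:
        return math.exp(-r2) / math.pi
    return (2 * r2 - 1) * math.exp(-r2) / math.pi

def husimi_from_wigner(n, q, p, half_width=7.0, steps=140):
    """Gaussian smoothing of W with kernel exp(-(q-q')**2 - (p-p')**2) / pi."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        qq = q - half_width + i * h
        for j in range(steps + 1):
            pp = p - half_width + j * h
            kernel = math.exp(-(q - qq) ** 2 - (p - pp) ** 2) / math.pi
            total += wigner_fock(n, qq, pp) * kernel * h * h
    return total

def husimi_fock_1(q, p):
    """|<alpha|1>|**2 / (2 pi) with alpha = (q + i p) / sqrt(2), for comparison."""
    r2 = q * q + p * p
    return 0.5 * r2 * math.exp(-0.5 * r2) / (2 * math.pi)

POINTS = [(0.0, 0.0), (1.0, 0.0), (0.5, -1.5)]
for point in POINTS:
    print(point, wigner_fock(1, *point), husimi_from_wigner(1, *point), husimi_fock_1(*point))
```

At the origin the Wigner function of the first excited state is negative, while the smoothed function vanishes there and is non-negative elsewhere, in agreement with the coherent-state expectation value computed directly.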
With (1.141) and (1.135) it follows from (1.144) that
This can in principle be inverted by means of deconvolution15 according to
Equalities (1.145) and (1.146) yield in a condensed form the relation between the Wigner-Weyl and the Husimi representation. These expressions define a transition from the orthogonal (Dirac) basis in Hilbert-Schmidt space to the non-orthogonal basis This suggests a simple interpretation of the Husimi representation. Introducing the dual basis forming together with a bi-orthonormal system (compare appendix A.8.1), viz,
15 Note that the integrand in (1.146) is very singular. The integrals exist only in the weak sense defined by the Hilbert-Schmidt inner product.
we can write down the following expansions
With (1.147) we find:
The basis can be determined by taking in (1.147) for expansion (1.145), and by expanding in terms of the complete orthogonal set Deconvolution yields
This can be inverted to give (compare (1.145) and (1.146))
Also
Expression (1.150) defines the (super)operator relating, analogously to (A.76), bases and in which is defined by
It can straightforwardly be verified that
When operators and P are expressed in terms of the ‘squeezed’ boson operators and (cf. (A.44)), then we obtain (after performing a scale transformation of the integration variables) yet another representation of the quantities and
1.11.4 representations
The Wigner-Weyl and Husimi representations are special cases of the more general correspondence of operators A and density operators to phase space functions and respectively, according to
in which operators and constitute an (in the sense of Dirac) bi-orthonormal system, i.e.
and are related to the Wigner-Weyl basis according to
For a given representation is a fixed function. Analogously to (1.149) we now have
For the Wigner-Weyl representation we find (compare (1.135) and (1.133))
for the Husimi representation
as follows directly from (1.150). In order that operators and are Hermitian, should be chosen such that
Analogously to (1.151) a (super)operator is defined by (1.154), establishing the relation to the orthogonal Wigner-Weyl basis in agreement with
The (super)operator is Hermitian if, apart from (1.156), also
Often it is also required that because in that case (this follows directly from (1.152) and (1.154) because and It is evident that this requirement is met by the Wigner-Weyl and the Husimi representations. By means of a special choice of we are able to adapt the representation to special needs. For instance, if we choose then operator is represented by the function (analogously: In the Wigner-Weyl representation this, evidently, is the case; however, in the Husimi representation it does not hold. This means that, in agreement with (1.130) and (1.131), the Wigner-Weyl distribution yields the correct quantum mechanical expectation values of observables and B(P) by taking in phase space representation (1.126) for functions and respectively. For instance,
In the other representations this does not hold in general. Denoting by and the phase space averages of and respectively, with respect to distribution and by and respectively, the analogous averages with respect to the function defined in (1.157), it follows from the convolution character of the relation between and viz,
that in general
For the standard deviation we obtain:
and analogous relations for P. For the Husimi representation this amounts to
and
Evidently, in the Husimi representation operators and P are represented by the corresponding phase space functions and respectively. However, this does not hold for and From (1.160) it is seen that the Husimi distribution is such that the (marginal) distributions of and are broadened with respect to the quantum mechanical distributions. This is in agreement with (1.142) and (1.143). In section 7.9 we shall attribute a physical significance to this. Incidentally, it should be noted that in general the function can have negative values, and, hence, and are not necessarily positive.
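The broadening of the Husimi marginals can be quantified in a simple case. The sketch below assumes hbar = 1, the unsqueezed (coherent state) choice of the parameter, and oscillator number states, for which the Husimi function is |<alpha|n>|^2/(2 pi) with alpha = (q + i p)/sqrt(2). These assumptions and all names are illustrative and are not taken from the text.

```python
import math

def husimi_fock(n, q, p):
    """Husimi function |<alpha|n>|**2 / (2 pi), alpha = (q + i p) / sqrt(2), hbar = 1."""
    half_r2 = 0.5 * (q * q + p * p)
    return half_r2 ** n * math.exp(-half_r2) / (math.factorial(n) * 2 * math.pi)

def husimi_second_moment_q(n, half_width=12.0, steps=240):
    """Phase space average of q**2 with respect to the Husimi function."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        q = -half_width + i * h
        for j in range(steps + 1):
            p = -half_width + j * h
            total += q * q * husimi_fock(n, q, p) * h * h
    return total

def quantum_second_moment_q(n):
    """<n|Q**2|n> = n + 1/2 for Q = (a + a^dagger) / sqrt(2)."""
    return n + 0.5

for n in (0, 1, 2):
    print(n, quantum_second_moment_q(n), husimi_second_moment_q(n))
```

Since the mean of q vanishes for number states, the second moments are variances. In these units the Husimi value exceeds the quantum mechanical one by 1/2 for every n, which is the variance of the smoothing Gaussian. The function q^2 therefore does not represent the square of the position operator in this representation.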
1.11.5 Relation to operator ordering
Let us consider the special case of the Husimi representation with In this case the Husimi representation (1.148) of the density operator is reduced to
Here the index refers to ‘anti-normal ordering’. This nomenclature is connected to the fact that representation (1.149) of observable is found according to
i.e. we find the representation by first writing the operator, by means of commutation relation (A. 13), in such a form that all operators are situated to the right of operators (this is called ‘anti-normal ordering’), and by next simply carrying out the substitution The proof of (1.162) makes use of (A.20) and (A.37). Using this we get from (1.149):
Hence, anti-normal ordering corresponds to the representation with
Due to the unitary equivalence of operators and this can be generalized to arbitrary values of by anti-normally ordering operator A in terms of ‘squeezed’ boson operators,
the Husimi representation (1.149) of A follows as
In a completely analogous manner it can be proven that
corresponds to a representation in which the operator representation is found by applying substitution to the normally ordered form of operator A:
such that
1.11.6 Wigner’s theorem
From (1.142) and (1.143) it is seen that too, does not have all properties of a classical phase space distribution, even though However, the marginals do not reproduce the probability distributions of observables and P. Wigner [66] has demonstrated that this is not accidental, but that it is a special case of Wigner’s theorem. Wigner’s theorem:
If a phase space distribution is a linear16 functional of the density operator and its marginals reproduce the quantum mechanical distributions and of the position and momentum observables, i.e.
then it cannot be non-negative. Proof:
Wigner considers a state, in the position representation represented by wave function
16 It is not difficult to find a non-linear functional for which Wigner’s theorem cannot be proven, e.g. the functional Since such functionals belong outside the linear framework of quantum mechanics, they are left out of consideration here.
in which and are non-vanishing only in bounded non-overlapping intervals and respectively. Hence, for all
Since
the phase space function satisfies
In this expression and are so-called cross terms. Let us assume that Since and are arbitrary (apart from normalization) it follows that Hence, From the second equation of (1.164) it now follows that
This is possible only if But this is impossible because are the Fourier transforms of which, since have bounded supports must be analytic functions of and, for this reason, cannot vanish in an open interval. This implies that at least one of the presuppositions cannot be satisfied.
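As a numerical illustration of the theorem (not of Wigner’s proof itself, which uses wave functions with bounded support), the following sketch evaluates the Wigner distribution of a superposition of two Gaussian wave packets. It assumes units with hbar = 1, real unit-width packets centred at +d and -d, and the standard definition W(q,p) = (1/pi) * integral dy psi(q+y) psi(q-y) cos(2py) for real psi. All function names and grid parameters are illustrative choices, not taken from the text.

```python
import math

def gaussian(x, center):
    """Normalized real Gaussian wave packet of unit width (hbar = 1)."""
    return math.pi ** -0.25 * math.exp(-0.5 * (x - center) ** 2)

def superposition(x, d):
    """Equal-weight superposition of packets centred at +d and -d."""
    norm = 1.0 / math.sqrt(2.0 * (1.0 + math.exp(-d * d)))
    return norm * (gaussian(x, d) + gaussian(x, -d))

def wigner(q, p, d, half_width=12.0, dy=0.05):
    """Wigner distribution of the (real) superposition, by a Riemann sum over y."""
    n = int(round(2 * half_width / dy))
    total = 0.0
    for k in range(n + 1):
        y = -half_width + k * dy
        total += (superposition(q + y, d) * superposition(q - y, d)
                  * math.cos(2.0 * p * y))
    return total * dy / math.pi

def q_marginal(q, d, p_max=6.0, dp=0.05):
    """Integrate W(q, p) over p to obtain the position marginal at q."""
    n = int(round(2 * p_max / dp))
    return sum(wigner(q, -p_max + k * dp, d) for k in range(n + 1)) * dp

d = 3.0
p_neg = math.pi / (2.0 * d)          # here the cross term cos(2 p d) equals -1
w_mid = wigner(0.0, p_neg, d)        # value midway between the two packets
marginal_at_d = q_marginal(d, d)     # position marginal at q = d
born_at_d = superposition(d, d) ** 2 # |psi(d)|^2

print("W(0, pi/(2d)) =", w_mid)
print("marginal at q=d:", marginal_at_d, " |psi(d)|^2:", born_at_d)
```

In this example the position marginal agrees with the quantum mechanical distribution, while the distribution itself is negative midway between the packets, as the theorem requires of any linear phase space representation with correct marginals.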
The proof of Wigner’s theorem, given above, is of a rather technical nature, and supplies only very indirect evidence that the impossibility of non-negative phase space functions satisfying (1.164) is a consequence of the structure of linear vector spaces and the operators defined on these. It may be more illuminating if Wigner’s theorem is formulated in terms of (positive) operator-valued measures related to analogously to (1.101):
In order that be a (non-negative) probability distribution, should be a positive operator-valued measure. Relations (1.164) can be written as
Formulated in this manner Wigner’s theorem is nothing but a proof that PVMs and corresponding to the spectral representations of the standard observables of position and momentum, are not commeasurable (cf. (1.100)). In section 1.9.2 it was demonstrated for observables with discrete spectra that commeasurability of PVMs can obtain only in case of commutativity of the PVMs. A generalization of this proof to continuous spectra [49] therefore amounts to a proof of Wigner’s theorem, demonstrating that the difference with classical phase space distributions is embodied in the basic quantum mechanical fact of the noncommutativity of the position and momentum operators. We shall, in the best tradition of the Dirac formalism, not deal here any further with the technical problems involved in continuous spectra, and shall consider the significance of Wigner’s theorem sufficiently illustrated by the proof for discrete spectra given in section 1.9.2. In case of the (non-negative) Husimi distribution Wigner’s theorem is clearly satisfied (cf. (1.142) and (1.143)).
Srinivas and Wolf [67] have proven Wigner’s theorem in yet another way, yielding better insight into its physical significance. They, too, observe that operator in (1.165) must be a positive operator in order that be non-negative for arbitrary density operators In the representation it follows by comparison with (1.159) that
Using (A.23) and (1.158) this entails Hence, for every value of and operator is a density operator of a quantum mechanical state. From requirements (1.166) it now follows that
Since the integrands are positive this implies that
i.e. in state the probability distributions of and would both be concentrated in one single point. This is in disagreement with the Heisenberg inequality (1.77).
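The remark that the non-negative Husimi distribution is consistent with Wigner’s theorem, because its marginals differ from the quantum mechanical distributions, can be checked numerically. The sketch below is a minimal illustration under stated assumptions: hbar = 1, coherent states of unit width (no squeezing), normalization Q(q,p) = |<q,p|psi>|^2 / (2 pi), and the harmonic oscillator ground state as test state. Function names and grids are hypothetical.

```python
import cmath
import math

def vacuum(x):
    """Ground-state wave function of a unit-width oscillator (hbar = 1)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

def grid(lo, hi, step):
    """Equidistant grid from lo to hi inclusive."""
    n = int(round((hi - lo) / step))
    return [lo + k * step for k in range(n + 1)]

def husimi(psi_vals, xs, dx, q0, p0):
    """Q(q0, p0) = |<q0, p0 | psi>|^2 / (2 pi), overlap by a Riemann sum."""
    amp = 0j
    for x, psi in zip(xs, psi_vals):
        coherent = math.pi ** -0.25 * cmath.exp(-0.5 * (x - q0) ** 2 + 1j * p0 * x)
        amp += coherent.conjugate() * psi
    return abs(amp * dx) ** 2 / (2.0 * math.pi)

def variance(points, weights, step):
    """Variance of a sampled (not necessarily normalized) distribution."""
    norm = sum(weights) * step
    mean = sum(x * w for x, w in zip(points, weights)) * step / norm
    return sum((x - mean) ** 2 * w for x, w in zip(points, weights)) * step / norm

dx = 0.2
xs = grid(-8.0, 8.0, dx)
psi_vals = [vacuum(x) for x in xs]

dq = dp = 0.25
qs = grid(-6.0, 6.0, dq)
ps = grid(-6.0, 6.0, dp)

# Position marginal of the Husimi distribution: integrate over p.
husimi_marginal = [sum(husimi(psi_vals, xs, dx, q, p) for p in ps) * dp for q in qs]

var_husimi = variance(qs, husimi_marginal, dq)                  # broadened marginal
var_born = variance(xs, [v * v for v in psi_vals], dx)          # |psi(q)|^2

print("position variance from Husimi marginal:", var_husimi)
print("position variance from |psi|^2:        ", var_born)
```

Under these assumptions the Husimi marginal is the quantum mechanical position distribution smeared with the coherent-state width, so its variance is twice that of the ground state itself.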
1.11.7 The Schrödinger equation in the
representation
Since we have phase space representations (1.152) of both observables and states, it should be possible to write the Schrödinger equation as an equation describing the time evolution of a (quasi-)probability distribution on phase space. This was done
for the first time by Husimi [65] (Husimi representation) and by Moyal [68] (Wigner representation). Later it was generalized for arbitrary representations by Agarwal and Wolf [69], and by Prugovecki [40]. In order to find the phase space representation, we can substitute the representations (1.152) of Hamiltonian H and density operator in the Liouville-von Neumann equation (1.44). By using (A.24), and applying a number of simple coordinate transformations in the integrals, we then find the following equality:
From (A.23) it directly follows that the expression within the square brackets must be zero. The resulting equation for can be finally reduced to the following equation for its Fourier transform
For the case of the Wigner distribution this equation was already found by Moyal [68]. By using (1.155) we get for the Husimi representation (cf. section 1.11.3):
Using (A.28) these results can be simplified in certain representations. Thus, in the Wigner-Weyl representation we obtain for a free particle
yielding for
as evolution equation:
For the Husimi representation we get
and
Phase space representations of the Schrödinger equation suggest that these equations could be describing stochastic processes on a phase space. This holds in particular for (1.171), which has the non-negative distribution functions (1.139) as solutions. Yet, we should be careful with such an interpretation. Even equation (1.171) is not a Fokker-Planck equation [70], the operator not being a positive operator on the space of phase space functions. Although the solutions of this equation remain positive for very special choices of the initial state (corresponding to functions of the form (1.139)) this can go wrong if another choice is made (e.g. de Muynck and van Stekelenborg [71]). In this connection it is interesting to consider Green’s function of the initial value problem of equation (1.171), i.e. the solution with initial condition This solution can be formally written in terms of a Fourier integral according to
In principle, the integrals can be evaluated in an elementary way. However, it turns out that the result is diverging! Evidently, the initial value problem with initial value does not have a solution (see also section 10.6.1).
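For comparison with the divergent Husimi case, the free-particle evolution in the Wigner-Weyl representation is well behaved. The sketch below assumes the standard result that the free-particle Wigner distribution obeys the classical Liouville equation dW/dt = -(p/m) dW/dq, so that W(q,p,t) = W(q - pt/m, p, 0); the units hbar = m = 1, the Gaussian initial state and all function names are assumptions made for illustration only.

```python
import math

def wigner_initial(q, p):
    """Wigner distribution of a unit-width Gaussian packet (hbar = 1)."""
    return math.exp(-q * q - p * p) / math.pi

def wigner_free(q, p, t, mass=1.0):
    """Free evolution assumed to be the classical shear q -> q - p t / m."""
    return wigner_initial(q - p * t / mass, p)

def liouville_residual(q, p, t, h=1e-4):
    """Central-difference estimate of dW/dt + p dW/dq (should vanish, m = 1)."""
    dw_dt = (wigner_free(q, p, t + h) - wigner_free(q, p, t - h)) / (2.0 * h)
    dw_dq = (wigner_free(q + h, p, t) - wigner_free(q - h, p, t)) / (2.0 * h)
    return dw_dt + p * dw_dq

def position_variance(t, q_max=12.0, p_max=6.0, step=0.1):
    """Variance of q computed from W(q, p, t) on a rectangular grid."""
    nq = int(round(2 * q_max / step))
    n_p = int(round(2 * p_max / step))
    total = 0.0
    second_moment = 0.0
    for i in range(nq + 1):
        q = -q_max + i * step
        for j in range(n_p + 1):
            p = -p_max + j * step
            w = wigner_free(q, p, t)
            total += w
            second_moment += q * q * w
    return second_moment / total   # the mean of q is zero by symmetry

var_start = position_variance(0.0)
var_later = position_variance(1.5)
residual = liouville_residual(0.7, -0.4, 1.5)

print("variance of q at t = 0:  ", var_start)
print("variance of q at t = 1.5:", var_later)
print("Liouville residual:      ", residual)
```

With these choices the position variance grows as (1 + t^2)/2, which is the familiar spreading of a free Gaussian wave packet, and the distribution stays non-negative because the initial Gaussian is merely sheared.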
Chapter 2
Empiricist and realist interpretations of quantum mechanics
2.1 Introduction
A physical theory consists in the first place of a mathematical formalism. For quantum mechanics this formalism contains two kinds of mathematical quantities, viz. on the one hand the wave function or the density operator, and on the other hand the Hermitian operator A or the positive operator-valued measure (POVM) (cf. section 1.9). Another aspect of a physical theory is its domain of application, which is the set of all physical situations for which the theory makes predictions that can be corroborated by experiment. The domain of application of quantum mechanics is atomic, nuclear and elementary-particle physics, in which the values of the theoretical parameters of mass, charge, etc. are chosen in such a way that such a corroboration is achieved. In order to make a connection with experiment it is necessary to specify correspondence rules establishing a correspondence between mathematical quantities and physical entities. Hence, the question to be answered is: “What is the physical significance of or and of A or ?” To this question different answers have been, and still are, given, corresponding to different interpretations of the mathematical formalism of quantum mechanics¹, attributing different meanings to the mathematical quantities of the theory².
1 Somewhat sloppily this is often referred to as ‘interpretations of quantum mechanics’.
2 Note that we shall not be concerned here with the question of the completeness of an interpretation, which addresses the problem of whether a physical meaning can be attributed to every mathematical quantity of a theory.
All interpretations have in common that the quantity (1.96) has the meaning of a probability, viz. the probability of a certain individual measurement result obtained if a measurement of the observable corresponding to the POVM is performed when the state is prepared according to the density operator This essentially is the Born statistical interpretation, which can be tested by repeating the experiment very often, say N times, and by comparing the relative frequency of measurement result (in the limit of large N) with the quantity (see also section 6.2.2). The interpretation restricting itself to this is called the minimal interpretation (see also section 6.4.1).
The minimal interpretation is not a complete interpretation. It might be characterized as an instrumentalist one, since the mathematical formalism is not thought to say anything about (microscopic) reality itself, but to serve only as an ‘instrument’ for the calculation of quantum mechanical measurement results. In this interpretation the quantities of the theory, like the wave function and the Hermitian operator A, have a symbolic meaning only, and need not correspond to anything existing in reality. The wave function might play a role comparable with the epicycles invented by Ptolemy in order to mathematically reproduce the observed trajectories of the planets, but without any further physical significance. For Bohr the quantum mechanical wave function had an instrumentalist significance in this sense: he considered the wave function not as a description of a microscopic object, but as a symbolic representation of “statistical laws governing observations obtainable under specified conditions” (Bohr [72], page 5, see also section 4.3).
An advantage of the minimal interpretation is that it asserts so little that no internal inconsistencies can arise. For this reason this interpretation is, to a certain extent, impeccable. In particular, the “measurement problem”, to be discussed in chapter 3, is non-existent in this interpretation.
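The relative-frequency test of the Born statistical interpretation described above can be sketched numerically. The example assumes a qubit density operator and a three-outcome ‘trine’ POVM with real matrix elements, both chosen purely for illustration; the function names are hypothetical, and the sampling simply draws outcomes from the computed probabilities rather than simulating any physical apparatus.

```python
import math
import random

def trine_povm():
    """Three-outcome POVM on a qubit: M_k = (2/3) |phi_k><phi_k|,
    with |phi_k> = (cos(k pi/3), sin(k pi/3)); the M_k sum to the identity."""
    povm = []
    for k in range(3):
        c = math.cos(k * math.pi / 3.0)
        s = math.sin(k * math.pi / 3.0)
        povm.append([[2.0 / 3.0 * c * c, 2.0 / 3.0 * c * s],
                     [2.0 / 3.0 * c * s, 2.0 / 3.0 * s * s]])
    return povm

def trace_product(a, b):
    """Tr(a b) for 2x2 matrices given as nested lists."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def born_probabilities(rho, povm):
    """p_m = Tr(rho M_m) for every element M_m of the POVM."""
    return [trace_product(rho, m) for m in povm]

def relative_frequencies(probs, n_runs, seed=0):
    """Repeat the 'experiment' n_runs times and count each outcome."""
    rng = random.Random(seed)
    outcomes = rng.choices(range(len(probs)), weights=probs, k=n_runs)
    return [outcomes.count(m) / n_runs for m in range(len(probs))]

# Illustrative density operator: unit trace, positive (determinant 0.125 > 0).
rho = [[0.75, 0.25],
       [0.25, 0.25]]

probabilities = born_probabilities(rho, trine_povm())
frequencies = relative_frequencies(probabilities, n_runs=100_000)

for m, (p, f) in enumerate(zip(probabilities, frequencies)):
    print(f"outcome {m}: Tr(rho M_m) = {p:.4f}, relative frequency = {f:.4f}")
```

For large N the relative frequencies approach the calculated probabilities, which is all that the minimal interpretation asserts.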
A disadvantage of the minimal interpretation is that it is asserting too little. In particular, it is not made explicit what precisely is meant by a ‘measurement result’. This notion can be implemented in different ways. By not specifying which implementation is chosen great confusion may arise (cf. section 5.3.1). To avoid this it is necessary to strengthen the interpretation, assuming something more about the correspondence between the mathematical formalism and physical reality. In doing so we can choose from two possibilities, to be referred to, respectively, as an empiricist and a realist interpretation of the quantum mechanical formalism, attributing completely different significances to the mathematical quantities of quantum mechanics.
2.2 Empiricist interpretation of quantum mechanics
In an empiricist interpretation emphasis is on the directly observable part of reality. In the laboratory we see measurement arrangements that, although intended
to perform measurements in the microscopic domain of quantum mechanics, are yet composed of macroscopic (although often very small) components. In these measurement arrangements one can often discern two fundamentally different parts, viz. the part having as an objective to prepare microscopic objects (like an electron emission grid, cyclotron, laser, etc.), and the part intended to register some phenomenon that can be interpreted as a measurement result (like a photo diode, bubble chamber, spark chamber, etc.). The first part will be referred to as the preparing apparatus, the second one as the measuring instrument. The measuring instrument has as an essential part a macroscopic pointer ranging over a measurement scale from which the individual measurement result can be read off. As in an instrumentalist interpretation of quantum mechanics, in an empiricist one the mathematical formalism is thought to refer to measurement results. Unlike in an instrumentalist interpretation, however, in an empiricist one it is unambiguously specified what is meant by ‘measurement result’, namely, a mark on the scale of a measuring instrument specifying a pointer position of a (macroscopic) pointer. The Hermitian operator A or the POVM representing a quantum mechanical observable is thought to be just a label of some measuring instrument or measurement procedure, reproducing the quantum mechanical probabilities (1.40) and (1.96) when applied to a state represented by the wave function or the density operator In general a measurement procedure corresponds to a well-defined measuring instrument, although it is possible that different measuring instruments measure the same observable (i.e., are labeled by the same POVM). In an empiricist interpretation the wave function and the density operator are labels too, viz. of a preparing apparatus, or, more generally, a preparation procedure.
Sometimes such a procedure can be characterized by specifying a preparing apparatus having one or more knobs by which the preparation can be varied (for instance, particle energy by varying the electric field of a cyclotron). More generally, even if no preparing apparatus is directly involved, it is still possible to see the density operator as a symbolic representation of some preparation procedure applied earlier, which determines the object’s present physical condition. In the following it will always be assumed that a preparing apparatus is present. Hence, density operator and POVM can both be considered as names of apparata, or of states of apparata. In an empiricist interpretation of quantum mechanics the mathematical formalism is thought to describe only relations between directly observable events, viz, the settings of the knobs of preparing apparata and the readings of pointer positions of measuring instruments. Hence, the formalism does not refer directly to the microscopic object: no quantity of the mathematical formalism of quantum mechanics is thought to refer to this latter part of reality. The microscopic object just does not have a place in the theory (cf. figure 2.1), at least in the sense that the formalism is thought not to make any statements about it. The formalism describes,
within its domain of application, only (cor)relations between certain (macroscopic) preparations and pointer positions of (equally macroscopic) pointers of measuring instruments. By refraining from any reference to the microscopic object itself, an empiricist interpretation of quantum mechanics exhibits a rather modest attitude as regards the physical meaning of this theory: it admits the possibility of a fundamental incompleteness of quantum mechanics (see also section 4.2). This latter theory is thought to refer strictly to observable phenomena. It will be clear that a terminology in which a Hermitian operator is called an ‘observable’ fits perfectly into an empiricist interpretation. In this interpretation undisturbed time evolution, governed by the Hamiltonian H according to (1.12), should not be seen as a description of the state evolution of the microscopic object (according to (1.11) or (1.45)) in the way this is usually done in quantum mechanical textbooks. On the contrary, the density operator should be interpreted as just another preparation procedure, obtained by performing the procedure represented by and waiting a time before performing a measurement of observable This experiment is also describable by means of a combination of the preparation and the measurement in which the latter observable represents measurement procedure retarded by time The equivalence of the Schrödinger and Heisenberg pictures (cf. sections 1.1 and 1.4.2), to the effect that seems to support an empiricist interpretation. Summarizing, an empiricist interpretation of quantum mechanics can be characterized as mapping the mathematical quantities of the theory into the set of all possible experimental procedures of preparation or measurement. Thus,
One reason to favor an empiricist interpretation of quantum mechanics could be the fact that in Heisenberg’s original invention of matrix mechanics [73] the theory was formulated so as to contain observable quantities only. Indeed, since
quantum mechanics virtually does not contain elements other than states and observables, we may conclude that an empiricist interpretation is in perfect agreement with Heisenberg’s intentions 3 . One of the most outspoken supporters of an empiricist interpretation of quantum mechanics is Wheeler [75, 76]. According to him “No elementary phenomenon is a phenomenon until it is a registered (observed) phenomenon”, indicating that the (macroscopic) observation is considered to be an essential constituent of a quantum mechanical measurement. Wheeler and Feynman [77] have demonstrated that it is possible to formulate the interaction between electrons without the intermediary of photons. Moreover, the presence of an electron is evident only because of a reaction of a detector absorbing the electron (absorber theory). Of major importance is the macroscopic character of the detector enabling direct observation. What is observed directly is a macroscopic phenomenon (i.e. the ‘clicking’ of a detector), not the microscopic object itself! In an empiricist interpretation of quantum mechanics the quantum mechanical formalism is considered as a ‘surface’ or ‘experimental’ model as defined by van Fraassen ([78], p. 113), the density operator representing the (observable) condition of preparation, and the POVM the condition of measurement. The probability distribution corresponds to van Fraassen’s surface state.
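The equivalence of the Schrödinger and Heisenberg pictures invoked earlier in this section, i.e. that ‘prepare, wait, then measure A’ and ‘prepare, then perform the retarded measurement A(t)’ yield the same expectation value, can be checked in a minimal two-level sketch. The Hamiltonian (the Pauli matrix sigma_x), the observable (sigma_z), the initial state and the units (hbar = 1) are assumptions chosen for illustration; the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def evolution_operator(t):
    """U(t) = exp(-i sigma_x t) = cos(t) I - i sin(t) sigma_x (hbar = 1)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]

def schroedinger_expectation(rho, observable, t):
    """Tr(rho(t) A) with rho(t) = U rho U^dagger: delayed preparation."""
    u = evolution_operator(t)
    rho_t = matmul(matmul(u, rho), dagger(u))
    return trace(matmul(rho_t, observable)).real

def heisenberg_expectation(rho, observable, t):
    """Tr(rho A(t)) with A(t) = U^dagger A U: retarded measurement."""
    u = evolution_operator(t)
    a_t = matmul(matmul(dagger(u), observable), u)
    return trace(matmul(rho, a_t)).real

rho0 = [[1.0, 0.0], [0.0, 0.0]]        # preparation: the 'up' state
sigma_z = [[1.0, 0.0], [0.0, -1.0]]    # measured observable
t = 0.4

value_s = schroedinger_expectation(rho0, sigma_z, t)
value_h = heisenberg_expectation(rho0, sigma_z, t)
print("Schroedinger picture:", value_s)
print("Heisenberg picture:  ", value_h)
```

Both numbers coincide (and equal cos 2t for this choice), so the formalism itself does not decide whether the waiting time is attributed to the preparation procedure or to the measurement procedure.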
3 Although in abolishing particle trajectories and in replacing them by transition probabilities Heisenberg was perhaps more inspired by a certain pragmatism than by an empiricist philosophy, he soon realized the usefulness of the latter point of view (cf. Mehra and Rechenberg [74], Vol. 2, section V.2).
2.2.1 Logical positivism/empiricism and empiricist interpretation
An empiricist interpretation seems to fit into the logical positivist/empiricist views like those of Carnap and Reichenbach, whose ideal it was to formulate scientific theories completely in terms of a so-called ‘observation language’ consisting of statements like: “The pointer of the Ammeter is on the scale position marked 4.5 A”. In order to avoid misunderstandings it is necessary to point at a number of fundamental differences between a logical positivist/empiricist conception of science and an empiricist interpretation of quantum mechanics. Moreover, an empiricist interpretation should be clearly distinguished from Machian phenomenalism, in which observation statements are thought to refer to human observational sensations rather than to pointer positions of a macroscopic measuring instrument.
The positivistic ideal stems above all from fear of the metaphysical: by completely basing a theory on observations it was hoped to eliminate any metaphysical element from the theory. It was expected that in this way all discussion about non-existing metaphysical elements could be precluded: something not existing will never be represented in the theory if the latter contains only elements the existence of which is warranted by observation. Ernst Mach, one of the important precursors of logical positivism/empiricism, perpetuated his lifelong resistance against the belief in the existence of atoms because these were not observed directly. Presumably Mach was one of the last persons to do so, even after seeing the light flashes on a fluorescent screen irradiated by alpha particles (Brush [79]): Mach saw light flashes, not helium atoms! Nowadays we have at our disposal advanced techniques like scanning tunneling microscopy (STM), virtually tracking the atoms at the surface of a crystal lattice, and thus making observation so direct that it is hardly possible to doubt the atomic constitution of matter (cf. figure 2.2). Moreover, apart from this more or less direct observation so much indirect evidence exists for the existence of atoms that it would require an extremely skeptical attitude to remain doubtful about the existence of such microscopic objects. This belief as to their existence may be extended to electrons and most other elementary particles, even though here observation is less direct. For the logical positivist/empiricist there might be a problem here: is he allowed to believe in the existence of these particles, or not?
The differences referred to above are concerned with i) the way microscopic objects are represented in the theory, and ii) the description of measurement.
i) Microscopic objects
Let us first consider the issue of microscopic objects. As far as a logical positivist/empiricist believes in the existence of atoms, this must be as a result of his observations. Certain combinations of observations (like, for instance, a track in a Wilson chamber, or a configuration like the one observed in figure 2.2) may convince him to accept the ‘particle’ concept as meaningful. Strictly speaking, however, he is
not allowed to transcend equality of this concept and the observed pattern. Hence, in the positivistic view the concept of an electron is not different from the corresponding complex of observation statements. An important consequence of this is that, since observations regarding electrons are described by quantum mechanics, the electron obtains a place within this theory, though only as a complex of observation statements related to quantum mechanical measurements. The electron is conceived as being constituted by quantum mechanical phenomena (e.g. Dancoff [80]). This may be satisfactory as long as we are content with a rather abstract image of the electron as some strange entity manifesting itself sometimes as a particle, sometimes as a wave, for this reason occasionally referred to as a ‘wavicle’. However, in applying the Born statistical interpretation of the wave function to interference experiments in which the wave character of a quantum mechanical object is supposed to be important, all empirical evidence is consistent with the assumption that the phenomena are induced by point-like objects (see also section 4.5.2, especially figure 4.4). It seems that there are actually two different ‘particle’ concepts. One is implemented by the logical positivist/empiricist concept of a ‘wavicle’ in which the particle is thought to have the extension of the wave packet by which it is represented; the other is thought of as a point-like object, to be found, on measurement, somewhere inside the wave packet. The latter ‘particle’ concept is of a much more informal nature than the former one: since a quantum mechanical state is represented by a wave function it is not at all clear how such point-like objects could be represented in the quantum mechanical formalism (a comparable problem with respect to photons will be discussed in section 2.4.4). 
It is sometimes questioned whether it is reasonable to expect that microscopic particles like electrons have properties analogous to the ones observed in the macroscopic world (i.e. wave and particle properties). Our observations suggest a fundamental difference between microscopic and macroscopic objects. As a consequence, if the quantum mechanical formalism describes relations between macroscopic phenomena of preparation and measurement (within its domain of application), then it is hardly plausible that the same theory would also describe the microscopic object itself. It is not impossible that a logical positivist/empiricist view, which does not make this distinction, could even be hampering progress by trying to understand microscopic reality by sticking too much to the macroscopic phenomena described by quantum mechanics. On the other hand, in an empiricist interpretation of quantum mechanics microscopic objects are not thought to be represented in the theory. In this view only macroscopically observable preparation and measurement phenomena find a place within quantum mechanics. This is quite independent of the question of whether one believes in the existence of microscopic objects, or not (see also section 2.3). An empiricist interpretation of quantum mechanics does not imply any ontological commitment with respect to existence or non-existence of microscopic objects, or their properties. In an empiricist interpretation the mathematical quantities of quantum mechanics are thought to refer to phenomena in preparing and measuring apparata, not to electrons or similar microscopic particles. The logical positivist/empiricist equality of the concepts of ‘microscopic object’ and ‘complex of observation statements’ is rejected. It must be stressed here with some emphasis that the reason for such a rejection is not a belief in the non-existence of microscopic objects. That we cannot see them does not entail that they do not exist⁴. Although, strictly speaking, it would be sufficient for an empiricist interpretation to assume the ontic existence only of macroscopic reality, there is no single objection against believing that the phenomena described by quantum mechanics are caused by microscopic objects (for this reason the object was not omitted in figure 2.1; see also section 2.3). On the contrary, in order that quantum mechanics be relevant to the microscopic world it is necessary that the measurements described by it should reflect that world’s properties as closely as possible. However, in contrast to the logical positivist/empiricist approach, in an empiricist interpretation of quantum mechanics these properties are thought not to be described by the latter theory. Hence, if we would want to describe microscopic reality itself instead of (cor)relations between certain preparation acts and measurement events, then we would have to develop a new theory realizing such a description. Such a theory of so-called subquantum processes would possibly relate to quantum mechanics like this latter theory relates to classical mechanics. We shall deal with subquantum theories in chapter 10. They are often called ‘hidden-variables theories’, to indicate that, apart from quantum mechanical observables, ‘unobservable’ (hidden) quantities may also occur in the theory.
Such a theory could, for instance, possibly describe the property of a microscopic object (hence, a non-quantum mechanical property!) causing the measuring instrument to jump to pointer position when it is brought into interaction with the object. As is apparent from the possibility of combining an empiricist interpretation of quantum mechanics with the possibility of hidden-variables theories, this kind of empiricism is very different from a logical positivist/empiricist conception of science, the latter abhorring hidden variables because of their alleged metaphysical character. However, such theories have a metaphysical status only when their application is restricted to the domain of validity of quantum mechanics. Subquantum theories may lead to experimentally testable consequences outside this latter domain. For instance, the Bell inequality, to be discussed in chapters 9 and 10, has been instrumental in (partly) relieving hidden-variables theories from their metaphysical status by suggesting experiments not a priori thought to belong within the domain of quantum mechanics. Without a subquantum theory transcending quantum mechanics it is impossible to get any idea about the boundaries of the domain of the latter theory. This makes it virtually impossible to devise experiments leading outside these boundaries. It seems that the logical positivist/empiricist approach has great difficulty in recognizing such boundaries, thus promoting a belief in the completeness of quantum mechanics (cf. section 4.2.1). By not a priori restricting discourse on microscopic objects to quantum mechanics, an empiricist interpretation of quantum mechanics might, finally, be more fruitful than a logical positivist/empiricist attitude.
4 Presumably for this reason Reichenbach [81] refers to microcosmic and macrocosmic objects rather than microscopic and macroscopic ones.
ii) Measurement
As a second difference between an empiricist interpretation of quantum mechanics and a logical positivist/empiricist view we should point at the description of the measurement process. In the latter view it is thought necessary that the measurement process is not described by the theory being tested, i.e. quantum mechanics, but by a thoroughly verified pre-theory like classical mechanics (see also Bohr’s ideas on this subject, to be discussed in sections 4.3 and 4.3.4). An attempt at a logical positivist/empiricist axiomatization of quantum mechanics is due to Ludwig [82], trying to implement Bridgman’s operational ideas (e.g. [83]). His point of departure is the same as that of an empiricist interpretation, viz, the mathematical representation of preparation and measurement procedures by density operators and Hermitian operators, respectively 5 . This generates an ‘observation language’ (Ludwig calls this an ‘axiomatic basis’) in terms of which a theoretical concept like the microscopic object can be defined. This latter object is actually identified with the collection of possible preparations and measurements that can be executed, with results that are interrelated in such a way that they can be interpreted as being caused by an object (see the above discussion of the first point of difference). The important issue in the logical positivist/empiricist view is the theory-independence of the observation language (i.e. independence of the theory to be tested), meant to warrant an empirical basis of the theory. Unfortunately it is not very well known among physicists that it is mainly this requirement of theory-independence of observations that has led to a breakdown of the logical positivist/empiricist program (see, for instance, Suppe [84]). A detailed analysis of a measurement procedure testing a certain theory always needs for its description the very theory being tested. Quantum mechanics offers an interesting case in point. 
In studying the transmission of microscopic information from a microscopic object to a macroscopic measuring instrument it would hardly be reasonable to expect that quantum mechanics, being the very theory meant to describe the microscopic domain of experimentation, would not be involved. This will be discussed more extensively in section 4.3.4 while criticizing Bohr’s correspondence principle, which seems to fit into the logical positivist/empiricist program. Suffice it to say here that the quantum mechanical description of the measurement process (see chapter 3), found to be necessary in numerous examples of measurement procedures performed in actual practice (cf. chapters 7 and 8), and the concomitant necessity of a generalization of the concept of a quantum mechanical observable (cf. section 3.3), have been major motives for developing an empiricist interpretation as offered here. It should be noted that this interpretation is not based on any preconceived operationalistic ideas in the sense of Bridgman and Ludwig, but that the interpretation is induced by a quantum mechanical analysis of quantum mechanical measurement, as well as by the way the relation between theory and experimental measurement results is implemented in actual experimental practice.
5 Note that Ludwig refers to registration procedures rather than to measurement procedures.
2.3
Realist interpretation of quantum mechanics
Before starting the discussion of a realist interpretation of quantum mechanics it is necessary to distinguish clearly between two concepts of realism (compare Hacking [85]), viz. i) a concept of realism at the ontic level, related to such questions as: “What is reality?”, and “Does anything exist outside the observation phenomena?” (this is the question whether the moon exists when nobody looks [86]); this ontic concept of realism should be opposed to the concept of idealism, the idea that reality is only in the senses or in the observer’s consciousness; ii) a concept of realism at the epistemic level of the significance of theories, concerned with the interpretation of terms of a theory. The epistemic concept of realism should be opposed to the empiricism of an empiricist interpretation as discussed in section 2.2 (interpreting the mathematical quantities of a theory in a different way), or even to instrumentalism (not interpreting the mathematical quantities at all). The first concept of realism is not at stake here. Nowadays it would be difficult to find a physicist who is not a realist in the ontic sense. Questions regarding the existence of the moon when nobody looks, once jokingly(?) put by Einstein ([87], p. 5), may have been hotly debated in earlier centuries, but they are no longer interesting in present-day physical discourse. The discussion of the present section deals with the second issue, i.e. the question of epistemic realism, ontic realism being a presupposition of all interpretations of quantum mechanics. The correspondence rules mentioned in section 2.1 can function properly only as a mapping from the theory into physical reality if the physical
entities do “really” exist. The choice between an empiricist and a realist interpretation of quantum mechanics is a choice between different rules of correspondence, in the sense that in the different interpretations the mathematical quantities of quantum mechanics are thought to be mapped into different parts of reality. Whereas in an empiricist interpretation of quantum mechanics the mapping is thought to be into the macroscopic reality of preparing and measuring apparata, in a realist interpretation the mapping is thought to be into microscopic reality itself. Hence, in a realist interpretation of quantum mechanics the quantities of the theory are thought to correspond to properties that microscopic objects possess in reality⁶. Thus,
Of course, there often is also a preparing apparatus to bring the object into a certain state, and there must be a measuring instrument to determine the physical quantity;
6 It might be felt as a problem of terminology that realist and empiricist interpretations are both ‘realist’ in the sense of mapping the theory into reality, be it into different parts of reality. Possibly for this reason different nomenclature has been proposed in the past. Thus, in order to emphasize the (microscopic) object as constituting the range of the interpretation, the term ‘objectivist’ instead of ‘realist’ has been used (de Muynck [88]); for analogous reasons ‘instrumental’ instead of ‘empiricist’ has been coined. As these alternative characterizations are liable to confusion, too, in the following we shall stick to the more usual ‘realist’ versus ‘empiricist’ ones. Sometimes the nomenclature ontic versus epistemic is used (e.g. Primas [89]) to indicate that according to the former interpretation quantum mechanics is about reality itself, whereas according to the second one it is only about our knowledge of reality. Employing the argumentation that quantum mechanics is a physical theory, not psychology, this distinction is sometimes used to defend an ontic interpretation. This nomenclature must be distinguished from the one used in this book in juxtaposing realist and empiricist interpretations. In both of these latter interpretations quantum mechanics is about reality, be it about different parts of reality. Independently of its interpretation, a physical theory is a (mathematical) representation of our knowledge about reality, and is, hence, always epistemic (a detector in the laboratory is hit by an electron, not by a wave function). For this reason the opposition ontic-epistemic is a bit confusing if used in this context. The “solution” to consider a wave function not as a physical object but as information should rather be considered a point of departure than solving a problem.
however, in a realist interpretation these are left out of consideration, as is usually done in quantum mechanical textbooks (cf. figure 2.3). For most physicists a realist interpretation of quantum mechanics will perhaps be the most familiar one, and, possibly, even the most plausible one. Nowadays physicists are convinced of the existence of microscopic objects like atoms, electrons and other elementary particles, held responsible for the transmission of effects from preparing apparatus to measuring instrument. They consider it self-evident that, since quantum mechanics has especially been devised to describe phenomena related to the existence of microscopic particles, it must describe just these particles, much in the same way as classical mechanics is thought to describe macroscopic objects. This analogy has given rise to the idea that the quantum mechanical wave function should replace the classical phase space point as a description of a microscopic particle, and, hence, is related to the particle itself rather than to macroscopic preparation phenomena. Analogously, a quantum mechanical observable is viewed as a property of the microscopic particle rather than as a (directly observable) property of the measuring instrument. In particular, the essential role of measurement in quantum mechanics (as a process of human intervention) is often (e.g. Einstein⁷, Bell⁸ [92], Everett [93]) felt as undesirable, and attempts are made to formulate the theory without any reference to measurement. There is still another reason why physicists may be somewhat reluctant to accept an empiricist interpretation of quantum mechanics, and may be more inclined toward a realist one. We are not satisfied by merely accepting physics as a description of phenomena. We also ask our theories for explanations. Physical theories are all formulated essentially in terms of ‘cause’ and ‘effect’, where a ‘cause’ is seen as a (partial) explanation of an ‘effect’.
The microscopic object has such an explanatory function: it constitutes a causal relation between preparation and measurement (cf. figure 2.1). Without the microscopic object such a causal relation would be absent. Then we could study only correlations between preparation and measurement events without being able to explain the correlations by means of signals transmitted from the preparing to the measuring instrument. In an empiricist interpretation quantum mechanics is thought not to describe such explaining causal mechanisms. A proponent of such a view is van Fraassen [78].
7 “What does not satisfy me in that theory, from the standpoint of principle, is its attitude towards that which appears to me to be the programmatic aim of all physics: the complete description of any (individual) real situation (as it supposedly exists irrespective of any act of observation or substantiation) [emphasis added, WMdM]” ([90], p. 667). And: “Physics is an attempt conceptually to grasp reality as it is thought independent of its being observed” ([91], p. 81).
8 For instance: “However the idea that quantum mechanics, our most fundamental physical theory, is exclusively even about the results of experiments would remain disappointing.” And: “To restrict quantum mechanics to be exclusively about piddling laboratory operations is to betray the great enterprise.”
For many physicists this might be a reason to reject the empiricist interpretation. Thus, Peres [94] seems to equate an empiricist interpretation (denying the existence of measurement results of unperformed experiments) with obedience to the maxim: “Thou shalt not think”. Those who cannot refrain from thinking will, according to Peres, look for causal explanations on the basis of microscopic processes⁹. In a realist interpretation, in which the wave function is thought to represent the state of a (microscopic) object and a Hermitian operator to represent a certain physical property (like position, momentum, etc.) of this object, the “strangeness” of the quantum world might be thought to be explained by the “strangeness” of the quantum mechanical formalism. A realist interpretation of quantum mechanics, in the sense defined above, is considerably closer to the way we are used to thinking in classical mechanics than an empiricist one. Although, perhaps, electrons are not point particles, we could try to keep thinking of them as localized entities to be described by quantum mechanical wave packets (see, however, section 6.3.1). Under certain conditions it is possible to attribute to an electron a well-defined value of momentum also in quantum mechanics, analogously to the way this is done in classical mechanics. In this way we might try to understand physical effects like conduction in metals and superconductors, or pressure in neutron stars, in terms of (quantum mechanical) properties of the constituting entities. This would seem to be particularly plausible if, as in the latter example, there are no obvious preparing and measuring apparata interacting with the object. In particular, elementary particle theories like quantum chromodynamics and string theories seem to favor a realist interpretation because the elementary concepts of these theories seem to be far removed from direct observation.
At this moment the realist interpretation of quantum mechanics is the predominant one, both in textbooks and in the scientific literature. Admittedly, in an introductory chapter some attention is often paid to the struggle of pioneers of quantum mechanics like Bohr, Heisenberg, and Einstein with the meaning of this theory, and the role of the measuring instrument is discussed in connection with the Heisenberg inequality (1.77). However, when it comes to application of the theory, then in general the terminology becomes a bluntly realist one: the electron “is in a certain quantum state”, or “has a certain value of momentum”. Wave function and observable are treated as properties of the microscopic object. Preparation and measurement remain outside the discussion. The measurement result, obtained on measuring a quantum mechanical observable, is often thought to be explainable by the fact that the object already had the value of the observable before the measurement was carried out: we would find, on measurement, a value of momentum because the particle had this value beforehand.
9 On the other hand, when considering the quantum mechanical state vector as a representation of a preparation procedure, Peres [95] seems to take a more empiricist (or operational) position.
It is emphasized here that this view is in disagreement with the Copenhagen (orthodox) interpretation to be discussed in chapter 4 (see especially section 4.7). In order to be able to arrive at an assessment of the relative merits of this latter interpretation as compared with alternative views, it is necessary to distinguish between two different versions of the realist interpretation of quantum mechanics, viz. an objectivistic-realist interpretation and a contextualistic-realist one. In the former the quantum mechanical properties of the object are thought to be objective¹⁰ properties, possessed by the object independently of observation, i.e. possessed when, prior to measurement, the object could be considered as an isolated or closed system. On the other hand, in a contextualistic-realist interpretation the object is thought to have its quantum mechanical properties only within the context of the measurement (see also section 2.4.5). Thus,
In a contextualistic-realist interpretation quantum mechanics is thought not to describe a closed system, but an open one, co-determined by its environment (of which the measuring instrument is a part). In a realist interpretation of quantum mechanics states and observables can both be considered as properties of the object, either in an objectivistic or in a contextualistic sense. Often only the observables are explicitly discussed in one of these realist ways, while the state is left outside the discussion, or is even treated in an instrumentalist sense. The attribution of values to observables, in the sense of objective properties possessed by the object independently of observation, plays an important role in the discussion on the foundations of quantum mechanics. For this reason this version of the objectivistic-realist interpretation is explicitly stated here as a separate principle, to be referred to as the ‘possessed values’ principle:
10 The notion of ‘objectivity’ is used here in the sense of ‘independence of the observer including his measuring instruments’. In order to avoid confusion with the objectivity/subjectivity dichotomy it is often referred to as ‘non-contextuality’ (cf. section 1.8.1).
Since the ‘possessed values’ principle is part of an objectivistic-realist interpretation of quantum mechanics it follows that, to the extent to which the ‘possessed values’ principle creates problems, an objectivistic-realist interpretation of quantum mechanics must be considered as problematic. It is essentially this principle which is applied by Einstein in the EPR problem, to be discussed in chapter 5, used by him to attack the Copenhagen interpretation. Einstein’s criticism of the Copenhagen interpretation can be understood as criticizing the fact that this interpretation is not an objectivistic-realist one (see also Guy and Deltete [96]). The ‘possessed values’ principle plays a role in certain derivations of the Bell inequality, to be discussed in chapter 9.
2.4
Empiricist or realist interpretation: which one to choose?
2.4.1
Logical positivism/empiricism and realist interpretation
As we saw in section 2.2, logical positivism/empiricism and the empiricist interpretation of quantum mechanics have largely the same roots. Nevertheless, a logical positivist/empiricist attitude may not at all be favorable toward an empiricist interpretation, but even be conducive to a realist one. This rather paradoxical state of affairs is caused by the positivistic view that the concept of a microscopic particle is not different from the corresponding complex of observation statements. Hence, in the logical positivist/empiricist approach quantum mechanics seems to describe the microscopic object itself. The desire to have a description of the microscopic object itself, however, is also the motive behind a realist interpretation of quantum mechanics. It is attempted to interpret the quantum mechanical formalism as a description of the microscopic world, the latter being considered to be as different from the macroscopic world as quantum mechanics is different from classical mechanics. This may be the explanation of the symbiosis, to be observed in many textbooks of quantum mechanics, of a logical positivist/empiricist philosophy and a realist interpretation of the quantum mechanical formalism¹¹. Such a realist tendency is not consistent with an empiricist philosophy, however, and may be seen as an inadvertent effect of the logical positivist/empiricist view, caused by the latter’s reluctance to make any statement referring to a reality behind the phenomena. Such a view has a major drawback if the phenomena, described by the theory, would turn out not to be all there is in reality, i.e., if microscopic objects
11 Schrödinger’s interpretation of the quantum mechanical wave function seems to be a typical example of such a blend of positivistic and realist thinking (cf. Bitbol [97], section 1.5).
would exist, inducing the phenomena, but not to be identified with them. Although the logical positivist/empiricist approach is cautious in the sense that it does not want to attribute to a microscopic object properties other than observed ones, it is at the same time incautious in the sense that it tries to interpret properties of macroscopic preparing and measuring instruments as properties of a microscopic object. The former do not transcend the directly observed phenomena (like, for instance, Mach’s light flashes), and, hence, do not yield any empirical motive for introducing microscopic objects behind the phenomena. Therefore the logical positivist/empiricist approach seems to carry us quite a bit further than is justified on strictly empirical grounds. If the microscopic and the macroscopic worlds are as different as they seem to be, and if quantum mechanics describes relations between preparation and measurement phenomena, then it is not probable that the very same formalism will also describe the microscopic reality behind these phenomena. Negligence of this has given rise to considerable problems in the foundations of quantum mechanics (cf. chapter 3). In this book the idea is defended that these problems are mainly due to the assumption that the quantum mechanical formalism describes the microscopic object itself rather than preparation and measurement phenomena. It is a kind of a paradox that, evidently, the logical positivist/empiricist caution with respect to the characterization of a microscopic object seems to contribute to these problems rather than to solve them.
2.4.2
The classical paradigm
As already remarked in section 2.3, the tendency to interpret realistically the quantum mechanical formalism stems above all from the fact that a similar interpretation is customary in classical mechanics. The assumption that the concepts of classical mechanics correspond to really existing entities is not felt as problematic. Thus, Bell ([98], p. 52) has proposed to base the “woolly” quantum mechanical concept of ‘observable’ on the concept of ‘beable’, cast in the classical terminology of really existing objects, and giving a “precise physical” meaning to the concept of observable. The attribution of reality to classical concepts seems to be justified to a certain extent. If we do not look too carefully, the planet Mars can be considered as a point mass, or, if we look somewhat better, as a rigid body. Stated differently, there does exist in reality something (called ‘Mars’) corresponding to the theoretical terms (point mass or rigid body) of the classical theories of point particles and of rigid bodies, respectively. Hence, correspondence rules mapping the quantities of classical mechanics into reality do indeed seem possible in a certain sense. This is sometimes referred to as referential realism (e.g. Aronson [99]). It is a different question, however, whether point masses and rigid bodies do “really” exist. Put in this way this amounts to questioning the feasibility of a realist
interpretation of classical mechanics in a sense analogous to the definition given in section 2.3 for quantum mechanics. Can the attribute of ‘being a point particle’ or ‘being a rigid body’ describe a property of any object existing in reality? It is directly evident that a positive answer to this question with respect to classical mechanics is far less probable than would seem to be suggested by the casual way we are used to interpreting this theory in a realist sense. It is rather certain that the point masses and rigid bodies of classical mechanics do not exist in reality. If point masses exist at all, then these objects are certainly of a microscopic nature, and will presumably not belong to the domain of application of classical mechanics, since quantum mechanics (or even subquantum mechanics) is the theory covering the microscopic domain. Analogously, also the rigid bodies of classical mechanics do not literally correspond to something existing in reality. Billiard balls do exist in reality, rigid bodies do not. A billiard ball can be considered as a rigid body at most in the sense of van Fraassen’s surface model, referred to in section 2.2. We know, however, that we will observe an atomic structure if we observe it more carefully: “in reality” it is not a rigid body at all¹². Also here a (more) “realistic” description would require a specification of atomic movements, which could be provided by classical mechanics only in an approximate way. Our custom of seeing classical mechanics as a no-nonsense description of ‘reality as it is’ does not seem to be justified. This custom is actually based on a confusion of categories, in which from the ontic realism with respect to the existence of certain objects conclusions are drawn with respect to an (epistemic) interpretation of the theory describing these objects. The question whether the planet Mars “really” exists is an ontic one. It is related to reality itself.
However, Mars as a point mass or as a rigid body does exist only on paper, in our imagination, or in the mathematical formalism of classical mechanics. This is a purely epistemic matter, related to the way we order our knowledge. By attributing reality to the model a realist interpretation of classical mechanics makes insufficient distinction between the two categories. On the other hand, it is quite useful to characterize a billiard ball as a rigid body also in an ontic sense, thus admitting, in a common-sense way, the existence of such objects within reality. A mapping, in the sense of referential realism, of the concept of a rigid body into reality does seem to be possible in a certain sense. It should be stressed, however, that this holds true only with respect to a certain domain of experimentation and observation, viz. the domain of those phenomena which are adequately described by the model. Thus, there is a certain domain of observation in which a billiard ball behaves as if it is a rigid body, and there is a domain in which Mars can be considered to be “really” a point mass. Outside these domains, however, the objects may exhibit deviations from such model-like behavior.
12 ‘Rigidity’ is a so-called dispositional term, the meaning of which has been discussed by e.g. Hempel [100].
Put in this way it is clear that an objectivistic-realist interpretation, in the sense defined in section 2.3, is not even justified for classical mechanics: rigidity is not an objective property of any object existing in reality; due to its atomic constitution small vibrations will be present in a billiard ball under all physically realizable circumstances. Hence, a mapping, in the sense of an objectivistic-realist interpretation, of the concept of rigidity into reality is impossible. As far as classical mechanics describes reality it does not describe reality proper, but only a reality which is observed not too closely (ignoring atomic vibrations). Rigidity of a billiard ball can at best be interpreted in a contextualistic-realist sense. Since preparation and measurement are independent procedures, it might even be appropriate to distinguish between these in delimiting the experimental context. Even in classical mechanics in general there is a difference between a (contextually prepared) property of the object and a measurement result. This implies that, perhaps, classical mechanics, too, should preferably be interpreted in an empiricist sense rather than in a realist one: the theoretical concepts of classical mechanics may have to be mapped into the set of macroscopic phenomena of preparation and observation, much in the same way as discussed in section 2.2. In classical physics the measuring instrument is not conspicuously present, though, thus seemingly thwarting an empiricist interpretation of this theory. However, also in classical mechanics measurements are performed using measuring instruments (sometimes restricted to the unaided human eye). These have analogous characteristics to those discussed in section 2.2, although they are represented by function valued measures (cf. section 1.10) rather than by positive operator-valued ones. 
In any case, it is quite possible to draw a distinction between the reality of the object itself and the reality of the measuring apparatus also in classical mechanics¹³. Actually, the choice between a realist or an empiricist interpretation of classical mechanics is already inherent in Newton’s “Hypotheses non fingo”, expressing his (empiricist) decision to consider the terms in his equations as ‘describing just the phenomena’ rather than ‘representing the mechanism of gravitation’. At the level of a description of phenomena by Newton’s theory of gravitational interaction the gravitational field should be described, in agreement with figure 2.1, by a different (“sub-Newtonian”) theory, for instance, Laplace’s field theory, explaining the “action-at-a-distance” inherent in Newton’s theory. Objects in the common-sense meaning given above are sometimes called ‘percepts’ (e.g. Russell [102], p. 218). The assumption that there is an accurate correspondence between percepts and “things” is called ‘naive realism’ ([102], p. 337), according to Russell philosophically to be rejected, but yet widely applicable ([102],
13 The same is true for relativity theory, for which an empiricist interpretation seems to be the appropriate one (de Muynck [101]). Note in particular that a realist interpretation of relativity theory would imply the unusual interpretation of time as a property of the object, analogous to the position coordinates, whereas in an empiricist interpretation values of all four coordinates are interpreted as pointer readings of measuring rods and clocks.
p. 493). It seems, however, that this applicability should be restricted to the domain of macroscopic physics. If applied to quantum mechanics the “things” corresponding to the percepts are not the microscopic objects, but the macroscopic (parts of) preparing and measuring apparata. This makes naive realism inapplicable to quantum mechanics as a description of the microscopic world (although it remains applicable to the phenomena of preparation and measurement). It seems that the impossibility of equating, within the domain of quantum mechanics, percepts to (microscopic!) objects marks a more fundamental difference between classical and quantum mechanics than the generally accepted fact that within the domain of the first theory the interaction between measuring instrument and (macroscopic) object can be neglected, whereas in quantum mechanics it is not allowed to neglect the corresponding interaction with the (microscopic) object. Absence of interaction in a classical measurement may have been instrumental in establishing the idea that a classical measurement result is a true representation of a property of the object. However, although the measurement interaction is a factor of major importance in quantum mechanics, it does not seem to be responsible for the necessity of distinguishing between the ‘phenomena’ and the ‘reality behind the phenomena’, since this necessity is alike in quantum and classical physics. Since also in classical physics the model should be conceived as a description of the reality of the phenomena within the relevant domain of observation, it is necessary to distinguish two different domains of reality also here. 
The classical habit of equating the reality of the phenomena to the reality of the object itself may also have been promoted by the way, discussed in section 2.2.1, in which in a logical positivist/empiricist approach the object is represented in the theory, because here a difference between the phenomena and the reality behind the phenomena is even denied. The example of a rigid body demonstrates clearly that this is not allowed. We shall have to take into account the possibility that something analogous may be the case with respect to quantum mechanics, and that the domain of application of this theory may be restricted to the empirical domain in which the object (for instance, an electron) behaves as if it is a wave packet. The question of what an electron is “in reality” is not easier to be answered than the question of what Mars is “in reality” (see also the discussion on the completeness of quantum mechanics, section 4.2). For practical applications like the calculation of the position of Mars, or the scattering cross section of an electron (i.e. in order to describe phenomena within the domain of application of the theory) this is no problem. However, in discussing foundations it is of great importance to distinguish what has to be distinguished, and not to allow the theory to dictate how reality is constituted. If the theory would, indeed, just describe the phenomena, such an allowance could severely hamper developments leading outside its domain of application. With respect to this latter issue Heisenberg’s idea, mentioned in section 1.9 -that the standard formalism would be decisive against the possibility of a simultaneous measurement of position
and momentum- is an example preferably not to be followed (see also sections 7.9 and 7.10). The situations in classical and quantum mechanics are not essentially different in this respect. As little as it is meaningful to assume that Mars is a “real” point mass, or that a billiard ball is a “real” rigid body, is there any reason to expect that quantum mechanics will provide a literal description of reality in the sense that, for instance, an electron “is” a wave packet. Also in the case of an electron there will certainly exist “something”, described by the wave packet in a model-like sense. However, as is the case in classical mechanics, this model might be (partially) determined by the way we perform our observations. For this reason Ballentine’s reservations with respect to an empiricist interpretation (to the effect that, due to its operationalistic character, it allegedly is restricted to laboratory situations [103], section 2.1) do not seem to be conclusive: if quantum mechanics is not the ‘theory of everything’, completely describing physical reality, then there is no reason to require quantum mechanics to perform better than classical mechanics in describing the reality of the phenomena within its own domain of application. Also Schrödinger’s criticism ([104], p. 809) of an empiricist interpretation of quantum mechanical observables (as requiring a pre-established harmony, i.e. the measurement result both as a property of the measuring instrument and as a property of the microscopic object) seems to stem from a similar neglect of the possibility that a quantum mechanical measurement may not literally reveal an objective property of microscopic reality (see also section 10.6). 
The attempts at solving the problems and paradoxes arising when wave functions and observables are interpreted realistically, and are viewed, analogous to classical mechanical custom, as objective properties of the microscopic object, constitute a considerable part of the literature on the foundations of quantum mechanics. For this reason we shall have to deal with these problems, even though they do not arise in an empiricist interpretation. Since, unfortunately, the “realist” way of thinking, so common in classical mechanics, has been firmly established in quantum mechanics, too, it is necessary to pay explicit attention to it by demonstrating that these paradoxes are artefacts of a realist interpretation, and disappear if an empiricist interpretation of quantum mechanics is adopted. The ‘possessed values’ principle, introduced in section 2.3, plays an important role in generating these problems. This principle is at the basis of the ‘nonlocality’ problem (cf. section 9.4.1), which is a particularly persistent problem in a realist interpretation, seemingly defying a unification of quantum mechanics and relativity theory. For Wheeler [105] this has been sufficient reason for stating that a realist interpretation of quantum mechanics is hardly possible, and that we have to content ourselves with an empiricist one. Wheeler’s argument, however, is not decisive for everybody. Thus, Vigier [106] has tried to demonstrate that there does not exist an insurmountable contradiction between quantum mechanics and relativity theory.
2.4.3
Double role of the quantum mechanical observable
There does exist still another argument against a realist interpretation (and in favor of an empiricist one). This argument is of a methodological nature, and is related to the requirements a physical theory must satisfy. As already stated in section 2.1, it is necessary that certain (possibly not all) terms of a physical theory have a physical meaning in the sense of referring to a property of some physical object. It also seems necessary that for those terms which have a physical meaning this meaning is a unique one. Although this requirement appears to be rather innocent, it is so strong that it cannot be fulfilled in a realist interpretation of quantum mechanics. This is caused by the important role played by the measuring instrument in quantum mechanics. It should be stressed that the relative frequency calculated from the mathematical formalism of quantum mechanics is compared in the first place with the (relative) number of times the pointer of a measuring instrument points at a given position (cf. figure 2.1). It is this correspondence that is decisive for the question of whether theory and experiment are in agreement. Hence, quantum mechanical observables are in any case related to phenomena in measuring instruments. As far as quantum mechanical observables in the sense of referential realism refer to reality, this reference is primarily to the macroscopic reality of the measuring instruments, even though, in its turn, the measurement phenomenon may refer to some property of the microscopic object. However, when, as in a realist interpretation, we consider observables as properties of the microscopic object, this implies that the observable must have a double role, since the empirical relation to the measuring instrument, noticed above, must remain valid in this interpretation too.
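The comparison described above, between a calculated relative frequency and the counted pointer positions, can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: a two-outcome measurement, the standard Born rule for the calculated probabilities, and hypothetical expansion coefficients 0.6 and 0.8i; the function names are invented for this example. Note that the simulation produces nothing but pointer readings, so it says nothing about properties of the microscopic object.

```python
import random
from collections import Counter


def born_probabilities(amplitudes):
    """Calculated probabilities |c_m|^2, normalised so that they sum to 1."""
    weights = [abs(c) ** 2 for c in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]


def pointer_frequencies(probabilities, runs, seed=0):
    """Relative frequencies of pointer positions in `runs` simulated repetitions."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positions = range(len(probabilities))
    readings = rng.choices(positions, weights=probabilities, k=runs)
    counts = Counter(readings)
    return [counts[m] / runs for m in positions]


# Hypothetical coefficients (an assumption of this sketch, not taken from the text).
amplitudes = [0.6, 0.8j]
probs = born_probabilities(amplitudes)
freqs = pointer_frequencies(probs, runs=100_000)

for m, (p, f) in enumerate(zip(probs, freqs)):
    print(f"pointer position {m}: calculated {p:.3f}, counted {f:.3f}")
```

Agreement between the two columns is what decides whether theory and experiment agree; whether the counted value is in addition a property of the microscopic object is exactly the interpretative step at issue here.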
In the literature not much attention is paid to this double function: a pointer reading of a momentum meter is generally interpreted as ‘the particle’s momentum is found to have this value’ without noting the difference. An exception is Redhead [107], who, however, immediately introduces a principle of ‘faithful measurement’ to justify equating the pointer position with the value the observable had preceding the measurement.
Yet, there are two different objects involved: the microscopic object and the measuring instrument. From a methodological point of view it does not appear to be sound to assume that one single mathematical quantity is referring to both objects, or to give the quantity a double role by decree of the ‘faithful measurement’ principle. On the other hand, when we are forced to make a choice, then it is virtually self-evident that the observable must refer to the measuring instrument,
CHAPTER 2. EMPIRICIST AND REALIST INTERPRETATIONS
since this is precisely what is done in practice! It is always the relative frequencies of pointer positions that are compared with the outcomes of quantum mechanical calculations. The translation of a measurement result into a (quantum mechanical) property of the microscopic object is an addition of an interpretative nature, devoid of any meaning in an empiricist interpretation of quantum mechanics (subquantum theories being thought necessary for describing properties of the microscopic object), and not unproblematic in a realist one. A simple example is provided by an ordinary photon detector that is not 100% efficient (for a detailed treatment see section 7.2). Here it is clear that a measurement result of a certain number of photons does not necessarily mean that in reality this number of photons was present before the measurement: the detector may have missed some (see also section 3.3.3). At first sight this example seems a bit trivial, because it seems possible to cope with it by allowing as quantum mechanical measuring instruments only detectors that are 100% efficient. We must realize, however, that the assumption that the inefficiency of the detector can be explained away is itself a consequence of the classical idea that the measurement process does not play an essential role in the theory, and can be eliminated from it by means of an idealization. Yet the impossibility of such an elimination is one of the basic themes in the discussion on the foundations of quantum mechanics. It is, indeed, seen from a quantum mechanical treatment of the photon detection process [7, 108] that the probabilistic nature of quantum mechanical transitions in general causes a photon detector to have an efficiency less than 100%. In this book we shall try to demonstrate that the distinction between the quantum mechanical measurement result as, on the one hand, a property of a microscopic object, and, on the other hand, a pointer position of a measuring instrument (i.e.
a property of the measuring instrument) is crucial for an understanding of the meaning of quantum mechanics. We shall see in section 4.6 that in particular in the (orthodox) Copenhagen interpretation the measuring instrument plays an essential role in the interpretation of the formalism of quantum mechanics, more specifically, in the interpretation of Heisenberg’s inequality. The confusion with respect to the latter issue, to be discussed extensively in chapters 4 and 7, will be seen to be to a large extent a consequence of a realist interpretation, eliminating as much as possible the role of measurement in assessing the theory (compare figure 2.3). It is important to note here that a restriction of the theory to standard observables may have promoted a tendency toward a realist interpretation and its ‘possessed values’ principle (cf. section 4.7.3), because the ‘faithful measurement’ principle seems to be applicable there. This same restriction is also responsible for a fundamental confusion in the Copenhagen approach to the complementarity problem with respect to two different kinds of ‘indeterminacy’ (cf. section 4.7.4), viz. ‘indeterminacy of the preparation’ (in a certain sense to be attributed to the object itself), and ‘indeterminacy of the measurement process’ (of which the inefficient photon detector is an example). We shall see in chapter 7 that within the generalized formalism of section 1.9 it is possible to distinguish clearly between these two kinds of ‘indeterminacy’, and that the indeterminacy introduced by the measurement (which plays such an important role in the Copenhagen complementarity principle) is taken into account by the quantum mechanical formalism (see especially section 7.10.2). The ineradicable role of the measurement process in quantum mechanics is a strong argument in favor of an empiricist interpretation of this theory (see also the discussion of the contextualistic-realist interpretation in section 2.4.5), and against an a priori acceptance of the ‘possessed values’ and ‘faithful measurement’ principles inspired by the realist interpretation.
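The point made with the inefficient photon detector can be illustrated numerically. The sketch below is not the quantum mechanical treatment referred to in section 7.2; it merely assumes, for the sake of illustration, that each photon is registered independently with a probability `efficiency`, so that the number of clicks is binomially distributed. The function name `count_distribution` and its parameters are hypothetical.

```python
from math import comb


def count_distribution(n_photons: int, efficiency: float) -> list[float]:
    """Return P(m clicks | n photons) for m = 0..n.

    Assumption (not taken from the text): every photon is registered
    independently with probability `efficiency`, giving a binomial law.
    """
    if n_photons < 0:
        raise ValueError("n_photons must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    miss = 1.0 - efficiency
    return [
        comb(n_photons, m) * efficiency**m * miss ** (n_photons - m)
        for m in range(n_photons + 1)
    ]


if __name__ == "__main__":
    # Compare an ideal detector with an 80% efficient one for 3 photons.
    for eta in (1.0, 0.8):
        probs = count_distribution(3, eta)
        print(f"efficiency {eta}:", [round(p, 4) for p in probs])
```

Under this assumption only the ideal detector returns the number of clicks equal to the number of photons with certainty; for any smaller efficiency a reading of m clicks is compatible with every photon number not smaller than m.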
2.4.4 Interpretations of quantum field theory
Interpretations of quantum field theory need not be different from those of elementary quantum mechanics. Here, too, either a realist or an empiricist interpretation may be adopted. Thus, an energy observable expressed in terms of field operators can be interpreted either as referring to the total energy present at a certain time in a given region of space (realist interpretation), or as a label of a measuring instrument set up in this region (empiricist interpretation). The same holds true for the relation between the state vector and the preparation. A great advantage of an empiricist interpretation is that there is no necessity to choose between a representation of the field by either ‘particles’ or ‘waves’. In this interpretation the field is thought not to be represented in the quantum mechanical description at all. In the literature of field theory realist and empiricist interpretations can both be encountered. Thus, in classical electrodynamics we may choose between a field theory (interpreted as ‘describing an object’) and an action-at-a-distance theory (cf. Wheeler and Feynman [77]), of which the latter just describes the motions of the charged particles (the ‘phenomena’) without explicitly referring to fields that may explain momentum transfer from one particle to another. Within the domain of quantum mechanics S matrix theory has been constructed as an alternative to quantum field theory, relating asymptotic (incoming and outgoing) states without assuming a continuous time evolution (Heisenberg¹⁴ [110]). Incoming and outgoing states are directly related to observable phenomena of preparation and measurement (like scattering cross sections), and, hence, yield a formalism that is appropriate for an empiricist interpretation. Actually, Heisenberg seems to repeat here the empiricist approach he took when inventing matrix mechanics (cf.
section 2.2), trying to solve the divergence problems of (infinite-dimensional) quantum field theory by restricting himself to observable quantities (e.g. Cushing [111]). Stapp [112] has developed an interpretation of quantum mechanics on the basis of the S matrix, in which probabilities are probabilities of responses of macroscopic measuring devices. The formalism is supposed to describe correlations between macroscopic preparing and measuring devices, and not to give a description of the connection between an observed and an observing system. As in Newton’s theory of gravitation, the connection between preparing and measuring devices is thought to be achieved by means of a long-range (nonlocal) interaction between the preparing and the measuring parts of the experimental arrangement. Yet, like Newton’s theory of gravitation, S matrix theory, too, has been felt to lack a causal explanation of the long-range interaction between the preparing and measuring parts of the processes described by the formalism. Analogously to the classical theory it was thought that quantum field theory could provide such an explanation. Therefore it was attempted to derive the S matrix by means of a Schrödinger equation describing time evolution of a quantum field, much in the same way as is done for finite-dimensional systems. After the divergence problems of quantum field theory (in particular of quantum electrodynamics) could be circumvented by means of renormalization, Heisenberg returned to field theory because of its greater explanatory power (e.g. [113]). In particular Schwinger, Feynman and Dyson have made important contributions to the derivation of the S matrix from quantum field theory (Schweber [114]). In the theory of strong interactions a similar development can be observed during the sixties and seventies: at first S matrix theory was thought to yield the appropriate description (e.g. Chew [115], p. 1,2), but field theory was favored as soon as renormalization was demonstrated to be possible also in this domain (’t Hooft [116]).
¹⁴ The S matrix had been introduced before by Wheeler [109].
Due to persistent difficulties within quantum field theories these formalisms were mostly considered as more or less provisional, to be interpreted in an instrumentalist sense as a tool for calculating observable quantities (e.g. Pais ([117], p. 325): “Quantum field theory is a language, a technique, for calculating the probabilities of creation, annihilation, scatterings of all sorts of particles...”). Nevertheless, sometimes the expectation is expressed that one day the structure of the elementary particles may spring from the formulae of a well-chosen field theory or some of its refinements, thus furnishing a “Theory of everything” completely describing microscopic reality. Although quantum field theory allows an empiricist interpretation, too (in which time dependence of the observables has a meaning that is not different from the one valid in quantum mechanics of finite-dimensional systems), a tendency towards a realist interpretation can be observed also here. In particular, the elementary quanta of the field are often treated quite realistically as microscopic objects existing in reality. However, we should be very careful here: a realist interpretation of field quanta is far from unproblematic.
Identical particles and (in)distinguishability
One reason to assume that a particle picture may underlie quantum field theory is the mathematical equivalence of this theory with the elementary quantum mechanical theory of N identical particles if there is no creation and annihilation of particles [118]. Non-relativistic quantum statistical mechanics of a system of N identical particles can be cast either in terms of wave functions or in terms of quantum fields. The canonical commutation relations

$$[\hat\psi(\mathbf{r}), \hat\psi^\dagger(\mathbf{r}')]_\pm = \delta(\mathbf{r}-\mathbf{r}'), \qquad [\hat\psi(\mathbf{r}), \hat\psi(\mathbf{r}')]_\pm = 0 \qquad (2.1)$$

of the quantum field operators for fermions (+) or bosons (−), respectively, correspond to a requirement of (anti)symmetry of the wave functions of the elementary formalism of N-particle systems:

$$P\,\Psi(\mathbf{r}_1,\ldots,\mathbf{r}_N) = (\mp 1)^{\pi(P)}\,\Psi(\mathbf{r}_1,\ldots,\mathbf{r}_N), \qquad (2.2)$$

in which P denotes permutation of the particle indices, and $\pi(P)$ its parity. This correspondence suggests that a quantum field can be seen as consisting of a number of particles, distinguished from each other by labels that in the elementary formalism are represented by particle indices. These labels cannot be used, however, to distinguish the particles observationally by means of a measurement. Due to the (anti)symmetry of the wave function, measurement of any observable yields the same expectation values as measurement of the observable obtained from it by permuting the particle indices. Thus, for instance,

$$\langle \mathbf{r}_1 \rangle = \langle \mathbf{r}_2 \rangle = \cdots = \langle \mathbf{r}_N \rangle,$$
demonstrating that the particles cannot be distinguished by their positions. For this reason they are often called ‘indistinguishable’ particles. Since in quantum field theory particle indices are absent, this latter theory seems to provide a more natural way to describe such particles than does the elementary formalism. Yet, this does not at all imply that identical particles, although observationally indistinguishable, would be indistinguishable in a conceptual sense too (see also [119]). The question of whether indistinguishable particles can be treated as individuals labeled by their particle indices [119] will not be pursued here any further because this typically is a problem of a realist interpretation in which the wave function is thought to refer to the particles themselves. This is not to say that the question of whether a certain individuality can be ascribed to observationally indistinguishable particles is an irrelevant one. However, in an empiricist interpretation this question is not related to the mathematical formalism of quantum mechanics. Here, wave functions and observables refer to procedures of preparation and measurement rather than to the object. Since they yield the same measurement results, observables that differ only by a permutation of the particle indices are labels of one and the same measurement procedure. The equivalence class of all observables with the same expectation values as a given observable should preferably be represented by the symmetrized observable, because this is the only one that is independent of the individual indices. Moreover, the symmetrized observable is the only one having a counterpart in the field-theoretic description. From an empiricist point of view particle indices are redundant if there do not exist observation procedures that can distinguish between them. Under this condition the field theoretic description is superior to the elementary one. However, the proviso made above with respect to the existence of observation procedures is not a formal one, but may have operational implications. Thus, a particle index might become observable if it turns out to correspond to an observable for which a measurement procedure becomes available that was not known until then. ‘Identical’ particles may turn out to be non-identical if new degrees of freedom are taken into account. In an empiricist interpretation the question of whether wave functions always ought to be (anti)symmetric refers to the preparation procedure, and, therefore, is not related to (observational) (in)distinguishability (in contrast to what has sometimes been asserted (e.g. [120])). Why should all quantum mechanical preparations of systems of identical particles satisfy (2.1) and (2.2)? The answer to this question is part of the solution of the general problem of understanding the relation between a preparation procedure and its quantum mechanical description. Presumably, this question transcends quantum mechanics proper; a subquantum theory may be necessary for this purpose (see also chapter 10). It seems evident that the domain of application of a theory satisfying (2.1) or (2.2) is (co-)determined by the requirement that preparation procedures are such that the individual particles are correlated so as to yield equal expectation values for all N particles. It is not at all clear why preparation procedures should obey such a requirement.
As an example, a preparation procedure could be considered in which two identical particles (for instance, electrons) are prepared simultaneously in different galaxies (Mirman [121]). It is hardly to be expected that quantum mechanical measurements of the two electron positions will yield equal measurement results corresponding to a position halfway between the galaxies. It seems more likely that a formalism satisfying (2.1) or (2.2) is not applicable to preparations as considered here. Presumably, (anti)symmetry of the wave function is a consequence of certain restrictions on the preparation of the object system, causing all particles to be physically equivalent (see also section 10.6.4).
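The effect of (anti)symmetrization on position expectation values can be made concrete in a toy model. The following sketch assumes two particles on a one-dimensional lattice with real single-particle amplitudes; the lattice, the function names and the choice of two sharply localized orbitals (standing in for the two ‘galaxies’) are illustrative assumptions, not part of the formalism discussed above.

```python
from math import sqrt


def two_particle_state(phi_a, phi_b, sign):
    """Normalized psi[i][j] = phi_a[i]*phi_b[j] + sign*phi_b[i]*phi_a[j].

    sign = +1 gives a symmetric state, sign = -1 an antisymmetric one,
    sign = 0 the plain (unsymmetrized) product state.
    """
    n = len(phi_a)
    psi = [
        [phi_a[i] * phi_b[j] + sign * phi_b[i] * phi_a[j] for j in range(n)]
        for i in range(n)
    ]
    norm = sqrt(sum(c * c for row in psi for c in row))
    if norm == 0.0:
        raise ValueError("state vanishes (identical orbitals, antisymmetric)")
    return [[c / norm for c in row] for row in psi]


def mean_positions(psi):
    """Return (<x1>, <x2>), with lattice site indices as positions."""
    n = len(psi)
    x1 = sum(i * psi[i][j] ** 2 for i in range(n) for j in range(n))
    x2 = sum(j * psi[i][j] ** 2 for i in range(n) for j in range(n))
    return x1, x2


if __name__ == "__main__":
    sites = 11
    left = [1.0 if k == 0 else 0.0 for k in range(sites)]            # orbital at site 0
    right = [1.0 if k == sites - 1 else 0.0 for k in range(sites)]   # orbital at site 10
    for label, sign in (("product", 0), ("symmetric", 1), ("antisymmetric", -1)):
        print(label, mean_positions(two_particle_state(left, right, sign)))
```

In this toy model the product state assigns different mean positions to the two indices, whereas both (anti)symmetrized states give the same mean position, halfway between the two sites, for either index. This is the feature that makes the applicability of (2.1) or (2.2) to widely separated preparations questionable.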
Photons
The ‘photon’ concept is an interesting test case for the interpretation. The photoelectric effect has convinced most physicists that photons really exist. In this respect, too, the usual terminology is rather “classically realist”. A photon is seen as a representation of the particle aspect of electromagnetic radiation, and is for this reason considered as a particle-like, i.e. localized, object. For instance, at a partially reflecting mirror a photon incident in one of the input ports is thought to be always either reflected or transmitted into one of the output ports (Grangier et al. [122], cf. figure 2.4). The question is, however, whether this particle picture is represented by the formalism of quantum electrodynamics in any realistic sense. In particular, it is interesting to see whether the annihilation operator of a field mode, introduced in section 1.2, can be interpreted as annihilating a particle-like object in that mode. Such an interpretation would seem to be consistent with a conception of a photon as a particle only if the mode function is localized in a small region of space. However, often plane waves are taken as mode functions. The annihilation of a quantized excitation of such a mode seems to affect the whole space, and would hardly be interpretable as a particle-like event. As an application consider the partially reflecting mirror, with input ports labeled 1 and 2. As is well known [123, 124], a mirror with given transmission and reflection amplitudes realizes a state transformation of the two-mode state of the input field by a unitary operator determined by these amplitudes. Coherent input states are transformed (cf. (A.42)) into coherent output states. Note that also in the output states the analysis is in terms of the input mode operators; only the values of the coherent-state amplitudes have been changed. By applying (A.36) it is possible to calculate the state change of an arbitrary input state. For the input state in which one single photon is entering input port 1 we find
Analogously, for one photon entering port 2:
The outgoing state can be interpreted in terms of a photon as a particle that is either in one or the other of the output ports, with probabilities determined by the transmission coefficient and reflection coefficient of the mirror. However, it is equally possible to interpret the one-photon states in terms of the final boson creation operators (cf. (A.40)). Then each of the outgoing states is represented by one of these final creation operators acting on the vacuum. In this representation they seem to consist of one (wave-like?) photon, going partly one way and partly the other. Hence, instead of sending an incoming photon one way or the other, a realist interpretation of the formalism could assume the mirror to reshape the photon so as to split it into two parts. Evidently, the field theoretical definition of the ‘photon’ concept is tied up with a rather arbitrary choice of the mathematical representation. Many choices are possible. On the other hand, the electric field operator (1.20) is completely independent of the choice of the representation: if different photon operators are chosen by means of a transformation with a unitary matrix, then the field operator does not change if simultaneously the mode functions are replaced by correspondingly transformed ones. Hence, on measuring the electric field, or some quantity derived from it, the two representations do not yield any difference in the quantum mechanical measurement results. This implies that such measurement results cannot discriminate between the two photon models. For this reason we must be careful in interpreting measurements of the electric field by means of photon counting [7]: it is completely undefined what kind of photons are counted. A realist interpretation of photon operators therefore yields completely arbitrary notions of what a photon “is”. In particular, it is not very meaningful to assume that in the experiment of figure 2.4 photons of either one kind or the other were present prior to the measurement. An analogous conclusion can be drawn if we compare ‘squeezed’ and ‘unsqueezed’ photons (cf.
appendix A.5). It is customary to interpret an eigenvector of the number operator (cf. appendix A.3) as describing a state in which precisely the corresponding number of photons is present. However, this state vector can be expanded in a basis of ‘squeezed’ photon number states (cf. appendix A.5). Hence, the state containing a precise number of ‘unsqueezed’ photons is mathematically equivalent to a state in which a certain nonzero probability exists of measuring an arbitrarily large number of ‘squeezed’ photons. In an objectivistic-realist interpretation this, once again, causes a problem since, preceding the measurement, the ‘squeezed’ photons would seem to have the same right to be there as the ‘unsqueezed’ ones. In a realist interpretation the question is: does the state “really” contain ‘unsqueezed’ photons, or is there in reality a superposition of ‘squeezed’ photons? Do we have any reason to prefer one answer over the other? The impossibility of giving an unambiguous answer to questions like these is, in fact, the most important reason to prefer an empiricist interpretation of quantum mechanics. Such an interpretation has the advantage that no more interpretation is attached to the quantum mechanical formalism than is justified on strictly empirical grounds. By doing so we, in any case, avoid what is done in a realist interpretation, viz. interpreting in a speculative and untestable way a pointer position of a measuring instrument as a property of a microscopic object. It is in the first place this pointer position (the click in the photon detector) that has to be described by quantum mechanics. For this reason the choice of a mathematical representation may to a large extent be dictated by the measurement arrangement. By itself this does not tell us very much about the object itself, however: detection of a photon of a certain kind need not imply at all that a photon of that kind was present beforehand. In an empiricist interpretation the creation and annihilation operators need not have any direct physical meaning. In case of a photon detector such a physical meaning seems to be reserved only for the operator of the electric field. On the other hand, an empiricist interpretation of quantum mechanics does not at all imply that photons would not exist. We do interpret a click of a photon detector as an absorption of a photon that was present in the detector, and that seems to behave as a localized object. This ‘photon’ concept, however, is different from the one discussed before. It is much more heuristic and informal than the one defined within the quantum mechanical formalism by means of creation and annihilation operators¹⁵. Although this informal ‘photon’ concept is not supported by quantum mechanics, it is certainly not inconsistent with empirical evidence, and might be just as “real” as the (point-like) electron which, according to the Born statistical interpretation, finds itself at some position within the confines of a wave function. It is very well possible that a picture of an electromagnetic field as a kind of particle gas is not incorrect physically.
This, however, is certainly not described by the quantum mechanical formalism in the sense that the creation operators would create these particles. If we would yet like to have a description of electromagnetic phenomena in terms of these particle-like photons, then we would not have another choice than to develop a new theory, different from quantum mechanics (see also section 2.5.2).
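Two of the formal statements made in this section about the partially reflecting mirror lend themselves to a small numerical check. The sketch below assumes a particular and by no means unique convention: real transmission and reflection amplitudes t = cos θ and r = sin θ, with an optional relative phase. It first computes the output amplitudes of a single photon entering one port, and then verifies that a unitary change of mode operators, accompanied by the complex-conjugate change of mode functions, leaves the combination Σ a_k u_k (standing in for the positive-frequency part of the field operator) unchanged. All function names are hypothetical.

```python
import cmath
from math import cos, sin, pi


def mixing_matrix(theta: float, phase: float = 0.0):
    """2x2 unitary with t = cos(theta), r = sin(theta) (assumed convention)."""
    t, r = cos(theta), sin(theta)
    return [
        [t, -r * cmath.exp(-1j * phase)],
        [r * cmath.exp(1j * phase), t],
    ]


def one_photon_output(theta: float, port: int):
    """Amplitudes in output ports (1, 2) for one photon entering `port` (1 or 2)."""
    u = mixing_matrix(theta)
    return [u[0][port - 1], u[1][port - 1]]


def field_coefficients(u, modes):
    """Coefficients of the original operators a_l in sum_k a'_k u'_k.

    Uses a'_k = sum_l U[k][l] a_l together with the transformed mode
    function values u'_k = sum_l conj(U[k][l]) u_l at one fixed point.
    """
    n = len(modes)
    new_modes = [
        sum(u[k][l].conjugate() * modes[l] for l in range(n)) for k in range(n)
    ]
    return [sum(u[k][l] * new_modes[k] for k in range(n)) for l in range(n)]


if __name__ == "__main__":
    amps = one_photon_output(pi / 3, port=1)
    print("detection probabilities:", [round(abs(a) ** 2, 4) for a in amps])

    modes = [0.3 + 0.4j, -1.2 + 0.5j]  # arbitrary mode function values at one point
    print("field coefficients:", field_coefficients(mixing_matrix(0.7, 1.1), modes))
```

The detection probabilities depend only on t and r, whereas the second check returns the original mode function values: at the level of the field, nothing distinguishes the ‘either one port or the other’ photon from the ‘partly one way, partly the other’ photon.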
¹⁵ The existence of different ‘photon’ concepts is noted e.g. by Bachor ([125], section 3.2), and by Paul ([126], section 4.2).
2.4.5 Contextualistic-realist interpretation
In agreement with the tendency, observed in section 2.3, of interpreting quantum mechanics in a realist sense, a solution to the problem of the arbitrariness of the mathematical representation, mentioned above, is sometimes sought in the assumption that the measurement context may be instrumental in determining which representation should be singled out. This may yield a contextualistic-realist interpretation, in which quantities are thought to have reality only within the context of a well-defined measurement arrangement. Thus, the kind of photons present (e.g. the value of the ‘squeeze’ parameter) would be determined by the measurement arrangement. Analogously, the quantization axis for angular momentum (and, hence, the quanta of angular momentum) would be determined by the direction of the magnetic field that is applied. In this view quanta do exist only if the experimental measurement arrangement is actually present, because this arrangement plays an active role in their definition. This view fits into the positivistic ideal of scientific practice, in which quantities not actually measured are thought not even to be defined. No value can be attributed to an observable preceding the measurement: the observable does not exist as such because it is defined only within the context of the measurement (compare Bohr’s correspondence principle, section 4.3). In a contextualistic-realist interpretation an observable is physically realized only when it is measured (compare Jordan’s views, to be discussed in section 6.2.2), and it is determined only then which value it has. Such a contextualistic-realist view is sometimes even endorsed in case of generalized observables (e.g. Busch, Lahti and Mittelstaedt [41]), thus demonstrating that an operational approach need not automatically imply an empiricist interpretation (however, see section 7.2.2 for a criticism of such a contextualistic-realist operationalism). In general, contextualistic realism is discussed primarily with respect to observables, thus offering an alternative to the (objectivistic) ‘possessed values’ principle (cf. section 2.3). The concept of weak projection (1.72) or (1.74) seems to offer a possibility of implementing the contextualistic-realist interpretation also at the level of states.
Thus, the state obtained from the initial state by means of weak projection could be considered as representing reality within the context of a measurement of observable A. This state is often interpreted as representing an ensemble consisting of subensembles with well-defined values of A. A contextualistic-realist interpretation of this state would be consistent with a similar interpretation of observable A. Indeed, it was presumably the desire to be able to interpret not only the observable but also the state realistically that has been responsible for the idea of (weak) projection. Because the projected state is valid only in the context of a measurement of observable A, it will be referred to as a contextual state. The contextual state is often thought to be prepared by the interaction between object and measuring instrument, and, hence, to represent the final state of the object, valid after the measurement process. However, such an interpretation of the contextual state does not meet the realist goal that an individual measurement result should be explained by the state the object was in before the measurement. Moreover, as we shall see in chapter 3, a transition from initial state to final state described by weak projection, although often considered in discussions on the foundations of quantum mechanics, cannot be correct in general: respectable quantum mechanical measurement procedures exist which do not satisfy the requirement that the final state be represented by the contextual state. Hence, either the transitions (1.72) and (1.74) do not have a general relevance to quantum mechanical measurements, or they should be interpreted in a different way. Here we consider the second possibility. There might, indeed, exist a different interpretation of the contextual state, in the sense that this density operator is not thought to represent the final state of a measurement, but is considered as an alternative description of the initial state (see also section 6.6.2). The transition from the initial state to the contextual state need not be considered as the result of a quantum mechanical interaction, but might be taken in an epistemic sense. It might be compared to a transition from a description of a billiard ball as an object having an atomic constitution, to a description as a rigid body. The equality (1.75) is then comparable to the fact that a rigid body description of a billiard ball does yield the same results as an atomic description as long as we restrict ourselves to experimental contexts valid within the domain of applicability of the rigid body model. In a contextualistic-realist interpretation the reality of the contextual state could be compared to the reality of a billiard ball¹⁶, in the sense that in the above-mentioned domain the ball “really is” rigid. Analogously, the contextual state could be thought to describe microscopic reality as it “really is” in the context of a measurement of standard observable A. If a realist interpretation is aspired to at all, it would seem reasonable to interpret the contextual state as a contextual description of the initial reality of an object rather than of the final (post-measurement) one, since the contextual state represents the same information as does the initial state. At least, it would seem to be satisfying that it may be possible to explain the final pointer position of the measuring instrument (e.g.
by a principle like the ‘faithful measurement’ principle) on the basis of the reality of the corresponding microscopic property in the initial contextual state (e.g. Herbut [127]). If a contextualistic-realist interpretation of the contextual state is possible at all, a relation to the initial state of the object (rather than to the final one) also seems to be more plausible because the transition from the initial state to the contextual state is a generic one, depending only on the observable (A), but being independent of the specific interaction of object and measuring instrument. By contrast, the final state is generally influenced by the specific way observable A is measured, and in actual practice is often different from the contextual state. A contextualistic-realist interpretation is a defensible view within the standard formalism of quantum mechanics, and many elements of it can be found within the Copenhagen interpretation (to be discussed in chapter 4). A positive aspect of a contextualistic-realist interpretation is the emphasis laid on the role of the measuring instrument. Because of this the ‘possessed values’ principle (cf. section 2.3) is not valid any more, thus evading its problematic consequences. Photons could be considered to be quanta of the electromagnetic field as this field exists within the context of an experimental measurement arrangement. A contextualistic-realist interpretation of the contextual state as a description of microscopic reality as it is “observed” by a measuring instrument is not liable to the objection, raised in section 2.4.3 against the double role of the concept of a quantum mechanical observable in a realist interpretation, because the initial state and the contextual state are different concepts. Unlike the initial state, the contextual state is not applicable outside the context of a measurement of A. However, the quantum mechanical observable still has the double meaning rejected in section 2.4.3 on methodological grounds. Since only the pointer positions of measuring instruments are empirically verifiable, in this book an empiricist interpretation is preferred over a contextualistic-realist one. Neither a (contextualistic-)realist attribution of a measurement result as a property to the object, nor a (contextualistic-)realist interpretation of the contextual state adds anything verifiable to the empirical content of the theory. Hence, a (contextualistic-)realist interpretation is empirically superfluous. Due to equality (1.75) quantum mechanics does not give any further answer to the question of whether observable A has “really” obtained a well-defined value in the context of an A measurement, or whether photons are indeed quanta of the electromagnetic field corresponding to the representation suggested by the context of the measurement arrangement. As will be seen in chapters 5 and 9, a contextualistic-realist interpretation of quantum mechanics may even be misleading. The apparent success of such an interpretation could be a result of the fact that we devise certain pictures of microscopic reality on the basis of our macroscopic observations, more or less analogously to the way we can speak in classical mechanics about rigid bodies within a restricted domain of experimentation, knowing, however, that these bodies consist of atoms and, hence, are not rigid at all.
¹⁶ Better: of an ensemble of billiard balls (cf. chapter 6).
By the same token a picture of a photon as a contextual quantum might be misleading. The fact, observed in section 2.4.4, that the measurement results of the electromagnetic field are independent of the mathematical representation, seems to point in a direction not particularly favorable to a contextualistic-realist interpretation. As far as quantum mechanics refers to reality, it in the first place refers to the macroscopic reality of the measuring and preparing instruments, rather than to the microscopic reality of the object. This suggests an empiricist interpretation rather than a (contextualistic-)realist one. From an empiricist point of view the contextual state is dispensable, although it might be viewed as an alternative description of a preparation procedure of the initial state of the object in the context of a measurement of A, in which “irrelevant” details (related to other observables) have been omitted. In an empiricist interpretation it is clear that the final state of the measuring instrument is the important issue. Whether there exists a unique relation between an individual final state of the measuring instrument and an individual state of the object -either initial or final- is a question not posed in this interpretation because quantum mechanics is thought not to be able to answer it. Of course, this does
not imply that the way the individual measurement result is triggered by the precise constitution of microscopic reality would be uninteresting. However, quantum mechanics seems to be largely silent on this subject. We shall return to this issue in chapter 10 when dealing with hidden-variables theories intended to describe the reality behind the quantum mechanical phenomena. In particular, in section 10.6 it will be discussed how it might be possible to bridge the gap between the microscopic world of the object and the macroscopic world of measuring instruments, left open by the quantum mechanical description, in the way suggested by a contextualistic-realist interpretation of the contextual state. This should not be interpreted, however, as proving that the contextualistic-realist interpretation is right. Such an interpretation would stretch the analogy with the classical rigid body description of a billiard ball much too far, and would ignore the fact that within the domain of quantum mechanics we necessarily have to deal with the interaction between the microscopic object and our measuring instruments. This is a dynamic process in which the measuring instrument is involved in a far more essential way than by just providing a certain environment as an experimental context for the microscopic object. However, this process might be describable only by transcending quantum mechanical theory, analogously to the necessity of a microscopic theory of interatomic interactions to delimit the domain of validity of rigid body theory for billiard balls. An attribution of a value of a quantum mechanical observable as a property to a microscopic object might face a difficulty analogous to that of attributing rigidity to a billiard ball (compare the objections, raised in section 10.2.3, against a derivation of the Kochen-Specker theorem).
A second reason to prefer an empiricist interpretation is related to the generalization of the quantum mechanical formalism to the generalized observables introduced in section 1.9. This generalization is induced by the necessity, referred to above, to distinguish between microscopic object and measuring instrument, and to take into account their interaction also in the quantum mechanical description. As will be seen in chapters 7 and 8, the generalized formalism and generalized experiments both strongly suggest that it is wise to make a clear distinction between properties of the object (even when interacting with a measuring instrument) and the pointer positions of the measuring instrument. An attempt to interpret certain generalized observables as describing, in a contextualistic-realist sense, a ‘fuzzy’ reality rather than a ‘sharp’ one (Busch et al. [41]) will be criticized in section 7.2.2. It is possible, in principle, to generalize the contextual state defined in (1.74) for a standard observable to the case of a generalized observable represented by a POVM, according to expression (2.3) (e.g. Busch et al. [42]; Hofmann [128]).
Yet, expression (2.3) does not seem to be very meaningful if intended to represent
the final object state of the measurement process, since such an interpretation would amount to a generalization of the concept of so-called first kind measurements, which itself is of a rather dubious nature (compare section 3.2.4). Although the final state will certainly be affected by the detection process, it is unlikely that this will happen in the way described by (2.3) (see also section 3.3.4). If intended to represent the initial object state in the contextual sense given above, then it satisfies the generalization of (1.75) only if Hence, the definition is not applicable to the most interesting cases. There does not seem to exist an obvious way to find a density operator, different from containing the same information on a general POVM as does In particular, Naimark’s theorem, suggesting that might work because (cf. section 1.9.3), is not of any help here since it is easily seen that The difficulty of finding contextual states for general POVMs seems to reflect the problematic character of a realist interpretation of the generalized formalism. As will be seen in section 7.10, to understand the meaning of certain POVMs it will be necessary to take into account disturbing influences of the measurement process on the information obtained in a quantum mechanical measurement. It seems to be meaningless to try to mimic this influence by a preparation process preceding the measurement. Preparation and measurement should be carefully distinguished in quantum mechanics (de Muynck [129]). Even though the possibility of defining contextual states for standard observables suggests that, perhaps, these special observables may be reflecting a (contextual) reality, it must always be kept in mind that such an interpretation transcends the empirical data provided by the readings of our measuring instruments.
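The contrast between standard observables and general POVMs can be made tangible with a small numerical sketch. It rests on assumptions that are not taken from the text: the contextual state is assumed to have the square-root form sum_m sqrt(M_m) rho sqrt(M_m), which for a PVM reduces to sum_m P_m rho P_m, and the "trine" POVM, the density operator `rho`, and all function names are hypothetical illustrations. Under these assumptions the contextual state reproduces the outcome probabilities of a PVM, whereas for a non-commuting POVM it does not.

```python
import math

def matmul(a, b):
    """Product of two 2x2 real matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def ray(theta, weight=1.0):
    """weight * |psi><psi| with |psi> = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[weight * c * c, weight * c * s],
            [weight * c * s, weight * s * s]]

def probabilities(rho, elements):
    """Probability Tr(rho M_m) of each outcome m."""
    return [trace(matmul(rho, m)) for m in elements]

def contextual_state(rho, roots):
    """Assumed form sum_m R_m rho R_m, with R_m the square root of M_m."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for r in roots:
        term = matmul(matmul(r, rho), r)
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

rho = [[0.75, 0.25], [0.25, 0.25]]  # illustrative density operator (trace 1)

# Standard observable: orthogonal projectors, each equal to its own square root.
pvm = [ray(0.0), ray(math.pi / 2)]
p_pvm = probabilities(rho, pvm)
q_pvm = probabilities(contextual_state(rho, pvm), pvm)

# Non-commuting "trine" POVM: M_k = (2/3)|psi_k><psi_k|, so that
# sqrt(M_k) = sqrt(2/3)|psi_k><psi_k|.
angles = [0.0, math.pi / 3, 2 * math.pi / 3]
povm = [ray(t, 2 / 3) for t in angles]
roots = [ray(t, math.sqrt(2 / 3)) for t in angles]
p_povm = probabilities(rho, povm)
q_povm = probabilities(contextual_state(rho, roots), povm)

print("PVM : original", p_pvm, "contextual", q_pvm)
print("POVM: original", p_povm, "contextual", q_povm)
```

For the trine POVM the contextual probabilities work out to 1/6 + p/2 for each original probability p, so the two distributions agree only when all outcomes are equally probable. This is merely one illustrative case and does not replace the general argument given above.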
2.5 Some consequences of a choice for an empiricist interpretation
2.5.1 Empiricist interpretation and generalized observables
We note already here that an empiricist interpretation of quantum mechanical observables has an important consequence, viz. that all results depending on the choice of calibration of a pointer scale (cf. figure 2.1) are largely arbitrary. Indeed, an experimenter is free, in principle, in his choice of this scale. Of course, he will always try to relate it to a property of the object. However, the problem is that he has, apart from his measurements, no independent information directing his choice. The only thing he is able to do is to compare the scales of different measurements and to gauge these in such a way that all data are consistent, both mutually and with respect to theory. This leaves a considerable arbitrariness as regards the choice
of the calibration of pointer scales. Values of quantum mechanical observables are largely conventional. Strictly speaking, only results that are independent of this choice can be physically relevant. Such an independence holds true for the relative frequencies of pointer positions but not for the expectation values (1.3). For this reason conclusions in this book will be based exclusively on probability distributions, not on expectation values. This seems to be just a matter of interpretation, often deemed to be not very important. Yet, it has important consequences, relevant not only to the interpretation but also to the formalism of quantum mechanics, because it entails a change in the concept of a quantum mechanical observable to the extent that such an observable must be independent of the (eigen)value (see sections 1.9 and 3.3.1). This implies that from an empiricist perspective for standard observables only their spectral representations (corresponding to PVMs) have physical relevance, thus enabling a generalization to POVMs. By relinquishing the role of the values of observables, and by the consequent predominance of probability distributions it becomes obvious that projection-valued measures do not take special positions among the POVMs. Even if admitting the necessity of POVMs for describing the probability distributions of certain quantum mechanical measurements, many physicists consider PVMs (or Hermitian operators) as the more fundamental observables. Thus, Englert and Wódkiewicz [130] distinguish intrinsic and operational observables, the latter corresponding to POVMs, the former thought to represent “real” properties of a microscopic object. It should be stressed here that it may turn out to be impossible to substantiate this view. Although in certain cases (e.g. 
the inefficient photon counting process referred to in section 2.4.3) a measurement of a generalized observable can be interpreted as a nonideal measurement of a standard observable, this does not imply that the latter observable is “more real” (compare our discussion of the different photon concepts in section 2.4.4). The standard observable might need an empiricist interpretation as well. Moreover, we shall encounter generalized observables which do not even seem to be related to a PVM of the microscopic object. For instance, the POVM measured in the experiment by Noh et al. [56], to be discussed in section 8.4.3, is not related in any obvious way to a PVM. Although it is possible to define a Hermitian operator corresponding to this POVM (Englert et al. [131]), and, hence, a PVM is defined by the spectral representation of this operator, this definition is based on a certain choice of the values of the generalized observable, different choices yielding different PVMs. A unique PVM could only be obtained if it were possible to curtail the arbitrariness of this choice.
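The dependence on the choice of values can be sketched numerically. The example below is an illustration under stated assumptions, not the POVM of Noh et al.: it uses a hypothetical "trine" POVM on a real two-dimensional Hilbert space, an arbitrary density operator `rho`, and two arbitrary pointer scales `scale_a` and `scale_b`. It shows that the probability distribution does not refer to the scale at all, whereas both the expectation value and the eigenbasis (hence the PVM) of the Hermitian operator sum_k a_k M_k change with the scale.

```python
import math

# Illustrative trine POVM: M_k = (2/3)|psi_k><psi_k| (an assumption for this sketch).
ANGLES = [0.0, math.pi / 3, 2 * math.pi / 3]

def povm_element(theta):
    """Return (2/3)|psi><psi| with |psi> = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[2 / 3 * c * c, 2 / 3 * c * s],
            [2 / 3 * c * s, 2 / 3 * s * s]]

POVM = [povm_element(t) for t in ANGLES]

def trace_product(a, b):
    """Tr(ab) for 2x2 real matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def probabilities(rho):
    """Outcome probabilities Tr(rho M_k); no pointer scale enters here."""
    return [trace_product(rho, m) for m in POVM]

def value_operator(values):
    """Hermitian operator sum_k a_k M_k for a chosen pointer scale a_k."""
    return [[sum(a * m[i][j] for a, m in zip(values, POVM)) for j in range(2)]
            for i in range(2)]

def eigenbasis_angle(op):
    """Orientation (mod pi/2) of the eigenbasis of a real symmetric 2x2 matrix."""
    angle = 0.5 * math.atan2(2 * op[0][1], op[0][0] - op[1][1])
    return angle % (math.pi / 2)

rho = [[0.75, 0.25], [0.25, 0.25]]   # illustrative density operator
scale_a = [0.0, 1.0, 2.0]            # one calibration of the pointer scale
scale_b = [0.0, 1.0, 4.0]            # another, equally admissible calibration

p = probabilities(rho)
mean_a = sum(a * q for a, q in zip(scale_a, p))
mean_b = sum(b * q for b, q in zip(scale_b, p))
angle_a = eigenbasis_angle(value_operator(scale_a))
angle_b = eigenbasis_angle(value_operator(scale_b))

print("probabilities:", p)
print("expectation values:", mean_a, mean_b)
print("eigenbasis angles:", angle_a, angle_b)
```

The two scales yield the same list of probabilities but different expectation values and different eigenbases, in line with the remark that a unique PVM would require curtailing the arbitrariness of the choice of values.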
2.5.2 Realist interpretation of quantum mechanics, and hidden variables
We have to draw a clear distinction between a ‘realist interpretation of quantum mechanics’ and ‘subquantum or hidden-variables theories’. The latter are often seen as intended to describe the reality behind the quantum mechanical phenomena, and, therefore, are associated with realism. As already put forward in section 2.2, the latter theories are not at all incompatible with an empiricist interpretation of quantum mechanics. On the contrary, if quantum mechanics describes only phenomena in preparing and measuring instruments, then -provided one is not content with the logical positivist/empiricist proposal to equate the microscopic object with the phenomena- a description of the microscopic reality behind the phenomena would necessarily require a subquantum theory. The question of whether electrons and photons are point-like objects rather than the ‘wavicles’ implied by the logical positivist/empiricist approach, does not obtain a sufficient answer within the formalism of quantum mechanics, and has to be relegated to the domain of subquantum theories. The relation between subquantum theory and quantum mechanics could be analogous to the relation between (classical) statistical mechanics and thermodynamics; the former theory is intended to describe the microscopic reality behind the (macroscopic) thermodynamic phenomena, and explains these in terms of microscopic objects and their interactions (see also chapter 10). When quantum mechanics is interpreted, analogously to thermodynamics, as a phenomenological description of reality -as is done in an empiricist interpretation- then this does not imply that we can stop thinking (cf.
section 2.3), but, on the contrary, it means that we are free to direct our intelligence toward an explanation of the phenomena in terms of subquantum processes (analogous to an explanation of thermodynamic phenomena by means of statistical mechanics), while employing quantum mechanics to describe just the phenomena within its domain of application. Ultimately, this might be more fruitful than the continuing effort applied to solve all those paradoxes that are caused by a realist interpretation of quantum mechanics and have haunted this theory for such a long time. Under von Neumann’s influence (cf. section 10.2.1) hidden-variables theories have for a long time been considered prototypical of a metaphysical approach no serious physicist could afford to be involved in. It is not improbable that the preference, observed in section 2.4.2, of many physicists for a realist interpretation of quantum mechanics is caused by a certain reluctance with respect to hidden-variables (subquantum) theories. Indeed, when the possibility of subquantum theories is denied, a belief in the existence of elementary particles virtually makes necessary a realist interpretation of quantum mechanics, in which these particles are thought to be described by quantum mechanics itself. As a consequence an elementary particle
is often thought to be representable as a wave packet flying around in space. It is amusing to realize that, although such a representation may have been furthered by the logical positivist/empiricist fear of the metaphysical -requiring that our physical theories should restrict themselves to the description of observable quantities, and should shun speculation and metaphysics-, such a view is not less metaphysical than a hidden-variables representation was supposed to be. Elementary particles may fly around in space, wave packets do not. The idea that they do is just a consequence of a realist interpretation of quantum mechanics, assuming the theory to yield a description of microscopic reality rather than of the phenomena of preparation and measurement. Another aspect of a rejection of subquantum theories is its implicit assumption that quantum mechanics is universally valid. This, too, is a methodological point of departure of a doubtful calibre, which a realist interpretation is more liable to than an empiricist one. The problem of the completeness of quantum mechanics will be discussed more fully in section 4.2. Here it should suffice to remark that this problem has become tied up with the interaction between object and measuring instrument. In a realist interpretation this interaction has a tendency to be neglected, thus concealing a problem that is essential to an understanding of quantum mechanics. On the other hand, in an empiricist interpretation this problem is highlighted, since here the transmission of information from object to measuring instrument is of central importance. In this book the interaction between object and measuring instrument plays a major role. Any subquantum theory striving at a reproduction of the quantum mechanical measurement results should duly take this into account. Whether one believes in the completeness of quantum mechanics is a major incentive to choose one interpretation or another. 
Whether quantum mechanics is really an incomplete theory can get a definitive answer only by experimentally transcending the domain of application of this theory. A belief in the completeness of quantum mechanics, however, would a priori deem this a self-contradictory enterprise not worth trying. An empiricist interpretation might be able to break the vicious circle, entered in this way, for the same reason as it is instrumental in abolishing a restriction to the standard formalism within quantum mechanics. In chapter 10 some ideas will be developed with respect to the kind of experiments that might lead outside the domain of application of (generalized) quantum mechanics.
2.5.3 Interpretations and the classical limit
An interesting difference between an empiricist interpretation of quantum mechanics and a realist one is found when considering the classical limit of the theory. In which sense can classical mechanics be seen as a limit to quantum mechanics? Since a macroscopic object consists of microscopic particles, in a realist interpretation quantum mechanics should also be able to describe macroscopic objects. Since
position and momentum can be attributed simultaneously to macroscopic objects, in this interpretation we are obliged to formulate the theory in such a way that it contains “classical” (i.e. commuting) observables, that, yet, represent position and momentum of a macroscopic particle. Stated differently, in a realist interpretation quantum mechanics must contain classical mechanics as a part, or, at least, the latter theory should be derivable from the former one in some approximation. A general feature of this approach is the assumption that the domain of application of classical mechanics is embedded in the domain of application of quantum mechanics (cf. figure 2.5a). Although abstract quantum theories exist in which certain quantities are interpreted as classical observables (like the ‘observables at infinity’ in local quantum field theories [8]), their practical applicability is far from clear. A similar remark holds true for Ehrenfest’s theorem (1.111), which casts quantum mechanics in a “classical” form, but which does certainly not provide a derivation of classical mechanics from quantum mechanics (e.g. Ballentine et al. [58], where it is demonstrated that classical statistical mechanics rather than classical mechanics may be obtained in the classical limit of quantum mechanics). Another attempt to consider the classical limit within a realist interpretation of the formalism is made in the context of the ‘decoherence’ approach to be discussed in section 3.4. In this approach it is assumed that to obtain agreement of the quantum mechanical description with the macroscopic world it is necessary to take into account the influence of the environment of a macroscopic object. In the related approach by Ghirardi, Rimini and Weber [132, 133] it is assumed that macroscopic objects are described by an equation fundamentally different from the Schrödinger equation, and, hence, are outside the domain of application of ordinary quantum mechanics. 
This would correspond to the situation depicted in figure 2.5b. However, within the ‘decoherence’ program attempts are made to interpret the deviation from the Schrödinger equation as a consequence of the influence of the environment, thus possibly restoring the situation of figure 2.5a.
In an empiricist interpretation the situation is quite different. In the first place preparation and observation methods are involved here. In the domain of application of quantum mechanics these can be very different from the ones that are possible in the domain of application of classical mechanics. For instance, in classical mechanics observation methods for the simultaneous observation of position and momentum do exist. In (standard) quantum mechanics they do not. For this reason the domains of application of the theories are different. In an empiricist interpretation only the situation of figure 2.5b seems to apply. We do not have any reason to try to consider classical mechanics as a limiting case of quantum mechanics. Of course, there must be some kind of continuity in the transition from preparations and measurements in the microscopic domain, via the mesoscopic one, to the macroscopic domain. But this, evidently, cannot be a relation of inclusion. Primas ([89], p. 107) is convinced that classical mechanics is not a limit of Dirac-von Neumann (“pioneering”) quantum mechanics. He derives his argumentation from his experience as a chemist in describing molecules, and from his insight that the concept of ‘molecular structure’, as employed by chemists, is a classical concept ([89], p. 322; also [134]), not consistent with the superposition principle (cf. section 1.8.2). As we shall see in chapter 3, it is indeed the application of the superposition principle to macroscopic objects (like Schrödinger’s cat) that is causing problems in a realist interpretation. Incidentally, for Primas this is no reason to reject a realist (or ontic) interpretation. He looks for a solution in a different direction. By introducing superselection rules, restricting the possibility of superposition, he tries to generalize the theory so as to encompass both classical and quantum mechanical quantities.
Thus, in a molecule an atomic nucleus would have a classical position variable, an electron a quantum mechanical one. Since, however, a nucleus consists of particles that are to be described by quantum mechanics, Primas’s solution does not seem to be a final one. In any case, for the center of mass of a nucleus there would also exist a quantum mechanical position variable that in a realist interpretation of the theory has a rather unclear meaning next to the classical one. On the other hand, in an empiricist interpretation the difference between a classical and a quantum mechanical position observable is evident because they represent different observation procedures. A classical determination of the position of an atomic nucleus can be carried out by means of one single low frequency photon having such a small energy that the nucleus is not appreciably influenced. A quantum mechanical measurement requires more than one photon (viz, one for each constituting particle) of a much larger energy, and will give rise to Heisenberg disturbance, which is characteristic of quantum mechanical measurements (cf. section 4.6.2). In an empiricist interpretation there can exist a domain of classical observations, fundamentally different from the methods used within the domain of application of standard quantum mechanics. Transition between the two different domains is a matter of devising appropriate measurement procedures for
the intermediate region rather than considering some (mathematical) classical limit. In this book the classical limit will not be discussed any further, although certain results may be relevant to this problem. Thus, in the generalization of standard quantum mechanics, to be discussed in chapter 7, a concept of joint nonideal measurement of position and momentum is found. The extension of the domain of application of quantum mechanics, thus obtained, seems to provide a bridge between classical and quantum mechanics, not present in standard quantum mechanics. By studying the nonideality of the joint measurement of position and momentum dependent on the mass of the object the intermediate (mesoscopic) domain (M) could be an interesting testing ground for the classical limit in an empiricist sense.
Chapter 3
Quantum mechanical description of measurement, and the “measurement problem”
3.1 The (conventional) “measurement problem”
3.1.1 Schrödinger’s cat
The so-called “measurement problem” is partly a consequence of the mathematical formalism of quantum mechanics, and is partly induced by its interpretation. It arises when the measuring instrument is considered as a quantum mechanical object. The problem is induced by the application of the superposition principle to the states of a measuring instrument. Already at a very early stage it was felt that such superpositions might constitute a problem for the interpretation of the quantum mechanical formalism, in particular for those realist interpretations considering the state vector as a description of an individual object in the classical sense discussed in section 2.3. It is not accidental that Schrödinger -who preferred a realist interpretation of quantum mechanics- was the first one to worry about this problem. Schrödinger [104] considers a cat, confined in a cage together with a radioactive atomic nucleus that has a certain probability to decay within the next hour. If the nucleus decays, a mechanism is set into motion causing the cat to die. The problem is sometimes (e.g. [135]) presented in a somewhat simplistic way, to the effect that the cat’s final state would be described by a superposition (3.1) of the states of a living and a dead cat, which is a rather odd -perhaps even paradoxical- state for a cat to be in.
CHAPTER 3. THE PROBLEM OF MEASUREMENT
Schrödinger’s cat is actually nothing but a measuring instrument answering the question of whether the radioactive nucleus has decayed or not. The states of a living and a dead cat, respectively, are examples of different final states of the pointer of a measuring instrument. The “measurement problem” is symbolized by the superposition (3.1) of the states of the living and the dead cat: how can such a superposition be reconciled with the fact that in actual practice cats are found only to be either alive or dead? One way to cope with the “measurement problem”, as exemplified by Schrödinger’s cat, has been the application of von Neumann’s projection postulate (cf. section 1.6) to the cat (measuring instrument) itself. Thus, observation of the cat would cause the state (3.1) to change discontinuously into either the state of the living cat or that of the dead cat if strong projection (1.70) is assumed, or into the density operator (3.2) if weak projection is applied (cf. (1.72)), c referring to a “cat” observable having the states of the living and the dead cat as eigenvectors. Comparing (3.2) with the density operator corresponding to the state (3.1), the essential difference is seen to be the absence of the “cross” terms¹ in (3.2), thus suggesting the possibility of interpreting the latter state, in the spirit of the Copenhagen ensemble interpretation (cf. section 6.2.3), as an “unproblematic” description of an ensemble of living and dead cats. The “cross” terms give rise to the interference terms in the expectation value (1.4). Busch, Lahti and Mittelstaedt ([41], section III.5.1) interpret the transition from (3.1) to (3.2) as an ‘objectification’ of the measured observable, the problem of how this observable gets its objective value being considered as the key problem of the quantum theory of measurement. Note that density operator (3.2) is meant to represent the final state of the measuring instrument in a measurement process. Reliance on the projection postulate can hardly be satisfactory if this postulate itself is not firmly established.
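The role of the “cross” terms can be illustrated by a minimal numerical sketch. It assumes a two-dimensional model in which the living and the dead cat are represented by orthogonal basis vectors and the superposition has the generic form alpha|alive> + beta|dead> with real amplitudes; the names `alive`, `dead`, `alpha`, `beta`, and `interference` are illustrative choices and are not the notation of (3.1) and (3.2). The sketch shows that the superposition and the mixture give identical probabilities for the “cat” observable, and are distinguished only by an observable with off-diagonal elements in the cat basis.

```python
import math

def outer(u, v):
    """|u><v| for real two-component vectors."""
    return [[u[i] * v[j] for j in range(2)] for i in range(2)]

def expectation(rho, op):
    """Tr(rho op) for 2x2 real matrices."""
    return sum(rho[i][j] * op[j][i] for i in range(2) for j in range(2))

alive, dead = [1.0, 0.0], [0.0, 1.0]
alpha = beta = math.sqrt(0.5)  # illustrative amplitudes (assumed real)

# Density operator of the superposition: contains off-diagonal "cross" terms.
psi = [alpha * a + beta * d for a, d in zip(alive, dead)]
rho_superposition = outer(psi, psi)

# Density operator of the mixture: the same diagonal, no "cross" terms.
rho_mixture = [[alpha ** 2 * outer(alive, alive)[i][j]
                + beta ** 2 * outer(dead, dead)[i][j]
                for j in range(2)] for i in range(2)]

p_alive = outer(alive, alive)            # projector of the "cat" observable
interference = [[0.0, 1.0], [1.0, 0.0]]  # |alive><dead| + |dead><alive|

print("P(alive):", expectation(rho_superposition, p_alive),
      expectation(rho_mixture, p_alive))
print("interference observable:", expectation(rho_superposition, interference),
      expectation(rho_mixture, interference))
```

Whether an observable of the second kind can actually be measured on a macroscopic instrument is precisely the physical question at issue; the sketch only displays where, in the formalism, the two density operators differ.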
For Schrödinger [104] this was reason enough to develop serious doubts with respect to the way the quantum mechanical formalism used to be interpreted in the physical literature of his time. If a state vector or density operator describes reality, then (3.1) and (3.2) represent different realities, thus posing the problem of what is the physical mechanism causing the “cross” terms to vanish. This is the main issue in the conventional “measurement problem”. Thus,
1 Sometimes also referred to as ‘off-diagonal’ terms, as they correspond to off-diagonal terms of the density operator in the representation of the eigenvectors of the “cat” observable.
In the following the qualification ‘conventional’ will be omitted in general. The (conventional) “measurement problem” should always be clearly distinguished from the problem of quantum mechanical measurement as relevant to our present-day understanding:
3.1.2 Interpretations, the “measurement problem”, and the problem of quantum mechanical measurement
As the Copenhagen interpretation (cf. chapter 4) was the dominant one at that time, Schrödinger’s objections are generally seen as being directed against this interpretation. However, the Copenhagen interpretation is a far from monolithic structure, even containing conflicting elements. Thus, as will be discussed in section 4.3, it is in disagreement with one of the theses of the Copenhagen interpretation (i.e. the strong form of the correspondence principle) to consider a measuring instrument as a quantum mechanical object. According to this thesis a measurement must be described in classical terms. Since Schrödinger’s objection was based on a quantum mechanical description of a measuring instrument, this may have been a reason for Bohr not to take the problem too seriously². On the other hand, already von Neumann ([2], chapter VI) considered the possibility of treating the measurement process as an ordinary quantum mechanical process, satisfying a Schrödinger equation describing the interaction between object and measuring instrument (cf. section 3.2.2). It is rather von Neumann’s approach against which Schrödinger’s objection is directed. Since this approach is widely considered as a part of the Copenhagen interpretation (see also section 4.3.4), the “measurement problem” can yet be seen as a problem of this interpretation, notwithstanding Bohr’s possibility of evading the problem by appealing to the correspondence principle.
2 “The argument here is of course first and foremost that in order to serve as measuring instruments, they cannot be included in the realm of application proper to quantum mechanics” (letter from Bohr to Schrödinger [136]).
There may be a second reason responsible for the scant importance Bohr seemed to attribute to Schrödinger’s cat paradox. As far as the Copenhagen interpretation
is not a realist one, Schrödinger’s doubts seem to be misdirected. Thus, in Bohr’s instrumentalist interpretation of the wave function (cf. section 2.1) the vector (3.1) is not supposed to describe the state of a cat, but is considered to be just a mathematical tool for calculating probabilities. It is actually admitted by Schrödinger [104] that in an instrumentalist interpretation the problem does not arise (without being prepared, though, to accept this latter interpretation). As was remarked in section 2.4.5, however, many realist elements can be found in the Copenhagen interpretation (see also sections 4.3.3 and 4.6.6), thus making this interpretation vulnerable to Schrödinger’s criticism. This realism, undoubtedly, is the reason that for solving the “measurement problem” it has been felt necessary to single out measurement interactions as a peculiar kind of processes that, although describable in quantum mechanical terms, would nevertheless not satisfy the usual unitary evolution governed by a Schrödinger equation, but would be subject to projection according to von Neumann’s projection postulate (cf. section 3.2.2). The conventional “measurement problem” might indeed be characterized as the problem of justifying the projection postulate. Because of its dependence on interpretation the “measurement problem” is often considered a pseudo problem, not to be referred to otherwise than supplied with quotation marks. Indeed, as formulated above, the problem seems to be induced primarily by a realist interpretation of the state vector as a description of an individual object. A different approach could be an attempt to find another interpretation of the formalism, not necessitating the projection postulate, and treating measurement on an equal footing with any other quantum mechanical interaction. As already mentioned above, an instrumentalist interpretation might evade the problem. 
This holds true for the so-called minimal interpretation (see also sections 2.1 and 6.4.1). Also in an empiricist interpretation (cf. section 2.2) the state vector does not describe the cat itself, but rather represents a preparation procedure. As long as the probability distributions of all possible measurements, to be performed on the cat, are correctly predicted by the state vector, no problem seems to arise in an empiricist interpretation. Therefore in this interpretation the “measurement problem” is non-existent in the form given above, and could be ignored. Finally, even a non-Copenhagen realist interpretation, in which the state vector is not considered as a description of an individual object but as a description of an ensemble (cf. chapter 6), is able to circumvent the problem (see also section 3.2.5).
Although the above-mentioned interpretations are not liable to the conventional “measurement problem”, this does not imply that Schrödinger’s cat does not pose any problem to them. On the contrary, independently of the choice of an interpretation there is the problem of why, if (3.1) is an allowed state vector of the cat, observations distinguishing this state from (3.2) have never occurred. If it is accepted that measurement interactions, too, must be described quantum mechanically (as will be argued in section 4.3.4 while criticizing Bohr’s correspondence principle), then the problem is a genuine problem of the quantum mechanical formalism rather than merely of the interpretation.
In the past the question of what is a quantum mechanical measurement has mainly been approached starting from a realist interpretation of the standard formalism. Thus, a measurement was supposed to yield information on the value of an observable (represented by a Hermitian operator), assuming that the object possesses the measured value of the observable as a property immediately after the measurement. The conventional “measurement problem” can be characterized as the problem of how it is possible that the measuring instrument, after completion of a measurement, has a well-defined pointer position (and the object a well-defined value), notwithstanding the fact that the final state is a superposition like (3.1). Von Neumann’s projection postulate had to be invoked for solving the apparent contradiction by providing a mechanism explaining the disappearance of the “cross” terms.
However, the starting point may be too restrictive. The interpretation and the formalism may both impose requirements on the notion of a quantum mechanical measurement that are too strong. It is important to know which requirements are crucial, and which ones are just spurious consequences of either a too restricted formalism or a too restrictive interpretation. This is the subject of the present chapter.
By abolishing the realist interpretation it is possible to get around von Neumann’s projection postulate, a step that is necessary because no realistic measurement seems to satisfy it (cf. section 1.6). If we choose a less restrictive interpretation (e.g. the minimal interpretation, or the empiricist one), however, then the question of the unobservability of the “cross” terms regains its full weight, because we no longer have the “explanation” provided by the stronger interpretation. In the formalism the “cross” terms are there.
Hence, we need some other explanation for the fact that they are not observed.
3.1.3 Three tentative answers
Considering the “measurement problem” as a problem of the quantum mechanical formalism, three different answers could be given.
i) Unobservability of the “cross” terms
One answer, valid in standard quantum mechanics (e.g. Gottfried [137]), could be that the cat, apart from measuring the state of a nucleus, is also a measuring instrument for an observable (let us call it the “cat” or pointer observable) that has the two cat states appearing in (3.1) as its eigenvectors. The difference between (3.1) and (3.2) could be observed only by measuring an observable of the cat that is incompatible with the “cat” observable. This, however, would seem to require an (“impossible”) simultaneous measurement of two incompatible observables of the cat (the measuring instrument). A restriction to observables compatible with the “cat” observable implies unobservability of the “cross” terms. This answer is the “orthodox” one (see also section 3.2.3).
There are two reasons not to be satisfied with this answer, both stemming from the answer’s reliance on the special status of the cat as a measuring instrument described by the standard formalism:
i) the assumption that the two cat states are orthogonal eigenvectors of the “cat” observable, causing the “cross” terms of this observable to vanish, seems to be too restrictive. The macroscopic states of a measuring instrument will not be orthogonal in general. Thus, the coherent states (A.29), often considered as yielding the most appropriate quantum mechanical descriptions of macroscopic objects (e.g. [138]), are not mutually orthogonal. In a representation based on coherent states the “cat” observable will have nonvanishing “cross” terms in general;
ii) the assumption that in state (3.1) no measurements can be made of observables incompatible with the “cat” observable (such measurements being able to demonstrate the presence of the “cross” terms) does not seem to follow from the quantum mechanical postulates as given in chapter 1. A measurement process is at the same time a preparation process, of which (3.1) is the final state. In general there is nothing in the standard formalism of quantum mechanics imposing on a measurement restrictions that are dependent on the nature of the preparation: in (1.39) the state and the observable A can be chosen arbitrarily, thus warranting independence of measurement from preparation, even if the preparation process is a (different) measurement. Hence, it is not at all clear why it would be impossible to measure some observable incompatible with the “cat” observable after the cat has been prepared in state (3.1) or (3.2).
In this respect the cat’s macroscopicity is sometimes presented (e.g. [139, 140]) as a tentative explanation, macroscopic objects allegedly having only compatible observables, corresponding to the object’s classical properties. However, this answer, which is closely related to Bohr’s correspondence argument, meets the same objections as those given above: there is no reason to assume the fundamental impossibility of measuring a microscopic observable of the measuring instrument, incompatible with its pointer observable, in the state prepared by the measurement.
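The point of this first answer can be illustrated with a toy calculation. The following is only a minimal sketch, assuming a two-dimensional stand-in for the cat with real amplitudes; the names `pointer` and `flip` and the numerical values are illustrative assumptions, not notation from the text. It compares a superposition of the type (3.1) with a mixture of the type (3.2): an observable diagonal in the pointer basis cannot tell them apart, whereas one that does not commute with it can.

```python
# Toy two-state "cat": index 0 and 1 label the two pointer states.
# All names and numbers are illustrative assumptions.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expectation(rho, op):
    """Tr(rho op) for 2x2 matrices given as nested lists."""
    prod = matmul(rho, op)
    return (prod[0][0] + prod[1][1]).real

alpha, beta = 0.6, 0.8  # real amplitudes, alpha**2 + beta**2 = 1

# Superposition: |psi><psi| keeps the off-diagonal "cross" terms.
rho_super = [[alpha * alpha, alpha * beta],
             [beta * alpha, beta * beta]]
# Mixture: same diagonal, "cross" terms set to zero.
rho_mixed = [[alpha * alpha, 0.0],
             [0.0, beta * beta]]

pointer = [[1.0, 0.0], [0.0, -1.0]]  # diagonal in the pointer basis
flip = [[0.0, 1.0], [1.0, 0.0]]      # does not commute with `pointer`

# Difference between the two states as seen by each observable.
same = expectation(rho_super, pointer) - expectation(rho_mixed, pointer)
diff = expectation(rho_super, flip) - expectation(rho_mixed, flip)
```

Under these assumptions `same` vanishes while `diff` equals twice the product of the amplitudes, which is the sense in which only measurements incompatible with the pointer observable are sensitive to the “cross” terms.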
ii) Decoherence
In section 3.4 a second tentative answer will be discussed to the question of why the “cross” terms are never observed, viz. the idea that some mechanism exists wiping out these terms, the mechanism being effective only in macroscopic objects. Theories of this kind are called decoherence theories because the wiping out of the “cross” terms can be interpreted as caused by a disturbance of the phase coherence of the different terms in the linear superposition (3.1).
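How a disturbance of phase coherence suppresses the “cross” terms can be sketched numerically. The example below is a toy dephasing model, not one of the decoherence theories of section 3.4: it assumes that the relative phase between the two terms of the superposition is spread uniformly over an interval of half-width `width` (a hypothetical parameter), and averages the phase factor over that interval.

```python
import cmath
import math

def coherence_factor(width, steps=10001):
    """Modulus of the average of exp(i*phi) for phi spread uniformly
    over [-width, width]; it multiplies the "cross" term."""
    if width == 0:
        return 1.0
    total = 0.0 + 0.0j
    for k in range(steps):
        phi = -width + 2.0 * width * k / (steps - 1)
        total += cmath.exp(1j * phi)
    return abs(total / steps)

alpha, beta = 0.6, 0.8
cross_term = alpha * beta  # off-diagonal element without dephasing

narrow = cross_term * coherence_factor(0.1)    # phase almost sharp
wide = cross_term * coherence_factor(math.pi)  # phase fully scrambled
```

With an almost sharp phase the “cross” term is nearly unchanged; with the phase spread over a full period it is averaged to practically zero, while the diagonal terms are not affected at all.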
The idea of decoherence is firmly embedded in the law of entropy increase, which is considered a universal law of nature, corroborated over and over again. For this reason it does not seem unrealistic to assume the existence of such a decohering mechanism. The “measurement problem” strongly hinges on the question of whether superpositions like (3.1) “really” exist, or whether they are just artefacts of the quantum mechanical formalism, (3.2) being the “true” state, and time evolution during measurement being different from an evolution according to the solution of a Schrödinger equation.
Like von Neumann’s projection the decoherence mechanism serves to get rid of the “cross” terms. Decoherence could provide a physical mechanism realizing the projection. The close relationship between the von Neumann-Lüders projection (1.74) and decoherence is often thought to be epitomized by the increase of the von Neumann entropy (cf. (A.90)) realized by projection (however, see section 3.3.6). Yet, as a mechanism to realize a von Neumann-Lüders projection, decoherence could hardly be adequate if this projection itself were not a real effect. For this reason it is not at all clear whether decoherence constitutes only a practical rather than a fundamental impediment to observation of the “cross” terms.
As we shall see in section 3.4, the decoherence programme meets great difficulty in arriving at a consistent explanation of all possible observations. Moreover, its assumption that the “cross” terms are completely absent in the density operator is tantamount to an a priori assumption that no measurement can ever be sensitive to these terms. It is very well possible that the present agreement of such an assumption with observation is a consequence of the too restricted set of experiments performed up to now. This possibility leads to the third possible answer:
iii) Measurements sensitive to the “cross” terms
As already suggested above, the impossibility of any observation of the “cross” terms may be questioned. Is it really necessary to get rid of the “cross” terms, either by projection or by decoherence? More generally, it is possible that superpositions like (3.1) do exist, the difference with (3.2) not having been observed up to now just because the right measurements were not performed (e.g. Leggett [141]). Viewed from this perspective the above-mentioned solutions to the “measurement problem” are rather unfortunate, since they tend to define away an effect that possibly is there. Rather than devise reasonings why the difference between (3.1) and (3.2) should be unobservable, it might be more fruitful to devise experiments that are sensitive to the difference.
The question of so-called ‘Schrödinger cat states’, being superpositions of states (not necessarily of a measuring instrument) that are macroscopically, or at least mesoscopically, distinguishable, has received lively interest since their feasibility was demonstrated theoretically [142]. Nowadays such states have even been realized experimentally. Thus, by carefully manipulating the coupling between external (centre of mass) and internal degrees of freedom of an ion [143, 144] it is possible to prepare the external state of the ion as a superposition of states localized in different regions of space, and to perform measurements that are sensitive to the superposition character of this state. As another example neutron (or even atom) interference experiments might be mentioned [145, 146], in which the wave function of a neutron or atom is split by a diffraction grating into two coherent components that, after traversing macroscopically distinguishable trajectories, are brought into interference again (cf. section 8.2).
Of course, neither an ion nor a neutron is a macroscopic object in the sense that it would be directly observable. However, it is well known that the centre-of-mass motion of these objects can often be described to a very good approximation by means of a classical trajectory. In certain measurement methods, e.g. in the Stern-Gerlach experiment (cf. section 8.3), an observable (viz. spin) is even measured by observing the atom’s centre-of-mass position, thus demonstrating the latter quantity’s use as a pointer observable. Hence, it does not seem unreasonable to consider it as a macroscopic observable. The Stern-Gerlach example actually demonstrates that ‘Schrödinger cat states’ have been prepared ever since Stern-Gerlach experiments were first performed. Hence, it is the observation of the superposition character of the ‘Schrödinger cat state’, rather than its creation, that seems to be the main achievement of the recent interference experiments.
The approach corresponding to the third answer, adopted in this book, takes seriously the possibility that, even in macroscopic objects like measuring instruments, the “cross” terms are there, but that we have to perform the right experiments to be able to see them³.
This virtually makes the conventional “measurement problem” obsolete: no mechanism causing the “cross” terms to vanish needs to be thought up. On the contrary, the possibility that these terms might give rise to new phenomena is an incentive to devise new kinds of experiments, like the ones referred to above. From a fundamental point of view such experiments are interesting because they exhibit the quantum mechanical nature of quantum measurement, thus demonstrating that, in contrast to one of the dogmas of the Copenhagen interpretation (cf. section 4.3.2), it is not possible to deal with quantum measurement in classical terms. The conventional “measurement problem” stays interesting only as far as it is an impediment to an unbiased application of the quantum mechanical formalism to the interaction of object and measuring instrument.
3
“The classical limit is characterized, not by the absence of interference, but by the interference pattern being too fine to be resolved” (Ballentine et al. [58]).
3.2 Quantum mechanical description of the measurement process
The “measurement problem”, being a part of the general problem of quantum mechanical measurement, may be used as a testing ground to probe the role played in quantum mechanics by the measuring instrument, a role often underestimated. Thus, it is evident that the state vector (3.1) can hardly represent the final state of a cat if it has been interacting with another object, since such an interaction necessarily yields an entangled state (cf. section 1.5) of cat and object. Result (3.1) can be obtained only if the object’s dynamics is completely ignored⁴. The interaction between microscopic object and measuring instrument is a crucial element in the measurement process⁵. Therefore it is necessary to investigate whether the “measurement problem” survives this complication. One reason to consider a quantum mechanical description of the measuring instrument (in this case Schrödinger’s cat) might be a justification of the cat’s final state (3.2) on the basis of the interaction with a microscopic object (in this case the decaying nucleus). This is considered in the present section.
3.2.1 A simplified model
The following simplified, and perhaps not very realistic, model of a quantum mechanical measurement of a standard observable represented by a Hermitian operator A with eigenvalues $a_m$ and normalized eigenvectors $|a_m\rangle$ is often considered (e.g. Wigner [147]). The vectors $|a_m\rangle$ constitute an orthonormal basis in the Hilbert space $\mathcal{H}$ of the state vectors of the microscopic object. Let $|\theta_m\rangle$ be normalized and mutually orthogonal pointer states of the measuring instrument. The vectors $|\theta_m\rangle$ may be considered as eigenvectors of an observable $\Theta$ of the measuring instrument, called the pointer observable (to be compared with the “cat” observable of section 3.1.3), with corresponding eigenvalues $\theta_m$. Let state vector $|\theta_0\rangle$ represent the initial state of the measuring instrument; the vector $|\theta_m\rangle$ is thought to represent the final pointer state of the measuring instrument⁶ after completion of a measurement in which the object was initially in state $|a_m\rangle$. Let $\mathcal{H}_a$ be the Hilbert space of the states of the measuring instrument.
4
Note that Schrödinger [104], being well aware of this, did actually consider entangled states.
5
In the past, failure to take this interaction into account has sometimes induced the idea that it must be the human act of observation, or even human consciousness, that must be invoked to solve the problem (cf. section 4.6.7).
6
It is possible that $|\theta_0\rangle$ equals one of the final states, as is the case, e.g., for an ideal photon counter, in which the possibility that initially no photon is present induces a probability that the detector remains in its initial state.
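Then the measurement process takes place in the tensor product Hilbert space $\mathcal{H} \otimes \mathcal{H}_a$ (cf. appendix A.9). In the model it is supposed that if the measurement result of observable A is $a_m$, then the final state of the object should be the state $|a_m\rangle$ (cf. section 1.6). According to standard quantum mechanics this can be achieved if an interaction Hamiltonian H exists such that the unitary operator $U = e^{-iHT/\hbar}$ (T the final time of the measurement) realizes the transition (omitting tensor product notation if there is no danger of ambiguity)
$$|a_m\rangle |\theta_0\rangle \rightarrow U |a_m\rangle |\theta_0\rangle = |a_m\rangle |\theta_m\rangle. \qquad (3.3)$$
This is feasible. For instance, a measurement of position Q would be achieved by the Hamiltonian $H = g\,Q P_a$, with g a coupling constant and $P_a$ the momentum operator of the measuring instrument: from
$$e^{-igTQP_a/\hbar}\, |q\rangle\, \theta_0(x) = |q\rangle\, \theta_0(x - gTq) \qquad (3.4)$$
it follows that by this interaction a pointer is shifted over a distance $gTq$ if the object’s initial state was $|q\rangle$. Due to the linearity of the Schrödinger equation, for an arbitrary initial state $|\psi\rangle = \sum_m c_m |a_m\rangle$ of the object the final state of the system of object+measuring instrument is directly obtained from (3.3) as
$$|\Psi_f\rangle = \sum_m c_m |a_m\rangle |\theta_m\rangle. \qquad (3.5)$$
From (3.5) it follows that the probability to find the pointer of the measuring instrument at time T at position $\theta_m$ equals $|c_m|^2$. Hence, this measurement process reproduces the probabilities (1.5) of the standard formalism. Evidently, the state (3.5) is an entangled state⁷, and might seem to pose problems of interpretation analogous to those of the superposition (3.1). Hence, it would seem that the interaction of object and measuring instrument has not brought us much further. Yet, if the density operator $\rho_a$ of the final state of the measuring instrument is determined (by means of partial tracing (cf. (A.81)) of density operator $|\Psi_f\rangle\langle\Psi_f|$ in Hilbert space $\mathcal{H}$), then we find
$$\rho_a = \mathrm{Tr}_{\mathcal{H}}\, |\Psi_f\rangle\langle\Psi_f| = \sum_m |c_m|^2\, |\theta_m\rangle\langle\theta_m|. \qquad (3.6)$$
This, evidently, has the desired form (3.2). Analogously, the final state $\rho_o$ of the object is found by partial tracing of density operator $|\Psi_f\rangle\langle\Psi_f|$ in Hilbert space $\mathcal{H}_a$, yielding
$$\rho_o = \mathrm{Tr}_{\mathcal{H}_a}\, |\Psi_f\rangle\langle\Psi_f| = \sum_m |c_m|^2\, |a_m\rangle\langle a_m|. \qquad (3.7)$$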
7
This, actually, is the state considered by Schrödinger [104].
Hence, the density operators (3.6) and (3.7) both look as if (weak) projection had taken place. In order to arrive at (3.6) and (3.7) it is not necessary to introduce (weak) projection as an additional operation: these density operators are obtained in a straightforward way from the unitary development of the state vector of object+measuring instrument.
It is easily verified that, if (3.7) is taken as the initial state of a new measurement of the same observable A (using the same simplified model), then the final object state of this latter measurement equals (3.7). This makes the measurement a repeatable one according to a definition of repeatability (e.g. Busch et al. [42], p. 45) to the effect that a second measurement of the same observable should not change the state of the object as it is obtained after the first measurement. Repeatability is a consequence of the validity of weak projection in the present measurement scheme. Therefore, the criticism of the projection postulate, to be discussed in the following (see also section 1.6), will also be applicable to this concept of repeatability.
The process described by (3.5) is sometimes referred to as a pre-measurement (e.g. Peres [148]), transferring microscopic information from the microscopic object to the measuring instrument. In general it is followed by a process of amplification taking place within the measuring instrument. As far as the final states of this amplification process are in one-to-one correspondence to the pointer states appearing in (3.5), the probabilities of the final pointer positions are determined by the pre-measurement. For this reason in the present discussion it is not necessary to explicitly take into account the amplification process, the microscopic information transfer between object and measuring instrument being thought to be the crucial phase of the quantum mechanical measurement process (see also section 3.4).
For instance, in the Stern-Gerlach experiment (cf. section 8.3) the pre-measurement phase is the time interval during which the atom traverses the inhomogeneous magnetic field. During this phase a correlation is established between a spin component of the atom and, for instance, a component of its total momentum (which serves as the pointer observable), without, however, realizing a macroscopic separation of the beams. The amplification phase consists of the time interval during which the atom is moving freely after traversing the field, thus increasing the distance between the two outgoing beams, followed by detection of the atom by a detector in one of the beams.
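The simplified model of this section can be checked on a small example. The sketch below is only an illustration under stated assumptions: a two-dimensional object, a two-dimensional pointer, and hypothetical amplitudes `c`; the function names are not taken from the text. It builds the entangled final state of object+measuring instrument produced by the pre-measurement and takes both partial traces.

```python
def final_state(c):
    """psi[m][j]: amplitude of (object eigenstate m, pointer state j)
    after the pre-measurement; only the pairs with j == m are populated."""
    n = len(c)
    return [[c[m] if j == m else 0.0 for j in range(n)] for m in range(n)]

def reduced_instrument(psi):
    """Partial trace over the object: rho[j][k] = sum_m psi[m][j] psi[m][k]*."""
    n = len(psi)
    return [[sum(psi[m][j] * complex(psi[m][k]).conjugate() for m in range(n))
             for k in range(n)] for j in range(n)]

def reduced_object(psi):
    """Partial trace over the instrument: rho[m][l] = sum_j psi[m][j] psi[l][j]*."""
    n = len(psi)
    return [[sum(psi[m][j] * complex(psi[l][j]).conjugate() for j in range(n))
             for l in range(n)] for m in range(n)]

c = [0.6, 0.8j]  # assumed object amplitudes, normalized
psi = final_state(c)
rho_instrument = reduced_instrument(psi)
rho_object = reduced_object(psi)
born = [abs(x) ** 2 for x in c]  # probabilities of the standard formalism
```

In this example both reduced density matrices come out diagonal, with the squared moduli of the amplitudes on the diagonal, although the state of the combined system is pure and no projection has been applied.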
3.2.2 Von Neumann’s proof of consistency of projection and unitary evolution
Although (3.6) and (3.7) are often considered to be satisfactory because of the possibility of interpreting these density operators as describing states in which measuring instrument and object are, with probability $|c_m|^2$, in states $|\theta_m\rangle$ and $|a_m\rangle$, respectively, they have one drawback as compared with the state vector (3.5): they do not describe the correlation of object and measuring instrument, to the effect that a joint measurement of the compatible observables A and $\Theta$ would always yield as measurement results pairs $(a_m, \theta_{m'})$ with $m = m'$. This correlation is described by the final state (3.5), but is in no way reflected by (3.6) and (3.7). A density operator both describing the correlation inherent in (3.5) and allowing an interpretation analogous to (3.6) and (3.7), is obtained by omitting the “cross” terms in the density operator $|\Psi_f\rangle\langle\Psi_f|$, yielding (see also section 6.3.2)
$$\rho = \sum_m |c_m|^2\, |a_m\rangle\langle a_m| \otimes |\theta_m\rangle\langle\theta_m|. \qquad (3.8)$$
A transition from $|\Psi_f\rangle\langle\Psi_f|$ to density operator (3.8) could be achieved by weak projection in a joint measurement of A and $\Theta$ in state (3.5). Hence, this latter procedure would realize the result desired by a realist interpretation. Since by a unitary transformation pure states are transformed into pure states, it is impossible to obtain (3.8) from the initial state of object+measuring instrument by solving a Schrödinger equation. Some form of projection or reduction will be necessary.
The consistency of such a projection with a unitary evolution of the state vector was a major concern of von Neumann [2]. He tried to establish consistency in the following way. In the final state (3.5) the pointer observable $\Theta$ is measured by letting the measuring instrument interact with a secondary measuring instrument. If value $\theta_m$ is found, then application of the projection postulate, in agreement with (1.70), to the first measuring instrument should yield a transition to the corresponding pointer state $|\theta_m\rangle$. It is now proven by von Neumann that, simultaneously, the object will have to make a transition to the state $|a_m\rangle$ (this proof will be given in a somewhat different form in section 3.2.6). Thus, the observation of the pointer position would cause a discontinuous change of the state vector (3.5) of the combined system of object+measuring instrument according to
$$\sum_{m'} c_{m'} |a_{m'}\rangle |\theta_{m'}\rangle \rightarrow |a_m\rangle |\theta_m\rangle. \qquad (3.9)$$
Hence, there is a consistency between the projection of the state of the object itself and the projection of the state of the measuring instrument. It could even be interpreted in the sense that the former is brought about by the latter. It is not necessary to assume the projection of the state vector of the object; it is sufficient to project the state of the measuring instrument by a (second) measurement performed on this latter object. Von Neumann has employed this to shift the ultimate cause of the projection toward the observer’s consciousness, which is felt to be necessary for locating the ultimate cause of the projection outside the domain of validity of the Schrödinger equation. This is done by considering a chain of measuring apparata, one measuring the other, between the object and the observer’s consciousness (the latter
being considered as the ultimate measuring instrument, performing its own projection by introspection), each measuring instrument being projected by the next measurement in the chain. In order to emphasize the extra-physical character of projection, von Neumann points to the arbitrariness of the exact position where the above-mentioned chain is cut by assuming the projection to take place.
This subject will not be pursued here any further. The role of consciousness in quantum mechanics has been hotly debated, and an active role has been endorsed by eminent physicists (see also section 4.6.7). It seems, however, that they were forced to invoke consciousness as an operational tool in quantum mechanics primarily by starting from a realist interpretation of the formalism. An empiricist interpretation can provide a satisfactory alternative to von Neumann’s psychophysical parallelism, allowing one to cut von Neumann’s chain at the (physical) level of the macroscopic measuring instrument actually used in the experiment, without any necessity of projecting the wave function, and, hence, without any urge to attribute to human consciousness an active role different from its role in classical mechanics.
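The relation discussed in this section between the entangled final state (3.5) and the density operator (3.8) without “cross” terms can be made concrete in a small example. The following is a sketch under assumptions: two object states and two pointer states, hypothetical real amplitudes `c0` and `c1`, and a joint observable `flip_both` chosen here purely for illustration. The two density matrices give identical joint probabilities for the compatible pair of observables, and differ only in a joint observable that is incompatible with both.

```python
def outer(vec):
    """Density matrix |v><v| of a state vector given as a flat list."""
    return [[x * complex(y).conjugate() for y in vec] for x in vec]

def trace_product(a, b):
    """Tr(a b) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

c0, c1 = 0.6, 0.8  # assumed amplitudes
# Basis order: (object 0, pointer 0), (0, 1), (1, 0), (1, 1)
entangled = [c0, 0.0, 0.0, c1]
rho_pure = outer(entangled)
# Same matrix with all off-diagonal ("cross") terms omitted.
rho_no_cross = [[rho_pure[i][j] if i == j else 0.0 for j in range(4)]
                for i in range(4)]

# Joint probabilities of (object eigenvalue, pointer position).
joint_pure = [rho_pure[i][i].real for i in range(4)]
joint_no_cross = [rho_no_cross[i][i].real for i in range(4)]

# Joint observable flipping both the object index and the pointer index.
flip_both = [[1.0 if i + j == 3 else 0.0 for j in range(4)] for i in range(4)]
signal_pure = trace_product(rho_pure, flip_both).real
signal_no_cross = trace_product(rho_no_cross, flip_both).real
```

Here the joint probabilities coincide and show the perfect correlation between object and pointer, while `signal_pure` is nonzero and `signal_no_cross` vanishes. Whether an observable like `flip_both` can actually be measured on a macroscopic instrument is, of course, exactly the question at issue.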
3.2.3 “Orthodox” solution to the “measurement problem”
The transition from the density operator of the final state (3.5) to density operator (3.8) marks the conventional “measurement problem”. It is the problem of the unobservability of the “cross” terms in the density operator of the state (3.5). As pointed out in section 3.1.3 there are two possible explanations of this unobservability: either the “cross” terms are really absent, or they are there but do not contribute to the results of any measurement that can actually be performed, or that has been performed up to now. An advantage of the latter possibility would be that it would not be necessary to invoke the activity either of consciousness or of any other agent to realize a vanishing of the “cross” terms by means of projection or of any other mechanism invented to achieve this goal (see also section 3.4).
Jauch ([29], p. 170) has developed the “orthodox” solution to the problem of the “cross” terms along the latter line, his motivation being primarily a desire to avoid the anthropomorphic element brought into the theory by von Neumann’s projection idea. His proposal is based on the Copenhagen idea that measuring instruments, or at least their pointers, are macroscopic objects, and, for this reason, should be described by classical mechanics. This would imply that the observables of the measuring instrument must all be compatible with the pointer observable. If this is true, then in the standard formalism the “cross” terms are completely unobservable, because they do not have any influence on the measurement results of such observables. For this reason the “cross” terms can be omitted from the density operator. As a matter of fact, a state of a measuring instrument corresponds to an equivalence class of density operators having different “cross” terms but identical diagonal ones. Such an equivalence class is called a ‘macrostate’ by Jauch. According to him only such macrostates are relevant to the description of a measuring instrument.
Jauch’s proposal for solving the “measurement problem” is based on the Copenhagen assumption that a measuring instrument could not function properly if a measurement is executed on it of an observable that is incompatible with its pointer observable. This assumption is formalized in the standard formalism by the theorem of the impossibility of simultaneously measuring observables corresponding to non-commuting Hermitian operators (cf. section 1.9.2). Hence, if this formalism exhausted our possibilities of representing measurements, Jauch would be right in not bothering about such unobservable differences. However, the standard formalism does not rule out the possibility of measuring an observable incompatible with the pointer observable after completion of the (primary) measurement. Consecutive measurements of incompatible observables, performed on the same object, were studied by Davies and Lewis [149], demonstrating the feasibility of such experiments. As a matter of fact, this work started off a development leading to the concept of a generalized observable, represented by a non-orthogonal decomposition of the identity (NODI), or the positive operator-valued measure (POVM) generated by it (cf. section 1.9), and to the insight that the standard formalism fails to account for all possible aspects of quantum measurement. A consecutive measurement of incompatible observables is just a special case of a joint nonideal measurement of incompatible observables, to be discussed in section 7.9. Such measurements can distinguish between the states represented by (3.1) and (3.2).
That, due to the incompatibility of the observables, the instrument no longer functions as a measuring instrument of the originally measured observable during the second measurement is of no importance if the measurement result has been registered before the second measurement is performed. If the generalized formalism of quantum mechanics is applicable, then Jauch’s solution refers only to measurements described by the standard formalism. As will be discussed extensively in chapters 7 and 8, among the generalized observables introduced in section 1.9 it is possible to find observables representing complete measurements, yielding sufficient information for a complete determination of the density operator (cf. sections 3.3.6 and 7.9.4). Such measurements might be able to discriminate between different members of macrostates⁸. For this reason it does not seem to be possible to maintain unobservability of the “cross” terms as a basis for the solution of the “measurement problem”.
8
In section 3.3.2 it will be demonstrated that a further generalization of the formalism may be necessary if the macrostates are not defined by the orthogonal eigenvectors of a standard observable.
Jauch’s terminology is often an empiricist one. Thus, his definition of a quantum mechanical observable refers to the reading of the scale of a measuring instrument ([29], p. 97), as would be appropriate in an empiricist interpretation. On the other hand, however, Jauch ([29], p. 92) defines a quantum mechanical state as the result of a sequence of physical manipulations applied to the object system, and not as a symbolic representation of a preparation procedure. He therefore seems to attribute a realist meaning to his concept of state. In the context of the present discussion the question of the interpretation of the state vector seems to be of less importance, however, because neither in a realist interpretation of the state vector as adhered to by Jauch, nor in an empiricist interpretation, is there any fundamental objection against the occurrence of off-diagonal terms. It is an operational matter whether these terms can be observed in a measurement. Jauch’s conclusion of the unobservability of the “cross” terms just stems from consideration of a too restricted type of measurements, and, for this reason, is unacceptable.
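That sufficiently rich measurement data can discriminate between members of one macrostate may be illustrated for a two-level system. The sketch below uses a simpler stand-in than the complete measurements of chapters 7 and 8: instead of a single generalized observable it assumes three standard measurements (of the three Pauli observables) performed on identically prepared systems. The two density matrices `rho_with_cross` and `rho_without_cross` are hypothetical members of the same macrostate, sharing their diagonal.

```python
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

def expectation(rho, op):
    """Real part of Tr(rho op) for 2x2 matrices given as nested lists."""
    return sum(rho[i][k] * op[k][i] for i in range(2) for k in range(2)).real

def bloch(rho):
    """The three expectation values that fix a two-level density matrix."""
    return tuple(expectation(rho, s) for s in (SIGMA_X, SIGMA_Y, SIGMA_Z))

def reconstruct(ex, ey, ez):
    """rho = (1 + ex*sx + ey*sy + ez*sz) / 2 for a two-level system."""
    return [[(1 + ez) / 2, (ex - 1j * ey) / 2],
            [(ex + 1j * ey) / 2, (1 - ez) / 2]]

rho_with_cross = [[0.36, 0.48], [0.48, 0.64]]
rho_without_cross = [[0.36, 0.0], [0.0, 0.64]]

rebuilt_with = reconstruct(*bloch(rho_with_cross))
rebuilt_without = reconstruct(*bloch(rho_without_cross))
```

The measurement of the third Pauli observable alone (the analogue of the pointer observable) yields the same value for both states; adding the other two recovers the off-diagonal elements and thereby separates the two members of the macrostate.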
3.2.4 Measurements of first and second kind
Critique of measurements of the first kind
Measurements described by (3.3) or (3.5) are often called measurements of the first kind [150], the essential characteristic being that the final state of the object equals its initial state if the object is prepared in a state represented by an eigenvector of A. This is closely connected to the projection postulate discussed in section 1.6, stipulating that immediately after the measurement the object has to be described by the eigenvector of A corresponding to the eigenvalue found in the measurement (see also section 3.2.6). For this reason measurements of the first kind are sometimes called von Neumann measurements. They disturb the object system as little as possible, and are therefore sometimes referred to as ‘ideal measurements’. Since we intend to use this latter term in a different way (cf. section 7.6) we shall stick to the former terminology, however.
Although the belief in the validity of the projection postulate is rather widespread, it has a number of serious problems. Admittedly, some measurement procedures can, at least to a certain approximation, be seen as measurements of the first kind. For instance, from an ontic point of view it is rather plausible that in a position measurement using a photographic plate an electron is in the immediate vicinity of the spot where it induced the chemical reaction responsible for its detection. Nevertheless, since the position observable has a continuous spectrum the projection postulate can be valid only in an approximate sense, because eigenvectors of the position operator cannot represent quantum mechanical states (see also section 3.3.4). The Compton effect (1.71), too, can be seen as an example of a measurement of the first kind of photon momentum, in which the particle serves as a measuring instrument, and a measurement of final particle momentum marks a pointer reading.
Due to the continuity of the spectrum of the momentum operator, a restriction of a strict
application of the projection postulate obtains also here. Another measurement, often treated as an example of a measurement of the first kind, is a measurement of a spin component by means of a Stern-Gerlach device, in which it is assumed that one of the outgoing beams has ‘spin up’, and the other ‘spin down’. This assumption is based on a double misunderstanding, however. First, a Stern-Gerlach measurement can be considered as completed only if, after the particle has traversed the inhomogeneous magnetic field, its presence in one of the beams has been ascertained by means of a particle detector. Since, in general, spin components are not conserved in the interaction with such a detector, the outgoing beam (if it exists at all) will differ from ‘spin up’ even if this was the state when reaching the detector. Hence, if the experiment is a genuine measurement, then, unless a spin conserving detection method is applied, it will not be a measurement of the first kind. Without the detectors the experiment is not a measurement, but rather a state preparation (see also section 4.2.3). A second misunderstanding amounts to the assumption, generally made in textbooks of quantum mechanics, that the measured spin component is conserved during traversal of the magnetic field, and that, hence, the correlation between spin and outgoing beam, established in the pre-measurement phase, is unique. This assumption is not justified, however. For the Stern-Gerlach device to work properly it is necessary that the magnetic field B be inhomogeneous. As will be demonstrated in section 8.3.2, this implies that the magnetic field cannot be unidirectional, and, hence, the interaction Hamiltonian, essentially given by the operator cannot commute with a spin component. Hence, a realistic Stern-Gerlach experiment can at best approximately have a pre-measurement of the first kind, the approximation being better as the magnetic field is more homogeneous. 
Since, however, inhomogeneity of the magnetic field is essential to the functioning of the instrument, it is clear that deviation from first-kindness is even crucial. Measurement of photon number by means of a photon counter is another example of a measurement procedure of which a correct functioning is contingent on not being a measurement of the first kind. This measurement procedure has been described quantum mechanically first by Glauber [151]. It is based on absorption of the photons by the atoms of the detector, each absorbed photon causing the emission of an electron that gives rise to a macroscopically detectable electric pulse. It is essential to a proper functioning of the measuring instrument that all photons are absorbed which are initially present, and, hence, are not present any more in the final state. Therefore, ideally the final state of the object (the electromagnetic field) is the vacuum state. Any deviation from the vacuum means that photons of the initial state have not been detected, i.e. the efficiency of the detection process is smaller than 1. In section 7.2 this will be considered more extensively. It turns out that photon detection with efficiency smaller than 1 is an example of a measurement procedure corresponding to an observable represented by a POVM (cf. section 1.9) rather than
by a Hermitian operator, and, hence, falls outside the domain of application of the standard formalism of quantum mechanics. If the detection efficiency is equal to 1 the POVM reduces to the PVM corresponding to the spectral representation of a Hermitian operator, viz, the photon number operator (A. 15). Although in this limit the observable is a standard one, such a measurement is the opposite of a measurement of the first kind: the detection process is a better measurement of photon number to the extent that it has less in common with a measurement of the first kind. The examples discussed above suggest that the concept of a measurement of the first kind has at most a theoretical importance, and is seldom realized in practice. Wigner [152] has demonstrated that such a measurement is even impossible theoretically if an additive constant of the motion exists, like, for instance, total angular momentum of the combined system of object and measuring instrument, being a conserved quantity if no external forces are applied to this system. Subsequently it was demonstrated by Araki and Yanase [153] that the measurement scheme of a measurement of the first kind can be approximated arbitrarily closely by increasing the mass of the measuring instrument, thus making the object’s contribution to the additive constant of the motion negligible, in this way vitiating Wigner’s counterargument. We shall not discuss this attempt to justify von Neumann’s projection postulate any further, however, because we do not think that it can be justified. Since real measurements can deviate strongly from the ‘first kind’ model, it is evident that von Neumann projection is not an essential ingredient of a quantum mechanical measurement. It, moreover, is based on the classical paradigm (cf. 
section 2.4.2), to the effect that it is assumed that a microscopic object will behave as classically as a measuring instrument, adopting a value of $A$ as soon as the measuring instrument adopts the corresponding pointer value. It is important, however, to distinguish the macroscopic pointer from the necessarily microscopic part of the measuring instrument that is sensitive to the microscopic information transferred to it from the microscopic object by the measurement interaction. We do not have any reason to expect similar behavior of such different objects. Indeed, a quantum mechanical description of the measurement process is necessary primarily because the process of information transfer between object and measuring instrument is a microscopic one, the part of the instrument that is sensitive to this information being necessarily of microscopic dimension, and, therefore, quite different from a macroscopic pointer. Whether also the macroscopic aspects of a measuring instrument can be described by quantum mechanics is an open question (cf. section 2.5.3; see also section 3.4), and, therefore, certainly is no sound basis for a characterization of what a measurement is. If application of quantum mechanics is restricted to the microscopic part of the measurement (i.e. the pre-measurement), then Wigner’s argument does not seem to be applicable, since, due to the necessary
interaction with the macroscopic part of the instrument, the microscopic part cannot be considered as an isolated system. Hence, the question of constants of the motion does not even arise, and the solution by Araki and Yanase seems to be an answer to a non-existing question. It could be thought that a solution of the “measurement problem” on the basis of the macroscopic character of the measuring instrument is an attractive one because in the Copenhagen interpretation this macroscopicity is particularly emphasized (cf. section 4.3). However, this will be seen in section 4.3.4 to be one of the weaknesses of this interpretation. A quantum mechanical description of the measurement process as discussed in the present chapter is hardly reconcilable with Bohr’s correspondence principle. Such a quantum mechanical treatment seems to be necessary, however, to be able to describe the microscopic part of the measurement, thus actually bringing the “measurement problem” into existence. Hence, the macroscopic character of measurement cannot be the real solution to this problem, which should be of a quantum mechanical nature. By restricting considerations to measurements of the first kind non-quantum mechanical (classical) elements have entered the discussion, thus obscuring what is really going on. In classical mechanics it is determined which value a certain quantity “has”, assuming that the measurement does not influence the object. Hence, no distinction needs to be made between the value of a quantity before and after the measurement. In quantum mechanics the attribution of a value to an observable preceding the measurement is problematic, the possibility of doing so being denied by the Copenhagen interpretation (cf. sections 4.4 and 4.5; see also chapter 9). The measurement of the first kind seems to offer a possibility of saving at least part of the classical heritage by assuming that the observable should in any case have its value after the measurement. 
As seen from the examples discussed above this cannot be a general feature of quantum mechanical measurement, however.
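The photon counting example can be illustrated numerically. The sketch below assumes the standard binomial loss model for a counter of efficiency eta, in which each photon is registered independently with probability eta; this model and all names in the code are assumptions made for the illustration, the treatment of this chapter being deferred to section 7.2.

```python
from math import comb

def povm_element(m, eta, n_max):
    """Diagonal, in the number basis n = 0..n_max, of the element for m counts.

    Assumed binomial model: <n|M_m|n> = C(n, m) eta^m (1 - eta)^(n - m).
    """
    return [comb(n, m) * eta ** m * (1 - eta) ** (n - m) if n >= m else 0.0
            for n in range(n_max + 1)]

def count_distribution(p_n, eta):
    """Probabilities of registering m counts, given the photon number distribution p_n."""
    n_max = len(p_n) - 1
    return [sum(w * p for w, p in zip(povm_element(m, eta, n_max), p_n))
            for m in range(n_max + 1)]

two_photons = [0.0, 0.0, 1.0]   # number state with n = 2

print(count_distribution(two_photons, 1.0))   # ideal counter: m = 2 with certainty
print(count_distribution(two_photons, 0.5))   # lossy counter: counts are smeared out
```

For eta = 1 each element reduces to the projector on a number state, so that the counts reproduce the photon number distribution. For eta < 1 the elements are no longer projectors, although they still sum to the unit operator. In neither case does the model say anything about the final state of the field, which for an ideal absorbing detector is the vacuum.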
Measurements of the second kind
In order not to be trapped in the classical paradigm it will be necessary to consider less simplified models of measurement than the ‘first kind’ ones. This, moreover, has the advantage that such bona fide quantum mechanical measurement procedures as photon detection processes, which are not even approximately measurements of the first kind, can be taken into account. Let us consider the (still very simplified) measurement procedures in which (3.5) is replaced by
$$|\Psi_f\rangle = \sum_m c_m\, |\psi_m\rangle \otimes |\theta_m\rangle, \qquad (3.10)$$
where $|\psi_m\rangle$ may be arbitrary normalized states of the object, not necessarily mutually orthogonal ones (for instance, in the case of an ideal photon detector $|\psi_m\rangle$ might be the vacuum state for every $m$). Measurements satisfying (3.10) are often called ‘measurements of the second kind’. This locution is not very fortunate since measurements of the first kind constitute a subset of the set of ‘second kind’ ones, satisfying $|\psi_m\rangle = |a_m\rangle$. However, we shall stick to this terminology. Measurements of the second kind were first considered by Landau and Peierls [154]. It is easily seen that the general scheme (3.10) is as adequate to the purpose of getting knowledge about the state of an object, immediately preceding the measurement, as is the measurement scheme (3.5): the relative frequencies of the different pointer positions of the measuring instrument reproduce the probability distribution predicted by the standard formalism, just as faithfully. Requirements with respect to the state vectors $|\psi_m\rangle$ are connected with the preparative aspect of measurement rather than with the determinative one. They tell something about the state of the object after the measurement rather than about the state before the measurement. This also touches upon a problem of the Copenhagen interpretation to be discussed more fully in section 4.7. When we take the point of view that measurements are performed to get knowledge about the state of an object immediately preceding the measurement, then it is not very important in which state the object is left after the measurement. Of course, it is legitimate to be interested in the state of the object after the measurement, for instance, because a second measurement has to be performed on the same individual object. This problem, however, is completely different from the problem of measurement in the sense of the question to what extent a measurement procedure yields results that are consistent with the Dirac-von Neumann postulates as given in section 1.1. Von Neumann’s projection postulate has no bearing on this.
Many discussions on the foundations of quantum mechanics are heavily relying on the concept of measurement of the first kind. This, however, induces a considerable confusion with respect to the notions of ‘preparation’ and ‘measurement’, playing an important role in the interpretation of the Heisenberg inequality, and criticized e.g. by Margenau and Ballentine (cf. sections 4.7.2 and 4.7.3). As far as the conclusions of such discussions depend on the assumption of ‘first kind’ness they can hardly be taken seriously. For this reason this assumption will systematically be avoided in this book. Although it is possible to consider a measurement of the first kind as a special case of the more general quantum mechanical measurement procedure of the second kind, from the point of view of measurement the first kind ones are of minor importance. Unfortunately, they may even be harmful to an understanding of quantum mechanical measurement by imposing unnecessary requirements to be satisfied by measurement procedures, thus directing attention to features that are not only inessential but that may even be misleading.
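That the pointer statistics do not depend on the choice of the final object states can be checked in a small two-dimensional sketch. The coefficients, the three sets of final states and the function names are assumptions chosen for illustration; the final state is taken to have the form sum_m c_m |psi_m> (x) |theta_m> with orthonormal pointer states.

```python
from math import sqrt

def final_state(c, psi):
    """Amplitudes Psi[i][m] of sum_m c_m |psi_m> (x) |theta_m>.

    i labels an orthonormal basis of the object, m an orthonormal pointer state.
    """
    dim = len(psi[0])
    return [[c[m] * psi[m][i] for m in range(len(c))] for i in range(dim)]

def pointer_probabilities(amplitudes):
    """p_m = sum_i |Psi[i][m]|^2, the probability of pointer position m."""
    n_pointer = len(amplitudes[0])
    return [sum(abs(row[m]) ** 2 for row in amplitudes)
            for m in range(n_pointer)]

s = 1 / sqrt(2)
c = [0.6, 0.8]                           # assumed expansion coefficients c_m
first_kind = [[1.0, 0.0], [0.0, 1.0]]    # final states equal to the eigenvectors
second_kind = [[1.0, 0.0], [s, s]]       # non-orthogonal final states
absorbing = [[1.0, 0.0], [1.0, 0.0]]     # same final state for every m

for psi in (first_kind, second_kind, absorbing):
    print(pointer_probabilities(final_state(c, psi)))
```

Each of the three lines printed is the distribution (0.36, 0.64), up to rounding. The determinative aspect of the measurement is therefore the same in the three cases, which differ only in their preparative aspect.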
3.2.5 The “measurement problem” for measurements of the second kind
Determining, analogously to (3.7), the density operator of the final object state for a measurement of the second kind we find from (3.10)
$$\rho_{of} = \mathrm{Tr}_a\, |\Psi_f\rangle\langle\Psi_f| = \sum_m |c_m|^2\, |\psi_m\rangle\langle\psi_m|. \qquad (3.11)$$
It is clear from this expression that, if the measurement is not of the first kind, the projection postulate is not satisfied in the sense understood by von Neumann: there is no special relation of the final object state (3.11) to the eigenstates of the measured observable. On the contrary, as will be proven in section 3.3.4, it is possible to interpret (3.11) in the sense that the final state of the object may be represented by state vector $|\psi_m\rangle$ if the measurement result (pointer position) was $\theta_m$. Hence, if the measurement would realize a projection or reduction at all, then this seems to be some generalized “projection” onto a member of a set of (possibly nonorthogonal) states $|\psi_m\rangle$ rather than onto one of the orthogonal states $|a_m\rangle$. This can be tested by performing a second measurement on the object after completion of the first one (see also section 3.3.4). It is clear that, from the point of view of the final object state, there is not a single motive for a restriction to measurements of the first kind.
Let us next consider the final state of the measuring instrument. Analogously to (3.6) we find
$$\rho_{af} = \mathrm{Tr}_o\, |\Psi_f\rangle\langle\Psi_f| = \sum_{m,m'} c_m c_{m'}^{*}\, \langle\psi_{m'}|\psi_m\rangle\, |\theta_m\rangle\langle\theta_{m'}|. \qquad (3.12)$$
It is important to notice that, if $\langle\psi_{m'}|\psi_m\rangle \neq 0$ for $m \neq m'$, in this expression the same “cross” terms appear that were deemed so problematic in the example of Schrödinger’s cat. This may explain the popularity of measurements of the first kind, for which the “cross” terms are absent in (3.12). Evidently, for measurements of the second kind in general the “measurement problem” already arises independently of the question of the correlation between object and measuring instrument as expressed by (3.8); as far as the existence of “cross” terms is considered to be problematic, measurements of the second kind pose a problem already when considering the measuring instrument by itself (see footnote 9). It is clear that this problem does not exist for measurements of the first kind. The problem cannot be evaded, however, by restricting ourselves to such measurements, since these hardly play any role in actual experimental practice.
Footnote 9: The possibility that the vectors $|\psi_m\rangle$ constitute an orthonormal set different from the eigenvectors $|a_m\rangle$ will be ignored here because, although not impossible, this does not seem to be very realistic. It could occur, for instance, if the measurement interaction is treated as induced by an external field (as in the Stern-Gerlach experiment), neglecting the dynamics of the field.
As in the case of Schrödinger’s cat, discussed in section 3.1, different strategies may be used to cope with the “cross” terms in (3.12). One possibility to circumvent the problem might seem to be a diagonalization of the density operator (3.12), thus obtaining a diagonal representation (cf. (1.36)). However, the eigenvectors appearing in this representation are linear combinations of the pointer states $|\theta_m\rangle$. This evidently makes the remedy worse than the disease, since this would yield a mixture of superpositions of living and dead cats. Our conclusion must be that in this way a solution of the “measurement problem” cannot be found (compare [155]). Another possibility might be a generalization of von Neumann’s projection postulate as expressed by (3.8), to the effect that during the measurement the system of object+measuring instrument would make a transition to the state
$$\sum_m |c_m|^2\, |\psi_m\rangle\langle\psi_m| \otimes |\theta_m\rangle\langle\theta_m|. \qquad (3.13)$$
Indeed, by partial tracing of (3.13) over the Hilbert space of the object we obtain (3.6). As will be seen in section 3.2.6 this density operator would also correctly reproduce any result of a second measurement performed in the final state of the object, conditional on a measurement result of the first one. As in the discussion of section 3.1.2, interpretation may play an important role in assessing the appropriateness of (3.12). Thus, in the minimal interpretation this expression is completely satisfactory, since it correctly predicts the probability distribution of the pointer positions,
$$p_m = |c_m|^2, \qquad (3.14)$$
which is all that is required in this interpretation. Hence, from this point of view (3.13) is superfluous. Also in an empiricist interpretation does (3.12) not pose any insurmountable problem. Although the state does contain “cross” terms, in an empiricist interpretation this does not imply that the measuring instrument (cat) would “be in” some strange state. It rather tells something about the preparation procedure, possessing certain quantum mechanical properties not commonly observed in macroscopic physics. Rather than being explained away, the “cross” terms may constitute a challenge to physicists to devise measurements of the measuring instrument (cat) that, unlike observation of its pointer position, are sensitive to the “strange” terms (compare the discussion of Schrödinger cat states in section 3.1.3). Whereas the conventional “measurement problem” had as an objective to find reasonings explaining the alleged “unobservability” of the “cross” terms, on the contrary a quantum mechanical treatment of measurement suggests their observability. Thus, realization
of measurements that are sensitive to the “cross” terms may turn a pseudo problem into a problem that can be tackled experimentally. Also in a realist interpretation the “cross” terms in (3.12) need not be problematic. In particular, if the state vector is interpreted as describing an ensemble, then the presence of the “cross” terms might be conceived as indicative of a non-classical character of the ensemble. This is hardly surprising within the domain of application of quantum mechanics. The meaning of a quantum state as presented in most modern textbooks of quantum mechanics is in agreement with such an ensemble interpretation. This interpretation should be distinguished from an (equally realist) individual-particle interpretation in which the state vector is thought to describe an individual object, the latter being the common view in older textbooks adhering to the Copenhagen interpretation of quantum mechanics (see also chapters 4 and 6). Superpositions of macroscopic pointer states (like the superposition of a living and a dead cat), and the “cross” terms in the corresponding density operators are undesirable only if the state vector is thought to describe an individual measuring instrument (cat). Apart from the fact that quantum mechanical predictions can be put to a test only by comparing them with the results of a large number of measurements carried out on “identically prepared” individual objects, the “Schrödinger cat paradox” of superposed macroscopic pointer states may have been an important motive for the change within the realist interpretation, to be observed in textbooks, from an individual-particle to an ensemble version. In particular, interference fringes in interference experiments, described by “cross” terms like the ones discussed here, are obtained by collecting large numbers of individual data. It is true that for macroscopic objects interference is more difficult to observe than for microscopic ones. 
This is no reason, however, for an a priori negation of the existence of “cross” terms in (3.12), which may be seen if probed with sufficient care. If no interference is observed, and measurement results would be in agreement with (3.13), then it seems to be more appropriate to ask for an explanation of the discrepancy with (3.12) than to assume the validity of von Neumann’s projection postulate. Decohering mechanisms may be responsible for an effective reduction of interference amplitudes (cf. section 3.4), thus possibly yielding such an explanation.
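The role of the “cross” terms can be illustrated by comparing, in a two-dimensional sketch, the reduced state of the measuring instrument with its diagonal part. All names and numbers are assumptions made for the illustration; in particular the ‘flip’ observable of the instrument is hypothetical, and stands for any observable that is incompatible with the pointer observable.

```python
from math import sqrt

def inner(u, v):
    """Scalar product <u|v> of two vectors given as lists of numbers."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def instrument_state(c, psi):
    """rho[m][k] = c_m conj(c_k) <psi_k|psi_m>, in the orthonormal pointer basis."""
    n = len(c)
    return [[c[m] * complex(c[k]).conjugate() * inner(psi[k], psi[m])
             for k in range(n)] for m in range(n)]

def dephased(rho):
    """Diagonal part only, as obtained if the 'cross' terms were absent."""
    n = len(rho)
    return [[rho[m][k] if m == k else 0.0 for k in range(n)] for m in range(n)]

def expectation_flip(rho):
    """<X> for the assumed observable X = |theta_0><theta_1| + |theta_1><theta_0|."""
    return (rho[0][1] + rho[1][0]).real

s = 1 / sqrt(2)
c = [0.6, 0.8]
rho_first = instrument_state(c, [[1.0, 0.0], [0.0, 1.0]])   # orthogonal object states
rho_second = instrument_state(c, [[1.0, 0.0], [s, s]])      # overlapping object states

print(expectation_flip(rho_first))             # no cross terms
print(expectation_flip(rho_second))            # cross terms contribute
print(expectation_flip(dephased(rho_second)))  # diagonal mixture: contribution gone
```

The pointer probabilities, i.e. the diagonal elements, are the same in all three cases. Only an observable with off-diagonal matrix elements in the pointer basis distinguishes the state with “cross” terms from the corresponding mixture.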
3.2.6 Conditional preparation
From the foregoing sections it is evident that a measurement process has two different aspects, viz, a preparative and a determinative aspect, the first one being concerned with how an object is prepared (in its final state) by the interaction with a measuring instrument, the latter referring to what can be learned about (the initial state of) an object by registering pointer positions of a measuring instrument. Unfortunately this distinction is not always made in the literature, thus causing quite
a bit of confusion (see also sections 4.6.1 and 4.7). This confusion is particularly evident in the application of the projection postulate (1.70) or (1.76) as a property of a quantum mechanical measurement, determining the state of the object after the measurement rather than ascertaining some property of the object prior to the measurement. Hence, the projection postulate is associated with preparation rather than with measurement. Undoubtedly, an important cause of confounding preparative and determinative aspects of quantum mechanical measurement in applying the projection postulate is the problem that it is not clear in what sense an individual measurement result can be attributed to the initial state of the object if this state is a superposition of eigenvectors (see also chapter 6). It is rather unfortunate that for a long time it has generally been thought that this problem could be “solved” by the assumption that, if this value cannot be attributed to the individual object immediately before the measurement, it should be possible to attribute it to the object immediately after the measurement. As was already argued before, the latter use of the projection postulate is not generally applicable to quantum mechanical measurement procedures. Yet, as a preparation principle the projection postulate (or a generalization of it) can play a role in the formalism. Indeed, it is legitimate to represent the state of the object by state vector $|\psi_m\rangle$ if a measurement of the pointer observable in state (3.10) yielded pointer position $\theta_m$. Registration of pointer position $\theta_m$ can be associated with the transition
$$|\psi\rangle \longrightarrow |\psi_m\rangle. \qquad (3.15)$$
This is a generalization of the strong projection principle (1.70), not specifying the final object state as an eigenvector of the measured observable but as a state (co-)determined by the nature of the interaction between object and measuring instrument. This can be justified as follows (e.g. [39, 156]): It is possible, at least in principle, to perform in state (3.10) a simultaneous measurement of the pointer observable and an arbitrary observable $B$ of the microscopic object. The pointer observable and $B$ operate in different Hilbert spaces (those of the measuring instrument and of the object, respectively), and, hence, are compatible. Applying the theory of the joint measurement of compatible observables (cf. section 1.3) the bivariate probability distribution of the eigenvalues of these two observables is well-defined, and the conditional probability (1.28) of an eigenvalue $b_n$ of $B$ (with eigenvector $|b_n\rangle$) for a given pointer position $\theta_m$ can easily be calculated as
$$p(b_n|\theta_m) = |\langle b_n|\psi_m\rangle|^2. \qquad (3.16)$$
This holds true for any observable $B$. Hence, if we restrict ourselves to those measurement results of $B$ which are coincident with pointer position $\theta_m$, then (3.16) is consistent with the assumption that after a measurement yielding result $\theta_m$ the state of the object is described by the vector $|\psi_m\rangle$.
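The justification just given can be followed step by step in a small numerical sketch: the joint distribution of pointer position and eigenvalue of B is computed from a final state of the form sum_m c_m |psi_m> (x) |theta_m>, and subsequently conditioned on the pointer position. The states, coefficients and names used are assumptions chosen for illustration only.

```python
from math import sqrt

def inner(u, v):
    """Scalar product <u|v> of two vectors given as lists of numbers."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def joint_distribution(c, psi, b_vectors):
    """p[n][m] = |c_m <b_n|psi_m>|^2: joint probability of b_n and pointer position m."""
    return [[abs(c[m] * inner(b, psi[m])) ** 2 for m in range(len(c))]
            for b in b_vectors]

def conditional_on_pointer(joint, m):
    """p(b_n | m) = p(b_n, m) / sum_n p(b_n, m)."""
    norm = sum(row[m] for row in joint)
    return [row[m] / norm for row in joint]

s = 1 / sqrt(2)
c = [0.6, 0.8]                   # assumed coefficients c_m
psi = [[1.0, 0.0], [s, s]]       # assumed non-orthogonal final object states
b_vectors = [[s, s], [s, -s]]    # assumed eigenvectors of the observable B

joint = joint_distribution(c, psi, b_vectors)
for m in range(2):
    print(conditional_on_pointer(joint, m),
          [abs(inner(b, psi[m])) ** 2 for b in b_vectors])
```

For each pointer position the two lists printed agree up to rounding: the conditional distribution of B is the one that would be obtained from the state vector |psi_m> alone, the coefficients c_m dropping out in the normalization.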
In contrast to (1.70) the transition (3.15) is not a projection since it is not idempotent. In general it need not even realize a reduction of the indeterminacy of the measured observable. It is called a conditional preparation since it describes the final state of the object, conditional on measurement result $\theta_m$ of the pointer observable. In the case of a measurement of the first kind conditional preparation and strong projection (1.70) coincide. It would be better to refer to it as a ‘preparation of the first kind’. In the general case it is (3.15), rather than (1.70), that governs the conditional preparation. This result is in agreement with the opinion voiced in section 1.6 that the projection postulate in the form (1.70) does not make sense in general. The projection postulate, taken as a general property of measurement, is seldom, if ever, satisfied. The requirement that measurements should satisfy it would disqualify as genuine quantum mechanical measurement processes measurement methods like, for instance, efficient photon counting, for which the final state is ideally given by (cf. (3.10))
$$|\Psi_f\rangle = \sum_n c_n\, |0\rangle \otimes |\theta_n\rangle,$$
yielding $|\psi_n\rangle$ to be equal to the vacuum state $|0\rangle$ of the electromagnetic field for all $n$, and, hence, in general to be very different from the number states $|n\rangle$ corresponding to the number of registered photons. Only measurements of the first kind would meet the requirement that $|\psi_m\rangle = |a_m\rangle$. There is, however, not a single reason to restrict the concept of quantum mechanical measurement to this very special type of measurements. For this reason the projection postulate as a measurement principle does not seem to be appropriate.
However, in general a measurement is also a preparation. After the measurement the object has a quantum mechanical state, viz. the state given by (3.11). As seen above, the post-measurement state of the object can be specified more precisely if the measurement result of the preceding measurement is taken into account. Any quantum mechanical experiment to be performed on those objects leaving the measuring instrument in coincidence with pointer position $\theta_m$ can be described by the state vector $|\psi_m\rangle$ (cf. figure 3.1). Hence, the transition (3.15) specifies the state vector to be attributed to the preparation of the microscopic object, conditional on measurement result $\theta_m$. For this reason transition (3.15) is a principle of conditional preparation, applicable to any measurement.
Conditional preparation is a widely used experimental procedure to prepare an object in a well-defined state. Thus, in the Compton effect (cf. section 1.6) it may be applied to select photons corresponding to those particles of which the final momentum is found (measured!) within a certain range. A subsequent scattering experiment performed with these photons can then be described assuming the state of the photons to be represented by the density operator obtained from (1.71) by projecting onto that range. Similar measurement procedures are particularly useful if a state preparation is required of an eigenstate of an observable having a discrete spectrum. Using a measurement of the first kind this can be achieved by means of conditional preparation. Measurements of the first kind are therefore important tools of state preparation.
The difference between measurement and conditional preparation has been realized already long ago. Thus, Kemble ([157], section 41) distinguished predictive and retrospective measurements, the former actually being conditional preparations. An analogous distinction (viz. between initial and final experiments) was made by Fock ([158], section I.1.4) already in 1932. Yet, ‘preparation’ and ‘measurement’ have not always been sufficiently distinguished. The Compton effect may have been instrumental in shaping Heisenberg’s idea of a ‘quantum measurement’ as a state preparation of the object’s final state, to be represented by an eigenvector of the measured observable (cf. section 4.6.1). An analogous physical situation will be encountered in chapter 5 in the discussion of the Einstein-Podolsky-Rosen experiment, which, like the Compton effect, can be interpreted as a preparation procedure of a part of the object system, conditional on a measurement performed on another part (cf. section 5.3.1). In both issues it will become evident that failure to distinguish projection as a measurement principle from its legitimate application as a procedure of conditional preparation has caused quite a bit of confusion. Unfortunately, in the widely discussed examples given above the transitions (3.15) are actually von Neumann projections, thus suggesting the latter’s general applicability to quantum measurement. Copenhagen ideas on measurement have largely been shaped by considering such examples, thereby arriving at a rather stereotyped view.
Thus, if the states $|\psi_m\rangle$ are spatially separated for different values of $m$, then the object’s position can serve as a pointer observable (compare, in particular, the Stern-Gerlach experiment, cf. section 3.2.4), thus allowing (certain features of) the preparation of the final object state to be interpreted as a
measurement result. In such a special case spatial separation of states entails their orthogonality, thus making the measurement a first kind one. Possibly, the large interest in measurements of the first kind, to be observed in the literature, can be explained by the fact that such special measurement procedures have been taken as paradigms of general quantum measurement (for instance, Kemble ([157], section 42e); also compare Davies’ notion of ‘instrument’ [37]). However, for quantum mechanical measurements equality of conditionally prepared states and eigenstates is the exception rather than the rule. As a matter of fact, if considered as measurements the Compton and EPR examples are very special ones (see chapters 5 and 9 for a discussion of the EPR problem), whereas Stern-Gerlach at best is a first kind measurement only in an approximate sense (cf. section 8.3). Also Heisenberg’s idea that quantum mechanical measurement results should correspond to final states of the microscopic object applies to very special cases only. In general the vectors $|\psi_m\rangle$ are not even orthogonal. The notion of ‘measurement of the first kind’ is not sufficient to encompass most quantum mechanical measurement procedures that are actually performed. Therefore, it hardly seems opportune to maintain this feature as a general property of quantum measurement. In particular, the Copenhagen replacement of ‘measurement’ by ‘preparation’ has been a source of confusion by directing the attention to the post-measurement object state rather than to the post-measurement state of the measuring instrument.
Analogous remarks can be made with respect to weak projection. If no selection is made on the basis of a reading of the pointer position then the preparation of the final state of the object should be described by the density operator (3.11).
For the purpose of measurements conditional on the value of $m$ it evidently is allowed to consider each term of this density operator to correspond to a partial beam as depicted in figure 3.1. Whether this implies for the system of object+measuring instrument a transition to the state (3.13) is an open question, however, because a test would require a measurement of an observable incompatible with the pointer observable.
Transition (3.15) can be conceived as a generalization of the von Neumann-Lüders projection in a preparative sense (cf. section 1.6). It is an example of an operation (as opposed to a measurement) (cf. Kraus [39]), more precisely, a selective operation in the sense of conditional preparation. The transition to (3.11) is a nonselective operation changing the initial density operator $\rho$ of the object to the post-measurement one. Kraus demonstrated that nonselective operations can generally be represented by a set of operators $A_m$ according to
$$\rho \longrightarrow \sum_m A_m\, \rho\, A_m^{\dagger}.$$
A priori there need not necessarily exist any relation between the preparative and the determinative aspects of a quantum mechanical measurement: the final states (3.11) of a measurement process need not have any special relation to the POVM representing the measurement (see e.g. Braunstein and Caves [159]). In particular, in case of a measurement of a PVM the operators representing the operation need not be the projection operators of that PVM (as would be the case in a measurement of the first kind).
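A two-dimensional sketch may illustrate this last remark. It assumes, for the purpose of illustration only, that the operation of a measurement of the second kind is represented by operators of the form A_m = |psi_m><a_m|; this choice, like all names and numbers in the code, is an assumption and is not taken from the text.

```python
from math import sqrt

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def operation_element(psi_m, m):
    """Assumed form A_m = |psi_m><a_m|, written in the eigenbasis |a_m>."""
    dim = len(psi_m)
    return [[psi_m[i] if j == m else 0.0 for j in range(dim)]
            for i in range(dim)]

def nonselective(rho, elements):
    """rho -> sum_m A_m rho A_m^dagger."""
    n = len(rho)
    out = [[0.0] * n for _ in range(n)]
    for a in elements:
        term = matmul(matmul(a, rho), dagger(a))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out

s = 1 / sqrt(2)
psi_in = [0.6, 0.8]                                # initial state, c_m = (0.6, 0.8)
rho_in = [[x * y for y in psi_in] for x in psi_in]
elements = [operation_element([1.0, 0.0], 0), operation_element([s, s], 1)]

rho_out = nonselective(rho_in, elements)               # mixture of the |psi_m>
effect_1 = matmul(dagger(elements[1]), elements[1])    # A_1^dagger A_1
square_1 = matmul(elements[1], elements[1])            # A_1 A_1
print(rho_out, effect_1, square_1, sep="\n")
```

In this sketch the products A_m^dagger A_m are the projection operators |a_m><a_m| of the PVM, so that the determinative aspect is that of a standard observable, whereas A_1 itself is not idempotent, and the final state is a mixture of the non-orthogonal states |psi_m> with weights |c_m|^2.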
Conditional preparation and interpretations
The issue of conditional preparation, although in the first place regarding the formalism of quantum mechanics itself, also has a certain interpretative importance. Thus, in a realist individual-particle interpretation transition (3.15) is thought “really” to take place, being caused in some sense by the act of observation. This interpretation has to cope with the question of what is the physical mechanism realizing the transition, as well as with the rather strange properties of that mechanism. Thus, in a measurement of position the strong transition (1.70) would require the wiping out of the alleged reality (described by the wave function) outside the region where the particle is actually observed (see footnote 10). If applied to a detector, present in a region different from the one in which the particle is actually observed, this amounts to an objection made by Renninger [162] to the effect that projection can also take place if there has not been any interaction between object and measuring instrument, viz. in a so-called ‘negative-result measurement’. The impossibility of finding a reasonable physical mechanism realizing the strong transition could be a reason, additional to the one mentioned in section 3.2.5, for adopting a (realist) ensemble interpretation rather than an individual-particle one (cf. chapter 6). In an ensemble interpretation it is sufficient to consider weak projection, only in a realist version requiring an explanation of the vanishing of the “cross” terms. This might seem to be quite a bit easier than explaining the strong transition (see also section 3.4): as in an empiricist interpretation, a conditional preparation could correspond to a selection of a subensemble. However, as will be discussed in section 6.4.3, a transition from an individual-particle interpretation to an ensemble one does not solve all problems if the interpretation is a realist one.
10 We do not refer here to difficulties of a relativistic description of state projection (e.g. d’Espagnat [160], chapter 8.3), which seem to be solvable by adopting a relativistic model [161].
In an empiricist interpretation the process of conditional preparation does not require a physical explanation at all. In this interpretation a density operator is not a description of the object itself (neither individual particle nor ensemble), but a symbolic representation of a preparation procedure (see also section 3.3.4). A transition from the initial state to the conditionally prepared one has just the significance of changing the preparation procedure: when only those objects are considered that correspond to a given pointer position then, as demonstrated above, the probabilities of all subsequent measurements can be obtained from the conditionally prepared state vector. Consequently, this latter state vector in an operational sense represents a new preparation procedure. It should be attributed
no other significance than that of a label of a certain combination of macroscopic measurement arrangements, observations (of measurement results), and decisions to select particles on the basis of these observations. As a matter of fact, the event corresponding to ‘the measuring instrument is remaining in its initial state’ (Renninger’s negative-result measurement) can give rise to the adoption of a preparation procedure conditional on this very event. Whereas ‘negative-result measurements’ are problematic in a realist interpretation, they do not pose any problem in an empiricist one. In an empiricist interpretation the correlation between an observation of an individual measurement result and transition (3.15) is not thought to be a causal one, governed by some physical principle. Although the decision to change the preparation procedure is triggered by the observation of a measurement result, this observation does not in a physical sense “cause” the projection (footnote 11). An analogous view of projection can be encountered in the Copenhagen interpretation (cf. chapter 4), in which an instrumentalist conception of the state vector allows one to look upon projection as a subjective change of the observer’s perspective. If the precise value of the measurement result is not registered, and if quantum correlations between object and measuring instrument are ignored, then the preparation procedure (including the measuring instrument) may be described in an operational sense by density operator (3.13). However, if such correlations are thought to be interesting, then the preparation procedure must be represented by the exact final state. Once again, in an empiricist interpretation it is not necessary to justify the disappearance of the “cross” terms in the process of weak projection by means of some causal mechanism like, e.g., decoherence (cf. section 3.4).
Also here the projection is not “caused” by the observation, but is just a change in the description of the state of the system made on the basis of certain subjective considerations.
11 As far as the interpretation of projection presented here is a “subjectivistic” one (decisions being made by a subject), this subjectivity is not of a specifically quantum mechanical nature.
12 Note, however, the fundamental difference between von Neumann projection and Bohr’s quantum jump model: the former describes a transition from a superposition to an eigenstate of the measured observable, whereas in the latter the transition is between two eigenstates.
3.2.7 Quantum jumps
Von Neumann’s projection postulate may be inspired by the idea, already going back to Bohr’s 1913 atomic model, that quantum systems can undergo discontinuous transitions like those between stationary states of an atom (footnote 12). In the case of von Neumann projection such a transition would be connected in some sense to an act of observation. The contradiction of such a discontinuous transition with the continuous evolution of the solutions of the Schrödinger equation has been a major
reason for doubting the reality of von Neumann projection (e.g. Margenau [163]), as well as the reality of quantum jumps within an atom. Moreover, the quantum mechanical measurement results are the probabilities (cf. section 1.1), evolving in time in the continuous way governed by the Schrödinger equation. Such probabilities can be measured only as relative frequencies in an ensemble. For this reason the discontinuities, being attributes of individual events (if existing at all), were widely considered to be inaccessible to observation. Dehmelt [164], and later Cook and Kimble [165], realized, however, that the new technique of isolating a single ion (which became possible after the development of the Paul trap) might enable a prolonged observation of one and the same ion, thus possibly yielding experimental evidence of discontinuous transitions. Soon afterwards such experiments were actually carried out [166], all of them being based on the phenomenon of intermittent fluorescence, in which an ion is continuously excited by resonant light beams inducing transitions from a stationary state either to a short-lived state or to a metastable state (cf. figure 3.2). In the former case fluorescence radiation can be observed because of de-excitation to the initial stationary state. Occasionally this radiation is quenched for some time if a transition to the metastable state has occurred, thus barring immediate decay. The discontinuous graph depicted in figure 3.3 can be considered as a sample function of a stochastic process in which the transitions take place rapidly during certain time intervals (yielding fluorescence) and are absent during the quenching periods. The average quenching time equals the lifetime of the metastable state. These experiments seem to indicate that the discontinuous transitions between stationary states, as envisaged in the Bohr model, “really” take place.
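The statement that the average quenching time equals the lifetime of the metastable state, and the contrast between a jumping sample function and a smooth ensemble average, can be illustrated by a small simulation. The sketch below is a toy model and is not taken from the experiments cited: it assumes that bright and dark (quenched) periods are exponentially distributed, and the names `sample_function`, `tau_bright` and `tau_dark` are hypothetical.

```python
import random

def sample_function(tau_bright, tau_dark, n_periods, rng):
    """Return (state, duration) pairs alternating between 'bright' and 'dark'.

    Assumption (not from the text): both kinds of period are exponentially
    distributed, as for a memoryless two-state process.
    """
    periods = []
    for i in range(n_periods):
        if i % 2 == 0:
            periods.append(("bright", rng.expovariate(1.0 / tau_bright)))
        else:
            periods.append(("dark", rng.expovariate(1.0 / tau_dark)))
    return periods

def mean_dark_time(periods):
    """Average duration of the quenched periods of one record."""
    dark = [d for state, d in periods if state == "dark"]
    return sum(dark) / len(dark)

def intensity_at(periods, t):
    """Fluorescence (1 = on, 0 = quenched) of one sample function at time t."""
    elapsed = 0.0
    for state, duration in periods:
        elapsed += duration
        if t < elapsed:
            return 1 if state == "bright" else 0
    raise ValueError("t lies beyond the simulated record")

# Arbitrary units; tau_dark plays the role of the metastable lifetime.
tau_bright, tau_dark = 3.0, 1.0
record = sample_function(tau_bright, tau_dark, 40000, random.Random(0))
print("average quenching time:", mean_dark_time(record))

# Each individual record only takes the values 0 and 1; the average over
# many independent records at a fixed time takes an intermediate value.
ensemble = [sample_function(tau_bright, tau_dark, 200, random.Random(seed))
            for seed in range(2000)]
avg = sum(intensity_at(p, 20.0) for p in ensemble) / len(ensemble)
print("ensemble-averaged intensity at t = 20:", avg)
```

In this toy model the ensemble average settles near `tau_bright / (tau_bright + tau_dark)`, whereas each single record keeps jumping between the two values.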
The discontinuities are smoothed out if averages are taken over a large ensemble of sample functions, ensemble averages behaving continuously. The phenomenon of intermittent fluorescence as represented in figure 3.3 gives the intensity of the fluorescent light emitted by an individual atom during some time. It can in the following way be understood on the basis of the notion of conditional preparation discussed above. Let the light be resonant with both the 0 – 1 and the 0 – 2 transition, and be applied in very short pulses at times
with periods T between pulses small compared with the lifetime of the metastable state (e.g. Beige et al. [167]), and the duration of the pulses small compared to T. Then each pulse can be interpreted as a measurement of an observable with two different values, corresponding to two eigenvectors, respectively (footnote 13). The photon number of the fluorescent light of a pulse plays the role of pointer observable. If the initial state was the metastable state, no fluorescent photon will be observed, which is to be interpreted as one measurement result; for the other eigenvector as initial state a fluorescent photon will be obtained, to be interpreted as the other measurement result. If, moreover, it is assumed that the measurement is of the first kind, then the latter measurement result is accompanied by a transition to the corresponding eigenvector (which is the conditionally prepared state). On the other hand, observation of quenching can be interpreted, in the sense of a negative-result measurement [168], as the former measurement result. In this latter case conditional preparation of the ion state should yield the metastable state as the conditionally prepared state. Measurements of photon number are performed consecutively, the object being allowed to interact continuously with the resonant light. The problem is how to reconcile the discontinuities observed in figure 3.3 with the continuity of quantum mechanical results described by the Schrödinger equation. The solution to this problem strongly depends on the interpretation of quantum mechanics adhered to. Whereas all interpretations agree that some form of projection or conditional preparation is an essential ingredient, they vastly disagree with respect to its physical meaning. The measure of acceptability of the different physical meanings could be an indication of the acceptability of each of the interpretations. Let us start with the empiricist interpretation, in which the state vector is seen as a representation of a preparation procedure rather than a description of the microscopic object.
13 In a realist interpretation this is often referred to as a measurement testing whether the ion “is” in one or the other of these two states.
If the measurement results are ignored, then, according to this interpretation, the quantum mechanical description can yield only a continuous state evolution. The discontinuous graph of figure 3.3 is not
thought to be an intrinsic element of quantum mechanics. It should rather be seen as representing a (macroscopic) compound event consisting of a set of individual measurement results of a sequence of consecutive measurements performed on the same ion, each of these individual measurements corresponding to a possible number or of fluorescent photons. In an empiricist interpretation quantum mechanics is thought to yield only the probabilities of these compound events (compare (3.34)), and to say nothing about the reality of quantum jumps, even though in an ontic sense the emission of a photon by an ion, and the concomitant change of the ion, may be considered as real processes, happening at a certain moment, but needing a subquantum theory for its description (cf. section 2.4.4). On the basis of conditional preparation the discontinuities can be understood as discontinuous changes of the description, based on information obtained by the measurements, allowing to take as the new state vector as soon as is obtained as a measurement result. The discontinuous change can be seen as a consequence of a transition to a description of a subensemble. The probabilities (3.34) are continuous functions of the times of the consecutive measurements. The empiricist interpretation should be compared with the realist one. Like in an empiricist interpretation, in the ensemble version of a realist one conditional preparation corresponds to selection of a subensemble. Rather than on observation, the selection is then based on whether an element of the ensemble has either “really” jumped, or not. As in an empiricist interpretation the discontinuous changes of the state need not be interpreted as physical processes that are “really” happening, since it is just a matter of the (quantum mechanical) description. Since each observation of a fluorescent photon can be attributed to a jump process, from an operational point of view there is no difference between the two interpretations. 
In particular, the state transition need not be governed by von Neumann projection, but may be to states co-determined by the measurement interaction. Things are different, however, in the individual-particle version of the realist interpretation. In this interpretation the quantum jumps are not only thought to be really happening, but also to be described by quantum mechanics. Hence, individual trajectories like that of figure 3.3 are thought to be obtainable from the quantum mechanical formalism. Discontinuous jumps pose a real problem if the individual particle is thought at each time to “be” in a pure state, because they are inconsistent with the continuous evolution of solutions of the Schrödinger equation. For this reason the formalism of quantum stochastic differential equations has been developed (e.g. [169]; for a review see Plenio and Knight [170]), the solutions of which are capable of simulating quantum jumps. In order to do so extra terms have to be inserted into the Schrödinger equation. These extra terms are thought to stem from interaction of the object with its environment, the measuring instrument constituting an important part of this environment. In general these terms are chosen such that the wave function in a stochastic way undergoes projections
in the sense of von Neumann’s prescription, projecting on eigenvectors of the measured observable. If necessary, a generalization to other states does not seem to be impossible.
The ‘quantum stochastic differential equation’ approach may be a useful extension of the quantum mechanical formalism to take into account interactions of a microscopic object with objects that are not explicitly introduced as additional degrees of freedom in the quantum mechanical description, in particular if these interactions have an impulsive character. Yet, it will not be considered here any further, because it is questionable whether it can account for von Neumann projection as an objective physical process, not merely a change of description based on observation and/or selection by an outside observer. What is aspired in a realist individual-particle interpretation is a mechanical account of quantum jumps, a projection being caused in a mechanical (though stochastic) way by a physical interaction with a measuring instrument. In the fluorescence experiment the measurement interaction is strongly interfering with the microscopic object, and, for this reason, can have a large influence. As a consequence it may be not impossible to shape stochastic influences of the measurement in such a way that von Neumann projection is simulated (e.g. Gisin and Percival [171], Beige et al. [167]). It is questionable, however, whether the mechanism of stochastic influences has general applicability, even if effective in this particular case. Once again, we should be careful not to draw general conclusions from too scant material. In other examples of quantum mechanical measurement the measurement interaction may have quite a different character. Thus, consider a radioactive nucleus, continuously observed by means of a counter monitoring the decay products of a nuclear decay. Here the decay process is quite independent of the observation process, since completely different interaction energies are involved in the process of decay and the process of registration of the decay products. 
The nuclear decay could be influenced by the measurement only if the measurement interaction is sufficiently energetic. Moreover, the measurement is performed in a different region of space, and after the decay process has occurred. For these reasons it seems that in this measurement quantum jumps, if existing, should be independent of the interaction with the measuring instrument. The example of nuclear decay highlights an objection against a realist interpretation of the quantum mechanical formalism already discussed in section 2.4.3, viz, the double role of the quantum mechanical observable, being implemented here by not drawing a clear distinction between the quantum jump (taking place within the object) and the measurement event (taking place within the measuring instrument). Such a lack of distinction may obscure the physics that is involved. In particular the view that measurement itself is causing quantum jumps to take place in the object seems to invert the order of cause and effect. It is not accidental that an analogous lack of causality plays an important role in
an application of the projection postulate to the EPR experiment (cf. chapter 5) in which a similar spatial separation of object and measuring instrument is present as in the example of nuclear decay. Whereas from the present discussion of quantum jumps no definitive conclusions shall be drawn with respect to a choice between the three interpretations of quantum mechanics considered above, the lack of causality in the EPR experiment will be an important incentive to reject a realist interpretation, thus leaving the empiricist interpretation as the more appropriate one.
The quantum Zeno effect When the period T between measurement pulses in the fluorescence measurement discussed above is infinitesimally small (ideally T = 0) we have a continuous measurement. Continuous measurements allegedly give rise to the so-called quantum Zeno effect (Misra and Sudarshan [172]), asserting that a continuous observation would freeze a system in its initial state. This freezing is presented as a consequence of von Neumann projection, a ‘no-transition’ result obtained in each infinitesimally small time interval causing the state to project back into its initial state. In the fluorescence experiment this would imply that, if the ion is initially in a stationary state, it will remain there forever, the continuous measurement thus preventing the ion from decaying. Itano et al. [173] have obtained experimental evidence that something like the quantum Zeno effect really exists, by demonstrating that the transition probability between states and approaches zero if the intermediate periods between consecutive measurements are decreased. Although this experiment seems to corroborate the realist picture of quantum jumps caused by, or prevented by, the measurement, we should yet be careful with such a conclusion. If the analogous reasoning were applied to the radioactive nucleus, observed continuously by means of a counter, then we would have to conclude that the nucleus will never decay. However, as argued above, it is hardly to be expected that the nucleus will be influenced in any way by the continuous observation. The measurement process can prevent the quantum mechanical state of the object from evolving freely according to the isolated object’s Schrödinger equation only if it is capable of disturbing the free evolution. This will happen only if the object energies match with the interaction energy, and the measurement interaction is strong enough. In contrast to the fluorescence experiment this is not the case here. 
Therefore the state of the nucleus will not be influenced by the counter, and no von Neumann projection would be expected to take place. By the same token, in contrast to what is sometimes assumed (“a watched pot never boils”, e.g. [174]), Schrödinger’s cat is not saved by continuous observation, unless this observation is performed in such a way that it prevents the deadly mechanism from doing its work, for instance by preventing the nucleus from decaying (‘a watched pot never boils’ only if the ‘watching’ provides the cooling necessary to compensate for the heating). An analogous conclusion is reached by Frerichs and Schenzle [175] with respect to the fluorescence experiment, demonstrating that the experimental results obtained by Itano et al. can be described completely without any reference to projection or reduction. In this experiment the Schrödinger equation, describing the interaction of object and measuring field, is sufficient to yield freezing as involved in the quantum Zeno effect (see also Beige et al. [167]). The quantum Zeno effect demonstrates particularly clearly that it is meaningful to apply von Neumann projection as a causal mechanism exerted (or influenced) by a measurement only if the measurement is devised so as to have this effect. This would hold only for measurements of the first kind. Since most measurement procedures do not satisfy this condition, it is hardly surprising that the quantum Zeno effect need not occur, even if the object is continuously observed. Conditional preparation can be applied in all measurements. However, to be able to do so it is necessary to know precisely how the quantum mechanical state of the system object+measuring instrument is changed by the interaction. This requires a detailed account of the dynamics of this interaction rather than an artificial measurement scheme as embodied in von Neumann’s projection postulate or in stochastic differential equations. The suggestion of a realist individual-particle interpretation of quantum mechanics that von Neumann projection can be seen as ‘an objective physical process, not merely a change of description based on observation and/or selection by an outside observer’ is rather questionable, and may in certain cases even be misleading.
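The freezing asserted by the quantum Zeno effect can be made explicit for the idealized case in which every measurement is of the first kind and is followed by von Neumann projection. The following is a minimal sketch under assumptions not stated in the text: a two-level system undergoes a rotation that, if undisturbed, would transfer it completely to the other level in the time available (a π pulse), and N equally spaced ideal projective measurements are performed. The function name `survival_probability` is hypothetical. As argued above, this projection-based account presupposes a measurement interaction strong enough to disturb the free evolution; it is not claimed to apply to arbitrary measurements.

```python
import math

def survival_probability(n_measurements, total_angle=math.pi):
    """Probability that all N measurements find the system in its initial state."""
    # Between two measurements the state rotates by total_angle / N, so each
    # measurement finds the initial state with probability cos^2(step / 2),
    # after which projection resets the state (measurement of the first kind).
    step = total_angle / n_measurements
    return math.cos(step / 2.0) ** (2 * n_measurements)

for n in (1, 2, 4, 16, 64, 256):
    print(n, survival_probability(n))
```

The probability of remaining in the initial state grows towards 1 as the number of measurements increases, which is the content of the ‘freezing’ in this idealized model.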
3.3 Quantum mechanical description of the measurement process and POVMs
3.3.1 Possibility of POVMs
In section 1.9 it is demonstrated that the Hilbert space formalism allows a quantum mechanical probability distribution to be described by the expectation values of a POVM (cf. appendix A.12) rather than of a PVM corresponding to the projection operators of the spectral representation of a Hermitian operator (as is the case in the standard formalism). It is demonstrated in the following that the POVM can be found in a natural way from a quantum mechanical description of the measurement process like the one already applied in a simplified way in section 3.2. However, in general neither the initial state of the object nor the initial state of the measuring instrument is a pure state, as was assumed there. In most cases these states will have to be chosen as mixtures represented by density operators (cf. section 1.4). Let $\rho$ and $\rho_a$ be the density operators of the initial states of object and measuring instrument, respectively. Let H be the Hamilton operator of the total system of object+measuring instrument and T the duration of the
measurement. In agreement with (1.45), then,
$$\rho_f = U\,(\rho \otimes \rho_a)\,U^\dagger, \qquad U = e^{-iHT/\hbar}, \tag{3.20}$$
is the density operator of the final state of the combined system. This density operator replaces the state vector (3.5) or (3.10), which correspond to a measurement of a PVM. In an empiricist interpretation of quantum mechanics we are primarily interested in the probability distributions of pointer positions. In section 3.2 these corresponded to the pointer state vectors. In the present section the more realistic assumption will be made that the values of the pointer observable are degenerate, i.e. that there are several pointer states corresponding to the same value $m$. For equal $m$ these states are thought to differ from each other only in an unobservable, microscopic way. Let $E^{(a)}_m$ be the Hermitian projection operator onto the subspace spanned by the states with the same $m$. The projection operators $E^{(a)}_m$ are thought to constitute the spectral representation of a standard observable of the measuring instrument, to be referred to as the ‘pointer observable’. Applying the rules of the standard formalism (cf. (3.14)), the probability of pointer position $m$ is given by
$$p_m = \mathrm{Tr}\,\rho_f\,E^{(a)}_m. \tag{3.21}$$
Inserting (3.20) in (3.21), and making use of the property $\mathrm{Tr}\,AB = \mathrm{Tr}\,BA$ of a trace, this can be written according to
$$p_m = \mathrm{Tr}_o\,\rho\;\mathrm{Tr}_a\!\left(\rho_a\,U^\dagger E^{(a)}_m U\right), \tag{3.22}$$
in which $\mathrm{Tr}_o$ and $\mathrm{Tr}_a$ denote the partial traces over object and measuring instrument, operators acting on one subsystem being understood as extended by the unit operator of the other. It is now possible to define operators $M_m$ on the Hilbert space of the object according to
$$M_m = \mathrm{Tr}_a\!\left(\rho_a\,U^\dagger E^{(a)}_m U\right). \tag{3.23}$$
Using this definition (3.21) precisely gets the form (1.96):
$$p_m = \mathrm{Tr}\,\rho\,M_m. \tag{3.24}$$
Although the projection operator of the pointer observable is a projection operator (and, hence, so is its unitary transform under the measurement interaction), there is no reason to suppose that the operator defined by (3.23) will also be a projection operator. Although it follows directly from (3.14) that the simplified measurement process discussed in section 3.2 is a measurement of a PVM, in more realistic measurements this is different. In chapters 7 and 8 many measurement procedures will be discussed in which the operators (3.23) are not projection operators, and the measurement is represented by a POVM rather than by a PVM. The standard formalism, as given in section 1.1, is applicable only in very exceptional cases. Evidently, this latter formalism describes only a very restricted type of measurement procedures.
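That a partial trace over the measuring instrument generally yields operators that are not projections can be checked on a toy model. The sketch below is a hypothetical example, not one of the measurement procedures of chapters 7 and 8: object and instrument are both two-dimensional, the instrument starts in its first basis state, and the interaction rotates the instrument by an angle `theta` only if the object is in its second basis state. All names (`effect`, `theta`, `U`, `rho_a`, `E`) are illustrative assumptions.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    # Row index (i, k) and column index (j, l): object first, instrument second.
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def effect(u, e_pointer, rho_a, d_obj):
    """Tr_a[(I x rho_a) U^dagger (I x E) U], an operator on the object space."""
    d_a = len(rho_a)
    x = matmul(dagger(u), matmul(kron(identity(d_obj), e_pointer), u))
    return [[sum(rho_a[l][k] * x[i * d_a + k][j * d_a + l]
                 for k in range(d_a) for l in range(d_a))
             for j in range(d_obj)] for i in range(d_obj)]

theta = math.pi / 4                      # theta = pi/2 would give a PVM
c, s = math.cos(theta), math.sin(theta)
U = [[1, 0, 0, 0],                       # object in state 0: instrument untouched
     [0, 1, 0, 0],
     [0, 0, c, -s],                      # object in state 1: instrument rotated
     [0, 0, s, c]]
rho_a = [[1.0, 0.0], [0.0, 0.0]]         # instrument initially in its state 0
E = [[[1.0, 0.0], [0.0, 0.0]],           # pointer projection operators
     [[0.0, 0.0], [0.0, 1.0]]]

M = [effect(U, e, rho_a, 2) for e in E]
M1_squared = matmul(M[1], M[1])
for m, op in enumerate(M):
    print("M_%d diagonal:" % m, [op[i][i].real for i in range(2)])
print("M_1 squared diagonal:", [M1_squared[i][i].real for i in range(2)])
```

In this model the two operators add up to the unit operator, but for `theta` different from a multiple of π/2 the second one does not equal its own square, so the pair is a POVM that is not a PVM.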
Two qualifying remarks are in order here. First, in general the operators need not constitute a PVM in the Hilbert space of the measuring instrument because the vectors need not span the whole space. For instance, the possible initial states of the measuring instrument need not correspond to a final state of the pointer. However, the claim, formulated in section 1.1, that a quantum mechanical measurement must be always successful ensures that
causing the operator to effectively equal the unit operator I if applied in the space of all possible final states of the measuring instrument (it is sufficient that be the projection operator onto the subspace spanned by the pointer states). It may be assumed that this requirement is satisfied in general. Secondly, in the above derivation it is assumed that the standard formalism is applicable to the measuring instrument, thus assuming the pointer observable to be a PVM. This PVM is often thought to correspond to the position of the pointer’s center of mass. Eigenvectors of this observable corresponding to different eigenvalues are mutually orthogonal, and are considered as distinguishable from each other in a deterministic sense. This is implemented by the requirement
In view of the reservations advanced in section 2.5.3 as regards the validity of quantum mechanics in the macroscopic domain, the assumption that the standard formalism is applicable to the measuring instrument is perhaps not very realistic since the pointer is a macroscopic object. Indeed, such an object would have sharp values for both position and momentum, and the non-existence of a PVM corresponding to this requirement is well known to be a fundamental feature of quantum mechanics. Often the coherent states (A.29) are considered as candidates for a description of macroscopic states. Then a measurement would have to be based on the act of distinguishing between different coherent states as pointer states of the measuring instrument, rather than between eigenvectors of a position observable. However, since coherent states are not mutually orthogonal no PVM can be found satisfying (3.26). Hence, this transcends the standard formalism, suggesting that, perhaps, a pointer observable should be a POVM rather than a PVM. This does not thwart the conclusion to be drawn from the present treatment of quantum mechanical measurement, however. The conclusion that operators (3.23) generate a POVM rather than a PVM, and that, hence, POVMs are necessary for representing generic quantum mechanical measurements, is not changed if the pointer observable itself is supposed to be a POVM. The question of whether a pointer observable can be a POVM that is not a PVM will be investigated in section 3.3.2. It will be demonstrated there that the
suggestion of employing a POVM to distinguish non-orthogonal pointer states in the sense expressed by (3.26), is not opportune. In the present section we restrict ourselves to the remark that a choice of as a PVM might be justified if the final state in (3.21) would be interpreted as referring to the final state of the premeasurement process, alluded to in section 3.2.1, rather than to the final state of the measuring instrument after the amplification to macroscopic dimensions has been completed. For instance, in the Stern-Gerlach experiment observable could refer to the position of the atom of which angular momentum is being measured, rather than to the position of some (actually observed) spot on a photographic plate. Since in an ideal amplification process there is a one-to-one correspondence between the (macroscopic) spot and the (microscopic) atomic position, the relative frequencies of the former would equal those of the latter, thus allowing the results of a macroscopic observation to be represented by a microscopic quantity like (3.21).
3.3.2 Pointer observables and POVMs As suggested in section 3.3.1, a pointer observable might have to be a POVM that is not a PVM, because no PVMs exist that can distinguish, in the way expressed by (3.26), between pointer states if these are non-orthogonal. It will now be demonstrated that in this respect POVMs do not perform any better. Let be the set of non-orthogonal pointer states (for instance, coherent states) corresponding to a POVM. The state vectors are assumed to be linearly independent. Because of the non-orthogonality it is convenient to construct the bi-orthonormal system (cf. appendix A.8.1) in the subspace of Hilbert space spanned by this set. Thus,
In the following we shall use the property (A.72) of a bi-orthonormal system that is the projection operator onto the subspace spanned by the pointer states or, equivalently, by the vectors Analogously to (3.26), a POVM should satisfy the relations
In order that the set of operators distinguishing between the states constitute a POVM, we should have
the latter equality stemming from a requirement analogous to (3.25). It follows that the operators can be represented according to
From requirement (3.27) it follows that
Hence,
From the second requirement of (3.28) it is easily seen that the coefficients should satisfy the equality
warranting the non-negativity of operator However, taking expectation values in arbitrary superpositions of the pointer states it now easily follows from representation (3.29) that a single operator cannot be non-negative (note that matrix cannot be non-negative because its trace is positive but its determinant is negative). Hence the set of operators does not correspond to a POVM if it is required to satisfy (3.27).
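The parenthetical argument (positive trace, negative determinant) can be checked numerically. The sketch below assumes, purely for illustration, a 2×2 Hermitian matrix with diagonal elements 1 and 0 and an off-diagonal element `c`; the explicit coefficient matrix of the text is not reproduced here, and the function name is hypothetical. Such a matrix has trace 1 and determinant −|c|², so that one of its eigenvalues is negative whenever `c` differs from zero.

```python
import math

def eigenvalues_2x2_hermitian(a, d, c):
    """Eigenvalues of [[a, c], [conj(c), d]] for real a, d and complex c."""
    trace = a + d
    det = a * d - abs(c) ** 2
    # trace^2 / 4 - det = (a - d)^2 / 4 + |c|^2 is never negative.
    root = math.sqrt(trace * trace / 4.0 - det)
    return trace / 2.0 - root, trace / 2.0 + root

for c in (0.0, 0.1, 0.5 + 0.5j):
    low, high = eigenvalues_2x2_hermitian(1.0, 0.0, c)
    print("c =", c, " eigenvalues:", low, high)
```

Only for `c = 0`, which in the present context corresponds to orthogonal pointer states, is the lower eigenvalue non-negative.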
For orthogonal pointer states a solution does exist. In that case we have and It could be assumed that the non-orthogonality of the pointer states can be neglected, thus being able to stick to the standard formalism. However, it is questionable whether this is the right thing to do. It does not seem very probable that macroscopic objects can be dealt with by the standard formalism if this formalism does not even cover the whole of the microscopic domain. By enlarging the formalism so as to encompass POVMs it might be hoped that the gulf between the microscopic and macroscopic world could be bridged. The above result seems to indicate that this hope is not fully justified if the formalism is required to yield exact descriptions. The non-orthogonal projection operators as well as their Hermitian parts constitute operator-valued measures (OVM, cf. appendix A.12.3) exactly distinguishing the non-orthogonal pointer states. Moreover,
as will be discussed in section 7.7.3, POVMs exist representing complete measurements, from which the initial state can be determined completely. Such measurements would be capable of distinguishing between any pair of pointer states, be it not in the deterministic sense expressed by (3.26), but in a statistical sense. At this moment it is not clear at all whether this result has any physical relevance within the domain of macroscopic physics (in which the deterministic notion of distinguishability is the current one). Should the quantum mechanical formalism be extended still further by allowing OVMs as representations of quantum mechanical observables (in section 7.3 it will be seen that at least the double-slit experiment requires such an extension)? Or should we really resign ourselves to the idea that no deterministic quantum mechanical description of macroscopic objects is possible? Since at this moment there is no general solution this subject is not discussed here any further.
3.3.3 Measurement and empiricist interpretation
Due to the explicit reference to the pointer position of the measuring instrument, the generalization of the formalism, obtained in the way described in section 3.3.1, fits in very nicely into an empiricist interpretation (cf. section 2.2). In this interpretation it is sufficient that a mathematical expression exist describing empirical facts, i.e. relative frequencies of pointer positions. Unless inspired by the standard formalism, it could hardly occur to anyone that the operator in (3.23) would have to be a projection operator. Sometimes the requirement that be a projection operator is justified by the assumption that only orthogonal states would be empirically distinguishable, the underlying idea being that states represented by wave functions and can be distinguished by means of a position measurement only if the wave functions do not overlap, and, hence, are orthogonal. However, the derivation of (3.23) is not sensitive to this argument, because the operators may constitute a POVM even if the pointer observable is a PVM. As far as the argument of distinguishability through orthogonality is applicable, it is sufficient that it apply to the pointer states of the measuring instrument; it need not apply to the states of the object itself. A second observation regards the values of a quantum mechanical observable. In the standard formalism these are the eigenvalues of a Hermitian operator, in a realist interpretation considered to correspond to a certain property of the object. In an empiricist interpretation this would have to be replaced by a correspondence to a property of the measuring instrument, viz, the pointer position. It is important to realize that the way these pointer positions are labelled does not have any influence on the probabilities themselves (in the standard formalism the labels are the eigenvalues of a Hermitian operator (cf. (1.5)). (Eigen)values play a role only in the expectation values (1.3) of observables. 
However, the experimenter does not measure the expectation values in a direct way; he measures probability distributions
(better: relative frequencies). If he wants to formulate his experimental results in terms of expectation values, then he has to stipulate the correspondence between the pointer positions of his measuring instrument and the values of the observable he intends to measure, by marking the final states of the pointer on some measurement scale of his measuring instrument. In principle, the experimenter is free in choosing the scale, although he will let himself be guided by the perspective of obtaining a consistent interpretation of the totality of his experimental results in some physical model (see also section 2.4). For this reason it is preferable to represent also standard observables by POVMs (more specifically, PVMs), rather than by Hermitian operators.
As an example, consider a photon counter measuring the photon number observable N (cf. appendix A.3). In the case of 100% efficiency of the detector it is reasonable to choose the scale such that it corresponds to the number of detected photons, this being assumed to be the number of photons present before the measurement. In the case of a photon detector with efficiency $\eta < 1$ the experimenter could, at least in principle, choose the scale according to $m/\eta$ rather than $m$ ($m$ being the number of detected photons), thus achieving equality of the measured expectation values with the values predicted by the standard formalism. It will be clear, however, that this would not be a sensible choice for the measurement scale, because it does not fit into a plausible physical model. It would be more “physical” to draw a distinction between ‘the number of photons present’14 and ‘the number actually detected’. The measurement scale can be allowed to correspond to ‘the number of actually detected photons’. In general this latter number is different from the number of photons actually present.
Since this latter number corresponds to an eigenvalue of the number operator N, it follows that the pointer positions of the inefficient photon detector cannot correspond to these same eigenvalues. It is clear from this example that the experimenter will not be able to follow the standard formalism blindly in choosing the measurement scale. He will have to make his choice on the basis of his insight into the functioning of his measuring instrument, and his estimation of the extent to which his measurement results reflect the reality of the microscopic object (compare section 7.2).
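The role of the measurement scale in the photon-counter example can be made concrete with a small numerical sketch. It assumes a model that is not spelled out in the passage above: a detector of efficiency eta registers each photon independently with probability eta, so that the counts follow a binomially thinned distribution. The photon-number distribution `p_n` and the value of `eta` are hypothetical.

```python
from math import comb

def detected_distribution(p_n, eta):
    """Probabilities of m counts for photon-number distribution p_n,
    assuming each photon is detected independently with probability eta."""
    n_max = len(p_n) - 1
    return [
        sum(comb(n, m) * eta**m * (1 - eta)**(n - m) * p_n[n]
            for n in range(m, n_max + 1))
        for m in range(n_max + 1)
    ]

def mean(p, scale=lambda k: k):
    """Expectation value of the pointer scale `scale` in distribution p."""
    return sum(scale(k) * pk for k, pk in enumerate(p))

p_n = [0.1, 0.2, 0.4, 0.3]   # hypothetical distribution of photons present
eta = 0.5                    # hypothetical detector efficiency

p_m = detected_distribution(p_n, eta)

mean_present = mean(p_n)                        # expectation of N (standard formalism)
mean_counts = mean(p_m)                         # pointer scale m: detected photons
mean_rescaled = mean(p_m, lambda m: m / eta)    # pointer scale m / eta

print(mean_present, mean_counts, mean_rescaled)
```

In this model the rescaled pointer values m / eta reproduce the expectation value of N, although no individual pointer reading m / eta need coincide with an eigenvalue of N. The relative frequencies `p_m` themselves are unaffected by the choice of scale.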
14 It would be more appropriate to replace this rather sloppy realist wording by a more accurate empiricist one, viz. ‘the number of photons to be detected by a 100% efficient detector’ (compare the remark made in section 2.4).
3.3.4 Conditional preparation and generalized observables
In section 3.2.6 the application of measurement for the purpose of conditional preparation was discussed for the simplified case of a measurement process of a standard observable represented by (3.10). A generalization of the projection postulate, used as a preparation principle, yielded a method to determine the density operator of the post-measurement state of the microscopic object corresponding to a given measurement result. An analogous description can be given for generalized measurements represented by POVMs. The density operators representing the conditional preparation can be found completely analogously to section 3.2.6. The theory of joint measurement can be generalized in a rather straightforward way and applied to coupled systems (cf. section 1.5, in which the PVMs in (1.56) are replaced by POVMs). In the final state (3.20) a joint measurement of the pointer observable of the first measuring instrument and an arbitrary observable of the microscopic object yields the joint probability distribution
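$$p_{mn} = \mathrm{Tr}_{oa}\, \rho_f\, (N_n \otimes E^{(a)}_m), \qquad (3.30)$$
in which $\rho_f$ is the final state (3.20), $\{E^{(a)}_m\}$ the pointer observable of the first measuring instrument, and $\{N_n\}$ the arbitrary observable of the microscopic object. Analogously to (1.28) the conditional probability is given by $p(n|m) = p_{mn}/p_m$, with $p_{mn}$ given by (3.30), and $p_m$ by (3.21). Hence, the conditional probability can be represented according to $p(n|m) = \mathrm{Tr}_o\, \rho_m N_n$, in which, for $p_m \neq 0$,
$$\rho_m = \frac{\mathrm{Tr}_a\, \rho_f E^{(a)}_m}{p_m} \qquad (3.31)$$
is the density operator representing the preparation of the microscopic object conditional on measurement result $m$. Due to $\sum_m E^{(a)}_m = I$ this is consistent with the final object state
$$\rho_{of} = \mathrm{Tr}_a\, \rho_f = \sum_m p_m \rho_m. \qquad (3.32)$$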
Because the object observable is arbitrary, (3.31) is determined uniquely. These expressions also hold if the observable of the first measuring instrument has a continuous spectrum, as was the case in the original application (3.18) to the Compton effect. In that case the distribution of measurement results is not a probability but a probability density. In order that the conditional probabilities be different from zero, the POVM has to correspond to a partition of the continuous spectrum into finite intervals. Because of this the conditional preparation can yield a normalized state (3.31), notwithstanding the non-normalizability of improper eigenvectors.
As in section 3.2.6, the transition from the initial density operator to the conditional one (3.31) can be seen as a generalization of the von Neumann-Lüders projection (cf. section 1.6) in a preparative sense. No special relation with the POVM measured by the arrangement need be assumed. Thus, there is no reason to assume that a measurement of a POVM will be accompanied by a change of the object state described by an operation (3.19) with operators of the form given in (2.3). Apart from the (virtually non-existent) measurements of the first kind, von Neumann-Lüders projection is not applicable to the operation on the state of the microscopic object
executed by a measurement. In section 3.3.5 it will be demonstrated, however, that von Neumann projection may have a meaning in a determinative sense.
Consecutive measurements
Although the necessity of applying POVMs in the description of quantum mechanical measurements is most evident in an empiricist interpretation, this interpretation has not played a role in the development of the generalized notion of a quantum mechanical observable as represented by a POVM rather than by a PVM. Above all it has been the idea of consecutive measurements that has been instrumental in this development [149, 176, 37]. Let the PVM $\{E_m\}$ be measured in a state represented by the density operator $\rho$. Assuming that the preparation of the final state of the object, conditional on measurement result $m$, is described by the Lüders projection (1.76) onto the state $E_m \rho E_m / \mathrm{Tr}\, \rho E_m$, the PVM $\{F_n\}$ is subsequently measured in the final state. Then the joint probability of the pair of measurement results $(m, n)$ is easily seen to be given by
$$p_{mn} = \mathrm{Tr}\, E_m \rho E_m F_n = \mathrm{Tr}\, \rho\, E_m F_n E_m. \qquad (3.33)$$
The operators $E_m F_n E_m$, for all $m$ and $n$, constitute a POVM, reducing to a PVM only if $[E_m, F_n] = 0$ (cf. section 1.9.2).
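That the operators describing two consecutive PVM measurements form a POVM but not a PVM when the PVMs do not commute can be checked for the simplest case. The sketch below is an illustration under assumptions of our own choosing: a two-dimensional Hilbert space, with the projectors of sigma_z measured first and those of sigma_x second. The names `E`, `F` and `M` are illustrative labels, not the book's notation.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# First PVM: projectors onto the eigenvectors of sigma_z.
E = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
# Second PVM: projectors onto the eigenvectors of sigma_x.
F = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, -0.5], [-0.5, 0.5]]]

# Operators E_m F_n E_m of the consecutive measurement.
M = {(m, n): matmul(matmul(E[m], F[n]), E[m])
     for m in range(2) for n in range(2)}

total = [[sum(op[i][j] for op in M.values()) for j in range(2)] for i in range(2)]
sums_to_identity = close(total, [[1, 0], [0, 1]])
all_projectors = all(close(matmul(op, op), op) for op in M.values())

print(sums_to_identity, all_projectors)
```

Here each operator equals one half of a sigma_z projector: the four operators are non-negative and sum to the identity, but none of them is idempotent.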
It is possible to generalize this measurement scheme to consecutive measurements of several (say $N$) PVMs $\{E^{(k)}_{m_k}\}$, $k = 1, \ldots, N$, taking into account the possibility that between two consecutive measurements the density operator can evolve according to (1.45). Then the joint probability of the $N$ consecutive measurements is given by
$$p_{m_1 \ldots m_N} = \mathrm{Tr}\, \rho\, E^{(1)}_{m_1}(t_1) \cdots E^{(N-1)}_{m_{N-1}}(t_{N-1})\, E^{(N)}_{m_N}(t_N)\, E^{(N-1)}_{m_{N-1}}(t_{N-1}) \cdots E^{(1)}_{m_1}(t_1), \qquad (3.34)$$
$t_k$ being the time of the $k$-th measurement, and the inter-measurement time evolution of the density operator being accounted for by the time-dependence of the operators $E^{(k)}_{m_k}(t_k)$ (cf. (1.47)). Griffiths [177] and Omnès [178] even consider expression (3.34) as the basis of a general interpretation of quantum mechanics, in which the POVM defined by (3.34) represents a ‘consistent set of histories’, such that each combination $(m_1, \ldots, m_N)$ corresponds to a ‘history’ realized in an individual measurement. Since the projection postulate as a measurement principle is not generally valid, it hardly seems possible to base an interpretation of quantum mechanics on it, even if special measurement procedures should exist that are represented by this POVM. For this reason the ‘consistent histories’ interpretation will not be considered here any further. If necessary, application of this POVM can be accommodated in an empiricist interpretation without any problem. Moreover, this
has the advantage that it is not necessary to worry about the consistency conditions ([178], p. 134)
stemming from a desire that the “cross” terms should vanish. As discussed in section 3.1.3, such a desire is based on a realist interpretation, and is completely unnecessary in an empiricist one. It should be emphasized once more that interpreting the transition from the initial state to the state (3.31), when a particular measurement result is found, as a description of “what really happens” is problematic. In an empiricist interpretation the two density operators are representations (labels) of different preparation procedures; their meanings are restricted to the operational contents of the expressions given above.
3.3.5 Generalized von Neumann projection for generalized observables
In this section the von Neumann projection (1.74) is generalized for measurements represented by arbitrary POVMs generated by NODIs with linearly independent operators $M_m$. In order to do so the operators of the NODI are considered as the vectors of a non-orthogonal basis spanning a subspace $\mathcal{S}$ of Hilbert-Schmidt space (cf. appendix A.10). Due to the linear independence it is possible to determine in the subspace $\mathcal{S}$ the dual basis $\{M'_m\}$ by (cf. (A.67))
$$\mathrm{Tr}\, M'_m M_n = \delta_{mn}. \qquad (3.36)$$
It is easily verified that the operators $M'_m$ are Hermitian, but in general not non-negative. Since $\sum_m M_m = I$ they satisfy
$$\mathrm{Tr}\, M'_m = 1. \qquad (3.37)$$
The generalized von Neumann projection of density operator $\rho$ generated by POVM $\{M_m\}$ is defined according to
$$\rho_M = \sum_m M'_m\, \mathrm{Tr}\, \rho M_m. \qquad (3.38)$$
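Expression (3.38) should be compared to (A.73). It is based on an application of the theory of non-orthogonal projections in a Hilbert space (cf. appendix A.8), viz. Hilbert-Schmidt space, orthogonality of Hilbert-Schmidt operators $A$ and $B$ being defined by $\mathrm{Tr}\, A^\dagger B = 0$. Such projections are generally described by non-Hermitian (super)operators. However, the projection (super)operator defined by (3.38) is Hermitian since it describes a projection onto the whole subspace spanned by the operators $M_m$ of the POVM. Hence, with $M'_m$ the operators of the dual basis, because of (A.71) we have
$$\sum_m M'_m\, \mathrm{Tr}\, M_m A = \sum_m M_m\, \mathrm{Tr}\, M'_m A$$
for every Hilbert-Schmidt operator $A$, and
$$\rho_M = \sum_m M_m\, \mathrm{Tr}\, \rho M'_m.$$
Using this latter representation it is easily verified that, indeed, the transition from $\rho$ to the projected operator $\rho_M$ is a projection, satisfying
$$(\rho_M)_M = \rho_M.$$
We also have
$$\mathrm{Tr}\, \rho_M M_m = \mathrm{Tr}\, \rho M_m, \qquad (3.39)$$
demonstrating that the projected density operator $\rho_M$ reproduces the probability distribution of POVM $\{M_m\}$ in state $\rho$.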
From (3.38) it is seen that the projected operator is Hermitian. Moreover, from (3.37) it follows that its trace equals 1. Nevertheless, if the POVM is not a PVM, then the projected operator is not a density operator in general, since it need not be a non-negative operator15. This demonstrates that the projection (3.38) cannot be seen as a generalization of von Neumann projection in a preparative sense, describing the preparation of a state of the object either before or after the measurement (compare section 2.4.5). However, due to (3.39) it is possible to interpret the generalized von Neumann projection in a determinative sense. By means of (3.38) the projected operator can be constructed in a straightforward way from the measurement results of the measured observable. It follows that the projected density operator, being the component of the initial density operator in the subspace spanned by the operators of the POVM, can be interpreted as representing the information on the initial state obtainable from a measurement of the POVM. In this determinative sense this is a general property of any quantum mechanical measurement. In section 7.10.4 generalized von Neumann projections will be determined for a number of measurement procedures.
15 Due to a restriction to 2-dimensional examples, and by incorrectly applying Naimark’s theorem, it was erroneously concluded in de Muynck [179] that the projected operator is a density operator. In general the projection operator defined by (3.38) does not coincide with the projection operator proven by Naimark’s theorem to exist.
If the POVM is a maximal PVM, then (3.38) and (3.39) reduce to (1.73) and (1.75), respectively, characteristic of (weak) von Neumann-Lüders projection. Note, however, that, unlike (1.73), (3.38) is also valid for multi-dimensional projections. It is easily seen that for a non-maximal PVM $\{E_m\}$, with $d_m$ the dimension of the subspace corresponding to eigenvalue $a_m$, the dual basis is given by $E_m/d_m$, and the projection is
$$\sum_m \frac{1}{d_m}\, E_m\, \mathrm{Tr}\, \rho E_m. \qquad (3.40)$$
This result coincides with a generalization to non-maximal PVMs found by von Neumann ([2], chapter V.4, p. 414; compare Furry [18]). In general this generalization is different from the Lüders projection (1.74). It also differs from (2.3), which in general neither is a projection nor allows the probabilities to be reproduced analogously to (3.39). We close this section by proving the following
Theorem: for a two-dimensional Hilbert space the projected operator defined by (3.38) is a non-negative operator.
Proof: We assume that the elements of POVM are linearly independent, and consider the non-trivial situation (for we trivially obtain Then the dimension of the subspace spanned by the operators is greater than 1, and it is easy to prove that contains the elements of a maximal PVM Since the subspace is a subspace of the orthogonal projections and must satisfy which implies Hence, in the representation we get
Due to the fact that have
is an orthogonal projection onto with
we should also This implies that
and hence Denoting the eigenvalues of We already know that
by and we find Hence, both eigenvalues are non-negative if
Since implies it follows directly from (3.41) that condition (3.42) is satisfied. Hence, in the two-dimensional case is a non-negative operator.
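The construction of the dual basis and of the projection (3.38) can be illustrated for a two-dimensional example. The sketch below rests on assumptions of our own: the POVM is an unsharp sigma_z measurement with elements (I ± lam·sigma_z)/2, the dual basis is obtained by inverting the Gram matrix of Hilbert-Schmidt inner products, and `lam` and `rho` are hypothetical values. The helper names are illustrative only.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def lincomb(coeffs, ops):
    """Linear combination sum_k coeffs[k] * ops[k] of 2x2 matrices."""
    return [[sum(c * op[i][j] for c, op in zip(coeffs, ops)) for j in range(2)]
            for i in range(2)]

lam = 0.6                                   # hypothetical sharpness, 0 < lam <= 1
I2 = [[1, 0], [0, 1]]
SZ = [[1, 0], [0, -1]]
M = [lincomb([0.5, 0.5 * lam], [I2, SZ]),   # POVM element (I + lam*sigma_z)/2
     lincomb([0.5, -0.5 * lam], [I2, SZ])]  # POVM element (I - lam*sigma_z)/2

# Gram matrix of Hilbert-Schmidt inner products Tr(M_m M_n) and its inverse.
g = [[trace(matmul(a, b)) for b in M] for a in M]
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

# Dual basis inside the span of the POVM: Tr(dual[m] M[n]) = delta_mn.
dual = [lincomb(g_inv[m], M) for m in range(2)]

rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]          # hypothetical density matrix
p = [trace(matmul(rho, Mm)).real for Mm in M]          # measured probabilities
rho_proj = lincomb(p, dual)                            # projection as in (3.38)
p_proj = [trace(matmul(rho_proj, Mm)).real for Mm in M]

print(dual[0], rho_proj, p, p_proj)
```

In this example the dual operators have trace 1 but a negative eigenvalue (they are not non-negative), whereas the projected operator is the diagonal part of `rho`, hence non-negative, and reproduces the measured probabilities, in agreement with the two-dimensional theorem.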
3.3.6 Measurement and information
A measurement of a quantum mechanical observable (represented by a POVM) serves to obtain information on a previously prepared state (represented by a density operator). This information is represented by the projection of the density operator, defined
by (3.38), into the subspace $\mathcal{S}$ spanned by the operators of the POVM. Since the projected operator is not a density operator in general, it should be compared to a description of a quantum state by means of the Wigner distribution: it yields the correct expectation values for all operators in $\mathcal{S}$, but does not contain any information on the part of the density operator that is in the orthogonal complement of $\mathcal{S}$. If the density operator is in $\mathcal{S}$, then it coincides with its projection. In that case complete information on the initial state is obtained: the density operator is then reproduced by (3.38). For maximal standard observables, represented by maximal PVMs, such is the case if the density operator is diagonal in the representation defined by the PVM. If it is not diagonal, such a measurement yields information only on the diagonal elements of the density matrix, thus entailing an information deficit preventing a complete reconstruction of the initial state.
The formalism of generalized observables allows measurements surpassing standard observables with respect to their capability of yielding information on the initial state. As a matter of fact, the operators of a PVM span only a subspace of Hilbert-Schmidt space. As will be discussed extensively in chapters 7 and 8, generalized measurements exist, represented by POVMs spanning larger subspaces of Hilbert-Schmidt space. For this reason such POVMs can yield more information on the initial state. It is even possible that the operators of the POVM constitute a basis of the whole Hilbert-Schmidt space. Then $\mathcal{S}$ coincides with Hilbert-Schmidt space, and the projection (3.38) reproduces any density operator. Such measurements are called complete measurements. For measurements that are not complete there is an information deficit if the density operator is not in $\mathcal{S}$. Only the part of the density operator that is in $\mathcal{S}$ can be reconstructed from the data obtained by a measurement of the POVM.
An example of a complete measurement is ‘eight-port’ homodyning (cf. section 8.4.2), measuring the POVM $\{\pi^{-1} |\alpha\rangle\langle\alpha|\}$, in which $|\alpha\rangle$ is a coherent state for arbitrary complex $\alpha$. The fact that the operators $|\alpha\rangle\langle\alpha|$ span Hilbert-Schmidt space is at the basis of the Husimi representation discussed in section 1.11.3. Comparing (3.36) and (1.147) we find that the dual basis is defined by (1.150). Due to (1.148) and (1.149) the projection (3.38) then reproduces the density operator itself for all density operators, thus demonstrating completeness of the measurement.
Since the projected operator (3.38) need not be non-negative, it is impossible to take its von Neumann entropy (1.37) as a quantitative measure of information. A simple measure of the information deficit is given by the quantity
which can be seen to satisfy by applying the Schwarz inequality to the Hilbert-Schmidt inner product. vanishes for complete measurements. For general initial the limit is not attainable by measuring a standard observable. If applied to a pure state quantity (3.43) reduces to
which for a maximal PVM
reduces to
N the dimension of the object Hilbert space. From this it is seen that there is an information deficit unless the state vector is an eigenvector of the measured (standard) observable; the deficit is maximal if Maximal values of are obtained if the POVM is uninformative, i.e. the NODI generating the POVM consists of one single element I/N, yielding I / N, and
It is important to note that the information deficit as defined here is a determinative quantity, quantifying the (lack of) information on the initial density operator in a measurement of the POVM. This should be clearly distinguished from the preparative aspects of measurement involved in a comparison of initial and final states of the object in a measurement process, often cast in terms of von Neumann entropy (1.37). As a matter of fact, there is no direct relation between the final object state (3.32) of the measurement process and the projection (3.38). Whether the entropy of the final object state exceeds that of the initial state or not depends on the preparative aspects of the measurement, i.e. on whether the distribution of the final object state over the possible output channels is more disorderly than the distribution of the initial state over the input channels. We do not have any reason to suppose that disorder should increase in this sense. Thus, for the efficient photon detector the final state ideally is the vacuum state, which has vanishing entropy, so that object entropy decreases if the initial state was not a pure state (of course, due to inefficiency of the detector this decrease will in general be smaller than the ideal one). Evidently, just as there was no necessity for a vanishing of the “cross” terms, measurement also does not require that the entropy of the final object state exceed the initial entropy.
It is an unfortunate coincidence that for the very special case of a maximal PVM (one-dimensional projections in (3.40)) the determinative projection equals the preparative result of a first kind measurement (compare (A.90)). Restriction of attention to such measurement processes may have contributed to the confusion of ‘preparation’ and ‘measurement’ noted in section 3.2.6.
In chapter V of [2] von Neumann dealt extensively with the issues of irreversibility of the measurement process and the alleged concomitant increase of object entropy due to projection. This reasoning is based on an application of the idea of von Neumann projection in measurements of the first kind. The opportuneness of these considerations is now seen to be rather questionable. Nothing in the quantum mechanical formalism seems to resist a decrease of the von Neumann entropy of the object, at least not during the pre-measurement phase described by quantum mechanics. Of course, since total entropy is conserved under unitary evolution of the system object+measuring instrument, a decrease of object entropy should be accompanied by an increase of apparatus entropy: from the conservation of total entropy and (A.91) it follows that apparatus entropy increases if object entropy decreases, the increase of apparatus entropy being a consequence of the establishment of correlation between object and measuring instrument. Increase of the von Neumann entropy of the measuring instrument may be a necessary characteristic of quantum measurement, because the instrument’s final state in general is distributed more widely over the possible output channels than its initial state over the input states. However, this characteristic does not seem to be related in any way to thermodynamical irreversibility. It rather seems to be a property of pre-measurement, necessary for a proper functioning of the measuring instrument in distinguishing different object states.
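The entropy bookkeeping described above can be illustrated with a deliberately simple toy model, which is an assumption of this sketch and not a model taken from the text: the object starts in a mixed state that is diagonal in some basis, the apparatus starts in a pure state, and the unitary interaction is a SWAP of the two systems. All states stay diagonal in the product basis, so von Neumann entropies reduce to entropies of the diagonal probabilities.

```python
from math import log

def entropy(probs):
    """Entropy -sum p ln p of a state that is diagonal with eigenvalues `probs`."""
    return -sum(p * log(p) for p in probs if p > 0)

def marginal(joint, index, dim):
    """Marginal distribution of subsystem `index` (0 = object, 1 = apparatus)."""
    out = [0.0] * dim
    for labels, p in joint.items():
        out[labels[index]] += p
    return out

p_object = [0.5, 0.3, 0.2]   # hypothetical eigenvalues of the mixed initial object state
dim = len(p_object)

# Initial product state: object mixed, apparatus in the pure state |0>.
joint_initial = {(o, 0): p for o, p in enumerate(p_object)}
# SWAP unitary |o, a> -> |a, o>: a permutation of the joint basis states.
joint_final = {(a, o): p for (o, a), p in joint_initial.items()}

s_total_initial = entropy(joint_initial.values())
s_total_final = entropy(joint_final.values())
s_object_initial = entropy(marginal(joint_initial, 0, dim))
s_object_final = entropy(marginal(joint_final, 0, dim))
s_apparatus_initial = entropy(marginal(joint_initial, 1, dim))
s_apparatus_final = entropy(marginal(joint_final, 1, dim))

print(s_total_initial, s_total_final)
print(s_object_initial, s_object_final)
print(s_apparatus_initial, s_apparatus_final)
```

In this toy model the total entropy is unchanged, the object entropy drops to zero, and the apparatus entropy rises by the same amount, without any reference to thermodynamic irreversibility.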
3.4 Decoherence
Although, strictly speaking, in an empiricist interpretation the problem of the existence of the “cross” terms is not a real problem, and could therefore be ignored, in the present section this subject will yet be discussed somewhat further from the point of view of the idea of decoherence, serving as a mechanism causing the “unwanted” terms to vanish. The idea that decoherence, as expressed by the consistency conditions (3.35), plays an essential role in the process of quantum mechanical measurement is rather widespread. Thus, the ‘consistent histories’, satisfying the consistency conditions, are sometimes referred to as ‘decoherent histories’ ([180], p. 157)16. Decoherence is often thought to be closely related to the microscopic temperature fluctuations which are at the basis of the second law of thermodynamics. This law, which has a virtually universal domain of validity, also affects measurement, either because of fluctuations within the measuring instrument itself, or within its environment. In the foregoing we found some occasion, however, to doubt that decoherence and the consistency conditions are both relevant to measurement. In particular it was found that the von Neumann entropy of the object system need not increase as a result of measurement (cf. section 3.3.6). In order to try to separate fact from fiction, in the present section the idea of ‘decoherence’ is critically examined.
16 On the other hand, Omnès ([178], p. 136) emphasizes the independent significance of the consistency conditions.
3.4.1 Ergodicity
Daneri, Loinger and Prosperi (DLP) [181] were among the first to attempt a physical explanation of the vanishing of the “cross” terms in the density operator corresponding to state vector (3.5). Their explanation is based on the idea, alluded to in section 3.2.1, that the process in which the state (3.5) is realized is just a pre-measurement process. The instrument states appearing in (3.5) are ‘final’ states of this microscopic process of information transfer from the microscopic object to the measuring instrument. From a macroscopic perspective these states are not very different from the initial state of the measuring instrument. Subsequently, however, an amplification process should be active, amplifying the microscopic information to directly observable macroscopic dimensions. This amplification process has a number of aspects:
i) A pointer of a measuring instrument has to be a macroscopically observable phenomenon, like a track consisting of water drops in a Wilson cloud chamber, or a voltage drop in a photon detector. The idea is that there is a unique correspondence between the macroscopic characteristics of these phenomena (like the radius of curvature of the track in the Wilson chamber, or the height of the voltage drop) and the different measurement results.
ii) A measurement result does not depend on the microscopic details of the pointer. It is obtained as an average, determined by the collective behavior of the atoms constituting the pointer. Each atom can fluctuate individually, without exerting any observable influence on the collective behavior of the pointer.
iii) The amplification process is considered as a process in which the macroscopic pointer approaches a state of thermodynamic equilibrium. In the amplification process the interaction between object and measuring instrument does not play a role any more. Only the internal dynamics of the measuring instrument is involved.
As in statistical thermodynamics, DLP distinguish between microstates (describing the instantaneous microscopic details of the pointer) and macrostates, represented by the subspaces of the pointer’s Hilbert space containing all microstates corresponding to a given measurement result. The basic idea is that the relation between microstates and macrostates is governed by an ergodicity property, allowing time averages to be approximated by ensemble averages over the microstates corresponding
to a macrostate. This idea of ergodicity entails the vanishing of the “cross” terms between different measurement results in the density operator corresponding to (3.5) or, more generally, (3.10), thus yielding (3.13). The disappearance of the “cross” terms is often considered as a reflection of the irreversible character of the measurement process. A more elementary way to get this same result is the application of a random-phase approximation (e.g. van Kampen [182], Machida and Namiki [183]), in which it is assumed that the microscopic fluctuations influence the phases of the coefficients in (3.10) in a random way without affecting their absolute values. On averaging, products of coefficients belonging to different measurement results then vanish. DLP [184] remark that they are firmly convinced that further progress in this field of research will essentially consist of refinements of their approach.
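The effect of the random-phase approximation can be illustrated by a small Monte Carlo sketch. It assumes, for illustration only, that each coefficient acquires an independent phase drawn uniformly from [0, 2*pi) in every realization; the coefficients `c`, the sample size and the seed are hypothetical choices.

```python
import cmath
import random

def averaged_density_matrix(c, samples=20000, seed=0):
    """Average of c_m c_n^* over independent uniformly random phases of the c_m."""
    rng = random.Random(seed)
    n = len(c)
    acc = [[0j] * n for _ in range(n)]
    for _ in range(samples):
        # Random phases change the arguments of the coefficients, not their moduli.
        phased = [ck * cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi)) for ck in c]
        for m in range(n):
            for k in range(n):
                acc[m][k] += phased[m] * phased[k].conjugate()
    return [[x / samples for x in row] for row in acc]

c = [0.6, 0.8]   # hypothetical coefficients with |c_0|^2 + |c_1|^2 = 1
rho_avg = averaged_density_matrix(c)

print(abs(rho_avg[0][0]), abs(rho_avg[1][1]), abs(rho_avg[0][1]))
```

The diagonal elements keep the values |c_m|^2, while the averaged “cross” terms shrink roughly as the inverse square root of the number of samples.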
3.4.2 Environment-induced superselection
Perhaps it is possible to consider the approach by Zeh [185, 180] and by Zurek et al. [186] as a refinement of the above-mentioned kind. In this approach an explanation of the ergodic behavior of the measuring instrument is sought in the interaction between the measuring instrument and its environment. Random temperature fluctuations of the environment are thought to be responsible for the irreversible character of the second part of the measurement process, in which the “cross” terms are wiped out. According to this view macroscopic systems are open systems. Note that by DLP ergodicity was seen as a consequence of the mutual interaction of the atoms constituting the measuring instrument, thus evading any necessity to refer to it as an open system. It is possible, however, to consider the DLP theory as a theory of open systems if the description of the measuring instrument is restricted to its macroscopic degrees of freedom, the microscopic degrees being considered as belonging to the environment (e.g. [187]).
In order to warrant that the correlation between the initial state of the object and the final state of the measuring instrument, brought about in the pre-measurement, does not get lost, it is necessary that during the averaging process the fluctuations of the instrument state are restricted to the eigenspace of the pointer observable belonging to the measurement result obtained. Therefore it has to be required [186] that the Hamiltonian describing the interaction between measuring instrument and environment commute with the pointer observable. No interaction between object and measuring instrument being assumed to exist in this phase, this requirement is satisfied if the time evolution of the interaction of measuring instrument and environment is analogous to a measurement of the first kind, viz.
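$$|\theta_m\rangle \otimes |\epsilon_0\rangle \rightarrow |\theta_m\rangle \otimes |\epsilon_m\rangle,$$
in which $|\theta_m\rangle$ is the pointer state corresponding to measurement result $m$, and the states $|\epsilon_0\rangle$ and $|\epsilon_m\rangle$ are environment states17. Assuming the pre-measurement to be of the first kind, the time evolution after the pre-measurement is then given by
$$\sum_m c_m\, |a_m\rangle \otimes |\theta_m\rangle \otimes |\epsilon_0\rangle \rightarrow \sum_m c_m\, |a_m\rangle \otimes |\theta_m\rangle \otimes |\epsilon_m\rangle.$$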
The basic idea employed by Zurek is that, since the environment is a system with very many degrees of freedom, environment states corresponding to different pointer positions should be orthogonal. As a consequence, the density operator of the subsystem object+measuring instrument, obtained by taking the partial trace over the Hilbert space of the environment, is given by (3.8). According to Zurek this implies that the measurement is actually performed by the environment. Due to the interaction between environment and measuring instrument the state of the system object+measuring instrument is no longer described by a superposition of macroscopically distinct states. This is often referred to as environment-induced superselection. According to Zurek the interaction between environment and measuring instrument selects the pointer observable as the observable measured by the environment in a minimally disturbing way (thus warranting the stability observed in ‘classical’ states of a macroscopic object). Such an observable could be called a macroscopic observable, its eigenvectors determining the so-called preferred pointer basis, corresponding to the macroscopically observable pointer positions of the measuring instrument (see also [188]).
In Zurek’s treatment the emphasis is on the amplification process and the accompanying decoherence effect that is responsible for the disappearance of the “cross” terms and the concomitant weak projection. As with DLP the macroscopic character of the measuring instrument plays an important role, the basic idea being that a macroscopic quantum system is never isolated from its environment, and therefore is not described by a Schrödinger equation (the latter being thought to describe only isolated systems). In the transition from the microscopic (quantum) to the macroscopic (classical) domain the decoherence, induced by the interaction with the environment, destroys the superpositions causing all the trouble with Schrödinger’s cat.
According to Zurek, only states surviving the process of decoherence can have a meaning in the classical limit.
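The way orthogonality of the environment states removes the “cross” terms from the reduced density operator can be sketched as follows. The example assumes a two-valued pointer correlated with environment states in a small (here two-dimensional) environment space; the coefficients and environment vectors are hypothetical, and the function name is illustrative.

```python
from math import cos, sin, pi

def reduced_density_matrix(c, env_states):
    """Partial trace over the environment of sum_m c[m] |m> (x) |env_states[m]>."""
    # Amplitudes psi[m][k] of the joint state on the product basis |m>|k>.
    psi = [[cm * ek for ek in e] for cm, e in zip(c, env_states)]
    n = len(c)
    return [[sum(a * complex(b).conjugate() for a, b in zip(psi[m], psi[k]))
             for k in range(n)] for m in range(n)]

c = [0.6, 0.8]   # hypothetical coefficients of the two pointer positions

# Orthogonal environment states: the "cross" terms vanish.
orthogonal = reduced_density_matrix(c, [[1.0, 0.0], [0.0, 1.0]])

# Overlapping environment states: the "cross" terms are only reduced,
# by a factor equal to the overlap cos(theta) of the environment states.
theta = pi / 3
overlapping = reduced_density_matrix(c, [[1.0, 0.0], [cos(theta), sin(theta)]])

print(abs(orthogonal[0][1]), abs(overlapping[0][1]))
```

The off-diagonal element of the reduced matrix equals the product of the coefficients multiplied by the overlap of the two environment states, so it vanishes exactly only for orthogonal environment states.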
17 Strictly speaking, the right-hand side of this expression should be replaced by one in which the pointer states are different states corresponding to the same pointer position. This refinement will not be considered here, because it does not affect the conclusions.
3.4.3 Spontaneous localization
A decoherence model has also been developed by Ghirardi, Rimini and Weber (GRW) [132]. It is assumed that something like von Neumann projection occurs spontaneously, thus wiping out the “cross” terms18. Spontaneous localization is realized by inserting an extra term in the Liouville-von Neumann equation (1.44). For a system consisting of one microscopic particle GRW propose to change this equation as follows:
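$$\frac{d\rho}{dt} = -\frac{i}{\hbar}[H, \rho] - \lambda \left( \rho - \sqrt{\frac{\alpha}{\pi}} \int dx\, e^{-\frac{\alpha}{2}(Q - x)^2} \rho\, e^{-\frac{\alpha}{2}(Q - x)^2} \right), \qquad (3.47)$$
$Q$ the position operator. In order to get an impression of the effect of the last term, the equation is considered omitting the Hamiltonian $H$, thus
$$\frac{d\rho}{dt} = -\lambda \left( \rho - \sqrt{\frac{\alpha}{\pi}} \int dx\, e^{-\frac{\alpha}{2}(Q - x)^2} \rho\, e^{-\frac{\alpha}{2}(Q - x)^2} \right).$$
In the position representation this yields
$$\frac{\partial}{\partial t} \langle q|\rho|q'\rangle = -\lambda \left( 1 - \sqrt{\frac{\alpha}{\pi}} \int dx\, e^{-\frac{\alpha}{2}[(q - x)^2 + (q' - x)^2]} \right) \langle q|\rho|q'\rangle.$$
This can easily be reduced to
$$\frac{\partial}{\partial t} \langle q|\rho|q'\rangle = -\lambda \left( 1 - e^{-\frac{\alpha}{4}(q - q')^2} \right) \langle q|\rho|q'\rangle,$$
which can be integrated, yielding
$$\langle q|\rho(t)|q'\rangle = \exp\left[ -\lambda t \left( 1 - e^{-\frac{\alpha}{4}(q - q')^2} \right) \right] \langle q|\rho(0)|q'\rangle. \qquad (3.48)$$
Hence, the diagonal elements ($q = q'$) are constant in time, whereas for $q \neq q'$
$$\lim_{t \rightarrow \infty} \langle q|\rho(t)|q'\rangle = 0.$$
Hence, in the limit of large $t$ the extra term indeed eliminates the “cross” terms in the position representation. For this reason the effect described here is sometimes called spontaneous localization. GRW demonstrate that it is possible to choose such values for the constants $\lambda$ and $\alpha$ that $\lambda t \ll 1$ for times relevant to atomic processes. For this choice of the constants the decay time $1/\lambda$ of the decoherence process is much larger than the characteristic time of the atomic processes. Hence, these processes are not influenced by the decoherence effect: on this scale the term $-(i/\hbar)[H, \rho]$ is the dominant one in (3.47).
For a system of $N$ particles the equation has to be modified because each separate particle is liable to suffer spontaneous localization. Accordingly, equation (3.47) is generalized by GRW to
$$\frac{d\rho}{dt} = -\frac{i}{\hbar}[H, \rho] - \lambda \sum_{i=1}^{N} \left( \rho - \sqrt{\frac{\alpha}{\pi}} \int dx\, e^{-\frac{\alpha}{2}(Q_i - x)^2} \rho\, e^{-\frac{\alpha}{2}(Q_i - x)^2} \right),$$
$Q_i$ the position operator of particle $i$. A calculation analogous to the one given above yields in the joint position representation of all $N$ particles a result that is analogous to (3.48), viz.
$$\langle q_1 \ldots q_N|\rho(t)|q'_1 \ldots q'_N\rangle = \exp\left[ -\lambda t \sum_{i=1}^{N} \left( 1 - e^{-\frac{\alpha}{4}(q_i - q'_i)^2} \right) \right] \langle q_1 \ldots q_N|\rho(0)|q'_1 \ldots q'_N\rangle.$$
Once again the diagonal terms are not affected by the decoherence effect. However, the interference terms are erased faster than is the case for a single particle, the decay time now being given by $1/(N\lambda)$. Hence, the parameters can be chosen in such a way that for macroscopic objects the interference terms between macroscopically distinguishable states are effectively erased, whereas microscopic processes are not influenced.
18 Here it will be ignored that the GRW model even seems to aim at projection in the strong sense of (1.70).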
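The orders of magnitude involved in spontaneous localization can be explored with a short sketch. It assumes the exponential decay law of the off-diagonal elements in the position representation with the Hamiltonian neglected, and it assumes that all particles of an N-particle system are displaced by the same distance. The numerical values of `LAMBDA` and `ALPHA` are the ones commonly quoted for the GRW model; they are assumptions of this sketch, not values taken from the passage above.

```python
from math import exp

LAMBDA = 1e-16   # s^-1, commonly quoted GRW localization rate (assumption)
ALPHA = 1e14     # m^-2, localization length 1/sqrt(ALPHA) = 1e-7 m (assumption)

def coherence_factor(t, dq, n_particles=1, lam=LAMBDA, alpha=ALPHA):
    """Factor multiplying <q|rho|q'> after time t (seconds) for |q - q'| = dq (metres).

    Assumes H is neglected and all n_particles are displaced by the same dq.
    """
    return exp(-n_particles * lam * t * (1.0 - exp(-alpha * dq**2 / 4.0)))

# Single particle, time scale of an atomic process: coherence is untouched.
atomic = coherence_factor(t=1e-8, dq=1e-6)

# Macroscopic body of about 1e23 particles: coherence over 1e-6 m is lost
# within a microsecond.
macroscopic = coherence_factor(t=1e-6, dq=1e-6, n_particles=1e23)

# Diagonal elements (dq = 0) are never affected.
diagonal = coherence_factor(t=1.0, dq=0.0, n_particles=1e23)

print(atomic, macroscopic, diagonal)
```

With these parameter values the single-particle decay time 1/LAMBDA is of the order of 1e16 s, whereas for the macroscopic body the effective rate is multiplied by the number of particles.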
3.4.4 General evolution equation for open systems
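Lindblad [189] has derived a general evolution equation for open systems on the basis of certain conditions of ergodicity and requirements of (complete) positivity of the density operator,
$$\frac{d\rho}{dt} = -\frac{i}{\hbar}[H, \rho] + \sum_k \left( L_k \rho L_k^\dagger - \tfrac{1}{2} L_k^\dagger L_k \rho - \tfrac{1}{2} \rho L_k^\dagger L_k \right), \qquad (3.49)$$
in which the so-called Lindblad operators $L_k$ describe the influence of the environment. This equation is called the ‘quantum master equation’. It is a starting point of many present-day investigations into the decoherence problem (e.g. [190]). For a measurement of standard observable $A$ the Lindblad operators are often chosen as $L = \sqrt{\gamma}\, A$, with $\gamma > 0$ a coupling constant. Then (3.49) reduces to
$$\frac{d\rho}{dt} = -\frac{i}{\hbar}[H, \rho] - \frac{\gamma}{2} [A, [A, \rho]]. \qquad (3.50)$$
Neglecting the influence of Hamiltonian $H$ compared to the last term of (3.50), and representing $\rho$ according to $\rho = \sum_{mn} \rho_{mn} |a_m\rangle\langle a_n|$, with $|a_m\rangle$ the eigenvectors of $A$, it can easily be verified that the coefficients $\rho_{mn}$ satisfy the equations
$$\frac{d\rho_{mn}}{dt} = -\frac{\gamma}{2} (a_m - a_n)^2 \rho_{mn}. \qquad (3.51)$$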
It follows that, for eigenvectors of $A$ with different eigenvalues $a_m \neq a_n$, the matrix element $\rho_{mn}$ of the density operator tends to zero within a relaxation time determined by the strength of the coupling to the environment and by the difference of the eigenvalues, whereas $\rho_{mn}$ is not changed by the decohering mechanism if $a_m = a_n$ (compare the GRW approach, valid if $A$ is the position observable). This realizes weak projection of the state of the microscopic object. Applying (3.50) to the measuring instrument, with $A$ the pointer observable, would yield the analogous effect for this latter object. An attractive feature of the Lindblad equation is that it does not distinguish between a pre-measurement phase and a decoherence phase of the measurement, but assumes both to be operative simultaneously. This makes the approach particularly suited for describing a microscopic object subject to a continuous measurement, performed during a prolonged period of time (compare e.g. Caves and Milburn [191] for a model of (continuous) position measurement with $A$ the position observable in (3.50); also Barchielli et al. [192]).
There are two different ways to look upon the extra term in the Liouville-von Neumann equation (cf. section 2.5.3):
i) the modified theory is fundamentally different from quantum mechanics, the latter theory being an approximation valid for microscopic particles;
ii) ordinary quantum mechanics remains valid for all domains, but it is necessary to take into account influences of the environment: quantum systems, in particular macroscopic ones, are open systems; the extra terms should be explained as describing environmental influences in a description in which the environment is not explicitly displayed.
It seems that the GRW approach favors the first option (the localizations being thought to be spontaneous, hence not induced by any outside agent), whereas the Zurek approach may be inspired more by the second possibility.
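The decay of the off-diagonal coefficients under a master equation with a double-commutator term can be checked numerically. The sketch below makes several assumptions of its own: the Hamiltonian is neglected, the equation is taken in the form d(rho)/dt = -(gamma/2)[A,[A,rho]] with a hypothetical coupling constant `gamma`, A is sigma_z on a two-dimensional space, and a simple Euler scheme is used for the integration.

```python
from math import exp

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dephasing_rhs(rho, A, gamma):
    """Right-hand side -(gamma/2) [A, [A, rho]] (Hamiltonian neglected)."""
    AA = matmul(A, A)
    t1, t2, t3 = matmul(AA, rho), matmul(matmul(A, rho), A), matmul(rho, AA)
    n = len(A)
    return [[-0.5 * gamma * (t1[i][j] - 2.0 * t2[i][j] + t3[i][j])
             for j in range(n)] for i in range(n)]

def evolve(rho, A, gamma, t, steps=10000):
    """Euler integration of the dephasing equation up to time t."""
    dt = t / steps
    n = len(A)
    for _ in range(steps):
        d = dephasing_rhs(rho, A, gamma)
        rho = [[rho[i][j] + dt * d[i][j] for j in range(n)] for i in range(n)]
    return rho

A = [[1.0, 0.0], [0.0, -1.0]]        # observable with eigenvalues +1 and -1
rho0 = [[0.5, 0.5], [0.5, 0.5]]      # equal-weight superposition of both eigenvectors
gamma, t = 1.0, 1.0                  # hypothetical coupling constant and time

rho_t = evolve(rho0, A, gamma, t)
expected_offdiag = rho0[0][1] * exp(-0.5 * gamma * (1.0 - (-1.0)) ** 2 * t)

print(rho_t[0][0], rho_t[0][1], expected_offdiag)
```

The diagonal elements stay at their initial values, while the off-diagonal element follows the exponential decay with rate (gamma/2)(a_m - a_n)^2 up to the discretization error of the Euler scheme.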
3.4.5 Critique of the decoherence solution to the problem of quantum measurement
The developments discussed above have the intention to solve the conventional “measurement problem” as defined in section 3.1.1. In this respect the decoherence programme is very successful. It is indubitable that interaction with the environment may cause decoherence, in the sense of reducing or even wiping out “cross” terms. This also clearly follows from a number of model calculations (e.g. [193, 194, 195]). A different question is, however, whether this really solves the problem of quantum measurement (to be distinguished clearly from the “measurement problem”). Do the fluctuations in the environment really have a fundamental significance in thwarting observability of the “cross” terms in (3.12), or do these fluctuations just constitute a practical problem, not fundamentally related to the problem of quantum mechanical measurement? The “measurement problem” is a consequence of the application of quantum
mechanics -in particular, the superposition principle- to the macroscopic measuring instrument. As seen from (3.51), the problem is seemingly solved by applying the quantum master equation (3.50) to it rather than the Schrödinger equation, and by making the choice thus wiping out the “cross” terms of the pointer states. Yet, as a solution to the problem of quantum measurement this approach has a couple of drawbacks. Thus, in (3.50) it is irrelevant that the macroscopic object is a quantum mechanical measuring instrument. The problem has a bearing on the quantum mechanical description of any macroscopic object. Therefore, the question of the adequacy of quantum master equations like (3.49) or (3.50) is related to the classical limit of quantum mechanics, dealing with the quantum mechanical description of macroscopic objects. This question is not in any particular way connected to the problem of measurement (although, of course, the macroscopic parts of measuring instruments also are involved). For several reasons the intimate connection, often observed, between the problem of the classical limit of quantum mechanics and the problem of quantum measurement is a confusing one, biasing both problems in an unnecessary way. Thus, in Zurek’s theory of environment-induced superselection the pointer basis is determined by the interaction between measuring instrument and environment. In earlier work [186] the pointer basis is determined by the requirement that be a standard observable which is conserved under the interaction with the environment (cf. (3.45)). Evidently, under the influence of the theory of measurement as discussed in section 3.2, the pointer states are considered here as the orthogonal eigenvectors of a Hermitian pointer observable. 
In later work [195] the emphasis is on the distinction between ‘classical’ and ‘non-classical’ states, the former being stable under the interaction with the environment while the latter rapidly decay to ‘classical’ states under the influence of decoherence. It is found that the ‘classical’ states are best represented by coherent states (cf. appendix A.4). Hence, pointer states are not orthogonal any more, thus making obsolete the ideas of Hermitian pointer observables and orthogonal pointer bases as expressed by (3.45). In order to deal with the classical limit of the quantum mechanical description of pointers a transition from an orthogonal to a non-orthogonal pointer basis seems to be a necessary one. Indeed, it does not seem possible to represent macroscopic objects (to which position and momentum can both be attributed) by the orthogonal eigenvectors of a Hermitian operator. For this reason in recent literature dealing with the classical limit of quantum mechanics (e.g. [196]) the Lindblad operators are often chosen so as to achieve (approximate) diagonalization in a (generalized) coherent state basis (for instance the non-Hermitian annihilation operator defined in (A.12)) rather than the pointer observable Position and momentum now being on an equal footing, this seems to urge a quantum mechanical description of a simultaneous observation of position and momentum, not provided by the standard formalism. At this moment it is not completely clear whether it is
possible to achieve such a description by taking into account the interaction with the environment (cf. section 2.5.3). Detailed investigations as the ones referred to above may be instrumental in answering this question. It must be noticed that this program may have been hampered severely by the connection with the conventional “measurement problem”, in which often the Hermitian observable allowed the pointer states to be orthogonal. Conversely, the problem of quantum mechanical measurement seems to be influenced in an undesirable way by the problem of the classical limit of quantum mechanics. Quantum measurement is determined primarily by the pre-measurement phase described by (3.10), transferring microscopic information from the microscopic object to the measuring instrument. This problem is of a microscopic nature, determined by the interaction Hamiltonian of the microscopic object and that part of the measuring instrument that is sensitive to the microscopic information. It is in the first place a matter of the experimenter’s ingenuity in devising methods to bring about a correlation between the observable A to be measured and the pointer observable As seen from the example given in (3.4) this to a considerable extent determines the choice of the pointer observable (which in the example is the position observable). From the point of view of quantum mechanical measurement theory orthogonal pointer states are unproblematic as long as it is realized that these states will need amplification to become directly observable at the macroscopic level. In particular in the pre-measurement phase it is important to avoid as much as possible disturbing influences, including decoherence, thus enabling the information transfer to the measuring instrument to be as faithful as possible. It seems that the decoherence programme underestimates the importance of the pre-measurement phase by emphasizing the macroscopicity issue. 
Pointer observable and pointer states are related in the first place to the interaction between microscopic object and measuring instrument in the pre-measurement phase. Decohering influences are only allowed in the amplification phase, contributing to a non-unitary evolution from the (possibly) orthogonal states to non-orthogonal ‘classical’ states. From the point of view of quantum mechanical measurement theory this evolution is not very interesting as long as it does not introduce measurement inaccuracies by correlating a wrong ‘classical’ state with a final state of the pre-measurement phase. The requirement (3.45), relevant to the interaction of pointer and environment, may be a necessary condition. However, the interaction between object and measuring instrument seems to be more important. This latter interaction should be able to induce transitions between different pointer states, implying that the pointer observable does not commute with the Hamiltonian operator describing this interaction (cf. (3.4)). It is this microscopic part of the measurement procedure that determines the pointer observable in the first place. This issue is largely independent of decoherence.
Failure to distinguish in the quantum measurement process the pre-measurement phase (ideally not being subject to decoherence) and the amplification phase is a source of confusion. For instance, Presilla et al. [197] assume that a correlation between object and instrument variables exists from the outset, thus neglecting the essential role of pre-measurement in establishing such a correlation. One instance of the confusion caused by the alleged intimate relation of pre-measurement and decoherence is that often the master equations (3.49) and (3.50) are applied to the microscopic object rather than to the measuring instrument. As a matter of fact, this is the only way to implement decoherence in a realist interpretation in which the measuring instrument is not dealt with explicitly. Then, as follows from (3.51), a choice of A as the standard observable actually measured entails that the von Neumann-Lüders projection “really” takes place. It is evident from our discussion that such an application of decoherence is neither necessary (since we are dealing here with a microscopic object, and, hence, there is not a single reason why the “cross” terms would have to be wiped out at all) nor realistic (because hardly any realistic measurement procedure satisfies von Neumann-Lüders projection). Moreover, application of a quantum master equation to the microscopic object is interesting only if the influence of decoherence on the preparation (rather than measurement) of this object is studied. This is still another way to illustrate the rather remote connection between decoherence and measurement as far as fundamentals are concerned. Of course, the decohering influence of the environment remains a practical problem in any measurement.
The fundamental significance of decoherence for the macroscopicity issue (as opposed to its practical significance) might also be questioned. The existence of “cross” terms is the distinguishing trait between quantum and classical mechanics. Quantum effects exist because of the existence of “cross” terms. With the “cross” terms the decoherence mechanism is also erasing the quantum effects.
If the decoherence mechanism were fully effective in macroscopic objects, macroscopic quantum effects like quantum tunneling in superconducting quantum interference devices (SQUIDs) [198] would be made impossible. For this reason it is not surprising that proponents of the decoherence theory take considerable pains to demonstrate that decoherence effects in superconducting systems are negligible [199], and that the superposition of two different supercurrents in a SQUID is not disturbed by the decoherence in a measure that is observable with present-day technology. Like the ionic experiment referred to in section 3.1.3, the experiments with SQUIDs, too, have the intention to probe the extent to which interference phenomena can be observed in macroscopic objects. Progress in this direction is possible only on the basis of a belief in the possibility that decoherence can be beaten, for instance, by performing measurements that are completed within the relaxation time of the decohering mechanism (cf. section 8.5.5), or by developing techniques to retrieve the ideal quantum mechanical information from a noisy signal (cf. section 7.9.4).
The classical limit of quantum mechanics is an interesting and important problem. Can classical mechanics really be retrieved in the macroscopic limit of quantum mechanics (cf. section 2.5.3)? Or is it possible, for instance by performing measurements faster than the relaxation time of the decohering mechanism, to get observational evidence of the “cross” terms even for macroscopic objects? Is there a fundamental difference between macroscopic and microscopic measurement? These questions will not be pursued here any further, however, because it is the microscopic problem of quantum mechanical measurement that is our main issue. As far as decoherence plays a role here, it will be necessary to take it into account already in the pre-measurement phase. Decoherence may lead to nonideality of a measurement (see section 7.6). Rather than being instrumental in providing a basis for “understanding” quantum mechanical measurements, it seems that the decoherence phenomenon is important because it is operative in most practical measurement procedures in a way disturbing the information to be obtained. This is a subject that will be dealt with in chapters 7 and 8. It will turn out that for doing so it is necessary to extend the mathematical formalism of quantum mechanics so as to encompass POVMs. As already noted in section 2.4.2, such an extension will at least be necessary for dealing with observation methods within the domain of classical mechanics, as far as describable by means of quantum mechanical observables. This makes application of standard quantum mechanics to the observation of the pointer position of a measuring instrument (or Schrödinger’s cat) rather dubious.
Chapter 4 The Copenhagen interpretation
4.1 Introduction
The Copenhagen or “orthodox” interpretation of quantum mechanics is the interpretation as mainly developed by its “founding fathers”, Bohr and Heisenberg¹. Its main ingredients are: i) the thesis of the completeness of quantum mechanics, ii) the correspondence principle, iii) the complementarity principle. These issues will be discussed in the present chapter. For a long time the Copenhagen interpretation has been the dominant interpretation within the physics community, especially after Bohr’s “victory” over Einstein in the debate on the completeness of quantum mechanics. Unfortunately, what precisely is the Copenhagen interpretation is not uniquely defined. This has several causes. In the first place, in particular Bohr’s ideas are not always formulated with the clarity necessary to be able to attribute a unique meaning to them. Bohr has never tried to state an explicit definition of the interpretation, which is implicit in a large number of essays (cf. Bohr [200, 72]) in which he developed his thoughts with respect to the meaning of quantum mechanics². In the second place Bohr and Heisenberg did not always share precisely the same views. Whereas Bohr’s approach to the problems posed by atomic physics was a more conceptual and philosophical one, Heisenberg’s approach was of a more physical nature. They, however, often chose their formulations so as not to contradict each other too conspicuously, thus suggesting a unity of thought that was actually not always present. In the third place, the Copenhagen interpretation has also experienced important “alien” influences, for instance, by Dirac [1] and von Neumann [2] who had a more mathematical attitude toward quantum mechanics, thus introducing additional elements not present in the approaches by Bohr and Heisenberg (cf. section 4.6.6). Finally, the role of human consciousness (cf. section 4.6.7) is an issue contributing elements to the interpretation of quantum mechanics that have been considered as controversial by many physicists accepting the views advocated by Bohr and Heisenberg. Consequently, a number of variants of the Copenhagen interpretation exist, and it is not always easy to decide which is the “orthodox” one. In particular, Bohr’s instrumentalist attitude with respect to the interpretation of the wave function (cf. section 2.1) is not always appreciated, adherence to the Copenhagen view often being combined with a realist interpretation. This may be a reason why it is sometimes thought to be controversial whether the projection postulate (being a fruit of realist thinking, cf. section 4.6.6) must be considered as part of the Copenhagen interpretation, or not. Attempts at characterizing the Copenhagen interpretation have been undertaken by e.g. Petersen [203], Jammer [204] and Stapp [205] (see also Beller [206] for a historical account). Stapp mentions a pragmatic attitude as an essential ingredient of the Copenhagen interpretation. Such a pragmatic attitude with respect to the mathematical formalism of quantum mechanics may be akin to an instrumentalist interpretation, instrumentalism providing a certain latitude not to bother about the physical meaning of the mathematical quantities of the theory. It was already noted in section 2.1 that such a pragmatism may have certain advantages, in the sense that a certain vagueness about the precise meaning of certain terms may be helpful in evading inconsistencies and in uniting views that may be different on closer scrutiny.
¹ As “founding fathers” also Born and Jordan (cf. section 6.2) could be mentioned.
² The most comprehensive account by Bohr of his ideas can be found in his review papers [201] and [202].
Such a pragmatism may be advantageous if one’s goal is just to develop an instrument for calculating certain figures to be compared with experimental data obtained in measurements. In particular, if it is not specified whether a measurement result has to be interpreted as a property of the measuring instrument (as in an empiricist interpretation, cf. section 2.2), or as a property of the microscopic object (as in a realist interpretation, either in an objectivistic or a contextualistic sense, cf. section 2.3) to be possessed either before, during or after the measurement, there is ample opportunity to make theory fit in with experiment. Most of these possibilities can be encountered under Copenhagen auspices. The pragmatic view is very popular among physicists who are averse to metaphysics, and who “consider the wave function as an indispensable tool for quantum mechanical calculations” but “avoid all philosophical extrapolations of the physical facts” (van Kampen [207]). This may be one reason why the Copenhagen interpretation is sometimes associated with positivism/empiricism. A second reason may be that Bohr and Heisenberg have both expressed themselves in very positivistic-looking ways. For instance, Bohr ([208], p. 18): “... in our description of nature the purpose is not to disclose the real essence of the phenomena, but only to track down, as far as it is possible, relations between the manifold aspects of our experience.”
Here we should also mention Heisenberg’s empiricist attitude in developing quantum mechanics (compare section 2.2). It would be too hasty, however, to conclude from this that the Copenhagen interpretation would completely endorse a positivistic philosophy. Such a conclusion would be based on just a part of the Copenhagen legacy, viz, the completeness thesis of this interpretation. Renouncement of ‘incompleteness’ is often interpreted as a rejection of “metaphysical” questions involved in attempts at finding a completion of the quantum mechanical formalism. However, it was the intention, at least of Bohr and Heisenberg, to go quite a bit further in attempting to tell by means of quantum mechanics something about reality, than would be possible by merely “applying the formalism”. Indeed, Heisenberg ([209], p. 145) explicitly denies that the Copenhagen interpretation would be a positivistic one, while distinguishing between the reality of the positivistic sense impressions of an observer, and the reality of objects and events to be described by means of classical concepts. According to Folse ([210], p. 262) it also is clear from all of Bohr’s writings that he never adopted a positivistic model of science. Although Bohr was always very careful to stress the conceptual nature of the ideas of ‘correspondence’ and ‘complementarity’, his classical way of treating quantum mechanical observables within the context of a measurement can hardly be interpreted otherwise than implying a (contextualistic-)realist interpretation (cf. section 2.4.5). In discussing in the present chapter the Copenhagen interpretation it is intended to restrict attention mainly to the original ideas developed by Bohr and Heisenberg. However, it will be inevitable to consider also a number of “alien” influences.
4.2 Completeness of quantum mechanics
4.2.1 Completeness in a wider sense
According to Hawking ([211], p. 169) our goal is “a complete³ understanding of the events around us, and of our existence.” This goal stems from a desire to have a “Theory of everything” yielding a complete description (and explanation) of the whole (physical)⁴ reality. I shall refer to this ‘completeness’ concept as ‘completeness in a wider sense’ to distinguish it from the more restricted concept to be discussed in section 4.2.2. Hawking’s work radiates a strong conviction that quantum mechanics must be close to such a “Theory of everything”. Admittedly, there are still a number of problems in unifying gravitation with the other three fundamental forces of nature, and there does not yet exist a satisfactory model of the constitution of matter valid at the Planck length. However, this is considered as a purely technical problem, to be solved by diligence and “hard labor”. The idea is that the ultimate theory will be of an essentially quantum mechanical nature, in which in particular the Heisenberg inequality (cf. section 1.7.1) is imposing restrictions that cannot be circumvented [211].
³ Emphasis added [WMdM].
⁴ In the discussion I shall restrict myself mainly to physical reality, thus evading the question of physicalism, which tries to reduce all questions about reality to physical questions.
Considered as ‘complete in a wider sense’ quantum mechanics is thought to be an irreducibly statistical theory (see also section 6.2). It is deemed to be impossible to view quantum mechanics as a theory analogous to classical statistical mechanics, in which each particle has well-defined values of both position and momentum, their time dependence described deterministically by the equations of classical mechanics. No subquantum theory (hidden-variables theory) analogous to classical mechanics is thought to be able to yield the quantum mechanical probabilities (1.7) as relative frequencies in an ensemble of particles having sharp values for all (hidden and non-hidden) variables. Thus,
A main reason to believe in ‘completeness of quantum mechanics in a wider sense’ might be a positivistic fear of the metaphysical, rejecting hidden variables because of their unobservable, and hence metaphysical, character. The question of the ‘completeness of quantum mechanics in a wider sense’ interferes with the choice between a realist and an empiricist interpretation discussed in chapter 2. In an empiricist interpretation of quantum mechanics (cf. section 2.2) ‘completeness in a wider sense’ would mean that there does not exist anything outside the (macroscopic) observable phenomena, thus entailing a belief in the nonexistence of a reality of electrons, quarks, or even atoms, behind the phenomena. In such a view reality would encompass “just the phenomena”. Under the influence of logical positivism/empiricism (cf. section 2.2.1) such a view was popular among physicists in the past. The ban on subquantum theories, causing de Broglie to postpone for several decades his efforts in searching for a completion of quantum mechanics, must be attributed to a belief in the ‘completeness of quantum mechanics in a wider sense’ based on a positivist/empiricist attitude. Nowadays most physicists believe in the existence of microscopic objects behind the phenomena observed in measuring instruments. For this reason an empiricist interpretation of quantum mechanics is hardly reconcilable with a belief in ‘completeness in a wider sense’ of this theory. We have just as little reason to require that quantum mechanics describe the (sub)microscopic reality behind the phenomena, as to accept an explanation of the rigidity of a billiard ball on the basis of a theory
of rigid atoms. Consequently, in an empiricist interpretation of quantum mechanics the problem of ‘(in)completeness in a wider sense’ is largely reduced to the technical problem of transcending the domain of application of quantum mechanics by developing subquantum theories, and devising new experiments within the domains of validity of these theories (see also chapter 10). A combination of a realist interpretation of quantum mechanics and ‘completeness in a wider sense’ might be thought to imply a ban on subquantum theories. The motivation for such a ban would not seem to stem from fear of metaphysics, but from a conviction that quantum mechanics is the most fundamental theory possible. In a realist interpretation of quantum mechanics such an assumption of ‘completeness’ is not compulsory, however. As in the usual realist interpretation of classical mechanics, also within the domain of application of quantum mechanics models of different degrees of sophistication could exist, yielding descriptions of microscopic reality involving various degrees of elaborateness. Analogously to the possibility of microscopic models of a billiard ball in which the atomic constitution requires a quantum mechanical description, quantum mechanical objects may allow models of a subquantum nature. Yet, the belief in quantum mechanics as a complete theory about microscopic reality, interpreted in a realist sense, is surprisingly widespread. This might have two causes, viz, i) the absence of any experimental clue as to failure of quantum mechanics (suitably generalized to include POVMs, cf. section 1.9.1) to describe empirical evidence, ii) the problems encountered in devising subquantum theories (cf. chapter 10), suggesting the impossibility of improving on quantum mechanics, and inducing various attempts at proving the impossibility of subquantum (hidden-variables) theories. Neither of these reasonings is cogent, however.
Admittedly, the domain of application of quantum mechanics has an enormous extension, energy, for instance, ranging over more than ten orders of magnitude. Yet, this is no warrant that quantum mechanics will continue to be successful under extreme conditions, for instance, at arbitrarily high energies, short durations and/or short distances. In this respect we could learn from history. The idea of quantum mechanics as a “Theory of everything” has a remarkable resemblance to the mainstream view on classical mechanics towards the end of the 19th century, viz that by the development of a coherent theory of the interaction of charged particles and electromagnetic fields physics could be considered completed. Lord Kelvin [212] saw in 1901 only two small “clouds” in the sky, namely, the “anomalous” behavior of specific heat at low temperature and the Michelson-Morley experiment, but he was convinced that an explanation could be found for every difficulty. He did not suspect that these phenomena were actually evidence of a restricted applicability of classical mechanics and would give rise to the development of two fundamentally new theories, viz, quantum mechanics and relativity theory. The idea of classical mechanics (including the classical theory
of fields) as a complete description of physical reality was already obsolete by that time. At this moment no comparable experimental “clouds” with respect to the universal applicability of quantum mechanics exist. As yet, we have no reason to believe that experiment has reached the boundary of this theory’s domain of application. Yet, it would be rather frivolous to conclude from this that quantum mechanics would be universally valid. Under extreme experimental conditions quantum mechanics might break down. Quantum mechanics might be a (very good) approximation to a still more “fundamental” theory. In section 10.6 some heuristic ideas will be discussed implementing this idea, which might be useful for the purpose of solving the above-mentioned problems encountered in devising subquantum theories. Apart from the influence of positivism, encouraging a belief in the ‘completeness of quantum mechanics’ in the realist sense discussed here, this belief may have been promoted by a misunderstanding of the ‘completeness’ thesis advocated by Bohr. This thesis, and the discussion between Einstein and Bohr provoked by it, will be dealt with more fully in later sections. Here it must be stressed that this discussion is obscured by the fact that Einstein and Bohr did not start from the same concept of ‘completeness’ in assessing quantum mechanics, the former’s view presumably being close to a denial of ‘completeness in a wider sense’. It was Einstein who always insisted that physics should (also) describe the reality behind the phenomena (objective reality), and who tried to demonstrate that quantum mechanics was not able to do so because it provides only a statistical description of reality. Just like classical statistical mechanics, also quantum mechanics is thought to be incomplete because it does not allow to predict the result of an individual measurement. 
By demonstrating the physical impossibility that any of Einstein’s proposals might yield results transcending the quantum mechanical description, Bohr was always able to falsify Einstein’s incompleteness “proofs”. However, Bohr’s ‘completeness’ thesis does not regard ‘completeness in a wider sense’ as discussed above. Bohr and Heisenberg [213] were very well aware of the possibility that a transition to a new domain of experimentation might bring about the necessity of replacing quantum mechanics by a new theory, just like classical mechanics must be replaced by quantum mechanics when a transition is made from macroscopic to atomic physics. In particular, the problems encountered in applying quantum mechanics to the interaction of an electron with an electromagnetic field are important here. Thus, in a letter to Dirac (August 29, 1930) Bohr states: “I ... firmly believe that the solution of the present troubles will not be reached without a revision of our general physical ideas still deeper than that contemplated in the present quantum mechanics” (quoted from [214], p. 8). For this reason Bohr’s “victory” over Einstein in the ‘completeness’ debate can hardly be interpreted as a support of ‘completeness of quantum mechanics in a wider sense’. As already remarked in section 2.1, Bohr’s interpretation of the quantum mechanical wave
function was an instrumentalist rather than a realist one, and his “victory” over Einstein was hailed as a victory of (logical) positivism/empiricism over metaphysics. Also for this reason Bohr’s ‘completeness’ thesis might seem to endorse the empiricist ‘completeness’ claim referred to above rather than the realist claim involved in a belief in quantum mechanics as the most fundamental theory possible. However, Bohr does not even seem to endorse the empiricist claim. As will be seen in the next sections Bohr’s idea of ‘completeness’ was not at all related to the issue of ‘(in)completeness of quantum mechanics in a wider sense’. The concept of ‘completeness’ adhered to by Bohr was a different one, being closely related to the principles of ‘correspondence’ and ‘complementarity’ to be discussed extensively in sections 4.3 and 4.6. These principles were for Bohr main reasons to believe in a different ‘completeness’ concept of a strictly quantum mechanical nature, viz, ‘completeness of quantum mechanics in a restricted sense’, to be discussed in section 4.2.2. That Bohr, nevertheless, may have had a large influence in spreading the idea of ‘completeness of quantum mechanics in a wider sense’ may be caused by his attempts at interpreting the complementarity principle as a general property of human cognition rather than as a peculiarity of quantum mechanics (cf. section 4.6.3). Bohr’s attempts at applying the complementarity principle outside the domain of quantum mechanics, and even outside physics, have certainly contributed to a widening of the sense in which ‘completeness of quantum mechanics’ has been understood by many philosophers and physicists. Unfortunately, it is impossible to say that Bohr has not contributed to this misunderstanding.
4.2.2 Completeness in a restricted sense
According to Stapp [215], Bohr and Heisenberg attributed the following meaning to ‘completeness of quantum mechanics’: “...no theoretical construction can yield experimentally verifiable predictions about atomic phenomena that cannot be extracted from a quantum theoretical description.” This, indeed, is consistent with Heisenberg’s assertion [35] that “It is the theory which decides what we can observe”, suggesting that the domain of application of quantum mechanics delimits the possibility of all observation within the domain of atomic physics. In a reaction [213], included in Stapp’s paper [215], Heisenberg protests against an earlier version of Stapp’s ‘completeness’ definition in which ‘atomic phenomena’ was replaced by ‘physical phenomena’, including biological questions. According to Heisenberg the difference between the statements “The cell is alive” and “The cell is dead” cannot be expressed in terms of quantum mechanical statements about the state of the system. This, once again, demonstrates Heisenberg’s awareness of the restricted applicability of the quantum mechanical formalism. Stapp evidently did justice to Heisenberg by restricting in his ‘completeness’ definition the domain of application to ‘atomic’ phenomena.
It is clear that, due to this restriction, quantum mechanics has lost its claim of being a “theory of everything” even within physics: outside the domain of atomic phenomena we may need other theories than quantum mechanics! In the present section it is discussed why Bohr and Heisenberg thought that at least within the domain of atomic phenomena quantum mechanics is a complete theory. In dealing with this subject, neither Bohr nor Heisenberg has made any explicit reference to subquantum theories. As a matter of fact, the notion of ‘completeness’ addressed by them is different from that of ‘completeness in a wider sense’. In order to see this let us consider the well-known (thought) experiment in which a beam of particles is diffracted by a single slit in a screen S (cf. figure 4.1), with subsequent detection of the particles on screen B. It is impossible to predict with certainty the position where an individual particle will impinge on the latter screen. The quantum mechanical formalism yields only the probability $p(x)$ that a particle is found with coordinate $x$. In a somewhat schematic approach in which $\psi(x)$ represents the wave function we get $p(x) = |\psi(x)|^2$ (see section 7.3 for a more quantitative approach). The function $p(x)$ is maximal directly opposite the slit, decreasing monotonically toward larger distances. In classical mechanics it would be possible, in principle, to calculate the influence exerted by screen S on the particle 5, thus determining position and momentum immediately after passage through the slit. From these values, together with the distance L between S and B, the position of the particle at screen B can be calculated. If position and momentum at the slit were exactly known this would allow one to predict at which value of $x$ the particle will hit screen B. The central question in the famous discussion between Bohr and Einstein on the ‘completeness’ of quantum mechanics was whether, as in classical mechanics, sharp values can be simultaneously attributed to the quantum mechanical observables of position and momentum.
5
In diffraction experiments of the type discussed here it is always understood that only one particle at a time is passing the slit. Hence, interaction between particles can be neglected.
In the single-slit experiment this question can be applied to the position and momentum of the particle immediately after passage through the slit. In contrast to Einstein, Bohr and Heisenberg answered this question in the negative. According to them the statistical description of atomic phenomena by quantum mechanics cannot be completed in this sense. Of course, it was well known to Bohr and Heisenberg that quantum mechanics makes statements only about probabilities. Notwithstanding this, they upheld the thesis that quantum mechanics is a complete theory. At first sight this seems to be self-contradictory. However, we should be cautious about what they meant by ‘completeness’. The reason why Bohr and Heisenberg thought that position and momentum cannot simultaneously have sharp values is not based on any positivistic fear of the metaphysical. For them ‘completeness’ of quantum mechanics has a physical reason, related to the uncontrollable influence of screen S, preventing a precise determination of position and momentum. It is the peculiar difference between the interaction of object and measuring instrument in classical and quantum physics, embodied in the finiteness of the ‘quantum of action’, that is considered by them to be the important issue. The unique correspondence in classical mechanics between the phase space point the object is in and the measurement result obtained in a measurement is a consequence of the analyzability of the measurement process, which, in its turn, is a consequence of the deterministic character of that process. Bohr and Heisenberg have stressed that within the atomic domain measurements have an indeterministic character due to the fact that Planck’s constant is different from zero. (Inter)action is quantized, the “quantum of (inter)action” being equal to Planck’s constant. Due to the impossibility of making the quantity of interaction arbitrarily small, the interaction between object and measuring instrument, necessary for obtaining information on a (microscopic) object, contains an element of discontinuity.
This marks a fundamental difference with classical mechanics, in the domain of which theory the objects are macroscopic, enabling observation without any noticeable influence on the object, or, at least, allowing one to calculate this influence and compensate for it. The finiteness of the “quantum of (inter)action” makes it impossible, however, to neglect within the domain of atomic physics the influence of the measurement. The interaction energy is of the same order of magnitude as the relevant energies, or energy differences, to be measured. According to the Copenhagen account the discontinuity of the interaction makes the process of measurement fundamentally unanalyzable in atomic physics, thus obstructing a deterministic relation between the phase space point and the measurement result. Because this indeterminism is unavoidable in all measurements within the domain of quantum mechanics, this theory is a complete theory in the sense that it cannot be completed in a deterministic way.
Evidently, within the Copenhagen interpretation ‘completeness of quantum mechanics’ has a very specific meaning. It is related to the idea that it is impossible to ignore, within the domain of atomic physics, the essential influence of the measuring instrument in obtaining knowledge about the object. According to this interpretation the quantum mechanical formalism reflects this. Whereas in classical statistical mechanics the measured values (r and p) of position and momentum can be interpreted as objective properties of a particle, such an interpretation is thought to be impossible for the observables of quantum mechanics. According to the Copenhagen interpretation, within quantum mechanics the values of quantum mechanical observables like position and momentum cannot be considered as objective properties of a microscopic particle because they are co-determined by the interaction with the measuring instrument, which, due to the finiteness of Planck’s constant, submits them to a fundamental indeterminacy represented by the Heisenberg inequality. Quantum mechanical observables are thought to take their values only within a specified measurement context. Hence they admit at most a contextualistic rather than an objectivistic interpretation. Here the notion of ‘objectivistic’ as juxtaposed to ‘contextualistic’ should be taken as ‘independent of the observer including his measuring instruments’. The fundamental role of the measuring instrument within quantum mechanics is closely connected to the issues of ‘correspondence’ and ‘complementarity’ to be discussed in the next sections. It will be seen that in both of these issues it is the contextual meaning of quantum mechanics that, according to the Copenhagen interpretation, is the important issue. It is this contextuality that gave reasons to Bohr and Heisenberg for rejecting within quantum mechanics the possibility of completing the statistical description analogously to a completion of a classical statistical description (which should be achieved by simultaneously making more precise observations of position and momentum than allowed by the Heisenberg inequality).
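The contrast drawn above for the single-slit experiment of figure 4.1 can be illustrated numerically. The following is a minimal sketch, not the quantitative treatment of section 7.3. It assumes (i) a classical particle in free straight-line flight between S and B, so that its arrival position follows from the transverse position and the momentum components at the slit, and (ii) a Gaussian transverse wave packet as a stand-in for the slit, which yields a probability density that is maximal opposite the slit and decreases monotonically. The function names, the Gaussian model and all numerical values are illustrative assumptions that do not appear in the text.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def classical_hit_position(x_s, p_x, p_z, distance):
    """Deterministic arrival position on screen B (assumed free flight).

    x_s      : transverse position just behind the slit (m)
    p_x, p_z : transverse and longitudinal momentum just behind the slit
    distance : separation L between screens S and B (m)
    """
    return x_s + (p_x / p_z) * distance


def width_at_screen(sigma0, mass, p_z, distance):
    """Standard deviation of a freely spreading Gaussian packet at screen B.

    Assumption: the slit is modelled by a Gaussian of width sigma0, and the
    time of flight is t = mass * distance / p_z.
    """
    t = mass * distance / p_z
    spread = HBAR * t / (2.0 * mass * sigma0 ** 2)
    return sigma0 * math.sqrt(1.0 + spread ** 2)


def quantum_density(x, sigma0, mass, p_z, distance):
    """Probability density p(x) = |psi(x)|^2 on screen B for the Gaussian model."""
    s = width_at_screen(sigma0, mass, p_z, distance)
    return math.exp(-x ** 2 / (2.0 * s ** 2)) / (math.sqrt(2.0 * math.pi) * s)


if __name__ == "__main__":
    m_e = 9.1093837015e-31   # electron mass (kg), illustrative choice
    p_z = 1.0e-24            # longitudinal momentum (kg m/s), illustrative
    L = 1.0                  # distance between S and B (m)
    sigma0 = 1.0e-6          # effective slit width (m), illustrative

    # Classical mechanics: one definite arrival point per (x_s, p_x).
    print(classical_hit_position(0.0, 1.0e-28, p_z, L))

    # Quantum mechanics: only a distribution over arrival points.
    for x in (0.0, 5.0e-5, 1.0e-4):
        print(x, quantum_density(x, sigma0, m_e, p_z, L))
```

In the classical function a sharper specification of the initial values gives a sharper prediction without limit. In the Gaussian model, narrowing the slit (smaller sigma0) broadens the distribution on screen B, which is the behavior the Heisenberg inequality expresses.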
The issue of ‘objectivity versus contextuality’ is completely independent of the question of ‘(in)completeness in a wider sense’ because it refers to a property of the measurement process within the domain of quantum mechanics. The essential influence of the measurement interaction does not seem to provide any reason for assuming the impossibility of subquantum theories. As far as their domains of application would overlap with the domain of quantum mechanics, in order to reproduce the quantum mechanical data such theories would just have to duly account for the measurement-induced contextual meaning of the measurement results. It therefore is necessary to introduce a second notion of ‘completeness’ next to the notion of ‘completeness in a wider sense’. This will be referred to as ‘completeness in a restricted sense’:
Completeness of quantum mechanics in a restricted sense: The statistical description of quantum mechanics cannot be completed by determining precise values of r and p because the essential influence of the measuring instrument does not allow such a determination.
By ‘incompleteness in a restricted sense’ we shall understand the idea that the statistical description of quantum mechanics can be completed by attributing to each quantum mechanical observable a value, possessed by the individual object independently of measurement (compare the ‘possessed values’ principle, section 2.3). The notion of ‘completeness in a restricted sense’ refers only to quantum mechanical observables, and, hence, is applicable only within the domain of application of quantum mechanics. Even if ‘completeness in a restricted sense’ would prevent the simultaneous attribution of sharp values to incompatible quantum mechanical observables, this would not imply the impossibility of subquantum theories encompassing subquantum mechanical notions of position and momentum simultaneously having sharp values. On the other hand, the possibility of these latter notions does not imply that quantum mechanical quantities should simultaneously have sharp values: if quantum mechanical observables have a contextual meaning only, then their meaning might depend on the measurement interaction. The discussion between Bohr and Einstein on the completeness of quantum mechanics is obscured by a failure to clearly display the difference between the two notions of ‘completeness’. In Einstein’s view a theory making statistical statements is necessarily an incomplete theory. Einstein can be seen as one of the foremost proponents of causal explanation in physics (cf. section 2.3). By his assertion that “God does not play dice” he expressed his conviction that such a causal explanation must be possible also with respect to quantum mechanical statistics, and that, moreover, an individual quantum mechanical measurement result of position or momentum is obtained because the particle possessed that value as an objective property prior to measurement.
He evidently did not see any reason why the ‘completeness’ problem of quantum mechanics would be essentially different from the classical statistical mechanical one. In his opinion the Heisenberg inequality (1.77) could be interpreted as an objective statistical property of an ensemble (see also section 4.7 and chapter 6). He did not see any objection against a simultaneous attribution, like in classical statistical mechanics, of sharp values of position and momentum to an individual particle, the Heisenberg inequality restricting only the possibility of preparing an ensemble. Unlike Bohr and Heisenberg, Einstein did not see any impediment to the possibility of deterministic statements with respect to an individual particle. Relinquishing causality, as was proposed by Bohr and Heisenberg (cf. section 4.6.3), was characterized by him as a “tranquilizing philosophy”, relegating justified questions to the realm of metaphysics. It seems clear from this that Einstein had in mind the concept of ‘completeness in a wider sense’ (see also section 5.2.2), some subquantum theory being thought to be capable of providing the causal description quantum mechanics does not yield. As will become clear in the following (cf. chapters 5 and 6), in the discussion between Einstein and Bohr the issues of ‘completeness in a wider sense’ and ‘completeness in a restricted sense’ are thoroughly intertwined, thus causing quite a bit
of confusion. It was not always clear that the discussion was not about the possibility of subquantum theories, but about the question of whether, in understanding quantum mechanics, the interaction of object and measuring instrument plays the fundamental role attributed to it by the Copenhagen interpretation. For instance, Jammer ([216], section 6.1) considers the “...widespread view that it is the disturbance of the object by the observation that entails the principle of indeterminacy ...” as based on “...certain ambiguous statements concerning the inseparability of the object and the observer made by Bohr and Heisenberg themselves...” According to him “The meaning of the principle of indeterminacy is precisely the statement that such a corrective theory 6 is not possible.” Failure to distinguish the two issues directed the attention away from the question of why in the Copenhagen interpretation the quantum mechanical formalism was thought to yield a complete description of atomic reality. The real question, involved in the issue of ‘(in)completeness of quantum mechanics in a restricted sense’, was whether the quantum mechanical formalism yields a description of an objective reality (Einstein) or of a contextual reality (Bohr). In the present chapter this will be the main subject, aspects dealing with ‘(in)completeness of quantum mechanics in a wider sense’ being postponed until chapters 6 and 10. Bohr’s “victory” over Einstein, if existing at all, should be seen as supporting a thesis of quantum mechanics being ‘complete in a restricted sense’, i.e. not being interpretable as a description of a reality that is independent of the way it is observed [217], but describing all information about this reality to be obtained by means of quantum mechanical measurement. This leaves open the possibility that Bohr and Einstein might each be partially right: quantum mechanics might be both ‘complete in a restricted sense’ and ‘incomplete in a wider sense’.
Bohr’s belief that quantum mechanics just describes an observed reality, interacting with a measuring instrument, does not exclude the possibility of the existence of a reality behind the (quantum mechanical) phenomena, not described by quantum mechanics.
4.2.3
Entanglement of the ‘(in)completeness’ question with other issues; sources of confusion
Several other issues are interfering with the ‘(in)completeness’ discussion. Since these, by interfering, tend to confuse the issue, it is necessary to pay some attention to them already here, even though a full assessment can be achieved only on the basis of a more complete picture of the Copenhagen interpretation.
6
That is, a (subquantum) theory accounting for the interaction between object and measuring instrument [WMdM].
Individual-particle versus ensemble interpretations
In the Copenhagen interpretation the state vector is thought to describe an individual object rather than an ensemble. This interpretation is causing a number of problems, epitomized by Schrödinger’s cat problem discussed in chapter 3. It was already mentioned there that an interpretation of the state vector as a description of an ensemble could greatly alleviate these problems. As a matter of fact, the only experimental data that can be compared with the probabilities provided by the quantum mechanical formalism are the relative frequencies obtained in measurements performed on ensembles of (identically prepared) individual objects. Hence, to critics of the Copenhagen interpretation (cf. section 4.7) an ensemble interpretation of this formalism seemed to be far more natural than an individual-particle one. They felt that problems like Schrödinger’s cat problem were consequences of an unphysical assumption of ‘completeness’ of the quantum mechanical description, in the sense that the state vector is thought to completely describe an individual object. Due to the important position Schrödinger’s cat problem takes in the early discussions on the foundations of quantum mechanics it might be thought that it is the individual-particle interpretation that is characteristic of the Copenhagen interpretation. This is not the case, however. The issue of ‘ensemble versus individual-particle interpretation’ regards ‘(in)completeness in a wider sense’, not ‘(in)completeness in a restricted sense’. Different interpretations of quantum mechanics have been contemplated (e.g. Everett [93], Dieks [218]), in which the state vector is thought to describe an individual particle rather than an ensemble, but in which no special role is attributed to measurement in giving the state vector its meaning.
On the contrary, these interpretations are often presented as alternatives to the Copenhagen one, their objective primarily being to demonstrate that a formulation of quantum mechanics is possible in which ‘measurement’ does not play any fundamental role. In these non-Copenhagen individual-particle interpretations the state vector is thought to have an objective, i.e. noncontextual, meaning. Hence, such interpretations do not satisfy ‘completeness in a restricted sense’. In general, an individual-particle interpretation does not reflect the contextual meaning attributed by the Copenhagen interpretation to the quantum mechanical description. Hence, it would be inappropriate to characterize the Copenhagen interpretation by its ‘individual-particle’ nature. For this reason this subject will largely be ignored in the present chapter. The relative merits of ensemble interpretations versus individual-particle ones will be discussed in chapter 6.
(In)determinism
The discussions between Bohr and Einstein on the issue of ‘(in)completeness of quantum mechanics’ have often been interpreted as a controversy over the issue of
‘(in)determinism’. For instance, (in)determinism is involved when it is contemplated how in figure 4.1 the position where a particle will hit screen B could be predicted if its position and momentum are known when it is at screen S. Quantum mechanics may be thought, then, to be incomplete because it does not allow such a prediction. By his assertion that “God does not play dice” Einstein seemed to comply with the view that quantum mechanical incompleteness is a problem of ‘indeterminism’, while suggesting the possibility of restoring ‘determinism’. As will be seen in section 4.6.3, Bohr, too, saw ‘indeterminism’ as the feature characterizing quantum mechanics, but he was drawing an altogether different lesson from it: he actually tried to solve the quantum mechanical problems by abandoning ‘determinism’, replacing it by the concept of ‘complementarity’. Formulated in terms of ‘(in)determinism’ the issue of ‘(in)completeness in the wider sense’ might be thought to be at stake also here. Adding hidden variables, determining the precise values of position and momentum, might improve the description of a particle so as to make it deterministic. This seems to be the way Einstein looked upon the subject. As noted before, neither Bohr nor Heisenberg believed quantum mechanics to be ‘complete in a wider sense’. So, nothing stood in their way to admit the possibility of subquantum theories, yielding a deterministic underpinning of quantum mechanics. The only constraint on such theories would be the condition that, as far as they would make statements about phenomena that are also described by quantum mechanics (like the quantum jumps discussed in section 3.2.7), these statements should have an indeterministic nature.
Since this does not seem to be an unrealizable condition (compare the possibility of discontinuous behavior of coarse-grained quantities in classical theories on deterministic chaos), from this point of view it is hard to understand why Bohr and Heisenberg objected. But for Bohr and Heisenberg ‘indeterminism’ was related to ‘(in)completeness in the restricted sense’ rather than ‘(in)completeness in the wider sense’. The reason why they insisted on indeterminism was the disturbing influence of the measurement interaction, indeterminism being a consequence of the nonvanishing value of the “quantum of (inter)action”. In agreement with Einstein’s ideas a deterministic description is thought to be possible when there is no interaction with a measuring instrument (like, for instance, in the diffraction experiment of figure 4.1 while the particle is traveling between screens S and B). Indeterminism is not thought to be a property of the free evolution of the object when no measurement is made; it is a characteristic of a quantum mechanical measurement interaction. The notion of ‘(in)completeness in a restricted sense’ is a purely quantum mechanical one, valid only within the domain of application of quantum mechanics, and unrelated to the (im)possibility of subquantum theories (see also chapter 10). The close relationship between quantum indeterminism and ‘completeness in a restricted sense’ in the Copenhagen interpretation can be seen most easily from our
discussion of the “measurement problem” in chapter 3. Thus, the time evolution of the state vector is governed by the Schrödinger equation, and, hence, is deterministic as long as no measurement is performed. However, by the interaction with a measuring instrument the state of a particle may change discontinuously, the state vector indeterministically performing a quantum jump (cf. section 3.2.7) tentatively described by von Neumann’s projection postulate (cf. (1.70)). Of course, staying within the domain of application of quantum mechanics, the possibility that a deterministic subquantum theory might be able to explain quantum mechanical indeterminism (analogously to the way classical mechanics can explain phenomena described by classical statistical mechanics) is observationally irrelevant. However, if quantum mechanics is not the ‘Theory of everything’, then it should be expected that one day experiment may lead outside that theory’s domain of application. Since Planck’s constant would be irrelevant there, it need not be a fundamental ingredient of a subquantum theory (although, as far as the subquantum theory also describes experiments within the domain of application of quantum mechanics, Planck’s constant should be obtainable from it). The conceivability of a deterministic subquantum theory, also describing the influence of a measuring instrument in a deterministic way, may explain Einstein’s reluctance to accept Bohr’s proposal to completely abolish the idea of ‘determinism’ within the atomic domain (cf. section 4.6). Such a proposal, although possibly useful within the domain of quantum mechanics, would seem to be question-begging if applied to subquantum theories.
A universal abolishment of ‘determinism’ of the interaction between microscopic object and measuring instrument would seem to be necessary only on the basis of an assumption of quantum mechanical ‘completeness in a wider sense’ (in which case quantum mechanics would be universally valid, and, hence, the interaction would be of an irreducibly quantum mechanical nature, being essentially indeterministic because Planck’s constant is different from zero). Since neither Bohr nor Heisenberg thought that quantum mechanics is ‘complete in a wider sense’ it would be rather incomprehensible why they nevertheless maintained ‘indeterminism’ as an irreducible feature of quantum mechanics. Only failure to distinguish between the two concepts of ‘(in)completeness’ seems to offer an explanation for this. Had they been duly distinguished, it could have been realized that indeterminism at the level of quantum mechanics need not a priori be incompatible with determinism at a subquantum level. On the other hand, the discussion never did transcend the domain of quantum mechanics in any observationally relevant way. So, the distinction could not be implemented, and quantum indeterminism (based on ‘completeness in a restricted sense’) was not sufficiently distinguished from ‘incompleteness in a wider sense’.
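The contrast described above, between deterministic evolution under the Schrödinger equation and the indeterministic state change tentatively described by the projection postulate, can be sketched for a two-level system. This is a toy model under stated assumptions: the Hamiltonian H = (hbar * omega / 2) * sigma_x, the choice of measured basis, and the function names `evolve` and `measure` are all illustrative and are not taken from the text.

```python
import math
import random


def evolve(state, omega, t):
    """Unitary (deterministic) evolution under H = (hbar * omega / 2) * sigma_x.

    state is a pair of complex amplitudes (a, b) in the measured basis.
    """
    a, b = state
    c = math.cos(omega * t / 2.0)
    s = math.sin(omega * t / 2.0)
    return (c * a - 1j * s * b, -1j * s * a + c * b)


def measure(state, rng):
    """Projective measurement in the (0, 1) basis.

    The outcome is drawn with the Born probabilities; the state is then
    replaced by the corresponding basis vector (projection postulate).
    """
    a, b = state
    p0 = abs(a) ** 2 / (abs(a) ** 2 + abs(b) ** 2)
    if rng.random() < p0:
        return 0, (1.0 + 0j, 0j)
    return 1, (0j, 1.0 + 0j)


if __name__ == "__main__":
    rng = random.Random(0)
    start = (1.0 + 0j, 0j)
    psi = evolve(start, omega=1.0, t=math.pi / 2.0)
    # Same input, same output: the evolution itself is deterministic.
    print(psi == evolve(start, omega=1.0, t=math.pi / 2.0))
    # Measurements on identically prepared states give varying outcomes.
    outcomes = [measure(psi, rng)[0] for _ in range(10)]
    print(outcomes)
```

The random number generator stands in for whatever is responsible for the individual outcome. The sketch is silent on whether that is irreducible chance or a deterministic subquantum process, which is precisely the question the two notions of ‘(in)completeness’ keep apart.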
Realism and instrumentalism/empiricism
It is sometimes recognized that the ‘(in)completeness’ discussion is not at all about ‘(in)determinism’ (often referred to as ‘causality’), but about ‘realism’. For instance, Jammer [219] gives the following citation from a 1950 letter of Einstein’s to Besso7: “Not the question of causality but rather the question of real existence is the central one...[WMdM].” Not the predictability of the time evolution is thought to be relevant, but rather the (im)possibility of ascribing sharp values to position and momentum (or even to one single quantum mechanical observable) at one single instant. In view of the impossibility of excluding deterministic subquantum theories underpinning quantum indeterminism, this suggestion, indeed, seems to yield better perspectives for understanding the incompatibility of the quantum mechanical observables of position and momentum as expressed by the Heisenberg inequality (in which no trace of any time evolution is apparent) than does indeterminism. Once again we should be aware that the notion of ‘realism’ has different meanings (compare section 2.3), which more or less parallel the two different meanings of ‘completeness’. Thus, ‘realism’ may refer to a reality behind the phenomena described by quantum mechanics, its existence being denied if quantum mechanics is supposed to be ‘complete in a wider sense’. Alternatively, ‘realism’ may refer to a ‘realist’ interpretation of the quantum mechanical formalism as opposed to an ‘instrumentalist’ or an ‘empiricist’ one. Like ‘(in)completeness in a restricted sense’ this latter notion of ‘realism’ is restricted to the domain of quantum mechanics. Failure to distinguish the two concepts of ‘(in)completeness’ may continue obscuring the discussion if ‘determinism’ is replaced by ‘realism’. 
Thus, the Heisenberg inequality (1.77) has been employed to endorse both the thesis of ‘incompleteness of quantum mechanics in a wider sense’ (in the sense that quantum mechanics provides only a statistical description of reality (Einstein)), as well as ‘completeness of quantum mechanics in a restricted sense’ (the inequality being interpreted as expressing the way reality is disturbed by the measurement interaction (Bohr and Heisenberg)). Due to the independence of the two concepts of ‘completeness’ this is not contradictory. However, confusion arises when the two concepts are not properly distinguished. Thus, Bohr’s “victory” over Einstein was generally interpreted as doing away with the existence of a metaphysical sub-quantum reality, even though his argumentation derived from a principle that has nothing to do with reality as such, but is limiting only the possibility of getting objective and undisturbed knowledge on reality as far as described by quantum mechanics. In view of Bohr’s indulgence with respect to ‘incompleteness in a wider sense’ such an interpretation can be accounted for only on the basis of a confusion with respect to the two senses of ‘(in)completeness’. 7
“Die Frage der ‘Kausalität’ steht nicht eigentlich im Mittelpunkt sondern die Frage des realen Existierens...”
Staying within the domain of quantum mechanics the Bohr-Einstein discussion is often presented as being concerned with a choice between Bohr’s instrumentalist and Einstein’s realist interpretation of that theory (cf. section 2.1). However, although the difference in interpretation of the quantum mechanical state vector is undeniable, it was not so much the opposition of realist and instrumentalist interpretations that was the subject of the Bohr-Einstein discussion. As noted before, within the Copenhagen interpretation quite a few realist influences can be observed, among which Bohr’s realist attitude towards quantum mechanical observables, discussed in section 2.4.5. As will be seen in the sequel of the present chapter, as well as in chapter 5, the discussion was cast mainly in terms of observables. This makes it difficult to assess the measure of instrumentalism adopted by the Copenhagen interpretation. For the ‘(in)completeness’ discussion this does not seem to be of great importance, though. Taking into account the reason why Bohr insisted on ‘completeness in a restricted sense’ it is evident that the discussion was not about a realist interpretation of observables at all. This reason was the disturbing influence of the measurement interaction. Therefore, the crucial point is not whether quantum mechanical observables correspond to something in reality. The question is whether, in attributing a value to a quantum mechanical observable, it is possible to think about it in an objective way, i.e. without taking into account the influence of the measurement. As far as the quantum mechanical formalism is thought to describe microscopic reality, ‘completeness in a restricted sense’ implies that the quantum mechanical description is not thought to yield a description of an objective reality, but rather of a reality that is in interaction with a measuring instrument (i.e. a contextual reality). 
The discussion between Bohr and Einstein should be seen as primarily referring to the issue of whether quantum mechanics allows an objectivistic interpretation (Einstein) or rather a contextualistic one (Bohr). Since this applies to Bohr’s instrumentalism just as well, the issue hardly seems to be ‘realism’, neither in the sense of the (im)possibility of subquantum theories (although for Bohr the impossibility of neglecting the essential influence of the measuring instrument should certainly be an inevitable feature of any such theory), nor in the sense of the (im)possibility of a realist interpretation of quantum mechanics. The distinction between ‘incompleteness and completeness in a restricted sense’ draws a line between an objectivistic-realist interpretation on one hand, and contextualistic-realist and empiricist interpretations on the other, Bohr’s instrumentalist interpretation being a blend of the latter two. If restricted to realist interpretations, ‘completeness in a restricted sense’ just excludes an objectivistic-realist one. It leaves open the possibility of a contextualistic-realist one (cf. section 2.3).
Preparation and measurement
In the following an important aspect of the discussion between Bohr and Einstein will be the (lack of) distinction between the notions of ‘preparation’ and ‘measurement’. As will be seen in section 4.6.1, within the Copenhagen interpretation this distinction is highly blurred. Possibly due to its instrumentalist interpretation of the wave function or state vector, and a concomitant preoccupation with ‘measurement results’, the Copenhagen interpretation is focused on ‘measurement’. ‘Preparation’ is never treated as a physical process to be distinguished from ‘measurement’ (see also the criticism in section 3.2.6 of the projection postulate as a ‘measurement’ principle). It is evident that ‘completeness in a restricted sense’ is a measurement principle, not telling anything about a preceding preparation that, moreover, may have appeared to be less accessible to assessment in empirical terms. In the Copenhagen view the principle of indeterminacy as represented by the Heisenberg inequality is considered as limiting (simultaneous) measurement. Being “just an instrument for predicting measurement results” the state vector is not thought to refer to anything existing in reality, independently of measurement. Einstein’s realist interpretation of the state vector represents the other extreme, viz, the state vector as an objective description of reality, independent of any measuring instrument or measurement process. According to this view the Heisenberg inequality represents a limitation of our ability of preparing a quantum mechanical state rather than a restriction of our ability of measuring quantum mechanical observables. In this view the principle of indeterminacy is a principle of preparation rather than measurement, being related to ‘incompleteness in a wider sense’ rather than to ‘completeness in a restricted sense’. 
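The preparation reading of the Heisenberg inequality attributed to Einstein above can be made concrete with a toy ensemble. The sketch below assumes that every particle carries sharp values of position and momentum, and that the preparation procedure only fixes their statistical scatter over the ensemble. Independent Gaussian scatter with sigma_x * sigma_p = hbar / 2, units with hbar = 1, and all names are illustrative assumptions. The sketch shows only that sharp individual values are compatible with ensemble dispersions obeying the inequality; it is not a claim that such a picture reproduces all quantum mechanical statistics.

```python
import random
import statistics

HBAR = 1.0  # natural units (assumption made for readability)


def prepare_ensemble(n, sigma_x, seed=0):
    """Return n particles, each with a sharp (x, p) pair.

    Assumption: the preparation scatters x and p independently and
    normally, with sigma_x * sigma_p = HBAR / 2 (minimum-uncertainty case).
    """
    rng = random.Random(seed)
    sigma_p = HBAR / (2.0 * sigma_x)
    return [(rng.gauss(0.0, sigma_x), rng.gauss(0.0, sigma_p)) for _ in range(n)]


def ensemble_spreads(ensemble):
    """Standard deviations of position and momentum over the ensemble."""
    xs = [x for x, _ in ensemble]
    ps = [p for _, p in ensemble]
    return statistics.pstdev(xs), statistics.pstdev(ps)


if __name__ == "__main__":
    ensemble = prepare_ensemble(20000, sigma_x=0.3)
    dx, dp = ensemble_spreads(ensemble)
    # Each member has definite x and p; only the ensemble shows dispersion.
    print(ensemble[0])
    print(dx * dp, HBAR / 2.0)
```

Choosing a smaller `sigma_x` in the preparation forces a larger momentum spread, so in this reading the inequality constrains which ensembles can be prepared, not what can be measured on a single particle.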
As will be seen in chapter 5, in his final attempt to prove the incompleteness of quantum mechanics Einstein took great care in trying to circumvent any measurement interaction. Unfortunately, he was not able to go all the way in avoiding all reference to measurement. By referring to ‘(quantum mechanical) measurement results’ when specifying his ‘elements of physical reality’ (cf. section 5.2.1) he offered Bohr an opportunity to continue the Copenhagen blurring of ‘preparation’ and ‘measurement’, and robbed himself of the possibility of clearly distinguishing between ‘completeness in a restricted sense’ and ‘completeness in a wider sense’ (see also chapter 6). As a consequence of this it did not become evident that both notions might be important, and that we do not have to choose between a principle of indeterminacy for measurement and one for preparation, but that there may exist two different principles, one for preparation and one for measurement (compare de Muynck [129]).
Generalized quantum mechanics
Restricting ourselves to the issue of ‘(in)completeness in the restricted sense’ (as is necessary to properly address Bohr’s ‘completeness’ claim), the appropriate question
is whether within the quantum mechanical formalism mathematical entities can be found that can be interpreted as simultaneous values of position and momentum. With the wisdom of hindsight we can now tell that at the time of the discussion this question was unanswerable because an answer needs an extension of the standard formalism encompassing POVMs (cf. section 1.9). In chapter 7 it will be seen that within the generalized formalism it only is possible to have a notion of a simultaneous (generalized) measurement of incompatible observables like position and momentum. At the same time, however, it will turn out that quantum mechanical measurement results cannot be interpreted in the objectivistic-realist sense advocated by Einstein (see also section 6.4.2). This latter conclusion will be corroborated by a quantum mechanical derivation of the Bell inequality (cf. section 9.3). On the other hand, the generalized formalism will corroborate the essential role attributed by Bohr to the measurement arrangement in giving quantum mechanical observables their values. As will be seen in chapters 7 and 8, the generalized formalism allows to describe (physically realized) measurement procedures that are outside the domain of application of standard quantum mechanics. This makes obsolete the question of ‘completeness in a wider sense’ of standard quantum mechanics, which was involved in the Bohr-Einstein discussion: standard quantum mechanics is certainly ‘incomplete in a wider sense’. Viewed from the vantage point of generalized quantum mechanics it can be only the question of ‘(in)completeness in a restricted sense’ that still could have any interest. Because of its confusing connotation it might, however, be preferable not to refer to it as a question of ‘(in)completeness’, but rather as a problem of ‘objectivity versus contextuality’. The question of whether the quantum mechanical description can be completed (e.g. 
by hidden variables) is quite independent of the question of whether the effect of the measurement interaction can be made arbitrarily small in a quantum mechanical measurement. Indeed, the generalized formalism turns out to encompass two different kinds of indeterminacy relations (cf. section 7.10.3), one being related to Einstein’s ideas on ‘objectivity’, the other one quantifying the disturbing influence of the measurement arrangement as present in Bohr’s contextual interpretation of quantum mechanics. Unfortunately, the discussion between Bohr and Einstein was based on a mathematical formalism (the standard formalism) of quantum mechanics that is too restricted to be able to cover the whole domain of atomic physics. Also for this reason any conclusion reached by them should be considered with some suspicion. It is very unfortunate that the issues of ‘completeness versus incompleteness’, ‘determinism versus indeterminism’, ‘realism versus instrumentalism’, and ‘objectivism versus contextualism’ have been mixed up in the past so as to obscure the importance of the latter issue. In the following sections it will be investigated to what extent this is due to imperfections in the way the core concepts of the Copenhagen interpretation, viz. ‘correspondence’ and ‘complementarity’, have been defined and exploited. In particular it will become clear that the Copenhagen interpretation could have done appreciably better if it had maintained more consistently the empiricist attitude it borrowed from logical positivism/empiricism, and if it had offered better resistance to the lure of realism that is inherent in the classical paradigm.
4.3 The correspondence principle
4.3.1 Weak and strong forms of the correspondence principle
The correspondence principle realizes a correspondence between quantum mechanics and classical mechanics. According to Petersen [203] this principle expresses the fundamental importance of the formal analogy between these two theories. The correspondence principle played an important role in the early days of quantum mechanics, when it was not yet clear what this theory would have to look like to be able to describe the atomic phenomena. According to Petersen the correspondence principle has been the most important instrument in constructing the quantum formalism. In his opinion it is responsible for the attitude sometimes referred to as the “Copenhagen spirit of quantum theory”. The correspondence principle has its origin in the, finally abandoned, idea that a transition between two stationary states of the Bohr model of an atom would correspond to a Fourier component of the classical motion of the electron, and that for large quantum numbers the emitted light would be identical to the light emitted by a classical oscillator of this frequency. In his pioneering article “On the meaning of (classical) kinematical and mechanical relations in quantum mechanics [WMdM]” (footnote 8) Heisenberg [73] generalized this idea to arbitrary observable quantities. It is possible to view the form, based on the Heisenberg picture, in which the quantum mechanical formalism is cast in formulating Ehrenfest’s theorem (cf. section 1.10.1), as an apotheosis of the correspondence principle: the quantum mechanical commutator as the direct generalization of the classical Poisson bracket (cf. section 1.10.2). The correspondence principle as given above has a wider (and deeper) meaning than is usually attributed to it (e.g. Messiah [57], section 1.12), viz. that in the so-called classical limit of large quantum numbers quantum mechanical predictions must agree with the results of classical mechanics (sometimes formulated as the condition that classical mechanics should be recovered in the limit $\hbar \to 0$).
This will be referred to as the ‘weak form of the correspondence principle’. Bohr’s intention, however, was more far-reaching. Thus, in the paper by Born, Heisenberg and Jordan
8 “Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen.”
[220], developing matrix mechanics, it is stated (footnote 9) that “...rather should the theory itself be held to be the exact formulation of Bohr’s correspondence idea. It will be a real task for the further development of the theory to investigate the precise mode of this correspondence, and to describe the transition of the symbolic quantum geometry into the observable classical geometry [WMdM].” Hence, applicability of the correspondence principle was not thought to be limited to the classical limit, but to hold for the whole domain of quantum mechanics. This will be referred to as the ‘strong form of the correspondence principle’. Nowadays the correspondence principle is above all appreciated because of its heuristic value. It has functioned as a guiding principle in finding the correct expressions for quantum mechanical quantities and equations, starting from the classical ones. In that sense the principle has been extremely valuable. In many modern textbooks of quantum mechanics no attention is paid to the correspondence principle, however, apart from a short historical remark, or a limited definition in the abovementioned weak sense as the ‘classical limit’. Since the quantum mechanical formalism has been fully developed and extensively tested by now, not much need is felt for a reflection on the way the formalism has come into being: in the spirit of van Kampen [207], quantum mechanics is there to be applied, not to be philosophized upon.
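The formal analogy between commutator and Poisson bracket mentioned in this subsection can be written out compactly. The following is a sketch in standard textbook notation; the symbols ($\hat A$, $\hat B$, $\hat Q$, $\hat P$, $\hat H$) are assumptions made here and need not coincide with those of sections 1.10.1 and 1.10.2:

```latex
% Poisson bracket <-> commutator
\{A,B\} \;\longleftrightarrow\; \frac{1}{i\hbar}\,[\hat A,\hat B],
\qquad
\{q,p\} = 1 \;\longleftrightarrow\; [\hat Q,\hat P] = i\hbar .

% Classical equation of motion versus Ehrenfest's theorem
% (for an observable without explicit time dependence)
\frac{dA}{dt} = \{A,H\}
\qquad\text{versus}\qquad
\frac{d}{dt}\langle \hat A\rangle = \frac{1}{i\hbar}\,\big\langle [\hat A,\hat H] \big\rangle .
```

Read in the weak sense, these relations only require agreement with classical mechanics in the classical limit. Read in the strong sense, the structural analogy is taken to hold throughout the domain of quantum mechanics.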
4.3.2 Strong form of the correspondence principle
Yet, for assessing the positive and negative influences the correspondence principle has exerted on the development of quantum mechanics, it is necessary to pay some attention to it. Although in the quantum mechanical formalism the correspondence principle does not play a role, this principle has been amply active in informal discourse (and, hence, in the way we are thinking about physical reality). When we measure momentum of a free particle, and find a certain value, then the picture we have is generally derived from classical physics, viz. the image of a particle in uniform rectilinear motion. There even exists a method of measurement based on this picture: the time-of-flight method, in which the time $t$ is measured that it takes a particle of mass $m$ to cover a distance $d$, momentum being derived from these data according to $p = md/t$. The fact that this method corroborates the predictions of the quantum mechanical formalism might even seem to legitimize our correspondence thinking. Bohr has adopted adherence to classical pictures like that of the time-of-flight method as the basis of his philosophy with respect to quantum mechanics (cf. Folse [210]): the correspondence principle. Bohr’s justification of this adoption is of a
9 “.... vielmehr kann die Theorie selbst als exakte Formulierung des Bohrschen Korrespondenzgedankens aufgefasst werden. Es wird eine richtige Aufgabe für die weitere Entwicklung der Theorie sein die Art dieser Korrespondenz genauer zu untersuchen, und der Übergang von der symbolischen Quantengeometrie in die anschaulichen klassischen Geometrie zu beschreiben.”
conceptual nature. Our concepts, i.e. the terms of the language we use in speaking and thinking, are determined by the world of our daily experience, the world of our direct observations. For Bohr this is the world of classical mechanics. According to him we are bound to speak and think in classical terms also when performing an experiment in the domain of atomic physics. A quantum mechanical object always manifests itself in an experiment in a way to be understood and described in classical language, that is, either as a particle or as a wave, since these are the only available classical concepts. The reason for this is that in any experiment within the atomic domain there must be an interaction with a macroscopic measuring instrument serving as an intermediary between microscopic object and human observer, and making the object “visible” to the observer. The macroscopic character of a measuring instrument is enough reason for Bohr to think that an observation statement will have to be formulated in classical terms. This allegedly implies that our knowledge about the microscopic object, obtained by means of this measurement, should also be expressed in classical terms. This conceptual aspect is of primary importance to Bohr’s thinking about quantum mechanics. According to him it does not make any sense to speak or think about a microscopic object independently of the context of a measurement, i.e. without taking into account that the object is interacting with a measuring instrument. This implies that it does not even make sense to talk about the position of an electron outside the context of a position measurement: position of a free particle is undefined! Position is defined only when it is measured. More generally, the meaning of Bohr’s correspondence principle is the following:
Correspondence principle (strong form)
A quantum mechanical observable is exclusively defined within the context of the measurement serving to measure that observable; experimental arrangement and measurement results are to be described in classical terms.
In classical mechanics it is thought to be possible to attribute to a particle a well-defined value of its position at any time, independently of the question of whether position is measured or not. In section 4.5 the physical cause of the difference between classical and quantum mechanics (as seen by Bohr) will be discussed extensively. Important in the correspondence principle is the aspect of Bohr’s thinking trying to preserve within quantum mechanics as much as possible the assets of classical physics. In order to illustrate this, we consider Bohr’s view on momentum measurement of an electron. Such a momentum measurement can, for instance, be carried out by letting the electron e collide with a macroscopic object a, momentum change of the latter being directly observable. By means of the classical law of
momentum conservation,
$$p_e + p_a = p'_e + p'_a, \qquad (4.1)$$
it is possible to obtain information on the initial momentum $p_e$ of the electron by observing initial and final momenta $p_a$ and $p'_a$, respectively, of the macroscopic object. It is essential to this reasoning that a classical momentum can be attributed also to the electron! This is an example of Bohr’s idea that a quantum mechanical measurement can and must be analyzed in classical terms. Of course, Bohr knew very well that the measuring instrument, too, consists of atoms, and, by this token, might be describable by quantum mechanics. However, as far as the instrument is functioning as a measuring instrument for measuring electron momentum, he deems such a quantum mechanical description not suited. According to Bohr, in its role as a measuring instrument it functions as a classical object. If we are interested in the functioning of the apparatus as a quantum mechanical object, then we should perform a different kind of observation on the apparatus, namely observations that are sensitive to the atomic structure of the apparatus. It is Bohr’s conviction that, if we perform such an experiment, we must interfere with the functioning of the apparatus in such a way that it will no longer function as a momentum meter. According to Bohr it is precisely the possibility of a classical description that makes the apparatus a measuring instrument rather than a quantum mechanical object. There are two aspects requiring our attention, both related to the role classical mechanics according to the correspondence principle plays in the interpretation of quantum mechanics, viz. i) the issue of ‘realism versus empiricism’ (section 4.3.3), ii) the question of the ‘classical description of measurement’ (section 4.3.4). In particular the second aspect gives rise to a fundamental criticism of the correspondence principle.
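The two classical pictures used in this subsection, the time-of-flight method and the collision analysed by means of momentum conservation, can be made concrete in a few lines. The following is only a minimal sketch under stated assumptions: motion is one-dimensional and nonrelativistic, the function and variable names are hypothetical, and the numbers are illustrative rather than taken from any experiment.

```python
ELECTRON_MASS_KG = 9.109e-31  # approximate electron rest mass


def time_of_flight_momentum(mass, distance, time):
    """Classical estimate p = m * d / t for uniform rectilinear motion."""
    if time <= 0:
        raise ValueError("time of flight must be positive")
    return mass * distance / time


def initial_electron_momentum(p_a_initial, p_a_final, p_e_final):
    """Solve p_e + p_a = p_e' + p_a' for the initial electron momentum p_e."""
    return p_e_final + p_a_final - p_a_initial


# Time of flight: an electron covering 1 m in 1 microsecond (SI units).
p_tof = time_of_flight_momentum(ELECTRON_MASS_KG, distance=1.0, time=1.0e-6)

# Collision: identical apparatus readings (0.0 before, 2.0 after, arbitrary
# units), combined with two different assumptions about the electron's
# final momentum.
p_e_if_at_rest = initial_electron_momentum(0.0, 2.0, p_e_final=0.0)
p_e_if_recoiling = initial_electron_momentum(0.0, 2.0, p_e_final=-1.0)
```

With the same apparatus readings the two assumptions yield different initial electron momenta, so the observed momenta of the macroscopic object fix the electron momentum only once a classical final momentum has also been attributed to the electron.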
4.3.3 Realism versus empiricism, and correspondence
As remarked in section 4.1 Bohr’s interpretation of quantum mechanics was appreciated in logical positivist/empiricist quarters as an application of their idea of basing physical theories on the hard data obtained by means of direct observation. Indeed, the correspondence principle can be seen as a step in this empiricist direction, since it has the intention to define microscopic (unobservable) quantities in terms of macroscopic (directly observable) ones, the latter corresponding to classical concepts. Bohr’s emphasis on the conceptual nature of the statements made by quantum mechanics, and his reluctance to draw ontic conclusions (thus evading the danger of metaphysics) were highly valued in logical positivist/empiricist circles. Yet, it is doubtful whether the correspondence principle can be interpreted in this way as an endorsement of logical positivism/empiricism. As noted by Folse [210],
Bohr’s attitude with respect to quantum mechanical observables is presumably not an empiricist (as defined in section 2.2) but a realist one, be it of a contextualistic, as opposed to an objectivistic nature (cf. section 2.4). A microscopic object is not supposed to have its property objectively, but only in the context of the measurement arrangement set up to measure the observable. Within this context, however, an observable can be treated as a classical quantity. This view is consistent with Bohr’s application of momentum conservation as in (4.1), in which to the electron a classical momentum is attributed that is interpreted, in the usual classical realist sense, as a property of the object (cf. section 2.3). This is hardly reconcilable with the logical positivist/empiricist reduction of microscopic concepts to directly observable data. On the contrary, it fits in much more closely into a contextualistic-realist view in which, in the context of a measurement, certain classical laws are thought to be also valid at the microscopic level, and reality is attributed in the usual classical sense to the classical concepts defined within this context. The question of Bohr’s empiricism is not easy to be answered, mainly because it was not clearly raised by him. Some of his statements seem to point into the direction of empiricism (cf. section 5.3, where his reference to “possible types of predictions” might be taken in an empiricist sense). However, even if Bohr would have tried to maintain a consistently positivist/empiricist position, not interpreting the value of a quantum mechanical observable as a property of the microscopic object but merely as a label of a directly observable measurement phenomenon, then he should be reproached for the obscurity of his way of expressing himself in such a way that the difference with a contextualistic-realist interpretation of quantum mechanical observables is not noticeable. 
In this respect Heisenberg’s position is much more clear. In agreement with his own observation, referred to in section 4.1, Heisenberg cannot be reckoned a positivist. Yet, it is well known that, in developing matrix mechanics, Heisenberg was inspired by empiricism. With Heisenberg this may be seen, however, above all as a pragmatic approach, to be justified by its results, rather than as a consequence of a consistent philosophy. He seems to be ready to exchange his empiricism for a realist interpretation of quantum mechanics if this would serve some purpose (like, for instance, “explaining” the impossibility of a simultaneous measurement of incompatible observables, cf. section 1.9). It seems that Heisenberg has been equally inspired by the classical realist thinking of Bohr’s correspondence principle, attributing as a (classical) property to the object the value of the observable that is actually measured. It is this latter aspect that is essential to our understanding of the correspondence principle.
4.3.4 Critique of the correspondence principle
The correspondence principle has as its origin the idea that, at least within the context of a measurement, classical mechanics is the language to analyze the process, be it that this analysis cannot be complete. This may be the reason that Bohr did not distinguish between the quantum mechanical state vector and quantum mechanical observables as two different aspects of reality (as is done in an empiricist interpretation, cf. section 2.2), but always referred to classical position and momentum for characterizing the state of a particle, while adhering to an instrumentalist interpretation of the wave function. This lack of distinction between state and observables may also be at the basis of the Copenhagen confusion between ‘preparation’ and ‘measurement’ referred to in section 4.2.3. From the way the relation between object and measuring instrument was dealt with in the correspondence principle by Bohr and Heisenberg it is rather evident that the Copenhagen interpretation of quantum mechanical observables must be qualified as realist rather than as empiricist. Thus, instead of defining momentum of the electron in terms of momentum of the macroscopic measuring apparatus, in their view electron momentum $p_e$ obtains a meaning next to apparatus momentum $p_a$. Indeed, the appearance in (4.1) of the final electron momentum $p'_e$ prevents a unique definition of $p_e$ in terms of $p_a$. Moreover, even if the relation between $p_e$ and $p_a$ had been unique, we would not have here a definition of a microscopic quantity in terms of macroscopic quantities, but a relation between two quantities, each of which is defined within the formalism of classical mechanics. Logical positivism/empiricism had adopted the goal of basing science exclusively on the “hard” facts of our direct experience. Of course, the question of what is a “hard” fact is of crucial importance here.
It was soon recognized that it is virtually impossible to describe an observation without making use of theoretical terms that are not reducible to observation statements. A solution was sought (e.g. Hempel [221]) in developing a certain hierarchy between theories, and by requiring that for a description of observations testing a given theory only theories should be used that are lower in the hierarchy (so-called pre-theories). In order to prevent circularity, the theory under test itself should not be used in the description of these observations. It was hoped that it would be possible to express all theoretical statements of the theory under test using terms of pre-theories that have been tested independently of that theory. The idea was that the hierarchy would halt at a level of observation not needing any theory for its description, thus making it possible to finally reduce all theoretical terms to observation statements. The correspondence principle, as proposed by Bohr, seemed to fit nicely into this logical positivist/empiricist philosophy: classical mechanics as a pre-theory of quantum mechanics. It seemed possible to provide in this way quantum mechanics with an empirical basis (see e.g. Ludwig [222]; also section 2.2.1).
However, the objection, raised above against the empiricist content of Bohr’s correspondence principle, is a sign of the problems met in implementing this view. As a matter of fact, the problems with the correspondence principle reflect the methodological problems of the logical positivist/empiricist program already observed in section 2.2.1. One of the main problems causing the final breakdown of the program of logical positivism/empiricism is the problem of theory-dependence of observation statements [223]. It turns out that the idea of a hierarchical structure of theories cannot be maintained. As demonstrated, for instance, in a convincing way by Reichenbach ([224], section I.3), even the most direct measurement (e.g. length measurement by means of a measuring rod) is “theory-laden” in the sense that the measurement process cannot be described without using the theory that is actually being tested. If applied to quantum mechanics this means that a quantum mechanical measurement cannot be described in purely classical terms. A measurement within the atomic domain must necessarily be described by quantum mechanics itself. The reasoning that a quantum mechanical measuring instrument must be macroscopic so that the result of a measurement can be recorded in an indelible way (Belinfante [225]), not to be annihilated by quantum fluctuations, and for this reason must be within the domain of classical mechanics, refers only to part of the truth, and, perhaps, not even to the most important part. Perhaps even more important is that a quantum mechanical measuring instrument be designed in such a way that it is sensitive to the microscopic information contained in the object, and is able to transfer this information from the microscopic level to the macroscopic level of direct observation. This implies that such a measuring instrument should indeed have a macroscopic component, represented in figure 2.1 by a pointer on a measurement scale.
The macroscopic component of the measuring instrument should, of course, be describable by classical mechanics. However, the measuring instrument should also have a microscopic component sensitive to the microscopic information! It does not seem to be very plausible that, without such a microscopic component, a macroscopic object (in which, due to its large mass, quantum effects do not play an observable role) would be sensitive to the microscopic information stored in an atomic object. All quantum mechanical measuring instruments used in practice (a number of which will be discussed in chapters 7 and 8) do have such a microscopic component, and its interaction with the object is always described by quantum mechanics (compare the pre-measurement, discussed in chapter 3). It is possible to apply a classical description only after amplification to the macroscopic level. The problem with many of Bohr’s examples, used for illustrating his ideas with respect to quantum mechanics (like the momentum measurement based on (4.1)), is that they are ‘thought experiments’, not experiments that are actually performed. Such ‘thought experiments’ may be useful at a certain stage of development of a theory. However, we should always be aware of the possibility that our thoughts
may go into a different direction than does reality. A correspondence principle in the strong form emphasizing the classical character of the description of a quantum mechanical measurement process precisely seems to do this. Only by studying “real” measurements it is possible to compare the results of our reasoning with the results of measurements that can actually be performed. In doing so it is evident that the correspondence principle cannot be maintained in the strong form: the information transfer from microscopic object to measuring instrument is a microscopic process, to be described by quantum mechanics. In quantum mechanical textbooks the correspondence principle is often presented only in the weak form of the validity of the classical limit. A quantum mechanical description of quantum mechanical measurement processes, as already considered by von Neumann (cf. section 3.2.2), can nowadays be found in many textbooks. In characterizing the Copenhagen interpretation the strong form of the correspondence principle is seldom explicitly mentioned, the issues of ‘completeness’ and ‘complementarity’ being considered as the main characteristics of this interpretation. The concomitant obscurity of the strong correspondence principle can possibly explain how the conflicting views of Bohr and von Neumann with respect to measurement could be thought to go together within one single interpretation. However, it would be impossible to understand Bohr’s specific position without considering the correspondence principle in its strong form. From the considerations of the present section and the following one it is seen that Bohr may have relied more on his correspondence principle than is strictly possible. This principle, being based on a philosophy that is obsolete by now, is less fundamental than was thought by him.
4.4 Complementarity
Complementarity in a wider and in a restricted sense
A third cornerstone of the Copenhagen interpretation is the idea of ‘complementarity’. Also here we must distinguish between a wider and a restricted version, the latter being restricted to the domain of application of quantum mechanics, the former extending to the whole domain of human knowledge. In his essay “The roots of complementarity” Holton [226] argues that an important source of inspiration for Bohr has been the philosophy of William James’s “The Principles of Psychology” [227]. In this philosophy the distinction between the (observing) subject and the (observed) object plays an important role. James’s investigations on human consciousness led him to the conclusion that consciousness is not independent of the object onto which it is directed. In a single person consciousness can have qualitatively different stationary states between which it is vacillating, and which depend on the object. In certain cases the contents of consciousness (i.e. the objects) in the different stationary states may even be complementary (expression used by James)
in the sense that the subject may lose all consciousness about the object in an earlier stationary state after transiting to a new one. A well-known example is a speaker, starting to stammer as soon as he becomes conscious of himself and his relation to his audience, consciousness of which threatens to remove the contents of his talk from his mind 10 . Returning to his former state of consciousness this content will return, however, accompanied by a simultaneous loss of the speaker’s complementary awareness of the audience. It is not necessary to enter here into the debate whether James’s influence on Bohr has really been that large (see also Meyer-Abich [228]). It is certain, however, that for Bohr ‘complementarity’ is a fundamental property of human cognitive faculties, and that for him its significance exceeded by far the domain of quantum mechanics, or even of physics. Evidence for this judgment can be found in the essays [72, 200] in which Bohr has tried to generalize the epistemological problems he met in quantum mechanics to, for instance, ‘human knowledge’ and ‘the problem of life’. In doing so his guiding principle was the restricted applicability of the concepts by which our experiences are described, a restriction caused by the impossibility of drawing a sharp distinction between subject and object. This was the key issue in his notion of ‘complementarity’. If applied to quantum mechanics, this means that in each observation of an atomic phenomenon (Bohr employed the term quantum phenomenon) an interaction between object and measuring instrument takes place, causing a similar complementarity as the one discussed above. This is the so-called quantum postulate. This postulate has its physical basis in the unavoidable interaction between object and measuring instrument that is at the basis of the idea of ‘completeness in a restricted sense’ (cf. section 4.2.2). 
Each quantum phenomenon possesses an element of ‘wholeness’, in the sense that the experimental conditions determined by the measurement arrangement are an essential part of the quantum phenomenon: object and measuring instrument constitute an indivisible whole. Analogously to the lack of distinction between subject and object in an act of consciousness it is thought to be impossible to draw a sharp distinction between measuring instrument and (microscopic) object. This also is responsible for the unanalyzability of the measurement process referred to in section 4.2.2. The role played in James’s theory by the different stationary states of consciousness is in quantum mechanics replaced by different measurement contexts. Measurements of position and momentum require measurement arrangements that are mutually exclusive. They are comparable with different states of consciousness, causing the notions of position and momentum to become complementary in the sense that they are not simultaneously applicable. This actually holds true for any incompatible pair of observables.
As will be seen in sections 4.5 and 4.6, the important point is that position and momentum cannot both be sharply defined in one and the same measurement context. Given the mutual exclusiveness of measurement arrangements defining incompatible observables, ‘complementarity’ is a direct consequence of ‘(strong) correspondence’. In order to avoid misunderstandings it should be noted here that the similarity observed between human consciousness and quantum mechanical measurement is not to be interpreted as if Bohr would attribute any active role to consciousness in a quantum mechanical measurement process, in the sense of influencing this process in any way (see also section 4.6.7). After he became aware of the danger that his reference to such concepts as ‘observed phenomenon’ and ‘the subjective character of all experience’ could be interpreted as if quantum mechanics would yield an account only of the subjective experience of an individual observer, Bohr always emphasized the intersubjective character of the quantum mechanical description (cf. Folse [210], chapter 7). Hence, an ‘observation’ should not be seen as a subjective experience of an individual observer, but as a physical interaction between a microscopic object and a (macroscopic) measuring instrument, by which the measuring instrument is brought into a final state representing the measurement result. Subsequently, the corresponding macroscopic pointer position can be observed by any (human) observer without any appreciable interaction. For this reason it has the same objective meaning as is generally accepted in classical mechanics. As we have seen in section 4.3, this is essential to Bohr’s correspondence principle. In section 4.6.7 interpretations of quantum mechanics will be discussed, which, in contrast to Bohr’s interpretation, have a subjectivistic character.
10 A comparable theme is developed in an (unfinished) novel by the Danish writer Poul Martin Møller, “The adventures of a Danish student”, highly appreciated by Bohr ([72], p. 13).
Because different views exist even on ‘complementarity in a restricted sense’ it is hardly possible to give an explicit definition of this concept covering all different variants, like those to be discussed in sections 4.5 and 4.6. It seems, however, that a hard core exists on which all can agree:
Complementarity in a restricted sense: Incompatible quantum mechanical observables correspond to different measurement contexts, defining different aspects of reality which cannot be united in a single classical picture.
The discussion in the following will be restricted to the complementarity principle in this restricted sense, not transcending the domain of application of quantum mechanics, viewed as a physical theory describing measurements within the atomic domain. ‘Complementarity’ is often associated with ‘completeness in a restricted sense’. The measurement arrangements of position and momentum are not only thought to be mutually exclusive; the different classical pictures are also thought to supplement each other so as to yield maximal information. Together, position and momentum would yield the most complete description allowed by quantum mechanics. A problem with such an association is that it is rather unclear what is meant by ‘maximal information’. Why could an extension to triples or quadruples of incompatible observables not yield even more information? In the present chapter this point will not be discussed any further. However, we shall return to it in chapter 7, where it will be demonstrated that the generalized formalism of section 1.9 is capable of clarifying the relation between ‘complementarity’ and ‘completeness’.
4.5 ‘Thought experiments’
In order to demonstrate how Bohr’s correspondence principle (cf. section 4.3) can lead to the concept of complementarity, a number of ‘thought experiments’ will now be reviewed. By studying these ‘thought experiments’ the notion of ‘indeterminacy’ has been developed within quantum mechanics. Apart from the expression ‘indeterminacy’ also the expressions ‘inaccuracy’, ‘uncertainty’, ‘undefinedness’ and ‘latitude’ can be found in the literature. In discussions of the ‘thought experiments’ these are generally considered synonyms. We shall see later that this is not true, and that two different meanings can be distinguished among the expressions in this set. In this book the expression ‘indeterminacy’ is used if we do not want to make a distinction between these two meanings. However, finally it will be particularly important to make a clear distinction between these notions.
4.5.1 Diffraction of particles through a slit
The experiment discussed in section 4.2.2, in which a parallel beam of particles is diffracted through a slit (cf. figure 4.1), is the simplest example in which a certain complementarity between position and momentum can be observed. From the analogy with optical diffraction it is known that the width of the diffraction pattern at screen B is determined by an angle $\alpha$ (cf. figure 4.2) given by
$$\sin\alpha \approx \frac{\lambda}{a},$$
in which $\lambda$ is the wavelength and $a$ is the slit width. Using the de Broglie relation $p = h/\lambda$ between momentum and wavelength this yields $\sin\alpha \approx h/(pa)$, implying the $x$-component of momentum to have an indeterminacy $\delta p_x \gtrsim p\sin\alpha \approx h/a$. On the other hand, slit width $a$ defines an indeterminacy $\delta x$ of the position coordinate $x$ at which the particle passes screen S. Thus, $\delta x \approx a$. Hence, we get
$$\delta x\,\delta p_x \gtrsim h. \qquad (4.2)$$
From this inequality it follows that there is a complementarity between the indeterminacies $\delta x$ and $\delta p_x$: the narrower the slit (i.e. the better $x$ is determined) the less determined is $p_x$. Conversely, by increasing $a$ diffraction can be suppressed, and $\delta p_x$ made smaller.
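The single-slit estimate can be checked numerically. The following is a minimal sketch, assuming the order-of-magnitude relations $\sin\alpha \approx \lambda/a$, $\delta x \approx a$ and $\delta p_x \approx p\sin\alpha$ with $p = h/\lambda$; the function name `single_slit_indeterminacies` and the chosen wavelength and slit widths are hypothetical illustration values, not taken from the text.

```python
import math

H = 6.62607015e-34  # Planck constant in J s (exact SI value)


def single_slit_indeterminacies(wavelength, slit_width):
    """Return (delta_x, delta_p_x) for the single-slit estimate.

    delta_x is taken equal to the slit width a; delta_p_x is p*sin(alpha)
    with sin(alpha) = wavelength / a and p = h / wavelength (de Broglie).
    """
    if slit_width < wavelength:
        # sin(alpha) = wavelength / a would exceed 1 for a narrower slit
        raise ValueError("the estimate needs slit_width >= wavelength")
    p = H / wavelength
    sin_alpha = wavelength / slit_width
    return slit_width, p * sin_alpha


wavelength = 5.0e-11  # hypothetical de Broglie wavelength in m
for a in (1e-9, 1e-8, 1e-7):  # hypothetical slit widths in m
    dx, dp = single_slit_indeterminacies(wavelength, a)
    print(f"a = {a:.0e} m: delta_p_x = {dp:.3e} kg m/s, product/h = {dx * dp / H:.3f}")
```

Under these assumptions the product does not depend on the slit width: narrowing the slit by a factor of ten raises the momentum estimate by the same factor.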
4.5.2 The double-slit experiment
The well-known double-slit experiment, which has now been realized experimentally in several ways (see also section 8.2), was only a ‘thought experiment’ at the time Bohr developed the idea of ‘complementarity’. In contrast to figure 4.2 screen S now contains two slits, giving rise to an interference pattern of light and dark lines on screen B (cf. figure 4.3) if screen S is irradiated from the left by light with a wavelength of the same order of magnitude as, or larger than, the distance between the slits. The analogy between particles (for instance, electrons) and light suggested an analogous interference pattern if particles were used instead of light. This has been experimentally corroborated (e.g. Möllenstedt and Jönsson [229]). An electron impinging from the left on screen S can pass either through slit 1 or 2. Each transmitted particle hits screen B at a certain position $x$. The interference pattern gradually develops by the impacts of a large number of particles (cf. figure 4.4), evidently preferring certain values of $x$ over others. The intensity of the beam at position $x$ is determined by the probability $p(x)$ that an electron hits screen B in $x$. This probability is calculated from the wave function
$$\psi(x) = \psi_1(x) + \psi_2(x), \qquad (4.3)$$
consisting of the two contributions $\psi_1$ and $\psi_2$ stemming from slits 1 and 2. Since $p(x) = |\psi(x)|^2$ we obtain
$$p(x) = p_1(x) + p_2(x) + p_{12}(x), \qquad (4.4)$$
in which $p_1(x) = |\psi_1(x)|^2$ and $p_2(x) = |\psi_2(x)|^2$, and $p_{12}(x) = \psi_1^*(x)\psi_2(x) + \psi_1(x)\psi_2^*(x)$ represents the so-called interference or “cross” term (cf. (1.4)). The result of the particle experiment is in complete agreement with analogous experiments using light, $p(x)$ replacing the light intensity $I(x)$. It is of primary importance that, because of the “cross” term in (4.4), probability $p(x)$ differs from the sum of the probabilities $p_1(x)$ and $p_2(x)$. This is the reason that the pattern of figure 4.3 is completely different from the one of figure 4.2, and, hence, unrelated to the pattern obtained by summing contributions of each slit separately while the other slit is kept closed. This is often interpreted in the sense that the interference pattern in the setup of figure 4.3 cannot be obtained from contributions that have passed screen S either through slit 1 or slit 2. Sometimes it is even concluded from this that an electron should have passed through both slits. This latter conclusion was certainly not the one drawn by Bohr. Such a conclusion might be based on a realist interpretation of the wave function in which
the quantum mechanical problem is considered to be not conceptually different from the classical problem of wave propagation11. As noted above, Bohr’s interpretation of the wave function was rather an instrumentalist one. Then wave function (4.3) is thought to be useful only as a tool for calculating measurement results of physical quantities that, in agreement with the correspondence principle, are actually measured. What is determined experimentally in the measurement arrangement of figure 4.3 is the interference pattern. There is no experimental evidence about how passage through the slits takes place. The question through which of the slits the particle has passed can be answered only by a measurement of $x$ carried out when the particle is near screen S. As long as we do not perform such a measurement the position of the particle is undetermined, and, according to the correspondence principle, even undefined. For Bohr it is pointless to make any statement on the position of the particle, and, hence, on the slit the particle has passed through, under conditions in which this question cannot be answered experimentally.
11 In section 10.6.2 this issue will also be considered from a subquantum mechanical point of view.
For the double-slit experiment an inequality can be derived similar to (4.2). As a measure of the indeterminacy $\delta x$ we now take the distance at screen B between the central maximum of the interference pattern and the first intensity minimum. This latter point is determined by the requirement that its distances from the two slits differ by $\lambda/2$. It can easily be verified that for $d \ll l$, with $d$ the distance between the slits and $l$ the distance between screens S and B,
$$\delta x = \frac{\lambda l}{2d}. \qquad (4.5)$$
The indeterminacy $\delta p_x$ of the $x$-component of momentum stems from our lack of knowledge about which slit the particle has passed through. Assuming that a free electron has a uniform and rectilinear motion (classical assumption!), and that $p = |\mathbf{p}|$ does not change when the particle passes a slit, it can be seen on geometric grounds (cf. figure 4.5) that the $x$-component of electron momentum equals $p(x - d/2)/l$ if the particle went through slit 1, and $p(x + d/2)/l$ for slit 2. This implies that
$$\delta p_x \gtrsim \frac{pd}{l}. \qquad (4.6)$$
Combining this with (4.5), and using $p = h/\lambda$, we obtain an inequality similar to (4.2):
$$\delta x\,\delta p_x \gtrsim \frac{h}{2}. \qquad (4.7)$$
The fact that this inequality is encountered in rather different experimental situations may raise a suspicion that it has a universal validity, expressing a certain impossibility of simultaneously determining $x$ and $p_x$ (the $\gamma$-microscope, to be discussed in the next section, provides a third example). Acceptance of the universal
validity of this inequality is at the basis of the complementarity principle. The discovery by Heisenberg that an inequality can be derived from the mathematical formalism of quantum mechanics, the Heisenberg inequality (1.77), which has great similarity to inequalities (4.2) and (4.7), will doubtless have contributed appreciably to this acceptance. Critics of the Copenhagen interpretation doubt the universal validity of this inequality on the basis that it is not certain that no other measurements are possible, allowing a more precise determination of $x$ and $p_x$ than is consistent with (4.2) and (4.7). Thus, Einstein proposed to allow screen S to move freely in the $x$-direction. It is then possible to apply the law of momentum conservation (4.1) to the combined system of electron and screen S (note that one of the indices in (4.1) now refers to screen S). An electron hitting screen B at $x = 0$ should then have passed through slit 1 if screen S is finally moving upwards, or through slit 2 if S is moving downwards. This would decrease the indeterminacy of $p_x$. Bohr’s [202] reaction to Einstein’s above proposal is characteristic of his reliance on the (strong) correspondence principle. He observes that by changing the measurement arrangement also our possibility of defining specific quantities of the electron has changed in a fundamental way. Different measurement arrangements imply different physical quantities being well-defined. Because screen S is not held fixed, the experimental arrangement no longer measures the same observable as measured previously. Unlike the fixed screen, according to Bohr screen S now plays a dynamical role in the interaction process, and for this reason should be considered as an ordinary quantum mechanical object, to be described by quantum mechanics. This implies that the initial state of screen S should satisfy the Heisenberg inequality derived in section 1.7.1, in particular,
$$\Delta X_S\,\Delta P_S \geq \frac{\hbar}{2}. \qquad (4.8)$$
Here $X_S$ is the $x$-coordinate of the center of mass of screen S, and $P_S$ is the corresponding momentum.
The fact that screen S satisfies the Heisenberg inequality (4.8) has important consequences for the interference pattern. This pattern is composed of the impacts on screen B by a large number of particles (cf. figure 4.4; also Tomonaga et al. [230]). Different particles now find screen S at different initial positions $X_S$. Hence, the interference pattern that is finally obtained is a superposition of different interference patterns corresponding to the different initial positions of screen S. If the statistical spreading $\Delta X_S$ is larger than the distance between neighboring light and dark lines, then the superposition results in a uniform darkening on screen B (note that in figure 4.4 light and dark have been interchanged), i.e. the interference pattern is wiped out. Bohr demonstrates that this is precisely what happens if the $x$-component of electron momentum is defined so accurately that it is possible to determine through which slit the electron has passed. The reasoning is very simple. In order that a determination of ‘which slit’ be possible, we must have (cf. (4.6)) $\delta p_x < pd/l$, with $d$ the distance between the slits and $l$ the distance between screens S and B. On the other hand, since the $x$-component of electron momentum is involved in relation (4.1) it cannot be defined better than the indeterminacy with which the momentum of screen S was initially defined. Hence,
$$\delta p_x \geq \Delta P_S, \qquad (4.9)$$
implying the requirement $\Delta P_S < pd/l$. From (4.8) it then follows that $\Delta X_S > \hbar l/(2pd) = \lambda l/(4\pi d)$, which is comparable, up to a numerical factor, to the distance $\lambda l/2d$ between neighboring light and dark lines. This implies that, under the experimental condition allowing a determination of the slit the electron has passed through, the interference pattern is wiped out. This reasoning appears to be analogous to the derivation of the Heisenberg inequality (4.7) from (4.6), using $\delta x = \lambda l/2d$. Yet, there remain a few questions here. Thus, one could ask why in the double-slit experiment this latter measure of indeterminacy of $x$ is chosen, and not some measure based on the slit width, as was done in the single-slit experiment (cf. section 4.5.1).
There also does not seem to be any objection against a replacement of screen B by an array of particle detectors, each detector having a linear extension smaller than $\lambda l/2d$ (for instance, equaling the extension of the individual spots exhibited in figure 4.4), thus defining $x$ more precisely, and, hence, leading to a violation of (4.7). These questions are hard to answer on the basis of the complementarity principle as discussed up to now. For this reason the answer, given above, has not been completely convincing for everybody, and criticisms have been advanced. Some of these will be discussed in section 4.7. It will be seen there how the above-mentioned choice can be justified in a certain sense. At the same time, however, the rather pragmatic nature of the Copenhagen approach in comparing theory and experiment is displayed, as well as the appreciable confusion stemming from such a pragmatism. As will be demonstrated in chapter 7 this
confusion can be resolved by a more formal treatment of the problem of quantum mechanical measurement (also de Muynck [129]). Anticipating a more detailed discussion in section 4.6, a second reasoning is now given to arrive at a Heisenberg inequality for the double-slit experiment. This reasoning is based on the quantum mechanical fluctuations of screen S, and the concomitant inequality (4.8). Like momentum measurement, position measurement, too, is impeded by these fluctuations. Assuming classical propagation as a free particle while traversing the distance between the screens, position $x$ at which the particle is found at screen B is an indicator of its position while interacting with screen S. This quantity’s indeterminacy $\delta x$ cannot be smaller than the one the particle’s position was defined with at screen S. Because of the interaction between the particle and screen S the corresponding indeterminacy will have the indeterminacy $\Delta X_S$ of the initial position of screen S as a lower bound. Hence, analogously to (4.9), we also have
$$\delta x \geq \Delta X_S. \qquad (4.10)$$
From (4.8), (4.9) and (4.10) we once more obtain an inequality for the indeterminacies of the electron:
$$\delta x\,\delta p_x \geq \frac{\hbar}{2}, \qquad (4.11)$$
strongly resembling the Heisenberg inequality (1.77). The terms in this relation will be seen to have a completely different meaning from the terms of (1.77), as well as from those of (4.7), however. Neglect of these differences has been a main source of confusion (compare sections 4.7.4 and 7.10.3).
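The wiping out of the interference pattern by the spread in initial positions of screen S can be illustrated numerically. The following is a minimal sketch under stated assumptions, none of which come from the text: a small-angle two-slit model with equal amplitudes from both slits (single-slit envelope ignored), a pattern that is rigidly displaced by the displacement of screen S, and a Gaussian distribution of these displacements with standard deviation `sigma`. The names `two_slit_probability`, `averaged_probability` and `visibility`, and the values of the wavelength, the slit distance `d` and the screen distance `l`, are hypothetical.

```python
import cmath
import math


def two_slit_probability(x, wavelength, d, l):
    """Return (p, p1, p2, p12) at position x on screen B (small-angle model)."""
    phase = 2.0 * math.pi * d * x / (wavelength * l)  # phase difference of the two paths
    psi1 = cmath.exp(0.5j * phase) / math.sqrt(2.0)   # contribution of slit 1
    psi2 = cmath.exp(-0.5j * phase) / math.sqrt(2.0)  # contribution of slit 2
    p1 = abs(psi1) ** 2
    p2 = abs(psi2) ** 2
    p12 = 2.0 * (psi1.conjugate() * psi2).real        # interference ("cross") term
    p = abs(psi1 + psi2) ** 2
    return p, p1, p2, p12


def averaged_probability(x, wavelength, d, l, sigma, n=2001):
    """Average the pattern over Gaussian displacements of screen S (std. dev. sigma)."""
    if sigma == 0.0:
        return two_slit_probability(x, wavelength, d, l)[0]
    total = 0.0
    norm = 0.0
    for i in range(n):
        shift = -6.0 * sigma + 12.0 * sigma * i / (n - 1)  # grid over +/- 6 sigma
        weight = math.exp(-0.5 * (shift / sigma) ** 2)
        total += weight * two_slit_probability(x - shift, wavelength, d, l)[0]
        norm += weight
    return total / norm


def visibility(wavelength, d, l, sigma):
    """Fringe contrast (p_max - p_min) / (p_max + p_min) of the averaged pattern."""
    fringe = wavelength * l / (2.0 * d)  # central maximum to first minimum
    p_max = averaged_probability(0.0, wavelength, d, l, sigma)
    p_min = averaged_probability(fringe, wavelength, d, l, sigma)
    return (p_max - p_min) / (p_max + p_min)


wavelength, d, l = 5.0e-11, 1.0e-6, 1.0  # hypothetical values in m
fringe = wavelength * l / (2.0 * d)
for ratio in (0.0, 0.1, 0.5, 1.0):
    v = visibility(wavelength, d, l, ratio * fringe)
    print(f"sigma = {ratio:.1f} x fringe distance: visibility = {v:.4f}")
```

In this model the contrast falls off as $\exp[-(\pi\sigma/\delta x)^2/2]$, with $\delta x = \lambda l/2d$ the distance from the central maximum to the first minimum, so the pattern is effectively gone once the spread reaches that distance.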
4.5.3 The $\gamma$-microscope
As a third example we consider Heisenberg’s microscope [118]. A microscope is an instrument for measuring the position of a particle. To this end the particle is irradiated with light as indicated in figure 4.6. The indeterminacy $\delta x$ of particle position, i.e. the inaccuracy with which the position of a particle can be determined using a microscope, is given by the well-known classical optics expression for the resolving power (e.g. Hecht and Zajac [231]),
$$\delta x \gtrsim \frac{\lambda}{2\sin\varepsilon}, \qquad (4.12)$$
in which $\varepsilon$ is the aperture of the microscope. From (4.12) it is clear that the indeterminacy $\delta x$ can be decreased by decreasing the wavelength $\lambda$. Hence, the $\gamma$-microscope is a more accurate position meter than an optical one. The example of the microscope is used by Heisenberg for illustrating the concept of complementarity between position and momentum of a particle. The idea is that decreasing position indeterminacy by decreasing the wavelength of light will cause momentum indeterminacy to increase. The argumentation for this is based on conservation of momentum in a collision of a particle and a quantum of light (also considered as a particle (photon)). Assuming, as was done for a particle, also for the quantum of light a relation $p = h/\lambda$ between wavelength and momentum, it can be seen from the geometry of figure 4.6 that the incoming photon is scattered into the aperture of the microscope only if the final $x$-component of photon momentum, $p_x^{\mathrm{ph}}$, satisfies
$$-\frac{h}{\lambda}\sin\varepsilon \leq p_x^{\mathrm{ph}} \leq \frac{h}{\lambda}\sin\varepsilon.$$
Since we do not know which value $p_x^{\mathrm{ph}}$ has within this interval, this implies an indeterminacy $\delta p_x^{\mathrm{ph}} \approx (2h/\lambda)\sin\varepsilon$. Assuming elastic scattering of the photon (hence, wavelength remains unchanged) by a particle with vanishing initial momentum, an application of the classical law of momentum conservation (4.1) entails an equal indeterminacy of the value of the $x$-component of particle momentum $p_x$. Hence,
$$\delta p_x \approx \frac{2h}{\lambda}\sin\varepsilon. \qquad (4.13)$$
Combining (4.12) and (4.13) we get
$$\delta x\,\delta p_x \gtrsim h. \qquad (4.14)$$
From this relation it is once again evident that the indeterminacy in one quantity increases if the other quantity is determined more accurately. It is not possible to determine both quantities with arbitrarily small indeterminacies.
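The trade-off in the microscope argument can be made concrete with a few numbers. The following is a minimal sketch, assuming the resolving power in the form $\lambda/(2\sin\varepsilon)$ as the smallest attainable position inaccuracy and a photon momentum spread of $2(h/\lambda)\sin\varepsilon$ transferred to the particle; the function name `microscope_indeterminacies`, the half-angle and the two wavelengths are hypothetical illustration values.

```python
import math

H = 6.62607015e-34  # Planck constant in J s (exact SI value)


def microscope_indeterminacies(wavelength, half_angle):
    """Return (delta_x, delta_p_x) for the microscope estimate.

    delta_x   : resolving power, wavelength / (2 sin(half_angle))
    delta_p_x : width of the interval of photon momentum x-components
                accepted by the aperture, 2 (h / wavelength) sin(half_angle)
    """
    s = math.sin(half_angle)
    delta_x = wavelength / (2.0 * s)
    delta_p_x = 2.0 * (H / wavelength) * s
    return delta_x, delta_p_x


half_angle = math.radians(30.0)  # hypothetical aperture half-angle
for label, wavelength in (("visible light", 5.0e-7), ("gamma radiation", 1.0e-12)):
    dx, dp = microscope_indeterminacies(wavelength, half_angle)
    print(f"{label}: delta_x = {dx:.2e} m, delta_p_x = {dp:.2e} kg m/s, "
          f"product/h = {dx * dp / H:.3f}")
```

Shortening the wavelength improves the position estimate and worsens the momentum estimate by the same factor, so that under these assumptions the product stays at $h$ for any wavelength and aperture.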
4.6 Meaning of the ‘complementarity’ concept
It is not completely evident whether Bohr and Heisenberg were aware of the fact that they did not attribute exactly the same meaning to the concept of complementarity.
Heisenberg never dissociated himself explicitly from Bohr’s interpretation. For Bohr the meaning of complementarity was above all a conceptual one, whereas Heisenberg conceived it more as a physical principle12. They, however, agreed on the significance of complementarity as a restriction on the possibility of a simultaneous measurement of two incompatible observables, like position and momentum, in the sense of an impossibility of simultaneously determining both quantities with arbitrarily small indeterminacies. Although the ideas of Bohr and Heisenberg on this subject do not directly contradict each other, they are yet very different. These differences will be discussed in the following sections.
4.6.1 Two meanings of ‘to determine’
In order to understand the difference between Bohr’s and Heisenberg’s positions with respect to complementarity it is important to distinguish different views on what precisely is a quantum mechanical measurement. This will also play an important role in dealing with the criticisms by Einstein, Margenau and Ballentine to be discussed in section 4.7. It turns out that Heisenberg, Bohr and the three above-mentioned critics have completely different views on this issue. This also gives rise to very different interpretations of the Heisenberg inequality. Failure to appreciate these differences has been, and still is, a cause of confusion. Much of this confusion stems from the ambiguity of the meaning of the often-used expression “to determine.” What is the meaning of the statement: “The value of momentum P is determined by the measurement”? There are at least two possibilities:
1. It is ascertained by the measurement which value P had before the measurement.
2. By the interaction process it is fixed which value P has immediately after the measurement.
We shall refer to 1 and 2 as determinative and preparative meanings of ‘to determine’, respectively, referring to the difference of ‘measurement’ and ‘preparation’ as physical procedures determining certain aspects of a physical system13. In textbooks of quantum mechanics (e.g. Messiah [57], section 4.17) the first possibility is often considered as the natural one. Like in classical mechanics, measurement is meant to tell something about a reality that was present before it was changed by the disturbing influence of the measurement. This textbook view is actually the one advocated by the above-mentioned critics. It is inconsistent with the Copenhagen interpretation due to the latter’s rejection of the ‘possessed values’ principle. In the view of the critics inequality (1.77) is a restriction on our knowledge with respect to position and momentum of the particle. It does not refer to the particle itself. In reality the particle may have, like in classical mechanics, well-defined values of position and momentum. By measurement our knowledge is increased. An observer performing an accurate position measurement (having $\delta x \approx 0$) will find the real initial value of position. Due to the disturbing influence exerted by the position measurement he will not obtain any knowledge about initial momentum. On the other hand, if he performs a momentum measurement he will find the value momentum had beforehand, but at the same time not get any knowledge of initial position. In experiments like the $\gamma$-microscope $\delta x$ and $\delta p_x$ are both different from zero due to mutual disturbance (cf. section 4.6.2) of the position and momentum measurements. This is interpreted by the critics in the sense that such experiments yield incomplete knowledge about initial position and momentum, which quantities, however, are thought to be both well-defined. Due to measurement disturbance the real values are not determined (retrodicted)14 exactly. They are thought to have initially been somewhere within a region of phase space marked by the indeterminacies $\delta x$ and $\delta p_x$. Note that Heisenberg accepts the possibility of simultaneously attributing to a particle values of position and momentum with arbitrarily small indeterminacies as properties possessed in the past (for instance, by preparing the electron in a state with nearly well-defined momentum (hence, $\delta p_x \approx 0$) and measuring position with very small inaccuracy $\delta x$), thus violating the Heisenberg inequality.
12 In [232] Heisenberg remarks that Bohr was primarily a philosopher, not a physicist.
13 Possibilities 1 and 2 are sometimes referred to as ‘retrodictive’ and ‘predictive’, respectively. This nomenclature will not be followed here, however, because both appear to refer to ‘measurement’, thus not sufficiently reflecting the physical distinction between ‘measurement’ and ‘preparation’.
However, for Heisenberg such knowledge with respect to the past has a speculative character. It is a matter of personal belief, because it cannot be experimentally verified whether any physical reality can be ascribed to it since any future (simultaneous) measurement of position and momentum has to satisfy inequality (4.14). From the discussion in section 4.5.3 of the microscope it is clear, however, that momentum indeterminacy is used there in agreement with the second possibility rather than the first one: in (4.14) indeterminacy of momentum refers to the final state of the electron. Heisenberg was well aware of this. He ([118], section II.2) remarks that the inequality does not refer to the past but to the future, that is, to the state of the electron after the measurement. Heisenberg’s reference to the meaning of his inequality as referring to the future marks a notable difference with textbook interpretations holding it to refer to the past. This important difference often remains unnoticed. For instance, Messiah ([57], Vol. I, p. 48) presents his above-mentioned interpretation as a fruit of the Copenhagen school, thus characterizing a thoroughly anti-Copenhagen element as a Copenhagen one. Heisenberg actually defines a measurement as being a preparation, viz. a preparation brought about by a measurement. He is evidently using the notion of ‘to determine’ in a preparative sense, indeterminacy relation (4.14) being thought to be relevant to (our knowledge of) the final state (as prepared by the measurement) rather than to the initial state. The conception of a measurement as a preparation of a microscopic object in a particular final state has become one of the characteristics of the Copenhagen interpretation. Measurement is primarily conceived of as a filter directing the object into one of a set of spatially separated outgoing beams (compare figure 3.1), an individual measurement being completed if it is ascertained which of the possible outgoing beams contains the object. This conception of a quantum mechanical measurement was closely modeled after the experimental practice of that time, like scattering experiments (for instance, testing the Compton effect), or the Stern-Gerlach experiment. It seemed to square well with Heisenberg’s empiricist attitude, viewing observables as referring to directly observable phenomena. Unfortunately, Heisenberg did not go all the way to an empiricist interpretation as described in section 2.2, in which observables do not correspond to properties of the microscopic object (either before or after the measurement), but to properties of the measuring instrument. By now it is evident that a model of quantum mechanical measurement as a filter is far too restrictive (cf. chapter 3). By blurring the distinction between the concepts of ‘preparation’ and ‘measurement’ the Copenhagen view of ‘measurement’ has been a source of confusion.
14 It should be noted that the idea of ‘retrodiction’ hinges on the ‘possessed values’ principle, and, hence, borrows from this latter principle its suspect character. This is another reason why ‘determinative’ is preferred over ‘retrodictive’ (see also section 7.2.2).
In particular, the idea of a measurement as a means of obtaining information on the state of the object as it was before the measurement evaporates in this way (cf. section 3.2.4 for the conventional treatment of this subject). Nevertheless, in textbook quantum mechanics this latter idea is generally upheld. The neglect of the difference between preparation and measurement as two fundamentally different physical procedures -having quite different objectives- should be counted as one of the main weaknesses of the Copenhagen interpretation. The main source of the above-mentioned confusion is the preoccupation of the Copenhagen interpretation with ‘measurement results’. Preparation is never considered as a quantum mechanical procedure, fundamentally different from measurement. If discussed at all, preparation is introduced as a kind of measurement, for instance by using the projection postulate (see section 3.3.4). In section 4.7.2 Margenau’s criticism of this neglect will be discussed. Here it is noted that Bohr, too, did not observe a sharp distinction between ‘preparation’ and ‘measurement’. As is already apparent from our discussion of the correspondence principle, Bohr did not draw a distinction between different parts of the measurement arrangement for
preparation and measurement. He always referred to the whole experimental arrangement (often referred to as the ‘measuring instrument’) defining the observable that is measured, without noting that part of it does not have any relation at all to this observable, but serves to prepare the microscopic object in its initial state. The Copenhagen neglect of the difference between ‘preparation’ and ‘measurement’ will become particularly important in the discussion of Bohr’s answer to the Einstein-Podolsky-Rosen paper (cf. section 5.3), where Bohr’s failure to recognize the Einstein-Podolsky-Rosen proposal as a preparation rather than as a measurement has caused much confusion. According to Bohr the Heisenberg inequality must be interpreted in the context of an experimental arrangement. The fact that it is not specified during which part of the measurement this should be is a particular source of confusion. Thus, there is an appreciable difference between Heisenberg and Bohr with respect to the interpretation of the inequality, the former holding it to be valid after the measurement, whereas according to the latter it is valid during the measurement. Such differences could be upheld without causing controversy only by ignoring them. Presumably, Heisenberg’s more objectivistic-realist terminology (treating the post-measurement state as an objective description of the object after it has been prepared by the measurement) appealed more to physicists than Bohr’s contextualism. It was taken for granted too easily that Heisenberg’s theory of measurement was nothing but a physical implementation of Bohr’s philosophical ideas. The professed unanalyzability of the measurement process in quantum mechanical terms (cf. sections 4.2.2 and 4.4) may have been instrumental in blurring the differences.
For a long time the difference between ‘preparation’ and ‘measurement’ has not obtained any serious attention, apart from Margenau’s criticism of the Copenhagen interpretation, to be discussed in section 4.7.2. However, possibly due to Bohr’s authority, or, more generally, due to the influence of the Copenhagen interpretation, this criticism has not been very effective in revealing the weaknesses of this latter interpretation. As far as this criticism has had any influence, this has been realized by adding certain of the critic’s ideas (like the idea of ensembles, cf. chapter 6) to the Copenhagen interpretation without bothering too much about the consistency of the whole edifice. In particular, the wide gap between an interpretation of the Heisenberg inequality as a property of measurement (either during the measurement or afterwards), and an interpretation as a property of preparation (preceding the measurement), was not observed until recently. Elucidation of this difference is one of the purposes of this book.
4.6.2 Heisenberg’s disturbance theory of measurement
Heisenberg’s view on complementarity can best be characterized as a disturbance theory of measurement in a preparative sense. By the presence of the measurement arrangement, necessary to perform a position measurement, all observables incompatible with the position observable (like, e.g. momentum) are disturbed in the final state of the object. Thus, in the microscope a collision with a photon, intended to determine the position of a particle, causes a disturbance of particle momentum. Due to the quantal character of the interaction such a disturbance cannot be prevented, causing the final state of the particle to be such as to yield momentum probabilities different from the ones obtained in a momentum measurement. However, determinative elements (referring to the initial state of the object) can also be discerned. Indeed, $\delta x$ seems to be interpreted in a determinative way: it refers to the accuracy of the knowledge on initial position obtained by the measurement. Thus, by increasing wavelength $\lambda$ photon momentum, and, hence, momentum disturbance is made smaller. The increase of $\delta x$ accompanying, according to (4.14), a decreasing $\delta p_x$ is interpreted in terms of a decreasing resolution of the microscope. Notwithstanding Heisenberg’s careful distinction between past and future, the microscope experiment is often interpreted as an example of a measurement arrangement for the simultaneous measurement of position and momentum in a determinative sense (i.e. referring to the past), interpreting the preparative indeterminacy as a determinative one in the way attributed in section 4.6.1 to critics of the Copenhagen interpretation. Then mutual disturbance is referring to information on the initial state rather than to a physical disturbance of the final state of the object.
Due to the general neglect of the distinction between ‘preparation’ and ‘measurement’ the fundamental difference between the notions of preparative and determinative indeterminacies has seldom been appreciated (see, however, sections 4.7.2 and 4.7.3). In particular, the Copenhagen rejection of the ‘possessed values’ principle made it virtually impossible to refer to initial values of the observables, since these were not even supposed to exist. Hence, rejection of the ‘possessed values’ principle would seem to allow only Heisenberg’s preparative interpretation. It is not completely clear which was Heisenberg’s precise position on this issue. In discussing his disturbance theory his terminology is very classical, rather in the sense of an objectivistic-realist interpretation of physical quantities (cf. section 2.3), in which a particle seems to have at each time a well-defined value of both position and momentum, to be found actually if a suitable measurement is performed. Thus, when measuring position we seem to find the particle at the position it occupied immediately before the measurement; by the same token, a momentum measurement seems to reveal the momentum the particle had before the measurement. In a position measurement momentum is disturbed. This seems to make sense only if there existed something to be disturbed, i.e. if the particle had a momentum value
before the measurement was performed. Heisenberg ([233], p. 29), indeed, notes that “It is completely certain that the electron has been moving at the observed position with the observed velocity.”15 As discussed in section 4.6.1, at a certain stage Heisenberg considered these data as real, but irrelevant because of the impossibility of verifying (by means of a simultaneous measurement) knowledge about the precise values of position and momentum as they were prior to measurement. On the other hand, in [233], p. 36, Heisenberg develops a kind of Aristotelean philosophy in which before the position measurement the value of position is only potentially present, measurement actualizing the position observable. This latter view is consistent with a probabilistic interpretation of quantum mechanical probability distributions (cf. section 6.2), in which a physical quantity does not have a value prior to measurement, and serves above all to endorse the Copenhagen ‘completeness’ thesis (cf. section 4.2.2). It is, however, at variance with Heisenberg’s above-mentioned classical terminology, which is consistent with a (classical) statistical interpretation of quantum mechanical probability distributions, attributing to a physical quantity a well-defined although unknown value prior to measurement (cf. section 6.2). That Heisenberg used both views next to each other may indicate that Heisenberg’s opinion on Bohr, given in section 4.6.1, has a mirror image in the sense that Heisenberg was in the first place a physicist and only in the second place a philosopher. His physical intuitions may be more reliable than his philosophical insights. It seems most likely that, as a physicist, Heisenberg was thinking in statistical rather than in probabilistic terms, diversions into philosophy perhaps being encouraged by a wish to keep pace with Bohr’s insights.
That Heisenberg’s wavering between probabilistic and statistical interpretations has remained largely inconsequential is a consequence of the general neglect of the distinction between ‘preparation’ and ‘measurement’, discussed in section 4.6.1, allowing Heisenberg to consider a measurement as a preparation, without bothering too much about the difference.
The microscope is a paradigm of Heisenberg’s disturbance theory of measurement. In particular, by yielding indeterminacy relation (4.14) it suggested a close relationship with the Heisenberg uncertainty relation16 (1.77), derived from the mathematical formalism of quantum mechanics. Since the quantities involved in (4.14) are dependent on the measurement, such a relationship has plausibility only if (1.77) is taken in the final object state, as was done by Heisenberg. Unfortunately, in quantum mechanics textbooks Heisenberg’s distinction between past and future is seldom observed, and indeterminacy relation (4.14) is seen as an implementation of Heisenberg’s inequality, taken in the initial state of the object. However, Heisenberg’s inequality is valid independently of the precise method of measurement of position and momentum. In its derivation (cf. section 1.7.1) there is no reference to measurement at all. Hence, it is implausible to equate it with inequality (4.14), which can be seen as a property of the measurement procedure. Such an identification could only be upheld by completely ignoring any influence of the measurement process.
15 “Es ist völlig sicher, dass das Elektron sich an dem beobachteten Ort mit der beobachteten Geschwindigkeit bewegt hat.”
16 The terminological difference between the two relations anticipates a conceptual difference to be discussed in the following.
It seems clear from the foregoing that we should be careful when interpreting the Heisenberg inequality (1.77) as a formal representation of mutual disturbance in a simultaneous measurement of incompatible observables. It might be felt to be somewhat discomforting that an inequality characterizing such a mutual disturbance could be described without any reference to measurement (compare section 4.7). In the following it will, indeed, become clear that there is no fundamental relationship between simultaneous measurement of incompatible observables and the Heisenberg inequality (1.77) (although in certain special measurement procedures, like the microscope, the latter inequality might have an incidental applicability). A clarification of this issue can be obtained only from a generalization of the mathematical formalism (cf. chapter 7). Suffice it here to conclude that Heisenberg’s treatment of the subject is an important step forward because it explicitly takes into account the disturbing influence of the interaction with the measuring instrument. However, the step is not big enough. Its reference to the final state of the object makes it too restrictive. For instance, it does not encompass photon counting for which, ideally, the final state would not yield any information at all on initial photon number (compare section 3.2.4). It will turn out that, due to its restricted scope, the microscope has some misleading features as a paradigm.
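The remark that the Heisenberg inequality is derived without any reference to measurement can be illustrated numerically: the two standard deviations are computed from a state alone, with no model of an instrument. The sketch below assumes that (1.77) has the familiar form of a lower bound of ħ/2 on the product of the position and momentum standard deviations (its exact form is given outside this section); the helper names `spreads`, `gaussian` and `two_bumps`, the grid and the units (ħ = 1) are hypothetical choices made for this illustration.

```python
import numpy as np

HBAR = 1.0  # natural units; an assumption of this illustration


def spreads(psi, x):
    """Return (delta_q, delta_p) for a wave function sampled on a uniform grid x."""
    dx = x[1] - x[0]
    psi = psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)  # normalise the state

    # Position statistics from |psi(x)|^2.
    prob_x = np.abs(psi) ** 2 * dx
    mean_x = np.sum(x * prob_x)
    delta_q = np.sqrt(np.sum((x - mean_x) ** 2 * prob_x))

    # Momentum statistics from the discrete Fourier transform of psi.
    p = HBAR * 2.0 * np.pi * np.fft.fftfreq(x.size, d=dx)
    prob_p = np.abs(np.fft.fft(psi)) ** 2
    prob_p = prob_p / prob_p.sum()
    mean_p = np.sum(p * prob_p)
    delta_p = np.sqrt(np.sum((p - mean_p) ** 2 * prob_p))
    return delta_q, delta_p


def gaussian(x, sigma, center=0.0):
    """Gaussian wave packet whose position standard deviation is sigma."""
    return np.exp(-((x - center) ** 2) / (4.0 * sigma**2)).astype(complex)


def two_bumps(x, sigma=1.0, separation=6.0):
    """Superposition of two separated Gaussians (a non-minimal state)."""
    return gaussian(x, sigma, -separation / 2) + gaussian(x, sigma, separation / 2)


x = np.linspace(-40.0, 40.0, 4096)
for sigma in (0.5, 1.0, 2.0):
    dq, dp = spreads(gaussian(x, sigma), x)
    print(f"Gaussian, sigma={sigma}: dq*dp = {dq * dp:.4f}")

dq, dp = spreads(two_bumps(x), x)
print(f"two-bump state: dq*dp = {dq * dp:.4f}")
```

Nothing in the calculation refers to a microscope or any other arrangement: the bound is a property of the state, which is why identifying it with a measurement-dependent relation such as (4.14) requires an extra argument.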
Using the generalized formalism it will be possible to draw a clear distinction between mutual disturbance in the preparative sense considered by Heisenberg, and mutual disturbance in a determinative sense, which in an empiricist interpretation can be formulated without resorting to the ‘possessed values’ principle.
4.6.3 Complementarity according to Bohr
As regards Bohr’s view on complementarity Heisenberg is undoubtedly right in calling Bohr a philosopher rather than a physicist. As noted in section 4.4 Bohr held complementarity to be a general property of human cognition, valid far outside the domain of physics, rather than a property of physical reality. Whereas Heisenberg tended to think in ontological terms, i.e. terms referring to things existing in reality, or to things being measured, Bohr’s terminology is rather of an epistemological nature, i.e. related to our (possibility of having) knowledge. Bohr ([72], p. 5) puts it this way: “In this context, we are of course not concerned with a restriction as to the
accuracy of measurements17, but with a limitation of the well-defined application of space-time concepts and dynamical conservation laws, entailed by the necessary distinction between measuring instruments and atomic objects.” The closing sentence of his Como lecture [234] is telling, too: “I hope, however, that the idea of complementarity is suited to characterize the situation, which bears a deep-going analogy to the general difficulty in the formation of human ideas, inherent in the distinction between subject and object.” It was stressed already by Kant [235] that the experience of our senses is not the only source of knowledge. An activity of our mind (Vernunft) is necessary to order our observations in a theoretical scheme. Without such a scheme our observations would consist of a disorderly, and hence useless, set of sense impressions. Without the notion of causality, imposed by our mind on our observations, we have no reason to expect thunder to follow lightning next time, even after having observed it to happen an arbitrarily large number of times. ‘Causality’ is one of Kant’s categories of cognition, used by our mind for the purpose of ordering our observations. In Kant’s view causality is not a property of reality itself, but primarily a property of human cognition. For Kant the categories of the mind were “a priori”, i.e. independent of experience, and, hence, metaphysical. He was convinced that any future metaphysics being suited to serve as a scientific theory (see the full title of his Prolegomena [235]) will have to use the categories, found by him, as ordering principles. Kant had similar ideas with respect to space and time, which he also considered as a priori. Kant’s ideas have been influenced appreciably by his admiration of Newton’s achievements in classical mechanics. 
During his time the idea of ‘causality’ was built into classical physics through the deterministic character of classical motion: given the initial conditions of position and velocity (or momentum) the solution of the classical equations of motion is uniquely determined. Kant’s ideas with respect to space and time, too, were derived from Newtonian theory, no alternatives to Euclidean space being known by then.
17 A subject favored by Heisenberg (compare section 4.5.3).
Kant’s epistemological ideas have been challenged by the development of relativity theory and quantum mechanics as alternatives to classical mechanics, and by the concomitant introduction of non-Euclidean spaces and indeterministic motions (cf. [236, 237, 238]). A rather extreme reaction is the logical positivist/empiricist attempt to order our observations without any metaphysical principles, completely relying on empirical data. A less far-reaching reaction is the policy of preserving metaphysical principles in relativity theory and quantum mechanics, not in the (classical) form introduced by Kant but in a form adapted to the new domains of experience. Bohr’s attitude with respect to complementarity can be interpreted in this latter sense (see also Jammer [216], p. 203). Bohr [202] considers ‘complementarity’ as “a rational generalization of the very ideal of causality.” For him the category of ‘complementarity’ must replace the classical mechanical category of ‘determinism’ (or ‘causality’)18. According to Bohr ‘complementarity’, not ‘determinism’, is the category to be used for ordering observations within the domain of atomic physics. The reason for this is the mutual exclusiveness of measurement arrangements for quantum mechanical position and momentum, preventing a simultaneous sharp determination of the values of these observables. This makes the classical concept of ‘determinism’ inapplicable. For quantum mechanics this latter concept is meaningless. Hence, according to Bohr quantum mechanics requires a completely different way of thinking and reasoning. Due to the correspondence principle we have to keep thinking in terms of the classical quantities position and momentum. However, within the atomic domain it is no longer possible to think in terms of points in phase space, or continuous particle trajectories, because the quantities $q$ and $p$ are not both sharply defined. Each is defined only with a certain latitude ($\delta q$ and $\delta p$, respectively), its precise value depending on the measurement context, but always satisfying a relation analogous to (4.2), (4.7), (4.11), and (4.14), viz,
$$\delta q \, \delta p \gtrsim h, \qquad (4.15)$$
the Heisenberg indeterminacy relation. It is important to realize that, even though Bohr and Heisenberg refer to the same inequality, it is interpreted by them in very different ways. Thus, as discussed in section 4.6.1, for Heisenberg the latitudes refer to the final state of the object. It often is not even inconsistent with his wording to interpret (4.15) as a statistical relation, referring to a classical particle that, after a measurement, has sharp values of position and momentum, knowledge of which is restricted by a stochastic disturbance due to the measurement. For Bohr this relation has the meaning of a restriction on the definition of the quantum mechanical concepts ‘position’ and ‘momentum’ within the context of a simultaneous measurement, the latitudes of the definitions satisfying (4.15). Hence, for Bohr this inequality is valid during a measurement rather than posterior to it. For a microscopic object it is impossible to maintain the classical concept of a phase space point (q, p). For Bohr the elementary concept is a (small) region of phase space, with extension determined by the latitudes $\delta q$ and $\delta p$ satisfying (4.15). For Bohr it is senseless within the atomic domain to think in terms of points in phase space or particle trajectories, because these concepts simply are undefined. For conceptual reasons no trajectory can be attributed to an electron. Hence, also the concept of ‘determinism’ becomes obsolete. It should be replaced by the concept of ‘complementarity’.
18 Bohr’s use of the terms ‘determinism’ and ‘causality’ as more or less synonymous is rather confusing since his abolishment of ‘determinism’ in quantum mechanics generates a conflict with the custom of calling quantum mechanics a deterministic theory due to the deterministic evolution of the solutions of the Schrödinger equation (see also section 4.6.5). By remembering that, in agreement with his correspondence principle, Bohr always referred to classical concepts, it is possible to evade this terminological conflict.
The complementarity principle is at the basis of Bohr’s conviction that quantum mechanics is a complete theory (cf. section 4.2), and that it is meaningless to look for a deterministic completion as pursued by Einstein. Any attempt at such a completion would have to fail because of the impossibility of defining, within the atomic domain, $q$ and $p$ more precisely than is allowed by the Heisenberg indeterminacy relation (4.15). In [202], p. 237, Bohr seems to touch upon the difference between his own views and Heisenberg’s Aristotelean ideas when warning against phrases, to be found in the literature, like “disturbing of phenomena by observation” and “creating physical attributes to atomic objects by measurements”, suggesting the possibility of a more complete description transcending the indeterminism introduced by the measurement disturbance. In Bohr’s opinion such phrases rely on the idea that measurement would be analyzable in terms of quantities that are more sharply defined than allowed by (4.15).
Such concepts are “hardly compatible with common language and practical definition.” For Bohr indeterminism had a deeper meaning, namely the fundamental impossibility of transcending concepts that are defined, in agreement with the correspondence principle, in the classical language of our macroscopic observations, and, hence, are defined only “under specified circumstances, including an account of the whole experimental arrangement.” According to Bohr the fundamental difference between classical and quantum mechanics is the impossibility of combining in the latter theory space-time concepts with the conservation laws of energy and momentum (E, p) (the latter sometimes being referred to as a “requirement of causality”). Such a combination is considered by him as characteristic of classical mechanics. In order to understand what Bohr means by this, it is advantageous to refer to the example of mutually exclusive measurement arrangements discussed above. In Bohr’s view a description in terms of spatial variables is possible only if the experimental arrangement allows their definition, e.g. if screen S is held fixed. In illustrations given by Bohr (e.g. in [202]) immobility of the screen is symbolized by heavy bolts anchoring the screen to its support. Because in this physical situation momentum transfer between particle and screen is completely undefined, the law of momentum conservation is not applicable. On the other hand, if the screen is allowed to move freely (as in Einstein’s proposal discussed in section 4.5.2) this latter law is applicable because now the momentum variables are well-defined. However, then the spatial variables are undefined. In the so-called ‘Photon in a box’ ‘thought experiment’ [202] an analogous complementarity between time and energy is considered. The analysis of the ‘complementarity’ concept as given here does not seem to change in any essential way if the term ‘realism’ is substituted for ‘determinism’ or
‘causality’. What is really at stake in Bohr’s conception of ‘complementarity’ is not the time evolution of physical quantities but the question of whether sharp values of position and momentum can simultaneously be attributed to a microscopic object. What, according to Bohr, can be attributed is rather a region of phase space, not a phase space point. Unfortunately it is not completely clear in what sense such an attribution should be taken. If Folse is right that Bohr had a realist attitude with respect to quantum mechanical observables (cf. section 4.3.3), then the corresponding reality seems to be of a kind that nowadays is sometimes referred to as a ‘fuzzy reality’ [239, 40, 41]. The indeterminacy of an observable, interpreted in a realist sense, would correspond to its fuzziness. Since the indeterminacies are dependent on the measurement arrangement, the concomitant realist interpretation would be a contextualistic one. Note, however, that attributing to Bohr such a contextualistic-realist interpretation ignores the subtle difference between epistemology and ontology made by him.
In the following I shall ignore the subtlety referred to above, and refer to the Copenhagen interpretation, as far as observables are involved, as a contextualistic-realist one. Once again, however, we have reason to complain that it would have been more fruitful if Bohr had dealt somewhat more with physics than with knowledge. This would have forced him to enter into the differences between his own contextualistic views on complementarity (indeterminacy valid within the context of the measurement) and Heisenberg’s more objectivistic interpretation (indeterminacy as an objective property of the final state of the object as prepared by a measurement), thus contributing to a more self-consistent Copenhagen interpretation than is available now.
4.6.4 Particle-wave duality
Initially, in Bohr’s coming to grips with the idea of ‘complementarity’ the so-called particle-wave duality played a large role. The idea was that, depending on the measurement context, an object within the atomic domain can manifest itself either as a particle or as a wave. In 1928 Bohr [240] still considered these as two complementary pictures in terms of which atomic phenomena should be described. These pictures are not contradictory but supplement each other. In agreement with the correspondence principle (section 4.3) these pictures correspond to classical concepts. Bohr’s initial idea is that the different pictures should correspond to different, mutually exclusive measurement arrangements. Thus, in one experimental context an object (e.g. an electron or a photon) would manifest itself as a wave (for instance, in a double-slit interference experiment in which it is not observed through which slit the object passes the screen), in another context as a particle (when the measurement arrangement is such that it is determined through which slit the object passes the screen). Because of the different measurement contexts
there is no contradiction. In this way it is often presented in textbooks (e.g. Messiah [57], section 4.18). However, as stressed by Murdoch [241] (see also Martens [242]), Bohr soon realized that particle-wave duality cannot be an example of complementarity in the way described above. Indeed, from the discussion of the double-slit experiment in section 4.5.2 it is evident that within one single experimental arrangement particle and wave aspects can both play a role (cf. figure 4.4, where the interference pattern represents the wave aspect, whereas the particle aspect manifests itself through the gradual development of this pattern by means of point-like impacts on screen B). For this reason Bohr seems to change his conception of the particle-wave duality in such a way that light is always to be associated with a classical wave phenomenon, and that, for this reason, a quantum of light can at most have a symbolic meaning, photons being interpreted as artefacts of the quantum mechanical description. For electrons it is the other way around: the classical picture of an electron is a particle, now the wave being an artefact of the quantum mechanical description. This is consistent both with Bohr’s lasting resistance against the idea of the reality of photons, as well as with his instrumentalist view of the wave function. In a later publication Bohr [202] seems to take up again his older idea. Here Bohr employs the particle-wave duality for introducing and explaining the idea of mutually exclusive measurement arrangements. Undoubtedly the easy accessibility of this publication has contributed appreciably to establishing the idea that this is Bohr’s view. It is questionable, however, whether Bohr in this historical account of his discussion with Einstein on the foundations of quantum mechanics has observed the necessary finesse in presenting his ideas on particle-wave duality. 
It should be noted here that the gradual development of the interference pattern, as apparent in figure 4.4, was not observed experimentally until 1958 (for light [243]) or 1959 (for electrons [229]). Before that time the idea of particle-wave duality as an illustration of complementarity was a very attractive one, and it undoubtedly inspired Bohr’s thinking about complementarity, even though at a certain moment he saw that the idea could not be maintained in its original form.
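The coexistence of particle and wave aspects within a single arrangement, as in the gradual development of the interference pattern, can be mimicked with a small simulation. This is a toy model and not a description of the experiments cited above: the intensity profile `fringe_density`, the sampling routine `sample_impacts`, the contrast measure `fringe_contrast` and all numerical parameters are hypothetical choices made for illustration. Each impact is a single point on the screen (particle aspect); the fringes only emerge in the statistics of many impacts (wave aspect).

```python
import numpy as np


def fringe_density(y, fringe_spacing=1.0, envelope_width=4.0):
    """Toy two-slit intensity: cos^2 fringes under a Gaussian envelope (maximum value 1)."""
    fringes = np.cos(np.pi * y / fringe_spacing) ** 2
    envelope = np.exp(-(y**2) / (2.0 * envelope_width**2))
    return fringes * envelope


def sample_impacts(n, rng, y_max=10.0):
    """Draw n point-like impact positions by rejection sampling from fringe_density."""
    impacts = []
    while len(impacts) < n:
        y = rng.uniform(-y_max, y_max, size=n)
        u = rng.uniform(0.0, 1.0, size=n)
        impacts.extend(y[u < fringe_density(y)])  # keep the accepted candidates
    return np.array(impacts[:n])


def fringe_contrast(impacts, fringe_spacing=1.0):
    """(bright - dark) / (bright + dark), counting impacts near maxima versus near minima."""
    phase = np.mod(impacts / fringe_spacing, 1.0)
    near_max = np.sum((phase < 0.25) | (phase > 0.75))
    near_min = impacts.size - near_max
    return (near_max - near_min) / impacts.size


rng = np.random.default_rng(0)
for n in (10, 100, 1000, 20000):
    impacts = sample_impacts(n, rng)
    print(f"{n:6d} impacts: estimated fringe contrast = {fringe_contrast(impacts):+.2f}")
```

With only a handful of impacts the contrast estimate fluctuates strongly, so the pattern is not yet recognizable; with many impacts it settles near the value fixed by the assumed intensity profile.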
4.6.5 Parallel and circular complementarity
Von Weizsäcker [244] has pointed out still another possibility of implementing the idea of ‘complementarity’ in physics. As discussed in section 4.6.3, in Bohr’s view a space-time description of a microscopic particle can be given only in the context of a position measurement. Then the inevitable interaction between object and measuring instrument makes a causal description (in the sense of classical mechanics) impossible. According to von Weizsäcker a causal description is possible in quantum mechanics since, due to the deterministic behavior of solutions of the Schrödinger equation
referred to in section 4.6.3, the time evolution of the wave function can be considered as a causal process. However, this causality applies only as long as the object is an isolated object, no measurement being performed. Due to the finiteness of the ‘quantum of action’ and the unanalyzability of the measurement process caused by it, there is no causality in this sense as soon as the object starts interacting with a measuring instrument. Von Weizsäcker is referring here to a kind of complementarity related to mutually exclusive physical situations of, on one hand, an isolated object with a wave function satisfying the Schrödinger equation, and, on the other hand, an object interacting with a measuring instrument. This is called by him ‘circular complementarity’, to distinguish it from the notion of ‘parallel complementarity’ involved in the concept of ‘complementarity’ as embodied in the idea of mutually exclusive measurement arrangements. The term ‘parallel’ is used to emphasize the similarity of the complementary situations involved, the situations ‘object isolated’ and ‘object interacting with a measuring instrument’ being fundamentally different. According to Jammer ([216], p. 102) von Weizsäcker’s idea of ‘circular complementarity’ can be understood on the basis of Bohr’s idea of ‘complementarity in a wider sense’. Indeed, the situations ‘object isolated’ and ‘object interacting with a measuring instrument’ are mutually exclusive, necessitating qualitatively different descriptions. Yet, Bohr completely rejected the idea of ‘circular complementarity’ (cf. [216], p. 104). According to Bohr ‘complementarity’ is a category to be applied only to phenomena, realized in well-defined measurement arrangements of well-defined quantum mechanical observables. A description by means of the Schrödinger wave function does not correspond to any phenomenon, and can, for this reason, not be an element of a complementary relationship. 
As far as quantum mechanics is concerned Bohr seems to allow only the idea of ‘parallel complementarity’. As von Weizsäcker notes, acceptance of the idea of ‘circular complementarity’ by Bohr would, indeed, have meant an ambiguity in his terminology with respect to the concept of ‘causality’, since this notion was used by Bohr to refer to the context of a momentum measurement. Evidently, this way of using the concept is different from von Weizsäcker’s usage. Whether, as von Weizsäcker contends, Bohr really made such an oversight is not completely clear. For our purpose this is not important, because in an empiricist interpretation of quantum mechanics a restriction to the phenomena is the natural thing to do, thus allowing us to leave the notion of ‘circular complementarity’ out of consideration.
4.6.6 Complementarity and the projection postulate
Due to its connection with ‘completeness in a restricted sense’ the projection postulate, introduced in section 1.6 and discussed extensively in chapter 3, is often seen as an essential constituent of the Copenhagen interpretation. On the other hand,
it also poses a number of conceptual problems to this interpretation. This may be caused by the fact that the Copenhagen ideas constitute a less coherent whole than would be desirable, and that the standard mathematical formalism due to Dirac and von Neumann (of which the projection postulate is a part) is not capable of covering all of the ideas put forward by the “founding fathers”. According to the Copenhagen interpretation quantum mechanics does not describe an objective reality that is independent of the measurement context. Outside such a context no value can be attributed to a quantum mechanical observable. In particular the ‘possessed values’ principle, introduced in section 2.3, is not thought to be valid. If the state is a superposition of eigenfunctions of momentum operator P, then an object is thought not to have a well-defined momentum (see, however, section 6.4). This is consistent with the views of both Bohr and Heisenberg. According to the former the concept of momentum is not even applicable if this observable is not actually measured: it is not even defined then. The same holds true according to Heisenberg’s Aristotelean ideas mentioned in section 4.6.2. An observable obtains a value, either a sharp one, or with a certain latitude (see also section 6.4.4), only in a measurement process. In the Copenhagen interpretation the projection postulate describes the change of the state vector realized by a measurement. Thus, weak projection (1.72) or (1.74) may be considered as describing the process in which a quantum mechanical observable gets defined, the precise value remaining unknown as long as the actual measurement result has not been observed. Strong projection (1.70) is associated with this latter act. Complementarity is explained on the basis of the idea that in measurements of incompatible observables projections are realized onto distinct state vectors.
Non-existence of joint eigenvectors prevents simultaneous measurement of incompatible observables. The language of projection strongly suggests the activity of “creating physical attributes to atomic objects by measurements” that Bohr warns against (cf. sections 4.6.3 and 6.2). By itself the projection postulate need not be in disagreement with Bohr’s epistemological philosophy, since weak projection may be viewed as analogous to choosing a different coordinate frame (an orthonormal basis in Hilbert space), and, hence, need not have an ontic meaning. The width of the probability distribution in the contextual state is interpretable as the latitude with which observable A is defined in the context of a measurement of A. In an instrumentalist interpretation of the state vector no physical cause need be given for the discontinuous change the state vector experiences during a strong projection: knowledge can change discontinuously. However, the projection postulate is primarily inspired by a realist interpretation of the quantum state. According to the strong version of the postulate the object necessarily has “to be in” the eigenstate corresponding to the observed eigenvalue of the measured observable. Moreover, this can be taken in an objectivistic
rather than a contextualistic sense, since the eigenstate can be used as initial state in a subsequent measurement of an arbitrary observable. Such a realist view may be much closer to Heisenberg’s conception of measurement in quantum mechanics. Here a “real” transition to an eigenstate of the measured observable seems to be a consequence of a “real” disturbance of observables incompatible with the measured one (Heisenberg’s disturbance theory of measurement, cf. section 4.6.2). Projection is associated here with a physical process rather than with an epistemological category. In such a realist view it is extremely difficult to see how the conclusion could be evaded that the value of observable A must have been created in the act of measurement if it was not there before the measurement19. Application of the projection postulate marks the duality, present in the Copenhagen interpretation, between Bohr’s epistemological approach and Heisenberg’s physical one. Each of these approaches has its particular advantages and drawbacks, highlighting certain aspects but simultaneously neglecting other ones. Thus, Bohr leaves unanswered questions having an ontic character. He considers the process of obtaining knowledge by means of measurement to be unanalyzable, and hence renounces any attempt to distinguish between object and measuring instrument. On the other hand, Heisenberg tries to give a physical analysis of the measurement process, but, in doing so, is led to a characterization of a quantum mechanical measurement as a preparation of the object in an eigenstate of the measured observable (cf. section 4.6.1). Although this attempt certainly is a step forward as compared with Bohr’s agnosticism with respect to measurement, the concomitant confusion with respect to ‘preparation’ and ‘measurement’, canonized in the projection postulate, is a legacy of the Copenhagen interpretation we have to cope with.
Moreover, Heisenberg’s disturbance theory of measurement is not completely efficacious in replacing Bohr’s cognitive way of dealing with quantum mechanical measurement by a more physical approach. It does not yield a clear description of strong projection as a physical process. It would seem to be necessary to change the Schrödinger equation in an essential way to achieve this goal, thus essentially changing the theory. As will be discussed in section 4.6.7, at a certain stage it even seemed necessary to invoke active intervention of the observer to properly achieve strong projection within quantum mechanics. The projection postulate will play an important role in the discussion of the Einstein-Podolsky-Rosen problem (cf. chapter 5), which is to be considered as Einstein’s final attack on the Copenhagen interpretation. It is very unfortunate, however, that neither Einstein nor Bohr has explicitly referred to it. Bohr, in particular, always tried to cast the problem in terms of (complementary) observables (cf. section 5.2), thus evading any necessity of connecting the wave function to reality.
19 A weak form of the ‘possessed values’ principle is often accepted within the Copenhagen interpretation if the initial state is an eigenstate of the measured observable. In this case it is assumed that the measured value was present already before the measurement.
This neglect of the projection postulate is all the more unfortunate because it is precisely the Einstein-Podolsky-Rosen problem that highlights the difference between applications of this postulate in measurement and in (conditional) preparation (cf. sections 3.3.4 and 5.4). The interpretation of the Einstein-Podolsky-Rosen proposal as a measurement rather than as a (conditional) preparation, can be seen as a consequence of the Copenhagen confusion with respect to this issue. Solution of this problem cannot be hoped for unless this confusion has been removed. A conception of ‘measurement’ as a means of obtaining information on the object in its initial state (rather than in its intermediate (Bohr) or final state (Heisenberg)) seems to be a necessary step to arrive at an understanding in terms of “ordinary” language so fervently advocated by Bohr. As discussed in section 3.2.4, projection does not play any essential role in such a conception of quantum mechanical measurement.
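The distinction between weak and strong projection used in this section can be made concrete for a two-level system. The sketch below is only an illustration under stated assumptions: equations (1.70), (1.72) and (1.74) are defined outside this section, so the standard von Neumann-Lüders forms are assumed here (weak projection removes the coherences between eigenspaces of the measured observable, strong projection selects one eigenspace and renormalizes). The function names `projectors`, `weak_projection` and `strong_projection` are hypothetical, and a nondegenerate observable is assumed.

```python
import numpy as np


def projectors(observable):
    """Eigenvalues and rank-one spectral projectors of a Hermitian matrix.

    Assumes a nondegenerate spectrum, so each eigenvector spans an eigenspace.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(observable)
    return eigenvalues, [np.outer(v, v.conj()) for v in eigenvectors.T]


def weak_projection(rho, projs):
    """Assumed form of weak projection: rho -> sum_m P_m rho P_m."""
    return sum(P @ rho @ P for P in projs)


def strong_projection(rho, P):
    """Assumed form of strong projection: rho -> P rho P / Tr(P rho), with its probability."""
    prob = np.trace(P @ rho).real
    return P @ rho @ P / prob, prob


# Observable A (here the Pauli matrix sigma_z) and an equal superposition of its eigenvectors.
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
psi = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(psi, psi.conj())

values, projs = projectors(sigma_z)
rho_weak = weak_projection(rho, projs)
print("weak projection (a mixture, value still unknown):")
print(rho_weak.real)

for value, P in zip(values, projs):
    rho_strong, prob = strong_projection(rho, P)
    print(f"strong projection for result {value:+.0f} (probability {prob:.2f}):")
    print(rho_strong.real)
```

In this sketch weak projection is the probability-weighted mixture of the strong projections, which matches the reading above in which the observable has become defined while its precise value remains unknown until the result is observed.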
4.6.7 Complementarity and consciousness
In section 4.4 a possible relation between complementarity and human consciousness was contemplated. It was also noted that Bohr did not attribute an active role in the quantum mechanical measurement process to consciousness, at least no role different from the one played by an observer in classical physics. For Bohr’s observer a quantum mechanical measuring instrument is a classical (macroscopic) object, observation of which does not pose any particular problem. This implies that for Bohr the observer is not a constituent of the quantum phenomenon. The element of ‘wholeness’ is restricted to the relation between object and measuring instrument, and does not include the observer. With von Neumann [2] this is different. In von Neumann’s conception an observation act is a sequence of quantum mechanical measurement processes by which information is transferred from the object to the observer’s consciousness. In this view the human eye is a (quantum mechanical) measuring instrument registering the pointer position of the measuring instrument proper. Due to the quantum mechanical character of the whole measurement process the eye allegedly constitutes an indivisible whole with the microscopic object, causing the extension of the quantum phenomenon to encompass human consciousness. This view has been put forward above all by London and Bauer [245], and still has its proponents (see, for instance, Stapp [246], Albert [247], Squires [248]). London and Bauer consider a possible solution to the problem of strong projection (cf. sections 1.6 and 4.6.6), according to which the state vector changes discontinuously into the eigenvector corresponding to a certain eigenvalue of the measured observable A. They attribute to human consciousness a property of introspection (connaissance immanente) realizing the transition (1.70) when consciousness becomes aware of its being in a state corresponding to the observation of
this eigenvalue. In this view a measurement remains unfinished as long as no conscious observer has been involved. This is illustrated by the paradox of Schrödinger’s cat described in section 3.1. The solution proposed by London and Bauer implies that state (3.1) is valid only as long as there is no conscious observer, during which time the cat is in a so-called state of “suspended animation”. As soon as a conscious observer opens the cage to see whether the cat is alive or dead, the projection (1.70) takes place, causing the observer to find either a living or a dead cat (with the probabilities determined by state (3.1)). Wigner [249] interpreted this as evidence that our consciousness or our mind is capable of influencing matter, this interaction being described not by the (linear) Schrödinger equation but by a nonlinear one.

Nowadays in general no other role in the quantum mechanical measurement process is attributed to human consciousness than that of inducing an experimenter to set up a certain measurement arrangement and to perform a certain experiment. Registration of the results of such measurements is highly automated. In general a human observer is involved only after the data have been processed by a computer, often after these have been translated into graphical pictures. It does not seem very reasonable to assume that such pictures come into existence only at the instant a conscious observer is looking. In any case, if there is projection at all, then it does not seem to be the human mind that realizes it^20. If it is a real process at all, projection would have to take place within the measuring instrument proper. The relation a human observer has to this measuring instrument is not different from the relation he has to any other macroscopic object. Registration of a measurement result by observation of a measuring instrument takes place inside the macroscopic domain in which Planck’s constant can effectively be put equal to zero.
For this reason the human observer, including his consciousness, can remain completely outside the quantum mechanical description. As far as a certain holism is valid in Bohr’s quantum phenomenon, it is sufficient to take this into account by considering the direct interaction of the object and the measuring instrument proper. Invoking human consciousness seems to be a last resort within a realist interpretation to effectuate a projection of the state vector for which no “physical” explanation can be found. It was above all his instrumentalist conception of the quantum mechanical wave function that allowed Bohr, in considering quantum measurement, to preserve the austerity with respect to the role of consciousness he used to observe in applying his ‘complementarity principle in a restricted sense’. Since no reality is attributed to the wave function, there is no necessity to rely on consciousness as an agent. An empiricist interpretation, too, can serve this purpose. As discussed in section 3.3.4, the projection postulate can be interpreted in terms of conditional preparation. The role of consciousness in such an interpretation is restricted to selection of objects on the basis of coincident observation of a certain measurement result. Usually this role is taken over by some electronic device performing the selection in an automated way.
^20 As is evident from the following quotation, this seems also to be Bohr’s opinion [202]: “… on the other hand, it is certainly not possible for the observer to influence the events which may appear under the conditions he has arranged.”
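The selection step described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a model taken from the text: the two-outcome state, its amplitudes, and the function names `born_probabilities`, `run_measurements` and `select_subensemble` are all hypothetical. It shows an automatic register recording one outcome per object, followed by a selection of the objects that share a given record, with no observer entering the procedure.

```python
import random


def born_probabilities(amplitudes):
    """Return normalized Born probabilities |c_m|^2 for complex amplitudes c_m."""
    weights = [abs(c) ** 2 for c in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]


def run_measurements(amplitudes, n_objects, seed=0):
    """Simulate an automatic register: one recorded outcome index per object."""
    rng = random.Random(seed)
    probs = born_probabilities(amplitudes)
    outcomes = list(range(len(probs)))
    return rng.choices(outcomes, weights=probs, k=n_objects)


def select_subensemble(records, wanted):
    """Conditional preparation by selection: keep objects whose record equals `wanted`."""
    return [i for i, outcome in enumerate(records) if outcome == wanted]


if __name__ == "__main__":
    amplitudes = [0.6, 0.8j]  # hypothetical two-outcome state
    records = run_measurements(amplitudes, n_objects=10_000)
    kept = select_subensemble(records, wanted=0)
    print("selected fraction:", len(kept) / len(records))
```

The selected fraction approaches the Born probability of the chosen outcome as the number of objects grows. The sketch works at the level of recorded outcomes only and does not model the state of the selected objects.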
4.7 Critique of the complementarity principle
4.7.1 Einstein
It is understandable that a concept as fundamental to physics as ‘complementarity’ has not been accepted without opposition. Doubtless the most illustrious opponent was Einstein, although his opposition was above all directed against the Copenhagen ‘completeness’ thesis (cf. section 4.2.2). As we saw above, however, complementarity is at the basis of the idea that within the atomic domain the classical ideal of a deterministic description cannot be maintained, and that the quantum mechanical description is complete in the sense of being the “best possible” description. Einstein never acquiesced to this “tranquilizing philosophy”. For Bohr and Heisenberg ‘complementarity’ was closely related to ‘interaction between object and measuring instrument’. This restricts the (complementary) quantum mechanical description to atomic systems that are interacting with a measuring instrument (‘completeness in a restricted sense’, cf. section 4.2.2). In the discussion with Einstein on the so-called ‘thought experiments’ (cf. section 4.5) Bohr always referred to this (uncontrollable) interaction in order to vitiate Einstein’s attempts to demonstrate the possibility of defining position and momentum more sharply than is allowed by the Heisenberg inequality. It was, however, precisely this reference to the interaction with the measuring instrument that was not acceptable to Einstein. For Einstein reality was an objective reality, not being interfered with by any observation or measurement. The moon is there when nobody looks. Nuclear processes in the sun’s core are not influenced by any measurement performed on earth. If quantum mechanics describes such processes (as it appears to do), then quantum mechanics describes objective reality, not influenced by any observer nor by his measuring instruments. To Einstein, Bohr’s reference to the interaction with a (man-made) measuring instrument had an odium of anthropomorphism, and did not seem relevant at all.
For Einstein there was no reason to believe that it is impossible in general to attribute a trajectory to an electron. If quantum mechanics is not capable of describing objective reality, then this theory must be considered ‘incomplete in a wider sense’ (cf. section 4.2.1), perhaps even inadequate. The ‘thought experiments’ mark the long-lasting discussion between Bohr and Einstein on the problem of the ‘(in)completeness of quantum mechanics’, in which
Einstein always tried to devise new ways to circumvent the Heisenberg inequality, and Bohr always demonstrated the flaws in Einstein’s reasoning, caused by his neglect of the interaction with the measuring instrument. The Einstein-Podolsky-Rosen (EPR) experiment ([250], cf. chapter 5) constitutes the apotheosis of the Bohr-Einstein discussion. It seems evident that this experiment has been devised with the intention to evade Bohr’s ‘interaction’ argument, and to demonstrate the possibility of performing measurements without any interaction between object and measuring instrument. In this way the EPR paper tries to attribute to the particle, in disagreement with Copenhagen views, well-defined values of both position and momentum, thus attempting to prove ‘incompleteness of quantum mechanics’. In chapter 5 it will be discussed how Bohr coped with this challenge, and how he tried to save ‘completeness’ and the closely related idea of ‘complementarity’.
4.7.2 Margenau
Another early critic of the Copenhagen interpretation is Margenau [163, 251]. His criticism is directed against the confusion with respect to the question, discussed in section 4.6.1, of whether the indeterminacy relations are telling us about the past (i.e. the state before the measurement), or about the future (i.e. the state after the measurement). Margenau ([163], sections 18.3 and 18.4) studied a number of ‘thought experiments’, and demonstrated that the indeterminacies in different experiments may have different meanings. In certain of the examples the indeterminacy can be interpreted as an uncertainty in determining the value of some quantity as it was preceding the measurement (for instance, in Heisenberg’s microscope). In other cases (as, for instance, with the indeterminacy of momentum in the same microscope, or with the position indeterminacy in the single-slit experiment) there is no relation to the initial state of the object. It was already concluded in section 4.6.1 that in the microscope the momentum indeterminacy has relevance to the future rather than to the past, and, hence, is a property of the final state rather than of the initial one. This also holds true for the position indeterminacy in the single-slit experiment. As also emphasized by Ballentine [252] (cf. section 4.7.3), this latter quantity cannot even be interpreted as an inaccuracy of the result of a measurement of the position of the incoming particle, because no position measurement is performed. There is only a conditional preparation by selecting those particles that have not been halted by the screen. For this reason the position indeterminacy, too, is a characteristic of the final state of the interaction process with screen S. A similar conclusion is drawn by Margenau, when he states that in these cases the indeterminacies refer to a preparation of a final state rather than to a measurement in the initial one.
Margenau’s criticism regards one of the main weaknesses of the Copenhagen interpretation, viz, the blurring of ‘preparation’ and ‘measurement’ caused by Bohr’s ban on analysis of the measurement process (cf. section 4.2.2). Although there
is a clear distinction between the interpretations of the indeterminacies by Bohr and by Heisenberg, this distinction was not felt as a real problem. Due to the unanalyzability there seems to be only one indeterminacy for each physical quantity, valid for the measurement process as a whole. In Bohr’s view no distinction between different phases of the measurement process is thought to be possible. Since for Heisenberg preparation of the final state of the object constitutes the essential part of a measurement (cf. section 4.6.1), no terminological difference with Bohr was evident: for both of them the Heisenberg inequality is valid as a result of measurement. There is a world of difference, however. It is Margenau’s observation that quantum mechanical practice is quite different from the Copenhagen one. In a measurement of observable A a well-defined (eigen)value is found in each individual realization of the measurement. The probability distribution of the values is determined by the initial state (cf. section 1.1), and, hence, seems to be uninfluenced by the measurement. The indeterminacy inherent in the probability distribution is represented in the formalism by the standard deviation defined by (1.9), expressing, according to Margenau, our lack of knowledge as regards the precise value of observable A during the preparation of the initial state. On the other hand, towards the future there does not seem to be any indeterminacy as regards the measured observable A, at least not if the measurement is of the first kind (cf. sections 3.2.4 and 4.6.6), as was generally assumed to be the case. Then a repeated measurement of the same observable A will yield the same value. If this is correct then there is no indeterminacy of A at all with respect to the future. It seems that there is a large difference between the measures of indeterminacy of A in different phases of the measurement process.
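The contrast drawn here between the two phases can be made concrete with a small numerical sketch. It rests on assumptions that are not taken from the text: the observable has a discrete, nondegenerate spectrum, the state is given by its amplitudes in the eigenbasis of A, and the standard deviation is computed in the usual way from the probability distribution (which is presumably what (1.9) defines). The function names are hypothetical.

```python
import math


def probabilities(amplitudes):
    """Normalized probabilities |c_m|^2 of the eigenvalues of A."""
    weights = [abs(c) ** 2 for c in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]


def standard_deviation(eigenvalues, amplitudes):
    """Standard deviation of A in the state with the given eigenbasis amplitudes."""
    probs = probabilities(amplitudes)
    mean = sum(p * a for p, a in zip(probs, eigenvalues))
    variance = sum(p * (a - mean) ** 2 for p, a in zip(probs, eigenvalues))
    return math.sqrt(max(variance, 0.0))


def state_after_first_kind_measurement(amplitudes, outcome_index):
    """Amplitudes after an ideal first-kind measurement that found `outcome_index`."""
    return [1.0 if m == outcome_index else 0.0 for m in range(len(amplitudes))]


if __name__ == "__main__":
    eigenvalues = [-1.0, 1.0]                      # hypothetical spectrum of A
    initial = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # hypothetical initial state
    final = state_after_first_kind_measurement(initial, outcome_index=0)
    print("before:", standard_deviation(eigenvalues, initial))
    print("after: ", standard_deviation(eigenvalues, final))
```

In the initial state the standard deviation is nonzero, whereas in the state left behind by a first-kind measurement it vanishes, so that a repeated measurement reproduces the same value.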
Margenau ([163], section 19.10) reproaches Bohr for a dangerous kind of agnosticism in maintaining ‘complementarity’ as a category of the mind. By this, and by the concomitant requirement of a description of measurement in classical terms, the impossibility of a more detailed analysis of the measurement process in terms of the difference between ‘preparation’ and ‘measurement’ becomes a necessary attribute of cognition. Einstein rightly characterized this as a “tranquilizing philosophy”, which according to Margenau may even be dangerous because it can lead to pseudo solutions. Although in this respect Margenau in the first place had in mind applications of the ‘complementarity principle in a wider sense’ (even outside physics), we shall see in chapter 7 that this issue is important also within quantum physics. Margenau also shows how to implement his criticism of Bohr’s philosophy, viz. by describing the measurement process in terms of wave functions and Schrödinger equations instead of using the classical terms of the principles of ‘correspondence’ and ‘complementarity’. This is consistent with the necessity of a quantum mechanical description of the measurement process, found in section 4.3.4. Actually, Margenau seems to be right in his observation that in its practical applications physics has already chosen this way from the beginning (cf. chapter 3).
Notwithstanding Bohr’s ban, in actual practice quantum mechanical measurements have been described using quantum mechanics; ‘complementarity’ as a category of knowledge is seldom, if ever, encountered in quantum mechanical practice^21. This should not be interpreted as a disavowal of the whole idea of complementarity, however. Incompatibility of observables is an essential feature of quantum mechanics, entailing deviations from classical physics which cannot be eliminated in the way favored by Einstein. In order to see this clearly it is necessary to abandon Bohr’s ‘correspondence’ thinking and to accept the necessity of applying quantum mechanics also to the measurement process. In section 7.10 it will be seen that it is possible to give ‘complementarity’ a sound physical basis precisely by means of a quantum mechanical description of the measurement process. Then, however, ‘complementarity’ is not an a priori category of knowledge, but merely a peculiarity of that part of reality that is described by quantum mechanics.
4.7.3 Ballentine
As is well known (e.g. Jammer [216], section 6.8), it has been assumed for a long time that Bohr was able to answer Einstein’s criticisms in a sufficient way. Consequently, for several decades the Copenhagen interpretation has been considered as the only reasonable interpretation of quantum mechanics, the “orthodox” interpretation. This may be one reason that Margenau’s criticism, too, has not been very influential. Another reason may be that Margenau’s early criticism [163] was not yet based on a thorough analysis of the measurement process. Important progress was made only later on, especially in his collaboration with Park [253, 254, 255]. When comparing the “orthodox” doctrine with the way quantum mechanics is dealt with in actual practice, it is not surprising that some doubts always remained. In 1970 Ballentine [252] articulated these doubts in a new attack on the Copenhagen interpretation, in which Margenau’s distinction between ‘preparation’ and ‘measurement’ plays an important role. Like Margenau, Ballentine emphasizes that particle diffraction experiments like the single- and double-slit experiments should not be interpreted as position measurements, but as preparations of the state the particle is in after passing the slit(s)^22. Indeed, in the experiment of figure 4.2 it is certainly not true that the position observable is measured in the sense that a value is determined for every particle impinging from the left. Only the particles passing through the slits are considered. In the single-slit experiment a position indeterminacy is attributed to these particles. This does not allow any conclusion to be drawn on the position distribution in the incoming state, since the particles that are stopped by the screen are completely left out of
consideration. In fact, slits are typically used for preparation purposes, preparing particles in states localized in some region. Also in the case of figure 4.2 the effect of the slit in screen S can be interpreted as realizing a new state that is nonvanishing only at the position of the slit. Observation of the diffraction pattern at screen B is a measurement performed in this new state. In the double-slit experiment inequality (4.7) can be seen as a property of the state of those particles passing through the central maximum of the interference pattern at screen B (to be selected by means of a slit placed at the position of this maximum). The quantities appearing in (4.7) are actually the standard deviations of position and momentum in this latter state. An analogous remark holds for the microscope, discussed in section 4.5.3, with respect to the position and momentum indeterminacies satisfying (4.14). Here, too, these quantities can be interpreted as corresponding to the final state of those particles that are detected by the microscope. Although the position indeterminacy can also serve as a measure of the inaccuracy of the position measurement, this certainly does not hold true for the momentum indeterminacy: as Ballentine [252] observes, the microscope performs only a position measurement, not a measurement of momentum.
^21 Thus, in the 1937 textbook by Kemble [157] the word ‘complementarity’ is not to be found.
^22 Margenau ([163], p. 375) refers to Kemble [157] who had already observed this in 1937.
In order to understand Ballentine’s point of view it is necessary from now on to distinguish clearly between the two different meanings of ‘indeterminacy’ already alluded to in section 4.5.1. Preparation and measurement may both be responsible for a certain ‘indeterminacy’. In order to concur with present textbook usage, in which the Heisenberg inequality (1.78) is referred to as an ‘uncertainty relation’, from now on ‘indeterminacy due to preparation’ will be referred to as ‘uncertainty’. A measure of the ‘uncertainty’ of observable A is the standard deviation defined by (1.9), which, in classical physics, represents just our lack of knowledge as to the precise value of A realized in the preparation.
The important point is that this standard deviation is a functional of the initial state only, and does not depend on the measurement procedure. Although, as will be seen in chapters 6 and 9, it is not possible to interpret it in precisely the classical way referred to above, in any case it is possible to attribute this uncertainty to the preparation rather than to the measurement (de Muynck [129], see also section 7.10). There also exists an ‘indeterminacy due to measurement’ related to the way the measurement is performed. In the following this ‘indeterminacy due to measurement’ will be referred to as ‘inaccuracy’. In chapter 7 ‘inaccuracy’ will be implemented into the mathematical formalism by the introduction of the notion of ‘nonideal’ measurement. ‘Complementarity’ as a restriction on simultaneous measurement of incompatible observables is related to this latter form of indeterminacy.

It is emphasized by Ballentine [252] that, as already observed by Heisenberg (cf. section 4.6.1), the Heisenberg uncertainty relation is a property of preparation (in Heisenberg’s terminology: “directed toward the future”). In Heisenberg’s view the impossibility of simultaneous measurement of position and momentum is actually an impossibility of preparing, by means of a simultaneous measurement, a state
in which position and momentum are more sharply determined than is consistent with the Heisenberg inequality. Ballentine’s conclusion is that inequality (1.78), being a property of the state preceding the measurement, has no relevance at all to simultaneous measurement of incompatible observables. It does not at all follow from this inequality that it is impossible to simultaneously measure position and momentum of a particle with inaccuracies such that their product is arbitrarily close to zero, and, hence, does not satisfy (4.15). As an example Ballentine considers diffraction of a particle through a slit (cf. figure 4.2). A particle hitting screen B in a certain point is observed as a spot appearing on the screen (cf. figure 4.4), the extension of which is a measure of the (in)accuracy in determining the particle’s position. Each particle hitting screen B must have passed through the slit in screen S, and is supposed to behave as a classical free particle while covering the distance L between S and B. The particle’s momentum is inferred from the position of the spot and the distance L, so that the inaccuracy of the momentum determination follows from the inaccuracy of the position determination and decreases as L increases. By choosing L very large it is possible to make the product of the two inaccuracies arbitrarily small, thus violating (4.15). Note that this is in complete agreement with Heisenberg’s ideas (cf. section 4.6.1). Ballentine draws a number of conclusions from this:

It is possible to perform a simultaneous measurement of position and momentum with inaccuracies violating the Heisenberg relation (4.15). This is not at variance with the Heisenberg inequality (1.77), because the meanings of the standard deviations of position and momentum appearing in (1.77) are different from the meanings of the corresponding inaccuracies in (4.15). The latter quantities are inaccuracies in the sense attributed by the complementarity principle to the simultaneous measurement of position and momentum of an individual particle. On the other hand, the standard deviations are statistical quantities expressing the spreading (scatter) of measurement results obtained when measurements are performed on a large number (ensemble) of particles (cf. chapter 6).
Inequality (1.77) does not refer to the simultaneous measurement of position and momentum. It is possible to verify (1.77) by measuring position and momentum separately in different subensembles. In this case there cannot be any mutual disturbance of the measurements, which, allegedly, is the cause of the inequality. Yet the standard deviations of position and momentum satisfy (1.77) also in this case. According to Ballentine, therefore, (1.77) does not express a property of simultaneous measurement, but of preparation: it is not possible to prepare an ensemble of particles such that the statistical spreadings (dispersions) of the individual values of position and momentum violate inequality (1.77). Ballentine proposes to refer to (1.77), and more generally to (1.78), as a ‘statistical dispersion principle’.

According to Ballentine it is possible to attribute to each particle of the ensemble well-defined values of position and momentum, just like in classical statistical mechanics.
However, within the domain of quantum mechanics it is impossible to control the preparation so as to prepare all values of position and momentum within a region of phase space smaller than one consistent with (1.77).

Ballentine approvingly refers to Prugovecki [256], who asserts that (standard) quantum mechanics deals only with measurement of one single observable (or a set of commuting observables), and for this reason does not say anything about simultaneous measurement of incompatible observables, and, hence, cannot pose any restriction on such measurements. In order to be able to also describe simultaneous measurement of incompatible observables the mathematical formalism would have to be extended. However, Prugovecki’s proposal [256] for such an extension is rejected by Ballentine because it would entail negative probabilities.
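The two sides of Ballentine’s argument can be sketched numerically. The following is an illustration under assumptions that are not taken from the text: the slit geometry is treated in the small-angle approximation, so that a position inaccuracy `delta_y` at screen B corresponds to a momentum inaccuracy `p * delta_y / distance`; the ensemble is assumed to be prepared in a minimum-uncertainty Gaussian state; units are chosen such that ħ = 1. All names are hypothetical.

```python
import random
import statistics

HBAR = 1.0  # natural units; an assumption made for readability


def slit_inaccuracy_product(p, delta_y, distance):
    """Product of position and momentum inaccuracies in the slit geometry.

    Assumes classical free flight over `distance` and small angles, so that
    the transverse momentum is p * y / distance and its error is
    p * delta_y / distance.
    """
    delta_p = p * delta_y / distance
    return delta_y * delta_p


def separate_subensemble_dispersion_product(sigma_q, n_per_subensemble, seed=0):
    """Sample position and momentum results in two disjoint subensembles.

    Assumes a minimum-uncertainty Gaussian state, for which both outcome
    distributions are normal, with widths sigma_q and HBAR / (2 * sigma_q).
    """
    rng = random.Random(seed)
    sigma_p = HBAR / (2.0 * sigma_q)
    q_results = [rng.gauss(0.0, sigma_q) for _ in range(n_per_subensemble)]
    p_results = [rng.gauss(0.0, sigma_p) for _ in range(n_per_subensemble)]
    return statistics.pstdev(q_results) * statistics.pstdev(p_results)


if __name__ == "__main__":
    for distance in (1.0, 10.0, 1000.0):
        print(distance, slit_inaccuracy_product(p=1.0, delta_y=1.0, distance=distance))
    print(separate_subensemble_dispersion_product(sigma_q=0.3, n_per_subensemble=20_000))
```

In this sketch the product of inaccuracies shrinks without bound as the distance grows, whereas the product of the sample standard deviations stays close to ħ/2 even though position and momentum are never measured on the same particle. Since the assumed Gaussian state saturates the inequality, the sampled product may fall slightly below ħ/2 through statistical fluctuation alone.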
4.7.4 Recent developments
As we shall see in section 6.4.2 the simultaneous attribution of values of incompatible observables, independently of measurement, to a microscopic object entails fundamental problems (see also section 9.4). For this reason this part of Ballentine’s ideas, essentially agreeing with the ‘possessed values’ principle (defined in section 2.3), is not taken over in this book. However, the idea that in assessing the meaning of the Heisenberg inequality the distinction between ‘preparation’ and ‘measurement’ should be duly taken into account, is of utmost importance. Ballentine’s insight that quantum mechanics does not say anything about simultaneous measurement of incompatible observables is consistent with the theorem, proven in section 1.9.2, that the existence of a joint probability distribution of two standard observables entails their compatibility. For this reason simultaneous measurement of incompatible observables is not describable by the standard formalism as developed by Dirac and von Neumann. In contrast to Ballentine’s above-mentioned opinion it will be seen, however, that quantum mechanics does have quite a lot to say about the (in)accuracies with which incompatible observables can be measured simultaneously. It turns out that the original intuition of Bohr and Heisenberg, as framed in the complementarity principle and its application to the ‘thought experiments’, is basically correct. In order to illustrate this, let us once more consider the double-slit experiment of section 4.5.2. Ballentine is right in concluding that inequality (4.7) does not have any connection to simultaneous measurement of position and momentum in the initial state, but refers to standard deviations in the final state of a subensemble of particles. This, however, is not the complete story of the double-slit experiment! This experiment does also possess an aspect of simultaneous measurement, expressed by relation (4.11). Although this relation has the same form as inequalities (4.2)
and (4.7), its meaning is completely different. The quantities appearing in (4.11) are not standard deviations in (sub)ensembles of measurements in which either position or momentum is measured; they refer to the (in)accuracies of position and momentum of one and the same individual particle, determined in one single measurement. These (in)accuracies are properties of the measurement process! As a matter of fact, the inaccuracy of a determination of position is determined through (4.9) by the momentum fluctuations of screen S; analogously, the inaccuracy of a determination of momentum is determined through (4.10) by the position fluctuations of screen S. This implies that inequality (4.11) does have a meaning as a restriction on the accuracies with which position and momentum can be measured simultaneously. In this particular experiment this restriction is a consequence of the quantum mechanical restriction on the preparation of (a part of) the measuring instrument, represented by (4.8). The mistake made by Bohr and Heisenberg in the analysis of the double-slit experiment, and repeated over the years in quantum mechanics textbooks, is that no distinction was drawn between the two different aspects of quantum mechanical experiments, viz. ‘preparation’ and ‘(simultaneous) measurement’. With Bohr the Heisenberg inequality refers to the possibility of simultaneous definition of position and momentum in the context of a measurement, no distinction being made between different phases of the measurement. With Heisenberg (in)accuracy of measurement automatically becomes (in)accuracy of preparation, because for him a measurement is nothing but a preparation.
The fundamental difference between Bohr’s view (indeterminacy relations refer to the situation during measurement) and Heisenberg’s (indeterminacy relations refer to the situation after the measurement) has remained largely unnoticed in textbooks of quantum mechanics, as has the difference between these views and Ballentine’s (uncertainty relations refer to a preparation, either before the measurement, or by the measurement). In this respect the criticisms by Margenau and Ballentine, to the effect that a clear distinction between ‘preparation’ and ‘measurement’ is necessary, are fully justified (see also table 4.1). Ballentine’s insight that Heisenberg’s inequality (1.77), derived from the standard formalism, does not have any relation to simultaneous measurement of Q and P, seems to be correct. This does not at all imply, however, that the ‘thought experiments’ would not be related to the problem of joint measurement of Q and P. In this respect the ideas of Bohr and Heisenberg are quite reliable. Their problem was only that they did not have at their disposal a mathematical formalism suited to express this relatedness. In order to obtain such a formalism it is necessary to generalize the mathematical formalism developed by Dirac and von Neumann, so as to allow a description of the simultaneous measurement of incompatible observables, and to have a mathematical representation of joint probability distributions of such observables. The way such a generalization can be arrived at is well known by now (see section 1.9 and chapter 7). Ballentine’s objection against Prugovecki’s negative probabilities is not applicable to this generalization, since all probabilities
are non-negative. From a description of simultaneous measurement of incompatible observables by the generalized formalism it will become apparent that Ballentine’s criticism of the interpretation of Heisenberg’s inequality (1.77) was completely justified (cf. section 7.10): this relation should be interpreted as a property of preparation rather than measurement. In chapters 7 and 8 it will be seen, however, that the idea of complementarity as a restriction imposed on the simultaneous measurability of incompatible observables (as is, for instance, evident in the double-slit experiment) is justified too, and that Ballentine’s assertion that quantum mechanics has nothing to say about such measurements applies only to the standard formalism. This latter formalism can describe only simultaneous measurement of compatible observables, and is unable to yield a formal description of the restrictions to which simultaneous measurement of incompatible observables is subject. Indeed, the identification of the restriction (4.11) with the formal relation (1.77) is not justified. This does not imply, however, that this restriction itself would not exist! In section 7.10 a relation will be derived from the generalized formalism that is unambiguously interpretable as a restriction to be satisfied by the (in)accuracies of a determination (in a determinative sense, cf. section 4.6.1) of incompatible observables if these are measured simultaneously. Like the Heisenberg inequality (1.77) this relation follows from the (generalized) mathematical formalism of quantum mechanics. Its physical significance is completely different, however. Given the existence of two qualitatively different relations (one for preparation and one for the simultaneous measurement of incompatible observables), it is possible to properly assess the significance of the ‘thought experiments’ for the concept of ‘complementarity’.
Bohr and Heisenberg had only one single relation at their disposal, viz. (1.77), which, moreover, appeared to be devised so as to represent within the formalism indeterminacy relations like (4.2), (4.7), (4.11), and (4.14). Under these circumstances an identification of these relations is understandable (although not correct). In particular, the formal similarity of (4.11) and the Heisenberg inequality (1.77) can be traced back to the fact that in the derivation of (4.11) use is made of (4.8). This is a Heisenberg inequality valid for the initial state of screen S. This relation is, as it should be, a restriction on the preparation of this latter state. Inaccuracy relation (4.11) is a consequence of the impossibility of preparing screen S with smaller dispersion than is allowed by (4.8). As a consequence the accuracies in the simultaneous measurement of position and momentum are limited by inequality (4.11). Notwithstanding the deceptive similarity in form of (4.8) and (4.11) there is no reason, however, to equate the inaccuracies appearing in (4.11) to the uncertainties (standard deviations) of position and momentum, respectively, in whichever state of the particle. According to Ballentine such an identification can be made in some of the examples, although certainly not using standard deviations of the initial state. It is appropriate to refer to inequality (4.11) as ‘Heisenberg inaccuracy relation’ rather than ‘Heisenberg uncertainty relation’.
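The distinction drawn here can be made explicit. The following is a sketch only, assuming that (1.77) has the standard form of the Heisenberg (Kennard-Robertson) inequality, so that the uncertainties it contains are standard deviations computed in a single state $\rho$ of the object. The symbols $\delta q$ and $\delta p$ for the inaccuracies are placeholders introduced here for illustration and need not match the notation of (4.11).

```latex
% Uncertainty relation: a property of the preparation \rho alone.
\[
\Delta Q\,\Delta P \;\ge\; \tfrac{1}{2}\,\bigl|\langle [Q,P] \rangle\bigr| \;=\; \frac{\hbar}{2},
\qquad
(\Delta A)^2 \;=\; \mathrm{Tr}\,\rho A^2 - \bigl(\mathrm{Tr}\,\rho A\bigr)^2 .
\]
% No quantity referring to a measuring instrument appears in \Delta Q or \Delta P.
% By contrast, \delta q and \delta p (hypothetical symbols) characterize the
% measurement procedure, and in general
\[
\delta q \neq \Delta Q , \qquad \delta p \neq \Delta P .
\]
```

Read in this way, the first relation constrains how the object can be prepared, whereas an inaccuracy relation constrains how well a single arrangement can determine both observables at once.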
As will become evident from the formal treatment given in chapters 7 and 8, Bohr’s idea that complementarity is an a priori category of our cognition may be one cause of the confusion around the complementarity principle. As, for instance, is clear from the particle-wave duality (cf. section 4.6.4), complementarity is viewed as an opposition of two different classical pictures, corresponding to measurement arrangements for the observables of position and momentum, respectively (screen S either being fixed, or freely moving). This picture is consistent with the standard formalism, in which only these measurements can have a formal description. In the ‘thought experiments’, however, as well as in certain formulations of the correspondence principle, intermediate situations are also allowed (for instance, one in which screen S is supported by a spring with a finite spring constant). According to the correspondence principle, in such an intermediate physical situation the classical laws would be applicable only with latitudes satisfying Heisenberg’s indeterminacy relation. Unfortunately, only the limits of vanishing and infinite spring constant correspond to the complementary pictures actually discussed by Bohr. In the early discussions of complementarity it remains vague how intermediate situations can be implemented. In chapters 7 and 8 it will be demonstrated that a precise account of such intermediate situations is necessary to understand the complementarity principle as a principle governing Bohr’s latitudes in a simultaneous measurement of incompatible observables. Description of such measurements will need the generalization of the standard formalism referred to above. In particular, this generalized treatment will implement in a more precise way assertions like the one that interference must disappear if it is determined through which slit the particle went (induced by Bohr’s treatment of the double-slit experiment, cf.
section 4.5.2): different accuracies of ‘which-slit’ measurements induce different disturbances of the interference pattern; complete wiping out is caused only by a 100% accurate ‘which-slit’ determination (Wootters and Zurek [257], Mittelstaedt et al. [258]).
We close this section by presenting a concise overview (cf. table 4.1) of the different ways indeterminacy of an observable A is implemented by different authors, and the phases of the measurement process these implementations belong to. The diversity of these implementations highlights the confusion with respect to ‘complementarity’, caused by reasonings based on a too restricted quantum mechanical formalism, viz. the standard formalism. It seems that a complete understanding of complementarity can be achieved only on the basis of a detailed quantum mechanical account of the measurement process, distinguishing ‘uncertainty’, expressed by standard deviation (either in the initial or in the final object state), from ‘inaccuracy’ induced by the measurement.
4.8 Complementarity and empiricist interpretation
The criticisms of the complementarity principle, discussed in section 4.7, are advanced within the context of a realist interpretation of quantum mechanical observables, in which these are considered to be properties of the microscopic object. In this realist understanding the role of the measuring instrument is a rather modest one: the physical quantities are either defined by it (Bohr), or disturbed by it (Heisenberg). This makes the realist interpretation a contextualistic one: microscopic reality, as far as described by quantum mechanics, depends on the measurement context. However, neither in Bohr’s philosophy nor in Heisenberg’s physics, nor in the criticisms by Einstein and Ballentine, is any attention paid to the measuring instrument in its role as an intermediary translating information from the world of microscopic objects to the macroscopic world directly accessible to us. Although the measuring instrument is thought to be important, this latter function is completely absent in the idea of complementarity as discussed above. The measuring instrument seems to be there mainly for disturbing the microscopic object, or for preparing it in a final state. Moreover, in the standard quantum mechanical formalism as reproduced in chapter 1 there is no single term referring to the measuring instrument. Therefore, what a quantum mechanical measuring instrument is, and what it does, remains in the dark. That the measuring instrument should have a (macroscopic) pointer, and that a correlation should be established between this pointer and the microscopic object in order that the instrument be of any use for generating information on the object, remained outside the scope of the discussion. Schrödinger [104] even explicitly dismissed it as an irrelevant detail23.
23 “(dass an dem letzteren [i.e. the instrument, WMdM] die Ablesung gemacht wird, ist mehr eine Äusserlichkeit).” In English: “(that the reading is taken on the latter is more of a superficial matter).”
Yet, this function of the measuring instrument is a most important one. What other reason could one have to put a measuring instrument between object and observer than the amplification of microscopic information to the macroscopic level? Certainly not to merely disturb the object, nor to induce the object to have a well-defined value of the measured observable. The primary objective of measurement is to obtain information. That this may be accompanied by a disturbance of the object is not unimportant, but this is an unpleasant side-effect rather than the main issue. Strictly speaking, the only empirical information obtained from a quantum mechanical measurement is a macroscopic phenomenon (e.g. a click in a Geiger counter, a track in a Wilson or bubble chamber, etc.), observed with a certain relative frequency. In a realist interpretation such a phenomenon is translated into a property of the microscopic object. It is important to note that an empirical basis for such a translation is lacking, because the microscopic object is not accessible to direct observation. In an empiricist interpretation the measurement result does not have an extra significance over its meaning as a macroscopic pointer position, even though the labeling of these pointer positions will often be chosen on the basis of reasonings employing the idea of ‘correspondence’. Bohr’s views with respect to ‘correspondence’ and ‘complementarity’ (although having primarily a conceptual rather than an ontological character) are presumably closer to a contextualistic-realist than to an empiricist interpretation of quantum mechanics. In chapter 5 it will be seen that this may have influenced Bohr in answering Einstein’s attacks.
The discussion between Bohr and Einstein is considerably impaired by the absence of a clear distinction between, on one hand, properties of the microscopic object (perhaps disturbed by the measurement, or just well-defined within the measurement context), and, on the other hand, pointer positions as properties of the measuring instrument rather than the microscopic object. It is the empiricist interpretation which draws the attention to this difference, and which, consequently, offers a new perspective for approaching the problems of quantum mechanics. In particular, the distinction between properties of the microscopic object (physical quantities) and properties of the measuring instrument (pointer positions) yields the opportunity to distinguish between two different modes of disturbance. The first one is Heisenberg’s original disturbance idea (cf. section 4.6), in which the object is disturbed by the interaction with a measuring instrument. The second one is a concept of disturbance of the measuring instrument, to be discussed in chapter 7. In comparing different methods of determining the value of a particular physical quantity it may be found that one measurement arrangement yields a more accurate account of physical reality (more precisely: of the preparation symbolized by the density operator) than another. This may be caused by the fact that different methods may have differently disturbing influences on the process leading to a final pointer position (see section 7.2 for an example). Both kinds of disturbance can
be simultaneously active; but the processes are conceptually completely different. The lack of distinction between these two kinds of disturbance has been a source of confusion in the discussion on the meaning the ‘thought experiments’ have for the concept of ‘complementarity’. An important example of this will be met in discussing the (‘thought’) experiment proposed by Einstein, Podolsky and Rosen (cf. chapter 5). This discussion is usually interpreted in a realist sense. The empiricist point of view offers a new perspective for assessing its significance.
Chapter 5 The Einstein-Podolsky-Rosen problem
5.1 Introduction
In the discussion between Einstein and Bohr on the completeness of quantum mechanics Einstein always tried to demonstrate by means of ‘thought experiments’ that, in contrast to Bohr’s contention, sharper values of position and momentum can be attributed to a particle than is admitted by the Heisenberg inequality (1.77). Bohr, however, was always able to refute Einstein’s ‘thought experiments’ by showing that Einstein did not take into account the disturbing influence of the measurement arrangement. Time and again Bohr demonstrated (cf. [202]) that the measurement arrangement has such a disturbing influence (remember Heisenberg’s disturbance theory of measurement, section 4.6) that the Heisenberg inequality is satisfied. For Einstein, Bohr’s solution was unsatisfactory because he did not want to accept quantum mechanics as a theory describing microscopic reality only as far as it is interacting with a measuring instrument. For Einstein the ideal was a description of microscopic reality as it is, independently of any human interference, i.e. quantum mechanics as a description of an objective reality (cf. section 4.7) that is independent of the observer including his measuring instruments. The paper by Einstein, Podolsky and Rosen (EPR) [250] must be seen as an ultimate attempt in the discussion with Bohr to prove the incompleteness (or even the inadequacy) of quantum mechanics in a way not vulnerable to Bohr’s earlier rebuttals. EPR tried to achieve this goal by considering a physical situation in which explicit care is taken that the measured object does not interact with any measuring instrument, i.e. that we are dealing with an objective reality rather than with an observed reality that is disturbed by the measurement interaction.
The EPR problem has given rise to a gigantic literature. Since it is not always clear which interpretation of quantum mechanics is used, this literature is sometimes rather confusing. In particular this holds true for Bohr’s answer [259], which followed very soon after the publication of the EPR paper [250]. This answer will be discussed in section 5.3. In order to avoid confusion it will be necessary to be as explicit as possible with respect to interpretation. Let us first consider the empiricist interpretation of quantum mechanics (section 2.2). In this interpretation the EPR problem is not really a problem, since the theory is not thought to describe microscopic reality, but rather relations between preparations and measurements. Had Bohr been an empiricist, then his answer to EPR could have been very brief, namely, that quantum mechanics is not applicable to the EPR problem in the way considered by EPR, i.e. as a measurement on a particle not interacting with any measuring instrument. In order that the experiment be a quantum mechanical measurement the microscopic object will have to interact with a measuring instrument. Only in this way is it possible to induce in the latter a transition to a pointer position corresponding to a value of the measured observable. Hence, in an empiricist interpretation the notion of a quantum mechanical measurement is not applicable at all to the objective reality that is so crucially important to EPR. Bohr, however, took the EPR problem very seriously. Most probably the reason for this is that Bohr (rightly) took the EPR problem as challenging his philosophy of ‘complementarity’ (section 4.4) rather than questioning any empiricist tendency in his interpretation of quantum mechanics. In the discussion on the ‘complementarity’ issue the central role had always been played by physical quantities like position and momentum. 
The EPR paper [250], however, is formulated only very partially in terms of physical quantities; as is common nowadays the argumentation is cast mainly in terms of quantum mechanical states1, interpreted in an objectivistic-realist sense. For Bohr’s instrumentalist understanding an interpretation of the state vector as a description of an objective reality must have been completely unacceptable. Yet, for Bohr this was no reason to brush aside the EPR reasoning as irrelevant. In his answer [259], in which there is hardly any reference to the state vector, he was able to reformulate the problem in terms of physical quantities (see section 5.3), thus allowing him to apply his concept of ‘complementarity’. As we saw in section 4.3, Bohr’s conception of physical quantities (observables) is presumably a realist one (cf. section 2.3), in the sense that, like Einstein, he treated these as properties of the microscopic object. The only difference between Bohr and Einstein in this respect seems to be that for Bohr physical quantities had a meaning only within the context of a measurement (contextualistic-realist interpretation), whereas Einstein considered these quantities as objective properties of the microscopic object, independent of the observer including his measuring equipment (objectivistic-realist interpretation). It seems to be the realist understanding of physical quantities (observables) that was for Bohr a sufficient basis to take the EPR problem seriously. Bohr appears to accept EPR’s realism with respect to observables, but in his answer he remained true to his conviction that the measurement context cannot be ignored, not even in the situation considered by EPR, in which the microscopic object does not interact with a measuring instrument. It must be this realist interpretation of quantum mechanical observables, shared by Bohr and Einstein, that provided sufficient common ground for a discussion that is still continuing. Had Bohr been an empiricist with respect to observables, then his answer to EPR could have been considerably simpler, and perhaps less confusing. We shall now first discuss the EPR problem in Bohr’s formulation, i.e. in terms of physical quantities. In section 5.4 the formulation in terms of state vectors is given.
1 For this Rosen seems to be responsible (cf. [260]).
5.2 Formulation of the EPR problem in terms of physical quantities
5.2.1 The EPR reasoning
The EPR problem deals with a system of two coupled particles, particle 1 and particle 2, that have interacted in the past, but, now being separated, can henceforth be considered as non-interacting. We assume that the correlations, established during the interaction, are conserved during the separation. Let $Q_i$ and $P_i$ be position and momentum observables2, respectively, of particle $i$, $i = 1, 2$. Then $[Q_i, P_i] \neq 0$, whereas any observable of particle 1 is compatible with those of particle 2. This implies that
$$[Q_1 - Q_2,\; P_1 + P_2] = 0, \tag{5.1}$$
causing the Hermitian operators $Q_1 - Q_2$ and $P_1 + P_2$ to have common eigenvectors. EPR assume that the state of the coupled system is represented by such a common eigenvector3, with eigenvalues, say, $q$ and $p$, respectively. It is important to the EPR reasoning that in this state a joint measurement of $Q_1$ and $Q_2$ yields values $q_1$ and $q_2$, respectively, such that
$$q_1 - q_2 = q, \tag{5.2}$$
but that $q_1$ and $q_2$ are not fixed: by repeating the experiment we may obtain for $q_1$ and $q_2$ all possible pairs of real values, such that (5.2) is satisfied for each pair. Stated differently, the measurement results of $Q_1$ and $Q_2$ are correlated. This correlation is the first point of crucial importance in the EPR reasoning. The second point is that the same reasoning can be set up with respect to momentum. Thus, if we do not measure $Q_1$ and $Q_2$ but $P_1$ and $P_2$ instead (taking the same state vector as before), then we obtain, on repeating the experiment, an ensemble of all possible measurement results $p_1$ and $p_2$, respectively, correlated according to
$$p_1 + p_2 = p. \tag{5.3}$$
EPR assume that $q$ and $p$ are known. Then it follows from (5.2) and (5.3) that $q_2$ and $p_2$ can be calculated from $q_1$ and $p_1$, respectively. This implies that it is not strictly necessary to actually perform the measurement on particle 2 to know its result! Due to the correlations between $Q_1$ and $Q_2$ as well as between $P_1$ and $P_2$, respectively, it is sufficient to perform only the measurement on particle 1. The value of the correlated observable of particle 2 can be calculated from the measurement result for particle 1. According to EPR we can obtain information on particle 2 without any interaction of the latter particle with a measuring instrument for either $Q_2$ or $P_2$. The experimental situation of the EPR problem is symbolically represented in figure 5.1.
2 In the EPR paper [250] the expression ‘physical quantity’ is used rather than ‘observable’. In a realist interpretation of observables these can be seen as equivalent.
3 EPR do not bother about the fact that these eigenvectors are not normalizable, and, hence, strictly speaking, cannot represent physical states. In practice this problem can be solved by taking normalizable superpositions of these eigenvectors, in which the spreading of the eigenvalues is too small to have any observable consequences. Another possibility is discussed in section 5.4.1.
For EPR this means that the values $q_2$ and $p_2$, respectively, can be considered to be objective properties of particle 2, properties this particle possesses independently of what is happening to particle 1. This idea is based on the fact that there is no interaction between the particles, nor between particle 2 and the measuring instrument for the measurement on particle 1, assumed to take place at a large distance from particle 2. Consequently, particle 2 cannot “know” which quantity
of particle 1 is measured. According to EPR this means that the value of a quantity of particle 2 cannot be disturbed by a measurement performed on particle 1. Hence, particle 2 must have had its value, found via (5.2) or (5.3) by means of a measurement on particle 1, beforehand. This means that particle 2 must have this value even if we did not perform a measurement on particle 1 at all. EPR consider the values of $Q_2$ and $P_2$ as what they call ‘elements of physical reality’, defined according to the famous formulation “If, without in any way disturbing a system, we can predict with certainty (i.e. with probability equal to unity) the value of a physical quantity, then there exists an ‘element of physical reality’ corresponding to this physical quantity” [250]. EPR conclude from this that $Q_2$ and $P_2$ must both have sharp values, namely those values we can find from the correlations (5.2) and (5.3) by measuring either $Q_1$ or $P_1$ on particle 1, but possessed by particle 2 independently of what happens to particle 1 (analogously to the way a classical object is thought to possess its (classical) properties objectively, determined completely by the initial preparation). Note that in this formulation we recognize the ‘possessed values’ principle defined in section 2.3, although this principle is applied here only to physical quantities (viz. $Q_2$ and $P_2$) that are liable to undisturbed measurement.
EPR conclude in the following way that quantum mechanics must be incomplete. To this end they employ the following definition of ‘completeness of a physical theory’: a physical theory is complete if and only if every ‘element of physical reality’ is represented in the theory. This implies that, for quantum mechanics to be complete, the sharp values $q_2$ and $p_2$ must be represented simultaneously in this theory. However, quantum mechanics does not satisfy this requirement. It is impossible to find any quantum mechanical state in which $Q_2$ and $P_2$ simultaneously have sharp values, since $Q_2$ and $P_2$ do not commute. Hence they do not have common eigenvectors, which are the only mathematical entities that could account for the simultaneous sharpness of the values of these quantities. For this reason quantum mechanics, according to EPR, does not satisfy their criterion of completeness.
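The commutation argument underlying the EPR reasoning can be written out explicitly. The following is a sketch, assuming the usual notation $Q_i$, $P_i$ ($i = 1, 2$) for the position and momentum operators of particle $i$ and the canonical relation $[Q_i, P_j] = i\hbar\,\delta_{ij}$; the eigenvalue labels $q$ and $p$ are chosen here for illustration only.

```latex
% Relative position and total momentum commute:
\begin{align*}
[Q_1 - Q_2,\; P_1 + P_2]
  &= [Q_1, P_1] + [Q_1, P_2] - [Q_2, P_1] - [Q_2, P_2] \\
  &= i\hbar + 0 - 0 - i\hbar \;=\; 0 .
\end{align*}
% Hence Q_1 - Q_2 and P_1 + P_2 admit common (generalized) eigenvectors,
% and in such a state the results on particle 2 follow from those on particle 1:
\begin{align*}
q_1 - q_2 = q \;&\Longrightarrow\; q_2 = q_1 - q , \\
p_1 + p_2 = p \;&\Longrightarrow\; p_2 = p - p_1 .
\end{align*}
% For particle 2 alone, however,
\[
[Q_2, P_2] = i\hbar \neq 0 ,
\]
% so no state vector is a common eigenvector of Q_2 and P_2.
```

The first computation shows why the correlations (5.2) and (5.3) can hold in one and the same state; the last line shows why the two values inferred for particle 2 cannot both be represented as sharp in any single quantum mechanical state.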
5.2.2 Discussion of the EPR reasoning
By trying to prove that certain ‘elements of physical reality’ cannot be simultaneously described by quantum mechanics EPR purported to demonstrate that the EPR experiment is at least partially outside the domain of application of quantum mechanics. Hence, the EPR reasoning hinges on the notion of ‘(in)completeness in a wider sense’ introduced in section 4.2.1: if quantum mechanics is not capable of describing certain ‘elements of physical reality’, then a different (subquantum) theory will be necessary for this purpose. It is important to remember that for Bohr ‘completeness of quantum mechanics’ had quite a different meaning, viz. ‘completeness in a restricted sense’, being intimately connected with ‘correspondence’ and ‘complementarity’, viewed as fundamental properties of quantum mechanical measurement (cf. section 4.2.2). Since ‘incompleteness in a wider sense’ is not incompatible with ‘completeness in a restricted sense’ Einstein’s conclusion could have been acceptable to Bohr (remember that Bohr himself did not believe in quantum mechanical ‘completeness in a wider sense’). Therefore, Bohr’s rejection of the EPR ‘elements of physical reality’ cannot be understood on the basis of any abhorrence of subquantum theories. If ‘completeness in a wider sense’ had been the issue, then he might as well have accepted the possibility of ‘elements of physical reality’ of a non-quantum mechanical nature like the ones contemplated by EPR. Yet, Bohr did disagree with Einstein’s conclusion. As will be seen in the following section the reason for this is that, according to Bohr, Einstein was at variance with the idea of quantum mechanical ‘completeness in a restricted sense’. Since this latter concept is valid only within the domain of quantum mechanics, this means that, according to Bohr, the ‘elements of physical reality’ considered by EPR must be within the domain of quantum mechanics. As these ‘elements’ were presented by EPR as corresponding to quantum mechanical measurement results, such a presumption was not unreasonable at all. Indeed, at that time (and even now) all measurement results of experiments of the EPR type seemed to be satisfactorily described by (generalized) quantum mechanics. For this reason the idea that non-quantum mechanical concepts would be necessary to understand such experiments seemed far-fetched. Bohr’s indulgence with respect to ‘incompleteness in a wider sense’ could be understood as meaning that, although concepts of a non-quantum mechanical nature may be necessary for other domains of experience, this does not hold true for the EPR proposal.
Being well within the domain of quantum mechanics, the EPR experiment could not serve as evidence of ‘incompleteness of quantum mechanics in a wider sense’ (even if quantum mechanics might finally turn out to be incomplete in this latter sense). Failure to appreciate the difference between ‘completeness in a wider’ and ‘in a restricted sense’ has caused quite a bit of confusion. Thus, it was assumed by Einstein that the EPR problem (as a problem of quantum mechanical ‘(in)completeness in a wider sense’) may be solved by adhering to an ensemble interpretation of the quantum mechanical state vector4 rather than an individual-particle one. However, if the ‘elements of physical reality’ are non-quantum mechanical, then no interpretation of quantum mechanics at all can account for their existence. Hence, an ensemble interpretation of the state vector would also not be able to do so5 (see also section 6.5). It seems that, for this reason, an answer to the EPR challenge would have to choose between, on one hand, acceptance of subquantum theories as (more) complete descriptions of reality (encompassing subquantum mechanical ‘elements of physical reality’), and, on the other hand, a rejection of the EPR notion of ‘element of physical reality’. On the basis of quantum mechanical ‘completeness in a restricted sense’ Bohr chose the latter option. The majority of physicists made this same choice, but for a completely different reason. Under the influence of logical positivism/empiricism ‘completeness of quantum mechanics’ was generally taken as ‘completeness in a wider sense’, relegating any subquantum (hidden-variables) theory to the realm of metaphysics. Only 30 years after the EPR paper was published were hidden-variables theories deemed interesting enough to check experimentally whether they are really necessary for the description of EPR-like experiments, or whether these experiments are fully described by quantum mechanics (for a review see Clauser and Shimony [262]). These experiments, related to the Bell inequality, will be discussed in chapters 9 and 10, accounting for developments that became possible after the belief in quantum mechanical ‘completeness in a wider sense’ had sufficiently declined. That these EPR-like experiments finally did corroborate quantum mechanics cannot be seen as a late justification of the choice for ‘completeness in a wider sense’ made in reaction to the EPR paper. As a matter of fact, at the time of the EPR paper the a priori ignorance with respect to their outcome was not smaller than it was 30 years later. Moreover, corroboration of the measurement results of these experiments by quantum mechanics no more implies the nonexistence of hidden-variables theories reproducing the results of quantum mechanics than corroboration of the mechanics of a billiard ball implies the nonexistence of a model of its atomic constitution. It means only that the experiments are within the domains of application of both quantum mechanics and, possibly, of such hidden-variables theories.
4 “Here also the coordination of the psi-function to an ensemble of systems eliminates every difficulty.” (Einstein [261], emphasis added [WMdM]).
5 According to Guy and Deltete [96] Einstein, notwithstanding assertions to the contrary, probably did not really believe that an ensemble interpretation of quantum mechanics would solve all problems, because he actually considered quantum mechanics to be an inadequate theory.
However, these considerations do not in any way reflect the discussion between Einstein and Bohr as it actually took place. As will be seen in the next section, Bohr’s answer to EPR does not at all address the issue of subquantum theories, but remains completely within the range of the quantum mechanical formalism and its interpretations. As a matter of fact, the main subject of the EPR discussion, as it actually took place, was not ‘(in)completeness of quantum mechanics in a wider sense’, but ‘(in)completeness in a restricted sense’. More particularly, the issue was whether the meaning of a quantum mechanical observable is determined by the measurement context (Bohr), or whether an objective meaning can be attributed to it, determined only by the preparation, and independent of any measurement to be performed later (Einstein). This question also regards the domain of application of quantum mechanics, but in a different way. It refers to the fundamental meaning of ‘correspondence’ and ‘complementarity’, according to the Copenhagen interpretation delimiting this domain. Both in the concepts of ‘correspondence’ and ‘complementarity’ the measurement arrangement plays a crucial role, and it actually is this role that is at stake here.
As seen in section 4.2.2, the Copenhagen interpretation is thoroughly contextual. For Einstein this contextuality was unacceptable. The EPR experiment was deliberately devised so as to be able to consider particle 2 as objectively prepared, independently of any measurement performed on particle 1. Although in the discussion between Bohr and Einstein the wording revolved around ‘(in)completeness’, the real issue in the EPR paper was the question of ‘objectivity versus contextuality’. This question does not seem to have any bearing on ‘(in)completeness in a wider sense’ since it could as well be asked in any subquantum theory. However, due to the positivistic ban on subquantum theories this latter question could be posed only much later (Bell [33, 263], see also section 10.5.2). As will be seen in the next section, Bohr’s answer to EPR did not address the non-quantum mechanical character of the EPR ‘elements of physical reality’. Tacitly assuming these to be quantum mechanical concepts, his main criticism was directed against the assumption of their objective existence, independently of the actual measurement arrangement.
5.3 Bohr’s answer to EPR
Bohr’s answer to EPR [259] boils down to the following:
If $Q_1$ is measured and measurement result $q_1$ is found, then, in agreement with (5.2), the value $q_2$ can be attributed to $Q_2$; as far as this is an ‘element of physical reality’ this holds true in the context of the $Q_1$ measurement.
If $P_1$ is measured and measurement result $p_1$ is found, then, in agreement with (5.3), the value $p_2$ can be attributed to $P_2$; as far as this is an ‘element of physical reality’ this holds true in the context of the $P_1$ measurement.
$Q_1$ and $P_1$ are incompatible quantities. Therefore they cannot simultaneously be sharply defined: if $Q_1$ is measured, then $P_1$ is not well-defined, and, for this reason, neither is $P_2$. Consequently, $p_2$ cannot be an ‘element of physical reality’ in the context of the $Q_1$ measurement. Hence, if $q_2$ is an ‘element of physical reality’, then this cannot hold true for the value of the quantity $P_2$ incompatible with $Q_2$.
Bohr [259] explicitly states that “.. the special problem treated by Einstein, Podolsky and Rosen ... does not actually involve any greater intricacies than the simple examples discussed above.” [i.e. the double-slit experiments with either a fixed or a movable screen, cf. section 4.5, WMdM]. In Bohr’s reasoning we recognize the application of the principles of ‘correspondence’ and ‘complementarity’ as discussed in chapter 4. This, indeed, seemed to be sufficient to take the edge off EPR’s incompleteness proof in a way completely
analogous to previous rebuttals of Einstein’s challenges. In the following we shall see, however, that, apart from a confusion on the different senses of ‘completeness’, in Bohr’s answer a number of other confusions also play a role, undermining its effectiveness. As already noted in section 5.1, an empiricist reaction to EPR could have been restricted to the remark that quantum mechanics is applicable to the EPR problem only to the extent that assertions are made about measurement results of measurements that are actually performed. This implies that for a statement about or to be meaningful in an EPR experiment, the corresponding measurement on particle 2 must actually be performed. A statement on the correlation of and would require that and are both measured, and their correlation recorded. This could be done by simultaneously measuring the compatible observables and and by determining the joint probability distribution (cf. section 1.3) of and The question of whether these observables are correlated according to (5.2) is an empirically relevant one only then. Under the influence of (logical) empiricism this has often been emphasized (e.g. Cantrell and Scully [264]), and the EPR problem has been denounced as metaphysical because it excludes its empirical verification in an essential way: as soon as a measurement arrangement is set up for verifying the EPR correlations, the crucial presupposition that particle 2 does not interact with a measuring instrument is no longer satisfied. In a strictly empiricist interpretation the simultaneous existence of incompatible ‘elements of physical reality’ for and corresponding to quantum mechanical measurement results would be excluded because of the disturbance of by the measurement of (and vice versa). 
The fact that Bohr took the EPR problem seriously must stem from his tendency to join EPR in their realist interpretation of quantum mechanical observables in which measurement results are identified with properties of the microscopic object (see also section 4.3). Bohr did not object to the attribution by EPR of values to and without these actually being measured. What he did object to was that these values would be objective properties of particle 2. In fact, Bohr was just using his old argument that the influence of the measurement arrangement should not be forgotten. In agreement with the correspondence principle (section 4.3.2) a quantum mechanical observable does not have a value outside its measurement context. For Bohr the whole experimental arrangement is important for defining the observable that is measured. For this reason it is not allowed, according to Bohr, to neglect the measurement on particle 1 when making statements about particle 2. This has as a consequence that the ‘element of physical reality’ introduced by EPR does not have an unambiguous meaning. Thus, either or is measured on particle 1. These observables, however, are incompatible, and correspond to complementary physical situations (section 4.4), implying that incompatible observables cannot simultaneously have sharp values. According to Bohr it is not allowed to leave this complementarity out of consideration, even when there is no interaction between
particles 1 and 2. As Bohr put this [259]: “Of course there is [...] no question of a mechanical disturbance of the system [i.e. particle 2, WMdM] under investigation during the last critical stage of the measuring procedure [i.e. when the particles have separated, WMdM]. But even at this stage there is essentially the question of an influence on the very conditions which define the possible types of predictions regarding the future behavior of the system6.” This citation is among the most cryptic statements made by Bohr. When Bohr was referring to ‘the possible types of predictions regarding the future behavior of the system’ he may even have had in mind ‘predictions of measurement results of observables to be actually measured on particle 2’. If this were true, then Bohr’s position would hardly differ from an empiricist one, since in that case measurements would be performed both on particle 1 and particle 2. Since these are compatible observables they can be measured jointly. Bohr’s statement could then refer to the bivariate probability distribution (1.26), or to the conditional probability distribution (1.28) representing the relative frequencies of the measurement results for particle 2, conditional on a given value of the observable measured on particle 1 (cf. section 5.4.3). If Bohr had indeed had this empiricist intention in making the statement cited above, and if he had found a way to make this clear in an unambiguous way, then an awful amount of confusion could have been evaded, because the discussion could have been restricted to information that can be obtained by direct observation of pointer positions of measuring instruments. This implies that particle 1 and particle 2 would both have to interact with a measuring instrument. 
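The relation between a bivariate probability distribution and the conditional distribution derived from it can be written out explicitly. The following is the standard definition from probability theory; the symbols $a_1$ and $a_2$ for the measurement results obtained on particles 1 and 2 are notation assumed here, and the expression is not claimed to reproduce (1.26) or (1.28) literally.

```latex
p(a_2 \mid a_1) = \frac{p(a_1, a_2)}{\sum_{a_2'} p(a_1, a_2')},
\qquad
\sum_{a_2} p(a_2 \mid a_1) = 1 .
```

Both sides refer only to relative frequencies of jointly recorded pointer readings, which is the sense in which such a reading of Bohr’s statement would be an empiricist one.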
For observables and Bohr could have applied his ‘complementarity’ concept on the basis of the interaction of particle 2 with the measurement arrangements for or respectively, and the mutual disturbance of these observables entailed by these interactions. Concomitantly, the EPR concept of ‘element of physical reality’ would have been found to be completely inapplicable, and the EPR challenge would have been beaten off effectively. It is a pity that Bohr did not offer this perspicuity, presumably because he did not intend the statement in this empiricist sense. As we saw in section 4.3 Bohr’s interpretation of observables is rather of a contextualistic-realist than of an empiricist blend. In Bohr’s writings no clear distinction is drawn, however, between the contextualistic-realist idea of an observable ‘possessing a certain value within the context of the measurement’, and the empiricist notion of a measurement result as a ‘pointer reading of a measuring instrument’ (see also section 4.7). This vagueness may have been instrumental in winning Bohr his favored position among logical positivist/empiricist thinkers, but is at the same time a source of confusion7,
6 Emphasis in the original. Fine ([265], p. 35) has interpreted the statement as one of Bohr’s “lapses into positivistic slogans and dogmas.”
confounding Einstein’s ‘elements of physical reality’ with measurement results as understood in an empiricist sense. It is most probable that Bohr’s realist inclination with respect to physical quantities has seduced him into following the EPR reasoning much further than is either necessary or desirable from an empiricist point of view. In any case Bohr nowhere questioned the existence of EPR’s ‘element of physical reality’: he questioned only the unambiguity of its definition. That Bohr did not see any essential difference with the ‘thought experiments’ discussed previously (like the double-slit experiment) is caused by the fact that he did not distinguish clearly between ‘preparation’ and ‘measurement’ (cf. section 4.7). He always referred to the experimental arrangement defining the measured quantity, without distinguishing between preparing apparatus and measuring instrument. In this global way he also accepted the arrangement proposed by EPR, without noting any difference between preparative and determinative aspects. This is the origin of his acceptance of the experimental arrangement of figure 5.1, in which only is measured, as a measurement (also) of (and analogously for and For Bohr the crucial question was, to what extent it is possible to define the quantity or within a particular experimental arrangement. According to the EPR reasoning this is possible because of the existence of correlations between and and between and respectively, brought about in the preparation of the particle pair. Bohr did not question the existence of these correlations. Here we clearly see the difference between Bohr’s view and an empiricist one. For the latter a procedure in which only is measured on the particle pair is not a measurement of any observable of particle 2. For the empiricist the experiment of figure 5.1 is a preparation procedure as far as particle 2 is concerned (cf. section 5.4). 
In agreement with the quantum mechanical rules of conditional preparation specified in section 3.2.6, he is able to associate a state vector with the preparation of those particles 2 corresponding to a particular value of Subsequently, he can either execute some measurement on particles 2 prepared in this way, or do some other experiment with these particles. For the empiricist a measurement of has been completed only if he has read a value off the scale of the pointer of a measuring instrument serving to measure this quantity. Bohr’s realism with respect to physical quantities seems to make him less critical concerning the question of what a quantum mechanical measurement is, and seems to seduce him into following EPR in their interpretation of the experiment as a quantum mechanical measurement on particle 2.
5.3.1 Criticisms of Bohr’s answer to EPR
From an interactional to a relational interpretation
Although Bohr did not regard his answer as problematic, and considered it a straightforward application of his complementarity principle8, it has nevertheless been felt to be a cause of a certain discomfort. Thus, Popper ([266], p. 149) and Jammer ([216], p. 194) (also Folse [210]) interpret Bohr’s answer as a change of interpretation, because now what is defined for particle 2 is determined by a measurement on particle 1 with a measuring instrument not interacting with particle 2. Hence, the measuring instrument no longer exerts its defining function by means of a direct interaction; it is sufficient that a relation exist between the measuring instrument and particle 2, composed of a combination of, on the one hand, the interaction between the measuring instrument and particle 1, and, on the other hand, the correlation between particles 1 and 2 which was realized in the preparation of the particle pair. Thus,
Popper [266] compares this relation to a coordinate frame. Depending on the choice of the measurement on particle 1 (either or a different coordinate frame is chosen in which particle 2 can be described. The two coordinate frames cannot be joined together because and are incompatible. By the correlations between either and or and this incompatibility (complementarity) is passed on to the quantities of particle 2. Jammer characterizes the change of interpretation as a transition from an interactional to a relational one.
Inconsistency of Bohr’s answer with the correspondence principle
Bohr’s answer to EPR does not seem to be completely consistent with his general philosophy as discussed in section 4.3. It is important to note once more that Bohr did not contest the assumption, implicitly made by EPR, of the existence of the correlations between and and between and respectively. For an empiricist these correlations are observables just like the quantities and themselves are, to be measured by a simultaneous measurement of and and of and respectively. If Bohr had recognized these correlations as quantum mechanical observables like all other ones, then his ‘correspondence’ idea would have told him that these are defined only in the context of the corresponding correlation measurements. This means that the correlation is defined only in the context of a joint measurement of and and, analogously, the correlation in the (incompatible) context of a joint measurement of and
8 “...we are dealing with problems of just the same kind as raised by Einstein in previous discussions...”, Bohr [202], p. 232.
The point is that Bohr followed EPR in their assumption that the correlations exist, notwithstanding the fact that neither nor is actually measured. That is to say, Bohr followed EPR in a realist interpretation of the correlations between the physical quantities of particles 1 and 2. On a consistent application of his correspondence principle Bohr could have rejected the EPR problem as falling outside the experimental domain in which quantum mechanics can make meaningful and empirically controllable assertions about these correlations. That he did not do so may be due to the fact that Bohr’s view on quantum mechanical observables was indeed more realist than empiricist (compare section 4.3). This may have been instrumental in his forfeiting for a moment the careful distinction he always made between ontic and epistemic categories (cf. section 4.6.3). Bohr challenged EPR only on the ground that both correlations could not exist simultaneously. He observed an “essential ambiguity” in the reality criterion deployed by EPR, in the sense that, according to him, EPR did not take into account the impossibility of simultaneously measuring the incompatible observables and of particle 1, thus obstructing the possibility of a simultaneous definition of the correlated observables of particle 2: what is well-defined for particle 2 in the context of the measurement could differ from what is well-defined in the context of the measurement. This argumentation, however, is valid only if the measurement on particle 1 can indeed be interpreted (as proposed by EPR) as a measurement on particle 2. In that case the observable of particle 2 could be considered as being defined, in the sense of the correspondence principle, in the context of the measurement on particle 1. 
In ontological terms this would mean that the reality of particle 2 is a contextual reality, the context being determined (in a nonlocal way) by the remote measurement arrangement the particle is not directly interacting with. On a more strict application of his correspondence idea, however, Bohr did not have to follow Einstein’s proposal. The impossibility of a simultaneous existence of both correlations follows from the incompatibility of the correlation observables and These observables can be measured only by performing measurements on both particles. But then it is possible to consider each of the particles in the context of the measurement arrangement the particle is itself directly interacting with. On this reasoning Bohr would have had no need to retreat from his interactional interpretation toward a relational one. Concomitantly, this would have denied any basis to Einstein’s intention of performing quantum mechanical measurements on particle 2 without any interaction of this particle with a measuring instrument. Like in an empiricist interpretation the impossibility of the simultaneous existence of ‘elements of physical reality’ for and would be a consequence of the incompatibility of these very observables, instead of being derived from the incompatibility of and In Bohr’s answer complementarity does play a role. However, a
consistent application of correspondence is at least dubious. Notwithstanding the inconsistency of Bohr’s answer, noted here, this answer has counted for a long time as convincing. Indeed, the incompatibility of and is indisputable. Moreover, the existence of the correlation in the context of the measurement (and analogously for momentum correlation in the context of the measurement) seems to be warranted by the fact that if and are measured simultaneously, this correlation is actually found. It, accordingly, appears to be justified to interpret the EPR experiment, in which only an observable of particle 1 is actually measured, as a measurement on both particles 1 and 2. That we must be cautious with such an interpretation is clear, however, from the consideration of a joint measurement of the (compatible) observables and (cf. figure 5.2). If we would interpret the measurement also as a measurement (e.g. Park and Margenau [254]), then this experiment would be interpretable as a simultaneous measurement of and the measurement result for is read off directly from the meter, whereas the value of can be calculated, using (5.2), from the measurement result for Such an interpretation, however, would be inconsistent with Heisenberg’s disturbance theory of measurement (cf. section 4.6). On the basis of this latter theory is disturbed because is measured. Then the correlation (5.2) could be preserved only if not only but also would be disturbed by the measurement. Moreover, the disturbances of and would have to be identical. However, since is actually measured, according to Heisenberg’s disturbance theory of measurement this observable is not disturbed at all. Hence, it does not seem to be possible to interpret the measurement as a measurement of and at the same time maintain the elementary property of quantum mechanical measurement that compatible observables (viz, and ) can be simultaneously measured without mutual disturbance. 
The problem of joint measurement will be further elaborated in section 7.9 using the generalized formalism of quantum mechanics. This formalism will be seen to allow a description of joint measurement of incompatible observables, and to yield a certain justification of Heisenberg’s disturbance theory of measurement in a strictly interactional sense (see section 9.3.1 for an application to EPR-like experiments). The generalized formalism corroborates the expectation that a measurement of
does disturb the value of without influencing This is consistent with a strict application of the correspondence principle in which the measurement contexts of both particles are taken into account, in the sense that, if the measurements for particles 1 and 2 are in causally disjoint regions (cf. section 1.3.1), then the measurement arrangement for particle serves to define only the observable of that same particle This causes correlations (5.2) and (5.3) to be no longer valid in the context of a measurement. By these observations it becomes evident that the EPR experiment, in which only is measured, cannot be taken to be a joint measurement of and From the point of view of his correspondence principle, on the basis of a strict interactionism Bohr could have rejected the suggestion made by EPR that their experiment is a measurement of an observable of particle 2. The fact that correlation experiments like the EPR experiment are interpreted as correlation measurements, even if only one of the two observables (e.g. is actually measured, can be seen as part of the confusion with respect to ‘preparation’ and ‘measurement’, customary in the Copenhagen interpretation. Indeed, as will be discussed in section 5.4, the experiment satisfies all characteristics of a conditional preparation. This, however, seemed to conform to Heisenberg’s definition of a quantum mechanical measurement (cf. section 4.6.1), thus boosting the confusion. Bohr’s compliance in this matter must also be seen in light of the fact that in his correspondence principle only reference is made to the global experimental arrangement, in which the distinction between ‘preparation’ and ‘measurement’ is largely neglected. 
However, if the EPR experiment were not a measurement but a preparation of particle 2, and if the difference between these procedures were duly taken into account, then a criticism of Bohr’s answer might be that, as far as particle 2 is concerned, there is no reason why it should refer to such measurement principles as ‘correspondence’ and ‘complementarity’. Bohr’s application of these principles in dealing with the EPR experiment does not seem to have any logical necessity. A contextual approach along the lines of these principles would be sensible only if a quantum mechanical measurement would be actually performed on particle 2 too.
Nonlocality
Einstein [267] had yet another response to Bohr’s answer. Let it be granted that the quantum mechanical description of particle 2 is complete due to the fact that, in defining its quantities, the measurement arrangement for particle 1 cannot be left out of consideration. Then, because particles 1 and 2 are far apart, the measurement arrangement for particle 1 must exert its defining function far outside its actual confinement. From this Einstein concludes that Bohr’s ‘completeness’ claim can be maintained only if there is a nonlocal influence of the measuring instrument for particle 1 on particle 2. Thus
If quantum mechanics is complete (i.e. if and cannot simultaneously be ‘elements of physical reality’ because they are defined only in the contexts of different measurement arrangements for particle 1), then quantum mechanics cannot be local. (5.5)
It logically follows from (5.5) (see, however, section 9.5.2) that
If quantum mechanics is local, then quantum mechanics cannot be complete. (5.6)
For Einstein this implied a trade-off between ‘completeness’ and ‘locality’ of quantum mechanics: quantum mechanics is either complete (and nonlocal) or it is local (and incomplete). The choice was not difficult: his belief in the locality of our physical reality, based on all extant experimental evidence as well as on our experience with relativity theory, dictated his choice in favor of locality, and, hence, against completeness of quantum mechanics. Unfortunately, Bohr does not seem to have commented on Einstein’s conclusion of ‘nonlocality’. The ‘nonlocality’ of quantum mechanics as encountered here has its origin in the idea of ‘completeness in a restricted sense’, not ‘completeness in a wider sense’ as employed in the original EPR paper [250]. Hence, in Einstein’s response [267] the notion of ‘completeness’ has undergone a considerable change of meaning. If in the implication (5.5) ‘completeness’ is taken ‘in a restricted sense’, then in (5.6) ‘incompleteness’ cannot have the same meaning it had in the EPR paper. It is not clear whether this change of meaning was noticed by Einstein. In view of the lack of distinction between the two concepts of ‘completeness’, observed in section 4.2, this must be doubted. As a matter of fact, whereas in the above-mentioned trade-off between ‘locality’ and ‘completeness’ only ‘completeness in a restricted sense’ seems to make sense, we shall encounter in section 5.4 an analogous trade-off, equally being inspired by Einstein, in which the issue is ‘completeness in a wider sense’. In the EPR reasoning the idea of ‘nonlocality’ has entered the discussion on the foundations of quantum mechanics for the first time, not to leave it until the present day. 
In particular, it has widely been accepted that the measurement of particle 1 is responsible for an instantaneous, and hence nonlocal/causal coming into being of a value of the correlated observable of particle 2 (which, according to the Copenhagen interpretation, previously was not there). By many people ‘nonlocality’ is considered as one of the most important features distinguishing the microscopic world from the macroscopic one. However, ‘nonlocality’, as encountered above, just follows from a certain interpretation of the formalism of quantum mechanics, viz, the Copenhagen one, more particularly, from the principles of ‘correspondence’ and ‘complementarity’. It is not based on any empirical fact. For this reason Einstein considered the ‘nonlocality’ resulting from these with great suspicion (Einstein’s “spooky actions at a distance”). Nevertheless, due to the confusion with respect to ‘completeness in a restricted sense’ and ‘completeness in a wider sense’ Einstein’s conclusion (5.6) is misleading. Since here ‘completeness’ is ‘completeness in a restricted sense’, the only conclusion
that can be drawn from this reasoning is that, if ‘nonlocality’ is to be avoided, ‘correspondence’ and ‘complementarity’ cannot be applied to EPR in the way it is done. Because of the possibility of denying any existence of non-quantum mechanical ‘elements of physical reality’, no conclusion can be drawn about ‘(in)completeness in a wider sense’, thus saving ‘completeness of quantum mechanics in a wider sense’ in an a priori manner. It cannot even be concluded that ‘correspondence’ and ‘complementarity’, and the concomitant ‘contextuality’ of the interpretation, are not applicable at all, and that ‘incompleteness in a restricted sense’ would follow from ‘locality’. If Bohr had applied his ‘correspondence’ idea to the EPR problem in the strictly interactional sense discussed above, then in the EPR measurement arrangement of figure 5.1 only observables of particle 1 would have been considered well-defined. Since all interactions can then be supposed to be local, there would not have been a single reason to suspect any form of nonlocality. Stated in this way, the association of ‘contextuality’ with ‘nonlocality’ can be seen as just an unfortunate consequence of interpreting the EPR arrangement as a measurement of a quantity of particle 2. ‘Nonlocality’ follows only from ‘completeness in a restricted sense’ on the basis of a dubious extension of the notion of ‘quantum mechanical measurement’. On the other hand, if the interaction is local, then, on a strictly interactional understanding of ‘correspondence’ and ‘complementarity’, Einstein would not have had any reason to reject the contextual meaning attributed by Bohr to quantum mechanical measurement results. Admittedly, Einstein had still another reason to do so, viz. his idea that quantum mechanics should yield an objective description of reality, independent of the observer (including his measuring instruments). 
However, this idea cannot be tested by considering quantum mechanical observables (as was done in the foregoing), because these refer to measurement, and, hence, cannot possibly be considered as independent of the measuring instrument. In order to be able to look upon particle 2 as ‘independent of measurement’, it will be necessary to deal with the EPR experiment in terms of (conditional) preparation rather than measurement. This will be done in the next section.
5.4 Formulation of the EPR problem in terms of state vectors
5.4.1 The EPR reasoning in terms of state vectors
Nowadays the EPR problem is seldom discussed in terms of physical quantities or ‘elements of physical reality’; states are considered instead. Actually, states are already considered in the original EPR paper [250], where the correlations discussed
in section 5.2 (with
are implemented by means of the state function
However, ‘elements of physical reality’ keep playing a role throughout, although this role is not unambiguous. The EPR ‘elements of physical reality’ were presented as corresponding to measurement results, even though they actually were meant to correspond to properties of individual particles which are independent of any measurement actually performed, and, hence, should be associated with a preceding preparation rather than measurement. By considering quantum mechanical measurement results to characterize ‘elements of physical reality’ Einstein had actually joined the Copenhagen interpretation in blurring the distinction between ‘measurement’ and ‘preparation’, thus lending Bohr an opportunity to strike back on the basis of his measurement principles of ‘correspondence’ and ‘complementarity’. In the present section a formulation of the EPR problem in terms of state vectors will be considered. Because, in contrast to a quantum mechanical observable, a state vector can unambiguously be seen as a representation of a (result of a) preparation, an important advantage of this formulation is the possibility of a better distinction between ‘measurement’ and ‘preparation’. In order to avoid complications due to non-normalizability one often considers state vectors introduced by Bohm and Aharonov [268], describing the (singlet) state of a system of two spin-1/2 particles:
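$$|\psi\rangle = \frac{1}{\sqrt{2}}\bigl(|z+\rangle_1 |z-\rangle_2 - |z-\rangle_1 |z+\rangle_2\bigr). \qquad (5.8)$$
Here the state $|z\pm\rangle_i$ represents the spin state of particle $i$ $(i = 1, 2)$ in which the $z$ component of spin, $\sigma_{iz}$, equals $\pm 1$ (in units of $\hbar/2$). In state (5.8) the probability of finding, on measurement of $\sigma_{iz}$, the value $\pm 1$ equals 1/2 for both values of $i$ and for both signs. In the singlet state the measurement results of the $z$ components of the spins of particles 1 and 2 are correlated according to
$$\sigma_{1z} = -\sigma_{2z}. \qquad (5.9)$$
Using the well-known relations between the eigenvectors of the Pauli spin matrices $\sigma_z$ and $\sigma_x$, $|x\pm\rangle = (|z+\rangle \pm |z-\rangle)/\sqrt{2}$, state vector (5.8) can also be written as
$$|\psi\rangle = \frac{1}{\sqrt{2}}\bigl(|x-\rangle_1 |x+\rangle_2 - |x+\rangle_1 |x-\rangle_2\bigr), \qquad (5.10)$$
implying that the measurement results of the $x$ components of the spins are strictly correlated too:
$$\sigma_{1x} = -\sigma_{2x} \qquad (5.11)$$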
(due to spherical symmetry of the state an analogous relation holds for any direction). The role played by observables and of section 5.2 can now be taken over by, for instance, the incompatible observables and Thus, either or is measured on particle 1. In terms of states the EPR reasoning leading to the conclusion of ‘incompleteness of quantum mechanics’ could be as follows: When, on measuring measurement result is found then we are certain that a measurement of will yield the value In agreement with the definition of a conditional preparation of the first kind (cf. section 3.2.6) this is possible only if, after the measurement, the state of particle 2 is given by When, on measuring measurement result is found then we are certain that a measurement of will yield the value This is possible only if, after the measurement, the state of particle 2 is given by Since the state vectors and are eigenvectors of incompatible observables, they are distinct. Hence they cannot both yield a complete description of the same reality of particle 2. Now there are two possibilities:
i) The two realities are different. Since the only difference between the two realities is the different measurement arrangement for particle 1, this is possible only if the reality of particle 2 would be (co-)determined by the measurement of particle 1. Since measurements of particle 2, checking its state, could be carried out simultaneously with the measurement of particle 1, this would imply ‘nonlocality’.
ii) Assuming ‘locality’, there is only one reality of particle 2, independent of which observable of particle 1 is measured. Since the assumption that the state vectors and would both yield a complete description of this reality entails a contradiction, it follows that these vectors at most can describe different aspects of the particle 2 reality. Hence, each state vector must yield an incomplete description. Only the second possibility corresponds to the assumption of ‘locality’. The conclusion of ‘incompleteness’ can be implemented by the assumption that the state vector does not describe an individual particle but only an ensemble. The conclusion of incompleteness is corroborated by the reduced density operator of particle 2 (cf. (1.52)), yielding for
This density operator can tentatively be considered as describing an ensemble of particles 2, with subensembles corresponding to the state vectors These subensembles are selected on the basis of the measurement results for and the correlation of the spins (cf. section 3.2.6). Analogously, if is measured the density operator can be represented as
On the basis of the measurement results for in this experiment two subensembles can be selected from the particle 2 ensemble, corresponding to the state vectors The difference of the representations (5.12) and (5.13) of can tentatively be interpreted as representing different subdivisions of one and the same particle 2 ensemble. Although these subdivisions are performed by selections on the basis of measurement results for particle 1, this need not involve any nonlocality if all subensembles would be there before the measurement on particle 1 took place. At least, it is evident that the state of the particle 2 ensemble, given by is independent of whether either or is measured, or whether no measurement at all is performed on particle 1. A formulation along the lines presented above seems to be consistent with Einstein’s views as expressed in [261] (cf. section 5.2.2). Since the present formulation is based on conditional preparation of particle 2 rather than measurement, it can be evaluated without applying such measurement principles as ‘correspondence’ and ‘complementarity’. An additional advantage is that the reasoning, unlike the one actually presented, completely remains within the domain of quantum mechanics and its interpretation. In particular, the question of ‘(in)completeness’ boils down to the question of whether the quantum mechanical state vector is describing an individual object (the Copenhagen individual-particle interpretation) or an ensemble (Einstein’s ensemble interpretation) (see also chapter 6). By the above reasoning a disagreement is exhibited between the individual-particle interpretation (‘completeness’) and ‘locality’. The trade-off between ‘locality’ and ‘completeness’ can be formulated in terms of the ‘individual-particle versus ensemble’ controversy in the following way: In an individual-particle interpretation immediately after a measurement of has been performed the conditionally prepared state can be attributed to particle 2. 
An analogous statement can be made with respect to the measurement of yielding as the conditionally prepared state of particle 2. Due to the incompatibility of and it is impossible that before the measurement particle 2 was in both of the states and Hence, at least in one of the measurements the state of particle 2 must have changed.
This means that particle 2 must have been subjected to some interaction, exerted by the measurement of particle 1. Since the particles are in causally disjoint regions this implies ‘nonlocality’. Conversely, the assumption of ‘locality’ implies that the individual-particle interpretation cannot be valid, and, hence, quantum mechanics is incomplete in the sense that the state vector describes an ensemble rather than an individual particle. The important difference between the reasoning of the present section and that of section 5.2 is that it is based on a different concept of ‘complementarity’. Whereas in the latter section ‘complementarity’ refers to ‘measurement’ (‘elements of physical reality’ for and do not simultaneously exist because and cannot be simultaneously sharply measured), in the present section it refers to ‘preparation’ (‘elements of physical reality’ for and do not simultaneously exist because and cannot be simultaneously sharply prepared). As discussed in section 4.6.1, the notions of ‘preparation’ and ‘measurement’ were thoroughly mixed up in the Copenhagen interpretation, and not sufficiently distinguished by Einstein, even though it was precisely his intention to get rid of ‘measurement’ as an important aspect in the interpretation of quantum mechanics. As already remarked in section 4.6.1, lack of distinction between ‘preparation’ and ‘measurement’ is one of the most important sources of confusion in the discussion on the foundations of quantum mechanics. The distinction between ‘preparation’ and ‘measurement’ also clarifies the related confusion about ‘(in)completeness in a wider sense’ and ‘(in)completeness in a restricted sense’. Whereas the second refers to ‘measurement’, the first is independent of whether measurement is explicitly taken into account or not.
As argued in section 4.2.3, the choice between individual-particle and ensemble interpretations of the state vector is a matter of ‘(in)completeness in a wider sense’, and, hence, is independent of Bohr’s measurement-inspired notion of ‘(in)completeness in a restricted sense’.
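The claim made above, that the two decompositions (5.12) and (5.13) represent one and the same particle 2 ensemble, can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the initial state (5.8) to be the spin singlet of two spin-1/2 particles, and it takes the two incompatible observables measured on particle 1 to be the Pauli operators sigma_z and sigma_x. All names in the code (`SINGLET`, `BASES`, `conditional_states`, `mixture`) are hypothetical and are not taken from the text.

```python
import math

# Assumption: state (5.8) is the spin singlet (|up,down> - |down,up>)/sqrt(2).
# Assumption: the two incompatible observables of particle 1 are sigma_z and sigma_x.

S = 1 / math.sqrt(2)

# Singlet amplitudes in the sigma_z product basis, indexed [particle 1][particle 2],
# with index 0 = "up" and index 1 = "down".
SINGLET = [[0.0, S], [-S, 0.0]]

# Eigenvectors of the observable measured on particle 1.
# All components are real, so no complex conjugation is needed below.
BASES = {
    "sigma_z": [[1.0, 0.0], [0.0, 1.0]],
    "sigma_x": [[S, S], [S, -S]],
}

def conditional_states(basis):
    """Return (probability, normalized particle-2 state) per outcome on particle 1."""
    results = []
    for e in basis:
        # Unnormalized particle-2 vector obtained by projecting particle 1 on e.
        v = [sum(e[i] * SINGLET[i][j] for i in range(2)) for j in range(2)]
        p = sum(x * x for x in v)
        results.append((p, [x / math.sqrt(p) for x in v]))
    return results

def mixture(basis):
    """Density matrix sum_m p_m |psi_m><psi_m| of the particle-2 ensemble."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for p, psi in conditional_states(basis):
        for j in range(2):
            for k in range(2):
                rho[j][k] += p * psi[j] * psi[k]
    return rho

rho_z = mixture(BASES["sigma_z"])  # decomposition selected by sigma_z results
rho_x = mixture(BASES["sigma_x"])  # decomposition selected by sigma_x results
print(rho_z)
print(rho_x)
```

Under these assumptions both mixtures come out, up to rounding, as one half times the unit matrix, although the conditionally prepared state vectors in the two decompositions differ. The calculation says nothing about whether the subensembles exist before the selection; that is the interpretational question discussed in the text.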
5.4.2 Discussion of the state vector approach to EPR
Margenau [269] has stressed that in the EPR reasoning the projection postulate (cf. section 1.6) plays an important role. The EPR problem, indeed, is completely analogous to the Compton effect, which was for von Neumann the incentive to introduce the postulate. Thus, by measuring particle 1 the state of particle 2 seems to change in a way described by a projection (from state (5.8) to if is found with value Margenau deemed the projection postulate responsible for the EPR problem. Therefore he pleaded for abolishing it. In his view the state of
particle 2 is represented by the density operator (5.12) as long as this particle does not interact with other objects. In sections 3.2.4 and 3.2.6 the projection postulate is questioned, too. However, for Margenau the reason for doubting its validity was different from the one endorsed here. Margenau’s reason was based on his conviction that a probability distribution cannot be fixed by a single measurement performed on an individual object, but requires a very large number of observations. Evidently, Margenau considers the projection postulate here as a measurement principle in the way envisaged by von Neumann. However, with respect to particle 2 the EPR experiment is not a measurement but a (conditional) preparation. There is no reason to object to application to EPR of projection as a preparation principle. As a matter of fact, when ‘incompleteness in a wider sense’ is accepted, then there is no reason to reject the possibility of a discontinuous change of the state as described by the projection postulate. Such a change can be interpreted as a selection of a subensemble. Even though Margenau was one of the first to be aware of the difference between ‘preparation’ and ‘measurement’ (cf. section 4.7.2), his reservations with respect to the application of the projection postulate to the EPR problem can nevertheless be seen as a consequence of a certain confusion with respect to these concepts. His reasoning fails if the EPR experiment is considered as a conditional preparation rather than as a measurement of particle 2. Margenau’s objection to projection does not seem to be cogent even if quantum mechanics is thought to be ‘complete in a wider sense’, and the EPR experiment is considered as a conditional preparation rather than a measurement. Of course, a completely different objection to projection might be the nonlocality of the influence exerted on particle 2 by the measurement on particle 1. 
In an individual-particle interpretation of the state vector even from the point of view of ‘(conditional) preparation’ the conclusion seems to be inevitable that ‘completeness’ entails ‘nonlocality’. Evidently, staying within an individual-particle interpretation, the clearing up of the confusion of ‘preparation’ and ‘measurement’ is not sufficient for being able to cope with ‘nonlocality’. ‘Nonlocality’ is not only a consequence of ‘completeness in a restricted sense’. Those who keep considering quantum mechanics to be ‘complete in a wider sense’ also seem to have to accept ‘nonlocality’. The conditionally prepared state or is no less dependent on the measurement arrangement for particle 1 than the measurement result or It was supposed by Einstein that, in contrast to an individual-particle interpretation of the state vector, an ensemble interpretation would be able to escape from the ‘nonlocality’ problem, because projection as a principle of conditional preparation can tentatively be interpreted as a selection of a subensemble. The absence of interaction between the regions of particles 1 and 2 seems to be corroborated by the fact that the density operator of the particle 2 ensemble is independent of which measurement is performed on particle 1 ((5.12) and (5.13) are equal). No nonlocality can be present in the selection either, because the selection of a subensemble of particle 2 can take place only on the basis of (classical) information about the measurement result of particle 1 that has bridged the distance between particles 1 and 2 at a velocity not exceeding the velocity of light. For this reason it appears that in an ensemble interpretation of the state vector acceptance of projection as a principle of conditional preparation is very well possible without inducing nonlocality.
From the point of view of ‘locality’ at first sight there does not appear to be any objection against Einstein’s objectivistic-realist interpretation of the state vector if this reality refers to an ensemble rather than an individual particle. However, Einstein’s conviction that an ensemble interpretation of the state vector can solve all problems (apart from the EPR ‘nonlocality’ problem also Schrödinger’s cat problem (section 3.1.1) is often thought to be a consequence of an individual-particle interpretation) is not justified. As a matter of fact, it must be assumed that the subensembles, selected in the conditional preparation, already existed before the selection took place. As will be seen in chapter 6 this assumption is quite problematic. Due to the incompatibility of and it does not seem to be possible to consider the two representations (5.12) and (5.13) of as descriptions of simultaneously existing decompositions of the particle 2 ensemble into subensembles. A trade-off between ‘completeness in a wider sense’ and ‘locality’ need not work properly because ‘incompleteness in a wider sense’ does not at all imply that incompatible quantum mechanical quantities do have simultaneous existence (cf. section 6.4.2). As will be seen in section 6.4.3, the problem of subensembles cannot be decisively dealt with by considering only quantum mechanical descriptions of states (as is done in the present chapter).
As a matter of fact, the assumption of ‘incompleteness in a wider sense’ implies that, like ‘elements of physical reality’, (sub)ensembles may also have to be characterized by sub-quantum mechanical concepts. A discussion of the (im)possibility of an objectivistic-realist ensemble interpretation is postponed till the next chapter. Here we shall take impossibility for granted. On this assumption there are different ways to cope with the conclusion that neither in an individual-particle interpretation nor in an ensemble interpretation it seems possible to consider the quantum mechanical description of particle 2 as independent of the measurement of particle 1: One possibility is to consider the EPR challenge of ‘quantum mechanical completeness’ as averted, and accept the nonlocality of the quantum world accompanying an individual-particle interpretation as a new feature of reality that has been revealed by the EPR problem. A second possibility is to adopt an ensemble interpretation, not in Einstein’s objectivistic-realist sense, but in a contextualistic-realist one, admitting a nonlocal influence exerted by the measurement arrangement for particle 1 on the
subensemble structure of the particle 2 ensemble. Here the ‘incompleteness’ assumption of an ensemble interpretation is not a result of any trade-off between ‘locality’ and ‘completeness’ (which does not seem to work properly, compare section 6.4.4), but may be inspired by the idea of ‘incompleteness in a wider sense’. A third possibility is the empiricist interpretation introduced in section 2.2. In this interpretation a state vector or density operator is not thought to describe the state of a microscopic object but rather a preparation procedure. The transition from (5.8) to one of the states or is just a transition to another preparation procedure for particle 2. In an empiricist interpretation no different significance is attributed to the projected state vector than a label of a certain combination of macroscopic measurement arrangements, observations, and decisions, to the effect that the preparation procedure of particle 2 is labeled by the state vector if the measurement of observable on particle 1 yields measurement result (and analogously for and All measurements performed subsequently on particle 2, conditional on measurement result are in agreement with such a conditional preparation. Whether this procedure is accompanied by a change of the microscopic reality of particle 2 is, according to an empiricist interpretation, a question not addressed by quantum mechanics because microscopic reality is thought not to be described by it. However, since sub-luminal transmission of information on the measurement result of particle 1 toward the region of particle 2 is an essential part of the procedure of conditional preparation (cf. section 3.2.6), there is no empirical evidence of any nonlocality. As already discussed in section 4.2.3 the question of ‘(in)completeness of quantum mechanics’ is entangled with a number of different issues. The opposition of realist and empiricist interpretations is one of these. 
It is important to be aware of the impact of a choice for either of these interpretations. Due to the formulation of the EPR problem in terms of physical quantities (observables), interpreted by both Einstein and Bohr in a realist sense, a realist interpretation has generally been assumed as the only option. As a consequence, apart from the question of ‘(in)completeness in a wider sense’, the discussion seemed to be about whether observables can be interpreted in an objectivistic-realist or a contextualistic-realist sense. If an objectivistic-realist position is impossible, then restriction to a realist interpretation suggests the contextualistic-realist one (either an individual-particle version or an ensemble one) to be the only viable alternative. As a consequence of this restriction a conclusion of ‘nonlocality of the microscopic world’ can hardly be evaded. This picture drastically changes if an empiricist interpretation is also taken into
consideration as a possible alternative. In contrast to the realist interpretations, an empiricist one does not imply ‘nonlocality’. In arriving at the conclusion of ‘nonlocality’ it seems to be the ‘realism’ of the interpretation, rather than ‘completeness (in a wider sense)’, that is playing the key role. Actually, only those who adhere to a realist interpretation of quantum mechanics (be it considered as ‘complete (in a wider sense)’ or as ‘incomplete’) have to accept ‘nonlocality’ as a feature of quantum reality (of either the individual particle or the ensemble). Since nonlocality of an ensemble is as much in disagreement with the relativistic ‘locality’ concept (and experience) as is nonlocality of an individual particle, the first two of the three possibilities mentioned above are unattractive. The possibility of evading Einstein’s “spooky action-at-distance” by choosing the third (empiricist) alternative rather than one of the contextualistic-realist interpretations adds an important advantage of an empiricist interpretation to the ones already mentioned in section 2.4.5. Bohr’s instrumentalist interpretation of the state vector would have rendered an answer in the sense given in the present section much more satisfactory than the answer he actually gave (cf. section 5.3). This latter answer was obscured by the confusion with respect to ‘preparation’ and ‘measurement’, and the concomitant formulation of the problem in terms of ‘physical quantities’, interpreted by Bohr in a realist rather than an instrumentalist sense (cf. section 4.3.3). In an instrumentalist interpretation of quantum mechanics the conception of projection as a description of conditional preparation differs in an essential way from the realist one. Then the state vector is not seen as describing microscopic reality, but is merely assumed to contain information on the probabilities of (eigen)values to be found if quantum mechanical observables are measured. 
This holds true even if an individual-particle interpretation is assumed. Like in an empiricist interpretation a formulation in terms of state vectors, interpreted in an instrumentalist sense, is not vulnerable to the ‘nonlocality’ objection, because a change of the state vector of particle 2 does not mean that its reality is changed too. The transition to a conditionally prepared state just corresponds to a conditionalization on new macroscopic circumstances, including the reading of the measurement result of particle 1. The state transition realized in a conditional preparation is not thought to have a meaning transcending a transition from the joint probability distribution (1.26) to the conditional distribution (1.28) (cf. section 3.3.4). Like in the empiricist interpretation, in an instrumentalist one conditional preparation can be considered as a subjective change of the observer’s perspective (or the experimental context), not requiring any change of the microscopic object. Indeed, such a conditionalization need not imply any “mechanical disturbance” of particle 2. Possibly, this was in Bohr’s mind when stating [259] (cf. section 5.3) that “Of course there is [...] no question of a mechanical disturbance...” As seems to be evident from his use in this statement of the qualification “Of course”, nonlocality
was presumably as unacceptable to Bohr as it was to Einstein. Presumably a “mechanical disturbance” of particle 2 was contemplated also by Bohr to take place only in a measurement arrangement in which particle 2 is interacting with some other object, for instance a measuring instrument for directly measuring an observable of particle 2. As will be seen in section 5.4.3, this latter experimental situation would have given Bohr an opportunity to counter the EPR challenge on the basis of a local contextuality. However, due to his interpretation of the EPR experiment as a measurement of an observable of particle 2, and his contextualistic-realist interpretation of observables (cf. section 4.6.3), Bohr was confronted with the consequence of a nonlocal context. The cryptic character of the above quotation may be evidence of the difficulty Bohr had in coming to grips with a nonlocality that, although seemingly present at the conceptual level, should not have an ontic meaning (because there was no empirical evidence of any nonlocality). Here the vagueness of the notion of ‘instrumentalism’ is taking its toll. As already remarked in section 5.1, this could have been evaded if Bohr had taken his instrumentalism in the sense of an empiricist interpretation of quantum mechanics. Such an empiricist interpretation shares with the instrumentalist one the possibility of looking upon conditional preparation in a local way. In this respect problems arise only if state vectors and are interpreted as descriptions of the reality of the microscopic object itself. This is precisely what is happening in a realist interpretation. Then, both in an ensemble interpretation and in an individual-particle one the reality of particle 2 would have to change instantaneously due to a measurement on particle 1 (see also sections 6.3.2 and 6.4.3) 9.
However, this conclusion is not endorsed by any empirical evidence: there is no single empirical motivation to believe that a measurement on particle 1 would change the reality of particle 2 in a nonlocal way. As is well known, all experiments performed with particle 2 after it is prepared by the EPR arrangement, can be described just as well by as by the conditionally prepared states or Hence, it is not even possible to experimentally check whether a transition from to one of the above-mentioned states has “really” taken place (see also section 3.2.7). The nonlocality, suggested by the EPR experiment, is not based on any empirical data10, but is a consequence of a tacit choice of a certain interpretation, viz. a realist interpretation. What the EPR problem seems to be teaching us, is that we have to choose between a realist interpretation of the state vector (entailing an unobservable nonlocality: Einstein’s “spooky actions at a distance”), and an empiricist (or instrumentalist) interpretation in which the state vector does not describe microscopic reality but has a more symbolic meaning (namely, as a label of a preparation procedure). This, however, was not the conclusion drawn from the EPR problem by Einstein himself. For him a realist interpretation of quantum mechanics was a presupposition deemed essential: quantum mechanics as a description of objective reality. As a consequence the attention was not focused on the issue of the (realist) attribution of an observable to the microscopic object, but rather on the simultaneous attribution of observables (viz. position and momentum), the possibility of the sheer attribution being taken for granted. Like Margenau’s, Einstein’s attention was above all directed toward the question of ‘(in)completeness in a wider sense’. Both seemed to believe that, while maintaining a realist interpretation, a transition from an individual-particle to an ensemble interpretation could solve all problems, conditional preparation being interpreted as selection of a subensemble. Unfortunately, it was not realized that the ‘nonlocality’ problem is persistent in a realist ensemble interpretation. What we can learn from the EPR problem seems to be that there is a trade-off between ‘locality’ and a ‘realist interpretation of the quantum mechanical formalism’, rather than between ‘locality’ and ‘completeness’.
9 Conditional preparation in EPR experiments has recently been considered as a mechanism allowing quantum teleportation [270], in which particle 2 is prepared according to the same state as particle 1 “was in”. It seems preferable not to interpret this as a transmission of a quantum state, interpreted in a realist sense, but just as subsuming an individual preparation of particle 2 under a macroscopic preparation procedure symbolized by the teleported state.
10 Contrary to frequent assertions, the Aspect experiments do not constitute experimental evidence of nonlocality (cf. sections 9.3.2 and 9.5.2).
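The statement in this section that a conditional preparation need not mean more than a transition from a joint probability distribution to a conditional one can be made concrete with a small calculation. The sketch below rests on assumptions not fixed by the text: the initial state is taken to be the spin singlet, spin components are measured on the two particles along directions in one plane enclosing an angle theta, and the joint distribution is the standard quantum mechanical prediction p(m, n) = (1 - m n cos theta)/4 for results m, n = +1 or -1. The function names are hypothetical, and the code does not claim to reproduce the exact form of (1.26) or (1.28).

```python
import math

OUTCOMES = (+1, -1)

def joint(m, n, theta):
    """Joint probability of result m (particle 1) and n (particle 2).

    Assumption: singlet state, spin components measured along directions
    enclosing the angle theta.
    """
    return (1 - m * n * math.cos(theta)) / 4

def marginal_1(m, theta):
    """Probability of result m for particle 1, summed over particle 2 results."""
    return sum(joint(m, n, theta) for n in OUTCOMES)

def marginal_2(n, theta):
    """Probability of result n for particle 2, summed over particle 1 results."""
    return sum(joint(m, n, theta) for m in OUTCOMES)

def conditional(n, m, theta):
    """Probability of result n for particle 2, conditional on result m for particle 1."""
    return joint(m, n, theta) / marginal_1(m, theta)

theta = math.pi / 3
for m in OUTCOMES:
    print(m, [round(conditional(n, m, theta), 3) for n in OUTCOMES])
print([round(marginal_2(n, theta), 3) for n in OUTCOMES])
```

In this model the conditional distribution depends on the result m and on the angle theta, whereas the unconditioned distribution of the particle 2 results equals 1/2 for every theta. Conditionalization therefore changes which probabilities are relevant without leaving any trace in the statistics of particle 2 taken by itself.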
5.4.3 Modified EPR experiments
Taking into account measurements performed on particle 2
Anticipating a more systematic treatment in chapter 9 we make some remarks with respect to the experimental situation in which, in contrast to the EPR problem discussed in the present chapter (in which only a measurement of e.g. is carried out), measurements are performed on both particles. According to Bohr’s correspondence principle in such modified EPR experiments the arrangement for the measurement on particle 2 should also be included in the context. The presence of this latter measurement arrangement makes the experimental situation essentially different from that of the original EPR problem. Since the EPR reasoning is crucially based on the assumption that particle 2 is not interacting with any measuring instrument, the applicability of this latter reasoning is completely undermined. ‘Elements of physical reality’ do not have relevance any longer. They should be replaced by ‘measurement results’ obtained in measurements that are actually carried out. A new assessment is necessary in which the modified EPR experiment is considered as a joint measurement in the sense of section 1.3. Neglect of the essential difference between the original EPR situation and the experiments to be discussed in chapters 9 and 10 (connected with the Bell inequality) is an important source of confusion. Modified EPR experiments performed to test the Bell inequality will also be referred to as
EPR-Bell experiments. In the present chapter we shall restrict ourselves to joint measurements of standard observables. If the measurements on particles 1 and 2 are carried out in causally disjoint regions of space the analysis of section 1.3.1 is applicable, and the joint probability distributions (1.26) satisfy (1.29). This may have been the main reason for Einstein to opt for ‘locality’ in the trade-off with ‘completeness’, discussed above. If there is any nonlocal influence at all, then in any case there is no empirical evidence of it at the level of the probability distributions of quantum mechanical observables. On the contrary, because of (1.29) this evidence (e.g. the Aspect experiments to be discussed in chapter 9) is pointing in the direction of ‘locality’ rather than ‘nonlocality’. On an application of the correspondence principle in Bohr’s original interactional sense it would seem reasonable to suppose that in a modified EPR experiment the measurement context of particle is mainly determined by the measurement arrangement for particle itself. Very often quite a different answer is given, though, viz. that the measurement context for particle 2 is determined by the measurement arrangement for particle 1 (as well as, of course, by the experimental arrangement for preparing the particles in their initial state). The accompanying nonlocality of the measurement context was already discussed in section 5.3.1. For Bohr this was the natural reaction to EPR because only the measuring instrument for particle 1 was there. However, from the point of view of the correspondence principle this answer is not satisfactory as soon as a measuring instrument for particle 2 is also present. If the measurement arrangement for particle 1 contributes to the measurement context of the distant particle 2, then this seems even more to be expected of the measuring instrument particle 2 is directly interacting with.
The nonlocal contextuality going with Bohr’s answer would at least have to take into account the possibility of a strong local contribution as soon as a measurement on particle 2 itself is carried out. It does not even seem to be unreasonable to suppose that the influence of the nearby measuring instrument might outweigh that of the far one. Could it be possible that, by taking into account the measurement on particle 2 (necessary to test any prediction made by quantum mechanics on that very particle), the ghost of ‘nonlocality’ that is haunting quantum mechanics, can be laid? It seems that, on the basis of a strictly interactional understanding of his correspondence principle, Bohr could have been quite successful in achieving this goal. It is the purpose of the present section to examine this possibility. Whereas in Bohr’s correspondence principle only physical quantities are considered, the present treatment is cast in the language of state vectors. Then it is possible to deal with Bohr’s above-mentioned answer to EPR on the basis of the idea of conditional preparation (cf. section 5.4). Thus, conditional on measurement result the state of particle 2 is as long as it does not interact with another object. In agreement with the nonlocal contextuality of
physical quantities, discussed in section 5.3, the state of particle 2 is determined on the basis of conditional preparation by the measurement on particle 1 (as well as, of course, by the initial state (5.8)) in a nonlocal way. The question to be addressed in the case of a modified EPR experiment is, what physical relevance the conditionally prepared states may have in the context of a measurement on particle 2. Is it not more plausible that the state of particle 2 will at least be co-determined by the measurement of that very particle? Nowadays it is not customary to associate conditional preparation in the EPR experiment with Bohr’s correspondence principle. In agreement with the more objectivistic ideas of Heisenberg and von Neumann (cf. section 4.6.6) the conditionally prepared states are often considered in a more objectivistic-realist (as opposed to contextualistic-realist) sense, valid after the conditional preparation has taken place, and to be used as initial states in subsequent measurements of observables of particle 2. This endows the conditionally prepared states with a physical relevance quite different from the instrumentalist meaning attributed to the state vector by Bohr. From the perspective of ‘nonlocality’ an objectivistic-realist interpretation of conditionally prepared states is facing the same problems as Bohr’s contextualistic-realist interpretation of physical quantities; it just means that Bohr’s nonlocal context is exchanged for a nonlocal projection realizing the conditional preparation. In an objectivistic-realist interpretation the ‘nonlocality’ problem can be seen as a manifestation of the ‘causality’ problem encountered in section 3.2.6 in discussing conditional preparation. In considering modified EPR experiments, in an objectivistic-realist interpretation of the conditionally prepared state a new problem is added to the ‘nonlocality’ objection if time ordering of the measurements is taken into account.
For the sake of definiteness let us consider a joint measurement of the pair By slightly varying the distances between the particle source and the measuring instruments the time ordering of the and measurements may be changed. The measurement may be carried out before, simultaneously with, or after the measurement of In the latter case, at the time of the measurement no conditional preparation of the state of particle 2 can have occurred as a consequence of the measurement. Instead, it would seem to be consistent to assume that rather particle 1 is conditionally prepared by the measurement, yielding as conditionally prepared state if the measurement result is If the measurements are simultaneous, then it might be assumed that a joint measurement result entails conditional preparation of both particles in states and respectively. Indeed, experimental data are consistent with such an assumption. As is seen from this example, in an objectivistic-realist interpretation of conditional preparation the realities of the particles, at the times of their measurements, are strongly dependent on the time ordering of these measurements, a discontinuity
occurring if the measurements are simultaneous. This would imply that the measurement interaction realizing the conditional preparation would be even stranger than just being nonlocal: the interaction would be drastically altered by an arbitrarily small change of the time difference by which the time ordering of the measurements is inverted 11. On the other hand, the experimental data do not give any reason to think that this is the case, since experimental joint probabilities of and are independent of the precise time ordering of the measurements. Even though the assumption of conditional preparation is sufficient to reproduce the experimental data, its strange “unphysical” behavior is reason enough to distrust it, at least as a physical process interfering with the object’s reality. The problems of conditional preparation, noted here, might be thought to be solvable by weakening the objectivistic-realist interpretation by the assumption that the quantum mechanical state vector describes an ensemble rather than an individual particle, and conditional preparation can be seen as a selection of a subensemble. Indeed, an ensemble interpretation would take the edge off both the problems of ‘nonlocality’ (cf. section 3.2.6) and of ‘time ordering’. Thus, selection of a subensemble of particle 2 on the basis of a measurement result for particle 1 (and vice versa) need not entail any change of the reality of the selected particle. Moreover, the time ordering of the measurements of particles 1 and 2 is immaterial, since the selections can be carried out after both measurements have taken place (leaving out of consideration the particles corresponding to the “wrong” measurement result).
However, as will be seen in section 6.4, a solution on the basis of an ensemble interpretation in an objectivistic-realist sense is hardly possible, because such an interpretation has its own problems (one of these being that it cannot completely solve the problem of ‘nonlocality’). It seems that quantum mechanics simply does not allow an interpretation as describing an objective reality, either of an individual particle or of an ensemble.
In section 5.4.2 the consequence of ‘nonlocality’ was taken as an indication of the appropriateness of interpreting observables and state vectors in an empiricist rather than in a realist sense (either objectivistic or contextualistic). This conclusion is even strengthened if modified EPR experiments are considered, since here macroscopically observable pointer readings of the measuring instrument for particle 2 can be considered rather than ‘elements of physical reality’. A particular advantage of an empiricist interpretation is that it is an ensemble interpretation. This opens up the possibility of saving an important element of Einstein’s intended solution without being trapped in the intricacies of a realist interpretation. In an empiricist interpretation of the EPR experiment conditional preparation, being an application of the projection postulate in a preparative sense, does not pose any problem if considered as a selection of a subensemble of particles 2. It is just a transition to a different preparation procedure, in which the value of the measurement result for particle 1 plays a role in selecting those particles 2 of which the measurement results should contribute to the experimentally determined conditional probabilities. Such a preparation procedure need not involve any influence (either local or nonlocal) on the reality of either the selected or the unselected particles 2. Note that in an empiricist interpretation it is not assumed that particle 2 has a value of its measured observable as an objective property, independently of its being measured.
¹¹ A similar problem has been noticed by d’Espagnat ([160], section 8.3) on the basis of a relativistic reasoning.
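The reading of conditional preparation as a selection carried out on recorded data can be illustrated by a small simulation. The following is only a sketch under stated assumptions: the pair is taken to be in the singlet state, the two instruments measure spin components along directions differing by an angle `theta`, and the standard singlet joint probability (1 − ab cos θ)/4 is used. The function names, the angle and the sample size are hypothetical choices made for this illustration.

```python
import math
import random

def singlet_joint_probability(a, b, theta):
    """Joint probability of results a, b (each +1 or -1) in the singlet state
    when the two spin directions differ by the angle theta (radians)."""
    return 0.25 * (1.0 - a * b * math.cos(theta))

def conditional_probability(b, a, theta):
    """P(b | a), computed from the joint probabilities alone."""
    marginal = sum(singlet_joint_probability(a, bb, theta) for bb in (+1, -1))
    return singlet_joint_probability(a, b, theta) / marginal

def sample_records(n, theta, rng):
    """Draw n joint records (a, b); no time ordering of a and b is stored."""
    outcomes = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]
    weights = [singlet_joint_probability(a, b, theta) for a, b in outcomes]
    return rng.choices(outcomes, weights=weights, k=n)

def selected_frequency(records, a_selected, b):
    """Relative frequency of result b within the subensemble of pairs whose
    particle-1 result equals a_selected (selection done after the fact)."""
    subensemble = [rb for ra, rb in records if ra == a_selected]
    return sum(1 for rb in subensemble if rb == b) / len(subensemble)

rng = random.Random(0)   # fixed seed so that the run is reproducible
theta = math.pi / 3      # illustrative angle between the two directions
records = sample_records(100_000, theta, rng)

freq = selected_frequency(records, +1, -1)
prob = conditional_probability(-1, +1, theta)
print(f"post-selected frequency: {freq:.3f}, P(b=-1 | a=+1): {prob:.3f}")
```

The records carry no information about which measurement came first, and the subensemble is picked out only after all data have been collected. Its relative frequencies nevertheless approach the conditional probability obtained from the joint distribution, which is all that the selection reading requires.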
Contextual state versus conditional preparation
In an measurement the reality of particle 2 is not “mechanically disturbed” as long as this particle is not influenced by the measuring instrument. Unfortunately, Bohr did not insist on the actual presence of this latter measuring instrument. As already seen in section 5.3.1, Bohr’s application of his correspondence principle to EPR is liable to criticism because a correlation observable is thought to be defined without actually being measured. Insistence on the presence of both measuring instruments necessary to measure the correlation would have made it possible for him to maintain the original strictly interactional form of the correspondence principle. Instead, he stressed the presence of the measuring instrument for particle 1 in determining the context of particle 2, thus forcing himself to abolish a strictly interactional interpretation, and actually starting off the ‘nonlocality’ problem of quantum mechanics. By requiring the observable of particle 2 to be defined by the actual measurement arrangement for measuring this very observable (thus sticking to a strict interactionism) Bohr could have reconciled his correspondence principle with locality. This would even hold true if this latter physical quantity were interpreted in the more (contextualistic-)realist sense adopted by him (cf. section 4.3.3). It would require, however, the abolishment of a (contextualistic-)realist interpretation of conditional preparation, in the sense that if in the singlet state an measurement on particle 2 is carried out simultaneously with an measurement, then particle 2 need not “have” one of the values (let alone, that it would “be in” one of the states ). Since, according to the Copenhagen interpretation, cannot even have a well-defined value in the context of an measurement, this would seem to be just a small step to be made by Bohr.
It is well known that conditional preparation does not yield any additional information on any probability distribution beyond the information already present in the state (5.8). Conditional probabilities can be calculated from joint probabilities using (1.28), and, hence, are completely determined by the initial state (5.8). Hence, we do not have any empirical reason to consider the transition to the conditionally prepared state as “really” taking place. Even though, in the singlet state, measurement result will always be followed by measurement result if the latter observable is measured, this need not imply that particle 2 “was in”
the conditionally prepared state after the measurement was carried out. Application to EPR of the projection postulate (describing a transition to the conditionally prepared state) -although possible in the sense of choosing a new preparation procedure- is not necessary for interpreting experimental data obtained in a joint measurement of and (de Muynck [217]). The attribution of reality to the result of conditional preparation will be discussed further in section 9.4.2. Such an attribution is largely based on Einstein’s ‘element of physical reality’, applicable as long as particle 2 does not interact with any measuring instrument. The reality of particle 2 in the context of such a measurement may be quite different, however. In an empiricist interpretation this reality is not thought to be described by quantum mechanics. In section 2.4.5 it was conjectured that, perhaps, the empiricist interpretation can be strengthened by interpreting the contextual state obtained from the initial state vector by means of weak projection (1.72) or (1.74), in a contextualistic-realist sense as describing the state of a microscopic object in the context of a measurement of standard observable A. In the modified EPR experiment weak projection should not be applied just to particle 1, but to the state of the two-particle system, using the spectral representations of the observables of both particles. Applying this to the joint measurement of the pair in the singlet state (5.8), the contextual state is given by
For the measurement of the pair
we get
In the contextual states of the modified EPR experiments, thus obtained, the measurement contexts of the nearby measuring instruments are duly taken into account. It is illuminating to consider the relation between the contextual state and the conditionally prepared one. In the measurement the states prepared conditionally by the measurement, are eigenvectors of Therefore the conditionally prepared states are well adapted to the observable measured on particle 2, and the contextual state is in agreement with the conditionally prepared ones. However, this is a consequence of a very special choice of the observables measured jointly, the observables being adapted to the correlations already present
in the singlet state. In order not to be led astray it will be necessary to consider experiments that are not so nicely adapted to the correlations present in the initial state. Thus, when measuring the pair the conditionally prepared states of particle 2 are the same as in the measurement, but now these are different from the eigenvectors of observable measured on particle 2. This makes the contextual state virtually unrelated to the conditionally prepared ones, thus making it easier to evaluate their relative merits.
With respect to (5.14) and (5.15) a number of remarks should be made. First, it should be noted that in (5.15) the eigenvectors of could be replaced by those of without changing the contextual state. Hence, in this measurement the contextual state can just as well be expressed in terms of the conditionally prepared states as in terms of eigenvectors of the actually measured observable. Evidently, even the measurement of the pair is too specific to be able to exhibit in a decisive way the difference between the contextual state and the conditionally prepared ones. In order to achieve this goal we should either take a more general state than the singlet state, or a pair of observables corresponding to spin components in nonorthogonal directions (see also section 9.4.2, where it is demonstrated that in general the contextual state is different from the result of conditional preparation). This, once again, is a warning that it is very risky to draw general conclusions about the meaning of the mathematical formalism from the very restricted form of the EPR problem as discussed in the present chapter.
Second, since the contextual state cannot be used as initial state in a measurement of an arbitrary observable, it is not possible to attribute to it the usual empiricist meaning of a quantum mechanical state.
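The first remark can be checked numerically. The sketch below rests on assumptions that are not taken from the text: weak projection is modelled as the map ρ → Σ (P⊗Q) ρ (P⊗Q) over the spectral projections of the two measured observables, particle 1 is measured along z, and particle 2 along a direction at angle θ from z in the x–z plane (θ = 0 for equal directions, θ = π/2 for orthogonal ones, θ = π/3 as a nonorthogonal example). All function and variable names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Tensor (Kronecker) product of two matrices."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def spin_projectors(theta):
    """Projectors on the +1 and -1 eigenvectors of the spin component along a
    direction at angle theta from the z axis, in the x-z plane."""
    c, s = math.cos(theta), math.sin(theta)
    plus = [[(1 + c) / 2, s / 2], [s / 2, (1 - c) / 2]]
    minus = [[(1 - c) / 2, -s / 2], [-s / 2, (1 + c) / 2]]
    return [plus, minus]

def contextual_state(rho, projectors_1, projectors_2):
    """Assumed form of weak projection for a joint measurement:
    sum over all pairs of (P tensor Q) rho (P tensor Q)."""
    result = [[0.0] * 4 for _ in range(4)]
    for P in projectors_1:
        for Q in projectors_2:
            E = kron(P, Q)
            result = add(result, matmul(matmul(E, rho), E))
    return result

# Singlet density matrix in the basis |up up>, |up down>, |down up>, |down down>.
singlet = [[0.0, 0.0, 0.0, 0.0],
           [0.0, 0.5, -0.5, 0.0],
           [0.0, -0.5, 0.5, 0.0],
           [0.0, 0.0, 0.0, 0.0]]

Z = spin_projectors(0.0)
# Equal-weight mixture of the product states obtained by conditional preparation.
conditional_mixture = add(scale(0.5, kron(Z[0], Z[1])), scale(0.5, kron(Z[1], Z[0])))
quarter_identity = [[0.25 if i == j else 0.0 for j in range(4)] for i in range(4)]

ctx_same = contextual_state(singlet, Z, Z)
ctx_orth = contextual_state(singlet, Z, spin_projectors(math.pi / 2))
ctx_tilt = contextual_state(singlet, Z, spin_projectors(math.pi / 3))

print("equal directions reproduce the conditional mixture:",
      close(ctx_same, conditional_mixture))
print("orthogonal directions give a multiple of the identity:",
      close(ctx_orth, quarter_identity))
print("nonorthogonal directions match neither:",
      not close(ctx_tilt, quarter_identity) and not close(ctx_tilt, conditional_mixture))
```

Under these assumptions the orthogonal case yields one quarter of the identity operator, which has the same form in every product basis. That is why the eigenvectors on particle 2 can be exchanged there without changing the contextual state, and why only the nonorthogonal case separates the contextual state from the conditionally prepared states.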
However, as discussed in section 2.4.5, the contextual state might be thought to yield a description of the reality of the microscopic object, valid only in the context of the measurement that is actually being performed. This rather resembles Bohr’s contextualistic-realist interpretation as expressed by his correspondence principle, even though this principle is referring to physical quantities rather than state vectors. By considering the contextual state, as presented here, as an implementation of this principle, Bohr’s contextualistic-realist conception is actually extended from physical quantities to state vectors. The important thing is that, in contrast to a contextualistic-realist interpretation of the conditionally prepared state, such an interpretation of the contextual state does not imply any nonlocality: as can easily be verified, the contextual states of each of the particles, obtained from (5.14) or (5.15) by partial tracing, can also be obtained in a manifestly local way by applying weak projection to each of the particles separately. Third, it is re-emphasized that the contextual state is not the post-measurement state, reached after the joint measurement has been carried out (cf. section 3.3.5). It, indeed, could be obtained as the post-measurement state if the measurement interactions would prepare the particles according to weak projection. However, as stressed in section 3.2.4, this is satisfied by hardly any realistic measurement
procedure. Moreover, it is irrelevant because in general it is not the final object state but the final state of the measuring instrument that counts in a measurement. The contextual state might be seen as a contextual description of the initial state as it is in the measurement context of the measured observable (see also section 6.6). A contextualistic-realist interpretation of the contextual state would corroborate Bohr’s idea that quantum mechanics is yielding information about a microscopic reality that is in interaction with the measuring instrument set up for obtaining this information. Einstein’s ideal that quantum mechanics should yield an objective description of microscopic reality does not seem to be practicable after all. In any case, due to the equality all information provided by (standard) quantum mechanics is consistent with the view that this information may refer to the contextual state rather than to the state the microscopic object was in before it met the measuring instrument (note that this latter state would not even be described by quantum mechanics if is just a symbolic representation of a preparation procedure, as it is in an empiricist interpretation). Note also that, since and the contextual state are dissimilar quantities, it is not necessary to realize a transition from one to the other (for instance, by means of a decoherence process, cf. section 3.4). Both refer to the same physical object, be it in different ways. In summary, Bohr might be right in supposing that, if quantum mechanics yields a description of microscopic reality at all, then it must be a reality that is in interaction with some measuring instrument, i.e. an observed reality. By means of the contextual state this idea can be extended from the physical quantities, considered by Bohr, to quantum mechanical states. The problem of ‘nonlocality’ does not arise if Bohr’s correspondence principle is applied in a strictly interactional sense. 
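The locality property of the contextual state mentioned above (that the reduced contextual state of each particle can be obtained by weak projection applied to that particle alone) can be made explicit in a short calculation. The notation is an assumption introduced for this sketch: $E_m$ and $F_n$ denote the spectral projections of the observables measured on particles 1 and 2, $\rho$ is the initial two-particle state, and weak projection is taken to be the map $\rho \to \sum_{m,n} (E_m \otimes F_n)\,\rho\,(E_m \otimes F_n)$.

```latex
\[
\mathrm{Tr}_1 \sum_{m,n} (E_m \otimes F_n)\,\rho\,(E_m \otimes F_n)
  = \sum_n F_n \Big[ \mathrm{Tr}_1 \sum_m (E_m \otimes I)\,\rho\,(E_m \otimes I) \Big] F_n
  = \sum_n F_n \,\big(\mathrm{Tr}_1 \rho\big)\, F_n .
\]
```

The last step uses the cyclicity of the partial trace with respect to operators acting on particle 1 only, together with $E_m^2 = E_m$ and $\sum_m E_m = I$. The right-hand side is weak projection applied to the reduced state of particle 2, and it no longer contains any reference to the observable measured on particle 1.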
Conditional preparation can be consistently implemented if taken in an empiricist or instrumentalist sense. Notwithstanding the locality gained by this way of looking at quantum mechanics, this, presumably, would not have been an attractive solution for Einstein, since he aspired to a description of reality as it is independent of any interference by measurement. Unfortunately, since all our knowledge of microscopic reality is gained by letting it interact with measuring instruments, such an objective reality, represented by the ‘element of physical reality’, is rather elusive, and without a clear operational meaning. If quantum mechanics is adequately describing measurement results of actually performed measurements, then it is hardly to be expected that the same theory would also describe ‘elements of physical reality’. Perhaps we should content ourselves with a less exacting interpretation of quantum mechanics than the objectivistic-realist one. For Einstein this might have been an even more pressing reason to reject quantum mechanics than its ‘incompleteness in a wider sense’. On the other hand, there is no reason to believe that the domain of observed reality cannot be extended beyond the domain of quantum mechanics. Even if our knowledge will never be knowledge of an objective reality, independent of the way
it is observed, there is, from a physicist’s point of view, no reason to abandon completely the idea of such an objective unobserved reality. ‘Incompleteness of quantum mechanics in a wider sense’ may provide a source of inspiration to investigate the boundaries of the domain of application of quantum mechanics, and to try to transcend this domain both in an experimental and in a theoretical way. Some of these attempts will be discussed in chapter 10.
Chapter 6
Individual-particle and ensemble interpretations of quantum mechanics
6.1 Introduction
With respect to quantum mechanical observables the discussion between Bohr and Einstein took place entirely within the context of the realist interpretation defined in section 2.3, in which observables are looked upon as properties of the microscopic object. Admittedly, with respect to the interpretation of the state vector there was a difference between the two opponents, in the sense that, unlike Bohr, Einstein also interpreted the state vector realistically. However, the discussion was focused on physical quantities (cf. section 5.2). Due to the resulting confinement to a realist interpretation the essential role played in the discussion by this very interpretation has largely remained under-exposed. Instead, a different question was at the center of interest, namely, whether quantum mechanics does, or does not, yield a complete description of reality. In terms of state vectors (cf. section 5.4) this can be formulated as the question of whether the state vector describes an individual object (individual-particle¹ interpretation) or an ensemble of such objects (ensemble interpretation²). Einstein favored the latter possibility, while Bohr and Heisenberg seem to opt for the first one (e.g. Rayski [272]). The issue of ‘individual-particle versus ensemble interpretation of the state vector’ can be seen as an implementation of the question of ‘(in)completeness in a wider sense’. As seen in section 4.2 this question is thoroughly intertwined with the problem of ‘(in)completeness in a restricted sense’. In the present chapter we shall in the first place deal with it from the ‘wider’ point of view. By doing so we follow Einstein in his attempt to interpret the state vector as an objective description of reality (be it a statistical one) independent of any measurement, and possibly to be improved upon by a subquantum theory. By taking this line it will be easier to discover the limits to individual-particle and ensemble interpretations than by just invoking the measurement interaction to set such limits by way of principle (as would be implied by an approach on the basis of ‘completeness in a restricted sense’). In particular, by circumventing in this way the Copenhagen preoccupation with measurement it is possible to deal on an equal footing with the Copenhagen instrumentalist individual-particle interpretation of the state vector, and the more realist version referred to in section 2.3.
It has often been felt that one of the main advantages in switching from an individual-particle interpretation to an ensemble one is the possibility of abolishing the strong projection postulate (cf. (1.70)), thus solving the conventional “measurement problem” (cf. section 3.1). Indeed, the strong projection postulate is necessary if quantum mechanical measurements are thought to follow the prescription of Heisenberg’s theory of measurement (cf. section 4.6.1), which is at the basis of the Copenhagen interpretation. On the other hand, if the linear superposition (1.69) is interpreted as a description of an ensemble rather than an individual object, then it is not necessary any more that the final state correspond to one single value of the observable.
¹ We shall also use this terminology if the state vector refers to other objects than one single particle.
² A recent review of ensemble interpretations has been given by Home and Whittaker [271].
Thus, if the state vector (3.1) is not interpreted as representing an individual cat but an ensemble of cats (each of which is either alive or dead), then at least the most obvious interpretative problems are resolved. Although, of course, the problem of the “cross” terms, as discussed in chapter 3, remains a problem in an ensemble interpretation (because this is a problem of the mathematical formalism, largely independent of the interpretation), particular aspects of this problem are less acute, and even thought to be solved. Even the strong projection postulate, viewed as a principle of conditional preparation (cf. section 3.2.6), might be thought to be made respectable then. No causal explanation of a state change due to conditional preparation is necessary: the state change can be understood as a transition to a description of a subensemble, selected on the basis of an observation of a particular measurement result. This, indeed, seems to provide a strong argument in favor of an ensemble interpretation.
However, adoption of an ensemble interpretation is not a panacea for the solution of all conceptual problems of quantum mechanics. The mathematical structure of a quantum ensemble differs in an essential way from that of a classical one. Consequently, in so far as the ensemble was introduced for making quantum mechanics “understandable” in classical terms, this implies that there is not much progress. The question of the structure of a quantum ensemble becomes important when dealing with subensembles. Ultimately this leads us back to the individual object. As long as we do not have a clear picture of the way a quantum ensemble is composed of individual objects, we can hardly consider this issue as solved.
One reason to prefer an individual-particle interpretation may be the classical paradigm discussed in section 2.4.2, trying to interpret quantum mechanics as much as possible along the lines of classical mechanics. This may be the reason that, even though nowadays many textbooks of quantum mechanics employ the terminology of an ensemble interpretation, the Copenhagen individual-particle terminology has not ceased to pervade everyday parlance. For instance, the eigenvectors of the Hamilton operator are considered as “the stationary states of the system”, to be “occupied” by an individual object, between which this object can jump to and fro (compare Bohr’s atomic model). Moreover, an individual measurement is often thought to reveal “which eigenstate of the measured observable the object was in”. In an ensemble interpretation of the quantum mechanical state vector such phrases are meaningless (see section 6.2.3 for a review of the origin of this terminology).
Another reason for not being satisfied by an ensemble interpretation of quantum mechanics might be the circumstance that nowadays we are able to observe an individual atom during a prolonged time interval (compare the fluorescence measurements discussed in section 3.2.7). Such systems are contemplated for the purpose of quantum computation, and it would be particularly important if we were able to describe an individual quantum computer rather than just an ensemble of these. Does quantum mechanics yield such a description? In the quantum mechanical literature a tendency can be observed to answer this question affirmatively, and to interpret the state vector as a description of an individual particle rather than an ensemble.
However, even if it is possible to observe an individual object, then it is yet questionable whether the requirement that quantum mechanics completely describe this individual object is either necessary or useful. It is not necessary because an ensemble interpretation could be a viable alternative. For a description of the individual object a different theory than quantum mechanics may be required, which theory will perhaps have to be very different from classical mechanics in order to be able to account for the non-classical character of the quantum mechanical formalism. It also does not seem to be useful, because the assumption that the state vector completely describes the individual object does not allow any application of the quantum mechanical formalism that would not be possible in an ensemble interpretation (compare Margenau’s views mentioned in section 5.4.2). Moreover, as was seen in section 3.2.7, an individual-particle interpretation of the quantum mechanical description of quantum jump processes observed in fluorescence measurements meets serious difficulties if the question is asked what is their cause (as noted in section 3.2.6, such a causal explanation is not required in an ensemble interpretation).
In discussions of the choice between an individual-particle interpretation and an ensemble one usually both interpretations are taken in the realist sense defined in section 2.3. This makes it difficult to make a definite choice between the two. The misgivings with respect to realist interpretations of quantum mechanics, raised in chapter 2, apply to both the individual particle version and the ensemble one. The possibility of an empiricist interpretation (section 2.2) provides a different perspective for assessing this problem. Thus, it is possible to interpret in an empiricist sense the discontinuous graph depicted in figure 3.3 as a representation of one single (compound) value of a set of observables measured consecutively (cf. section 3.2.7), quantum mechanical probabilities just referring to the relative frequencies of these (compound) events. There is no necessity to interpret figure 3.3 as implying quantum jumps of the state vector of an individual atom. In an empiricist interpretation the state vector is a label of a preparation procedure. This virtually entails an ensemble interpretation, because different individual preparations (corresponding to different individual measurement results) may result from one and the same preparation procedure. In general the quantum mechanical formalism is not demanded to distinguish between different individual preparations corresponding to a preparation procedure represented by a quantum mechanical state vector: for operational purposes we may content ourselves with quantum mechanics as a theory yielding a statistical description. In an empiricist view only the relative frequencies of the measurement results of actually performed measurements have physical relevance. Hence, strictly speaking, in the empiricist interpretation favored in this book (cf. chapter 2) a discussion of ‘completeness’ in the sense of an individual-particle interpretation is virtually superfluous. 
In an empiricist interpretation the theory is not thought to yield a complete description of an individual preparation: it simply is incomplete. The state vector is not attributed to the individual preparation, but to the preparation of an ensemble. The question of whether an individual-particle interpretation is useful at all can arise only in a realist interpretation.
Nevertheless, some attention will be paid to the choice between an individual-particle and an ensemble interpretation because this latter interpretation has its problems too. A discussion of these problems can show that a weakening of the realist interpretation by considering the state vector as a description of an ensemble may not be sufficient to resolve all problems. The question “Ensembles of what?” also cannot be left undiscussed, because it is closely connected to the choice between a realist and an empiricist interpretation. We should be aware of the possibility that an answer to the question of whether the state vector describes the microscopic reality of an individual object or of an ensemble may be influenced by a prior choice between a realist and an empiricist interpretation of the state vector. As a matter of fact, there is a considerable interference between the realism/empiricism controversy and the individual-particle/ensemble one (cf. sections 6.4 and 6.5).
Anticipating this discussion it is stressed here that it is hardly the controversy between individual-particle interpretation and ensemble one, but rather the question of what are the elements of the ensemble (do they refer to the microscopic object or to the measuring instrument?) that seems to be most important for understanding quantum mechanics.
6.2 Probabilistic and statistical interpretations of quantum statistics
6.2.1 The ‘statistical’ interpretation
As mentioned in section 4.7.3, in Ballentine’s view [252] quantum mechanical observables do not differ in an essential way from classical physical quantities, even though the former are subject to restrictions with respect to preparation: repeated individual preparations yield an ensemble in which the statistical spreadings of the values of quantum mechanical observables satisfy the Heisenberg inequality (1.78). According to Ballentine a well-defined value can be attributed to each observable of an individual object (compare the ‘possessed values’ principle, section 2.3). When individual preparations are carried out N times, and $N_m$ is the number of times measurement result $a_m$ is found, then³
\[ p_m = \lim_{N \to \infty} \frac{N_m}{N}. \]
Ballentine proposes to refer to this interpretation of quantum mechanics as the ‘statistical’ interpretation. In this view quantum mechanical uncertainty is not a property of an individual microscopic object (as it is in the Copenhagen interpretation, cf. section 4.2.2). On the contrary, it is thought to be a property of an ensemble of individual objects. During each individual preparation a sharp value of a quantum mechanical observable may be prepared, and, in agreement with the ‘faithful measurement’ principle (section 2.4.3), registered as a measurement result. In this view quantum mechanical uncertainty is a property of the preparation process, which is subject to fluctuations that are responsible for different (sharp) values of quantum mechanical observables in different individual realizations of the preparation. In contrast to an empiricist interpretation, in the ‘statistical’ interpretation the state vector is not thought to describe the preparation procedure itself, but rather the result of a preparation (see also Ballentine, [103], p. 33), viz. the state of an ensemble of objects. This makes the ‘statistical’ interpretation a realist one.
Ballentine’s ‘statistical’ interpretation is closely related to Einstein’s idea that quantum mechanics is an incomplete theory (‘in a wider sense’, cf. section 4.2.1), fit to make statistical assertions only, and not telling anything (or, at least, not everything) about the actual value of a physical quantity of an individual object. According to this view in a pure state the Copenhagen interpretation is incorrectly interpreting distinct individual preparations as identical. Even in a pure state variables of the preparation may exist that are hidden up to now, but that nevertheless may be held responsible for the fact that the preparations are different from each other. Distinct individual measurement results could be explained by distinct individual preparations. This view is in agreement with a subjectivistic interpretation of statistics, according to which a statistical description is necessary because of our (subjective) lack of knowledge regarding the object. It is not deemed impossible that someone possessing more knowledge about the object would be capable of giving a better description in which, for instance, sharp values of position Q and momentum P are both attributed to an individual object. For obvious reasons subjectivistic interpretations are often referred to as ‘ignorance’ interpretations.
³ Here it is generally assumed that the limit exists. This is far from evident, however. The existence of a limit to the relative frequency is an interesting subject that may have relevance to the question under which experimental conditions quantum mechanics is applicable. Since it is necessary to transcend quantum mechanics for addressing this problem, this question cannot be dealt with here. Some ideas will be developed in chapter 10, although the problem is far from solved. Here we shall conform to the general assumption that the physical conditions for the existence of the limits are satisfied in general.
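The relative-frequency reading of probability used in this subsection can be illustrated by a small simulation. It is a sketch only: the probabilities below are hypothetical, and drawing results from a fixed distribution builds in the very assumption that the relative frequencies converge, so the code illustrates that assumption rather than testing it.

```python
import random

def relative_frequencies(probabilities, n_trials, rng):
    """Simulate n_trials individual measurements and return, for every
    result m, the relative frequency (count of result m) / n_trials."""
    results = range(len(probabilities))
    draws = rng.choices(results, weights=probabilities, k=n_trials)
    counts = [0] * len(probabilities)
    for m in draws:
        counts[m] += 1
    return [count / n_trials for count in counts]

rng = random.Random(1)            # fixed seed so that the run is reproducible
probabilities = [0.5, 0.3, 0.2]   # hypothetical probabilities of three results

for n_trials in (100, 10_000, 200_000):
    freqs = relative_frequencies(probabilities, n_trials, rng)
    print(n_trials, [round(f, 3) for f in freqs])
```

Each individual run produces one sharp result. The probabilities appear only in the frequencies of the whole series, and the simulation is silent on whether the spread originates in the individual preparations or in the measurements.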
6.2.2 Individual-particle interpretation of pure states
The subjectivistic ignorance view discussed above is in disagreement with the Copenhagen completeness thesis (to be understood ‘in a restricted sense’, cf. section 4.2.2). This latter thesis does not allow one to attribute to an individual object, independently of measurement, a well-defined (although possibly unknown) value of a quantum mechanical observable. Thus, in the Copenhagen interpretation it is not thought to be meaningful to distinguish between the undecayed states of two nuclei in a radioactive sample at any time before they decay, even if one nucleus will decay within one second and the other one only after one hour (‘a nucleus has no age’). The state vector is thought to describe the information that can be maximally obtained if a measurement is actually performed. It does not contain any information on the precise moment an individual nucleus will decay in the future. The universality of the exponential character of the decay strongly suggests that this information is nonexistent. Unlike the ‘statistical’ interpretation the Copenhagen one does not attribute a well-defined position to a particle prior to a position measurement. According to Jordan [273] the measurement result in a quantum mechanical measurement was not there beforehand, but is created in the measurement. By a position measurement
of an electron the particle may be forced to assume a certain well-defined position; beforehand the electron was “neither here, neither there”. Quantum mechanical measurement results are not properties possessed by the object prior to measurement, but they are emergent in the measurement. According to Jammer ([216], p. 162) this view became the characteristic feature of the complementarity interpretation in the early 1930s: there does not exist a causal (deterministic) explanation of why the electron chooses a certain position if a measurement of that observable is carried out.
In the Copenhagen interpretation statistics is thought to have an objective meaning rather than a subjective one (the latter being the case in Ballentine’s ‘statistical’ ensemble interpretation). Quantum probability is thought not to be a matter of incomplete knowledge, but is considered a consequence of an objective property of microscopic reality itself. Quantum statistics is interpreted in a ‘probabilistic’ rather than in a ‘statistical’ sense. When we perform a measurement of a quantum mechanical observable, then, in this view, we do not measure the value the observable had immediately preceding the measurement, because at that instant the observable simply did not have a value yet⁴. In this context Popper [274, 275] refers to a certain inclination (‘propensity’) of the object to manifest a value of an observable on measurement (compare also Heisenberg’s idea, referred to in section 4.6.2, on the relevance of the Aristotelean concept of ‘potentia’). In the probabilistic view not the values of the observables but only their propensities or inclinations (being determined by the whole experimental arrangement) can be attributed to the object as (contextual) properties. They are represented in the formalism by the probabilities that value of observable A will be found on measurement. They are determined by the state vector through relation (1.5).
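The remark made earlier in this subsection that ‘a nucleus has no age’ corresponds to a standard property of the exponential decay law, sketched here with symbols introduced only for this illustration: a decay constant $\lambda$ and a random decay time $T$ with survival probability $P(T > t) = e^{-\lambda t}$. For a nucleus that is still undecayed at time $t$:

```latex
\[
P(T > t+s \mid T > t)
  = \frac{P(T > t+s)}{P(T > t)}
  = \frac{e^{-\lambda (t+s)}}{e^{-\lambda t}}
  = e^{-\lambda s}
  = P(T > s).
\]
```

The statistics of the remaining lifetime therefore do not depend on how long the nucleus has already existed. This is consistent with the view that the state of an undecayed nucleus carries no information about its moment of decay, although by itself it does not rule out an ignorance reading.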
For this reason in the Copenhagen interpretation the state vector is deemed to represent a statistical element of an irreducible nature. Quantum mechanical completeness may be interpreted as meaning that the state vector (wave function) does not describe an ensemble of objects but that it completely describes an individual object. It is impossible to establish in an unequivocal way one single interpretation of the state vector as “the” Copenhagen interpretation⁵. In particular, there is a large difference between the instrumentalist sense in which Bohr interpreted the state vector, and the realist way in which in a large part of the Copenhagen literature the probabilities are considered as properties of an (individual) microscopic object. Also the values of observables are generally attributed to this object, be it not possessed preceding the measurement, but afterwards (cf. section 4.6.1). For Bohr the state vector had above all an epistemic meaning, not an ontic one (cf. section 4.6).
⁴ Margenau ([163], p. 175) has introduced the concept of latent observables, thus distinguishing quantum mechanical observables from the classical physical quantities that might be considered as real properties of the object (so-called “possessed properties”).
⁵ Thus, Stapp [215] refers to “the inhomogeneous body of opinions and views that now constitute the Copenhagen interpretation.”
CHAPTER 6. INDIVIDUAL PARTICLE VERSUS ENSEMBLE
Bohr repeatedly warned against “creation-out-of-the-blue” of measurement results as implied by Jordan’s above-mentioned view. However, this divergence between Jordan and Bohr was not generally appreciated. Bohr’s views on complementarity were widely interpreted in the sense that ‘what cannot be defined by quantum mechanics, cannot exist’, thus confounding epistemology and ontology (as is usual in an epistemic realist interpretation of physical theory, cf. section 2.3; compare also Heisenberg’s idea on the impossibility of a simultaneous measurement of position and momentum, criticized in section 2.4.2). It seems probable that Bohr’s not keeping sufficient distance from the realism of Einstein’s interpretation of quantum mechanical observables (as opposed to an empiricist one, cf. section 5.3.1) has contributed to this development by confusing the issue. Otherwise it can hardly be understood how it is possible that, notwithstanding Bohr’s resistance, Jordan’s “creation-out-of-the-blue” philosophy could become a characteristic feature of the Copenhagen interpretation. Perhaps due to the equally epistemic-realist maxim that ‘what is defined by quantum mechanics does exist’, it seems that Bohr’s cautious epistemic attitude with respect to the state vector has also been replaced by a more ontic demeanor. In most textbooks of quantum mechanics it is Bohr’s completeness thesis of quantum mechanics (cf. section 4.2) that only seems to mark the difference between Einstein’s interpretation and the Copenhagen one. In the latter interpretation the state vector is also often interpreted realistically, thus reducing the difference between the interpretations to the question of whether the state vector is thought to describe an ensemble (or, equivalently, yields an incomplete description of the individual object), or whether it is supposed to yield a complete description of an individual object. 
Notwithstanding Bohr’s cautious instrumentalism with respect to the wave function, this rather sloppy way of adhering to the Copenhagen interpretation marks it for the main part as a realist individual-particle interpretation. The reality of the object is then thought to be made up by the totality of all probabilities of all quantum mechanical observables. Such a tendency toward realism may be responsible for the popular but misleading picture in which a quantum mechanical observable jumps in a stochastic way between its different values, a measurement revealing the value the observable happens to have at the moment of the measurement (compare section 6.4). It seems that this view is actually a blend of Einstein’s objectivistic-realist interpretation and the Copenhagen one.
An interpretation of the wave function as yielding only statistical information has already been proposed by Born [276]. According to Beller [277] this proposal was made mainly on mathematical grounds⁶, and was hardly accompanied by any reference to interpretation. In particular, no clear distinction was drawn between probabilistic and statistical interpretations as discussed above⁷, although the Born interpretation is generally presented as an inherent part of the Copenhagen interpretation. This would make the Born interpretation a probabilistic one. Often the Born interpretation is formulated in a vaguely empiricist or contextualistic-realist fashion⁸, interpreting the quantity given by relation (1.5) as the probability of finding a particular measurement result on measuring observable A. This demonstrates that alternatives to the subjectivistic ignorance view of an objectivistic-realist ensemble interpretation were available, and contemplated. However, the tendency, observed in section 2.4.2, to interpret the quantum mechanical formalism as much as possible in analogy with the classical mechanical one, may have caused the awareness of a possible difference between the Born interpretation and the ‘statistical’ one to weaken. The Born interpretation, too, is often referred to as an ensemble interpretation⁹, without realizing that an interpretation of a state vector as a (complete) description of an individual particle need not be fully equivalent to an ensemble interpretation.
From the discussion given above it will be clear that, as far as the Copenhagen interpretation is thought to refer to microscopic reality, this reality differs appreciably from the reality of classical mechanics. In particular, in the Copenhagen view reality is fundamentally indeterministic. A certain determinism might be expected only when the state vector is an eigenvector of the measured observable, because then the measurement result can be predicted with certainty. In that case it is customary also in the Copenhagen interpretation to attribute a measurement result to the microscopic object as a property the object possessed preceding the measurement. Although this is seemingly unproblematic, it actually is a source of the interpretative problems of quantum mechanics. It therefore seems wise even in an eigenstate to resist the temptation to interpret the observable in an objectivistic-realist sense, and not to surpass the empiricist (or contextualistic-realist) restraint with respect to attributing quantum mechanical properties to the object as properties possessed independently of measurement. Note that this does not imply a denial of the determinism mentioned above. However, this determinism should preferably not be based on the attribution of a quantum mechanical measurement result as an ‘element of physical reality’, since the latter must presumably be of a sub-quantum mechanical nature (see also sections 6.4.2, 6.4.3 and section 10.6).
⁶ Presumably Auletta ([278], section 6.4) is right when ascribing to Born an instrumentalist interpretation of the wave function.
⁷ Failure to spell out this distinction may be responsible for the inability of Born and Einstein to understand each other’s points of view ([279], especially p. 186 and p. 210).
⁸ The possibility that the measurement could play an active role in determining the value of an observable was suggested by Pauli in a letter to Born ([279], p. 223-224).
⁹ In [279], p. 280 Born refers to his interpretation as an ensemble interpretation.
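The role played by relation (1.5) in this subsection can be made concrete numerically. The following is a minimal Python sketch, assuming that (1.5) is the usual Born rule, i.e. that the probability of a value equals the squared modulus of the overlap between the state vector and the corresponding eigenvector. The function name `born_probabilities` and the two-valued observable are hypothetical choices made only for illustration.

```python
import numpy as np

def born_probabilities(state, eigenvectors):
    """Return |<a_m|psi>|^2 for each column a_m of `eigenvectors` (assumed Born rule)."""
    state = np.asarray(state, dtype=complex)
    state = state / np.linalg.norm(state)        # normalize the state vector
    amplitudes = eigenvectors.conj().T @ state   # overlaps <a_m|psi> for every m
    return np.abs(amplitudes) ** 2

# Hypothetical two-valued observable A, chosen only for illustration.
A = np.array([[1.0, 0.0], [0.0, -1.0]])
values, vectors = np.linalg.eigh(A)              # columns of `vectors` are eigenvectors

superposition = np.array([1.0, 1.0])             # equal-weight superposition
eigenstate = vectors[:, 0]                       # an eigenvector of A

p_super = born_probabilities(superposition, vectors)
p_eigen = born_probabilities(eigenstate, vectors)
```

For the eigenvector one of the probabilities equals 1, which is the limited determinism referred to above; for the superposition only the probabilities themselves are fixed by the state vector.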
6.2.3
Ensembles in the Copenhagen interpretation
In order to avoid misunderstandings it should be noted here that the Copenhagen interpretation, too, has its ensembles. In this interpretation the state vector is thought to yield the most complete description of an individual particle that is possible. Von Neumann [2] interprets the density operator (1.32) in a statistical (not: probabilistic) sense, viz. as a description of an ensemble of particles. In agreement with the individual-particle interpretation each of the particles is thought to be completely described by one of the state vectors appearing in (1.32). The relative frequency of this state in the ensemble is given by the corresponding weight in (1.32). Application of the density operator is held to be necessary only because we do not know with certainty in which of these states the individual particle actually is¹⁰.
Von Neumann’s interpretation of the density operator draws on Bohr’s idea (cf. section 4.6.3) that quantum mechanical state vectors are direct generalizations of the states of classical mechanics (viz. the phase space points); density operators should just be compared with the phase space distributions of classical statistical mechanics (cf. section 1.10.3). This interpretation of the density operator may have been induced by the observation that the expectation value of an arbitrary standard observable A in the state (1.32) is equal to the correspondingly weighted sum of the expectation values of A in the separate state vectors and, hence, does not contain any interference term between different state vectors. Therefore there is no experimental evidence of any superposition of quantum mechanical states, analogous to (1.4), over the ones contained in the separate terms of (1.32). This suggests a classical interpretation of the probabilities appearing in (1.32), which appear to be uninteresting from a quantum mechanical point of view. Hence, in this view the difference between classical and quantum mechanics is solely embodied in the pure states. We shall refer to an ensemble in the Copenhagen sense as a von Neumann ensemble. This interpretation of the density operator (1.32) is sometimes called the ignorance interpretation of states.
This is to be distinguished from the ‘ignorance interpretation of observables’ as embodied by the ‘statistical’ interpretation, although the notion of a von Neumann ensemble seems to establish a certain link between these two ignorance interpretations. Thus, in the Copenhagen interpretation complementarity (in the sense that an individual particle cannot simultaneously “have” sharp values of incompatible observables) can be understood because such a particle cannot simultaneously “be” in eigenstates of incompatible observables.
¹⁰ The terminology used here is a realist one. In an empiricist understanding of von Neumann’s interpretation of the density operator we would have an ensemble of preparation procedures, each labeled by one of the vectors appearing in (1.32).
Von Neumann did consider a pure state as a description of an ensemble, although he qualified such an ensemble as a homogeneous one. This means that all elements of the ensemble are considered as “equal” or “identically prepared”. Mixtures represented by density operators like (1.32) were considered by him as inhomogeneous ensembles, consisting of different homogeneous subensembles, each corresponding to a pure state. Such an interpretation is strongly suggested by the circumstance that the pure states are the extreme elements of the convex set of density operators (cf. section 1.4 and appendix A.11.3), exhibiting a certain analogy between the structure of the convex subsets of a vector space and the convex structure of the set of probability measures on classical phase space (cf. (A.104)). However, this suggestion should be approached with some care because, as mentioned in appendix A.12, an extreme element can lose its extremeness property if the lattice of subsets of the probability space is refined (by means of an extension of the set of observed quantities): ensembles that appear to be homogeneous may turn out to be inhomogeneous if observed in more detail. If von Neumann considered pure states to be homogeneous notwithstanding the fact that these satisfy Heisenberg’s inequality (1.78), and, hence, are not dispersionless (in contrast to what is the case in classical physics), then this demonstrates his conviction that no future observation will be able to select a subensemble yielding a more complete description than is contained in the quantum mechanical pure state. However, as will be seen in section 10.2.1, this conviction was not as firmly based as for a long time it was thought to be.
It is also emphasized already here that an interpretation of density operators as describing von Neumann ensembles entails serious difficulties (see also section 6.3.2). Indeed, the analogy, referred to above, between quantum mechanics of pure states and classical mechanics, that is responsible for the individual-particle interpretation, is a dubious one. This can most easily be seen by considering time evolution.
In classical mechanics the notions of states and observables coalesce, since the state at a given time is completely determined by the values that the physical quantities spanning phase space (position and momentum) take at that time. This implies that in classical mechanics states and observables have identical time evolutions. However, in quantum mechanics states and observables have different time evolutions (cf. (1.44) and (1.46)). This marks a breakdown of the analogy between classical and quantum mechanics. On the other hand, in section 1.10.3 it is seen that the formalisms of quantum mechanics and classical statistical mechanics distinguish between states and observables in a very similar way (compare (1.119) and (1.124)). For this reason it may be more appropriate to exploit the analogy of quantum mechanics with classical statistical mechanics rather than with classical mechanics proper¹¹. However, then there is no fundamental distinction any more between pure states and mixtures. The former, satisfying (1.33) and (1.34), are merely considered as special cases of the latter (see also Mermin [280]), in the same way as the states (1.117) are just special cases of the general classical statistical mechanical state.
¹¹ It is stressed by Ballentine [103] that not classical mechanics but classical statistical mechanics is obtained in the classical limit of quantum mechanics.
A similar conclusion was reached by Park and Band [281, 282], who refer to an “almost forgotten” theorem already due to Schrödinger [283] (cf. appendix A.12.2). They stress the non-uniqueness of the decomposition (1.32) of the density operator, obtained for orthogonal state vectors if some of the frequencies are equal, since then the vectors corresponding to equal frequencies can be replaced by unitarily equivalent vectors. An ‘ignorance interpretation of states’ then has a serious problem in choosing whether the individual particles of the von Neumann ensemble “are in” the original states or in the unitarily equivalent ones. Park has extended the above criticism to the case of unequal frequencies. Although in this case the decomposition is unique if we restrict ourselves to orthogonal states, an analogous non-uniqueness may hold if the vectors in the density operator are allowed to be non-orthogonal (compare the example given in (A.111)). In that case the density operator once again has different representations (1.32), since we also have expansion (1.36) in terms of its orthogonal eigenvectors. Hence it could as well describe a von Neumann ensemble consisting of subensembles corresponding to these eigenvectors. Since the idea of a von Neumann ensemble does not imply any reason for assuming orthogonality of the vectors contained in the density operator (1.32), this demonstrates that the ‘ignorance interpretation of states’ is problematic also in the general case.
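Both kinds of non-uniqueness can be checked in the smallest possible case. The following Python sketch assumes a two-dimensional Hilbert space and assumes that (1.32) has the usual form of a weighted sum of projectors; the helper `mixture` and all vector names are hypothetical and serve only as illustration.

```python
import numpy as np

def mixture(weights, vectors):
    """Density operator sum_i p_i |psi_i><psi_i| built from normalized vectors."""
    rho = np.zeros((2, 2), dtype=complex)
    for p, v in zip(weights, vectors):
        v = np.asarray(v, dtype=complex)
        v = v / np.linalg.norm(v)
        rho += p * np.outer(v, v.conj())
    return rho

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus, minus = (up + down) / np.sqrt(2), (up - down) / np.sqrt(2)

# Equal frequencies: two different orthogonal decompositions of one operator.
rho_z = mixture([0.5, 0.5], [up, down])
rho_x = mixture([0.5, 0.5], [plus, minus])

# Non-orthogonal vectors: a decomposition into `up` and `plus` ...
rho_nonorth = mixture([0.5, 0.5], [up, plus])
# ... and the orthogonal decomposition of the same operator into its eigenvectors,
# whose eigenvalues (the new frequencies) are unequal.
eigvals, eigvecs = np.linalg.eigh(rho_nonorth)
rho_spectral = mixture(eigvals, eigvecs.T)
```

Since `rho_z` equals `rho_x`, and `rho_nonorth` equals `rho_spectral`, no expectation value can distinguish the alternative ensembles, which is the difficulty for the ‘ignorance interpretation of states’ described above.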
6.3
Problems of an individual-particle interpretation
6.3.1
Spreading of the wave packet
The form in which the individual-particle interpretation is adopted in the quantum mechanical literature is often not to be understood in Bohr’s instrumentalist sense, but in a more realist one. The quantum mechanical wave packet often seems to be considered as a description of the (individual) object itself (an electron as a wave-like object (wavicle)) rather than as an instrument for calculating the probabilities of measurement results. This is the model Schrödinger originally had in mind, and which he thoroughly studied [284]: a quantum mechanical particle as a matter wave, its amplitude squared being a measure of mass density. As is well known this model met considerable difficulty. After encouraging initial results in applications to the harmonic oscillator, in more realistic systems like the hydrogen atom the electron wave packet turned out to be subject to spreading, smearing the matter distribution over all available space, thus contradicting particle localization. A very clear example of such spreading occurs in a double-slit experiment, in which two different paths are available to a particle (for instance, photons at a partially reflecting mirror, cf. figure 2.4, or neutrons in a neutron interferometer, cf. section 8.2). An individual-particle interpretation of the wave packet would mean that the individual object (photon or neutron) would split into two
parts, each of which following a different path. Such a splitting is in agreement with the mathematical formalism (cf. section 2.4.4), but is rather embarrassing in an individual-particle interpretation. In the Copenhagen interpretation it has been tried, in the spirit of the principles of ‘correspondence’ and ‘complementarity’, to accommodate this behavior of the wave packet by appealing to the mutually exclusive measurement contexts allegedly determining whether a photon or a neutron is manifesting itself as a particle or as a wave. In the context of a path measurement the object was thought to behave as a particle, whereas in the case of interference it allegedly would behave as a wave. Yet, as already put forward in sections 2.2.1 and 4.5.2 (also section 4.6.4), such a view seems to be in complete disagreement with the particle-like character by which individual objects manifest themselves during the gradual development of an interference pattern (cf. figure 4.4). This phenomenon is evident in the very measurement arrangement necessitating the splitting of the wave packet (see also the neutron interference experiments discussed in section 8.2). For this reason it is not surprising that Schrödinger’s interpretation was soon abolished, and replaced by Born’s ensemble interpretation viewed as a ‘statistical’ one, in which the object in a double-slit experiment is thought to be particle-like, choosing one of the two paths and certainly not splitting into two (as does the wave packet). Here the wave function is thought just to describe the relative frequency that a certain path is chosen by an individual element of the ensemble. It should be emphasized that the Copenhagen ‘probabilistic’ individual-particle version of the Born interpretation does not solve the problem of wave packet spreading. 
Even though the Born interpretation allows Schrödinger’s matter wave to be replaced by a ‘probability’ wave, this interpretation is still in trouble if the wave packet is interpreted as a probabilistic description of an individual particle. With respect to the spreading of the wave packet the Copenhagen interpretation in the Born version meets objections that are similar to those directed against Schrödinger’s interpretation: since the particle must with a large probability be at a position where the wave function has a large amplitude, a localization problem arises for a particle represented by a wave function that is a superposition of two wave packets localized in different regions of space. Sometimes a ‘quantum jump’ picture is advanced to “explain” the bi-locality of the position probability distribution corresponding to the superposition, in which the particle is jumping to and fro between the different regions. However, as discussed in section 3.2.7, such pictures are not very attractive due to the acausal character of such quantum jumps. By not distinguishing between two different ‘particle’ concepts the Copenhagen individual-particle interpretation induces the same kind of confusion with respect to massive particles as observed in section 2.4.4 with respect to photons: next to the formal ‘particle’ concept, described by the wave packet, there is the informal ‘particle’ concept necessary to explain the local impacts, observed in the gradual development of an interference pattern (cf. figure 4.4). Therefore it seems that
the idea of an individual photon or neutron as a quantum mechanical wave packet flying around in space can hardly be maintained. As discussed in section 2.5.2, the formalism of quantum mechanics does not seem to be able to describe individual microscopic objects. On the other hand, if the quantum mechanical wave packet were thought to refer to an ensemble, then it would be perfectly well possible to keep considering microscopic objects as particle-like entities (cf. section 6.4.1), and to understand spreading of the wave packet on the basis of different behavior of different individual particle-like objects in the ensemble (i.e. dispersion).
In both the Copenhagen interpretation and in Schrödinger’s one it is sometimes attempted to solve the problem of the spreading of the wave packet by means of the introduction of a nonlinear term in the Schrödinger equation¹² [285]. The so-called nonlinear Schrödinger equation thus obtained indeed has particle-like solutions (soliton solutions [286]). However, by the addition of the nonlinear term the character of the Schrödinger equation is fundamentally changed, thus yielding a new theory rather than a new interpretation. In particular the superposition principle, stating that a linear superposition of two solutions of the Schrödinger equation also is a solution, gets lost. Since up to now the superposition principle has been fully consistent with all empirical evidence, we shall restrict our discussion to the linear theory¹³. Moreover, since the superposition principle is one of the cornerstones of quantum mechanics it does not seem to be very probable that the quantum mechanical wave function should have the properties of the soliton solutions of a nonlinear equation. If the above-mentioned distinction between the two different ‘particle’ concepts is duly taken into account, then there is even no reason at all for the quantum mechanical wave function to have particle-like properties.
Even though particle-like objects may exist, development of a theory describing such objects should preferably be based on a sub-quantum model. Such a model is largely lacking at this moment. Guessing the right nonlinear equation by starting from the Schrödinger equation might very well be analogous to guessing Newton’s equation of motion from the equations of thermodynamics, and, presumably, is equally impossible.
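The spreading discussed in this subsection can be quantified in the simplest textbook case, a free Gaussian wave packet, which is not an example taken from the text above. The Python sketch below uses the standard result that the position standard deviation grows as σ(t) = σ₀ √(1 + (ħt / 2mσ₀²)²) under the linear Schrödinger equation; the function name `gaussian_width`, the initial width and the sample times are illustrative assumptions.

```python
import numpy as np

HBAR = 1.054571817e-34          # reduced Planck constant (J s)
M_ELECTRON = 9.1093837015e-31   # electron mass (kg)

def gaussian_width(t, sigma0, mass=M_ELECTRON):
    """Position standard deviation of a free Gaussian wave packet at time t."""
    t = np.asarray(t, dtype=float)
    return sigma0 * np.sqrt(1.0 + (HBAR * t / (2.0 * mass * sigma0**2)) ** 2)

sigma0 = 1e-10                                  # illustrative initial width (m), atomic scale
times = np.array([0.0, 1e-16, 1e-15, 1e-12])    # illustrative times (s)
widths = gaussian_width(times, sigma0)
```

The width grows monotonically and, at late times, linearly in t with slope ħ / (2mσ₀), so a more sharply localized initial packet spreads faster. In an ensemble reading this growth simply reflects the dispersion of the momenta of different members of the ensemble.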
6.3.2
Entangled states, and individual-particle interpretation
Interaction between objects causes the system’s state to evolve into a so-called entangled state (cf. sections 1.5 and 3.2.1). The existence of such states poses a difficulty to the individual-particle interpretation. Thus, if the state of a two-particle system is described in terms of complete orthonormal sets of the two one-particle Hilbert spaces by an expansion in products of the corresponding vectors (cf. (1.49)), then it is quite unclear how a well-defined state vector could be attributed to each of the particles 1 and 2 separately. It is true that by means of partial tracing it is possible to obtain for the subsystems reduced density operators (cf. (1.53) and (1.54)) that seemingly can be interpreted as describing von Neumann ensembles in which each particle is in a well-defined state. Yet, there are two circumstances that cause considerable trouble to an individual-particle interpretation.
¹² The nonlinear terms considered here should be effective in microscopic objects, and should therefore be distinguished from the nonlinear terms, discussed in sections 3.4.3 and 3.4.4, that are effective only in macroscopic objects.
¹³ Attempts at developing a nonlinear theory can be found, for instance, in [285, 287, 288].
A first problem is posed by the observation that, if the reduced density operators are interpreted as describing ensembles rather than individual systems, then it is not clear how the two-particle state can describe an individual two-particle system rather than an ensemble. The reduced state of particle 1 just represents the information contained in the two-particle state when all information on particle 2 is ignored. As far as only particle 1 is concerned, the two-particle state and the reduced state refer to the same object. It therefore seems that both should describe either an individual particle or an ensemble. Hence, the consistency of an individual-particle interpretation seems to hinge on the feasibility of also interpreting the reduced density operators as descriptions of individual objects, or of homogeneous ensembles. This would imply that the individual particles 1 and 2 are not described by state vectors but by density operators. This, at least, would require a certain adaptation of the Copenhagen individual-particle interpretation, extending the notion of completeness to mixtures. Such a view is characteristic of the so-called ‘minimal interpretation’, to be discussed in section 6.4.1. The fundamental (in)homogeneity problem going with this will be discussed further below and in section 6.4.3.
The non-uniqueness of representations (1.53) and (1.54) of the reduced density operators poses a second difficulty. Thus, if the two-particle state is expanded in terms of a different couple of complete orthonormal sets, then the same reduced density operators are obtained in a different representation (compare (1.55)).
Due to this non-uniqueness an ‘ignorance interpretation of states’ is not very well possible for the subsystems: are the individual particles of the von Neumann ensemble of particle 1 in the states of the one representation or in those of the other? As will be discussed in section 6.6.1, the non-uniqueness problem is sometimes thought to be solvable by singling out a particular representation as the (allegedly) physically relevant one, viz. the representation corresponding to the polar decomposition (cf. section 1.5.3), in which the reduced density operators correspond to unique von Neumann ensembles if their eigenvalues are non-degenerate. Unfortunately, in the EPR problem (which is one of the most interesting problems involving entangled states¹⁴) this latter condition is not satisfied. Thus, when we determine the reduced density operators (1.51) and (1.52) for the EPR state, we get the reduced states (6.2). This suggests that the reduced state of particle 1 describes a von Neumann ensemble of particles 1, each being in one of the eigenstates appearing in (6.2) (and analogously for particle 2). As is well known (compare (5.13)), however, the reduced states (6.2) can also be represented in terms of the eigenstates of a different observable, suggesting a von Neumann ensemble of particles being in eigenstates of that observable.
¹⁴ It was already observed by Furry [289] that the EPR proposal as formulated in terms of states (cf. section 5.4) can be seen as posing a problem to the individual-particle interpretation.
It is rather unfortunate that in the Copenhagen interpretation the tension between the different interpretations of the state of the combined system and of the reduced states of the subsystems has not sufficiently been resolved. In general, pure states are attributed to individual objects. The impossibility of applying this to the separate EPR particles might have cast severe doubt on the feasibility of the very individual-particle interpretation, since in actual practice any particle is entangled by previous interactions with other particles. Instead, a certain modification of the Copenhagen interpretation is sometimes accepted in which the particles of an EPR pair, after having interacted, are allowed not to have state vectors of their own, although the particle pair is thought to have one. This is interpreted as evidence of a certain impossibility of considering as separate entities particles that have interacted in the past (d’Espagnat [160], section 8.2; see also section 9.3.2). In this view the failure of the individual-particle interpretation to account for the separate particles is a consequence of a fundamental ‘nonseparability’, allegedly valid for the microscopic world, and closely related to the ‘nonlocality’ that emerged from the EPR discussion (cf. section 5.3.1). In this way entanglement might even be seen as representing in the mathematical formalism the element of wholeness expressed by Bohr’s quantum postulate. The quantum mechanical ‘nonlocality’ problem will be discussed more fully in section 6.4.4 and in chapter 9. Suffice it to state here once more that up to now no experimental evidence of any nonseparability or nonlocality has been found. It will be demonstrated in chapter 9 that, in contrast to a widespread belief, experiments testing the Bell inequality (like those of Aspect [290]) are not experimental proofs of nonlocality.
For this reason we do not have any empirical motive for invoking a novel feature like ‘nonseparability’ to understand the failure of the individual-particle interpretation to account for the separate particles. Moreover, introducing the notion of ‘nonseparability’ does not in any way solve the fundamental (in)homogeneity problem of the individual-particle interpretation of entangled states to be discussed
in the next section. Therefore it may be more appropriate to consider, like Einstein did, entangled states as indicative of a failure of the individual-particle interpretation¹⁵.
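The degeneracy that blocks a unique von Neumann ensemble in the EPR case can be verified directly. The Python sketch below assumes the spin-singlet (Bohm) form of the EPR state for two spin-1/2 particles, which is an assumption made here for illustration; the helpers `partial_trace_over_2` and `equal_mixture` are hypothetical names.

```python
import numpy as np

def partial_trace_over_2(rho_12):
    """Reduced density operator of particle 1 for a two-qubit density operator."""
    rho = rho_12.reshape(2, 2, 2, 2)       # indices: (i1, i2, j1, j2)
    return np.einsum("ikjk->ij", rho)      # sum over the particle-2 index

def equal_mixture(v, w):
    """Equal-weight mixture of the projectors on v and w."""
    return 0.5 * np.outer(v, v.conj()) + 0.5 * np.outer(w, w.conj())

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus, minus = (up + down) / np.sqrt(2), (up - down) / np.sqrt(2)

# Assumed EPR state: the singlet (|up, down> - |down, up>) / sqrt(2).
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
rho_1 = partial_trace_over_2(np.outer(singlet, singlet.conj()))

rho_z = equal_mixture(up, down)       # ensemble of eigenstates of one spin component
rho_x = equal_mixture(plus, minus)    # ensemble of eigenstates of another component
```

Here `rho_1` is half the unit operator, so its eigenvalues are degenerate and it coincides with both `rho_z` and `rho_x`. The reduced state therefore does not single out one von Neumann ensemble, in line with the argument of this subsection.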
6.3.3
Entanglement, and (in)homogeneity of quantum ensembles
Von Neumann’s idea that ensembles corresponding to pure states are homogeneous (cf. section 6.2.3) notwithstanding non-vanishing standard deviations of observables, is presumably not unrelated to his belief that he had demonstrated the impossibility of hidden variables, serving to subdivide such ensembles into different subensembles corresponding to different values of the hidden variables (cf. section 10.2.1). Von Neumann’s “impossibility proof” can be considered as an attempt to prove the Copenhagen ‘completeness’ thesis (taken ‘in a wider sense’), demonstrating that, prior to measurement, only probabilities can be attributed to an individual object rather than some value of observable A. In a pure state all individual elements of the homogeneous ensemble are thought to be identical in the sense that all have the same probabilities. This should hold true for any observable. If tenable, this also would justify the individual-particle interpretation, because in a homogeneous ensemble the properties of each individual element coincide with the ensemble averages. Of course, it is precisely the existence of dispersion that made people like Einstein, Margenau and Ballentine doubt the Copenhagen completeness thesis, as well as the concomitant homogeneity of pure states. We shall deal more extensively with the problem of hidden variables in chapter 10, where Bell’s critique of von Neumann’s “proof” will also be discussed.
Here it is important to note that for quite a long time the combined authority of Bohr and von Neumann has been sufficiently influential to secure homogeneity of pure states a place as a dogma of the “orthodox” interpretation of quantum mechanics. Although it was felt by some that it is rather counter-intuitive to combine dispersion and homogeneity of ensembles, it was not easy to demonstrate in an unequivocal way that ensembles represented by pure states must be inhomogeneous.
Thus, the Einstein-Podolsky-Rosen attempt, discussed in chapter 5, finally left open a choice between completeness and locality (cf. section 5.3.1), seemingly making it possible to maintain completeness (homogeneity) at the cost of locality¹⁶.
¹⁵ The undesirable and unnecessary relation between ‘nonseparability’ and certain forms of mysticism, observed in popularizations of quantum mechanics like the ones by Capra [291] and Zukav [292], will not be elaborated here. In particular, the popular idea that in a singlet state of two spin-1/2 particles the two particles would be “dancing” at the same step, is based on an unwarranted idea that there is any dancing at all (compare section 6.4).
¹⁶ The opportuneness of this juxtaposition will be criticized in section 9.5.2.
Entangled states can provide additional evidence against von Neumann’s interpretation of the state vector as representing a homogeneous ensemble, though. Thus, if the reduced state describes an inhomogeneous ensemble, how, then, can the entangled state vector correspond to a homogeneous one? It is hard to understand how we could obtain an inhomogeneous ensemble from a homogeneous one by simply leaving out of consideration observations performed on a part of the object system (i.e. particle 2). On the contrary, it would be more reasonable to expect that ignoring possible differences would make an inhomogeneous ensemble (more) homogeneous: if color is ignored, people are (more) equal. It seems that von Neumann’s interpretation meets a fundamental problem here.
This problem is not restricted to two-particle systems. As a matter of fact, it holds for any von Neumann ensemble, since every density operator $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$ (1.32) can be obtained as the partial trace (A.81) of a pure state. The proof of this statement is very simple. Consider the state vector
$$|\Psi\rangle = \sum_i \sqrt{p_i}\, |\psi_i\rangle \otimes |\chi_i\rangle, \qquad (6.4)$$
in which the vectors $|\chi_i\rangle$ constitute a (not necessarily complete) orthonormal set in a different Hilbert space $\mathcal{H}'$. Then (compare (1.53))
$$\mathrm{Tr}_{\mathcal{H}'}\, |\Psi\rangle\langle\Psi| = \sum_i p_i\, |\psi_i\rangle\langle\psi_i| = \rho. \qquad (6.5)$$
This implies that it is possible to get an arbitrary state $\rho$ by preparing the homogeneous ensemble represented by $|\Psi\rangle$ and by ignoring, after this preparation, the degrees of freedom corresponding to $\mathcal{H}'$.
d’Espagnat ([160], chapter 7.2) draws a distinction between proper and improper mixtures that may be described by the same density operator $\rho$. Proper mixtures are prepared by means of controlled parameter settings of the preparing apparatus. This ensemble can be considered as being composed of homogeneous subensembles, each described by a pure state $|\psi_i\rangle$ corresponding to parameter setting $i$. This is in agreement with von Neumann’s original ideas. Improper mixtures are of the type obtained from pure states like (6.4) by tracing out certain degrees of freedom. According to d’Espagnat there could be a fundamental difference between proper and improper mixtures, notwithstanding the equality of their density operators. Improper mixtures could be essentially homogeneous ensembles that cannot in an objective manner be subdivided into subensembles corresponding to different pure states $|\psi_i\rangle$. On the contrary, a proper mixture having the same density operator as an improper one could be subdivided in this manner, and hence be inhomogeneous. The question of the (in)homogeneity of mixtures is not easily resolved. Introducing improper mixtures seems to be necessary only if pure states are really thought to
describe homogeneous ensembles. If pure states would be considered as inhomogeneous, too, then there would be no reason to draw a distinction between proper and improper mixtures. This would be tantamount to an assumption of incompleteness of quantum mechanics ‘in a wider sense’ (cf. section 4.2.1), and in accord with Einstein’s conviction that “the ψ-function is to be understood as the description not of a single system but of an ensemble of systems” (Einstein [90], p. 671). ‘Improper mixtures’ might very well belong to those “implausible theoretical conceptions one arrives at if one attempts to maintain the thesis that the statistical quantum theory is in principle capable of producing a complete description of an individual physical system” (Einstein [90], p. 671). In any case, an attempt at saving the individual-particle interpretation by allowing a physical distinction between proper and improper mixtures, described by the same density operator, seems to be self-defeating because this would imply ‘incompleteness of quantum mechanics’ in the sense that this very theory is not capable of describing the distinction between these mixtures. On the other hand, as was seen in chapter 5, even Einstein’s ultimate attempt (EPR) at proving ‘incompleteness’ was not completely convincing due to the fact that the issue was complicated by the ‘(non)locality’ problem that was induced by the experimental setup.
Let us consider the problem of (in)homogeneity in a different manner still. Assuming that a quantum mechanical measurement produces a random sequence of measurement results if the ensemble is homogeneous, it is possible to apply von Mises’ theory of random sequences [293]. In this theory a random sequence is homogeneous in the following sense: in every allowed subsequence the relative frequencies of the measurement results have the same limiting values as in the original sequence.
Here a subsequence is allowed if its elements are selected using an algorithm that does not depend on the values of the selected elements (for instance, in a random sequence consisting of 0’s and 1’s the criterion “select all 0’s” is not allowed^17). In the case of a proper mixture represented by a density operator there exists such an allowed selection procedure, viz, selection on the basis of the parameter setting of the preparation apparatus. This recipe selects different subensembles, corresponding to different pure states. Hence, according to the von Mises criterion a proper mixture must be inhomogeneous.
^17 On this criterion the random sequence (0,1,0,1,...) is not homogeneous because the recipe “select every second element” is an allowed one, certainly not yielding relative frequency 1/2 in the selected subsequence.
When we apply the above reasoning to improper mixtures, the outcome is less unambiguous. It actually depends on whether the fundamental difference between ‘preparation’ and ‘measurement’ (already discussed in sections 3.2.6, 4.6.1 and 5.4) is appreciated or not. The question is whether also in the case of the improper mixture an allowed selection procedure can be found yielding a subsequence with relative frequency differing from the one obtained in the original sequence. Using (6.4) a selection procedure can be contemplated on the basis of a measurement of an observable of the other degree of freedom, having the orthonormal vectors of (6.4) as its eigenvectors. In general, subsequences of the measurement results of a measurement on the (unindexed) object, conditional on different values of this observable, will have different relative frequencies. The question is only whether this selection procedure is an allowed one. The answer to this question depends on whether the selection is independent of the measurement result of the (unindexed) object. If, as was the case in the EPR problem of chapter 5, a measurement of one quantity is also considered as a measurement of the other one, then the selection procedure would not be an allowed one since it would depend on the very measurement result it is selecting. Hence, on this basis homogeneity of improper mixtures could be thought to be maintainable. However, as was observed in section 5.4, the possibility of considering a measurement on one particle as a measurement of the other one is limited by the requirement that the measured observables of the two particles be strictly correlated. In general there need not exist such a correlation between the observable measured on the object and the observable measured on the other degree of freedom. For this reason it was proposed in section 3.3.4 to distinguish between ‘measurement’ and ‘conditional preparation’ of different subensembles, based on the measurement results and described by the corresponding state vectors of the object. The readings of the measuring instrument yielding a certain value of the measured observable can be seen as determining the parameter setting for the preparation of a subensemble.
Since such a conditional preparation is completely independent of a measurement to be performed afterwards on the subensemble, the selection procedure is allowed in the sense of von Mises’ definition. This shows that the ensemble represented by (6.5) can be considered as inhomogeneous. Hence, in this respect the difference between proper and improper mixtures seems to evaporate. In summary we must conclude that it may be possible to maintain the view that (6.4) and (6.5) represent homogeneous ensembles, but that this can be done only on the basis of the same lack of distinction between ‘preparation’ and ‘measurement’ that had such a confusing effect in the EPR discussion.
A different solution to the problem of the (in)homogeneity of improper mixtures is the one preferred by Einstein. It was Einstein’s conviction that mixtures as well as pure states are inhomogeneous ensembles. If ‘conditional preparation’ is duly distinguished from ‘measurement’ this, indeed, seems to be the more natural view: the ensemble represented by (6.5) is inhomogeneous because the preparation procedures represented by the different state vectors are distinct; the pure state (6.4) must be inhomogeneous because the individual preparations with which ensemble (6.4) is prepared (prior to the measurement of the observable on the other degree of freedom) are identical to the ones preparing (6.5).
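The purification argument and the conditional-preparation argument of this section can be checked numerically in a small case. The following is a minimal sketch in plain Python, not taken from the book: the two qubit states, the weights 0.25 and 0.75, and all variable and function names are hypothetical choices made for illustration. Under these assumptions it builds a state vector of the form (6.4), checks that tracing out the auxiliary degree of freedom reproduces the mixture as in (6.5), and shows that selecting on the auxiliary measurement result yields subensembles with different relative frequencies.

```python
import math

# Hypothetical illustration data (not from the text): two non-orthogonal
# object states with weights p_i, and orthonormal auxiliary vectors chi_i.
p = [0.25, 0.75]
psi = [[1.0, 0.0], [1 / math.sqrt(2), 1 / math.sqrt(2)]]
chi = [[1.0, 0.0], [0.0, 1.0]]

# Mixture rho = sum_i p_i |psi_i><psi_i| (real amplitudes, so no conjugation)
rho = [[sum(p[i] * psi[i][a] * psi[i][c] for i in range(2)) for c in range(2)]
       for a in range(2)]

# Entangled pure state: Psi[a][b] = sum_i sqrt(p_i) psi_i[a] chi_i[b]
Psi = [[sum(math.sqrt(p[i]) * psi[i][a] * chi[i][b] for i in range(2))
        for b in range(2)] for a in range(2)]

# Partial trace over the auxiliary index b
rho_red = [[sum(Psi[a][b] * Psi[c][b] for b in range(2)) for c in range(2)]
           for a in range(2)]


def conditional(i):
    """Weight and normalised object state, conditional on auxiliary result i."""
    vec = [sum(Psi[a][b] * chi[i][b] for b in range(2)) for a in range(2)]
    weight = sum(x * x for x in vec)
    return weight, [x / math.sqrt(weight) for x in vec]


# Probability of finding the object in its first basis state:
# in the full ensemble, and in each conditionally prepared subensemble.
prob0_total = rho_red[0][0]
prob0_sub = [conditional(i)[1][0] ** 2 for i in range(2)]

print("reduced state:", rho_red)
print("overall probability:", prob0_total)
print("probabilities in the selected subensembles:", prob0_sub)
```

In this toy case the selection uses only the auxiliary result, never the object's own measurement result, which is the sense in which the text calls such a selection allowed.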
6.3.4 Disentanglement by means of projection?
The problem of the (in)homogeneity of ensembles has not attracted much attention in the physical literature. For reasons that are not related to the (in)homogeneity problem (cf. section 4.2.2) the Copenhagen interpretation has not abandoned the individual-particle interpretation in favor of an ensemble one. Notwithstanding numerous conceptual problems the Copenhagen view of an electron as a wave packet flying around in space has been particularly persistent. Much attention has been directed to the problem (inspired by von Neumann’s projection postulate) of how an individual particle could get back its state vector after having lost it in the process of entanglement. The necessity of such a process of disentanglement was questioned only by those few (cf. section 4.7) who really worried about the problem of (in)homogeneity.
When the (in)homogeneity problem is ignored, an entangled state might be accepted as describing a genuine feature of an individual system consisting of two microscopic particles. Things are different, however, for macroscopic objects. Indeed, we already met the problem of entangled states when discussing the conventional “measurement problem” (cf. section 3.1). Here the superposition character of state vector (3.5) prevents the measuring instrument from “being in” one of its pointer states. As discussed in section 3.2.2, the transition (3.9) to a product state is supposed to be brought about by a process of strong projection. During a measurement the state vector is supposed not to satisfy a Schrödinger equation, but to change discontinuously. It is evident that the idea of strong projection is a consequence of an individual-particle interpretation. Often a less drastic form of projection is considered, viz, weak projection, bringing about a transition from the state vector to a density operator without “cross” terms. This only poses the problem of the wiping out of the “cross” terms, discussed in chapter 3.
Since the weakly projected density operator can be interpreted as representing a von Neumann ensemble, a transition to a product state could be seen as a selection of a subensemble. However, in the Copenhagen interpretation this is not a consistent solution to the problem of entanglement. Inhomogeneity of the ensemble represented by the density operator can hardly be reconciled with an interpretation of the state vector as a description of an individual object: you cannot select a subensemble if you do not have an ensemble first. As was already indicated in section 3.2.5, the problem can be evaded if the state vector is thought to represent an inhomogeneous ensemble too. In an ensemble interpretation (in which there is no necessity to assume strong projection) we do not even have any reason to assume weak projection, since it is not necessary to require that the weakly projected density operator be the final state of the measurement interaction. The difference between this density operator and the one corresponding to state vector (3.5) (i.e. the “cross” terms) might constitute an aspect of the ensemble of objects+measuring instruments, which, although not readily observable, might actually be there, and
become observable if sufficient ingenuity is exploited.
In attempts at solving the problems an individual-particle interpretation has with entanglement, the restriction to measurements of the first kind (cf. section 3.2) has been deceptively effective. Thus, even though the final state of object+measuring instrument is the entangled state (3.5), it might be felt as reassuring that the density operator (3.6), representing the final state of the measuring instrument, can be interpreted as representing a von Neumann ensemble of measuring instruments, with each element in a well-defined pointer state. This is not true for measurements of the second kind, however. Here we obtain (3.12) as the final state of the measuring instrument. If the final object states corresponding to different pointer states are not mutually orthogonal (which generally is the case), then the “cross” terms interfere with the above picture. Of course, it is possible to write density operator (3.12) in the form of a von Neumann ensemble by employing its eigenvectors. However, in general these eigenvectors will be linear superpositions of the pointer vectors. Then the von Neumann ensemble of superpositions of a living and a dead cat, thus obtained, would constitute just another example of Schrödinger’s cat problem (cf. section 3.1.1). By relinquishing the restriction to measurements of the first kind once again considerable doubt is cast on the tenability of the individual-particle interpretation.
The problem of the “cross” terms in measurements of the second kind is sometimes tentatively solved by supposing the linear superpositions of the pointer vectors to deviate only slightly from a single pointer vector (e.g. [294]), thus trying to brush aside the problem as referring to unimportant fluctuations of the pointer’s position. This cannot be the general solution, however. For instance, in a measurement using an ideal photon detector the final state is given by (3.17).
Evidently, the superposition of pointer states is crucially determined here by the initial state of the object, and cannot be dealt with in the above-mentioned way without jeopardizing the functioning of the measurement procedure as an ideal measurement of photon number. Hence, it seems that already in such common measurement processes as photon counting the Copenhagen reliance on measurement is not sufficient to warrant a consistent individual-particle interpretation. Even if the problem of (in)homogeneity is ignored, such an interpretation is made implausible by the fact that most measurements are not of the first kind.
From an interpretative point of view ‘disentanglement by means of projection’ is unnecessary and potentially misleading. Recently it has been realized that entanglement is a necessary feature for understanding certain double-slit experiments [295]. Since quantum measurement requires quantum mechanics for its description (cf. chapter 3) this is hardly surprising. Due to the linearity of the Schrödinger equation it is not possible to attribute separate state vectors to object and measuring instrument, at least not after they have started to interact. What is really surprising is that, notwithstanding the difficulties of the individual-particle interpretation discussed above, the belief that the objects
should have their own state vectors has remained so strong that even today von Neumann’s projection postulate is being taken seriously as a physical process governing measurement. The realization that entangled states are not just peculiarities of the quantum mechanical formalism but contain observationally relevant information beyond that contained in the von Neumann-projected state is an important step forward in understanding quantum measurement. It is unfortunate, however, that the association with ‘nonlocality’ or ‘nonseparability’ has endowed ‘entanglement’ with a meaning that possibly is no less misleading than is von Neumann’s projection postulate.
The above criticism of the notion of ‘measurement of the first kind’ as well as of von Neumann’s projection postulate is not applicable to the EPR experiment, because the criticism was based on the disturbing influence of the measurement procedure on the preparation of the final object state (compare section 3.2.6). However, even if the measurement on particle 1 is of the second kind (and, hence, disturbs the state of particle 1), the final state of particle 2 is still represented by the density operator of (6.2). So, if the measurement on particle 1 were a measurement on particle 2, it could be qualified as a first kind one. However, it is not such a measurement. As discussed in section 5.4, it is a conditional preparation in which the state of particle 2 (one of the state vectors occurring in (6.2)) can be obtained by means of von Neumann projection, conditional on a measurement result of the observable measured on particle 1. Although in the EPR case von Neumann projection seemingly solves the entanglement problem, in the individual-particle interpretation it actually replaces it only with a different one, viz, the problem of how an individual measurement on particle 1 could have such a drastic effect on the state of particle 2 (viz, a transition from its initial state to one of the eigenvectors of the correlated observable of particle 2) even if no observable influence is exerted by the measurement act (Einstein’s “spooky action-at-a-distance”).
In the individual-particle interpretation von Neumann projection or conditional preparation, induced by the measurement of particle 1, involves a change of the state of particle 2, brought about in a nonlocal way. Bohr probably did not take this nonlocality problem very seriously because in his strict version of the Copenhagen interpretation the state vector has only an instrumentalist meaning, not describing the reality of particle 2 but just yielding “predictions regarding the future behavior of the system” (cf. section 5.3). However, for Einstein’s realist interpretation of the state vector the contextuality going with Bohr’s answer to EPR meant downright nonlocality if the state vector would describe an individual particle [267]. Analogously to Schrödinger’s cat problem it seemed that the EPR problem, too, could be solved by abolishing the individual-particle interpretation in favor of an ensemble one. Then the conditional preparations represented by the state vectors of particle 2 just seem to correspond to selections of subensembles, seemingly not requiring any (nonlocal) physical change of the state of the ensemble (cf. section 5.4.1).
As is well known, the incompleteness of quantum mechanics accompanying Einstein’s ensemble interpretation was not accepted by Bohr. Since the EPR reasoning still relied on measurement (though measurement on particle 1 only) the EPR argument remained liable to be countered by Bohr in the same way previous arguments were countered by him, viz, by reference to the essential role of measurement in quantum mechanics (cf. section 5.3). For Bohr the attribution of any property to particle 2, independently of measurement, did not have any meaning at all. No measurement being performed on particle 2 itself, it seemed to him that the measurement arrangement for particle 1 should be invoked to define the properties of particle 2. In agreement with Bohr’s understanding the EPR experiment is widely seen as a measurement on particle 2, and the mechanism of ‘disentanglement by projection’ is applied to it accordingly. Then, the choice between the representations (6.2) and (6.3) of the density operator of particle 2 as descriptions of different von Neumann ensembles may be thought to be dictated by the choice of the measurement performed on particle 1. However, although such contextuality could provide a solution to the non-uniqueness problem of the individual-particle interpretation, it seems that the price to be paid, viz, the nonlocality of the influence of the measurement, is hardly worth the gain because the fundamental (in)homogeneity problem, discussed before, remains unresolved. On the other hand, the fact that this density operator is independent of which observable of particle 1 is measured (note that notwithstanding the different representations the operators in (6.2) and (6.3) are the same) does not suggest any nonlocal influence. Therefore Einstein’s preference for an ensemble interpretation of the density operator is quite understandable (‘incompleteness’ rather than ‘nonlocality’). In such an interpretation the representations (6.2) and (6.3) just correspond to different partitions into subensembles^18.
The problems of an individual-particle interpretation seem to be severe enough to induce strong doubts with respect to its tenability. However, as will be seen in the following section, the ensemble view is not unproblematic either.
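The remark that the operators in (6.2) and (6.3) are the same, whichever observable is measured on particle 1, can be illustrated for the spin singlet. The sketch below is an assumption-laden illustration in plain Python, not the book's own calculation: the choice of the z and x spin components as the two measurements on particle 1, and all names, are hypothetical. For each choice it computes the conditionally prepared states of particle 2 and recombines them into a density matrix.

```python
import math

s = 1 / math.sqrt(2)

# Singlet amplitudes Psi[a][b] in the z basis (index 0 = up, 1 = down);
# a refers to particle 1, b to particle 2.
Psi = [[0.0, s], [-s, 0.0]]

# Two hypothetical choices of measurement basis for particle 1.
bases = {
    "z": [[1.0, 0.0], [0.0, 1.0]],
    "x": [[s, s], [s, -s]],
}


def conditional_states(basis):
    """For each result on particle 1: (probability, normalised state of particle 2)."""
    parts = []
    for e in basis:
        vec = [sum(e[a] * Psi[a][b] for a in range(2)) for b in range(2)]
        prob = sum(x * x for x in vec)
        parts.append((prob, [x / math.sqrt(prob) for x in vec]))
    return parts


def mixture(parts):
    """Density matrix sum_k p_k |phi_k><phi_k| built from the subensembles."""
    return [[sum(q * phi[b] * phi[c] for q, phi in parts) for c in range(2)]
            for b in range(2)]


rho2_from_z = mixture(conditional_states(bases["z"]))
rho2_from_x = mixture(conditional_states(bases["x"]))

print("partition via z measurement:", rho2_from_z)
print("partition via x measurement:", rho2_from_x)
```

The two partitions into subensembles differ (z eigenvectors in one case, x eigenvectors in the other), while the resulting density matrix of particle 2 is the same multiple of the identity in both cases.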
^18 Note that the ensemble differs from a von Neumann ensemble, since state vectors are attributed to subensembles rather than to individual particles.
6.4 To explain, or not to explain
When considerations are restricted to strictly empirically verifiable assertions, the difference between ensemble and individual-particle interpretations of the state vector is not very important (cf. section 6.4.1). The probabilistic assertions of the latter interpretation (such as the probability that in figure 2.4 a photon is found in one of the two possible paths) can be empirically tested only by repeating the experiment a large number of times in an identical manner, and by checking whether the relative frequencies thus obtained agree with the “objective” probabilities (cf. section 6.2.2). For this reason the difference between the two interpretations is nonempirical, and is above all connected with the question of whether a result obtained in a measurement can be explained. In the (probabilistic) Copenhagen interpretation such an explanation is rejected, von Neumann’s projection postulate being invoked to describe (not: explain) the state change from initial to post-measurement state (however, see also section 6.4.4). In Ballentine’s ‘statistical’ interpretation such an explanation is sought in the idea that the object possessed the measured value of the observable already before the measurement. A similar idea was inherent in Einstein’s thinking in devising the Einstein-Podolsky-Rosen experiment (note the ‘element of physical reality’ discussed in chapter 5).
In the Copenhagen interpretation the spins of the strictly correlated spin-1/2 particles, described by the singlet state (5.8), do not have values before the measurement. This makes the strict correlation of the spins, found in the measurement, virtually incomprehensible. How can measurements of spin components of particles 1 and 2, performed while the particles are far apart, always yield opposite measurement results if this was not fixed beforehand in the preparation? It, indeed, would seem to be quite natural to assume, with Ballentine (cf. section 4.7.3), that the strict correlation is a consequence of its being present already before the measurement, in the sense that the particles of a pair prepared in the singlet state have opposite spin components in all directions, independently of any measurement to be performed later. If the singlet state were a description of an ensemble of such particle pairs, half of which having spin values (+, -), the other half (-, +), then this would fully explain the statistics of the measurement.
Moreover, the fact that the strict correlation of the measurement results of the spin values of the particles is independent of the precise times at which the two measurements are made can be understood because the spin values are constants of the motion: as long as no measurement takes place they do not change with time.
A second example is provided by two momentum measurements consecutively carried out on a free particle at times $t_1$ and $t_2$. Since the operators $P(t_1)$ and $P(t_2)$ of a free particle commute, it is possible to describe this measurement as a joint measurement of these two observables in the initial state. Let this state be an arbitrary superposition $|\psi\rangle = \sum_k c_k |k\rangle$ of the eigenvectors $|k\rangle$ of the momentum operator. The important issue is that the two consecutive momentum measurements of each individual particle always yield the same value, even though the measurements have statistical distribution $|c_k|^2$. This can easily be seen by applying the usual quantum mechanical rules for the joint measurement of the two compatible observables $P(t_1)$ and $P(t_2)$ (cf. section 1.3), yielding $p(k, k') = |c_k|^2 \delta_{kk'}$ as joint probability distribution of the measurement results $k$ of $P(t_1)$ and $k'$ of $P(t_2)$. Hence, there is a strict correlation between the measurement results of the two consecutive measurements. In fact this example is very analogous to the EPR
case. Also in this second example it seems that a reasonable explanation of the strict correlation between the measurement results at different times might be sought in the assumption that the particle possesses a well-defined momentum as an ‘element of physical reality’ that, moreover, is a conserved quantity, undisturbed by the first measurement.
The Copenhagen interpretation rejects explanations on the basis of quantities possessing values independently of measurement. In the experiment considered above no well-defined values of the spin observables of particles 1 and 2 are attributed to an individual EPR particle pair, but only a joint probability distribution of the spin components (‘correlation without correlata’, Mermin [280]). Similarly, in the second example only a momentum probability distribution can be attributed to the particle, not an individual value of momentum. And it appears that the Copenhagen interpretation is correct here. Although the Copenhagen reasoning, based on the Heisenberg inequality, does not seem to be completely cogent [252], nowadays (cf. section 6.4.2, see also section 9.4) there is sufficient evidence that it is impossible to simultaneously attribute to a quantum mechanical object values of all quantum mechanical observables. Completing quantum mechanics in this sense seems impossible, and according to some (e.g. van Fraassen [78]) even unnecessary. From an empiricist point of view it, indeed, is sufficient that the quantum mechanical formalism describe correctly the experimentally obtained measurement results, including the strict correlations in the examples discussed above. There is no more obligation for quantum mechanics to yield an explanation of momentum conservation of a free particle than there is for classical mechanics to explain why a billiard ball keeps behaving as a rigid body while rolling on the table^19: explanations must be provided by theories yielding more detailed descriptions of the objects in question.
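The strict correlation of the singlet spins discussed above can be made concrete by computing the joint probability distribution of the two measurement results. The following is a minimal sketch in plain Python under stated assumptions: spin components are restricted to directions in the x-z plane, labelled by a hypothetical angle parameter `theta`, and the function names are illustrative only. It shows that the formalism yields vanishing probabilities for equal results along a common direction, without attributing values to the individual spins.

```python
import math

s = 1 / math.sqrt(2)

# Singlet amplitudes Psi[a][b] in the z basis (index 0 = up, 1 = down).
Psi = [[0.0, s], [-s, 0.0]]


def spin_basis(theta):
    """Eigenvectors (+ and -) of the spin component along angle theta in the x-z plane."""
    c, d = math.cos(theta / 2), math.sin(theta / 2)
    return {"+": [c, d], "-": [-d, c]}


def joint_probabilities(theta1, theta2):
    """Joint distribution of the results for particle 1 (theta1) and particle 2 (theta2)."""
    probs = {}
    for r1, e1 in spin_basis(theta1).items():
        for r2, e2 in spin_basis(theta2).items():
            amp = sum(e1[a] * e2[b] * Psi[a][b]
                      for a in range(2) for b in range(2))
            probs[(r1, r2)] = amp ** 2
    return probs


# Same (arbitrarily chosen) direction for both particles: strict anticorrelation.
same_axis = joint_probabilities(0.7, 0.7)
# Perpendicular directions: all four pairs of results occur.
perpendicular = joint_probabilities(0.0, math.pi / 2)

print(same_axis)
print(perpendicular)
```

For a common direction only the pairs (+, -) and (-, +) carry probability, each one half; for different directions the strict correlation is lost, which is why a measurement on one particle can stand in for a measurement on the other only for strictly correlated observables.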
^19 Note that a renunciation of an explanation of a phenomenon is not tantamount to a denial of that phenomenon. For this reason it need not be inconsistent for the Copenhagen interpretation to accept conservation of energy and momentum in individual collisions (as observed, for instance, in the Compton-Simon experiment, thus falsifying the Bohr-Kramers-Slater theory [296]).
It is interesting to consider the problem of the strict EPR correlations from the point of view of the modified EPR experiments discussed in section 5.4.3, in which the correlation is actually measured by means of a joint measurement of two compatible observables. Such a measurement can be described by means of a single ‘correlation’ observable (cf. section 1.3) with values (+, +), (+, -), (-, +) and (-, -). From the point of view of the mathematical formalism of quantum mechanics the absence in the singlet state of the values (+, +) and (-, -) of the ‘correlation’ observable is not stranger than the absence of any particular measurement result of a single observable (for instance, the absence of position value $x$ in a state with a wave function $\psi$ for which $\psi(x) = 0$), and needs equally much or equally little explanation. The problem of why a certain individual measurement result is obtained already regards the single measurement. Considered from this point of view Bohr would have been right in his judgment that the EPR challenge was not different from previous ones. However, as noted in section 5.3.1, Bohr overlooked the fact that the EPR experiment is not a joint measurement but a conditional preparation. This oversight has considerably complicated the issue.
In an empiricist interpretation of quantum mechanics, too, the problem of why a certain individual measurement result is obtained is thought not to have a solution in the theory: the quantum mechanical formalism is thought to simply describe the phenomena; it does not yield explanations. This holds just as well for the common cause in coincidence measurements like the modified EPR experiments as for the (local) cause in a measurement of one single (local) observable. In this respect an empiricist interpretation is similar to the Copenhagen one, which in chapter 4 was seen to contain many empiricist elements. However, the Copenhagen interpretation was seen to comprise many realist elements, too, inducing a certain tendency towards causal reasoning also in quantum mechanics. Thus, while assuming that the strict correlation of the spins in the singlet state is not reducible to a common cause inducing particles 1 and 2 to possess their spin values already before the measurement, this correlation is nevertheless often thought to have a cause, viz, a nonlocal influence, exerted by the measurement of one spin on the measurement of the other one, causing the other particle to assume the spin component opposite to the measured one. It was already observed in section 5.4 that a more empiricist demeanor could have restrained the Copenhagen interpretation from attempting to explain EPR correlations in such a nonlocal way.
The EPR problem epitomizes the crucial role ‘explanation’ played both in Bohr’s (Copenhagen) and in Einstein’s (ensemble) interpretation. In chapter 5 this was discussed from the perspective of the realism/empiricism controversy.
In that discussion the controversy between individual-particle and ensemble interpretations virtually did not play any role. Yet, Einstein believed this latter distinction to be the important one, all problems supposedly being solved if the completeness of the Copenhagen individual-particle interpretation would be replaced by the incompleteness of an ensemble interpretation. Indeed, as was seen in chapter 3, this replacement makes some problems (like the conventional “measurement problem”) less acute. However, as will be seen in the next sections, an ensemble interpretation, if taken in the sense of a realist interpretation, does not solve the nonlocality problem. In contrast to Einstein’s expectation, causal reasoning keeps raising problems for quantum mechanics, even if the completeness of an individual-particle interpretation is abolished in favor of the incompleteness of an ensemble one.
6.4.1 Minimal interpretation
Nowadays the idea that a quantum mechanical state vector yields a description of an ensemble rather than of an individual particle appears to be rather generally accepted. In textbooks it is customary to interpret the statistical assertions of quantum mechanics as referring to ensembles. This does not at all imply a choice between an individual-particle and an ensemble interpretation, however, since, as we have seen in section 6.2.3, the Copenhagen interpretation, too, has its ensembles, consisting of individual particles, each of which is represented by a state vector. Unfortunately, in textbooks of quantum mechanics it is not always made clear whether the state vector is thought to describe an individual particle or an ensemble of such particles. It also is usually left unspecified what the elements of the ensemble are, and how these elements are related. This is a source of confusion. In the following an attempt is made to fill this gap by spelling out different possibilities.
Textbook ensemble interpretations can often best be characterized as forms of the so-called minimal interpretation, encompassing the minimum amount of interpretation necessary to apply the mathematical formalism of quantum mechanics to experimental reality. The minimal interpretation starts from the observation that in actual experimental practice we always have an ensemble. Assertions about individual measurement results are deemed meaningless in general because such assertions are not liable to experimental test (compare the discussion on diffraction in a double-slit experiment in section 4.5.2). It is assumed that only experimental measurement results obtained by repeating an experiment a large number (N) of times (ideally $N \to \infty$), i.e. the relative frequencies (6.1), are accessible to experimental verification or falsification (e.g. Margenau [163], Park and Band [297]). Even though nowadays observation of an individual particle has become possible (cf.
section 3.2.7), quantum mechanics is not thought to say anything experimentally verifiable on the individual measurement result (except, perhaps, if the state vector is an eigenvector of the measured observable). Individual particles have experimental relevance only as far as they belong to an ensemble represented by a density operator or a state vector. Hence, the minimal interpretation is an ensemble interpretation. Yet, unlike Ballentine’s ‘statistical’ interpretation, in the minimal interpretation the ensemble idea does not serve to explain why a certain measurement result is obtained. No values are attributed to the observables of the object prior to measurement. Nor is the ensemble thought to have the structure of a von Neumann ensemble (as is assumed in the Copenhagen interpretation). All ensembles (both pure ones and mixtures) are treated in a pragmatic way as homogeneous, thus extending the Copenhagen refusal of explanation from pure states to mixtures. Actually, the question of (in)homogeneity is not even posed since it is thought to have no operational answer. Without further asking questions the minimal interpretation
accepts the “strange” character of quantum ensembles which is a consequence of the occurrence of incompatible observables. The interpretation contents itself with the disappearance of certain paradoxical consequences of an individual-particle interpretation by not asking the theory to do more than just yield the relative frequencies experimentally measured. Thus, for instance, the EPR problem is nonexistent in the minimal interpretation (e.g. Cantrell and Scully [264]). The “measurement problem”, as exemplified by the problem of Schrödinger’s cat (cf. section 3.1.1), can be thought to be solved in the minimal interpretation, because the superposition (3.1) correctly yields the relative frequencies of living and dead cats (pointer positions) if the experiment is repeated a large number of times. This is completely satisfactory as long as one does not worry about the “cross” terms. As seen in section 3.2.3, this coincides with the “orthodox” position. The minimal interpretation is superior to the “orthodox” one, though, because the former also is able to cope with the possibility, noted in section 3.1.3, of getting observational evidence on the “cross” terms by measuring an observable incompatible with the “cat” observable (cf. section 3.2.5).
There is still a second way in which the minimal interpretation is pragmatic in the sense of not dealing with too profound questions about the physical significance of the different mathematical entities of the theory. The interpretation is thoroughly instrumentalistic in the sense that no choice is made between realist and empiricist interpretations. In particular, it is not specified what a “measurement result” is. It may be a pointer position of a macroscopic measuring instrument (as in an empiricist interpretation), but often it is treated (with Bohr) as a property of the microscopic object. 
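Returning to the remark on the “cross” terms made above, the following is a minimal numerical sketch, not taken from the text. It assumes a two-dimensional toy model in which the “cat” observable has eigenvectors `live` and `dead`, and the amplitudes `alpha` and `beta` are hypothetical. It compares a superposition with the mixture having the same weights: the relative frequencies of the “cat” observable coincide, whereas those of an incompatible observable differ.

```python
import numpy as np

# Hypothetical amplitudes of a toy two-state superposition (assumption).
alpha, beta = np.sqrt(0.3), np.sqrt(0.7)
live = np.array([1, 0], dtype=complex)
dead = np.array([0, 1], dtype=complex)

# Pure superposition and the mixture with the same weights (no cross terms).
psi = alpha * live + beta * dead
rho_pure = np.outer(psi, psi.conj())
rho_mixed = (abs(alpha) ** 2 * np.outer(live, live.conj())
             + abs(beta) ** 2 * np.outer(dead, dead.conj()))

def probabilities(rho, basis):
    """Probabilities <v|rho|v> for each vector v of an orthonormal basis."""
    return np.array([np.real(v.conj() @ rho @ v) for v in basis])

# Eigenbasis of the "cat" observable and of an observable incompatible with it.
cat_basis = [live, dead]
incompatible_basis = [(live + dead) / np.sqrt(2), (live - dead) / np.sqrt(2)]

p_cat_pure = probabilities(rho_pure, cat_basis)
p_cat_mixed = probabilities(rho_mixed, cat_basis)
p_inc_pure = probabilities(rho_pure, incompatible_basis)
p_inc_mixed = probabilities(rho_mixed, incompatible_basis)

print("cat observable:         ", p_cat_pure, p_cat_mixed)
print("incompatible observable:", p_inc_pure, p_inc_mixed)
```

In this toy model the cross terms show up only in the second pair of distributions, which is the sense in which an incompatible observable can yield observational evidence on them.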
In the minimal interpretation these different implementations are left undiscussed thus yielding an opportunity to circumvent questions related to the structure of the ensemble, and to evade ensuing problems. The minimal interpretation seems to be a tenable one, provided the concept of a quantum ensemble is accepted as a primitive term, different from a “classical” ensemble, neither needing nor providing any further explanation. In particular, it is fruitful to be able to circumvent the problems entailed by Ballentine’s “classical” statistical treatment of quantum observables, in which well-defined (sharp) values are simultaneously attributed to incompatible observables (compare the EPR problem as discussed in section 5.2, which was seen to stem precisely from such an assumption; it is sufficient that quantum mechanics just correctly describe the EPR correlations; it does not need to explain them). Quantum ensembles -although treated as homogeneous- may be thought of as inhomogeneous, without any obligation for quantum mechanics to yield a complete characterization of distinct subensembles. The minimal interpretation can be characterized as an ensemble interpretation evading inconsistency by simply ignoring all inhomogeneity of quantum ensembles, and by shunning any demand of explanation. However, looking for explanations is one of the basic features of physics. This
makes the minimal interpretation rather unsatisfactory. As will be seen in section 6.4.3, certain quantum ensembles are inhomogeneous in an operational sense. This means that the minimal interpretation does not exploit a possibility of explanation that is actually there. Moreover, the instrumentalism of the minimal interpretation, not choosing explicitly between a realist and an empiricist interpretation, has its drawbacks. By ignoring the difference between the interpretations of a ‘measurement result’ as a property of the microscopic object on one hand, or as a property of the measuring instrument on the other, the Copenhagen confusion about ‘preparation’ and ‘measurement’ may easily be perpetuated. Often the minimal interpretation is not adhered to in its pure form, but strengthened by means of certain elements derived from either an empiricist or a realist interpretation. At the basis of such strengthenings is the desire to explain the experimental phenomena by adding as much structure as possible to the ensembles of the minimal interpretation. Such a strengthening does not seem to be problematic in the empiricist case, because here the additional structure can refer to the phenomena only. Thus, an empiricist interpretation is an ensemble interpretation that is capable of dealing with inhomogeneity in the sense that different (sub)ensembles correspond to different preparation procedures (compare section 6.1). Things are different, however, with respect to a realist strengthening, in which the additional structure refers to the microscopic object. This may cause the interpretation to get into trouble very soon. Realist elements may be applied either deliberately or inadvertently (due to the classical paradigm, cf. section 2.4.2). Some of the problems going with this will be discussed in the next sections. 
If it is just intended to describe the measurement results (probability distributions) obtained in quantum mechanical measurements, then such an explanatory structure does not seem to be required.
6.4.2 Explanation by means of observables?
One way of strengthening the minimal interpretation is Ballentine’s ‘statistical’ interpretation, which has the intention to explain quantum mechanical measurement results (including correlations like the EPR ones) in a “classical” way, by assuming that the measurement results were there as objective properties of the microscopic object, prepared before, and independently of, the measurement (compare the ‘possessed values’ principle, section 2.3). In particular, to an individual particle are attributed well-defined, though unknown, values of both position and momentum. The corresponding ensembles are sometimes referred to as Gibbs ensembles (Guy and Deltete [96]). In this strengthened interpretation quantum ensembles are inhomogeneous, subensembles being distinguishable by different values of observables. Quantum mechanical dispersion is thought to be an expression of our ignorance with respect to the values of the observables of the individual particles. For this
reason this interpretation is sometimes referred to as an ‘ignorance interpretation of observables’. A priori an ‘ignorance interpretation of observables’ need not be at variance with the Heisenberg inequality (1.78). If this inequality refers to an ensemble rather than to an individual object, then it appears possible to attribute to each individual element of the ensemble a well-defined value of incompatible observables A and B without necessarily violating (1.78). Of course, there would remain the problem of explaining why the Heisenberg inequality must always be satisfied in an ensemble, and why it is not possible to select, on the basis of more accurate knowledge of the values of A and B, a subensemble violating it. The two different notions of ‘(in)completeness of quantum mechanics’, defined in sections 4.2.1 and 4.2.2, provide two different lines of reasoning intending to yield such an explanation. The Copenhagen notion (corresponding to ‘completeness in a restricted sense’) is based on the alleged impossibility of jointly measuring incompatible observables with accuracies exceeding the Heisenberg inequality. This would seem to prevent any knowledge that could be used for an actual selection of a subensemble violating Heisenberg’s inequality. Unfortunately, this reasoning cannot be maintained in an ensemble interpretation. As a matter of fact, it is based on the very idea (viz, (un)certainty about the value of an observable as a property of an individual particle) that an ensemble interpretation tries to correct. In an ensemble interpretation the individual particle is not thought to have an uncertainty of this kind, however. It was already known to Heisenberg that it is very well possible to select in a determinative sense (i.e. with reference to the past rather than to the future, cf. section 4.6.1) a subensemble violating his inequality. 
For instance, in the double-slit ‘thought experiment’ (section 4.5.2) it is possible to select those particles impinging on screen B in an arbitrarily small position interval while transferring momentum to that screen in an arbitrarily small momentum interval. Hence, as stressed by Ballentine, Heisenberg’s inequality does not seem to apply at all to a joint measurement of two incompatible observables, at least not in the sense of restricting our knowledge about an individual particle (see also section 7.10.3). Although, as will be seen in section 7.10.2, a joint measurement of incompatible observables does satisfy certain restrictions in a statistical sense, it seems that Ballentine is right that this need not imply that it must (also) hold for an individual particle. Hence, the argument fails that it would be impossible to determine by means of a joint measurement sharp values of both position and momentum of an individual particle. In agreement with the necessity to clearly distinguish between ‘preparation’ and ‘measurement’ (cf. section 4.6.1), this does not imply, however, that it would actually be possible to violate the Heisenberg inequality by preparing a subensemble on the basis of more accurate knowledge. Quantum mechanical dispersion might be a consequence of a fundamental physical inability to control quantum mechanical preparation processes so as to prepare an ensemble that is dispersionless for all quantum mechanical observables, even if each individual particle would have sharp values for all observables, and if these values could all be simultaneously measured with arbitrary precision (compare the analogous phenomenon in classical statistical thermodynamics, where thermal fluctuations induce a universal dispersion of energy in agreement with the probability distribution of a canonical ensemble). Some uncontrollable mechanism causing stochastic quantum mechanical fluctuations might be responsible for scattering the values of the observables of individual particles, thus possibly explaining the Heisenberg inequality as a property of preparation (compare de Broglie’s ‘hidden thermostat’ [298]; also chapter 10). This provides a second line of reasoning to explain quantum mechanical dispersion, based on a possible ‘incompleteness of quantum mechanics in a wider sense’.
This second line might appear to create room for an interpretation of quantum mechanical observables in the sense of Ballentine’s ‘statistical’ one. For Einstein it may have been a reason for preferring an ensemble interpretation (incompleteness of quantum mechanics) over an individual-particle one (completeness). Indeed, because in the EPR problem each element of the particle ensemble is thought to have a well-defined value of both incompatible observables, the two different representations (6.2) and (6.3) of the reduced density operators might be thought to correspond to two distinct decompositions of the ensembles of particles 1 and 2, respectively, into subensembles, to be realized without influencing the ensembles themselves. In contrast to an individual-particle interpretation, in an ensemble interpretation the possibility of incompatible decompositions does not pose a dilemma as to the question of which of the eigenstates each individual particle is in. 
The subensembles of particles with well-defined values of these observables might be thought to exist in the ensemble as a result of a preparation that was performed before any measurement is carried out, these values being reproduced faithfully (compare section 2.4.3) if a measurement of one of these observables is carried out. In the EPR problem this was applied only to particle 2, thus evading the possibility that, as a result of measurement disturbance, the measured value might be different from the prepared one (compare the ‘element of physical reality’ discussed in section 5.1). It is probable (Guy and Deltete [96]) that Einstein had in mind something like the ‘possessed values’ principle (cf. section 2.3), also explaining measurement results of measurements that are actually carried out (like the measurement of particle 1). Yet, these attempts at explanation are untenable because, as will be seen in the following, the simultaneous attribution of sharp values to incompatible observables entails insurmountable difficulties (see also section 9.4).
As a first indication of these difficulties we mention the well-known fact (basically used already in the formulation of the EPR problem in terms of observables) that, in general, incompatible standard observables do not have common eigenvectors. Consider two such incompatible observables A and B. According to the ‘statistical’ interpretation we can subdivide each quantum ensemble into subensembles according to both the values
of A as well as those of B. Consider a subensemble of particles having a given value of A. Since observable A has a sharp value, this subensemble must be represented by the corresponding eigenvector of A if it is a quantum mechanical ensemble. Analogously the subensemble with a given value of B must be represented by the corresponding eigenvector of B. However, assuming that quantum mechanics should also describe the subensemble for which both A and B have these values, we get into trouble because then this latter subensemble must be represented by both the eigenvector of A and the eigenvector of B. This is impossible if eigenvectors of A and B do not coincide. Decompositions of a quantum ensemble, simultaneously defined by two incompatible observables, simply cannot be combined in a joint one, such that all subensembles are quantum mechanical ones²⁰. Only compatible standard observables can simultaneously have well-defined values.
In the reasoning given above use is made of the close link between eigenvalues and eigenvectors of standard observables. Because of the fundamental importance of the difference between classical and quantum ensembles observed here, we shall give yet another proof, due to Mermin²¹ [302, 303, 304], based solely on observables. Mermin considers the following nine observables of two spin-1/2 particles:
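As a sketch of what such a set of nine observables looks like, the arrangement commonly quoted from Mermin's papers is written out here. The notation \sigma_i^{(k)} (component i = x, y, z of the spin of particle k = 1, 2) is an assumption, and the ordering of rows and columns in matrix (6.6) may differ from the one shown; the arrangement does agree with the row and column products used in the argument that follows.

```latex
\begin{pmatrix}
\sigma_x^{(1)} & \sigma_x^{(2)} & \sigma_x^{(1)}\sigma_x^{(2)} \\
\sigma_y^{(2)} & \sigma_y^{(1)} & \sigma_y^{(1)}\sigma_y^{(2)} \\
\sigma_x^{(1)}\sigma_y^{(2)} & \sigma_x^{(2)}\sigma_y^{(1)} & \sigma_z^{(1)}\sigma_z^{(2)}
\end{pmatrix}
% Products along the rows:     I,  I,  I
% Products along the columns:  I,  I, -I
```

The sign of the third column follows from \sigma_x\sigma_y\sigma_z = iI for each particle, so that the product of the three entries equals (iI)(iI) = -I.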
Here the operators σ_i^(k) (i = x, y, z) are Pauli spin operators of particle k (k = 1, 2). They commute if the values of k are distinct. Using the well-known properties of Pauli spin operators (matrices) it can easily be seen that the operators in each row of matrix (6.6) mutually commute. This also holds true for the operators in each column. This is of importance for the attribution of values to the observables, because now this can be simultaneously done for all observables in one row or in one column. As a matter of fact, each observable in a row or a column is a function of a fourth observable. For instance, the three observables of the first row can each be written as a function of one such observable D.
This implies that the values of the three observables of the first row are determined by the value of D. From this it follows that, if the three observables satisfy the relation G(A, B, C) = I, then their values must satisfy this same relation (with I replaced by its value 1). This, analogously, holds for all rows and columns. In the following way a contradiction can be derived from this:
20 EPR essentially used this reasoning to prove that quantum mechanics is incomplete.
21 The proof is an elegant version of a proof given by Greenberger, Horne, Shimony, and Zeilinger [299]. These are simplifications of an approach by Kochen and Specker [300] (see also section 10.2.3; also Peres [301]).
Take G(A,B,C) = ABC. For each row of matrix (6.6) ABC = I. This also holds true for the first two columns. The product of the operators of the third column is −I. The values of the observables in the rows and columns should obey analogous relations. Calculate the product of all nine values by taking the product of the products obtained in the rows. This yields 1. Calculate the product of all nine values by taking the product of the products obtained in the columns. This yields −1. Since the product would then depend not only on the individual values, but also on the way it is calculated, it is impossible to attribute in a consistent manner well-defined values to all nine observables.
It follows that Ballentine’s ‘statistical’ interpretation is impossible, at least in the sense of an interpretation of quantum mechanical observables as describing objective properties of a microscopic object. The simultaneous attribution of values to all observables as objective properties of a microscopic object forces the quantum statistics of observables into the pattern of classical statistics, thus leading to contradictions of the type discussed above. The ‘possessed values’ principle is incompatible with the structure of the algebra of quantum mechanical observables. It seems that, at least in this respect, the Copenhagen interpretation was right. In the notion of a ‘quantum mechanical observable’ there must be contained an element of ‘emergence’, in the sense that (at least certain) observables get their values only in the measurement. A similar conclusion will be arrived at in section 9.4.1 on the basis of a consideration of a system of four observables, for which a simultaneous attribution of values implies that the Bell inequality must be satisfied. This conclusion also reflects the unreliability of the notion of EPR’s ‘elements of physical reality’ as quantum mechanical observables, possessed by a microscopic object independently of any measurement. 
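The contradiction derived above can be checked numerically. The sketch below is an illustration, not the author's own calculation: it assumes the commonly quoted arrangement of Mermin's nine observables (the ordering in matrix (6.6) may differ), and the names `square`, `line_sign` and `consistent_assignments` are hypothetical. It verifies the commutation within rows and columns, computes the six operator products, and then searches all 2^9 assignments of values ±1 for one that reproduces these products.

```python
import itertools
import numpy as np

# Pauli matrices and the 2x2 identity.
I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def op(a, b):
    """Operator a acting on particle 1 times operator b acting on particle 2."""
    return np.kron(a, b)

# Assumed arrangement of the nine observables (commonly quoted Mermin square).
square = [
    [op(sx, I2), op(I2, sx), op(sx, sx)],
    [op(I2, sy), op(sy, I2), op(sy, sy)],
    [op(sx, sy), op(sy, sx), op(sz, sz)],
]
rows = square
cols = [[square[r][c] for r in range(3)] for c in range(3)]

def commute(a, b):
    return bool(np.allclose(a @ b, b @ a))

def line_sign(ops):
    """Return s = +1 or -1 such that the product of the three operators is s*I."""
    prod = ops[0] @ ops[1] @ ops[2]
    s = prod[0, 0].real
    assert np.allclose(prod, s * np.eye(4))
    return int(round(s))

all_commute = all(commute(a, b)
                  for line in rows + cols
                  for a, b in itertools.combinations(line, 2))
row_signs = [line_sign(line) for line in rows]
col_signs = [line_sign(line) for line in cols]

def consistent_assignments():
    """Count assignments of values +1/-1 obeying all six product relations."""
    count = 0
    for vals in itertools.product([1, -1], repeat=9):
        v = [vals[0:3], vals[3:6], vals[6:9]]
        rows_ok = all(v[r][0] * v[r][1] * v[r][2] == row_signs[r] for r in range(3))
        cols_ok = all(v[0][c] * v[1][c] * v[2][c] == col_signs[c] for c in range(3))
        if rows_ok and cols_ok:
            count += 1
    return count

print(all_commute, row_signs, col_signs, consistent_assignments())
```

The exhaustive search is only feasible because the eigenvalues of all nine operators are ±1; it is a brute-force restatement of the parity argument (product of row products versus product of column products), not a substitute for it.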
Although a transition from an individual-particle to an ensemble interpretation solves a number of problems, it, evidently, does not solve them all. Quantum ensembles should clearly be distinguished from classical ones; at least they should not be taken in the sense of Ballentine’s ‘statistical’ interpretation. The minimal interpretation of quantum mechanics can be seen as an attempt to circumvent precisely this problem by not attempting to explain. ‘Elements of physical reality’ explaining the quantum mechanical measurement results might exist. But they cannot be quantum mechanical observables. Analogously to an explanation of the rigidity of a billiard ball by the properties of the constituting atoms and their interactions rather than
by classical rigid-body theory (which only describes), an explanation of a quantum mechanical measurement result might require a subquantum theory describing a possible subquantum dynamics (cf. section 2.5.2). In an empiricist interpretation this analogy is quite obvious: quantum mechanical measurement results do not explain; on the contrary, they have to be explained (viz, by a detailed account of the combined action of preparation and measurement procedures). Ballentine’s ‘statistical’ interpretation can be seen as an attempt to strengthen the minimal interpretation in an objectivistic-realist sense by treating quantum mechanical observables as ‘elements of physical reality’. The failure of the ‘possessed values’ principle demonstrates the impossibility of an objectivistic-realist interpretation of quantum mechanics having this feature. In the following section an analogous question will be discussed with respect to quantum mechanical states.
6.4.3 Explanation by means of subensembles?
In the minimal interpretation the quantum ensemble (both the pure one and the mixture) is considered as homogeneous in a de facto way, thus circumventing the problems, discussed in section 6.3.3, related to the (in)homogeneity of quantum ensembles. This differs from the Copenhagen form of the ensemble interpretation, in which only state vectors are thought to represent homogeneous ensembles, whereas density operators correspond to inhomogeneous ones. If no explanation is contemplated for any difference between measurement results obtained in an ensemble, then the minimal view might be thought to be no less legitimate than the Copenhagen one. However, in view of the experimentally corroborated possibility of selecting distinct subensembles by means of conditional preparation (cf. sections 3.2.6 and 3.3.4) we have reason to doubt the homogeneity of quantum ensembles. Thus, in the EPR entangled state (5.8) it is tempting to interpret, for instance, the reduced density operator in (6.2) as describing an ensemble consisting of two subensembles, each being represented by a different eigenvector of the observable concerned (and analogously in the other cases). Measurement results obtained in a subensemble can be thought to be explained by such a decomposition of the state (rather than by predetermined observables).
Note that this strengthening of the minimal interpretation need not imply a return to the Copenhagen one. It is assumed that only certain subensembles (rather than individual particles) are represented by state vectors. Hence the ensemble is not supposed to be a von Neumann ensemble. Accordingly, this strengthening does not go all the way towards an ‘ignorance interpretation of states’, and the problems of an individual-particle interpretation, discussed in section 6.3, do not arise. Moreover, a subensemble represented by a state vector might be considered as inhomogeneous with respect to observables which do not have that vector as an eigenvector.
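The decomposition of a reduced density operator into subensembles, mentioned above, can be sketched numerically. The example assumes that the entangled state is the two-particle spin singlet (the explicit form of (5.8) is not reproduced here), and the function name `reduced_particle_2` is hypothetical. It shows that the reduced density operator of particle 2 admits two different decompositions, one in eigenvectors of the z component and one in eigenvectors of the x component of spin.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# Assumed entangled state: the spin singlet of particles 1 and 2.
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
rho_12 = np.outer(singlet, singlet.conj())

def reduced_particle_2(rho):
    """Partial trace over particle 1 of a two-qubit density matrix."""
    r = rho.reshape(2, 2, 2, 2)          # indices (i1, i2, j1, j2)
    return np.einsum('ikil->kl', r)      # sum over i1 = j1

rho_2 = reduced_particle_2(rho_12)

def projector(v):
    return np.outer(v, v.conj())

# Decomposition in eigenvectors of the z component ...
mix_z = 0.5 * projector(up) + 0.5 * projector(down)
# ... and in eigenvectors of the x component.
plus = (up + down) / np.sqrt(2)
minus = (up - down) / np.sqrt(2)
mix_x = 0.5 * projector(plus) + 0.5 * projector(minus)

print(np.round(rho_2.real, 3))
print(np.allclose(rho_2, mix_z), np.allclose(rho_2, mix_x))
```

Both decompositions reproduce the same density operator, which is why the density operator by itself does not single out one set of subensembles.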
Nevertheless, as demonstrated by the EPR problem discussed in chapter 5, an ensemble interpretation of the quantum mechanical state vector meets considerable difficulty due to the existence of incompatible observables, which also has its influence in a discussion of state vectors. In particular, it was demonstrated in section 6.4.2 that, in general, quantum mechanics is not able to describe subensembles in which incompatible observables simultaneously have well-defined values. In the Copenhagen interpretation the existence of such subensembles is denied (indeed, such subensembles do not fit into the von Neumann ensemble structure), the measurement interaction being thought to be the cause of this. In order to transcend this argumentation, in the EPR reasoning particles 1 and 2 were chosen to be physically inequivalent in the sense that, unlike particle 1, particle 2 does not interact with a measuring instrument. For this reason the focus was on the ensemble of the latter particle, allegedly not being influenced by the measurement interaction. Since each individual element of the ensemble of particles 2 would be a member of a subensemble, represented by an eigenvector of one observable, and, at the same time, also a member of a different subensemble, represented by an eigenvector of a second, incompatible observable, it seemed that each individual particle 2 should have well-defined values of both observables. Moreover, the particle 2 subensembles thus defined were thought to be independent of which of the two observables is measured. The impossibility, noticed in section 6.4.2, of finding quantum mechanical state vectors representing the subensembles obtained by combining into a joint one decompositions that are simultaneously defined by two incompatible observables, need not signify that the subensembles do not exist. It may just imply that the subensemble of particles 2 yielding a well-defined value of both observables cannot be described by quantum mechanics. 
This, at least, was the conclusion drawn by EPR, who interpreted the existence of the subensembles as evidence of ‘incompleteness of quantum mechanics (in a wider sense)’. Even though not all subensembles could be represented by quantum mechanical state vectors, it nevertheless seemed appropriate to abolish the individual-particle interpretation in favor of an ensemble one, thus opening up the possibility of explaining measurement results by the existence of ‘elements of physical reality’ prepared independently of any measurement.
However, the inability of quantum mechanics to describe certain (sub)ensembles is indicative of the fact that an ensemble interpretation of quantum mechanics also meets with severe difficulties. First, each individual element of the ensemble of EPR particles 1 and 2 is still assumed to have sharp values of both quantum mechanical observables considered (as well as of spin components in all other directions). This, however, would take us back to Mermin’s problem with respect to the ‘possessed values’ principle, discussed in section 6.4.2. Measurement results of quantum mechanical observables cannot be explained by the assumption that the observables had their measured values already before the measurement. As long as subensembles are defined in terms of values of quantum mechanical observables (as was done
by EPR), ‘explanation by means of subensembles’ will fail because ‘explanation by means of observables’ fails. Second, from an operational point of view there seems to be a certain inconsistency in the idea that on one hand subensembles exist in which two incompatible quantum mechanical observables have well-defined values, whereas on the other hand quantum mechanics would not be able to yield a description of these subensembles. As a matter of fact, if such subensembles would exist, then a joint measurement of two compatible observables would select its elements (according to Heisenberg this is possible in a determinative sense, cf. section 4.6.1). Since such a joint measurement is a valid quantum mechanical procedure, it would seem that the description of such a subensemble is well within the domain of application of quantum mechanics.
Even without insights like those obtained on the basis of Mermin’s reasoning, the Copenhagen choice of denying a quantum mechanical observable a status as ‘element of physical reality’ must have seemed a viable alternative to EPR’s conclusion that quantum mechanics must be incomplete, even though it is able to describe all procedures and observations that are possible within its domain of application. With hindsight we now know that this was the correct alternative: it follows from the failure of the ‘possessed values’ principle that quantum mechanical observables cannot play the role of ‘elements of physical reality’. By the same token an objectivistic-realist interpretation of the reduced state of particle 2 as a description of the objective reality of the particle 2 ensemble is not very well possible. Note, however, that it would be too hasty to infer from this ‘completeness of quantum mechanics’ in the sense of ‘homogeneity of quantum ensembles’, or the impossibility of explaining quantum mechanical measurement results by assuming the existence of different subensembles. 
The only conclusion that can be drawn is that such subensembles cannot be characterized by values of quantum mechanical observables, nor by quantum mechanical state vectors. This leaves open the possibility that ‘elements of physical reality’ could correspond to certain features of sub-quantum mechanical reality, to be described by some subquantum theory, explaining the quantum mechanical measurement result analogously to an explanation of a billiard ball’s rigidity by features of an atomic reality described by a microscopic theory explaining rigid body behavior by properties of inter-atomic interactions. Thus, it is possible that an EPR particle 2, selected on the basis of a measurement result of a quantum mechanical observable of particle 1 (cf. chapter 5), possesses a subquantum mechanical property, warranting that, on measuring the correlated observable of particle 2, the correlated value of the latter observable is found with certainty. It is important to note here that the values of these observables need not be attributed as properties to the microscopic object, neither in an objectivistic-realist sense (EPR), nor in Bohr’s contextualistic-realist sense. In an empiricist interpretation of quantum mechanics it is possible to distinguish between the measurement result (described by quantum mechanics) and the
(subquantum) property of the microscopic object explaining it (cf. section 2.2.1). In chapter 10 this will be discussed somewhat further. Here it is concluded that the possibility of sub-quantum mechanical ‘elements of physical reality’ makes the issues of ‘explanation by means of subensembles’ and ‘explanation by means of observables’ independent of each other. Hence, Einstein’s basic ideas that quantum mechanical measurement results are objectively determined by the preparation, independently of measurement, and that an ensemble can be subdivided according to different preparations, may still be valid. It is even not impossible to contemplate subquantum theories in which no unique one-to-one correspondence exists between the subquantum ‘elements of physical reality’ and quantum mechanical measurement results, yet allowing a deterministic behavior in the case of an eigenvector of the measured observable (cf. appendix 10.6). Unlike ‘explanation by means of observables’, ‘explanation by means of subensembles’ remains a possibility, although the subensembles will not be describable by quantum mechanics. ‘(In)completeness of quantum mechanics in a wider sense’ has a wider scope than was taken into account in the EPR reasoning. It does not only refer to the (a)causal way quantum mechanical measurement results come into being (i.e. (in)determinism of quantum mechanics), but it also regards the very concepts in terms of which these processes can be analyzed. By restricting itself to quantum mechanical concepts the EPR discussion essentially remained within the domain of quantum mechanics, and did not really refer to the issue of ‘(in)completeness of quantum mechanics in a wider sense’ (which, if it did, would presumably not even have been contested by Bohr, cf. section 4.2.1). ‘Explanation by means of subensembles’ might need subensembles selected on the basis of non-quantum mechanical concepts rather than quantum mechanical observables. 
This, precisely, was the outcome of the EPR reasoning, if cast in terms of state vectors (cf. section 5.4), unfortunately without inducing Einstein to put into doubt the quantum mechanical nature of his ‘element of physical reality’. Einstein has not pursued this subquantum mechanical line any further, although his proof of ‘incompleteness of quantum mechanics in a wider sense’ could have been an incentive to do so (see also chapter 10). By the majority of physicists ‘elements of physical reality’ were generally deemed incompatible with quantum mechanics, and the Copenhagen individual-particle interpretation of quantum mechanics was widely accepted. The concomitant switch of the key issue from ‘(in)completeness in a wider sense’ to ‘(in)completeness in a restricted sense’ has been responsible for the idea that, anyway, quantum mechanics cannot be interpreted objectivistically, but must be attributed a contextual meaning. As will be seen in section 6.4.4, this was accompanied by yet another attempt at explaining certain features of quantum measurement.
6.4.4
Explanation by means of projection?
Strong projection The Copenhagen interpretation, too, attempts to explain certain aspects of measurement, even though no explanation of an individual measurement result is thought to be possible. As a matter of fact, the strong projection postulate (1.70) is clearly meant to explain the individual measurement result obtained in a subsequent measurement of the same observable. Thus, the strict correlation of successive momentum measurements of a free particle might be thought to be explained by a projection, caused by the first momentum measurement, onto the momentum eigenvector corresponding to the momentum value obtained in that measurement. Analogously, although in the EPR experiment no explanation of an individual measurement result for particle 1 is thought to be possible, the projection postulate can be seen as providing an explanation of the strict correlation of the measured observables of particles 1 and 2 in the singlet state: by the measurement on particle 1 allegedly the state of the second particle is changed into the corresponding eigenvector of the correlated particle 2 observable, thus yielding an explanation of the measurement result of this latter observable (if measured). ‘Explanation by means of projection’ of the Compton effect (1.71) (which is very analogous to EPR) is actually at the basis of the introduction by von Neumann of the projection postulate into quantum mechanics. It appears that by embracing the possibility of ‘explanation by means of projection’ the Copenhagen interpretation is perpetuating in yet another way the inconsistency observed in section 5.3.1. Thus, when the second measurement is not actually carried out, it is appropriate to consider experiments of the type considered here as conditional preparations (cf. sections 3.2.6 and 3.3.4), conditional on the measurement result of the first measurement. For measurements of the first kind von Neumann projection does the job (for measurements of the second kind more general transitions must be considered, cf. section 3.2.4).
However, conditional preparation is a common and widely used experimental procedure for preparing states, not to be distinguished in any special sense from other preparation procedures. Therefore the issue of explanation is not essentially different from the general problem of explanation of measurement results following arbitrary preparation. In general, in the Copenhagen interpretation no explanation of individual measurement results is thought to be available, however. The object need not “have been in” the eigenstate corresponding to the eigenvalue of the measured observable. Hence it would be inconsistent if in a conditional preparation such an eigenstate would be required. Admittedly, by stressing instrumentalism of the interpretation of the state vector the Copenhagen interpretation has resisted the explanatory tendency going with the projection postulate. Thus, since the state vector was thought to be ‘just an instrument’ for predicting measurement results, it might be considered as ‘just describing’, not ‘explaining’, the correlations. If this were the whole story, however,
then the projection postulate would have had no need to be conjured up at all; the correlations of all measurement results of correlation measurements are sufficiently described by the initial state vector of the two-particle system [264, 10], without any necessity to consider projection or conditional preparation. Introduction of the projection postulate is a sign that the interpretation has adopted elements leading it beyond mere instrumentalism. Thus, in the Copenhagen interpretation the idea that in the above examples the first measurement is the cause of the value obtained in the second one, as well as of the state the particle is in after the first measurement, is a consequence of a tendency toward a realist interpretation of the quantum mechanical formalism. A consistent instrumentalist interpretation does not demand an explanation of measurement results of quantum mechanical observables, correlation observables not excepted. In a realist individual-particle interpretation of the state vector strong projection does not imply inconsistency because here ‘explanation’ is a legitimate issue. If a particle with certainty has a value of some standard observable, this can be explained only by its being in the corresponding eigenstate. Hence, if quantum mechanics is thought to explain (rather than just to describe), an individual-particle interpretation must assume von Neumann projection (or rather its generalized form (3.15)) to explain conditional preparation. This may be the reason that nowadays so many physicists believe that von Neumann projection is necessary, even if such a belief can be maintained only at the cost of accepting nonlocal influences in applications of conditional preparation to the EPR experiment (cf. section 6.3.2). 
Next to the classical paradigm, discussed in section 2.4.2, the possibility, observed here, of evading inconsistency in an individual-particle interpretation may have been an incentive to adopt a more realist view of quantum mechanics than is strictly consistent with the Copenhagen interpretation. However, ‘nonlocality’ may be too large a price to be paid for such a strengthening of the Copenhagen interpretation: it might be preferable not to require from quantum mechanics any explanation at all.
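The remark that the correlations of correlation measurements are sufficiently described by the initial two-particle state vector can be checked numerically. The following is a minimal sketch, assuming a spin-1/2 model of the EPR pair in which Pauli matrices play the role of the measured observables; the function names `correlation` and `conditional_state` are illustrative and do not come from the text. It computes the strict (anti)correlation directly from the singlet state, and, separately, the state of particle 2 conditional on a result for particle 1, which is what the projection postulate would supply.

```python
import numpy as np

# Assumed model: two spin-1/2 particles, Pauli matrices as observables.
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# Singlet state (|up,down> - |down,up>)/sqrt(2); particle 1 is the first factor.
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)


def correlation(a, b, psi):
    """Expectation value <psi| a (x) b |psi>, using the initial state only."""
    return float(np.real(np.vdot(psi, np.kron(a, b) @ psi)))


def conditional_state(psi, outcome_vec):
    """Particle-2 state conditional on particle 1 being found in outcome_vec.

    Returns the normalized particle-2 vector and the probability of the outcome.
    """
    amp = outcome_vec.conj() @ psi.reshape(2, 2)  # contract the particle-1 index
    prob = float(np.real(np.vdot(amp, amp)))
    return amp / np.sqrt(prob), prob


corr_zz = correlation(sz, sz, singlet)
corr_xx = correlation(sx, sx, singlet)
state2, prob_up = conditional_state(singlet, up)
```

In this sketch the correlation values follow from the initial state alone; the conditional state is needed only if one wishes to describe particle 2 after selecting on a result for particle 1, which is the conditional preparation discussed in the text.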
Weak projection In his trade-off of ‘completeness’ and ‘locality’ (cf. section 5.3.1) Einstein tried to evade the necessity of strong projection of the state vector of particle 2 by trading a (realist) individual-particle interpretation for an (equally realist) ensemble one. In ensemble interpretations the results of correlation measurements might be tentatively explained by attributing in the initial state the values of correlation observables (like those of any other observable) as properties to the microscopic object (cf. the ‘possessed values’ principle, section 2.3). In an ensemble interpretation projection can tentatively be interpreted as selection of a subensemble, suggesting that the ‘elements of physical reality’ of the two incompatible particle 2 observables were both already there before, and independently of, the measurement of particle 1. Stated in this way, abandoning
the individual-particle interpretation for an ensemble one might seem to solve the nonlocality problem, thus seemingly also countering Bohr’s contextuality argument. However, the impossibility of such an assumption (as demonstrated, for instance, by Mermin’s problem, section 6.4.2) is evidently thwarting any attempt at explaining quantum mechanical correlations in this way. It is then often felt as a problem how, if a measurement of an observable of particle 1 is carried out in the EPR experiment, particle 2 “knows” how to satisfy the strict correlation holding in the singlet state, notwithstanding the fact that the value of the correlated observable was not there beforehand. Unfortunately, this problem of ‘explanation’ is often tentatively “solved” in the ensemble interpretation analogously to the way it is dealt with in the individual-particle one, viz, by adopting a causal view of conditional preparation. It is believed that by the measurement of particle 1 (of either one of two incompatible observables) a subdivision of the particle 2 ensemble is brought about (to be described either by the representation (6.2) or (6.3)), corresponding to the correlated observable of particle 2. Thus, a measurement of one of these observables is supposed to split the particle 2 ensemble into two subensembles represented by the eigenvectors of the correlated particle 2 observable (and analogously for the other observable). Since no quantum mechanical subensemble can exist in which the two incompatible particle 2 observables both have well-defined values, it is believed that this cannot amount to a simple selection of pre-existing subensembles. It is concluded that the subensembles of the particle 2 ensemble cannot have been there before, and independently of, the measurement on particle 1. Since the values of these observables were not there before, they appear to have come into being only because of the measurement performed on particle 1. Much in the same way as in an individual-particle interpretation (cf.
section 6.3.2) in a realist interpretation this implies nonlocality, even though in the ensemble version this is less conspicuous than in the individual-particle one because here only weak projection (1.72) is involved. In the EPR experiment weak projection does not even regard the mathematical formalism (the reduced density operator of particle 2 being independent of the measurement arrangement for particle 1) but just its interpretation. In section 5.4.2 it was argued that in an instrumentalist or empiricist interpretation conditional preparation does not imply nonlocal influencing because the change of the state of particle 2, conditional on a measurement result obtained on particle 1, does not describe a change of microscopic reality itself. This is applicable to an individual-particle interpretation as well as to an ensemble one. In the instrumentalist interpretation the state change is just a change of the “instrument for the calculation of quantum mechanical measurement results”, induced by our increased knowledge obtained from a measurement of particle 1; in the empiricist interpretation it just refers to changing the way objects are selected in the preparation process of particles 2 (cf. section 3.2.6). Here it is stressed once more that only if conditional preparation is interpreted, in a realist sense, as yielding a quantum mechanical description of the microscopic reality of particle 2 (either individual particle or ensemble) as it is after the measurement of particle 1 has been executed, must the nonlocal predicament of the experiment entail acausal (nonlocal) influencing. Only a non-realist interpretation of conditional preparation is capable of circumventing this problem. Projection as a principle of conditional preparation is a valuable tool for describing the relation between certain preparations and measurements. However, it merely describes (not explains) correlations between measurement results of particle 1 and experimental results to be obtained by later measurements on particle 2. The fact that in the EPR experiment a causal connection between the measurement results of particles 1 and 2, not based on ‘elements of physical reality’, implies nonlocality of this connection, is a strong argument against a realist interpretation of the quantum mechanical formalism, and in favor of an empiricist one.
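The statement that weak projection leaves the formalism for particle 2 untouched can be illustrated with a small numerical sketch. It assumes, as an illustration only, a spin-1/2 singlet pair and two incompatible particle 1 measurements along z and x; the helper names `reduced_state_2` and `conditional_ensembles` are hypothetical. The sketch shows that the reduced density operator of particle 2 is the same whichever measurement is performed on particle 1, while the two measurements correspond to two different decompositions of that operator into conditionally prepared subensembles.

```python
import numpy as np

# Assumed model: spin-1/2 singlet pair; z and x bases stand in for two
# incompatible measurement arrangements for particle 1.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
plus = (up + down) / np.sqrt(2)
minus = (up - down) / np.sqrt(2)

singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)


def projector(v):
    """Projection operator |v><v|."""
    return np.outer(v, v.conj())


def reduced_state_2(psi):
    """Partial trace of |psi><psi| over particle 1."""
    m = psi.reshape(2, 2)      # m[i, j]: i indexes particle 1, j particle 2
    return m.T @ m.conj()      # rho2[j, k] = sum_i m[i, j] * conj(m[i, k])


def conditional_ensembles(psi, basis1):
    """(probability, particle-2 state) for each particle-1 outcome in basis1."""
    m = psi.reshape(2, 2)
    result = []
    for v in basis1:
        amp = v.conj() @ m
        p = float(np.real(np.vdot(amp, amp)))
        result.append((p, amp / np.sqrt(p)))
    return result


rho2 = reduced_state_2(singlet)
mix_z = sum(p * projector(s) for p, s in conditional_ensembles(singlet, [up, down]))
mix_x = sum(p * projector(s) for p, s in conditional_ensembles(singlet, [plus, minus]))
```

Both `mix_z` and `mix_x` reproduce `rho2`, so nothing in the statistics of particle 2 alone distinguishes the two arrangements. Whether the two decompositions are read as selections, as descriptions of a changed reality, or as bookkeeping is a matter of interpretation, which the code does not settle.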
6.5
The EPR festival of confusions
Two different senses of (in)completeness; nonlocality As already observed in section 4.2.3, the ‘(in)completeness’ question is entangled with a number of other issues, disregard of which has caused quite a bit of confusion. Thus, by the EPR discussion ‘completeness’ was connected with a possible ‘nonlocality’ that seemed to be inherent in Bohr’s application of his principles of ‘correspondence’ and ‘complementarity’ to the EPR challenge (cf. section 5.3.1). It seemed that ‘completeness’ and ‘locality’ could be traded against each other, ‘completeness’ necessarily implying ‘nonlocality’. It should be realized that in (5.5) and (5.6) ‘completeness’ has the meaning of ‘completeness in a restricted sense’. Nowadays ‘nonlocality’ is an important issue also in connection with ‘(in)completeness in a wider sense’, associated with the possibility of deriving Bell’s inequality in hidden-variables theories. This will be discussed in chapter 10. It is important to realize that these issues are virtually unrelated because ‘completeness in a restricted sense’ is just regarding the interpretation of quantum mechanics, and is not concerned with the possibility of subquantum theories (compare section 4.2.2). However, the EPR conclusion (to the extent that an ensemble of ‘elements of physical reality’ cannot be completely described by quantum mechanics) is not formulated in terms of ‘completeness in a restricted sense’ but in terms of ‘completeness in a wider sense’ (cf. section 5.2). As a consequence, the EPR problem has been given a much wider significance than it is actually entitled to. Due to its restriction to quantum mechanical observables as possible candidates of ‘elements of physical reality’ it hardly transcends the domain of quantum mechanics. Only by neglecting the difference between the two notions of ‘(in)completeness’ has it been possible for Einstein [267] to connect ‘nonlocality’ with the issue of ‘completeness in a wider sense’.
The identification of the two different forms of ‘completeness’ may have
undermined the logical soundness of the EPR reasoning (see also section 9.5.3). Proving that an individual-particle interpretation is incompatible with locality is not the same as proving that an ensemble interpretation is compatible with it. It follows that any conclusion with respect to ‘nonlocality’, that is based on the EPR reasoning, should be approached with a certain restraint. Nevertheless, the EPR conclusion that quantum mechanics is not able to yield a complete description of all aspects of the EPR experiment seems to be correct (cf. section 6.4.3). However, this does not entail any nonlocality if ‘elements of physical reality’ are not equated with quantum mechanical measurement results, but are allowed to have a subquantum mechanical nature (cf. section 10.6).
‘Preparation’ and ‘measurement’; ‘objectivity’ versus ‘contextuality’ As already observed in section 4.2.3 the ‘(in)completeness’ discussion between Bohr and Einstein was actually about the question of ‘objectivity versus contextuality’, that is, whether the quantum mechanical description is referring to a reality that is independent of measurement, or not. For this very reason the EPR experiment was devised such that particle 2 is not interacting with a measuring instrument. Much confusion might have been prevented if Bohr had accepted the EPR arrangement as a preparation procedure for particle 2 rather than as a measurement procedure (compare section 5.3.1). As will be seen in section 7.10.3, even if taken ‘in a restricted sense’, two different notions of ‘complementarity’ have still to be distinguished, viz, one for ‘preparation’ and one for ‘measurement’ (de Muynck [129]). By not properly distinguishing ‘preparation’ and ‘measurement’ these notions have been thoroughly mixed up. Actually, since Einstein was dealing with the objective reality of particle 2, he was dealing with (the result of) preparation. On the other hand, Bohr’s application of his correspondence principle shows that he considered the EPR experiment as a measurement of particle 2. Small wonder that it was impossible to arrive at an unambiguous conclusion acceptable to both. By realizing the fundamental difference between procedures of preparation and measurement, the distinction between EPR experiments and the modified EPR experiments discussed in section 5.4.3 might have been appreciated, in the latter a measurement being also performed on particle 2. Unfortunately, EPR did not completely avoid ‘measurement’, a measurement of particle 1 still being taken into account. This gave Bohr an opportunity to unjustifiedly apply his principles of ‘correspondence’ and ‘complementarity’ also to the EPR problem (cf. section 5.3.1). 
Furthermore, by equating ‘elements of physical reality’ with ‘quantum mechanical measurement results’ EPR opened the way for Bohr to apply these measurement principles even to particle 2, and to invoke mutual exclusiveness of the measurement arrangements to launch a verdict of ambiguity against the definition of the EPR ‘element of physical reality’.
As discussed in section 6.4.3, Einstein’s objectivistic ensemble view is highly problematic if based on an ‘element of physical reality’ as a ‘quantum mechanical measurement result’. As seen in chapter 4, it can also hardly be denied that the interaction between object and measuring instrument plays an important role in measurements performed in the atomic domain. This does not imply, however, that we have to accept Bohr’s contextualistic answer to EPR. This answer is based on the idea that the EPR experiment is a quantum mechanical measurement of a quantity of particle 2. However, as discussed in section 5.3, selection of particle 2, conditional on measurement results for particle 1, should be dealt with in terms of conditional preparation rather than measurement. Therefore Bohr’s answer could be considered by Einstein as irrelevant, and stemming from a “tranquilizing philosophy” not applicable to the EPR arrangement. However, such a qualification seems to be equally applicable to the way incompatibility of quantum mechanical observables is ignored in Einstein’s attempt to treat quantum mechanics analogously to classical statistical mechanics. Equating ‘elements of physical reality’ with results of quantum mechanical measurements has been a primary source of confusion.
Realist versus empiricist interpretation Whereas the EPR discussion is confusing because of its half-hearted way of transcending the boundaries of quantum mechanics, it is equally confusing by tacitly restricting interpretations of quantum mechanics to realist ones. Interpreting results of quantum mechanical measurements as properties of the microscopic object rather than as pointer readings of a measuring instrument might turn out to be another primary source of confusion. As discussed in section 5.3, Einstein and Bohr both looked upon quantum mechanical observables as properties of the microscopic object, the difference being whether such a property can be an objective one (Einstein), or whether it can be attributed to the object only within the context of a measurement (Bohr). It was this realism of the interpretation of quantum mechanical observables that was responsible for the nonlocality which Einstein observed in the contextualism of Bohr’s ‘correspondence’ answer to EPR. For Einstein the solution to this ‘nonlocality’ problem was a (realist) ensemble interpretation of the state vector, in which preparation of particle 2, conditional on a measurement result for particle 1, can be seen as a selection of a subensemble that was already present before, and independently of, the measurement. From the impossibility that the elements of such subensembles simultaneously have values of incompatible observables it was concluded by EPR that such ensembles cannot be quantum mechanical ones (incompleteness in a wider sense). Unfortunately it was not concluded that the ‘elements of physical reality’ themselves cannot be quantum mechanical. The conclusion of ‘incompleteness of quantum mechanics’ just referred to the relative frequencies of the ‘elements of physical
reality’, the latter being always identified with quantum mechanical measurement results. If the possibility of an empiricist interpretation would have been contemplated, then such an identification would have been far less self-evident than assumed by EPR, since in that interpretation quantum mechanical measurement results are not properties of the microscopic object, but pointer readings of measuring instruments. Hence, they can exist only if the measurements are actually performed. It would probably have been recognized that in the EPR experiment no observables of particle 2 can play a role, because no measuring instrument is present interacting with that particle (the EPR experiment being a conditional preparation of particle 2 rather than a measurement). Actually, the confusion of ‘preparation’ and ‘measurement’ (cf. section 4.6.1) that plays a role here, is a consequence of the tendency to interpret quantum mechanical observables in a realist sense (see also section 5.3.1). It is likely that, unlike what happened in the EPR discussion, in an empiricist interpretation the difference between EPR experiments and modified EPR experiments would not have escaped attention. Notwithstanding Bohr’s instrumentalism, the discussion has been dominated by a realist understanding of quantum mechanical state vectors and observables, attributing the values of the latter to particle 2 even when no measurement is carried out on this particle. It has now become evident that this is impossible if it is meant in Einstein’s objectivistic sense (compare the failure of the ‘possessed values’ principle, discussed in section 6.4.2). As a consequence it seems that in a realist interpretation the contextualistic version is the only alternative, the context of particle 2 in the EPR experiment being (co-)determined, in the sense of Bohr’s correspondence principle, by the measurement arrangement for particle 1. 
In contrast to Einstein’s expectation, the ensuing ‘nonlocality’ of Bohr’s contextualism cannot be traded against ‘incompleteness’. An interpretation of the reduced density operator as an objective description of an ensemble of particles 2 in the EPR experiment cannot provide a valid alternative to the nonlocal contextualism of the Copenhagen interpretation, represented by the different decompositions (6.2) and (6.3) of this operator, valid within two different measurement contexts. Due to the failure of the ‘possessed values’ principle these decompositions cannot just correspond to two different partitions of the particle 2 ensemble into subensembles, already existing before, and independently of, the measurement on particle 1. The particle 2 observables appear to be able to get their values only as a consequence of the measurement on particle 1, even if the state vector is thought to represent an ensemble rather than an individual particle (cf. section 6.4.4). If the state vector describes the “reality” either of an individual particle or of an ensemble, then the “reality” of particle 2 (either individually or ensemble-wise) appears to depend on which observable is chosen to be measured on particle 1. Hence, in a realist interpretation ‘locality’ cannot be saved by abandoning the individual-particle interpretation for an ensemble one. Einstein’s trade-off between ‘locality’ and ‘completeness’ does not
seem to work properly in a realist interpretation. Even if Bohr had referred to particle 2 as being conditionally prepared rather than subject to measurement, he would have been completely right in pointing at the essential role of the measurement of particle 1 in allowing a subdivision by EPR of the ensemble of particles 2 into quantum mechanical subensembles (cf. section 5.4). Since the conditional preparations are based on measurements of two incompatible observables of particle 1, we have different, incompatible preparation procedures for the subensembles of particle 2. In an empiricist interpretation neither modified EPR (correlation) measurements (due to the postulate of local commutativity, cf. section 1.3.1), nor conditional preparations (like in the EPR experiments, compare section 6.4.4) give rise to any suspicion of nonlocality. ‘Nonlocality’ (in the sense of superluminal influences) is an issue only in a realist interpretation of the quantum mechanical description of the EPR experiment. For this reason it is rather surprising that in the EPR discussion the possibility of solving the ‘nonlocality’ problem by a trade-off between a realist and an empiricist interpretation of quantum mechanics has not had a greater impact. Even though at that time the failure of the ‘possessed values’ principle was not known, there exist independent arguments in favor of an empiricist interpretation (compare section 2.4) that might have appealed to physicists who adhered to logical positivism/empiricism, and who must have been unhappy with the seeming necessity to accept nonlocality of the quantum world against all empirical evidence. Such an empiricist alternative was not contemplated, however. Instead, according to Einstein a choice should be made between a nonlocal, realist, individual-particle interpretation and a local, equally realist, ensemble one.
However, the EPR proof of the incompleteness of quantum mechanics does not signify that a local realist ensemble interpretation of this theory would actually be possible. On the contrary, it was concluded by EPR that the ensembles cannot (all) be described by quantum mechanics. Hence, it seems that a subquantum theory will be necessary for such a description (assuming that such a description is possible at all).
Summary Summarizing, the conclusion which can be drawn from the EPR problem with respect to the interpretation of the quantum mechanical formalism is the following: Due to the failure of the ‘possessed values’ principle Einstein’s ideal that quantum mechanics yield a description of an objective reality cannot be fulfilled, even if this reality just refers to an ensemble. This rules out an objectivistic-realist interpretation. Sticking to realist interpretations, a contextualistic-realist interpretation is the next to be considered. For measurements this can be implemented by Bohr’s correspondence principle. For modified EPR experiments this can be done in a local way, in which each particle has its own local measurement context. Unfortunately, Bohr also applied the correspondence principle to EPR experiments, thus giving rise to the idea that a nonlocal context would be necessary. Taking into account that the EPR experiment is not a measurement of particle 2 but a conditional preparation is not sufficient to solve the nonlocality problem in a realist interpretation of quantum mechanics, even if it is an ensemble one. Since a contextualistic-realist interpretation of the EPR experiment still entails a non-empirical nonlocality, it seems necessary to adhere to an even weaker interpretation. This may be an empiricist one. In an empiricist interpretation of quantum mechanics conditional preparation involves selection of subensembles of the particle 2 ensemble. However, these ensembles are not thought to be described by quantum mechanics, but by some subquantum theory (compare chapter 10), encompassing subquantum mechanical ‘elements of physical reality’. The problem of (non)locality as suggested by EPR experiments should be distinguished from the one associated with the Bell inequality, either in quantum mechanics (cf. chapter 9) or in subquantum (hidden-variables) theories (cf. chapter 10), because the Bell inequality is referring to modified EPR (EPR-Bell) experiments rather than EPR ones.
6.6
Modal interpretations
As already discussed in chapter 2, an empiricist interpretation of quantum mechanics is unattractive to many physicists because it is felt that this theory is telling us quite a bit more about reality than just describing correlations between certain events of preparation and measurement. Thus, by Einstein it was emphasized that a physical theory should yield a “complete description of any (individual) real situation (as it supposedly exists irrespective of any act of observation or substantiation)” ([90], p. 667; emphasis added, WMdM). If the empiricist meaning of an observable as a pointer position of some measuring instrument is to be contemplated at all, then, according to this view, at least a principle of ‘faithful measurement’ (cf. section 2.4.3) should relate it to the
reality of the microscopic object. More stringently, the theory should preferably describe just this latter object, without any reference to measurement. In particular, the projection postulate is thought to be an undesirable consequence of the Copenhagen preoccupation with measurement, being inconsistent with the idea that the state vector describes the objective reality of an individual object. In its most classical form this tendency toward a realist interpretation of the quantum mechanical formalism entails the ‘possessed values’ principle, explaining measurement results by properties of the object, objectively possessed by the object prior to measurement (cf. section 6.4.2). However, this very attempt at a realist interpretation is the main source of the paradoxes by which quantum mechanics has been plagued. Thus, as seen in section 3.1, a realist interpretation of the state vector is at the origin of the conventional “measurement problem”, and its problematic “solution”, (strong) von Neumann projection. A realist interpretation of quantum mechanical observables meets considerable difficulties, too. As was seen in section 6.4.2 (see also section 9.1.2), the ‘possessed values’ principle is too stringent to be reconcilable with experimental data. Quantum mechanical measurement results cannot simply be understood by assuming that the individual microscopic object possessed values of all of its observables as objective properties already before the measurement, distinct measurement results being explained by distinct possessed values. A modal interpretation is an attempt to stick as much as possible to a realist interpretation by relaxing it so as to stay free from paradoxes, and to remain in agreement with experiment. A modal interpretation tries to attribute values to quantum mechanical observables, without relying on a principle like the ‘possessed values’ principle. Such an interpretation has first been proposed by van Fraassen [305, 78]. 
It was originally meant to be a formalization of the Copenhagen interpretation, and, for this reason, is referred to as the Copenhagen variant of the modal interpretation (section 6.6.1). It is to be distinguished from an anti-Copenhagen variant that will be briefly discussed in section 6.6.2. Here only the basic ideas are presented. For a more extensive account of different ways of implementing the idea of a modal interpretation, see e.g. Dieks and Vermaas [306].
6.6.1
Copenhagen variant of the modal interpretation
As is customary in the Copenhagen interpretation, the non-classical probabilistic character of quantum mechanical statistics is considered by van Fraassen to be of an irreducibly indeterministic nature. Concomitantly, unless the state is an eigenvector of the measured observable, it is thought to be impossible to attribute a measurement result of an observable to the object as a property necessarily possessed before the measurement. For van Fraassen the possibility that this measurement result is actually found is reason enough to advocate, for quantum mechanical purposes, the
introduction of ‘modalities’ like ‘possibility’, ‘actuality’, and ‘contingency’, next to the classical concepts of ‘truth’ and ‘falseness’. Without entering into the intricacies of modal logic it will be sufficient to state here that a modal interpretation is reflecting the different ways (modi) an object may possess a certain property. When observable A is measured in one of its eigenstates, the corresponding eigenvalue is found with necessity. On the contrary, in a state that is not an eigenvector of A there is only a possibility that this value is found. In the modal interpretation a language is created taking into account the different ways the value is possessed in the two states. In the Copenhagen variant of the modal interpretation the state of a system is thought to be described by a state vector, or density operator, satisfying a Schrödinger equation. This is the dynamic state. Next to it van Fraassen postulates the existence of a so-called value state, specifying which observables do have values, and what they are ([78], p. 275). Such a value state can be characterized by the projection operator on a subspace of Hilbert space, spanned by the eigenvectors of the density operator corresponding to one of its eigenvalues. Thus, if preceding a measurement of a standard observable A the dynamic state is given by a state vector, then only the projection operator onto this state vector has a value, and the value state is given by this projection operator. In particular, in this situation observable A is not thought to have a value if the state vector is not an eigenvector of A. After a measurement of A the value state has changed to the projection operator onto the eigenvector of A corresponding to the measurement result obtained (in case of degeneracy this should be generalized). Vermaas [307] refers to these projection operators as ‘core properties’. Van Fraassen’s ideas can be implemented into the theory of measurements of the first kind, discussed in section 3.2.4.
During a measurement process the dynamic state keeps behaving deterministically (taking into account that during a measurement the object is interacting with a measuring instrument). Restricting ourselves to the non-degenerate case, and taking the initial object state to be $|\psi\rangle = \sum_m c_m |a_m\rangle$, the post-measurement dynamic state of the system object+measuring instrument is then given by (3.5), $|\Psi_f\rangle = \sum_m c_m |a_m\rangle \otimes |\theta_m\rangle$, with $|\theta_m\rangle$ the mutually orthogonal pointer states. The dynamic state of the object can be obtained from (3.5) by means of partial tracing as $\rho_f = \sum_m |c_m|^2 |a_m\rangle\langle a_m|$. This is in agreement with the result of weak projection (1.72). The dynamic state is assumed not to be subject to strong projection. Nevertheless the observable $|a_m\rangle\langle a_m|$ has a well-defined value if for observable A measurement result $a_m$ is found. The density operator $\rho_f$ exhibits the different possibilities of the different post-measurement value states and their probabilities. It is felt as an important achievement of the modal interpretation that by thus relaxing the link between observables and dynamic state the Copenhagen interpretation can be rid of the necessity to assume strong projection. In the modal interpretation the transition, during measurement, between the two value states $|\psi\rangle\langle\psi|$ and $|a_m\rangle\langle a_m|$ corresponds to a transition from the “possible” to the “actual” ([78], p. 276). The probability that such a transition happens is seen as a “measure of the possible” ([78], p. 53). The Copenhagen character of the interpretation is
in agreement with Heisenberg’s Aristotelean philosophy mentioned in section 4.6.2 (see also Jordan’s ideas, section 6.2.2). In this Copenhagen variant of the modal interpretation a realist interpretation of the dynamic state vector is actually renounced. It only “summarizes what is common to all the different ways the system could be.” ([78], p. 299). As far as the reality of the object is involved, this is thought to be represented by the value state. The post-measurement density operator is considered as a “bookkeeping device which identifies the true value-attributions correctly.” ([78], p. 281). It is not necessary to assume that the object is “really” in dynamic state if measurement result has been found. According to van Fraassen the projection postulate is redundant, even when “it is as if the projection postulate... were true.” ([78], p. 327). After a first kind measurement the object is behaving as if it is in the projected state By thus abolishing ‘projection’ as a “real” physical process van Fraassen is trying to save the Copenhagen individual-particle interpretation from the problems observed in section 4.6.6. By drawing a distinction between the dynamic state and the value states (the latter referring to observables) he seems to get even closer to Bohr’s instrumentalist view of the state vector and the latter’s more realist understanding of observables (physical quantities). Ridding the Copenhagen interpretation of the inveterate realism the (dynamic) state vector is often interpreted with, might be seen as an important achievement of this Copenhagen variant of the modal interpretation. From the point of view of avoiding the paradoxical consequences of a realist interpretation this is a large step forward. 
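The partial-trace step described above for a first-kind measurement can be checked numerically. The following is only a minimal sketch: it assumes that (3.5) has the usual first-kind form $\sum_m c_m |a_m\rangle \otimes |\theta_m\rangle$ with orthonormal pointer states, and the amplitudes `c` as well as all variable names are illustrative choices, not taken from the text.

```python
import numpy as np

# Assumed amplitudes c_m of the initial object state |psi> = sum_m c_m |a_m>.
c = np.array([0.6, 0.48j, 0.64])
dim = len(c)

# Assumed first-kind form of (3.5): |Psi_f> = sum_m c_m |a_m> (x) |theta_m>,
# stored as psi_f[i, j] = amplitude of |a_i> (x) |theta_j>.
psi_f = np.diag(c)

# Partial trace over the instrument index j gives the object's dynamic state.
rho_object = psi_f @ psi_f.conj().T

# Weak projection of the initial pure state: sum_m E_m rho E_m, E_m = |a_m><a_m|.
rho_initial = np.outer(c, c.conj())
projectors = [np.outer(e, e) for e in np.eye(dim)]
rho_weak = sum(E @ rho_initial @ E for E in projectors)

# Probabilities of the possible post-measurement value states |a_m><a_m|.
probabilities = np.real(np.diag(rho_object))
print(np.allclose(rho_object, rho_weak), probabilities)
```

In this sketch the reduced object state is diagonal in the eigenbasis of A and coincides with the weakly projected state, while the combined state `psi_f` stays pure, which is the sense in which no strong projection is needed.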
On the other hand, an interpretation of the state vector as a mere ‘bookkeeping device’ seems to fall prey to the general vagueness of instrumentalist interpretations observed in section 6.4.1, and might do less justice to the meaning of the dynamic state than the empiricist ‘symbolic representation of a preparation procedure’. In particular, conditional preparation (cf. section 3.2.6) is a widely used experimental procedure for preparation of a state which for the purpose of subsequent experimentation can be interpreted as a dynamic state, describing a certain reality embodied by the preparation. It could also be questioned whether in an instrumentalist interpretation of the state vector there is any advantage in avoiding von Neumann projection because no causality problems can arise in such an interpretation. The realist dimension of the Copenhagen variant of the modal interpretation is thought to be realized by the concept of ‘value state’, attributing a value to a quantum mechanical observable, be it, in the Copenhagen way, as a property of the microscopic object, possessed not before but after the measurement. These ideas have been generalized and modified by Kochen [308] and by Dieks [218, 309], who emphasize the fact that the state vector (3.5) of the system of object+measuring instrument has precisely the form (1.59) of a polar decomposition. According to Kochen the polar decomposition is especially important to measurement because by it unique (standard) observables are defined for both object and measuring in-
strument (compare section 1.5.3). Thus, the observable relevant to the object (1) is determined as follows. If in (1.66) for and if the set is a complete orthonormal one, then the observable is given by PVM If either the set is not complete, or some of the eigenvalues coincide, then the PVM consists of the projection operators on the (possibly more-dimensional) subspaces spanned by vectors belonging to the same eigenvalue (the vectors not showing up in (1.66) correspond to In this way PVM is uniquely determined by the polar decomposition. It has been emphasized by Dieks [309] that such a polar decomposition always exists, also if no measurement is performed. A quantum mechanical object (1) is always in interaction with its environment (the rest of the universe (2)). This implies that the state vector of object+environment can always be represented according to (1.59) or (1.66), whether a measurement is performed or not. At any moment by this decomposition an observable of the object (as well as an observable of its environment) is uniquely defined, which can be attributed to the object as a property valid in the context of its actual environment. Independence of this attribution from measurement is judged particularly important because it enables to transcend the anthropocentrism of the Copenhagen preoccupation with measurement (compare Bohr’s correspondence principle, section 4.3.2). Of course, if a measurement is carried out, then the measuring instrument will constitute an important part of the object’s environment.
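The way a polar decomposition fixes a PVM for the object can be sketched with a singular value decomposition, which is one standard way of computing such a (Schmidt-type) decomposition of a bipartite state vector. The function name `object_pvm`, the tolerance, and the example states are hypothetical; the grouping of coinciding eigenvalues into one multi-dimensional projector follows the prescription given above.

```python
import numpy as np

def object_pvm(psi, tol=1e-9):
    """Return [(weight, projector), ...] for the object part of psi[i, j].

    psi[i, j] is the amplitude of |i> (object) (x) |j> (environment).
    Weights are eigenvalues of the object's reduced density operator;
    coinciding weights share one (possibly multi-dimensional) projector.
    """
    u, s, _ = np.linalg.svd(psi, full_matrices=True)
    d = psi.shape[0]
    weights = np.zeros(d)
    weights[:len(s)] = s ** 2          # vectors absent from the decomposition keep weight 0
    groups = []
    for k in range(d):                 # singular values come sorted, so equal weights are adjacent
        proj = np.outer(u[:, k], u[:, k].conj())
        if groups and abs(groups[-1][0] - weights[k]) < tol:
            groups[-1][1] = groups[-1][1] + proj
        else:
            groups.append([weights[k], proj])
    return [(w, p) for w, p in groups]

rng = np.random.default_rng(0)

def random_unitary(d):
    q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return q

# Non-degenerate example: three distinct weights, three one-dimensional projectors.
U, V = random_unitary(3), random_unitary(3)
psi = U @ np.diag([0.8, 0.6, 0.0]) @ V.conj().T
pvm = object_pvm(psi)

# Degenerate example: equal weights collapse into a single projector (the identity).
bell = np.eye(2) / np.sqrt(2)
pvm_bell = object_pvm(bell)
print(len(pvm), len(pvm_bell))
```

The degenerate example illustrates why coinciding eigenvalues matter: for a maximally entangled pair the decomposition singles out no preferred basis, and the resulting PVM is trivial.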
Critique of the Copenhagen variant of the modal interpretation Even though neither van Fraassen’s value state nor the Kochen-Dieks PVM does entail any empirical problems when discussing measurement (as remarked by van Fraassen ([78], p. 277) the values of the observables are even “empirically superfluous”), there is yet a difficulty. Van Fraassen’s value state and the polar decomposition both strongly hinge on a restriction to measurements of the first kind. Consequently, the scarce applicability of first kind measurements, observed in section 3.2.4, constitutes a major obstacle to a general application of these concepts. For measurements of the second kind the final state of a measurement of observable A is with (cf. (3.10)). It will be necessary to generalize the criteria for determining the value state or PVM so as to apply to these more general measurement procedures. After a measurement with measurement result the dynamic state of the object is given by (cf. (3.11)). Since, in general, the eigenvectors of this operator are different from the eigenvectors of A, the postmeasurement value state cannot be an eigenprojection of A. This would have the drawback that, even though the measurement yielded a well-defined measurement result in general no well-defined value can be attributed to A. For instance, in
the case of an ideal photon detector (cf. section 3.2.4) it would follow from (3.17) that after the measurement for any measurement result registered by the measuring instrument the value state of the photon number observable is given as the projector on the vacuum state Evidently, there is no relation between the measurement result (pointer state) and the value of the observable that can be attributed to the object. Of course, the requirement that such a relation exist is just one of the peculiarities of the Copenhagen interpretation. It could easily be abolished if the unrealistic Dirac-von Neumann idea of measurement as preparing the measured value as a property of the microscopic object is abandoned. Extension of the criterion provided by the polar decomposition to measurements of the second kind meets yet another problem. Since, in general, vectors do not constitute an orthonormal set, not only vectors are different from the eigenvectors of A, but also vectors are linear superpositions of the pointer state vectors rather than these vectors themselves. This makes this criterion vulnerable to an analogous objection as the Copenhagen interpretation (cf. section 3.2.5; Elby [310]). Thus, for the ideal photon detector, referred to above, the polar decomposition (3.17) is yielding a superposition of detector states analogous to that of a living and a dead cat, which is at the basis of the conventional “measurement problem” (section 3.1.1). This would imply that, according to this criterion after the measurement in general even the pointer observable does not have a well-defined value. It has been proposed [311, 309] to solve this problem by taking into account the interaction with the environment (compare (3.46)). 
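The ideal photon detector objection can be made concrete with a small numerical sketch. It assumes only what is stated above, namely that the field is left in the vacuum whichever count is registered; the precise form of (3.17) is not reproduced here, and the amplitudes `c` and all variable names are illustrative.

```python
import numpy as np

# Assumed amplitudes for registering counts m = 0..3 (normalized).
c = np.array([0.1, 0.7, 0.5, 0.5])
n_levels = len(c)

# psi[i, j]: amplitude of field state |i> (x) pointer state |theta_j>.
# Assumption from the text: the field ends in the vacuum |0> for every count j.
psi = np.zeros((n_levels, n_levels), dtype=complex)
psi[0, :] = c

# The polar decomposition has a single term: |0> (x) (sum_j c_j |theta_j>).
u, s, vh = np.linalg.svd(psi)
field_vector = u[:, 0]
instrument_vector = vh[0, :]
overlap_with_superposition = abs(np.vdot(c, instrument_vector))

# Reduced instrument state: rho[j, k] = c_j conj(c_k), with nonzero cross terms.
rho_instrument = psi.T @ psi.conj()
print(np.round(s, 12), overlap_with_superposition)
```

Under these assumptions the only instrument vector singled out by the decomposition is the superposition of all pointer states, so the criterion assigns no definite pointer value, which is the difficulty the environment proposal is meant to address.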
Assuming that environment states corresponding to different pointer states, are orthogonal, and treating object+environment as a single system, we have a polar decomposition attributing a value to the pointer observable also in the case of a measurement of the second kind. Hence, it appears that the environment can save the criterion from yielding a PVM of the measuring instrument not corresponding to its pointer states, in a way more or less analogous to the decoherence solution of the problem of quantum measurement, discussed in section 3.4. However, a criticism of this solution, analogous to the one given in section 3.4.5, is applicable also here. Although the environment must have a certain influence in the measurement procedure (as in any other procedure), it is questionable whether this influence is as crucial to the interpretation as it is supposed to be here. As discussed in section 3.4.5 a realistic description of macroscopic pointer states cannot be based on an orthogonal resolution of the identity, but must necessarily involve a non-orthogonal one (like the one generated by the coherent states). This would make the criterion based on the polar decomposition inapplicable. The problem of finding an acceptable criterion determining the post-measurement value of the measured observable of the microscopic object may be illustrative of the more fundamental question of whether it is possible at all to dissociate dynamic
state and value state, both interpreted as describing the microscopic object. In experimental practice the vector is usually considered, in the sense of conditional preparation, as the dynamic state of the object after measurement result is obtained, in precisely the same way as is the dynamic state before this measurement. In the case of a measurement of the first kind the values of the observables not only “have predictive value, exactly because they are symptomatic of the dynamic state” (van Fraassen [78], p. 277); they actually completely determine the dynamic state (by attributing value 1 to observable It therefore seems that the difference between the dynamic state and the value state has largely evaporated in this case. It does not really help to assume that, in a first kind measurement, has a realist meaning as a value state, and to deny that same projection operator a realist meaning as a description of a dynamic state. An analogous conclusion holds for measurements of the second kind with respect to the conditionally prepared state It is questionable whether the doubts with respect to projection, partly motivating a modal interpretation, are sufficiently justified to give up the link between eigenvalue and eigenvector, as is implied by the dissociation of dynamic state and value state. If, as in an instrumentalist (or empiricist) interpretation, the state vector is not thought to describe the reality of a microscopic object, then the projection postulate is fully acceptable as a preparation principle. Problems (like the acausality anomaly) arise only in a realist interpretation of the state vectors either interpreted as dynamic states or as value states. In an instrumentalist interpretation we do not seem to have any reason to introduce a value state next to the dynamic state for circumventing the problems induced by the projection postulate. 
Hence, it seems that the urge felt to get rid of the projection postulate must have its origin in the tendency, observed in the Copenhagen interpretation (cf. section 4.1), toward a realist interpretation, not only of observables (or value states) but also of the dynamic state. This makes the Copenhagen variant of the modal interpretation liable to all problems associated with a realist interpretation of the state vector, encountered in chapter 3 and further discussed in chapters 5 and 6, and for that reason rather unattractive.
6.6.2 Anti-Copenhagen variant of the modal interpretation
In the Copenhagen variant of the modal interpretation no prior-to-measurement values of observables are thought to be well-defined in general. Hence, the realism of the interpretation cannot serve to provide explanations of specific measurement results as was considered in section 6.4.2. The anti-Copenhagen variant of the modal interpretation has the intention to reduce quantum mechanical indeterminism to our lack of knowledge on the precise initial conditions of the object, more or less in the way aspired by Einstein, but taking into account the impossibility of simul-
taneously attributing values to all observables. Instead, it might be assumed that at least the actually measured observable can be considered, in a realist sense, as a property of the microscopic object. In contrast to the Copenhagen version of the modal interpretation, where the value of the observable is attributed only after the measurement, here the observable is assumed to have its value at the start of the measurement. Therefore it at least seems to have an explanatory function in the sense that in a measurement of standard observable the measurement result (pointer position is explained because the observable “really has” the corresponding property (like an observation of the rigidity of a billiard ball is explained by its “really” being rigid). A realist interpretation attributing a value to the measured observable, is necessarily a contextualistic one. As far as standard observables are concerned, this would bring us close to the contextualistic realism of Bohr’s ‘correspondence principle’, in which an observable is defined only within the context of its measurement (cf. section 4.3). Due to the contextualism of the interpretation the value of an observable cannot be seen as an objective property possessed by the object independently of the measurement. Therefore a modal interpretation in the anti-Copenhagen sense does not live up to Einstein’s ideal of having a theory explaining measurement results because these would refer to objective reality. It is just purporting to interpret the quantum mechanical formalism as yielding a description of the possible reality of the object in the context of a measurement. 
However, it could be felt as satisfying that in a modal interpretation it is possible to attribute, in the context of the measurement, to the object the value of the measured observable in a descriptive sense (like the rigid body model of a billiard ball is just describing (not explaining) this object’s behavior within experimental contexts allowing such a description). It does not seem to be impossible to extend this variant of the modal interpretation so as to also have a realist interpretation of certain states. As a matter of fact, for a measurement of standard observable the contextual state introduced in section 2.4.5, might describe all ‘possibilities of being’ (described by density operators (1.76)) open to the object in the context of a measurement of A. Then the modal interpretation could be seen as an ensemble interpretation, in which the state vector is describing the relative frequencies of realizations of the values of the observable, when measured. The transition from the initial density operator to might be seen as the process in which the measured observable A gets its value defined. It should be compared with a transition to a description of a billiard ball as a rigid body, while dropping all reference to properties that are incompatible with such a description (like those associated with the vibrations of its atoms). Hence, the transition from to as a description of the object should be seen as an adaptation to a particular mode of observation. Note that such a transition need not be seen as a consequence of a physical process of decoherence, caused by environmental fluctuations wiping out
cross terms (cf. section 3.4), although, of course, in experimental arrangements such decohering effects may actually occur (see also section 10.6.4). It is important to distinguish the operation of weak projection, yielding from the process of conditional preparation discussed in section 3.2.6 (see also section 5.4.3). In particular, the contextual state is not applicable outside the context of the measurement of standard observable A. It is independent of the details of the measurement interaction, and, hence, is applicable to measurements of both the first and second kind. Although for first kind measurements of maximal standard observables the contextual state coincides with the final object state this should be considered as irrelevant because it is a consequence of a spurious definition of a quantum mechanical (first kind) measurement. Unlike in the Copenhagen variant, is not considered here as the final state of the object, but as an alternative description of the initial state of an ensemble, thus enabling to explain different measurement results by the (contextual) existence of different states As far as a realist meaning can be attributed to the contextual state it is a contextual one, valid only in the context of a measurement of observable A. Therefore, the contextual state cannot contribute to an explanation of quantum mechanical measurement results: it does not provide any explanation why an individual object acquires a certain value of an observable, to be registered when a measurement of the observable is carried out (just like rigid body theory does not provide an explanation of the rigidity of a billiard ball). Concomitantly, since in the anti-Copenhagen variant of the modal interpretation the value of a quantum mechanical observable did not exist before the object started interacting with the measuring instrument, the ‘possessed values’ principle is not applicable. 
There does not seem to exist any objection against a realist interpretation of the contextual state in the above sense, at least as far as standard observables are concerned. On the other hand, since the contextual state yields no empirically verifiable information that is not also provided by the initial state vector, it is just as “empirically superfluous” as the values of observables attributed by a modal interpretation to the microscopic object. It was also noted in section 2.4.5 that it is highly questionable whether such a realist interpretation can be applied to measurements of generalized observables. For these reasons we should be particularly careful in drawing any conclusion from it. The anti-Copenhagen variant of the modal interpretation is a contextualistic-realist one, more or less in the sense of Bohr’s correspondence principle (although it differs from Bohr’s interpretation by interpreting realistically not only observables but also certain states, viz. the contextual states, and by specifying the phase of the measurement process the contextual description applies to). A transition from the contextual state to one of its component states need not be considered as an indeterministic transition of a single object, but may be seen, in Einstein’s sense, as a description of a selection of a subensemble, be it not of an objectively existing
ensemble, but of an ensemble of objects considered in the measurement context of observable A. Although the quantum mechanical description does not yield any explanation of why a particular object should be described by a specific initial state this need not imply that such an explanation does not exist at all. In chapter 10 some ideas will be developed on subquantum (hidden-variables) theories in which a quantum mechanical measurement result may be explained by a property the object had already before it started to interact with the measuring instrument. Such a property cannot be a value of a quantum mechanical observable, though, but might be of a subquantum nature, even if the subensembles described by are quantum mechanical (see also section 6.4.3).
Chapter 7
Generalized quantum mechanics
7.1 Introduction
In section 3.3 it was seen that the probability distribution of the pointer positions in the final state of a measuring instrument is represented in general by the expectation values of a positive operator-valued measure (POVM) (cf. appendix A.12.3) in the initial state of the object. The concomitant generalization of the notion of a quantum mechanical observable from the projection valued measure (PVM) of the standard formalism to the POVM involves an essential widening of the domain of application of quantum mechanics, which has important consequences both with respect to fundamental questions as well as possible practical applications. In this chapter this subject is discussed further. In particular, it is necessary to develop a number of mathematical tools making it possible to apply the notion of a POVM to quantum mechanical measurement. The formalism in which a quantum mechanical observable is represented by a POVM will be referred to as the generalized formalism of quantum mechanics¹. We start with a number of examples of measurements necessitating a POVM rather than a PVM for their description. The method used to find the POVM is the one discussed in section 3.3: the interaction between object and measuring instrument is described using the usual Schrödinger equation². In the spirit of the empiricist interpretation of quantum mechanics introduced in section 2.2 probabilities are determined that a pointer of a measuring instrument is found at well-defined pointer positions after completion of a measurement. In some cases it is possible to take a suitable degree of freedom of the microscopic object as the pointer coordinate, assuming that the amplification process to the macroscopic level of the phenomena does not change the information to be conveyed by the experiment.
¹ Measurements corresponding to the PVMs of the standard formalism are sometimes referred to as ‘simple’ or ‘ideal’ measurements. This nomenclature will not be followed here.
² This method is sometimes referred to as ‘the operational approach’ [42].
The examples discussed in sections 7.2 through 7.5 show that a generalization of the formalism is necessary for describing even the most common methods of quantum mechanical measurement, like the detection of photons using a detector that is not 100% efficient. This also holds true for such experiments as the double-slit experiment, being a paradigm of standard quantum mechanics. It will not be surprising, then, that an analysis of this experiment based on the standard formalism can hardly be a reliable one, and that conclusions based on such an analysis should be considered with some reservation. A detailed discussion of the double-slit experiment is given in section 7.3. In section 7.3.2 it will be seen that even the generalization of the concept of observables to POVMs is too restrictive, and has to be generalized still further to encompass OVMs (cf. appendix A.12.3). Such a further generalization will not be pursued here, however. The examples give occasion to define in section 7.6 a notion of nonideal or inaccurate measurement, representing a relation between two different measurement procedures (POVMs), one procedure being interpreted as a nonideal version of the other one. This nonideality relation is the main subject of the present chapter. It turns out that this relation can be used as a basis for a discussion of quantum mechanical complementarity in the sense of mutual exclusiveness of measurement arrangements (cf. section 4.5), and, hence, is of fundamental importance to the foundations of quantum mechanics. For this reason much attention is paid to the physical and mathematical properties of the nonideality relation. In particular, in section 7.7 the partial ordering structure, induced by the nonideality relation in the set of POVMs and their equivalence classes, is treated in some detail.
The ensuing definitions of maximality and of completeness of a POVM are important both for their fundamental significance and for the essential extension of the domain of applicability of the theory they entail. In particular, the possibility of complete measurements, allowing a determination of the incoming state by means of a measurement using one single measurement arrangement, distinguishes generalized quantum mechanics in a fundamental way from standard quantum mechanics. As an application of the notion of a nonideal measurement the simultaneous or joint nonideal measurement of two (incompatible) observables is discussed in section 7.9, demonstrating the fundamental importance of generalized quantum mechanics to the problem of complementarity. A number of examples of such measurements is discussed, demonstrating the practical feasibility of such joint measurements³. The notion of a Wigner measure is introduced in section 7.9.4 as a possibility of eliminating from the quantum mechanical information obtained in a joint nonideal measurement of incompatible observables the disturbing influence due to the mutual exclusiveness of the measurement arrangements for the single observables (cf. Heisenberg’s disturbance theory of measurement, section 4.6.2). A quantitative analysis of mutual disturbance in a joint measurement of incompatible PVMs is finally performed in section 7.10 on the basis of nonideality measures discussed in section 7.8, characterizing the measure of nonideality introduced into quantum mechanical information by the way a measurement is performed. In particular the information-theoretic notions of ‘mutual information’, and of the ‘average row entropy’ of a nonideality matrix, representing the nonideality relation between observables, are important tools for performing such an analysis. Using the latter measure the Martens inequality is derived in section 7.10.2. This inequality is one of the central results to be recorded in this book, since, unlike the Heisenberg inequality (1.78), it yields a precise characterization of mutual disturbance in a joint measurement of incompatible observables (cf. section 7.10.3). It seems that the Martens inequality, being derivable only from the generalized formalism of quantum mechanics, is necessary in order to obtain the complete understanding of quantum mechanical complementarity which the standard formalism is not able to yield.
³ The possibility of such measurements, as well as the relevance of POVMs to their description, has already been considered a long time ago [312].
7.2 Inefficient photon detection
7.2.1 POVM of an inefficient photon detection process
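For simplicity we restrict ourselves to one single mode of the electromagnetic field. An arbitrary state can be given as a superposition

$$|\psi\rangle = \sum_{n=0}^{\infty} c_n |n\rangle \qquad (7.1)$$

of the eigenvectors $|n\rangle$ of the photon number operator $N = a^\dagger a$ (e.g. [7]; see also appendix A.3). According to the standard formalism a quantum mechanical photon number measurement must correspond to the PVM $\{|n\rangle\langle n|\}$ defined by the Hermitian projection operators of the spectral representation of N. Thus,

$$N = \sum_{n} n\,|n\rangle\langle n|, \qquad p_n = \mathrm{Tr}\,\rho\,|n\rangle\langle n|.$$

For the pure state (7.1) this yields

$$p_n = |c_n|^2, \qquad (7.4)$$

from which it is easily seen that $p_n = \delta_{nn'}$ if $|\psi\rangle = |n'\rangle$, implying that PVM $\{|n\rangle\langle n|\}$ represents a 100% efficient detector registering with certainty all photons present. In the case of a photon detector with efficiency $\eta$ we can use an expression derived by Kelley and Kleiner [313] (see also Loudon [7]) for the probability of detecting $m$ photons during a time interval T:

$$p_m = \mathrm{Tr}\,\rho\,\mathcal{N}\!\left[\frac{(\eta a^\dagger a)^m}{m!}\,e^{-\eta a^\dagger a}\right],$$

in which $\mathcal{N}$ is the normal ordering operator, placing in each term of a power series expansion of the operator all creation operators to the left of the annihilation operators without taking into account the commutation relation $[a, a^\dagger] = 1$. This can be written as

$$p_m = \mathrm{Tr}\,\rho \sum_{k=0}^{\infty} \frac{(-1)^k \eta^{m+k}}{m!\,k!}\,(a^\dagger)^{m+k} a^{m+k}.$$

The POVM $\{M_m\}$ is defined by equating this to

$$p_m = \mathrm{Tr}\,\rho\,M_m, \qquad (7.6)$$

implying that $M_m = \mathcal{N}\!\left[\frac{(\eta a^\dagger a)^m}{m!}\,e^{-\eta a^\dagger a}\right]$. Using the properties of the creation and annihilation operators it is directly seen that $M_m$ is diagonal in the number representation. Thus, $\langle n|M_m|n'\rangle = 0$ if $n \neq n'$, implying that $M_m$ can be written in the form

$$M_m = \sum_{n=0}^{\infty} \lambda_{mn}(\eta)\,|n\rangle\langle n|. \qquad (7.7)$$

The coefficients $\lambda_{mn}(\eta)$ are found from

$$\lambda_{mn}(\eta) = \langle n|M_m|n\rangle = \sum_{k=0}^{n-m} \frac{(-1)^k \eta^{m+k}}{m!\,k!}\,\frac{n!}{(n-m-k)!},$$

yielding

$$\lambda_{mn}(\eta) = \begin{cases} \binom{n}{m}\,\eta^m (1-\eta)^{n-m}, & m \le n, \\ 0, & m > n. \end{cases} \qquad (7.8)$$

It can easily be verified that the operators $M_m$ constitute a POVM. Thus, since $\lambda_{mn}(\eta) \ge 0$, operators $M_m$ are positive operators. We also have

$$\sum_{m} \lambda_{mn}(\eta) = 1,$$

implying that $\sum_m M_m = I$. It is evident that the operators $M_m$ are not projection operators, since $M_m^2 \neq M_m$ for $0 < \eta < 1$.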
The quantities $\lambda_{mn}(\eta)$ are the conditional probabilities of detecting $m$ photons if the initial state is the state $|n\rangle$ with exactly $n$ photons. For this reason the binomial expression (7.8) is consistent with a realist interpretation in which each photon has probability $\eta$ of being detected and probability $1-\eta$ to be missed by the inefficient detector. It is possible to interpret the detection process represented by POVM $\{M_m\}$ as a nonideal or inaccurate version of a measurement of the photon number observable corresponding to PVM $\{|n\rangle\langle n|\}$. Inefficiency of the detection process introduces a certain measure of nonideality or inaccuracy, described by the matrix $(\lambda_{mn}(\eta))$, for this reason to be referred to as a nonideality matrix. This interpretation is in agreement with the fact that for $\eta = 1$ we find $\lambda_{mn}(1) = \delta_{mn}$, implying that a 100% efficient detector is measuring the photon number observable N in an ideal way. A not unimportant property of the nonideality matrix (7.8) is that it has an inverse given by

$$\lambda^{-1}_{mn}(\eta) = \lambda_{mn}(\eta^{-1}) = \begin{cases} \binom{n}{m}\,\eta^{-n} (\eta-1)^{n-m}, & m \le n, \\ 0, & m > n. \end{cases}$$

Thus, $\sum_{n'} \lambda^{-1}_{mn'}(\eta)\,\lambda_{n'n}(\eta) = \delta_{mn}$ and $\sum_{n'} \lambda_{mn'}(\eta)\,\lambda^{-1}_{n'n}(\eta) = \delta_{mn}$. The existence of an inverse has as a consequence that relation (7.7) can be inverted, yielding

$$|n\rangle\langle n| = \sum_{m} \lambda^{-1}_{nm}(\eta)\,M_m.$$
This implies that the standard probability distribution (7.4) can be calculated from probability distribution (7.6): we are able, in principle, to eliminate by means of calculation the disturbing influence due to the inefficiency of the measurement process⁴. Although the measurement of the POVM using an inefficient photon detector is a nonideal performance of the measurement of photon number, the measurement can evidently yield ideal information, in the sense that, in principle, the standard probability distribution can be determined. The inefficient photon detection method is an example of an ‘invertible nonideal measurement’ (cf. section 7.6.3). It can straightforwardly be verified that the inversion formula reproduces the experimental practice of dividing the measurement result by the detector efficiency to obtain the expectation value of the observable the experimenter intends to measure. From the inversion formula it is seen that in principle also higher moments of the probability distribution can be obtained by means of calculation. Nevertheless, (invertible) nonideal measurements are ‘inaccurate’ or ‘unsharp’ [42] measurements, because the individual measurement results are different from the ones obtained when using an efficient detector. This terminology is also useful to link the formal developments of the present chapter to the discussion of chapter 4.
⁴ Of course this is possible only if the values of are known for all
CHAPTER 7. GENERALIZED QUANTUM MECHANICS
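The inversion argument can be tried out numerically. The following is a minimal sketch, not the book's own formulation: it assumes that the nonideality matrix (7.8) is the binomial matrix with entries C(n, m) η^m (1 − η)^(n − m), where the symbol `eta` for the detector efficiency, the function names, and the four-point input distribution are all hypothetical. The inverse used here is the same binomial form with η replaced by 1/η.

```python
from math import comb

def nonideality_matrix(eta, nmax):
    """lam[m][n]: probability of registering m photons when n are present."""
    return [[comb(n, m) * eta**m * (1 - eta)**(n - m) if m <= n else 0.0
             for n in range(nmax + 1)] for m in range(nmax + 1)]

def inverse_nonideality_matrix(eta, nmax):
    """Inverse of the matrix above: the binomial form with eta -> 1/eta."""
    return [[comb(n, m) * eta**(-n) * (eta - 1)**(n - m) if m <= n else 0.0
             for n in range(nmax + 1)] for m in range(nmax + 1)]

def apply(matrix, vector):
    """Matrix-vector product for plain nested lists."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

eta = 0.6                                  # assumed detector efficiency
ideal = [0.1, 0.2, 0.3, 0.4]               # hypothetical distribution, n = 0..3
lam = nonideality_matrix(eta, 3)
measured = apply(lam, ideal)               # what the inefficient detector yields
recovered = apply(inverse_nonideality_matrix(eta, 3), measured)

mean_ideal = sum(n * p for n, p in enumerate(ideal))
mean_measured = sum(m * p for m, p in enumerate(measured))
```

Under these assumptions `recovered` agrees with `ideal` up to rounding, and `mean_measured / eta` agrees with `mean_ideal`, which is the practice of dividing the measurement result by the efficiency. Because the matrix is upper triangular, truncating it at a maximal photon number does not spoil the inversion.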
7.2.2 Comments on the nonideality relation
The quantum mechanical character of the nonideality matrix
Nonideality matrices like (7.8) will be met repeatedly in the present chapter, and they will play a very important role in our attempt at gaining a better understanding of quantum mechanics and, in particular, of the principle of complementarity. It will turn out that the conditional probability is not just a side effect introduced by the special measuring method, to be understood in classical terms in accordance with the correspondence idea (cf. section 4.3). On the contrary, the interaction of the electromagnetic field with an inefficient detector is a quantum mechanical process, just as is the case in a measurement with an ideal photon detector. For this reason the nonideality matrix has a quantum mechanical origin. It will turn out to contain the key to understanding the difference between the notions of ‘uncertainty’ and ‘inaccuracy’, advanced already in section 4.5 as a possibility of creating some order in the chaos around the problem of quantum mechanical complementarity. We must be careful not to oversimplify the problem in the way it is done in Ballentine’s ‘statistical’ interpretation (cf. section 4.7.3). In this interpretation a well-defined number of photons is assumed to be present prior to the photon number measurement. The nonideal detection process, characterized by the nonideality matrix, in which only part of the photons present is registered, could then be seen as a purely “classical” process, the presence of each individual photon being either ascertained successfully or not, more or less in the same way as this is assumed in classical physics, and, hence, being unrelated to the quantum mechanical problem. As we have seen in section 6.2, however, the ‘statistical’ interpretation is rather problematic.
It is even problematic to interpret an ideal measurement of the photon number observable N as an exact experimental determination of the number of photons present preceding the measurement (see also section 2.4.4). We have no reason to suppose that such an interpretation would be less questionable in a nonideal measurement. The quantum mechanical measurement results having physical relevance are the probability distributions. Expressions (7.7) and (7.8) do not define a relation between photon numbers of individual measurements, but between the probability distributions and Invertibility of nonideality matrix is significant only in as far as we are interested in probability distributions of observable
7.2. INEFFICIENT PHOTON DETECTION
Evidently, the meaning of nonideality relation (7.7) is tied up with the statistical character of quantum mechanics. The question of which value would have been found if observable would have been measured instead of when the measurement yielded result is liable to be approached by means of the theory of statistical inference (e.g. Helstrom [46]). The possibility of compensating, notwithstanding the statistical character of the measurement process, by means of inversion of the nonideality matrix, for the disturbing influence of the nonideal measurement, must be accounted for on the basis of the deterministic character of the Schrödinger equation describing the measurement process. According to Bohr’s idea, discussed in section 4.4, the finiteness of the ‘quantum of action’ precludes the possibility of neglecting the influence of measurement within the domain of quantum mechanics. This idea might seem to be in disagreement with the possibility, found above, of compensating by means of inversion of the nonideality matrix for the disturbing influence of a nonideal measurement. We must be aware of the fact, however, that Bohr’s considerations were of a heuristic nature, being implemented in the standard formalism, which was the only formalism for describing quantum mechanical measurements disposed of at that time. This implies that only PVMs could be considered. Relations like (7.7) could not arise at all. The possibility of inversion, discussed above, can be realized only for pairs of measurements of which at least one is represented by a POVM that is not a PVM. A formal mathematical description of such measurements was outside Bohr’s (and Heisenberg’s) range, even though, as we shall see in section 7.3, ‘thought experiments’ discussed by them are of the type needing POVMs for their description. Measurement disturbance as discussed by Bohr and Heisenberg refers to disturbance of a standard observable by simultaneously measuring a second, incompatible, one (cf. 
section 4.6.2). This appears to be unrelated to the nonideality relation expressed by (7.7), being (in this specific example) a relation between commuting operators. Nevertheless, the nonideality in this latter relation may have Heisenberg measurement disturbance as its physical origin. The extension of the quantum mechanical formalism makes it possible in the mathematical formalism to check the heuristic physical ideas Bohr and Heisenberg developed with respect to quantum mechanical measurement. We shall see in section 7.9 that the generalized formalism of quantum mechanics yields a mathematical description of Heisenberg measurement disturbance which actually turns out to require relations like (7.7). Hence, such relations do not imply any contradiction with the Copenhagen interpretation. On the contrary, as we shall see in section 7.10, the generalization of the mathematical formalism provides considerable insight into the validity of certain dogmas of the Copenhagen interpretation which the standard formalism is not able to give.
Nonideality, and interpretations of quantum mechanics
The nonideality relation (7.7) might seem to be interpretable both in the sense of a realist interpretation of the formalism of quantum mechanics (referring to microscopic reality) and in the sense of an empiricist one (referring only to pointer positions of measuring instruments). Thus, in a realist interpretation the POVM could be thought to represent the relative frequencies of detected photons, the PVM referring to the photons initially present. Such a realist interpretation seems to be adhered to, for instance, by Busch, Lahti and Mittelstaedt [41]. Their concept of ‘value objectification’ is intended to interpret the registration of an event corresponding to the index of a certain POVM as a property possessed by the microscopic object in its post-measurement state. In their view a POVM, as opposed to a PVM, does not describe a reality in which the observables have ‘sharp’ values, but a ‘fuzzy’ reality in which the values of the (standard) observables are not well-defined. Thus, in (7.7) the nonideality matrix would describe the fuzziness of the photon number observable. It would seem that, if such a realist interpretation is possible at all, then it must be a contextualistic one, since the nonideality matrix is dependent on the measurement arrangement. This kind of contextualism seems to be very close to Bohr’s, discussed in section 4.6.3, the width of the distribution contributing to the latitude with which photon number is defined in the context of an inefficient detection process (see also section 7.10.3). However, the notion of contextuality as used by Bohr is a rather vague one, which, as we have seen in chapter 5, has caused considerable confusion. In particular, it is not specified by him to what phase of the measurement process the ‘fuzzy’ reality would belong. In this respect the above-mentioned attribution to the post-measurement state, if possible, would seem to be an improvement.
However, since the detected photons are absorbed in the detection process, and, hence, are not present in the final state, the ‘fuzzy’ reality can certainly not correspond to the final (post-measurement) state of the electromagnetic field (cf. section 3.4), the reality of this field being determined by the photons not detected rather than by the detected ones. This also precludes the possibility of interpreting the POVM in the sense of Heisenberg (cf. section 4.6), for whom quantum mechanical measurement results refer to the post-measurement state of the object. Evidently, in the example of the inefficient photon detector it is impossible to combine Bohr’s notion of latitude (or fuzziness) with Heisenberg’s more specific implementation of contextuality. Also a correspondence of a ‘fuzzy’ reality with the initial (prior-to-measurement) state is not very well possible, since the nonideality matrix is determined by the characteristics of the detection process, and, hence, can hardly be seen as a prior-to-measurement property of the object. If the POVM corresponds to any ‘fuzzy’ reality at all, then it would seem that this reality should refer to the photons that are actually detected in the inefficient detection process. The only part of reality corresponding to these seems to be composed of the electric pulses generated in the detector by the field, i.e. of the pointer positions of the measuring instrument. This points in the direction of an empiricist interpretation.
As already extensively discussed in chapter 2, an empiricist interpretation of the quantum mechanical formalism seems to be preferable over a realist one. Our experience, as far as described by the quantum mechanical formalism, is in the first place about pointer positions of measuring instruments. Any inference regarding the reality of the microscopic object is based on direct observation of instruments. In an empiricist interpretation of the quantum mechanical formalism there is no reference to the object itself (the electromagnetic field), but only to the preparation procedure (labeled by the density operator) and the measurement procedure (labeled by a POVM or PVM). A quantum mechanical measurement yields certain information on the initial state (see also section 7.10.4); it does not reveal values of (standard) observables, either ‘sharp’ or ‘fuzzy’ (compare section 2.5.1). In an empiricist interpretation the nonideality relation (7.7) between the two POVMs is a natural one. It is interpreted as a relation between two different measurement procedures, comparing the information on the quantum mechanical state yielded by each procedure. The two POVMs just describe the relative frequencies of electric pulses in different photon detectors, these pulses possibly corresponding to certain numbers of detected photons (in the ontic sense discussed in section 2.4.4), but the latter not being thought to be described by the quantum mechanical formalism. The measurement procedure using the inefficient detector is considered as a nonideal version of the other one.
Since in an empiricist interpretation the observable is not thought to describe microscopic reality itself, this need not at all imply that the measurement procedure using the efficient detector reproduces reality “as it is”, or would correspond to any ‘sharp’ reality. It will be demonstrated in section 7.7, however, that a measurement using an efficient detector can be interpreted as a maximal measurement, in the sense of ‘yielding maximal information’. This notion of ‘maximality’ is a purely empiricist one, however, achieved by comparing measurement procedures, and does not involve any identification of pointer readings with properties of microscopic reality. It is also noted that in an empiricist interpretation of the quantum mechanical formalism the idea of Heisenberg measurement disturbance is different from its conception in a (contextualistic-)realist one. Thus, in the latter interpretation the microscopic object (in this case the electromagnetic field) is thought to be disturbed by the measurement. Here Heisenberg disturbance should be interpreted in a preparative sense (cf. section 3.2.4). Actually, Heisenberg’s own interpretation of his disturbance theory of measurement, discussed in section 4.6.2, was a contextualistic-realist rather than an empiricist one. If the nonideality relation (7.7) is interpreted in an empiricist way, then Heisenberg disturbance should be taken as having a determinative meaning, in the sense of a disturbance of the measuring instrument rather than a disturbance of the object, to the effect that, due to the disturbance, different pointer positions might be obtained starting from the same state preparation. Thus, an inefficient photon detector will yield different measurement results for the photon number observable than an efficient detector because inefficiency causes a disturbance of the detection process. Note that in an empiricist interpretation it is not possible to define a nondisturbing or ideal measurement by referring to the faithfulness of its representation of reality. In section 7.7 a definition of minimal disturbance will be developed using the notion of ‘maximality’ alluded to above. In an empiricist interpretation ‘disturbance of the measurement results’ can be interpreted as ‘deviating from the results of a maximal measurement procedure (represented by a maximal POVM)’. In this interpretation quantum mechanics is not thought to attribute either the results of a maximal observable or those of a non-maximal one to microscopic reality itself, however. Notwithstanding the clear distinction between the notions of Heisenberg measurement disturbance in the two different interpretations, we shall refer to them by the same name. The reason for this is that the empiricist version of Heisenberg disturbance is completely in the spirit of the ideas developed by Bohr and Heisenberg (see also section 7.10.3), and is actually identical with these in a minimal interpretation of quantum mechanics in which no specification is made of the meaning of the notion of a measurement result. Therefore a specification in an empiricist sense seems to be just a different implementation of the Heisenberg disturbance idea rather than a completely different principle.
On the other hand, the extension of the mathematical formalism accompanying an empiricist interpretation makes it possible to implement Heisenberg disturbance into the mathematical description both in a preparative as well as in a determinative sense, thus transcending the heuristic approaches by Bohr and Heisenberg.
7.3 Quantum mechanical description of a double-slit experiment
As a second example of a quantum mechanical experiment, to be represented by a POVM rather than by a PVM, we shall consider the double-slit experiment. It is significant to notice that this experiment, being a paradigm of quantum mechanics, and used time and again for illustrating the typical problems of this theory (cf. section 4.5), cannot be described in a satisfactory way using the standard formalism it is meant to illustrate. This does not hold true only for the double-slit experiment,
7.3. DOUBLE-SLIT EXPERIMENT
but for all scattering experiments formulated in terms of a differential cross section describing the probability that a particle (plane wave) is scattered in a certain direction in three-dimensional space (see any textbook on quantum mechanical scattering theory, e.g. Farina [314], p. 6). The differential cross section may have the properties of a probability distribution, but is certainly not found as the expectation value of a PVM corresponding to a standard observable. Instead, it is usual to calculate it as the radial component of the probability current density
of the scattered wave. Evidently, it has been felt from the very beginning that the axiomatization as given by the standard formalism presented in section 1.1 is too restrictive to meet all experimental needs. That the standard formalism is not sufficiently general to be able to describe the double-slit experiment can be seen quite generally. In the configuration of the double-slit experiment (cf. figure 4.3) consider two special solutions of the Schrödinger equation, each corresponding to a wave passing exclusively through one of the two slits. We now keep these two solutions fixed, but consider superpositions of them with different possible relative amplitudes. Then the double-slit experiment is described by the linear superposition
We know (appendix A.10) that the most general expression of a quantum mechanical probability distribution in the pure state (7.13) is given by a sesquilinear functional. This implies that the dependence of the intensity of the interference pattern at screen B is given by
That the operator-valued measure in this expression cannot be a PVM is immediately apparent from the fact that, for the two fixed solutions, the problem is restricted to a two-dimensional Hilbert space, with the two superposition amplitudes as coordinates, in which the state (7.13) can be represented by the vector
In such a two-dimensional Hilbert space Hermitian operators can be represented by 2 × 2 matrices. These have at most 2 different eigenvalues. Hence, the spectrum of a Hermitian operator on a two-dimensional Hilbert space is discrete. Since the spectrum of the experimental observable is continuous it is impossible that a standard observable could describe the interference pattern. Indeed, a detailed calculation will show that is a POVM, not a PVM.
7.3.1 Instationary approach
In the instationary approach it is assumed that the two solutions are normalized wave packets, at a certain time localized in the neighborhood of slits 1 and 2, respectively. At that time they do not overlap, thus implying orthogonality:
Because of unitary evolution this orthogonality remains valid at all later times, also if scattering at screen S causes the wave packets to overlap for With
normalization of
(7.13) implies
Assuming that at time T the wave packets are in the neighborhood of screen B, we can try to define the probability of finding the particle at point of screen B as the probability that a measurement at time T of observable Z will yield value (for simplicity we restrict ourselves to two spatial dimensions):
Defining
and using (7.13), can be written in the form (7.14). Representing the density operator according to
we obtain in which the operator
is given by the 2 × 2 matrix
That the operators (7.17) satisfy (1.97) and (1.98), and, hence, constitute a NODI (corresponding to a POVM), can be seen as follows:
1. Operator (2 × 2 matrix) is positive if both eigenvalues are positive. This holds true if and only if both and Det The first inequality is satisfied since (7.16) implies The inequality for the determinant follows from the Schwarz inequality, since Det and defined by (7.16), is an inner product.
2. Equality (1.98) is satisfied in the form
This immediately follows from That
is not a projection operator follows from the representation
with
the Hermitian projection operator onto the vector
If were a projection operator this would imply that its range would be spanned by the vectors (7.20) with the same value of However, since must be a projector onto a one-dimensional subspace, implying all vectors (7.20) with the same value of to be in this subspace. This cannot be the case in general, however. Hence, cannot be a projection operator.
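The three properties argued for above (normalization, positivity, and failure to be projections) can be checked on a discretized toy model. The sketch below is an illustration under stated assumptions rather than the calculation of the text: the grid sizes, the Gaussian packets, the names `psi1`, `psi2` and `povm_element`, and the convention that the matrix element is an overlap over the coordinate not being measured are all hypothetical choices. Gram-Schmidt orthogonalization stands in for the orthogonality that unitary evolution preserves.

```python
import cmath
import math

NX, NZ = 6, 12  # hypothetical grid: x is summed over, z labels points of screen B

def packet(z0, x0, k):
    """Unnormalised Gaussian packet on the grid, indexed as f[z][x]."""
    return [[math.exp(-(z - z0) ** 2 / 8 - (x - x0) ** 2 / 4) * cmath.exp(1j * k * z)
             for x in range(NX)] for z in range(NZ)]

def inner(f, g):
    """Inner product <f|g> over the whole grid."""
    return sum(f[z][x].conjugate() * g[z][x] for z in range(NZ) for x in range(NX))

def combine(f, a, g, b):
    """Return the grid function a*f + b*g."""
    return [[a * f[z][x] + b * g[z][x] for x in range(NX)] for z in range(NZ)]

def normalised(f):
    norm = math.sqrt(inner(f, f).real)
    return [[v / norm for v in row] for row in f]

# Two overlapping packets "at time T", made orthonormal by Gram-Schmidt.
psi1 = normalised(packet(5.0, 2.0, 0.9))
raw2 = packet(6.5, 3.0, -0.7)
psi2 = normalised(combine(raw2, 1, psi1, -inner(psi1, raw2)))

def povm_element(z):
    """2 x 2 matrix M(z)[i][j] = sum over x of conj(psi_i(x, z)) * psi_j(x, z)."""
    psis = (psi1, psi2)
    return [[sum(psis[i][z][x].conjugate() * psis[j][z][x] for x in range(NX))
             for j in range(2)] for i in range(2)]

M = [povm_element(z) for z in range(NZ)]

def det(m):
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

# Normalization: the elements sum to the 2 x 2 unit matrix.
total = [[sum(M[z][i][j] for z in range(NZ)) for j in range(2)] for i in range(2)]
# Positivity: every determinant is non-negative (Schwarz inequality).
min_det = min(det(m) for m in M)
# Not a PVM: largest entry of M(z)^2 - M(z) over all z.
idempotency_gap = max(abs(matmul(m, m)[i][j] - m[i][j])
                      for m in M for i in range(2) for j in range(2))
```

In this toy model `total` is the unit matrix up to rounding, `min_det` is non-negative, and `idempotency_gap` is clearly nonzero, so the elements are positive and normalized but are not projection operators.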
7.3.2 Stationary approach
An alternative way to deal with the double-slit problem starts from the assumption that the two waves are solutions of the time-independent Schrödinger equation at the same energy E. In general we have, once again, a superposition
Since are improper eigenfunctions we have to pay attention to normalization. As the quantity to be measured we therefore take the normalized value of the component of the probability current density J (7.12) at
Unfortunately, this is not a sesquilinear functional. A sesquilinear functional is obtained in the following way. With it is easily demonstrated that
This implies that at the position of slit
is independent of L. Assuming that this entails
With
it now follows that
This can be represented according to of the initial (incoming) state as
and putting
It is possible to simplify
with
according to
by taking the density operator
7.4. HOMODYNE OPTICAL DETECTION
and
An, at first sight, surprising result of this calculation is that the operators are not positive. This is directly seen by calculating their determinant. We find
implying that, in general, there is one positive and one negative eigenvalue. Hence, although the operators do satisfy the required normalization and accordingly constitute an operator-valued measure (OVM), this is not a positive operator-valued measure (POVM). The origin of this is that the probability current through a surface, although yielding a positive result if integrated over the whole surface, can be negative locally. For each position on the screen there can be found an incoming state for which the corresponding expectation value is negative. Evidently, it is possible that the waves exiting from slits 1 and 2 interfere in such a way that at screen B the wave is going locally in the “wrong” direction. The same effect obtains in the three-dimensional case in which we take the radial component of the current density at a spherical surface to define the probabilities. We get a POVM only in the Fraunhofer limit, in which the radial current density is determined at asymptotically large distance [315]. The regions of negative current density turn out to constitute only a minor fraction of the total surface, though, thus in practice yielding a positive result when averaging over a finite detector area. It is important to note, however, that a less shallow treatment of the double-slit experiment than is usually given demonstrates that this experiment cannot be represented by a PVM of the standard formalism of quantum mechanics.
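That a current can locally point in the “wrong” direction is easy to reproduce with two plane waves. The following is a toy sketch only: two plane waves of equal wave number, both travelling towards the screen, stand in for the waves exiting slits 1 and 2, and the amplitudes `A`, `B`, the direction cosine `COS2`, and the units (ħ/m = 1) are hypothetical choices, not values from the text.

```python
import cmath
import math

K = 1.0                       # common wave number (same energy E); hbar/m set to 1
COS2 = 0.1                    # z-direction cosine of the oblique wave
K1 = (0.0, K)                 # (kx, kz): wave travelling straight at the screen
K2 = (K * math.sqrt(1 - COS2 ** 2), K * COS2)   # oblique wave, still with kz > 0
A, B = 1.0, 2.0               # hypothetical amplitudes of the two waves

def current_z(x, z=0.0):
    """z-component of the current, Im(conj(psi) * d psi / dz), at point (x, z)."""
    w1 = A * cmath.exp(1j * (K1[0] * x + K1[1] * z))
    w2 = B * cmath.exp(1j * (K2[0] * x + K2[1] * z))
    psi = w1 + w2
    dpsi_dz = 1j * K1[1] * w1 + 1j * K2[1] * w2   # analytic derivative
    return (psi.conjugate() * dpsi_dz).imag

# Sample one full period of the interference pattern along the screen.
period = 2 * math.pi / abs(K2[0] - K1[0])
samples = [current_z(period * i / 1000) for i in range(1000)]
j_min = min(samples)                 # most negative local current
j_mean = sum(samples) / len(samples) # current averaged over one period
```

With these numbers the current is 1.4 + 2.2 cos Δ, where Δ is the relative phase of the two waves, so it dips to −0.8 at the dark fringes while its average over a period stays at 1.4. Both waves move towards the screen, yet the local current is negative where they interfere destructively.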
7.4 Homodyne optical detection
An important example of a quantum mechanical experiment to be represented by a POVM and, hence, not describable by the standard formalism, is homodyne detection of an optical signal. This detection method is important both from a practical point of view and from a fundamental one. The method has been applied already within the domain of classical optics as a means of improving the signal-to-noise ratio of a noisy signal. Apart from the circumstance that measurement accuracy could be improved so much that it became possible to reach the ‘quantum mechanical limit’ (Yamamoto et al. [316]), thus making such experiments suitable for studying fundamentally quantum mechanical phenomena, it now also turns out to contribute to our understanding as regards the limited applicability of the standard formalism.
The method of optical homodyning is based on the superposition of a (monochromatic) stochastic signal S with a coherent signal L of the same frequency. In practice the coherent signal is a laser beam produced by a so-called local oscillator. The superposition of the two signals takes place in a semitransparent mirror (cf. figure 2.4) in which the local oscillator is in input port 2. The signal S is in input port 1. The intensity registered by detector now consists of a superposition of signal and local oscillator. Taking the intensity is given by Subtracting the (constant) contribution and taking L >> S we obtain
Due to the (very large) multiplication factor 2L the signal S has been amplified considerably. This demonstrates the usefulness of the homodyne detection method. Of course, noise is also enhanced in the amplification process, but to a much smaller extent than the signal itself (see e.g. [316]; also section 8.4.2). Therefore the signal-to-noise ratio is improved.
Yuen and Shapiro [317] have given a complete quantum mechanical treatment of the measurement of an optical signal by means of homodyne detection, in which the local oscillator L is represented by a coherent state (A.29). They calculated the photocurrent caused in a detector by the electromagnetic signal. When, according to (7.22), the so-called ‘bias’ of the laser signal has been subtracted, and a correction is performed for the factor 2L, then, for a real coherent-state amplitude in the limit of a very strong local oscillator, the probability distribution of the detector signal is given by
being the density operator of the incoming signal S, and the quantum efficiency of the detector (cf. section 8.4.2 for a simple derivation in which the transmission coefficient of the mirror is allowed to differ from 1/2). The vectors are the (improper) eigenvectors of the operator defined by (1.23). It is directly seen that this probability distribution can be written according to
with
The operators generate a POVM, evidently related to the projection valued measure corresponding to the spectral representation of the Hermitian operator . In the limit we even obtain
7.5. NONIDEAL POLARIZATION MEASUREMENT
thus making the homodyne detection method in this limit a measurement of the (standard) position observable . In practice this limit cannot be reached, however. Expression (7.24) should be compared with the nonideality relation (7.7). In the derivation leading to (7.24) a special choice of the phase of the complex number is made, determining the phase relation between signal S and local oscillator L, viz, Using a plate (see figure 7.1) the phase can be shifted an amount corresponding to the choice This entails the replacement of (7.24) by (cf. section 8.4.2)
implying that the POVM is now related to the spectral representation of the (standard) momentum operator P defined in (1.23) (see also appendix A.1). It is also possible, instead of the shift to realize an arbitrary phase shift In that case the measurement result is related to the spectral representation of the Hermitian operator
coinciding for and with and P, respectively. The operators (7.26) are known in quantum optics as the rotated ‘quadrature phase’ operators. They can, in principle, be measured by means of the homodyne detection method (cf. section 8.4.2).
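The classical amplification argument given at the start of this section can be made concrete with a few lines of arithmetic. This is a minimal sketch assuming real amplitudes and ignoring the beam-splitter factors, so that the detected intensity is simply (L + S)²; the function name and the numerical values are hypothetical.

```python
def homodyne_signal(signal, local_oscillator):
    """Detected intensity minus the constant local-oscillator bias L**2."""
    intensity = (local_oscillator + signal) ** 2
    return intensity - local_oscillator ** 2

S, L = 0.01, 100.0                  # weak signal, strong local oscillator
exact = homodyne_signal(S, L)       # equals 2*L*S + S**2
approx = 2 * L * S                  # the amplified signal for L >> S
relative_error = (exact - approx) / approx   # equals S / (2*L)
```

The neglected term S² is smaller than the retained term 2LS by the factor S/(2L), so the approximation improves as the local oscillator becomes stronger.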
7.5 Nonideal polarization measurement of a photon
As a final example we consider a measurement of the polarization of a photon after it has passed a partially transparent mirror (transmission coefficient before entering
an analyzer (cf. figure 7.2). Here the analyzer is a polarizer (nicol) oriented at a given angle in a plane perpendicular to the direction of propagation of the photon. Detector D has a given quantum efficiency. The usual (ideal) measurement of the (standard) photon polarization observable in this direction will be represented by a PVM whose two projection operators correspond to the events ‘the photon is transmitted by the analyzer’ and ‘the photon is absorbed by the analyzer’, respectively. This PVM describes the measurement results if the mirror has transmission coefficient 1 (corresponding to absence of the mirror) and detector D is 100% efficient, the detection probabilities then being given, in agreement with the standard formalism, by the expectation values of these projection operators in the polarization state of the incoming photon. In the situation of figure 7.2, however, the detection probabilities are different. Now the probability of detector D registering a photon is equal to the product of the probabilities of independent processes taking place in mirror, analyzer and detector. Since the photon may also not be detected at all, with the complementary probability, we see that the experiment is represented by a two-element POVM, which once again (unless mirror and detector are both ideal) is not a PVM.
A somewhat more refined polarization measurement is obtained if, instead of a nicol polarizer, we choose as analyzer a bi-refringent crystal. It is now possible to have a detector both in the ordinary (o) and in the extraordinary (e) beam (figure 7.3). An advantage of this method is that not only the + events but also the – events are registered. This may increase the accuracy of the experiment. Since now there are three possibilities for the photon, the POVM has three terms, the first two terms yielding the detection probabilities in the ordinary and the extraordinary beams, whereas the last term gives the (state-independent) probability that the photon does not enter one of the two detectors.
The projection operators and of course, are dependent on the orientation of the crystal. We have (e.g. Busch [318])
the unit vector describing the orientation of the crystal; the operators are the Pauli spin matrices.
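The two POVMs described in words above can be written out explicitly for linear polarization. The sketch below makes several assumptions that are not taken from the text: `gamma` denotes the transmission coefficient of the mirror, `eta` the detector efficiency (taken equal for both detectors in the crystal case), polarization states are real two-component vectors, and all function names are hypothetical.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def analyzer_projectors(theta):
    """PVM of an ideal linear-polarisation analyzer oriented at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    e_plus = [[c * c, c * s], [c * s, s * s]]       # photon transmitted (+)
    e_minus = [[s * s, -c * s], [-c * s, c * c]]    # orthogonal polarisation (-)
    return e_plus, e_minus

def scaled(matrix, factor):
    return [[factor * v for v in row] for row in matrix]

def difference(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def nicol_povm(theta, gamma, eta):
    """Two outcomes: detector D clicks, or no click at all."""
    e_plus, _ = analyzer_projectors(theta)
    click = scaled(e_plus, gamma * eta)
    return [click, difference(IDENTITY, click)]

def crystal_povm(theta, gamma, eta):
    """Three outcomes: click in the o-beam, click in the e-beam, no click."""
    e_plus, e_minus = analyzer_projectors(theta)
    return [scaled(e_plus, gamma * eta), scaled(e_minus, gamma * eta),
            scaled(IDENTITY, 1 - gamma * eta)]

def probability(element, state):
    """<state|element|state> for a real, normalised polarisation vector."""
    return sum(state[i] * element[i][j] * state[j]
               for i in range(2) for j in range(2))

horizontal = (1.0, 0.0)   # hypothetical incoming polarisation state
nicol_probs = [probability(e, horizontal)
               for e in nicol_povm(math.pi / 6, 0.5, 0.8)]
crystal_probs = [probability(e, horizontal)
                 for e in crystal_povm(math.pi / 6, 0.5, 0.8)]
```

For a horizontally polarized photon and an analyzer at 30 degrees the ideal transmission probability is 0.75, so with `gamma = 0.5` and `eta = 0.8` the nicol arrangement yields probabilities 0.3 and 0.7, and the crystal arrangement yields 0.3, 0.1 and 0.6. The last crystal element is a multiple of the unit matrix, which is why its probability does not depend on the state.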
7.6. THEORY OF NONIDEAL MEASUREMENT
The introduction of a partially transparent mirror creates the possibility of performing a polarization measurement also in the reflected beam (cf. figure 7.4). By choosing for the latter measurement a nicol or a bi-refringent crystal oriented at a different angle (corresponding to PVM we obtain for the case of nicols POVM (the first two operators representing the detection probabilities of detectors D and , and for bi-refringent crystals POVM in which
An interesting aspect of these measurement arrangements is that their POVMs yield information on the probability distributions of two standard observables: the probability distributions and can be calculated easily from the measured probability distributions. We shall see in section 7.9 that these measurements can in a certain sense be interpreted as simultaneous (better: joint) measurements of the two standard observables.
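How the two standard distributions follow from the measured ones can be sketched for the arrangement with nicols in both beams. The symbols below are assumptions, not the book's notation: γ is the transmission coefficient of a lossless mirror (so that the reflection coefficient is 1 − γ), η and η′ are the efficiencies of the two detectors, and E_+^θ, E_+^θ′ are the transmission projectors of the two analyzers.

```latex
% Hedged sketch; all symbols are assumed, see the lead-in.
\begin{align*}
  p_{D}  &= \gamma\,\eta\;\mathrm{Tr}\,\rho E_+^{\theta}, &
  p_{D'} &= (1-\gamma)\,\eta'\;\mathrm{Tr}\,\rho E_+^{\theta'}, \\
  \mathrm{Tr}\,\rho E_+^{\theta}  &= \frac{p_{D}}{\gamma\,\eta}, &
  \mathrm{Tr}\,\rho E_+^{\theta'} &= \frac{p_{D'}}{(1-\gamma)\,\eta'} .
\end{align*}
```

The complementary probabilities follow from normalization of each PVM, so one run of the experiment determines both standard distributions provided the mirror and detector parameters are known.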
7.6 Theory of nonideal measurement
7.6.1 Examples
The examples discussed in the previous sections have one thing in common. In all examples relations can be derived between different POVMs, connecting the outcomes of different measurement procedures. In section 7.2 this relation is given by (7.7) (with (7.8)). Note that all and that (7.9) is satisfied. We already referred to the matrix as a nonideality matrix, because (7.7) is expressing the connection between a measurement method using an inefficient detector represented by POVM and a method using an efficient detector represented by PVM
Similar nonideality matrices are encountered in the other examples. Thus, in section 7.3.1 it is possible to construct from POVM (7.17) another POVM generated by the operators
to be expressed in terms of POVM
according to
the Heaviside function Here, too, we have a nonideality relation: the measurement represented by POVM is a nonideal version of the measurement represented by in the sense that it describes only the probabilities of the particle hitting screen B in the upper or lower part. The transition from one POVM to the other is related here to a coarsening of the detection method because in the transition the subsets of the spectrum of the observable are restricted (cf. section 7.6.6). Comparing (7.29) with (7.7) we see considerable similarity, even though one of the matrix indices has a continuous range. Thus,
A second interesting nonideality relation, playing a role in this example, follows from the joint spectral decomposition of the operators + , – , existing because these operators commute. We can write the operators as
constituting an ODI (consisting of the projection operators on the joint eigenvectors of and generating a PVM. The constants (these are the eigenvalues of and satisfy
implying that this matrix is once again a nonideality matrix. Hence, we can interpret the detection method represented by POVM as a nonideal version of the detection method for the standard observable We shall not study here any further the physical significance of this observable, because a more systematic method for studying interference experiments as nonideal measurements of certain observables will be developed in section 8.2. The stationary approach (cf. section 7.3.2) of the double-slit experiment yields relations completely analogous to (7.29) and (7.31). Except for the unattainable limit the example of optical homodyning (section 7.4) also does not give rise to a PVM but is represented by a POVM. And once again we recognize the relation, encountered before, of the POVM actually measured ((7.24) or (7.25)) to an observable we would like to measure, viz, or P (or rather the corresponding PVM or respectively. This relation is, once again, of the type
in which
is the intended PVM, and
satisfies
Once more, this function can be interpreted as a nonideality function determining the deviation of the measurement from an ideal measurement. In quantum optics such functions are often seen as describing excess noise introduced into the detected signal by a “wrong” choice of the measurement arrangement. Evidently, in the ideal arrangement for measuring or P by means of homodyne detection using the combination of a 100% efficient detector and a vanishingly small reflection coefficient of the partially transparent mirror, this excess noise can be neglected in comparison with the noise already present in the (quantum mechanical) signal. In practice, however, the homodyne measurement of or P is always a nonideal measurement in which the measurement induces excess noise.
Finally, also in section 7.5 we recognize an analogous nonideality relation when the NODI generating POVM is put into the following matrix form:
CHAPTER 7. GENERALIZED QUANTUM MECHANICS
Denoting the 2 × 2 matrix again by
it is clear that
thus enabling an interpretation of this matrix as a nonideality matrix describing the nonideality of the measurement arrangement if viewed as a measurement of the observable represented by PVM Such a matrix representation is also possible for the NODI generating POVM obtained in case of bi-refringent crystals, viz,
the nonideality matrix now not being square, but still satisfying (7.35).
7.6.2 Definition of a nonideal measurement
The examples given above give ample occasion for the following general definition:
Nonideal measurement: We call a measurement represented by POVM $\{R_m\}$ a nonideal measurement of an observable represented by POVM $\{M_n\}$ if the following relations are satisfied among the NODIs:
$$R_m = \sum_n \lambda_{mn} M_n, \qquad \lambda_{mn} \geq 0, \qquad \sum_m \lambda_{mn} = 1. \tag{7.37}$$
The POVM does not need to be a PVM. Moreover, each of the discrete indices and may be replaced by a continuous one, with concomitant replacement of summation by integration. Occasionally, relation (7.37) between POVMs and will be denoted by (cf. [28])
symbolizing that the probability distribution is directly determined by the expectation values of observable but not necessarily conversely. It is possible that both POVMs are PVMs. This is true if the PVM of the observable is a refinement of the PVM of the measurement actually performed, i.e.
In this case (7.37) is satisfied with all or 0. An interpretation of a measurement of as a nonideal version of a measurement of observable is justified because the first measurement can be done by actually performing the second one, and by, subsequently, not distinguishing the labels of the measurement results corresponding to the same value of Note that (7.39) is completely analogous to the relation (A.121) between FVMs in case of a refinement of a partition of the measure space. In section 7.2 the quantity was already referred to as a conditional probability, viz, as the probability that the measurement has measurement result given that an measurement, if performed, would have given result This interpretation is justified by the existence of an observable represented by the POVM Since (note that
the POVMs and are the marginals of this POVM, thus enabling an interpretation of a measurement of POVM in agreement with the definition given in section 1.9.2, as a joint measurement of observables and The conditional probability follows directly from this as
The significance of the nonideality matrix can be illustrated by means of a transmission channel as given in figure 7.5. The channel consists of a number of subchannels, in principle connecting each input port to each output port. In an ideal transmission channel we have i.e. each input port is connected to one and only one output port (no crossing of signals between sub-channels occurs). In case of nonideality the matrix element represents the conditional probability that an input signal in channel is transmitted into output channel The nonideality
matrix is known in the mathematical literature (for instance, Ortega [319]) as a stochastic matrix5. The theory of transmission channels is well known from the literature on stochastic processes (e.g. McEliece [321]). The nonideality relation is a transitive one. If POVM represents a nonideal measurement of POVM and POVM in its turn represents a nonideal measurement of POVM then POVM also represents a nonideal measurement of POVM This is evident since the products of the nonideality matrices involved are stochastic matrices if the factors are (cf. appendix A.13). This corresponds to two consecutive transmission channels like the one depicted in figure 7.5.
It should be remarked here that the classical picture of nonideality, suggested by the analogy with a transmission channel, is somewhat deceiving in a way comparable to the deceivingly “classical” appearance of relation (1.32) between quantum mechanical mixtures and pure states. We shall, indeed, see in section 7.9 that the fundamentally quantum mechanical idea of ‘complementarity’ in the sense of ‘mutually exclusive measurement arrangements’, discussed in section 4.5, can be formulated in terms of nonideality matrices. Although initially the excess noise in the homodyne optical detection method of section 7.4 seemed to manifest itself as classical noise, induced by the measurement process and superposed on the “proper” quantum mechanical noise corresponding to the quantum mechanical statistics of the signal, by now the insight has been reached that the excess noise has a quantum mechanical origin too. Nowadays this noise is referred to as quantum noise, the essential difference with classical noise being the existence of a quantum limit (e.g. Yamamoto et al. [316]) bounding the measurement accuracy of certain optical measurement methods in a fundamental way. In the mathematical representation of NODIs as cones in a vector space (appendix A.12.4) the nonideality relation (7.37) can be visualized in a simple way if the NODIs contain only extremal elements. POVMs corresponding to such NODIs are called extremal POVMs (Martens and de Muynck [28]). In section 7.6.6 it will be argued that we may restrict ourselves to extremal POVMs. It follows directly from (7.37) that in this case the elements of POVM are all lying inside the cone defined by POVM (cf. figure 7.6). The nonideal observable corresponds to the cone having the smaller angle at its top. It is finally noted that the operators in a NODI should not be considered as an ordered set: a permutation of the operators yields essentially the same POVM. 
This is consistent with an empiricist interpretation of quantum mechanics, in which the way of labeling the operators of a POVM (i.e. the output channels of the measuring instrument) is irrelevant. A measurement with a nonideality matrix is just as good or bad as one having nonideality matrix . These can be simply transformed into each other by relabeling the operators of one of the POVMs (i.e. relabeling either the input or the output channels of the measuring instrument). In section 7.6.7 equivalence of POVMs will be defined, and it will be demonstrated that two POVMs, the NODIs of which differ only by a permutation of their elements, are equivalent.
5 In a different convention rows and columns are interchanged, making the nonideality matrix the transpose of a stochastic matrix, cf. Gantmacher [320].
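The transmission-channel picture of this section can be made concrete with a small numerical sketch. It is only an illustration under assumptions of our own: the names `is_stochastic`, `apply` and `compose` and the numerical matrices are hypothetical, and `lam[m][n]` is taken to be the conditional probability that a signal in input channel n ends up in output channel m, so that each column sums to 1.

```python
from fractions import Fraction as F

def is_stochastic(lam):
    """Entries >= 0 and every column (fixed input channel n) sums to 1."""
    n_inputs = len(lam[0])
    nonnegative = all(x >= 0 for row in lam for x in row)
    columns_ok = all(sum(row[n] for row in lam) == 1 for n in range(n_inputs))
    return nonnegative and columns_ok

def apply(lam, p):
    """Output distribution q[m] = sum_n lam[m][n] * p[n]."""
    return [sum(l * x for l, x in zip(row, p)) for row in lam]

def compose(mu, lam):
    """Matrix product mu.lam: channel lam followed by channel mu."""
    inner = len(lam)
    return [[sum(mu[k][m] * lam[m][n] for m in range(inner))
             for n in range(len(lam[0]))]
            for k in range(len(mu))]

# Illustrative (assumed) nonideality matrices; mu is not square.
lam = [[F(9, 10), F(1, 5)],
       [F(1, 10), F(4, 5)]]
mu = [[F(1, 2), F(0)],
      [F(1, 4), F(1, 2)],
      [F(1, 4), F(1, 2)]]
# Relabeling the two output channels is itself a stochastic matrix.
swap = [[F(0), F(1)],
        [F(1), F(0)]]

p = [F(1, 3), F(2, 3)]   # distribution of the ideal measurement
q = apply(lam, p)        # distribution of the nonideal measurement
print(q, is_stochastic(compose(mu, lam)))
```

Composing the two channels gives again a stochastic matrix, which is the content of the transitivity property, and `swap` merely permutes the labels of the outcomes without changing the information they carry.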
7.6.3 Invertibility
Often the nonideality matrix has an inverse. In this case the nonideal measurement is called invertible. For instance, in the case of an inefficient photon detector (section 7.2) the inverse is given by (7.10). For the nonideality matrix of the nonideal polarization measurement, given in (7.34), we easily find the inverse according to
In principle, even in the continuous case of the homodyne optical detection it is possible to achieve an inversion of the relation (7.32). This latter relation describes a convolution of the functions and Inversion is possible, in principle, by means of deconvolution. For the nonideality function of (7.24),
the inverse is found as
Although the integral in (7.41) can exist only in the sense of a generalized function, this expression can nevertheless be useful (cf. section 7.9.3). In general, the inverse of a stochastic matrix is not a stochastic matrix since its matrix elements can be negative (see for instance (7.40)). The other property of (7.35), however, is also valid for the inverse, thus,
For square matrices this directly follows by summing the relation over using the equality For non-square matrices the situation is somewhat more involved. Here the left inverse is different from the right one. Therefore we cannot use here the above relation, since it is based on the right inverse rather than the left one. Yet it is possible to assume that (7.42) is satisfied. This is illustrated by the example of the inverse of the nonideality matrix of (7.36). The left inverse of this matrix is given by
(it is easy to see that the right inverse does not exist). This matrix does satisfy (7.42) for the choice The problem of the non-uniqueness of the inverse of a nonideality matrix does not arise if we restrict ourselves to nonideality relations in which the number of operators in the NODI generating does not exceed the number of operators of the NODI of This implies that the nonideality matrix can be chosen as a square matrix such that is uniquely defined if it exists (since no inverse exists in the non-square case). However, as seen from the example (7.36), this restriction is not always satisfied. In realistic experiments it is not uncommon that the number of output channels exceeds the number of input channels. This can be dealt with by increasing the number of input channels in an arbitrary way by trivial refinement (cf. section 7.6.6) at the cost of the nonuniqueness exhibited by (7.43). Relation (7.42) also holds in the continuous case of (7.41) if, in the sense of the Dirac calculus, the summation is replaced by an integration over and the Dirac delta function is substituted for the Kronecker delta.
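The properties of the inverse discussed in this section can be checked on small examples. The following is a sketch under our own assumptions: the matrices `lam` and `lam_rect` are hypothetical and are not the matrices (7.34) or (7.36) of the text. It shows that the inverse of a square stochastic matrix can have negative elements while its columns still sum to 1, and that a non-square matrix can have several left inverses, only some of which have unit column sums.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def column_sums(a):
    return [sum(row[j] for row in a) for j in range(len(a[0]))]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

identity2 = [[F(1), F(0)], [F(0), F(1)]]

# Square case (assumed example): the inverse has negative elements,
# but each of its columns still sums to 1.
lam = [[F(3, 4), F(1, 4)],
       [F(1, 4), F(3, 4)]]
lam_inv = inverse_2x2(lam)

# Non-square case (assumed example, 3 outputs and 2 inputs):
# two different left inverses of the same matrix.
lam_rect = [[F(1, 2), F(0)],
            [F(1, 2), F(1, 2)],
            [F(0), F(1, 2)]]
left_a = [[F(1), F(1), F(-1)],
          [F(0), F(0), F(2)]]    # column sums 1, 1, 1
left_b = [[F(2), F(0), F(0)],
          [F(0), F(0), F(2)]]    # column sums 2, 0, 2

print(lam_inv, column_sums(lam_inv))
print(column_sums(left_a), column_sums(left_b))
```

Both `left_a` and `left_b` undo `lam_rect`, which illustrates the nonuniqueness of the left inverse; the unit column sums hold only for a suitable choice such as `left_a`.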
7.6.4 An alternative definition of nonideal measurement
Ludwig ([38], p. 135) has proposed an alternative to the definition of a nonideal measurement (referred to by him as a ‘reducible’ observable). This definition is suggested by the analogy with a “nonideal” preparation, described by the convex combination (1.48) of two density operators. The idea is that a convex combination of two (or more) POVMs could represent a mixture of measurement procedures, in which the fluctuations in the measuring instrument induce a stochastic alternation of the POVMs. Thus, given POVMs and according to this definition the POVM represents a nonideal measurement. It is clear that the POVM (7.44) yields a correct representation of a measurement process if each individual measurement can be represented either by POVM or by POVM This can be the case if the duration of the measurement interaction is much shorter than the characteristic time of the fluctuations in the measuring instrument. This might be realized in certain experimental situations. For instance, it would be possible to replace in figure 7.4 the measurement arrangement of the partially transparent mirror (transmission by a mixture of measurement arrangements in which in a fraction of the experiments the mirror is left out completely, while in a fraction a completely reflecting mirror is inserted. The POVM representing this experiment is, in the case of nicols, the mixture (7.44) of the POVMs and Comparing this example with the partially transparent mirror, it is striking that the typically quantum mechanical character of the measurement, evident from the linear superposition of the outgoing beams in the example of section 7.5, is absent in the mixture. The fact that it is not a priori certain whether a photon incident on the mirror will be transmitted or reflected seems to be the essential characteristic determining the quantum mechanical character of the measurement procedure. 
Even if it would be possible to describe the partially transparent mirror in terms of a fluctuating measurement arrangement, then in any case it would seem that fluctuations are occurring during the interaction time: precisely these fluctuations would seem to be responsible for the unpredictable behavior of the photon. Although this is not an exhaustive analysis of the merits of a mixture of POVMs as a representation of a nonideal measurement, this example sufficiently illustrates the reason why we do not want to consider Ludwig’s proposal any further. We are interested in the first place in typically quantum mechanical measurement processes in which the fluctuations are interfering in an essential way with the measurement process. The mixture (7.44) does not seem able to represent such processes (see also section 7.9.1, example 1).
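For concreteness, the mixture of measurement procedures considered above can be sketched numerically. Everything in this sketch is an assumption made for illustration: the two-dimensional POVMs `povm_a` and `povm_b`, the weight `w` and the density matrix `rho` are hypothetical and do not correspond to the arrangement of figure 7.4. The sketch only shows that a convex combination of two POVMs is again a resolution of the identity, and that its statistics are the same convex combination of the statistics of the two procedures.

```python
from fractions import Fraction as F

def trace_product(a, b):
    """Tr(a b) for square matrices given as lists of rows."""
    dim = len(a)
    return sum(a[i][j] * b[j][i] for i in range(dim) for j in range(dim))

def probabilities(rho, povm):
    return [trace_product(rho, element) for element in povm]

def mix(w, povm_a, povm_b):
    """Element-wise convex combination w * A_m + (1 - w) * B_m."""
    return [[[w * x + (1 - w) * y for x, y in zip(row_a, row_b)]
             for row_a, row_b in zip(elem_a, elem_b)]
            for elem_a, elem_b in zip(povm_a, povm_b)]

half = F(1, 2)
# Two assumed PVMs on a two-dimensional space (real matrices).
povm_a = [[[F(1), F(0)], [F(0), F(0)]],
          [[F(0), F(0)], [F(0), F(1)]]]
povm_b = [[[half, half], [half, half]],
          [[half, -half], [-half, half]]]

rho = [[F(3, 4), F(1, 8)],
       [F(1, 8), F(1, 4)]]       # assumed density matrix, trace 1

w = F(1, 3)                      # fraction of runs using povm_a
mixture = mix(w, povm_a, povm_b)

p_a = probabilities(rho, povm_a)
p_b = probabilities(rho, povm_b)
p_mix = probabilities(rho, mixture)
print(p_a, p_b, p_mix)
```

In each individual run exactly one of the two procedures is realized, which is why no superposition of outgoing beams appears in this description.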
7.6.5 Nonideal measurement of a standard observable
In this section it will be demonstrated that relation (7.37) is a natural one for representing a nonideal measurement of a standard observable. A characteristic feature of a measurement of a standard observable, represented by a PVM is that the measurement yields information only on the diagonal elements of the density operator in the representation involved (compare (1.72) and (1.75)): if then from the representation
it follows that Hence, the measurement does not yield any information on We expect that a nonideal measurement of standard observable must yield even less information than a measurement of itself. This implies that also in a nonideal measurement no information on off-diagonal elements of is to be expected. Let be the POVM of such a nonideal measurement. Since the operator is positive it has a spectral representation
with a complete orthonormal set of eigenvectors of and (positive) eigenvalues. In general the eigenvectors may be different for different operators A necessary condition lest a relation can exist between all operators of and one single is that the eigenvectors of all operators can be taken the same, thus, If this is not the case, then the POVM contains off-diagonal information, which is in disagreement with the idea of a (nonideal) measurement of one single standard observable. Hence, if represents a nonideal measurement of the standard observable we should have
From and it immediately follows that the conditions for mentioned in (7.37), should be satisfied. This demonstrates that (7.37) is a natural definition of a nonideal measurement of a standard observable. This definition can be generalized by admitting the possibility that can represent a nonideal measurement of an arbitrary observable rather than a standard one. Of course the above reasoning is not applicable to this latter generalization, thus leaving open the possibility of alternative definitions next to (7.37) in the general case.
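The argument of this section can be illustrated by a small computation. The sketch below rests on assumptions of our own: the PVM is taken to consist of the projections on the basis vectors of a two-dimensional space, and the matrix `lam` and the two density matrices are hypothetical. It shows that operators which are diagonal in the eigenbasis of the standard observable give probabilities depending only on the diagonal elements of the density matrix.

```python
from fractions import Fraction as F

def nonideal_elements(lam):
    """R_m = sum_n lam[m][n] E_n with E_n the projection on basis vector n.

    In the basis of the E_n each R_m is the diagonal matrix diag(lam[m]).
    """
    dim = len(lam[0])
    return [[[row[i] if i == j else F(0) for j in range(dim)]
             for i in range(dim)]
            for row in lam]

def probabilities(rho, elements):
    """p_m = Tr(rho R_m)."""
    dim = len(rho)
    return [sum(rho[i][j] * e[j][i] for i in range(dim) for j in range(dim))
            for e in elements]

lam = [[F(4, 5), F(3, 10)],
       [F(1, 5), F(7, 10)]]      # assumed stochastic matrix
elements = nonideal_elements(lam)

half = F(1, 2)
rho_coherent = [[half, half], [half, half]]    # off-diagonal elements present
rho_diagonal = [[half, F(0)], [F(0), half]]    # same diagonal, no coherence

print(probabilities(rho_coherent, elements))
print(probabilities(rho_diagonal, elements))
```

Both density matrices yield the same distribution, in agreement with the statement that no information on off-diagonal elements is obtained.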
Relation to measurement theory
In section 3.3 a detailed quantum mechanical description of the interaction process between object and measuring instrument gave us the explicit expression (3.23) for the operators of the POVM representing a measurement, U being defined in (3.20). Hence,
It easily follows that commutativity of the Hamiltonian H, describing the interaction between object and measuring instrument, with standard observable is a sufficient condition for a measurement to be a nonideal measurement of that standard observable6. Consequently, each interaction Hamiltonian of the type
with arbitrary Hermitian operators of the measuring instrument, induces a nonideal measurement of PVM
Unbiasedness
In the literature (e.g. Holevo [34], p. 106) sometimes an extra condition is imposed on nonideal measurements, viz that in a nonideal measurement of standard observable A the expectation values of this latter observable are reproduced. This is often referred to as a requirement of ‘unbiasedness’. Thus, if
and
then ‘unbiasedness’ requires
This implies or
6 In section 8.3.2 we shall encounter an example showing that the condition is not necessary.
This is an additional requirement imposed on the nonideality matrix. For the continuous case of homodyne optical detection (7.24) this boils down to
satisfied, for instance, by a Gaussian convolution function. Evidently, homodyne optical detection satisfies the requirement of ‘unbiasedness’. Equality (7.49) is sometimes used to define a standard observable allegedly measured by the measurement procedure represented by a POVM (for instance, the standard phase observables referred to in section 1.9.4). Although ‘unbiasedness’ may be a natural requirement relating certain measurement procedures, it can certainly not be applied in general. Thus, the inefficient photon detector of section 7.2 does not satisfy the requirement. It is easily verified that for the nonideality matrix (7.8) we obtain
corresponding to the fact that the inefficient detector registers only a fraction of the incoming intensity, entailing
By choosing the values of the observable as instead of it would be possible also in this case to satisfy the requirement that the expectation values be reproduced exactly by the nonideal measurement7. However, by doing so we would get into conflict with the idea that, like the ideal photon counter, the nonideal photon detector also “counts” photons, or, rather, registers numbers. For this reason this solution does not seem to be a very attractive one. In order that this widely used measurement procedure fall within the domain of application of the theory it seems necessary to drop the requirement of ‘unbiasedness’. Since this requirement is dependent on the values the observable can take, in an empiricist interpretation of quantum mechanics this is even the most natural thing to do (cf. section 2.4.2).
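The failure of ‘unbiasedness’ for an inefficient detector can be verified in a short computation. The sketch below assumes the usual binomial model of a detector with quantum efficiency `eta`, in which each of n incoming photons is registered independently with probability `eta`; whether this coincides in every detail with the matrix (7.8) of section 7.2 is an assumption on our part, and the function names are hypothetical.

```python
from fractions import Fraction as F
from math import comb

def detector_matrix(eta, n_max):
    """Assumed binomial model: lam[m][n] = C(n, m) eta^m (1 - eta)^(n - m)."""
    return [[F(comb(n, m)) * eta**m * (1 - eta)**(n - m) if m <= n else F(0)
             for n in range(n_max + 1)]
            for m in range(n_max + 1)]

def reproduced_values(values, lam):
    """sum_m values[m] * lam[m][n], evaluated for every n."""
    return [sum(values[m] * lam[m][n] for m in range(len(lam)))
            for n in range(len(lam[0]))]

eta = F(3, 4)
n_max = 4
lam = detector_matrix(eta, n_max)

counts = [F(m) for m in range(n_max + 1)]      # the detector registers numbers m
rescaled = [F(m) / eta for m in range(n_max + 1)]

print(reproduced_values(counts, lam))    # eta * n: a fraction eta is registered
print(reproduced_values(rescaled, lam))  # n: requirement met after rescaling
```

With the registered numbers m as values the mean is reduced by the factor `eta`, whereas the rescaled values m / eta reproduce the expectation values exactly, at the price of no longer being photon numbers.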
7 The general possibility of such a procedure is demonstrated by Martens and de Muynck [28].
7.6.6 Linear dependence of elements of a NODI; coarsening and refinement of POVMs
A POVM will be called a coarsening of POVM if is a refinement of POVM A coarsening is an example of a
nonideal measurement, realized by not distinguishing the measurement results of certain detectors (compare a standard observable having a degenerate spectrum). A rather trivial, but nevertheless not unimportant way in which a refinement can be realized, is, for instance, by means of data manipulation during which the output signal of a detector corresponding to an element of POVM is fed into registration apparatus for a fraction of the measurement events, and into registration apparatus for a fraction (cf. figure 7.7). In that case only the sum of the relative frequencies and of the two registration apparata is of physical importance. Representing these probabilities by the positive operators the relative frequencies of registration apparata and are given by and respectively. In the POVM we can therefore replace the operator by the pair thus defining a new POVM. It is clear, however, that the POVM thus obtained contains precisely the same information on the density operator as the original one. The new POVM will be called a trivial refinement of POVM More generally, POVM is a trivial refinement of POVM We call a trivial coarsening of POVM because the reduction of the number of registration apparata is not accompanied by any loss of information. Not only does POVM represent a nonideal measurement of the trivial refinement but, conversely, does represent a nonideal measurement of since
and
the quantities easily being seen to satisfy the conditions of the nonideality relation (7.37).
The symmetry in the relation between a POVM and its trivial refinement is related to the fact that both POVMs contain equivalent information on the density operator. This latter circumstance is reducible to a linear dependence existing among the elements of the NODI of one of the POVMs: (pairwise linear dependence). This linear dependence can be eliminated by means of trivial coarsening. In the representation of a NODI as a cone in a vector space (cf. appendix A.12.4) pairwise linear dependence means that two of the vectors are along the same ray. It is possible, in principle, that a relation of linear dependence exists among more than two elements of a NODI. This can happen, for instance, if there are elements in the cone representation of the NODI corresponding to non-extremal elements (as in figure A.6c), which, according to the Krein-Mil’man theorem (appendix A.11.3), can be written as convex combinations of extremal elements. POVMs generated by such NODIs will be left out of consideration in the following, since they can be replaced by extremal POVMs using the procedures discussed in appendix A.12.4: once again the measurement procedure represented by a non-extremal POVM can be viewed as being obtainable from the procedure corresponding to an extremal POVM by means of data manipulation, in which the statistical information is distributed over several detectors in a way that is independent of the quantum mechanical state, thus allowing to employ a smaller number of detectors without any loss of information.
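The symmetry between a POVM and its trivial refinement can be sketched at the level of probability distributions. The function names, the fraction `gamma` and the distribution `p` below are assumptions made for illustration. Splitting one outcome over two registration apparata and adding the two registrations again are both described by stochastic matrices, and together they restore the original distribution.

```python
from fractions import Fraction as F

def is_stochastic(lam):
    """Entries >= 0 and every column sums to 1."""
    return (all(x >= 0 for row in lam for x in row)
            and all(sum(row[n] for row in lam) == 1
                    for n in range(len(lam[0]))))

def apply(lam, p):
    return [sum(l * x for l, x in zip(row, p)) for row in lam]

def refinement_matrix(size, index, gamma):
    """Split outcome `index` into fractions gamma and 1 - gamma."""
    rows = []
    for n in range(size):
        unit = [F(1) if k == n else F(0) for k in range(size)]
        if n == index:
            rows.append([gamma * x for x in unit])
            rows.append([(1 - gamma) * x for x in unit])
        else:
            rows.append(unit)
    return rows

def coarsening_matrix(size, index):
    """Add the two outcomes produced by refinement_matrix again."""
    source = []                  # original outcome behind each refined outcome
    for n in range(size):
        source.extend([n, n] if n == index else [n])
    return [[F(1) if src == n else F(0) for src in source]
            for n in range(size)]

gamma = F(1, 3)
refine = refinement_matrix(2, 1, gamma)   # 3 outcomes from 2
coarsen = coarsening_matrix(2, 1)         # 2 outcomes from 3

p = [F(2, 5), F(3, 5)]
p_refined = apply(refine, p)
p_back = apply(coarsen, p_refined)
print(p_refined, p_back)
```

Because `gamma` does not depend on the state, the refined distribution carries no information beyond that of `p`, which is why the coarsening involves no loss.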
7.6.7 Equivalence of POVMs
The phenomenon of trivial refinement gives occasion to the following definition:
Equivalent POVMs: Two POVMs and will be called equivalent if represents a nonideal measurement of and, conversely, represents a nonideal measurement of Hence the corresponding NODIs are related according to
From (7.51) and (7.52) it directly follows that a POVM is equivalent to each of its trivial refinements and trivial coarsenings. It is also directly evident that a POVM is equivalent to a POVM obtained by permuting the elements of the NODI generating it, since such a permutation can be interpreted as a nonideal measurement with a nonideality matrix obtained from the unit matrix by means of a permutation of its columns. Then the inverse of this matrix corresponds to the inverse permutation. Equivalence of POVMs and is denoted according to As we saw before, the possibility of equivalence of POVMs is based on linear dependence
of the operators of the NODI of at least one of the POVMs. It can easily be proven that the operators of an ODI are linearly independent, i.e. O, For this reason, the possibility of linear dependence, and hence of equivalence, is absent in the standard formalism. It is a characteristic of the generalized theory, applicable to generalized measurements only. The notion of equivalence of POVMs can be used to subdivide the set of all POVMs into classes of mutually equivalent POVMs. This holds true because the relation satisfies the properties of reflexivity, symmetry and transitivity of an equivalence relation, i.e.
This can easily be seen from (7.53). The equivalence class of POVM will be denoted by All POVMs of an equivalence class yield equivalent information on density operator Restricting ourselves to extremal POVMs, it is immediately clear from figure 7.6 that two POVMs are equivalent if and only if their cones coincide. This implies that, if and are members of the same class, each element of equals an element of up to a multiplicative constant. Hence, the equivalence class of contains only trivial refinements and coarsenings of or POVMs obtained by a combination of these procedures. By trivial coarsening it is always possible to choose the POVM representing the class as a POVM generated by a pairwise linearly independent NODI (having only one element in each ray).
Informational equivalence of POVMs
Informationally equivalent POVMs: POVMs and are informationally equivalent if, instead of (7.53), the elements of the generating NODIs satisfy the equality
with an invertible matrix (not necessarily a stochastic one).
Informational equivalence is a weaker form of equivalence than the one defined before. Any nonideal measurement with an invertible nonideality matrix defines an
informationally equivalent pair. An example of informational equivalence is given by the pair of POVMs (7.7) and (7.8), representing ideal and nonideal measurements of the number observable, respectively. In case of informational equivalence it is possible to calculate the probability distribution of one POVM from the distribution of the other one. Therefore the measurements yield equivalent information on the density operator. Hence, at least in principle, nonideality of a measurement need not entail loss of information. If the nonideality matrix and its inverse are known, then it is possible in principle to compensate by means of calculation for the measurement disturbance of the probability distribution caused by the nonideality of the measurement procedure. However, due to the necessity that the whole probability distribution of observable be known, such a compensation is not always possible in practice. If the spectrum of an observable is infinite then this can in general not be the case. It may also occur that the inversion process is unstable, causing nearly equal probability distributions of to lead to very different probability distributions for This, for instance, holds for the inversion of relation (7.24) by means of deconvolution. For POVMs generated by NODIs with a small number of elements such problems do not arise, however.
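The instability of the inversion mentioned above can be made visible even for a 2 × 2 nonideality matrix that is nearly singular. The sketch below is an illustration under assumptions of our own: the matrix `lam`, the parameter `eps` and the two measured distributions are hypothetical.

```python
from fractions import Fraction as F

def apply(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

half = F(1, 2)
eps = F(1, 100)                  # small eps: a very nonideal measurement
lam = [[half + eps, half - eps],
       [half - eps, half + eps]]
lam_inv = inverse_2x2(lam)       # determinant equals 2 * eps

delta = F(1, 200)                # small difference between measured data
q1 = [half, half]
q2 = [half + delta, half - delta]

p1 = apply(lam_inv, q1)
p2 = apply(lam_inv, q2)
gain = (p2[0] - p1[0]) / (q2[0] - q1[0])   # amplification 1 / (2 * eps)
print(p1, p2, gain)
```

A difference of 1/200 in the measured distributions is amplified to 1/4 in the reconstructed ones, so that statistical errors in the data are magnified by the same factor.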
7.7 Partial ordering of nonideal measurements
7.7.1 Partial ordering of equivalence classes
The nonideality relation (7.37) induces a partial ordering of the equivalence classes defined in section 7.6.7. Thus, if POVMs and satisfy a nonideality relation, then this holds true for any pair of POVMs taken from the two equivalence classes and Hence, it is possible to attribute the relation of nonideality to the equivalence classes, signifying that the class consists of observables representing nonideal measurements of the observables of the class It is easily seen that the relation is a partial ordering relation, satisfying the properties
Note that the relation does not induce a partial ordering of the POVMs, since equivalence of POVMs does not imply their equality. Nevertheless, since the other two properties of partial ordering hold true also for POVMs, it is possible to deal in the following with relations between POVMs, partial ordering of POVMs being interpretable as a relation between the equivalence classes the POVMs are representing.
7.7.2 Maximal and minimal POVMs
From the transitivity relation, given above, it follows that it is possible to construct sequences of POVMs, like from left to right corresponding to ever less ideal measurements of POVM An interesting question is to what extent such a sequence can be continued either to the left or to the right. On the basis of the representation of a POVM as a cone in a vector space, and the relation, found in section 7.6.2, between the cone’s top angle and nonideality, we expect the sequence to be bounded on both sides, i.e. we expect the sequence to have a maximum and a minimum, corresponding to the largest and the smallest possible top angle, respectively. These maximal and minimal POVMs can be defined in a more formal way as follows (Martens and de Muynck [28]):
Maximal POVM: POVM is maximal if for each POVM satisfying it follows that
Minimal POVM: POVM is minimal if for each POVM satisfying it follows that
We now prove the following:
Theorem:
1. POVM is maximal if and only if for each the corresponding element of the NODI generating the POVM satisfies a projection operator onto a one-dimensional subspace8.
2. POVM is minimal if and only if for each the corresponding element of the NODI generating the POVM satisfies
8 A similar result was obtained by Davies [322] from the requirement that the mutual information (A.95) of the joint probability distribution is maximal. Here are the relative frequencies of the states of a von Neumann ensemble (cf. section 6.2.3). The present derivation is independent of this latter issue.
Proof:
Ad 1.
– Suppose to be maximal. Each operator of the NODI has a spectral representation (7.45), in which is a one-dimensional projection operator. Define
is a POVM, and with This implies that i.e. POVM represents a nonideal measurement of POVM Since POVM was assumed to be maximal, we must have implying Then
From this it follows that
if
or
In agreement with the Theorem of appendix A.6 this must be, up to a multiplicative constant, a one-dimensional projection operator. Since the multiplicative constant must lie between 0 and 1. It is possible to find for each thus allowing this reasoning to be carried through for each of the elements of the NODI generating POVM
– Assume that operators of the NODI are each proportional to a one-dimensional projection operator. If there would exist a NODI such that then, once again, if implying to be proportional to a This implies that POVM can be only a trivial coarsening of POVM in agreement with section 7.6.6 implying that
Ad 2.
– Let POVM be minimal, and, hence, equivalent to any POVM representing a nonideal measurement of it. Then, since { O , I } is a nonideal measurement of any arbitrary POVM. From this it follows that is a trivial refinement of {O,I}. Hence, – Assume
then, evidently, This implies
The cone of a minimal POVM has top angle zero. Hence, all elements are proportional to the same vector, necessarily corresponding to the operator I. Such observables are not very interesting because the expectation values are completely independent of The corresponding observables will be referred to as uninformative observables. A measurement arrangement for an uninformative observable can be realized in practice if in an arbitrary measurement procedure no distinction is made between different pointer positions, but the measurement results are distributed, analogously to figure 7.7, over a number of registration apparata in a way not depending on the quantum mechanical state. Another example will be encountered in section 7.9.2. Considerably more interesting are the maximal observables, since these are yielding, in a certain sense, maximal information on the state Evidently, the maximal standard observables corresponding to Hermitian operators with a nondegenerate spectrum, belong to the set of maximal observables. In case of degeneracy the PVM contains more-dimensional projection operators, and is not maximal, neither in the sense of the standard formalism nor in the sense of the definition given above. It is satisfactory that the latter (generalized) definition of maximality reduces, on restriction to standard observables, to the one used in that formalism. Both from a fundamental as well as from a practical point of view it is important that in the generalized formalism there are still other maximal POVMs than the standard ones. 
This is illustrated by the following examples: Example 1: Let the Hilbert space (dimension be a subspace of a larger Hilbert space (dimension and let P be the orthogonal projection operator such that Be a complete orthonormal set of vectors in such that Then the vectors are all vectors in Since the set is an overcomplete set of vectors in The operators now constitute a maximal POVM on Indeed, because of for arbitrary it follows that Because of we also have i.e. on this operator equals I. Finally, since is a vector in with norm it follows that with a one-dimensional projection operator. Hence, the POVM is a maximal one. Example 2: The second example is an infinite-dimensional one. If is the coherent state defined in (A.29), then is a one-dimensional projection operator. Because of (A.37) the set of operators is defining a POVM. Evidently this is a maximal one. Like in example 1 this is connected to an overcomplete set of vectors. Theorem: For every maximal POVM the operators
of the NODI generating the POVM define a complete or overcomplete set of vectors, overcomplete if $N > \dim\mathcal{H}$, complete if $N = \dim\mathcal{H}$. In the latter case the POVM is a PVM. Proof:
Let POVM be a maximal one. Then for every element of the NODI generating the POVM and a one-dimensional projection operator. Let be a normalized vector in the range of Then From it directly follows that
demonstrating that any vector can be written as a linear combination of the vectors These vectors, for this reason, constitute an (over)complete set. It is evident that for a maximal POVM For the set is a basis for It is even an orthonormal basis. This can be seen as follows. Suppose it to be a non-orthonormal basis. Then it is possible to choose in (7.54) with a vector of the dual basis defined in appendix A.8.1. Because of (A.67) this implies
Since we have and That the maximal POVM is a PVM now follows directly from the theorem proven in appendix A.12.3, to the effect that the operators project on mutually orthogonal subspaces.
We note that not every (over)complete set of vectors gives rise to a maximal POVM. As follows from the theorem given above, for a complete set it is necessary that the vectors constitute an orthonormal basis. In case of overcompleteness, too, the vectors have to be chosen in a special way for the requirement to be satisfied. For instance, in a two-dimensional Hilbert space with orthonormal basis the vectors constitute an overcomplete set for each N > 2 (e.g. Holevo [34], p. 31), defining a maximal POVM
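The statement about overcomplete sets in a two-dimensional Hilbert space can be checked numerically. The following is a minimal sketch, assuming N real unit vectors at equally spaced angles 2πk/N and equal weights 2/N. This choice of vectors and the names `rank_one_povm` and `matrix_sum` are illustrative assumptions; the vectors of the cited example are not reproduced here. The sketch verifies that the weighted one-dimensional projections add up to the unit operator, so that they define a maximal POVM.

```python
import math


def rank_one_povm(n_vectors):
    """Operators (2/N)|psi_k><psi_k| for N equally spaced real unit vectors.

    The angles 2*pi*k/N are an assumed, illustrative choice.
    """
    elements = []
    weight = 2.0 / n_vectors
    for k in range(n_vectors):
        angle = 2.0 * math.pi * k / n_vectors
        c, s = math.cos(angle), math.sin(angle)
        # weight * |psi_k><psi_k| written as a real 2x2 matrix
        elements.append([[weight * c * c, weight * c * s],
                         [weight * c * s, weight * s * s]])
    return elements


def matrix_sum(matrices):
    """Entrywise sum of a list of 2x2 matrices."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for m in matrices:
        for i in range(2):
            for j in range(2):
                total[i][j] += m[i][j]
    return total


povm = rank_one_povm(5)        # N = 5 > 2: an overcomplete set
resolution = matrix_sum(povm)  # should equal the 2x2 unit matrix
```

Each element has vanishing determinant, i.e. it is proportional to a one-dimensional projection, while the sum reproduces the unit matrix up to rounding errors.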
7.7.3
Maximality and completeness of observables
The concept of a maximal POVM marks a typical difference between quantum mechanics and classical mechanics. Whereas there exists in classical mechanics only one single maximal observable (viz, the complete FVM defined in
section 1.10.3), there are many different ones in quantum mechanics. This is true already for standard quantum mechanics, in which a maximal PVM corresponds to every orthonormal basis of vectors in Hilbert space, yielding information only on the diagonal elements of the density operator in the pertinent representation. This multitude of maximal observables is a consequence of the fact that the state space of quantum mechanics is a vector space, and, hence, not a simplex (cf. appendix A.12.2). Generalized quantum mechanics demonstrates that a maximal observable need not be associated with an orthonormal basis: maximal observables (POVMs) also exist that correspond to overcomplete sets of vectors. In section 7.9.1 an example of a measurement procedure will be discussed, corresponding to a maximal POVM that is not a PVM (cf. (7.73)). Since the state space in generalized quantum mechanics is not different from the one assumed in the standard formalism, the concept of ‘maximality of an observable’ is not altered in an essential way by the generalization: a maximal observable yields information that is related to a set of one-dimensional subspaces of the vector space. The generalization merely increases the number of maximal observables by admitting observables yielding information on certain sets of one-dimensional subspaces corresponding to overcomplete sets of vectors.
It is important not to confuse the maximality of an observable with the question of to what extent a measurement yields maximal information on the state (better: on the state preparation) of an object, in the sense that this information is complete, i.e. sufficient for completely determining the density operator. In classical statistical mechanics these questions are equivalent: the maximal FVM yields complete (hence, maximal) information. In quantum mechanics this is quite different. Insofar as maximal observables yield maximal information here, this need not imply completeness of the information.
There exist maximal observables yielding very incomplete information on the state (like the maximal standard observables, cf. section 7.6.5). We shall, however, also encounter in section 7.9.4 generalized maximal observables yielding complete information on the state: if the observable yields information on a sufficient number of one-dimensional subspaces, then it is possible to reconstruct the density operator from the probability distribution of that observable. Hence, in contrast to standard quantum mechanics, in the generalized theory complete measurements do exist. In this respect the typical difference between classical statistical mechanics and quantum mechanics is not so much embodied by the concept of a complete measurement, existing in both theories (at least, when quantum mechanics is not restricted to the standard formalism), but rather by the question of whether completeness and maximality go together or not.
Both in the standard and in the generalized formalism a one-dimensional subspace corresponds to maximal information, in the sense that this information cannot be improved within a certain set of measurement procedures. When an observable is not maximal it yields information only with respect to higher-dimensional subspaces, and will certainly not be complete. However, maximality need not imply completeness. Whether a maximal observable is complete depends on the number of one-dimensional subspaces it yields information on. A maximal PVM is always incomplete. A maximal POVM may be either incomplete or complete, depending on the number of linearly independent elements. In an $n$-dimensional Hilbert space a NODI with linearly independent elements must have $n^2$ elements in order to represent a complete observable (see also section 3.3.6).
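The difference between maximality and completeness can be made concrete for a two-dimensional Hilbert space. The sketch below is an illustration under stated assumptions and is not an example taken from the text. It assumes a POVM with four elements (1/4)(I + n_k·σ), where the unit vectors n_k point to the corners of a regular tetrahedron, and a state written as (1/2)(I + r·σ) with Bloch vector r. Each element is proportional to a one-dimensional projection, and the four elements are linearly independent, so the statistics should determine the state completely. The names `DIRECTIONS`, `probabilities` and `reconstruct` are hypothetical.

```python
import math

# Four unit vectors pointing to the corners of a regular tetrahedron
# (an assumed, illustrative choice of a maximal POVM with 2**2 = 4 elements).
S = 1.0 / math.sqrt(3.0)
DIRECTIONS = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]


def probabilities(bloch):
    """p_k = Tr(rho M_k) = (1 + n_k . r) / 4 for rho = (I + r . sigma) / 2."""
    return [0.25 * (1.0 + sum(n_i * r_i for n_i, r_i in zip(n, bloch)))
            for n in DIRECTIONS]


def reconstruct(probs):
    """Recover the Bloch vector from the statistics: r = 3 * sum_k p_k n_k.

    This uses sum_k n_k = 0 and sum_k n_k n_k^T = (4/3) I for the tetrahedron.
    """
    return tuple(3.0 * sum(p * n[i] for p, n in zip(probs, DIRECTIONS))
                 for i in range(3))


state = (0.3, -0.2, 0.5)          # Bloch vector of a mixed state, |r| < 1
probs = probabilities(state)      # probability distribution of the POVM
recovered = reconstruct(probs)    # equals `state` up to rounding errors
```

By contrast, a maximal PVM in the same space has only two elements and fixes a single component of r, which illustrates why it is always incomplete.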
7.8
Measures of nonideality or inaccuracy
There are different ways to characterize the extent to which a nonideality matrix satisfying deviates from the unit matrix (corresponding to an ideal measurement). For square N × N matrices one possibility for a measure of the nonideality is the quantity
in which are the eigenvalues (not necessarily real) of the matrix The eigenvalues of a stochastic matrix satisfy (cf. appendix A. 13) implying It is clear that if i.e. when there is no crossing of signals between the sub-channels of the transmission channel (cf. figure 7.5). Since and since the absolute value of a determinant is invariant under a permutation of the rows or the columns of a matrix, the measure does not change under such a permutation. This means that for nonideal measurements differing only by the labeling of the input or output channels (cf. section 7.6.2) the measure is the same. In this respect is a suitable measure. The measure (7.55) has also a disadvantage, however. It is maximal if one eigenvalue This, for instance, is the case if (the POVM is uninformative then). Actually, apart from the eigenvalue (which is always there, cf. appendix A.13), in this example all eigenvalues are zero. However, is already satisfied if there is only one single eigenvalue This already obtains if the elements of one single row of the matrix are all equal. This demonstrates the relative unsuitability of the measure (7.55) for a characterization of the measure of nonideality of a stochastic matrix if N > 2. In this respect a better measure would be given by the sum of all nondiagonal elements of
which, using (7.35), can be reduced to
This expression can easily be proven to be equal to
once again the eigenvalues of (this expression is real because with also is an eigenvalue (cf. appendix A.13)). This measure, too, vanishes in the case of an ideal measurement, and is positive if the POVM is uninformative (then It, however, makes a distinction between nonideality matrices having different numbers of vanishing eigenvalues. A disadvantage of the measure (7.56) is that it is not invariant under a permutation of the rows or the columns of the matrix. For instance, in a transition from to the eigenvalues change from to the latter case implying For this reason this measure is useful mainly for comparing nonideal measurements the elements of which do not differ too much.
We shall now discuss a couple of nonideality measures for the nonideality relation which do not have the disadvantages of the above measures, and which are also suitable for non-square nonideality matrices. These measures are used in the theory of transmission channels already alluded to in section 7.6.2. The point of departure is the concept of mutual information (cf. (A.95)) of the input and output probability distributions and respectively, of a transmission channel, these probability distributions being marginals of the bivariate distribution In this expression is the Shannon entropy defined in section 1.7.2. The mutual information is a measure of the correlation between the two probability distributions and, hence, a measure of the quality of a transmission channel. If there is no correlation at all, i.e. then conversely, implies absence of correlation; this holds if in which case POVM is uninformative. As proven in appendix A.11.2, Since and satisfy (7.37), it is possible to write the mutual information as
The quantity thus obtained is a functional of the matrix and the probability distribution and is denoted as
Shannon [323] has introduced the channel capacity as a measure of the quality of a transmission channel described by nonideality matrix viz,
being the supremum of the mutual information for the set of all possible input distributions This measure is invariant under a permutation of the rows or the columns of the nonideality matrix. We now prove the following properties of the channel capacity
N = number of input channels. number of output channels.
If the nonideality matrix is the unit matrix, then the channel capacity equals ln N.
The channel capacity is invariant under trivial refinement and trivial coarsening.
Proof:
From (7.57) it follows that
Due to of (A.86) we obtain
This inequality is also satisfied for sup – Suppose Then This directly implies that hence – Suppose Then Hence, for all input distributions input and output distributions are statistically independent:
Variation of this expression under the constraint yields
Since this is true for any input distribution satisfied by
immediately
the inequality must also be
implies, once more, an analogous inequality for If
and
then
This implies (cf. section 1.7.2) that The output observable has as a trivial refinement (cf. section 7.6.6) the POVM Then:
and
From the last equality it follows that the channel capacities of all POVMs belonging to the same equivalence class (cf. section 7.6.7) are the same. This makes this measure particularly suitable for the description of the quality of a nonideal measurement. Note that, in contrast to the measures (7.55) and (7.56), is a measure of ideality or accuracy, being larger as the nonideality of the measurement is smaller. The quantity log N can serve as the corresponding measure of nonideality. This latter measure vanishes if and only if is a trivial refinement or coarsening of Because Shannon’s channel capacity is mathematically rather exacting, in the following we shall often use a somewhat different quantity, viz,
which, using (7.57), can be written as
This quantity, too, is invariant under permutations of the rows or columns of the nonideality matrix Writing this quantity as
we see that is just the average row entropy of the matrix averaging being performed with relative weights The quantity is a measure of nonideality satisfying
If and are stochastic matrices, both satisfying the conditions of (7.37), then
Hence, gives an excellent impression of the measure of nonideality of a transmission channel described by nonideality matrix As an application let us compare the two polarization measurements represented by the POVMs (7.34) and (7.36). We find
It is not difficult to prove that if This inequality is a correct expression of the fact that the procedure corresponding to (7.36) is a less nonideal measurement of observable than the first one, i.e. it is yielding more accurate information on the input distribution. Using (7.61) it is easily seen that Theorem: If the elements of the NODI generating POVM are linearly independent, then two equivalent nonideal measurements of POVM have equal nonideality measures Thus, let the NODIs generating POVMs and satisfy
in which and are nonideality matrices. Then:
Proof: Since, due to the linear independence of the elements of it follows directly from (7.61) that both
we have and
and
The average row entropy (7.60) is a particularly suitable measure if we want to study nonideal measurement of a maximal PVM. In this case each input channel corresponds to a one-dimensional subspace, enabling the attribution of equal weights to all input channels as is done in (7.60). In the case of a non-maximal PVM the weights are not equal. It seems appropriate to take this into account in the definition of the measure. This is possible by choosing rather than yielding the nonideality measure
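The information-theoretic measures of this section can be illustrated numerically. The following is a minimal sketch, not the expressions of the text: it assumes that the matrix element `lam[m][n]` is the probability of output m given input n (so that the columns of the matrix sum to one), and it uses the standard form of the mutual information of a transmission channel. The channel capacity is approximated by a crude grid search over input distributions for two input channels. The names `mutual_information` and `capacity_two_inputs` and the three example matrices are hypothetical.

```python
import math


def mutual_information(p, lam):
    """I = sum_{m,n} p[n] lam[m][n] ln(lam[m][n] / r[m]), r[m] = sum_n lam[m][n] p[n].

    Assumed convention: lam[m][n] = probability of output m given input n.
    """
    r = [sum(row[n] * p[n] for n in range(len(p))) for row in lam]
    total = 0.0
    for m, row in enumerate(lam):
        for n, p_n in enumerate(p):
            joint = p_n * row[n]
            if joint > 0.0:  # terms with zero probability do not contribute
                total += joint * math.log(row[n] / r[m])
    return total


def capacity_two_inputs(lam, steps=2000):
    """Approximate the supremum over input distributions (q, 1 - q) on a grid."""
    return max(mutual_information([k / steps, 1.0 - k / steps], lam)
               for k in range(steps + 1))


ideal = [[1.0, 0.0], [0.0, 1.0]]          # unit matrix: ideal measurement
uninformative = [[0.3, 0.3], [0.7, 0.7]]  # output independent of input
noisy = [[0.9, 0.2], [0.1, 0.8]]          # some crossing between sub-channels

c_ideal = capacity_two_inputs(ideal)
c_uninformative = capacity_two_inputs(uninformative)
c_noisy = capacity_two_inputs(noisy)
```

Under these assumptions the unit matrix gives the value ln 2, the uninformative matrix gives zero, and the noisy matrix gives a value in between, in line with the interpretation of the channel capacity as a measure of ideality.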
7.9
Joint nonideal measurement of two observables
7.9.1
Definition and examples
In section 1.9.2 we have seen that simultaneous or joint measurement of two standard observables is possible if and only if they are represented by commuting PVMs. The generalized formalism admits a generalization of the notion of a joint measurement, to the effect that a joint nonideal measurement of two standard observables is possible even if these are not represented by commuting PVMs. Simultaneous or joint nonideal measurement of two observables (not necessarily standard observables) is defined as follows:
Joint nonideal measurement: Two observables, represented by POVMs and are jointly nonideally measurable if a measurement arrangement exists, represented by a bivariate POVM such that its marginals and are POVMs representing nonideal measurements of and respectively. Thus, the corresponding NODIs satisfy
Example 1: Joint nonideal measurement of incompatible polarization observables (using nicols) A simple example of a joint nonideal measurement was already introduced in section 7.5 (cf. figure 7.4). For simplicity we restrict ourselves to 100% efficient detectors The POVM can be written in the form of a bivariate POVM according to
(this POVM was already introduced in section 1.9.2 as an example of a joint measurement of two POVMs, cf. (1.102)). The expectation value can indeed be interpreted as the joint probability for both detectors of having response the probability of (+, +) vanishing because an individual photon does not trigger both detectors. The marginals of POVM (7.64) are found as and respectively. It is possible to write these marginals in matrix form according to
From this it is clear that these expressions satisfy the definition of a joint nonideal measurement of the two polarization observables and respectively. These are incompatible standard observables if the angles and do not differ by a multiple of The nonideality matrices and are given by
The experimental arrangement of figure 7.4 differs in a fundamental way from the measurement arrangement realizing Ludwig’s alternative definition, discussed in section 7.6.4, according to which there is a mixture of experimental arrangements in which a perfectly reflecting mirror is present in a fraction and absent in a fraction of the individual measurements. The POVM representing Ludwig’s alternative definition is The alternative measurement setup differs from that of figure 7.4 in that not only the photon goes one way or the other, but that the whole wave packet does so, thus causing a polarization measurement to be carried out either in direction or in direction It is unfortunate that the picture of a photon going one way or the other is also
frequently used in the arrangement containing a partially transparent mirror, thus suggesting that, depending on the path the photon would have chosen, only one of the observables would actually be measured on each photon. We must, however, recall here the discussion on the reality of photons (cf. section 2.4), which calls into question whether such a picture is adequate. The possibility of interference of the two paths is fundamentally determined by the fact that the wave packet splits, one part taking one path and one part the other one, coherence between the two parts being preserved. It is precisely this latter aspect that distinguishes the experiment discussed here from the alternative measurement procedure. This distinction will play an important role in our analysis of the Bell inequalities (cf. section 9.3). The POVM based on Ludwig’s alternative definition can be interpreted as representing a nonideal measurement of the polarization observable given by
in which and are defined by (7.27) and (7.28). Thus,
Hence, if analyzed in terms of Ludwig’s alternative observable, the mixture of measurements is yielding nonideal information on a (standard) polarization observable in a direction lying in between and Since the nonideality matrix in (7.69) is invertible, this measurement is informationally equivalent to an ideal measurement of the corresponding PVM Comparing this with the joint nonideal measurement represented by POVM (7.64) we see that this latter measurement contains much more information, because the probability distributions of and of can both be calculated. By the analysis based on Ludwig’s definition information is neglected that is present in the experiment: the presence or absence of a perfectly reflecting mirror (which can be observed!) determines whether the measurement of or the measurement of is performed. Therefore, in the alternative measurement procedure we are able to determine experimentally the corresponding probabilities, conditional on each of the two different measurement arrangements. The experiment actually consists of two separate measurements of the observables and and, hence, is actually yielding information on the probability distributions of both observables. This, however, is not reflected in the analysis based on Ludwig’s mixture of observables, thus illustrating the reservations, voiced in section 7.6.4, with respect to the usefulness of the concept of ‘reducible observables’.
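The structure of the joint nonideal measurement of Example 1 can be checked numerically. The sketch below is a reconstruction under assumptions, since the explicit form of the POVM is not reproduced here: it assumes a mirror parameter `gamma` between 0 and 1, real polarization projections at angles `theta` and `theta_prime`, and a bivariate POVM whose (+, +) element vanishes, whose (+, −) and (−, +) elements are `gamma` and `1 - gamma` times the respective ‘+’ projections, and whose (−, −) element is fixed by normalization. Which of the two weights belongs to which detector is part of the assumption. All function names are hypothetical.

```python
import math


def projector(theta):
    """Projection on linear polarization at angle theta (real 2x2 matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def combine(*terms):
    """Linear combination sum_i coeff_i * matrix_i of 2x2 matrices."""
    return [[sum(coeff * mat[i][j] for coeff, mat in terms) for j in range(2)]
            for i in range(2)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]


def bivariate_povm(gamma, theta, theta_prime):
    """Assumed bivariate POVM, indexed by the responses of the two detectors."""
    e_plus, f_plus = projector(theta), projector(theta_prime)
    e_minus = combine((1.0, IDENTITY), (-1.0, e_plus))
    f_minus = combine((1.0, IDENTITY), (-1.0, f_plus))
    return {
        ("+", "+"): ZERO,  # a single photon never triggers both detectors
        ("+", "-"): combine((gamma, e_plus)),
        ("-", "+"): combine((1.0 - gamma, f_plus)),
        ("-", "-"): combine((gamma, e_minus), (1.0 - gamma, f_minus)),
    }


povm = bivariate_povm(0.6, 0.0, math.pi / 4)
total = combine(*[(1.0, op) for op in povm.values()])
# '+' elements of the two marginals
first_plus = combine((1.0, povm[("+", "+")]), (1.0, povm[("+", "-")]))
second_plus = combine((1.0, povm[("+", "+")]), (1.0, povm[("-", "+")]))
```

Under these assumptions the four elements add up to the unit matrix, and the ‘+’ elements of the marginals are 0.6 and 0.4 times the projections at the two angles, so that each marginal is a nonideal version of one of the two polarization PVMs.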
Example 2: Joint nonideal measurement of incompatible polarization observables (using birefringent crystals) In the case of birefringent crystals (instead of nicols) we found in section 7.5 the POVM Its NODI can be written in a bivariate form according to
For the marginals we obtain
(compare (7.36)), and
respectively. In agreement with definition (7.63) these marginals once again represent nonideal measurements of the polarization observables and The operators in NODI (7.70) are linearly dependent. They can be replaced by the informationally equivalent set which can also be obtained by taking (note that these operators are still linearly dependent because The operators can be arranged in a bivariate NODI, viz,
Using (7.27) and (7.28) it is easy to see that the POVMs generated by the marginals of this NODI represent nonideal measurements of polarization observables, although these are not the PVMs and but two different PVMs and the first one being given by (7.68), and the second one by
Applying the definition of joint nonideal measurement we find, apart from the nonideality matrix defined in (7.69), for the nonideal measurement of the nonideality matrix
Of course, it also is possible to analyze the measurement data in the manner of (7.70). This means that, depending on the way the data are processed, the experiment can be interpreted as a joint nonideal measurement of different pairs of PVMs. In a realist interpretation of the formalism this would imply that the properties of the object would be dependent on the manner of data processing. This does not seem to be very fortunate. In an empiricist interpretation we do not have any problem since here the observable is only a label of a measurement procedure, of which the processing of data can be considered a part. It is not impossible that one method of processing the measurement data could be preferable over another one. Thus, the bivariate POVM (7.70) might be preferable over (7.73) because the marginals of the first one are representing joint nonideal measurements of a fixed pair of PVMs (viz, the pair and , whereas in the latter case the PVMs and contain the parameter Although, in a strict sense, in this latter case for each value of the definition of a joint nonideal measurement is satisfied, it seems that an analysis in which only the nonideality matrices are dependent on the parameter, has a number of advantages (see also section 7.9.3). Example 3: ‘Four-port’ homodyne optical detection Another example of a joint nonideal measurement of two incompatible standard observables is ‘four-port’ homodyne optical detection. This is a combination, as given in figure 7.8, of the two optical homodyning detection methods discussed in section 7.4, in which there are now two output signals and in detectors and respectively. The joint probability distribution of and was first calculated by Yuen and Shapiro [317] (also [324]). An alternative derivation will be given in section 8.4.3. In this derivation the mirrors of the Mach-Zehnder interferometer have transmission coefficients as depicted in figure 7.8 etc.,
detectors and are assumed to have equal quantum efficiency Here we shall restrict ourselves to the case and
The joint probability distribution of and is given by the expectation value of a POVM in the initial (signal) state of the (monochromatic) signal S, with given by (8.79). For the experimental conditions specified above, this becomes (cf. (8.80)):
Here is the projection operator on the coherent state given in appendix A.4.
The marginals of this POVM are found straightforwardly according to (cf. (8.77))
in which and have the same meaning as in section 7.4. The functions and respectively, are given by
These functions satisfy the requirement (7.33) of nonideality functions of nonideal measurements of and P, respectively. For this reason the ‘four-port’ homodyne detection method can be interpreted as a joint nonideal measurement of the standard observables and P (see also Arthurs and Kelly [325]). In section 8.4.3 we also shall derive the POVM representing an ‘eight-port’ homodyning experiment (cf. figure 8.14), further improving the signal-to-noise ratio. This measurement procedure can be interpreted as a joint nonideal measurement of and P, too, with Gaussians as nonideality functions. Instead of (7.77) the standard deviations have the smaller values (cf. (8.85))
If this latter experiment is represented by the POVM (cf. (8.84)), with coherent states with ranging over the whole complex plane
7.9.2
Measurement of a PVM as a joint nonideal measurement of incompatible observables
It is interesting to consider the foregoing examples for certain limit values of the parameters.
Example 1: In the limit POVM (7.64) reduces to
This POVM is equivalent to the PVM of an ideal polarization measurement. It is possible, however, to maintain also in this limit an interpretation of the measurement as a joint nonideal measurement of PVMs and Indeed, expressions (7.65) and (7.66) remain valid for For this value of the marginal (7.66) reduces to the uninformative POVM {O, I}. Hence, although it is possible to interpret the ideal measurement as a joint nonideal measurement of incompatible observables, ideality of the measurement is accompanied by failure to obtain any information on the incompatible observable For
we find from (7.79) the joint probabilities
Evidently, this measurement procedure attributes to observable the value –. This, analogously, holds true for every polarization observable incompatible with For we obtain
implying thus yielding for observable the value –.
The analysis of ideal measurements of standard observables as joint nonideal measurements of incompatible ones should make us cautious with respect to the interpretation of experimental results. We note that the ordering of rows and columns in the matrix (7.64) is arbitrary. Interchanging two columns of this matrix would, for yield attributing to observable the value +. However, for we now obtain As a result of interchanging columns in the state for the measurement result is obtained with certainty. In an empiricist interpretation this can be understood because interchanging columns just corresponds to a relabeling of the output states of the transmission channel. This illustrates the relative
unimportance, in an empiricist interpretation, of choosing the (eigen)values of an observable. Since it seems wise, however, to use a notation that is as consistent as possible, on interchanging columns it would seem preferable to also replace by The characteristic of nonideality in the limits considered here agrees completely with Bohr’s views on complementarity as discussed in section 4.4, and with Heisenberg’s idea that, on measuring a certain observable, incompatible observables are disturbed, in certain cases even wiping out all information on these latter observables. Bohr and Heisenberg did not yet have at their disposal the concept of a generalized observable, and, for this reason, were bound to restrict their considerations to standard observables corresponding to PVMs. As a consequence, they were able only to make statements, mathematically implementable at that time, with respect to the limiting cases considered above. It is rather unfortunate that they did not seem to be aware of this limitation. In their analysis of the ‘thought experiments’ (cf. chapter 4) they also made statements referring to experimental situations corresponding to intermediate values of the parameters, which are mathematically implementable only in the generalized formalism. As already noted in section 4.5.1, as a result of this considerable confusion has arisen with respect to the concept of ‘indeterminacy’. In section 7.10 this will be discussed more fully. It will be shown there that in the generalized formalism this confusion can be removed completely. By equating (7.79), (7.80) and (7.64) we see that
Evidently, the POVM (7.64) of a joint nonideal measurement of two polarization observables can be written in the form of a reducible observable in the sense of Ludwig’s alternative definition (section 7.6.4). In doing so it is essential that all POVMs be written in a bivariate form, i.e. that also the two ideal polarization measurements are interpreted as joint nonideal measurements of both polarizations. In contradiction to the analysis in terms of the univariate POVM (7.69), Ludwig’s alternative (7.81) yields a POVM that is informationally equivalent to the POVM of the joint nonideal measurement. This can be traced back to the fact that in the bivariate POVM correlation between incompatible polarization observables is taken into account also in the case of ideal measurements. In the univariate analysis the information about this correlation is lost. Although, as is evident from (7.81), an analysis of the experiment represented by POVM (7.64) in terms of a reducible observable is possible without loss of information, such an analysis appears to be superfluous. Indeed, loss of information can be prevented only if joint measurement of incompatible observables is taken into account. If this is done, however, then the introduction of reducible observables is not necessary. Example 2: For the bivariate POVM (7.70) the marginals for the limit values of are essentially the same as in example 1. For the bivariate POVM (7.73) this is different, however.
Both marginals now have (apart from permutation) equal PVMs as limits for for Although this manner of analysis is perfectly possible in principle, it seems to be less attractive because it yields less information than the analysis based on (7.70): in particular, it is no longer evident that the nonideal measurement of the incompatible observable becomes uninformative in the limit. Another reason to prefer the first manner of analysis will be encountered in section 7.9.3.
Example 3: The example of ‘four-port’ homodyne optical detection, as given by the bivariate POVM (7.75), once again satisfies the property that, on varying the parameters of the measurement arrangement, it remains a joint nonideal measurement of a fixed pair of observables ( and P, or, rather, the corresponding PVMs). Only the nonideality functions (7.77) are dependent on the parameters. There are no values of the parameters for which the ‘four-port’ experiment can be interpreted as a measurement of a PVM. For ‘eight-port’ homodyning it can be seen from (7.78) that the (unattainable) limits or would provide ideal measurements of and P, respectively. It also is evident that if one marginal agrees with a PVM then the other one becomes uninformative.
7.9.3
Wigner measures
A Wigner measure is defined as follows. Let be the POVM of a joint nonideal measurement of POVMs and as in definition (7.63). Let and be the inverses of the nonideality matrices and respectively. We now define the operators
The operators generate an operator-valued measure (cf. appendix A. 12.3), since, due to (7.42) and an analogous relation for we have
It, however, is not a POVM since the operators may be non-positive as a consequence of negativity of the matrix elements of the inverse matrices. The marginals of the matrix are important. We obtain:
the last equality following from (7.42). Analogously we find
The results (7.84) and (7.85) are interesting because they make clear that, at least in principle, the Wigner measure allows one to calculate exactly the probability distributions of the observables and measured jointly but nonideally. By calculating the Wigner measure from the POVM the measurement disturbance, being represented by the nonideality matrices and is, in a sense, compensated for. This makes it possible to obtain both probability distributions of the ideal measurements of the observables and from the results of the joint nonideal measurement. Note, however, that this cannot be interpreted as if any influence of the measurement had thereby been compensated completely, in the sense that the probability distributions of and could be considered as objective properties of (the ensemble of) the microscopic object: the POVMs and could themselves represent nonideal measurements of other observables. Even if and are maximal observables, a transition from an empiricist to a realist interpretation would be necessary for such an interpretation. Insofar as quantum mechanics is able to yield information on microscopic reality, however, the Wigner measure in any case seems to contain information on the microscopic object that is maximally independent of the way it is obtained.
In order to gain some insight into the meaning of the Wigner measure, this quantity will now be calculated for the examples of joint nonideal measurements given in section 7.9.1.
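The compensation of the measurement disturbance can be illustrated with a small numerical sketch. It rests on assumptions that should be read as such: the Wigner measure is taken to be obtained by applying the inverses of the two nonideality matrices to the row and column index of the bivariate POVM, and the POVM is the assumed form of the nicol arrangement with mirror parameter `gamma` (vanishing (+, +) element, `gamma` and `1 - gamma` times the ‘+’ projections off the diagonal). Index 0 stands for ‘+’ and index 1 for ‘−’; all names are hypothetical.

```python
import math


def projector(theta):
    """Projection on linear polarization at angle theta (real 2x2 matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def combine(*terms):
    """Linear combination sum_i coeff_i * matrix_i of 2x2 matrices."""
    return [[sum(coeff * mat[i][j] for coeff, mat in terms) for j in range(2)]
            for i in range(2)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]


def wigner_measure(gamma, theta, theta_prime):
    """W[a][b] = sum_{m,n} lam_inv[a][m] * mu_inv[b][n] * R[m][n] (assumed form)."""
    e_plus, f_plus = projector(theta), projector(theta_prime)
    # Assumed bivariate POVM R[m][n] of the joint nonideal measurement.
    povm = [[ZERO, combine((gamma, e_plus))],
            [combine((1.0 - gamma, f_plus)),
             combine((1.0, IDENTITY), (-gamma, e_plus), (gamma - 1.0, f_plus))]]
    # Inverses of the nonideality matrices [[gamma, 0], [1 - gamma, 1]] and
    # [[1 - gamma, 0], [gamma, 1]] of the two marginals.
    lam_inv = [[1.0 / gamma, 0.0], [-(1.0 - gamma) / gamma, 1.0]]
    mu_inv = [[1.0 / (1.0 - gamma), 0.0], [-gamma / (1.0 - gamma), 1.0]]
    return [[combine(*[(lam_inv[a][m] * mu_inv[b][n], povm[m][n])
                       for m in range(2) for n in range(2)])
             for b in range(2)] for a in range(2)]


w_a = wigner_measure(0.3, 0.0, math.pi / 4)
w_b = wigner_measure(0.8, 0.0, math.pi / 4)
```

Under these assumptions `w_a` and `w_b` coincide, so the result does not depend on `gamma`; its marginals are the two polarization projections, and its (−, −) element has a negative determinant, showing that it is not a positive operator.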
Example 1: For the joint photon polarization measurement using nicols we find, analogously to (7.40), the inverses of the nonideality matrices (7.67) as
By direct calculation we then find the Wigner measure (7.82) corresponding to POVM (7.64) as
Relations (7.84) and (7.85) are evidently satisfied. It is important to note that this Wigner measure is independent of the parameter Example 2:
In considering the joint polarization measurement using birefringent crystals we encounter a complication, viz. that the nonideality matrices given in (7.71) and (7.72) are not square. Notwithstanding this, it is possible to find (left) inverses
7.9. JOINT NONIDEAL MEASUREMENT
385
satisfying (7.42). We get (compare (7.43)):
arbitrary. As the Wigner measure we find from this:
For arbitrary values of these parameters this Wigner measure yields the two PVMs as marginals. For a special choice of the parameters it reduces to
For this choice of parameters the Wigner measure evidently does not depend on the parameters of the measurement arrangement. We could compare (7.90) with the Wigner measure obtained from the bivariate POVM (7.73). We shall not calculate the latter explicitly, however. The reason is that this Wigner measure, unlike (7.90), is not independent of the parameter of the measurement arrangement (this can already be seen by considering the marginals of the Wigner measure, which in this case are the PVMs of (7.74), themselves depending on that parameter). The parameter dependence of the Wigner measure is, hence, closely connected to the possibility, already observed in section 7.9.1, that with certain methods of data processing the measurement can be interpreted as a joint nonideal measurement of two PVMs that depend on the value of that parameter. Although, by itself, this is not objectionable, we prefer a method of data processing in terms of a fixed pair of observables, because then the information contained in the Wigner measure is maximally independent of the measurement arrangement.
Example 3: For the case of ‘four-port’ optical homodyne detection with POVM (7.75) and nonideality functions (7.77) we find the inverses, analogously to (7.41), as
with
and
as given in (7.77). A calculation of the Wigner measure can easily be performed, yielding
Comparing this to (1.146) (with we see that the expectation value of the Wigner measure (7.93) is the well-known Wigner distribution 9 (1.127), which is informationally equivalent to the density operator As already given in section 1.11.2 its marginals satisfy
and, hence,
From (7.94) it is seen that the joint probability distribution of the ‘four-port’ homodyne optical detection experiment contains sufficient information, in principle, for calculating exactly the probability distributions of both Q and P. Hence, in this sense the joint nonideal measurement of the two incompatible observables yields accurate information on both observables separately. In practice this is not always feasible, however: due to the strong singularity of the inverse functions (7.91), the precision with which the measured probability distribution is known will often be too small to allow a reliable determination. It is interesting to note that not only the marginals (7.84)-(7.85) of the Wigner measures (7.87) and (7.93), but even these Wigner measures themselves are independent of the parameters of the measurement arrangement that are present in the POVMs (7.64) and (7.75). Evidently, in the calculation of the Wigner measure the disturbing influence of the measurement arrangement, described by the nonideality matrices, is completely compensated. Although in an empiricist interpretation of the formalism this does not imply that the Wigner measure represents objective information on the object itself, it does seem possible to consider the Wigner measure as information about (certain aspects of) the preparation, not influenced by measurements to be performed later. The example of POVM (7.73) shows that the manner of data processing can be important for the question of whether we are successful in completely compensating for the disturbing influence of the measurement. The existence of a possibility to achieve this goal demonstrates that the analogy between measurements in classical mechanics and quantum mechanics may be greater than is supposed in Bohr’s ‘quantum postulate’ (section 4.4). Admittedly, due to the interaction between object and measuring instrument there is a disturbing influence on the measurement result, described by the POVM and the nonideality matrices. However, the limitation of our knowledge about the object (or, rather, the preparation) by this measurement disturbance seems to be less fundamental than supposed in Bohr’s complementarity principle: the measurement process can be analyzed, and the measurement disturbance compensated for. Of course, this compensation regards only the statistical information. We shall see in chapter 9 that it is impossible to interpret individual measurement results of ideal measurements as objective properties to be attributed to the object prior to measurement.
9 For this reason the pertinent bivariate OVM is referred to as a ‘Wigner measure’.
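The statement that the marginals of the Wigner distribution reproduce the probability distributions of the separate observables can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not taken from the book: it uses the harmonic-oscillator vacuum state in units with ħ = 1, for which the Wigner function is a Gaussian, and integrates it over p with a midpoint rule. The function names and the integration parameters are hypothetical choices.

```python
import math


def wigner_vacuum(q, p):
    """Wigner function of the oscillator vacuum (assumed convention: hbar = 1)."""
    return math.exp(-(q * q + p * p)) / math.pi


def marginal_q(q, p_max=8.0, steps=4000):
    """Integrate the Wigner function over p on [-p_max, p_max] (midpoint rule)."""
    dp = 2.0 * p_max / steps
    total = 0.0
    for k in range(steps):
        p = -p_max + (k + 0.5) * dp
        total += wigner_vacuum(q, p)
    return total * dp


def exact_q_density(q):
    """Position probability density of the vacuum state in the same units."""
    return math.exp(-q * q) / math.sqrt(math.pi)


for q in (0.0, 0.5, 1.5):
    print(q, marginal_q(q), exact_q_density(q))
```

The two printed columns agree to within the accuracy of the numerical integration. The reverse step discussed in the text, obtaining the Wigner distribution from a smoothed joint distribution, involves a deconvolution and is far more sensitive to statistical noise than this forward check.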
7.9.4 From standard measurements to complete measurements
The examples discussed in section 7.9.3 demonstrate that, with respect to the information provided, a measurement of a POVM can differ in an essential way from a measurement of a standard observable. As noted in section 3.3.6, for the latter measurements the information obtained on the state is very limited; only the probability distribution corresponding to the expectation values of one single PVM is determined. A (nonideal) joint measurement of two incompatible standard observables can yield more information. Thus, by the POVM (7.64), or the Wigner measure (7.87), the probability distributions of both PVMs are determined. Since there is a joint probability distribution, such generalized measurements even yield (nonideal) information on the correlation between incompatible standard observables. Sometimes a joint nonideal measurement of incompatible standard observables even allows one to reconstruct the density operator completely. This holds true, for instance, for the example of ‘four-port’ homodyne optical detection (example 3 of section 7.9.3; see also section 8.4.3), as follows from the equality of the expectation value of the Wigner measure and the Wigner distribution. ‘Four-port’ homodyne optical detection is a complete measurement (cf. section 7.7.3). Examples 1 and 2 are not complete measurements, since in a two-dimensional Hilbert space it is not possible to completely determine the density operator of the initial state from knowledge of two probability distributions of PVMs. In these examples neither the POVM nor the Wigner measure contains any information beyond that yielded by the two measurements of the PVMs performed separately. Such measurements will be referred to as trivial joint measurements. Recently there has been much interest in the problem of reconstructing the initial state of the object on the basis of information provided by quantum measurements [326].
By measuring a sufficient number of incompatible standard observables (so-called quantum tomography, Vogel and Risken [327]; see also section 8.4.4) it is
possible to completely determine the density operator. Band and Park [328] have determined how many standard observables should be measured for a complete determination of the density operator. Such a set of observables is called a quorum. It is not difficult to see that, for instance, for the polarization degree of freedom described by a density operator in a 2-dimensional Hilbert space, standard polarization observables should be measured in at least three linearly independent directions. In an infinite-dimensional vector space a quorum consists of an infinite number of standard observables, thus excluding in actual practice an exact determination of the density operator if these observables are to be measured separately. Generalized measurements provide an opportunity to obtain a comparable result using one single measurement arrangement. Thus, in ‘four-port’ and ‘eight-port’ homodyne optical detection experiments it is possible, in principle, to calculate, analogously to (7.82), the Wigner distribution from the measured joint probability distribution. This implies that one single measurement arrangement is sufficient to determine the state completely. This is impossible in the standard formalism. Evidently, the generalization of the notion of an observable from a PVM to a POVM induces an extension of the experimental domain of quantum mechanics to fundamentally new types of measurement, ranging from measurements of standard observables via trivial joint measurements to complete measurements (ordered according to increasing information content). Of course, the experimental problem of exactly determining a probability distribution on an infinite-dimensional space is a source of inaccuracy for generalized measurements as well as for standard ones. The difference between examples 1 and 2 on the one hand, and example 3 on the other, demonstrates the capability of the generalized formalism, alluded to in section 4.4, to exhibit the difference between the notions of ‘complementarity’ and ‘completeness’.
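The quorum statement for a 2-dimensional Hilbert space can be made concrete with a small numerical sketch. It is an illustration only and makes the following assumptions: the three linearly independent directions are represented by the Pauli operators (named `SX`, `SY`, `SZ` here), and the density operator is parametrized by a hypothetical helper `bloch_state`. Three expectation values then fix the state, whereas two do not.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expectation(rho, observable):
    """Tr(rho * observable); real for Hermitian arguments."""
    product = matmul(rho, observable)
    return (product[0][0] + product[1][1]).real


def reconstruct(ex, ey, ez):
    """Density operator (I + ex*SX + ey*SY + ez*SZ) / 2 from three expectation values."""
    return [[(I2[i][j] + ex * SX[i][j] + ey * SY[i][j] + ez * SZ[i][j]) / 2
             for j in range(2)] for i in range(2)]


def bloch_state(x, y, z):
    """A density operator with the given components (needs x^2 + y^2 + z^2 <= 1)."""
    return reconstruct(x, y, z)


rho = bloch_state(0.3, -0.4, 0.5)
data = [expectation(rho, s) for s in (SX, SY, SZ)]
rebuilt = reconstruct(*data)

# Two directions are not enough: this state gives the same SX and SZ data as rho.
other = bloch_state(0.3, 0.4, 0.5)
print(data, expectation(other, SX), expectation(other, SZ))
```

The states `rho` and `other` are indistinguishable as long as only two of the three observables are measured, which is the sense in which a joint measurement determining just two PVM distributions is not complete in two dimensions.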
If a joint nonideal measurement of two incompatible standard observables is complete (as is the case in example 3), then both notions are applicable to it. However, as seen from examples 1 and 2, in general joint measurements of incompatible PVMs need not be complete. Then it is an interesting question which information is provided by the measurement (cf. section 3.3.5). It is not difficult to conceive of a complete polarization measurement. Consider for this purpose the measurement arrangement given in figure 7.9. In this arrangement we have three partially transparent mirrors, each characterized by a transmission coefficient, and polarization is measured in a number of different directions. Analogously to (7.64) the joint detection probabilities of the detectors are given by the operators
This POVM can be analyzed in different ways.
Data processing in terms of a joint measurement of two POVMs
It is possible to arrange the operators of the POVM in a bivariate way, directly generalizing (7.64):
It is now easy to see that POVM (7.96) represents a joint nonideal measurement of the POVMs and
with nonideality matrices
Note that POVMs (7.97) and (7.98) are both maximal POVMs in the sense defined in section 7.7. By inverting the two nonideality matrices we can also calculate the Wigner measure (7.82) as
It is easily verified that this matrix has the POVMs (7.97) and (7.98) as marginals.
Data processing in terms of a joint measurement of four PVMs
As in section 7.9.3, it could be objected that in the analysis given above the Wigner measure is not independent of the parameters of the measurement arrangement. This could, once again, be a consequence of the special method of data processing employed. It is, indeed, possible to find another data processing method, leading to a Wigner measure that is parameter independent. This method starts from the observation that the measurement can be interpreted as a joint nonideal measurement of four PVMs, the nonideality relations being given by
The nonideality matrices are analogous to those of (7.67). We shall denote them as The inverses of these matrices are obtained analogously to (7.86). The Wigner measure of this experiment can now be defined as
This, finally, yields the Wigner measure (7.101) according to
All parameters of the measurement arrangement have indeed disappeared from these expressions. It is straightforward to demonstrate that the density operator can be completely determined from the expectation values of this Wigner measure.
7.10 Complementarity in a joint nonideal measurement of two incompatible standard observables
7.10.1 Examples
The examples of joint nonideal measurement of two standard observables, discussed in section 7.9, exhibit a close similarity with respect to the dependence of their nonideality matrices or nonideality functions on the parameters of the measurement arrangements. In both cases the parameter values have a limit in which one observable is measured ideally, the measurement of the other observable being uninformative, and another limit in which the roles of the observables are reversed. Thus, in one limit of the parameter it follows directly from (7.67) that (7.65) becomes an ideal measurement of the first PVM, while (7.66) reduces to the uninformative POVM {O, I}. In the opposite limit the situation is reversed. Analogous remarks can be made with respect to ‘four-port’ homodyne optical detection, yielding in one limit an ideal Q measurement and an uninformative P measurement (which is reversed in the opposite limit). The dependence on the parameters of the nonideality matrices, observed here, strongly resembles the idea of complementarity in the sense of mutually exclusive measurement arrangements discussed in section 4.5: the different limits of the parameters correspond to such mutually exclusive measurement arrangements. If the parameters of the measurement arrangement are such that the measurement is ideal with respect to one of the two standard observables, then the other (incompatible) standard observable, measured jointly, is maximally nonideal, i.e. uninformative. Since only the limiting cases correspond to PVMs, Bohr and Heisenberg virtually had to restrict themselves to these limits in discussing the ‘thought experiments’ (cf. section 4.5). The generalization of the concept of ‘observable’ from a PVM to a POVM makes it possible to deal also with the intermediate situations. The nonideality measures introduced in section 7.8 can be helpful in formalizing this. Thus, for the polarization measurement with nonideality matrices (7.67) we obtain for the measures (7.55), (7.56) and (7.60) the values
In figures 7.10 and 7.11 and J are visualized, respectively. It is important to notice that both graphs satisfy the complementarity principle in the sense that not simultaneously and (for this holds analogously), and that also is impossible. The ‘four-port’ homodyne experiment is exhibiting the same characteristics of complementarity. We can take here as a measure of the nonideality of the measurement the standard deviation of the pertinent nonideality function defined by (7.77). Since for these quantities we have
once again demonstrating that is impossible. This conclusion remains unchanged for ‘eight-port’ homodyning, for which it follows from (7.78) that 1. The conclusion remains valid if the quantum efficiency is allowed to be smaller than 1 (compare (8.78) and (8.86)). The examples discussed here illustrate the impossibility of a joint measurement of incompatible standard observables in the sense that both measurements could be ideal (the fundamental impossibility of this was already demonstrated in section 1.9.2). By the examples the idea is corroborated that in a joint measurement of incompatible standard observables a mutual disturbance takes place, causing the measurement result to deviate from the value that would have been found in an ideal measurement. In agreement with the ideas of Bohr and Heisenberg, discussed in section 4.6, this disturbance can be attributed to the changes in the measurement arrangement necessary for obtaining information on the other (incompatible) observable. In section 7.10.2 we shall see that the complementary behavior of the nonideality measures, as seen in figures 7.10 and 7.11, is not an incidental property of the measurement arrangements discussed here, but has universal validity, being an essential property of the mathematical formalism of generalized observables.
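The limiting behaviour described in this section can be reproduced with explicit 2×2 operators. The sketch below is hedged: it assumes one plausible form of a bivariate polarization POVM, in which a photon is sent with probability `gamma` to an analyzer in direction `theta` and with probability `1 - gamma` to an analyzer in direction `theta_prime`. This parametrization and all names are assumptions made for illustration and need not coincide with the book's (7.64)-(7.67).

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]


def projector(theta):
    """Projection operator onto linear polarization at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def combine(a, x, b, y):
    """Return the 2x2 matrix a*x + b*y."""
    return [[a * x[i][j] + b * y[i][j] for j in range(2)] for i in range(2)]


def joint_povm(gamma, theta, theta_prime):
    """Assumed bivariate POVM; '+' means the detector behind that analyzer clicks."""
    e_plus = projector(theta)
    e_minus = combine(1.0, IDENTITY, -1.0, e_plus)
    f_plus = projector(theta_prime)
    f_minus = combine(1.0, IDENTITY, -1.0, f_plus)
    return {
        ("+", "+"): ZERO,  # a single photon cannot trigger both detectors
        ("+", "-"): combine(gamma, e_plus, 0.0, ZERO),
        ("-", "+"): combine(1.0 - gamma, f_plus, 0.0, ZERO),
        ("-", "-"): combine(gamma, e_minus, 1.0 - gamma, f_minus),
    }


def marginals(povm):
    """Sum the bivariate POVM over the second and over the first index."""
    first = {m: combine(1.0, povm[(m, "+")], 1.0, povm[(m, "-")]) for m in "+-"}
    second = {n: combine(1.0, povm[("+", n)], 1.0, povm[("-", n)]) for n in "+-"}
    return first, second


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


first, second = marginals(joint_povm(1.0, 0.3, 1.1))
print(close(first["+"], projector(0.3)))   # first marginal is the ideal PVM
print(close(second["-"], IDENTITY))        # second marginal is {O, I}
```

For `gamma = 1` the first marginal is a PVM and the second is uninformative; for `gamma = 0` the roles are interchanged, and intermediate values give two nonideal marginals at once.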
7.10.2 Martens inequality
In this section an inequality is discussed, derived for the first time by Martens [28], applicable to joint nonideal measurements of two PVMs. This inequality makes use of the entropic nonideality measures defined by (7.62) for the joint nonideal measurement of two incompatible standard observables in an N-dimensional Hilbert space. Thus,
We shall restrict ourselves, in proving the inequality, to maximal standard observables for which the operators and are one-dimensional projection operators. We can then make use of the simpler expression (7.60) for and to prove the Martens inequality:
Proof: Denoting the joint eigenvectors of the operators of PVM
From
it follows that
by
we have
It is not difficult to see that
can be written in terms of the Shannon entropy (1.43) according to
and, analogously,
In these expressions the arguments of the functions and are positive operators with traces equal to 1. Therefore it is possible to use inequality (A.93) to find lower bounds for and Taking in (A.93)
we obtain the inequality
Analogously we find
From (7.107) and (7.108) it then follows that
Since with
is a positive operator with trace 1 we can use inequality (1.84), and to find (7.106).
Analogously to (1.86) inequality (7.106) can be made sharper, to the effect that, if the standard observables and are partially compatible in the sense that they have eigenspaces in common (cf. (1.85)), then
Finally we mention the generalization of inequality (7.109) when the nonideality measure (7.62) is used. For the joint nonideal measurement of the (possibly nonmaximal) PVMs and we get the inequality [28]
This inequality can be proven in an analogous way. For infinite-dimensional Hilbert spaces and continuous spectra no inequality of the Martens type has been proven thus far (see footnote 10). Inequalities like (7.104) for ‘four-port’ homodyning, and analogous inequalities for ‘eight-port’ homodyning (compare (8.78) and (8.86)) seem to justify the expectation that things will not be qualitatively different in the infinite-dimensional case. Since ‘eight-port’ homodyning satisfies the corresponding equality, it will presumably turn out to reach the quantum limit, in the sense that no quantum mechanical measurement procedure will be able to perform better as a joint measurement of Q and P.
7.10.3 Discussion of the Martens inequality
Nonideal measurement and Heisenberg inequality
Strictly speaking, in an empiricist interpretation the standard deviation of a probability distribution is not a suitable quantity because it depends on the values of the observable. Nevertheless it is illuminating to consider the question of whether the standard deviations of the probability distributions of the two POVMs defined in (7.105) satisfy the Heisenberg inequality. It is obvious that, in general, they do not, since both probability distributions could have vanishing standard deviations (for instance, when both POVMs are uninformative). Hence, in contrast to the Martens inequality, in general the Heisenberg inequality is not a good measure of mutual disturbance in a joint nonideal measurement of incompatible PVMs in the determinative sense considered here. However, if the marginals both satisfy the condition of unbiasedness, discussed in section 7.6.5, then a Heisenberg inequality can be derived [330, 49]. This can be seen as follows: Using the notation
we have Defining
10 Some preliminary results were obtained by Dorofeev and de Graaf [329].
the standard deviation of the distribution can be written as
If the condition of unbiasedness is satisfied, then, due to (7.48) we must have Under this condition we get (with as in (1.78)), entailing Since for the other marginal an analogous inequality is obtained, we finally have for the joint nonideal measurement of observables A and B (in an obvious notation)
Such inequalities are sometimes called ‘generalized Heisenberg uncertainty relations’ [331, 332]. They strongly hinge on the requirement of unbiasedness of the measurement, and, for this reason, are not satisfied in general. In an empiricist interpretation there is no reason to try to extend the validity of the Heisenberg inequality to generalized measurements by restricting measurement procedures to unbiased ones. Analogous considerations can be based on the entropic uncertainty relation discussed in section 1.7.2. It is easily verified that the Shannon entropies of probability distributions and are related according to
which should be compared with (7.111). If and, analogously, then, using (1.84) or (1.86), analogously to (7.112) an entropic uncertainty relation could be derived for the probability distributions and A sufficient condition for this inequality to be satisfied is that the nonideality matrices and be doubly stochastic matrices, i.e. that also and thus warranting nonideality to increase entropy. The condition of double stochasticity should be compared with the condition of unbiasedness. Since this condition is not satisfied in general either, this entropic inequality has an analogous drawback to (7.112). As seen from (7.111), it is possible to distinguish between a contribution to the standard deviation stemming from the incoming signal, and a contribution due to the transmission channel, the extra spreading in the latter being represented by the quantities A disadvantage of the quantity is that it does not discriminate between these contributions. In this sense the ‘generalized Heisenberg uncertainty relation’ seems to continue confounding the contributions of preparation
and measurement to indeterminacy, noted in section 4.8 as a characteristic of the Copenhagen interpretation. For simplicity restricting ourselves to pure states and maximal observables it is possible to derive, analogously to (7.112), from inequalities (1.83) and (7.106) the following generalized entropic inequality for a joint nonideal measurement of and
An advantage of this inequality over the generalized Heisenberg relation (7.112) and its entropic analogue is that its validity is not restricted, no condition of unbiasedness or double stochasticity being required for its derivation. On the other hand, this inequality has the same disadvantage as (7.112), not distinguishing the separate contributions of preparation and measurement to indeterminacy. It is an achievement of the Martens inequality that it enables a separate study of the two kinds of contributions. Entropic uncertainty relations for joint measurements of incompatible observables not separately considering the different contributions, have been discussed, for instance, by Grabowski [333] and by Schroeck [334].
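The role of double stochasticity mentioned in this subsection can be checked numerically. The sketch below uses made-up 2×2 matrices and an arbitrary input distribution; it only illustrates the general fact that a doubly stochastic matrix cannot decrease the Shannon entropy of a probability distribution, whereas a matrix that is merely column-stochastic can. All names and numbers are hypothetical.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (natural logarithm) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def apply(matrix, p):
    """Nonideal distribution: matrix (list of rows) times ideal distribution."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in matrix]


def is_doubly_stochastic(m, tol=1e-12):
    """Nonnegative entries, every row and every column summing to 1."""
    n = len(m)
    nonneg = all(x >= 0 for row in m for x in row)
    rows_ok = all(abs(sum(row) - 1.0) < tol for row in m)
    cols_ok = all(abs(sum(m[i][j] for i in range(n)) - 1.0) < tol for j in range(n))
    return nonneg and rows_ok and cols_ok


ideal = [0.9, 0.1]

doubly = [[0.8, 0.2], [0.2, 0.8]]   # rows and columns both sum to 1
biased = [[1.0, 0.6], [0.0, 0.4]]   # only the columns sum to 1

h_ideal = shannon_entropy(ideal)
h_doubly = shannon_entropy(apply(doubly, ideal))
h_biased = shannon_entropy(apply(biased, ideal))
print(h_ideal, h_doubly, h_biased)
```

With these numbers the doubly stochastic matrix raises the entropy, while the column-stochastic matrix lowers it below that of the ideal distribution. This is why an entropic relation for the nonideal distributions that relies on double stochasticity cannot be expected to hold in general.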
Uncertainty and inaccuracy
In the quantum mechanical literature the idea of complementarity is generally associated with the Heisenberg inequality (1.78). Complementarity is then conceived in the sense given to it by Heisenberg, viz. as mutual disturbance of observables measured jointly, due to mutually exclusive measurement arrangements for ideal measurements of incompatible observables. As we have seen in section 4.6, for Heisenberg this disturbance should be taken in a preparative sense, i.e. the disturbance refers to the state of the object after the measurement: measurement disturbance as a cause of uncertainty in the final state. Ballentine has stressed (cf. section 4.7) that this is in disagreement with the usual interpretation of the Heisenberg inequality, in which this relation is seen as a property of the initial state of the object (prior to measurement), and should be tested by means of separate ideal measurements of observables A and B rather than by means of a joint measurement. According to Ballentine, inequality (1.78) does not refer at all to a joint measurement of A and B; according to him, quantum mechanics even has nothing to say about the joint measurement of incompatible observables. Indeed, the standard formalism does not allow such measurements. However, we have seen in section 7.9 that in the generalized formalism it is possible to describe a joint nonideal measurement of incompatible observables. Moreover, this description turns out to be in complete agreement with the idea of mutual disturbance in a determinative sense (i.e. in the sense that a certain nonideality (or inaccuracy) may be introduced in the determination of the probability distribution
of an observable, referring to the initial state of the object) if the measurement arrangement is changed in such a way that information can also be obtained about the probability distribution of an incompatible observable. The Martens inequality (7.110) is a precise expression of this kind of complementarity: it is impossible that in a joint nonideal measurement of incompatible standard observables both nonideality measures are arbitrarily small. Note that with respect to the Martens inequality there can be no doubt whatsoever as to whether it refers to preparation or to measurement: inequalities (7.109) and (7.110) refer exclusively to observables; the state (density operator) does not play any role in determining their validity! It is a property solely of the measurement procedure, unrelated to the preparation, the latter preceding the measurement. Hence, the nonideality measures, too, are not properties of the object preceding the measurement, but properties of the measurement process. They refer to measurement inaccuracy (nonideality) introduced by the method of measurement. For this reason relations (7.109) and (7.110) are called inaccuracy relations, thus distinguishing these from the uncertainty relations (1.77) and (1.78), which in the first place refer to preparation of the object (even though in the formulation of these latter relations observables also play a role; see also section 7.10.4). When the measurement does not induce extra spreading of the measurement results, it does not seem unreasonable to view the Heisenberg inequality in the first place as a restriction on the possibility of preparation (in the sense that it is not possible to prepare the initial state in such a way that the standard deviations of the measurement results of ideal measurements of standard observables A and B would violate inequality (1.78)).
This can be seen as a property of the preparation procedure, which is independent of the processes described by the Martens inequality (de Muynck [129]). Ballentine’s observation, to the effect that the Heisenberg inequality does not refer to a joint measurement of A and B, seems perfectly justified. Limitations on the joint measurement of incompatible observables are described by a completely different kind of relation, viz. relations like the Martens inequality derived in section 7.10.2. Bohr and Heisenberg incorrectly associated the Heisenberg inequality (1.77) with a joint measurement of incompatible observables. This may be partially due to the special examples considered by them, in which the inaccuracies of the joint measurements are caused by uncertainties in the preparation of a different object, viz. the measuring instrument, or, more generally, a secondary object (ancilla, cf. section 1.9.3) the object is interacting with (e.g. (4.8)). When the uncertainty in the preparation of the ancilla is expressed in terms of the Heisenberg inequality for this system, then it is not surprising that inaccuracy relations like (4.14), (4.11) and (7.104) will adopt a similar form. That, however, does not alter the fact that the quantities in the inaccuracy relations have physical meanings that are completely different from the statistical spreading measures involved in
the Heisenberg inequality, even if measurement inaccuracy can be reduced to preparation uncertainty of an ancilla. An analogous remark is applicable in the case of the entropic uncertainty relations (1.81) and (1.83) with respect to the entropic nonideality measure. That Bohr and Heisenberg did not notice these differences has a number of (possibly related) causes. In the first place, Bohr in particular did not draw a sufficient distinction between ‘preparation’ and ‘measurement’ (cf. section 4.7), thus blurring the fundamental difference between measurement inaccuracies and preparation uncertainties. In Bohr’s correspondence principle only one experimental arrangement, consisting of all preparing and measuring apparata which are present, was considered to define the physical quantities. In the second place, Bohr did not recognize the necessity of a quantum mechanical description of the measurement process, thus underestimating the fundamental importance of the measurement inaccuracies as quantum mechanical properties of the measurement process, these quantities often being equated to certain parameters of the experimental arrangement (like, for instance, the slit width). Thirdly, an important role may have been played by the fact that Hermitian operators seemed to be the natural substitutes within the domain of quantum mechanics of the quantities of classical physics. For a description of a joint nonideal measurement of incompatible observables, and a derivation of inaccuracy relations like the Martens inequality (see footnote 11), a generalization of the concept of a ‘quantum mechanical observable’ to a positive operator-valued measure is indispensable. The central position, for a long time taken in quantum mechanics by the standard formalism, doubtless has hampered this insight. Finally, a realist interpretation of observables, viewing these as properties of the object, may have served as a fourth cause of confusion (cf. section 2.3).
The realist attitude probably has contributed appreciably to an underestimation of the difference between determinative and preparative aspects of quantum mechanical measurement. As already noted in section 7.2.2, Heisenberg’s formulations fit into a (contextualistic-)realist interpretation in which a measurement result is thought to be a property of the microscopic object in its final state, and, hence, is of a preparative nature (cf. section 4.6). Application of the projection postulate, allegedly preparing the final state of the object as an eigenstate of the measured observable, may have been instrumental in promoting this idea because this mechanism would maximally disturb observables incompatible with the measured one. In any case, a realist interpretation of quantum mechanical observables does not seem to be favorable to a recognition of quantum mechanical measurement results as properties of the measuring instrument rather than as properties of the object. Since Heisenberg’s disturbance theory relies heavily on the projection postulate, it seems that the same question marks, put behind the projection postulate (cf. section 1.6), should also be placed behind Heisenberg’s theory. It is largely based on a particular interpretation, and is hardly ever realized in actual practice (although in some special cases, like the Stern-Gerlach experiment, it is approximated to a certain extent, cf. section 8.3). On the other hand, the empiricist disturbance idea is a direct consequence of the (generalized) mathematical formalism, implemented into the formalism through the Martens inequality. Therefore it seems better to conceive of Heisenberg’s measurement disturbance in the empiricist sense, and to view the original formulation more as a heuristic attempt at a qualitative understanding of the ‘thought experiments’, that could not grow to full maturity due to a restriction of the formalism to the standard one. Confusion with respect to the role of the Heisenberg inequality in quantum mechanical complementarity has been a source of controversy even recently. Whereas Storey et al. [336] conclude that “the principle of complementarity is a consequence of the Heisenberg uncertainty relation,” Scully et al. [337] observe that “The principle of complementarity is manifest although the position-momentum uncertainty relation plays no role.” Dürr et al. [338] stress that quantum correlations due to the interaction of object and detector, rather than “classical” momentum transfer, enforce the loss of interference in a ‘which way’ measurement. In their experiment momentum disturbance is not large enough to account for the loss of interference if the measurement arrangement is changed so as to yield ‘which way’ information.
11 Note that the role of the Martens inequality in clarifying the problem of joint measurement of incompatible observables as raised in the ‘thought experiments’, is completely ignored in Uffink’s [335] criticism of the notion of joint nonideal measurement.
These diverging statements can easily be reconciled if it is realized that the Heisenberg inequality refers to the initial state of the object, and does not refer to the measurement procedure (although, of course, the post-measurement state of the object will once again satisfy the Heisenberg inequality if measurements are performed in this latter state). It should be realized, however, that in general there need not exist a direct relation between the determinative properties of a measurement, yielding information on the initial state of the object, and its preparative properties, determining what is the final object state (compare section 4.2.3). We shall deal extensively with ‘which way’ experiments in sections 8.2 and 8.5. For such experiments an inequality having a significance comparable to that of the Martens inequality (7.106) has been derived by Englert [339].
7.10.4 Examples of generalized von Neumann projection
In this section generalized von Neumann projections as defined in section 3.3.5 are determined for a number of generalized measurements.
Example 1: Nonideal measurement of a standard observable
We first consider a nonideal measurement of a standard observable with (possibly multi-dimensional) projection operators. The operators of the NODI generating the POVM are then given by
We should distinguish two cases: i) and informationally equivalent In this case the matrix has a left inverse (3.36) has solution
and it is easily seen that
the dimension of the subspace corresponding to Then the generalized von Neumann projection (3.38) is given by
Comparing (7.115) with (3.40) it is seen that
Hence, the generalized von Neumann projection for a nonideal measurement of a standard observable is not different from that of the standard observable itself. This, once again, reminds us of the difference of and the final object state of the measurement; the latter is expected to be different for ideal and nonideal measurements. It should also be noted that in general (7.115) differs from the result of weak projection (1.73), dubbed in section 6.6.2 a candidate for the contextual state of the anti-Copenhagen modal interpretation. Of course, (7.115) could be seen as another candidate for an alternative description of the initial state. However, we do not seem to have any physical reason to prefer one over the other. It might be safer to stick to an empiricist view in which is thought to describe just information to be gained on by a measurement of POVM (cf. section 3.3.6). In an empiricist interpretation the equality (7.116) can easily be understood. Since we find from (3.39) that
Hence, the probability distribution of PVM can be obtained from the projection corresponding to the nonideal measurement. Due to the invertibility of the nonideality matrix, and both constitute a basis of subspace of Hilbert-Schmidt space. Therefore, the ideal and nonideal measurements are informationally equivalent.
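The inversion argument can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes a two-outcome measurement and a column-stochastic nonideality matrix `lam` relating the nonideal probabilities to the ideal ones by r_m = sum_n lam[m][n] p_n. All numerical values and function names are hypothetical.

```python
# Sketch under assumed conventions: r[m] = sum_n lam[m][n] * p[n], with lam a
# column-stochastic 2x2 nonideality matrix. All numbers are hypothetical.

def apply_matrix(lam, p):
    """Nonideal probabilities obtained from the ideal ones."""
    return [sum(lam[m][n] * p[n] for n in range(2)) for m in range(2)]

def invert_2x2(lam):
    """Inverse of a 2x2 matrix; raises if the matrix is singular."""
    det = lam[0][0] * lam[1][1] - lam[0][1] * lam[1][0]
    if det == 0:
        raise ValueError("nonideality matrix is singular: no inverse")
    return [[lam[1][1] / det, -lam[0][1] / det],
            [-lam[1][0] / det, lam[0][0] / det]]

def recover_ideal(lam, r):
    """Ideal probabilities reconstructed from nonideal ones by inversion."""
    return apply_matrix(invert_2x2(lam), r)

p_ideal = [0.8, 0.2]                      # hypothetical ideal distribution
mild = [[0.9, 0.1], [0.1, 0.9]]           # mildly nonideal measurement
nearly_singular = [[0.51, 0.49], [0.49, 0.51]]

for name, lam in (("mild", mild), ("nearly singular", nearly_singular)):
    r = apply_matrix(lam, p_ideal)
    exact = recover_ideal(lam, r)
    # Simulate a small experimental error in the measured probabilities.
    r_noisy = [r[0] + 0.01, r[1] - 0.01]
    noisy = recover_ideal(lam, r_noisy)
    print(name, "exact:", exact, "with 0.01 error:", noisy)
```

With exact probabilities both matrices return the ideal distribution, in line with informational equivalence. With these particular numbers an error of 0.01 in the measured probabilities shifts the reconstruction by 0.0125 for the first matrix and by 0.5 for the second, which even pushes the reconstructed values outside the interval [0, 1].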
It is important to note here that this informational equivalence does not imply that the measurements are equivalent from an experimental point of view. In practice, application of the non-orthogonal basis of corresponding to will yield less reliable information on the quantum state than application of the orthogonal one corresponding to For instance, it is possible that all are only slightly different from each other, but nevertheless constitute a basis of In that case the nonideality matrix is nearly singular, and probabilities obtained by means of the inversion (7.114) are strongly dependent on the precise values of the experimental probabilities Hence, the reliability of the inversion depends on the accuracy of the determination of these latter probabilities. From an experimental point of view the NODIs constituting bases of a given subspace of Hilbert-Schmidt space need not be equivalent at all, the quality of the information being highest for an orthogonal basis. The (inverse of the) nonideality measure (7.60), here quantifying the nonideality of the measurement with respect to the standard observable can be used as a measure characterizing for a given experimental accuracy the reliability of the information yielded by a nonideal measurement.
ii) and not informationally equivalent
Consider as a simple example the POVM with
a maximal standard observable. Although this measurement is a nonideal measurement of the maximal PVM (cf. section 7.6.6), it is not informationally equivalent because the nonideality matrix does not have a left inverse. Yet, the generalized von Neumann projection can easily be found. Solving (3.36) yields and
This coincides with (7.115) as POVM
is actually a non-maximal PVM.
The nonideality involved in a measurement represented by POVM could be implemented by just ignoring in an ideal measurement of observable the difference between certain pointer positions (cf. section 7.6.2). Hence, no change of the measurement arrangement is involved. This provides another reason not to interpret in a contextualistic-realist sense as a contextual state, but in an empiricist sense, in which is thought just to describe the information about collected by means of a measurement of rather than some contextual reality.
Example 2: Joint nonideal measurement of two standard observables
Consider the NODI with the non-vanishing operators of (7.64). Thus, By straightforward calculation we find as the solution of (3.36):
From (3.38) we find
It is illuminating to check whether the density operator is a positive operator, which in section 3.3.5 was demonstrated to be the case for a two-dimensional system like the one considered here. This can be done, for instance, by investigating whether Det (warranting positivity of the eigenvalues since also We find
It is not easy to check the sign of this expression, because the sign is not definite for (for instance, for The point is that not all values of and within this region are allowed. As a matter of fact, because of the triangle inequality they should satisfy
In order to simplify the problem we can restrict ourselves to pure states. This is possible since for arbitrary if it is true for pure states. Then, putting we obtain
and
Hence, is a positive operator, possibly representing a quantum mechanical state.
It is easily verified that
demonstrating that reproduces the probabilities of the (incompatible) observables and in state Once again this is hardly compatible with a contextualistic-realist interpretation, since this would imply that prior to (although in the context of) a measurement of the same values could be attributed to and as found in ideal measurements of these observables. The impossibility of such an attribution will be one of the subjects of chapter 9. Suffice it here to refer to the pure state case discussed above, with yielding Det This implies that represents a pure state, Evidently, for this state Although reproduces the probabilities (i.e. this state does not have any special relation to either the eigenstates of or For this reason it is hard to see why any special (contextual) reality would have to be attributed to these observables. Once again it seems more appropriate to interpret as describing the information about collected by a measurement of including the information on the marginals if can be arranged in a bivariate POVM.
7.10.5 Complete measurements, and the Copenhagen interpretation
The possibility of equality of the initial state and its generalized von Neumann projection throws a new light on the Copenhagen maxim of the essential role of the measurement interaction, allegedly restricting our possibility of obtaining complete knowledge about the microscopic object (cf. section 4.6). In contrast to the Copenhagen view, in the objectivistic-realist interpretation preferred by Einstein (cf. section 4.7.1) quantum mechanics is thought to yield a description of a microscopic reality that is independent of the way it is observed, i.e. a description that is ‘objective’ in the sense of being ‘independent of the observer including his measurement arrangement’. As far as the issue of information is concerned, the possibility of complete measurements seems to fit into Einstein’s objectivistic ideas rather than into the Copenhagen ones. In contrast to Copenhagen ideas the disturbing influence of the measurement interaction does not prevent complete knowledge of the initial state from being within experimental reach, even within the context of one single measurement arrangement. Viewed from the perspective of generalized quantum mechanics, measurement disturbance does not seem to be necessarily involved in incomplete measurements
either: a measurement represented by a POVM just picks out the information to which it is sensitive. This information is described by which can be interpreted as just representing the information obtained by a measurement of the POVM. Heisenberg measurement disturbance does not seem to play any necessarily restrictive role in gaining information by means of quantum mechanical measurement processes. Choosing a better measurement procedure may yield more complete information. In his discussion with Einstein, Bohr always pointed at the influence of the measurement arrangement for arguing against the possibility of Einstein’s objectivistic view (cf. chapters 4 and 5), the issue of complementarity constituting the core of his argumentation. From the analysis of the generalized formalism of quantum mechanics it has become clear that Bohr’s ideas with respect to complementarity were based on ideas having too limited a scope to admit final conclusions to be drawn. Limitations, observed in standard treatments with respect to completeness of information obtained by a quantum mechanical measurement, are now seen to stem from considering only the too restricted set of measurements of standard observables. The fundamental relation between complementarity and restriction of information, so basic to the Copenhagen interpretation, seems to have become obsolete by now. Of course, this is not to say that complementarity would not have any bearing on quantum mechanical measurement. As far as complementarity is related to incompatibility or incommeasurability of observables, the validity of the Heisenberg and Martens inequalities testifies to the contrary, the latter inequality being a consequence of Heisenberg disturbance (in a determinative sense) in a joint nonideal measurement of incompatible standard observables. This leaves us with a problem with respect to the Heisenberg inequality. 
Since this inequality seems to be attributable to preparation rather than to measurement the question may be asked whether measurement has anything to do with it. Can this inequality be interpreted in Ballentine’s ‘statistical’ sense (cf. section 6.2.1) as just a property of an objectively prepared ensemble, no reference to any measurement being necessary to explain that standard deviations of incompatible standard observables satisfy the Heisenberg inequality? Or should we view this inequality as an expression of complementarity, different from the one expressed by the Martens inequality, but yet in some sense related to measurement? Indeed, uncertainty in the state (i.e. in the preparation) can be experimentally demonstrated only by means of measurement, thus making a dependence of the uncertainty relation (1.78) on observables inevitable. These questions will be addressed in the next chapters, the Bell inequality being a useful tool to discuss the quantum mechanical complementarity problem from this point of view. Derivability of the Bell inequality from the ‘possessed values’ principle (cf. section 9.4.1) will be seen to, at least, endorse Bohr’s ideas in the sense that an objectivistic-realist interpretation of quantum mechanics is impossible. Heisenberg’s
inequality is not a property of preparation alone, independent of measurement. It must also be related to measurement. The simplistic ensemble idea of explanation of (standard) measurement results by means of observables (cf. section 6.4.2) is impossible. Unfortunately, the quantum mechanical formalism does not allow any further analysis of this subject beyond the (experimentally untestable) contextualistic-realist idea of the modal interpretation (cf. section 6.6.2) that a quantum mechanical measurement result is a property of ‘preparation within the context of a measurement’. In order to distinguish between an interpretation of a quantum mechanical measurement result as a property of the object or a property of the measuring instrument we have to resort to subquantum (hidden-variables) theories (cf. chapter 10). It will be demonstrated in section 10.6 how hidden-variables theories could be instrumental in understanding quantum mechanical complementarity as expressed by the Heisenberg inequality, in a way completely different from the disturbance idea inherent in the ‘thought experiments’ and described by the Martens inequality.
Chapter 8
Applications of generalized quantum mechanics
8.1 The Arthurs-Kelly model
8.1.1 The Arthurs-Kelly model as a joint nonideal measurement of position and momentum
Arthurs and Kelly [325] were the first to give a quantum mechanical treatment of a joint measurement of position and momentum, although their model is rather a toy model that does not suggest a realization as a practical experiment. They considered an impulsive interaction between the microscopic object and two measuring instruments (for position and momentum, respectively), described by the interaction Hamiltonian in which X and P are position and momentum operators of the object, and and are the momentum operators of the two measuring instruments. As there is no problem with the simultaneous measurability of these latter observables. The interaction induces a change of the state vector of the combined system of object and measuring instruments according to (putting
and the initial states of the microscopic object and the measuring instruments, respectively. Denoting the Fourier transform of by we find [340, 341]
The probability that in the final state the pointer positions of the measuring instruments are and respectively, is given by
Putting
the corresponding POVM is given by
That this procedure can be interpreted as a joint nonideal measurement of position and momentum is seen by considering the marginals and which can be represented according to
and eigenvectors of position Q and momentum P, respectively. Evidently, can be interpreted as a measure of the position of the object (as found in an ideal measurement), and, analogously, as a measure of ideal object momentum. The nonidealities are represented by the convolution functions and In general the measurements are not unbiased. For the averages we get, in self-evident notation, unbiasedness obtaining if
For the standard deviations we find
The standard deviations and were found in ideal measurements of position and momentum, respectively. and can be interpreted as additional contributions to a spreading of measurement results due to the typically quantum mechanical character of the measurement interaction, causing and to satisfy a generalized Heisenberg inequality (cf. section 7.10.3). The quantities and satisfy a separate inequality which straightforwardly follows from the definition of the convolution functions F and G given in (8.2): taking into account the Heisenberg inequalities for the separate measuring instruments, it follows from
that This inequality is an expression of mutual disturbance of the position and momentum measurements in the Arthurs-Kelly joint measurement. It should be compared to the Martens inequality (7.106).
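The effect of the convolution functions on averages and standard deviations can be illustrated with a discrete analogue. The sketch below is an illustration under stated assumptions rather than the Arthurs-Kelly calculation itself: the continuous distributions are replaced by hypothetical probability vectors on an integer grid, and the names `ideal` and `blur` are illustrative. It relies only on the general fact that, under convolution, means add and variances add.

```python
# Discrete analogue (assumption): the nonideal marginal is the convolution of an
# ideal distribution with a "blurring" distribution playing the role of a
# convolution function. Index k of each list is the grid value.

def convolve(p, f):
    """Discrete convolution of two probability vectors."""
    out = [0.0] * (len(p) + len(f) - 1)
    for i, p_i in enumerate(p):
        for j, f_j in enumerate(f):
            out[i + j] += p_i * f_j
    return out

def mean(dist):
    return sum(k * w for k, w in enumerate(dist))

def variance(dist):
    mu = mean(dist)
    return sum((k - mu) ** 2 * w for k, w in enumerate(dist))

ideal = [0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical ideal distribution
blur = [0.25, 0.5, 0.25]            # hypothetical convolution function

measured = convolve(ideal, blur)

# Bias: the measured mean is shifted by the mean of the convolution function.
print(mean(measured) - mean(ideal), mean(blur))
# Extra spread: the measured variance exceeds the ideal one by the variance of blur.
print(variance(measured) - variance(ideal), variance(blur))
```

In this analogue the measurement is unbiased only if the convolution function has zero mean, and the additional spread vanishes only if the convolution function has zero width.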
8.1.2 Wigner measure of the Arthurs-Kelly model
By means of deconvolution we can invert relations (8.2) so as to obtain
The Wigner measure (7.82) is then found for this continuous case as
yielding
with
It can easily be verified that, if the initial states of the measuring instruments are chosen as arbitrary coherent states (A.29), then the Wigner measure of this experiment coincides with the Wigner-Weyl measure (1.133) (cf. (1.127)),
Hence, in this case the Arthurs-Kelly model is a complete measurement.
8.2 Neutron interferometry
8.2.1 Introduction
The double-slit experiment, discussed in section 7.3, is important because it is a paradigm of the problem of incompatibility of observables in quantum mechanics (compare section 4.5.2). However, in the quantum mechanical description of section 7.3 this was not (yet) at issue. There the emphasis was on the inadequacy of the standard formalism for a description of the experiment. For the purpose of studying incompatibility it is necessary to extend the formalism to observables represented by POVMs. This extension allows us to apply the theory of nonideal measurements of incompatible observables (section 7.9), and makes it possible to give a more rigorous account than was possible in section 4.5.2, of the incompatibility of, on the
one hand, the measurement of the interference pattern, and, on the other hand, the determination of the slit the particle went through. By now the double-slit experiment has developed from a ‘thought experiment’ to an experiment that is carried out in actual practice. In particular in the domain of neutron interferometry important experimental progress has been made [342, 145, 343]. In the interferometer (cf. figure 8.1) two different paths are open to the neutron. Hence, an incoming wave packet splits into two parts, each traveling along a different path, but eventually being brought into interference again. The interferometer consists of a perfect silicon crystal, shaped in such a way that there are three parallel slabs in which the neutrons undergo Bragg diffraction. In the first slab a neutron impinging at the Bragg angle at point A is either transmitted with unchanged direction of propagation, or is Bragg reflected. After reflection in the second slab (at B or C) the partial wave packets are brought into interference in the third slab (D). Finally the neutron is registered by detector or in one of the two outgoing beams. Due to a separation of the partial waves by a distance of centimeters it is possible to influence each partial wave separately. For instance, in one of the beams an aluminum plate can be inserted, causing a phase shift (cf. figure 8.2) of the partial wave traversing it, proportional to the plate’s thickness. Variation of causes a variation of numbers of neutrons registered in detectors and This is a pure interference measurement. Another possibility is to place a thick slab of lead in one of the beams, absorbing all neutrons choosing the corresponding path. In that case it is certain that a neutron registered by one of the detectors, has taken the other path. Hence, this is a ‘which path’ measurement, registering which path the neutron has taken (cf. figure 8.3).
8.2.2 Interference and path observables
The two possible experiments described above, viz. the pure interference and the pure path measurement, are represented by standard observables. These observables will now be derived first. We shall restrict ourselves to neutrons with well-defined momentum (and energy), i.e. plane waves. Let correspond to a plane wave impinging at the Bragg angle. Assuming [343] that each Bragg reflection causes a phase shift in the state vector, the first slab will induce the following transformation:
Here
represents the Bragg reflected plane wave.
We first consider the pure interference measurement. Analogously to (8.3) the influences of the phase shifter and the second slab can be given according to
where it is assumed that in the second slab there is complete reflection 1 . Finally the effect of the third slab is given by
This yields the final (outgoing) state as a superposition of and From this the detection probabilities of detectors and are found according to
1 This is usually assumed in the literature, implying that neutrons that get lost because they are not Bragg reflected, can be left out of consideration. In section 8.2.3 the conditions will be discussed under which this assumption is justified.
In order to demonstrate that the detection probabilities (8.6) are expectation values of a standard observable we consider the two-dimensional Hilbert space spanned by vectors and Analogously to (8.5) it is possible to calculate the final state if the incoming state is If the incoming state is an arbitrary linear superposition we get:
Denoting the measured observable by
we have
yielding the following two-dimensional representation of
and
It is easily verified that and are projection operators satisfying and, hence, constitute an orthogonal decomposition of the unit operator defining a PVM. The corresponding (standard) observable will be referred to as the interference observable. The path observable is found in an analogous way by replacing the phase shifter by a thick leaden slab absorbing all neutrons impinging on it. We get:
The probability that a neutron has traversed the unblocked path equals the probability that it is registered by either one of the detectors or Denoting this probability by and the probability that the neutron has taken the other path by the path observable can be defined by
This yields
Operators and are seen to generate a PVM, to be referred to as the path observable. This observable is incompatible with the interference observable (8.8).
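The construction of the two observables can be sketched numerically. The following is a minimal model under explicit assumptions that are not taken from the text: the two beam directions are represented by a 2-vector of amplitudes, each Bragg reflection is given a factor i (a phase shift of pi/2), the first and third slabs split the beam 50/50, the second slab reflects completely, and the phase shifter sits in the transmitted path. The names `Q_plus`, `Q_minus`, `P_one` and `P_two` are illustrative labels, not the book's notation.

```python
import cmath
import math

# Assumed model: 2x2 complex matrices act on the amplitudes of the two beam
# directions; every Bragg reflection contributes a factor i.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def effect(u, k):
    """Operator E_k with <psi|E_k|psi> = |(u psi)_k|^2 for output channel k."""
    return [[u[k][i].conjugate() * u[k][j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

ROOT_HALF = 1 / math.sqrt(2)
S = [[ROOT_HALF, 1j * ROOT_HALF], [1j * ROOT_HALF, ROOT_HALF]]  # splitting slab
M = [[0, 1j], [1j, 0]]                                          # second slab

def interferometer(chi):
    shifter = [[cmath.exp(1j * chi), 0], [0, 1]]  # phase shifter in path I
    return matmul(S, matmul(M, matmul(shifter, S)))

chi = 0.7                                     # hypothetical phase shift
U = interferometer(chi)
Q_plus, Q_minus = effect(U, 0), effect(U, 1)  # interference observable
P_one, P_two = effect(S, 0), effect(S, 1)     # path observable

# Detection probabilities for the incoming state (1, 0);
# in this model they equal (1 +/- cos chi) / 2.
print(Q_plus[0][0].real, Q_minus[0][0].real)
# A nonzero commutator signals incompatibility of the two observables.
print(max_abs(commutator(Q_plus, P_one)))
```

In this model both pairs of operators are orthogonal projectors adding up to the unit operator, and their commutator is nonzero for every value of the phase shift, which is the sense in which the path and interference observables are incompatible.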
8.2.3 Joint nonideal measurement of interference and path observables
Stochastic absorption
Summhammer, Rauch and Tuppinger [145] have performed experiments in which, apart from a phase shifter, also a partially absorbing medium, consisting of a thin plate of gold or indium, is inserted into one of the paths (cf. figure 8.4). In contrast to the situation of the pure path measurement the probability that a neutron is not transmitted by the interferometer is not zero. Let the transmission coefficient of the absorber be equal to a (determined by the thickness of the absorbing plate). Since it is not determined a priori whether an individual neutron will be absorbed or transmitted, this is called stochastic absorption. It is assumed that the absorber reduces the amplitude of the partial wave packet (or the plane wave) in the path of the absorber by a factor Analogously to (8.7) can be calculated [344] for arbitrary incoming state vector
Here represents the state of the absorbed neutron. It is assumed to be orthogonal to and The detection probabilities and of the two detectors, and the absorption probability are found from (8.11) as
With we get the following two-dimensional representation of
and
Using the definitions (8.8) and (8.10) these operators can be written as
The operators and generate a POVM. In the limits and this POVM reduces to and respectively. These are trivial refinements (cf. section 7.6.6) of the interference and path observables, respectively. Hence, the standard measurements of interference and path can be seen as special limiting cases of a measurement with an arbitrary transmission coefficient It is now demonstrated that for arbitrary the measurements can be interpreted as joint nonideal measurements of the two standard observables of interference and path, in the sense defined in section 7.9. For this it is necessary to define a bivariate POVM This is done by first constructing the trivial refinement which is equivalent to POVM (cf. section 7.6.7). Next, this POVM is arranged in a bivariate form according to
The marginal POVMs and are found from (8.15) as:
These expressions have the form (7.63), with nonideality matrices for the path and interference measurements given by
respectively. For the nonideality measure (7.55) we find
whereas the measure (7.60) yields
It is interesting to note that the pure path and interference measurements can be interpreted as joint nonideal measurements, too, with and respectively. In agreement with the Heisenberg disturbance idea of measurement the observable incompatible with the actually measured one is maximally disturbed: indeed,
signifying that the corresponding marginal POVMs are uninformative (cf. section 7.7).
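The three-outcome POVM of the stochastic absorption experiment and its two limits can be sketched numerically. The model below is an assumption-laden illustration, not the book's derivation: every Bragg reflection is given a factor i, the outer slabs split 50/50, the second slab reflects completely, the phase shifter is placed in one path and the absorber in the other, and the absorber multiplies the amplitude in its path by the square root of the transmission coefficient a. Function and variable names are hypothetical.

```python
import cmath
import math

# Assumed model: amplitudes of the two beam directions form a 2-vector;
# every Bragg reflection contributes a factor i.
ROOT_HALF = 1 / math.sqrt(2)
S = [[ROOT_HALF, 1j * ROOT_HALF], [1j * ROOT_HALF, ROOT_HALF]]  # splitting slab
M = [[0, 1j], [1j, 0]]                                          # second slab

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def effect(u, k):
    """Operator E_k with <psi|E_k|psi> = |(u psi)_k|^2 for output channel k."""
    return [[u[k][i].conjugate() * u[k][j] for j in range(2)] for i in range(2)]

def stochastic_povm(chi, a):
    """Effects (detector 1, detector 2, absorbed) for transmission coefficient a."""
    # Phase shifter in path I, absorber (amplitude factor sqrt(a)) in path II.
    middle = [[cmath.exp(1j * chi), 0], [0, math.sqrt(a)]]
    u = matmul(S, matmul(M, matmul(middle, S)))
    path_two = effect(S, 1)
    absorbed = [[(1 - a) * path_two[i][j] for j in range(2)] for i in range(2)]
    return effect(u, 0), effect(u, 1), absorbed

chi, a = 0.7, 0.36                      # hypothetical parameter values
r1, r2, r3 = stochastic_povm(chi, a)

# The three effects add up to the unit operator, as required for a POVM.
total = [[r1[i][j] + r2[i][j] + r3[i][j] for j in range(2)] for i in range(2)]
print(total)
# Probabilities for the incoming state (1, 0): two detectors and the absorber.
print(r1[0][0].real, r2[0][0].real, r3[0][0].real)
```

In this model, setting a = 1 gives the two interference projectors together with a vanishing absorption effect, while a = 0 gives two identical detector effects, each equal to half the projector of the unblocked path, together with the projector of the blocked path.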
Deterministic absorption
Summhammer, Rauch and Tuppinger [145] have also discussed another experiment, in which the stochastic absorber is replaced by a fast chopper which is opened during a fraction of its period, and closed during a fraction If the neutrons are impinging randomly, then the transmission coefficient of this system equals Choosing we have the same absorption probability as in the case of a stochastic absorber. However, now for each neutron it is certain whether it will be absorbed or transmitted. For this reason we have deterministic absorption here. The POVM corresponding to this experiment can be obtained simply because we now have a mixture of the two pure measurement procedures for interference and path as discussed in section 7.6.4. According to (7.44) the POVM is given by
in which the operators are given by (8.13). Analogously to (8.14) the operators can be written as
The measurement with deterministic absorption is a trivial joint nonideal measurement of interference and path, too. Defining a bivariate POVM analogous to (8.15), we find the nonideality matrices
For the nonideality measures (7.55) and (7.60) we find
and
Complementarity of interference and path measurements
In figure 8.5 the nonideality measures and of the deterministic absorption experiment (a), as well as and of the stochastic absorption experiment (b) are given as functions of parameters and respectively. In the figure the theoretical limit (c) is also exhibited, existing due to the Martens inequality (compare (7.106)). For the path and interference observables (8.10) and (8.8) this limit is found as The theoretical limit is satisfied only if or In all other cases deviations from ideal measurements of path and interference observables are larger than the theoretical limit. Evidently, in this respect the stochastic absorption method is a more accurate joint measurement of path and interference than the deterministic one. This, presumably, is a consequence of the more “classical” nature of deterministic absorption, wiping out phase relations more effectively (see also section 8.2.5).
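The different loss of phase relations in the two absorption methods can be made concrete with a small calculation. The sketch below is a hedged illustration and does not compute the nonideality measures (7.55) or (7.60); it uses fringe visibility as a simple proxy. It assumes a neutron entering in one incoming beam, 50/50 splitting at the outer slabs, a stochastic absorber that scales the amplitude in its path by the square root of the transmission coefficient a, and a chopper that is open during a fraction a of its period. All function names are hypothetical.

```python
import math

def stochastic_counts(chi, a):
    """Detection probabilities with a stochastic absorber (assumed model)."""
    plus = 0.25 * (1 + a + 2 * math.sqrt(a) * math.cos(chi))
    minus = 0.25 * (1 + a - 2 * math.sqrt(a) * math.cos(chi))
    return plus, minus

def deterministic_counts(chi, a):
    """Mixture of a pure interference run (weight a) and a blocked-path run."""
    open_part = (0.5 * (1 + math.cos(chi)), 0.5 * (1 - math.cos(chi)))
    closed_part = (0.25, 0.25)   # path blocked: no interference remains
    return tuple(a * o + (1 - a) * c for o, c in zip(open_part, closed_part))

def visibility(counts, a):
    """Fringe visibility of the first detector as the phase shift is varied."""
    high = counts(0.0, a)[0]
    low = counts(math.pi, a)[0]
    return (high - low) / (high + low)

for a in (0.1, 0.36, 0.9):
    v_stochastic = visibility(stochastic_counts, a)
    v_deterministic = visibility(deterministic_counts, a)
    absorbed = 1 - sum(stochastic_counts(0.3, a))   # equals (1 - a) / 2 in both cases
    print(a, absorbed, v_stochastic, v_deterministic)
```

Under these assumptions both methods absorb the same fraction of neutrons, but the visibility is 2 sqrt(a) / (1 + a) for stochastic absorption and 2a / (1 + a) for deterministic absorption, so that for 0 < a < 1 the stochastic absorber preserves more of the interference at equal absorption.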
Wigner measures of the neutron interference experiments
The nonideality matrices for the joint nonideal measurement of the path and interference observables are invertible. As the inverses of (8.18) we find for stochastic absorption:
From this the Wigner measure (7.82) corresponding to the bivariate POVM (8.15) of the stochastic absorption experiment can be calculated straightforwardly:
For the deterministic absorption experiment the same Wigner measure is obtained. This Wigner measure is independent of the parameters or (cf. section 7.9.3).
Losses at the second slab
At the second slab of the neutron interferometer there is no total reflection of the neutron beam: as in the other slabs, the neutron has a probability to be transmitted without reflection. In that case it leaves the interferometer, and remains undetected (cf. figure 8.6). If the subensemble of the neutrons that are detected can be described by the same density operator as the whole ensemble, then this phenomenon can be neglected. Often this condition is satisfied, but, as will be seen in the following, in certain measurement arrangements it is violated. Hence, it is necessary to explicitly take this possibility into account.
If each neutron has a probability to be transmitted by the second slab without reflection, and get lost, then the stochastic absorption experiment is represented by the POVM generated by the NODI in which and once again correspond to the detection probabilities in detectors and represents the absorption probability of the absorber, and the probabilities that the neutron leaves the interferometer at the second slab. Calculating, analogously to (8.11), the outgoing state, we find POVM as
in which the operators
are given by (8.14).
By trivially refining the POVM generated by the operators and arranging the resulting POVM in a bivariate one according to
we find that the measurement can still be interpreted as a joint nonideal measurement of the path and interference observables (8.10) and (8.8), respectively, with nonideality matrices
As nonideality measures for these matrices we find
with given by (8.20). It can easily be verified that and demonstrating that this experiment is a more accurate path measurement, but a less accurate interference measurement than the experiment described by (8.14). This is understandable because detection of a neutron in 4 or 5 is additional evidence as regards its path. That an increase of the accuracy of the path measurement is accompanied by a decreasing accuracy of the interference measurement can be seen as an effect of complementarity in the sense of mutual exclusiveness of measurement arrangements. It is possible to restrict considerations to the subensemble of neutrons that are reflected at the second slab, the neutrons in 4 and 5 being ignored. However, this could change detection probabilities. Since this is not the case, however: the probability of a neutron remaining inside the interferometer is and, hence, is independent of the initial state. As a consequence the density operator of the incoming state also describes the conditional probabilities of the subensemble. Thus,
Since the POVM of the measurement on the subensemble turns out to be precisely (8.14), this justifies the neglect, referred to in section 8.2.2, of the losses at the second slab. However, such a neglect is not always possible! When the position of phase shifter and absorber are interchanged (compare figures 8.6 and 8.7), then, instead of (8.27) we get the POVM
The trivial refinement of this POVM can be arranged in a bivariate form according to
By taking marginals it is then seen that this measurement can also be interpreted as a joint nonideal measurement of path and interference. POVMs (8.27) and (8.30) are equivalent, hence their nonideality measures (8.28) are the same. As to the possibility of neglecting the losses at the second slab the measurements are very different, however. We have Hence, the fraction is dependent on the incoming state. A description, analogous to (8.29), of the subensemble in terms of conditional preparation yields a POVM that is not independent of the incoming state. POVMs (8.14), (8.27) and (8.30) are informationally equivalent (cf. section 7.6.7). For this reason we do not obtain extra information about the incoming state by detecting the losses at the second slab. In the measurement arrangement of figure 8.7 it is necessary, however, to take into account these losses explicitly.
8.2.4 Complete neutron interference measurements
From (8.14) and (8.22) it is seen that the joint measurements of interference and path, discussed in section 8.2.3, are trivial joint measurements (compare section 7.9.4). We shall now discuss two complete measurements.
Joint measurement of path and interference observable as a complete measurement
First, consider the measurement arrangement of figure 8.8. In this experiment a neutron interferometer with four slabs is used. This allows four detectors to be used, the first two detectors measuring interference at a phase shift the last two measuring an interference observable at a different phase shift There also are four absorbers, with transmission coefficients as indicated in the figure. Once again putting the detection probabilities of the detectors equal to and writing the absorption probabilities in the four absorbers as we find, analogously to section 8.2.3, the POVM
Here is defined by (8.8) with The transmission coefficients of the absorbers are chosen such that
This yields the possibility of restricting ourselves, analogously to the treatment in section 8.2.4 of losses at the second slab, to the subensemble of neutrons detected in This is possible by considering, instead of the full POVM, the POVM with given by (8.31). Defining the bivariate POVM
this experiment can be interpreted as a joint nonideal measurement of the path observable (8.10) and the interference observable with nonideality matrices
For this POVM the Wigner measure is found as
From the Wigner measure (8.33) the probability distributions of the three PVMs and can be calculated. It is possible to choose the phases and such that the operators of these PVMs constitute an overcomplete set in the Hilbert space of operators. Then their probability distributions yield sufficient information to completely determine the density operator of the incoming state.
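The completeness argument can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the path observable is represented by the Pauli matrix sigma_z and that the two interference observables have the form cos(chi) sigma_x + sin(chi) sigma_y at two phases chi1 and chi2; these representations and all function names are assumptions, not the operators (8.8) and (8.10) themselves. Under these assumptions the three expectation values fix the Bloch vector, and hence the density operator, whenever sin(chi2 - chi1) is nonzero.

```python
import math


def expectations(bloch, chi1, chi2):
    """Expectation values of the three assumed observables for a Bloch vector.

    Assumed (hypothetical) observables: path = sigma_z and
    interference(chi) = cos(chi) sigma_x + sin(chi) sigma_y.
    """
    rx, ry, rz = bloch
    e_int1 = rx * math.cos(chi1) + ry * math.sin(chi1)
    e_int2 = rx * math.cos(chi2) + ry * math.sin(chi2)
    return rz, e_int1, e_int2


def bloch_from_expectations(e_path, e_int1, e_int2, chi1, chi2):
    """Invert the linear relations above to recover (rx, ry, rz)."""
    det = math.sin(chi2 - chi1)
    if abs(det) < 1e-12:
        # The two interference observables coincide up to a sign,
        # so together with the path observable they are not complete.
        raise ValueError("phases must not differ by a multiple of pi")
    rx = (e_int1 * math.sin(chi2) - e_int2 * math.sin(chi1)) / det
    ry = (e_int2 * math.cos(chi1) - e_int1 * math.cos(chi2)) / det
    return rx, ry, e_path


if __name__ == "__main__":
    state = (0.3, -0.4, 0.5)
    data = expectations(state, 0.2, 1.3)
    print(bloch_from_expectations(*data, 0.2, 1.3))
```

The failure of the inversion for phases differing by a multiple of pi reflects the requirement that the phases be chosen such that the three PVMs are jointly complete.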
Complete measurement as a joint measurement of two joint path and interference observables The analysis of the experiment depicted in figure 8.8, given above, has as a drawback that the Wigner measure (8.33) is dependent on the parameters of the measurement arrangement (compare section 7.9.4). We shall now analyze a slightly different measurement in a different way so as to avoid this disadvantage. In this analysis the experiment is not interpreted as a joint nonideal measurement of the path and interference observables, but as a joint measurement of two different joint observables. In order to do so we drop the requirement that the detection operators can be arranged in a bivariate POVM allowing an interpretation as representing a joint nonideal measurement. This makes it possible to simplify the measurement arrangement to the one given in figure 8.9. The POVM then simplifies to
These operators are arranged as follows in the trivariate POVM
It can easily be verified that
Then the sets of operators and can be seen to define two different POVMs, each representing, analogously to section 8.2.3, a joint nonideal measurement of path and interference, the first one with interference operator the second one with The nonideality matrices of these are given by (8.18), in the second one being replaced by Defining the Wigner measure, analogously to (7.82), as
we get
being given by (8.26), in which for the interference observable is taken, and for The trivariate Wigner measure thus defined yields as marginals
Evidently, this Wigner measure is independent of the parameters.
8.2.5 Analogy with optical interference The stochastic absorption experiment is analogous to the optical experiment depicted in figure 8.10 (de Muynck et al. [345]). In this measurement arrangement
the semitransparent mirrors and define (together with the totally reflecting mirrors and a Mach-Zehnder interferometer. Mirror is the analogue of the stochastic absorber with transmission coefficient There is also a phase shifter It is possible to analyze this experiment along the lines of (8.11) through (8.14). Let the detection probabilities in detectors be given again by Then essentially the same relation is obtained as (8.14), viz,
Representing the relation between input and output of the partially transmitting mirror by the unitary matrix (cf. Ou et al. [346]), the interference and path observables have a representation differing somewhat from (8.8) and (8.10), respectively, viz,
The smooth behavior of the visibility of the interference exhibited by the expectation values of (8.36) as a function of a has sometimes been felt as surprising (Wootters and Zurek [257]). An experiment corroborating the “surprisingly strong interference signal although one of the beams contributes only about 0.6% of the intensity on the detectors” was carried out by Mittelstaedt et al. [258]. It should be noted, however, that this result is surprising only if considered from a point of view in which complementarity is restricted to the extreme values 0 and 1 of parameter a corresponding to the standard formalism, while ignoring intermediate situations (cf.
section 4.7.4). The results are not surprising at all when considered from the wider perspective offered by the generalized formalism. If the partially transparent mirror is replaced by a chopper which is closed for a fraction of the photons and open for a fraction then we have an optical experiment that is analogous to the deterministic neutron absorption experiment discussed in section 8.2.3. The essential difference between the stochastic and the deterministic case can be understood in terms of phase coherence of the two (partial) wave packets in the stochastic case leaving mirror Evidently, in this experiment it is not allowed to think in terms of photons as particle-like objects choosing either one path or the other; such a particle picture seems to be more consistent with the deterministic case, which is observationally different from the stochastic one! This difference also illustrates our discussion of the ensemble interpretation (cf. chapter 6), in which the conclusion was reached that we should be careful with an interpretation of a quantum mechanical ensemble as a von Neumann ensemble, in the sense that the photon wave packet cannot be interpreted as a description of an ensemble of photons of which a fraction is in state and a fraction in (cf. section 6.2.3). In the stochastic case the phase relation between the two partial wave packets is important.
A complete optical interference measurement Busch [318] has proposed the optical interference experiment given in figure 8.11 as an example of a complete measurement. The POVM of this experiment is essentially given by with given by (8.31) (path and interference operators now being given by (8.37)). Hence, this POVM represents a complete measurement. Analogously to section 8.2.4 it is possible to treat this experiment either as a joint nonideal measurement of path and interference (with phase shift or as a joint measurement of two joint nonideal measurements of path and interference with phase shifts and respectively. This is possible by a transition to the POVM to be obtained from the POVM by means of coarsening and trivial refinement, followed by arrangement in a trivariate form according to (8.35).
8.2.6 Absorber fluctuations It is possible that the transmission coefficient of the absorber in the experiment of section 8.2.3 is not a constant but is fluctuating. Then the probabilities (8.12) should be averaged to take into account these fluctuations. The corresponding POVM is obtained by averaging the operators of (8.13). Since the
fluctuations of may be complex if there is, apart from a fluctuation of the amplitude of the state vector, also a fluctuation of the phase (cf. [347, 348]), we put
Since for the averaged transmission coefficient we have we can define a decoherence parameter according to
The POVM is then found as
By repeating the analysis of section 8.2.3 it is seen that the experiment with fluctuations can be interpreted as a joint nonideal measurement of path and interference, too (the latter with an extra phase shift The nonideality matrices are given by
Comparing this result with the nonideality matrices (8.18) of the nonfluctuating absorber we see that for the accuracy of the path measurement is not influenced by the fluctuations. However, since the interference measurement has become less accurate. Note that this effect also obtains for real fluctuations. Hence, it cannot be interpreted simply as a consequence of phase fluctuation. It is interesting to note that the Wigner measure (7.82) corresponding to POVM (8.39) equals the nonfluctuating one (8.26). Evidently, in principle the fluctuations do not influence our possibility of obtaining information about the incoming state: in the fluctuating experiment the disturbing influence of the measurement can be compensated for by means of calculation as it was done in the nonfluctuating one.
8.2.7 Accuracy of the interference measurement Interference experiments are generally analyzed in terms of the visibility of the interference pattern. This is defined according to
in which and are the maximal and minimal intensity, respectively, measured by the detector when the phase shift is varied. In this section it will be demonstrated that the visibility is not always the most appropriate measure of the accuracy of an interference measurement, the nonideality measures defined in section 7.8 sometimes being preferable. Restricting ourselves to the experimental situation in which we find, using (8.13), for the nonfluctuating case:
with
yielding
Since if the quantity can be seen as a measure of the disturbance of interference by the presence of an absorber. In case of absorber fluctuations we obtain in an analogous manner from (8.39):
and
The inequality for following from this, seems to be completely in agreement with the result obtained for the nonideality measure to the effect that absorber fluctuations reduce the accuracy of an interference measurement. There is a fundamental difference, however, between the quantities and as indicators of the accuracy of an interference measurement. The first one is intended to describe a property of the object (or the preparation, cf. section 2.2), and is generally interpreted as a measure of the phase coherence of the incoming state. A possible disturbance of it by the measurement (for instance, due to the presence of an absorber, either fluctuating or not) is then an annoying side-effect, preferably to be kept as small as possible. However, in principle preparation and measurement both contribute to the quantity When the disturbance of this quantity by the measurement can be appreciable. It is not always possible to separate the contributions to of preparation and measurement. On the other hand, quantities and are completely independent of the preparation. As follows from the fact that the nonideality matrix is a property of the POVM, not depending in any way on the density operator representing the incoming state, it is purely a measure of the accuracy of the measurement (see also section 7.10.3). The difference between the quantities and is illustrated by their different dependencies on detector efficiency Taking into account detector (in)efficiency in (8.41) the detected intensity must be multiplied by a factor It is evident that the visibility of the interference is independent of detector efficiency, and, hence, is not changed by taking it into account. Such an insensitivity to a disturbing factor in the detection mechanism might be seen as an advantage if information on the incoming state is at stake. However, the visibility is sensitive to the presence of an absorber, either fluctuating or not. 
Hence, from its independence we cannot draw the conclusion that the visibility is a property of the incoming state. The efficiency of the detector does have an effect on the quantities Analogously to (8.40) we obtain the nonideality matrices
from which follows
and
Since and we see that the accuracies of the path and interference measurement are both influenced in a negative sense by the nonideality due to inefficiency of the detectors. Somewhat more manageable expressions are obtained using nonideality measure (7.55):
From this measure it is immediately clear that detector inefficiency increases the inaccuracy of the measurements of both path and interference.
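The insensitivity of the visibility to detector efficiency can be checked with a small sketch. It assumes the usual definition of visibility, (I_max - I_min)/(I_max + I_min), and a hypothetical sinusoidal fringe pattern; the names `mean`, `contrast` and `eta` are illustrative and do not correspond to symbols of (8.41).

```python
import math


def visibility(intensities):
    """Visibility (I_max - I_min) / (I_max + I_min) of a sampled fringe pattern."""
    i_max, i_min = max(intensities), min(intensities)
    return (i_max - i_min) / (i_max + i_min)


def fringe(mean, contrast, eta, samples=360):
    """Hypothetical detected intensity eta * mean * (1 + contrast * cos(phase)).

    eta models detector efficiency as an overall multiplicative factor.
    """
    return [
        eta * mean * (1.0 + contrast * math.cos(2.0 * math.pi * k / samples))
        for k in range(samples)
    ]


if __name__ == "__main__":
    for eta in (1.0, 0.3):
        print(eta, visibility(fringe(mean=1.0, contrast=0.6, eta=eta)))
```

Because the efficiency multiplies maximal and minimal intensity alike, it cancels in the ratio, whereas a change of `contrast` (as caused, for instance, by an absorber) does not.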
8.3 Stern-Gerlach experiments
8.3.1 Introduction The experiment of Stern and Gerlach [349] is one of the classic experiments of quantum mechanics. In the experiment one component of the spin of an atom is measured by letting the atom traverse a magnetic field that is inhomogeneous in the corresponding direction. The path of the atom depends on the value of this spin component. For total spin 1/2 a beam of atoms splits into two sub-beams. If the atom is found in one of the sub-beams, this is interpreted as one of the two possible measurement results for this spin component. The Stern-Gerlach experiment is a particularly nice example of a quantum mechanical measurement because it can clearly be seen what goes on in this measurement. By the interaction with the magnetic field a correlation is established between this spin component of the atom and its center of mass motion. It is this
center of mass motion that causes a directly observable effect. Since the center of mass motion of the atom depends on its spin, it is possible to draw a conclusion on the spin in the initial (incoming) state from an observation of its position in the final (outgoing) state. It also is clear that the establishment of a correlation between initial spin and final position is a quantum mechanical process that can be correctly described only by the Schrödinger equation. Actually, this is the pre-measurement phase (cf. section 3.2.1). Of course, the measurement has a macroscopic component, too, viz. the detection process proper, demonstrating the presence of the atom in one of the sub-beams. For this latter process the quantum mechanical description seems to be less crucial; for this reason this process will not be explicitly discussed in the following. It will be assumed that certain quantities of the center of mass motion, like a component of momentum or of position, can serve as a pointer observable (cf. section 3.3). The mass of the atom is sufficiently large to allow this observable in the detection process (i.e. after the pre-measurement) to be treated as a classical quantity.
8.3.2 The Stern-Gerlach experiment as a nonideal measurement of The Hamiltonian of a neutral particle with spin in a magnetic field B is given by
Here the vector has as components the three Pauli spin matrices, and is the atom’s magnetic moment (we have put An inhomogeneous magnetic field B appropriate for the Stern-Gerlach experiment has components and constants. Note that it is impossible that the field be inhomogeneous only in the because then the condition posed by classical Maxwell theory, cannot be satisfied. It is possible that is a function of thus allowing for a possible localization of the field in some restricted region (cf. Scully et al. [350]). This will not be considered here. The constant represents a homogeneous magnetic field in the the constant represents its inhomogeneity. With this magnetic field we obtain the Hamiltonian
The usual approximation We first discuss the usual approximation in which the term in Hamiltonian (8.42) is neglected. This is a good approximation if the constant is very large, at least as long as the state remains localized in the neighborhood of Instead of (8.42) in this approximation the Hamiltonian is given by
We then have Hence, in this approximation (7.46) is satisfied for the standard observable This implies that in this approximation the Stern-Gerlach measurement is a (nonideal) measurement of the of spin if an arbitrary observable of the center of mass motion is chosen as a pointer observable. Take, for instance, for this latter observable the of momentum, measured at time T. The observable in (3.22) is then given as yielding the POVM of the Stern-Gerlach measurement according to
In this expression the trace is taken over the Hilbert space of the center of mass variables (the pointer), and is the spatial part of the initial state vector of the atom. Here it is assumed that spin and center of mass variables of the atom are uncorrelated in the initial state. Because of (8.44) we have
in which the operators are the projection operators onto the eigenvectors of The functions should satisfy the properties and of a nonideality function. The nonideality function is found from
Inserting (8.45) it follows that
With (8.44) this can be reduced to
which expression demonstrates that for a fixed value of the problem is reduced to that of a charged particle in a homogeneous electric field in the This finally yields:
Taking as an example we get
with
showing that after the interaction will be found in a neighborhood of It is easily verified that all properties of a nonideality function are satisfied. Nonideality of the spin measurement is expressed by the fact that and are non-vanishing for the same values of However, as seen from the example given above, for sufficiently large values of T deviations from ideality become imperceptibly small. If it is registered only whether eigenvalue of is positive or negative, then it is sufficient to consider the POVM which in general represents a nonideal measurement of too. This POVM is an informationally equivalent coarsening (cf. section 7.6.6) of POVM The nonideality matrix of this measurement is given by
for the above example yielding
Usually the Stern-Gerlach experiment is carried out by means of detectors positioned in each of the outgoing beams. This can be interpreted as a measurement of the atom’s at time T. This yields a result that is analogous to the measurement: the measurement is a nonideal measurement, with POVM in which, analogously to (8.48), is given by
Here is a joint eigenvector of the position observables X, Y and Z. For given above this yields
as
which should be compared with the result (8.50) of a momentum measurement. By analogy with POVM a POVM can be defined yielding the probability that the particle is in one beam or the other. A nonideality matrix is obtained analogous to (8.51), with . Although, in contrast to the momentum distribution, the width of the position distribution increases with time, it can be verified that for sufficiently large inhomogeneity of the magnetic field and sufficiently large T we have an ideal measurement of also in this version. Due to the fact that in the approximation employed here is a constant of the motion, the pre-measurement is of the first kind (cf. section 3.2.4). This, undoubtedly, is the reason that the Stern-Gerlach experiment is a paradigm of quantum measurement theory. It is highly unfortunate, however, that this paradigm is not even an exact application of the theory: if we take the exact Hamiltonian, as given by (8.42), then is not a constant of the motion any longer.
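The approach to ideality for large T in the momentum version can be sketched numerically. The sketch assumes, for illustration only, that the two spin values lead to Gaussian momentum distributions of fixed width centred at plus and minus mu * b * T (the impulse delivered by a constant force mu * b); this Gaussian form, the shift, and the names `mu`, `b`, `T` and `width` are assumptions and are not taken from (8.49) through (8.51).

```python
import math


def momentum_shift(mu, b, T):
    """Assumed momentum shift of each spin component after interaction time T."""
    return mu * b * T


def nonideality_matrix(shift, width):
    """2x2 matrix lam[m][m'] = P(sign m of momentum is registered | spin value m').

    Assumes Gaussian momentum distributions centred at +shift and -shift
    with standard deviation width. Each column sums to one.
    """
    hit = 0.5 * (1.0 + math.erf(shift / (width * math.sqrt(2.0))))
    return [[hit, 1.0 - hit], [1.0 - hit, hit]]


if __name__ == "__main__":
    for T in (0.0, 0.5, 2.0, 10.0):
        print(T, nonideality_matrix(momentum_shift(1.0, 1.0, T), width=1.0))
```

For T = 0 the two distributions coincide and the matrix is uninformative (all entries 1/2); as T grows the overlap of the distributions vanishes and the matrix tends to the unit matrix, corresponding to an ideal measurement.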
Exact treatment It will now be demonstrated that, under certain conditions, for the process described by the exact Hamiltonian (8.42) a measurement of the of momentum can still be interpreted as a nonideal measurement of (Martens and de Muynck [351]). This means (compare (7.46)) that it is not necessary that be a constant of the motion. Although the pre-measurement is not of the first kind, a measurement of is nevertheless obtained if is chosen to be invariant under the transformation Thus, in which is the parity operator, defined by Note that a choice of the spatial part of the initial state of the atom essentially amounts to a choice of the initial state of the measuring instrument. It does not imply any fundamental restriction of the spin measurement, even though, of course, practical constraints will follow from it. The relevance of condition (8.52) stems from the fact that the Hamiltonian (8.42), although not commuting with yet satisfies
This is easily seen from and the well-known commutation relations of Pauli spin matrices. Due to we also have
From (8.52), (8.53) and (8.54) it then follows that
Finally, by tracing (8.55) over and using the equality valid for an arbitrary operator A, it follows from (8.52) that
with defined analogously to (8.45). This demonstrates that, indeed, a measurement of at time T is a nonideal measurement of in the initial state. Expression (8.46) remains valid in the exact case. This holds true even for (8.47) (with H’ replaced by H), but not for (8.48). Since is not a constant of the motion, transitions are possible between states with different values of Developing the spatial part of the state into eigenfunctions of the parity operator, we may write:
with and even and odd eigenfunctions of respectively. Due to (8.52) we have It is easily verified that the nonideality functions are given by
8.3.3 Stern-Gerlach as a joint measurement of incompatible observables
We shall consider in this section the special case that in Hamiltonian (8.42) the parameter vanishes. Thus
By doing so the magnetic field adopts a quadrupole configuration, thus allowing for the possibility of additional symmetry next to the one expressed by (8.53). As a matter of fact, the system described by (8.56) has rotational symmetry around the as is evident from
with a component of angular momentum and a Pauli spin matrix. This can be employed to set up the Stern-Gerlach experiment in such a way that it yields information on another observable next to Because of the axial symmetry we use axial coordinates Let us now consider a momentum measurement in which only angle is registered. Analogously to (8.45) such a measurement is represented by the POVM
Using (A.43) it is seen that
Due to (8.57) this implies that, if state is chosen rotationally symmetric around the i.e. then POVM should satisfy the covariance condition (cf. (A.10))
Using (A.83), from this condition it can easily be demonstrated that POVM (8.58) can be represented according to
in which parameter is determined by the details of the measurement arrangement. Without loss of generality we can put Taking we obtain which corresponds to a POVM advanced by Helstrom ([46], p. 74) as representing a measurement of spin direction. POVM is a maximal one in the sense defined in section 7.7.2. Since POVM can be represented according to
it follows that the quadrupole Stern-Gerlach measurement can be interpreted as a nonideal measurement of the maximal POVM Let us now restrict registration of angle to the intervals and for some angle The corresponding POVM is obtained as
with the spin component in direction Expressing this POVM in terms of the projection operators onto the eigenvectors of it can be seen that POVM represents a nonideal measurement of the standard observable with nonideality matrix
In an analogous way we can consider a partition into four quadrants This yields the bivariate POVM defined by
as an interesting possibility of ordering the experimental data. Its marginals are found according to
Comparing this with (8.60) we see that both marginals represent nonideal measurements of spin components (in directions and respectively), with nonideality matrices both given by (8.61). It is easily verified that the joint nonideal measurement of the two incompatible spin components, obtained in this way, satisfies the Martens inequality (7.106). Note, however, that there is no complementary behavior in the sense of canonical conjugatedness like in figure 8.5, when parameter is varied: for both nonideal measurements are uninformative, whereas for nonideality is minimal for both observables. Due to joint ideal measurement is unattainable, however, since this would require
8.4 Quantum optical experiments
Interference experiments have played an important role in the discussion on the foundations of quantum mechanics. In particular, in the field of (quantum) optics measurements have reached a degree of perfection making it possible to observe in a relatively simple manner effects that are typically quantum mechanical, and to compare the experimental results with the predictions of quantum mechanics, i.e. quantum optics. To a good approximation monochromatic laser light can be represented by the state vector with except for one frequency and a coherent state (cf. appendix A.4). For this reason coherent states are of great practical importance. They are considered as the “most classical” states of a monochromatic mode of the electromagnetic field (see also section 1.7.1). The parameter is (up to a multiplicative constant) the contribution of the mode to the (complex) amplitude of the field.
8.4.1 Photon detection in the output ports of a partially reflecting mirror
We first determine the POVM for detection of photons by detectors and in the output ports of a partially reflecting mirror as in figure 2.4. In both input ports 1 and 2 coherent states are assumed to be applied as incoming states. Use is made of (A.42). The probability that photons are detected in and photons in if the incoming state is given by the two-mode coherent state and and are ideal detectors, is
The POVM of this measurement is found by writing this expression as an expectation value in the incoming state. Thus,
With
it follows from this that
This is a PVM determining the information, yielded by the measurement, on the state vector or the density operator of the incoming field. It is possible to apply in mode 2 the vacuum state, and to interpret the measurement result as information on the incoming state of mode 1 alone. For an incoming coherent state in mode 1 the detection probabilities of and are then given by (8.62) with Partial tracing over mode 2 yields
in which are the operators of the POVM describing the information provided on mode 1 by the measurement. These operators follow from (8.62) as
in which is a number state (cf. appendix A.2). This POVM is a trivial refinement (cf. section 7.6.6) of PVM and hence equivalent with it (cf. section 7.6.7). The marginals of this bivariate POVM are given by
Comparison with (7.7) shows that both marginals can be interpreted as POVMs of nonideal measurements of the PVM representing a measurement of the (standard) number observable. The similarity with the POVM of an inefficient detector can be explained by the similarity of the detection processes, both processes being Poisson processes (e.g. van Kampen [70] p. 84) with probability and respectively, that one of two possible events will take place. As a consequence each of the (efficient) detectors and registers the photons of incoming mode 1 in an inefficient way. By taking into account correlations of the measurement results of both detectors -as is done by POVM - the accuracy of the number measurement is evidently increased so as to yield an ideal measurement of photon number. If detector inefficiency of detectors and is also taken into account, then, taking for both detectors the same efficiency and once again starting from arbitrary coherent states in both input ports, we find for detector detection probability with the detection probabilities as given by (8.62), and given by (7.8). It is not difficult to check that this implies the following change of the joint detection probability (8.62):
By equating this for to the expectation value of POVM we find
For all this is a nonideal measurement of the number observable.
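The relation between the joint measurement and its marginals can be illustrated for a number state in mode 1 and the vacuum in mode 2. The sketch assumes the standard picture that each photon independently reaches the first detector with probability `gamma` and the second with probability 1 - gamma; the parameter name `gamma` and the function names are hypothetical and do not reproduce the notation of (8.62) and (8.63).

```python
from math import comb


def joint_probability(n1, n2, n, gamma):
    """P(n1, n2 | n): joint counts of two ideal detectors behind the mirror.

    n photons enter in mode 1 (vacuum in mode 2); gamma is the assumed
    probability that a photon reaches the first detector.
    """
    if n1 < 0 or n2 < 0 or n1 + n2 != n:
        return 0.0
    return comb(n, n1) * gamma**n1 * (1.0 - gamma) ** n2


def marginal_first_detector(n1, n, gamma):
    """Marginal distribution of the first detector: a binomial distribution."""
    return sum(joint_probability(n1, n2, n, gamma) for n2 in range(n + 1))


if __name__ == "__main__":
    n, gamma = 5, 0.3
    print([round(marginal_first_detector(k, n, gamma), 4) for k in range(n + 1)])
```

Each marginal is spread over all values from 0 to n, like the output of an inefficient detector, whereas the joint distribution is supported only on pairs with n1 + n2 = n. Taking the correlation into account therefore recovers the photon number exactly.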
8.4.2 Homodyne optical detection More generally, if has an arbitrary fixed value different from 0 it is possible to interpret the above measurement as a measurement of mode 1 alone. The corresponding POVM follows from (8.64) by putting It follows from (1.149) that it is possible to represent POVM in terms of the Husimi representation, according to
Here and is given by (1.150) with
Unbalanced homodyne detection If in the experiment of figure 2.4 only the measurement results of detector are taken into account, then the corresponding POVM is obtained as a marginal of (8.65):
If is chosen very large compared to the incoming signal in mode 1, then this measurement is called ‘(unbalanced) optical homodyning’. The coherent state in mode 2 is referred to as the ‘local oscillator state’. The method of homodyning was already developed in the classical theory of optics to increase the signal-to-noise ratio of optical signals. It is now demonstrated by means of the quantum mechanical theory that in the limit POVM (8.66) represents a nonideal measurement of a standard observable (cf. section 7.4). In order to do so, use is made of the fact that in this limit the Poisson distribution tends to a Gaussian distribution (e.g. Consul [352]) of a continuous variable:
in which represents the (continuous) intensity registered by detector In this expression we must take As before, we put We take (implying a special choice of the orientation of the coordinate frame in phase space; we shall abandon this restriction later on). Then and
The “bias” corrections referred to in section 7.4 are now realized by a transition from variable to In the limit we then obtain the probability distribution of according to (explicit reference to quantum efficiency is omitted in the following)
Analogously to (8.65) the corresponding POVM is found as
Using the representation (1.150) it follows directly from this that
This can be represented according to
in which are the (improper) eigenvectors of position operator defined by (1.23). With this is identical to (7.24). Choosing, more generally, it follows directly from (8.66) that the corresponding POVM of mode 1 satisfies
This implies that in which is the rotation operator (for mode 1) defined in (A.16). From (8.68) it then follows that
in which are the (improper) eigenvectors of the rotated position operator defined in (A. 17). This demonstrates that unbalanced homodyning with an arbitrary phase difference between signal and local oscillator can be interpreted as a nonideal measurement of the rotated position operator Note that Hence for homodyne detection can be interpreted as a nonideal measurement of the momentum observable (cf. section 7.4). It is finally noted that the detection signal of detector gives a completely analogous result being replaced by Hence, this detector does not yield any additional information. This is a consequence of the fact that in this experiment we take into account only information stemming from the marginals of (8.65). No use is made of the correlation between the signals of the two output modes, inherent in the joint measurement represented by POVM In the method of balanced homodyne detection, to be discussed in the following, use is made of this correlation.
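The replacement of the Poisson distribution by a Gaussian one for a strong local oscillator can be checked numerically. The following sketch compares a Poisson distribution with a Gaussian density of the same mean and variance; the function names and the chosen window of six standard deviations around the mean are illustrative assumptions.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability, evaluated via logarithms to avoid overflow."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def gaussian_pdf(x, mean, variance):
    """Gaussian density with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def max_relative_gap(mean):
    """Largest |Poisson - Gaussian| within six standard deviations of the mean,
    relative to the peak height of the Gaussian."""
    sigma = math.sqrt(mean)
    lo = max(0, int(mean - 6.0 * sigma))
    hi = int(mean + 6.0 * sigma)
    peak = gaussian_pdf(mean, mean, mean)
    gap = max(
        abs(poisson_pmf(n, mean) - gaussian_pdf(n, mean, mean))
        for n in range(lo, hi + 1)
    )
    return gap / peak


if __name__ == "__main__":
    for mean in (10.0, 100.0, 10000.0):
        print(mean, max_relative_gap(mean))
```

The gap shrinks as the mean count grows, which is the regime in which the detected intensity may be treated as a continuous variable.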
Balanced homodyne detection A disadvantage of unbalanced homodyning is that, under the influence of the large local oscillator signal, the measured intensity strongly deviates from the incoming
signal that is to be determined. The measurement procedure of balanced homodyne detection (cf. figure 8.13) greatly improves on this. In this detection method a correction of the “bias” is operationally realized by taking the difference of the measured intensities of detectors and in such a way that in the difference signal the (large) contributions of the local oscillator compensate each other. In doing so we make use of the extra information that is contained in the correlation between the detection signals of the two detectors. For this reason these signals must be measured in coincidence. In the limit of large intensity of the local oscillator the joint detection probability of detectors and is obtained from (8.64) by replacing the Poisson distribution by a Gaussian one in the second factor, too, and by making a transition to the new variable Analogously to (8.67) this yields
We can achieve mutual compensation of the large contributions of the local oscillator to the intensities of the two detectors by considering Defining we find with
From (8.71) it follows that the information on the state of input mode 1 is completely contained in variable F. Measurement of G does not provide any extra information about mode 1. If this is our only goal, then we may restrict ourselves to the information provided by the measurement of variable F, i.e. to the marginal distribution The POVM representing balanced homodyne detection of the input signal in mode 1 can be determined by once again equating to the expectation value of an operator in the coherent state
i.e. Comparison with (8.67) shows that balanced homodyne detection, too, is a nonideal measurement of the PVM
For
this reduces to
in which is the eigenvector of at eigenvalue F. In this particular case the measurement of F can be interpreted as an ideal measurement of observable If this result is compared with unbalanced homodyne detection (cf. (8.68)), then balancing turns out to have an important effect on the accuracy of the measurement, which is considerably improved by it: Choosing we get, analogously to (8.69), for balanced homodyne detection the POVM
with the eigenvectors of the rotated position operator of mode 1. Evidently, this POVM represents a nonideal measurement of observable (once again ideal in the limit
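The compensation of the local oscillator contribution in the difference signal can be illustrated at the level of mean intensities of coherent amplitudes. The following is only a minimal sketch under stated assumptions, not the derivation of the text: it assumes a lossless 50/50 beam splitter with the output convention (alpha ± beta)/sqrt(2) and ideal detectors, and the function names are hypothetical.

```python
import cmath
import math


def beam_splitter_outputs(alpha, beta):
    """Output amplitudes of a lossless 50/50 beam splitter.

    Assumed convention (other phase conventions exist): signal amplitude
    alpha and local-oscillator amplitude beta are mixed into
    (alpha + beta)/sqrt(2) and (alpha - beta)/sqrt(2).
    """
    s = 1.0 / math.sqrt(2.0)
    return s * (alpha + beta), s * (alpha - beta)


def difference_signal(alpha, beta):
    """Difference of the mean intensities at the two output ports."""
    out1, out2 = beam_splitter_outputs(alpha, beta)
    return abs(out1) ** 2 - abs(out2) ** 2


def rotated_quadrature(alpha, theta):
    """Re(alpha * exp(-i*theta)): the quadrature selected by the LO phase."""
    return (alpha * cmath.exp(-1j * theta)).real


alpha = 0.3 + 0.4j   # weak signal amplitude (illustrative value)
lo_modulus = 100.0   # strong local oscillator (illustrative value)
theta = 0.7          # local-oscillator phase (illustrative value)
beta = lo_modulus * cmath.exp(1j * theta)

out1, out2 = beam_splitter_outputs(alpha, beta)
single_detector = abs(out1) ** 2            # dominated by |beta|^2 / 2
balanced = difference_signal(alpha, beta)   # |beta|^2 terms cancel
expected = 2.0 * lo_modulus * rotated_quadrature(alpha, theta)
```

In this sketch a single detector registers an intensity dominated by the local oscillator, whereas the difference signal is proportional to the quadrature of the signal selected by the local-oscillator phase.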
Invertibility In principle, the convolution relation (8.73) can be inverted, entailing
This implies invertibility of the nonideal measurement, i.e. the effect of detector inefficiency can, at least in principle, be calculated and compensated for (see also Kiss et al. [353]). In actual practice it is not that simple, however, because, in order to be able to invert a measured probability distribution in a reliable way, this distribution must be known exactly. Of course this is seldom, if ever, the case (for instance, because in general only a finite range of the spectrum is covered by the measurement procedure, or due to reading inaccuracies, or due to fluctuations in parameters of the measurement arrangement like The singular character of the deconvolution relation (8.74) is responsible for a stability problem: the result
of the deconvolution may depend strongly on small uncertainties in the measured distribution (e.g. Banaszek [354]). For sufficiently smooth probability distributions, vanishing sufficiently rapidly for large values of F, the inversion may be stable provided the efficiency is sufficiently large. Thus, it can easily be seen that for the probability distribution the integration in (8.74) converges if In agreement with results obtained by Kiss et al. [353] for arbitrarily small values of states do exist for which the measured probability distribution can be inverted in a reliable way.
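The stability problem of the deconvolution can be made explicit in Fourier space. The sketch below assumes, for illustration only, that the inefficiency acts as a convolution with a Gaussian kernel of standard deviation `sigma` (a hypothetical parameter standing in for the efficiency-dependent width); the function names are likewise hypothetical.

```python
import math


def noise_amplification(k, sigma):
    """Factor by which a Fourier component at frequency k is multiplied
    when a Gaussian smoothing of standard deviation sigma is undone.

    Smoothing multiplies the component by exp(-sigma^2 k^2 / 2), so the
    inversion multiplies it by the reciprocal.
    """
    return math.exp(0.5 * (sigma * k) ** 2)


def deconvolved_variance(measured_variance, sigma):
    """Variance of the Gaussian distribution that, smoothed by the kernel,
    reproduces a measured Gaussian of the given variance.

    Returns None when no normalizable solution exists, i.e. when the
    measured distribution is narrower than the kernel itself.
    """
    variance = measured_variance - sigma ** 2
    return variance if variance > 0.0 else None


# High-frequency errors in the measured distribution are blown up rapidly.
amplifications = [noise_amplification(k, 1.0) for k in (0.0, 1.0, 5.0, 10.0)]
```

The rapid growth of the amplification factor with `k` shows why small high-frequency uncertainties in the measured distribution may dominate the reconstructed one, and why the inversion can only be expected to be reliable for sufficiently smooth distributions.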
8.4.3 ‘Four-port’ and ‘eight-port’ homodyne detection
‘Four-port’ homodyne detection In ‘four-port’ homodyne detection the homodyne detection arrangements for the nonideal measurements of (for and (for are combined in a Mach-Zehnder interferometer (cf. figure 7.8). A phase difference is realized by means of a In order to determine the POVM once again the coherent states and are chosen as incoming signal S and local oscillator L, respectively. Also now the outgoing states are coherent states, to be determined by means of repeated application of (A.42). If the transmission coefficient of the fourth mirror is 1/2 we obtain the coherent states (for detector and (for detector In the limit the POVM of this measurement can be determined analogously to the procedure of balanced homodyne detection. Analogously to (8.70) we find for the joint probability distribution of the measured intensities and in the ‘four-port’ homodyne detection experiment:
By putting POVM
with
is found by using (8.65) and (1.150) as
and the Wigner measure of mode 1. The marginals of this expression represent nonideal measurements of the observables and respectively, of mode 1 (compare section 7.9.1). We find:
Hence, ‘four-port’ homodyning can be interpreted as a joint nonideal measurement of and in the sense defined in section 7.9.1, and representing the nonidealities of the measurements. Note that
With (1.146) it is possible to write (8.75) as
For
and
this can be rewritten according to
For these values of the parameters the ‘four-port’ homodyning experiment can be interpreted as a nonideal measurement of POVM It is interesting to note that by using (1.145) and (1.146) for values of satisfying POVM (8.80) can be expressed equally well in terms of squeezed states (cf. (A.49)) according to
Analogously to (8.77) the marginals represent nonideal measurements of observables and defined in appendix A.5, the nonideality functions being Gaussians with standard deviations satisfying
Hence, by means of squeezing it is possible to increase the accuracy of one quadrature component, while decreasing the accuracy of the other one. It is easily seen that, analogously to (8.78), for values of in the allowed range and satisfy the inequality Leonhardt and Paul [355] have proposed to use the effect of (anti)squeezing for compensating in one quadrature component the negative effect of detector inefficiency on accuracy.
‘Eight-port’ homodyne detection In ‘eight-port’ homodyne detection (e.g. [356, 317]) the homodyne detection procedures of the ‘four-port’ experiment are replaced by balanced homodyne detection procedures (cf. figure 8.14). Analogously to the ‘four-port’ homodyning experiment we have a quadrivariate probability distribution of the output intensities and of the corresponding detectors, which is equal to the product of the separate probability distributions. We find:
The separate output signals can be interpreted once again as nonideal measurements of observables or of input mode 1. By balancing, the nonideality of these measurements is reduced appreciably. Let us define
Then we find
Since variables G and do not yield any information on mode 1 we can ignore them. Putting, finally,
we find, analogously to (8.79),
Since
this yields for
Using (1.145) this can be written as (cf. Leonhardt et al. [357])
with defined by (1.133). It can directly be seen that for (8.82) reduces to (compare [317, 324])
It turns out that, analogously to (8.77), also the marginals of (8.81) represent nonideal measurements of and with nonideality functions given by Gaussian distribution functions with standard deviations
respectively. Comparing with (8.76) we see that and Hence, compared with ‘four-port’ homodyne detection there is a reduction of nonidealities. Nevertheless, analogously to (8.78), for arbitrary and we have for
attaining its lower bound.
‘Eight-port’ homodyning as a joint nonideal measurement of number and phase Expressing according to in polar coordinates we can calculate the marginals of POVM (8.84). Thus,
with This marginal represents a nonideal measurement of the number observable, with nonideality function satisfying The other marginal is given by
Since this POVM cannot represent a nonideal version of a standard observable. In view of the way it is obtained, POVM might be interpreted as representing a (nonideal measurement of) a phase observable. It, indeed, does satisfy the first relation of (1.106). However, since it does not satisfy the second one it cannot be interpreted as a nonideal version of the (maximal) canonical phase observable defined in section 1.9.4 (although it approximates one if restricted to certain states of the electromagnetic field, cf. Leonhardt [358], p. 169). POVM (8.87) actually represents a nonideal measurement of another maximal generalized observable, viz, the POVM (8.84) (with nonideality function given by ‘Eight-port’ homodyning experiments, intended as phase measurements, have been performed by Noh, Fougères and Mandel [56], and found to yield results in good agreement with POVM (8.84). In agreement with an operational (or empiricist) interpretation they assume that POVM (8.87) must be interpreted as defining in an operational sense the phase observable corresponding to the ‘eight-port’ homodyning measurement procedure, and that a relation with a property of the microscopic object (unnecessary in an empiricist interpretation) need not even be contemplated. For this reason POVM (8.87) is referred to as an ‘operational phase observable’. As discussed in section 2.4, for many physicists a purely operational or empiricist interpretation of quantum mechanics is hard to swallow (e.g. [359, 360]), and it has been attempted to find a Hermitian operator fit to represent the “true” phase observable, to be attributed to the microscopic system as a property. Up to now such attempts have only met with very limited success, however, thus strengthening the suspicion that a standard phase observable may not exist. It is utterly reasonable to assume that something like phase may exist also in the microscopic domain. 
However, the question is whether it exists as a notion described by standard quantum mechanics. It is not impossible that we shall have to rely on some subquantum theory for its description. It is hard to decide when an effort is no longer worthwhile, and a more empiricist approach to the quantum mechanical formalism should be adopted. In any case it would be interesting to study POVM (8.87) as a nonideal measurement of a maximal POVM composed of (refinements of) the spectral representations of the operators which would seem to correspond to a quantum mechanical measurement yielding information on the phase observable (if it exists) that is as ideal as possible. Naimark extension (cf. section 1.9.3) is often considered as a means of finding a standard observable corresponding to some POVM. A Naimark extension for the ‘eight-port’ homodyning POVM (8.84) has been found by Freyberger et al. [361] (also Hradil [362]). The dimension of Hilbert space is doubled by considering the tensor product space of two modes of the electromagnetic field, of which one mode represents the mode used in the experiment, whereas corresponds to an additional (reference) mode. Using (1.23) we can define operators and Then and are commuting Hermitian operators with joint eigenvectors2
with eigenvectors of Using the position representation (A.38) of coherent state it can easily be demonstrated that
the projection operator onto the vacuum state of mode 2. Hence,
demonstrating that the POVM measured in the experiment of Noh, Fougères and Mandel has PVM as a Naimark extension. It follows that the phase POVM (8.87) is a Naimark projection of PVM
In principle it is possible to measure this PVM by means of a joint measurement of the compatible standard observables and P (cf. Shapiro and Wagner [363]). From (A.40) (with it directly follows that
2 Vectors have been used already in the EPR problem (compare (5.7)).
Hence, this measurement could be performed by means of balanced homodyning (cf. section 8.4.2) of each of the output ports of a semitransparent mirror having modes 1 and 2 as input modes (cf. figure 2.4), provided 100% efficient detectors were available (cf. (8.72)). This demonstrates that in this case the Naimark extension allows for an operational implementation (at least in as far as infinitely large intensities of the local oscillators and 100% efficiency of the detectors can be approximated). Whether the Naimark extension (8.88) has any physical relation with ‘phase’ if mode 2 is not in its vacuum state remains an open question (see also Torgerson and Mandel [364]).
8.4.4 Quantum tomography The objective of quantum tomography is to reconstruct the density operator of the initial state, starting from the measurement results of a quorum of standard observables (cf. section 7.9.4). An example of this has been found by Vogel and Risken [327]. They observed that in the Wigner-Weyl representation (1.138) of the density operator,
of a monochromatic radiation field the quantity can be expressed in terms of the rotated position operator defined in (A.17):
Using (1.41) this can be written as
in which is the probability that in an ideal measurement of standard observable value is found. This implies that (and, hence, is completely known if the probability distributions are known for all possible values of We find
Therefore, in principle, the possibility exists of completely determining in this way the density operator3.
3 The method derives its name from the analogy with tomographic methods employed in the medical world, in which a three-dimensional picture of the human body is obtained by reconstructing it from information obtained by measurements of a large number of two-dimensional cross-sections.
Taking into account equalities
and
we obtain from (8.89)
The method was experimentally applied for the first time by Smithey et al. [365], its theoretical possibility having been observed by Royer [366]. Quantum homodyne tomography is based on the circumstance that the set of operators constitutes a (non-orthogonal) basis of Hilbert-Schmidt space. The inverse of (8.90) is the well-known Radon transform, given by
The dual basis
is given by
The method of quantum tomography has a certain limitation because probability distributions would have to be measured for all values of Since this means that an infinite number of measurements would have to be performed, this is not possible in practice. Some form of sampling will always have to be applied, in which measurements are performed only at a finite number of values of Hence, only states can be reconstructed that are not sensitive to such a sampling. Moreover, from these measurements will in general not be known on the whole interval We shall be able to perform the integrations in (8.89) only when it is possible to find, by means of interpolation and extrapolation, a good approximation of the quantities that are not actually measured. The results given above hold for efficient detection For there is still an extra complication, because the relation between and (which is determined by (8.73)) has to be inverted, too. Using the deconvolution relation (8.74) we find (compare Leonhardt et al. [357, 358])
It is evident that, due to the inefficiency of the detection, extra requirements have to be imposed on the Fourier transform of in order that the integral converge. For each value of the set of operators is a (non-)orthogonal basis of Hilbert-Schmidt space. The basis will deviate more from orthogonality as increases.
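The relation between the Wigner function and the probability distributions of the rotated position operator, on which the tomographic method rests, can be checked numerically in a simple case. The sketch below is an illustration under assumed conventions that need not coincide with those of (A.17) and (1.138): units with [Q, P] = i, a rotated quadrature Q cos(theta) + P sin(theta), and the Wigner function of a coherent state centred at (q0, p0). All function names are hypothetical.

```python
import math


def wigner_coherent(q, p, q0, p0):
    """Wigner function of a coherent state centred at (q0, p0),
    in the assumed units where the vacuum variance is 1/2."""
    return math.exp(-((q - q0) ** 2) - (p - p0) ** 2) / math.pi


def quadrature_marginal(x, theta, q0, p0, y_max=10.0, steps=4000):
    """Integrate the Wigner function along the line on which the rotated
    quadrature has the value x (trapezoidal rule over the orthogonal
    coordinate y)."""
    c, s = math.cos(theta), math.sin(theta)
    dy = 2.0 * y_max / steps
    total = 0.0
    for i in range(steps + 1):
        y = -y_max + i * dy
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * wigner_coherent(x * c - y * s, x * s + y * c, q0, p0)
    return total * dy


def analytic_marginal(x, theta, q0, p0):
    """Gaussian distribution of the rotated quadrature in a coherent state."""
    x0 = q0 * math.cos(theta) + p0 * math.sin(theta)
    return math.exp(-((x - x0) ** 2)) / math.sqrt(math.pi)


# Compare the line integrals with the known Gaussian for a few angles.
deviations = [
    abs(quadrature_marginal(x, theta, 1.0, 0.5) - analytic_marginal(x, theta, 1.0, 0.5))
    for theta in (0.0, 0.4, math.pi / 2, 2.0)
    for x in (-1.0, 0.0, 0.8, 1.5)
]
```

In practice only a finite set of angles is available, as in the four values used here, which is precisely the sampling limitation discussed above.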
It is interesting to compare the tomographic method of determining the quantum state with ‘four-port’ or ‘eight-port’ homodyning measurements. As observed in section 7.9.4, these are complete measurements. It follows from (7.92) that, in principle, the Wigner distribution of density operator can be calculated from the probability distribution measured in a ‘four-port’ or ‘eight-port’ homodyning experiment. Since the density operator is completely determined by its Wigner distribution (compare (1.136)), this means that POVM (8.84) represents a complete measurement. The same result (7.93) is obtained from all ‘four-port’ and ‘eight-port’ homodyning POVMs (8.80), (8.82) and (8.84). Comparing this with the quantum tomographic method discussed above, we see an essential difference, viz that in the ‘four-port’ and ‘eight-port’ homodyning experiments the density operator is determined using one single measurement arrangement for a single generalized observable, which can be interpreted as a joint nonideal measurement of Q and P. By contrast, in quantum tomography a large number of different measurement arrangements for incompatible standard observables must be employed to arrive at the same result. As seen from (8.82) the ‘eight-port’ homodyning POVM represents a nonideal measurement of the maximal generalized observable (8.84), whereas the ‘four-port’ homodyning POVM (8.80) does so only for sufficiently large values of It seems, however, that this advantage of the ‘eight-port’ homodyning method over the tomographic one may be compensated by a larger instability of the inversion process (7.93) as compared to (8.90).
8.5 Atomic beam interference experiments
8.5.1 Introduction In this section4 we want to demonstrate [367] that a recent atomic beam experiment by Brune et al. [368] is a generalized measurement. In the experiment, to be referred to in the following as the Haroche-Ramsey experiment, a Rb atom is sent through three cavities, C and (cf. figure 8.15), and being approximately resonant with a particular transition between two Rydberg states of the atom. Whereas the experiment without cavity C is a pure interference experiment already performed by Ramsey [369], the introduction of cavity C provides the possibility of obtaining ‘which way’ information too. The Haroche-Ramsey experiment is closely analogous to the neutron interference experiments discussed in section 8.2.3, in which an absorber is inserted into one of the paths, the absorber playing a role analogous to that of the cavity C field. In the present experiment the ‘which way’ information does not refer to paths in configuration space, but to paths in the space of the electronic states of the atom. The visibility of the interference fringes decreases as the amplitude of the cavity C field increases, but it vanishes only in the limit So, for finite information on both interference and path can be obtained. In order to actually obtain ‘which way’ information, a measurement must be carried out on the field left behind in cavity C by the atom. In the Davidovich-Haroche experiment, discussed in section 8.5.4, this is done by sending a second atom through the cavities. In the following we shall analyze the Haroche-Ramsey and Davidovich-Haroche experiments as generalized measurements. In section 8.5.5 it is demonstrated that the Davidovich-Haroche experiment is informationally equivalent to a Haroche-Ramsey experiment in which a measurement of cavity C photon number is performed in coincidence with a determination of the final state of the atom by means of detectors and In that section the decoherence aspects of the Haroche-Ramsey experiment will be dealt with from the point of view of generalized measurements, thus demonstrating that the generalized approach can yield a deeper physical insight than can be obtained by the standard one. In section 8.5.6 an alternative measurement procedure for the Haroche-Ramsey experiment is discussed, which can be interpreted as a joint nonideal measurement of incompatible observables having the complementary character of the “classical” double-slit experiments, in the sense of mutual exclusiveness of measurement arrangements.
4 A large part of this section has been published as ‘Haroche-Ramsey experiment as a generalized measurement’, W.M. de Muynck and A.J.A. Hendrikx, Phys. Rev. A 63, 042114 (2001). Copyright (2001) by the American Physical Society.
8.5.2 The Ramsey experiment In the Ramsey experiment [368] a beam of Rb atoms is sent through two identical cavities and The relevant Hilbert space of a Rb atom is spanned by the orthogonal state vectors and These correspond to circular Rydberg states with principal quantum numbers and respectively (transition frequency The frequency of the classical microwave
fields in the cavities is denoted by its amplitude by (the Rabi frequency), and the time needed for an atom to pass one cavity by T. The unitary transformation describing the evolution of the state of a Rb atom while passing cavity between and is given in the representation by the matrix
where and is the detuning parameter. A derivation of (8.91) can be found in Ramsey [369], and in Paul [370]. For all values of the parameters we have The Rb atom is said to undergo a pulse in cavity if We shall introduce a parameter quantifying experimental deviation from the pulse condition. Note that satisfaction of this latter condition does not imply The phase factor in (8.91) takes into account the phase of the microwave field at the moment the atom enters the cavity. Let be the initial state of the atom. By detectors and it is determined whether the atom is either in state or after it has passed cavity This can be interpreted as a measurement of standard observable Its probabilities and can be related to the initial state by means of the equalities
Due to the unitarity of this yields PVM as the POVM of this experiment, with
in which equals up to a phase factor. Because of the analogy with neutron interference experiments [344] the observable will be referred to as the path observable, even though here the paths are not trajectories in configuration space (as they are in the double-slit experiment and the neutron interference experiments) but in the Hilbert space of the internal states of the Rb atom. Mathematically this does not constitute a difference, however. The observable is dependent on the initial phase of the microwave field. For this reason this phase cannot be ignored if the experiment is intended to yield a measurement of the initial state of the atom. If the atoms are prepared in random phases, then a measurement of performed immediately after the atom has passed cavity will yield probabilities obtained from (8.92) by phase averaging. The corresponding POVM is found according to
and, hence, represents a nonideal measurement of PVM in the sense of (7.37). Note that is uninformative in case the pulse condition is satisfied, since then its expectation values do not yield any information on The Ramsey set-up consists of two cavities entered by the atom at times (with If the initial state of the Rb atom is
then the final state at time
is
where we have used the abbreviations
In the Ramsey experiment the standard observable is measured after the atom has passed cavity The corresponding probabilities and can be related to the initial state according to
yielding
The observable can be interpreted as the quantum mechanical observable measured in the Ramsey experiment if the initial phase of the microwave field is well defined. It is easily verified that Hence, and are projections and is a PVM. This PVM will be referred to as the interference observable, because, provided the atom velocity is sufficiently well defined, its expectation values in the state exhibit interference fringes if is varied. It is easily verified that if the pulse condition is satisfied, and the detuning parameter is taken to be zero, then the interference observable reduces to PVM This is in agreement with the fact that under these conditions the Ramsey setup
just interchanges the roles of states and
In general the standard observables and are incompatible.
Also is dependent on the initial phase of the microwave field. Averaging over this phase yields the POVM
which, once again, is a nonideal measurement of (for it is even an ideal one). It is important to note that, nevertheless, the expectation values of exhibit interference fringes if is varied. When in (8.94) (as was satisfied in the experiments that have actually been carried out), then the expectation values of and coincide. For this special case it is possible to analyze the Ramsey experiment in terms of the standard formalism, even if the experiment actually performed is a generalized one, represented by POVM In contrast to the phase-averaged experiment, in the case of a well-defined initial phase of the field, if the measurement performed after the atom left is incompatible with the one performed after The two measurement arrangements are complementary. The experiments are analogous to double-slit experiments in which either it is directly measured which slit a particle has passed through (‘which way’ or ‘which path’ measurement), or the interference pattern is measured after the two partial beams have been allowed to interfere (interference experiment). The quantity is the relative phase shift of the partial beams. In standard quantum mechanics complementarity is interpreted as mutual exclusiveness of information, caused by the impossibility of having both experimental arrangements simultaneously. Of the two incompatible PVMs and either one or the other can be measured. In the following we shall discuss a measurement arrangement that is intermediate between the two arrangements considered above, viz, the experiment reported by Brune et al. [368], to be referred to as the Haroche-Ramsey experiment. As a result the experiment may yield information on both the path and the interference observable.
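The interference fringes of the two-cavity arrangement, and their survival under averaging over the initial phase of the microwave field, can be illustrated with a two-level model. The following sketch does not use the matrix (8.91); it assumes instead the textbook resonant-pulse matrix, models the evolution between the cavities as a relative phase `phi` between the two pulses, and uses hypothetical function names throughout.

```python
import cmath
import math


def pulse(theta, phase):
    """2x2 matrix of a resonant pulse with rotation angle theta and field
    phase `phase` (an assumed standard form, not the book's (8.91))."""
    c = math.cos(theta / 2.0)
    s = math.sin(theta / 2.0)
    return (
        (c, -1j * cmath.exp(-1j * phase) * s),
        (-1j * cmath.exp(1j * phase) * s, c),
    )


def apply(u, state):
    """Apply a 2x2 matrix to a two-component state vector."""
    return (
        u[0][0] * state[0] + u[0][1] * state[1],
        u[1][0] * state[0] + u[1][1] * state[1],
    )


def ramsey_probability(state, phi, field_phase=0.0):
    """Probability of finding the atom in the second basis state after two
    pi/2 pulses whose field phases differ by phi."""
    after_first = apply(pulse(math.pi / 2.0, field_phase), state)
    after_second = apply(pulse(math.pi / 2.0, field_phase + phi), after_first)
    return abs(after_second[1]) ** 2


def phase_averaged_probability(state, phi, samples=16):
    """Average of ramsey_probability over equally spaced initial field phases."""
    phases = (2.0 * math.pi * k / samples for k in range(samples))
    return sum(ramsey_probability(state, phi, p) for p in phases) / samples


# Fringes for an atom entering in the first basis state: cos^2(phi/2).
fringes = [ramsey_probability((1.0, 0.0), phi) for phi in (0.0, 1.1, math.pi)]
# Phase averaging removes the dependence on coherences of the initial state,
# but the fringes as a function of phi remain.
averaged = phase_averaged_probability((0.6, 0.8j), 1.1)
```

In this model the two pulses with `phi = 0` simply interchange the two basis states, while the phase-averaged probability depends only on the populations of the initial state, weighted by fringe factors in `phi`.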
8.5.3 The Haroche-Ramsey experiment
In the Haroche-Ramsey experiment a third cavity, C, storing a coherent field is placed between cavities and In order to avoid exchange of energy when the atom passes C, the transition frequency of the Rb atom and the frequency of the cavity C field are chosen to be off-resonance. Unlike the microwave fields in cavities and the cavity C field is treated quantum mechanically. The field in cavity C merely undergoes a phase shift (single atom index effect) which depends on
the state of the Rb atom in the following way [368]:
The states and are coherent states, too. This yields the following unitary transformation describing the evolution of the atom-field system when the atom is going from to
where and are the photon creation and annihilation operators of the cavity C field mode. For the initial state of the combined atom-field system we get as the final state:
constants E, G, etc. being given by (8.95). After the atom has passed cavity C the field contains path information that can be retrieved by a measurement of a well-chosen observable of the field. In the standard formalism this information is usually analyzed in terms of the inner product of the field states and determining their distinguishability. The possibility of interference is seen as a consequence of the indistinguishability of the paths. How distinguishable the paths are, depends on the values of the parameters and If the states are identical, then the paths are completely indistinguishable. Ignoring the possibility this holds true if In this case the experiment cannot yield any path information. Complete distinguishability, corresponding to maximal path information, is obtained if the field states are orthogonal. This holds true only in the limit In the next sections this analysis will be corroborated on the basis of the generalized formalism. Whereas for the limiting values of considered above, the standard formalism is sufficient, the generalized formalism is necessary for experiments corresponding to intermediate values This already holds true if no measurement of the cavity C field is carried out at all. Thus, putting (and analogously for we find the POVM of the Haroche-Ramsey measurement from (8.99). Restricting ourselves to we get
in which
It is easily verified that, unless is not a PVM. Even in the limit (corresponding to (8.100) is a POVM, although an uninformative one. This is consistent with complementarity in the sense that in this limit no information on the interference observable is obtained, path information being obtainable by a measurement of an observable of the cavity C field in the final state of that field. Since the operators of POVM obtained from (8.100) by phase-averaging, are diagonal, the Haroche-Ramsey experiment is just a nonideal version of the Ramsey experiment if the initial phase of the microwave field is random. The nonideality measure (7.60) is then given as
This quantity is a measure of the inaccuracy introduced in the observation of observable by the insertion of cavity C. From the point of view of complementarity the experimental setup of the Haroche-Ramsey experiment is particularly interesting when the information is also exploited that is stored in the cavity C field, because this may add ‘which way’ information to the (nonideal) interference information obtained from the measurement of the final state of the atom. This will be discussed in the next sections.
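The dependence of path distinguishability on the field amplitude and the phase shift can be illustrated with the inner product of two coherent states. The sketch below assumes, as an illustration only, that the two atomic states leave the cavity field in coherent states whose amplitudes are rotated by opposite phase shifts; the parameter and function names are hypothetical.

```python
import cmath
import math


def coherent_overlap(beta, gamma):
    """Inner product <beta|gamma> of two coherent states:
    exp(-|beta|^2/2 - |gamma|^2/2 + conj(beta)*gamma)."""
    beta = complex(beta)
    gamma = complex(gamma)
    exponent = -0.5 * abs(beta) ** 2 - 0.5 * abs(gamma) ** 2 + beta.conjugate() * gamma
    return cmath.exp(exponent)


def path_overlap_modulus(amplitude, phase_shift):
    """Modulus of the overlap of the field states with amplitudes
    amplitude*exp(-i*phase_shift) and amplitude*exp(+i*phase_shift)
    (assumed form of the two 'path' states of the field)."""
    beta = amplitude * cmath.exp(-1j * phase_shift)
    gamma = amplitude * cmath.exp(1j * phase_shift)
    return abs(coherent_overlap(beta, gamma))


def closed_form(amplitude, phase_shift):
    """exp(-2 |amplitude|^2 sin^2(phase_shift)), the same modulus in closed form."""
    return math.exp(-2.0 * amplitude ** 2 * math.sin(phase_shift) ** 2)


# The overlap decreases with growing amplitude but never reaches zero.
overlaps = [path_overlap_modulus(a, 0.5) for a in (0.0, 1.0, 2.0, 5.0)]
```

For vanishing phase shift the overlap equals 1 (indistinguishable paths), and for fixed nonzero phase shift it decreases monotonically with the amplitude while remaining positive for every finite amplitude, in line with the statement that complete distinguishability is reached only in a limit.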
8.5.4 The Davidovich-Haroche experiment A variation of the Haroche-Ramsey experiment was proposed by Davidovich et al. [371], in which a second atom traverses the system some time after the first one has passed. In [371] the reason for sending this second atom is to probe a possible decoherence of the field in cavity C. We shall discuss this aspect of the experiment in section 8.5.5. Here we are interested in the possibility of considering the second atom as yielding information on the cavity C field that might be useful for determining the path of atom 1. We shall demonstrate that the joint measurement of standard observables and in the final state of the atoms can be interpreted as a measurement of a POVM on the incoming state of atom 1. We shall neglect decoherence here by taking a negligible time interval between the atoms. We also restrict ourselves here to the case for which and, hence, (8.96) yields Then, starting with atom 2 in state and using rules (8.97) for both atoms, we find for an arbitrary initial state of atom 1 the final state
with By putting etc., the POVM of the Davidovich-Haroche experiment, interpreted as a measurement on atom 1, is straightforwardly found according to
From these expressions it is immediately clear that averaging over the initial phase makes the Davidovich-Haroche experiment a nonideal measurement of observable too, the nonideality measure (7.60) being given by
with Comparing with the corresponding nonideality measure (8.102) of the Haroche-Ramsey experiment, we find (cf. figure 8.16) that for Hence, by taking into account the extra information from the measurement of the cavity C field the accuracy of the nonideal measurement of observable has been increased. In the phase-averaged case the subspace of Hilbert-Schmidt space spanned by the operators of the NODI (8.103) is two-dimensional. From an informational point of view the Davidovich-Haroche experiment will be more interesting if it is possible to avoid the necessity of phase averaging, because in that case is three-dimensional. Although, due to the equality the operators of the NODI are linearly dependent, and, hence, the measurement is not a complete one, it nevertheless is a generalized measurement, being interpretable, in the sense defined in section 7.9, as a joint nonideal measurement of two incompatible observables. In order to see this the operators must be ordered in a bivariate way. Due to the uninformativeness of the marginal the only interesting way to do this is according to
yielding marginals
and
with
and
Here is taken to be zero. Evidently, both marginals depend on the parameter governing the distinguishability of the field states. In agreement with (7.63), for these marginals can be interpreted as representing nonideal measurements of two incompatible PVMs of atom 1, with nonideality matrices given by
yielding nonideality measures (7.60) as
For the parameters
and
we find
We shall not bother to calculate the corresponding PVMs, because these do not admit a straightforward physical interpretation in terms of the interference and path observables defined above. From the nonideality measures and it can already be seen that the two PVMs measured jointly in the Davidovich-Haroche
experiment do not constitute a canonically conjugate pair in the sense that, if the parameter is varied, one measurement becomes more accurate if the other one becomes more nonideal. Thus, in both of the limits and POVM (8.103) represents a (non)ideal measurement of the same PVM From the plots of and as functions of and in figure 8.17 it is also seen that both nonideality measures vanish in the limit Hence, although the two PVMs measured jointly in this experiment do satisfy the Martens inequality (7.106) for all values of this does not imply any complementarity for because the right-hand side of the inequality vanishes in this limit due to the fact that the two PVMs then coincide. As will be demonstrated in section 8.5.5, the Davidovich-Haroche experiment, for general values of the parameters and is informationally equivalent to a measurement in which the second atom is replaced by a measurement of photon number in the final state of cavity C. Absence of information on the phase of the cavity C field explains the somewhat non-complementary behavior observed here. In section 8.5.6 an alternative measurement procedure will be discussed, better satisfying the canonical notion of complementarity, in which it is proposed to perform a measurement of the cavity C field also yielding phase information.
8.5. ATOMIC BEAM INTERFERENCE EXPERIMENTS
8.5.5 Informational aspects of the Davidovich-Haroche experiment
Decoherence
The Haroche-Ramsey experiment [368] was devised in the first place to probe decoherence in cavity C following the passage of a Rb atom, entering in state Hence in the final state (8.99). Restricting ourselves to it follows from (8.99) that, conditional on measurement result or the cavity C field is described by a superposition of coherent states These states can be considered as Schrödinger cat states (cf. section 3.1.3) if is sufficiently large. In [371] it was proposed to probe, by sending after a time T a second Rb atom through the system, whether a process of decoherence is active by which these superpositions could decay to a mixture of the coherent states and In the present section it is demonstrated that, in agreement with a result obtained by Vitali et al. [372], the second Rb atom can yield information only on cavity C’s photon number. Hence, any change of the measurement results obtained for this atom should be attributed to a change of the photon number distribution. Of course, this does not imply the absence of decoherence due to decay of phase correlations described by the off-diagonal elements (in a number representation) of the field density operator. However, this measurement is not sensitive to this latter form of decoherence: it registers only decoherence of the diagonal elements. In order to demonstrate this we have to determine what information is obtained about the cavity C field by the measurement of the second atom. The corresponding POVM can be found by equating, for an arbitrary initial coherent state of the cavity field, the final state probabilities and to expectation values and respectively. Once again restricting ourselves to we find the probabilities from final state as
from which we obtain
Note that this result holds independently of phase averaging. It is easily seen that POVM represents a nonideal measurement of the number observable in the sense of definition (7.37):
Using (3.38) it is possible to calculate the generalized von Neumann projection (3.38) of density operator on the subspace spanned by and representing the information that is obtained by a measurement of POVM Due to the infinite-dimensionality of the Hilbert space of the field this must be done with some care because the operators are not Hilbert-Schmidt operators then. For this reason the dimension must be truncated. Restricting ourselves to an arbitrarily large but finite value D, we get:
with
Then
Note that, although if yet Thus, it is easily verified that in the limit we have and the second equality explicitly demonstrating that contains the same information on the measurement results of POVM as does Although a measurement of this POVM can distinguish between a mixture of the states and and a mixture of the states and this is so only because the probability distributions of the photon number observable are different in the two mixtures. Decoherence, not accompanied by a change of photon number, cannot be observed using the Davidovich-Haroche experiment.
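The insensitivity to phase decoherence described above can be checked numerically in a toy setting. The sketch below is not the POVM of the Davidovich-Haroche experiment: it assumes a hypothetical two-outcome POVM that is diagonal in the number basis of a truncated field space (the names `D`, `lam`, `M_plus` and `M_minus` are illustrative only). Under that assumption, a density operator and its fully dephased version, which share the same photon number distribution, give identical outcome probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 6  # truncated Fock-space dimension (illustrative choice)

# Random density operator on the truncated space (positive, unit trace).
A = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
rho = A @ A.conj().T
rho /= np.trace(rho).real

# Fully dephased state: same diagonal (photon number distribution),
# all off-diagonal elements in the number representation removed.
rho_dephased = np.diag(np.diag(rho))

# Hypothetical POVM diagonal in the number basis (an assumption, not the
# book's POVM): M_plus = sum_n lam[n] |n><n| with 0 <= lam[n] <= 1.
lam = rng.uniform(size=D)
M_plus = np.diag(lam)
M_minus = np.eye(D) - M_plus

def probs(state):
    """Outcome probabilities Tr(state M) for the two POVM elements."""
    return np.array([np.trace(state @ M).real for M in (M_plus, M_minus)])

p_coherent = probs(rho)
p_dephased = probs(rho_dephased)
print(p_coherent, p_dephased)
```

The two probability vectors coincide because the trace of a product with a diagonal operator only involves the diagonal elements of the state; only a change of the photon number distribution can alter the statistics of such a measurement.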
Informational equivalence of the second atom to measurement of photon number
In this section it will be demonstrated that the Davidovich-Haroche experiment, in which a second atom is used as a probe of the cavity C field, is informationally equivalent to a Haroche-Ramsey experiment in which the ‘which way’ information is obtained by measuring photon number. This will be done by considering the informational aspects of measurement, introduced in section 3.3.5. From an informational point of view the important feature is the structure of the subspaces of Hilbert-Schmidt space, spanned by the operators generating the POVM, as a function of the experimental parameters. In order to be completely general, in this section we allow the different parameters to take arbitrary values. We first determine the POVM of the Haroche-Ramsey experiment in which cavity C photon
number is measured in coincidence with a determination of the final state of the atom. This POVM is found from (8.99) by the equalities
in which and are the measured joint probabilities, and is the PVM corresponding to the spectral measure of the photon number observable. With the operators and are found as
in which Excluding for which POVM reduces to a trivial refinement of PVM for most values of the parameters the operators and span the whole Hilbert-Schmidt space of operators on a 2-dimensional Hilbert space. Hence, in general the measurement is a complete measurement in the sense defined in section 7.9.4. The parameter values for which the measurement is incomplete can be found by looking for Hermitian operators T that are orthogonal to all and Thus, We find
Barring for all other values of the parameters no solution for T can be found. Hence, specializing the parameters of the experiment either to or to the pulse condition reduces the dimension of the subspace spanned by the operators of the POVM to 3, the dimensionality being further reduced if both conditions are satisfied simultaneously. By determining in the same way the Hilbert-Schmidt operators that are orthogonal to the operators of POVM (8.103) it is straightforward to prove that the Davidovich-Haroche experiment, discussed in
section 8.5.4, has exactly the same structure of subspaces, thus demonstrating the informational equivalence of these experiments for all values of the parameters. The subspace structure is not essentially changed by taking the detuning parameter Since then the operators T, found above, are particularly simple, viz, and . These are two orthogonal vectors, constituting together with the operators and an orthogonal basis of Hilbert-Schmidt space. Due to the uniqueness of the Hermitian projection operator this makes it particularly easy to calculate the projected density operator representing the information about the density operator provided by the measurement. We find
Note that all are non-negative (which should be the case for N = 2, compare section 3.3.5). If then information is obtained on the diagonal elements of only. Hence, for these parameter values the measurement is a nonideal measurement of PVM It is clear from this that from an informational point of view the parameter choice in section 8.5.4 was not completely appropriate for the purpose of reconstructing the initial density operator. If restricting to pulses, the measurement cannot retrieve For a complete determination of it is necessary that and (keeping One remark is in order here. Since the special parameter values and constitute sets of measure zero within the set of all possible values of the parameters, it might be thought that these special values are physically irrelevant because they cannot be attained in practice. In a strict sense this is correct. However, even though in practice the POVMs are complete, this does not mean that the subspace structure is unimportant. As a matter of fact, if the parameter values are near the special values given above, then the quantities will be very small. This means that the experimental error in the determination of these quantities is relatively large. Hence, the experimental probabilities will yield relatively poor information about the components of the Hilbert-Schmidt vector orthogonal to This illustrates the conclusion, drawn in section 7.10.4, that from an informational point of view different bases of a subspace of Hilbert-Schmidt space need not be equivalent, the quality of the information being largest for an orthogonal basis.
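The subspace analysis used in this section (counting the dimension of the Hilbert-Schmidt subspace spanned by the POVM elements, and looking for Hermitian operators T orthogonal to all of them) can be sketched for a 2-dimensional Hilbert space as follows. The three example POVMs are generic qubit measurements chosen for illustration; they are assumptions and not the POVMs of the Haroche-Ramsey or Davidovich-Haroche experiments. The function names are likewise hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def hs_dimension(povm, tol=1e-10):
    """Dimension of the Hilbert-Schmidt subspace spanned by the POVM elements."""
    vectors = np.array([M.reshape(-1) for M in povm])
    return int(np.linalg.matrix_rank(vectors, tol=tol))

def orthogonal_hermitian_ops(povm, tol=1e-10):
    """Hermitian operators T with Tr(T M) = 0 for every POVM element M."""
    basis = [I2, sx, sy, sz]  # Hermitian basis; T = sum_j c_j basis[j], c_j real
    G = np.array([[np.trace(M @ B).real for B in basis] for M in povm])
    _, s, vh = np.linalg.svd(G)
    rank = int(np.sum(s > tol))
    return [sum(c * B for c, B in zip(row, basis)) for row in vh[rank:]]

# Example 1: a PVM (ideal measurement of sigma_z); spans only 2 dimensions.
pvm = [(I2 + sz) / 2, (I2 - sz) / 2]

# Example 2: four outcomes built from sigma_z and sigma_x only; dimension 3.
three_dim = [0.25 * (I2 + m * 0.6 * sz + n * 0.6 * sx)
             for m in (1, -1) for n in (1, -1)]

# Example 3: tetrahedral POVM; spans the whole 4-dimensional space (complete).
dirs = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
complete = [0.25 * (I2 + d[0] * sx + d[1] * sy + d[2] * sz) for d in dirs]

for name, povm in [("pvm", pvm), ("three_dim", three_dim), ("complete", complete)]:
    print(name, hs_dimension(povm), len(orthogonal_hermitian_ops(povm)))
```

In the second example the single orthogonal operator is proportional to sigma_y, so the corresponding component of the density operator cannot be retrieved from the measured probabilities. This mirrors, in a toy setting, the reduction of the subspace dimension to 3 discussed above.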
8.5.6 The Haroche-Ramsey experiment as a joint nonideal measurement of interference and path observables
Rather than exploiting a second atom, or, equivalently, measuring the cavity C field observable jointly with observable we consider here the field observable defined by
where is a coherent state, and the integrations are over the upper and lower complex half-planes, respectively. This observable is a coarsening of the observable measured in the ‘eight-port’ homodyning detection method (compare section 8.4.3). For we have In this limit POVM evidently yields information on the phase shift caused by the Rb atom, and, hence, provides information about the internal state the atom was in while traversing cavity C. This information will be seen to be analogous to the path information obtained in neutron interference experiments of the type discussed in section 8.2. In the Haroche-Ramsey experiment [368] the value of was finite causing the distinguishability of the states to be only partial. As will be seen in the following, this loss of path information is compensated by the interference information obtained by measuring a quantity of the C field yielding information on both number and phase. In order to interpret the experiment as a measurement in the initial state we put yielding a POVM
with elements given by
Here and are the path and interference observables (8.93) and (8.96) defined above. The constant A is given by
and
and
are given by (8.101). The operator S is defined according to
being given by (8.93). In the phase averaged case the experiment represented by this POVM once again is a nonideal measurement of PVM Restricting ourselves to we find
yielding for the nonideality measure the same outcome (8.102) as obtained in the experiment in which no measurement is performed on the cavity C field. Evidently, in the phase averaged case such a measurement does not improve the information. Indeed, the two measurements are equivalent in the sense defined in section 7.6.7. We shall now consider POVM (8.104) when no phase averaging is performed. We should exclude here, too, since for this value of POVM reduces to a trivial refinement of PVM (this actually holds true for any choice of the observable of the cavity C field). In the limit the POVM reduces to representing a trivial refinement of the interference observable On the other hand, for the POVM reduces to the trivial refinement of the path observable We shall now demonstrate that if the pulse condition is satisfied (v arbitrary) and the Haroche-Ramsey experiment can be interpreted as a joint nonideal measurement of the incompatible observables and in the sense defined in section 7.9.1. To see this we define the bivariate POVM
For the two marginals we find
and
In the limits and the marginals represent ideal measurements of path and interference, respectively. The nonideality measures and (7.60) corresponding to the nonideality matrices and are found according to
For the special values of the parameters considered here we have Hence, for the observables and the right-hand side of
inequality (7.106) is nonvanishing, and it is impossible that and are both equal to zero. In figure 8.18 is plotted versus as a function of the parameter The resulting curve clearly exhibits the idea of complementarity expressed by inequality (7.106): the experiment constitutes a less accurate measurement of the interference observable as the path observable is determined more accurately by increasing (and vice versa). It is impossible that and both have small values. The nice feature of the condition is that the dependence of the marginals is completely taken into account by the nonideality matrices and PVMs and being independent of This feature is partly lost if we allow values For general values of the parameters we can represent POVM (8.104) in the following way:
From this we find as one marginal
which is still a nonideal measurement of path observable
However for the other marginal we get
with and different from Since PVM turns out to be dependent on the experiment is no longer a joint measurement of one stable PVM pair when varying Nevertheless for each set of parameters PVM is incompatible with the path observable, and inequality (7.106) is satisfied by the nonideality measures and Since the parameters A and depend on and only as this latter quantity, together with determines the measure of complementarity of observables and By comparing figures 8.19 and 8.17 it is seen that, contrary to the Davidovich-Haroche experiment, in the present experiment these observables are complementary both for and complementarity being largest for The measurement represented by POVM (8.104) is not a complete one. It can be verified that, if the operator is orthogonal to all operators of the POVM. However, for no parameter values exist for which subspace has dimension smaller than 3. This demonstrates the informational superiority of the present measurement, based on homodyning, over the one measuring photon number. By refining the partition of the complex plane, POVM (8.104) can easily be refined to one spanning the whole Hilbert-Schmidt space of 2 × 2 matrices, allowing a complete determination of the incoming state of the atom.
An interesting aspect of the generalized measurements considered here is that the measured probability distributions are dependent on the phase of the microwave fields at the moment the atom enters the cavity. As a result of a restriction of the atom’s initial state to this feature was not present in the experiments that have been performed up to now. In order to see this effect, experimental conditions should be such that off-diagonal elements of the operators of the measurement’s POVM do not all vanish.
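The kind of trade-off discussed in this section, in which one marginal of a bivariate POVM becomes a more accurate measurement as the other becomes more nonideal, can be illustrated with a standard qubit toy model. The sketch below is an assumption-laden stand-in, not POVM (8.104): it takes sigma_z and sigma_x as hypothetical "path" and "interference" observables and uses the bivariate POVM M(m, n) = (1/4)(I + m a sigma_z + n b sigma_x), with sharpness parameters `a` and `b` that are illustrative names only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def joint_povm(a, b):
    """Bivariate operators M[(m, n)] = (I + m*a*sz + n*b*sx) / 4, m, n = +1 or -1."""
    return {(m, n): 0.25 * (I2 + m * a * sz + n * b * sx)
            for m in (1, -1) for n in (1, -1)}

def is_valid_povm(povm, tol=1e-12):
    """Check positivity of every element and normalization to the identity."""
    positive = all(np.linalg.eigvalsh(M).min() >= -tol for M in povm.values())
    normalized = np.allclose(sum(povm.values()), I2)
    return bool(positive and normalized)

def marginals(povm):
    """Sum over one index to obtain the two marginal POVMs."""
    path = {m: povm[(m, 1)] + povm[(m, -1)] for m in (1, -1)}
    interference = {n: povm[(1, n)] + povm[(-1, n)] for n in (1, -1)}
    return path, interference

a = b = 1 / np.sqrt(2)  # boundary case a**2 + b**2 = 1
povm = joint_povm(a, b)
path, interference = marginals(povm)

print(is_valid_povm(povm))              # operators are positive and sum to I
print(is_valid_povm(joint_povm(0.9, 0.9)))  # a**2 + b**2 > 1: not positive
```

Each marginal is a smeared version of the corresponding PVM, for instance path[+1] = (I + a sigma_z)/2, which is ideal only for a = 1. Positivity of the bivariate operators requires a² + b² ≤ 1, so in this toy model both marginals cannot be ideal simultaneously.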
Chapter 9
The Bell inequality in quantum mechanics
9.1 Introduction
9.1.1 The EPR problem, standard and generalized
Originally, the Bell inequality was devised as a test for so-called hidden-variables theories which have the intention to understand quantum mechanical probability distributions analogously to classical statistical mechanics. In chapter 10 this will be discussed at some length. In the present chapter we shall stay completely within the boundaries of (generalized) quantum mechanics. The Bell inequality plays an important role also here. Two different derivations of it will be discussed (cf. sections 9.2 and 9.4) not using hidden variables. The important point is that the Bell inequality is in disagreement both with the standard formalism of quantum mechanics, and with a number of experiments that have been performed to test the inequality. On the other hand, the generalized formalism, discussed in chapter 7, describes experiments for which the Bell inequality is satisfied (cf. section 9.3.1). Hence, comparing experiments described by the standard formalism with generalized ones might clarify the meaning of the Bell inequality. This is done in the present chapter. The Bell inequality is often associated with the Einstein-Podolsky-Rosen problem discussed in chapter 5, in which two particles (or photons), viz, particle 1 and particle 2, fly apart after they have interacted. From the point of view of preparation of the initial state this association is justified. However, with respect to measurement there is an important difference between the original EPR problem and the experimental situation to which the Bell inequality is applicable. As we saw in section 5.2, it was essential to the EPR reasoning that a measurement is performed on
one of the two particles only (cf. figure 5.1), and that the other particle does not have any interaction with a measuring instrument. By contrast, the Bell inequality has relevance to a modified experimental arrangement in which measurements are performed on both particles. This difference can be seen as one of Bell’s most important contributions to the discussion, raising the EPR problem from the metaphysical level at which the discussion between Bohr and Einstein took place to a level at which the controversy could be decided by means of experiment: the Bell inequality is about measurement results of measurements that have actually been performed. Unfortunately, this fundamental difference is blurred by the habit of referring to experiments testing the Bell inequality as EPR experiments. In order to avoid confusion we shall not join this habit, but refer to these modified experiments as EPR-Bell experiments rather than EPR experiments. Conventionally, in the Bell inequality four different EPR-Bell experiments are involved, symbolically represented in figure 9.1. Either or is measured on particle 1 in coincidence with a measurement of either or on particle 2. Observables and are standard observables. Each observable of particle 1 is compatible with those of particle 2. However, we choose Eigenvalues of are denoted as those of as Each of the four EPR-Bell experiments is a joint measurement of two compatible standard observables, and can be described using the standard formalism of section 1.3. For each experiment the bivariate probability distribution (1.26) is displayed in figure 9.1. These probability distributions can be measured. The Bell inequality is expressed in terms of these probability distributions, and, for this reason, is experimentally testable. In the following the experiments of figure 9.1 will be referred to as the standard EPR-Bell experiments to distinguish them from the generalized EPR-Bell experiment to be discussed in section 9.3.1.
Several EPR-Bell experiments have been performed in the recent past, the most well-known being the Aspect ones [373, 290] in which a photon pair is created by exciting a calcium atom to the state, and subsequent decay via the cascade. The two photons have opposite momenta. Hence, they fly apart, and can be detected coincidentally when their mutual distance has become appreciable (12 m). Since the polarization state vector prepared in the experiment is very similar to the singlet state of the Bohm-Aharonov model discussed in section 5.4, the photon polarizations are strictly correlated, in the sense that equal measurement results are obtained if linear polarizations of the two photons are measured in equal directions perpendicular to the direction of motion. The Aspect experiments have proven in a rather unambiguous way that the Bell inequality can be violated in standard quantum mechanics. Here we ignore a criticism [374] to the effect that the Bell inequality can be tested experimentally only by using highly efficient detectors, and that in actual experiments efficiencies have never been high enough to be able to establish violation of this inequality, thus leaving open the possibility that the “true” data satisfy the inequality. This is sometimes referred to as the ‘detection’ or ‘(in)efficiency loophole’. It is true that the possibility of a disturbance of the measurement results by the measurement arrangement is essential for understanding the EPR-Bell experiments (cf. section 9.3). However, such a disturbance has rather the effect of making measurement results satisfy the Bell inequality when they would violate it without disturbance (compare (9.18)), than the other way around. Hence, if inefficient measurements violate the Bell inequality it is not to be expected that efficient ones won’t. For this reason the ‘detection loophole’ can hardly be an explanation of experimental violation of the inequality (cf. section 9.2.2). 
Moreover, this loophole seems now to have been closed experimentally by means of measurements in which detection efficiency is sufficiently high [375]. The four standard EPR-Bell experiments of figure 9.1 can be considered as measurements of the incompatible standard observables and respectively. Since the Bell inequality is a relation between the (bivariate) probability distributions of these experiments, for being able to derive the inequality it is necessary to have at one’s disposal relations between measurement results of different EPR-Bell experiments. In the derivations of sections 9.2 and 9.4 such relations are indeed supposed to exist. Thus, in section 9.2 it will be supposed that a quadrivariate probability distribution exists yielding the bivariate ones as marginals. In section 9.4 it will be assumed that values can be jointly attributed to the four observables measured in the EPR-Bell experiments, thus defining quadruples of measurement results. Both assumptions will be seen to be sufficient to derive the Bell inequality. Therefore a crucial question in discussing the relevance of the Bell inequality is whether these assumptions are warranted. Different answers to this question are possible, depending on the formalism used
and the interpretation adhered to. It is possible to derive inequalities already for the case of three observables (e.g. Leggett and Garg [376]) rather than for the four observables that are necessary for the Bell inequality. This implies that conclusions to be drawn about quadruples or quadrivariate probability distributions presumably are valid also for triplets and trivariate probability distributions. We shall restrict ourselves in the following to the case of four observables. In the standard formalism no mathematical expressions seem to exist which can be interpreted as joint probability distributions of incompatible observables (cf. Wigner’s theorem, section 1.11.6). Hence, in this formalism the assumption of a quadrivariate probability distribution of observables and (determining the four observables of the standard EPR-Bell experiments) does not have a justification. The situation is different, however, in the generalized formalism discussed in chapter 7. Analogously to section 1.9 a joint probability distribution of the four observables is possible here, even if some of these are incompatible. In section 9.3 a generalized EPR-Bell experiment will be discussed that can be interpreted as a joint nonideal measurement of the four observables (cf. de Muynck, De Baere and Martens [377]). We shall find the POVM, and the quadrivariate probability distribution describing the correlations between these observables in the generalized EPR-Bell experiment. Referring to section 1.9.2, it should be stressed here that the existence of a joint probability distribution of a number of observables can generally be interpreted as signifying their commeasurability. 
This allows us to draw at least one conclusion, viz that violation of the Bell inequality by the standard EPR-Bell experiments must be a consequence of the incommeasurability of the pertinent observables: if they are commeasurable, then a quadrivariate probability distribution exists causing the Bell inequality to be satisfied. If it is found by some reasoning that the standard EPR-Bell experiments of figure 9.1 should satisfy the Bell inequality, then it may be concluded that incommeasurability has not sufficiently been taken into account (for standard observables incommeasurability is tantamount to incompatibility). This, actually, is the basic thesis defended in the present chapter. Since in the standard formalism compatible observables have joint probability distributions (cf. (1.26)), and, hence, the Bell inequality is satisfied if and are mutually commuting, from this point of view its violation by the standard EPR-Bell experiments should be connected with the incompatibility of the pertinent observables. It should be noted that incompatibility was precisely the issue in the original EPR discussion of chapter 5. At that time, however, no formal description of a joint measurement of incompatible observables was possible because the generalized formalism was not yet available. EPR could relate incompatible observables and of particle 2 only by linking them together by means of joint measurements of each of the observables with an observable of particle 1. Stated in this way, Bell inherited from EPR a rather clumsy way of dealing with incompatibility of observables, the compatibility of the observables of particle 1 with those of particle 2 being a complicating factor and, hence, a possible source of confusion. By employing the generalized formalism it will be seen in section 9.3 that, indeed, violation of the Bell inequality is a consequence of the two incompatibilities and From this point of view the Bell inequality does not seem to add very much to the insights already obtained in the Heisenberg and Martens inequalities, and would hardly justify all the attention paid to it in the literature if it did not have additional assets (cf. section 9.1.2).
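The violation by four standard EPR-Bell experiments can be reproduced in a few lines. The sketch below rests on several assumptions that are not taken from the text: it uses the CHSH form of the inequality (the form derived in this chapter may differ), a spin-1/2 singlet state rather than the photon polarization state of the Aspect experiments, and hypothetical names `a1`, `a2` (settings for particle 1) and `b1`, `b2` (settings for particle 2), with incompatible observables on each side.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(theta):
    """Spin observable along a direction at angle theta in the x-z plane."""
    return np.cos(theta) * sz + np.sin(theta) * sx

# Singlet state (|01> - |10>) / sqrt(2) in the basis |00>, |01>, |10>, |11>.
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def correlation(a, b):
    """Expectation value of spin(a) on particle 1 times spin(b) on particle 2."""
    op = np.kron(spin(a), spin(b))
    return float(np.real(singlet.conj() @ op @ singlet))

# Each of the four experiments measures one compatible pair (one observable
# per particle); spin(a1), spin(a2) do not commute, nor do spin(b1), spin(b2).
a1, a2 = 0.0, np.pi / 2
b1, b2 = np.pi / 4, -np.pi / 4

S = (correlation(a1, b1) + correlation(a1, b2)
     + correlation(a2, b1) - correlation(a2, b2))
print(abs(S))  # exceeds the CHSH bound of 2
```

For these settings the magnitude of S equals 2√2, above the bound 2 that would hold if a quadrivariate probability distribution reproduced all four bivariate distributions as marginals.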
9.1.2 Bell’s inequality, and interpretations of quantum mechanics
Due to its hidden-variables origin the Bell inequality is often seen as referring to the microscopic reality behind the phenomena rather than as referring to these phenomena. For this reason it is closely related to a realist interpretation of quantum mechanics. However, an empiricist interpretation is possible too. An important reason to be interested in the Bell inequality is that it is a valuable tool in assessing the relative merits of different interpretations of the quantum mechanical formalism, different interpretations implying different attitudes with respect to the existence of quadruples of measurement results and of quadrivariate probability distributions. In the minimal interpretation of the standard formalism (cf. section 6.4.1) the problem of the Bell inequality cannot arise. First, simultaneous or joint measurement of incompatible observables is deemed impossible. Second, in the minimal interpretation it is thought not to be sensible to attribute a value to a standard observable incompatible with the one actually measured. Hence, it is not possible to obtain quadruples in this way. This is sometimes characterized by means of the orthodox Copenhagen maxim “Unperformed experiments have no results” (Peres [94]). Thus, if the EPR pair is measured, then the formalism is thought not to yield any information on either or because of their incompatibility with and respectively. In order to obtain a quadruple of measurement results for the four standard EPR-Bell experiments it would be necessary to equate certain measurement results obtained in different experiments (e.g. of observable in the and measurements). However, in the minimal interpretation the formalism is not thought to make any other statements about relations of values of observables (either compatible or incompatible), obtained in different experiments, than the statistical ones embodied in the measured probability distributions. 
In particular, incompatible observables are thought not simultaneously to possess values. Hence, quadruples of values of such observables do not exist. This makes it impossible to derive the Bell inequality. The solution of the minimal interpretation is rather a minimal one, though: by minimizing the requirements to be satisfied by the theoretical quantities of the theory, the problems related to the Bell inequality are evaded but not solved. The structure of quantum ensembles remains largely in the dark. This holds in particular for (cor)relations between incompatible observables. Two ways are available to be able to make statements about simultaneous values of incompatible observables: i) by generalizing the formalism of quantum mechanics it is possible to consider joint (nonideal) measurements of incompatible observables (cf. section 7.9), for each individual preparation yielding values of the incompatible observables; ii) staying within the standard formalism it might be attempted to strengthen the interpretation so as to make allowance for certain relations between the values of incompatible observables not measured jointly. Both possibilities will be discussed in the present chapter. As regards the first possibility, it was noted already in section 7.2.2 that the empiricist interpretation is the natural one in dealing with the generalized formalism of quantum mechanics, measurement results being interpreted as pointer positions of measuring instruments that are actually present. In contrast to the minimal interpretation incompatible observables may jointly have values in the empiricist one. Commeasurability of observables is automatically taken into account because quadruples of measurement results are considered only if obtained in a single measurement event. In the generalized formalism the Bell inequality can be derived for the measurement results of a generalized EPR-Bell experiment that is interpretable as a joint (nonideal) measurement of the four observables and (cf. section 9.3). Since the experimental data in this experiment do satisfy the inequality, there is no discrepancy between theory and experiment. It is important to note that, since the standard EPR-Bell experiments are just special cases of the generalized one, the same conclusion is obtained for these latter experiments. 
As long as considerations are restricted to measurement results obtained in one single measurement arrangement, the Bell inequality is satisfied. Violation of the inequality occurs only if measurement results are combined that are obtained in different arrangements. In an empiricist interpretation such combining is thought to be impossible, not because (like in the minimal interpretation) observables incompatible with the measured one would not have a value at all, but because such observables could have different values from the ones found in a different measurement arrangement (compare section 7.10.3; see also the discussion of ‘counterfactual definiteness’ in section 9.4.1). The second possibility refers to a realist interpretation, which, as already noted in chapter 2, is often favored over the empiricist one. In a realist interpretation observables are considered as properties of the object rather than as properties of the measuring instrument. This may have as a consequence that stronger requirements can be put with respect to correlations between observables. In particular, in this interpretation it is tempting to equate measurement results of the same observable, obtained in different experiments, if the preparations were identical (e.g. the values
of in the and measurements), identity of preparations being thought to be well-defined in terms of the properties of the individual microscopic objects. The possibility of equating measurement results obtained in different experiments, provided by this strengthening of the interpretation, has a large influence on the possibility of deriving the Bell inequality. In sections 9.4 and 9.6 it will be seen that in a realist interpretation it is indeed possible to make such strong assumptions that the Bell inequality also applies to the standard EPR-Bell experiments, and, hence, entails a contradiction. This holds true in particular for an objectivistic-realist interpretation in which the ‘possessed values’ principle (cf. section 2.3) is assumed. However, it was already seen in section 6.4.2 that the ‘possessed values’ principle is an assumption that is far too strong to be a reasonable property to be satisfied by quantum mechanical measurement results. By attributing the values of the quantum mechanical measurement results as objective properties to the object, an attempt is made to accommodate these properties in a structure that is not essentially different from the structure of the physical quantities of classical mechanics. It is not surprising, then, that a conflict arises with the Hilbert-space structure of the standard formalism. As was already discussed in section 2.4, for this reason an objectivistic-realist interpretation of quantum mechanics does not seem to be very attractive. A discussion of the Bell inequality within quantum mechanics just strengthens this conclusion. As also discussed in section 2.4, in a realist interpretation of quantum mechanics there is an alternative to objectivistic realism, viz, contextualistic realism, holding that the observables possess their values only within the context of the measurement arrangement that is actually present. 
A contextualistic-realist interpretation is a considerably weaker proposition than an objectivistic-realist one, because neither the ‘possessed values’ principle nor ‘counterfactual definiteness’ is assumed to be valid. It will be seen in sections 9.4.2 and 9.6.2 that in a contextualistic-realist interpretation the Bell inequality is no longer derivable for the standard EPR-Bell experiments, thus preventing a disagreement with experiment. Hence, from the point of view of the Bell inequality the same conclusion can be reached as drawn in section 2.4.5, viz that a contextualistic-realist interpretation is just as possible as an empiricist one. Yet the reasons, mentioned in section 2.4, to prefer an empiricist interpretation are still valid here. As a matter of fact, the Bell inequality is tested by means of observations of pointer positions of measuring apparata, not by direct observation of contextual properties of the microscopic object. Nevertheless in sections 9.4.2 and 9.6 the notion of a contextual state, introduced in section 2.4.5, will be applied to EPR-Bell experiments. The reason for this is two-fold, i) because the contextual state, although quantum mechanical, may possibly approximate a description of the microscopic object more closely than is achieved by the other quantum mechanical quantities (see also section 10.6); ii) because of its relevance to an
approach sometimes referred to as “Bell’s theorem without inequalities” [299, 378], to be discussed in section 9.6.
9.1.3 Bell’s inequality and nonlocality
Often violation of the Bell inequality is attributed to nonlocality of quantum mechanical reality¹. This idea has its origin both in the Copenhagen interpretation of quantum mechanics and in Bell’s original derivation of the inequality, starting from a theory of local hidden variables. The first one is based on Einstein’s final conclusion in the original EPR controversy that a choice has to be made between ‘completeness of quantum mechanics’ and ‘locality’ (cf. section 5.3.1). A choice in favor of ‘completeness’ seems to force adherents of the Copenhagen interpretation into acceptance of ‘nonlocality’, a consequence seemingly accepted by Bohr. The second one, on the contrary, starts from the ‘incompleteness of quantum mechanics’. In both approaches the idea is that in an EPR-Bell experiment the measurement arrangement for particle 1, by nonlocally influencing the microscopic reality of particle 2, (co-)determines which measurement result is obtained in a measurement on the latter particle. Thus, for the same preparation of an individual particle pair a measurement on particle 2 of observable would yield a different measurement result if, instead of a measurement arrangement for observable of particle 1 a measurement arrangement for observable would have been set up. This is sometimes referred to as ‘parameter dependence’ (e.g. Shimony [379]), since the measurement arrangement may be specified by the value of some parameter. The corresponding correlations are referred to as ‘nonlocal correlations’, the reason for this terminology being the (questionable) idea that such correlations are not brought about by means of previous local interactions, but by means of a direct nonlocal interaction between the regions of particles 1 and 2.
1 By a certain abuse of language, ‘nonlocality’ is usually understood in this context as ‘violation of relativistic causality’, i.e. the possibility of superluminal propagation of interactions between distant particles. Although this can occasionally cause some confusion, in the following we shall conform to this terminology.
The circumstance that the ‘completeness’ and the ‘incompleteness’ views of quantum mechanics, although diametrically opposed to each other, both entail a conclusion of ‘nonlocality’, may be a reason for some to believe in the validity of the idea of ‘nonlocality’ of the quantum world. In the following this conclusion will be critically investigated. Our conclusion will be that violation of the Bell inequality by the standard EPR-Bell experiments does not necessarily imply nonlocality of the quantum world, and that in the reasonings leading to the conclusion of ‘nonlocality’ the assumption of either completeness or incompleteness of quantum mechanics cannot be the decisive one (see also section 9.5.3). As observed in section 9.1.1, violation of the Bell inequality is related to incommeasurability of the observables that are
involved. It is important to take this into account in assessing the inequality’s significance. Violation of the Bell inequality is a consequence of the two incompatibilities and Due to the principle of local commutativity (cf. section 1.3.1) incompatibility of quantum mechanical observables is a local affair. The fact that the Bell inequality is satisfied if the four observables involved are mutually commutative, and, hence, violation of the inequality is a consequence of incompatibility, should make us suspicious about the often-heard statement that the results of the Aspect experiments constitute experimental evidence of nonlocality. Evidently, violation of the Bell inequality is a measure of incompatibility of the observables. Unfortunately, it is not such a good measure because, although incompatibility is necessary, it is not sufficient for violation of the Bell inequality. It, actually, requires a careful choice of the quantum mechanical state to be able to experimentally realize violation (compare section 9.2.2). This can be understood because in the Aspect experiments some of the observables do commute. The idea of incompatibility immediately directs our attention toward the essential influence of the interaction between object and measuring apparata, recognized in the Copenhagen interpretation as the primary cause of incompatibility in quantum mechanics. Bohr saw this interaction as the basic feature entailing the notions of ‘complementarity’ and ‘completeness (in a restricted sense)’ (compare section 4.2.2). It will be seen that, whenever a conclusion of ‘nonlocality’ is reached on the basis of violation of the Bell inequality, the essential influence of the interaction between object and measuring apparata is not taken into account in a sufficient way. This holds true independently of the question of ‘(in)completeness in a wider sense’, both in quantum mechanics as well as in the local hidden-variables theories to be discussed in chapter 10. 
The essential difference between the original EPR problem (in which a measurement is performed on one particle only), and the EPR-Bell measurement arrangements of figure 9.1 (in which measurements are performed on both particles) is very important here, although seldom noticed. Due to the fact that the measurement arrangement of figure 5.1 is taken as a paradigm, often the influence of only one of the two measuring apparata is taken into account. This may cause an ambiguity. Thus, for instance in the Bohm-Aharonov version of the EPR problem, discussed in section 5.4, often von Neumann’s projection postulate is applied, to the effect that in the state (5.8) a measurement result would cause the state of particle 2 to change into However, if simultaneously a measurement of, e.g., would be performed, then the same projection postulate would require a transition into one of the states If the projection of the state is interpreted in a realist sense as a “real” state change, then the question arises which of the two projections is the real one. Neglect of the influence exerted by the measurement arrangement particle 2 itself is interacting with may easily lead to the same conclusion as Bohr’s one, viz, that the measurement context of particle 1 is determinative for the reality
of particle 2 (cf. section 5.3.1), and, hence, to a conclusion of nonlocality. This conclusion was already criticized in section 5.4, a combination of a realist interpretation and the projection postulate being held responsible for the idea of ‘nonlocality’ stemming from the original EPR reasoning. In an empiricist interpretation of the EPR experiment it is thought possible to make quantum mechanical assertions about particle 2 only as far as (conditional) preparation is concerned. Since in an EPR-Bell experiment a measurement is also performed on particle 2, it is possible to distinguish between conditional preparation of the state of particle 2 (conditional on the measurement result of particle 1), and measurement results of measurements performed on particle 2. As will be seen in section 9.6, when the presence of both measuring instruments is taken into account, there is no necessity to resort to nonlocality to be able to explain the violation of the Bell inequality by quantum mechanics. As will be seen in the following, the conclusion that violation of the Bell inequality by the Aspect experiments is a consequence of nonlocality has been obtained on the basis of a combination of inadequate logic, an unnecessarily strong interpretation of the mathematical formalism of quantum mechanics, and a too restricted formalism, viz, the standard one.
9.2
Derivation of the Bell inequality from the existence of a quadrivariate probability distribution
9.2.1 The BCHS inequality
Rastall [380] and Fine [381] (see also de Muynck [382]) have demonstrated that the existence of a quadrivariate probability distribution from which the bivariate ones can be derived as marginals is sufficient for a derivation of the Bell inequality. We shall now first give the derivation. By assumption there exists a quadrivariate probability that the measurement results of the four observables and satisfy and in which are subsets of the spectra of the observables. We denote by the complement of The derivation of the Bell inequality is based on the following assumptions:
Theorem: Every probability distribution satisfying the properties (9.1) obeys the inequality
and analogous ones obtained by means of permutation of the indices. Proof:
Combining (9.3) and (9.5) yields
From (9.5) and the equality
it follows that
Then from (9.4) and (9.7) we get
Inequalities (9.6) and (9.8) can be combined to give (9.2).
Inequality (9.2) is known as the Bell-Clauser-Horne-Shimony (BCHS) inequality. It contains only bivariate probability distributions like the ones measured in the four standard EPR-Bell experiments discussed in section 9.1. For this reason it can be experimentally tested by performing these measurements. The inequality was derived earlier by Clauser and Horne [383] on the basis of a so-called local hidden-variables theory. It should be emphasized that in the derivation given here no reference whatsoever is made to hidden variables. The measure corresponds to a probability distribution of quantum mechanical measurement results. The only assumption made is that the joint probability distribution exists, and that it satisfies
the relations (9.1). These are just the relations satisfied by the random variables of Kolmogorov’s classical theory of probability [384] (see also appendix A.12). Hence, insofar as the standard EPR-Bell experiments do not satisfy this inequality, the only conclusion we can draw is that quantum mechanical observables cannot be interpreted as classical random variables. Since we are dealing here with incompatible observables, and since incompatibility is the typically quantum mechanical characteristic distinguishing this theory from classical mechanics, this conclusion is hardly sensational. In view of the connection (often supposed to exist) with the issue of nonlocality it should be noted that in the derivation of the inequality given above locality could play a role only if it were a necessary condition for the existence of the quadrivariate probability distribution, or, conversely, if nonlocality were a cause of the nonexistence of such a joint probability distribution. This will be discussed extensively in section 9.5. It is noted already here that we shall not have any occasion to doubt the assumption of locality. On the contrary, an assumption of nonlocality would seem to be in contradiction with the postulate of local commutativity (cf. section 1.3), the latter entailing for the standard EPR-Bell experiments the equalities
In agreement with the Kolmogorov axioms (9.1), expressing that the probability distributions of the measurements on particle 1 are independent of which measurement is performed on particle 2, these equalities are satisfied by the EPR-Bell experiments. Note that in the classical probability theory equality (9.9) is valid independently of the question of locality, classical variables being considered as objective properties of the objects, possessed independently of any measurement.
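The content of the theorem of this section can be checked numerically. The following is a minimal sketch, not the derivation given above. It assumes that the middle term of (9.2) has the standard Clauser-Horne form p(x1,x2) + p(x1,y2) + p(y1,x2) − p(y1,y2) − p(x1) − p(x2), bounded by −1 and 0 (the printed form of (9.2) is not reproduced here, so the assignment of the minus sign is an assumption). Each subset of a spectrum is encoded by a 0/1 indicator, and the labels x1, y1, x2, y2 as well as all function names are hypothetical.

```python
import itertools
import random

# The 16 joint outcomes (x1, y1, x2, y2): each entry is 1 if the result of the
# corresponding observable lies in the chosen subset of its spectrum, else 0.
OUTCOMES = list(itertools.product((0, 1), repeat=4))

def random_quadrivariate(rng):
    """An arbitrary normalized probability distribution over the 16 outcomes."""
    weights = [rng.random() for _ in OUTCOMES]
    total = sum(weights)
    return {o: w / total for o, w in zip(OUTCOMES, weights)}

def marginal(p, indices):
    """Probability that every observable listed in `indices` has indicator 1."""
    return sum(prob for o, prob in p.items() if all(o[i] for i in indices))

def clauser_horne(p):
    """Middle term of the ASSUMED Clauser-Horne form of the BCHS inequality."""
    x1, y1, x2, y2 = 0, 1, 2, 3
    return (marginal(p, (x1, x2)) + marginal(p, (x1, y2))
            + marginal(p, (y1, x2)) - marginal(p, (y1, y2))
            - marginal(p, (x1,)) - marginal(p, (x2,)))

rng = random.Random(0)
values = [clauser_horne(random_quadrivariate(rng)) for _ in range(1000)]
print(min(values), max(values))  # both stay between -1 and 0
```

Because all bivariate and univariate probabilities are computed as marginals of one and the same distribution, the marginal of each particle is automatically independent of what is evaluated for the other particle. No assumption about locality or hidden variables enters the check.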
9.2.2
Derivation of the Bell inequality from the BCHS inequality
If the subsets consist each of one single point of the spectrum of an observable, then the BCHS inequality (9.2) has the form
The inequality originally derived by Bell [33] does not refer directly to the probabilities etc., but to the expectation values of the corresponding correlation observables etc., being linked to the probabilities according to
Bell considered dichotomic observables, which have only two (nondegenerate) eigenvalues, +1 and –1, each. For such observables the following inequality has been derived [385], known as the Clauser-Horne-Shimony-Holt (CHSH) inequality:
This inequality is now directly derived from the BCHS inequality (9.10), and, hence, from the assumption of the existence of a quadrivariate probability distribution. Proof: Consider expression (9.11). For the special observables specified above it becomes
It can directly be verified that from this it follows that
in which
stands for the middle term of (9.10). Since
it follows that By interchanging
and
it can analogously be proven that
Combining (9.13) and (9.14) yields (9.12).
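The passage from probabilities to expectation values used in this proof can be illustrated by a small numerical check. The sketch below assumes dichotomic values +1 and –1, takes each subset to be the single point +1, and again assumes the Clauser-Horne form for the middle term of (9.10). It verifies the linear relation S = 4C + 2 between the CHSH combination S of correlation functions and that middle term C. This relation is a standard identity; it is not claimed to coincide with the intermediate equations (9.13) and (9.14), which are not reproduced here. All names are hypothetical.

```python
import itertools
import random

# Joint values (a1, b1, a2, b2) of four dichotomic observables.
VALUES = list(itertools.product((+1, -1), repeat=4))

def expectation(p, i, j):
    """Correlation function: expectation value of the product of values i and j."""
    return sum(prob * v[i] * v[j] for v, prob in p.items())

def prob_plus(p, *indices):
    """Probability that all listed observables have the value +1."""
    return sum(prob for v, prob in p.items() if all(v[i] == +1 for i in indices))

def chsh(p):
    """CHSH combination of the four correlation functions."""
    return (expectation(p, 0, 2) + expectation(p, 0, 3)
            + expectation(p, 1, 2) - expectation(p, 1, 3))

def clauser_horne(p):
    """Assumed middle term of (9.10) for the one-point subsets {+1}."""
    return (prob_plus(p, 0, 2) + prob_plus(p, 0, 3) + prob_plus(p, 1, 2)
            - prob_plus(p, 1, 3) - prob_plus(p, 0) - prob_plus(p, 2))

rng = random.Random(2)
weights = [rng.random() for _ in VALUES]
p = {v: w / sum(weights) for v, w in zip(VALUES, weights)}
print(chsh(p), 4 * clauser_horne(p) + 2)  # the two numbers agree
```

Since the middle term lies between −1 and 0 for any quadrivariate distribution, the identity immediately bounds the CHSH combination between −2 and +2.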
Bell considered special states and observables such that
This leads to the inequality
which is the inequality originally derived by Bell. It is customary to refer also to the more general inequality (9.12) as the Bell inequality. It is possible to find examples of quantum mechanical observables and states for which (9.16) is not satisfied. For instance, when the state of a two-particle system is the singlet state S = 0, and and are suitably normalized spin components in directions a, b, b and respectively (cf. figure 9.2), then the quantum mechanical values of the correlation functions, apart from (9.15), are found as With the directions chosen as indicated in the figure, (9.16) would lead to the inequality It is clear that this is impossible. Hence, quantum mechanics violates the Bell inequality.
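The violation can be made explicit with a short computation. The following sketch evaluates the singlet correlation functions directly from the state vector and inserts them into two inequalities. Several points are assumptions: the directions (coplanar, at 0°, 60° and 120° for the original Bell form, and at 0°, 90°, 45°, 135° for the CHSH form) are common textbook choices and need not be those of figure 9.2; the original inequality is taken in Bell's 1964 form |E(a,b) − E(a,c)| ≤ 1 + E(b,c); and all function names are hypothetical.

```python
import math

def spin_component(theta):
    """Spin component (normalized to eigenvalues +1, -1) along a direction in
    the x-z plane at angle theta from the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

# Singlet state (|01> - |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

def correlation(theta_1, theta_2):
    """Expectation value of the product of the two spin components."""
    a, b = spin_component(theta_1), spin_component(theta_2)
    total = 0.0
    for i, j, k, l in [(i, j, k, l) for i in range(2) for j in range(2)
                       for k in range(2) for l in range(2)]:
        total += SINGLET[2 * i + j] * a[i][k] * b[j][l] * SINGLET[2 * k + l]
    return total

# Original Bell form with assumed directions a, b, c at 0, 60 and 120 degrees.
a, b, c = 0.0, math.pi / 3, 2 * math.pi / 3
lhs = abs(correlation(a, b) - correlation(a, c))
rhs = 1 + correlation(b, c)
print(lhs, rhs)  # lhs exceeds rhs, so the inequality is violated

# CHSH form with assumed directions 0, 90 (particle 1) and 45, 135 (particle 2).
a1, b1, a2, b2 = 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4
s = (correlation(a1, a2) - correlation(a1, b2)
     + correlation(b1, a2) + correlation(b1, b2))
print(abs(s))  # exceeds the bound 2
```

The computed correlation equals minus the cosine of the angle between the two directions, which is the familiar singlet result.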
The Bell inequality is not violated by every set of four quantum mechanical observables. In particular, the inequality is satisfied if all four observables are mutually commuting, since the existence of a quadrivariate probability distribution implies satisfaction of the BCHS inequality. Hence, incompatible observables must be involved for violation of the Bell inequality. Even then special care must be taken to warrant experimental violation. Thus, by decreasing detector efficiency it is possible that the Bell inequality is not violated any more. Taking into account detector efficiency in (9.16) implies that the correlation functions in (9.17) should be multiplied by yielding the inequality
instead of (9.18). This inequality is satisfied if Hence, experimental violation of Bell’s inequality requires a very high detector efficiency (compare the ‘detection loophole’ referred to in section 9.1.1). Such a violation requires also a special choice of the state. For experiments of the EPR-Bell type it is necessary that the state be an entangled state (cf. section 1.5.3). For the density operator (1.68) of a general non-entangled state the Bell inequality once again follows from the existence of a quadrivariate probability distribution, viz,
yielding the experimental probabilities of the Aspect experiments as marginals (see also Selleri and Tarozzi [386]). Hence, violation of the Bell inequality can be obtained only by a combination of incompatible observables and an entangled state. This conclusion is not changed if in (9.19) the PVMs are replaced by POVMs. The BCHS relation (9.10) has two important advantages over the Bell inequality (9.12). First, the BCHS inequality is applicable to a much wider set of observables than the ones to which the Bell inequality (9.12) can be applied: the observables need not be dichotomic; moreover, in the case of dichotomic observables there is no restriction to eigenvalues +1 and –1. It is an important advantage that the eigenvalues do not play any role at all in the BCHS inequality. This inequality is a property of probability distributions, not of expectation values of standard observables. For this reason the BCHS inequality can also be applied to generalized observables represented by POVMs. It can be tested without any assumption with respect to the precise values of the observables. From the point of view of an empiricist interpretation of quantum mechanics this is of utmost importance because values of observables can be chosen at will by the experimenter (cf. section 2.4): it makes the Bell inequality a respectable issue also in this interpretation, even though, due to its reference to values of observables, its physical meaning is dubious in the form (9.12), which inequality can always be satisfied by taking the values of the observables sufficiently small. For this reason we shall restrict our attention mainly to the BCHS inequality, the Bell inequality constituting just a special case. Since the BCHS inequality represents the real content of Bell’s theorem within quantum mechanics we shall often refer to it indiscriminately as ‘the Bell inequality’.
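The dependence on detector efficiency discussed earlier in this subsection can be made quantitative in a small sketch. The attenuation factor multiplying the correlation functions is not reproduced in the text above, so the code treats it as a free parameter `factor`. Reading it as the square of a single-detector efficiency is an assumption, as are the directions (0°, 60°, 120°) and the use of Bell's 1964 form of the inequality. The function names are hypothetical.

```python
import math

def singlet_correlation(theta_1, theta_2):
    """Singlet correlation function for coplanar directions."""
    return -math.cos(theta_1 - theta_2)

# Assumed coplanar directions a, b, c (not necessarily those of figure 9.2).
A, B, C = 0.0, math.pi / 3, 2 * math.pi / 3

def bell_slack(factor, a=A, b=B, c=C):
    """1 + f*E(b,c) - |f*E(a,b) - f*E(a,c)|: nonnegative iff the inequality
    with attenuated correlation functions is satisfied."""
    e_ab = factor * singlet_correlation(a, b)
    e_ac = factor * singlet_correlation(a, c)
    e_bc = factor * singlet_correlation(b, c)
    return 1 + e_bc - abs(e_ab - e_ac)

def critical_factor(a=A, b=B, c=C):
    """Largest attenuation factor for which the inequality is still satisfied
    (the slack is linear in the factor)."""
    g = (abs(singlet_correlation(a, b) - singlet_correlation(a, c))
         - singlet_correlation(b, c))
    return 1 / g

f_crit = critical_factor()
print(f_crit)             # about 2/3 for the assumed directions
print(math.sqrt(f_crit))  # efficiency threshold IF factor = efficiency squared
```

Under these assumptions violation is lost once the factor drops below about 2/3, which illustrates why a high detector efficiency is needed; the precise threshold depends on the inequality and directions actually used.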
9.2.3 Quadruples and joint probability distributions of measurement results
Rastall [380] and Fine [381] have proven that the existence of a quadrivariate probability distribution is also necessary for the Bell inequality to be satisfied. Garg and Mermin [387] have, however, cast doubt on the physical relevance of this joint probability distribution. Svetlichny, Redhead, Brown and Butterfield [388] have also noted that, due to incompatibility, a quadrivariate probability distribution of the observables and need not exist in the sense of convergence in the limit (6.1) of some relative frequency, even if such a limit would exist for the four bivariate probability distributions of the pairs etc., measured in the standard EPR-Bell experiments (also Khrennikov [389], chapter 2.7). This would diminish the relevance of the existence of a quadrivariate probability distribution for derivability of the Bell inequality, and might leave open the possibility that violation of this inequality may have a cause different from the nonexistence of such a joint probability distribution.
Strictly speaking, these criticisms are justified. Even if a quadrivariate probability distribution can be constructed if the Bell inequality is satisfied, it is not at all clear whether there is any physics in it. Actually, the real issue is not the existence of a quadrivariate probability distribution. It is rather the possibility of jointly attributing, for an individual preparation of the particle pair, unique values to all of the four observables, i.e. the existence of a quadruple of measurement results, allowing the Bell inequality to be derived (cf. section 9.4.1). If in the case of the EPR-Bell experiments a quadruple of measurement results exists for each single individual preparation, then for each finite number N of individual measurements the relative frequencies of the quadruples satisfy the BCHS inequality (9.10), independently of the existence of a probability distribution. There is no necessity to consider the limit N → ∞. This changes the conclusion drawn above in the sense that violation of the Bell inequality by the standard EPR-Bell experiments must be a consequence of the impossibility of jointly attributing values to the pertinent observables. This may even better reflect the physical meaning of the Bell inequality than the existence of a quadrivariate probability distribution. The existence of quadruples is necessary for the existence of a quadrivariate probability distribution, but it is not sufficient since the relative frequencies of the quadruples need not necessarily converge to a joint probability distribution. For commeasurability we may require the existence of such a joint probability distribution though (cf. section 1.9.2). In the generalized EPR-Bell experiment discussed in section 9.3.1 the quadruples and the quadrivariate probability distribution both exist.
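The statement that the relative frequencies of quadruples satisfy the inequality for every finite N, whether or not they converge, can be illustrated as follows. The sketch generates quadruples of 0/1 indicators from a deliberately drifting source, so that no convergence is presupposed, and evaluates the inequality after each additional quadruple. It assumes the Clauser-Horne form for the middle term of (9.10); the drifting bias and all names are hypothetical choices made only for illustration.

```python
import math
import random

def ch_value(quadruples):
    """Middle term of the ASSUMED Clauser-Horne form, built from the relative
    frequencies of one finite list of quadruples (x1, y1, x2, y2)."""
    n = len(quadruples)

    def freq(*indices):
        # Relative frequency with which all listed indicators equal 1.
        return sum(all(q[i] for i in indices) for q in quadruples) / n

    return (freq(0, 2) + freq(0, 3) + freq(1, 2) - freq(1, 3)
            - freq(0) - freq(2))

rng = random.Random(1)
quadruples = []
lowest, highest = 0.0, -1.0
for n in range(1, 301):
    # The bias keeps drifting with n, so relative frequencies need not settle.
    bias = 0.5 + 0.4 * math.sin(math.log(n))
    quadruples.append(tuple(int(rng.random() < bias) for _ in range(4)))
    value = ch_value(quadruples)
    lowest, highest = min(lowest, value), max(highest, value)

print(lowest, highest)  # stays within [-1, 0] for every N up to 300
```

Each individual quadruple contributes either −1 or 0 to the expression, so any average over a finite list lies between these bounds. The bound therefore relies only on the joint attribution of four values per preparation, not on the existence of a limiting distribution.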
As a matter of fact, within the domain of (generalized) quantum mechanics the existence of the limiting relative frequencies (6.1) is always taken for granted analogously to the way the existence of probability distributions of standard observables is assumed in standard quantum mechanics (cf. section 6.2.1). The definition of the probability distributions of (generalized) observables is based on the existence of POVMs. The POVM being given, the existence of probability distributions like (1.40) and (1.101) is thought to be unproblematic. Indeed, this is experimentally corroborated in a multitude of experiments. The existence of the limiting relative frequencies (6.1) might even be seen as delimiting the domain of application of (generalized) quantum mechanics. Within this domain the problem of the existence of quadruples can be equated with the problem of the existence of a quadrivariate probability distribution. Only in the wider context of hidden-variables theories, to be discussed in chapter 10, is it possible to characterize the physical conditions that should be fulfilled for the limiting relative frequencies to exist, thus enabling a physical characterization of the domain of validity of (generalized) quantum mechanics. Restricting ourselves in the present chapter to (generalized) quantum mechanics, the relevance of the existence of a quadrivariate probability distribution to the problem of the Bell inequality will therefore be considered as equivalent to the relevance of the existence of
quadruples of measurement results. Consequently, either the impossibility of jointly attributing values to incompatible observables, or the nonexistence of certain joint probability distributions of incompatible observables can be considered as the characteristic feature distinguishing quantum mechanics from classical mechanics, and explaining possible violation of the Bell inequality by quantum mechanics. Which formulation is favored might depend on the interpretation of the formalism that is adhered to. Thus, in an ensemble interpretation it seems more natural to base one’s physical conclusions on the joint probability distribution than on quadruples of individual measurement results. The distinction between an empiricist and a realist interpretation is particularly important in evaluating the significance of the Bell inequality. In an empiricist interpretation a measurement result corresponds to a pointer position of a measuring instrument. Hence, a quadruple of measurement results must correspond to a joint measurement of the observables of the quadruple. Then the significance of the Bell inequality is quite straightforward: if the Bell inequality is satisfied, then the four observables that are involved must necessarily be commeasurable, since without this there would not exist quadruples of measurement results (and, hence, no quadrivariate probability distribution) to derive the Bell inequality. Violation of the Bell inequality by quantum mechanics should be attributed to the observables failing to be commeasurable. Why quantum mechanical observables fail to be commeasurable is not easy to see in this way, however. It will be argued in section 9.3, that, if applied to standard observables, violation of the Bell inequality can be seen as just another evidence of the impossibility of joint (simultaneous) ideal measurement of incompatible observables. 
In the case of a realist interpretation the issue is about properties of the object and their simultaneous existence. In the objectivistic form of this interpretation violation of the Bell inequality is, by itself, not connected to simultaneous measurement, but to the simultaneous attribution of values of incompatible observables to the microscopic object independently of measurement. In order that such an attribution can have any observational content it will nevertheless be necessary to define a relation between the property of the object and the pointer position of a measuring instrument measuring the property. This will be discussed in section 9.4. Our conclusion will be that derivability of the Bell inequality must be seen as an argument against an objectivistic-realist interpretation.
9.3 The Bell inequality in an empiricist interpretation; relation to joint nonideal measurement of incompatible observables
9.3.1 A generalized EPR-Bell experiment
In this section an experiment will be discussed of the EPR-Bell type introduced in section 9.1.1, with one essential difference, however, viz. that the experiment corresponds to the measurement of a generalized observable. As we have seen in section 7.9, this opens up the possibility of a joint (nonideal) measurement of two incompatible standard observables. It is possible to extend this to the four observables and relevant to the Bell inequality. In this way each individual measurement yields a measurement result for each of the four observables, to be collected into a quadruple The relative frequency of these quadruples in an ensemble of such individual measurements determines the joint probability distribution In accord with section 9.2 the existence of a quadrivariate probability distribution implies that the Bell inequality must be satisfied. Since the measurement results do refer to pointer positions of the measuring instrument, an empiricist interpretation of the formalism is appropriate here. As the quadrivariate probability distribution has an unambiguous physical meaning the criticism by Garg and Mermin [387], mentioned in section 9.2.3, does not seem to apply. In the Aspect experiments [373, 290] the four standard EPR-Bell experiments are coincidence measurements of pairs of photon polarization observables in a number of different directions: and the angles being chosen as given in figure 9.2. Hence, each of these measurements is a joint measurement (cf. section 1.3) of two (compatible) polarization observables of the two photons. In the present experiment the same preparation is taken as in the Aspect experiments. However, the measurements are different. On each of the photons a joint nonideal measurement of two polarization observables is performed (cf. figure 9.3) in the way discussed in sections 7.5 and 7.9. The standard observables and correspond to the PVMs and respectively. Analogously to section 7.9 the joint probability distribution of this experiment is obtained as
with and both being given by (7.64), and and being chosen as the operators corresponding to the polarizations actually measured. Hence, the
experiment is a joint measurement of two joint nonideal measurements and measured jointly and nonideally in one arm of the measurement arrangement, and, analogously, and in the other arm). Since the probabilities of measurements performed separately on particle 1 are given by the expectation values of the POVM (and analogously for particle 2). The mutual disturbances of the measurements of observables and for can be represented by the Martens inequality (7.106) for each of the arms separately. The Aspect experiments are special cases of the experiment of figure 9.3. Thus, choosing, for instance, (i.e. both mirrors are absent) the measurement arrangement reduces to the joint ideal measurement of the two standard polarization observables in directions and Making one (or both) of the mirrors totally reflecting yields one of the three other EPR-Bell experiments 2 . Hence -although describable in the standard formalism- the standard EPR-Bell experiments are also describable in the generalized formalism. Since in the generalized formalism values are attributed to all four observables, it, evidently, is possible to attribute values to observables that are considered in the standard formalism as ‘not measured’. Each of the four standard EPR-Bell experiments can be interpreted as a joint nonideal measurement of the four observables (cf. section 7.9.2). Their joint probability distributions can be obtained as the expectation values of the POVM defined by (9.21) if the relevant limiting values for and are taken. The four POVMs of the Aspect experiments are therefore given by
with
and
given by (7.79) and
2 In the Aspect ‘switching’ experiment [290] the parameters and are intended to be independent random variables, each having either the value 1 or 0, thus in a stochastic manner determining for each individual photon pair which of the four EPR-Bell experiments is performed. The Aspect ‘switching’ experiment that was actually carried out leaves a tiny loophole because the variables and were not varied randomly but periodically (be it with different and independently chosen periods). However, in a later experiment [390], in which randomness was secured, the results obtained in the Aspect experiment were confirmed (see also Gisin and Zbinden [391]).
(7.80), respectively. The existence of a quadrivariate probability distribution for each of the standard EPR-Bell experiments can be seen to be accompanied by the existence of quadruples of measurement results. Thus, (9.22) (with is consistent with the fact that in the measurement the standard values of and are supplemented by the values to yield a quadruple for each photon pair. The bivariate probability distribution of and is obtained as one of the marginals of (9.22):
in which it is made explicit by means of the notation that the measurement arrangement contributes to the constitution of the joint probability distribution. This marginal is easily seen to equal the bivariate probability distribution found in the standard formalism. The probabilities for the other standard EPR-Bell experiments are found in an analogous way as marginals of the joint probability distributions defined by (9.22) for the other values of and The quadrivariate POVMs of the four standard EPR-Bell experiments are different from each other. This implies that, with equal density operators the joint probability distributions (9.20) are different for the four EPR-Bell experiments. For instance, For this reason it is to be expected that the bivariate probability distributions will also be different in different EPR-Bell experiments. Indeed, using (9.22), with (7.79) and (7.80) it is easily verified that, for instance,
From (9.24) it is seen that, in particular, the bivariate probability distribution is different from the one found in the standard EPR-Bell measurement of the pair Let us see now what these results imply for the question of the Bell inequality. Since there is a quadrivariate probability distribution for each of the four EPR-Bell experiments, the BCHS inequality (9.10) is satisfied for each of these experiments, all bivariate probability distributions being derived from one and the same quadrivariate probability distribution (with a single value for Hence, the bivariate probability distributions and of each standard EPR-Bell experiment (interpreted as a generalized one) satisfy the BCHS inequality. It is important to note that in the standard description of the standard EPR-Bell experiments for instance the bivariate probability distribution is not
thought to be actually measured³. What is thought to be actually measured in these experiments are the joint probability distributions
If all of these bivariate probability distributions could be obtained as marginals of one and the same quadrivariate probability distribution, then it would be possible to derive the BCHS inequality in the way as was done in section 9.2, thus creating a disagreement with the actual measurement results of the four EPR-Bell experiments. In an empiricist interpretation of the formalism, in which we can only dispose of the four different joint probability distributions corresponding to the POVMs (9.22), such a quadrivariate probability distribution is not available, however. In general the relations between the values of the four observables and are different in the four different standard EPR-Bell experiments. This makes it impossible to derive the bivariate probability distributions (9.25) from one of the joint probability distributions (cf. (9.24)). Since no other quadrivariate probability distribution is available, in an empiricist interpretation of quantum mechanics we actually do not have any reason to suppose that the bivariate probability distributions (9.25) might have to satisfy the BCHS inequality. From the point of view of the theory of joint nonideal measurement of incompatible observables, discussed in section 7.9, this conclusion is not unexpected. In agreement with this theory the measurement can, additionally, be interpreted as a nonideal measurement of the pair Analogously, the measurement can additionally be seen as a nonideal measurement of the pair In view of the incompatibility of the observables and the concomitant mutual exclusiveness of the measurement arrangements, the inequality of the quadrivariate probability distributions is plausible if mutual disturbance of the measurement results is taken into account. For this reason the nonexistence of a quadrivariate probability distribution for all standard EPR-Bell experiments can be seen as a direct consequence of the incompatibility of the observables involved in the Bell inequality. 
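The logical point of this paragraph can be illustrated with a small numerical sketch. It is not the book's construction: it assumes the familiar correlation form of the inequality, |E(A1A2) − E(A1B2)| + |E(B1A2) + E(B1B2)| ≤ 2, as a stand-in for (9.10), and the function names and toy distributions are hypothetical. When all four correlations are marginals of one quadrivariate distribution the bound holds; when each experiment is allowed its own quadrivariate distribution, nothing enforces it. The toy distributions below are chosen only to make this visible and are not a model of the quantum statistics.

```python
import itertools
import random

# All quadruples (a1, b1, a2, b2) of values +1/-1.
QUADS = list(itertools.product((+1, -1), repeat=4))

def corr(p, i, j):
    """Correlation of components i and j under the quadrivariate distribution p."""
    return sum(prob * q[i] * q[j] for q, prob in p.items())

def chsh(p_a1a2, p_a1b2, p_b1a2, p_b1b2):
    """Assumed CHSH-type combination; each correlation is taken from the
    quadrivariate distribution belonging to 'its own' experiment."""
    return (abs(corr(p_a1a2, 0, 2) - corr(p_a1b2, 0, 3))
            + abs(corr(p_b1a2, 1, 2) + corr(p_b1b2, 1, 3)))

def random_joint(rng):
    """A random probability distribution on the 16 quadruples."""
    weights = [rng.random() for _ in QUADS]
    total = sum(weights)
    return {q: w / total for q, w in zip(QUADS, weights)}

rng = random.Random(0)

# Case 1: one and the same quadrivariate distribution for all four experiments.
worst_single = max(chsh(p, p, p, p)
                   for p in (random_joint(rng) for _ in range(2000)))

# Case 2: a different (toy, deterministic) distribution for one experiment.
same = {(+1, +1, +1, +1): 1.0}
flipped = {(+1, +1, +1, -1): 1.0}   # value of b2 differs in the (A1, B2) experiment
value_four = chsh(same, flipped, same, same)

print(worst_single <= 2 + 1e-9, value_four)
```

The second case reaches the algebraic maximum only because the toy distributions are unconstrained; the quantum mechanical distributions are far more restricted, but the sketch shows that the bound of 2 depends on the single-distribution assumption.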
Since the Bell inequality is essentially a classical property (classical quantities always jointly possessing well-defined values), this result is abundantly plausible: the Bell inequality can be violated by the bivariate probability distributions (9.25) due to incompatibility of the relevant observables. From the point of view of the generalized formalism of quantum mechanics, and the empiricist interpretation going with it, there is no reason whatsoever to expect that the standard EPR-Bell experiments would have to satisfy the Bell inequality. If we find it important enough to test experimentally whether the standard EPR-Bell experiments satisfy the BCHS inequality, we must have other reasons. These reasons will be discussed in section 9.4 and chapter 10. They derive from a strengthening of the interpretation of quantum mechanics in the sense of a more realist interpretation (cf. chapter 2), or from reliance on hidden variables, both nurturing the idea that a quadrivariate probability distribution may exist, different from the ones discussed above. This joint probability distribution, unlike the ones given in (9.22), would not be empirical but constructed on theoretical grounds. It would have the four bivariate probability distributions of the standard EPR-Bell experiments as marginals, which for this reason would have to satisfy the BCHS inequality. Although from a strictly empiricist point of view such a theoretical construction may not be very relevant, we shall nevertheless pay attention to it in the following, because it may contribute to a clarification of the implications of a realist interpretation, and may yield new indications that an empiricist interpretation of the quantum mechanical formalism may be preferable.

³ In the Copenhagen interpretation this bivariate probability distribution is considered as nonexistent.
9.3.2 Heisenberg measurement disturbance or nonlocal interaction?

The theory of joint nonideal measurement of incompatible observables answers the question of why there need not exist a quadrivariate probability distribution having the bivariate EPR-Bell probability distributions as marginals, and causing these to satisfy the Bell inequality. This answer follows directly from the complementarity, in section 7.10 found to be present in joint measurements of incompatible standard observables. In an ideal measurement of observable the incompatible observable is maximally disturbed (cf. section 7.9.2). This disturbance is completely absent in an ideal measurement of observable itself. Therefore, assuming identical quantum mechanical preparations (i.e. equal density operators) the differences in the probability distributions can be attributed to the differing influences of the mutually exclusive measurement arrangements of the different EPR-Bell experiments. When the quadrivariate probability distribution, yielding all bivariate probability distributions as marginals, is measured using a single measurement arrangement, then the Bell inequality should be satisfied. Violation of this inequality is a consequence of incommeasurability of the observables, and is, therefore, related to the (typically quantum mechanical) property of incompatibility of observables. In view of the fact that the Bell inequality can be derived from the classical Kolmogorov probability theory (section 9.2), this conclusion is most satisfactory.

As already remarked in section 9.2.1, violation of the Bell inequality is often associated with nonlocality, in the sense that a measurement on particle 1 would disturb the measurement result of a measurement performed on particle 2 (and vice versa). Our discussion does not give any reason for such an association. The difference between values of an observable found in either a generalized or a standard
EPR-Bell experiment can completely be explained on the basis of a local Heisenberg measurement disturbance in a joint nonideal measurement of incompatible observables. This also holds true for the standard EPR-Bell experiments as far as these can be seen as generalized ones. Thus, in an measurement the value of is different from the value obtained in a standard EPR-Bell experiment in which is measured. This is due to the fact that the actual presence of the measurement arrangement for observable disturbs the probability distribution of (analogously is disturbed by the presence of the measurement arrangement for It is important to note that POVM (9.21) is a direct product of the two commuting POVMs and Therefore, in all experiments the principle of local commutativity is satisfied, entailing independence of the probability distribution of an observable of one particle from the choice of which observable is measured on the other particle (parameter independence). Admittedly, this does not exclude the possibility that an individual measurement result could be influenced in a nonlocal way by the far measurement arrangement. This, however, would mean that in some mysterious way the effect of this nonlocal interaction, present at the individual level, would be compensated for and made unobservable at the statistical level of the probability distributions. Since the individual result is not reproducible, however, only the probability distributions are physically relevant in quantum mechanics. Only these latter quantities can be compared with results of actually performed experiments. This means that the nonlocality (if existing at all) does not have any empirically relevant consequences for the EPR-Bell experiments. In contrast to what is often asserted, the Aspect experiments, satisfying local commutativity, do not constitute experimental evidence of nonlocality (see also section 9.5.2). 
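The statement about local commutativity and parameter independence can be checked numerically in a toy model. The sketch below is an illustration under stated assumptions, not the book's POVM (9.21): it uses two spin-1/2 particles with projective measurements along directions in the x-z plane as a stand-in for the polarization measurements, and all names (`projectors`, `joint_probs`, the chosen angles and states) are hypothetical. It shows that the marginal distribution of particle 1 does not depend on the setting chosen for particle 2, even though the singlet correlations violate the assumed CHSH-type bound of 2.

```python
import numpy as np

SZ = np.array([[1, 0], [0, -1]], dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def projectors(theta):
    """Spectral projectors (outcomes +1, -1) of cos(theta)*SZ + sin(theta)*SX."""
    obs = np.cos(theta) * SZ + np.sin(theta) * SX
    return {+1: (I2 + obs) / 2, -1: (I2 - obs) / 2}

def joint_probs(rho, theta1, theta2):
    """Bivariate distribution p(a, b) = Tr[rho (P_a x Q_b)] of a product measurement."""
    P, Q = projectors(theta1), projectors(theta2)
    return {(a, b): np.trace(rho @ np.kron(P[a], Q[b])).real
            for a in (+1, -1) for b in (+1, -1)}

def marginal_1(rho, theta1, theta2):
    """Distribution of the particle-1 outcome, summed over particle-2 outcomes."""
    p = joint_probs(rho, theta1, theta2)
    return np.array([p[(+1, +1)] + p[(+1, -1)], p[(-1, +1)] + p[(-1, -1)]])

def correlation(rho, theta1, theta2):
    return sum(a * b * p for (a, b), p in joint_probs(rho, theta1, theta2).items())

# Partially entangled state, chosen so that the marginals are not trivially uniform.
psi = np.array([0.0, np.sqrt(0.7), -np.sqrt(0.3), 0.0], dtype=complex)
rho = np.outer(psi, psi.conj())
m_setting_a = marginal_1(rho, 0.3, 0.0)   # particle 2 measured at angle 0.0
m_setting_b = marginal_1(rho, 0.3, 1.1)   # particle 2 measured at angle 1.1

# Singlet state and a standard choice of four angles.
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho_s = np.outer(singlet, singlet.conj())
a1, b1, a2, b2 = 0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4
chsh_value = (abs(correlation(rho_s, a1, a2) - correlation(rho_s, a1, b2))
              + abs(correlation(rho_s, b1, a2) + correlation(rho_s, b1, b2)))

print(np.allclose(m_setting_a, m_setting_b), chsh_value)
```

In this model the insensitivity of the marginal follows from the product form of the measurement operators, which is the property the text refers to as local commutativity.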
The unobservability of the alleged nonlocality is sometimes interpreted as evidence of a strange property of microscopic reality that prevents its use for superluminal transmission of signals. A different way of accounting for this is by using phrases like ‘passion at a distance’ (instead of ‘action at a distance’) (Redhead [107]), ‘nonseparability’ (instead of ‘nonlocality’) (d’Espagnat [160]), or ‘peaceful coexistence’ (Shimony [392]). These phrases are intended to express the “strangeness” of this allegedly typically quantum mechanical kind of nonlocality. On the one hand, it seemingly prohibits us from considering particles that have interacted before as separate entities, even after their distance from each other has become large. On the other hand, the statistical results of quantum mechanical measurements performed on each particle separately behave as if the other particle did not exist at all (compare Einstein’s “spooky actions at a distance” discussed in section 5.4). If the “nonlocality” at the level of the individual particle really existed, then it would have a rather conspiratory character, causing the individual nonlocal effects to cancel exactly in the ensemble. It is difficult to think of any reason why the quantum world would be contrived in such a conspiratory way that the nonlocality, if it were there, would be unobservable at the (empirically relevant)
level of the probability distributions. This problem cannot be solved by referring to a ‘peaceful coexistence’ of relativity theory and quantum mechanics that can be inferred from the apparent insensitivity of the quantum mechanical probabilities to the “nonlocal” influences. The unobservability of nonlocality has a remarkable similarity with the unobservability of the world aether, in the experiment by Michelson and Morley withdrawing itself from observation in an analogous conspiratory way. Like the world aether, nonlocality may also be nonexistent. For this reason it seems sensible to look for alternative explanations to replace nonlocality as a means of understanding the violation of the Bell inequality by quantum mechanics. It seems that quantum mechanical incompatibility and the concomitant mutual disturbance in a joint measurement of incompatible observables, already recognized by Bohr and Heisenberg as basic to the difference between classical and quantum mechanics, offers such an alternative also in the case of the Bell inequality. There is an equally remarkable similarity between Einstein’s empiricist way of dealing with the world aether in developing relativity theory, and the solution to the problem of the Bell inequality in quantum mechanics by means of an empiricist interpretation of the theory. Heisenberg’s rejection of particle trajectories and his reliance on observable quantities may have been modeled after Einstein’s rejection of the world aether and his introduction of the observer into the theory (de Muynck [101]). A third remarkable similarity between relativity theory and quantum mechanics is the tendency, to be observed in both theories, toward a realist interpretation, neglecting the theories’ empiricist roots. 
Einstein’s transition from empiricism to realism [101], in relativity theory culminating in his interpretation of the metric of four-dimensional space-time as a “new world aether”, is paralleled within quantum mechanics by his insistence on interpreting this latter theory as a description of objective reality (cf. section 2.3). It should be stressed that an empiricist interpretation is not sufficient for reaching the conclusion that the Bell inequality is related to Heisenberg disturbance rather than to nonlocality, but that the theory of measurement of generalized observables is indispensable for arriving at that conclusion. As a matter of fact, in an empiricist interpretation of the standard formalism in a standard EPR-Bell experiment no values are attributed to observables that are not measured, thus easily evading the existence of a quadrivariate probability distribution from which the Bell inequality could be derived. This point of view is too restricted, however, since in the generalized formalism it is possible and meaningful to attribute values to such “unmeasured” observables (cf. section 7.9.2). Consequently, each EPR-Bell experiment has its own quadrivariate probability distribution, in the standard formalism supposed to be nonexistent. Violation of the Bell inequality is consistent with the fact that the quadrivariate probability distributions of the four standard EPR-Bell experiments differ from each other. These
differences can be understood on the basis of a local disturbing influence of the measurement arrangements, in the sense that the measurement arrangement for each particle influences only that particle. Due to mutual disturbance in a joint nonideal measurement of incompatible observables we have no reason whatsoever to suppose that a quadrivariate probability distribution would exist from which the bivariate probability distributions of the four standard EPR-Bell experiments can all be derived as marginal distributions. For this reason, from an empiricist point of view there is no a priori reason to presume that the Bell inequality must be satisfied.
9.3.3 Classical and quantum correlations

As mentioned before, the derivation of the Bell inequality from the existence of a joint probability distribution of the four observables is an application of Kolmogorov’s classical theory of probability. Hence, in a joint nonideal measurement of the four observables the measurement results are correlated in agreement with classical statistics. Sometimes this is taken as evidence that within the context of one single measurement arrangement everything is classical. This seems to be in agreement with Bohr’s correspondence idea (cf. section 4.3), and, in case of a measurement of a standard observable, with the idea of a contextual state as a description of a proper mixture of subensembles, each having a well-defined value of the observable (cf. section 2.4.5). We should be careful with such a conclusion, however. As can easily be seen from example 2 of section 7.10.4, an interpretation of the contextual state along this line is not possible in general for generalized observables representing joint nonideal measurements of incompatible observables. In general there is no way to view the contextual state as representing a classical mixture of subensembles with well-defined, although unknown, values of the incompatible observables measured jointly. This can be seen also to be true for the generalized EPR-Bell experiment of figure 9.3, which yields information not only on the “classical” correlations of the nonideal measurement results, but also on the “quantum” correlations as found in the standard EPR-Bell experiments of figure 9.1. This is so because it is possible by means of inversion of the nonideality matrices to calculate the Wigner measures (cf. section 7.9.4) and corresponding to POVMs and of (9.21). Both Wigner measures found in this way agree with (7.87). Analogously to (9.21) the Wigner measure of the generalized EPR-Bell experiment is given by
This implies that the bivariate probability distributions of the standard EPR-Bell experiments can be obtained as marginals of the expectation values of this Wigner measure.
Since the bivariate probability distributions of the four standard EPR-Bell experiments can be calculated in this way from the measurement results of the generalized EPR-Bell experiment, the latter experiment can evidently be interpreted as having invertible nonideal measurements of the former ones as marginals. It contains all information on the “quantum” correlations that are responsible for the circumstance that the Bell inequality is violated by the standard EPR-Bell experiments. It is important to note here that the transition from the POVM to the Wigner measure is achieved by means of the two transitions and performed in parallel. Evidently, the transition is a local affair (although taking place simultaneously for both particles 1 and 2). It is possible to interpret each of the local transitions as a local compensation of the local Heisenberg measurement disturbance in each of the arms of the experiment. From this point of view therefore there is no single reason to associate “quantum” correlations with any form of nonlocal influencing of the measurement in one arm by a measurement in the other arm. As already mentioned in section 9.1.3, the EPR “quantum” correlations are often referred to as “nonlocal” correlations, suggesting that their quantum mechanical character is a consequence of nonlocality. From the present point of view this locution is a rather unfortunate one, because in the generalized experiment the classical correlations between the measurement results of particles 1 and 2 have a nonlocal character too, since they refer to particles that are far apart. Moreover, an empiricist interpretation of quantum mechanics does not give any occasion to relate violation of the Bell inequality to nonlocality. 
On the contrary, taking into account the essential contribution of the measuring instruments to the correlations may yield an alternative explanation of why quantum correlations can differ from classical ones (viz. mutual disturbance of measurement results in a joint measurement of incompatible observables), an explanation that is completely consistent with locality.
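The claim that the inversion of the nonideality matrices is a local affair can be made concrete with a deliberately simplified sketch. It assumes one nonideally measured two-valued observable per arm (the generalized experiment of the text measures two per arm), and the matrices `lam1`, `lam2` and the distribution `q_ideal` are invented numbers, not values from the book. The observed joint distribution is obtained by applying a column-stochastic nonideality matrix in each arm separately; the ideal distribution is recovered by inverting each matrix separately.

```python
import numpy as np

# Hypothetical nonideality matrices for arm 1 and arm 2.
# Entry [m, n] is the probability of registering m when the ideal result is n,
# so every column sums to 1.
lam1 = np.array([[0.90, 0.20],
                 [0.10, 0.80]])
lam2 = np.array([[0.85, 0.10],
                 [0.15, 0.90]])

# Hypothetical ideal joint distribution q_ideal[n1, n2] of the two arms.
q_ideal = np.array([[0.05, 0.45],
                    [0.45, 0.05]])

# Nonideal joint distribution:
# p[m1, m2] = sum_{n1, n2} lam1[m1, n1] * lam2[m2, n2] * q_ideal[n1, n2]
p_observed = lam1 @ q_ideal @ lam2.T

# Inversion performed arm by arm: each arm needs only its own matrix.
q_recovered = np.linalg.inv(lam1) @ p_observed @ np.linalg.inv(lam2).T

# Equivalent formulation: the overall inverse is a tensor product of local inverses.
q_recovered_kron = (np.kron(np.linalg.inv(lam1), np.linalg.inv(lam2))
                    @ p_observed.ravel()).reshape(2, 2)

print(np.allclose(q_recovered, q_ideal), np.allclose(q_recovered_kron, q_ideal))
```

The tensor-product form of the overall inverse is what the sketch is meant to show: the compensation carried out for one arm makes no reference to the matrix of the other arm.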
9.4 The Bell inequality in realist interpretations

9.4.1 Derivation from the ‘possessed values’ principle
As we saw above, an empiricist interpretation of (generalized) quantum mechanics does not suggest in any way that the measurement results of the four standard EPR-Bell experiments would have to satisfy the Bell inequality. However, other interpretations of the quantum mechanical formalism could imply additional assumptions inducing extra requirements entailing the Bell inequality. The assumption of the ‘possessed values’ principle (cf. section 2.3), in combination with the faithful measurement principle (cf. section 2.4), is an example of this. These principles are consistent with an objectivistic-realist interpretation of quantum mechanics. Applying these principles in each individual preparation, a value of each of the four observables and involved in the standard EPR-Bell problem is attributed to the object, viz. the value we would find if an ideal measurement of the observable is performed, given that individual preparation. For each individual preparation this entails the existence of a quadruple of values of the observables. Since in a standard EPR-Bell experiment only two of these observables are thought to be measured (e.g. and in the measurement), the existence of the other two values (viz. and is not self-evident, and even not generally accepted (compare the Copenhagen maxim “Unperformed experiments have no results.”). The assumption that in the context of an measurement to and the same values and can be attributed that would have been found if the pair had been measured instead of is known as the assumption of counterfactual definiteness (CFD). This assumption is weaker than a combination of the ‘possessed values’ and ‘faithful measurement’ principles, because measurement results need not be considered as objective properties of the microscopic object, but may be thought of as emergent in the measurement process (e.g. in Jordan’s sense, cf. section 6.2.2). However, it is evident that the assumption of CFD is alien to the Copenhagen interpretation because, as demonstrated by the generalized EPR-Bell experiment discussed in section 9.3, the emergent values of and in an measurement will be different from those obtained in a measurement of CFD rather fits into a ‘statistical’ interpretation like Ballentine’s, discussed in section 4.7.3. It should be stressed that, even if the values of the observables are referred to as ‘measurement results’, they are actually treated in an objectivistic-realist sense as properties the microscopic object possessed immediately preceding the measurement.
Repeating the individual preparations, on the basis of the assumption of CFD an ensemble of quadruples is obtained. Assuming that the relative frequencies of these quadruples define, analogously to (6.1), a quadrivariate probability distribution, the Bell inequality can be derived in the manner of section 9.2. This derivation can actually be interpreted as a derivation on the basis of incompleteness of quantum mechanics, the measurement results being considered as completing the statistical description provided by the quantum state. The difference with the empiricist case is that by assumption there now exists a quadrivariate probability distribution from which the bivariate probability distributions of the standard EPR-Bell experiments are thought to be derivable as marginals. This quadrivariate probability distribution is not experimentally accessible because the standard EPR-Bell observables are not commeasurable. Yet, it could be argued that already the theoretical existence of such a joint probability distribution could be sufficient to derive a disagreement with experiment. As remarked in section 9.2.3, it is rather the existence of quadruples of measurement results than the existence of a joint probability distribution that is responsible
for the derivability of the Bell inequality. It is indeed possible to derive inequality (9.12) directly, without explicitly relying on the existence of a quadrivariate probability distribution. A derivation along this line was first given by Stapp⁴ [393]. A large number (N) of individual preparations of EPR pairs is considered. These preparations are numbered according to An EPR-Bell measurement performed on the particle pair, yields measurement result where or –1 for all and The correlation function (9.11) is now given by
Analogous expressions are obtained for the other EPR-Bell experiments, for instance.
Now, for instance, (9.13) can simply be derived from the inequality
which is valid because all eigenvalues are either +1 or –1. The same inequality then obtains for the averages (9.27), etc. The derivation is valid for each value of N, independently of whether the expressions have limits for N → ∞.
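The finite-N character of this derivation can be checked with a short sketch. The displayed inequality is not reproduced here, so the code assumes the usual CHSH-type combination a1·a2 − a1·b2 + b1·a2 + b1·b2 per preparation; the function names and the random quadruples are hypothetical. Because each quadruple of values ±1 makes this combination equal to +2 or −2, the running average over any number N of preparations stays within [−2, 2], with no appeal to a limit of relative frequencies.

```python
import random

def chsh_term(a1, b1, a2, b2):
    """Assumed CHSH-type combination for one quadruple of +1/-1 values.
    It equals a1*(a2 - b2) + b1*(a2 + b2), and one of the two brackets
    always vanishes while the other is +2 or -2."""
    return a1 * a2 - a1 * b2 + b1 * a2 + b1 * b2

def running_averages(quadruples):
    """Average of the CHSH term over the first n quadruples, for n = 1..N."""
    total, averages = 0, []
    for n, quad in enumerate(quadruples, start=1):
        total += chsh_term(*quad)
        averages.append(total / n)
    return averages

rng = random.Random(1)
quads = [tuple(rng.choice((+1, -1)) for _ in range(4)) for _ in range(500)]
averages = running_averages(quads)

# The bound holds after every single preparation, not only for large N.
print(all(abs(chsh_term(*q)) == 2 for q in quads),
      max(abs(x) for x in averages) <= 2)
```

Since the averaged combination is linear in the four correlation functions, the same bound applies to the corresponding combination of the averages.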
The conclusion to be drawn from this derivation of the Bell inequality is not different from the one reached on the basis of the existence of a quadrivariate probability distribution: an assumption of the existence of quadruples of measurement results for the observables of the standard EPR-Bell experiments leads to a contradiction with experiment. The issue therefore seems to be the validity of the ‘possessed values’ principle in conjunction with the ‘faithful measurement’ principle (and, hence, CFD). These assumptions are at the basis of an objectivistic-realist interpretation. Can a quantum mechanical measurement result be seen as an objective property of a microscopic object, to be attributed to this object at its preparation, independently of any measurement to be performed later? Since this assumption inevitably leads to disagreement with experimental data the question must be answered in the negative, thus also negating the existence of a quadrivariate probability distribution as a relative frequency of quadruples of values to be attributed to the four observables in the above-mentioned sense. An objectivistic-realist interpretation leads us into trouble. This had already been understood clearly by Bohr (cf. chapter 4), and has been an incentive for us to favor an empiricist interpretation over a realist one (cf. section 2.4). In an empiricist interpretation the microscopic object is not directly described by quantum mechanics, and quadruples of measurement results make sense only in joint measurements of the four observables. The attribution of values to observables, independently of the measurement arrangement actually present, does not have a meaning in this latter interpretation.

⁴ It should be noted that Stapp’s derivation is not based on an objectivistic-realist interpretation (cf. section 9.5). For the mathematical derivation this does not make any difference, however.
9.4.2 The Bell inequality, and a contextualistic-realist interpretation

Interactional versus relational contextualism

In a contextualistic-realist interpretation it is possible to take into account the influence the measurement process may have on the measurement results, the latter being interpreted as properties of the microscopic object. Since in this way the consequences of an objectivistic-realist interpretation might be evaded, contextualism could offer a possibility of finding a solution to the problem of the Bell inequality also in a realist interpretation. This solution is based on Heisenberg’s disturbance theory of measurement (cf. section 4.6.2), and is formally equivalent to the empiricist one. In agreement with Heisenberg’s disturbance theory the value of found in an measurement, will be different from the one found, on the same individual preparation⁵, in a measurement of Therefore, in the correlation function (9.27) the inequality should be taken into account. Because of (9.30) we also have
i.e. can have different values for identical preparations in different standard EPR-Bell experiments. This implies that it is impossible to attribute counterfactually a well-defined value to all four observables. Hence no unique quadruple can be attributed to the particle pair if it is thought to be identically prepared in incompatible EPR-Bell experiments. This takes away the foundation on which the derivation of inequality (9.29) is based: on the preparation only one of the four EPR-Bell experiments can be carried out, for instance due to Heisenberg disturbance the values attributed to the other two measurement results for this value of will differ in general from the values obtained if the pair is measured instead of

⁵ Here the assumption is made that it makes sense to assume the possibility of identical individual preparations in the contexts of the measurements of incompatible observables. It will be argued in sections 9.5 and 9.6 that this assumption is not self-evident.
Evidently, if the EPR-Bell measurement results would correspond to properties the microscopic object has in the context of a quantum mechanical measurement, then a contextualistic-realist interpretation could formally deal with the problem of the Bell inequality in the same way as was found in section 9.3 to be possible for an empiricist interpretation: because the measurement results are liable to a disturbance due to interaction with the measuring instrument, they cannot be seen as objective properties possessed by the object prior to measurement. This makes a contextualistic-realist interpretation immune to a derivation of the Bell inequality based on CFD. It is important to note here that this immunity obtains only if the observables to which values are attributed in the contextualistic-realist sense are actually measured. In the EPR discussion of chapter 5 this was not the case because there it was crucial for EPR’s purpose (being a demonstration of the incompleteness of quantum mechanics) that no observable of particle 2 was measured. For this reason the physical quantities of particle 2 could be considered by EPR as objective properties in the sense of an objectivistic-realist interpretation. As pointed out in section 5.3.1, due to Bohr’s acceptance of the EPR proposal a remnant of objectivism had also slipped into his answer to the EPR problem, forcing him to change his contextualistic interpretation of quantum mechanics from an interactional into a relational one. The suggestion of nonlocality induced by this change of interpretation has been an unfortunate consequence. If, like in EPR-Bell experiments, all relevant observables are actually measured, then for each particle there exists a measurement context in the interactional sense, determined locally by the measurement arrangement the particle is actually interacting with. 
Then it is hardly meaningful to assume, in the manner of Bohr’s interpretation of EPR, a nonlocal measurement context for the particle on which no measurement is performed. This seems reason enough to reject the relational form of contextualistic realism, and to restrict contextualism strictly to those situations in which the object is actually interacting with the measuring instrument. This, actually, is the situation considered in the present chapter.
Contextual states for EPR-Bell experiments
That, in the case of an interactional contextualism, the measurement context is a local affair can be made plausible by implementing the contextualism of the interpretation in terms of the contextual state defined in section 2.4.5. Let the preparation of the initial state of the correlated particle pair in the EPR-Bell experiments of figure 9.1 be represented by an arbitrary state vector
The contextual states in the contexts of the and measurements, respectively, are represented by the density operators
In a contextualistic-realist interpretation the contextual state could be interpreted as describing an ensemble in which, in the context of the measurement, observables and have well-defined values and respectively, with joint probabilities (and analogously for
Since
it follows that
Hence, the contextual state for particle 1 is independent of which observable is measured on particle 2. This holds true for the other particle as well. Consequently, there is no single reason to presume that a measurement on a distant particle would exert any influence either on the context of a local measurement, or on the value of the measured observable of particle 1. On the other hand, the context of particle 1 is influenced by the circumstance whether or is measured. The contextual states (9.33) should be compared with
It is easily verified that in general
Insofar as the contextual states (9.33) and (9.34) can be seen as implementations of Bohr’s correspondence principle into the quantum mechanical formalism (to the effect that a quantum mechanical observable is well-defined only in the context of the measurement arrangement for the observable that is actually measured, cf. section 4.3), it is evident that the contextual states of the separate particles are defined by the local contexts.

Contextual state versus conditional preparation
The probabilities of the EPR-Bell experiments can also be found by a transition to a new state obtained by conditional preparation (strong projection, cf. sections 3.2.6 and 3.3.4). Accordingly, in a measurement of coincident with measurement result in state (9.32) the state of particle 2 is represented by the state vector
yielding a conditional probability (i.e. conditional on measurement result for any subsequent measurement on particle 2. This, actually, corresponds to the way the EPR problem was discussed in chapter 5. In a realist interpretation this procedure may entail a conclusion of nonlocality⁶, because for the choice of instead of the conditional preparation yields a different result, to be obtained by means of a representation of in terms of the eigenvectors rather than (9.32). Hence, if state vector (9.35) would describe the reality of particle 2, then this reality would be (co-)determined by the measurement arrangement for particle 1. A realist interpretation of the conditionally prepared state is necessarily a contextualistic one. This contextualism is of a nonlocal kind, the context being determined by a measurement arrangement particle 2 is not interacting with (compare Bohr’s relational interpretation discussed in section 5.3.1). It is precisely this consequence of nonlocality that makes a realist interpretation of the conditionally prepared state (9.35) unattractive. Fortunately, such a realist interpretation is not necessary since we have an alternative possibility. If a contextualistic-realist interpretation is desirable at all, then it seems that the contextual states and obtained from (9.33) by partial tracing over the particle 1 Hilbert space, would also be serious candidates for descriptions of the reality of particle 2 in the contexts of the measurement arrangements for and respectively. Since in general these states are different from the states (9.35), we evidently are in a position to make a choice. This choice cannot be made on the basis of experimental evidence, however. Taking into account that in the case of contextual states (9.33) the measuring instruments for either or are present, both possibilities are equally consistent with the experimental evidence obtainable under the given experimental conditions.
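The contrast between the two candidate descriptions of particle 2 can be illustrated in a two-qubit toy model. This is a sketch under assumptions rather than the book's formalism: spin-1/2 observables in the x-z plane stand in for the EPR-Bell observables, the contextual state is taken to be the sum of projected terms suggested by the text's description of (9.33), and the names `contextual_state`, `conditional_state_2` and the chosen state are hypothetical. The conditionally prepared state of particle 2 changes with the observable chosen for particle 1, whereas the contextual state reduced to particle 2 does not.

```python
import numpy as np

SZ = np.array([[1, 0], [0, -1]], dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def projectors(theta):
    """Spectral projectors (outcomes +1, -1) of cos(theta)*SZ + sin(theta)*SX."""
    obs = np.cos(theta) * SZ + np.sin(theta) * SX
    return {+1: (I2 + obs) / 2, -1: (I2 - obs) / 2}

def trace_out_particle_1(rho):
    """Partial trace over the first tensor factor of a two-qubit density matrix."""
    return np.einsum('ijik->jk', rho.reshape(2, 2, 2, 2))

def contextual_state(rho, theta1, theta2):
    """Assumed form of the contextual state: sum over (E x F) rho (E x F)."""
    out = np.zeros_like(rho)
    for E in projectors(theta1).values():
        for F in projectors(theta2).values():
            EF = np.kron(E, F)
            out += EF @ rho @ EF
    return out

def conditional_state_2(psi, theta1, outcome):
    """State of particle 2 after projecting particle 1 on a given outcome."""
    phi = np.kron(projectors(theta1)[outcome], I2) @ psi
    phi = phi / np.linalg.norm(phi)
    return trace_out_particle_1(np.outer(phi, phi.conj()))

psi = np.array([0.0, np.sqrt(0.7), -np.sqrt(0.3), 0.0], dtype=complex)
rho = np.outer(psi, psi.conj())
theta_A1, theta_B1, theta_A2 = 0.0, np.pi / 2, 0.4   # illustrative settings

# Conditional preparation: depends on which observable is measured on particle 1.
cond_A = conditional_state_2(psi, theta_A1, +1)
cond_B = conditional_state_2(psi, theta_B1, +1)

# Reduced contextual state of particle 2: same for both particle-1 observables.
ctx_A = trace_out_particle_1(contextual_state(rho, theta_A1, theta_A2))
ctx_B = trace_out_particle_1(contextual_state(rho, theta_B1, theta_A2))

print(np.allclose(cond_A, cond_B), np.allclose(ctx_A, ctx_B))
```

Under these assumptions the first comparison fails and the second succeeds, which is the asymmetry the paragraph describes.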
Yet, under the experimental conditions discussed in the present chapter (in which a measurement is performed on both particles) such a choice is not difficult. If quantum mechanics is thought not to describe objective reality, but only a contextual reality in which the object is interacting with a measuring instrument, then it does not seem to make sense to suppose that the reality of particle 2 is determined by the distant measurement arrangement for particle 1 (as would be the case if the conditionally prepared state were chosen), rather than by the measurement 6
See, however, Ghirardi [394], who argues that the transition might not be instantaneous, and therefore could be local/causal. This possibility will be ignored here, however, because the issue is not whether the influence of the measurement arrangement of particle 1 on particle 2 is either (relativistically) causal or not, but whether there is any influence at all.
arrangement particle 2 is itself directly interacting with. In the contextual states and this latter experimental context is clearly represented. Moreover, since and it is also clear that on this choice -as far as measurement is concerned- the reality of particle 2 is dependent only on the observable measured on this very particle, and is independent of the observable measured on particle 1 (and analogously for the other particle). The consequence of nonlocality might be seen as an extra indication (over the general ones put forward in section 2.4) that the state vector of the conditionally prepared state should not be interpreted realistically at all. This, at least, is the position taken in an empiricist interpretation. Conditional preparation is actually a preparation procedure, and state vector (9.35) might be interpreted as labeling this procedure rather than as labeling the result of this procedure. As already discussed in section 5.4 such an interpretation may solve the nonlocality problem, knowledge of the individual measurement result for particle 1 being necessary before the conditional preparation can be effectuated. An instrumentalist interpretation of state vector (9.35) is able to evade the nonlocality problem in a similar way. For the reasoning applied in the original EPR problem (cf. section 5.4) a realist interpretation of the state vector of the conditionally prepared state is crucial. As already noted in section 5.3.1, the fact that Bohr, although favoring an instrumentalist interpretation of the state vector, did not clearly oppose Einstein’s use of such a realist interpretation, should be considered as an inconsistency in his application of his correspondence principle: a consistent application would require that the measurement arrangement for particle 2 is explicitly taken into account, too. Failure to observe this clearly has caused quite a lot of confusion.
If, in the context of a given measurement, instead of the conditionally prepared state (9.35), the contextual state is chosen as a representation of the contextual reality of particle 2, then it is clear that this reality does not even depend on the distant measurement arrangement. Whereas Bohr may be excused for mistakenly taking into account the distant measuring instrument in defining the context of particle 2 (because no measuring instrument was present for the latter particle), overlooking the influence of the near measuring instrument can hardly be excused in the context of EPR-Bell experiments. When a contextualistic-realist interpretation is adhered to in a strictly interactional sense in which the context of each particle is determined by the measuring instrument the particle is actually interacting with, then a “solution” employing nonlocal influences is completely unnecessary. It seems that, if a realist interpretation of the quantum mechanical formalism is aspired to at all, an interactional contextualistic-realist interpretation would be the best possible option.
Measurement results and elements of physical reality
A denial of the physical significance of the conditionally prepared state as describing the reality of particle 2 makes obsolete the concept of ‘element of physical reality’ introduced in section 5.2. Even if (9.35) is an eigenvector of this does not mean that particle 2 had the corresponding eigenvalue as a property preceding the measurement (nor that this value would come into being by the measurement on particle 1). If to particle 2 a value of some observable can be attributed at all, then this value might be derivable just as well from the contextual state Thus, in the context of an measurement one of the eigenvalues could be attributed. This value should not be interpreted as being possessed prior to the measurement of however, since the presence of the measurement arrangement is crucial to its definition (see also section 10.6). In an empiricist interpretation this presence is crucial because here the measurement result is thought to be a property of the measuring instrument in its postmeasurement state. In a realist interpretation, even a contextualistic one, one should be careful, however, because the role of the measuring instrument is thought to be a more modest one. Here the temptation exists to assume -in the spirit of Heisenberg’s disturbance theory of measurement- that a measurement of a standard observable disturbs only observables that are incompatible with the measured one. This would mean that the measured observable itself would not be disturbed by the measurement, and that, hence, the measured value could equal its prior-to-measurement one (cf. the ‘faithful measurement’ principle discussed in section 2.4.3). 
Since this would hold for any observable that could be measured, this would imply the ‘possessed values’ principle even in a contextualistic-realist interpretation, since in this way measurement results of all possible observables could be reduced to values the pertinent observables had immediately preceding and independently of the measurement. EPR’s elements of physical reality were contrived so as to have this property (values being ascertainable “without in any way disturbing a system”). Their existence would allow a derivation of the Bell inequality as given in section 9.4.1. Contextualistic realism does not offer a solution to the problem of the Bell inequality unless it is conceded that a measurement result is an emergent property, created in the measurement (cf. Jordan’s ideas mentioned in section 6.2.2), and not existing prior to measurement. The problems with a realist interpretation seem to be attributable to the fact that, in contrast to classical mechanics, within the domain of quantum mechanics values of quantum mechanical observables cannot be attributed to the microscopic object as objective properties (cf. Ballentine’s ‘statistical’ interpretation of observables, section 6.2.1). Derivability of the Bell inequality can be interpreted as new evidence of the inconsistency of such an interpretation with the formalism of quantum mechanics. As discussed in section 6.4, a realist interpretation of the
quantum mechanical formalism aims at explanations of quantum mechanical measurement results by referring to properties the microscopic object had preceding the measurement, an empiricist interpretation refraining from such explanations. Violation of the Bell inequality is an important indication of the impossibility of such explanations in quantum mechanics. It is of primary importance that quantum mechanical measurement results may not be considered as properties the microscopic object had already preceding the measurement, independently of the measurement arrangement, thus explaining the measurement result that is obtained. Quantum mechanics does not yield a description of objective (microscopic) reality as aspired by EPR’s elements of physical reality, and does not allow an objectivistic realist interpretation. By attributing an essential role to the measurement process in the realization of a quantum mechanical measurement result a contextualist interpretation offers the possibility, at least in the case of standard observables, to keep thinking about observables in a realist way without being obliged to assume that they refer to some objective reality. In this interpretation the values of incompatible observables do not exist simultaneously because they do not exist independently of the measurement arrangement. If in section 9.3 the measurement results etc. would be interpreted as values of the pertinent observables taken by the microscopic object within the context of the measurement rather than as pointer positions of the measuring instrument, then the reasoning preventing the existence of quadruples of measurement results for the four standard EPR-Bell experiments could be completely analogous. In a contextualistic-realist interpretation CFD does not apply either. However, the solution offered in this sense by contextualism is not based on empirical evidence, but is rather a matter of interpretation. 
This is particularly evident because contextualism can be implemented in two different ways, viz, either through the conditionally prepared state or through the contextual state. No empirical motive being available for making a choice, a preference for the latter possibility might stem above all from the opportunity it offers to evade the ‘nonlocality’ problem entailed by the former one in a contextualistic-realist interpretation. It should be emphasized, however, that such an interpretation does not offer any advantage over the empiricist one. Apart from being observationally equivalent, it also does not perform better as an explanatory device, since it also fails to yield an explanation of the measurement result that is obtained experimentally. Although it is logically possible to assume that in the measurement the measurement result is obtained because the microscopic object had these values as contextual properties within the measurement context of this experiment, this does not explain how these properties came into being. Far from yielding explanations, the transition from the initial state to the contextual state is itself in need of an explanation. Due to the observational equivalence of initial and contextual states such an explanation cannot be obtained while remaining within the domain of quantum
mechanics. In order to investigate whether there is any physics in the transition to the contextual state we shall have to resort to subquantum (hidden-variables) theories, which in an empiricist interpretation of quantum mechanics are thought to be the appropriate tools for a description of microscopic reality. This will be considered in chapter 10. In particular in section 10.6 an argument will be developed justifying the transition to the contextual state, in the sense that, if the quantum mechanical state is intended to tell more about microscopic reality than is thought strictly possible in an empiricist interpretation, the contextual state rather than should be taken as the initial state of the measurement process. This approach will also suggest physical reasons why it might be preferable to interpret values of quantum mechanical observables as contextual rather than as objective properties of microscopic objects. Within the domain of quantum mechanics this does not escape the level of speculation, however. Nevertheless, in view of the incompleteness of quantum mechanics as a description of microscopic reality it seems wise to take into account the possibility that the contextual state may have a physical relevance. This will play a certain role in the next section, in which we return to an empiricist interpretation of the quantum mechanical formalism.
9.5 A Copenhagen-inspired empiricist approach
9.5.1 Stapp’s “nonlocality proof”
Unlike what is done in section 9.3, in an empiricist interpretation it is possible to take the Copenhagen position that in the context of a given EPR-Bell measurement the observables that are incompatible with the measured ones do not have values (“Unperformed experiments have no results”, cf. section 9.1.2). For this reason in this interpretation there is no CFD in the sense defined in section 9.4.1, and a derivation of the Bell inequality on the basis of CFD, as given there, is impossible. Indeed, even though quantum mechanical measurement results may be viewed as emergent properties, CFD is nevertheless strongly inspired by the objectivistic-realist interpretation. In contradistinction to his derivation based on CFD (Stapp [393]), discussed in section 9.4, in more recent work Stapp [395] has attempted to formulate a derivation of the Bell inequality in which the assumption of CFD is avoided. For Stapp measurement results are not objective properties of the microscopic object, and could very well be taken in the sense of an empiricist interpretation as discussed in section 2.2. In particular, it is not assumed that the observables already had their values prior to the measurement. The assumption that CFD does not hold in the derivation of the Bell inequality is formulated explicitly by Stapp [395] in terms of his assumption “Unique Result” (UR): “For each of the four alternative possible
(EPR-Bell [WMdM]) measurements if is performed then nature will select some unique value for the result of this experiment, and will never fix any value for the results which the remaining three measurements would have had if they had been performed 7 .” In his derivation Stapp emphasizes the fact that in (9.27) and (9.28) the values of are taken equal. The reason for this is that the measurements on particles 1 and 2 are performed in causally disjoint regions (cf. section 1.3.1), and, hence, cannot influence each other in a (relativistically) local/causal world. The value found for observable should therefore be independent of which measurement is performed on the other particle. This is an assumption of locality (e.g. Butterfield [397]) which can be expressed by the equality
(it seems that this equality should not be interpreted in a counterfactual sense, but in the sense that identical individual preparations in and measurements should yield the same value of Analogous relations hold for the other cases. Although it is assumed that and do not have values in the context of an measurement, Stapp yet arrives at a quadruple of measurement results by considering the possibility that on a certain individual preparation of a particle pair, instead of the measurement each of the other three standard EPR-Bell experiments could be chosen, each yielding a unique result if it would be performed 8 . Then the locality condition (9.36), and analogous relations for the other cases, would cause the eight numbers, involved in the pairs of measurement results and to reduce to the four measurement results of a quadruple. By assumption the correlation functions (9.27) and (9.28) correspond to the measurement results obtained in the standard EPR-Bell experiments. Analogously to section 9.4.1 these would have to satisfy the Bell inequality. 7
It is not clear, though, whether Stapp adheres to an empiricist interpretation. Unfortunately, his “pragmatism” does not yield sufficient clues for an unambiguous apprehension of his position (see also [246], where he even attributes to the human mind, or human consciousness, an important role in the creation of measurement results). It also is not completely clear whether Stapp’s [396] professed belief in the objective existence of wave functions must be taken in a pragmatic sense, or whether this indicates a conversion to an objectivistic-realist interpretation. In any case Stapp seems to adhere to the Copenhagen probabilistic interpretation of quantum mechanical statistics as discussed in section 6.2, in which there is thought to be no causal explanation of why in a measurement a certain measurement result is found (Stapp [246], p. 148).
8 Stapp’s analysis [395] has a modal character (cf. section 6.6) in the sense that the issue is one single individual preparation, in which each of the four EPR-Bell experiments has a possibility to be chosen, each possible experiment yielding a pair of measurement results. The modal approach is capable of evading the type of CFD defined in section 9.4.1, in which values are ascribed to observables that are not actually measured.
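The step from quadruples to the Bell inequality can be made concrete with a small enumeration. The sketch below assumes dichotomic results ±1 and the CHSH form of the inequality, which may differ from the form derived in section 9.4.1; the names `x1`, `y1` (results of the two observables of particle 1) and `x2`, `y2` (those of particle 2) are hypothetical labels introduced only for this illustration.

```python
from itertools import product


def chsh_value(x1, y1, x2, y2):
    """CHSH combination of one quadruple of +/-1 results.

    x1, y1: assumed labels for the results of the two observables of
    particle 1; x2, y2: the same for particle 2.
    """
    return x1 * x2 + x1 * y2 + y1 * x2 - y1 * y2


def chsh_average(weights):
    """Average CHSH value for a distribution over quadruples.

    weights maps each quadruple (x1, y1, x2, y2) to its relative frequency.
    """
    return sum(p * chsh_value(*q) for q, p in weights.items())


# All 16 quadruples of dichotomic results.
quadruples = list(product((-1, 1), repeat=4))

# Every single quadruple gives exactly +2 or -2 ...
values = sorted({chsh_value(*q) for q in quadruples})
print(values)  # [-2, 2]

# ... so any quadrivariate distribution has an average within [-2, 2].
uniform = {q: 1 / len(quadruples) for q in quadruples}
print(abs(chsh_average(uniform)) <= 2)  # True
```

Under these assumptions the bound follows from the mere existence of a quadruple for each individual preparation; no dynamical input is used in the enumeration.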
In contrast to the generalized experiment discussed in section 9.3, in the standard EPR-Bell problem a quadruple does not consist of measurement results obtained in an individual experiment. For this reason the quadruples, as well as the quadrivariate probability distribution determined by their relative frequencies, are different from the ones considered in the generalized experiment of section 9.3. This has induced Stapp [398] to comment that the subjects would even be unrelated. In light of the possibility of interpreting the standard EPR-Bell experiments as special cases of the generalized one, this comment is hard to swallow. The possibility of simultaneously having at one’s disposal an ideal and a nonideal measurement result of the same observable might at least raise the question of how these are related to the same individual preparation, and to each other. Yet, Stapp is right that there is a difference. Unlike the situation considered in section 9.3, Stapp’s quadruples are not realized experimentally, but are theoretical possibilities, constructed by combining measurement results obtained in different experiments. One conclusion to be drawn from the experimental violation of the Bell inequality by the correlation functions of the standard EPR-Bell experiments must be that the construction of the quadruples -which are basic to the derivation of the inequality- cannot be carried out if the Bell inequality is violated. Evidently, unlike the quadruples and quadrivariate probability distributions considered in section 9.3, neither Stapp’s quadruples of measurement results of the four standard EPR-Bell experiments nor their quadrivariate probability distribution can exist. The fundamental question to be asked is, of course, what may be the cause of this nonexistence. 
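For comparison, the following sketch evaluates a CHSH combination of quantum mechanical correlation functions. It assumes, purely for illustration, a spin singlet pair with spin measurements along coplanar directions, for which the correlation of the ±1 results is -cos(θ1 - θ2); the state (9.32) and the correlation functions (9.27) and (9.28) of the text are not reproduced here, and the angles are hypothetical settings chosen to give a large violation.

```python
import math


def singlet_correlation(theta1, theta2):
    """Assumed correlation of +/-1 spin results for a singlet pair."""
    return -math.cos(theta1 - theta2)


# Hypothetical analyzer angles (radians): two settings per particle.
a, a_prime = 0.0, math.pi / 2
b, b_prime = math.pi / 4, -math.pi / 4

s = (singlet_correlation(a, b) + singlet_correlation(a, b_prime)
     + singlet_correlation(a_prime, b) - singlet_correlation(a_prime, b_prime))

# |s| is about 2.83, i.e. 2*sqrt(2), which exceeds the bound 2 that any
# distribution over quadruples of +/-1 results would have to respect.
print(abs(s))
```

Since the four correlation functions exceed the bound, they cannot all be marginals of one quadrivariate distribution of quadruples, which is the nonexistence referred to above.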
Stapp [398] rejects the solution put forward in de Muynck, De Baere and Martens [377], referring to Heisenberg disturbance as a mechanism to avoid the existence of quadruples for the four standard EPR-Bell experiments. According to him all measurement results of a quadruple are obtained in ideal measurements of the relevant EPR observable, and, hence, are undisturbed. This, indeed, constitutes a clear difference with the quadruples obtained in the generalized Aspect experiment discussed in section 9.3.1. For this reason it is necessary to discuss Stapp’s new analysis, and assess the applicability of his theoretically possible quadruples. It is Stapp’s contention that the nonexistence of the quadruples must be a consequence of a failure of the locality assumption (9.36), which is evidently considered as the only relevant assumption in the derivation of the Bell inequality. This would imply that the nonexistence of the quadruples is blamed on the fact that, for instance, a measurement result of observable is influenced by a measurement on the distant particle, causing equalities like (9.36) to be violated. If the measurements on particles 1 and 2 take place in causally disjoint regions of space-time this would imply the influence to have a nonlocal character. Hence, the impossibility of attributing to an individual preparation of a particle pair a unique quadruple of values (to be obtained as measurement results if the measurements would be performed) would be
a consequence of a nonlocal interaction between measurements on particles 1 and 2. Stated differently, the typical difference between classical and quantum mechanics, viz, the impossibility, in the latter case, of jointly attributing values to certain physical quantities, revealing itself by a possible violation of the Bell inequality, would be a consequence of the impossibility of performing measurements having measurement results which are determined locally. The quantum world would be nonlocal in this sense. Although this is a possible explanation of the nonexistence of quadruples, due to the absence of any empirical evidence of such a nonlocality it is not a very attractive one. As already noted in section 9.3.2, the independence of the probability distribution of the measurement results of an observable of particle 1 from the choice of the observable measured on particle 2 (parameter independence) makes it very improbable that at the level of an individual measurement there would exist a nonlocal influence in the sense discussed above. Therefore, an explanation of violation of the Bell inequality on the basis of nonlocality should be distrusted. For this reason in the following an alternative explanation is sought, which is based on the quantum mechanical notion of incompatibility rather than on nonlocality. However, in an empiricist interpretation of quantum mechanics the mathematical formalism is thought just to describe macroscopic events, without yielding any explanation. Hence, any explanation we arrive at within the domain of quantum mechanics is based on extra-observational considerations, and is necessarily tentative. For this reason a definitive answer to Stapp’s contention cannot be obtained by staying within the domain of quantum mechanics, even if extended in the way discussed in chapter 7 (formalizing the idea of Heisenberg disturbance).
An explanation of the nonexistence of Stapp’s theoretical quadruples must be expected from theories describing the microscopic object itself rather than just phenomena of measurement and preparation. This will be addressed in chapter 10 by considering subquantum (hidden-variables) theories, and by developing ideas why it may not make sense to assume the existence of a quadruple of undisturbed measurement results for the standard EPR-Bell experiments (cf. sections 10.5.3 and 10.6). In the present chapter we shall investigate the possibility that other assumptions than the ‘locality’ assumption may be blamed for derivability of the Bell inequality within quantum mechanics.
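The parameter independence invoked above can be illustrated numerically. The sketch assumes singlet statistics for ±1 spin results, with joint probability (1 - s1 s2 cos(θ1 - θ2))/4; this is a stand-in for the statistics of state (9.32), and the function names and angle values are hypothetical.

```python
import math


def joint_probability(s1, s2, theta1, theta2):
    """Assumed singlet statistics for results s1, s2 in {-1, +1}."""
    return (1 - s1 * s2 * math.cos(theta1 - theta2)) / 4


def marginal_particle1(s1, theta1, theta2):
    """Probability of result s1 for particle 1, summed over the results
    of particle 2."""
    return sum(joint_probability(s1, s2, theta1, theta2) for s2 in (-1, 1))


theta1 = 0.3  # hypothetical setting for particle 1
for theta2 in (0.0, 0.7, 2.1):  # hypothetical settings for particle 2
    # The marginal stays at 1/2 (up to rounding), although the joint
    # probability changes with theta2.
    print(theta2,
          marginal_particle1(+1, theta1, theta2),
          joint_probability(+1, +1, theta1, theta2))
```

In this example the statistics of particle 1 alone carry no trace of the distant setting, which is the empirical fact the argument against a nonlocal explanation relies on.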
9.5.2 The possibility of additional assumptions
The conclusion that violation of the Bell inequality by quantum mechanics (symbolically denoted by $\neg\mathrm{BI}$) would imply nonlocality ($\neg\mathrm{LOC}$) is logically equivalent to the assumption
$$\mathrm{LOC} \Rightarrow \mathrm{BI}, \tag{9.37}$$
i.e. locality (LOC) is a sufficient condition for the Bell inequality to be satisfied (BI). The question is, however, whether in Stapp’s derivation LOC is the only assumption. Often there are still other assumptions, which seem to be so self-evident that it is not felt necessary to state them explicitly. If there would exist such additional assumptions (symbolically denoted by ADD), then (9.37) should be replaced by
$$\mathrm{LOC} \wedge \mathrm{ADD} \Rightarrow \mathrm{BI},$$
which is equivalent to $\neg\mathrm{BI} \Rightarrow \neg\mathrm{LOC} \vee \neg\mathrm{ADD}$. This would imply the possibility that violation of the Bell inequality could be explained because the additional assumptions are not (all) valid, the assumption of ‘locality’ not being in jeopardy. It seems that this possibility is sufficient reason to critically consider whether the assumption of ‘locality’ is the only assumption made in Stapp’s derivation, or whether there are additional ones. As long as it is not completely certain that no additional assumptions are involved in the derivation of the Bell inequality, the widespread idea that its experimental violation is evidence of nonlocality is not justified. In the derivation at least one additional assumption can be recognized, viz, the assumption of identical individual preparation in the different EPR-Bell experiments. This assumption is necessary for the equality of the values in the left- and right-hand sides of (9.36) to have any meaning. Without this assumption there would not exist any reason to equate measurement results obtained in different EPR-Bell experiments (like those of (9.36)). Since it would be impossible, then, to construct a quadruple of measurement results from the four EPR pairs, a derivation of the Bell inequality as given in section 9.5.1 would be blocked. It, therefore, is sensible to consider whether, perhaps, instead of the ‘locality’ assumption, the assumption of ‘identical individual preparation’ in different EPR-Bell experiments could be responsible for the derivability of the Bell inequality, and to question whether this assumption is justified. There are two reasons why the assumption of ‘identical individual preparation’ in different EPR-Bell experiments could be experienced as unproblematic, and why, at first sight, it might be felt as counter-intuitive to call it into doubt:
i) The quantum mechanical formalism, as introduced in chapter 1, does not suggest any problem in this respect.
In the formalism (standard as well as generalized) the idea that preparation is independent of measurement (to be performed later) is firmly anchored in the possibility of choosing state vector (or density operator) and observable (either PVM or POVM) independently of each other. It is important to note, however, that in an empiricist interpretation this has a meaning only for the relation between
the procedures of preparation and measurement. In this interpretation the state vector is a symbolic representation of a preparation procedure, and is always referring to an ensemble rather than to an individual particle. For Stapp’s derivation it is necessary to consider an individual preparation, which is not represented by a state vector. Hence, strictly speaking, the present argument in favor of identical individual preparations in different EPR-Bell experiments is not applicable, although it does not seem to be unreasonable to base independence of statistical preparation and measurement procedures on independence of individual ones.
ii) In EPR-Bell experiments preparation of a particle or photon pair takes place at a large distance from the measurement arrangement, and can even be carried out before the measurement apparatus has been set up. For this reason the assumption that identical (individual) preparations are possible in all EPR-Bell experiments seems to be rather unproblematic. However, when the measurement result is not a property of the microscopic object but a property of the measuring instrument (as is the case in an empiricist interpretation), then all physical conditions causing the measuring instrument to adopt a certain pointer position (with a certain probability) are relevant to the preparation. In the realization of these conditions the measuring instrument may play an active role. Thus, within the context of a measurement arrangement of observable A it is possible to take the contextual state instead of the density operator as representing the initial state (cf. section 9.4.2). Contextual states being different for measurement arrangements of incompatible (standard) observables, this once again undermines the self-evidence of equal preparations in different EPR-Bell experiments even in a statistical sense. By these considerations sufficient ground is taken away from under the casual assumption of ‘identical individual preparation in different EPR-Bell experiments’ to consider this assumption as an additional one, possibly playing a role in the derivation of the Bell inequality next to the ‘locality’ assumption. In the following we shall come to the conclusion that it may very well be a failure of this additional assumption rather than of the assumption of ‘locality’ that is responsible for the possibility that quantum mechanics violates the Bell inequality.
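The logical structure of the argument of this subsection can be checked by brute force over truth values. The sketch below treats LOC, ADD and BI as plain Boolean variables (an idealization made only for this check) and verifies that "LOC and ADD imply BI" is equivalent to "not BI implies not LOC or not ADD", and that a violation of BI is then compatible with locality.

```python
from itertools import product


def implies(p, q):
    """Material implication p => q."""
    return (not p) or q


# All truth assignments to (LOC, ADD, BI).
assignments = list(product((False, True), repeat=3))

# (LOC and ADD) => BI  is equivalent to  not BI => (not LOC or not ADD).
equivalent = all(
    implies(loc and add, bi) == implies(not bi, (not loc) or (not add))
    for loc, add, bi in assignments
)
print(equivalent)  # True

# Assignments (LOC, ADD) that satisfy the premise while BI is violated.
violations = [
    (loc, add) for loc, add, bi in assignments
    if implies(loc and add, bi) and not bi
]
print(violations)  # [(False, False), (False, True), (True, False)]
```

The entry `(True, False)` is the case of interest here: the inequality is violated and locality holds, while the additional assumption fails.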
9.5.3 Additional assumptions induced by the Copenhagen interpretation
The idea of identity of individual preparations is, at least for pure states, firmly embedded in the Copenhagen thesis of ‘completeness of quantum mechanics’, and might be defended on similar grounds. For this reason we shall first discuss the
assumption of ‘completeness of quantum mechanics’ as an additional assumption in the sense considered here, viz, as a possible alternative to be rejected instead of ‘locality’. The discussion of the Copenhagen thesis of ‘completeness of quantum mechanics’ is complicated by the two different implementations of the concept, viz, ‘completeness in a wider sense’ (section 4.2.1), and ‘completeness in a restricted sense’ (section 4.2.2). Assuming either of these may have completely different consequences for the derivability of the Bell inequality.
i) Completeness in a wider sense
Let us first assume ‘completeness in a wider sense’. As discussed in chapter 6 this idea is implemented into the Copenhagen interpretation by means of the attribution of a state vector to an individual object. In contrast to what is assumed in an empiricist interpretation this would suggest that quantum mechanics does describe an individual preparation. For this reason the assumption of ‘completeness in a wider sense’ conflicts with the empiricist point of view taken in the present section. In chapter 6 we have already seen that an individual-particle interpretation of quantum mechanics is problematic (compare von Neumann’s homogeneous ensemble, section 6.2.3), and that the Copenhagen dogma of the completeness of quantum mechanics, considered here, is a dubious one. In particular the EPR experiments are notable for revealing the difficulties created by the assumption of homogeneity of a pure state: even within one single measurement arrangement the assumption of identical preparation of different particle pairs is a dubious one, since identical preparation of the pair does not imply that a subsystem is identically prepared. Thus, consider the initial density operator of particle 2,
obtained from (9.32) by partial tracing and as given in (9.35)). The assumption that the improper mixture (cf. section 6.3.3) described by this density operator is homogeneous would hardly do justice to the possibility of dividing, by means of conditional preparation (cf. section 3.3.4) based on the measurement results of the ensemble of particle 2 into subensembles with preparations represented by the different state vectors The assumption that, notwithstanding this possibility of subdivision, the initial ensemble is homogeneous, could therefore be considered as an absurd consequence of the Copenhagen completeness thesis. If the assumption of ‘completeness in a wider sense’ would be employed in the derivation of the Bell inequality together with the assumption of ‘locality’, then it would be plausible to draw a conclusion
that is comparable to the one drawn by Einstein as a reaction to Bohr’s answer to the EPR problem (cf. section 5.3.1), viz that it is impossible to maintain both ‘completeness of quantum mechanics’ and ‘locality’, and that it is advisable to reject the assumption of ‘completeness in a wider sense’ rather than ‘locality’. Since in this way the thesis of ‘completeness of quantum mechanics in a wider sense’ has lost the self-evidence it is often adopted with, it may be considered as an additional assumption that is possibly not justified. Although the original EPR problem is fundamentally different from the problem of the Bell inequality, the completeness assumption is also relevant to the derivation of the inequality as applied to the standard EPR-Bell experiments of figure 9.1. For such a derivation it is necessary to combine individual measurement results obtained in different EPR-Bell experiments into one quadruple. This would seem to make sense only if the individual preparations in the different experiments of which the results are combined, are identical. If the individual preparations of EPR particle pairs in the pure state (9.32) would be different, then we would not have any reason to combine measurement results of different pairs into a quadruple. On the other hand, if arbitrary individual preparations, performed in this state, are assumed to be identical, then it must be allowed to combine measurement results of arbitrary individual EPR-Bell measurements into a quadruple. Then, for instance, quadruples could be constructed by combining individual results of measurements with ones obtained in arbitrary individual measurements. On the basis of the existence of the quadruples the assumption of ‘completeness in a wider sense’ would allow a derivation of the Bell inequality along the lines of section 9.5.1. 
It is clear, however, that the very feasibility of this derivation implies that these quadruples cannot correspond to measurement results of all four standard EPR-Bell experiments. Evidently it is not allowed to suppose that the measurement results of two arbitrary individual measurements of two of the observable pairs can also be attributed to measurements of the two remaining pairs. This could be seen as an indication that individual preparations of the particle pairs in the state (9.32) are not all identical, and that, consequently, quantum mechanics is not complete in a wider sense. In view of the above-mentioned qualms with respect to the homogeneity of ensembles represented by pure states the choice to reject ‘completeness’ rather than ‘locality’ would seem to be the most reasonable one. Yet, just as in the original EPR problem rejection of ‘completeness in a wider sense’ was not a sound alternative to Einstein’s nonlocality conclusion (nonlocality being persistent in case of incompleteness, cf. section 5.4), so it is unsuited for blocking a derivation of the Bell inequality in the present situation. As a matter of fact, in section 9.3 this inequality was derived within the context of an empiricist interpretation, which is fully compatible with an assumption of ‘incompleteness’. Indeed, within a joint (nonideal) measurement of the four observables involved, each individual preparation yields a quadruple of values. Hence, no assumption of identical
CHAPTER 9. BELL INEQUALITY IN QUANTUM MECHANICS
preparation is necessary there for having the quadruples at one’s disposal. For this reason ‘completeness in a wider sense’ does not seem to be the source of the problem of the Bell inequality. Indeed, rejection of ‘completeness in a wider sense’ is not even a logical necessity since, apart from ‘locality’ and ‘completeness’, there could still be other assumptions that could be determinative for the derivability of the Bell inequality (compare the original EPR problem discussed in chapter 5, where, apart from ‘locality’ and ‘completeness’, a ‘realist interpretation of the formalism’ also turned out to be an essential presupposition of the reasoning). Of course, this does not make the thesis of ‘completeness in a wider sense’ more respectable. It follows, however, that the issue is irrelevant to the derivation of the Bell inequality, and need not concern us here any further.
ii) Completeness in a restricted sense
The notion of ‘completeness in a restricted sense’ is closely related to the Copenhagen concept of ‘complementarity’, completion of the quantum mechanical description by simultaneously specifying sharp values of incompatible observables being thought to be impossible. With respect to derivability of the Bell inequality this has the opposite effect compared to the assumption of ‘completeness in a wider sense’: at least in an interpretation in which complementarity is thought to entail the nonexistence of measurement results of (standard) observables incompatible with the actually measured one (cf. section 9.5.1), it blocks the derivation rather than making it possible. A denial of the existence of the measurement results of “unperformed” measurements is for the Copenhagen interpretation an easy way to prevent, for the standard EPR-Bell experiments, the existence of quadruples as implied by the assumption of ‘completeness in a wider sense’. The Copenhagen idea of the nonexistence of quadruples for the standard EPR-Bell experiments is a consequence of an empiricist attitude towards quantum mechanical measurement results⁹. Although different from the empiricist interpretation considered in section 9.3 (quadruples of measurement results also being allowed there in the case of incompatibility), it has the same effect, viz. that, due to the mutual exclusiveness of measurement arrangements of incompatible observables, no quadruples of measurement results exist that are valid for all four standard EPR-Bell experiments. Measurement results which, due to incompatibility, are nonexistent in the Copenhagen interpretation are different (due to Heisenberg disturbance) in an empiricist interpretation. Hence, in both interpretations there is not a single reason to suspect the existence of a quadrivariate probability distribution from which all bivariate probability distributions of the standard EPR-Bell experiments could be obtained as marginals. Due to an empiricist point of departure the problem of CFD as discussed in section 9.4.1 does not arise, measurement results being thought not to have any physical relevance independently of the measurement arrangement. If Stapp’s contention were justified that nonexistence of the quadruples must be a consequence of a failure of the assumption of ‘locality’, and if, on the other hand, it were a consequence of ‘completeness in a restricted sense’, then it seems that there would have to be an intimate relation between these two issues. Indeed, it might be thought that the value of an observable that is not measured does not exist due to the presence of the distant measurement arrangement for the other particle (and vice versa). This, however, is not the only possible explanation, and not even the most plausible one. If ‘completeness in a restricted sense’ is employed as an additional assumption, then it is far more reasonable to suppose that this nonexistence is caused by the presence of the measurement arrangement for the incompatible observable of the same particle (and analogously for the other particle), thus achieving the same goal by local rather than by nonlocal means (compare section 9.3.2, where the same conclusion was reached for the generalized EPR-Bell experiment). ‘Completeness in a restricted sense’ can be accounted for in a local way, provided each particle is interacting with its own measuring instrument.
⁹ Because of the idea of the nonexistence of certain physical quantities the Copenhagen interpretation is often associated with antirealism rather than with empiricism. With respect to the state vector and ‘elements of physical reality’ there is a case to be made for such a contention. However, the Copenhagen interpretation does not deny the reality of measurement results in a contextualistic-realist sense (as properties of the object in its post-measurement state) or even in an empiricist sense (as pointer positions of the measuring instrument).
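That no quadrivariate probability distribution can have the quantum mechanical bivariate distributions as marginals can be illustrated numerically. The sketch assumes the singlet state of two spin-1/2 particles, for which the correlation of the spin results along coplanar directions at angles theta_a and theta_b equals -cos(theta_a - theta_b). This state, the BCHSH form of the inequality and the chosen angles are assumptions made for illustration; they need not coincide with the state (9.32) or with the observables of figure 9.1.

```python
import math

def singlet_correlation(theta_a, theta_b):
    """Expectation value of the product of the two +1/-1 spin results
    in the singlet state, for coplanar measurement directions."""
    return -math.cos(theta_a - theta_b)

# Illustrative settings (an assumption): two directions per particle.
a1, a2 = 0.0, math.pi / 2
b1, b2 = math.pi / 4, -math.pi / 4

# BCHSH combination of the four bivariate correlations, one from each
# of the four mutually exclusive measurement arrangements.
s = (singlet_correlation(a1, b1) + singlet_correlation(a1, b2)
     + singlet_correlation(a2, b1) - singlet_correlation(a2, b2))

print(abs(s))  # approximately 2.828, i.e. 2 * sqrt(2)
```

If the four bivariate distributions were marginals of one quadrivariate distribution of +1/-1 values, the magnitude of this combination could not exceed 2. For the settings assumed here it equals 2√2.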
In the Copenhagen interpretation a denial of the existence of the results of unperformed measurements may be interpreted in two different ways: i) in the original EPR experiment, discussed in chapter 5, the observables of particle 2 do not have values because no measurement is performed on that particle, ii) observables incompatible with the measured one do not have (sharp) values due to Heisenberg disturbance. Since in the present discussion measurements are performed on both particles, only the second issue is relevant. Therefore nonlocality need not play any role in a solution of the problem of the Bell inequality based on ‘completeness in a restricted sense’. As a matter of fact, the reasoning would not be different if all four observables were local standard ones belonging to the same particle: if the observables are incompatible, then they cannot all be measured simultaneously in an ideal way and have undisturbed values. It seems, therefore, that ‘completeness in a restricted sense’ and ‘locality’ are independent issues, thus allowing the former to be seen as additional to the latter. It also is clear, however, that an additional assumption of ‘completeness in a restricted sense’ could hardly be held responsible for any successful derivation of the Bell inequality. On the contrary, by appealing to its consequence of the nonexistence of the values of certain observables the Copenhagen interpretation seems to be able to evade a derivation of the Bell inequality. In contrast to ‘completeness in a wider sense’ the Copenhagen idea of ‘completeness in a restricted sense’ has a firm basis in the formalism of quantum mechanics, incompatibility of observables being represented by noncommutativity of the corresponding operators. Since a successful derivation of the Bell inequality appears either to overcome or to ignore the difficulties posed by incompatibility of observables, it seems advisable to carefully scrutinize any derivation to see whether in it perhaps this incompatibility is disregarded in some way or other, and, in particular, to scrutinize whether the influence of the measurement arrangement is sufficiently taken into account.
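As a minimal illustration of incompatibility being represented by noncommutativity, the following sketch computes the commutator of two spin-1/2 operators, the Pauli matrices σx and σz. The choice of these operators is an assumption made here for concreteness; the text does not specify the observables at this point.

```python
def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(x, y):
    """Commutator [x, y] = xy - yx of two 2x2 matrices."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]

# Pauli matrices for two incompatible spin components (illustrative choice).
sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]

print(commutator(sigma_x, sigma_z))  # prints [[0, -2], [2, 0]]
print(commutator(sigma_z, sigma_z))  # prints [[0, 0], [0, 0]]
```

The nonvanishing commutator expresses that no joint ideal measurement of the two spin components exists, whereas an operator trivially commutes with itself.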
iii) Inadvertent realism
There is still a third way in which the Copenhagen completeness thesis might entail an additional assumption relevant to a derivation of the Bell inequality in an empiricist interpretation. As was seen in chapters 2 and 4, in the Copenhagen interpretation there is a tension between empiricism and realism, an empiricist interpretation of quantum mechanics being related to ‘completeness in a restricted sense’, whereas a realist interpretation is promoted by the idea of ‘completeness in a wider sense’. It is clear that the idea of the nonexistence of measurement results of unperformed measurements has its origin within empiricism, since in this interpretation at least the presence of a measuring instrument, intended to measure the observable, is required for a measurement result to exist at all. By the same token the nonexistence of a measurement result for one observable, if an observable incompatible with it is measured, might be attributed to the necessary absence of a measuring instrument for the former, due to the mutual exclusiveness of measurement arrangements of incompatible observables. On the other hand, the measurement results of performed measurements are often treated in a realist rather than in an empiricist sense, the value of the measured observable being attributed to the microscopic object as a property possessed within the context of the measurement. The custom of attributing, in an eigenstate of some observable, the corresponding eigenvalue as a property to the microscopic object, even reveals a tendency towards an objectivistic-realist interpretation. As discussed in section 4.8, the Copenhagen empiricism is of a rather rudimentary kind, no clear distinction being made between measurement results as pointer positions of a measuring instrument and as properties (either objective or contextual) of the microscopic object.
By treating a measuring instrument as something merely disturbing the microscopic reality of the object (rather than as translating information from the world of microscopic objects to the macroscopic world directly accessible to us) the issues of empiricism and (contextual) realism have been appreciably confounded. Concomitantly, by their reliance on classical mechanics in interpreting quantum mechanics (the former theory being interpreted realistically!) the Copenhagen completeness ideas have an origin which, notwithstanding reference to the measurement context, is more of a (contextualistic-)realist than of an empiricist nature. This essentially realist way of thinking might entail additional requirements to be satisfied by the physical quantities of the theory, even if these
are interpreted in the first place in an empiricist sense. The assumption of the existence of undisturbed measurement results for the measurement of one observable if a measurement of an incompatible observable is performed is an example of the inadvertent realism meant here. Another example is the assumption that a measurement result of a quantum mechanical observable, apart from being a property of the measuring instrument (in an empiricist sense), also is a property of the microscopic object (cf. section 2.4.3). The refutation of derivations of the Bell inequality which are based on the existence of quadruples put together by combining measurement results from mutually exclusive measurements might be taken as an example of the effectiveness of a strictly empiricist reasoning not infected by realist elements. In the present empiricist treatment of the Bell inequality inadvertent realism must be considered as a possible, though unjustified, additional assumption influencing derivability of the inequality. Reference to the (simultaneous) existence or nonexistence of values of observables hinges on the attribution of a certain reality to these values. If no clear distinction is made between the reality of a measuring instrument and the reality of a microscopic object, then the reality of a measurement result is easily attributed to the microscopic object rather than to the measuring instrument. In particular, if the inadvertent realism is of an objectivistic kind this can be harmful to empiricist reasoning by involving CFD, contextualistic realism not leading to formal differences from empiricism (cf. section 2.4.5). It will be necessary to remain alert to the possibility that within an empiricist context inadvertent realism slips into the reasoning and plays a role in the derivation of the Bell inequality analogous to the one realism played in the original EPR problem, (objectivistic) realism rather than completeness being there the genuine issue leading to Einstein’s nonlocality conclusion (see also section 5.3.1).
Actually, in his discussion with Einstein, Bohr’s choice in favor of completeness can be interpreted as a choice against an objectivistic-realist interpretation rather than against locality (cf. section 5.3). That in the EPR problem the juxtaposition was different (viz, completeness versus locality) is a consequence of the fact that, due to the absence of a measuring instrument for particle 2, the interpretation, as far as this latter particle is concerned, had necessarily to be a realist one (be it, for Bohr, a contextualistic-realist interpretation). The absence of the measuring instrument made it impossible to solve the problem (as in the empiricist interpretation discussed in section 9.3) in a local way by referring to the influence of the measuring instrument for particle 2. If it is realized, however, that this measuring instrument must necessarily be there if a measurement on particle 2 is intended, then the local solution is the more natural one. As stressed in section 9.1.1, in contrast to the original EPR problem, the physical circumstances for testing the Bell inequality in EPR-Bell experiments satisfy this requirement.
9.5.4
The assumption that identical individual preparations in different EPR-Bell experiments are possible
Since the assumption of ‘completeness in a wider sense’ seems to get us into trouble (cf. section 6.3.3), it seems wise not to assume that all individual preparations are identical. However, this does not exclude the possibility that individual preparations are identical. This weaker assumption, which is in agreement both with the Copenhagen and the ensemble interpretation, is stressed by Stapp in his “nonlocality proof” given in section 9.5.1. This, at least, would imply the possibility that in some sequence of standard EPR-Bell experiments the Bell inequality is satisfied. Stapp considers the possibility of this assumption to be a consequence of locality, its experimental falsification being interpreted as evidence against locality. It is the purpose of the present section to point out that the assumptions of ‘locality’ and of ‘the possibility of identical preparations in different EPR-Bell experiments’ are not related in this implicatory way, but that the latter can be seen as an assumption additional to (and compatible with) the locality assumption, and that rejection of the additional assumption need not imply rejection of ‘locality’. In this analysis it will be important to have a clear picture of what can be the meaning of the notion of an ‘individual preparation’ in quantum mechanics. It is a merit of Stapp’s derivation of the Bell inequality as given in section 9.5.1 that it highlights the role individual preparation may play in quantum mechanical reasoning, and that it forces us to assess critically the way it is conceived. As already conceded in section 9.5.2, there does not seem to exist any a priori objection against the possibility of identical individual preparations in different EPR-Bell experiments.
Yet, it could be argued that, if quantum mechanics were not complete in a wider sense, then this need not imply that individual preparations are actually identical (compare two different preparations of the same volume of a gas, having all thermodynamic quantities identical, but yet corresponding to different points in phase space). Analogously to the practical impossibility of reproducing the preparation of a volume of gas in the sense of equal positions and velocities of all particles, it might be impossible in practice to reproduce an individual preparation of a microscopic object. Violation of the Bell inequality by the standard EPR-Bell experiments could be a result of a practical impossibility of reproducing the individual initial state of the particle pair. De Baere [399] has formulated this in terms of a ‘non-reproducibility’ hypothesis, thought to be valid within the domain of quantum mechanics next to the locality assumption. De Baere’s ‘non-reproducibility’ hypothesis is sufficient to prevent derivability of the Bell inequality without invoking nonlocality. De Baere’s ‘non-reproducibility’ hypothesis draws on the idea of quantum mechanical ‘incompleteness in a wider sense’. It can be taken either in a strict sense in which reproduction of the individual state is held to be strictly impossible (for
instance, because the state of the universe has changed as a whole between two successive individual experiments), or in a more practical sense in which the probability of obtaining a sequence of identical preparations is thought to be effectively zero, thus reducing the set of experiments expected to satisfy the Bell inequality to a set of measure zero. In both cases the ‘non-reproducibility’ hypothesis has the drawback of being so general that no special relation to quantum mechanics is evident. In particular, this approach does no justice to the role of incompatibility of observables in the violation of the Bell inequality by actual standard EPR-Bell experiments. On the basis of the thermodynamic analogy, referred to above, non-reproducibility could hold for any (sub)microscopic theory, be it classical or quantum mechanical. Without additional argumentation it cannot be understood why the probability distributions of quantum mechanical measurements are reproducible in experimental practice, i.e. are independent of the particular sequence of individual preparations realized in an ensemble, whereas, allegedly, the constitution of the sequences would be essential for making the ‘non-reproducibility’ hypothesis effective in blocking a derivation of the Bell inequality. It seems necessary to base the ‘non-reproducibility’ hypothesis on some physical principle relating to quantum mechanics. Such a relation was already put forward in section 9.2.3, a violation of the Bell inequality being associated with the incommeasurability of quantum mechanical observables. Rather than referring to the idea of ‘incompleteness of quantum mechanics in a wider sense’, this argument draws on the Copenhagen notion of ‘completeness in the restricted sense’ (cf. section 4.2.2). 
More precisely, the feature seeming to entail satisfaction of the Bell inequality is the commeasurability (possibly in a nonideal sense) of the four observables. Since reproducibility does not play any role, either in the case of the joint measurement of compatible observables or in the experiment described in section 9.3 (in which the four observables are measured jointly, be it nonideally), it seems that commeasurability is the determinative factor. This amounts to the question of whether the experimental data are (or could be) obtained within one single measurement arrangement, or whether they stem from mutually exclusive arrangements. If the quadruples are obtained within one single measurement arrangement, then the Bell inequality is satisfied because each individual preparation yields, in the sense of Stapp’s principle UR, a unique measurement result for each of the four observables. In case of commeasurability the question of whether individual preparations are reproduced in different (mutually exclusive) measurement situations does not even arise. On the other hand, in case of incommeasurability such reproducing would be necessary for actually obtaining quadruples satisfying the Bell inequality. Incommeasurability, or mutual exclusiveness of measurement arrangements, of incompatible (standard) observables, is one of the main themes in the Copenhagen interpretation. It is at the basis of Bohr’s strong correspondence principle (cf. section 4.3.2), implying that the measurement arrangement should be taken into account in defining quantum mechanical quantities, and is implemented through the notion of ‘completeness in a restricted sense’. However, in the Copenhagen interpretation no clear distinction is made between ‘preparation’ and ‘measurement’ (cf. section 4.6.1). In general the notion of ‘state preparation’¹⁰ is largely neglected compared to the contribution of measurement (compare Bohr’s preference for a discussion of the EPR problem in terms of observables, cf. section 5.3). Nevertheless, we have seen in section 7.10 that quantum mechanical incompatibility does refer both to measurement and to preparation, the Heisenberg inequality referring to the latter rather than the former (de Muynck [129]). Since no disturbance of the measured observable is involved in maximal (standard) measurements testing the Heisenberg inequality, it seems that the impossibility of simultaneous sharp values of incommeasurable (standard) observables should be a consequence of preparation in the first place. Therefore we shall now apply the idea of ‘completeness in a restricted sense’ to preparation by the following conjecture:
This conjecture was already anticipated by the introduction of the contextual state (cf. sections 2.4.5 and 9.4.2). In general the contextual states of (9.33) and of (9.34) are all different, the differences being caused by the mutual exclusiveness of the different EPR-Bell arrangements. If the initial state would be represented by the contextual state, then non-reproducibility would be warranted by the incommeasurability of the observables measured in the standard EPR-Bell experiments already at the level of the ensemble. In the quantum mechanical non-reproducibility conjecture this non-reproducibility is extended to individual preparation¹¹. Thus, the ensemble described by a contextual state can be thought to consist of subensembles of individual preparations, the subensembles being represented by the eigenvectors of the observables measured in the EPR-Bell experiment. This would clearly make identical individual preparations in different EPR-Bell experiments impossible.
¹⁰ Here by ‘state preparation’ we do not refer to Heisenberg’s interpretation of measurement as a preparation of the post-measurement state of the object, but to the preparation of the object preceding the measurement.
¹¹ Note the difference between this conjecture and the projection postulate given in section 1.6.
By the quantum mechanical non-reproducibility conjecture the nonexistence of measurement results of unperformed measurements is associated with state preparation in the context of the observable actually measured. Although we do not have direct empirical evidence of the validity of the conjecture, it at least demonstrates the logical possibility of a quantum mechanical principle of non-reproducibility of individual states within the measurement contexts of incompatible observables. Moreover, as emphasized in section 9.4.2, this non-reproducibility is a completely local affair, the presence of the measurement arrangement of one observable being responsible for the non-reproducibility, in the context of its measurement, of, e.g., the contextual state belonging to the measurement of an incompatible observable of the same particle. Identical individual preparations of particle 2 in different standard EPR-Bell experiments are not excluded by any influence of a distant measurement performed on particle 1, but by the incompatibility of the observables measured on particle 2 itself.
9.5.5
Discussion of the quantum mechanical non-reproducibility conjecture
i) ‘Contextual state’ versus ‘objective initial state’
In order to discuss the meaning of the quantum mechanical non-reproducibility conjecture it is necessary to make a clear distinction between the contextual state and the state of the object as it is prepared before it has any interaction with a measuring instrument. In agreement with the way the notion of ‘objectivity’ is employed in this book (cf. section 2.3) we shall refer to this latter state as the ‘objective initial state’. In an empiricist interpretation this state is not thought to be described by quantum mechanics (cf. section 2.5.2). The possibility of a distinction between the contextual state and the ‘objective initial state’ hinges on quantum mechanical ‘incompleteness in a wider sense’. It will be necessary to consider subquantum theories to characterize the relation between these states. Failure to do so may be a source of confusion. For this reason this problem will be reconsidered in section 10.6 in the context of subquantum (hidden-variables) theories. In the present chapter we do not want to transcend the quantum mechanical formalism, thus lacking the possibility of having a precise mathematical representation of the ‘objective initial state’. It may be clear, however, that the distinction between the two kinds of initial states may be important in assessing quantum mechanical measurement. In drawing a distinction between the contextual state and the ‘objective initial state’ the important issue might seem to be the difference between ‘preparation of an individual object’ and ‘preparation of an ensemble’. In an ensemble interpretation the mathematical formalism of quantum mechanics is not thought to describe an individual preparation (cf. chapter 6). Hence, in this interpretation the contextual state refers to an ensemble. On the other hand, the ‘objective initial state’ could be
taken as the result of an individual preparation (for instance, represented by a value of a hidden variable of a subquantum theory to be discussed in chapter 10). This (correctly) suggests that a contextual state has its origin in an ensemble of individual preparations, each element of the ensemble being described by an ‘individual objective initial state’. It is important to note, however, that the contextual state cannot be simply seen as a description of an ensemble of such ‘individual objective initial states’, because there is no trace of contextuality in the latter notion. Hence, we shall have to distinguish the contextual state from the ‘statistical objective initial state’ describing the same ensemble of objects prior to its interaction with the (ensemble of) measuring instrument(s). Like the ‘individual objective initial state’ the statistical one is outside the domain of quantum mechanics, as is the physical relation between the ‘individual objective initial states’ and the ‘individual contextual states’ to be linked to the eigenvectors of the measured (standard) observable. This latter relation will be discussed in section 10.6. In Stapp’s derivation of the Bell inequality the individual preparation is an essential element. Since it is assumed that the same individual preparation can occur in different (even incommeasurable) EPR-Bell experiments, it seems that an individual preparation must refer here to an ‘individual objective initial state’. Then De Baere’s ‘non-reproducibility’ hypothesis might be invoked to take the edge off the derivation by denying the possibility that the same individual preparation occurs in different EPR-Bell experiments. However, as mentioned in section 9.5.4, this hypothesis does not have any special relation to incompatibility of observables, necessary for the Bell inequality to be violated by quantum mechanical measurements. 
For this reason a quantum mechanical non-reproducibility conjecture was formulated there, referring to the contextual state, which does depend on the measurement arrangement. Applying this conjecture to Stapp’s reasoning in the “nonlocality proof” of section 9.5.1, it could be questioned whether the role played by the measuring instrument has sufficiently been taken into account. Does it really make sense to assume that identical individual preparations are possible in the four standard EPR-Bell experiments, and that, by the principle UR, these preparations yield a single quadruple of measurement results? As a consequence of this assumption there is in Stapp’s approach no mechanism preventing the quadruple of measurement results from being attributed, via a principle like the ‘faithful measurement’ principle (cf. section 2.4), to the microscopic object as a set of properties jointly possessed by the object preceding the measurement. This would imply that even in an empiricist interpretation a derivation of the Bell inequality would be possible along the objectivistic-realist line of section 9.4.1. By the present discussion we are once more reminded to take very seriously the empiricist idea (shared by the Copenhagen interpretation) that the value of the measured observable did not exist prior to measurement, and that quantum mechanics
is not applicable to objective reality. The object’s capacity to induce a measurement result in a measuring instrument should be clearly distinguished from the measurement result itself. Whereas the latter is within the domain of quantum mechanics (at least, if the measurement is in that domain), the former should presumably be described by means of a subquantum theory. We shall therefore have to distinguish between two different ways of approaching the same individual preparation, viz. i) preparation in a subquantum mechanical sense, preparing an ‘individual objective initial state’, ii) preparation in a quantum mechanical sense, preparing an ‘individual contextual state’. Since only the contextual state is probed by quantum mechanical measurements, only the latter preparation is relevant within the domain of quantum mechanics. The individual preparation, referred to in the quantum mechanical non-reproducibility conjecture, is a preparation in a quantum mechanical sense. Unlike preparation in a subquantum mechanical sense, quantum mechanical preparation does not seem to be independent of the measurement arrangement. We conjecture:
It follows from this conjecture that mutual exclusiveness of measurement arrangements implies quantum mechanical preparations to be different even if the ‘objective initial states’ are identical. Non-reproducibility does not apply to the ‘individual objective initial state’ but to the ‘individual contextual state’. It is a consequence of incommeasurability. On the basis of this conjecture identical individual preparations in different EPR-Bell experiments are impossible in a quantum mechanical sense due to mutual exclusiveness of measurement arrangements, which prevents the same ‘individual contextual state’ from being prepared, even though there does not seem to be any objection against identical preparations in a subquantum mechanical sense. As far as a quantum mechanical measurement result can be seen as an individual property of the microscopic object, it should be seen as a contextual property, valid only within the context of the measurement (compare section 6.6), and, hence, well-defined only through the contextual state. Although a quantum mechanical measurement result may yield certain information about the object, this information is not objective information, but information colored by the measurement. In the “nonlocality proof”, discussed in section 9.5.1, Stapp’s interpretation of a quantum mechanical measurement result is not inconsistent with an empiricist interpretation as labeling a pointer position of a measuring instrument. Yet, by his assumption UR the quantum mechanical measurement result is essentially attributed to the ‘individual objective initial state’ of the object, although realized only
CHAPTER 9. BELL INEQUALITY IN QUANTUM MECHANICS
after it has been brought into interaction with a measuring instrument. Such an attribution, however, introduces an element of objectivistic realism into the empiricist approach. According to our conjectures Stapp’s derivation of the Bell inequality cannot be completed, not so much because the values of the four observables cannot be attributed simultaneously to the microscopic object, but because no value of any quantum mechanical observable can be attributed to the ‘objective initial state’ at all. The idea that values of quantum mechanical observables cannot be attributed to the object as objective properties is firmly embedded in the Copenhagen interpretation (cf. section 6.2). The same holds true with respect to the contextual meaning of observables (cf. section 6.6). Nevertheless, the Copenhagen interpretation is liable to some confusion here. By denying the possibility of subquantum theories it is bound to fall prey to an equivocation, equating ‘quantum mechanical preparation’ and ‘subquantum mechanical preparation of an individual objective initial state’. Attribution of a quantum mechanical measurement result to an ‘individual contextual state’ can then easily be mistaken for an attribution to an ‘individual objective initial state’. This seems to have happened in Stapp’s derivation (cf. section 9.5.1). In order to block derivation of the Bell inequality it is crucial not to attribute a quantum mechanical measurement result as a property to the ‘objective initial state’ (see also section 6.4.2). Therefore it seems that we should turn this argument around, and conclude from quantum mechanical violation of the Bell inequality that the distinction between the ‘individual objective initial state’ and the ‘individual contextual state’ must be physically relevant.
Indeed, if quantum mechanical measurements are sensitive to the contextual state rather than to the ‘objective initial state’, then it does not seem to make much sense to assume, with Stapp, that the object can be identically prepared in the different standard EPR-Bell experiments, so as to have, by the principle UR, for each experiment a unique quantum mechanical result. Identical preparation of ‘individual contextual states’ in different EPR-Bell experiments would then be an unwarranted additional assumption -next to the locality assumption- on which derivability of the Bell inequality could be blamed. Due to the context dependence of the contextual state this would even hold if the ‘objective initial states’ would be the same, and if a deterministic relation would exist from this latter state to the corresponding contextual one. Unfortunately, this is unanalyzable in the quantum mechanical formalism. Subquantum mechanical states refer to subquantum mechanical properties, which may be quite different from quantum mechanical ones because they need not be restricted to certain contexts. Such an idea is not too far-fetched because it occurs in analogous situations in which two different theories refer to the same object at different levels of sophistication. For instance, compare the classical theory of rigid bodies with quantum mechanical solid state theory. No microscopic solid state theory contains the concept of rigidity inherent in the classical theory of rigid bodies (cf. section 2.4.5). Even though at every instant a well-defined atomic configuration may
9.5. A COPENHAGEN-INSPIRED EMPIRICIST APPROACH
exist, this is irrelevant to observations within the domain of classical rigid body theory because such observations are not capable of detecting deviations from a rigid body model. Applicability of this latter model is restricted to those (contextual) states of the microscopic theory that are valid within the context of experiments allowing the object to behave as a rigid body. In section 10.6 an analogy between quantum mechanics and thermodynamics will be exploited to analyze the way in which quantum mechanical measurements are not suited to deal with deviations from quantum mechanical contextual states. On the basis of these ideas it can be argued that the contextual state can be seen as anticipating an aspect of subquantum physics already within the domain of quantum mechanics, and, for this reason, allows a certain realist interpretation (compare the modal interpretation, discussed in section 6.6). Because of equality (3.39), the contextual state does not have any observational significance over and above the density operator, neither as a preparation procedure (in an empiricist interpretation) nor as a state description (in a realist interpretation). It might nevertheless be seen as representing a certain aspect of reality having observational consequences only when probing the ‘individual objective initial state’ rather than the quantum mechanical one. Since quantum mechanical measurements yield measurement results that do not distinguish between the density operator and the contextual state, this is outside the domain of application of quantum mechanics, however. In Stapp’s derivation quantum mechanical measurement results are, in fact, attributed to the ‘individual objective initial state’. This seems to stretch the domain of applicability of quantum mechanics beyond the domain of validity of ‘completeness in a restricted sense’, and, not surprisingly, implies departure from the empirical consequences of this theory.
The logical possibility that contextual states represent preparations as far as these are relevant to measurements within the domain of quantum mechanics implies that we may, conversely, conclude from the violation of the Bell inequality by quantum mechanics that the ‘individual objective initial state’ does not have a meaning within this domain. In order to probe the ‘individual objective initial state’ (and find the Bell inequality experimentally satisfied) we shall have to perform measurements belonging outside the domain of quantum mechanics (cf. chapter 10). For quantum mechanical measurements the additional assumption that identical preparations are possible in different EPR-Bell experiments seems to be based on an unwarranted identification of ‘individual objective preparations’ (preparing ‘individual objective initial states’) and ‘quantum mechanical’ preparations.
ii) Complementarity in preparation and in measurement
The distinction between an ‘individual objective preparation’ and a quantum mechanical one is closely related to the Copenhagen idea of ‘complementarity in a restricted sense’, emphasizing the essential role played by the measuring instrument in interpreting quantum mechanical measurement results. Both preparation and measurement arrangements are thought to share a certain responsibility for the realization of a quantum mechanical measurement result. According to the Copenhagen interpretation it does not make sense to attribute a measurement result to the microscopic object in an objective sense prior to any interaction with the measuring instrument. A quantum mechanical observable simply is not a property of the microscopic object independent of the measurement arrangement, as would be required by an objectivistic-realist interpretation of quantum mechanics. At least, it is clear from the derivation of the Bell inequality from the ‘possessed values’ principle (cf. section 9.4.1) that such an assumption entails disagreement with experiment. As already noted in chapter 4, one of the main problems the Copenhagen interpretation is subject to in discussing complementarity is its confounding of ‘preparation’ and ‘measurement’. Indeed, Heisenberg’s idea of measurement disturbance (cf. section 4.6.2) was based on a thorough confusion of ‘measurement’ and ‘preparation’, a measurement being interpreted as a preparation of a post-measurement state of the microscopic object, liable to disturbance by the measurement interaction. However, in general the post-measurement state of the object is irrelevant to a proper functioning of a quantum mechanical measuring instrument. As explained in chapters 3 and 7, it is the final state of the measuring instrument that is the relevant issue here. However, from this point of view, too, Heisenberg measurement disturbance was seen to be an important source of complementarity, described by the Martens inequality rather than by the Heisenberg one.
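The Heisenberg inequality mentioned here can be checked numerically. The following is a minimal sketch, assuming the standard (Robertson) form ΔA ΔB ≥ ½|⟨[A,B]⟩| and, as illustrative choices not taken from the text, a single spin-1/2 object with the Pauli observables σx and σy. The function names are hypothetical. Both sides of the inequality are computed from the initial state vector alone; neither a measuring instrument nor a post-measurement state enters the calculation.

```python
import numpy as np

# Pauli matrices, used as an illustrative pair of incompatible observables.
SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)


def expectation(op, psi):
    """Expectation value <psi|op|psi> for a normalized state vector."""
    return np.vdot(psi, op @ psi)


def std_dev(op, psi):
    """Standard deviation of a Hermitian operator in the state psi."""
    mean = expectation(op, psi).real
    mean_sq = expectation(op @ op, psi).real
    return float(np.sqrt(max(mean_sq - mean ** 2, 0.0)))


def heisenberg_sides(a, b, psi):
    """Return (dA * dB, |<[A, B]>| / 2), both evaluated in the state psi."""
    lhs = std_dev(a, psi) * std_dev(b, psi)
    commutator = a @ b - b @ a
    rhs = 0.5 * abs(expectation(commutator, psi))
    return lhs, float(rhs)


# A few random pure states (fixed seed so that the run is reproducible).
rng = np.random.default_rng(0)
results = []
for _ in range(5):
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)
    results.append(heisenberg_sides(SIGMA_X, SIGMA_Y, psi))

# The sigma_z eigenstate saturates the inequality for this pair.
saturating = heisenberg_sides(SIGMA_X, SIGMA_Y, np.array([1, 0], dtype=complex))

for lhs, rhs in results + [saturating]:
    print(f"dA*dB = {lhs:.4f}   |<[A,B]>|/2 = {rhs:.4f}")
```

The only input to `heisenberg_sides` is the state vector, which is the sense in which the inequality refers to preparation rather than to measurement disturbance.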
The distinction between an ‘individual objective preparation’ and a quantum mechanical one, made above, offers an opportunity to clarify the role played by the Heisenberg inequality in complementarity. Complementarity as expressed by the Heisenberg inequality does not seem to be related at all to measurement disturbance (compare Ballentine’s criticism, sections 4.7.3 and 7.10.3), since only the initial state is involved. As already noted in section 7.10.3, the Heisenberg inequality must have another source. Any complementarity described by it should refer to quantum mechanical preparation in the first place. It seems that by confounding quantum mechanical ‘preparation’ and ‘measurement’ the Copenhagen interpretation has fallen prey to another equivocation, failing to distinguish ‘complementarity in preparation’ from ‘complementarity in measurement’. Whereas the latter issue has received ample attention both in the discussions of the ‘thought experiments’ (cf. chapter 4) and of joint nonideal measurements of incompatible observables (cf. chapter 7), the former one has remained underexposed. The idea of ‘complementarity in preparation’ is implemented by the assumption of ‘contextuality of quantum mechanical preparation’ given above. Here the contextual state is viewed as representing the initial state of the measurement, in which the source of the microscopic object and the measurement arrangement have cooperated (at a subquantum level, cf. section 10.6) to create the reality to be probed by the quantum mechanical measurement. Hence, Bohr may be completely right in supposing that quantum mechanical measurements probe only a microscopic reality that is influenced by the measurement arrangement itself, and do not probe objective reality as Einstein liked to have it. The important lesson to be drawn is that the quantum mechanical measurement postulates should not be applied to physical situations in which the microscopic object does not interact with a measuring instrument. By taking into account this restriction of the applicability of quantum mechanics it is possible to contemplate a kind of complementarity related to the impossibility of preparing the same initial contextual state in mutually exclusive measurement arrangements. With hindsight this yields a possible explanation of the Copenhagen confusion of ‘preparation’ and ‘measurement’ in the interpretation of the Heisenberg inequality. Quantum mechanical preparation, as far as relevant to quantum mechanical measurement, does depend on the measurement arrangement. However, this dependence should not be equated to the quantum mechanical interaction between object and measuring instrument, which is described by a Schrödinger equation, and which is responsible for the Heisenberg measurement disturbance. It should be associated with a transition from an ‘individual objective initial state’ to an ‘individual contextual state’, which is a subquantum mechanical process not described by quantum mechanics (cf. section 10.6). Once again the Copenhagen interpretation seems to have cut itself off from the possibility of a better understanding by adopting ‘completeness in a wider sense’. Violation of the Bell inequality by quantum mechanical measurements can be seen as a consequence of the impossibility of having the same contextual state in mutually exclusive measurement arrangements.
On this view violation of the Bell inequality is just a consequence of ‘quantum mechanical complementarity in preparation’, based on incommeasurability of observables. Whereas at the level of the ‘individual objective initial preparation’ no complementarity need obtain (and, hence, due to reproducibility of the ‘individual objective preparation’, the Bell inequality may be satisfied in measurements probing the ‘individual objective initial state’), this may be different for preparations that are relevant to quantum mechanical measurements even if the same objective preparation is applied in mutually exclusive measurement arrangements. ‘Complementarity in preparation’ does not play a role in the generalized EPR-Bell experiment discussed in section 9.3.1, because only one single measurement arrangement is involved, thus enabling the Bell inequality to be satisfied in this experiment.
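The remark that the Bell inequality is satisfied whenever a single arrangement yields all four results at once can be illustrated by a short calculation. The sketch below assumes the CHSH form of the inequality, |⟨A1B1⟩ + ⟨A1B2⟩ + ⟨A2B1⟩ − ⟨A2B2⟩| ≤ 2, which need not coincide in notation with the inequality used in this chapter, and it assumes dichotomic results ±1. The names `chsh`, `quadruples` and `weights` are hypothetical. For every quadruple of results the CHSH combination equals +2 or −2, so any probability distribution over quadruples, whatever its origin, gives an average between −2 and +2.

```python
import itertools

import numpy as np


def chsh(a1, a2, b1, b2):
    """CHSH combination for one quadruple of dichotomic results (+1 or -1)."""
    return a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2


# All 16 quadruples (a1, a2, b1, b2) that one single arrangement could yield.
quadruples = list(itertools.product([+1, -1], repeat=4))
values = [chsh(*q) for q in quadruples]

# An arbitrary joint probability distribution over the quadruples
# (random weights, fixed seed; purely illustrative).
rng = np.random.default_rng(1)
weights = rng.random(len(quadruples))
weights /= weights.sum()

s_mean = float(np.dot(weights, values))
print("possible CHSH values per quadruple:", sorted(set(values)))
print("average over the joint distribution:", round(s_mean, 4))
```

The bound follows from the existence of quadruples alone; no assumption about locality is used in this calculation.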
iii) Complementarity and CFD
Given an objective preparation we are free to choose any of the EPR-Bell measurement setups. This essentially amounts to reproducibility of the ‘individual objective initial state’ in different EPR-Bell experiments. Stapp’s reasoning clearly demonstrates that, because of this reproducibility, attribution of a value of one single observable to the ‘individual objective initial state’ immediately entails simultaneous attribution of values of incompatible observables, and, consequently, implies CFD again (cf. section 9.4.1). As follows from the derivability of the Bell inequality from CFD, quantum mechanical complementarity must be seen as the source of failure of CFD in quantum mechanics. However, as seen above, there exist two different kinds of complementarity. It follows that ‘complementarity in measurement’ may provide a different mechanism from ‘complementarity in preparation’ for explaining the failure of CFD. Whereas in the former one Heisenberg disturbance is responsible for different quadruples of measurement results in different EPR-Bell experiments (compare section 9.3.1), no Heisenberg disturbance is involved in the transitions to different contextual states related to the latter form of complementarity. This difference was rightly stressed by Stapp [398] in his comment on de Muynck, De Baere and Martens [377], presenting the generalized EPR-Bell experiment discussed in section 9.3.1. This observation was used by Stapp to argue in favor of his contention of not assuming CFD at all in the derivation presented in section 9.5.1. However, it should be stressed that even if Heisenberg disturbance is not involved, the second source of failure of CFD, viz. ‘complementarity in preparation’, remains active. Ignoring this in Stapp’s derivation entails a danger of treating quantum mechanical measurement results in a way very similar to the one employed by Ballentine (cf. section 4.7.3), and not unlike the assumption of an ‘element of physical reality’, already observed by Bohr to be in disagreement with ‘complementarity’ as embodied by the idea of ‘completeness in a restricted sense’. Therefore it may very well be the additional assumption of reproducibility of the ‘individual objective initial state’ in different EPR-Bell experiments, rather than locality, which is responsible for the derivability of the Bell inequality also in Stapp’s more recent derivation. It is an important asset of Stapp’s recent derivation that it draws attention to the existence of two kinds of complementarity, a difference that the Copenhagen confounding of ‘preparation’ and ‘measurement’ tends to ignore. It is also important to note that by itself this distinction is not established easily by just considering the standard EPR-Bell experiments in which only ‘complementarity in preparation’ is effective, and from which Stapp draws quite different conclusions. However, it is also true that this distinction is not evident from the generalized measurement of section 9.3.1, in which only ‘complementarity in measurement’ is effective. The role of ‘complementarity in preparation’ does seem to be evident,
though, when comparing the standard EPR-Bell experiments (in which it prevents the Bell inequality from being satisfied) with the generalized one (in which it is ineffective, and, hence, the Bell inequality is satisfied). Only by taking into account all possible experimental evidence is it possible to get sufficient information to assess which features are essential in quantum mechanical measurement processes. By doing so it also becomes evident that, since the standard and the generalized experimental arrangements are similar with respect to locality, this latter issue cannot explain why the Bell inequality is violated in the former experiments but is satisfied in the latter.
9.6 Bell’s theorem without inequalities
In this section the problem of complementarity will be approached from a somewhat different perspective, with a view to demonstrating that it is in the first place a problem of the structure of quantum ensembles rather than of the Bell inequality. This way of dealing with complementarity was first discussed by Hardy [378, 400] (see also Greenberger et al. [299]).
9.6.1 Problem and derivation
In the present section we will follow Stapp ([246], p. 5, [401]) in his discussion of a reasoning due to Hardy, intended to force a contradiction with quantum mechanics that is not based on the Bell inequality. The usual four EPR-Bell experiments (cf. figure 9.1) are considered, each observable being assumed to be dichotomic. The eigenvalues of all observables are denoted by + and –, respectively. A quantum mechanical state vector is considered for which the following relations are satisfied by the conditional probabilities
and by the joint probability
As is easily verified this can be realized by choosing the state vector according to
and by taking according to
and
such that their eigenvectors are related to those of
and
For these choices (9.43) is satisfied with In the Hardy-Stapp derivation two conclusions are drawn from (9.40) through (9.42):
However, conclusion B contradicts (9.43) since
As before, the contradiction is interpreted as evidence of the logical incompatibility of the predictions of quantum mechanics with the assertion that the outcome of any measurement performed on one part of a quantum system is independent of which measurement is performed simultaneously on a faraway part, i.e. as evidence of nonlocality. We first give the reasoning entailing conclusions A and B, employing the usual properties of conditional and joint probabilities (cf. appendix A.12), as also used in section 9.2.1. The following notation is introduced:
Then the reasoning can be summarized as follows:
Conclusion A can be reached in the following way:
Next:
Since for the state (9.44) we have Hence
this implies
Conclusion B follows in a completely analogous manner from
and the equality
9.6.2 Discussion of the Hardy-Stapp formulation
It is first noted that the four probabilities (9.40) through (9.43) are unproblematic if considered separately, because each one refers to mutually compatible observables only. However, the quantum mechanical significance of the conditional probability figuring in conclusion A is less clear, because the two observables it involves are incompatible. Since this conditional probability is used in the derivation of conclusion B, this latter conclusion may also be problematic, even if the probability present in conclusion B once again refers to compatible observables only. As seen from the derivation given above, the Hardy-Stapp reasoning can be completed on the basis of the Kolmogorov rules of classical probability theory. The question is whether this is allowed if complementarity is duly taken into account. In the Hardy-Stapp reasoning the two incompatible observables are thought not to be measured in the same EPR experiment. For this reason the conditional probability is not associated here with a joint probability distribution of a joint nonideal measurement of the two incompatible observables, as was the case in section 7.9. Presumably, the meaning attributed to it in the reasoning is the following one, seeming to be consistent with Stapp’s assumption UR (cf. section 9.5.1): it is the probability that a measurement of the one observable yields the value in question if the ensemble is restricted to those preparations for which a measurement of the other, incompatible, observable would have yielded its specified value.
It is important to note that on this reading the condition in the antecedent represents a preparation rather than a measurement result, be it a preparation labeled by a value of an observable. Since any preparation, valid within the domain of quantum mechanics, should be represented by a density operator or a state vector, the question now is, which is the correct one. Since in general an ensemble with a sharp value of a standard observable is thought to be represented by the corresponding eigenvector, it seems that this eigenvector is the only candidate. Note, however, that this would certainly not be the case if the preparation considered here were a conditional preparation (i.e. conditional on a measurement result obtained in a joint (nonideal) measurement of the two incompatible observables, as discussed in chapter 7). Indeed, the choice of the eigenvector completely ignores the incompatibility of the two observables. It is therefore quite possible that this choice is already at the basis of the contradiction found, and should be rejected. This, indeed, will be our final conclusion. However, for the time being we shall stick to the above-mentioned interpretation of the conditional probability in order to be able to clarify in which manner this leads to a contradiction. For the interpretation of the conditional probability given above, it is of crucial importance that identical individual preparations are possible in the different contexts of the measurements of the two incompatible observables. As already remarked in section 9.5.4, this seems to be an unproblematic assumption at first sight, because the preparation takes place before the measurement is carried out. For this reason we shall accept this assumption for the time being as reasonable, and assume identical individual preparations in all four standard EPR-Bell experiments.
Having an ensemble of such individual preparations, it must be possible to distinguish different subensembles corresponding to the possible measurement results. Because of (9.40) through (9.42), in the state (9.44) these subensembles should satisfy
If this is thought to imply relation (9.48), then a paradox arises because this is at odds with what follows from (9.43). It seems that the source of the paradox must be sought in the transition from (9.47) to (9.48). In this transition the fact is ignored that compatibility of quantum mechanical observables is not transitive, i.e. compatibility of a first observable with a second one, and of the second with a third, does not imply compatibility of the first with the third. There does not exist any quantum mechanical state corresponding to an ensemble with sharp values of two incompatible observables, as would be required by (9.48). In quantum mechanics the expression in (9.48) cannot be given any content in the way intended here. Evidently, it is impossible to subdivide the ensemble in a classical way into subensembles corresponding to well-defined
values of all four observables without running into a contradiction. Indeed, such a subdivision would be tantamount to the assumption of CFD, the (in)validity of CFD being a different way of expressing the distinction between classical and quantum ensembles. Once again the question can be posed whether the failure of CFD in quantum mechanics is a consequence of nonlocality, or whether it has another origin. It is clear that the possibility of additional assumptions is equally present here as it was in the preceding sections. Therefore the conclusion of nonlocality is as questionable as it was before. As a matter of fact, it is clear that also here the issue of ‘complementarity in preparation’ is not taken into account. By the transition from (9.47) to (9.48) subensembles of incommeasurable observables are essentially thought to be united in one classical ensemble of identically prepared objects. This implies that the quantum mechanical measurement results are attributed to an objective preparation, this preparation being the same in all measurements. This, however, seems to be an instance of inadvertent (objectivistic) realism that may not be allowed within the domain of quantum mechanics. It is straightforward to calculate for the initial state (9.44) the contextual states valid in the EPR-Bell experiments (compare (9.33) and (9.34)):
The different density operators in (9.49) are consistent with (9.40) through (9.43). In the context of the measurement of one observable of particle 2 the quantum mechanical preparation of this particle can be different from the one obtained in a measurement of an incompatible observable, because the contextual state is thought to represent a preparation that is co-determined by the measurement arrangement for particle 2. It is easily verified that in (9.49) the contextual state of one particle is independent of which observable is measured on the other particle, exhibiting the locality involved in the notion of ‘contextual state’. No nonlocal influences being involved, it seems that, like the derivability of the Bell inequality, the present paradox should also be blamed on the additional assumption of an objectivistic-realist attribution of quantum mechanical measurement results to the object as properties possessed prior to and independently of the measurement. The paradox implied by (9.48) does not seem to be essentially different from the paradox involved in the well-known reasoning: “I fit into my shirt, my shirt fits into my suitcase, hence I fit into my suitcase”, ignoring the contextual meaning of the concept ‘shirt’.
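The structure of a Hardy-type argument can be made concrete with a small numerical check. The sketch below does not use the state (9.44) or the observables of this section; it assumes a commonly used two-qubit Hardy-type state, (|00> + |01> + |10>)/√3, with σz and σx measured on each particle, and all names in the code are hypothetical. It shows three relations that hold with certainty, each involving compatible observables only, together with a nonzero joint probability that would be excluded if the three relations were chained classically. It also shows that compatibility is not transitive for the observables involved.

```python
import numpy as np

KET0 = np.array([1.0, 0.0])             # sigma_z eigenvector, result "0"
KET1 = np.array([0.0, 1.0])             # sigma_z eigenvector, result "1"
MINUS = (KET0 - KET1) / np.sqrt(2.0)    # sigma_x eigenvector, result "-"
IDENTITY = np.eye(2)

# Assumed Hardy-type state (illustrative; not the state (9.44) of the text).
PSI = (np.kron(KET0, KET0) + np.kron(KET0, KET1) + np.kron(KET1, KET0)) / np.sqrt(3.0)


def projector(vec):
    """Projection operator onto a normalized single-particle vector."""
    return np.outer(vec, vec.conj())


def joint(p1, p2):
    """Joint probability of result p1 on particle 1 and p2 on particle 2."""
    return float(np.real(np.vdot(PSI, np.kron(p1, p2) @ PSI)))


# Conditional probabilities within one experiment (compatible observables).
p_z2_is_1_given_x1_minus = joint(projector(MINUS), projector(KET1)) / joint(projector(MINUS), IDENTITY)
p_z1_is_1_given_x2_minus = joint(projector(KET1), projector(MINUS)) / joint(IDENTITY, projector(MINUS))

# Joint probabilities in the two remaining experiments.
p_z1_is_1_and_z2_is_1 = joint(projector(KET1), projector(KET1))
p_x1_minus_and_x2_minus = joint(projector(MINUS), projector(MINUS))

# Compatibility is not transitive: Z1 ~ X2 and X2 ~ X1, but Z1 and X1 do not commute.
SIGMA_Z = np.diag([1.0, -1.0])
SIGMA_X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z1, X1, X2 = np.kron(SIGMA_Z, IDENTITY), np.kron(SIGMA_X, IDENTITY), np.kron(IDENTITY, SIGMA_X)


def commute(a, b):
    """True if the two operators commute (up to rounding)."""
    return bool(np.allclose(a @ b, b @ a))


print(p_z2_is_1_given_x1_minus, p_z1_is_1_given_x2_minus)
print(p_z1_is_1_and_z2_is_1, p_x1_minus_and_x2_minus)
print(commute(Z1, X2), commute(X2, X1), commute(Z1, X1))
```

Read classically, the two conditional probabilities equal to 1 would force both σz results to be "1" whenever both σx results are "−", which has probability 1/12, whereas the joint probability of both σz results being "1" is zero. Each number refers to a different measurement arrangement, which is the point at issue in the discussion above.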
Chapter 10 Subquantum or hidden-variables theories
10.1 Introduction
The idea of a subquantum or hidden-variables theory originates with a comparison of the statistical characters of quantum mechanics, manifesting itself in the quantum mechanical probability distribution (1.3) or (1.7), and classical statistical mechanics. The idea is that quantum mechanics is not ‘complete in a wider sense’ (cf. section 4.2.1), to the effect that the quantum mechanical state vector does not yield a complete description of the state of an object, but that it is possible to specify the state more precisely by giving additional information using so-called hidden variables. By the value of the hidden variable it is then thought to be determined which measurement result is obtained if an observable is measured. In this manner an explanation of quantum mechanical statistics could be obtained. In classical statistical mechanics such a “hidden” variable is given, for instance, by the set of position and momentum coordinates of all particles of a gas. This hidden variable is represented by a point in phase space. It is unobservable, although in the classical theory not fundamentally so: there is merely a practical impossibility of observing all of these coordinates. The observables of a gas are quantities like total energy, pressure, temperature, etc., i.e. the macroscopic quantities of thermodynamics. The idea is that these latter quantities are uniquely determined by the position and momentum coordinates. In any case, the observables constitute only a subset of all possible physical quantities of the system. Something analogous could be the case for quantum mechanics. Perhaps the physical quantities of quantum mechanics (i.e. the observables represented by Hermitian operators or POVMs), too, correspond to a restricted subset of all possible physical quantities of some hidden-variables theory. Two different kinds of hidden-variables theories should be distinguished. In the
first kind quantum mechanics is an integral part of the theory, hidden variables being extra variables existing next to quantum mechanical observables and state vector (in this way they were originally dealt with by von Neumann [2]). This is a hybrid approach, mixing quantities of a very different character in one theory, and therefore liable to ambiguities (cf. section 10.2). Quantum mechanics, equipped with the ‘possessed values’ principle (cf. section 2.3), can be seen as a degenerate form of this hybrid kind of hidden-variables theories, the ‘possessed values’ of the observables being considered as hidden variables. Proofs like those by Mermin (cf. section 6.4.2) are “no go” theorems for such hidden-variables theories. In the second kind the quantum mechanical observables and state vectors are not entities of the hidden-variables theory. However, in order that the domain of application of the hidden-variables theory contain the domain of quantum mechanics, it is necessary that some mapping exist from the quantum mechanical quantities into the set of hidden-variables quantities. In this case we have a realist underpinning of quantum mechanics by a new theory rather than an extension of quantum mechanics by means of additional (hidden) quantities. It seems that the second kind of theories is more fundamental than the first. We shall often refer to such theories as subquantum theories instead of hidden-variables theories. In subquantum theories we often start from the existence of a state space (not necessarily a linear space) of values of the hidden variable, comparable to classical phase space. In contrast to (values of) quantum mechanical observables, the hidden variable is thought to describe an objective property of the object. It can represent a set of values of hidden variables of different particles, or even a field variable (analogous to the description of a classical field).
In a dynamic theory these variables depend on time. Physical quantities are functions defined on the state space; states correspond to probability distributions on this space. A state described by a well-defined value of the hidden variable is called a dispersionless state. These states can be interpreted as the ‘objective initial states’ referred to in section 9.5.4, representing the results of individual preparations. In general, a statistical state, described by a probability distribution on the state space, has a certain spreading (dispersion) of the values of the hidden variable. For dispersionless states this spreading is zero. A statistical state describes an ensemble. A quantum mechanical state (described by a state vector) cannot correspond to a dispersionless hidden-variables state, but only to a statistical one. This is closely related to the ensemble or ‘statistical’ interpretation of quantum mechanics as endorsed by Einstein, Margenau and Ballentine (cf. section 4.7 and chapter 6), in which quantum mechanical statistics is explained by different values of the hidden variable in different individual preparations of an ensemble. Actually, the possibility of describing individual preparations is the main advantage of the introduction of hidden variables. In the Copenhagen interpretation it is thought that hidden variables do not exist (compare section 6.2.2): allegedly, before the measurement of an observable
objects having the same state vector cannot be distinguished, even if yielding different measurement results. For the above-mentioned authors this is unsatisfactory. Admittedly, it is possible to renounce any explanation of correlations. However, for those who think that physics should not only describe but also explain phenomena, it is far more attractive to assume that to an individual particle a certain property can be attributed (to be described by a hidden variable), that can be held responsible for the fact that a well-defined value of an observable is found if that observable is measured. Strict correlations, like the ones observed in EPR experiments, could then be explained in a natural way by means of strictly correlated values of hidden variables of the correlated particles. A major task of such a hidden-variables underpinning of quantum mechanics is to demonstrate how it can cope with the inapplicability of the ‘possessed values’ principle to quantum mechanics. As a matter of fact, it seems that a quantum mechanical measurement result could be considered an objective property of the object as well if it were fixed in a deterministic way by the value of the hidden variable. This may be the reason of Einstein’s assumption that in an EPR experiment (figure 5.1) an ‘element of physical reality’ of particle 2 can be represented by a measurement result of a quantum mechanical observable of that particle. As discussed in section 6.4.2 this view cannot be maintained, however. It will be important to distinguish between quantum mechanical measurement results and values of a quantity of a subquantum theory, not satisfying the rules of quantum mechanics, and being able to play the role of the subquantum mechanical ‘element of physical reality’ referred to in section 6.4.3. 
Note that such a distinction is strongly suggested by an empiricist interpretation of quantum mechanics, in which a quantum mechanical measurement result is a property of the measuring instrument rather than a property of the object. As is well known, the idea of a hidden-variables underpinning of quantum mechanics has widely been rejected due to its metaphysical character, hidden variables being thought not to have any observational consequences. By itself this argument against hidden variables is convincing only to those adhering to a strictly empiricist philosophy, rejecting ‘explanation’ as a task to be performed by science. However, the argument does not seem to be applicable any more after Bell [33] has derived from hidden-variables theory an inequality (viz. (9.16)) that is operationally testable (by means of EPR-Bell experiments, cf. section 9.1.1). This has made hidden-variables theories respectable to a certain extent. On the other hand, since Bell’s inequality is violated by quantum mechanics and by experiment (even though the experimental results do not seem to be convincing to everyone, e.g. Sulcs et al. [402]), it can be interpreted as increasing the number of “no go” theorems demonstrating that (certain) hidden-variables theories are impossible (compare section 10.2). This, once again, would make obsolete hidden-variables theories excluded by Bell’s inequality, this time on the basis of
“hard” experimental evidence. We are still interested in hidden-variables theories for two reasons. First, as already put forward in chapter 2, an empiricist interpretation of quantum mechanics provides a natural impetus towards subquantum theories describing the microscopic object itself rather than just certain measurement phenomena. Being aware of the restricted applicability of a physical theory, looking for the boundaries of its domain of application, and trying to transcend these boundaries have always been hallmarks of good physics. It is not clear why this should not be true for quantum mechanics. Subquantum theories will have to provide the explanations (cf. section 6.4) quantum mechanics is not capable of giving. A second reason for being interested in subquantum theories is the nature of their explanations. Here we are once more within the domain of metaphysics, at least as far as present-day experimentation goes. However, only by understanding the physical mechanisms underlying quantum mechanical phenomena will it be possible to experimentally transcend the boundaries of quantum mechanics. In particular, the nonlocality question, already encountered when discussing the Bell inequality in the quantum mechanical formalism (cf. chapter 9), is an important issue also here: does violation of the Bell inequality imply nonlocality? Contrary to widespread belief (Laudisa [403]), I do not think so [404]. As in section 9.5.2, within the context of derivations of the Bell inequality from hidden-variables theories we must also raise the question of ‘additional assumptions’. Are there additional assumptions next to the locality assumption? Is a derivation still possible if the additional assumptions are dropped, maintaining locality? Such additional assumptions may give us more insight into the meaning of quantum mechanics itself, and may suggest what experiments should be performed to probe the boundaries of the domain of application of this theory.
In section 10.6 one such additional assumption will be discussed that seems to be of particular importance, viz. quasi-objectivity, implying that a measurement result can be conditioned -either in a deterministic (cf. section 10.4.2) or in a statistical/stochastic (cf. section 10.4.3) sense- on an instantaneous value of the hidden variable. In hidden-variables theories it is natural to try to describe an individual preparation by an initial value of the hidden variable. However, employing the analogy of quantum mechanics and thermodynamics, conjectured earlier by many authors (e.g. de Broglie [405], Bohm et al. [406, 407], Nelson [408, 409], Davidson [410], Dürr et al. [411]), it is argued that the assumption of quasi-objectivity may not be justified within the domain of quantum mechanics. The analogy with thermodynamics suggests the introduction into hidden-variables theory of a distinction between microstates and macrostates, of which only the latter are experimentally probed by quantum mechanical measurements. Although the analogy has not been developed to the full extent of a local hidden-variables theory reproducing all results of quantum mechanics, this distinction seems to be promising. In particular,
it demonstrates the feasibility of the dissociation of quantum mechanical and subquantum mechanical ‘elements of physical reality’, in section 6.4.3 found necessary lest ‘explanation by means of subensembles’ be possible.
10.2 “No go” theorems
We shall first discuss a number of attempts at proving the impossibility of hidden-variables theories reproducing the results of quantum mechanics. Such attempts have been inspired by the fundamentally different structures of classical and quantum mechanics, observed in section 1.8.
10.2.1 Von Neumann
Von Neumann’s proof of the impossibility of the existence of hidden variables next to the quantum mechanical state vector ([2], section IV.2) purports to demonstrate the impossibility of dispersionless states. It is based on the following general requirements, to be met by physical quantities (like Bohr, von Neumann refers to ‘physical quantities’ rather than to ‘observables’), represented by Hermitian operators. Let $\mathcal{A}$ and $\mathcal{B}$ be physical quantities; then the expectation values (i.e. the ensemble averages) $\langle\mathcal{A}\rangle$ and $\langle\mathcal{B}\rangle$ of the measured values of $\mathcal{A}$ and $\mathcal{B}$ satisfy:
1. $\langle\mathcal{A}\rangle \geq 0$ if all values of $\mathcal{A}$ are non-negative.
2. If $a$ and $b$ are real numbers, then $\langle a\mathcal{A} + b\mathcal{B}\rangle = a\langle\mathcal{A}\rangle + b\langle\mathcal{B}\rangle$.
3. If physical quantity $\mathcal{A}$ corresponds to the Hermitian operator A, then physical quantity $f(\mathcal{A})$ corresponds to operator $f(A)$.
4. If $\mathcal{A}$ corresponds to operator A, and $\mathcal{B}$ to B, then $\mathcal{A} + \mathcal{B}$ corresponds to operator A + B. This is required for both commuting and noncommuting operators A and B.
Assuming that all Hermitian operators correspond to physical quantities, von Neumann derives from these requirements that for any physical quantity $\mathcal{A}$
$\langle\mathcal{A}\rangle = \mathrm{Tr}\,\rho A$, (10.1)
with $\rho$ a density operator (cf. section 1.4).
We first make three remarks on (10.1) and its derivation:
1. When applying (10.1) to physical quantities corresponding to (Hermitian) projection operators, this essentially is the result of Gleason’s theorem (1.94).
2. It is obvious that (10.1) can hold only for a physical quantity corresponding to a Hermitian operator and a state corresponding to a density operator. In von Neumann’s hybrid hidden-variables model no other physical quantities seem to exist than those represented by Hermitian operators, i.e. quantum mechanical observables. In particular, the hidden variables themselves are not recognized as physical quantities. Gleason’s theorem is closely tied up with the linear structure of Hilbert space. However, in order to obtain a “no go” theorem, von Neumann applied (10.1) also outside this structure. There does not seem to exist a justification for this (see Bell [412]).
3. In the derivation of (10.1) requirement 4 plays an essential role, in the sense that it is applied to incompatible observables A and B. On the basis of Gleason’s theorem we now know that this assumption is possible. However, for von Neumann the assumption was rather a hazardous one, its validity being closely related to the applicability of the linear structure of quantum mechanical Hilbert space.
Von Neumann gives two arguments in favor of the impossibility of hidden variables: 1. Dispersionless states cannot exist because the Heisenberg inequality (cf. section 1.7.1) must be satisfied. Such a dispersionless state would have to correspond to the “actual” state the system is in, quite analogously to the dispersionless states of classical statistical mechanics (see also section 1.4). 2. A homogeneous ensemble cannot be represented as a mixture of two different ensembles. For this reason the homogeneous ensembles correspond to the extreme elements of the convex set of density operators (cf. section 1.4). These are the one-dimensional projection operators (cf. appendix A.11.3). This implies that the homogeneous ensembles coincide with the pure states (1.33). But pure states are not dispersionless. If dispersion would be a consequence of the existence of a hidden variable determining which value of the observable is obtained, then the ensemble would not be homogeneous any more. This is in contradiction with the previously assumed homogeneity of pure states (cf. section 6.2.3).
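The statement that pure states are not dispersionless can be checked numerically for the simplest case. The following is a minimal sketch, assuming a spin-1/2 particle with the Pauli matrices as spin observables and the trace rule (10.1) for expectation values; the function names are illustrative and do not occur in the text.

```python
import numpy as np

# Pauli matrices: spin components of a spin-1/2 particle in units of hbar/2.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def expectation(rho, op):
    """Quantum mechanical expectation value Tr(rho op), cf. (10.1)."""
    return float(np.real(np.trace(rho @ op)))


def variance(rho, op):
    """Dispersion <op^2> - <op>^2 of observable op in state rho."""
    return expectation(rho, op @ op) - expectation(rho, op) ** 2


def pure_state(theta, phi):
    """One-dimensional projector on the spin state with polar angles (theta, phi)."""
    psi = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
    return np.outer(psi, psi.conj())


def total_dispersion(rho):
    """Sum of the dispersions of the three spin components."""
    return sum(variance(rho, s) for s in (SX, SY, SZ))


if __name__ == "__main__":
    up = pure_state(0.0, 0.0)  # eigenstate of SZ
    print("Var(SZ), Var(SX) in the SZ eigenstate:", variance(up, SZ), variance(up, SX))
    print("total dispersion of a generic pure state:", total_dispersion(pure_state(0.7, 1.9)))
```

In this sketch a pure state can be dispersionless for one spin component, but the summed dispersion of the three components is the same nonzero number for every pure state, so no quantum mechanical pure state is dispersionless for all components at once.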
It is important to note that both arguments refer to quantum mechanical states, described by state vectors or density operators. Indeed, dispersionless quantum mechanical states do not exist. However, this does not imply that no dispersionless hidden-variables states could exist. Taking into account a possible analogy with
classical statistical mechanics, we see that empirically relevant states (for instance, the states of the canonical ensemble) are not dispersionless either, whereas the theory also contains dispersionless states to be compared with hidden-variables ones. It is very well possible that the quantum mechanical states, too, constitute only a restricted subset of all possible states. This latter remark actually takes the edge off von Neumann’s arguments: 1. Since the Heisenberg inequality can be proven only for quantum mechanical states, no argument can be derived from it with respect to the non-existence of non-quantum mechanical hidden-variables states. 2. Extremality of a probability measure is dependent on the measure space the measure is defined on (cf. appendix A.12). On a refinement of the class of subsets an extreme measure may turn out to be not extreme any longer. Introduction of hidden variables can precisely have this effect. For this reason in a hidden-variables theory states corresponding to quantum mechanical pure states need not necessarily be homogeneous.
Presumably because it endorsed the prevailing logical positivist philosophy, von Neumann’s “no go” theorem was readily accepted, and was hardly criticized for a very long time. With hindsight we can now easily see the weaknesses of von Neumann’s proof. They seem to be above all a consequence of the hybrid character of his hidden-variables theory, maintaining the quantum mechanical states, defined as density operators operating on a linear vector space, as the only possible states. Hidden-variables theories are not considered capable of yielding an independent characterization of a state. In his proof of the failure of von Neumann’s “no go” theorem Bell [412] (see also [98], p. 31) used essentially this argument. According to Bell, von Neumann’s assumption of the existence of a dispersionless quantum mechanical state for all components of the spin of a spin-1/2 particle would imply that the eigenvalues of the spin operators $\sigma_x$, $\sigma_y$ and $(\sigma_x + \sigma_y)/\sqrt{2}$ would have to satisfy the same linear relation
$\langle(\sigma_x + \sigma_y)/\sqrt{2}\rangle = (\langle\sigma_x\rangle + \langle\sigma_y\rangle)/\sqrt{2}$
as the quantum mechanical expectation values (since the expectation values in a dispersionless state precisely coincide with the eigenvalues). However, this is impossible because all eigenvalues of these three operators are equal to +1 or -1. Evidently, it is impossible to reconcile dispersionlessness with the properties of operators on a linear vector space. But such a reconciliation is an unnecessarily strong requirement: dispersionless states of a hidden-variables theory need not have any relation to linear vector spaces. By not recognizing this von Neumann imposed too strong requirements on the dispersionless states of a hidden-variables theory. For this reason von Neumann’s impossibility proof is untenable.
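Bell’s counterexample can be verified by brute force. The sketch below assumes that the three spin operators involved are the Pauli matrices for the x and y directions and their normalized sum (the spin component along the bisector of the x and y axes); this choice, like the function names, is an assumption made for illustration.

```python
import itertools
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SN = (SX + SY) / np.sqrt(2)  # spin component along the bisector of x and y


def eigenvalues(op):
    """Eigenvalues of a Hermitian operator, in ascending order."""
    return np.linalg.eigvalsh(op)


def expectation(psi, op):
    """Expectation value <psi|op|psi> for a normalized state vector psi."""
    return float(np.real(psi.conj() @ op @ psi))


def additive_assignments():
    """All assignments of values +1/-1 to (SX, SY, SN) obeying vn = (vx + vy)/sqrt(2)."""
    return [
        (vx, vy, vn)
        for vx, vy, vn in itertools.product((+1, -1), repeat=3)
        if np.isclose(vn, (vx + vy) / np.sqrt(2))
    ]


if __name__ == "__main__":
    for name, op in (("SX", SX), ("SY", SY), ("SN", SN)):
        print(name, "eigenvalues:", eigenvalues(op))
    print("value assignments satisfying the linear relation:", additive_assignments())
```

The expectation values satisfy the linear relation in every quantum state, whereas the list of admissible eigenvalue assignments comes out empty. This is the obstruction that von Neumann’s requirement 4 creates for dispersionless states.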
10.2.2 Jauch and Piron
On the basis of the definition of quantum mechanical states given in section 1.8.5, Jauch and Piron [413] (see also Jauch [29], chapter 7) have given a “no go” proof similar to von Neumann’s. They prove that such states cannot be dispersionless, i.e. a quantum mechanical state cannot assign to every proposition a probability of either 0 or 1. In view of Gleason’s theorem, establishing a one-to-one relation between quantum mechanical states and density operators, this result is understandable, but, once again, not conclusive: it is possible that a refinement of the lattice of propositions exists (i.e. there may exist an embedding of the lattice of quantum mechanical propositions into a larger lattice), such that dispersionless states may exist. Jauch and Piron prove that this larger lattice should be Boolean, i.e. each pair of propositions is compatible (cf. section 1.8.4). This implies that the question of the possibility of a realist underpinning of quantum mechanics can be formulated as the question of the possibility of an embedding of the (weakly modular) lattice of quantum mechanical propositions into a Boolean lattice. Such a possibility is not excluded by the results of Jauch and Piron. As was already noted by Bohm and Bub [414], the proof by Jauch and Piron is circular, due to the fact that they actually start from the general validity of quantum mechanics.
10.2.3 Kochen and Specker
Kochen and Specker [300] were the first to deal seriously with the problem of the (im)possibility of an embedding of the lattice of quantum mechanical propositions into a Boolean lattice. (Actually, they did not consider lattices but algebras. This is related to the fact that operators can not only be added, but can also be multiplied.) They start by demonstrating that such an embedding is possible in a trivial manner, viz. by taking as probability space the direct product of the spectra of all possible quantum mechanical observables A. The joint probability that each of the observables A has eigenvalue $a_A$ as its value can be defined as
$p(\{a_A\}) = \prod_A p_A(a_A)$, (10.2)
in which $p_A(a_A)$ is the usual quantum mechanical probability (1.5) or (1.31). The quantum mechanical probability distribution of some observable can be found by taking the relevant marginal of (10.2). In agreement with the results of Jauch and Piron (cf. section 10.2.2) the lattice of propositions (corresponding to the subsets of the direct product) is Boolean. The trivial hidden-variables model with probabilities (10.2) is not considered an acceptable model, however. The reason is that, as follows directly from (10.2), in the
hidden-variables model quantities corresponding to observables like A and $f(A)$ are considered statistically independent. However, as quantum mechanical observables they are strongly correlated since their eigenvectors are the same (and, hence, their joint probabilities do not factorize in general). Evidently, all relations between different physical quantities are left out of consideration in this trivial hidden-variables model. In particular von Neumann’s requirement 3 (cf. section 10.2.1) is in disagreement with this model. In order that correlations between (compatible) quantum mechanical observables be also reflected in the hidden-variables model, Kochen and Specker choose to maintain von Neumann’s requirement 3. This has an important consequence, viz. that the operator relation $P^2 = P$ implies the same relation for the values of the corresponding quantity. Hence, projection operators should be represented in the hidden-variables theory by quantities having only values 0 or 1. In order that an embedding of the weakly modular lattice of quantum mechanical propositions (represented by projection operators) into a Boolean lattice be possible, it should be possible to define a dispersionless state on the quantum mechanical lattice. Since Kochen and Specker do not take over von Neumann’s requirement 4, their result is considerably stronger than von Neumann’s. Actually, requirement 4, restricted to compatible observables A and B, is implicit in Kochen and Specker’s assumptions because this follows from requirement 3 and the fact that commuting operators can be represented as functions of one single operator C. This restriction to compatible observables turns out to be sufficient to demonstrate that for Hilbert spaces of dimension larger than 2 there does not exist a classical system of probability measures on the lattice of its subspaces. The proof is based on the fact that the quantum mechanical relation of compatibility is not transitive. An example (cf. figure 10.1) is provided by the projection operators $\{E_m\}$ and $\{F_n\}$ of the spectral representations of two standard observables having one eigenvector in common, for instance, $E_1 = F_1$. The lattice of subspaces generated by these projection operators is not Boolean. Kochen and Specker essentially demonstrate that on this lattice no dispersionless states are possible, and that, hence, an embedding into a Boolean lattice is impossible. A similar, but simpler, proof was given by Peres [301]. Apparently, the presuppositions of the proof given by Kochen and Specker are
very modest. They actually amount to the following two requirements:
1. Von Neumann’s requirement 3, stating that quantity $f(\mathcal{A})$ is represented by operator $f(A)$ if $\mathcal{A}$ is represented by A.
2. The requirement that the correspondence of operators A and quantities $\mathcal{A}$ is a unique one.
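The first requirement is precisely what the trivial product model (10.2) violates. The following sketch illustrates this for a hypothetical observable A with eigenvalues -1, 0 and +1 and the function f(a) = a²; the probabilities are made-up illustrative numbers, not taken from the text.

```python
from fractions import Fraction
from itertools import product

# Hypothetical quantum mechanical probabilities of the eigenvalues of A.
p_A = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}


def square(a):
    return a * a


def distribution_of_function(p, f):
    """Quantum mechanical distribution of f(A): relabel the outcomes of A by f."""
    out = {}
    for a, prob in p.items():
        out[f(a)] = out.get(f(a), Fraction(0)) + prob
    return out


def product_model(p, q):
    """Joint distribution of the trivial model: both quantities treated as independent."""
    return {(a, b): pa * qb for (a, pa), (b, qb) in product(p.items(), q.items())}


def prob_consistent(joint, f):
    """Probability that the value assigned to f(A) equals f of the value assigned to A."""
    return sum(prob for (a, b), prob in joint.items() if b == f(a))


p_A2 = distribution_of_function(p_A, square)
joint = product_model(p_A, p_A2)

if __name__ == "__main__":
    print("P(value of A^2 equals square of value of A):", prob_consistent(joint, square))
```

The product model reproduces both marginals, yet the values it assigns to A and to A² agree with the functional relation only part of the time. In quantum mechanics a measurement of A² by relabelling the pointer readings of A agrees with it always.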
These two requirements give rise to the following remarks:
Ad 1. This presupposition originates from the fact that a quantum mechanical measurement, described by a Hermitian operator A, is also a measurement of a quantum mechanical observable described by operator $f(A)$. In the quantum mechanical formalism expectation value $\langle f(A)\rangle$ can be calculated from the probability distribution of A. This is a mathematical operation, seemingly without any deep physical significance. Observable $f(A)$ could be measured using the measuring instrument of A, simply replacing pointer reading $a$ by $f(a)$ (this actually implies that a measurement of a quantum mechanical standard observable corresponds to a PVM rather than to a Hermitian operator). Von Neumann’s requirement 3 stems from the idea that the measurement results of A are precisely the possible values of the classical quantity $\mathcal{A}$ possessed by the object prior to measurement, and registered according to the principle of ‘faithful measurement’ (cf. section 2.4.3). The Kochen-Specker theorem actually proves the impossibility of the ‘possessed values’ principle (section 2.3), thus excluding the degenerate kind of hybrid hidden-variables theories, referred to in section 10.1, as possible underpinnings of quantum mechanics. However, such hidden-variables theories may be far too simplistic because they hinge on an objectivistic-realist interpretation of quantum mechanics, ignoring the possibility of contextualistic-realist and empiricist interpretations. In particular, they do not take into account that a quantum mechanical measurement is a dynamic process, translating information about the microscopic object into a macroscopically observable pointer position of a measuring instrument. As noted in section 2.2, introduction of hidden variables precisely yields the possibility of distinguishing properties of the object from properties of the measuring instrument. Then the hidden variable can represent (sub-)microscopic information, whereas the (eigen)value $a$ of the observable may correspond to a macroscopic property of a measuring instrument.
In quantum mechanics the (eigen)value $a$ of the observable is attributed at a definite instant $t$. Thus, quantum mechanical probabilities are calculated, for instance, from (1.5), with the state vector taken at time $t$. In the hidden-variables
theories considered up to now time dependence was seldom taken into account. As a consequence, quantum mechanical measurement result $a$ and the related value of the hidden variable $\lambda$ are thought to refer to the same time $t$. Taking into account that the measurement process is a physical process, needing a certain amount of time to produce a measurement result, the question arises of what is the precise instant $t$, and what is the value of the hidden variable at that time, determining measurement result $a$. This question would not have a unique answer if $\lambda$ varied much faster than the macroscopic pointer position. As an example we might consider the underpinning of thermodynamic quantities by classical statistical mechanics. Here the phase space point (to be compared with $\lambda$) is rapidly changing with time, whereas thermodynamic quantities like temperature and pressure are slowly varying. It is not meaningful to consider these latter quantities as functions of an instantaneous value of the “hidden variable” (see also section 10.6). The macroscopic quantities are determined by time averaging of a stochastic variable over time intervals much longer than the characteristic fluctuation time of the microscopic fluctuations, but short compared to the time in which the macroscopic quantities can change their values in an observable way. In the formalism of classical statistical mechanics fluctuations of the microscopic quantities are apparent at the macroscopic level because the standard deviations of these quantities are different from zero. Thus, for total energy E we have
$\langle E^2\rangle \neq \langle E\rangle^2$. (10.3)
If a realist underpinning of quantum mechanics had a comparable stochastic character, then the individual quantum mechanical measurement result $a$ would correspond to a time averaged version of the corresponding physical quantity $\mathcal{A}$, time averaging being restricted to the duration $T$ of the individual measurement process (more precisely, to that part of the measurement process during which the information transfer from object to measuring instrument takes place), thus
$a = \frac{1}{T}\int_0^T dt\, \mathcal{A}(\lambda(t))$. (10.4)
The value of the measurement result will be determined by the trajectory $\lambda(t)$ during the relevant time interval. In a stochastic hidden-variables theory of the present kind, analogously to (10.3) we will in general have
$\overline{f(\mathcal{A})} \neq f(\overline{\mathcal{A}})$,
the overbar denoting the time average of (10.4), because (10.4) entails that the time average of a function of $\mathcal{A}$ in general differs from that function of the time average.
This means that von Neumann’s requirement 3 need not be satisfied in the more general hidden-variables theories considered here, thus removing one of the pillars of Kochen and Specker’s “no go” theorem. Note that for reaching this conclusion an empiricist interpretation of quantum mechanical observables has been helpful by explicitly distinguishing between, on the one hand, the values of observables as representations of pointer positions of measuring instruments, and, on the other hand, the physical quantities of the object itself (the hidden variable). It is precisely this distinction that enables one to escape from von Neumann’s requirement 3, which -although not in a priori disagreement with an empiricist interpretation- was yet inspired by the identification of observable A and physical quantity $\mathcal{A}$, i.e. by a realist interpretation of the quantum mechanical observable. As noted in section 2.4, in an empiricist interpretation the precise values of an observable are irrelevant, as well as all results depending on these values. Von Neumann’s requirement 3 does refer to values of observables. This is yet another reason to take the Kochen-Specker result not too seriously.
Ad 2. A second way to obstruct the Kochen-Specker proof is based on the possibility that the correspondence of observable A and physical quantity $\mathcal{A}$ is not a unique one. The argument for non-uniqueness reaches back to Bohr’s correspondence principle (cf. section 4.3) stating that quantum mechanical quantities are well-defined only within the context of a specified measurement arrangement. As seen in section 4.3.3, a contextualistic-realist interpretation of quantum mechanics, in which quantum mechanical reality is (co-)determined by the measurement arrangement, is a possible view implementing this. It has been pointed out by van Fraassen [415] that this prevents the proof by Kochen and Specker from being executed.
In this proof two incompatible observables play a role, corresponding to mutually exclusive measurement arrangements, represented by PVMs $\{E_m\}$ and $\{F_n\}$ (cf. figure 10.1). Since $E_1 = F_1$, a jointly measured observable is defined notwithstanding the incompatible measurement contexts. Van Fraassen has pointed out that equality of $E_1$ and $F_1$ need not signify that the physical quantities corresponding to these observables do coincide (this idea is at the basis of van Fraassen’s modal interpretation of quantum mechanics, cf. section 6.6). By a splitting of the mapping onto physical quantities an observable like $E_1 = F_1$ would be mapped onto two distinct physical quantities, describing (a part of) reality in the context of either the $\{E_m\}$ or the $\{F_n\}$ measurement. It is essential that the two quantities are not objective ones, i.e. defined independently of the measurement arrangement, but that they are well-defined only within their respective measurement contexts, thus allowing an escape from the consequences of the ‘possessed values’ principle discussed in section 6.4.2.
It is interesting that the analogy with the underpinning of thermodynamics by classical statistical mechanics, referred to above, provides examples also for this contextualistic idea. Thus, the temperature of a volume of gas is well-defined only if the gas is in thermal equilibrium with its environment, the latter containing the thermostat enclosing the gas. If, on the contrary, the volume of gas is isolated, then not temperature but energy is well-defined. To a certain extent identification of the physical quantities corresponding to the coinciding projection operators of the two incompatible measurement contexts is comparable to an assumption that quantities like temperature and energy are well-defined (and equal) in both measurement contexts. The thermodynamic analogy will be further developed in section 10.6.4. Summarizing, we must conclude that the Kochen-Specker “no go” theorem is not convincing because it excludes only a very restricted type of hidden-variables theories, viz. those hybrid theories that are not full-blown hidden-variables theories, but actually are equivalent to an objectivistic-realist interpretation of quantum mechanics. Like von Neumann’s proof it still relies too heavily on assumptions that are specific to the Hilbert space structure of quantum mechanics. The possibility, provided by contextualistic-realist and empiricist interpretations of quantum mechanics, of making a distinction between quantum mechanical observables and physical quantities of a microscopic object suggests that more general hidden-variables theories may be feasible, not impugned by the Kochen-Specker theorem. Such theories will be considered in the following sections.
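The time-averaging argument of remark Ad 1 can be illustrated with a toy trajectory. The sketch below assumes a hypothetical hidden variable oscillating as cos(2πt), a physical quantity equal to the hidden variable itself, and the function f(x) = x²; none of these choices stems from the text, they merely make the averages easy to check.

```python
import numpy as np


def fast_trajectory(n_periods=100, samples_per_period=1000):
    """Hypothetical rapidly oscillating hidden variable, sampled over whole periods."""
    t = np.arange(n_periods * samples_per_period) / samples_per_period
    return np.cos(2 * np.pi * t)


def time_average(values):
    """Time average over the measurement interval (uniform sampling assumed)."""
    return float(np.mean(values))


lam = fast_trajectory()
a = time_average(lam)                # time-averaged quantity, playing the role of the pointer reading
a_squared = time_average(lam ** 2)   # time average of the squared quantity

if __name__ == "__main__":
    print("f(time average)    =", a ** 2)
    print("time average of f  =", a_squared)
```

In this toy model the square of the averaged quantity and the average of the squared quantity differ by about one half. A functional relation between instantaneous values therefore need not carry over to the time-averaged values that are compared with measurement results.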
10.3 Bohm’s hidden-variables theories
10.3.1 Introduction
David Bohm should be honored for being the first to free himself from the oppression of von Neumann’s “impossibility proof”. In view of von Neumann’s great authority, and the strong position held by logical positivism/empiricism in the physical sciences of a large part of the twentieth century, this was not a small achievement. De Broglie, who had developed similar ideas before [416], had not been able to withstand Pauli’s criticism [417], and gave up his attempts to develop an underpinning of quantum mechanics, to resume his original program only in the sixties, after -and influenced by- Bohm’s work. Even Einstein -one of the most influential physicists of his time- did not meet with much appreciation in his attempts at proving quantum mechanics to be an incomplete theory, and had to cope frequently with criticisms for dealing with metaphysical issues which allegedly have no relevance to the empirical content of this theory. For this reason Bohm’s achievement [418] is an important, if not a revolutionary, one.
In particular, John Bell was strongly influenced by the perspectives Bohm’s theory seemed to offer (cf. section 10.5). It was Bell who was able to lift Bohm’s ideas from the sphere of metaphysics to the domain of experimental physics, thus making hidden variables respectable even to the logical positivist/empiricist. That this seems to have been possible only on the basis of Bohm’s “metaphysical” effort, could be seen as an indication that it is sometimes justified to tamper a bit with metaphysics, and to postpone issues of empirical verification. Indeed, for achieving scientific progress it may sometimes be advantageous to temporarily abandon a strictly empiricist methodology for a more rationalist one, in which the free creativity of the human mind plays a more important role than observation or experiment. Unfortunately, there is also a reverse side to Bohm’s theory. It was heavily opposed almost immediately (Takabayasi [419]), and not, as could be expected, from empiricist quarters (even though the “Copenhagen” physicist Rosenfeld denigrated Bohm’s work as “a short-lived decay-product of the mechanistic philosophy of the nineteenth century” (Jammer [216], p.295)), but, for instance, also by Einstein [420], who cannot be denied a certain sympathy for the ideas driving Bohm. A number of critical questions with respect to Bohm’s theory will be discussed in section 10.3.4. On the basis of these considerations we shall have to conclude that if Bohm’s theory is a hidden-variables theory at all, it has a number of properties that make it virtually unacceptable as a realist underpinning of quantum mechanics. As a matter of fact, Bohm’s theory has the same hybrid character as von Neumann’s: the hidden variables are added to the quantum mechanical formalism in a more or less ad hoc way. This seems to reduce Bohm’s theory to quantum mechanics in disguise, the hidden variables constituting empirically irrelevant additions. 
Bohm’s attempt at devising a realist underpinning of quantum mechanics -even though worthy of praise- might be qualified as a (temporary) failure, at least needing a thorough revision (see also the objections to Bohm’s theory discussed in section 10.3.4). One aspect of Bohm’s theory deserves special attention, viz. its nonlocal character. This has had a large influence on later discussions about the foundations of quantum mechanics, by inducing a belief that nature is nonlocal at the quantum level, and that a fundamentally correct physical theory must be nonlocal (e.g. Dürr [421]). In section 10.3.4 we are going to see, however, that this belief stems from too easily accepting Bohm’s theory as a physically relevant model of subquantum mechanical reality. The aspect of nonlocality was for Bell an important point of departure in his approach to hidden-variables theories, and his derivation of the so-called Bell inequality from such theories (cf. section 10.5). We already saw in chapter 9 that no direct connection can be demonstrated between this inequality and an alleged nonlocality of the quantum mechanical formalism. In section 10.5 we are going to have the same experience with respect to hidden-variables theories. But for the Bohm theory, the idea of a nonlocal quantum reality -inducing Bell to restrict his attempt at eliminating hidden-variables theories to local ones- would
possibly not even have arisen. Thanks to the theories of Bohm and Bell, nonlocality of the quantum world has virtually become the paradigm of research on the foundations of quantum mechanics, even though at this moment no empirical evidence exists, or -even according to supporters of this paradigm- is to be expected (in agreement with the quantum mechanical principle of local commutativity, cf. section 1.3). This, indeed, stamps ‘nonlocality’ as a metaphysical issue. Although Rosenfeld was mistaken with respect to the short-livedness of Bohm’s theory and its influence, he seems to be right in a methodological sense: when theory withdraws too far from empiricism, it is very difficult to avoid all metaphysical pitfalls. In this respect Bohm was not completely successful. However, the only way to progress is exploration of unknown territory that has not (yet) been charted. Some metaphysical risk belongs to such an endeavor. Even if a first expedition is not completely successful, it can contribute to the success of a next one. In this sense Bohm’s achievement remains a valuable one. In section 10.6 we shall discuss a number of ideas possibly leading to a hidden-variables theory having a more physically realistic character, and not plagued by metaphysical nonlocality. A number of these ideas can be found already in Bohm’s work.
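10.3.2 Bohm’s causal theory
A detailed account of Bohm’s causal theory was recently given by Holland [422]. The point of departure of Bohm’s theory [418] is the possibility of writing the Schrödinger equation,
$$i\hbar\,\frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi,$$
in a way suggesting a classical interpretation. Putting
$$\psi = R\,e^{iS/\hbar}, \tag{10.5}$$
with $R$ and $S$ real, the following equations can be derived for $R$ and $S$:
$$\frac{\partial P}{\partial t} + \nabla\cdot\left(P\,\frac{\nabla S}{m}\right) = 0, \tag{10.6}$$
$$\frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + V + Q = 0, \tag{10.7}$$
in which $P = R^2$ and the quantity $Q$ is given by
$$Q = -\frac{\hbar^2}{2m}\,\frac{\nabla^2 R}{R}. \tag{10.8}$$
The quantity $Q$ is called the quantum potential.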
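The step from the Schrödinger equation to the continuity and Hamilton-Jacobi equations can be sketched as follows. This is a standard calculation, not reproduced from the source; it assumes a single particle of mass $m$ in a potential $V$ and writes the wave function as $\psi = R\,e^{iS/\hbar}$ with real $R$ and $S$, and $P = R^2$.

```latex
% Derivatives of psi = R exp(iS/hbar):
\begin{align*}
i\hbar\,\partial_t\psi &= \left(i\hbar\,\partial_t R - R\,\partial_t S\right)e^{iS/\hbar},\\
-\frac{\hbar^2}{2m}\nabla^2\psi &= \left(-\frac{\hbar^2}{2m}\nabla^2 R
   - \frac{i\hbar}{m}\,\nabla R\cdot\nabla S
   - \frac{i\hbar}{2m}\,R\,\nabla^2 S
   + \frac{1}{2m}\,R\,(\nabla S)^2\right)e^{iS/\hbar}.
\end{align*}
% Real part of the Schroedinger equation, divided by -R:
\begin{equation*}
\partial_t S + \frac{(\nabla S)^2}{2m} + V - \frac{\hbar^2}{2m}\frac{\nabla^2 R}{R} = 0 .
\end{equation*}
% Imaginary part, multiplied by 2R/hbar:
\begin{equation*}
\partial_t R^2 + \frac{1}{m}\left(2R\,\nabla R\cdot\nabla S + R^2\,\nabla^2 S\right)
 = \partial_t P + \nabla\cdot\left(P\,\frac{\nabla S}{m}\right) = 0 .
\end{equation*}
```

The real part is the Hamilton-Jacobi-like equation with the extra term identified as the quantum potential; the imaginary part is the continuity equation for $P$.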
Equation (10.6) is a continuity equation for transport of the quantity $P$ if $\nabla S/m$ can be interpreted as the local transport velocity of that quantity. In the quantum mechanical formalism this relation is nothing but the law of conservation of probability. It can be expressed in terms of the probability flux
$$\mathbf{J} = P\,\frac{\nabla S}{m}, \tag{10.9}$$
$P$ being the probability density. Equation (10.7) has the form of a classical Hamilton-Jacobi equation (e.g. Goldstein [423]) of a particle moving in potential $V + Q$. This suggests a classical interpretation of this equation, in which the only difference between classical and quantum theory is the existence of an extra quantum potential (10.8) next to the classical one, representing an extra force associated with the wave function so as to yield deviations from purely classical behavior. Since the Hamilton-Jacobi equation is equivalent to Newton’s equation,
$$m\,\frac{d^2\mathbf{x}}{dt^2} = -\nabla\,(V + Q),$$
it is possible, as in classical theory, to calculate particle trajectories. Analogously to the classical equality
$$\mathbf{p} = \nabla S, \tag{10.10}$$
a value is attributed to particle momentum also in Bohm’s theory. The possibility provided by Bohm’s theory to calculate particle trajectories is often interpreted as restoring causality and determinism within the domain of quantum mechanics. Existence of particle trajectories, as well as the simultaneous attribution of position and momentum realized by (10.10), are in clear contradiction to Bohr’s complementarity principle (cf. section 4.6). Bohm’s idea is closely related to de Broglie’s earlier idea of the wave function as a kind of “pilot wave” [424]. In Holland’s book [422] many quantum mechanical examples are discussed, including the double-slit experiment, for which trajectories can be calculated. This will not concern us here any further, however, because the physical relevance of these particle trajectories is rather doubtful (cf. section 10.3.4). The quantum potential has two properties distinguishing it from classical potentials. First, $Q$ depends on the modulus $R$ of the wave function in such a way that its effect can be appreciable even for very small $R$ (the quantum potential (10.8) is unchanged if $R$ is multiplied by a constant). In Bohm’s program this property has led to changing the interpretation of $\psi$. Whereas originally the so-called “hydrodynamic” model was assumed, in which the $\psi$ field was interpreted as a kind of liquid flow carrying along the particle in a mechanical way, in later work [425, 426] the quantum potential is referred to as a “quantum information potential” which
must be seen as an “information content”, to be compared with “radar waves guiding a ship” (see also Holland [422], section 3.4.4). Hence, the quantum potential seems to be comparable to the “radar” signal a ship is steered with, independently of whether the signal is strong or weak (it should be strong enough to be detected by the ship’s radar equipment, but it need not be able to change the ship’s momentum). In the second place, the quantum potential turns out to have a nonlocal character. This becomes evident if the procedure of equations (10.5) through (10.7) is applied to a system of two particles. In this case a Hamilton-Jacobi equation is obtained for the quantities $S(\mathbf{x}_1,\mathbf{x}_2,t)$ and $R(\mathbf{x}_1,\mathbf{x}_2,t)$. The quantum potential is
$$Q(\mathbf{x}_1,\mathbf{x}_2,t) = -\frac{\hbar^2}{2m_1}\,\frac{\nabla_1^2 R}{R} - \frac{\hbar^2}{2m_2}\,\frac{\nabla_2^2 R}{R}. \tag{10.11}$$
If $\psi(\mathbf{x}_1,\mathbf{x}_2,t) = \psi_1(\mathbf{x}_1,t)\,\psi_2(\mathbf{x}_2,t)$, then it follows from (10.11) that $Q(\mathbf{x}_1,\mathbf{x}_2,t) = Q_1(\mathbf{x}_1,t) + Q_2(\mathbf{x}_2,t)$. Hence, in this case the quantum potential does not induce any interaction between the particles. However, in case of an entangled state the quantum potential does induce such an interaction. Moreover, this interaction is such that particle 1 instantaneously “feels” a change of the wave function of particle 2, no matter how far the particles are apart. Analogously to the first point, the interaction described by the quantum potential need not become small if the distance between the particles is increased. The particles keep “feeling” each other by means of the quantum potential, even if the classical interaction potential tends to zero in the limit of large mutual distance. This feature of Bohm’s theory may have contributed to the idea that entanglement is related to nonlocality or nonseparability (cf. section 6.3.2).
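The contrast between product states and entangled states can be illustrated numerically. The following is a minimal sketch, not taken from the source: it assumes two particles in one dimension each, equal masses, natural units, and two hypothetical Gaussian amplitudes. A quantum potential of the form $Q_1(x_1) + Q_2(x_2)$ has a vanishing mixed difference over any rectangle of points; a correlated amplitude does not.

```python
import math

HBAR = 1.0  # natural units (assumption)
MASS = 1.0  # both particles have the same mass (assumption)

def quantum_potential(R, x1, x2, h=1e-3):
    """Q = -(hbar^2/2m) (d^2R/dx1^2 + d^2R/dx2^2) / R, by central differences."""
    r0 = R(x1, x2)
    d2_1 = (R(x1 + h, x2) - 2.0 * r0 + R(x1 - h, x2)) / h**2
    d2_2 = (R(x1, x2 + h) - 2.0 * r0 + R(x1, x2 - h)) / h**2
    return -(HBAR**2 / (2.0 * MASS)) * (d2_1 + d2_2) / r0

def product_amplitude(x1, x2):
    """Hypothetical uncorrelated amplitude R1(x1) * R2(x2)."""
    return math.exp(-x1**2) * math.exp(-0.5 * x2**2)

def entangled_amplitude(x1, x2, c=0.4):
    """Hypothetical correlated Gaussian amplitude (cross term 2*c*x1*x2)."""
    return math.exp(-(x1**2 + 2.0 * c * x1 * x2 + 0.5 * x2**2))

def mixed_difference(R, xs=(0.3, 1.1), ys=(-0.7, 0.5)):
    """Q(x,y) + Q(x',y') - Q(x,y') - Q(x',y); vanishes if Q = Q1(x) + Q2(y)."""
    (x, xp), (y, yp) = xs, ys
    return (quantum_potential(R, x, y) + quantum_potential(R, xp, yp)
            - quantum_potential(R, x, yp) - quantum_potential(R, xp, y))

additive = mixed_difference(product_amplitude)
coupled = mixed_difference(entangled_amplitude)
```

For the product amplitude the mixed difference vanishes up to rounding; for the correlated amplitude it does not, which is the numerical counterpart of the statement that the quantum potential couples the particles only in an entangled state.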
10.3.3 Bohm’s stochastic theory
Bohm was aware of the fact that his theory is empirically equivalent to quantum mechanics, and that the particle trajectories do not seem to give rise to new experimentally verifiable phenomena not described by quantum mechanics. Holland ([422], p. 377), too, concludes that it is impossible to demonstrate the validity of the theory using experimental techniques that are compatible with the usual (standard) formalism of quantum mechanics. However, Bohm [418] also indicated how it might be possible to arrive at empirically testable results demonstrating the necessity of his theory. For this to be the case it would be necessary to carry out experiments outside the domain of validity of quantum mechanics, but inside that of the hidden-variables theory. By way of an example Bohm mentioned processes operating over distances of the order of $10^{-13}$ cm or less, which he assumed to be outside the domain of quantum mechanics (today this distance would have to be chosen much smaller).
Since there should be a difference with quantum theory, Bohm’s causal theory has to be modified. According to Bohm this can be achieved by assuming that relation (10.10) is valid only for processes operating over distances larger than $10^{-13}$ cm. At smaller distances an extra chaotic force might exist, causing momentum to deviate from the value $\nabla S$. This extra force might be a function of the difference $\mathbf{p} - \nabla S$. According to Bohm this would mean that in general it is not the Schrödinger equation that is satisfied but, instead, an equation of the form
$$i\hbar\,\frac{\partial\psi}{\partial t} = H\psi + \xi(\mathbf{p} - \nabla S), \tag{10.12}$$
in which $\xi$ is a function that has yet to be specified. By assuming that $\mathbf{p} \neq \nabla S$ can hold only by way of some kind of fluctuation, and that the value $\mathbf{p} = \nabla S$ is reached within a (relaxation) time of the order $\tau = 10^{-13}/c$ s ($c$ being the velocity of light in cm/s), it is plausible that atomic processes are completely inside the domain of application of the Schrödinger equation (with $\xi = 0$). However, measurements that are sensitive over distances smaller than $10^{-13}$ cm and times shorter than $\tau$ could be expected to yield deviations from quantum mechanics, and could therefore be empirical tests of the hidden-variables theory. In particular, for a short time the position probability distribution $P(\mathbf{x})$ could deviate from the quantum mechanical distribution $|\psi(\mathbf{x})|^2$, but under the influence of the chaotic force it would quickly relax to the quantum mechanical distribution. It seems that in the stochastic theory quantum mechanics is interpreted more or less analogously to thermodynamics, that is, as a theory of equilibrium processes, being applicable only if a certain (thermal) equilibrium has been established. Bohm, indeed, compared deviation from equilibrium, manifesting itself by the difference $P \neq |\psi|^2$, to Brownian motion, and the tendency towards $P = |\psi|^2$ to relaxation toward thermal equilibrium under the influence of molecular chaos, as described by Boltzmann’s H theorem. Bohm [406] also discussed the conditions the chaotic forces must satisfy in order that $P = |\psi|^2$ be the final result of the relaxation process (see also Bohm and Vigier [407]). Bohm’s stochastic theory has not gained the same popularity as his causal one. Thus, in Holland’s book [422] it is not even discussed. Indeed, the appeal of Bohm’s theory stems mainly from its allowance of deterministic trajectories, which become problematic again in a theory like that of Brownian motion. As will be seen in section 10.6, an extension in a stochastic sense seems nevertheless to be necessary for a hidden-variables theory to be able to reproduce the quantum mechanical results.
10.3.4 Objections to Bohm’s theory
Metaphysical character of Bohm’s theory
The initial reception of Bohm’s theory (1952) was not very favorable. At that time the influence of logical positivism/empiricism was still very strong in physics, and the equivalence of equations (10.6) and (10.7) to the Schrödinger equation, combined with unobservability of the hidden variable defined by equation (10.10), could easily result in a verdict on Bohm’s theory of being a metaphysical embellishment of quantum mechanics, inducing Bohm’s theory to be referred to as a “causal interpretation of quantum mechanics”. Nowadays this verdict would seem not to apply any longer, because from the (assumed) existence of particle trajectories certain quantities (like arrival times, Muga and Leavens [427]) can be derived that do not follow from quantum mechanics. So, by itself Bohm’s theory would now seem to be respectable even from a logical positivist/empiricist point of view. However, empirical testability is not a sufficient reason to accept a physical theory. The theory should also have a certain physical plausibility. Although the existence of particle trajectories might be considered as a feature adding to the plausibility of Bohm’s theory, the opposite would be the case if these trajectories had implausible properties. Calculations of particle trajectories have been performed for quite a few physical situations (e.g. Holland [422]), yielding more or less plausible results. However, as will be seen in the following, a number of quite implausible trajectories have also been obtained (e.g. Spiller et al. [428], see also Einstein’s objection below). Hence, even if the possibility of particle trajectories were admitted, it is questionable whether these coincide with the ones given by Bohm’s causal theory. There are other objections, to be discussed below (see also de Muynck [429]), putting Bohm’s causal theory into doubt.
The fundamental doubt about the validity of Bohm’s theory may be a reason why experimental physicists who are not put off by a dislike of metaphysics, have not made much effort to devise experiments that might be able to test whether it, indeed, tells us more about reality than does quantum mechanics.
Hybrid character of Bohm’s theory
In Bohm’s theory, as in von Neumann’s, quantum mechanics is an integral part of the hidden-variables theory, the hidden variables nevertheless being supposed to satisfy classical evolution equations (be it with a potential energy that partially has a quantum mechanical origin). The solutions of these latter equations are obtained only after solving the Schrödinger equation for determining the quantum potential.
In the analogy with thermodynamics, referred to in section 10.3.3, this would mean that we would not reduce the macroscopic theory (i.e. thermodynamics) to a microscopic one (for instance, classical mechanics of a system consisting of many microscopic particles), but that thermodynamics would be part of the microscopic theory. Then the chaotic motion of a particle in a volume of gas would not exclusively be a consequence of the interaction with all other particles of the gas, but would be influenced also by a “thermodynamic” potential, comparable to the quantum potential. In fundamental treatments of the statistical underpinning of thermodynamics this is not a common way of looking at things, however. The intention is to understand all thermodynamic quantities in microscopic terms. By itself a hybrid theory need not be unphysical. Thermodynamic potentials are not unusual in phenomenological discussions of thermodynamics. We do something analogous when we describe classically the atomic nuclei in a crystal, while treating the particles quantum mechanically. However, these should be seen as pragmatic moves. We will not easily be tempted to draw from the properties of the classical motion of the nuclei fundamental conclusions with respect to the particles. As was seen in section 10.2.1, a hybrid theory can easily cause derailments in this respect: properties, valid only for the quantities of quantum mechanics, tend to be attributed to the hidden variables as well. We must be prepared for anomalies entering the theory in this way. In fact, the nonlocality of Bohm’s theory -to be discussed next- is a consequence of its hybrid character, causing any nonlocality of quantum mechanics to have its counterpart in the Bohmian representation of this theory. 
Such an adoption of properties of a phenomenological theory by the more fundamental theory would be similar to an underpinning of the rigid-body model of a billiard ball by a theory of nonlocal interactions between distant atoms rather than by the narrow-ranged local interatomic potentials restricting relative motion of nearest neighbors. Like von Neumann, also Bohm did not succeed in avoiding this pitfall. The fact that in Bohm’s stochastic theory there still is a Schrödinger equation (10.12) (be it a modified one), may serve as an illustration of this. It seems to be a good methodological principle to require that the wave function not be a part of a hidden-variables underpinning of quantum mechanics, but that the concepts of the explaining theory are as different from those of quantum mechanics as atoms are different from billiard balls.
Nonlocality
The nonlocality of the quantum potential of Bohm’s theory is often interpreted as evidence of “nonlocal interactions” between distant particles. However, such “nonlocal interactions” could easily be artefacts of the interpretation, suggested by the specific form (10.6)-(10.8) of the quantum mechanical equations. Notwithstanding the different representations, the mathematical formalisms of quantum mechanics
and of Bohm’s theory are largely equivalent, in different ways describing essentially the same physical properties. The nonlocality of the quantum potential reflects correlations between particles 1 and 2, described by the quantum mechanical wave function $\psi(\mathbf{x}_1,\mathbf{x}_2)$. This is evident, because no interaction, neither local nor nonlocal, is mediated by the quantum potential if the particles are uncorrelated (i.e. if $\psi(\mathbf{x}_1,\mathbf{x}_2) = \psi_1(\mathbf{x}_1)\,\psi_2(\mathbf{x}_2)$). In the quantum mechanical formalism the only nonlocality about quantum mechanical correlations is that, if the particles are far apart, they should be investigated by measuring bi-local correlation observables like those measured in the EPR-Bell experiments discussed in chapter 9. Unfortunately, in the context of the latter measurements such correlations are often referred to as “nonlocal correlations” (compare sections 9.1.3 and 9.3.3), thus affirming the suggestion of nonlocality going with Bohm’s theory. However, as was seen in chapter 9, we do not need any nonlocal interactions to understand these correlations. Deviation of quantum mechanical correlations from classicality can be explained by the local influences of the measurement arrangements; no “nonlocal interaction” is necessary. The main reason to doubt violation of locality as suggested by Bohm’s theory is the same as advanced with respect to violation of the Bell inequality, viz. that we do not have any operational evidence of it (for instance, in the sense that the probability distribution of a local quantum mechanical measurement would be influenced by another measurement carried out in a causally disjoint region of space-time). Due to the equivalence of the Bohm theory with the quantum mechanical formalism it is possible that the nonlocality of Bohm’s quantum potential is just an alternative description of correlations that can be explained by local interactions having occurred in the past when the particles were close together.
The “nonlocal interactions” of Bohm’s theory are comparable to “nonlocal interactions” which in rigid body theory could be imagined to explain the constancy of the mutual distances of particles in a rigid body. Insertion of the quantum mechanical wave function into the hidden-variables theory endows the states of the latter theory with a certain “rigidity”, in a similar way eliciting explanation by means of “nonlocal interactions”.
Need for measurement disturbance
As already mentioned in section 10.3.1, Einstein [420] was one of the early critics of Bohm’s theory. Actually, his objection had been advanced earlier by Pauli [417] against de Broglie’s work [416]. Einstein’s objection consists of the observation that, according to Bohm’s theory, the momentum of a particle in an eigenstate of an infinite well potential vanishes:
$$\mathbf{p} = \nabla S = 0$$
(as a matter of fact, this holds for any real wave function). Hence, according to Bohm’s theory such a particle does not move at all. This result is in disagreement with a picture in which the particle moves to and fro between the walls of the well, momentum changing its sign on each collision with one of the infinite potential barriers. On the other hand, this latter picture is in agreement with the probability distribution of the momentum observable prescribed by the quantum mechanical measurement postulates, consisting of two narrow peaks in the neighborhoods of two momentum values of equal magnitude and opposite sign. By the same token, if the electron of a hydrogen atom is in a state described by a real wave function, then a quantum mechanical measurement of a component of angular momentum in general yields a nonzero result. This makes the result of Bohm’s theory, viz. that the electron would stand still, utterly implausible. Bohm [418] did not consider Einstein’s objection detrimental to his theory. That we find on measurement a value different from the “real” value $\nabla S$ is considered by him as evidence of the disturbing influence of a quantum mechanical measurement. Measurement disturbance causes the measured value of momentum, corresponding to an eigenvalue of a quantum mechanical observable, to deviate from the “real” value $\nabla S$. Bohm proposed to carry out in Einstein’s example such an (allegedly disturbing) momentum measurement by removing the walls of the potential well without changing the wave function. This would result in the formation of two wave packets moving away from each other at equal and opposite (average) velocities. A time-of-flight measurement between two points in the path of the wave packet moving to the right would then yield a measurement result of quantum mechanical momentum. This is consistent with Bohm’s idea that all measurements are actually position measurements (according to quantum mechanics maximally disturbing momentum).
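Einstein’s example can be checked numerically. The sketch below is not from the source; it assumes natural units ($\hbar = m = 1$), a well occupying $0 \le x \le 1$, and the hypothetical helper names shown. It evaluates the Bohmian momentum $\hbar\,\mathrm{Im}(\psi'/\psi)$ for a real eigenfunction and locates the dominant peak of the quantum mechanical momentum distribution $|\phi(p)|^2$ by a trapezoidal Fourier integral.

```python
import cmath
import math

HBAR = 1.0   # natural units (assumption)
WIDTH = 1.0  # the well occupies 0 <= x <= WIDTH (assumption)

def psi(n, x):
    """Real energy eigenfunction number n of the infinite well."""
    return math.sqrt(2.0 / WIDTH) * math.sin(n * math.pi * x / WIDTH)

def bohm_momentum(n, x, h=1e-5):
    """p = dS/dx = hbar * Im(psi'/psi); x must not be a node of psi."""
    value = complex(psi(n, x))
    slope = complex(psi(n, x + h) - psi(n, x - h)) / (2.0 * h)
    return HBAR * (slope / value).imag

def momentum_density(n, p, steps=400):
    """|phi(p)|^2 from a trapezoidal Fourier integral over the well."""
    dx = WIDTH / steps
    total = 0j
    for k in range(steps + 1):
        x = k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * psi(n, x) * cmath.exp(-1j * p * x / HBAR)
    amplitude = total * dx / math.sqrt(2.0 * math.pi * HBAR)
    return abs(amplitude) ** 2

def peak_momentum(n, p_max=30.0, dp=0.05):
    """Grid location of the largest value of |phi(p)|^2 on 0 <= p <= p_max."""
    grid = [i * dp for i in range(int(p_max / dp) + 1)]
    return max(grid, key=lambda p: momentum_density(n, p))

real_value = bohm_momentum(5, 0.37)  # Bohmian "real" momentum
peak = peak_momentum(5)              # dominant measured momentum (positive branch)
```

For the fifth eigenstate the Bohmian momentum is zero, whereas the measured distribution peaks close to $\pm 5\pi\hbar/\text{WIDTH}$, with equal weight at both signs because the wave function is real.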
According to Bohm quantum mechanical observables have an ambiguous meaning: they belong as much to the measuring instrument as to the observed system itself. The quantum mechanical momentum observable does not have a simple relation to the “real” value of momentum. In Einstein’s example the measured value of momentum corresponds to the “real” value as it is in the context of the measurement, rather than with momentum as it was preceding the measurement. The impossibility of determining this latter quantity by means of a quantum mechanical momentum measurement makes it truly hidden indeed. It has to be changed by the measurement in order to yield the quantum mechanical measurement result. It, therefore, is a misunderstanding that Bohm’s theory -in contrast to the Copenhagen one- would be a “theory without observers” (e.g. Dürr [421]). The measurement disturbance Bohm needs to defend his theory against Einstein’s objection is different from Heisenberg’s measurement disturbance (cf. section 4.6), in the latter view a momentum measurement not disturbing momentum itself. In contrast to Heisenberg’s theory, Bohm’s theory is manifestly inconsistent
with the principle of ‘faithful measurement’ (cf. section 2.4.3). Since Einstein’s ideas strongly hinge on this principle, his rejection of Bohm’s theory is understandable. For Bohm this must have been very disappointing, because he explicitly took Einstein’s criticism of ‘completeness of quantum mechanics’ as a point of departure, which criticism had been refuted by Bohr precisely by referring to the disturbing influence of measurement (cf. section 5.3). Evidently, Bohm, too, needs such a disturbance for making his theory consistent with (quantum mechanical) observation, thus failing to realize Einstein’s ideal of an observer-independent, objective description of the object (cf. section 4.7) by a hidden-variables theory explaining quantum mechanical measurement. Notwithstanding the possibility of understanding the difference between “real” and measured values of observables on the basis of measurement disturbance, this explanation seems to be ad hoc. Moreover, it may have rather strange consequences. Thus, in agreement with the principle of inertia, according to quantum mechanics momentum of a free particle is conserved. This is corroborated by the standard formalism of quantum mechanics, consecutive momentum measurements, performed on the same particle, yielding identical results (compare section 6.4). According to Bohm’s causal theory momentum (10.10) is not constant along the trajectory of a free particle (e.g. Holland [422], section 4.7). According to Bohm this disagreement between “real” and measured momentum would have to be remedied by means of momentum disturbance by the measurement. However, this would require that momentum measurements, performed at different times on a free particle, transform different “real” values (given by (10.10)) into identical measured ones. This is rather counterintuitive. It would also be hard to swallow that the principle of inertia would hold only for the measured values of momentum, and not for the “real” ones. 
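The non-constancy of the “real” momentum of a free particle can be made concrete with the textbook case of a spreading Gaussian packet at rest. The sketch below is an illustration under stated assumptions, not the source’s calculation: natural units, initial width `SIGMA0`, and the standard result that the Bohmian trajectories of such a packet scale with the packet width, $x(t) = x(0)\,\sigma(t)/\sigma_0$.

```python
import math

HBAR = 1.0    # natural units (assumption)
MASS = 1.0
SIGMA0 = 1.0  # initial position spread of the packet (assumption)

def width(t):
    """Spread of a free Gaussian packet with zero mean momentum."""
    return SIGMA0 * math.sqrt(1.0 + (HBAR * t / (2.0 * MASS * SIGMA0**2)) ** 2)

def position(x0, t):
    """Bohmian trajectory starting at x0: it scales with the packet width."""
    return x0 * width(t) / SIGMA0

def bohm_momentum(x0, t, h=1e-6):
    """p = m dx/dt along the trajectory, by a central difference."""
    return MASS * (position(x0, t + h) - position(x0, t - h)) / (2.0 * h)

p_start = bohm_momentum(1.0, 0.0)  # "real" momentum at t = 0
p_later = bohm_momentum(1.0, 2.0)  # "real" momentum at t = 2
```

Although no force other than the quantum potential acts, the momentum along the trajectory grows from zero, whereas the quantum mechanical momentum distribution of the free packet does not change in time.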
Bohm’s measurement theory has yet another unsatisfactory feature in Einstein’s example. The same “real” momentum value (zero) can yield either a positive or a negative measured value. Actually, the sign of the measured momentum value is determined by the particle’s initial position, not by its initial “real” momentum. Hence, although in Bohm’s causal theory the hidden-variables description does provide an “explanation” of the measured value of momentum, this explanation is not very telling. These features make Bohm’s causal theory rather unattractive. Things might be slightly better in Bohm’s stochastic theory. Here the above picture of a quantum mechanical measurement is changed in the sense that $\nabla S$ is not interpreted as an instantaneous “real” value of momentum, but as an average value in a stochastic process (see also [407, 425]), caused by stochastic motion in the $\psi$ field. Such fluctuations could in principle yield an explanation of the sign of measured momentum on the basis of statistical fluctuations of “real” momentum: positive and negative measurement results could be caused by positive and negative “real” initial momenta, respectively. Whether the explanatory character of the stochastic theory is sufficient seems
to be largely a matter of taste. As observed by Bohm, Hiley and Kaloyerou ([426], p. 329), due to fluctuations the motion will be chaotic and unstable, causing complete predictability and controllability of the initial conditions to be essentially impossible in practice. Nonetheless the authors stress that in this way the statistical notions of the predictions of quantum mechanics can be explained on the basis of a causal theory.
Methodological objection against Bohm’s theory
As a methodological objection against Bohm’s causal theory we should mention here the double roles of the quantities $\psi$ and $\nabla S$. This objection is analogous to the one advanced in section 2.4.3 against interpreting a quantum mechanical observable in two different ways. Since $\psi$ is thought to determine the trajectory of an individual particle, it seems that it must be a property of the individual object. However, at the same time it also describes the statistical measurement results obtained in an ensemble. In an ensemble interpretation of $\psi$ it would be rather incomprehensible how an individual particle could be influenced by it (Takabayasi [419]). The quantity $\nabla S$, too, has a meaning both as a (deterministic) property of an individual particle and as a (statistical) property of an ensemble. Apart from its role, defined by (10.10), as the value of “real” momentum, $\nabla S$ also defines the quantum mechanical probability flux J by means of (10.9). Although such a coincidence is not strictly impossible, it would imply that in a quantum ensemble the “real” value of momentum, measured at a certain position, is the same for all elements. Hence, statistical fluctuations of measured values should be a consequence of measurement alone (compare Einstein’s example, discussed above). This is an unattractive result for a theory purporting to address ‘incompleteness of quantum mechanics in a wider sense’ rather than ‘incompleteness in a restricted sense’ (compare section 4.2). Since the statistical meaning of $\nabla S$ is experimentally verifiable, it seems that at least this interpretation has a physical meaning. As the value of p in Bohm’s causal theory is the same as the statistical one, the result p = 0 for real wave functions can be understood as stemming from equal but opposite probability fluxes of the components of a superposition. It does not seem to be very meaningful to interpret this statistical average as a value of an individual particle.
An analogous remark is in order with respect to the result obtained by Spiller et al. [428], referred to above, to the effect that Bohm’s causal theory does not yield reflected trajectories if applied to a certain one-dimensional tunneling problem: absence of reflected trajectories (corresponding to negative momentum) can be understood because the probability flux is positive. This does not mean that there are no reflected particles; it just means that the reflected flux is smaller than the incident one. Compensation of these fluxes for real wave functions is just a special case of this phenomenon.
10.3.5 Concluding remarks
Bohm’s theory has some similarity with Bohr’s views: when determining the properties of an object, the whole experimental arrangement is important (cf. section 4.5). As in Bohr’s interpretation, the measurement context plays an important role. In order to obtain quantum mechanical measurement results the quantum potential should be calculated from the quantum mechanical state of the particle as it is during this interaction [425]. For Bohm this has been occasion to develop a holistic philosophy [430] in which Einstein’s objectivistic ideal is completely abandoned. The fact that the particle is allegedly reacting to information obtained from its whole environment is taken as evidence in favor of Bohr’s concept of “undivided wholeness” ([426], p. 329). As an important difference with Bohr it is advanced that, due to the existence of trajectories, in Bohm’s theory this “undivided whole” is analyzable. Yet, the appeal Bohm must make to measurement to be able to draw a distinction between the “real” and the measured values of an observable seems to be a last resort, necessary to cope with certain objections, rather than a virtue of the theory. The statistical interpretation of quantum mechanical probability yielded by Bohm’s stochastic theory is similar to the Copenhagen probabilistic interpretation (cf. section 6.2). Bohm’s reliance on a restriction to position measurements is reminiscent of a similar restriction involved in Heisenberg’s notion of quantum measurement as a process conditionally preparing non-overlapping final object states in a position representation (cf. section 4.6.1). It is equally unsubstantiated. Of course, in a sense Bohm is right that each measurement is a position measurement. But this refers to the pointers of measuring instruments rather than to the microscopic object.
Taking together all objections, it seems that we do not have much reason to expect that Bohm’s causal theory gives a satisfactory realist underpinning of quantum mechanics. Strictly speaking it is not even a new theory; it is just an interpretation of the quantum mechanical formalism, even though the interpretation suggests new physical quantities. This is not an unconditional recommendation, however, since it shares this latter feature with other theories that have been abandoned, like e.g. theories about the world aether. As an attempt at explaining quantum mechanical measurement results by subquantum dynamics it lacks physical plausibility. Although particle trajectories need not be mere metaphysical speculation, they presumably will not coincide with those of Bohm’s causal theory. Inclusion of the theory to be explained (quantum mechanics) in the explaining theory is very unusual, to say the least. Thus, thermodynamics is not an integral part of the statistical theory purporting to explain it. The trajectories of a classical underpinning of thermodynamics are not (co-)determined by the probability distributions of (classical) statistical mechanics, but follow from the equations of classical particle dynamics, i.e. from a fundamentally different theory. This seems to be desirable for a hidden-variables underpinning of quantum mechanics as well. For
this reason in the following sections we shall restrict ourselves to hidden-variables theories not having the hybrid character of Bohm’s theory.
10.4 Quasi-objectivistic hidden-variables theories
10.4.1 Introduction
In this section a type of hidden-variables theories is discussed that is often encountered in attempts at a realist underpinning of quantum mechanics. Point of departure is that i) immediately after preparation an individual object is in a state described by a well-defined value $\lambda$ of the hidden variable, ii) the quantum mechanical observables constitute a subset of the set of physical quantities described by the hidden-variables theory. We shall refer to such hidden-variables theories as quasi-objectivistic. In contrast to quantum mechanical observables, hidden variable $\lambda$ is considered to be an objective property of the microscopic object, possessed by the object at the moment of its preparation, and independent of the measurement to be performed later. Observables are thought to be determined, either in a deterministic or in a stochastic sense, by the value the hidden variable has at the moment of preparation. In contrast to objectivistic theories, in which the value of the observable is considered as an objective property of the microscopic object, in quasi-objectivistic theories it is possible to draw a distinction between the individual measurement result and the capacity of an individual microscopic object to induce (either in a deterministic or in a stochastic way) such a measurement result when a measurement is carried out. The main feature of a quasi-objectivistic hidden-variables underpinning of quantum mechanics is that the probability $p(a_m)$ of value $a_m$ of quantum mechanical observable $A$ (see footnote 6) can be represented according to
$$p(a_m) = \int d\lambda\; p(a_m|\lambda)\,\rho(\lambda), \tag{10.13}$$
in which $p(a_m|\lambda)$ is the conditional probability of measurement result $a_m$ for a given initial value $\lambda$, and $\rho(\lambda)$ is the relative frequency with which $\lambda$ is prepared. Hence,
$$\rho(\lambda) \geq 0, \qquad \int d\lambda\; \rho(\lambda) = 1.$$
Within the domain of quantum mechanics $\rho(\lambda)$ should correspond to the preparation of a quantum mechanical state.
Footnote 6: Here we restrict ourselves to standard observables, but extension to POVMs is straightforward.
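The structure of representation (10.13) can be illustrated with a toy model. The sketch below is hypothetical throughout: it assumes a hidden variable with three discrete values (so the integral becomes a sum), invented labels `lam1` to `lam3` and `a1`, `a2`, and arbitrary numbers; it is meant only to show how preparation frequencies and conditional probabilities combine.

```python
def measurement_probabilities(conditional, rho):
    """p(a) = sum over lambda of p(a | lambda) * rho(lambda), discrete lambda."""
    outcomes = sorted({a for dist in conditional.values() for a in dist})
    return {
        a: sum(conditional[lam].get(a, 0.0) * weight for lam, weight in rho.items())
        for a in outcomes
    }

# Hypothetical preparation: relative frequencies of the hidden-variable values.
rho = {"lam1": 0.5, "lam2": 0.3, "lam3": 0.2}

# Hypothetical conditional probabilities p(a | lambda); lam2 is genuinely stochastic.
conditional = {
    "lam1": {"a1": 1.0},
    "lam2": {"a1": 0.25, "a2": 0.75},
    "lam3": {"a2": 1.0},
}

probabilities = measurement_probabilities(conditional, rho)
```

A theory of this kind is deterministic when every conditional probability is 0 or 1 (as for `lam1` and `lam3` here) and stochastic otherwise.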
The representation (10.13) is sometimes seen as the most general form a quantum mechanical probability can assume in a hidden-variables theory (e.g. Eberhard [431]). Such an assumption is in agreement with the presupposition that no hidden-variables underpinnings of quantum mechanics would exist different from the quasi-objectivistic one. It was already pointed out in section 10.2.3, however, that it may not be very reasonable to condition a measurement result of a quantum mechanical observable on an instantaneous value of a hidden variable. As a matter of fact, the theory still has a partially hybrid character because it is required to encompass quantum mechanical observables. In section 10.6 we shall discuss the possibility of relaxing this requirement by introducing the notion of non-quasi-objectivistic hidden-variables theories. In the present section we restrict ourselves to quasi-objectivistic theories. Quasi-objectivistic hidden-variables theories are either deterministic or stochastic; a theory is deterministic if the conditional probabilities in (10.13) can have only value 0 or 1. It should be noted that these qualifications do not have a bearing on the time evolution of the hidden variable, but that they refer to the measurement process. The conditional probabilities may depend on the measurement procedure. In the following this contextuality will be explicitly displayed in the notation for the conditional probability of observable A. In agreement with the different possibilities of realist or empiricist interpretations of quantum mechanics, in (10.13) the observable may be considered as a property either of the object or of the measuring instrument. In general, quasi-objectivistic hidden-variables theories do not attribute values of quantum mechanical observables as properties to the microscopic object.
For this reason they need not satisfy the ‘possessed values’ principle of the objectivistic-realist interpretation of quantum mechanics, which in section 9.4.1 was seen to imply the Bell inequality. It will be demonstrated in section 10.5.3 that objectivistic hidden-variables theories, in which quantum mechanical observables are thought to be objective properties of the object, are not capable of reproducing all quantum mechanical probabilities, because they must satisfy the Bell inequality. Perhaps somewhat surprisingly, it will be seen there that this holds true for quasi-objectivistic theories as well, even though quasi-objectivism is much weaker than objectivism. It turns out that conditioning of observables on an instantaneous value of hidden variable by means of conditional probability (even in the weak sense of a stochastic theory, and notwithstanding dependence of the conditional probabilities on the measurement procedure) is sufficient to allow derivation of the Bell inequality. From this it follows that, in order to escape from the Bell inequality, quasi-objectivity will have to be abandoned. Non-quasi-objectivistic hidden-variables theories will be considered in section 10.6.
10.4.2 Quasi-objectivistic deterministic hidden-variables theories
In case of a deterministic theory the value of an observable is attributed as a property to the initial hidden-variables state of the microscopic object. This means that for each observable A a partitioning of the hidden-variables state space is possible, such that each point is in a region corresponding to a well-defined value of the observable (cf. figure 10.2, in which the partitions are exhibited for two observables, A and B, with values and respectively). Thus, In such theories an observable is actually considered as a function on The relative frequency of measurement result is given by
corresponding to (10.13) if
is the indicator function of region
Analogously, for two observables A and B we get
with
the region in which
. Thus,
Objectivistic deterministic theories We call a quasi-objectivistic deterministic hidden-variables theory objectivistic deterministic if the partition of is independent of the way the observables are measured. Classical statistical mechanics, in which all physical quantities are functions of the position and momentum variables, considered as objective properties of the particles, is the paradigm of an objectivistic deterministic hidden-variables theory Measurement does not play a role here. This is in agreement with an objectivistic-realist interpretation of quantum mechanics, and with the ‘possessed values’ principle (cf. section 2.3). Adhering to this interpretation it is even possible to consider quantum mechanics itself as a hidden-variables theory in which the observables themselves play the roles of hidden variables. In particular, in such a theory it is thought possible to attribute on each individual preparation, represented by a well-defined value of values to the four observables measured in EPR-Bell experiments, thus realizing the quadruples from which the Bell inequality can be derived (cf. section 9.4.1). As a consequence objectivistic deterministic hidden-variables theories are not capable of reproducing quantum mechanics. This holds even true if the interpretation is not realist, but an empiricist interpretation is combined with the principle of ‘faithful measurement’ (cf. section 2.4.3).
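The partition picture of a deterministic theory can be made concrete on a finite toy model. The following Python sketch is illustrative only: the six-point hidden-variables space, the uniform distribution `rho`, and the functions `A` and `B` are hypothetical choices, not taken from the text. Under these assumptions it shows that summing the distribution over the region belonging to a value is the same as inserting conditional probabilities with values 0 or 1 into a sum of the form (10.13), and that joint probabilities of two observables arise from intersections of regions.

```python
from fractions import Fraction

# Hypothetical finite hidden-variables space with a uniform distribution.
LAMBDA = range(6)
rho = {lam: Fraction(1, 6) for lam in LAMBDA}

# In a deterministic theory an observable is a function on the space.
def A(lam):
    return +1 if lam < 3 else -1

def B(lam):
    return +1 if lam % 2 == 0 else -1

def indicator(observable, value):
    """Conditional probability of a deterministic theory: only 0 or 1."""
    return lambda lam: 1 if observable(lam) == value else 0

def prob(observable, value):
    """Probability of a value: rho weighted with the indicator of its region."""
    chi = indicator(observable, value)
    return sum(rho[lam] * chi(lam) for lam in LAMBDA)

def joint_prob(a, b):
    """Joint probability: rho summed over the intersection of both regions."""
    chi_a, chi_b = indicator(A, a), indicator(B, b)
    return sum(rho[lam] * chi_a(lam) * chi_b(lam) for lam in LAMBDA)

p_A_plus = prob(A, +1)
# The marginal of the joint distribution reproduces the single probability.
marginal_from_joint = joint_prob(+1, +1) + joint_prob(+1, -1)
```

Because the regions here do not depend on how the observables are measured, this toy model is objectivistic in the sense of the paragraph above: every point of the space carries a value of both `A` and `B` simultaneously.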
Contextualistic deterministic hidden-variables theories In a contextualistic deterministic hidden-variables theory the partitions of in figure 10.2 may depend on the measurement arrangement, thus making it impossible to attribute for each value of an objective value to each observable. It is possible to implement this by assuming that the quantum mechanical measurement result is not determined by the value the hidden variable had at the instant of the initial preparation, but by the value it has after interaction with the measuring instrument. Let us assume the existence of an operator on for the measurement of observable A in a unique way mapping initial value onto final value : If the measurement result of quantity A is determined by the final value (10.15) must be replaced by
then
in which is the region must be prepared in lest correspond to measurement result By means of coordinate transformation (10.18) this expression can be rewritten according to
with the Jacobian of the transformation. For regular transformations we may assume Hence, can be interpreted as a probability distribution on Then (10.19) has the same appearance as (10.15), however, with the important difference that depends on the measurement arrangement. For this reason there need not exist unique joint probabilities (10.17) for incompatible observables. Analogously to the discussion of the Bell inequality in a contextualistic-realist interpretation of quantum mechanics (cf. section 9.4.2) there is a different joint probability distribution of the four observables for each EPR-Bell experiment. For Lochak [432, 433] the dependence of on the measurement arrangement is reason enough to believe that contextuality is sufficient for making deterministic hidden-variables theories acceptable as realist underpinnings of quantum mechanics. Lochak’s solution to the problem of the Bell inequality does not seem to be acceptable, though. As follows from (10.20) this solution is mathematically equivalent to an objectivistic situation in which the initial state of the object depends on the measurement arrangement. It does not seem to be acceptable, however, to assume that in EPR-Bell experiments it would be impossible to prepare the (statistical) initial state in such a way that it is independent of a measurement to be carried out later. In the Aspect “switching” experiment, referred to in section 9.3.1, care is even taken that the measurement arrangement is not yet set up at the moment of preparation! A dependence of on A would require causal influencing backward in time (from measurement to preparation), or at least a kind of cosmic harmony taking care that the initial state is adapted to the later measurement. Such a conspiratory behavior of reality seems to be as improbable as the one, observed in section 9.3.2 to be necessary to make nonlocal influences unobservable at the statistical level.
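The equivalence between a measurement-dependent map acting on the hidden variable and a measurement-dependent initial distribution can be checked on a small discrete model. The sketch below is only an illustration under stated assumptions: the four-point space, the distribution `rho`, the cyclic map `T_A` (a permutation, the discrete analogue of a regular transformation with unit Jacobian) and the assignment `value_of_A` are all hypothetical.

```python
from fractions import Fraction

# Hypothetical four-point hidden-variables space and initial distribution.
LAMBDA = [0, 1, 2, 3]
rho = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}

def T_A(lam):
    """Assumed measurement-dependent map from initial to final value."""
    return (lam + 1) % 4

def value_of_A(lam_final):
    """The result is read off from the FINAL value of the hidden variable."""
    return +1 if lam_final in (0, 1) else -1

def prob_via_initial(a):
    """Sum rho over the initial values that are mapped into the region of a."""
    return sum(rho[lam] for lam in LAMBDA if value_of_A(T_A(lam)) == a)

def pushed_forward_rho():
    """Distribution of final values; it depends on the arrangement via T_A."""
    rho_A = {lam: Fraction(0) for lam in LAMBDA}
    for lam in LAMBDA:
        rho_A[T_A(lam)] += rho[lam]
    return rho_A

def prob_via_final(a):
    """Same probability, written with the arrangement-dependent distribution."""
    rho_A = pushed_forward_rho()
    return sum(rho_A[lam] for lam in LAMBDA if value_of_A(lam) == a)
```

Both functions return the same number for every value, which is the discrete counterpart of the rewriting by coordinate transformation. The first form also shows that the probability is still a sum over initial values weighted with a 0/1 conditional probability, so the contextual model remains of the quasi-objectivistic type.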
In section 10.6 it will be seen that, indeed, this kind of contextuality is not sufficient to solve the problem of the violation of the Bell inequality by quantum mechanics. The reason for this is that probability (10.19) can be written in the form (10.13), with the characteristic function of Hence, notwithstanding the dependence on the measurement procedure, it satisfies the conditions of a quasi-objectivistic theory. In contrast to quantum mechanics, such a theory is capable of describing identical individual preparations in different EPR-Bell experiments (compare section 9.5.2) by just taking identical values of in all four experiments. For this value of equation (10.18) then yields, in the manner proposed by Stapp (cf. section 9.5.1), a unique quadruple of measurement results⁷. Consequently, the Bell inequality should be satisfied. Evidently, the fact that the hidden variable in this theory is considered as an objective property, possessed by the object prior to each measurement, is sufficient to lend through (10.18) even the quantum mechanical measurement results a certain (quasi-)objective status, even if these measurement results are themselves not seen as objective properties of the object.
⁷ According to Stapp this should already hold for quantum mechanics, no hidden variables being required.
Empiricist deterministic hidden-variables theories In empiricist theories the measurement result is represented by the final state of a measuring instrument. Then it is necessary to explicitly introduce the measuring instrument, as well as its interaction with the microscopic object, into the description. Let us assume that the measuring instrument is described by a hidden variable Then a deterministic interaction between the object and the measuring instrument for observable A realizes a transformation
between initial and final values of the hidden variables. In the empiricist theory the quantum mechanical measurement result is associated with the final value of the hidden variable of the measuring instrument. We shall ignore here theories in which rather than is taken as determining the measurement result, because such theories share the shortcomings of the Copenhagen measurement theories discussed in section 4.6.1. It will now be demonstrated that, although -like in an empiricist interpretation of quantum mechanics- an observable is a property of a measuring instrument, the theory is nevertheless quasi-objectivistic in the sense that a measurement result can be reduced (at least in a statistical sense) to the value hidden variable of the object had at the time of preparation. Let be the space of the apparatus variable and the subspace corresponding to measurement result Then, analogously to (10.19) and (10.20), we get
where is the initial state of the apparatus ensemble. Expression (10.22) is of the quasi-objectivistic type (10.13), with
For this reason empiricist deterministic hidden-variables theories are not able to reproduce quantum mechanics (cf. section 10.5.3). There is yet another way to see why neither contextualism nor empiricism yields an acceptable solution for the case of a deterministic hidden-variables theory. It is true that the impossibility of attributing in such a theory in an objective, measurement-independent way a value of observable A to the initial state of the object is clearly demonstrated by (10.21): it is possible that, in combination with different initial values of one and the same initial value of yields final states corresponding to different values of A. This can occur in case of a nonideal measurement (cf. section 7.6) in which the nonideality of a measurement is caused by fluctuations in the initial state of the measuring instrument, explaining why sometimes the “wrong” measurement result is found. However, such an explanation is not applicable to an ideal measurement of a standard observable, and is hardly reconcilable with the fact that in case of an eigenstate of such an observable we always get the correct value (viz, the corresponding eigenvalue). If empiricist deterministic hidden-variables theories were correct, then quantum mechanical statistics would seem to tell more about the measuring instrument than about the microscopic object. This would be in disagreement with the intention of a realist underpinning of quantum mechanics to find an explanation of that statistics in the properties of the microscopic object itself (cf. section 2.3). It, therefore, seems necessary that at least in an ideal measurement of a standard observable the measurement result is independent of the initial value of However, this would bring us back to the non-empiricist case in which such an observable is represented by a unique function of
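How a deterministic object-plus-apparatus interaction nevertheless produces conditional probabilities on the object's hidden variable alone can be sketched numerically. The model below is a hypothetical toy: the state labels, the apparatus distribution `sigma`, the interaction rule `interact` and the pointer regions are assumptions introduced for illustration, and the name `mu` for the apparatus variable is likewise an assumption.

```python
from fractions import Fraction

OBJECT_STATES = [0, 1]          # assumed values of the object's hidden variable
APPARATUS_STATES = [0, 1, 2]    # assumed values of the apparatus variable mu

# Assumed initial distributions; object and apparatus are independent.
rho = {0: Fraction(1, 3), 1: Fraction(2, 3)}
sigma = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}

def interact(lam, mu):
    """Hypothetical deterministic interaction; returns the final (lam, mu)."""
    return lam, (mu + 2 * lam) % 3

# Regions of apparatus space that are read as measurement results.
POINTER_REGION = {+1: {0, 1}, -1: {2}}

def cond_prob(a, lam):
    """p(a | lam): average over the uncontrolled initial apparatus state."""
    return sum(sigma[mu] for mu in APPARATUS_STATES
               if interact(lam, mu)[1] in POINTER_REGION[a])

def prob(a):
    """Quasi-objectivistic form: rho weighted with the conditional probability."""
    return sum(rho[lam] * cond_prob(a, lam) for lam in OBJECT_STATES)
```

Although every individual run is deterministic, `cond_prob` takes values strictly between 0 and 1 because the initial apparatus state fluctuates. This is the sense in which one and the same initial value of the object variable can yield different results.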
10.4.3 Quasi-objectivistic stochastic theories
Like in the deterministic theory, in a quasi-objectivistic stochastic hidden-variables theory we start from a value of prepared prior to measurement. However, the correspondence of the value of observable A with is thought not to be unique. The idea is that, starting from this for each of the values of observable A there is a certain probability that it is obtained on measurement. This implies the existence of the conditional probabilities and of expression (10.13) for the probabilities The existence of the conditional probabilities conditioning quantum mechanical measurement results on hidden variable is the characterizing feature of a quasi-objectivistic hidden-variables theory. Stochastic hidden-variables theories purporting to reproduce the results of quantum mechanics must encompass a certain determinism, necessary to yield the unique result for observable A if corresponds to an eigenvector, i.e. Hence, we should have for all such that However, for general values of we may have Stochastic hidden-variables theories offer the possibility to relax the unique relation between the initial value of hidden variable and measurement result obtaining in the deterministic theories discussed in section 10.4.2. Such a relaxation is necessary to escape from the ‘possessed values’ principle. The conditional probabilities might be interpreted in the sense of Popper’s propensity interpretation (cf. section 6.2.2), not offering any explanation of measurement result over the constraint imposed by the conditional probability, but nevertheless reducing -be it in a statistical sense- to an objective property of the object.
A statistical description of a measurement process can be realized by a so-called ‘master’ equation (e.g. van Kampen [70]), describing the time evolution of a probability distribution. In the following we shall not try to develop realistic models, but we restrict ourselves to a couple of examples demonstrating the apparent feasibility of such theories. In both examples we allow a dependency of conditional probabilities on the measurement procedure.
Contextualistic stochastic theories Restricting ourselves, analogously to (10.18), first to theories in which the interaction of object and environment is represented by an external potential, the ‘master’ equation is given by
in which operator controls the dynamics of the stochastic process. Denoting the solution of this equation by
must be such that the operator is a stochastic operator (Primas [434]), warranting the solution to be interpreted as a probability distribution also for With
in which T is the final time of the measurement process, the probabilities can be written in the form (10.13) if we define
Here is the Hermitian adjoint of
It is not difficult to see that
with the solution of the ‘master’ equation (Green’s function) with initial condition Fluctuations in the stochastic process described by the ‘master’ equation can, in principle, explain the indeterministic character of the measurement process (however, see section 10.5.3). Since operator and Green’s function may depend on the measurement arrangement for observable A, the theory is contextualistic in general.
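The role of the adjoint operator can be illustrated with a discrete-time Markov chain standing in for the ‘master’ equation. This is an assumption made purely for illustration: the three-state space, the one-step kernel `K`, the number of steps and the region indicator are hypothetical. The sketch shows that evolving the distribution forward and then summing over the region of a measurement result gives the same probability as evolving the indicator function backward with the transposed kernel and averaging it over the initial distribution, the backward-evolved indicator playing the role of the conditional probability.

```python
from fractions import Fraction as F

# Column-stochastic one-step kernel: K[j][i] is the probability of i -> j.
K = [[F(3, 4), F(1, 4), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0),    F(1, 4), F(3, 4)]]
N = len(K)

def step_forward(rho):
    """One step of the stochastic process acting on a distribution."""
    return [sum(K[j][i] * rho[i] for i in range(N)) for j in range(N)]

def step_backward(f):
    """One step of the adjoint (transposed) kernel acting on a function."""
    return [sum(K[j][i] * f[j] for j in range(N)) for i in range(N)]

def iterate(step, x, n):
    for _ in range(n):
        x = step(x)
    return x

rho0 = [F(1, 2), F(1, 2), F(0)]   # assumed initial distribution
chi_plus = [F(1), F(1), F(0)]     # indicator of the region read as result +1
n_steps = 3                       # discrete stand-in for the final time T

# Forward picture: evolve the distribution, then sum over the region.
p_forward = sum(c * r for c, r in zip(chi_plus, iterate(step_forward, rho0, n_steps)))

# Backward picture: conditional probability of +1 given the initial state.
cond_plus = iterate(step_backward, chi_plus, n_steps)
p_backward = sum(r * c for r, c in zip(rho0, cond_plus))
```

Since the kernel is stochastic, every entry of `cond_plus` lies between 0 and 1, so it can consistently be read as a conditional probability. If the kernel depends on the measurement arrangement, so does `cond_plus`, which is the contextuality referred to above.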
Empiricist stochastic hidden-variables theories Like in the deterministic case it is unsatisfactory that in the theory as presented above the interaction with the measuring instrument is not described in a dynamic way, not relating measurement result to the final state of the measuring instrument but to that of the object. It is possible, however, to extend the theory into this latter direction also in the stochastic case (de Muynck and van Stekelenborg [71]). It is possible to consider a ‘master’ equation for the whole system of object and measuring instrument, the solution being given according to
the initial statistical state of the measurement apparatus. It is assumed that at object and measuring instrument are statistically independent. If is the region of the apparatus hidden-variables space corresponding to measurement result then we get
If is the indicator function of region then can once again be written in the quasi-objectivistic form (10.13), with
As an example we consider a measurement process, described by the Fokker-Planck equation (de Muynck and van Stekelenborg [71])
Green’s function for the initial values problem of this equation is found according to
Putting
this implies that has indeed the form (10.13), with
A measurement of at time T can be seen as a nonideal version (in the sense of section 7.6) of a measurement of ideally reproducing probability distribution
Thus, if is the measured probability of the ideal measurement, then, indeed,
with given by the conditional probability defined in (10.28). Hence, the two probability distributions satisfy a nonideality relation analogous to (7.37). Since relation (10.28) is a convolution relation, nonideality relation (10.29) is even invertible in the sense of section 7.6.1. An ideal measurement of can be obtained by putting in (10.26) D = 0. In that case Green’s function is given by
This implies that the measurement process is deterministic if no spreading is introduced by fluctuations in the initial value of Indeed, we find
Hence,
implies that
The example given here can be generalized to more general equations, leaving the qualitative picture unchanged⁸ [71]. If quantum mechanical measurements were described by these equations this would imply that an ideal measurement of an observable with a continuous spectrum is impossible in general if stochasticity would play an essential role in the measurement process: if the diffusion-like character of the solution would prevent such an ideal measurement, even if the fluctuations in the initial state of the measuring instrument would be completely controlled. If D is not too large, this does not seem to be a big problem, though: as was seen in chapters 7 and 8, realistic measurements always exhibit a certain nonideality, causing experimental measurement results to deviate from the results predicted by the standard formalism. In case of a discrete spectrum it seems very well possible that for sufficiently small fluctuations an ideal measurement of a standard observable can be approximated arbitrarily closely: if deviation from (10.30) is not too large, it seems possible that the initial distribution corresponding to an eigenvector of the measured standard observable, is concentrated in a certain subregion of in such a way that of final state is concentrated, too, in subregion
⁸ In contrast to (10.27), in a relativistic dynamics Green’s function should not exhibit superluminal behavior.
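The convolution character of the nonideality relation can be explored numerically. The sketch below assumes, purely for illustration, that the stochastic part of the dynamics is plain diffusion with constant D, so that Green’s function is a Gaussian of variance 2DT; the actual equation used in the text is not reproduced here, and the grid, the Gaussian ideal distribution and the parameter values are hypothetical choices.

```python
import math

def gaussian(x, var):
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def green(x, x0, D, T):
    """Green's function of pure diffusion (assumed form), valid for D*T > 0."""
    return gaussian(x - x0, 2.0 * D * T)

def measured_distribution(ideal, xs, dx, D, T):
    """Nonideal distribution: convolution of the ideal one with the kernel."""
    if D == 0:
        # No diffusion: the kernel is a delta function, the measurement is ideal.
        return list(ideal)
    return [sum(green(x, x0, D, T) * p0 for x0, p0 in zip(xs, ideal)) * dx
            for x in xs]

def variance(dist, xs, dx):
    norm = sum(dist) * dx
    mean = sum(x * p for x, p in zip(xs, dist)) * dx / norm
    return sum((x - mean) ** 2 * p for x, p in zip(xs, dist)) * dx / norm

dx = 0.05
xs = [i * dx for i in range(-200, 201)]   # grid on [-10, 10]
ideal = [gaussian(x, 1.0) for x in xs]    # assumed ideal distribution, variance 1

D, T = 0.5, 1.0
measured = measured_distribution(ideal, xs, dx, D, T)
spread_ideal = variance(ideal, xs, dx)
spread_measured = variance(measured, xs, dx)
```

Under these assumptions the variance of the measured distribution exceeds that of the ideal one by 2DT, and setting D = 0 returns the ideal distribution unchanged. Because a Gaussian kernel has a nowhere-vanishing Fourier transform, such a convolution can in principle be undone, which is one way to read the invertibility mentioned in connection with section 7.6.1.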
There is an essential difference here with the Copenhagen interpretation. In this latter interpretation the statistical spreading of measurement results is not a consequence of an initial spreading of hidden variables in the initial state of the object, but is thought either not to have a cause at all, or to be caused by the disturbing influence of the measuring instrument. In terms of a stochastic hidden-variables theory this would mean that ‘indeterminacy’ is a consequence of a spreading in rather than in The Copenhagen interpretation is not capable of giving a complete and convincing analysis of this issue, thus causing the confusion of ‘preparation’ and ‘measurement’ observed in section 4.8. This confusion was already partially resolved by relying on an empiricist interpretation of quantum mechanics. However, it was not possible to establish a relation between the object itself and the quantum mechanical measurement result. As a consequence, in the quantum mechanical formalism possible measurement disturbance in a maximal measurement (for instance, a maximal standard observable, cf. section 7.7) could not be analyzed any further. By the present example it is demonstrated that, at least in principle, the hidden-variables theories discussed here are capable of yielding an explanation of quantum mechanical statistics on the basis of fluctuations stemming from two different sources, viz, i) the preparation, ii) disturbance in the sense of Heisenberg’s disturbance theory of measurement. The latter contribution is in agreement with the results obtained by means of the generalized formalism of quantum mechanics (cf. section 7.10.3).
EPR-Bell experiments An empiricist stochastic hidden-variables theory can be applied in the following way to EPR-Bell experiments of the type depicted in figure 9.1. Now there are two objects, with hidden variables and respectively. Observables and are measured in causally disjoint regions. The ‘master’ equation can be given according to
with initial condition
in which it is assumed that object and measuring instruments and are uncorrelated at and free propagation of particle toward its measuring instrument depends only on Green’s function for the initial values problem of this equation satisfies
10.5. QUASI-OBJECTIVISTIC LOCAL THEORIES
Hence, for the probability distribution of the apparatus variables in the final state of the measurement process we get:
with
Relation (10.31) expresses the local character of the interaction between particle and the measuring instrument of observable It plays an important role in the following discussion with respect to the Bell inequality. As will be seen in section 10.5, quasi-objectivistic hidden-variables theories -be they deterministic or stochastic- are not capable of reproducing the results of quantum mechanical measurements, if it is required that the interactions described by the theory be local, because in that case the Bell inequality should be satisfied. This fact has induced a general belief that the quantum world must be nonlocal (see also section 10.3.4). If ‘locality’ would be the only assumption, this belief would be justified. However, there is another possibility, to the effect that the assumption of ‘quasi-objectivism’ is not valid, and that abandoning this latter assumption may prevent a derivation of the Bell inequality. This possibility will be investigated in section 10.6.
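The consequence of locality combined with quasi-objectivism can be probed numerically. The sketch below assumes that locality amounts to factorization of the joint conditional probability into one factor per particle, each depending only on its own measurement setting and on a shared hidden variable; it further assumes dichotomic results and uses the CHSH combination of correlations, which may differ in form from the inequality used in the text. All names, the number of hidden states and the random sampling are hypothetical.

```python
import random

def random_local_model(n_states, rng):
    """Random distribution over hidden states plus local response functions."""
    weights = [rng.random() for _ in range(n_states)]
    total = sum(weights)
    rho = [w / total for w in weights]
    # p1[s][lam]: probability of result +1 for particle 1 with local setting s.
    p1 = {s: [rng.random() for _ in range(n_states)] for s in ("A1", "B1")}
    p2 = {s: [rng.random() for _ in range(n_states)] for s in ("A2", "B2")}
    return rho, p1, p2

def correlation(rho, p1, p2, s1, s2):
    """Factorization: the conditional expectation of a product is a product."""
    return sum(r * (2 * p1[s1][lam] - 1) * (2 * p2[s2][lam] - 1)
               for lam, r in enumerate(rho))

def chsh(rho, p1, p2):
    def E(s1, s2):
        return correlation(rho, p1, p2, s1, s2)
    return abs(E("A1", "A2") - E("A1", "B2")) + abs(E("B1", "A2") + E("B1", "B2"))

rng = random.Random(0)  # fixed seed so the run is reproducible
largest = max(chsh(*random_local_model(5, rng)) for _ in range(1000))
```

No sampled model exceeds the bound 2. This follows analytically as well: for numbers x, y, u, v between -1 and 1 one has |x(u - v)| + |y(u + v)| ≤ 2, and averaging over the hidden variable cannot increase the bound. Note that the response functions here are genuinely stochastic, so determinism is not what enforces the inequality in this toy setting.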
10.5 Quasi-objectivistic local hidden-variables theories, and Bell’s inequality
10.5.1 Why local hidden variables?
Each physical theory defines its own domain of application, i.e. the set of measurements and observations described by the theory. By itself there is no reason to object to the possibility that a hidden-variables theory may satisfy the Bell inequality, whereas quantum mechanics violates it, if the theories have different domains of application. However, this way out of the problem of the Bell inequality is too easy if we want the domain of application of the hidden-variables theory to encompass the domain of quantum mechanics. This implies that the hidden-variables theory will have to allow violation of the Bell inequality, at least for measurements like the standard EPR-Bell experiments discussed in chapter 9, which are within the domain of quantum mechanics. Of course, the domain of a hidden-variables theory may also contain experiments that are outside the domain of quantum mechanics. By taking this point of view it becomes possible to investigate the question of what are the special conditions measurements should satisfy to be within the domain of quantum
mechanics. In this manner we hope to get a grip on the problem of possible additional assumptions in derivations of the Bell inequality, and on the physical reasons why certain assumptions may not be valid within the domain of quantum mechanics. The Bell inequality has its historic origin in attempts at a realist underpinning of quantum mechanics by means of a hidden-variables theory. Bell’s first point of departure [33] was his insight that the impossibility proof by von Neumann is defective (cf. section 10.2.1). This insight was a consequence of his conviction that Bohm’s hidden-variables theory, discussed in section 10.3, is a counterexample to von Neumann’s proof. A second point of departure was the nonlocal character of Bohm’s hidden-variables theory. This gave Bell the idea that only a nonlocal hidden-variables theory would be possible as an underpinning of quantum mechanics, and that perhaps only local hidden-variables theories could be proven to be inconsistent with that theory. This induced him to try to construe an impossibility proof valid only for local hidden-variables theories. Bell [33] succeeded by deriving the inequality named after him. Not surprisingly, this inequality refers to the EPR problem, in which the issue of (non)locality appears in quantum mechanics for the first time (compare section 5.3.1). Because of its historical significance we shall discuss Bell’s proof in section 10.5.2. However, it is noted already here that the attention focused by Bell on the question of (non)locality in quantum mechanics has considerably clouded the issue, and that in the derivation of the Bell inequality the assumption of locality of the hidden-variables theory is far less essential than assumed by Bell. This observation must be seen as an extension to the domain of hidden-variables theories of our discussion in chapter 9.
There, too, we saw that violation of the Bell inequality is often attributed to the alleged fact that quantum mechanics would describe a nonlocal reality, a belief for which Bell to a large extent can be held responsible. However, we saw already there that other (additional) assumptions than locality may be responsible for derivability of the Bell inequality in quantum mechanics. We shall consider this possibility also here. In sections 10.5.2 and 10.5.3 we shall first discuss the traditional derivations of the Bell inequality which for many have been occasion to firmly believe in the nonlocality of quantum mechanical reality. It will be seen, however, that the assumption of locality does not play an essential role in these derivations. Such an assumption is reasonable in EPR-Bell experiments because the experimental measurement arrangement makes it plausible that there is no parameter dependence (cf. section 9.1.3). However, as seen in chapter 9, violation of the Bell inequality is connected to incompatibility of the observables that are involved in the EPR-Bell experiments, rather than to simultaneous measurement of compatible observables.
10.5.2 Bell’s theorem
Bell’s theorem will first be proven in the way originally done by Bell [33]. The experimental arrangement considered by Bell was already discussed in section 9.1 (cf. figure 9.1). The same notation will be used here. Like in Bell’s paper we shall restrict ourselves to dichotomic observables with eigenvalues +1 and – 1. In his derivation Bell restricted himself to deterministic hidden-variables theories (cf. section 10.4.2). Then the value of an observable is uniquely determined by the value of hidden variable Thus, depending on the value possessed by the object on preparation, observables and have values and respectively. Hence, it is possible to define functions and on hidden-variables space by taking their values, and respectively, to be equal to the quantum mechanical measurement results obtained if the corresponding hidden variable is prepared. In Bell’s theorem the four correlation functions and are represented by integrals over
These quantities are compared with the quantum mechanical expectation values, already considered in section 9.2.2. Bell’s theorem can be formulated as follows: Bell’s theorem: The correlation functions
etc., defined by (10.32), satisfy inequality (9.12).
Proof: The proof makes use of the assumption that and have values +1 or −1. This implies
and, hence, Writing
it follows that
With (10.33) and the properties of
given in (10.14) this yields inequality (9.12).
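The content of the theorem can be checked by brute force for deterministic assignments. The sketch below is an illustration under assumptions: it uses the CHSH combination of the four correlation functions, which may differ in form from inequality (9.12), and for comparison it uses the standard singlet-state correlation, minus the cosine of the angle between the two measurement directions, with one conventional choice of angles. The function and variable names are hypothetical.

```python
import itertools
import math

def chsh_combination(a1, b1, a2, b2):
    """CHSH combination for one quadruple of values, each +1 or -1."""
    return a1 * a2 - a1 * b2 + b1 * a2 + b1 * b2

# Every deterministic hidden-variables state fixes one quadruple of values.
quadruples = list(itertools.product((+1, -1), repeat=4))
max_deterministic = max(abs(chsh_combination(*q)) for q in quadruples)

def singlet_correlation(theta1, theta2):
    """Quantum correlation of spin measurements on the singlet state."""
    return -math.cos(theta1 - theta2)

# Conventional angles giving the maximal quantum value.
angles = {"A1": 0.0, "B1": math.pi / 2, "A2": math.pi / 4, "B2": 3 * math.pi / 4}

def E(s1, s2):
    return singlet_correlation(angles[s1], angles[s2])

quantum_value = abs(E("A1", "A2") - E("A1", "B2") + E("B1", "A2") + E("B1", "B2"))
```

Each quadruple gives exactly 2 in absolute value, because the combination equals a1(a2 - b2) + b1(a2 + b2) and one of the two brackets always vanishes. Averaging over any distribution of the hidden variable therefore cannot exceed 2, whereas the quantum value for the chosen angles is 2√2.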
In the derivation given above one assumption is not yet made explicit, viz, the assumption of locality deemed essential by Bell. This assumption is hidden in the presupposition that the functions and do not depend on any parameter of the measurement arrangement of the other particle. It can be imagined that the values of the observables do not only depend on the value of but also on the choice of the measurement arrangements. Thus
parameters and having two possible values each, corresponding to the possible measurement arrangements. Dependence of or on would imply nonlocality. Such a ‘parameter dependence’ was already encountered in section 9.5 (using a different notation for the values of the observables). This dependence was associated with nonlocality already there, because the influence of the measurement arrangement for particle 2 would have to be felt in an instantaneous, and, hence, nonlocal way, by (the measurement of) particle 1. If in (10.35) dependence of on would be essential, then the proof of the Bell inequality, as given above, could not be carried through any longer because then the functions should have different parameter values in different terms of (10.34). Hence, may then have different values in the second and fourth terms, making it impossible to write down the right-hand side of (10.34). This, actually, was the reason why Bell considered the locality assumption -implying that in (10.35) there is no nonlocal parameter dependence- of essential importance for a derivation of his inequality. For Aspect the conviction that -in agreement with relativity theory- the world can be described in terms of local/causal interactions, must have been reason enough to think it worthwhile to perform experiments testing whether -in disagreement with quantum mechanical prediction- the Bell inequality is satisfied. As is well known the Aspect experiments [373, 290] corroborated the predictions of quantum mechanics, even in the “switching” version in which special care was taken to prevent that information on the value of parameter could reach particle 1 in a local/causal way (and vice versa). Also in the “switching” experiment violation of the Bell inequality was observed. 
As a consequence the locality assumption is generally considered as the true reason why the Bell inequality should be satisfied, and is held responsible for the fact that local hidden-variables theories are not able to reproduce quantum mechanical violation of the Bell inequality. If true, this, indeed, would imply that a reality described by quantum mechanics must have a nonlocal character.
Possibility of additional assumptions in Bell’s derivation However, as already remarked in section 10.5.1, this reasoning is valid only if the hidden-variables theory considered by Bell is the most general one. It is possible
that Bell has made additional assumptions, not explicitly specified by him, and undermining the logical reasoning involved in (9.37). Analogously to the quantum mechanical considerations put forward in section 9.5.1 the possibility, allowed by (10.35), that the locality condition would not be satisfied in a hidden-variables theory, would be incomprehensible in view of the applicability of the postulate of local commutativity to EPR-Bell experiments. Therefore it is worthwhile to try to maintain locality also in hidden-variables theories meant to provide a realist explanation of these experiments. It is desirable to thoroughly investigate the possibility of additional assumptions also here. The following additional assumptions, next to locality, might be thought of:
i) Existence of hidden variables It is possible to consider the existence of hidden variables as an additional assumption, and to blame on it the derivability of the Bell inequality for the standard EPR-Bell experiments. Indeed, as was seen in chapter 9, the quantum mechanical formalism does not give any reason to assume that such a derivation is possible at all. This, actually, is the positivistic solution, which was popular during the period in which the Copenhagen interpretation prevailed. This is too easy a solution, however. From the point of view of an empiricist interpretation (cf. section 2.2), in which quantum mechanics does not describe the microscopic object itself but just macroscopic phenomena associated with preparation and measurement, existence of a subquantum or hidden-variables theory is necessary for describing the reality behind the phenomena. Hence, we do not have at our disposal the possibility of denying the mere existence of hidden variables. However, choosing a special kind of hidden-variables theory may correspond to more specific additional assumptions.
ii) Determinism
One additional assumption made by Bell in his derivation from hidden-variables theory is immediately evident, viz. the restriction to deterministic theories. It could be surmised that derivability of the Bell inequality should be blamed on the assumption of a deterministic detection procedure. There is certainly reason to think so. As observed in section 10.4.2, in objectivistic deterministic theories it is possible to attribute the measurement result as a property to the microscopic object in initial state $\lambda$. This entails the Bell inequality to be satisfied as a consequence of the ‘possessed values’ principle (cf. section 9.4.1). This holds true even in local contextualistic deterministic theories, yielding quadruples of measurement results analogously to the way discussed in section 9.5.1. This would seem to imply that determinism has to be abandoned if locality is to be maintained. In view of the necessity to make a clear distinction between properties of the microscopic object and properties of the measuring instrument, such a conclusion would be too hasty, however. Indeed, the situation is different in empiricist deterministic theories. Here the measurement result should be attributed to the combined initial state of object+measuring instrument. Hence, given identical values of $\lambda$, different measurement results could be obtained due to different initial values of the measuring instrument, thus preventing the existence of quadruples of measurement results for the four standard EPR-Bell experiments. Although this seems to be able to reconcile violation of the Bell inequality with determinism, it is not a satisfactory solution, because it is based on the idea of nonideality of the measurements (compare (10.29)), in which fluctuations in the measuring instrument are responsible for disturbance of measurement results. Since the four standard EPR-Bell experiments are ideal measurements of the observables involved in the quadruples, this solution is not applicable (nonideality in these experiments just referring to observables that are incompatible with the measured ones, cf. section 9.3.1). Therefore it does not seem probable that violation of determinism can solve the problem posed by Bell’s theorem. Moreover, it may not be wise to abandon determinism too easily. As a matter of fact, in certain specific situations (e.g. if the quantum state is an eigenstate of the measured standard observable) a certain determinism can be observed even in quantum mechanics. Also, the strict EPR correlations bear witness to a certain determinism. We do not have any empirical evidence contradicting the assumption that results of measurements of standard observables could be determined in a deterministic way by the subquantum state of the object. Notwithstanding the professed indeterminism of the Copenhagen interpretation, such a determinism is suggested by Heisenberg’s disturbance theory of measurement, in which the actually measured standard observable is thought to be undisturbed (compare section 4.6.2).
Admittedly, due to the inapplicability of the ‘possessed values’ principle it is not possible to apply Heisenberg’s idea of ‘determinism’ in the sense that the measurement result was there beforehand, but there may have been a subquantum state inducing, in a deterministic way, the measurement result in the measuring instrument (see also section 6.4.3). Since the problem of violation of the Bell inequality by quantum mechanics refers to standard observables, it would be too hasty to blame its violation on lack of determinism of hidden-variables theories. However, this determinism might be different from the one described by (10.18) or (10.21). A distinction must be drawn between (in)determinism at the level of the macroscopic phenomena described by the quantum mechanical measurement results, and (in)determinism at the level of the subquantum dynamics of the hidden variables. Determinism at the macroscopic level may be consistent with indeterminism at the (sub)microscopic level, analogously to the way thermodynamic quantities like temperature and pressure behave deterministically notwithstanding an underlying stochasticity. As is well known, the discussion about the relevance of the assumption of determinism has become obsolete by now. The Bell inequality can be derived without the assumption of determinism. In section 10.5.3 we shall consider the generalization, due to Clauser and Horne [383], to stochastic (quasi-objectivistic local) hidden-variables theories, eliminating the restriction to deterministic theories. It turns out that with respect to the Bell inequality deterministic and stochastic quasi-objectivistic local hidden-variables theories behave similarly. Since the first are special cases of the second, it is not necessary to explicitly consider deterministic theories. This has the advantage that it is clear that problems with respect to the Bell inequality cannot be blamed on an (additional) assumption of determinism (see footnote 9).
iii) Non-contextuality or objectivity
In agreement with Bohr’s contextualistic interpretation of the EPR problem (cf. section 5.3), in which a quantity of particle 1 must be (co-)determined by the measurement arrangement of particle 2 (and vice versa), a nonlocal parameter dependence, present in (10.35) by the dependence of an observable of one particle on a parameter of the measurement arrangement of the other particle, is sometimes interpreted as ‘contextuality’. Then Bell’s locality assumption would actually be an assumption of non-contextuality. Indeed, for objectivistic deterministic local theories, in which the measurement results do not depend on any such parameter at all, quasi-objectivity in EPR-Bell experiments reduces to objectivity, since the measurement result can then be interpreted as an objective property of a microscopic object in state $\lambda$. In this view the nonlocal contextuality expressed by the dependence of the measurement results on the parameter of the distant measurement arrangement is responsible for the nonexistence of quadruples of measurement results allowing a derivation of the Bell inequality along the lines discussed in section 9.4.1. Recalling our criticism of Bohr’s contextual interpretation of the EPR problem (cf.
section 5.3.1), it is important to note here that the measurement context of an observable of particle 1 is not determined by the parameter of the measurement arrangement of the other particle alone (if it is determined by that parameter at all), but that, as assumed in (10.35), the measurement result of this observable may also depend on a parameter specifying which measurement is carried out on particle 1. This latter dependence is seldom explicitly taken into account. It may be important, though, as an additional assumption next to the locality assumption: independence from the parameter of the other particle’s measurement arrangement does not imply independence from the parameter of the particle’s own arrangement. For instance, the latter parameter could specify the compatible standard observables of particle 1 itself, measured simultaneously with the observable in question. The Kochen-Specker proof (cf. section 10.2.3) suggests the necessity of such a local contextualism in a realist interpretation of quantum mechanics; in an empiricist interpretation the possibility is evident that a measurement result of an observable may depend on the measurement arrangement for that very observable. As argued in section 5.3.1, it does not seem sensible to accept the nonlocal kind of contextuality implied by a dependence on the parameter of the other particle’s measurement arrangement. On the other hand, it does make sense to consider local contextualistic hidden-variables theories satisfying (10.36).
As already stressed by Bell [263], his proof is fully applicable to contextualistic local hidden-variables theories satisfying (10.36): in the proof nothing changes if the dependence on the local measurement arrangement is taken into account. In the quasi-objectivistic representation (10.13) of probabilities this is taken into account by allowing for a dependence of the conditional probabilities on the measurement procedure. Notwithstanding the influence of the measurement, by relation (10.18) the measurement result is attributed to an initial hidden-variables state $\lambda$ of the object (cf. section 10.4.2). In contrast to the measurement result, the value of hidden variable $\lambda$ is considered to be an objective property of the object. Then for the standard EPR-Bell experiments the Bell inequality follows from the existence of quadruples of measurement results in precisely the same way as in objectivistic local theories. More generally, for stochastic theories this can be seen as a consequence of the quasi-objectivity implied by the fact that by the conditional probabilities the quantum mechanical measurement results are conditioned on hidden variable $\lambda$. This is independent of whether the conditional probability depends on the measurement arrangement or not. Hence, contextuality does not seem to yield a solution either.
Footnote 9: Bell [435] already demonstrated that determinism is not necessary for a derivation of the Bell inequality.
iv) Reproducibility of $\lambda$
In the derivations of the Bell inequality considered above, the measurement results of different EPR-Bell experiments are taken at identical values of $\lambda$. This makes sense only if it is assumed that identical values of $\lambda$ can be prepared in different EPR-Bell measurements (de Muynck et al. [377]). Impossibility of such a reproducibility would prevent the existence of quadruples of measurement results (compare sections 9.5.4 and 9.5.5), thus blocking a derivation of the Bell inequality. The main problem with the ‘non-reproducibility’ hypothesis, preventing ‘identical preparation’ of $\lambda$ in different EPR-Bell experiments, is that there is no clear physical reason for it. Admittedly, ‘non-reproducibility’ is satisfied in a de facto way, since it is virtually impossible to actually prepare the same value of $\lambda$ in different preparations, a single value constituting a ‘measure zero’ subset of $\Lambda$. Since a repeated preparation would have probability zero the Bell inequality would be satisfied only with probability zero, too, and the problem of violation of the inequality would be virtually nonexistent. Yet, this way out is not completely satisfactory because there is no obvious physical reason why different preparations could not have nonzero probability of being so close together that deviations from quantum mechanics could be observed. The ‘non-reproducibility’ argument would be more convincing if a physical reason could be ascertained. This line will be pursued in the following. It is important to keep in mind that ‘identical preparation’ in different measurements of incompatible observables is a crucial issue in deriving the Bell inequality for the standard EPR-Bell experiments (cf. section 9.5.4). The necessity of incompatibility for violation of the Bell inequality suggests ‘incompatibility’ as a cause of ‘non-reproducibility’. However, since $\lambda$ refers to preparation rather than to measurement, it seems that a connection with ‘incompatibility’ is absent. This subject will be discussed further in section 10.6, where a connection between ‘non-reproducibility’ (suitably modified) and ‘incompatibility’ will be established so as to provide a physical explanation based on the distinction between ‘complementarity in measurement’ and ‘complementarity in preparation’ found in sections 7.10.3 and 9.5.5 to be useful in characterizing ‘complementarity’. In this approach a fifth additional assumption, viz. the assumption of quasi-objectivity, will be involved in preventing a derivation of the Bell inequality in local hidden-variables theories.
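The role that quadruples of measurement results play in items ii) to iv) can be made concrete with a short enumeration. The sketch below is only an illustration under stated assumptions: results are taken to be dichotomic (+1 or -1), and the familiar CHSH combination is used as a stand-in for the Bell inequality (9.12), whose exact arrangement of terms may differ. The names `chsh_value` and `all_quadruples` are hypothetical.

```python
from itertools import product

def chsh_value(a1, b1, a2, b2):
    """CHSH combination for one quadruple of results, each +1 or -1.

    a1, b1 are results for particle 1; a2, b2 are results for particle 2.
    The choice of which term carries the minus sign is an assumption.
    """
    return a1 * a2 + a1 * b2 + b1 * a2 - b1 * b2

def all_quadruples():
    """All 16 assignments of +1/-1 to the four observables."""
    return list(product((+1, -1), repeat=4))

# Every individual quadruple gives exactly +2 or -2, so any statistical
# mixture of quadruples has an average between -2 and +2.
values = [chsh_value(*q) for q in all_quadruples()]
extremes = (min(values), max(values))
print(extremes)
```

If quadruples cannot be formed, for instance because identical values of the hidden variable are not available in the four different experiments, this bound on the average has no basis, which is the point at issue in item iv).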
10.5.3 The Clauser-Horne derivation
In the derivation of the Bell inequality by Clauser and Horne [383] a stochastic hidden-variables theory is considered, in which the relation (10.13) between the hidden variable and the measurement result is not deterministic but stochastic. This means that for a given preparation of the object in hidden-variables state $\lambda$ the value of the measurement result of observable A is not fixed in a deterministic way, but that only the probabilities $p(a|\lambda)$ are defined that for this preparation a certain value $a$ of A is found. Clauser and Horne are not very explicit about the cause of this stochasticity. According to them the conditional probabilities $p(a|\lambda)$ should be associated with an ensemble of objects, all being prepared with the same value of $\lambda$ but nevertheless yielding different measurement results of the observable. They refer to their theory as an “objective” hidden-variables theory, to stress that the value of the hidden variable must be considered as a kind of classical property of the object, to be attributed to it independently of the measurement to be performed later, and determined only by the preparation procedure. The observable is considered by them in the empiricist sense of section 2.2: they refer to reactions triggered in measurement apparatuses. In any case, the conditional probabilities $p(a|\lambda)$ may depend on the measurement procedure for A. The derivation by Clauser and Horne is valid for all quasi-objectivistic hidden-variables theories discussed in section 10.4. An important remark made by Clauser and Horne is that the theory discussed by them need not be complete in the sense discussed in section 4.2.1. It is very well possible that $\lambda$ yields only an incomplete specification of the state of the object. However, all conclusions to be drawn from the theory remain valid in case of incompleteness, provided the state $\lambda$ corresponds to an objective preparation, i.e. a preparation which is independent of the measurement.
This implies that for all possible quantum mechanical observables the conditional probabilities are defined on the same hidden-variables space $\Lambda$. This remark, which is just implicit in [383], will play an important role in section 10.6, because here we have an additional assumption in the theory that, although seemingly self-evident, is not trivial at all, and that regards the quasi-objectivity assumed in hidden-variables theories satisfying (10.13).
In case of a joint measurement of two observables $A_1$ and $A_2$ we obtain, analogously to (10.13),

$$p(a_1, a_2) = \int_\Lambda d\lambda\, \rho(\lambda)\, p(a_1, a_2|\lambda), \qquad (10.37)$$

with $p(a_1, a_2|\lambda)$ the conditional probability of the pair $(a_1, a_2)$ given preparation $\lambda$. For EPR-Bell experiments, in which two observables are jointly measured in causally disjoint regions of space-time (i.e. the measurements take place outside each other’s light cones, cf. figure 1.1b), Clauser and Horne assume conditional statistical independence of the measurements of particles 1 and 2, viz.

$$p(a_1, a_2|\lambda) = p(a_1|\lambda)\, p(a_2|\lambda), \qquad (10.38)$$

and analogously for the other EPR-Bell experiments. Now $\lambda$ is the initial hidden-variables state of the two-particle system 1 + 2, if necessary to be specified according to $\lambda = (\lambda_1, \lambda_2)$. Equality (10.38) expresses that the stochastic process leading from the initial hidden-variables state $\lambda$ to the measurement results $a_1$ for particle 1 is statistically independent of the process leading for particle 2 from $\lambda$ to $a_2$. In the context of an EPR-Bell experiment this can be interpreted as a locality condition: for given $\lambda$ the measurement process of particle 1 is not influenced by the measurement process of particle 2 (and vice versa). In a local theory correlation between the measurement results of particles 1 and 2 should exclusively be a consequence of joint preparation, already contained in $\lambda$. The locality assumption (10.38) is a plausible condition if the measurements take place outside each other’s light cones. By (10.14) and (10.38) it follows that

$$\sum_{a_2} p(a_1, a_2|\lambda) = p(a_1|\lambda), \qquad \sum_{a_1} p(a_1, a_2|\lambda) = p(a_2|\lambda),$$
thus causing the univariate probabilities in the BCHS inequality to be uniquely defined, and warranting that the hidden-variables theory reproduces the quantum mechanical postulate of local commutativity (cf. section 1.3). From (10.37) and (10.38) the Bell-Clauser-Horne-Shimony (BCHS) inequality, given by (9.10), can be derived in the following way:

Proof of the Bell-Clauser-Horne-Shimony (BCHS) inequality: Start from four numbers $x_1, x_2, y_1, y_2$ satisfying

$$0 \le x_1, x_2, y_1, y_2 \le 1. \qquad (10.39)$$

Consider the quantity

$$U = x_1 y_1 - x_1 y_2 + x_2 y_1 + x_2 y_2 - x_2 - y_1 = -(1 - x_2)\,[(1 - x_1)\, y_1 + x_1 y_2] - x_2\,[x_1 (1 - y_1) + (1 - x_1)(1 - y_2)]. \qquad (10.40)$$

Since a convex combination of two probabilities is once again a probability, it immediately follows from the right-hand side of this expression that

$$-1 \le U \le 0. \qquad (10.41)$$

This inequality remains valid if it is averaged over an ensemble of numbers satisfying (10.39). Then, putting

$$x_1 = p(a_1|\lambda), \quad x_2 = p(b_1|\lambda), \quad y_1 = p(a_2|\lambda), \quad y_2 = p(b_2|\lambda) \quad \text{for all } \lambda,$$

the BCHS inequality (9.10) follows directly from (10.37), (10.38) and (10.41).
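The four-number lemma at the heart of this proof is easy to check numerically. The following sketch assumes the standard Clauser-Horne combination of four numbers in the unit interval and one particular way of writing it as a nested convex combination; the function names and the symbols `x1, x2, y1, y2` are introduced here for illustration and need not match the notation of the original displayed formulas.

```python
import random

def ch_quantity(x1, x2, y1, y2):
    """Clauser-Horne combination of four numbers in [0, 1] (assumed form)."""
    return x1 * y1 - x1 * y2 + x2 * y1 + x2 * y2 - x2 - y1

def ch_convex_form(x1, x2, y1, y2):
    """The same quantity written as minus a convex combination.

    inner0 and inner1 are each convex combinations of numbers in [0, 1],
    and the outer mix with weights (1 - x2) and x2 is convex as well,
    so the result must lie between -1 and 0.
    """
    inner0 = (1 - x1) * y1 + x1 * y2
    inner1 = x1 * (1 - y1) + (1 - x1) * (1 - y2)
    return -((1 - x2) * inner0 + x2 * inner1)

rng = random.Random(0)  # fixed seed so the check is repeatable
samples = [tuple(rng.random() for _ in range(4)) for _ in range(10000)]

# Largest disagreement between the two ways of writing the quantity.
max_gap = max(abs(ch_quantity(*s) - ch_convex_form(*s)) for s in samples)
lowest = min(ch_quantity(*s) for s in samples)
highest = max(ch_quantity(*s) for s in samples)
print(max_gap < 1e-12, lowest >= -1.0, highest <= 1e-12)
```

Because the bound holds for every choice of the four numbers, it survives averaging over any distribution of hidden-variable values, which is the step that turns the lemma into an inequality for observable probabilities.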
We have already seen in section 9.2.2 that for dichotomic observables with eigenvalues +1 and -1 the Bell inequality (9.12) can be derived from the BCHS inequality (9.10). This demonstrates that in the derivation of section 10.5.2 the assumption of determinism is not essential. Quantum mechanics is in disagreement with local stochastic hidden-variables theories of the type discussed here, just as it is with deterministic theories, which just constitute a subset of the set of stochastic hidden-variables theories. In view of the fact that the Bell inequality essentially is a special case of the BCHS inequality, neither determinism nor the restriction to dichotomic observables is necessary for a test of the hidden-variables theories discussed here. For this reason the significance of the BCHS criterion is much larger than the (not unimportant) generalization from deterministic to stochastic hidden-variables theories: due to its independence of the values of the observables the BCHS criterion is also applicable in an empiricist interpretation of quantum mechanics, even if the measurement process would be deterministic.
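The disagreement with quantum mechanics can be illustrated with the textbook example of two spin-1/2 particles in the singlet state. The sketch below assumes the Clauser-Horne arrangement of terms (the book's inequality (9.10) may order them differently) and uses the standard singlet-state probabilities; the analyser angles are one convenient choice, and all function and variable names are hypothetical.

```python
import math

def p_joint(theta_1, theta_2):
    """Singlet-state probability that both spin-1/2 analysers give +1.

    theta_1 and theta_2 are the analyser directions (radians) for
    particles 1 and 2, taken to lie in a common plane.
    """
    return 0.5 * math.sin((theta_1 - theta_2) / 2.0) ** 2

P_SINGLE = 0.5  # probability of +1 for either particle alone

def ch_expression(a, a_prime, b, b_prime):
    """Clauser-Horne combination; local theories keep it in [-1, 0]."""
    return (p_joint(a, b) - p_joint(a, b_prime)
            + p_joint(a_prime, b) + p_joint(a_prime, b_prime)
            - P_SINGLE - P_SINGLE)

# Analyser settings chosen so that three of the angle differences are
# 3*pi/4 (mod 2*pi) and the remaining one is pi/4.
angles = (0.0, 1.5 * math.pi, 0.75 * math.pi, 0.25 * math.pi)
value = ch_expression(*angles)
print(round(value, 4))
```

For these settings the combination evaluates to $(\sqrt{2} - 1)/2$, roughly 0.207, which lies above the upper bound of zero.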
Existence of a quadrivariate probability distribution
The derivability of the Bell inequality in a quasi-objectivistic local stochastic hidden-variables theory can be viewed in light of the theorem, proven in section 9.2, that existence of a quadrivariate probability distribution is sufficient for such a derivation. We saw in section 9.3.2 that, due to incompatibility of some of the observables, quantum mechanics does not yield any reason to surmise the existence of a quadrivariate probability distribution having the bivariate probabilities of EPR-Bell experiments as marginals. An analogous account can be given in quasi-objectivistic hidden-variables theories. Analogously to (9.22), for each of the four standard EPR-Bell experiments (considered as generalized measurements) the quadrivariate probability (9.20) can be represented in the hidden-variables theory according to
Here a possible dependence of the conditional probabilities on the measurement arrangements is taken into account, as well as a locality condition analogous to (10.38). Inequality (9.23) now follows from the inequality
valid if the observables concerned are incompatible. Hence, although a quadrivariate probability distribution exists for each of the standard EPR-Bell experiments separately (as well as for the generalized experiment discussed in section 9.3), no Bell inequality can be derived from it, which is valid for the four standard EPR-Bell experiments taken together. This holds true notwithstanding the locality of the theory as expressed by the condition of conditional statistical independence. In contrast to quantum mechanics, however, in which theory no further resources are available to derive the Bell inequality for the four standard EPR-Bell experiments, taken together, quasi-objectivistic local stochastic hidden-variables theories do provide such a resource. Although also in the latter theories no quadrivariate probability distribution can be found that is actually realized in all of the four EPR-Bell experiments, it is possible to construct one in a theoretical way. The reasoning is completely analogous to that of Stapp, used in his “nonlocality proof” discussed in section 9.5.1. The idea is that in the bivariate conditional probability (10.38) quantity $p(a_1|\lambda)$ is independent of the observable measured on particle 2. For this reason in the four bivariate probabilities (10.37) only four univariate conditional probabilities actually occur. Using these it is simple to construct a quadrivariate probability distribution yielding the four bivariate probabilities as marginals:

$$p(a_1, b_1, a_2, b_2) = \int_\Lambda d\lambda\, \rho(\lambda)\, p(a_1|\lambda)\, p(b_1|\lambda)\, p(a_2|\lambda)\, p(b_2|\lambda). \qquad (10.42)$$
This expression should be compared to (9.19). From the existence of (10.42) it is, once again, possible to derive the Bell inequality. This remains true notwithstanding the (local) contextuality involved in the dependence of the conditional probabilities $p(a_1|\lambda)$ etc. on the measurement arrangement. For many physicists the validity of the Bell inequality for stochastic (hence, not only deterministic) local hidden-variables theories has been a persuasive argument to think that only two possible choices are left: either reject all hidden-variables theories, or accept that the reality described by quantum mechanics has a nonlocal character. Thus, Clauser and Shimony [262] remark: “The conclusions are philosophically startling: either one must totally abandon the realistic philosophy of most working scientists, or dramatically revise our concept of space-time.” It is quite remarkable that the second alternative has gained wide acceptance, notwithstanding that positivism was (and presumably still is) very popular among physicists, who, for this reason, should hardly be embarrassed by the first alternative. However, presumably the “realist philosophy of most working physicists” is so strong (see also sections 2.3 and 2.4) that it is prepared to take for granted even a nonlocal reality allowing unobservable superluminal influences. The (false) idea that nonlocality has been experimentally proven by the Aspect experiments (compare section 9.5.2) may have contributed to this development. However, as will be seen in section 10.6, there exists a third alternative next to the ones observed by Clauser and Shimony. As a matter of fact, the role of the locality condition in the derivation of the Bell inequality remains unclear also in stochastic hidden-variables theories. Indeed, for such a derivation it is not at all necessary that the quadrivariate probability have the form (10.42). Thus, the bivariate marginals of the probability distribution

$$p(a_1, b_1, a_2, b_2) = \int_\Lambda d\lambda\, \rho(\lambda)\, p(a_1, b_1, a_2, b_2|\lambda)$$

satisfy the Bell inequality without any requirement of locality being posed, because the mere existence of the quadrivariate probability is sufficient (cf. section 9.2). This seems to suggest that satisfaction of the Bell inequality is rather connected with the existence of the conditional probabilities $p(a_1, b_1, a_2, b_2|\lambda)$, i.e. with the assumption that the measurement results of the four observables $A_1$, $B_1$, $A_2$ and $B_2$ can be conditioned on a value of $\lambda$. An analogous conclusion was already arrived at in our analysis of Stapp’s “nonlocality proof” (cf. section 9.5.2), although we were hindered there by the fact that an individual preparation cannot be represented in quantum mechanics in an unambiguous way. The big advantage of hidden-variables theories is that such a representation is possible. This makes it much more feasible to study the role of the individual preparation, and to see to what extent the derivability of the Bell inequality may be a consequence of the special properties we attribute to the hidden variable $\lambda$. It can be imagined that, notwithstanding the generality of the stochastic theories considered in the present section, additional assumptions are made (next to the locality condition), which are too strong to be valid within the domain of quantum mechanics. The assumption of ‘quasi-objectivity’ seems to be a candidate in this respect. This assumption will be scrutinized in section 10.6. We shall see there that local hidden-variables theories are feasible which are more general than the quasi-objectivistic theories considered by Clauser and Horne. These theories do not support the existence of joint probability distributions like (10.42). Hence, derivation of a Bell inequality to be violated by the standard EPR-Bell experiments, taken together, is impossible in such theories. Quasi-objectivity can be seen as the real cause that the Bell inequality can be derived. Hence, to be able to reproduce quantum mechanics by a local hidden-variables theory we should get rid of quasi-objectivity.
This is less far-fetched than it appears to be at first sight. In section 10.6.4 we shall see that non-quasi-objectivistic theories are not uncommon in statistical physics, and that it, therefore, is far too early to conclude that there cannot exist local hidden-variables theories which are in agreement with quantum mechanics. Such a conclusion would be based on an unnecessary restriction to (quasi-)objectivistic hidden-variables theories.
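The construction of a quadrivariate distribution from four univariate conditional probabilities, discussed in this section, can be sketched for a toy model. Everything in the example is an assumption made for illustration: the hidden variable takes a handful of discrete values with random weights, each observable has results 0 or 1 with randomly chosen conditional probabilities, and the names (`rho`, `p_one`, `quadrivariate`, and so on) are hypothetical. The Clauser-Horne arrangement of terms is likewise assumed.

```python
import random
from itertools import product

rng = random.Random(1)          # fixed seed so the run is repeatable
N_LAMBDA = 5                    # number of discrete hidden-variable values
OBSERVABLES = ("a1", "b1", "a2", "b2")

# Normalised weights rho(lambda_k) of the hidden-variable values.
raw = [rng.random() for _ in range(N_LAMBDA)]
rho = [w / sum(raw) for w in raw]

# p_one[obs][k]: probability of result 1 for observable obs, given lambda_k.
p_one = {obs: [rng.random() for _ in range(N_LAMBDA)] for obs in OBSERVABLES}

def cond(obs, result, k):
    """Univariate conditional probability of `result` (0 or 1) given lambda_k."""
    p = p_one[obs][k]
    return p if result == 1 else 1.0 - p

def bivariate(obs_1, obs_2, r1, r2):
    """Bivariate probability with conditional statistical independence."""
    return sum(rho[k] * cond(obs_1, r1, k) * cond(obs_2, r2, k)
               for k in range(N_LAMBDA))

def quadrivariate(results):
    """Product-form joint distribution of all four results."""
    total = 0.0
    for k in range(N_LAMBDA):
        term = rho[k]
        for obs, r in zip(OBSERVABLES, results):
            term *= cond(obs, r, k)
        total += term
    return total

def marginal(obs_1, obs_2, r1, r2):
    """Bivariate marginal obtained by summing the quadrivariate distribution."""
    i, j = OBSERVABLES.index(obs_1), OBSERVABLES.index(obs_2)
    return sum(quadrivariate(q) for q in product((0, 1), repeat=4)
               if q[i] == r1 and q[j] == r2)

def single(obs):
    """Univariate probability of result 1."""
    return sum(rho[k] * p_one[obs][k] for k in range(N_LAMBDA))

PAIRS = (("a1", "a2"), ("a1", "b2"), ("b1", "a2"), ("b1", "b2"))
max_deviation = max(abs(marginal(o1, o2, r1, r2) - bivariate(o1, o2, r1, r2))
                    for o1, o2 in PAIRS for r1 in (0, 1) for r2 in (0, 1))

ch_value = (bivariate("a1", "a2", 1, 1) - bivariate("a1", "b2", 1, 1)
            + bivariate("b1", "a2", 1, 1) + bivariate("b1", "b2", 1, 1)
            - single("b1") - single("a2"))
print(max_deviation < 1e-12, -1.0 <= ch_value <= 0.0)
```

The marginals of the constructed distribution reproduce all four bivariate probabilities, and the Clauser-Horne combination stays between -1 and 0. The construction relies on each conditional probability being defined for the same hidden-variable values in all four experiments, which is exactly the quasi-objectivity assumption questioned in the text.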
10.6 Non-quasi-objectivistic hidden-variables theories
10.6.1 Farewell to quasi-objectivism
In the previous sections we assumed a kind of objectivism with respect to quantum mechanical measurement results, in section 10.4 referred to as quasi-objectivism, to the effect that, although a quantum mechanical observable is not considered to be an objective property of the microscopic object, it can be conditioned through the conditional probability $p(a|\lambda)$ on a value of hidden variable $\lambda$, which was considered to be an objective property, possessed by the object at the moment of its preparation, and independent of the measurement to be performed later. This assumption is implicitly contained in the Clauser-Horne stochastic hidden-variables theory, and in the derivation of the Bell inequality from this theory. It is important to realize that quasi-objectivity, too, is an additional assumption next to locality. As a matter of fact, the assumption of quasi-objectivity may very well be the crucial assumption allowing a derivation of the Bell inequality. Thus, the construction of joint probability distribution (10.42) is possible because conditional probabilities like $p(a_1|\lambda)$ are thought to exist. Such an existence is not self-evident, however. In pasting together measurement results obtained in different EPR-Bell experiments, the assumption of (quasi-)objectivity is a natural one to try first because it is closest to the objectivism of the classical paradigm (cf. section 2.4.2). However, as will be seen in the present section, it is not the only one, and perhaps not even a plausible one. For this reason quasi-objectivity is a genuine additional assumption, to be taken into account in logical reasonings like (9.38). We shall now discuss why the additional assumption of quasi-objectivity might be liable to be abandoned. What is actually cast into doubt here is the physical relevance of conditional probabilities $p(a|\lambda)$ within the domain of quantum mechanics.
One reason for doubting such a relevance is the hybrid character of the quantities $p(a|\lambda)$, connecting a quantum mechanical term to a term of a hidden-variables theory. Hence, $p(a|\lambda)$ is composed of terms from different languages. Such concepts may not be well-defined. It is not at all clear that within the domain of quantum mechanics the term $p(a|\lambda)$ has a well-defined and unique meaning. In the present section the hypothesis will be defended that, indeed, $p(a|\lambda)$ does not have such a meaning because a quantum mechanical measurement is not able to yield information on an instantaneous (hidden-variables) state of the object. Quantum mechanical measurement
results do not refer to instantaneous properties of the microscopic object, but are results of processes in which object and measuring instrument interact during some time. Quantum mechanical measurements do not allow an operational attribution of a measurement result to an instant of time so precisely defined that it is reasonable to assume that a well-defined value of $\lambda$ was responsible for its coming into being. In a quantum mechanical measurement the initial value of $\lambda$, corresponding to the initial time of the measurement, is not operationally well-defined. We have still another reason, derived from the mathematical formalism, to surmise the inapplicability of conditional probabilities $p(a|\lambda)$ in the quantum mechanical domain. Thus, as demonstrated by (10.25), for quasi-objectivistic theories this quantity may be related to Green’s function of a ‘master’ equation. However, as was found in section 1.11.7, for stochastic equations yielding a phase space representation of the quantum mechanical formalism Green’s function may not exist. This holds true, for instance, for the Husimi representation (cf. (1.172)), even though this representation allows a “classical” statistical interpretation for special choices of the initial states (viz. those corresponding to quantum mechanical states). Although it is not at all clear whether the Husimi representation has a realist meaning next to the empiricist one attributed in this book to the quantum mechanical formalism, and, therefore, it need not be literally interpretable as a description of a stochastic process underlying quantum mechanics, it can yet serve as an additional reason to doubt that subquantum states of the type $\lambda$ do have a meaning within the domain of quantum mechanics. Denying the relevance of $p(a|\lambda)$ to quantum mechanical measurements amounts to a strengthening of the quantum mechanical rejection of the ‘possessed values’ principle (cf.
section 6.4.2): not only is it impossible to attribute to a microscopic object a value of a quantum mechanical observable, possessed prior to measurement; it is even impossible to attribute such a value in the sense that it would be determined in a stochastic way by the value a hidden variable possessed at a certain instant. Not only are quantum mechanical measurement results not properties the microscopic object possessed prior to measurement; they cannot even be conditioned on an instantaneous value the hidden variable possessed at some time prior to measurement. This does not imply that a microscopic object would not possess instantaneous properties at all; it just means that such instantaneous properties are not described by quantum mechanical observables (compare section 6.4.3). It does also not mean that it would be impossible to obtain information on such instantaneous (subquantum) properties; it only means that measurements within the domain of quantum mechanics do not yield such information. Nor does it imply that no causal relation could exist between such an instantaneous (subquantum) property of the object and a quantum mechanical measurement result. It means only that, due to certain limitations inherent in the concept of a quantum mechanical measurement, such a causal
relation is not described by the quantum mechanical formalism. Quasi-objectivistic theories might be applicable outside the domain of application of quantum mechanics, in which case the Bell inequality should be satisfied because of (10.42). Two different aspects are involved in a recognition that quantum mechanical measurements cannot be described by quasi-objectivistic hidden-variables theories:
i) Quantum mechanics does not yield instantaneous information on the microscopic object;
ii) Quantum mechanics does not yield objective (i.e. non-contextual, cf. section 2.3), but only contextual information on the microscopic object.
The first point could be implemented by assuming that a quantum mechanical measurement result should be conditioned on a trajectory rather than on an instantaneous value of the hidden variable. In this view a quantum mechanical measurement result just provides time averaged information (averaged over a certain part of the trajectory). For instance, in an optical measurement the interaction time between object and detector should be longer than the optical period in order for the measurement result to be interpretable in terms of e.g. the number observable of a mode of the electromagnetic field. The second point can be accommodated by assuming that the relevant part of the trajectory is dependent on the measurement arrangement.
Taking the two points together, it follows that in the context of a quantum mechanical measurement of observable A an individual preparation is represented in the subquantum theory by a contextual trajectory In the following we shall denote a generic trajectory of this kind by and assume that conditioning of quantum mechanical measurement result of observable A should be implemented according to rather than The transition from conditional probabilities to marks the essential difference between quasi-objectivistic and non-quasi-objectivistic subquantum theories underpinning quantum mechanics, in which an individual preparation is not represented by an instantaneous value but by a trajectory These trajectories can be conceived as representing in the subquantum theory the ‘individual contextual states’ introduced in section 9.5.5. If it is necessary to take into account statistical distributions of individual contextual states, then (10.13) should be replaced by the functional integral
in which is the space of ‘individual contextual states’ possible in the context of a measurement of observable A. In the same vein bivariate probabilities of EPR-Bell experiments can be given according to
where and represent ‘individual contextual states’ of particles 1 and 2, respectively, and in which, analogously to (10.38), locality is represented by an assumption of conditional statistical independence. If observable B is incommeasurable with A, then in general the ‘individual contextual states’ will differ from Instead of (10.43) we have
probability space being different from The fact that the probability spaces and are different reflects the Copenhagen idea of ‘complementarity’ in the sense of ‘mutually exclusive measurement arrangements’ (cf. section 4.4).
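The structure of the expressions referred to above can be sketched as follows. The notation is an assumption introduced here purely for illustration and is not necessarily the notation of the original equations: $\lambda_A$ stands for a contextual trajectory in the context of a measurement of A, $\Lambda_A$ for the corresponding space of ‘individual contextual states’, $\rho$ for a statistical distribution over such states, and $a_m$, $b_n$ for measurement results. The first line has the form of a functional integral over contextual states, the second expresses locality as conditional statistical independence, and the third indicates that an incompatible observable B requires a different probability space.

```latex
% Sketch in assumed notation (requires amsmath); not the book's own equations.
\begin{align}
  p(a_m) &= \int_{\Lambda_A} d\lambda_A \, \rho_A(\lambda_A) \, p(a_m \mid \lambda_A), \\
  p(a_m, b_n) &= \int_{\Lambda_{1A}} d\lambda_{1A} \int_{\Lambda_{2B}} d\lambda_{2B} \,
      \rho(\lambda_{1A}, \lambda_{2B}) \, p(a_m \mid \lambda_{1A}) \, p(b_n \mid \lambda_{2B}), \\
  p(b_n) &= \int_{\Lambda_B} d\lambda_B \, \rho_B(\lambda_B) \, p(b_n \mid \lambda_B),
      \qquad \Lambda_B \neq \Lambda_A .
\end{align}
```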
10.6.2 Implications of non-quasi-objectivism
Contextual states in quantum mechanics and in hidden-variables theory
The crucial point is that within the domain of quantum mechanics it is not allowed to consider a preparation of the object as independent of the measurement actually carried out. In section 9.5.5 a distinction was drawn between preparation of an ‘(individual) objective initial state’ and preparation of an ‘individual contextual state’. Unfortunately, in the quantum mechanical formalism there is no entity which can represent an ‘individual objective initial state’. In a hidden-variables theory such a state can be represented by It was already argued in section 9.5.5 that the ‘objective initial state’ may not be appropriate for serving as an initial state of a quantum mechanical measurement. Instead, the contextual state was conjectured to be more appropriate. Even if starting with a well-defined value of the hidden variable, we shall have to take into account the possibility that a contextual state will have to be assumed as the initial state on which a measurement result of a quantum mechanical measurement can be conditioned. Such an assumption is not unreasonable. In general, quantum mechanical measuring instruments are not able to “see” certain details. Thus, a measuring instrument for the measurement of a standard observable is not sensitive to the ‘cross’ terms in the density operator (cf. (1.75)). One could imagine that information about the ‘cross’ terms is wiped out because the hidden-variables state adapts itself to the measurement arrangement as soon as the object comes into contact with it. As far as measurement results of the actually measured quantum mechanical observable are concerned, a description of the object by means of this adapted state is appropriate.
In the context of hidden-variables theories this latter state is the ‘individual contextual state’ This implies that quantum mechanical probabilities should not be conditioned on the ‘objective initial state’ (as is done in quasi-objectivistic
theories) but on Thus, it may be assumed that, instead of the conditional probability a probability is relevant to a measurement of quantum mechanical observable A. This implies that the assumption of quasi-objectivity is not appropriate for quantum mechanical measurements. Quantum mechanical measurements simply do not provide information on objective reality. This holds true for an ensemble as well as for an individual object. By (10.43) and (10.44) it is taken into account that quantum mechanical measurements can yield only contextual information, and that this physical circumstance does not change if the description is refined by introducing hidden variables. The contextual hidden-variables state plays an analogous role with respect to an individual object as does the quantum mechanical contextual state introduced in section 2.4, with respect to an ensemble. As a matter of fact, the (modal) realist interpretation of contemplated in section 6.6.2, can be seen as referring to the contextual reality of an ensemble of individual objects described by different ‘individual contextual states’ corresponding to different values of A. The term in can describe the quantum mechanical preparation of a subensemble of individual objects with contextual state The probabilities correspond to the relative frequencies of the states Within the fixed context of a measurement of this observable these probabilities satisfy the Kolmogorovian rules of classical probability theory (cf. appendix A.12). In case of a measurement of a standard observable A we can look upon the creation of an ‘individual contextual state’ as a process in which the value of quantum mechanical observable A is created in Jordan’s sense (cf. section 6.2.2). The hidden-variables theory provides a possibility of investigating to what extent a quantum mechanical measurement result can also be seen as a property of the microscopic object. 
The present model is just an implementation of a contextualistic-realist interpretation of quantum mechanics (compare section 9.4.2), to the effect that quantum mechanics is thought not to describe objective reality but just reality as far as it is in contact with a measuring instrument, represented in the hidden-variables theory by a contextual state. A similar idea has recently been developed by Accardi and Regoli [436] to prove that the Bell inequality can be violated in classical mechanics, without any nonlocality being involved. Their ‘chameleon model’ violates (quasi-)objectivity in the sense that the state of a chameleon, responsible for its decision which color to take on, is dependent on the measurement arrangement: when sitting on a (green) leaf the chameleon will take on a color different from the one it takes on when sitting on a (brown) log. The crucial point is that the decision to adopt a certain color depends on the state the chameleon is in after it has come into interaction with the measurement arrangement. This state is comparable to or in case of measurement arrangement A (leaf) or B (log), respectively. In the chameleon model conditioning on the objective (initial) state the chameleon was in prior to its interaction with the
measuring arrangement is impossible because the measurement result does not refer to a ‘passive’ reality, created before any measurement, but to an ‘adaptive’ reality, created during the measurement. Like a quantum mechanical measuring instrument the chameleon conveys only contextual information, referring to a reality that is influenced by the measurement arrangement.
‘Correspondence’ and ‘complementarity’ in non-quasi-objectivistic hidden-variables theory
The difference between and reflects the criticism advanced in section 5.3.1 against Einstein’s ‘element of physical reality’, if taken as a value of a quantum mechanical observable. If ‘elements of physical reality’ exist at all, they should be objective properties of the microscopic object, to be described by (a function of) . They cannot correspond to values of quantum mechanical (standard) observables, since these are conditioned on contextual states rather than objective states As far as reality is described by quantum mechanics it is (co-)determined by the measurement arrangement not exclusively in a determinative, but also in a preparative sense. Since contextual states are relevant only if the measurements are actually carried out, the notions of ‘quantum mechanical observable’ and of ‘element of physical reality’ apply in different experimental contexts. This might be compared with Bohr’s principles of ‘correspondence’ and ‘complementarity’. A quantum mechanical observable is defined only within the measurement context of that observable; incompatible observables are defined in mutually exclusive arrangements. In contrast to Bohr’s ideas, measurement can be analyzed in a hidden-variables theory, viz. in terms of contextual states, thus also accounting for the realist attitude with respect to observables discussed in section 4.3.3 (compare also the modal interpretation, discussed in section 6.6). In a hidden-variables model Bohr’s epistemic views can obtain an ontic extension. Indeed, the possibility of defining a quantum mechanical observable (cf. section 4.3) is connected in an essential way to the experimental arrangement.
The experimental arrangement must be actually present to be able to exert its defining function by realizing the contextual state This is in complete agreement with the criticism of Bohr’s answer to EPR as articulated in section 5.3.1: also from the perspective of a non-quasi-objectivistic hidden-variables theory the value of a quantum mechanical observable does not exist if no measurement is performed. In the EPR experiment the measurement context is determined for each particle by the measurement arrangement for that very particle (cf. section 9.4.2), and the relevant observable is defined only if it is actually measured. As already discussed in section 5.3.1, this observation is sufficient to prevent Einstein’s conclusion of nonlocality. It is particularly important that in a hidden-variables theory it is possible to distinguish between ‘measurement’ and ‘preparation’ when considering the dependence on the measurement arrangement: the measurement set-up may not only influence the measurement result, thus causing a certain nonideality of the measurement (in the sense defined in section 7.6.2), it may even influence preparation as far as relevant to quantum mechanical measurement. Quasi-objectivity is not applicable within the domain of quantum mechanics because preparation, as far as relevant to the measurement procedure, cannot be considered as non-contextual. Incommeasurable observables have distinct contextual states because they are physically established in interaction with distinct (mutually exclusive) measurement arrangements. Conditioning of values of incompatible standard observables on one and the same contextual state is then impossible. Contextuality is at the basis of complementarity in both a determinative and a preparative sense (cf. section 7.10.3). Nevertheless, these manifestations of complementarity are fundamentally different. Whereas the former (e.g. the Martens inequality (7.106)) can be completely understood at the quantum mechanical level as a consequence of the quantum mechanical interaction of object and measuring instrument, the latter (e.g. the Heisenberg inequality) can be fully understood only if the problem of quantum mechanical measurement is considered from the wider perspective of hidden-variables theories. Within the domain of application of quantum mechanics preparation is only meaningful as a process establishing a contextual state. Presumably, the confusion, observed in section 4.7.4 with respect to the distinction between ‘preparation’ and ‘measurement’, is a consequence of not being sufficiently aware of the different ways in which contextuality can be effective. Once again, this does not imply that it would not be possible to prepare the object in a way that is independent of a measurement that is carried out later.
At the subquantum mechanical level we do not have any reason to believe that in the Aspect experiment the preparation of a particle pair at the source is not independent of the choice which of the four experiments will be carried out later. Hence, there is no objection against the assumption that identical initial values of can be assumed in different (incompatible) EPR-Bell experiments. The point is, however, that this is irrelevant within the domain of quantum mechanics if quantum mechanical measurements are not capable of testing whether the initial value of the hidden variable is precisely that value of because they are just capable of probing the contextual state.
Contextual states and the Bell inequality
We can look upon the problem of the Bell inequality as an indication that, indeed, reference to the instantaneous initial value i.e. the assumption of quasi-objectivity, is responsible for the non-quantum mechanical result that the Bell inequality must be satisfied. It seems that within the domain of quantum mechanics the measurement arrangement plays a role not only in the measurement process, but it even cannot be
ignored in the preparation of the object. Stated differently, a quantum mechanical measurement does not yield information (possibly disturbed by the measurement process) about an objective reality, but only about reality as far as it is in interaction with a measuring instrument that is actually present, i.e. a contextual reality. As follows from the results obtained in section 10.5.3, measurements which are so fast that they can yield information on an instantaneous value of must satisfy the Bell inequality. Violation of the Bell inequality can be seen as an indication that a measurement is not fast enough, and is yielding information only on the trajectory It should be realized that by itself the assumption of conditioning on a trajectory is not sufficient to prevent a derivation of the Bell inequality. Thus, if identical trajectories could exist in all EPR-Bell experiments, then, using (10.44), it would be possible to define, analogously to (10.42), for the EPR-Bell experiments the quadrivariate probability distribution
from which the Bell inequality would once again be derivable for local theories. However, the contextuality of the trajectories makes the construction of quadrivariate probability distribution (10.45) impossible if incompatible observables are involved, since, due to complementarity, measurement results of incompatible observables cannot be conditioned on identical trajectories. This prevents a derivation of the Bell inequality for local non-quasi-objectivistic hidden-variables theories on the basis of the existence of a quadrivariate probability distribution like (10.45). Dependence of the ‘individual contextual states’ on the measurement context also yields an explanation of non-reproducibility (compare section 9.5.4): in general it is not possible to prepare the same individual contextual hidden-variables state in the contexts of incommeasurable quantum mechanical observables because the measurement arrangements of these observables are mutually exclusive. This corroborates the physical reason of violation of the Bell inequality found in section 9.2, viz. incompatibility of quantum mechanical observables. The assumption of quasi-objectivity ignores the preparative incompatibility involved in the EPR-Bell experiments. Non-reproducibility of individual states in different EPR-Bell experiments is a consequence of non-preparability: if A and B are incompatible observables, then in general cannot be prepared in the measurement context of a measurement of A because even if starting from the same value The notion of contextuality, involved in the ‘individual contextual state’ differs from Lochak’s criterion (observed in section 10.4.2 to be insufficient) that probability distribution should depend on the measurement: in Lochak’s theory the preparation of hidden variable is thought to be objective, and contextuality is restricted to the measurement. In the present model contextuality is thought to be applicable both to measurement and preparation.
Once again, a distinction
between ‘preparation’ and ‘measurement’ turns out to be crucially important: only contextuality of preparation is capable of evading the nonlocality problem posed by the Bell inequality.
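The role of the quadrivariate probability distribution in the argument above can be illustrated numerically. The following is a minimal sketch, assuming the CHSH form of the Bell inequality and the singlet-state correlation as a concrete instance (neither is spelled out in the text, and all function names are hypothetical). It shows that whenever the four results can be conditioned on one and the same state, so that a joint quadruple of values exists in every run, the CHSH combination cannot exceed 2, whereas the quantum mechanical correlations reach 2√2.

```python
import itertools
import math


def chsh(e_ab, e_ab2, e_a2b, e_a2b2):
    """CHSH combination S = E(A,B) + E(A,B') + E(A',B) - E(A',B')."""
    return e_ab + e_ab2 + e_a2b - e_a2b2


def max_chsh_with_joint_distribution():
    """Largest |S| if a quadruple (a, a2, b, b2) of values +/-1 exists in each run.

    Any quadrivariate probability distribution is a mixture of such quadruples,
    so its |S| cannot exceed the maximum found here.
    """
    best = 0
    for a, a2, b, b2 in itertools.product((-1, 1), repeat=4):
        best = max(best, abs(chsh(a * b, a * b2, a2 * b, a2 * b2)))
    return best


def singlet_correlation(theta1, theta2):
    """Quantum mechanical singlet-state correlation for analyzer angles (assumed example)."""
    return -math.cos(theta1 - theta2)


def quantum_chsh():
    """|S| for a standard choice of analyzer angles maximizing the violation."""
    a, a2 = 0.0, math.pi / 2
    b, b2 = math.pi / 4, -math.pi / 4
    return abs(chsh(singlet_correlation(a, b), singlet_correlation(a, b2),
                    singlet_correlation(a2, b), singlet_correlation(a2, b2)))


if __name__ == "__main__":
    print("bound with joint distribution:", max_chsh_with_joint_distribution())
    print("quantum mechanical value:     ", quantum_chsh())
```

The sketch says nothing about how contextual states come about; it only makes explicit that the bound of 2 rests on the existence of the joint quadruple, which is precisely what contextuality of preparation denies.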
(In)determinism
We must distinguish two kinds of determinism, viz.
i) determinism of the measurement process, as considered in section 10.4.2 when characterizing quasi-objectivistic hidden-variables theories, for non-quasi-objectivistic theories to be modified according to whether the relation between measurement result and ‘individual contextual state’ is either deterministic or stochastic,
ii) determinism of the evolution of hidden variable as implied by (10.18) or (10.21).
These kinds of determinism are rather independent. The first one refers to the measurement process, which may be deterministic in the sense of mapping in a unique way an ‘individual contextual state’ onto a pointer position of a measuring instrument. On the other hand, ‘faithful measurement’ does not imply that the underlying evolution process is deterministic. We shall consider this in more detail in section 10.6.4. In the Copenhagen interpretation these two kinds of (in)determinism are not properly distinguished. Indeterminism is thought to be caused by the measurement. However, it is not at all clear what this could mean if quantum mechanical observables have no values prior to measurement (compare section 4.2.3). In a non-quasi-objectivistic hidden-variables theory the idea of ‘faithful measurement’ (cf. section 2.4.3) can be implemented by an ideal measurement of A, assuming a deterministic relation between the ‘individual contextual state’ and the final pointer position of the measuring instrument. Copenhagen indeterminism can be implemented in the case of a nonideal measurement of standard observable A, in which the nonideality of the measurement procedure can imply indeterminism as represented by the possibility that for It is not clear whether a well-defined value can be attributed to an ‘individual contextual state’ in this case. It might be true, for instance, in the case of neutron interference with deterministic absorption (cf.
section 8.2.3), because here we actually have an ensemble of ideal interference and path measurements. However, in case of stochastic absorption the situation is less evident (compare section 10.6.3). For general measurements such an attribution does not seem to be feasible at all (cf. section 7.10.4). It seems that in an ideal measurement of a standard observable only the second kind of indeterminism can be operative. The question is then whether in a measurement of standard observable A there is a unique relation between the objective initial state and the contextual state From the assumption that such a unique relation exists it would be possible to derive the Bell inequality in a way analogous to Stapp’s reasoning discussed in section 10.4.2. Hence, the precise form of the relation between and is determinative for answering the question of
whether the hidden-variables theory is capable of reproducing quantum mechanical measurement results, or not. As seen in section 10.4.2, in a quasi-objectivistic hidden-variables theory contextuality is not sufficient to prevent a derivation of the Bell inequality. If there existed a functional relation describing a unique dependence of the ‘individual contextual state’ on the individual objective one , then, notwithstanding the influence of the measurement, a conditioning of observable A on a value of the hidden variable could be realized. In that case the theory would remain quasi-objectivistic, and the Bell inequality would have to hold. In order to violate this inequality it is necessary that the measurement procedure does not allow conditioning of the measurement result on an individual objective initial state This requirement actually delimits the domain of application of quantum mechanics. A certain indeterminism of the subquantum mechanical stochastic processes in which ‘(individual) contextual states’ are brought into being is characteristic of this domain (compare Nelson [409], p. 124; see also section 10.6.4). However, this should be distinguished from the Copenhagen probabilistic interpretation, according to which quantum mechanical measurement results are stochastically fluctuating. As will be seen in section 10.6.3, this view is unnecessary and physically not very attractive. The time-averaged quantities corresponding to contextual states need not evolve too stochastically. Stochasticity may be a property valid at the subquantum level of the objective state however. Such stochasticity could be consistent with a deterministic behavior of contextual states, analogously to the way a deterministic behavior of a billiard ball is not inconsistent with stochastic behavior of the constituting atoms.
Assuming the existence of a unique functional relation between measurement result and may be just as absurd as assuming that the center of mass of a billiard ball is a unique function of the positions of all atoms at a certain instant of time (the relevant time scales being very different). From a discussion of neutron interference experiments in section 10.6.3 it will be seen that it may even be impossible to attribute to a neutron a well-defined position, since its contextual state is not localized in a particle-like way.
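The billiard-ball analogy can be made tangible with a toy calculation. The sketch below is only an illustration under stated assumptions: the microscopic variable is modelled as independent Gaussian fluctuations around a fixed mean (an arbitrary choice, not a model of any subquantum dynamics), and all names and parameter values are hypothetical. It compares the spread of the instantaneous values with the spread of their time averages over windows of fixed length.

```python
import random
import statistics


def micro_trajectory(n_steps, mean=1.0, noise=0.5, seed=0):
    """Instantaneous values: a fixed mean plus independent Gaussian fluctuations."""
    rng = random.Random(seed)
    return [mean + rng.gauss(0.0, noise) for _ in range(n_steps)]


def time_averages(trajectory, window):
    """Averages over consecutive, non-overlapping windows of the trajectory."""
    return [statistics.fmean(trajectory[i:i + window])
            for i in range(0, len(trajectory) - window + 1, window)]


def spreads(n_steps=10_000, window=100):
    """Standard deviations of the instantaneous values and of the time averages."""
    trajectory = micro_trajectory(n_steps)
    averaged = time_averages(trajectory, window)
    return statistics.pstdev(trajectory), statistics.pstdev(averaged)


if __name__ == "__main__":
    raw, averaged = spreads()
    print(f"spread of instantaneous values: {raw:.3f}")
    print(f"spread of time averages:        {averaged:.3f}")
```

For independent fluctuations the spread of the averages is expected to shrink roughly as the inverse square root of the window length, so a quantity defined by averaging can behave almost deterministically even though the underlying values fluctuate strongly.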
10.6.3 Contextual states for generalized measurements
In order to get a better idea of the distinction of the contributions of preparation and measurement to quantum mechanical complementarity it is instructive to try to extend the idea of a ‘contextual state’ to generalized measurements like the joint nonideal measurements of incompatible observables discussed in previous chapters. It is to be expected that changing the measurement arrangement will in general change the contextual states in generalized measurements, too. Then, incommeasurability of incompatible standard observables could be explained because, in the
context of a measurement arrangement of a joint nonideal measurement conditioning on the contextual states of the separate observables may be impossible. Formulated in this way it might seem that ‘complementarity in a determinative sense’ has the same origin as ‘complementarity in a preparative sense’. To a certain extent this may be correct. However, different phases of the measurement process have to be distinguished, the context changing as the object traverses the measurement arrangement. Here, too, the confusion caused by not distinguishing the two kinds of complementarity could be understood because in Copenhagen reasonings a detailed analysis of different phases of the measurement process is largely lacking (compare section 4.7.2). It should be repeated here that information obtained from a quantum mechanical measurement is information on the initial (incoming) state (cf. section 3.3.6). Thus, in general in neutron interference experiments the quantum mechanical result does not tell us in an unambiguous way which path the neutron has taken through the interferometer. Strictly speaking it does not even yield the probabilities of the paths, but only the probabilities of detector clicks (empiricist interpretation) described in terms of the initial density operator of the incoming neutron. Detailed information on the ‘individual contextual state’ transcends quantum mechanics. Hence, such information can only be obtained by means of measurements outside the domain of application of quantum mechanics, which do not yet seem to be feasible today. Even though we do not have any experimental possibility to test the dependence of the contextual state on the parameters of the measurement arrangement in generalized measurements, it may nevertheless be advantageous to study the neutron interference experiments discussed in section 8.2 in some more detail. 
This might provide an opportunity to gain some insight into the possibility of a subquantum theory rendering the quantum mechanical results, or even transcending the domain of application of quantum mechanics. It will be clear, however, that the following discussion is rather speculative and tentative.
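As quantitative background for the neutron interference experiments of section 8.2.3, the quantum mechanical detector intensities for the two kinds of absorption can be sketched as follows. This is a minimal sketch under assumptions that are not taken from the text: an idealized two-path interferometer with equal beam splitting, a phase shift chi in one path, an absorber with transmission probability a in the other, and ideal detection in one outgoing beam. Stochastic absorption is modelled by attenuating the amplitude by √a, deterministic absorption by a mixture of runs with a fully open path (weight a) and a fully blocked path (weight 1 − a). All function names are hypothetical.

```python
import cmath
import math


def intensity_stochastic(chi, a):
    """Outgoing-beam intensity when the absorber attenuates the amplitude by sqrt(a)."""
    amplitude = 0.5 * (cmath.exp(1j * chi) + math.sqrt(a))
    return abs(amplitude) ** 2


def intensity_deterministic(chi, a):
    """Mixture of runs with the path open (weight a) and blocked (weight 1 - a)."""
    path_open = abs(0.5 * (cmath.exp(1j * chi) + 1.0)) ** 2
    path_blocked = abs(0.5 * cmath.exp(1j * chi)) ** 2
    return a * path_open + (1.0 - a) * path_blocked


def visibility(intensity, a, n_points=3600):
    """Fringe visibility (Imax - Imin) / (Imax + Imin), scanned over the phase chi."""
    values = [intensity(2.0 * math.pi * k / n_points, a) for k in range(n_points)]
    return (max(values) - min(values)) / (max(values) + min(values))


if __name__ == "__main__":
    for a in (0.04, 0.25, 0.81):
        print(f"a = {a:4.2f}:  stochastic {visibility(intensity_stochastic, a):.3f}"
              f"   deterministic {visibility(intensity_deterministic, a):.3f}")
```

In this idealization the visibility equals 2√a/(1 + a) for stochastic and 2a/(1 + a) for deterministic absorption, so the two schemes give different interference patterns for the same average transmission. This is a statement about detector statistics only; it does not by itself say anything about the underlying contextual states.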
Contextual states in neutron interference experiments
The examples of deterministic and stochastic absorption in neutron interference (cf. section 8.2.3) seem to differ not only quantitatively but also qualitatively. Note that here ‘deterministic’ and ‘stochastic’ do not refer to the evolution of hidden variable but to the measurement process. In contrast to the stochastic one, the deterministic absorption experiment is a mixture of pure interference and pure path measurements (compare POVM (8.21), which satisfies Ludwig’s alternative definition (7.44)). In this case only the extreme values 0 and 1 of absorption parameter play a role in an individual measurement. Consequently, only the contextual
states of the corresponding standard observables can occur¹⁰. For the ideal path measurement depicted in figure 8.3 it is of primary importance what happens in the first slab of the interferometer, since it is determined there whether the neutron will be detected in one of the detectors and or whether it will be absorbed in the absorber. The contextual states and corresponding to measurement results + and – of path observable respectively, represent physical circumstances in which, roughly speaking, the neutron is in one path or the other. Since the decision whether the neutron will be detected in one path or the other is made in the first slab, the same measurement result will be obtained if the absorber and/or the detector(s) are placed somewhere else between the first and the third slab in the same path of the interferometer. We should be cautious, however, not to draw from this the conclusion that contextual states and describe a point particle choosing one way or the other. Such an assumption would not make it possible to explain the measurement results obtained in the ideal interference experiment of figure 8.2. Since, as far as the interaction between neutron and first slab is concerned, the measurement arrangement is not different from that of a path measurement, it should be concluded that at this stage of the measurement the contextual states cannot be different from those of the ideal path measurement. In an interference experiment information must go along both paths, however. Since this is determined in the first slab, too, this must hold true in the ideal path measurement as well. Therefore, contextual states and should both have contributions taking both paths, be it that only one of the contributions is capable of activating a particle detector. In deciding whether the neutron is in one path only or in both paths, we should be careful about what we exactly mean when using the word ‘neutron’.
If the neutron is identified with the contextual state, then it should be in both paths. However, if the neutron is identified with that part of the contextual state that is capable of activating a particle detector, then it seems that we should say that the neutron is in one path only. As a metaphor illustrating the meaning of the ‘individual contextual state’ we could imagine a ship accompanied by its bow wave, constituting a single entity as far as the quantum mechanical description is concerned. However, on a more detailed hidden-variables account different parts can be distinguished. The ship is in one path only. However, the bow wave is in both paths (compare the idea of ‘empty waves’, e.g. Selleri [437]). Only the ship, not the bow wave, is capable of triggering a detector. However, the partial bow waves going along the two paths may interfere and cause the ship to exhibit traces of this interference when it is detected in an interference measurement as depicted in figure 8.2. It is important to realize that quantum mechanics is not capable of making a distinction between ship and bow wave because quantum mechanical measurements are not able to probe contextual states in any detail. Contextual states (1) and (2), corresponding to the eigenvalues of the interference observable defined in (8.8), describe a neutron leaving the third slab in the direction of detector or accompanied by its “bow wave”. The final decision whether the neutron is in one outgoing beam or the other is made in the third slab, and depends on the details of the contextual state impinging on that slab. These details are (co-)determined by the furniture of the interferometer, in particular the absorber. In the joint nonideal measurement of path and interference observables of figure 8.4 the contextual states will be either or in case of deterministic absorption, but they will be different from these if the absorber is a stochastic one. Not only is it important that the absorber stops the neutron in a certain fraction of transits; it is also important that the “bow wave” is influenced in a particular way, depending on the transmission coefficient a of the absorber, so as to change interference at the third slab. A subquantum theory would have to describe this process in more detail. The different behaviors of contextual states in interference experiments with deterministic and stochastic absorption highlight the fundamental difference between standard measurements and generalized ones, stressed in sections 7.1 and 7.9.4.
¹⁰ Note that POVM (7.64) of a joint nonideal polarization measurement satisfies Ludwig’s alternative definition, too (cf. (7.81)) as a consequence of the fact that the beam splitter used is not a polarizing one. Hence, it determines only the position of the contextual state, not its polarization, which is only determined in the analyzer. Although the spatial characteristics of the contextual state leaving the beam splitter may depend on this may not have a large effect on the interaction of the photon and the analyzer.
The “bow wave” metaphor has a certain similarity with de Broglie’s “theory of the double solution” [424] in which a particle is represented by a singular solution of the Schrödinger equation, accompanied by a regular solution as a kind of guiding wave, as well as with Bohm’s theory discussed in section 10.3. There exist, however, fundamental differences with these theories.
Thus, in the present theory the quantum mechanical wave function is not interpreted realistically. Therefore, in contrast to what is often assumed in hybrid hidden-variables theories, the wave function does not play a role as a guiding wave physically influencing the singularity. Another difference is that the contextual state is not a solution of the Schrödinger equation; its evolution is governed by an (as yet unknown) subquantum theory. Moreover, it is not applicable outside the measurement context. Although there is a certain similarity with the views of the Copenhagen interpretation, the general picture is quite different. In the Copenhagen interpretation it is assumed that the neutron makes no decision about which path to take before an observation is made in a path measurement, i.e. the moment at which it is ascertained that the neutron is either registered by the detector or not (probabilistic interpretation, cf. section 4.2.2). In the hidden-variables model it is assumed that not only in the path measurement but also in the interference measurements the neutron follows a well-defined path through the interferometer, independently of whether it is observed or not. This is quite independent of the issue of complementarity, which is determined by the differences of the contextual states in the different measurement arrangements. In particular, this is in disagreement with the popular view that in an interference measurement the neutron would not have a position (path) but would jump stochastically between the paths. Even though the wave function and the ‘individual contextual state’ both split at the first interferometer slab, they should be carefully distinguished, the first referring to an ensemble, the latter to an individual event. Such an event can be interpreted as an individual neutron. However, here a neutron is not thought to be a point particle, but rather a more or less extended object (“ship+bow wave”; here the possibility of a description of an individual object by means of a soliton solution of a nonlinear evolution equation should be mentioned, e.g. Enz [438]). Indeed, the ‘individual contextual state’ can be seen as a description of an object that for the largest part chooses one path, but for a small part, essential for later interference, also follows the other path. This picture is clearly different from the wave function which splits into equal parts if the incoming state is either or As seen from the “bow wave” metaphor, this need not imply any “nonseparability” of the parts (cf. section 6.3.2): it seems very well possible to locally influence one part without instantaneously influencing the other (of course, such a local influence may have consequences for later interference of the partial waves). Quantum field theory gives ample occasion for such a view, even though this theory only refers to ensembles. Vacuum fluctuations, contributing to the mass of a particle, can be considered as genuine constituents of the object; they also give the object a nonlocal character, which may have observational consequences (cf. section 1.3.1).
But this nonlocality is comparable to the nonlocality of a billiard ball, acting as one single extended (rigid) object as long as it is not subject to local influences forcing it outside the domain of application of rigid body theory. For instance, it is very well possible to locally excite one atom of the ball by absorption of a photon. This, however, would lead us outside the domain of rigid body theory. By the same token, distinguishing a neutron from its accompanying vacuum fluctuations would require observational methods which are outside the domain of quantum field theory. In our present discussion nothing seems to prevent a deterministic evolution of the objective hidden-variables state in the sense that its initial value gives a unique measurement result in case of an ideal measurement of a standard observable¹¹. However, this would imply the possibility of jointly attributing values of incompatible standard observables (like the path and interference observables) to the object at a time prior to any measurement, thus enabling a derivation of Bell’s inequality along the objectivistic lines discussed in section 9.5.1. For this to be impossible it is necessary that the deterministic connection between the objective initial state and either of the contextual states be broken within the domain of application of quantum mechanics. In the next section we shall exploit an analogy between quantum mechanics and thermodynamics to argue that this idea is not too contrived. Even though it may not be impossible that the classical mechanical evolution underlying thermodynamics is deterministic, more subtle measurement procedures than thermodynamic ones are required to probe the phase space point of a classical system. As is well known, ‘coarse-graining’ is essential for reconciling thermodynamics with classical statistical mechanics. Hence, measurements within the domain of thermodynamics yield only coarse-grained information. Analogously, quantum mechanical measurements may lack the precision necessary to probe the objective state. The possibility that ‘coarse-graining’ may be relevant to the reduction of quantum mechanics to a subquantum theory has also been considered by Valentini [439].
11 Note that for an ideal measurement of a standard observable the initial objective state of the measuring instrument is unimportant.
10.6.4 Thermodynamic analogy
At this moment we do not have any experimental evidence to guide us in devising any specific subquantum theory. For this reason our objective must necessarily be a more modest one, viz. trying to get some insight into the possibility of the non-quasi-objectivistic theories conjectured in section 10.6.1 as local subquantum underpinnings of quantum mechanics. In the following it will turn out that the analogy of quantum mechanics and thermodynamics may be very helpful in this respect. In the absence of any definite idea with respect to what subquantum reality looks like, the analogy with a classical statistical mechanical underpinning of thermodynamics at least yields a formal opportunity of considering local hidden-variables theories that need not satisfy the Bell inequality. In the analogy between quantum mechanics and thermodynamics the hidden-variables theory is thought to play a role analogous to that played by classical (statistical) mechanics in attempts at reducing thermodynamics to a more fundamental theory. Thus, we have the scheme
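The correspondence just described can be laid out as below. This layout is only a sketch inferred from the preceding sentence, and its exact form is an assumption: each vertical arrow stands for an attempted reduction to a more fundamental theory, and each horizontal arrow for the analogy between the two cases.

```latex
\[
\begin{array}{ccc}
\text{thermodynamics} & \longleftrightarrow & \text{quantum mechanics} \\
\downarrow & & \downarrow \\
\text{classical (statistical) mechanics} & \longleftrightarrow & \text{hidden-variables theory}
\end{array}
\]
```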
The analogy is attractive because, due to the atomic constitution of matter, the fundamental problems of quantum mechanics and thermodynamics may be closely related, and may even have to be solved jointly. For this reason the analogy should not be approached in too naive a fashion. Our experience is that, as particles get smaller their dynamics tends to deviate more from the classical dynamics of point particles: below a certain size quantum mechanical effects can no longer be ignored.
We do not have any reason to expect that classicality will return at still smaller dimensions. It is very improbable that a classical mechanics of point particles can yield a satisfactory model for subquantum dynamics. It, therefore, is to be expected that a subquantum theory will not imply a return to classical mechanics. Such a theory will presumably describe a reality that is even “stranger” than the reality described by quantum mechanics. Nevertheless, a comparison with thermodynamics is particularly useful because classical mechanical underpinnings of this theory, as well as the problems going with it, have been widely discussed since the days of Maxwell and Boltzmann.
Realist and empiricist interpretations of thermodynamics
Let us start by noting that the question of whether the theory allows a realist (either objectivistic or contextualistic) or an empiricist interpretation (cf. chapter 2) applies to thermodynamics just as well as to quantum mechanics. Do the thermodynamic quantities ‘temperature’ and ‘pressure’ refer to properties of the object (e.g. a volume of gas) or just to readings of a measuring instrument? Unfortunately, thermodynamic quantities are often considered in a naive way as objective properties of the object. Thus, in the literature a “definition” of temperature T can be found on the basis of the equality
This would define temperature in terms of the instantaneous kinetic energy of the object. Such a “definition” is problematic, however, as can be seen by applying it to an ideal gas in a container freely falling in a gravitational field. According to the “definition” temperature would have to rise due to the acceleration of the object. This is in disagreement with the more appropriate (empiricist) definition of temperature as registered by a thermometer which is in stationary contact with the object. Like in quantum mechanics we can pose the question of whether thermodynamic quantities, apart from representing pointer positions, do tell us something about the reality of the object. Here the answer must be an unconditionally positive one: it would be wise to take a high thermometer reading as a warning that you may burn your hand if you touch the object. Accordingly, it could be conjectured that an objective definition of temperature of a falling body could be obtained by taking in (10.46) particle velocities relative to a coordinate frame that is co-moving with the object. However, like in the modal interpretation of quantum mechanics (cf. section 6.6) thermodynamic quantities seem to have a contextual meaning only. Temperature can be attributed to a physical system only if a certain condition of thermal equilibrium is
satisfied, referred to by Boltzmann as a state of ‘molecular chaos’, and implemented by the requirement that a condition of (quasi-)ergodicity be satisfied (e.g. ter Haar [440], p. 123, Huang [441], p. 203, also P. and T. Ehrenfest [442]). This condition is satisfied, for instance, if the object is in equilibrium with a heat bath. By itself the concept of kinetic energy has no connection with temperature; it can be generalized to physical situations not satisfying an equilibrium condition, in which no temperature having a thermodynamic meaning can be attributed to the object. As far as thermodynamics allows a realist interpretation, it is a contextualistic-realist one. The context is provided by the presence of a heat bath, or, more generally, a system with which the object can be in thermal equilibrium. Here we should distinguish between local and global equilibrium. For the analogy with quantum mechanics it seems that local equilibrium is the relevant one. Hence, a comparison with so-called non-equilibrium thermodynamics¹² (e.g. de Groot and Mazur [443]) would be appropriate. In case equilibrium is not global, time averaging should be restricted to a finite time interval. In the case of equilibrium thermodynamics (homogeneous systems) it is evident that the container enclosing a gas is an important constituent of the experimental context: equilibrium thermodynamics applies to the physical circumstance in which a gas is in equilibrium with its container.
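The objection to the kinetic “definition” of temperature can be illustrated numerically. The following is a minimal sketch, assuming that the equality referred to as (10.46) is the usual equipartition relation (3/2)kT = mean kinetic energy per particle; the helium-like mass, the particle number and the fall time are arbitrary illustrative choices, and all function names are hypothetical. It compares the naive value, computed in the laboratory frame, with the value computed in a frame co-moving with the falling gas.

```python
import random

K_B = 1.380649e-23  # Boltzmann constant, J/K


def sample_gas(n, mass, temperature, rng):
    """Draw n Maxwell-Boltzmann velocities and remove the mean velocity,
    so that the gas as a whole is initially at rest."""
    sigma = (K_B * temperature / mass) ** 0.5
    vs = [[rng.gauss(0.0, sigma) for _ in range(3)] for _ in range(n)]
    mean = [sum(v[i] for v in vs) / n for i in range(3)]
    return [tuple(v[i] - mean[i] for i in range(3)) for v in vs]


def mean_velocity(velocities):
    """Velocity of the centre of mass (equal masses assumed)."""
    n = len(velocities)
    return tuple(sum(v[i] for v in velocities) / n for i in range(3))


def kinetic_temperature(velocities, mass, frame=(0.0, 0.0, 0.0)):
    """T from (3/2) k T = mean kinetic energy per particle, with velocities
    taken relative to a frame moving with velocity `frame`."""
    squares = sum(
        sum((v[i] - frame[i]) ** 2 for i in range(3)) for v in velocities
    )
    return mass * squares / (3.0 * K_B * len(velocities))


def free_fall(velocities, g, t):
    """In free fall every particle gains the same velocity -g*t along z."""
    return [(vx, vy, vz - g * t) for (vx, vy, vz) in velocities]


rng = random.Random(0)
mass = 6.6e-27  # kg, roughly one helium atom (assumed)
gas = sample_gas(20000, mass, 300.0, rng)
fallen = free_fall(gas, 9.81, 10.0)

t_rest = kinetic_temperature(gas, mass)
t_naive = kinetic_temperature(fallen, mass)
t_comoving = kinetic_temperature(fallen, mass, mean_velocity(fallen))

print(f"at rest:           {t_rest:.2f} K")
print(f"falling, naive:    {t_naive:.2f} K")
print(f"falling, comoving: {t_comoving:.2f} K")
```

In this sketch the naive value rises by m(gt)²/3k although nothing thermal has happened to the gas, whereas the co-moving value is unchanged. Neither number, of course, settles whether the gas is in the equilibrium condition under which a temperature can be attributed at all.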
Microstates and macrostates in a classical statistical mechanics underpinning of thermodynamics
The main lesson to be learnt from thermodynamics is that this theory can be applied only to systems that are in very specific states. Thermodynamics does not regard arbitrary classical mechanical states; it applies only to a restricted subset of such states, viz. states satisfying a requirement of (quasi-)ergodicity. Boltzmann and Maxwell have formulated this as a requirement to be satisfied by relevant trajectories in phase space. Classical quantities are relevant to thermodynamics only as far as they are time averaged over a (quasi-)ergodic trajectory. This implies that a thermodynamic quantity is not a function of the phase space point (cf. section 1.10.3), but a functional on the space of (quasi-)ergodic trajectories (see also section 10.2.3). In Gibbs’s treatment of statistical thermodynamics time averaging over a (quasi-)ergodic trajectory is replaced by averaging over an ensemble, for instance, the canonical ensemble described by the canonical phase space distribution functions
12 Non-equilibrium thermodynamics describes the transition from local equilibrium to global equilibrium; it does not describe the relaxation from non-equilibrium to equilibrium (either local or global).
We shall refer to these canonical distribution functions as macrostates, to be distinguished from the microstates corresponding to phase space points or to the dispersionless states (cf. (1.117)). In Gibbs’s formulation thermodynamic quantities are functionals of the canonical distribution functions, and, consequently, are conditioned on macrostates. It does not make sense to attribute a thermodynamic quantity to a microstate. Thermodynamic quantities are not instantaneous properties of the object, even if this is not evident in Gibbs’s description by means of canonical distribution functions, which could be interpreted as describing an ensemble of microstates all prepared at the same instant. However, since a canonical distribution function represents a (quasi-)ergodic path of an individual system, it seems more appropriate to consider it as a description of an ensemble of microstates along a (quasi-)ergodic path (indeed, the canonical density function yields an appropriate value of the standard deviation of the thermal equilibrium fluctuations of energy). Thermodynamic quantities are slowly varying in time because they are equilibrium properties. Of course, it is not impossible to define classical mechanical quantities of the system at an arbitrary instant of time. However, these quantities will in general fluctuate rapidly in time. They will correspond to thermodynamic quantities only after relaxation to a state of molecular chaos has taken place. The domain of application of thermodynamics is restricted to (local) equilibrium states.
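For reference, the canonical distribution functions cited as (10.47) presumably have the standard Gibbs form. The notation below (Hamiltonian H, phase space point (q,p), β = 1/kT, partition function Z, heat capacity C_V) is an assumption and is not taken from the text. The second and third lines show in what sense the mean energy and its fluctuation are functionals of the macrostate rather than functions of a phase space point.

```latex
\begin{align*}
\rho_\beta(q,p) &= \frac{e^{-\beta H(q,p)}}{Z(\beta)}, \qquad
Z(\beta) = \int e^{-\beta H(q,p)}\,dq\,dp, \qquad \beta = \frac{1}{kT},\\
\langle E\rangle &= \int H(q,p)\,\rho_\beta(q,p)\,dq\,dp
  = -\frac{\partial \ln Z}{\partial \beta},\\
(\Delta E)^2 &= \langle H^2\rangle - \langle H\rangle^2
  = \frac{\partial^2 \ln Z}{\partial \beta^2} = kT^2 C_V .
\end{align*}
```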
Thermodynamics and quantum mechanics as theories of (local) equilibrium processes
As already assumed in Bohm’s stochastic theory (cf. section 10.3.3), quantum mechanics, like thermodynamics, might be a theory of equilibrium processes. The contextual states of quantum mechanics might be comparable to the thermodynamic equilibrium states (10.47), which constitute just a subset of all possible phase space distributions. Analogously to the Gibbsian canonical distributions, the trajectories might be representable as equilibrium distributions on the space of objective states and, hence, be related to the objective states in the way statistical mechanical macrostates are related to microstates. Like the statistical mechanical macrostate, the contextual state could be a result of a process in which object and measuring instrument approach a joint state of equilibrium. Both in thermodynamics and in quantum mechanics the restriction to a canonical set of states could be justified by a restrictive way of observation, in which measurements are so slow that they only yield information that is averaged over a period of time longer than the time the system needs to relax to (local) equilibrium. The canonical states of thermodynamics might be comparable to the (nonnegative) distribution functions on phase space involved in the Husimi representation of quantum mechanics discussed in section 1.11.3. The set of these states
does not enclose arbitrary non-negative distribution functions [71], but only the very special functions defined by (1.139). The embedding of functions (1.139) in the space of all possible non-negative functions seems to be analogous to the embedding of the canonical distribution functions (10.47) of thermodynamics in the space of all classical statistical mechanical distribution functions. Analogously to the canonical distribution functions the Husimi distributions exhibit a certain smoothing which may be caused by a diffusion process in phase space, establishing a kind of (sub)quantum equilibrium to be compared with thermal equilibrium. However, it should be remembered that the Husimi functions are possible probability distributions of quantum mechanical measurement results (compare (8.84)). We, therefore, should be careful to distinguish between the two possible sources of indeterminacy referred to in section 7.10.3, viz, preparation and measurement. Measurement disturbance does contribute to the smoothing, which, therefore, cannot be a result exclusively of preparation of the contextual state. As a matter of fact, it is suggested by our interpretation of the Wigner measure (cf. section 7.9.4) that the smoothing relation (1.151) between the Husimi measure and the Wigner measure is a consequence of measurement disturbance. Quantum mechanical equilibrium should be taken into account already by the Wigner measure. We have two additional reasons to believe that some kind of equilibrium must be involved in a reality underlying quantum mechanics. First, there is the issue of the existence of limit values of relative frequencies (6.1) assumed to be generally realized in quantum mechanical measurements, as well as the experimental repeatability of such measurements, causing von Neumann to assume homogeneity of pure states (cf. section 6.2.3). 
We do not have any reason to expect these to be true for arbitrary ensembles of objective states. The requirement that an arbitrary subensemble yield the same relative frequencies as the full ensemble is a very strong one, in need of a physical explanation. Satisfaction of such a requirement is possible only if different subsequences of measurements are physically equivalent. Equilibrium seems to be a favorable condition for this to be the case. A second issue is the (anti)symmetry of wave functions of identical particles (cf. section 2.4.4). Here, too, the physical equivalence of the particles could be explained by the existence of a certain equilibrium, not unlike an explanation of the physical equivalence of particles at different positions in a gas in thermal equilibrium. If the analogy between thermodynamics and quantum mechanics has any physical reality, then the physical quantities of both theories are defined only as averages over certain space-time regions in an equilibrium situation. The contextual state represents information on a microscopic object, comparable to the information on the (average) kinetic energy contained in the parameter T of the canonical phase space distribution function. The analogy can explain the invalidity of the ‘possessed values’ principle in quantum mechanics. Only after the object has reached equilibrium with the measuring instrument does a state come into being (viz. the contextual
state) to which a value of the quantum mechanical observable can be attributed. There is a certain analogy between an assignment of a well-defined value of a quantum mechanical observable to an object outside the measurement context of that observable, and the assignment of a value of temperature to an object that is not in thermal equilibrium with its environment. Unfortunately, in both cases it is not unusual to do such a thing. In a certain sense a transition to equilibrium is consistent with the projection postulate, taken in the sense of the anti-Copenhagen variant of the modal interpretation (cf. section 6.6.2). Such a transition can be seen as a process in which a measurement result comes into being in the Jordan sense (cf. section 6.2.2). Yet, we should be careful not to draw too drastic conclusions of indeterminism from this. Like in the thermal process of approach to equilibrium, starting from a nonequilibrium state like the one depicted in figure 10.3, the temperature of the final equilibrium state can be seen as a consequence of the kinetic energy present in the initial non-equilibrium state. This energy need not be changed drastically in the process in which equilibrium with the container is established. Temperature comes into being in this process; however, the (non-thermodynamical) kinetic energy was already there beforehand. In principle, it is not impossible to measure this kinetic energy, but within the domain of thermodynamics this information may be too subtle to resist the stochasticity involved in the relaxation processes. Our comparison of quantum mechanics and thermodynamics might also shed some light on the fundamental problems of the latter theory, connected with attempts at deriving the Second Law of Thermodynamics from classical (statistical) mechanics, i.e. the problem of irreversibility. 
A widely accepted solution to this problem is the ‘coarse-graining’ solution, stating that observation can yield only inaccurate (smoothed) information. However, this solution is sometimes put into doubt. For instance, Prigogine [444]: “It is difficult to believe that the observed irreversible processes, such as viscosity, decay of unstable particles, and so forth, are simply illusions caused by lack of knowledge or by incomplete observation.” If the analogy between thermodynamics and quantum mechanics is meaningful, then
there might indeed exist a second smoothing mechanism next to the one caused by inaccurate measurement, viz. a process in which a contextual state is prepared at the onset of a measurement by a process establishing equilibrium of the subquantum state. Viewed in this way, the fundamental problems of quantum mechanics and thermodynamics might not only be analogous, but they are possibly closely related. It does not seem to be improbable that, like in quantum mechanics, also in thermodynamics no sufficient distinction has been made between the processes of preparation and measurement, thus causing confusion. Presumably, thermodynamics, too, cannot be completely understood unless both processes are taken into account. This, in particular, might apply to the question of whether the randomization process yielding a thermodynamical macrostate is consistent with a deterministic microscopic dynamics. We might expect that for thermodynamics an assumption of determinism may yield problems that are comparable to those discussed in section 10.6.2 with respect to existence of a deterministic relation between contextual and objective states. It is not impossible that decoherence (argued in section 3.4.5 not to be of fundamental importance at the level of quantum mechanics) plays a fundamental role at the subquantum level. The idea of quantum mechanics as a theory of equilibrium processes can yield an indication as regards the kind of experiments that have to be performed to transcend the boundaries of the domain of application of quantum mechanics. Such experiments would have to be capable of probing the objective hidden-variables state rather than the contextual state. This implies that they should be fast enough to register deviations from equilibrium. Relaxation times of subquantum processes might be guessed from the value of a quantity that is interpreted as a diffusion constant.
For an electron confined in a space of atomic dimension R this would amount to a relaxation time that is short compared with the femtosecond region, which at this moment is state-of-the-art; it is clear, therefore, that all present-day experiments are well within the domain of quantum mechanics. In hindsight, on this view it is not surprising that Aspect’s “switching” experiment (cf. section 9.3.1), with its time of flight and its ‘switching’ frequency of 50 MHz, was not fast enough to show any deviation from quantum mechanical predictions.
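An order-of-magnitude version of this estimate can be sketched as follows. The identification of the diffusion constant with ħ/2m (the value used in Nelson-type stochastic mechanics), the choice R = 10⁻¹⁰ m and the dimensional estimate τ ~ R²/D are all assumptions made here for illustration; the function names are hypothetical.

```python
HBAR = 1.054571817e-34         # reduced Planck constant, J s
M_ELECTRON = 9.1093837015e-31  # electron mass, kg


def diffusion_constant(mass):
    """Assumed identification D = hbar / (2 m), in m^2/s."""
    return HBAR / (2.0 * mass)


def relaxation_time(size, mass):
    """Dimensional estimate tau ~ size**2 / D for diffusion over `size`."""
    return size ** 2 / diffusion_constant(mass)


R_ATOM = 1.0e-10  # assumed atomic dimension, m
tau = relaxation_time(R_ATOM, M_ELECTRON)
switching_period = 1.0 / 50e6  # period of a 50 MHz switch, s

print(f"D   = {diffusion_constant(M_ELECTRON):.2e} m^2/s")
print(f"tau = {tau:.2e} s")
print(f"switching period / tau = {switching_period / tau:.1e}")
```

Under these assumptions the relaxation time comes out at the order of 10⁻¹⁶ s, which is below the femtosecond scale and many orders of magnitude below the period of a 50 MHz switch.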
Contextuality in thermodynamics and in quantum mechanics
In our attempt to solve the quantum mechanical ‘nonlocality’ problem induced by the Bell inequality (cf. section 10.6.1) contextuality played an important role. The solution proposed there was to assume that a quantum mechanical measurement result can be conditioned only on an ‘(individual) contextual state’. In sections 10.6.1 and 10.6.2 we did not consider the equilibrium aspect. Indeed, contextuality does not necessarily imply equilibrium. However, equilibrium does imply
contextuality since in establishing equilibrium the system’s environment plays a role. It may not be unreasonable to adopt this feature in quantum mechanics, too. It is seldom realized that the thermodynamic equilibrium states described by the canonical density functions (10.47) exhibit a contextuality similar to that of the contextual state. This is evident from their dependence on the experimental arrangement. Thus, when a gas is enclosed in a container we should assume the potential energy of the particles to be infinite outside the container, thus causing the canonical density functions (10.47) to vanish there. Contextuality is evident when we compare two containers having the same temperature and pressure, but rotated with respect to each other (cf. figure 10.4): the canonical states vanish outside different regions, and, hence, are distinct. Note that these states could be obtained by starting from the same (non-equilibrium) microstate, e.g. the one depicted in figure 10.3, by preparing it in each of the two containers of figure 10.4 and letting each system establish equilibrium with its container. This implies that thermodynamic quantities can be different, even if starting from identical initial microstates, thus demonstrating the contextuality of the equilibrium states. Stated differently, thermodynamic quantities, too, are defined only within the context of a well-defined measurement arrangement. The contextuality observed here induces in thermodynamics a complementarity of preparation highly reminiscent of the analogous quantum mechanical property (cf. section 10.6.2). Evidently, thermodynamic measurement arrangements can be just as mutually exclusive as quantum mechanical ones. For thermodynamics the relevance of canonical distributions seems to be a reality that should be taken into account.
For quantum mechanics the relevance of contextual states to a hidden-variables underpinning is conjectured only as a logical possibility, which, however, would be physically plausible if the analogy between thermodynamics and quantum mechanics would have a physical basis. It would make more easily acceptable the idea that quantum mechanical observables can be conditioned only on trajectories.
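The dependence of the canonical state on the container can be made concrete with a toy example. This is a sketch under simplifying assumptions that are not in the text: a two-dimensional ideal gas, hard walls, and two rectangular containers of equal area, one rotated by 90 degrees with respect to the other. At equal temperature the momentum parts of the two canonical densities coincide, so only the position parts are compared; all names are hypothetical.

```python
def position_density(x, y, half_width, half_height):
    """Position part of the canonical density of an ideal gas in a hard-walled
    box centred at the origin: uniform inside, zero where the wall potential
    is infinite."""
    inside = abs(x) <= half_width and abs(y) <= half_height
    return 1.0 / (4.0 * half_width * half_height) if inside else 0.0


def density_a(x, y):
    """Container of size 4 x 2."""
    return position_density(x, y, 2.0, 1.0)


def density_b(x, y):
    """The same container rotated by 90 degrees (size 2 x 4)."""
    return position_density(x, y, 1.0, 2.0)


def overlap(n=400, extent=2.0):
    """Midpoint-rule estimate of the integral of min(rho_a, rho_b)."""
    h = 2.0 * extent / n
    total = 0.0
    for i in range(n):
        x = -extent + (i + 0.5) * h
        for j in range(n):
            y = -extent + (j + 0.5) * h
            total += min(density_a(x, y), density_b(x, y)) * h * h
    return total


shared = overlap()
print(f"probability mass shared by the two macrostates: {shared:.3f}")
print(f"density at (1.5, 0): A = {density_a(1.5, 0.0)}, B = {density_b(1.5, 0.0)}")
```

Equal areas and equal temperatures give equal pressures for an ideal gas, yet the two densities share only part of their probability mass: they are distinct macrostates, distinguished solely by the arrangement in which equilibrium was established.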
Nonlocality in thermodynamics and in quantum mechanics
It is seldom noticed, too, that thermodynamics has a nonlocal character. Statistical mechanical macrostates described by thermodynamics have a certain extension, corresponding to the region of phase space over which quantities should be averaged to obtain thermodynamic results. (Local) equilibrium can be defined only on spacetime regions of dimensions large enough that a situation of “molecular chaos” can prevail. It is impossible to change such a state locally without changing it into a non-equilibrium one, and, hence, without leaving the domain of application of thermodynamics. Thermodynamics describes transitions from one equilibrium state to another. Such transitions are necessarily nonlocal, changing the state globally. Of course, this does not mean that it would be impossible to locally disturb an equilibrium state; it only means that such a local disturbance would lead us outside the domain of application of thermodynamics. Nor does it mean that thermodynamic states are fundamentally nonlocal, or that their parts are nonseparable. This is analogous to the nonlocality of a billiard ball within the domain of application of rigid body theory. Like rigid body theory, thermodynamics, too, can only describe those processes in which the state of the system is uniformly influenced in the whole region in which (local) equilibrium has been established. Here it is possible to apply the maxim “It is the theory which decides what we can observe”, unjustifiedly used by Heisenberg to deny the possibility of certain experiments (cf. section 1.9.1). However, in the present context it is used not to assert the impossibility of observations not described by the theory, but merely to delimit the domain of applicability of the theory. For thermodynamics this domain refers to observation processes which are so slow that they can only yield information on the system as it is after having relaxed to equilibrium.
Hence, the nonlocality of the theory does not have a physical cause in the sense that “real” nonlocal interactions would exist. It is just a consequence of a restricted validity of the theory. It does seem possible to explain thermodynamic nonlocality by means of local interactions which have occurred in the past, and which have established equilibrium correlations during the realization of equilibrium, analogously to the way the nonlocality of a billiard ball can be explained by means of local interatomic forces. Quantum mechanical nonlocality may have a similar character. Measurements within the domain of application of (generalized) quantum mechanics may not be fast enough to provide information on properties that are conditioned on an objective hidden variable. They may yield information only on a contextual state representing an equilibrium state analogous to a statistical mechanical macrostate, and having a similar nonlocal character. If quantum mechanics, like thermodynamics, is a theory of equilibrium processes, then the mathematical equivalence of Bohm’s causal theory, discussed in section 10.3.2, with quantum mechanics could provide an explanation of the nonlocality of Bohm’s theory. Indeed, Bohm’s theory would only
describe a certain category of physical processes, in which the object is always in local equilibrium with its environment. A theory describing such equilibrium processes must necessarily be nonlocal, in the sense that it refers to events which have a certain extension. To a certain extent this would make quantum mechanics a nonlocal theory, too, describing events that cannot be confined to pointlike regions but that should at least have the extension of an elementary particle, including vacuum fluctuations (see also section 1.3.1). However, this need not at all imply that an underlying hidden-variables theory ought to be nonlocal too. The nonlocality of equilibrium processes may be just a consequence of a restriction to a specific physical condition, forbidding an arbitrary local change of equilibrium states on penalty of leaving the domain of application of the (equilibrium) theory, i.e. quantum mechanics. In the case of thermodynamics the underlying theory, viz, classical mechanics, is capable of describing such local influences, thus transcending the domain of application of thermodynamics. The same may hold for a hidden-variables theory underpinning quantum mechanics in a non-hybrid way, not borrowing from quantum mechanics the nonlocality that is inherent in that theory due to its restricted domain of application. As a description of equilibrium processes solutions of the Schrödinger equation may have a similar nonlocal character as solutions of the classical diffusion equation.
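The closing comparison can be made explicit for free motion in one dimension. The following is a sketch under stated assumptions (a constant diffusion coefficient D, a particle of mass m, no potential); the symbols are illustrative and not taken from the text.

```latex
\begin{align*}
\frac{\partial \rho}{\partial t} &= D\,\frac{\partial^2 \rho}{\partial x^2},
& \rho(x,t) &= \int \frac{1}{\sqrt{4\pi D t}}\,
   e^{-(x-x')^2/4Dt}\,\rho(x',0)\,dx',\\
i\hbar\,\frac{\partial \psi}{\partial t} &= -\frac{\hbar^2}{2m}\,
   \frac{\partial^2 \psi}{\partial x^2},
& \psi(x,t) &= \int \sqrt{\frac{m}{2\pi i\hbar t}}\,
   e^{\,i m (x-x')^2/2\hbar t}\,\psi(x',0)\,dx'.
\end{align*}
```

The second kernel follows from the first by the formal substitution D → iħ/2m. Both kernels are nonvanishing for every separation x − x' at any t > 0, so that in either description a change of the initial distribution in one region affects the solution everywhere at once. For the diffusion equation this is commonly understood as an artefact of the coarse, equilibrium-type description rather than as evidence of nonlocal interactions between the diffusing particles.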
Summary
Summarizing, we have the following sketch of a hidden-variables model underlying quantum mechanics. Starting from a preparation of an objective state, a relaxation process to an equilibrium state takes place. If we are able to perform sufficiently fast measurements (i.e. within the relaxation time), then in principle information might be obtained on the objective state itself. If these measurements are of the deterministic type discussed in section 10.4.2, then their results could be attributed as objective properties to the microscopic object. However, the relaxation process is accompanied by a certain loss of memory. Measurements lasting longer than the relaxation time yield only time-averaged information, i.e. information on equilibrium states. It does not make any sense to attribute results of such measurements to an objective state. It is assumed that a measurement within the domain of application of quantum mechanics, for instance of quantum mechanical standard observable A, is of this type. It is also assumed that an equilibrium state is (co-)determined by the measurement arrangement. This is plausible because the relaxation process leading to an equilibrium state takes place in the context of the measurement, and, hence, the measurement arrangement (co-)determines the precise equilibrium configuration. This implies that for the contextual state defined in section 10.6.1, such an equilibrium state should be taken.
The essential difference between (10.13) and (10.43) is contained in the choice of hidden-variables states. In contrast to the former, in the latter the essential roles of (local) equilibrium and contextuality in determining quantum mechanical measurement results are both taken into account. The interplay of these two is sufficient to prevent a derivation of the Bell inequality. Neither contextuality nor (local) equilibrium can separately prevent a conditioning of results of quantum mechanical measurements on one and the same initial state of the hidden variable, thus allowing a derivation of the Bell inequality: on one hand, the derivations by Bell (section 10.5.2) and by Clauser and Horne (section 10.5.3) are also valid for contextualistic theories; on the other hand, for a non-contextualistic non-quasi-objectivistic theory a quadrivariate probability distribution like (10.45) could be constructed. A combination of contextuality and equilibrium makes this impossible, however, because the conditional probabilities of quantum mechanical measurement results of incompatible observables are conditioned on different equilibrium states, which, moreover, cannot be reduced to an initial objective state due to lack of memory of the process in which equilibrium is established. This latter circumstance is crucial since, in principle, there is no objection against the assumption of identical values of the hidden variable prepared in different EPR-Bell experiments. However, this is irrelevant if quantum mechanical measurement results are determined by the contextual state rather than by the initial value of the hidden variable. Which contextual state originates from a certain objective state is determined not only by that objective state but also by the measurement arrangement for A. For this reason it is meaningless to assume, as is done in Stapp’s “nonlocality proof” (cf. section 9.5), identical preparations in different EPR-Bell experiments: as far as ‘identical preparation’ is interpreted as ‘equal values of the hidden variable’, this is irrelevant because this initial condition cannot determine a quantum mechanical quantity; on an interpretation as ‘equal contextual states’, identical individual preparation is impossible in case of incompatibility of the observables, due to mutual exclusiveness of measurement arrangements.
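Why the existence of a quadrivariate probability distribution suffices for a Bell inequality can be checked by brute force. The sketch below assumes the inequality in its CHSH form and two-valued (±1) outcomes for the four observables, here called a1, a2, b1, b2; these names and the CHSH form are illustrative assumptions and are not taken from (10.45).

```python
from itertools import product
import random


def correlation(joint, i, j):
    """Expectation of the product of outcomes i and j under a joint
    distribution over (a1, a2, b1, b2), each outcome being +1 or -1."""
    return sum(p * outcome[i] * outcome[j] for outcome, p in joint.items())


def chsh(joint):
    """CHSH combination E(a1,b1) + E(a1,b2) + E(a2,b1) - E(a2,b2)."""
    a1, a2, b1, b2 = 0, 1, 2, 3
    return (correlation(joint, a1, b1) + correlation(joint, a1, b2)
            + correlation(joint, a2, b1) - correlation(joint, a2, b2))


outcomes = list(product((-1, 1), repeat=4))

# Deterministic assignments are the extreme points of the set of all
# quadrivariate distributions.
extreme_values = sorted({chsh({o: 1.0}) for o in outcomes})

# An arbitrary (randomly chosen) quadrivariate distribution is a mixture
# of the deterministic ones.
rng = random.Random(1)
weights = [rng.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}
mixed_value = chsh(joint)

quantum_maximum = 2.0 * 2.0 ** 0.5  # Tsirelson bound, for comparison

print("CHSH values of deterministic assignments:", extreme_values)
print(f"CHSH value of a random mixture: {mixed_value:.3f}")
print(f"quantum mechanical maximum: {quantum_maximum:.3f}")
```

Every deterministic assignment gives ±2, so any mixture stays within [−2, 2], whereas quantum mechanics allows values up to 2√2. If the four correlations are conditioned on different contextual states, no single distribution of this kind is available and the bound cannot be derived in this way.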
Appendix A Mathematical appendix
A.1 Position and momentum operators
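In order to deal with position and momentum operators¹ it is necessary to consider an infinite-dimensional Hilbert space $\mathcal{H}$. These operators have continuous spectra and non-normalizable eigenvectors. Thus,
$$Q|q\rangle = q|q\rangle$$
for position operator $Q$, and
$$P|p\rangle = p|p\rangle$$
for momentum operator $P$. The eigenvectors obey Dirac normalization
$$\langle q|q'\rangle = \delta(q - q'), \quad \langle p|p'\rangle = \delta(p - p'),$$
with $\delta$ the Dirac delta function. In Dirac notation the spectral representations of operators $Q$ and $P$ are given by
$$Q = \int dq\, q\, |q\rangle\langle q|, \quad P = \int dp\, p\, |p\rangle\langle p|.$$
In the Dirac formalism we have the following relation between the eigenvectors of $Q$ and $P$: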
¹ Much of the mathematics not presented in this appendix can be found in elementary textbooks such as Kreyszig [3]. As usual in physics texts the presentation is a simplified one, omitting most of the subtleties involved in infinite-dimensional spaces by using Dirac’s formalism.
yielding for the inner product
According to Stone’s theorem a unitary operator real, can be defined. This operator is a translation (or position shift) operator, satisfying
In an analogous manner we can define a momentum shift operator according to
The shift operators and satisfy the commutation relation
This relation is known as the Weyl commutation relation. There is an important theorem, proven by von Neumann [445], that all operators and satisfying this relation must be essentially the same operators (up to unitary equivalence) as the ones defined above. Therefore the Weyl commutation relation is seen as characterizing the canonical conjugatedness of all quantum mechanical position and momentum observables, expressed by the commutation relation (1.19). It is easily verified that any set of operators
with
satisfying the relations
and P defined as above, necessarily has the form
with $|q\rangle$ eigenvectors of $Q$. If the set is meant to be the spectral representation of a Hermitian operator, then it follows that In that case relations (A.10) are said to define a system of imprimitivity (Mackey [446]). However, as seen from (A.11), there are more general solutions. Thus, if the set defines a so-called non-orthogonal decomposition of the identity (NODI) (cf. (A.116)). In that case relations (A.10) are referred to as a system of covariance (Prugovecki² [40]). Analogous relations define the spectral representation of P.
² Actually, Prugovecki requires only the first equality of (A.10).
A.2 Boson creation and annihilation operators
The (non-Hermitian) boson annihilation and creation operators $a$ and $a^\dagger$ are defined in terms of the position and momentum operators of appendix A.1 according to
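From (1.19) it directly follows that these operators satisfy the commutation relation
$$[a, a^\dagger] = I. \qquad (A.13)$$
The interpretation as creation and annihilation operators, respectively, derives from their properties as ladder operators, in the sense that
$$a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle, \quad a|n\rangle = \sqrt{n}\,|n-1\rangle, \qquad (A.14)$$
in which the vectors $|n\rangle$, $n = 0, 1, 2, \ldots$, are the (orthonormal) eigenvectors of the number operator defined by
$$N = a^\dagger a. \qquad (A.15)$$
The eigenvectors of $N$ correspond to the so-called number states. In Dirac’s notation the operators are represented according to
$$a = \sum_{n=1}^{\infty} \sqrt{n}\,|n-1\rangle\langle n|, \quad a^\dagger = \sum_{n=0}^{\infty} \sqrt{n+1}\,|n+1\rangle\langle n|, \quad N = \sum_{n=0}^{\infty} n\,|n\rangle\langle n|.$$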
In agreement with Stone’s theorem the number operator gives rise to the definition of the unitary operator This operator will be referred to as the rotation operator, as suggested by the relations
Operator
can be interpreted as a phase shift operator, satisfying
In particular, defining
we have It is possible to define also a number shift operator according to
With (A.18) we find Note that operator V is not unitary, no unitary number shift operator existing due to the one-sided boundedness of the spectrum of the number operator.
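The ladder, commutation and number shift relations of this appendix can be checked numerically in a truncated number-state basis. The following is only a minimal sketch: the truncation dimension `D`, the helper functions, and the form $V = \sum_n |n+1\rangle\langle n|$ taken for the number shift operator are assumptions made for illustration. The truncation spoils the commutator in its last diagonal entry; the vanishing of $(VV^\dagger)_{00}$, on the other hand, reflects the genuine non-unitarity of $V$.

```python
import math

D = 6  # truncation dimension (assumption): number states |0>, ..., |D-1>

def matmul(x, y):
    """Multiply two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(x):
    """Transpose; equals the adjoint here because all entries are real."""
    return [list(row) for row in zip(*x)]

# Annihilation operator: a|n> = sqrt(n)|n-1>, i.e. <m|a|n> = sqrt(n) if m == n-1.
a = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(D)] for m in range(D)]
a_dag = transpose(a)

# Number operator N = a^dagger a: diagonal with entries 0, 1, ..., D-1.
N = matmul(a_dag, a)

# Commutator [a, a^dagger]: unit matrix except for the last diagonal entry,
# which is an artifact of the truncation.
aa_dag = matmul(a, a_dag)
comm = [[aa_dag[i][j] - N[i][j] for j in range(D)] for i in range(D)]

# Number shift operator V|n> = |n+1> (assumed form).
V = [[1.0 if m == n + 1 else 0.0 for n in range(D)] for m in range(D)]
VVdag = matmul(V, transpose(V))  # projector onto span{|1>, ..., |D-1>}

print([round(N[i][i], 10) for i in range(D)])
print([round(comm[i][i], 10) for i in range(D)])
print([VVdag[i][i] for i in range(D)])
```

The first list shows the spectrum of $N$, the second the diagonal of the commutator, and the third the diagonal of $VV^\dagger$, in which the vacuum component is missing.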
A.3 Some properties of boson creation and annihilation operators
Without proof the following important relations are given, valid for the creation and annihilation operators defined in appendix A.2 (see e.g. Louisell [5], p. 102):
This is a special case of the Baker-Hausdorff relation. It holds because, due to (A.13), we have Application of (1.23) yields:
By repeated application of (A.20) we get
Using (A.21) it is not difficult to prove that for real
and
Also: with (A.23) directly entailing that
Relation (A.25) enables a kind of Fourier expansion of operators according to
For instance, for A = P this representation can easily be verified, since from (A.21) and (A.23) it follows that
In an analogous way equality (A.26) can be proven for $A = Q$ by using
A.4 Coherent states
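Although the operator $a$ is not Hermitian, it nevertheless has eigenvectors. For each complex value of $\alpha$ the normed vector
$$|\alpha\rangle = e^{-|\alpha|^2/2} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\,|n\rangle \qquad (A.29)$$
satisfies
$$a|\alpha\rangle = \alpha|\alpha\rangle. \qquad (A.30)$$
The states (A.29) are called coherent states (Glauber [447]). They satisfy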
allowing an interpretation of the operator (A.16) as inducing a rotation in the complex plane. This also means that the coherent states of a harmonic oscillator remain coherent as long as the oscillator does not have any interaction with other systems. Apart from their practical interest as (approximate) descriptions of states of an electromagnetic field produced by a laser, the coherent state vectors have a great theoretical interest because they play an important role in the generalization of the quantum mechanical formalism discussed in this book. For this reason we give here a number of mathematical properties of these vectors.
The coherent states (A.29) can be obtained from the vacuum state $|0\rangle$ by translating this latter state by means of the translation operator according to
This can easily be proven by applying in (A.20) the series expansion and taking into account the representation of in terms of number states. From (A.20) it follows that
Hence, The coherent states are not mutually orthogonal:
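The non-orthogonality of the coherent states can be made quantitative with a short numerical sketch. It assumes the standard number-state expansion $\langle n|\alpha\rangle = e^{-|\alpha|^2/2}\alpha^n/\sqrt{n!}$ for (A.29), truncated at a cut-off `nmax`; the function names and the sample values of $\alpha$ and $\beta$ are hypothetical. Under these assumptions the overlap should obey $|\langle\alpha|\beta\rangle|^2 = e^{-|\alpha-\beta|^2}$, and the eigenvalue property of the annihilation operator can be verified componentwise.

```python
import cmath
import math

def coherent_coeffs(alpha, nmax=60):
    """Coefficients <n|alpha> = exp(-|alpha|^2/2) alpha^n / sqrt(n!), n < nmax."""
    coeffs = []
    term = cmath.exp(-abs(alpha) ** 2 / 2)  # coefficient for n = 0
    for n in range(nmax):
        coeffs.append(term)
        term = term * alpha / math.sqrt(n + 1)  # c_{n+1} = c_n alpha / sqrt(n+1)
    return coeffs

def inner(x, y):
    """Inner product <x|y>, antilinear in the first argument."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

alpha, beta = 0.7 + 0.3j, -0.2 + 1.1j  # sample values (hypothetical)
ca, cb = coherent_coeffs(alpha), coherent_coeffs(beta)

norm_sq = inner(ca, ca).real                 # should be 1 up to truncation error
overlap_sq = abs(inner(ca, cb)) ** 2         # |<alpha|beta>|^2
expected = math.exp(-abs(alpha - beta) ** 2)

# Eigenvalue property: (a c)_n = sqrt(n+1) c_{n+1} should equal alpha c_n.
eig_err = max(abs(math.sqrt(n + 1) * ca[n + 1] - alpha * ca[n])
              for n in range(len(ca) - 1))

print(norm_sq, overlap_sq, expected, eig_err)
```

The overlap is nonzero for every pair of complex numbers, although it decreases rapidly with the distance $|\alpha - \beta|$ in the complex plane.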
The set of coherent states appears to constitute a kind of basis for the quantum mechanical state vectors Indeed, every state vector can be expanded according to
where the integration over is over the whole complex plane: This can be demonstrated by showing that the projection operators satisfy the equality
which is equivalent with (A.36). A proof of (A.37) can most easily be given if use is made of the representation (A.29) of and if the integration over the complex plane is carried out in polar coordinates thus We then get for the left-hand side of (A.37):
This straightforwardly yields $\sum_n |n\rangle\langle n|$. Due to the orthonormality and completeness of the vectors $|n\rangle$ this equals $I$.
An important property of the set of coherent states is its overcompleteness, distinguishing it from the set of orthogonal eigenvectors of a Hermitian operator constituting a basis of Hilbert space. This can also be expressed by the following property (e.g. Klauder and Sudarshan [6], p. 131):
or, equivalently,
implying that the operator A is completely determined by the diagonal elements This property is not valid if the set of coherent states is replaced by an orthogonal basis. The completeness of the set of coherent states in the sense of (A.36)
is a consequence of the fact that the set of operators constitutes a (non-orthogonal) decomposition of the identity (cf. appendix A.12.3). The position representation of the coherent states is given by
with $|q\rangle$ eigenvectors of $Q$. In this form coherent states were found for the first time by Schrödinger [448].
Two-mode coherent states
Let $a_1$ and $a_2$ be boson annihilation operators in different modes (1 and 2), satisfying the commutation relations (1.21). A two-mode coherent state is defined by
We have with
a coherent state (A.32) for mode
If we define then the operators and satisfy commutation relations (1.21), and, hence, are boson operators, too. It is then possible to define, analogously to (A.30), coherent states also for and
Using representation (A.32) it can be proven that
Using (A.41) it is also simple to prove that
This follows, for instance, by diagonalizing the quadratic form by means of the transformations and by applying equality (A.31) in the representation. In terms of the unitary operator the operators (A.40) can be given as
Expressing and according to (A.12) in terms of position and momentum operators we find that with a component of the angular momentum operator. This explains why the transformation (A.42) looks like a rotation. Indeed, from (A.42) it can straightforwardly be derived that
momentum eigenvectors.
A.5 Squeezed boson operators
When performing a scale transformation it is natural to consider the operators and instead of the usual position and momentum operators. They satisfy the same canonical commutation relation (1.19) as do and P, and, for this reason, are equivalent with these. They are called squeezed position and momentum operators (see also section 1.11.2). For real
we define, analogously to (A.12), the following operators:
The boson creation and annihilation operators and are special cases for However, since the operators and for arbitrary satisfy the commutation relation these are creation and annihilation operators of bosons, too. For we refer to such bosons as “squeezed” bosons. The parameter the “squeeze” parameter, is a measure of the squeezing. Since the results of appendices A.3 and A.4 are completely determined by the commutation relation (A.45), they can be generalized to arbitrary values of Analogously to (A.15) a number operator
can be defined, the eigenvectors of which could be interpreted as representing states containing With (A.44) and (1.19) we directly find
Analogously to (A.30) for each a squeezed coherent state can be defined (Yuen [449], and references given there) as an eigenvector of the operator
The position representation of this state is found as
with
given by (A.38).
For every value of the squeeze parameter the squeezed coherent states constitute, analogously to the coherent states, an overcomplete set of vectors. We have
as becomes immediately clear if this operator is considered in the position representation, and (A.49) is applied. In an analogous way it also follows that (compare (A.35)):
Inserting (1.23) in (A.44) gives the following relation between the annihilation and creation operators of different bosons:
The transition from operator to operator is a unitary transformation, which can also be carried out by means of the (unitary) squeezing operator
according to In Dirac’s notation the squeezing operator can be represented as
Hence, the squeezing operator is unitary. The state obtained by applying it to the vacuum state is often called the squeezed vacuum.
The squeezing operator also determines the relation between the coherent state and the squeezed coherent state It follows from (A.54) that
implying that also
The unitarity of the squeezing operator explains equalities (A.50) and (A.51).
Due to the complete equivalence of the boson formalisms for different values of we finally have, analogously to (A.32),
With (A.54) this implies
Rotated squeezed coherent states We define the rotated squeezed coherent states according to
with
given by (A.16). These are eigenvectors of the operator
With (A.44) it follows from (A.59) that
in which and are given by (A.17).
A.6 Theorem on ordering of a positive operator and a projection operator
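Theorem: If $P$ is a Hermitian projection operator, and $B$ a positive operator satisfying $B \le P$, then $B = PBP$. If, moreover, $P$ is a one-dimensional projection operator, then $B = \lambda P$, $0 \le \lambda \le 1$.
Proof: We can write $B$ as
$$B = A^\dagger A.$$
Then $B \le P$ means that for every vector $|\psi\rangle$
$$\langle\psi|A^\dagger A|\psi\rangle \le \langle\psi|P|\psi\rangle.$$
Choose $|\psi\rangle = (I - P)|\phi\rangle$ with $|\phi\rangle$ arbitrary. Then
$$\langle\phi|(I - P)A^\dagger A(I - P)|\phi\rangle \le \langle\phi|(I - P)P(I - P)|\phi\rangle = 0,$$
or
$$\|A(I - P)|\phi\rangle\| = 0.$$
This implies $A(I - P) = O$, or $A = AP$. From this it follows that $A^\dagger = PA^\dagger$, and hence,
$$B = A^\dagger A = PA^\dagger AP = PBP.$$
If $P$ is a one-dimensional Hermitian projection operator, $P = |\psi\rangle\langle\psi|$, and $\lambda = \langle\psi|B|\psi\rangle$, then $B = PBP = \lambda P$. Since $O \le B \le P$ we have $0 \le \lambda \le 1$.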
A.7 Theorem on overcomplete sets of vectors
Let a set of $N'$ vectors be an overcomplete set spanning an $N$-dimensional vector space, i.e. each vector $a$ of that space can be written as
but the vectors are linearly dependent. We can define the N' × N'-dimensional (Hermitian) Gram matrix Due to the linear dependence and the overcompleteness of the vectors this matrix has rank N < N'. Like an orthogonal basis, also an overcomplete set vectors can be represented by a star in (cf. Coxeter [450]). A star is called eutactic if it can be obtained by means of an orthogonal projection from an orthonormal basis in an N'-dimensional vector space of which is a subspace. We now prove the following theorem (cf. Seidel [451]): Theorem: A star is eutactic if and only if its Gram matrix has precisely two different eigenvalues (one of which being 0). Proof: i) Assume
Then This means that the Gram matrix is a matrix representation of the orthogonal projection operator P. This implies that this (Hermitian) matrix can only have eigenvalues 0 and 1. ii) Assume, conversely, that the matrix has only two different eigenvalues. Since (with the eigenvalues of the Gram matrix), and since due to the linear dependence of the vectors at least one of the eigenvalues must be 0. Since the Gram matrix is of rank N, there are precisely N' – N eigenvalues 0. By suitably normalizing the vectors it is possible to make all other eigenvalues (assumed to be all equal) have value 1. This implies that there exists a unitary matrix such that for all other values of and Assume
Then (A.60) can be written as
From this it follows that
and that
Moreover, the vectors constitute an orthonormal set. Because of (A.61) this is an orthonormal basis of
If is a subspace of an N'-dimensional vector space then an orthonormal basis in this latter space can be obtained by adding to the vectors set of N' – N mutually orthogonal and normed vectors in the orthogonal complement of in The orthonormal basis of obtained in this way is denoted as (note that If P is the orthogonal projection operator in onto subspace then for this basis we have
We obtain a different orthonormal basis in by means of the unitary transformation
Using (A.62) it follows that
Note that the orthonormal basis of is not unique, but is only determined up to a unitary transformation of the basis of the orthocomplement of in
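A simple instance of the theorem of this appendix can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes three unit vectors in the plane at mutual angles of 120 degrees ($N = 2$, $N' = 3$), a star chosen here as an example and not taken from the text, and verifies that its Gram matrix $G$ satisfies $G^2 = \frac{3}{2}G$, so that $G$ has only the eigenvalues 0 and 3/2. After rescaling the vectors by $\sqrt{2/3}$ the Gram matrix becomes idempotent, as expected for a eutactic star.

```python
import math

# Example star (assumption): three unit vectors in the plane, 120 degrees apart.
star = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
        for k in range(3)]

def gram(vectors):
    """Gram matrix G_ij = <v_i|v_j> of a list of real vectors."""
    return [[sum(x * y for x, y in zip(u, v)) for v in vectors] for u in vectors]

def matmul(x, y):
    """Multiply two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

G = gram(star)
G2 = matmul(G, G)

# G^2 = (3/2) G means that the only eigenvalues of G are 0 and 3/2.
two_eigenvalue_err = max(abs(G2[i][j] - 1.5 * G[i][j])
                         for i in range(3) for j in range(3))

# The trace fixes the multiplicity of the nonzero eigenvalue: 3 / (3/2) = 2 = N.
trace_G = sum(G[i][i] for i in range(3))

# Rescaling by sqrt(2/3) turns the Gram matrix into an orthogonal projector.
scale = math.sqrt(2.0 / 3.0)
P = gram([(scale * x, scale * y) for x, y in star])
P2 = matmul(P, P)
projector_err = max(abs(P2[i][j] - P[i][j]) for i in range(3) for j in range(3))

print(two_eigenvalue_err, trace_G, projector_err)
```

The rescaled Gram matrix plays the role of the matrix representation of the orthogonal projection operator $P$ occurring in part i) of the proof.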
A.8 Non-orthogonal (skew) projections
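If $P^2 = P$ but $P^\dagger \ne P$, then $P$ is a non-orthogonal projection operator (Akhiezer and Glazman [47], Vol. I, p. 145). Apart from $(I - P)$ we now also have at our disposal the projection operators $P^\dagger$ and $(I - P^\dagger)$. Because of
$$(I - P^\dagger)^\dagger P = (I - P)P = O$$
the subspaces $P\mathcal{H}$ and $(I - P^\dagger)\mathcal{H}$ are mutually orthogonal. In fact, $P\mathcal{H}$ and $(I - P^\dagger)\mathcal{H}$ are orthogonal complements. The same holds for $P^\dagger\mathcal{H}$ and $(I - P)\mathcal{H}$. However, the subspaces $P\mathcal{H}$ and $(I - P)\mathcal{H}$ are orthogonal only if $P = P^\dagger$. Since
$$P + (I - P) = I$$
the subspaces $P\mathcal{H}$ and $(I - P)\mathcal{H}$ are complementary in $\mathcal{H}$, be it that they need not be orthogonal complements. We say that $P$ projects parallel to subspace $(I - P)\mathcal{H}$. In case of non-orthogonal projections it is possible that
$$P_1\mathcal{H} = P_2\mathcal{H}, \qquad (A.64)$$
i.e. $P_1$ and $P_2$ project onto the same subspace. This is satisfied if and only if
$$P_1P_2 = P_2, \quad P_2P_1 = P_1. \qquad (A.65)$$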
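Proof: (A.64) ⇒ (A.65): Let $a$ be an arbitrary vector of $\mathcal{H}$. Then $P_2a \in P_2\mathcal{H} = P_1\mathcal{H}$. Let $b$ be an arbitrary vector of $P_1\mathcal{H}$. Then $P_1b = b$. Hence, $P_1P_2a = P_2a$, i.e. $P_1P_2 = P_2$. The second relation can be derived in an analogous way.
(A.65) ⇒ (A.64): Assume $P_1P_2 = P_2$. Then, for $a \in P_2\mathcal{H}$ we have $a = P_2a = P_1P_2a$, i.e. $a \in P_1\mathcal{H}$. Hence, $P_2\mathcal{H} \subseteq P_1\mathcal{H}$. Analogously, from the other relation it follows that $P_1\mathcal{H} \subseteq P_2\mathcal{H}$. Hence, $P_1\mathcal{H} = P_2\mathcal{H}$.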
Theorem: The class of projectors projecting onto one and the same subspace contains one single, unique projector that is Hermitian. Proof: Assume that there are two Hermitian projectors, $P_1$ and $P_2$, projecting onto the same subspace. Then $P_1P_2 = P_2$; hence, taking the Hermitian conjugate, $P_2P_1 = P_2$. But we also have $P_2P_1 = P_1$; hence $P_1 = P_2$.
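Two projectors $P_1$ and $P_2$ project parallel to the same subspace if
$$(I - P_1)\mathcal{H} = (I - P_2)\mathcal{H},$$
i.e. $(I - P_1)$ and $(I - P_2)$ project onto the same subspace. Because of (A.65), for this to be the case it is necessary and sufficient that
$$(I - P_1)(I - P_2) = I - P_2, \quad (I - P_2)(I - P_1) = I - P_1,$$
or
$$P_1P_2 = P_1, \quad P_2P_1 = P_2.$$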
The operator projecting orthogonally onto is denoted as Hence, this operator is Hermitian and it projects parallel to (cf. figure A.1). Analogously, is the orthogonal projector onto Let be a non-orthogonal basis in Let be the skew projector projecting onto the subspace spanned by parallel to the subspace spanned by Then for every we have
In appendix A.8.1 an explicit expression for is derived.
A.8.1 Non-orthogonal projections and bi-orthonormality
Let be a non-orthogonal basis in an Hilbert space. Since the determinant of the Gram matrix is non-vanishing, it is possible to find a set of vectors such that (cf. figure A.2)
The vectors are uniquely determined by the basis through relation (A.67). They are linearly independent too, and, hence, constitute a basis different from if the latter basis is not an orthonormal one. It is the so-called dual basis. The set of vectors consisting of vectors and is called a bi-orthonormal system (Akhiezer and Glazman [47], Vol. I, p. 40). The Gram matrix of the dual basis is the inverse of the Gram matrix of the basis
Matrices
and
are Hermitian.
Using a bi-orthonormal system it is possible to treat an expansion of a vector into a non-orthonormal basis, analogously to the orthonormal case. Thus, it follows directly from (A.67) that
For
this yields
with
given by (A.68).
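The construction of a dual basis and the corresponding expansion of a vector can be illustrated for a real two-dimensional space. The sketch below is a hedged example: the basis vectors, the vector `a` and all names are hypothetical choices, and the dual basis is obtained from the inverse of the Gram matrix, in line with the statement that the Gram matrix of the dual basis is the inverse of that of the original basis. It also exhibits a skew projector that is idempotent but not symmetric.

```python
# Hypothetical non-orthogonal basis e_1, e_2 of the real plane.
basis = [(1.0, 0.0), (1.0, 1.0)]

def dot(u, v):
    """Real inner product."""
    return sum(x * y for x, y in zip(u, v))

# Gram matrix G_ij = <e_i|e_j> and its inverse (closed form for 2 x 2).
G = [[dot(u, v) for v in basis] for u in basis]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
G_inv = [[G[1][1] / det, -G[0][1] / det],
         [-G[1][0] / det, G[0][0] / det]]

# Dual basis: dual_i = sum_j (G^{-1})_ij e_j, so that <dual_i|e_j> = delta_ij.
dual = [tuple(sum(G_inv[i][j] * basis[j][k] for j in range(2)) for k in range(2))
        for i in range(2)]

biorthogonality = [[dot(d, e) for e in basis] for d in dual]
gram_dual = [[dot(u, v) for v in dual] for u in dual]

# Expansion of a vector a in the non-orthogonal basis: a = sum_i <dual_i|a> e_i.
a = (3.0, 2.0)
coeffs = [dot(d, a) for d in dual]
recon = tuple(sum(c * e[k] for c, e in zip(coeffs, basis)) for k in range(2))

# Skew projector |e_1><dual_1|: idempotent but not symmetric.
P1 = [[basis[0][r] * dual[0][c] for c in range(2)] for r in range(2)]
P1_sq = [[sum(P1[r][k] * P1[k][c] for k in range(2)) for c in range(2)]
         for r in range(2)]

print(dual, coeffs, recon, P1)
```

For this choice the dual basis consists of the vectors (1, -1) and (0, 1), and the vector (3, 2) has expansion coefficients 1 and 2.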
In Dirac notation an arbitrary operator A can be represented according to
Hence,
Operators are non-orthogonal projection operators but For instance, operator projects onto the one-dimensional subspace spanned by parallel to the space spanned by
Analogously to the (usually considered) orthogonal case the operators satisfy The vectors are joint eigenvectors of the commuting projection operators also in the non-orthogonal case:
The operator
is a non-orthogonal projection operator, too. It satisfies
Hence, is the skew projector projecting onto the subspace spanned by parallel to the subspace spanned by Analogously, projects onto the subspace spanned by vectors of the dual basis, parallel to the subspace spanned by Note that, in contrast to the case expression (A.72) is Hermitian for (compare (A.71)). This follows directly from the Hermiticity of matrix
It is possible to relate a bi-orthonormal system in the following way to an orthonormal basis Define the operator
For this operator we have
With (A.71) we then get
Because of bi-orthonormality is indeed warranted by
Note that the operator is neither Hermitian nor unitary. For this reason this operator can establish a transformation between the Hermitian projection operator and the non-Hermitian projection operator
Starting from an arbitrary Hermitian operator A, with spectral representation we obtain
Then the bi-orthonormal system can be found as the eigenvectors of the operators and respectively (cf. Morse and Feshbach [452], Vol. I, p. 884):
A.9 Direct product and tensor product of Hilbert spaces
Let and be two Hilbert spaces with vectors and respectively. Then the direct product of and is defined such that the elements of the direct product are the direct products (bi-vectors) The direct product is not a linear vector space because it is not possible to define an addition of bi-vectors such that the sum of two bi-vectors is a bi-vector. of
Let and
We see that
be a basis in and a basis in Then the tensor product is defined as the linear space spanned by the bi-vectors
(for
The tensor product is a Hilbert space. The inner product of and can be defined as follows in terms of the inner products of and
This quantity satisfies all requirements of an inner product. If and are orthonormal bases in and respectively, then is an orthonormal basis in An arbitrary operator on can be written as
in which, as usual, the direct product signs × have been omitted. It is sometimes convenient to write this as
In this way it is clear that an operator $A$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$ can be written as a linear combination of direct products of operators defined separately on $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively.
The trace of operator A satisfies
The partial trace over Hilbert space
is defined as
with
Using (A.80) this can be written as
an orthonormal basis in
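The partial trace can be illustrated for a system of two two-dimensional subsystems. The sketch below rests on assumptions made only for this example: the product basis is ordered as index $= i\,d_2 + j$ for the bi-vector built from the $i$-th basis vector of the first space and the $j$-th of the second, the function names are hypothetical, and the state is the entangled vector $(|00\rangle + |11\rangle)/\sqrt{2}$. Tracing out either factor then yields half the unit operator on the remaining space.

```python
def partial_trace_second(rho, d1, d2):
    """Trace out the second factor; the result acts on the first space.
    Assumed ordering of the product basis: index = i * d2 + j."""
    return [[sum(rho[i * d2 + j][k * d2 + j] for j in range(d2))
             for k in range(d1)] for i in range(d1)]

def partial_trace_first(rho, d1, d2):
    """Trace out the first factor; the result acts on the second space."""
    return [[sum(rho[i * d2 + j][i * d2 + l] for i in range(d1))
             for l in range(d2)] for j in range(d2)]

# Entangled vector (|00> + |11>)/sqrt(2) and its projection operator.
psi = [2 ** -0.5, 0.0, 0.0, 2 ** -0.5]
rho = [[x * y for y in psi] for x in psi]

rho_1 = partial_trace_second(rho, 2, 2)
rho_2 = partial_trace_first(rho, 2, 2)

trace_rho = sum(rho[n][n] for n in range(4))
trace_rho_1 = sum(rho_1[n][n] for n in range(2))

print(rho_1, rho_2, trace_rho, trace_rho_1)
```

The full trace is preserved by the partial trace, whereas the off-diagonal elements of the projection operator, which express the correlation between the subsystems, do not appear in either reduced operator.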
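This is an operator acting on Hilbert space $\mathcal{H}_1$. This operator can be considered as an operator on $\mathcal{H}_1 \otimes \mathcal{H}_2$ by writing it as $(\mathrm{Tr}_2 A) \otimes I_2$, where $I_2$ is the unit operator on $\mathcal{H}_2$. In an analogous way the partial trace $\mathrm{Tr}_1 A$ over $\mathcal{H}_1$ can be defined. It is an operator on $\mathcal{H}_2$.
A.10 Linear spaces of operators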
Linear operators can be added. They can therefore be considered as elements of a linear vector space. The dimension of the linear vector space is equal to the number of linearly independent operators. In the case of linear operators on an vector space the dimension of the space of the operators is in case of a real vector space). On such a linear vector space of operators an inner product can be defined, for instance, the inner product in which
is the trace of operator
The Hilbert space, obtained in this way, is called the Hilbert-Schmidt space, and (A.82) the Hilbert-Schmidt inner product. An orthogonal basis for Hilbert-Schmidt space is, for instance, the set of all operators $|\alpha_i\rangle\langle\alpha_j|$, with $\{|\alpha_i\rangle\}$ a complete orthonormal set of vectors in the space on which the operators act. Note that this set can be replaced by a set of Hermitian operators, yielding an alternative orthogonal basis with respect to the Hilbert-Schmidt inner product. If we restrict ourselves to Hermitian operators and real linear combinations of these, then we have a subspace of Hilbert-Schmidt space, with the latter set of operators as a possible basis. For instance, the operators $I, \sigma_x, \sigma_y, \sigma_z$, with $\sigma_x, \sigma_y, \sigma_z$ the Pauli spin matrices, constitute an orthogonal basis, allowing an arbitrary Hermitian 2 × 2 matrix $A$ to be represented according to
$$A = a_0 I + a_x\sigma_x + a_y\sigma_y + a_z\sigma_z,$$
with real constants It is also possible to define operators acting on the vectors of Hilbert-Schmidt space. Such operators are sometimes called ‘super operators’ in order to distinguish them from the operators of Hilbert-Schmidt space itself. We say that a super operator is Hermitian if
In order to distinguish ‘Hermiticity of super operators’ from ‘Hermiticity of operators of Hilbert-Schmidt space’ the former is denoted as
Applying Riesz’ lemma (see e.g. Reed and Simon [13], Vol. I, p. 43) to Hilbert-Schmidt space, it follows that for each bounded linear functional on this space a vector exists such that the functional can be written as
If is finite, then Hilbert-Schmidt space is a very useful space because it contains all linear operators. If then we have a problem, because many physically relevant operators have a Hilbert-Schmidt norm which is not finite. For instance, for this reason the unit operator I is not a vector in Hilbert-Schmidt space. The same holds true for any orthogonal projection operator on an infinite-dimensional subspace. Since such operators must have a place in the mathematical formalism, Hilbert-Schmidt space is not particularly appropriate. An alternative possibility is, not to consider the space of operators as a Hilbert space, but as a Banach space. When we take as a norm in the linear vector space of
all bounded linear operators $A$ the operator norm $\|A\| = \sup_{\psi \ne 0} \|A\psi\|/\|\psi\|$, then this yields a Banach space which is not a Hilbert space (for instance, for 2 × 2 matrices it can easily be seen that the operator norm does not satisfy the parallelogram law, which is necessary for an inner product to be definable). Since the norm of the unit operator satisfies $\|I\| = 1$, it is an element of this Banach space also in case of infinite dimension. This is an important advantage over Hilbert-Schmidt space.
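The use of the Hilbert-Schmidt inner product for expanding a Hermitian 2 × 2 matrix in the orthogonal basis formed by the unit matrix and the Pauli spin matrices can be sketched as follows. The matrix `A` is a hypothetical example, and the normalization of the coefficients, $a_\mu = \frac{1}{2}\mathrm{Tr}(\sigma_\mu A)$, is an assumption following from $\mathrm{Tr}(\sigma_\mu^\dagger\sigma_\nu) = 2\delta_{\mu\nu}$; the text itself may normalize its basis differently.

```python
# Unit matrix and Pauli spin matrices as nested lists.
PAULI = [
    [[1, 0], [0, 1]],      # sigma_0 = I
    [[0, 1], [1, 0]],      # sigma_x
    [[0, -1j], [1j, 0]],   # sigma_y
    [[1, 0], [0, -1]],     # sigma_z
]

def hs_inner(a, b):
    """Hilbert-Schmidt inner product Tr(a^dagger b) for 2 x 2 matrices."""
    return sum(complex(a[i][j]).conjugate() * b[i][j]
               for i in range(2) for j in range(2))

# Orthogonality of the basis: (sigma_mu, sigma_nu) = 2 delta_{mu nu}.
gram = [[hs_inner(s, t) for t in PAULI] for s in PAULI]

# Hypothetical Hermitian matrix to be expanded.
A = [[2, 1 - 1j], [1 + 1j, -1]]

# Expansion coefficients a_mu = (sigma_mu, A) / 2; real because A is Hermitian.
coeffs = [hs_inner(s, A) / 2 for s in PAULI]

# Reconstruction A = sum_mu a_mu sigma_mu.
recon = [[sum(c * s[i][j] for c, s in zip(coeffs, PAULI)) for j in range(2)]
         for i in range(2)]

print([c.real for c in coeffs])
print(recon)
```

For this example the coefficients are 0.5, 1, 1 and 1.5, all real, in agreement with the restriction to real linear combinations of Hermitian operators.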
A.11 Convexity
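A.11.1 Convex functions
An (upward) convex function is a function $f(x)$ the graph of which turns its hollow side downwards (cf. figure A.3), i.e. for which $d^2f/dx^2 \le 0$. An example of an (upward) convex function is the function $\ln x$. For the points of every chord connecting two points of a convex function we have $f_{\rm chord}(x) \le f(x)$. Parameterizing the chord between the points $x_1$ and $x_2$ according to $x = \lambda x_1 + (1 - \lambda)x_2$ we have
$$f(\lambda x_1 + (1 - \lambda)x_2) \ge \lambda f(x_1) + (1 - \lambda)f(x_2)$$
for all $\lambda$ in the interval $[0, 1]$. Defining $p_1 = \lambda$, $p_2 = 1 - \lambda$, then $p_1 + p_2 = 1$ and, hence,
$$f(p_1x_1 + p_2x_2) \ge p_1f(x_1) + p_2f(x_2).$$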
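More generally it can be proven that an upward convex function $f$ satisfies
$$f\Big(\sum_i p_i x_i\Big) \ge \sum_i p_i f(x_i), \quad p_i \ge 0, \quad \sum_i p_i = 1.$$
For an (upward) concave function the inverse holds; thus,
$$f\Big(\sum_i p_i x_i\Big) \le \sum_i p_i f(x_i).$$
This is the Jensen inequality ([321], p. 277). Since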
is (upward) convex,
or The function
is (upward) concave for
This means that
or
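The Jensen inequality can be checked numerically for specific functions. In the sketch below the functions $\ln x$ (upward convex in the terminology of this appendix) and $x \ln x$ (upward concave for $x > 0$) are chosen as illustrations; these choices, the sample points and the weights are assumptions of the example and are not taken from the text.

```python
import math

def jensen_gap(f, xs, ps):
    """Return f(sum_i p_i x_i) - sum_i p_i f(x_i) for weights p_i summing to 1."""
    mean = sum(p * x for p, x in zip(ps, xs))
    return f(mean) - sum(p * f(x) for p, x in zip(ps, xs))

xs = [0.5, 1.0, 4.0]   # sample points (hypothetical)
ps = [0.2, 0.5, 0.3]   # probabilities, summing to 1

# ln x turns its hollow side downwards: the gap is non-negative.
gap_log = jensen_gap(math.log, xs, ps)

# x ln x turns its hollow side upwards for x > 0: the gap is non-positive.
gap_xlogx = jensen_gap(lambda x: x * math.log(x), xs, ps)

print(gap_log, gap_xlogx)
```

Equality holds in both cases only if all points $x_i$ with nonvanishing weight coincide.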
A.11.2 Quantum mechanical entropic inequalities
Inequalities for the von Neumann entropy Von Neumann entropy (1.37) satisfies the inequality (cf. Balian [453], p. 113):
Proof: Let
and
be the eigenvectors of A and B, respectively. Since this can be written as
Here the inequality is a consequence of the convexity of the function
Then we have
entailing
Using inequality (A.88) it is proven that the von Neumann entropy satisfies the following
Inequality for mixtures:
Proof: Take (A.88) with
and
This yields
Then (A.89) follows by multiplying with
and summing.
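The inequality for mixtures can be illustrated for 2 × 2 density matrices. The sketch assumes that the inequality has the standard concavity form, $S(\sum_i p_i\rho_i) \ge \sum_i p_i S(\rho_i)$ with $S(\rho) = -\mathrm{Tr}\,\rho\ln\rho$; the two pure states, the weights and the function names are hypothetical choices, and the eigenvalues are computed in closed form for real symmetric matrices only.

```python
import math

def eigenvalues_2x2(rho):
    """Eigenvalues of a real symmetric 2 x 2 matrix."""
    a, b, d = rho[0][0], rho[0][1], rho[1][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return [mean - radius, mean + radius]

def entropy(rho):
    """Von Neumann entropy -Tr(rho ln rho), with the convention 0 ln 0 = 0."""
    return -sum(p * math.log(p) for p in eigenvalues_2x2(rho) if p > 1e-15)

rho_a = [[1.0, 0.0], [0.0, 0.0]]   # pure state |0><0|
rho_b = [[0.5, 0.5], [0.5, 0.5]]   # pure state |+><+|
weights = [0.3, 0.7]

mixture = [[weights[0] * rho_a[i][j] + weights[1] * rho_b[i][j]
            for j in range(2)] for i in range(2)]

entropy_of_mixture = entropy(mixture)
mixture_of_entropies = weights[0] * entropy(rho_a) + weights[1] * entropy(rho_b)

print(entropy_of_mixture, mixture_of_entropies)
```

Both constituents are pure and therefore have vanishing entropy, whereas their mixture has a strictly positive entropy.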
If the operators constitute the spectral representation of a Hermitian operator, then the following inequality is valid for the von Neumann entropy of the Lüders projection (cf. (1.74)) of a density operator
Inequality for Lüders projection:
Proof (e.g. Balian [453], p. 119):
Application of (A.88) with since
and
now yields the desired result
We finally prove an inequality which is valid in case of two correlated systems (cf. section 1.5).
Inequality for correlated systems: Let be the density operator of a system consisting of two subsystems, 1 and 2, the density operators of which are given by and Then
Proof (e.g. Balian [453], p. 115): With and an analogous expression for
we can verify that
The desired result follows once again by direct application of inequality (A.88) with and
Inequalities for the Shannon entropy If the operators are one-dimensional projection operators, then the Shannon entropy, defined by (1.43), satisfies the Klein inequality,
This inequality follows directly from (A.87) if we express in terms of its spectral representation With (if is a one-dimensional projection operator) the following inequality then follows
from which the Klein inequality can be obtained by summing over
If we take in (1.43) for the projection-valued measure corresponding to the eigenvectors of then we get
The right-hand side is precisely the von Neumann entropy defined by (1.37).
Analogously to the von Neumann entropy (cf. (A.89)) the Shannon entropy (1.43) satisfies the following
Inequality for mixtures:
This inequality, too, is most easily proven by means of (A.87). Thus,
directly implies inequality (A.93) since the (upward) concavity of the function entails
from which (A.93) is obtained by summation over
For the Shannon entropy marginals and
Proof: With
in which
and
is a bivariate (N)ODI with we have
we have, because of (A.86),
The expression
is called the mutual information (e.g. [321]). It is a measure of the correlation between the marginal probability distributions and of the bivariate distribution Because of (A.94) we have
with
if and only if there is no correlation, i.e. if
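The behaviour of the mutual information can be illustrated for small bivariate probability distributions. The sketch assumes the standard definition as the sum of the Shannon entropies of the two marginals minus the Shannon entropy of the bivariate distribution; the two distributions and the function names are hypothetical examples.

```python
import math

def shannon(ps):
    """Shannon entropy -sum p ln p, with the convention 0 ln 0 = 0."""
    return -sum(p * math.log(p) for p in ps if p > 0)

def marginals(joint):
    """Row and column marginals of a bivariate distribution p_mn."""
    rows = [sum(row) for row in joint]
    cols = [sum(col) for col in zip(*joint)]
    return rows, cols

def mutual_information(joint):
    """Sum of the marginal entropies minus the entropy of the bivariate distribution."""
    rows, cols = marginals(joint)
    flat = [p for row in joint for p in row]
    return shannon(rows) + shannon(cols) - shannon(flat)

correlated = [[0.4, 0.1], [0.1, 0.4]]
product = [[0.6 * 0.3, 0.6 * 0.7], [0.4 * 0.3, 0.4 * 0.7]]

info_correlated = mutual_information(correlated)
info_product = mutual_information(product)

print(info_correlated, info_product)
```

The mutual information is strictly positive for the correlated distribution and vanishes, up to rounding, for the product of the marginals.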
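A.11.3 Convex subsets of a linear space
Often we are not interested in all possible elements of a linear space but just in a subset. An important instance of this is the convex subset (see section 1.4). This is defined as follows: A subset $\mathcal{S}$ of a linear space $\mathcal{L}$ is convex if $a \in \mathcal{S}$, $b \in \mathcal{S}$ implies that $\lambda a + (1 - \lambda)b \in \mathcal{S}$ for $0 \le \lambda \le 1$, i.e. with $a$ and $b$ the subset $\mathcal{S}$ also contains all vectors lying in between.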
Examples of convex subsets of a linear space are the set of all vectors with the subspaces of the set of all vectors lying within the cones displayed in figure A.4, independently of their length (these are called convex cones of type or the set of all vectors lying within or on the surface of the cone and having their top in the intersecting plane 3 as indicated in figure A.5 (type Convex cones of types and play both a role in quantum mechanics. A convex combination of
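arbitrary vectors $a_i$ is defined according to
$$a = \sum_i \lambda_i a_i, \quad \lambda_i \ge 0, \quad \sum_i \lambda_i = 1. \qquad (A.96)$$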
Extreme elements of a convex set are those elements of which cannot be written as a convex combination of two other elements of i.e. elements not lying in between two other elements of This means that an element a is an extreme element of if implies that In case of a convex cone of type only the vector 0 is an extreme element, since every other vector a is, for instance, a convex combination of the vectors 3/2a and 1/2a (with Convex cones of type do have nontrivial extreme elements, because each ray is represented by one vector only. In figure A.5a the vectors and are extreme elements (the other vectors in the side planes of the tetrahedron are not extreme!); in figure A.5b all elements lying on the surface of the cone are extreme. The set of extreme elements of is denoted by All vectors of the convex cone of figure A.5a can be obtained 3
The intersecting plane is often called a basis of the cone (see e.g. Ludwig [38], p. 363).
as convex combinations of the extreme vectors Also in figure A.5b an arbitrary vector, having its top in the intersecting plane, is a convex combination of the extreme vectors, be it that the summation in (A.96) must be replaced by an integration. These are applications of the Krein-Mil’man theorem (e.g. Ruckle [454]), for convex cones stating that each element can be written as a convex combination of elements of Also in the case of convex cones of type we are interested in characterizing the (extreme) vectors considered above. Since it is impossible to do so on the basis of the definition of extremeness, we should take resort to a different method (e.g. Jameson [455]). Extremal elements are defined as those vectors satisfying the requirement that for every for which it follows that a' ~ a. On this definition the extreme vectors of are extremal vectors of The set of extremal vectors of is denoted as The Krein-Mil’man theorem asserts that each element of a convex cone can be written as a convex combination of elements of An example of a convex set in a linear vector space, having important application in quantum mechanics (cf. section 1.9), is that of the subset of the set of non-negative operators A in the linear space of operators introduced in appendix A. 10, having That the non-negative operators constitute a convex set follows from the fact that a convex combination (A.96) of non-negative operators is a non-negative operator. Since if we have a convex cone. The requirement defines an intersecting plane as depicted in figure A.5. Hence, it is a convex set of type every convex combination satisfying
We now prove that the extreme elements of this convex set are the projection operators onto the one-dimensional subspaces of the Hilbert space in which the operators are defined.
Proof:
Let P be the projection operator of a one-dimensional subspace. Assume Then and It follows from the theorem proven in appendix A.6 that Because of we have Hence, Hence, the operator P is an extreme element. Inversely, assume P > O is an extreme element; Since the eigenvalues of P, we have This implies that and If then we can write P as
This is a convex combination. Hence, we must have Finally, if P is a projection operator, then is equal to the dimension of the range of P. Since is P a projection operator onto a one-dimensional subspace of
A.12 Measures
A.12.1 Measures on a set
Let
be a set. Subsets are denoted by Let be a set of subsets such that and i.e. the subsets constitute a partition of The lattice generated by is a Boolean lattice 4 , having the subsets as atoms. This lattice need not comprise all possible subsets of For instance, it may be possible to consider a much finer partition of generating another lattice of which the above lattice is a sublattice. Let be a mapping of the lattice generated by numbers) such that
Then the mapping 4 5
is a positive additive measure5 on
into
(the set of real
The measure can be a
Or a if the join is allowed to contain infinitely many elements. If the summation in (A.97) is denumerably infinite the measure is called
probability measure if being the probability If
is the complement of
If
that quantity A has value in
then
then
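Joint and conditional probabilities

Let a set of subsets $\{b_n\}$ be a different covering of $\Omega$, corresponding to a different quantity B. Thus, $b_n \cap b_{n'} = \emptyset$ for $n \neq n'$ and $\bigcup_n b_n = \Omega$. Then a joint probability distribution of quantities A and B can be defined according to

$$p(a_m, b_n) = p(a_m \cap b_n). \tag{A.100}$$

Because of the distributive laws it follows from (A.97) and (A.100) that

$$\sum_n p(a_m, b_n) = p(a_m), \qquad \sum_m p(a_m, b_n) = p(b_n).$$

The probability distributions $p(a_m)$ and $p(b_n)$ are called the marginals of the bivariate probability distribution $p(a_m, b_n)$.

Using the joint probability distribution (A.100) it is possible to define, for $p(a_m) \neq 0$, a conditional probability distribution according to

$$p(b_n | a_m) = \frac{p(a_m, b_n)}{p(a_m)}.$$

This satisfies

$$\sum_n p(b_n | a_m) = 1.$$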
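The definitions of joint, marginal and conditional probabilities on a point set can be illustrated with a small computation. The following is a minimal sketch under stated assumptions: the point set (the faces of a die), the uniform measure, and the two partitions `A` and `B` are all hypothetical choices made here for illustration, not taken from the text.

```python
from fractions import Fraction

# Hypothetical point set: the six faces of a die.
omega = {1, 2, 3, 4, 5, 6}

def p(subset):
    """Probability measure of a subset: uniform weight 1/6 per point (an assumption)."""
    return Fraction(len(subset & omega), len(omega))

# Two partitions of omega, standing in for two quantities A and B.
A = {"odd": {1, 3, 5}, "even": {2, 4, 6}}
B = {"low": {1, 2}, "mid": {3, 4}, "high": {5, 6}}

# Joint distribution: probability of the intersection of the two subsets.
joint = {(m, n): p(a & b) for m, a in A.items() for n, b in B.items()}

# Marginals obtained by summing the joint distribution over the other quantity.
marg_A = {m: sum(joint[m, n] for n in B) for m in A}
marg_B = {n: sum(joint[m, n] for m in A) for n in B}

# Conditional distribution of B given A: joint divided by the marginal of A.
cond = {(n, m): joint[m, n] / marg_A[m] for m in A for n in B}
```

Because the lattice of subsets is Boolean, the marginals computed from the joint distribution coincide with the measure of the original subsets, and each conditional distribution sums to 1.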
Convex sets of probability measures

Probability measures constitute a convex subset (cf. appendix A.11.3) of a linear space (the space of all functions defined on the lattice). Indeed, if we have two probability measures, $p_1$ and $p_2$, then $\lambda_1 p_1 + \lambda_2 p_2$ is also a probability measure if $\lambda_1, \lambda_2 \geq 0$ and $\lambda_1 + \lambda_2 = 1$.

In principle, a probability measure can be represented in many ways as a convex combination of two other probability measures. Thus, choose a fixed subset $a$ of $\Omega$. Let $a'$ be the complement of $a$ in $\Omega$. Then, because of (A.97) we have, for every element $b$ of the lattice,

$$p(b) = p(b \cap a) + p(b \cap a') = \lambda\, p_a(b) + (1 - \lambda)\, p_{a'}(b), \tag{A.102}$$

with

$$\lambda = p(a), \qquad p_a(b) = \frac{p(b \cap a)}{p(a)}, \qquad p_{a'}(b) = \frac{p(b \cap a')}{p(a')}. \tag{A.103}$$

The mappings $p_a$ and $p_{a'}$ defined here are probability measures. Since $a$ can be chosen arbitrarily, there exist many convex representations (A.102) of measure $p$. For an extreme element (cf. appendix A.11.3) of the convex set of probability measures the construction (A.102) is impossible by definition, unless $p_a = p_{a'}$. In that case we have in (A.103)

$$\frac{p(b \cap a)}{p(a)} = \frac{p(b \cap a')}{p(a')},$$

yielding $p(b \cap a) = p(a)\,p(b)$. For $b = a$ this implies

$$p(a) = p(a)^2.$$

Hence, $p(a) = 0$ or $p(a) = 1$. This means that an extreme measure must be 0 everywhere, except on atomic elements. Hence, in the case that $\Omega$ consists of discrete points $\omega$ the extreme measure corresponds to the Kronecker delta: $p_{\omega_0}(\omega) = \delta_{\omega\omega_0}$. If we, more generally, denote the atoms of the lattice by $a_m$ ($a_m \in \mathcal{A}$, in which $\mathcal{A}$ is the set of all possible properties A), then in an analogous way with each atom $a_m$ an extreme measure $\delta_{a_m}$ is associated according to $\delta_{a_m}(a_{m'}) = \delta_{mm'}$. Every arbitrary measure $p$ on the lattice generated by $\mathcal{A}$ can be written as a convex combination of all extreme measures:

$$p = \sum_m p(a_m)\, \delta_{a_m}. \tag{A.104}$$

For the given lattice this expansion is unique. We emphasize here that the definition of an extreme measure is dependent on the choice of the atoms of the lattice. If more properties are taken into account we might be able to choose as atoms of the lattice a finer partition of the set $\Omega$, causing the atoms to correspond to smaller subsets. With respect to the new partition the old extreme measures are not extreme any more. Whether a measure is extreme, or not, depends on the topology (e.g. Simmons [456]) of the measure space. However, for measures on a set of discrete points there is an absolute sense in which an extreme measure can be defined. Indeed, the so-called discrete topology determines the finest partition which is possible in a set, viz. a partition in subsets consisting of the separate points of the set. All measures on the lattice generated by the set of points are convex combinations (A.104) of the corresponding extreme measures.
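The expansion of a probability measure on a discrete point set into extreme (Kronecker delta) measures, and the two-term convex representation obtained by conditioning on a subset and its complement, can be sketched as follows. The three-point set, the weights of `p`, and the helper names `delta`, `mix` and `split` are hypothetical, introduced here only for illustration.

```python
from fractions import Fraction

points = ["x1", "x2", "x3"]  # hypothetical discrete point set

def delta(x0):
    """Extreme measure concentrated on the single point x0 (Kronecker delta)."""
    return {x: Fraction(int(x == x0)) for x in points}

def mix(weights, measures):
    """Convex combination sum_i w_i * measure_i, evaluated point by point."""
    return {x: sum(w * mu[x] for w, mu in zip(weights, measures)) for x in points}

# An arbitrary (assumed) probability measure on the points.
p = {"x1": Fraction(1, 2), "x2": Fraction(1, 3), "x3": Fraction(1, 6)}

# Expansion into extreme measures: the coefficients are the values of p itself.
rebuilt = mix([p[x] for x in points], [delta(x) for x in points])

def split(p, a):
    """Two-term convex representation: weight p(a) and the measures
    conditioned on the subset a and on its complement."""
    lam = sum(p[x] for x in a)
    p_a = {x: (p[x] / lam if x in a else Fraction(0)) for x in points}
    p_c = {x: (Fraction(0) if x in a else p[x] / (1 - lam)) for x in points}
    return lam, p_a, p_c

lam, p_a, p_c = split(p, {"x1"})
recombined = mix([lam, 1 - lam], [p_a, p_c])
```

Choosing a different subset in `split` gives a different two-term representation of the same `p`, whereas the expansion into delta measures has no such freedom.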
A.12.2 Measures on a linear vector space; Gleason’s theorem

It is also possible to define probability measures on a linear vector space $\mathcal{H}$. As in appendix A.11.3 we then do not consider subsets of vectors but subsets of rays. This boils down to a definition of probability measures on the lattice of subspaces of $\mathcal{H}$. The important difference with a point set is that the lattice is not Boolean any longer. This implies a number of differences (see also section 1.8.5) which can be seen as characteristic of the difference between quantum mechanical and classical probability theory. In particular, (A.97) must be replaced by

$$p\Big(\bigvee_m \mathcal{H}_m\Big) = \sum_m p(\mathcal{H}_m) \quad \text{for mutually orthogonal subspaces } \mathcal{H}_m.$$

If $\{|\psi_n\rangle\}$ is an orthonormal basis of vectors in $\mathcal{H}$, then, denoting by $p(\psi_n)$ the probability of the one-dimensional subspace spanned by $|\psi_n\rangle$,

$$\sum_n p(\psi_n) = 1. \tag{A.106}$$

For every subspace $\mathcal{H}_s$ of $\mathcal{H}$ we have

$$p(\mathcal{H}_s) + p(\mathcal{H}_s^\perp) = 1,$$

in which $\mathcal{H}_s^\perp$ is the orthogonal complement of $\mathcal{H}_s$. The difference with (A.98) illustrates the essential difference between measures on point sets and measures on vector spaces: whereas for sets we have $a \cup a' = \Omega$, the “ordinary” union of all vectors of $\mathcal{H}_s$ and $\mathcal{H}_s^\perp$ does not constitute the whole vector space $\mathcal{H}$. This difference is particularly important when discussing joint probability distributions. It does not make sense to try to give a definition of these, analogously to (A.100), as

$$p(a_m, b_n) = p(\mathcal{H}_{a_m} \wedge \mathcal{H}_{b_n}), \tag{A.107}$$

taking for $\mathcal{H}_{a_m}$ the subspace spanned by the eigenvectors of A corresponding to eigenvalue $a_m$ (and analogously for B). As a matter of fact, applying this to two one-dimensional subspaces spanned by vectors $|\alpha_m\rangle$ and $|\beta_n\rangle$, respectively, this in general yields

$$p(a_m, b_n) = 0, \tag{A.108}$$

since the meet of two different one-dimensional subspaces contains only the null vector, which plays the role of the empty set here. If $\{|\alpha_m\rangle\}$ and $\{|\beta_n\rangle\}$ are two orthonormal bases without joint vectors, (A.108) is satisfied for all $m$ and $n$. In that case all marginal probabilities would have to vanish, too, thus causing a disagreement with (A.106). This illustrates the problem of the non-existence of a joint probability distribution of incompatible observables, discussed more extensively in section 1.8.4. If in (A.107) the subspaces $\mathcal{H}_{a_m}$ and $\mathcal{H}_{b_n}$ are spanned by vectors of one and the same orthonormal set, then the possibility does exist of defining in this manner a joint probability distribution. This is applicable in the case of two compatible observables (cf. section 1.8.3).
Gleason’s theorem

An important theorem is Gleason’s theorem [457] (see also Varadarajan [458] or Ludwig [38], p. 397-404), given here without proof: Let $\mathcal{P}(\mathcal{H})$ be the set of all possible Hermitian projection operators on subspaces of a linear space $\mathcal{H}$ and $\mathcal{L}(\mathcal{H})$ the lattice of subspaces of $\mathcal{H}$. If the dimension of $\mathcal{H}$ is at least 3, then every probability measure $p$ on $\mathcal{L}(\mathcal{H})$ is represented by a non-negative operator $\rho$ on $\mathcal{H}$ satisfying $\mathrm{Tr}\,\rho = 1$ according to

$$p(P) = \mathrm{Tr}\,\rho P, \qquad P \in \mathcal{P}(\mathcal{H}).$$

Using Gleason’s theorem it is easily seen that the structure of the convex set of probability measures on a linear space is very different from that of the probability measures on a point set. This can be seen most clearly by looking at the extreme elements. For this purpose we make use of the property of the convex set of nonnegative operators having trace equal to 1, proven in appendix A.11.3, that the extreme elements are the Hermitian projection operators onto the one-dimensional subspaces of $\mathcal{H}$. The spectral representation

$$\rho = \sum_i p_i P_i, \tag{A.110}$$

in which $p_i$ are the (positive) eigenvalues of $\rho$ and $P_i$ are the (one-dimensional) projection operators onto the corresponding eigenvectors of $\rho$, is an expansion, analogous to (A.104), into the set of extreme measures. There are two big differences with (A.102), however.

1. In general, expansion (A.110) is not unique. An example, given by Park [281], is the following: Let $|\alpha\rangle$ and $|\beta\rangle$ be two orthonormal vectors, and $0 < p < 1$. Define

$$\rho = p\,|\alpha\rangle\langle\alpha| + (1 - p)\,|\beta\rangle\langle\beta|.$$

Defining

$$|\phi_\pm\rangle = \sqrt{p}\,|\alpha\rangle \pm \sqrt{1 - p}\,|\beta\rangle$$

we have

$$\rho = \tfrac{1}{2}\,|\phi_+\rangle\langle\phi_+| + \tfrac{1}{2}\,|\phi_-\rangle\langle\phi_-|$$

and

$$\langle\phi_+|\phi_-\rangle = 2p - 1.$$

Note that vectors $|\phi_+\rangle$ and $|\phi_-\rangle$ are normalized, but, for $p \neq \frac{1}{2}$, not mutually orthogonal. However, orthogonality is not a requirement that has to be fulfilled for a convex expansion into extreme measures. Hence, this example clearly demonstrates that the expansion (A.110) is not unique. For the special case $p = \frac{1}{2}$ there even exist many orthonormal bases allowing a convex expansion.

2. In the case of a point set there is one single unique set of extreme measures (the delta measures) into which all probability measures can be expanded in a convex way (compare (A.104)). In this case we have a simplex (e.g. Holevo [34]). In the case of a linear vector space the probability measures do not constitute a simplex. In principle, it would be possible to define, analogously to (A.104), a convex expansion into all possible extreme elements, assuming, for instance, in the expansion (A.110) that the probabilities are 0 for all other extreme elements than $P_i$. However, due to the non-uniqueness of the expansion, discussed in point 1, these probabilities could be nonvanishing in another representation. This dependency on the representation thwarts a consistent interpretation of the numbers $p_i$ as probabilities.
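The non-uniqueness described in point 1 can be checked numerically. The following is a minimal sketch, assuming a two-dimensional real vector space and the illustrative weight `p = 0.8`; the vectors `phi_plus` and `phi_minus` are a choice made here for illustration (normalized but not orthogonal), and are not claimed to reproduce the cited example literally.

```python
import math

def outer(v, w):
    """Matrix of the operator |v><w| for real vectors."""
    return [[vi * wj for wj in w] for vi in v]

def combo(terms):
    """Sum of weight * |v><v| over (weight, vector) pairs."""
    n = len(terms[0][1])
    rho = [[0.0] * n for _ in range(n)]
    for weight, v in terms:
        proj = outer(v, v)
        for i in range(n):
            for j in range(n):
                rho[i][j] += weight * proj[i][j]
    return rho

p = 0.8                                 # assumed eigenvalue, 0 < p < 1
alpha, beta = [1.0, 0.0], [0.0, 1.0]    # orthonormal eigenvectors

# Spectral (orthogonal) expansion of the density operator.
rho_spectral = combo([(p, alpha), (1 - p, beta)])

# Alternative expansion into two non-orthogonal one-dimensional projections.
phi_plus = [math.sqrt(p), math.sqrt(1 - p)]
phi_minus = [math.sqrt(p), -math.sqrt(1 - p)]
rho_alt = combo([(0.5, phi_plus), (0.5, phi_minus)])

# Inner product of the two vectors; it equals 2p - 1, nonzero unless p = 1/2.
overlap = sum(a * b for a, b in zip(phi_plus, phi_minus))
```

Both expansions yield the same matrix, although one uses weights (p, 1 - p) and the other weights (1/2, 1/2); this is the representation dependence that prevents reading the weights as probabilities of extreme elements.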
A.12.3 Operator-valued and functional-valued measures

Operator-valued measures

Analogously to appendix A.12.1 we consider a mapping $M$ of the lattice of subsets $\Delta$ of a set $\Omega$ into a space of operators like those of appendix A.10 rather than into the set of real numbers. This is an operator-valued measure (OVM) if the operators $M(\Delta)$ satisfy the following conditions

$$M(\emptyset) = O, \tag{A.112}$$

$$M\Big(\bigcup_m \Delta_m\Big) = \sum_m M(\Delta_m) \quad \text{if } \Delta_m \cap \Delta_{m'} = \emptyset,\ m \neq m', \tag{A.113}$$

$$M(\Omega) = I. \tag{A.114}$$

We have a positive operator-valued measure (POVM) if, moreover,

$$M(\Delta) \geq O.$$

An example of a positive operator-valued measure is a projection-valued measure (PVM), in which $\Omega$ is the spectrum of a Hermitian operator $A$, and $\Delta$ is a subset of the spectrum (for instance, a Borel set), and $M(\Delta) = \sum_{a_m \in \Delta} E_m$, with $A = \sum_m a_m E_m$ the spectral representation, in agreement with (1.8), associated with $A$. It is not difficult to prove that the projection operators satisfy all requirements given above.
Non-orthogonal decomposition of the unit operator

In general, a positive operator-valued measure is associated with a generalization of the orthogonal decomposition of the identity operator (ODI), corresponding to the spectral representation of a Hermitian operator, to the concept of a non-orthogonal decomposition of the identity operator (NODI). This is a set of operators $\{M_m\}$ satisfying

$$M_m \geq O, \qquad \sum_m M_m = I. \tag{A.116}$$

This reduces to an orthogonal decomposition if operators $M_m$ are projection operators.

The concept of a NODI differs from that of a POVM in that the first contains only a subset of the elements of the second, corresponding to a partition of the index set. However, by means of (A.113) the elements of the POVM can be generated by the elements of the NODI (plus the operator $O$). For this reason, somewhat sloppily the set $\{M_m\}$ satisfying (A.116) is often referred to as a POVM, too (e.g. Peres [48], p. 283). Strictly speaking this is not correct, however. We now prove that a NODI consisting of projection operators must correspond to a PVM. The proof follows from the following theorem as a special case.

Theorem: The sum of a set of Hermitian projection operators is a projection operator if and only if the operators project (orthogonally) onto mutually orthogonal subspaces. Thus, if $P = \sum_m P_m$, $P_m^2 = P_m = P_m^\dagger$, then

$$P^2 = P \iff P_m P_{m'} = \delta_{mm'} P_m.$$

Proof: If $P_m P_{m'} = \delta_{mm'} P_m$, then

$$P^2 = \sum_{m,m'} P_m P_{m'} = \sum_m P_m = P.$$

If $P^2 = P$, then consider the ranges $\mathcal{H}_m$ of the projection operators $P_m$. The range of the projection operator $P$ is the subspace spanned by the vectors of the subspaces $\mathcal{H}_m$. This means that subspace $\mathcal{H}_m$ is in the range of $P$, entailing $P P_m = P_m$. As a consequence

$$P_m = P_m P P_m = P_m + \sum_{m' \neq m} P_m P_{m'} P_m.$$

Since every operator $P_m P_{m'} P_m$ satisfies

$$P_m P_{m'} P_m \geq O,$$

it follows that

$$P_m P_{m'} P_m = O, \qquad m' \neq m.$$

This yields $(P_{m'} P_m)^\dagger (P_{m'} P_m) = O$, from which we find

$$P_{m'} P_m = O, \qquad m' \neq m,$$

and

$$P_m P_{m'} = \delta_{mm'} P_m.$$

The proof that a NODI consisting of projection operators must be an ODI follows from this theorem by taking $P = I$.
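The theorem on sums of projection operators can be illustrated with 2 × 2 matrices. This is a minimal sketch using exact rational arithmetic; the three projection operators `P0`, `P1` and `Pd` are hypothetical examples chosen here, and the helper names are not from the text.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def is_idempotent(p):
    """True if p squared equals p (all examples below are symmetric as well)."""
    return matmul(p, p) == p

P0 = [[F(1), F(0)], [F(0), F(0)]]              # projects onto (1, 0)
P1 = [[F(0), F(0)], [F(0), F(1)]]              # projects onto (0, 1), orthogonal to P0
Pd = [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]]  # projects onto (1, 1)/sqrt(2), not orthogonal to P0

orthogonal_sum = add(P0, P1)   # sum of mutually orthogonal projections
oblique_sum = add(P0, Pd)      # sum of non-orthogonal projections
```

The sum `orthogonal_sum` is again a projection operator (here the identity, so the pair forms an ODI), whereas `oblique_sum` fails the idempotency test, in line with the theorem.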
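Function-valued and functional-valued measures

Analogously to an OVM a function-valued measure (FVM) is a mapping $\Delta \to f_\Delta$ of the lattice of subsets of $\Omega$ into a linear space of functions on $\Omega$, satisfying relations analogous to (A.112) through (A.114):

$$f_\emptyset(\omega) = 0, \qquad f_{\bigcup_m \Delta_m}(\omega) = \sum_m f_{\Delta_m}(\omega) \ \ (\Delta_m \cap \Delta_{m'} = \emptyset,\ m \neq m'), \qquad f_\Omega(\omega) = 1. \tag{A.117-A.119}$$

An example of an FVM is the set of indicator functions $\chi_{\Delta_m}$ of a partition $\{\Delta_m\}$ of the point set $\Omega$ (i.e. functions $\chi_{\Delta_m}(\omega)$ that are equal to 1 if $\omega \in \Delta_m$ and 0 otherwise). For these indicator functions we have

$$\sum_m \chi_{\Delta_m}(\omega) = 1, \qquad \omega \in \Omega.$$

Hence, (A.117) through (A.119) are satisfied. Because of the relation

$$\chi_{\Delta_m}(\omega)\, \chi_{\Delta_{m'}}(\omega) = \delta_{mm'}\, \chi_{\Delta_m}(\omega)$$

this FVM can be considered as an analogue of a projection-valued measure. A refinement of the partition $\{\Delta_m\}$ of $\Omega$ is obtained by means of a partition of each of the subsets $\Delta_m$ according to

$$\Delta_m = \bigcup_n \Delta_{mn}, \qquad \Delta_{mn} \cap \Delta_{mn'} = \emptyset,\ n \neq n'.$$

The two FVMs $\{\chi_{\Delta_m}\}$ and $\{\chi_{\Delta_{mn}}\}$ now satisfy the following relation:

$$\chi_{\Delta_m}(\omega) = \sum_n \chi_{\Delta_{mn}}(\omega). \tag{A.121}$$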
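Using the discrete topology, the partition of $\Omega$ can be refined in such a way that every subset contains precisely one single point. This suggests the definition of a functional-valued measure $\delta_\omega\, d\omega$, with $\delta_\omega$ the delta functional concentrated on the point $\omega$. Analogously to (A.121) every indicator FVM can be expressed in terms of the measure $\delta_\omega\, d\omega$ according to

$$\chi_\Delta(\omega') = \int_\Delta \delta_\omega(\omega')\, d\omega.$$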
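Relation (A.121) is an example of a partial ordering existing between FVMs. More generally, it is possible to define a new FVM $\{g_m\}$ related to FVM $\{\chi_{\Delta_n}\}$ according to

$$g_m(\omega) = \sum_n \lambda_{mn}\, \chi_{\Delta_n}(\omega), \qquad \lambda_{mn} \geq 0, \quad \sum_m \lambda_{mn} = 1. \tag{A.122}$$

Whereas the indicator function $\chi_{\Delta_m}$ yields a sharp representation of the set $\Delta_m$, the function $g_m$ defined by (A.122) can be seen as a fuzzy representation, the measure of fuzziness being determined by the deviation of $\lambda_{mn}$ from $\delta_{mn}$. The FVM $\{g_m\}$ corresponds to a partition of $\Omega$ in so-called “fuzzy” sets. The matrix $(\lambda_{mn})$ is a stochastic matrix (cf. appendix A.13).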
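The passage from a sharp partition to a fuzzy one can be made concrete with a small computation. The sketch below is illustrative only: the six-point set, the partition, and the stochastic matrix `lam` (nonnegative entries, each column summing to 1) are hypothetical choices made here.

```python
from fractions import Fraction as F

omega = range(6)                       # hypothetical point set {0, ..., 5}
partition = [{0, 1}, {2, 3}, {4, 5}]   # a sharp partition into three subsets

def indicator(subset):
    """Indicator function of a subset: 1 on the subset, 0 elsewhere."""
    return lambda w: 1 if w in subset else 0

chi = [indicator(s) for s in partition]

# Hypothetical stochastic matrix: nonnegative entries, each column sums to 1.
lam = [
    [F(3, 4), F(1, 4), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0),    F(1, 4), F(3, 4)],
]

def fuzzy(m):
    """Fuzzy set function g_m(w) = sum_n lam[m][n] * chi_n(w)."""
    return lambda w: sum(lam[m][n] * chi[n](w) for n in range(len(chi)))

g = [fuzzy(m) for m in range(len(lam))]
```

Because every column of `lam` sums to 1, the functions `g[m]` still add up to 1 at each point, so they form an FVM; they take values strictly between 0 and 1 wherever `lam` deviates from the unit matrix.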
A.12.4 Mathematical representation of (N)ODIs

Operators on an $N$-dimensional vector space can be represented by $N \times N$ matrices. Operators can themselves be considered as vectors in a linear vector space (cf. appendix A.10). In case of complex vectors the vector space of operators is an $N^2$-dimensional complex space. This means that an operator can be represented as a vector in a $2N^2$-dimensional real vector space. Hermitian operators have matrices satisfying $A_{ij} = A_{ji}^*$. This means that the number of real degrees of freedom of a Hermitian operator is restricted to $N^2$. For this reason Hermitian operators can be represented in an $N^2$-dimensional real vector space (real, since $\lambda A$ is Hermitian only if $\lambda$ is real).
In general the positive operators do not constitute a linear vector space, since $\lambda A$ is not positive in general if $\lambda$ can be negative. The positive operators do constitute a convex cone, the so-called positive cone, in the linear space of Hermitian operators. A (N)ODI (cf. appendix A.12.3) can be represented by a set of vectors in the positive cone. As illustrated in figure A.6, the operators $M_m$ of a NODI define a convex cone (of type 1, cf. appendix A.11.3). The cone of figure A.6b corresponds to a NODI consisting of a (continuously) infinite number of elements on the surface of the cone. An example of this type is the NODI given by the operators $\frac{1}{4\pi}(I + \mathbf{n}\cdot\boldsymbol{\sigma})\, d\mathbf{n}$, with $\mathbf{n}$ a unit vector and $\boldsymbol{\sigma}$ the Pauli spin matrices (cf. section 8.3.3). It is possible that, like in figure A.6c, some of the operators $M_m$ do not belong to the set of extremal elements of the cone, but lie inside it. Using the Krein-Mil’man theorem (appendix A.11.3) it is possible to write such a non-extremal element $M_{m_0}$ as a combination, with nonnegative coefficients, of extremal elements:

$$M_{m_0} = \sum_{m \neq m_0} c_m M_m, \qquad c_m \geq 0.$$

We can then define a new NODI in which the non-extremal element is omitted, and the extremal elements are replaced by

$$M'_m = (1 + c_m)\, M_m, \qquad m \neq m_0.$$

The non-extremal POVM represents a nonideal measurement of this new POVM. For instance, let $\{M_1, M_2, M_3\}$ be a NODI with $M_3$ non-extremal and $M_1$, $M_2$ extremal. Then $M_3 = c_1 M_1 + c_2 M_2$ and

$$\begin{pmatrix} M_1 \\ M_2 \\ M_3 \end{pmatrix} = \begin{pmatrix} \frac{1}{1+c_1} & 0 \\ 0 & \frac{1}{1+c_2} \\ \frac{c_1}{1+c_1} & \frac{c_2}{1+c_2} \end{pmatrix} \begin{pmatrix} M'_1 \\ M'_2 \end{pmatrix}.$$

If there are more non-extremal elements, these can be successively removed in an analogous way. It is easily verified that the nonideality matrix has a (non-unique) left inverse. Since not all of its elements are positive, the inverse matrix cannot be a nonideality matrix. Hence, the POVMs are not equivalent in the sense defined in section 7.6.7. However, they are informationally equivalent.
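A nonequivalent but informationally equivalent extremal POVM can also be obtained by eliminating all non-extremal elements simultaneously. Thus, writing every non-extremal element $M_k$ as $M_k = \sum_m c_{km} M_m$, $c_{km} \geq 0$, the sum running over the extremal elements, define

$$M'_m = \Big(1 + \sum_k c_{km}\Big) M_m.$$

Then $\{M'_m\}$ is the informationally equivalent extremal POVM.

A.13 Some properties of stochastic matrices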
A stochastic matrix (e.g. Ortega [319]) is a matrix $(\lambda_{mn})$ with elements satisfying

$$\lambda_{mn} \geq 0, \qquad \sum_m \lambda_{mn} = 1. \tag{A.124}$$
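The matrix need not be square. An example is the $2 \times 3$ matrix

$$\begin{pmatrix} \frac{1}{2} & 1 & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{2}{3} \end{pmatrix}.$$

Theorem: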
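The product of two stochastic matrices is another stochastic matrix.

Proof: Let $(\lambda_{mn})$ and $(\mu_{nk})$ be two stochastic matrices that can be multiplied according to $\nu_{mk} = \sum_n \lambda_{mn}\mu_{nk}$. Due to the positivity of the matrix elements of these matrices the product matrix has positive elements, too. Moreover, due to $\sum_m \lambda_{mn} = 1$ and $\sum_n \mu_{nk} = 1$ it follows that

$$\sum_m \nu_{mk} = \sum_n \Big(\sum_m \lambda_{mn}\Big)\mu_{nk} = \sum_n \mu_{nk} = 1.$$

The next theorems apply only to square stochastic matrices.

Theorem: A square stochastic matrix has an eigenvalue 1.

Proof: A necessary and sufficient condition for a stochastic matrix to have an eigenvalue 1 is that the characteristic equation

$$\det(\lambda_{mn} - \kappa\,\delta_{mn}) = 0$$

has a solution $\kappa = 1$. That this, indeed, is the case follows from (A.124) by adding all rows of the determinant: for $\kappa = 1$ the sum of all rows vanishes.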
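The two theorems above can be checked on a small example. The sketch below assumes the convention that a stochastic matrix has nonnegative entries and columns summing to 1; the matrices `A` and `B` are hypothetical, and exact rational arithmetic is used so that the checks are not affected by rounding.

```python
from fractions import Fraction as F

def is_stochastic(mat):
    """Nonnegative entries and every column summing to 1 (convention assumed here)."""
    cols = range(len(mat[0]))
    nonneg = all(x >= 0 for row in mat for x in row)
    return nonneg and all(sum(row[j] for row in mat) == 1 for j in cols)

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

# Two non-square stochastic matrices that can be multiplied (2x3 times 3x2).
A = [[F(1, 2), F(1), F(1, 3)],
     [F(1, 2), F(0), F(2, 3)]]
B = [[F(1, 4), F(1)],
     [F(1, 2), F(0)],
     [F(1, 4), F(0)]]

C = matmul(A, B)  # square 2x2 product

# Adding all rows of C gives a row of ones, so adding all rows of C - I gives zeros.
row_sum = [sum(C[i][j] for i in range(len(C))) for j in range(len(C[0]))]

# Determinant of C - I for the 2x2 case; it vanishes, so 1 is an eigenvalue.
det_C_minus_I = (C[0][0] - 1) * (C[1][1] - 1) - C[0][1] * C[1][0]
```

The row of ones is a left eigenvector of `C` with eigenvalue 1, which is the content of the row-addition argument in the proof.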
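Theorem: An eigenvector $a$ of a square stochastic matrix corresponding to an eigenvalue $\kappa$ that is different from 1 has components $a_n$ satisfying $\sum_n a_n = 0$.

Proof:

$$\sum_n \lambda_{mn} a_n = \kappa\, a_m \ \Rightarrow\ \sum_n \Big(\sum_m \lambda_{mn}\Big) a_n = \kappa \sum_m a_m.$$

With (A.124) this gives $(1 - \kappa)\sum_n a_n = 0$, from which the desired result directly follows.

Theorem: All eigenvalues $\kappa$ of a square stochastic matrix satisfy

$$|\kappa| \leq 1. \tag{A.125}$$

Proof:

$$\kappa\, a_m = \sum_n \lambda_{mn} a_n.$$

This implies

$$|\kappa|\,|a_m| \leq \sum_n \lambda_{mn} |a_n|$$

and, hence, after summation over $m$,

$$|\kappa| \sum_m |a_m| \leq \sum_n |a_n|.$$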
Note that the eigenvalues of a stochastic matrix need not be positive, or even real. For instance, the matrix

$$\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$$

has eigenvalues 1 and $-\frac{1}{2} \pm \frac{1}{2} i \sqrt{3}$.

Theorem: With $\kappa$ also $\kappa^*$ is an eigenvalue of a square stochastic matrix.

Proof: This follows directly from the fact that a stochastic matrix is real. Hence, an eigenvector corresponding to $\kappa^*$ is the complex conjugate of an eigenvector corresponding to eigenvalue $\kappa$.
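Theorem: All eigenvalues $\kappa_n$ of a square stochastic $N \times N$ matrix, obtained by means of permutation of the rows and columns of the unit matrix, satisfy $|\kappa_n| = 1$.

Proof: For an arbitrary square $N \times N$ matrix $\Lambda$ with eigenvalues $\kappa_n$ we have

$$|\det \Lambda| = \prod_{n=1}^{N} |\kappa_n|.$$

Since $|\det \Lambda|$ is invariant under a permutation of rows and columns of the matrix, it follows that for permutations of the unit matrix we must have:

$$\prod_{n=1}^{N} |\kappa_n| = 1.$$

Because of (A.125) this yields $|\kappa_n| = 1$ for all $n$.

Theorem: The stochastic $N \times N$ matrix with $\lambda_{mn} = 1/N$ for all $m$, $n$ has one eigenvalue 1 and $N - 1$ eigenvalues 0.

Proof: It can easily be verified that

$$\det(\lambda_{mn} - \kappa\,\delta_{mn}) = (-\kappa)^{N-1}(1 - \kappa).$$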
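The last two theorems can be checked numerically for a particular case. The following is a minimal sketch, assuming N = 4, a cyclic permutation of the unit matrix as the permutation example, and Fourier-type trial vectors as eigenvectors; these choices are made here for illustration. For the uniform matrix the check uses a slightly different route than the characteristic determinant: the matrix is idempotent (so its eigenvalues are 0 or 1) and has trace 1 (so exactly one eigenvalue equals 1).

```python
import cmath
from fractions import Fraction as F

N = 4  # assumed matrix size

def apply(mat, vec):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(mat[m][n] * vec[n] for n in range(len(vec))) for m in range(len(mat))]

# Cyclic permutation of the unit matrix: row m has its 1 in column (m - 1) mod N.
S = [[1 if n == (m - 1) % N else 0 for n in range(N)] for m in range(N)]

eigenvalues = []
for k in range(N):
    w = cmath.exp(2j * cmath.pi * k / N)   # an N-th root of unity
    v = [w ** m for m in range(N)]         # trial eigenvector
    Sv = apply(S, v)
    kappa = Sv[0] / v[0]                   # candidate eigenvalue
    # Confirm that v really is an eigenvector with eigenvalue kappa.
    assert all(abs(Sv[m] - kappa * v[m]) < 1e-12 for m in range(N))
    eigenvalues.append(kappa)

# Uniform stochastic matrix with all elements equal to 1/N.
U = [[F(1, N)] * N for _ in range(N)]
U_squared = [[sum(U[m][k] * U[k][n] for k in range(N)) for n in range(N)] for m in range(N)]
trace_U = sum(U[m][m] for m in range(N))
```

All four eigenvalues of `S` found this way lie on the unit circle, and two of them are not real, which also illustrates the remark that eigenvalues of a stochastic matrix need not be real.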
Bibliography

[1] P.A.M. Dirac, The principles of quantum mechanics (4th rev. ed.), Clarendon, Oxford, 1967 (first edition 1930). [2] J. von Neumann, Mathematische Grundlagen der Quantenmechanik, Springer, Berlin, 1932; or, Mathematical foundations of quantum mechanics, Princeton Univ. Press, 1955.
[3] E. Kreyszig, Introductory functional analysis with applications, Wiley, New York, etc., 1978. [4] S.J.L. Eijndhoven and J. de Graaf, A mathematical introduction to Dirac’s formalism, North–Holland, Amsterdam, 1986. [5] W.H. Louisell, Radiation and noise in quantum electronics, McGraw–Hill, London, 1964. [6] J.R. Klauder, E.C.G. Sudarshan, Fundamentals of quantum optics, Benjamin, 1968. [7] R. Loudon, Quantum Theory of Light, 2nd edition, Clarendon, Oxford, 1983, section 6.8. [8] G.G. Emch, Algebraic methods in statistical mechanics and quantum field theory, Wiley–Interscience, New York, 1972. [9] R. Haag and B. Schroer, Journ. Math. Phys. 3, 248 (1962). [10] W.M. de Muynck, Found. of Phys. 14, 199 (1984). [11] G.C. Hegerfeldt, Phys. Rev. D 10, 3320 (1974); G.C. Hegerfeldt, Phys. Rev. Lett. 72, 596 (1994); D. Buchholz and J. Yngvason, Phys. Rev. Lett. 73, 613 (1994); B. Schroer, in: Trends in Quantum Mechanics, H.-D. Doebner et al. eds, World Scientific, Singapore, 2000, p. 274; G.C. Hegerfeldt, in: Irreversibility and causality in quantum theory - Semigroups and rigged Hilbert spaces, A. Bohm, H.-D. Doebner and P. Kielanowski, eds., Springer Berlin, 1998.
[12] S.N.M. Ruijsenaars, Ann. Phys. (N.Y.) 137, 33 (1981). [13] M. Reed, B. Simon, Methods of modern mathematical physics, Academic Press, 1972. [14] E. Schmidt, Math. Ann. 63, 433 (1906). [15] W.Heisenberg, Zeitschr. f. Phys. 43, 172 (1927). [16] M. Cini, J.-M. Lévy-Leblond eds., Quantum theory without reduction, Adam Hilgers, Bristol and New York, 1990. [17] L.E. Ballentine, Found. of Phys. 20, 1329 (1990). [18] W.H. Furry, “Quantum theory of measurement”, in: Lectures in Theoretical Physics 8a, Statistical Physics and solid state physics, ed. W.E. Brittin, The University of Colorado Press, 1966, p. 1-64. [19] G. Lüders, Ann. der Phys., 6, Folge 8, 322 (1951). [20] H.P. Robertson, Phys. Rev. 34, 163 (1929). [21] H. Kennard, Zeitschr. f. Phys. 44, 329 (1927). [22] J. Uffink and J. Hilgevoord, Phys. Lett. A 105, 176 (1984). [23] V.V. Dodonov, V.I. Man’ko, Generalization of the uncertainty relations in quantum mechanics, in: M.A. Markov ed., Invariants and the evolution of nonstationary quantum systems, Nova Science Publ., Commack, 1989, p. 3. [24] D. Deutsch, Phys. Rev. Lett. 50, 631 (1983). [25] M.H. Partovi, Phys. Rev. Lett. 50, 1883 (1983). [26] K. Kraus, Phys. Rev. D 35, 3070 (1987). [27] H. Maassen and J.B.M. Uffink, Phys. Rev. Lett. 60, 1103 (1988). [28] H. Martens and W. de Muynck, Found. of Phys. 20, 255, 357 (1990). [29] J.M. Jauch, Foundations of Quantum Mechanics, Addison–Wesley Publ. Cy., Reading Mass., 1966. [30] C. Piron, Foundations of Quantum Physics, W.A. Benjamin, Inc., Reading Mass., 1976. [31] G. Birkhoff and J. von Neumann, Ann. Math. 37, 823 (1936); G. Birkhoff, “Lattices in Applied Mathematics”, Proc. Symp. Pure Math., Vol. II, Amer. Math. Soc., 1961.
[32] J.M. Jauch, Synthese 29, 131 (1974). [33] J.S. Bell, Physics 1, 195 (1964). [34] A.S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory, North– Holland, Amsterdam, 1982. [35] W. Heisenberg, Physics and beyond, Harper and Row, New York, 1971, p. 63. [36] E.B. Davies, J.T. Lewis, Comm. Math. Phys. 17, 239 (1970). [37] E.B. Davies, Quantum Theory of Open Systems, Academic Press, London, 1976. [38] G. Ludwig, Foundations of Quantum Mechanics, Springer, Berlin, 1983, Vol. I. [39] K. Kraus, States, effects, and operations, Springer-Verlag, Berlin, Heidelberg, New York, Tokyo, 1983. [40] E. Prugovecki, Stochastic quantum mechanics and quantum spacetime, Reidel, Dordrecht, 1984. [41] P. Busch, P.J. Lahti, and P. Mittelstaedt, The quantum theory of measurement, Springer, 1991. [42] P. Busch, M. Grabowski, and P.J. Lahti, Operational quantum mechanics, Springer, Berlin, Heidelberg, 1995. [43] W.M. de Muynck and J.P.H.W. van den Eijnde, Found. of Phys. 14, 111 (1984). [44] W.M. de Muynck and J.M.V.A. Koelman, Phys. Lett. A 98, 1 (1983). [45] M. Naimark, Izv. Akad. Nauk SSSR Ser. Mat. 4, 277 (1940). [46] C. Helstrom, Quantum detection and estimation theory, Academic Press, New York, 1976. [47] N.I. Akhiezer and I.M. Glazman, Theory of linear operators in Hilbert space, Pitman Advanced Publishing Program, Boston, London, Melbourne, 1981. [48] A. Peres, Quantum theory: concepts and methods, Kluwer, Dordrecht, 1993. [49] P. Kruszynski and W. de Muynck, Journ. Math. Phys. 28, 1761 (1987). [50] L. Susskind and J. Glogower, Physics 1, 49 (1964); P. Carruthers and M.M. Nieto, Rev. Mod. Phys. 40, 411 (1968).
[51] W.P. Schleich and S.M. Barnett, eds., Quantum phase and phase dependent measurements, Physica Scripta Vol. T48 (1993), p. 3-142. [52] R. Lynch, Physics Reports 256, 367 (1995). [53] U. Leonhardt, J. Vaccaro, B. Böhmer, and H. Paul, Phys. Rev. A 51, 84 (1995). [54] S.M. Barnett and D.T. Pegg, Journ. Phys. A Math. Gen. 19, 3849 (1986). [55] J.H. Shapiro and S.R. Shepard, Phys. Rev. A 43, 3795 (1991). [56] J.W. Noh, A. Fougères, L. Mandel, Phys. Rev. Lett. 67, 1426 (1991); Phys. Rev. Lett. 71, 2579 (1993); Phys. Rev. A 45, 424 (1992); Phys. Rev. A 46, 2840 (1992). [57] A. Messiah, Quantum mechanics, North–Holland Publ. Cy, Amsterdam, 1967. [58] L.E. Ballentine, Y. Yumin, and J.P. Zibin, Phys. Rev. A 50, 2854 (1994). [59] H.J. Groenewold, Physica 12, 405 (1946). [60] L. van Hove, Proc. Roy. Acad. Sci. Belg. 26, 1 (1951). [61] P. Chernoff, Hadronic Journ. 4, 879 (1981). [62] H. Weyl, Zeitschr. f. Phys. 46, 1 (1927); E. Wigner, Phys. Rev. 40, 749 (1932); J.E. Moyal, Proc Cambr. Phil. Soc. 45, 99 (1949); L. Cohen, Journ. Math. Phys., 7, 781 (1966); G.S. Agarwal and E. Wolf, Phys. Rev. D 2, 2161, 2187, 2206 (1970); M. Hillery, R.F. O’Connell, M.O. Scully, E.P. Wigner, Physics Reports 106, 121 (1984). [63] E. Wigner, Phys. Rev. 40, 749 (1932). [64] H. Weyl, Zeitschr. f. Phys. 46, 1 (1927). [65] K. Husimi, Proc. Phys. Math. Soc. Japan 22, 67 (1940). [66] E. Wigner, in Perspectives in Quantum Theory, W. Yourgrau and A. van der Merwe, eds., MIT Press Cambridge Mass., 1971, p. 25. [67] M.D. Srinivas and E. Wolf, Phys. Rev. D 11, 1477 (1975). [68] J.E. Moyal, Proc Cambr. Phil. Soc. 45, 99 (1949). [69] G.S. Agarwal and E. Wolf, Phys. Rev. D 2, 2161, 2187, 2206 (1970). [70] N.G. van Kampen, Stochastic processes in physics and chemistry, North–Holland, Amsterdam, 1981.
[71] W.M. de Muynck and J.T. van Stekelenborg, Ann. der Phys., 7. Folge, 45, 222 (1988). [72] N. Bohr, Essays 1958–1962 on Atomic Physics and Human Knowledge, Interscience Publishers, New York, 1963. [73] W. Heisenberg, Zeitschr. f. Phys. 33, 879 (1925). [74] J. Mehra and H. Rechenberg, The historical development of quantum theory, Springer-Verlag, 1982. [75] J.A. Wheeler, in: Problems in the Foundations of Physics, G. Toraldo di Francia, ed., North–Holland Publ. Cy., Amsterdam, New York, Oxford, 1979, p. 395. [76] J.A. Wheeler, in Quantum Theory and Measurement, eds. J.A. Wheeler and W.H. Zurek, Princeton Univ. Press, 1983, p. 182. [77] J.A. Wheeler and R.P. Feynman, Rev. Mod. Phys. 17, 157 (1945); 21, 425 (1949). [78] B.C. van Fraassen, Quantum mechanics, an empiricists view, Clarendon Press, 1991. [79] S.G. Brush, The kind of motion we call heat, North–Holland Personal Library, Amsterdam, 1986, sections 8.6, 8.7. [80] S.M. Dancoff, Bulletin of the Atomic Scientists 8, 139 (1952). [81] H. Reichenbach, Philosophical Foundations of Quantum Mechanics, University of California Press, 1944. [82] G. Ludwig, Foundations of Quantum Mechanics, Springer, Berlin, 1983, Vols. I and II. [83] P.W. Bridgman, in Albert Einstein: Philosopher-Scientist, P.A. Schilpp ed., Cambridge University Press, London, Third edition, 1982, pp. 333-354. [84] F. Suppe, The structure of scientific theories, University of Illinois Press, Urbana, Chicago, London, 1977. [85] I. Hacking, Philosophical Topics 13, 154 (1982). [86] N.D. Mermin, Physics Today, April 1985, 38. [87] A. Pais, ‘Subtle is the Lord...’, Oxford University Press, 1982. [88] W.M. de Muynck, Phys. Lett. A 94, 73 (1983).
[89] H. Primas, Chemistry, quantum mechanics, and reductionism, Springer-Verlag, Berlin, 1983. [90] A. Einstein, Reply to criticisms, in Albert Einstein: Philosopher-Scientist, P.A. Schilpp ed., Cambridge University Press, London, Third edition, 1982, p. 665-688. [91] A. Einstein, Autobiographical notes, in Albert Einstein: Philosopher-Scientist, P.A. Schilpp ed., Cambridge University Press, London, Third edition, 1982, p. 1-94. [92] J.S. Bell, Against “measurement”, in: Sixty Years of Uncertainty, A. Miller ed., Plenum, New York, 1990, p. 17. [93] H. Everett, in: The many–world interpretation of quantum mechanics, B.S. DeWitt and N. Graham, eds., Princeton Univ. Press, 1973, p. 3. [94] A. Peres, Am. J. Phys. 46, 745 (1978). [95] A. Peres, Amer. Journ. Phys. 52, 644 (1984). [96] R. Guy and R. Deltete, Found. of Phys. 20, 943 (1990). [97] M. Bitbol, Schrödinger’s philosophy of quantum mechanics, Kluwer Academic Publishers, Dordrecht, Boston, London, 1996. [98] J.S. Bell, Speakable and unspeakable in quantum mechanics, Cambridge University Press, 1987. [99] J.L. Aronson, A realist philosophy of science, The Macmillan Press Ltd., London, 1984. [100] C.G. Hempel, Grundzüge der Begriffsbildung in der empirischen Wissenschaft, Bertelsmann Universitätsverlag, Düsseldorf, 1974. [101] W.M. de Muynck, Synthese 102, 293 (1995). [102] B. Russell, Human knowledge, its scope and limits, George Allen and Unwin Ltd., London, 1961 (first published in 1948). [103] L.E. Ballentine, Quantum mechanics, Prentice-Hall International, Inc., London, 1990. [104] E. Schrödinger, Naturwissenschaften 23, 807, 823, 844 (1935) (English translation in Quantum Theory and Measurement, eds. J.A. Wheeler and W.H. Zurek, Princeton Univ. Press, 1983, p. 152).
[105] J.A. Wheeler, in: Problems in the Foundations of Physics, G. Toraldo di Francia, ed., North–Holland Publ. Cy., Amsterdam, New York, Oxford, 1979, p. 423. [106] N. Cufaro Petroni, Ph. Gueret, J.-P. Vigier, Nuovo Cim. 81B, 243 (1984); N. Cufaro Petroni, Ph. Droz–Vincent, J.-P. Vigier, Lett. Nuovo Cim. 31, 415 (1981). [107] M. Redhead, Incompleteness, nonlocality, and realism, Clarendon Press, Oxford, 1987. [108] L. Mandel and E. Wolf, Optical coherence and quantum optics, Cambridge University Press, 1995. [109] J.A. Wheeler, Phys. Rev. 52, 1083, 1107 (1937). [110] W. Heisenberg, Zeitschr. f. Phys. 120, 513, 673 (1943); 123, 93 (1944). [111] J.T. Cushing, Theory construction and selection in modern physics: the S matrix, Cambridge University Press, 1990. [112] H.P. Stapp, Phys. Rev. D 3, 1303 (1971). [113] W. Heisenberg, Rev. Mod. Phys. 29, 269 (1957). [114] S.S. Schweber, QED and the men who made it: Dyson, Feynman, Schwinger, and Tomonaga, Princeton University Press, Princeton, 1990. [115] G.F. Chew, S-Matrix theory of strong interactions, W.A. Benjamin, Inc., New York, 1961. [116] G. ’t Hooft, Nucl. Phys. B33, 173 (1971). [117] A. Pais, Inward bound, Clarendon Press, Oxford, 1986. [118] W. Heisenberg, The physical principles of quantum theory, Dover Publications,
Inc., 1930. [119] W.M. de Muynck, Int. Journ. Theor. Phys. 14, 327 (1975); W.M. de Muynck and G.P. van Liempd, Synthese 67, 277 (1986); S. French and M. Redhead, Brit. Journ. Phil. Sc. 39, 233 (1988); S. French, Austral. Journ. Phil. 67, 432 (1989); D. Dieks, Synthese 82, 127 (1990). [120] S.S. Schweber, An introduction to relativistic quantum field theory, Row, Peterson and Cy., Evanston, Illinois, 1961, chapter 6.b; I.G. Kaplan, Sov. Phys. Usp. 18, 988 (1976); M.F. Sarry, Sov. Phys. Jetp. 50, 678 (1979).
[121] R. Mirman, Nuov. Cim. 18B, 110 (1973). [122] P. Grangier, G. Roger and A. Aspect, Europhys. Lett. 1, 173 (1986). [123] G. Richter, W. Brunner, H. Paul, Ann. der Phys., 7. Folge 14, 239 (1964). [124] S. Prasad, M.O. Scully, W. Martienssen, Optics Commun. 62, 139 (1987). [125] H.-A. Bachor, A guide to experiments in quantum optics, Weinheim: Wiley-VCH, 1998. [126] H. Paul, Photonen, Eine Einführung in die Quantenoptik, Teubner Studienbücher, Stuttgart, 1995. [127] F. Herbut, Found. of Phys. 24, 117 (1994); Found. of Phys. Lett. 9, 437 (1996). [128] H.F. Hofmann, Phys. Rev. A 61, 033815 (2000). [129] W.M. de Muynck, Found. of Phys. 30, 205 (2000). [130] B.-G. Englert and K. Wódkiewicz, Phys. Rev. A 51, R2661 (1995). [131] B.-G. Englert, K. Wódkiewicz and P. Riegler, Phys. Rev. A 52, 1704 (1995). [132] G.C. Ghirardi, A. Rimini, and T. Weber, Phys. Rev. D 34, 470 (1986); Phys. Rev. D 36, 3287 (1987); G.C. Ghirardi, R. Grassi, and F. Benatti, Found. of Phys. 25, 5 (1995). [133] L. Diosi, Phys. Rev. A 40, 1165 (1989); G.C. Ghirardi, P. Pearle, A. Rimini, Phys. Rev. A 42, 78 (1990). [134] P. Claverie, S. Diner, Israel Journ. Chem., 19, 54 (1980). [135] F. Rohrlich, From paradox to reality, Cambridge University Press, 1987. [136] N. Bohr, letter to Schrödinger, October 26, 1935 (reprinted in: N. Bohr, Collected Works, J. Kalckar, ed. North–Holland, Amsterdam, 1985, Vol. 7, pp. 510-512). [137] K. Gottfried, Quantum mechanics, Benjamin, New York, 1966. [138] W.H. Zurek, S. Habib, and J.P. Paz, Phys. Rev. Lett. 70, 1187 (1993). [139] J.M. Jauch, Helv. Phys. Act. 37, 293 (1964). [140] M. Cini, Nuovo Cimento 73B, 27 (1983). [141] A.J. Leggett, Progr. Theor. Phys. Suppl. 69, 80 (1980).
[142] B. Yurke, D. Stoler, Phys. Rev. Lett. 57, 13 (1986). [143] C. Monroe, D.M. Meekhof, B.E. King, D.J. Wineland, Science 272, 1131
(1996). [144] D.M. Meekhof, C. Monroe, B.E. King, W.M. Itano, D.J. Wineland, Phys. Rev.
Lett. 76, 1796 (1996). [145] J. Summhammer, H. Rauch, and D. Tuppinger, Phys. Rev. A 36, 4447 (1987). [146] O. Carnal, J. Mlynek, Phys. Rev. Lett. 66, 2689 (1991); D.W. Keith, C.R.
Ekstrom, Q.A. Turchette, D.E. Pritchard, Phys. Rev. Lett. 66, 2693 (1991); M. Kasevich, S. Chu, Phys. Rev. Lett. 67, 181 (1991). [147] E.P. Wigner, Amer. Journ. of Phys. 31, 6 (1963). [148] A. Peres, Phys. Rev. D 22, 879 (1980). [149] E.B. Davies, Comm. Math. Phys. 15, 277 (1969); E.B. Davies, J.T. Lewis,
Comm. Math. Phys. 17, 239 (1970); E.B. Davies, Journ. of Funct. Anal. 6 318 (1970). [150] W. Pauli, Die allgemeinen Prinzipien der Wellenmechanik, in: Handbuch der
Physik, ed. S. Flügge, Springer, Berlin, Band V/1, p. 1. [151] R.J. Glauber, in Quantum optics and electronics, Proceedings Summerschool
les Houches, C. DeWitt, ed., Gordon and Breach, 1965, p. 65. [152] E.P. Wigner, Zeitschr. f. Phys. 133, 101 (1952). [153] H. Araki and M. Yanase, Phys. Rev. 120, 622 (1961); M. Yanase, Phys. Rev.
123, 666 (1961). [154] L. Landau, R. Peierls, Zeitschr. f. Physik 69, 56 (1931). [155] A. Fine, Phys. Rev. D 2, 2783 (1970); A. Shimony, Phys. Rev. 9, 2321 (1974);
H.R. Brown, Found. of Phys. 16, 857 (1986). [156] M. Ozawa, Journ. Math. Phys. 26, 1948 (1985). [157] E.C. Kemble, The Fundamental Principles of Quantum Mechanics, McGraw–Hill Book Company, Inc., New York, 1937. [158] V.A. Fock, Fundamentals of quantum mechanics, Mir, Moscow, 1932 (English
translation published by Mir, 1978). [159] S.L. Braunstein, C.M. Caves, Found. of Phys. Lett. 1, 3 (1988).
[160] B. d’Espagnat, Conceptual Foundations of Quantum Mechanics, W.A. Benjamin, Inc., Reading, Mass., 1976. [161] K.-E. Hellwig, K. Kraus, Phys. Rev. D 1, 566 (1970). [162] M. Renninger, Zeitschr. f. Physik 158, 417 (1960). [163] H. Margenau, The Nature of Physical Reality, McGraw–Hill, New York, 1950. [164] H.G. Dehmelt, Bull. Amer. Phys. Soc. 20, 60 (1975). [165] R.J. Cook and H.J. Kimble, Phys. Rev. Lett. 54, 1023 (1985). [166] W. Nagourney, J. Sandberg, H. Dehmelt, Phys. Rev. Lett. 56, 2797 (1986); T. Sauter, W. Neuhauser, R. Blatt, P. E. Toschek, Phys. Rev. Lett. 57, 1696 (1986); J.C. Bergquist, R.G. Hulet, W.M. Itano, D.J. Wineland, Phys. Rev. Lett. 57, 1699 (1986); R.J. Cook, in: Progress in Optics XXVIII, E. Wolf, ed., Elsevier Science Publishers B.V., 1990, p. 361. [167] A. Beige and G.C. Hegerfeldt, Phys. Rev. A 53, 53 (1996); A. Beige, G.C. Hegerfeldt and D.G. Sondermann, Quant. Opt. 8, 999 (1996). [168] M. Poratti and S. Putterman, Phys. Rev. A 36, 929 (1987). [169] J. Dalibard, Y. Castin, K. Molmer, Phys. Rev. Lett. 68, 580 (1992); N. Gisin, P.L. Knight, I.C. Percival, R.C. Thompson, D.C. Wilson, Journ. Mod. Optics 40, 1663 (1993). [170] M.B. Plenio and P.L. Knight, Rev. Mod. Phys. 70, 101 (1998). [171] N. Gisin and I.C. Percival, Phys. Lett. A 167, 315 (1992). [172] B. Misra, E.C.G. Sudarshan, Journ. Math. Phys. 18, 756 (1977). [173] W.M. Itano, D.J. Heinzen, J.J. Bollinger, D.J. Wineland, Phys. Rev. A 41, 2295 (1990). [174] A.M. Wolsky, Found. of Phys. 6, 367 (1974). [175] V. Frerichs and A. Schenzle, Phys. Rev. A 44, 1962 (1991). [176] M.D. Srinivas, Journ. Math. Phys. 16, 1672 (1975). [177] R.B. Griffiths, Journ. Stat. Phys. 36, 219 (1984). [178] R. Omnès, The interpretation of quantum mechanics, Princeton University Press, Princeton, New Jersey, 1994.
[179] W.M. de Muynck, Journ. Phys. A: Math. Gen. 31, 431 (1998). [180] D. Giulini, E. Joos, C. Kiefer, J. Kupsch, I.-O. Stamatescu, H.D. Zeh, Decoherence and the appearance of a classical world, Springer, Berlin, 1996. [181] A. Daneri, A. Loinger, and G.M. Prosperi, Nucl. Phys. 33, 297 (1962). [182] N.G. van Kampen, in: Fundamental problems in statistical mechanics, E.G.D. Cohen, ed., North-Holland Publ. Cy., Amsterdam, 1962, p. 173. [183] S. Machida, M. Namiki, Progr. Theor. Phys. 63, 1457, 1833 (1980). [184] A. Daneri, A. Loinger, and G.M. Prosperi, Il Nuovo Cimento 44 B, 119 (1966). [185] H.D. Zeh, Found. of Phys. 1, 69 (1970). [186] W.H. Zurek, Phys. Rev. D 24, 1516 (1981); Physics Today, October 1991, p. 36; Progr. Theor. Phys. 89, 281 (1993); J.P. Paz and W.H. Zurek, Phys. Rev. Lett. 82, 5181 (1999). [187] A. Barchielli, L. Lanz, G.M. Prosperi, Nuov. Cim. B72, 79 (1982); A. Barchielli, Nuov. Cim. B74, 113 (1982). [188] P. Pearle, Phys. Rev. A 39, 2277 (1989). [189] G. Lindblad, Comm. Math. Phys. 48, 119 (1976). [190] A. Barchielli, V.P. Belavkin, Journ. Phys. A24, 1495 (1991); L. Viola, R. Onofrio, T. Calarco, Phys. Lett. A 229, 23 (1997). [191] C.M. Caves, G.J. Milburn, Phys. Rev. A 36, 5543 (1987). [192] A. Barchielli, Phys. Rev. A 34, 1642 (1986); A. Barchielli, G. Lupieri, Journ. Math. Phys. 26, 2222 (1985). [193] D.F. Walls, G.J. Milburn, Phys. Rev. A 31, 2403 (1985). [194] A.O. Caldeira, A.J. Leggett, Phys. Rev. A 31, 1059 (1985). [195] W.H. Zurek and J.P. Paz, in “Proc. Symp. on the Found. of Modern Physics” 1993, P. Busch, P. Lahti and P. Mittelstaedt, eds, World Scientific, Singapore, 1993, p. 458; W.H. Zurek, S. Habib, and J.P. Paz, Phys. Rev. Lett. 70, 1187 (1993); J.P. Paz, S. Habib, and W.H. Zurek, Phys. Rev. D 47, 488 (1993). [196] J. Halliwell and A. Zoupas, Phys. Rev. D 52, 7294 (1995); Phys. Rev. D 55, 4697 (1997); L. Diosi, N. Gisin, J. Halliwell, I.C. Percival, Phys. Rev. Lett. 74, 203 (1995); T.A. Brun, I.C. Percival, R. Schack, Journ. Phys. A29, 2077 (1996); M.R. Gallis, Phys. Rev. A 53, 655 (1996).
[197] C. Presilla, R. Onofrio, M. Patriarca, Journ. Phys. A: Math. Gen. 30, 7385 (1997). [198] A.J. Leggett, Contemp. Phys. 25, 583 (1984); J. Clarke, A.N. Cleland, M.H. Devoret, D. Esteve and J.M. Martinis, Science 239, 992 (1988); B.J. Vleeming, A.V. Zakarian, A.N. Omelyanchouk, R. de Bruyn Ouboter, Physica B226, 253 (1996). [199] M. Buffa, O. Nicrosini, and A. Rimini, Found. of Phys. Lett. 8, 105 (1995). [200] N. Bohr, Atomic Theory and the Description of Nature, Cambridge, 1934; Atomic Physics and Human Knowledge, New York, 1958. [201] N. Bohr, Dialectica 2, 312 (1948). [202] N. Bohr, in “Albert Einstein: Philosopher–Scientist”, P.A. Schilpp, ed., The Library of Living Philosophers, 1949, p. 199. [203] A. Petersen, Quantum Physics and the Philosophical Tradition, M.I.T. Press, Cambridge, Mass., 1968. [204] M. Jammer, The Conceptual Development of Quantum Mechanics, McGraw Hill, New York, 1966. [205] H.P. Stapp, Amer. Journ. Phys. 40, 1098 (1972). [206] M. Beller, Quantum Dialogue, University of Chicago Press, Chicago & London, 1999. [207] N.G. van Kampen, Physica A153, 97 (1988). [208] N. Bohr, Atomic Theory and the Description of Nature, Cambridge, 1934. [209] W. Heisenberg, Physics and Philosophy, Harper and Row, Publ., New York, 1962. [210] H. Folse, The philosophy of Niels Bohr, North Holland, Amsterdam, 1985. [211] S. Hawking, A brief history of time, Bantam Books, Toronto, etc., 1988. [212] W. Kelvin, Philosophical Magazine 2, 1 (1901). [213] Letter by W. Heisenberg to H.P. Stapp, reprinted in ref. [215], p. 1112-1113. [214] D.C. Cassidy, Historical Studies in the Physical Sciences, 12, 1 (1981). [215] H.P. Stapp, Am. J. Phys. 40, 1098 (1972).
[216] M. Jammer, The philosophy of quantum mechanics, Wiley, New York, 1974. [217] W.M. de Muynck, Found. of Phys. 16, 973 (1986). [218] D. Dieks, Found. of Phys. 19, 1397 (1989). [219] M. Jammer, “Albert Einstein und das Quantenproblem”, in: Proceedings Einstein Symposion Berlin, 1979, Lecture Notes in Physics 100, Springer-Verlag, Berlin, etc., 1979, p. 146. [220] M. Born, W. Heisenberg and P. Jordan, Zeitschr. f. Phys. 35, 557 (1926). [221] C.G. Hempel, Philosophy of natural science, Prentice-Hall, Inc., Englewood Cliffs, 1966. [222] G. Ludwig, An Axiomatic Basis for Quantum Mechanics, Springer–Verlag, New York, 1985. [223] T.S. Kuhn, The Structure of Scientific Revolutions, The University of Chicago Press, 1962. [224] H. Reichenbach, The Philosophy of Space and Time, Dover, 1957. [225] F.J. Belinfante, Measurements and Time Reversal in Objective Quantum Theory, Pergamon Press, Oxford, 1975. [226] G. Holton, Thematic Origins of Scientific Thought, Harvard University Press, 1973. [227] William James, The Principles of Psychology, Dover Publications, New York, 1950 (original edition: 1890). [228] K.T. Meyer–Abich, Korrespondenz, Individualität und Komplementarität, Wiesbaden, 1965. [229] G. Möllenstedt, C. Jönsson, Zeitschr. f. Phys. 155, 472 (1959). [230] A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki, H. Ezawa, Am. Journ. Phys. 57, 117 (1989). [231] E. Hecht, A. Zajac, Optics, Addison–Wesley, 1979. [232] W. Heisenberg, “Quantum theory and its interpretation”, in Niels Bohr, His life and work as seen by his friends and colleagues, ed. S. Rozental, North– Holland Personal Library, Amsterdam, 1967, p. 105. [233] W. Heisenberg, Physik und Philosophie, Verlag Ullstein GmbH, Frankfurt, 1959, p. 25.
[234] N. Bohr, Como Lecture, The quantum postulate and the recent development of atomic theory, in: N. Bohr, Collected Works, J. Kalckar, ed., North–Holland, Amsterdam, 1985, Vol. 6, pp. 113–136. [235] I. Kant, Kritik der reinen Vernunft, 1781; I. Kant, Prolegomena zu einer jeden künftigen Metaphysik, die als Wissenschaft wird auftreten können, 1783. [236] K. Hentschel, Interpretationen und Fehlinterpretationen der speziellen und der allgemeinen Relativitätstheorie durch Zeitgenossen Albert Einsteins, Birkhäuser, Historical Studies, 1990. [237] W.P. Welten, Causaliteit in de quantummechanica, Noordhoff, Groningen, 1961. [238] C.F. von Weizsäcker, Het wereldbeeld in de fysica, Aula, Utrecht, 1959. [239] S.T. Ali and G.G. Emch, Journ. Math. Phys. 15, 176 (1974). [240] N. Bohr, Nature (suppl.) 121, 580 (1928). [241] D. Murdoch, Niels Bohr’s Philosophy of Physics, Cambridge University Press, 1987. [242] H. Martens, The uncertainty principle, PhD Thesis, Eindhoven University of Technology, 1991. [243] L. Jánossy, Z. Náray, Nuovo Cimento Suppl. 9, 588 (1958). [244] C.F. von Weizsäcker, Naturwissenschaften 42, 521, 545 (1955). [245] F. London, E. Bauer, La théorie de l’observation en mécanique quantique, Hermann and Cie., Paris, 1939. [246] H.P. Stapp, Mind, matter and quantum mechanics, Springer, Berlin, 1993. [247] D.Z. Albert, Quantum mechanics and experience, Harvard University Press, Cambridge, Mass., 1992. [248] E.J. Squires, Conscious mind and the physical world, IOP, Bristol, New York, 1990. [249] E.P. Wigner, in The scientist speculates, I.J. Good, ed., W. Heinemann, London, 1961, p. 284. [250] A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935). [251] H. Margenau, Philosophy of Science 4, 337 (1937).
[252] L.E. Ballentine, Rev. Mod. Phys. 42, 358 (1970). [253] H. Margenau, Ann. Phys. (N.Y.) 23, 469 (1963). [254] J. Park, H. Margenau, Int. J. Theor. Phys. 1, 211 (1968). [255] H. Margenau, J. Park, Found. of Phys. 3, 19 (1973). [256] E. Prugovecki, Can. J. Phys. 45, 2173 (1967). [257] W.K. Wootters and W.H. Zurek, Phys. Rev. D 19, 473 (1979). [258] P. Mittelstaedt, A. Prieur, and R. Schieder, Found. of Phys. 17, 891 (1987). [259] N. Bohr, Phys. Rev. 48, 696 (1935). [260] M. Jammer, in “Proc. Symp. on the Found. of Modern Physics” 1985, P. Lahti and P. Mittelstaedt, eds, World Scientific, Singapore, 1985, p. 129. [261] A. Einstein, Out of my later years, The philosophical Library, New York, 1950, p. 90. [262] J.F. Clauser and A. Shimony, Rep. Prog. Phys. 41, 1881 (1978). [263] J.S. Bell, Epistemological Letters Nov. ’75, p. 2-6 (reprinted in [98], chapt. 8). [264] C.D. Cantrell and M.O. Scully, Physics Reports 43, 499 (1978). [265] A. Fine, The shaky game, University of Chicago Press, Chicago and London, 1986. [266] K.R. Popper, Quantum theory and the schism in physics, Rowman and Littlefield, Totowa, 1982. [267] A. Einstein, Dialectica 2, 320 (1948). [268] D. Bohm and Y. Aharonov, Phys. Rev. 108, 1070 (1957). [269] H. Margenau, Phys. Rev. 49, 240 (1936). [270] C.H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, W.K. Wootters, Phys. Rev. Lett. 70, 1895 (1993). [271] D. Home and M.A.B. Whitaker, Phys. Rep. 210, 223 (1992). [272] J. Rayski, Evolution of physical ideas towards unification, B. Sredniawa, ed., Universitatis Iagellonicae Folia Physica, Krakow, 1995, p. 22. [273] P. Jordan, Erkenntnis 4, 215 (1934).
[274] K.R. Popper, Brit. Journ. Phil. Sc. 10, 25 (1959). [275] K.R. Popper, “Quantum Mechanics without “The Observer” ”, in: Quantum theory and reality, M. Bunge, ed., Springer-Verlag, Berlin, etc., 1967, p. 7. [276] M. Born, Zeitschr. f. Phys. 37, 863 (1926); 38, 803 (1926). [277] M. Beller, Stud. Hist. Phil. Sci. 21, 563 (1990). [278] G. Auletta, Foundations and Interpretations of Quantum Mechanics, World Scientific, Singapore, 2000. [279] The Born-Einstein Letters, Correspondence between Albert Einstein and Max and Hedwig Born from 1916 to 1955 with commentaries by Max Born, Walker and Company, New York, 1971. [280] N.D. Mermin, Amer. Journ. Phys. 66, 753 (1998). [281] J.L. Park, Int. Journ. Theor. Phys. 8, 211 (1973). [282] J.L. Park and W. Band, Found. of Phys. 6, 157 (1976). [283] E. Schrödinger, Proc. Camb. Philos. Soc. 32, 446 (1936). [284] E. Schrödinger, Four lectures on wave mechanics, Blackie and Sons Ltd., London and Glasgow, 1928. [285] A.O. Barut, Found. of Phys. 18, 95 (1988); S. Weinberg, Ann. of Phys. (N.Y.) 194, 336 (1989). [286] G.B. Whitham, Linear and nonlinear waves, Wiley, New York, 1974. [287] B. Mielnik, Commun. Math. Phys. 9, 55 (1968); Commun. Math. Phys. 15, 1 (1969); Commun. Math. Phys. 37, 221 (1974). [288] H.-D. Doebner, G.A. Goldin, Phys. Rev. A 54, 3764 (1996). [289] W.H. Furry, Phys. Rev. 49, 393, 476 (1936). [290] A. Aspect, J. Dalibard, and G. Roger, Phys. Rev. Lett. 49, 1804 (1982). [291] F. Capra, The Tao of physics, Shambala, Berkeley, 1975. [292] G. Zukav, The dancing Wu Li masters, William Morrow and Company, Inc., New York, 1979. [293] R. von Mises, Wahrscheinlichkeit, Statistik und Wahrheit, Springer, 1936.
[294] G. Bacciagaluppi, M. Hemmo, Stud. Hist. Phil. Mod. Phys. 27, 239 (1996); J. Bub, Found. Phys. Lett. 6, 21 (1993). [295] P. Knight, Nature 395, 12 (1998); S. Duerr, T. Nonn and G. Rempe, Nature 395, 33 (1998). [296] N. Bohr, H.A. Kramers and J.C. Slater, Zeitschr. f. Phys. 24, 69 (1924); Phil. Mag. 47, 785 (1924). [297] W. Band and J.L. Park, Found. of Phys. 6, 249 (1976). [298] L. de Broglie, La thermodynamique de la particule isolée, Gauthier–Villars, 1964. [299] D.M. Greenberger, M.A. Horne, A. Shimony, and A. Zeilinger, Am. Journ. Phys. 58, 1131 (1990); D.M. Greenberger, M. Horne, and A. Zeilinger, “Going beyond Bell’s theorem”, in Bell’s Theorem, Quantum Theory, and Conceptions of the Universe, ed. M. Kafatos (Kluwer Academic Publishers, Dordrecht, 1989), p. 73. [300] S. Kochen and E.P. Specker, J. Math. and Mech. 17, 59 (1967). [301] A. Peres, Journ. of Phys. 24A, L175 (1991). [302] N.D. Mermin, Phys. Rev. Lett. 65, 3373 (1990). [303] N.D. Mermin, Physics Today, June 1990, 9. [304] N.D. Mermin, Rev. Mod. Phys. 65, 803 (1993). [305] B.C. van Fraassen, in Current issues in quantum logic, E. Beltrametti and B.C. van Fraassen, eds., Plenum, New York, 1981, p. 229-258. [306] D. Dieks and P. Vermaas (eds.), The modal interpretation of quantum mechanics, Kluwer, Dordrecht, 1998. [307] P.E. Vermaas, A philosopher’s understanding of quantum mechanics: possibilities and impossibilities of a modal interpretation, Cambridge University Press, Cambridge, 1999. [308] S. Kochen, in “Proc. Symp. on the Found. of Modern Physics” 1985, P. Lahti and P. Mittelstaedt, eds, World Scientific, Singapore, 1985, p. 151-169. [309] D. Dieks, Phys. Rev. A 49, 2290 (1994). [310] A. Elby, Found. Phys. Lett. 6, 5 (1993). [311] R. Healey, Found. Phys. Lett. 6, 37 (1993).
[312] E. Arthurs and J.L. Kelly Jr., Bell Syst. Tech. Journ. 44, 725 (1965); C.Y. She and H. Heffner, Phys. Rev. 152, 1103 (1966); C.W. Helstrom, R.S. Kennedy, IEEE Journ. on Inform. Theory IT-20, 16 (1974). [313] P.L. Kelley and W.H. Kleiner, Phys. Rev. A 136, 316 (1964). [314] J.E.G. Farina, Quantum theory of scattering processes, Pergamon Press, Oxford, 1973. [315] A. von Peij, Eindhoven University of Technology, unpublished report, 1993. [316] Y. Yamamoto, S. Machida, S. Saito, N. Imoto, T. Yanagawa, M. Kitagawa, and G. Björk, “Quantum mechanical limit in optical precision measurement and communication”, in: Progress in optics XXVIII, E. Wolf, ed., Elsevier Science Publishers B.V., 1990, p. 87. [317] H.P. Yuen, J.H. Shapiro, IEEE Trans. Inform. Theory IT–26, 78 (1980). [318] P. Busch, Found. of Phys. 17, 905 (1987). [319] J.M. Ortega, Matrix theory, Plenum Press, New York and London, 1987. [320] F.R. Gantmacher, Application of the theory of matrices, Interscience Publishers Inc., New York, 1959. [321] R. McEliece, The theory of information and coding, Addison–Wesley, London, 1977. [322] E.B. Davies, IEEE Trans. Inform. Theor. IT-24, 596 (1978). [323] C. Shannon, Bell Syst. Techn. Journ. 27, 379 (1948). [324] Y. Lai and H.A. Haus, Quant. Opt. 1, 99 (1989); M. Freyberger and W. Schleich, Phys. Rev. A 47, R30 (1993); U. Leonhardt and H. Paul, Phys. Rev. A 47, R2460 (1993). [325] E. Arthurs and J.L. Kelly Jr., Bell Syst. Tech. Journ. 44, 725 (1965). [326] D. Leibfried, T. Pfau and C. Monroe, Physics Today, April 1998, p. 22; K. Banaszek, Journ. Mod. Opt. 46, 675 (1999); C. D’Helon, G.J. Milburn, Phys. Rev. A 54, R25 (1996); U. Leonhardt, H. Paul and G.M. d’Ariano, Phys. Rev. A 52, 4899 (1995). [327] K. Vogel and H. Risken, Phys. Rev. A 40, 2847 (1989). [328] W. Band and J.L. Park, Found. of Phys. 1, 133 (1970); Amer. Journ. of Phys. 47, 188 (1979).
[329] S.V. Dorofeev and J. de Graaf, Indag. Mathem. N.S. 8, 349 (1997). [330] W.M. de Muynck, P.A.E.M. Janssen, and A. Santman, Found. of Phys. 9, 71 (1979). [331] E. Arthurs and M. Goodman, Phys. Rev. Lett. 60, 2447 (1988). [332] M.G. Raymer, Am. J. Phys. 62, 986 (1994). [333] M. Grabowski, Reports on Math. Phys. 20, 153 (1984). [334] F.E. Schroeck, Journ. Math. Phys. 30, 2078 (1989). [335] J. Uffink, Int. Journ. Theor. Phys. 33, 199 (1994). [336] E.P. Storey, S.M. Tan, M.J. Collett and D.F. Walls, Nature 375, 368 (1995). [337] M.O. Scully, B.-G. Englert and H. Walther, Nature 351, 111 (1991); B.-G. Englert, M.O. Scully and H. Walther, Nature 375, 367 (1995). [338] S. Duerr, T. Nonn and G. Rempe, Nature 395, 33 (1998). [339] B.-G. Englert, Phys. Rev. Lett. 77, 2154 (1996). [340] S. Stenholm, Annals of Physics 218, 233 (1992). [341] D.M. Appleby, J. Phys. A: Math. Gen. 31, 6419 (1998). [342] S.A. Werner and A.G. Klein, in Methods of Experimental Physics, K. Sköld and D.L. Price eds, Academic Press, Orlando, 1985, Vol. 23, Part A, p. 259. [343] A. Zeilinger, Phys. Lett. A 118, 1 (1986). [344] W.M. de Muynck and H. Martens, Phys. Rev. A 42, 5079 (1990). [345] W.M. de Muynck, W.W. Stoffels, and H. Martens, Physica B 175, 127 (1991). [346] Z. Ou, C. Hong, and L. Mandel, Optics Comm. 63, 118 (1987). [347] M. Namiki, S. Pascazio, Phys. Lett. A 175, 150 (1993). [348] W.M. de Muynck, Phys. Lett. A 182, 201 (1993). [349] W. Gerlach, O. Stern, Zeitschr. f. Phys. 7, 349 (1922). [350] M.O. Scully, W.E. Lamb Jr., and A. Barut, Found. of Phys. 17, 575 (1987). [351] H. Martens and W.M. de Muynck, Journ. Phys. A: Math. Gen. 26, 2001 (1993).
[352] P.C. Consul, Generalized Poisson distributions, Marcel Dekker Inc., New York and Basel, 1989, p. 55. [353] T. Kiss, U. Herzog, U. Leonhardt, Phys. Rev. A 52, 2433 (1995); U. Herzog, Phys. Rev. A 53, 1245 (1996). [354] K. Banaszek, Journ. Mod. Opt. 46, 675 (1999). [355] U. Leonhardt and H. Paul, Phys. Rev. Lett. 72, 4086 (1994). [356] N.G. Walker and J.E. Carroll, Electr. Lett. 20, 981 (1984). [357] U. Leonhardt and H. Paul, Phys. Rev. A 48, 4598 (1993). [358] U. Leonhardt, Measuring the quantum state of light, Cambridge University Press, 1997. [359] R. Lynch, Phys. Rev. A 47, 1576 (1993). [360] S.M. Barnett and B.J. Dalton, Physica Scripta T48, 13 (1993). [361] M. Freyberger, M. Heni and W.P. Schleich, Quantum Semiclass. Opt. 7, 187 (1995). [362] Z. Hradil, Quant. Opt. 4, 93 (1992). [363] J.H. Shapiro and S.S. Wagner, Journ. Quant. Electr. QE 20, 803 (1984). [364] J.R. Torgerson and L. Mandel, Phys. Rev. Lett. 76, 3939 (1996). [365] D.T. Smithey, M. Beck, M.G. Raymer and A. Faridani, Phys. Rev. Lett. 70, 1244 (1993). [366] A. Royer, Phys. Rev. Lett. 55, 2745 (1985); Found. of Phys. 19, 3 (1989). [367] W.M. de Muynck and A.J.A. Hendrikx, Phys. Rev. A 63, 042114 (2001). [368] M. Brune, E. Hagley, J. Dreyer, X. Maître, A. Maali, C. Wunderlich, J.M. Raimond and S. Haroche, Phys. Rev. Lett. 77, 4887 (1996). [369] Norman F. Ramsey, Molecular Beams, Oxford at the Clarendon Press, First published 1956, Reprinted lithographically in Great Britain from corrected sheets of the first edition 1963, 1969. [370] H. Paul, Quant. Opt. 3, 169 (1991). [371] L. Davidovich, M. Brune, J.M. Raimond and S. Haroche, Phys. Rev. A 53, 1295 (1996).
[372] D. Vitali, P. Tombesi and G.J. Milburn, Phys. Rev. A 57, 4930 (1998). [373] A. Aspect, P. Grangier, and G. Roger, Phys. Rev. Lett. 47, 460 (1981). [374] J.F. Clauser, M.A. Horne, A. Shimony and R.A. Holt, Phys. Rev. Lett. 23, 880 (1969); V.L. Lepore and F. Selleri, Found. of Phys. Lett. 3, 203 (1990); E. Santos, Phys. Rev. A 46, 3646 (1992); L. de la Peña and A.M. Cetto, The quantum dice, Kluwer Academic Publishers, Dordrecht, Boston, London, 1996, section 13.4. [375] M.A. Rowe, D. Kielpinski, V. Meyer, C.A. Sackett, W.M. Itano, C. Monroe, and D.J. Wineland, Nature 409, 791–794 (15 Feb 2001). [376] A.J. Leggett and A. Garg, Phys. Rev. Lett. 54, 857 (1985). [377] W.M. de Muynck, W. De Baere, and H. Martens, Found. of Phys. 24, 1589 (1994). [378] L. Hardy, Phys. Rev. Lett. 71, 1665 (1993). [379] A. Shimony, “Events and processes in the quantum world”, in Quantum concepts in space and time, R. Penrose and C.J. Isham, eds., Clarendon Press, Oxford, 1986, p. 182. [380] P. Rastall, Found. of Phys. 13, 555 (1983). [381] A. Fine, Journ. Math. Phys. 23, 1306 (1982); Phys. Rev. Lett. 48, 291 (1982). [382] W.M. de Muynck, Phys. Lett. A 114, 65 (1986). [383] J.F. Clauser and M.A. Horne, Phys. Rev. D 10, 526 (1974). [384] A.N. Kolmogorov, Foundations of the theory of probability, Chelsea, 1956. [385] J.F. Clauser, M.A. Horne, A. Shimony, and R.A. Holt, Phys. Rev. Lett. 23, 880 (1969). [386] F. Selleri and G. Tarozzi, Riv. Nuovo Cim. 4, nr. 2 (1981). [387] A. Garg and N.D. Mermin, Phys. Rev. Lett. 49, 242 (1982). [388] G. Svetlichny, M. Redhead, H. Brown, J. Butterfield, Phil. of Science, 55, 387 (1988). [389] A. Khrennikov, Interpretations of probability, VSP BV, Utrecht, the Netherlands, 1999.
[390] G. Weihs, T. Jennewein, C. Simon, H. Weinfurter, and A. Zeilinger, Phys. Rev. Lett. 81, 5039 (1998). [391] N. Gisin and H. Zbinden, Phys. Lett. A 264, 103 (1999). [392] A. Shimony, Proc. Internat. Symp. on the Found. of Quantum Mechanics, S. Kamefuchi et al. (eds), Phys. Soc. Japan, Tokyo, 1983, p. 225. [393] H.P. Stapp, Phys. Rev. D 3, 1303 (1971); Nuovo Cimento 29B, 270 (1975); 40B, 191 (1977); Found. of Phys. 10, 767 (1980). [394] G.C. Ghirardi, Found. of Phys. Lett. 9, 313 (1996). [395] H.P. Stapp, in Philosophical Implications of Quantum Theory, J. Cushing and E. McMullin, eds. (Notre Dame University Press, Notre Dame, 1989); also Lawrence Berkeley Laboratory Report LBL–24257, 1988. [396] H.P. Stapp, Phys. Rev. A 46, 6860 (1992). [397] J. Butterfield, Brit. Journ. Phil. Sc. 43, 41 (1992). [398] H.P. Stapp, Found. of Phys. 24, 1665 (1994). [399] W. De Baere, Lett. Nuovo Cim. 39, 234 (1984); Lett. Nuovo Cim. 40, 488 (1984); Adv. in Electronics and Electron Phys. 68, 245 (1986). [400] L. Hardy, Phys. Rev. Lett. 68, 2981 (1992). [401] H.P. Stapp, Amer. Journ. of Phys. 65, 300 (1997). [402] S. Sulcs, G. Oppy and B.C. Gilbert, Found. Phys. Lett. 13, 521 (2000). [403] F. Laudisa, Stud. Hist. Phil. Mod. Phys. 27, 297 (1996). [404] W.M. de Muynck, Stud. Hist. Phil. Mod. Phys. 27, 315 (1996). [405] L. de Broglie, La thermodynamique de la particule isolée, Gauthier–Villars, 1964; L. de Broglie, Diverses questions de mécanique et de thermodynamique classiques et relativistes, Springer-Verlag, 1995. [406] D. Bohm, Phys. Rev. 89, 458 (1953). [407] D. Bohm and J.-P. Vigier, Phys. Rev. 96, 208 (1954). [408] E. Nelson, Dynamical theories of Brownian motion, Princeton University Press, 1967. [409] E. Nelson, Quantum fluctuations, Princeton University Press, 1985.
[410] M.P. Davidson, Physica 96A, 465 (1979). [411] D. Dürr, S. Goldstein, and N. Zanghì, Journ. Stat. Phys. 67, 843 (1992). [412] J.S. Bell, Rev. Mod. Phys. 38, 447 (1966). [413] J.M. Jauch and C. Piron, Helv. Phys. Acta 36, 827 (1963). [414] D. Bohm and J. Bub, Rev. Mod. Phys. 38, 453 (1966). [415] B.C. van Fraassen, in Contemporary research in the foundations and philosophy of quantum theory, C.A. Hooker, ed., Reidel, Dordrecht, 1973, p. 80. [416] L. de Broglie, Comptes Rendus Acad. Sci. Paris 183, 447 (1926); 184, 273 (1927); 185, 380 (1927). [417] W. Pauli, in: Reports on the Solvay Congress, Gauthier-Villars et Cie., Paris, 1928, p. 280. [418] D. Bohm, Phys. Rev. 85, 166, 180 (1952). [419] T. Takabayasi, Progr. Theor. Phys. 8, 143 (1952). [420] A. Einstein, in Scientific papers presented to Max Born, Oliver and Boyd, Edinburgh, 1953, p. 33. [421] D. Dürr, in: Chance in physics, J. Bricmont et al., eds., Springer, 2001, p. 115. [422] P.R. Holland, The quantum theory of motion, Cambridge University Press, Cambridge, 1993. [423] H. Goldstein, Classical mechanics, Addison-Wesley Publ. Cy., Reading, Mass., etc., 1959. [424] L. de Broglie, Une tentative d’interprétation causale et non linéaire de la mécanique ondulatoire: la théorie de la double solution, Gauthier-Villars, Paris, 1956 (English translation: Non-linear wave mechanics, a causal interpretation, Elsevier, Amsterdam, 1960). [425] D. Bohm and B.J. Hiley, Found. of Phys. 14, 255 (1984). [426] D. Bohm, B.J. Hiley, and P.N. Kaloyerou, Phys. Reports 144, 321 (1987). [427] J.G. Muga and C.R. Leavens, Phys. Reports 338, 353 (2000); C.R. Leavens and W.R. McKinnon, Phys. Lett. A 194, 12 (1994). [428] T.P. Spiller, T.D. Clark, R.J. Prance, H. Prance, Europhys. Lett. 12, 1 (1990).
[429] W.M. de Muynck, in Proc. Symp. on the Found. of Modern Physics 1987, P. Lahti and P. Mittelstaedt, eds, World Scientific, Singapore, 1987, p. 419. [430] D. Bohm, Wholeness and the implicate order, Routledge and Kegan Paul, 1980. [431] P.H. Eberhard, Nuov. Cim. 46B, 392 (1978). [432] G. Lochak, Found. of Phys. 6, 173 (1976). [433] G. Lochak, in J. Leite Lopes and J. Paty (eds), Quantum mechanics a half century later, Reidel, Dordrecht, 1977, p. 245. [434] H. Primas, in: Quantum dynamics of molecules, R.G. Woolley, ed., Plenum Press, New York, London, 1980, p. 39. [435] J.S. Bell, in: Foundations of Quantum Mechanics, Proceedings of the International School of Physics “Enrico Fermi”, Course XLIX, B. d’Espagnat ed., Academic, New York, 1972, p. 171. [436] L. Accardi and M. Regoli, quant-ph/0007005; quant-ph/0007019; quant-ph/0110086; quant-ph/0112067. [437] F. Selleri, in Quantum Theory and Pictures of Reality, W. Schommers, ed.,
Springer–Verlag, 1989, p. 279. [438] U. Enz, Phys. Rev. 131, 1392 (1963); Annales de la Fondation Louis de Broglie
11, 87 (1986); 15, 19 (1990). [439] A. Valentini, in: Chance in physics, J. Bricmont et al., eds., Springer, 2001,
p. 165. [440] D. ter Haar, Elements of statistical mechanics, Holt, Rinehart and Winston,
New York, 1961. [441] K. Huang, Statistical Mechanics, John Wiley and Sons, Inc., New York, 1963. [442] P. and T. Ehrenfest, The conceptual foundations of the statistical approach in
mechanics, Dover Publ., Inc., New York, 1990 (Encyclopädie der mathematischen Wissenschaften, Vol. IV, Mechanik, eds. F. Klein and C. Müller, Heft 6, 1912). [443] S.R. de Groot and P. Mazur, Non-equilibrium thermodynamics, North-Holland Publ. Cy., Amsterdam-London, 1969. [444] Ilya Prigogine, From being to becoming, W.H. Freeman and Company, San Francisco, 1980, p. 12.
[445] J. von Neumann, Math. Annalen 104, 570 (1931). [446] G.W. Mackey, Induced representations of groups and quantum mechanics, Benjamin, New York, 1968. [447] R.J. Glauber, Phys. Rev. 130, 2529, 2766 (1963).
[448] E. Schrödinger, Naturwissenschaften 14, 166 (1926). [449] H. Yuen, Phys. Rev. A 13, 2226 (1976). [450] H.S.M. Coxeter, Regular polytopes, Dover, 1973. [451] J.J. Seidel, Eutactic stars, in: Colloquia Mathematica Societas Janos Bolyai, 18. Combinatorics, Keszthely (Hungary), 1976. [452] P.M. Morse and H. Feshbach, Methods of theoretical physics, McGraw-Hill, London, 1953. [453] R. Balian, From microphysics to macrophysics, Springer-Verlag, Berlin, etc., 1991. [454] W.H. Ruckle, Modern analysis, measure theory and functional analysis with applications, PWS–KENT Publ. Cy., Boston, 1991. [455] G. Jameson, Ordered linear spaces, Lecture Notes in Mathematics 141, Springer, Berlin, 1970. [456] G.F. Simmons, Topology and modern analysis, McGraw–Hill Book Company, New York, etc., 1963. [457] A.M. Gleason, Journ. of Rat. Math. and Mech. 6, 885 (1957). [458] V.S. Varadarajan, Geometry of quantum theory, Springer–Verlag, New York, etc., Second Edition, 1985, p. 97. [459] W.M. de Muynck, in Proc. Symp. on the Found. of Modern Physics 1993, P. Busch, P. Lahti and P. Mittelstaedt, eds, World Scientific, Singapore, 1993, p. 281. [460] W.M. de Muynck, in Proc. Conf. Foundations of Probability and Physics, A. Khrennikov, ed., World Scientific, Singapore, 2001, p. 95. [461] W.M. de Muynck, W. De Baere, and H. Martens, Found. of Phys. Lett. 5, 527 (1992).
[462] W.M. de Muynck and H. Martens, in: W. Florek, D. Lipinski and T. Lulek, eds., Symmetry and structural properties of condensed matter, World Scientific, Singapore, 1993, p. 101. [463] W.M. de Muynck and W. De Baere, Annals of the Israel Physical Society 12, 109 (1996).
Index absorption deterministic, 415, 594 stochastic, 413 analogy of classical statistical mechanics and quantum mechanics, 54–71 thermodynamic, 598–608 ancilla, 48, 398 Arthurs-Kelly model, 407–409
consciousness, 124, 197, 223 convex cone, 634 convex function, 629 convex subset, 634 correspondence principle strong, 190–200, 246–258, 265–272, 317–321, 503 and logical positivism, 195–197 weak, 190, 197 counterfactual definiteness, 497 coupled systems, 16–22 cross terms, 114–169, 202, 295, 303 observability of, 133 unobservability of, 117
basic postulates, 1–16 for mixtures, 12–16 for pure states, 1–7 canonical commutation relation, 7 canonical conjugatedness, 7, 610 causal connectedness, 10 causal disjointness, 10 chameleon model, 588 channel capacity, 371 classical limit, 52, 109–112, 163–169, 190 commeasurability, 11, 44, 474, 486, 519 complementarity, 7, 199–259, 317, 382, 388–406, 455–468, 525–533, 587– 597 circular, 219 in a restricted sense, see complementarity in a wider sense, 197, 223, 227 in measurement, 525–529 in preparation, 525–529, 533 parallel, 219 completeness in a restricted sense, 177–189, 199, 254–259, 514–528 in a wider sense, 173–189, 243–246, 254–265, 316, 512–521
Davidovich-Haroche experiment, 457 decoherence, 118, 160–170, 426, 461 determinism, 179, 183, 216, 592 Dirac quantization, 53 disentanglement, see entanglement distributive law, 32 disturbance Bohm, 555 Heisenberg, 111, 339, 340, 405, 415, 499, 528 mutual, 209, 230 in a determinative sense, 212, 397 in a preparative sense, 214, 397 double-slit experiment, 286, 340–345, 409, 452 dual basis, 368, 623 Einstein-Podolsky-Rosen experiment, 239– 273 element of physical reality, 243–273, 311, 504, 537 quantum mechanical, 283, 589
subquantum mechanical, 245, 321, 589 ensemble homogeneous, 284–295 inhomogeneous, see homogeneous ensemble quantum, 291, 303 von Neumann, 284–310 entanglement, 21, 291–298 entropy average row, 374 mixing, 19 Shannon, 15, 28, 371, 632 von Neumann, 13 environment-induced superselection, 162, 163, 167 EPR experiment, see Einstein-Podolsky-Rosen experiment EPR-Bell experiment, 265, 472 equation Heisenberg, 6 Liouville-von Neumann, 15 Schrödinger, 5 equilibrium, 600, 601, 604 ergodicity, 161–165, 600–601 eutactic star, 619 expectation value, 3 explanation, 84, 181, 298–316 filter, 23, 39, 210 fluorescence quenching, 141 formalism generalized, 40, 375, 405, 489, 494 standard, 1, 14, 43, 231 fuzzy reality, 218, 338 Gram matrix, 619 Hardy-Stapp formulation of EPR, 529–533 Haroche-Ramsey experiment, 455 hidden-variables theory, 108–109, 174, 182, 535–608 Bohm’s causal, 549–551, 557 Bohm’s stochastic, 551, 557 contextualistic, 561 deterministic, 186, 562–566
contextualistic, 563 empiricist, 565 objectivistic, 563 hybrid, 536, 540, 541, 553, 561, 584 local, 571–584 non-quasi-objectivistic, 584–608 quasi-objectivistic, 560–584, 593 stochastic, 545, 566–593 empiricist, 568 homodyne optical detection, see optical homodyning identical particles, 97–98 indistinguishability of, 97–98 inaccuracy, 200, 206, 229–233, 335, 370, 397 inaccuracy relation Heisenberg, 233 inadvertent realism, 516, 517, 533 indeterminacy, 200–218, 226 234, 397 indeterminacy relation Heisenberg, 26, 210, 232, 233 indeterminism, see determinism inequality BCHS, 480, 580 Bell in hidden-variables theory, 537–608 in quantum mechanics, 471–533 CSHS, 483 Heisenberg, 26, 180–188, 204, 395–400 Jensen, 630 Klein, 632 Martens, 393 information mutual, 633 interference pattern, 201 visibility of, 427 interference term, 3 interpretation and logical positivism, 87, 108, 172 Born, 74, 282 consistent histories, 154, 160 Copenhagen, 171–228, 276–327, 404, 511–527, 592
empiricist, 74–82, 151–155, 235–237, 262, 318–321, 338, 488–496, 506–509, 599 and logical positivism, 77 and microscopic objects, 78–81 of observables, 75 of wave functions, 75 ensemble, 116, 134, 183, 258, 275–330, 425 ignorance, 280–289, 305 individual-particle, 134, 183, 258, 275–330 instrumentalist, 74, 263 interactional, 250, 269, 499, 503 minimal, 74, 302–304, 308 modal, 321–330 anti-Copenhagen variant of, 327–330 Copenhagen variant of, 322–327 of quantum field theory, 95–101 orthodox, 171 probabilistic, 279, 284, 287 propensity, 281, 566 realist, 82–87, 265, 318–330, 338, 475, 496, 599 and causal reasoning, 84 and classical paradigm, 88–92 contextualistic-, 86, 101–106 objectivistic-, 86–87 relational, 250, 499 statistical, 279–287, 306, 308 irreversibility, 603 latitude, 200, 216 lattice Boolean, 31 modular, 37 local commutativity, 10, 482 locality condition, 580 macrostate quantum mechanical, 126 statistical mechanical, 601 subquantum mechanical, 538 measure, 636–644
extreme, 638 functional-valued, 56 operator-valued, 150, 345, 383, 641 positive operator-valued, 40–51, 57, 146–160, 331–469, 641 coarsening of, 360 equivalent, 362 extremal, 354 informationally equivalent, 363 maximal, 365 minimal, 365 refinement of, 360 probability, 637 projection-valued, 41 Wigner, 383–391 measurement ‘which path’, 400, 410, 455 ‘which way’, see ‘which path’ measurement classical versus quantum mechanical description of, 81 complete, 369, 387, 404, 420, 451 consecutive, 126, 143, 154 continuous, 145 determinative aspect of, 131, 134 disturbance theory of, 212, 252, 499 faithful, 93, 497 first kind, 127–146, 296 interference, 410 atomic beam, 451–469 neutron, 409–429 joint, 9 joint nonideal, 375–383 negative-result, 139, 140, 142 nonideal, 333–340, 345–469, 566, 568 invertible, 355 partial ordering of, 364 phase, 447 preparative aspect of, 131, 134 quantum mechanical, 2 individual, 2 quantum mechanical description of, 115 repeatable, 123 second kind, 130–134, 296
simultaneous, see joint measurement successful, 4 trivial joint, 387 unanalyzability of, 179, 198 measurement problem, 113–120 orthodox solution of, 125–127 measurement result individual, 2 quantum mechanical, 2 microstate statistical mechanical, 601 subquantum mechanical, 538 mixture, 13, 14, 24, 289 (im)proper, 292–294, 495 mutual exclusiveness, 317, 491, 514, 523 neutron interferometry, 410 NODI, see non-orthogonal decomposition of identity extremal element of, 635 extreme element of, 634 non-contextuality, see objectivity non-orthogonal decomposition of identity, 42–51, 352–378, 401–423, 642 non-preparability, 591 non-reproducibility, 518–523, 578, 591 quantum mechanical, 520 nonideality, 332, 335 nonideality matrix, 335–338, 350–364 nonideality measure, 370–375 nonlocality, 253–272, 297, 314–321, 478–482, 492–496, 502–523, 530, 548, 554, 606 nonseparability, 21, 290, 297, 493 objectification, 114, 338 objectivity, 32 observable correlation, 19, 250, 269, 482 dichotomic, 483 generalized, 40–51 interference, 412, 454, 465 maximal, 369 momentum, 8, 609, 610 number, 8, 611
path, 413, 453, 465 phase canonical, 50 generalized, 50 pointer, 49, 121, 149–152, 326 position, 8, 347, 609, 610 quantum mechanical, 2 reducible, 357, 377, 382 standard, 2, 34, 358, 387 uninformative, 367 values of, 151 observables (in)compatible, 7 generalized commeasurable, 44 joint measurement of, 43 standard joint measurement of, 9 joint nonideal measurement of, 391 operation, 138, 153 operational approach, 43, 102 operator boson squeezed, 616 boson annihilation, 611 boson creation, 611 density, 12 reduced, 18 number shift, 612 phase shift, 611 rotation, 611 operator ordering, 66 optical homodyning, 345–347, 351, 360, 392, 438–451 ‘eight-port’, 445 ‘four-port’, 379, 385, 443 parameter (in)dependence, 493 particle-wave duality, 218 partly reflecting mirror, 99, 437 phase space representations, 57–71 representations, 64 Husimi representation, 61 Wigner-Weyl representation, 58 phenomenalism, 77
photon, 98 polarization measurement of, 347–349 squeezed, 100 picture Heisenberg, 6 Schrödinger, 5 Poisson bracket, 53 polar decomposition, 19–21, 289, 324 possessed values principle, 86, 209, 304, 496 POVM, see positive operator-valued measure phase Susskind-Glogower, 50 pre-measurement, 123, 160–170 preferred pointer basis, 163, 167 preparation conditional, 134–140, 152–155, 223–226, 249–276, 294–324, 501 first kind, 136 identical, 3, 284, 499 individual, 280, 510–533, 578, 608 in a quantum mechanical sense, 523 in a subquantum mechanical sense, 523 individual, 2, 10, 278–280, 486, 497, 583 objective, 525 quantum mechanical, 2 probability distribution, 2 bivariate, 9, 637 conditional, 9, 153, 637 joint, 637 marginal, 9, 637 quadrivariate, 480 product direct, 626 tensor, 626 projection Lüders, 25 non-orthogonal, 621, 624 skew, see non-orthogonal projection von Neumann, 23 generalized, 155–160, 400–404 projection postulate, 22–26, 145–146
objections to, 24–26 origin of, 23 strong, 23, 135 weak, 295, 314, 315 property emergent, 281, 308, 497, 504 possessed, 86, 281, 497 proposition calculus, 31–40 PVM, see projection-valued measure quadruple, 486, 487 quantum jumps, 140–146 quantum master equation, 165, 567 quantum of (inter)action, 179 quantum phenomenon, 198, 223 quantum postulate, 198, 386 quantum stochastic differential equations, 143–146 quantum tomography, 449 quantum Zeno effect, 145 quorum, 388, 449 Ramsey experiment, 452 random sequence, 293 reduction postulate, see projection postulate relativistic causality, 11 reproducibility, see non-reproducibility S matrix theory, 95 Schrödinger cat state, 119, 461 Schrödinger’s cat, 113–117 space Fock, 8 Hilbert-Schmidt, 628 phase, 54 spontaneous localization, 163–165 standard deviation, 5 state coherent, 613 squeezed, 28, 616 two-mode, 99, 615 conditionally prepared, 138, 142, 258–271, 327, 502–505 contextual, 102–106, 269–272, 328–329, 500–506, 520–528, 590, 605
individual, 522–524, 586–597 dispersionless, 27, 55, 536, 540–543 dynamic, 323–327 entangled, 17, 21, 122, 288–297, 484 minimal uncertainty, 27 objective initial, 521–528 individual, 524–528, 587, 593 pointer, 121 pure, 13, 24, 280, 285, 291, 294 quantum mechanical, 1 value, 323–327 state preparation, see preparation state reconstruction, 387 Stern-Gerlach experiment, 429–436 subquantum theory, see hidden-variables theory subsystem, 16 superposition principle, 1, 111, 113, 288 system bi-orthonormal, 623 closed, 86 isolated, see closed system open, 86 system of covariance, 610 system of imprimitivity, 610 theorem Bell’s, 573 Ehrenfest’s, 52 Gleason’s, 38, 640 Krein-Mil’man, 635 Naimark’s, 47 Wigner’s, 67 thought experiment, 196, 200 microscope, 206 double-slit, 201, 203, 206, 305 single-slit, 178, 200 trace, 627 partial, 627 trajectory, 545, 550, 586 unbiasedness, 359, 360, 395 uncertainty, 200, 229, 397 measure of, 5, 28 uncertainty relation, 26–31, 398
entropic, 28 Heisenberg, 26 wave packet splitting of, 286 spreading of, 286 wavicle, 79, 108 Weyl commutation relation, 610 wholeness, 198, 223, 290, 559
Fundamental Theories of Physics
94. V. Dietrich, K. Habetha and G. Jank (eds.): Clifford Algebras and Their Application in Mathematical Physics. Aachen 1996. 1998 ISBN 0-7923-5037-5
95. J.P. Blaizot, X. Campi and M. Ploszajczak (eds.): Nuclear Matter in Different Phases and Transitions. 1999 ISBN 0-7923-5660-8
96. V.P. Frolov and I.D. Novikov: Black Hole Physics. Basic Concepts and New Developments. 1998 ISBN 0-7923-5145-2; Pb 0-7923-5146
97. G. Hunter, S. Jeffers and J-P. Vigier (eds.): Causality and Locality in Modern Physics. 1998 ISBN 0-7923-5227-0
98. G.J. Erickson, J.T. Rychert and C.R. Smith (eds.): Maximum Entropy and Bayesian Methods. 1998 ISBN 0-7923-5047-2
99. D. Hestenes: New Foundations for Classical Mechanics (Second Edition). 1999 ISBN 0-7923-5302-1; Pb ISBN 0-7923-5514-8
100. B.R. Iyer and B. Bhawal (eds.): Black Holes, Gravitational Radiation and the Universe. Essays in Honor of C. V. Vishveshwara. 1999 ISBN 0-7923-5308-0
101. P.L. Antonelli and T.J. Zastawniak: Fundamentals of Finslerian Diffusion with Applications. 1998 ISBN 0-7923-5511-3
102. H. Atmanspacher, A. Amann and U. Müller-Herold: On Quanta, Mind and Matter. Hans Primas in Context. 1999 ISBN 0-7923-5696-9
103. M.A. Trump and W.C. Schieve: Classical Relativistic Many-Body Dynamics. 1999 ISBN 0-7923-5737-X
104. A.I. Maimistov and A.M. Basharov: Nonlinear Optical Waves. 1999 ISBN 0-7923-5752-3
105. W. von der Linden, V. Dose, R. Fischer and R. Preuss (eds.): Maximum Entropy and Bayesian Methods. Garching, Germany 1998. 1999 ISBN 0-7923-5766-3
106. M.W. Evans: The Enigmatic Photon Volume 5: O(3) Electrodynamics. 1999 ISBN 0-7923-5792-2
107. G.N. Afanasiev: Topological Effects in Quantum Mechanics. 1999 ISBN 0-7923-5800-7
108. V. Devanathan: Angular Momentum Techniques in Quantum Mechanics. 1999 ISBN 0-7923-5866-X
109. P.L. Antonelli (ed.): Finslerian Geometries. A Meeting of Minds. 1999 ISBN 0-7923-6115-6
110. M.B. Mensky: Quantum Measurements and Decoherence. Models and Phenomenology. 2000 ISBN 0-7923-6227-6
111. B. Coecke, D. Moore and A. Wilce (eds.): Current Research in Operational Quantum Logic. Algebras, Categories, Languages. 2000 ISBN 0-7923-6258-6
112. G. Jumarie: Maximum Entropy, Information Without Probability and Complex Fractals. Classical and Quantum Approach. 2000 ISBN 0-7923-6330-2
113. B. Fain: Irreversibilities in Quantum Mechanics. 2000 ISBN 0-7923-6581-X
114. T. Borne, G. Lochak and H. Stumpf: Nonperturbative Quantum Field Theory and the Structure of Matter. 2001 ISBN 0-7923-6803-7
115. J. Keller: Theory of the Electron. A Theory of Matter from START. 2001 ISBN 0-7923-6819-3
116. M. Rivas: Kinematical Theory of Spinning Particles. Classical and Quantum Mechanical Formalism of Elementary Particles. 2001 ISBN 0-7923-6824-X
117. A.A. Ungar: Beyond the Einstein Addition Law and its Gyroscopic Thomas Precession. The Theory of Gyrogroups and Gyrovector Spaces. 2001 ISBN 0-7923-6909-2
118. R. Miron, D. Hrimiuc, H. Shimada and S.V. Sabau: The Geometry of Hamilton and Lagrange Spaces. 2001 ISBN 0-7923-6926-2
119. M. Pavšič: The Landscape of Theoretical Physics: A Global View. From Point Particles to the Brane World and Beyond in Search of a Unifying Principle. 2001 ISBN 0-7923-7006-6
120. R.M. Santilli: Foundations of Hadronic Chemistry. With Applications to New Clean Energies and Fuels. 2001 ISBN 1-4020-0087-1
121. S. Fujita and S. Godoy: Theory of High Temperature Superconductivity. 2001 ISBN 1-4020-0149-5
122. R. Luzzi, A.R. Vasconcellos and J. Galvão Ramos: Predictive Statistical Mechanics. A Nonequilibrium Ensemble Formalism. 2002 ISBN 1-4020-0482-6
123. V.V. Kulish: Hierarchical Methods. Hierarchy and Hierarchical Asymptotic Methods in Electrodynamics, Volume 1. 2002 ISBN 1-4020-0757-4
124. B.C. Eu: Generalized Thermodynamics. Thermodynamics of Irreversible Processes and Generalized Hydrodynamics. 2002 ISBN 1-4020-0788-4
125. A. Mourachkine: High-Temperature Superconductivity in Cuprates. The Nonlinear Mechanism and Tunneling Measurements. 2002 ISBN 1-4020-0810-4
126. W.M. de Muynck: Foundations of Quantum Mechanics, an Empiricist Approach. 2002 ISBN 1-4020-0932-1
KLUWER ACADEMIC PUBLISHERS – DORDRECHT / BOSTON / LONDON