QUANTUM PHYSICS, RELATIVITY, AND COMPLEX SPACETIME Towards a New Synthesis
NORTH-HOLLAND MATHEMATICS STUDIES 163 (Continuation of the Notas de Matematica)
Editor: Leopoldo NACHBIN Centro Brasileiro de Pesquisas Fisicas Rio de Janeiro, Brazil and University of Rochester New York, U.S.A.
NORTH-HOLLAND - AMSTERDAM
NEW YORK
OXFORD
TOKYO
QUANTUM PHYSICS, RELATIVITY, AND COMPLEX SPACETIME Towards a New Synthesis
Gerald KAISER Department of Mathematics University of Lowell Lowell, MA, USA
1990
NORTH-HOLLAND -AMSTERDAM
NEW YORK
OXFORD TOKYO
ELSEVIER SCIENCE PUBLISHERS B.V. Sara Burgerhartstraat 25 P.O. Box 211, 1000 AE Amsterdam, The Netherlands Distributors for the U.S.A. and Canada: ELSEVIER SCIENCE PUBLISHING COMPANY, INC. 655 Avenue of the Americas NewYork, N.Y. 10010, U.S.A.
Library o f Congress Cataloging-in-Publlcation Data
Kaiser, Gerald. Ouantun physics. relativity. and complex spacetine : towards a new SynthEbiS / Gerald Kaiser. p. cn. -- (North-Holland matheiatlcs studies ; 163) Includes b l b l i O Q r ~ p h i C ~references 1 and index. ISBN 0-444-88465-3 1. Quantum theory. 2. Relativity (Physics) 3. S p a c e and tine. 4. Mathematical physics. I. Title. 11. Serios. QC174.12.K34 1990 530.1'2--dc20 90-7979
CIP
ISBN: 0 444 88465 3 Q ELSEVIER SCIENCE PUBLISHERS B.V., 1990
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V. / Physical Sciences and Engineering Division, P.O. Box 103, 1000 AC Amsterdam, The Netherlands. Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the USA., should be referred to the publisher. NO responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Printed in The Netherlands
To my parents, Bernard and Cesia and to Janusz, Krystyn, Mirek and Renia
This Page Intentionally Left Blank
Vii
UNIFIED FIELD THEORY In the beginning there was Aristotle, And objects at rest tended to remain at rest, And objects in motion tended to come to rest, And soon everything was at rest, And God saw that it was boring. Then God created Newton, And objects at rest tended to remain at rest, But objects in motion tended to remain in motion, And energy was conserved and momentum was conserved and matter was conserved, And God saw that it was conservative. Then God created Einstein, And everything was relative, And fast things became short, And straight things became curved, And the universe was filled with inertial frames, And God saw that it was relatively general, but some of it was especially relative. Then God created Bohr, And there was the principle, And the principle was quantum, And all things were quantized, But some things were still relative, And God saw that it was confusing. Then God was going to create Fergeson, And Fergeson would have unified, And he would have fielded a theory, And all would have been one, But it was the seventh day, And God rested, And objects at rest tend to remain at rest. by Tim Joseph copyright 01978 by The New York Times Company Reprinted by permission.
This Page Intentionally Left Blank
ix
CONTENTS
Preface ........................................................
.xi
Suggestions to the Reader ..................................
xvi
.
Chapter 1 Coherent-State Representations 1.1. Preliminaries ............................................ 1 1.2. Canonical coherent states ................................ 9 1.3. Generalized frames and resolutions of unity ............. 18 1.4.Reproducing-kernel Hilbert spaces ...................... 29 34 1.5. Windowed Fourier transforms ........................... 1.6. Wavelet transforms ..................................... 43
.
Chapter 2 Wavelet Algebras and Complex Structures 2.1. Introduction ............................................ 57 2.2. Operational calculus .................................... 59 2.3. Complex structure ...................................... 70 2.4. Complex decomposition and reconstruction ............. 82 2.5. Appendix .............................................. 92
.
Chapter 3 Frames and Lie Groups 3.1. Introduction ............................................ 95 3.2. Klauder’s group-frames .................................95 3.3. Perelomov’s homogeneous G-frames ................... 103 3.4. Onofri’s’s holomorphic G-frames .......................113 3.5. The rotation group .................................... 135 3.6. The harmonic oscillator as a contraction limit ..........145
Contents
X
.
Chapter 4 Complex Spacetime 4.1. Introduction ........................................... 155 4.2. Relativity, phase space and quantization ............... 156 4.3. Galilean frames ........................................ 169 183 4.4. Relativistic frames ..................................... 4.5. Geometry and Probability ............................. 207 4.6.The non-relativistic limit .............................. 225 Notes ...................................................... 230
.
Chapter 5 Quantized Fields 5.1. Introduction ...........................................
235 5.2. The multivariate Analytic-Signal transform ............239 5.3. Axiomatic field theory and particle phase spaces .......249 273 5.4. Free Klein-Gordon fields .............................. 5.5. Free Dirac fields ....................................... 289 5.6. Interpolating particle coherent states ..................302 5.7. Field coherent states and functional integrals .......... 308 Notes ...................................................... 318
.
Chapter 6 Further Developments 6.1. Holomorphic gauge theory .............................
321 6.2. Windowed X-Ray transforms: Wavelets revisited ...... 334
..................................................
347
........................................................
357
References Index
xi
PREFACE The idea of complex spacetime as a unification of spacetime and classical phase space, suitable as a possible geometric basis for the synthesis of Relativity and quantum theory, first occured to me in 1966 while I was a physics graduate student at the University of Wisconsin. In 1971, during a seminar I gave at Carleton University in Canada, it was pointed out to me that the formalism I was developing was related to the coherent-state representation, which was then unknown to me. This turned out to be a fortunate circumstance, since many of the subsequent developments have been inspired by ideas related to coherent states. My main interest at that time was to formulate relativistic coherent states. In 1974, I was struck by the appearance of tube domains in axiomatic quantum field theory. These domains result from the analytic continuation of certain functions (vacuum expectaion values) associated with the theory to complex spacetime, and powerful methods from the theory of several complex variables are then used to prove important properties of these functions in real spacetime. However, the complexified spacetime itself is usually not regarded as having any physical significance. What intrigued me was the possibility that these tube domains may, in fact, have a direct physical interpretation as (extended) classical phase spaces. If so, this would give the idea of complex spacetime a firm physical foundation, since in quantum field theory the complexification is based on solid physical principles. It could also show the way to the construction of relativistic coherent states. These ideas were successfully worked out in 1975-76,culminating in a mathematics thesis in 1977 at the University of Toronto entitled “Phasespace Approach to Relativistic Quantum Mechanics.”
Xii
Preface
Up to that point, the theory could only describe free particles. The next goal was to see how interactions could be added. Some progress in this direction was made in 1979-80, when a natural way was found to extend gauge theory to complex spacetime. Further progress came during my sabbatical in 1985-86, when a method was developed for extending quantized fields themselves (rather than their vacuum expectation values) to complex spacetime. These ideas have so far produced no “hard” results, but I believe that they are on the right path. Although much work remains to be done, it seems to me that enough structure is now in place to justify writing a book. I hope that this volume will be of interest to researchers in theoretical and mathematical physics, mathematicians interested in the structure of fundamental physical theories and assorted graduate students searching for new directions. Although the topics are fairly advanced, much effort has gone into making the book self-contained and the subject matter accessible to someone with an understanding of the rudiments of quantum mechanics and functional analysis.
A novel feature of this book, from the point of view of mathematical physics, is the special attention given to “ signal analysis” concepts, especially time-frequency localization and the new idea of wavelets. It turns out that relativistic coherent states are similar to wavelets, since they undergo a Lorentz contraction in the direction of motion. I have learned that engineers struggle with many of the same problems as physicists, and that the interplay between ideas from quantum mechanics and signal analysis can be very helpful to both camps. For that reason, this book may also be of interest to engineers and engineering students. The contents of the book are as follows. In chapter 1 the simplest
Preface
Xiii
examples of coherent states and time-frequency localization are introduced, including the original “canonical” coherent states, windowed Fourier transforms and wavelet transforms. A generalized notion of frames is defined which includes the usual (discrete) one as well as continuous resolutions of unity, and the related concept of a reproducing kernel is discussed. In chapter 2 a new, algebraic approach to orthonormal bases of wavelets is formulated. An operational calculus is developed which simplifierthe formalism considerably and provides insights into its symmetries. This is used to find a complex structure which explains the symmetry between the low- and the high-frequency filters in wavelet theory. In the usual formulation, this symmetry is clearly evident but appears to be accidental. Using this structure, complex wavelet decompositions are considered which are analogous to analytic coherent-state representations. In chapter 3 the concept of generalized coherent states based on Lie groups and their homogeneous spaces is reviewed. Considerable attention is given to holomorphic (analytic) coherent-state representations, which result from the possibility of Lie group complexification. The rotation group provides a simple yet non-trivial proving ground for these ideas, and the resulting construction is known as the “spin coherent states.” It is then shown that the group associated with the Harmonic oscillator is a weak contraction limit (as the spin s 4 00) of the rotation group and, correspondingly, the canonical coherent states are limits of the spin coherent states. This explains why the canonical coherent states transform naturally under the dynamics generated by the harmonic oscillator. In chapter 4,the interactions between phase space, quantum mechanics and Relativity are studied. The main ideas of the phasespace approach to relativistic quantum mechanics are developed for
xiv
Preface
free particles, based on the relativistic coherent-state representations developed in my thesis. It is shown that such representations admit a covariant probabilistic interpretation, a feature absent in the usual spacetime theories. In the non-relativistic limit, the representations are seen to “contract” smoothly to representations of the Galilean group which are closely related to the canonical coherent-state representation. The Gaussian weight functions in the latter are seen to emerge from the geometry of the mass hyperboloid. In chapter 5 , the formalism is extended to quantized fields. The basic tool for this is the Analytic-Signal transform,which can be applied to an arbitrary function on R”to give a function on a!” which, although not in general analytic, is “analyticity-friendly” in a certain sense. It is shown that even the most general fields satisfying the Wightman axioms generate a complexification of spacetime which may be interpreted as an extended classical phase space for certain special states associated with the theory. Coherent-st ate represent ations are developed for free Klein-Gordon and Dirac fields, extending the results of chapter 4. The analytic Wightman two-point functions play the role of reproducing kernels. Complex-spacetime densities of observables such as the energy, momentum, angular momentum and charge current are seen to be regularizations of their counterparts in real spacetime. In particular, Dirac particles do not undergo their usual Zitterbewegung. The extension to complex spacetime separates, or polarizes, the positive- and negativefrequency parts of free fields, so that Wick ordering becomes unnecessary. A functionalintegral represent ation is developed for quantized fields which combines the coherent-state representations for particles (based on a finite number of degrees of freedom) with that for fields (based on an infinite number of degrees of freedom). In chapter 6 we give a brief account of some ongoing work, begin-
Preface
xv
ning with a review of the idea of holomorphic gauge theory. Whereas in real spacetime it is not possible to derive gauge potentials and gauge fields from a (fiber) metric, we show how this can be done in complex spacetime. Consequently, the analogy between General Relativity and gauge theory becomes much closer in complex spacetime than it is in real spacetime. In the “holomorphic” gauge class, the relation between the (non-abelian) Yang-Mills field and its potential becomes linear due to the cancellation of the non-linear part which follows from an integrability condition. Finally, we come full circle by generalizing the Analytic-Signal transform and pointing out that this generalization is a higher-dimensional version of the wavelet transform which is, moreover, closely related to various classical transforms such as the Hilbert, Fourier-Laplace and Radon transforms.
I am deeply grateful to G. Emch for his continued help and encouragement over the past ten years, and to J. R. Klauder and R. F. Streater for having read the manuscript carefully and made many invaluable comments, suggestions and corrections. (Any remaining errors are, of course, entirely my responsibility.) I also thank D. Buchholtz, F. Doria, D. Finch, S. Helgason, I. Kupka, Y. Makovoz, J. E. Marsden, M. O’Carroll, L. Rosen, M. B. Ruskai and R. Schor for miscellaneous important assistance and moral support at various times. Finally, I am indebted to L. Nachbin, who first invited me to write this volume in 1981 (when I was not prepared to do so) and again in 1985 (when I was), and who arranged for a tremendously interesting visit to Brazil in 1982. Quero tarnbkm agradecer a todos 0s meus colegas Brasileiros!
xvi
Suggestions t o the Reader The reader primarily interested in the phasespace approach to relativistic quantum theory may on first reading skip chapters 1-3 and read only chapters 4-6, or even just chapter 4 and either chapters 5 or 6, depending on interest. These chapters form a reasonably self-contained part of the book. Terms defined in the previous chapters, such as “frame,” can be either ignored or looked up using the extensive index. The index also serves partially as a glossary of frequently used symbols. The reader primarily interested in signal analysis, timefrequency localization and wavelets, on the other hand, may read chapters 1 and 2 and skip directly to sections 5.2 and 6.2. The mathematical reader unfamiliar with the ideas of quantum mechanics is urged to begin by reading section 1.1, where some basic notions are developed, including the Dirac notation used throughout the book.
1
Chapter 1 COHERENT-STATE REPRESENTATIONS
1.1. Preliminaries In this section we establish some notation and conventions which will be followed in the rest of the book. We also give a little background on the main concepts and formalism of non-relativistic and relativistic quantum mechanics, which should make this book accessible to nonspecialists. 1. Spacetirne and its Dual
In this book we deal almost exclusively with flat spacetime, though we usually let space be R"instead of R3,so that spacetime becomes X =
IRs+l. The reason for this extension is, first of all, that it involves little cost since most of the ideas to be explored here readily generalize to IRS+l, and furthermore, that it may be useful later. Many models in constructive quantum field theory are based on two- or threedimensional spacetime, and many currently popular attempts to unify physics, such as string theories and Kaluza-Klein theories, involve spacetimes of higher dimensionality than four or (on the string worldsheet) two-dimensional spacetimes. An event x € X has coordinates 2
= ( 2 P ) = (xO,xj),
1 . Coherent-St ate Represent ations
2
where x o
3
t is the time coordinate and x j are the space coordinates.
Greek indices run from 0 to s, while latin indices run from 1 to s. If we think of x as a translation vector, then X is the vector space of all translations in spacetime. Its dual
X* is the set of all linear maps
k : X + IR. By linearity, the action of k on x (which we denote by k x instead of k ( x ) ) can be written as 5
kx =
C k,xp
k,xp,
(2)
,=o
where we adopt the Einstein summation convention of automatically summing over repeated indices. Usually there is no relation between x and k other than the pairing (x,k ) H kx. But suppose we are given a scalar product on X , 5 . 2I
=g,,x
CL x 'V
(3)
where (g,,) is a non-degenerate matrix. Then each x in X defines a linear map x * : X + *: X
---f
IR by
x*(x') = x
. x',
thus giving a map
X * , with ( x * ) , =- 5 , = g , , x Y
(4)
Since g,, is non-degenerate, it also defines a scalar product on X * , '
whose metric tensor is denoted by g"". The map x x* establishes an isomorphism between the two spaces, which we use to identify them. If 2 denotes a set of inertial coordinates in free spacetime, then the scalar product is given by
g,, = diag(c2,-1, -1,
- - - ,-1)
where c is the speed of light. X , together with this scalar product, is called Minkowskian or Lorentzian spacetime.
1.1. Preliminaxies
3
It is often convenient to work in a single space rather than the dual pair X and X * . Boldface letters will denote the spatial parts of vectors in X * . Thus x = ( t , -x), k = (ko, k ) and
where x . x' and k - x denote the usual Euclidean inner products in
IR". 2. Fourier ?f-ansforms The Fourier transform of a function f : X + (E (which, to avoid analytical subtleties for the present, may be assumed to be a Schwartz test function; see Yosida [1971])is a function f:X * + a given by
where dz 3 dt d"x is Lebesgue measure on X . f can be reconstructed from by the inverse Fourier transform, denoted by " and given by
3
f ( ~ =)
d k e-2xikzf ( k ) = (P)"(x),
(7)
where d k = dk0d"k denotes Lebesgue measure on X * M X . Note that the presence of the 27r factor in the exponent avoids the usual ~ (27r)-'-' in front of the integrals. need for factors of ( 2 ~ ) - ( " + ' ) /or Physically, k represents a wave vector: ko v is a frequency in cycles per unit time, and kj is a wave number in cycles per unit length. Then the interpretation of the linear map k: X + 1R is that 27rkx is the total radian phase gained by the plane wave g(x') = exp(-27rikxt) through the spacetime translation x , i.e. 27rk "measures" the radian phase shift. Now in pre-quantum relativity, it was realized
4
I. Coherent-St ate Representations
that the energy E combines with the momentum p to form a vector
p ( p , ) = (E,p) in X * . Perhaps the single most fundamental difference between classical mechanics and quantum mechanics is that in the former, matter is conceived to be made of “dead sets” moving in space while in the latter, its microscopic structure is that of waves descibed by complex-valued wave functions which, roughly speaking, represent its distribution in space in probabilistic terms. One important consequence of this difference is that while in classical mechanics one is free to specify position and momentum independently, in quantum mechanics a complete knowledge of the distribution in space, i.e. the wave function, determines the distribution in momentum space via the Fourier transform. The classical energy is reinterpreted as the frequency of the associated wave by Planck’s Ansatz,
where tL is Planck’s constant, and the classical momentum is reinterpreted as the wave-number vector of the associated wave by De Broglie’s relation,
These two relations are unified in relativistic terms as p , = 27rlik,. Since a general wave function is a superposition of plane waves, each with its own frequency and wave number, the relation of energy and momentum to the the spacetime structure is very different in quantum mechanics from what is was in classical mechanics: They become operators on the space of wave functions:
or, in terms of x*,
5
1.1. Preliminaries
This is, of course, the source of the uncertainty principle. In terms of energy-moment um, we obtain the “quantum-mechanical” Fourier transform and its inverse,
If f(s) satisfies a differential equation, such as t,,e Schrodinger equation or the Klein-Gordon equation, then f ( p ) is supported on an s-dimensional submanifold P of X * (a paraboloid or two-sheeted hyperboloid, respectively) which can be parametrized by p E
R”.
We will write the solution as
where f ( p ) is, by a mild abuse of notation, the “restriction” of
P (actually, l f ( p > l 2 is a density on P ) and d p ( p )
f
to
p ( p ) d ” p is an appropriate invariant measure on P. For the Schrodinger equation p(p) = 1, whereas for the Klein-Gordon equation, p ( p ) = lpo I Setting t = 0 then shows that f(p) is related to the initial wave function by G
-’.
where now
“-”
denotes the the s-dimensional inverse Fourier trans-
form of the function
f
on
P
fi?
IR”.
We will usually work with “natural units,” i.e. physical units so chosen that h = c = 1. However, when considering the non-
6
1. Coherent-State Representations
relativistic limit ( c + 00) or the classical limit (ti + 0), c or ti will be reinserted into the equations.
3. Hilbert Space Inner products in Hilbert space will be linear in the second factor and antilinear in the first factor. Furthermore, we will make some discrete use of Dirac’s very elegant and concise bra-ket notation, favored by physicists and often detested or misunderstood by mathematicians. As this book is aimed at a mixed audience, I will now take a few paragraphs to review this notation and, hopefully, convince mathematicians of its correctness and value. When applied to coherent-state representat ions, as opposed to representat ions in which the positionor momentum operators are diagonal, it is perfectly rigorous. (The bra-ket notation is problematic when dealing with distributions, such as the generalized eigenvectors of position or momentum, since it tries to take the “inner products” of such distributions.) Let ‘FI be an arbitrary complex Hilbert space with inner product (-,-). Each element f E 7-t defines a bounded linear functional
f * : ? t + (Jby
The Riesz represent at ion theorem guarantees that the converse is also true: Each bounded linear functional L : 3-1 + for a unique
f
E 3-1. Define the bra
(fl
(fl
a has the form L = f *
corresponding to f by
= f* : 3-1 +
a.
(14)
Similarly, there is a one-to-one correspondence between vectors g E ‘H and linear maps
7
1.1. Preliminaries
1s):
c + ‘H
defined by
Is)(X) = Xg,
E
a,
(16)
which will be called kets. Thus elements of ‘H will be denoted alternatively by g or by 1s). We may now consider the composite map bra-ke t
given by
(f I 9 ) P ) = f*(W = Xf*(s> = X(f, 9).
(18)
Therefore the “bra-ket” map is simply the multiplication by the inner product (f, g) (whence it derives its name). Henceforth we will identify these two and write (flg) for both the map and the inner product. The reverse composition
Is)(fl : ‘H 3-1 +
may be viewed as acting on kets to produce kets: IS)(fl(lW = lg)((fIN)*
(20)
To illustrate the utility of this notation, as well as some of its pitfalls, suppose that we have an orthonormal basis {gn} in H. Then the usual expansion of an arbitrary vector f in H takes the form
n
n
8
1. Coherent-State Representations
from which we have the “resolution of unity”
n
where I is the identity on ‘FI and the sum converges in the strong operator topology. If { h,} is a second orthonormal basis, the relation between the expansion coefficients in the two bases is
In physics, vectors such as gn are often written as I n ) , which can be a source of great confusion for mathematicians. Furthermore,
If),
with functions in L2(IRB), say, are often written as f ( x ) = ( x ( x I 2’ ) = 6 ( x - 2 ’ ) ) as though the I x ))s formed an orthonormal basis. This notation is very tempting; for example, the Fourier transform is written as a “change of basis,”
with the “transformation matrix” ( k I x ) = exp(2~ikx). One of the advantages of this notation is that it permits one to think of the Hilbert space as “abstract,” with ( gn
If),
( h,
If
), ( x 1 f ) and ( k I f )
merely different “representat ions” (or “realizations”) of the same vector f . However, even with the help of distribution theory, this use of Dirac notation is unsound, since it attempts to extend the Riesz representation theorem to distributions by allowing inner products of them. (The “vector” ( x I is a distribution which evaluates test func-
tions at the point z; as such, I x‘ ) does not exist within modern-day distribution theory.) We will generally abstain from this use of the bra-ket notation.
1.2. Canonical Coherent States
9
Finally, it should be noted that the term “representation” is used in two distinct ways: (a) In the above sense, where abstract Hilbertspace vectors are represented by functions in various function spaces, and (b) in connection with groups, where the action of a group on a Hilbert space is represented by operators. This notation will be especially useful when discussing frames, of which coherent-state representations are examples.
1.2. Canonical Coherent States
We begin by recalling the original coherent-state representations (Bargmann [1961],Klauder [1960,1963a, b], Segal [1963a]). Consider a spinless non-relativistic particle in IR.” (or 9/3 such particles in IR.~), whose algebra of observables is generated by the position operators XI, and momentum operators Pk,k = 1,2,. . .s. These satisfy the “canonical commutation relations”
where I is the identity operator. The operators -iXk, -iPk and -iI together form a r e d Lie algebra known aa the Heisenberg algebra, which is irreducibly represented on L2(R8)by
the Schrodinger representation. As a consequence of the above commutation relations between xk and Pk, the position and momentum of the particle obey the
10
1. Coherent-St ate Represent ations
Heisenberg uncertainty relations, which can be derived simply as follows. The expected value, upon measurement, of an observable represented by an operator F in the state represented by a wave function f ( x ) with llfll = 1 (where 11 11 denotes the norm in L2(R"))is given
-
by
In particular, the expected position- and momentum coordinates of the particle are ( Xk ) and ( pk ). The uncertainties Ax, and Apk in position and momentum are given by the variances
Choose an arbitrary constant b with units of area (square length) and consider the operators
Notice that although Ak is non-Hermitian, it is real in the Schrodinger representation. Let
where Zk denotes the complex-conjugate of
Zk.
Then for 6Ak =-
Ak - Z k I we have ( 6 & ) = 0 and
The right-hand side is a quadratic in b, hence the inequality for all b demands that the discriminant be non-positive, giving the uncertainty relations
1.2. Canonical Coherent States
AX$AP$L
11
1
-2 '
Equality is attained if and only if 6Akf = 0, which shows that the only minimum-uncertainty states are given by wave functions f(z) satisfying the eigenvalue equations
for some real number b (which may actually depend on k) and some z E a'. But square-integrable solutions exist only for b > 0, and then there is a unique solution (up to normalization) xz for each z E as. To simplify the notation, we now choose b = 1. Then Ak and A*, satisfy the commutation relations
and
xz is given by
where the normalization constant is chosen ELS N = 7rr-s/4, so that llxzll = 1for z = 0. Here f 2 is the (complex) inner product of Z with itself. Clearly xz is in L2(IRd),and if z = z - ip, then
in the state given by
xz. The vectors xz are known as the canon-
ical coherent states. They occur naturally in connection with the
harmonic oscillator problem, whose Hamiltonian can be cast in the form 1 1 sw H = -( P 2+ m 2 u 2 X 2 )= - w 2 A * - A + 2m 2 2
(13)
12
1, Coherent4t a t e Representations
with
(thus b = l / r n w ) . They have the remarkable property that if the initial state is xt,then the state at time t is x t ( q where ~ ( tis)the orbit in phase space of the corresponding classical harmonic oscillator with initial data given by z. These states were discovered by Schrodinger himself [1926], at the dawn of modern quantum mechanics. They were further investigated by Fock [1928] in connection with quantum field theory and by von Neumann [1931]in connection with the quantum measurement problem. Although they span the Hilbert space, they do not form a basis because they possess a high degree of linear dependence, and it is not easy to find complete, linearly independent subsets. For this reason, perhaps, no one seemed to know quite what to do with them until the early 1960’9, when it was discovered that what really mattered was not that they form a basis but what we shall call a generalized frame. This allows them to be used in generating a representation of the Hilbert space by a space of analytic functions, as explained below. The frame property of the coherent states (which will be studied and generalized in the following sections and in chapter 3) was discovered independently at about the same time by Klauder, Bargmann and Segal. Glauber [1963a,b] used these vectors with great effectivenessto extend the concept of optical coherence to the domain of quantum electrodynamics, which was made necessary by the discovery of the laser. He dubbed them “coherent states,” and the name stuck to the point of being generic. (See also Klauder and Sudarshan [1968].) Systems of vectors now called “coherent” may have nothing to do with optical coherence, but there is at least one unifying characteristic, namely their frame property
1.2. Canonical Coherent States
13
(next section). The coherent-state representation is now defined as follows: Let F be the space of all functions
J
=N
J
d'x'
exp[-r2/2
+ z - x' - ~ ' ~ / 2 ] f ( ~ ' )
(15)
where f runs through L2(IR'). Because the exponential decays rapidly in x', f" is entire in the variable I E C'. Define an inner product on F by
where
I
x - ip and
dp(z) = (27r)-' exp( --E
- 2/2) d'x
d'p.
(17)
Then we have the following theorem relating the inner products in L2(IR') and F. T h e o r e m 1.1. Let f,g E L2(IRa) and let entire functions in F. Then
f", ij
be the corresponding
Proof. To begin with, assume that f is in the Schwartz space S(IR') of rapidly decreasing smooth test functions. For z = x - ip, we have
xl(x') = N exp[-E2/2
+ x2/2 - (5' - ~
+
) ~ / i2p * 4,
14
1. Coherent-State Representations
hence
where denotes the Fourier transform with respect to x'. Thus by Plancherel's theorem (Yosida [1971]),
(27r)-'
/
d'p exp(-p2/2)lf"(z - ip)12 = N 2 exp(x2/2)
J
d'z' exp[-(z' - x ) ~lf(x')l2. ] (20)
Therefore
=N2
/ 1 d'z
)'.(fI
d a d exp[-(z' - x ) ~ ]
1'
(21)
after exchanging the order of integration. This proves that
11311:
= Ilf
IIL
(22)
for f E S(IR"),hence by continuity also for arbitrary f E L2(IRb). By polarization the result can now be extended from the norms to the inner products. I The relation f H f' can be summarized neatly and economically in terms of Dirac's bra-ket notation. Since
1.2. Canonicd Coherent States
15
theorem 1 can be restated as
Dropping the bra
(fl
and ket
Is), we have the operator identity
where I is the identity operator on L2(IRB) and the integral converges at least in the sense of the weak operator topology,* i.e. as a quadratic form. In Klauder’s terminology, this is a continuous resolution of unity. A general operator B on L2(RB) can now be expressed as an integral operator B on F as follows:
Particularly simple representations are obtained for the basic position- and momentum’ operators. We get
*
As will be shown in a more general context in the next sec-
tion, under favorable conditions the integral actually converges in the strong operator topology.
1. Coherent-State Representations
16
thus
Hence xk and Pk can be represented as differential rather than integral operat ors. As promised, the continuous resolution of the identity makes it possible to reconstruct f E L2(IR")fr lm its transform f" E E
that is,
"
Thus in many respects the coherent states behave like a basis for
L2(IR").But they differ from a basis i s at least one important
respect: They cannot all be linearly independent, since there are uncountably many of them and L2(IR")(and hence also F)is separable. In particular, the above reconstruction formula can be used to express x I in terms of all the xw's:
1.2. Canonical Coherent States
17
In fact, since entire functions are determined by their values on some discrete subsets I? of Cs,we conclude that the corresponding subsets of coherent states {xZ I z E I?} are already complete since for any function f orthogonal to them all, f(z) = 0 for all z E r and hence f = 0, which implies f = 0 a.e. For example, if I’is a regular lattice, a necessary and sufficient condition for completesness is that r contain at least one point in each Planck cell (Bargmann et al., [1971]), in the sense that the spacings Azk and Apk of the lattice coordinates zk = X k ipk satisfy AxkApk _< 27rh 21r. It is no accident that this looks like the uncertainty principle but with the inequality going “the wrong way.” The exact coefficient of h is somewhat arbitrary and depends on one’s definition of uncertainty; it is possible to define measures of uncertainty other than the standard deviation. (In fact, a preferable-but less tractabledefinition of uncertainty uses the notion of entropy, which involves all moments rather than just the second moment. See Bialynicki-Birula and Mycielski [1975] and Zakai [1960].) The intuitive explanation is that if f gets “sampled” at least once in every Planck cell, then it is uniquely determined since the uncertainty principle limits the amount of variation which can take place within such a cell. Hence the set of all coherent states is overcomplete. We will see later that reconstruction formulas exist for some discrete subsystems of coherent states, which makes them as useful as the continuum of such states. This ability to synthesize continuous and discrete methods in a single representation, as well as to bridge quantum and classical concepts, is one more aspect of the a.ppea1 and mystery of these systems. I
+
18
1. Coherent-State Representations
1.3. Generalized Frames and Resolutions of Unity Let M be a set and p be a measure on M (with an appropriate aalgebra of measurable subsets) such that { M ,p } is a a-finite measure space. Let 3-1 be a Hilbert space and hm E 3-1 be a family of vectors indexed by m E M .
Definition.
The set
is a generalized frame in 3-1 if 1. the map h: m H hm is weakly measurable, i.e. for each f E 3.1 the function f ( m ) ( hm I f ) is measurable, and
2. there exist constants 0 < A I B such that r
3 - 1 ~is a frame (see Young [1980] and Daubechies [198Sa]) in the special case when M is countable and p is the counting measure on M (i.e., it assigns to each subset of M the number of elements
contained in it). In that case, the above condition becomes
C
A I I ~ I I ~ S I(hmIf)12
vf~3-1.
(3)
mEM
We will henceforth drop the adjective “generalized” and simply speak of “frames.” The above case where M is countable will be refered to as a discrete frame. If A = B , the frame 3 - 1 ~is called tight. The coherent states of the last section form a tight frame, with M = a?,dp(m) = d p ( ~ )11,, = xZ and A = B = 1.
1.3. Generalized I h m e s and Resolutions of Unity
19
Given a frame, let T be the map taking vectors in ‘H to functions on M defined by
( ~ f > ( m (>hm
If )
.f(m),
Then the frame condition states that
f E 3.t.
(4)
Tf is square-integrable with
respect to dp, so that T defines a map
T : ‘H + L 2 ( d p ) ,
(5)
witb
The frame property can now be stated in operator form as
AI 5 T*T 5 B I ,
(7)
where I is the identity on ‘H. In bra-ket notation,
where the integral is to be interpreted, initially, as converging in the weak operator topology, i.e. as a quadratic form. For a measurable subset N of M , write
Proposition 1.2. I f the integral G ( N ) converges in the strong operator topology of ‘H whenever N has finite measure, then so does the complete integral representing G = T*T.
20
1. Coherent-State Representations
Proof*. Since M is a-finite, we can choose an increasing sequence { M n } of sets of finite measure such that M = U,Mn. Then the corresponding sequence of integrals G , forms a bounded (by G) increasing sequence of Hermitian operators, hence converges to G in the strong operator topology (see Halmos [1967], problem 94). I If the frame is tight, then G = A1 and the above gives a resolution of unity after dividing by A. For non-tight frames, one generally has to do some work to obtain a resolution of unity. The frame condition means that G has a bounded inverse, with
Given a function g(m) in L 2 ( d p ) ,we are interested in answering the following two questions: (a) Is g = Tf for some f E 'H? (b) If so, then what is f ? In other words, we want to:
c L 2 ( d p ) of the map T. (b) Find a left inverse S of T,which enables us to reconstruct f from (a) Find the range %T
Tf by f = S T f . Both questions will be answered if we can explicitly compute
G-l. For let
P
= TG-I T*:L2(dp) -+
L2(dp).
Then it is easy to see that
*
I thank M. B. Ruskai for suggesting this proof.
(11)
1.3. Generalized fiames and Resolutions of Unity
21
( a ) P* = P
( b ) P2 = P (c)
PT
= T.
It follows that P is the orthogonal projection onto the range of T,
Tf = g, and conversely if for some g we have Pg = g , then g = T(G-'T*g) z Tf.
for if g =
Tf for some f in 3.t, then Pg = P T f
=
Thus 8~is a closed subspace of L2(dp) and a function g E L 2 ( d p ) is in 8~if and only if
The function
therefore has a property similar to the Dirac &-functionwith respect to the measure dp, in that it reproduces functions in %T. But it differs from the S-function in some important respects. For one thing, it is bounded by
1. Coherent-St ate Representations
22
IK(m,m')l = l ( h m l G - l h ~ ) l
5 IIG-' II IIhmII IIhml II I A-'lIhml\ Ilhm)II < 00
(16)
for all m and m'. Furthermore, the "test functions" which K ( m ,m') reproduces form a Hilbert space and K(m,m') defines an integral operator, not merely a distribution, on %T. In the applications to relativistic quantum theory to be developed later, M will be a complexification of spacetime and K(m,m') will be holomorphic in m and antiholomorphic in m'. The
Hilbert space %T and the associated function K ( m ,m') are an example of an important structure called a reproducing-kernel Hilbert space (see Meschkowski [1962]), which is reviewed briefly in the next section. K(rn,m') is called a reproducing kernel for SRT. We can thus summarize our answer to the first question by saying that a function g E L2(dp) belongs to the range of T if and only if it satisfies the consistency condition
Of course, this condition is only useful to the extent that we have information about the kernel K(m,m') or, equivalently, about the operator G-l. The answer to our second question also depends on the knowledge of G-l. For once we know that g = Tf for some
f
E 3.1, then
Thus the operator
1.3. Generalized Fkaxnes and Resolutions of Unity
S = G-lT*: L2(dp)+ 'FI
23
(19)
is a left inverse of T and we can reconstruct f by
f = Sg = G-'T*Tf
This gives f as a linear combination
Note that
d p ( m ) I h" ) ( hm I = G-lGG-'
= G-I,
/M
therefore the set
3-IM
3
(h" Im E M )
(23)
is also a frame, with frame constants 0 < B-' 5 A-I. We will call ' H M the frame reciprocal to 3 - 1 ~ . (In Daubechies [ISSSa], the corresponding discrete object is called the dual frame, but as we shall see below, it is actually a generalization of the concept of reciprocal basis; since the term "dual basis" has an entirely different meaning, we prefer "reciprocal frame" to avoid confusion.) The above reconstruction formula is equivalent to the resolutions of unity in terms of the pair ' H M , E M of reciprocal frames:
24
1. Coherent-State Representations
Corollary 1.3. Under the assumptions of proposition 1.2, the above resolutions of unity converge in the strong operator topology of ‘FI. The proof is similar to that of proposition 1.2 and will not be given. The strong convergence of the resolutions of unity is important, since it means that the reconstruction formula is valid within
3-1 rather
than just weakly. Application to f = hk for a fixed k E M gives
which shows that the frame vectors hm are in general not linearly independent. The consistency condition can be understood as requiring the proposed function g(m) to respect the linear dependence of the frame vectors. In the special case when the frame vectors are linearly independent, the frames 7 - l and ~ ‘FIM both reduce to bases of
7-l. If 7-l is separable (which we assume it is), it follows that M must be countable, and without loss in generality we may assume that d p is the counting measure on M (renormalize the hm’s if necessary). Then the above relation becomes
mEM
and linear independence requires that K be the Kronecker 6: K ( m ,k ) = 6;. Thus when the hm’s are linearly independent, 3-1121 and ‘FIM reduce to a pair of reciprocal bases for ‘FI. The resolutions of unity become
GEM and we have the relation
1.3. Generalized fiames and Resolutions of Unity
25
(28) where gmk
I
(hm hk) =
(h" I G h k )
(29)
is an infinite-dimensional version of the metric tensor, which mediates between covariant and contravariant vectors. (The operator G plays the role of a metric operator.) In this case, %T = L 2 ( d p ) 1 2 ( M )and the consistency condition reduces to an identity. The reconstruction formula becomes the usual expression for f as a linear combination of the (reciprocal) basis vectors. If we further specialize to the case of a tight frame, then G = AI implies that
(h,(hk)=A6:
and
(hmJhk)=A-'6P,
(30)
so 'HM and 'HM become orthogonal bases. Requiring A = B = 1
means that ' H M = 'KM reduce to a single orthonormal basis. Returning to the general case, we may summarize our findings as follows: E M ,N M ,K(m,m') and g(m,m') (h" I Ghml) are generalizations of the concepts of basis, reciprocal basis, Kronecker delta and metric tensor to the infinite-dimensional case where, in addition, the requirement of linear independence is dropped. The point is that the all-important reconstruction formula, which allows us to express any vector as a linear combination of the frame vectors, survives under the additional (and obviously necessary) restriction that the consistency condition be obeyed. The useful concepts of orthogonal and orthonormal bases generalize to tight frames and frames with A = B = 1, respectively. We will call frames with A = B = 1normal. Thus normal frames are nothing but resolutions of unity.
=
26
1. Coherent-State Representations hturning to the general situation, we must still supply a way
of computing G - l , on which the entire construction above depends. In some of the examples to follow, G is actually a multiplication operator, so G-' is easy to compute. If no such easy way exist, the following procedure may be used. From AI 5 G 5 B I it follows that 8
1 1 - - ( B - A)I 5 G - - ( B 2 2
1 + A ) I 5 -(B - A)I. 2
Hence letting
6=-
B-A B+A
and
c=-
2 B+A'
we have
-61 5 I - CG 5 6I. Since 0
(33)
5 6 < 1 and c > 0, we can expand 00
-'= c C(I - c G ) ~
G-' = c [ I - ( I - cG)]
(34)
k=O
and the series converges uniformly since
( ( I- ~ G l 5 l 6 < 1. The smaller 6, the faster the convergence. For a tight frame,
(35)
S=0
and cG = I , so the series collapses to a single term G-' = c. Then the consistency condition becomes
and the reconstruction formula simplifies to
1.3. Generalized fiames and Resolutions of Unity
27
If 0 < S << 1, the frame is called snug (Daubechies [1988a]) and the above formulae merely represent the first terms in the expansions. However, it is found in practice that under certain conditions, this first term gives a good approximation to the reconstruction for snug frames. It appears to be an advantage to have much linear dependence among the frame vectors (precisely that which is impossible when dealing with bases!), so the transformed function j ( m )is “oversampled.” For such frames, the oversampling seems to compensate for the truncation of the series. A good measure of the amount of linear dependence among the hm’s is the size of the orthogonal complement R$ of the range of T, which is the null space N ( T * )of T* since
(glTf)=O
gE%+
* (T*gIf)=O
Vf€%
V ~ E %
(38)
e T*g = 0. Suppose we are given some function g(m) E L2(dp) and apply the reconstruction formula to it blindly, without worrying whether the consistency condition is satisfied. That is, consider the vector
G-’T*g
f
in %. Since g can be uniquely written gi
+
g2
where g i E %T and
g2
E
(39) &s
an orthogonal sum g =
= N ( T * ) ,we find that
where g l does satisfy the consistency condition. This means that applying the reconstruction formula to an arbitrary g E L 2 ( d p ) results in a least-squares approximation f to a reconstruction, in the sense that the “error” 92
g
- g1 = g - Tfhas a minimal norm in L2(dp).
28
1. Coherent-State Representations
For example, suppose we only have information about f ( m )for
m in some subset
r of M . Let
Then g does not belong to
!RT,in general, and f1
= G-'T*g is the
least-squares approximation to the (unknown) vector f in view of our ignorance. A similar argument applies if M is discrete and f must be reconstructed from Tf by numerical methods. Then we must confine ourselves to a finite subset I? of M . The above procedure then gives a least-squares approximation to formula to a finite sum over
f by truncating the reconstruction
r.
A final note: The usual argument in favor of using bases rather than overcomplete sets of vectors is that one desires a unique representation of functions as linear combinations of basis elements. When a frame is not a basis, i.e. when the frame vectors are linearly depen-
dent, this uniqueness is indeed lost since we may add an arbitrary function e(m) E %+ to f(m) in the reconstruction formula without changing f (%+ # (0) since the frame vectors are dependent). However, there is still a unique admissible coefficient function, i.e. one satisfying the consistency condition. Moreover, as we shall see, it usually happens in practice that the set M , in addition to being a measure space, has some further structure, and the reproducing kernel K(m,m') preserves this structure. For example, M could be a topological space and I< be continuous on M x M , or M could be a differentiable manifold and I< be differentiable, or (as will happen in our treatment of relativistic quantum mechanics) M could be a complex manifold and K be holomorphic. Furthermore, I< could exhibit a certain boundary- or asymptotic behavior. In such cases, these
1.4. Reproducing-Kernel Hilbert Spaces
29
properties are inherited by all the functions f(m) in %T, and then of all possible coefficient functions for a given f E 3.c, there is only one which exhibits the appropriate behavior. In this sense, uniqueness is restored. We will refer to frames with such additional properties as
continuous, differentiable, holomorphic, etc. In addition to properties such as differentiability or holomorphy, the kernels K we will encounter will usually have certain invariance properties with respect to some group of transformations acting on
M . This, too, will induce a corresponding structure on the function space 8 ~ .
1.4, Reproducing-Kernel Hilbert Spaces The function K ( m ,m’) of section 1.3 is an example of a very general structure called a reproducing kernel, which we now review briefly since it, too, will play an important role in the chapters to come. The reader interested in learning more about this fascinating subject may consult Aronszajn [1950], Bergman [1950], Meschkowski [1962] or Hille [1972].
Suppose we begin with an arbitrary set M and a set of functions g(m)on M which forms a Hilbert space F under some inner product
(
I - ).
In section 1.3, M happened to a measure space, ,F was %T and the inner product was that of L2(dp). But it is important to +
keep in mind that the exact form of the inner product in
F
does
not need to be specified, as far as the general theory of reproducing kernels is concerned. Suppose we are given a complex-valued function
1. Coherent-State Representations
30
K(rn,rn') on M x M with the following two properties: 1. For every rn E M , the function K,(rn')
= K(m',rn) belongs to
F. 2. For every m E M and every g E F,we have
Then F is called a reproducing-kernel Hilbert space and K(m,m') is called its reproducing kernel. Some properties of K follow immediately from the definition. For example, since K, E F, property (2) implies that
Thus I P i must satisfy (a) I{(m,m') = K ( m ' , m )
(a) (4
K(m,m)= llKm1122 0 Vm E M
(3)
IWm, m')1 5 IIKmll IIKm) 11-
One of the most important and useful aspects of reproducingkernel Hilbert spaces is the fact that the kernel function itself virtually generates the whole structure. For example, the function L(m) IIICmII dominates every g(m) in F in the sense that
by the Schwarz inequality. In particular, all the functions in .F inherit the boundednes and growth properties of I{. If I< has singularities, then some functions in F have similar singularities. From the above it follows that for fixed mo E M ,
31
1.4. Reproducing-Kernel Hilbert Spaces
the supremum being attained by gmo(m)= Kmo(m)/L(mo), and this is, up to an overall phase factor, the only function which attains the supremum. This fact has an important interpretation when M is the classical phase space of some system, .F represents the Hilbert space of the corresponding quantum mechanical system and lg(m)I2is the probability density in the quantum mechanical state g of finding the system in the classical state m. The above inequality then shows that the state which maximizes the probability of being at mo is uniquely determined (up to a phase factor) as gmo. In other words, gmo is a wave packet which is in some sense optimally localized at mo in phase space. Another example of how the kernel function K embodies the properties of the entire Hilbert space .F is the following: If the basic set M has some additional structure, such as being a measure space (as it was in the setting of frame theory) or a topological space or a C',Coo, real-analytic or complex manifold, and if K(m,m'),
as a function of m, preserves this structure (that is, K(m,m') is ' , C", real-analytic or holomorphic in m, measurable, continuous, C respectively), then every function g ( m ) in F . has the same property. The question arises: If we are given a Hilbert space
F
whose
elements are all functions on M , how do we know whether this space posesses a reproducing kernel? Clearly a necessary condition for K to exist is that
32
1. Coherent-St ate Represent ations
for all f E F and all m E M . But this means that for every fixed m, the map Em:F + (E defined by
is a bounded linear functional on F. E m is called the evaluation map at m. By the Riesz representation theorem, every bounded linear functional E on F must have the form E ( f ) = ( e I f ) for a unique vector e E F. Hence there exists a unique em E 3 such that
for all f E F. Since also f(m) = ( K m I f ), we conclude that em = ICm. Thus it follows that given only F such that all the evaluation maps E m are bounded (not necessarily uniformly in m ) , we can construct a reproducing kernel by
K(m,m') = ( em I em#).
(9)
That is, F posesses a reproducing kernel if and only if all the evaluation maps are hounded. In all of the applications we will encounter, the set M will have a structure beyond those mentioned so far: it will be a Lie group G or a homogeneous space of G. That is, each g E G acts on M as an invertible transformation preserving whatever other structure M may have such as continuity, differentiability, etc, and these tranformations form the group G under composition. Let us denote the action of g on M by m H rng (i.e., G acts from the right). Then the operat or
1.4. Reproducing-Kernel Hilbert Spaces
33
gives a representation of G on F,since TgTgI= Tggt, From f ( m )=
( Km I f ) it now follows that
hence
Therefore the reproducing kernel I<(ml,m)is invariant under the action of G if and only if all the operators Tg axe unitary, i.e. the representation g H Tgis unitaqy. The group G usually appears in applications as a natural set of operations such as translations in space and time (evolution), changes of reference frame or coordinate system, dilations and frequency shifts (especially useful in signal analysis), etc. Invariance under G then means that the transformed objects (wave functions, signals) form a description of the system equivalent to the original one, i.e. that the transformat ion changes nothing of physical significance. Finally, let us note that the concept of generalized frame, as defined in the previous section, is a special case of a reproducingkernel Hilbert space. For given such a frame {h,), the function K ( m ' ,m ) = ( hmf I G-lh, ) is a reproducing kernel for the space %T since (a):
34
1. Coherent-St a t e Represent ations
Km(m’)2 K(m’,m) = (T(G-’hm))(m’) shows that K , = T(G-’h,)
belongs to S T , and (b):
1.5. Windowed Fourier Transforms The coherent states of section 2 form a (holomorphic) frame. I now want to give some other examples of frames, in order to develop a better feeling for this concept. The frames to be constructed in this section turn out to be closely related to the coherent states, but have a distinct “signal processing” flavor which will lend some further depth to our understanding of phase-space localization. For simplicity, we begin with the study of functions f ( t ) of a single real variable which, for motivational purposes, will be thought of as “time.” Although in the model to be built below f is real-valued, we allow it to be complex-valued since our eventual applications will be quantum-mechanical. The same goes for the window function h. The extension to functions of several variables, such as wave functions or physical fields in spacetime, is straightforward and will be indicated later. We think of f(t) as a “time-signal,” such as the voltage going into a speaker or the pressure on an eardrum. We are interested in the frequency content of this signal. The standard thing to do is to find its Fourier coefficients (if f is periodic) or its Fourier transform (if it is not periodic). But if f represents, say, a symphony, this
1.5. Windowed Fourier llansforrns
35
approach is completely inappropriate. We are not interested in the total amplitude
in f of each frequency v. For one thing, both we and the musicians would be long gone before we got to enjoy the music. Moreover, the musical content of the signal, though coded into f , would be as inaccessible to us as it is in f ( t ) itself. Rather, we want to analyze the frequency content of f in real time. At each instant t = s we hear a “spectrum” f(v,s). To accomplish this, let us make a simple (though not very realistic) linear model of our auditory system. We will speak of the “ear,” though actually the frequency analysis appears to be performed partly by the nervous system as well (Roederer [1975]). The ear’s output at time s depends on the input f ( t ) for t in some interval s - r 5 t 5 s, where r is a lag-time characteristic of the ear. In analyzing f(t) in this interval, the ear may give different weights to different parts of f(t). Thus the signal to be analyzed for output at time s may be modeled as
where h(t - s) is the weight assigned to f(t). (As noted above, we are allowing h to be complex-valued for future applications, though here it should be real-valued; the bar means complex conjugation.) The function h(t) is characteristic of the ear, with support in the interval -r 5 t 5 0. Such functions are known in communication theory as windows . Having “localized” the signal around time s, we now analyze its frequency content by taking the Fourier transform:
36
1. Coherent-State Representations
J-00
This function, representing our “dynamical spectrum,” is called a windowed Fourier transform of f (Daubechies [1988a]). If the window is flat ( h ( t )= 1))f ( v , s ) reduces to the ordinary Fourier transform. On the other hand, if f(t) is a “unit impulse” at t = 0, i.e. f(t) =
&(t),then f ( v ,s) = h(-s). Hence h(-s) is the “impulse response” (Papoulis [1962]) of the ear. Note that 00
dt h(t - S ) f ( t )
f(0,S) =
(4)
J-00
is nothing but the convolution of f with the impulse response. Thus the windowed Fourier transform is a marriage between the Fourier transform and convolution. Letting
we have
h, 4=(
hv,s
I f ),
(6)
the inner product being in L2(IR). As our notation implies, we want to make a frame indexed by the set
M = { ( v , s) I v,s
E
IR} = IR2,
(7)
M corresponds to the “phase space” in quantum mechanics in the sense that if t were the posithat is, the time-frequency plane.
tion coordinate, then v would be the wavenumber and 27rhv would be the momentum. Since we expect all times and all frequencies to be equally important, let us guess that an appropriate measure on A4 is the Lebesgue measure, dp(v, s ) = ds dv. Now
37
1.5. Windowed Fourier fiansforms
&, 4 = (Lf)"(V)
(8)
where hd(t) = h(t - s) is the translated window and denotes the Fourier transform. Thus for "nice" signals (say, in the space of Schwartz test functions), Plancherel's theorem gives A
where both norms are those of L2(R). This shows that the family of vectors h,,, is indeed a tight frame, with frame constants A = B = llh1I2. Now h,,,= = exp(-2nivt)hS(t) is just a translation of h, in ftequency, that is, L,,,+(v')= R,(v' - v).
(10)
Thus if the Fourier transform R(v) of the basic window is concentrated near u = 0 (which, among other things, means that h(t) must be fairly smooth, since any discontinuities or sharp edges introduce high-frequency components), then that of h,,,,, is concentrated near v' = v. In other words, hu,d is actually a window in phase space or time-frequency. The fact that these windows form a tight frame means that signals may be equally well represented in timefrequency as in time. The resolution of unity is
38
1. Coherent-State Representations
and if the consistency condition is satisfied the reconstruction formula gives
f ( t ) = llhll-2
/
ds dv e-2niut h(t - s) f(v, s).
(12)
IR2
As in section 1.2, we denote by ? thei frame ~ formed by the hy,s’s. Incident ally, the windowed Fourier transform is quite symmetrical with respect to the interchange of time and frequency. It can be rewritten as f(v, s) = =
(6s f) ^(v>
(is *f) ( v )
where
now plays the role of a window in frequency used to localize As will be seen in the next chapter, the reason for this symmetry is that windowed Fourier transforms, like the canonical coherent states,
fl.
are closely related to the Weyl-Heisenberg group, which treats time and frequency in a symmetrical fashion. (This is rooted in symplectic geometry.) The hv,s’s are highly redundant. We want to find discrete subsets of them which still form a frame, that is we want discrete subframes. The following construction is taken from Kaiser [1978c, 1984al. Let T
> 0 be
a fixed time interval and suppose we “sam-
ple” the output signal f ( v , s ) only at times s = nT where n is an integer. To discretize the frequency as well, note that the localized signal f n ~ ( t=) h n ~ ( t ) f ( t )has compact support in the interval nT - T 5 t 5 nT,hence we can expand it in a Fourier series
1.5. Windowed Fourier Tknsforms
39
(14) m
where F = 1/r and nT Cmn
at e2nimFt
=FLT-,
fnT(t)
nT
=F
J,,_,at
e2nimFt
h(t - nT)f ( t )
(15)
= Ff"(mF,nT).
The only problem with this representation of f n T is that it only holds in the interval nT - r 5 t 5 nT, since f n T vanishes outside this interval while the Fourier series is periodic. To force equality at all times, we multiply both sides by r h n T ( t ) :
=
c
h m F , n T ( t ) ( h,mF,nT
I f 1.
m
Attempting to recover the entire signal rather than just pieces of it, we now s u m both sides with respect to n:
Recovery of f ( t ) is possible provided that the sum on the left converges to a function
40
1. Coherent-State Representations
which is bounded above and below by positive constants: 0
< A 5 g(t) 5 B.
In that case, we have
and hence the subset f (hmF,nT
I m, n E z)
(21)
forms a discrete subframe of 3 . c ~with frame constants A and B. This subframe is in general not tight, and the operator G is just multiplication by g(t), so finding G-’ presents no problem in this case. The reconstruction formula is
f(t) = g(t)-l
hmF,nT(t)f”(mF,nT)-
(22)
n,m€Z
We are therefore able to recover the signal by “sampling” it in phase space at time intervals At = T and frequency intervals Au = F. What about the uncertainty principle? It is hiding in the condition that the discrete subset
hmFnT
forms a frame! For we cannot satisfy
2 A > 0 unless the supports of h(t) and h(t - T) overlap, which implies that T 5 T , or AtAu = T / r 5 1. In quantum mechanics, the
g(t)
radian frequency is related to the energy by condition is
At A E 5 27rh.
E
= 27rhv, so the above
(23)
This looks like the uncertainty principle going “the wrong way.” The intuitive explanation for this has been discussed at the end of section
1.3.
I.5. Windowed Fourier ?f.ansforms
41
Notice that the closer we choose T to T , the more difficult it is for the window function to be smooth. In the limiting case T = r , h ( t ) must be discontinuous if the above frame condition is to be obeyed. As noted earlier, this means that its Fourier transform k(v) can no longer be concentrated near v = 0, so the frequency resolution of the samples suffers. In concrete terms, this means that whereas for “nice” windows h(t) we may hope to get a good approximation to the reconstruction formula by truncating the double sum after an appropriate finite number of terms, this can no longer be expected when T + r . In other words, it pays to oversample! “Appropriate” here means that we cover most of the area in the t i m e frequency plane where f lives. Clearly, if the sampling is done only for 1721 5 N , we cannot expect to recover f ( t ) outside the interval -NT - T <_ t 5 N T . If k ( u ) is nicely peaked around v = 0, say with a spread of Au, and the signal f ( t ) is (approximately) “bandW , then we can expect to get limited,” so that f ( u ) x 0 for IvI a good approximation to f(t) by truncating the sum with rn 5 M , where M is chosen to satisfy M F > W Au. On the other hand, we may actually not be interested in recovering the exact original signal
>
+
f(t). If we are only interested in a particular time interval and a particular frequency interval (say, to eliminate some high-frequency noise), then the appropriate truncation would include only an area in the time-frequency plane which is slightly larger than our area of interest. If we sample a given signal f only in a finite subset r of our lattice (as we are bound to do in practice) and then apply the reconstruction formula, truncating the sum by restricting it to I?, then the result is a least-squares approximation f~ to the original signal.
On a philosophical note, suppose we are given an arbitrary signal
42
1. Coherent-State Representations
without any idea of the type of information it carries. The choice of a window h ( t ) then defines a scale in time, given by T , and it is this scale that distinguishes what will be perceived as frequency and what as time. Thus, variations of f(t) in time intervals much smaller than T are reflected in the v-behavior of f", while variations at scales much larger than 7 survive in the s-dependence of f. What an elephant perceives as a tone may appear as a rhythm to a mouse. Finally, I wish to compare the above reconstruction with the Nyquist/Shannon sampling theorem (Papoulis [19621)) which gives a reconstruction for band-limited signals ( f ( v ) x 0 for Ivl 2 W ) in terms of samples of f(t) taken at times t = n/2W for integer n. Note, first of all, that in the time-frequency reconstruction formula above, it was not necessary to assume that f(t) is band-limited. In fact, the formula applies to all square-integrable signals. But suppose that f ( t ) is band-limited as above, and choose a window h ( t ) such that k(v) is concentrated around an interval of width Au about the origin, with Au < W . Then we expect f ( v ,s) x 0 for IvI 2 W + A v . In order to reduce the double sum in our reconstruction formula to a single sum over n as in the Nyquist theorem, choose F = W Av. Then f"(mF',s) e 0 whenever m # 0. To apply the reconstruction formula, we must still choose a time interval T < T . By the uncertainty principle, T A V> 1/2. The above condition on Av therefore implies that T > 1/2W. It would thus seem that we could get away with a slightly larger sampling interval T than the Nyquist interval TN = 1/2W. Our reconstruction formula reduces to
+
f(t)
X
g(t)-'
c
h(t - nT)f"(0, nT).
(24)
n
The smaller the ratio A v / W , the better this approximation is likely
1.6. Wavelet ?Eansforms
43
to be. But a small Av means a large r , hence the samples f(0, nT) are smeared over a large time interval.
1.G. Wavelet Transforms The frame vectors for windowed Fourier transforms were the wave packets
h&)
= e-2Ti”%(t
- 3).
(1)
The basic window function h ( t ) was assumed to vanish outside of the interval -T < t < 0 and to be reasonably smooth with no steep slopes, so that its Fourier transform k ( v ) was also centered in a small interval about the origin. Of course, since h ( t ) has compact support, f ( v )is the restriction to IR of an entire function and hence cannot vanish on any interval, much less be of compact support. The above statement simply means that A(.) decays rapidly outside of a small interval containing the origin. At any rate, the factor exp(-2rivt) amounted to a translation of the window in frequency, so that hV,# was a “window” in the time-frequency plane centered about ( v , ~ ) .
Hence the frequency components of f ( t ) were picked out by means of rigidly translating the basic window in both time and frequency. (It is for this reason that the windowed Fourier transform is associated to the Weyl-Heisenberg group, which is exactly the group of all translations in phase space amended with the multiplication by phase factors necessary to close the Lie algebra, as explained in chapter 3.) Consequently, hV,#has the same width r for all frequencies, and the number of wavelengths admitted for analysis is ur. For low frequencies with vr << 1 this is inadequate since we cannot gain any
44
1. Coherent-St a t e Represent at ions
meaningful frequency information by looking at a small fraction of a wavelength. For high frequencies with v r >> 1, too many wavelengths are admitted. For such waves, a time-interval of duration r seems infinite, thus negating the sense of “locality” which the windowed Fourier transform was designed to achieve in the first place. This deficiency is remedied by the wavelet transform. The window h ( t ) , called the basic wavelet, is now scaled to accomodate waves of different frequencies. That is, for a # 0 let
The factor ]al-1/2 is included so that Ilha,,l12
=J
00
-00
dt
I ha,&) I 2
= llhl12.
(3)
The necessity of using negative as well as positive values of a will become clear as we go along. It will also turn out that h will need to satisfy a technical condition. Again, we think of both h and f as real but allow them to be complex. The wavelet transform is now defined by
Before proceeding any further, let us see how the wavelet transform localizes signals in the timefrequency plane. The localization in time is clear: If we assume that h ( t ) is concentrated near t = 0 (though it will no longer be convenient to assume that h has compact support), then f ( a , s) is a weighted average of f(t) around t = s (though the weight function need not be positive, and in general may even be complex). To analyze the frequency localization, we again
1.6. Wavelet ?f.ansforms
45
want to express f in terms of the Fourier transforms of h and f . This is possible because, like the windowed Fourier transform, the wavelet transform involves rigid time-translations of the window, resulting in a convolution-like expression. The ‘‘impulse response” is now (setting f ( t ) = b ( t ) ) ga(s) = lal-1/2h(-s/a),
(5)
and we have
with
J-00
Later we will see that discrete tight frames can be obtained with certain choices of h ( t ) whose Fourier transforms have compact support in a frequency interval interval a 5 v 5 @. Such functions (or, rather, the operations of convolutiong with them) are called bandpass filters in communication theory, since the only frequency components in f(t) to survive are those in the “band” [a,@].Then the above expression shows that f(a,s) depends only on the frequency component of f ( t ) in the band a / a 5 v 5 @ / a (if a > 0) or @ / a5 v 5 a / a (if a < 0). Thus frequency localization is achieved by dilations rather than translations in frequency space, in contrast to the windowed Fourier transform. At least from the point of view of audio signals, this actually seems preferable since it appears to be frequency ratios, rat her than frequency differences, which carry meaning. For example, going up an octave is achieved by doubling the frequency. (However,
46
1. Coherent-State Representations
frequency differences do play a role in connection with beats and also in certain non-linear phenomena such as difference tones; see Roederer [1975].)Let us now try to make a continuous frame out of the vectors h,,,. This time the index set is
M = { ( a ,s) I a
# 0, s E lR} !a IR*x IR,
(8)
IR*denotes the group of non-zero real numbers under multiplication. M is the fine group of translations and dilations of the
where
+
real line, t’ = at s, and this fact will be recognized as being very important in chapter 3. But for the present we use a more pedestrian approach to obtain the central results. This will make the power and elegance of the group-theoretic approach to be introduced later stand out and be appreciated all the more. At this point we only make the safe assumption that the measure dp on M is invariant under time translations, i.e. that dp(a, s) = p(a)da ds
where p ( a ) is an as yet undetermined density on Plancherel’s theorem, r
(9)
lR*. Then, using
1.6. Wavelet Tkansforms
47
The frame condition is therefore A L. H ( v )
5 B for some positive
where
constants A and B. To see its implications, we analyze the cases
v > 0 and v < 0 separately. If v
If v < 0, let
> 0, let t = av. Then
= -uv. Then
giving the same expression. Therefore the frame condition requires that P ( t / V > = O ( ( t / y r 2 > as
unless
t/v
4
fw,
(14)
vanishes in a neighborhood of the origin. Note that the
&(I/)
above expression for H ( v ) shows that if both p(a) and &v) vanish for negative arguments then H ( v )
0 and no frame exists. Hence to support general complex-valued windows (such as bandpass filters for a positive-frequency band), it is necessary to include negative as well as positive scale factors a. The general case, therefore, is that we get a (generalized) frame whenever p(u) and h ( t ) are chosen such that 0 < A 5 H ( v ) 5 B is
1. Coherent-St ate Represent ations
48
satisfied. The “metric operator” G and its inverse are given in terms of Fourier transforms by
Gf = (Hf”)” and
G-’f = ( H - l f ) ” .
(15)
Since G is no longer a multiplication operator in the time domain (as it was in the case of the discrete frame we constructed from the windowed Fourier transforms), the action of G-’ is more complicated. It is preferable, therefore, to specialize to tight frames. This requires that H ( v ) be constant, so the asymptotic conditions on p reduce to the requirement that p be piecewise continuous: c + / a 2 , if a c - / a 2 , if a
>o
where c+ and c- are non-negative constants (not both zero). Then for u
> 0,
and for u
< 0,
k(-t)
Thus H ( v ) = A = B requires either that I k ( ( ) I = I I (which holds if h ( t ) is real) or that c+ = c - . Since we want to accomodate complex wavelets, we assume the latter condition. Then we have
We have therefore arrived at the measure dp(a, s ) = c+da d s / a 2
(20)
1.6. Wavelet lhnsforms
49
for tight frames, which coincides with the measure suggested by group theory (see chapter 3). In addition, we have found that the basic wavelet h must have the property that
In that case, h ( t ) is said to be admissible. This condition is also a special case of a grouptheoretic result, namely that we are dealing with a square-integrable representation of the appropriate group (in this case, the f i n e group IR* x IR). To summarize, we have constructed a continuous tight frame of wavelets ha,, provided the basic wavelet is admissible. The corresponding resolution of unity is
The associated reproducing-kernel Hilbert space !Rzh: is the space of
=
, = ( ha,, I f ) f(a, s) depending on the scale functions ( K f ) ( a s) parameter a as well as the time coordinate s. As a + 0, ha,, becomes peaked around t = s and
where c = Sh(u)du. The transformed signal f" is a smoothed-out version of f and a serves as a resolution parameter. Ultimately, all computations involve a finite number of operations, hence as a first step it would be helpful to construct a discrete subframe of our continuous frame. Toward this end, choose a fundamental scale parameter a > 1 and a fundamental time shift b > 0. We will consider the discrete subset of dilations and translations
D
= {(a",na"b) Im,n E
Z}c IR* x IR.
(24)
50
1. Coherent-State Representations
Note that since am > 0 for all m, only positive dilations are included in D,contrary to the lesson we have learned above. This will be remedied later by considering h ( t ) along with h ( t ) . Also, D is not a subgroup of R*x R,as can be easily checked. The wavelets parametrized by D are
hmn =
h
(t
)
- numb am = u-m/2h(a-mt - nb).
(25)
To see that this is exactly what is desired, suppose k ( u ) is concentrated on an interval around u = F (i.e., k is a band-pass filter). Then jLmn is concentrated around u = F l u m . For given integer m, the “samples” fmns((hmnIf),
~
E
Z
(26)
therefore represent (in discrete “time” n ) the behavior of that part of the signal f ( t ) with frequencies near Flu”. If m >> 1, f m n will vary slowly with n, and if m << -1, it will vary rapidly with n (if f ( t ) has frequency components with u Flu”). Now the time-samples are separated by the interval Amt = umb, so the smpling rate N
Rm = l/amb
(27)
is automatically adjusted to the frequency range u Flum of the output signal { f m n In E Z}:The high-frequency components get sampled proportionately more often. This is an example of a topic in mathematics which has recently attracted intense activity under the banner of multiscale analysis (Mallat [1987], Meyer [1986]) and which is in fact closely related to the subject of wavelet transforms. Returning to the constructi’on of a discrete frame, Parseval’s formula gives N
1.6. Wavelet Tkansforms
J-00
where k,,(v)
E
Amn(v)is the Fourier transform of h,,(t),
krnn(v)= a m / 2exp(27rinambu)R(umu>.
(29)
We now assume that k(v) = k(v) vanishes outside the interval
where F
> 0 is some fixed frequency to be determined below. The
width of the “band” I0 is Wo = ( a - u-’)F. Therefore the function k ( a ” v ) f ” ( v )is supported on the compact interval
I , = [F/am,F/am-l]
(31)
of width W, = WO/am,and we can expand it in a Fourier series in that interval:
n
where
cmn = Wi’
I
00
du exp(-27rinv/Wm) k(amu)f ( u ) .
(33)
Comparing this with 00
fmn
dv aml2 exp(-27rivnbam) k ( u m u ) f ( u )
= J-00
suggests that we choose F so that Wm = l/amb, which gives
(34)
1. Coherent-State Representations
52
U
F= (a2
(35)
- I) b
and
Th Fourier series representation above only holds in th interval I m , since the left-hand side vanishes outside this interval while the right-hand side is periodic. To get equality for all frequencies and reconstruct f(t), multiply both sides by k( am v) and sum over m:
m,n
(37) To have a frame we would need the series on the left-hand side to converge to a function x+( v) with 0
< A 5 x+(Y)
C I k ( u m v )l 2 5 B
(38)
m
for some constants A and B. But this is a priori impossible, since
R(V) is supported on an interval of positive frequencies and am > 0, 5 0. However, we can choose h ( t ) ,a and b such that x + ( v )satisfies the frame condition for v > 0. Negative freso x + ( v ) vanishes for v
quencies will be taken care of by starting with the complex-conjugate of the original wavelet. We adopt the notation
h+(t) = h(t), h-(t)
G
h(t).
Then the Fourier transforms k* of hf are related by
(39)
1.6. Wavelet Zlansforms
53
k-(v) = k+(-v),
(40)
hence k - ( v ) is supported on -I0 = [-Fa, ment to the above gives
-Flu].A similar argu-
and
x-(Y)
C I k-(amv) I
= X+(-V).
(42)
m
Hence if
x+
satisfies the frame condition for u > 0, then
for all u # 0. Since (0) has zero measure in frequency space, the frame condition is satisfied by the joint set of vectors
7igb= {kf,,
k,;
Im,n E 22).
(44)
The metric operator
is given by
(Gf>(t>=
(x' + x-)!)
and satisfies the frame condition 0
"(t)
(46)
< A1 5 G 5 B. Since G is no
longer a multiplication operator in the time domain (as was the case
54
1. Coherent-State Representations
with the discrete frame connected to the windowed Fourier transform), the recovery of signals would be greatly simplified if the frame was tight. The following construction is borrowed from Daubechies [1988a]. Let F = a/(a2 - l ) b as above and let k be any non-negative integer or k = 00. Choose a real-valued function 77 E C'(lR,) (i.e., 77 is k times continuously differentiable) such that 0 for x 5 0 n / 2 for x 2 1.
(47)
(Such functions are easily constructed; they are used in differential geometry, for example, to make partitions of unity; see Warner [1971].) Define h ( t ) through its Fourier transform k+(v)by
Note that k + ( v ) is C ' since the derivatives of q(z) up to order k all vanish at IC = 0 and x = n/2. This means that the wavelets in the frame we are about to construct are all Ck.Also, k+ vanishes outside the interval 10 = [ F / a ,Fa]. The width of its support is Wo = ( a - a - l ) F , and for each frequency Y > 0 there is a unique integer M such that F / a < u M v 5 F , hence also F < a M + l v 5 aF. Therefore, for v > 0,
Thus 0 for Y 5 0 1 forv > 0,
55
1.6. Wavelet Tkansforms
i.e., x + ( Y ) is the indicator function for the set of positive numbers. It follows that x - ( v ) is the indicator function for the negative reals, and
-
This choice of k+ and k- = k+ gives us a tight frame,
This frame is not a basis; if it were, it would have to be an orthonormal basis since it is a normal frame, hence the reproducing kernel would have to be diagonal. But
K ( Ern, , n; E', m', n') = ( kk,,I k$,, does not vanish for
E'
)
(53)
= E , n' = n and rn' = rn f 1, due to the overlap
of wavelets with adjacent scales. However, it is possible to construct orthonormal bases of wavelets which, in addition, have some other surprising and remarkable properties. For example, such bases have been found (Meyer [1985], Lemarie and Meyer [1986])whose Fourier transforms, like those above, are C" with compact support and which are, simultaneously, unconditional bases for all the spaces Lp (IR)with
1
as well as all the Sobolev spaces and some other popular spaces to boot. Similar bases were constructed in connection with quantum field theory (Battle [1987]) which are only C kfor fi00
nite k but, in return, are better localized in the time domain (they have exponential decay). The concept of multiscale analysis (Mallat [1987], Meyer [19SS])provided a general method for the construction and study of orthonormal bases of wavelets. This was then used by
56
1. Coherent -State Represent at ions
Daubechies [1988b] to construct orthonormal bases of wavelets having compact support and arbitrarily high regularity. The mere existence of. such bases has surprised analysts and made wavelets a hot new topic in current mathematical research. They are also finding important applications in a variety of areas such as signal analysis, computer science and quantum field theory. They are the subject of the next chapter, where a new, algebraic, method is developed for their study.
57
Chapter 2
WAVELET ALGEBRAS AND COMPLEX STRUCTURES
2.1. Introduction As stated at the end of chapter 1, orthonormal bases of wavelets are finding important applications in mathematics, physics, signal analysis and other areas. In this chapter we present a new treatment of such systems, based on an algebraic approach. This approach was actually discovered while the author was doing work initially unrelated to this book, in preparation for a conference on wavelets (Kaiser [199Oa]). But it turned out that orthonormal bases of wavelets are closely associated with the concept of a complex structure, i.e. a linear map J satisfying J 2 = -I where I is the identity. J unifies certain fundamental operators H and G associated with the wavelets, known as the low-pass and high-pass filters, in much the same way as the unit imaginary i combines the position- and momentum operators in the coherent-state construction. This provides us with yet another example of the central theme of this book, namely that competing (or complementary) quantities can often be reconciled through complexification. For this reason, I decided to include these new results in the book. Furthermore, there may be a direct connection between wavelets and relativistic quantum mechanics (aside from their application to quantum field theory, which is less direct) based on the fact
58
2. Wavelet Algebras and Complex Structures
that relativistic windows (which are like those associated with the windowed Fourier transform but modified so as to be covariant under the Poincark group) behave like wavelets because they undergo a Lorentz contraction in the direction of motion, which is in fact a dilation. This idea is touched on in chapters 4 and 5.
The theory of orthonormal wavelet bases is closely related to multiscale analysis (Mallet [1987], Meyer [1986]),in which functions are decomposed (or filtered) recursively into smoother and smoother functions (having lower and lower frequency spectra) and the remaining high-frequency parts at each stage are stored away. In the limit, the smooth part vanishes (for L2functions) and the original function can be expressed as the sum of the details drawn off at the various frequency bands. Each recursion involves the application of a “low-frequency filter” H and a “high-frequency filter” G. The entire structure is based on a function 4, called an averaging function, which satisfies a so-called dilation equation. Roughly, 4 may be thought of as representing the shape of a single pixel whose translates and dilates are used to “sample” functions at various locations and scales. Although the operators H and G have very different interpretations, they exhibit a remarkable symmetry whose origin has not been entirely obvious. What is especially striking is that there exists a function $ whose (discrete) translates and dilates span all the highfrequency subspaces. That is, $ is a “basic wavelet” (also called a mother wavelet) in the same sense as that used for the function h ( t ) in section 1.6, except that now all the translates and dilates of 1c, (corresponding to the functions hmn) form an orthonormal basis. 1c, is related to G in a way formally similar to the way
H.
4 is related
to
2.2. Operational Calculus
59
The complex structures developed in this chapter are orthogonal operators* which relate G to H and t,b to 4, thus explaining the symmetry between these entities in terms of a “complex rotation.” The plan of the chapter is as follows. In section 2 we develop an operational calculus for wavelets, which conceptually simplifies the formalism and helps in the search for symmetries. This is used in section 3 to construct the complex structures. These structures, in turn, suggest a new decomposition- and reconstruction algorithm for wavelets, which is considered in section 4. Section 5 consists of an appendix in which we summarize the operational calculus and state how our notation is related to the standard one.
2.2. Operational Calculus
In wavelet analysis (see Daubechies [19SSb], Mallat [1987],Meyer [1986],Strang [19S9] and the references therein), one deals with the representation of a function (“signal”) at different scales. One begins with a single real-valued function r#J of one real variable which we take, for simplicity, to be continuous with compact support. One assumes that for some T > 0, the translates &(t) r#J(t- nT), n E 24, form an orthonormal set in L2(lR) (such functions can be easily constructed). The closure of the span of the vectors dn in L2(R)forms a subspace V which can be identified with 1 2 ( Z ) since , for a real sequence u = {un} we have
=
*
We assume that the function spaces are real to begin with; if they are complex instead, then the complex structures are unitary,
60
2. Wavelet Algebras and Complex Structures
n
n
We introduce the shift operator
(Sf)(t>= f(t - TI,
(2)
which leaves V invariant and is an orthogonal operator on L2(R)(we shall be dealing with r e d spaces, unless otherwise stated). A general element of V can be written uniquely as
n
where u(eilT) is the square-integrable function on the unit circle (It15 T / T )having {Un} as its Fourier coefficients and u(S) is defined as an operator on "nice" functions (e.g., Schwartz test functions) f(t) through the Fourier transform, i.e.
For the purpose of developing our operational calculus, we shall consider operators u( s)which are polynomials in S and s-'. These form an abelian algebra P of operators on V . Moreover, it will suffice to restrict our attention to the dense subspace of finite combinations in V , i.e. to P#,since our goal here is to produce an L2 theory and this can be achieved by developing the algebraic (finite) theory and then completing in the L2 norm. Note that the independence of the vectors 4 n means that u(S)# = 0 implies u(S) = 0. Our results could actually be extended to operators u(S) with {un} E ['(Z) c 12(ZZ), which also form an algebra since the product u( S-')w(S) corresponds to the convolution of the sequences {tin} and {wn}. We resist the tempt ation.
2.2. Operational Calculus
61
Let us stop for a moment to discuss the “signal-processing” interpretation of u(S)$, since that is one of the motivations behind wavelet theory. It is natural to think of u(S)$ as an approximation to a function (“signal”) f(t) obtained by sampling f only at t , = nT, n E Z. Let fo denote the band-limited function obtained > T / T . That is, fo from f by cutting off all frequencies ( with coincides with for 5 T / T but vanishes outside this interval. The value of fo at t , is then
3
which is just the Fourier coefficient of the periodic function
obtained from fo(E) by identifying ( domain,
Fo(t) =
c
+ 2n/T with f.
Tfo(nT)s(t - nT).
In the time
(7)
n
This has the same form as u(S)$, if we set t i n = Tfo(nT)and $ ( t ) = 6 ( t ) where S is the Dirac distribution. Hence the usual sampling theory may be regarded as the singular case 4 = 6, and then u(S)$ characterizes the band-limited approximation fo of f . For squareintegrable +, the samples un are no longer the values at the sharp times t n but are smeared over $n, since = { $n, u ( S ) $ ) . In fact, acts as a filter, i.e. as a convolution operator, since (u(S)$)^(Q= u(eEt*)$(t). Roughly speaking, we may think of q5 as giving the shape of a pixel.
+
62
2. Wavelet Algebras and Complex Structures
Next, a scaled family of spaces V,,a E Z, is constructed from V as follows. The dilation operator D , defined by
( D f ) ( t )= 2-'/'f(t/2),
L2(lR). It stretches
is orthogonal on
(8)
a function by a factor of 2
without altering its norm and is related to S by the commutation rules
D S = S'D,
D-'S2 = SD-1.
(9)
Hence D "squares" S while D-l takes its "square root." A repeated application of the above gives
D"S = S'~D,,
CY
E
z.
(10)
Define the spaces
V, = D"V,
(11)
which are closed in L2(IR)(Vo G V ) . An orthonormal basis for V, is given by
4E(t) G D a S n 4 ( t ) = 2-*/'4 (2-at - nT) ,
(12)
and V, can also be identified with t'(Z). The motivation is that Va will consist of functions containing detail only up to the scale of 2,, which correspond to sequences { u z } in t2((;z)representing samples at t , = 2"nT, n E Z. For this to work, we must have Va+l c V, for all a. A necessary condition for this is that 4 must satisfy a functional equation (taking cy = - 1) of the form
n
n
63
2.2. Operational Calculus
for some (unique) set of coefficients h,. Since we assume that 4 has compact support, it follows that all but a finite number of the coefficients hn vanish, so h ( S ) is a polynomial in S and S-l, i.e. h ( S ) E P. This operator averages, while D-' compresses. Hence C$ is a fixed point of this dual action of spreading and compression. The equation Dq5 = h(S)4,called a dilation equation, states that the dilated pixel Dq5 is a linear combination of undilated pixels dn. The coefficients hn uniquely determine 4, up to a sign. For if we iterate D-'h(S) = h(S'/2)D-', we obtain
n N
C$ = [o-'h(S)]
4=
h (S2-=) D - N 4 .
(14)
Ct=l
Since the Fourier transform of D-Ng5 satisfies 2*12 (D-Nq5)n(t) = &2-Nt)
--$
&O)
as N + 00,
(15)
we obtain formally
where b ( t ) is the Dirac distribution. The normalization is determined up to a sign by
11cj11
= 1. See Daubechies [1988b] for a discussion of
the convergence and the regularity of
4.
Note that the singular case 4 = 6, discussed above, satisfies the diIation equation with h(S) = f i r , where I denotes the identity operator. A more regular solution, related to the Ham basis, is the case where
is the indicator function for the interval [0, 1)and h( 5') =
( I + S)/&
In general, integration of Dq5 = h(S)$ with respect to t
gives
64
2, Wavelet Algebras and Complex Structures
Ch,=JZ,
or
h ( ~=) &I.
(17)
n
Also, the regularity of 4 is determined by the order N of the zero which h ( S ) has at S = -I, i.e.
h(S)= ( I
+ S)%(S),
with k(S) regular at S = -I. For example, N = 0 for N = 1for the Haar system. See Daubechies [1988b].
4 = S, and
The next step is to introduce a "multiscde analysis" based on the sequence of spaces V,. We shall do this in a basis-independent fashion. Since shifts and dilations are related by DS = S 2 D ,we have
This defines a map H:: V,+1 + V,, given by
Since the two sides of this equation are actually identical as functions or elements of L2(IR), H: is simply the inclusion map which establishes VO+l c V,. This shows that the relation D4 = h(S)+ is not only necessary but also sufficient for V,+l c V,. Although a vector in Va+l is identical with its image under HE as an element of L2(R),it is useful to distinguish between them since this permits us to use operator theory to define other useful maps, such as the adjoint Ha:V, + V,+l of H:. Since the norm on V, is that of L2(R) and HE is an inclusion, it follows that H , HE = Ia+l, the identity on Va+l. In particular, Ha is onto; it is just the orthogonal projection
2.2. Operational Calculus
65
from Va to Va+l. H: is interpreted as an operator which interpolates a vector in V,+l, representing it as the vector in V, obtained by replacing the “pixel” 4 with the linear combination of compressed pixels D-lh(S)d. The adjoint H , is sometimes called a “low-pass filter” because it smooths out the signal and re-samples it at half the sampling rate, thus cutting the freqency range in half. However, it is not a filter in the traditional sense since it is not a convolution operator, as will be seen below. The kernel of H , is denoted by W,+1. It is the orthogonal complement of the image of H:, i.e. of V,+l, in V, :
Note that HC is “natural” with respect to the scale gradation, i.e.
Our “home space” will be V. All our operators will enjoy the above naturality with respect to scale. Because of this property, it will generally be sufficient to work in V . Define the operator H*: V 4 V by
We will refer to H* as the “home version” of H:. Home versions of operators will generally be denoted without subscripts. Note that while HC preserves the scale (it is an inclusion map!), H* involves a change in scale. It consists of a dilation (which spreads the sample points apart to a distance 257) followed by an interpolation (which
66
2. Wavelet Algebras and Complex Structures
restores the sampling interval to its original value 2’). Thus H* is a zoom-in operator! Its adjoint
H = D-lHo
(23)
consists of a “filtration” by HO (which cuts the density of sample points by a factor of 2 without changing the scale) followed by a compression (which restores it to its previous value). H is, therefore, a zoom-out operator. It is related to H , by
The operators H and H* are essentially identical with those used by Daubechies, except for the fact that hers act on the sequences { u n } rather than the functions u(S)$. They are especially useful when considering iterated decomposition- and reconstruction algorithms (section 3). To find the action of H,, it suffices to find the action of H . Note where u ( S 2 )is even in S. This will be that H*u(S)$= h(S)u(S2)q5, an important observation in what follows, hence we first study the decomposition of V into its even and odd subspaces. An arbitary polynomial u(S) in S,S-’
can be written uniquely
as the sum of its even and odd parts,
n
n
= u+(S2) + Su-(S2).
(25)
Define the operator E* (for even) on V by
E*S = S2E*,
E*$= 4.
(26)
2.2. Operational Calculus
67
Then
E*u(s)$ = u(s2)4 =
C
un42n.
(27)
n
Also define the operator O* (for odd) by O* = SE",so that
o*u(s)$ = su(s2)4 = C Un42n+l.
(28)
n
H* is related to E* by H* = h(S)E*. Hence to obtain H is suffices to find the adjoint E of E*. Lemma 2.1. Let v(S) E by E and 0. Then
P and denote the adjoints of E* and O*
Ov(S)O* = v+(S), 1 -1 Ov(S)E*= v-(S) = -D 2 Ev(S)O* = Sv-(S)
s-1
[v(S) - v(-S)] D
(note that (a) is a special case with v(S) = I ) , and
68
2. Wavelet Algebras and Complex Structures
(4 E*E
+ 0*0= I .
Proof. For u(S),v( S) E P, we have
where the last equality follows from the invariance of the inner product under S H S2, i.e.
Hence EE* = I , so 00*= ES-'SE* = I .
EO* = OE* = 0 follows
from the orthogonality of even and odd functions of S (applied to
4).
This proves (a). To show (b), note that due to the orthogonality of even and odd functions,
where we have used (a). This proves the first equation in (b). The second follows from 0 = ES-I and S-'v(S) = v-(S2)+S-'v+(S2). To prove (c), note that u(S2)E*= E*u(S)and Su(S2)E*= O*u(S), hence
2.2. Operational Calculus
69
+ = EE*v+(S)+ EO*v-(S) = v+(S),
Ev(S)E* = E(v+(S2) Sv-(S2))E* Ov(S)O* = ES-lv(s)SE* = v+(S),
Ow(S)E*= O(v+(S2) + Sv-(S2))E*
(36)
+ Ev(S)O* = E(V+(S2)+ S v - ( P ) ) S E * = EO*v+(S)+ EE*Sv-(S) = Sv-(S). = OE*v+(S) EE*v-(S) = v-(S),
Lastly, (d) follows from
+
+ = v+(S2)$ + Sw-(S2)$b= v(S)$.I
E*Ev(S)$ O*Ov(S)$ = E*v+(S)$ O*v-(S)$
(37)
Remark. The algebraic structure above is characteristic of orthogonal decompositions and will be met again in our discussion of low- and high-frequency filters. E*E and 0'0 are the projection operators to the subspaces of even and odd functions of S (applied to $),
V" = {v(S2)$ I v(S) E P},
V" =- {Sv(S2)bI v(S) E P}, (38)
and
v = V" @ V". This decomposition will play an important role in the sequel.
(39)
70
2. Wavelet Algebras and Complex Structures
Proposition 2.2. given by
The maps H : V
V a n d H,: V,
V,+l are
H u ( S ) b = Eh(S-l)u(S)$
+
= [h+(S--l)u+(S) h-(S-l)u-(S)] 4,
(40)
H,D"u(S)4 = D"+'E h(S-l)u(S)d
+
= D*+l [h+(s-l)u+(S) h - ( S - l ) u - ( s ) ]
4.
Proof. Since H * = h ( S ) E * ,it follows that H = E h ( S - l ) and H,D" = D"+lH = D"+'Eh(,Y1). I
2.3. Complex Structure
Up to this point, it could be argued, nothing extraordinary has happened. We have a filter which, when applied repeatedly, gives rise to a nested sequence of subspaces V,. However, the next step is quite surprising and underlies much of the interest wavelets have generated. It is desirable to record the information lost at each stage of filtering, i.e., that part of the signal residing in the orthogonal complement W,+1 of V,+l in V,. The orthogonal decomposition V, = V,+1 @ W,+l is described by filters H , and G,, where H , is as above and G, extracts high-frequency information. For this reason, H, and G, obey a set of algebraic relations similar to those satisfied by E and 0 above. What is quite remarkable is that there exists a vector 1c, in V.1 which is related to the spaces W, and the maps G, in a way almost totally symmetric to the way 4 is related to V, and H,. This is not merely a consequence of the orthogonal decomposition but
2.3. Complex Structure
71
is somehow related to the fact that Va+l is “half” of Va,due to the doubling of the sampling interval upon dilation, as expressed by the commutation relation DS = S 2 D . However, the precise reason for this symmetry has not been entirely clear. The usual constructions are somewhat involved and do not appear to shed much light on this question. It was this puzzle which motivated the present work. As an answer, we propose the following new construction. Begin by defining a complex structure on V , i.e. a map J : V V such that J 2 = -I. (To illustrate this concept, consider the complex plane as the real space R2. Then multiplication by the unit imaginary i is represented by a real 2 x 2 matrix whose square is -I.) J is defined by giving its commutation rule with respect to the shift and its action on 4:
where e(S) is an as yet undetermined function. It follows that for
4s)E p , Ju(S)$ = E(S)U(--S-~)$.
(2)
We further require that J preserve the inner product, i.e. that J* J = I . Combined with J 2 = - I , this gives J* = - J . That is, J will behave like multiplication by i also with respect to the inner product, giving it an interpretation as a Hermitian inner product. In order to study J, we first define two simpler operators C and M as follows.
cs = s-lc, M S = -SM,
c4 = 4
M $ = $. Note that C M = M C and that C* = C and M* = M , since
(3)
2. Wavelet Algebras and Complex Structures
72
where u(S)* = u(S-') was used in the second line and the third line follows from the invariance of the inner product under S H -S. Since
C and M are also involutions, i.e.
it follows that they are orthogonal operators. Hence they represent symmetries, which makes them import ant in themselves, especially in the abstract context where one begins with an algebra and constructs a representation (see the remark at the end of section 3). In fact, the orthogonal decomposition V = V" @ V" is nothing but the spectral decomposition associated with M , since V" and V" are the eigenspaces of M with eigenvalues 1 and -1, respectively.
C has
a
simple interpretation as a conjugation operator, since for u(S) E P,
Cu(S)C = u ( S - l ) = u(S)*.
(6)
In terms of C and M ,
J
= E(S)CM.
(7)
2.3. Complex Structure
73
Proposition 2.3. The conditions J* = -J and J2 = -I hold if and only if c(S) satisfies e(-S) = --E(S),
E(S-l)€(S) = 1.
(8)
Proof. We have
J* = MCe(S-l) = M € ( S ) C= €(-S)MC = E(-S)CM, hence J* = -J if and only if case. Then
E(
(9)
-S) = -e(S). Assume this to be the
J2 = c(S)CMe(S)CM= E(S)E(-S-') = --E(S)E(S-'), hence J 2 = -I if and only if E(S-')E(S)= I .
(10)
I
Remarks. 1. J is determined only up to the orthogonal mapping E(S). This corresponds to a similar freedom in the standard approach to wavelet theory, where a factor eix(c) in Fourier space relates the functions H ( [ ) and G(() associated with the operators H and G (Daubechies [1988b], p. 943, where T = 1). The relation between e(S) and A([) is given in the appendix. 2. The simplest examples of a complex structure are given by choosing E(S) = S 2 P + l , P E Z .
(11)
More interesting examples can be obtained by enlarging P to a topological algebra, for example allowing u(S) with { u n } E
P(Z).
74
2. Wavelet Algebras and Complex Structures ,
3. The above proof used the symmetry of the inn-roduct. Later we shall complexify our spaces and the inner product becomes Hermitian. However, this proof easily extends to the complex case (when transposing, also take the complex conjugate). C then becomes C-antilinear and is interpreted as Hermitian conjugation. At an arbitrary scale a , define maps J,: V,
J,D"
4
V, by naturality, i.e.
= D" J,
(12)
which implies that J: = -I, and J: = - J,. J , is related to S by
showing that
S 2 a J , = -J,S-'*. In particular, note that S'I2 J-1 = -J-1S-'/2, hence
We are now in a position to construct the basic wavelet $, the spaces W , and an appropriate set of high-frequency filters in a way which will make the symmetry with 4, V, and H , quite clear. Consider the restriction of J, to the subspace V,+l of V,, i.e. the map I<:: V,+1 4 V, defined by
ICE = J,HE.
2.3. Complex Structure
75
I<: is natural with respect to the scale gradation, and its home version will, as usual, be denoted by K* G K,*D = J H * . It will turn out that its adjoint K is essentially equivalent to the usual filter G (to be introduced below) .but is more natural from the point of view of the complex structure. Define the vector $ E V-1 by
where the function g(S) = E ( S ) h ( - K 1 )
will play a similar role for the high-frequency components as does h ( S ) for the low-frequency components. Namely, g ( S ) is a "differencing operator," just as h ( S ) is an averaging operator. [For example, in the simplest case h ( S ) = (I S ) / 4 and e(S) = -S, we obtain g(S) = (I- S)/& This gives rise to the Haar system.] For w ( S ) E P,we have
+
I"*w(S)$ = J H * w ( S ) $ = Jh(S)w(S2>+ = g(s)w(s-2)$ = g(S)E*Cw(S)$,
(19)
hence
Proposition 2.4. The adjoints of K* and I{: axe given by
Kv(S)$ = EBg( s-')v( S)$ = Ev(S--l)D$, K,Dav(S)$ = D"+1ECg(S-')v(S)$ = D"+'Ev(S-')D$.
(21)
76
2. Wavelet Algebras and Complex Structures
Proof. Since IC = ECg(S-') and K,D"
= D"+'K, we have
K V ( S ) q 5 = ECg(S-1)V(S)4 = Eg(S)v(S-l)$h= Ev(S-')D$
(22)
and
K,} Proposition 2.5. The pairs of operators { H ,IC} and {Ha, satisfy H H * = Eh(S-')h(S)E* = I ,
K K * = Eg(S-')g(S)E* = I) H K * = Eh(S-')g(S)E*c = 0, I
H,H:
= KoK: = la+',
H,K:
= KaH: = 0.
(25)
Proof. (Note that HaH: = la+' has already been shown; it is included here for completeness, since it belongs with the other identities.) The first equation follows from H* = H,*D and H,*Ho = I. The second follows from the first and I<* -= JH*,since J*J = I . The last two equations follow from lemma 1, since h(S-')g(S) and g(S-l)h(S) are odd functions, hence their even parts vanish. This proves the identities for H and K . The other identities follow by I naturality.
2.3. Complex Structure
77
Proposition 2.6. The pairs { H ,K ) and { H a ,K,) give orthogonal decompositions of V and Va. We have
(4
HV=KV=V H*V = Vi, K*V = W1 H*H+K*K = I ,
(b)
(26)
H,V, = K,V, = Va+i H:V.+i = Va+i, K:Va+l = w Y + l H:H,+K:K, =I,.
(27)
Proof. We have
HV = D-’HoV = D-’Vl = V ,
(28)
and since K = - H J , it follows that KV = HJV = HV = V . Also
K K * = I and H K * = 0 (proposition 2.4) imply that K* is injective and its range is orthogonal to that of H*, i.e. to V1. Hence K*V C Wl, To show that K*V = W1, let u(S) E P. We need to find v(S),w(S) E P such that
u(S)$ = H*v(S)+
+ K*w(S)+= h(S)v(S2)$+ g(S)w(S-2)+
or, equivalently, dropping
u(S)
(30)
4,
=Fs)v+) + g(s)w(s-2).
(31)
Use lemma 1 to decompose this equation into its even and odd parts:
2. Wavelet Algebras and Complex Structures
78
+
u+(S) = Eu(S)E*= E [h(S)v(S2) g ( s ) ~ ( S - ~ E*) ]
+
= h+(S)v(S) g+(S)w(S-'),
+
u - ( S ) = Ou(S)E* = 0 [h(S)W(S2) g(S)w(S-"] E* = h- ( S) v( S)
(32)
+ g-(S)w(S-l),
which can be written in matrix form as
w(s-1 ) But proposition 2.5 is precisely the statement that the matrix U ( S ) is unitary i.e., V ( S ) * V ( S= ) I. Multiplying by U ( S ) * ,we obtain the unique solution
[w(v(s)s-1) ] [ h+(S-l) =
which shows that V
g+(S-')
L(S-1) S - ( W
VI @ W1 = H*V @ K*V. Applying H and I<
to eq. (30) gives
+
which proves that H * H I P K = I as claimed. For the range of I{: we have I
We now construct the usual "high-pass filters" G,. that elements in W1 E IPV can be written in the form
First note
2.3. Complex Structure
79
K * w ( S ) $ = g(S)w(S-2)$= w ( S - ~ ) D $= Dw(S-l)$.
(35)
It follows that 1c, is a "basic wavelet," i.e. that
Wa = DaP$,
(36)
and the vectors
$,"
* a+l DaSn$ = Da-lK*S-n$ = D -1 Ka$-n
(37)
form an orthonormal basis for W,. Since I<$: V1 + V is injective and its range is factored uniquely as
where R;: V1 + WI is an isomorphism and inclusion map. From
W1, it
G,*: W1
K',*Dw(S)$= Dw(S-')$ = g(S)w(S-2)$ we read off
RTDw(S)$ = Dw(S-l)$,
G;Dw(S-')$ = g(S)w(S-')$. For the home versions, we have
-+
can be
V is the
(39)
so
2. Wavelet Algebras and Complex Structures
is an isomorphism and G*: W + V , though injective, is not an inclusion map. (This is the price for working with the home versions, which do not preserve the scale.) Note that R*R = Iw and RR* = Iv, hence K = RG implies that G = R*K and
GG* = R*KI<*R = I w ,
GH* = R*KH* = 0 ,
(43)
G*G = I{*RR*K = K*K,
+
hence H*H G*G = I . Therefore H and G give an orthogonal decomposition of V which is equivalent to that given by H and .'A Defining RZ: V, + W , and G;t: W,+l --+ V, by RZD" = D"R* and GZD"+l = P G * ,we get a graded family of filters G, related to I<, by
K:
= G:R:+i.
(44)
The operators G , are, in fact, the usual high-frequency filters. For the latter are defined analogously to H,, namely by substituting g(S)# for the dilated wavelet D$:
G*a D&l u( S)$J= GzDau(S2)D$ = Dag(S)u(S2)# = D*G*u(S)$J.
(45)
The orthogonal decomposition of V given by H and G induces an orthogonal decomposition of V, by H , and G, which is, in fact, the usual wavelet decomposition and is equivalent to the one given by H , and K , in proposition 2.6.
2.3. Complex Structure
81
Remark. We have stated that K , is more natural than G, from the point of view of the symmetry associated with J,. The reason is that both H , and K , map V, to V,+1, whereas G, maps Va to K+l.This symmetry is reflected by the simple relation K: = J,H:, whereas the relation
is somewhat more complicated. A more concrete divident of this symmetry will appear in the next section, where the complex combinations HZ f iK: will be considered. The corresponding combinations
H: f iGTy do not make sense, as the two operators have different domains. We now prove an identity which will be useful later.
Proposition 2.7.
K * H - H * K = J,
I{: H , - H:K, = J,.
(47)
+
Proof. For u(S)4 = H * v ( S ) ~ I<*w(S)d E V , we have
( K * H - H * K ) u ( S ) 4= I C * V ( S )-~H * w ( S ) 4 = JH*v(S)$
+ JK*w(S)$= Ju(S)4.
(48)
The identity at arbitrary scale follows from naturality:
(I<:Ha - H:Iia)Da = D"(K*H - H * K ) = D"J = J,D".
I
(49)
82
2. Wavelet Algebras and Complex Structures
It is natural to wonder whether the complex structures J , extend to define a "global" complex structure on L2(R).We now show that this is not the case.
Proposition 2.8. The complex structure J , on V, is not an extension of J,+1. Proof. The statement that J , is an extension of J,+l J,+1 is the restriction of J , to V,+l, i.e.,
Kz
G
J,H:
means that
= H z J,+l.
If this were true, then left-multiplication by H , would imply 0 = H,K:
= H,H:J,+1
which contradicts J&l = -I,+I.
= Ja+l,
I
2.4. Complex Decomposition and Reconstruction The decomposition/reconstruction algorithm of the last section can be iterated, and when repeated indefinitely gives a unique representation of any function f E
L2(IR)as an L2- convergent infinite orthog-
onal sum of "detail" functions at finer and finer scales. We develop the algorithm formally in the home version, which will be seen to be much more convenient. For a rigorous treatment, see Daubechies [1988b]. Given any D"v"(S)c$ E V,, write
2.4. Complex Decomposition and Reconstruction
for brevity. Since
I = Ii*K + H*H = K*K + H*(I<*I< + H * H ) H = N
p=1
we have N
=
C(H*)P-~K*KH~~--IV* + ( H * ) ~ H ~ V ~
p=1
N
s= 1 where
represents an N-fold smoothed version of vQ, while The vector wcr+P represents the detail filtered out at /?-th iteration. The terms in the above sum are orthogonal, since for y
* ((H
p-1
> /? we have
.-* &p (H*)7-11{*wU+7) I1 w > - ( wa+p, I { ( H * ) ~ - Q { * ~ " + P
) = 0.
(5)
To see what this expansion means in terms of the scale-preserving filters, apply
D o and use naturality (see appendix): N
2. Wavelet Algebras and Complex Structures
84
This formula shows the advantage of the home versions of the filters, which "zoom" in and out to get the detail at any desirable scale and can therefore be used repeatedly without changing operators. It can be shown that w " + ~ t 0 in
L*(IR)as N
+
00,
which
gives the orthogonal decomposition 00
p=1
Since
any L2(IR)function can be approximated as accurately as desired in the form of eq. (7). One need only choose a "cut-off" scale a,which means that all detail finer than 2" will be ignored. Moreover, since
the above gives the orthogonal decomposition
L2(IR) = $W". aEZ
Let us now look at the reconstruction and decomposition formulas in light of the complex structure. A single iteration gives
where
85
2.4. Complex Decomposition and Reconstruction
Since J is like multiplication by f i , let us define the complex conjugate pairs of vectors
in the denominators is for later convenience, where the belong to the complexification of V , i.e. to
V" =
v @ iv = (I: €3 v.
Ca
and
(14)
We endow V" with the Hermitian inner product obtained from V by extending Glinearly in the second factor and antilinearly in the first factor (this is the convention used in the physics literature). Note that under this inner product, iV is not orthogonal to V ,hence the direct sum V @ iV is not an orthogonal decomposition. We now extend all operators on V to V c by &linearity, denoting the extensions by the same symbols. Substituting p + 1
=
+ ("+l
Jz
7
W"+l =
p + 1
-p + 1 Jzi
into the reconstruction formula, we obtain
where the operators Z*,Z*: V" + V" are defined by
Therefore
(15)
86
2. Wavelet Algebras and Complex Structures
The operators
pf =
IFiJ 2
(19)
are the orthogonal projections in V" to the eigenspaces of J with eigenvalues f i , since
J P =~k i p f ,
( P f ) *= p f = ( P f ) ' . We have the orthogonal decomposition
V" = v+(33 v-,
V*
= P*V.
(21)
The above shows that the operator Z : V" + V " and its adjoint satisfy Z = JZHP',
Z* = JZP+H*.
(22)
Since H* is injective, it follows that the range of Z* is V + and the kernel of Z is V - . Similarly,
Z
= JZHP-,
Z* = A P - H * ,
so the range of Z* is V - and the kernel of
2 is V+.
Proposition 2.9. The operators Z and 2 satisfy
zz* = I , zz* = 0, z*z = P+, Proof. By proposition 2.5, we have
ZZ*
=I,
2z* = 0, Z*Z = p-*
(23)
87
2.4. Complex Decomposition and Reconstruction
ZZ* = 2HP+P+H* = 2HP+H* =H(I
- iJ)H* = HH* -iHK* = I,
(25)
and ZZ* = 2HP+P-H* = 0. Furthermore, by propositions 2.6 and 2.7,
+iK) = ( H * H + K * K )+ i ( H * K - K * H )
22*2 = (H* -iK*)(H
(27)
=I-iJ=2P+. The other equalities follow from complex conjugation, which exchanI ges 2 and 2. Proposition 2.9 gives a new, complex decomposition- and reconstruction algorithm. Given C" E V " ,define Ca+l,~Q+l E V" by
5"
=
z*cm+l + Z*Xa+l,
(29)
Like the real algorithm, this can be iterated. By proposition 2.9,
a=l
2. Wavelet Algebras and Complex Structures
88
Hence for any
5'
E V c and N E IN,
ff=l
where
Xff%zZa-yo,
p k ZN
(32)
The convergence and significance (in relation to the usual frequency decomposition) of this algorithm will be studied elsewhere. Here we merely note that the terms in the above sum are mutually orthogonal, since for /I> a proposition 2.9 implies
Just as the filters H and K have associated averaging- and "differencing" operators h ( S ) and g(S), there are operators associated with Z and 2. For
Z*<(S)$ = 2-"2 (h(S)C(S2)- ig(S)<(s-") 4
(34) hence Z* = z*E* and, similarly, 2* = z*E*,where z* =
h( S) - ig( S)C
Jz
9
z* =
+
h ( S ) ig(S)C
Jz
(35)
Note, however, that these operators do not commute with S, since they contain C.
2.4. Complex Decomposition and Reconstruction
89
Proposition 2.10. The operators z and f satis@
Proof. A straight computation gives 1 (h(s-l) z*z = - ( h ( S )- ig(S)C) 2
+ icg(s-'))
+
;
1 (h(S)h(S-l) 9(S)g(S-l)) ( h ( S ) s ( S )- S(S)h(S))c 2 1 = - ( h ( S ) h ( S - l ) h(-S)h(-s-l)) 2
=-
+
+
= DEh(s)h(S-')E*D-l =I
(37) by proposition 2.5, and the other identities are proved similarly.
I
Furthermore, there is a natural complex function which stands in the same relation to z as do and 2c, stand to h ( S ) and g(S), : defined by respectively. Consider the function ElV
x
x =d -7 ilc,. Similarly, we define
xE
(39)
by
The functions 4 and II, are somewhat reminiscent of the cosine- and sine-function. By that analogy, x and R correspond to the complex
90
2. Wavelet Algebras and Complex Structures
x,
exponentials! It may be that x and which combine averaging and "differencing" in a complex way, play a similar role in wavelet analysis as do the complex exponentials in Fourier analysis. Finally, note that the Hermitian inner product in V" has the decomposition
(C,C') = (5,P+C)+ (C,P-C'),
(41)
which is the sum of the inner products in V+ and V - . Now the restriction of P+ to V C V" maps V one-to-one onto V+,although it cannot preserve the inner product since V is real while V+is complex. In fact, for 5 P+u and C' E P+u' in V + ,we have
I
= 2( u , u ' ) - iw(u,u'), which states that imaginary part of the Hermitian inner product in V+ is the skew-symmetric bilinear form w . This form is known as the symplectic structure determined by the complex structure J together with the inner product on V . It plays a fundamental role in a broad variety of subjects, e.g. group representation theory, classical mechanics and quantum mechanics (see Marsden [1981]). I do not know whether it has any significance for wavelet analysis, but that seems to me a question worth exploring.
A Final Remark. Although no attempt has been made here to work in an abstract setting, the algebraic approach clearly lends itself to such generalizations. We have made use of just a few facts about our initial set-up, for example that the inner product is invariant under S and D and also under C and M . This suggests that one begin with an
2.4. Complex Decomposition and Reconstruction
91
algebra whose generators include S and D and other operators such as C and M , subject to certain relations, and look for representations of this algebra, i.e. for a vector space on which the elements of the algebra act as operators. In our case, the vector 4 provides a representation on L2(R).r$ is a cyclic vector because its orbit under the algebra spans L2(R). This representat ion is orthogonal in the sense that the generators are represented by orthogonal operators. Solving the dilation equation Or$= h(S)r$therefore amounts to constructing the entire representation! In the next chapter we will see that all the continuous frames developed so far can be unified using ideas of Lie-group theory. This approach, however, does not apply to the discrete frames, in particular to the orthonormal wavelet bases. It seems to me that the algebraic approach developed in this chapter is a kind of discrete substitute for the Lie-theoretic construction. It would be most interesting if this relation could be made precise, for example by starting with a finite shift S corresponding to
T > 0 and a finite dilation D
> 1, building an operational and then considering the limits T -+ 0 and a + 1.
corresponding to a factor a
calculus,
92
2. Wavelet Algebras and Complex Structures
2.5. Appendix
Here we assemble various information for the reader's convenience. We begin with a summary of the operational calculus. 1. Operational Calculus and Basis Representations
We list the actions of various operators in their "home versions," both intrinsically and on the orthonormal bases of W WO.(See Daubechies [1988b].)
{$n)
of V
Vo and
($n)
2.5. Appendix
93
2. Naturality Naturality relates the scale-preserving filters
Ha,K,, G, and 2, to
one another and to their home versions, which involve changes of scale. We give the relations for 2,; the others are similar. The last two equations show the advantage of the home versions for iterated decomposition and reconstruction.
3. Relation to Daubechies' Notation Upon taking the Fourier transform, our operators u(S) become functions u ( e i t T ) on the unit circle. Their connections with those occuring in Daubechies' paper, where the sampling interval follows: Write e(S) = S&-(S2). Then
T = 1, are as
94
2. Wavelet Algebras and Complex Structures 4. Analogy with Complex Numbers
We give a “pocket dictionary” of the correspondence between wavelet operations and operations on complex numbers. This is done for the relation Vo x Vl @ iV1, although the same analogy holds at every scale.
R iIR a%RIRiIR
+ iy %(a: + iy) 2 + iy H %(x + iy)
2
z
H 22
2H5+iO y++O+iy z=x+iy
y E IR H i y E
iIR.
95
Chapter 3 FRAMES A N D LIE GROUPS
3 .l. Introduction
Although we have not sought to exploit it until now, it is clear that all our frames so far have been obtained with the aid of group operations. The frames associated with the canonical coherent states and the windowed Fourier transform were built using translations in phase space (Weyl-Heisenberg group), while the wavelet frames used translations and dilations (the &ne group). In this chapter we look for a unifying pattern in these constructions based on group theory. We analyze the foregoing constructions in turn, and draw separate lessons from each. It will be natural to work in reverse order. The &ne group, which is, in some sense, the simplest, will lead us to the general method. Successive refinements will be suggested by the windowed Fourier transform and the canonical coherent states.
3.2. Klauder’s Group-Frames
This was the first of the grouptheoretic constructions, pioneered by J. R. Klauder [1960, 1963a, 1963b], who was also the first to apply it
3. Frames and Lie Groups
96
to the f i n e group G = R*x R (Aslaksen and Klauder [1968, 19691). An element g = ( a , s) of G acts on the real line (“time”) by dilation followed by translation: gt
3
( a ,s ) t = at
+ s.
(1)
The group-composition law is given by
g’gt = ( U ’ ) s’)(a, s ) t = ayat
+ s) + s’ = (u‘a, a ’ s +
S‘)t,
(2)
hence g-’ = (a-’, - s / a ) . (The form of the composition law shows that the subgroup of dilations acts on the subgroup of translations, so that G is the semidirect product of the two.) The frame vectors for the wavelet transform were obtained from a single “basic wavelet” vector h by applying the transformation
h(t)
la1 -1’2 h
(y) =
( U ( a , s ) h )(t).
(3)
It it easy to see that with respect to the inner product in L2(R),
In terms of the group operations, this means that U(g)* = U(g)-l and U ( g ‘ g ) = U(g’)U(g),respectively. That is, each U ( g ) is a unitary operator (on L2(R)), and g H U ( g ) is a representation of G. This means that U is a unitary representation of G on L2(R).The theory of such representations for general groups is a deep and highly developed subject, and is of fundamental importance in quantum mechanics, as was realized by Hermann Weyl and others long ago (Weyl
3.2. Klauder’s Group-names
97
[1931]). In the broadest sense, group representations amount to a vast generalization of the exponential function (think of the map z
H
ear
from (E to @*),and unitary group representations generalize the map 2 H
eiaz from IR to the unit circle.
A representation U of a group G on a Hilbert space 3.t is said to be reducible if 3.1 has a non-trivial closed subspace S (i.e., S # (0) and S # 7i) which is invariant under U (i.e., V ( g ) S S S for every g E G). If U is unitary and S is invariant, then clearly so is its orthogonal complement S*. If no such S exists, then U is said to be irreducible. It can be shown that the above representation of R*x IR on
L2(IR)is, in fact, irreducible. The method to be described below assumes that we begin with
a given irreducible unitary representation U of a given group G on a given Hilbert space 3-1. In addition, we assume that the group is a
Lie group, meaning that it has a differetiable structure such that it is essentially determined (in local terms) by its Lie algebra of left(or right-) invariant vector fields. (See Helgason [1978],Varadarajan [1974] or Warner [1971] for background on Lie groups.) The &ne
group and the Weyl-Heisenberg group are examples of Lie groups, as is IR” (under vector addition as the group operation). Given a general setup non-zero vector h in
(U,G, ‘H ) as above, choose an arbitrary
3.t. (Klauder dubbed h a ‘Yiducial vector”; for
the affine group, this was the “basic wavelet”.) For every g E G define the vector
h, = U ( g ) h . Since U is unitary, Ilh,II action of G on ‘FI, i.e.,
(4)
= llhll. The hg’s are covariant under the
98
3. Frames and Lie Groups
U(g’)h, = U(g’)U(g)h= U(g’g)h = hgIg.
(5)
Hence the set of all finite linear combinations (span) of hg’s is invariant under U , and therefore so is its closure S. Since U is irreducible and S # {0}, it follows that S = ‘H. This means that every vector in 7f can be approximated to arbitrary precision by finite linear combinations of hg’s-a good beginning, if one is ultimately interested in reconstruction! To build a frame from the hg’s, we need a measure on G. Now every Lie group has an essentially unique (up to a constant factor) left-invariant measure, which we will denote by d p . This means that if E is an arbitrary (Borel) subset of G, and if glE { g l g I g E E } is its left translate by 91 E G, then
In local terms, d,u(glg) = d p ( g ) for fixed 91. [Similarly, there exists a right-invariant measure d p R on G, which is in general different from d p ; if d p R is proportional by a constant to d p , the group G is called unimodular. Everything we do below can be repeated, with obvious modifications, using d p R , provided that h, is redefined as U(g-’)h.] To find d p for the affine group, for example, we write it in the form of a density,
where da A ds is a differential 2-form denoting the’ area element in R2 2 G. ( d a A d s becomes a positive measure upon choosing an orientation in R2and orienting all subsets accordingly; see Warner [1971].) Then, for fixed ( a l , s1) f IR’ x IR,
3.2. Klauder’s Group-Flames
99
hence left-invariance implies
Setting al = l / a and
s1
= - s / a then shows that
dp(a, s) = da A dsla2,
(10)
where we have chosen the normalization p( 1,O) = 1. This is precisely the measure we obtained earlier using a more pedestrian approach. Returning to the general case, consider the formal integral
We want to show that (a) J converges, in some sense, and (b) J is a multiple of the identity, thus giving a tight frame in the generalized sense defined in chapter 1. Before worrying about convergence, let us formally apply U ( g 1 ) from the left:
= JU(Sl),
where we have used the left-invariance of the measure and
3. Frames and Lie Groups
100
(U(gl’)
I h,,
)>* = (h,,
I Wl),
(13)
which follows from unitarity. That is, J commutes with every representative U ( g ) of the group. By Schur’s Lemma (Varadarajan [1974]), it follows from the irreducibility of
U
that J is a multiple of the identity operator on Ti.
Since J, if it converges, is a positive operator, we arrive at the desired frame condition
where c is a positive constant which can taken as unity by the appropriate normalization of dp. Returning to the formal integral defining J, the above argument shows that if the integral converges in some sense, it must converge to cI, hence define a bounded operator. (Of course c could be infinite!) Thus a necessary condition for convergence in the weak sense (i.e., as a quadratic form) is that
r
(15)
We have already encountered this condition in the special case of the affine group, where it was called the admissibility condition for h. The same terminology is used in the present, general setting. The above condition depends both on the representation U and the choice of h. If it is satisfied for at least one non-zero vector h, the representation U is called squareintegrable and h is called admissible. It turns out that the existence of an admissible (non-zero) vector is also sufficient for the weak convergence of the integral J. The following theorem
3.2. Klauder’s Group-frames
101
is due to Carey [1976], and D d o and Moore [1976]; G can be any locally compact topological group, in particular any Lie group.
Theorem 3.1. Let U be a squareintegrable unitary irreducible representation of G on a Hilbert space ‘FI. Then there exists a unique self-adjoint (in general unbounded) operator C on ‘H such that: (a) The domain of C coincides with the set of all admissible vectors. (b) If hl and h2 are admissible, then for all fl and fi in ‘FI we have
(c) If G is unimodular, then C is a multiple of the identity. Choosing hl = h2 f h now leads to the earlier resolution of unity, with c = llCh112. As usual, an arbitrary vector f in ‘FI can be “presented” as a function f(g) ( h , I f ) on G, from which f may be reconstructed as a linear combination of hg’sweighted by the reproducing kernel K ( g‘ ,g ) = ( h,t I h, ) .
=
Note that the basic wavelet h corresponds to
so the admissibility condition is nothing but the requirement that
k
belong to L 2 ( d p ) . A potentially interesting generalization of this scheme is actually possible. By the foregoing theorem we can use two distinct admissible vectors: the vector h2 is used to analyze f, i.e. present it as f(g) = ( ( h z ) , I f ), while the vector hl is then used to synthesize f from f(g). There is no fundamental reason to use the same “wavelets” for
3. fiames and Lie Groups
102
analysis and synthesis, provided only that they overlap in the sense that
Absorbing the reciprocal of this constant into the group measure, the map
T:f
H
f is an isometry from B onto its range gT,
Due to the “covariance” of the hg’s with respect to the action of
G, the representation of G on %T acquires the simple geometric form (O(91 ).f) ( 9 ) 5 ( h g I U(91)f )
= ( h91- l g I f )
(19)
=f ( g h ) ,
showing that G acts on !RT by merely translating the variable in the base space G. The realization of f by f and U by 0 as above is called the coherent-state representation determined by the pair (U, h). We will also refer to the frame { hg I g E G} as the group-frame (Gframe) associated with (U, h ) . From a purely mathematical point of view, one of the attractions of this scheme is that although we started with an arbitrary representation of G on an arbitrary Hilbert space, this construction “brings it home” to G itself and objects directly associated with it: the Hilbert space is a closed subspace of L 2 ( d p ) , and the representation is induced from the (left) action of G on itself, i.e. g H gllg. That is, Klauder’s construction exhibits U as a subrepresentation of the regular representation of G. (See Mackey [1968] for the definition and discussion of the regular representation.)
3.3. Perelomov's Homogeneous G - h m e s
103
3.3. Perelomov's Homogeneous G-Frames Let us now attempt to apply the grouptheoretic method to the windowed Fourier transform. As most of our applications will be to the phase-space formulation of quantum mechanics, we shift gears and replace the time by a space coordinate, t + -z (the sign is related to the Minkowski metric; see section 1.1) and the frequency by a momentum coordinate, 2nu + p . Although it is a trivial matter to extend everything we do in this section to an arbitrary (finite) number of degrees of freedom (where x and p belong to IR'), we restrict ourselves to a single degree of freedom to keep the notation simple. The self-adjoint generators of translations in space and momentum, P and X , are defined by
Expressed in the space domain,
XfW =z f ( 4
(2)
Starting with a basic window function h(x), the window centered at
( p , x) in phase space is given by *
I
hp,z(z')= e * p z h(z' - x) = eipr'
(e-itP
h) (2') = (eipx e-iz'h) ( $ I ) .
(3)
That is, h,,, is obtained from h by a translation in space (by x) followed by a translation in momentum (by p ) . Defining the corresponding unit my operators
104
3. Frames and Lie Groups
let us see what happens when two such operations are applied in succession:
- e i P l z ’ h P , I (2‘ - 2 1 ) - eip1z’e’p(zf-+1)h(X/ - e--ipzl
- z1- x)
(5)
hP1 +P,Zl + z ( 4
- e-*pzl U(P1 + P,5 1 + +(z’). Hence the operators U ( p ,z) do not form a group, since two successive operations give rise to a “multiplier” exp(-ipzl). The reason is that translations in space do not commute with translations in momentum, as can also be seen at the infinitesimal level by noting that their respective generators obey the “canonical commutation relations”
[ X , P I= [ z , - i i ] =iI. The remedy (suggested by Hermann Weyl) is to include the identity operator as a new generator (it generates phase factors e - i 4 which can be used to absorb the multiplier). To see this in terms of unitary representations of Lie groups, consider the abstract real Lie algebra w with three generators {-is,-iT,-iE} and Lie brackets
[$TI = iE,
[S,E]= 0)
[T,E]= 0.
(7)
The corresponding three-dimensional (simply connected) Lie group
W is known as the Weyl-Heisenberg group. Topologically, W is just IR3. (If configuration space is IR”,the corresponding WeylHeisenberg group W, x IR2”’+l .) A general element in W , parametrized by ( p , z, 4) E IR3, may be expressed as the product of three factors
3.3. Perelomov’s Homogeneous G-Emes
105
g(p, x,4) = exp( -+E) exp(ipS) exp( -id’),
(8)
where “exp” denotes the exponential mapping from the Lie algebra w to the Lie group W , whose group law is
A unitary irreducible representation of W on L2(lR)is obtained by the correspondence S + X , T + P, E + I. This is known as the Schrodinger representation. The unitary operator correponding to
d x ,P,4) is
As expected, these operators are closed under multiplication, with the composition law
With this unitary irreducible representation of W on L 2 ( R ) ,we have all the ingredients needed to attempt the construction of a group frame for W . The prospective frame vectors are
where h,,, are the vectors defined earlier. The left-invariant measure on W (which, in this case, is also right-invariant) is just Legesgue measure on R3,given by the differential form dp A dx A dq5. Th’is can 51,$1)) be seen by looking at the composition law: For fixed (PI,
3. fiames and Lie Groups
106
since dp A dp = 0 . The method of section 2 then gives the following candidate for a resolution of unity:
J=
J,
dp A dx A d 4
= Jw dp A dx A d$
where the phase factor cancels in the integrand. This integral clearly diverges, since the integrand is independent of 4 and the integration is over all real 4. Equivalently, the representation U is not squareintegrable, since for every nonzero h,
~h E
J
dp dx d+ I ( h I U ( p ,~ , 4 ) hI ) = 00
(14)
due, again, to the constancy of the integrand in 4. We could get around the problem by choosing a multiply connected version W 1 of W , say with 0 5 4 < 27r. (W1 has the same Lie algebra as W ) . This would indeed give a tight frame, but this frame is unnecessarily redundant (as opposed to the beneficial sort of redundance associated with oversampling) since the vectors hplz,+are not essentially different for distinct values of 4. More significantly, we would miss an important lesson which this example promises to teach us. For other important groups, as we will see, the problem cannot be circumvented by compactifying the troublesome parameters. The following solution was proposed by Perelomov [1972] (see also Klauder [1963b, p. 10681, where this idea is anticipated). To get rid of the +dependence, choose a “slice” of W , 4 = a ( p , z ) ,and define
3.3. Perelomov's Homogeneous G-fiames
107
Integrating only over this slice with the measure dp A dx, we get
since the integral reduces to the same one we where c h = 2~llh11~, had for the windowed Fourier transform. Flom a computational point of view this is, of course, trivial. But to extend the technique to other groups we must understand the group theory behind it. Suppose, then, that we return to the general setup we had in the last section: Given a unitary irreducible representation U of a Lie group G on a Hilbert space 'H, choose a nonzero vector h in 'H and form its translates h, = U ( g ) h under the group action as before. Consider now the set H of of all elements k of G for which the action of U ( k ) on h reduces to a multiplication by a phase factor x(k) = exp[iq5(k)]:
In the case of W , H consists of all elements of the form k = (O,O,4) and U(k)is, in fact, nothing but a phase factor. However, this is deceptive, since in general U(k) may be a non-trivial operator, acting trivially only on some vectors h. Hence, in general, H will in fact depend on the choice of h and should properly be designated H ( h ) . Now for two elements kl and k2 of H ,we have
hence kl kzl also belongs to H and it follows that H is a subgroup of G. Furthermore, the above equation shows that the map k H x ( k ) is
108
3. frames and Lie Groups
a group-homomorphism of H into the unit circle, thus it is a character of H , i.e. a unitary representation on the one-dimensional Hilbert space a. H is called the stability subgroup for h, and its Lie algebra is called the stability subalgebra. The reason for this terminology is that H does not affect the quantum-mechanical state defined by h, since all observable expectation values in that state are given in terms of sequilinear forms in h. More precisely, if we choose llhll = 1, the corresponding state is by definition the rank-one projection operator
and the expected value of any observable A in this state is given by
( A ) = ( h I A h ) = trace ( P A ) .
(20)
(This formulation, besides avoiding irrelevant phase facors, also permits states which are statistical mixtures of pure states as needed, for example, in statistical quantum mechanics.) The translate of P under a general group element g is then
hence for g in H we have Pg = P , i.e., P is stable under H as the name implies. There is, therefore, some advantage to formulating the theory as much as possible in terms of states rather than Hilbert space vectors since this automatically eliminates the fictitious degree of freedom represented by the overall phase. [Incidentally, a similar situation appears to prevail in communication theory, since an overall phase shift has no effect whatsoever on the informational content of the signal.] However, the Hilbert space vectors will play an important role in connection with holomorphy (recall that in the coherent-state
3.3. Perelomov’s Homogeneous G-fiames
109
representation, f(z) is analytic, whereas I i(z) I is not), hence we work primarily with them. If g E G and Ic E H , then
hence Pgk = Pg.That is, Pg is the same for all members of the left coset
gH
G
{gkl k E H } .
(23)
The set of all translates Pgis therefore parametrized by the left coset space
Members of M will be alternatively denoted by rn and by g H , and will play a dual role: as points in M , and subsets of G. In the case i.e. it is phase of W , for example, M is parametrized by ( p , ~E) R2, space, precisely the label space for the frame we obtained. The coset ( p , z ) H is a straight line in W x R3. (We will see in the next chapter that W can be interpreted as a degenerate non-relativistic limit of phase spacex time and ( p , z ) H then corresponds to the trajectory of a free classical particle in W . ) To build a frame, we take a “slice” of G by choosing a representative from each coset, i.e. choosing a map
This is actually a non-trivial process. The projection G + G / H defines a fiber bundle, and Q is a section of this bundle; in general o can be chosen smoothly only in a neighborhood of each Note:
110
3. fiames and Lie Groups
point of G/H, i.e., locally (see F. Warner [1971], p.120). However, this is sufficient for our purposes, since we will ultimately deal only
with the state Pm, which is independent of the choice of u. # Thus u has the form u(gH) = ga(g) for some function a: G -+
H. In the case of W , we had u ( p , z ) = ( p , z , a ( p , z ) ) . The frame vectors corresponding to this choice are
To build a frame, we need two more ingredients: an action of G on the label space M , and a measure on M which is invariant with respect to this action. The action is easy, since G acts naturally on G/H by left translation :
If m = g H E M , we will denote (g1g)H by glm. M is called a homogeneous space of G. As for an invariant measure, it exists, in general, only subject to a certain technical condition (Helgasson [1962], p. 369). It does exist whenever G is a unimodular group (its right- and left-invariant measures are proportional by a constant), such as W . Let us assume that a G-invariant measure does exist on
M (in which case it is unique, up to a constant factor) and denote it by d p M . Once more we consider the formal integral
If we can show that J commutes with every U(gl), then irreducibility again forces J = c l . But
3.3. Perelomov’s Homogeneous G-frames
111
That is, the vectors hk are “almost” covariant under the action of G, with a residual phase factor (multiplier). This means that the states P, transform covariantly, i.e.,
Thus
= J,
using the unitarity of U ( g 1 ) and the invariance of d p M . This proves that J = cI, provided the integral converges in some sense.
Note: The above proof becomes shorter if one works directly with the states P, rather than the frame vectors; however, it is important to see the action of G on the frame vectors in terms of the multipliers since this will play a role in the next section, where we look for representations of G on spaces of holomorphic functions. #
3. Jk-ames and Lie Groups
112
A necessary condition for the weak convergence of the integral is that
i.e., that
h(m) = ( h & I h ) be in L 2 ( d p , ) . As before, this condition
is also sufficient, and the same terminology is used: h is called admissible, and
U is called square-integrable. The action of U in the
Hilbert space of functions f(m) = ( h& I f ) is somewhat more complicated than earlier because of the multiplier. (If we simply dropped the multipliers we would no longer have a unitary representa.tion of
G but a projective representation; see Varadarajan [1970].) The above construction has the advantage that the trivial part of the action of G is factored out, thereby improving the chances that the integral J converges. As mentioned above, it depends on the existence of the invariant measure d p M on G / H , which is not guaranteed. Note that for fixed g E G , the stability subgroup for the vector U ( g ) h is gHg-’, i.e., a subgroup of G conjugate to H . In general, however, the stability subgroups of two admissible vectors
hl and h2 may not be conjugates. It may happen that G / H 1 has an invariant measure while G / H 2 does not. It pays, therefore, to choose the vector h very carefully. Intuition suggests that h should be chosen so as to maximize the stability subgroup, since this will minimize the homogeneous space M and improve the chances for convergence. Furthermore, maximal use of symmetry would seem to make it more likely that an invariant measure exists on the quotient. More will be said about this in the next section, in connection with the “weight” of a representation. We will refer to the frame { h& } as a homogeneous G-frame associated with (U, h). The dependence on cr will usually be suppressed,
3.4. Onofri ’s Holomorphic G-Rmes
113
since a change in (the local sections) o gives an equivalent frame.
3.4. Onofri’s Holomorphic G-Frames
The frames associated with the windowed Fourier transform and the canonical coherent states are similar in that both provide “phase space)’ representations of functions in L2(R8). The distinguishing feature is that the representation determined by the canonical coherent states is in terms of holomorphic (analytic) functions. Neither of the two general grouptheoretic methods covered so far explains the origin of this analyticity, yet it has been found that all such group-relat ed phasespace representat ions have analytic counterparts. Moreover, analyticity will play a key role in the coherentstate representations of relativistic quantum mechanics (next chapter) and quantum field theory (chapter 5 ) . The method to be described in this section assumes we are dealing with a compact, semisimple group. Since the coherent-state representations we will develop for relativistic quantum mechanics are based on the Poincar6 group, which is neither compact nor semisimple, the present considerations do not apply directly to the main body of this book (chapters
4 and 5). Nevertheless, we describe them in considerable detail in the hope that they may shed some light on our later constructions, which are still not well-understood in general terms. Let us, then, return to the canonical coherent states in order to isolate the property leading to analyticicy and find its generalization to other groups. We follow an approach first advocated by Onofri [1975].For related developments, also see Perelomov [1986].Again consider the three-dimensional Weyl-Heisenberg group W , whose
3. n a m e s and Lie Groups
114
(real) Lie algebra w has a basis {-is, -iT, -iE} which is represented on L2(R)by S
+
X,T
+
P and E
arrived at the canonical coherent states
xr
+
I . Recall that we
as eigenvectors of the
+
non-hermitian operator A = X iP, which represents the complex combination S iT of generators in w. Since w is a real Lie algebra, we must complexif'y it in order to consider such combinations of its
+
generators. This is done in the same way as complexifying a real vector space, namely by taking the tensor product with the field of
+
complex numbers. With obvious notation, w, = w 8 (J = w iw. As will be seen below, complex combinations such as S f i T play a very important role in the theory of r e d Lie groups, exactly for the same reason that complex eigenvectors and eigenvalues are necessary in order to study real matrices. We begin by rederiving the canonical coherent states from an algebraic point of view which shows their relation to the vectors h,,, associated with the windowed Fourier transform and pinpoints the property which makes them analytic. In the last section we found that
Now if B and C are operators such that [ B ,C] commutes with both
B and C,then the Baker-Campbell-HausdorfT formula (Varadarajan [1974]) reduces to
Since [ipX, -izP] = ipxl, we therefore have
h,,, = exp(ipz/2) exp (ipX - izP) h. Substituting
(3)
3.4. Onofri 's Holomorphic G-lkimes
X = (A* + A)/2 and defining
z
z - ip, we
and
115
- iP = (A* - A)/2
(4)
get
h,,, = exp(ipz/2) exp [ZA*/2 - zA/2] h.
(5)
So far, all our manipulations have been justifiable since we have only exponentiated skew-adoint operators. The next one is more delicate since the operators to be exponentiated are not skew-adjoint; it will be justified later. Using the Baker-Campbell-HausdorfF formula again with B = zA*/2 and C = -zA/2, write
h,,, = exp (ipz/2 - Er/4) exp (ZA*/2) exp (-zA/2) h.
(6)
If we choose h(z') = N exp (-x' "2) = x o ( z ' )
(7)
(where N = ( 2 ~ ) - ' / ~ and xo is just the canonical coherent state with z = 0), then Ah = 0, hence exp(-zA/2) h = h and
hp,l = exp ( i p z / 2 - .Zz/4) exp (ZA*/2) xo.
xs (8)
We claim that exp(ZA*/2) x o = xr.
(9)
This can be seen by applying A to the left-hand side, then using Ax0 = 0 and [A,A*] = 21: Aexp(zA*/2) xo = [A,exp(ZA*/2)] xo = z exp(ZA*/2) xo,
(10)
3. frames and Lie Groups
116
which shows that exp(ZA*/2) xo equals xZ up to a constant factor, which is easily shown to be unity. That is,
so the canonical coherent states are a special case (modulo the z-
dependent factor in front) of the frame vectors h,,, for the choice h = xo. Note that the corresponding states are related by
I h P , Z )( h,,z I = exp (-1Zl2/2)
Ix z )(X z I ?
(12)
giving the weight function on the right-hand side in the bargain. The ~ obtained by unitarily translating reason for this is that the h p , r ’were h , whereas the operator exp(fA*/2) which “translates” x o to x z is not unitary but results in llxzll = exp(1.zl2/4) llxoII; the weight function then corrects for this. We can now justify the fine point we glossed over earlier. The expression exp (zA*/2) exp (-zA/2) xo
(13)
makes sense because
(4 e
-zA/2
xo = xo
(14)
since Ax0 = 0, and (b) xo is an analytic vector (Nelson [1959]) for the operator A*, since
3.4. Onofri’s Holomorphic G-fiames
117
As will be seen below, the key to constructing representations of real Lie groups on spaces of analytic functions will be to (a) complexify the Lie algebra; (b) find a counterpart to
Ax0
= 0;
(c) define counterparts to xz = exp(zA*/2) xo. Before embarking on this task, we must make a brief excursion into the structure theory of Lie algebras. For background and details, see Helgason [1978]or Hermann [1966].The Lie groups and -algebras we consider below are assumed to be real and sem’simple. (A Lie algebra is semisimple if it contains no proper abelian ideals; a Lie group is semisimple if its Lie algebra is semisimple.) Since W is not semisimple (the subspace spanned by -iE is an ideal of w),these results do not apply to it, strictly speaking. However, as will be shown in section 3.6,W can be obtained as a (contraction) limit of a simple Lie group ( S U ( 2 ) ) ,and this turns out to be sufficient for our purpose. Thus, let G be an arbitrary real, semisimple Lie group and g its Lie algebra. The following is known:
1. Every element X of g defines a linear map adX on g by adX(Y) = [ X ,Y].Similarly, every 2 E gc defines a complexlinear map (denoted by ad 2) on g,. It therefore makes sense to look for eigenvalues and eigenvectors of ad 2. 2. g has a (Lie) subalgebra 11 called a Cartan subalgebra, defined as a maximal abelian subalgebra of g satisfying an additional technical condition. h is analogous to a “maximal commuting set of observables” in quantum mechanics. Its complexification,
118
3. fiames and Lie Groups denoted by h,, is a maximal abelian subalgebra of g,.
(The
technical condition on h essentially ensures that each of the linear maps a d H , with H E h,, has a complete set of eigenvectors in g,, hence that all the linear transformations ad H with H E h,
can be diagonalized simultaneously.)
3. For every complex-linear form a:h,
+
(E , let
That is, g" is the set of all vectors in g, which are common eigenvectors of ad H for every H in h,, with corresponding eigenvalue a ( H ) . For generic a , no such eigenvectors will exist, hence g" = (0). When g" # {0}, a is called a root (of g, with respect to h,), each Z in gc is called a root vector and g" is called a root subspace of g,. Clearly, the zero-form a ( H ) = 0 is a root, since the fact that h, is abelian means that go contains h,. The fact that h, is maximal-abelian means that actually go = h,, since every Z in go commutes with all H in h,.
4. Let 2, E g" and Zp E gp, where a and ,f? are arbitrary linear forms on h,. Then the Jacobi identity implies that for every H in h,, [ H ,[Z", Zpll = "K Z"1, Zpl [Z", [ H ,Zpll
+
+
= ( Q ( H ) P(H)) [Z", Zpl,
(17)
thus [Z,, Zp] E ga+p. This statement is abbreviated as
5. The set of all nonzero roots is denoted by A. g, has a direct-sum
decomposition (root space decomposition) as gc = hc
+
c
g",
3.4. Onofri's Hofornorphic G-fiames
119
and each g" (with a E A) is a one-dimensional subspace of g,. 6. The Killing form of g, is the bilinear symmetric form defined bY
B(2 , Z ' ) = trace (ad Z ad 2').
(20)
It is non-degenerate on g,, and its restriction to h, is also nondegenerate. Since a non-degenerate bilinear form on a vector space defines an isomorphism between that space and its dual, the restriction of B to h, defines an isomorphism between h, and h:. The vector in h, corresponding to a E 11; is denoted by Ha.It is defined by
B(H,,H) = a ( H )
VH E h,.
Note that the vector corresponding to a G 0 is
7. If
CY E
A, then -a E A and a ( H , )
(21)
Ho = 0.
B ( H a , H o l )# 0. Further-
more, [g",g-"] = CH",
i.e. the set of all brackets [Z,,Z-,]
with 2, E g" fills out
the one-dimensional subspace spanned by CY E
(22)
A.)
Ha. (H, # 0
since
+
8. Any two root subspaces g a and gg with a ,f3 # 0 are "orthogonal" with respect to B. 9. It is possible to choose (non-uniquely) a subset A+ of A, called a set of positive roots, such that (a) If a and p belong to A+ and a ,f3 is a root, then a p belongs to A+. (b) The set A- F -A+ is disjoint from A+. (c) A = A+ U A-.
+
It follows from (4)and (9) that
+
3. Rames and Lie Groups
120
are Lie subalgebras of gc, and by ( 5 ) )
Since there can only be a finite number of roots, (4) and (9) also imply that after taking a finite number of brackets of elements in either n+ or n- we obtain zero; that is, the subalgebras n* are nilpotent. We may choose a basis for gc as follows: Pick a non-zero vector 2, from each g" with a E A+. Then by (7) there is a unique vector 2-, in g-" with
Since 2, and 2-, are root vectors, they also satisfy
[Ha, Za]= a(KY)Za [Ha, 2 - a ] = -a(Hcr)Z-a, and a(H,)
# 0 (property (7)). If we define
then each triplet (2")Cay 2-,) spans a Lie subalgebra of gc which is isomorphic to sl(2,
a):
The root space decomposition then shows that gc is a direct sum of copies of sl(2,
a),indexed by A+.
3.4. Onofri’s Holomorphic G-fiames
121
The connection with the non-hermitian combinations X f iP can now be explained. The following argument is heuristic and has no pretense to rigor. Its sole purpose is to motivate the construction of general holomorphic frames. It will be made precise in section 3.6 at the level of unitary representations, where a clear geometric interpretation will be given. Begin with the group G = SU(2), which is a simple, r e d Lie group whose Lie algebra g = su(2) has a basis { - i J l , -i J2, - i J 3 } satisfying [ J l , J21
= iJ3
[ J 2 , J3]
= iJi
[J3, J1]
= iJ2.
We first show that w can be obtained as a “contraction limit” of g.* Choose a positive number tc and define IC1 = K; J1, K2 = tc J2 and K3 = K~ J3. These form a new basis for g, with
[K1,K2]= iK3 [K2,K3]= iK2K1 [&,K1] = i ~ ’ K 2 .
(30)
In the limit tc + 0, g becomes isomorphic to w. We will see in section 3.6 that within a unitary irreducible representation of SU(2), the operator K3 can be chosen so that K3 + I as K 0, hence we may interpret the limits of 11 ‘ and K2 as X and P,respectively. Now apply the above structure theory to gc, which is just sl(2, (I.!). The direct s u m of copies of sl(2, a) therefore reduces to a single term. For the Cartan subalgebra of g we can choose h = IR J3 (the one-dimensional subspace spanned by J 3 ) , so that h, = (E J3.
*
The idea of group contractions is due to Inonu and Wigner [1953].
3. f i m e s and Lie Groups
122
Two linearly independent root vectors are given by Jh =
J1
fiJ2,
with
We choose a(J3) = 1 as the single “positive” root and Jh for the basis vectors of the two one-dimensional root subspaces. Then
shows that
C, corresponds to J3. Setting Kh
and taking the limit
K -+
= K J and ~ K3 = tc2 J3,
0, we obtain the correspondence
K+ + S + iT
H
X
+ iP = A
S - iT H X - iP = A* K3 -+ E H I ,
K-
where we have used
-+
“H” to
(33)
denote the representation of w on L2(IR).
[This correspondence is not unique; for example, K+ -+ A*, K- + A, IC, -+ -E is equally good. As will be shown in section 3.6, both of these correspondences actually occur as weak limits, due to the fact that an irreducible representation of S U ( 2 ) contracts to a reducible representation of w.] Note that the three roots { t c 2 , 0, - t c 2 } all merge into a single root, zero, in the contraction limit. Thus the operators
A and A* are interpreted as the contraction limits of root vectors. Note: Another way of seeing the importance and naturality of A and A* is in the context of the Kirillov-Kostant-Souriau theory, sometimes called “Geometric Quantization,” applied to the Oscillator group (Streater [1967]; see section 3.6). For background on Geometric Quantization, see Kirillov [1976], Kostant [1970] and Souriau
3.4. Onofri’s Holomorphic G-frames
123
[1970]; see also Guillemin and Sternberg [1984], Simms and Woodhouse [1976] and Sniatycki [1980]. We are, at last, ready to generalize the construction of g r o u p represent at ions on spaces of holomorhic functions to other groups. We begin with a semisimple, real Lie group G and a unitary irreducible representation U of G on a Hilbert space 3-1. To avoid technical difficulties, we will here assume that G is compact. (The reason for assuming compactness, as well as ways to get around it, will be discussed below.) Then the irreducibility of U implies that 3-1 be finite-dimensional, hence the operators U ( g ) representing group operations are just unitary matrices and the operators U ( X ) representing elements of g are skew-adoint matrices. (Sometimes the operator representing X is written as d U ( X ) to emphasize its “infinitesimal” nature; we will write it as U ( X ) to keep the notation simple.) At the Lie algebra level, U extends, by complex-linearity, to a representation T of g,:
T ( X + iY)= U ( X )+ i U ( Y ) .
T is the unique representation of c1,c2 E
g, that extends
(34)
U ;that is, for all
and all Zl,Z2 E gc,
The matrices T ( 2 ) are no longer skew-adjoint, but because of the finite dimensionality of 3-1, there is no problem in exponentiating them to give a representation of G, on 3-1, which we also denote by 2’. Thus,
3. names and Lie Groups
124
the representative of the group element exp Z of G, is defined by*
T(expZ) = exp[T(Z)],
Z E gc.
(36)
Note: If G is non-compact, expZ is still well-defined but the righthand side of the above equation is problematic since for any nontrivial representation
U ,‘H is infinitedimensional and T(2)will, in
general, be a non-skew-adjoint, unbounded operator. (Even the definition T ( Z ) = T ( X ) i T ( Y )becomes troublesome, since the skew-
+
adjoint operators T ( X )and T ( Y )may both be unbounded and their domains may have little in common; additional assumptions must be made.) In the case of W , this was resolved by restricting exp[T(Z)] to act on analytic vectors. A similar approach is used in extending the present construction to non-compact G. For the present, we continue to assume that G is compact to avoid this problem. #
Lemma 3.2.. (a) g H T ( g )is an irreducible (non-unitary) representation of G,. (b) The map 2 I-+ T(exp 2 ) is analytic as a map from the complex vector space g, to the complex matrices on ‘H. Proof. If A is a matrix which commutes with all T(2)for 2 E gc, then in particular it commutes with all U ( X )for X E g , hence must be a multiple of the identity since U is irreducible. Therefore T is irreducible. To prove (b), note that Z H T(exp2) is, by definition, the composite of the’two analytic maps 2 H T(2)and T ( Z ) H ““P[T(Z)I. I
*
Since G is compact, it is exponential; that is, every group element
can be written in the form expZ for some 2.
3.4. Onofri ’s Holomorphic G-Rames
125
It follows that the map T from G, (considered as a complex manifold; see Wells [1980])to the group GL(’).I) of non-singular matrices on ‘H is also analytic; that is, T is a holomorphic representation of G,, obtained by analytically continuing the representation U of G. Now the point of Onofri’s construction is this: We have seen that by choosing a state which is stable under H, U can be reformulated as a representation of G on a space of functions f(m) defined over the homogeneous space G/H. In the case G = W , G/H was identified as a phase space, but in general it is not clear that it can be interpreted as such. Following Onofri, we will show that:
(a) The representation T induces a complex structure on the ho-
mogeneous space G/H,making it into a complex manifold on which G acts by holomorphic transformations. (Such a manifold is called a complex homogeneous space of G, or a holomorphic homogeneous G-space.) (b) In addition, G/H has the (symplectic) structure of a classical phase space, and the action of G on G/H is by canonical transformations. Thus it becomes possible to think of G/H as phase space. To actually identify G/H as the phase space of a classical physical system, i.e. as the set of dynamical trajectories followed by that system, it is necessary for G to include the dynamics for the system, i.e. its evolution group, of which nothing has been said so far. This will be discussed in the next chapter.
The representatives U(H) of the elements H of h form a commuting set of skew-adjoint matrices, hence can all be diagonalized simultaneously. Let h be a common eigenvector:
126
3. fiames and Lie Groups
U ( H )h = X ( H ) h,
(37)
where X(H) is imaginary. Since U is linear at the Lie algebra level, it follows that X is a linear functional on h, called the weight of h. (Roots are simply weights in the adjoint representation, where ‘H is replaced by g, and U ( H ) by adH.) For any non-zero element 2, in g* with a in As, we have (remembering that U ( H ) = T ( H ) since
H E 11)
That is, T(Z,) “raises” the weight by a. Similarly, for a E A-, T(Z,) lowers the weight by -a. Since non-zero vectors with different weights are linearly independent and ‘H is finite-dimensional, it follows that ‘H must contain a non-zero vector with lowest weight, i.e. such that
T(Z,)h = 0
V a E A-.
(39)
T ( Z ) h= 0
VZ E n-.
(40)
Equivalently,
For the group W , h was the ”ground state” x o and the above equaand Axo = tions correspond to T(-ZE) xo = 4 x 0 (so X(-iE) = 4) 0. Consider the subalgebra
3.4. Onofri 's Holomorphic G-fiames
127
m
of g,, called a Borel subalgebra. If N E n+ and we denote by its complex-conjugate with respect to the real subalgebra g, then N E n-. Hence for arbitrary 2 = H N E b, we have
+
+
T ( Z ) h= T ( E ) h T ( N ) h= T(R)h,
(42)
and H belongs to h, since [a,h,] = 0. Extending X by complexlinearity to h,, we therefore have
T ( Z ) h= A(R) h,
(43)
hence exp [T(Z)] h = exp [X(
n)]h.
(44)
The subgroup B = exp(b) is called a Borel subgroup of g,. For 2 as above, let b = expZ E B, i = expZ and
~ ( 8= ) exp[X(R)]. Then
T(8)h = ~ ( 8h.)
(45)
Since X ( H ) is imaginary for H E h, complex-linearity implies that X(a)= -X(H) for H E h,. Hence
The map
T:
B +
a* satisfies ~ ( b l b z = ) n ( b l ) ~ ( b z )i.e. ,
it is
a character of B. Furthermore, since ~ ( b is) analytic in the group parameters of b, Onofri calls it a holomorphic character of B . Notice that the state corresponding to h (i.e., the onedimensional subspace spanned by it) is invariant under B. If we restrict ourselves to the real group G, this means that the state is invariant under the subgroup H . If H is the maximal subgroup of G leaving this state invariant, then the weight X is called non-singular. In that
3. fiaznes and Lie Groups
128
case, H plays the same role as it did in the last section: it is the
stability subgroup of the state. We will assume this to be the case; if it is not (in which case X is singular), the present considerations still apply but in modified form. Note that in the non-singular case, the stability subgroup is abelian. We now adapt the construction of the last section in a way which respects the complex-analytic structure of G,. Introduce the notat ion
(T(g)*)-' = T # ( g ) ,
g
E G.
(47)
Since the representation U of G is unitary, U ( X ) * = - U ( X ) for X E g; hence for 2 E g,, we have T ( Z ) * = -T(Z). It follows that for group elements g = exp(2)of G,,
T(d# = T ( g ) ,
9 E Gc,
(48)
where g = exp(2) and, in particular, T(g)# = T(g)= U ( g ) for gE
G. Define the vectors
which, when restricted to g E G, coincide with the earlier frame vectors but are anti-holomorphic in the group parameters of G,. An arbitrary vector f E 3.1 defines a holomorphic function on G, by
m=
( h , If).
For arbitrary b E B ,
-
hg* = T(g)#T(b)#h= T ( b ) - l h g , hence
3.4. Onofri’s Holomorphic G-frames
129
Note: The reader familiar with fiber bundles (Kobayashi and Nomizu [1963,19691) will recognize the above equation as the condition defining a holomorphic section of the holomorphic line bundle associated to the principal bundle B --+ G, + G,/B by the character T : B + @*. We now proceed to construct this section in a naive way, that is, without assuming any knowledge of bundle theory. # The above shows that the state determined by h, depends only on the left coset gB,which we denote by z. Let
2 = G,/B
(53)
be the left coset space. Now (a) G, is a complex manifold; (b) B , as a complex subgroup, is a complex submanifold of G,; (c) the projection map G, + 2 is holomorphic.
Hence it follows that 2 is a complex manifold. This means that a neighborhood of each point z = gB E 2 can be parametrized by a local chart, i.e. a set of local complex coordinates (21,.. . ,2,) (say, with (0, . . . ,0) corresponding to z), and the transformation from one local chart to another on overlapping neighborhoods is a local holomorphic function. In the case of G = W , B corresponds (under the contraction limit) to the complex subalgebra spanned by E and A, and 2 can be identified with a, hence only a single chart is needed to cover all of 2. In general, more than one chart is necessary. For G = S U ( 2 ) , we will see that 2 is the Riemann sphere S2,hence two
130
3. frames and Lie Groups
charts are needed; however, the north pole has measure zero, and a single chart will do for S2\{ 00) (I!. Since G, acts on itself by holomorphic transformations, its action on 2 is also by holomorphic transformations. This means the following: For gl E G, and z = gB E 2, let w(z) = glz E (g1g)B. If ( 2 1 , . . . ,Zn) and (w1,. . . ,wn) are local charts in neighborhoods of z and w , respectively, then the mapping 4: ( ~ 1 , .. . ,zn) I+ ( ~ 1 , .. . ,wn) is holomorphic in a neighborhood of (0,. . . ,O). (It must, of course, be locally invertible with holomorphic inverse.) We could proceed as in section 3 and consider the states
where we must now divide by ((h,1I2 since the non-unitary operator T ( g ) does not preserve norms. However, this would spoil the holomorphy which we are attempting to study. Instead, proceed as follows: Choose an arbitrary reference point a in G, and define the
holomorphic coherent states
As indicated by the notation, the right-hand side depends only on the coset z = gB. There is no guarantee that the denominator on the right-hand side is non-zero, but certainly the open set
(which, as indicated, depends only on the coset a = aB) contains a, and its projection Va to 2 is an open set containing a such that xf is defined for all z in V,. Hence by choosing more than one reference
3.4. Onofri’s Holomorphic G-fiames
131
point a , if necessary, we can cover 2 with patches V, on which the x;’s are defined. An arbitrary vector f E 3.t can now be expressed as a local holomorphic function
of z in V,, with transition functions
The reader may wonder where this is all leading, since we are ultimately interested in the real group G and not in G,. Here is the point: It is known (Bott [1957]) that the complex homogeneous space 2 = G,/B actually coincides with the real homogeneous space M = G / H used in Perelomov’s construction! For example, consider the Weyl-Heisenberg group: G / H is parametrized by ( x , p ) while G J B is parametrized by x - ip, and they are the same set but with the difference that the latter has gained a complex structure. The identification of M with 2 in general can be obtained by noting that as a subgroup of G,, G acts on 2 by holomorphic transformations; this action turns out to be transitive, and the isotropy subgroup at the “origin” zo = B is H, hence 2 M G/H. In other words, M inherits a complex structure from G,, and the natural action of G on M preserves this structure. Because of this, we need not deal directly with G, to reap the benefits of the complex structure. Let us therefore restrict ourselves to G. Then
3. frames and Lie Groups
132 hence
11 h, 11 = 11 hll
z i 1 and the state corresponding to h, is
I)
where d ( z , a) -21n I( ha I h, depends only on z = gH and a = aH and their complex conjugates (it is not analytic). Notice that in eq. (60), the left-hand side, hence also the right-hand side, is independent of a. Only the three individual factors on the righthand side depend on a. This becomes important if several patches are needed to cover 2, since it means that we can change the reference point without affecting the smoothness of the frame. With the above definitions, the resolution of unity derived in section 3 becomes
where we have assumed for simplicity that a single chart suffices. If more than one chart is needed, partition M as a (disjoint) union U,M,,, where each M,, is covered by a single chart. Since, by the above remark, the integrand is independent of a , the corresponding integrals I,, form a partition of unity in the sense that C,,I, = I. Therefore, the holomorphic coherent states form a tight frame which we call the holomorphic G-frame associated with the representation U and the lowest-weight vector x. In this connection, note that a choice of lowest weight actually determines the representation U up to equivalence, hence a more economical terminology would be to call the above the holomorphic G-frame associated with x. The corresponding inner product is given by
3.4. Onofri’s Holomorphic G - R m e s
133
In the case G = W , since 2 = (I everything !, can be done globally: Just one chart is needed, and the reference point a can be fixed once for all. Taking a = 1 (the identity element of G) and CY = 0, we find that the x z 7 sreduce to the canonical coherent states and
Hence in the general case, e-4 takes the place of the Gaussian weight function: it corrects for the fact that holomorphic translations d~ not preserve the norm. The action of G on the space of local holomorphic functions has a “multiplier”:
P(Z)
= 7(z,91,qP(91-1z), where z = g H , a = aH and
is holomorphic in z and anti-holomorphic in a. We have mentioned that M inherits a complex structure from G,. Actually, this is only part of the story. Let d and 8 denote the external derivatives with respect to z and 2, respectively, i.e., in local coordinates,
134
3. fiames and Lie Groups
(summation over k is implied). Consider the 2-form
where 4 is the function in the exponent of the weight function above. Theorem. (a) w is closed, i.e. dw = aw = 0, (b) w is independent of the reference point a , (c) w is invariant under the action of G, and (d) w is non-degenerate, if X is non-singular.
Proof. We prove (a), (b) and (c). For the proof of (d), see Onofri [1975].
(a) follows from the fact that 8+a = d is the total exterior derivative, hence the identity d2 = 0 implies
(c) follows from the fact that y(z,g1,6) is holomorphic in z, hence IyI2 is harmonic and 8 l y I 2 = 0. But eq. (65)shows that
which implies that the pullback g;w of w under g1 equals w , i.e. that w is invariant. (b) follows from (c) and 4 ( z , g l a ) = 4(gF1z,cx).
135
3.5. The Rotation Group
The property (b) implies that w is defined globally on 2, whereas (a) and (d) mean that w is a symplectic form (Kobayashi and Nomizu [1969]) on 2, which makes 2 a possible classical phase space. Finally, ( c ) means that the symplectic structure defined by w is G-invariant, hence G acts on 2 by canonical transformations. (Actually, the 2form w together with the complex structure define a Kiihler structure on 2 , i.e. a Hermitian metric such that the complex structure is invariant under parallel translations.)
3.5. The Rotation Group
A simple but important example of the foregoing methods is provided by their application to the three-dimensional rotation group SO(3). The resulting frame vectors are known as spin coherent states. An excellent and detailed account of this is given in Perelomov [1986]; the treatment here will be fairly brief. SO(3) is locally isomorphic to the group SU(2) of unitary unimodular 2 x 2 matrices, which we denote by G in this section. This is the set of all matrices
hence G M S3 (the unit sphere in R4)as a manifold. The Lie algebra g has a basis { J1, J2, J 3 ) satisfying [ J I ,521 = i J3 plus cyclic permutations (where, as usual, it is actually i J k which span the real algebra g) which can be conveniently given as J k = (1/2)ak in terms of the Pauli matrices a l = ( l0
1o),
0 2 = ( 9
ii),
1
0
(2)
3. frames and Lie Groups
136
Root vectors in g, are given by Jh = 51 f i J 2 , satisfying
The vectors
(J3,
J*} form a complex basis for gc, with
Unitary irreducible representations of G are characterized by a single number (highest weight) s = 0,1/2,1,3/2,. . ., with the representation space ‘H having dimensionality 2s 1. The generators Jk are represented by hermitian matrices Sk satisfying the irreducibility (Casimir) condition
+
s2 s,2+ s;
+ s:
= (1/2)(S+S-
+ s-S+) + s3”= s(s + 1).
(5)
A basis for ‘H is obtained by starting with a highest-weight vector vs, i.e.,
and applying S- repeatedly until the commutation relations imply that the resulting vector vanishes. This results (after normalization) in an orthonormal basis { u s ,~ ~ - 1. ., ,.v - ~ } satisfying
To build a homogeneous frame as in section 3.3, we use a decomposition of G in terms of Euler angles,
137
3.5. The Rotation Group
with 0
5 4 < 27r, 0 5 8 5
7r
and 0
5 11, <
47r, which gives a
corresponding decomposition of U ( g ) as
If 2s is odd, #, 8 and 1c, have the same ranges as before; if 2s is even, then 1c, + 27r and give the same operators, hence 0 5 1c, < 27r. If we choose one of the basis vectors v, as our initid vector h, then the stability subgroup is H = {g(O,O, $)) = S'. The homogeneous space G / H M S3/S1is parametrized by ( # , 8 ) , or by the unit vectors n = (cos 4 sin 8, sin 4 sin 8, cos O ) , hence G / H can be identified with
+
the unit sphere S2. Choosing the section
0:
S2 4 G as 0(4,8) =
(4,8,0), we obtain the frame vectors
The G-invariant measure on S2 is just the area measure
and the tight frame is
for some number c. To find c, take the trace of both sides and use the fact that I h n ) ( h n I is a rank-one projection operator (and hence its trace is unity). This gives 47r = c
Tr I
= c(2s
+ l),
(13)
3. frames and Lie Groups
138
hence the resolution of unity is
The overlap between frame vectors can be shown to be
hence they are orthogonal if and only if n' = -n. To construct a holomorphic frame, we must consider the complex Lie algebra g, and its Lie group G, = SL(2, (E ) of unimodular 2 x 2 complex matrices
The Lie algebra h of the subgroup H of G used above is a Cartan subalgebra of g, and 11, = a J 3 . The corresponding root-space decomposition is
and this yields a Gaussian decomposition of (almost all) elements of
G, as products of lower-triangular, diagonal and upper-triangular matrices: g ( 5 , d, [) = ecJ- e2dJSetJ+
(- d [+.
If we write N f = exp(n*), then the above decomposition is G, N - H , N + . Comparison with the original form of g gives
-
3.5. The Rotation Group a = e d9
B=edt,
7 = Ced,
6 = Ced
We call C(g), [ ( g ) and ed
139
+e-d.
= a ( g ) the Gaussian parameters of g. Thus
Remarks:
1. Matrices with a = 0, i.e. of the form
clearly do not have this decomposition. They form a 2-dimensional complex submanifold of G,, hence have (group-) measure zero. 2. Elements of implies
G are distinguished by 6 = b and 7 = -p, which
+ (()-I
ba = (1
+ tt)-1.
= (1
(22)
The Bore1 subgroup discussed in section 3.4 is B = H,N+ and consists of all matrices of the form
b=("0
')
0-1
(23)
Its cosets C-B can therefore be parametrized by E a. The unattainable matrices with Q = 0 form a single coset, corresponding to C = 00. Hence the homogeneous space G,/B is the Riemann sphere:
2
G , / B w (E U (00) w S2,
(24)
140
3. fiames and Lie Groups
in agreement with our earlier G / H = S2 but with an added complex structure, as claimed in section 3.4. The action of G, on 2 is as follows: If gCLB = (EBB,i.e.
then
That is, G, acts on 2 by Mobius transformations. We defined h, = T(g)#h, where T ( g ) was the analytic continuation of U ( g ) to G, and T(g)# = (T(g)*)-' . But
Hence the unique state left invariant by B is that corresponding to the vector of lowest weight, h = v - ~ For . this choice, we get
h, = e2dsexp( -cS+)h
e2da hc.
(28)
The holomorphic coherent states with respect to the reference point 91 E G, are then
with
3.5. The Rotation Group
141
To evaluate this, express exp(-
which gives
and
x?
= e - 2 r d 1 (1
+
hc.
(34)
<
Since a single chart covers all of Z NN S2 except for the point at = 00, just one reference point g1 is needed in this case. The simplest choice is gl = 1, the identity of G, which gives
xt = hc. With this choice, the weight function is
(35)
3. fiames and Lie Groups
142
But for g E G we have the constraint
hence
To find the invariant measure on G , / B , recall that the 2-form w = ia84 is invariant under G , and non-degenerate. Hence it defines an invariant measure, once we choose a positive orientation on G J B . An easy computation gives
Thus
for some c. Taking the trace and using
+ m2"
( he I hc ) = (1 gives, with
C = reie,
3.5. The Rotation Group
from which c = 47rs/(2s
143
+ 1). Thus we have the resolution of unity
where d2< is now Lebesgue measure on (I!, What do the functions
f(C)
= ( h c I f ) look like? Consider the
vectors
1 n!
u, = -(-S+)"
h,
n = 0,1,2, - .- .
(44)
Then uo = h , u1,. . . ,uz8 are linearly independent and ups+l = 0, 28+1 since S+ = 0. Thus
hence
m where
fn
= ( U,
= fo
+ Cfl + *.. + C 2 8 f 2 s ,
(46)
I f ). Thus, .f((c)is a polynomial of degree 5 2s in C.
How are the two sets of frame vectors h, and hc related? It turns out that n is related to 5 by stereographic projection of the 2-sphere onto the complex plane. The exact relation depends on the particular factorizations of G used to construct the frames. In the case of h,, this was the Euler angle decomposition, whereas for hc it was the Gaussian decomposition. However, there is an intrinsic way of relating the two sets of vectors, which goes as follows: Consider the functions
3. names and Lie Groups
144
which are the expectations of the observables sk in the state of he, and which correspond to the components of the classical angular momentum of a system whose only degrees of freedom are the rotational motions given by G. s k ( c ) is a quantum-mechanical version of a statistical average. Since hc = exp(-<S+)h, we have
S,(C)
=
-x d
1 4 he I hc ) (48)
- -- 2s 5 1
To find 33(5), note that
[S3, S+]
+cc‘
= S+ implies [&, S:] = nST, hence
and
+ e-@+ = - ((S+ + he.
S3hc = -(S+ hc
S3 h
S)
Hence
$(5)
= -s - <3+(5) =s
[-I.
The equator of S2 corresponds to S3 = 0, hence to ICl = 1, and the south pole (S3 = -s) and north pole ( 5 3 = s) to = 0 and ( = 00, respectively. The above equations imply that
S2(C) = S,(02 + S,(C)’ + S3(C)2 = F+(0l2 + S,(CI2
(52)
= s 2,
thus S ( C ) belongs to the 2-sphere of radius s centered at the origin. In fact, the correspondence 5 t) S ( 5 ) is a bijection if we include 5 = 00. The relation
3.6. The Harmonic Oscillator as a Contraction Limit
shows that
-C
is just the stereographic projection of
145
g(C) from the
north pole to the complex plane tangent to the south pole. We could choose s as a new independent variable ranging over the 2-sphere of radius s minus the north pole and define h, = hc, where is the unique point with S ( C ) = s. Then the vectors h, are essentially equivalent to the hn’s. Note that they are eigenvectors of the operator s S with eigenvalue s2, i.e.,
<
since the equation obviously holds for s = (0,0, -s), hence for all s by symmetry.
3.6. The Harmonic Oscillator as a Contraction Limit
In section 3.4,we gave a heuristic argument suggesting that the WeylHeisenberg group W is a “contraction limit” of SU(2). This can now be made precise and given a geometric interpretation at the representation level. Furthermore, imitating the analysis of the nonrelativistic limit of Klein-Gordon theory (next chapter), we shall gain an understanding of the relation between the Harmonic Oscillator and the canonical coherent states in the bargain. This connection between the rotation group and the harmonic oscillator has some potentially important applications in quantum field theory, which I hope to explore in the future.
3. n a m e s and Lie Groups
146
We will study the limit of the representation of G = S U ( 2 ) with spin s as s -+ 00. Since s is now a variable, we denote the representation space by 3-cs and the holomorphic coherent states by hz. The matrices representing the generators J k will be denoted by S i . By way of motivation, compare the resolution of unity on 7&,
with that on the representation space 3-c of W in terms of the canonical coherent states,
Note that if we set
and define
xz G hz, then the resolution of unity on 3-1,
becomes
If we now take the formal limit s + 00, this coincides with the resolution of unity on R,provided we can show that xl -+ x z . Our task is now (a) to find the sense in which this limit is to be taken, (b) to show that the generators S; of g go over to the generators A , A * and I of w and (c) to show that the coherent states xz go over to the canonical coherent states x r . To properly study the limit' s + 00, we will first of all imbed all the spaces 3-1, into 3-1, so that the limit may be considered within 'H.
3.6. The Harmonic Oscillator as a Contraction Limit
147
This is done most easily by using the orthonormal bases obtained by applying S$ and A* to the respective "ground states". An orthonormal basis for 'H is given by
where Ax0 = 0 and the normalization is determined by the commutation relation [A, A*] = 21. The generators A and A* act by
An orthonormal basis for 3-1, is given by
where SLh' = 0. We imbed R, into 'H by identifying w i with w, and defining Siw, = 0 for n > 2s. Then for 0 5 n 5 29,
S i w n = ( n - s)w,.
To see how the generators Si must be scaled, note that the relation between C and z implies that
where
Define
= S d / d g , so that 1i't = (I<$)*, and
3. fiames and Lie Groups
148
Then for 0 5 n 5 2s,
n(2s - n
K8wn =
+ 1)
Wn-I
n-s K,”wn = -W n . s+l
If we now take the limit s ---*
K;Wn
00
while keeping n fixed, we obtain
+ -wn,
Comparing this with the action of A and A*, we see that
as s + 00 in the weak operator topology of 3.1. (These limits are not valid in the strong operator topology since it is necessary to keep n fixed.) Thus we have shown that in the weak sense, the representation of S U ( 2 ) goes over to a representation of W as s + 00. Note that the operator
which satisfies
3.6. The Harmonic Oscillator as a Contraction Limit
149
has the weak limit
1 N = -A*A, (17) 2 which is the Hamiltonian for the harmonic oscillator (minus the ground-state energy). If we write A = X iP with X and P selfadjoint , then w-limNf
+
1 N = - (X2 P 2 - I ) . 2
+
The fact that K: + A* means that
where xz are the canonical coherent states and the convergence is in the weak topology of 3.1. We now examine the limit s + 00 from a global geometric point of view, using the coherent states. Recall that the expectation vector
ranges over the sphere of radius s centered at the origin, with the north pole corresponding to ( = 00. The transformation from S; to K i deforms this sphere to an ellipsoid,
150
3. names and Lie Groups
When s + 00, this ellipsoid splits into the two planes I?: + fl. Our weak limit K: + -I only picked out the lower plane. We could have picked out the upper plane by imbedding 'FI, into 'FI differently, namely by identifying W Z , - ~ with w, for n = 0,1,2,. . . ,2s. In that case we would have obtained the weak limits
K9
4
A*
K; A (22) K; + I . In terms of the coherent states, this corresponds to using the highestweight vector instead of the lowest-weight vector or, equivalently, using a chart centered about the north pole rather than the south pole, for example, by using as reference point the element g1 = e-"'2. The corresponding harmonic-oscillator Hamiltonian is the weak limit of the operator N$
s
- S;
= 29
- N!.
The expectation values of N i and N$,
are related by inversion:
The splitting of the ellipsoid into the two planes can be understood from the point of view of representation theory by writing the irreducibility condition in terms of the I P S :
3.6. The Harmonic Oscillator as a Contraction Limit 1 2s
+2
(KpC
151
S + KBKf;) + (K3”)2= =Is.
When s + 00, this implies formally that --+ I . The subspaces ‘Ilk on which I<$ + fI are invariant in the limit, hence the limiting representation of W is reducible. Evidently, by taking the limits in the weak topology we are able to pick out just one irreducible component at a time. There is an interesting analogy between the above analysis and the non-relat ivist ic limit of Klein-Gordon theory, which we now point out for those readers already familiar with the latter. (The nonrelativistic limit will be discussed in the next chapter.) The relation S$/(s+l)+ *I corresponds to Po/mc2 4 &I in the limit mc2 4 00, where PO is the relativistic energy operator. As will be seen in the next chapter, the Poincark group contracts, in this approximation, to a semidirect product of the 7-dimensional Weyl-Heisenberg group and the rotation group. A first-order correction is obtained by ex.. panding
Po =
N
f (mc2 + P2/2m) = f (mc2+ H ) , (27)
where H is the non-relativistic free Schrodinger Hamiltonian. Similarly, it follows from 1
- (s;s: 2
1
- (sy: 2
that
+ SdST) + (s3”)2 = s(s + 1) - S”$)
- s3”= 0
152
3. names and Lie Groups
(S3” - +)2 = (s
+ t)2- sy:,
hence formally
s; - 3 =
+
+)2
- ss ;.
from which
-
+
which corresponds to Pi*) f ( m c 2 H). The analogy between the large-spin and the non-relativistic limits can be summarized as follows: The Poincark group corresponds to S U ( 2 ) , the energy POto Si, the rest energy mc2 to -s, and the non-relativistic Hamiltonian H to the harmonic oscillator hamiltonian N . The analog of the central extension of the Galilean group (which is the contraction limit of the Poincark group, as discussed in the next chapter) is the Oscillator group (Streater [1967]),whose generators are A , A * ,I and N , with the commutation relations
[ N ,A] = -A,
[ N ,A*] = A*.
(32)
Finally, we note that the above analysis explains a well-known relationship between the harmonic oscillator and the canonical coherent states, namely that the latter evolve naturally under the harmonic oscillator dynamics. We first derive t he corresponding relation
3.6. The Harmonic Oscillator as a Contraction Limit
153
within SU(2), i.e. for finite s. Consider the behavior of h: under the “evolution operator” exp( -itNf). We wish to express =
,-itNLX;
-its
e -its,,-<S+.
ha
(33)
in terms of the coherent states x z , hence we need to display the operator on the right in the reverse Gaussian form
As explained earlier, it sufficesto do the computation on 2x2 matrices:
This gives d‘ = -it/2,
C’
e-itN*
t’ = 0.
= eitC and s
x x
Hence
- e-itae-
h
*
= e-c ’+h = h c
(36)
= XZ(t),
where z ( t ) = e i t z . (This is intuitively obvious, since exp(-its:) rotates the 1-2 plane clockwise by an angle t , hence it rotates the coordinate z counterclockwise by an angle t.) In the limit s + rn, this gives the well-known result (Henley and Thirring [1962]) that the set of canonical coherent states is invariant under the harmonic oscillator time evolution, with individual coherent states moving along the clas) by the initial conditions z = x - ip sical trajectories ~ ( tdetermined in phase space. The above shows that the same is true within SU(2),
154
3. Frames and Lie Groups
i.e. for finite s, where it is a consequence of the fact that essentially the generator of rotations about the 3-axis.
Nf is
Note: After this section was written, I learned from R. F. Streater that a related construction was made by Dyson [1956].
155
Chapter 4
COMPLEX SPACETIME
4.1. Introduction Relativistic quantum mechanics is a synthesis of special relativity and ordinary (i.e., non-relativistic) quantum mechanics. The former is based on the Lorentzian geometry of spacetime, while the latter is usually obtained from classical mechanics by a somewhat mysterious set of rules known as “quantization” in which classical observables, which are functions on phase space, suddenly become operators on a Hilbert space. Classical mechanics, in turn, can be formulated in
terms of Newtonian space-time (the Lagrangian approach) or it can be based on the symplectic geometry of phase space (see Abraham and Marsden [1978]). The latter, called the Hamiltonian approach, is usually considered to be deeper and more powerful, and its study has virtually turned modern classical mechanics into a branch of differential geometry. Yet, the standard formalism of relativistic quantum mechanics rests solely on the geometry of spacetime. Symplectic geometry, so prominent in classical mechanics, seems to have disappeared without a trace. In this chapter we develop a formulation of relativistic quantum mechanics in which symplectic geometry plays an important role. This will be done by studying the role of phase space in relativity
156
4. Complex Spacetime
and discovering its counterpart in relation to the Poincark group, which is the invariance group of Minkowskian spacetime. It turns out that the Perelomov construction fails for relativistic particles (the physically relevant representations are not square-integrable), and an alternative route must be taken. The result is a formalism based on complex spacetime which, we show, may be regarded as a relativistic extension of classical phase space. As a by-product, two long-standing inconsistencies of relativistic quantum mechanics (in its standard spacetime formulation) are resolved, namely the problems of localization and covariant probabilistic interpret ation. Rather than being sharply localizable in space (which leads to conflicts with causality; see Newton and Wigner [1949] and Hegedeldt [1985]),particles in the new formulation are at best softly localizable in phase space. This is just a covariant version of the situation in the coherentstate representation. But whereas for non-relativistic particles both the Schr6dinger represent ation and the coherent-state represent at ion give equally consistent theories, the spacetime formulation of relativistic quantum mechanics is inconsistent because it lacks a genuine probabilistic interpretation, a situation remedied by the phase-space formulation (section 4.5). 4.2. Relativity, Phase Space and Quantization At first it appears that phase space and spacetime are mutually exclusive: The phase-space coordinates of a particle are in one-to-one correspondence with the initial conditions which determine its classical motion, i.e. its worldline. Hence the phase space can be identified with either the set of all initial conditions or the set of worldlines of the classical particle. In either case, time is treated differently from space: For initial conditions, an arbitrary “initial” time is chosen;
157
4.2. Relativity, Phase Space and Quantization
for world-lines, the dynamical “flow” is factored out. Furthermore, locality in spacetime is lost in either case. In this section we confine ourselves to the physical case s = 3, i.e., three-dimensional space. Our approach will be to leave spacetime intact and, instead, consider its group of symmetries, the Poincarb group, which is defined as follows: Let u be a four-vector. We denote its time component (with respect to an arbitrary reference frame) by uo and its space components by u = (ul,u 2 ,u 3 ) . The invariant Lorentzian inner product of two such vectors is defined as (u,v ) G uv
G
c2uovo- u ’ v
= gpLYu~vU.
(1)
Later we will set c = 1 (which amounts to measuring time as the distance traveled by light), but for the present it is important to include c, since the non-relativistic limit c + 00 will be considered. The Lorentz group L is the set of all linear transformations A : R4 + IR4 which leave the inner product invariant:
(Au,Av) = (u,v) Vu,v E R4.
(2)
C includes transformations which reverse the orientation of time and ones which reverse the orientation of space. Such space- and time reflections split C into four connected components. The component connected to the identity (whose elements reverse neither the orientations of time nor of space) is called the restricted Lorentz group and denoted by Lo. If we let u4 f cuo,so that uv = - u . v u4v4, we can identify LOwith S0(3,1)+(the plus sign indicates that the orientation of time, hence also of space, is preserved separately.) The Poincar6 group P is defined as the set of all Lorentz transformations combined with spacetime translations:
+
4. Complex Spacetime
158
P
= { ( a ,A)JAE L,a E lR4}
(3)
where ( a , A) acts on R4 by
(.,A).
= hu
+ a.
(4)
We will be dealing with the restricted Poincari group PO where A E Lo. The reason for our interest in Po is that it parametrizes all reference frames which can be obtained from a given reference frame by a continuous motion. (By “motion” we mean any spacetime translation, rotation or boost, including physically impossible “motions” such as space-like translations and translations backwards in time.) Now to specify a reference frame (.,A) relative to some fixed reference frame (located, say, at the origin in spacetime), we must give its origin (namely, a ) , its velocity and its spatial orientation (all relative to the fixed frame). Thus if we ignore the spatial orientation by factoring out the rotation subgroup SO(3) of Li , the resulting seven-dimensional homogeneous space
can be identified with the set of positions in spacetime (events) together with all possible velocities at these events. The set of all (future-pointing) four-velocities is a hyperboloid
Thus
c M R4 x @,
(7)
4.2. Relativity, Phase Space and Quantization
159
which is an extension of classical phase space obtained by including time along with the space coordinates. Such an object is usually called a state space. Strictly speaking, C is an extended velocity phase space rather than an extended momentum phase space; this appears to be the price for retaining locality in spacetime. What matters for us is that it will have the required symplectic (more precisely, contact) structure. from its construction, it is clear that C is a relativistically covariant object; it is not invariant since the choice of SO(3) in
Po
is frame-dependent. We will see that C combines the geometries of spacetime and phase space in a natural way. Thus, rather than conflicting with relativity, the concept of phase space actually follows from it: Appending time to the geometry of space means appending velocity-changing transformations (which are just rotations in a space-time plane) to the group of rigid motions, hence the enlarged group contains velocity coordinates in addition to space coordinates. In a sense, PO itself is actually a “super phase space” since it furthermore contains information on the spatial orientation, which is needed to include spin degrees of freedom along with the translational degrees of freedom. Po even has a natural generalization to the case of curved spacetime, namely the frame bundle of all “orthonormal” frames (with respect to the given curved metric) at all possible events. (For the definition and study of frame bundles, see Kobayashi and Nomizu [1963,19691.) In our review of canonical coherent states (chapter 1), we saw that the classical phase space resulted from the Weyl-Heisenberg group W , which, unlike PO was not a symmetry group of the theory but merely a Lie group generated by the fundamental dynamical observables of position and momentum at a fixed time. We will now see that W is related to the non-relativistic limit of POin two distinct ways: as a normal subgroup, and as a homogeneous space. This in-
160
4. Complex Spacetime
sight will play a key role in generalizing the canonical coherent states to the relativistic case. It turns out that the role of W as a group has no relativistic counterpart, whereas its role as a homogeneous space does: its relativistic generalization is C. Consider the Lie algebra p of PO, which is spanned by the generators Pr, of spatial translations, POof time translations, Jk of spatial rotations and I
= iJ1 [Po, K,.] = iP,. [ K j , Kk] = - i ~ -Ji~ [ J j ,J k ]
iK1 [ J j , P k ] = iP1 [Pr,I<,] = i~-~6,.,Po [ J j ,Kk] =
(8)
where ( j , k , Z ) is a cyclic permutation of (1,2,3), r,s = 1 , 2 , 3 and all unspecified brackets vanish. The physical dimensions of these generators are as follows: PO is a reciprocal time, Pk is a reciprocal lenght, Jk is dimensionless (reciprocal angle) and K I , is a reciprocal velocity. Notice that so far nothing has been said about quantum mechanics. p is simply the Lie algebra of the infinitesimal motions of Minkowskian spacetime, a classical concept. (The unit imaginary i in J k with -iJk and eq (9) can be removed by replacing P,, with -PP,
K k with - i K k ; i is included because we anticipate that in quantum mechanics these generators become self-adjoint operators.) Quantum mechanics is now introduced through the following postulate:
The formalism of relativistic quantum theory is based on a unitary (though possibly reducible) representation of PO.
(Q).
That is, the representation provides the quantum-mechanical Hilbert space, and the generators of p, which by unitarity are represented by self-adjoint operators, are interpreted as the basic physical observ-
4.2. Relativity, Phase Space and Quantization
161
ables: PO as the energy, PI,as the momentum and J k as the angular momentum. One may then consider perturbations by introducing interactions or gauge fields. In fact, the assumption (&)is very general in scope; it serves as one of the axioms in axiomatic quantum field theory (Streater and Wightman [1964]). Unlike the usual prescription of “quantization” in non-relativistic quantum mechanics, (Q) is both mathematically and physically unambiguous. Yet, (Q) implies and, at the same time, supercedes the canonical commutation relations! To see this, consider the non-relativistic limit c + 00 of g. Letting
we see that in the limit c + 00, p “contracts” to a Lie algebra with generators M , Pk,JI,,KI, satisfying
gl
and all other brackets vanishing. Note that (a) M is a central element of gl and (b) M , PI, and KI, span an invariant subalgebra w of gl which is isomorphic to the Weyl-Heisenberg algebra, with A4 playing the role of the central element E. Hence if 91 denotes the connected, simply connected Lie group generated by gl, then the invariant subgroup of 61 generated by w is isomorphic to the Weyl-Heisenberg group W . The remaining generators JI,of gl span the Lie algebra so(3) of the spatial rotation group SO(3), so 61 is the semi-direct product of W with SO(3) :
91 = W@SO(3).
162
4. Complex Spacetime Now suppose that the unitary representation of
P+T in assump-
tion (Q) is irreducible. [Assumption (Q) means that we are dealing with a general quantum system, possibly a system of interacting particles or even quantum fields; it is the additional assumption of irreducibility which makes this system elementary, roughly a single particle. Hence the concept of position, discussed below, is only now admissible.] Assuming that the formal limit c + 00 of Lie algebras rigorously induces a corresponding limit at the representation level (and this is indeed the case, as will be shown later), assumption (Q) implies that we have a unitary irreducible representation of
61 in that
limit. Then M,Pk and I<&are represented by self-adjoint operators on some Hilbert space, which we will denote by the same symbols. Irreducibility implies that the central element M has the form m l , where m is a real number and I denotes the identity operator. Assume m > 0 (this is physically necessary since m will be interpreted as a mass) and let X k
= --(l/m)&.
Then eq. (10) shows that xk and tion relations:
Pk satisfy the canonical commuta-
thus X k behaves like a position operator. This shows that the assumption (Q), which is conceptually simple, mat hematically precise, relativistically invariant and very general, actually implies the much less satisfactory “quantization” prescription in the non-relativistic limit, under the additional assumption of irreducibility. How does it happen that classical relativistic geometry, as rep-
4.2. Relativity, Phase Space and Quantization
163
resented by ’Po, when combined with assumption (Q), yields the mysterious canonical commutation relations? To understand this, note that eq. (10) came from the relativistic Lie bracket
which states that boosting (accelerating) in any given spatial direction does not commute with translating in the same direction. This, in turn, is a consequence of the fact that Einsteinian space is not absolute since in the accelerated frame there is a (Lorentz) contraction in the direction of motion, so first translating and then boosting is not the same as first boosting and then translating. In the non-relativistic limit this gives the canonical commutation relations. No such easy derivation of these relations would have been possible without invoking Relativity. For had we begun with Newtonian space-time, the appropriate invariance group would have been not PO but the Galilean group 6 . Since Newtonian space is absolute and hence unaffected by boosting to a moving frame, the Galilean boosts K k ‘ commute with with the Galilean generators of translations Pk‘, hence yield no canonical commutation relations and no associated uncertainty principle. In the case of Q1, the canonical commutation relations are a remnant of relativistic invariance. Thus the uncertainty principle originates, in some sense, in’ “classicaJ” Relativity theory! Eq. (12) states that an acceptable set of position operators for a non-relativistic particle is given in terms of the generators of Galilean boosts (more precisely, the boosts of a central extension of the Galilean group, as explained below). It is interesting to see how this comes about from a more intuitive, physical point of view, since position operators are problematic in relativistic quantum mechanics
164
4. Complex Spacetime
(as will be discussed later) but the boosts have natural relativistic counterparts. To gain insight, we will now give two additional rough but intuitive arguments for the validity of eq. (12). 1. For a spinless particle, the generators of the Poincark group can be realized as operators on a space of functions over spacetime (namely, the space of solutions of the Klein-Gordon equation) by
where (k,1, m ) is a cyclic permutation of (1,2,3). In the non-relativistic limit, Po -,mc2, so
which displays -( l / m ) K k as the operator of multiplication by X k (the usual non-relativistic position operator) minus the distance the particle has traveled in time t = 20 while going at a velocity v = P / m . This is just the initial position of the particle at time t = 0; that is, the non-relativistic limit of -( l/m)I
state space C R R4x :2C and considers the non-relativistic limit. As c + 00, the hyperboloid :2C flattens out to a (three-dimensional) hyperplane at infinity, the non-relativistic velocity space Y x R3.
4.2. Relativity, Phase Space and Quantization
165
In that limit, the boosts K k commute and become the generators of translations in V . If the Cartesian coordinates of V are V k , then
Now the velocities V k are related to the momenta p k by where m is the mass. Thus in the limit
Vk
=p k / m ,
But in the momentum representation of non-relativistic quantum mechanics , the position operators are represented by x k
= i-,
a
OPk
which again gives agreement with eq. (12). Now that we have an acceptable relativistic generalization of the classical phase space, let us return to our goal of extending the canonical coherent-st ate represent ation to relativistic particles. We have seen that the Weyl-Heisenberg group, on which this representation is based, is isomorphic to an invariant subgroup of the non-relativistic limit of Pi. However, this subgroup is not the non-relativistic limit of any subgroup of not close due to
PO,since the Lie brackets of
Kk,Pk
and
PO do
That is, we cannot simply generalize the canonical coherent states by choosing the right subgroup of PO.However, eq. (11) shows that W also plays another role in the group 81, namely as a homogeneous space:
4. Complex Spacetime
166
In this form, it does have an obvious relativistic counterpart, namely
C. In retrospect, the role of W as a group is purely incidental to quantum mechanics
, since there is no fundamental reason why
the
set of all dynamical states should form a group. On the other hand, this set should certainly be a homogeneous space under the group of motions, since this group must transform dynamical states into one another and this action can be assumed to be transitive (otherwise we may as well restrict ourselves to an orbit). In either of its roles relative to 91, W acquires a slightly different physical interpretation from the one it had in relation to the canonical commutation relations: Since the groupmanifold of W is generated by the vector fields and M , the coordinates on W are the positions zk (generated by Pk),the velocities v k (generated by Kk) and the variable Pk, K k
generated by M , which is a degenerate form of “time” inherited form Relativity through the limit c - ~ P o+ M . By comparison, the coordinates on the original Weyl-Heisenberg group were X k (generated by
Pk),
the momentum pr: (generated by
X k )
and a dimensionless
“phase angle” q5 generated by the identity operator. With this new interpretation, W is the product of velocity phase space with “time” and truly represents the non-relativistic limit of C. We will construct a coherent-state representation of
Po by first
discovering its non-relativistic limit. This limit will be a representation of a quantum mechanical version
62 of the Galilean group 6
and will be seen to be a close relative of the canonical coherent-state representation of W . It is therefore necessary first to understand just
6 is related to the PoincarC group ’Po and its nonrelativistic limit 61.Again, we will do everything at the level of Lie how the group
4.2. Relativity, Phase Space and Quantization
167
algebras. There is no problem with globalization. The universal enveloping algebra of g contains the mass-squared operator
which is a Casimir operator, i.e. commutes with all generators in p. Assuming that both Po and M are represented by positive operators and that M is invertible (and this will be the case for massive particles), we have formally for large c:
+
c - ~ P=~p2
-2
P2 )1/2 -M
+ p 2 / m C+2 o ( ~ - ~ ) ,(23)
where we have used the fact that M commutes with Pk. The operator
H = P2/2M
(24)
is just the Hamiltonian for the non-relativistic free particle, hence
generates its time translations and must be included in the Lie algebra of the Galilean group. Let us therefore try to append it to the Lie algebra gl of 91. Indeed, by eq. (lo),
[H, pk] = O [H, Jk] =0
[ H , M ]= 0 [ H ,Kk] = iPk,
(25)
showing that
actually forms a Lie algebra with Lie brackets given by eqs. (10) and (25). Clearly g2 contains gl as a subalgebra. We will refer to
168
4. Complex Spacetime
the corresponding Lie group 9 2 as the quantum mechanical Galilean group. It is the group of translations, rotations, boosts, dynamics (i.e., time translations) and multiplications by constant phase factors (generated by M ) acting on the wave functions of a non-relativistic quantum mechanical particle. The relation of 6 2 to the classical Galilean group 6 is as follows: The subgroup generated by M , which can be identified with the group of real numbers IR,is central; then Q is obtained from 6 2 by factoring out IR:
9 = 62/R. The action of
IR on
(27)
quantum mechanical wave functions, which a-
mounts to a multiplication by a constant phase factor, is a necessary part of
92
because of [Pr,KS]= i&,M which, as we have seen, is
related to the uncertainty principle. Factoring out this action means ignoring that phase factor, so it is reasonable that it should give the classical Galilean group. On the Lie algebra level, it amounts to setting M = 0. Had we included Planck’s constant fi in eq. (lo), this would have amounted to taking the classical limit fi relation between IR,
62
and
9 in
The above
an example of a central extension
(Varadarajan [1969]). One says that by
+ 0.
62
is a central extension of IR
6. The fact that W is a subgroup of
62
was noted by Bargmann
are contractions of representations of the Poincark group was shown by Mackey [1955].*
[1954];that representations of
*
92
I thank R. F. Streater for these remarks.
4.3. Galilean fiaxnes
169
4.3. Galilean Frames Our object in this section is to construct coherent states which are naturally associated with free non-relativistic particles, just as the canonical coherent states are associated with the Weyl-Heisenberg group or the Harmonic oscillator and the spin coherent states are associated with with SU(2). An obvious starting point would be to apply the Klauder-Perelomov method (chapter 1) to the quantummechanical Galilean group &, since it is this group which describes such particles. However, this method fails, due to the fact that all the representations of physical interest are not square-integrable. Therefore we will follow a more pedestrian route. Our main guides will be analyticity (which, it turns out, follows from an important physical condition) and “physical intuition.” We return to the general case where the configuration space is R”instead of R3. For simplicity, we restrict ourselves to spinless particles. It is not difficult to include spin, &s will be shown later. The states of such particles are described by complex-valued wave functions f(x,t ) of position x and time t which are me square-integrable with respect to x at any time t . Their evolution in time is given by the Schrodinger equation
.af = Hf,
a%
where
H=--
2m
is the Hamiltonian operator, and A is the Laplacian acting on L2(IR”).Since H is self-adjoint, though unbounded, the solutions are given through the unitary one-parameter group V ( t )= exp(-itH):
170
4. Complex Spacetime
where it is assumed that f(x,O), hence also its Fourier transform
f"(p),is in L~(IR~). The key to our approach will be to note that H is a non-negative operator, hence the evolution group U ( t ) can be analytically continued to the lower-half complex time plane a- as - e U ( t - iu) = exp[-i(t - iu)H] -
-itHe-UH
,
u
> 0.
(4)
(Note also that since H is unbounded, no analytic continuation to the upper-half time plane is possible.) The operator e-uH is familiar from two other contexts: it constitutes the evolution semigroup for the heat equation (where u is time), and it is also the unnormalized density matrix for the Gibbs canonical ensemble describing the statistical equilibrium of a quantum system at temperature
T (where
u = l/kT). To get afeel for our use of this operator, let us be heuristic for a moment and consider what happens when a free classical free particle of mass m is evolved in complex time
T
=t
- iu. If its
initial position and momentum are x and p respectively, then its new position will be z(7) = x
+ ( t - iu)p/m
= (x
+ @/m)- i(u/m)p
(5)
= x(t) - i(u/m)p.
Since x ( t ) is just the position evolved in real time t , we see that z ( t ) is, in fact, a complex phase space coordinate of the same type
4.3. Galilean F'rames
171
we encountered in the construction of the canonical coherent states! Armed with this intuition, let us now return to quantum mechanics and see if this idea has a quantum mechanical counterpart. The operator e+"', when applied to any function in L2(IR.'), gives
If we replace x in the integrand by an arbitrary z E a?, the integral still converges absolutely since the quadratic terni in the exponent dominates the linear term for large lpl. Clearly the resulting function is entire in z (differentiating the integrand with respect to z k still gives an absolutely convergent integral). This shows that the group of Galilean space-time translations, ~ ( x , t=) e x p ( - i t i ~ + i x . ~ ) ,
(7)
extends analytically to a semigroup of complex sp tions
-tim t ransla-
U ( E 5, ) = exp(-iTH
+ i~ - P)
defined over the complex space-time domain 2, =
{ ( Z , T ) 1z E CS,T E c-}.
(9)
This translation semigroup can be combined with the rotations and boosts to give an analytic semigroup Si extending G2. Let H ' , be the vector space of all the entire functions fu(z) as f ( p ) runs through L2(IRd). Then
172
4. Complex Spacetime
are seen to be Gaussian wave packets in momentum space with expected position and momentum given in terms of z EE x - iy b y
The e:'s are easily shown to have minimal uncertainty products. The momentum uncertainty can be read off directly from the exponent and is
hence
Axk = J-. We now have our prospective coherent states and their label space M = Ca. To construct a coherent-state representation, we need a measure on M which will make the e:'s into a frame. Since the er's are Gaussian, the measure in not difficult to find:
d p u ( z ) = ( m / n ~ ) "exp / ~( - m y 2 / u ) daxd"y. Defining
(15)
4.3. Galilean fiames
173
we have T h e o r e m 4.1. (a) ( I is an inner product on 3.1, under which 'FI, is a Hilbert space. (b) The map e+"' is unitary from L2(IR")onto F ' I,. (c) The e: 's define a resolution of unity on L2(IR")given by
-
e ) ~ , ,
(17)
=
Proof. We prove prove that 11 f (f I f ) ~ = , , I I f I I i 2 . The inner product can be recovered by polarization. To begin with, assume that f" is in the Schwartz space S(R")of rapidly decreasing smooth test functions. Then
hence by Plancherel's theorem,
and
4. Complex Spacetime
174
where exchanging the order of integration was justified since the integrals are absolutely convergent. This proves (b), hence also (a), for f E S(IR").Since the latter space is dense in L2(IRs),the proof extends to
f E L2(Rs)by continuity. (c) follows by noting that
and dropping ( f^ I and
14 ).
I
Using the map e W u Hwe , can transfer any structure from L2(IRs) to 3-1,. In particular, time evolution is given by
where
T
= t - iu and the wave packets
are obtained from the ei's by evolving in real time t. They cannot be of minimal uncertainty since the free-particle Schrodinger equation is neccessarily dissipative. Instead, they give the following expectations and uncertainties:
4.3. Galilean Ehtnes
175
Since
it follows that
thus we have a frame { es,r 1 z E Cs} at each complex “instant” r = t - iu, with the corresponding resolution of unity
The space L2(R”)carries a representation of the quantum mechanical Galilean group 62. Since the e5,,’s were obtained from the dynamics associated with this group, they transform naturally under its action. A typical element of & has the form g = (R,v,xo,to,B), where R is a rotation, v is a boost, xg is a spatial translation, t o is a time-translation and 8 is the “phase” parameter associated with the central element M = m / h m in our representation (see section 4.2). g acts on the complex space-time domain 2, by sending the point (2, T ) to ( T I , e l ) , where
176
4. Complex Spacetime XI
= R x + t v + xo
y’=Ry+uv
t’ = t + t o uI = 21. The parameter 6 has no effect on space-time; it only acts on wave functions by multiplying them by a phase factor. The representation of 92 is defined by
Thus we have
and the e,,,’s are “projectively covariant” under the action of 9 2 ; if we define ee,,,4 e --im4e,,r, then this expanded set is invariant under the action of 92, with 4’ = 4-6. Since e,,, and e,,,,,j, represent the same physical state, we won’t be fussy and just work with the e,,,’s. Anyway, this anomaly will disappear when we construct the corresponding relativistic coherent states. The above representation of & on L2(IR”)can be transfered to
N uusing the map e-uH.
This map therefore intertwines (see Gelfand, Graev and Vilenkin [1966])the representations on 92 on L2(R”)with
the one on Xu. We conclude with some general remarks.
1. Since the e:’s are spherical and therefore invariant under S O ( n ) (which is, after all, why they describe spinless particles), they can be parametrized by the homogeneous space W = Q1 /SO( n) as long as we keep u fixed ( u is a parameter associated with the Hamiltonian, which
177
4.3. Galilean frames
is a generator of 6 2 but not of 61). The action of W as a subgroup of 61 on the e:)s is preserved in passing to the homogeneous space, hence W acts to translate these vectors in phase space. This explains the similarity between the ei's and the canonical coherent states. On the other hand, dynamics (in imaginary time) is responsible for the parameter u. If we write k G ( m / u ) y ,then e:(p)
U
= (Zr)-"exp -k2
[2m
G
21
- -(p
exp [u k2/2m]e--iP'x h
2m
- k)2- ip - x ]
(JS) *
The measure dp,(z) is now
d p u ( x ,k) = ( ~ / n m ) exp(-uk2/rn) "/~ d"xd"k.
(32)
Hence the exponential factor exp[uk2/2m] in e,U, when squared in the reconstruction formula, precisely cancels the Gaussian weight factor in dpu(x,k), leaving the measure
in phase space. It follows from the above form of ez that 2 A q = ,/- plays the role of a scale factor in momentum space (as used in the wavelet transforms of chapter l), hence its reciprocal Axk = ,/* acts as a scale factor in configuration space. Thus the Galilean coherent states combine the properties of rigid "windows" with those of wavelets, due to the fact that their analytic semigroup @ includes j both phase-space translations and scaling, the latter due to the heat operator e - u H . However, note that u is constant, though arbitrary, in the resolution of unity and the corresponding reconstruction formula. Since there is an abundance of "wavelets" due
178
4. Complex Spacetime
to translations in phase space, only a single scale is needed for reconstruction. (One could, of course, include a range of scales by integrating over u with a weight function, but this seems unnecessary.) In the treatment of relativistic particles, u becomes the time component of a four-vector y = ( u ,y ) , hence will no longer be constant. This is because relativistic windows shrink in the direction of motion, due to Lorentz contractions, thus automatically adjusting to the analysis of high-frequency components of the spectrum.
2. Notice that ef is essentially the heat operator e-uH applied to the &function at z, then analytically continued to Z = x i y . The fact that all the ef’s have minimal uncertainties shows that the action of the heat semigroup {U(-iu)} is such that while the position undergoes the normal diffusion, the momentum undergoes the opposite process of refinement, in just such a way that the product of the two variances remains constant. This is reflected in the fact that the is unbounded, becomes unioperator e - u H , whose inverse in L2(IR”) tary when the functions in its range get analytically continued, and the reconstruction formula is just a way of inverting e - - u H . Hence no information is lost if one looks in phase space rather than configuration space! It seems to me that this way of “inverting” semigroups
+
must be an example of a general process. If such a process exists, I am unaware of it. In our case, at least, it appears to be possible because of analyticity.
3. So far, it seems that coherent-state representations are intimately connected with groups and their representations. However, there is a reasonable chance that coherent-state representations similar to the above can be constiucted for systems which, unlike free particles, do not possess a great deal of symmetry. Suppose we are
4.3. Galilean frames
179
given a system of 9/3 particles in R3 which interact with one another and/or with an external source through a potential V(x). We assume that V(x) is timeindependent, so the system is conservative. (This means that we do have one symmetry, namely under time translations. If, moreover, the potential depends only on the differences xi - xj between individual particles, we also have symmetry with respect to translations of the center of mass of the entire system; but we do not make this assumption here.) This system is then described by a Schrodinger equation with the Hamiltonian operator
H = Ho + V, where Ho is the free Hamiltonian and V is the operator of multiplication by V(x). We need to assume that this (unbounded) How operator can be extended to a self-adjoint operator on L2(RB). far can the above construction be carried in this case? The key to our method was the positivity of the free Hamiltonian HO= P2/2m. But a general Hamiltonian must at least satisfy the stability condition:
(S)
The spectrum of H is bounded below.
If H fails to meet this condition, then the system it describes is unstable, and a small perturbation can make it cascade down, giving off an infinite amount of energy. For a stable system, the evolution group ~ ( t=) eWitH can be analytically continued to an analytic semigroup U ( T ) = e-irH in the lower-half complex time plane as in the free case. Depending on the strength of the potential, the functions fu = U ( - i u ) f may be continued to some subset of C8. Formally, this corresponds to defining
for an initial function f(x) in L2(RS).As mentioned, this expression is formal since the operator is unbounded and e - i r H f may not
180
4. Complex Spacetime
be in its domain. But it can make sense operating on the range of e - i r H , which coincides with the range of e--rH, provided y is not too large. Let y, be the set of all y's for which e"J' is defined on the range of e-rH and, furthermore, the function exp[iz P] x exp [ - i 7 H ] f is sufficiently regular to be evaluated at the origin in IR", no matter which initial f was chosen in L2(IR"). For many potentials, of course, y , will consist of the origin alone; in that case there are no coherent states. We assume that Y, contains at least some open neighborhood of the origin. Intuitively, we may think of y , as the set of all imaginary positions which can be attained by the particle in an imaginary time-interval u, while moving in the potential V . In the free case, y, = R"and there is no restriction on y provided only that u > 0. This corresponds to the fact that there is no "speed limit" for free non-relativistic free particles, hence a particle can get to any imaginary position in a given positive imaginary time. For relativistic free particles, Y , is the open sphere of radius uc, where c is the speed of light. Returning to our system of interacting particles, define the associated complex space-time domain
-
This is the set of all complex space-time points which can be reached by the system in the presence of the potential V(x), and it is the label space for our prospective coherent states. These are now defined as evaluation maps on the space of analytically continued solutions:
the inner product being in L2(Rs).Then from the above expression, again formally, we have the dynarnical coherent states
4.3. Galilean fiames
181
for ( z , ~ in ) ZH. What is still missing, of course, is the measure d p f . (Since the potential is t-independent, so will be the measure, if it exists.) Finding the measure promises to be equally difficult to finding the propagator for the dynamics. The latter is closely related to the reproducing kernel,
K H ( z T, ; z',
H
I
H
7 ' ) = ( e5,7 e5f,rt).
(38)
h ' ~depends on T and 7' only through the difference T - ?', and is the analytic continuation of the propagator to the domain ZH x ZH. It is related to the measure through the reproducing property,
J..
dp,H(z) K H ( z 'T, ' ; E , 7 ) K H ( z ,T ; B", 7")
(39) = K H ( d , 7';z",?"),
where the integration is carried out over a "phase space" ur in ZH with a fixed value of T = t - iu. A reasonable candidate for dp: (see section 4.4)is
Rather than finding the measure explicitly, a more likely possibility is that its existence can be proved by functional-analytic methods for some classes of potentials and approximation techniques may be used to estimate it or at least derive some of its properties. The theoretical possibility that such a measure exists raises the prospect of an interesting analogy between the quantum mechanics of a single system and a statistical ensemble of corresponding classical systems at
182
4. Complex Spacetime
equilibrium with a heat reservoir. In the case of a free particle, if we set k = ( m / u ) y as above (see remark 1) and define T by u = 1/2kT where k is Boltzmann’s constant, then it so happens that our measure dpu is identical to the Gibbs measure for a classical canonical ensemble (see Thirring [1980]) of s/3 free particles of mass m in
IR3, at equilibrium with a heat reservoir at absolute temperature T. Thus, integrating with dpu over phase space is very much like taking the classical thermodynamic average at equilibrium! It remains to be seen, of course, whether this is a mere coincidence or if it has a generalization to interacting systems. There is also a connection between the expectation values of an operator A in the coherent states e,H[, and its thermal average in the Gibbs state,
(A)P where 2
2-l Trace ( e - P H A ) = 2-’ Trace ( e - P H / 2 A e - B H / 2
),
= Trace (e-PH>.
(41) Namely, if we have the resolution of unity
then
= 2-’ S,,
dp,H(z) A ( 2 , T
where we have used the formula
- iP/2),
4.4.Relativistic fiames
TraceA=
J..
dpf(z)(e&IAe&),
183
(44)
which follows easily from eq. (42). Thus taking the thermal average means shifting the imaginary part u by p / 2 in the integral.
4.4. Relativistic Frames We are at last ready to embark on the main theme of this book: A new synthesis of Relativity and quantum mechanics through the geometry of complex spacetime. The main tool for this synthesis will be the physically necessary condition that the energy operator of the total system be non-negative, also known in quantum field theory as the spectral condition. The (unique) relativistically covariant statement of this condition gives rise to a canonical complexfication of spacetime which embodies in its geometry the structure of quantum mechanics as well as that of Special Relativity. The complex spacetime also has the structure of a classical phase space underlying the quant um system under consideration. Quantum physics is developed through the construction of frames labeled by the complex spacetime manifold, which thus forms a natural bridge between the classical and quantum aspects of the system. It is hoped that this marriage, once fully developed, will survive the transition from Special to General Relativity. As mentioned at the beginning of this chapter, the Perelomovtype constructions of chapter 3 do not apply directly to the Poincark group since its time evolution (dynamics) is non-trivial. Pending a generalization of these methods to dynamical groups, we merely use the ideas of chapter 3 for inspiration rather than substance. In fact,
184
4. Complex Spacetime
it may well be that a closer examination of the construction to be developed here may suggest such a generalization. We begin with the most basic object of relativistic quantum mechanics, the Klein-Gordon equation, which describes a simple relativistic particle in the same way that the Schrodinger equation describes a non-relativistic particle. The spectral condition will enable us to analytically continue the solutions of this equation to complex spacetime, and the evaluation maps on the space of these analytic solutions will be bounded linear functionals, giving rise to a reproducing kernel as in section 1.4. Physically, the evaluation maps are optimal wave packets, or coherent states, and it is this interpretation which establishes the underlying complex manifold as an extension of classical phase space. The next step is to build frames of such coherent states. (Recall from section 1.4 that a frame determines a reproducing kernel, but not vice versa.) The coherent states we are about to construct are covariant under the restricted Poincark group, hence they represent relativistic wave packets
. As we have seen, such a covariant family is closely re-
lated to a unitary irreducible representation of the appropriate group, in this case
Po. Such representations
are called elementary systems,
and correspond roughly to the classical notion of particles, though with a definite quantum flavor. (For example, physical considerations prohibit them from being localized at a point in space, as will be discussed later.) We will focus on representations corresponding to massive particles. (A phase-space formalism for massless particles would be of great interest, but to my knowledge, no satisfactory formulation exists as yet.) Such representations are characterized by two parameters, the mass m > 0 and the spin j = 0,1/2,1,3/2,. . . of the corresponding particle. w e will specialize to spinless particles ( j = 0) for simplicity. The extension of our construction to particles
4.4. Relativistic fiames
185
with spin is not difficult and will be taken up later. Thus we are interested in the (unique, up to equivalence) representation of POwith m > 0 and j = 0. A natural way to construct this representation is to consider the space of solutions of the Klein-Gordon equation
where
= a%$
is the Del'Ambertian, or wave operator, A is the usual spatial Lapla-
cian and a,, = d/Bz". The function f is to be complex-valued (for spin j , f is valued in C2j+'). We set c = 1 except as needed for future reference. If we write f(z) as a Fourier transform,
then the Klein-Gordon equation requires that f ( p ) be a distribution supported on the mass shell
Qm
is a two-sheeted hyperboloid,
where
4. Complex Spacetime
186
f(p) = 2 n S(p2 - m 2 )a @ ) for some function a ( p ) on
(7)
am,and using
6(P2 - m2)= 6 ((Po - w)(po + w ) ) 1
=2w P ( p 0
- w ) + 6(po + 41,
we get
where
is the unique (up to a constant factor) Lorentz-invariant measure (The factor w - l corrects for Lorentz contraction in frames on 52,. at momentum p . ) For physical particles, we must require that the energy be positive, i.e. that a ( p ) = 0 on 52;. states are given as positive-energy solutions,
f(z) =
LA
d@ e-izp a ( p ) .
Hence the physical
(11)
The function a ( p ) can now be related to the initial data by setting zo
so
t = 0, which shows that
4.4. Relativistic frames
187
4 4 = 4 4 P ) = %fo(P),
(13)
where denotes the spatial Fourier transform. In particular, f(z) is determined by its values on the Cauchy surface t = 0. For general solutions of the Klein-Gordon equation, we would also need to specify on that surface, but restricting ourselves to positive-energy solutions means that f(x) actually satisfies the first-order pseudodifferential non-local equation ---. A
af/&
(which implies the Klein-Gordon equation), hence only f(x,0) is necessary to determine f. (We will see that when analytically continued to complex spacetime, positiveenergy solutions have a local characterization.) The inner product on the space of positive-energy solutions is defined using the Poincarkinvariant norm in momentum space,
We will refer to the Hilbert space
L:(dp")
= {a E L2(dp")I a ( p ) = 0 on a,}
as the space of positive-energy solutions in the momentum representation. It carries a unitary irreducible representation of Po defined as follows. The natural action of Po on spacetime is
(b, A)a: = Ax
+ b,
(17)
where A is a resticted Lorentz transformation (A E Lo ) and b is a spacetime translation. Since the Klein-Gordon equation is invariant
188
4. Complex Spacetime
under ’PO, the induced action on functions over spacetime transforms solutions to solutions. Since the positivity of the energy is also invariant under ’Po, the subspace of positiveenergy solutions is also left invariant.
Po
acts on solutions by
( U ( 4 4 . f ) (4= f (A%
- b)) *
(18)
The invariance of the inner product on L:(dj) then implies that the induced action on that space (which we denote by the same operator) is
( U ( b ,A) a ) ( p ) = eibpa (A-’p)
.
(19)
The invariance of the measure dj3 then shows that V (b, A) is unitary, thus (b,A) H U(b,A) is a unitary representation of ’Po. It can be shown that it is, furthermore, irreducible. Neither of the “function” spaces {f(z)} and L?(d$) are reproducing-kernel Hilbert spaces, since the evaluation maps f H f(x) and a H a ( p ) are unbounded. To obtain a space with bounded evaluation maps, we proceed as in the last section. Due to the positivity of the energy, solutions can be continued analytically to the lowerhalf time plane:
where u > 0. As in the non-relativistic case, the factor exp(-uw) decays rapidly as I p I + 00, which permits an analytic continuation of the solution to complex spatial coordinates z = x - iy. But since
d
m
w(p) is no longer quadratic in I p I, y cannot be arbitrarily large. Rather, we must require that the four-vector ( u , y ) satisfy the condition
4.4.Relativistic f i m e s
uw - y * p
>0
In covariant notation, setting yo
V(w,p) E a+,. u , we
(21)
must have
VPEfl+,,
YP>O
189
(22)
so that the complex exponential exp [ - i ( x - iy)p] remains bounded as p varies over fl;. This implies that yp > 0 for all p E V+,where
is the open forward light cone in momentum space. In general, we need to consider the closure of V+,i.e. the cone
-
v+ = {P E IRa+l I lPl I P O / C } ,
(24)
which contains the light cone { p 2 = 0 I PO > 0) (corresponding to massless particles) and the point { p = 0) (corresponding in quantum field theory to the vacuum state). The set of all y’s with yp > 0 is called the d u d cone of i.e.,
v+,
v; 3 {y E R”+lI yp > 0 v p E V+}.
(25)
It is easily seen that y belongs to V; if and only if I y I < cyo. Note contracts to the non-negative Po-axis while V; that as c + 00, expands to the half-space { ( u ,y) I u > 0, y E R’} which we have encountered in the last section. V . coincides with V+ when c = 1, but it is important to distinguish between them since they “live” in different spaces (see section 1.1). Thus for y E V;, setting z = x - iy, we define
v+
f(z) =
J’,+ d@e-;*’ rn
a@).
4. Complex Spacetime
190
The integral converges absolutely for any a E L:(dp") and defines a function on the forward tube
also known as the future tube and, in the mathematical literature, as the tube over V;. Differentiation with respect to z p under the integral sign leaves the integral absolutely convergent, hence the function
f(z) is holomorphic, or analytic, in 7+.As y -+ 0 in V;, f ( z ) + f ( z ) in the sense of Li(dj5). Thus f( 2) is a boundary value of f( z). Clearly f(z) is a solution of the Klein-Gordon equation in either of the variables z or z. Let
K
be the space of all such holomorphic solutions:
Then the map a ( p ) H f(z) is one-to-one from L$(dp")onto Ic. Hence we can make
K
into a Hilbert space by defining
where the inner product on the right-hand side is understood to be that of L$(d@).We now show that Ic is a reproducing-kernel Hilbert space. Its evaluation maps are given by
E , ( f ) = f(z) = where
JCk dp" e-"fpa(p)
E (e,
Ia),
(30)
191
4.4. Relativistic Frames
Lemma 4.2. 1. For each z E I+, e, belongs to Li(dfi), with
where v = (s - I)/&
and I<, is a modified Bessel function (Abramowitz and Stegun [1964]; the speed of light has been inserted for future reference.)
2. In pasticular, the evaluation maps on
K
are bounded, with
Proof. Set c = 1. (To recover c, replace m by mc in the end.) Then
Since G(y) is Lorentz-invariant and y E V l , we can evaluate the integral in a Lorentz frame in which y = (A, 0 ) :
G(y) = G(X,O) = = (27r)-
Set p = mq. Then
s
s,,
d@e-’””
2
J
(36) L exp [ - 2 ~ & ? 7 7 ]
.
4. Complex Spacetime
192
G(y) = (27r)-'mS-l
exp [ - 2 ~ r n J m ]
(37) The reproducing kernel can be obtained by analytic continuation from ller112:
( e,t I e, )
IC(z', Z ) =
J,,dp" exp [-z(z'
- z)p]
(g) V
= (27d-l
K,(??rnc),
where
q-
+(Z'
- [(y'
- 2)2
+ y)2 - (x' - x ) 2 + 2i(y' + y>(x'
- x)]
1 /2
(39)
is defined by analytic continuation from z' = z (when 77 = 2X) as
follows: The square-root function is defined on the complex plane cut along the negative real axis. Since y and y' both belong to V'., so does y'
+
+ y. Now the argument of the square root is real if and only
if (y' y)(x' - 2) = 0, and this can happen only when (5' - x ) 5 ~ 0. (Otherwise, either 2'--2 or x--2' belongs to V;, hence (y'+y)(z'-x) is positive or negative, respectively.) But then,
193
4.4. Relativistic n a m e s
-(z’
- 2)’ = (y’
+ y)2
-
(x’- x ) 2~ (y’
+ y)2 > 0.
(40)
Thus for z‘,z E 7+,the quantity -(t’ - z ) cannot ~ belong to the negative real axis, so 77 is well-defined. The reproducing kernel is closely related to the analytically continued (Wightman) 2-point function for the scalar quantum field of mass m (Streater and Wightman [1964]):
K ( z ‘ ,Z ) = -iA+(t’ - 2 ) .
(41)
We will encounter this and other 2-point functions again in the next chapter, in connection with quantum field theory. Note: We will be interested in the behavior of ary of 7+,i.e. when X that
N
11 e, 11
near the bound-
0. From the properties of I<’, it follows
In particular, the evaluation maps are no longer bounded when X = 0.
Po acts on 7+by a complex extension of its action on real spacetime, i.e., t’ 3
This means that x‘ = Ax
( b , A ) z = AZ
+ b.
+ b as before,
(43)
and y‘ = Ay. (This is
consistent with the phase-space interpretation of 7+to be established below,) The induced action on K: is therefore
194
4. Complex Spacetime
(W,A ) f ) ( 4 = f (
W
Z
- 4 )*
(44)
This implies that the wave packets e , transform covariantly under Po, i.e.
We have now established that the space K of holomorphic posit ive-energy solutions is a reproducing-kernel Hilbert space. Recall that picking out the positive-energy part of f ( z ) was a non-local operation in real spacetime, involving the pseudodifferent ial operator d m . However, when extended t o I+,such functions may be characterized locally, as simultaneaous solutions of the Klein-Gordon equation and the Cauchy-Ftiemann equations, since the negativeenergy part of f ( z ) does not have an analytic continuation to 7+. We now show that 7+may, in fact, be interpreted as an extended phase space for the underlying classical relativistic particles. Clearly, zp ‘8z p are the spacetime coordinates. Their relation to the expectation values of the relativistic (Newton-Wigner) position operators will be discussed below. We now wish t o investigate the relation of the ima,ginary coordinates yp E - 3 z p to the energy-momentum vector. The bridge between the “classical” coordinates g p and the quantum-mechanical observables Pp will be, as usual, the (future) coherent states e,. Before getting involved in computations, let us take a closer look at these wave packets in order to get a qualitative picture. Since yp is Lorentz-invariant, it can be evaluated in a reference frame where p = (rnc2,0). Thus yp = y0mc2 = J-mc
2 Xmc,
(46)
4.4. Relativistic Frames
195
with equality if and only if y = 0, i.e. when y and p are parallel. This is a kind of reverse Schwarz inequality which holds in V . x under the pairing provided by the Lorentzian scalar product. Thus we have for fixed y E V. and variable p E
v+
the maximum being attained when and only whenp = (mc/X)y Hence we expect, roughly, that
py.
Therefore the vector y, while itself not an energy-momentum, acts as a control vector for the energy-momentum by filtering out p’s which are “far” from pv. The larger we take the parameter A, the finer the filter. The expected energy-momentum can be computed exactly by noting that
( e, I P,e, ) =
J,,
djj p ,
e-2yP
(49)
where G(y) = llez112 as before. Since G depends on y only through the invariant quantity A, we have
where we have used the recurrence relation (Abramowitz and Stegun w41)
196
4. Complex Spacetime
a (A-” ax
l i , ( 2 X m ) ) = 2mX-”K,+1(2Xm).
--
(51)
This verifies and corrects the above qualitative estimate. In view of the above relation, the hyperboloid
0;
= { y E v; J y 2 = A * }
(52)
corresponds to the mass shell. Hence the submanifold c x = {x
-iy
I
E I+ y 2 = A 2 }
(53)
corresponds to the extended phase space C defined in section 4.2 which, we recall, was a homogeneous space of Fo that was interpreted as spacetimexvelocity space. Let us define the effective mass mx of the particle on 0; by
and mxc (PP>= X Y P ” ’
(56)
We claim that mx > m, which can be seen as follows. For all p,p’ E 02 we have the “reverse Schwarz inequality” pp‘ 2 m 2 ,with equality if and only if p = p’. Hence
= G-2
> m2
//
dp”dp”’ pp‘ exp( -2yp
- 2yp’)
(57)
197
4.4. Relativistic Frames
This is a kind of “renormalization effect” due to the uncertainty, or fluctuation, of the energy-momentum in the state e,. It appears to go in the “wrong direction” (i.e., ( P ) 2 > ( P 2 )) for the same reason as does the inequality pp’ 2 m 2 , namely because of the Lorentz metric. Thus ( P p ) is proportional, by a y-dependent but %-invariant factor, to yp. We may therefore consider the yp’s as homogeneous coodinates for the direction of motion of the classical particle in (real)
spacetime. Alternatively, the expectation of the velocity operator
P/Po can be shown to be y/yo. Thus of the
s
+ 1 coordinates yp,
only s have a “classical” interpretation. It is important to understand that the parameter X has no relation to the mass; it can be chosen to be an arbitrary positive number and has the physical dimensions of length. It is the relativistic counterpart of the parameter u encountered in connection with the non-relativistic coherent states, and its significance will be studied later. At this point we simply note that X measures the invariant distance of z from the boundary of 7+.The larger A, the more smeared out are the spatial features and the more refined are the features in momentum space. (Recall that the imaginary part u of the time played a similar role in the non-relativistic theory.)
Because the vector y is so fundamental to our approach, it deserves a name of its own. We will call it the temper vector. The name is motivated in part by the smoothing effect which y has on spacetime quantities, and also by the fact that y plays a role similar to that played by the inverse teperature p = l / h T in statistical mechanics; the latter controls the energy.
From the asymptotic properties of the K Y ’ s it follows that
198
4. Complex Spacetime
ylx2)y,,
when Xmc
--+
0.
(58)
This can be understood as follows: When Xmc + 00, e.g. c --t 00 for fixed Am, we recover the non-relativistic results. For example, the expectations of the spatial momenta approach those in the nonrelativistic coherent states:
When Xmc t 0, say X 3 0 for fixed mc, then z approaches the boundary of 7+.In that case, fluctuations take over and the expectations become independent of the mass m. The relation of the spacetime parameters x, Rz, to the Newton-Wigner position operators is as follows. Since a fixed f E 7+ describes the entire history of the particle, the associated state does not change with time (i.e., the dynamics is already built in). This means that we are in the so-called Heisenberg picture, and timebehavior must be described by evolving the observables A :
The Newton-Wigner position operators are uniquely determined by a set of seemingly reasonable assumptions concerning the localizability of the particle (Newton and Wigner [1949], Wightman [1962]), and are given in the momentum representation at time 2 0 = 0 by
4.4. Relativistic Frames Now choose z = x
- iy
199
E‘ I with + x o = 0. Then
hence
It must be noted, however, that the concept of position for relativistic particles is highly unsatisfactory. Not only are the position operators non-covariant (this would seem to require a time operator on equal footing with them, which would exclude the possibility of dynamics); but the very concept of localizability for such particles is fraught with difficulties. For example, eigenvectors of the Newton-Wigner position operators, known as “localized states,” spread out from a single point at time x o to fill the entire universe an arbitrarily small time later, violating relativistic causality. (The same phenomenon in the nonrelativistic theory presents no conceptual problem, since propagation velocities are unrestricted there.) Even much weaker notions of localization give rise to problems with causality (Hegerfeldt [1985]). In my opinion, it is best to admit that position is simply a non-relativistic concept, and in the relativistic theory events z should be regarded as mere parameters of the spacetime manifold. As such, our formalism extends them to z = x - iy, with the new parameters y playing the role of a control vector for the energy-momentum observables. Thus, in place of a set of pairs of canonically conjugate observables x k , p k , we have a set of observables Pp and a dual set of complex parameters
200 zp.
4. Complex Spacetime The symmetry between position- and momentum operators in
the non-relativistic theory was based on the Weyl-Heisenberg group, and we have seen that this symmetry is “accidental,” being broken in the transition to relativity. Note: This further reinforces the idea discussed in section 4.2, that
“group-t heoretic quantization,” i.e. the quantization of classical systems obtained by requiring states and observables to transform under unitary representations of the associated dynamical groups or central extensions thereof, is superior to “canonical quantization” (sending the classical phasespace observables to operators, with Poisson brackets going to commutators).
#
Let us now consider the uncertainties in the energy- and momentum operators. More generally, we compute the correlation matrix
of which the uncertainties A$@are the diagonal elements. From
it follows that
(Incidentally, this shows that the function lnG is of some interest in itself its first partials are the expected momenta, while its second partials are the correlations.) A computation similar to that for ( Pp ) gives
4.4. Relativistic fiames
201
Although this expression does not appear to be enlightening in any obvious way, the uncertainties App can be estimated from it in various limits such as Am + 00 and Am -+ 0. The reproducing kernel by itself is of limited use. Although it makes it possible to establish the interpretation of 7+as an extended classical phase space, it does not provide us with a direct physical interpretation of the function values f(z). The inner product in K: is borrowed from L:(dp’), hence a probability interpretation exists, so far, only in momentum space. In the standard formulation of
Klein-Gordon theory, it is possible to define the inner product in configuration space, but the corresponding density turns out not to be positive, thus precluding a probabilistic interpretation. This is one of the well-known difficulties with the first-quantized Klein-Gordon theory, and is one of the reasons cited for the necessity to go to quantum field theory (second quantization). We will see that the phase-space approach does admit a covariant probabilistic picture of relativistic quantum mechanics, thus making the theory more complete even before second quantization. These topics will be discussed further in the next section and the next chapter. At this point we wish only to define an “autonomous” inner product on K: as an integral over a “phase space” lying in 7+.This will provide us with a normal frame of evaluation maps (chapter 1). Recall that in the non-relativistic theory of the last section, the norm in the space 7i, of holomorphic solutions was obtained by integrating I f(z, T ) I with respect to the Gaussian measure d p u ( z ) over the phase space T = t - iu =constant. But now, for given yo = u,
4. Complex Spacetime
202
only those y's with I y 1 < cu belong to V;. That is, the particle can only travel a finite imaginary distance in a finite imaginary time. In view of the relation ( Pp ) oc yr, the obvious candidate for a phase space is the set defined for given t E
IR and X > 0 by
Such sets are not covariant, but a covariant extension will be found in the next section. As for the measure, a Gaussian weight function (such as exp(-my2/u), which occured in dpu(y)) is no longer satisfactory since it cannot be covariant. It turns out that we do not need a weight function at all! This can be seen as follows: In the non-relativistic case, the shift to complex time was performed once and for all by the operator e-uH. For fixed u > 0, the weight function served to correct for the non-unitary translation from the real point x in space to the complex point z = x - iy. However, if we restrict ourselves to the subset u,then a translation to imaginary space is necessarily accompanied by a translation in imaginary time. The analog of the above translationis (t-iA,x)
H
(t-i,/m,x-iy).
The increase in yo, it turns out, precisely compensates for the shift to complex space! This follows from the fact that the operator e-Yp, which affects the total shift to complex spacetime, is relativistically invariant, hence the point y = 0 no longer plays a special role. We will show later that in the non-relativistic limit, we recover the weight function naturally. Hence the Gaussian weight function associated with the Galilean coherent states (which, as we have seen, is closely related to that associated with the canonical coherent states) has its origin in the geometry of the relativistic phase space, i.e. in the curvature of the hyperboloid (y2= A'}.
For
CT
= ot,X as above and f E Ic, define
4.4. Relativistic names
where we parametrize
Q
203
by (x,y) E R2" and the measure do is given
with
=
(z)'-
m G(A).
Then we have the folowing result.
Theorem 4.3.
For all f E K ,
Proof. Assume, to begin with, that a(p) a(w,p) belongs to the Schwartz space S(IRd)of rapidly decreasing test functions. Then
Hence, by Plancherel's theorem,
204
4. Complex Spacetime
Exchanging the order of integration in the integral representing \If[ :) we obtain
We now evaluate J(p)= as follows: Consider all s
J
dSye-2yp
(76)
+ 1 components of p as independent and
define m(p)f @. From the integral computed earlier, i.e.
J
G(y) 3 ( 2 ~ ) - ~d"p(2u)-1e-2yp
(77)
4nX we obtain by exchanging p and y (as well as rn and A):
Taking the partial derivative with respect to p o on both sides gives
d [t-"Kv(t)] dPO
= -(2nX2)y-
(79)
where t ( p ) 2Xrn(p). Using again the recurrence relation for the I<,,)s) we get
205
4.4. Relativistic Fkames
Thus
Ilf for u(p) E
llu
J
- (W-" d"P ( 2 4 - 1
I&) I
= Ilfll;
(81)
S(IR").By continuity, this extends to all a E L:(dj.j) since
S(IR")is dense in L:(dj.j). I If we define
then by polarization we have the following immediate consequence of the theorem.
Corollary 4.4. For all f 1 ,
f2
E K:,
-
In particular, ( I .). defines a %-invariant inner product on K:, and we have the resolution of unity
making the vectors e, with z E
a normal frame.
I
Note: The above results extend to the case X = 0 by continuity. The normalization constant in the measure do is then given by
The same formula applies fsr fixed X Ax becomes unbounded as rn + 0.
> 0 and rn #
-
0, and shows that
206
4. Complex Spacetime The vectors e, belong t o L$(dfi), but correspond to vectors E,
in
Ic defined by
( f I f ) - on K provides us with an interpretaThe norm llfll: tion of I f ( z ) I as a probability density with respect to the measure dax on the phase space Q. Within this interpretation, the wave packets e , have the following optimality property: For fixed z E 7+let
Proposition 4.5. Up to a constant phase factor, the function g Z is the unique solution to the following variational problem: Find f E K: such that llfll = 1 and I f ( z ) I is a maximum.
Proof. This follows at once from the Schwarz inequality and theorem 4.3,since by eq. (26),
If(z)l
= I(ezla)l=
I(e"zIf>l
5 I l 4 I llfll = l l e z l l Ilfll, with equality if and only if f is a constant multiple of e",.
I
According to our probability interpretation of I f(z) I ', this means that the normalized wave packet i, maximizes the probability of finding the particle at z . Note: Unlike the non-relativistic coherent states of the last section, the e,'s do not have minimum uncertainty products. In fact, since the uncertainty product is not a Lorentz-invariant notion, it is a
4.5. Geometry and Probability
207
priori impossible to have relativistic coherent states with minimum uncertainty products. The above optimality, which is invariant, may be regarded as a reasonable substitute. Actually, there are better ways to measure uncertainty than the standard one used in quantum
mechanics, which is just the variance. From a statistical point of view, the variance is just the second moment of the probability distribution. Perhaps the best definition of uncertainty, which includes all moments, is in terms of entropy (Bialynicki-Birula and Mycielski
[ 19751) Zakai [19601). Being necessarily non-linear, however, makes this definition less tractable.
4.5. Geometry and Probability
The formalism of the last section was based on the phase space and the measure do, neither of which is invariant under the action of Po on 7+.Yet, the resulting inner product ( I .)o is ot,~ o
-
clearly invariant. It is therefore reasonable to expect that 0 and do merely represent one choice out of many. Our purpose here is to construct a large natural class of such phase spaces and associated measures to which our previous results can be extended. This class will include u and will be invariant under Fo. In this way our formalism is freed from its dependence on u and becomes manifestly covariant. As a byproduct, we find that positiveenergy solutions of the Klein-Gordon equation give rise to a conserved probability current, so the probabilistic interpretation becomes entirely compatible with the spacetime geometry. As is well-known, no such compatibility is possible in the usual approach to Klein-Gordon theory. We begin by regarding 7+as an extended phase space (symplec-
208
4. Complex Spacetime
tic manifold) on which Po acts by canonical transformations. Candidates for phase space are 2s-dimensional symplectic submanifolds
c
7+,and Po maps different u’s into one another by canonical transformations. A submanifold of the “product” form IY = S - iQ:, where S (interpreted as a generalized configuration space) is an su
submanifold of (real) spacetime IRS+l, turns out to be symplectic if and only if S is given by xo = t ( x ) with
lot1 5
1, that is, if and only if S is nowhere timelike. (This is slightly larger than the class of all spacelike configuration spaces admissible in the standard theory.) The original ut,A corresponds to t (x)=constant. The results of the last section are extended to all such phase spaces of product form. The action of ’POon 7+is not transitive but leaves each of the
(2s
+ 1)-dimensional submanifolds 7 -= {x - iy E 7+I y2 = P}
invariant. Each 7’
(1)
is a homogeneous space of Po, with isotropy
subgroup SO(s), hence
Thus 7’ corresponds to the homogeneous space C of section 4.2 (where we had specialized to s = 3). In view of the considerations in sections 4.2 and 4.4, each 7 ; can be interpreted as the product of spacetime with ‘(momentumspace”. Phase spaces u will be obtained by taking slices to eliminate the time variable. On the other hand, we also need a covariantly assigned measure for each u. The most natural way this can be accomplished is to begin with a single Po-invariant symplectic form on 7+and require that its restriction to each u be symplectic. This will make each u a
4.5. Geometry and Probability
209
symplectic manifold (which, in any case, it must be to be interpreted as a classical phase space) and thus provide it with a canonical (Liouville) measure. Thus we look for the most general 2-form a on 7+ such that (a) a is closed, i.e., da = 0;
(b) a is non-degenerate, i.e., the (29+2)-fonn as+1 c y A a A . - - A a vanishes nowhere; (c) for every g = (.,A) E PO,g*a = a,where g*a denotes the pullback of a under g (see Abraham and Marsden [1978]). Since every Po-invariant function on 7+ depends on z only through y2, the most general invariant 2-form is given by
Now the restriction (pullback) of the second term to 7 ;
vanishes,
since it contains the factor ypdyp = d(y2)/2. Furthermore, the coefficient $(y2) of the first term is constant on 7:. confine our attention to the form
a = dy,, A dx” without any essential loss of generality.
Hence we may
(4)
This form is symplectic as
well as invariant, hence it fulfills all of the above conditions. 7+, together with a,is a symplectic manifold, and invariance means that each g E Po maps 7+into itself by a canonical transformation.
A general 2s-dimensional submanifold c of 7+will be a potential phase space only if the restriction, or pullback, of cy to t~ is a symplectic form. We denote this restriction by a,. Let c be given bY
210
4. Complex Spacetime
0
= {Z E ' I I+ S(Z) = h(z) =0},
(5)
where s ( z ) and h ( z ) are two real-valued, C" (or at least C') functions on 7+such that d s A d h # 0 on u. For example, Q ~ , X can be obtained from S ( Z ) = xo - t and h ( z ) = y2 - X2. The pullback au depends only on the submanifold 6,not on the particular choice of s and h.
Proposition 4.6. Poisson bracket
The forrn a , is symplectic if and only if the
{s, h } everywhere on
ds dh dh as -- -- # O ax, dy, ax, dy,
Q.
Proof. a , is closed since a is closed and d ( a u ) = (da),. Hence a , is symplectic if and only if it is non-degenerate, i.e. if and only if its s-th exterior power a" vanishes nowhere on Q. Now (a,)" equals the pullback of a" to Q, and a straightforward computation gives h
h
a" = S! d y " A d z , ,
(7)
where h
dy' = (-1)' dyo A dyl A h
--
d x , = (-1)' dx" A dx"-l A
*
A dy,-l
- - - A dx""
A dy,+l
A
--
*
A dxP-' A
A dy, *
- - A dx'.
(8)
(z z,
and are essentially the Hodge duals (Warner [1971]) of dy, and dx,, respectively, with respect to the Minkowski metric.) Let ( ~ 1 , .. . , ~ 2 ~ , v 1 , v 2be} a basis for the tangent space of 7+ at z E Q, with ( 2 1 1 , . . . ,u2"} a bas& for the tangent space Q, of Q. Since ds and d h vanish on the vectors u j ,
4.5. Geometry and Probability
211
By assumption, ( d s A d h ) ( v l ,u2) # 0. Therefore a, is non-degenerate at z if and only if a' A ds A d h # 0 at z. But by eq. (7))
a' A d s A d h = s! { s ,h } d y A d x ,
(10)
where
Hence a:
# 0 at
z if and only if {s, h } # 0 at z . I
Let us denote the family of all such symplectic submanifolds a
by Co.
Proposition 4.7. Let a E CO and g E PO.Then ga E CO and the restriction g : a + g a is a canonical transformation from (a,a,,)to
(so,Q g 4 . Proof. Let
g* denote the pullback map defined by g, taking forms
on ga to forms on u. Then the invariance of
Q
implies that
Hence agUis non-degenerate. It is automatically closed since closed. Thus ga E
CO.To say that
Q
is
g: a + g a is a canonical trans-
formation means precisely that a, and agaare related as above. I
212
4. Complex Spacetime
We will be interested mainly in the special case where h ( z ) = y2 - X2 for some X > 0 and s ( z ) depends only on x. Then the sdimensional manifold
s = (5 E IR”l I s ( x ) = 0)
(13)
is a potential generalized configuration space, and u has the “product” form u = s - iR,+
{x - iy E T+ I x E S, y E Q:).
(14)
The following result is physically significant in that it relates the pseudeEuclidean geometry of spacetime and the symplectic geometry of classical phase space. It says that u is a phase space if and only if S is a (generalized) configuration space.
Theorem 4.8.
Let u = S - i Q i be as above. Then (a,a,) is
symplectic if and only if
that is, if and only if S is nowhere timelike.
Proof. On
6 ,we
have {s,
h } = 2-
89 yp # 0,
dXP
and we may assume { s , h } > 0 without loss. For fixed x E S, the above inequality must hold for all y E R i , hence for all y E V’.. This of V i . I implies that the vector a s / a x ~is in the dual
v+
We denote the family of all a ’ s as above (i.e., with S nowhere timelike) by C. It is a subfamily of CO and is clearly invariant under
4.5. Geometry and Probability
213
PO.Note that C admits lightlike as well
as spacelike configuration spaces, whereas the standard theory only allows spacelike ones.
We will now generalize the results of the last section to all a E C. The 2s-form CY; defines a positive measure on u , once we choose an orientation (Warner [1971]) for a. (This can be done, for example, by choosing an ordered set of vector fields on a which span the tangent space at each point; the order of such a basis is a generalization of the idea of a “right-handed” coordinate system in three dimensions.) The appropriate measure generalizing da of the last section is now defined as da = ( s ! A x )-1 a,8,.
(17)
do is the restriction to u of a 2s-form defined on all of 7+,which we also denote by do. (This is a mild abuse of notation; in particular, the ‘(8’here must not be confused with exterior differentiation!) By
eq. (7), we have
-
h
da = A ,-1 d y p Adz,.
We now derive a concrete expression for da. Since s obeys eq. (15) and ds # 0 on u, we can solve ds = 0 (satisfied by the restriction n of ds to a) for dzo and substitute this into dzk. This (and a similar procedure for y) gives
(2) -1
n
dz, =
on u . Hence
as
-dxo ax,
4. Complex Spacetime
214
We identify a with mapping
R2"by solving s ( x )
( t ( x >-
id=,
= 0 for zo = t ( x ) and
x - iy) H (x,y).
(21)
&
We further identify AZ o with the Lebesgue measure d"y d'x on R2"(this amounts to choosing a non-standard orientation of R2"). Thus we obtain an expression for da as a measure on R'". Now s(z) = 0 on a implies that
which can be substituted into the above expression to give
-
= AX1 (1 - V t (y/yo)) d"y d " ~ .
But eq. (15) implies that I Vt ( x ) I
5 1, hence for y E V i ,
and da is a positive measure as claimed. The above also shows that if I Vt (x)I = 1for some x,then da becomes "asymptotically" degenerate as y I + 00 in the direction of V t ( x ) . That is, if (r is lightlike at ( t (x),x), then da becomes small as the velocity y/yo approaches the speed of light in the direction of V t ( x ) . This means that functions in L 2 ( d a )(and, in particular, as we shall see, in K ) are allowed high
I
215
4.5. Geometry and Probability
velocities in the direction V t ( x ) at ( t ( x ) , x )E S. This argument is an example of the kind of microlocal analysis which is possible in the phase-space formalism. (In the usual spacetime framework, one cannot say anything about the velocity distribution of a function at a given point in spacetime, since this would require taking the Fourier transform and hence losing the spatial information.) For u E C, denote by L2(da)the Hilbert space of all complexvalued, measurable functions on 0 with
llfll2,= J do l f I 2 < 00.
(25)
0
If f is a C” function on 7+, we restrict it to u and define above. Our goal is to show that
llflld
llfllo
= Ilfllx: for every f E
as
K. To
do this, we first prove that each f E K defines a conserved current in spacetime, which, by Stokes’ theorem, makes it possible to deform the EC phase space at,^ of the last section to an arbitrary u = S without changing the norm. For f E K , define
in:
where 52:
has the orientation defined by
&
O,
so that Jo(z)is posi-
tive. Then
where S has the orientation defined by to S does not vanish since I V t (x)I
go. (The restriction of 30
5 1.)
Let f(p) be C” with compact support. Then J ” ( z ) is C” and satisfies the continuity equation Theorem 4.9.
4. Complex Spacetime
216
8 JP ax@
- 0.
Proof. By eq. (19),
= d y ‘/yo. h
where dg
The function
is in L 1 ( d g x d??;x dg), hence by Fubini’s theorem,
(31) where, setting k p q, 7 @ and using the recurrence relation for the ICv’sgiven by eq. (51) in section 4.4,we compute
+
E
kC”H(7).
H ( 7 ) is a bounded, continuous function of 7 for 7 2 2m, and
4.5. Geometry md Probability
217
dPd4 exp b ( P - dl 3 ( P ) P(d (P” + 4‘9H(rl).
J p ( x )= AX1
(33) Since f(p) has compact support, differentiation under the integral sign to any order in 5 gives an absolutely convergent integral, proving that J p is C”. Differentiation with respect to x p brings down the factor i ( p p - q p ) from the exponent, hence the continuity equation follows from p 2 = q2 = rn2. I
Remark. The continuity equation also follows from a more intuitive, geometric argument. Let
oriented such that
(the outward normal on aB: points “down,” whereas f-2: “up”). Then by Stokes’ theorem,
is oriented
Here, d represents exterior differentiation with respect to y, and since the s-form dy p contains all the dyy’s except for dya, we have h
Jp(x) = -AT1
d
- iY) I 2 ,
(37)
218
4. Complex Spacetime
where dy is Lebesgue measure on B z . To justify the use of Stokes’ theorem, it must be shown that the contribution from l y l + 00 to the first integral vanishes. This depends on the behavior of f(z), which is why we have given the previous analytic proof using the Fourier transform. Then the continuity equation is obtained by differentiating under the integral sign (which must also be justified) and using
a2If l 2 axray,
= 0,
(38)
which follows from the Klein-Gordon equation combined with analyticity, since
Incident ally, this shows that j”(2)
= --d I f ( Z ) I (40)
ay, =i
[f(t)a,f(.)
- d,f(
is a “microlocal,” spacetimeconserved probability current for each fixed y E V;, so the scalar function I f(z) I is a potential for the probability current. We shall see that this is a general trend in the holomorphic formalism: many vector and tensor quantities can be derived from scalar potentials. Eqs. (37) and (40) also show that our probability current is a regularized version of the usual current associated with solutions in
4.5. Geometry and Probability
219
real spacetime. The latter (Itzykson and Zuber [1980]) is given by
which leads t o a conceptual problem since the time component, which should serve as a probability density, can become negative even for positiveenergy solutions (Gerlach, Gomes and Petzold [1967], Barut and Malin [1968]).By contrast, eq. (36) shows that Jo(s) is stricly non-negative. The tendency of quantities in complex spacetime to give regularizations of their counterparts in real spacetime is further discussed in chapter 5. We can now prove the main result of this section.
Theorem 4.10.
Let u = S - in: E C and f E X:
.
Then
llflld = IlfllK. Proof. We will prove the theorem for ](p) in the space D ( R 8 )of C" functions with compact support, which implies it for arbitrary f" E L:(d$) by continuity. Let S be given by zo = t(x), and for R > 0 let
I ER = {x E R"+'I SOR = {Z E R'+l I SR = {X E ]Rd+' I D R = {a: E Rd+'
1x1 < R,':a E [O,t(x)]}, 1x1 = R,':a E [O,t (x)]},
1x1 < R,ZO= o},
(41)
1x1 < R,'5 = t(X)}, where [0, t (x)]means [t(x),01 if t (x)< 0. We orient SORand SR by dxo , E R by the "outward normal" h
220
4. Complex Spacetime
+
and D R so that ~ D = R SR - SOR ER. Now let f(p) E D(IR3). Then J P ( x ) is C", hence by Stokes' theorem,
J
S R -SOR+ER
J p ( x )&,, =
kR(
d Jp
= (-1)"
J
DR
z,,) dJC"
dx -= 0.
(43)
ax'
We will show that
A ( R ) = J,, J",,
+
o as R + 00
(44)
(i.e., there is no leakage to 1x1 + oo),which implies that
= Ilfl:o,
= llfll:,
by theorem 1 of section 4.4. To prove that A(R) + 0, note L a t on h
ER,dxo = 0 and h
h
h
dxk = xk- dx 1 = xk- d52 = ' ' ' = X k - dX3 9
h
21
22
5 3
each form being defined except on a set of measure zero; hence h
i.=R-.d X 1 21
(47)
221
4.5. Geometry and Probability
=s
LR
J o i. = a(R).
Now by eqs. (31) and (32),
J o ( x )=
k2,
d"pd"q eiz(p-q)$(p, q),
where
D=jZ*V,, where 2 = x/R, and observe that for x E ER,
=-
i t(x,p) ~ eizp,
where v = p/po. Since q5 has compact support, there exists a constant a < 1 such that IvI 5 a and lv'l 5 CY for all (p,p') in the support
4. Furthermore, Ixo I < R(1+ e) for
of
IE(x,p)l
since (Vt(z)I 5 1, given any E > 0 we have E ER for R sufficiently large; hence
1 1 - a(1+
E)
for z E ER and p E supp
4.
(54)
4. Complex Spacetime
222
Choose 0 < E < 0-l - 1, substitute
into the expression for Jo(x) and integrate by parts:
This procedure can be continued, giving (for x E E R )
where
r-')"
Now ( D o is a partial differential operator in p whose coefficients are polynomials in D'((E-l) with Ic = 0,1, ... ,n. We will show that for x E ER with R sufficientlylarge, there are constants b k such that
ID' (t-') I< bk which implies that
Ic = 0,1,-,
(59)
4.5. Geometry and Probability
223
for some constants cn, so that by eqs (49) and (57),
a(R) = s
LR
Jot
if we choose n > s. To prove eq. (59)) note that it holds for k = 0 by eq. (54) and let u = 2 v. Then
-
and if for some k pk(u) Dku = Pt
where pk is a constant-coefficient polynomial, then
-
pk+1( U ) p;+l
hence eq. (63) holds for k = 1,2,
which implies
'
- - by induction. Thus
224
4. Complex Spacetime
But D k ( t - ' ) is a polynomial in
[-'and DE,D 2 [ ,- - - ,Ilk[;hence eq.
(59) follows from eqs. (54) and (66).
I
The following is an immediate consequence of the above theorem.
Corollary 4.11. (a) For every o E C, the form
defines a Po-invariant inner product on
K , under
which
K is a
Hilbert space.
(b)
=
The transformations (V,f)(z) f(g-'z), g E PO,form a unitary irreducible representation of PO under the above inner prod-
f^
f from L$(dfi) to K: intertwines this representation with the u s u d one on L:(d@). (c) For each t~ E C, we have the resolution of unity uct, and the map
H
on L:(d@) (or, equivalently, on
K
if e, is replaced by e",. ) I
Note: As in section 4.4,all the above results extend by continuity to the case X = 0. #
4.6. The Non-Relativistic Limit
225
4.6. The Non-Relativistic Limit
We now show that in the non-relativistic limit c + 00, the foregoing coherent-state representation of Po reduces to the representation of 62 derived in section 4.3, in a certain sense to be made precise. As a by-product, we discover that the Gaussian weight function associated with the latter representation (hence also the closely related weight function associated with the canonical coherent states) has its origin in the geometry of the relativistic (dual) “momentum space” That is, for large lyl the solutions in K: are dampened by the factor in momentum space, which in the non-relativistic exp[limit amounts to having a Gaussian weight function in phase space.
Qi.
d w u ]
In considering the non-relativistic limit, we make all dependence on c explicit but set h = 1. Also, it is convenient to choose a coordinate system in which the spacetime metric is g = diag(1, -1,. . . ,-I), so that yo = yo = and po = po = Fix u > 0 and let X = uc. Then
d
m
d w .
n
= umc2
++ 2m up‘
n
my“ ~(c-~). 2u
+
Working heuristically at first, we expect that for large c, holomorphic solutions of the Klein-Gordon equation can be approximated by
226
4. Complex Spacetime
where T = t - iu and fNR is the corresponding holomorphic solution of the Schrodinger equation defined in section 4.3. Note that the Gaussian factor exp[-my2/2u] is the square root of the weight function for the Galilean coherent states, hence if we choose f(p) E L2(IR")c L$(dfi), then
We now rigorously justify the above heuristic argument. Let f ( z ) be the function in K: corresponding to f(p) and denote by fc its restriction to z o = t and y2 = u2c2,for fixed u > 0. Theorem 4.12. Let u
> 0 and f(p)
E L2(IRs).Then
Proof. Without loss of generality, we set u = m = 1 and t = 0 to simplify the notation. Note first of all that
4.6. The Non-Relativistic Limit
where A,
Ax (A
G
227
uc = c). But
Choose a , y such that 1/2 < y < a < 1. Then
-+
0 as c + 00,
I
where xc is the indicator function of the set {p IpI > c ~ - ~ }Define . 6 and d by IyI = csinh6 and IpI = csinh4. Then yo = cosh8 and w = c2 cosh 6, hence
228
4. Complex Spacetime
Thus for arbitrary a 2 0,
Let a = sinh-'(c-7).
Then for IpI
~ - ~ ,
c(a - 4) -s/2c 2 c [~inh-'(c-~)- ~ i n h - ' ( c - ~ ) ]- s/2c z g(c). (13)
g(c) is independent of p and g(c) when IpI
N
cl-7 as c + 00.
Also,
c$
- ~
< c ' - ~ . Hence
+ 0 as c + 00.
Now 2c2 - 2yp = y2 = (Yo
+p2
- 2yp = (y -p) 2 2
- 4 /c
2
- (Y
- P)2 L -(y - p)2.
(15)
4.6. The Non-Relativistic Limit
229
Hence
Finally,
where
1
< +Y2 2c2 5
f
(c-27
- P21
+ c-2a)
- c-27. < We have used the estimate
U
5 ilvxdxl = Hence for sufficiently large c and lpl <
G
f lv2 -u21.
(20)
230
4. Complex Spacetime
= h(c) + 0 as c + 00. Thus
Notes
This chapter represents the main body of the author’s mathematics thesis at the University of Toronto (Kaiser 11977~1).All the theorems, corollaries, lemmas and propositions (labeled 4.1-4.12) have appeared in the literature (Kaiser [1977b, 1978a1). In 1966, when the idea of complex spacetime as a unification of spacetime and phase space first occurred to me, I had found a kind of frame in which both the bras and the kets were holomorphic in E and the resolution of unity was obtained by a contour integral, using Cauchy’s theorem. During a seminar I gave in 1971 at Carleton University in Ottawa (where I
Notes
231
was then a post-doctoral fellow in physics), L. Resnick pointed out to me that this “wave-packet representation” appeared to be related to the coherent-state representation, which was at that time unknown to me. The kets were identical to the canonical coherent states, but the bras were not their Riesz duals; in the language of chapter 1, they belonged to a (generalized) frame reciprocal to that of the kets, and the resolution of unity was of the type given by eq. (24) in section 1.3, which may be called a continuous version of biorthogonality. A version of this result was reported at a conference in Marseille (Kaiser [1974]). I was later informed by J. R. Klauder that a similar representation had been developed by Dirac in connection with quantum electrodynamics (Dirac [1943, 19461). The original idea of complex spacetime as phase space was to consider a complex combination of the (symmetric) Lorentzian metric with the (antisymmetric) symplectic structure of phase space, obtaining a hermitian metric on the complex spacetime parametrized by local coordinates of the type 2 ibp. (I have since learned that this structure, augmented by some technical conditions, is known as a Kihler metric; see Wells [1980].) The above “wavepacket representation” indicated that this combination may in fact be interesting, but so far it was ad hoc and lacked a physical basis. Also, the representation was non-relativistic, and it was not at all clear how to extend it to the relativistic domain, as pointed out to me by V. Bargmann in 1975. The standard method of arriving at canonical coherent states is to use an integral transform with a Gaussian kernel in the configuration-space representation, and there is no obvious relativistic candidate for such a kernel. The more general methods described in chapter 3 do not work, since the representations of interest are not squareintegrable (section 4.3). An important clue came in 1974 from the study of axiomatic quantum field theory, where I
+
232
4. Complex Spacetime
was fascinated by the appearance of tube domains. These domains
occur in connection with the analytic continuation of vacuum expectation values of products of fields, and are therefore extensions of such products to complex spcetime. However, the complexified spacetimes themselves are not taken seriously as possible arenas for physics. They are merely used to justify the application of powerful methods from the theory of several complex variables, in order to obtain results concerning the restrictions of vacuum expectaion values to real spacetime. (However, the restrictions to Euclidean spacetime do have important consequences for statistical mechanics; see Glimm and Jai€e [1981].) I felt that if these tube domains could somehow be given a physical interpretation as extended classical phase spaces, this would give the phasespace formulation of relativistic quantum mechanics a firm physical foundation, since in quantum field theory the extension to complex spacetime is based on solid physical principles such as the spectral condition. This idea was first worked out at the level of non-relativistic quantum mechanics, leading to the representation of the Galilean group given in section 4.3. That amounted to a reformulation of the canonical coherent-state representation in which the Gaussian kernel appears naturally in the momentum representation, as a result of the analytic continuation of solutions of the Schrodinger equation. This “explained” the combination x ibp (section 4.3, eq. ( 5 ) ) and gave the coherent-state representation a dynamical significance. It also cleared the way to the construction of relativistic coherent states, since now the Gaussian kernel merely had to be replaced with the analytic Fourier kernel e-’*p on the mass shell. An important tool was the use of groups to compute certain invariant integrals, which I learned from a lecture by E. Stein on Hardy spaces in 1975. The construction of the relativistic coherent states given in sections 4.4 and 4.5 was carried out in 1975-76, culminating
+
Notes
233
in the 1977 thesis. Related results were announced at a conference in 1976 (Kaiser [1977a]) and at two conferences in 1977 (Kaiser [1977d, 1978bl). To my knowledge, this was the first successful formulation of relativistic coherent states, which have since then gained some popularity (see De Bikvre [1989], Ali and Antoine [1989]). An earlier attempt to formulate such states was made by PrugoveEki [1976], but this was shown to be inadequate since the proposed states were merely the Gaussian canonical coherent states in disguise, hence not covariant under the Poincark group (Kaiser [1977c], remark 4 in sec. 11.5 and addendum, p. 133.) After the results of the thesis appeared in the literature, PrugoveEki [1978; see also 19841 discovered that they can be generalized by replacing the invariant functions e - y p with arbitrary (sufficiently regular) invariant functions. The price of this generalization is that solutions of the Klein-Gordon equation are no longer represented by holomorphic functions and the close connection with quantum field theory (chapter 5 ) appears to be lost. The relation between the two formalisms and their history was discussed at a conference in Boulder in 1983 (Kaiser [1984b]), where an inconsistency in PrugoveEki’s formalism was also pointed out. The classical limit of solutions of the Klein-Gordon equation in the coherent-state representation was studied in Kaiser [1979]. In an effort to understand interactions, the notion of holomorphic gauge theory was introduced (Kaiser [1980a, 19811). This is reviewed in section 6.1. An early attempt was also made to extend the theory to the framework of interacting quantum fields (Kaiser [1980b]), but that was soon abandoned as unsatisfactory. A more promising approach was developed later (Kaiser [1987a]) and is presented in the next chapter. Note that our phase spaces o are not unique, since the configura-
234
4. Complex Spacetime
tion space S can be chosen arbitrarily as long as it is nowhere timelike and X > 0 can be chosen arbitrarily. The freedom in S is, in fact, related to the probability-current conservation, while the freedom in A, combined with holomorphy, allowed us to express the probability current as a regularization of the usual current by the use of Stokes’ theorem (section 4.5, eqs. (37) and (40)). By contrast, the phase spaces obtained by De Bikvre [1989] are unique. They are “coadjoint orbits” of the Poincark group, related to “geometric quantization” theory (Kirillov [1976], Kostant [1970],Souriau [1970]). Although this uniqueness seems attractive, it involves a high cost: the dynamics must be factored out. This means that the ensuing theory is no longer “local in time.” Since one of the attractions of coherent-state representations is their “pseudwlocality” in both space and momentum, and since in a relativistic theory time ought to be treated like space, it seems to me an advantage to retain time in the theory. Perhaps a more persuasive argument for this comes from holomorphic gauge theory (section 6.1), where a theory describing a free particle can, in principle, be “perturbed” by introducing a non-trivial fiber metric to obtain a theory describing a particle in an electromagnetic (or Yang-Mills) field. This cannot be done in a natural way once time has been factored out. Some very interesting work done recently by Unterberger [19881 uses coherent states which are essentially equivalent to ours to develop a pseudodifferential calculus based upon the Poincarh group as an alternative to the usual Weyl calculus, which is based on the Weyl-Heisenberg group. Since the Poincard group contracts to a group containing the Weyl-Heisenberg group in the non-relativistic limit (section 4.2), Unterberger’s “Klein-Gordon calculus” similarly contracts to the Weyl calculus.
235
Chapter 5 QUANTIZED FIELDS
5.1. Introduction We have regarded solutions of the Klein-Gordon equation as the quantum states of a relativistic particle. But such solutions also possess another interpretation: they can be viewed as classical fields, something like the electromagnetic field (whose components, in fact, satisfy the wave equation, which is the Klein-Gordon equation with zero mass). This interpretation is the basis for quantum field theory. The general idea is that just as the finite number of degrees of freedom of a system of classical particles was quantized to give ordinary (“point”) quantum mechanics, a similar prescription can be used to quantize the infinite number of degrees of freedom of a classical field. It turns out that the resulting theory implies the existence of particles. In fact, the asymptotic free in- and out-fields are represented by operators which create and destroy particles and antiparticles, in agreement with the fact that such creation and destruction processes occur in nature. These particles and antiparticles are represented by positive-energy solutions of the asymptotic free wave equation, e.g. the Klein-Gordon or Dirac equation. Thus the formalism of relativistic quantum mechanics appears to be, at least partially, absorbed into quant um field theory.
236
5. Quantized Fields
In regarding solutions of the Klein-Gordon equation as the physical states of a relativistic particle, it was appropriate to restrict our attention to functions having only positive-frequency Fourier components, since the energy of the particle must be positive. Even a small negative energy can be made arbitrarily large and negative by a Lorentz transformation, leading to instability. When the solutions are regarded as classical fields, however, no such restriction on the frequency is necessary or even justifiable. For example, in the case of a neutral field (i.e., one not carrying any electric charge), the solutions must be real-valued, hence their Fourier transforms must contain negative- as well as positivefrequency components. On the other hand, the analytic extension of the solutions to complex spacetime appeared to depend crucially on the positivity of the energy. We must therefore ask whether an extension is still possible for fields, or if it is even desirable from a physical standpoint, since the connection between solutions and particles is not as immediate as it was earlier. In this chapter we find an affirmative answer to both of these questions. A natural method, which we call the Analytic-Signal transto form, will be developed to extend arbitrary functions from R”+’ ( p + 1 , and when the functions represent physical fields, the double will be shown to have a direct physical tube 7 = 7+U 7-in significance as an extended classical phase space, not for the fields themselves but for certain “particle”- and “antiparticle” coherent states e$ associated with them. These states are related directly to the dynainical (interpolating) fields, not their asymptotic free in- and out-fields. To be precise, they should be called charge coherent states rather than particle coherent states, since they have a well-defined charge whereas, in general, the concept of individual particles does not make sense while interactions are present. If the given fields satisfy some (possibly non-linear) equations, the coherent states satisfy
5.1. Introduction
237
a Klein-Gordon equation with a source term. Hence they represent dynamical rather than “bare” particles. For free fields, e z reduces to the state e, defined in the last chapter and e; to its complex conjugate, which is a negativeenergy solution of the Klein-Gordon equation holomorphic in I-. Complex tube domains also appear in the contexts of axiomatic and constructive quantum field theory, and our results suggest that those domains, too, may have interpretations related to classical phase space, a point of view which, to my knowledge, has not been explored heretofore.* While our extended fields are not analytic in general, they are “analyticity-friendly,” i.e. have certain features which yield various analytic objects under different circumstances. For example, their two-point functions are piecewise analytic, and the pieces agree with the analytic Wightman functions. In the special case when the given fields are free, the extended fields themselves are analytic in 7. f i r thermore, the fields in general possess a directional analyticity which looks like a covariant version of analyticity in time. Since the latter forms the basis for the continuation of the theory (in the form of vacuum expectaion values) from Lorentzian to Euclidean spacetime (see Nelson [1973a,b] and Glimm and J&e [19Sl]), it may be that our extended fields, when restricted to the Euclidean region, bear some relation to the corresponding Euclidean fields. The formalism we are about to develop for fields is a natural
*
R. F. Streater has recently told me that G. Kiillkn was informally
advocating the interpret ation of the holomorphic Wightman twopoint function as a correlation function in phase space around 1957. Nothing appears to have been published on this, however.
5. Quantized Fields
238
extension of the one constructed for particles in the last chapter. Like its predecessor, it posesses a degree of regularity not found in the usual spacetime formalism. Some examples of this regularity are: (a) The extended fields $ ( z ) are, under reasonable assumptions,
operator-valued functions (rather than distributions, as usual) when restricted to 7. (b) The theory contains a natural, covariant ultraviolet damping which is a permanent feature of the theory. This comes from the possibility of working directly in phase space, away from real spacetime. From the point of view of the usual (red spacetime) theory, our formalism looks like a “regularization)). From our point of view, however, no regularization is necessary since, it is suggested, reality takes place in complex spacetime! In other words, this “regularization” is permanent and is not to be regarded as a kind of trick, used to obtain finite quantities, which must later be removed from the theory. (c) In the case of free fields, the formalism automatically avoids zero-point energies without normal ordering, due to a polarization of the positive- and negative frequency components into the forward and backward tubes, respectively. Observables such as charge, energy-momentum and angular momentum are obt ained as conserved integrals of bilinear expressions in the fields over phase spaces IJ c 7. These expressions, which are densities for the corresponding observables, look like regularizations of the corresponding expressions in the usual spacetime theory. The analytic (Wightman) two-point function acts as a reproducing kernel for the fields, much as it did for the wave functions in chapter 4. (d) The particles and antiparticles associated with the free Dirac
5.2. The Multivariate Analytic-Signal ?f.ansform
239
field do not undergo the random motion known as Zitterbewegung (Messiah [1963]),again because of the aforementioned polarization.
5.2. T h e Multivariate Analytic-Signal Transform
As mentioned above, in dealing with physical fields such as the elec-
tromagnetic field, rather than quantum states, we can no longer justify the restriction that frequencies must be positive. For one thing, as we shall see, in the presence of interactions there is no longer a covariant way to eliminate negative frequencies. Hence the method used in chapter 4 to analytically continue solutions of the KleinGordon equation to complex spacetime will no longer work directly.
In this section we devise a method for extending arbitrary functions from R"+l to (IF1 When . the given functions are positive-energy solutions of the Klein-Gordon or the Wave equation, this method reduces to the analytic extension used in chapter 4. But it is much more general, and will enable us to extend quantized fields, whether Bose or Fermi, interacting or free, to complex spacetime. We begin by formulating the method for functions of one variable, where it is closely related to the concept of analytic signals. For motivational purposes, we think of the variable as time (s = 0). In this chapter, Fourier transforms will usually be with respect to spacetime
(IR"')
than just space (IR"). Hence we will denote them by for the spatial Fourier transform, as done so far.
j,reserving f^
rather
Suppose we are given a "time-signal," i.e. a real- or complexvalued function of a single real variable x. To begin with, assume that f is a Schwartz test function, although most of our considerations will
5. Quantized Fields
240
extend to certain kinds of distributions. Consider the positive- and negative- frequency parts of f, defined by
f+(x)
= (27r)-l
d p e - i z p f(p)
I"1" 0
f-(x)
E
(27r)-1
(1)
dP cizp f(P).
Then f+ and f- extend analytically to the lower-half and upper-half complex planes, respectively, i.e.
J-00
f+ and f- are just the Fourier-Laplace transforms of the restrictions of
f
to the positive and negative frequencies.
If f is complex-valued, then f+ and f- are independent and the original signal can be recovered from them as
If f is real-valued, then f(P) =
hence f+ and
and
fw,
f- are related by reflection,
(4)
5.2. The Multivariate Andytic-Signal !Dansform
241
When f is real, the function f+(z) is known as the analytic signal associated with f(s). A complex-valued signal would have two independent associated analytic signals f+ and f- . What significance do fk have? For one thing, they are regularizations of f . The above equation states that f is jointly a boundary-value of the pair f+ and f-. As such, f may actually be quite singular while remaining the boundary-value of analytic functions. Also, fk provide a kind of “envelope” description of f (see Klauder and Sudarshan [1968], section 1.2). For example, if f(z) = cosux ( a > 0), then fk(z) = 1 exp(Tiaz), so the boundary values are fk(a) = $ exp(Tiaz). In order to extend the concept of analytic signals to more than one dimension, let us first of all unify the definitions of f+ and f- by defining
J--00
for arbitrary s - iy E
a, where 8 is the unit step function, defined by
Then we have
(9) [The apparent inconsistency f(a) = $f(s)for y = 0 is due to a mild abuse of notation. It could be removed by redefining f(z) by a factor of 2 or, more correctly but laboriously, rewriting it as ( S f ) ( z ) . We prefer the above notation, since the boundary-values f ( a ) will not actually be used in the phase-space formalism.] Let us define the exponentid step function by
5. Quantized Fields
242
so that our extension is given by 00
dp F
irp
f(p).
The identity
qu)q u I )
+
= qUu’) qU tt)
shows that Oc has the “pseudo-exponential” property
ec ec’
= e(%c %<‘)of+(‘
,
which will be useful later. Although this unification of f+ and f- may at first appear to be somewhat artificial, we shall now see that it is actually very natural. Note first of all that for any real u , we have
since the contour on the right-hand side may be closed in the lower half-plane when u < 0 and in the upper half-plane when u > 0. For u = 0, the equation states that
00
dr
1
(15)
in agreement with our definition, if we interpret the integral as the limit as L + 00 of the integral Trom -L to L. The exponential step function therefore has the integral representation
5.2. The Multivariate Analytic-Signal Tkansform
243
If this is substituted into our expression for f ( z ) and the order of integrations on r and p is exchanged, we obtain
for arbitrary x - i y E C. We shall refer to the right-hand side as the Analytic-Signal transform of f(z). It bears a close relation to the Hilbert transform, which is defined by 1 ( H f ) ( z )= -PV 7r
J_m_
f(z - 4 ,
where PV denotes the principal value of the integral. Consider the complex combinat ion
Jm -- *uf-( zi e- u ) = -1im JmL f ( x - r e ) T i €10 r-i = -1im I wi €10
--oo
= lim 2 f ( x - i e ) . €10
Similarly,
f(x) + i ( H f ) ( z )= lim 2 f ( x €10
+ ie).
Hence
+
( H f ) ( x )= -i lim [f(z i e ) - f(z - i e ) ] , €10
5. Quantized Fields
244
which, for real-valued f , reduces to
( H f ) ( z )= lim 2%f(a: el0
+
26)
= - lim 2 3 f(z
- i~).
€10
(22)
We are now ready to generalize the idea of analytic signals to an arbitrary number of dimensions.
Definition. Let f E S(Rs+').The Analytic-Signal transform of f is defined by
J 2x8 1
f(. - 2y) = -
O0
--oo
dr
-f(a: 7 - 8
- ry).
(23)
The same argument as above shows that
f(z) = (274-8-1
f"(P)
(24) =( 2 V - l
dp 8(yp) e-irp-Yp f " ( p ) .
We shall refer to the right-hand side of this equation as the (inverse) Fourier-Laplace transform off" in the half-space
Mar= { p E R"+'Iyp 2 0 ) .
(25)
The integral converges absolutely whenever f E L1(IR""), since 1 0 - i z P [ 5 1. Hence f(z) can actually be defined for some distributions, not only for test functions. The extension of the transform to distributions is complicated by the fact that fFirp is not a test function in the variable p , hence for a tempered distribution T,
5.2. The Multivariate Analytic-Signal lhnsform
245
(defined through the Fourier transform of 2') may not make sense as a function on as+'. It would be intersting to find a natural class of distributions for which T ( z ) does make sense as a function. In general, however, it may be necessary to consider distributions T such that T ( z ) is some kind of distribution on (I?+'. The solutions to both of these problems are unknown to me, so a certain amount of vagueness will be necessary on this point. In the next section, where T is a quantized field, it will be assumed to satisfy some physically reasonable conditions which imply that T ( z ) is well-defined in an important subset 7 of a'+'. Recall that for s = 0, f(z) was analytic in the upper- and lowerhalf-planes. In more than one dimension, f(z) need not be analytic, even though, for brevity, we still write it as a function of z rather than z and 2. However, f(z) does in general possess a partial analyticity which reduces to the above when s = 0. Consider the partial derivative of f ( z - iy) with respect to P,defined by
Then f is analytic at z if and only if our definition of f(z), we find that
arf
= 0 for all p. But using
The right-hand side is known as the X-Ray transform (Helgason [1984])of the function af/azp, given in terms of the parameters r and y defining the line X(T) = z - ~ y .It follows that the complex &derivative in the direction of y vanishes, i.e.
4
246
5. Quantized Fields
= 0)
if f decays for large x (e.g., if f is a test function). Equivalently, using
we have
so 47riaPf is the inverse Fourier transform of p P f " ( p in ) the hyperplane
Ny = { p E IR"l
I y p = O}
= aMy.
(32)
Hence, if the intersection of the support of f" with Ny has positive Lebesgue measure in N , , then f will not be analytic at x - iy in general. However, in any case,
in agreement with the above conclusion. In the one-dimensional case s = 0, this reduces to
5.2. The Multivariate Analytic-Signal ?l.ansforrn
247
which states that f(z) is analytic in the upper- and lower- halfplanes. The point is that in one dimension, there are only two imaginary directions (up or down), whereas in s 1 dimensions, every y # 0 defines a direction. This motivates the following.
+
Definition. Y = Y"z)
a,,.
Let Y ( z ) be a vector field of type (0,l) on a?+', i.e. Then a function f on Cs+' is holomorphic along Y if
Thus, our Analytic-Signal transform
Y ( z - Zy) = y p
a,.
f( z ) is holomorphic along
Note: The "functorial" way to look at f(z) is as an extension of f(z) to the tangent bundle T(IR'+'), Then the above states that f(z) is holomorphic along the fibers of this bundle. It makes sense that f ( z ) ought to satisfy some constraints, since it is determined by a function f(z) depending on half the number of variables. If is ) z - ~y would replaced by a differentiable manifold, the line z ( ~ = have to be replaced by a geodesic. This gives a generalized transform which can be used to extend functions on an arbitrary Riemannian or Pseudo-Riemannian manifold, such as a curved spacetime, to its # tangent bundle. The multivariate Analytic-Signal transform is related to the HiJbert transform in the direction y (Stein [1970], p. 49), defined as
5. Quantized Fields
248
1 (Hyf)(2)= -PV 7r
du J_, f(a: O0
U
UY),
Z,Y
E lR9+l, Y
# 0.
(36)
Namely, an argument similar to the above shows that
hence
+
(Hyf)(s) = -i lim [f(z icy) - f(z - icy)], €10
(38)
which, for s = 0, reduces to the previous relation with the ordinary Hilbert transform. As in the one-dimensional case, f(s)is the boundary-value of f( z ) in the sense that
+
f(z) = €-+O lim [f(s icy)
+ f(a: - icy)].
(39)
For real-valued f , eqs. (38) and (39) reduce to
Note: The Analytic-Signal transform is remarkable in that it com-
bines in a single entity elements of the Hilbert, Fourier-Laplace and X-Ray transforms. In fact, it is an example of a much more general transform which, furthermore, includes the Radon transform and an n-dimensional version of the wavelet transform as special cases! (Kaiser [1990b,c]) .
5.3.Axiomatic Field Theory and Particle Phase Spaces
249
5.3. Axiomatic Field Theory and Particle Phase Spaces
We begin with a study of general fields, i.e. fields not necessarily satisfying any differential equation or governed by any particular model of interactions. The Analytic-Signal transform defines (at least formally) a canonical extension of such fields to complex spacetime. In this section we show that for general quantized fields satisfying the Wightman axioms (Streater and Wightman [1964]), the proposed extension is natural and interesting. In particular, the double tube domain 7 = 7+U 7- (where y E V& in 7k) can be interpreted as an extended classical phase space for certain “particle”- and “antiparticle” coherent states naturally associated with the quantized fields. Although the extended fields are in general not analytic (they need not even be functions, only ditributions), they are “analyticityfriendly” in the sense that various objects associated with them are analytic functions. Consequently, they are more regular than the original fields, which are their boundary values. In particular, we will see that under some reasonable assumtions, the extended fields are operator-valued functions (rather than distributions) when restricted to 7. The extensions of free fields and generalized free fields are, moreover, weakly holomorphic in 7. Recall (chapter 4) that a system with a finite number of degrees of freedom, say a free classical particle, can be quantized in at least two ways: by replacing the position-and momentum variables with operators satisfying the canonical commutation relations (this is called canonical quantization), or by considering unitary irreducible representations of the underlying dynamical symmetry group (92 for non-relativistic particles and Po for relativistic particles). The second scheme seems to be more natural, especially in the relativistic
5. Quantized Fields
250
case, where position variables are not a covariant concept. But the first scheme has the advantage that interactions (potentials) can be introduced more easily, since it generates a kind of functional calculus for operators . Both schemes have counterparts in field quantization. In canonical quantization, the particle position is replaced with the initial configuration of the field $, i.e. with the values of
d(z) at all space points
x at some initial time xo = t ; its momentum, according t o Lagrangian
field theory, then corresponds to the time-derivative of the field at time t. Thus the “phase space’’ of the field is simply the set of all possible initial data on the Cauchy surface 2 0 = t in real spacetime. Quantization is implemented by requiring the initial field and its conjugate momentum to satisfy an infinitedimensional generalization of the canonical commutation relations. Although this is the standard approach to field quantization in the physics literature, its physical and mathematical soundness is open to question. Whereas the two schemes are equivalent when applied to non-relativistic quant urn mechanics, it is not clear that they remain equivalent when applied to field quantization, in general. However, they are equivalent for the quantization of free fields. We will apply canonical quantization to the free Klein-Gordon and Dirac fields in sections 5.4 and 5.5. The second approach to field quantization is subsumed in the so-called axiomatic approach to quantum field theory. The unitary representations of Po under which the fields transform are no longer irreducible. This is a consequence of the fact that while Po is transitive on the phase space of a classical particle (i.e., any two locations, orientations and states of motion can be transformed into one another), it is no longer transitive on the phase space of a classical field (two sets of initial conditions need not be related by
Po).The
reducibility of the representation corresponds to the presence of an in-
5.3.Axiomatic Field Theory and Particle Phase Spaces
251
finite set of degrees of freedom or, at the particle level, to an indefinite number of particles. (Even two particles would result in reducibility, since the Lorentz norm of their total energy-momentum can be arbitrarily large.) Consequently, the representation is now characterized by an infinite number of parameters and no longer uniquely determines the theory, as it did (for a given mass and spin) when it was irreducible. On the other hand, the label space for the configuration observables, which for N particles in Eta was { 1,2,3, ,s N } , is now IR’,and Lorentz invariance means that the “labels” x mix with the time variable. This gives an additional structure to quantized fields not shared by ordinary quantum mechanics, and some of this structure is codified in terms of the Wightman axioms, partly making up for the indeterminacy due to the reducibility of the representation.
---
For simplicity of notation we confine our attention to a single scalar field. The results of this section extend to an arbitrary system of scalar, spinor or tensor fields. Thus, let d(z) be an arbitrary scalar quantized field. “Quantized” means that rather than being a real- or complex-valued function on spacetime (like the components of the classical electromagnetic field), d(s) is an operator on some Hilbert space ‘H. Actually, as noted by Bohr and Rosenfeld [1950],quantized fields are too singular to be measured at a single point in spacetime. This led Wightman to postulate that 4 is an operator-valued distribution, i.e., when smeared over a test function f(s)it gives an operator $(f) (unbounded, in general) on ‘H. The axiomatic approach turns out to provide a surprisingly rich mathematical framework common to all quantum field theories; therefore, it is model-independent. It was followed by constructive quantum field theory, in which model field theories are built and shown to satisfy the Wightman axioms. For simplicity, we state the axioms
5. Quantized Fields
252
in terms of the formal expression $(z) rather than its smeared form we assume, to begin with, that the field 4 is neutral, which means that the operators +(x) (or, rather, their smeared forms +(f) with f real-valued) are essentially self-adjoint. Charged fields, which are described by non-Hermitian operators, and spinor (Dirac) fields, will be considered later. The axioms are:
4(f). Also,
1. Relativistic Invariance: PoincarC transformations are implemented by a continuous unitary representation U of Po acting on the Hilbert space 3-1of the theory. (For particles of half-integral spin, such as electrons, POmust be replaced by its universal cover; see section 5.5.) Thus,
4(Ax
+ a ) = U ( a ,A) 4(x) U ( a ,A)*.
(1)
Let P p and M p v be the self-adjoint generators of spacetime translations and Lorentz trnsformations, respectively. They are interpreted physically as the tot a1 energy-momentum and angular momentum operators of the field -or, more generally, of the system of (possibly coupled) fields. They satisfy the commutation relations of the Lie algebra g of PO.In particular, the Pp's must commute with one another, thus have a joint spectrum C c R"". (Actually, C is a subset of the dual (IRs+l )* of IRs+l, which we may identify with RS+lusing the Minkowski metric; see section 1.1,) 2. Vacuum: The Hilbert space contains a unit vector !Po, called the vacuum vector, which is invariant under the representation U , i.e. U ( u , A ) Q o = !POfor all ( u , A ) E PO.90is unique up to a constant phase factor, and it is cyclic, meaning that the set of
5.3. Axiomatic Field Theory and Particle Phase Spaces
253
all vectors of the form
spans 'FI. Invariance under POmeans that \ko is a common eigenvector of the generators Pp and M p v with eigenvalue zero:
3. Spectral Condition: The joint spectrum E of the energy-momentum operators Pp is contained in the closed forward light cone: c CV+. (4)
This axiom follows from the physical requirement of stability, which merely states that the energy is bounded below. To see this, note first of all that C must be invariant under LO,by Axiom 1. Hence, if any physical state had a spectral component with p # that component could be made to have an arbitrarily large negative energy by a Lorentz transformation. Note that the existence and uniqueness of the vacuum means that C contains the origin in its point spectrum with multiplicity one.
v+,
4. Locality: The field operators $(x) and b ( d ) at points with spacelike separation commute, i.e.,
This corresponds to the physical requirement that measurements of the field at points with spacelike separations must be independent, since no signal can travel faster than light. (For fields of
5. Quantized Fields
254
half-integral spin, such as the Dirac field treated in section 5.5, commutators must be replaced with anticommutators.) 5. Asymptotic Condition: As the time 20 + f o o , the field $(x) has weak asymptotic limits
are free fields of mass m > 0, i.e. they satisfy the and Klein-Gordon equation: &n
Physically, this means that in the far past and future, all particles are sufficiently far apart to be decoupled, and that 4 interpolates the in- and out- fields. Furthermore, the Hilbert spaces on which and $ operate all coincide:
In addition to the above, there are axioms concerning the regularity of the distributions $(x) (they are assumed to be tempered, i.e. the test functions belong to S(lFt8+l)) and the domains of the smeared operators q5( f), which we use implicitly, and clustering, which we will not use here. Let us now draw some conclusions from these axioms. Since the energy-momentum operators generate spacetime translations, it follows from eq. (1) that
5.3.Axiomatic Field Theory and Particle Phase Spaces For the Fourier transform &I),
this means that
[i(P),P,,I= P, &PI* Consequently for any p E
IR”’, the “vector” @p 3
i(-P)Qo
(which is in general non-normalizable) satisfies
P,@p = P, i(-P)IQo = P,%’ That is, @, is a generalized eigenvector of energy-momentum with eigenvalue p , unless it vanishes. The spectral condition therefore requires that
where C1 is the intersection of the spectrum C with the support of More generally, if Q, is any generalized eigenthe distribution vector of energy-momentum with eigenvalue p E C, then the above commutation relations show that
4.
P,i(P’)
Q ,
= (P,
- P:) &P/)*,,
(14)
thus either $ ( p ’ ) Q , vanishes or it is a generalized eigenvector of energy-momentum with eigenvalue p - p’, a necessary condition for which is that p - p‘ E C. Thus we conclude that the operator &I), when it does not vanish, removes an energy-momentum p from the field. This establishes the physical significance of the field operators, as well as the connection between the (mathematical) Fourier variable p and the (physical) energy-momentum operators P,,.From the Jacobi identity, it follows that
5. Quantized Fields
256
showing that [&p))&p')] (if not zero) removes a total energy-momentum p p' from the field. The cyclic property of the vacuum,
+
furthermore, implies that the entire spectrum C is generated by reto the vacuum. Note that in general we peated applications of
6
cannot draw any conclusions about the support of &). For example, &p) need not vanish for spacelike p , since C may contain points p' such that p' - p E C. In fact, very strong conclusions can be drawn from the nature of the support of
6. From Lorentz invariance and
&p)* = &-p) it follows that &p) = 0 if and only if &p') = 0, where p' = kAp for some A E Lo. We conclude that the support of
4 must
be a union of sets of the form
which are, in fact, the various orbits of the full Lorentz group
L.
Greenberg [1962] has shown that q5 is a generalized free field (i.e., a sum or integral of free fields of varying masses m 2 0) if vanishes
6
on any of the following types of sets:
5.3. Axiomatic Field Theory and Particle Phase Spaces
257
He has also shown, by giving counter-examples, that this conclusion cannot be drawn if
4 vanishes on sets of the type
(See also Dell’Antonio [1961]and Robinson [1962].) Note: Up to now, we have not assumed that the field satisfies the canonical commutation relations (section 5.4), hence our conclusions are quite general and should hold for an arbitrary (system of mutually) interacting field(s). The “Lie algebra” generated by the fields (obtained by including, along with the fields, their commutators [&p),&p’)] as well as higher-order commutators) has a very interesting formal structure, although it is not a Lie algebra in the usual sense. (For one thing, it is uncountably infinitedimensional rather than finite-dimensional.) Namely, the above relations suggest that the operators Pp be regarded as belonging to a Cartan subalgebra and that &p) (with p in the support of
4) is a root vector with as-
sociated root - p . The spectrum C is therefore reminiscent of a set
A+ of positive roots. In general, the Cartan subalgebra consists of a maximal set of commuting observables. When considering charged
fields, the charge will also belong to the Cartan subalgebra, with root values 0, kc, f 2 c , . . ., where
E
is a fundamental unit of charge. The
vacuum is a vector (not in the Lie algebra but in an associated representation space) of “highest” (or lowest) weight which, as in the finite-dimensional case, generates a representation of the algebra because of its cyclic property. To my knowledge, this important analogy between the structures of general quantized fields (i.e., apart from the canonical commutation relations or any particular models of interac-
5. Quantized Fields
258
tions) and Lie algebras has not been explored, although the methods of Liealgebra theory could add a powerful new tool to the study of quantized fields. (In a somewhat different context, the structures of quantized fields and infinitedimensional Lie algebras are united in string theory; see Green, Schwarz and Witten [1987].) # Let us now formally extend the quantized field #(z) to V+l, using the Analytic-Signal transform developed in section 5.2. Recall that this transform was originally defined for Schwartz test functions. In principle, we would like to define 4 ( z ) by using its distributional Fourier transform &p): $ ( z ) = (274-8-1
_i(e
-iZp
dp
O-;’P
&p)
).
This presents us with a technical problem, as already noted in the last section, since P i Z P is not a Schwartz test function in p . One way out is to smear 4 ( z ) with a test function f ( z ) over (IF1. Although this is the safest solution, it is not very interesting since not much appears to have been gained by extending the field to complex spacetime: the new field is still an operator-valued distribution. However, we shall see that there are reasons to expect 4 ( z ) to be more regular than a “generic” Analytic-Signal transform, due in part to the fact that 4(z) satisfies the Wightman axioms. When 4 is a (generalized) free field, the restriction of $ ( z ) to the double tube 7 turns out to be a holomorphic operator-valued function. We will see that even In for general Wightman fields, 7 is an important subset of the presence of interactions, holomorphy is lost but some regularity in 7 is expected to remain. We now proceed to find conditions which do not force d, to be a generalized free field but still allow $ ( z ) to
5.3. Axiomatic Field Theory and Particle Phase Spaces
259
be an operator-valued function on 7.The arguments given below have no pretense to rigor; they are only meant to serve as a possible framework for a more precise analysis in the future. All statements and conditions concerning convergence, integrability and decay of operator-valued expressions are meant to hold in the weak sense, i.e. for matrix elements between fixed vectors. Since the operators involved are unbounded, we must furthermore assume that the vectors used to form the matrix elements are in their (form) domains. For a fixed timelike “temper” vector y, P ’ P fails to be a Schwartz test function in two distinct ways: (a) It has a discontinuity on the spacelike hyperplane Ny = { p I yp = 0}, and (b) it has a const ant modulus on hyperplanes parallel to Nar, hence cannot decay there. On the other hand, by relativistic covariance, the support of must be smeared over the orbits of L,given by eq. (16). This gives a “stratification” of as a sum of tempered distributions
7
7
i = i++ $0
f6oo
+ 6-
with support properties
the distributions Although O+ and f2- contain 00,
i+ and 6-
have no contributions from p 2 = 0. (For example, the distribution
1+6(p) on R has a decomposition 2’1
+TOwhere TI has support IR and
5. Quantized Fields
260
To has support (0)
c IR,but
the p = 0 contribution t o TI vanishes.)
Similarly, although Ro contains
5200, 6 0
has no contribution from
p = 0. Corresponding to the above decomposition of formally,
6,we have,
1. $+(z) and $ o ( z ) are holomorphic operator-valued functions in 7; 2. $ O O ( Z ) must be a constant field, to be physically reasonable; and 3. under certain (hopefully not too restrictive) conditions, $ - ( z ) , though not holomorphic, is an operator-valued function on 7. First of all, we claim that each of the fields (a = k,O,OO) is still covariant under Po.* To see this, take the Fourier transform of eq. (1)and use the invariance of Lebesgue measure d p under
LO.
This leads to &p) = eiap U ( U , A) & ~ p )U ( U , A)*,
(23)
where A‘ is the transpose of A E Lo. Since the different components are “essentially” supported on disjoint subsets and these subsets are
ia.
invariant under LO,we conclude that eq. (23) holds for each To show that 4+(z) and 4 0 ( z ) are holomorphic in 7+, let z E 7+,
-
Since P i r p vanishes for p E V-\{O}, and since contribution from p = 0, we have
*
However,
$a
$+
and
60
have no
need not satisfy other Wightman axioms such as
locality; for example, $+ need not be a generalized free field.
5.3.Axiomatic Field Theory and Particle Phase Spaces
261
v+
Now e - i z P may be regarded as the restriction to of a Schwartz test function fz(p) which is of compact support in p for each fixed po and vanishes when po < -E, for some E > 0. Thus $ + ( z ) and $ o ( z ) make sense as operator-valued functions on 7+, and they are cleary holomorphic there. (This will be shown explicitly below.) A similar analysis shows that the same can be done for z E 7-. Next, we consider ~ o o ( z ) Since .
400
is supported on {p = 01, it
must have the form P ( d )6 ( p ) , where P ( d ) is a partial differential operator. In the z-domain (i.e., in real spacetime), this corresponds to
P( -iz), for which the Analytic-Signal transform is not well-defined, although it is possible that a regularization procedure would cure this. But in any case, non-constant polynomials in x do not appear to be of physical interest since they correspond to unbounded fields even in the classical sense (as functions of x). Hence we assume that
(a) 4 0 0 ( p )= 2 A 6 ( p ) , where A is a constant operator. This corresponds to a constant field $(z)
= 2A and, correspondingly,
A (see eq. ( 1 5 ) in section 5.2). In order that + ( z ) be an operator-valued function on 7, it therefore remains only for q5-(z) ~ O O ( Z3 )
to be one. Note that so far, the only assumption we needed to make, in addition to the Wightman axioms, was (a). To make $ - ( z ) a function, we now make our second assumption:
(b) i - ( p ) is integrable on all spacelike hyperplanes. firthennore, the integral of over the hyperplane HY,” ZE {p I yp = Y } ( y E V’) grows at most polynomially in u.
4
It is not clear what specific minimal conditions on
4-
produce this
property. The integral occurring in (b) is known a.s the Radon trans-
5. Quantized Fields
262 form (Rd)(y,v ) of
6 when y is a (Euclidean) unit vector, and will
be further discussed in section 6.2. (See also Helgason [1984].)Unlike the Fourier transform, the Radon transform does not readily generalize to tempered distributions (which were, after all, designed specifically for the Fourier transform). However, it does extend to distributions of compact support and can be further generalized to distributions with only mild decay. Also, the relation of assumptions
(a), (b) (or their future replacements, if any) to the Wightman axioms needs to be investigated. In order to compute 4 - ( z ) for z E 7 it suffices, by covariance, to do so for z = 0 and y = ( u , O ) , for all u # 0. The analyses for u > 0 and u < 0 are similar, so we restrict ourselves to u > 0. Eq. (19) then gives
$ L ( - i u , 0) = (24-S-1 For fixed po 2 0, condition (b) implies that the integral over p converges, giving an operator-valued function F ( p 0 ) which is of at most polynomial growth in p o . 4-( -iu, 0 ) is then the Laplace transform
of F(po), which is indeed well-defined. Note: The behaviors of 4 ( z ) and &p) exhibit a certain duality which reflects the dual nature of y E R”+l and p E (lR’+’)*(section 1.1). We have just seen that when &p) behaves reasonably for spacelike p , then $ ( z ) behaves reasonably for timelike y. In the trivial case when &p) G 0 for p 2 < 0 and (a) holds, 4 ( z ) is holomorphic for y2 > 0. In fact, 4 is then a generalized free field, hence may be said to be “trivial.” This dual behavior also extends to p 2 2 0 and y2 < 0: For any non-constant field, & p ) is non-trivial for p 2 2 0; for spacelike y, the hyperplane H,,,(which contains timelike as well as spacelike
5.3. Axiomatic Field Theory and Particle Phase Spaces
263
6
directions) therefore intersects the support of in a non-compact set and we do not expect $ ( z ) to make sense as an operator-valued function outside of 7. # No claims of analyticity can be made for 4 ( z ) in general. In fact, Greenberg’s results show that d ( z ) may not be analytic anywhere in unless q5 is a generalized free field. For as in the classical case, formal differentiation with respect to 2, gives
hence 4xi8, $ is the inverse Fourier transform of p,J in the hyperplane Ny.If $ is not a generalized free field, then, according to Greenberg, the support of contains sets of timelike as well as sets of spacelike p’s with positive Lebesgue measure. Hence, for any nonzero y E IRS+l, the intersection of the support of with N yhas positive measure in ivy, so 9 will not be holomorphic at 3 - iy in general. As in the classical case, however, the above equation for 8,d implies that $ ( z ) is holomorphic along the vector field y, i.e.,
4
6
This is a covariant condition which, when specialized to y = (yo, 0), states that $ ( z ) is holomorphic in the complex timedirection. As we have seen, this result simply follows from the nature of the AnalyticSignal transform. A similar situation forms the basis of Euclidean quantum field theory. However, there one is dealing not directly with the field but with its vacuum expectation values, and the mathematical reason for the analyticity is the spectral condition, which would appear to have little in common with the Analytic-Signal transform.
264
5. Quantized Fields
Incidentally, eq. (26) provides a simple formal proof that 4 + ( z ) is holomorphic in 7. The support of i + ( p ) is contained in hence for any y in V’, its intersection with N , is either empty or equal to Roo. But the contribution from p = 0 vanishes, hence eq. (26) shows that aPq5+(z) = 0 in 7. The same argument also shows that free fields and generalized free fields are holomorphic in 7. In the next two sections we shall study free Klein-Gordon and Dirac fields in more detail.
v,
Although d ( z ) is not holomorphic in general, we will be able to establish for it one essential ingredient of the foregoing phasespace formalism, namely the interpretation of the double tube 7 as an extended classical phase space for certain “particles” and “antiparticles” associated with the quantized field 4. First, let us expand the above considerations to include charged fields by allowing $(z) to be non-Hermitian (i.e., a non-Hermitian operator-valued distribution). Then the extended field $( z ) need not satisfy the reflection condition $(z)* = r j ( Z ) . The charge Q is defined as a self-adjoint operator which generates overall phase translations of the field, i.e., e -:a9
+(.>
eiaQ
= eiac
4(4
(28)
for real a, where e is a fundamental unit of charge. This implies
showing that $(x) and &)I remove a unit e of charge from the field, while their adjoints add a unit of charge. We assume that phase translations commute with Poincark transformations, and in particular with spacetime translations. Thus
5.3. Axiomatic Field Theory and Particle Phase Spaces
265
so charge is conserved. Q can be included in the “Cartan subalgebra”
containing the P,’s,and the above commutation relations show that &)I and &p)* are still “root vectors,” with Q-root values --e and E , respectively. We also assume that the vacuum is neutral, i.e. QQO = 0. Repeated applications of r$ and J* to 90show that the spectrum of Q is (0, f ~ f 2,~. ., .}. Recall that the commutation relation between &)I and P, implied that &p) removes an energy-mometum p from the field. Similarly, its adjoint relation
shows that &I)* adds an energy-momentum p to the field. In place of the generalized eigenvectors 9, of energy-momentum which we had for the Hermitian field, we can now define two eigenvectors,
for each p E El. For a non-Hermitian field, these vectors are independent. They are states of charge e and -&, respectively. We may think of them as particles and antiparticles, although they do not have a well-defined mass since p 2 will be variable on C1,unless 4 is a free field. Each p # 0 in C1 belongs to the continuous spectrum of the P,’s, since it can be changed continuously by Lorentz transformations. Hence the “vectors” 9(: are non-normalizable. Since the and @; belong to different eigenPp’s are self-adjoint, and since values of the charge operator (which is also self-adjoint), we have
5. Quantized Fields
266
(with the usual abuse of Dirac notation, where “inner products” of distributions are taken)
where a, a distribution with support in C1, depends only on p 2 by Lorentz invariance. (Charge symmetry requires that o be the same for particles as for antiparticles.) If q5 is the free field of mass m > 0, a ( p2 ) - e ( p o ) 27r6(p2 - m 2 )= 27r6(p2 - m 2 )in V+\{O}. Now define the particle coherent states by
these do not have a well-defined mass; in addition, Like the @s’:, they are wave packets, i.e. have a smeared energy-momentum, but they still have a definite charge
6
. Their spectral components are
given by
7-,then y p < 0 on V+\{O},hence e$ = 0. If z belongs to the forward tube 7+,then y p > 0 and the
If
z belongs to the backward tube
vector e$ is weakly holomorphic in 2. For the free field, it reduces to the coherent state e , defined in chapter 4. Similarly, define the coherent antiparticle states by
5.3.Axiomatic Field Theory and Particle Phase Spaces
These are wave packets of charge
-E
267
for which
Thus e l vanishes in the forward tube and is weakly holomorphic in the backward tube. In the usual formulation of quantum field theory, particles are associated not directly with the interacting, or interpolating, field but with its asymptotic fields q5jn and dout,which are free. (We will construct such free-particle coherent states in the next two sections.) However, the coherent states e$ are directly associated with the interpolating field. We shall refer to them as interpolating particle coherent states (section 5.6). We are now ready to establish the phase-space interpretation of
7 in the general case. We will show that 7+ and 7-are extended phase spaces associated with the particle- and antiparticle coherent states e$ and e;, respectively, in the sense that they parametrize the classical states of these particles. We first discuss x as a “position” coordinate. In the case of interacting fields there is no hope of finding even a “bad” version of position operators. Recall that position operators were in trouble even in the case of a one-particle theory without interactions! In the general case of interacting fields, this problem becomes even more
5. Quantized Fields
268
serious, since one is dealing with an indefinite number of particles which may be dynamically created and destroyed. (As argued in section 4.2, the generators MOkof Lorentz boosts qualify as a natural, albeit non-commutative, set of center-of-mass operators; although I believe this idea has merit, it will not be discussed here.) Since no position operators are expected to exist, we must not think of 3 as eigenvalues or even expectation values of anything, but rather simply as spacetime parameters or labels. On the other hand, y will now be shown to be related to the expectations of the energy-momentum operators (which do survive the transition to quantum field theory, as we have seen). For z,z' E 7+,we have
-
where we have set m2 (27r)-'-'
27ri
L-dm' u(m2)A+(*' - Z;m),
p 2 and used
dp = (27r)-'-l dpo d s p = (27r)-' dm2dp",
(39)
with
dp" = [2(27r)'
drn1-l d'p
(40)
the Lorentz-invariant measure on 52h. A + ( w ; m )is the two-point function for the free Klein-Gordon field of mass m, analytically continued to w z' - Z E I+. In the limit y , ~ '+ 0, this gives the
=
5.3.Axiomatic Field Theory and Particle Phase Spaces
269
KdGn-Lehmann representation (Itzykson and Zuber [1980])for the usual two-point function,
(9
0
I4 ( 4 4 ( 4 * Q o
)=
&
dm2 c(m2A ) + ( d - s; m ) , (41)
which is a distribution. In Wightman field theory, such vacuum expect ation values are analytically continued using the spectral condition, and conclusions are drawn from these analytic functions about the field in real spacetime. In our case, we have first extended the field (albeit non-analytically), then taken its vacuum expectation values (which, due to the spectral condition, are seen to be analytic functions, not mere distributions). The fact that we arrived at the same result (i.e., that “the diagram commutes”) indicates that our approach is not unrelated to Wightman’s. However, there is a fundamental difference: The thesis underlying our work is that the “red” physics actually takes place in complex spacetime, and that there is no need to work with the singular limits y -+ 0. The norm of
e t is given by
where G(y;m), computed in section 4.4, is given by
fl,
Recall that A (s - 1 ) / 2 and IC, is a modified Bessel v function. We assume that e$ is normalizable, which means that the spectral density function a ( m 2 )satisfies the regularity condition
5. Quantized Fields
270
= F ( X )<
00.
(This condition is automatically satisfied for Wightman fields, where it follows from the assumption that is a tempered distribution; however, it is also satisfied by more singular fields since K, decays exponentially.) It follows that
aF(X) - _ -1 2 ay,
(45)
= -y, F’( X)/2X.
Using the recursion relation (Abramowitz and Stegun [1964])
we find that the state e$ has an expected energy-momentum
where
F / (A) mA = -2F(X) -
dm2a(m 2 )mV+lKv+l(2Xm) . JOm dm2a ( m 2 )mu K,(2Xm)
(48)
We call mA the effective mass of the particle coherent states; it generalizes the corresponding quantity for Klein-Gordon particles (section 4.4). The name derives from the relation
5.3. Axiomatic Field Theory and Particle Phase Spaces
( P )z~( P ~ ) ( P=mi. ~ )
271
(49)
It is important to keep in mind that in quantum field theory, the natural picture is the Heisenberg picture, where operators evolve in spacetime and states are fixed. Recall that for a free Klein-Gordon particle (chapter 4),we interpreted e, as a wave packet focused about the event z E Rz and moving with an expected energy-momentum (mx/X)y. This suggests that the above states ef be given a similar interpretation. Thus z becomes simply a set of labels parametrizing the classical states of the particles. This establishes the interpretation of 7+as an extended classical phase space associated with the “particle” states e.:
A similar
computation shows that 7.acts as an extended classical phase space for the “antiparticle” states e;
, whose expected energy-momentum
is
The expected angular momentum in the states e$ can be computed similarly. The angular momentum operator MpY is the generator of rotations in the p-v plane, hence
This implies for the Fourier transform
Since the vacuum is Lorentz-invariant, we have M,,\ko for z E 7+,
= 0, hence
5. Quantized Fields
272
v+
provided that &p) vanishes on the boundary of (this excludes massless fields). The expectation of Mhv is therefore related to that
of P, by
Similarly, in e ; with z E ‘TL,
This section can be summarized by saying that the vector y plays a similar role for general quantized fields as it did for positive-energy solutions of the Klein-Gordon equation, namely it acts as a control vector for the energy-momentum. In other words, the function
9(yp) e--YP acts as a window in momentum space, filtering out from each mass shell R, momenta which are not approximately parallel to y. The step function 8(yp) makes certain that only parallel components of p pass through this filter by eliminating the antiparallel ones (which would make the integrals diverge). Thus we may think of B(yp) e--YJ‘ as a kind of “ray filter” in when y E V‘. We continue to refer to y as a temper vector (section 4.4).
v,
5.4. n e e Klein-Gordon Fields
273
Note: The regularity condition given by eq. (44) for o ( m 2 ) ,i.e. the requirement that e: be normalizable, shows that X acts as an effective ultraviolet cutoff, since K42Xm) decays exponentially as m + 00, giving finite values to mx and other quantities associated with the field.
#
5.4. Free Klein-Gordon Fields In the context of general quantum field theory, we were able to show that 7 plays the role of an extended phase space for certain “particle” states of the fields. The question arises whether the phase-space formalism of chapter 4 can be generalized to quantized fields. There, we saw that all free-particle states in the Hilbert space could be reconstructed from the values of their wave function on any phase space u c 7+. There are two possible ways in which this result might extend to quantized fields: (a) The vectors e$ belong the subspaces ‘&I with charge &E, hence we may try to get continuous resolutions, not of the identity on 3-1 but of the orthogonal projection operators to ‘I&,in terms of these vectors. (This can then be generalized to the resolution of the projection operator II, to the subspace
?in with charge m, n E Z:) (b) The global observables of the theory, such as the energy-momentum, the angular momentum and the charge operators, are usually expressed as conserved integrals of the field operators and their derivatives over an arbitrary configuration space S in spacetime (i.e., an s-dimensional spacelike submanifold of Rssl); our approach would be to express them as integrals of the extended fields over 2s-dimensional phase spaces cr in 7, much as the inner products of positive-energy solutions were expressed as
274
5. Quantized Fields
such integrals. In this section we do both of these things for the free Klein-Gordon field of mass m > 0, which is a quantized solution of
We consider classical solutions at first. The Fourier transform &p) has the form
for some complex-valued function a ( p ) defined on the two-sheeted mass hyperboloid flm= fl$ U 0,. Write
If the field is neutral, then #(z) is real-valued and b ( p ) f a(p). For charged fields, a(p) and b ( p ) are independent. At this point, we keep both options open. Then
The extension of +(z) to complex spacetime given by the AnalyticSignal transform is
5.4. R e e Klein-Gordon Fields
If
y E
275
V;, then yp > 0 for all p E Qk,hence
is analytic in 7+, containing only the positive-frequency part of the field. Similarly, when y E Vl, then y p < 0 for all p E Qk and =
J,, dfi eiZpb(p),
z
E T-.
(7)
Thus + ( z ) is also analytic in 7-, where it contains only the negativefrequency part of the field. However, note that the two domains of analyticity 7+and 7-do not intersect, hence the corresponding restrictions of + ( z ) need not be analytic continuations of one another. We are now ready to quantize $ ( z ) . This will be done by first quantizing $(x) and then using the Analytic-Signal transform to extend it to complex spacetime. We assume, to begin with, that $ is a neutral field, so b(p) a ( p ) . According to the standard rules (Itzykson and.Zuber [19SO])of field quantization, $(x) becomes an operator on a Hilbert space ‘FI such that at any fixed time 2 0 ,the field ‘‘configuration” operators $(5 0 , x) and their conjugate “momenta” &$( 50,x) obey the equal-time commutation relations
5. Quantized Fields
276
This is an extension to infinite degrees of freedom of the canonical commutation relations obeyed by the quantum-mechanical positionand momentum operators. Note that since time evolution is to be implemented by a unitary operator, the same commutation relations will then hold at any other time as well. For the non-Hermitian operators a ( p ) , the corresponding commutation relations are
for p,p' E R,. Using the neutrality condition a ( - p ) = a(p)*, these can be rewritten in their conventional form
A charged field can be built up from a pair of neutral fields as
where the two fields c$l,q52 each obey the equal-time commutation relations and commute with one another. Equivalently, the operators a ( p ) and b(p) become independent and satisfy
W>l= [W, b(P')l = [W,b(P')l
[a(p),
[a(p),a*(p')I = [ b ( P ) ,b*(P')l =
= 0,
( 2 4 " S(P - P')
(12)
for p,p' E R i . The canonical commutation relations for both neutral and charged fields can be put in the manifestly covariant form
5.4. n e e Kfein-Gordon Fields
277
W),i ( P ' ) * ] = 47r2 6(p2 - m 2 )w2- m 2 )[a(p),a(p')*] = (2~>"'2po B(p0pb) 6(p2 - m 2 )6 ( ~ -' p~ 2 ) S(p - p') = (27r)8+22poq p o p ; ) 6(p2 - m 2 )6(pb2 - p i ) S(p - p') = sign(p0) ( 2 ~ ) 6(p2 ~ + ~ m 2 )~ ( -pp')
(13) for arbitrary p,p' E mented by
IRS+l.
For charged fields, this must be supple-
Recall that in the general case we had
(
q I @pi
) = 0 ( p 2 )(2T)"+l S(p - p')
(15)
v+,
for p,p' E where a ( p 2 )is the spectral density for the two-point function (sec. 5.3, eq. (33)). For the free field now under consideration we have
(
q I 'pp") = ( Qo I 6(P)i(P')* Qo ) I [W &)')*I
) = (27r)5+26(p2 - m 2 )6(p - p'), = ( Qo
?
Qo
(16)
which shows that the spectral density for the free field is 0 ( p 2 )= 27r6(p2 - m2).
The spectral condition implies that
(17)
5. Quantized Fields
278
since otherwise these would be states of energy-momentum -p. Hence the particle coherent states defined in the last section are now given by
where the vectors 6: are generalized eigenvectors of energy-momenturn p E
with the normalization
-+ I @+,p
(@p
I 42-4.(p')*Qo 1
) =( =(9
0
I
[+)l
.(P')*IQO )
(20)
= 2w(27r)s S(p - p').
The wave packets e$ span the one-particle subspace 'FI1 of 'FI and have the momenturn represent at ion ._ ( aP -+ I e+, ) = carp.
(21)
A dense subspace of 'HI is obtained by applying the smeared operators
5.4. B e e Klein-Gordon Fields where
279
fo is the restriction of f” to 522. Hence the functions
exactly the holomorphic positive-energy solutions of the KleinGordon equation discussed in section 4.4,with e;f corresponding to the evaluation maps e,. The space K: of these solutions can thus be are
identified with 3-11, and the orthogonal projection from 7f to given by 4
is
Consequently, the resolution of unity developed in chapter 4 can now be restated as a resolution of Ill:
where a+,earlier denoted by a,is a particle phase space, i.e. has the form a+ = { x - i y I z E
s,
y
E 52;)
(27)
for some X > 0 and some spacelike or, more generally, nowhere timelike (see section 4.5) submanifold S of real spacetime. As in section 4.5, the measure do is given in terms of the Poincar&invariant symplectic form Q! = dy, A dx” by restricting as a A . A a to a+ and choosing an orient at ion:
--
h
h
da = (s!Ax)-1 a s = AX1 dy” A d x , .
(28)
Similarly, the antiparticle coherent states for the free field are given by
5. Quantized Fields
280
Since for p E !2$ and n E 7- we have
. it follows that e;
has exactly the same spacetime behavior as e t ,
confirming the interpretation of an antiparticle as a particle moving backward in time. An antiparticle phase space is defined as a submanifold of 7- given by
where S is as above. The resolution of
n-1
is then given by
Many-particle or -antiparticle coherent states and their corresponding phase spaces can be defined similarly, and the commutation relations imply that such states are symmetric with respect to permutations of the particles' complex coordinates. For example,
since 4(z1) and 4(z2) commute. In this way, a phase-space formalism can be buit for an indefinite number of particles (or charges), analogous to the grand-canonical ensemble in classical statistical mechanics. This idea will not be further pursued here. Instead, we
5.4. R e e Klein-Gordon Fields
281
now embark on option (b) above, i.e. the construction of global, conserved field observables as integrals over particle and antiparticle phase spaces. The particle number and antiparticle number operators are given by
N+ and N - generalize the harmonic-oscillator Hamiltonian A'A to the infinite number of degrees of freedom possessed by the field, where normal modes of vibration are labeled by p E $22for particles and p E 0; for antiparticles. The total charge operator is Q = e (N+ - N - ) , as can be seen from its commutation relations with u ( p ) and b(p). But the resolution of unity derived in chapter 4 can now be restated as
for p,p' E 02,where the second identity follows from the first by replacing z with Z and o+ with g-. It follows that N h can be expressed as phase-space integrals of the extended field q5(z):
282
5. Quantized Fields
Hence the charge is given by
The two integrals can be combined into one as follows: Define the total phase space as u = a+ - u-, where the minus sign means that u- enters with the opposite (“negative”) orientation to that of u+, in the sense of chains (Warner [1971]).The reason for this choice of orientation is that B; and By are both open sets of IR8+’, hence must have the same orientation, and we orient 0: and 0, so that
Since the outward normal of B: points “down” and that of B, points ((up,”this means that 0, must have the opposite orientation to that By and 0 x G 0: - a, we have of 0;. Thus, setting Bx B:
=
+
This gives u- the orientation opposite to that of u+, and we have
Next, define the Wick-ordered product (or normal product) by
This coincides with the usual definition, since in 7+, $* is a creation operator and 4 is an annihilation operator, while in 7-these roles are reversed. The charge can now be written in the compact form
5.4. R e e Klein-Gordon Fields
283
We may therefore interpret the operator p(z)
f€
: 4(z)* 4(z) :
(42)
as a scalar phase-space charge density with respect to the measure
da.
The Wick ordering can be viewed as a special case of imaginarytime ordering, if we define 4 * ( z ) = $(Z)*: : +(z')* 4 ( z ) :=: 4 * ( ~ ' $) ( z ) := T i [d*(Z')
4(~)],
(43)
where
and
when z and z' are in the same half of 7,whereas if they are in opposite halves of 7, the sign of 3(zk - zo) is invariant. Note: For the extended fields, the Wick ordering is not a necessity but a mere convenience, allowing us to combine the integrals over a+ and 0- into a single integral. Each of these integrals is already in normal order, since the extension to complex spacetime polarizes the free field into its positiveand negativefrequency parts. Also,
5. Quantized Fields
284
the extended fields are operator-valued functions rather than distributions, hence products such as 4(z)*c$( z ) are well-defined, which is not the case in the usual formalism. A similar situation will occur in the expressions for the other observables (energy-momentum, angular momentum, etc.) as phasespace integrals. Hence the phasespace formalism resolves the problem of zero-point energies without the need to subtract infinite terms “by hand”! In this connection, see the remarks on p. 21 of Henley and Thirring [1962]. # The above expression for the charge can be related to the usual one in the spacetime formalism, which is ~~~~d = ia
G
J, z,, : +*-dx, d4 - %* - qqx): ax, (47)
kz,,Jc”(x),
by using Rx = -dBx and applying Stokes’ theorem:
where j”(.)
= --d
dY,
p(z)
is the phase-space current density. Using the notation
(49)
5.4. n e e Klein-Gordon Fields
285
we have
--d - i(d” - 8”). Hence, by the holomorphy of
j”(2)
4, d
=-
E G
.$*4: *
(a. - a”) : f$* f$ : = i&: 4* - &4* - 4 : = i&: 4*- 34 - %* 4 : . = i&
axr
dx”
Our expression for the charge is therefore
where
is seen to be a regulmized version of the usual current density J”(z) obtained by first extending it to 7 and then integrating it over Bx.
The vector field fixed y E V’, since
jP(z)
is conserved in real spacetime for each
5. Quantized Fields
286
by virtue of the Iclein-Gordon equation combined with the holomorphy of 4 in 7. This implies that JfLx,(x) is also conserved, hence the charge does not depend on the choice of S or u.
Note: In using Stokes’ theorem above, we have assumed that the contribution from lyol + 00 vanishes. (This was implicit in writing the non-compact manifold Rx as -dBx.) This is indeed the case, as has been shown rigorously in the context of the one-particle theory in chapter 4 (theorem 4.10). Also, we see another example of the pattern, mentioned before, that in the phase-space formalism vectorand tensor fields can often be derived from scalar potentials. Here, p ( z ) acts as a potential for j p ( z ) . Note also that the Klein-Gordon equation can be written in the form
which is manifestly gauge-invariant.
#
&call now that $ ( z ) is a “root vector” of the charge with root value -e:
Substituting our expression for Q, we obtain the identity
287
5.4. n e e Klein-Gordon Fields
K is
a distribution on (I?+1 x
(59) which is piecewise analytic in
7 x 7, with
K(Z',2 ) =
{
y ( Z '
- Z; m ) ,
Z',Z
Z'
E 7-
E T+,ZE 7-
(60)
The two-point functions -iA+ and iA- are analytic in 7+and 7-, respectively, and act as reproducing kernels for the subspaces with charge e and -6. Because of the above property, it is reasonable to call K ( z ' , Z) a reproducing kernel for the field + ( z ) , though this differs somewhat from the standard usage of the term as applied to Hilbert spaces (see chapter 1). Note that K propagates positivefrequency components of the field into the forward ("future") tube
5. Quantized Fields
288
and negative-frequency components into the backward (“past”) tube. This is somewhat reminiscent of the Feynman propagator, but K is a solution of the homogeneous Klein-Gordon equation in the real spacetime variables rather than a Green function. The energy-momentum and angular momentum operators may be likewise expressed as conserved phase-space integrals of the extended field:
Pp = i l d o : $*a,$: r
Mpv = i
d o : $*(xcdv - x V a p ) $ : .
Like Q, these may be displayed as regularizations of the usual, more complicated expressions in real spacetime. Note first that Pp can be rewritten as
The angular momentum can be recast similarly as
5.5. n e e Dirac Fields
289
Using s2x = -aBx and applying Stokes' theorem, we therefore have
where
is a regularized energy-momentum density tensor which, inciclzntally, is automatically symmetric. Similarly,
where
is a regularized angular momentum density tensor.
5.5. Free Dirac Fields
For simplicity, we specialize in this section (only) to the physical
case of three spatial dimensions, s = 3. The proper Lorentz group
290
5. Quantized Fields
Lo is then S0(3,1)+, where the plus sign indicates that At > 0, so that A preserves the orientations of space and time separately. Its universal covering group can be identified with SL(2, C) as follows (Streater and Wightman [1964]): An event x E R4 is identified with the Hermitian 2 x 2 matrix
x =x b p=
xo + x3 x1 +ix2
-
x1 ix32) xo - x
where QO = I (2 x 2 identity) and b k (k = 1,2,3) are the Pauli spin matrices. Note that det X = x2 x x. The action of SL(2,C) on
-
Hermitian 2 x 2 matrices given by
X' = AXA*,
A E SL(2,C),
(2)
induces a linear transformation on IR4 which we denote by ?r(A):
x' = n(A) x
(3)
From
it follows that 7r(A) is a Lorentz transformation, and it can easily be seen to be proper. Hence
7r
7r:
defines a map
SL(2,C) + Lo,
(5)
which is readily seen to be a group homomorphism. Clearly, ?r( -A) =
.rr(A),and it can be shown that if 7r(A) = ?r(B),then A = fB.Since SL(2, C)is simply connected, it follows that SL(2, a) is the universal covering group of LO,the correspondence being two-to-one. The relativistic transformation law as stated in section 5.3,
5.5. n e e Dirac Fields
291
applies to scalar fields, i.e. fields without any intrinsic orientation or spin. To generalize it to fields with spin, note first of all that since the representing operator U ( a ,A) occurs quadratically, the law is invariant under U + 4. This means that U could, in fact, be a representation, not of PO,but of the inhomogeneous version of
SL(2,a!),
which acts on R4 by ( u , A ) s= ~ ( A ) a : + a .
(8)
P2 is the two-fold universal covering group of PO. A field $(a:) of arbitrary spin is a distribution taking its values in the tensor product L(7-l)8 Y of the operator algebra of the quantum Hilbert space 3.c with some finite-dimensional representation space V of SL(2, a!). The transformation law is
where S is a given representation of SL(2, a!) in V. S determines the spin of the field, which can take the values j = 0,1/2,1,3/2,2,. . .. The locality condition for the scalar field (axiom 4) can be extended to non-scalar fields as
[$a(a:),$,9(a:c')]
=o
if (a: - z ' ) ~< o
292
5. Quantized Fields
where $a are the components of $. Now it follows from the axioms that if the field has half-integral spin ( j = 1/2,3/2,. . .), then the 0. above locality condition implies that it is trivial, i.e. that $(z) A non-trivial field of hal-integral spin can be obtained, however, if we modify the locality condition by replacing commutators with ant icommutat ors:
Replacing the commutators with anticommutators means that changing the order in which $(z) and $(d)are applied to a state vector in Hilbert space merely changes the sign of the vector, which has no observable effect. Hence the physical interpretation that events at spacelike separations cannot influence one another is still valid. Similarly, for fields of integral spin ( j = 0,1,. . .), the locality condition with anticommutators gives a trivial theory, whereas a nontrivial theory can exist using commutators. The choice of commutators or anticommutators in the locality condition does, however, have an important physical consequence. For we have seen that the free asymptotic fields can be written as sums of creation and destruction operators for particles and antiparticles. If xl,2 2 , - - - z, are n distinct points in the hyperplane zo = 0 and $+(z) denotes the positive-frequency part of the field (which can be obtained from $(a: - iy) by taking y --+ 0 in V;), then
is a state with n particles located at these points. Since any two of these points are separated by a spacelike distance, the locality condition implies that this state is symmetric with respect to the exchange
5.5. fiee Dirac Fields
293
of any two particles if commutators are used and antisymmetric with respect to the exchange if anticommutators axe used. Particles whose states are symmetric under exchange are called Bosons, and ones antisymmetric under exchange are called Fermions. The choice of symmetry or antisymmetry crucially affects the large-scale statistical behavior of the particles. For example, no two Fermions can occupy the same state due to the antisymmetry under exchange; this is the Pauli
exclusion principle. Hence the choice of commutators or anticommutators is known as the choice of statistics, and the above theorem correlating this choice with the spin is known as the Spin-Statistics theorem of quantum field theory (Streater and Wightman [1964]). This theorem is fully supported by experiment, and represents one of the successes of the theory. The Dirac field is a quantized field with spin 1/2 whose associated particles and antiparticles are typically taken to be electrons and positrons, though it is also used (albeit less accurately) to model neutrons and protons. Our treatment follows the notation used in Itzykson and Zuber [1980],with minor modifications. The free Dirac field is a solution of the Dirac equation
where
is the Dirac operator and the 7’s are a set of 4 x 4 Dirac matrices, meaning they satisfy the Clifford condition with respect to the Minkowski metric:
5. Quantized Fields
294
{ y p , 7')
= ypyv + 7 " 7 p = 2gp'.
(15)
The components of $ satisfy the Klein-Gordon equation, since q 2 = 0 , and the solutions can be written as
where u" and v" are positive- and negative-frequency four-component spinors and summation over the polarization index a = 1 , 2 is implied. b, and d, are operators satisfying the "canonical anticommutation relations"
G y p ) u"p) = -fi"(p) v q p ) = 6"B
where a summation on cr is implied in the last two equations and the adjoint spinors are defined by
5.5. B e e Dirac Fields
295
ba(p) and d Q ( p ) are interpreted as annihilation operators for particles and antiparticles, respectively, while their adjoints are creation operators. The adjoint field is defined by
and satisfies
The particle- and antiparticle number operators are now
(23)
r
and the charge operator is
Q = &(N+- N - ) .
(24)
As for the Klein-Gordon field, we wish to give a phasespace repusing the resentation of Q. The first step is to extend $(z) to Analytic-Signal transform, which gives
a?
Again, the extended field is analytic in 7,with the parts in 7+and 7containing only positive and negative frequncies, respectively. Using the above orthogonality relations, as well as
296
5. Quantized Fields
for p , q E ni, we obtain the following expressions for the particleand antiparticle number operators as phase-space integrals:
J u-
where the fields in the first integral are already in normal order and the second integral involves two changes of sign: one due to the normal ordering, and another due to the orthogonality relation for the P’s.The charge operator can therefore be given the following compact expression as a phasespace integral over the oriented phase space CT = a+ - a-:
Q = a / da : $lC,
:= e
U
J dap(z),
(28)
U
where p = : $$ : is the scalar phase-space charge density. The usual expression for the charge as an integral over a configuration space S is
To compare these two expressions, we again use Rx = -dBx and invoke Stokes’ theorem:
Define the phase-space current density
-
j p ( z ) E 2ma : + ( z ) $~( z~)
:,
297
5.5. n e e Dirac Fields
where the factor 2m is included to give j p the correct physical dimensions, given our normalization. Note that
jp(z)
is conserved in
spacetime, i.e.
=O by the Dirac equation combined with the analyticity of 1c, in 7.The same combination also implies
where
are the spin matrices. The real part of this equation gives a phasespace version of the Gordon identity
The two terms are conserved separately, since
298
5. Quantized Fields
and the second term, which is due to spin, does not contribute to the total charge since it is a pure divergence with respect to x. Thus
where
is a “regularized” spacetime current.
Note: The Dirac equation can also be written in the manifestly gauge-invariant form
Again, $ is a “root vector” of the charge operator, since it removes a charge E from any state to which it is applied:
[$(z’),Q]= E$(z’)
VZ’ E C4.
(40)
Substituting for Q the above phase-space integral and using the commut ator identity
[ A ,BC]= { A ,B}C - B { A , C}
(41)
and the canonical anticommutation relations, we obtain
where the “reproducing kernel” for the Dirac field is a matrix-valued distribution on C4 x C4 given by
5.5. fiee Dirac Fields
299
Here, K is the reproducing kernel for the Klein-Gordon field and 8' is the Dirac operator with respect to the real part x' of z'. Like K, ICo is piecewise holomorphic in z' - 2 for z', z E 7.Another form of the reproducing relation can be obtained by substituting the more complicated expression for Q given by eq. (37) into eq. (40): $ ( z ' ) = 2mAi1
(44)
This form is closer to the usual relation. The energy-moment um and angular momentum operators for the Dirac field can likewise be represented by phase-space integrals as
Pp = l d u : $ia,,$:
+
iwPV= J,du : s(ix,av - ixvaP + a P v ) : .~
(45)
More generally, let $ ( z ) represent either a KleinTGordon field (in which case II) will mean $*) or a Dirac field, and let Ta be the local generators of an arbitrary internal or external symmetry group, so that the infinitesimal change in $ ( z ) is given by
5. Quantized Fields
300
For example, T, is multiplication by 6 for U(1) gauge symmetry, Tp = ia, for spacetime translations (where the derivative is with respect to zp), etc. (In case the theory has an internal symmetry higher than U(1),of course, 1c, must have extra indices since it must be valued in a representation space of the corresponding Lie algebra.) The generators satisfy the Lie relations
where Cib are the structure constants. Then we claim that the conserved global field observable corresponding to T, is
Q , = l d c :$T,$:. For this implies [$(z'),
&,I
=
J. d o K D ( ~3'),Ta$(z),
(49)
where K D is replaced by K if $ is a Klein-Gordon field. Since T, generates a symmetry, it follows that Ta$(z) is also a solution of the appropriate wave equation, hence it is reproduced by K D :
J.
du KD(z',Z)Ta$(z) = Talc,(z').
(50)
Therefore Q , has the required property
It can furthermore be checked that
[&, &a]
=
1
do : $ [Ta, Tb]$ := CZb QC,
4
hence the mapping
T, H Qa is a Lie algebra homomorphism.
(52)
5.5. B e e Dirac Fields
301
Finally, we show that due to the separation of positive and negative frequencies in 7, the interference effect known &s Zitterbewegung does not occur for Fermions in the phasespace formalism. Let St be the configuration space defined by xo = t. Then the components of the “regularized” three-current at time t are
and a straightforward computation gives
The right-hand side is independent of t , hence no Zitterbewegung occurs. In real spacetime, Zitterbewegung is the result of the inevitable interference between the positive- and negativefrequency components of qb. Its absence in complex spacetime is due to the polarization of the positive and negative frequencies of II, into 7+and 7-, respectively. In the usual theory, Zitterbewegung is shown to occur in the singleparticle theory; the above computation can be repeated for the classical (i.e., “first-quantized”) Dirac field, with an identical result except for a change in sign in the second term due to the commutation of dz and d,. Alternatively, the above argument also implies the absence of Zitterbewegung for the one-particle and oneantiparticle states of the Dirac field.
302
5. Quantized Fields
5.6. Interpolating Particle Coherent States We now return to the interpolating charged scalar field 4. The asymptotic fields satisfy the Klein-Gordon equation,
and have the same vacuum expectation values as the free KleinGordon field discussed in section 5.4. Hence, by Wightman’s reconstruction theorem (Streater and Wightman [1964]),these three fields are unitarily related. We identify the free field of section 5.4 with &n. Then there is a unitary operator S such that
S is known as the scattering operator. Define the source field j ( x ) by j(.)
3
(n + m2)4(.).
(3)
It is a measure of the extent of the interaction at x, and by axiom 5,
o
j(x)
(weakly) as x o
-, foe.
(4)
Note that we are not making any additional assumptions about j . If j is a known function (i.e., if it is a multiple of the identity on ‘H for each x), then it acts as an external source for 4. If, on the other hand, j is a local function of such as : 43:, it represents a self-interaction of 4. In any case, the above equations can be “solved” using the Green functions of the Klein-Gordon operator, which satisfy
+
+
(0, m2)G ( x )= 6(x).
(5)
5.6. Interpolating Particle Coherent States
303
In general, we have formally
4(x) = 4o(x)
+ J dx' G(x - .')j(.'),
(6)
where 4 0 is a free field determined by the initial or boundary conditions at infinity used .to determine G. The retarded Green function (we are back to s spatial dimensions) is defined as
where
with
E
> 0 and the limit E 5 0 is taken after the integral is evaluated.
Gret propagates both positive and negative frequencies forward in time, which means that it is causal, i.e. vanishes when xo < 0. Since it is also Lorentz-invariant, it follows that
Gret(x - z') is interpreted as the causal effect at x due to a unit disturbance at x'. The corresponding choice of free field 4 0 is q5jn, hence
If j is a known external source, this gives a complete solution for q5(x). If j is a known function of 4, it merely gives an integral equation which
4 must satisfy.
Similarly, the advanced Green function is defined by
5. Quantized Fields
304
=
with p (po - ie,p) and E: 3. 0, and propagates both positive and negative frequencies backward in time, which means it is anticausal. The corresponding free field is q50ut, hence
4(.)
= $out(z)
+
1
dx'
Gadv(2
- x')j(.')-
(12)
Let us now apply the Analytic-Signal transform to both of these equations:
where (with z = z - iy)
and
Since the Analytic-Signal transform involves an integration over the entire line x(r) = x - r y , the effect of Gret(z - 2') is no longer
5.6. Interpolating Particle Coherent States
305
causal when regarded as a function of z and 5’. Rather, it might be interpreted as the causal effect of a unit disturbance at x’ on the line parametrized by 2. (Note that only those values of r for which z - r y - z’ E contribute to the integral.) A similar statement goes for Gadv( z - 2’). Whereas djn(z) and dout(z) are holomorphic in 7,4 ( z ) is not (unless j (z) = 0), since Gret( z -z‘) and G d v( z -5 ’ ) are not holomorphic. This breakdown of holomorphy in the presence of interactions is by now expected. Of course 4, Gret and Gad” are all holomorphic along the vector field y, as are all Analytic-Signal transforms.
Qt,
In Wightman field theory, the vacua Qytand QO of the in-, out- and interpolating fields all coincide (the theory is “alreadyrenormalized”). Let us define the asymptotic particle coherent states by
as the interpolating particle coherent states. By eq.
and
(13),
5. Quantized Fields
306
From the definitions it follows that
Gadv(z - 2') = Gret(z' - z), hence eq. (18) can be rewritten as
Eqs. (19) and (21) display the interpolating character of e$. Note that when j ( ~is) an external source, then the interpolating particle coherent states differ from the asymptotic ones by a multiple of the vacuum.
As in the case of the free theory, a general state with a single positive charge
E
can be written in the form
q = 4*(f)Qo.
(22)
For interacting fields, this may, in general, no longer be interpreted as a one-particle state, since no particle-number operator exists.* But
*
If the spectrum C contains an isolated mass shell
is concentrated around St$, then
Qi
and f ( p ) is, in fact, a one-particle state.
This is the starting point of the' Haag-Ruelle scattering theory (Jost [1965]).I thank R. F. Streater for this remark.
5.6. Interpolating Particle Coherent States
307
the charge operator does exist since charge (unlike particlenumber) is conserved in general, due to gauge invariance; hence Qf makes sense as an eigenvector of charge with eigenvalue 6. !PF can be expressed in terms of particle coherent states as
f(z ) satisfies the inhomogeneous equations
where the last equation is a definition of 6 ( z - 3') as the AnalyticSignal transform with respect to x of b(x - z'). The above is easily seen to reduce to
308
5. Quantized Fields
where j ( z ) is the Analytic-Signal transform of j ( s ) . Equivalently, eq. (3) can be extended to by applying the Analytic-Signal transform, giving
(02
+ m2)J(z) = ( q o I (0,+ m 2 )+ ( z ) I Q:
) = ( Qo I j ( z ) Q; ). (28)
For a known external source, this is a “perturbed” Klein-Gordon equation for j ( z ) ; if j depends on 4, it appears to be of little value.
5.7. Field Coherent States and Functional Integrals
So far, all our coherent states have been states with a single particle or antiparticle. In this section, we construct coherent states in which the entire field participates, involving an indefinite number of particles. We do so first for a neutral free Klein-Gordon field (or a generalized free field; see section 5.3), then for a free charged scalar field. A similar construction works for Dirac fields, but the “functions” labeling the coherent states must then anticommute instead of being “classical” functions and a generalized type of functional integral must be used (Berezin [1966], Segal [1956b, 19651). We also indulge in some speculation on generalizing the construction to interpolating fields. An extended neutral free Klein-Gordon field satisfies the canonical commutation relations
5.7. Field Coherent States and finctiond Integrals
309
for all z,z‘ E 7+,as well as the reality condition +(z)* = +(Z). The basic idea is that since all the operators 4 ( z ) ( z E 7+) commute, it may be possible to find a total set of simultaneous eigenvectors for them. This is not guaranteed, since + ( z ) is not self-adjoint (it is not even normal, by eq. (1))and, in any case, it is unbounded and thus may present us with domain problems. However, this hope is realized by explicitly constructing such eigenvectors. This construction mimics that of the canonical coherent states in section 3.4, which used the lowering and raising operators A and A*. As in the case of finitely many degrees of freedom, the canonical commutation relations mean that +* acts as a generator of translations in the space in which is “diagonal.” The construction proceeds as follows: Let f(p) be a function on IR”, which will also be regarded as a function To simplify the analysis, we assume to begin with that f on is a (complex-valued) Schwartz test function, although this will be relaxed later. f determines a holomorphic positive-energy solution of the Klein-Gordon equation,
+
i22.
Define
where cr+ is any particle phase space and the second equality follows from theorem 4.10 and its corollary. (Note: this is not the same as the smeared field in real spacetime, since the latter would involve an integration over time, which diverges when f is itself a solution rather than a test function in spacetime.) The canonical commutation relations imply that for z E 7+,
5. Quantized Fields
310
and for n 2 1,
We now define the field coherent states of $ by the formal expression
Then if z E ' I so + that $ () z ) 90= 0, eq. ( 5 ) implies that $(z)
Ef = [ $ ( z ) , ,4*(f)] Qo (7)
= f(z)Ef.
Hence Ef is a common eigenvector of all the operators $ ( z ) , z E 7+. This eigenvalue equation implies that the state corresponding to
Ef
is left unchanged by the removal of a single particle, which requires that Ef be a superposition of states with 0,1,2,. . . particles. Indeed,
Ef =
c+
n=O
n.
$*(f)"
9 0 .
The projection of Ef to the one-particle subspace can be obtained by using the particle coherent states e,:
=
where the last equality follows from d(f) 90 ($*(f))*90= 0. More generally, the n-particle component of the n-particle coherent state
Ef is given by
projecting to
5.7. Field Coherent States and Ftrnctiond Integrals
so all particles are in the same state
31 1
f and the entire system of par-
ticles is coherent! Similar states have been found to be very useful in the analysis of the phenomenon of coherence in quantum optics Klauder and Sudarshan [1968]),where the name “co(Glauber [1963], herent states” in fact originated. In the usual treatment, the positivefrequency components have to be separated out “by hand” using their Fourier representation, since one is dealing with the fields in real spacetime. For us, this separation occured automatically though the use of the Analytic-Signal transform, i.e. $*(f) can be defined directly as an integral of f ( z ) over a+. (This would remain true even if f had a negative-frequency component, since the integration over CT+ would still restrict f to positive frequencies.) The inner product of two field coherent states can be computed as follows. Note first that if g(z) is another positiveenergy solution, then
Ja+ L
where, by theorem 4.10,
5. Quantized Fields
312
Hence
Thus Ef belongs to 'FI (i.e., is normalizable) if and only if f(p) belongs to L:(dfi) or, equivalently, f(z) belongs to the oneparticle space Ic of holomorphic positive-energy solutions. If we suppose this to be the case for the time being, then the field coherent states Ef are parametrized by the vectors f" E Li(dfi) or f E K. Next, we look for a resolution of unity in 'H in terms of the Ef's. The standard procedure (section 1.3)would be to look for an appropriate measure dp(f) on Ic. Actually, it turns out that due to the infinite dimensionality of Ic, a larger space Kb 3 Ic will be needed to support dp. Thus, for the time being, we leave the domain of integration unspecified and write formally (15)
where dp is to be found. Taking the matrix element of this equation between the states Eh and E g , we obtain
With h = -g this gives /'dp(f)e(fIg)-(glf)
= e-(gIg)
s[g].
(17)
The left-hand side is an infinitedimensional version of the Fourier transform of dp, as becomes apparent if we decompose f and g into their real and imaginary parts. The Fourier transform of a measure is called its characteristic function. Hence we conclude that a
5.7. Field Coherent States and finctional Integrals
313
necessary condition for the existence of dp is that its characteristic function be S[g]. In turn, a function must satisfy certain conditions in order to be the characteistic function of a measure. In the finite-dimensional case, Bochner's theorem (Yosida [1971]) guarantees the existence of the measure if these conditions are satisfied. If the idnite-dimensional space of f's is replaced by (En,the above relation would uniquely determine dp as a Gaussian measure. For the identity
det A-'d2"( exp[-?r(( - At)* A-'(( - At)] = 1,
(18)
where A is a positive-definite matrix, implies
with dp((') = det A-' exp[-~('*A-'(] d2"C.
The integral in eq. (19) is entire in the variables hence it can be analytically continued to
+(()
If ( = (Y + ip and
(20)
< and t* separately,
I* + -t*, giving
e7r(<*€-€*o = ,-n€*A€
t = u + iv with a,p, u , v E R", then
showing that S(v,- u ) is the Fourier transform of dp(a,p).
If A is merely positive-semidefinite, i.e., if it is singular, then dp still exists but is concentrated on the range of A , and A-l makes
5. Quantized Fields
314
sense as a map from this range to the orthogonal complement of the kernel of A. Eq. (19) is afinite-dimensional version of eq. (16) (with g = h). In going to the infinite-dimensional case, two separate complications arise. First of all, recall that in finite dimesions, the Fourier transform takes funcions on R" to functions on the dual space, (IRn)* (section 1.1). One usually identifies these two spaces by choosing an inner product on
IR",e,g., the
Euclidean inner product. In the
infinite-dimensional case, it is tempting to extend Bochner's theorem by letting R" go to a Hilbert space and looking for a measure on this space. However, Segal [1956a, 19581 has shown that the ensuing dp cannot be a Borel measure, since it is only finitely additive. To obtain a Borel measure, the domain of integration must be expanded to a space of distributions, its dual then being a space of test functions. Minlos' theorem (Gel'fand and Vilenkin [1964]; Glimm and Jaf€e [1981]) states that if a functional S [ g ] defined on the Schwartz space of test functions S(IR")satisfies appropriate conditions (positive-definiteness, normalization and continuity), a Borel probability measure dp exists on the dual space of tempered distributions S'(IR")such that eq. (17) is satisfied for all g E S(IR"), the integration being over S'(IR"). This resolves the problem of infinite dimansionality. In our case, however, the situation is further complicated by the fact that the space of f's over which we wish to integrate, even if it is enlarged, consists not of free functions but of solutions of the Klein-Gordon equation. This difficulty can be overcome by first applying Minlos' theorem in the momentum space representation, where S[S]
= exp (-l[el[$(dj)) satisfies the necessary
conditions for 4 E S(IRs).This gives a probability measure d j i ( f ^ )on
S'(IR").We then define the spaces
5.7. Field Coherent States and finctiond Integrals
f
{m= J,, djj
e-irp
j(p)
If
}.
315
(23)
E S'
These may be regarded as mutually dual, under the sesquilinear pairing
Together with
K:, they form a "triplet" KO c K: c Kh.
(25)
We now use the map f^ H f to transfer the measure from S' to Kk, obtaining a probability measure dp on K:b. This results, finally, in the resolution of unity
(26) where the integral converges, as usual, in the weak operator topology. d p is Gaussian in the sense that its restrictions to finite-dimensional cylinder sets in Kk are all Gaussian measures. It is, therefore, an infinite-dimensional version of the Gaussian measure on C" which gave the resolution of unity for the canonical coherent states in section
1.2. One might well ask what is the point of insisting that the integrat ion take place over Kb rat her than S'(IR"), the momentum space representation. One reason is esthetic: The vectors Ef combine the finite-dimensional (particle) coherent-state representation with the
5. Quantized Fields
316
infinite-dimensional (field) coherent-state representation. Another reason is that whereas the "sample points" f in S'(R")are merely distributions, the elements f in the decaying exponential
e--IP
Ich
are holomorphic functions, since
dominates the singular behavior of
f,
f"
just as it did when E L2(dj3). Note, however, that non-normalizable field coherent states Ef now enter the resolution of unity. In fact, the Hilbert space K: has measure zero with respect to dp, since L2(dj3) has measure zero with respect to dji. This is remedied by the fact that only vectors h , g in the test function space K:o are now allowed in eq. (16). With the resolution of unity provided by the field coherent states, the inner product in 'FI can be represented by the functional integral
The above construction was for a neutral scalar field. If charged, its coherent states would take the form
E f J = e4*(f)+4(a Qo
4 were
(28)
where $*( f ) is as before and # ( g ) is an antiparticle creation operator,
which commutes with q5*(f). g ( z ) is a negativeenergy solution of the Klein-Gordon -
equation, holomorphic in
g ( 2 ) with g E Icb. Thus we write
charged fields is therefore
gE
T-, or, equivalently, i j ( z ) =
q.The resolution of unity for
5.7. Field Coherent States and finctional Integrals
J
EC; x z
dp(f,g) I Ef,P )( E f J I = I R ,
317
(30)
where d p ( f , g ) = dp(f)dp(g) is the tensor product of two Gaussian measures defined as above. Finally, it is reasonable to ask whether coherent states exist for an interpolating field, by analogy with the interpolating particle coherent states studied in sections 5.3 and 5.6. A necessary condition would seem to be that the first half of the canonical commutation relations still be valid, i.e.,
[&), 4 ( 4 1 = 0
(31)
for z, z' E 7+,since one would like to find simultaneous eigenvectors of b(z) for all z E 7+. Recall that the extended free neutral scalar field had the form
for arbitrary z E are+', and the commutation relation given by eq. (31)was due to the polarization of the positive and negative frequenand 7-, respectively. If interactions are introduced, the cies into I+ positive- and negativefrequency components get inextricably mixed together, hence it is highly unlikely that the above commutation relation survives. However, a charged free field has the form
where b* commutes with a, hence [ $ ( z ) , +(z')] = 0
V%,2' E aY+l.
(34)
5. Quantized Fields
318
I believe that this relation does have a chance of holding for interpolating charged scalar fields. It would be a consequence, for example, of the physical requirement that the Lie algebra generated by the field has no operators which remove (or add) a double charge 2 ~ .This commutation relation is the weaker half of the free-field canonical commutation relations, the stronger half (which we do not assume) being that [ $ ( z ) , $ ( z ’ ) * ] is a “c-number,” i.e. a multiple of the identity. If [~(z),$(z‘)]= 0 for all z and z’, then it makes sense to look for common eigenvectors of all the $ ( z ) ’ s , which would be coherent states of the interpolating field.
Notes Most of the results in sections 5.2-5.5 were announced in Kaiser [1987b] and have been published in Kaiser [1987a]. An earlier attempt to describe quantized fields in complex spacetime was made in Kaiser [1980b] but was found to be unsatisfactory. The AnalyticSignal transform is further studied in Kaiser [199Oc]. Segal [1963b] proposed a formulation of quantum field theory in terms of the symplectic geometry of the phase space of classical fields. This phase space corresponds, roughly, to the space Kh defined in section 5.7. An attempt to study quantized fields as (operatorvalued) functions on the PoincarC group-manifold has been made by LurCat [1964]; see also Hai [1969]. As mentioned in section 4.2, this manifold may be regarded as an extended phase space which includes spin degrees of freedom in addition to position- and velocity coordinates. In the context of classical field theory, this point of view has been generalized to curved spacetime by repla.cing the Poincark
Notes
319
group-manifold with the orthogonal frame bundle over a Lorentzian spacetime (Toller [1978]). These efforts have not, however, utilized holomorphy. It may be interesting to expand the point of view advocated here to a complex manifold containing the Poincarb groupmanifold in order to account naturally for spin. This might result in a “total” coherent-state representation where the classical phase space coordinates range over 7 and the spin phase space coordinates range over the Riemann sphere, as in section 3.5.
I owe special thanks to R. F. Streater for many important comments and corrections in this chapter.
This Page Intentionally Left Blank
321 Chapter 6
FURTHER DEVELOPMENTS
6.1. Holomorphic Gauge theory In this section we give a brief, not-very-technical but hopefully intuitive, discussion of gauge theory and indicate how it may be modified in order to make sense in complex spacetime. This represents work still in progress, and our account is accordingly incomplete. One obstacle is the absence, so far, of a satisfactory Lagrangian formulation. This is due in part to the fact that fields in complex spacetime are constrained since they can be derived from local fields in IR"+l (chapter 5 ) . Our treatment is as elementary as possible, with a geometrical emphasis. We ignore global questions and work within a single chart.
For more details, see Kaiser [1980a,19811. Gauge theory is a natural, geometric way of introducing interactions. It applies some of the ideas of General Relativity to quantum mechanics and arrives at a class of theories which are generalizations of classical electrodynamics, the latter being the simplest case. The power of Einstein's theory of gravitation owes much to the fact that it actually assumes less, initially, than its predecessor, Newton's theory. By dropping the assumption that spacetime is flat, we lose the ability to transport tangent vectors from one point in spacetime to another, as is done when differentiating a vector field or even find-
6. Further Developments
322
ing the acceleration of a particle moving in spacetime. To regain it, we need a connection. We need to know how a vector transforms in going from any given point 2 0 to a neighboring point along a curve
x ( t ) . The tangent vector to the curve at the point xo is
(The partial-derivative operators d, = d/&P form a basis for the tangent space at
T O ;see
Abraham and Marsden [1978].) An infinites-
imal transport must have a linear effect, thus a vector Y at xo should change by
where r ( X ) is a linear transformation on the tangent space at 2 0 . Furthermore, r ( X ) must be linear in X since the latter is an infinitesimal (i.e., linearized) description of the curve at this gives
Since
50.
If Y = Y”d,,
r(a,,)is a linear transformation, we have
where FEU are a set of (locally defined) funcions on spacetime, known as the connection coefficients. This gives the rate of change of Y due
to transport along X. Now suppose we are given a metric g on spacetime. (In Relativity, the “metric” is indefinite, i.e. Lorentzian rather than Ftiemannian; with this understood, we continue to call it a metric.) If two vectors are transported along a curve, then their inner product must
6.1. Holomorphic Gauge theory
323
not change since it is a scalar. This gives a relation betweeen I? and g which determines the symmetric part (with respect to exchange of X and Y ) of r. The antisymmetric part is the torsion, which in the standard theory is assumed to vanish. Hence the metric uniquely determines a torsionless connection known as the Riemannian connection. General Relativity (see Misner, Thorne and Wheeler [1970])relates the Riemannian connection to Newton's gravitational potential, thus giving gravity a geometric interpretation. This is reasonable, since the connection determines the acceleration of a particle moving freely (i.e., along a geodesic) in spacetime, which is related to gravity. By assuming less to begin with, one discovers what must necessarily be added to even compute the acceleration, and this additional structure turns out to include gravity! In Newton's theory, gravity must be added in an ad-hoc fashion. The two theories coincide in the non-relat ivist ic limit. Now consider the wave function of a (scalar) quantum particle, for example a solution of the free Klein-Gordon equation, in real spacetime R"" . Suppose we drop the usually implicit assumption that f can be differentiated by simply taking the difference between its values at neighboring points. Instead of regarding f(s)as a complex number, we now regard it as a onedimensional complex vector attached to z, analogous to the tangent vectors in Relativity. This means assuming less structure to begin with, since f(s)now belongs to as a vector space rather than as an algebra. The complex plane attached to x will be denoted by Cz and is called the fiber at x, and the set of all fibers is called a complex line bundle over IR"" (Wells [1980]). To differentiate f , we must know how it is affected by transport. The situation is similar to the one above, only now r ( X ) must
324
6. Further Developments
be a 1 x 1 complex matrix, i.e. a complex number. An infinitesimal transport gives
where r P ( x ) is a complex-valued function. The total rate of change along X is therefore
where
Dx is called the covariant derivative along X . Equivalently,
the differential change is
D f = df + rf,
where
r=
dxp.
(7)
The 1-form r is called the connection form. Df is the sum of a “horizontal” part df (which measures change due to the dependence of f on x ) and a “vertical” part (which measures change due to transport).
A gauge transformation is represented locally by a multiplication by a variable phase factor, i.e.
This is a linear map on each fiber Cz, which corresponds in Relativity to the linear map on tangent spaces induced by a coordinate transformation. In fact, since there is no longer any natural way to identify distinct fibers, a gauge transformation is a coordinate transformation of sorts. We therefore require that Df be invariant under gauge transformations, which implies that
r transforms as
6.1. Hoiomorphic Gauge theory
325
Suppose now that we try to complete the analogy with Relativity by deriving I' from a Hermitian metric on the fibers such that the scalar product of two vectors remains invariant under transport. If we assume the metric to be positive-definite, it must have the form
(f(4,9(4) = J ( 4 h ( 4 9 ( 4 ,
(10)
where h ( z ) is a positive function. It will suffice to consider the inner product of f(z) with itself, i.e. the quantity
We require that p be invariant under transport. This means that (f,f ) changes only due to its dependence on 2, i.e.
It follows that
-
r h + h r = dh,
(13)
which constrains the real part of I' but leaves the imaginary part arbitrary. Writing I? = R + iA, where R and A are real 1-forms, we have 2R = dlog h.
(14)
The real part R of the connection can be transformed away by d d n ing f" = h'l2 f and & 1, which gives J = p and
-
Df = h-l/* (df"+ i A f ) h-'/2 ( d + f')f.
(15)
Since p = f f , we may as well assume from the outset that p = lfI2 and I? = iA is purely imaginary.
6. h t h e r Developments
326 Note: The mapping f
H
f” is not
a gauge transformation in the
usual sense; in the standard gauge theory the metric is assumed to be constant ( h ( z )
l), hence only phase translations are allowed. This corresponds to having already transformed away R. It turns out that in phase space, it will be natural to admit non-constant metrics. To make the Klein-Gordon equation invariant under gauge trans-
formations, we now replace a, by D, = d,
+ iA,.
The result is
This equation was known (even before gauge theory) to be a relativistically covariant description of a Klein-Gordon particle in the presence of the electromagnetic field determined by the vector potential A , ( z ) . Hence the connection, which describes a geometric property of the the complex line bundle, acquires a physical significance with respect to electrodynamics, just as did the connection I? with respect to gravitation. When f is differentiated in the usual way, it is unconsciously assumed that the connection vanishes. Coupling to an electromagnetic field then has to be put in “by hand,” through the substitution
a,
+
8, + iA,, which is known as the minimal
coupling prescription. Gauge theory gives this ad-hoc prescription a geometric interpretation. But note that in this case, the fiber metric
h ( z ) did not determine the connection. This is due to the complex structure: iAf cancels in the inner product because it is imaginary. The electromagnetic field generated by the potential A is given by the 2-form
F
= dA,
(17)
6.1. Holomorphic Gauge theory
327
which in fact measures the non-triviality of the connection form A: if F = 0, then A is closed and therefore (locally) exact, i.e. it is due purely to a choice of gauge. This is analogous in Relativity to choosing an accelerating coordinate system, which gives the illusion of gravity. Note the complementary nature of the two theories: in Relativity, the skew part of the connection, which is the torsion, is assumed to vanish. In gauge theory, the inner product becomes Hermitian, and the symmetric and antisymmetric parts correspond to its real and imaginary parts, respectively. It is the real part of the connection which is assumed to vanish in gauge theory. Were r required to be real, it could in fact be transformed away as above, giving rise to no gauge field. That is, only the trivial part of the connection is determined by the metric. The non-trivial (imaginary) part is arbitrary. We will see that when the theory is extended to complex spacetime, the metric does determine a non-trivial connection. Gauge theory usually begins with a Lagrangian invariant under phase translations f H eidf, which form the group U(1). The equations satisfied by f and A,, are derived using variational principles (see Bleeker [1981]). There is a natural generalization where f(z) is an n-dimensional complex vector. In that case, the group of phase tanslations is replaced by a non-abelian group G, usually a subgroup of U ( n ) . G is called the gauge group, and the correponding gauge theory is said to be non-abelian or of the Yang-Mills type. Such theories have in recent years been applied with great success to the two remaining known interactions (aside from gravity and electromagnetism), namely the weak and the strong forces, which involve nuclear matter (Appelquist et al. [1987]).
328
6. firther Developments
Let us now see how non-abelian gauge theory may be extended to complex spacetime. (This will include the abelian case of electrodynamics when n = 1.) Consider a field f on complex spacetime, say on the double tube 7, whose values are n-dimensional complex vectors. The set of all possible values at z E 7 is a complex vector space F, x a* called the fiber at z. The collection of all fibers is called a vector bundle. We assume that this bundle is holomorphic (Wells [1980]), so that holomorphic sections z H f(z) E F,, represented locally by holomorphic vector-valued functions, make sense. ) the complex tangent vector Upon transport along a curve ~ ( thaving 2, f changes by
where e(Z)is a linear map on each fiber. The total differential change is
If z = 5 - iy, then d=a+a, where
Since a general tangent vector has the form
6.1. Holomorphic Gauge theory
329
we have
where 8, = @(a,)and OP = O(8,). Let us now try again to derive the connection from a fiber metric, as we have failed to in the case of real spacetime. A positivedefinite metric on the fibers F, must have the form
where h ( z ) is a positivedefinite matrix. Again, it will suffice to consider the (squared) fiber norm
We define a holomorphic gauge transformation to be a map of the form
where x( z ) is an invertible n x n matrix-valued holomorphic function. Clearly p is invariant under holomorphic gauge transformations. The corresponding gauge group acting on a single fiber F, is the general linear group G = GL(n,a), which includes the usual gauge group U ( n ) . However, analyticity correlates the values of x ( z ) at different fibers. Invariance of the inner product under transport gives
from which we obtain the matrix equation
8*h
+ h8 = dh = ah + dh.
330
6. firther Developments
As in the case of real spacetime, this only determines the Hermitian part (relative to the metric) of 9. But if we make the Ansatz
h9 = a h ,
(29)
then the resulting connection 9 = h - l d h satisfies the above constraint. We now show that in general, the connection 9 is non-trivial, i.e. cannot be transformed away by a holomorphic gauge transformation. Under such a transformation, 9 becomes
8' = ( x * h x ) - l a ( x * h x ) = x - l h-' d h x =x
-
+ x-lax
'ex +x-lax,
since ax* = 0 by analyticity. It follows that
D'f'
E (d
+ 9')f'
= x-'D(xf'),
(31)
hence ' 2
(D) f
I
=x-
1
D
2
(xf').
(32)
If 8 were trivial, then for some gauge we would have 9' = 0, hence = d2 = 0 , so D2 = 0. But
D2f = (d
+ e ) ( d + 9)f = d ( 9 f ) + 9 A clf + 9 A 9f
=(dB)f+9A9f E
of,
(33)
where the 2-form
0 = d9
+ 8 A 9 = 89 + d9 + 9 A 9
(34)
6.1. Holomorphic Gauge theory
331
is the curvature form of the connection 0, analogous to the Riemann curvature tensor in Relativity. Using the matrix equation
88 + 8 A 8 = (-h-'bh
- h-'
) A ah + (h-'ah)
A (h-'ah) = 0.
(36) This is an integrability condition for 0, being a consequence of the fact that 6 can be derived from h. One could say that h is a "potential" for 8. Therefore the quadratic term cancels in eq. (34) and the curvature form reduces to
o = 88.
(37)
Hence if h is such that 8 ( h - ' d h ) # 0, then the connection is nontrivial. The form 0 is the complex spacetime version of a Yang-Mills field, and 8 corresponds to the Yang-Mills potential. For n = 1, 8 and 0 are the complex spacetime versions of the electromagnetic potential and the electroma.gnetic field, respectively. Since h ( z ) is a positive function, it may be written as
where 4 ( z ) is real. Then
e = -84,
o = -8ae.
(39)
To relate 8 and 0 to the electromagnetic potential A and the electromagnetic field F , one performs a non-holomorphic gauge transformation simi1a.r to that in eq.(15): let
332
6. Further Developments
Then the transformed potential becomes purely imaginary,
giving
for the complex spacetime version of the electromagnetic potential. Note that although A , ( z ) is a pure gradient in the y-direction, the corresponding electromagnetic field is not trivial, since
need not vanish, in general. Incidentally, there is an intriguing similarity between the inner product using the fiber metric,
Ja
and that in Onofri's holomorphic coherent states representation (sec.
3.4, eq. (62)). Note that our 0 coincides with Onofri's symplectic form -w. This possible connection remains to be explored.
Remarks. 1. The relation between the Yang-mills potential A and the YangMills field F in real spacetime is F = d A A A A, which is
+
6.I. Holomorphic Gauge theory
333
quadratic in the non-abelian case (n 7 l), since the wedge product A A A involves matrix multiplication. In our case, however, the connection satides the integrability condition given by eq. (36), hence the quadratic term cancels and the relation becomes linear, just as it is normally in the abelian case. The non-holomorphic gauge transformation f H f” in eq. (40) can be generalized to the non-abelian case as follows: Since h ( z ) is a postive matrix, it can be written as h ( z ) = k ( z ) * k ( z ) ,where k ( z ) is an n x n matrix. (Ic(z) need not be Hermitian; the holomorphic case 8k = 0 corresponds to a “pure gauge” field, i.e. 0 = 0.) Setting f(z) = k ( z ) f ( z )and x ( z ) f 1 brings us to the unitary gauge, where the new gauge transformations are given by unitary matices f(z) H V ( ~ ) f ( z This ) . amounts to a reduction of the gauge group from GL(n,C) to U ( n ) . In the unitary gauge, the relation between the connection and the curvature becomes non-linear, as it is in the usual Yang-Mills theory. See Ihiser [1981] for details. 2.
0 has a symmetric real part and an antisymmetric imaginary part. The antisymmetric part corresponds to the usual YangMills field, whereas the symmetric part does not seem to have an obvious counterpart in real spacetime.
3. In complex differential geometry, 6 = h”dh is known as the canonicd connection of type (1, 0) determined by h. (Wells [1980]). The functions f are assumed to be (local representations of) holomorphic sections of the vector bundle. We do not make this assumption, since it appears that analyticity may be lost in the presence of interactions. However, it is possible that f(z) is holomorphic, and it is the non-holomorphic gauge
334
6. Further Developments ttspoils the analyticity. That is, the non-analytic part of the theory may be all contained in the fiber metric h.
6.2 Windowed X-Ray Transforms: Wavelets Revisited
In this final section we generalize the idea behind the Analytic-Signal transform (section 5.2) and arrive at an n-dimensional version of the Wavelet transform (chapters 1 and 2). In a certain sense, relativistic wave functions and fields in complex spacetime may be regarded as generalized wavelet transforms of their counterparts in real spacetime. This is related to the fact, mentioned earlier, that relativistic windows shrink in the direction of motion, due to the Lorentz contraction associated with the hyperbolic geometry of spacetime. This contraction is like the compression associated with wavelets. Other generalizations of the wavelet transform to more than one dimension have been studied (see Malat [1987]), but they are usually obtained from the onedimensional one by taking tensor products, hence are not natural with respect to symmetries of
R" (such as ro-
tations or Lorentz transformations), and consequently do not lends themselves to analysis by group-theoretic methods. The generalization proposed here, which we call a windowed X-Ray transform, does not assume any prefered directions, hence it respects all symmetries of
IR". For example, the transforms of functions over real spacetime
IR"+l will transform naturally under the PoincarC group. If these functions form a representation space for Po (whether irreducible, as in the case of a free particle, or reducible, as in the case of systems of interacting fields), then so do their transforms.
6.2 Windowed X-Ray fiansforms: Wavelets Revisited
335
Let us start directly with the windowed X-Ray transform (Kaiser [1990b]). Fix a window function h:IR + a, which will play a role similar to a “basic wavelet .” For a given (sufficiently well-behaved) function f on IR”,define fh on IR”x IR” by
J-00
Note that for n = 1 and y
# 0,
where W fis the usual Wavelet transform based on the affine group A = R@R*,defined in section 1.6. On the other hand, for arbitrary n but
coincides with the Analytic-Signal transform defined in section 5.2. The name “windowed X-Ray transform” derives from the fact that for the choice h 3 1, f h ( z , y ) is simply the integral of f along the line x ( t ) = x ty, and if IyI = 1 this is known as the X-Ray transform (Helgason [1984]), due to its applications in tomography (Herman [19791). Returning to n dimensions and an arbitrary window function h ( t ) , we wish to know, first of all, whether and how f can be reconstructed from f h . Note that the transform is trivial for y = 0, since fh
+
6. Further Developments
336
that
Rt
R"\{O}. Note also has the following dilation property for a # 0:
so we rule out y = 0 and let y range over fh
where ha(t) E 1al-l h(t/a). To reconstruct f , begin by formally substituting the Fourier representation of f into fh:
where
iz,v is defined by
so that
Note that kx,y,and hence also hz,y,is not square-integrablefor n > 1, since its modulus is constant along directions orthogonal to y.* This means that eq. (6) will not make sense for arbitrary f E L2(Rn). We therefore assume, initially, that f is a test function, say it belongs to
*
So far, we have not assumed any metric structure on R"".Recall (sec. 1.1)that the natural domain of f ( ~ is ) the d u d space (R")*. "Orthogonal" here means that p(y) = 0 as a linear functional. This will be important when IR"is spacetime with its Minkowskian metric.
6.2 Windowed X-Ray lkansforms: Wavelets Revisited
337
the Schwartz space S(R"). Then eq. (6) makes sense with ( I ) as the (sesquilinear) pairing between distributions and test functions. If h is also sufficiently well-behaved, then we can substitute
and change the order of integration in eq. (8),obtaining
h,,,(z') =
J dv i ( v ) J d"p
e--lnip(z'--r)
qpy - v ) .
(10)
To reconstruct f , we look for a resolution of unity in terms of the vectors h,,, (chapter 1). That is, we need a measure d p ( z , y ) on IR," x lR: such that
For then the map T: f and polarization gives
H
fh
is an isometry onto its range in L 2 ( d p ) ,
showing that f = T * ( T f )in L2(IRn), which is the desired reconstruction formula. There are various ways to obtain a resolution of unity, since f is actually overdetermined by fh, i.e. giving the values of fh on all of IR" X Rt amounts to "oversampling," so fh will have to satisfy a consistency condition. We have seen several examples of this in the study of the windowed Fourier transform (section 1.5) and the one-dimensional wavelet transform (section 1.6 and chapter 2), where discrete subframes were obtained starting with a continuous resolution of unity. However, for n > 1, there are other options than discrete subframes, as we will see. In the spirit of the one-dimensional
6. Further Developments
338
wavelet transform, our first resolution of unity will involve an integration over all of IR" x IR;. Note that
so Plancherel's theorem gives
We therefore need a measure dp(y) on
= J dp(y) lh(py) l2
~ ( p )
G
IR: such that 1 for almost all p.
(15)
The solution is simple: Every p # 0 can be transformed to q G (1,0,. . ,O) by a dilation and rotation of R".That is, the orbit of q (in Fourier space) under dilations and rotations is all of I R:. Thus we choose dp to be invariant under rotations and dilations, which gives
where N is a normalization constant and Iyl is the Euclidean norm of y. Then for p # 0,
But a straightforward computation gives
This shows that the measure dp(z, y) z dnz d p ( y ) gives a resolution of unity if and only if
6.2 Windowed X-Ray TI-ansforms: Wavelets Revisited
339
which is precisely the admissibility condition for the usual (onedimensional) Wavelet transform (section 1.6). If h is admissible, the normalization const ant is given by
N = r(n/2) T n l 2ch and the reconstruction formula is then
The sense in which this formula holds depends, of course on the behavior of f. The class of possible f's, in turn, depends on the choice of h. A rigorous analysis of these questions is not easy, and will not be attempted here. Note that in spite of the factor lyln in the denominator, there is no problem at y = 0 since
by the admissibility condition. The behavior of fh for small y can be analized using the dilation property, since eq. ( 5 ) implies that for
X
> 0,
Thus if h ( t ) decays rapidly, say if
Xh(Xt) + 0 then we expect fh(z,y/X)
+0
as
X + 00,
as X + 00.
(24)
6. f i t h e r Developments
340
Since eq. (11) holds for admissible h, we can now allow f E L2(IRn). We would like to characterize the range %T of the map 2’: f H fh from L2(IR”)to L2(dp). The relation
shows that h,,, acts like an evaluation map taking fh E L2(dp) to its “value” at (5,y). These linear maps on !RT are, however, not bounded if n > 1, since then h,,, is not square-integrable. (In general, the “value” of fh at a point may be undefined.) Hence %T is not a reproducing-kernel Hilbert space (chapter 1). But in any case, the distributional kernel
represents the orthogonal projection from L2(dp) onto RT. Thus a given function in L2(dp) belongs to !RT if and only if it satisfies the consistency condition
where the integral is the symbolic representation of the action of K as a distribution. Remarks. 1. For n = 1, the reconstruction formula is identical with the one for the continuous one-dimensional wavelet transform W f , since by eq. (21,
6.2 Windowed X-Ray Transforms: Wavelets Revisited
341
2. In deriving the resolution of unity and the related reconstruction formula, we have tacitly identified Rnas a Euclidean space, i.e. we have equipped it with the Euclidean metric and identified the pairing px in the Fourier transform as the inner product. The exact place where this assumption entered was in using the rotation group plus dilations to obtain IRr from the single vector q, since rotations presume a metric. Having established fh as a generalization of the one-dimensional wavelet transform, let us now investigate it in its own right. First, note that for n = 1 there were only two simple types of candidates for generalized frames of wavelets: (a) all continuous translations and dilations of the basic wavelet, or (b) a discrete subset thereof. For n > 1, any choice of a discrete subset of vectors h,,, spoils the invariance under continuous symmetries such as rotations, and it is therefore not obvious how to use the above grouptheoretic method to find discrete subframes. In fact, the discrete subsets { ( a m ,namb)} which gave frames of wavelets in section 1.6 and chapter 2 do not form subgroups of the &ne group. One of the advantages of using tensor products of one-dimensional wavelets is that they do generate discrete frames for n > 1, though sacrificing symmetry. However, other options exist for choosing generalized (continuous) subframes when n > 1, and one may adapt one’s choice to the problem at hand. Such choices fall between the two extremes of using all the vectors h,,, and merely summing over a discrete subset, as seen in the examples below. 1. The X-Ray fiansfonn The usual X-Ray transform is obtained by choosing h(t)
1, which
342
6. f i r t h e r Developments
is not admissible in the above sense; hence the above “wavelet” reconstruction fails. The reason is easy to see: Note that now fh has the following symmetries: fh(2,ay) = laI-lfh(z,y)
vaE
fh(z -k sy, 9) = fh(z,y) v s E
m*
IR.
(29)
Together, these equations state that fh depends only on the line of integration and not on the way it is parametrized. The first equation shows that integration over all y
# 0 is unnecessary as well as unde-
sira.ble, and it suffices to integrate over the unit sphere Iyl = 1. The second equation shows that for a given y, it is (again) unnecessary and undesirable to integrate over all z, and it suffices to integrate over the hyperplane orthogonal to y. The set of all such (z, y) does, in fact, correspond to the set of all lines in IR”,and the corresponding ~ a continuous frame which gjves the usual reconset of h z , y ’forms struction formula for the X-Ray transform (Helgason [1984]). The moral of the story is that sometimes, inadmissibility in the “wavelet” sense carries a message: Reduce the size of the frame. 2. The Radon llansform Next, choose v E IR and
Like the previous function, this one is inadmissible, hence the “wavelet” reconstruction fails. Again, this can be corrected by understanding the reason for inadmissibility. Eq. (6) now gives fh, (z, y)
=
J d”P e-2xipxqpy - v ) f”(P).
6.2 Windowed X-Ray Dansforms: Wavelets Revisited For any a
343
# 0, we have
where w = v / a . Hence it suffices to restrict the y-integration to the unit sphere, provided we also integrate over v E R. Also, for any 7
E IR,
Fixing x = 0, the function
is called the Radon transform of f” (Helgason [1984]). It may be regarded as being defined on the set of all hyperplanes in the Fourier and f” can be reconstructed by integrating over the set space (R”)*, of these hyperplanes.
3. The Fourier-Laplace Dansform Now consider
which gives rise to the Analytic-Signal transform. (We have adopted a slightly different sign convention than is sec. 5.2. Also, note that we have reinserted a factor of 27r in the exponent in the Fourier transform, which simplifies the notation.) Then fi is the exponential step function (sec. 5.2)
6. Further Developments
344
and eq. (6) reads
J
(37)
> 0). This is the Fourier-Laplace transform of f in My.For n = 1 and y > 0, it reduces to the usual
where My is the half-space {p Ipy
Fourier-Laplace transform. This h, too, is not admissible. f(z) can be recovered simply by letting y + 0, and fh(z,y) may be regarded as a regularization of A
f(z). If the support of f is contained in some closed convex cone r*C (Rn)*, then fh(2,y) f(z - i y ) is holomorphic in the tube 7r over the cone r dual to r*,i.e.
r = {y E IR"~ p y> o 7i
= { z - i y E C:"ly E
vp E
r*) (38)
r).
(Note that no metric has been assumed.) In that case, f(z) is a
boundary value of f( z -iy). This forms the background for the theory of Hardy spaces (Stein and Weiss [1971]). We have encountered a similar situation when R" was spacetime (n = s l),r*= and f(z) was a positive-energy solution of the Klein-Gordon equation;
+
v+,
then I' = V' and 5 = 7+.But in that case, f(z) was not in L2(RSs1) due to the conservation of probability. There it was unnecessary and undesirable to integrate over all of 7+since it was determined by its values on any phase space a+ C 7+,and reconstruction was then achieved by integrating over a+ (chapter 4).
6.2 Windowed X-Ray llansforms: Wavelets Revisited
345
As seen from these examples, the windowed X-Ray transform has the remarkable feature of being related to most of the “classical” integral transforms: The X-Ray, Radon and Fourier-Laplace transforms. Since the Analytic-Signal transform is a close relative of the multivariate Hilbert transform H , (sec. 5.2), we may also add Hgto this collection.
This Page Intentionally Left Blank
347
REFERENCES Abraham, R. and Marsden, J. E. [19781 Foundations of Mechanics, Benjamin/Cummings, Reading, MA. Abramowitz, M. and Stegun, L. A,, eds. [ 19641 Handbook of Mathematical’Functions, National Bureau of Standards, Applied Math. Series 55. Ali, T. and Antoine, J.-P. [1989] Ann. Inst. H. Poincark 51, 23. Applequist, T. Chodos, A. and Freund, P. G. O., eds. [19871 Modern Kaluza-Klein Theories, Addison-Wesley, Reading, MA. Aronszajn, N. [1950] Trans. Amer. Math. SOC.68, 337. Aslaksen, E. W. and Klauder, J. R. [1968] J. Math. Phys. 9, 206. [1969] J. Math. Phys. 10, 2267. Bargmann, V. [1954] Ann. Math. 59, 1. [1961] Commun. Pure Appl. Math. 14,187. Bargmann, V., Butera, P., Girardello, L. and Klauder, J. R. [1971] Reps. Math. Phys. 2, 221. Barut, A.O. and Malin, S. [1968] Revs. Mod. Phys. 40, 632. Berezin, F. A. [1966] The Method of Second Quantization, Academic Press, New York. Bergman, S. [1950] The Kernel finction and Conformal Mapping, Amer. Math. SOC., Providence.
348
References
Bialinicki-Birula, I. and Mycielski, J. [1975] Commun. Math. Phys. 44, 129. Bleeker, D . [1981] Gauge Theory and Variational Principles, Addison- Wesley, Reading, MA. Bohr, N. and Rosenfeld, L. [1950] Phys. Rev. 78,794. Bott, R. [1957] Ann. Math. 66, 203. Carey, A. L. [1976] Bull. Austr. Math. SOC.15, 1. Daubechies, I. [1988a] “The wavelet transform, time-frequency localization and signal analysis,” to be published in IEEE Trans. Inf. Thy. [1988b] Commun. Pure Appl. Math. XLI, 909. De Bikvre, S. [1989] J. Math. Phys. 30,1401. Dell’Antonio, G. F. [1961] J. Math. Phys. 2, 759. Dirac, P. A. M. [1943] Quantum Electrodynamics, Commun. of the Dublin Inst. for Advanced Studies series A, #l. [1946] Developments in Quantum Electrodynamics, Commun. of the Dublin Inst. for Advanced Studies series A, #3. Dufflo, M. and Moore, C. C. [1976] J. Funct. Anal. 21,209. Dyson, F. J. [1956] Phys. Rev. 102,1217. Fock, V. A. 119281 Z. Phys. 49, 339. Gel’fand, I. M., Graev, M. I. and Vilenkin, N. Ya. [1966] Generalized Functions, vol5, Academic Press, New York.
References
349
Gel’fand, I. M. and Vilenkin, N. Ya. [1964] Generalized Functions, vol4, Academic Press, New York. Gerlach, B., Gomes, D. and Petzold, J. [1967] Z. Physik 202,401. Glauber, R. J. [1963a] Phys. Rev. 130,2529. [1963b] Phys. Rev. 131,2766. Glimm, J. and J d e , A. [1981] Quantum Physics, Springer-Verlag, New York. Green, M. W., Schwarz, J. H. and Witten, E. [1987] Superstring Theory, vols. I, 11, Cambridge Univ. Press, Cambridge. Greenberg, 0. W. [1962] J. Math. Phys. 3,859. Guillemin, V. and Sternberg, S. [19841 Symplectic Techniques in Physics, Cambridge Univ. Press, Cambridge. Hai, N. X. [1969] Commun. Math. Phys. 12,331. Halmos, P. [1967] A Hilbert Space Problem Book, Van Nostrand, London. Hegerfeldt, G. C. [1985] Phys. Rev. Lett. 54, 2395. Helgason, S. [19621 Differential Geometry and Symmetric Spaces, Academic Press, New York. [1978] Differentid Geometry, Lie Groups, and Symmetric Spaces, Academic Press, New York. [1984] Groups and Geometric Analysis, Academic Press, New York.
350
References
Henley, E. M. and Thirring, W. [1962] Elementary Quantum Field Theory, McGraw-Hill, New York. Herman, G. T., ed. [1979] Image Reconstruction from Projections: Implementation and Applications, Topics in Applied Physics 32,SpringerVerlag, New York. Hermann, R. [1966] Lie Groups for Physicists, Benjamin, New York. Hille, E. [1972] Rocky Mountain J. Math. 2, 319. Inonu, E. and Wigner, E. P. [1953] Proc. Nat. Acad. Sci., USA, 39,510. Itzykson, C. and Zuber, J.-B. [1980] Quantum Field Theory, McGraw-Hill, New York. Jost, R. [1965] The General Theory of Quantized Fields, Amer. Math. SOC.,Providence. Kaiser, G. [1974] “Coherent States and Asymptotic Representations,’’ in Proc. of Third Internat. Colloq. on Group-Theor. Meth. in Physics, Marseille. [1977a] “ Relativistic Coherent-State Representations,” in Group-Theoretical Methods in Physics, R. Sharp and B. Coleman, eds., Academic Press, New York. [1977b] J. Math. Phys. 18,952. [ 1977~1Phase-Space Approach to Relativistic Quantum Mechanics, Ph. D. thesis, Mathematics Dept, Univ. of Toronto. [1977d] “Phase-Space Approach to Relativistic Quantum Mechanics,” in proc. of Eighth Internat. Conf. on General Relativity, Waterloo, Ont. [1978a] J. Math. Phys. 19,502.
References
351
[1978b] “Hardy spaces associated with the PoincarC group,” in Lie Theories and Their Applications, W. Rossmann, ed., Queens Univ. Press, Ont. [1978c] “Local Fourier Analysis and Synthesis,” Univ. of Lowell preprint (unpublished). [1979] Lett. Math. Phys. 3,61. [1980al “Holomorphic gauge theory,” in Geometric Methods in Mathematical Physics, G. Kaiser and J. E. Marsden, eds., Lecture Notes in Math. 775, Springer-Verlag, Heidelberg. [1980b] “Relativistic quantum theory in complex spacetime,” in Differential-Geometrical Methods in Mathematical Physics, P.L. Garcia, A.Perez-Rendon and J.-M. Souriau, eds., Lecture Notes in Math. 836,Springer-Verlag, Berlin. [1981] J . Math. Phys. 22, 705. [1984a] “A sampling theorem for signals in the joint time- frequency domain,” Univ. of Lowell preprint (unpublished). [1984b] “On the relation between the formalisms of PrugoveEki and Kaiser,” Univ. of Lowell preprint, presented as a poster at the VII-th International Congress of Mathematical Physics, in Mathematical Physics VII, W. E. Brittin, K. E. Gustafson and W. Wyss, eds., North-Holland, Amsterdam [1987a] Ann. Phys. 173,338. [1987b] “A phase-space formalism for quantized fields,” in Group-Theoretical Methods in Physics, R. Gilmore, ed., World Scientific, Singapore. [199Oa] “Algebraic Theory of Wavelets. I: Operational Calculus and Complex Structure,” Technical Repors Series #17, Univ. of Lowell.
352
References
[1990b] “Generalized Wavelet Transforms. I: The Windowed XRay Tkansform,” Technical Repors Series #IS, Univ. of Lowell. [1990~1“Generalized Wavelet Transforms. 11: The Multivariate Analytic-Signal Transform,” Technical Repors Series #19, Univ. of Lowell. Kirillov, A. A. [1976] Elements of the Theory of Representatins, SpringerVerlag, Heidelberg. Klauder, J. R. [1960] Ann. Phys. 11,123. [1963a] J Math. Phys. 4, 1055. [1963b] J Math. Phys. 4, 1058. Klauder, J. R. and Skagerstam, B.-S., eds. [1985] Coherent States, World Scientific, Singapore. Klauder, J. R. and Sudarshan, E. C. G. [1968] Fundamentals of Quantum Optics, Benjamin, New York. Kobayashi, S. and Nomizu, K. [1963] Foundations of Differential Geometry, vol. I, Interscience, New York. [1969] Foundations of Differential Geometqy, vol. 11, Interscience, New York. Kostant, B. [1970] “Quantization and Unitary Representatins,” in Lectures in Modern Analysis and Applications, R. M. Dudley et al., eds. Lecture Notes in Math. 170,Springer-Verlag, Heidelberg. Lemarik, P. G. and Meyer, Y. [1986] Rev. Math. Iberoamericana 2, 1. LurCat, F. [1964] Physics 1, 95.
References
353
Mackey, G. [19551 The Theory of Group-Representations, Lecture Notes, Univ. of Chicago. [1968] Induced Representations of Groups and Quantum Mechanics, Benjamin, New York. Mallat, S. [19871 “Multiresolution approximation and wavelets,” preprint GRASP Lab., Dept. of Computer and Information Science, Univ. of Pennsylvania, to be published in Trans. Amer. Math. SOC. Marsden, J. E. [1981] Lectures on Geometric Methods in Mathematical Physics, Notes from a Conference held at Univ. of Lowell, CBMSNSF Regional Conference Series in Applied Mathematics vol. 37,SIAM, Philadelphia, PA. Meschkowski, H. [19621 Hilbertsche Raume mit Kernfunktion, Springer-Verlag, Heidelberg. Messiah, A. [1963] Quantum Mechanics, vol. 11, North-Holland, Amsterdam. Meyer, Y. [19851 “Principe d’incertitude, bases hilbertiennes et algkbres d ’opkrateurs ,” Sdminaire Bourbaki 662. [1986] “Ondelettes et functions splines,” S6minaire EDP, Ecole Polytechnique, Paris, France (December issue). Misner, C, Thorne, K. S. and Wheeler, J. A. [1970] Gravitation, W. H.Freeman, San Francisco. Nelson, E. [1959] Ann. Math. 70,572. [1973a] J. Funct. Anal. 12,97. [1973b] J. Funct. Anal. 12,211.
354
References
Neumann, J. von [1931] Math. Ann. 104,570. Newton, T. D. and Wigner, E. [1949] Rev. Mod. Phys. 21,400. Onofri, E. [1975] J. Math. Phys. 16,1087. Papoulis, A. [1962] The Fourier Integral and its Applications, McGraw-Hill, New York. Perelomov, A. M. [1972] Commun. Math. Phys. 26,222. [1986] Generalized Coherent States and Their Applications, Springer-Verlag, Heidelberg. PrugoveEki, E. [1976] J. Phys. A9, 1851. [1978] J. Math. Phys. 19,2260. [1984] Stochastic Quant urn Mechanics and Quantum Spacetime, D. Reidel, Dordrecht. Robinson, D. W. [1962] Helv. Physica Acta 35,403. Roederer, J. G. [1975] Introduction to the Physics and Psychophysics of Music, Springer-Verlag, New York. Schrodinger, E. [1926] Naturwissenschaften 14,664. Segal, I. E. [1956a] Trans. Amer. Math. SOC.81,106. [1956b] Ann. Math. 63,160. [1958] Trans. Amer. Math. SOC.88,12. [1963a] Illinois J. Math. 6,500. [1963b] Mathematical Problems of Relativistic Physics, Amer. Math. SOC.,Providence.
References
355
[1965] Bull. Amer. Math. SOC.71,419. Simms, D.J. and Woodhouse, N. M. J. [1976] Lectures on Geometric Quantization, Lecture Notes in Physics 53, Springer-Verlag, Heidelberg. Sniatycki, J. [19801 Geometric Quantization and Quantum mechanics, Springer-Verlag, Heidelberg. Souriau, J .-M. [1970] Structure des Syst&mesDynamiques, Dunod, Paris. Stein, E. [19701 Singular Integrals and Differentiability Properties of finctions, Princeton Univ. Press, Princeton. Strang, G. [1989] SIAM Rev. 31,614. Streater, R. F. [1967] Commun. Math. Phys. 4, 217. Streater, R. F. and Wightman, A. S. [1964] PCT, Spin & Statistics, and All That, Benjamin, New York. Thirring, W. [ 19801 Quantum Mechanics of Large Systems, Springer-Verlag, New York. Toller, M. [1978] Nuovo Cim. 44B,67. Unterberger, A. [1988] Commun. in Partial DifF. Eqs. 13,847. Varadarajan, V. S. [1968] Geometry of Quantum Theory, vol. I, D.Van Nostrand, Princeton, NJ. [1970] Geometry of Quantum Theory, vol. 11, D.Van Nostrand, Princeton, NJ.
356
References
[1974] Lie Groups, Lie Algebras, and their Representations, Prentice-Hall, Englewood Cliffs,NJ. Warner, F. [19711 Foundations of Differentiable Manifolds and Lie Groups, Scott, Foresman and Co., London. Wells, R. 0. [19801 Differential Analysis on Complex Manifolds, SpringerVerlag, New York. Weyl, H. [1919] Ann. d. Physik 59, 101. [1931] The Theory of Groups and Quantum Mechanics, translated by H. P. Robertson, Dover, New York Wightman, A. S. [1962] Rev. Mod. Phys. 34, 845. Yosida, G. [1971] finctional Analysis, Springer-Verlag, Heidelberg. Young, R. M. [1980] An Introduction to Non-Harmonic Fourier Series, Academic Press, New York. Zakai, M. [1960] Information and Control 3, 101.
357
INDEX admissible, 49, 100, 112, 339 affine group, 46 analytic signal, 239 -transform, 243ff, 335, 343 analytic vector, 116 anticommutators, 292 averaging operator h ( S ) , 62ff, 75 Bosom, 293 bra-ket not ation , 6ff canonical anticommutation relations, 294 commutation relations, 9, 11, 162, 276 canonical transformation, 211 Cartan subalgebra, 117 central extension, 168 characteristic function , 3 12 charge, 264ff, 282, 296 -current jfi,284, 296 density p, 283, 296 coherent state, canonical, 9ff, 115 field-, 308ff Galilean e i , 172ff holomorphic, 130 interpolating, 302ff particle-, 266 relativistic e,, 19Off -representation, 13ff spin-, 135 complex line bundle, 323 complex manifold, 129 complex structure J , 70ff complex vector bundle, 328 configuration space S, 208, 212 conjugation operator C, 72
connection, 322ff -form, 324 Riemannian, 323 type (1, O), 333 consistency condition, 22, 340 contraction limit, 121, 145ff, 161 control vector y, 195, 272 coset spaces, 109 covariant derivative Dx , 32M curvature 0,331 De Broglie’s relation, 4 decomposition, 82ff complex, 87ff differencing operator g(S), 75 dilation, 45, 58, 336 -equation, 63 -operator D, 62 Dirac equation, 393 -field 4, 289ff directional holomorphy, 247 electromagnetic field F, 326 -potential A, 325 energy, 4 evaluation maps, 32, 190 exponential step function Oc , 241 Fermions, 293 fiber, 323 -metric h, 325, 329 filter, bandpass, 45 complex 2,2, 85ff high-pass Gal 78ff lOW-pasS Ha,65 Fourier transform, 3, 5 windowed, 34, 36
358 Fourier-Laplace transform, 240, 244, 344 frame, l8ff, 33, 37, 49 discrete, 38-40, 55 group-, 95ff holomorphic, 113ff homogeneous, 103ff frame bundle, 159 functional integral, 316 gauge group, 327 -symmetry, 300 gauge transformation, 324 holomorphic-, 329 Galilean group 6,163ff General hlativity, 321 Gordon identity, 297 Haar basis, 63, 75 harmonic oscillator, 11, 145ff Heisenberg algebra, 9 -picture, 198 Hilbert transform H , 243 H,, 247, 345 home versions, 65 -space V, 65 homogeneous space, 110 interpolation operator H: , 65 Killing form, 119 Klein-Gordon equation, 185ff field 4, 273ff light cones V+,V . , 189 Lorentz group L,157 -metric, 2 Lorentzian spacetime, 2 mass rn, 162 -shell Om, 185 metric, 322ff minimal coupling, 326 Mobius transformation, 140
Index momentum, 4 multiscale analysis, 58, 64 natural units, 5 naturality, 65 non-relativistic limit, 225ff number operators N* , 281 , 295 orientation, 2 13 oversampling, 41 Pauli exclusion principle, 293 phase space, 36, 156 b, 202, 207ff, 279ff Planck’s Ansatz, 4 Poincarh group P,157ff polarization (of frequencies), 301 position operators, 162, 198 probability -current, 207ff, 215, 219 -density, relativistic, 206 quantization, 162-163 Radon transform, 342 reconstruction, 38, 40, 82ff, 33Gff complex, 87ff regularized current, 285, 298 representation of group, 33, 96, 124 of vector space, 8 project,ive, 112 Schrodinger, 9, 105 square-integrable, 49, 112 reproducing kernel, 22, 29ff, 49, 193, 287, 298 resolution of unity, 8, 15, 18ff, 23, 24, 37, 49, 205ff root, 118 -subspace, 118 -vector, 118 sampling -rate, 50 time-frequency, 40
Index
359
Schrodinger equation, 169 section (of bundle), 109 shift operator S,60 signal, 34 state space C, 159 statistics, 293 stereographic projection, 145 symplect ic form, 135,208ff geometry, 155 temper vector, 197, 272 tube 7, 190,249 two-point functions Ah, 193,268,287 uncertainty principle, 5, 10, 163 vector bundle, 328 vector potential A,, 326 wave number vector, 4
wavelet, 44 mother (basic) $, 44,58,75,79 -transform, 44ff,334ff weight, 126 Weyl-Heisenberg group W ,38, 104, 159ff Wick ordering, 282ff Wightman axioms,252ff window, 35 relativistic, 58, 178,334 X-ray transform, 245, 341 windowed, 334ff Yang-Mills field, 331-332 -potential, 331 -theory, 327 Zitterbewegung , 301 zoom operators H, H ' , 66,84
This Page Intentionally Left Blank