S.V. Shabanov / Physics Reports 326 (2000) 1-163
GEOMETRY OF THE PHYSICAL PHASE SPACE IN QUANTUM GAUGE SYSTEMS
Sergei V. SHABANOV Department of Mathematics, University of Florida, Gainesville, FL 32611-2085, USA
Received June 1999; editor: J.A. Bagger
Contents

1. Introduction
2. The physical phase space
3. A system with one physical degree of freedom
3.1. Lagrangian formalism
3.2. Hamiltonian dynamics and the physical phase space
3.3. Symplectic structure on the physical phase space
3.4. The phase space in curvilinear coordinates
3.5. Quantum mechanics on a conic phase space
4. Systems with many physical degrees of freedom
4.1. Yang-Mills theory with adjoint scalar matter in (0+1) spacetime
4.2. The Cartan-Weyl basis in Lie algebras
4.3. Elimination of non-physical degrees of freedom. An arbitrary gauge group case
4.4. Hamiltonian formalism
4.5. Classical dynamics for groups of rank 2
4.6. Gauge invariant canonical variables for groups of rank 2
4.7. Semiclassical quantization
4.8. Gauge matrix models. Curvature of the orbit space and the kinematic coupling
5. Yang-Mills theory in a cylindrical spacetime
5.1. The moduli space
5.2. Geometry of the gauge orbit space
5.3. Properties of the measure on the gauge orbit space
6. Artifacts of gauge fixing in classical theory
6.1. Gribov problem and the topology of gauge orbits
6.2. Arbitrary gauge fixing in the SO(2) model
6.3. Revealing singularities in a formally gauge-invariant Hamiltonian formalism
6.4. Symplectic structure on the physical phase space
7. Quantum mechanics and the gauge symmetry
7.1. Fock space in gauge models
7.2. Schrödinger representation of physical states
7.3. The Schrödinger representation in the case of many physical degrees of freedom
7.4. The theorem of Chevalley and the Dirac states for groups of rank 2
7.5. The operator approach to quantum Yang-Mills theory on a cylinder
7.6. Homotopically non-trivial Gribov transformations
Footnote to the author's name: On leave from Laboratory of Theoretical Physics, Joint Institute for Nuclear Research, Dubna, Russia. E-mail address: [email protected]#.edu (S.V. Shabanov).

0370-1573/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0370-1573(99)00085-X
7.7. Reduced phase-space quantization versus the Dirac approach
8. Path integrals and the physical phase space structure
8.1. Definition and basic properties of the path integral
8.2. Topology and boundaries of the configuration space in the path integral formalism
8.3. Gribov obstruction to the path integral quantization of gauge systems
8.4. The path integral on the conic phase space
8.5. The path integral in the Weyl chamber
8.6. Solving the Gribov obstruction in the 2D Yang-Mills theory
8.7. The projection method and a modified Kato-Trotter formula for the evolution operator in gauge systems
8.8. The modified Kato-Trotter formula for gauge models. Examples
8.9. Instantons and the phase space structure
8.10. The phase space of gauge fields in the minisuperspace cosmology
9. Including fermions
9.1. 2D SUSY oscillator with a gauge symmetry
9.2. Solving Dirac constraints in curvilinear supercoordinates
9.3. Green's functions and the configuration space structure
9.4. A modified Kato-Trotter formula for gauge systems with fermions
10. On the gauge orbit space geometry and gauge fixing in realistic gauge theories
10.1. On the Riemannian geometry of the orbit space in classical Yang-Mills theory
10.2. Gauge fixing and the Morse theory
10.3. The orbit space as a manifold. Removing the reducible connections
10.4. Coordinate singularities in quantum Yang-Mills theory
10.5. The projection method in the Kogut-Susskind lattice gauge theory
11. Conclusions
Acknowledgements
References
Abstract

The physical phase space in gauge systems is studied. Simple soluble gauge models are considered in detail. Effects caused by a non-Euclidean geometry of the physical phase space in quantum gauge models are described in the operator and path integral formalisms. The projection on the Dirac gauge invariant states is used to derive a necessary modification of the Hamiltonian path integral in gauge theories of the Yang-Mills type with fermions that takes into account the non-Euclidean geometry of the physical phase space. The new path integral is applied to resolve the Gribov obstruction. Applications to the Kogut-Susskind lattice gauge theory are given. The basic ideas are illustrated with examples accessible for nonspecialists. © 1999 Elsevier Science B.V. All rights reserved.

PACS: 11.15.-q

Keywords: Gauge theories; Dirac quantization; Constrained systems; Path integral; Gribov problem; Orbit space
1. Introduction

Yang-Mills theory and gauge theories in general play the most profound role in our present understanding of the universe. Nature is quantum in its origin so any classical gauge model should be promoted to its quantum version in order to be used as a model of the physical reality. We usually do this by applying one or another quantization recipe which we believe to lead to a consistent quantum theory. In general, quantization is by no means unique and should be regarded as a theoretical way to guess the true theory. We certainly expect any quantization procedure to comply with some physical principles, like the correspondence principle, gauge invariance, etc. And finally, the resulting quantum theory should not have any internal contradiction. All these conditions are too loose to give us a unique quantization recipe.

The simplest way to quantize a theory is to use canonical quantization based on the Hamiltonian formalism of the classical theory. Given a set of canonical coordinates and momenta, one promotes them into a set of self-adjoint operators satisfying the Heisenberg commutation relations. Any classical observable, as a function on the phase space, becomes a function of the canonical operators. Due to the non-commutativity of the canonical operators, there is no unique correspondence between classical and quantum observables. One can modify a quantum observable by adding some operators proportional to commutators of the canonical operators. This will not make any difference in the formal limit when the Planck constant, which 'measures' the noncommutativity of the canonical variables, vanishes. In classical mechanics, the Hamiltonian equations of motion are covariant under general canonical transformations. So there is no reason to prefer a particular set of canonical variables to span the phase space of the system.
It was, however, found in practice that canonical quantization would be successful only when applied with the phase space coordinates referring to a Cartesian system of axes and not to more general curvilinear coordinates [1]. On the other hand, a global Cartesian coordinate system can be found only if the phase space of the system is Euclidean. This comprises a fundamental restriction on the canonical quantization recipe.

Another quantization method is due to Feynman [2] which, at first sight, seems to avoid the use of non-commutative phase space variables. Given a classical action for a system in the Lagrangian form, which is usually assumed to be quadratic in velocities, the quantum mechanical transition amplitude between two fixed points of the configuration space is determined by a sum over all continuous paths connecting these points with weight being the phase exponential of the classical action divided by the Planck constant. Such a sum is called the Lagrangian path integral. If the action is taken in the Hamiltonian form, the sum is extended over all phase-space trajectories connecting the initial and final states of the system and, in addition, this sum also involves integration over the momenta of the final and initial states. Recall that a phase-space point specifies uniquely a state of a Hamiltonian system in classical theory. Such a sum is called the Hamiltonian path integral. One should however keep in mind that such a definition of the Hamiltonian path integral (as a sum over paths in a phase space) is formal. One usually defines it by a specific finite dimensional integral on the time lattice rather than a sum over paths in a phase space. The correspondence principle follows from the stationary phase approximation to the sum over paths when the classical action is much greater than the Planck constant. The stationary point, if any, of the action is a classical trajectory. So the main contribution to the sum over paths comes from
paths fluctuating around the classical trajectory. But again, one could add some terms of higher orders in the Planck constant to the classical action without changing the classical limit. Despite this ambiguity, Feynman's sum over paths looks like a miracle because no noncommutative phase-space variables are involved in the quantum mechanical description. It just seems like the knowledge of a classical theory is sufficient to obtain the corresponding quantum theory. Moreover, the phase-space path integral with the local Liouville measure seems to enjoy another wonderful property of being invariant under general canonical transformations. Recall that the Liouville measure is defined as a volume element on the phase space which is invariant under canonical transformations. One may tend to the conclusion that the phase-space path integral provides a resolution of the aforementioned problem of the canonical quantization. This is, however, a trap hidden by the formal definition of the path integral measure as a product of the Liouville measures at each moment of time. For systems with one degree of freedom one can easily find a canonical transformation that turns a generic Hamiltonian into one for a free particle or harmonic oscillator. It is obvious that the quantum mechanics of a generic one-dimensional system is not that of the harmonic oscillator. From this point of view the Feynman integral should also be referred to the Cartesian coordinates on the phase space, unless the formal measure is properly modified [3-5]. So, we conclude that the existence of the Cartesian coordinates that span the phase space is indeed important for both the canonical and path integral quantization. When quantizing a system by one of the above methods, one often makes an implicit assumption that the phase space of the physical degrees of freedom is Euclidean, i.e., it admits a global Cartesian system of coordinates.
We will show that, in general, this assumption is not justified for physical degrees of freedom in systems with gauge symmetry. Hence, all the aforementioned subtleties of the path integral formalism play a major role in the path integral quantization of gauge systems. The true geometry of the physical phase space must be taken into account in quantum theory, which significantly affects the corresponding path integral formalism.

Gauge theories have a characteristic property that the Euler-Lagrange equations of motion are covariant under symmetry transformations whose parameters are general functions of time. Therefore the equations of motion do not determine completely the time evolution of all degrees of freedom. A solution under specified initial conditions on the positions and velocities would contain a set of general functions of time, which is usually called gauge arbitrariness [6]. Yet, some of the equations of motion have no second time derivatives, so they are constraints on the initial positions and velocities. In the Hamiltonian formalism, one has accordingly constraints on the canonical variables [6]. The constraints in gauge theories enjoy an additional property. Their Poisson bracket with the canonical Hamiltonian as well as among themselves vanishes on the surface of the constraints in the phase space (first-class constraints according to the Dirac terminology [6]). Because of this property, the Hamiltonian can be modified by adding to it a linear combination of the constraints with general coefficients, called the Lagrange multipliers of the constraints or just gauge functions or variables. This, in turn, implies that the Hamiltonian equations of motion would also contain a gauge arbitrariness associated with each independent constraint. By changing the gauge functions one changes the state of the system if the latter is defined as a point in the phase space. These are the gauge transformations in the phase space.
On the other hand, the physical state of the system cannot depend on the gauge arbitrariness. If one wants to associate a single point of the phase space with each physical state of the system, one is necessarily led to the
conclusion that the physical phase space is a subspace of the constraint surface in the total phase space of the system. Making it more precise, the physical phase space should be the quotient of the constraint surface by the gauge transformations generated by all independent constraints. Clearly, the quotient space will generally not be a Euclidean space. One can naturally expect some new phenomena in quantum gauge theories associated with a non-Euclidean geometry of the phase space of the physical degrees of freedom because quantum theories determined by the same Hamiltonian as a function of canonical variables may be different if they have different phase spaces, e.g., the plane and spherical phase spaces. This peculiarity of the Hamiltonian dynamics of gauge systems looks interesting and quite unusual for dynamical models used in fundamental physics, and certainly deserves a better understanding.

In this review we study the geometrical structure of the physical phase space in gauge theories and its role in the corresponding quantum dynamics. Since the path integral formalism is the main tool in modern fundamental physics, special attention is paid to the path integral formalism for gauge models whose physical phase space is not Euclidean. This would lead us to a modification of the conventional Hamiltonian path integral used in gauge theories, which takes into account the geometrical structure of the physical phase space. We also propose a general method to derive such a path integral that is in a full correspondence with the Dirac operator formalism for gauge theories. Our analysis is mainly focused on soluble gauge models where the results obtained by different methods, say, by the operator or path integral formalisms, are easy to compare, and thereby, one has a mathematical control of the formalism being developed. In realistic gauge theories, a major problem is to make the quantum theory well-defined non-perturbatively.
Since the perturbation theory is not sensitive to the global geometrical properties of the physical phase space (which is just a fact for the theory in hand), we do not go into speculations about the realistic case, because there is an unsolved problem of the non-perturbative definition of the path integral in a strongly interacting field theory, and limit the discussion to reviewing existing approaches to this hard problem. However, we consider a Hamiltonian lattice gauge theory due to Kogut and Susskind and extend the concepts developed for low-dimensional gauge models to it. In this case we have a rigorous definition of the path integral measure because the system has a finite number of degrees of freedom. The continuum limit still remains as a problem to reach the goal of constructing a non-perturbative path integral in gauge field theory that takes into account the non-Euclidean geometry of the physical phase space. Nevertheless from the analysis of simple gauge models, as well as from the general method we propose to derive the path integral, one might anticipate some new properties of the modified path integral that would essentially be due to the non-Euclidean geometry of the physical phase space.

The review is organized as follows. In Section 2 a definition of the physical phase space is given. Section 3 is devoted to mechanical models with one physical degree of freedom. In this example, the physical phase space is shown to be a cone unfoldable into a half-plane. Effects of the conic phase space on classical and quantum dynamics are studied. In Section 4 we discuss the physical phase space structure of gauge systems with several physical degrees of freedom. Special attention is paid to a new dynamical phenomenon which we call a kinematic coupling.
The point is that, though physical degrees of freedom are not coupled in the Hamiltonian, i.e., they are dynamically decoupled, their dynamics is nonetheless not independent due to a non-Euclidean structure of their phase space (a kinematic coupling). This phenomenon is analyzed both in classical mechanics and in quantum theory. It is shown that the kinematic coupling has a significant effect on the spectrum of the
physical quantum Hamiltonian. In Section 5 the physical phase space of Yang-Mills theory on a cylindrical spacetime is studied. A physical configuration space, known as the gauge orbit space, is also analyzed in detail. Section 6 is devoted to artifacts which one may encounter upon a dynamical description that uses a gauge fixing (e.g., the Gribov problem). We emphasize the importance of establishing the geometrical structure of the physical phase space prior to fixing a gauge to remove nonphysical degrees of freedom. With simple examples, we illustrate dynamical artifacts that might occur through a bad, though formally admissible, choice of the gauge. A relation between the Gribov problem, topology of the gauge orbits and coordinate singularities of the symplectic structure on the physical phase space is discussed in detail. In Section 7 the Dirac quantization method is applied to all the models. Here we also compare the so called reduced phase space quantization (quantization after eliminating all non-physical degrees of freedom) and the Dirac approach. Pitfalls of the reduced phase space quantization are listed and illustrated with examples. Section 8 is devoted to the path integral formalism in gauge theories. The main goal is a general method which allows one to develop a path integral formalism equivalent to the Dirac operator method. The new path integral formalism is shown to resolve the Gribov obstruction to the conventional Faddeev-Popov path integral quantization of gauge theories of the Yang-Mills type (meaning that the gauge transformations are linear in the total phase space). For soluble gauge models, the spectra and partition functions are calculated by means of the Dirac operator method and the new path integral formalism. The results are compared and shown to be the same. The path integral formalism developed is applied to instantons and minisuperspace cosmology. In Section 9 fermions are included into the path integral formalism.
We observe that the kinematic coupling induced by a non-Euclidean structure of the physical phase space occurs for both fermionic and bosonic physical degrees of freedom, which has an important effect on the quantum dynamics of fermions. In particular, the modification of fermionic Green's functions in quantum theory is studied in detail. Section 10 contains a review of geometrical properties of the gauge orbit space in realistic classical Yang-Mills theories. Various approaches to describe the effects of the non-Euclidean geometry of the orbit space in quantum theory are discussed. The path integral formalism of Section 8 is applied to the Kogut-Susskind lattice Yang-Mills theory. Conclusions are given in Section 11.

The material of the review is presented in a pedagogical fashion and is believed to be easily accessible for non-specialists. However, a basic knowledge of quantum mechanics and group theory might be useful, although the necessary facts from the group theory are provided and explained as needed.

One of the widely used quantization techniques, the BRST quantization (see, e.g., [20]) is not discussed in the review. Partially, this is because it is believed that on the operator level the BRST formalism is equivalent to the Dirac method and, hence, the physical phenomena associated with a non-Euclidean geometry of the physical phase space can be studied by either of these techniques. The Dirac method is technically simpler, while the BRST formalism is more involved as it requires an extension of the original phase space rather than its reduction. The BRST formalism has been proved to be useful when an explicit relativistic invariance of the perturbative path integral has to be maintained.
Since the discovery of the BRST symmetry [215,216] of the Faddeev-Popov effective action and its successful application to perturbation theory [217], there existed a belief that the path integral for theories with local symmetries can be defined as a path integral for an effective theory with the global BRST symmetry. It was pointed out [22,23] that this equivalence
breaks down beyond the perturbation theory. The conventional BRST action may give rise to a zero partition function as well as to vanishing expectation values of physical operators. The reason for such a failure boils down to the non-trivial topology of the gauge orbit space. Therefore, a study of the role of the gauge orbit space in the BRST formalism is certainly important. In this regard one should point out the following. There is a mathematical problem within the BRST formalism of constructing a proper inner product for physical states [24]. This problem appears to be relevant for the BRST quantization scheme when the Gribov problem is present [25]. An interesting approach to the inner product BRST quantization has been proposed in [26,27] (cf. also [20, Chapter 14]) where the norm of physical states is regularized. However if the gauge orbits possess a non-trivial topology, it can be shown that there may exist a topological obstruction to defining the inner product [28]. There are many proposals to improve a formal BRST path integral [29]. They will not be discussed here. The BRST path integral measure is usually ill-defined, or defined as a perturbation expansion around the Gaussian measure, while the effects in question are non-perturbative. Therefore, the validity of any modification of the BRST path integral should be tested by comparing it with (or deriving it from) the corresponding operator formalism. It is important that the gauge invariance is preserved in any modification of the conventional BRST scheme. As has been already mentioned, the BRST operator formalism needs a proper inner product, and a construction of such an inner product can be tightly related to the gauge orbit space geometry. It seems to us that more studies are still needed to come to a definite conclusion about the role of the orbit space geometry in the BRST quantization.
2. The physical phase space

As has been emphasized in the preceding remarks, solutions to the equations of motion of gauge systems are not fully determined by the initial conditions and depend on arbitrary functions of time. Upon varying these functions the solutions undergo gauge transformations. Therefore at any moment of time, the state of the system can only be determined modulo gauge transformations. Bearing in mind that the gauge system never leaves the constraint surface in the phase space, we are led to the following definition of the physical phase space. The physical phase space is a quotient space of the constraint surface relative to the action of the gauge group generated by all independent constraints. Denoting the gauge group by $\mathcal{G}$, and the set of constraints by $\sigma_a$, the definition can be written in the compact form

$$ \mathrm{PS}_{\rm phys} = \mathrm{PS}\big|_{\sigma_a = 0}\,/\,\mathcal{G} , \tag{2.1} $$

where PS is the total phase space of the gauge system, usually assumed to be a Euclidean space. If the gauge transformations do not mix generalized coordinates and momenta, one can also define the physical configuration space

$$ \mathrm{CS}_{\rm phys} = \mathrm{CS}/\mathcal{G} . \tag{2.2} $$
As they stand, definitions (2.1) and (2.2) do not depend on any parametrization (or local coordinates) of the configuration or phase space. In practical applications, one always uses some particular sets of local coordinates to span the gauge invariant spaces (2.1) and (2.2). The choice can
be motivated by a physical interpretation of the preferable set of physical variables or, e.g., by simplicity of calculations, etc. So our first task is to learn how the geometry of the physical phase space is manifested in a coordinate description. Let us turn to some examples of gauge systems to illustrate formulas (2.1) and (2.2) and to gain some experience in classical gauge dynamics on the physical phase space.
3. A system with one physical degree of freedom

Consider the Lagrangian

$$ L = \tfrac{1}{2}\left(\dot{\mathbf{x}} - y^a T_a \mathbf{x}\right)^2 - V(\mathbf{x}^2) . \tag{3.1} $$

Here $\mathbf{x}$ is an N-dimensional real vector, $T_a$ are real $N \times N$ antisymmetric matrices, generators of SO(N), and $(T_a \mathbf{x})^i = (T_a)^i{}_j x^j$. Introducing the notation $y = y^a T_a$ for an antisymmetric real matrix (an element of the Lie algebra of SO(N)), the gauge transformations under which the Lagrangian (3.1) remains invariant can be written in the form

$$ \mathbf{x} \to \Omega \mathbf{x}, \qquad y \to \Omega y \Omega^T - \Omega \dot{\Omega}^T , \tag{3.2} $$
where $\Omega = \Omega(t)$ is an element of the gauge group SO(N), $\Omega^T \Omega = \Omega \Omega^T = 1$, and $\Omega^T$ is the transposed matrix. In fact, the Lagrangian (3.1) is invariant under a larger group O(N). As we learn shortly (cf. a discussion after (3.8)), only a connected component of the group O(N), i.e. SO(N), can be identified as the gauge group. Recall that a connected component of a group is obtained by the exponential map of the corresponding Lie algebra. We shall also return to this point in Section 7.1 when discussing the gauge invariance of physical states in quantum theory. The model has been studied in various aspects [7-10]. For our analysis, the work [9] of Prokhorov will be the most significant one.

The system under consideration can be thought of as the (0+1)-dimensional Yang-Mills theory with the gauge group SO(N) coupled to a scalar field in the fundamental representation. The real antisymmetric matrix $y(t)$ plays the role of the time-component $A_0(t)$ of the Yang-Mills potential (in fact, the only component available in (0+1)-spacetime), while the variable $\mathbf{x}(t)$ is the scalar field in (0+1)-spacetime. The analogy becomes more transparent if one introduces the covariant derivative $D_t \mathbf{x} \equiv \dot{\mathbf{x}} - y\mathbf{x}$ so that the Lagrangian (3.1) assumes the form familiar in gauge field theory

$$ L = \tfrac{1}{2}(D_t \mathbf{x})^2 - V(\mathbf{x}^2) . \tag{3.3} $$
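The invariance of (3.1) under the transformations (3.2) can be checked numerically at a single instant of time. The sketch below is only an illustration under stated assumptions: it takes N = 3, builds a sample $\Omega(t)$ as a product of two elementary rotations with hypothetical angular rates, and uses a hypothetical potential `potential`; none of these choices come from the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matvec(a, v):
    """Matrix acting on a vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def rot_z(angle, rate):
    """Rotation about the third axis and its time derivative (d angle/dt = rate)."""
    c, s = math.cos(angle), math.sin(angle)
    r = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    r_dot = [[-s * rate, -c * rate, 0.0], [c * rate, -s * rate, 0.0], [0.0, 0.0, 0.0]]
    return r, r_dot

def rot_x(angle, rate):
    """Rotation about the first axis and its time derivative (d angle/dt = rate)."""
    c, s = math.cos(angle), math.sin(angle)
    r = [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
    r_dot = [[0.0, 0.0, 0.0], [0.0, -s * rate, -c * rate], [0.0, c * rate, -s * rate]]
    return r, r_dot

def potential(s):
    # Hypothetical smooth potential V(x^2); any function of x^2 would do.
    return 0.5 * s + 0.1 * s * s

def lagrangian(x, x_dot, y):
    """L = (1/2)(x_dot - y x)^2 - V(x^2), cf. (3.1) and (3.3)."""
    dt_x = [xd - c for xd, c in zip(x_dot, matvec(y, x))]
    return 0.5 * sum(c * c for c in dt_x) - potential(sum(c * c for c in x))

def gauge_transform(x, x_dot, y, omega, omega_dot):
    """x -> Omega x,  y -> Omega y Omega^T - Omega (dOmega/dt)^T, cf. (3.2)."""
    x_new = matvec(omega, x)
    x_dot_new = [p + q for p, q in zip(matvec(omega_dot, x), matvec(omega, x_dot))]
    rotated = matmul(matmul(omega, y), transpose(omega))
    shift = matmul(omega, transpose(omega_dot))
    y_new = [[rotated[i][j] - shift[i][j] for j in range(3)] for i in range(3)]
    return x_new, x_dot_new, y_new

# Sample Omega(t) = R_z(1.3 t) R_x(-0.4 t), evaluated at t = 0.7 (arbitrary choices).
t = 0.7
rz, rz_dot = rot_z(1.3 * t, 1.3)
rx, rx_dot = rot_x(-0.4 * t, -0.4)
omega = matmul(rz, rx)
prod1, prod2 = matmul(rz_dot, rx), matmul(rz, rx_dot)
omega_dot = [[prod1[i][j] + prod2[i][j] for j in range(3)] for i in range(3)]

# Arbitrary sample configuration; y must be antisymmetric.
x = [0.3, -1.2, 0.8]
x_dot = [1.0, 0.5, -0.7]
y = [[0.0, 0.4, -0.9], [-0.4, 0.0, 0.2], [0.9, -0.2, 0.0]]

l_before = lagrangian(x, x_dot, y)
x_new, x_dot_new, y_new = gauge_transform(x, x_dot, y, omega, omega_dot)
l_after = lagrangian(x_new, x_dot_new, y_new)
print(abs(l_before - l_after))
```

The two values of the Lagrangian agree up to rounding because $D_t\mathbf{x}$ transforms covariantly, $D_t\mathbf{x} \to \Omega D_t\mathbf{x}$, and $\mathbf{x}^2$ is rotation invariant; the transformed $y$ also stays antisymmetric.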
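3.1. Lagrangian formalism

The Euler-Lagrange equations are

$$ \frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{x}}} - \frac{\partial L}{\partial \mathbf{x}} = D_t^2 \mathbf{x} + 2\mathbf{x} V'(\mathbf{x}^2) = 0 , \tag{3.4} $$

$$ \frac{d}{dt}\frac{\partial L}{\partial \dot{y}^a} - \frac{\partial L}{\partial y^a} = (D_t \mathbf{x}, T_a \mathbf{x}) = 0 . \tag{3.5} $$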
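As a consistency check on (3.4) and (3.5), the variation can be carried out term by term. The following is a sketch that uses only the antisymmetry $y^T = -y$ and the covariant derivative $D_t = d/dt - y$; the intermediate steps are not spelled out in the original text, and $V'$ denotes the derivative of $V$ with respect to its argument $\mathbf{x}^2$.

```latex
\begin{align*}
\frac{\partial L}{\partial \dot{\mathbf{x}}} &= D_t\mathbf{x}, \\
\frac{\partial L}{\partial \mathbf{x}} &= -y^T D_t\mathbf{x} - 2\mathbf{x}\,V'(\mathbf{x}^2)
   = y\,D_t\mathbf{x} - 2\mathbf{x}\,V'(\mathbf{x}^2), \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{x}}} - \frac{\partial L}{\partial \mathbf{x}}
  &= \Big(\frac{d}{dt} - y\Big) D_t\mathbf{x} + 2\mathbf{x}\,V'(\mathbf{x}^2)
   = D_t^2\mathbf{x} + 2\mathbf{x}\,V'(\mathbf{x}^2), \\
\frac{\partial L}{\partial \dot{y}^a} &= 0, \qquad
-\frac{\partial L}{\partial y^a} = (D_t\mathbf{x},\, T_a\mathbf{x}).
\end{align*}
```

Since $L$ does not depend on $\dot{y}^a$, the second Euler-Lagrange equation contains no time derivative of $y^a$ at all.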
The second equation in this system is nothing but a constraint associated with the gauge symmetry. In contrast to Eq. (3.4) it does not contain a second derivative in time and, hence, serves as a restriction (or constraint) on the admissible initial values of the velocity $\dot{\mathbf{x}}(0)$ and position $\mathbf{x}(0)$ with which the dynamical equation (3.4) is to be solved. The variables $y^a$ are the Lagrange multipliers for the constraints (3.5). Any solution to the equations of motion is determined up to the gauge transformations (3.2). The variables $y^a = y^a(t)$ remain unspecified by the equation of motion. Solutions associated with various choices of $y^a(t)$ are related to one another by gauge transformations. The dependence of the solution on the functions $y^a(t)$ can be singled out by means of the following change of variables

$$ x^i(t) = \left[ T\exp \int_0^t y(\tau)\, d\tau \right]^i{}_j z^j(t) , \tag{3.6} $$

where T exp stands for the time-ordered exponential. Indeed, in the new variables the system (3.4), (3.5) becomes independent of the gauge functions $y^a(t)$:

$$ \ddot{\mathbf{z}} = -2V'(\mathbf{z}^2)\,\mathbf{z} , \tag{3.7} $$

$$ (\dot{\mathbf{z}}, T_a \mathbf{z}) = 0 . \tag{3.8} $$
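The time-ordered exponential in (3.6) can be approximated numerically by an ordered product of short-time rotations. The sketch below is an illustration under stated assumptions: N = 3, a hypothetical gauge function `y_of_t`, and a midpoint discretization chosen for simplicity. It checks that the resulting matrix $\Omega(t)$ is orthogonal with unit determinant and satisfies $\dot{\Omega} = y\Omega$, which is what makes $D_t(\Omega\mathbf{z}) = \Omega\dot{\mathbf{z}}$ and removes $y$ from the equations of motion.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def y_of_t(t):
    """Hypothetical gauge function y(t): an antisymmetric 3x3 matrix."""
    w1, w2, w3 = math.sin(t), 0.5 * t, math.cos(2.0 * t)
    return [[0.0, -w3, w2], [w3, 0.0, -w1], [-w2, w1, 0.0]]

def exp_antisymmetric(a):
    """exp(a) for a 3x3 antisymmetric matrix via the Rodrigues formula."""
    theta = math.sqrt(a[2][1] ** 2 + a[0][2] ** 2 + a[1][0] ** 2)
    if theta < 1e-12:
        f, g = 1.0, 0.5
    else:
        f, g = math.sin(theta) / theta, (1.0 - math.cos(theta)) / theta ** 2
    a2 = matmul(a, a)
    return [[(1.0 if i == j else 0.0) + f * a[i][j] + g * a2[i][j]
             for j in range(3)] for i in range(3)]

def time_ordered_exp(y, t_end, steps=2000):
    """Approximate T exp of the integral of y from 0 to t_end by an ordered product."""
    dt = t_end / steps
    omega = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for k in range(steps):
        y_mid = y((k + 0.5) * dt)
        step = exp_antisymmetric([[dt * c for c in row] for row in y_mid])
        omega = matmul(step, omega)  # factors with later times stand to the left
    return omega

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

t, h = 1.0, 1e-3
omega = time_ordered_exp(y_of_t, t)

# Orthogonality: Omega^T Omega should be the unit matrix.
gram = matmul([list(col) for col in zip(*omega)], omega)
orthogonality_error = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
                          for i in range(3) for j in range(3))
determinant = det3(omega)

# Finite-difference check of dOmega/dt = y(t) Omega.
plus, minus = time_ordered_exp(y_of_t, t + h), time_ordered_exp(y_of_t, t - h)
rhs = matmul(y_of_t(t), omega)
ode_error = max(abs((plus[i][j] - minus[i][j]) / (2.0 * h) - rhs[i][j])
                for i in range(3) for j in range(3))
print(orthogonality_error, determinant, ode_error)
```

Each factor in the product is a proper rotation, so the result stays in SO(3) regardless of the step size; only the agreement with $\dot{\Omega} = y\Omega$ depends on the discretization.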
Note that the matrix given by the time-ordered exponential in (3.6) is orthogonal and, therefore, $\mathbf{x}^2 = \mathbf{z}^2$. When transforming the equations of motion, we have used some properties of the time-ordered exponential which are described below. Consider a solution to the equation

$$ \left[ \frac{d}{dt} - y(t) \right]^i{}_j u^j = 0 . $$

The vectors $u^i(t_1)$ and $u^i(t_2)$ are related as

$$ u^i(t_2) = \Omega^i{}_j(t_2, t_1)\, u^j(t_1) , $$

where

$$ \Omega(t_2, t_1) = T\exp \int_{t_1}^{t_2} y(\tau)\, d\tau . $$

These relations can be regarded as the definition of the time-ordered exponential $\Omega(t_2, t_1)$. The matrix $\Omega$ can also be represented as a power series

$$ \Omega^i{}_j(t_2, t_1) = \sum_{n=0}^{\infty} \int d\tau_1 \cdots d\tau_n \left[ y(\tau_1) \cdots y(\tau_n) \right]^i{}_j , $$

where the integration is carried out over the domain $t_2 \ge \tau_1 \ge \cdots \ge \tau_n \ge t_1$. If $y$ is an antisymmetric matrix, then it follows that the time-ordered exponential in (3.6) is an element of SO(N), that is, the gauge arbitrariness is exhausted by the SO(N) transformations of $\mathbf{x}(t)$ rather than by those from the larger group O(N).

Since the matrices $T_a$ are antisymmetric, the constraint equation (3.8) is fulfilled for the states in which the velocity vector is proportional to the position vector

$$ \dot{\mathbf{z}}(t) = \lambda(t)\,\mathbf{z}(t) , \tag{3.9} $$
and $\lambda(t)$ is to be determined from the dynamical equation (3.7). A derivation of relation (3.9) relies on a simple observation that Eq. (3.8) means the vanishing of all components of the angular momentum of a point-like particle whose positions are labeled by the N-dimensional radius-vector $\mathbf{z}$. Thus, the physical motion is the radial motion for which Eq. (3.9) holds and vice versa. Substituting (3.9) into (3.7) and multiplying the latter by $\mathbf{z}$, we derive

$$ \dot{\lambda} + \lambda^2 = -2V'(\mathbf{z}^2) . \tag{3.10} $$
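The reduction from the second-order equation (3.7) to the first-order system (3.9), (3.10) can be cross-checked numerically. The sketch below is an illustration under stated assumptions: since the motion is radial, a single coordinate $z$ along the fixed direction of $\mathbf{z}(0)$ is integrated; the potential derivative `v_prime`, the initial data and the Runge-Kutta step are hypothetical choices. The integration interval is kept short so that $z$ does not reach zero, where $\lambda = \dot{z}/z$ is singular.

```python
def v_prime(s):
    # Derivative with respect to s = z**2 of a hypothetical potential V(s) = s/2 + s**2/10.
    return 0.5 + 0.2 * s

def rk4(f, state, dt, steps):
    """Classical fourth-order Runge-Kutta integration of d(state)/dt = f(state)."""
    for _ in range(steps):
        k1 = f(state)
        k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
        k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
        k4 = f([s + dt * k for s, k in zip(state, k3)])
        state = [s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

def first_order(state):
    """Eqs. (3.9) and (3.10): dz/dt = lam z, dlam/dt = -lam^2 - 2 V'(z^2)."""
    z, lam = state
    return [lam * z, -lam * lam - 2.0 * v_prime(z * z)]

def second_order(state):
    """Eq. (3.7) restricted to a radial line: d2z/dt2 = -2 V'(z^2) z."""
    z, z_dot = state
    return [z_dot, -2.0 * v_prime(z * z) * z]

z0, lam0 = 1.0, 0.3          # arbitrary initial data; the constraint fixes z_dot(0) = lam0 * z0
dt, steps = 1e-3, 500        # integrate up to t = 0.5

z_a, lam_a = rk4(first_order, [z0, lam0], dt, steps)
z_b, z_dot_b = rk4(second_order, [z0, lam0 * z0], dt, steps)
print(abs(z_a - z_b), abs(lam_a * z_a - z_dot_b))
```

Both integrations describe the same trajectory, so the final positions and velocities agree to within the integrator's accuracy.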
Eqs. (3.9) and (3.10) form a system of first-order differential equations to be solved under the initial conditions $\lambda(0) = \lambda_0$ and $\mathbf{z}(0) = \mathbf{x}(0) = \mathbf{x}_0$. According to (3.9) the relation $\dot{\mathbf{z}}(0) = \dot{\mathbf{x}}(0) = \lambda_0 \mathbf{x}_0$ specifies initial values of the velocity allowed by the constraints. In the particular case of a harmonic oscillator $V = \omega^2 \mathbf{x}^2/2 = \omega^2 \mathbf{z}^2/2$, Eq. (3.10) is easily solved

$$ \lambda(t) = -\omega \tan(\omega t + \varphi_0) , \tag{3.11} $$

thus leading to

$$ \mathbf{z}(t) = \mathbf{x}_0 \cos(\omega t + \varphi_0)/\cos\varphi_0 , \tag{3.12} $$

where the initial condition is taken into account. A general solution $\mathbf{x}(t)$ is obtained from (3.12) by means of the gauge transformation (3.6) where components of the matrix $y(t)$ play the role of the gauge transformation parameters. In particular, one can always choose $y(t)$ to direct the vector $\mathbf{x}$ along, say, the first axis $x^i(t) = x(t)\delta^{i1}$ for all moments of time. That is, the first coordinate axis can always be chosen to label physical states and to describe the physical motion of the gauge system. This is, in fact, a general feature of gauge theories: By specifying the Lagrange multipliers one fixes a supplementary (gauge) condition to be fulfilled by the solutions of the Euler-Lagrange equations. The gauge fixing surface in the configuration (or phase) space is used to label physical states of the gauge theory. In the model under consideration, we have chosen the gauge $x^i = 0$, for all $i \neq 1$. Furthermore, for those moments of time when $x(t) < 0$ one can find $y(t)$ such that

$$ x(t) \to -x(t) , \tag{3.13} $$
being the SO(N) rotations of the vector $\mathbf{x}$ through the angle $\pi$. The physical motion is described by a non-negative variable $r(t) = |x(t)| \ge 0$ because there are no further gauge equivalent configurations among those satisfying the chosen gauge condition. The physical configuration space is isomorphic to a half-line

$$ \mathrm{CS}_{\rm phys} = \mathbb{R}^N/SO(N) \sim \mathbb{R}_+ . \tag{3.14} $$

It should be remarked that the residual gauge transformations (3.13) cannot decrease the number of physical degrees of freedom, but they do reduce the 'volume' of the physical configuration space.

The physical configuration space can be regarded as the gauge orbit space whose elements are gauge orbits. In our model the gauge orbit space is the space of concentric spheres. By having specified the gauge we have chosen the Cartesian coordinate $x^1$ to parameterize the gauge orbit space. It appears however that our gauge is incomplete. Amongst configurations belonging to the gauge fixing surface, there are configurations related to one another by gauge transformations, thus describing the same physical state. Clearly, the $x^1$ axis intersects each sphere (gauge orbit) twice so that the points $x^1$ and $-x^1$ belong to the same gauge orbit. Thus the gauge orbit space can be parameterized by non-negative $x^1$. In general, given a gauge condition and a configuration
satisfying it, one may find other configurations that satisfy the gauge condition and belong to the gauge orbit passing through the chosen configuration. Such configurations are called Gribov copies. This phenomenon was first observed by Gribov in Yang–Mills theory in the Coulomb gauge [11]. At this point we shall only remark that the Gribov copying depends on the gauge, although it is unavoidable and always present in any gauge in Yang–Mills theory [12]. The existence of the Gribov copying is directly related to a non-Euclidean geometry of the gauge orbit space [12,13]. For the latter reason, this phenomenon is important in gauge systems and deserves further study.
As the Gribov copying is gauge dependent, one can use gauge-invariant variables to avoid it. This, however, does not always provide us with a description of the physical motion free of ambiguities. For example, for our model problem let the physical motion be described by the gauge-invariant variable $r(t) = |x(t)| = |\mathbf{x}(t)|$. If the trajectory goes through the origin at some moment of time $t_0$, i.e., $r(t_0) = 0$, the velocity $\dot r(t)$ suffers a jump as if the particle hits a wall at $r = 0$. Indeed, $\dot r(t) = \varepsilon(x(t))\,\dot x(t)$, where $\varepsilon(x)$ is the sign function, $\varepsilon(x) = +1$ if $x > 0$ and $\varepsilon(x) = -1$ for $x < 0$. Setting $v_0 = \dot x(t_0)$, we find $|\dot r(t_0 - \epsilon) - \dot r(t_0 + \epsilon)| \to 2|v_0|$ as $\epsilon \to 0$. On the other hand, the potential $V(r^2)$ is smooth and regular at the origin and, therefore, cannot cause any infinite force acting on the particle passing through the origin. So, despite using the gauge-invariant variables to describe the physical motion, we may encounter non-physical singularities which are not at all anticipated for smooth potentials. Our next step is therefore to establish a description where the ambiguities are absent. This can be achieved in the framework of the Hamiltonian dynamics, to which we now turn.

3.2. Hamiltonian dynamics and the physical phase space

The canonical momenta for the model (3.1) read
$$\mathbf{p} = \partial L/\partial \dot{\mathbf{x}} = D_t \mathbf{x} , \tag{3.15}$$
$$\pi_a = \partial L/\partial \dot y^a = 0 . \tag{3.16}$$
Relations (3.16) are primary constraints [6]. A canonical Hamiltonian is
$$H = \tfrac{1}{2}\mathbf{p}^2 + V(\mathbf{x}^2) - y^a \sigma_a , \tag{3.17}$$
where
$$\sigma_a = \{\pi_a, H\} = -(\mathbf{p}, T_a \mathbf{x}) = 0 \tag{3.18}$$
are secondary constraints. Here $\{\ ,\ \}$ denotes the Poisson bracket. By definitions (3.15) and (3.16) we set $\{x^i, p_j\} = \delta^i_j$ and $\{y^a, \pi_b\} = \delta^a_b$, while the other Poisson brackets of the canonical variables vanish. The constraints (3.18) ensure that the primary constraints hold as time proceeds, $\dot\pi_a = \{\pi_a, H\} = 0$. All the constraints are in involution,
$$\{\pi_a, \pi_b\} = 0, \quad \{\pi_a, \sigma_b\} = 0, \quad \{\sigma_a, \sigma_b\} = f_{ab}{}^{c}\sigma_c , \tag{3.19}$$
where $f_{ab}{}^{c}$ are the structure constants of SO(N). There is no further restriction on the canonical variables because $\dot\sigma_a$ weakly vanishes, $\dot\sigma_a = \{\sigma_a, H\} \sim \sigma_a \approx 0$, i.e., it vanishes on the surface of constraints [6].
Since $\pi_a = 0$, one can consider a generalized Dirac dynamics [6], which is obtained by replacing the canonical Hamiltonian (3.17) by a generalized Hamiltonian $H_T = H + \xi^a \pi_a$, where $\xi^a$ are the Lagrange multipliers for the primary constraints. The Hamiltonian equations of motion $\dot F = \{F, H_T\}$ will contain two sets of gauge functions, $y^a$ and $\xi^a$ (for the secondary and primary constraints, respectively). However, the primary constraints $\pi_a = 0$ generate only shifts of $y^a$: $\delta y^a = \delta\xi^b\{\pi_b, y^a\} = -\delta\xi^a$, with $\delta\xi^a$ being infinitesimal parameters of the gauge transformation. In particular, $\dot y^a = \{y^a, H_T\} = \xi^a$. The degrees of freedom $y^a$ turn out to be purely non-physical (their dynamics is fully determined by the arbitrary functions $\xi^a$). For this reason, we will not introduce generalized Dirac dynamics [6]; rather, we discard the variables $y^a$ as independent canonical variables and consider them as the Lagrange multipliers for the secondary constraints $\sigma_a$. That is, in the Hamiltonian equations of motion $\dot{\mathbf{p}} = \{\mathbf{p}, H\}$ and $\dot{\mathbf{x}} = \{\mathbf{x}, H\}$, which we can write in the form covariant under the gauge transformations,
$$D_t \mathbf{p} = -2\mathbf{x}\,V'(\mathbf{x}^2), \quad D_t \mathbf{x} = \mathbf{p} , \tag{3.20}$$
the variables $y^a$ will be regarded as arbitrary functions of time and of the canonical variables $\mathbf{p}$ and $\mathbf{x}$. The latter is consistent with the Hamiltonian form of the equations of motion because for any $F = F(\mathbf{p}, \mathbf{x})$ we get $\{F, y^a\sigma_a\} = \{F, y^a\}\sigma_a + y^a\{F, \sigma_a\} \approx y^a\{F, \sigma_a\}$. Thus, even though the Lagrange multipliers are allowed to be general functions not only of time, but also of the canonical variables, the Hamiltonian equations of motion are equivalent to (3.20) on the surface of constraints. The constraints $\sigma_a$ generate simultaneous rotations of the vectors $\mathbf{p}$ and $\mathbf{x}$ because
$$\{\mathbf{p}, \sigma_a\} = T_a \mathbf{p}, \quad \{\mathbf{x}, \sigma_a\} = T_a \mathbf{x} . \tag{3.21}$$
Thus, the last term in the Hamiltonian (3.17) generates rotations of the classical trajectory at each moment of time. Note that a finite gauge transformation is built by successive infinitesimal rotations, that is, the gauge group generated by the constraints is SO(N), not O(N).
The time evolution of a quantity $F$ does not depend on the arbitrary functions $y^a$, provided $\{F, \sigma_a\} \approx 0$, i.e., $F$ is gauge invariant on the surface of constraints. The quantity $F$ is gauge invariant in the total phase space if $\{F, \sigma_a\} = 0$. The constraints (3.18) mean that all components of the angular momentum are zero. The physical motion is the radial motion for which the following relation holds:
$$\mathbf{p}(t) = \lambda(t)\,\mathbf{x}(t) . \tag{3.22}$$
As before, the scalar function $\lambda(t)$ is determined by the dynamical equations (3.20). Applying the covariant derivative to (3.6), we find
$$\mathbf{p}(t) = \mathrm{T}\exp\left[\int_0^t y(\tau)\,d\tau\right]\dot{\mathbf{z}}(t) , \tag{3.23}$$
where $\mathbf{z}(t)$ and $\lambda(t)$ are solutions to the system (3.7), (3.10). Now we can analyze the motion in the phase space spanned by the variables $\mathbf{p}$ and $\mathbf{x}$. The trajectories lie on the surface of constraints (3.22). Although the constraints are fulfilled by the actual motion, trajectories still have gauge arbitrariness which corresponds to various choices of $y^a(t)$. Variations of $y^a$ generate simultaneous SO(N) rotations of the vectors $\mathbf{x}(t)$ and $\mathbf{p}(t)$, as follows from the representations (3.6) and (3.22). Therefore, with an appropriate choice of the arbitrary functions $y^a(t)$, the physical motion can be described in
a two-dimensional phase space,
$$x^i(t) = x(t)\delta^{i1}, \quad p_i(t) = \lambda(t)x(t)\delta_{i1} \equiv p(t)\delta_{i1} . \tag{3.24}$$
An important observation is the following [9]. Whenever the variable $x(t)$ changes sign under the gauge transformation (3.13), so does the canonical momentum $p(t)$ because of the constraint (3.22) or (3.24). In other words, for any motion in the phase-space plane the two states $(p, x)$ and $(-p, -x)$ are physically indistinguishable. Identifying these points on the plane, we obtain the physical phase space of the system, which is a cone unfoldable into a half-plane [9,10],
$$\mathrm{PS}_{\rm phys} = \mathrm{PS}|_{\sigma_a = 0}/SO(N) \sim \mathbb{R}^2/\mathbb{Z}_2 \sim \mathrm{cone}(\pi) . \tag{3.25}$$
Fig. 1 illustrates how the phase-space plane turns into the cone upon the identification of the points $(p, x)$ and $(-p, -x)$.
Now we can address the above issue about non-physical singularities of the gauge-invariant velocity $\dot r$. To simplify the discussion and to make it transparent, let us first take a harmonic oscillator as an example. To describe the physical motion, we choose the gauge-invariant canonical coordinates $r(t) = |\mathbf{x}(t)|$ and $p_r(t) = (\mathbf{x}, \mathbf{p})/r$. The gauge invariance means that
$$\{r, \sigma_a\} = \{p_r, \sigma_a\} = 0 , \tag{3.26}$$
i.e., the evolution of the canonical pair $p_r$, $r$ does not depend on the arbitrary functions $y^a(t)$. Making use of (3.11) and (3.12) we find
$$r(t) = r_0|\cos\omega t| ; \tag{3.27}$$
$$p_r(t) = \lambda(t)r(t) = \dot r(t) = -\omega r_0(\sin\omega t)\,\varepsilon(\cos\omega t) . \tag{3.28}$$
Here the constant $\varphi_0$ has been set to zero, and $r_0 = |\mathbf{x}_0|$. The trajectory starts at the phase-space point $(0, r_0)$ and goes down into the area of negative momenta as shown in Fig. 1f. At the time $t_A = \pi/2\omega$, the trajectory reaches the half-axis $p_r < 0$, $r = 0$ (the state A in Fig. 1f). The physical momentum $p_r(t)$ then flips sign as if the particle hits a wall. At that instant the acceleration is infinite because $\Delta p_r(t_A) = p_r(t_A + \epsilon) - p_r(t_A - \epsilon) \to 2r_0\omega$ as $\epsilon \to 0$, which is not possible since the oscillator potential is smooth and exerts no force at the origin. Now we recall that the physical phase space of the model is a cone unfoldable into a half-plane.
To parameterize the cone by the local gauge-invariant phase-space coordinates (3.27), (3.28), one has to make a cut of the cone along the momentum axis, which is readily seen from the comparison of Figs. 1d and f where the same motion is represented. The states $(r_0\omega, 0)$ and $(-r_0\omega, 0)$ are two images of one state that lies on the cut made on the cone. Thus, in the conic phase space, the trajectory is smooth and does not contain any discontinuities. The nonphysical "wall" force is absent (see Fig. 1e).
In our discussion, a particular form of the potential $V$ has been assumed. This restriction can easily be dropped. Consider a trajectory $x^i(t) = x(t)\delta^{i1}$ passing through the origin at $t = t_0$, $x(t_0) = 0$. In the physical variables the trajectory is $r(t) = |x(t)|$ and $p_r(t) = \dot r(t) = p(t)\,\varepsilon(x(t))$, where $p(t) = \dot x(t)$. Since the points $(p, x)$ and $(-p, -x)$ correspond to the same physical state, we find that the phase-space points $(p_r(t_0 - \epsilon), 0)$ and $(p_r(t_0 + \epsilon), 0)$ approach the same physical state as $\epsilon$ goes to zero. So, for any trajectory and any regular potential the discontinuity $|p_r(t_0 - \epsilon) - p_r(t_0 + \epsilon)| \to 2|p(t_0)|$, as $\epsilon \to 0$, is removed by going over to the conic phase space.
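The removal of the jump can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the plain Euclidean distance on the $(p, x)$ plane is an assumption used only to compare states. It evaluates the oscillator trajectory (3.27), (3.28) just before and just after $t_A = \pi/2\omega$, once in the chart $(p_r, r)$ and once after identifying $(p, x)$ with $(-p, -x)$.

```python
import math

def oscillator_state(t, r0=1.0, omega=1.0):
    """Gauge-fixed oscillator state (p, x) with the phase constant set to zero."""
    x = r0 * math.cos(omega * t)
    p = -omega * r0 * math.sin(omega * t)
    return p, x

def half_plane_chart(p, x):
    """Local gauge-invariant coordinates (p_r, r) = (p * sign(x), |x|)."""
    sign = 1.0 if x >= 0 else -1.0
    return p * sign, abs(x)

def cone_distance(state_a, state_b):
    """Distance between two states after identifying (p, x) with (-p, -x).

    Assumption: the plain Euclidean distance on the (p, x) plane is used,
    minimized over the two images of the second state.
    """
    (pa, xa), (pb, xb) = state_a, state_b
    direct = math.hypot(pa - pb, xa - xb)
    reflected = math.hypot(pa + pb, xa + xb)
    return min(direct, reflected)

r0, omega, eps = 1.0, 1.0, 1e-6
t_A = math.pi / (2 * omega)
before = oscillator_state(t_A - eps, r0, omega)
after = oscillator_state(t_A + eps, r0, omega)

p_r_before, _ = half_plane_chart(*before)
p_r_after, _ = half_plane_chart(*after)
jump_in_chart = p_r_after - p_r_before      # approaches 2 * r0 * omega
gap_on_cone = cone_distance(before, after)  # approaches 0 as eps -> 0
print(jump_in_chart, gap_on_cone)
```

In the chart the momentum jumps by about $2r_0\omega$, while the two states stay arbitrarily close once the $\mathbb{Z}_2$ identification is taken into account.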
Fig. 1. (a) The phase-space plane $(p, x)$ and the oscillator trajectory on it. The states $B = (p, x)$ and $-B = (-p, -x)$ are gauge equivalent and are to be identified. (b) The phase-space plane is cut along the $p$-axis. The half-plane $x < 0$ is rotated relative to the $x$-axis through the angle $\pi$. (c) The resulting plane is folded along the $p$-axis so that the states $B$ and $-B$ get identified. (d) Two copies of each state on the $p$-axis, which occur upon the cut (e.g., the state A), are glued back to remove this doubling. (e) The resulting conic phase space. Each point of it corresponds to one physical state of the gauge system. The oscillator trajectory does not have any discontinuity. (f) The physical motion of the harmonic oscillator in the local gauge-invariant variables $(p_r, r)$. The trajectory has a discontinuity at the state A. The discontinuity occurs through the cut of the cone along the momentum axis. The cut is associated with the $(p_r, r)$ parameterization of the cone.
The observed singularities of the phase-space trajectories are essentially artifacts of the coordinate description and, hence, depend on the parameterization of the physical phase space. For instance, the cone can be parameterized by another set of canonical gauge-invariant variables,
$$p_r = |\mathbf{p}| \geq 0, \quad r = \frac{(\mathbf{p}, \mathbf{x})}{p_r}, \quad \{r, p_r\} = 1 . \tag{3.29}$$
It is easy to convince oneself that $r(t)$ would have discontinuities, rather than the momentum $p_r$. This set of local coordinates on the physical phase space is associated with the cut on the cone along the coordinate axis. In general, local canonical coordinates on the physical phase space are determined up to canonical transformations
$$(p_r, r) \to (P_R, R) = (P_R(r, p_r), R(p_r, r)), \quad \{R, P_R\} = 1 . \tag{3.30}$$
The coordinate singularities associated with arbitrary local canonical coordinates on the physical phase space may be tricky to analyze. However, the motion considered on the true physical phase space is free of these ambiguities. That is why it is important to find the geometry of the physical phase space before studying Hamiltonian dynamics in some local formally gauge-invariant canonical coordinates.
It is also of interest to find out whether there exists a set of canonical variables in which the discontinuities of the classical phase-space trajectories do not occur. Let us return to the local
coordinates where the momentum $p_r$ changes sign as the trajectory passes through the origin $r = 0$. The sought-for new canonical variables must be even functions of $p_r$ when $r = 0$ and be regular on the half-plane $r \geq 0$. Then the trajectory in the new coordinates will not suffer the discontinuity. In the vicinity of the origin, we set
$$R = a_0(p_r^2) + \sum_{n=1}^{\infty} a_n(p_r)\,r^n, \quad P_R = b_0(p_r^2) + \sum_{n=1}^{\infty} b_n(p_r)\,r^n . \tag{3.31}$$
Comparing the coefficients of powers of $r$ in the Poisson bracket (3.30) we find, in particular,
$$2p_r\left[a_1(p_r)\,b_0'(p_r^2) - a_0'(p_r^2)\,b_1(p_r)\right] = 1 . \tag{3.32}$$
Eq. (3.32) has no solution for regular functions $a_{0,1}$ and $b_{0,1}$. By assumption the functions $a_n$ and $b_n$ are regular, and so should be the combination $a_1 b_0' - a_0' b_1 = 1/(2p_r)$, but the latter is not true at $p_r = 0$, as follows from (3.32). A solution exists only for functions singular at $p_r = 0$. For instance, one can take $R = r/p_r$ and $P_R = p_r^2/2$, $\{R, P_R\} = 1$, which is obviously singular at $p_r = 0$. In these variables the evolution of the canonical momentum does not have abrupt jumps; however, the new canonical coordinate does have jumps as the system goes through the states with $p_r = 0$.
In general, the existence of singularities is due to the condition that $a_0$ and $b_0$ must be even functions of $p_r$. This latter condition leads to the factor $2p_r$ in the left-hand side of Eq. (3.32), thus making it impossible for $b_1$ and $a_1$ to be regular everywhere. We conclude that, although in the conic phase space the trajectories are regular, the motion always exhibits singularities when described in any local canonical coordinates on the phase space.
Our analysis of the simple gauge model reveals an important and rather general feature of gauge theories. The physical phase space in gauge theories may have a non-Euclidean geometry. The phase-space trajectories are smooth in the physical phase space. However, when described in local canonical coordinates, the motion may exhibit non-physical singularities.
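The example $R = r/p_r$, $P_R = p_r^2/2$ can be checked numerically. The sketch below is an illustration under stated assumptions: the helper `poisson_bracket` and the sample point are hypothetical, and the bracket is estimated with central differences using the convention $\{r, p_r\} = 1$. It confirms that the bracket equals one at a regular point and shows how $R$ jumps between large values of opposite sign as $p_r$ passes through zero at fixed $r$.

```python
def poisson_bracket(f, g, r, p_r, h=1e-5):
    """Central-difference estimate of {f, g} = df/dr * dg/dp_r - df/dp_r * dg/dr."""
    df_dr = (f(r + h, p_r) - f(r - h, p_r)) / (2 * h)
    df_dp = (f(r, p_r + h) - f(r, p_r - h)) / (2 * h)
    dg_dr = (g(r + h, p_r) - g(r - h, p_r)) / (2 * h)
    dg_dp = (g(r, p_r + h) - g(r, p_r - h)) / (2 * h)
    return df_dr * dg_dp - df_dp * dg_dr

def R(r, p_r):
    """New coordinate R = r / p_r (singular at p_r = 0)."""
    return r / p_r

def P_R(r, p_r):
    """New momentum P_R = p_r**2 / 2 (even in p_r, hence free of sign jumps)."""
    return 0.5 * p_r ** 2

# Hypothetical sample point away from p_r = 0.
bracket = poisson_bracket(R, P_R, r=0.7, p_r=1.3)

# R just above and just below p_r = 0 at fixed r: large values of opposite sign.
R_plus, R_minus = R(0.7, 1e-3), R(0.7, -1e-3)
print(bracket, R_plus, R_minus)
```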
In Section 6 we show that the impossibility of constructing canonical (Darboux) coordinates on the physical phase space, which would provide a classical description without singularities, is essentially due to the nontrivial topology of the gauge orbits (the concentric spheres in this model). The singularities fully depend on the choice of local canonical coordinates, even though this choice is made in a gauge-invariant way. What remains coordinate- and gauge-independent is the geometrical structure of the physical phase space which, however, may reveal itself through the coordinate singularities occurring in any particular parameterization of the physical phase space by local canonical variables. One cannot assign any direct physical meaning to the singularities, but their presence indicates that the phase space of the physical degrees of freedom is not Euclidean. At this stage of our discussion it becomes evident that it is of great importance to find a quantum formalism for gauge theories which does not depend on the local parameterization of the physical phase space and takes into account its genuine geometrical structure.

3.3. Symplectic structure on the physical phase space

The absence of local canonical coordinates in which the dynamical description does not have singularities may seem rather disturbing. This is partially because of our custom to often identify canonical variables with physical quantities which can be directly measured, like, for instance, positions and momenta of particles in classical mechanics. In gauge theories canonical
variables that are defined through the Legendre transformation of the Lagrangian cannot always be measured and, in fact, may not even be physical quantities. For example, canonical variables in electrodynamics are components of the electric field and the vector potential. The vector potential is subject to the gradient gauge transformations, so it is a non-physical quantity. The simplest gauge-invariant quantity that can be built of the vector potential is the magnetic field. It can be measured. Although the electric and magnetic fields are not canonically conjugated variables, we may calculate their Poisson bracket and determine the evolution of all gauge-invariant quantities (being functions of the electric and magnetic fields) via the Hamiltonian equations of motion with the new Poisson bracket. Extending this analogy further, we may try to find a new set of physical variables in the SO(N) model that are not necessarily canonically conjugated but have a smooth time evolution. A simple choice is
$$Q = \mathbf{x}^2, \quad P = (\mathbf{p}, \mathbf{x}) . \tag{3.33}$$
The variables (3.33) are gauge invariant and in a one-to-one correspondence with the canonical variables $r$, $p_r$ parameterizing the physical (conic) phase space: $Q = r^2$, $P = p_r r$, $r \geq 0$. Due to analyticity in the original phase-space variables, they also have a smooth time evolution $Q(t)$, $P(t)$. However, we find
$$\{Q, P\} = 2Q , \tag{3.34}$$
that is, the symplectic structure is no longer canonical. The new symplectic structure is also acceptable to formulate Hamiltonian dynamics of physical degrees of freedom. The Hamiltonian assumes the form
$$H = \frac{1}{2Q}P^2 + V(Q) . \tag{3.35}$$
Therefore
$$\dot Q = \{Q, H\} = 2P, \quad \dot P = \{P, H\} = \frac{P^2}{Q} - 2Q\,V'(Q) . \tag{3.36}$$
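As a consistency check of (3.36), the following sketch takes the harmonic oscillator, for which $V(Q) = \omega^2 Q/2$, and inserts the gauge-fixed solution $x(t) = r_0\cos\omega t$ into $Q = x^2$, $P = p\,x$. The numerical values of $\omega$ and $r_0$, the sample times, and the use of central differences are assumptions made for illustration only. The residuals of both equations should vanish up to discretization error.

```python
import math

OMEGA, R0 = 1.5, 2.0  # hypothetical oscillator frequency and amplitude

def V_prime(Q):
    """Derivative of V(Q) = OMEGA**2 * Q / 2 with respect to Q."""
    return 0.5 * OMEGA ** 2

def Q_of_t(t):
    """Q = x**2 for x(t) = R0 * cos(OMEGA * t)."""
    return (R0 * math.cos(OMEGA * t)) ** 2

def P_of_t(t):
    """P = p * x with p = dx/dt."""
    x = R0 * math.cos(OMEGA * t)
    p = -OMEGA * R0 * math.sin(OMEGA * t)
    return p * x

def residuals(t, h=1e-5):
    """Residuals of the two equations (3.36) at time t (central differences in t)."""
    Q, P = Q_of_t(t), P_of_t(t)
    dQ = (Q_of_t(t + h) - Q_of_t(t - h)) / (2 * h)
    dP = (P_of_t(t + h) - P_of_t(t - h)) / (2 * h)
    return dQ - 2 * P, dP - (P ** 2 / Q - 2 * Q * V_prime(Q))

# Sample times chosen away from Q = 0, where P**2 / Q is a 0/0 expression numerically.
sample_times = [0.1, 0.4, 0.9, 1.3, 2.0]
max_residual = max(abs(res) for t in sample_times for res in residuals(t))
print(max_residual)
```

Note that $Q(t) = r_0^2\cos^2\omega t$ and $P(t) = -\tfrac{1}{2}\omega r_0^2\sin 2\omega t$ are smooth functions of time, including the instants when the trajectory passes through the origin.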
The solutions $Q(t)$ and $P(t)$ are regular for a sufficiently regular $V$, and there is no need to "remember" where the cut on the cone has been made. The Poisson bracket (3.34) can be regarded as a skew-symmetric product (commutator) of two basis elements of the Lie algebra of the dilatation group. This observation allows one to quantize the symplectic structure. The representation of the corresponding quantum commutation relations is realized by the so-called affine coherent states. Moreover, the coherent-state representation of the path integral can also be developed [14], which is not a canonical path integral when compared with the standard lattice treatment.

3.4. The phase space in curvilinear coordinates

Except for the simplest case when the gauge transformations are translations in the configuration space, physical variables are non-linear functions of the original variables of the system. The separation of local coordinates into the physical and pure gauge ones can be done by means of going over to curvilinear coordinates such that some of them span gauge orbits, while the others
change along the directions transverse to the gauge orbits and, therefore, label physical states. In the example considered above, the gauge orbits are spheres centered at the origin. An appropriate coordinate system to separate physical and nonphysical variables is the spherical coordinate system. It is clear that the dynamics of the angular variables is fully arbitrary and determined by the choice of the functions $y^a(t)$. In contrast, the temporal evolution of the radial variable does not depend on $y^a(t)$. The phase space of the only physical degree of freedom turns out to be a cone unfoldable into a half-plane.
Let us forget about the gauge symmetry in the model for a moment. Upon a canonical transformation induced by going over to the spherical coordinates, the radial degree of freedom seems to have a phase space being a half-plane because $r = |\mathbf{x}| \geq 0$, and the corresponding canonical momentum would have an abrupt sign flip when the system passes through the origin. It is then natural to put forward the question whether the conic structure of the physical phase space is essentially due to the gauge symmetry, and may not emerge upon a certain canonical transformation. We shall argue that without the gauge symmetry, the full phase-space plane $(p_r, r)$ is required to uniquely describe the motion of the system [10]. As a general remark, we point out that the phase-space structure cannot be changed by any canonical transformation. The curvature of the conic phase space, which is concentrated at the tip of the cone, can be neither introduced nor eliminated by any coordinate transformation.
For the sake of simplicity, the discussion is restricted to the simplest case of the SO(2) group [10]. In this case, the phase space is a four-dimensional Euclidean space spanned by the canonical coordinates $\mathbf{p} \in \mathbb{R}^2$ and $\mathbf{x} \in \mathbb{R}^2$. For the polar coordinates $r$ and $\theta$ introduced by
$$x^1 = r\cos\theta, \quad x^2 = r\sin\theta , \tag{3.37}$$
the canonical momenta are
$$p_r = (\mathbf{x}, \mathbf{p})/r, \quad p_\theta = (\mathbf{p}, T\mathbf{x}) \tag{3.38}$$
with $T_{ij} = -T_{ji}$, $T_{12} = 1$, being the only generator of SO(2). The one-to-one correspondence between the Cartesian and polar coordinates is achieved if the latter are restricted to non-negative values for $r$ and to the segment $[0, 2\pi)$ for $\theta$.
To show that the full plane $(p_r, r)$ is necessary for a unique description of the motion, we compare the motion of a particle through the origin in Cartesian and polar coordinates, assuming the potential to be regular at the origin. Let the particle move along the $x^1$ axis. As long as the particle moves along the positive semiaxis, the equality $x^1 = r$ is satisfied and no paradoxes arise. As the particle moves through the origin, $x^1$ changes sign, $r$ does not change sign, and $\theta$ and $p_r$ change abruptly: $\theta \to \theta + \pi$, $p_r = |\mathbf{p}|\cos\theta \to -p_r$. Although these jumps are not related to the action of any forces, they are consistent with the equations of motion. The kinematics of the system admits an interpretation in which the discontinuities are avoided. As follows from the transformation formulas (3.37), the Cartesian coordinates $x^{1,2}$ remain untouched under the transformations
$$\theta \to \theta + \pi, \quad r \to -r , \tag{3.39}$$
$$\theta \to \theta + 2\pi, \quad r \to r . \tag{3.40}$$
This means that the motion with values of the polar coordinates $\theta + \pi$ and $r > 0$ is indistinguishable from the motion with values of the polar coordinates $\theta$ and $r < 0$. Consequently, the
phase-space points $(p_r, r; p_\theta, \theta)$ and $(-p_r, -r; p_\theta, \theta + \pi)$ correspond to the same state of the system. Therefore, the state $(-p_r, r; p_\theta, \theta + \pi)$ the particle attains after passing through the origin is equivalent to $(p_r, -r; p_\theta, \theta)$. As expected, the phase-space trajectory will be identical in both the $(p_r, r)$-plane and the $(p_1, x^1)$-plane.
In Fig. 2 it is shown how the continuity of the phase-space trajectories can be maintained in the canonical variables $p_r$ and $r$. The original trajectory in the Cartesian variables is mapped into two copies of the half-plane $r \geq 0$. Each half-plane corresponds to the states of the system with values of $\theta$ differing by $\pi$ (Fig. 2b). Using the equivalence between the states $(-p_r, r; p_\theta, \theta + \pi)$ and $(p_r, -r; p_\theta, \theta)$, the half-plane corresponding to the value $\theta + \pi$ of the angular variable can be viewed as the half-plane with negative values of $r$, so that the trajectory is continuous on the $(p_r, r)$-plane and the angular variable does not change when the system passes through the origin (Fig. 2c).
Another possibility to keep the trajectories continuous under the canonical transformation, while maintaining the positivity of $r$, is to glue the edges of the half-planes connected by the dashed lines in Fig. 2b. The resulting surface resembles a Riemann surface with two conic leaves (Fig. 2d). The curvature at the origin of this surface is zero because for any periodic motion the trajectory goes around both conic leaves before it returns to the initial state, i.e., the phase-space radius vector $(r, p_r)$ sweeps the total angle $2\pi$. Thus, the motion is indistinguishable from the motion in the phase-space plane.
Fig. 2. (a) A phase-space trajectory of a harmonic oscillator. The initial conditions are such that $x^2 = p_2 = 0$ for all moments of time. The system moves through the origin $x^1 = 0$. (b) The same motion is represented in the canonical variables associated with the polar coordinates. When passing the origin $r = 0$, the trajectory suffers a discontinuity caused by the jump of the canonical momenta. The discontinuity can be removed in two ways. (c) One can convert the motion with values of the canonical coordinates $(-p_r, r; p_\theta, \theta + \pi)$ into the equivalent motion $(p_r, -r; p_\theta, \theta)$, thus making a full phase-space plane out of two half-planes. (d) Another possibility is to glue directly the points connected by the dashed lines. The resulting surface is a Riemann surface with two conic leaves. It has no curvature at the origin because the phase-space radius vector $(p_r, r)$ sweeps the total angle $2\pi$ around the two conic leaves before returning to the initial state.
When the gauge symmetry is switched on, the angular variable $\theta$ becomes nonphysical, and the constraint is determined by $p_\theta = 0$. The states which differ only by values of $\theta$ must be identified. Therefore the two conic leaves of the $(p_r, r)$-Riemann surface become two images of the physical phase space. By identifying them, the Riemann surface turns into a cone unfoldable into a half-plane. In the representation given in Fig. 2c, the cone emerges upon the familiar identification of the points $(-p_r, -r)$ with $(p_r, r)$. This follows from the equivalence of the states $(-p_r, -r; p_\theta = 0, \theta) \sim (p_r, r; p_\theta = 0, \theta + \pi) \sim (p_r, r; p_\theta = 0, \theta)$, where the first equivalence is due to the symmetry of the change of variables, while the second one is due to the gauge symmetry: states differing by values of $\theta$ are physically the same.

3.5. Quantum mechanics on a conic phase space

It is clear from the correspondence principle that quantum theory should, in general, depend on the geometry of the phase space. It is most naturally exposed in the phase-space path integral representation of quantum mechanics. Before we proceed with establishing the path integral formalism for gauge theories whose physical phase space differs from a Euclidean space, let us first use simpler tools, like Bohr–Sommerfeld semiclassical quantization, to get an idea of how the phase-space geometry in gauge theory may affect quantum theory [9,10]. Let the potential $V$ of the system be such that there exist periodic solutions of the classical equations of motion. According to the Bohr–Sommerfeld quantization rule, the energy levels can be determined by solving the equation
$$W(E) = \oint p\,dq = \int_0^T p\,\dot q\,dt = 2\pi\hbar\left(n + \frac{1}{2}\right), \quad n = 0, 1, \ldots , \tag{3.41}$$
where the integral is taken over a periodic phase-space trajectory with the period $T = T(E)$, which depends on the energy $E$ of the system. The quantization rule (3.41) does not depend on the parameterization of the phase space because the functional $W(E)$ is invariant under canonical transformations, $\oint p\,dq = \oint P\,dQ$, and is therefore coordinate-free. For this reason we adopt it to analyze quantum mechanics on the conic phase space.
For a harmonic oscillator of frequency $\omega$ with a Euclidean phase space, the Bohr–Sommerfeld rule gives exact energy levels. Indeed, classical trajectories are
$$q(t) = \frac{\sqrt{2E}}{\omega}\sin\omega t, \quad p(t) = \sqrt{2E}\cos\omega t , \tag{3.42}$$
thus leading to
$$E_n = \hbar\omega\left(n + \frac{1}{2}\right), \quad n = 0, 1, \ldots . \tag{3.43}$$
In general, the Bohr–Sommerfeld quantization determines the spectrum in the semiclassical approximation (up to higher orders of $\hbar$) [15]. So our consideration is not yet a full quantum theory. Nonetheless it will be sufficient to qualitatively distinguish between the influence of the non-Euclidean geometry of the physical phase space and the effects of potential forces on quantum gauge dynamics.
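The step from (3.41) and (3.42) to (3.43) can be reproduced numerically. The sketch below is only an illustration: the function names, the midpoint rule, the step count, and the units with $\hbar = 1$ are assumptions. It integrates $p\,\dot q$ over one period of the trajectory (3.42) and then solves the quantization condition, using the fact that $W(E)$ is linear in $E$ for the oscillator.

```python
import math

def action_integral(E, omega, steps=20000):
    """Midpoint-rule estimate of W(E), the integral of p * dq/dt over one period of (3.42)."""
    period = 2 * math.pi / omega
    dt = period / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        p = math.sqrt(2 * E) * math.cos(omega * t)
        q_dot = math.sqrt(2 * E) * math.cos(omega * t)  # d/dt of sqrt(2E)/omega * sin(omega*t)
        total += p * q_dot * dt
    return total

def level(n, omega, hbar=1.0):
    """Energy solving W(E) = 2*pi*hbar*(n + 1/2), using the linearity of W in E."""
    w_per_unit_energy = action_integral(1.0, omega)
    return 2 * math.pi * hbar * (n + 0.5) / w_per_unit_energy

omega = 2.0  # hypothetical frequency, in units with hbar = 1
levels = [level(n, omega) for n in range(4)]
print(levels)  # close to omega * (n + 1/2)
```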
Will the spectrum (3.43) be modified if the phase space of the system is changed to a cone unfoldable into a half-plane? The answer is affirmative [9,10,16]. The cone is obtained by identifying points on the plane related by reflection with respect to the origin, $\mathrm{cone}(\pi) \sim \mathbb{R}^2/\mathbb{Z}_2$. Under the residual gauge transformations $(p, q) \to (-p, -q)$, the oscillator trajectory maps into itself. Thus on the conic phase space it remains a periodic trajectory. However, the period is half that of the oscillator with a flat phase space, because the states the oscillator passes at $t \in [0, \pi/\omega)$ are physically indistinguishable from those at $t \in [\pi/\omega, 2\pi/\omega)$. Therefore, the oscillator with the conic phase space returns to the initial state twice as fast as the ordinary oscillator:
$$T_c = \tfrac{1}{2}T = \pi/\omega . \tag{3.44}$$
The Bohr–Sommerfeld quantization rule leads to the spectrum
$$E^c_n = 2E_n = 2\hbar\omega\left(n + \frac{1}{2}\right), \quad n = 0, 1, \ldots . \tag{3.45}$$
The distance between energy levels is doubled as though the physical frequency of the oscillator were $\omega_{\rm phys} = 2\omega$. Observe that the frequency as the parameter of the Hamiltonian is not changed. The entire effect is therefore due to the conic structure of the physical phase space.
Since the Bohr–Sommerfeld rule does not depend on the parameterization of the phase space, one can also apply it directly to the conic phase space. We introduce the polar coordinates on the phase space [9]
$$q = \sqrt{\frac{2P}{\omega}}\cos Q, \quad p = \sqrt{2\omega P}\,\sin Q . \tag{3.46}$$
Here $\{Q, P\} = 1$. If the variable $Q$ ranges from 0 to $2\pi$, then $(p, q)$ span the entire plane $\mathbb{R}^2$. The local variables $(p, q)$ would span a cone unfoldable into a half-plane if one restricts $Q$ to the interval $[0, \pi)$ and identifies the phase-space points $(p, q)$ of the rays $Q = 0$ and $Q = \pi$. From (3.42) it follows that the new canonical momentum $P$ is proportional to the total energy of the oscillator,
$$E = \omega P . \tag{3.47}$$
For the oscillator trajectory on the conic phase space, we have
$$W_c(E) = \oint p\,dq = \oint P\,dQ = \frac{E}{\omega}\int_0^{\pi} dQ = \frac{\pi E}{\omega} = 2\pi\hbar\left(n + \frac{1}{2}\right), \tag{3.48}$$
which leads to the energy spectrum (3.45).
The curvature of the conic phase space is localized at the origin. One may expect that the conic singularity of the phase space does not affect motion localized in phase-space regions which do not contain the origin. Such motion would be indistinguishable from the motion in the flat phase space. The simplest example of this kind is the harmonic oscillator whose equilibrium is not located at the origin [9]. In the original gauge model, we take the potential
$$V = \frac{\omega^2}{2}\left(|\mathbf{x}| - r_0\right)^2 . \tag{3.49}$$
The motion is easy to analyze in the local gauge-invariant variables $(p_r, r)$, when the cone is cut along the momentum axis.
Fig. 3. (a) Oscillator double-well potential. (b) Phase-space trajectories in the flat phase space. For $E < E_0$ there are two periodic trajectories associated with the two minima of the double-well potential. (c) The same motion in the conic phase space. It is obtained from the corresponding motion in the flat phase space by identifying the points $(p, x)$ with $(-p, -x)$. The local coordinates $p_r$ and $r$ are related to the parameterization of the cone when the cut is made along the momentum axis (the states A and $-$A are the same).
As long as the energy does not exceed the critical value $E_0 = \omega^2 r_0^2/2$, i.e., the oscillator cannot reach the origin $r = 0$, the period of the classical trajectory remains $2\pi/\omega$. The Bohr–Sommerfeld quantization yields the spectrum of the ordinary harmonic oscillator (3.43). However, the gauge system differs from the corresponding system with the phase space being a full plane. As shown in Fig. 3b, the latter system has two periodic trajectories with the energy $E < E_0$ associated with the two minima of the oscillator double-well potential. Therefore in quantum theory the low energy levels must be doubly degenerate. Due to the tunneling effect the degeneracy is removed. Instead of one degenerate level with $E < E_0$ there must be two close levels (we assume $(E - E_0)/E_0 \ll 1$ to justify the word "close"). In contrast, there is no doubling of classical trajectories in the conic phase space (see Fig. 3c), and no splitting of the energy levels should be expected. These qualitative arguments can also be given a rigorous derivation in the framework of the instanton calculus. We shall return to this issue after establishing the path integral formalism for the conic phase space (see Section 8.9).
When the energy is greater than $E_0$, the particle can go over the potential barrier. In the flat phase space there would be only one trajectory with fixed energy $E$ exceeding $E_0$. From the symmetry arguments it is also clear that this trajectory is mapped onto itself upon the reflection $(p, x) \to (-p, -x)$. Identifying these points of the flat phase space, we observe that the trajectory on the conic phase space with $E > E_0$ is continuous and periodic. In Fig. 3c the semiaxes $p_r < 0$ and $p_r > 0$ on the line $r = 0$ are identified in accordance with the chosen parameterization of the cone. Assume the initial state of the gauge system to be at the phase-space point O in Fig. 3c, i.e., $r(0) = r_0$. Let $t_A$ be the time when the system approaches the state $-$A. In the next moment of time the system leaves the state A.
The states $A$ and $-A$ lie on the cut of the cone and, hence, correspond to the same state of the system. There is no jump of the physical momentum at $t=t_A$. From symmetry arguments it follows that the system returns to the initial state in the time
$$T_c=\frac{\pi}{\omega}+2t_A . \tag{3.50}$$
It takes $t=2t_A$ to go from the state $O$ to $-A$ and then from $A$ to $O'$. From the state $O'$ the system reaches the initial state $O$ in half of the period of the harmonic oscillator, $\pi/\omega$. The time $t_A$ depends
on the energy of the system and is given by
$$0\le t_A=\frac{1}{\omega}\sin^{-1}\sqrt{\frac{E_0}{E}}\le\frac{\pi}{2\omega},\qquad E\ge E_0 . \tag{3.51}$$
The quasiclassical quantization rule yields the equation for energy levels
$$W_c(E)=W(E)-2E\int_{t_A}^{\pi/\omega-t_A}\cos^2\omega t\,dt=W(E)\left(\frac12+\frac{\omega t_A}{\pi}+\frac{1}{2\pi}\sin2\omega t_A\right)=2\pi\hbar\left(n+\frac12\right). \tag{3.52}$$
Here $W(E)=2\pi E/\omega$ is the Bohr–Sommerfeld functional for the harmonic oscillator of frequency $\omega$. The function $W_c(E)$ for the conic phase space is obtained by subtracting the contribution of the portion of the ordinary oscillator trajectory between the states $-A$ and $A$ for negative values of the canonical coordinate, i.e., for $t\in[t_A,\pi/\omega-t_A]$. When the energy is sufficiently large, $E\gg E_0$, the time $2t_A$ is much smaller than the half-period $\pi/\omega$, and $W_c(E)\sim\frac12W(E)$, leading to the doubling of the distance between the energy levels. In this case typical fluctuations have an amplitude much larger than the distance from the classical vacuum to the singular point of the phase space. The system "feels" the curvature of the phase space localized at the origin. For energies small compared with $E_0$, typical quantum fluctuations do not reach the singular point of the phase space. The dynamics is mostly governed by the potential force, i.e., the deviation of the phase space geometry from the Euclidean one does not much affect the low energy dynamics (cf. (3.52) for $t_A\approx\pi/(2\omega)$). As soon as the energy attains the critical value $E_0$, the distance between energy levels starts growing, tending to its asymptotic value $\Delta E=2\hbar\omega$.

The quantum system may penetrate into classically forbidden domains. The wave functions of the states with $E<E_0$ do not vanish under the potential barrier. So even for $E<E_0$ there are fluctuations that can reach the conic singularity of the phase space. As a result, a small shift of the oscillator energy levels for $E\ll E_0$ occurs. The shift can be calculated by means of the instanton technique. It is easy to see that there should exist an instanton solution that starts at the classical vacuum $r=r_0$, goes to the origin and then returns to the initial state. We postpone the instanton calculation for later.
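As a numerical illustration of the quantization rule (3.52), the following minimal sketch evaluates $W_c(E)$ and solves $W_c(E)=2\pi\hbar(n+1/2)$ by bisection. The function names (`t_A`, `W_cone`, `level`), the units $\hbar=1$ and the use of bisection are our own choices for illustration and are not part of the original derivation; the only physical input is (3.51) and (3.52), together with $W_c=W$ for $E\le E_0$.

```python
import math

def t_A(E, E0, omega):
    """Time (3.51) at which the trajectory reaches r = 0; meaningful for E >= E0."""
    return math.asin(math.sqrt(E0 / E)) / omega

def W_cone(E, E0, omega):
    """Bohr-Sommerfeld functional W_c(E) of (3.52) on the conic phase space."""
    W = 2.0 * math.pi * E / omega      # ordinary oscillator: W(E) = 2*pi*E/omega
    if E <= E0:                        # the origin is not reached, so W_c = W
        return W
    tA = t_A(E, E0, omega)
    return W * (0.5 + omega * tA / math.pi
                + math.sin(2.0 * omega * tA) / (2.0 * math.pi))

def level(n, E0, omega, hbar=1.0):
    """Solve W_c(E) = 2*pi*hbar*(n + 1/2) by bisection (W_c grows with E)."""
    target = 2.0 * math.pi * hbar * (n + 0.5)
    lo, hi = 0.0, 1.0
    while W_cone(hi, E0, omega) < target:   # bracket the root from above
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if W_cone(mid, E0, omega) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Level spacing for a barrier that is high (E0 = 10) or negligible (E0 = 1e-9)
    for E0 in (10.0, 1e-9):
        levels = [level(n, E0, 1.0) for n in range(4)]
        print(E0, [round(b - a, 4) for a, b in zip(levels, levels[1:])])
```

With $\omega=\hbar=1$ the spacing should come out close to $\hbar\omega$ for levels well below $E_0$ and close to $2\hbar\omega$ for $E\gg E_0$, in line with the discussion above.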
Here we only draw attention to the fact that, though in some regimes the classical dynamics may not be sensitive to the phase space structure, the quantum theory may well expose the influence of the phase space geometry.

The lesson we could learn from this simple qualitative consideration is that both the potential force and the phase space geometry affect the behavior of the gauge system. In some regimes the dynamics is strongly affected by the non-Euclidean geometry of the phase space. But there might also be regimes where the potential force mostly determines the evolution of the gauge system, and only a small influence of the phase-space structure can be seen. Even so, the quantum dynamics may be more sensitive to the non-Euclidean structure of the physical phase space than the classical one.

4. Systems with many physical degrees of freedom

So far only gauge systems with a single physical degree of freedom have been considered. A non-Euclidean geometry of the physical configuration or phase spaces may cause a specific
kinematic coupling between physical degrees of freedom [17]. The coupling does not depend on details of the dynamics governed by some local Hamiltonian. One could say that the non-Euclidean geometry of the physical configuration or phase space reveals itself through observable effects caused by this kinematic coupling. We now turn to studying this new feature of gauge theories.

4.1. Yang–Mills theory with adjoint scalar matter in (0+1) spacetime

Consider Yang–Mills potentials $A_\mu(x, t)$. They are elements of a Lie algebra $X$ of a semisimple compact Lie group $G$. In the (0+1) spacetime, the vector potential has one component, $A_0$, which can depend only on time $t$. This only component is denoted by $y(t)$. Introducing a scalar field in (0+1) spacetime in the adjoint representation of $G$, $x=x(t)\in X$, we can construct a gauge invariant Lagrangian using a simple dimensional reduction of the Lagrangian for Yang–Mills fields coupled to a scalar field in the adjoint representation [18,19]
$$L=\tfrac12(D_tx, D_tx)-V(x) , \tag{4.1}$$
$$D_tx=\dot x+i[y, x] . \tag{4.2}$$
Here $(\ ,\ )$ stands for an invariant scalar product for the adjoint representation of the group. Let $\lambda_a$ be a matrix representation of an orthonormal basis in $X$ so that $\mathrm{tr}\,\lambda_a\lambda_b=\delta_{ab}$. Then we can make the decompositions $y=y^a\lambda_a$ and $x=x^a\lambda_a$ with $y^a$ and $x^a$ being real. The invariant scalar product can be normalized on the trace, $(x, y)=\mathrm{tr}\,xy$. The commutator in (4.2) is specified by the commutation relation of the basis elements
$$[\lambda_a,\lambda_b]=if_{ab}{}^c\lambda_c , \tag{4.3}$$
where $f_{ab}{}^c$ are the structure constants of the Lie algebra. The Lagrangian (4.1) is invariant under the gauge transformations
$$x\to x^\Omega=\Omega x\Omega^{-1},\qquad y\to y^\Omega=\Omega y\Omega^{-1}+i\dot\Omega\Omega^{-1} , \tag{4.4}$$
where $\Omega=\Omega(t)$ is an element of the group $G$. Here the potential $V$ is also assumed to be invariant under the adjoint action of the group on its argument, $V(x^\Omega)=V(x)$. The Lagrangian does not depend on the velocities $\dot y$. Therefore the corresponding Euler–Lagrange equations yield a constraint
$$-\frac{\partial L}{\partial y}=i[x, D_tx]=0 . \tag{4.5}$$
This is the Gauss law for the model (cf. the Gauss law in electrodynamics or Yang–Mills theory). Note that it involves no second-order time derivatives of the dynamical variable $x$ and, hence, only implies restrictions on admissible initial values of the velocities and positions with which the dynamical equation
$$D_t^2x=-V'_x \tag{4.6}$$
is to be solved. The Yang–Mills degree of freedom $y$ appears to be purely non-physical; its evolution is not determined by the equations of motion. It can be removed from them and the constraint (4.5)
by the substitution
$$x(t)=U(t)h(t)U^{-1}(t),\qquad U(t)=\mathrm{T}\exp\left\{-i\int_0^td\tau\,y(\tau)\right\} . \tag{4.7}$$
In doing so, we get
$$[h,\dot h]=0,\qquad \ddot h=-V'_h . \tag{4.8}$$
The freedom in choosing the function $y(t)$ can be used to remove some components of $x(t)$ (say, to set them to zero for all moments of time). This would imply the removal of non-physical degrees of freedom of the scalar field by means of gauge fixing, just as we did for the SO(N) model above.

Let us take $G=SU(2)$. The orthonormal basis reads $\lambda_a=\tau_a/\sqrt2$, where $\tau_a$, $a=1,2,3$, are the Pauli matrices, $\tau_a\tau_b=\delta_{ab}+i\varepsilon_{abc}\tau_c$; $\varepsilon_{abc}$ is the totally antisymmetric structure constant tensor of SU(2), $\varepsilon_{123}=1$. So the variable $x$ is a hermitian traceless $2\times2$ matrix which can be diagonalized by means of the adjoint transformation (4.7). Therefore one may always set $h=h^3\lambda_3$. All the continuous gauge arbitrariness is exhausted, and the real variable $h^3$ describes the only physical degree of freedom. However, whenever this variable attains, say, negative values as time proceeds, the gauge transformation $h\to-h$ can still be made. For example, taking $U=e^{i\pi\tau_2/2}$ one finds $U\tau_3U^{-1}=\tau_2\tau_3\tau_2=-\tau_3$. Thus, the physical values of $h^3$ lie on the positive half-axis. We conclude that
$$\mathrm{CS}_{\rm phys}=su(2)/\mathrm{ad}\,SU(2)\sim\mathbb{R}_+,\qquad \mathrm{CS}=X=su(2)\sim\mathbb{R}^3 . \tag{4.9}$$
It might look surprising that the system has physical degrees of freedom at all because the number of gauge variables $y^a$ exactly equals the number of degrees of freedom of the scalar field $x^a$. The point is that the variable $h$ has a stationary group formed by the group elements $e^{i\varphi}$, $[\varphi, h]=0$ and, hence, so does a generic element $x$ of the Lie algebra. The stationary group is a subgroup of the gauge group. So the elements $U$ in (4.7) are specified modulo right multiplication by elements from the stationary group of $h$, $U\to Ue^{i\varphi}$. In the SU(2) example, the stationary group of $\tau_3$ is isomorphic to U(1), so the group element $U(t)$ in (4.7) belongs to $SU(2)/U(1)$ and has only two independent parameters, i.e., the scalar field $x$ carries one physical and two non-physical degrees of freedom.
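The residual transformation $h\to-h$ of the SU(2) example can be checked directly with $2\times2$ matrices. The sketch below is only an illustration: the helper names (`matmul`, `dagger`, `exp_i_tau2`) are ours, and the closed form $e^{i\theta\tau_2}=\cos\theta+i\tau_2\sin\theta$ is used in place of a general matrix exponential, which is legitimate because $\tau_2^2=1$.

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 complex matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

IDENT = [[1 + 0j, 0j], [0j, 1 + 0j]]
TAU2 = [[0j, -1j], [1j, 0j]]          # Pauli matrix tau_2
TAU3 = [[1 + 0j, 0j], [0j, -1 + 0j]]  # Pauli matrix tau_3

def exp_i_tau2(theta):
    """exp(i*theta*tau_2) = cos(theta) + i*sin(theta)*tau_2, since tau_2**2 = 1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * IDENT[i][j] + 1j * s * TAU2[i][j] for j in range(2)]
            for i in range(2)]

U = exp_i_tau2(math.pi / 2)                     # U = exp(i*pi*tau_2/2)
rotated = matmul(matmul(U, TAU3), dagger(U))    # U tau_3 U^{-1}; U is unitary

if __name__ == "__main__":
    for row in rotated:
        print([complex(round(z.real, 12), round(z.imag, 12)) for z in row])
```

Up to rounding, `rotated` equals $-\tau_3$, so a configuration $h=h^3\lambda_3$ with $h^3<0$ is gauge equivalent to one with $h^3>0$.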
From the point of view of general constrained dynamics, the constraints (4.5) are not all independent. For instance, $\mathrm{tr}(\varphi[x, D_tx])=0$ for all $\varphi$ commuting with $x$. Such constraints are called reducible (see [20,21] for a general discussion of constrained systems). Returning to the SU(2) example, one can see that among the three constraints only two are independent, which indicates that there are only two non-physical degrees of freedom contained in $x$.

To generalize our consideration to an arbitrary group $G$, we need some mathematical facts from group theory. The reader familiar with group theory may skip the following section.

4.2. The Cartan–Weyl basis in Lie algebras

Any simple Lie algebra $X$ is characterized by a set of linearly independent $r$-dimensional vectors $\vec\omega_j$, $j=1,2,\ldots,r=\mathrm{rank}\,X$, called simple roots. The simple roots form a basis in the root system of the Lie algebra. Any root $\vec\alpha$ is a linear combination of $\vec\omega_j$ with either non-negative integer coefficients ($\vec\alpha$ is said to be a positive root) or non-positive integer coefficients ($\vec\alpha$ is said to be
a negative root). Obviously, all simple roots are positive. If $\vec\alpha$ is a root then $-\vec\alpha$ is also a root. The root system is completely determined by the Cartan matrix $c_{ij}=-2(\vec\omega_i,\vec\omega_j)/(\vec\omega_j,\vec\omega_j)$ (here $(\vec\omega_i,\vec\omega_j)$ is the usual Euclidean scalar product of two $r$-vectors), which has a graphic representation known as the Dynkin diagrams [30,32]. Elements of the Cartan matrix are integers. For any two roots $\vec\alpha$ and $\vec\beta$, the cosine of the angle between them can take only the following values: $(\vec\alpha,\vec\beta)[(\vec\alpha,\vec\alpha)(\vec\beta,\vec\beta)]^{-1/2}=0,\pm1/2,\pm1/\sqrt2,\pm\sqrt3/2$. By means of this fact the whole root system can be restored from the Cartan matrix [30, p. 460].

For any two elements $x, y$ of $X$, the Killing form is defined as $(x, y)=\mathrm{tr}(\mathrm{ad}\,x\,\mathrm{ad}\,y)=(y, x)$, where the operator $\mathrm{ad}\,x$ acts on any element $y\in X$ as $\mathrm{ad}\,x(y)=[x, y]$, and $[x, y]$ is a skew-symmetric Lie algebra product that satisfies the Jacobi identity $[[x, y], z]+[[y, z], x]+[[z, x], y]=0$ for any three elements of the Lie algebra. A maximal Abelian subalgebra $H$ in $X$ is called the Cartan subalgebra, $\dim H=\mathrm{rank}\,X=r$. There are $r$ linearly independent elements $\omega_j$ in $H$ such that $(\omega_i,\omega_j)=(\vec\omega_i,\vec\omega_j)$. We shall also call the algebra elements $\omega_i$ simple roots. This will not lead to any confusion in what follows because the root space $\mathbb{R}^r$ and the Cartan subalgebra are isomorphic, but we shall keep arrows over elements of $\mathbb{R}^r$. The corresponding elements of $H$ have no over-arrow.

A Lie algebra $X$ is decomposed into the direct sum $X=H\oplus\sum_{\alpha>0}(X_\alpha\oplus X_{-\alpha})$, where $\alpha$ ranges over the positive roots and $\dim X_{\pm\alpha}=1$. Simple roots form a (non-orthogonal) basis in $H$. Basis elements $e_{\pm\alpha}$ of $X_{\pm\alpha}$ can be chosen such that [30, p. 176]
$$[e_\alpha, e_{-\alpha}]=\alpha , \tag{4.10}$$
$$[h, e_\alpha]=(\alpha, h)e_\alpha , \tag{4.11}$$
$$[e_\alpha, e_\beta]=N_{\alpha,\beta}e_{\alpha+\beta} , \tag{4.12}$$
for all $\alpha$, $\beta$ belonging to the root system and for any $h\in H$, where the constants $N_{\alpha,\beta}$ satisfy $N_{\alpha,\beta}=-N_{-\alpha,-\beta}$. For any such choice $N^2_{\alpha,\beta}=\frac12q(1-p)(\alpha,\alpha)$, where $\beta+n\alpha$ ($p\le n\le q$) is the $\alpha$-series of roots containing $\beta$; $N_{\alpha,\beta}=0$ if $\alpha+\beta$ is not a root. Any element $x\in X$ can be decomposed over the Cartan–Weyl basis (4.10)–(4.12),
$$x=x_H+\sum_{\alpha>0}(x^\alpha e_\alpha+x^{-\alpha}e_{-\alpha}) \tag{4.13}$$
with $x_H$ being the Cartan subalgebra component of $x$. The commutation relations (4.10)–(4.12) imply a definite choice of the norms of the elements $e_{\pm\alpha}$, namely, $(e_{\pm\alpha}, e_{\pm\alpha})=0$ and $(e_\alpha, e_{-\alpha})=1$ [30, p. 167]. Norms of simple roots are also fixed in (4.10)–(4.12). Consider, for instance, the su(2) algebra. There is just one positive root $\omega$. Let its norm be $c=(\omega,\omega)$. The Cartan–Weyl basis reads $[e_\omega, e_{-\omega}]=\omega$ and $[\omega, e_{\pm\omega}]=\pm ce_{\pm\omega}$. Let us calculate $c$ in this basis. By definition $c=\mathrm{tr}(\mathrm{ad}\,\omega)^2$. The operator $\mathrm{ad}\,\omega$ is a $3\times3$ diagonal matrix with $0,\pm c$ being its diagonal elements, as follows from the basis commutation relations and the definition of the operator $\mathrm{ad}\,\omega$. Thus, $\mathrm{tr}(\mathrm{ad}\,\omega)^2=2c^2=c$, i.e., $c=1/2$.

The su(3) algebra has two equal-norm simple roots $\vec\omega_1$ and $\vec\omega_2$ with the angle between them equal to $2\pi/3$. For the corresponding Cartan subalgebra elements we have $(\omega_1,\omega_1)=(\omega_2,\omega_2)=c$ and $(\omega_1,\omega_2)=-c/2$. The whole root system is given by six elements $\pm\omega_1$, $\pm\omega_2$ and $\pm(\omega_1+\omega_2)\equiv\pm\omega_{12}$. It is readily seen that $(\omega_{12},\omega_{12})=c$ and $(\omega_1,\omega_{12})=(\omega_2,\omega_{12})=c/2$. All the roots have the same norm and the angle between two neighboring roots is equal to $\pi/3$. Having
obtained the root pattern, we can evaluate the number $c$. The (non-orthogonal) basis consists of eight elements $\omega_{1,2}$, $e_{\pm1}$, $e_{\pm2}$ and $e_{\pm12}$, where we have introduced the simplified notations $e_{\pm\omega_1}\equiv e_{\pm1}$, etc. The operators $\mathrm{ad}\,\omega_{1,2}$ are $8\times8$ diagonal matrices, as follows from (4.11) and $[\omega_1,\omega_2]=0$. Using (4.11) we find $\mathrm{tr}(\mathrm{ad}\,\omega_{1,2})^2=3c^2=c$ and, therefore, $c=1/3$. As soon as the root norms are established, one can obtain the structure constants $N_{\alpha,\beta}$. For $X=su(3)$ we have $N^2_{1,2}=N^2_{12,-1}=N^2_{12,-2}=1/6$ and all others vanish (notice that $N_{\alpha,\beta}=-N_{-\alpha,-\beta}$ and $N_{\alpha,\beta}=-N_{\beta,\alpha}$). The latter determines the structure constants up to a sign. The transformation $e_\alpha\to-e_\alpha$, $N_{\alpha,\beta}\to-N_{\alpha,\beta}$ leaves the Cartan–Weyl commutation relations unchanged. Therefore, only relative signs of the structure constants must be fixed. Fulfilling the Jacobi identity for the elements $e_{-1}, e_1, e_2$ and $e_{-2}, e_1, e_2$ results in $N_{1,2}=-N_{12,-1}$ and $N_{1,2}=N_{12,-2}$, respectively. Now one can set $N_{1,2}=N_{12,-2}=-N_{12,-1}=1/\sqrt6$, which completes the determination of the structure constants for su(3).

One can construct a basis orthonormal with respect to the Killing form. For this purpose we introduce the elements [30, p. 181]
$$s_\alpha=i(e_\alpha-e_{-\alpha})/\sqrt2,\qquad c_\alpha=(e_\alpha+e_{-\alpha})/\sqrt2 , \tag{4.14}$$
so that
$$[h, s_\alpha]=i(h,\alpha)c_\alpha,\qquad [h, c_\alpha]=-i(h,\alpha)s_\alpha,\qquad h\in H . \tag{4.15}$$
Then $(s_\alpha, s_\beta)=(c_\alpha, c_\beta)=\delta_{\alpha\beta}$ and $(c_\alpha, s_\beta)=0$. Also,
$$(x, x)=\sum_{\alpha>0}\left[(x_s^\alpha)^2+(x_c^\alpha)^2\right]+(x_H, x_H) , \tag{4.16}$$
where $x^\alpha_{s,c}$ are real decomposition coefficients of $x$ in the orthonormal basis (4.14). Supplementing (4.14) by an orthonormal basis $\lambda_j$, $(\lambda_j,\lambda_i)=\delta_{ij}$, of the Cartan subalgebra (it might be obtained by orthogonalizing the simple root basis of $H$), we get an orthonormal basis in $X$; we shall denote it $\lambda_a$, that is, for $a=j$, $\lambda_a$ ranges over the orthonormal basis in the Cartan subalgebra, and for $a=\alpha$ over the set $s_\alpha, c_\alpha$.

Suppose we have a matrix representation of $X$. Then $(x, y)=c_r\,\mathrm{tr}(xy)$, where $xy$ means matrix multiplication. The number $c_r$ depends on $X$. For classical Lie algebras, the numbers $c_r$ are listed in [30, pp. 187–190]. For example, $c_r=2(r+1)$ for $X=su(r+1)$. Using this, one can establish a relation of the orthonormal basis constructed above for su(2) and su(3) with the Pauli matrices and the Gell-Mann matrices [33, p. 17], respectively. For the Pauli matrices we have $[\tau_a,\tau_b]=2i\varepsilon_{abc}\tau_c$, hence, $(\tau_a,\tau_b)=-4\varepsilon_{ab'c'}\varepsilon_{bc'b'}=8\delta_{ab}=4\,\mathrm{tr}\,\tau_a\tau_b$, in full accordance with $c_r=2(r+1)$, $r=1$. One can set $\omega=\tau_3/4$, $s_\omega=\varphi\tau_1$ and $c_\omega=\varphi\tau_2$, where $1/\varphi=2\sqrt2$. A similar analysis of the structure constants for the Gell-Mann matrices $\lambda_a$ [33, p. 18] yields $\omega_1=\lambda_3/6$, $s_1=\varphi\lambda_1$, $c_1=\varphi\lambda_2$, $\omega_2=(\sqrt3\lambda_8-\lambda_3)/12$, $s_2=\varphi\lambda_6$, $c_2=\varphi\lambda_7$, $\omega_{12}=(\sqrt3\lambda_8+\lambda_3)/12$, $s_{12}=\varphi\lambda_5$ and $c_{12}=-\varphi\lambda_4$, where $1/\varphi=2\sqrt3$. This choice is not unique. Actually, the identification of the non-diagonal generators $\lambda_a$, $a\ne3,8$, with (4.14) depends on a representation of the simple roots $\omega_{1,2}$ by the diagonal matrices $\lambda_{3,8}$. One could choose $\omega_1=\lambda_3/6$ and $\omega_2=-(\sqrt3\lambda_8+\lambda_3)/12$, which would lead to another matrix realization of the elements (4.14).

Consider the adjoint action of the group $G$ on its Lie algebra $X$: $x\to\mathrm{ad}\,U(x)$. Taking $U=e^z$, $z\in X$, the adjoint action can be written in the form $\mathrm{ad}\,U=\exp(\mathrm{ad}\,z)$. In a matrix representation it has a more familiar form, $x\to UxU^{-1}$. The Killing form is invariant under the
adjoint action of the group
$$(\mathrm{ad}\,U(x),\mathrm{ad}\,U(y))=(x, y) . \tag{4.17}$$
In a matrix representation this is a simple statement: $\mathrm{tr}(UxU^{-1}UyU^{-1})=\mathrm{tr}(xy)$. The Cartan–Weyl basis allows us to make computations without referring to any particular representation of a Lie algebra. This great advantage will often be exploited in what follows.

4.3. Elimination of non-physical degrees of freedom. An arbitrary gauge group case

The key fact for the subsequent analysis will be the following formula for a representation of a generic element of a Lie algebra [32]
$$x=\mathrm{ad}\,U(h),\qquad U=U(z)=e^{iz}\quad(\text{or }x=UhU^{-1}) , \tag{4.18}$$
in which $h=h^i\lambda_i$ is an element of the Cartan subalgebra $H$ with an orthonormal basis $\lambda_i$, $i=1,2,\ldots,r=\mathrm{rank}\,G$, and the group element $U(z)$ is obtained by the exponential map of $z=z^a\lambda_a\in X\ominus H$ to the group $G$. Here $a=r+1,r+2,\ldots,N=\dim G$ and $z^a$ are real. The $r$ variables $h^i$ are analogous to $h^3$ from the SU(2) example, while the variables $z^a$ are non-physical and can be removed by a suitable choice of the gauge variables $y^a$ for any actual motion, as follows from a comparison of (4.18) and (4.7). Thus the rank of the Lie algebra specifies the number of physical degrees of freedom. The function $h(t)\in H$ describes the time evolution of the physical degrees of freedom. Note that the constraint in (4.8) is fulfilled identically, $[h,\dot h]\equiv0$, because both the velocity and position are elements of the maximal Abelian subalgebra. We can also conclude that the original constraint (4.5) contains only $N-r$ independent equations.

There is still a gauge arbitrariness left. Just like in the SU(2) model, we cannot reduce the number of physical degrees of freedom, but a further reduction of the configuration space of the variable $h$ is possible. It is known [32] that a Lie group contains a discrete finite subgroup $W$, called the Weyl group, whose elements are compositions of reflections in hyperplanes orthogonal to simple roots of the Cartan subalgebra. The group $W$ is isomorphic to the group of permutations of the roots, i.e., to a group that preserves the root system. The gauge $x=h$ is called an incomplete global gauge with the residual symmetry group $W\subset G$.² The existence of the residual symmetry leads to a further reduction of the configuration space. The residual gauge symmetry of the SU(2) model is $Z_2$ (the Weyl group for SU(2)), which identifies the mirror points $h^3$ and $-h^3$ on the real axis. One can also say that this group "restores" the real axis (isomorphic to the Cartan subalgebra of SU(2)) from the modular domain $h^3>0$.
Similarly, the Weyl group $W$ restores the Cartan subalgebra from the modular domain called the Weyl chamber, $K^+\subset H$ [32] (up to the boundaries of the Weyl chamber, which form a zero-measure set in $H$).

² The incomplete global gauge does not exist for the vector potential (connection) in four-dimensional Yang–Mills theory [34]. See also Section 10.4 in this regard.

The generators of the Weyl group are easy to construct in the Cartan–Weyl basis. The reflection of a simple root $\omega$ is given by the adjoint transformation $\hat R_\omega\omega\equiv e^{i\varphi s_\omega}\omega e^{-i\varphi s_\omega}=-\omega$, where $\varphi=\pi/\sqrt{(\omega,\omega)}$. Any element of $W$ is obtained by a composition of $\hat R_\omega$ with $\omega$ ranging over the set
of simple roots. The action of the generating elements of the Weyl group on an arbitrary element of the Cartan subalgebra reads
$$\hat R_\omega h=\Omega_\omega h\Omega_\omega^{-1}=h-\frac{2(h,\omega)}{(\omega,\omega)}\,\omega,\qquad \Omega_\omega\in G . \tag{4.19}$$
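The reflection (4.19) is easy to experiment with numerically. The following sketch, which is only an illustration and not part of the original text, represents elements of the Cartan subalgebra of su(3) by vectors in $\mathbb{R}^2$, takes two simple roots of norm squared $c=1/3$ at the angle $2\pi/3$ (the values quoted in Section 4.2), and moves an arbitrary $h$ into the domain $(h,\omega)\ge0$ by repeated reflections. The names `reflect` and `fold_into_chamber`, the orientation of the roots in the plane and the tolerance are our own assumptions.

```python
import math

def dot(u, v):
    """Euclidean scalar product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def reflect(h, w):
    """Weyl reflection (4.19): h -> h - 2 (h, w)/(w, w) w."""
    k = 2.0 * dot(h, w) / dot(w, w)
    return tuple(a - k * b for a, b in zip(h, w))

def fold_into_chamber(h, simple_roots, tol=1e-12, max_steps=1000):
    """Reflect h in simple-root hyperplanes until (h, w) >= 0 for every simple root."""
    h = tuple(h)
    for _ in range(max_steps):
        negative = [w for w in simple_roots if dot(h, w) < -tol]
        if not negative:
            return h
        h = reflect(h, negative[0])
    raise RuntimeError("no chamber representative found within max_steps")

# su(3): two simple roots with (w, w) = c = 1/3 and angle 2*pi/3 between them
c = 1.0 / 3.0
W1 = (math.sqrt(c), 0.0)
W2 = (math.sqrt(c) * math.cos(2 * math.pi / 3),
      math.sqrt(c) * math.sin(2 * math.pi / 3))

if __name__ == "__main__":
    h = (-0.7, -1.3)
    print(fold_into_chamber(h, [W1, W2]))
```

Each reflection preserves the norm of $h$, so the representative returned by `fold_into_chamber` lies on the same Weyl orbit as the input and satisfies $(h,\omega_{1,2})\ge0$ up to the tolerance.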
The geometrical meaning of (4.19) is transparent. It describes a reflection of the vector $h$ in the hyperplane orthogonal to the simple root $\omega$. In what follows we assume the Weyl chamber to be the intersection of all positive half-spaces bounded by hyperplanes orthogonal to simple roots (the positivity is determined relative to the root vector direction). The Weyl chamber is said to be an open convex cone [35]. For any element $h\in K^+$, we have $(h,\omega)>0$, where $\omega$ ranges over all simple roots. Thus we conclude that
$$\mathrm{CS}_{\rm phys}=X/\mathrm{ad}\,G\sim H/W\sim K^+ . \tag{4.20}$$
The metric on the physical configuration space can be constructed as the induced metric on the surface $U(z)=e$, where $e$ is the group unity. First, the Euclidean metric $ds^2=(dx, dx)\equiv dx^2$ is written in the new curvilinear variables (4.18). Then one takes its inverse. The induced metric is identified with the inverse of the $hh$-block of the inverse of the total metric tensor. In doing so, we find
$$dx=\mathrm{ad}\,U(dh+[h, U^{-1}dU]) ,$$
$$ds^2=dh^2+[h, U^{-1}dU]^2=\delta_{ij}\,dh^idh^j+\tilde g_{ab}(h, z)\,dz^adz^b , \tag{4.21}$$
where we have used (4.17) and the fact that $[h, U^{-1}dU]\in X\ominus H$ (cf. (4.11)) and hence $(dh,[h, U^{-1}dU])=0$. The metric has a block-diagonal structure and so has its inverse. Therefore, the physical metric (the induced metric on the surface $z=0$) is the Euclidean one. The physical configuration space is a Euclidean space with boundaries (cf. (4.20)). It has the structure of an orbifold [36].

The above procedure of determining the physical metric is general for first-class constrained systems whose constraints are linear in momenta. The latter condition ensures that the gauge transformations do not mix the configuration and momentum space variables in the total phase space. There is an equivalent method of calculating the metric on the orbit space [13] which uses only a gauge condition. One takes the (Euclidean) metric on the original configuration space and obtains the physical metric by projecting tangent vectors (velocities) onto the subspace defined by the constraints. Since this procedure will also be used in what follows, we give here a brief description. Suppose we have independent first-class constraints $\sigma_a=F_a^i(q)p_i$. Consider the kinetic energy $H_0=g^{ij}p_ip_j/2\equiv g_{ij}v^iv^j/2$, where $g_{ij}$ is the metric on the total configuration space, $v^i=g^{ij}p_j$ are tangent vectors, and $g^{ij}$ is the inverse of the metric. We split the set of the canonical coordinates $q^i$ into two subsets $h^\nu$ and $\bar q^a$ such that the matrix $\{\bar q^a,\sigma_b\}=F^a_b(q)$ is not degenerate on the surface $\bar q^a=0$ except, maybe, on a set of zero measure. Then the physical phase space can be parameterized by the canonical coordinates $p_\nu$ and $h^\nu$. Denoting $\bar F^b_a(h)=F^b_a|_{\bar q=0}$, and similarly $\bar F^\nu_a$ and $\bar g^{-1}$, we solve the constraints for the non-physical momenta, $\bar p_a=-(\bar F^{-1})^b_a\bar F^\nu_bp_\nu\equiv-\gamma^\nu_ap_\nu$, and substitute the result into the kinetic energy:
$$H_0=\tfrac12g^{\mu\nu}_{\rm ph}p_\mu p_\nu=\tfrac12g^{\rm ph}_{\mu\nu}v^\mu v^\nu , \tag{4.22}$$
$$g^{\mu\nu}_{\rm ph}=\bar g^{\mu\nu}-\gamma^\mu_a\bar g^{a\nu}-\bar g^{\mu a}\gamma^\nu_a+\gamma^\mu_a\bar g^{ab}\gamma^\nu_b , \tag{4.23}$$
where $g^{\rm ph}_{\mu\nu}$ is the inverse of $g^{\mu\nu}_{\rm ph}$; it is the metric on the orbit space which determines the norm of the corresponding tangent vectors $v^\mu$ (physical velocities). Instead of the conditions $\bar q=0$, one can use general conditions $\chi^a(q)=0$, which means that locally $\bar q=\bar q(h)$, where $h$ is a set of parameters spanning the surface $\chi^a(q)=0$, instead of $\bar q=0$ in the above formulas. In the model under consideration, we set $x=h+z$, $h\in H$, and impose the condition $z=0$. Then, setting $z$ equal to zero in the constraint, we obtain $[h, p_z]=0$ ($p_z\equiv\bar p$), which leads to $p_z=0$, as one can see from the commutation relation (4.11). Therefore $g_{\rm ph}=1$ because $g=\bar g=1$.

It is also of interest to calculate the induced volume element $\mu(h)d^rh$ in $\mathrm{CS}_{\rm phys}$. In the curvilinear coordinates (4.18), the variables $z$ parameterize a gauge orbit through a point $x=h$. For $h\in K^+$, the gauge orbit is a compact manifold of dimension $N-r$, $\dim X=N$, and isomorphic to $G/G_H$, where $G_H$ is the maximal Abelian subgroup of $G$, the Cartan subgroup. The variables $h$ span the space locally transverse to the gauge orbits. So, the induced volume element can be obtained from the decomposition
$$d^Nx=\sqrt g\,d^{N-r}z\,d^rh=\mu(h)d^rh\,\tilde\mu(z)d^{N-r}z . \tag{4.24}$$
Here $g$ is the determinant of the metric tensor in (4.21). Making use of the orthogonal basis constructed in the previous subsection, the algebra element $U^{-1}dU$ can be represented in the form $-i\lambda_aF^a_b(z)\,dz^b$ with $F^a_b$ being some functions of $z$. Their explicit form will not be relevant to us. Since the commutator $[\lambda_i,\lambda_a]$ always belongs to $X\ominus H$ and the $\lambda_i$'s are commutative, we find $[h, U^{-1}dU]=\lambda_ch^if_{ia}{}^cF^a_b\,dz^b$. Hence,
$$\tilde g_{ab}=F^c_a(z)G_{cd}(h)F^d_b(z),\qquad G_{ab}=\omega_{ac}\omega_b{}^c,\qquad \omega_a{}^b=h^if_{ia}{}^b , \tag{4.25}$$
and the Cartesian metric $\delta_{ab}=(\lambda_a,\lambda_b)$ is used to lower and raise the indices of the structure constants. Substituting these relations into the volume element (4.24) we obtain $\mu(h)=\det\omega(h)$. The latter determinant is quite easy to calculate in the orthogonalized Cartan–Weyl basis. Indeed, from (4.11) it follows that $[h,\lambda_a]=i\omega_a{}^b(h)\lambda_b$, where $\lambda_a$ is the set (4.14). Let us order the basis elements $\lambda_a$ so that the first $r$ elements form the basis in the Cartan subalgebra, while $\lambda_a=s_\alpha$ and $\lambda_{a+1}=c_\alpha$ for $a=r+1,r+3,\ldots,N-1$. An explicit form of the matrix $i\omega_a{}^b$ is obtained from the commutation relations (4.15). It is block-diagonal, and each block is associated with the corresponding positive root $\alpha$ and equals $i(h,\alpha)\tau_2$ ($\tau_2$ being the Pauli matrix). Thus,
$$\mu(h)=\kappa^2(h),\qquad \kappa(h)=\prod_{\alpha>0}(\alpha, h) . \tag{4.26}$$
The density $\mu$ is invariant under permutations and reflections of the roots, i.e., with respect to the Weyl group: $\mu(\hat R_\omega h)=\mu(h)$ for any simple root $\omega$. It also vanishes at the boundary of the Weyl chamber. One should draw attention to the fact that the determinant of the induced metric on the physical configuration space does not yield the volume element. This is, in fact, a generic situation in gauge theories [13]: in addition to the square root of the determinant of the physical metric, the volume element also contains a factor equal to the volume of the gauge orbit associated with each point of the gauge orbit space. In the model under consideration the physical configuration space has a Euclidean metric, and $\mu(h)$ determines the volume of the gauge orbit through the point $x=h$ up to a factor ($\int_{G/G_H}dz\,\tilde\mu(z)$) which is independent of $h$. For example, the adjoint action of SU(2) in its
Lie algebra can be viewed as rotations in three-dimensional Euclidean space. The gauge orbits are concentric two-spheres. In spherical coordinates we have $d^3x=\sin\theta\,d\theta\,d\phi\,r^2dr$. The volume of a gauge orbit through $x^i=\delta^{i1}r$ is $4\pi r^2$. In (4.24) $z^a$ are the angular variables $\theta$ and $\phi$, while $h$ is $r$, and $\tilde\mu=\sin\theta$, $\mu=r^2$.

4.4. Hamiltonian formalism

Now we develop the Hamiltonian formalism for the model and describe the structure of the physical phase space. The system has $N$ primary constraints $\pi_a=\partial L/\partial\dot y^a=0$. Its canonical Hamiltonian reads
$$H=\tfrac12p^2+V(x)+y^a\sigma_a , \tag{4.27}$$
where $p^2=(p, p)$, $p=\partial L/\partial\dot x=D_tx$ is the momentum conjugate to $x$ and
$$\sigma_a=i(\lambda_a,[x, p])=0,\qquad \{\sigma_a,\sigma_b\}=if_{ab}{}^c\sigma_c \tag{4.28}$$
are the secondary constraints. They generate the gauge transformations on phase space given by the adjoint action of the group $G$ on its Lie algebra,
$$p\to p^\Omega=\Omega p\Omega^{-1},\qquad x\to x^\Omega=\Omega x\Omega^{-1} , \tag{4.29}$$
because $\{p,\sigma_a\}=i[\lambda_a, p]$ and $\{x,\sigma_a\}=i[\lambda_a, x]$. The Hamiltonian equations of motion do not specify the time evolution of the gauge variable $y$. So the phase space trajectory described by the pair $p(t)$, $x(t)$ depends on the choice of $y(t)$. Trajectories associated with different functions $y(t)$ are related to one another by gauge transformations. Just like in the Lagrangian formalism, this gauge arbitrariness can be used to suppress the dynamics of some degrees of freedom of the scalar field $x(t)$. We choose $y(t)$ so that $x(t)=h(t)\in H$. The constraint (4.28) means that the momentum and position should commute as Lie algebra elements, $[p, x]=0$. Therefore on the constraint surface, the canonical momentum $p_h$ conjugate to $h$ must commute with $h$, $[p_h, h]=0$. This is a simple consequence of the gauge transformation law (4.29): if the variable $x(t)$ is brought to the Cartan subalgebra by a gauge transformation, then the same gauge transformation simultaneously applies to $p(t)$, turning it into $p_h(t)$. Since the constraint is covariant under gauge transformations, the new canonical variables $h$ and $p_h$ should also fulfill the constraint. Thus, we are led to the conclusion that $p_h$ is an element of the Cartan subalgebra because it commutes with a generic element $h\in H$. Though no continuous gauge arbitrariness is left, a further reduction of the phase space is still possible. The variable $h$ has gauge equivalent configurations related to one another by the Weyl transformations. In the phase space spanned by the Cartan algebra elements $p_h$ and $h$, the Weyl group acts simultaneously on the momentum and position variables in accordance with the gauge transformation law (4.29). Thus,
$$\mathrm{PS}_{\rm phys}\sim H\oplus H/W\sim\mathbb{R}^{2r}/W . \tag{4.30}$$
By identifying the points $(\hat Rp_h,\hat Rh)$, $\hat R\in W$, the Euclidean space $\mathbb{R}^{2r}$ turns into a $2r$-dimensional hypercone which, after an appropriate cut, is unfoldable into $\mathbb{R}^r\oplus K^+$. For generic configurations $h\in K^+$ the physical phase space has no singularities and is locally flat.
When $h$ approaches a generic point on the boundary $(h,\omega)=0$ of the Weyl chamber, the physical
phase space exhibits a conic singularity. Indeed, we may always make a linear canonical transformation such that one of the canonical coordinates, say $h^\perp$, varies along the line perpendicular to the boundary, while the others span hyperplanes parallel to the hyperplane $(h,\omega)=0$, which is a part of the Weyl chamber boundary. In the new variables, the Weyl transformation that flips the sign of the root $\omega$ will change the signs of $h^\perp$ and its canonical momentum, while leaving the other canonical variables unchanged. So, at a generic point of the Weyl chamber boundary, the physical phase space has the local structure $\mathbb{R}^{2(r-1)}\oplus\mathrm{cone}(\pi)$.

The Weyl chamber boundary is not a smooth manifold and contains intersections of two hyperplanes $(\omega_1, h)=0$ and $(\omega_2, h)=0$. At these points, the two local conic singularities of the physical phase space associated with the simple roots $\omega_{1,2}$ would merge, forming locally a 4-dimensional hyperconic singularity. This singularity cannot be simply described as a direct product of two cones $\mathrm{cone}(\pi)$. That would only be the case if the roots $\omega_{1,2}$ were orthogonal. In general, the tip of the hypercone would be "sharper" than that of $\mathrm{cone}(\pi)\oplus\mathrm{cone}(\pi)$, meaning that the hypercone can always be put inside $\mathrm{cone}(\pi)\oplus\mathrm{cone}(\pi)$ when their tips are at the same point. This can be understood again in the local canonical variables where the coordinates $h$ are split into a pair $h^\perp$ that spans a plane orthogonal to the intersection of the two hyperplanes $(\omega_{1,2}, h)=0$ and the others orthogonal to $h^\perp$. The root pattern in any plane containing at least two roots (e.g., a plane through the origin and parallel to the $h^\perp$-plane) is isomorphic to one of the root patterns of the groups of rank two, i.e., SU(3), Sp(4)$\sim$SO(5), $G_2$ or just SU(2)$\times$SU(2). In the latter case the simple roots $\omega_{1,2}$ are orthogonal. A modular domain of $h^\perp$ coincides with the Weyl chamber of one of these groups and is contained in the positive quadrant, which is the Weyl chamber for SU(2)$\times$SU(2).
That is, a solid region bounded by the hypercone spanned by $p^\perp$ and $h^\perp$ and obtained as a quotient space of $\mathbb{R}^4$ with respect to the Weyl group is contained in the solid region bounded by $\mathrm{cone}(\pi)\oplus\mathrm{cone}(\pi)$. The procedure is straightforward to generalize to the boundary points belonging to intersections of three hyperplanes $(\omega_{1,2,3}, h)=0$, etc. At the origin, the physical phase space has its most singular point, the tip of a $2r$-dimensional hypercone which is "sharper" than $[\oplus\,\mathrm{cone}(\pi)]^r$.

We shall see that the impossibility of globally splitting the physical degrees of freedom into "conic" and "flat" ones, which is due to the non-Euclidean (hyperconic) structure of the physical phase space, will have significant dynamical consequences. For example, the physical frequencies of an isotropic oscillator turn out to be proportional to the orders of the independent invariant (Casimir) polynomials of the corresponding Lie algebra, rather than being equal as one might naively expect after fixing the gauge $x=h$. In the coordinate representation of quantum theory, the existence of the boundaries in the configuration space of the physical variable $h$ will also have important consequences.

4.5. Classical dynamics for groups of rank 2

To find out what kind of dynamical effects are caused by the hyperconic structure of the phase space, we analyze an isotropic harmonic oscillator for groups of rank 2, i.e., for SU(3), SO(5)$\sim$Sp(4) and $G_2$. Eliminating the non-physical degrees of freedom by choosing $y(t)$ so that $x(t)=h(t)\in H$, the Hamiltonian for the physical degrees of freedom assumes the form
$$H=\tfrac12(p_h^2+h^2) . \tag{4.31}$$
Fig. 4. Classical dynamics in the Weyl chamber of SU(3).
In this parameterization of the physical phase space the canonical coordinates are restricted to the Weyl chamber. For the sake of simplicity we set the oscillator frequency, the parameter of the Hamiltonian (4.31), to one. The physical configuration space, i.e., the Weyl chamber, is a sector with angle $\pi/l$ on the plane, where $l=3,4,6$ for SU(3), SO(5)$\sim$Sp(4) and $G_2$, respectively. A trajectory of the oscillator for the group SU(3) is shown in Fig. 4. The initial conditions are chosen so that the solutions of the equations of motion have the form

$$h_2(t)=A_2\cos t,\qquad h_1(t)=A_1\sin t, \tag{4.32}$$

and $A_2>A_1$. The Weyl chamber is a sector with angle $\pi/3$ which is shown as a grey area in the figure. The ray $OO'$ is its symmetry axis.

At the initial moment of time $t=0$ the oscillator is located at the point $A$. Then it follows the elliptic trajectory extended along the axis $h_2$ and at $t=\pi/6$ reaches the point $B$, i.e., the boundary of the Weyl chamber. The further motion along the ellipse in the sector bounded by the rays $O\gamma$ and $O\gamma_1$ is gauge equivalent to the motion from $B$ to $C$ in the Weyl chamber $K^+$. It looks as if the oscillator hits the boundary, reflects from it and arrives at the point $C$ at time $t=\pi/3$. Though at the point $B$ the oscillator momentum abruptly changes its direction, it is important to realize that there is no force causing this change because the oscillator potential is smooth on the entire plane. The momenta right before and after hitting the boundary wall are gauge equivalent. They are related by the Weyl transformation being the reflection $\hat R_{\omega_2}$ relative to the line $\gamma'\gamma$ perpendicular to the root $\omega_2$. So there is no dramatic change of the physical state of the system at the moment of reaching the boundary. Just like in the SO(N) model of Section 3, the trajectory is smooth on the hyperconic physical phase space (4.30).
The momentum jump is a coordinate artifact occurring through a cut made on the physical phase space to parameterize it by a particular set of local canonical coordinates $p_h\in H$ and $h\in K^+$. Within the path integral formalism for the model, we shall see that the phase of the wave function does not change under such a reflection, in full
contrast with the realistic reflection from an infinite potential wall where the phase would be shifted by $\pi$. At the point $C$ the oscillator hits the boundary of the Weyl chamber one more time and follows the elliptic segment $C\to D$. Again, at the very moment of the collision, no abrupt change of the physical state occurs. Finally, at $t=\pi$ the oscillator reaches the point $D$, reflects from the boundary and goes the same way back to the point $A$, returning there at $t=2\pi$.

What are the independent frequencies of this two-dimensional isotropic oscillator? It is quite surprising that they are not both equal to one (the frequency that enters into the Hamiltonian), but are rather 2 and $l=3$. By definition the angular frequency is $2\pi/T$, where $T$ is the time in which the system returns to the initial state upon periodic motion. The state of the system is specified by the values of the momentum and position of the physical degree of freedom in question. Let us decompose the motion of the system into oscillations along the axis $O'O''$ and the angular motion about the origin $O$. After the oscillator has passed the segments $A\to B\to C$, the angular variable attains its initial value since the angles $O'OC$ and $O'OA$ coincide, while the equality of the corresponding (angular) canonical momentum at the points $A$ and $C$ follows from its conservation law. The angular degree of freedom returns to the initial state two more times as the oscillator follows the path $C\to D\to C$, and then returns to the initial state after passing the segments $C\to B\to A$. Thus, the period of the angular variable is three times smaller than that of the angular variable of an ordinary two-dimensional isotropic oscillator, i.e., the physical frequency is tripled. From Fig. 4 one can easily see that the states of the radial degree of freedom at the points $A$ and $D$ are the same, so the physical frequency of the radial degree of freedom is doubled. A similar analysis can also be done for the groups SO(5)$\sim$Sp(4) and $G_2$. For them the independent frequencies appear to be 2 and $l$.
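The doubling and tripling of the frequencies can be illustrated numerically for SU(3) by following the two Weyl-invariant polynomials $\mathrm{tr}\,h^2$ and $\mathrm{tr}\,h^3$ along oscillator trajectories in the Cartan plane. The sketch below is only an illustration under stated assumptions: the orthonormal Cartan basis `U1`, `U2`, the helper names and the amplitudes are chosen here and are not taken from the text. It checks that the quadratic invariant repeats after half a period on an ellipse of the form (4.32), and that on a circular orbit (fixed $\mathrm{tr}\,h^2$) the cubic invariant repeats after one third of a period.

```python
import numpy as np

# Orthonormal basis of the Cartan subalgebra of su(3): traceless diagonal
# matrices with tr(U_i U_j) = delta_ij. This basis is an illustrative choice.
U1 = np.diag([1.0, -1.0, 0.0]) / np.sqrt(2.0)
U2 = np.diag([1.0, 1.0, -2.0]) / np.sqrt(6.0)

def invariants(h1, h2, l=3):
    """Return (tr h^2, tr h^l) for the Cartan element h = h1*U1 + h2*U2."""
    h = h1 * U1 + h2 * U2
    return np.trace(h @ h), np.trace(np.linalg.matrix_power(h, l))

def along_orbit(a1, a2, times):
    """Invariants along h1 = a1 sin t, h2 = a2 cos t, cf. (4.32)."""
    return np.array([invariants(a1 * np.sin(t), a2 * np.cos(t)) for t in times])

t = np.linspace(0.0, 2.0 * np.pi, 601)

# Quadratic (radial) invariant on a generic ellipse: period pi, frequency 2.
ell_now = along_orbit(0.6, 1.0, t)
ell_shift = along_orbit(0.6, 1.0, t + np.pi)
radial_period_is_pi = np.allclose(ell_now[:, 0], ell_shift[:, 0])

# Cubic invariant on a circular orbit: period 2*pi/3, frequency 3.
circ_now = along_orbit(1.0, 1.0, t)
circ_shift = along_orbit(1.0, 1.0, t + 2.0 * np.pi / 3.0)
cubic_period_is_2pi_over_3 = np.allclose(circ_now[:, 1], circ_shift[:, 1])

print(radial_period_is_pi, cubic_period_is_2pi_over_3)
```

In this basis the cubic invariant on the unit circle works out to $-\cos 3t/\sqrt6$, so its maximal magnitude is $1/\sqrt6$.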
Note that the Weyl chamber is a sector with angle $\pi/l$. The numbers 2 and $l$ are, in fact, fine characteristics of the groups, namely, they are the degrees of the two independent invariant (Casimir) polynomials, which are $\mathrm{tr}\,x^2$ and $\mathrm{tr}\,x^l$ in a matrix representation. Any regular function $f(x)$ invariant under the adjoint action of the group on its argument is a function of these two independent polynomials. This fact holds for an arbitrary semisimple compact gauge group $G$: the independent frequencies of the isotropic harmonic oscillator are determined by the degrees of the independent Casimir polynomials. The number of the independent Casimir polynomials equals the rank of the group $G$, i.e., the number of physical degrees of freedom. The list of degrees of the independent Casimir polynomials for each group can be found in [32]. In the next subsection we shall develop a Hamiltonian formalism in explicitly gauge invariant variables and see the relation between the physical frequencies and the orders of the independent Casimir polynomials once again.

So, in the classical theory the hyperconic structure of the physical phase space reveals itself through the effect of reflections of physical trajectories from the boundaries of the physical configuration space when the latter is parameterized by elements of the Cartan subalgebra. One should stress again that the effect of changing the physical frequencies of the oscillator does not depend on the choice of local canonical variables and is essentially due to the hyperconic structure of the physical phase space. To calculate the effect, we used the above parameterization of the physical phase space. The choice of the parameterization is, in fact, a matter of convenience. Had we taken another set of local canonical coordinates, say, by making cuts of the hyperconic phase space such that the momentum variable $p_h$ is restricted to the Weyl chamber, we would have arrived at the very same conclusion about the oscillator frequencies. The message is therefore:
Whatever local canonical coordinates are assumed, the coordinate singularities associated with them should be carefully taken into account when solving the dynamical problem because they may contain information about the geometry of the physical phase space.

Another important observation is that the geometry of the physical phase space does not permit excitations of the Cartesian degrees of freedom $h^i$ independently, even though the Hamiltonian does not contain any interaction between them. This effect can be anticipated from the fact that the residual Weyl transformations mix the $h^i$. Such a kinematic coupling between the physical degrees of freedom appears to be crucial for constructing a correct path integral formalism for gauge systems [19,18]. If the Hamiltonian does not contain any coupling between the degrees of freedom, then the transition amplitude in quantum mechanics is factorized over the degrees of freedom. This, however, is not the case if the phase space is not Euclidean. It can already be seen from the correspondence principle. Let us, for example, set $V=0$ in the above model. In addition to the straight trajectory connecting the initial point $h_1\in K^+$ and the final point $h_2\in K^+$, there are $2l-1$ trajectories which involve several reflections from the boundary of the Weyl chamber. Since no change of the physical state occurs at the very moment of the reflection, these trajectories would also be acceptable classical trajectories contributing to the semiclassical transition amplitude on the same footing as the straight one. Note that the reflected trajectories can be viewed as straight lines connecting the points $\hat R h_1$, $\hat R\in W$, and $h_2$ and, hence, they satisfy the classical equations of motion. Using the Weyl symmetry they can be mapped into piecewise straight continuous trajectories inside of the Weyl chamber.
The contribution of these trajectories makes it impossible to factorize the semiclassical transition amplitude because the reflected trajectories cannot be associated with an excitation of any particular Cartesian degree of freedom $h^i$ (see Fig. 9).

4.6. Gauge invariant canonical variables for groups of rank 2

The analysis of classical dynamics of the isotropic harmonic oscillator shows that independent excitations of the Cartesian degrees of freedom $h^1$ and $h^2$ are impossible due to the non-Euclidean structure of their physical phase space. If, say, $p_{h^2}=\dot h^2=0$ at the initial moment of time, then after hitting the boundary of the Weyl chamber, the momentum $p_{h^1}=\dot h^1$ will be re-distributed between both physical degrees of freedom, thus exciting the $h^2$-degree of freedom. This occurs not due to an action of any local potential force (it can even be zero), but rather due to the non-Euclidean structure of the physical phase space. This specific kinematic coupling implies that the independent physical excitations turn out to be collective excitations of the original degrees of freedom. Here we show that the collective excitations are described by composite gauge invariant variables. The goal is therefore to demonstrate that the kinematic coupling is important to maintain the gauge invariance of the Hamiltonian dynamics of physical canonical variables.

In the Hamiltonian (4.27) for groups of rank $r=2$ we introduce new gauge invariant variables [18]

$$\Phi_1=(\mathrm{tr}\,x^2)^{1/2},\qquad \Phi_2=\Phi_1^{-l}\,\mathrm{tr}\,x^l, \tag{4.33}$$

where $l$ is the degree of the second independent Casimir polynomial. The use of a matrix representation is just a matter of technical convenience. The invariant independent (Casimir) polynomials can also be written via symmetric invariant irreducible tensors, $\mathrm{tr}\,x^2\sim\delta_{ab}x^ax^b$ and $\mathrm{tr}\,x^l\sim d^{(l)}_{a_1\cdots a_l}x^{a_1}\cdots x^{a_l}$. Every symmetric invariant tensor in a Lie algebra can be decomposed over
the basis formed by irreducible symmetric invariant tensors [32]. The ranks of the irreducible tensors are the orders of the independent Casimir polynomials of the Lie algebra. The canonical momenta conjugate to the new variables read

$$\pi_i=\frac{\mathrm{tr}\,(p\,e_i)}{\mathrm{tr}\,e_i^2},\qquad e_i=\frac{\partial\Phi_i}{\partial x}\equiv\lambda_a\frac{\partial\Phi_i}{\partial x^a},\qquad i=1,2. \tag{4.34}$$

By straightforward computation one can convince oneself that the elements $e_i$ possess the following properties:

$$\mathrm{tr}\,(e_1e_2)=0,\qquad [e_1,e_2]=0. \tag{4.35}$$

Therefore, they can serve as the local basis in the Cartan subalgebra $H$. One can also show that

$$\mathrm{tr}\,e_1^2=1,\qquad \mathrm{tr}\,e_2^2=\frac{l^2}{\Phi_1^2}\,\bigl(c_2+c_1\Phi_2-\Phi_2^2\bigr)\equiv\frac{l^2}{\Phi_1^2}\,\bigl(a-[\Phi_2-b]^2\bigr), \tag{4.36}$$

where $b=c_1/2$, $a=c_2+c_1^2/4$, and the constants $c_{1,2}$ depend on the structure constants and specify the decomposition of the gauge invariant polynomial $[\mathrm{tr}\,(\lambda_ax^{l-1})]^2=(c_1\Phi_2+c_2)\,\Phi_1^{2(l-1)}$ over the basis polynomials $\mathrm{tr}\,x^2$ and $\mathrm{tr}\,x^l$. For example, for SU(3) we have $l=3$ and $c_1=0$, $c_2=1/6$. This can be verified by a straightforward computation in the matrix representation.

Let us decompose the canonical momentum $p$ over the basis $e_i$, $p=\pi_ie_i+\tilde p$, where $\mathrm{tr}\,e_i\tilde p=0$. A solution to the constraint equation $[p,x]=[\tilde p,x]=0$ is $\tilde p=0$. That is, all the components of $p$ orthogonal to the Cartan basis elements $e_i$ must vanish since the commutator of $\tilde p$ and $x\sim e_1$ does not belong to the Cartan subalgebra. The physical Hamiltonian of an isotropic harmonic oscillator, $V=\mathrm{tr}\,x^2/2=\Phi_1^2/2$, assumes the form

$$H_{ph}=\frac12\,\pi_1^2+\frac{l^2\pi_2^2}{2\Phi_1^2}\,\bigl(a-[b-\Phi_2]^2\bigr)+\frac12\,\Phi_1^2. \tag{4.37}$$

From the positivity of the norm $\mathrm{tr}\,e_2^2\ge0$ we infer the condition

$$-1\le(\Phi_2-b)/\sqrt a\le1. \tag{4.38}$$

The Hamiltonian equations of motion are
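$$\dot\pi_1=\{\pi_1,H_{ph}\}=-\Phi_1+\frac{l^2\pi_2^2}{\Phi_1^3}\,\bigl(a-[b-\Phi_2]^2\bigr),\qquad \dot\Phi_1=\{\Phi_1,H_{ph}\}=\pi_1; \tag{4.39}$$

$$\dot\pi_2=\{\pi_2,H_{ph}\}=-\frac{l^2\pi_2^2}{\Phi_1^2}\,[b-\Phi_2],\qquad \dot\Phi_2=\{\Phi_2,H_{ph}\}=\frac{l^2\pi_2}{\Phi_1^2}\,\bigl(a-[b-\Phi_2]^2\bigr). \tag{4.40}$$

The sign of $\dot\pi_2$ follows from $\dot\pi_2=-\partial H_{ph}/\partial\Phi_2$ applied to (4.37). The equations admit the following oscillating solutions independently for each degree of freedom:

$$\Phi_2(t)=\pi_2(t)=0; \tag{4.41}$$

$$\Phi_1(t)=\sqrt{2E}\,|\cos t|,\qquad \pi_1(t)=-\sqrt{2E}\,\sin t\;\varepsilon(\cos t), \tag{4.42}$$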
where $E$ is the energy and $\varepsilon$ denotes the sign function; and

$$\Phi_1(t)=\sqrt E,\qquad \pi_1(t)=0; \tag{4.43}$$

$$\Phi_2(t)=b+\sqrt a\,\cos lt,\qquad \pi_2(t)=-\frac{E}{l\sqrt a\,\sin lt}. \tag{4.44}$$
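As a cross-check, one can verify numerically that the second solution, (4.43) and (4.44), has energy $E$ under the Hamiltonian (4.37), obeys the equation of motion for $\Phi_2$, and encloses the phase-space area $2\pi E/l$ over one period $2\pi/l$ (the quantity that enters the semiclassical quantization of Section 4.7). The sketch below assumes the SU(3) values quoted above ($l=3$, $c_1=0$, $c_2=1/6$, hence $b=0$, $a=1/6$); the function names, the sample energy and the time grids are illustrative choices, not part of the text.

```python
import numpy as np

# SU(3) values quoted in the text: l = 3, c1 = 0, c2 = 1/6  =>  b = 0, a = 1/6.
L_DEG, A_CONST, B_CONST = 3, 1.0 / 6.0, 0.0

def hamiltonian(phi1, pi1, phi2, pi2):
    """Physical Hamiltonian (4.37)."""
    g = A_CONST - (B_CONST - phi2) ** 2
    return 0.5 * pi1**2 + L_DEG**2 * pi2**2 * g / (2.0 * phi1**2) + 0.5 * phi1**2

def angular_solution(t, energy):
    """Solution (4.43)-(4.44): Phi_1 is frozen, Phi_2 oscillates with frequency l."""
    phi1 = np.sqrt(energy) * np.ones_like(t)
    pi1 = np.zeros_like(t)
    phi2 = B_CONST + np.sqrt(A_CONST) * np.cos(L_DEG * t)
    pi2 = -energy / (L_DEG * np.sqrt(A_CONST) * np.sin(L_DEG * t))
    return phi1, pi1, phi2, pi2

def velocity_phi2(phi1, phi2, pi2):
    """Right-hand side dH/dpi_2 of the equation of motion for Phi_2."""
    return L_DEG**2 * pi2 * (A_CONST - (B_CONST - phi2) ** 2) / phi1**2

E = 1.7  # sample energy, arbitrary

# Stay away from sin(l t) = 0, where pi_2 is singular.
t = np.linspace(0.1, np.pi / L_DEG - 0.1, 400)
phi1, pi1, phi2, pi2 = angular_solution(t, E)

energy_error = np.max(np.abs(hamiltonian(phi1, pi1, phi2, pi2) - E))
exact_velocity = -L_DEG * np.sqrt(A_CONST) * np.sin(L_DEG * t)  # d(Phi_2)/dt
velocity_error = np.max(np.abs(velocity_phi2(phi1, phi2, pi2) - exact_velocity))

# Action over one period 2*pi/l, midpoint rule (midpoints avoid sin(l t) = 0).
n_steps = 2000
dt = 2.0 * np.pi / L_DEG / n_steps
tm = (np.arange(n_steps) + 0.5) * dt
_, _, _, pi2m = angular_solution(tm, E)
dphi2_dt = -L_DEG * np.sqrt(A_CONST) * np.sin(L_DEG * tm)
action = np.sum(pi2m * dphi2_dt) * dt

print(energy_error, velocity_error, action, 2.0 * np.pi * E / L_DEG)
```

Along this solution the integrand $\pi_2\,\dot\Phi_2$ equals $E$ identically, so the singularity of $\pi_2$ at $\sin lt=0$ does not spoil the action integral.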
The absolute value bars in (4.42) are necessary because $\Phi_1$ is positive. One can easily see that the independent frequencies are the degrees of the independent Casimir polynomials, 2 and $l$. Clearly, the variable $\cos^{-1}[(\Phi_2-b)/\sqrt a]$ can be associated with the angular variable introduced in the previous section and $\Phi_1$ with the radial variable.

Thanks to the gauge invariance of the new variables $\Phi_{1,2}$ we may always set $x=h$ in (4.33) and $p=p_h$ in (4.34), thus establishing the canonical transformation between the two sets of canonical variables. The kinematic coupling between $\Phi_{1,2}$ is absent. However, to excite either of $\Phi_{1,2}$ independently, excitations of both Cartesian degrees of freedom $h_{1,2}$ are needed. Thus, the removal of the kinematic coupling is equivalent to the restoration of the explicit gauge invariance. In Sections 7.3 and 7.4 we show that this remarkable feature has an elegant group theoretical explanation based on the theorem of Chevalley. The mathematical fact is that, if one attempts to construct all polynomials of $h$ invariant relative to the Weyl group, which specify wave functions of the physical excitations of the harmonic oscillator, then one would find that all such polynomials are polynomials of the elementary ones $\mathrm{tr}\,h^2$ and $\mathrm{tr}\,h^l$ [32]. Since the orders of the polynomials determine the energy levels of the harmonic oscillator, we anticipate that the spectrum must be of the form $2n_1+ln_2$, where $n_{1,2}$ are nonnegative integers.

Remark. The canonical variables (4.33) and (4.34), though being explicitly gauge invariant and describing independent physical excitations of the harmonic oscillator, can be regarded as just another possible set of the local canonical coordinates on the non-Euclidean physical phase space. As one can see from (4.42) and (4.44), there are singularities in the phase space trajectories in these variables too.
One can actually find arguments similar to those given at the end of Section 3.2 to show that there are no canonical coordinates on the hyperconic phase space in which the phase space trajectories are free of singularities. The singularities can be removed by introducing a non-canonical symplectic structure on the physical phase space (cf. Section 3.3 and see Section 6.4 for a generalization).

4.7. Semiclassical quantization

Having chosen the set $h$, $p_h$ of local canonical variables to describe elementary excitations of the physical degrees of freedom, we have found a specific kinematic coupling as a consequence of the non-Euclidean structure of the physical phase space. If now we proceed to quantize the system in these variables, it is natural to expect some effects caused by the kinematic coupling. Let us take a closer look at them.

The Bohr-Sommerfeld quantization rule is coordinate-free, i.e., invariant under canonical transformations. So we take advantage of this property and go over to the new canonical variables $\Phi_i$, $\pi_i$ from $p_h$, $h$. Note that due to the gauge invariance of the new variables we can always
replace $x$ and $p$ in (4.33) and (4.34) by $h$ and $p_h$, respectively. We have

$$W=\oint(p_h,dh)=\oint(\pi_1\,d\Phi_1+\pi_2\,d\Phi_2)=2\pi\hbar n, \tag{4.45}$$
where $n$ is a non-negative integer. Here we have also omitted the vacuum energy [15]. For an ordinary isotropic oscillator of unit frequency, one can find that $E=n\hbar=(n_1+n_2)\hbar$, where $n_{1,2}$ are non-negative integers, just by applying the rule (4.45) to an independent periodic motion of each degree of freedom. That is, the functional $W$ is calculated for the motion of one degree of freedom of the energy $E$, while the motion of the other degree of freedom is suppressed by an appropriate choice of the initial conditions. Then the same procedure applies to the other degree of freedom. So the total energy $E$ of the system is attained through exciting only one degree of freedom in the above procedure.

Although the independent excitations of the components of $h$ are impossible, the new canonical variables can be excited independently. Denoting $\oint\pi_i\,d\Phi_i=W_i(E)$ (no summation over $i$), we take the phase-space trajectory (4.42) and find

$$W(E)=W_1(E)=\pi E=2\pi\hbar n_1. \tag{4.46}$$

For the other degree of freedom we have the trajectory (4.44), which leads to

$$W(E)=W_2(E)=2\pi l^{-1}E=2\pi\hbar n_2. \tag{4.47}$$

Therefore we conclude that
$$E=\hbar\,(2n_1+ln_2). \tag{4.48}$$

Up to the ground state energy the spectrum coincides with the spectrum of two harmonic oscillators with frequencies 2 and $l$, being the degrees of the independent Casimir polynomials for groups of rank 2. We will see that the same conclusion follows from the Dirac quantization method for gauge systems without an explicit parameterization of the physical phase space.

4.8. Gauge matrix models. Curvature of the orbit space and the kinematic coupling

So far we have considered gauge models whose physical configuration space is flat. Here we give a few simple examples of gauge models with a curved gauge orbit space. Another purpose of considering these models is to elucidate the role of a non-Euclidean metric on the physical configuration space in the kinematic coupling between the physical degrees of freedom.

To begin with, let us take a system of two particles in the plane with the Lagrangian being the sum of two Lagrangians (3.1), where $N=2$ [17,16],

$$L=\tfrac12\,(D_tx_1)^2+\tfrac12\,(D_tx_2)^2-V_1(x_1^2)-V_2(x_2^2), \tag{4.49}$$

which is invariant under the gauge transformations
$$x_q\to e^{T\omega}x_q,\qquad y\to y+\dot\omega,\qquad q=1,2. \tag{4.50}$$

The gauge transformations are simultaneous rotations of the vectors $x_{1,2}$. By going over to the Hamiltonian formalism one easily finds that the system has two first-class constraints

$$\pi=\frac{\partial L}{\partial\dot y}=0,\qquad \sigma=(p_1,Tx_1)+(p_2,Tx_2)=0. \tag{4.51}$$
The second constraint means that the physical motion has zero total angular momentum. The Hamiltonian of the system reads

$$H=\tfrac12\,(p_1^2+p_2^2)+V_1(x_1^2)+V_2(x_2^2)+y\sigma\equiv H_1+H_2, \tag{4.52}$$

where each $H_i$ coincides with (3.17). The coupling between the degrees of freedom occurs only through the constraint.

The physical phase space of the system is the quotient $\mathbb{R}^2\oplus\mathbb{R}^2|_{\sigma=0}/SO(2)$, where the gauge transformations are simultaneous SO(2) rotations of all four vectors $x_q$ and $p_q$. To introduce a local parameterization of the physical phase space by canonical coordinates, we observe that by a suitable gauge transformation the vector $x_1$ can be directed along the first coordinate axis, i.e., $x_1^{(2)}=0$. Here we label the components of the vector $x_q$ as $x_q^{(i)}$. So, the phase space of physical degrees of freedom can be determined by two conditions

$$x_1^{(2)}=0,\qquad p_1^{(2)}=-\frac{1}{x_1^{(1)}}\,(p_2,Tx_2). \tag{4.53}$$

The second equation follows from the constraint $\sigma=0$. The gauge condition still allows discrete gauge transformations generated by the rotations through the angles $n\pi$ ($n$ is an integer). It is important to understand that the residual gauge transformations on the hypersurface (4.53) do not act only on $p_1^{(1)}$ and $x_1^{(1)}$ changing their sign, but rather they apply to all degrees of freedom simultaneously: $x_q\to\pm x_q$ and $p_q\to\pm p_q$. The physical phase space cannot be split into a cone and two planes. It is isomorphic to the quotient
$$PS_{phys}\sim\mathbb{R}^3\oplus\mathbb{R}^3/\mathbb{Z}_2. \tag{4.54}$$

The residual gauge symmetry forbids independent excitations of the canonical variables chosen. Only pairwise excitations, like $x_1^{(1)}x_2^{(1)}$, are invariant under the residual gauge transformations. So, we have the familiar kinematic coupling of the physical degrees of freedom. Accordingly, if one takes the potentials $V_{1,2}$ as those of the harmonic oscillators, only pairwise collective excitations of the oscillators are allowed by the gauge symmetry, which is most easily seen in the Fock representation of the quantum theory (see Section 7.1).

In addition to the kinematic coupling induced by the non-Euclidean structure of the physical phase space, there is another source for the kinematic coupling which often occurs in gauge theories. Making use of (4.23) we calculate the metric on the orbit space in the parameterization (4.53). Let us introduce a three-vector $q$ whose components $q^a$, $a=1,2,3$, are, respectively, $x_1^{(1)}$, $x_2^{(1)}$, $x_2^{(2)}$. Then [17]

$$g^{ph}_{ab}=\frac{1}{q^2}\begin{pmatrix} q^2 & 0 & 0\\ 0 & q^2-(q^3)^2 & q^3q^2\\ 0 & q^3q^2 & q^2-(q^2)^2\end{pmatrix}, \tag{4.55}$$

where $q^2$ standing alone denotes the squared length $(q,q)$, while $(q^2)^2$, $(q^3)^2$ and the product $q^3q^2$ are built from the components of $q$.

The metric (4.55) is not flat. The scalar curvature is $R=6/q^2$. Since the metric is not diagonal, the reduction of the kinetic energy onto the physical phase space spanned by the chosen canonical variables will induce the coupling between physical degrees of freedom: $p_1^2+p_2^2=g^{ab}_{ph}p_ap_b$, where $p_a$ are canonical momenta for $q^a$. Thus, the physical Hamiltonian is no longer a sum of the Hamiltonians of each degree of freedom. The degrees of freedom described by $q$ cannot be excited
independently. It is possible to find a new parameterization of the orbit space where the kinematic coupling caused by both the non-Euclidean structure of the phase space and the metric of the orbit space is absent [17] (the metric is diagonal in the new variables). The new variables are related to the $q$'s by a non-linear transformation and are naturally associated with the independent Casimir polynomials in the model (see Section 7.1). In this regard, the model under consideration and the one discussed in Section 4.5 are similar. So we will not go into technical details.

Let us calculate the induced volume element on the orbit space. As before, it does not coincide with $\sqrt{\det(g^{ph}_{ab})}=q^1/\sqrt{q^2}$ because the volume of the gauge orbit through a configuration space point depends on that point [13]. Consider a matrix $x$ with the components $x_{ij}=x_j^{(i)}$, i.e., the columns of $x$ are the vectors $x_j$. Then the gauge transformation law is written in a simple form $x\to\exp(\omega T)\,x$. For this reason we will also refer to the model (4.49) as a gauge matrix model. For a generic point $x$ of the configuration space one can find a gauge transformation such that the transformed configuration satisfies the condition $x_{21}=0$. Therefore

$$x=e^{\theta T}\begin{pmatrix} q^1 & q^2\\ 0 & q^3\end{pmatrix}\equiv e^{\theta T}\rho, \tag{4.56}$$

where the coordinates $q^a$ span the gauge orbit space. The volume element $\mu(q)\,dq\,d\theta$ can be found by taking the square root of the determinant of the Euclidean metric $\mathrm{tr}\,(dx^Tdx)=\mathrm{tr}\,\{(d\rho+T\rho\,d\theta)^T(d\rho+T\rho\,d\theta)\}$, where $x^T$ is the transposed matrix $x$, in the new curvilinear coordinates. After a modest computation, similar to (4.21), we get the Jacobian $\mu(q)=q^1$. Note that the variable $q$ is gauge invariant in this approach, while $\theta$ spans the gauge orbits. We have $q^2=\mathrm{tr}\,(x^Tx)$ and therefore the scalar curvature can also be written in the gauge invariant way $R=6/\mathrm{tr}\,(x^Tx)$. Clearly, the curvature must be gauge invariant because it is a parameterization independent characteristic of the gauge orbit space. The Jacobian $\mu$ vanishes at $q^1=0$.
Its zeros form a plane in the space of $q^a$. On this plane the change of variables (4.56) is degenerate, which also indicates that the gauge $x_{21}=0$ is not complete on the plane $x_{11}=0$. Indeed, at the singular points $x_{11}=x_1^{(1)}=0$, so the constraint cannot be solved for the nonphysical momentum $p_1^{(2)}$ and is reduced to $\sigma=(x_2,Tp_2)$, which generates the continuous SO(2) rotations on the plane $x_1=0$. Such gauge transformations are known as the residual gauge transformations within the Gribov horizon [37,100,198]. Given a set of constraints $\sigma_a$ and the gauge conditions $\chi_a=0$ (such that $\{\chi_a,\chi_b\}=0$ [68]), the Faddeev-Popov determinant is $\det\{\chi_a,\sigma_b\}\equiv\Delta_{FP}$. Zeros of $\Delta_{FP}$ on the gauge fixing surface $\chi_a=0$ form the Gribov horizon (or horizons, if the set of zeros is disconnected). It has codimension one (or higher) on the surface $\chi_a=0$. Within the Gribov horizon the gauge is not complete, and continuous gauge transformations may still be allowed [37]. Consequently, there are identifications within the Gribov horizon, which may, in general, lead to a non-trivial topology of the gauge orbit space [100] (see an example in Section 10.3). In our case, $\chi=x_1^{(2)}$ and, hence, $\Delta_{FP}=x_1^{(1)}$, i.e., it coincides with the Jacobian $\mu$. This is a generic feature of gauge theories: the Faddeev-Popov determinant specifies the volume element on the gauge orbit space [13].

In our parameterization, the orbit space is isomorphic to the half-space $x_1^{(1)}>0$ modulo boundary identifications. To make the latter we can, e.g., make an additional gauge fixing on the plane $x_1^{(1)}=0$, say, by requiring $x_2^{(2)}=0$ [182]. We are left with the discrete gauge transformations $x_2^{(1)}\to-x_2^{(1)}$. Therefore, every half-plane formed by positive values of $x_1^{(1)}$ and values of $x_2^{(1)}$ would have the gauge equivalent half-axes $x_2^{(1)}>0$ and $x_2^{(1)}<0$ on its edge $x_1^{(1)}=0$. Identifying them we
get the cone unfoldable into a half-plane. So the orbit space has no boundaries, and there is one singular point (the origin) where the curvature is infinite. The topology of the gauge orbit space is trivial.

On the Gribov horizon, the physical phase space of the model also exhibits the conic structure. On the horizon $x_1=0$ the constraint is reduced to $\sigma=(x_2,Tp_2)$, so we get the familiar situation discussed in Section 3.2: one particle on the plane with the gauge group SO(2). The corresponding physical phase space is a cone unfoldable into a half-plane, $\mathbb{R}^4|_{\sigma=0}/SO(2)\sim\mathrm{cone}(\pi)$.

Another interesting matrix gauge model can be obtained from the Yang-Mills theory under the condition that all vector potentials depend only on time [38,39]. The orbit space in this model has been studied by Soloviev [37]. The analysis of the physical phase space structure and its effects on quantum theory can be found in [72,10,16]. The orbit space of several gauge matrix models is discussed in the work of Pause and Heinzl [182]. It is also noteworthy that gauge matrix models appear in the theory of 11-dimensional supermembranes [40,41], in the dynamics of D-particles [42] and in the matrix theory [43] describing some important properties of the superstring theory. The geometrical structure of the physical configuration and phase space of these models does not exhibit essentially new features. The details are easy to obtain by the method discussed above.
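The Jacobian $\mu(q)=q^1$ and the orbit-space metric (4.55) of the gauge matrix model (4.49) can be checked numerically. The sketch below builds the Euclidean metric $\mathrm{tr}\,(dx^Tdx)$ in the coordinates $(q^1,q^2,q^3,\theta)$ of (4.56) by finite differences, takes the square root of its determinant, and then projects out the gauge direction $\theta$. Identifying this projection (the Schur complement with respect to $\theta$) with the metric obtained from (4.23) is an assumption made here, as are the explicit sign convention for the generator `T`, the sample point `u0` and all function names.

```python
import numpy as np

# Generator of SO(2) rotations; the sign convention is an assumption.
T = np.array([[0.0, -1.0], [1.0, 0.0]])

def rotation(theta):
    """exp(theta*T); valid because T @ T = -identity."""
    return np.cos(theta) * np.eye(2) + np.sin(theta) * T

def config(u):
    """Flattened matrix x = exp(theta T) rho of (4.56); u = (q1, q2, q3, theta)."""
    q1, q2, q3, theta = u
    rho = np.array([[q1, q2], [0.0, q3]])
    return (rotation(theta) @ rho).ravel()

def induced_metric(u, eps=1e-6):
    """Metric tr(dx^T dx) in the coordinates u, via central differences."""
    u = np.asarray(u, dtype=float)
    jac = np.empty((4, 4))
    for k in range(4):
        step = np.zeros(4)
        step[k] = eps
        jac[:, k] = (config(u + step) - config(u - step)) / (2.0 * eps)
    return jac.T @ jac

def physical_metric(q):
    """Closed form (4.55); qq is the squared length of the three-vector q."""
    q1, q2, q3 = q
    qq = q1**2 + q2**2 + q3**2
    return np.array([[qq, 0.0, 0.0],
                     [0.0, qq - q3**2, q3 * q2],
                     [0.0, q3 * q2, qq - q2**2]]) / qq

u0 = np.array([0.8, -0.3, 1.1, 0.4])  # sample point with q1 > 0
g = induced_metric(u0)

# Volume density in the coordinates (q, theta): sqrt(det g) should equal q1.
jacobian_mu = np.sqrt(np.linalg.det(g))

# Project out the gauge direction theta (Schur complement of g_theta_theta).
g_orbit = g[:3, :3] - np.outer(g[:3, 3], g[:3, 3]) / g[3, 3]
metric_matches = np.allclose(g_orbit, physical_metric(u0[:3]), atol=1e-6)

print(jacobian_mu, metric_matches)
```

The same projected metric has determinant $(q^1)^2/q^2$, so its square root differs from $\mu(q)$ by the factor $1/\sqrt{q^2}$, in line with the remark that the orbit volume depends on the point.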
5. Yang-Mills theory in a cylindrical spacetime

The definition of the physical phase space as the quotient space of the constraint surface relative to the gauge group holds for gauge field theories, i.e., for systems with an infinite number of degrees of freedom. The phase space in a field theory is a functional space, and this gives rise to considerable technical difficulties when calculating the quotient space. One has to specify a functional class to which elements of the phase space, being a pair of functions of the spatial variables, belong. In classical theory it can be a space of smooth functions [12] (e.g., to make the energy functional finite). However, in quantum field theory the corresponding quotient space appears to be of no use, say, in the path integral formalism because the support of the path integral measure typically lies in a Sobolev functional class [44,45,34], i.e., in the space of distributions, where smooth classical configurations form a zero-measure subset. To circumvent this apparent difficulty, one can, for instance, discretize the space or compactify it into a torus (and truncate the number of Fourier modes), thus making the number of degrees of freedom finite. This would make a gauge field model look more like the mechanical models considered above, where the quotient space can be calculated.

The simplest example of this type is the Yang-Mills theory on a cylindrical spacetime (space is compactified to a circle $S^1$) [46-54]. Note that in two dimensional spacetime Yang-Mills theory does not have physical degrees of freedom, unless the spacetime has a non-trivial topology [55-61]. In the Hamiltonian approach, only space is compactified, thus leading to a cylindrical spacetime. We shall establish the $PS_{phys}$ structure of this theory in the case of an arbitrary compact semisimple gauge group [52,54]. The Lagrangian reads

$$L=-\frac14\int_0^{2\pi l}dx\,(F_{\mu\nu},F_{\mu\nu})\equiv-\frac14\,\langle F_{\mu\nu},F_{\mu\nu}\rangle, \tag{5.1}$$
where $F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu-ig[A_\mu,A_\nu]$, $g$ is a coupling constant, $\mu,\nu=0,1$; the Yang-Mills potentials $A_\mu$, being elements of a Lie algebra $X$, are periodic functions of a spatial coordinate, $A_\mu(t,x+2\pi l)=A_\mu(t,x)$, i.e., $l$ is the space radius; the parentheses $(\,,)$ in the integrand of (5.1) stand for the invariant inner product in $X$. We assume it to be the Killing form introduced in Section 3.2. In a matrix representation, one can always normalize it to be a trace. Since the vector potential is a periodic function in space, it can be decomposed into a Fourier series. The Fourier components of $A_\mu$ are regarded as independent (Cartesian) degrees of freedom in the theory.

To go over to the Hamiltonian formalism, we determine the canonical momenta $E_\mu=\delta L/\delta\dot A^\mu=F_{0\mu}$; the overdot denotes the time derivative. The momentum conjugate to $A_0$ vanishes, $E_0=0$, forming the primary constraints. The canonical Hamiltonian has the form

$$H=\langle E_\mu,\dot A^\mu\rangle-L=\langle E_1,E_1\rangle/2-\langle A_0,\sigma\rangle, \tag{5.2}$$

where $\sigma=\nabla(A_1)E_1$ with $\nabla(A_1)=\partial_1-ig[A_1,\ ]$ being the covariant derivative in the adjoint representation. The primary constraints must be satisfied during the time evolution. This yields the secondary constraints

$$\dot E_0=\{E_0,H\}=\partial_1E_1-ig[A_1,E_1]=\sigma=0, \tag{5.3}$$

where the standard symplectic structure
$$\{A^{a\mu}(x),E^b_\nu(y)\}=\delta^{ab}\delta^\mu_\nu\,\delta(x-y),\qquad x,y\in S^1, \tag{5.4}$$

has been introduced, and the suffices $a,b$ enumerate Lie algebra components. The constraints are in involution

$$\{\sigma_a(x),\sigma_b(y)\}=if_{ab}{}^c\,\delta(x-y)\,\sigma_c(x),\qquad \{\sigma_a,H\}=-f_{ab}{}^cA_0^b\sigma_c, \tag{5.5}$$

with $f_{ab}{}^c$ being the structure constants of $X$. We conclude that there are no more constraints in the theory, and all constraints are of the first class.

The primary and secondary (first-class) constraints are independent generators of gauge transformations. As in the mechanical models, the primary constraints $E_0^a=0$ generate shifts of the Lagrange multipliers $A_0^a$: $\delta A_0^a(x)=\{A_0^a,\langle\omega_0,E_0\rangle\}=\omega_0^a(x)$, and leave the phase space variables $E_\mu^a$ and $A_1^a$ untouched. Therefore, the hyperplane $E_0^a=0$ spanned by $A_0^a$ in the total phase space is the gauge orbit. We can discard $A_0^a$ and $E_0^a$ as pure non-physical degrees of freedom and concentrate our attention on the remaining variables.

To simplify the notation, from now on we omit the Lorentz suffix "1" of the field variables, i.e., instead of $E_1$ and $A_1$ we write just $E$ and $A$. The constraints (5.3) generate the following gauge transformations:

$$E\to\Omega E\Omega^{-1}=E^\Omega,\qquad A\to\Omega A\Omega^{-1}+\frac ig\,\Omega\,\partial\Omega^{-1}=A^\Omega. \tag{5.6}$$
Here and below $\partial_1\equiv\partial$, while the overdot is used to denote the time derivative $\partial_0$; $\Omega=\Omega(x)$ takes its values in a semisimple compact group $G$ ($X$ is its Lie algebra). The gauge transformed variables $E^\Omega$ and $A^\Omega$ must also be periodic functions of $x$. This results in the periodicity of $\Omega$ modulo the center $Z_G$ of $G$:

$$\Omega(x+2\pi l)=z\,\Omega(x),\qquad z\in Z_G. \tag{5.7}$$
Indeed, by definition an element $z$ from the center commutes with any element of $X$ and, therefore, $E^\Omega$ and $A^\Omega$ are invariant under the shift $x\to x+2\pi l$. The relation (5.7) is called a twisted boundary condition [62]. The twisted gauge transformations (i.e., satisfying (5.7) with $z\ne e$, $e$ a group unit) form distinct homotopy classes. Therefore they cannot be continuously deformed towards the identity. On the other hand, gauge transformations generated by the constraints (5.3) are homotopically trivial because they are built up by iterating the infinitesimal transformations [8] $\delta E=\{E,\langle\omega,\sigma\rangle\}=ig[E,\omega]$ and $\delta A=\{A,\langle\omega,\sigma\rangle\}=-\nabla(A)\omega$ with $\omega$ being an $X$-valued periodic function of $x$. Thus, we are led to the following conclusion. When determining $PS_{phys}$ as the quotient space, one should restrict oneself to periodic (i.e., homotopically trivial) gauge transformations. Such transformations determine a mapping $S^1\to G$. The collection of all such transformations is called a gauge group and will be denoted $\mathcal G$, while the abstract group $G$ is usually called a structure group of the gauge theory. Yet we shall see that quantum states annihilated by the operators of the constraints (these are the Dirac physical states) are not invariant under the twisted gauge transformations.

Consider a periodic function $f(x)$ taking its values in $X$. It is expanded into a Fourier series
\[ f(x) = f_0 + \sum_{n=1}^{\infty}\left( f_{s,n}\sin\frac{nx}{l} + f_{c,n}\cos\frac{nx}{l} \right). \tag{5.8} \]
We denote the space of functions (5.8) by $F$ and its finite-dimensional subspace formed by constant functions by $F_0$, so that $A = A_0 + \tilde{A}$, where $A_0 \in F_0$ and $\tilde{A} \in F \ominus F_0$. For a generic connection $A(x)$, we can always find a periodic gauge element $\Omega(x)$ such that the gauge transformed connection $A^{\Omega}$ is homogeneous in space,
\[ \partial A^{\Omega} = 0. \tag{5.9} \]
This also means that the Coulomb gauge fixing surface $\partial A = 0$ intersects each gauge orbit at least once. To find $\Omega(x)$, we set
\[ \omega = -\frac{i}{g}\,\Omega^{-1}\partial\Omega \in X \tag{5.10} \]
and, hence,
\[ \Omega(x) = \mathrm{P}\exp\left( ig\int_0^x \omega(x')\,dx' \right). \tag{5.11} \]
The path-ordered exponential (5.11) is defined similarly to the time-ordered exponential in Section 3.1. They differ only by integration variables. After simple algebraic transformations, Eq. (5.9) can be written in the form
\[ \nabla(A)\omega = \partial\omega - ig[A, \omega] = -\partial A, \tag{5.12} \]
which has to be solved for the Lie algebra element $\omega(x)$. It is a linear non-homogeneous differential equation of first order. So its general solution is a sum of a general solution of the corresponding homogeneous equation and a particular solution of the non-homogeneous equation. Introducing the group element
\[ U_A(x) = \mathrm{P}\exp\left( ig\int_0^x dx'\,A(x') \right), \tag{5.13} \]
that has simple properties $\partial U_A = igAU_A$ and $\partial U_A^{-1} = -igU_A^{-1}A$, the general solution can be written as
\[ \omega(x) = U_A(x)\,\omega_0\,U_A^{-1}(x) - A(x). \tag{5.14} \]
The first term, containing an arbitrary constant Lie algebra element $\omega_0$, represents a solution of the homogeneous equation, while the second term is the particular solution of the non-homogeneous equation. The constant $\omega_0$ should be chosen so that the group element constructed would satisfy the periodicity condition, which yields
\[ \Omega(2\pi l) = \mathrm{P}\exp\left( ig\oint dx\,\omega \right) = e, \tag{5.15} \]
where $e$ is the group unit. This specifies completely the function $\omega(x)$ and, hence, $\Omega(x)$ for any generic $A(x)$. So, any configuration $A \in F$ can be reduced towards a spatially homogeneous configuration by means of a gauge transformation.

Now we shall prove that the gauge reduction of $A$ to a homogeneous connection $A_0 \in F_0$ leads to a simultaneous gauge reduction of the momentum $E$ to $E_0 \in F_0$ on the constraint surface. To this end, we substitute the gauge transformed canonical pair $A^{\Omega} = A_0 \in F_0$, $E^{\Omega}$ into the constraint equation $\nabla(A)E = 0$ and obtain
\[ \nabla(A_0)E^{\Omega} = 0. \tag{5.16} \]
The momentum variable is then divided into a homogeneous part $E_0$ and a non-homogeneous one $\tilde{E}^{\Omega} = E^{\Omega} - E_0$. For these two components one obtains two independent equations from Eq. (5.16):
\[ \sigma_0 \equiv [A_0, E_0] = 0, \tag{5.17} \]
\[ \partial\tilde{E}^{\Omega} - ig[A_0, \tilde{E}^{\Omega}] = \nabla(A_0)\tilde{E}^{\Omega} = 0. \tag{5.18} \]
The first equation stems from the $F_0$-component of the constraint equation (5.16), while the second one is the constraint in the subspace $F \ominus F_0$. A general solution of Eq. (5.18) can be written in the form $\tilde{E}^{\Omega}(x) = U_0(x)\,\tilde{E}_0^{\Omega}\,U_0^{-1}(x)$, where $U_0(x) = \exp[igA_0x]$ and $\partial\tilde{E}_0^{\Omega} = 0$. For a generic $A_0$, the solution is not periodic in $x$ for any constant $\tilde{E}_0^{\Omega} \neq 0$. Since $\tilde{E}^{\Omega}(x)$ must be a periodic function, the constant $\tilde{E}_0^{\Omega}$ should necessarily vanish. Thus, Eq. (5.18) has only the trivial solution $\tilde{E}^{\Omega} = 0$, and $E^{\Omega} = E_0 \in F_0$.

A useful observation following from the above analysis is that the operator $\nabla(A_0)$ has no zero modes in the subspace $F \ominus F_0$ and, hence, is invertible. The determinant of the operator $\nabla(A_0)$ restricted to $F \ominus F_0$ does not vanish. We shall calculate it later when studying the metric on the physical configuration space.

We are led to a redundant system with $N = \dim X$ degrees of freedom and the constraint (5.17), which generates homogeneous gauge transformations of the phase-space variables $A_0$ and $E_0$ ($\partial\Omega \equiv 0$).³ This mechanical system has been studied in Section 3.
³ In Section 10.3 we discuss the special role of constant gauge transformations in detail in relation to a general analysis due to Singer [12]. Here we proceed to calculate the physical phase space as the quotient space (2.1) with respect to the full gauge group of the Lagrangian (5.1). In fact, in the path integral formalism we develop in Sections 8 and 9, there is no need to pay special attention to the constant gauge transformations, nor to the so-called reducible connections [176], which have a non-trivial stabilizer in the gauge group and, therefore, play a special role in Singer's analysis of the orbit space.

The system is shown to have
$r = \mathrm{rank}\,X$ physical degrees of freedom which can be described by Cartan subalgebra components of $A_0$ and $E_0$. Since any element of $X$ can be represented in the form $A_0 = \Omega_A a\,\Omega_A^{-1}$, with $a$ an element of the Cartan subalgebra $H$ and $\Omega_A \in G$, the configurations $A_0$ and $a$ belong to the same gauge orbit. Moreover, a spatially homogeneous gauge transformation with $\Omega = \Omega_A^{-1}$ brings the momentum $E_0$ on the constraint surface (5.17) to the Cartan subalgebra. Indeed, from (5.17) we derive $[a, \Omega_A^{-1}E_0\Omega_A] = 0$ and conclude that $p_a = \Omega_A^{-1}E_0\Omega_A \in H$ by the definition of $H$. The element $a$ has a stationary group, which is the Cartan subgroup of $G$. This means that not all of the constraints (5.17) are independent. Namely, there are just $N - r$, $r = \dim H$, independent constraints among (5.17). The continuous gauge arbitrariness is exhausted in the theory.

5.1. The moduli space

We expect the existence of a residual gauge freedom which cannot decrease the number of physical degrees of freedom, but might change the geometry of their configuration and phase spaces. If two homogeneous connections from the Cartan subalgebra, $a$ and $a_s$, belong to the same gauge orbit, then there should exist a gauge group element $\Omega_s(x)$ such that
\[ a_s = \Omega_s a\,\Omega_s^{-1} - \frac{i}{g}\,\Omega_s\partial\Omega_s^{-1}, \qquad \partial a_s = \partial a = 0, \qquad a_s, a \in H. \tag{5.19} \]
There are two types of solutions to this equation for $\Omega_s$. First, we can take homogeneous gauge group elements, $\partial\Omega_s = 0$. This problem has already been solved in Section 3. The homogeneous residual gauge transformations form the Weyl group. Thus, we conclude that the phase-space points $\hat{R}p_a$, $\hat{R}a$, where $\hat{R}$ ranges over the Weyl group, are gauge equivalent and should be identified when calculating the quotient space $\mathrm{PS}_{\rm phys}$. To specify the modular domain in the configuration space, we recall that the Weyl group acts simply transitively on the set of Weyl chambers [30, p. 458].
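The identification of Weyl-equivalent points can be illustrated with a small numerical sketch for $X = su(3)$. It is not part of the original analysis: the Cartesian coordinates chosen for the simple roots (norm squared $1/3$, mutual angle $2\pi/3$) and the helper names `reflect` and `to_weyl_chamber` are assumptions made for illustration only. The sketch applies reflections in the lines orthogonal to the simple roots until the point satisfies $(a, \omega) \geq 0$ for both simple roots.

```python
import math

# Assumed coordinates for the two simple roots of su(3): norm squared 1/3 and
# mutual angle 2*pi/3. The orientation of the axes is an illustrative choice.
S = 1.0 / math.sqrt(3.0)
SIMPLE_ROOTS = [(S, 0.0), (-S / 2.0, S * math.sqrt(3.0) / 2.0)]


def dot(u, v):
    """Euclidean scalar product in the Cartan subalgebra H ~ R^2."""
    return u[0] * v[0] + u[1] * v[1]


def reflect(a, root):
    """Reflect a in the line through the origin orthogonal to root."""
    c = 2.0 * dot(a, root) / dot(root, root)
    return (a[0] - c * root[0], a[1] - c * root[1])


def to_weyl_chamber(a, max_steps=20):
    """Apply simple reflections until (a, w) >= 0 for every simple root w."""
    for _ in range(max_steps):
        negative = [w for w in SIMPLE_ROOTS if dot(a, w) < -1e-12]
        if not negative:
            return a
        a = reflect(a, negative[0])
    raise RuntimeError("reduction did not terminate")


if __name__ == "__main__":
    print(to_weyl_chamber((-0.4, -0.9)))
```

Because every step is a reflection, the norm $(a, a)$ is preserved, and a point that already lies in the closed positive chamber is returned unchanged.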
Any element of $H$ can be obtained from an element of the positive Weyl chamber $K^+$ ($a \in K^+$ if $(a, \omega) > 0$ for all simple roots $\omega$) by a certain transformation from $W$. In other words, the Weyl chamber $K^+$ is isomorphic to the quotient $H/W$.

In contrast with the mechanical model of Section 3, the Weyl group does not cover the whole admissible discrete gauge arbitrariness in the 2D Yang–Mills theory. To find non-homogeneous solutions to Eq. (5.19), we take the derivative of it, thus arriving at the equation
\[ \partial\Omega_s\,a\,\Omega_s^{-1} + \Omega_s a\,\partial\Omega_s^{-1} - \frac{i}{g}\,\partial\left(\Omega_s\partial\Omega_s^{-1}\right) = 0. \tag{5.20} \]
To solve this equation, we introduce an auxiliary Lie algebra element $\omega_s = -(i/g)\,\Omega_s^{-1}\partial\Omega_s$. From (5.20) we infer that it satisfies the equation
\[ \nabla(a)\omega_s = 0. \tag{5.21} \]
For a generic $a$ from the Cartan subalgebra this equation has only a homogeneous solution, which we write in the form
\[ \omega_s = a_0\eta, \qquad a_0 = (gl)^{-1}, \qquad \eta \in H. \tag{5.22} \]
Note that Eq. (5.21) can always be transformed into two independent equations by setting $\omega_s = a_0\eta + \tilde{\omega}_s$, where $\eta \in F_0$ and $\tilde{\omega}_s \in F \ominus F_0$. As has been shown above, the operator $\nabla(A_0)$ has
no zero modes in the space $F \ominus F_0$, and, hence, neither does $\nabla(a) = \Omega_A^{-1}\nabla(A_0)\Omega_A$, which means that $\det\nabla(a) = \det\nabla(A_0) \neq 0$. So $\tilde{\omega}_s = 0$, whereas the homogeneous component satisfies the equation $\nabla(a)\eta = -ig[a, \eta] = 0$, that is, $\eta$ must be from the Cartan subalgebra because it commutes with a generic $a$. From the relation $\partial\Omega_s = ig\,\Omega_s\omega_s$ we find that
\[ \Omega_s(x) = \exp(iga_0\eta x). \tag{5.23} \]
This is still not the whole story because the group element we have found must obey the periodicity condition; otherwise it does not belong to the gauge group. The periodicity condition yields the restriction on the admissible values of $\eta$:
\[ \Omega_s(2\pi l) = \exp(2\pi i\eta) = e, \tag{5.24} \]
where $e$ stands for the group unit. The set of elements $\eta$ obeying this condition is called the unit lattice in the Cartan subalgebra [30, p. 305]. The non-homogeneous residual gauge transformations leave the canonical momentum $p_a$ untouched since $[p_a, \eta] = 0$ and shift the canonical coordinate, $a \to a + a_0\eta$, along the unit lattice in the Cartan subalgebra.

Consider a diagram $D(X)$ being a union of a finite number of families of equispaced hyperplanes in $H$ determined by $(\alpha, a) \in a_0\mathbb{Z}$, where $\alpha$ ranges over the root system and $\mathbb{Z}$ stands for the set of all integers. Consider then a group $T_e$ of translations in $H$, $a \to a + a_0\eta$, where $\eta$ belongs to the unit lattice. The group $T_e$ leaves the diagram $D(X)$ invariant [30, p. 305]. The diagram $D(X)$ is also invariant with respect to Weyl group transformations. Since $W$ is generated by the reflections (4.19) in the hyperplanes orthogonal to simple roots, it is sufficient to prove the invariance of $D(X)$ under them. We have $(\alpha, \hat{R}_\omega a) = a_0 n_\omega$, where $n_\omega = n - 2k_\omega(\omega, \alpha)/(\omega, \omega)$ is an integer as $(a, \omega) = k_\omega a_0$, $k_\omega \in \mathbb{Z}$, because $a \in D(X)$. We recall that any root $\alpha$ can be decomposed over the basis formed by simple roots. The coefficients of this decomposition are all either non-negative or non-positive integers. Therefore, the number $-2(\omega, \alpha)/(\omega, \omega)$ is a sum of integers since the elements of the Cartan matrix $-2(\omega, \omega')/(\omega, \omega)$ are integers.
So, $\hat{R}_\omega D(X) = D(X)$. Now we take the complement $H \setminus D(X)$. It consists of equal polyhedrons whose walls form the diagram $D(X)$. Each polyhedron is called a cell. A cell inside the positive Weyl chamber $K^+$ such that its closure contains the origin is called the Weyl cell $K_W^+$.

The Weyl cell will play an important role in the subsequent analysis, so we turn to examples before studying the problem in general. The diagram $D(su(2))$ consists of the points $na_0\omega/(\omega, \omega)$, $n \in \mathbb{Z}$, with $\omega$ being the only positive root of $su(2)$, $(\omega, \omega) = 1/2$ ($\omega = \tau_3/4$ in the matrix representation). A cell of $H_{su(2)} \setminus D(su(2))$ is an open interval between two neighboring points of $D(su(2))$. Assuming the orthonormal basis in the Cartan subalgebra, we can write $a = \sqrt{2}\,a_3\omega$, $(a, a) = a_3^2$. Since the Weyl chamber $K^+$ is isomorphic to the positive half-line $\mathbb{R}^+$, we conclude that $a$ belongs to the Weyl cell $K_W^+$ if $a_3$ lies in the open interval $(0, \sqrt{2}\,a_0)$. The translations $a \to a + 2na_0\omega/(\omega, \omega)$, $n \in \mathbb{Z}$, form the group $T_e$, and $W = \mathbb{Z}_2$, $\hat{R}_\omega a = -a$. Thus, $D(su(2))$ is invariant under translations from $T_e$ and the reflection from the Weyl group $W$.

For $X = su(3)$ we have three positive roots, $\omega_1$, $\omega_2$ and $\omega_{12} = \omega_1 + \omega_2$, which have the same norms. The angle between any two neighboring roots equals $\pi/3$. The root pattern of SU(3) is plotted in Fig. 5. The diagram $D(su(3))$ consists of three families of equispaced straight lines $(\omega_{1,2,12}, a) = a_0 n_{1,2,12}$, $n_{1,2,12} \in \mathbb{Z}$, on the plane $H_{su(3)} \sim \mathbb{R}^2$. The lines are perpendicular to the roots $\omega_{1,2,12}$, respectively. The complement $H_{su(3)} \setminus D(su(3))$ is a set of equilateral triangles covering the
Fig. 5. The root pattern of SU(3). The diagram $D(su(3))$ is formed by three families of straight lines perpendicular to the simple roots $\omega_1$, $\omega_2$ and the root $\omega_{12} = \omega_1 + \omega_2$. These families are denoted as $\gamma_1$, $\gamma_2$ and $\gamma_{12}$, respectively. The grayed equilateral triangle is the Weyl cell of $su(3)$, which is the moduli space of the $su(3)$ connections with respect to homotopically trivial gauge transformations. Had we included the homotopically non-trivial transformations into the gauge group, the moduli space would have been four times smaller and isomorphic to the equilateral triangle whose vertices are at the mid-points of the Weyl cell boundaries (see Section 7.6 for details).
plane $H_{su(3)}$. The Weyl cell $K_W^+$ is the triangle bounded by the lines $(\omega_{1,2}, a) = 0$ (being the boundary of $K^+$) and $(\omega_{12}, a) = a_0$. The group $T_e$ is generated by integral translations through the vectors $2a_0\alpha/(\alpha, \alpha)$, where $\alpha$ ranges over $\omega_{1,2,12}$, and $(\alpha, \alpha) = 1/3$ (see Section 3.2 for details of the matrix representation of the roots).

Let $W_A$ denote the group of linear transformations of $H$ generated by the reflections in all the hyperplanes in the diagram $D(X)$. This group is called the affine Weyl group [30, p. 314]. $W_A$ preserves $D(X)$ and, hence,
\[ K_W^+ \sim H/W_A, \tag{5.25} \]
i.e. the Weyl cell is isomorphic to a quotient of the Cartan subalgebra by the affine Weyl group. Consider a group $T_r$ of translations
\[ a \to a + 2a_0\sum_{\alpha>0} n_\alpha\frac{\alpha}{(\alpha, \alpha)} \equiv a + a_0\sum_{\alpha>0} n_\alpha\eta_\alpha, \qquad n_\alpha \in \mathbb{Z}. \]
Then $W_A$ is a semidirect product of $T_r$ and $W$ [30, p. 315]. For the element $\eta_\alpha$ we have the following equality [30, p. 317]:
\[ \exp(2\pi i\eta_\alpha) = \exp\frac{4\pi i\alpha}{(\alpha, \alpha)} = e. \tag{5.26} \]
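For $su(2)$ the equality (5.26) and the periodicity condition (5.24) can be checked by hand, and the following Python sketch does so numerically. It uses only the data quoted above, $\omega = \tau_3/4$ and $(\omega, \omega) = 1/2$, so that $\eta_\omega = 2\omega/(\omega, \omega) = \tau_3$. Representing the diagonal matrix $\exp(i\theta\tau_3)$ by its two diagonal entries, and the helper name `exp_i_theta_tau3`, are illustrative choices rather than anything taken from the original text.

```python
import cmath


def exp_i_theta_tau3(theta):
    """Diagonal entries of exp(i*theta*tau_3), with tau_3 = diag(1, -1)."""
    return (cmath.exp(1j * theta), cmath.exp(-1j * theta))


# eta = tau_3 lies on the unit lattice: exp(2*pi*i*eta) is the group unit.
full_period = exp_i_theta_tau3(2.0 * cmath.pi)

# Half of that lattice vector, eta = tau_3/2, gives exp(2*pi*i*eta) = -1,
# the non-trivial element of the center of SU(2).
half_period = exp_i_theta_tau3(cmath.pi)

if __name__ == "__main__":
    print(full_period, half_period)
```

The second case corresponds to a group element $\Omega_s(x) = \exp(iga_0\eta x)$ that obeys the twisted boundary condition (5.7) with $z = -e$ rather than the periodicity condition (5.24), so it does not belong to the gauge group of homotopically trivial transformations.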
Comparing (5.26) with (5.24), we conclude that the residual discrete gauge transformations form the affine Weyl group. The space of all periodic connections $A(x)$ is $F$. Now we can calculate the moduli space of connections relative to the gauge group, i.e., obtain the physical configuration space, or
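the gauge orbit space
\[ \mathrm{CS}_{\rm phys} \sim F/\mathcal{G} \sim H/W_A \sim K_W^+. \tag{5.27} \]
Similarly, the original phase space is isomorphic to $F \oplus F$ because it is formed by pairs of Lie-algebra-valued periodic functions $A(x)$ and $E(x)$. The quotient with respect to the gauge group reads
\[ \mathrm{PS}_{\rm phys} = F \oplus F/\mathcal{G} \sim \mathbb{R}^{2r}/W_A, \tag{5.28} \]
where the action of $W_A$ on $H \oplus H \sim \mathbb{R}^{2r}$ is determined by all possible compositions of the following transformations:
\[ \hat{R}_{\alpha,n}\,p_a = \hat{R}_\alpha p_a = p_a - \frac{2(\alpha, p_a)}{(\alpha, \alpha)}\,\alpha, \tag{5.29} \]
\[ \hat{R}_{\alpha,n}\,a = \hat{R}_\alpha a + \frac{2n_\alpha a_0}{(\alpha, \alpha)}\,\alpha, \tag{5.30} \]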
where the element $\hat{R}_{\alpha,n} \in W_A$ acts on $a$ as a reflection in the hyperplane $(\alpha, a) = n_\alpha a_0$, $n_\alpha \in \mathbb{Z}$, and $\alpha$ is any root.

To illustrate the formula (5.28), let us construct $\mathrm{PS}_{\rm phys}$ for the simplest case $X = su(2)$. We have $r = 1$, $W = \mathbb{Z}_2$, $(\omega, \omega) = 1/2$. The group $T_r = T_e$ acts on the phase plane $\mathbb{R}^2$ spanned by the coordinates $p_3$, $a_3$ (we have introduced the orthonormal basis in $H_{su(2)}$; see the discussion of $D(su(2))$ above) as $p_3, a_3 \to p_3, a_3 + 2\sqrt{2}\,na_0$. In Fig. 6 we set $L = \sqrt{2}\,a_0$. The points $B$ and $B_1$ are related by a gauge transformation from $T_e$. The strips bounded by the vertical lines $(\gamma\gamma')$ are gauge equivalent through the translations from $T_e$. The boundary lines $(\gamma\gamma')$ are gauge equivalent to one another, too. So, $\mathbb{R}^2/T_e$ is a cylinder. After an appropriate cut, this cylinder can be unfolded into the strip $p_3 \in \mathbb{R}$, $a_3 \in (-L, L)$ as shown in Fig. 6b. The boundary lines $a_3 = \pm L$ are edges of the cut. They contain the same physical states and later will be identified. On the strip one should stick together the points $p_3, a_3$ and $-p_3, -a_3$ connected by the reflection from the Weyl group (the points $B$ and $-B$ in Fig. 6b). This converts the cylinder into a half-cylinder ended by two conic horns at the points $p_3 = 0$, $a_3 = 0, L$. Indeed, we can cut the strip along the $p_3$-line and rotate the right half (the strip $0 < a_3 < L$) relative to the coordinate axis $a_3$ through the angle $\pi$ (cf. a similar procedure in Fig. 1b). The result is shown in Fig. 6c. It is important to observe that the half-axis $(L\gamma)$ is gauge equivalent to $(-L\gamma)$ and $(-L\gamma')$ to $(L\gamma')$, while the positive and negative momentum half-axes in Fig. 6c are edges of the cut and are therefore to be identified too (cf. Fig. 1c). Next, we fold the strip in Fig. 6c along the momentum axis to identify the points $B$ and $-B$. Finally, we glue together the half-lines $(\gamma L)$ with $(L\gamma)$ and $(p_3O)$ with $(Op_3)$ in Fig. 6d, thus obtaining the physical phase space (Fig. 6e).
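The gluing procedure just described can be mimicked by a short Python sketch that maps an arbitrary point $(p_3, a_3)$ of the $su(2)$ phase plane to a gauge-equivalent representative with $0 \leq a_3 \leq L$. It is only a sketch under stated assumptions: for rank one, the reflections (5.29) and (5.30) reduce to $p_3 \to -p_3$, $a_3 \to -a_3 + 2nL$, and the function names `affine_reflection` and `to_weyl_cell` are hypothetical.

```python
import math


def affine_reflection(p3, a3, n, L):
    """Rank-one form of (5.29), (5.30): reflection in the point a3 = n*L."""
    return -p3, -a3 + 2.0 * n * L


def to_weyl_cell(p3, a3, L):
    """Return a gauge-equivalent point with 0 <= a3 <= L.

    A translation by 2L is a composition of two reflections and leaves the
    momentum unchanged; a single reflection flips the sign of the momentum.
    """
    k = math.floor(a3 / (2.0 * L))
    a3 = a3 - 2.0 * L * k          # now 0 <= a3 < 2L
    if a3 > L:
        p3, a3 = affine_reflection(p3, a3, 1, L)
    return p3, a3


if __name__ == "__main__":
    L = math.sqrt(2.0)             # L = sqrt(2)*a_0 with a_0 = 1
    print(to_weyl_cell(0.7, 5.0, L))
    print(to_weyl_cell(0.7, -0.5, L))
```

For a point with $-L < a_3 < 0$ the net effect is the Weyl reflection $(p_3, a_3) \to (-p_3, -a_3)$, in agreement with the identification of the points $B$ and $-B$ described above.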
In neighborhoods of the singular conic points, $\mathrm{PS}_{\rm phys}$ looks locally like $\mathrm{cone}(\pi)$ studied in Section 2 because $W_A$ acts as the $\mathbb{Z}_2$-reflections (5.29) and (5.30) with $\alpha = \omega$ and $n = 0, 1$ near $a_3 = 0, \sqrt{2}\,a_0$, respectively.

For groups of rank 2, all conic (singular) points of $\mathrm{PS}_{\rm phys}$ are concentrated on a triangle being the boundary $\partial K_W^+$ of the Weyl cell (if $X = su(3)$, $\partial K_W^+$ is an equilateral triangle with side length $\sqrt{3}\,a_0$ in the orthonormal basis defined in Section 3.2). Let us introduce local symplectic coordinates $p_a^{\perp}, a^{\perp}$ and $p_a^{\parallel}, a^{\parallel}$ in a neighborhood of a point of $\partial K_W^+$ (except the triangle vertices) such that $a^{\perp}$ and $a^{\parallel}$ vary along lines perpendicular and parallel to $\partial K_W^+$, respectively. The $W_A$-reflection in
Fig. 6. The physical phase space of the SU(2) Yang–Mills theory on a cylindrical spacetime. It is a half-cylinder with two conic horns attached to it. It is flat everywhere except at the conic singularities, where the curvature is infinite. Here $L = \sqrt{2}\,a_0$.
the wall of $\partial K_W^+$ going through this neighborhood leaves $p_a^{\parallel}, a^{\parallel}$ invariant, while it changes the sign of the other symplectic pair, $p_a^{\perp}, a^{\perp} \to -p_a^{\perp}, -a^{\perp}$. Therefore $\mathrm{PS}_{\rm phys}$ locally coincides with $\mathbb{R}^2 \oplus \mathrm{cone}(\pi)$. At the triangle vertices, two conic singularities going along two triangle edges merge. If those edges are perpendicular, $\mathrm{PS}_{\rm phys}$ is locally $\mathrm{cone}(\pi) \oplus \mathrm{cone}(\pi)$. If not, $\mathrm{PS}_{\rm phys}$ is a 4D-hypercone. The tip of the 4D-hypercone is "sharper" than the tip of $\mathrm{cone}(\pi) \oplus \mathrm{cone}(\pi)$, meaning that the 4D-hypercone can always be put inside $\mathrm{cone}(\pi) \oplus \mathrm{cone}(\pi)$ when the tips of both hypercones are placed at the same point. Obviously, a lesser angle between the triangle edges corresponds to a "sharper" hypercone.

A generalization of this pattern of singular points in $\mathrm{PS}_{\rm phys}$ to gauge groups of an arbitrary rank is trivial. The Weyl cell is an $r$D-polyhedron. $\mathrm{PS}_{\rm phys}$ at the polyhedron vertices has the most singular local $2r$D-hypercone structure. On the polyhedron edges it is locally viewed as an $\mathbb{R}^2 \oplus 2(r-1)$D-hypercone. Then on the polyhedron faces, being polygons, the local $\mathrm{PS}_{\rm phys}$ structure is an $\mathbb{R}^4 \oplus 2(r-2)$D-hypercone, etc.

Remark. As in the mechanical models studied earlier, one can choose various ways to parameterize the physical phase space. When calculating a quotient space, one can, for instance, restrict the
values of the canonical momentum $E(x)$. This is equivalent to imposing a gauge condition on the field variables rather than on the connection [69,73]. By a gauge rotation $E$ can be brought to the Cartan subalgebra at each point $x$. So we set $E(x) = E_H(x) \in H$. Decomposing the connection $A$ into the Cartan component $A_H(x)$ and $\bar{A}(x) \in X \ominus H$, we find that the constraint $\nabla(A)E_H = 0$ is equivalent to two independent constraints: $\partial E_H = 0$ and $[\bar{A}, E_H] = 0$. These are the two components of the original constraint in $H$ and $X \ominus H$, respectively. From the Cartan–Weyl commutation relations it follows that $\bar{A}(x) = 0$. The residual constraints $\partial E_H = 0$ generate the gradient shifts of the corresponding canonical variables: $A_H \to A_H + \partial\omega$, where $\partial\omega$ is a periodic function of $x$. We have obtained the Abelian projection of the theory [73]. Therefore the physical degrees of freedom can again be described by the pair $E_H(x) = p_a$ and $A_H = a$. Now $p_a$ can be taken into the Weyl chamber by an appropriate Weyl transformation, while $a$ is determined modulo shifts by the periods of the group torus (the shifts along the group unit lattice). Note that we can take $\omega = \eta x$ since $\partial\omega = \eta$ is periodic, as is any constant. The necessary restrictions on $\eta$ follow from the periodicity condition on the corresponding gauge group element. So, we have another parameterization of the same physical phase space such that $p_a \in K^+$ and $a \in H/T_e$, which obviously corresponds to another cut of the $r$D-hypercone. The quantum theory of some topological field models in the momentum representation has been studied in [74–76].

5.2. Geometry of the gauge orbit space

Let us find the metric and the induced volume element of the physical configuration space. They will be used in quantum mechanics of the Yang–Mills theory under consideration.
It is useful to introduce the following decomposition of the functional space (5.8):
\[ F = \sum_{n=0}^{\infty}\oplus F_n = \sum_{n=0}^{\infty}\oplus\left( F_n^H \oplus \bar{F}_n \right), \tag{5.31} \]
where $F_0$ is the space of constant Lie algebra-valued functions (the first term in the series (5.8)), and $F_n$, $n \neq 0$, is the space of functions with the fixed $n$ in the sum (5.8). Each subspace $F_n$ is finite-dimensional, $\dim F_0 = \dim X$, $\dim F_n = 2\dim X$, $n \neq 0$ (we recall that Lie algebra-valued functions are considered). Functions belonging to $F_n^H$ take their values in the Cartan subalgebra $H$, while functions from $\bar{F}_n$ take their values in $X \ominus H$. All subspaces introduced are orthogonal with respect to the scalar product $\langle\ ,\ \rangle = \int_0^{2\pi l} dx\,(\ ,\ )$.

From the above analysis of the moduli space of Yang–Mills connections follows a local parameterization of a generic connection,
\[ A = \Omega a\,\Omega^{-1} - \frac{i}{g}\,\Omega\,\partial\Omega^{-1}, \qquad \partial a = 0, \qquad a \in H, \tag{5.32} \]
where $\Omega \in \mathcal{G}/G_H$, and $G_H$ is the Cartan subgroup (the maximal Abelian subgroup of $G$), which is isomorphic to the stationary group of the homogeneous connection $a$. By definition the connection remains invariant under gauge transformations from its stationary group. Eq. (5.32) can be regarded as a change of variables in the functional space $F$. In the new variables the functional differential $dA \in F$ can be represented in the form
\[ dA = \Omega\left( da - \frac{i}{g}\,\nabla(a)\,dw \right)\Omega^{-1}, \tag{5.33} \]
where, by the definition of the parameterization (5.32), $da \in F_0^H$ and $dw(x) = i\Omega^{-1}d\Omega \in F \ominus F_0^H$. Therefore the metric tensor reads
\[ \langle dA, dA\rangle = 2\pi l\,(da, da) - g^{-2}\langle dw, \nabla^2(a)\,dw\rangle \equiv (da, g_{aa}\,da) + \langle dw, g_{ww}\,dw\rangle. \tag{5.34} \]
Eq. (5.34) results from (5.33) and the relation $\langle da, \nabla(a)dw\rangle = -\langle\nabla(a)da, dw\rangle = 0$, which is due to $\partial\,da = 0$ and $[da, a] = 0$. The operator $\nabla(a)$ acts in the subspace $F \ominus F_0^H$. It has no zero mode in this subspace if $a \in K_W^+$ and, hence, is invertible. Its determinant is computed below. The metric tensor has a block-diagonal form. The physical block is proportional to the $r \times r$ unit matrix, $g_{aa} = 2\pi l$. For the non-physical sector we have an infinite-dimensional block represented by the kernel of the differential operator: $g_{ww} = -g^{-2}\nabla^2(a)$, and $g_{aw} = g_{wa} = 0$. Taking the inverse of the $aa$-block of the inverse total metric, we find that the physical metric $g^{\rm ph}_{aa}$ coincides with $g_{aa}$. That is, the physical configuration space is a flat manifold with (singular) boundaries. It has the structure of an orbifold [36].

To obtain the induced volume element, one has to calculate the Jacobian of the change of variables (5.32):
\[ \int_F \prod_{x\in S^1} dA(x)\,\Phi = \int_{\mathcal{G}/G_H}\prod_{x} dw(x)\int_{K_W^+} da\,J(a)\,\Phi \;\to\; \int_{K_W^+} da\,\kappa^2(a)\,\Phi, \tag{5.35} \]
\[ J^2(a) = \det g_{aa}\,\det g_{ww} = (2\pi l)^r\det\left[ -g^{-2}\nabla^2(a) \right]. \tag{5.36} \]
Here $\Phi = \Phi(A) = \Phi(a)$ is a gauge invariant functional of $A$. The induced volume element does not coincide with the square root of the determinant of the induced metric on the orbit space. It contains an additional factor, $(\det g_{ww})^{1/2}$, being the volume of the gauge orbit through a generic configuration $A(x) = a$, in full accordance with the general analysis given in [13] for Yang–Mills theories (see also Section 10.1).

Consider the orthogonal decomposition
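\[ \bar{F}_n = \sum_{\alpha>0}\oplus F_n^{\alpha}, \tag{5.37} \]
where $F_n^{\alpha}$ contains only functions taking their values in the two-dimensional subspace $X_\alpha \oplus X_{-\alpha}$ of the Lie algebra $X$. The subspaces $F_n^H$, $F_n^{\alpha}$ are invariant subspaces of the operator $\nabla(a)$, that is, $\nabla(a)F_n^H$ is a subspace of $F_n^H$, and $\nabla(a)F_n^{\alpha}$ is a subspace of $F_n^{\alpha}$. We conclude that the operator $\nabla(a)$ has a block-diagonal form in the decomposition (5.31) and (5.37). Indeed, we have $\nabla(a) = \partial - ig\,\mathrm{ad}\,a$, where $\mathrm{ad}\,a = [a, \cdot\,]$ is the adjoint operator acting in $X$. The operator $\partial$ is diagonal in the algebra space, and its action does not change periods of functions, i.e. $F_n^{H,\alpha}$ are its invariant spaces. Obviously, $\mathrm{ad}\,a\,F_n^H = 0$ and $\mathrm{ad}\,a\,F_n^{\alpha} = F_n^{\alpha}$ if $(a, \alpha) \neq 0$, in accordance with the Cartan–Weyl commutation relation (4.11). Therefore, the action of the operator $\nabla(a)$ on $F \ominus F_0^H$ is given by an infinite-dimensional, block-diagonal matrix. In the real basis $\lambda_i$ introduced after Eq. (4.16), its blocks have the form
\[ \nabla_n^H(a) \equiv \nabla(a)\big|_{F_n^H} = \partial\big|_{F_n^H} = \frac{n}{l}\,\mathbb{1}_r \otimes \varepsilon, \qquad n \neq 0, \quad r = \mathrm{rank}\,X, \tag{5.38} \]
\[ \nabla_0^{\alpha}(a) \equiv \nabla(a)\big|_{F_0^{\alpha}} = -ig\,\mathrm{ad}\,a\big|_{F_0^{\alpha}} = g(a, \alpha)\,\varepsilon, \tag{5.39} \]
\[ \nabla_n^{\alpha}(a) \equiv \nabla(a)\big|_{F_n^{\alpha}} = \frac{n}{l}\,\mathbb{1} \otimes \varepsilon + g(a, \alpha)\,\varepsilon \otimes \mathbb{1}, \tag{5.40} \]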
where $\varepsilon$ is a $2 \times 2$ totally antisymmetric matrix, $\varepsilon_{ij} = -\varepsilon_{ji}$, $\varepsilon_{12} = 1$; and $\mathbb{1}$ is the $2 \times 2$ unit matrix. In (5.40) the first components in the tensor products correspond to the algebra indices, while the second ones determine the action of $\nabla(a)$ on the functional basis $\sin(nx/l)$, $\cos(nx/l)$. The vertical bars at the operators in Eqs. (5.38)–(5.40) mean a restriction of the corresponding operator onto a specified finite-dimensional subspace of $F$. An explicit matrix form of the restricted operator is easily obtained by applying $\partial$ to the Fourier basis, and the action of $\mathrm{ad}\,a$, $a \in H$, is computed by means of (4.15). Since $\varepsilon^2 = -1$, we have for the Jacobian
\[ J^2(a) = (2\pi l)^r\prod_{\alpha>0}\det\left( ig^{-1}\nabla_0^{\alpha} \right)^2\prod_{n=1}^{\infty}\left[ \det\left( ig^{-1}\nabla_n^H \right)^2\prod_{\alpha>0}\det\left( ig^{-1}\nabla_n^{\alpha} \right)^2 \right] \]
\[ \phantom{J^2(a)} = (2\pi l)^r\prod_{\alpha>0}(a, \alpha)^4\prod_{n=1}^{\infty}\left[ \left( \frac{n}{gl} \right)^{4r}\prod_{\alpha>0}\left( \frac{n^2}{g^2l^2} - (a, \alpha)^2 \right)^4 \right]. \tag{5.41} \]
Set $J(a) = C(l)\,\kappa^2(a)$. Including all divergences of the product (5.41) into $C(l)$ we get
\[ \kappa(a) = \prod_{\alpha>0}\left[ \frac{\pi(a, \alpha)}{a_0}\prod_{n=1}^{\infty}\left( 1 - \frac{(a, \alpha)^2}{a_0^2n^2} \right) \right] = \prod_{\alpha>0}\sin\frac{\pi(a, \alpha)}{a_0}, \tag{5.42} \]
\[ C(l) = (2\pi l)^{r/2}\left( \frac{a_0}{\pi} \right)^{N_+}\prod_{n=1}^{\infty}\left( n^2a_0^2 \right)^{r+2}, \tag{5.43} \]
where $a_0 = (gl)^{-1}$ and the integer $N_+ = (N - r)/2$ is the number of positive roots in $X$; the last equality in (5.42) results from a product formula given in [65, p. 37]. The induced volume element is $da\,\kappa^2(a)$. It vanishes at the boundaries of the Weyl cell (at the boundaries of the physical configuration space in the parameterization considered) since $(a, \alpha)/a_0 \in \mathbb{Z}$ for all $a \in \partial K_W^+$. Zeros of the function $\kappa(a)$ extended to the whole Cartan subalgebra form the diagram $D(X)$. This fact will be important for quantization of the model in Section 8.6.
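The last equality in (5.42) rests on the classical infinite-product representation $\sin\pi x = \pi x\prod_{n\geq 1}(1 - x^2/n^2)$ with $x = (a, \alpha)/a_0$. The following Python sketch, with a hypothetical helper name and an arbitrarily chosen truncation order, compares the truncated product with the sine for a single root factor. It is a numerical illustration, not a proof.

```python
import math


def kappa_factor_product(x, n_max):
    """Truncated product pi*x * prod_{n=1}^{n_max} (1 - x^2/n^2),
    where x stands for (a, alpha)/a_0 for one positive root alpha."""
    value = math.pi * x
    for n in range(1, n_max + 1):
        value *= 1.0 - (x * x) / (n * n)
    return value


if __name__ == "__main__":
    x = 0.3
    print(kappa_factor_product(x, 100000), math.sin(math.pi * x))
```

The truncated product converges slowly, with a relative error of order $x^2/n_{\max}$, and it vanishes exactly at integer $x$, which reproduces the zeros of $\kappa(a)$ on the diagram $D(X)$.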
5.3. Properties of the measure on the gauge orbit space

We will need a few mathematical facts about the function $\kappa$ which later prove to be useful when solving quantum Yang–Mills theory on a cylindrical spacetime in the operator and path integral approaches. The first remarkable fact is that the function (5.42) is proportional to the Weyl determinant [63, p. 185]:
\[ (2i)^{N_+}\kappa(a) = \prod_{\alpha>0}\left( e^{i\pi(a,\alpha)/a_0} - e^{-i\pi(a,\alpha)/a_0} \right) = \sum_{\hat{R}\in W}\det\hat{R}\,\exp\left[ \frac{2\pi i}{a_0}(\hat{R}\rho, a) \right]. \tag{5.44} \]
Here we have introduced the parity $\det\hat{R}$ of the elements of the Weyl group. It is $1$ if $\hat{R}$ contains an even number of the generating elements $\hat{R}_\omega$ and $-1$ if this number is odd. Recall that in the root space $\mathbb{R}^r$ the reflection $\hat{R}_\omega$ in the hyperplane orthogonal to a simple root $\omega$ can be thought of as an $r \times r$ matrix from the orthogonal group $O(r)$ such that $\det\hat{R}_\omega = -1$. The element $\rho$ is a half-sum of all positive roots:
\[ \rho = \frac{1}{2}\sum_{\alpha>0}\alpha. \tag{5.45} \]
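The identity (5.44) can be checked numerically for $X = su(3)$, where the Weyl group has six elements. The Python sketch below is an illustration under stated assumptions: the Cartesian coordinates of the simple roots (norm squared $1/3$, mutual angle $2\pi/3$), the enumeration of the Weyl group as words in the two simple reflections, and all function names are choices made here, not taken from the original text.

```python
import cmath
import math

# Assumed coordinates of the su(3) roots (norm squared 1/3).
S = 1.0 / math.sqrt(3.0)
W1 = (S, 0.0)
W2 = (-S / 2.0, S * math.sqrt(3.0) / 2.0)
W12 = (W1[0] + W2[0], W1[1] + W2[1])
POSITIVE_ROOTS = [W1, W2, W12]
# Half-sum of the positive roots, Eq. (5.45); for su(3) it equals W12.
RHO = (sum(r[0] for r in POSITIVE_ROOTS) / 2.0,
       sum(r[1] for r in POSITIVE_ROOTS) / 2.0)


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def reflect(v, root):
    c = 2.0 * dot(v, root) / dot(root, root)
    return (v[0] - c * root[0], v[1] - c * root[1])


def weyl_images(v):
    """All six images of v under the su(3) Weyl group, with their parities."""
    r1 = lambda x: reflect(x, W1)
    r2 = lambda x: reflect(x, W2)
    return [(1, v), (-1, r1(v)), (-1, r2(v)),
            (1, r1(r2(v))), (1, r2(r1(v))), (-1, r1(r2(r1(v))))]


def product_side(a, a0):
    """Product over positive roots in (5.44)."""
    result = 1.0 + 0.0j
    for alpha in POSITIVE_ROOTS:
        phase = math.pi * dot(a, alpha) / a0
        result *= cmath.exp(1j * phase) - cmath.exp(-1j * phase)
    return result


def weyl_sum_side(a, a0):
    """Alternating sum over the Weyl group in (5.44)."""
    return sum(sign * cmath.exp(2j * math.pi * dot(image, a) / a0)
               for sign, image in weyl_images(RHO))


if __name__ == "__main__":
    a = (0.3, 0.2)
    print(product_side(a, 1.0), weyl_sum_side(a, 1.0))
```

Both sides agree to rounding accuracy for a generic point $a$, and each equals $(2i)^3\kappa(a)$ with $\kappa(a)$ given by the product of sines in (5.42).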
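The relation between $\kappa$ and the Weyl determinant allows us to establish the transformation properties of $\kappa$ relative to the action of the affine Weyl group on its argument. From Eqs. (5.30) and (5.44) we infer
\[ (2i)^{N_+}\kappa(\hat{R}_{\beta,n}a) = \sum_{\hat{R}\in W}\det\hat{R}\,\exp\left[ \frac{2\pi i}{a_0}(\hat{R}\rho, \hat{R}_\beta a) \right]\exp\left[ \frac{4\pi i\,n_\beta}{(\beta, \beta)}(\hat{R}\rho, \beta) \right] \tag{5.46} \]
\[ \phantom{(2i)^{N_+}\kappa(\hat{R}_{\beta,n}a)} = \det\hat{R}_\beta\sum_{\hat{R}\in W}\det\hat{R}\,\exp\left[ \frac{2\pi i}{a_0}(\hat{R}\rho, a) \right]\exp\left[ -\frac{4\pi i\,n_\beta}{(\beta, \beta)}(\hat{R}\rho, \beta) \right], \tag{5.47} \]
where we have rearranged the sum over the Weyl group by the change $\hat{R} \to \hat{R}_\beta\hat{R}$ and made use of the properties that $\hat{R}_\beta^2 = 1$ and $\hat{R}_\beta\beta = -\beta$. Next we show that the second exponential in (5.47) is $1$ for any $\beta$ and $\hat{R}$. To this end, we observe that $(\hat{R}\rho, \beta) = (\rho, \beta')$, where $\beta' = \hat{R}^T\beta$ is also a root that has the same norm as $\beta$ because the Weyl group preserves the root pattern. Therefore we have to prove that
\[ n_\rho(\beta) = \frac{2(\rho, \beta)}{(\beta, \beta)} \tag{5.48} \]
is an integer. The half-sum of the positive roots has the following properties [30, p. 461]:
\[ \frac{2(\omega, \rho)}{(\omega, \omega)} = 1, \tag{5.49} \]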
\[ \hat{R}_\omega\rho = \rho - \omega, \tag{5.50} \]
for any simple root $\omega$. Since the Weyl group $W$ preserves the root system and the reflection $\hat{R}_\beta$ in the hyperplane $(\beta, a) = 0$ is a composition of reflections $\hat{R}_\omega$, there exist an element $\hat{R} \in W$ and a simple root $\omega_\beta$ such that $\hat{R}\omega_\beta = \beta$. The statement that $n_\rho(\beta)$ is an integer follows from the relation
\[ n_\rho(\beta) = \frac{2(\beta, \rho)}{(\beta, \beta)} = \frac{2(\omega_\beta, \hat{R}^T\rho)}{(\omega_\beta, \omega_\beta)} \in \mathbb{Z}. \tag{5.51} \]
Indeed, representing $\hat{R}^T$ as a product of the generating elements $\hat{R}_\omega$ and applying Eqs. (5.49) and (5.50), we obtain (5.51) from the fact that $2(\omega_\beta, \alpha)/(\omega_\beta, \omega_\beta)$ is an integer for any root $\alpha$. Recall that a root $\alpha$ can be decomposed into a sum over simple roots with integer-valued coefficients, and the Cartan matrix $2(\omega, \omega')/(\omega, \omega)$ is also integer valued. Thus, we arrive at the simple property
\[ \kappa(\hat{R}_{\beta,n}a) = \det\hat{R}_\beta\,\kappa(a) = -\kappa(a) \tag{5.52} \]
for any root $\beta$. Since any element of the affine Weyl group $W_A$ is a composition of the reflections (5.30), we conclude that
\[ \kappa(\hat{R}a) = \det\hat{R}\,\kappa(a) = \pm\kappa(a), \qquad \hat{R} \in W_A, \tag{5.53} \]
where by definition $\det\hat{R} = -1$ if $\hat{R}$ contains an odd number of the reflections (5.30) and $\det\hat{R} = 1$ for an even number. The Jacobian $\mu = \kappa^2$ is invariant under the affine Weyl group transformations.

The second remarkable property of the function $\kappa(a)$ is that it is an eigenfunction of the $r$-dimensional Laplace operator,
\[ (\partial_a, \partial_a)\,\kappa(a) \equiv \Delta_{(r)}\kappa(a) = -\frac{4\pi^2(\rho, \rho)}{a_0^2}\,\kappa(a) = -\frac{\pi^2N}{6a_0^2}\,\kappa(a), \tag{5.54} \]
where the relation $(\rho, \rho) = N/24$ [32] between the norm of $\rho$ and the dimension $N$ of the Lie algebra has been used. A straightforward calculation of the action of the Laplace operator on $\kappa(a)$ leads to the equality
\[ \Delta_{(r)}\kappa(a) = -\frac{4\pi^2}{a_0^2}(\rho, \rho)\,\kappa(a) + \frac{\pi^2}{a_0^2}\sum_{\alpha\neq\beta>0}(\alpha, \beta)\left[ \cot\frac{\pi(a, \alpha)}{a_0}\cot\frac{\pi(a, \beta)}{a_0} + 1 \right]\kappa(a). \tag{5.55} \]
The sum over positive roots in Eq. (5.55) can be transformed into a sum over the roots $\alpha \neq \beta$ in a plane $P_{\alpha\beta}$ and the sum over all planes $P_{\alpha\beta}$. Each plane contains at least two positive roots. Relation (5.54) follows from
\[ \sum_{\alpha\neq\beta>0\,\in P_{\alpha\beta}}(\alpha, \beta)\left[ \cot(b, \alpha)\cot(b, \beta) + 1 \right] = 0 \tag{5.56} \]
for any $b \in H$. To prove the latter relation, we remark that the root pattern in each plane coincides with one of the root patterns for algebras of rank 2, $su(3)$, $sp(4) \sim so(5)$ and $g_2$, because the absolute value of the cosine of an angle between any two roots $\alpha$ and $\beta$ may take only four values, $|\cos\theta_{\alpha\beta}| = 0, 1/\sqrt{2}, 1/2, \sqrt{3}/2$. For the algebras of rank 2, Eq. (5.56) can be verified by an explicit calculation. For example, in the case of the $su(3)$ algebra, the sum (5.56) is proportional to
\[ -\cot b_1\cot b_2 + \cot b_1\cot(b_1 + b_2) + \cot b_2\cot(b_1 + b_2) + 1 = 0, \]
where $b_{1,2} = (b, \omega_{1,2})$, and $\omega_1$, $\omega_2$ and $\omega_1 + \omega_2$ constitute all positive roots of SU(3).

6. Artifacts of gauge fixing in classical theory

The definition of $\mathrm{PS}_{\rm phys}$ is independent of the choice of local symplectic coordinates and explicitly gauge invariant. However, upon a dynamical description (quantum or classical) of constrained systems, we often need to introduce coordinates on $\mathrm{PS}_{\rm phys}$, which means fixing a gauge or choosing a $\mathrm{PS}_{\rm phys}$ parameterization. The choice of the parameterization is usually motivated by physical reasons. If one deals with gauge fields, one may describe physical degrees of freedom by transverse components $A^{\perp}$ of the vector potential and their canonically conjugated momenta $E^{\perp}$, i.e. the Coulomb gauge $(\partial, A) = 0$ is imposed to remove non-physical degrees of freedom. This choice comes naturally from our experience in QED, where two independent polarizations of a photon are described by the transverse vector potential. The Coulomb gauge is a complete global gauge condition in QED.
Apparently, the phase space of each physical degree of freedom in the theory is a Euclidean space.

In the high-energy limit of non-Abelian gauge theories like QCD, the physical picture of self-interacting transverse gluons works extremely well. However, in the infrared domain, where the coupling constant becomes large and dynamics favors large fluctuations of the fields, transverse gauge fields no longer serve as good variables parameterizing $\mathrm{PS}_{\rm phys}$. It appears that there are gauge-equivalent configurations in the functional hyperplane $(\partial, A) = 0$, known as Gribov's copies [11]. Moreover, this gauge fixing ambiguity always occurs and has an intrinsic geometric origin [12] related to the topology of the gauge orbit space, and it cannot be avoided if gauge potentials are assumed to vanish at spatial infinity. This makes a substantial difficulty for
developing a consistent non-perturbative path integral formalism for gauge theories (see Section 10 for details).

To illustrate the Gribov copying phenomenon in the Coulomb gauge, one can take the 2D Yang–Mills theory considered above. The spatially homogeneous Cartan subalgebra components of the vector potential $A = a$ and field strength $E = p_a$ can be regarded as symplectic coordinates on $\mathrm{PS}_{\rm phys}$. In fact, this implies the Coulomb gauge condition $\partial A = 0$. Note that this condition is not complete in the two-dimensional case because there are some non-physical degrees of freedom left.⁴ They are removed by imposing an additional gauge condition $(e_{\pm\alpha}, A) = 0$, i.e. $A \in H$. Gribov copies of a configuration $A = a \in H \sim \mathbb{R}^r$ are obtained by applying elements of the affine Weyl group $W_A$ to $a$. The modular domain coincides with the Weyl cell. We will see that the residual transformations from the affine Weyl group are important for constructing the Hamiltonian path integral in the Coulomb gauge for the model in question. In fact, if we ignore them and calculate the path integral as if there were no Gribov copies, the answer would appear in conflict with the explicitly gauge invariant approach due to Dirac.

From the geometrical point of view [12], the absence of a "good" gauge condition $\chi(A) = 0$ is due to the non-triviality of the fiber bundle with the base being space (compactified into a sphere by imposing zero boundary conditions on the connection $A$ at spatial infinity) and the fibers being the group $G$ (see also a comprehensive work [66]). For this reason, the Gribov problem is often identified with the absence of a global cross-section on the non-trivial fiber bundle. However, one could look at this problem differently. Gribov found the obstruction to the non-perturbative extension of the Faddeev–Popov path integral [67]. To give an operator interpretation to the Lagrangian gauge-fixed (formal) path integral, a more general Hamiltonian path integral has been developed by Faddeev [68].
The construction is based on an explicit parameterization of the physical phase space, which is introduced by imposing supplementary (gauge) conditions on the canonical variables. We have discussed such parameterizations of the physical phase space in the SO(N) model (the gauge $x_i = x\delta_{1i}$), in the Yang–Mills mechanics (the gauge $x = h \in H$), or in the 2D Yang–Mills theory (the gauge $A(x) = a \in H$). The singularities discovered by Gribov are associated with the particular choice of the supplementary conditions imposed on the canonical coordinates (connections $A$). In Yang–Mills theory this particular class of gauge conditions is indeed subject to the mathematical "no-go" theorem due to Singer. As was later proposed by Faddeev and collaborators, this mathematical problem of constructing a cross-section on the non-trivial fiber bundle can be circumvented if the supplementary condition is imposed on the momentum variables [69]. One could even construct a set of local gauge invariant canonical variables to span the physical phase space [70,71]. The gauge fixing in the space of the canonical momenta $E$ is an algebraic (local) problem similar to the one discussed in Section 3 because under the gauge transformations, $E \to \Omega E \Omega^{-1}$ [72]. The physical phase space structure does not depend on whether one uses canonical coordinates or momenta to remove the gauge arbitrariness. The problem of constructing the correct path integral measure on the physical phase space parameterized in either way would still remain
Footnote 4. The Coulomb gauge would have been complete, had we removed the constant gauge transformations from the gauge group, which, however, would have been rather artificial since the Lagrangian of the theory has the gauge invariance relative to spatially homogeneous gauge transformations (see also Section 10.3 in this regard).
because there would be singularities in the canonical momentum space or in the configuration space as the consequence of the non-Euclidean structure of the physical phase space. 't Hooft considered gauge fixing for the field variables rather than for the vector potentials [73]. He identified the singularities occurring in such a gauge with topological defects in gauge fields that carry quantum numbers of magnetic monopoles with respect to the residual Abelian gauge group. The existence of singularities in the momentum space was also stressed in [69]. In view of these arguments, we consider the Gribov obstruction as a part of a much more general and fundamental quantization problem: quantization on non-Euclidean phase spaces. The phase space of physical degrees of freedom may not be Euclidean even if one can find a global cross section in the fiber bundle associated with a gauge model. In fact, it is the geometry of the phase space that lies at the heart of the canonical or path integral quantization because the Heisenberg commutation relations and their representation strongly depend on it. The quantization problem of non-Euclidean phase spaces has been known since the birth of quantum mechanics. Yang–Mills theory has given us a first example of a fundamental theory where such an unusual feature of the Hamiltonian dynamics may have significant physical consequences. An explicit parameterization of the physical phase space by local canonical coordinates is often used in gauge theories, e.g., in the path integral formalism. Although a particular set of canonical variables may look preferable from the physical point of view, it may not always appear reasonable from the mathematical point of view as a natural and convenient set of local canonical coordinates on a non-Euclidean phase space because it may create artificial (coordinate dependent) singularities in a dynamical description.
On the other hand, it could also happen that the physical phase space is hard to compute, and that the mathematically most convenient coordinates for describing the dynamics are hard to find. Therefore it seems natural to take a closer look at possible "kinematic" effects caused by the coordinate singularities in a generic parameterization of the physical phase space. Here we investigate classical Hamiltonian dynamics. A quantum mechanical description will be developed in the next section.
6.1. Gribov problem and the topology of gauge orbits
Gribov copies themselves do not have much physical meaning because they strongly depend on a concrete choice of a gauge fixing condition that is rather arbitrary. An "inappropriate" choice of the gauge condition can complicate a dynamical description. To illustrate what we mean by this statement, let us take a simple gauge model with three degrees of freedom whose dynamics is governed by the Lagrangian
$$L = \tfrac{1}{2}\dot{x}_1^2 + \tfrac{1}{2}(\dot{x}_2 - y)^2 - V(x_1). \tag{6.1}$$
The Lagrangian is invariant under the gauge transformations $x_2 \to x_2 + \xi$, $y \to y + \dot{\xi}$, while the variable $x_1$ remains invariant. The variable $y$ is the Lagrange multiplier since the Lagrangian does not depend on the velocity $\dot{y}$. We can exclude it from consideration at the very beginning. On the plane spanned by the other two variables $x_{1,2}$, the gauge orbits are straight lines parallel to the $x_2$ axis. Therefore any straight line that is not parallel to the $x_2$ axis can serve as a unique gauge fixing condition because it intersects each orbit precisely once. However, one is free to choose any gauge fixing condition, $\chi(x_1, x_2) = 0$, to remove the gauge arbitrariness. In the dynamical description, this amounts to a specific choice of the function $y(t)$ in
Fig. 7. (a) An illustration of an artificially created Gribov copying. The gauge fixing curve $AA'$ intersects each gauge orbit (the orbits being vertical parallel straight lines) precisely once. There is a one-to-one correspondence between the parameter $u$ of the gauge fixing curve and the gauge invariant variable $x_1$. In contrast, the curve $\gamma\gamma'$ does not intersect each gauge orbit once. The states labeled by the values of $u$ from the intervals $(u_1, u_2)$, $(u_2, u_3)$ and $(u_3, u_4)$ are gauge equivalent. Two of them must be discarded to achieve a unique parameterization of the gauge orbit space. If the first and third intervals are removed, then, in the $u$-parameterization, the configuration space would have two "holes". There is no one-to-one correspondence between $u$ and $x_1$. In fact, the function $u(x_1)$ is multi-valued and has, in this particular case, three branches. (b) Gribov problem in the SO(2) model. Here the gauge fixing curve $\chi(x_1, x_2) = 0$, specified by a continuous, single-valued and everywhere regular function $\chi$ on the plane such that $\partial_\theta\chi \neq 0$ at zeros of $\chi$, would intersect each gauge orbit at least twice, thus making the Gribov problem unavoidable in any non-invariant approach. The condition $\partial_\theta\chi \neq 0$ at zeros of $\chi(x_1, x_2)$ is necessary [68] to reestablish a canonical symplectic structure on the physical phase space parameterized by points of the surface $\sigma = \chi = 0$ in the total phase space.
the Euler–Lagrange equations of motion. Recall that the equations of motion do not impose any restrictions on the Lagrange multipliers in gauge models because of their covariance under gauge transformations. So the solutions depend on generic functions of time, the Lagrange multipliers, which can be specified so that the solutions fulfill a supplementary (or gauge) condition. Now let us take the parametric equations of the gauge fixing curve $x_{1,2} = f_{1,2}(u)$ and let $u$ range over the real line. One can, for instance, set $f_{1,2}(0) = 0$ and let $u$ equal the arc length of the curve counted in one direction from the origin, and the negative of the arc length when the latter is counted in the other direction traced out by the curve from the origin. The parameter $u$ describes the only physical degree of freedom in the model. It seems that the dynamics of $u$ and of the gauge invariant variable $x_1$ is the same modulo a functional relation between $u$ and $x_1$. This is, however, only partially true. If the gauge fixing curve intersects some gauge orbits more than once, some distinct values of $u$ would correspond to the same physical states. An example of such a "bad" gauge fixing curve is plotted in Fig. 7a. To achieve a one-to-one correspondence between physical states and the values of $u$, one has to remove certain values of $u$ from the real line, thus making "holes" in it. These holes are absent in the gauge invariant description via the variable $x_1$. The parameter space also has boundaries where dynamics may exhibit unusual properties. All these troubles have been created just by a "bad" choice of the gauge. We observe also that the function $u(x_1)$ is multi-valued in this case. An important point to realize is that the Gribov problem in the above model is fully artificial. The topological structure of the gauge orbits is such that it admits a gauge fixing that would allow
one to construct a system of Cartesian canonical coordinates on the physical phase space. The phase space of the only physical degree of freedom is a plane. All the complications of the dynamical description are caused by the "inappropriate" parameterization of the physical phase space. In the SO(N) model studied earlier, the gauge orbits are spheres centered at the origin. Their topology is not that of a Euclidean space. This is the reason for the physical phase space being non-Euclidean. The same applies to the Yang–Mills mechanical systems and the 2D Yang–Mills theory studied above. The gauge orbits in all those models are compact manifolds with non-trivial topology, which makes the coordinate singularities in the physical configuration space unavoidable, in contrast to the model with the translational gauge symmetry. The non-trivial topology of gauge orbits is, in general, the source of the Gribov obstruction to the reduced-phase-space path integral or canonical quantization in gauge theories. The reason is that the physical phase space is not Euclidean in this case, which, in turn, implies that a conventional representation of the canonical commutation relations is no longer valid and should be modified in accordance with the geometry of the phase space. The artificial Gribov problem, like in the model with translational gauge symmetry, does not lead to any difficulty in quantization because the physical phase space is Euclidean. Note that a bad choice of a gauge is always possible even in electrodynamics where no one would expect any obstruction to canonical quantization. Suppose we have the constraints $\sigma_a = \sigma_a(p, q)$. To parameterize the physical phase space, we introduce supplementary (gauge) conditions $\chi_a(q, p) = 0$ such that $\{\chi_a, \chi_b\} = 0$ and the matrix $M_{ab} = \{\chi_a, \sigma_b\}$ is not degenerate, i.e., the Faddeev–Popov determinant $\Delta_{FP} = \det M_{ab}$ is not zero.
A symplectic structure on the physical phase space can locally be reestablished by means of a canonical transformation [68] $p, q \to p^*, q^*;\ \tilde{p}_a, \tilde{q}_a = \chi_a$. In the new canonical variables we get $M_{ab} = \{\tilde{q}_a, \sigma_b\} = \partial\sigma_b/\partial\tilde{p}_a$. The condition $\Delta_{FP} \neq 0$ allows one to solve the equation $\sigma_a = 0$ for the non-physical canonical momenta $\tilde{p}_a = \tilde{p}_a(p^*, q^*)$ which, together with the conditions $\tilde{q}_a = 0$, introduces a parameterization of the physical phase space by the canonical coordinates $p^*, q^*$. The condition $\Delta_{FP} \neq 0$ is crucial for establishing a canonical symplectic structure on the physical phase space. There may not exist functions $\chi_a$ regular everywhere such that this condition is met everywhere on the surface $\sigma_a = \chi_a = 0$. This turns out to be the case when gauge orbits have a non-trivial topology. To illustrate the importance of gauge orbit topology for the physical phase space geometry, let us consider the SO(2) model (see also Fig. 7b). First we remark that one can always find a single-valued regular function $\chi(x)$ on the plane such that its zeros form a curve that intersects each circle (gauge orbit) precisely once. Thus, the condition $\chi = 0$ is the global cross section of the associated fiber bundle, or global gauge condition. From this point of view there is no difference between the model (6.1) and the SO(2) model. The difference appears when one attempts to establish the induced symplectic structure on the physical phase space parameterized by points of the surface $\sigma = \chi = 0$. The canonical symplectic structure exists if the Faddeev–Popov determinant $\{\sigma, \chi\}$ does not vanish [68]. In the model with the translational gauge symmetry we have $\sigma = p_2$ and, hence, the condition reads $\partial_2\chi \neq 0$, which can easily be achieved with the choice $\chi = x_2$ on the entire surface $\sigma = \chi = 0$ (the radial variable $r$ is fixed). In the SO(2) model we have $\sigma = (p, Tx) = p_\theta$, where $p_\theta$ is the canonical momentum for the angular variable $\theta$ on the plane. Therefore $\{\chi, \sigma\} = \partial_\theta\chi \neq 0$.
Let the function $\chi(x)$ vanish, say, at $\theta = 0$. Since $\partial_\theta\chi$ cannot be zero, the function $\chi$ changes sign when its argument passes through the point $\theta = 0$. The function $\chi$ must be a periodic function
of $\theta$ because it is single-valued on the plane. Therefore it has to change sign at least one more time before $\theta$ approaches $2\pi$, that is, there exists another point $\theta = \theta_0 \neq 0$ on the orbit $r = \mathrm{const}$ such that $\chi(\theta_0) = 0$. Thus, any curve $\chi(x) = 0$, $\{\sigma, \chi\} \neq 0$ would intersect each gauge orbit at least twice. The periodicity of $\chi$ along the directions tangent to the gauge orbits is due to the non-trivial topology of the latter.
Remark. If multi-valued gauge conditions are used to remove the gauge freedom (see, e.g., [22]), then the canonical transformation that separates the total phase space variables into the physical and non-physical canonical variables is generally associated with curvilinear coordinates [68]. There will be singularities associated with the points of the configuration space where the multi-valued $\chi$ is ill-defined. For instance, if we set $\chi = \theta$ in the SO(2) model, then the origin is the singular point in the physical sector described by the radial variable. This singularity is clearly associated with the conic structure of the physical phase space as we have seen in Section 3.4. Multi-valued gauges have an additional bad feature: they would, in general, lead to a multi-valued Faddeev–Popov effective action.
In the literature one can find another model which has been intensively studied in an attempt to resolve the Gribov obstruction [78–80]. This is the so-called helix model [77]. It is obtained by a kind of merging of the translational gauge model (6.1) and the SO(2) model. The Lagrangian reads
$$L = \tfrac{1}{2}(\dot{x}_3 - y)^2 + \tfrac{1}{2}\left[(\dot{x}_1 + y x_2)^2 + (\dot{x}_2 - y x_1)^2\right] - V. \tag{6.2}$$
It is invariant under simultaneous time-dependent rotations of the vector $(x_1, x_2)$ and translations of $x_3$:
$$y \to y + \dot{\xi}, \tag{6.3}$$
$$x_3 \to x_3 + \xi, \tag{6.4}$$
$$x_1 \to x_1\cos\xi - x_2\sin\xi, \tag{6.5}$$
$$x_2 \to x_2\cos\xi + x_1\sin\xi. \tag{6.6}$$
The potential $V$ is a function of two independent Casimir functions
$$C_1 = x_1\cos x_3 + x_2\sin x_3, \qquad C_2 = x_2\cos x_3 - x_1\sin x_3, \tag{6.7}$$
which are invariant under the gauge transformations. In fact, any gauge invariant function is a function of $C_{1,2}$. After excluding the Lagrange multiplier $y$ from the configuration space, we find that the gauge orbits in the model are helices extended along the $x_3$ axis. The topology of the gauge orbits in the model is that of the real line and thus trivial. There is no topological obstruction to finding a regular single-valued gauge fixing condition that would provide a Cartesian system of coordinates on the physical phase space. For instance, the plane $x_3 = 0$ intersects each gauge orbit, specified by fixed values of $C_{1,2}$, precisely once. No Gribov ambiguity occurs, in contrast to the models with topologically non-trivial gauge orbits studied above. The Gribov problem here can only be artificially created by a bad choice of the gauge. An example of a bad choice of the gauge is easy to find. Configurations in the plane $x_2 = 0$ would have infinitely many Gribov copies. Indeed, the plane $x_2 = 0$ intersects each helix winding around the third axis at the points related to one another by transformations $x_1 \to (-1)^n x_1$, $x_3 \to x_3 + \pi n$ with $n$ being any integer. The modular domain on the gauge fixing surface in configuration space is therefore a half-strip $x_1 \geq 0$,
$x_3 \in [-\pi, \pi)$. One can also make the number of copies depend on the configuration itself by taking, e.g., the gauge interpolating the bad and good gauges, $x_3 + a x_2 = 0$. When $a = 0$ we recover the good gauge, and when $a$ approaches infinity we get the bad gauge. Thus, the model exhibits no obstruction to either the reduced phase-space canonical or path integral quantization because the physical phase space in the model is obviously a four-dimensional Euclidean space. From this point of view the model does not differ from the translational gauge model discussed earlier.
Remark. In the gauge $x_2 = 0$, it looks like the physical phase space is not $\mathbb{R}^4$ because of the restrictions $x_1 \geq 0$ and $x_3 \in [-\pi, \pi)$. This is not the case. As one might see from the form of the Casimir functions (6.7), the gauge $x_2 = 0$ corresponds to the parameterization of the physical phase space by the canonical variables associated with the polar coordinates on the $C_{1,2}$-plane, while the gauge $x_3 = 0$ is associated with the natural Cartesian canonical coordinates on the physical phase space. The two parameterizations are related by a canonical transformation. In Section 3.4 it is shown that by going over to polar coordinates (as well as to any curvilinear coordinates) one cannot change the geometrical structure of the phase space. The artificial Gribov problem in this model is just a question of how to regularize the conventional Liouville path integral measure on the Euclidean phase space with respect to general canonical transformations. This, as a matter of fact, can be done in general [3,4]. As far as the particular gauge $x_2 = 0$ is concerned, one knows perfectly well how to change variables in the path integral (or in the Schrödinger equation) from the Cartesian to polar coordinates in the plane [81,82,118,16].
6.2. Arbitrary gauge fixing in the SO(2) model
Although a good choice of the gauge could greatly simplify the dynamical description of the physical degrees of freedom, we often use bad gauges for the reasons that either the geometry of gauge orbits is not explicitly known or the variables parameterizing the gauge orbit space and associated with a particular gauge (like the Coulomb gauge in Yang–Mills theory) have a convenient physical interpretation. Here we take a closer look at some dynamical artifacts that may occur through a bad choice of the gauge. These artifacts would be purely gauge dependent or, in other words, they are coordinate dependent, meaning that they can be removed by changing a parameterization of the gauge orbit space. However, the physical interpretation may also change considerably upon going over to new variables related to the initial ones by a non-linear transformation. For example, the transverse gluons are easy to describe in the Coulomb gauge, while it would be a hard task to do so using the gauge invariant loop variables $\mathrm{tr}\, P\exp[ig\oint(dx, A)]$ which can be used to parameterize the gauge orbit space in the Yang–Mills theory. We limit our consideration to the SO(2) model. The reason is, first of all, that a general case (meaning a general gauge in a general gauge theory) would be rather involved to consider in detail, and it is hard to believe that the artificially created Gribov-like problem is of great physical significance. Secondly, the idea is general enough to be extended to any gauge model. So, the gauge orbits are circles centered at the origin. The configuration space is a plane spanned by the vector variable $x$. Any gauge condition $\chi(x) = 0$ determines a curve on the plane $\mathbb{R}^2$ over which a physical variable ranges. The curve $\chi(x) = 0$ must cross each orbit at least once because a gauge choice is nothing but
a choice of a parameterization of the gauge orbit space. In the model under consideration, this implies that the curve has to go through the origin to infinity. Let us introduce a smooth parameterization of the gauge condition curve
$$x = x(u) = f(u), \qquad u \in \mathbb{R}, \tag{6.8}$$
where $f(0) = 0$ and $|f| \to \infty$ as $u \to \pm\infty$ so that $u$ serves as a physical variable which we can always choose to range over the whole real line. If $f_2 = 0$ and $f_1 = u$, we recover the unitary gauge considered above. Let the points $x$ and $x_s$ belong to the same gauge orbit, then $x_s = \Omega_s x$, $\Omega_s \in SO(2)$. Suppose the curve (6.8) intersects a gauge orbit at points $x = f(u)$ and $x_s = f(u_s)$. We have also $u_s = u_s(u)$ because $f(u_s) = \Omega_s f(u)$. If the structure of gauge orbits is assumed to be unknown, the function $u_s(u)$ can be found by solving the following equations:
$$\chi(\Omega_s f) = 0, \tag{6.9}$$
$$\Omega_s(u) f(u) = f(u_s(u)). \tag{6.10}$$
Eq. (6.9) is to be solved for $\Omega_s$ while $u$ is kept fixed. The trivial solution, $\Omega_s = 1$, always exists by the definition of $f$. All the solutions form a set $S_\chi$ of discrete residual gauge transformations. Eq. (6.10) determines an induced action of $S_\chi$ on the variable $u$ spanning the gauge fixing curve, i.e., it specifies the functions $u_s(u)$. The set $S_\chi$ is not a group because for an arbitrary $\chi$ a composition $\Omega_s\Omega_{s'}$ of two elements from $S_\chi$ might not belong to $S_\chi$ since it may not satisfy (6.9), while for each $\Omega_s$ there exists the inverse element $\Omega_s^{-1}$ such that $\Omega_s^{-1}\Omega_s = 1$. Indeed, suppose we have two different solutions $\Omega_s$ and $\Omega_{s'}$ to the system (6.9)–(6.10). The composition $\Omega_s\Omega_{s'}$ is not a solution to (6.9), i.e. $\chi(\Omega_s\Omega_{s'} f(u)) = \chi(\Omega_s f(u_{s'})) \neq 0$ because, in general, $f(u_{s'}) \neq f(u)$ whereas we only have $\chi(\Omega_s f(u)) = 0$. From the geometrical point of view, this simply means that, although the configurations $f$, $\Omega_s f$ and $\Omega_{s'} f$ are in the gauge fixing curve, the configuration $\Omega_s\Omega_{s'} f$ is not necessarily in it. The functions $u_s(u)$ determined by Eq. (6.10) do not have a unique analytic continuation to the covering space $\mathbb{R}$ isomorphic to the gauge fixing curve $x = f(u)$, $u \in \mathbb{R}$, otherwise the composition $u_s \circ u_{s'} = u_{ss'}(u)$ would be uniquely defined and, hence, one could always find an element $\Omega_{ss'} = \Omega_s\Omega_{s'}$ being a solution to (6.9), which is not the case. Moreover, a number of elements of
Moreover, a number of elements of ss{ s s{ S can depend on u. s To illustrate our analysis, let us take an explicit function f (u), "nd the functions u (u) and s investigate their analytic properties. Set f "!u , f "!c(2u #u) for u(!u and f "u, 1 0 2 0 0 1 f "cu for u'!u where c and u are positive constants. The curve is plotted in Fig. 7b (see the 2 0 0 curve BPB@ in it). It touches circles (gauge orbits) of radii r "u and r "u c , c "J1#c2 1 0 2 0 0 0 (the points Q and P in Fig. 7b, respectively). It intersects twice all circles with radii r(r and 1 r'r , whereas any circle with a radius from the interval r3(r , r ) has four common points 2 1 2 with the gauge condition curve. Therefore, S has one non-trivial element for s u3R XR , R "(!u /c , u /c ), R "(!R,!3u )X(u ,R) and three non-trivial elements 1 3 1 0 0 0 0 3 0 0 for u3R "(!3u ,!u /c )X(u /c , u ). In Fig. 7b the point P@ correspond to u"!3u , PA to 2 0 0 0 0 0 0 0 u"u , Q@ to u"!u /c and QA to u"u /c , i.e., R is the segment (Q@QA), R "(BP@)X(PAB@) 0 0 0 0 0 1 2 and R "(P@PQ@)X(QAPA). Since the points f (u ) and f (u) belong to the same circle (gauge orbit), the 3 s functions u have to obey the following equation: s f 2(u )"f 2(u) . (6.11) s
Denoting $S_\alpha = S_\chi$ for $u \in R_\alpha$, $\alpha = 1, 2, 3$, we have $S_1 = Z_2$, $u_s(u) = -u$; $S_2$ is determined by the following mappings of the interval $K_2 = (u_0/\gamma_0, u_0)$:
$$u_{s_1}(u) = -u, \qquad u_{s_1}: K_2 \to (-u_0, -u_0/\gamma_0); \tag{6.12}$$
$$u_{s_2}(u) = -2u_0 + \frac{\gamma_0}{\gamma}\left(u^2 - \frac{u_0^2}{\gamma_0^2}\right)^{1/2}, \qquad u_{s_2}: K_2 \to (-u_0, -2u_0); \tag{6.13}$$
$$u_{s_3}(u) = -2u_0 - \frac{\gamma_0}{\gamma}\left(u^2 - \frac{u_0^2}{\gamma_0^2}\right)^{1/2}, \qquad u_{s_3}: K_2 \to (-2u_0, -3u_0); \tag{6.14}$$
and for $S_3$ we get
$$u_s(u) = -2u_0 - \frac{\gamma_0}{\gamma}\left(u^2 - \frac{u_0^2}{\gamma_0^2}\right)^{1/2}: \quad (u_0, \infty) \to (-3u_0, -\infty). \tag{6.15}$$
The functions (6.13)–(6.14) do not have a unique analytic continuation to the whole domain $R_2$ (observe the square root function in them) and, hence, their composition is ill-defined. The mappings (6.12)–(6.14) do not form a group. Since they realize a representation of $S_\alpha$, $S_\alpha$ is not a group. The physical configuration space is, obviously, isomorphic to $K = \cup K_\alpha$, $K_\alpha = R_\alpha/S_\alpha$, i.e., $K_\alpha$ is a fundamental domain of $R_\alpha$ with respect to the action of $S_\alpha = S_\chi$ in $R_\alpha$, $R_\alpha = \cup \hat{R} K_\alpha$, where $\hat{R}$ ranges over $S_\alpha$. Upon solving (6.11) (or (6.9)–(6.10)) we have to choose a particular interval as the fundamental domain where the solutions are analytic functions. We have set $K_2 = (u_0/\gamma_0, u_0)$ in Eqs. (6.12)–(6.14). Another choice would lead to a different form of the functions $u_s$ (to another representation of $S_\chi$ in $R_2$). Setting, for example, $K_2 = (-2u_0, -u_0)$ we obtain from (6.11)
$$u_{s_1}(u) = -4u_0 - u, \qquad u_{s_1}: K_2 \to (-3u_0, -2u_0); \tag{6.16}$$
$$u_{s_2}(u) = -\frac{1}{\gamma_0}\left[u_0^2 + \gamma^2(2u_0 + u)^2\right]^{1/2}, \qquad u_{s_2}: K_2 \to \left(-u_0, -\frac{u_0}{\gamma_0}\right); \tag{6.17}$$
$$u_{s_3}(u) = \frac{1}{\gamma_0}\left[u_0^2 + \gamma^2(2u_0 + u)^2\right]^{1/2}, \qquad u_{s_3}: K_2 \to \left(\frac{u_0}{\gamma_0}, u_0\right). \tag{6.18}$$
To find the group elements $\Omega_s(u)$ corresponding to $u_s(u)$, one should solve Eq. (6.10). Setting $\Omega_s = \exp(-T\omega_s)$, where $T_{ij} = -T_{ji}$, $T_{12} = 1$, is the only generator of SO(2), and substituting (6.12)–(6.14) into (6.10), we find
$$\omega_{s_1}(u) = \pi; \tag{6.19}$$
$$\omega_{s_2}(u) = \frac{3\pi}{2} - \sin^{-1}\frac{u_0}{\gamma_0 u} - \tan^{-1}\gamma; \tag{6.20}$$
$$\omega_{s_3}(u) = \frac{\pi}{2} + \sin^{-1}\frac{u_0}{\gamma_0 u} - \tan^{-1}\gamma, \tag{6.21}$$
where $u \in K_2 = (u_0/\gamma_0, u_0)$. Elements of $S_{1,3}$ are obtained analogously. It is readily seen that $\Omega_{s_1}\Omega_{s_2} \neq \Omega_{s_3}$, etc., i.e., the elements $\Omega_s$ do not form a group. An alternative choice of $K_2$ results in a modification of the functions (6.19)–(6.21).
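The angles (6.19)–(6.21) and the failure of the composition law can be verified numerically. The sketch below assumes the illustrative values $u_0 = \gamma = 1$ and the convention $\Omega_s = \exp(-T\omega_s)$ with $T_{12} = -T_{21} = 1$, which is a counterclockwise rotation by $\omega_s$. The helper names `rotate`, `u_s`, `omega_s` and `on_curve` are introduced here for illustration only.

```python
import math

U0, GAMMA = 1.0, 1.0                      # illustrative values, not from the text
GAMMA0 = math.sqrt(1.0 + GAMMA**2)

def f(u):
    """Gauge fixing curve of the example (curve BPB' in Fig. 7b)."""
    if u < -U0:
        return (-U0, -GAMMA * (2.0 * U0 + u))
    return (u, GAMMA * u)

def rotate(omega, x):
    """Apply exp(-T*omega), T_12 = -T_21 = 1: counterclockwise rotation by omega."""
    c, s = math.cos(omega), math.sin(omega)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def u_s(k, u):
    """Mappings (6.12)-(6.14) for u in K_2 = (u0/gamma0, u0)."""
    root = (GAMMA0 / GAMMA) * math.sqrt(u**2 - U0**2 / GAMMA0**2)
    return {1: -u, 2: -2.0 * U0 + root, 3: -2.0 * U0 - root}[k]

def omega_s(k, u):
    """Rotation angles (6.19)-(6.21) for u in K_2."""
    a = math.asin(U0 / (GAMMA0 * u))
    b = math.atan(GAMMA)
    return {1: math.pi, 2: 1.5 * math.pi - a - b, 3: 0.5 * math.pi + a - b}[k]

def on_curve(x, tol=1e-9):
    """True if the point x lies on the gauge fixing curve x = f(u)."""
    straight = x[0] >= -U0 - tol and abs(x[1] - GAMMA * x[0]) < tol
    vertical = abs(x[0] + U0) < tol and x[1] >= -GAMMA * U0 - tol
    return straight or vertical

u = 0.85                                   # a sample point of K_2
for k in (1, 2, 3):
    # Eq. (6.10): Omega_s f(u) and f(u_s(u)) should coincide.
    print(k, rotate(omega_s(k, u), f(u)), f(u_s(k, u)))
# The composition Omega_{s1} Omega_{s2} moves f(u) off the gauge fixing curve.
print(on_curve(rotate(omega_s(1, u) + omega_s(2, u), f(u))))
```

The last line illustrates the statement that $\Omega_{s_1}\Omega_{s_2}$ is not a solution of (6.9) for this curve: the rotated configuration no longer satisfies the gauge condition.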
Thus, under an inappropriate gauge fixing, residual gauge transformations might not form a group (no composition for elements); the parameterization of $\mathrm{CS}_{\rm phys}$ appears to be complicated. One could assume that all the complications of the $\mathrm{CS}_{\rm phys}$ structure, $\mathrm{CS}_{\rm phys} \sim K$, found above have been caused by using gauge non-invariant variables for describing physical degrees of freedom. Indeed, we have chosen a "bad" gauge $\chi(x) = 0$ and gained a complicated set of residual gauge transformations (Gribov-like problem). However, one can easily turn the variable $u$ into a formally gauge-invariant one by means of a special canonical transformation. The set $S_\chi$ will appear again due to topological properties of such a canonical transformation rather than due to gauge fixing ambiguities. The coordinate singularities in the physical phase space parameterized by such gauge-invariant canonical variables will be present again. Since local canonical coordinates on the gauge invariant phase space (2.1) can only be specified modulo canonical transformations, it is natural to expect, and we will see this shortly, that the arbitrariness of gauge fixing may always be re-interpreted as the arbitrariness in choosing local canonical coordinates on the physical phase space. If one cares only about a formal gauge invariance of canonical variables, i.e., vanishing Poisson brackets of the canonical variables with the constraints, and ignores the geometrical structure of the physical phase space (2.1), then the choice of the canonical coordinates might lead to some artificial (coordinate dependent) singularities in the Hamiltonian formalism which are similar to those in the non-invariant approach.
6.3. Revealing singularities in a formally gauge-invariant Hamiltonian formalism
The gauge condition $\chi(x) = 0$ induces a parameterization of the physical phase space by some local canonical variables. To construct them, consider the following canonical transformation of $x$ and $p$:
$$x = \exp(T\theta) f(u); \tag{6.22}$$
$$p_\theta = pTx = \sigma, \qquad p_u = \frac{1}{2}(p, x)\frac{d}{du}\ln x^2, \tag{6.23}$$
where in (6.23) the derivative $dx/du = \exp(T\theta) f'(u)$ is expressed via $\theta(x)$ and $u(x)$. We also obtain that $\{\theta, p_\theta\} = \{u, p_u\} = 1$ (if $\{x_i, p_j\} = \delta_{ij}$); all other Poisson brackets vanish. We remark that the case $f_1 = u$ and $f_2 = 0$ corresponds to the polar coordinates on the plane, $u^2 = x^2$. The matrix $\exp(T\theta)$ rotates the ray $x_2 = 0$, $x_1 = u = r > 0$ so that it sweeps the entire plane. For arbitrary smooth functions $f_i(u)$, Eq. (6.22) defines a generalization of the polar coordinates. The plane is now swept by segments of the curve $x = f(u)$ rotated by the matrix $\exp(T\theta)$, where $\theta \in [0, 2\pi)$. The segments are traced out by the vector function $x = f(u)$ for those values of $u \in K \subset \mathbb{R}$ for which Eq. (6.22) determines a one-to-one correspondence between the components of $x$ and the new variables $u$ and $\theta$. For example, if $x = f(u)$ is the curve $\gamma O\gamma'$ plotted in Fig. 7b, then a possible choice of $K$ is the union of the sets $[0, u_2')$ and $[u_2, \infty)$, where $|f(u_2')| = |f(u_2)|$, but $u_2' < u_2$. The parameter $u$ is gauge-invariant since $f^2(u) = x^2$. We shall call such a change of variables associated with (or adjusted to) both the gauge condition chosen and the gauge transformation law. Note that we have already used such curvilinear coordinates. These are the spherical coordinates for the SO(N) model which are naturally associated with the unitary gauge $x_i = 0$, $i \neq 1$, or the functional curvilinear coordinates (5.32) associated with the Coulomb gauge in the 2D Yang–Mills theory.
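The change of variables (6.22) is easy to experiment with numerically. The sketch below is only an illustration: the smooth curve `example_curve` (with a hypothetical parameter `gamma`) and the helper names are assumptions made here, not taken from the text. It uses $\exp(T\theta) = \cos\theta + T\sin\theta$ with $T_{12} = -T_{21} = 1$, and shows that a gauge rotation of $x$ shifts $\theta$ only, while $x^2 = f^2(u)$ does not depend on $\theta$.

```python
import math

def exp_T(theta, v):
    """Apply exp(T*theta) = cos(theta) + T*sin(theta), with T_12 = -T_21 = 1, to v."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] + s * v[1], -s * v[0] + c * v[1])

def to_cartesian(theta, u, f):
    """Eq. (6.22): x = exp(T*theta) f(u)."""
    return exp_T(theta, f(u))

def example_curve(u, gamma=0.5):
    # Hypothetical smooth gauge curve through the origin, for illustration only.
    return (u, gamma * u**2)

theta, u, xi = 0.3, 1.2, 0.7
x = to_cartesian(theta, u, example_curve)
fu = example_curve(u)
# x^2 = f^2(u): the parameter u labels the orbit and is gauge-invariant.
print(x[0]**2 + x[1]**2, fu[0]**2 + fu[1]**2)
# A gauge rotation exp(T*xi) of x is the shift theta -> theta + xi at fixed u.
print(exp_T(xi, x), to_cartesian(theta + xi, u, example_curve))
```

Choosing `f = lambda u: (u, 0.0)` instead reproduces the ordinary polar coordinates mentioned above.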
So, given a gauge transformation law and a desired gauge condition, such curvilinear coordinates can be constructed in any gauge model by acting with a generic gauge group element on elements of the gauge fixing surface. The latter is subject to the only condition that each gauge orbit has at least one common point with it. The parameters of the gauge transformation and those spanning the gauge fixing surface are the new curvilinear coordinates. Clearly, the parameters of the gauge fixing surface become gauge invariant in such an approach. We postpone for a moment the analysis of topological properties of the curvilinear coordinates associated with the gauge transformation law and the gauge condition chosen, and complete the construction of the Hamiltonian formalism. Since $p_\theta$ coincides with the constraint, we conclude that $\theta$ is the non-physical variable in the model; $\sigma = p_\theta$ generates its shifts, whereas $\{\sigma, u\} = \{\sigma, p_u\} = 0$ and, hence, $u$ and $p_u$ are gauge-invariant. Using the decomposition
$$p = p_\theta\frac{Tx}{x^2} + p_u\frac{x}{\mu(u)}, \tag{6.24}$$
where $\mu(u) = (df/du, f)$, and the constraint $p_\theta = 0$ we derive the physical Hamiltonian
$$H_{\rm ph} = \left.\left(\frac{1}{2}p^2 + V(x^2)\right)\right|_{p_\theta = 0} = \frac{1}{2}\frac{f^2(u)}{\mu^2(u)}p_u^2 + V(f^2(u)). \tag{6.25}$$
Hamiltonian equations of motion generated by (6.25) provide a formally gauge-invariant dynamical description. Let us find the hidden set of transformations $S_\chi$. As we have pointed out, dynamics is sensitive to the phase space structure. Therefore, to complete the formally gauge-invariant description, one should describe the phase space parameterized by the local canonical variables $u$ and $p_u$. Let us forget for a moment about the gauge symmetry and the constraint $p_\theta = 0$ induced by it and consider relation (6.22) as a change of variables. We will be interested in the topological properties of the change of variables. There should be a one-to-one correspondence between points $x \in \mathbb{R}^2$ and $\theta, u$. The latter yields a restriction on admissible values of $\theta$ and $u$, $\theta \in [0, 2\pi)$ and $u \in K \subset \mathbb{R}$. To see this, we allow the variables $\theta$ and $u$ to have their values on the whole real axis and consider transformations $\theta, u \to \theta + \theta_s = \hat{R}\theta$, $u_s = \hat{R}u$ such that
$$x(\hat{R}\theta, \hat{R}u) = x(\theta, u). \tag{6.26}$$
We assume $f(u)$ to be a real analytic function on $\mathbb{R}$. Points $\hat{R}\theta, \hat{R}u$ of the $(u, \theta)$-plane are mapped to one point on the $x$-plane. The mapping (6.22) becomes one-to-one, i.e., it determines a change of variables, if one restricts values of $\theta$ and $u$ to the modular domain $\tilde{K} = \mathbb{R}^2/\tilde{S}$ where transformations from $\tilde{S}$ are defined by (6.26). The set $\tilde{S}$ is decomposed into the product $T_e \times S_\chi$ where elements of $T_e$ are translations of $\theta$ through the group manifold period,
$$T_e: \quad \theta \to \theta + 2\pi n, \quad u \to u, \quad n \in \mathbb{Z}, \tag{6.27}$$
and $S_\chi$ formally coincides with the set of residual gauge transformations in the gauge $\chi = 0$. Indeed, let $\Omega_s = \exp(T\omega_s(u))$ satisfy (6.9)–(6.10). Then we have $x(u, \theta) = \exp(T\theta)\Omega_s^{-1}\Omega_s f(u) = x(\hat{R}_s u, \hat{R}_s\theta)$ where
$$S_\chi: \quad \theta \to \hat{R}_s\theta = \theta - \omega_s(u), \quad u \to \hat{R}_s u = u_s(u). \tag{6.28}$$
Thus, $\tilde K \sim [0, 2\pi) \cup K$ with $K$ being the fundamental modular domain for the gauge $\chi = 0$. In the case of the polar coordinates, $S_\chi = Z_2$, $\omega_s = \pi$ and $u_s = -u$, hence $K \sim R_+$ (a positive semiaxis). Under the transformations (6.27), the canonical momenta (6.23) remain untouched, while
$$ p_\theta \to p_\theta , \quad p_u \to \left(\frac{du_s}{du}\right)^{-1} p_u \equiv p_{u_s} = \hat R_s p_u \tag{6.29} $$
under the transformation (6.28).
under the transformation (6.28). In the new canonical variables, a state with given values of canonical coordinates p and x corresponds to phase-space points (p , RK h, RK p , RK u), RK runs over h s s u s s S , provided h3[0, 2p). Therefore, values of the new canonical variables connected with each other s by the S -transformations are physically indistinguishable. s Consider a phase-space plane, where p "0 and h has a "xed value, and states h (p "0, h, RK p , RK u) on it. These states di!er from each other only by values of the angular variable h s u s (0, h, RK p , RK u)&(0, RK ~1h, p , u) where RK ~1h"h#u (u). If now we switch on the gauge syms u s s u s s metry, the angular variable becomes non-physical and, hence, the di!erence between all those states disappears. They correspond to the same physical state. Thus, the transformations u, p Pu , p s relate distinct points in the phase space spanned by p and u, which correspond to the u s u u very same physical state of the system. Therefore they should be identi"ed to describe PS in the 1):4 parameterization chosen. For the polar coordinates, we obviously get PS "cone(p). The conic 1):4 singularity is also present in the new variables (it is non-removable due to the non-trivial topology of the gauge orbits), but there appear additional singular points which are pure coordinate artifacts and merely related to the fact that the function u"u(r) (r is the radial variable on the plane) is multi-valued. There is no curvature at those points of the phase space. The transformations S are s nothing but the transformations which relate di!erent branches of the function u(r) to one another as one might see from (6.11) since f 2(u)"r2. One should emphasize that in the approach being developed the transformations RK 3S in s the (u, p )-plane cannot be regarded as the ones generated by the constraint p"p since u h Mp, uN"Mp, p N"0 in contrast to the gauge "xing description considered above. 
Physical variables $u$ are chosen so that the set $S_\chi$ determining their phase space coincides formally with the set of residual gauge transformations in the gauge fixing approach. Thus, all artifacts inherent to an inappropriate gauge fixing may well emerge in a formally gauge-invariant approach. To see them, we compare phase-space trajectories in the canonical variables $r = |x|$, $p_r = (x, p)/r$ and $u$, $p_u$. They are connected by the canonical transformation $r = r(u) = |f(u)|$, $p_r = r p_u/\mu = p_u (dr/du)^{-1}$. We also assume the function $f$ to be differentiable so that $dr/du = 0$ only at two points $u = u'_{1,2}$, and $dr/du > 0$ as $u < u'_2$ and $u > u'_1$, while $dr/du < 0$ if $u \in (u'_2, u'_1)$. Our assumptions mean that the curve $x = f(u)$, $u \geq 0$, goes from the origin, crosses the circle $|x| = r_1 = r(u_1)$ at $x = f(u_1)$ and reaches the circle $|x| = r_2 = r(u'_2)$, touches it at $x = f(u'_2)$, returns back to the circle $|x| = r_1$, and, after touching it at the point $x = f(u'_1)$, tends to infinity, crossing the circle $|x| = r_2$ at $x = f(u_2)$. An example of such a curve is given in Fig. 7b (the curve $\gamma O\gamma'$) and in Fig. 8 (right).

In a neighborhood of the origin, $PS_{phys}$ has the conic structure as we have already learned. This local structure is preserved upon the canonical transformation to the variables $u$, $p_u$ because it is a smooth and one-to-one mapping of the strip $r \in (0, r_1)$ on $u \in (0, u_1)$. The same holds for the map of the half-plane $r > r_2$ onto the half-plane $u > u_2$. Troubles occur in the domain $r \in (r_1, r_2)$ where the inverse function $u = u(r)$ becomes multi-valued; it has three branches in our particular case. States
Fig. 8. Phantom trajectories caused by coordinate singularities occurring through a bad parameterization of the physical phase space.
belonging to the strips $u \in (u_1, u'_2)$, $u \in (u'_2, u'_1)$ and $u \in (u'_1, u_2)$ are physically equivalent because there are transformations from $S_\chi$ mapping the strips on each other and leaving points $p_r$, $r \in (r_1, r_2)$ untouched.

To investigate what happens to phase-space trajectories in the region $u \in (u_1, u_2)$ of the phase space, consider a motion with a constant momentum $p_r$ and suppose that the particle is outgoing from the origin $r = 0$. On the $(p_u, u)$-plane, the particle motion corresponds to a point running along a curve going from the origin $u = 0$. As soon as the phase-space point crosses the line $u = u_1$, there appear two "phantom" phase-space trajectories outgoing from the point $p_u = 0$, $u = u'_1$, because the point $u_1$ is $S_\chi$-equivalent to $u'_1$. Note also that $p_{u'_1} = p_{u'_2} = 0$ since $dr/du = 0$ at $u = u'_{1,2}$. The process is shown in Fig. 8. The interval $(r_1, r_2)$ is represented by the three intervals $(u_1, u'_2)$, $(u'_2, u'_1)$ and $(u'_1, u_2)$ in the $u$-parameterization. They are ranges of the three branches of the multi-valued function $u(r)$. The dashed and dotted lines in the figure show the "splitting" of the points $r_1$ and $r_2$, respectively. The trajectories at $u = u'_1$ appear right after the system crosses the line $u = u_1$. So a single trajectory in the $r$-parameterization is represented by the three trajectories in the $u$-representation on the interval $(r_1, r_2)$.

If $u_{s_1}$ and $u_{s_2}$ map $(u_1, u'_2)$ onto $(u'_2, u'_1)$ and $(u'_1, u_2)$, respectively, so that $r(u) = r(u_{s_1}) = r(u_{s_2})$, $u \in (u_1, u'_2)$, then the "phantom" trajectories, shown in Fig. 8 as $\gamma_{s_1}$ and $\gamma_{s_2}$, are described by the pairs $\hat R_{1,2} p_u$, $\hat R_{1,2} u$ (cf. (6.29)), where the point $p_u$, $u$ traces out the trajectory $\gamma$ in the phase space region $u \in (u_1, u'_2)$. Since $du_{s_1}/du < 0$ and $du_{s_2}/du > 0$, the "phantom" trajectory $\hat R_2 p_u$, $\hat R_2 u$ goes from the origin, while the point $\hat R_1 p_u$, $\hat R_1 u$ traces out the trajectory in the opposite direction.
Note that the momentum $\hat R_1 p_u$ is negative for this trajectory since $dr/du$ is negative in the interval $(u'_2, u'_1)$. The points $p_u$, $u$ and $\hat R_1 p_u$, $\hat R_1 u$ arrive at $p_u = p_{u'_2} = 0$, $u = u'_2$ at the same time and annihilate each other, whereas a "phantom" particle moving along the branch $\hat R_2 p_u$, $\hat R_2 u$ approaches the line $u = u_2$. In the next moment of time the system leaves the interval $r \in (r_1, r_2)$ (or $u \in (u_1, u_2)$).
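The three-branch structure described above can be made concrete with a small numerical sketch. The radial profile `r(u)` below is a hypothetical cubic chosen only for illustration (it is not taken from the text); it satisfies the stated assumptions with $u_1 = 0.5$, $u'_2 = 1$, $u'_1 = 2$, $u_2 = 2.5$, $r_1 = 2$ and $r_2 = 2.5$. The helper names `bisect`, `branches` and `momenta` are likewise illustrative.

```python
# Hypothetical radial profile r(u) = |f(u)|, chosen only for illustration:
# dr/du = 3 (u - 1)(u - 2), so u2' = 1 is a local maximum (r2 = 2.5) and
# u1' = 2 is a local minimum (r1 = 2.0); also r(0.5) = r1 and r(2.5) = r2.
def r(u):
    return u**3 - 4.5 * u**2 + 6.0 * u


def drdu(u):
    return 3.0 * (u - 1.0) * (u - 2.0)


def bisect(target, lo, hi, steps=80):
    """Solve r(u) = target on [lo, hi], where r is monotonic."""
    increasing = r(hi) > r(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (r(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def branches(radius):
    """Three values of u with r(u) = radius, valid for r1 < radius < r2."""
    return [
        bisect(radius, 0.5, 1.0),  # strip (u1, u2')
        bisect(radius, 1.0, 2.0),  # strip (u2', u1'), where dr/du < 0
        bisect(radius, 2.0, 2.5),  # strip (u1', u2)
    ]


def momenta(radius, p_r):
    """p_u = (dr/du) p_r evaluated on each of the three branches."""
    return [drdu(u) * p_r for u in branches(radius)]
```

For an outgoing particle ($p_r > 0$) the middle branch carries a negative $p_u$, which is the sign reversal attributed above to the trajectory $\hat R_1 p_u$, $\hat R_1 u$.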
Such "branching" of classical phase-space trajectories is a pure artifact of an inappropriate parameterization of $PS_{phys}$ (or, as we have argued above, of a bad gauge fixing). It has to be removed by gluing all the "phantom" trajectories (branches). In so doing, we cannot however avoid breaking the trajectories at the singular points $u = u'_{1,2}$. Indeed, consider trajectories approaching the line $u = u_1$ with different momenta $p_r$ from the origin and crossing it. Since the motion in the phase-space strips $(u'_2, u'_1)$ and $(u'_1, u_2)$ is physically equivalent to the one in the strip $(u_1, u'_2)$, we can cut out those two strips from the physical domain of the local canonical variables $u$ and $p_u$. The state $u = u'_2$, $p_u = 0$ is equivalent to the state $u = u_2$, $p_u = 0$, so we can glue them together making just a point-like joint between two phase-space domains $u < u'_2$ and $u > u_2$. In principle, we could glue the edges of the cut shown by the dotted line in the bottom of Fig. 8 since the phase-space points in the vicinity of $u = u'_2$ are $S_\chi$-equivalent to those in the vicinity of $u = u_2$ and, therefore, correspond to the same physical states. This would restore the original conic structure of the physical phase space which certainly cannot depend on the parameterization. However, the continuity of the phase-space trajectories is lost. Every trajectory approaching the line $u = u'_2$ from the origin would fall into the point $p_u = 0$ on this line because $p_u = (dr/du)\,p_r$ and $dr/du$ vanishes at $u = u'_2$. So there is no trajectory that could cross this line with non-zero momentum. On the other hand, trajectories approaching the line $u = u_2$ from infinity can have a non-zero momentum. Therefore we always gain the discontinuity by gluing the lines $u = u'_2$ and $u = u_2$. The artificial attractor at the phase-space point $p_u = 0$, $u = u'_2$ corresponds to one of the zeros of the Faddeev–Popov determinant, $\mu(u'_{1,2}) = 0$. It is, obviously, absent in another gauge or, as we have just learned, in another parameterization of the physical phase space.

We conclude that the use of formally gauge invariant canonical variables (i.e., those whose Poisson bracket with the constraints vanishes) may well exhibit the same type of singularities as the non-invariant approach based on the gauge fixing. For this reason, it is of great importance to study the geometrical structure of the physical phase space before introducing any explicit parameterization of it either via gauge fixing or by local formally gauge invariant canonical coordinates in order to avoid unnecessary (artificial) complications associated with a bad parameterization.

6.4. Symplectic structure on the physical phase space

The existence of the singularities in any parameterization of the physical phase space by a set of canonical variables naturally leads to the question whether one could get around this trouble by using local non-canonical coordinates. The answer is affirmative, although it does not come for free. The idea is a generalization of the approach proposed in Section 3.3. Suppose we know a set of all independent Casimir functions $C_i(q)$ in a gauge theory, where $q$ labels points in the total configuration space. Clearly, the values of the Casimir functions parameterize the gauge orbit space. We also assume $C_i(q)$ to be regular on the entire configuration space. Then we can introduce another set of variables $P_i(q, p) = \langle p, \partial_q C_i\rangle$, where $\langle\ ,\ \rangle$ is an inner product such that the phase-space functions $P_i$ are invariant under gauge transformations on the phase space spanned by $q$ and $p$. The canonical symplectic structure in the total phase space would induce a non-canonical symplectic structure on the physical phase space spanned by variables $C_i$ and $P_i$
$$ \{C_i, C_j\} = 0 , \quad \{C_i, P_j\} = D_{ij}(C) , \quad \{P_i, P_j\} = \bar D_{ij}(P, C) , \tag{6.30} $$
where the functions $D_{ij}$ and $\bar D_{ij}$ depend on the structure of the constraint algebra. The Hamiltonian dynamics can be reformulated in terms of these gauge invariant variables with the symplectic structure (6.30) just as has been done in Section 3.3 for the simplest case. If the Hamiltonian is regular in the total phase space, classical phase-space trajectories $C_i(t)$, $P_i(t)$ do not have any singularities because they are regular gauge invariant functions on the total phase space. In this way one can always circumvent the coordinate singularities in classical theory. However, the induced symplectic structure would vanish at certain points, just as the right-hand side of Eq. (3.34) vanishes at $Q = 0$. For the model discussed in Section 4, $C_i(x) = \mathrm{tr}\, x^{\nu_i}$, where $\nu_i$ are degrees of the independent Casimir polynomials. So, $P_i = \mathrm{tr}(p\,x^{\nu_i - 1})$. For groups of rank 2, these variables are related to $\Phi_i$ and $\pi_i$ introduced in Section 4.6 by a coordinate transformation. It is not hard to be convinced that, for instance, the function $D_{ij}$ vanishes for some values of $C_i$. Using the gauge invariance of $C_i$ one can show that the singularities of the symplectic structure occur exactly at those configurations of $C_i$ that correspond to values of $x = h$ on the boundary of the Weyl chamber, $C_i = C_i(x) = C_i(h)$ (cf. Section 7.4).

Similarly, in the SU(2) Yang–Mills theory in two dimensions, one can take $C(A) = \mathrm{tr}\, P\exp(ig\oint dx\, A)$ and $P = \langle E, \delta/\delta A\rangle C(A)$. Then the symplectic structure reads $\{C, P\} = 1 - C^2$ (after an appropriate rescaling of $C$ and $P$ by some constants depending on $g$ and $l$). Due to the gauge invariance, $C \sim \cos[\pi(a, \omega)/a_0]$, where $\omega = \tau_3/2$, and $\langle E, \delta/\delta A\rangle \sim (p_a, \partial/\partial a)$. Zeros of the symplectic structure are obviously related to the boundary of the Weyl cell where the Polyakov loop variable attains its maximal (minimal) values. So, in this approach the gauge invariant induced symplectic structure inherits the information about the physical phase space structure.
In contrast to the simplest case (3.34), the symplectic structure (6.30) may no longer have a Lie algebra structure, which poses substantial technical difficulties in its quantization because it is hard to find a representation of the corresponding commutation relations. In Yang–Mills theory, with each spatial loop one can associate a Casimir function, being the trace of the path-ordered exponential of a generic connection along the spatial loop. These functionals form an overcomplete set of gauge invariant variables (there are identities between them [83]) that can be used to parameterize the orbit space. A quantum mechanical description in terms of loop variables can be developed (see, e.g., [83] for a review) but it is still technically complicated in practical use. The symplectic structure based on loop variables has been proposed to quantize gravity [84,85] (see [86–88] for advances in this approach).
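As a minimal numerical sketch of how an induced bracket of the type (6.30) degenerates, one can take a single Casimir function on a Euclidean configuration space, $C = (x, x)$, with $P = (p, \partial_x C) = 2(p, x)$. The canonical bracket then gives $\{C, P\} = 4C$, which vanishes at $C = 0$. The normalization of these variables is an assumption made here for illustration and need not coincide with that of Eq. (3.34); the function names are hypothetical.

```python
def poisson_bracket(F, G, x, p, h=1e-5):
    """Canonical bracket {F, G} at (x, p), estimated by central differences."""

    def partial(func, which, i):
        # which = 0 differentiates with respect to x_i, which = 1 with respect to p_i
        up = [list(x), list(p)]
        dn = [list(x), list(p)]
        up[which][i] += h
        dn[which][i] -= h
        return (func(*up) - func(*dn)) / (2.0 * h)

    total = 0.0
    for i in range(len(x)):
        total += partial(F, 0, i) * partial(G, 1, i)
        total -= partial(F, 1, i) * partial(G, 0, i)
    return total


def casimir(x, p):
    """C = (x, x); the momentum argument is unused."""
    return sum(xi * xi for xi in x)


def casimir_momentum(x, p):
    """P = (p, dC/dx) = 2 (p, x)."""
    return 2.0 * sum(pi * xi for pi, xi in zip(p, x))
```

Both functions are quadratic, so the central differences are exact up to rounding, and the bracket can be compared directly with $4C$ at any phase-space point, including the origin where it vanishes.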
7. Quantum mechanics and the gauge symmetry

Upon going over to a quantum mechanical description of gauge systems the following main questions are to be put forward. First, can one promote first-class constraints into operator equalities? Second, can the non-physical variables be excluded before a canonical quantization? By canonical quantization we mean the procedure of promoting canonical symplectic coordinates $p_i$ and $q_i$ in the phase space of the system into self-adjoint operators $\hat p_i$ and $\hat q_i$ which satisfy the Heisenberg commutation relations
$$ [\hat q_j, \hat p_k] = i\hbar\{q_j, p_k\} = i\hbar\,\delta_{jk} , \tag{7.1} $$
$$ [\hat q_j, \hat q_k] = i\hbar\{q_j, q_k\} = 0 , \quad [\hat p_j, \hat p_k] = i\hbar\{p_j, p_k\} = 0 , \tag{7.2} $$
where $\hbar$ is the Planck constant. The canonical operators can be realized as linear operators in a Hilbert space. The states $|\psi\rangle$ of the system are vectors of the Hilbert space. For instance, one can take the representation of the Heisenberg algebra in the space of square integrable complex functions $\langle q|\psi\rangle = \psi(q)$, $\int dq\,|\psi|^2 < \infty$. Then
$$ \langle q|\hat q_j|\psi\rangle = q_j\,\psi(q) , \quad \langle q|\hat p_j|\psi\rangle = -i\hbar\,\partial_j\psi(q) , \tag{7.3} $$
where $\partial_j$ stands for the partial derivative $\partial/\partial q_j$. One should emphasize that the self-adjointness of the canonical operators $\hat p_i$ and $\hat q_i$ is guaranteed by the fact that the phase space is a Euclidean space and $p$, $q$ refer to the Cartesian system of coordinates on it. The time evolution of the system is described by the Schrödinger equation
$$ i\hbar\,\partial_t|\psi(t)\rangle = \hat H|\psi(t)\rangle . \tag{7.4} $$
Here $\hat H$ is the Hamiltonian operator which is obtained from the classical Hamiltonian by replacing the canonical variables by the corresponding operators.

The quantum Hamiltonian obtained in such a way is by no means unique. Since the canonical operators are non-commutative, there is, in general, a great deal of operator ordering ambiguity. The condition of hermiticity of $\hat H$ is not generally sufficient to uniquely specify the operator ordering. In addition, one should also keep in mind that any quantization recipe is only a guess for the right theory. Nature is quantum. One should start, in fact, from quantum mechanics and derive all the properties of our classical world from it by means of the classical approximation, i.e., when the effects of the non-commutativity of the canonical operators are negligible. This can be achieved by studying the formal limit in which the Planck constant vanishes. Unfortunately, we do not have enough experience to postulate the quantum laws prior to the classical ones. For this reason we use various quantization procedures and believe that by means of them we guess the quantum physics right.
So, the quantum Hamiltonians are, in principle, allowed to have any quantum corrections (of higher orders of $\hbar$) which disappear in the classical limit. These corrections can be decided either experimentally, by observing the energy spectrum of the system, or, sometimes, theoretically, by analyzing the self-consistency of the quantum theory, meaning that the quantum theory obtained by means of a certain quantization rule does not contain any internal contradiction, nor does it contradict some fundamental theoretical principles which we believe to be true and superior.

Canonical quantization fulfills the correspondence principle. This can be most easily seen from the Heisenberg representation of the time evolution
$$ i\hbar\,\frac{d}{dt}\hat F = [\hat F, \hat H] , \tag{7.5} $$
where $\hat F$ is any operator constructed out of the canonical operators. In the formal limit $\hbar \to 0$, $(i\hbar)^{-1}[\ ,\ ] \to \{\ ,\ \}$, as follows from the canonical commutation relations, and the Heisenberg equations turn into the Hamilton equations of classical mechanics. The Schrödinger and Heisenberg representations of the time evolution are related through the unitary transformation
$$ |\psi(t)\rangle = \hat U_t|\psi\rangle , \quad \hat F(t) = \hat U_t^\dagger \hat F\,\hat U_t , \quad \hat U_t = e^{-it\hat H/\hbar} . \tag{7.6} $$
Here the states with the time label and the operators without it refer to the Schrödinger picture, while the states without the time label and the operators with it belong to the Heisenberg picture.
The numerical values of the amplitudes $\langle\psi|\hat F(t)|\psi'\rangle = \langle\psi(t)|\hat F|\psi'(t)\rangle$ do not depend on the picture in which they are computed.

Having made all the above definitions and reservations about them, we can now proceed to answer the questions about quantization of gauge systems. The answer to the first question can be anticipated through the analysis of the simplest gauge model where the gauge symmetry is just a translation of one of the Cartesian coordinates spanning the configuration space of the system. The constraint coincides with one of the canonical momenta, $\sigma = p = 0$. We cannot promote this classical equality into the operator equality $\hat p = 0$ because this would be in conflict with the canonical commutation relation (7.1): $\hat q\hat p - \hat p\hat q = i\hbar$. So, if one wants to have a quantum theory whose classical limit complies with the existence of the first-class constraints, one should restrict the physical states to those annihilated by the operator version of the constraints
$$ \hat\sigma_a|\psi\rangle = 0 . \tag{7.7} $$
This recipe has been proposed in the works of Dirac [89] (see also [6]) and Bergmann [90]. Its consistency is guaranteed by the properties of the first-class constraint algebra
$$ [\hat\sigma_a, \hat\sigma_b] = \hat f_{ab}{}^{c}\,\hat\sigma_c , \quad [\hat\sigma_a, \hat H] = \hat f_a{}^{b}\,\hat\sigma_b , \tag{7.8} $$
where $\hat f_{ab}{}^{c}$ and $\hat f_a{}^{b}$ are some functions of canonical operators. One should remark that the constraints may also exhibit the operator ordering ambiguity upon promoting them into operators. Therefore one of the conditions which should be imposed on constraints is that the constraints remain in involution (7.8) upon quantization. This is necessary for the consistency of the Dirac rule (7.7). Sometimes it turns out to be impossible to fulfill this condition. This is known as the quantization anomaly of first-class constraints. An example of such an anomaly is provided by Yang–Mills theory with chiral massless fermions [91].
In other theories, e.g., the string theory, the condition of the absence of the anomaly may impose restrictions on physical parameters of the theory (see, e.g., [92]). In what follows we shall always deal with gauge theories where the constraints generate linear gauge transformations in the configuration space: $q \to \Omega(\omega)q$. In the Schrödinger picture, the Dirac condition means the gauge invariance of the physical states
$$ e^{i\omega_a\hat\sigma_a}\,\psi(q) = \psi(\Omega(\omega)q) = \psi(q) . \tag{7.9} $$
The norm of the Dirac states is proportional to the volume of the gauge orbit through a generic point $q$ because the wave function (7.9) is constant along the gauge orbit. An apparent difficulty within the Dirac quantization scheme is a possible non-normalizability of the physical states. If the gauge orbits are non-compact, then the norms are divergent. Even if the gauge orbits are compact, the norm can still be divergent if the number of non-physical degrees of freedom is infinite, as in gauge field theories. This means, in fact, that the physical states do not belong to the original Hilbert space. In the simple case, when the constraint coincides with a canonical momentum, the problem can be resolved by discarding the corresponding degree of freedom. This does not lead to any contradiction because the wave function does not depend on one of the Cartesian coordinates. So this coordinate can be excluded at the very beginning, i.e., before the canonical quantization (7.1)–(7.2). The existence of the constraint means that the corresponding variable is non-physical.
It belongs to the non-physical configuration space which is orthogonal to the physical one. The non-physical degrees of freedom cannot affect any physical process. The divergence of the norm, on the other hand, is exactly caused by the integration over the non-physical space. Therefore in Cartesian coordinates the integral over non-physical variables can be omitted without any effect on the physical amplitudes. This procedure may not be consistent if the non-physical degrees of freedom are described by curvilinear coordinates. In this case the problem amounts to our second question about excluding the non-physical variables before quantization. If the number of non-physical degrees of freedom is finite and the gauge orbits are compact, there is no problem with the implementation of the Dirac rule. In the case of gauge field theory, the number of non-physical degrees of freedom is infinite. For compact gauge groups the norm problem can, for instance, be resolved by introducing a finite lattice regularization. After factorizing the volume of the gauge orbits in the scalar product, one removes the regularization.

Now we turn to the second question. Here the crucial observation, made by Dirac, is that the canonical quantization is, in general, consistent when applied with the dynamical coordinates and momenta referring to a Cartesian system of axes and not to more general curvilinear coordinates [1]. We have seen that in gauge theories physical phase space coordinates are typically not Cartesian coordinates, and the physical phase space is often a non-Euclidean space. So the canonical quantization of the reduced phase space might have internal inconsistencies. Another important observation, which follows from our analysis of the physical phase space in gauge models, is that the parameterization of the physical phase space is defined modulo general canonical transformations. Quantization and canonical transformations are non-commutative operations, in general.
On the other hand, there are infinitely many ways to remove non-physical variables before quantization. Various parameterizations of the physical phase space obtained in such a way are related to one another by canonical transformations. Thus, the canonical quantization after the elimination of non-physical variables may lead to a quantum theory which depends on the parameterization chosen [9,93]. Clearly, this indicates a possible theoretical inconsistency of the approach since quantum mechanics of the physical degrees of freedom cannot depend on the way the non-physical variables have been excluded, i.e., on the chosen gauge. We shall illustrate our general preceding remarks with explicit examples of gauge models.

Remark. The non-commutativity of the canonical quantization and canonical transformations does not mean that it is impossible to develop a parameterization independent (coordinate-free) quantum theory on the physical phase space (2.1). Actually, it can be done for constrained systems in general [122–124]. A naive application of the canonical quantization, which is often done in physical models, is subject to this potential problem, while other methods may still work (e.g., the Bohr–Sommerfeld semiclassical quantization applies to non-Euclidean phase spaces).

7.1. Fock space in gauge models

The Bohr–Sommerfeld semiclassical quantization has led us to the conclusion that the geometry of the physical phase space affects the spectrum of the harmonic oscillator. Let us now verify whether our semiclassical analysis is compatible with the gauge invariant approach due to Dirac. Consider first the SO(N) model. We shall not quantize the Lagrange multipliers since they represent pure non-physical degrees of freedom. Only the canonical variables $x$ and $p$ are promoted
to the self-adjoint operators $\hat x$ and $\hat p$ satisfying the Heisenberg commutation relations. In what follows we shall also assume units in which the Planck constant $\hbar$ is one. When needed it can always be restored from dimensional arguments. Let us introduce a new set of operators
$$ \hat a = (\hat p - i\hat x)/\sqrt{2} , \quad \hat a^\dagger = (\hat p + i\hat x)/\sqrt{2} , \tag{7.10} $$
which are called the destruction and creation operators, respectively. The dagger stands for the hermitian conjugation. The operators (7.10) satisfy the commutation relations
$$ [\hat a_j, \hat a_k^\dagger] = \delta_{jk} , \quad [\hat a_j, \hat a_k] = [\hat a_j^\dagger, \hat a_k^\dagger] = 0 . \tag{7.11} $$
An orthonormal basis of the total Hilbert space is given by the states
$$ |n_1, n_2, \ldots, n_N\rangle \equiv |n\rangle = \prod_{k=1}^{N} \frac{(\hat a_k^\dagger)^{n_k}}{\sqrt{n_k!}}\,|0\rangle , \quad \hat a_k|0\rangle \equiv 0 , \quad \langle 0|0\rangle = 1 , \tag{7.12} $$
where $n_k$ are non-negative integers. In this representation the Hamiltonian of an isotropic harmonic oscillator has the form
$$ \hat H = \tfrac{1}{2}(\hat a^\dagger\hat a + \hat a\hat a^\dagger) = \hat a^\dagger\hat a + N/2 . \tag{7.13} $$
The Dirac physical subspace is defined by the condition that the operators of constraints annihilate any state from it:
$$ \hat\sigma_a|\Phi\rangle = (\hat a^\dagger, T_a\hat a)|\Phi\rangle = 0 . \tag{7.14} $$
There is no operator ordering ambiguity in the constraints thanks to the antisymmetry of the matrices $(T_a)_{jk}$.

First of all we observe that the vacuum state $|0\rangle$ belongs to the physical subspace since it is annihilated by the constraints. Hence, any physical state can be constructed by applying an operator $\hat\Phi$ that commutes with the constraints, $[\hat\Phi, \hat\sigma_a] = 0$, to the vacuum state. In fact, it is sufficient to assume that the commutator vanishes weakly, i.e., $[\hat\Phi, \hat\sigma_a] \sim \hat\sigma_a$, to guarantee that $\hat\sigma_a\hat\Phi|0\rangle = 0$. However, it is clear that any state can be obtained by applying a function only of the creation operators to the vacuum state. Therefore, $\hat\Phi$ is also a function only of the creation operators. Since the constraints are linear in the destruction operators, their commutator with $\hat\Phi$ cannot depend on the constraints and, therefore, has to vanish.

To describe all possible operators that commute with the constraints, we observe that the constraints generate SO(N)-rotations of the destruction and creation operators. This follows from the commutation relations
$$ [\hat\sigma_a, \hat a] = -T_a\hat a , \quad [\hat\sigma_a, \hat a^\dagger] = -T_a\hat a^\dagger . \tag{7.15} $$
Thus, the operator $\hat\Phi$ must be a gauge invariant function of the creation operators. This holds in general. Operators that commute with the operators of constraints are gauge invariant. This is a quantum version of the analogous statement in classical theory: The Poisson bracket of gauge invariant quantities with the constraints vanishes. The correspondence principle is fulfilled for observables.
Returning to the model, one can say that $\hat\Phi$ is a function of independent Casimir operators built of $\hat a^\dagger$. For the fundamental representation of the group SO(N) there is only one independent Casimir operator which is $(\hat a^\dagger)^2$. Note that the system has only one physical degree of freedom. The
powers of this operator applied to the vacuum state form a basis in the physical subspace [10]
$$ |\Phi_n\rangle = \left[\frac{4^n\, n!\;\Gamma(n + N/2)}{\Gamma(N/2)}\right]^{-1/2} [(\hat a^\dagger)^2]^n\,|0\rangle . \tag{7.16} $$
The coefficients have been chosen so that $\langle\Phi_k|\Phi_n\rangle = \delta_{kn}$. The basis vectors (7.16) are also eigenvectors of the oscillator Hamiltonian. From the commutation relation
$$ [\hat a^\dagger\hat a, (\hat a^\dagger)^2] = 2(\hat a^\dagger)^2 , \tag{7.17} $$
the eigenvalues follow
$$ E_n = 2n + N/2 , \tag{7.18} $$
that is, the distance between energy levels is doubled. This effect has been observed in the semiclassical quantization of the system. It has been caused by the conic structure of the physical phase space. Here we have established it again using the explicitly gauge invariant approach. The vacuum energy depends on $N$, while in the Bohr–Sommerfeld approach it does not because the physical phase space structure and the physical classical Hamiltonian do not depend on $N$.

Let us now turn to gauge systems with many physical degrees of freedom. In classical theory we have seen that the non-Euclidean structure of the physical phase space causes a specific kinematic coupling between the physical degrees of freedom, because of which only collective excitations of the physical degrees of freedom occur. The kinematic coupling has also been shown to have a significant effect on the semiclassical spectrum of the physical excitations. Now we can verify whether our conclusion is consistent with the Dirac approach. We take first the gauge model where the total configuration space is a Lie algebra and the action of the gauge group in it is the adjoint action of the group in its Lie algebra. Introducing the operators $\hat a = \hat a_b\lambda_b$, the Dirac condition for the gauge invariant states can be written in the form
$$ \hat\sigma_b|\Phi\rangle = f_{bcd}\,\hat a_c^\dagger\hat a_d\,|\Phi\rangle = 0 . \tag{7.19} $$
Due to the antisymmetry of the structure constants, $f_{bcd} = -f_{bdc}$, there is no operator ordering ambiguity in the constraints. So they remain in involution after quantization. To solve Eq. (7.19) for the physical states, we can use the same method as for the SO(N) model. Since the vacuum state belongs to the physical Hilbert space, any physical state can be obtained by applying a gauge invariant operator built out of $\hat a^\dagger$ to the vacuum state. The problem is reduced to seeking all independent Casimir polynomials that can be constructed from $\hat a^\dagger$.
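Returning briefly to the SO(N) model, the doubling of the level spacing in (7.18) and the normalization in (7.16) can be checked with exact arithmetic. The sketch below assumes the Bargmann (holomorphic) representation, in which $\hat a_k^\dagger$ acts as multiplication by $z_k$, $\hat a_k$ as $\partial/\partial z_k$, and the monomial $z^\alpha$ has squared norm $\alpha!$; this representation and all function names are choices made here for illustration. For $N = 2$ the constraint becomes, up to a constant factor, $L = z_1\partial_2 - z_2\partial_1$.

```python
from fractions import Fraction
from math import factorial


def constraint_matrix(k):
    """Matrix of L = z1 d/dz2 - z2 d/dz1 on the monomials z1^m z2^(k-m), m = 0..k."""
    mat = [[Fraction(0)] * (k + 1) for _ in range(k + 1)]
    for m in range(k + 1):
        if m < k:
            mat[m + 1][m] += k - m  # z1 d/dz2 raises the power of z1
        if m > 0:
            mat[m - 1][m] -= m      # z2 d/dz1 lowers the power of z1
    return mat


def rank(mat):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [row[:] for row in mat]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def physical_states_at_level(k):
    """Number of SO(2)-invariant states with k quanta (kernel dimension of L)."""
    return (k + 1) - rank(constraint_matrix(k))


def compositions(total, parts):
    """All tuples of non-negative integers of length `parts` summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def norm_squared(n, N):
    """Squared norm of (z_1^2 + ... + z_N^2)^n from its multinomial expansion."""
    result = 0
    for ks in compositions(n, N):
        coeff = factorial(n)
        weight = 1
        for kk in ks:
            coeff //= factorial(kk)
            weight *= factorial(2 * kk)
        result += coeff * coeff * weight
    return result


def closed_form(n, N):
    """4^n n! Gamma(n + N/2) / Gamma(N/2), written as a finite product."""
    value = Fraction(4) ** n * factorial(n)
    for j in range(n):
        value *= Fraction(N, 2) + j
    return value
```

With these helpers, `physical_states_at_level(k)` is nonzero only for even `k`, so the physical energies $k + N/2$ with $N = 2$ are spaced by two units, and `norm_squared(n, N)` can be compared with `closed_form(n, N)` for small `n` and `N`.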
From the commutation relation $[\hat\sigma_b, \hat a_c^\dagger] = f_{bcd}\,\hat a_d^\dagger$ we infer that the operator $\hat a^\dagger$ is transformed by the adjoint action of the gauge group. Therefore, the independent Casimir polynomials are
$$ P_{\nu_j}(\hat a^\dagger) = \mathrm{tr}\,(\hat a^\dagger)^{\nu_j} , \tag{7.20} $$
where the trace is related to a matrix basis $\lambda_b$ in the Lie algebra; the integers $\nu_j$, $j = 1, 2, \ldots, r = \mathrm{rank}\, G$, are degrees of the independent Casimir polynomials, $\nu_1 = 2$ for all groups. For the groups of rank 2, we have $\nu_2 = 3, 4, 6$ for SU(3), Sp(4) $\sim$ SO(5) and $G_2$, respectively [32]. We remark also that the use of a matrix representation is not necessary to construct the gauge invariant polynomials of $\hat a^\dagger$. In general, gauge invariant operators are polynomials of $\hat a_b^\dagger$ whose coefficients are invariant symmetric tensors in the adjoint representation of the Lie algebra.
Alternatively the operators (7.20) can be written via the irreducible invariant symmetric tensors $d^{(\nu)}_{b_1 b_2\cdots b_\nu}$, where ranks $\nu$ of the tensors equal corresponding degrees of the independent Casimir polynomials. The irreducible invariant symmetric tensors form a basis for all invariant symmetric tensors [32]. Thus, the operators
$$ P_{\nu_j}(\hat a^\dagger) = d^{(\nu_j)}_{b_1 b_2\cdots b_{\nu_j}}\,\hat a_{b_1}^\dagger\hat a_{b_2}^\dagger\cdots\hat a_{b_{\nu_j}}^\dagger \tag{7.21} $$
form a basis of gauge invariant polynomials of the creation operators. The irreducible invariant tensors can be obtained from the commutation relations of the basis elements of the Lie algebra. For instance, for SU(3) the invariant symmetric tensors are $\delta_{ab}$ and $d_{abc}$, which are proportional to traces of two and three Gell-Mann matrices, respectively. A basis in the physical Hilbert space is given by the states [19]
$$ |n_1, n_2, \ldots, n_r\rangle = [P_{\nu_1}(\hat a^\dagger)]^{n_1}[P_{\nu_2}(\hat a^\dagger)]^{n_2}\cdots[P_{\nu_r}(\hat a^\dagger)]^{n_r}\,|0\rangle , \tag{7.22} $$
where $n_j$ are non-negative integers. These states are eigenstates of the oscillator Hamiltonian. The eigenvalues follow from the commutation relation $[\hat a_b^\dagger\hat a_b, P_\nu(\hat a^\dagger)] = \nu P_\nu(\hat a^\dagger)$ and have the form
$$ E_n = \nu_1 n_1 + \nu_2 n_2 + \cdots + \nu_r n_r + N/2 . \tag{7.23} $$
So up to the ground state energy this is the spectrum of the $r$-dimensional harmonic oscillator with frequencies equal to ranks of the irreducible symmetric tensors in the adjoint representation of the Lie algebra. We have anticipated this result from the semiclassical quantization of the $r$-dimensional isotropic harmonic oscillator with a hyperconic structure of its physical phase space described in Section 3.

In the matrix gauge model discussed in Section 4.8 we take $V_q = \omega_q^2 x_q^2/2$ and $\omega_1 \neq \omega_2$ [10,16]. The destruction and creation operators (7.10) carry an additional index $q = 1, 2$.
The constraint (4.51) and the Hamiltonian (4.52) assume the form

$$\hat\sigma = (\hat a_1^\dagger, T \hat a_1) + (\hat a_2^\dagger, T \hat a_2) , \tag{7.24}$$

$$\hat H = \omega_1 (\hat a_1^\dagger, \hat a_1) + \omega_2 (\hat a_2^\dagger, \hat a_2) + \omega_1 + \omega_2 , \tag{7.25}$$

where the term proportional to the constraint in the Hamiltonian (4.52) has been omitted since it vanishes on the physical states. Since the vacuum is annihilated by the constraint operator, $\hat\sigma|0\rangle = 0$, the physical states are generated by the independent invariants of the orthogonal group SO(2) which are composed of the vectors $\hat a_q^\dagger$:

$$\hat b_q^\dagger = (\hat a_q^\dagger)^2 , \qquad \hat b_3^\dagger = (\hat a_1^\dagger, \hat a_2^\dagger) , \qquad \hat b_4^\dagger = \varepsilon_{ij}\, \hat a_1^{(i)\dagger} \hat a_2^{(j)\dagger} , \tag{7.26}$$

where $\varepsilon_{ij} = -\varepsilon_{ji}$ is the totally antisymmetric tensor, $\varepsilon_{12} = 1$. Note that the group SO(2) has two invariant irreducible tensors, $\delta_{ij}$ and $\varepsilon_{ij}$. The operators (7.26) are all the independent operators that can be composed of the two vectors $\hat a_q^\dagger$ and the two invariant tensors.

Here the following should be noted. All the invariant operators (7.26), except $\hat b_4^\dagger$, are invariant under the larger group $O(2) = SO(2) \otimes Z_2$ (the non-trivial element of $Z_2$ corresponds to the reflection of one of the coordinate axes, which changes the sign of $\hat b_4^\dagger$). Should the operator $\hat b_4^\dagger$ be included among the operators that generate the basis of the physical Hilbert space? In other words, what is the gauge group of the model: SO(2) or O(2)? We remark that a similar question exists in gauge theories without fermions: is the gauge group $G$ or $G/Z_G$, where $Z_G$ is the center of $G$ [94]? Yet, we have already encountered this question when studying the physical phase space in
the 2D Yang–Mills theory in Section 5. Following the arguments given there, we point out that formally all information about the dynamics is contained in the Lagrangian. In the Hamiltonian formalism, any finite gauge transformation is an iteration of infinitesimal gauge transformations generated by the constraints. Therefore only the transformations that can be continuously deformed to the group unity have to be included in the gauge group. The existence of a discrete gauge group cannot be established for the Lagrangian (4.49). So the group O(2) can be made the gauge group of the model only by a supplementary condition that the physical states be invariant under the transformations from the center of O(2). Another possibility would be to consider a larger gauge group of which O(2) is a subgroup, e.g., SO(3). In view of these arguments, we include the operator $\hat b_4^\dagger$ in the set of physical operators. Because of the identity $\varepsilon_{ij}\varepsilon_{kn} = \delta_{ik}\delta_{jn} - \delta_{in}\delta_{jk}$, the operator $(\hat b_4^\dagger)^2$ can be expressed via the other operators $\hat b_a^\dagger$, $a = 1, 2, 3$, so that the basis of the physical Hilbert space is given by the states [17]

$$(\hat b_1^\dagger)^{n_1} (\hat b_2^\dagger)^{n_2} (\hat b_3^\dagger)^{n_3} |0\rangle , \qquad (\hat b_1^\dagger)^{n_1} (\hat b_2^\dagger)^{n_2} (\hat b_3^\dagger)^{n_3}\, \hat b_4^\dagger |0\rangle , \tag{7.27}$$

where $n_a$ are non-negative integers. The physical states acquire a phase factor $\pm 1$ under the transformations from the center of O(2). Similarly, the physical states of the 2D Yang–Mills theory get a phase factor under homotopically non-trivial gauge transformations, as will be shown in Section 7.6. The spectrum of the Hamiltonian (7.25) reads

$$E_n = 2 n_1 \omega_1 + 2 n_2 \omega_2 + n_3 (\omega_1 + \omega_2) + n_4 (\omega_1 + \omega_2) + \omega_1 + \omega_2 , \tag{7.28}$$

where $n_4 = 0, 1$. Here we see again that the oscillators are excited in pairs, the same effect we have anticipated from the analysis of the physical phase space of the model in Section 4.8. The physical frequencies are $2\omega_{1,2}$ and $\omega_1 + \omega_2$, while the original frequencies of the uncoupled oscillators (cf. (4.52)) are just $\omega_{1,2}$.

The lesson one could learn from the above analysis is that, when describing a quantum gauge theory in terms of only physical degrees of freedom (e.g. in the Hamiltonian path integral), it is of great importance to take into account the true structure of the physical phase space in order to establish the equivalence with the Dirac gauge invariant operator formalism.

7.2. Schrödinger representation of physical states

In the path integral formalism one uses an explicit parameterization of the physical configuration space (the Lagrangian path integral) or that of the physical phase space (the Hamiltonian path integral). It is often the case that the structure of gauge orbits is so complicated that a parameterization is chosen on the basis of physical "convenience", which may not be the best choice from the mathematical point of view. To develop the path integral formalism which uniquely corresponds to the Dirac gauge invariant approach, it seems useful to investigate, within the operator formalism, the role of coordinate singularities that unavoidably occur in any parameterization of a non-Euclidean physical phase space by canonical variables.

In the case of the SO(N) model the total Hilbert space is the space of square integrable functions $\psi(x)$ in the $N$-dimensional Euclidean space. The gauge invariance condition means that the physical wave functions must be invariant under the SO(N) rotations of the argument. So the physical motion is the radial motion. Recall that the constraints in the model are nothing but the components of the angular momentum of the particle. The motion with zero angular momentum is
radial. Physical wave functions $\Phi$ depend only on the radial variable $r = |x|$. Therefore a natural way to solve the Schrödinger equation for eigenfunctions of the Hamiltonian is to make use of spherical coordinates. In the equation

$$\left[-\tfrac{1}{2}\Delta_N + V(x^2)\right]\Phi_E = E\,\Phi_E , \tag{7.29}$$

where $\Delta_N$ is the $N$-dimensional Laplace operator, we introduce the spherical coordinates and omit all the terms of the corresponding Laplace–Beltrami operator containing the derivatives with respect to the angular variables, because the physical wave functions are independent of them. The radial part of the Laplace–Beltrami operator is the physical kinetic energy operator. The equation assumes the form

$$\left[-\frac{d^2}{dr^2} - \frac{N-1}{r}\frac{d}{dr} + 2V(r^2)\right]\Phi_E(r) = 2E\,\Phi_E(r) . \tag{7.30}$$

We shall solve it for the oscillator potential $V = r^2/2$. To this end, we make the substitution $\Phi(r) = \exp(-r^2/2)\,\phi(r)$ and introduce a new variable $z = r^2$, so that the function $f(z) = \phi(r)$ satisfies the equation

$$z f'' + (a - z) f' - b f = 0 , \tag{7.31}$$

in which $a = N/2$ and $b = (a - E)/2$. The solution of this equation that is regular at the origin $z = r^2 = 0$ is given by the confluent hypergeometric function

$$f(z) = {}_1F_1(b, a; z) . \tag{7.32}$$

From the condition that $\Phi_E(r)$ decreases as $r$ approaches infinity, which means that the function $f(z)$ must be a polynomial, i.e., $b = -n$, we find the spectrum (7.18). The distance between the oscillator energy levels is doubled. Making use of the relation between the function ${}_1F_1$ and the Laguerre polynomials $L_n^a$, ${}_1F_1(-n, a+1; z) = L_n^a(z)\,\Gamma(n+1)\Gamma(a+1)/\Gamma(n+a+1)$, we can represent the eigenfunctions as follows [9,10,16]:

$$\Phi_n(r) = c_n\, L_n^{-1+N/2}(r^2)\, e^{-r^2/2} , \tag{7.33}$$

with $c_n$ being normalization constants. The physical wave functions are normalizable with the scalar product

$$\int d^N x\, |\Phi_n|^2 = \Omega_N \int_0^\infty dr\, r^{N-1} |\Phi_n|^2 \;\to\; \int_0^\infty dr\, r^{N-1} |\Phi_n|^2 , \tag{7.34}$$

where $\Omega_N$ is the total solid angle in $\mathbb{R}^N$ (the volume of the non-physical configuration space), which we can include into the norm of physical states.

Let us compare our results with those we would have obtained, had we quantized the system after eliminating all non-physical degrees of freedom, say, by imposing the unitary gauge $x_i = 0$, $i \neq 1$. The gauge-fixed classical Hamiltonian can be obtained by solving the constraints for $p_i$, $i \neq 1$, substituting the solution into the original Hamiltonian and then setting all $x_i$, except $x_1$, to zero. It would have the form

$$H_{\rm ph} = \tfrac{1}{2}\left(p_1^2 + x_1^2\right) . \tag{7.35}$$
Clearly, the canonical quantization of this Hamiltonian would lead to the spectrum $E_n = n + 1/2$, which gives an energy level spacing different from that found in the gauge invariant approach. The reason for the failure of the canonical quantization is that the phase space spanned by the variables $p_1$ and $x_1$ is not a plane, but a cone unfoldable into a half-plane. If the cone is cut along the momentum axis, then we have to restrict the admissible values of $x_1$: it has to be non-negative. The operator $\hat p_1 = -i\partial/\partial x_1$ is not self-adjoint on the half-axis in the space of square integrable functions. Therefore $\hat p_1$ cannot be identified with a physical observable, while the Hamiltonian (7.35) can be made self-adjoint. A possible way is to quantize the theory in the covering space, i.e., on the full real line spanned by $x_1$, and then to implement the condition that the physical states must be invariant under the parity transformation $x_1 \to -x_1$:

$$\phi(x_1) = \phi(-x_1) . \tag{7.36}$$

In so doing, the right energy level spacing of the oscillator is restored. Recall that the wave functions of the one-dimensional harmonic oscillator are

$$\phi_k(x_1) = c'_k\, H_k(x_1)\, e^{-x_1^2/2} , \tag{7.37}$$

where $H_k$ are Hermite polynomials. They have the property that $H_k(-x_1) = (-1)^k H_k(x_1)$. So the physical values of $k$ are even integers, $k = 2n$.

Although the invariance of the physical wave functions under the residual (discrete) gauge transformations has led us to the right energy level spacing, the quantum theory still differs from that obtained by the gauge invariant Dirac procedure. The physical eigenstates in the two theories are different. This, in turn, means that the amplitudes for the same physical processes, described within the two quantum theories, will not be the same. Thus, in general, the canonical quantization of a gauge-fixed theory with an additional condition of invariance of the physical states with respect to the residual gauge transformations may lead to a gauge dependent quantum theory, which is not acceptable for a physical theory. Moreover, though the variable $x_1$ is assigned to describe the physical degree of freedom, the state $\hat x_1 \phi(x_1)$, where $\phi$ is a physical state satisfying (7.36), is not a physical state. The action of the operator $\hat x_1$ throws the states out of the physical subspace because it does not commute with the parity transformation, which is a rather odd property of a "physical" variable. This is not the case for the radial variable $r$ used in the Dirac approach.

This is still not the whole story. Here we have been lucky not to have had an ordering problem in the physical Hamiltonian after eliminating the non-physical degrees of freedom, thanks to the simplicity of the constraints and the appropriate choice of the gauge. In general, the elimination of the non-physical variables would lead to the operator ordering problem in the physical kinetic energy. A solution to the ordering problem is generally not unique. On the other hand, an explicit form of the classical kinetic energy depends on the chosen gauge. So it might be difficult to find a special ordering of the operators in the physical Hamiltonian such that the spectrum would be independent of the parameterization of the physical configuration space, or of the chosen gauge. If an arbitrary operator ordering is assumed, say, just to provide hermiticity of the Hamiltonian, the spectrum would generally be gauge dependent. An explicit example is discussed in Section 7.7. This observation seems especially important for gauge theories where the structure of gauge orbits is unknown (or hard to describe, as in the Yang–Mills theory) and, hence, no "appropriate" gauge fixing condition exists.
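The comparison made in this section can be reproduced with exact rational arithmetic. The sketch below is only an illustration and its function names are hypothetical. It checks that the terminating series ${}_1F_1(-n, a; z)$ solves Eq. (7.31) for $b = -n$, which by $b = (a - E)/2$ corresponds to $E = 2n + N/2$, and it lists these levels next to the levels $k + 1/2$ of the gauge-fixed Hamiltonian (7.35), with and without the parity condition (7.36).

```python
from fractions import Fraction

def kummer_polynomial(n, a):
    """Coefficients c_k of 1F1(-n, a; z) = sum_k c_k z**k, k = 0..n."""
    coeffs = [Fraction(1)]
    for k in range(n):
        # Series ratio c_{k+1}/c_k = (b + k) / ((a + k)(k + 1)) with b = -n.
        coeffs.append(coeffs[-1] * Fraction(k - n) / ((a + k) * (k + 1)))
    return coeffs

def kummer_residual(coeffs, a, b):
    """Coefficients of z f'' + (a - z) f' - b f for f = sum_k c_k z**k."""
    n = len(coeffs) - 1
    out = []
    for k in range(n + 1):
        c_next = coeffs[k + 1] if k < n else Fraction(0)
        out.append((k + 1) * (k + a) * c_next - (k + b) * coeffs[k])
    return out

def dirac_levels(N, count):
    """E = 2n + N/2, i.e. b = (a - E)/2 = -n with a = N/2."""
    return [Fraction(N, 2) + 2 * n for n in range(count)]

def gauge_fixed_levels(count, parity_even_only=False):
    """E = k + 1/2 for (7.35); keep only even k if (7.36) is imposed."""
    step = 2 if parity_even_only else 1
    return [Fraction(1, 2) + step * k for k in range(count)]

if __name__ == "__main__":
    N = 3
    a = Fraction(N, 2)
    for n in range(4):
        residual = kummer_residual(kummer_polynomial(n, a), a, -n)
        print("n =", n, "residual of (7.31):", [str(r) for r in residual])
    print("Dirac:            ", [str(e) for e in dirac_levels(N, 4)])
    print("gauge-fixed:      ", [str(e) for e in gauge_fixed_levels(4)])
    print("gauge-fixed, even:", [str(e) for e in gauge_fixed_levels(4, True)])
```

The level lists show the spacing 2 in the Dirac approach, the spacing 1 in the naive gauge-fixed theory, and the spacing 2 again once only even $k$ are kept. The sketch compares levels only; the difference between the eigenstates of the two theories is not visible at this level.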
Let us analyze the singular point $r = 0$ in the Dirac approach. This point can be thought of as the Gribov horizon since $r = |x_1|$ in the unitary gauge $x_2 = 0$. We recall that any gauge invariant parameterization of the physical configuration space can be related to a special gauge fixing condition through curvilinear coordinates associated with the gauge transformation law and the chosen gauge, as has been shown in Section 6.2. In the non-invariant approach the singular points form the Gribov horizon; in the invariant approach they appear as the singular points of the change of variables, like the origin in spherical coordinates, i.e., as zeros of the Jacobian. This is also the case for the Yang–Mills theory (see Section 10.1).

For the sake of simplicity let us take the group SO(3). By means of the substitution $\Phi(r) = \phi(r)/r$, Eq. (7.30) can be transformed to the ordinary one-dimensional Schrödinger equation $-\phi''/2 + V\phi = E\phi$. Since the potential is an even function of $r$ (a consequence of the gauge invariance), the solutions to this equation have a definite parity. The Hamiltonian commutes with the parity transformation, so some of the eigenvalues correspond to odd eigenfunctions, some to even ones. For example, we can take the harmonic oscillator, $\phi_k(r) = c_k H_k(r) \exp(-r^2/2)$. For odd $k$, the wave functions $\Phi_k(r) = \phi_k(r)/r$ are even, while for even $k$ they are odd. We have eliminated the solutions that are not invariant under the parity transformation $r \to -r$. The reason is that these solutions are not regular at the origin $r = 0$. Indeed, $H_{2n}(0) \neq 0$, so there is a singularity $1/r$. Although this singularity is integrable, since the scalar product has the density $r^2$, the singular solution to the Schrödinger equation must be excluded. As has been pointed out by Dirac [1], singular solutions of the Schrödinger equation with a regular potential obtained in curvilinear coordinates are not solutions in the original Cartesian coordinates. Indeed, the wave functions with the singularity $1/r$ would not satisfy the Schrödinger equation in the vicinity of the origin because $\Delta_{(3)}(1/r) = -4\pi\delta^3(x)$. Regular even functions of $r$ are regular functions of $r^2 = x^2$ and, hence, they have a unique gauge-invariant analytic continuation into the whole original configuration space.

We conclude that the regularity condition for wave functions at the singular points in a chosen parameterization of the physical configuration space eliminates non-physical states and provides a one-to-one correspondence with the explicitly gauge invariant approach that does not rely on any parameterization of the physical configuration space. This conclusion is rather general and can be extended to all gauge theories. Thus, the Gribov obstruction in the Schrödinger representation of quantum gauge theories can be resolved in the following way. Given a gauge condition, construct the curvilinear coordinates associated with it and the gauge transformation law. Solve the constraint equations in the new coordinates and find the physical Hamiltonian. Solve the Schrödinger equation under the condition that the physical wave functions are regular at the points where the Jacobian of the change of variables vanishes.

7.3. The Schrödinger representation in the case of many physical degrees of freedom

To obtain the Dirac gauge invariant wave functions in gauge models with many physical degrees of freedom, we will follow the general scheme formulated at the very end of the preceding section. We take the model where the configuration space is a Lie algebra and the gauge group acts in it in the adjoint representation. A natural parameterization of the physical configuration space is provided by the gauge $x = h$, where $h$ belongs to the Cartan subalgebra. The associated curvilinear coordinates have been constructed in Section 4.3 (see (4.18)). The physical wave functions are
functions of $h$ because the constraints generate shifts of the variables $z$. The Laplace–Beltrami operator in general curvilinear coordinates has the form

$$\Delta_{LB} = \frac{1}{\sqrt{g}}\,\partial_j\left(g^{jk}\sqrt{g}\,\partial_k\right) , \tag{7.38}$$

where $g = \det g_{jk}$, $g_{jk}$ is the metric in the curvilinear coordinates and $g^{jk}$ is the inverse of $g_{jk}$. The metric (4.21) is block-diagonal, so the Laplace–Beltrami operator is a sum of the physical and the non-physical terms. Since the physical wave functions are independent of $z$, we omit the second term, which contains the derivatives $\partial_z$. The metric in the physical sector is Euclidean, but the Jacobian $\kappa^2$ is not trivial (cf. (4.26)). The physical part of the Laplace–Beltrami operator reads

$$\frac{1}{\kappa^2}\left(\partial_h, \kappa^2 \partial_h\right) = \frac{1}{\kappa}\,\Delta_{(r)}\,\kappa - \frac{\Delta_{(r)}\kappa}{\kappa} = \frac{1}{\kappa}\,\Delta_{(r)}\,\kappa , \tag{7.39}$$
where $\Delta_{(r)} = (\partial_h, \partial_h)$ is the $r$-dimensional Laplace operator. The vanishing of the second term in the right-hand side of the first equality can be demonstrated by the explicit computation

$$\frac{\Delta_{(r)}\kappa}{\kappa} = \sum_{\alpha \neq \beta > 0} \frac{(\alpha, \beta)}{(h, \alpha)(h, \beta)} \tag{7.40}$$

$$\phantom{\frac{\Delta_{(r)}\kappa}{\kappa}} = \sum_{P_{\alpha\beta}}\; \sum_{\alpha \neq \beta > 0 \,\in\, P_{\alpha\beta}} \frac{(\alpha, \beta)}{(h, \alpha)(h, \beta)} = 0 . \tag{7.41}$$

Here the sum over the positive roots $\alpha \neq \beta > 0$ has been divided into the sum over the positive roots contained in a plane $P_{\alpha\beta}$ and a sum over all planes. The sum in one plane is calculated explicitly. The relative directions of the roots in one plane are specified by the matrix $\cos^2\theta_{\alpha\beta} = (\alpha, \beta)^2/[(\alpha, \alpha)(\beta, \beta)]$, whose elements may only have the values 0, 1/4, 1/2 and 3/4. That is, the quantity (7.41) for a group of rank $r$ is determined by that for the groups of rank 2. By an explicit computation one can convince oneself that it vanishes for SU(3), Sp(4) $\sim$ SO(5), and $G_2$ [30]. The Schrödinger equation in the physical configuration space is written as

$$\left(-\frac{1}{2\kappa}\,\Delta_{(r)}\,\kappa + V\right)\Phi = E\,\Phi . \tag{7.42}$$
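The rank-2 statement behind (7.40) and (7.41) can be spot-checked numerically. The sketch below is illustrative only: the explicit positive roots chosen for the three rank-2 root systems (labelled `A2`, `B2`, `G2` here) are one standard choice assumed for the example, and the function names are hypothetical. It evaluates the double sum in (7.40) at a generic point $h$.

```python
import math

def positive_roots(name):
    """One assumed choice of positive roots for a rank-2 root system."""
    s3 = math.sqrt(3.0)
    if name == "A2":   # SU(3)
        return [(1.0, 0.0), (0.5, s3 / 2), (-0.5, s3 / 2)]
    if name == "B2":   # Sp(4) ~ SO(5)
        return [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
    if name == "G2":
        def ray(length, degrees):
            t = math.radians(degrees)
            return (length * math.cos(t), length * math.sin(t))
        short = [ray(1.0, d) for d in (0, 60, 120)]
        long_ = [ray(s3, d) for d in (30, 90, 150)]
        return short + long_
    raise ValueError(name)

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def root_sum(h, roots):
    """Sum over ordered pairs alpha != beta of (a,b)/((h,a)(h,b)), cf. (7.40)."""
    total = 0.0
    for i, a in enumerate(roots):
        for j, b in enumerate(roots):
            if i != j:
                total += dot(a, b) / (dot(h, a) * dot(h, b))
    return total

if __name__ == "__main__":
    h = (0.37, 1.23)   # generic point: no (h, alpha) vanishes
    for name in ("A2", "B2", "G2"):
        # Each value should vanish up to rounding errors.
        print(name, root_sum(h, positive_roots(name)))
```

A single sample point is of course not a proof; it only makes the cancellation among the root pairs in one plane visible.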
The solutions of Eq. (7.42) must be normalizable with respect to the scalar product

$$\int d^N x\, |\Phi|^2 = \frac{V_G}{V_H} \int_{K^+} d^r h\, \kappa^2 |\Phi|^2 \;\to\; \int_{K^+} d^r h\, \kappa^2 |\Phi|^2 , \tag{7.43}$$

where $V_G$ is the volume of the group manifold and $V_H = (2\pi)^r$ is the volume of the stationary subgroup of a generic element $x = h$, which is the Cartan group $G_H \sim [\times U(1)]^r$. The ratio of these factors is the result of the integration over the variable $z$ (see (4.24) and the paragraph after (4.26)). The gauge orbits are compact in the model, so their volume can be included into the norm of physical states, which is shown by the arrow in Eq. (7.43).

Eq. (7.42) can be transformed to the Schrödinger equation in the $r$-dimensional Euclidean space by the substitution $\Phi = \phi/\kappa$. Let $\phi$ be a solution in the Euclidean space. The physical wave
functions $\Phi$ must be regular at the singular points where the Jacobian (or the Faddeev–Popov determinant) $\kappa^2$ vanishes. To obtain the physical solutions, we observe that the Hamiltonian $\hat H_{(r)} = -\Delta_{(r)}/2 + V$ commutes with the operators $\hat R$ that transform the argument of the wave functions by the Weyl group. This follows from the invariance of the Laplace operator and the potential under the Weyl transformations. Note that the Weyl group can be regarded as the group of residual gauge transformations in the gauge $x = h$, so the invariance of the potential $V$ under the Weyl group follows from its gauge invariance. Thus, if $\phi_E(h)$ is an eigenfunction of $\hat H_{(r)}$, then $\phi_E(\hat R h)$ is also its eigenfunction with the same eigenvalue $E$.

Let us take a ray through a generic point on the hyperplane $(\alpha, h) = 0$ and perpendicular to it, and let the variable $y$ span the ray so that $y = 0$ at the point of intersection of the ray with the hyperplane. The potential $V$ is assumed to be a regular function everywhere. Therefore the eigenfunctions $\phi_E$ are regular as well. Since $\kappa \sim y$ as $y$ approaches zero, the function $\Phi_E$ has the singularity $1/y$ in the vicinity of the hyperplane $(\alpha, h) = 0$, which is a part of the boundary of the Weyl chamber $K^+$. Consider an element $\hat R_\alpha$ of the Weyl group which is a reflection in the hyperplane $(\alpha, h) = 0$, i.e., $\hat R_\alpha \alpha = -\alpha$. Then $\hat R_\alpha y = -y$. The function $\phi(h)/\kappa(h) + \phi(\hat R_\alpha h)/\kappa(\hat R_\alpha h)$ satisfies Eq. (7.42) and is regular in the vicinity of the hyperplane $(\alpha, h) = 0$. This analysis can be done for any positive root $\alpha$, which may lead us to the guess that the functions

$$\Phi_E(h) = \sum_W \left[\kappa(\hat R h)\right]^{-1} \phi_E(\hat R h) \tag{7.44}$$

are regular solutions to Eq. (7.42) on the entire Cartan subalgebra. Let us show that this is indeed the case. First of all we observe that

$$\kappa(\hat R h) = \pm\kappa(h) . \tag{7.45}$$
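The sign rule (7.45) can be illustrated for rank 2. The sketch below is an illustration under assumptions: it takes $\kappa(h)$ to be the product of $(\alpha, h)$ over the positive roots (any overall normalization drops out of the check), uses one assumed choice of positive roots for SU(3) and Sp(4) $\sim$ SO(5), and the names `kappa` and `reflect` are hypothetical. A single reflection should flip the sign of $\kappa$, and two reflections should restore it.

```python
import math

S3 = math.sqrt(3.0)
A2 = [(1.0, 0.0), (0.5, S3 / 2), (-0.5, S3 / 2)]       # SU(3), assumed choice
B2 = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]  # Sp(4) ~ SO(5)

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def kappa(h, roots):
    """Product of (alpha, h) over the positive roots (normalization assumed)."""
    value = 1.0
    for alpha in roots:
        value *= dot(alpha, h)
    return value

def reflect(h, alpha):
    """Reflection of h in the hyperplane (alpha, h) = 0."""
    c = 2.0 * dot(h, alpha) / dot(alpha, alpha)
    return (h[0] - c * alpha[0], h[1] - c * alpha[1])

if __name__ == "__main__":
    h = (0.37, 1.23)
    for name, roots in (("A2", A2), ("B2", B2)):
        k0 = kappa(h, roots)
        for alpha in roots:
            once = kappa(reflect(h, alpha), roots)
            twice = kappa(reflect(reflect(h, alpha), roots[0]), roots)
            # Ratios close to -1 and +1 are expected, respectively.
            print(name, alpha, once / k0, twice / k0)
```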
Note that $\mu(h) = \kappa^2(h)$ is an invariant of the Weyl group, since the Weyl group is the group of permutations and reflections of the roots which preserve the root pattern. The negative sign in Eq. (7.45) corresponds to an odd number of reflections in the group element $\hat R$. Any reflection in a hyperplane through the origin can be viewed as an orthogonal transformation. In the matrix representation $\det \hat R = \pm 1$ because $\hat R$ is a composition of reflections in the hyperplanes orthogonal to simple roots. Next, we invoke the following theorem from group theory [30,32]. Any polynomial $p(h)$ in the Cartan subalgebra with the property $p(\hat R h) = \pm p(h)$ can be represented in the form

$$p(h) = \kappa(h)\, C(h) , \tag{7.46}$$

where the polynomial $C(h)$ is invariant under the Weyl group. By construction the function (7.44) is invariant under the Weyl transformations. Making use of the relation (7.45), the physical wave function can also be represented in the form $\Phi_E(h) = [\kappa(h)]^{-1}\tilde\phi_E(h)$, where $\tilde\phi_E(\hat R h) = \pm\tilde\phi_E(h)$. Let us expand $\tilde\phi_E(h)$ in a power series and re-group the latter into a sum of terms of the same order in $h$:

$$\tilde\phi_E(h) = \sum_{n=0}^{\infty} \tilde\phi_n(E)\, p_n(h) . \tag{7.47}$$
The polynomials $p_n(h)$ of order $n$ satisfy the condition of the above theorem, $p_n(\hat R h) = \pm p_n(h)$. Therefore

$$\Phi_E(h) = \sum_{n=0}^{\infty} \tilde\phi_n(E)\, C_n(h) , \tag{7.48}$$

where $C_n(h)$ are polynomials invariant under the Weyl group. We remark that an invariant polynomial $C_n$ may not exist for every order $n$. For instance, there is no invariant polynomial of order one. So some of the coefficients $\tilde\phi_n(E)$ necessarily vanish.

Now let us prove the converse, that any regular solution of the Schrödinger equation (7.42) is invariant under the Weyl group. In the total configuration space the solutions of the Schrödinger equation can be written in the form

$$\phi_E(x) = \phi_E(h, z) = \Phi_E^{(k)}(h)\, Y_{(k)}(z) , \tag{7.49}$$

where $Y_{(k)}(z)$ are eigenfunctions of the Casimir operators in the algebra generated by the constraint operators $\hat\sigma_a$, and the index $(k)$ stands for a set of corresponding eigenvalues, $E = E^{(k)}$. The functions $\Phi_E^{(0)}(h)$ form a basis in the physical subspace, $Y_{(0)}(z) = \mathrm{const}$. Consider the symmetry transformations of the new variables $h$ and $z$ in Eq. (4.18) such that the old variables $x$ remain unchanged. These transformations contain translations of $z$ by the periods of the manifold $G/G_H$ ($h$ is not changed) and the Weyl group

$$x \to x , \qquad h \to \hat R h = \Omega h \Omega^{-1} , \qquad S(z) \to \Omega S(z) = S(z_\Omega) . \tag{7.50}$$

The functions (7.49) must be invariant under these transformations. Hence $\Phi_E^{(0)}(h)$ must be invariant under the Weyl group. The functions $\Phi_E^{(0)}(h)$ are also regular because the functions $\Phi_E(x)$ are regular.

We have established the one-to-one correspondence between analytic gauge invariant functions $\Phi_E(x)$ in the total configuration space and analytic functions $\Phi_E(h)$ invariant under the Weyl group in the reduced theory. In group theory this statement is known as the theorem of Chevalley, which asserts [32] that any polynomial in the Cartan subalgebra invariant under the Weyl group has a unique analytic continuation to the Lie algebra that is invariant under the adjoint action of the group. Since polynomials form a dense set in the space of analytic functions, the statement is also valid for analytic functions.

The regularity condition of the physical wave functions at the Gribov horizon (the boundary of the Weyl chamber) on the gauge fixing surface has been crucial to prove the equivalence of the gauge fixed formalism to the explicitly gauge invariant approach due to Dirac. Attention should be drawn to the fact that this boundary condition does not allow separation of variables in the Schrödinger equation, even if the potential would allow it, i.e., the physical wave functions cannot be factorized into a product of wave functions for each component of $h$. This is evidence of the kinematic coupling in the quantum theory, the effect we have observed in the classical theory. The above example also provides us with the key idea for how to deal with the coordinate singularities in quantum theory: the physical amplitude must be regular at the singular points in any particular coordinate system assumed on the orbit space. There is no need to postulate the invariance of the physical states under the residual gauge (Gribov) transformations. It is ensured by the regularity condition.
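The simplest instance of the construction (7.44) is rank 1, where the sketch below assumes $\kappa(h) = h$ (up to normalization), the Weyl group consists of the identity and the reflection $h \to -h$, and $\phi_k$ are the oscillator functions $H_k(h)\,e^{-h^2/2}$ used for SO(3) in Section 7.2. The function names are hypothetical. The Weyl sum vanishes identically for even $k$ and is regular at $h = 0$ for odd $k$, so only the levels $E = k + 1/2$ with odd $k$ survive.

```python
import math

def hermite(k, x):
    """Physicists' Hermite polynomial H_k(x) from the three-term recurrence."""
    h_prev, h_cur = 1.0, 2.0 * x
    if k == 0:
        return h_prev
    for j in range(1, k):
        h_prev, h_cur = h_cur, 2.0 * x * h_cur - 2.0 * j * h_prev
    return h_cur

def phi(k, h):
    """Unnormalized oscillator eigenfunction on the full line."""
    return hermite(k, h) * math.exp(-0.5 * h * h)

def weyl_sum(k, h):
    """Right-hand side of (7.44) for kappa(h) = h and the group {1, h -> -h}."""
    return phi(k, h) / h + phi(k, -h) / (-h)

def surviving_levels(count):
    """E = k + 1/2 restricted to odd k = 2n + 1."""
    return [2 * n + 1.5 for n in range(count)]

if __name__ == "__main__":
    for k in range(5):
        # Even k: zero at every h. Odd k: finite value as h -> 0.
        print(k, weyl_sum(k, 0.7), weyl_sum(k, 1e-6))
    print(surviving_levels(4))
```

For $N = 3$ the surviving levels $2n + 3/2$ coincide with $2n + N/2$, in line with the spectrum obtained in the Dirac approach.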
7.4. The theorem of Chevalley and the Dirac states for groups of rank 2

Although the theorem of Chevalley establishes the one-to-one correspondence between the gauge invariant Dirac states and the states invariant under the residual gauge transformations in the non-invariant approach, an explicit construction of the analytic continuation might be tricky. A general idea is to find an explicit form of the physical wave functions in the new variables $P_\nu(h) = \mathrm{tr}\, h^\nu$ instead of the components of $h$, where $P_\nu(x) = P_\nu(h)$ are the independent Casimir polynomials (or functions in a general case) in the chosen gauge. We carry out this program for groups of rank 2 in the case of the oscillator potential [18], just to give an idea of how hard it might be to realize in general. We take the variables $\Phi_{1,2}$ introduced in Section 4.6. To calculate the density $\kappa(h)$ in the new variables, we make use of Lemma III.3.7 in [31], which asserts that

$$\det\left(\frac{\partial P_{\nu_k}}{\partial h^j}\right) = c'\kappa(h) , \qquad c' = \mathrm{const} , \qquad k, j = 1, 2, \ldots, \mathrm{rank}\, X . \tag{7.51}$$

Applying (7.51) to groups of rank 2, we find that $\kappa^2 \sim \Phi_1^{2\nu}(c_2 + c_1\Phi_2 - \Phi_2^2)$, where $\Phi_{1,2}$ are defined by (4.33) for $x = h$ and the coefficients are specified after Eq. (4.36). The variable $\Phi_2$ is then replaced by $(\Phi_2 - b)/\sqrt{a}$ (cf. (4.38)). As a result we obtain

$$\mu(h) = \kappa^2(h) = c\,\Phi_1^{2\nu}\left(1 - \Phi_2^2\right) , \tag{7.52}$$

where $c$ is a constant. Chevalley's theorem applies to the function $\mu(h)$. Eq. (7.52) determines the analytic gauge invariant continuation of $\mu$ to the whole configuration space. It is a polynomial of rank $2\nu$ constructed out of the two independent Casimir polynomials $P_2(x)$ and $P_\nu(x)$. A gauge invariant function $\Psi_E(x)$ is a regular function of $\Phi_{1,2}$. So, substituting $\Psi_E(x) = [\kappa(\Phi_1, \Phi_2)]^{-1}\varphi_E(\Phi_1, \Phi_2)$ into the Schrödinger equation in the total configuration space, $\hat H\Psi_E = E\Psi_E$, we find the equation

$$\hat H_{\rm ph}\,\varphi_E = E\,\varphi_E ; \qquad \hat H_{\rm ph} = -\frac{1}{2\Phi_1}\,\partial_1 \Phi_1 \partial_1 - \frac{\nu^2}{2\Phi_1^2}\,[1 - \Phi_2^2]^{1/2}\,\partial_2\,[1 - \Phi_2^2]^{1/2}\,\partial_2 + \frac{\Phi_1^2}{2} , \tag{7.53}$$

where $\partial_{1,2}$ are partial derivatives with respect to $\Phi_{1,2}$. Solutions are sought in the form $\varphi_E = g(\Phi_1) F(\Phi_2)$. Observe that, just as in the classical theory discussed in Section 4.6, the new variables allow us to separate independent oscillator modes and thereby to solve the kinematic coupling problem. Eq. (7.53) is equivalent to two equations

$$-[1 - \Phi_2^2]\,F'' + \Phi_2 F' + \gamma F = 0 , \tag{7.54}$$

$$-g'' - \frac{1}{\Phi_1}\,g' - \left(\frac{\gamma\nu^2}{\Phi_1^2} - \Phi_1^2 + 2E\right) g = 0 , \tag{7.55}$$

where $\gamma$ is a constant of separation of the variables. Since the function $\Psi_E(x) = \Psi_E(h)$ has to be finite at the boundaries of the Weyl chamber ($\mu = 0$ when $\Phi_2 = \pm 1$), the following boundary conditions are to be imposed on $F$:

$$F(\pm 1) = 0 . \tag{7.56}$$
The solution of Eq. (7.54) satisfying this condition is given by

$$F_m(\Phi_2) = \sin\left[(m+1)\cos^{-1}\Phi_2\right] = (1 - \Phi_2^2)^{1/2}\, U_m(\Phi_2) , \tag{7.57}$$

where $U_m(\Phi_2)$ are the Chebyshev polynomials, $m = 0, 1, 2, \ldots$, and $\gamma = -(m+1)^2$. Eq. (7.55) is transformed to the standard form (7.31) by the substitution $g = \Phi_1^{\nu(m+1)} \exp(-\Phi_1^2/2)\, f$ and by introducing a new variable $z = \Phi_1^2$. In Eq. (7.31) one should set $a = \nu(m+1) + 1$ and $b = -(E - a)/2$. Thus, the spectrum and the gauge invariant eigenfunctions are

$$E_{nm} = 2n + \nu m + N/2 , \tag{7.58}$$

$$\Psi_{nm} = c_{nm}\, \Phi_1^{\nu m}\, U_m(\Phi_2)\, L_n^{\nu(m+1)}(\Phi_1^2)\, e^{-\Phi_1^2/2} , \tag{7.59}$$

where $c_{nm}$ are normalization constants. The dimension $N$ of the gauge group specifies the ground state energy, as in the Fock space approach. To establish this within the Schrödinger picture, we have used the relation [32] $N = 2(\nu_1 + \nu_2 + \cdots + \nu_r) - r$, i.e., $N = 2\nu + 2$ for the groups of rank 2.

From the expression (7.59) we infer that $\Psi_{nm}$ depends only on the Casimir polynomials $P_{2,\nu}$. For the groups Sp(4) $\sim$ SO(5) and $G_2$, the factor of the exponential in Eq. (7.59) is a polynomial of $P_{2,\nu}$ since $\nu$ is an even integer ($\nu = 4, 6$, respectively), and, therefore, $\Phi_1^{\nu m}$ is a polynomial for any positive integer $m$. In the case of SU(3), $\nu = 3$, and for odd $m$, $\Phi_1^{3m}$ is proportional to the non-polynomial factor $[P_2]^{1/2}$. However, the coefficient $b$ in (4.36) vanishes for SU(3), thus leading to $\Phi_2 = \sqrt{6}\, P_3\, \Phi_1^{-3}$. Hence, the non-polynomial factor in $\Phi_1^{3m} U_m(\Phi_2)$ is canceled out. Since $P_{2,\nu}(x) = P_{2,\nu}(h)$, the wave functions (7.59) are invariant under the Weyl group, and have a unique gauge invariant continuation to the Lie algebra (the total configuration space).

Remark. The approach can also be applied to obtain explicitly gauge invariant wave functions for the SO(2) gauge matrix model with the oscillator potential. The idea is to write first the Schrödinger equation in the curvilinear coordinates (4.56). The physical wave functions do not depend on $\theta$, so the corresponding derivative should be omitted in the Laplace–Beltrami operator. Next, one introduces a new set of curvilinear coordinates to separate the variables in the Schrödinger equation (to remove the kinematic coupling): $q^2 = r\cos\varphi$, $q^3 = r\sin\varphi$. We refer to the works [17,182] for the details.

7.5. The operator approach to quantum Yang–Mills theory on a cylinder

Here we analyze the features associated with the coordinate singularities in the Schrödinger picture for a soluble gauge system with infinitely many degrees of freedom [95,52]. Following the Dirac method we replace the canonical variables by the corresponding operators, $E(x) \to -i\hbar\,\delta/\delta A(x)$, $A(x) \to A(x)$, $A(x) \in \mathcal{F}$, and get the quantum theory in the Schrödinger functional representation [96,97]

$$\hat H\,\Phi_n[A] = -\frac{\hbar^2}{2}\left\langle \frac{\delta}{\delta A}, \frac{\delta}{\delta A} \right\rangle \Phi_n[A] = E_n\,\Phi_n[A] , \tag{7.60}$$

$$\hat\sigma\,\Phi_n[A] = -i\hbar\,\nabla(A)\,\frac{\delta}{\delta A}\,\Phi_n[A] = 0 . \tag{7.61}$$
The states are now given by functionals on the space $\mathcal{F}$. In accordance with the general method proposed in the end of Section 7.4, to solve Eq. (7.61) and to project the Hamiltonian in (7.60) onto the gauge orbit space, one should introduce curvilinear coordinates associated with both a gauge transformation law and a chosen gauge condition. These coordinates are given in Eq. (5.32). In the new variables the orbit space is parameterized by homogeneous connections $a$ from the Cartan subalgebra. Following the analysis of the moduli space in Section 5.1, we also impose the condition $a \in K_W^+ \sim \mathcal{F}/\mathcal{G}$ to ensure a one-to-one correspondence between the "old" and "new" variables. If we assume that $a$ ranges over the entire Cartan subalgebra, then the values of the new variables $\Omega\Omega_s^{-1}$, $\hat R a$, for all $\hat R$ from the affine Weyl group $W_A$, are mapped to the same configuration $A(x)$ by (5.32). Therefore the admissible values of $a$ in (5.32) must be restricted to the Weyl cell $K_W^+$. We show below that the constraint operator $\hat\sigma$ commutes with the curvilinear variable $a$ and, therefore, $a$ is a formally gauge-invariant variable. The norm of the physical states is defined according to the rule (5.35)

$$\int_{\mathcal{F}} \prod_{x \in S^1} dA(x)\, \Phi_n^*[A]\,\Phi_{n'}[A] \;\to\; \int_{K_W^+} da\, \kappa^2(a)\, \Phi_n^*(a)\,\Phi_{n'}(a) = \delta_{nn'} , \tag{7.62}$$
where the infinite constant $C(l) = \int_{G/G_H} \prod_x dw(x)$ is removed by a renormalization of the physical states, which we denote by the arrow in (7.62). The integration over the non-physical variables gives an infinite factor, thus making the physical states non-normalizable in the original Hilbert space, even though the gauge orbits are compact. The origin of the divergence is the infinite number of non-physical degrees of freedom. Frankly speaking, at each space point $x$ the non-physical degrees of freedom contribute a finite factor, proportional to the group volume, to the norm of a physical state. As one might see from Eq. (5.43), the way to get around this difficulty is to make the number of Fourier modes finite, renormalize the physical states and then remove the regularization. This procedure is implied in Eq. (7.62).

The independence of the physical state from $w(x)$, as well as the formal gauge invariance of $a$, can be demonstrated explicitly by solving the Gauss law in the new curvilinear coordinates (5.32). We assert that

$$\hat\sigma\,\Phi_n[a, w] = -i\hbar g\,\hat\Omega^T\,\frac{\delta}{\delta w}\,\Phi_n[a, w] = 0 ; \tag{7.63}$$

here we have used the notation $(\hat\Omega y)_a = (\Omega y \Omega^{-1})_a = (y\hat\Omega^T)_a \equiv \hat\Omega_{ab}\, y_b$, $\hat\Omega^T\hat\Omega = \hat\Omega\hat\Omega^T = 1$ for any element $y \in X$. Since $\det\hat\Omega \neq 0$, the physical states are functionals independent of $w(x)$. To prove (7.63), we first derive the following relations from (5.33):

$$\delta a \equiv da = P_0^H\,\hat\Omega^T\,\delta A , \tag{7.64}$$

$$\delta w = ig\,\nabla^{-1}(a)\,(1 - P_0^H)\,\hat\Omega^T\,\delta A , \tag{7.65}$$

with $P_0^H$ being a projector on the subspace $\mathcal{F}_0^H$ of spatially homogeneous functions taking their values in the Cartan subalgebra. Recall that the operator $\nabla(a)$ is invertible on $(1 - P_0^H)\mathcal{F}$. The
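following simple computation leads us to the desired result:

$$\nabla(A)\frac{\delta}{\delta A} = \nabla(A)\left[\left\langle\frac{\delta a}{\delta A},\frac{\partial}{\partial a}\right\rangle_a+\left\langle\frac{\delta w}{\delta A},\frac{\delta}{\delta w}\right\rangle_w\right] \tag{7.66}$$

$$=\nabla(A)\left[\left(P_0^H\hat\Omega^T\right)^T\frac{\partial}{\partial a}+\left(ig\nabla^{-1}(a)(1-P_0^H)\hat\Omega^T\right)^T\frac{\delta}{\delta w}\right] \tag{7.67}$$

$$=\hat\Omega\,\nabla(a)P_0^H\frac{\partial}{\partial a}-ig\,\hat\Omega\,\nabla(a)(1-P_0^H)\nabla^{-1}(a)\frac{\delta}{\delta w}=-ig\,\hat\Omega^T\frac{\delta}{\delta w}. \tag{7.68}$$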
In Eq. (7.66), the subscript of the scalar product brackets denotes the variables over whose indices the scalar product is taken, i.e., all indices labeling independent degrees of freedom described by $A(x)$ (the Lie algebra ones and $x\in S^1$) in the scalar products in (7.66) are left free. Eq. (7.67) is obtained by the substitution of $\delta a/\delta A(x)$ and $\delta w(y)/\delta A(x)$, which are taken from Eqs. (7.64) and (7.65), respectively. To get (7.68), we have used $\nabla(a)P_0^H\,\partial/\partial a\equiv 0$ and $\nabla(A)\hat\Omega=\hat\Omega\nabla(a)$. Thus, the operator of multiplication by the variable $a$ commutes with the constraint operator, $[\hat\sigma,\hat a]=0$. In this approach, the Gauss law (7.60) can formally be solved even in four dimensions [95]. As has already been argued in Section 6.3, such a formally gauge invariant approach is not, in general, free of coordinate singularities. We now turn to the role of these singularities in quantum theory. To project the functional Laplace operator in Eq. (7.60) onto the gauge orbit space spanned by the variable $a$, one should calculate the Laplace-Beltrami operator in the new functional variables (5.32) and omit in it all terms containing the variational derivative $\delta/\delta w$. The metric (5.34) is block diagonal. The physical and nonphysical parts of the kinetic energy operator are decoupled (cf. (7.38)). After a transformation similar to (7.39), we arrive at the quantum mechanical problem

$$\hat H_{ph}\Phi_n(a)=\left[-\frac{\hbar^2}{4\pi l}\,\frac{1}{\kappa(a)}\,\Delta_{(r)}\,\kappa(a)-E_C\right]\Phi_n(a)=E_n\Phi_n(a), \tag{7.69}$$

where we have taken into account that the metric on the physical space is flat, $g^{aa}=(2\pi l)^{-1}$, and that the function $\kappa(a)$ is an eigenfunction of the Laplace operator (cf. (5.54)),

$$E_C=-\frac{\hbar^2}{4\pi l}\,\frac{\Delta_{(r)}\kappa}{\kappa}=\frac{\pi\hbar^2}{a_0^2 l}\,(\rho,\rho)=\frac{\pi\hbar^2 N}{24\,a_0^2 l}. \tag{7.70}$$
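The last equality in (7.70) rests on the identity $(\rho,\rho)=N/24$, with $N$ the dimension of the group. The following minimal sketch checks this arithmetic for SU(2), SU(3) and SU(4). It assumes that the roots of SU(n) are normalized so that $(\alpha,\alpha)=1/n$, which reproduces the value $(\omega,\omega)=1/2$ quoted for SU(2) in Section 7.6; the function name `rho_norm_su` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction


def rho_norm_su(n):
    """Return (rho, rho) for SU(n), where rho is half the sum of positive roots.

    Assumption (not taken from the text): roots are normalized as
    (alpha, alpha) = 1/n, i.e. the Gram matrix of simple roots is the
    A_{n-1} Cartan matrix multiplied by 1/(2n).
    """
    rank = n - 1
    scale = Fraction(1, 2 * n)
    gram = [[scale * (2 if i == j else -1 if abs(i - j) == 1 else 0)
             for j in range(rank)] for i in range(rank)]
    # Positive roots of A_{n-1} are alpha_i + ... + alpha_j with i <= j;
    # accumulate half of their sum in the basis of simple roots.
    rho = [Fraction(0)] * rank
    for i in range(rank):
        for j in range(i, rank):
            for k in range(i, j + 1):
                rho[k] += Fraction(1, 2)
    return sum(rho[i] * gram[i][j] * rho[j]
               for i in range(rank) for j in range(rank))


if __name__ == "__main__":
    for n in (2, 3, 4):
        dim = n * n - 1  # N = dim SU(n)
        print(n, rho_norm_su(n), Fraction(dim, 24))
```

Exact rational arithmetic is used so that the comparison with $N/24$ is not blurred by rounding.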
Substituting $\Phi_n=\kappa^{-1}\phi_n$ into (7.69) we find that $\phi_n$ is an $r$-dimensional plane wave, $\exp(2\pi i(\gamma_n,a)/a_0)$. However, not all values of the momentum vector $\gamma_n\in H$ are admissible because only regular solutions to (7.69) have a physical meaning. This means that $\phi_n(a)$ should vanish on the hyperplanes orthogonal to positive roots, $(a,\alpha)=n_\alpha a_0$, $n_\alpha$ an integer, because the function $\kappa^{-1}(a)$ has simple poles on them. Since $(\hat R\gamma_n,\hat R\gamma_n)=(\gamma_n,\gamma_n)$ for any $\hat R$ from the Weyl group $W$, the superposition of the plane waves

$$\Phi_n(a)\sim[\kappa(a)]^{-1}\sum_{\hat R\in W}\det\hat R\,\exp\left\{\frac{2\pi i}{a_0}(\hat R\gamma_n,a)\right\}\equiv[\kappa(a)]^{-1}\phi_n(a) \tag{7.71}$$
is an eigenstate of the physical Hamiltonian with the eigenvalue

$$E_n=\frac{\pi\hbar^2}{a_0^2 l}\left[(\gamma_n,\gamma_n)-(\rho,\rho)\right]. \tag{7.72}$$

The function (7.71) is also regular at the hyperplanes $(a,\alpha)=n_\alpha a_0$, provided the momentum $\gamma_n$ attains discrete values such that the number

$$\frac{2(\gamma_n,\beta)}{(\beta,\beta)}\in\mathbb Z \tag{7.73}$$
is an integer for any root $\beta$. Thus, the regularity condition has a dramatic effect on the physical spectrum: it is discrete, rather than continuous as one might naively expect after removing all nonphysical degrees of freedom by a gauge fixing, because the system has no potential. Moreover, the regular eigenfunctions (7.71) have a unique gauge invariant analytic continuation to the whole configuration space $\mathcal F$. They are characters of all irreducible representations of the Polyakov loop $P\exp[ig\oint dx\,A(x)]$, which means that the wave functions we have obtained do not depend on the particular parameterization of the gauge orbit space we have chosen to solve the Gauss law and the Schrödinger equation [52,54]. To prove the above statements, let us decompose $a$ into two parts, $a=a^{\parallel}+a^{\perp}$, such that $(a^{\perp},\alpha)=0$ for a root $\alpha$, and let $W^{(\alpha)}$ be the quotient $W/Z_2^{(\alpha)}$, $Z_2^{(\alpha)}=\{1,\hat R_\alpha\}$, where $\hat R_\alpha\alpha=-\alpha$ and, therefore, $\hat R_\alpha a^{\perp}=a^{\perp}$, $\hat R_\alpha a^{\parallel}=-a^{\parallel}$, $\det\hat R_\alpha=-1$. Then the sum in (7.71) can be rewritten as follows:

$$\phi_n(a)\sim\sum_{\hat R\in W^{(\alpha)}}\left[\det\hat R\,\exp\left\{\frac{2\pi i}{a_0}(\hat R\gamma_n,a)\right\}+\det(\hat R_\alpha\hat R)\,\exp\left\{\frac{2\pi i}{a_0}(\hat R_\alpha\hat R\gamma_n,a)\right\}\right]$$

$$=\sum_{\hat R\in W^{(\alpha)}}\det\hat R\,\exp\left\{\frac{2\pi i}{a_0}(\hat R\gamma_n,a)\right\}\left[1-\exp\left\{-\frac{4\pi i}{a_0}(\hat R\gamma_n,a^{\parallel})\right\}\right]. \tag{7.74}$$

Here we made use of the identities $\det(\hat R_\alpha\hat R)=-\det\hat R$ and

$$(\hat R_\alpha\hat R\gamma_n,a)=(\hat R\gamma_n,a^{\perp}-a^{\parallel})=(\hat R\gamma_n,a)-2(\hat R\gamma_n,a^{\parallel}). \tag{7.75}$$
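The cancellation expressed by (7.74) can be illustrated numerically for a rank-2 group. The sketch below builds the alternating Weyl sum $\phi_n$ of (7.71) for SU(3) and evaluates it on a hyperplane $(a,\alpha_1)=n_\alpha a_0$. It is only an illustration under stated assumptions: the simple roots are realized in $\mathbb R^2$ with $(\alpha,\alpha)=2$ (condition (7.73) is insensitive to this normalization), $a_0=1$, and the momentum is parameterized as $\gamma=m_1 w_1+m_2 w_2$ with fundamental weights $w_i$, so that (7.73) holds exactly when $m_1,m_2$ are integers. All function names are hypothetical.

```python
import cmath
import math

# Simple roots of SU(3) in R^2, normalized (assumption) so that (alpha, alpha) = 2.
A1 = (math.sqrt(2.0), 0.0)
A2 = (-math.sqrt(2.0) / 2.0, math.sqrt(1.5))


def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]


def reflect(v, alpha):
    """Weyl reflection of v in the hyperplane orthogonal to the root alpha."""
    c = 2.0 * dot(v, alpha) / dot(alpha, alpha)
    return (v[0] - c * alpha[0], v[1] - c * alpha[1])


def weyl_orbit(gamma):
    """Return pairs (det R, R gamma) for the six elements of the SU(3) Weyl group."""
    words = [(), (A1,), (A2,), (A1, A2), (A2, A1), (A1, A2, A1)]
    orbit = []
    for word in words:
        v = gamma
        for alpha in reversed(word):  # rightmost reflection acts first
            v = reflect(v, alpha)
        orbit.append(((-1) ** len(word), v))
    return orbit


def phi(gamma, a, a0=1.0):
    """Alternating sum over the Weyl group, cf. (7.71)."""
    return sum(sign * cmath.exp(2j * math.pi * dot(v, a) / a0)
               for sign, v in weyl_orbit(gamma))


def weight(m1, m2):
    """gamma = m1*w1 + m2*w2, where 2(w_i, alpha_j)/(alpha_j, alpha_j) = delta_ij."""
    w1 = ((2 * A1[0] + A2[0]) / 3, (2 * A1[1] + A2[1]) / 3)
    w2 = ((A1[0] + 2 * A2[0]) / 3, (A1[1] + 2 * A2[1]) / 3)
    return (m1 * w1[0] + m2 * w2[0], m1 * w1[1] + m2 * w2[1])


def on_hyperplane(n, t, a0=1.0):
    """A point a with (a, A1) = n*a0; t moves it along the hyperplane."""
    s = n * a0 / dot(A1, A1)
    return (s * A1[0], s * A1[1] + t)


if __name__ == "__main__":
    a = on_hyperplane(1, 0.3)
    print(abs(phi(weight(2, 1), a)))    # integral momentum
    print(abs(phi(weight(1.5, 1), a)))  # violates (7.73)
```

For integral $(m_1,m_2)$ the sum vanishes on the hyperplane up to rounding, so $\kappa^{-1}\phi_n$ stays finite there; for a momentum violating (7.73) it does not vanish.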
In a neighborhood of the hyperplane $(a,\alpha)=n_\alpha a_0$ with a non-vanishing integer $n_\alpha$, we have $a^{\parallel}=n_\alpha a_0\,\alpha/(\alpha,\alpha)+\epsilon\alpha$ where $\epsilon\to 0$. The sum in (7.74) vanishes as $\epsilon\to 0$ if the factor in the brackets vanishes. This yields the condition that $2(\hat R\gamma_n,\alpha)/(\alpha,\alpha)$ must be an integer. Since $\hat R\alpha=\beta$ is a root, we conclude that the function (7.71) is regular, provided the momentum $\gamma_n$ satisfies the condition (7.73). For any $\gamma_n$ satisfying (7.73), a vector $\hat R_0\gamma_n$, $\hat R_0\in W$, also satisfies (7.73) and corresponds to the same energy level (7.72) because the Killing form is $W$-invariant. Replacing $\gamma_n$ by $\hat R_0\gamma_n$ in (7.71) we have $\phi_n(a)\to\det\hat R_0\,\phi_n(a)=\pm\phi_n(a)$, which means that linearly independent wave functions corresponding to each energy level (7.72) are determined only by $\gamma_n$ modulo the Weyl transformations, that is, $\gamma_n\in H/W\sim K^+$, thus leading to the condition $(\omega,\gamma_n)>0$, $\omega$ ranging over simple roots. Moreover, if $\gamma_n\in\partial K^+$, meaning that $(\gamma_n,\omega)=0$ for a certain simple root $\omega$, then the corresponding wave function vanishes because $\phi_n(a)=0$. Indeed, changing the summation in (7.71) $\hat R\to\hat R\hat R_\omega$,
$\hat R_\omega\omega=-\omega$, and making use of the relations $\det\hat R_\omega=-1$ and $\hat R_\omega\gamma_n=\gamma_n$ (since $\hat R_\omega$ is a reflection in the hyperplane perpendicular to $\omega$ and $(\gamma_n,\omega)=0$) we get $\phi_n(a)=-\phi_n(a)$ and, hence, $\phi_n(a)=0$. The property (5.52) of the measure $\kappa(a)$ leads us to the conclusion that the regular solutions of the Schrödinger equation (7.69) are invariant with respect to the affine Weyl group,

$$\Phi_n(\hat R_{\alpha,m}a)=[\kappa(\hat R_{\alpha,m}a)]^{-1}\phi_n(\hat R_{\alpha,m}a)=\Phi_n(a). \tag{7.76}$$

Thus, we observe again that in the Dirac quantization scheme there is no need to postulate the invariance of physical states with respect to residual gauge transformations. The regularity condition for the wave functions at the singular points of the chosen orbit space parameterization ensures this invariance. Now we obtain an explicitly gauge invariant analytic continuation of the physical wave functions into the total functional configuration space $\mathcal F$. Recall that we solved a similar problem for the mechanical gauge models by means of the theorem of Chevalley. Here we invoke other remarkable facts from group theory to achieve the goal: the relation (5.44) between the measure $\kappa$ and the Weyl determinant, and the Weyl formula for the characters $\chi_{\Lambda_n}$ of irreducible representations of Lie groups [64, p. 909]. We get

$$\Phi_n(a)=c_n\,\frac{\sum_{\hat R\in W}\det\hat R\,\exp\left\{\frac{2\pi i}{a_0}(\rho+\Lambda_n,\hat R a)\right\}}{\sum_{\hat R\in W}\det\hat R\,\exp\left\{\frac{2\pi i}{a_0}(\rho,\hat R a)\right\}}=c_n\,\chi_{\Lambda_n}\!\left(e^{2\pi i a/a_0}\right), \tag{7.77}$$

where $c_n$ are normalization constants and $\gamma_n=\rho+\Lambda_n$. The lattice formed by vectors $\Lambda_n$ labels the irreducible representations of the Lie group. The sum over the Weyl group in (7.71) should vanish for all $\gamma_n$ such that $(\gamma_n,\gamma_n)<(\rho,\rho)$ because the function (7.71) must be regular, which is, in turn, possible only if $(\gamma_n,\gamma_n)\geq(\rho,\rho)$, as one can see from the explicit form (5.44) of the function $\kappa(a)$. This latter condition on the norm of $\gamma_n$ also ensures that the spectrum (7.72) is non-negative. For the character $\chi_{\Lambda_n}$ we have the following representation:

$$\chi_{\Lambda_n}\!\left(\exp\frac{2\pi i a}{a_0}\right)=\mathrm{tr}\,(\exp 2\pi i g l a)_{\Lambda_n}=\mathrm{tr}\left(P\exp ig\oint_{S^1}A\,dx\right)_{\Lambda_n}, \tag{7.78}$$

where by $(e^{y})_{\Lambda_n}$ we imply the group element $e^{y}$ in the irreducible representation $\Lambda_n$. The last equality in (7.78) follows from the fact that the variable $a$ is related to a generic connection $A(x)$ by a gauge transformation. Formula (7.78) establishes the gauge invariant analytic continuation of the eigenstates (7.77) to the total configuration space. Thus, the solutions to the system of functional equations (7.60) and (7.61), which are independent of any parameterization of the gauge orbit space, are given by the characters of the Polyakov loop in all irreducible representations of the gauge group. The wave functions (7.77) are orthogonal with respect to the scalar product (7.62). This follows from the orthogonality of the characters (7.78). For the normalization coefficients $c_n$ we obtain

$$\delta_{nn'}=\int_{K_W^+}da\,\kappa^2(a)\,\Phi_n(a)\,\Phi_{n'}^*(a)=2^{2N_+}c_n c_{n'}^*\int_{K_W^+}da\sum_{\hat R,\hat R'\in W}\det\hat R\hat R'\,\exp\left\{\frac{2\pi i}{a_0}(a,\hat R\gamma_n-\hat R'\gamma_{n'})\right\}. \tag{7.79}$$
The integrand in (7.79) is a periodic function on the Cartan subalgebra. Its periods are determined by the geometry of the Weyl cell. Therefore, the integral over the periods vanishes for all $\hat R\neq\hat R'$ because $\gamma_n$ and $\gamma_{n'}$ belong to the Weyl chamber and the Weyl group acts simply and transitively on the set of the Weyl chambers. Hence, there is no Weyl group element $\hat R$ such that $\hat R\gamma_n=\gamma_{n'}$ if $\gamma_{n,n'}\in K^+$. For $\hat R=\hat R'$ the integral differs from zero only for $\gamma_n=\gamma_{n'}$, i.e., when the periodic exponential equals one. Thus,

$$|c_n|=2^{-N_+}\left(N_W V_{K_W^+}\right)^{-1/2}, \tag{7.80}$$

where $N_W=l_1 l_2\cdots l_r$ is the number of elements in the Weyl group and $V_{K_W^+}$ is the volume of the Weyl cell. The energy spectrum (7.72) seems to depend on the normalization of the roots in the Lie algebra. Recall, however, that the norms of the roots are fixed by the structure constants in the Cartan-Weyl basis (see Section 4.2 for details). If the roots are rescaled by a factor $c$, which means, in fact, rescaling the structure constants in the Cartan-Weyl basis by the factor $c^{-1}$, the invariant scalar product $(x,y)=\mathrm{tr}(\mathrm{ad}\,x\,\mathrm{ad}\,y)$ gets rescaled accordingly, i.e., by $c^{-2}$. Therefore the spectrum (7.72) does not depend on the rescaling factor because the factor $c^{-2}$ in the scalar product is canceled against $c^{2}$ resulting from rescaling $\gamma_n$ and $\rho$ by $c$.

Remark. The Coulomb gauge can be fixed prior to canonical quantization. Such an approach has been considered by Hetrick and Hosotani [48]. Some boundary condition at the Gribov horizon must be assumed. The choice of the boundary conditions is not unique and depends on a self-adjoint extension of the Laplace operator in the Weyl cell. The spectrum differs from Eq. (7.72) to order $O(\hbar)$. The model can also be solved without an explicit parameterization of the gauge orbit space via a gauge fixing. According to the earlier work of Migdal [46] devoted to the lattice version of the model, all physical degrees of freedom can be described by the Polyakov loop extended around the compactified space (the circle). Rajeev formulated the Schrödinger equation in terms of the Polyakov loop and solved it [47]. Our conclusions [52] coincide with those obtained in [46,47,49]. Although we have used an explicit parameterization of the gauge orbit space via the Coulomb gauge, which has singularities, all the eigenstates found are explicitly gauge invariant and regular in the total configuration space. The technique developed is important for establishing a gauge invariant path integral formalism and resolving the Gribov obstruction within it. It is also noteworthy that, although the physical configuration space has an orbifold structure, the quantum theory obtained from the Dirac formalism differs from a general quantum mechanics on orbifolds [98], where wave functions are, generally, allowed to have singularities at the singular points of the configuration space. In our approach the regularity condition plays the major role in maintaining the gauge invariance if an explicit parameterization of the orbit space is used.

7.6. Homotopically non-trivial Gribov transformations

Having found the physical wave functions in the parameterization of the gauge orbit space that is associated with the Coulomb gauge, we can investigate their properties under homotopically non-trivial residual gauge transformations. The wave functions (7.77) can be regarded as the gauge invariant functions (7.78) reduced on the gauge fixing surface $\partial A=0$ in the functional space $\mathcal F$. The Coulomb gauge is not complete and, therefore, there are Gribov copies on the gauge fixing
surface. They are related, in general, either by homotopically trivial or non-trivial gauge transformations [100,101]. We have excluded the latter when calculating the physical phase space because they cannot be generated by the constraints; that is, two classical states related by homotopically non-trivial transformations are, in fact, two different physical states, so they have to correspond to two different points in the physical phase space. Here we demonstrate that the physical wave functions are not invariant under homotopically non-trivial residual gauge transformations. An analogy can be made with instanton physics [8,102] in Yang-Mills theory. An instanton connects two distinct classical vacua related by a homotopically non-trivial gauge transformation. Physical wave functions acquire a phase factor under such a transformation. Consider the group SU(2) first. The algebra has one positive root $\omega$. Solutions to (7.73) are given by $\gamma_n=\omega n/2$ where $n$ ranges over positive integers because $K^+=\mathbb R_+$ and $\partial K^+$ coincides with the origin $\gamma_n=0$. The spectrum and wave functions respectively read

$$E_n=\frac{\pi\hbar^2}{4a_0^2 l}\,(n^2-1)(\omega,\omega),\qquad n=1,2,\ldots; \tag{7.81}$$

$$\Phi_n=c_n\,\frac{\sin[\pi n(a,\omega)/a_0]}{\sin[\pi(a,\omega)/a_0]}. \tag{7.82}$$

Substituting $n=2j+1$, $j=0,1/2,1,\ldots$, into (7.81) we observe that $E_n$ is proportional to eigenvalues of the quadratic Casimir operator of SU(2), $E_n\sim j(j+1)$, where the spin $j$ labels the irreducible representations of SU(2). Let us introduce a new variable $\theta$ such that $a=a_0\omega\theta/(\omega,\omega)$. When $a$ ranges over the Weyl cell $K_W^+$, the variable $\theta$ spans the open interval $(0,1)$. The measure $da$ is defined in the orthonormal basis in $H$ (meaning that $H\sim\mathbb R^r$). For the SU(2) case we have $da\equiv da_3$, $a=\sqrt2\,\omega a_3$ so that $(a,a)=a_3^2$, $a_3\in\mathbb R$. Here we have used $(\omega,\omega)=1/2$ for SU(2). Hence, the normalization coefficients $c_n$ in (7.82) are

$$c_n=\left(\sqrt2\,a_0\int_0^1 d\theta\,\sin^2\pi n\theta\right)^{-1/2}=\left(\frac{a_0}{\sqrt2}\right)^{-1/2}. \tag{7.83}$$

The action of homotopically non-trivial elements (5.7) of an arbitrary simple compact gauge group on the argument of the wave functions is determined by the shifts $a\to a+(i/g)\,\Omega_s\partial\Omega_s^{-1}$ where $\Omega_s=\exp(ix\eta/l)$ so that (cf. (5.24))

$$\exp(2\pi i\eta)=z\in Z_G. \tag{7.84}$$

The lattice $\eta$ is given by integral linear combinations of elements $\alpha/(\alpha,\alpha)$, with $\alpha$ ranging over the root system, because [30]

$$\exp\frac{2\pi i\alpha}{(\alpha,\alpha)}\in Z_G \tag{7.85}$$

for any root $\alpha$. Thus, homotopically non-trivial gauge transformations are generated by shifts (cf. (5.26) and the example of SU(3) given in Fig. 5)

$$a\to a+\frac{n\,a_0\,\alpha}{(\alpha,\alpha)},\qquad n\in\mathbb Z. \tag{7.86}$$
In the matrix representation of SU(2) the only positive root is $\omega=\tau_3/4$ (see Section 3.2). Then $\exp(2\pi i\omega/(\omega,\omega))=\exp(i\pi\tau_3)=-e\in Z_2=Z_{SU(2)}$. Therefore in the case of SU(2) we get the following transformation of the wave functions (7.82)

$$\Phi_n\!\left(a+\frac{a_0\,\omega}{(\omega,\omega)}\right)=(-1)^{n+1}\,\Phi_n(a), \tag{7.87}$$
i.e., the physical states acquire a phase factor under homotopically non-trivial gauge transformations. The analysis can easily be extended to an arbitrary group by using the properties of the root pattern and the Weyl representation of the characters (7.77) under the transformations (7.86). However, the explicit form of the gauge invariant wave functions (7.78) in the total configuration space $\mathcal F$ leads to the answer faster. Making a homotopically non-trivial gauge transformation of the Polyakov loop and taking into account the twisted periodicity condition (5.7), we find

$$\mathrm{tr}\left(P\exp ig\oint A\,dx\right)_{\Lambda_n}\to\ \mathrm{tr}\left(z\,P\exp ig\oint A\,dx\right)_{\Lambda_n}. \tag{7.88}$$
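For SU(2) these statements are easy to check numerically. The sketch below works with the variable $\theta=(a,\omega)/a_0$, in which the shift (7.86) with $n=1$ is $\theta\to\theta+1$ and, as an assumption of this illustration, the density is taken as $\kappa=\sin\pi\theta$ with $c_n=1$. It verifies the phase factor (7.87), the representation of (7.82) as a trace over the $n=2j+1$ weights of the spin-$j$ representation (cf. (7.78)), and the orthogonality of the wave functions with the measure $\kappa^2\,d\theta$. Function names are hypothetical.

```python
import math


def character(n, theta):
    """Wave function (7.82) with c_n = 1, as a function of theta = (a, omega)/a_0."""
    return math.sin(math.pi * n * theta) / math.sin(math.pi * theta)


def trace_form(n, theta):
    """Trace of diag(exp(2 pi i m theta)), m = -j..j, in the spin-j irrep, n = 2j + 1."""
    return sum(math.cos(math.pi * theta * (n - 1 - 2 * k)) for k in range(n))


def overlap(n, m, steps=2000):
    """Midpoint-rule integral of sin^2(pi theta) * Phi_n * Phi_m over (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        th = (i + 0.5) * h
        total += math.sin(math.pi * th) ** 2 * character(n, th) * character(m, th)
    return total * h


if __name__ == "__main__":
    theta = 0.37
    for n in (1, 2, 3, 4):
        ratio = character(n, theta + 1.0) / character(n, theta)
        print(n, round(ratio), round(overlap(n, n), 6), round(overlap(n, n + 1), 6))
```

The diagonal overlaps equal $1/2$, in agreement with the integral appearing in (7.83), and wave functions with even $n$ (half-integer spin) change sign under the shift, while those with odd $n$ are invariant.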
Thus, the Gauss law (7.61) provides only the invariance of physical states with respect to gauge transformations which can be continuously deformed towards the identity.

7.7. Reduced phase-space quantization versus the Dirac approach

The key idea for including a gauge condition, chosen to parameterize the physical configuration space, into the Dirac scheme is to use the curvilinear coordinates associated with the gauge condition and the gauge transformation law to solve the constraints and find the physical quantum Hamiltonian. We have also seen that this approach can be applied in classical theory. Here we will compare quantum theories obtained by the Dirac procedure and by what is known as the reduced phase-space quantization. By the latter we imply that non-physical degrees of freedom are removed by a suitable canonical transformation such that the constraints turn into some of the new canonical momenta. Due to the gauge invariance, the corresponding canonical coordinates are cyclic, i.e., the Hamiltonian does not depend on them. So the physical Hamiltonian is obtained by setting the non-physical momenta to zero. Finally, the theory is canonically quantized. The point we would like to stress in the subsequent analysis is the following. All quantum theories obtained by the Dirac procedure with various parameterizations of the physical configuration space (i.e., with various gauges) are unitarily equivalent. Thus, physical quantities like the spectrum of the Hamiltonian are independent of the parameterization of the physical configuration space. In contrast, the reduced phase-space quantization involves ambiguities which, when not taken care of, may lead to a gauge dependent quantum theory. Here we discuss gauge systems in rather general settings and turn to examples only to illustrate general concepts. Let operators $\Omega$ acting in a space isomorphic to $\mathbb R^N$ realize a linear representation of a compact group $G$. Consider a quantum theory determined by the Schrödinger equation

$$\left(-\frac12\left\langle\frac{\partial}{\partial x},\frac{\partial}{\partial x}\right\rangle+V(x)\right)\psi_E=E\,\psi_E, \tag{7.89}$$
where $x\in\mathbb R^N$, and the group $G$ acts on it as $x\to\Omega(\omega)x$, $\Omega(\omega)\in G$; $\langle x,y\rangle=\sum_{p=1}^N x_p y_p=\langle\Omega x,\Omega y\rangle$ is an invariant scalar product in the representation space. We also assume that the potential is invariant under $G$-transformations, $V(\Omega x)=V(x)$. The eigenfunctions $\psi_E$ are normalized by the condition

$$\int_{\mathbb R^N}dx\,\psi_E^*(x)\,\psi_{E'}(x)=\delta_{EE'}. \tag{7.90}$$

The theory turns into a gauge theory if we require that physical states are annihilated by the operators $\hat\sigma_a=\hat\sigma_a^\dagger$ generating $G$-transformations of $x$, $\hat\sigma_a\Psi(x)=0$. These conditions determine a physical subspace in the Hilbert space. By definition, we have

$$\exp(i\omega_a\hat\sigma_a)\,\psi(x)=\psi(\Omega(\omega)x). \tag{7.91}$$

Therefore, the physical states are $G$-invariant,

$$\Psi(\Omega(\omega)x)=\Psi(x). \tag{7.92}$$
Let the number of physical degrees of freedom in the system equal $M$; then the number of independent constraints is $N-M$. One can also admit that $N$ or $M$, or both of them, are infinite. As in the 2D Yang-Mills theory, we can always introduce a countable functional orthogonal basis in any gauge field theory and regard the coefficients of the decomposition of the fields over the basis functions as independent degrees of freedom [186]. Suppose we would like to span the physical configuration space $K\sim\mathbb R^N/G$ by local coordinates satisfying a gauge condition $\chi(x)=0$. The gauge condition fixes the gauge arbitrariness modulo possible discrete gauge transformations, that is, there is no non-physical degree of freedom left. Let $u\in\mathbb R^M$ be a parameter of the gauge condition surface, $x=f(u)$, such that $\chi(f(u))$ identically vanishes for all $u\in\mathbb R^M$. By analogy with (6.22) we introduce curvilinear coordinates associated with the chosen gauge and the gauge transformation law

$$x=x(\theta,u)=\Omega(\theta)f(u), \tag{7.93}$$
where the variables $\theta$ run over the manifold $G/G_f$ with $G_f$ being the stationary group of the vector $x=f$, $G_f f=f$. The subgroup $G_f$ is non-trivial if the constraints are reducible, as in the mechanical model discussed in Section 4. The metric tensor in the new coordinates reads

$$\langle dx,dx\rangle=\langle df,df\rangle+2\langle df,\delta\theta f\rangle+\langle\delta\theta f,\delta\theta f\rangle\equiv g_{AB}\,dy^A dy^B, \tag{7.94}$$

where we have put $\delta\theta=\Omega^\dagger d\Omega$ and $dy^1\equiv du$, $dy^2\equiv d\theta$. An integral in the new variables assumes the form

$$\int_{\mathbb R^N}dx\,\psi(x)=\int_{G/G_f}\bigwedge d\theta\int_K d^Mu\,\mu(u)\,\psi(\Omega(\theta)f(u)); \tag{7.95}$$

here $\mu(u)=(\det g_{AB})^{1/2}$, and $K$ is a domain in $\mathbb R^M$ such that the mapping (7.93), $K\otimes G/G_f\to\mathbb R^N$, is one-to-one. To determine the modular domain $K$, one should find transformations $\theta,u\to\hat R\theta,\hat R u$, $\hat R\in\tilde S_\chi$, which leave $x$ unchanged, $x(\hat R\theta,\hat R u)=x(\theta,u)$. Obviously, $\tilde S_\chi=T_e\times S_\chi$ where $T_e$ is a group of translations of $\theta$ through periods of the manifold $G/G_f$, while the set $S_\chi$ is obtained by solving Eqs. (6.9) and (6.10) with $f$ replaced by $f\in\mathbb R^N$, $u\in\mathbb R^M$, $\Omega_s\in G$, so $K\sim\mathbb R^M/S_\chi$. Indeed, if Eq. (6.9) has
non-trivial solutions (the trivial one $\Omega_s=1$ always exists by the definition of $f(u)$), then all points $\Omega_s f$ belong to the gauge condition surface and, hence, $\Omega_s f(u)=f(u_s)$, $u_s=u_s(u)$. The transformations $\Omega_s$ determine the Gribov copies on the gauge fixing surface. Consider transformations of $\theta$ generated by the group shift $\Omega(\theta)\to\Omega(\theta)\Omega_s^{-1}=\Omega(\theta_s)$, $\theta_s=\theta_s(\theta,u)$. Setting $\hat R u=u_s$ and $\hat R\theta=\theta_s$ we see that the transformations $\hat R\in S_\chi$ leave $x=x(\theta,u)$ untouched. To avoid a "double" counting in the integral (7.95), one has to restrict the integration domain for $u$ to the quotient $\mathbb R^M/S_\chi\sim K$. The modular domain $K$ can be specified as a portion of the gauge condition surface $x=f(u)$, $u\in K\subset\mathbb R^M$, which has just one common point with any gauge orbit. The choice of the fundamental modular domain is not unique, as we have already seen in Section 6.2. In (7.95), we assume the choice of $K$ such that $\mu>0$ for $u\in K$. Having chosen the parameterization of $K$, we fix a representation of $S_\chi$ by functions $\hat R u=u_s(u)$, $u\in K$, $u_s\in K_s$, i.e., $K$ is the domain of the function $u_s(u)$ and $K_s$ is its range. The intersection $K_s\cap K_{s'}=\emptyset$ is empty for any $\hat R\neq\hat R'$. Then $\mathbb R^M=\cup_s K_s$ up to a set of zero measure formed by the union of the boundaries $\partial K_s$. We define an orientation of $K_s$ so that for all $\hat R\in S_\chi$, $\int_{K_s}du\,\phi(u)\geq 0$ for any $\phi(u)\geq 0$, and the following rules hold:

$$\int_{\mathbb R^M}du\,\phi(u)=\sum_{S_\chi}\int_{K_s}du\,\phi(u), \tag{7.96}$$

$$\int_{K}du\,|J_s(u)|\,\phi(u)=\int_{K_s}du\,\phi(u_s^{-1}(u)), \tag{7.97}$$
where $J_s(u)$ is the Jacobian of the change of variable $u\to u_s(u)$; the absolute value of $J_s$ has been inserted in (7.97) to preserve the positive orientation of the integration domain.

Remark. The number of elements in $S_\chi$ can depend on $u$. We follow the procedure described in Section 6.2. We define a domain $\mathbb R^M_\alpha\subseteq\mathbb R^M$ such that $S_\chi=S_\alpha$ has a fixed number of elements for all $u\in\mathbb R^M_\alpha$. Then $K=\cup_\alpha K_\alpha$, $K_\alpha=\mathbb R^M_\alpha/S_\alpha$, $\mathbb R^M=\cup_\alpha\mathbb R^M_\alpha$. The sum in (7.96) means $\sum_{S_\chi}=\sum_\alpha\sum_{S_\alpha}$, and $K_s$ carries an additional subscript $\alpha$. In what follows we will omit it and use the simplified notations (7.96)-(7.97) to avoid piling up subscripts in formulas. The subscript $\alpha$ can be easily restored by means of the rule just explained.

Let us illustrate some of the concepts introduced with the example of the SO(2) model of Section 6.2. We have $G=SO(2)$, $G_f=1$, $\det g_{AB}=f'^2f^2-(f',Tf)^2=(f',f)^2=\mu^2(u)$. We take a particular form of $f$ considered in Section 6.2 as an example. Set $K=\cup_\alpha K_\alpha$, $K_1=(0,u_0/\gamma_0)$, $K_2=(u_0/\gamma_0,u_0)$, $K_3=(u_0,\infty)$, i.e., $K=\mathbb R_+$; then $\int_{-\infty}^{\infty}du\,\phi=\sum_\alpha\int_{\mathbb R_\alpha}du\,\phi$ and (7.96) means that the upper integral limit is always greater than the lower one, for example,

$$\int_{\mathbb R_2}du\,\phi(u)=\left(\int_{-3u_0}^{-2u_0}+\int_{-2u_0}^{-u_0}+\int_{-u_0}^{-u_0/\gamma_0}+\int_{u_0/\gamma_0}^{u_0}\right)du\,\phi(u),$$

where the terms of the sum correspond to integrations over $\hat R_3K_2$, $\hat R_2K_2$, $\hat R_1K_2$ and $K_2$, respectively. The explicit form of the functions $u_s(u)$ is given by (6.12)-(6.13). The following chain of
equalities illustrates the rule (7.97):

$$\int_{\hat R_3K_2}du_{s_3}\,\phi=\int_{-3u_0}^{-2u_0}du_{s_3}\,\phi=\int_{u_0}^{u_0/\gamma_0}du\,J_{s_3}\,\phi=-\int_{u_0/\gamma_0}^{u_0}du\,J_{s_3}\,\phi=\int_{K_2}du\,|J_{s_3}|\,\phi; \tag{7.98}$$

the last equality results from $J_{s_3}=du_{s_3}/du<0$ (cf. (6.14)). Solutions to the constraint equations $\hat\sigma_a\Psi(x)=0$ are given by functions independent of $\theta$,

$$\Psi(x)=\Psi(\Omega(\theta)f(u))=\Psi(f(u))\equiv\Phi(u), \tag{7.99}$$
because $\hat\sigma_a$ generate only shifts of $\theta$ and leave $u$ untouched. To obtain a physical Hamiltonian, one has to write the Laplacian in (7.89) via the new variables (7.93), pull all the derivatives with respect to $\theta$ to the right and then set them to zero. In doing so, we get

$$\hat H^f_{ph}\Phi_E(u)=\left(\frac12\,\hat p_i\,g^{ij}_{ph}\,\hat p_j+V_q(u)+V(f(u))\right)\Phi_E(u)=E\,\Phi_E(u); \tag{7.100}$$

here we have introduced hermitian momenta $\hat p_j=-i\hbar\,\mu^{-1/2}\partial_j\,\mu^{1/2}$, $\partial_j=\partial/\partial u^j$; the induced inverse metric $g^{ij}_{ph}$ on the physical configuration space is the 11-component (see (7.94)) of a tensor $g^{AB}$ inverse to $g_{AB}$, $g^{AC}g_{CB}=\delta^A_B$, $g^{ij}_{ph}=(g^{11})^{ij}$, $i,j=1,2,\ldots,M$. The quantum potential,

$$V_q=\frac{\hbar^2}{2\sqrt\mu}\left(\partial_i g^{ij}_{ph}\right)\partial_j\sqrt\mu+\frac{\hbar^2}{2\sqrt\mu}\,g^{ij}_{ph}\,\partial_i\partial_j\sqrt\mu, \tag{7.101}$$

occurs after an appropriate re-ordering of the operators $\hat u^i$ and $\hat p_i$ in the original Laplace-Beltrami operator to transform it to the form of the kinetic energy operator in the Hamiltonian in (7.100). The scalar product is reduced to

$$\int_{\mathbb R^N}dx\,\Phi_E^*(u)\,\Phi_{E'}(u)\to\int_K d^Mu\,\mu(u)\,\Phi_E^*(u)\,\Phi_{E'}(u)=\delta_{EE'}, \tag{7.102}$$
where the integral over $G/G_f$ has been included into the norms of physical states, which we denote by the arrow in (7.102). The construction of an operator description of a gauge condition is completed. In this approach the variables $u$ appear to be gauge invariant; they parameterize the physical configuration space $CS_{phys}=\mathbb R^N/G$. Two different choices of $f(u)$ (or of the gauge condition $\chi$) correspond to two different parameterizations of $CS_{phys}$ related to one another by a change of variables $u=u(\tilde u)$ in (7.100)-(7.102) because $x=f(\tilde u(u))=\tilde f(u)$. Therefore, quantum theories constructed with different gauges are unitarily equivalent in the Dirac approach because the Hamiltonian in (7.100) is invariant under general coordinate transformations $u\to\tilde u(u)$. Physical quantities like the spectrum of the Hamiltonian (7.100) are independent of the choice of $\chi$ or $f$. This holds even though the explicit form of the physical Hamiltonian depends on the concrete choice of $f$. We emphasize that the form (7.101) of the quantum potential is crucial for establishing the unitary equivalence of quantum theories in different gauges [129]. To illustrate this statement, consider the simplest example $G=SO(2)$, $M=1$, $g_{ph}=r^2(u)/\mu^2(u)$, and compare descriptions in the coordinates (6.22) and in the polar ones ($f_1=r$, $f_2=0$). For this purpose we change variables $r=r(u)$ in (7.100)-(7.102). For $u\in K$ the function $r(u)$ is
invertible, $u=u(r)$, $r\in\mathbb R_+$. Simple straightforward calculations lead us to the following equalities: $\hat H^f_{ph}=\frac12\hat p_r^2+V_q(r)+V$, $\hat p_r=-i\hbar\,r^{-1/2}\partial_r\,r^{1/2}$, $V_q=-\hbar^2(8r^2)^{-1}$, $\int_K du\,\mu\,\phi=\int_0^\infty dr\,r\,\phi$. This is nothing but the quantum mechanics of radial motion on a plane. All theories with different $f$'s are unitarily equivalent to it and, therefore, to each other. The specific operator ordering obtained in the Dirac method is important to ensure the unitary equivalence. Had $V_q$ been different from (7.101), the spectrum of the physical Hamiltonian in (7.100) would generally have depended on the gauge. This statement can also be verified in general by an explicit computation of the Hamiltonian in (7.100) in the new parameterization $\tilde u=\tilde u(u)$: the Hamiltonian remains invariant under general coordinate transformations if the quantum potential has the form (7.101). Thus, the operator ordering appears to be of great importance for the gauge invariance of the theory in a chosen parameterization of the physical configuration space. The Dirac method leads to the operator ordering that guarantees the unitary equivalence of all representations of a quantum gauge theory with various parameterizations of the gauge orbit space (see also the remark at the very end of this section).

Now let us take a formal classical limit of the Hamiltonian in Eq. (7.100), meaning that $\hbar=0$ and the operators $\hat p$ and $\hat u$ are replaced by commutative canonical variables $p$ and $u$. The classical Hamiltonian is

$$H_{ph}=\frac12\,g^{ij}_{ph}\,p_i p_j+V(f(u)). \tag{7.103}$$

This Hamiltonian can also be obtained by the canonical transformation associated with the change of variables (7.93), just as we derived the Hamiltonian (6.25) for the SO(2) model in an arbitrary gauge. The constraints $\sigma_a$ become linear combinations of the momenta conjugated to the variables $\theta$ that span the gauge orbits. Thanks to the gauge invariance, the Poisson bracket of the total Hamiltonian and the constraints is zero. The canonical momenta conjugated to the $\theta$'s are integrals of motion, and, therefore, the variables $\theta$ are cyclic: the Hamiltonian does not depend on them. So the Hamiltonian (7.103) is the reduction of the total Hamiltonian on the physical phase space. Had we eliminated the non-physical degrees of freedom in the classical theory, the Hamiltonian (7.103) would have been the starting point to develop a quantum theory. The difficulties arising in this approach are twofold. First, the phase space spanned by $p$ and $u$ may not be Euclidean. In particular, $u$ may not take its values in the full Euclidean space $\mathbb R^M$. Therefore, a canonical quantization runs into the notorious problem of the self-adjointness of the momentum operators. Second, the kinetic energy exhibits an operator ordering ambiguity. The hermiticity condition for the quantum Hamiltonian is not generally sufficient to fix the operator ordering uniquely. The physical Hamiltonian (7.103) describes a motion in a curved space with the metric $g^{ph}_{ij}$. What quantization procedures for motion in curved spaces are available? The most popular is to replace the kinetic energy by the corresponding Laplace-Beltrami operator (7.38) for the physical metric [103]. Let us see what quantum theory emerges when this approach is applied to the Hamiltonian (6.25), which is a one-dimensional version of (7.103). A general consideration would be slightly more involved, but leads to the same conclusion. Comparing Eqs. (6.25) and (7.103) we see that $g(u)=r^2/\mu^2$, where $r^2(u)=f^2(u)$, plays the role of the inverse metric. So the volume element, being the square root of the determinant of the metric, is $\gamma(u)=\mu/r$. According to Eq. (7.38), the kinetic energy is quantized by the rule

$$g(u)\,p^2\to-\hbar^2\,\frac1\gamma\,\partial_u\,g\gamma\,\partial_u=-\hbar^2\,\frac r\mu\,\partial_u\,\frac r\mu\,\partial_u=-\hbar^2\,\partial_r^2, \tag{7.104}$$
where we have used the relation $dr/du=\mu/r$. The scalar product measure is transformed accordingly,

$$\int_K du\,\gamma\,\phi=\int_K du\,\frac\mu r\,\phi=\int_0^\infty dr\,\phi. \tag{7.105}$$

The operator $\partial_r^2$ is not essentially self-adjoint on the half-axis. Its self-adjoint extensions form a one-parameter family characterized by a real number $c=(\partial_r\psi/\psi)_{r=0}$. Thus, the naive replacement of the kinetic energy by the Laplace-Beltrami operator does not lead, in general, to a self-adjoint Hamiltonian, and its self-adjoint extension may not be unique. The boundary $\partial K$ may have a complicated geometrical structure, which could make a self-adjoint extension of the kinetic energy a tricky problem in the case of many physical degrees of freedom, not to mention the field theory case. One of the reasons that the above method fails is inherent to any gauge theory with a non-Euclidean orbit space. The density $\mu(u)$ on the gauge orbit space does not coincide with the square root of the determinant of the induced physical metric $g^{ph}_{ij}$, as one might see from Eq. (7.94). One could therefore abandon the above quantization recipe and require that the volume element of the orbit space be calculated by the reduction of the volume element $d^Nx$ onto the gauge fixing surface. In this approach one could make the canonical momenta hermitian, $\hat p=-i\hbar\,\mu^{-1/2}\partial_u\,\mu^{1/2}$, with respect to the scalar product $\int_K du\,\mu\,\phi_1^*\phi_2=\langle 1|2\rangle$. Hence, hermiticity of the physical Hamiltonian can be achieved by an appropriate operator ordering, say, by a symmetrical one,

$$g(u)\,p^2\to\hat p\,g(u)\,\hat p+O(\hbar). \tag{7.106}$$
But now we face another problem. The $\hbar$-corrections to the quantum kinetic energy operator should be precisely of the form (7.101); otherwise the spectrum of the physical Hamiltonian would depend on the gauge chosen to parameterize the physical configuration space. In the Dirac approach the necessary operator ordering has been generated automatically, while in the reduced phase-space quantization approach we have to seek a resolution of this problem separately. Thus, the Dirac approach has advantages in this regard.

Remark. The operator ordering ambiguities in the reduced phase-space quantization might be resolved in the sense that the spectrum of the quantum Hamiltonian would not depend on the parameterization of the gauge orbit space. One can require that a physically acceptable operator ordering should provide an invariance of the physical Hamiltonian under general coordinate transformations $u\to u(\tilde u)$. This condition would lead to a physical Hamiltonian that coincides with that obtained in the Dirac approach modulo quantum corrections containing the Riemann curvature tensor of the gauge orbit space (any scalar potential that can be built out of the physical metric tensor). This type of correction is known in quantization on curved manifolds (without a gauge symmetry) [104-109]. In gauge theories such an addition would mean a modification of the canonical Hamiltonian by corrections of order $\hbar^2$. Note that the curvature of the gauge orbit space does not depend on the choice of local coordinates and, hence, is gauge invariant (cf. the example of the gauge matrix model in Section 4.8). An addition of a curvature term to a quantum Hamiltonian would be consistent with the gauge invariance. So far there seems to be no theoretical reason to forbid such terms, unless they affect the Yang-Mills perturbation theory, which seems unlikely because the perturbation theory deals with field fluctuations that are much smaller in
amplitude than the inverse curvature of the orbit space. Possible non-perturbative effects of such terms are unknown.
8. Path integrals and the physical phase space structure

In this section we develop the path integral formalism for quantum gauge systems. The goal is to take into account the geometrical structure of either the physical configuration space in the Lagrangian path integral or the physical phase space in the Hamiltonian path integral. A modification of the conventional path integral formalism stems from the very definition of the sum over paths. So we first give a derivation of the path integral in a Euclidean space and then look for what should be modified in it in order to reproduce the Dirac operator formalism for gauge theories.

8.1. Definition and basic properties of the path integral

Let us take a quantum system with one degree of freedom. Let $|q,t\rangle$ be an eigenstate of the Heisenberg position operator,
$$ \hat q(t)|q,t\rangle = q\,|q,t\rangle. \tag{8.1} $$
The operator $\hat q(t)$ depends on time and so do its eigenstates. Making use of relation (7.6) between the Heisenberg and Schrödinger pictures we find
$$ |q,t\rangle = e^{it\hat H/\hbar}|q\rangle. \tag{8.2} $$
The probability amplitude that a system which was in the eigenstate $|q'\rangle$ at the time $t=0$ will be found to have the value $q$ of the Heisenberg position operator $\hat q(t)$ at time $t>0$ is
$$ \langle q,t|q'\rangle = \langle q|e^{-it\hat H/\hbar}|q'\rangle = U_t(q,q'). \tag{8.3} $$
The amplitude (8.3) is called the evolution operator kernel, or the transition amplitude. It satisfies the Schrödinger equation
$$ i\hbar\,\partial_t U_t(q,q') = \hat H(q)\,U_t(q,q') \tag{8.4} $$
with the initial condition
$$ U_{t=0}(q,q') = \langle q|q'\rangle = \delta(q-q'). \tag{8.5} $$
Any state $|\Psi\rangle$ evolving according to the Schrödinger equation can be represented in the following form:
$$ \Psi_t(q) = \langle q,t|\Psi\rangle = \int dq'\,U_t(q,q')\,\Psi_0(q'), \tag{8.6} $$
where $\Psi_0(q') = \langle q'|\Psi\rangle$ is the initial wave function. The kernel $U_t(q,q')$ contains all information about the dynamics of the quantum system. There exists a representation of it as a Feynman sum over paths weighted by the exponential of the classical
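action [2]. We derive it following the method proposed by Nelson [110], which is based on the Kato–Trotter product formula. The derivation can easily be extended to gauge systems. For this reason we reproduce its details. For any two self-adjoint operators $\hat A$ and $\hat B$ in a separable Hilbert space such that the operator $\hat A+\hat B$ is self-adjoint on the intersection of the domains of the operators $\hat A$ and $\hat B$, the following relation holds [111–113]:
$$ e^{i(\hat A+\hat B)} = \lim_{N\to\infty}\left(e^{i\hat A/N}e^{i\hat B/N}\right)^N. \tag{8.7} $$
Assume the Hamiltonian $\hat H$ to be a sum of kinetic and potential energies, $\hat H = \hat H_0 + V(\hat q)$, $\hat H_0 = \hat p^2/2$, and set $\hat A = -t\hat H_0/\hbar$ and $\hat B = -t\hat V/\hbar$. Then
$$ \hat U_t = e^{-it\hat H/\hbar} = \lim_{N\to\infty}\left(\hat U^0_\epsilon\,e^{-i\epsilon\hat V/\hbar}\right)^N, \qquad \hat U^0_\epsilon = e^{-i\epsilon\hat H_0/\hbar}, \quad \epsilon = t/N. \tag{8.8} $$
The kernel of the free evolution operator is
$$ U^0_t(q,q') = \langle q|e^{-it\hat H_0/\hbar}|q'\rangle = (2\pi i\hbar t)^{-1/2}\exp\left\{\frac{i(q-q')^2}{2\hbar t}\right\}, \tag{8.9} $$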
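i.e., it is a solution to the Schrödinger equation (8.4) with $\hat H = \hat H_0 = -\hbar^2\partial_q^2/2$ and the initial condition (8.5). Consider the matrix element of the operator $\hat U_\epsilon = \hat U^0_\epsilon\exp(-i\epsilon\hat V/\hbar)$:
$$ U_\epsilon(q,q') = \langle q|\hat U_\epsilon|q'\rangle = (2\pi i\hbar\epsilon)^{-1/2}\exp\left\{\frac{i\epsilon}{\hbar}\left[\frac{(q-q')^2}{2\epsilon^2} - V(q')\right]\right\}. \tag{8.10} $$
Inserting the resolution of unity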
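$$ 1 = \int_{-\infty}^{\infty} dq\,|q\rangle\langle q| \tag{8.11} $$
between the operators $\hat U_\epsilon$ in the product $(\hat U_\epsilon)^N = \hat U_t$, we find
$$ U_t(q,q') = \lim_{N\to\infty}\left(\frac{2\pi i\hbar t}{N}\right)^{-N/2}\int dq_1\,dq_2\cdots dq_{N-1}\;e^{iS(q,\,q_{N-1},\ldots,\,q_1,\,q')/\hbar}, \tag{8.12} $$
where
$$ S(q,q_{N-1},\ldots,q_1,q') = \sum_{j=0}^{N-1}\frac{t}{N}\left[\frac{1}{2}(q_{j+1}-q_j)^2\left(\frac{t}{N}\right)^{-2} - V(q_j)\right] \tag{8.13} $$
with $q_0\equiv q'$ and $q_N\equiv q$. Let $q(\tau)$ be a polygonal path going through the points $q_j = q(j\epsilon)$, $\epsilon = t/N$, and connecting the points $q(\tau=0)=q'=q_0$ and $q(\tau=t)=q=q_N$, so that on each interval $\tau\in[j\epsilon,(j+1)\epsilon]$ it is a linear function of $\tau$:
$$ q(\tau) = (q_{j+1}-q_j)(\tau/\epsilon - j) + q_j. \tag{8.14} $$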
The classical action of this path is
$$ S[q] = \int_0^t d\tau\left[\frac{1}{2}\left(\frac{dq(\tau)}{d\tau}\right)^2 - V(q(\tau))\right] \approx S(q,q_{N-1},\ldots,q_1,q'). \tag{8.15} $$
Thus, for sufficiently large $N$, integrating with respect to $q_1,\ldots,q_{N-1}$ in (8.12) is like integrating over all polygonal paths having $N$ segments. In the limit $N\to\infty$, polygonal paths turn into continuous paths. The continuity of the paths contributing to the Feynman integral follows from the fact that the exponential of the kinetic part of (8.13) is a Gaussian distribution of $\Delta_j = q_{j+1}-q_j$, so that the expectation value of $\Delta_j^{2n}$ is proportional to $\epsilon^n$. Therefore the main contribution to the discretized integral (8.12) comes from $|\Delta_j|\sim\sqrt{\epsilon}\to0$ as $\epsilon$ approaches zero, i.e., the distance between neighboring points of the path vanishes as $\sqrt{\epsilon}$. It should be noted, however, that for a generic action the distance between neighboring points of paths in the Feynman sum may have a different dependence on the time slice $\epsilon$, and the paths will not necessarily be continuous. In the continuum limit, the integral (8.12) looks like a sum over all continuous paths connecting $q$ and $q'$ and weighted by the exponential of the classical action
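$$ \langle q,t|q'\rangle = U_t(q,q') = \sum_{\rm paths} e^{iS[q]/\hbar} = \int_{q(0)=q'}^{q(t)=q} Dq\;e^{iS[q]/\hbar}, \tag{8.16} $$
where
$$ Dq = \lim_{N\to\infty}\left(\frac{2\pi i\hbar t}{N}\right)^{-N/2} dq_1\,dq_2\cdots dq_{N-1} \equiv Z_0^{-1}\prod_{\tau=0}^{t} dq(\tau). \tag{8.17} $$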
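The Kato–Trotter formula (8.7), on which the folding (8.12) rests, can be illustrated numerically in finite dimensions. The following is only a sketch under stated assumptions: the operators are replaced by random 4×4 Hermitian matrices (the names `expm_hermitian` and `trotter_error` are illustrative, not from the text), so the domain questions relevant for unbounded operators do not arise.

```python
import numpy as np

def expm_hermitian(h, coeff):
    """Return exp(coeff * h) for a Hermitian matrix h via its eigen-decomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(coeff * w)) @ v.conj().T

def trotter_error(a, b, n):
    """Frobenius norm of (e^{iA/n} e^{iB/n})^n - e^{i(A+B)} for Hermitian a, b."""
    step = expm_hermitian(a, 1j / n) @ expm_hermitian(b, 1j / n)
    product = np.linalg.matrix_power(step, n)
    exact = expm_hermitian(a + b, 1j)
    return np.linalg.norm(product - exact)

def random_hermitian(rng, dim):
    """Random Hermitian matrix; a stand-in for the operators A and B (assumption)."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (m + m.conj().T) / 2

rng = np.random.default_rng(0)
a = random_hermitian(rng, 4)
b = random_hermitian(rng, 4)

# The error of the N-fold product should shrink as N grows.
errors = [trotter_error(a, b, n) for n in (10, 100, 1000)]
print(errors)
```

For non-commuting matrices the error is expected to decay roughly like $1/N$, which is why a large number of time slices is needed before the product resembles the exact evolution operator.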
The integral (8.16) is called the Lagrangian path integral. The transition amplitude of a free particle (8.9) can be written as the Gaussian integral
$$ U^0_t(q,q') = (2\pi\hbar)^{-1}\int_{-\infty}^{\infty} dp\,\exp\left\{\frac{i}{\hbar}\left[p(q-q') - \frac{p^2t}{2}\right]\right\}. \tag{8.18} $$
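That the Gaussian integral over $p$ reproduces the kernel (8.9) can be seen by completing the square. The steps below are a sketch, assuming the oscillatory integral is defined by the usual regularization $t\to t-i0$ (an assumption not spelled out in the text):
$$
\begin{aligned}
p(q-q') - \frac{p^2t}{2} &= -\frac{t}{2}\left(p-\frac{q-q'}{t}\right)^2 + \frac{(q-q')^2}{2t},\\
\int_{-\infty}^{\infty} dp\,\exp\left\{-\frac{it}{2\hbar}\left(p-\frac{q-q'}{t}\right)^2\right\} &= \left(\frac{2\pi\hbar}{it}\right)^{1/2},\\
(2\pi\hbar)^{-1}\left(\frac{2\pi\hbar}{it}\right)^{1/2}\exp\left\{\frac{i(q-q')^2}{2\hbar t}\right\} &= (2\pi i\hbar t)^{-1/2}\exp\left\{\frac{i(q-q')^2}{2\hbar t}\right\}.
\end{aligned}
$$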
The expression in the exponential is the action of a free particle moving with the momentum p. Making use of this representation in each stage of the Lagrangian path integral derivation we obtain the Hamiltonian path integral representation of the transition amplitude
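$$ U_t(q,q') = \int Dp\,Dq\;\exp\left\{\frac{i}{\hbar}\int_0^t d\tau\,[p\dot q - H(p,q)]\right\}, \tag{8.19} $$
where $H(p,q)$ is the classical Hamiltonian of the system, and the measure is defined as the formal time product of the Liouville measures on the phase space
$$ Dp\,Dq = \prod_{\tau=0}^{t}\frac{dp(\tau)\,dq(\tau)}{2\pi\hbar} \equiv \lim_{N\to\infty}\frac{dp_N}{2\pi\hbar}\prod_{j=1}^{N-1}\frac{dp_j\,dq_j}{2\pi\hbar}. \tag{8.20} $$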
Observe one extra integration over the momentum and the normalization of the phase space measure by $2\pi\hbar$. Let $\Psi_E(q)$ be normalized eigenfunctions of the Hamiltonian $\hat H$ with the eigenvalues $E$. Then we can derive the spectral decomposition of the transition amplitude
$$ U_t(q,q') = \sum_{E,E'}\langle q|E\rangle\langle E|\hat U_t|E'\rangle\langle E'|q'\rangle = \sum_E e^{-itE/\hbar}\,\Psi_E(q)\Psi_E^*(q'). \tag{8.21} $$
This decomposition will be useful to establish the correspondence between the Dirac operator formalism and the path integral formalism for gauge theories.

A generalization of the path integral formalism to systems with many degrees of freedom is straightforward. The kernel of $\hat U^0_t$ is a product of the kernels (8.9) for each Cartesian degree of freedom. The rest of the derivation remains the same. In field theory, a lattice regularization of the functional integral is usually assumed. The analysis of the continuum limit leads to the conclusion that the support of the functional integral measure is in the space of distributions rather than continuous field configurations (see, e.g., [114,186]). Yet, the removal of the lattice regularization in a strongly interacting field theory is not simple and, in general, may pose a problem [114].

8.2. Topology and boundaries of the configuration space in the path integral formalism

The configuration space of a system may have a non-trivial topology. It can be, for instance, due to constraints. Consider a planar motion constrained to a circle. The system is known as a rigid rotator. Its quantum mechanics is described by the Hamiltonian
$$ \hat H_0 = -\frac{\hbar^2}{2}\frac{\partial^2}{\partial u^2}, \tag{8.22} $$
where the angular variable $u$ spans the configuration space, a circle of unit radius, $u\in[0,2\pi)$. The entire difference between the quantum motion on the line and on the circle lies in the topologies of these spaces. The topology of the rotator configuration space (the fact that it is a circle rather than a line) is accounted for by the periodicity condition imposed on state vectors,
$$ \langle u+2\pi|\Psi\rangle = \langle u|\Psi\rangle \tag{8.23} $$
for any $|\Psi\rangle$. Accordingly, the resolution of unity for the rotator differs from that for the free particle (8.11):
$$ 1 = \int_0^{2\pi} du\,|u\rangle\langle u|. \tag{8.24} $$
Observe that the integral is taken over a finite interval. The transition amplitude $\langle u,t|u'\rangle$ must satisfy the Schrödinger equation and the periodicity condition (8.23) for both arguments $u$ and $u'$. Since the Hamiltonians for the free particle and the rotator have the same form, the solution to the Schrödinger equation for the free motion is given
by (8.9), where the variable $q$ is replaced by $u$, and can also be written as the path integral
$$ \langle u,t|u'\rangle = \int_{u(0)=u'}^{u(t)=u} Du\,\exp\left[\frac{i}{2\hbar}\int_0^t d\tau\,\dot u^2\right]. \tag{8.25} $$
Clearly, the transition amplitude (8.9) does not satisfy the periodicity condition (8.23) and neither does the path integral (8.25). The measure of the path integral in (8.25) is the standard one, that is, at every intermediate moment of time it is integrated over the entire real line, $u(\tau)\in(-\infty,\infty)$, $0<\tau<t$. Looking at the resolution of unity (8.24) one could argue that the integration in infinite limits is the source of the trouble because it seems to be in conflict with the path integral definition (8.12), where the resolution of unity has been used, and that the replacement of $\int_{-\infty}^{\infty}du(\tau)$ by $\int_0^{2\pi}du(\tau)$ in the path integral measure (8.25) (in accordance with the folding (8.12)) would improve the situation. However, making the time-slicing regularization of the path integral measure (8.25) and restricting the integration to the interval $[0,2\pi)$, we immediately see that we are unable to calculate the Gaussian integrals in the folding (8.12) due to the finiteness of the integration limits. Thus, such a modification of the folding (8.12) would fail to reproduce the solution (8.9) of the Schrödinger equation. This leads to the conclusion that the formal restriction of the integration domain in the path integral contradicts the operator formalism. An important point is that even for an infinitesimal interval of time, $t\to0$, the amplitude (8.9) does not satisfy the periodicity condition and, therefore, cannot be used to construct the path integral as the limit of the folding (8.12) that stems from the Kato–Trotter formula.

To find the right relation between the transition amplitudes on a line and on a circle, we invoke the superposition principle in quantum mechanics. Let $u\in[0,2\pi)$, and let the initial point $u'$ take its values on the whole real line, which is the covering space of the circle. Note that the circle can be regarded as the quotient space $\mathbb{R}/T_e$, where $T_e$ is the group of translations $u\to u+2\pi n$. If $u'$ describes a state of the rotator, then the points $u'+2\pi n$, where $n$ runs over the integers, correspond to the same physical state. So the Feynman sum over paths should include paths outgoing from $u'+2\pi n$ and ending at $u$, in accordance with the superposition principle. Thus, the transition amplitude for the rotator has the form [113]
$$ \langle u,t|u'\rangle_c = \sum_{n=-\infty}^{\infty}(2\pi i\hbar t)^{-1/2}\exp\left\{\frac{i(u-u'+2\pi n)^2}{2\hbar t}\right\}. \tag{8.26} $$
Here the suffix $c$ stands for "circle". The sum over $n$ can be interpreted as a sum over winding numbers of a classical trajectory around the circle. The function (8.26) satisfies the Schrödinger equation and the periodicity condition. Let us take the limit of zero time:
$$ \lim_{t\to0}\langle u,t|u'\rangle_c = \sum_{n=-\infty}^{\infty}\delta(u-u'+2\pi n), \tag{8.27} $$
which coincides with $\delta(u-u')$ for physical values $u,u'\in[0,2\pi)$ and defines a continuation of the unit operator kernel into the covering space of the physical configuration space. The notion of the covering space, as well as the continuation of the unit operator kernel to the covering space, will be useful in the path integral formalism for gauge theories. The concept of the covering space has been used to construct the path integral over non-Euclidean phase spaces, e.g., the sphere [115].
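The statements about the winding-number sum (8.26) can be checked numerically. The sketch below makes two assumptions that are not in the text: it works in Euclidean time ($t\to-i\tau$, $\hbar=1$) so that all sums and integrals converge absolutely, and it truncates the sums at a finite number of terms. It compares the sum over images with the spectral sum (8.21) for the rotator (eigenfunctions $e^{imu}/\sqrt{2\pi}$, $E_m=m^2/2$), and then folds two short-time kernels over one period $[0,2\pi)$, once with the periodic kernel and once with the line kernel.

```python
import numpy as np

def line_kernel(x, tau):
    """Euclidean free kernel on the line (Wick rotation of (8.9), hbar = 1)."""
    return np.exp(-x**2 / (2.0 * tau)) / np.sqrt(2.0 * np.pi * tau)

def circle_kernel(x, tau, n_max=50):
    """Truncated sum over winding numbers, the Euclidean counterpart of (8.26)."""
    n = np.arange(-n_max, n_max + 1)
    shifted = np.asarray(x, dtype=float)[..., None] + 2.0 * np.pi * n
    return line_kernel(shifted, tau).sum(axis=-1)

def circle_kernel_spectral(x, tau, m_max=50):
    """Truncated spectral sum over the rotator eigenstates with E_m = m^2 / 2."""
    m = np.arange(-m_max, m_max + 1)
    phase = np.cos(m * np.asarray(x, dtype=float)[..., None])
    return (np.exp(-tau * m**2 / 2.0) * phase).sum(axis=-1) / (2.0 * np.pi)

tau = 0.3
x = np.linspace(-3.0, 3.0, 13)
mismatch = np.max(np.abs(circle_kernel(x, tau) - circle_kernel_spectral(x, tau)))
period_error = np.max(np.abs(circle_kernel(x + 2.0 * np.pi, tau) - circle_kernel(x, tau)))

# Fold two kernels of duration tau over one period with a midpoint rule.
u, u_prime, m_pts = 0.3, 0.2, 2000
du = 2.0 * np.pi / m_pts
u1 = (np.arange(m_pts) + 0.5) * du
fold_circle = np.sum(circle_kernel(u - u1, tau) * circle_kernel(u1 - u_prime, tau)) * du
fold_naive = np.sum(line_kernel(u - u1, tau) * line_kernel(u1 - u_prime, tau)) * du

# Defects with respect to the corresponding kernels of duration 2 * tau.
circle_defect = abs(fold_circle - circle_kernel(u - u_prime, 2.0 * tau))
naive_defect = abs(fold_naive - line_kernel(u - u_prime, 2.0 * tau))
print(mismatch, period_error, circle_defect, naive_defect)
```

In this setting the periodic kernel composes consistently over the finite interval, while the line kernel restricted to $[0,2\pi)$ does not reproduce the line kernel of doubled duration, in line with the argument above that a formal restriction of the integration domain is inconsistent.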
A similar structure of the path integral occurs when passing to curvilinear coordinates in the measure [118,119] and in quantum dynamics on compact group manifolds [120]. The kernel (8.26) can be used in the folding (8.12) with the resolution of unity (8.24) without any contradiction. Indeed, we have
$$
\begin{aligned}
\langle u,t|u'\rangle_c &= \prod_{j=1}^{N-1}\left(\int_0^{2\pi}du_j\right)\langle u,\epsilon|u_{N-1}\rangle_c\cdots\langle u_1,\epsilon|u'\rangle_c\\
&= \prod_{j=1}^{N-1}\left(\int_{-\infty}^{\infty}du_j\right)\langle u,\epsilon|u_{N-1}\rangle\cdots\langle u_1,\epsilon|u'\rangle_c\\
&= \sum_{n=-\infty}^{\infty}\prod_{j=1}^{N-1}\left(\int_{-\infty}^{\infty}du_j\right)\langle u,\epsilon|u_{N-1}\rangle\cdots\langle u_1,\epsilon|u'+2\pi n\rangle.
\end{aligned}
\tag{8.28}
$$
Here in the first equality we used the sum over the winding number to extend the integration to the whole real line and to replace the infinitesimal transition amplitude on the circle by that on the line. In the limit $N\to\infty$, the expression (8.28) yields the path integral
$$ \langle u,t|u'\rangle_c = \sum_{n=-\infty}^{\infty}\int_{u(0)=u'+2\pi n}^{u(t)=u} Du\;\exp\left\{\frac{i}{\hbar}\int_0^t d\tau\,\frac{\dot u^2}{2}\right\} \tag{8.29} $$
$$ = \int_{-\infty}^{\infty}du''\,\langle u,t|u''\rangle\,Q(u'',u'), \tag{8.30} $$
where the kernel $Q$ is given by (8.27). It defines a periodic continuation of any function on the interval $[0,2\pi)$ to the covering space:
$$ \Psi^Q(u+2\pi) = \Psi^Q(u) = \int_0^{2\pi}du'\,Q(u,u')\,\Psi(u'), \tag{8.31} $$
and, thereby, ensures that the action of the evolution operator constructed by the sum over paths in the covering space preserves the periodicity of the physical states, $\Psi_t(u+2\pi)=\Psi_t(u)$, where
$$ \Psi_t(u) = \int_0^{2\pi}du'\,\langle u,t|u'\rangle_c\,\Psi_0(u') = \int_{-\infty}^{\infty}du''\,\langle u,t|u''\rangle\,\Psi_0^Q(u''). \tag{8.32} $$
A similar representation can also be established for the path integral of a free particle in the infinite well. In this case the transition amplitude should satisfy the zero boundary conditions
$$ \langle q=0,t|q'\rangle = \langle q=L,t|q'\rangle = \langle q,t|q'=0\rangle = \langle q,t|q'=L\rangle = 0, \tag{8.33} $$
where $L$ is the size of the well. The resolution of unity reads
$$ \int_0^L dq\,|q\rangle\langle q| = 1. \tag{8.34} $$
The formal restriction of the integration domain in the path integral would yield an incorrect answer because the kernel of $\hat U^0_\epsilon$ in the Kato–Trotter product formula does not have the standard
form of (8.9). The right transition amplitude compatible with the Kato–Trotter operator representation of the evolution operator is obtained by the superposition principle [116,117]. It can be written in the form [218]
$$ \langle q,t|q'\rangle_{\rm box} = \int_{-\infty}^{\infty}dq''\int_{q(0)=q''}^{q(t)=q} Dq\,\exp\left\{\frac{i}{2\hbar}\int_0^t d\tau\,\dot q^2\right\}Q(q'',q'); \tag{8.35} $$
$$ Q(q,q') = \sum_{n=-\infty}^{\infty}\left[\delta(q-q'+2Ln) - \delta(q+q'+2Ln)\right]. \tag{8.36} $$
The contributions of trajectories going from $q'+2Ln$ to $q$ and of those going from $-q'+2Ln$ to $q$ have opposite signs, which is necessary to provide the zero boundary conditions (8.33). The trajectories $q'+2Ln\to q$ can be interpreted as continuous trajectories inside the well which connect $q',q\in(0,L)$ and have $2n$ reflections from the well walls, because they have the same action. Contributions of the trajectories $-q'+2Ln\to q$ are equivalent to contributions of trajectories inside the well with an odd number of reflections, $2n+1$. In general, such an interaction with boundaries in the configuration space has been studied in [99].

The lesson following from our analysis is that the restriction of the integration domain in the path integral, which might seem to be motivated by the prelimit expression (8.12), is ruled out because the infinitesimal transition amplitude that is used in the Kato–Trotter product formula for the path integral does not have the "standard" form (8.9) if the system configuration space has either a non-trivial topology, or boundaries, or both. This is the key observation for constructing the path integral formalism equivalent to the Dirac operator quantization of gauge systems.

8.3. Gribov obstruction to the path integral quantization of gauge systems

The Feynman representation of quantum mechanics has led to a new quantization postulate which is known as the path integral quantization [2]. Given a classical Hamiltonian $H=H(p,q)$ and the canonical symplectic structure on the phase space, the transition amplitude of the corresponding quantum system is given by the Hamiltonian path integral (8.19). The correspondence principle is guaranteed by the stationary phase approximation of the path integral (8.19) in the formal limit $\hbar\to0$, i.e., in the dynamical regime when the classical action is much greater than the Planck constant. For many physically interesting models this postulate is valid.
It is natural to extend it as a quantization postulate to general Hamiltonian systems and, thereby, to avoid the use of non-commutative variables (operators) to describe quantum systems. This attractive idea has, unfortunately, some shortcomings which, as we will see, appear to be relevant for the path integral formalism in gauge theories. The action functional of systems with gauge symmetry is constant along the directions traversed by gauge transformations in the path space. Therefore, the Feynman sum (8.16) would diverge. In the Hamiltonian formalism the gauge symmetry leads to constraints and the appearance of non-physical variables. The physical motion occurs in the physical phase space, the quotient of the constraint surface by the gauge group. In his pioneering work [68], Faddeev proposed the following modification of the path integral measure for systems with first-class constraints:
$$ Dp\,Dq \to Dp\,Dq\;\delta(\sigma_a)\,\delta(\chi_a)\,\Delta_{FP} = Dp^*\,Dq^*\,D\tilde p\,D\tilde q\;\delta(\tilde p_a)\,\delta(\tilde q_a). \tag{8.37} $$
Here $\delta(\sigma_a)$ reduces the Liouville measure onto the constraint surface at every moment of time, while the supplementary (or gauge) conditions $\chi_a=0$ are to select a representative from the gauge orbit through a point $q$. The function
$$ \Delta_{FP} = \det\{\chi_a,\sigma_b\}, \tag{8.38} $$
known as the Faddeev–Popov determinant [67,68], effects a local reestablishment of the Liouville measure on $\mathrm{PS}_{ph}$; it is assumed that $\{\chi_a,\chi_b\}=0$. If $\Delta_{FP}\neq0$, then one can show [68] that there exists a canonical transformation $p,q\to p^*,q^*;\tilde p,\tilde q$ such that the variables $p^*,q^*$ form a set of local canonical coordinates on $\mathrm{PS}_{ph}$, and $\tilde p,\tilde q$ are non-physical phase-space variables (see also Section 6.1). Assuming that the Liouville path integral measure remains invariant under canonical transformations, Eq. (8.37) is readily established (after solving the constraints $\sigma_a=0$ for the non-physical momenta, $\tilde p_a=\tilde p_a(p^*,q^*)$, the shift of the integration variable $\tilde p_a\to\tilde p_a-\tilde p_a(p^*,q^*)$ has to be done). The method has successfully been applied to the perturbation theory for quantum gauge fields [67] and provided a solution to two significant problems: the unitarity problem in perturbative path integral quantization of Yang–Mills fields [126] and the problem of constructing a local gauge fixed effective action [127].

If the physical phase space is non-Euclidean, transformation (8.37) is no longer true. The evidence for this obstruction is the impossibility of introducing a set of supplementary conditions that provide a global parameterization of the physical phase space by the canonical coordinates $p^*,q^*$ without singularities, a situation which is often rendered concrete in the vanishing or even sign changing of the Faddeev–Popov determinant [11]. In Section 6.1 we have shown that the condition $\Delta_{FP}\neq0$ cannot be met everywhere for any single-valued functions $\chi_a$ if gauge orbits have a non-trivial topology.
The surface $\sigma_a=\chi_a=0$ may not be isomorphic to $\mathrm{PS}_{ph}$, i.e., it still has gauge-equivalent configurations (Gribov copies). Assuming that the local coordinates $p^*,q^*$ span the surface $\sigma_a=\chi_a=0$, the physical phase space will be isomorphic to a certain (gauge-dependent) domain within it (modulo boundary identifications), called a modular domain. We have seen that the formal restriction of the integration domain, say, to the modular domain, to remove the contribution of physically equivalent configurations is not consistent and contradicts the operator formalism.

Another remark is that the parameterization of the physical phase space is defined modulo general canonical transformations. Different choices of the supplementary condition $\chi$ would lead to different parameterizations of the physical phase space which are related by canonical transformations. Physical amplitudes cannot depend on any particular parameterization, i.e., they have to be independent of the choice of the supplementary condition. However, the formal Liouville measure $Dp^*\,Dq^*$ does not provide any genuine covariance of the path integral under general canonical transformations, as has been argued in the Introduction. Thus, the measure (8.37) should be modified to take into account the non-Euclidean geometry of the physical phase space, which is natural given the fact that path integral quantization of phase space geometries different from the Euclidean one leads to quantizations different from the canonical one based on the canonical Heisenberg algebra [3–5,121]. In the Yang–Mills theory, the Coulomb gauge turns out to be successful for a consistent path integral quantization in the high energy limit where the coupling constant is small, and the geometry of the physical phase space does not affect the perturbation theory.
In the infrared limit, where the coupling constant becomes large, the coordinate singularities associated with the Coulomb gauge invalidate the path integral quantization based on the recipe (8.37), as was first
observed by Gribov [11]. Therefore a successful non-perturbative formulation of the path integral in Yang–Mills theory is impossible without taking into account the (non-Euclidean) geometry of the physical phase space.

8.4. The path integral on the conic phase space

To get an idea of how the Faddeev–Popov recipe should be modified if the physical phase space is not Euclidean, we take the isotropic oscillator in three-dimensional space with the gauge group SO(3) [128,18]. The reason for taking the group SO(3) is that the quantum Hamiltonian (7.30) constructed in the Dirac formalism can be related to the corresponding one-dimensional problem by rescaling the wave functions, $\Phi=\phi/r$. So one can compare the path integrals for oscillators with flat and conic phase spaces. For an arbitrary orthogonal group, the Dirac quantum Hamiltonian contains the quantum potential $V_q=\hbar^2(N-3)(N-1)/(8r^2)$ as compared with the Hamiltonian of the corresponding one-dimensional system (see (7.101)). A general technique to construct the path integral over a non-Euclidean phase space, which takes into account the operator ordering problem, is given in Section 8.7.

Making the substitution $\Phi_n=\phi_n/r$ in (7.30) for $N=3$ and the oscillator potential, we find that the functions $\phi_n$ are eigenfunctions of the one-dimensional harmonic oscillator ($\hbar=1$):
$$ \phi_n(r) = c_n H_n(r)\,e^{-r^2/2}. \tag{8.39} $$
The physical eigenstates are given by the regular functions
$$ \Phi_{2k+1}(r) = \tilde c_{2k+1}\,\frac{H_{2k+1}(r)}{r}\,e^{-r^2/2}. \tag{8.40} $$
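The role of the regular states (8.40) in the transition amplitude can be illustrated numerically. The sketch below rests on assumptions that are not in the text: it uses Euclidean time ($t\to-i\tau$, $\hbar=1$) and the closed-form Euclidean oscillator (Mehler) kernel on the line in place of (8.42), and it truncates the spectral sum. It checks that the states (8.40) are normalized with the measure $r^2dr$ on the half-axis when $\tilde c_{2k+1}^2=2c_{2k+1}^2$ (relation (8.41)), and that their spectral sum equals the antisymmetrized line kernel divided by $rr'$, which is the structure of (8.43).

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial.hermite import hermval

def psi(n, r):
    """Normalized oscillator eigenfunction c_n H_n(r) exp(-r^2/2) on the whole line."""
    c_n = 1.0 / sqrt(2.0**n * factorial(n) * sqrt(pi))
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0  # select the physicists' Hermite polynomial H_n
    return c_n * hermval(r, coeffs) * np.exp(-r**2 / 2.0)

def mehler(r, rp, tau):
    """Euclidean oscillator kernel on the line (hbar = m = omega = 1)."""
    s, c = np.sinh(tau), np.cosh(tau)
    return np.exp(-((r**2 + rp**2) * c - 2.0 * r * rp) / (2.0 * s)) / np.sqrt(2.0 * pi * s)

def conic_spectral(r, rp, tau, k_max=30):
    """Truncated spectral sum over the states (8.40) with c~^2 = 2 c^2."""
    total = 0.0
    for k in range(k_max):
        n = 2 * k + 1
        total += 2.0 * psi(n, r) * psi(n, rp) * np.exp(-tau * (n + 0.5))
    return total / (r * rp)

def conic_images(r, rp, tau):
    """Line kernel antisymmetrized under r' -> -r' and divided by r r'."""
    return (mehler(r, rp, tau) - mehler(r, -rp, tau)) / (r * rp)

# Normalization on the half-axis with the measure r^2 dr (midpoint rule).
step = 12.0 / 6000
r_grid = (np.arange(6000) + 0.5) * step
norms = [np.sum(2.0 * psi(n, r_grid)**2) * step for n in (1, 3, 5)]

difference = abs(conic_spectral(0.7, 1.3, 0.5) - conic_images(0.7, 1.3, 0.5))
near_origin = conic_images(1e-4, 1.0, 0.5)
print(norms, difference, near_origin)
```

The value `near_origin` stays finite and positive as $r\to0$, which illustrates that the amplitude on the cone does not vanish at the origin, in contrast to a particle in a well.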
The singular functions for $n=2k$ do not satisfy the Schrödinger equation in the vicinity of the origin $r=0$ (cf. the discussion in Section 7.2). To compare the transition amplitudes of the oscillator with flat and conic phase spaces, we assume that the normalization constants $c_n$ of the wave functions of the ordinary oscillator are calculated with respect to the standard measure, $\int_{-\infty}^{\infty}dr\,|\phi_n|^2=1$. The normalization constants $\tilde c_n$ of the Dirac states (8.40) are evaluated with respect to the measure (7.34), $\int_0^{\infty}dr\,r^2|\Phi_n|^2=1$. This leads to the relation between the normalization constants
$$ \tilde c_{2k+1} = \sqrt{2}\,c_{2k+1}. \tag{8.41} $$
The transition amplitude for the harmonic oscillator is given by the spectral decomposition (8.21):
$$ U_t(r,r') = \sum_{n=0}^{\infty}c_n^2\,H_n(r)H_n(r')\,e^{-(r^2+r'^2)/2}\,e^{-itE_n}, \tag{8.42} $$
where $E_n=n+1/2$ is the energy spectrum. Let us apply the spectral decomposition (8.21) to the system with eigenfunctions (8.40). The result is [128]
$$ U^c_t(r,r') = \frac{1}{rr'}\left[U_t(r,r') - U_t(r,-r')\right] \tag{8.43} $$
$$ = \int_{-\infty}^{\infty}\frac{dr''}{rr''}\,U_t(r,r'')\,Q(r'',r'), \tag{8.44} $$
where the kernel $Q$ is given by
$$ Q(r'',r') = \delta(r''-r') + \delta(r''+r'), \tag{8.45} $$
for $r''\in\mathbb{R}$ and $r'>0$. Eq. (8.43) follows from both the symmetry property of the Hermite polynomials under the parity transformation, $H_n(-r)=(-1)^nH_n(r)$ (so that even $n$ do not contribute to the right-hand side of (8.43)), and relation (8.41) between the normalization constants. Note that there is no extra factor 1/2 in the right-hand side of (8.43). The kernel (8.42) has the standard path integral representation
$$ U_t(r,r') = \int_{r(0)=r'}^{r(t)=r} Dr\,\exp\left\{\frac{i}{2}\int_0^t d\tau\,(\dot r^2 - r^2)\right\}. \tag{8.46} $$
The measure involves integrations in infinite limits over $r$. Introducing the integration over a momentum variable in (8.46), we can see that it coincides with the Faddeev–Popov integral in the unitary gauge $x_2=x_3=0$. Taking, for example, $p_1x_2-x_1p_2=0$ and $p_1x_3-x_1p_3=0$ as independent constraints, we find the Faddeev–Popov determinant $\Delta_{FP}=x_1^2$, which vanishes at the origin $x_1=0$, indicating the singularity of the unitary gauge. Integrating then over the non-physical variables, the Faddeev–Popov measure turns into the Liouville measure for the variables $x_1$ and $p_1$, thus leading to the path integral (8.46) (with $x_1=r$) after integrating out the momentum variable $p_1$. The Faddeev–Popov determinant is canceled against the corresponding factor resulting from the delta functions of the independent constraints. The unitary gauge is not complete. We have the Gribov transformations $x_1\to-x_1$. So the modular domain is the positive half-axis. In Section 6.1 it has been shown that, due to a non-trivial topology of the gauge orbits, there is no smooth single-valued supplementary condition in the model which would provide a parameterization of the physical phase space (a cone) by canonical coordinates without singularities. This is the Gribov obstruction to the Hamiltonian path integral quantization. The physical reason behind it is the non-Euclidean structure of the physical phase space.

The solution to the Gribov obstruction given by the formula (8.44) is a simple procedure. First construct the path integral in the covering space, i.e., on the whole line; then symmetrize the result with respect to the residual gauge transformations. The operator $\hat Q$ does this job. The transition amplitude on the covering space does not, in general, coincide with the Faddeev–Popov phase-space path integral. Observe the factor $(rr')^{-1}$ in (8.44). The deviation stems from the fact that the insertion of the delta functions of constraints and supplementary conditions into the path integral measure means the elimination of non-physical degrees of freedom before canonical quantization (canonical quantization of $p^*$ and $q^*$), while in the Dirac approach the non-physical degrees of freedom are excluded after quantization, which is not generally the same. The physical degrees of freedom are frequently described by curvilinear coordinates. That is why we get the factor $(rr')^{-1}$, which is related to the density $r^2$ in the scalar product. In general, there could also be an operator ordering correction to the Faddeev–Popov effective action (cf. the discussion at the end of Section 7.7). The non-physical variables do not disappear without a trace, as a consequence of the fact that they are associated with curvilinear coordinates.

The sum (8.43) can be interpreted as the sum over trajectories inside the modular domain $r>0$. Due to the gauge invariance, the action of the trajectories outgoing from $-r'$ to the origin is the same as the action of the reflected trajectory from $r'$ to the origin. This may be interpreted as
contributions to the Feynman integral of trajectories $r(\tau)\ge0$ outgoing from $r'>0$ and reflecting from the origin before coming to the final point $r>0$. The amplitude (8.43) does not vanish at $r=0$ or $r'=0$, i.e., the system has a non-zero probability amplitude to reach the horizon. The situation is similar to the path integrals discussed in Section 8.2.

8.5. The path integral in the Weyl chamber

Let us illustrate the Kato–Trotter product formula (8.8) by constructing the path integral for the model in the adjoint representation discussed in Section 3. Although a direct analysis of the evolution (Schrödinger) equation for a generic potential would lead to the answer faster [18,19], it is instructive to apply the Kato–Trotter formula. The aim is to show how the integration over the modular domain, being the Weyl chamber, in the scalar product turns into the integration over the entire covering space (a gauge fixing surface) in the path integral. This has already been demonstrated when deriving the path integral on the circle (see (8.28)–(8.30)) and, as we will show, holds for gauge theories as well.

The key observation we made at the very end of Section 8.2 is that the kernel (8.9) of the evolution operator for a free motion should be modified in accordance with the true geometry of the physical configuration space. To find the right evolution kernel for the free motion, we have to solve the Schrödinger equation (see (7.42)) in the Dirac operator formalism,
$$ -\frac{1}{2\kappa}\,\Delta_{(r)}\left(\kappa\,U^{D0}_t(h,h')\right) = i\partial_t U^{D0}_t(h,h'), \tag{8.47} $$
where the superscript $D$ emphasizes that the amplitude is obtained via the Dirac operator formalism, and the superscript 0 means the free motion as before. The solution must be regular for all $t>0$ and turn into the unit operator kernel with respect to the scalar product (7.43) at $t=0$. According to the analysis of Section 7.3, we make the substitution $U^{D0}_t(h,h')=[\kappa(h)\kappa(h')]^{-1}U^0_t(h,h')$, solve the equation and then symmetrize the result with respect to the Weyl group. The kernel $U^0_t(h,h')$ satisfies the free Schrödinger equation in $H\cong\mathbb{R}^r$. So it is a product of kernels (8.9) for each degree of freedom. Thus, we get
$$ U^{D0}_t(h,h') = (2\pi it)^{-r/2}\sum_W\left[\kappa(h)\kappa(\hat Rh')\right]^{-1}\exp\left\{\frac{i(h-\hat Rh')^2}{2t}\right\} \tag{8.48} $$
$$ = \int_H\frac{dh''}{\kappa(h)\kappa(h'')}\,U^0_t(h,h'')\,Q(h'',h'); \tag{8.49} $$
$$ Q(h'',h') = \sum_W\delta(h''-\hat Rh'), \qquad h''\in H,\ h'\in K^+. \tag{8.50} $$
As $t$ approaches zero, kernel (8.48) turns into the unit operator kernel
$$ \langle h|h'\rangle = \sum_W\left[\kappa(h)\kappa(\hat Rh')\right]^{-1}\delta(h-\hat Rh'), \tag{8.51} $$
which equals the unit operator kernel $[\kappa(h)]^{-2}\delta(h-h')$ for $h,h'$ from the Weyl chamber $K^+$. It is noteworthy that by taking the limit $t\to0$ in the regular solution to the evolution equation
we have obtained a $W$-invariant continuation of the unit operator kernel to the covering space (the Cartan subalgebra) of the fundamental modular domain (the Weyl chamber). Due to the $W$-invariance of the potential, $V(\hat Rh)=V(h)$ (the consequence of the gauge invariance), we also find
$$ \langle h|e^{-it\hat V}|h'\rangle = e^{-itV(h)}\langle h|h'\rangle. \tag{8.52} $$
Thus, the infinitesimal evolution operator kernel reads
$$ U^D_\epsilon(h,h') = \int_H\frac{dh''}{\kappa(h)\kappa(h'')}\,U_\epsilon(h,h'')\,Q(h'',h'), \tag{8.53} $$
where $U_\epsilon(h,h')$ is the $r$-dimensional version of the kernel (8.10). We will also write the integral relation (8.53) in the operator form $\hat U^D_\epsilon=\hat U_\epsilon\hat Q$. To obtain the folding (8.12) that converges to the path integral, one has to calculate the folding of the kernels (8.53). The difference from the standard path integral derivation of Section 8.1 is that the integration domain is restricted to the Weyl chamber and that $\hat U^D_\epsilon$ does not have the standard form (8.10). Next we prove that
$$ \hat U^D_t = (\hat U^D_\epsilon)^N = (\hat U_\epsilon\hat Q)^N = (\hat U_\epsilon)^N\hat Q = \hat U_t\hat Q, \tag{8.54} $$
where the folding $(\hat U_\epsilon)^N$ is given by the standard expression (4.34), i.e., without the restriction of the integration domain. To this end we calculate the action of the kernel (8.53) on any function $\phi(h)$. We get
$$ \hat U^D_\epsilon\phi(h) = \int_H dh''\int_{K^+}dh'\,\kappa^2(h')\,\frac{U_\epsilon(h,h'')\,Q(h'',h')}{\kappa(h)\kappa(h'')}\,\phi(h') \tag{8.55} $$
$$ = \int_H dh''\,\frac{\kappa(h'')}{\kappa(h)}\,U_\epsilon(h,h'')\sum_W\Theta_{K^+}(\hat Rh'')\,\phi(\hat Rh''), \tag{8.56} $$
where $\Theta_{K^+}(h)$ is the characteristic function of the Weyl chamber, i.e., it equals one for $h\in K^+$ and vanishes otherwise. To do the integral over $h'$, we use the invariance of $\kappa^2$ relative to the Weyl group and $\delta(h-\hat Rh')=\delta(\hat R^{-1}h-h')$ (recall $\det\hat R=\pm1$). If the function $\phi$ is invariant under the Weyl group, then
$$ \hat U^D_\epsilon\phi(h) = \int_H dh''\,\frac{\kappa(h'')}{\kappa(h)}\,U_\epsilon(h,h'')\,\phi(h''), \tag{8.57} $$
because $\sum_W\Theta_{K^+}(\hat Rh)=1$ except for a set of zero measure formed by the hyperplanes orthogonal to positive roots (where the Faddeev–Popov determinant $\kappa^2(h)$ in the gauge $x=h$ vanishes). Taking the $W$-invariant kernel (8.53) as $\phi$, we find that the kernel $U^D_{2\epsilon}(h,h')$ of the operator $(\hat U^D_\epsilon)^2$ has the form (8.53), where $\epsilon$ is replaced by $2\epsilon$ and
$$ U_{2\epsilon}(h,h'') = \int_H d\bar h\,U_\epsilon(h,\bar h)\,U_\epsilon(\bar h,h''). \tag{8.58} $$
The proof of (8.54) is accomplished by successively repeating the procedure in the folding $\hat U^D_\epsilon\cdots\hat U^D_\epsilon$ from left to right. Thus the path integral has the form [19,18]
$$ U^D_t(h,h') = \sum_W\left[\kappa(h)\kappa(\hat Rh')\right]^{-1}\int_{h(0)=\hat Rh'}^{h(t)=h} Dh\,\exp\left\{i\int_0^t d\tau\left[\frac{\dot h^2}{2} - V(h)\right]\right\}. \tag{8.59} $$
Fig. 9. The case of SU(3). The modular domain is the Weyl chamber, i.e., the sector 1O2 with the angle $\pi/3$. In addition to a trajectory connecting the initial configuration $h'$ and the final configuration $h$, there are five trajectories connecting them and containing reflections from the Gribov horizon (the Weyl chamber boundary). All these trajectories contribute to the "free" transition amplitude. The points $h'_{1}, h'_{2}, h'_{12}, h'_{21}, h'_{121}$ are the Weyl images of $h'$ obtained by all compositions of the mirror reflections with respect to the lines 1O1' and 2O2'. For instance, $h'_{12}$ is obtained by the reflection in 1O1' and then in 2O2', etc. Such an interaction with the horizon induces the kinematic coupling of the physical degrees of freedom. The transition amplitude cannot be factorized, even though the Hamiltonian would have no interaction between the physical degrees of freedom.
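The counting of Weyl images in Fig. 9 can be illustrated numerically. The following is a minimal sketch, assuming the two walls of the Weyl chamber are modelled as lines through the origin at angles 0 and $\pi/3$ in a plane with Cartesian coordinates; the function names and the sample point `h_prime` are hypothetical and chosen only for illustration.

```python
import math


def reflect(point, phi):
    """Mirror a 2D point in the line through the origin at angle phi."""
    x, y = point
    c, s = math.cos(2.0 * phi), math.sin(2.0 * phi)
    return (c * x + s * y, s * x - c * y)


def weyl_images(point, wall_angles=(0.0, math.pi / 3.0), digits=9):
    """Collect all distinct images of `point` under repeated wall reflections."""
    seen = {}
    stack = [point]
    while stack:
        p = stack.pop()
        # Rounded coordinates serve as a key to detect points already visited.
        key = (round(p[0], digits), round(p[1], digits))
        if key in seen:
            continue
        seen[key] = p
        for phi in wall_angles:
            stack.append(reflect(p, phi))
    return list(seen.values())


def in_chamber(point, opening=math.pi / 3.0):
    """True if the point lies strictly inside the sector 0 < angle < opening."""
    angle = math.atan2(point[1], point[1 - 1])
    return 0.0 < angle < opening


h_prime = (1.0, 0.3)           # a generic point inside the chamber (assumed)
images = weyl_images(h_prime)  # h' itself plus its five mirror images
```

For a generic interior point the orbit contains six points (the initial configuration and five images), and only one of them lies inside the chamber, in line with the six trajectories described in the caption.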
The exponential contains the gauge-fixed action. Due to the Weyl invariance of the action, the sum over the Weyl group can be interpreted as the contribution of trajectories reflected from the boundary of the Weyl chamber (cf. the analysis of the harmonic oscillator in Fig. 4 and the discussion at the end of Section 4.5). That is, a trajectory outgoing from $\hat R h'$ and ending at $h \in K^+$ has the same action as the trajectory outgoing from $h' \in K^+$, reflecting from the boundary $\partial K^+$ (possibly more than once) and ending at $h \in K^+$. An example of the group SU(3) is plotted in Fig. 9. We stress again that the reflections are not caused by any force action (there is no infinite potential well as in the case of a particle in a box). The physical state of the system is not changed at the very moment of the reflection. Thanks to the square root of the Faddeev-Popov determinant at the initial and final points in the denominator of Eq. (8.59), the amplitude does not vanish when either the initial or final point lies on the boundary of the Weyl chamber, that is, the system can reach the horizon with non-zero probability. This is in contrast to the infinite well case (8.33). The occurrence of the reflected trajectories in the path integral measure is the price we necessarily have to pay when cutting the hyperconic physical phase space to unfold it into a part of a Euclidean space spanned by the canonical coordinates $h$ and $p_h$ and, thereby, to establish the relation between the path integral measure on the hypercone and the conventional Liouville phase space measure. A phase space trajectory $p_h(\tau)$, $h(\tau)$, $\tau \in [0,t]$, that contributes to the phase-space path integral, obtained from (8.59) by the Fourier transform (8.19), may have discontinuities since the momentum $p_h(\tau)$ changes abruptly when the trajectory goes through the cut on the phase space. Such trajectories are absent in the support of the path integral measure for a similar system with a Euclidean phase space. The Weyl symmetry of the probability amplitude guarantees that the physical state of the system does not change when passing through the cut, which means that the system does not feel the discontinuity of the phase space trajectory associated with particular canonical coordinates on the hypercone.
Remark. The path integral (8.59) is invariant relative to the Weyl transformations. Therefore, it has a unique, gauge-invariant analytic continuation into the total configuration space in accordance with the theorem of Chevalley (cf. Section 7.4). It is a function of the independent Casimir polynomials $P_\nu(x)$ and $P_\nu(x')$, which can also be anticipated from the spectral decomposition (8.21) over gauge invariant eigenstates. Thus, the path integral (8.59) does not depend on any particular parameterization of the gauge orbit space.

8.6. Solving the Gribov obstruction in the 2D Yang-Mills theory

The Jacobian (5.41) $\kappa^2(a)$ calculated in Section 5.2 is the Faddeev-Popov determinant in the Coulomb gauge $\partial A = 0$ (with the additional condition that $A \in H$). Indeed, the Faddeev-Popov operator is $\{\chi, \sigma\} = \{\partial A, \nabla(A)E\} = -\partial\nabla(A)$. Since the Coulomb gauge is not complete in two dimensions (there are homogeneous continuous gauge transformations left), the determinant $\det[-\partial\nabla(A)]$ should be taken on the space $F \ominus F_0$, i.e., homogeneous functions should be excluded from the domain of the Faddeev-Popov operator (these are zero modes of the operator $\partial\nabla(A)$). The residual continuous gauge arbitrariness generated by the constraints (5.17) is fixed by the gauge $A = A_0 = a$, where $a$ is from the Cartan subalgebra. On the surface $A_0 = a$, the Faddeev-Popov operator in the space $F_0$ of constant functions has the form $\{\sigma_0, A_0\} = \mathrm{ad}\,A_0 = \mathrm{ad}\,a$. It vanishes identically on the subspace of constant functions taking their values in the Cartan subalgebra, $F_0^H = H$. This indicates that there is still a continuous gauge arbitrariness left. These are homogeneous transformations from the Cartan subgroup. They cannot be fixed because they leave the connection $A = a$ invariant. As we have already remarked, this is due to the reducibility of the constraints (the Gauss law) in two dimensions (not all the constraints are independent).

In the reducible case the Faddeev-Popov determinant should be defined only for the set of independent constraints (otherwise it identically vanishes). Thus, the Faddeev-Popov operator acts as the operator $\partial\nabla(a)$ in the space $F \ominus F_0$ and as $\mathrm{ad}\,a$ in $F_0 \ominus F_0^H \sim X \ominus H$. An additional simplification, thanks to two dimensions, is that the determinant $\det[-\partial\nabla(A)] = \det(i\partial)\,\det(i\nabla(A))$ is factorized, and the infinite constant $\det i\partial$ can be neglected. On the constraint surface we have $\nabla(A) = \nabla(a)$. The operator $\mathrm{ad}\,a$ acting in $X \ominus H = F_0 \ominus F_0^H$ coincides with $\nabla(a)$ acting in the same space of constant functions taking their values in the orthogonal supplement to the Cartan subalgebra. Thus, the Faddeev-Popov determinant is $\det i\nabla(a)$, where the operator $\nabla(a)$ acts in $F \ominus F_0^H$. The determinant $\det i\nabla(a)$ is the Jacobian $\kappa^2(a)$ computed modulo some (infinite) constant. We see again that the Jacobian of the change of variables associated with the chosen gauge and the gauge transformation law is proportional to the Faddeev-Popov determinant in that gauge.

The Faddeev-Popov determinant vanishes if $(a,\alpha) = n_\alpha a_0$ for any integer $n_\alpha$ and a positive root $\alpha$. What are the corresponding zero modes of the Faddeev-Popov operator? Let us split the zero modes into those which belong to the space $F \ominus F_0$ and those from $F_0 \ominus F_0^H$, i.e., the spatially nonhomogeneous and homogeneous ones, respectively. They satisfy the equations

$$\nabla(a)\xi = \partial\xi - ig[a,\xi] = 0 , \qquad \xi \in F \ominus F_0 ; \tag{8.60}$$

$$\nabla(a)\xi_0 = -ig[a,\xi_0] = 0 , \qquad \xi_0 \in F_0 \ominus F_0^H . \tag{8.61}$$
A general solution to Eq. (8.60) reads

$$\xi(x) = e^{igax}\,\bar\xi\,e^{-igax} = \exp[igx(\mathrm{ad}\,a)]\,\bar\xi , \qquad \bar\xi = \mathrm{const} \in F_0 \sim X . \tag{8.62}$$

The zero modes must be periodic functions, $\xi(x + 2\pi l) = \xi(x)$, because the space is compactified into a circle of radius $l$. This imposes a restriction on the connection $A = a$ under which zero modes exist, and accordingly the Faddeev-Popov determinant vanishes at the connections satisfying these conditions. Let us decompose the element $\bar\xi$ over the Cartan-Weyl basis: $\bar\xi = \sum_{\alpha>0}(\bar\xi^+_\alpha e_\alpha + \bar\xi^-_\alpha e_{-\alpha})$. The constant $\bar\xi$ cannot contain a Cartan subalgebra component, otherwise $\xi(x)$ would have a component from $F_0^H$. Making use of the commutation relation (4.11) we find

$$\xi(x) = \sum_{k=0}^{\infty} \frac{1}{k!}\,[igx(\mathrm{ad}\,a)]^k\,\bar\xi = \sum_{\alpha>0}\left[ e^{igx(a,\alpha)}\,\bar\xi^+_\alpha e_\alpha + e^{-igx(a,\alpha)}\,\bar\xi^-_\alpha e_{-\alpha} \right] . \tag{8.63}$$

Each coefficient in the decomposition (8.63) must be periodic, which yields

$$(a,\alpha) = a_0 n_\alpha , \qquad n_\alpha \neq 0 . \tag{8.64}$$

We conclude that the Faddeev-Popov operator has an infinite number of independent nonhomogeneous zero modes labeled by all roots $\pm\alpha$ and integers $n_\alpha \neq 0$ if the connection is in any of the hyperplanes (8.64). Note that each term in sum (8.63) satisfies Eq. (8.60) and, therefore, can be regarded as an independent zero mode. The zero modes are orthogonal with respect to the scalar product $\int_0^{2\pi l} dx\,(\xi_1^*, \xi_2)$, where $(e_\alpha)^* = e_{-\alpha}$ (cf. (4.14)). The condition $n_\alpha \neq 0$ ensures that $\xi(x)$ is not homogeneous. However, the Jacobian $\kappa^2(a)$ vanishes on the hyperplanes $(a,\alpha) = 0$. Where are the corresponding zero modes? They come from Eq. (8.61). Let us decompose $\xi_0$ over the Cartan-Weyl basis: $\xi_0 = \sum_{\alpha>0}(\xi^+_\alpha e_\alpha + \xi^-_\alpha e_{-\alpha})$. We recall that $\xi_0$ does not have a Cartan subalgebra component. Making use of commutation relation (4.11) we conclude that Eq. (8.61) has $\dim G - \dim H$ (the number of all roots) linearly independent solutions proportional to $e_{\pm\alpha}$, provided the connection satisfies the condition $(a,\alpha) = 0$ (cf. also the analysis in Section 4.3 between (4.25) and (4.26)).
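The periodicity condition (8.64) can be checked in a toy SU(2) setting. The sketch below assumes a diagonal connection $a = \mathrm{diag}(c, -c)$ and the raising matrix as $\bar\xi$, so that the phase of the zero mode is $e^{2igcx}$ and $2c$ plays the role of $(a,\alpha)$; it also uses $a_0 = 1/(gl)$, which is what periodicity over $2\pi l$ implies. The names `zero_mode`, `is_periodic` and the numerical values are hypothetical.

```python
import cmath
import math


def zero_mode(x, c, g, xi_bar):
    """xi(x) = exp(i g x a) xi_bar exp(-i g x a) for the diagonal a = diag(c, -c)."""
    diag = (cmath.exp(1j * g * x * c), cmath.exp(-1j * g * x * c))
    # For a diagonal unitary matrix, the inverse is the complex conjugate.
    return [[diag[i] * xi_bar[i][j] * diag[j].conjugate() for j in range(2)]
            for i in range(2)]


def is_periodic(c, g, l, xi_bar, samples=(0.0, 0.37, 1.9), tol=1e-9):
    """Check xi(x + 2 pi l) = xi(x) at a few sample points."""
    for x in samples:
        m0 = zero_mode(x, c, g, xi_bar)
        m1 = zero_mode(x + 2.0 * math.pi * l, c, g, xi_bar)
        diff = max(abs(m0[i][j] - m1[i][j]) for i in range(2) for j in range(2))
        if diff > tol:
            return False
    return True


E_PLUS = [[0.0, 1.0], [0.0, 0.0]]   # root vector e_alpha in this toy basis
g, l = 2.0, 0.5                      # assumed values, so that a_0 = 1/(g l) = 1

on_hyperplane = is_periodic(1.5, g, l, E_PLUS)    # 2c = 3 a_0
off_hyperplane = is_periodic(0.7, g, l, E_PLUS)   # 2c = 1.4 a_0
```

In this toy normalization a periodic zero mode exists only when $2c$ is an integer multiple of $a_0$, which mirrors the hyperplane condition (8.64).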
So the Faddeev-Popov determinant should vanish on the hyperplanes $(a,\alpha) = 0$ as well. The Gribov copies are found by applying the affine Weyl transformations to configurations on the gauge fixing surface. The fundamental modular domain is compact and isomorphic to the Weyl cell. The Faddeev-Popov determinant vanishes on its boundary. Note also that there are copies inside of the Gribov region (i.e., inside the region bounded by zeros of the Faddeev-Popov determinant), but they are related to one another by homotopically non-trivial gauge transformations which are not generated by the constraints (see Fig. 5 where the case of SU(3) is illustrated).

Now we construct a modified path integral that solves the Gribov obstruction in the model [52]. Let us take first the simplest case of SU(2). We will use the variable $\theta = (a,\omega)/a_0$ introduced in Section 7.6. The Weyl cell is the open interval $\theta \in (0,1)$ and $\kappa(\theta) = \sin\pi\theta$. The affine Weyl transformations are

$$\theta \to \theta_{p,n} = p\theta + 2n , \qquad p = \pm 1 , \tag{8.65}$$

where $n$ ranges over all integers. The interval $(0,1)$ is the quotient of the real line by the affine Weyl group (8.65). A transition amplitude is a solution to the Schrödinger equation ($\hbar = 1$),

$$\left[ -\frac{1}{2b\,\sin(\pi\theta)}\,\frac{\partial^2}{\partial\theta^2}\,\sin(\pi\theta) - E_C \right] U^D_t(\theta,\theta') = i\partial_t U^D_t(\theta,\theta') , \tag{8.66}$$
that is regular at the boundaries $\theta = 0, 1$ and satisfies the initial condition

$$U^D_{t=0}(\theta,\theta') = \langle\theta|\theta'\rangle = [\sin(\pi\theta)\sin(\pi\theta')]^{-1}\,\delta(\theta - \theta') , \tag{8.67}$$

where $\theta, \theta' \in (0,1)$. It has the form

$$U^D_t(\theta,\theta') = (2\pi i t b)^{-1/2} \sum_{p=\pm 1}\ \sum_{n=-\infty}^{\infty} \frac{\exp\left\{ \dfrac{i(\theta - \theta'_{p,n})^2}{2tb} + iE_C t \right\}}{\sin(\pi\theta)\,\sin(\pi\theta'_{p,n})} \tag{8.68}$$

$$= \frac{e^{iE_C t}}{(2\pi i t b)^{1/2}\,\sin\pi\theta\,\sin\pi\theta'} \sum_{n=-\infty}^{\infty} \left[ \exp\left\{ \frac{i(\theta - \theta' + 2n)^2}{2bt} \right\} - \exp\left\{ \frac{i(\theta + \theta' + 2n)^2}{2bt} \right\} \right] . \tag{8.69}$$

We have included all parameters of the kinetic energy in the Hamiltonian (7.69) into the constant $b = 4\pi l a_0^2 = 4\pi/(lg^2)$. The sum in (8.68) is extended over the residual gauge transformations (the affine Weyl group), or, in other words, over the Gribov copies of the initial configuration $\theta'$ in the gauge fixing surface. The regularity of the transition amplitude at $\theta = n$ or $\theta' = n$ is easy to verify. The numerator and the denominator in the sum in (8.69) vanish if either $\theta$ or $\theta'$ attains an integer value, but the ratio remains finite because the zeros are simple. The exponential in (8.68) is nothing but the evolution operator kernel of a free particle on a line. So it can be written as the path integral with the standard measure which involves no restriction of the integration to the modular domain. The action of a free particle coincides with the Yang-Mills action in two dimensions in the Coulomb gauge. That is, we have found the way to modify the Faddeev-Popov reduced phase-space path integral to resolve the Gribov obstruction. The sum over the Gribov copies of the initial configuration $\theta'$ in the covering space (the gauge fixing surface) can again be interpreted as contributions of the trajectories that reflect from the Gribov horizon several times before they reach the final point $\theta$. The amplitude does not vanish if the initial or final point is on the horizon. A generalization to an arbitrary compact group is straightforward [52]

$$U^D_t(a,a') = \sum_{W_A} \left[\kappa(a)\kappa(\hat R a')\right]^{-1} \int_{a(0)=\hat R a'}^{a(t)=a} \mathcal{D}a\; \exp\left\{ i\int_0^t d\tau \left[ \pi l\,\dot a^2 + E_C \right] \right\} . \tag{8.70}$$

The path integral for a free particle in $r$ dimensions has the standard measure. The transition amplitude obviously satisfies the Schrödinger evolution equation. One can also verify the validity of representation (8.70) by the direct summation of the spectral representation (8.21) of the transition amplitude because we know the explicit form of eigenstates (7.77). However, we will give another derivation of (8.70) which is more general and can be used for deriving a Hamiltonian path integral for any gauge theory from the Dirac operator formalism. Consider a spectral decomposition of the unit operator kernel

$$\langle a|a'\rangle = \sum_E \Phi_E(a)\,\Phi^*_E(a') = [\kappa(a)\kappa(a')]^{-1}\,\delta(a - a') , \qquad a, a' \in K^+_W . \tag{8.71}$$

The eigenfunctions $\Phi_E(a)$ are reductions of the gauge invariant eigenfunctions (7.77), (7.78) on the gauge fixing surface. Therefore kernel (8.71) is, in fact, a genuine unit operator kernel on the gauge
orbit space, which does not depend on any particular parameterization of the latter. Clearly, $\Phi_E(a)$ are invariant under the residual gauge transformations, i.e., under the affine Weyl transformations. Let us make use of this fact to obtain a continuation of the unit operator kernel to the non-physical region $a \in H$, i.e., to the whole covering space of the modular domain $K^+_W$. The following property should hold: $\langle a|\hat R a'\rangle = \langle a|a'\rangle$ because $\Phi_E(\hat R a) = \Phi_E(a)$. Therefore

$$\langle a|a'\rangle = \sum_{W_A} [\kappa(a)\kappa(\hat R a')]^{-1}\,\delta(a - \hat R a') \tag{8.72}$$

$$= \int_H \frac{da''}{\kappa(a)\kappa(a'')}\,\delta(a - a'')\,Q(a'',a') \tag{8.73}$$

$$= \int_H \frac{da''}{\kappa(a)\kappa(a'')} \int_H \frac{dp}{(2\pi)^r}\, e^{ip(a - a'')}\,Q(a'',a') , \tag{8.74}$$

where $a \in H$ and $a' \in K^+_W$, $pa \equiv (p,a)$; the kernel of the operator $\hat Q$ is defined as before:

$$Q(a,a') = \sum_{W_A} \delta(a - \hat R a') . \tag{8.75}$$

The extended unit operator kernel coincides with the transition amplitude in the limit $t \to 0$ as we have learned in Section 8.2 (see (8.27)). Now we can construct the infinitesimal transition amplitude by means of the relation

$$U^D_\epsilon(a,a') = \left(1 - i\epsilon \hat H_{ph}(a)\right)\langle a|a'\rangle + O(\epsilon^2) , \tag{8.76}$$

where the physical Hamiltonian is taken from the Dirac quantization method (7.69) (see also (7.100) for a general case). Applying the physical Hamiltonian to the Fourier transform of the unit operator kernel (8.74), we obtain the following representation:

$$U^D_\epsilon(a,a') = \int_H \frac{da''}{\kappa(a)\kappa(a'')}\, U_\epsilon(a,a'')\,Q(a'',a') + O(\epsilon^2) , \tag{8.77}$$

$$U_\epsilon(a,a'') = \int_H \frac{dp}{(2\pi)^r}\, \exp\left\{ ip(a - a'') - i\epsilon\left( \frac{p^2}{4\pi l} - E_C \right) \right\} \tag{8.78}$$

$$= (i\epsilon/l)^{-r/2}\, \exp\left\{ \frac{i\pi l\,(a - a'')^2}{\epsilon} + i\epsilon E_C \right\} . \tag{8.79}$$
The function $p^2/(4\pi l) = H_{ph}$ in (8.78) is the classical gauge-fixed Hamiltonian; the addition $E_C$ is a quantum correction to it resulting from the operator ordering. To obtain (8.79), we did the Gaussian integral over the momentum variable. The folding $(\hat U_\epsilon \hat Q)^N$ can be calculated in the same fashion as in the preceding section. The only difference is that the integration in the scalar product is extended over the Weyl cell, which is compact. Due to the invariance of the amplitude $U^D_\epsilon(a,a')$ relative to the affine Weyl transformations, all the operators $\hat Q$ in the folding can be pulled over to the right with the result that the integration over the Weyl cell is replaced by the integration over the whole Cartan
subalgebra (the covering space) in the folding $(\hat U_\epsilon)^N$ (thanks to the sum over the affine Weyl group generated by $\hat Q$). Thus, formula (8.70) is recovered again. Amplitude (8.70) has a unique analytic continuation into the original functional configuration space $F$ as has been argued in Section 7.5. It is a function of two Polyakov loops for the initial and final configurations of the vector potential. So it does not depend on any particular parameterization of the gauge orbit space, which has been used to calculate the corresponding path integral, and, hence, is a genuine coordinate-free transition amplitude on the gauge orbit space $F/G$. Replacing the time $t$ by the imaginary one, $t \to -i\beta$, one can calculate the partition function

$$Z(\beta) = \mathrm{tr}\, e^{-\beta \hat H} = \int_{K^+_W} da\,\kappa^2(a)\,U^D_\beta(a,a) = \sum_{\Lambda_n} e^{-\beta E_n} , \tag{8.80}$$

where the sum is extended over the irreducible representations $\Lambda_n$ (see (7.72)). Thanks to the sum over the affine Weyl group in (8.70), the integral in (8.80) can be done explicitly. The result coincides with the earlier calculation of the partition function of the 2D Yang-Mills theory on the lattice where no gauge fixing is needed since the path integral is just a finite multiple integral [46,57]. The partition function can also be calculated directly from the spectral decomposition (8.21) and the orthogonality of the characters of the irreducible representations (7.78) which are eigenfunctions in the model.

As a conclusion, we comment that the formalism developed above provides us with a necessary modification of the Faddeev-Popov Hamiltonian path integral which takes into account the non-Euclidean geometry of the physical phase space and naturally resolves the Gribov obstruction. It determines an explicitly gauge invariant transition amplitude on the gauge orbit space. Next we will develop a general method of constructing such a path integral formalism in gauge theories directly from the Kato-Trotter formula without any use of the Schrödinger equation.
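The structure of the SU(2) amplitude (8.69) and of the trace in (8.80) can be explored numerically in imaginary time. The following is a minimal sketch under stated assumptions: the substitution $t \to -i\beta$ is made, $E_C$ is set to zero, and the Gaussian variance $s$ stands for the combination $b\beta$ exactly as it appears in (8.69). The image sum is compared with a sum over the modes $\sin(k\pi\theta)$ that vanish at the ends of the Weyl cell. All function names are hypothetical.

```python
import math


def gauss(x, s):
    """Euclidean free-particle kernel on the line with variance s."""
    return math.exp(-x * x / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)


def amplitude(theta, theta_p, s, n_max=50):
    """Image sum over the affine Weyl copies, divided by kappa(theta) kappa(theta')."""
    total = 0.0
    for n in range(-n_max, n_max + 1):
        total += gauss(theta - theta_p + 2 * n, s) - gauss(theta + theta_p + 2 * n, s)
    return total / (math.sin(math.pi * theta) * math.sin(math.pi * theta_p))


def amplitude_spectral(theta, theta_p, s, k_max=200):
    """The same kernel written as a sum over modes sin(k pi theta)."""
    total = 0.0
    for k in range(1, k_max + 1):
        decay = math.exp(-s * (math.pi * k) ** 2 / 2.0)
        total += 2.0 * math.sin(k * math.pi * theta) * math.sin(k * math.pi * theta_p) * decay
    return total / (math.sin(math.pi * theta) * math.sin(math.pi * theta_p))


def partition_function(s, n_points=400):
    """Midpoint rule for the integral of kappa^2(theta) U(theta, theta) over (0, 1)."""
    h = 1.0 / n_points
    total = 0.0
    for i in range(n_points):
        theta = (i + 0.5) * h
        total += math.sin(math.pi * theta) ** 2 * amplitude(theta, theta, s)
    return total * h


def partition_function_spectral(s, k_max=200):
    """Sum of the Boltzmann-like factors of the modes sin(k pi theta)."""
    return sum(math.exp(-s * (math.pi * k) ** 2 / 2.0) for k in range(1, k_max + 1))
```

Evaluating `amplitude` very close to $\theta = 0$ gives a finite value, which illustrates that the simple zeros of the numerator and of $\sin\pi\theta$ cancel, and `partition_function` agrees with `partition_function_spectral` to quadrature accuracy.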
Remark. The method of constructing the Hamiltonian path integral, based on the continuation of the unit operator kernel to the whole gauge fixing surface, can be applied to a generic gauge model of the Yang-Mills type discussed in Section 7.7 [95,129]. The effective classical Hamiltonian that emerges in the Hamiltonian path integral will not coincide with the classical gauge-fixed Hamiltonian (7.103). It will contain additional terms equal to the operator ordering corrections that appear in the Dirac quantum Hamiltonian in (7.100). In this way, one can construct the path integral that takes into account both the singularities of a particular coordinate parameterization of the orbit space and the operator ordering, both of which are essential for the gauge invariance of the quantum theory, as we have seen in Section 7.7.

8.7. The projection method and a modified Kato-Trotter formula for the evolution operator in gauge systems

The path integral quantization is regarded as an independent quantization recipe from which the corresponding operator formalism is to be derived. So far we have explored the other way around. It is therefore of interest to put forward the following question. Is it possible to develop a self-contained path integral quantization of gauge systems that does not rely on the operator formalism? The answer is affirmative [130]. The idea is to combine the Kato-Trotter product
formula for the evolution operator in the total Hilbert space and the projection on the physical (Dirac) subspace. In such an approach no gauge fixing is needed in the path integral formalism [131-133]. The gauge invariant path integral can then be reduced onto any gauge fixing surface. Let the gauge group $G$ be compact in a generic gauge model of the Yang-Mills type discussed in Section 7.7. Consider the projection operator [131-133]

$$\hat P = \int_G d\mu_G(\omega)\, e^{i\omega_a \hat\sigma_a} , \tag{8.81}$$

where the measure is normalized to unity, $\int_G d\mu_G = 1$, e.g., it can be the Haar measure of the group $G$. The operators of constraints are assumed to be hermitian. So, $\hat P = \hat P^\dagger = \hat P^2$. Dirac gauge invariant states (7.92) are obtained by applying the projection operator (8.81) to all states in the total Hilbert space

$$\Psi(x) = \hat P\psi(x) = \int_G d\mu_G(\omega)\,\psi(\Omega(\omega)x) . \tag{8.82}$$

If the group is not compact, one can take a sequence of the rescaled projection operators $c_\delta \hat P_\delta$, where $\hat P_\delta$ projects on the subspace $\sum_a \hat\sigma_a^2 \le \delta$. In the limit $\delta \to 0$ the Hilbert space isomorphic to the Dirac physical subspace is obtained. To make the procedure rigorous, the use of the coherent state representation is helpful [133,134]. An explicit form of the projection operator kernel in the coherent state representation for several gauge models has been obtained in [131,10,133,219]. From the spectral representation of the evolution operator kernel (8.21) it follows that the physical evolution operator is obtained by the projection of the evolution operator onto the physical subspace in the total Hilbert space

$$\hat U^D_t = \hat P\,\hat U_t\,\hat P , \tag{8.83}$$

where the superscript "D" stands for "Dirac". The path integral representation of the physical evolution operator kernel is then derived by taking the limit of the folding sequence

$$\hat U^D_t = (\hat U^D_\epsilon)^n = (\hat P\,\hat U_\epsilon\,\hat P)^n = (\hat P\,\hat U^0_\epsilon\,\hat P\, e^{-i\epsilon\hat V})^n \equiv (\hat U^{0D}_\epsilon\, e^{-i\epsilon\hat V})^n , \tag{8.84}$$

where the gauge invariance of the potential has been used, [

$$\Psi(f(u)) = \int_G d\mu_G(\omega)\,\psi(\Omega(\omega)f(u)) \equiv \Phi(u) . \tag{8.85}$$

The invariance of physical states (8.85) with respect to the Gribov transformations $u \to \hat R u = u_s(u)$ follows from the relation $f(u) = \Omega_s^{-1}(u) f(u_s)$, which defines the Gribov transformations, and the right-shift invariance of the measure on the group manifold. To make use of the modified
Kato-Trotter formula (8.84), we have to construct the kernel of $\hat U^{0D}_\epsilon = \hat U^0_\epsilon \hat P$. Applying the projection operator to the infinitesimal evolution operator kernel of a free motion in the total configuration space we find

$$U^{0D}_\epsilon(x,x') = (2\pi i\epsilon)^{-N/2} \int_G d\mu_G(\omega)\, \exp\left\{ \frac{i\langle x - \Omega(\omega)x'\rangle^2}{2\epsilon} \right\} , \tag{8.86}$$

where by $\langle x\rangle^2$ we imply the invariant scalar product $\langle x,x\rangle$. Kernel (8.86) is explicitly gauge invariant. Reducing it on the gauge fixing surface by the change of variables (7.93) we find

$$U^{0D}_\epsilon(u,u') = (2\pi i\epsilon)^{-N/2} \int_G d\mu_G(\omega)\, \exp\left\{ \frac{i\langle f(u) - \Omega(\omega)f(u')\rangle^2}{2\epsilon} \right\} , \tag{8.87}$$

where $u$ and $u'$ belong to the fundamental modular domain $K$. The amplitude is also well defined for $u$ ranging over the entire gauge fixing surface (the covering space of the modular domain $K$). The transition amplitude (8.87) can be analytically continued into the covering space. The continuation is invariant under the Gribov transformations

$$U^{0D}_\epsilon(u,\hat R u') = U^{0D}_\epsilon(u,u') . \tag{8.88}$$

The evolution of the physical states governed just by the free Hamiltonian is determined by the relation

$$\Phi_\epsilon(u) = \int_K du'\,\mu(u')\,U^{0D}_\epsilon(u,u')\,\Phi_0(u') , \tag{8.89}$$

where the induced volume element $\mu(u)$ is the Faddeev-Popov determinant on the gauge fixing surface [13]. Formulas (8.86)-(8.89) are obviously valid for a finite time, $\epsilon \to t$. This follows from the modified Kato-Trotter formula for zero potential $V = 0$. By construction, kernel (8.87) turns into a unit operator kernel as $\epsilon \to 0$. Moreover, thanks to invariance property (8.88), we get a unique continuation of the unit operator kernel to the covering space of the modular domain

$$\langle u|u'\rangle = \int_G d\mu_G(\omega)\,\delta^N\big(f(u) - \Omega(\omega)f(u')\big) \tag{8.90}$$

$$= \sum_{S_\chi} [\mu(u)\mu(\hat R u')]^{-1/2}\,\delta^M(u - \hat R u') \tag{8.91}$$

$$= \int \frac{du''}{[\mu(u)\mu(u'')]^{1/2}}\,\delta^M(u - u'')\,Q(u'',u') , \tag{8.92}$$

where $u$ is a generic point on the gauge fixing surface, $u'$ belongs to the modular domain and $Q(u'',u') = \sum_{S_\chi} \delta^M(u'' - \hat R u')$; the integration in (8.92) is extended over the whole gauge fixing surface. Recall that the functions $\hat R u' = u_s(u')$ are well defined after the modular domain is identified (see Sections 6.2 and 7.7). Due to the gauge invariance of the potential we obviously have

$$\langle u|e^{-i\epsilon\hat V}|u'\rangle = e^{-i\epsilon V(f(u))}\,\langle u|u'\rangle . \tag{8.93}$$
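The gauge invariance of the averaged kernel (8.86) can be seen in the simplest setting. The sketch below assumes $N = 2$, the group SO(2) acting by plane rotations, and the normalized Haar measure $d\omega/2\pi$ discretized on a uniform grid; the function names and sample points are hypothetical.

```python
import cmath
import math


def rotate(v, w):
    """Rotate the 2D vector v by the angle w (the SO(2) gauge transformation)."""
    c, s = math.cos(w), math.sin(w)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def projected_kernel(x, x_prime, eps, n_grid=2000):
    """Group average of the free short-time kernel for N = 2.

    The Haar integral over SO(2) is approximated by a uniform grid in the
    rotation angle, which is very accurate for a smooth periodic integrand.
    """
    total = 0j
    for k in range(n_grid):
        w = 2.0 * math.pi * k / n_grid
        y = rotate(x_prime, w)
        dist2 = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
        total += cmath.exp(1j * dist2 / (2.0 * eps))
    # Prefactor (2 pi i eps)^(-N/2) with N = 2.
    return total / n_grid / (2j * math.pi * eps)


x = (1.0, 0.5)
x_prime = (0.3, -0.8)
k_original = projected_kernel(x, x_prime, 0.5)
# Independent gauge transformations of the two end points.
k_rotated = projected_kernel(rotate(x, 0.83), rotate(x_prime, -2.1), 0.5)
```

The two values agree to numerical precision, so the averaged kernel depends only on the gauge invariant data of the end points (here the radii), which is the property used when the kernel is reduced onto a gauge fixing surface.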
Thus, the basic idea is to project the infinitesimal transition amplitude of a free motion onto the gauge orbit space, rather than to reduce the formal local measure $\prod_{\tau=0}^{t} dx(\tau)$ onto the gauge fixing surface by means of the Faddeev-Popov identity [67]

$$1 = \Delta_{FP}(x) \int_G d\mu_G(\omega)\,\delta^M\big(\chi(\Omega(\omega)x)\big) . \tag{8.94}$$

From the mathematical point of view, folding (8.84) of kernels (8.87) and (8.93) leads to a certain measure for the averaging functions $\omega = \omega(t)$ in the continuum limit. By making use of the classical theory of Kolmogorov, one can show that this measure is a countably additive probability measure for $\omega(t)$ such that any set of values of $\omega(t)$ at any set of distinct times is equally likely [122]. Our next step is to calculate the averaging integral explicitly by means of the stationary phase approximation as $\epsilon \to 0$. It would be technically rather involved to do this in our general setting. We shall outline the strategy and turn to concrete examples in the next section to illustrate the procedure. The stationary phase approximation can be applied before the reduction of $U^D_\epsilon(x,x')$ on a gauge fixing surface. No gauge fixing is needed a priori. A deviation from the conventional gauge-fixing procedure results from the fact that there may be more than just one stationary point.
Remark. As a matter of fact, the averaging integral in the Faddeev-Popov identity (8.94) may also have contributions from several points in the gauge parameter space [125]. To characterize the path integral measure, one needs to know the effect of the gauge group averaging on correlators between neighboring points of a path contributing to the path integral. Because of the locality of the Faddeev-Popov identity, such information cannot be obtained from (8.94), while amplitude (8.87) does determine all correlators between neighboring points on paths on the gauge orbit space.

We can always shift the origin of the averaging variable $\omega$ so that one of the stationary points is at the origin $\omega = 0$. Let $\hat T_a$ be operators generating gauge transformations of $x$. Decomposing the distance $\langle x - \Omega(\omega)x'\rangle^2$ in the vicinity of the stationary point, we find

$$\langle x - x', \hat T_a x'\rangle = 0 . \tag{8.95}$$

In the formal continuum limit, $x - x' \approx \epsilon\dot x$, we get the condition $\sigma_a(\dot x, x) \equiv (\dot x, \hat T_a x) = 0$ induced by the averaging procedure. This is nothing but the Gauss law enforcement for trajectories contributing to the path integral for folding (8.84). Suppose there exists a gauge condition $\chi_a(x) = 0$, which involves no time derivatives, such that a generic configuration $x = f(u)$ satisfying it also fulfills identically the discretized Gauss law (8.95), i.e., $\langle f - f', \hat T_a f'\rangle \equiv 0$, where $f = f(u)$ and $f' = f(u')$. We will call it a natural gauge. In this case all other stationary points in integral (8.86) are $\omega_c = \omega_s$ where $\Omega(\omega_s) f(u) = f(u_s)$. That is, the transformations $\Omega(\omega_s)$ generate Gribov copies of the configuration $x = f(u)$ on the gauge fixing surface. Therefore, we get a sum over the stationary points in the averaging integral (8.87) if the Gribov problem is present. Still, in the continuum limit we have to control all terms of order $\epsilon$. This means that we need not only the leading term in the stationary phase approximation of (8.87) but also the next two corrections to it. Therefore, the group element $\Omega(\omega)$ should be decomposed up to order $\omega^4$ because $\omega^4/\epsilon \sim \epsilon$, as one is easily convinced by rescaling the integration variable $\omega \to \sqrt{\epsilon}\,\omega$. The averaging measure should also be decomposed up to the necessary order to control the $\epsilon$-terms. The latter
would yield quantum corrections to the classical potential associated with the operator ordering in the kinetic energy operator on the orbit space. We stress that the averaging procedure gives a unique ordering so that the integral is invariant under general coordinate transformations on the orbit space, i.e., does not depend on the choice of $\chi$. Thus,

$$U^{0D}_\epsilon(u,u') = (2\pi i\epsilon)^{-M/2} \sum_{S_\chi} D^{-1/2}(u,\hat R u') \left\{ \exp\left[ i\langle f(u) - f(\hat R u')\rangle^2/2\epsilon - i\epsilon \bar V_q(u,\hat R u') \right] + O(\epsilon^2) \right\} \tag{8.96}$$

$$\equiv \sum_{S_\chi} D^{-1/2}(u,\hat R u')\,\tilde U_\epsilon(u,\hat R u') , \tag{8.97}$$

where $D(u,u')$ is the conventional determinant arising in the stationary phase approximation, $\hat R u' = u_s(u')$, $u' \in K$, and by $\bar V_q$ we denote a contribution of all relevant corrections to the leading order. The amplitude $U^D_\epsilon(u,u')$ is obtained by adding $-i\epsilon V(f(u))$ to the exponential in (8.96). Note that $V(f(u)) = V(f(\hat R u))$ thanks to the gauge invariance of the potential. We postpone for a moment a discussion of the quantum corrections $\bar V_q$.

In general, the equations $\sigma_a(\dot x, x) = 0$ are not integrable, therefore the natural gauge does not always exist. In this case we consider two possibilities. Let $\Omega_c(u,u')$ be the group element at a stationary point in (8.87). Decomposing the distance in the vicinity of the stationary point we get $\langle f - \Omega_c f', \hat T_a \Omega_c f'\rangle = 0$, $f' = f(u')$. Let $\chi$ be such that the latter condition is also satisfied if $f(u')$ is replaced by $f(\hat R u')$ where $u' \in K$. Then the sum over the stationary points is again a sum over the Gribov residual transformations. In Eq. (8.96) we have to replace

$$f(\hat R u') \to \Omega_c(u,\hat R u')\,f(\hat R u') , \qquad u' \in K . \tag{8.98}$$

In the most general case, the sum over stationary points may not coincide with the sum over Gribov copies in a chosen gauge. However, for sufficiently small $\epsilon$, the averaged short-time transition amplitude can always be represented in the form (8.97) for some $\tilde U_\epsilon$. Note that as $\epsilon$ approaches zero, the amplitude $U^{0D}_\epsilon(u,u')$ tends to the unit operator kernel (8.91) that contains the sum over the Gribov copies. Each delta function in sum (8.91) can be approximated by the corresponding amplitude of a free motion up to terms of order $\epsilon$. Thus, the sum over copies always emerges in the short-time transition amplitude. The folding of two infinitesimal evolution operator kernels is given by

$$U^D_{2\epsilon}(u,u') = \int_K du_1\,\mu(u_1)\,U^D_\epsilon(u,u_1)\,U^D_\epsilon(u_1,u') . \tag{8.99}$$

Let us replace $U^D_\epsilon(u,u_1)$ in (8.99) by sum (8.97) and make use of (8.88) applied to the second kernel in (8.99): $U^D_\epsilon(u_1,u') = U^D_\epsilon(\hat R u_1,u')$. Note that $\hat R u_1 = u_s(u_1)$ and the functions $u_s$ are well defined because $u_1 \in K$. Since the measure on the orbit space does not depend on a particular choice of the modular domain, $du_s\,\mu(u_s) = du\,\mu(u)$, we can extend the integration to the entire covering space by removing the sum over $S_\chi$ (cf. Section 7.7)

$$U^D_{2\epsilon}(u,u') = \int du_1\,|\mu(u_1)|\,D^{-1/2}(u,u_1)\,\tilde U_\epsilon(u,u_1)\,U^D_\epsilon(u_1,u') . \tag{8.100}$$
The absolute value bars account for a possible sign change of the density $\mu(u)$ (the Faddeev-Popov determinant on the gauge fixing surface). The procedure can be repeated from left to right in the folding (8.84), thus removing the restriction of the integration domain and the sum over copies in all intermediate times $\tau \in (0,t)$. The sum over $S_\chi$ for the initial configuration $u'$ remains in the integral. Now we can formally take a continuum limit with the result

$$U^D_t(u,u') = \sum_{S_\chi} [\mu(u)\mu(\hat R u')]^{-1/2} \int_{u(0)=\hat R u'}^{u(t)=u} \mathcal{D}u\,\sqrt{\det g^{ph}}\; e^{iS_{eff}[u]} , \tag{8.101}$$

$$S_{eff} = \int_0^t d\tau \left[ (\dot u, g^{ph}\dot u)/2 - V_q(u) - V(f(u)) \right] , \tag{8.102}$$

where $g^{ph}_{ij} = g^{ph}_{ij}(u)$ is the induced metric on the orbit space spanned by local coordinates $u$ (cf. (7.100)). The local density

$$D(u' + \Delta, u') = \Delta^2_{FP}(f(u'))/\det g^{ph}(u') + O(\Delta) , \tag{8.103}$$

where $\Delta_{FP}(f(u)) = \mu(u)$. Relation (8.103) explains the cancellation of the absolute value of the Faddeev-Popov determinant in the folding (8.84) computed in accordance with rule (8.100). The term $\Delta^2/\epsilon$ in exponential (8.96) gives rise to the kinetic energy $(\Delta, g^{ph}\Delta)/2\epsilon + O(\Delta^3)$. The metric $g^{ph}$ can be found from this quadratic form.

The technically most involved part of the calculation is the operator ordering corrections $V_q(u)$ in the continuum limit. Here we remark that $D(u,u')$ has to be decomposed up to order $\Delta^2$, while the exponential in (8.96) up to order $\Delta^4$, because the measure has support on paths for which $\Delta^2 \sim \epsilon$ and $\Delta^4 \sim \epsilon^2$. There is a technique, called the equivalence rules for Lagrangian path integrals on manifolds, which allows one to convert terms $\Delta^{2n}$ into terms $\epsilon^n$ and thereby to calculate $V_q$ [135-138] (see also [16] for a detailed review):

$$\Delta^{j_1}\cdots\Delta^{j_{2k}} \to (i\epsilon\hbar)^k \sum_{p(j_1,\ldots,j_{2k})} g_{ph}^{j_1 j_2}\cdots g_{ph}^{j_{2k-1} j_{2k}} \tag{8.104}$$

in the folding of the short-time transition amplitudes, where the sum is extended over all permutations of the indices $j$ to make the right-hand side of (8.104) symmetric under permutations of the $j$'s. Following (8.104) one can derive the Schrödinger equation for the physical amplitude (8.101). The corresponding Hamiltonian operator on the orbit space has the form ($\hbar = 1$)

$$\hat H_{ph} = -\frac{1}{2\mu}\,\partial_j\left( \mu\, g_{ph}^{jk}\,\partial_k \right) + V(f(u)) , \tag{8.105}$$

where $\partial_j = \partial/\partial u^j$. It can easily be transformed to $\hat H^f_{ph}$ in (7.100) by introducing the hermitian momenta $\hat p_j$. Observe that the kinetic energy in (8.105) does not coincide with the Laplace-Beltrami operator on the orbit space because $[\det g^{ph}_{ij}]^{1/2} \neq \mu$ as shown in [13]. The operator (8.105) is
invariant under general coordinate transformations on the orbit space, i.e., its spectrum does not depend on the choice of local coordinates $u$ and, therefore, is gauge invariant.

Thus, we have developed a self-contained path integral quantization in gauge theories that takes into account both the coordinate singularities associated with a parameterization of the non-Euclidean physical phase or configuration space and the operator ordering corrections to the effective gauge fixed action, both of which are important for the gauge invariance of the path integral. The essential step was to use the projection on the Dirac physical subspace directly in the Kato–Trotter representation of the evolution operator. The latter guarantees the unique correspondence of the path integral and the gauge invariant operator formalism. The equivalence of the path integral quantization developed here to the Dirac operator approach discussed in Section 7.7 follows from the simple fact that the projection operator (8.81) commutes with the total Hamiltonian in (7.89) (due to the gauge invariance of the latter). Therefore the evolution (Schrödinger) equation in the physical subspace should have the form

$$ i\partial_t \hat U^D_t = i\partial_t \hat P\hat U_t\hat P = \hat H\hat P\hat U_t\hat P = \hat H_{\rm ph}\hat U^D_t , \tag{8.106} $$

where $\hat H_{\rm ph}$ is given by (7.100), because the projection eliminates the dependence on the non-physical variables $\theta$ (cf. (7.93)) in the transition amplitudes in the total configuration space.

Remark. If gauge orbits are not compact, the integral over the gauge group in (8.86) may still exist, although the measure $d\mu_G(\omega)$ is no longer normalizable; the Riemann measure on the gauge orbit can be taken as the measure $d\mu_G$. For example, if the gauge group acts as a translation of one of the components of $x$, say, $x_1 \to x_1 + \omega$, the integration over $\omega$ in the infinite limits with the Cartesian measure $d\omega$ would simply eliminate $x_1 - x_1'$ from the exponential in (8.86). A general procedure of constructing the coordinate-free phase-space path integral based on the projection method in gauge theories has been developed in [122,123].

8.8. The modified Kato–Trotter formula for gauge models. Examples

Let us illustrate the main features of the path integral quantization method based on the modified Kato–Trotter formula. We start with the simplest example of the SO(N) model. For pedagogical reasons, we do it in two ways. First we calculate the averaging integral exactly and then use the result to develop the path integral. Second, we obtain the same result using the stationary phase approximation in the averaging integral. The latter approach is more powerful and general since it does not require doing the averaging integral exactly. As has been mentioned in Section 8.4, for $N \neq 3$ the kinetic energy would produce a quantum potential of the form $V_q = (N-3)(N-1)/(8r^2)$ if the unitary gauge, $x_1 = r$, $x_i = 0$, $i \neq 1$, is used. Assuming a spherical coordinate system (as the one associated with the chosen gauge and the gauge transformation law) we get for the infinitesimal amplitude (8.87)
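$$ U^{0D}_\epsilon(r, r') = \frac{e^{i(r^2+r'^2)/2\epsilon}}{(2\pi i\epsilon)^{N/2}}\,\frac{V_{N-1}}{V_N}\int_0^\pi d\theta\,\sin^{N-2}\theta\,\exp\left(-\frac{irr'}{\epsilon}\cos\theta\right) \tag{8.107} $$

$$ = \frac{\Gamma(\nu+1/2)\Gamma(1/2)}{(\pi i)^{N/2}}\,\frac{V_{N-1}}{V_N}\,\frac{J_\nu(rr'/\epsilon)}{2\epsilon\,(rr')^{\nu}}\,e^{i(r^2+r'^2)/2\epsilon} , \tag{8.108} $$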
where $V_N$ is the volume of the $N$-sphere of unit radius, $\theta$ is the angle between $x$ and $x'$, and $J_\nu$ is the Bessel function with $\nu = N/2 - 1$. Note that for $N = 3$ and a finite time, $\epsilon \to t$, Eq. (8.107) turns into (8.43). As $\epsilon$ is infinitesimally small, we should take the asymptotic form of the Bessel function for a large argument, keeping only the terms of order $\epsilon$. Making use of the asymptotic expansion of the Bessel function [65]
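$$ J_\nu(z) = \sqrt{\frac{2}{\pi z}}\left\{\cos z_\nu - \frac{\Gamma(\nu+3/2)}{2z\,\Gamma(\nu-1/2)}\,\sin z_\nu\right\} , \tag{8.109} $$

$$ z_\nu = z - \pi\nu/2 - \pi/4 , \tag{8.110} $$

we find, up to terms of order $O(\epsilon^2)$,

$$ U^{0D}_\epsilon(r, r') = (2\pi i\epsilon)^{-1/2}\int_{-\infty}^{\infty}\frac{dr''}{(rr'')^{(N-1)/2}}\,e^{i(r-r'')^2/(2\epsilon) - i\epsilon V_q(r)}\,Q(r'', r') , \tag{8.111} $$

$$ Q(r'', r') = \delta(r''-r') + \delta(r''+r') , \tag{8.112} $$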
where $V_q = (N-1)(N-3)/(8r^2)$ is the quantum potential. The first term in the exponential is identified with the kinetic energy $\dot r^2/2$ in the effective gauge fixed action, while the second one is the quantum potential (7.101). The projection method automatically reproduces the scalar product measure $r^{N-1}$ as a prefactor of the exponential (the Faddeev–Popov determinant in the unitary gauge). The existence of the quantum potential barrier near the Gribov horizon $r = 0$ would change the phase with which trajectories reflected from the horizon contribute to the sum over paths. Observe that there are no absolute value bars in the denominator of the integrand in (8.111). The phase is determined by the phases of the two exponentials in the asymptotic form of the Bessel function (8.109).

In the stationary phase approximation of the averaging integral (8.107), we have to control the corrections of order $\epsilon$ in the exponential. The stationary points are $\theta = 0$ and $\theta = \pi$. In the vicinity of $\theta = 0$, we expand $\cos\theta \approx 1 - \theta^2/2 + \theta^4/24$ and, in the measure, $\sin\theta \approx \theta - \theta^3/6$. The cubic and quartic terms give the contribution of order $\epsilon$. This can be immediately seen after rescaling the integration variable $\theta \to \theta/\sqrt{\epsilon}$. Keeping only the $\epsilon$ and $r$ dependencies and the phase factors, the contribution of the first stationary point $\theta = 0$ to the averaging integral can be written in the form
$$ \frac{e^{i(r-r')^2/2\epsilon}}{(i\epsilon)^{N/2}}\int_0^\infty d\theta\,\epsilon^{(N-1)/2}\left(\theta - \frac{\epsilon\theta^3}{6}\right)^{N-2}\left(1 - \frac{irr'}{24}\,\epsilon\theta^4\right)e^{irr'\theta^2/2} . \tag{8.113} $$

Contributions of the averaging measure and the group element $\Omega(\omega)$ in (8.87) to the next-to-leading order of the stationary phase approximation are given, respectively, by the $\theta^3$- and $\theta^4$-terms in the parentheses. All the quantum corrections are determined by them. Indeed, doing the integral we get
$$ \frac{e^{i(r-r')^2/2\epsilon}}{(2\pi i\epsilon)^{1/2}(rr')^{(N-1)/2}}\left(1 - \frac{i\epsilon(N-1)(N-3)}{8rr'} + O(\epsilon^2)\right) , \tag{8.114} $$
where all numerical factors of (8.107) have now been restored. The expression in the parentheses is nothing but the exponential of the quantum potential up to terms of order $\epsilon^2$. Similarly, we can calculate the contribution of the second stationary point $\theta = \pi$. The result has the form (8.114) where $r' \to -r'$ because $\cos\pi = -1$. Thus, we have recovered result (8.111) again.
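The order-$\epsilon$ coefficient in (8.114) can be cross-checked with exact rational arithmetic. The sketch below is only an illustration and rests on two assumptions that are not part of the original text: the Gaussian moments $\int_0^\infty d\theta\,\theta^n e^{ia\theta^2/2} = \tfrac12\Gamma\big(\tfrac{n+1}{2}\big)(2i/a)^{(n+1)/2}$ with $a = rr'$, and a linearization of the $\theta^3$ (measure) and $\theta^4$ (group element) terms in $\epsilon$. The function names are hypothetical.

```python
from fractions import Fraction


def gaussian_moment_ratio(n, k):
    """Return I(n + 2k) / I(n) in units of (2i/a)^k, where
    I(n) = (1/2) Gamma((n+1)/2) (2i/a)^((n+1)/2).
    The ratio of Gamma functions is a finite product, hence exact."""
    ratio = Fraction(1)
    for j in range(k):
        ratio *= Fraction(n + 1, 2) + j
    return ratio


def correction_coefficient(N):
    """Coefficient c in  integral = I(N-2) * (1 + c * i*eps/(r r'))."""
    n = N - 2
    # measure: (theta - eps*theta^3/6)^(N-2) ~ theta^(N-2) * (1 - (N-2)*eps*theta^2/6)
    measure_term = -Fraction(N - 2, 6) * gaussian_moment_ratio(n, 1) * 2
    # group element: -(i a eps/24) theta^4, using (2i/a)^2 = -4/a^2
    group_term = Fraction(1, 6) * gaussian_moment_ratio(n, 2)
    return measure_term + group_term


for N in range(2, 12):
    # matches -(N-1)(N-3)/8, i.e. the factor multiplying i*eps/(r r') in (8.114)
    assert correction_coefficient(N) == -Fraction((N - 1) * (N - 3), 8)
```

For $N = 3$ the two corrections cancel exactly, in line with the absence of a quantum potential in that case.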
The lesson we can learn from this exercise is the following. When doing the stationary phase approximation in the averaging integral in (8.87), the group element $\Omega(\omega)$ and the averaging measure must be expanded up to such order in the vicinity of every stationary point that the integral would assume the form $\epsilon^{-M/2}A$, where $A$ is expanded up to $O(\epsilon^2)$ and $M$ is the number of physical degrees of freedom.

For the Yang–Mills theory in (0+1) spacetime, we set $f(u) \to h \in H$. The distance in the exponential (8.87) is taken with respect to the Killing form: $(h - \exp(i\,{\rm ad}\,\omega)h')^2$. The equation for stationary points is

$$ \big(h - e^{i\,{\rm ad}\,\omega_c}h',\ {\rm ad}\,e_{\pm\alpha}(e^{i\,{\rm ad}\,\omega_c}h')\big) = 0 , \tag{8.115} $$

for any positive root $\alpha$. The operators ${\rm ad}\,e_{\pm\alpha}$ generate the adjoint action of $G/G_H$, where $G_H$ is the Cartan subgroup, on the Lie algebra. Eq. (8.115) has a trivial solution $\omega_c = 0$ because ${\rm ad}\,e_{\pm\alpha}(h') = [e_{\pm\alpha}, h']$ is orthogonal to any element of the Cartan subalgebra $H$. That is, $x = h$ is the natural gauge. All non-trivial solutions are exhausted by the elements of the Weyl group: $\exp[i\,{\rm ad}\,\omega_c] = \hat R \in W$. The averaging measure has the form [30]

$$ d\mu(\omega) = d\omega\,\det\{(i\,{\rm ad}\,\omega)^{-1}[e^{i\,{\rm ad}\,\omega} - 1]\} . \tag{8.116} $$
The determinant has to be expanded up to the second order in ${\rm ad}\,\omega$, while the exponential in the distance formula up to the fourth order, in a fashion similar to (8.113). The second variation of the distance at the stationary point follows from the expansion

$$ (h - e^{i\,{\rm ad}\,\omega}h')^2 = \big(h - h' - i\,{\rm ad}\,\omega\,h' + \tfrac12({\rm ad}\,\omega)^2h'\big)^2 + O(\omega^3) = (h-h')^2 - (\omega, {\rm ad}\,h\,{\rm ad}\,h'(\omega)) + O(\omega^3) . \tag{8.117} $$

Therefore

$$ D^{1/2}(h, h') = \det[-{\rm ad}\,h\,{\rm ad}\,h']^{1/2} = \kappa(h)\kappa(h') . \tag{8.118} $$
For another stationary point, the configuration $h'$ has to be replaced by $\hat Rh'$, $\hat R \in W$. If we do such a replacement formally in (8.118), we get an ambiguity. Indeed, $\det[i\,{\rm ad}\,h'] = \kappa^2(h') = \kappa^2(\hat Rh')$, while the function $\kappa(h')$ can change sign under the Weyl transformations. The question is: How do we treat the square root in (8.118)? This is a quite subtle and important question for the formalism being developed in general. If we put the absolute value bars in the right-hand side of Eq. (8.118), as seems formally correct, the corresponding short-time transition amplitude would not coincide with the one obtained in Section 8.5 by solving the Schrödinger equation. That is, the phase with which the trajectories reflected from the Gribov horizon contribute to the sum over paths would be incorrect. To determine a correct phase, we note that, if $\hat D$ is a strictly positive operator, then
$$ \int d\omega\,\exp[-(\omega, \hat D\omega)] \sim D^{-1/2}, \qquad D = \det\hat D . \tag{8.119} $$

If $\hat D \to \pm i\hat D$, the integral (8.119) is obtained by an analytic continuation of the left-hand side of (8.119). In our case $\hat D\omega = i\,{\rm ad}\,h\,{\rm ad}\,h'(\omega) = i[h, [h', \omega]]$, $\omega \in X \ominus H$, because the distance (8.117) is multiplied by $i$ in the exponential (8.87). Making use of the Cartan–Weyl basis, the quadratic form can be written as

$$ (\omega, \hat D\omega) = i\sum_{\alpha>0}(h, \alpha)(h', \alpha)\big[(\omega^\alpha_c)^2 + (\omega^\alpha_s)^2\big] . \tag{8.120} $$
Here we have used the commutation relations (4.15). The replacement of $h$ or $h'$ by $\hat Rh$ or $\hat Rh'$, respectively, induces permutations and reflections of the roots in the product $\prod_{\alpha>0}(h, \alpha)(h', \alpha)$, which emerges after the integration over $\omega^\alpha_{c,s}$, since $(\hat Rh, \alpha) = (h, \hat R^{-1}\alpha)$ and the Weyl group preserves the root pattern. Permutations do not change the product. A reflection in the hyperplane perpendicular to a positive root $\alpha$, $\hat R\alpha = -\alpha$, changes the sign of an odd number of terms in (8.120) and may make some permutations among other positive roots, too. Note that for any two positive roots, $\beta$ and $\gamma$, distinct from $\alpha$, the reflection can only occur pairwise: $\hat R\beta = -\gamma$ and $\hat R\gamma = -\beta$ because $\hat R^2 = 1$. According to the analytic continuation of (8.119), each of the integrals over $\omega^\alpha_{s,c}$ ($\alpha$ fixed) would contribute the phase factor $\exp(-i\pi/2)$ when $h'$ is replaced by $\hat Rh'$ and $\det\hat R = -1$ (reflection), thus making together the phase $\exp(-i\pi) = -1 = \det\hat R$, while the pairwise reflections give rise to the total phase $[\exp(-i\pi/2)]^{4k} = 1$, $k = 0, 1, \ldots$. Therefore, an analytic continuation of (8.118) assumes the form

$$ D^{1/2}(h, \hat Rh') = \det\hat R\,\det[-{\rm ad}\,h\,{\rm ad}\,\hat Rh']^{1/2} = \det\hat R\,\lvert\kappa(h)\kappa(\hat Rh')\rvert = \kappa(h)\kappa(\hat Rh') . \tag{8.121} $$
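The sign rule behind (8.121), namely that the product over positive roots changes sign under every Weyl reflection, can be illustrated numerically. The sketch below takes SU(3) as an example; the explicit root coordinates and the helper names `kappa` and `reflect` are illustrative assumptions, not notation from the text, and $\kappa(h)$ is taken as $\prod_{\alpha>0}(h, \alpha)$ without any overall normalization.

```python
import math

# Positive roots of SU(3) in an orthonormal basis of the Cartan subalgebra
# (an assumed, standard choice): alpha_1, alpha_2 and alpha_1 + alpha_2.
POSITIVE_ROOTS = [
    (1.0, 0.0),
    (-0.5, math.sqrt(3) / 2),
    (0.5, math.sqrt(3) / 2),
]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def kappa(h):
    """Product of (h, alpha) over the positive roots."""
    prod = 1.0
    for alpha in POSITIVE_ROOTS:
        prod *= dot(h, alpha)
    return prod


def reflect(h, alpha):
    """Weyl reflection of h in the hyperplane perpendicular to alpha."""
    c = 2 * dot(h, alpha) / dot(alpha, alpha)
    return (h[0] - c * alpha[0], h[1] - c * alpha[1])


h = (0.3, 1.1)
for alpha in POSITIVE_ROOTS:
    # a single reflection has det R = -1 and flips the sign of kappa
    assert math.isclose(kappa(reflect(h, alpha)), -kappa(h), rel_tol=1e-9)
```

Composing two reflections gives an element with $\det\hat R = +1$, and `kappa` then returns to its original value.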
This is the power and the beauty of the new path integral formalism. In the operator formalism, the change of phase that the system wave function acquires after reflection from the configuration space boundary is determined by the type of the boundary conditions at the horizon which, in turn, are specified by a self-adjoint extension of the kinetic energy operator on the modular domain. The latter problem may be extremely hard and has no unique solution in general, given the fact that the modular domain depends on the choice of the gauge. In the projection formalism, the stationary phase approximation of the averaged short-time amplitude specifies this phase automatically so that the amplitude remains gauge invariant regardless of the gauge choice.

The number of stationary points in the averaging integral can be infinite. This would indicate that the physical configuration space may be compact in certain directions. Feynman conjectured that a compactification of the configuration space in certain directions due to the gauge symmetry might be responsible for the mass gap in the spectrum of (2+1) Yang–Mills theory [163] (a finite gap between the ground state energy and the first excited state energy). We have seen that such a conjecture is indeed true for (1+1) Yang–Mills theory. Now we can establish this within our path integral quantization of gauge theories without solving the Schrödinger equation. The averaging integral is now a functional integral over the gauge group $G/G_H$. A rigorous definition of the averaging measure can be given via a lattice regularization of the theory (see Section 10.5). To achieve our goal, it is sufficient to calculate the leading order of the stationary phase approximation for the averaging integral, for which no lattice regularization is needed. The key observation is that the sum over an infinite number of stationary points has a similar effect on the spectrum of a free motion (there is no magnetic field in 2D Yang–Mills theory) as the sum over the winding numbers in the free particle transition amplitude: the spectrum becomes discrete.

Let us turn to the details. The exponential in (8.86) assumes the form $\langle A - A'^{\Omega}\rangle^2$ for any two configurations $A(x)$ and $A'(x)$. It is the distance between two configurations $A(x)$ and $A'^{\Omega}(x)$ introduced by Feynman [163]. Here the scalar product has the form $\langle\ ,\ \rangle = \int_0^{2\pi l}dx\,(\ ,\ )$. According to our general analysis, the gauge group average enforces the Gauss law $\sigma(\dot A, A) = \nabla(A)\dot A = 0$. The orbit space can be parameterized by constant connections $A(x) = a$ taking their values in the Cartan subalgebra $H$. An infinitesimal gauge transformation of $a$ has the form $\delta a = \nabla(a)\omega$ where $\omega(x) \in \mathcal F \ominus \mathcal F^H_0$ (cf. Sections 5.1 and 8.6). The gauge $A(x) = a$ is the natural gauge because the Gauss law is satisfied identically, $\sigma(\dot a, a) = \nabla(a)\dot a \equiv 0$. Thus, if $\omega(x) = 0$ is a stationary configuration,
$\Omega(0) = e$, then all other stationary configurations $\omega_c(x)$ in the functional averaging integral (8.87) must be given by the Gribov transformations of the gauge fixed potential $A(x) = a$, i.e., $\Omega(\omega_c)$ generate transformations from the affine Weyl group. To find the function $D(a, a')$, we expand the distance up to the second order in the vicinity of the stationary point

$$ \langle a - a'^{\Omega}\rangle^2 = \big\langle a - a' - \nabla(a')\omega + \tfrac12[\nabla(a')\omega, \omega]\big\rangle^2 + O(\omega^3) = \langle a - a'\rangle^2 - \langle\omega, \nabla(a)\nabla(a')\omega\rangle + O(\omega^3) . \tag{8.122} $$
The Gaussian functional integration over $\omega$ yields

$$ D^{1/2}(a, a') = \det[-\nabla(a)\nabla(a')]^{1/2} \sim \kappa(a)\kappa(a') , \tag{8.123} $$
where $\kappa^2(a) \sim \det[i\nabla(a)]$ is the Faddeev–Popov determinant in the chosen gauge (cf. Section 8.6). One should be careful when taking the square root in (8.123) for other stationary points, i.e., when $a' \to \hat Ra'$, where $\hat R$ is from the affine Weyl group, or when $a$ or $a'$ is outside the modular domain, which is the Weyl cell. By making use of the representation (5.38)–(5.40) and analyticity arguments similar to those given above to prove (8.121), it is not hard to be convinced that the absolute value bars must be omitted when taking the square root in (8.123). The folding of the short-time transition amplitudes can be computed along the lines of Section 8.5 and leads to the result of Section 8.6. To calculate the Casimir energy $E_C$, the higher-order corrections must be taken into account in addition to the leading term of the stationary phase approximation, as has been explained with the example of the SO(N) model. The effects on the energy spectrum caused by the modification of the path integral (due to the sum over Gribov copies) can be found from the pole structure of the trace of the resolvent
$$ {\rm tr}\,\hat R(q) = {\rm tr}\,(q - i\hat H)^{-1} = \int_0^\infty dt\,e^{-qt}\,{\rm tr}\,\hat U^D_t , \tag{8.124} $$

$$ {\rm tr}\,\hat U^D_t = \int_K du\,\mu(u)\,U^D_t(u, u) . \tag{8.125} $$

In particular, thanks to the sum over an infinite number of Gribov copies, the resolvent for (1+1) Yang–Mills theory has discrete poles. Thus, we have verified Feynman's conjecture for (1+1) Yang–Mills theory without any use of the operator formalism.

To illustrate the effects of curvature of the orbit space, we consider a simple gauge matrix model of Section 4.8. Let $x$ be a real $2\times2$ matrix subject to the gauge transformations $x \to \Omega(\omega)x$ where $\Omega \in SO(2)$. An invariant scalar product reads $(x, x') = {\rm tr}\,x^Tx'$ with $x^T$ being the transposed matrix $x$. The total configuration space is $\mathbb R^4$. Let $T$ be a generator of SO(2). Then $\Omega(\omega) = \exp(\omega T)$. The Gauss law enforced by the projection, $\sigma = (\dot x, Tx) = 0$, is not integrable. We parameterize the orbit space by triangular matrices $\rho$, $\rho_{21} \equiv 0$ (the gauge $x_{21} = 0$). The residual gauge transformations form the group $S_\chi = Z_2$: $\rho \to \pm\rho$. The modular domain is a positive half-space $\rho_{11} > 0$. According to the analysis of Section 4.8, we have $\mu(\rho) = \rho_{11}$ (the Faddeev–Popov determinant). The plane $\rho_{11} = 0$ is the Gribov horizon. The averaging measure in (8.87) reads $(2\pi)^{-1}\int_0^{2\pi}d\omega$. The quadratic form in the exponential in (8.87) has the form

$$ (\rho - e^{\omega T}\rho')^2 = (\rho, \rho) + (\rho', \rho') - 2(\rho, \rho')\cos\omega - 2(\rho, T\rho')\sin\omega . \tag{8.126} $$
A distinguishing feature of this model compared with those considered above is that the stationary point is a function of $\rho$ and $\rho'$. Taking the derivative of (8.126) with respect to $\omega$ and setting it to zero, we find

$$ \omega_c = \tan^{-1}\frac{(\rho, T\rho')}{(\rho, \rho')}, \qquad \omega^s_c = \omega_c + \pi . \tag{8.127} $$
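The stationary point (8.127) and the quadratic form that it induces on the gauge fixing surface can be checked numerically. The following is a minimal sketch under stated assumptions: $T$ is taken as the matrix with rows $(0, -1)$ and $(1, 0)$, `atan2` is used to select the branch of $\tan^{-1}$ that minimizes the distance, and the sample matrices are arbitrary. The function names are illustrative only.

```python
import math

T = ((0.0, -1.0), (1.0, 0.0))  # assumed SO(2) generator, T*T = -1


def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inner(a, b):
    """Invariant scalar product (a, b) = tr(a^T b)."""
    return sum(a[i][j] * b[i][j] for i in range(2) for j in range(2))


def dist2(rho, rho_p, omega):
    """Squared distance (rho - exp(omega T) rho')^2 in the form (8.126)."""
    return (inner(rho, rho) + inner(rho_p, rho_p)
            - 2 * inner(rho, rho_p) * math.cos(omega)
            - 2 * inner(rho, matmul(T, rho_p)) * math.sin(omega))


def omega_c(rho, rho_p):
    """Stationary point (8.127) on the branch where the distance is minimal."""
    return math.atan2(inner(rho, matmul(T, rho_p)), inner(rho, rho_p))


def metric_form(rho, delta):
    """(delta, delta) minus the squared projection of delta on the orbit direction T rho."""
    return inner(delta, delta) - inner(delta, matmul(T, rho)) ** 2 / inner(rho, rho)


rho = ((1.2, 0.7), (0.0, -0.4))          # triangular, rho_21 = 0
delta = ((1e-3, -2e-3), (0.0, 1.5e-3))   # small displacement within the gauge
rho_p = tuple(tuple(rho[i][j] - delta[i][j] for j in range(2)) for i in range(2))
d_min = dist2(rho, rho_p, omega_c(rho, rho_p))
```

Here `d_min` is the squared distance between the two gauge orbits; for small `delta` it agrees with `metric_form(rho, delta)` and is smaller than the squared distance `inner(delta, delta)` measured within the gauge fixing surface.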
The second stationary point $\omega^s_c$ is associated with the Gribov transformation $\rho \to -\rho$. A geometrical meaning of the transformation $\rho' \to \exp(\omega_cT)\rho'$ is transparent. The distance $[(\rho - \rho')^2]^{1/2}$ between two points on the gauge fixing plane is greater than the minimal distance between the two gauge orbits through $x = \rho$ and $x' = \rho'$. By shifting $x'$ along the gauge orbit to $x'_c = \exp(\omega_cT)\rho'$, a minimum of the distance between the orbits is achieved. In such a way the metric on the orbit space emerges in the projection formalism. To find its explicit form we substitute $\omega = \omega_c(\rho, \rho')$ into (8.126), set $\rho' = \rho - \Delta$ and expand (8.126) in a power series in $\Delta$. The quadratic term (the leading term) determines the metric. We get

$$ (\Delta, g^{\rm ph}(\rho)\Delta) = (\Delta, \Delta) - (\Delta, T\rho)(T\rho, \Delta)/(\rho, \rho) , \tag{8.128} $$
which coincides with the metric (4.55). In the stationary phase approximation the cosine and sine in (8.126) should be expanded up to fourth order in the vicinity of the stationary point. In this model quantum corrections do not vanish. The short-time transition amplitude on the orbit space is

$$ U^D_\epsilon(\rho, \rho') = D^{-1/2}(\rho, \rho')\,\tilde U_\epsilon(\rho, \rho') + D^{-1/2}(\rho, -\rho')\,\tilde U_\epsilon(\rho, -\rho') , \tag{8.129} $$

$$ \tilde U_\epsilon(\rho, \rho') = (2\pi i\epsilon)^{-3/2}e^{iS_\epsilon(\rho, \rho')} , \tag{8.130} $$

$$ S_\epsilon(\rho, \rho') = \frac{1}{2\epsilon}\big[(\rho, \rho) + (\rho', \rho') - 2D(\rho, \rho')\big] - \frac{\epsilon}{8D(\rho, \rho')} - \epsilon V(\rho) , \tag{8.131} $$

where $-2D(\rho, \rho')$ is given by the two last terms in Eq. (8.126) at the stationary point $\omega = \omega_c$. In the continuum limit it can be written in the form

$$ D(\rho, \rho') = \mu(\rho)\mu(\rho')\,{\det}^{-1}g^{\rm ph}(\rho') + O(\Delta^2) . \tag{8.132} $$
Here we have used an explicit form of the metric (8.128) and $\mu = \rho_{11}$ to compute $\det g^{\rm ph} = \mu^2/(\rho, \rho)$. As before, an analytic continuation must be applied to obtain $D^{-1/2}$ outside the modular domain $\rho_{11} > 0$. The result, expanded in a power series in $\Delta$, is obtained by taking the square root of the right-hand side of (8.132) even though $\rho$ and $\rho'$ range over the entire gauge fixing surface. The phase of $D^{-1/2}$ is determined by the sign of the Faddeev–Popov determinant $\mu$ at the points $\rho$ and $\rho'$ as the determinant of the physical metric is positive. The phase is invariant under permutations of $\rho$ and $\rho'$ in (8.132) because terms $O(\Delta^2)$ in $\det g^{\rm ph}$ do not affect it. The leading term in (8.132) specifies the phase of $D^{-1/2}(\rho, \rho')$ in the continuum limit. According to (8.99)–(8.100), the folding of $N+1$ kernels (8.129) would contain the following density:

$$ \frac{\lvert\mu_1\rvert\cdots\lvert\mu_N\rvert}{(D_{N+1,N}\,D_{N,N-1}\cdots D_{1,0})^{1/2}} = \frac{\prod_{k=0}^{N}{\det}^{1/2}g^{\rm ph}_k}{(\mu_{N+1}\mu_0)^{1/2}} \tag{8.133} $$
with $N$ being the number of integrations in the folding; $\mu_k = \mu(\rho_k)$, $D_{k,k-1} = D(\rho_k, \rho_{k-1})$, etc., $k = 0, 1, \ldots, N+1$, and $\rho_{0,N+1}$ are the initial and final configurations, respectively. All terms $O(\Delta^2)$ are assumed to have been converted into $O(\epsilon)$ by means of the equivalence rule (8.104). In the numerator of the right-hand side of (8.133), the density at the initial state, ${\det}^{1/2}g^{\rm ph}_0$, can be replaced by the density at the final state, ${\det}^{1/2}g^{\rm ph}_{N+1}$. The choice depends on the base point (pre-point or post-point) in the definition of the path integral on a curved space. In other words, the short-time action (8.131) in the amplitude (8.130) can be expanded in powers of $\Delta$ either at the point $\rho$ (post-point) or at the point $\rho'$ (pre-point). The two representations differ by terms of order $\epsilon$. We have chosen the pre-point expansion in $D$ (cf. (8.132)) and $S_\epsilon$. The base point can be changed by means of the equivalence rules (8.104). If we make a Fourier transformation for $\Delta$ in each kernel (8.130) involved in the folding, the numerator of (8.133) would cancel against the same factor resulting from the integrals over momentum variables, thus producing a local Liouville measure in the formal continuum limit. Note that the number of momentum integrals should be exactly $N+1$ because it exceeds by one the number of integrals over configurations (see Section 8.1).

8.9. Instantons and the phase space structure

Here we discuss the simplest consequences of the modification of the path integral for instanton calculus in gauge quantum mechanics. The instantons are used in quantum theory to calculate the tunneling effects [149,102]. Consider a one-dimensional quantum system with a periodic potential [149]. The ground state in the vicinity of each potential minimum is degenerate. The degeneracy is removed due to the tunneling effects, and the ground state turns into a zone. It appears that knowledge of the solutions of the Euclidean equations of motion (the equations of motion in the imaginary time $t \to -i\tau$) allows one to approximately calculate the energy levels in the zone and find the corresponding wave functions (the $\theta$-vacua).

Let us take the SU(2) model from Section 3 with the periodic potential $V(x) = 1 - \cos[(x, x)^{1/2}]$. Since the cosine is an even function, the potential is a regular function of the only independent Casimir polynomial $P_2(x) = (x, x)$. The analogous one-dimensional model has been well studied (see, e.g., [149] and references therein). In our case the phase space of the only physical degree of freedom is a cone. Consider the Euclidean version of the theory. In the Lagrangian (4.1) we replace $t \to -i\tau$ and $y \to iy$. Recall that $y$ is analogous to the time component of the Yang–Mills potential, which requires the factor $i$ in the Euclidean formulation [149]. The Lagrangian assumes the form $L \to L_E = (D_\tau x)^2/2 + V(x)$. The dynamics of the only physical degree of freedom is described by the element of the Cartan subalgebra $x = h\lambda_1 \in H$ ($\lambda_1$ is the only basis element of $H \sim \mathbb R$, $(\lambda_1, \lambda_1) = 1$). The solutions of the Euclidean equations of motion

$$ \frac{d}{d\tau}\frac{\partial L_E}{\partial\dot x} = \frac{\partial L_E}{\partial x}, \qquad \frac{\partial L_E}{\partial\dot y} = 0 , \tag{8.134} $$

where the overdot denotes the Euclidean time derivative $\partial_\tau$, depend on the arbitrary functions $y = y(\tau)$ whose variations imply the gauge transformations of the classical solutions $x(\tau)$ (see Section 4.1 for details). Removing the gauge arbitrariness by imposing the condition $y = 0$, we get
the following equation for $h$ (cf. (4.8)):

$$ \ddot h = \sin h . \tag{8.135} $$
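Equation (8.135) is the Euclidean pendulum equation, and the kink profile $4\tan^{-1}e^{\tau-\tau_c} + 2\pi m$ (the instanton solution (8.136)) can be tested against it numerically. The sketch below is an illustration only: the central finite-difference second derivative, the step size and the function names are assumptions made for the check.

```python
import math


def instanton(tau, tau_c=0.0, m=0):
    """Kink profile 4*atan(exp(tau - tau_c)) + 2*pi*m."""
    return 4 * math.atan(math.exp(tau - tau_c)) + 2 * math.pi * m


def residual(tau, step=1e-3):
    """Finite-difference estimate of h'' - sin(h) on the instanton profile."""
    second = (instanton(tau + step) - 2 * instanton(tau)
              + instanton(tau - step)) / step ** 2
    return second - math.sin(instanton(tau))


for tau in (-3.0, -1.0, 0.0, 0.5, 2.0):
    # the residual vanishes up to discretization error
    assert math.isclose(residual(tau), 0.0, abs_tol=1e-5)
```

The profile interpolates between $h = 2\pi m$ at $\tau \to -\infty$ and $h = 2\pi(m+1)$ at $\tau \to \infty$, i.e., between neighbouring minima of the potential.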
The instanton solution of Eq. (8.135) has the form [149]

$$ h(\tau) = h_{\rm inst}(\tau) = 4\tan^{-1}\exp(\tau - \tau_c) + 2\pi m, \qquad \tau_c = {\rm const} . \tag{8.136} $$

It connects the local minima of the potential: $x^2_{\rm inst} \to (2\pi m)^2$ as $\tau \to -\infty$, and $x^2_{\rm inst} \to [2\pi(m+1)]^2$ as $\tau \to \infty$, where $x_{\rm inst}(\tau) = h_{\rm inst}(\tau)\lambda_1$ in the chosen gauge.

Eq. (8.135) is the same as in the analogous one-dimensional model $L_E = \dot h^2/2 + 1 - \cos h$, $h \in \mathbb R$, i.e., with the Euclidean phase space $\mathbb R^2$. For this model the wave function of the $\theta$-vacuum is calculated as follows [149]. First, one finds the amplitude $U_\tau(2\pi m, 2\pi m')$ in the semiclassical approximation of the corresponding path integral. The instanton solution serves as the stationary point. In the limit $\tau \to \infty$, the main contribution comes from the states of the lowest zone (the contributions of higher levels are exponentially suppressed):

$$ U_\tau(2\pi m, 2\pi m') = \langle 2\pi m\vert e^{-\tau\hat H}\vert 2\pi m'\rangle \approx \int_0^{2\pi}d\theta\,\langle 2\pi m\vert\theta\rangle\langle\theta\vert 2\pi m'\rangle\,e^{-\tau E_\theta} , \tag{8.137} $$

as $\tau \to \infty$, where $\theta$ parameterizes the energy levels $E_\theta$ in the lowest zone. The amplitude $\langle 2\pi m\vert\theta\rangle$ is obtained by calculating the path integral in the semiclassical approximation for the instanton solution (8.136). The details can be found in [149], where it is shown that ($\tau \to \infty$)

$$ U_\tau(2\pi m, 2\pi m') \approx \int_0^{2\pi}\frac{d\theta}{2\pi^{3/2}}\,e^{-i(m-m')\theta}\,e^{-\tau E_\theta} , \tag{8.138} $$

$$ E_\theta = \frac12 - e^{-S_0}S_0K\cos\theta , \tag{8.139} $$

where $S_0$ is the instanton action and $K$ is a constant independent of $\theta$ (the instanton determinant [149]). The amplitude $\langle 2\pi m\vert\theta\rangle \sim \exp(-im\theta)$ follows from the comparison of (8.137) and (8.138). It specifies the value of the vacuum wave function $\langle h\vert\theta\rangle$ in the local minima of the potential, $h = 2\pi m$. Therefore the wave function $\langle h\vert\theta\rangle$ can be approximated by the superposition

$$ \langle h\vert\theta\rangle \approx c\sum_{m=-\infty}^{\infty}e^{-im\theta}\langle h\vert 2\pi m\rangle , \tag{8.140} $$

where $\langle h\vert 2\pi m\rangle \sim \exp[-(h - 2\pi m)^2/2]$ is the ground state wave function in the oscillator approximation in the vicinity of each potential minimum.

To find how the above calculations are modified in the case when the physical degree of freedom has the conic phase space, one has to take the amplitude $U^D_\tau(2\pi m, 2\pi m')$. In Eq. (8.59) we take $W = Z_2$ (in the SU(2) case) and replace $t$ by $-i\tau$. Since the algebra su(2) is isomorphic to so(3), the amplitude is also given by (8.43) (where $r \to h$). Making use of this relation we find the modification [150,16]

$$ U^D_\tau(2\pi m, 2\pi m') \approx \int_0^{2\pi}\frac{d\theta}{\pi^{3/2}}\,\frac{\sin(m\theta)\sin(m'\theta)}{(2\pi)^2mm'}\,e^{-\tau E_\theta} . \tag{8.141} $$
Therefore the change of the phase space structure does not affect the distribution of the energy levels in the lowest zone. However, it does affect the amplitudes $\langle 2\pi m\vert\theta\rangle$, thus leading to the modification of the wave function of the $\theta$-vacuum:

$$ \langle h\vert\theta\rangle^D = c\sum_{m=-\infty}^{\infty}\frac{\sin m\theta}{2\pi m}\,\langle h\vert 2\pi m\rangle . \tag{8.142} $$

From the obvious relation $\langle -h\vert 2\pi m\rangle = \langle h\vert{-2\pi m}\rangle$ we infer that the function (8.142) is even, $\langle -h\vert\theta\rangle^D = \langle h\vert\theta\rangle^D$, i.e., invariant under the residual Weyl transformations, while (8.140) does not have a definite parity.

That the energy level distribution in the lowest zone is not sensitive to the conic phase space structure holds, in general, only for the continuum spectrum. One can make the analogy with a free particle. The change of the phase space structure from the plane to the cone has no effect on the spectrum. The latter would not be the case for systems with a discrete spectrum, like the harmonic oscillator. A similar phenomenon might be expected for the instantons. Consider, for example, the double well potential $V = (x^2 - v^2)^2$ and the gauge group SU(2). The corresponding one-dimensional system has been well studied [102]. In this model the ground state is doubly degenerate, $x = \pm v\lambda_1$. The lowest zone contains two levels. The lower level has an odd wave function, while the upper one has an even wave function. Such an unusual parity property (typically one expects the lowest level to have an even wave function) is a consequence of the fact that wave functions of the corresponding one-dimensional system have to be multiplied by the odd density factor $\kappa^{-1}(h)$ (cf. Section 7.3) to get the wave functions of the gauge system. The reduction of the phase space from the plane to the cone shows that the odd functions are to be excluded. So, there would be only one (upper) energy level in the lowest zone.

The analysis of more complicated gauge systems would not add essentially new features. Given a classical solution, one should evaluate the path integral in the semiclassical approximation, multiply it by the Faddeev–Popov determinant (the density) at the initial and final configurations raised to the power $-1/2$ as prescribed by (8.101), and symmetrize the result relative to the residual gauge transformations.

8.10. The phase space of gauge fields in the minisuperspace cosmology

Another simple effect due to the non-Euclidean structure of the physical phase space in gauge theories can be found in the minisuperspace (quantum) cosmology. Consider the Einstein–Yang–Mills theory. The theory is complicated for a general analysis, but one can introduce a set of simplifying assumptions and consider closed cosmologies with an $\mathbb R\times S^3$ topology. These are known as minisuperspace cosmological models [151]. They are used to study the wormhole dynamics [154]. Wormholes are Riemannian manifolds which have two or more asymptotically Euclidean regions. They are believed to play an important role in quantum gravity [151–153]. It is known, however, that there are no wormhole solutions of the Einstein equations in vacuum [158–160]. The presence of matter changes the situation [158]. We consider the case when only gauge fields are present. The gauge fields on a homogeneous space are described by the SO(4)-invariant Ansatz [154–156]. The reduced system contains only a finite number of degrees of freedom of gravitational and gauge fields. Our primary interest will be to find effects caused by the non-Euclidean geometry of the physical phase space of the Yang–Mills fields.
In the minisuperspace approach to the Einstein–Yang–Mills system, the prototype of a four-dimensional wormhole may be described by the SO(4) symmetric metric. The most general form of such a metric, i.e., a metric which is spatially homogeneous and isotropic in the spacetime of the $\mathbb R\times S^3$ topology, is given by the Friedmann–Robertson–Walker Ansatz [157]

$$ g_{\mu\nu}dx^\mu dx^\nu = \frac{2G_g}{3\pi}\big[-N^2(t)dt^2 + \rho^2(t)\theta^i\theta^i\big] , \tag{8.143} $$

where $N(t)$ and $\rho(t)$ are arbitrary nonvanishing functions of time, $G_g$ is the gravitational constant and $\theta^i$ are the left-invariant one-forms ($i = 1, 2, 3$) on the three-sphere $S^3$ satisfying the condition $d\theta^i = -\varepsilon_{ijk}\theta^j\wedge\theta^k$. The Ansatz for gauge fields in the metric (8.143) has been proposed by Verbin and Davidson [154] for the group SU(2) and generalized to an arbitrary group in Refs. [156,155]. The gauge fields with the SO(n) group, $n > 3$, are described by a scalar $z(t) \in \mathbb R$, a vector $x(t) \in \mathbb R^l$, $l = n - 3$, and a real antisymmetric $l\times l$ matrix $y = y^a\lambda_a$ with $\lambda_a$ being generators of SO(l). The effective Einstein–Yang–Mills action reads

$$ S = \frac12\int dt\,\frac{N}{\rho}\left\{-\left(\frac{\rho}{N}\dot\rho\right)^2 + \left(\frac{\rho}{N}\dot z\right)^2 + \left(\frac{\rho}{N}D_tx\right)^2 - 2V\right\} , \tag{8.144} $$

where $D_t$ is the covariant derivative introduced for the SO(n) models in Section 3. The potential has the form

$$ V = \frac{\alpha_g}{3\pi}\left[\left(z^2 + x^2 - \frac{3\pi}{2\alpha_g}\right)^2 + 4z^2x^2\right] - \frac12\rho^2 + \frac{\lambda}{2}\rho^4 , \tag{8.145} $$

with $\alpha_g = g^2/(4\pi)$ being the Yang–Mills coupling constant, $\lambda = 2G_g\Lambda/(9\pi)$, and $\Lambda$ the cosmological constant. The action is invariant under the gauge transformations (3.2) and time reparameterizations

$$ t \to t'(t), \qquad N(t) \to N(t')\frac{dt'}{dt} . \tag{8.146} $$
Therefore our analysis of the phase space structure of the gauge fields applies here. The gauge fields have two physical degrees of freedom. One has a planar phase space ($z$ is gauge invariant) while the other physical degree of freedom $\lvert x\rvert$ has a conic phase space just like in the model discussed in Section 3. A quantum theory can be developed by the methods discussed in Sections 7.2 and 8.7. The corresponding path integral has been obtained in [161]. It has the same structure as the one derived in Section 8.7, and, hence, leads to a modification of the semiclassical approximation of the path integral where wormhole solutions play the role of a stationary point. Here, however, we study only the classical effects caused by the non-Euclidean structure of the physical phase space on the wormhole dynamics, in particular, on the wormhole size quantization.

The wormhole size quantization was first observed by Verbin and Davidson [154] for Yang–Mills fields with the group SU(2). In this case SU(2) $\sim$ SO(3), i.e., $l = 0$ in the minisuperspace model. So, the physical phase space is a plane. We need gauge groups of higher rank to see the effect of the non-Euclidean structure of the physical phase space.

The wormholes are solutions to the Euclidean equations of motion ($t \to -i\tau$, $y \to iy$) for the action (8.144) with a particular behavior for $\rho(\tau)$: $\rho^2(\tau) \sim \tau^2$ as $\tau \to \pm\infty$. The simplest example of
the wormhole is known as the Tolman wormhole [158]. This is a closed radiation-dominated universe, and

$$ \rho^2(\tau) = 4b^2 + \tau^2 . \tag{8.147} $$

The positive constant $b$ is identified as the wormhole radius (or size). The idea is to find solutions of the minisuperspace Einstein–Yang–Mills system which have the same asymptotic behavior as (8.147). It turns out that such solutions exist, provided the constant $b$ is quantized [154]:

$$ b = b_n \sim \Lambda^{-1/2}\exp(-\pi n/\sqrt2) . \tag{8.148} $$

In the gauge sector the solutions are determined modulo gauge transformations associated with various choices of the Lagrange multiplier $y(\tau)$. So we are free to fix the gauge so that $x_i(\tau) = \delta_{i1}x(\tau)$ (cf. Section 3.2). The time reparameterization gauge freedom is fixed by going over to the conformal time $d\eta = d\tau/\rho(\tau)$. The use of the conformal time has the advantage that the equations of motion for $\rho$ and the gauge fields are decoupled. From the action principle we find

$$ \frac{d^2x}{d\eta^2} = \frac{\partial V}{\partial x}, \qquad \frac{d^2z}{d\eta^2} = \frac{\partial V}{\partial z} . \tag{8.149} $$

On any line $x = az$ the potential (8.145) has the form of the double well. Therefore the Euclidean equations of motion (8.149) should have periodic solutions oscillating around the local minimum $x = z = 0$ of the Euclidean potential $-V$. For every periodic solution in the gauge sector, one can find a periodic solution for $\rho$ [155]. The solution $\rho(\eta)$ is interpreted as a wormhole connecting two points in the same space. Therefore the gauge fields should be the same at both sides of the wormhole. Since $z(\eta)$ and $x(\eta)$ are periodic (with the periods $T_{z,x}$), the period $T_\rho$ (the Euclidean time between two $\rho$-maxima) should be an integer multiple of their periods [154]

$$ T_\rho = nT_z = mT_x . \tag{8.150} $$

Relation (8.150) leads to the exponential quantization of the wormhole size [154,155]. For the gauge group SU(2), the integer $n$ determines the wormhole size quantization (8.148). For the group SO(n), $n > 3$, the wormhole size depends on both the integers $n$, $m$ [155].

The phase space of the $x$-degree of freedom is a cone unfoldable into a half-plane. Since $x(\eta)$ oscillates around the origin $x = 0$, the corresponding phase-space trajectory winds about the phase-space origin. Therefore the physical degree of freedom $x$ needs half the time to return to the initial state (see Section 3), that is, $T^{\rm ph}_x = \tfrac12T_x$, thus leading to the modification of the wormhole size quantization rule

$$ T_\rho = nT_z = mT^{\rm ph}_x = \frac{m}{2}T_x . \tag{8.151} $$
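The difference between the matching rules (8.150) and (8.151) can be made concrete with a small arithmetic sketch. It assumes, purely for illustration, that the periods $T_z$ and $T_x$ are commensurate rational numbers and looks for the smallest common period; the function names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def common_period(t1, t2):
    """Smallest T > 0 with T = n*t1 = m*t2 for positive integers n, m.
    Returns the tuple (T, n, m)."""
    t1, t2 = Fraction(t1), Fraction(t2)
    # lcm of reduced fractions p1/q1 and p2/q2 is lcm(p1, p2) / gcd(q1, q2)
    num = t1.numerator * t2.numerator // gcd(t1.numerator, t2.numerator)
    den = gcd(t1.denominator, t2.denominator)
    period = Fraction(num, den)
    return period, int(period / t1), int(period / t2)


def wormhole_period(t_z, t_x, conic):
    """Matching rule (8.150) for a planar phase space of x,
    rule (8.151) with T_x^ph = T_x / 2 for a conic one."""
    t_x_eff = Fraction(t_x) / 2 if conic else Fraction(t_x)
    return common_period(t_z, t_x_eff)


planar = wormhole_period(1, 2, conic=False)  # T_rho = 2*T_z = 1*T_x
conic = wormhole_period(1, 2, conic=True)    # T_rho = 1*T_z = 1*T_x^ph
```

In this example the smallest admissible $T_\rho$ is halved when the conic structure of the phase space is taken into account, so the set of allowed integers $(n, m)$ changes.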
If the theory contains fields realizing different representations of the gauge group, the periods of their physical oscillations would be determined by degrees of the independent Casimir operators for a given representation [161]. The modification of the wormhole size quantization would have an effect on quantum tunneling in quantum gravity involving wormholes. The minisuperspace quantum theory with gauge fields and fermions is discussed in [162].
9. Including fermions

So far we have investigated the effects of the non-Euclidean geometry of the physical phase space on classical and quantum dynamics of bosonic systems with gauge symmetry. In realistic models, gauge and fermionic fields are typically coupled in a gauge invariant way. The fermionic degrees of freedom are also subject to gauge transformations. However, they are described by Grassmann (anticommutative) variables, so one cannot eliminate non-physical degrees of freedom in a gauge theory by imposing a gauge in the fermionic sector. A total configuration or phase space of the system can be regarded as a superspace spanned by some number of bosonic and Grassmann variables [139–141]. Definitions (2.1) and (2.2) of the physical phase and configuration spaces apply in this case too. If, when calculating the quotient spaces (2.1) or (2.2), one eliminates non-physical degrees of freedom by fixing a gauge in the bosonic sector, then the residual gauge transformations, which may occur when the topology of the gauge orbits is non-trivial, would act on both physical bosonic and fermionic variables of the corresponding superspace, thus changing its structure significantly after identifying gauge equivalent configurations.

The aim of the subsequent analysis is to investigate the effects of non-Euclidean geometry of the physical configuration and phase spaces in gauge models with fermionic degrees of freedom. We will see that the kinematic coupling of bosonic degrees of freedom, which is essentially due to the non-trivial phase-space geometry, occurs for fermionic degrees of freedom too, which, in turn, affects their quantum dynamics.

9.1. 2D SUSY oscillator with a gauge symmetry

Consider a simple supersymmetric extension of the SO(2) gauge model of the isotropic oscillator. The Lagrangian reads
$$L = \tfrac{1}{2}(\dot{\mathbf{x}} - yT\mathbf{x})^2 + i\boldsymbol{\psi}^*(\dot{\boldsymbol{\psi}} - iy\Gamma\boldsymbol{\psi}) - \tfrac{1}{2}\mathbf{x}^2 - \boldsymbol{\psi}^*\boldsymbol{\psi} . \tag{9.1}$$
Here $\boldsymbol{\psi}$ is a two-dimensional vector with complex Grassmann components $\psi_i$, $i = 1, 2$. If $\theta_{1,2}$ are two (real) Grassmann elements, $\theta_{1,2}^2 = 0$, then we can define a complex Grassmann element by $\psi = \theta_1 + i\theta_2$ and $\psi^* = \theta_1 - i\theta_2$. We also have the rule $(c\psi_1\psi_2)^* = c^*\psi_2^*\psi_1^*$, where $c$ is a complex number. The matrix $T$ is a generator of SO(2) as before, and $\Gamma$ is a diagonal matrix, $\Gamma_{11} = -\Gamma_{22} = 1$. The Lagrangian is invariant under the gauge transformations
$$\mathbf{x} \to e^{\omega T}\mathbf{x}, \qquad \boldsymbol{\psi} \to e^{i\omega\Gamma}\boldsymbol{\psi}, \qquad y \to y + \dot{\omega} . \tag{9.2}$$
To construct the Hamiltonian formalism for this model, we have to deal with the second class constraints in the fermionic sector because the Lagrangian is linear in the velocities $\dot{\boldsymbol{\psi}}$ and $\dot{\boldsymbol{\psi}}^*$. The usual way is to introduce the Dirac bracket and solve the second class constraints [6]. We observe, however, that in any first-order Lagrangian the term linear in velocities, like $i\boldsymbol{\psi}^*\dot{\boldsymbol{\psi}}$, can be regarded as a symplectic one-form. So the corresponding symplectic structure is obtained by taking its exterior derivative. The same symplectic structure emerges if one proceeds along the lines of the Dirac treatment of the second class constraints. Therefore, we simply assume that the variables $\boldsymbol{\psi}$ and $\boldsymbol{\psi}^*$ are canonical variables in the fermionic sector and $\{\psi_j, \psi_k^*\} = \{\psi_k^*, \psi_j\} = -i\delta_{jk}$ by definition. That is, the action with the Lagrangian (9.1) should be regarded as the Hamiltonian action for the fermionic degrees of freedom. On a phase space that is a supermanifold the symplectic
structure has the following parity transformation property [142,143]:
$$\{A, B\} = -(-1)^{p_A p_B}\{B, A\} , \tag{9.3}$$
where $p_A$ is the Grassmann parity of the function $A$, i.e., $p_A$ is zero if $A$ is an even element of the Grassmann algebra, and one if $A$ is odd. The Poisson bracket for odd functions is symmetric, while for even functions it is antisymmetric. The Hamiltonian of the model reads
$$H = \tfrac{1}{2}\mathbf{p}^2 + \tfrac{1}{2}\mathbf{x}^2 + \boldsymbol{\psi}^*\boldsymbol{\psi} - y\sigma , \tag{9.4}$$
where the secondary constraint
$$\sigma = \mathbf{p}T\mathbf{x} + \boldsymbol{\psi}^*\Gamma\boldsymbol{\psi} = 0 \tag{9.5}$$
generates simultaneous gauge transformations in the bosonic and Grassmann sectors of the phase space. In classical theory, solutions to the equations of motion are elements of the superspace, i.e., $\mathbf{x} = \mathbf{x}(t)$ is a general even element of the Grassmann algebra generated by the initial values $\boldsymbol{\psi}_0^* = \boldsymbol{\psi}^*(0)$ and $\boldsymbol{\psi}_0 = \boldsymbol{\psi}(0)$. Note that a generic interaction $V = V(\mathbf{x}, \boldsymbol{\psi}, \boldsymbol{\psi}^*)$ between fermions and bosons would require such an interpretation of the classical dynamics on the superspace [16] because the time derivatives $\dot{\mathbf{x}}$ and $\dot{\psi}$ are respectively generic even and odd functions on the superspace. Since there is no preference in the choice of the initial moment of time, the initial configurations of the bosonic coordinates and momenta should also be regarded as generic even elements of the Grassmann algebra. Therefore the constraint (9.5) is not "decoupled" into two independent constraints in the bosonic and fermionic sectors.

If the non-physical degrees of freedom are eliminated by imposing the unitary gauge $x_2 = 0$, then the residual gauge transformations would act on both the bosonic and fermionic variables
$$x_1 \to -x_1, \qquad \boldsymbol{\psi} \to -\boldsymbol{\psi} , \tag{9.6}$$
$$p_1 \to -p_1, \qquad \boldsymbol{\psi}^* \to -\boldsymbol{\psi}^* , \tag{9.7}$$
thus making the corresponding points of the configuration or phase space physically indistinguishable. Therefore the physical phase (super)space would not have a Euclidean structure. One should stress again that the gauge fixing has been used only to get local canonical coordinates on the physical phase space. The geometrical structure of the physical phase (super)space is certainly gauge independent.

To see the effects caused by the non-Euclidean structure of the physical phase space, let us turn to the Dirac quantization of gauge systems because it provides an explicitly gauge invariant description of quantum mechanics. All the degrees of freedom (except the Lagrange multiplier $y$) are canonically quantized by the rule $\{\,,\} \to -i[\,,]$:
$$[\hat{x}_j, \hat{p}_k] = i\delta_{jk}, \qquad [\hat{\psi}_j, \hat{\psi}_k^\dagger]_+ = \delta_{jk} , \tag{9.8}$$
where $[\,,]_+$ stands for the anticommutator. Observe the parity property of the symplectic structure (9.3). It is symmetric for odd variables, so upon quantization it should be turned into the anticommutator to maintain the correspondence principle. Introducing creation and destruction operators for the bosonic degrees of freedom (see Section 7.1), we write the Dirac constraint equation for the physical gauge invariant states in the form
$$\hat{\sigma}|\Phi\rangle = [\hat{\mathbf{a}}^\dagger T\hat{\mathbf{a}} + \hat{\boldsymbol{\psi}}^\dagger\Gamma\hat{\boldsymbol{\psi}}]|\Phi\rangle = 0 . \tag{9.9}$$
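The anticommutation rule in (9.8) and the fermionic part of the constraint (9.9) can be checked on explicit matrices. The sketch below is an illustration only: it represents the two fermionic modes by $4\times 4$ matrices through the Jordan–Wigner construction, which is a choice made here and is not used in the text, and all Python names are hypothetical. It builds $\hat{\boldsymbol{\psi}}^\dagger\Gamma\hat{\boldsymbol{\psi}}$ with $\Gamma = \mathrm{diag}(1, -1)$ and the pair operator $\hat{\psi}_1^\dagger\hat{\psi}_2^\dagger$.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker product of two matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def transpose(a):
    """Transpose; equals the Hermitian conjugate for these real matrices."""
    return [list(col) for col in zip(*a)]

def combine(a, b, sign):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def anticommutator(a, b):
    return combine(matmul(a, b), matmul(b, a), +1)

def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), -1)

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]      # parity string of the Jordan-Wigner map (assumed representation)
LOWER = [[0, 1], [0, 0]]   # single-mode annihilation operator

psi1 = kron(LOWER, I2)
psi2 = kron(Z, LOWER)
psi1_dag, psi2_dag = transpose(psi1), transpose(psi2)

IDENTITY = kron(I2, I2)
ZERO = [[0] * 4 for _ in range(4)]

# Fermionic part of the constraint: psi^dagger Gamma psi with Gamma = diag(1, -1).
sigma_f = combine(matmul(psi1_dag, psi1), matmul(psi2_dag, psi2), -1)
# Fermion pair creation operator psi_1^dagger psi_2^dagger.
pair_dag = matmul(psi1_dag, psi2_dag)

print(anticommutator(psi1, psi1_dag) == IDENTITY)
print(commutator(sigma_f, pair_dag) == ZERO)
```

The pair operator commutes with the fermionic charge, whereas a single $\hat{\psi}_1^\dagger$ does not, so a single fermion can only appear in a physical state together with a compensating bosonic excitation.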
Let $|0\rangle$ be the vacuum state in the Fock representation, i.e., $\hat{\mathbf{a}}|0\rangle = \hat{\boldsymbol{\psi}}|0\rangle = 0$. It is a physical state because $\hat{\sigma}|0\rangle = 0$. Then any physical state can be obtained by acting on the vacuum by a gauge invariant function of the creation operators. Thus, to construct the physical subspace, one has to find all independent gauge invariant polynomials built out of $\hat{\mathbf{a}}^\dagger$ and $\hat{\boldsymbol{\psi}}^\dagger$. These are
$$\hat{b}_1^\dagger = (\hat{\mathbf{a}}^\dagger)^2, \qquad \hat{b}_2^\dagger = \hat{\psi}_1^\dagger\hat{\psi}_2^\dagger , \tag{9.10}$$
$$\hat{f}_1^\dagger = (\hat{a}_1^\dagger + i\hat{a}_2^\dagger)\hat{\psi}_1^\dagger, \qquad \hat{f}_2^\dagger = (\hat{a}_1^\dagger - i\hat{a}_2^\dagger)\hat{\psi}_2^\dagger . \tag{9.11}$$
The operators $\hat{b}^\dagger$ create states with the bosonic parity, while $\hat{f}^\dagger$ create fermionic states. Since bosonic and fermionic degrees of freedom can only be excited in pairs, as one can see from (9.10) and (9.11), we conclude that the spectrum of the supersymmetric oscillator is
$$E_{\mathbf{n}} = 2(n_1 + n_2 + n_3 + n_4) , \tag{9.12}$$
where $n_1$ runs over all non-negative integers, while $n_{2,3,4} = 0, 1$ as a consequence of the nilpotence of the fermionic operators, $(\hat{\psi}_1^\dagger)^2 = (\hat{\psi}_2^\dagger)^2 = 0$. The physical eigenstates are
$$|\mathbf{n}\rangle = c_{\mathbf{n}}\,(\hat{b}_1^\dagger)^{n_1}(\hat{b}_2^\dagger)^{n_2}(\hat{f}_1^\dagger)^{n_3}(\hat{f}_2^\dagger)^{n_4}|0\rangle , \tag{9.13}$$
where $c_{\mathbf{n}}$ is a normalization constant. Observe the doubling of the spacing between the oscillator energy levels in both the fermionic and bosonic sectors, whereas the Hamiltonian (9.4) has the unit oscillator frequency in the potential, even after the removal of all non-physical canonical variables. Our next task is to establish this important fact in the coordinate (and path integral) approach. The goal is to show that the effect is due to the invariance of the physical states under the residual gauge transformations (9.6) acting simultaneously on both bosonic and fermionic degrees of freedom. We interpret this effect as a consequence of a non-Euclidean structure of the physical phase superspace which emerges upon the identification (9.6) and (9.7).
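The even spacing of the physical spectrum can be cross-checked by direct counting. The sketch below is a hedged illustration, not the method of the text: it assumes that the circular bosonic quanta created by $\hat{a}_1^\dagger \pm i\hat{a}_2^\dagger$ carry charges $\pm 1$ under the constraint and that $\hat{\psi}_1^\dagger$, $\hat{\psi}_2^\dagger$ carry charges $-1$ and $+1$, as read off from the invariants (9.10) and (9.11). It also assumes that the bosonic and fermionic zero-point energies cancel, so that the energy is the total occupation number. The function name `physical_levels` is hypothetical.

```python
from collections import Counter
from itertools import product

def physical_levels(max_energy):
    """Count charge-neutral Fock states at each energy up to max_energy.

    n_plus, n_minus: occupation numbers of the circular bosonic modes
    created by (a1^+ + i a2^+) and (a1^+ - i a2^+).
    m1, m2: occupation numbers of the fermionic modes psi1^+, psi2^+.
    Assumed charges under the constraint: +1, -1, -1, +1 respectively.
    """
    levels = Counter()
    for n_plus, n_minus in product(range(max_energy + 1), repeat=2):
        for m1, m2 in product((0, 1), repeat=2):
            charge = n_plus - n_minus - m1 + m2
            energy = n_plus + n_minus + m1 + m2  # zero-point terms assumed to cancel
            if charge == 0 and energy <= max_energy:
                levels[energy] += 1
    return dict(sorted(levels.items()))

levels = physical_levels(8)
print(levels)
```

Under these assumptions only even energies occur, in agreement with (9.12). The vacuum is non-degenerate and every higher level is four-fold degenerate, which matches the four families of Schrödinger eigenstates obtained in Section 9.2.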
Had we eliminated the nonphysical variable by imposing the unitary gauge and then formally canonically quantized the reduced phase-space system, we would have obtained a different spectrum with unit spacing between the energy levels.

9.2. Solving Dirac constraints in curvilinear supercoordinates

Consider the Schrödinger picture for the above quantum supersymmetric oscillator with the gauge symmetry. For the fermionic degrees of freedom we will use the coherent state representation as usual [139]. The states are functions of $\mathbf{x}$ and a complex Grassmann variable $\boldsymbol{\theta}$ so that
$$\hat{\boldsymbol{\psi}}^\dagger\Phi = \boldsymbol{\theta}^*\Phi, \qquad \hat{\boldsymbol{\psi}}\Phi = \frac{\vec{\partial}}{\partial\boldsymbol{\theta}^*}\Phi , \tag{9.14}$$
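where $\vec{\partial}$ denotes the left derivative with respect to the Grassmann variables. The scalar product reads [139]
$$\langle\Phi_1|\Phi_2\rangle = \int_{\mathbb{R}^2} d\mathbf{x}\int d\boldsymbol{\theta}^*\,d\boldsymbol{\theta}\,\exp(-\boldsymbol{\theta}^*\boldsymbol{\theta})\,[\Phi_1(\mathbf{x}, \boldsymbol{\theta}^*)]^*\,\Phi_2(\mathbf{x}, \boldsymbol{\theta}^*) . \tag{9.15}$$
The physical states are invariant under the gauge transformations generated by the constraint $\hat{\sigma}$:
$$e^{i\omega\hat{\sigma}}\Phi(\mathbf{x}, \boldsymbol{\theta}^*) = \Phi(e^{\omega T}\mathbf{x}, e^{-i\omega\Gamma}\boldsymbol{\theta}^*) = \Phi(\mathbf{x}, \boldsymbol{\theta}^*) . \tag{9.16}$$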
To solve the constraint in the Schrödinger representation, we again use curvilinear coordinates associated with a chosen gauge and the gauge transformation law. A new feature is that the change of variables should be done on the total superspace since the gauge transformations act on both commutative and anticommutative coordinates of the superspace [119]. The unitary gauge is the natural one for this model. So we introduce the new curvilinear supervariables $r$, $\varphi$ and $\boldsymbol{\xi}$ by the relations [144]
$$\mathbf{x} = e^{\varphi T}\mathbf{f}(r), \qquad \boldsymbol{\theta}^* = e^{-i\varphi\Gamma}\boldsymbol{\xi}^* , \tag{9.17}$$
where the vector $\mathbf{f}$ has only one component, $f_i = \delta_{1i}\,r$. In the bosonic sector, the new variables are nothing but the polar coordinates. However, the angular variable $\varphi$ also appears in the Grassmann sector as a parameter of the generic gauge transformation there. The variables $r$ and $\boldsymbol{\xi}$ are gauge invariant because the gauge transformations are translations of $\varphi$. Indeed, following the rules of changing variables on the superspace [140] we find
$$\frac{\partial}{\partial\varphi} = \frac{\partial\mathbf{x}}{\partial\varphi}\frac{\partial}{\partial\mathbf{x}} + \frac{\partial\boldsymbol{\theta}^*}{\partial\varphi}\frac{\partial}{\partial\boldsymbol{\theta}^*} = (T\mathbf{x})\frac{\partial}{\partial\mathbf{x}} - i(\Gamma\boldsymbol{\theta}^*)\frac{\partial}{\partial\boldsymbol{\theta}^*} = -i\hat{\sigma} . \tag{9.18}$$
In the new variables the constraint operator is just the momentum conjugate to $\varphi$. We stress the importance of changing variables on the total configuration superspace to achieve this result. In this sense the idea of solving the constraints via the curvilinear coordinates associated with the chosen gauge and the gauge transformation law has a straightforward generalization to gauge systems with fermions.

Next, we have to find a physical Hamiltonian. This requires a calculation of the Laplace–Beltrami operator in the curvilinear supercoordinates. Let us derive it for the special case when the change of variables is linear in the generators of the Grassmann algebra [119]. This is sufficient to analyze any gauge model with fermions because the gauge transformations are usually linear transformations in the fermionic sector. Let $x$ be a vector from $\mathbb{R}^N$ and $\theta$ an $M$-vector whose components are complex Grassmann variables. Consider a change of variables
$$x = x(y), \qquad \theta^* = \Omega(y)\xi^* , \tag{9.19}$$
where $\Omega$ is an $M\times M$ matrix. Let $q$ and $Q$ be collections of the old and new supercoordinates, respectively. Then taking the differential of the relations (9.19) we find the supermatrix $A = A(Q)$ such that $dq = A(Q)\,dQ$. From this relation follows the transformation law of the partial derivatives, $\partial/\partial q = A^{-1T}(Q)\,\partial/\partial Q$. In particular, we find
$$\frac{\partial}{\partial x^k} = B_k^j(y)\left(\frac{\partial}{\partial y^j} + i\pi_j\right) , \tag{9.20}$$
$$\pi_j = i\xi^*\left(\Omega^{-1}\frac{\partial\Omega}{\partial y^j}\right)^{T}\frac{\partial}{\partial\xi^*} , \tag{9.21}$$
where $B_k^j = [(\partial x/\partial y)^{-1}]_k^j$. The second term on the right-hand side of Eq. (9.20) results from the dependence of the new Grassmann variables on the bosonic variables. Making use of the relations (9.20) and (9.21) we can write the kinetic energy operator in the new curvilinear
supercoordinates
$$-\frac{1}{2}\Delta_{(N)} = \frac{1}{2}\hat{P}_j\,g^{jk}\hat{P}_k + V_q , \tag{9.22}$$
$$\hat{P}_k = -i\mu^{-1/2}(\partial_k + i\pi_k)\mu^{1/2} , \tag{9.23}$$
where $\partial_k = \partial/\partial y^k$, and the quantum potential $V_q$ has the form (7.101) where the Jacobian is given by the Berezinian (or superdeterminant) $\mu = \mathrm{sdet}\,A$ and $g^{jk} = \delta^{mn}B_m^j B_n^k$. The Jacobian depends only on $y$ since the change of variables is linear in the Grassmann sector. If the change of variables $q = q(Q)$ is invariant under the discrete transformations, $q = q(Q) = q(\hat{R}Q)$, then the domain of the new bosonic variables should be restricted to the modular domain $K \sim \mathbb{R}^N/S$, where $S$ is formed by all transformations $\hat{R}$, that is,
$$\int dx\,\phi = \int_K dy\,\mu(y)\,\phi . \tag{9.24}$$
For example, the change of variables (9.17) is invariant under the transformations
$$r \to (-1)^n r, \qquad \varphi \to \varphi + \pi n, \qquad \boldsymbol{\xi} \to (-1)^n\boldsymbol{\xi} . \tag{9.25}$$
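The role of the modular domain in (9.24) can be seen in the bosonic sector alone. The following sketch is an illustration under stated assumptions: it uses ordinary polar coordinates with Jacobian $|r|$, a Gaussian test function, and a simple midpoint rule, and the helper names are hypothetical. It checks that the map is invariant under the shift in (9.25) with $n = 1$ and that integrating $r$ over the half-line reproduces the plane integral, while integrating over the whole real line counts every point twice.

```python
import math

def polar_map(r, phi):
    """Bosonic part of the change of variables: x = r (cos phi, sin phi)."""
    return (r * math.cos(phi), r * math.sin(phi))

def radial_integral(r_min, r_max, steps=20000):
    """Midpoint rule for the integral of |r| exp(-r^2) over [r_min, r_max]."""
    h = (r_max - r_min) / steps
    total = 0.0
    for i in range(steps):
        r = r_min + (i + 0.5) * h
        total += abs(r) * math.exp(-r * r)
    return total * h

# Invariance of the map under (r, phi) -> (-r, phi + pi).
a = polar_map(1.3, 0.4)
b = polar_map(-1.3, 0.4 + math.pi)

# Integral of exp(-x^2) over the plane equals pi.
on_modular_domain = 2 * math.pi * radial_integral(0.0, 8.0)
on_covering_space = 2 * math.pi * radial_integral(-8.0, 8.0)
print(a, b, on_modular_domain, on_covering_space)
```

The restriction to $r \geq 0$ gives $\pi$, whereas the unrestricted range gives $2\pi$, which is the double counting that the modular domain removes.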
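The modular domain is $r \in [0, \infty)$ and $\varphi \in [0, 2\pi)$, and the Jacobian is $\mu = r$. Rewriting the Laplace operator in the quantum Hamiltonian in the new variables (9.17) and omitting all the derivatives $\partial/\partial\varphi$ in it, we find the Schrödinger equation in the physical subspace
$$\left[-\frac{1}{2}\partial_r^2 - \frac{1}{2r}\partial_r + \frac{1}{2r^2}\hat{\sigma}_F^2 + \frac{1}{2}r^2 + \hat{\boldsymbol{\xi}}^\dagger\hat{\boldsymbol{\xi}} - 1\right]\Phi_E = E\Phi_E . \tag{9.26}$$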
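Here $\hat{\sigma}_F = \hat{\boldsymbol{\xi}}^\dagger\Gamma\hat{\boldsymbol{\xi}}$. In the fermionic sector we used the symmetric ordering of the operators, $\boldsymbol{\psi}^*\boldsymbol{\psi} \to \hat{\boldsymbol{\psi}}^\dagger\hat{\boldsymbol{\psi}} - 1 = \hat{\boldsymbol{\xi}}^\dagger\hat{\boldsymbol{\xi}} - 1$. To solve the Schrödinger equation, we split the physical subspace into four orthogonal subspaces which are labeled by quantum numbers of fermions in the corresponding states, i.e., we take $\Phi_E^{(0)} = \Phi_E^{(0)}(r)$, $\Phi_E^{(k)} = \xi_k^*F_E^{(k)}(r)$ and $\Phi_E^{(3)} = \xi_1^*\xi_2^*F_E^{(3)}(r)$. These states are orthogonal with respect to the scalar product
$$\langle\Phi_1|\Phi_2\rangle = \int_0^\infty dr\,r\int d\boldsymbol{\xi}^*\,d\boldsymbol{\xi}\,e^{-\boldsymbol{\xi}^*\boldsymbol{\xi}}\,[\Phi_1(r, \boldsymbol{\xi}^*)]^*\,\Phi_2(r, \boldsymbol{\xi}^*) . \tag{9.27}$$
The volume $2\pi$ of the non-physical configuration space spanned by $\varphi$ is included into the norm of the physical states. The operator $\hat{\sigma}_F$ is diagonal in each of the subspaces introduced, $\hat{\sigma}_F 1 = \hat{\sigma}_F\,\xi_1^*\xi_2^* = 0$ and $\hat{\sigma}_F\,\xi_k^* = (\Gamma\boldsymbol{\xi}^*)_k$. The bosonic wave functions can be found by the same method used in Section 7.2. The regular normalized eigenstates and the eigenvalues are [144]
$$\Phi_n^{(0)} = \frac{\sqrt{2}}{n!}\,L_n(r^2)\,e^{-r^2/2}, \qquad E_n^{(0)} = 2n , \tag{9.28}$$
$$\Phi_n^{(k)} = \frac{\sqrt{2}}{n!\sqrt{n+1}}\,r\,\xi_k^*\,L_n^1(r^2)\,e^{-r^2/2}, \qquad E_n^{(k)} = 2n + 2 , \tag{9.29}$$
$$\Phi_n^{(3)} = \xi_1^*\xi_2^*\,\Phi_n^{(0)}, \qquad E_n^{(3)} = 2n + 2 , \tag{9.30}$$
where $n = 0, 1, 2, \ldots$ . The spectrum is the same as in the Fock representation.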
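The eigenvalues quoted in (9.28)–(9.30) can be verified with exact rational arithmetic. The sketch below is an independent check and not the derivation of [144]. It assumes that in each sector the radial problem reduces to $-\frac{1}{2}F'' - \frac{1}{2r}F' + \frac{m_2}{2r^2}F + \frac{1}{2}r^2F + cF = EF$ with $F = P(r)e^{-r^2/2}$, where $m_2$ is the centrifugal coefficient (1 in the one-fermion sectors, 0 otherwise) and $c$ is the value of the fermionic term (fermion number minus one). Multiplying by $2r^2$ turns the equation into the polynomial identity used in the code. Laguerre polynomials are taken in the standard normalization, which does not affect the eigenvalues; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def laguerre_in_r2(n, alpha):
    """Coefficients (index = power of r) of the Laguerre polynomial L_n^alpha(r^2)."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        coeffs[2 * k] = Fraction((-1) ** k * comb(n + alpha, n - k), factorial(k))
    return coeffs

def derivative(p):
    """Coefficients of dp/dr."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def shift(p, s):
    """Multiply the polynomial p(r) by r**s."""
    return [Fraction(0)] * s + list(p)

def scale(p, c):
    return [c * x for x in p]

def add(*polys):
    size = max(len(p) for p in polys)
    return [sum((p[i] for p in polys if i < len(p)), Fraction(0)) for i in range(size)]

def energy(P, m2, c):
    """Return E if P(r) exp(-r^2/2) solves the radial equation, else None.

    Checks -r^2 P'' + 2 r^3 P' - r P' + (2 (1 + c) r^2 + m2) P = 2 E r^2 P.
    """
    dP = derivative(P)
    d2P = derivative(dP)
    lhs = add(scale(shift(d2P, 2), -1), scale(shift(dP, 3), 2),
              scale(shift(dP, 1), -1), scale(shift(P, 2), 2 * (1 + c)),
              scale(P, m2))
    rhs = shift(P, 2)
    size = max(len(lhs), len(rhs))
    lhs = lhs + [Fraction(0)] * (size - len(lhs))
    rhs = rhs + [Fraction(0)] * (size - len(rhs))
    top = max(i for i, coeff in enumerate(rhs) if coeff != 0)
    E = lhs[top] / (2 * rhs[top])
    return E if all(l == 2 * E * r for l, r in zip(lhs, rhs)) else None

for n in range(4):
    print(n,
          energy(laguerre_in_r2(n, 0), 0, -1),           # no fermions
          energy(shift(laguerre_in_r2(n, 1), 1), 1, 0),  # one fermion
          energy(laguerre_in_r2(n, 0), 0, 1))            # two fermions
```

For each $n$ the three sectors return $2n$, $2n + 2$ and $2n + 2$, so the level spacing is two in every sector.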
The wave functions have a unique gauge invariant continuation into the total configuration superspace. This follows from the fact that they are regular functions of the independent gauge invariant polynomials $r^2 = \mathbf{x}^2$, $\xi_1^*\xi_2^* = \psi_1^*\psi_2^*$, $r\xi_1^* = z\psi_1^*$ and $r\xi_2^* = z^*\psi_2^*$, where $z = x_1 + ix_2$. This is an analog of the theorem of Chevalley for mixed systems [119]. Now we can see that in the unitary gauge $x_2 = 0$ the physical states are invariant under the residual gauge transformations (9.6), and, eventually, this symmetry is responsible for pairwise excitations of the bosonic and fermionic degrees of freedom since only the compositions $x_1^2$, $x_1\boldsymbol{\xi}^*$ and $\xi_1^*\xi_2^*$ are invariant under this symmetry. Thus, the kinematic coupling of physical bosonic degrees of freedom, which is due to the non-Euclidean structure of their physical phase space, is also inherent to gauge systems with bosonic and fermionic degrees of freedom. This kinematic coupling may considerably affect quantum dynamics of the fermionic degrees of freedom as we proceed to demonstrate.

9.3. Green's functions and the configuration space structure

In quantum field theory the dynamics of physical excitations is usually described by Green's functions, which are vacuum expectation values of time ordered products of the Heisenberg field operators. In gauge theories, they are calculated in a certain gauge. In turn, the gauge may not be complete, leaving some residual gauge transformations which reduce the configuration space of bosonic physical degrees of freedom to a modular domain on the gauge fixing surface. An interesting question is: What happens to the fermionic Green's functions? Will they be affected if the configuration space of bosonic variables is reduced to the modular domain? The answer is affirmative. We illustrate this statement with the example of the supersymmetric oscillator with the SO(2) gauge symmetry. The model is soluble, so all the Green's functions can be explicitly calculated.
We will consider the simplest Green's function $D_t = \langle T(\hat{q}(t)\hat{q}(0))\rangle_0$, which is the analog of the quantum field propagator; $T$ stands for the time ordered product. Here $\hat{q}(t)$ is the Heisenberg position operator. Taking the Hamiltonian of a harmonic oscillator for bosonic and fermionic degrees of freedom
$$\hat{H} = \hat{b}^\dagger\hat{b} + \hat{f}^\dagger\hat{f} , \tag{9.31}$$
where $[\hat{b}, \hat{b}^\dagger] = [\hat{f}, \hat{f}^\dagger]_+ = 1$, we set $\hat{q} = (\hat{b}^\dagger + \hat{b})/\sqrt{2}$. Then we find [145]
$$D_b(t) = \langle 0|T(\hat{q}(t)\hat{q}(0))|0\rangle = \tfrac{1}{2}\theta(t)e^{-it} + \tfrac{1}{2}\theta(-t)e^{it} , \tag{9.32}$$
$$D_f(t) = \langle 0|T(\hat{f}(t)\hat{f}^\dagger(0))|0\rangle = \theta(t)e^{-it} , \tag{9.33}$$
where $\theta(t)$ is the Heaviside step function. It is easy to verify that they satisfy the classical equations of motion with the source
$$(-\partial_t^2 - 1)D_b(t) = (i\partial_t - 1)D_f(t) = i\delta(t) , \tag{9.34}$$
which defines, in fact, the classical Green's functions of the Bose- and Fermi-oscillators. After the Fourier transform $D(\omega) = \int_{-\infty}^{\infty}dt\,\exp(-i\omega t)D(t)$, the Green's functions assume a more familiar form
$$D_b(\omega) = i(\omega^2 - 1 + i\epsilon)^{-1}, \qquad D_f(\omega) = -i(\omega - 1 + i\epsilon)^{-1} \tag{9.35}$$
where $\epsilon > 0$ and $\epsilon \to 0$. The poles of $D_{b,f}(\omega)$ are associated with the first excited state of the corresponding degree of freedom. In the unitary gauge, the SUSY oscillator is described by the variable $r$ which ranges over the positive semiaxis. The eigenstates and eigenvalues are given in Eqs. (9.28)–(9.30). We can investigate the effect of the restriction of the integration domain in the scalar product on the Green's functions by their explicit calculation through the spectral decomposition of the vacuum expectation values
$$\langle 0|\hat{q}(t)\hat{q}|0\rangle = \sum_E e^{-it(E - E_0)}\,|\langle 0|\hat{q}|E\rangle|^2 . \tag{9.36}$$
For the Fourier transforms of the two-point functions $D_b^c(t) = \langle T(\hat{r}(t)\hat{r})\rangle_0$ and $D_{fjk}^c(t) = \langle T(\hat{\xi}_j(t)\hat{\xi}_k^\dagger)\rangle_0$, we obtain
$$D_b^c(\omega) = \sum_{n=0}^{\infty}\frac{\Gamma^2(n - 1/2)}{4(n!)^2}\,\frac{in}{\omega^2 - 4n^2 + i\epsilon} , \tag{9.37}$$
$$D_{fjk}^c(\omega) = \delta_{jk}\sum_{n=0}^{\infty}\frac{\Gamma^2(n + 1/2)}{4(n!)^2(n+1)}\,\frac{-i}{\omega - 2n - 2 + i\epsilon} . \tag{9.38}$$
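The pole strengths in (9.37) and (9.38) can be tested against simple sum rules. The check below rests on an interpretation assumed here rather than stated in the text: the coefficient of the $n$-th pole in (9.37) is read as $2E_n w_n$ with $E_n = 2n$ and $w_n = \Gamma^2(n - 1/2)/(16(n!)^2)$, and the coefficient in (9.38) is read directly as a weight. If these are squared transition amplitudes from the ground state, each set should sum to one, the bosonic one because $\langle 0|\hat{r}^2|0\rangle = 1$ in the ground state of (9.28). The function names are hypothetical.

```python
from math import exp, lgamma

def boson_weight(n):
    """Assumed spectral weight Gamma(n - 1/2)^2 / (16 (n!)^2)."""
    return exp(2 * lgamma(n - 0.5) - 2 * lgamma(n + 1)) / 16

def fermion_weight(n):
    """Assumed spectral weight Gamma(n + 1/2)^2 / (4 (n!)^2 (n + 1))."""
    return exp(2 * lgamma(n + 0.5) - 2 * lgamma(n + 1)) / (4 * (n + 1))

def partial_sum(weight, terms):
    """Sum of the first `terms` weights; the tails decay like a power of n."""
    return sum(weight(n) for n in range(terms))

boson_total = partial_sum(boson_weight, 2000)
fermion_total = partial_sum(fermion_weight, 100000)
print(boson_total, fermion_total)
```

Both partial sums approach one, and the power-law decay of the individual terms shows that the weight is spread over the entire spectrum rather than concentrated on a single level.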
In accordance with the theorem of De Morgan [146], series (9.37) and (9.38) are absolutely convergent and define analytic functions on the complex plane of $\omega$ with simple poles. Their Fourier transforms do not satisfy the classical equations (9.34). The reason for such a drastic modification of the oscillator Green's functions is the restriction of the integration domain in the scalar product. In contrast to the ordinary oscillators with a flat phase space, the amplitudes $\langle 0|\hat{r}|\Phi_n^{(0)}\rangle$ and $\langle 0|\hat{\xi}_k|\Phi_n^{(k)}\rangle$ are non-zero for every $n$, i.e., for all energy levels. In other words, the action of the operators $\hat{r}$ or $\hat{\xi}_k$ on the ground state does not excite only the next energy level, but all of them. One can also say that the variables $r$ and $\boldsymbol{\xi}$ do not describe elementary excitations, but rather composite objects. This unusual feature deserves further study.

To this end we recall the residual symmetry (9.25) of the eigenstates (9.28)–(9.30). Making use of it we can continue the physical wave functions into the nonphysical domain $r < 0$ as well as extend the integration domain to the whole real line, $\int_0^\infty dr\,r\,\phi(r^2) = \frac{1}{2}\int_{-\infty}^{\infty}dr\,|r|\,\phi(r^2)$, keeping the orthogonality of the eigenfunctions. However, the states $\hat{r}\Phi_E = r\Phi_E$ and $\hat{\boldsymbol{\xi}}^\dagger\Phi_E = \boldsymbol{\xi}^*\Phi_E$ occurring in the Green's functions are not invariant under transformations (9.25). If we take an analytic continuation of these functions into the covering space, we get the obvious result $D_b^c = D_{fjk}^c = 0$. This means that the action of the operators $\hat{r}$ and $\hat{\boldsymbol{\xi}}^\dagger$ throws the states out of the physical subspace. The correspondence with (9.37) and (9.38) is achieved when the states $\hat{r}\Phi_E$ and $\hat{\boldsymbol{\xi}}^\dagger\Phi_E$ are continued into the covering space to be invariant under the transformations (9.25), i.e., as $|r|\Phi_E$ and $\varepsilon(r)\boldsymbol{\xi}^*\Phi_E$, respectively, where $\varepsilon(r)$ is the sign function. Excitations described by the functions $|r|$ and $\varepsilon(r)\boldsymbol{\xi}^*$ would contain all the powers of the elementary gauge invariant polynomials $r^2$ and $r\boldsymbol{\xi}$. This is why the corresponding Green's functions contain the sum over the entire spectrum.

The Green's functions can be calculated in the covering space (i.e., on the total gauge fixing surface), provided all operators in question are replaced by their $S$-invariant continuations into the
covering space, $\hat{O} \to \hat{Q}\hat{O} = \hat{O}_Q$, where
$$\hat{Q}\Phi(r, \boldsymbol{\xi}^*) = \int_0^\infty dr'\int d\boldsymbol{\xi}'^*\,d\boldsymbol{\xi}'\,e^{-\boldsymbol{\xi}'^*\boldsymbol{\xi}'}\left[e^{\boldsymbol{\xi}^*\boldsymbol{\xi}'}\delta(r - r') + e^{-\boldsymbol{\xi}^*\boldsymbol{\xi}'}\delta(r + r')\right]\Phi(r', \boldsymbol{\xi}'^*) . \tag{9.39}$$
Here the expression in the brackets is the kernel of the extending operator $\hat{Q}$. Note that $\exp(\boldsymbol{\xi}^*\boldsymbol{\xi}')$ is the unit operator kernel in the Grassmann sector. It is noteworthy that the kernel of $\hat{Q}$ has the same structure as, e.g., in (8.45), (8.50) or (8.75) with one natural addition that the residual group acts on both fermionic and bosonic degrees of freedom in the unit operator kernel. The kernel of $\hat{Q}$ is invariant under the transformations (9.25) for its first argument and so is the function $\hat{Q}\Phi$. In particular, we find
$$r \to \hat{Q}r = r_Q = \sum_S\Theta_{\hat{R}K}(r)\,\hat{R}r = \varepsilon(r)\,r = |r| , \tag{9.40}$$
$$\boldsymbol{\xi}^* \to \hat{Q}\boldsymbol{\xi}^* = \boldsymbol{\xi}_Q^* = \sum_S\Theta_{\hat{R}K}(r)\,\hat{R}\boldsymbol{\xi}^* = \varepsilon(r)\,\boldsymbol{\xi}^* , \tag{9.41}$$
where $\Theta_K(r)$ is the characteristic function of the modular domain $K$ (a half axis in this case) and the sum is extended over the residual symmetry $S$ such that the quotient of the gauge fixing surface by $S$ is isomorphic to $K$. The extending operator $\hat{Q}$ was introduced when studying the path integral formalism for bosonic gauge theories with a non-Euclidean phase space. Here we have a generalization of this concept to the simple model with fermionic degrees of freedom.

Our analysis of the Green's functions is also compatible with the path integral formalism. The Heisenberg operator $\hat{O}(t)$ is determined by the evolution operator $\hat{U}_t$, but the latter is modified as $\hat{U}_t \to \hat{U}_t\hat{Q} = \hat{Q}\hat{U}_t\hat{Q} = \hat{U}_t^D$. Therefore
$$\langle\hat{O}(t)\rangle = \langle\hat{U}_t^{D\dagger}\hat{O}\hat{U}_t^D\rangle = \langle\hat{U}_t^\dagger\hat{O}_Q\hat{U}_t\rangle, \qquad \hat{Q}|0\rangle = |0\rangle , \tag{9.42}$$
where $\hat{U}_t$ is the evolution operator on the covering space. If the operator $\hat{O}$ is a reduction of a gauge invariant operator on the gauge fixing surface (like $\hat{O} = \hat{r}^2 = \hat{\mathbf{x}}^2$ in the above model), then $\hat{O}_Q = \hat{O}$, and the modular domain has no effect on its Green's function. This might have been anticipated since the dynamics of gauge invariant quantities cannot depend on the gauge fixing or on the way we parameterize the gauge orbit space to regularize the path integral. All we still have to prove is that the evolution operator has the form $\hat{U}_t\hat{Q}$ when the fermions are added into the gauge system.

Remark. Under certain conditions perturbative Green's functions may not be sensitive to a non-Euclidean structure of the phase space. A simple example is the double well potential discussed at the end of Section 8.8 and in Section 3.5. The potential has a minimum at $r = v$, so the perturbative Green's functions of the operator $\rho$ that describes small fluctuations around the classical vacuum, $\rho = r - v$, are not affected by the conic singularity of the phase space. Indeed, we get $\hat{Q}\rho = |r| - v \approx r - v$ as long as $(\langle r\rangle - v)/v \ll 1$ for the states close to the perturbative (oscillator) ground state (cf. also the Bohr–Sommerfeld quantization of the system discussed in Section 3.5). The coordinate singularities in the Coulomb gauge in the 4D Yang–Mills theory seem to be "far away" from the classical vacuum so that the perturbative Green's functions of gluons are not affected by them (see Section 10). The notion "far away" requires a dimensional scale in the physical configuration space. In 2+1 dimensions it might be constructed out of a gauge coupling constant
(which is dimensional in this case) [163]. In four dimensions, such a scale could be associated with the curvature of the gauge orbit space [164,101]. A non-perturbative analysis of Green's functions can be done in the 1+1 QCD on the cylindrical spacetime (the 2D Yang–Mills theory with fermions). The residual gauge transformations from the affine Weyl group would lead to a specific anomaly because the Dirac sea (the fermionic vacuum) is not invariant under them [51].

9.4. A modified Kato–Trotter formula for gauge systems with fermions

Consider a generic gauge system with bosonic and fermionic degrees of freedom described by commutative variables $x \in \mathbb{R}^N$ and complex Grassmann variables $\psi_k$, $k = 1, 2, \ldots, M$. Let the gauge group act linearly in the configuration superspace, $x \to \Omega_b(\omega)x$ and $\psi \to \Omega_f(\omega)\psi$, where the subscripts $b$ and $f$ denote the corresponding representations of the gauge group in the bosonic and fermionic sectors, respectively. To develop a gauge invariant path integral formalism associated with the Dirac operator method, we use the projection method proposed in Section 8.7 for the path integral defined via the Kato–Trotter formula. In the coherent state representation of the fermions [139], $\hat{\psi}|\psi\rangle = \psi|\psi\rangle$, we have
$$\langle\psi|\hat{H}|\psi'\rangle = H(\psi^*, \psi')\langle\psi|\psi'\rangle = H(\psi^*, \psi')\,e^{\langle\psi^*,\psi'\rangle} , \tag{9.43}$$
where $\langle\,,\rangle$ stands for the invariant scalar product in the representation space of the gauge group. The classical Hamiltonian $H$ (with possible quantum corrections due to the operator ordering) is assumed to be invariant under the gauge transformations. For this reason in the Kato–Trotter product formula (8.8), the kernel of the "free" evolution operator is a product of the "free" evolution operator kernel for the bosonic degrees of freedom and the unit operator kernel for the fermionic degrees of freedom. By analogy with (8.86) we construct the gauge invariant short-time transition amplitude on the gauge orbit superspace
$$U_\epsilon^{0D}(x, \psi^*; x', \psi') = (2\pi i\epsilon)^{-N/2}\int_G d\mu_G(\omega)\,\exp\left\{\frac{i\langle x - \Omega_b(\omega)x'\rangle^2}{2\epsilon}\right\}\exp\langle\psi^*, \Omega_f(\omega)\psi'\rangle . \tag{9.44}$$
Due to the explicit gauge invariance of this amplitude, one can reduce it on any gauge fixing surface, say, $x = f(u)$, parameterized by a set of variables $u$, just by changing the variables $x = \Omega_b(\omega)f(u)$ and $\psi = \Omega_f(\omega)\xi$ in the superspace. If the gauge is incomplete and there are discrete residual transformations determined by the equation $f(u_s(u)) = \Omega_b(\omega_s(u))f(u)$, then in the limit $\epsilon \to 0$ integral (9.44) gets contributions from several stationary points of the exponential, just as in the pure bosonic case (8.87). Note that the entire time dependence is in the bosonic part of kernel (9.44). So the stationary phase approximation of the gauge group averaging integral is the same as in the pure bosonic case. One should, however, be aware of the possibility that the same function $u_s(u)$ may, in general, be generated by different group elements $\Omega_s$. In the pure bosonic case the existence of such a degeneracy of the stationary points in the averaging integral would lead to a numerical factor in the amplitude. Since the representations of the physical bosonic and fermionic variables may be different, the group elements $\Omega_s$ that have the same action on the bosonic variables may act differently on the fermionic variables [119]. The above degeneracy is removed by different contributions of the fermions in (9.44). For instance, if we add a multiplet of fermions in the adjoint representation to the (0+1) SU(2) Yang–Mills model, then the Weyl reflection $\tau_3 \to -\tau_3$ can be induced by two different group elements, $(-i\tau_1)\tau_3(i\tau_1) = (-i\tau_2)\tau_3(i\tau_2) = -\tau_3$. As the fermion
multiplet has all the components (no gauge can be imposed on fermions), these group elements act differently on it. Yet, in this particular model, the stationary group U(1) of $\tau_3$ will act as a continuous gauge group on the fermionic multiplet, while leaving the boson variable unchanged. The average over this Cartan group would have no effect on the "free" bosonic amplitude, while it will have an effect on the fermionic unit operator kernel in (9.44). Thus, the sum over the residual transformations associated with Gribov copying on the gauge fixing surface would appear again, and the residual transformations act on both the bosonic and fermionic variables simultaneously. The operator ordering corrections to the physical kinetic energy of free bosons would emerge from the pre-exponential factor in the stationary phase approximation for the gauge group averaging integral in the limit $t \to 0$, as we have illustrated with the example at the end of Section 8.8. By analogy with (8.91) one can obtain a continuation of the unit operator kernel to the total covering space
$$\langle u, \xi|u', \xi'\rangle = \int_G d\mu_G(\omega)\,\delta(x - \Omega_b(\omega)x')\,e^{\langle\psi^*,\Omega_f(\omega)\psi'\rangle} \tag{9.45}$$
$$= \int_G d\mu_G(\omega)\,\delta(f(u) - \Omega_b(\omega)f(u'))\,e^{\langle\xi^*,\Omega_f(\omega)\xi'\rangle} \tag{9.46}$$
$$= \int\frac{du''}{[\mu(u)\mu(u'')]^{1/2}}\,\delta(u - u'')\,e^{\langle\xi^*,\xi''\rangle}\,Q(u'', \xi''^*; u', \xi') , \tag{9.47}$$
$$Q(u, \xi^*; u', \xi') = \sum_S\delta(u - \hat{R}u')\,e^{\langle\xi^*,\hat{R}\xi'\rangle} , \tag{9.48}$$
where $u'$ is from the modular domain, $\hat{R}u' = u_s(u')$ and $\hat{R}\xi = \Omega_f(\omega_s(u'))\xi$ (observe the $u'$-dependence of the residual gauge transformations in the fermionic sector). Kernel (9.45) is nothing but the kernel of the projection operator (8.81) for gauge systems with fermions. It has been used in [147] to develop the path integral formalism in gauge models with a non-Euclidean phase space and in Yang–Mills theory with fermions, in particular. A general structure of the kernel (9.48) has been analyzed in [119] (see also [16]). Recent developments of the projection formalism for fermionic gauge systems can be found in [148].

Since the bosonic potential, fermionic Hamiltonian and terms describing coupling between bosons and fermions are gauge invariant by assumption, we conclude that the gauge invariant infinitesimal transition amplitude reduced on the gauge fixing surface has the form
$$U_\epsilon^D(q^*; q') = \int\frac{dq''\,e^{-\langle\xi''^*,\xi''\rangle}}{[\mu(u)\mu(u'')]^{1/2}}\,U_\epsilon(q^*; q'')\,Q(q''^*; q') , \tag{9.49}$$
where, to simplify the notation, we have introduced the supervariable $q$ to denote the collection of the bosonic coordinates $u$ and the Grassmann variables $\xi$; accordingly $q^*$ means the set $u$, $\xi^*$; $dq \equiv du\,d\xi^*\,d\xi$, and $\mu(u)$ is the Jacobian of the change of variables on the superspace, or the Faddeev–Popov determinant on the gauge fixing surface. Here we assume that $\det\Omega_f = 1$, which is usually the case in gauge theories of the Yang–Mills type (otherwise the Jacobian is a product of the Faddeev–Popov determinant and $\det\Omega_f$). This is no restriction on the formalism being developed. When necessary, $\det\Omega_f$ can be kept in all the formulas, and the final conclusion is not changed.
To calculate the folding of two kernels (9.49), we first prove the following property of the integration measure for the modular domain
$$\int_K du\,\mu(u)\,\phi = \int_{K_s}du_s(u)\,\mu(u_s(u))\,\phi , \tag{9.50}$$
where $K_s$ is the range of $u_s(u)$, $u \in K$. Indeed, since $\det\Omega_f = 1$, the Jacobian is fully determined by the Jacobian in the bosonic sector. We have $dx = d\mu_G(\omega)\,du\,\mu(u)$. Under the transformations $u \to u_s(u)$ and $\Omega(\omega) \to \Omega(\omega)\Omega^{-1}(\omega_s(u))$, the original variables $x$ are not changed, so $dx(u, \Omega) = dx(u_s(u), \Omega\Omega_s^{-1})$. Eq. (9.50) follows from the invariance of the measure $d\mu_G$ on the group manifold with respect to the right shifts. Eq. (9.50) merely expresses the simple fact that when integrating over the orbit space the choice of a modular domain is not relevant: any $K_s$ can serve for this purpose.

Consider the action of the infinitesimal evolution operator (9.49) on a function $\Phi(u, \xi^*)$ on the modular domain. We have
$$\hat{U}_\epsilon^D\Phi = \int dq''\,e^{-\langle\xi''^*,\xi''\rangle}\int_K\frac{dq'\,\mu(u')\,e^{-\langle\xi'^*,\xi'\rangle}}{[\mu(u)\mu(u'')]^{1/2}}\,U_\epsilon(q^*; q'')\sum_S\delta(u'' - u_s(u'))\,e^{\langle\hat{R}\xi''^*,\xi'\rangle}\,\Phi(q') . \tag{9.51}$$
In the integral over the modular domain we change the variables $u' \to u_s(u')$ and $\xi' \to \hat{R}\xi'$ in each term of the sum over $S$ (see also Section 7.7 for details about the orientation of the integration domain in the bosonic sector). Making use of (9.50) and the relation $\langle(\Omega_f\xi)^*, \Omega_f\xi'\rangle = \langle\xi^*, \xi'\rangle$ we can do the integral over the new variables since it contains the corresponding delta functions, thus obtaining the relation
$$\hat{U}_\epsilon^D\Phi = \int dq''\,e^{-\langle\xi''^*,\xi''\rangle}\left(\frac{\mu(u'')}{\mu(u)}\right)^{1/2}U_\epsilon(q^*; q'')\,\Phi_Q(q''^*) , \tag{9.52}$$
where the function $\Phi_Q$ is the $S$-invariant continuation of the function $\Phi$ outside of the modular domain to the whole gauge fixing surface (or the covering space)
$$\Phi_Q(u, \xi^*) = \sum_S\Theta_{\hat{R}K}(u)\,\Phi(\hat{R}^{-1}u, \hat{R}^{-1}\xi^*) \tag{9.53}$$
$$= \int_K du'\,d\xi'^*\,d\xi'\,e^{-\langle\xi'^*,\xi'\rangle}\,Q(u, \xi^*; u', \xi')\,\Phi(u', \xi'^*) . \tag{9.54}$$
Here by $\hat{R}^{-1}u$ we imply the function $u_s^{-1}: K_s = \hat{R}K \to K$. Recall that the function $u_s(u)$ has the domain $K$ and the range $K_s = \hat{R}K$, so the inverse function has the domain $\hat{R}K$ and the range $K$. The physical wave functions are gauge invariant and therefore they are well defined on the entire gauge fixing surface and invariant under the $S$-transformations. Thus, the action of $\hat{Q}$ does not change physical Dirac states reduced on the gauge fixing surface since $\sum_S\Theta_{\hat{R}K}(u) = 1$, just like in the example right after (8.54). Taking instead of $\Phi$ the gauge invariant infinitesimal evolution operator kernel (9.44) reduced on the gauge fixing surface (see (9.49)) by means of the gauge group average, we immediately conclude that the relation (9.51) holds for the folding $\hat{U}_{2\epsilon}^D = \hat{U}_\epsilon^D\hat{U}_\epsilon^D = \hat{U}_{2\epsilon}\hat{Q}$, where the folding $\hat{U}_{2\epsilon} = \hat{U}_\epsilon\hat{U}_\epsilon$ is taken with the standard measure $du\,d\xi^*\,d\xi\,\exp(-\langle\xi^*, \xi\rangle)$ and the integration over $u$ is extended over the whole gauge fixing surface. Indeed, when $\Phi_Q$ is replaced by the kernel (9.49) in (9.52), then $\hat{Q}\hat{U}_\epsilon^D = \hat{U}_\epsilon^D$, thanks to the gauge invariance of the projected kernel (9.49), and the factor $[\mu(u'')]^{1/2}$ in (9.52) is canceled against the corresponding factor $[\mu]^{-1/2}$ in the
evolution operator kernel (9.49). The path integral representation of $\hat{U}_t$ is given by the Faddeev-Popov reduced phase space integral modulo the operator ordering corrections, whose exact form can be calculated from the stationary phase approximation of the group averaging integral (9.44), as explained in Section 8. This completes the proof of formula (9.42), which was essential for understanding the effects of the modular domain on the gauge-fixed Green's functions.
Remark. To calculate the operator ordering terms, it is sufficient to expand $\Omega_f$ up to second order in the vicinity of the stationary point, just as for the measure $d\mu_G(u)$, because the fermionic exponential in (9.44) does not contain $\epsilon^{-1}$. The second-order terms will contribute to the quantum potential, and therefore the latter may, in general, depend on fermionic variables.
10. On the gauge orbit space geometry and gauge fixing in realistic gauge theories
The non-Euclidean geometry of the physical phase space may significantly affect quantum dynamics. In particular, a substantial modification of the path integral formalism is required. This should certainly be expected to happen in realistic gauge theories. Unfortunately, a mathematically rigorous generalization of the methods discussed so far to realistic four-dimensional gauge field theories can only be carried out if the number of degrees of freedom is drastically reduced, either by assuming a finite lattice instead of continuous space, or by compactifying the space into a torus and considering small volumes of the torus, so that high-momentum states can be treated perturbatively and only the lowest (zero-momentum) states are affected by the non-perturbative corrections. The removal of these regularizations remains a major obstacle to reaching a reliable conclusion about the role of the configuration or phase space geometry of the physical degrees of freedom in realistic gauge theories. For this reason we limit the discussion to a review of various approaches rather than going into the details. At the end of this section we apply the projection method to construct the path integral for the Kogut-Susskind lattice gauge theory, which seems to us to be a good starting point, consistent with the gauge invariant operator formalism, for studying the effects of the physical phase space geometry in quantum Yang-Mills theory.
10.1. On the Riemannian geometry of the orbit space in classical Yang-Mills theory
The total configuration space of the classical Yang-Mills theory consists of smooth square integrable gauge potentials (connections) $A = A(x) \in C^\infty$ on the space being compactified into a sphere [12] (meaning that the potentials decrease sufficiently fast to zero at spatial infinity). Potentials take their values in a Lie algebra of a semisimple compact group $G$ (the structure group).
As before, we use the Hamiltonian formalism in which the time component $A_0$ of the four-vector $A_\mu$ is the Lagrange multiplier for the constraint (the Gauss law)
$$\sigma(x) = (\nabla(A), E) = \partial_i E_i - ig[A_i, E_i] = 0 , \tag{10.1}$$
where the components of the color electric field are canonical momenta for $A$. We omit the details of constructing the Hamiltonian formalism. They are essentially the same as for the two-dimensional case discussed in Section 5.
Gauge transformations are generated by the constraint (10.1): $\delta F = \{\langle\omega, \sigma\rangle, F\}$ for any functional $F$ of the canonical variables and infinitesimal $\omega$. Finite gauge transformations are obtained by successive iterations of infinitesimal transformations. One can show that each gauge orbit in the configuration space $\mathcal{A}$ of all (smooth) connections intersects at least once the hyperplane $\partial_i A_i = 0$ [170,171]. The Coulomb gauge does not fix constant gauge transformations because $\partial_i A_i^\Omega = \Omega\,\partial_i A_i\,\Omega^{-1} = 0$ if $\partial_i\Omega = 0$. One can remove this gauge arbitrariness by reducing the gauge group $\mathcal{G}$ to the so-called pointed gauge group $\mathcal{G}_0$ whose elements satisfy the condition $\Omega(x_0) = e$ (group unity) for some fixed point $x_0$. For example, one can identify $x_0$ with spatial infinity by requiring that $\Omega(x) \to e$ as $|x| \to \infty$ (the space is compactified into a three-sphere).
Local effects of the orbit space geometry on the dynamics of physical degrees of freedom are caused by a non-Euclidean metric because the kinetic energy depends on the metric. To construct the metric on the orbit space $\mathcal{A}/\mathcal{G}_0$, we need local coordinates. The space $\mathcal{A}$ is an affine space, while the orbit space has a non-trivial topology [12]. To introduce local coordinates on the orbit space, we identify a suitable region of $\mathcal{A}$ that, upon dividing out the gauge group, projects bijectively onto some open subset of the orbit space. There always exists a subset $K$ of $\mathcal{A}$ which is isomorphic to the orbit space modulo boundary identifications. The subset $K$ is called a (fundamental) modular domain. To construct $K$, one uses a gauge fixing, i.e., the modular domain $K$ is identified as a subset of a gauge fixing surface $\chi(A) = 0$. Configurations from $K$ are used as local (affine) coordinates on the orbit space [12,44,167]. Clearly, the gauge fixing surface must have at least one point of intersection with every gauge orbit. We take the Coulomb gauge $\chi(A) = \partial_i A_i(x) = 0$.
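The transversality condition of the Coulomb gauge can be made concrete in momentum space, where $\partial_j$ acts as multiplication by $ik_j$ and the projector on transverse fields reads $P_{jk} = \delta_{jk} - k_j k_k/|k|^2$. The following is a minimal sketch for a single Fourier mode; the function names and the sample vectors are illustrative assumptions, not part of the original text.

```python
def transverse_projector(k):
    """Return P_jk = delta_jk - k_j k_k / |k|^2 for a nonzero momentum k."""
    k2 = sum(c * c for c in k)
    if k2 == 0:
        raise ValueError("the projector is undefined at k = 0")
    n = len(k)
    return [[(1.0 if j == m else 0.0) - k[j] * k[m] / k2 for m in range(n)]
            for j in range(n)]


def apply(matrix, vector):
    """Matrix-vector product for plain Python lists."""
    return [sum(row[m] * vector[m] for m in range(len(vector))) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


k = [1.0, 2.0, 2.0]    # hypothetical momentum of the mode
a = [3.0, -1.0, 4.0]   # hypothetical Fourier amplitude of A_j

P = transverse_projector(k)
a_perp = apply(P, a)              # transverse part of the amplitude
divergence = dot(k, a_perp)       # k_j A_j, the momentum-space Coulomb condition
a_perp_again = apply(P, a_perp)   # P is idempotent, so this equals a_perp
```

The projected amplitude satisfies $k_j A_j = 0$, and applying the projector a second time leaves it unchanged, which is the property $P_{kj}\,dA_j = dA_k$ used for transverse variations.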
We adopt the method and notations from the discussion of the two-dimensional case in (5.33) and (5.34), where $a$ should be replaced by a transverse potential $A(x)$, $\partial_i A_i \equiv 0$; that is, the transverse potentials are chosen as local coordinates on the orbit space. We will use the same letter $A$ for the transverse connections parameterizing the gauge orbit space (unless specified otherwise). In the new coordinates
$$dA_j \to \Omega\left(dA_j - \frac{i}{g}\,\nabla_j(A)\,dw\right)\Omega^{-1} , \tag{10.2}$$
where $\partial_j dA_j \equiv 0$ on the right-hand side and $dw = i\Omega^{-1}d\Omega$. In contrast to (5.34), the metric is not block-diagonal relative to the physical and non-physical sectors. If $g_{AB}$ denotes the metric tensor in the new coordinates, where $A, B = 1$ is a collective index for the transverse connections $dA$ and $A, B = 2$ is a collective index for the pure gauge variables $dw(x)$, then the metric has the block form
$$g_{AB} = \begin{pmatrix} \delta_{jk} & -\frac{i}{g}\,P_{nm}\nabla_m(A) \\ \frac{i}{g}\,\nabla_m(A)P_{mn} & -\frac{1}{g^2}\,\nabla^2(A) \end{pmatrix} , \tag{10.3}$$
where $P_{jk} = \delta_{jk} - \partial_j\Delta^{-1}\partial_k$ is the projector on transverse vector fields. It occurs through the simple relation $\langle dA_j, \nabla_j dw\rangle = \langle dA_j, P_{jk}\nabla_k dw\rangle$ since by construction $dA_j$ is transverse, $P_{kj}dA_j = dA_k$.
The square root of the determinant of the metric (10.3) is the Jacobian of the change of variables. From the analysis of the simple models one can naturally expect it to be proportional to the Faddeev-Popov determinant for the Coulomb gauge [13]. Indeed, making use of the formula for the determinant of a block matrix, we find
$$\mu[A] = (\det g_{AB})^{1/2} \sim (\det\{\nabla^2 - \nabla_k P_{kn}\nabla_n\})^{1/2} \sim \det(-\partial_j\nabla_j(A)) , \tag{10.4}$$
which is the Faddeev-Popov determinant for the Coulomb gauge, as one can see by taking the determinant of the operator whose kernel is determined by the Poisson bracket of the constraint $\sigma(x)$ and the gauge fixing function $\partial_i A_i(x')$. Thus, the Faddeev-Popov determinant specifies a relative volume of a gauge orbit through $A$. The singular points of the change of variables are configurations where the determinant vanishes (the Jacobian vanishes). For $A = 0$ the Faddeev-Popov operator $\hat{M}_{FP} \equiv -\partial_j\nabla_j(A) = -\Delta$ has no zero modes in the space of functions decreasing to zero at spatial infinity. By perturbation theory arguments one can also conclude that in the vicinity of the zero configuration the operator $\hat{M}_{FP}$ has no zero modes. Given a configuration $A$, consider a ray $gA$ in the functional space, where the ray parameter $g$ may be regarded as the gauge coupling constant in the operator $\hat{M}_{FP}$. Gribov showed [11] that for sufficiently large $g$ the equation $\hat{M}_{FP}\psi(x) = 0$ would always have a non-trivial solution, that is, the Faddeev-Popov operator would have a zero mode. Therefore a ray from the zero configuration in any direction would reach a point where the Jacobian, i.e. the Faddeev-Popov determinant, vanishes. The singular points form a space of codimension one in the space of transverse connections, which is called the Gribov horizon (where the lowest eigenvalue of the Faddeev-Popov operator vanishes; see below).
The plane waves associated with the two transverse polarizations of gluons are solutions of the equations of motion in the limit of zero coupling constant. Therefore, for dynamics described by the perturbation theory of transverse gluons, the coordinate singularities in the Coulomb gauge have no effect. Given that the effective coupling constant decreases in the high energy limit (see, e.g., [166]), one can understand why the perturbation theory based on the Faddeev-Popov path integral in the Coulomb gauge was so successful.
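The way a ray $gA$ reaches the horizon can be illustrated by a finite-dimensional toy model. The sketch below replaces the Faddeev-Popov operator by a $2\times 2$ matrix $M(g) = K - gV$, with $K = \mathrm{diag}(1, 4)$ standing in for $-\Delta$ on two modes and $V$ a symmetric off-diagonal matrix standing in for the part linear in $A$. Both matrices and all function names are assumptions chosen for illustration; nothing here is the actual Yang-Mills operator.

```python
import math


def lowest_eigenvalue(g):
    """Lowest eigenvalue of the toy operator M(g) = K - g V with
    K = diag(1, 4) and V = [[0, 1], [1, 0]] (real symmetric 2x2)."""
    a, d, b = 1.0, 4.0, -g          # M(g) = [[a, b], [b, d]]
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return mean - radius


def horizon(g_lo=0.0, g_hi=10.0, tol=1e-12):
    """Bisect along the ray for the value of g where the lowest eigenvalue vanishes."""
    assert lowest_eigenvalue(g_lo) > 0.0 > lowest_eigenvalue(g_hi)
    while g_hi - g_lo > tol:
        mid = 0.5 * (g_lo + g_hi)
        if lowest_eigenvalue(mid) > 0.0:
            g_lo = mid      # still inside the toy "Gribov region"
        else:
            g_hi = mid      # already beyond the toy "horizon"
    return 0.5 * (g_lo + g_hi)


g_star = horizon()
```

In this toy model $\det M(g) = 4 - g^2$, so the lowest eigenvalue is positive for $0 \le g < 2$ and changes sign at $g = 2$. This mimics a ray from the zero configuration crossing the horizon at a finite distance.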
The relevant configurations are simply far away from the coordinate singularities. In the strong coupling limit, it is rather hard to determine the relative "strength" of the contributions to the dynamics which come from the coordinate singularities (i.e. from the physical kinetic energy, or color electric field energy) and from the strong self-interaction (i.e. from the color magnetic energy). There is no technique to solve the Yang-Mills theory non-perturbatively and compare the effects of the singular points in the Coulomb gauge with those due to the self-interaction. This resembles the situation discussed in Section 3.5 (see also the remark at the end of Section 9.3), where the conic singularity of the physical phase space does not appear relevant for dynamics in the double well potential in a certain regime: the classical ground state of the system is far from the conic singularity, so that small fluctuations around the ground state are insensitive to it. One should emphasize again that, though the coordinate singularities are fully gauge dependent, they are unavoidable. Therefore the singular points should always be taken care of in any formalism which relies on an explicit parameterization of the gauge orbit space. However, they may or may not be relevant for a particular physical situation in question.
Returning to calculating the metric on the gauge orbit space, we assume that $A$ in (10.3) is a generic configuration inside the Gribov horizon, so we can take the inverse of (10.3),
$$g^{AB} = \begin{pmatrix} \delta_{jk} + P_{jn}\nabla_n D^{-1}\nabla_m P_{mk} & -ig\,P_{nm}\nabla_m D^{-1} \\ ig\,D^{-1}\nabla_m P_{mn} & -g^2 D^{-1} \end{pmatrix} , \tag{10.5}$$
where $D = (\partial, \nabla)\Delta^{-1}(\partial, \nabla)$. The metric $g^{ph}_{jk}$ on the gauge orbit space, according to the general analysis given in Section 7.7 (see (7.94) and (7.100)), is the inverse of the block $g^{11}$ in (10.5). This metric would
specify the physical kinetic energy in our parameterization of the gauge orbit space. That is,
$$g^{jk}_{ph} = \delta_{jk} + P_{jn}\nabla_n D^{-1}\nabla_m P_{mk} \equiv \delta_{jk} + K_{jk} . \tag{10.6}$$
The same result can be obtained by solving the Gauss law for the longitudinal components of the momenta $E_i$. Imposing the gauge $\partial_i A_i = 0$, one makes the decomposition $E_i = E_i^\perp + \Delta^{-1}\partial_i(\partial_j E_j)$, where $\partial_i E_i^\perp \equiv 0$. Substituting the latter into the Gauss law and solving it for $\partial_i E_i$, one finds the expression of $E_i$ via the physical canonical variables $A_i^\perp (\equiv A_i)$ and $E_i^\perp$. The physical metric is then given by (4.23) and coincides with (10.6). The determinant of the physical metric is not equal to the squared Faddeev-Popov determinant, but rather we get [13] (see footnote 5)
$$\det g^{ph}_{jk} = [\det g^{jk}_{ph}]^{-1} \sim \Delta_{FP}^2\,[\det\Delta\,\det(\nabla, \nabla)]^{-1} . \tag{10.7}$$
In the two-dimensional case studied in Section 5, the physical metric is proportional to the unit matrix, so its determinant is constant. If the space is one-dimensional, then $\Delta_{FP}^2 \sim \det(\partial\nabla)^2 \sim \det\partial^2\,\det(\nabla)^2$ and the determinant (10.7) equals one, indeed. The curvature of the gauge orbit space is positive, as has been shown by Singer [165,164].
10.2. Gauge fixing and the Morse theory
A representative of the gauge orbit in classical Yang-Mills theory can be specified by means of the Morse theory, as has been proposed by Semenov-Tyan-Shanskii and Franke [168]. The idea is to minimize the $L^2$ norm of the vector potential along the gauge orbit [168-170],
$$M_A(\Omega) = \langle \Omega A\Omega^{-1} - ig^{-1}\Omega\,\partial\Omega^{-1}\rangle^2 . \tag{10.8}$$
Here $\langle F\rangle^2$ denotes $\int d^3x\,(F, F)$. The minima of the Morse functional carry information about the topology of the gauge orbit through $A$. Taking $\Omega = e^{igw}$ and expanding the Morse functional around the critical point $w = 0$, we find
$$M_A(w) = \langle A\rangle^2 + 2\langle w, \partial_j A_j\rangle - \langle w, (\partial_i\nabla_i)w\rangle + O(w^3) . \tag{10.9}$$
From (10.9) it follows that the relative minima of the Morse functional are given by the transverse potentials, which satisfy $\partial_j A_j = 0$ and for which the Faddeev-Popov operator $\hat{M}_{FP} = -\partial_j\nabla_j$ is a symmetric, positive operator. The positivity of the Faddeev-Popov operator ensures that the connection $A$ has the property that $\Omega = e$ is a minimum of $M_A$. The Gribov horizon is determined by the condition that the lowest eigenvalue of the Faddeev-Popov operator vanishes [100]. The configurations on the Gribov horizon are degenerate critical points of the Morse functional. A Gribov region $K_G$ is defined as the set of all minima of the Morse functional. It has the property that each gauge orbit intersects it at least once, and it is convex and bounded [171].
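The statement that minima of the Morse functional are transverse can be checked in a much simpler setting. The sketch below uses an Abelian toy model on a periodic one-dimensional lattice, where the gauge transformation is $A_j \to A_j + w_{j+1} - w_j$ and the Morse functional is the sum of squares of the transformed link variables. The lattice model, the sign convention of the transformation, the step size and all names are assumptions made for illustration; in this Abelian case there are no Gribov copies and the minimum is unique up to a constant shift of $w$.

```python
def gauge_transform(A, w):
    """Abelian lattice gauge transformation A_j -> A_j + w_{j+1} - w_j (periodic)."""
    n = len(A)
    return [A[j] + w[(j + 1) % n] - w[j] for j in range(n)]


def morse(A, w):
    """Toy Morse functional: squared L2 norm of the transformed potential."""
    return sum(a * a for a in gauge_transform(A, w))


def divergence(A):
    """Backward lattice divergence (A_j - A_{j-1}) with periodic wrap-around."""
    return [A[j] - A[j - 1] for j in range(len(A))]


def minimize(A, steps=3000, rate=0.2):
    """Gradient descent along the orbit; dM/dw_j = -2 * divergence_j."""
    w = [0.0] * len(A)
    for _ in range(steps):
        div = divergence(gauge_transform(A, w))
        w = [wj + rate * dj for wj, dj in zip(w, div)]
    return w


A = [0.3, -1.2, 0.7, 2.0, -0.4, 0.9, -1.5, 0.2]   # hypothetical starting potential
w_min = minimize(A)
A_min = gauge_transform(A, w_min)
```

At the minimum the lattice divergence of `A_min` vanishes, and only the constant (zero-momentum) part of the potential survives, since it cannot be removed by a periodic gauge transformation.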
5 Eq. (10.7) can be derived by means of the exponential representation of the determinant, $\det g_{ph} = \exp(\mathrm{tr}\ln g_{ph}) = \exp\sum_{n=1}^{\infty}[(-1)^{n+1}/n]\,\mathrm{tr}\,K^n$, where $\mathrm{tr}\,K^n = \mathrm{tr}[(\nabla^2 - D)D^{-1}]^n = \mathrm{tr}(\nabla^2 D^{-1} - 1)^n$. Therefore $\det g_{ph} = \det(\nabla^2 D^{-1})$, which leads to (10.7).
It may happen that two relative minima inside the Gribov domain $K_G$ are related by a gauge transformation, i.e., they are on the same gauge orbit. To obtain the modular domain $K$, which contains only one representative of each gauge orbit, one has to take only the absolute minima of the Morse functional. Let $A$ and its gauge transform $A^\Omega$ both be from the Gribov domain $K_G$. Then it is straightforward to show [100,101] that
$$\langle A^\Omega\rangle^2 - \langle A\rangle^2 = \langle \Omega^{-1}, \partial_i\nabla_i^f(A)\Omega\rangle , \tag{10.10}$$
where $\nabla^f\Omega = \partial\Omega - igA\Omega$ is the covariant derivative in the fundamental representation. Since the Faddeev-Popov operator is positive in $K_G$, the absolute minima of the Morse functional can be defined in terms of the absolute minima over the gauge group of the right-hand side of Eq. (10.10). A configuration $A$ from the Gribov domain $K_G$ belongs to the modular domain $K$ if the minimum of the functional (10.10) over the gauge group vanishes. This condition simply selects the absolute minima of the Morse functional out of its relative minima. Since the Faddeev-Popov operator for the Coulomb gauge is linear in $A$, all configurations of the line segment $sA_{(1)} + (1-s)A_{(2)}$, where $s \in [0, 1]$ and $A_{(1,2)} \in K$, also belong to $K$. That is, the modular domain is convex.
In a similar way, the existence of the horizon and the description of the modular domain have been established in the background gauge $\nabla_i(\bar{A})A_i = 0$ ($\bar{A}$ is a background (fixed) connection). This result is due to Zwanziger [171]. In this case the Morse functional is
$$M_A(\Omega) = \langle A^\Omega - \bar{A}\rangle^2 , \tag{10.11}$$
and the Faddeev-Popov operator has the form $\hat{M}_{FP} = -\nabla_k(\bar{A})\nabla_k(A)$.
The main properties of the modular domain are as follows [101]. First, its boundary has common points with the Gribov horizon, i.e., it contains the coordinate singularities in the chosen gauge. Second, the modular domain has a trivial topology, as any convex subset in an affine space, but its boundary contains gauge equivalent configurations. Through their identification one obtains a non-trivial topology of the gauge orbit space. In fact, the orbit space contains non-contractible spheres of any dimension [12]. Third, the gauge transformations that relate configurations inside the Gribov domain $K_G$ may be homotopically non-trivial. Any point on the Gribov horizon has a finite distance from the origin of the field space, and one can derive a uniform bound, as has been done in the original work of Gribov [11] and later improved by Zwanziger and Dell'Antonio [172].
Although the above procedure to determine the modular domain applies to general background connections, some properties of $K$ and $K_G$ may depend on the choice of the background connection. In particular, reducible and irreducible background configurations have to be distinguished [173,174]. A connection $A$ is said to be reducible if it has a non-trivial stationary group $G_A$ (the stabilizer) such that $A^\Omega = A$ for all $\Omega \in G_A$. If $G_A$ coincides with the center $Z_G$ of the structure group $G$, then the connection is irreducible. From the identity $A^\Omega = A + ig^{-1}\Omega\nabla(A)\Omega^{-1}$ it follows that $\Omega \in G_A$ if $\nabla(A)\Omega = 0$. Any stabilizer $G_A$ is isomorphic to a closed subgroup of $G$ [175]. This can be understood as follows. We recall that for any $A$, $G_A$ is isomorphic to the centralizer of the holonomy group of $A$ relative to the structure group $G$ [176]. By definition, the centralizer $G'_c$ of a subgroup $G'$ of $G$ consists of all elements of $G$ which commute with all elements of $G'$. Clearly,
$G'_c$ is a subgroup of $G$. On the other hand, the holonomy group is a Lie subgroup of $G$ (see, e.g., [176]), i.e., its centralizer is a subgroup of $G$.
The orbit space has the structure of a so-called stratified variety, which can be regarded as the disjoint sum of strata that are smooth manifolds [177-179]. Each stratum of the variety consists of orbits of connections whose stabilizers are conjugate subgroups of $G$. In other words, the stabilizers of the connections of a fixed stratum are isomorphic to one another. A stratum that consists of orbits of all irreducible connections is called a main stratum. The set of orbits of reducible connections is a closed subset in the orbit space which is nowhere dense. Accordingly, the main stratum is dense in the orbit space, and any singular stratum can be approximated arbitrarily well by irreducible connections [173]. If all reducible connections are excluded from the total configuration space, then the orbit space is a manifold.
The Morse functional (10.11) can also be regarded as the distance between $A^\Omega$ and $\bar{A}$. Let $\bar{A}$ be an irreducible connection. Any two connections $A_{1,2}$, $A_1 \neq A_2$, on the gauge fixing surface $\nabla_i(\bar{A})A_i = 0$ that are sufficiently close to $\bar{A}$ belong to distinct gauge orbits. For reducible backgrounds, the gauge fixing surface does not possess such a property. The Morse functional (10.11) has a degeneracy for reducible backgrounds. Indeed, if $\bar{\Omega} \in G_{\bar{A}}$, then we have
$$M_A(\Omega\bar{\Omega}) = \langle A^{\Omega\bar{\Omega}} - \bar{A}\rangle^2 = \langle A^{\Omega\bar{\Omega}} - \bar{A}^{\bar{\Omega}}\rangle^2 = M_A(\Omega) , \tag{10.12}$$
because the Morse functional is invariant under simultaneous gauge transformations of $A^\Omega$ and $\bar{A}$. It is also not hard to see that, if $M_A(\Omega\bar{\Omega}) = M_A(\Omega)$ holds true for any $A$, then $\bar{\Omega}$ should be an element of the stabilizer $G_{\bar{A}}$.
In the case of a reducible background, the Faddeev-Popov operator always has zero modes. Let $\bar{A}^\Omega = \bar{A}$, i.e., $\Omega \in G_{\bar{A}}$. As $G_{\bar{A}}$ is isomorphic to a Lie group (a subgroup of $G$), there exists a family $\Omega_\lambda \in G_{\bar{A}}$ such that all elements are connected to the group unity, $\Omega_{\lambda=0} = e$. The Lie algebra valued function $\psi(x) = \partial_\lambda\Omega_\lambda(x)|_{\lambda=0}$ is covariantly constant, $\nabla_j(\bar{A})\psi = 0$, and therefore it is a zero mode of the Faddeev-Popov operator, $\hat{M}_{FP}\psi = -\nabla_j(\bar{A})\nabla_j(A)\psi = -\nabla_j(A)\nabla_j(\bar{A})\psi = 0$, thanks to the symmetry of $\hat{M}_{FP}$. In particular, taking $\bar{A} = 0$ we get $G_{\bar{A}=0} \sim G$, i.e., $G_{\bar{A}=0}$ is a group of constant gauge transformations. By removing constant gauge transformations from the gauge group, we remove a systematic degeneracy of the Faddeev-Popov operator for the Coulomb gauge.
Next, we observe that the collection of all absolute minima of the Morse functional cannot serve as the fundamental modular domain $K$ because of the degeneracy (10.12). There are gauge equivalent configurations inside the set of the absolute minima. The identification in the interior precisely amounts to dividing out the stabilizer $G_{\bar{A}}$ [173].
We shall not pursue a further elaboration of the classical physical configuration space because in quantum theory the fields are distributions, and the relevance of the above analysis to the quantum case is not yet clear [13]. A stratification of the orbit space of the classical SU(2) Yang-Mills theory is studied in detail in the work of Fuchs et al. [173]. It is noteworthy that classical trajectories in the Hamiltonian formalism are always contained in one fixed stratum (in one smooth manifold) [180].
A final remark is that a principal bundle has isomorphism classes characterized by the instanton number, which can be any integer. Connections with different instanton numbers satisfy different asymptotic conditions at infinity. If we allow asymptotic conditions associated with all instanton numbers, then the fundamental modular domain will be the disjoint sum of modular domains for every instanton number [181].
10.3. The orbit space as a manifold. Removing the reducible connections
In the previous section we have seen that the main stratum of the orbit space is a smooth manifold. On the other hand, in 2D Yang-Mills theory the orbit space appears to be an orbifold (with trivial topology). The reason is that there are reducible connections on the gauge fixing surface whose stabilizers are subgroups of the group of constant gauge transformations. If we restrict the gauge group by excluding constant gauge transformations, then, as we shall show, the orbit space is a manifold which is a group manifold [47]. The group manifold is compact and has a non-trivial topology. The latter occurs through the identification of gauge equivalent points on the Gribov horizon (the example of the SU(2) group has recently been studied in this regard by Heinzl and Pause [182]). One should keep in mind, however, that such a truncation of the gauge group cannot be done in the Lagrangian, and is added to the theory by hand as a supplementary condition on the gauge group.
If constant gauge transformations are excluded from the gauge group in the 2D Yang-Mills theory, then all zero modes of the Faddeev-Popov operator $-\partial\nabla(A_0)$ are determined by Eq. (8.60), where $a$ is replaced by $A_0$, a generic element on the gauge fixing surface $\partial A = 0$, $A = A_0 \in X$, i.e., a constant connection in the Lie algebra (not in its Cartan subalgebra). Since $A_0 = \Omega a\Omega^{-1}$ for some constant group element $\Omega$, the zero modes have the same form (8.62), where $\bar{\xi} \to \Omega^{-1}\bar{\xi}\Omega$. Therefore Eq. (8.64) specifies zeros of the Faddeev-Popov determinant because $\det[-\partial\nabla(a)] = \det[-\partial\nabla(A_0)]$, i.e., it does not depend on $\Omega$. The Cartan algebra element $a$ related to $A_0$ by the adjoint action of the group has $r = \mathrm{rank}\,X$ independent components, which can be expressed via the independent Casimir polynomials $P_{\nu_i}(A_0) = \mathrm{tr}\,A_0^{\nu_i} = P_{\nu_i}(a)$. For example, in the SU(2) case, the only component of $a$ is proportional to $[\mathrm{tr}\,A_0^2]^{1/2}$. Hence, the Faddeev-Popov determinant vanishes at the concentric two-spheres $\mathrm{tr}\,A_0^2 = 2a_0^2 n^2$, $n \neq 0$. The vacuum configuration $A_0 = 0$ is inside the region bounded by the first Gribov horizon $n = 1$, which is the Gribov region. The vacuum configuration $A_0 = 0$ is no longer a singular point because constant gauge transformations are excluded. The Faddeev-Popov determinant is proportional to the Haar measure [30] on the group manifold,
$$\Delta_{FP}(A_0) = \Delta_{FP}(a) = \prod_{\alpha > 0}\frac{\sin^2(a, \alpha)}{(a, \alpha)^2} , \tag{10.13}$$
which is regular at any hyperplane orthogonal to a root and passing through the origin (vacuum $a = 0$). Returning to the SU(2) case, we remark that all the configurations $A_0$ such that $2a_0^2(n-1)^2 \le \mathrm{tr}\,A_0^2 \le 2a_0^2 n^2$, $n > 1$, are gauge equivalent to those in the Gribov region. In general, given a constant connection $A_0$ one can find a group element $\Omega$ and the Cartan subalgebra element $a$ such that $A_0 = \Omega a\Omega^{-1}$. To obtain a Gribov copy of $A_0$, we translate $a$ by an integral linear combination of the elements $\eta_\alpha$ defined by Eq. (5.26), and then bring the resulting element back to the whole algebra by the inverse adjoint action generated by the group element $\Omega$.
In the SU(3) case, the first Gribov horizon is obtained by generic adjoint transformations of all the configurations that lie in the polyhedron $B_1B_2\cdots B_6$ in Fig. 5. So it is a seven-dimensional compact manifold. The same holds in general. We take the polyhedron around the vacuum configuration $a = 0$ whose faces are portions of the hyperplanes $(a, \alpha) = a_0$ for all positive roots $\alpha$. Then each point of the polyhedron is transformed by the adjoint action of generic group elements.
As a result, we obtain the first Gribov horizon, which is a manifold of dimension $\dim X - 1$. As it should, it has codimension one on the gauge fixing surface.
As in the matrix model considered in Section 4.8, there are gauge equivalent configurations within the Gribov horizon. To find them we observe that the vacuum configuration $a = 0$ can always be shifted to the first Gribov horizon by a homotopically non-trivial gauge transformation (7.86) with $n = 1$. If we shift the vacuum configuration by $\alpha a_0/(\alpha, \alpha)$ (see (7.86)) and then rotate it as $\Omega\alpha\Omega^{-1}a_0/(\alpha, \alpha) \equiv A_0^{(\alpha)}$, where $\Omega \in G/G_H$, we obtain a portion of the Gribov horizon that contains all possible images of the vacuum configuration generated by the homotopically non-trivial transformations associated with the root $\alpha$ (cf. (7.85)). All the points $A_0^{(\alpha)}$ of this portion of the horizon are related to one another by homotopically trivial gauge transformations. Indeed, the homotopically non-trivial gauge group element that transforms the vacuum configuration to a generic configuration on the $\alpha$-portion of the horizon is $\Omega(x, A_0^{(\alpha)}) = \exp(igA_0^{(\alpha)}x)$. The gauge transformation that relates two configurations $A_0^{(\alpha)}$ and $\tilde{A}_0^{(\alpha)}$ on the $\alpha$-portion of the horizon is obtained by the composition of the gauge transformation that shifts, say, $A_0^{(\alpha)}$ to the vacuum, and the gauge transformation that shifts the vacuum to $\tilde{A}_0^{(\alpha)}$. It is generated by the group element $\Omega(x, \tilde{A}_0^{(\alpha)})\Omega^{-1}(x, A_0^{(\alpha)})$, which is homotopically trivial:
$$\Omega(x + 2\pi l, \tilde{A}_0^{(\alpha)})\,\Omega^{-1}(x + 2\pi l, A_0^{(\alpha)}) = z_\alpha\,\Omega(x, \tilde{A}_0^{(\alpha)})\,\Omega^{-1}(x, A_0^{(\alpha)})\,z_\alpha^{-1} = \Omega(x, \tilde{A}_0^{(\alpha)})\,\Omega^{-1}(x, A_0^{(\alpha)}) , \tag{10.14}$$
where $z_\alpha$ is the center element associated with the root $\alpha$ (cf. (7.85)). In the case of the SU(2) group, we have only one root. So all the points of the horizon, being the two-sphere, are gauge equivalent. Identifying them we get the gauge orbit space as the three-sphere, which is the group manifold of SU(2).
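The SU(2) picture can be checked numerically under one stated assumption about normalization: the constant connection is rescaled so that its length $\theta$ reaches $\pi$ on the first Gribov horizon. In these units the density (10.13) reduces to $\sin^2\theta/\theta^2$, and integrating it over the ball $\theta \le \pi$ in the affine (Lie algebra) coordinates should reproduce the volume $2\pi^2$ of the unit three-sphere. The function names and the quadrature below are illustrative choices.

```python
import math


def haar_density(theta):
    """SU(2) Haar density sin^2(theta)/theta^2 in affine coordinates of length theta."""
    if abs(theta) < 1e-8:
        return 1.0 - theta * theta / 3.0   # leading Taylor terms near the vacuum
    return (math.sin(theta) / theta) ** 2


def ball_volume(radius, steps=20000):
    """Midpoint rule for 4*pi * integral_0^radius theta^2 * density(theta) d(theta)."""
    h = radius / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t * t * haar_density(t)
    return 4.0 * math.pi * total * h


first_horizon = math.pi                      # assumed normalization of the radius
volume = ball_volume(first_horizon)          # should approach 2*pi^2
density_at_vacuum = haar_density(0.0)        # regular and nonzero at theta = 0
density_at_horizon = haar_density(first_horizon)
```

The density is regular at the vacuum and vanishes on the horizon, and the finite total volume is consistent with the horizon two-sphere being shrunk to a single point of the three-sphere.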
In the general case, we observe that the Lie algebra elements $A_0 = \Omega a\Omega^{-1}$ from the region bounded by the portions of the hyperplanes $(a, \alpha) = a_0$ serve as local affine coordinates on the group manifold. Any element from the connected component of the group has the form $\exp(2\pi i A_0/a_0)$ [30]. This coordinate chart does not cover the center of the group. Singular points of the affine coordinate system are zeros of the Haar measure (10.13) [30] and, therefore, form the first Gribov horizon. The group manifold is obtained by identifying all points in each of the $\alpha$-portions of the horizon, so that the latter is shrunk to a finite number of points which are elements of the center of the group [30]. In the SU(2) case, for example, the entire two-sphere $\mathrm{tr}\,A_0^2 = 2a_0^2$ is shrunk to a single point, the only non-trivial element of the center, $-e$.
Thus, if we exclude constant gauge transformations, then the 2D Yang-Mills theory becomes an irreducible gauge system. The corresponding gauge orbit space is a topologically non-trivial (group) manifold. The above discussion may serve as an illustration for the classical Yang-Mills theory in four dimensions, where the gauge orbit space exhibits the same features [12,13,100]. In particular, one needs more than one coordinate chart to make a coordinate system on the gauge orbit space. If one takes two geodesics outgoing from one point on the orbit space, then they may have another point of intersection which belongs to the Gribov horizon in the local affine coordinate system centered at the initial point of the geodesics [13], thus indicating the singularity of the coordinate system. We emphasize that the existence of conjugate points on the geodesics is an intrinsic feature of the theory. We also point out that the use of several coordinate charts allows one to avoid the Gribov singularities [183,220] (see also [194]) in principle, but does not lead to any convenient method to calculate the path integral.
A recent development of this approach in the framework of stochastic quantization can be found in [221].
10.4. Coordinate singularities in quantum Yang-Mills theory
The description of the parameterization of the gauge orbit space given in Section 10.2 applies to classical theory only. The configuration space in quantum field theory is much larger than the space of square integrable functions. It consists of distributions [186]. Smooth classical functions form a subset of zero measure in the space of distributions (a Sobolev functional space). The Sobolev space $S^p_k$, where $1 \le p < \infty$, $k = 0, 1, 2, \ldots$, consists of fields all of whose derivatives up to and including order $k$ have integrable $p$th power. The smaller the indices $p$ and $k$, the larger the space of the fields. The result of Singer on the absence of a global continuous gauge fixing for smooth classical field configurations can be extended to the Sobolev space [44,34], provided
$$p(k+1) > n , \tag{10.15}$$
where $n$ is the dimension of the base manifold. Since the gauge transformation law of the connection involves the derivatives of the group elements, the latter must have one derivative more than the connections, i.e., they must be from the Sobolev space (of the group valued functions) $S^p_{k+1}$. Only under the condition (10.15) does the gauge group possess the structure of an infinite-dimensional Lie group and act smoothly on the space of connections $S^p_k$ [45]. The condition (10.15) is discussed in more detail in [204]. Here we point out the following. The condition (10.15) is crucial for continuity of gauge transformations as functions of a point of the base manifold. For instance [34], the function $|x|^{-\epsilon}$ is singular at the origin, but the $p$th power of its $k$th derivative is integrable if $p(k+1) < n$ and $\epsilon < 1$. If $p(k+1) = n$ ($p \neq 1$), then there may exist a singularity $(-\ln|x|)^{1-1/p-\epsilon}$. Thus, the necessity of condition (10.15) for continuity is clear. The condition also ensures the existence of a local gauge fixing and the structure of the principal fiber bundle in the configuration space [205]. If $p(k+1) < n$, the very notion of gauge fixing becomes meaningless [34]. Ordinary conditions like the Coulomb gauge will not be a gauge fixing even locally. Consider the transformation $A_i(x) \to A_i^\lambda(x) = \lambda A_i(\lambda x)$ of a connection in $\mathbb{R}^n$. Then
$$\|\partial_{j_1}\cdots\partial_{j_k}A^\lambda\|_p \equiv \left\{\int d^nx\,(\partial_{j_1}\cdots\partial_{j_k}A^\lambda, \partial_{j_1}\cdots\partial_{j_k}A^\lambda)^{p/2}\right\}^{1/p} = \lambda^{k+1-n/p}\,\|\partial_{j_1}\cdots\partial_{j_k}A\|_p . \tag{10.16}$$
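The scaling law (10.16) can be verified numerically in the simplest case. The sketch below takes a one-dimensional Gaussian profile with $k = 0$ and $p = 2$, for which the exponent is $k + 1 - n/p = 1/2$, and separately evaluates the exponent for square integrable fields in four dimensions. The Gaussian profile, the quadrature parameters and the function names are assumptions made for illustration only.

```python
import math


def scaling_exponent(n, k, p):
    """Exponent k + 1 - n/p of lambda in the scaling law (10.16)."""
    return k + 1 - n / p


def lp_norm_scaled(lam, p=2.0, steps=20000, half_width=10.0):
    """L^p norm of A^lam(x) = lam * A(lam * x) for A(x) = exp(-x^2) on the line."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h      # midpoint rule
        total += abs(lam * math.exp(-(lam * x) ** 2)) ** p
    return (total * h) ** (1.0 / p)


# One dimension, k = 0, p = 2: the norm should scale as lambda^(1/2).
ratio = lp_norm_scaled(4.0) / lp_norm_scaled(1.0)
predicted = 4.0 ** scaling_exponent(1, 0, 2.0)

# Four dimensions, k = 0, p = 2: p(k+1) < n, so the exponent is negative.
exponent_4d = scaling_exponent(4, 0, 2.0)
```

In the one-dimensional check the exponent is positive and the norm grows with $\lambda$. For $n = 4$, $k = 0$, $p = 2$ the exponent is $-1$, so the norm of the rescaled configuration decreases as $\lambda$ grows, which is the regime where $p(k+1) < n$.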
The right-hand side of Eq. (10.16) tends to zero as $\lambda \to \infty$ if $p(k+1) < n$. If we take a transverse connection and its Gribov copy and perform the $\lambda$-transformation of them, then for sufficiently large $\lambda$ both configurations will be arbitrarily close to the zero field in the sense of the topology of $S^p_k$, and they will remain transverse. The noncompactness of the base is not important here because both $A$ and its copy can be taken near the vacuum configuration [34].
In the Sobolev space of connections satisfying (10.15) there is an improved version of the theorem of Singer which is due to Soloviev [34]. It asserts that the gauge orbit fiber bundle in non-Abelian field theory does not admit reduction to a finite-dimensional Lie group. In other words, there is no gauge condition that would fix the gauge arbitrariness globally modulo some finite subgroup of the
gauge group. Observe that in all models we have discussed, one can always "nd a gauge condition that removes the gauge arbitrariness completely up to a discrete subgroup of the gauge group. In contrast, in the Yang}Mills theory the residual gauge symmetry in any gauge would not form a "nite subgroup of the gauge group. Soloviev's result gives the most precise characterization of the Gribov problem in the Yang}Mills theory. The formal generalization of the path integral over the covering space of the orbit space, though possible [95,147], would be hard to use since there is no way we could ever "nd all Gribov copies for a generic con"guration (being a distribution) satisfying a chosen gauge. Moreover, a class of "elds on which the functional integral measure has support depends on the "eld model in question (cf. the Minlos}Backner theorem [206]). The property of continuity discussed above is decisive for Singer's analysis, while quantum "eld distributions in four dimensions would typically have `singularity DxD~1a almost everywhere. With such a poor state of a!airs, we need some approximate methods that would allow us to circumvent (or resolve) this signi"cant problem associated with the distributional character of quantum "elds. We stress that the e!ects in questions are essentially non-perturbative, so one of the conventional ways of de"ning the path integral as a (renormalized) perturbative expansion with respect to the Gaussian measure does not apply here. Since in any actual calculation on the gauge orbit space the introduction of a (local) set of coordinates is unavoidable, one should raise a natural question of how the coordinate singularities can be interpreted in terms of quantum "elds. Following the basic ideas of (perturbative) quantum "eld theory, one may attempt to interpret quantum Yang}Mills theory as the theory of interacting gluons. This picture naturally results from perturbation theory in the Coulomb gauge. 
Consequently, the "physical" picture of the effects caused by the coordinate singularities would strongly depend on the choice of variables that are to describe "physical" (elementary) excitations in the theory. There is, in fact, a great deal of freedom in the choice of physical variables, especially in the non-perturbative region. For instance, in the picture of self-interacting gluons, one may expect some effects on the gluon propagator (cf. Section 9.3) caused by coordinate singularities in the Coulomb gauge, as has been conjectured by Gribov. But with another set of variables describing elementary excitations the physical picture would look different in terms of the quanta of the new fields because the singularities would also be different. An example of this kind is provided by 't Hooft's Abelian projection [73]. The gauge is imposed on the field components $F_{kl}$ ($k$, $l$ are fixed) rather than on the potential, or on any local quantity $B(A)$ that transforms in the adjoint representation. It is required that all non-Cartan components of $B$ vanish. Such a gauge restricts the gauge symmetry to a maximal Abelian subgroup of the gauge group, which may be fixed further by the Coulomb gauge without any singularities. A potential $A$ gauge transformed to satisfy the Abelian projection gauge would, in general, have singularities or topological defects. They would have quantum numbers of magnetic monopoles with respect to the residual Abelian gauge group. So, the effective theory would look like QED with magnetic monopoles. This is a completely different interpretation of gluodynamics, which leads to a different interpretation of the coordinate singularities. Lattice simulations show that the monopole defects of gauge fixed Yang–Mills fields are important in the non-perturbative regime and cannot be ignored [184,185]. Singularities in the path integral approach with a gauge imposed on the field variables have also been observed in [69].
Thus, one should conclude that the singularities have a different "physical" appearance (or interpretation), depending on the choice of the "elementary" excitations in the Yang–Mills theory.
However, whatever choice is made, they must be taken into account in a complete (nonperturbative) quantum theory. Yet, it seems desirable to develop a formalism which is universal in the sense that it does not rely on a particular choice of the physical variables, that is, one independent of the parameterization of the physical phase space. A proposal based on the projection formalism is discussed in the next section. The existing approaches can be divided into two groups based, respectively, on the functional Schrödinger equation and the path integral. In the first approach there are great complications as compared with the soluble two-dimensional case we have discussed. First, there is a potential (color magnetic) energy in the Hamiltonian which has terms cubic and quartic in the gauge potential. This would create non-perturbative dynamical effects in the strong coupling limit, thus making it hard to distinguish between the contributions of the kinetic and potential energies to, say, the mass gap (the difference between the vacuum and first excited state energies) in the quantum Yang–Mills theory [163,164]. Second, the metric on the orbit space is not flat, so one should expect quantum corrections to the classical potential stemming from the kinetic energy as predicted by Eqs. (7.100) and (7.101). The quantum potential will be singular at the points where the Jacobian (or the Faddeev–Popov determinant) vanishes, as one might see from its explicit form (7.101). Due to the locality of the kinetic energy, the quantum corrections would contain a non-physical infinite factor $\hbar^2[\delta^3(0)]^2$, which typically results from the operator ordering in any local field theory whose kinetic energy operator contains a non-Euclidean metric in the field space. Thus, the Schrödinger equation in field theory requires a regularization of the local product of operators involved. Not to mention the problem of defining a proper Hilbert space in this approach.
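As a hedged illustration of how a vanishing Jacobian produces a singular quantum potential, consider the simplest curvilinear reduction. The setting is an assumption chosen here for illustration and is not a reproduction of Eqs. (7.100) and (7.101): a free particle of unit mass in a plane, restricted to rotationally invariant wave functions $\psi(r)$, for which the Jacobian of the radial variable is $\mu(r) = r$. Writing $\psi(r) = \phi(r)/\sqrt{\mu(r)}$ one finds

```latex
\begin{aligned}
-\frac{\hbar^2}{2}\,\Delta\psi
  &= -\frac{\hbar^2}{2}\left(\partial_r^2 + \frac{1}{r}\,\partial_r\right)
     \frac{\phi(r)}{\sqrt{r}} \\
  &= \frac{1}{\sqrt{r}}\left[-\frac{\hbar^2}{2}\,\partial_r^2 + V_q(r)\right]\phi(r) ,
\qquad
V_q(r) = -\frac{\hbar^2}{8r^2} .
\end{aligned}
```

In this sketch the kinetic energy takes a Cartesian form on the half-line $r > 0$ at the price of the additional term $V_q$, which is singular at $r = 0$, exactly where the Jacobian vanishes.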
Even in the case of a free field, which is an infinite set of harmonic oscillators, solutions of the functional Schrödinger equation are not without difficulty [186]. Yet, in dimensional regularization one usually sets $\delta^3(0) = 0$. This, however, would not justify throwing away the singular terms from the Hamiltonian. The applicability of dimensional regularization is proved within perturbation theory only. Christ and Lee studied the effects of the operator ordering terms resulting from solving the Gauss law in the Coulomb gauge [7] in the (Hamiltonian) perturbation theory. They did not find any effect for the physical perturbative S-matrix, though the operator ordering terms appeared to be important for a renormalization of the two-loop vacuum diagrams. Their work has been further extended by Prokhorov and Malyshev [187]. It seems that the only reasonable approach based on the Schrödinger equation can be formulated if one truncates the number of degrees of freedom. Cutkosky initiated one such program attempting to investigate the effects of the coordinate singularities on the ground state [188–191]. Another approach is due to Lüscher [192], which has been developed further by Koller and van Baal [193,194] (for recent developments see [195] and [101] and references therein). It is based on compactifying the space into a three-torus and studying the limit of small torus size. The latter allows one to use perturbation theory for all excitations with higher momenta, while the non-perturbative effects would be essential for the low (or zero) momentum excitations. In all these approaches the geometry of the modular domain appears to be important for the spectrum of the truncated Yang–Mills Hamiltonian. In the path integral approach, Gribov proposed to modify the original Faddeev–Popov measure by inserting into it a characteristic function of the domain where the Faddeev–Popov operator is positive [11].
This would modify the path integral substantially in the infrared region (the Green's functions, e.g., the gluon propagator, derived from the path integral are modified) because the
horizon in the Coulomb gauge approaches the vacuum configuration from the infrared directions in the momentum space. Since later it was understood that the modular domain is smaller than the Gribov domain, the idea was appropriately modified by Zwanziger [196–198] with a similar conclusion about the infrared behavior of the gluon propagator. Instead of the conventional $G(k^2) \sim [k^2]^{-1}$ it turned out to be

$$G(k^2) \sim \frac{k^2}{k^4 + m^4} , \tag{10.17}$$
where $m^2$ is a dynamically generated mass scale. In this approach the self-interaction of gluon fields has been taken into account by perturbation theory, so the entire effect on the gluon propagator came from "horizon effects". A propagator of the form (10.17) has also been observed in lattice simulations [200]. However, another group reported a different result [199]:

$$G(k^2) \sim Z^{-1}\left[m^2 + k^2 (k^2/\Lambda^2)^{\alpha}\right]^{-1} , \tag{10.18}$$

where $\alpha \sim 0.5$ and $m^2$ is compatible with zero. The constant $m^2$ has been reported to be a finite volume artifact. Thus, in the continuum limit, one has $G(k^2) \sim (k^2)^{-1.5}$. So, it cannot be fitted as a sum of single particle poles with positive residues, which is certainly an unacceptable feature of the propagator of a physical particle because it violates the Källén–Lehmann representation. However, it has been argued that it could be acceptable for a confined particle [201]. In this controversy it is also unclear which effect is most relevant for such a behavior of the non-perturbative gluon propagator: that of the Gribov horizon, or the effects of a strong self-interaction. For instance, the influence of the Gribov copies in the Coulomb gauge on the correlation function in lattice QCD has been studied in [202]. It has been observed that the residual gauge symmetry does not appear to be relevant. However, the authors of [202] have also noted that the effect may become important on bigger lattices. Yet, invoking a special non-perturbative technique of solving Schwinger–Dyson equations, Stingl [203] found the expression (10.17) for the non-perturbative gluon propagator without taking into account the existence of the horizon. In his approach the whole effect was due to the strong self-interaction of gluon fields. In the aforementioned Abelian projection of QCD, the effects of Gribov copying have also been studied on a lattice [207]. No significant effect has been found. It is curious, however, that the singularities themselves ("monopoles") play the key role in the confinement scenario in the maximal Abelian projection. It seems that the singularities in that gauge serve as labels for configurations (or degrees of freedom) that are most relevant for the confinement.
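The qualitative difference between the forms (10.17) and (10.18) and the conventional $1/k^2$ behavior can be made concrete with a short numerical sketch. The function names, the unit choices for the mass scale, cutoff and normalization (all set to one), and the sample momenta are assumptions made here for illustration only; they are not taken from the cited lattice studies.

```python
import numpy as np

def free_propagator(k2):
    """Conventional perturbative form, G ~ 1/k^2."""
    return 1.0 / k2

def gribov_propagator(k2, m2=1.0):
    """Form (10.17): G ~ k^2 / (k^4 + m^4)."""
    return k2 / (k2**2 + m2**2)

def power_law_propagator(k2, alpha=0.5, m2=0.0, Lambda2=1.0, Z=1.0):
    """Form (10.18): G ~ 1 / (Z * [m^2 + k^2 (k^2/Lambda^2)^alpha])."""
    return 1.0 / (Z * (m2 + k2 * (k2 / Lambda2)**alpha))

m2 = 1.0  # illustrative mass scale in assumed units
k2 = np.array([1e-4, 1e-2, 1.0, 1e2, 1e4])

for q in k2:
    print(f"k^2={q:8.1e}  free={free_propagator(q):10.3e}  "
          f"(10.17)={gribov_propagator(q, m2):10.3e}  "
          f"(10.18)={power_law_propagator(q):10.3e}")

# Poles of (10.17) in the complex k^2 plane: roots of z^2 + m^4 = 0.
poles = np.roots([1.0, 0.0, m2**2])
print("poles of (10.17) in k^2:", poles)
```

In this sketch the form (10.17) vanishes as $k^2 \to 0$ and approaches $1/k^2$ at large $k^2$, and its poles sit at $k^2 = \pm i m^2$ rather than on the real axis.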
As singularities are gauge dependent, it seems very likely that in the non-perturbative region the effects of coordinate singularities associated with a generic gauge and those of the strong self-interaction would be hard to distinguish. The maximal Abelian gauge looks more like an exception than a rule. Returning to the Coulomb gauge, one may anticipate a potential problem in the path integral approach based on the formal restriction of the integration domain, say, to configurations for which the Morse functional attains absolute minima. The point is that the modular domain found in classical theory cannot be applicable in the path integral whose measure has support on the space of distributions. The formal extension of the classical results to the quantum theory is
questionable because classical configurations have zero measure in the quantum configuration space. The way out is to go to lattice gauge theory. The above ideas of defining the modular domain via the Morse theory and the restriction of the integration domain in the path integral have been implemented in the lattice gauge theory by Zwanziger [208]. He also investigated the thermodynamic limit of the modified path integral, i.e., when the number of lattice sites becomes infinite (the limit of an infinite number of degrees of freedom). The conclusion was that the existence of the horizon alone (without a strong self-interaction) is sufficient to explain the area law of the Wilson loop, i.e., to fulfill the confinement criteria [209]. This conclusion, though attractive, still remains a conjecture since effects of a strong self-interaction have not been estimated. Even if the lattice regularization is assumed in the approach based on the restriction of the functional integral measure to the modular domain, there is no obvious correspondence to the operator formalism, which should, as is believed, be present since the operator and path integral formalisms are just two different representations of the same physical model. In Section 8.2 it is argued that the topology and the boundaries in the configuration space cannot, in fact, be taken into account simply by a formal restriction of the integration domain, i.e., by inserting a characteristic function of the modular domain into the path integral measure. This would be in conflict with the operator formalism. For soluble gauge models with a non-Euclidean geometry of the physical phase space, the formal restriction of the integration domain in the path integral turns them into insoluble models because the integral is no longer Gaussian. In the work of Scholtz and Tupper a dynamical gauge fixing has been proposed to circumvent the Gribov obstruction to the path integral quantization [210].
The idea was to introduce a supersymmetric (auxiliary) multiplet coupled to the Yang–Mills fields in a special way such that the physical S-matrix is not modified. Then the gauge is imposed on the bosonic components of the auxiliary supermultiplet, while the Yang–Mills potentials are left untouched. The operator version of such supersymmetric quantization has been developed in [211,212]. As one can see, it is rather hard to arrive at any definite conclusion about the role of the orbit space geometry in quantum gauge field theories. The reason is twofold. First, there is no good understanding of the very notion of the orbit space in the quantum case. The distributional character of quantum fields imposes severe restrictions on the use of conventional topological and geometrical means based on continuity. Second, we do not know how to solve strongly interacting quantum field theories, which makes it impossible to distinguish between effects caused by the geometrical structure of the orbit space and those due to the strong interaction. The theory still needs more development from both the mathematical and physical sides. However, in various model approximations, where the above difficulties can be resolved, we do see the importance of the orbit space geometry in quantum theory.

10.5. The projection method in the Kogut–Susskind lattice gauge theory

A possible way to extend the idea of combining the projection method and the Kato–Trotter product formula to gauge field theories is to make some regularization of a quantum field Hamiltonian. The finite lattice regularization turns the quantum field theory into quantum mechanics. Since we still want to have the Schrödinger equation and the Hamiltonian formalism, which is essential to control the gauge invariance in quantum theory, the only choice we have is the
Kogut–Susskind lattice gauge theory [213], where the space is discretized, while the time remains continuous. Let points of the three-dimensional periodic cubic lattice be designated by three-vectors with integer components, which we denote by $x$, $y$, etc. The total configuration space is formed by the link variables $u_{xy} = u_{yx}^{-1} \in G$, where $y = x + k$, and $k$ is the unit vector in the direction of the $k$th coordinate axis. We also assume $G$ to be SU(N). If $A_k(x)$ is the (Lie algebra-valued) vector potential at the site $x$, then

$$u_{xy} \equiv u_{x,k} = e^{igaA_k(x)} . \tag{10.19}$$

Here $a$ is the lattice spacing. The gauge transformations of the link variables are

$$u_{x,k} \rightarrow \Omega_x u_{x,k} \Omega^{-1}_{x+ka} , \tag{10.20}$$

where $\Omega_x$ is the group element at the site $x$. The variables conjugate to the link variables are electric field operators associated with each link, which we denote $E^b_{x,k}$, where the index $b$ is a color index (the adjoint representation index in an orthogonal basis of the Lie algebra). If the group element $u_{x,k}$ is parameterized by a set of variables $u^b_{x,k}$, then the electric field operator is the Lie algebra generator for each link

$$E^b_{x,k} = -iJ^{bc}(u_{x,k}) \frac{\partial}{\partial u^c_{x,k}} , \tag{10.21}$$

$$[E^c_{x,k}, E^b_{y,j}] = i\delta_{xy}\delta_{kj} f^{bc}{}_{e} E^e_{x,k} . \tag{10.22}$$

The Kogut–Susskind Hamiltonian reads

$$H = H_0 + V , \tag{10.23}$$

$$H_0 = \frac{g_0^2}{2a} \sum_{(x,k)} E^2_{x,k} , \tag{10.24}$$

$$V = \frac{2N}{ag_0^2} \sum_p \left(1 - N^{-1}\,\mathrm{Re\,tr}\, u_p\right) , \tag{10.25}$$
where $g_0$ is the bare coupling constant, and $u_p$ is the product of the link variables around the plaquette $p$. As it stands, the kinetic energy $H_0$ is a sum of the quadratic Casimir operators of the group at each link. It is a self-adjoint operator with respect to the natural measure on the configuration space, which is a product of the Haar measures $d\mu_G(u_{x,k})$. In the case of SU(2), $H_0$ is nothing but the kinetic energy of free quantum three-dimensional rotators. In general, the kinetic energy describes a set of non-interacting particles moving on the group manifold, as follows from the commutation relation (10.22). We shall also call these particles generalized rotators. The magnetic potential energy (10.25) describes the coupling of the generalized rotators. Let $\{u'\}$ and $\{u\}$, respectively, be collections of initial and final configurations of the generalized rotators. To construct the path
integral representation of the transition amplitude $U_t(\{u\}, \{u'\})$, we make use of the modified Kato–Trotter formula (8.84) for gauge systems. The projection operator is just the group average at each lattice site with the Haar measure $d\mu_G(\Omega_x)$ normalized to unity. As before, the crucial step is to establish the projected form of the free transition amplitude. The entire information about the geometry of the orbit space is encoded in it. An important observation is that the free transition amplitude factorizes into a product of the transition amplitudes for each generalized rotator. But the amplitude for a single free particle on the group manifold is well known due to the work of Marinov and Terentiev [120]. Let the amplitude for a single rotator be $U^0_t(u, u')$; then the gauge invariant transition amplitude associated with the Dirac operator approach for a system of rotators reads

$$U^{0D}_t(\{u\}, \{u'\}) = \int_G \prod_x d\mu_G(\Omega_x) \prod_{x,k} U^0_t(u_{x,k}, \Omega_x u'_{x,k} \Omega^{-1}_{x+k}) . \tag{10.26}$$

Due to the invariance of the Casimir operator $H_0$ at each site with respect to shifts on the group manifold, it is sufficient to average only one of the arguments of the free transition amplitude. Simultaneous right (or left) group shifts of both arguments of the free transition amplitude leave the amplitude unchanged. We can now see how a kinematic coupling of the generalized rotators occurs through the gauge group average. The uncoupled rotators become coupled and factorization of the free transition amplitude over the degrees of freedom disappears. This phenomenon we have already seen in soluble gauge models. Observe that each group element $\Omega_x$ enters into six transition amplitudes $U^0_t(u, u')$ associated with the six links attached to the site $x$. This is what makes the gauge average non-trivial even for the "free" Kogut–Susskind quantum lattice gauge theory (i.e., when the potential is set to zero). The projection of the transition amplitude on the gauge orbit space (regardless of any explicit parameterization of the latter) induces a non-trivial interaction between physical degrees of freedom of the Yang–Mills theory. The difference between the Abelian and non-Abelian cases is also clearly seen in this approach. The projection implicitly enforces the Gauss law in the path integral, i.e., without any gauge fixing. In the Abelian case this is a trivial procedure because the Gauss law merely requires vanishing of some canonical momenta ($\partial_i E_i = 0$), so the corresponding part of the kinetic energy simply vanishes without any effect of the redundant degrees of freedom. From the geometrical point of view, the orbit space in QED is Euclidean and therefore no coupling between physical degrees of freedom occurs through the kinetic energy. Once the averaging procedure has been defined, one can proceed with introducing an explicit parameterization of the physical configuration space.
For instance, we can introduce the lattice analog of the Morse functional [208]

$$M_u(\Omega) = \sum_{x,k} \left[1 - N^{-1}\,\mathrm{Re\,tr}\,(\Omega_x u_{x,k} \Omega^{-1}_{x+k})\right] . \tag{10.27}$$

The configurations $u_{x,k}$ at which the functional (10.27) attains relative minima form the gauge fixing surface, the analog of the Coulomb gauge. The modular domain, being a collection of unique representatives of each gauge orbit, is

$$\Lambda = \{u_{x,k} : M_u(e) \le M_u(\Omega) \ \text{for all}\ \Omega \in G\} , \tag{10.28}$$
where $e$ is the group unit. Clearly, $\Lambda$ consists of configurations at which the Morse functional attains absolute minima. Let $u_{x,k}$ be from $\Lambda$. Then a generic link variable can always be represented in the form $W_x u_{x,k} W^{-1}_{x+k}$, where $W_x$ is a group element. From the gauge invariance of the amplitude (10.26) it follows that the initial and final configurations can be taken from $\Lambda$, i.e., the amplitude does not depend on the set of group elements $W_x$. Having reduced the transition amplitude on the gauge orbit space parameterized by the configurations (10.28), one may calculate the group average using the stationary phase approximation in the limit $t = \epsilon \rightarrow 0$ and obtain the modified infinitesimal free transition amplitude, which would contain the information about the geometry and topology of the orbit space and also an explicit form of the operator ordering corrections resulting from the reduction of the kinetic energy operator $H_0$ on the modular domain (10.28). As in the general case, the amplitude has a unique gauge invariant continuation outside of the modular domain to the entire gauge fixing surface. Consequently, the group averaging integral would have more than one stationary point. The sum over Gribov copies would emerge as the sum over the stationary points of the gauge group average integral, indicating a possible compactification of the physical configuration space, as we have learned from the two-dimensional example. The structure of the path integral would be the same as that found in Section 8.7 in the general case. Having proved the equivalence of the path integral obtained by the projection method to the Dirac operator approach and, thereby, ensured gauge invariance (despite using a particular parameterization of the orbit space), one could try to investigate the role of the orbit space geometry in quantum theory, which partially reveals itself through the coordinate singularities of the parameterization chosen.
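The statement that a sum over copies arises from the stationary points of the group average can be illustrated in a toy setting. The sketch below is not the lattice calculation described above; it assumes the SO(2) model with a two-component variable, unit mass and $\hbar = 1$, for which the average of the free short-time phase $\exp(i|x - R_\theta x'|^2/2t)$ over rotations $R_\theta$ reduces, up to a common factor, to the angular integral of $\exp(-iz\cos\theta)$ with $z = rr'/t$. The function names and the sample values of $r$, $r'$ and $t$ are hypothetical.

```python
import numpy as np

def group_average(z, n=4096):
    """(1/2pi) * integral over SO(2) of exp(-i z cos(theta)),
    evaluated with the periodic trapezoid rule."""
    theta = 2.0 * np.pi * np.arange(n) / n
    return np.mean(np.exp(-1j * z * np.cos(theta)))

def stationary_phase_sum(z):
    """Leading stationary-phase contributions of the two stationary
    points, theta = 0 (the point x') and theta = pi (its copy -x')."""
    at_zero = np.exp(-1j * (z - np.pi / 4)) / np.sqrt(2 * np.pi * z)
    at_pi = np.exp(1j * (z - np.pi / 4)) / np.sqrt(2 * np.pi * z)
    return at_zero + at_pi

r, r_prime, t = 2.0, 1.0, 0.01   # illustrative values, small t
z = r * r_prime / t

exact = group_average(z)
approx = stationary_phase_sum(z)
print(exact, approx, abs(exact - approx))
```

For small $t$ the numerically exact average is reproduced by the sum of the two stationary-point terms, one for the configuration $x'$ and one for its image $-x'$ under the residual discrete gauge transformation, which is the pattern described in the text.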
Such an investigation would require studying the thermodynamic and continuum limits, e.g., by the methods developed by Zwanziger [208]. It is also important to note that the Coulomb gauge has recently been proved to be renormalizable [214]. This provides a tool to control the ultraviolet behavior of the theory in the continuum limit. To separate the effects of the kinetic and potential energy would be a hard problem in any approach. But in the strong coupling limit, the kinetic energy dominates, as one sees from the Kogut–Susskind Hamiltonian [213]. This leaves some hope that in this limit the effects of the kinetic energy reduced on the modular domain $\Lambda$ could be accurately studied in the path integral approach. An investigation of the mass gap [163] would be especially interesting. The program can be completely realized in the two-dimensional case (cf. Section 8.8). The gauge group average can be calculated explicitly by means of the decomposition of the transition amplitude of a particle on a group manifold over the characters of the irreducible representations proposed by Marinov and Terentiev [120]. If the orbit space is parameterized by constant link variables (the Coulomb gauge), $u_x = u_{x'}$ for any $x$, $x'$, then in the continuum limit one obviously recovers the transition amplitude (8.70). Taking the resolvent of the evolution operator one can find the mass gap, which would be impossible to see had we neglected the true structure of the physical configuration space. The compactness of the orbit space and the mass gap follow from the very structure of the path integral containing the sum over Gribov copies, which appears as the result of the projection of the transition amplitude onto the gauge orbit space. The projection formalism guarantees that the true geometry of the gauge orbit space is always appropriately taken into account in the path integral, whatever gauge is used, and thereby provides the right technical tool to study non-perturbative phenomena.
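The lattice ingredients of this section, namely the gauge transformation law (10.20) and the potential energy (10.25), can be checked numerically in a small sketch. Several simplifications are assumed here and are not part of the original construction: the gauge group is SU(2), the lattice is a two-dimensional periodic $3 \times 3$ grid instead of a three-dimensional one, $a = g_0 = 1$, and all function names are hypothetical.

```python
import numpy as np

def random_su2(rng):
    """Random SU(2) matrix built from a unit quaternion."""
    a = rng.normal(size=4)
    a /= np.linalg.norm(a)
    return np.array([[a[0] + 1j * a[3], a[2] + 1j * a[1]],
                     [-a[2] + 1j * a[1], a[0] - 1j * a[3]]])

def shift(site, k, L):
    """Neighbor of `site` in direction k on a periodic L x L lattice."""
    x, y = site
    return ((x + 1) % L, y) if k == 0 else (x, (y + 1) % L)

def gauge_transform(u, omega, L):
    """Apply (10.20): u_{x,k} -> Omega_x u_{x,k} Omega_{x+k}^{-1}."""
    v = np.empty_like(u)
    for x in range(L):
        for y in range(L):
            for k in range(2):
                nb = shift((x, y), k, L)
                v[x, y, k] = omega[x, y] @ u[x, y, k] @ omega[nb].conj().T
    return v

def potential_energy(u, L, N=2, a=1.0, g0=1.0):
    """Plaquette sum (10.25) on the two-dimensional toy lattice."""
    total = 0.0
    for x in range(L):
        for y in range(L):
            sx, sy = shift((x, y), 0, L), shift((x, y), 1, L)
            # Ordered product of the four links around the plaquette.
            u_p = (u[x, y, 0] @ u[sx][1]
                   @ u[sy][0].conj().T @ u[x, y, 1].conj().T)
            total += 1.0 - np.trace(u_p).real / N
    return 2.0 * N / (a * g0**2) * total

rng = np.random.default_rng(0)
L = 3
u = np.array([[[random_su2(rng) for k in range(2)]
               for y in range(L)] for x in range(L)])
omega = np.array([[random_su2(rng) for y in range(L)] for x in range(L)])

V_before = potential_energy(u, L)
V_after = potential_energy(gauge_transform(u, omega, L), L)
print(V_before, V_after)  # equal up to rounding
```

Each plaquette product is rotated by the group element at its base site, so its trace, and hence the potential energy, is unchanged. In this two-dimensional toy each $\Omega_x$ touches four links; in the three-dimensional lattice of the text it touches six.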
11. Conclusions

We have investigated the physical phase space structure in gauge theories and found that its geometrical structure has a significant effect on the corresponding quantum theory. The conventional path integral requires a modification to take into account the genuine geometry of the physical phase space. Based on the projection method, the necessary modification has been established, and its equivalence to the explicitly gauge invariant operator formalism due to Dirac has also been shown. In a quantum description of gauge systems, one usually uses some explicit parameterization of the physical phase space by local canonical coordinates. Because of the non-Euclidean geometry of the physical phase space, any coordinate description would in general suffer from coordinate singularities. We have developed a general procedure for coping with such singularities in the operator and path integral formalisms for gauge models of the Yang–Mills type. It appeared that the singularities cannot generally be ignored and have to be carefully taken into account in quantum or classical theory in order to preserve the gauge invariance of the theory. Though all the exact results have been obtained for soluble gauge models, it is believed that some essential features of quantum gauge dynamics on the non-Euclidean physical phase space would also be present in the realistic theories. There are several important problems yet to be solved in non-perturbative quantum field theory before reliable conclusions can be made about the role of the physical phase space geometry in quantum Yang–Mills theory. The way based on the projection formalism in the Kogut–Susskind lattice seems a rather natural approach to this problem, which ensures agreement with the operator formalism and leads to a functional integral that does not depend on any explicit parameterization of the gauge orbit space.
The path integral formalism based on the projection method gives evidence that the compactness of the gauge orbit space might be important for the existence of the mass gap in the theory and, hence, for the gluon confinement, as has been conjectured by Feynman. When constructing the path integral over a non-Euclidean physical phase space, we have always used a parameterization where no restriction on the momentum variables has been imposed. The reason for that is quite clear. The explicit implementation of the projection on the gauge invariant states is easier in the configuration space for gauge theories of the Yang–Mills type. This latter restriction can be dropped, and a phase-space path integral measure covariant under general canonical transformations on the physical phase space can be found [122–124] for systems with a finite number of degrees of freedom. The corresponding path integral does not depend on the parameterization of the physical phase space and, in this sense, is coordinate-free. The problem remains open in quantum gauge field theory. Despite many unsolved problems, it is believed that the soluble examples studied above in detail and the concepts introduced would provide a good starting point for this exciting area of research.
Acknowledgements

I am deeply indebted to John Klauder whose encouragement, support, interest and numerous comments have helped me to accomplish this work. I also wish to thank him for a careful reading of
the manuscript, many suggestions to improve it, and for stimulating discussions on the topics of this review. I express my gratitude to Lev Prokhorov from whom I learned a great deal about gauge theories and path integrals. On this occasion I would like to thank I.A. Batalin, L. Baulieu, A. Broyles, P. van Baal, T. Heinzl, M. Henneaux, H. Hüffel, J.C. Mourao, F.G. Scholtz, M. Schaden, T. Strobl, C.-M. Viallet and D. Zwanziger for fruitful discussions, references and comments that were useful for me in this work. It is a pleasure for me to thank the Departments of Physics and Mathematics of the University of Florida for the warm hospitality extended to me during my stay in Gainesville.
References

[1] P.A.M. Dirac, The Principles of Quantum Mechanics, Clarendon Press, Oxford, 1956, p. 114.
[2] R.P. Feynman, Rev. Mod. Phys. 20 (1948) 367. R.P. Feynman, Phys. Rev. 84 (1951) 108. R.P. Feynman, A.R. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill, New York, 1965. The idea of using the classical action in quantum mechanics was first proposed by Dirac in P.A.M. Dirac, Phys. Z. Sowjetunion 3 (1933) 64 (reprinted in: J. Schwinger, Quantum Electrodynamics, Dover, New York, 1958).
[3] I. Daubechies, J.R. Klauder, J. Math. Phys. 26 (1985) 2239.
[4] I. Daubechies, J.R. Klauder, T. Paul, J. Math. Phys. 28 (1987) 85.
[5] J.R. Klauder, Ann. Phys. 188 (1988) 120.
[6] P.A.M. Dirac, Lectures on Quantum Mechanics, Yeshiva Univ., New York, 1965.
[7] N. Christ, T.D. Lee, Phys. Rev. D 22 (1980) 939.
[8] R. Jackiw, Rev. Mod. Phys. 52 (1980) 661.
[9] L.V. Prokhorov, Sov. J. Nucl. Phys. 35 (1982) 129.
[10] L.V. Prokhorov, S.V. Shabanov, Sov. Phys. Uspekhi 34 (1991) 108.
[11] V.N. Gribov, Nucl. Phys. B 139 (1978) 1.
[12] I.M. Singer, Commun. Math. Phys. 60 (1978) 7; M.F. Atiyah, J.D.S. Jones, Commun. Math. Phys. 61 (1978) 97.
[13] O. Babelon, C.-M. Viallet, Phys. Lett. B 85 (1979) 246; O. Babelon, C.-M. Viallet, Commun. Math. Phys. 81 (1981) 515; M. Daniel, C.-M. Viallet, Rev. Mod. Phys. 52 (1980) 175.
[14] J.R. Klauder, E. Aslaksen, Phys. Rev. D 2 (1970) 272; J.R. Klauder, in: J.-P. Antoine, E. Tirapegui (Eds.), Functional Integration, Plenum Pub. Corp., New York, 1980.
[15] V.P. Maslov, M.V. Fedoriuk, Semi-Classical Approximation in Quantum Mechanics, D. Reidel Pub. Comp., Holland, Dordrecht, 1981.
[16] L.V. Prokhorov, S.V. Shabanov, Hamiltonian Mechanics of Gauge Systems, St.-Petersburg Univ. Press, St.-Petersburg, 1997 (in Russian).
[17] S.V. Shabanov, Int. J. Mod. Phys. A 6 (1991) 845.
[18] S.V. Shabanov, Theor. Math. Phys. 78 (1989) 292.
[19] L.V. Prokhorov, S.V. Shabanov, Phys. Lett. B 216 (1989) 341.
[20] M. Henneaux, C.
Teitelboim, Quantization of Gauge Systems, Princeton University Press, New Jersey, 1992.
[21] J. Govaerts, Hamiltonian Quantisation and Constrained Dynamics, Leuven Univ. Press, Leuven, 1991.
[22] H. Neuberger, Phys. Lett. B 183 (1987) 337.
[23] K. Fujikawa, Nucl. Phys. B 226 (1983) 437.
[24] K. Nishijima, Nucl. Phys. B 328 (1984) 601; M. Spiegelglas, Nucl. Phys. B 283 (1987) 205; W. Kalan, J.W. van Holten, Nucl. Phys. B 361 (1991) 471.
[25] F.G. Scholtz, S.V. Shabanov, Ann. Phys. (NY) 263 (1998) 119.
[26] R. Marnelius, M. Ögren, Nucl. Phys. B 351 (1991) 233.
[27] I.A. Batalin, R. Marnelius, Nucl. Phys. B 442 (1995) 669.
[28] N. Düchting, S.V. Shabanov, T. Strobl, Nucl. Phys. B 538 (1999) 485.
[29] K. Fujikawa, Prog. Theor. Phys. 61 (1979) 627; P. Hirschfeld, Nucl. Phys. B 157 (1979) 37; L.J. Carson, Nucl. Phys. B 266 (1986) 357; H. Kanno, Lett. Math. Phys. 19 (1990) 249; H. Yabuki, Ann. Phys. 209 (1991) 231; C. Becchi, C. Imbimbo, Nucl. Phys. B 462 (1996) 571; L. Baulieu, A. Rosenberg, M. Schaden, Phys. Rev. D 54 (1996) 7825; R. Zucchini, Commun. Math. Phys. 185 (1997) 723; M. Testa, Phys. Lett. B 429 (1998) 349.
[30] S. Helgason, Differential Geometry, Lie Groups, and Symmetric Spaces, Academic Press, New York, 1978.
[31] S. Helgason, Groups and Geometric Analysis, Academic Press, New York, 1984.
[32] D.P. Zhelobenko, Compact Lie Groups and Their Representations, Translations of Mathematical Monographs, Vol. 40, American Mathematical Society, Providence, RI, 1973.
[33] K. Huang, Quarks, Leptons and Gauge Fields, World Scientific, Singapore, 1982.
[34] M.A. Soloviev, Theor. Math. Phys. 78 (1989) 117.
[35] O. Loos, Symmetric Spaces, Benjamin, New York, 1969.
[36] M. Nakahara, Geometry, Topology, and Physics, IOP, Bristol, 1993.
[37] M.A. Soloviev, Theor. Math. Phys. 73 (1987) 3.
[38] S.G. Matinyan, G.K. Savvidi, N.G. Ter-Arutyunyan, Sov. Phys. JETP 53 (1981) 421.
[39] G.K. Savvidi, Phys. Lett. B 159 (1985) 325.
[40] B. de Wit, J. Hoppe, H. Nicolai, Nucl. Phys. B 305 (1988) 545.
[41] B. de Wit, M. Lüscher, H. Nicolai, Nucl. Phys. B 320 (1989) 135.
[42] E. Witten, Nucl. Phys. B 443 (1995) 85.
[43] T. Banks, W. Fischler, S.H. Shenker, L. Susskind, Phys. Rev. D 55 (1997) 5112.
[44] M.S. Narasimhan, T.R. Ramadas, Commun. Math. Phys. 67 (1979) 121.
[45] K.K. Uhlenbeck, Commun. Math. Phys. 83 (1982) 31.
[46] A.A. Migdal, Sov. Phys.-JETP 42 (1976) 413.
[47] S.G. Rajeev, Phys. Lett. B 212 (1988) 203.
[48] J.E. Hetrick, Y. Hosotani, Phys. Lett. B 230 (1989) 88.
[49] M. Mickelsson, Phys. Lett. B 242 (1990) 217.
[50] E. Langmann, G.W. Semenoff, Phys. Lett. B 296 (1992) 117.
[51] E. Langmann, G.W. Semenoff, Phys. Lett. B 303 (1993) 303.
[52] S.V. Shabanov, Phys. Lett.
B 318 (1993) 323.
[53] J.E. Hetrick, Int. J. Mod. Phys. A 9 (1994) 3153.
[54] S.V. Shabanov, Commun. Theor. Phys. 4 (1995) 1.
[55] H.G. Dosch, V.F. Muller, Fortschr. Phys. 27 (1979) 547.
[56] V. Kazakov, I. Kostov, Nucl. Phys. B 179 (1981) 283.
[57] B. Rusakov, Mod. Phys. Lett. A 5 (1990) 693.
[58] D. Fine, Commun. Math. Phys. 134 (1990) 273.
[59] E. Witten, Commun. Math. Phys. 141 (1991) 153.
[60] E. Witten, J. Geom. Phys. 9 (1992) 303.
[61] M. Blau, G. Thompson, J. Mod. Phys. A 7 (1991) 3781.
[62] G. 't Hooft, Nucl. Phys. B 153 (1979) 141.
[63] N. Bourbaki, Groupes et Algèbres de Lie, Masson, Paris, 1981 (Chapters 4–6).
[64] Kiyosi Ito (Ed.), Encyclopedic Dictionary of Mathematics, Vol. II, MIT Press, Cambridge, MA, 1987.
[65] I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series, and Products, Academic Press, New York, 1965.
[66] P.K. Mitter, C.-M. Viallet, Commun. Math. Phys. 79 (1981) 457.
[67] L.D. Faddeev, V.N. Popov, Phys. Lett. B 25 (1967) 30.
[68] L.D. Faddeev, Theor. Math. Phys. 1 (1968) 3.
[69] A.G. Izergin, V.E. Korepin, M.A. Semenov-Tyan-Shanskii, L.D. Faddeev, Theor. Math. Phys. 38 (1979) 1.
[70] J. Goldstone, R. Jackiw, Phys. Lett. B 74 (1978) 81.
[71] V.A. Matveev, A.N. Tavkhelidze, M.E. Shaposhnikov, Theor. Math. Phys. 59 (1984) 529.
[72] S.V. Shabanov, Phase Space Structure in Gauge Theories, JINR lectures for young scientists, Vol. 54, P2-89-533, JINR Publ. Dep., Dubna, 1989 (in Russian).
[73] G. 't Hooft, Nucl. Phys. B 190 (1982) 455.
[74] D. Amati, E. Elitzur, E. Rabinovici, Nucl. Phys. B 418 (1994) 45.
160
S.V. Shabanov / Physics Reports 326 (2000) 1}163
[75] D. Cangemi, R. Jackiw, Phys. Rev. D 50 (1994) 3913. [76] A. Yu. Alekseev, P. Schaller, T. Strobl, Phys. Rev. D 52 (1995) 7146. [77] B. DeWitt, in: S. Hawking, W. Israel (Eds.), General Relativity: An Einstein Centenary Survey, Cambridge Univ. Press, Cambridge, 1979; K. Kuchar, Phys. Rev. D 34 (1986) 3044. [78] R. Friedberg, T.D. Lee, Y. Pang, H. Ren, Ann. Phys. (NY) 246 (1996) 381. [79] H. HuK !el, G. Kelnhofer, Ann. Phys. (NY) 266 (1998) 417. [80] K. Fujikawa, Nucl. Phys. B 468 (1996) 355. [81] S. Edwards, Y.V. Gulyaev, Proc. Roy. Soc. A 279 (1964) 229. [82] A. Arthurus, Proc. Roy. Soc. A 313 (1969) 445. [83] A.A. Migdal, Phys. Rep. 102 (1983) 201. [84] A. Ashtekar, Phys. Rev. Lett. 57 (1986) 2244. [85] C. Rovelli, L. Smolin, Nucl. Phys. B 331 (1990) 80. [86] J. Baez, Lett. Math. Phys. 31 (1994) 213. [87] A. Ashtekar, J. Lewandowski, J. Math. Phys. 36 (1995) 2170. [88] D. Marolf, J.C.M. Mourao, Commun. Math. Phys. 170 (1995) 583. [89] P.A.M. Dirac, Can. J. Math. 2 (1950) 129; 3 (1951) 1. [90] P.G. Bergmann, Phys. Rev. 75 (1949) 680. [91] L.D. Faddeev, S.L. Shatashvilli, Phys. Lett. B 167 (1986) 225. [92] K. Hamachi, Lett. Math. Phys. 40 (1997) 257. [93] A. Ashtekar, G.T. Horowitz, Phys. Rev. D 26 (1982) 3342. [94] A.M. Polyakov, Gauge Fields and Strings, Harwood, New York, 1987. [95] S.V. Shabanov, Phys. Lett. B 255 (1991) 398. [96] H.G. Loos, Phys. Rev. 188 (1969) 2342. [97] H.G. Loos, J. Math. Phys. 11 (1970) 3258. [98] C. Emmrich, H. RoK mer, Commun. Math. Phys. 129 (1990) 69. [99] A. Auerbach, S. Kivelson, D. Niloce, Phys. Rev. Lett. 53 (1984) 411. [100] P. van Baal, Nucl. Phys. B 369 (1992) 259. [101] P. van Baal, in: P. van Baal (Ed.), Con"nement, Duality, and Nonperturbative Aspects of QCD, NATO ASI Series, Vol. B 368, Plenum Press, New York, 1998. [102] A.I. Vainshtein, V.I. Zakharov, V.A. Novikov, M.A. Shifman, Sov. Phys. Uspekhi 24 (1982) 195. [103] B. Podolsky, Phys. Rev. 32 (1928) 812. [104] B.S. De Witt, Rev. Mod. Phys. 29 (1957) 377. [105] K.S. 
Cheng, J. Math. Phys. 13 (1972) 1723. [106] H. Dekker, Physica 103A (1980) 586. [107] H. Jensen, H. Koppe, Ann. Phys. 63 (1971) 586. [108] R.C.T. da Costa, Phys. Rev. A 23 (1981) 1982. [109] N. Ogawa, Prog. Theor. Phys. 87 (1992) 513. [110] E. Nelson, J. Math. Phys. 5 (1964) 332. [111] T. Kato, H.F. Trotter, Paci"c Math. J. 8 (1958) 887. [112] H. Trotter, Proc. Amer. Math. Soc. 10 (1959) 545. [113] L.S. Schulmann, Techniques and Applications of Path Integration, Wiley, New York, 1981; G. Roepsdor!, Path Integral Approach to Quantum Physics. An Introduction, Springer, Berlin, 1994. [114] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, Clarendon Press, Oxford, 1990. [115] A. Yu. Alexeev, L.D. Faddeev, S. Shatashvilli, J. Geom. Phys. 5 (1989) 391. [116] W. Pauli, Pauli Lectures on Physics, Vol. 6, MIT Press, Cambridge, 1973, p. 171. [117] W. Janke, H. Kleinert, Lett. Nuovo Cimento 25 (1979) 297. [118] L.V. Prokhorov, Sov. J. Nucl. Phys. 39 (1984) 496. [119] S.V. Shabanov, J. Phys. A: Math. Gen. 24 (1991) 1199. [120] M.S. Marinov, M.V. Terentyev, Fortschr. Phys. 27 (1979) 511. [121] J.R. Klauder, Found. Phys. 27 (1997) 1467. [122] J.R. Klauder, S.V. Shabanov, Phys. Lett. B 398 (1997) 116.
S.V. Shabanov / Physics Reports 326 (2000) 1}163 [123] [124] [125] [126] [127] [128] [129] [130] [131]
[132] [133] [134] [135] [136] [137] [138] [139] [140] [141] [142] [143] [144] [145] [146] [147]
[148] [149] [150] [151] [152] [153] [154] [155] [156] [157] [158] [159] [160] [161]
161
J.R. Klauder, S.V. Shabanov, Nucl. Phys. B 511 (1998) 713. S.V. Shabanov, J.R. Klauder, Phys. Lett. B 435 (1998) 343. G. 't Hooft, Nucl. Phys. B 33 (1971) 173. R.P. Feynman, Acta Phys. Polon. 24 (1963) 697. B. de Witt, Phys. Rev. 162 (1967) 1192, 1239. L.V. Prokhorov, S.V. Shabanov, Vestik LGU, Ser. 4, 4 (1988) 68 (in Russian). S.V. Shabanov, in: F. Colomo et al. (Eds.) Constraint Theory and Quantization Methods, World Scienti"c, Singapore, 1994, p. 176. S.V. Shabanov, J.R. Klauder, Phys. Lett. B 456 (1999) 38. S.V. Shabanov, Path integral in holomorphic representation without gauge "xation, JINR preprint, E2-89-687, Dubna, 1989 (unpublished); see a revised version in: V.S. Yarunin, M.A. Smondyrev (Eds.), The Proceedings of `Path Integrals'96a, JINR Publ. Dept., Dubna, 1996, p. 133. S.V. Shabanov, The Role of Gauge Invariants in Path Integral Construction, JINR preprint, E2-89-688, Dubna, 1989 (unpublished). J.R. Klauder, Ann. Phys. (NY) 254 (1997) 419. J.R. Klauder, in: New non Perturbative Methods and Quantization on the Light Cone, Springer Veralg, Les Ulis, 1998, p. 45. B. DeWitt, Rev. Mod. Phys. 29 (1957) 377. I.W. Mayes, J.S. Dowker, J. Math. Phys. 14 (1973) 434. D. McLaughlin, L. Schulman, J. Math. Phys. 12 (1971) 2520. L.V. Prokhorov, Sov. J. Part. Nucl. 13 (1982) 456. F.A. Berezin, The Method of Second Quantization, Academic Press, New York, 1966. F.A. Berezin, in: A.A. Kirillov (Ed.), Introduction to Superanalysis, D. Reidel Publ. Co., Dordrecht, Holland, 1987. B. DeWitt, Supermanifolds, Cambridge Univ. Press, Cambridge, 1984. J.L. Martin, Proc. Roy. Soc. A 251 (1959) 536. J.L. Martin, Proc. Roy. Soc. A 251 (1959) 543. S.V. Shabanov, Mod. Phys. Lett. A 6 (1991) 909. P. Ramond, Field Theory: A Modern Primer, Benjamin-Cummings, Reading, MA, 1981. E.T. Whitteker, G.N. Watson, A Course of Mordern Analysis, Vol. 1, The University Press, Oxford, 1927, Sec. 2.37. S.V. Shabanov, Lectures on Quantization of Gauge Theories by the Path Integral Method, in: E.N. 
Gazis, G. Koutsoumbas, N.D. Tracas, G. Zoupanos (Eds.), The Proceedings of 4th Hellenic School on Elementary Particle Physics, vol. II, Corfu, September 1992, National Technical University, Athens, 1994, p. 272. G. Junker, J.R. Klauder, Eur. Phys. J. C 4 (1998) 173. R. Rajaraman, An Introduction to Solitons and Instantons in Quantum Field Theory, North-Holland, Elsevier, New York, 1982. L.V. Prokhorov, S.V. Shabanov, in: B. Markovsky et al. (Eds.), Topological Phases in Quantum Theory, World Scienti"c, Singapore, 1989, p. 354. S.W. Howking, Phys. Lett. B 195 (1987); Phys. Rev. D 37 (1988) 904. G.V. Lavrelashvili, E. Rubakov, P.G. Tinyakov, JETP Lett. 46 (1987) 167. S. Coleman, Nucl. Phys. B 307 (1988) 867; 310 (1988) 643. Y. Verbin, A. Davidson, Phys. Lett. B 299 (1989) 364. O. Bertolami, J.M. Mourao, R.F. Picken, I.P. Volobuev, Int. J. Mod. Phys. A 6 (1991) 4149. O. Bertolami, J.M. Mourao, Class. Quant. Grav. 8 (1991) 1271. B. DeWitt, Phys. Rev. D 160 (1967) 1113; C. Isham, J.E. Nelson, Phys. Rev. D 10 (1974) 3226; W.E. Blyth, C. Isham, Phys. Rev. D 11 (1975) 768. G.W. Gibbons, C.N. Pope, Commun. Math. Phys. 66 (1979) 627. E. Witten, Commun. Math. Phys. 80 (1981) 381. P. Schoen, S.T. Yau, Phys. Rev. Lett. 42 (1979) 547. S.V. Shabanov, Phys. Lett. B 272 (1991) 11.
162
S.V. Shabanov / Physics Reports 326 (2000) 1}163
[162] S.V. Shabanov, in: M.C. Bento et al. (Eds.), The Proceedings of the 1st Iberian Meeting on Classical and Quantum Gravity, World Scienti"c, Singapore, 1993, p. 322. [163] R. Feynman, Nucl. Phys. B 188 (1981) 479. [164] R. Jackiw, Diverse Topics in Theoretical and Mathematical Physics, World Scienti"c, Singapore, 1995. [165] I. Singer, AsteH risque (1985) 323. [166] C. Itzykson, J.-B. Zuber, Quantum Field Theory, McGraw-Hill, New York, 1980. [167] P.K. Mitter, C.M. Viallet, Commun. Math. Phys. 79 (1981) 457. [168] M.A. Semenov-Tyan-Shanskii, V.A. Franke, Zapiski Nauch. Sem. Len. Otdel. Mat. Inst. im. V.A. Steklov AN SSSR, 120 (1982) 159; Translation: Plenum Press, New York, 1986, p. 999. [169] G. Dell'Antonio, D. Zwanziger, in: P.H. Damgraad et al. (Eds.), Probabilistic Methods in Quantum Field Theory and Quantum Gravity, Plenum Press, New York, 1990. [170] G. Dell'Antonio, D. Zwanziger, Commun. Math. Phys. 138 (1991) 291. [171] D. Zwanziger, Nucl. Phys. B 209 (1982) 336. [172] G. Dell'Antonio, D. Zwanziger, Nucl. Phys. B 378 (1989) 333. [173] J. Fuchs, M.G. Schmidt, C. Schweigert, Nucl. Phys. B 426 (1994) 107. [174] J. Fuchs, The singularity structure of the Yang}Mills con"guration space, hep-th/9506005. [175] W. Kondracki, J.S. Rogulski, Dissertationes Mathematicae, Warsaw, CCL (1986) 1. [176] S. Donaldson, P. Kronheimer, The Geometry of Four Manifolds, Oxford University Press, Oxford, 1990. [177] I.M. Singer, Physica Scripta T 24 (1981) 817. [178] W. Kondracki, P. Sadowski, J. Geom. Phys. 3 (1983) 421. [179] A. Heil, A. Kersch, N. Papadopoulos, B. ReifenhaK user, F. Scheck, J. Geom. Phys. 7 (1990) 489. [180] J.M. Arms, J.E. Marsden, V. Moncrief, Commun. Math. Phys. 78 (1981) 455. [181] M. Asorey, P.K. Mitter, Ann. Inst. PoincareH , A 45 (1986) 61. [182] T. Pause, T. Heinzl, Nucl. Phys. B 524 (1998) 695. [183] W. Nahm, in: Z. Ajduk (Ed.), IV Warsaw Symposium on Elementary Particle Physics, 1981, p. 275. [184] J.D. Stack, S.D. Neiman, R.J. Wensley, Phys. Rev. 
D 50 (1994) 3399. [185] H. Shiba, T. Suzuki, Phys. Lett. B 333 (1994) 461. [186] J.R. Klauder, Beyond Conventional Quantization, Cambridge University Press, Cambridge, 1999. [187] L.V. Prokhorov, Yu. Malyshev, Sov. J. Nucl. Phys. 48 (1988) 890. [188] R.E. Cutkosky, Phys. Rev. Lett. 51 (1983) 538. [189] R.E. Cutkosky, Phys. Rev. D 30 (1984) 447. [190] R.E. Cutkosky, J. Math. Phys. 25 (1984) 939. [191] R.E. Cutkosky, K.C. Wang, Phys. Rev. D 36 (1987) 3825. [192] M. LuK scher, Nucl. Phys. B 219 (1983) 233. [193] J. Koller, P. van Baal, Ann. Phys. 174 (1987) 288. [194] J. Koller, P. van Baal, Nucl. Phys. B 302 (1988) 1. [195] P. van Baal, B. van den Heuvel, Nucl. Phys. B 417 (1994) 215. [196] D. Zwanziger, Nucl. Phys. B 209 (1982) 336. [197] D. Zwanziger, Nucl. Phys. B 345 (1990) 461. [198] D. Zwanziger, Nucl. Phys. B 412 (1994) 657. [199] P. Marenzoni, G. Martinelli, N. Stella, M. Testa, Phys. Lett. B 318 (1993) 511. [200] C. Bernard, C. Parrinello, A. Soni, Phys. Rev. D 49 (1994) 1585. [201] J.E. Mandula, M. Ogilvie, Phys. Lett. B 185 (1987) 127. [202] M.L. Paciello, C. Parrinello, S. Petrarca, B. Taglienti, A. Vladikas, Phys. Lett. B 289 (1992) 405. [203] M. Stingl, Phys. Rev. D 34 (1986) 3863. [204] M.A. Soloviev, JETP Lett. 38 (1983) 415. [205] M.A. Soloviev, Kratk. Soobshch. Fiz. 3 (1985) 29 (in Russian). [206] I.M. Gelfand, G.E. Shilov, Generalized Functions, Vol. 4, Academic Press, New York, 1964. [207] A. Hart, M. Teper, Phys. Rev. D 55 (1997) 3756. [208] D. Zwanziger, Nucl. Phys. B 485 (1997) 185. [209] A. Cucchieri, D. Zwanziger, Phys. Rev. Lett. 78 (1997) 3814.
S.V. Shabanov / Physics Reports 326 (2000) 1}163
163
[210] F.G. Scholtz, G.B. Tupper, Phys. Rev. D 48 (1993) 1585. [211] F.G. Scholtz, S.V. Shabanov, Supersymmetric quantization of gauge theories; hep-th/9509015. [212] F.G. Scholtz, S.V. Shabanov, in: I.M. Dremin, A.M. Semikhatov (Eds.), The Proceedings of 2nd Sakharov Conference on Physics, World Scienti"c, Singapore, 1997, p. 423. [213] J. Kogut, L. Susskind, Phys. Rev. D 11 (1975) 395. [214] L. Baulieu, D. Zwanziger, Nucl. Phys. B 548 (1999) 527. [215] C. Becchi, A. Rouet, R. Stora, Ann. Phys. 98 (1976) 287. [216] I.V. Tyutin, Gauge invariance in "eld theory and statistical mechanics, Lebedev Institute preprint, FIAN-39, 1975. [217] E.S. Fradkin, G.A. Vilkovisky, Quantization of relativistic systems with constraints: equivalence of canonical and covariant formalisms in quantum theory of gravitational "eld, CERN report TH-2332, 1977; I.A. Batalin, G.A. Vilkovisky, Phys. Lett. B 69 (1977) 309; L. Baulieu, Phys. Rep. 129 (1985) 3. [218] L.V. Prokhorov, Vestnik Leningr. Univ., Ser. 4, 1(4) (1983) 43. [219] J. Govaerts, J.R. Klauder, Ann. Phys. 274 (1999) 251.
S. Sieniutycz / Physics Reports 326 (2000) 165-258
Hamilton-Jacobi-Bellman framework for optimal control in multistage energy systems

Stanislaw Sieniutycz
Faculty of Chemical Engineering, Warsaw University of Technology, 1 Warynskiego Street, 00-645 Warszawa, Poland

Received November 1999; editor: I. Procaccia

Contents
1. Introduction
2. Basic theory
2.1. Hamilton-Jacobi-Bellman equations for continuous systems
2.2. Pontryagin's structure for discrete systems linear in holdup times
2.3. Outline of computational methods
3. Applications
3.1. Multistage endoreversible work-producing systems
3.2. Optimally controlled unit operations and unit processes
3.3. Processes spontaneously relaxing to the equilibrium
3.4. Thermal rays traveling along paths of least resistivity
3.5. Fermat's principle for propagating diffusion-reaction fronts
4. Concluding remarks
4.1. Advantages stemming from Hamiltonian descriptions
4.2. Outline of generalized theory for arbitrary discrete processes
4.3. Thermodynamic limits for finite rates
Acknowledgements
Appendix A
A.1. Notation
A.2. Greek letters
A.3. Subscripts
A.4. Superscripts
References
Abstract

We establish a parallelism between the structures of variational principles in mechanics and thermodynamics. It rests on the duality between the thermoeconomic problems of maximizing production profit and net profit, which carries over to the duality between least action and least abbreviated action in mechanics. With this parallelism in mind, we review the theory and macroscopic applications of a recently developed discrete formalism of Hamilton-Jacobi type, which arises when Bellman's method of dynamic programming is applied to optimize active (work-producing) and inactive (entropy-generating) multistage energy systems with free intervals of an independent variable. Our original contribution develops a generalized theory for discrete processes in which these intervals can reside in the model inhomogeneously and can be constrained. We consider applications to multistage thermal machines, controlled unit operations, spontaneous relaxations, nonlinear heat conduction, and self-propagating reaction-diffusion fronts. They all satisfy a basic functional equation that leads to the Hamilton-Jacobi-Bellman (HJB) equation and to a related discrete optimization algorithm with a maximum principle for a Hamiltonian. Correspondence with the well-known HJB theory for continuous processes is shown in the limit of an infinite number of stages. We show that a common unifying criterion, that of minimum generated entropy, can be proven to act locally in the majority of the cases considered, although the related global statements can be invalid far from equilibrium. General limits are found which bound the consumption of the classical work potential (exergy) for finite durations. © 2000 Elsevier Science B.V. All rights reserved.

PACS: 05.70.Ln; 47.27.Te; 44.30.+v

Keywords: Dynamic programming; Hamilton-Jacobi equation; Discrete systems; Optimal control; Variational principles; Waves and rays; Finite-time thermodynamics; Energy systems
1. Introduction

This paper synthesizes recent results obtained in multistage optimization of active (work-producing) and inactive (mechanical-energy-degrading) macroscopic systems. The mathematical basis is essentially Bellman's method of dynamic programming and the resulting maximum principles. Endoreversible multistage processes which yield maximum mechanical work, optimally controlled unit operations which minimize exergy costs, spontaneous relaxations towards equilibrium, thermal waves and self-propagating concentration fronts are all shown to satisfy a canonical discrete algorithm, with a maximum principle for a Hamiltonian with respect to external adjustable parameters (controls). Locally and for fixed end states, a common unifying criterion for all the processes considered is that of minimum entropy generation; the extremal structure is always canonical. A gauging approach that applies the idea of equivalent Lagrangians provides the simplest way to interpret the equations which govern various optimal performance functions (principal functions of optimization problems). Both multistage and continuous processes are considered; all discrete characteristics reach their continuous counterparts in the limit of an infinite number of stages.

The dynamic programming approach leads to a discrete Hamilton-Jacobi-Bellman equation and a discrete canonical set, and it extends to multistage processes the classical method of Pontryagin, in which a Hamiltonian is maximized with respect to the controls. Optimal performance functions which describe extremal work or minimal entropy generation are found in terms of the end states, the process duration and the number of stages. Alternatively, the Legendre transforms of the original functions with respect to the time variable can be generated; in this case the optimal performance functions are found in terms of the end states, the time penalty (Hamiltonian) and the number of stages.
Extensions of the approaches presented here to thermal fields in distributed-parameter systems are possible.

The optimization goals may be quite different for the variety of processes considered. For the class of work-producing endoreversible processes, the principal goals are enhanced bounds for the work produced by an engine system or consumed by a heat-pump system in high-rate regimes. For the class of optimally controlled unit operations, the principal goals are optimal decisions and optimal trajectories which minimize certain exergy costs, whereas the optimal values of these costs may be of secondary importance. For the class of spontaneously relaxing nonequilibrium processes, whose equations of dynamics are known, the principal goal is a form of the entropy production functional which assures the given dynamics. For heat-conducting solids, the principal goals are nonlinear effects caused by spatially distributed thermal resistances. For reaction-diffusion fronts with kinetics governed by the mass action law and diffusion-reaction couplings, the principal goal is a proper transformation of an exact field model into an optimal lumped-parameter model which satisfies the second law in terms of the wave fronts and rays, as an admissible approximation. In the last three cases, the appropriate structure of the entropy production integrand is the sum of two dissipation functions, of which the first is rate dependent and the second is state dependent. Our work shows, in particular, how to unify the optimization criteria for processes with imposed external control and those with internal relaxation. An important link between economics and the finite duration of a process is illustrated by the example of multistage heat and mass exchangers, especially for cascade processes of fluidized-bed heating and evaporation from porous solids.
Thermodynamic applications are outlined which extend the classical problem of minimal work to discrete processes carried out in equipment with a finite heat-transfer area.
In the basic discrete theory (Section 2.2), we consider a class of multistage optimal control processes which are linear with respect to the residence time interval (or a state variable interval). With the discrete version of dynamic programming, the necessary conditions of optimality are first determined in an original form which contains a Hamilton-Jacobi equation with a delayed time argument. To avoid the considerable difficulties associated with solving equations of this sort, these necessary conditions are transformed to a form described by a discrete Hamiltonian. It is then shown that in multistage autonomous systems with free intervals of the residence time, a Pontryagin-like Hamiltonian emerges which is constant along the optimal discrete trajectory. From a physical standpoint, this constant-Hamiltonian condition is a generalization of the energy conservation condition as applied to optimal discrete systems with free intervals of time. On this basis, a canonical formalism strongly analogous to those of analytical mechanics and the optimal control theory of continuous systems is introduced and analyzed. Benefits and limitations of the basic theory are also discussed.

Bellman's method of dynamic programming (DP) is applied to derive necessary optimality conditions for both continuous and discrete processes [1,2]. A general approach to multistage processes, developed here in the form of the so-called stage criterion (Section 2.2), allows one to pass from DP results to the discrete maximum principle, which is a very powerful computational tool. Bellman's principle of optimality is crucial both for the existence of the optimal performance potentials and for the derivation of the pertinent dynamic programming equation, which describes these potentials. The optimality principle makes it possible to replace the simultaneous evaluation of all optimal controls by a sequence of local evaluations of optimal controls at stages, for evolving subprocesses.
When the optimal performance function is generated in terms of the initial states and initial time, the principle of optimality may be stated as follows: In a continuous or discrete process which is described by an additive performance criterion, the optimal strategy and the optimal profit function are functions of the initial state, initial time and (in a discrete process) the total number of stages. A consequence of this property is that each final segment of an optimal path (continuous or discrete) is optimal with respect to its initial state and initial time and (in a discrete process) the corresponding number of stages. The proof of this formulation, by contradiction, uses the additivity property of the performance criterion [1,2].

The above formulation of the optimality principle refers to the so-called backward algorithm of the dynamic programming method. In this algorithm, the recursive optimization procedure for solving the governing functional equation begins at the final process state and terminates at the initial state. Consequently, local optimizations take place in the direction opposite to the direction of physical time or the direction of the flow of matter. (The process to which this may be applied is arbitrary: it may be discrete by nature or may be obtained by discretizing an original continuous process.) In the backward algorithm the state transformations take their most natural form, as they describe output states in terms of input states and controls at a stage. The optimization at a stage and the optimal functions recursively involve the information generated in earlier subprocesses. In the continuous case this method leads directly to the basic equation of optimal continuous processes, the so-called Hamilton-Jacobi-Bellman equation, which constitutes a control counterpart of the well-known Hamilton-Jacobi equation of classical mechanics [3,4]. However, as we shall see in Section 2.2, a similar equation can be derived only for special discrete processes, those with unconstrained $\theta^n$.
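The backward algorithm described above can be sketched in a few lines. This is only a minimal illustration, not the procedure of this paper: it assumes a scalar state restricted to a finite grid, the same fixed interval `theta` at every stage, the output-from-input transformation `x_next = x + u * theta`, and a hypothetical stage cost supplied by the caller. The names `backward_dp` and `optimal_path` are introduced here for illustration only.

```python
def backward_dp(grid, n_stages, theta, x_final, stage_cost):
    """Backward dynamic programming on a state grid (illustrative sketch).

    V[k][i] is the minimal cost of the remaining n_stages - k stages when the
    subprocess starts at grid[i]; the recursion runs from the final state
    towards the initial one, i.e. against the direction of physical time.
    """
    inf = float("inf")
    V = [[inf] * len(grid) for _ in range(n_stages + 1)]
    policy = [[None] * len(grid) for _ in range(n_stages)]
    # Boundary condition: only the prescribed final state is admissible.
    V[n_stages] = [0.0 if x == x_final else inf for x in grid]
    for k in range(n_stages - 1, -1, -1):
        for i, x in enumerate(grid):
            for j, x_next in enumerate(grid):
                if V[k + 1][j] == inf:
                    continue  # the rest of the process cannot reach x_final
                u = (x_next - x) / theta  # control = rate of change of the state
                cost = stage_cost(x, u) * theta + V[k + 1][j]
                if cost < V[k][i]:
                    V[k][i] = cost
                    policy[k][i] = j
    return V, policy


def optimal_path(grid, policy, x_initial):
    """Follow the stored optimal decisions forward from a given initial state."""
    i = grid.index(x_initial)
    path = [grid[i]]
    for stage_policy in policy:
        i = stage_policy[i]
        path.append(grid[i])
    return path


if __name__ == "__main__":
    grid = [0.0, 1.0, 2.0, 3.0, 4.0]
    # Hypothetical quadratic stage cost u**2, chosen only for illustration.
    V, policy = backward_dp(grid, 4, 1.0, 4.0, lambda x, u: u * u)
    print(V[0][0], optimal_path(grid, policy, 0.0))
```

Each final segment of the path returned by `optimal_path` is itself optimal for its own initial state and its remaining number of stages, which is the property stated by the optimality principle in its original form.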
However, one may also generate the optimal profit function in terms of the final states and final time. The optimality principle then has its dual form: In a continuous or discrete process which is described by an additive performance criterion, the optimal strategy and the optimal profit function are functions of the final state, final time and (in a discrete process) the total number of stages. A consequence of this property is that each initial segment of the optimal path (continuous or discrete) is optimal with respect to its final state and final time and (in a discrete process) the corresponding number of stages.

This formulation refers to the so-called forward algorithm of the dynamic programming method. In this algorithm the recursive optimization procedure for solving the governing functional equation begins at the initial process state and terminates at the final state. With the forward DP algorithm, one makes local optimizations in the direction of real time. It is the dual formulation of the optimality principle, and the associated forward algorithm, that we commonly apply to the multistage processes considered in this paper. The state transformations used in this case describe input states in terms of output states and controls at a process stage. Transformations of this sort are directly obtainable for multistage processes with ideal mixing at the stage; otherwise the inverse transformations (applicable to the backward algorithm) may be difficult to obtain in explicit form. Again, as in the case of the original form of the optimality principle, its dual form makes it possible to replace the simultaneous evaluation of all optimal controls by a sequence of successive evaluations of optimal controls for evolving optimal subprocesses. A suitable representation of the optimality principle is contained in recurrence equations.
To introduce them, we restrict ourselves in this section to a special subset of recurrence equations which refers to problems of classical variational calculus and analytical mechanics. In equations of this subset, the controls are identical with the process rates, and the dimensionality of the control vector is that of the state. An example is the recurrence equation for a multistage process in which a fluid of specific heat c is heated sequentially in a cascade of heat pumps (Fig. 1):

$$ R^n(T^n, t^n) = \min_{u^n} \left[ c\left(1 - \frac{T^e}{T^n + s u^n}\right) u^n \theta^n + R^{n-1}\left(T^n - u^n\theta^n,\; t^n - \theta^n\right) \right], \tag{1.1} $$

where $R^n$ is the optimal cost function of the n-stage subprocess and the c-term is a generalized cost (here: the specific work) consumed at stage n. The control variable $u^n$ is the rate of change of the temperature $T^n$ with respect to the holdup time $t^n$; the proportionality coefficient s is the reciprocal of a time constant. The quantity $\theta^n$ is the interval of time $t^n$ at stage n. This model is related to the cost function (3.17) of Section 3.1.1.

For the forward-backward duality considered above, the performance index is the same in the original and in the dual problem, yet the numerically generated functions of optimal performance, which describe generalized or time-dependent potentials, are different. Regardless of the appropriately imposed boundary conditions (initial in the forward algorithm and final in the backward algorithm), the optimal functions of the forward algorithm are potentials in the space of final states and final times, whereas those of the backward algorithm are potentials in the space of initial states and initial times (in discrete problems, the total number of stages, N, is an additional variable). Thus, the two numerical DP approaches generate in each case different functions which characterize generalized potentials.
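As a rough numerical illustration of the forward recurrence (1.1), the sketch below tabulates $R^n$ on a temperature grid. It is not the computational scheme of this paper. It assumes a fixed interval `theta` at every stage (so that the time coordinate is tied to the stage index and need not be stored), heating-only stages, and purely hypothetical parameter values; the function name `forward_dp_heat_pump` is likewise an assumption made for illustration.

```python
import math


def forward_dp_heat_pump(temps, n_stages, theta, T_init, T_env, c, s):
    """Forward DP for recurrence (1.1) with a fixed interval theta per stage.

    R[n][j] is the minimal work of an n-stage subprocess that starts at
    T_init and ends at temps[j], so the optimal function is tabulated in the
    space of final states, as in the forward algorithm.
    """
    inf = math.inf
    # Zero-stage subprocess: only the initial temperature is attainable.
    R = [[0.0 if T == T_init else inf for T in temps]]
    for n in range(1, n_stages + 1):
        row = []
        for T_n in temps:
            best = inf
            for j, T_prev in enumerate(temps):
                if T_prev > T_n or R[n - 1][j] == inf:
                    continue  # heating only; skip unreachable inlet states
                u = (T_n - T_prev) / theta  # implies T^(n-1) = T^n - u^n * theta^n
                stage_work = c * (1.0 - T_env / (T_n + s * u)) * u * theta
                best = min(best, stage_work + R[n - 1][j])
            row.append(best)
        R.append(row)
    return R


if __name__ == "__main__":
    # Hypothetical data: grid from 300 K to 400 K in 5 K steps.
    temps = [300.0 + 5.0 * k for k in range(21)]
    R = forward_dp_heat_pump(temps, 4, 5.0, 300.0, 300.0, 1.0, 10.0)
    reversible = 1.0 * ((400.0 - 300.0) - 300.0 * math.log(400.0 / 300.0))
    print(R[4][-1], reversible)
```

In this sketch a stage with $u^n = 0$ costs nothing, so adding stages at fixed `theta` can never raise the tabulated minimum, and for heating stages the computed work stays above the reversible value $c[(T^f - T^i) - T^e \ln(T^f/T^i)]$ obtained when the term $s u^n$ is dropped.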
Yet, for analytical approaches, in a seemingly special but in fact basic case in which both potential functions contain the coordinates of an arbitrary but fixed boundary point as parameters, the analytical expressions for the forward and backward potentials do coincide. This is, in fact, the approach applied in Section 2; with the suitable equations obtained in this case, any special problem with free boundary conditions can be derived by considering variations of boundary points.

Fig. 1. Application of Bellman's principle of optimality to a multistage heat pump with the single state variable $x = T$ and the single control $u = \Delta T/\Delta\tau$, which is optimized by the forward algorithm of the dynamic programming method. The nondimensional time $\tau$ (the number of transfer units) and its interval $\theta$ at a stage constitute the extra coordinate of state and the extra control at the stage. Ellipse-shaped balance areas embrace successive subprocesses which evolve by inclusion of remaining stages. In the multidimensional case the temperature T is replaced by the state vector $\mathbf{x}$ and the control u by the control vector $\mathbf{u}$.

However, there also exists another type of duality, in which the performance criteria in the original and dual problems are different. It is particularly interesting to physicists, as it is related to the parallel structures of variational principles in mechanics and thermodynamics. To illustrate it, let us recall a typical continuous problem of thermodynamic optimization in which an integral profit criterion is to be maximized,
$$ S \equiv \int_{t^i}^{t^f} f_0(x, t, u)\, dt, \tag{1.2} $$

subject to certain differential constraints. We will call the problem related to criterion (1.2) the original optimization problem. The functional S is a generalized thermodynamic profit which has to attain a maximum in an optimal process of finite duration $T = t^f - t^i$ (in an alternative formulation the negative of S can be minimized). There are also process constraints which link the state vector x with the control vector u and the process time t, but their general form is not specified here; see Section 2. Here we restrict ourselves to the case when the only constraints are $u = \dot{x}$, which we call the case of classical mechanics and variational calculus. Indeed, substituting the rate $\dot{x}$ for the control vector u in integral (1.2) leads to the classical problem of variational calculus. In mechanics the function $f_0$ is the negative of the classical Lagrangian, L. In thermodynamic optimization, $f_0$ describes the generation rate of a thermodynamic quantity of profit type, for example, the total work produced or the negative of the total entropy produced. In thermoeconomic extensions of thermodynamic optimization, $f_0$ can describe quite exactly an economic profit, with some terms related to the total work or to the entropy production.
Whenever the mathematical model (including the function $f_0$ and all constraints) does not contain the time t explicitly, instead of Eq. (1.2) one can maximize a modified criterion of the 'net profit' type,

$$ S_H \equiv S - h(t^f - t^i) = \int_{t^i}^{t^f} \left( f_0(x, u) - h \right) dt. \tag{1.3} $$

In a corresponding discrete process, exemplified in Fig. 1, i.e. in the process whose limit for an infinite number of stages is described by the above continuous model, the original and dual performance criteria are the appropriate sums

$$ S^N \equiv \sum_{n=1}^{N} f_0^n(x^n, u^n)\,\theta^n \tag{1.4} $$

and

$$ S_H^N \equiv S^N - h(t^N - t^0) = \sum_{n=1}^{N} \left( f_0^n(x^n, u^n) - h \right)\theta^n. \tag{1.5} $$

These sums are maximized subject to suitable difference constraints operative at each stage n. They link the state vector $x^n$ with the control vector $u^n$ and the time interval at the stage, $\theta^n$. Restricting ourselves (as in the continuous case) to the situation when the constraining vector equation has the form $u^n = (x^n - x^{n-1})/\theta^n$, we treat the discrete counterpart of the continuous constraint $u = \dot{x}$. This implies discrete state transformations of the special form $x^{n-1} = x^n - u^n\theta^n$, which is just the form that appears in Bellman's recurrence equations of the original and dual problems, Eqs. (1.9) and (1.10) below. The function $f_0^n$ describes the discrete rate of profit generation at stage n. Whenever the mathematical model (including $f_0^n$) does not contain the time $t^n$ explicitly, which is the case for Eq. (1.4), one can maximize the modified criterion of the 'net profit' type, Eq. (1.5), instead of the original criterion (1.4).

It is due to the thermoeconomic interpretation of h as a measure of the investment cost associated with an increase of the process duration by one unit (Section 3.2.1) that the change of the original criterion S into its dual $S_H$ can be interpreted as the passage from the production-related profit S to the net profit $S_H$. The associated benefit may be a reduction of the dimensionality of the problem and related simplifications in the solving procedures, numerical or analytical; see the examples in Sections 2.3, 3.1 and 3.2. In continuous problems of mechanics, however, the constant h has a numerical value equal to the physical energy, which equals the negative of the Lagrange multiplier for the duration constraint. In this case, the negative of $f_0$ is the mechanical Lagrangian L. Then maximizing criteria (1.2) and (1.3) is consistent with minimizing the well-known action functionals,
$$A \equiv \int_{t^i}^{t^f} L(\mathbf{x}, \mathbf{u})\, dt \tag{1.6}$$

and

$$A_* \equiv \int_{t^i}^{t^f} \bigl(L(\mathbf{x}, \mathbf{u}) + h\bigr)\, dt . \tag{1.7}$$
S. Sieniutycz / Physics Reports 326 (2000) 165–258
With $\mathbf{u} = \dot{\mathbf{x}}$, $A = -S$ and $A_* = -S_*$ are, respectively, the action and the abbreviated action in problems of classical mechanics [5]. Clearly, both original and dual problems can be stated as minimization problems, similar to the familiar 'least action' problems of mechanics. This observation places our formulations and related analogies close to those common in classical mechanics, in which a minimum rather than a maximum of an action integral occurs along an optimal path. In fact, the constant $h$ in Eq. (1.7) is the numerical value of the energy or Hamiltonian along an autonomous extremal path.

Regarding discrete models, we note again that the negative of the original performance index, or the action $A^N = (-S)^N$, is a cost-like criterion that has to be minimized. Its dual $A_*^N = (-S_*)^N$ is also a cost-like criterion, thus it has to be minimized too. In thermodynamic formulations of this sort we use various thermoeconomic analyses to introduce discrete cost intensities $l_0^n = -f_0^n$ as certain discrete Lagrangians, by analogy with the Lagrangian $L$ of classical mechanics. Yet, when the classical mechanical Lagrangian $L$ (the difference between the kinetic and potential energy) is applied for $l_0^n$ or $-f_0^n$, in which the transformation $\mathbf{x}^{n-1} = \mathbf{x}^n - \mathbf{u}^n\theta^n$ is the discrete representation of the continuous constraint $\mathbf{u} = \dot{\mathbf{x}}$, the classical mechanical action $A$ can be calculated as $\lim (-S)^N$ for an infinite $N$. The discrete analogs of the action integrals (1.6) and (1.7), obtained as sums (1.4) and (1.5) related to Eqs. (1.6) and (1.7), are obvious. Yet there is a difficulty, since in the phase space of a discrete process (Section 2.2) the constancy of a related (autonomous and discrete) Hamiltonian $H^n$ holds only under the stationarity condition for the optimal time intervals $\theta^n$; for arbitrary or locally constrained $\theta^n$, the constancy of $H^n$ is violated. This is the case (Section 2.2) in which a discrete equation of Hamilton–Jacobi type no longer holds.
The equation which describes the identity of the numerical value of $H^n \equiv h$ with the negative Lagrange multiplier of the time constraint, $p_t^n$, shows that for locally constrained $\theta^n$ only the Lagrange multiplier of the time constraint, $p_t^n$, is a constant of a discrete autonomous path. Thus, in situations when the intervals $\theta^n$ are not free, the constant $p_t^n$ should be applied instead of the negative of $H$ in discrete optimization algorithms.

To maximize the discrete criteria (1.4) and (1.5), or to minimize their negatives by dynamic programming, Bellman's recurrence equations are applied. Section 2.2 presents a basic optimization theory based on such equations and Section 2.3 discusses related computational issues. Here we only introduce equations of this sort in their special form relevant to criteria (1.4) and (1.5) and the multistage process shown in Fig. 1. Defining the optimal performance functions of $n$-stage subprocesses for the original and dual problem as

$$R^n(\mathbf{x}^n, t^n) \equiv \min(-S^n), \qquad R_*^n(\mathbf{x}^n, h^n) \equiv \min(-S_*^n) , \tag{1.8}$$

the corresponding recurrence equations are, respectively,

$$R^n(\mathbf{x}^n, t^n) = \min_{\mathbf{u}^n,\, \theta^n} \bigl[\, l_0^n(\mathbf{x}^n, \mathbf{u}^n)\,\theta^n + R^{n-1}(\mathbf{x}^n - \mathbf{u}^n\theta^n,\; t^n - \theta^n) \bigr] , \tag{1.9}$$

$$R_*^n(\mathbf{x}^n, h) = \min_{\mathbf{u}^n,\, \theta^n} \bigl[\, \bigl(l_0^n(\mathbf{x}^n, \mathbf{u}^n) + h\bigr)\theta^n + R_*^{n-1}(\mathbf{x}^n - \mathbf{u}^n\theta^n,\; h) \bigr] . \tag{1.10}$$
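As an illustration only, the dual recurrence (1.10) can be tabulated on a small grid. The sketch below is not taken from the paper: it assumes a scalar state, a hypothetical cost intensity $l_0 = u^2/2$, a process that starts at the first grid point, and hypothetical names (`dual_cost_table`, `x_grid`, `theta_grid`). Instead of scanning controls directly, it scans the previous state and the stage duration and recovers the control from the constraint $u^n = (x^n - x^{n-1})/\theta^n$.

```python
import math

def dual_cost_table(x_grid, theta_grid, h, n_stages, intensity):
    """Tabulate R_*^n(x) for n = 0..n_stages using the dual recurrence (1.10).

    Assumption: the process starts at x_grid[0], so R_*^0 is 0 there and
    infinite elsewhere. The control follows from u = (x - x_prev) / theta.
    """
    prev = [0.0] + [math.inf] * (len(x_grid) - 1)
    tables = [prev]
    for _ in range(n_stages):
        cur = []
        for x in x_grid:
            best = math.inf
            for x_prev, r_prev in zip(x_grid, prev):
                if math.isinf(r_prev):
                    continue  # state not reachable with n-1 stages
                for theta in theta_grid:
                    u = (x - x_prev) / theta
                    # stage cost (l_0 + h) * theta plus optimal cost of the rest
                    best = min(best, (intensity(u) + h) * theta + r_prev)
            cur.append(best)
        tables.append(cur)
        prev = cur
    return tables

x_grid = [0.25 * k for k in range(5)]         # states 0, 0.25, ..., 1
theta_grid = [0.25 * k for k in range(1, 5)]  # admissible stage durations
tables = dual_cost_table(x_grid, theta_grid, h=0.5, n_stages=3,
                         intensity=lambda u: 0.5 * u * u)
```

For this assumed intensity the stage cost $(u^2/2 + h)\theta$ is minimized at $\theta = |x - x_{\text{prev}}|/\sqrt{2h}$, so the tabulated cost at the end point should equal $\sqrt{2h}\,|x - x^0|$ regardless of the number of stages, as long as the grids contain the optimal durations. Note that the time $t^n$ never enters the table, which is the dimensionality reduction discussed in the text.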
In the formalism of analytical mechanics, the functions $R$ and $R_*$ represent, respectively, the potentials of optimal action and optimal abbreviated action; in thermodynamic optimization they represent the potentials of optimal production costs and optimal total costs. The first function incorporates the original discrete transformations for the state vector $\mathbf{x}^n$ and time $t^n$: $\mathbf{x}^{n-1} = \mathbf{x}^n - \mathbf{u}^n\theta^n$ and $t^{n-1} = t^n - \theta^n$. The second (the dual, which does not contain the time $t^n$) applies the same transformation for the state vector along with the constancy property for the Hamiltonian along the discrete path, $h^{n-1} = h^n \equiv h$. (See the proof of this property in Section 2.2.) The choice of the constant $h$ influences the resulting duration of the process.

At this point, a comment on the role of boundary conditions is in order. Some of the end coordinates of the state vector ($\mathbf{x}^0$ and $\mathbf{x}^N$) and one of the end times ($t^N$ or $t^0$) may be fixed, but the total duration, $T^N \equiv t^N - t^0$, must be free. In an optimal process its duration follows for an assumed $h$ as a function of all fixed end coordinates and the total number of stages. The dimensionality reduction, possible in thermodynamic optimization due to the constancy of $h$, resembles the use of the energy integral in mechanics, in stationary problems with undetermined duration $T$. Analytical and numerical aspects of a reduced thermal problem are displayed in Section 3.1 on an example of multistage heating of a fluid in a cascade of heat pumps. Accuracy of the numerical results improves when the state variable $t^n$ is eliminated by using the property $h^{n-1} = h^n \equiv h$.

To compare the original and dual process, consider necessary optimality conditions for a stationary extremum of the right-hand sides of Eqs. (1.9) and (1.10). Dealing with the discrete case (as more general than the continuous) we search for the stationary minimum of $A^N$ or $(-S)^N$ with respect to the controls $\theta^n$ in Eqs. (1.9) and (1.10). Setting the respective partial derivatives to zero, we obtain the relationships

$$-\frac{\partial R^{n-1}(\mathbf{x}^{n-1}, t^{n-1})}{\partial t^{n-1}} = \frac{\partial R^{n-1}(\mathbf{x}^{n-1}, t^{n-1})}{\partial \mathbf{x}^{n-1}} \cdot \mathbf{u}^n - l_0^n(\mathbf{x}^n, \mathbf{u}^n) , \tag{1.11}$$

$$h = \frac{\partial R_*^{n-1}(\mathbf{x}^{n-1}, h)}{\partial \mathbf{x}^{n-1}} \cdot \mathbf{u}^n - l_0^n(\mathbf{x}^n, \mathbf{u}^n) , \tag{1.11a}$$
where it should be noted that $h$ and $l_0$ are connected by the Legendre transformation. The optimal costs $R^n$ and $R_*^n$ are linked by an associated equation

$$R_*^n(\mathbf{x}^n, h) \equiv R^n(\mathbf{x}^n, t^n) + hT^n , \tag{1.12}$$

where $T^n \equiv t^n - t^0$ is the process duration. In thermodynamic optimization, this equation describes the relation between optimal total costs, $R_*^n$, and optimal production costs, $R^n$, with the term $hT^n$ measuring investment costs. In mechanics, obtained in the continuous limit, the equation links the optimal action with the optimal abbreviated action along an extremal. Eq. (1.12) is an immediate consequence of Eq. (1.5) and the definitions of $R^n$ and $R_*^n$. All these equations link the original and dual optimal costs, $R^n$ and $R_*^n$.

From a formula describing the total differential of the potential function $R^n$ we find

$$d\!\left(R^n - \frac{\partial R^n}{\partial T^n}\,T^n\right) = \frac{\partial R^n}{\partial \mathbf{x}^n} \cdot d\mathbf{x}^n + T^n\, d\!\left(-\frac{\partial R^n}{\partial T^n}\right) , \tag{1.13}$$

whereas for the potential function $R_*^n$

$$dR_*^n = \frac{\partial R_*^n}{\partial \mathbf{x}^n} \cdot d\mathbf{x}^n + \frac{\partial R_*^n}{\partial h}\, dh . \tag{1.14}$$
Moreover, as implied by Eq. (1.12), for every $n$

$$\frac{\partial R_*^n(\mathbf{x}^n, h)}{\partial \mathbf{x}^n} = \frac{\partial R^n(\mathbf{x}^n, T^n)}{\partial \mathbf{x}^n} . \tag{1.15}$$

This shows that the functions $R^n$ and $R_*^n$ have the same optima with respect to the final state variables. By virtue of the above relationships, comparison of Eqs. (1.11) and (1.11a) with Eqs. (1.13) and (1.14) yields explicit information about the optimal duration

$$T^n = \left(\frac{\partial R_*^n(\mathbf{x}^n, h)}{\partial h}\right)_{\mathbf{x}^n} , \tag{1.16}$$

whereas the optimal unit price for equipment satisfying this duration is

$$h = -\left(\frac{\partial R^{n-1}(\mathbf{x}^{n-1}, T^{n-1})}{\partial T^{n-1}}\right)_{\mathbf{x}^{n-1}} = -\left(\frac{\partial R^n(\mathbf{x}^n, T^n)}{\partial T^n}\right)_{\mathbf{x}^n} . \tag{1.17}$$

For the potentials $R^n$ and $R_*^n$, described by Eqs. (1.12), (1.16) and (1.17), the duration-related optimal performance functions $R^n$, which act in the space of variables $\mathbf{x}$, $T$ and $N$, follow from the asterisk potentials as Legendre transforms of the latter with respect to $h$:

$$R^n = R_*^n - h\,\frac{\partial R_*^n}{\partial h} . \tag{1.18}$$

The inverse transformation is also valid:

$$R_*^n = R^n - T^n\,\frac{\partial R^n}{\partial T^n} . \tag{1.19}$$
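The Legendre structure of Eqs. (1.12) and (1.16)–(1.19) can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not the paper's example: it takes the hypothetical cost intensity $l_0 = u^2/2$ in the continuous limit, for which the optimal cost over a displacement $x$ in time $T$ is $R = x^2/(2T)$ and the dual potential is $R_* = \sqrt{2h}\,|x|$. The helper `central` is a plain central finite difference.

```python
import math

def R(x, T):
    """Optimal cost (action) for the assumed intensity l0 = u**2 / 2."""
    return x * x / (2.0 * T)

def R_star(x, h):
    """Dual potential (abbreviated action) for the same assumed intensity."""
    return math.sqrt(2.0 * h) * abs(x)

def central(f, z, eps=1e-6):
    """Central finite-difference approximation of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

x, T = 3.0, 2.0
h = -central(lambda T_: R(x, T_), T)            # Eq. (1.17): h = -dR/dT
T_back = central(lambda h_: R_star(x, h_), h)   # Eq. (1.16): T = dR_*/dh
gap = R_star(x, h) - (R(x, T) + h * T)          # Eq. (1.12): should vanish
```

With these assumed functions, `h` comes out as $x^2/(2T^2)$, `T_back` reproduces the duration `T`, and `gap` is zero up to finite-difference error, so the pair $(T, h)$ behaves like conjugate variables of the two potentials.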
Eq. (1.17) describes the duration penalty along an optimal path of an autonomous process, as the negative derivative of the optimal cost $R^n$ with respect to the duration $T^n$. The equality of the two partial derivatives in this equation is particularly easy to discover when both sides of the (autonomous) Eq. (1.9) are differentiated with respect to the time $t^n$. (As the differentiation refers to an extremal process, the extremizing sign is immaterial.)

We can also investigate changes of the derivative $p_t^n = \partial R^n/\partial T^n$ when the state coordinates are subject to a thermodynamic transformation. Assume that in one coordinate frame the thermodynamic process is described by the state vector $\mathbf{x}$, whereas in another frame the state vector for the same process is $\mathbf{x}'$. An example can be an optimal evolution in time of an isobaric evaporation of moisture from a solid (Section 3.2); in the frame $\mathbf{x}$ the process is described by solid enthalpy and solid moisture content, whereas in the frame $\mathbf{x}'$ it is described in terms of solid temperature and chemical potential of moisture. Clearly, the transformations which link coordinates of these two frames are (nonlinear) thermodynamic transformations which do not contain the time $t$ explicitly; in this case $t$ is only a parameter. The optimal cost functions $R(\mathbf{x}, t)$ and $R_*(\mathbf{x}, h)$ and their total time derivatives must be invariants of thermodynamic transformations. Therefore, along with the equality $R^n(\mathbf{x}, T) = R'^n(\mathbf{x}', T)$,

$$\frac{dR^n(\mathbf{x}^n, t^n)}{dt^n} = \frac{dR'^n(\mathbf{x}'^n, t^n)}{dt^n} , \tag{1.20}$$

which means that

$$\frac{\partial R^n}{\partial t^n} + \frac{\partial R^n}{\partial \mathbf{x}^n} \cdot \frac{d\mathbf{x}^n}{dt^n} = \frac{\partial R'^n}{\partial t^n} + \frac{\partial R'^n}{\partial \mathbf{x}'^n} \cdot \frac{d\mathbf{x}'^n}{dt^n} . \tag{1.21}$$
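Whence, since the transformation $d\mathbf{x} = (\partial \mathbf{x}/\partial \mathbf{x}')\, d\mathbf{x}'$ is valid, we obtain

$$\frac{\partial R'^n}{\partial t^n} = \frac{\partial R^n}{\partial t^n} , \tag{1.22}$$

$$\frac{\partial R'^n}{\partial \mathbf{x}'^n} = \frac{\partial R^n}{\partial \mathbf{x}^n} \cdot \frac{\partial \mathbf{x}^n}{\partial \mathbf{x}'^n} . \tag{1.23}$$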
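The transformation rules (1.22) and (1.23) are the chain rule applied to a time-independent change of state coordinates. A minimal numerical sketch, assuming a hypothetical scalar potential $R = x^2/(2t)$ and a hypothetical coordinate change $x = \exp(x')$ (neither is from the paper), is:

```python
import math

def R(x, t):
    return x * x / (2.0 * t)      # assumed cost potential in the frame x

def x_of(xp):
    return math.exp(xp)           # assumed time-independent change x = x(x')

def R_prime(xp, t):
    return R(x_of(xp), t)         # the same potential written in the frame x'

def central(f, z, eps=1e-6):
    """Central finite-difference approximation of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

xp, t = 0.3, 2.0
x = x_of(xp)
p_t = central(lambda s: R(x, s), t)                # time adjoint in frame x
p_t_prime = central(lambda s: R_prime(xp, s), t)   # time adjoint in frame x'
p_x = central(lambda z: R(z, t), x)                # state adjoint in frame x
p_x_prime = central(lambda z: R_prime(z, t), xp)   # state adjoint in frame x'
dx_dxp = central(x_of, xp)                         # Jacobian dx/dx'
```

Here `p_t` and `p_t_prime` coincide (scalar behaviour, Eq. (1.22)), while `p_x_prime` equals `p_x * dx_dxp` up to finite-difference error (covariant behaviour, Eq. (1.23)).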
Eqs. (1.22) and (1.23) represent transformation rules for the "time adjoint" $p_t^n \equiv \partial R^n/\partial t^n$ and the "state adjoint vector" $\mathbf{p}^n \equiv \partial R^n/\partial \mathbf{x}^n$ when passing from the frame $\mathbf{x}$ to $\mathbf{x}'$. They show that $p_t^n \equiv \partial R^n/\partial t^n$ is a scalar and $\mathbf{p}^n$ is a covariant vector. From Eq. (1.22) we conclude that $p_t^n = p_t'^{\,n}$ or $h = h'$, meaning that the optimal unit price of time allocation is independent of the thermodynamic variables used in optimization. A natural practical interpretation of this property (Section 3.2.1) is that the optimal unit price of equipment is independent of the choice of thermodynamic coordinates. The crucial role of adjoints in canonical equations is discussed in Section 2.2. Note also that, when working in terms of $R_*$ rather than $R$, an invariance condition is found for the duration, $T = T'$. Consistently, the optimal cost of time allocation, $hT^N$, is also an invariant of thermodynamic transformations.

As shown in Section 3.1, for thermodynamic problems of extremum work, generalized or finite-time exergies follow directly from the optimal functions $R(\mathbf{x}, T)$ and $R_*(\mathbf{x}, h)$. In continuous problems of classical mechanics, on the other hand, optimal action functions are generated in the context of variational principles of Maupertuis type [5]. In classical mechanics $p_t^n$ is the negative of the energy and $\mathbf{p}^n$ is the vector of generalized momenta; the process duration is obtained by differentiation of the 'extremum abbreviated action' $R_*$ with respect to the Hamiltonian parameter, $h$. The parallelism of the structures of variational principles in mechanics and thermodynamics is an important property which holds in both continuous and discrete cases and helps in a better understanding of the basic equations in both disciplines. It should inspire physicists to apply our discrete theory (Sections 2.2 and 4.2), which shows that the structures of classical mechanics and variational calculus are compatible with those of thermodynamic optimization, and both fields can take advantage of this fact.
Also, the presented approach could be of use for the construction of a discrete quantum theory, or at least for the multistage cooling processes necessary to reach conditions in which quantum phenomena become observable in cold systems. Examples of physical quantum systems where this approach could be applied can be found in a recent book on the theory of interaction of multilevel systems with quantized fields [6]. The canonical structure of the basic discrete algorithm developed in Section 2.2 conforms to the present tendency in physics to construct numerical integration schemes for ordinary differential equations (ODEs) in such a way that a qualitative property of the solution of the ODE is exactly preserved. For Poisson-structure-preserving integration schemes (symplectic integrators), symmetries and related invariants, see [7–9].

In the following section, we develop a generalized theory for both continuous and discrete processes, applicable when the dimensionalities of the control space and the state space are different. It is called the basic theory as it is related to the classical Pontryagin algorithm for continuous optimal control and its discrete counterpart. The basic discrete theory is derived in Section 2.2 and its computational aspects are discussed in Section 2.3. Applications in Section 3 show the power of the basic theory. Yet since the discrete version of the basic theory is restricted by the requirement of free time intervals, Section 4.2 presents an outline of a generalized theory for discrete processes in which these intervals can be constrained.
2. Basic theory

2.1. Hamilton–Jacobi–Bellman equations for continuous systems

2.1.1. Continuous optimization problem

In this section we consider a general problem of continuous optimal control associated with the Bolza form of the performance criterion

$$S = \int_{t^i}^{t^f} f_0(\mathbf{x}, t, \mathbf{u})\, dt + G(\mathbf{x}^f(t^f), t^f) - G(\mathbf{x}^i(t^i), t^i) , \tag{2.1}$$

where $f_0$ is the profit intensity and $G(\mathbf{x}, t)$ is a gauging function that depends on the state $\mathbf{x}$ and the time $t$. For Eq. (2.1), a maximum of the criterion $S$ is sought with respect to a suitable choice of the vector functions $\mathbf{x}(t)$ and $\mathbf{u}(t)$. The gauging function $G$ influences an optimal solution only if some of the end coordinates of the state vector or time are not specified. Dealing with all initial coordinates $x_k^i$ ($k = 1, \ldots, s$) and the initial time $t^i$ as independent variables, we can generate a function describing the maximum value of $S$ in terms of $\mathbf{x}^i$ and $t^i$. This is called the original optimization problem. However, one can also consider the maximum of $S$ as a function generated in terms of the final coordinates $x_k^f$ and the final time $t^f$. This is called the dual optimization problem. It is insightful to confront properties of the original problem with those of the dual problem, and we will occasionally do this in this paper. In terms of the coordinates of both end points (initial and final) an optimal performance function $V(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f)$, Eq. (2.28) below, describes the maximum of $S$. The optimization in (2.1) is subject to the differential constraints resulting from a given set of differential equations

$$\frac{dx_k}{dt} = f_k(\mathbf{x}, t, \mathbf{u}) , \tag{2.2}$$

where $\mathbf{x} = (x_1, x_2, \ldots, x_i, \ldots, x_s)$ is the $s$-dimensional state vector and $\mathbf{f} = (f_1, f_2, \ldots, f_i, \ldots, f_s)$ is the vector of rates. The $r$-dimensional vector of adjustable parameters $\mathbf{u} = (u_1, u_2, \ldots, u_r)$ is the control vector. An admissible control usually satisfies certain local constraints, the most typical being

$$\mathbf{u}(t) \in U , \tag{2.3}$$

where $U$ is an admissible set in the control space. Additional constraints may also exist which link coordinates of the state vector, $\mathbf{x}$, and the control vector, $\mathbf{u}$. They are usually of the type $\varphi(\mathbf{x}, \mathbf{u}, t) = 0$ or $\varphi(\mathbf{x}, \mathbf{u}, t) \le 0$. However, they may be included in the above model by using Lagrange multipliers, a special sort of controls which reside linearly in the model. These multipliers will increase the dimensionality of $\mathbf{u}$ without changing the general structure of the model given above. Thus model (2.1)–(2.3) is sufficient for general considerations. To include the time coordinate in the state vector we can use the enlarged $(s+1)$-dimensional vector of state $\tilde{\mathbf{x}} = (x_1, x_2, \ldots, x_i, \ldots, x_s, x_{s+1})$, in which case $x_{s+1} \equiv t$, and the $(s+1)$-dimensional vector of rates $\tilde{\mathbf{f}} = (f_1, f_2, \ldots, f_i, \ldots, f_s, f_{s+1})$ with $f_{s+1} = 1$.

We can also include into considerations the performance coordinate, $x_0$. To this end we define the performance equation

$$\frac{dx_0}{dt} = f_0(\mathbf{x}, t, \mathbf{u}) , \tag{2.4}$$
where $x_0$ has an initial value $x_0(t^i) = x_0^i$ and a final value $x_0(t^f) = x_0^f$. One of these values must be free. With Eq. (2.4) we find $S$ as a function of both end states and end times:

$$S = g(x_0^f, \mathbf{x}^f, t^f, x_0^i, \mathbf{x}^i, t^i) \equiv x_0^f - x_0^i + G(\mathbf{x}^f, t^f) - G(\mathbf{x}^i, t^i) . \tag{2.5}$$

Thus with the help of the performance variable $x_0$, the original optimization problem becomes that of a maximum for $x_0(t^f) + G^f$, subject to the differential constraints (2.2) and (2.4) and the local constraints (2.3). The dual problem becomes that of a minimum for $x_0(t^i) + G^i$, subject to the same set of constraints. Extremal values of these criteria are described by two functions $Q$, defined below. In what follows, the notion of the complete state $(x_0, \mathbf{x}, t)$ will be useful.

2.1.2. Optimal performance functions and related HJB equations

The performance variable must be free at the end at which $S$ is extremized. Working first with the backward DP algorithm, we consider an optimal function $Q^i$ of the complete initial state $(x_0^i, \mathbf{x}^i, t^i)$ and of the partially fixed final state $(\mathbf{x}^f, t^f)$. It constitutes an example of an optimal performance function in the complete state space. For simplicity, we shall omit the superscript $i$ in $Q$. The function is defined as

$$Q(x_0^i, \mathbf{x}^i, t^i, \mathbf{x}^f, t^f) \equiv \max\{x_0^f + G(\mathbf{x}^f, t^f)\} . \tag{2.6}$$
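To apply Bellman's optimality principle in the context of the backward DP algorithm, we maximize $S$ along a special trajectory that starts at $(x_0^i, \mathbf{x}^i, t^i)$ and that possesses a short nonoptimal part as well as a long optimal part. The long optimal part begins at the point $(x_0^i + \Delta x_0^i,\ \mathbf{x}^i + \Delta \mathbf{x}^i,\ t^i + \Delta t^i)$. In order to get $Q$ at $(x_0^i, \mathbf{x}^i, t^i)$, Bellman's optimality principle requires maximizing $Q$ evaluated at the beginning of the long optimal part:

$$Q(x_0^i, \mathbf{x}^i, t^i, \mathbf{x}^f, t^f) = \max_{\mathbf{u}^i} \{Q(x_0^i + \Delta x_0^i,\ \mathbf{x}^i + \Delta \mathbf{x}^i,\ t^i + \Delta t^i)\} . \tag{2.7}$$

Taylor's expansion of the right-hand side of the above equation yields

$$Q(x_0^i, \mathbf{x}^i, t^i, \mathbf{x}^f, t^f) = \max_{\mathbf{u}^i} \left\{ Q(x_0^i, \mathbf{x}^i, t^i, \mathbf{x}^f, t^f) + \frac{\partial Q}{\partial x_0^i} f_0^i\, \Delta t^i + \sum_{k=1}^{s} \frac{\partial Q}{\partial x_k^i} f_k^i\, \Delta t^i + \frac{\partial Q}{\partial t^i}\, \Delta t^i \right\} . \tag{2.8}$$

Hence, as $Q$ is independent of $\mathbf{u}$,

$$\max_{\mathbf{u}^i} \left\{ \frac{\partial Q}{\partial x_0^i} f_0^i + \sum_{k=1}^{s} \frac{\partial Q}{\partial x_k^i} f_k^i + \frac{\partial Q}{\partial t^i} \right\} = 0 . \tag{2.9}$$

This is the Hamilton–Jacobi–Bellman equation of the problem (HJB equation). It can be written in a symbolic form that shows a maximum rate of increase of $Q$:

$$\max_{\mathbf{u}^i} \left\{ \frac{dQ(x_0^i, \mathbf{x}^i, t^i, \mathbf{x}^f, t^f)}{dt^i} \right\} = 0 . \tag{2.10}$$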
From Eq. (2.5) and the definition of $Q$, Eq. (2.6),

$$\max S = Q(x_0^i, \mathbf{x}^i, t^i, \mathbf{x}^f, t^f) - x_0^i - G(\mathbf{x}^i, t^i) . \tag{2.11}$$

Thus, we can evaluate the maximum performance function $V$, which describes $\max S$, as

$$V(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) \equiv \max S = Q(x_0^i, \mathbf{x}^i, t^i, \mathbf{x}^f, t^f) - x_0^i - G(\mathbf{x}^i, t^i) . \tag{2.12}$$

The function $V(t^i, \mathbf{x}^i, t^f, \mathbf{x}^f) \equiv V(\tilde{\mathbf{x}}^i, \tilde{\mathbf{x}}^f)$ describes extremal values of the criterion $S$ in terms of the generalized end states $\tilde{\mathbf{x}}^i$ and $\tilde{\mathbf{x}}^f$ when the total duration is $T = t^f - t^i$. Since $V$ is a potential and $S$ is not, only $S$ depends on how the controls change during the process. When the process is autonomous, i.e. its model does not contain the time $t$ explicitly, the form of $V$ is such that $V(t^i, \mathbf{x}^i, t^f, \mathbf{x}^f)$ depends on the end times $t^i$ and $t^f$ through the duration $T = t^f - t^i$ only, meaning that the initial time instant does not influence the value of $V$ whenever the duration is fixed. This is associated with the invariance of the Hamiltonian function along an extremal path, which is then the first integral of the process. From Eq. (2.12)

$$Q(x_0^i, \mathbf{x}^i, t^i, \mathbf{x}^f, t^f) = x_0^i + G(\mathbf{x}^i, t^i) + V(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) , \tag{2.13}$$

$$\frac{\partial Q}{\partial x_0^i} = 1 , \tag{2.14}$$

$$\max_{\mathbf{u}^i} \left\{ f_0^i + \sum_{k=1}^{s} \frac{\partial Q}{\partial x_k^i} f_k^i + \frac{\partial Q}{\partial t^i} \right\} = 0 . \tag{2.15}$$

Since the differentiations are with respect to the initial coordinates only, in terms of an effective or overall gauging function defined as

$$P(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) \equiv G(\mathbf{x}^i, t^i) - G(\mathbf{x}^f, t^f) + V(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) , \tag{2.16}$$

the HJB equation preserves the derived form

$$\max_{\mathbf{u}^i} \left\{ f_0^i + \sum_{k=1}^{s} \frac{\partial P}{\partial x_k^i} f_k^i + \frac{\partial P}{\partial t^i} \right\} = 0 . \tag{2.17}$$
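To make the backward HJB equation (2.17) concrete, it can be verified on a toy problem. The sketch below is only an illustration under assumptions that are not in the paper: a scalar state with $dx/dt = u$, a hypothetical profit intensity $f_0 = -u^2/2$, and no gauging ($G = 0$, so that $P = V$). For this model the maximum of $S$ between fixed end points is $V = -(x^f - x^i)^2 / (2(t^f - t^i))$.

```python
import math

def V(x_i, t_i, x_f, t_f):
    """Maximum of S for the assumed model f0 = -u**2/2, dx/dt = u, G = 0."""
    return -(x_f - x_i) ** 2 / (2.0 * (t_f - t_i))

def central(f, z, eps=1e-5):
    """Central finite-difference approximation of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

x_i, t_i, x_f, t_f = 0.0, 0.0, 2.0, 4.0
dP_dx = central(lambda z: V(z, t_i, x_f, t_f), x_i)   # dP/dx^i
dP_dt = central(lambda z: V(x_i, z, x_f, t_f), t_i)   # dP/dt^i

# Scan the control and evaluate the bracket of Eq. (2.17): f0 + P_x * f + P_t.
u_grid = [k / 100.0 for k in range(-200, 201)]
values = [(-0.5 * u * u + dP_dx * u + dP_dt, u) for u in u_grid]
hjb_max, u_opt = max(values)
```

For these end points the bracket reduces to $-(u - 0.5)^2/2$, so `hjb_max` vanishes up to finite-difference error and `u_opt` is the constant velocity $(x^f - x^i)/(t^f - t^i) = 0.5$. Restricting `u_grid` to an admissible set that excludes this value would model a local constraint of the type (2.3), in which case this unconstrained $V$ would no longer be the optimal function.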
Expressing the HJB equation in terms of $P$, Eq. (2.16), is convenient, because $P$ summarizes all gauging effects.

Now, working with the forward DP algorithm, we consider another function $Q^f$, which depends on the complete final state $(x_0^f, \mathbf{x}^f, t^f)$ and a partially fixed initial state $(\mathbf{x}^i, t^i)$. Again, for simplicity we shall omit the superscript $f$ in $Q$. The function is defined as

$$Q(\mathbf{x}^i, t^i, x_0^f, \mathbf{x}^f, t^f) \equiv \min\{x_0^i + G(\mathbf{x}^i, t^i)\} . \tag{2.18}$$
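In this case a special trajectory is considered that terminates at $(x_0^f, \mathbf{x}^f, t^f)$ and that possesses a short nonoptimal final part as well as a long optimal initial part. For the optimal trajectory which terminates at $(x_0^f, \mathbf{x}^f, t^f)$, the optimality principle and Taylor expansion yield

$$Q(\mathbf{x}^i, t^i, x_0^f, \mathbf{x}^f, t^f) = \min_{\mathbf{u}^f} \{Q(x_0^f - \Delta x_0^f,\ \mathbf{x}^f - \Delta \mathbf{x}^f,\ t^f - \Delta t^f)\} , \tag{2.19}$$

$$Q(\mathbf{x}^i, t^i, x_0^f, \mathbf{x}^f, t^f) = \min_{\mathbf{u}^f} \left\{ Q(\mathbf{x}^i, t^i, x_0^f, \mathbf{x}^f, t^f) - \frac{\partial Q}{\partial x_0^f} f_0^f\, \Delta t^f - \sum_{k=1}^{s} \frac{\partial Q}{\partial x_k^f} f_k^f\, \Delta t^f - \frac{\partial Q}{\partial t^f}\, \Delta t^f \right\} , \tag{2.20}$$

$$\max_{\mathbf{u}^f} \left\{ \frac{\partial Q}{\partial x_0^f} f_0^f + \sum_{k=1}^{s} \frac{\partial Q}{\partial x_k^f} f_k^f + \frac{\partial Q}{\partial t^f} \right\} = 0 , \tag{2.21}$$

$$\max_{\mathbf{u}^f} \left\{ \frac{dQ(\mathbf{x}^i, t^i, x_0^f, \mathbf{x}^f, t^f)}{dt^f} \right\} = 0 . \tag{2.22}$$

From Eq. (2.5) and the present definition of $Q$, Eq. (2.18),

$$\max S = x_0^f + G(\mathbf{x}^f, t^f) - Q(\mathbf{x}^i, t^i, x_0^f, \mathbf{x}^f, t^f) \equiv V(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) \tag{2.23}$$

or

$$Q(\mathbf{x}^i, t^i, x_0^f, \mathbf{x}^f, t^f) = x_0^f + G(\mathbf{x}^f, t^f) - V(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) , \tag{2.24}$$

whence

$$\frac{\partial Q}{\partial x_0^f} = 1 , \tag{2.25}$$

$$\max_{\mathbf{u}^f} \left\{ f_0^f + \sum_{k=1}^{s} \frac{\partial Q}{\partial x_k^f} f_k^f + \frac{\partial Q}{\partial t^f} \right\} = 0 . \tag{2.26}$$

Since the differentiations are with respect to the final coordinates only, we can use the same effective or overall gauging function $P$ as that introduced earlier in Eq. (2.16). In terms of this function the HJB equation has the form

$$\max_{\mathbf{u}^f} \left\{ f_0^f - \sum_{k=1}^{s} \frac{\partial P}{\partial x_k^f} f_k^f - \frac{\partial P}{\partial t^f} \right\} = 0 . \tag{2.27}$$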
2.1.3. Link with gauged integrals of performance

We shall now interpret the meaning of the extremum operations in the HJB equations. Let us start with the dual problem, for which the final state is operative and Eq. (2.27) holds. As follows from the previous derivations based on Bellman's optimality principle in terms of the complete final state, the optimal performance function defined by the equation

$$V(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) \equiv \max S = \max \left\{ \int_{t^i}^{t^f} f_0(\mathbf{x}, t, \mathbf{u})\, dt + G(\mathbf{x}^f, t^f) - G(\mathbf{x}^i, t^i) \right\} \tag{2.28}$$

satisfies in the dual problem (fixed $x_0^f$ and free $x_0^i$) the condition

$$\max_{\{\mathbf{u}\}_{(q)}} \{x_0^f - x_0^i - P(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f)\} = 0 , \tag{2.29}$$
or the equivalent integral condition

$$\max_{\{\mathbf{u}\}_{(q)}} \left\{ \int_{t^i}^{t^f} f_0(\mathbf{x}, t, \mathbf{u})\, dt - P(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) \right\} = 0 , \tag{2.30}$$

where $P = G^i - G^f + V$, Eq. (2.16), is the effective or overall gauging function. Eq. (2.30) describes the vanishing of a gauged integral of performance. As follows from the above formulae for $V$ and $P$, Eqs. (2.29) and (2.30) are valid for a fixed final boundary point $(x_0^f, \mathbf{x}^f, t^f)$ and the freely varied initial coordinate of profit, $x_0(t^i)$. Differentiation of this equation with respect to the final time $t^f$ proves that the total time derivative of the potential function $P$ satisfies the equation

$$\max_{\mathbf{u}^f} \left\{ f_0(\mathbf{x}^f, t^f, \mathbf{u}^f) - \frac{dP(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f)}{dt^f} \right\} = 0 , \tag{2.31}$$

which is just a compact form of the HJB equation (2.27), which deals with derivatives of the final state. This, in turn, proves a mnemonic rule which states that the differentiation is allowed at the end at which the complete state is operative, and that this differentiation yields the suitable HJB equation. In the original problem, in which the initial point $(x_0^i, \mathbf{x}^i, t^i)$ is fixed and a maximum of $S$ at a final time $t^f$ is considered, the following analog of Eq. (2.31) holds:

$$\max_{\mathbf{u}^i} \left\{ f_0(\mathbf{x}^i, t^i, \mathbf{u}^i) - \frac{dP(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f)}{dt^i} \right\} = 0 , \tag{2.32}$$

as shown by Eq. (2.17). This might be interpreted as the requirement that, in order to get Eq. (2.32), the differentiation with respect to the initial time should be made for the minimum counterpart of Eq. (2.29), which is

$$\min_{\{\mathbf{u}\}_{(q)}} \{x_0^f - x_0^i - P(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f)\} = 0 \tag{2.33}$$

or

$$\min_{\{\mathbf{u}\}_{(q)}} \left\{ \int_{t^i}^{t^f} f_0(\mathbf{x}, t, \mathbf{u})\, dt - P(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) \right\} = 0 . \tag{2.34}$$

Indeed, differentiation of the above equations with respect to the initial time $t^i$ leads, after changing signs (associated with the change of the extremum operation), to Eq. (2.32), which agrees with the backward HJB equation (2.17) dealing with derivatives of the initial state. However, such an interpretation could be questioned, as it involves the change of the optimization problem from that of $\max S$ to that of $\min S$. A better interpretation exists, associated with the unchanged form of the (maximized) criterion $S$. It consistently involves the differentiation of the integral Eq. (2.30) (or its equivalent, Eq. (2.29)) with respect to the negative of the initial time, $(-t^i)$. This yields the correct HJB equation (2.32) immediately. Again, this proves the mnemonic rule that the differentiation is allowed at the end at which the complete state is specified, and that this differentiation yields the related HJB equation.

It is also worth knowing that the optimal performance function $V$, Eq. (2.28), which is related directly to the original criterion $S$, can be used to yield an HJB equation. Indeed, we find from
Eqs. (2.24) and (2.26) or Eqs. (2.13) and (2.15):

$$\max_{\mathbf{u}^f} \left\{ \tilde{f}_0^f - \sum_{k=1}^{s} \frac{\partial V}{\partial x_k^f} f_k^f - \frac{\partial V}{\partial t^f} \right\} = 0 , \tag{2.35}$$

$$\max_{\mathbf{u}^i} \left\{ \tilde{f}_0^i + \sum_{k=1}^{s} \frac{\partial V}{\partial x_k^i} f_k^i + \frac{\partial V}{\partial t^i} \right\} = 0 , \tag{2.36}$$

where in either case a gauged profit intensity

$$\tilde{f}_0(\mathbf{x}, t, \mathbf{u}) \equiv f_0(\mathbf{x}, t, \mathbf{u}) + \frac{\partial G}{\partial t} + \sum_{k=1}^{s} \frac{\partial G}{\partial x_k} f_k(\mathbf{x}, t, \mathbf{u}) \tag{2.37}$$

appears in an HJB equation describing $V$.

2.1.4. Diversity of equivalent formulations

The Hamilton–Jacobi–Bellman equation (HJB equation) is the quasilinear partial differential equation with the extremizing sign which governs the characteristic functions $P$ or $V$ via the control $\mathbf{u}$, which achieves the optimization. The definition of the performance potential $V$ in Eq. (2.28) is the most suitable for processes producing a profit, in which case $V$ is positive. For processes described in terms of a consumed cost, the most suitable definition assuring a positive potential involves an optimal function $R$ which is the negative of $V$. Indeed, for an arbitrary functional $S$ and the same change of the end states and times,

$$R \equiv \min(-S)_{\mathbf{x}^i,\, \mathbf{x}^f} = -\max S_{\mathbf{x}^i,\, \mathbf{x}^f} = -V . \tag{2.38}$$
However, a single extremal function $V(t^i, \mathbf{x}^i, t^f, \mathbf{x}^f)$ is sufficient to adequately describe the extremum of the functional $S$. Our treatment develops a systematic search towards properties and implications of HJB equations in physics, chemistry, thermodynamics and economics. In analytical mechanics such equations are usually derived by the method of variational calculus [4]. The approach used here allows for more general derivations that take into account local constraints imposed on control variables, Eq. (2.3). For discrete problems, the method of dynamic programming will be applied in the next section.

The HJB relationships, Eqs. (2.31) and (2.32), are local conditions that describe vanishing maxima of the profit intensity $f_0$ gauged by the total derivative of the optimal performance function $P$. Similarly, Eqs. (2.35) and (2.36) perform analogous gauging in terms of the total derivative of the function $V$. Changing in the latter equations the signs of the extremized expressions whenever the change of the extremum operation takes place yields, respectively, min MR
(2.39)
max MR
(2.40)
The partial derivative of the optimal profit $V$ with respect to time can be taken out of the bracket of this equation, and the indices $f$ or $i$ are conveniently omitted in equations of this sort for various end states. Hence finally, in terms of final states and times R
(2.41)
whereas in terms of initial states and initial times R
(2.42)
See Ref. [5] for analogous properties in classical mechanics. Note that the cost-type functions $R$ appearing in these equations represent extremum actions of mechanics provided that $\tilde{l}_0(\mathbf{x}, t, \mathbf{u}) \equiv -\tilde{f}_0(\mathbf{x}, t, \mathbf{u})$ is a mechanical Lagrangian.

For autonomous systems a mixed description is also possible. In this case the initial coordinates of state are operative whereas the varied time coordinate is the process duration $T \equiv t^f - t^i$. The condition of the fixed final time then yields $dT = -dt^i$; thus, from Eq. (2.40) or (2.42) !R
(2.43)
These are still usable HJB equations for the optimal functions $V$ or $R$ of the problem. In all equations of this sort, the extremized members are certain Hamiltonian expressions. In fact, they refer to Pontryagin-type, "nonextremal" Hamiltonians. An optimal control $\mathbf{u}$ which solves the optimization problem must extremize a Hamiltonian at each point of the extremal path, which means extremizing a wave-front velocity in the HJB equation.

The power of approaches based on the HJB equation follows from the fact that an optimal performance function found for a constrained problem satisfies an HJB equation with the same state variables as that for the related unconstrained problem. Only the numerical values of the optimizing control sets and of the functions $P$ or $V$ may differ in each case. Methods for solving HJB equations can be both analytical and numerical. For some special models, e.g. those of thermal machines with heat transfer, solutions can be obtained analytically, although even those examples present situations in which analytical solutions are not possible. Examples in the energy field include problems with free boundary conditions, or with non-Newtonian and radiative transfer and constraints imposed on the state and rates. A numerical procedure which works with Bellman's recurrence equation is the most common tool to solve an HJB equation in the case of low dimensionality of the state vector [1,2].
2.1.5. Passage to the Hamilton–Jacobi equation

We now consider the Hamilton–Jacobi equation for Eqs. (2.41) and (2.42), which are treated simultaneously. For a definite physical problem, common formulae describe the effect of variation of final states and final time or of initial states and initial time. We define an adjoint vector, $\mathbf{p}$, as the negative partial derivative !R
(2.44)
where upper signs refer to final states, Eq. (2.41), and lower ones to initial states, Eq. (2.42). For the generalized Lagrangians, i.e. the integrands $\tilde{l}_0(\mathbf{x}, t, \mathbf{u}) \equiv -\tilde{f}_0(\mathbf{x}, t, \mathbf{u})$, the extremum condition of the Hamiltonian expression in Eqs. (2.41)–(2.43) determines the link between the derivatives of $\tilde{l}_0$ or $-\tilde{f}_0$ with respect to the controls $\mathbf{u}$ and the adjoints p"!($R
(2.45)
$R
(2.49)
where the upper sign refers to final states and the lower one to initial states. This equation differs from HJB equations insofar as it refers only to extremal paths and $H$ is the extremal Hamiltonian. This equation should be solved subject to the boundary condition

$$\lim_{f \to i} V(\mathbf{x}^i, t^i, \mathbf{x}^f, t^f) = 0 , \tag{2.50}$$

which applies when the final states $(\mathbf{x}^f, t^f)$ approach the initial point $(\mathbf{x}^i, t^i)$. Associated with the canonical set, free boundary conditions for $H$ and $\mathbf{p}$ follow from the definition of $V = \max S$ in the form !H"p "!($R
For discrete processes, the standard discrete theory of optimal control [14,15] does not predict any special similarity between discrete and continuous descriptions. Therefore, such characteristic features of the continuous theory as the constancy of an autonomous Hamiltonian or the Hamilton–Jacobi equation remained unknown in discrete systems for a long time. Discrete descriptions are generally not reducible to continuous ones in the limit of an infinite number of discrete units. To assure that a discrete model converges to the continuous limit and exhibits symplectic properties, one must restrict oneself to a special class of discrete processes in which intervals of state variables are unconstrained, and the allowed constraints can affect only ratios of differences of the state variables at a stage. To satisfy these requirements, it suffices that the optimization model is linear with respect to the free time intervals, $\theta^n$. Whenever the discrete model deals with free $\theta^n$, a remarkable similarity emerges between necessary optimality conditions in continuous and discrete cases. In particular, optimal performance functions satisfy discrete formalisms of Hamilton's and Hamilton–Jacobi type, and a discrete maximum principle emerges in a form analogous to that known for continuous systems [16,17]. As shown in Section 4.2, problems with constrained $\theta^n$ involve 'quasicanonical' systems where these properties disappear. In terms of the conditions on smoothness of a function expressed through sequences that govern underlying discrete sets, a discrete process which can approach a solution of a continuous optimization problem may exist only if its state and time increments can be made infinitesimally small at each instant of time. This condition can never be satisfied in systems with local constraints which require that intervals of coordinates or time remain finite for an infinite number of stages, $N$.
This is precisely why the continuous limit of the HJB model is very special and only discrete systems with free intervals $h^n$ may converge to the continuous systems. Since the discrete version of the basic theory given here is restricted by the requirement of linear unconstrained time intervals $h^n$, Section 4.2 presents a generalized theory for arbitrary discrete processes in which these intervals can reside in the model inhomogeneously and can be constrained. It is then shown that the Hamiltonian continuity is violated, yet the canonical equations can still be preserved in the case of the state-independent constraints $g_t(u^n)$ imposed on the intervals $h^n$. Since, however, any constraints of this sort (including the generalized constraint $g_t(x^n, u^n)$) do not assure a priori infinitesimal intervals $h^n$, the convergence of the discrete process to the continuous limit cannot be generally assured. In particular, there is no nonsingular scaling procedure which would make these discrete systems convergent to the continuous limit.

2.2.2. Discrete optimization problem

To develop an analogy with Eq. (2.1), we consider the discrete Bolza functional
$$S^N \equiv \sum_{n=1}^{N} f_0^n(x^n, t^n, u^n)\,h^n + G(x^N, t^N) - G(x^0, t^0). \tag{2.51}$$
The optimization in Eq. (2.51) is subject to constraints resulting from a given set of difference equations
$$x_i^n - x_i^{n-1} = f_i(x^n, t^n, u^n)\,h^n, \tag{2.52}$$
where $x = (x_1, x_2, \ldots, x_i, \ldots, x_s)$ is the $s$-dimensional state vector and $f = (f_1, f_2, \ldots, f_i, \ldots, f_s)$ is the vector of rates. The $r$-dimensional control vector $u = (u_1, u_2, \ldots, u_r)$ is constrained, i.e.,
$$u^n \in U, \tag{2.53}$$
where $U$ is the admissible set in the control space. Generalizations are possible to include constraints on state and control variables in the way described for the continuous case. To include the time coordinate in the state vector we can use the enlarged $(s+1)$-dimensional state vector $\tilde{x} = (x_1, x_2, \ldots, x_i, \ldots, x_s, x_{s+1})$, in which $x_{s+1} \equiv t$, and the $(s+1)$-dimensional vector of rates $\tilde{f} = (f_1, f_2, \ldots, f_i, \ldots, f_s, f_{s+1})$, in which $f^n_{s+1} = 1$. A set with arbitrary $f^n_{s+1}(\tilde{x}^n, u^n)$ may be considered as a proper generalization of the original model which still preserves the linearity of the model with respect to a free control $h^n$.

The optimization problem can be stated as that of maximizing $S$ for $n = N$ when the initial point $(x_0^0, x^0, t^0)$ is fixed. However, a difficulty arises if we want to obtain a difference analog of Eq. (2.37); in fact, it is not clear whether such an analog exists, because the differential calculus does not apply to finite differences in the discrete case. It is precisely for this reason that we shall prefer to use the potential function $P$ rather than $V$ when discrete processes are analysed. To solve the problem, it is essential to recognize the role of the necessary optimality condition for free (unconstrained) intervals of time, $h^n$, and the role of Bellman's recurrence equation, especially in the form of its generalization to the so-called stage criterion. The generalization yields discrete characteristic sets, not only necessary conditions for optimal controls as the original Bellman's equation does. We concentrate here on the dual problem in which a maximum of $S$ is sought at a fixed final point $(x_0^N, x^N, t^N)$, subject to the constraints, Eqs. (2.52)–(2.54).

We introduce the profit coordinate $x_0$, associated with the profit intensity $f_0$, whose change along the $n$-stage subprocess (stages $1, \ldots, n$) describes the sum
$$x_0^n - x_0^0 = \sum_{m=1}^{n} f_0^m(x^m, t^m, u^m)\,h^m. \tag{2.54}$$
Evaluating the difference between $x_0^n$ and $x_0^{n-1}$ we find the difference equation for $x_0$
$$x_0^n - x_0^{n-1} = f_0^n(x^n, u^n, t^n)\,h^n. \tag{2.55}$$
In the original optimization problem, the profit coordinate $x_0$ must be free at the final point of the discrete trajectory. Conversely, in the dual optimization problem, which we now analyze in detail, $x_0$ must be free at the initial point. With the coordinate $x_0$, the performance index $S$ becomes a function of all end coordinates
$$S = g(x_0^N, x^N, t^N, x_0^0, x^0, t^0) \equiv x_0^N - x_0^0 + G(x^N, t^N) - G(x^0, t^0) \equiv \tilde{G}(x_0^N, x^N, t^N) - \tilde{G}(x_0^0, x^0, t^0), \tag{2.56}$$
where $\tilde{G}$ is the extended gauging function defined as
$$\tilde{G}(x_0, x, t) \equiv x_0 + G(x, t). \tag{2.57}$$
The free initial coordinate $x_0^0$ must assure a maximum of $S$ in the dual problem. Clearly, the value of $x_0^0$ that optimizes $S$ is a function of all remaining variables in $g$, in agreement with Bellman's optimality principle. Yet, the optimal performance function
(2.58)
is independent of the coordinate $x_0^N$. Eq. (2.56) defines a profit-type function $x_0^N$ for a profit intensity $f_0^n$. In the cost-type function, which contains a cost intensity or the Lagrangian $l_0^n$, the signs in gauging terms are inverted, i.e. the difference $G^0 - G^N$ appears.
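The bookkeeping of Eqs. (2.51) and (2.55)–(2.57) can be illustrated with a minimal Python sketch. The function name `bolza_profit`, the sample trajectory and the choices of $f_0$ and $G$ are illustrative assumptions, not part of the original text; the sketch only accumulates the profit coordinate stage by stage and evaluates $S$ as the difference of the extended gauging function at the two ends.

```python
def bolza_profit(xs, ts, us, f0, G):
    """Evaluate S^N of Eq. (2.51) for a given discrete trajectory.

    xs[n], ts[n] are the state and time after stage n (n = 0..N);
    us[n-1] is the control u^n applied at stage n.
    Returns S and the list of profit coordinates x_0^n (Eq. (2.55)).
    """
    x0_coord = [0.0]  # x_0^0 is set to zero here (an arbitrary reference value)
    for n in range(1, len(xs)):
        h = ts[n] - ts[n - 1]  # interval h^n = t^n - t^{n-1}
        x0_coord.append(x0_coord[-1] + f0(xs[n], ts[n], us[n - 1]) * h)
    # Extended gauging function, Eq. (2.57): G~(x_0, x, t) = x_0 + G(x, t)
    g_final = x0_coord[-1] + G(xs[-1], ts[-1])
    g_initial = x0_coord[0] + G(xs[0], ts[0])
    return g_final - g_initial, x0_coord  # Eq. (2.56)


# Hypothetical two-stage example: f0 = -u^2/2, G(x, t) = x.
xs, ts, us = [0.0, 1.0, 3.0], [0.0, 1.0, 2.0], [1.0, 2.0]
S, x0_coord = bolza_profit(xs, ts, us,
                           f0=lambda x, t, u: -0.5 * u * u,
                           G=lambda x, t: x)
print(S, x0_coord)
```

For a cost-type index one would pass the Lagrangian with the opposite sign and swap the gauging terms, as noted above.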
We note the formal difference between maximizing $S$ through free variations of $x_0^N$ at the right end of the path (in the original problem) and maximizing $S$ through free variations of $x_0^0$ at the left end of the path (in the dual problem). Maximizing $S$ in the original optimization problem is equivalent to maximizing the final gauging function, $\tilde{G}^N$, with respect to a suitable choice of vector functions $x^n$ and $u^n$ when the initial point of the trajectory, $(x_0^0, x^0, t^0)$, is fixed. Conversely, maximizing $S$ in the dual optimization problem is equivalent to minimizing the initial gauging function, $\tilde{G}^0$, while keeping fixed the final point, $(x_0^N, x^N, t^N)$.

2.2.3. Optimal performance functions by dynamic programming

We are now prepared to define the basic optimal performance function of the discrete problem as a quantity independent of the profit coordinate $x_0$
$$= \max \left\{ \sum_{k=1}^{n} \left[ f_0^k(\tilde{x}^k, u^k) + \frac{G(\tilde{x}^k) - G(\tilde{x}^k - \tilde{f}^k(\tilde{x}^k, u^k)\,h^k)}{h^k} \right] h^k \right\} \equiv \max \sum_{k=1}^{n} \tilde{f}_0^k\,h^k, \tag{2.59'}$$
where the gauged profit intensity $\tilde{f}_0^k$, a discrete analog of that in Eq. (2.9), is an $h$-dependent quantity having the structure $\tilde{f}_0^k(\tilde{x}^k, u^k, h^k)$. Eqs. (2.60) and (2.61) below describe the link between the optimal performance function
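$$\max \left\{ \sum_{m=1}^{n} f_0^m(x^m, t^m, u^m)\,h^m + G(x^n, t^n) - G(x^0, t^0) - V^n(x^0, t^0, x^n, t^n) \right\} = \max \left\{ \sum_{m=1}^{n} f_0^m(x^m, t^m, u^m)\,h^m - P^n(x^0, t^0, x^n, t^n) \right\} = 0, \tag{2.60}$$
where
$$P^n(x^0, t^0, x^n, t^n) \equiv G(x^0, t^0) - G(x^n, t^n) + V^n(x^0, t^0, x^n, t^n) \tag{2.61}$$
is the gauged profit. Note the analogy with Eqs. (2.30) and (2.16) of continuous processes. The function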
profit intensity $\tilde{f}_0^n$ similar to that in Eq. (2.37), which could include the effect of the gauging function $G$ and the original profit intensity $f_0^n$, need not be defined.

We shall now consider the dynamic programming theory of necessary optimality conditions in the complete state space, i.e. in the space-time that includes the performance coordinate $x_0$ and the time $t$. The maximum condition for the profit $S$ leads to the forward algorithm of the dynamic programming applied to the optimal function $Q$ defined as
$$Q^n(x_0^n, x^n, t^n, x^0, t^0) \equiv \min \tilde{G}(x_0^0, x^0, t^0), \tag{2.62}$$
where $\tilde{G} = x_0 + G(x, t)$, Eq. (2.57). Eq. (2.62) states that the minimum of the initial function $\tilde{G}^0$ is generated in terms of the coordinates $(x_0^n, x^n, t^n)$, which are final coordinates of an $n$-stage process. This is the technique of the forward DP algorithm. Based on data of $Q^n$, Eq. (2.62), the data of maximum profit $S$ can be found from Eq. (2.59). Indeed, it follows from these equations that
$$Q^n(x_0^n, x^n, t^n, x^0, t^0) = x_0^n + G(x^n, t^n) - V^n(x^0, t^0, x^n, t^n). \tag{2.63}$$
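$$\tilde{H}^{n-1}\left(x^n, t^n, \frac{\partial Q^{n-1}}{\partial x^{n-1}}, \frac{\partial Q^{n-1}}{\partial t^{n-1}}, u^n\right) \equiv \sum_{k=0}^{s+1} \frac{\partial Q^{n-1}}{\partial x_k^{n-1}}\,f_k^n = \frac{\partial Q^{n-1}}{\partial x_0^{n-1}}\,f_0^n + \sum_{k=1}^{s} \frac{\partial Q^{n-1}}{\partial x_k^{n-1}}\,f_k^n + \frac{\partial Q^{n-1}}{\partial t^{n-1}}, \tag{2.66}$$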
where, from Eq. (2.64),
$$\partial Q^{n-1}/\partial x_0^{n-1} = 1. \tag{2.67}$$
Note that the Hamiltonian is independent of the profit variable $x_0$. For a stationary minimum of $\tilde{G}^0$, Eq. (2.62), necessary optimality conditions follow from Eq. (2.65) by setting to zero the partial derivatives of the minimized expression with respect to controls $h^n$ and $u^n$. In terms of the Hamiltonian function, Eq. (2.66), these stationarity conditions constitute a set of the following difference-differential equations for $Q^n = Q^{n-1}$:
$$\tilde{H}^{n-1}(x^n, t^n, \partial Q^{n-1}/\partial x^{n-1}, \partial Q^{n-1}/\partial t^{n-1}, u^n) = 0, \tag{2.68}$$
$$[\partial \tilde{H}^{n-1}(x^n, t^n, \partial Q^{n-1}/\partial x^{n-1}, \partial Q^{n-1}/\partial t^{n-1}, u^n)]/\partial u^n = 0. \tag{2.69}$$
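When the extremum of the Hamiltonian $\tilde{H}^{n-1}$ falls on the boundary of the admissible control set, $U$, rather than in its interior, the stationarity condition, Eq. (2.69), should be replaced by a maximum condition of $\tilde{H}^{n-1}$ with respect to controls $u^n$. The weak maximum condition
$$[\partial \tilde{H}^{n-1}(x^n, t^n, \partial Q^{n-1}/\partial x^{n-1}, \partial Q^{n-1}/\partial t^{n-1}, u^n)/\partial u^n]\,\delta u^n \le 0 \tag{2.70}$$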
which states that only negative variations of $S$ and $\tilde{H}^{n-1}$ are possible is sufficient in most practical cases.

2.2.5. A discrete Hamilton–Jacobi equation

Eq. (2.68) is the stationary extremum condition with respect to the unconstrained time intervals $h^n$, whereas Eq. (2.69) or (2.70) defines the optimal control vector $u^n$. When this optimal control is evaluated from one of these equations and substituted into Eq. (2.68), the latter becomes the discrete Hamilton–Jacobi equation
$$\partial Q^{n-1}/\partial t^{n-1} + H^{n-1}(x_1^n, \ldots, x_s^n, t^n, \partial Q^{n-1}/\partial x_1^{n-1}, \ldots, \partial Q^{n-1}/\partial x_s^{n-1}) = 0 \tag{2.71}$$
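which is nonlinear in terms of the derivatives $\partial Q^{n-1}/\partial x^{n-1}$. Note that Eq. (2.71) is written for the extremum $H^{n-1}$ of the Hamiltonian function of energy type,
$$H^{n-1}\left(x^n, t^n, \frac{\partial Q^{n-1}}{\partial x_0^{n-1}}, \frac{\partial Q^{n-1}}{\partial x^{n-1}}, u^n\right) \equiv \frac{\partial Q^{n-1}}{\partial x_0^{n-1}}\,f_0^n + \sum_{i=1}^{s} \frac{\partial Q^{n-1}}{\partial x_i^{n-1}}\,f_i^n, \tag{2.72}$$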
rather than in terms of the enlarged Hamiltonian $\tilde{H}^{n-1}$. In the limiting case of an infinitesimal sequence of $h^n$, Eq. (2.71) goes over into the Hamilton–Jacobi equation of the corresponding continuous process, consistent with Eq. (2.49) with upper sign. In contrast with continuous processes, in the discrete case the solution approaches usually deal with the underlying equation (2.65), not with Eq. (2.71). Sometimes the set of Eqs. (2.68) and (2.69) is solved, as in our example with multistage thermal engines in Section 3.1. As compared with continuous processes, which deal with a single function $Q(x_0, x, t)$, in discrete processes a sequence of the optimal functions $Q^n(x_0^n, x^n, t^n)$ forms a solution of Eq. (2.65). When a numerical
procedure is organized, effective for problems with small dimensionality of state, Bellman's recurrence equation (2.65) can effectively be solved in the form of Eq. (2.75) below.

2.2.6. Equivalent equation of discrete dynamic programming

However, when solving practical problems, Bellman's recurrence equation is usually applied in its equivalent form which deals with the optimal function $P^n$, Eqs. (2.60) and (2.61). As the function $P^n(x^n, t^n)$ does not contain the performance variable $x_0^n$, it works in a state space of one dimension less than the function $Q^n(x_0^n, x^n, t^n)$. Thus the use of $P^n$ should be preferred, at least for computational reasons. Since from Eqs. (2.61) and (2.64) the following relations hold:
$$Q^n(x_0^n, x^n, t^n, x^0, t^0) = x_0^n + G(x^n, t^n) - V^n(x^0, t^0, x^n, t^n) = x_0^n + G(x^0, t^0) - P^n(x^n, t^n, x^0, t^0), \tag{2.73}$$
$$\min_{u^n, h^n} \{ Q^{n-1}(x_0^n - f_0^n(x^n, t^n, u^n)h^n,\; x^n - f^n(x^n, t^n, u^n)h^n,\; t^n - h^n,\; x^0, t^0) - Q^n(x_0^n, x^n, t^n, x^0, t^0) \}$$
$$\equiv \min_{u^n, h^n} \{ x_0^{n-1} - x_0^n - P^{n-1}(x^n - f^n(x^n, t^n, u^n)h^n,\; t^n - h^n,\; x^0, t^0) + P^n(x^n, t^n, x^0, t^0) \} = 0, \tag{2.74}$$
where the variable $x_0^{n-1}$ has to be expressed in terms of the state and the controls $u^n$ at the stage. This means that the function $f_0^n(x^n, t^n, u^n)h^n$ has to be substituted in place of the difference $x_0^n - x_0^{n-1}$. Consequently, the $x_0$-free recurrence equation follows:
$$P^n(x^n, t^n, x^0, t^0) = \max_{u^n, h^n} \{ f_0^n(x^n, t^n, u^n)h^n + P^{n-1}(x^n - f^n(x^n, t^n, u^n)h^n,\; t^n - h^n,\; x^0, t^0) \}. \tag{2.75}$$
In the special case when the gauging function $G$ vanishes, the above equation contains the function
notation that ignores the initial point $(x^0, t^0)$. Writing Eq. (2.75) in the form of the so-called stage criterion or CBS criterion (Caratheodory–Boltyanski–Sieniutycz criterion)
$$\max_{u^n, h^n, x^n, t^n} \{ f_0^n(x^n, t^n, u^n)h^n - (P^n(x^n, t^n) - P^{n-1}(x^n - f^n(x^n, t^n, u^n)h^n,\; t^n - h^n)) \} = 0, \tag{2.76}$$
one can determine a complete set of necessary optimality conditions including those with respect to the state vector $x^n$. This is the most powerful and basic criterion, which transfers Caratheodory's original idea of Lagrangians gauged by their own optimal performance functions [4] to independent state and control spaces [11,18]. This refers to the trajectory optimization and also realizes passage to the algorithm of the discrete maximum principle and canonical equations. The passage could equally well be made in the original Q-representation, Eqs. (2.65) and (2.68)–(2.71); however, we would like to show it in some detail in the P-representation, which is the most popular and lucid. Bellman's equation follows from criterion (2.76) for fixed final states and times. It is the necessary optimality conditions with respect to controls $u^n$ and $h^n$ which are considered now; they are common to both Eqs. (2.75) and (2.76). For a positive unconstrained $h^n$ and constrained $u^n \in U$, a set equivalent to Eq. (2.75) or (2.76) follows which has the form of the three equations
$$f_0^n(x^n, t^n, u^n)h^n - (P^n(x^n, t^n) - P^{n-1}(x^n - f^n(x^n, t^n, u^n)h^n,\; t^n - h^n)) = 0, \tag{2.77}$$
$$f_0^n(x^n, t^n, u^n) - (\partial P^{n-1}/\partial x^{n-1}) \cdot f^n(x^n, t^n, u^n) - \partial P^{n-1}/\partial t^{n-1} = 0, \tag{2.78}$$
$$\{ \partial f_0^n(x^n, t^n, u^n)/\partial u^n - (\partial P^{n-1}/\partial x^{n-1}) \cdot \partial f^n(x^n, t^n, u^n)/\partial u^n \} \cdot \delta u^n \le 0. \tag{2.79}$$
Eq. (2.78) is the stationarity condition for the optimal intervals $h^n$. Whenever $h^n$ is free and positive, it follows from Eqs. (2.78) and (2.79) that Eq. (2.79) can be derived from the following maximum condition:
$$\max_{u^n} \{ f_0^n(x^n, t^n, u^n) - (\partial P^{n-1}/\partial x^{n-1}) \cdot f^n(x^n, t^n, u^n) - \partial P^{n-1}/\partial t^{n-1} \} = 0. \tag{2.80}$$
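The recurrence (2.75) with a free interval $h^n$ can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the procedure of the original text: the toy model (scalar state, rate $f = u$, profit intensity $f_0 = -(u^2/2 + c)$ with $c = 0.5$, $G = 0$, autonomous), the grids and the function name `dp_tables` are all hypothetical. To avoid interpolation, the control is not gridded; instead the previous grid node and the interval are enumerated, and $u$ is recovered from the state equation (2.52).

```python
import math


def dp_tables(x_nodes, h_values, f0, n_stages, x_start):
    """Tabulate P^1..P^N on a grid using the recurrence
    P^n(x) = max over (x_prev, h) of f0(x, u) * h + P^{n-1}(x_prev),
    with u = (x - x_prev) / h, i.e. the state equation x - x_prev = u * h."""
    # P^0 is zero at the fixed initial point and undefined (-inf) elsewhere.
    tables = [{x: (0.0 if x == x_start else -math.inf) for x in x_nodes}]
    policies = []
    for _ in range(n_stages):
        prev, cur, best = tables[-1], {}, {}
        for x in x_nodes:
            cur[x], best[x] = -math.inf, None
            for x_prev in x_nodes:
                if prev[x_prev] == -math.inf:
                    continue  # x_prev is not reachable with n-1 stages
                for h in h_values:  # free, positive interval
                    u = (x - x_prev) / h
                    value = f0(x, u) * h + prev[x_prev]
                    if value > cur[x]:
                        cur[x], best[x] = value, (x_prev, u, h)
        tables.append(cur)
        policies.append(best)
    return tables, policies


C = 0.5  # hypothetical constant term of the profit intensity


def toy_f0(x, u):
    return -(0.5 * u * u + C)


X_NODES = [0.0, 0.25, 0.5, 0.75, 1.0]
H_VALUES = [0.25, 0.5, 0.75, 1.0]
tables, policies = dp_tables(X_NODES, H_VALUES, toy_f0, n_stages=2, x_start=0.0)
print(tables[1][1.0], tables[2][1.0], policies[0][1.0])
```

For this toy model the stored one-stage decision at $x = 1$ has $u = 1$, where $f_0 = -1$ and the tabulated $P$ decreases with unit slope, so the stationarity condition (2.78) for the free interval, $f_0 - (\partial P/\partial x) f = 0$, is satisfied on the grid.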
This is a discrete HJB equation which represents a maximum principle with respect to the controls $u^n$ for an enlarged P-based Hamiltonian. It is the variables $x^n$, $t^n$, $\partial P/\partial x^{n-1}$ and $\partial P/\partial t^{n-1}$ which must be frozen during extremizing. Eq. (2.80) contains, in fact, Hamiltonian (2.66) expressed in terms of the function $P$ rather than $Q$, in which representation
$$\tilde{H}^{n-1} = f_0^n(x^n, t^n, u^n) - (\partial P^{n-1}/\partial x^{n-1}) \cdot f^n(x^n, t^n, u^n) - \partial P^{n-1}/\partial t^{n-1}, \tag{2.81}$$
where $P^n$ is the optimal performance profit defined by Eq. (2.61). As $f_{s+1}$ $(= 1)$ is $u$-independent, the optimality condition (2.80) can also be expressed in terms of the energy-like Hamiltonian, which excludes the partial derivatives of $Q^{n-1}$ or $P^{n-1}$ with respect to $t$:
$$H^{n-1}\left(x^n, t^n, \frac{\partial P^{n-1}}{\partial x^{n-1}}, u^n\right) \equiv f_0^n - \sum_{i=1}^{s} \frac{\partial P^{n-1}}{\partial x_i^{n-1}}\,f_i^n. \tag{2.82}$$
Eqs. (2.66), (2.67), (2.72), and (2.73) were used to get the above expressions for $\tilde{H}^{n-1}$ and $H^{n-1}$. Eq. (2.82) omits the $u$-independent term $\partial P^{n-1}/\partial t^{n-1}$ from Hamiltonian (2.81).
When the optimal control $u^n$ is evaluated from Eq. (2.80) and substituted into Eq. (2.78), the latter becomes the discrete Hamilton–Jacobi equation
$$-\partial P^{n-1}/\partial t^{n-1} + H^{n-1}(x_1^n, \ldots, x_s^n, t^n, -\partial P^{n-1}/\partial x_1^{n-1}, \ldots, -\partial P^{n-1}/\partial x_s^{n-1}) = 0, \tag{2.83}$$
which is Eq. (2.71) in terms of $P$. Like Eq. (2.71), it is nonlinear in terms of the derivatives $\partial P^{n-1}/\partial x^{n-1}$. It is written for the extremum $H^{n-1}$ of the Hamiltonian function of energy type, Eq. (2.82), rather than in terms of the enlarged Hamiltonian $\tilde{H}^{n-1}$. In the limiting case of an infinitesimal sequence of $h^n$, this equation goes over into the Hamilton–Jacobi equation of the corresponding continuous process, consistent with Eq. (2.49). Note that if the gauging function $G = 0$, the limiting form of Eq. (2.71) coincides with Eq. (2.49) with upper sign.

For processes consuming a net value it is convenient to replace the profit function $P^n$ by its negative, $K^n = -P^n$. The function $K^n$ has the meaning of a cost. For fixed states and times at both ends, $K^n$ is related to the Lagrangian $l_0^n \equiv -f_0^n$ by
$$K^n(x^n, t^n) \equiv -P^n(x^n, t^n) = -\max \sum_{m=1}^{n} f_0^m h^m = \min \sum_{m=1}^{n} l_0^m h^m. \tag{2.84}$$
Like $P^n$, $K^n$ does not contain the performance variable $x_0$. In the special case when the process rates $f_1, \ldots, f_s$ are the controls $u_1, \ldots, u_s$, a proper Lagrangian $l_0^n = -f_0^n$ is convex with respect to rates. Substituting $f_0^n = -l_0^n$ into the recurrence equation (2.74) multiplied by minus one and using Eq. (2.84), the reader can obtain the minimum counterpart of Eq. (2.76) in terms of $K^n$, which contains $l_0^n$ in place of $f_0^n$ and $K^n$ in place of $P^n$. We note the mutual correspondence between the profit representation and the cost representation in the recurrence equations.

2.2.8. Adjoint variables, discrete canonical equations and boundary conditions

Eq. (2.75) may be used to define the so-called adjoint variables, the partial derivatives of $P^n$ or $K^n$ with respect to the state coordinates $x_k^n$. Since in an optimal process Eq. (2.73) holds with a constant $G^0$, the state adjoints can be defined as
$$z_k^{n-1} \equiv \partial Q^{n-1}/\partial x_k^{n-1} = \partial (x_0^{n-1} - P^{n-1}(x^{n-1}))/\partial x_k^{n-1} = \partial (x_0^{n-1} + K^{n-1}(x^{n-1}))/\partial x_k^{n-1} \tag{2.85}$$
($k = 0, 1, \ldots, s, s+1$). In terms of the state and adjoint variables Hamiltonian (2.81) is
$$\tilde{H}^{n-1}(x^n, t^n, z^{n-1}, z_t^{n-1}, u^n) \equiv \sum_{l=0}^{s+1} z_l^{n-1} f_l^n = z_0^{n-1} f_0^n + \sum_{k=1}^{s} z_k^{n-1} f_k^n + z_t^{n-1}, \tag{2.86}$$
where $z_0^{n-1} = 1$, $z_k^{n-1} = -\partial P^{n-1}/\partial x_k^{n-1} = \partial K^{n-1}/\partial x_k^{n-1}$, $z_{s+1}^{n-1} \equiv z_t^{n-1} = -\partial P^{n-1}/\partial t^{n-1} = \partial K^{n-1}/\partial t^{n-1}$ and $f_{s+1} = 1$, for $k = 1, 2, \ldots, s$ and $n = 1, 2, \ldots, N$. Note that adjoints $z_k$ differ from adjoints $p_k$ of Section 1 by coordinates of the gradient of the gauge function $G$. Hamiltonian (2.86) has to be a maximum with respect to the controls $u^n$ which maximize a performance index of the profit type, such as the production criterion $S^N$. In an equivalent formulation, one minimizes a performance index of the cost type, such as the consumption criterion $(-S^N)$, whose optimal function $K^n$ is defined by Eq. (2.84). In that formulation, the unchanged Hamiltonian (2.86), which may operate with $(-l_0^n)$ in place of $f_0^n$, still has to attain the global maximum with respect to controls $u^n$, or
a modified Hamiltonian $H'^{n-1} \equiv -H^{n-1}$ can be used that has to attain the global minimum with respect to $u^n$. Thus the original maximization problem for $S^N$ requires a maximum of the Hamiltonian $H^{n-1}$, and the equivalent problem of minimizing $(-S^N)$ requires a minimum of the modified Hamiltonian $H'^{n-1}$ $(\equiv -H^{n-1})$, the one which contains $l_0^n$ or $(-f_0^n)$ in place of $f_0^n$ in Eq. (2.86). Hamiltonian (2.86) is the basis for the method using canonical equations, which are obtained from the stage criterion (2.76) as shown below.

Differentiating the bracketed expression in Eq. (2.76) to determine its stationarity conditions with respect to final space and time coordinates, we obtain an optimal difference set which is canonical with respect to two sorts of equations, one defining the changes of state and one the corresponding changes of the adjoint variables. Using the most popular energy-like Hamiltonian, Eq. (2.82), expressed in terms of the adjoint variables,
$$H^{n-1}(x^n, z^{n-1}, u^n, t^n) \equiv f_0^n(x^n, u^n, t^n) + \sum_{k=1}^{s} z_k^{n-1} f_k^n(x^n, u^n, t^n), \tag{2.87}$$
the discrete canonical set is represented by the equations
$$(x_k^n - x_k^{n-1})/h^n = \partial H^{n-1}/\partial z_k^{n-1}, \tag{2.88}$$
$$(z_k^n - z_k^{n-1})/h^n = -\partial H^{n-1}/\partial x_k^n, \tag{2.89}$$
$$(z_t^n - z_t^{n-1})/h^n = -\partial H^{n-1}/\partial t^n, \tag{2.90}$$
where $h^n \equiv t^n - t^{n-1}$ and the optimal control $u^n$ satisfies the maximum condition, Eq. (2.80), in the form
$$z_t^{n-1} + \max_{u^n} H^{n-1}(x^n, z^{n-1}, u^n, t^n) = 0 \tag{2.91}$$
($n = 1, \ldots, N$; $k = 1, \ldots, s$ and $l = 1, \ldots, r$). As shown by Eq. (2.70) or (2.79), the weak or local maximum can be proved easily, whereas the strong or global maximum condition, Eq. (2.80), requires a subtle proof of its global validity within the admissible set $U$ (for further considerations see Ref. [19]). Nonetheless, it is the weak principle which is sufficient in most applications. Eq. (2.88) constitutes the Hamiltonian form of the state equations, and Eq. (2.89) is its adjoint equation. Eq. (2.90) describes the change of the time adjoint over the interval at stage $n$, whereas Eq. (2.91) states that the enlarged Hamiltonian of the extremal process is always constant and equals zero. Eq. (2.91) includes the necessary condition for the stationary optimality of the decision vector $u^n$, if its optimal value falls in the interior of the allowable range $U$. Quite importantly, the combination of Eqs. (2.90) and (2.91), which describes changes of the extremum Hamiltonian through finite stages, does not follow (as in the continuous version) from the canonical equations for $x_i$ and $z_i$, but it represents an independent extremum condition associated with the optimal choice of $t^n$. In autonomous systems, $\max H^n = \max H^{n-1}$, i.e., the energy-like Hamiltonian is constant along an optimal discrete path. In nonautonomous systems only the enlarged Hamiltonian is a constant of the optimal discrete motion; the value of this constant always vanishes, i.e. $\max \tilde{H}^n = 0$.
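As a worked illustration of the canonical set (2.87)–(2.91), consider a simple autonomous toy model. The model is an assumption introduced here for illustration only: one state coordinate with rate $f = u$, profit intensity $f_0 = -(u^2/2 + c)$ with a constant $c > 0$, vanishing gauging function $G$, fixed end states $x^N > x^0$ and free final time.

```latex
% Energy-like Hamiltonian, Eq. (2.87), and its maximum over u^n:
\[ H^{n-1} = -\tfrac{1}{2}(u^n)^2 - c + z^{n-1} u^n , \qquad
   u^n = z^{n-1} , \qquad \max_{u^n} H^{n-1} = \tfrac{1}{2}(z^{n-1})^2 - c . \]
% Adjoint equations (2.89) and (2.90): H depends neither on x nor on t,
\[ z^n - z^{n-1} = 0 , \qquad z_t^n - z_t^{n-1} = 0 . \]
% Free final time with G = 0 gives z_t^N = 0, Eq. (2.100), so Eq. (2.91) reads
\[ \tfrac{1}{2} z^2 - c = 0 \quad\Longrightarrow\quad z = u^n = \sqrt{2c} . \]
% State equation (2.88), summed over the stages, and the resulting optimal profit:
\[ x^n - x^{n-1} = \sqrt{2c}\, h^n , \qquad
   P^N = \sum_{n=1}^{N} f_0^n h^n = -2c \sum_{n=1}^{N} h^n = -\sqrt{2c}\,(x^N - x^0) . \]
```

The result is consistent with the definition of the adjoint in Eq. (2.86), since $z = -\partial P^N/\partial x^N = \sqrt{2c}$, and the energy-like Hamiltonian keeps the constant value zero along the path. In this degenerate example only the sum of the intervals $h^n$ is determined, not the individual intervals.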
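In terms of the enlarged $\tilde{H}^n$ the optimal discrete dynamics is described by a canonical set of equations
$$(\tilde{x}_j^n - \tilde{x}_j^{n-1})/(t^n - t^{n-1}) = \partial \tilde{H}^{n-1}/\partial \tilde{z}_j^{n-1}, \tag{2.93}$$
$$(\tilde{z}_j^n - \tilde{z}_j^{n-1})/(t^n - t^{n-1}) = -\partial \tilde{H}^{n-1}/\partial \tilde{x}_j^n, \tag{2.94}$$
$$\max_{u^n} \tilde{H}^{n-1}(\tilde{x}^n, \tilde{z}^{n-1}, u^n) = 0, \tag{2.95}$$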
where $\tilde{x}_{s+1} \equiv t$ is the independent variable. This set is equivalent to the set of Eqs. (2.88)–(2.91) but it describes the optimal process in the space-time rather than in the state space. Note that the definition of the time interval $h^n \equiv t^n - t^{n-1}$ follows here from the first canonical equation.

The boundary conditions are determined by the vanishing stationarity conditions for the extremum of $S^N$ with respect to the end state coordinates and end times. Applying Eq. (2.61) in the form
(2.96)
(2.97)
$$dS^N = \sum_{j=1}^{s+1} \left( \frac{\partial G}{\partial \tilde{x}_j^N} - \tilde{z}_j^N \right) dx_j^N + \sum_{j=1}^{s+1} \left( \frac{\partial G}{\partial \tilde{x}_j^0} - \tilde{z}_j^0 \right) dx_j^0. \tag{2.98}$$
Setting to zero the respective partial derivatives of the optimal function $P^n$ for one end, either for $n = 0$ or for $n = N$, yields, for the free-end state variables,
$$z_k^N = \partial G^N/\partial x_k^N,\; k \ne b, \qquad z_k^0 = \partial G^0/\partial x_k^0,\; k \ne a, \tag{2.99}$$
and for free-end time $t$
$$z_t^N = \partial G^N/\partial t^N, \qquad z_t^0 = \partial G^0/\partial t^0. \tag{2.100}$$
Eq. (2.99) prescribes boundary conditions for the adjoint vector $z^n$ at the end and at the beginning of the process, and Eq. (2.100) sets such conditions for the time adjoint at the end and at the beginning of the process. As expressed in Eq. (2.99), if the $b$th component of the final state vector, $x_k^N$, or the $a$th component of the initial state vector, $x_k^0$, is fixed, the respective component of the adjoint vector, $z_b^N$ or $z_a^0$, is undetermined. Analogously, if the boundary time, $t^N$ or $t^0$, is fixed, the respective time adjoint, $z_t^N$ or $z_t^0$, is undetermined. When $N$ tends to infinity the discrete algorithm becomes that of Pontryagin.

It is also useful to note that the developed discrete optimization theory can be derived in a somewhat different way. Using the optimal performance function
principle of optimality in the form of the stage criterion
$$\max \{ f_0^n(x^n, t^n, u^n)h^n + G(x^n, t^n) - G(x^{n-1}, t^{n-1}) + \cdots \tag{2.101}$$
(equivalent to Eq. (2.76)) we obtain HJB equations, the definition of $H$, state adjoints and canonical sets in a way similar to that already considered.

2.2.9. Analogies between optimal continuous and discrete systems

Let us now discuss analogies between optimal continuous and discrete systems. As follows from Eq. (2.76), the maximum principle (2.80) can be regarded as the consequence of a modified form of Eq. (2.79)
$$\max_{u^n, x^n, t^n} \left\{ f_0^n(x^n, t^n, u^n) - \frac{P^n(x^n, t^n) - P^{n-1}(x^n - f^n(x^n, u^n)h^n,\; t^n - h^n)}{h^n} \right\} = 0 \tag{2.102}$$
subject to Eq. (2.78) as an extra condition. This form is suitable for discussing the discrete-continuous analogies. Eq. (2.102) may be compared with its continuous counterpart, Eq. (2.27), written in the form
$$\max_{u, x, t} \{ f_0(x, t, u) - dP(x, t)/dt \} = 0, \tag{2.103}$$
which can be obtained by differentiation of an expression characterizing the maximum performance function $P$ of the continuous process
$$\max_{\{u, x, q\}} \left\{ \int_{t^i}^{t^f} f_0(x, t, u)\,dt - P(x^i, t^i, x^f, t^f) \right\} = 0 \tag{2.104}$$
with respect to the variable upper limit of integration $t^f$ and the subsequent replacement of the variable $t^f$ by $t$. Next, by using differential calculus to expand the total derivative in Eq. (2.103), we obtain the HJB equation of the continuous problem [equivalent to Eq. (2.27)]
$$\max_{u} \left\{ f_0(x, t, u) - \frac{\partial P}{\partial t} - \sum_{k=1}^{s} \frac{\partial P}{\partial x_k}\,f_k(x, t, u) \right\} = 0. \tag{2.105}$$
This is a mnemonic rule of derivation. However, in the discrete case the total derivative operator does not exist; what exists is the difference ratio. Thus it is not straightforward to design a similar mnemonic rule. Still, the discrete-continuous analogy can be observed. A discrete analogue of Eq. (2.104) is represented by the equation
$$\max_{\{u^m, x^m, t^m, h^m\}} \left\{ \sum_{m=1}^{n} f_0^m(x^m, t^m, u^m)h^m - P^n(x^n, t^n) \right\} = 0, \tag{2.60'}$$
which may be regarded as a definition of the optimal function $P^n$ [cf. Eqs. (2.60) and (2.61)]. In the discrete case, Eqs. (2.76) or (2.102), the differentiation with respect to the upper limit of integration,
such as that in Eq. (2.104), is replaced by the subtraction rule for an optimal performance function. Since in Eq. (2.76) the extremum conditions with respect to $u^n$ and $x^n$ are at a constant $h^n$, one can equally well extremize the ratio of Eq. (2.76) and $h^n$, which is Eq. (2.102). Its continuous analogue is Eq. (2.103). Yet in the discrete case the role of the free extremum with respect to $h^n$ is crucial. Since for any fixed sequence $\{u^n\}$ Eq. (2.78) is necessary for the optimality with respect to an unconstrained $h^n$ and Eq. (2.77) must hold, the optimal performance function $P^n$ satisfies a special rule of differencing
$$[P^n(x^n, t^n) - P^{n-1}(x^{n-1}, t^{n-1})]/(t^n - t^{n-1}) = \partial P^{n-1}/\partial t^{n-1} + (\partial P^{n-1}/\partial x^{n-1}) \cdot f^n(x^n, t^n, u^n), \tag{2.106}$$
which is strongly analogous to a corresponding expression for the total derivative operator in continuous systems. Owing to this property, the difference ratio in Eq. (2.102) becomes just the scalar product $(\partial P^{n-1}/\partial x^{n-1}) \cdot f^n(x^n, u^n)$, and the maximum principle follows for the Hamiltonian expression (2.80) with respect to controls $u^n$. This maximum property holds in the phase space as explicitly shown by Eq. (2.91). The canonical sets, Eqs. (2.87)–(2.91) or Eqs. (2.93)–(2.95), contain the whole theory of optimal discrete cascades.

2.3. Outline of computational methods

Now we shall describe the basic principles of solving methods. The primary idea is to solve either an underlying DP recurrence equation, such as Eq. (2.75) or (2.81), or the maximum principle equations, e.g. Eqs. (2.87)–(2.91), rather than the related HJB equations or Hamilton–Jacobi equations. This is because the solving methods for the preferred equations are the most efficient, although they still have their own shortcomings: DP is restricted to problems with low dimensionality of state, whereas maximum principles do not generate optimal profit functions directly. The control theory approach which is used here to solve HJB equations differs from the traditional approach encountered in mechanics, in which Hamilton–Jacobi equations are solved by their own methods [20].

2.3.1. Numerical approaches which apply dynamic programming

We shall start with the basic numerical method which applies dynamic programming. For optimal control problems, both continuous and discrete control processes with a single independent variable (time or length) can be treated in the framework of the common discrete formalism. As we always tend to deal with a discrete set of equations, in the continuous case prior discretizing of the ordinary differential equations is required, to obtain a set of difference equations. Then, it is appropriate to focus on the numerical multistage optimization. Here we describe generation of the optimal function
processes with an arbitrary number of stages. It is, however, important that $\tilde{D}^n$ is properly expressed at the stage $n$ as a function of state $x^n$, time $t^n$, and controls $(u^n, h^n)$. Data of optimal profit functions $V^1, \ldots,$
(2.107)
$$\tilde{D}^n(\tilde{x}^n, u^n, h^n) \equiv \tilde{f}_0^n h^n \equiv [ f_0^n(\tilde{x}^n, u^n) + [G(\tilde{x}^n) - G(\tilde{x}^n - \tilde{f}^n(\tilde{x}^n, u^n)h^n)]/h^n ]\,h^n \tag{2.108}$$
follows from the second line of Eq. (2.101). Tildes over symbols refer generally to extended quantities; thus the state $\tilde{x}^n$ denotes the enlarged vector $(x, t)$ and the tilde over the symbol $D^n$ refers to the extended cost including the effect of the gauging function $G$. The possibility of simultaneous generation of data for
(2.110)
This refers to the linear grid of values $\tilde{x} = ad$, where $a = A, A+1, \ldots, B$. Other values of
(2.111)
The discrete subset of controls $u^n$ is defined in a similar way. For example, when the constraints imposed on $u^n$ satisfy the inequality $u_* \le u^n \le u^*$, the variable $u^n$ may assume only the discrete values
$$u_* = Ec,\; (E+1)c,\; \ldots,\; (F-1)c,\; Fc = u^* \tag{2.112}$$
for an appropriately small value $c$. This refers to a linear grid of controls $u = bc$, where $b = E, E+1, \ldots, F-1, F$. However, the extremum of $S$ with respect to $h^n$ must be stationary for the present theory to apply. This stationarity condition is represented by the vanishing enlarged Hamiltonian, Eq. (2.78). In terms of the function
(2.113)
$$f'^{\,n}_0(\tilde{x}^n, u^n, h^n) \equiv f_0^n(\tilde{x}^n, u^n) + G_x[\tilde{x}^n - \tilde{f}^n(\tilde{x}^n, u^n)h^n] \cdot f^n(\tilde{x}^n, u^n) + G_t^{n-1}[\tilde{x}^n - \tilde{f}^n(\tilde{x}^n, u^n)h^n]. \tag{2.114}$$
While the modified profit intensity $f'^{\,n}_0$ does not coincide with the gauged profit intensity $\tilde{f}_0^n$ of Eq. (2.108), the stationarity condition of $S$ with respect to $h^n$, Eqs. (2.113) and (2.114), is nonetheless consistent with the recurrence formula, Eq. (2.101). With the help of the interpolation formula, Eq. (2.111), applied at the stage $n-1$, Eq. (2.113) uses tables of function
(2.115)
un
which follows from Eq. (2.109) for a given hn. However, the approach involving the numerical calculation of partial derivatives of
$V^0(\tilde{x}) = 0$ in Eqs. (2.109) and (2.116); this yields
$$V^1(\tilde{x}) = \max_{u^1, h^1} \{ \tilde{D}^1(\tilde{x}, u^1, h^1) \}, \tag{2.117}$$
$$\{ u^1(\tilde{x}), h^1(\tilde{x}) \} = \arg\max_{u^1, h^1} \{ \tilde{D}^1(\tilde{x}, u^1, h^1) \}. \tag{2.118}$$
To find these functions, the computer chooses the first point $\tilde{x} = Ad = (A_1 d, A_2 d)$ and compares $\tilde{D}^1(Ad, E_1 c, E_2 c)$ with $\tilde{D}^1(Ad, (E_1+1)c, E_2 c)$, for a fixed $h^1$. The larger of these values is stored and compared with $\tilde{D}^1(Ad, (E_1+2)c, E_2 c)$, etc. This process is continued until the whole discrete set of controls $u^1$ is exhausted. The largest of the values so obtained is always a maximum of $\tilde{D}^1$ with respect to $u^1$ for the fixed discrete point $\tilde{x}$ and for the fixed $h^1$. At the same time, the coordinates of $u^1$ which maximize $\tilde{D}^1$ are stored. The whole computation is next repeated for another fixed value of $h^1$, and the best profits $\tilde{D}^1$ are compared until an optimal $h^1$ is found for which $\tilde{D}^1$ is the largest. This leads to the absolute maximum of $\tilde{D}^1$ with respect to $u^1$ and $h^1$ for the fixed discrete point $\tilde{x}$. The coordinates of $u^1$ and $h^1$ which maximize $\tilde{D}^1$ are stored. Analogous operations are next performed for $\tilde{x} = ((A_1+1)d, A_2 d)$, $((A_1+2)d, A_2 d)$, and so on. Again, this leads to a maximum of $\tilde{D}^1$ and optimal values of $u^1$ and $h^1$. The data of the same quantity differ at various points $\tilde{x}$ (different nodes of the grid). The computer outputs are DP tables which contain only the optimal data: $V^1(\tilde{x})$, $h^1(\tilde{x})$ and $u^1(\tilde{x})$. For $n = 2$ (two-stage process), as well as for larger $n$, the procedure is analogous but uses Eqs. (2.109) and (2.116) in their complete form. The data of
we obtain an optimal solution as a sequence of optimal controls $u^N, u^{N-1}, \ldots, u^1$ and $h^N, h^{N-1}, \ldots, h^1$, and an optimal discrete trajectory, $\tilde{x}^N, \tilde{x}^{N-1}, \ldots, \tilde{x}^1, \tilde{x}^0$. A sequence of optimal costs describing all related subprocesses
costs are defined in a similar way. For the single-stage profit at the stage $n$, $\tilde{D}^n$, which appears in Eq. (2.107), a net profit is $\tilde{D}^n_* \equiv \tilde{D}^n - h\,h^n$, where $h^n \equiv \Delta t^n$; similarly the total cost at the stage $n$ is $\tilde{K}^n_* \equiv \tilde{K}^n + h\,h^n$, where $\tilde{K}^n \equiv -\tilde{D}^n$. For the net profit $\tilde{D}^n_*$, an optimal behavior at the stage $n$ is governed by a sequence of asterisk functions: $V^1_*, \ldots,$
Fig. 2. Example of the dynamic programming generation of optimal data in a fluidized drying process, for one node of the computational grid ($t_s = 80\,^{\circ}$C, $W_s = 0.115$ kg/kg). The potential function $R^n_*(x^n, h)$ of the $N$ stage process is obtained as the sum of the one-stage cost $\tilde{K}^n_*$ and the function $R^{n-1}_*(x^{n-1}, h)$ of the $N-1$ stage process, at the constant parameter of investment cost ('time price'), $h$. In the considered case $N = 2$.
$T_s$ and moisture content $W_s$) by a gas with temperature $T_g$, humidity $X_g$ and flow $\theta^n = \Delta G^n/S$. The original controls $(T_g^n, X_g^n)$ are replaced in Fig. 2 by another pair of controls, which are just the state variables of the (n−1)-stage subprocess ($I_s \equiv I_s^{n-1}$, $W_s \equiv W_s^{n-1}$). Consequently, Fig. 2 depicts how the nonoptimal costs $\tilde{K}^n_*(x^n, I_s^{n-1}, W_s^{n-1}, \theta^n)$ and $R^{n-1}_*(x^{n-1}, h)$ combine to give the optimal function $R^n_*(x^n, h)$. The mathematics of these approaches is independent of the specific applications; thus, for example, they can be conducted first in the frame of thermodynamic formulations, and the experience gained can then be used to formulate and solve more involved problems of economics.

2.3.3. Numerical approaches which apply discrete maximum principle

Now we describe the second basic numerical method, the one which applies the discrete maximum principle with the energy-type Hamiltonian (2.87). The necessary extremum conditions
are Eqs. (2.87)–(2.91) with $\theta^n = t^n - t^{n-1}$. They are written below in a form suitable for numerical work, which uses the extremum Hamiltonian $H^{n-1} \equiv \max H^{n-1}$:

$$F_1(H^{n-1}, x^n, z^{n-1}, t^n, u^n) = 0 \quad \text{(definition of } H^{n-1}\text{, Eq. (2.87))},$$
$$F_2(x^n, x^{n-1}, t^n, t^{n-1}, u^n) = 0 \quad \text{(state equations, Eq. (2.88))},$$
$$F_3(x^n, z^n, z^{n-1}, t^n, t^{n-1}, u^n) = 0 \quad \text{(adjoint equations, Eq. (2.89))},$$
$$F_4(H^n, H^{n-1}, x^n, z^{n-1}, t^n, t^{n-1}, u^n) = 0 \quad \text{(rate of change of } H^{n-1}\text{, Eq. (2.90))},$$
$$F_5(x^n, t^n, z^{n-1}, u^n) = 0 \quad \text{(extremality of } H^{n-1}\text{, Eq. (2.91))}.$$

These equations should be solved with a computer. Note that in the case of an autonomous process, Eq. (2.90) simplifies to $H^n - H^{n-1} = 0$. Typical optimal control problems lead to two-point boundary conditions, and procedures matching these boundary conditions should be designed. In contrast to DP algorithms, two-point boundary conditions increase the difficulty of the numerical solution when the method of the maximum principle is used. Owing to the strong analogy of the difference model with Pontryagin's [16] algorithm, both the trial-and-error procedures which deal with two-point boundary values and the control improvement procedures are identical with those applied in the continuous case [3,16,17]. Methods of trajectory improvement in the state space and gradient methods in the control space are effective. The conditions for free final state variables and free final time, Eqs. (2.99) and (2.100) for n = N, are especially frequent. They show that when the gauging function G vanishes and the final coordinates $(x^N, t^N)$ are undetermined, the jth final state adjoint $z_j^N = 0$ and the final time adjoint $z_t^N = 0$. The initial conditions usually require that some state coordinates, say $x_j^0$, and the time $t^0$ are fixed, for example,

$$x_j^0 = \bar{x}_j^0; \qquad t^0 = \bar{t}^0. \qquad (2.121)$$

Since the discrete rates $f^n$ are explicit in terms of $x^n$ and $t^n$ rather than in terms of $x^{n-1}$ and $t^{n-1}$, the standard procedure is backward in time, i.e. it starts with the last stage, N.
The outlet variables from the last stage $(x^N, z^N, t^N)$ and the final maximum Hamiltonian, $H^N \equiv \max H^N$, should be determined from the final conditions or they should be assumed. For autonomous processes the numerical value of $H^N$ represents the optimal value of $H^n$ at all stages. At each stage n, the considered set of algebraic equations has to be solved for $n = N, N-1, \ldots, k, \ldots, 1$ with respect to the (five types of) variables $x^{n-1}$, $t^{n-1}$, $z^{n-1}$, $u^n$ and $H^{n-1}$ in terms of the quantities $x^n$, $z^n$, $t^n$ and $H^n$, the latter being known from the computations of the previous stage. This leads finally to the first-stage data, $x^0$, $t^0$, $z^0$, $u^1$ and $H^0$, which should be verified with respect to the initial conditions. As in the theory of continuous systems, error criteria describe deviations of the obtained initial data from their known fixed values (barred symbols). These deviations are generated in terms of the undetermined final variables, e.g.,

$$e(x^N, t^N) = \frac{1}{2} b_t \left(t^0(x^N, t^N) - \bar{t}^0\right)^2 + \frac{1}{2} \sum_{j=1}^{s} b_j \left(x_j^0(x^N, t^N) - \bar{x}_j^0\right)^2 \qquad (2.122)$$

and minimized during the numerical procedure, thus leading to the required accuracy.
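The error criterion of Eq. (2.122) can be written as a small function. The sketch below is only illustrative: the function name `shooting_error`, its argument names and the default unit weights are assumptions made here, and the backward sweep that would produce `t0` and `x0` from trial final data is not shown.

```python
def shooting_error(t0, x0, t0_bar, x0_bar, b_t=1.0, b=None):
    """Weighted squared deviation of computed initial data from fixed values.

    t0, x0         -- initial time and state obtained by a backward sweep
                      started from trial final data (x^N, t^N)
    t0_bar, x0_bar -- prescribed (fixed) initial time and state
    b_t, b         -- weights b_t and b_j of Eq. (2.122); unit weights are
                      an assumption of this sketch
    """
    if b is None:
        b = [1.0] * len(x0)
    err = 0.5 * b_t * (t0 - t0_bar) ** 2
    err += 0.5 * sum(bj * (xj - xbj) ** 2
                     for bj, xj, xbj in zip(b, x0, x0_bar))
    return err


# Illustrative numbers only: one time coordinate and two state coordinates.
print(shooting_error(0.1, [1.0, 2.0], 0.0, [1.0, 1.5]))
```

In a full procedure this value would be passed to any standard minimizer acting on the trial final variables; the criterion vanishes exactly when the backward sweep reproduces the prescribed initial data.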
We shall now describe a general solving procedure for a nonautonomous system that applies when the adjoint equations contain controls (which is not always the case). The computational scheme works if one can solve analytically the set of equations characterizing the maximum of $H^{n-1}$ with respect to the controls $u^n$ and the adjoints $z^n$ rather than $z^{n-1}$. Eqs. (2.87)–(2.91) are the basis for the procedure, which is characterized by the following algorithm. Express $z^{n-1}$ as a function of the remaining variables from the adjoint equations

$$F_3(x^n, z^n, z^{n-1}, t^n, t^{n-1}, u^n) = 0 \quad \text{(adjoint equations)},$$

thus obtaining $z^{n-1} = a(x^n, z^n, t^n, t^{n-1}, u^n)$. Substitute this function into the remaining equations of the set. This leads to a set of transformed equations:

$$\tilde{F}_1(H^{n-1}, x^n, z^n, t^n, t^{n-1}, u^n) = 0 \quad \text{(definition of } H^{n-1}\text{)},$$
$$\tilde{F}_4(x^n, z^n, t^n, t^{n-1}, u^n) = 0 \quad \text{(extremality of } H^{n-1}\text{)},$$
$$\tilde{F}_5(H^n, H^{n-1}, x^n, z^n, t^n, t^{n-1}, u^n) = 0 \quad \text{(rate of change of } H^{n-1}\text{)},$$
$$\tilde{F}_2(x^n, x^{n-1}, t^n, t^{n-1}, u^n) = 0 \quad \text{(state equations)}.$$

(Note that the form of the state equations remains unchanged by these transformations.) If the transformed equation characterizing the extremality of the Hamiltonian is solvable analytically with respect to $u^n$, then the control can be expressed as $u^n(x^n, z^n, t^n, t^{n-1})$ and substituted into the remaining equations. One is then left with an optimal set in which the state equations are in terms of the adjoint vector $z^n$ rather than $z^{n-1}$:

$$\tilde{F}_1(H^{n-1}, x^n, z^n, t^n, t^{n-1}) = 0 \quad \text{(definition of } H^{n-1}\text{)},$$
$$\tilde{F}_5(H^n, H^{n-1}, x^n, z^n, t^n, t^{n-1}) = 0 \quad \text{(rate of change of } H^{n-1}\text{)},$$
$$\tilde{F}_2(x^n, x^{n-1}, z^n, t^n, t^{n-1}) = 0 \quad \text{(state equations)}.$$

Numerical solution of the first two equations at stage n yields the variables $t^{n-1}$ and $H^{n-1}$ in terms of the known (after-stage) variables $x^n$, $z^n$, $t^n$, $H^n$. Substitution of the time $t^{n-1}$ into the state equations $\tilde{F}_2$ and the transformed adjoint equations

$$\tilde{F}_3(x^n, z^n, z^{n-1}, t^n, t^{n-1}) = 0 \quad \text{(adjoint equations)}$$

(which incorporate the optimal controls $u^n(x^n, z^n, t^n, t^{n-1})$) leads to the final set

$$\tilde{F}_2(x^n, x^{n-1}, z^n, t^n) = 0 \quad \text{(state equations)},$$
$$\tilde{F}_3(x^n, z^n, z^{n-1}, t^n) = 0 \quad \text{(adjoint equations)}.$$

Thus, the original computational set can be broken down into the above final form, which is the basis for the computational procedure. From this set the state coordinates
and adjoint coordinates before the nth stage, $x^{n-1}$ and $z^{n-1}$, and all other quantities entering stage n are determined. The computer may then pass to stage n−1, and all computations are repeated for this stage (backward procedure).
3. Applications

3.1. Multistage endoreversible work-producing systems

3.1.1. Extremum work in cascade of thermal machines with pure heat transfer

As the first example where HJB equations are applied, we consider multistage systems in which work can be produced by a cascade of thermal machines operating sequentially between a fluid and a bath, i.e. an infinite reservoir. The fluids have finite thermal conductivity; hence the nonvanishing resistances in their boundary layers and the irreversible properties of any such heat-transfer system. The sequence of thermal machines operates at a steady state. The multistage process is, in fact, a sequence of Novikov–Curzon–Ahlborn engines (NCA processes [27–30]). These repeat in all stages of the cascade; the only differences among them are the temperatures of each stage, which must be consistent with the energy balance. The multistage system contains: the driving fluid with gradually decreasing temperatures $T^1, \ldots, T^N$; the environment at the constant temperature $T^e$; the boundary layers, which act as thermal conductances; and the set of Carnot engines, $C^1, \ldots, C^N$, which generate the mechanical work at each stage n. At each stage the bulk of the fluid is well mixed, and a mixing scale is assigned to the fluid flow. Each stage n is a distinct unit which has its own internal structure composed of the following elements: a well-mixed cell of bulk fluid with inlet temperature $T^{n-1}$ and outlet temperature $T^n$, the thermal resistances of two boundary layers, a Carnot engine $C^n$, and a contact zone with an isothermal bath. The power-producing version of the process is called the engine mode; in this version the whole system approaches equilibrium and power is produced. In the second version, called the heat-pump mode, the system leaves thermodynamic equilibrium and power is consumed. An analysis of a single stage precedes the cascade optimization.
The stage conductances are $g_1 = \alpha_1 a_1$ and $g_2 = \alpha_2 a_2$, where $a_1$ and $a_2$ are the upper and lower exchange surface areas at the stage. When the system is in the engine mode, the hot fluid with temperature $T_1$ releases heat at $T = T_1$, which reaches the engine part at $T_{1'}$. The low-grade heat is released by the engine fluid at a low temperature $T_{2'}$ to the bath fluid, and reaches the bath at a low temperature $T_2 = T^e$. In a more general case of two fluids, each flowing with a finite mass flux, both fluid temperatures $T_1$ and $T_2$ vary along the path; however, we restrict ourselves here to the case $T_2 = T^e$. Clearly, for the engine mode, the efficiency of the power production at a stage, $\eta = \mathrm{W}/q_1$, is smaller than the efficiency $\eta_C$ of the Carnot cycle operating between $T_1$ and $T_2$. For the purpose of optimization, a standard representation of this multistage process as a discrete optimal control problem is given in terms of a set of difference equations describing the relation between the stage inputs, outputs and controls. The derivation of these equations follows earlier work [29,31]. The fluxes of work produced at the stages are summed, so that a cumulative power output $\mathrm{W}$ is obtained at the last stage. By definition, the power released in the engine mode is positive. The associated optimization problem is that of maximizing the power generated in a finite time. The power consumed in the heat-pump mode is negative. The associated optimization problem is that
of minimizing the power supplied. In classical, reversible thermodynamics, the associated magnitudes of the power are the same for the two modes. Finite-time thermodynamics, a theory for irreversible processes, is a schema showing that (and why) the magnitude of the heat-pump-mode power is larger than that of the engine-mode power. In classical thermodynamics such processes involve fluids and Carnot engines only; no boundary layers are considered. Their continuous quasistatic limit leads to the classical thermal energy for a reversible process; it has no rate term. In our case, which deals with an irreversible multistage process in a sequence of NCA machines, the optimal performance function represents a finite-time available energy (exergy) of the driving fluid. The power output per unit flux of the driving fluid, $W = \mathrm{W}/G$, has units of work per unit mass, and describes the specific work associated with the steady-flow process of power production. The optimization problem for the cascade can be stated for the extremum work $W = \mathrm{W}/G$, consistent with the extremum of the power $\mathrm{W}$. For a cascade of engines, where work is released, the work $W$ has to be maximized. For a cascade of heat pumps, where work is supplied to the system, minimization of the negative of the work, $(-W)$, is required. The optimal specific work represents an extended or finite-time exergy of the flowing fluid when one of the boundary states of this fluid is the state of equilibrium with the environment (e.g. $T^N = T^e$). This extended exergy simplifies to the classical thermal exergy in the reversible limit when the overall stage conductances, $g^n$, tend to infinity. The entropy balance for the internal Carnot machine in a single NCA unit and the related energy balance determine the power production. It is given by the Carnot formula with respect to the internal temperatures at which the engine works (the primed temperatures $T_{1'}$ and $T_{2'}$). This power is, of course, lower than that of a Carnot engine working between the external temperatures $T_1$ and $T_2$, because the engine part operates on the reduced temperature difference $T_{1'} - T_{2'}$. While the temperatures $T_{1'}$ and $T_{2'}$ are unknown, they may be found in terms of $T_1$ and $T_2$ and a control variable at the stage. The choice of the control variable is in principle arbitrary; for example, the control may be the heat flux, $q_1$, the first-law efficiency, $\eta$, the heat released to the environment, u, or others. See an analysis of a one-stage NCA process in terms of efficiency [30,31]. Here, we shall modify the results of these analyses to describe a stage in terms of the driving heat flux $q_1$ as a control. The results are next adapted to cascades consisting of N stages, with arbitrary N, and extended to processes with mass transfer. If the heat $q_1$ is the control variable and Newton's law (heat flow proportional to the temperature difference) describes the heat exchange, the temperature $T_{1'}$ follows immediately in terms of $q_1$ as

$$T_{1'} = T_1 - q_1/g_1. \qquad (3.1)$$
On the other hand, the entropy balance in the form of the continuity of entropy flux through the reversible part of the engine, $q_1/T_{1'} = q_2/T_{2'}$, is necessary to evaluate the temperature $T_{2'}$ in terms of $q_1$. With Eq. (3.1) and Newton's law for $q_2$, this entropy balance can be written in the form

$$q_1/(T_1 - q_1/g_1) = g_2 (T_{2'} - T_2)/T_{2'}. \qquad (3.2)$$

Solving this equation for $T_{2'}$ yields

$$T_{2'} = T_2 (T_1 - q_1/g_1)/(T_1 - q_1/g), \qquad (3.3)$$
where the overall conductance of heat transfer is defined in the traditional way as the harmonic sum of the partial conductances, $g \equiv (1/g_1 + 1/g_2)^{-1}$, or $g_1 g_2/(g_1 + g_2)$. The first-law efficiency of the internal engine satisfies the Carnot formula in terms of $T_{1'}$ and $T_{2'}$; with Eqs. (3.1) and (3.3) this efficiency is obtained in terms of $T_1$, $T_2$ and $q_1$ as

$$\eta = 1 - T_{2'}/T_{1'} = 1 - T_2/(T_1 - g^{-1} q_1) \equiv 1 - T_2/(T_1 + u), \qquad (3.4)$$
where $u \equiv -g^{-1} q_1$ is a measure of the heat added to fluid 1, a suitable control if g is constant. It has units of temperature and is positive for fluid heating and negative for fluid cooling. Thus a simple equation holds which links the power w with the heat flux $q_1$:

$$w = q_1 \left(1 - T_2/(T_1 - g^{-1} q_1)\right) = -g u \left(1 - T_2/(T_1 + u)\right). \qquad (3.5)$$
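Eqs. (3.1)–(3.5) can be checked numerically for a single stage. The following is a minimal sketch under stated assumptions: the function name `nca_stage`, the returned dictionary and the numerical values are chosen here purely for illustration and do not come from the text.

```python
def nca_stage(q1, T1, T2, g1, g2):
    """Internal temperatures, efficiency and power of one NCA stage."""
    g = 1.0 / (1.0 / g1 + 1.0 / g2)   # overall conductance, defined after Eq. (3.3)
    T1p = T1 - q1 / g1                # Eq. (3.1): upper internal temperature T_1'
    T2p = T2 * T1p / (T1 - q1 / g)    # Eq. (3.3): lower internal temperature T_2'
    eta = 1.0 - T2p / T1p             # Eq. (3.4): first-law efficiency
    w = q1 * eta                      # Eq. (3.5): power
    q2 = g2 * (T2p - T2)              # Newton's law for the heat rejected to the bath
    return {"g": g, "T1p": T1p, "T2p": T2p, "eta": eta, "w": w, "q2": q2}


# Illustrative values (arbitrary units): T1 = 600, T2 = 300, g1 = g2 = 2.
stage = nca_stage(q1=100.0, T1=600.0, T2=300.0, g1=2.0, g2=2.0)
print(stage)

# At the Fourier point q_F = g (T1 - T2) the efficiency and the power vanish.
fourier = nca_stage(q1=stage["g"] * (600.0 - 300.0), T1=600.0, T2=300.0, g1=2.0, g2=2.0)
print(fourier["eta"], fourier["w"])
```

With these numbers the entropy fluxes $q_1/T_{1'}$ and $q_2/T_{2'}$ coincide and $q_1 - q_2$ equals the power, which is a quick consistency check on the reconstruction of Eqs. (3.2) and (3.3).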
The efficiency of the engine deviates monotonically from the Carnot efficiency as the finite $q_1$ or u grows. Also, the power w, Eq. (3.5), deviates from that of the Carnot model because of the finite $q_1$. For a quasistatic transfer, i.e. for very low $q_1$ or u, the efficiency $\eta$ is that of Carnot. Yet the efficiency vanishes for a sufficiently large flux $q_1$ at the Fourier point, where $q_F = g(T_1 - T_2)$. This corresponds to pure heat conduction and no power production at all. Thus the power vanishes at both $q_1 = 0$ and $q_1 = q_F$, hence a maximum of w at an intermediate point. The inverted form of the middle expression in Eq. (3.4) can be used to present the quantities of interest (power w, released heat u, etc.) in terms of the efficiency control $\eta$ instead of the heat flux control $q_1$:

$$q_1 = g \left(T_1 - [1/(1 - \eta)]\, T_2\right) \qquad (3.6)$$
[30]. It follows from this formula that, in the efficiency representation, the power produced, $w = \eta q_1$, is

$$w = g \eta \left(T_1 - T_2/(1 - \eta)\right). \qquad (3.7)$$
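The efficiency representation, Eq. (3.7), makes the intermediate power maximum easy to locate. The sketch below compares a plain grid search with the closed form $\eta = 1 - (T_2/T_1)^{1/2}$; that closed form is not quoted in the text at this point but follows from setting $dw/d\eta = 0$ in Eq. (3.7) (the familiar Curzon–Ahlborn result). Function names and numbers are assumptions made for illustration.

```python
import math


def power_from_efficiency(eta, T1, T2, g):
    """Power of one NCA stage in the efficiency representation, Eq. (3.7)."""
    return g * eta * (T1 - T2 / (1.0 - eta))


def eta_at_max_power(T1, T2):
    """Stationary point of Eq. (3.7), obtained here from dw/d(eta) = 0."""
    return 1.0 - math.sqrt(T2 / T1)


T1, T2, g = 600.0, 300.0, 1.0
eta_carnot = 1.0 - T2 / T1

# Grid search over the admissible range 0 <= eta < eta_carnot.
n = 20000
grid = [eta_carnot * i / n for i in range(n)]
eta_best = max(grid, key=lambda e: power_from_efficiency(e, T1, T2, g))

print(eta_best, eta_at_max_power(T1, T2))
print(power_from_efficiency(eta_best, T1, T2, g))
```

The power is zero at $\eta = 0$ (the Fourier point) and at $\eta = \eta_C$ (the quasistatic limit), so the grid maximum lies strictly inside the interval.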
The power may be produced only in the range of efficiencies between 0 and $\eta_C$. We add the stage number superscript, n, to all symbols in the above formulae when we apply them to a cascade. Yet, since in the infinite-reservoir case $T_2$ is only a parameter and not a state variable, we can also simplify our designation for the state variable $T_1^n$ and the heat $q_1^n$ by dropping the unnecessary subscript 1; thus we will use the symbols $T^n$ and $q^n$ for the resource fluid temperature and the driving heat flux at stage n. We shall also designate the constant heat-sink temperature as $T^e$ rather than $T_2^n$. With these changes the last expression in Eq. (3.4), which links the heat control variable, $u^n$, with the efficiency $\eta^n$, becomes

$$\eta^n = 1 - T^e/(T^n - (g^n)^{-1} q^n) = 1 - T^e/(T^n + u^n), \qquad (3.8)$$
where $u^n \equiv -(g^{-1})^n q^n$. Eq. (3.8) shows that the efficiency of power production at an engine stage decreases with the intensity variables $q^n$ or $(-u^n)$. The decreased efficiencies of engines and the changes in other system properties caused by finite rates were first shown in a number of papers on finite-time thermodynamics (FTT), see e.g. [32]; all early works related to FTT issues are contained in Andresen's thesis [33]. Optimization of multistage FTT systems was initiated in Ref. [34].
However, for cascades there are some extensive quantities specific to the cascade as a whole, and not to a single stage. These are cumulative quantities which may play the role of state coordinates in optimization when there are limits on the consumption of certain fluxes. Since the optimization problem is that of maximizing $W^n$ at the end of the process, i.e. for n = N, it is convenient to associate the variable $W$ with a zeroth state coordinate. Eq. (3.10) below is a discrete rate formula describing this coordinate. An equation for the state variable describing the cumulative work is obtained from an expression for the power delivered at stage n. We define the cumulative power per unit mass flow, $W^n \equiv \sum w^n/G$, a quantity whose units are those of specific work. When Eq. (3.3) or (3.8) is applied at stage n with the identifications $T_2 = T^e$ and $T_1 = T^n$, and we introduce a nondimensional conductance $\theta^n$ at stage n, such that

$$\theta^n \equiv g^n/(Gc), \qquad (3.9)$$

along with the power intensity function, $f_0^n$, with respect to the nondimensional time $\tau^n$,

$$f_0^n \equiv -c \left(1 - T^e/(T^n + u^n)\right) u^n, \qquad (3.10)$$

then we obtain a state equation for the specific work in terms of the control $u^n$:

$$W^n - W^{n-1} = f_0^n \theta^n \equiv -c \left(1 - T^e/(T^n + u^n)\right) u^n \theta^n. \qquad (3.11)$$

The total work is the sum of these expressions over the stages. Since the conductance $g^n$ may be expressed as the product of an overall coefficient of heat transfer and the corresponding overall transfer area, we identify the nondimensional conductance $\theta^n$ as the so-called number of transfer units at stage n, a well-known engineering quantity. The ratio of the stage length $l^n$ to $\theta^n$ is then a quantity with units of length; it is identified with the so-called height of the transfer unit at stage n, $H^n_{TU}$. From its definition and Eq. (3.9), $H^n_{TU}$ equals $Gcl^n/g^n$. The quantities $H^n_{TU}$ take small values for large conductances $g^n$. However, it will be sufficient to use the dimensionless variable $\theta^n$ only. One may note that $\theta^n$ appears linearly in Eq. (3.11), an important fact for the optimization of the model. To carry out the optimization, we first need the second discrete equation of state, which describes the driving heat. To find an equation of state for the temperature $T^n$, we define the cumulative driving heat $Q^n$ for the first n stages of the cascade: $Q^n = \sum q^k$, where $k = 1, 2, \ldots, n$. This quantity satisfies the equality $Q^n - Q^{n-1} = -Gc(T^n - T^{n-1}) = q^n$, where, because of Eq. (3.9) and the definition $u^n \equiv -(g^{-1})^n q^n$, the heat $q^n$ is the product of $-g^n$ $(= -Gc\theta^n)$ and $u^n$. Comparison of the two expressions for $q^n$ yields the second equation of state

$$T^n - T^{n-1} = u^n \theta^n. \qquad (3.12)$$
This equation tells us that $u^n = -q^n/g^n$, i.e. the negative of the heat received by the nth Carnot engine (in units of temperature), at the same time plays the role of the discrete rate of temperature change, $\Delta T^n/\Delta \tau^n$. To close the model, we regard $\theta^n$ as the increment of an independent variable which we designate by $\tau^n$. The latter is nondimensional, of definite (positive) sign, and measures the cumulative number of heat transfer units. The variable $\tau^n$ satisfies the definition $\tau^n = \sum \theta^k$ for $k = 1, 2, \ldots, n$. Thus the third equation of state is simply

$$\tau^n - \tau^{n-1} = \theta^n. \qquad (3.13)$$
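The three discrete state equations, Eqs. (3.11)–(3.13), can be stepped forward for any prescribed sequence of controls. The sketch below is illustrative only: the function name `simulate_cascade`, the choice c = 1 and the numerical controls are assumptions, and the stage temperature entering $f_0^n$ is taken to be the outlet value $T^n$, which is how the well-mixed stage described above is read here.

```python
def simulate_cascade(T0, controls, thetas, Te, c=1.0):
    """Step Eqs. (3.11)-(3.13) through the stages n = 1, ..., N.

    controls -- sequence of u^n (units of temperature per unit tau)
    thetas   -- sequence of theta^n (numbers of transfer units)
    Returns the list of states (T^n, W^n, tau^n) for n = 0, ..., N.
    """
    T, W, tau = T0, 0.0, 0.0
    states = [(T, W, tau)]
    for u, theta in zip(controls, thetas):
        T = T + u * theta                     # Eq. (3.12): T^n - T^{n-1} = u^n theta^n
        f0 = -c * (1.0 - Te / (T + u)) * u    # Eq. (3.10), evaluated at the outlet T^n
        W = W + f0 * theta                    # Eq. (3.11): W^n - W^{n-1} = f0^n theta^n
        tau = tau + theta                     # Eq. (3.13): tau^n - tau^{n-1} = theta^n
        states.append((T, W, tau))
    return states


# Engine mode: the fluid is cooled (u < 0) from T0 = 600 towards Te = 300.
for state in simulate_cascade(600.0, [-100.0, -100.0], [0.5, 0.5], Te=300.0):
    print(state)
```

A positive final $W^N$ indicates work delivered; reversing the sign of the controls (fluid heating) gives the heat-pump mode with negative $W^N$.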
Eq. (3.13) is essential in all problems with a constrained sum of $\theta^n$. This corresponds to a constrained total transfer area, because the single-stage transfer areas are contained in $g^n$ and hence in $\theta^n$. This also corresponds to a constraint on the total length. In accordance with the terminology used in optimization, Eqs. (3.11)–(3.13) are the discrete equations of state for the cascade. They contain state variables and controls on their right-hand sides. An optimizing procedure for the discrete problem requires the specification of a maximized quantity, which may be regarded as $W^n$ of Eq. (3.11) at n = N, the complete set of state equations (as above), and possibly some local constraints at the stage in question.

3.1.2. Generalized cascades with heat and mass transfer

A generalization of the above model can be considered [35] which includes the effect of mass transfer. In this case the power intensity which generalizes Eq. (3.10) has an involved form. Some redefinitions are suitable in this more general case. Molar quantities are used, as some formulae then become simpler. The Lewis analogy, which links the heat and mass transfer coefficients, is applied. The conductances are then linked by the molar heat capacity, c; for the mass transfer conductance g, the heat transfer conductance is gc. The negative ratio of the power w and the mass transfer conductance g defines the Lagrangian $L$ of the problem in terms of two controls u and v. The first control is related to the heat flux, $q = -gcu$, the second one to the mass flux, $n = -gv$. The Lagrangian $L$ now has units of energy per mole. We also use $f_0 = -L$, or the power output per unit molar conductance $g = G\theta$; then

$$dW/d\tau = w/g = w/(\theta G) \equiv f_0 = -L, \qquad (3.14)$$
where $\theta = g/G$ is the nondimensional conductance of mass. To obtain the work produced per unit mole of the fluid, we integrate over $\tau$ the function $f_0$ expressed in terms of the controls and the state. With $q = -gcu$ and $n = -gv$ the power formula is w g f (¹, X, u, v)" "cu!c¹v# 2 c%¹% 0 g g
A
!
A
B GA
g 2 c%!c v ¹% p g
¹#(cg /g#c v)~1cu 1 p ¹#(g /g)(cg /g#c v)~1cu 1 1 p
(1#X)(1`X)(X#gv/g ) X#gv/g1 1 XX(1#X#gv/g ) 1#X#gv/g1 1
A
B
B
g R/g c% 1 2
(1#X%)(1`X%)(X%!gv/g )X%~gv@g2 2 X%X%(1#X%!gv/g )1`X%~gv@g2 2
B
!g R/g c% 2 2
H
.
(3.15)
This equation allows us to find the maximum cumulative work when a finite-resource fluid changes its thermodynamic parameters in a finite time between two assumed states. It can be shown [35] that the above power formula reduces exactly to that of pure heat transfer when n = 0.
In this generalized case the set of discrete state equations contains:

• The state equation for the molar work in terms of the controls $u^n$ and $v^n$. It assumes the general form given by the first equality of Eq. (3.11),

$$W^n - W^{n-1} = f_0^n \theta^n, \qquad (3.11')$$

where the function $f_0$ is that appearing in Eq. (3.15).

• The state equation for the temperature changes of the resource fluid (the same as in the case with pure heat transfer),

$$T^n - T^{n-1} = u^n \theta^n. \qquad (3.12a)$$

Again, this equation tells us that $u^n = -q^n/(g^n c^n)$, or the negative of the heat received by the nth reversible engine (in units of temperature), plays the role of the discrete rate of the temperature change in the nondimensional time, $\Delta T^n/\Delta \tau^n$.

• A new equation of state which describes the concentration $X^n$. A reasoning similar to that applied in the previous case holds. We define the cumulative molar flux $N^n$ for the first n stages of the cascade: $N^n = \sum n^k$, where $k = 1, 2, \ldots, n$. This quantity satisfies the equality $N^n - N^{n-1} = -G(X^n - X^{n-1}) = n^n$, where, because of Eq. (3.9) taken in the mass representation, the mass flux $n^n$ is the product of $-g^n$ $(= -G\theta^n)$ and $v^n$. Comparison of these two expressions yields the equation of state

$$X^n - X^{n-1} = v^n \theta^n, \qquad (3.16)$$

where $v^n$ is the rate of change of the concentration per unit of the nondimensional time $\tau$.

• The closing equation of state: Eq. (3.13).

We observe that, in cascades with finite N, the discrete state equations follow from a simple discretizing operation applied to the derivatives $dT/d\tau = u$, $dX/d\tau = v$ and $d\tau/d\tau \equiv 1$.

3.1.3. Extensions for variable transport coefficients

When the transport coefficients and the specific heat vary along the path, we can no longer use the nondimensional time $\tau$. Optimization has to deal with models in which the time variable represents a real residence time; the increment of the independent variable (for which the same symbol $\theta^n$ is always used) then equals $\theta^n = t^n - t^{n-1}$. We shall illustrate the model modification with the analytical example of the multistage problem with pure heat transfer. The state equation for the maximized work can be written as

$$W^n - W^{n-1} = f_0^n \theta^n \equiv -c \left(1 - T^e/(T^n + \chi^n u^n)\right) u^n \theta^n, \qquad (3.11'')$$

where the coefficient $\chi = \rho c/(\alpha' a_v)$, $\rho$ is the fluid's density, c the fluid's specific heat, $\alpha'$ the overall heat transfer coefficient associated with the overall conductance g, and $a_v$ the total exchange area per unit volume of the fluid. The product $\chi u^n$ (in units of temperature) equals $-q^n/g^n$, where $g^n$ is the overall thermal conductance at stage n and $q^n$ is the heat which drives the nth Carnot engine. The nondimensional conductance, or the number of transfer units at stage n, equals $\theta^n/\chi = g^n/(cG)$. Eq. (3.12) is still valid provided that the definition of the control $u^n$ is modified; now $u^n$ is the discrete rate of the temperature change in the usual time, $u^n = \Delta T^n/\Delta t^n$. Eq. (3.13) holds with the usual time $t^n$ replacing $\tau^n$. The improved treatment takes into consideration variability in time of thermal and transfer
coefficients: the specific heat capacity c and the heat transfer coefficient $\alpha'$. An analogous procedure holds when mass transfer is included. (Again, $\chi = \rho c/(\alpha' a_v)$, which has units of time, is necessary to pass to the description which uses the usual time t.) Since the rate functions in the state equations do not contain the work coordinate $W^n$ explicitly, the optimization problem of the maximum final work coordinate $W^N$ is equivalent to the problem of maximizing the sum $S^N = \sum f_0^n \theta^n$. If, for example, the usual time is used in the case of thermal machines without mass transfer, the sum to be maximized is

$$S^N = \sum_{n=1}^{N} f_0(T^n, u^n)\, \theta^n \equiv \sum_{n=1}^{N} c \left(\frac{T^e}{T^n + \chi u^n} - 1\right) u^n \theta^n. \qquad (3.17)$$

The specific work consumption, which should be minimized, is the negative of this sum. Working forms of the recurrence equations derived by dynamic programming most often use the profits $f_0^n \theta^n$ or the costs $(-f_0^n \theta^n)$ contained in these sums, as we shall see below.

3.1.4. Examples of solving algorithms for potential functions

With the above adjustments, we are prepared to discuss solving algorithms for potential functions satisfying appropriate HJB equations and related recurrence equations. We shall formulate these basic equations for the problem of simultaneous heat and mass transfer, which includes, of course, the problem of pure heat transfer as its special case. To deal with this special case we shall ignore the state variable $X^n$ and the control $v^n$. For the discrete power production function described by Eq. (3.15), the specific work delivery which should be maximized is the sum of the general structure

$$S^N \equiv \sum_{n=1}^{N} f_0^n(T^n, X^n, u^n, v^n)\, \theta^n. \qquad (3.18)$$

The state variables, temperature $T^n$, concentration $X^n$ and cumulative number of transfer units $\tau^n$ (the measure of the total area of heat exchange), obey the discrete equations of state, Eqs. (3.12), (3.13) and (3.16), the last being used only in the case with mass transfer. The optimal performance functions
(3.20)
where
(3.21)
In terms of the optimal cost function $R^n$ the structure typical of classical mechanics is found:

$$\partial R^{n-1}/\partial \tau^{n-1} + H^{n-1}(T^n, X^n, \partial R^{n-1}/\partial T^{n-1}, \partial R^{n-1}/\partial X^{n-1}) = 0. \qquad (3.22)$$
Eqs. (3.21) and (3.22) are nonlinear in the derivatives R
A
R
B
R
(3.23)
where the negative of the profit intensity, $l_0^n \equiv -f_0^n$, is the discrete Lagrangian of the problem. Eq. (3.23) shows that for our representation of controls, which are identical with process rates, the functions $H^{n-1}$ and $l_0^n$ are connected by the Legendre transformation. In the limiting case of an infinitesimal sequence of $\theta^n$, Eq. (3.21) goes over into the Hamilton–Jacobi equation of a corresponding continuous process $-\partial$
(3.24)
which is consistent with Eq. (2.49) with the upper sign. Yet, explicit analytical formulae and solutions for all these equations are possible only for processes with pure heat transfer. For example, in the NCA cascades with the profit intensities given by Eqs. (3.10) and (3.15), the purely thermal case, Eq. (3.10), involves the HJB equation (3.20) in the form $\max \{-c\,(1 - T^e/(T^n + u^n))\,u^n - \partial$
(3.26)
(The negative of a Hamiltonian of this sort can be minimized as in Eq. (3.32) below.) For an unconstrained control $u^n$, this Hamiltonian yields the stationarity condition

$$p^{n-1} \equiv -\partial f_0^n/\partial u^n \equiv \partial l_0^n/\partial u^n = \partial R^{n-1}/\partial T^{n-1}. \qquad (3.27)$$

This is just the definition of the classical mechanical momentum, which reads in the thermal case

$$c \left(1 - T^e T^n/(T^n + u^n)^2\right) = \partial R^{n-1}/\partial T^{n-1}. \qquad (3.28)$$
The second derivative of H with respect to u is negative as long as $T' = T + u > 0$. This proves that the control satisfying Eq. (3.28) maximizes $H^{n-1}$, as required by the maximum principle. Solving Eq. (3.28) with respect to $u^n$ yields

$$u^n = \left(T^e T^n/[1 - \partial R^{n-1}/(c\,\partial T^{n-1})]\right)^{1/2} - T^n. \qquad (3.29)$$
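Whence the extremum of the Hamiltonian (3.26) in terms of $T^n$ and $p^{n-1} \equiv \partial R^{n-1}/\partial T^{n-1}$ is

$$H^{n-1} = c \left(\sqrt{T^n \left(1 - \partial R^{n-1}/(c\,\partial T^{n-1})\right)} - \sqrt{T^e}\right)^2. \qquad (3.30)$$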
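For free $\theta^n$, this Hamiltonian satisfies the optimality condition $H^{n-1} = -\partial R^{n-1}/\partial \tau^{n-1}$. The special optimal case with $H^{n-1} = 0$ corresponds to a quasistatic process. Consequently, the discrete Hamilton–Jacobi equation for the multistage thermal problem is

$$\partial R^{n-1}/\partial \tau^{n-1} + c \left(\sqrt{T^n \left(1 - \partial R^{n-1}/(c\,\partial T^{n-1})\right)} - \sqrt{T^e}\right)^2 = 0. \qquad (3.31)$$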
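The extremum control and the extremum Hamiltonian, Eqs. (3.29) and (3.30), can be cross-checked by maximizing the Hamiltonian numerically. The sketch below assumes the thermal Hamiltonian in the form $H = pu - l_0$ with $l_0 = c\,(1 - T^e/(T + u))\,u$ and $p = \partial R^{n-1}/\partial T^{n-1}$, which is the form implied by Eqs. (3.27), (3.28) and (3.32); all function names and numerical values are illustrative assumptions.

```python
import math


def hamiltonian(u, T, p, Te, c=1.0):
    """Assumed thermal Hamiltonian H = p*u - c*(1 - Te/(T + u))*u."""
    return p * u - c * (1.0 - Te / (T + u)) * u


def u_extremal(T, p, Te, c=1.0):
    """Stationary control obtained from Eq. (3.28), i.e. Eq. (3.29)."""
    return math.sqrt(Te * T / (1.0 - p / c)) - T


def h_extremal(T, p, Te, c=1.0):
    """Extremum Hamiltonian, Eq. (3.30)."""
    return c * (math.sqrt(T * (1.0 - p / c)) - math.sqrt(Te)) ** 2


T, p, Te = 400.0, 0.2, 300.0

# Grid search over controls with T + u > 0 (here -200 <= u <= 200, step 0.01).
grid = [-200.0 + 0.01 * i for i in range(40001)]
u_grid = max(grid, key=lambda u: hamiltonian(u, T, p, Te))

print(u_grid, u_extremal(T, p, Te))
print(hamiltonian(u_grid, T, p, Te), h_extremal(T, p, Te))
```

The grid maximizer and the closed-form values agree to within the grid resolution, which supports reading Eq. (3.29) with the factor $[1 - \partial R^{n-1}/(c\,\partial T^{n-1})]$ to the first power under the square root.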
To set out the hierarchy of the solving equations, Eq. (3.31) is compared with the 'more basic' HJB equation (3.25) of the problem written in terms of $R^{n-1}$:

$$\min_{u^n} \left\{c \left(1 - T^e/(T^n + u^n)\right) u^n - \partial R^{n-1}/\partial \tau^{n-1} - (\partial R^{n-1}/\partial T^{n-1})\, u^n\right\} = 0. \qquad (3.32)$$

While Eq. (3.32) implies Eq. (3.31), the reverse is not true. However, it is most efficient to deal with the 'still more basic' Bellman functional equation, whose 'forward' form is

$$R^n(T^n, \tau^n) = \min_{u^n,\, \theta^n} \left[c \left(1 - T^e/(T^n + u^n)\right) u^n \theta^n + R^{n-1}(T^n - u^n \theta^n, \tau^n - \theta^n)\right]. \qquad (3.33)$$

Of special interest are two processes: the one which starts at $T^0 = T^e$ and terminates at an arbitrary $T^n$, and the one which starts at $T^n$ and terminates at $T^e$. In these processes the functions $R^n$ are generalizations of the classical thermal exergy for discrete processes with finite durations.

3.1.5. DP solutions in original state space

According to the philosophy of our approach, we shall solve the underlying DP formula, Eq. (3.33), rather than either of Eqs. (3.31) or (3.32). The numerical procedure described in Section 2.3, Eq. (2.102), is necessary to solve the problem with simultaneous heat and mass transfer, whose power production function $f_0$ is given by Eq. (3.15). For the pure heat transfer problem governed by Eqs. (3.10) and (3.33), an analytical solution exists, as shown below. Fig. 1 illustrates Bellman's principle of optimality for the multistage NCA process optimized by the forward algorithm of dynamic programming. For N = 1 (a single-stage process) there is no choice for $u^1$ other than $u^1 = (T^1 - T^0)/\tau^1$ and $\theta^1 = \tau^1$. (The initial time $\tau^0 \equiv 0$ by assumption.) Thus the optimal work consumed in the one-stage
process is

$$R^1(T^1, \tau^1) = \min_{u^1}\, c \left(1 - T^e/(u^1 + T^1)\right) u^1 \tau^1 = c \left(1 - T^e/[(T^1 - T^0)/\tau^1 + T^1]\right) (T^1 - T^0). \qquad (3.34)$$
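For N = 2 (two-stage cascade) nontrivial minimization begins. In accordance with Eq. (3.33) one has to evaluate $R^2(T^2, \tau^2)$ by searching for optimal controls $u^2$ and $\theta^2$ such that

$$R^2(T^2, \tau^2) = \min_{u^2,\, \theta^2} \left\{c \left(1 - \frac{T^e}{u^2 + T^2}\right) u^2 \theta^2 + c \left(1 - \frac{T^e}{(T^2 - \theta^2 u^2 - T^0)/(\tau^2 - \theta^2) + T^2 - \theta^2 u^2}\right) (T^2 - \theta^2 u^2 - T^0)\right\}. \qquad (3.35)$$

The second term of this equation contains the one-stage cost function (3.34) expressed in terms of the state and controls at the second stage. The stationary optimal control $u^2$ which minimizes the cost expression contained in the braces, $e^2 = \{\ \}$, must satisfy the equation

$$\frac{\partial e^2}{\partial u^2} = c \left(1 - \frac{T^e T^2}{(u^2 + T^2)^2}\right) \theta^2 - c\,\theta^2 \left(1 - \frac{T^e}{(T^2 - \theta^2 u^2 - T^0)/(\tau^2 - \theta^2) + T^2 - \theta^2 u^2}\right) + c\,T^e\, \frac{\left[-\theta^2/(\tau^2 - \theta^2) - \theta^2\right] (T^2 - \theta^2 u^2 - T^0)}{\left[(T^2 - \theta^2 u^2 - T^0)/(\tau^2 - \theta^2) + T^2 - \theta^2 u^2\right]^2} = 0. \qquad (3.36)$$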
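The extremum condition with respect to the control variable $\theta^2$ yields the equation

$$\frac{\partial e^2}{\partial \theta^2} = c \left(1 - \frac{T^e}{u^2 + T^2}\right) u^2 - c\,u^2 \left(1 - \frac{T^e}{(T^2 - \theta^2 u^2 - T^0)/(\tau^2 - \theta^2) + T^2 - \theta^2 u^2}\right) + c\,T^e\, \frac{\left[(T^2 - \tau^2 u^2 - T^0)(\tau^2 - \theta^2)^{-2} - u^2\right] (T^2 - \theta^2 u^2 - T^0)}{\left[(T^2 - \theta^2 u^2 - T^0)/(\tau^2 - \theta^2) + T^2 - \theta^2 u^2\right]^2} = 0. \qquad (3.37)$$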
The solution of Eqs. (3.36) and (3.37) defines the optimal controls $u^2$ and $\theta^2$ at stage 2 in terms of the extended state $T^2$ and $\tau^2$. From this information all relevant quantities at stages 1 and 2 and the properties of the two-stage system are determined, including the optimal work function $R^2$. The analytical solution for n = 2 has the form

$$T^1 = \sqrt{T^2 T^0}, \qquad (3.38)$$

$$\theta^2 = \theta^1 = \tau^2/2, \qquad (3.39)$$

$$u^2/T^2 = u^1/T^1 = 2 (\tau^2)^{-1} \left[1 - (T^0/T^2)^{1/2}\right], \qquad (3.40)$$

$$R^2 = c\,(T^2 - T^0) - 2cT^e \left[1 - (T^0/T^2)^{1/2}\right] + cT^e\, \frac{\left\{2\left[1 - (T^0/T^2)^{1/2}\right]\right\}^2}{\tau^2 + 2\left[1 - (T^0/T^2)^{1/2}\right]} \qquad (3.41)$$
[36]. These results suggest that in optimal multistage cascades the absolute interstage temperatures satisfy the rule of the geometric mean, $T^n=(T^{n-1}T^{n+1})^{1/2}$, and that the optimal time intervals are equal along an optimal path, $\theta^1=\theta^2=\dots=\theta^n$. In fact, these properties hold for an arbitrary $n$. The analysis of the three-stage process uses the transformed function $R^2$ to evaluate

$$R^3(T^3,\tau^3)=\min_{u^3,\theta^3}\left\{c\left(1-\frac{T^e}{u^3+T^3}\right)u^3\theta^3+c(T^3-u^3\theta^3-T^0)-2cT^e\left[1-\left(\frac{T^0}{T^3-u^3\theta^3}\right)^{1/2}\right]+cT^e\,\frac{\left\{2\left[1-\left(T^0/(T^3-u^3\theta^3)\right)^{1/2}\right]\right\}^2}{\tau^3-\theta^3+2\left[1-\left(T^0/(T^3-u^3\theta^3)\right)^{1/2}\right]}\right\}. \tag{3.42}$$
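This leads to the minimum work function of the three-stage cascade in the form

$$R^3(T^3,\tau^3)=c(T^3-T^0)-3cT^e\left[1-\left(\frac{T^0}{T^3}\right)^{1/3}\right]+cT^e\,\frac{\left\{3\left[1-(T^0/T^3)^{1/3}\right]\right\}^2}{\tau^3+3\left[1-(T^0/T^3)^{1/3}\right]} . \tag{3.43}$$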
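The step from the two-stage function to the three-stage function can be checked numerically. The sketch below is only an illustration: it brute-forces the minimization of Eq. (3.42) on a grid, using the interstage temperature $T^2=T^3-u^3\theta^3$ instead of $u^3$ as the search variable, and compares the result with the closed form written with $N=3$ throughout. The parameter values ($c=1$, $T^e=T^0=300$, $T^3=450$, $\tau^3=3$) and all function names are assumptions chosen for the example, not values from the text.

```python
import math

def closed_form_work(n_stages, t0, t_end, tau, c=1.0, t_env=300.0):
    """Closed-form minimum work of an n-stage cascade (pattern of Eqs. (3.41), (3.43))."""
    a = n_stages * (1.0 - (t0 / t_end) ** (1.0 / n_stages))
    return c * (t_end - t0) - c * t_env * a + c * t_env * a * a / (tau + a)

def stage_cost(t_in, t_out, theta, c=1.0, t_env=300.0):
    """Work consumed at one stage: c (1 - T^e/(u + T^n)) u theta, with u theta = T^n - T^(n-1)."""
    u = (t_out - t_in) / theta
    return c * (1.0 - t_env / (u + t_out)) * u * theta

def three_stage_by_search(t0, t3, tau3, grid=300):
    """Grid search over the interstage temperature T^2 and the interval theta^3 in Eq. (3.42)."""
    best = math.inf
    for i in range(1, grid):
        t2 = t0 + (t3 - t0) * i / grid
        for j in range(1, grid):
            theta3 = tau3 * j / grid
            total = stage_cost(t2, t3, theta3) + closed_form_work(2, t0, t2, tau3 - theta3)
            best = min(best, total)
    return best

T0, T3, TAU3 = 300.0, 450.0, 3.0          # hypothetical data
searched = three_stage_by_search(T0, T3, TAU3)
closed = closed_form_work(3, T0, T3, TAU3)
print(f"grid search: {searched:.4f}   closed form: {closed:.4f}")
```

The grid value can only approach the closed form from above, so a small positive gap is expected; refining the grid narrows it.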
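The generalization of the above result for an arbitrary $N$ is

$$R^N(T^N,\tau^N)=c(T^N-T^0)-cT^eN\left[1-\left(\frac{T^0}{T^N}\right)^{1/N}\right]+cT^e\,\frac{\left\{N\left[1-(T^0/T^N)^{1/N}\right]\right\}^2}{\tau^N+N\left[1-(T^0/T^N)^{1/N}\right]} . \tag{3.44}$$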
This function is consistent with optimal trajectories satisfying the rule $T^n=(T^{n-1}T^{n+1})^{1/2}$ for arbitrary stages $n-1$, $n$ and $n+1$. With this rule we find the temperatures along an optimal path in terms of its boundary temperatures:

$$T^1=(T^N)^{1/N}(T^0)^{(N-1)/N},\quad T^2=(T^N)^{2/N}(T^0)^{(N-2)/N},\ \dots,\ T^{N-1}=(T^N)^{(N-1)/N}(T^0)^{1/N} . \tag{3.45}$$
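As a consistency check, the stage-by-stage work can be summed along the geometric temperature sequence with equal intervals and compared with the closed form (3.44). The following sketch does this and also evaluates the large-$N$ behaviour; the numerical data and function names are illustrative assumptions.

```python
import math

def interstage_temperatures(t0, t_end, n_stages):
    """T^n = (T^N)^(n/N) (T^0)^((N-n)/N) for n = 0..N (geometric sequence)."""
    return [t_end ** (n / n_stages) * t0 ** ((n_stages - n) / n_stages)
            for n in range(n_stages + 1)]

def cascade_work(temps, thetas, c=1.0, t_env=300.0):
    """Sum of stage works c (1 - T^e/(u^n + T^n)) u^n theta^n along a given path."""
    total = 0.0
    for n in range(1, len(temps)):
        theta = thetas[n - 1]
        u = (temps[n] - temps[n - 1]) / theta
        total += c * (1.0 - t_env / (u + temps[n])) * u * theta
    return total

def closed_form_work(n_stages, t0, t_end, tau, c=1.0, t_env=300.0):
    """Eq. (3.44); expm1 keeps N [1 - (T^0/T^N)^(1/N)] accurate for large N."""
    a = -n_stages * math.expm1(math.log(t0 / t_end) / n_stages)
    return c * (t_end - t0) - c * t_env * a + c * t_env * a * a / (tau + a)

def continuous_limit_work(t0, t_end, tau, c=1.0, t_env=300.0):
    """Limit of the closed form for N -> infinity, where N [1 - r^(1/N)] -> ln(T/T^0)."""
    log_ratio = math.log(t_end / t0)
    return (c * (t_end - t0) - c * t_env * log_ratio
            + c * t_env * log_ratio ** 2 / (tau + log_ratio))

T0, TN, TAU, N = 300.0, 450.0, 2.0, 4      # hypothetical data
temps = interstage_temperatures(T0, TN, N)
summed = cascade_work(temps, [TAU / N] * N)
closed = closed_form_work(N, T0, TN, TAU)
limit = continuous_limit_work(T0, TN, TAU)
print(summed, closed, limit)
```

Under these assumptions the work decreases monotonically with the number of stages and approaches the continuous value from above.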
For an infinite number of stages the limiting form of this sequence is an exponential line $T=T^0\exp(\xi t)$, known from the optimization of heat exchangers, simulated annealing and infinitesimal NCA sequences [40–43]. The limiting form of Eq. (3.44) for an infinite $N$,

$$R^\infty=c(T-T^0)-cT^e\ln\frac{T}{T^0}+cT^e\,\frac{[\ln(T/T^0)]^2}{\tau+\ln(T/T^0)} , \tag{3.46}$$
agrees with recent results for continuous systems [31]. It is impossible to find an analytic solution for the process with simultaneous heat and mass transfer. In this case a computer generates tables of optimal controls and optimal costs through the direct extremizing procedure contained in the recurrence equation

$$R^n(T^n,X^n,\tau^n)=\min_{u^n,v^n,\theta^n}\left[-f_0^n(T^n,X^n,u^n,v^n)\,\theta^n+R^{n-1}(T^n-u^n\theta^n,\ X^n-v^n\theta^n,\ \tau^n-\theta^n)\right], \tag{3.47}$$

where the power production function $f_0$ is given by Eq. (3.15). The numerical procedure is described in Section 2.3 for Eq. (2.107). In some special cases one can iteratively solve analytical stationarity
conditions, such as Eqs. (3.36) and (3.37), by the Newton–Raphson method. In these special cases analytical properties of the problem are exploited. In each case the procedure generates optimal work functions for cascade processes: the sequence $R^1(T^1,X^1,\tau^1)$, $R^2(T^2,X^2,\tau^2)$, $\dots$, $R^N(T^N,X^N,\tau^N)$. A subsidiary discussion of general numerical aspects is available [37].

3.1.6. DP solutions in reduced state space for a constant Hamiltonian

The virtue of a constant Hamiltonian is seen particularly well when the constancy condition $H=h$ is applied to reduce the state dimensionality of the problem. Since for free time intervals the constant $h$ equals the negative of the time adjoint $p_\tau$, the former can replace the latter in the role of the Lagrangian multiplier associated with a constraint imposed on the total time $\tau^N$. The multiplier $h$ is introduced to avoid the time $\tau$ as a state variable. Thus, instead of Eq. (3.18), one has to maximize a modified criterion of the 'net profit' type

$$S_*^N\equiv\sum_{n=1}^{N}\left(f_0^n(T^n,X^n,u^n,v^n)-h\right)\theta^n . \tag{3.48}$$

Alternatively, a costlike criterion $(-S)_*^N$, defined as the sum $\sum(l_0^n+h)\theta^n$, is minimized, where $l_0^n=-f_0^n$ is the Lagrangian. In that case, a computer generates tables of optimal controls and optimal costs by solving the recurrence equation

$$R_*^n(T^n,X^n)=\min_{u^n,v^n,\theta^n}\left[\left(-f_0^n(T^n,X^n,u^n,v^n)+h\right)\theta^n+R_*^{n-1}(T^n-u^n\theta^n,\ X^n-v^n\theta^n)\right]. \tag{3.49}$$

No equation of this sort contains the time $\tau^n$. Some of the end coordinates $(T^0,X^0)$ and $(T^N,X^N)$ may be fixed, but the total duration, $\tau^N$, must be free. In an optimal process this duration follows for an assumed $h$ as a function of the fixed end values and the total number of stages, $N$. The accuracy of the results is much better when the state variable $\tau^n$ is excluded, i.e. when the problem is described by only two state variables, $T^n$ and $X^n$.
The $X$-free truncation of this equation serves to generate numerical generalizations of the functions $R$ or $V$ when both the transfer coefficients and the heat capacity vary along the process path and an analytical solution cannot be obtained. Let us illustrate analytical aspects of the reduced problem in the case of pure heat transfer, where the following modified criterion is to be minimized:

$$(-S)_*^N\equiv\sum_{n=1}^{N}\left(c\left(1-\frac{T^e}{u^n+T^n}\right)u^n+h\right)\theta^n \tag{3.50}$$

with fixed boundary temperatures $T^0$ and $T^N$, but for a free total duration, $\tau^N$. In an optimal process this duration follows for an assumed $h$ as a function of $T^0$, $T^N$ and $N$. Optimality of the free time intervals is assured a priori in the equations considered. Accordingly, the dynamic programming equation for the optimal function $R_*^n\equiv\min(-S)_*^n$ has the form

$$R_*^n(T^n,h)=\min_{u^n,\theta^n}\left\{\left(c\left(1-\frac{T^e}{u^n+T^n}\right)u^n+h\right)\theta^n+R_*^{n-1}(T^n-u^n\theta^n,\ h)\right\}. \tag{3.51}$$

A related equation, applied when the intervals $\theta^n$ are defined by the constraint $H^{n-1}=h$,

$$\max_{u^n}\left\{\frac{R_*^n(T^n,h)-R_*^{n-1}(T^n-u^n\theta^n,h)}{\theta^n}-\left[c\left(1-\frac{T^e}{u^n+T^n}\right)u^n+h\right]\right\}=0 , \tag{3.52}$$
may be considered as a recurrence relationship in which extremizing is solely with respect to the control $u^n$. [This is admissible only for independent controls $u^n$ and $\theta^n$; thus a one-stage application should be excluded.] Its applicability is associated with an optimal $\theta^n$ contained in Eq. (3.51). This sets the stage-independent structure of the Hamiltonian

$$\left[\frac{\partial R_*^{n-1}(T^{n-1},h)}{\partial T^{n-1}}\right]u^n-c\left(1-\frac{T^e}{u^n+T^n}\right)u^n=h \tag{3.53}$$
and states that its numerical value equals $h$. With this general prescription, the recurrence equation (3.52) effectively requires only one decision at the stage, $u^n$, because the optimality of the time intervals $\theta^n$ is assured by the constancy of the Hamiltonian function, $H^n=h$ [29]. On the other hand, since for every $n$

$$\frac{\partial R_*^n(T^n,h)}{\partial T^n}=\frac{\partial R^n(T^n,\tau^n)}{\partial T^n} , \tag{3.54}$$

the global maximum of the expression in braces in Eq. (3.52) with respect to $u^n$ satisfies Eq. (3.27) in the form of Eq. (3.28) or (3.29) as the necessary condition for the maximum of the Hamiltonian (3.53), i.e.,

$$\frac{\partial l_0^n}{\partial u^n}\equiv c\left(1-\frac{T^eT^n}{(T^n+u^n)^2}\right)=\frac{\partial R_*^{n-1}}{\partial T^{n-1}} . \tag{3.55}$$
Thus each of the recurrence equations, Eq. (3.51) or Eq. (3.52), provides the solution of the optimization problem subject to the defined Hamiltonian, Eq. (3.53). For $N=1$ and subject to the conditions $R_*^0=0$ and $\theta^1=(T^1-T^0)/u^1$ we obtain the rate representation of Eq. (3.51):

$$R_*^1(T^1,h)=\min_{u^1}\left\{\left[c\left(1-\frac{T^e}{u^1+T^1}\right)u^1+h\right]\frac{T^1-T^0}{u^1}\right\}. \tag{3.56}$$

Differential calculus yields the rate representation of the Hamiltonian

$$h=cT^e\,\frac{(u^1)^2}{(u^1+T^1)^2} . \tag{3.57}$$
Similarly, by taking the $\theta$-representation of the same recurrence equation one would obtain the $\theta$-representation of the same Hamiltonian, i.e. Eq. (3.57) with $u^1=(T^1-T^0)/\theta^1$. Substituting into Eq. (3.56) the solution of Eq. (3.57) for the optimal control $u^1$,

$$u^1=\left[\frac{\pm(h/cT^e)^{1/2}}{1\mp(h/cT^e)^{1/2}}\right]T^1\equiv\xi T^1 , \tag{3.58}$$

yields the optimal work function $R_*^1$ for the process between the temperatures $T^0$ and $T^1$:

$$R_*^1(T^1,h)=c(T^1-T^0)\left(1-\frac{T^e}{T^1}\left[1\mp\sqrt{h/cT^e}\right]^2\right). \tag{3.59}$$
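Equations (3.57)–(3.59) can be verified by minimizing the one-stage expression of Eq. (3.56) directly. The sketch below uses a ternary search over the rate $u^1$ for the heating mode (upper sign); the helper names and the data ($c=1$, $T^e=T^0=300$, $T^1=400$, $h=3$) are assumptions made for illustration, and the search relies on $h<cT^e$, for which the expression has a single minimum at positive $u^1$.

```python
import math

def one_stage_cost(u, t0, t1, h, c=1.0, t_env=300.0):
    """Expression in braces of Eq. (3.56): [c (1 - T^e/(u + T^1)) u + h] (T^1 - T^0)/u."""
    return (c * (1.0 - t_env / (u + t1)) * u + h) * (t1 - t0) / u

def minimize_unimodal(func, lo, hi, iterations=200):
    """Ternary search for the minimizer of a unimodal function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

C, T_ENV, T0, T1, H = 1.0, 300.0, 300.0, 400.0, 3.0   # hypothetical data

u_numeric = minimize_unimodal(lambda u: one_stage_cost(u, T0, T1, H), 1e-3, 1e4)
r_numeric = one_stage_cost(u_numeric, T0, T1, H)

s = math.sqrt(H / (C * T_ENV))                          # sqrt(h / c T^e)
u_closed = s / (1.0 - s) * T1                           # Eq. (3.58), upper sign
r_closed = C * (T1 - T0) * (1.0 - (T_ENV / T1) * (1.0 - s) ** 2)   # Eq. (3.59)
print(u_numeric, u_closed, r_numeric, r_closed)
```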
When $N=2$ the dynamic approach involves minimizing the sum

$$e_*^2=\left[c\left(1-\frac{T^e}{u^2+T^2}\right)u^2+h\right]\frac{T^2-T^1}{u^2}+c(T^1-T^0)\left[1-\frac{T^e}{T^1}\left(1\mp\sqrt{h/cT^e}\right)^2\right] \tag{3.60}$$
by a suitable choice of two decisions, the rate $u^2$ and the interstage temperature $T^1$. It follows that minimizing (3.60) with respect to $u^2$ is the same operation as that leading to the optimal $u^1$ in Eq. (3.56). This allows one to apply at stage 2 the optimal cost expression, Eq. (3.59), with indices appropriate for $N=2$. Thus one finds a two-stage work function with the interstage temperature as the only control:

$$w_*^2=c(T^2-T^1)\left[1-\frac{T^e}{T^2}\left(1\mp\sqrt{h/cT^e}\right)^2\right]+c(T^1-T^0)\left[1-\frac{T^e}{T^1}\left(1\mp\sqrt{h/cT^e}\right)^2\right]. \tag{3.61}$$
By induction, the work function of the $N$-stage process is

$$w_*^N=\sum_{n=1}^{N}c(T^n-T^{n-1})\left[1-\frac{T^e}{T^n}\left(1\mp\sqrt{\frac{h}{cT^e}}\right)^2\right]. \tag{3.62}$$
This special criterion should be minimized by a suitable sequence of interstage temperatures, $T^1,\dots,T^k,\dots,T^{N-1}$. This procedure is much easier than that based on the original recurrence equation with the time variable, described above. In the previous procedure one had to search for sequences of two decisions, $u^n$ and $\theta^n$, for the two state variables, $T^n$ and $\tau^n$. It is the knowledge of the structure of $H^{n-1}$ and its constancy which directs us to a problem with only one state variable, $T^n$, and one decision, $T^{n-1}$, at the $n$th stage. Minimizing the sum (3.61) with respect to $T^1$ yields the rule of the geometric mean for the interstage temperature, $T^1=(T^2T^0)^{1/2}$, as an $h$-independent optimality condition. This leads to the optimal work function of the two-stage subprocess

$$R_*^2=c(T^2-T^0)-2cT^e\left(1-\sqrt{T^0/T^2}\right)\left(1\mp\sqrt{h/cT^e}\right)^2 . \tag{3.63}$$
3.1.7. Discrete energy potentials for finite process durations

Continuing the procedure, we conclude that analytical solutions for the potential functions in the space of thermodynamic variables and the Hamiltonian $h$ (asterisk potentials in the space of $T$, $h$ and $N$) are obtained in the form

$$R_*^N=c(T^N-T^0)-cT^eN\left[1-(T^0/T^N)^{1/N}\right]+cT^eN\left[1-(T^0/T^N)^{1/N}\right]\left[1-\left(1\mp\sqrt{h/cT^e}\right)^2\right]. \tag{3.64}$$

In the continuous limit

$$R_*^\infty=c(T-T^0)-cT^e\ln(T/T^0)+cT^e\ln(T/T^0)\left[1-\left(1\mp\sqrt{h/cT^e}\right)^2\right]. \tag{3.65}$$

The classical part of the work, known from thermodynamics as the exergy, follows from Eq. (3.65) when $T^0=T^e$ and $h=0$. The nonclassical, $h$-dependent terms of these equations represent the rate penalty. Both equations show that the generalized exergy becomes larger when the process of energy consumption becomes more intense.
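The optimal duration follows in terms of $h$ as the partial derivative $\partial R_*^N/\partial h$, i.e.,

$$\tau^N=\frac{\partial R_*^N}{\partial h}=N\left[1-(T^0/T^N)^{1/N}\right]\left[\pm(h/cT^e)^{-1/2}-1\right]. \tag{3.66}$$

In the continuous limit of an infinite number of stages

$$\tau^\infty=\ln(T/T^0)\left[\pm(h/cT^e)^{-1/2}-1\right]. \tag{3.67}$$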
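The relation between the asterisk potential, the duration and the fixed-duration work function can be checked numerically. The sketch below, written for the heating mode (upper sign) with assumed data, differentiates $R_*^N$ of Eq. (3.64) by central differences, compares the result with Eq. (3.66), and confirms that $R_*^N-h\,\partial R_*^N/\partial h$ reproduces the work function (3.44) evaluated at that duration. Function names and parameter values are hypothetical.

```python
import math

C, T_ENV = 1.0, 300.0          # hypothetical heat capacity and environment temperature

def stage_factor(t0, t_end, n_stages):
    """N [1 - (T^0/T^N)^(1/N)], the factor common to Eqs. (3.44), (3.64) and (3.66)."""
    return n_stages * (1.0 - (t0 / t_end) ** (1.0 / n_stages))

def asterisk_potential(h, t0, t_end, n_stages):
    """R_*^N of Eq. (3.64), upper sign."""
    a = stage_factor(t0, t_end, n_stages)
    s = math.sqrt(h / (C * T_ENV))
    return (C * (t_end - t0) - C * T_ENV * a
            + C * T_ENV * a * (1.0 - (1.0 - s) ** 2))

def duration(h, t0, t_end, n_stages):
    """Optimal duration of Eq. (3.66), upper sign."""
    a = stage_factor(t0, t_end, n_stages)
    return a * ((h / (C * T_ENV)) ** -0.5 - 1.0)

def fixed_duration_work(tau, t0, t_end, n_stages):
    """R^N of Eq. (3.44)."""
    a = stage_factor(t0, t_end, n_stages)
    return C * (t_end - t0) - C * T_ENV * a + C * T_ENV * a * a / (tau + a)

T0, TN, N, H, DH = 300.0, 450.0, 3, 3.0, 1e-5
tau_closed = duration(H, T0, TN, N)
tau_numeric = (asterisk_potential(H + DH, T0, TN, N)
               - asterisk_potential(H - DH, T0, TN, N)) / (2.0 * DH)
legendre = asterisk_potential(H, T0, TN, N) - H * tau_closed
direct = fixed_duration_work(tau_closed, T0, TN, N)
print(tau_closed, tau_numeric, legendre, direct)
```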
Note the analogy with classical mechanics, Section 1, where the process duration is obtained by differentiation of the 'abbreviated action' $R_*=R+h\tau$ with respect to the Hamiltonian in the variational principles of Maupertuis type [5]. The duration-related functions of optimal work, which act in the space of the variables $T$, $\tau$ and $N$, follow from the asterisk potentials as the Legendre transforms with respect to $h$:

$$R^N=R_*^N-h\,\frac{\partial R_*^N}{\partial h} . \tag{3.68}$$

This is also true for $R_*^N$ satisfying Eq. (3.64), whose negative Legendre transform or the function

(3.69)

The same quantity but in terms of the total time is

$$\dots\ \frac{\left\{N\left[1-(T^0/T^N)^{1/N}\right]\right\}^2}{\tau^N+N\left[1-(T^0/T^N)^{1/N}\right]} \tag{3.70}$$

which agrees with Eq. (3.44). In the continuous limit one obtains

$$V=-R=c(T^i-T^f)-cT^e\ln\frac{T^i}{T^f}-cT^e\,\frac{[\ln(T^i/T^f)]^2}{\tau^f-\ln(T^i/T^f)} \tag{3.71}$$
which agrees with Eq. (3.46). The $\tau$ terms in the last two equations are nonclassical. After using the exergy boundary conditions, a finite-time exergy of the process with pure heat transfer follows either in terms of the constant $h$ or in terms of the total duration $\tau^N$:

$$A^N=c(T^N-T^e)-cT^eN\left[1-\left(\frac{T^e}{T^N}\right)^{1/N}\right]\pm cT^eN\left[1-\left(\frac{T^e}{T^N}\right)^{1/N}\right]\sqrt{\frac{h}{cT^e}}$$

$$\phantom{A^N}=c(T^N-T^e)-cT^eN\left[1-\left(\frac{T^e}{T^N}\right)^{1/N}\right]\pm cT^e\,\frac{\left\{N\left[1-(T^e/T^N)^{1/N}\right]\right\}^2}{\tau^N\pm N\left[1-(T^e/T^N)^{1/N}\right]} . \tag{3.72}$$

The lower sign refers to the engine mode and the upper sign to the heat pump mode.
When the total number of stages, $N$, approaches infinity, the work functional is the limit of the sum described by Eq. (3.17). This limit may be written as the Lagrange functional

$$\int_{t^i}^{t^f}c\left(\frac{T^e}{T+\chi\dot T}-1\right)\dot T\,dt , \tag{3.73}$$

for which Eq. (3.71) represents an optimal actionlike function. Eq. (3.73) contains the time in usual units. However, the potential function (3.71) is expressed in terms of the nondimensional time $\tau$, which is the number of heat transfer units. In the limit of an infinitely slow process, when $dT/dt=0$ and $dT\neq0$, corresponding to a free infinite duration, the integral (3.73) simplifies to the classical exergy. When the initial (final) state is that of equilibrium with the environment, the minimum (maximum) work function represents a generalization of the thermal exergy for a finite-size system which consumes (produces) work. For the continuous case, in the limit of an infinite number of stages ($N\to\infty$), Eq. (3.72) yields

$$A=c(T-T^e)-cT^e\ln\frac{T}{T^e}\pm cT^e\ln\frac{T}{T^e}\sqrt{\frac{h}{cT^e}}=c(T-T^e)-cT^e\ln\frac{T}{T^e}\pm cT^e\,\frac{[\ln(T/T^e)]^2}{\tau^f\pm\ln(T/T^e)} , \tag{3.74}$$
where the upper sign refers to work consumption and the lower sign to work production [31,40]. This result adds the two-mode distinction to a number of earlier findings restricted to one-mode theory [38,39]. The distinction stresses the general fact that the finite-time creation of a nonequilibrium system requires a greater magnitude of work than that which can be released during the system's destruction [40].

3.1.8. Application of discrete maximum principle

Instead of solving the DP recurrence equation (3.33), the discrete maximum principle can be applied to solve the optimization problem stated as that of minimum work consumed. In this section, we use the nondimensional time $\tau$ (number of heat transfer units) as the state coordinate. Correspondingly, the symbol $\theta^n$ stands for intervals of this nondimensional time. For discrete NCA cascades with the power production function described by Eq. (3.10), the specific work consumption which should be minimized is

$$(-S^N)=\sum_{n=1}^{N}c\left(1-\frac{T^e}{u^n+T^n}\right)u^n\theta^n , \tag{3.75}$$

where the temperature $T$ and the cumulative number of transfer units $\tau$ (a measure of the total area of the heat exchange) obey the discrete equations of state

$$(T^n-T^{n-1})/\theta^n=u^n , \tag{3.76}$$

$$(\tau^n-\tau^{n-1})/\theta^n=1 . \tag{3.77}$$

The solution of this discrete variational problem is associated with the optimal choice of the transfer areas at the stages (contained in $\tau^n$), for a prescribed total transfer area.
According to our general strategy, we first pursue the discrete description as the most general one, and next find the continuous process characteristics in the limit of an infinite $N$. The constancy of the discrete Hamiltonian, which is the consequence of the optimality of $\theta^n$, is expressed by the equality

$$H^{n-1}\equiv p^{n-1}u^n-c\left(1-\frac{T^e}{u^n+T^n}\right)u^n=h , \tag{3.78}$$

where, by definition, $p^n=\partial R^n/\partial T^n$, and $h=0$ corresponds to the quasistatic process. We search for the maximum of the above Hamiltonian with respect to the controls $u^n$. For a stationary optimal control $u^n$, the Hamiltonian function $H$ satisfies the stationarity condition

$$\frac{\partial H^{n-1}}{\partial u^n}=p^{n-1}-c\left(1-\frac{T^eT^n}{(u^n+T^n)^2}\right)=0 . \tag{3.79}$$

After eliminating $p^{n-1}$ from these two equations, an integral of the discrete motion follows:

$$\frac{(u^n)^2}{(u^n+T^n)^2}=\frac{h}{cT^e} . \tag{3.80}$$
Accordingly, one obtains two modes of control, corresponding to increasing and decreasing temperatures $T^n$ with time $\tau^n$:

$$\frac{u^n}{u^n+T^n}=\pm\sqrt{h/cT^e} , \tag{3.81}$$

where the positive sign refers to the fluid's heating and the negative one to the fluid's cooling. Defining an intensity constant $\xi$ in terms of the constant $h$ as

$$\xi=\pm\sqrt{h/cT^e}\left(1\mp\sqrt{h/cT^e}\right)^{-1} \tag{3.82}$$

we obtain $u^n=\xi T^n$ and, after using the state equation (3.76), we find

$$(T^n-T^{n-1})/\theta^n=\xi T^n . \tag{3.83}$$
This proves that the discrete rate in an extremal process changes proportionally to the temperature, a result analogous to the logarithmic formula obtained for the optimal intensity of temperature change in continuous systems [41–43]. The above difference equation should be solved simultaneously with the second canonical equation

$$\frac{p^n-p^{n-1}}{\theta^n}=-\frac{\partial H^{n-1}}{\partial T^n}=\frac{cT^eu^n}{(T^n+u^n)^2} , \tag{3.84}$$

which results from the extremum of the stage criterion with respect to $T^n$ and Eq. (3.79) above. This leads to the interstage temperatures $T^n$ between the stages $n$ and $n+1$ as geometric means of the boundary temperatures $T^{n-1}$ and $T^{n+1}$ of any two-stage subprocess. Use of this rule with the boundary conditions for $T^0$ and $T^N$ yields all interstage temperatures in terms of the boundary temperatures:

$$T^1=(T^N)^{1/N}(T^0)^{(N-1)/N},\quad T^2=(T^N)^{2/N}(T^0)^{(N-2)/N},\ \dots,\ T^{N-1}=(T^N)^{(N-1)/N}(T^0)^{1/N} . \tag{3.85}$$
After using the constant Hamiltonian condition along with the above result we find that

$$\theta^n=\theta^{n+1}=\dots=\theta=\tau^N/N . \tag{3.86}$$
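The maximum-principle solution can be traced numerically for the heating mode. Under the assumptions of equal intervals (3.86) and $u^n=\xi T^n$ (3.83), the sketch below builds the extremal temperature sequence backward from $T^N$, evaluates the adjoint from the stationarity condition (3.79), and confirms that the Hamiltonian (3.78) takes the same value $cT^e[\xi/(1+\xi)]^2$ at every stage. The data and names are illustrative assumptions.

```python
def extremal_path(t0, t_end, n_stages, tau_total, c=1.0, t_env=300.0):
    """Extremal discrete path with equal intervals and rate u^n = xi T^n."""
    theta = tau_total / n_stages                     # Eq. (3.86)
    ratio = (t0 / t_end) ** (1.0 / n_stages)         # T^(n-1)/T^n on the geometric path
    xi = (1.0 - ratio) / theta                       # from Eq. (3.83)
    temps = [t_end]
    for _ in range(n_stages):                        # backward: T^(n-1) = T^n (1 - xi theta)
        temps.append(temps[-1] * (1.0 - xi * theta))
    temps.reverse()
    hamiltonians = []
    for n in range(1, n_stages + 1):
        u = xi * temps[n]
        p_prev = c * (1.0 - t_env * temps[n] / (u + temps[n]) ** 2)                # Eq. (3.79)
        hamiltonians.append(p_prev * u - c * (1.0 - t_env / (u + temps[n])) * u)   # Eq. (3.78)
    return xi, temps, hamiltonians

C, T_ENV = 1.0, 300.0                                # hypothetical data
xi, temps, hamiltonians = extremal_path(300.0, 450.0, 5, 2.5, c=C, t_env=T_ENV)
h_from_xi = C * T_ENV * (xi / (1.0 + xi)) ** 2       # Eq. (3.80) with u^n = xi T^n
print(xi, temps, hamiltonians, h_from_xi)
```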
See the original publication for more details [31]. As in the case of the DP algorithm, the multiplier $h$ helps avoid the time variable as a state variable (dimensionality reduction). The modified optimality criterion, Eq. (3.50), can be applied for fixed boundary temperatures $T^0$ and $T^N$ but for an undetermined total duration, $\tau^N$. In an optimal process, this duration follows for the assumed $h$ as a function of $T^0$, $T^N$ and the total number of stages, $N$. Accordingly, the underlying DP equation (3.51) holds for the optimal performance function $R_*^n\equiv\min(-S_*^n)$, and also, as long as the $\theta^n$ are frozen, the associated equation (3.52) of the stage-criterion type is valid. In this equation, the extremizing operation is only with respect to the control $u^n$. It holds along with the optimality condition of $(-S)_*^n$ with respect to $\theta^n$, Eq. (3.53), which sets the structure of the Hamiltonian and states that the numerical value of $H^{n-1}$ equals $h$. With this condition, the recurrence Eq. (3.52) effectively works with only one decision at the stage, $u^n$, because the effect of an optimal decision $\theta^n$ is contained in the constancy of the Hamiltonian function, $H^n=h$ [29].

One can also consider the related problem of minimum entropy production, to compare it with the problem of extremum work. For the latter, the discrete Hamilton–Jacobi equation is given by Eq. (3.31). For the former, a discrete Hamilton–Jacobi equation is found in the form

$$\frac{\partial R_\sigma^{n-1}}{\partial\tau^{n-1}}+c\left(1-\sqrt{1-c^{-1}T^n\,\frac{\partial R_\sigma^{n-1}}{\partial T^{n-1}}}\right)^2=0 , \tag{3.87}$$

where $R_\sigma=\min s_\sigma$. This equation can be compared with the 'more basic' HJB equation of the problem, which is

$$\min_{u^n}\left\{\frac{c\,(u^n)^2}{T^n(T^n+u^n)}-\frac{\partial R_\sigma^{n-1}}{\partial\tau^{n-1}}-\frac{\partial R_\sigma^{n-1}}{\partial T^{n-1}}\,u^n\right\}=0 . \tag{3.88}$$

However, it is most efficient to deal with the 'still more basic' Bellman's functional equation, whose 'forward' form is

$$R_\sigma^n(T^n,\tau^n)=\min_{u^n,\theta^n}\left[\frac{c\,(u^n)^2}{T^n(T^n+u^n)}\,\theta^n+R_\sigma^{n-1}(T^n-u^n\theta^n,\ \tau^n-\theta^n)\right]. \tag{3.89}$$

Again, according to our hierarchy rules, we solve the underlying DP equation (3.89) rather than either of Eqs. (3.87) or (3.88).
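The passage from the HJB form (3.88) to the Hamilton–Jacobi form (3.87) amounts to maximizing $pu-c\,u^2/[T(T+u)]$ over the rate $u$, where $p$ stands for $\partial R_\sigma^{n-1}/\partial T^{n-1}$. The sketch below checks numerically that this maximum equals $c\,(1-\sqrt{1-Tp/c})^2$; it assumes $0<Tp/c<1$, and the numbers and function names are hypothetical.

```python
import math

def entropy_hamiltonian_closed(p, temp, c=1.0):
    """Extremal Hamiltonian appearing in Eq. (3.87): c (1 - sqrt(1 - T p / c))^2."""
    return c * (1.0 - math.sqrt(1.0 - temp * p / c)) ** 2

def entropy_hamiltonian_search(p, temp, c=1.0, u_max=1.0e4, iterations=200):
    """Maximize p u - c u^2 / (T (T + u)) over u in [0, u_max] by ternary search."""
    def objective(u):
        return p * u - c * u * u / (temp * (temp + u))
    lo, hi = 0.0, u_max
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) > objective(m2):
            hi = m2
        else:
            lo = m1
    u_best = 0.5 * (lo + hi)
    return u_best, objective(u_best)

P, TEMP = 9.0e-4, 400.0                       # hypothetical adjoint and temperature, T p / c = 0.36
u_opt, h_numeric = entropy_hamiltonian_search(P, TEMP)
h_closed = entropy_hamiltonian_closed(P, TEMP)
print(u_opt, h_numeric, h_closed)
```

The objective is concave in $u$ for $u>-T$, which is why a simple ternary search suffices in this sketch.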
We leave to the reader the development of the discrete Hamiltonian theory stemming from the stage criterion (CBS criterion) generalizing Eq. (3.89). Now a few remarks will be made about the extremum work problem for systems with heat and mass transfer. Since we know the analytical expression for the power generation function, Eq. (3.15), Bellman's recurrence equation (3.47) or (3.49) can be used to numerically generate optimal controls and optimal profit functions. This numerical procedure is necessary because no analytical solution is possible for $f_0$ of Eq. (3.15). Alternatively the CBS or stage criterion is used, in which the trajectory optimization associated with a canonical set is performed. In this case, following the dimensionality reduction for the time $\tau^n$, a computer procedure generates data of optimal controls and optimal costs through the direct extremizing procedure contained in the
CBS equation

$$\min_{u^n,v^n,\theta^n,T^n,X^n}\left\{\left[-f_0(T^n,X^n,u^n,v^n)+h\right]\theta^n+R_*^{n-1}(T^n-u^n\theta^n,\ X^n-v^n\theta^n)-R_*^n(T^n,X^n)\right\}=0 . \tag{3.90}$$

The Hamiltonian constant $h$ serves as the Lagrange multiplier of the time constraint to eliminate $\tau^n$ from the set of original state variables. The $X$-free modification of this equation (with the usual time $t$ rather than $\tau$) serves to generate numerical generalizations of the functions (3.64) and (3.70) in the pure heat transfer problem when both the transfer coefficients and the heat capacities vary along the process path and an analytical solution cannot be obtained. When one of the end states is that of the environment, a finite-time exergy, $A$, is generated. Of the two parts of the finite-time exergy, the classical one is known from reversible thermodynamics; thus one may use tables of $A$ to evaluate the associated minimum entropy production. The possibility of this evaluation follows from the fact that the finite-time exergy of a humid gas contains the entropy production as an additive part with a known multiplier $T^e$. For a continuous process with $R_\sigma\equiv\min s_\sigma$,

$$A(T,X)=(c_g+Xc_p)\left[(T-T^e)-T^e\ln\frac{T}{T^e}\right]+RT^e\left\{X\ln\frac{X(1+X^e)}{(1+X)X^e}+\ln\frac{1+X^e}{1+X}\right\}\pm T^eR_\sigma(T,X,T^e,X^e,\tau^f) . \tag{3.91}$$

For a multistage process, a discrete counterpart of Eq. (3.91) is generated; it should refer to a sufficiently large $N$ in order to approximate the continuous limit well enough. The last term of Eq. (3.91) contains the minimum entropy production $R_\sigma\equiv\min s_\sigma$ as a function of the end thermodynamic states and the nondimensional duration, $\tau^f$ (total number of mass transfer units). This last term is the nonclassical or path-dependent term, which vanishes for infinite durations. It should be distinguished from the first or classical term, which has potential properties. The upper (plus) sign at the last term refers to processes departing from equilibrium and the lower (minus) sign to processes approaching equilibrium.
With the knowledge of the classical exergy, explicit in the above equation, the numerical procedure can generate data for both $A$ and $R_\sigma$. Enhanced bounds on the work production and consumption result from the finite-time exergy. In conclusion, by applying the discrete HJB approach to work functionals, we have obtained the work potentials
3.2. Optimally controlled unit operations and unit processes

3.2.1. Mathematical model: optimization criterion and constraints

In reference to a practical process with one independent variable (length or time), our main aim here is to stress an important link between the numerical value of the extremal Hamiltonian and the unit price
of the apparatus. Even if the technical details of the process are of limited interest to a physicist, it should be kept in mind that this link results from an interplay between the process economics and the requirement of a finite duration. As an example we consider the optimization of complex crosscurrent operations of multistage fluidized drying or sorption. In these operations a single solid stream flows sequentially through ideally mixed stages and interacts at each stage with a different stream of gas. It is the allocation of the gas streams and the choice of their inlet states which should be found by optimization. Evaporation or condensation of an active component (moisture) occurs during drying of the solid or adsorption on the solid. These separation operations do not produce work, although they may yield a valuable product or increase the value of a substance. They are often described by nonlinear state equations and nonquadratic performance criteria following from complex solid–gas equilibria. In a fluidized cascade, the performance criterion is the sum of the exploitation cost, measured in terms of the available energy of all drying agents, and the investment cost of all stages. In crosscurrent cascades the cascade price $P^N$ is measured in terms of the gas cross-sectional areas summed over the stages, $P^N=p\sum A^n$, where $p$ is the unit price of this area. As each fluidization runs at a nearly constant gas velocity $v_g$ (which is a constant multiple of the minimum velocity) and $A^n=\Delta G^n/(\rho_gv_g)$, the cascade price per unit flow of product (dry solid flow $M$) is proportional to the total gas flow per unit flow of the product, $\tau^N=\sum\Delta G^n/M$. Indeed, $P^N/M=(p/\rho_gv_g)\sum\theta^n$, where $\theta^n=\Delta G^n/M$ is the interval of the nondimensional gas flow through the stage $n$. With these results it is easy to evaluate the exergy investment costs of the cascade per unit flow of the product, $K_i^N$:

$$K_i^N=\left(\frac{z}{\tau_p}+b\right)\frac{P^N}{e\,\tau_uM}=\left(\frac{z}{\tau_p}+b\right)\frac{p}{e\,\tau_u\rho_gv_g}\sum_{n=1}^{N}\theta^n . \tag{3.92}$$

Here we have used a familiar economic expression which links the apparatus price $P^N$ with the unit investment costs $K_i^N$ and the equipment utilization time, $\tau_u$. Coefficients of an economic nature appear: $z$ is the coefficient of investment freezing, $\tau_p$ the payout time of the apparatus, $\tau_u$ the utilization time during a year, and $b$ the coefficient of renovations. The economic value of the exergy unit, $e$ (\$/kJ), was introduced to express $K_i^N$ in exergy units (kJ/kg) rather than in economic units (\$/kg). In this framework, the production cost per unit product flow, $K_p^N$, is simply equal to $\sum b_g^n\theta^n$, where $b_g^n$ is the unit exergy of the inlet gas to stage $n$. On the other hand, from the basic theory developed in Section 2, the cost of time allocation in terms of the Hamiltonian is $h\tau^N$, where $\tau^N=\sum\theta^n$ is the duration. Thus the economic considerations link the Hamiltonian value $h$ with the unit apparatus price:
$$h=\left(\frac{z}{\tau_p}+b\right)\frac{p}{\tau_u\,\rho_gv_g\,e} . \tag{3.93}$$

The important link between the numerical value of $H$ and the apparatus unit price holds in every autonomous process with one independent variable (length or time). Consequently, when the fixed parts of the various costs (which do not influence the optimal solution) are ignored, the modified performance index to be maximized may be written in the form

$$S_*^N\equiv-\sum_{n=1}^{N}\left[b_g^n(T_g^n,X_g^n)+h\right]\theta^n\cong-\sum_{n=1}^{N}\left[\tfrac12A(T_g^n-T^e)^2+\tfrac12B(X_g^n-X^e)^2+h\right]\theta^n , \tag{3.94}$$
where $b_g$ is the specific exergy of a drying gas, and $h$ is the numerical value of the constant evaluated from Eq. (3.93). In this setting, $h$ is both the numerical value of the original Hamiltonian and the Lagrangian multiplier of the time constraint. An effective exergy of the gas, which includes the gas pumping work,

$$b_g^n(T_g^n,X_g^n)=(c_g+c_wX_g^n)\left(T_g^n-T^e-T^e\ln\frac{T_g^n}{T^e}\right)+\frac{RT^e}{M_w}\,X_g^n\ln\frac{X_g^n(M_w/M_g+X^e)}{X^e(M_w/M_g+X_g^n)}+\frac{RT^e}{M_g}\ln\frac{M_w/M_g+X^e}{M_w/M_g+X_g^n}+\frac{c_{el}\,\Delta P}{\eta\,\rho_g\,e} , \tag{3.95}$$

is applied in Eq. (3.94). Here $\eta$ is the efficiency of the compressor used to overcome the pressure drop $\Delta P$, $c_{el}$ the unit price of electric energy, $c_g$ and $c_w$ the specific heats of the dry gas and the active substance (moisture), $T^e$ and $X^e$ the temperature and moisture content of the environment, and $M_g$ and $M_w$ the molar masses of the dry gas and the active substance.

The state variables at the stage $n$ are the outlet solid temperature $T_s^n$ and the outlet solid moisture content $W_s^n$. They appear in the discrete state equations

$$(I_s^n-I_s^{n-1})/\theta^n=i_g^n(T_g^n,X_g^n)-i_s^n(W_s^n,T_s^n) , \tag{3.96}$$

$$(W_s^n-W_s^{n-1})/\theta^n=X_g^n-X_s^n(W_s^n,T_s^n) . \tag{3.97}$$

The state equations contain the controls $T_g^n$ and $X_g^n$, and the gas enthalpy $i_g^n$ is evaluated in terms of these controls as

$$i_g^n(T_g^n,X_g^n)=(c_g+X_g^nc_{wg})(T_g^n-T_0)+r_0X_g^n , \tag{3.98}$$

where $T_0$ is a reference temperature and $r_0$ the specific evaporation heat at $T_0$. Similarly, the solid temperature is evaluated in terms of the state variables $I_s^n$ and $W_s^n$; when heat effects in the solid phase are negligible, the following formula holds:

$$T_s^n(I_s^n,W_s^n)=T_0+(c_s+W_s^nc_{ws})^{-1}I_s^n . \tag{3.99}$$

The gas humidity is constrained from below by the inequality

$$X_g^n\ge X_* . \tag{3.100}$$

The state equations also contain the complex state-dependent equilibrium functions $i_s^n(W_s^n,I_s^n)$ and $X_s^n(W_s^n,I_s^n)$. They describe the enthalpy and humidity of the gas in equilibrium with the solid, and are given as complex semiempirical formulae [13].

3.2.2. Hamiltonian and maximum principle solution

The discrete maximum principle is used to maximize criterion (3.94). The Hamiltonian function is

$$H^{n-1}(W_s^n,I_s^n,z_1^{n-1},z_2^{n-1},T_g^n,X_g^n)\equiv z_1^{n-1}\left[i_g^n(T_g^n,X_g^n)-i_s^n(I_s^n,W_s^n)\right]+z_2^{n-1}\left[X_g^n-X_s^n(I_s^n,W_s^n)\right]-\left[b_g^n(T_g^n,X_g^n)+h\right] \tag{3.101}$$
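for $n=1,2,\dots,N$. The process duration is measured by the cumulative dimensionless gas flow, $\tau^n=\sum\Delta G^n/M$. Since this flow is undetermined, the optimal value of $H$ equals zero. The adjoint equations do not involve the controls $u^n$:

$$\frac{z_1^n-z_1^{n-1}}{\theta^n}=-\frac{\partial H^{n-1}}{\partial I_s^n}=z_1^{n-1}\,\frac{\partial i_s^n(W_s^n,I_s^n)}{\partial I_s^n}+z_2^{n-1}\,\frac{\partial X_s^n(W_s^n,I_s^n)}{\partial I_s^n} , \tag{3.102}$$

$$\frac{z_2^n-z_2^{n-1}}{\theta^n}=-\frac{\partial H^{n-1}}{\partial W_s^n}=z_1^{n-1}\,\frac{\partial i_s^n(W_s^n,I_s^n)}{\partial W_s^n}+z_2^{n-1}\,\frac{\partial X_s^n(W_s^n,I_s^n)}{\partial W_s^n} . \tag{3.103}$$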
The final solid moisture content $W_s^N$ is required to be sufficiently low, say, equal to a small value $W^f$:

$$W_s^N=W^f . \tag{3.104}$$

The final solid enthalpy is undetermined. Thus, the boundary conditions for the final adjoints are:

$$z_1^N=0 , \tag{3.105}$$

$$z_2^{n-1}\ \text{should assure}\ W_s^N=W^f . \tag{3.106}$$

The canonical equations (3.102) and (3.103) serve to analytically determine the two functions

$$z_1^{n-1}=F_{3,1}^{-1}(I_s^n,W_s^n,z_1^n,z_2^n,t^n-t^{n-1}) , \tag{3.107}$$

$$z_2^{n-1}=F_{3,2}^{-1}(I_s^n,W_s^n,z_1^n,z_2^n,t^n-t^{n-1}) \tag{3.108}$$

as the control-free representatives of the general adjoint equations (3.89) written in the form

$$F_3(x^n,z^n,z^{n-1},t^n,t^{n-1},u^n)=0 . \tag{3.109}$$

Next, the optimal values of the controlling gas parameters are found from the extremum conditions for the Hamiltonian. For the quadratic approximation of the gas exergy $b_g$ in Eq. (3.94),

$$\frac{\partial H^{n-1}}{\partial X_g^n}=-B(X_g^n-X^e)+z_1^{n-1}\left[c_{wg}(T_g^n-T^e)+r_0\right]+z_2^{n-1}=0 , \tag{3.110}$$

$$\frac{\partial H^{n-1}}{\partial T_g^n}=-A(T_g^n-T^e)+z_1^{n-1}(c_g+c_{wg}X_g^n)=0 , \tag{3.111}$$

where $A$ and $B$ are the coefficients of this approximation. These relations are representatives of the general formula (2.91) for the Hamiltonian extremum with respect to the controls,

$$F_4(x^n,t^n,z^{n-1},u^n)=0 . \tag{3.112}$$

They are solved with respect to the control vector $u^n=(T_g^n,X_g^n)$; this yields the two expressions

$$X_g^n=\left[BX^e-T^ez_1^{n-1}c_{wg}+z_1^{n-1}r_0+z_2^{n-1}+z_1^{n-1}c_{wg}T^e+A^{-1}(z_1^{n-1})^2c_{wg}c_g\right]\left[B-A^{-1}(z_1^{n-1}c_{wg})^2\right]^{-1} , \tag{3.113}$$

$$T_g^n=T^e+z_1^{n-1}A^{-1}(c_g+X_g^nc_{wg}) . \tag{3.114}$$
If the stationary value of the air humidity computed from the first of these equations is less than a minimum admissible humidity $X_*$, then the optimal gas humidity is

$$X_g^n=X_* \tag{3.115}$$

and the expression for the optimal gas temperature simplifies to the form

$$T_g^n=T^e+z_1^{n-1}A^{-1}(c_g+X_*c_{wg}) . \tag{3.116}$$
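The control law of this subsection, including the lower bound on the gas humidity, can be sketched as follows. The humidity expression used in the code is obtained by eliminating $T_g^n$ between the two stationarity conditions, read here as $-B(X_g-X^e)+z_1[c_{wg}(T_g-T^e)+r_0]+z_2=0$ and $-A(T_g-T^e)+z_1(c_g+c_{wg}X_g)=0$; this reading, the function names, and all numerical values are assumptions made for illustration, not data from the cited computations.

```python
def optimal_gas_controls(z1, z2, A, B, c_g, c_wg, r0, t_env, x_env, x_min):
    """Return (T_g, X_g) maximizing the quadratic-exergy Hamiltonian for adjoints z1, z2."""
    denominator = B - (z1 * c_wg) ** 2 / A
    if denominator <= 0.0:
        raise ValueError("Hamiltonian is not concave in (T_g, X_g) for these data")
    numerator = B * x_env + z1 * r0 + z2 + z1 ** 2 * c_wg * c_g / A
    x_gas = max(numerator / denominator, x_min)        # humidity bounded from below
    t_gas = t_env + z1 * (c_g + x_gas * c_wg) / A      # gas temperature for that humidity
    return t_gas, x_gas

def hamiltonian_gradient(t_gas, x_gas, z1, z2, A, B, c_g, c_wg, r0, t_env, x_env):
    """Derivatives of the Hamiltonian with respect to T_g and X_g."""
    d_dt = -A * (t_gas - t_env) + z1 * (c_g + c_wg * x_gas)
    d_dx = -B * (x_gas - x_env) + z1 * (c_wg * (t_gas - t_env) + r0) + z2
    return d_dt, d_dx

# Hypothetical coefficients, chosen only so that both branches are exercised.
PARAMS = dict(A=0.002, B=2000.0, c_g=1.0, c_wg=1.9, r0=2500.0, t_env=293.0, x_env=0.008)
X_MIN = 1.0e-4

interior = optimal_gas_controls(0.1, -240.0, x_min=X_MIN, **PARAMS)
clipped = optimal_gas_controls(0.1, -300.0, x_min=X_MIN, **PARAMS)
grad_interior = hamiltonian_gradient(*interior, 0.1, -240.0, **PARAMS)
grad_clipped = hamiltonian_gradient(*clipped, 0.1, -300.0, **PARAMS)
print(interior, grad_interior)
print(clipped, grad_clipped)
```

In the interior case both derivatives vanish; in the bounded case only the temperature derivative vanishes, while the humidity derivative is negative, so lowering the humidity further would be favourable but is not admissible.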
The derived relations make possible the construction of the computational block scheme [13]. A standardized numerical procedure uses the extremum control functions $T_g^n(z_1^{n-1},z_2^{n-1})$ and $X_g^n(z_1^{n-1},z_2^{n-1})$ in the state equations and in the Hamiltonian function. Next, the two adjoint functions (3.107) and (3.108) are applied in the condition describing the constancy of $H^{n-1}$. This yields the equation

$$H^{n-1}\equiv f_3^n(I_s^n,W_s^n,z_1^n,z_2^n,t^n-t^{n-1})-h=0 , \tag{3.117}$$

which follows from the vanishing Hamiltonian (3.101). From this equation the time interval $\theta^n=t^n-t^{n-1}$ is determined by a computer. The numerical value of $\theta^n$ is used in the equations which describe the adjoints, controls and state coordinates. In effect, the procedure generates at the stage $n$ the following optimal functions:

$$\theta^n=g_0(I_s^n,W_s^n,z_1^n,z_2^n,h) , \tag{3.118}$$

$$T_g^n=g_1(I_s^n,W_s^n,z_1^n,z_2^n,h) , \tag{3.119}$$

$$X_g^n=g_2(I_s^n,W_s^n,z_1^n,z_2^n,h) , \tag{3.120}$$

$$I_s^{n-1}=g_4(I_s^n,W_s^n,z_1^n,z_2^n,h) , \tag{3.121}$$

$$W_s^{n-1}=g_5(I_s^n,W_s^n,z_1^n,z_2^n,h) . \tag{3.122}$$

With these data the computer can pass to the computations of the next stage. As this is a backward procedure, the next stage is the stage $n-1$. By this numerical procedure a family of optimal trajectories is generated, and the trajectory whose initial state coincides with the prescribed initial state constitutes the solution of the optimization problem.

3.2.3. Basic properties of optimal controls and optimal trajectories

Here is a short discussion of the optimal results. The numerical solution was obtained for the evaporation of water from silica gel into air in a three-stage cascade of fluidized beds, $N=3$ [12,13]. The initial state of the solid, $I_s^0$, $W_s^0$, and the final moisture content $W_s^N$ were prescribed, whereas the final solid enthalpy was free. The role of $h$ as an intensity index was tested in the range 0.42–4.20 kJ/kg. The optimal control data exhibit the following properties:

The optimal gas temperatures $T_g^n$ decrease along an optimal drying path. The highest temperatures of the gas $T_g^n$ are for $n=1$. The temperature $T_g^3$ is the lowest. For higher intensities $h$ all
temperatures increase and their di!erences increase as well. For h"0 the environment temperature is optimal for each n, i.e., ¹n "¹%, provided that the equilibrium solid moisture content ' evaluated at ¹"¹% is lower than the required "nal solid moisture content =N (this condition was 4 always satis"ed in computations). The optimal gas humidities Xn decrease along the optimal path. They attain stationary values ' for only su$ciently small h (small intensities or large values of total gas #ow). For very small h the values Xn attain the limiting environmental humidity (X%"0.008 kg/kg). With increasing h the ' values Xn decrease and then the lowest admissible humidity X "0.0001 is the optimal solution. ' H For a su$ciently large h each Xn equals the lowest X . ' H The optimal dimensionless gas #ows hn"*Gn/M are nonequal along the optimal path. The largest #ows are at the "rst stage and the lowest at the last stage of the cascade. Each #ow hn decreases with h, corresponding with an increase of the process intensity. Otherwise, for h approaching zero, very large hn are obtained corresponding to drying of solid by gas with environmental parameters. The optimal trajectories of the control drying process with a free "nal enthalpy (temperature) of solid depend substantially on the parameter h which is the constant of these paths. For small h (0.42 kJ/kg), state transitions are through regimes of lower temperatures of solid with possible minima of ¹ . This is caused by use of gas with lower exergy potential (lower ¹ and larger X , closer to 4 ' ' X% and ¹%). For larger h, state transitions are through higher ¹ , with possible maxima of ¹ . For 4 4 example, for h"4.2 kJ/kg, the solid is heated at the "rst two stages of cascade and cooled at the last stage of the cascade. Such solution applies for drying of sugar and ¹-sensitive biological materials. They should be dried relatively fast but otherwise their "nal temperatures cannot be too high. 
The optimal exploitation costs of gas tend to zero for $h = 0$; since only the ambient gas is then used, only fixed costs are paid. Otherwise, the optimal costs increase with the intensity index $h$. Since $h$ measures simultaneously the process intensity and the unit price of the apparatus [in appropriate units; see Eq. (3.93)], one may conclude that optimal costs increase with each of these factors. For apparatuses of high unit price $p$, only large intensities can be optimal. Yet there are limits which bound the classical exergy consumption for finite durations.

3.2.4. Problem with constant humidity of inlet gas

A special case in which the inlet air humidity was constant and equal to that in ambient air, $X_0$, was investigated by dynamic programming [13,21,26]. Fig. 3, which is a sort of rays and wave fronts picture, shows lines of constant values of the potential $R_*^n(x^n, h)$, which bounds from below the total exergy costs in crosscurrent dryers. $R_*^n(x^n, h)$ is the minimum cost function in the equation
\[
R_*^n(x^n, h) = \min_{u^n,\,h^n} \left\{ K_*^n(x^n, u^n, h^n, h) + R_*^{n-1}\left(x^n - f^n(x^n, u^n)h^n,\ h\right) \right\} , \tag{3.123}
\]
where $K_*^n = (b_g + h)h^n$. The function $R_*^n(x^n, h)$ is the minimum value of the sum in Eq. (3.94). The enthalpy-moisture content diagram, which is a suitable coordinate frame for these costs, also contains optimal trajectories obtained for $h = 0.42$ kJ/kg. They are confined to the state region defined by the inequalities $X_s(I, W) \ge X_0 = 0.008$ kg/kg and $T_g \le T_g^* = 373$ K. (The subscript $s$ stressing the solid state is omitted in the diagram $I$–$W$.)
Fig. 3. Enthalpy-moisture content diagram with lines of constant values of the finite time potential $R_*^n(x^n, h)$ which bounds from below the total exergy costs in continuous crosscurrent dryers in the case of constant humidity and variable temperature of the inlet gas. The optimal trajectories of drying, obtained for $h = 0.42$ kJ/kg, are in the state region defined by the inequalities $X_s(I, W) \ge X_0 = 0.008$ kg/kg and $T_g \le T_g^* = 374\,^{\circ}$C.
In the considered case, the optimal strategy is to increase the inlet gas temperature until the admissible gas temperature ($T_g^* = 373$ K) is attained. Then the optimal temperature strategy becomes isothermal in time. It is the constraint $T_g \le T_g^*$ which defines the "critical" or bifurcation point at which the appropriate optimizing potential switches from the variable-$T$ part to the constant-$T$ part. The former part is an interior solution and the latter is the boundary solution. In the first range (the stationary optimum) the optimal trajectories and controls render the driving forces for controlled evaporation nearly constant in time. Yet this property, which is sometimes called the equipartition of thermodynamic forces, is valid only for crosscurrent processes with unconstrained controls. Thermodynamic analyses show that reversible processes can be optimal only when the unit apparatus price is infinitely low ($h = 0$) and the outlet gas is fully exploited. Otherwise an irreversible process, associated with equipment of a finite size or a finite holdup time, is optimal. In optimal processes
changes in process intensity are balanced by changes in the equipment's price; for expensive apparatuses only intensive processes are optimal. The latter are characterized by a large irreversibility index $h$ and large entropy production. This means that the design of expensive apparatuses should be associated with the design of intense optimal processes.

3.3. Processes spontaneously relaxing to the equilibrium

3.3.1. Constrained and unconstrained formulations

Now we consider relaxation processes in which the original state variables are constrained by conservation laws for energy, mass and momentum. This requires an approach which applies Lagrange multipliers to handle dependent rates. Our system is a composition of two homogeneous subsystems $c$ and $d$ separated by an interface with negligible thermodynamic properties. The subsystems are not in thermodynamic equilibrium, which means that they differ in the values of their intensive parameters, the temperature reciprocals and the Planck potentials. In each subsystem the dependent variables of state, $x^c = (n^c, e^c) - (n^{c*}, e^{c*})$ and $x^d = (n^d, e^d) - (n^{d*}, e^{d*})$, are deviations of mole numbers, $n$, and energy, $e$, from their equilibrium values, $n^*$ and $e^*$. To present the basic concepts in a transparent way, we neglect surface phenomena by treating the interface as a mathematical surface, possibly involving discontinuities of concentrations, enthalpy, etc., but not of temperatures and chemical potentials, as the interface is still assumed to be nondissipative. Applying the theorem of minimum entropy production to a lumped relaxing system with simultaneous heat and mass transfer between two subsystems, we find the canonical (Hamiltonian) structure of the nonequilibrium dynamics, and show a general self-consistent way of deriving this dynamics.

The relaxation dynamics is derived from a "thermodynamic Lagrangian" $L_p$ which uses two dissipation functions: a rate dependent $U^T$ and a state dependent $W$. $U^T$ is the Legendre transform of an original $U$ that appears in the Hamiltonian of the problem, $H_p$. Two possible variational approaches are compared: the first (DVA), which applies dependent variables of state connected by balance constraints, and the second (IVA), in which these constraints are eliminated in advance, so that the model contains only independent variables. The first approach is novel; the second is basically an integral version of Onsager's approach in the two-phase context. Both approaches are analyzed by methods of optimal control theory, and make use of ideas based on the HJB theory. It is shown that the DVA has a number of virtues with respect to the IVA. It can deal with the nontruncated thermodynamic entropy and with absolute thermodynamic adjoints $(T^{-1}, T^{-1}k_i)$. Also, it gives a complementary relaxation picture in terms of these adjoints, as governed by the relaxation matrix $K^T$, the transpose of the state relaxation matrix $K$. Other physical results elucidate a general optimal-control scheme to construct the nonequilibrium thermodynamic entropy $S$ as a principal function which satisfies an autonomous HJB equation and the related Hamilton–Jacobi theory under the constraint of a vanishing thermodynamic Hamiltonian $H_p = U - W$, necessary for $S$ to be a state function. Properties of this entropy are analyzed here, with the entropy evaluated additively over homogeneous subsystems.

A basic approach to irreversible dynamics is applied here that uses an integral criterion of minimum entropy production, analyzed by optimal control theory or variational calculus. This helps establish a general theoretical scheme for phenomenological equations, which are dynamical equations characterizing the behavior of a relaxation process. In fact, our approach leads to the
corresponding representation of Onsagerian irreversible thermodynamics [44–46]. The relaxation dynamics are shown to be Lagrangian or Hamiltonian. In the case of a system isolated as a whole, they are governed by the thermodynamic entropy $S$ as the principal function of the variational problem, which satisfies the HJB equation of the optimal control problem and the related Hamilton–Jacobi equation.

The state variables of the process are components of the vector variables $x^c$ and $x^d$, which describe "charges" of energy and mass in the subsystems $c$ and $d$; specifically
\[
x^c = (n_1^c - n_1^{c*},\ n_2^c - n_2^{c*},\ \ldots,\ n_s^c - n_s^{c*},\ e^c - e^{c*}) , \tag{3.124}
\]
\[
x^d = (n_1^d - n_1^{d*},\ n_2^d - n_2^{d*},\ \ldots,\ n_s^d - n_s^{d*},\ e^d - e^{d*}) . \tag{3.125}
\]
The subscripts refer to the species and the energy as transferred entities. The superscripts refer to the two subsystems. While $x = (x^c, x^d)$ are our basic variables, we shall occasionally also use the original state vector, $\tilde{n} = (n, e)$. Further, we will report an equation for the first subsystem only whenever the equation for the second subsystem can be trivially obtained by exchanging $c$ with $d$.

We will compare two approaches: the dependent (constrained) variable approach (DVA) and the independent (unconstrained) variable approach (IVA). In the first, the balance constraint is explicit, a property which allows all state variables (in each phase) to be treated on an equal footing if Lagrange multipliers are applied. In the second, the conservation-law constraint is eliminated in advance; thus only coordinates of one phase, say the first, are involved (the independent coordinates $a \equiv x^c$). In each approach a restricted local-extremum principle is contained in the HJB equation for the minimum of generated entropy. Using the method of dynamic programming in the case of dependent variables, we minimize the functional of the entropy production
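\[
\begin{aligned}
p_s &= \int_{t_1}^{t_2} K_p(x, v, k', t)\,dt = \int_{t_1}^{t_2} \left[ U^T(x, v) + W(x, t) + k'\cdot(v^c + v^d) \right] dt \\
&= \int_{t_1}^{t_2} \left[ \tfrac{1}{2} R^c(x): v^c v^c + \tfrac{1}{2} R^d(x): v^d v^d + W(x^c, x^d, t) + k'\cdot(v^c + v^d) \right] dt ,
\end{aligned}
\tag{3.126}
\]
where $x = (x^c, x^d)$ and $v = (v^c, v^d)$ satisfy the simple differential constraints
\[
\dot{x}^c = v^c, \qquad \dot{x}^d = v^d . \tag{3.127}
\]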
The first line of integral (3.126) describes a general structure proper for the optimal control formulation, whereas the second line refers to the example with a quadratic dissipation function $U^T$ which contains the symmetric resistance matrices $R^c$ and $R^d$. Clearly, representations with rates as derivatives of state variables are pertinent for direct use of variational calculus [47] and methods of classical mechanics [48]. On the other hand, the optimal control formulations, in particular dynamic programming, which we use here as the basis, require minimizing $p_s$, Eq. (3.126), subject to the simple equations of state (3.127). In principle this can be done in the framework of either the forward or the backward DP algorithm. Two other equivalent formulations are possible. The first searches for the maximum of the entropy functional $S \equiv S^f - p_s$ and uses the forward DP algorithm to obtain the initial-entropy potential
$S \equiv \max S$ in terms of the final states $x^f$. The second minimizes the entropy functional $S \equiv S^i + p$ and applies the backward algorithm to obtain the final-entropy potential $S \equiv \min S$ in terms of the initial states $x^i$. Note that the functionals $p$ used in each formulation refer to different parts of the path: in the first formulation the quantity $p$ is the entropy production for a final part of the path, and in the second it is the entropy production for an initial part of the path.

The dissipative Lagrangian (3.126) contains rate constraints adjoined via the vector of Lagrange multipliers, $k'$. The presence of time $t$ in $W$ means that the above formulation admits an explicit presence of time in $K_p$ in the case of driven (nonisolated) systems. In general, knowledge of both dissipation functions is necessary. While, as we shall see later, $U$ and $W$ are numerically equal along extremals of an isolated system, choosing $L_p \equiv p_s$ as $2U$ or $2W$ as the integrand would be erroneous, since it is the analytical form of the functional which influences the result of extremizing. We shall show that the quadratic approximation of the thermodynamic entropy
\[
S(x, n^*, e^*) = S^*(n^{c*}, e^{c*}, n^{d*}, e^{d*}) + p^*\cdot(x^c + x^d) - \tfrac{1}{2} C^c: x^c x^c - \tfrac{1}{2} C^d: x^d x^d , \tag{3.128}
\]
where $C$ is the Hessian matrix and the asterisk refers to the equilibrium, is the suitable extremal function for the linear dynamics, that is, the dynamics with $x$-independent $R^c$ and $R^d$. By contrast, the quadratic approximation proves to be insufficient for nonlinear dynamics with state dependent resistances. The HJB theory, along with the relevant Bellman equation or stage criterion, provides an efficient way to treat such nonlinear systems. As Eq. (3.128) contains a linear term, the quantity $S$ is the "nontruncated" entropy of an initial state expressed in terms of that corresponding to a common equilibrium state and the deviations, $x$. When dealing with such a nontruncated $S$, the optimal control approach shows an important property of the DVA: the Lagrange multipliers of the rate constraints, $\dot{x} = v$, are the co-state variables in the Gibbs equation, the transfer potentials $p$. On the other hand, for the Onsager-type unconstrained formulation (with independent variables $a$; IVA) the minimized functional is
\[
p = \int_{t_1}^{t_2} L_p(a, \dot{a})\,dt = \int_{t_1}^{t_2} \left[ U^T(a, \dot{a}) + W(a) \right] dt = \int_{t_1}^{t_2} \left[ \tfrac{1}{2} R(a): \dot{a}\dot{a} + W(a) \right] dt . \tag{3.129}
\]
Its integrand contains the total resistance $R = R^c + R^d$. The distinction between $U^T$ and $U$ is important only when the dissipation is nonquadratic with respect to the rates $\dot{a} = da/dt$. Again, an equivalent way is to consider the initial entropy functional $S \equiv S^f - p$ in terms of its final states; with the forward DP algorithm this leads to the entropylike potential $\max S \equiv S(a^f)$. Similarly, we may consider the final entropy functional $S \equiv S^i + p$ in terms of its initial states; with the backward DP algorithm this leads to the entropylike potential $\min S \equiv S(a^i)$. The quadratic approximation to the entropy of the IVA can be obtained by applying the state constraint $x^c + x^d = 0$ in Eq. (3.128); this yields
\[
S(a, n^*, e^*) = S^*(n^{c*}, e^{c*}, n^{d*}, e^{d*}) - \tfrac{1}{2} C: aa . \tag{3.130}
\]
Again, we have to decide whether we concentrate on the backward or the forward DP algorithm. In the former, for example, we will search for the minimum of $S(t^f)$ at the instant $t = t^f$, subject to the differential
constraint $dx_0/dt = L_p(x, v) = U + W$ and all remaining constraints, $dx/dt = v$ and $v^c + v^d = 0$. The minimum of $S(t^f)$ is the value of the entropy $S$ at the final instant $t^f$; this entropy is generated by the backward algorithm in terms of the initial states $a = a^i$.

3.3.2. HJB equations and irreversible dynamics

For the Lagrangian $K_p$ and the optimal entropy production $s_p(x^i, t^i, x^f, t^f) \equiv \min p_s$, the forward and backward H–J–B equations, Eqs. (3.41) and (3.42) of Section 2.1, are
\[
\max_{v,\,k'} \left\{ \frac{\partial s_p}{\partial t^f} + \sum_{k=1}^{s+1} \frac{\partial s_p}{\partial x_k^{cf}} v_k^c + \sum_{k=1}^{s+1} \frac{\partial s_p}{\partial x_k^{df}} v_k^d - K_p^f(x, t, v, k') \right\} = 0 , \tag{3.131}
\]
\[
\min_{v,\,k'} \left\{ \frac{\partial s_p}{\partial t^i} + \sum_{k=1}^{s+1} \frac{\partial s_p}{\partial x_k^{ci}} v_k^c + \sum_{k=1}^{s+1} \frac{\partial s_p}{\partial x_k^{di}} v_k^d + K_p^i(x, t, v, k') \right\} = 0 . \tag{3.132}
\]
As $\partial s_p/\partial x^f = -\partial s_p/\partial x^i$, it is sufficient to focus on one of these equations.

In what follows, we will deal with the "backward" HJB equation, Eq. (3.132), in which we will drop the superscript $i$ on the initial state variables and the initial entropy, as these quantities are subject to changes. For the potential of total entropy production $s_p \equiv \min(S^f - S^i) = S - S^i$, Eq. (3.132) with the index $i$ dropped can be written in the form describing the total time derivative of the final entropy potential $S$:
\[
\begin{aligned}
\min_{v^c,\,v^d,\,k'} \Big\{ \frac{dS(x, t, S)}{dt} \equiv {} & \frac{\partial s_p}{\partial t} + \frac{\partial s_p}{\partial x^c} v^c + \frac{\partial s_p}{\partial x^d} v^d \\
& + \tfrac{1}{2} R^c(x): v^c v^c + \tfrac{1}{2} R^d(x^c): v^d v^d + W(x^c, x^d) + k'\cdot(v^c + v^d) \Big\} = 0 ,
\end{aligned}
\tag{3.133}
\]
where $v^c$ and $v^d$ are the two dependent controls and the second line contains the Lagrangian $K_p$. (Note that the partial derivatives of $s_p$ can be replaced by those of $S$.) The HJB formulation (3.133) is essential as the computational tool for an arbitrary dependence of the resistance functions on the state $x$. Of course, $\partial s_p/\partial t = 0$ and $\partial S/\partial t = 0$ for adiabatic thermodynamics. As the presence of the Lagrange multipliers $k'$ allows the problem to be treated as unconstrained, i.e. the stationary extremum conditions to be applied, Eq. (3.133) is equivalent to the following set of equations:
\[
\begin{aligned}
& \frac{\partial s_p}{\partial t} + \frac{\partial s_p}{\partial x^c} v^c + \frac{\partial s_p}{\partial x^d} v^d + K_p(x, v, k', t) = 0 , \\
& \frac{\partial K_p(x, v, k', t)}{\partial v^c} = -\frac{\partial s_p(x, t)}{\partial x^c} , \qquad \frac{\partial K_p(x, v, k', t)}{\partial v^d} = -\frac{\partial s_p(x, t)}{\partial x^d} , \\
& \frac{\partial K_p(x, v, k', t)}{\partial k'} = \phi'(x, v) \equiv v^c + v^d = 0
\end{aligned}
\tag{3.134}
\]
with the unknowns $s_p$, $v$ and $k'$. We can now identify the Lagrange multipliers with the interphase potentials, $k' \equiv p^*$. Working with the simple but important case when both dissipation
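functions are quadratic,
\[
U(v) = \tfrac{1}{2} R^c: v^c v^c + \tfrac{1}{2} R^d: v^d v^d , \tag{3.135}
\]
\[
W(x) = \tfrac{1}{2} W^c: x^c x^c + \tfrac{1}{2} W^d: x^d x^d , \tag{3.136}
\]
and using the following definitions:
\[
W^d \equiv (CLC)^d, \quad W^c \equiv (CLC)^c, \quad L^d \equiv (R^d)^{-1}, \quad L^c \equiv (R^c)^{-1} , \tag{3.137}
\]
we obtain from Eq. (3.134) the linear equations
\[
dx^c/dt = (R^c)^{-1}(p^c - p^*) \cong -(LC)^c x^c , \tag{3.138}
\]
\[
dx^d/dt = (R^d)^{-1}(p^d - p^*) \cong -(LC)^d x^d , \tag{3.139}
\]
\[
dx^c/dt + dx^d/dt = 0 . \tag{3.140}
\]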
In this set $p^c$ and $p^d$ are the state adjoints, or the negative derivatives of the nonequilibrium entropy potential with respect to the state variables $x^c$ and $x^d$. The approximations containing the Hessian $C$ are consistent with the linear dynamics and the second-order expansion of $S$, Eq. (3.128), with constant coefficients. On eliminating the Lagrange multiplier $p^*$ from the transfer equations (3.138) and (3.139) and defining the overall transfer resistance matrix $R = R^c + R^d$ [or the overall conductance matrix $L \equiv R^{-1}$, where $L^{-1} = (L^c)^{-1} + (L^d)^{-1}$], one obtains the kinetic equations
\[
dx^c/dt = R^{-1}(p^c - p^d) \equiv L(\partial S/\partial x^c - \partial S/\partial x^d) , \tag{3.141}
\]
\[
dx^d/dt = R^{-1}(p^d - p^c) \equiv L(\partial S/\partial x^d - \partial S/\partial x^c) , \tag{3.142}
\]
in which $S$ is the current entropy of the system. The simplest special case described by these equations refers to pure heat exchange between the two subsystems $c$ and $d$. The dependent equations of the heat exchange may be written in the form
\[
de^c/dt = [(T^c)^{-1} - (T^d)^{-1}]/R , \tag{3.143}
\]
\[
de^d/dt = [(T^d)^{-1} - (T^c)^{-1}]/R . \tag{3.144}
\]
Their sum vanishes since the energy conservation is identically satisfied.

3.3.3. Relaxation dynamics by Hamilton–Jacobi equations

In terms of the nonequilibrium entropy potential $S = \min S^f$ as the optimal performance function, the Hamilton–Jacobi equation of the dependent variable theory is a truncation of the general standard form
\[
\partial S/\partial t + H_p(\partial S/\partial x^c, \partial S/\partial x^d, x^c, x^d, t) = 0 \tag{3.145}
\]
with the functions $S$ and $H_p$ explicitly independent of the time $t$. In linear adiabatic systems, with $x$-independent resistances $R^c$ and $R^d$, the Hamilton–Jacobi equation applies in the quadratic and time independent form
\[
\tfrac{1}{2} L: (\partial S/\partial x^c - \partial S/\partial x^d)(\partial S/\partial x^c - \partial S/\partial x^d) - \tfrac{1}{2} W^c: x^c x^c - \tfrac{1}{2} W^d: x^d x^d = 0 , \tag{3.146}
\]
where $W^c = (CLC)^c$, $W^d = (CLC)^d$ and $L^{-1} = (L^c)^{-1} + (L^d)^{-1}$. This equation is satisfied by the thermodynamic entropy, Eq. (3.128). In the phase space we obtain for linear systems
\[
H_p(p^c, p^d, x^c, x^d) \equiv \tfrac{1}{2} L: (p^c - p^d)(p^c - p^d) - \tfrac{1}{2} W^c: x^c x^c - \tfrac{1}{2} W^d: x^d x^d = 0 . \tag{3.147}
\]
This Hamiltonian yields the canonical set in the form of dependent equations which are linear and satisfy the conservation laws identically. A modified form of Eq. (3.146) with an unknown $W(x)$ may serve to determine the second dissipation function in an exact way, since the thermodynamic entropy is frequently known with good accuracy. For the linear dynamics, the quadratic function (3.128) is sufficient for that purpose.

Using the independent variable approach, the HJB theory must be expressed in terms of the variables $a \equiv x^c$. The HJB equation for the thermodynamic problem (written in the form of the corresponding power criterion and the entropy rather than the entropy production $s_p$) yields
\[
\min_{\dot{a}} \left\langle P_s(a, \dot{a}) \equiv -(\partial S/\partial a)\dot{a} + \tfrac{1}{2} R(a): \dot{a}\dot{a} + W(a) \right\rangle = 0 , \tag{3.148}
\]
where the function $S$ is Onsager's approximation to the nonequilibrium entropy, Eq. (3.130), which is the initial entropy function generated by the forward DP algorithm. This result represents Onsager's restricted extremum principle, in which phenomenological equations are found by setting to zero the variation of $P_s$ with respect to the independent rates $\dot{a}$ at fixed state coordinates $a$. A related (backward DP) result for the dependent variables is Eq. (3.133). From Eq. (3.148) independent phenomenological equations follow in Onsager's form:
\[
da/dt = R^{-1}(p^c - p^d) = L\,\partial S/\partial a \equiv L\cdot X . \tag{3.149}
\]
Note that its structure is different from that represented by Eqs. (3.141) and (3.142), which are related to the dependent variable theory. Yet the physical content is the same in both cases. In order to assure time-independent potentials $s_p$ and $S$, the thermodynamic Hamiltonian, or the Legendre transform of the dissipative Lagrangian $L_p$, must vanish:
\[
H_p \equiv (\partial L_p/\partial \dot{a})\dot{a} - L_p = U(a, \dot{a}) - W(a) = 0 . \tag{3.150}
\]
As follows from Eq. (3.147), in the dependent-variable approach the analogous condition of vanishing of the corresponding Hamiltonian is also satisfied. For Hamiltonian (3.150) expressed in terms of the entropy production, the forward Hamilton–Jacobi equation
\[
\frac{1}{2} \sum_i \sum_k L_{ik} \frac{\partial s_p}{\partial a_i} \frac{\partial s_p}{\partial a_k} - W(a) = 0 \tag{3.151}
\]
is the IVA counterpart of Eq. (3.146). It describes the vanishing property of the dissipative Hamiltonian, $H_p = U - W$, or the Legendre transform of $L_p = U^T + W$ for the quadratic $U^T = (1/2) R: \dot{a}\dot{a}$. The generalized momenta $p_i$ are identified with the partial derivatives $\partial s_p/\partial a_i$. In fact, the Legendre transformation of the Lagrangian is explicit in all HJB equations and power criteria, in which $p_i = \partial s_p/\partial a_i = \partial S/\partial a_i$ (IVA) or $p_i = \partial s_p/\partial x_i = \partial S/\partial x_i$ (DVA). Clearly, in both approaches the entropy plays the role of an action.
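
The vanishing of $H_p = U - W$ along Onsager's extremals can be checked numerically. The following minimal sketch assumes the quadratic entropy (3.130), so that $\partial S/\partial a = -Ca$, a constant resistance matrix $R$, and $W(a) = \tfrac{1}{2}(CLC): aa$ as the IVA analogue of definitions (3.137). The numerical values of R and C, the function names and the Runge–Kutta integrator are illustrative assumptions and not part of the original treatment.

```python
import numpy as np

# Hypothetical symmetric positive-definite matrices (illustrative values only).
R = np.array([[2.0, 0.5], [0.5, 1.0]])   # total resistance R = R^c + R^d
C = np.array([[3.0, 1.0], [1.0, 2.0]])   # Hessian C of Eq. (3.130): S = S* - (1/2) C:aa
L = np.linalg.inv(R)                      # conductance L = R^{-1}
W = C @ L @ C                             # assumed state dissipation matrix, W = CLC


def rate(a):
    """Onsager's equation (3.149): da/dt = L dS/da, with dS/da = -C a."""
    return -L @ C @ a


def entropy_deficit(a):
    """S* - S for the quadratic entropy (3.130)."""
    return 0.5 * a @ C @ a


def hamiltonian(a):
    """H_p = U - W of Eq. (3.150), evaluated on the extremal rate."""
    v = rate(a)
    return 0.5 * v @ R @ v - 0.5 * a @ W @ a


def relax(a0, dt=1e-3, steps=5000):
    """Integrate da/dt = rate(a) with classical fourth-order Runge-Kutta steps."""
    path = [np.asarray(a0, dtype=float)]
    for _ in range(steps):
        a = path[-1]
        k1 = rate(a)
        k2 = rate(a + 0.5 * dt * k1)
        k3 = rate(a + 0.5 * dt * k2)
        k4 = rate(a + dt * k3)
        path.append(a + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0)
    return np.array(path)


path = relax([1.0, -0.5])
h_values = np.array([hamiltonian(a) for a in path])
deficits = np.array([entropy_deficit(a) for a in path])
print("max |H_p| along the path:", np.abs(h_values).max())
print("entropy deficit, start and end:", deficits[0], deficits[-1])
```

Under these assumptions the Hamiltonian vanishes to rounding error at every point of the path, while the entropy deficit $S^* - S$ decays monotonically, which is the numerical counterpart of $U$ and $W$ being equal along extremals of an isolated system.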
3.3.4. Euler–Lagrange equations, canonical description and nonlinear generalizations

The thermodynamic Euler–Lagrange equations of the dependent variable approach follow as
\[
(d/dt)(R^c v^c + p^*) = W^c x^c, \qquad (d/dt)(R^d v^d + p^*) = W^d x^d , \tag{3.152}
\]
where the Lagrange multipliers $p^* \equiv k'$ are interphase transfer potentials. One may note that the left-hand sides of these equations are the time derivatives of the DVA transfer potentials (Planck potentials and temperature reciprocals). With definitions (3.137) and the following evaluation of the thermodynamic forces in terms of the entropy Hessian:
\[
X^c \equiv p^c - p^* = \left[ \frac{\partial^2 S^{c*}(n^{c*}, e^{c*})}{\partial(n^{c*}, e^{c*})\,\partial(n^{c*}, e^{c*})} \right] x^c \equiv -C x^c , \tag{3.153}
\]
we may write Eq. (3.152) in a few alternative forms:
\[
dp^c/dt = W^c x^c = (CLC)^c x^c = (CL)^c (p^* - p^c) = K^{Tc}(p^* - p^c) , \tag{3.154}
\]
where the matrix $K^c = L^c C^c$ is the state relaxation matrix. It is also the matrix of products of the transfer coefficients and the transfer areas. In fact, the Euler–Lagrange equations, such as Eq. (3.154), and the phenomenological equations of the HJB theory deal with the common dynamics, although the former describe the time derivatives of the absolute transfer potentials, $dp^c/dt$ and $dp^d/dt$, and the latter those of the state variables, $dx^c/dt$ and $dx^d/dt$. In terms of the original state vector $\tilde{n} = (n, e)$, and for the Gibbs equation defining the nonequilibrium entropy as an additive quantity over the homogeneous subsystems,
\[
dS \equiv dS^c + dS^d = p^c\cdot d\tilde{n}^c + p^d\cdot d\tilde{n}^d , \tag{3.155}
\]
the associated dynamics has the canonical form
\[
d\tilde{n}^c/dt = \partial H_p/\partial p^c, \qquad dp^c/dt = -\partial H_p/\partial \tilde{n}^c . \tag{3.156}
\]
In the case of linear dynamics these equations are governed by the extremum Hamiltonian of Eq. (3.147). The canonical equations describe the relaxation of mole numbers and energy and of their thermodynamic adjoints $p$ (Planck potentials and negative temperature reciprocal) to equilibrium. For the linear relaxation consistent with the quadratic entropy function, Eq. (3.128), we find
\[
d\tilde{n}^c/dt = K^c(\tilde{n}^{c*} - \tilde{n}^c), \qquad dp^c/dt = K^{Tc}(p^* - p^c) , \tag{3.157}
\]
where
\[
K^{Tc}(p^* - p^c) = C^c L^c C^c x^c = C^c K^c x^c . \tag{3.158}
\]
(An analogous dynamics holds for the phase $d$.) While the state dynamics and the adjoint dynamics are different, it follows from Eqs. (3.157), (3.158) and (3.153) that in the linear theory the adjoint dynamics is transformed with the help of Eq. (3.158) into the state dynamics. The equations of motion for the state variables $\tilde{n}$ and their thermodynamic adjoints $p$ mutually complement the canonical set (3.156). This analysis shows that the linear relaxation of state variables is governed by the transfer matrix $K^c = (R^c)^{-1} C^c$ and that of the thermodynamic adjoints by its transpose, $K^{Tc} = C^c (R^c)^{-1}$. Only in a particular frame in which $K$ is symmetric are the relaxations of state variables and adjoints governed by the same matrix $K$.
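
The complementary roles of $K$ and its transpose can be illustrated with a small numerical check. The sketch below assumes the linear closure $p^c - p^* = -Cx^c$ of Eq. (3.153), symmetric $R^c$ and $C^c$, and the definitions $K^c = L^c C^c$ with $L^c = (R^c)^{-1}$; the matrix entries and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical symmetric positive-definite matrices for one subsystem (phase c).
R_c = np.array([[2.0, 0.5], [0.5, 1.0]])   # resistance matrix R^c
C_c = np.array([[3.0, 1.0], [1.0, 2.0]])   # entropy Hessian C^c
L_c = np.linalg.inv(R_c)                    # conductance L^c = (R^c)^{-1}

K = L_c @ C_c      # state relaxation matrix, K^c = L^c C^c
K_T = C_c @ L_c    # adjoint relaxation matrix; equals K^T for symmetric L^c and C^c

x = np.array([0.3, -0.2])   # deviation of the state from equilibrium (arbitrary)
p_dev = -C_c @ x            # p^c - p*, from the linear closure of Eq. (3.153)

dx_dt = -K @ x                        # state dynamics of Eq. (3.157), written for x^c
dp_dt_from_state = -C_c @ dx_dt       # time derivative of p^c - p* = -C^c x^c
dp_dt_from_adjoint = K_T @ (-p_dev)   # adjoint dynamics, K^T (p* - p^c)

eig_state = np.sort(np.linalg.eigvals(K).real)
eig_adjoint = np.sort(np.linalg.eigvals(K_T).real)

print("K symmetric?", np.allclose(K, K.T))
print("dp/dt by both routes:", dp_dt_from_state, dp_dt_from_adjoint)
print("relaxation rates of K and K^T:", eig_state, eig_adjoint)
```

With these illustrative matrices $K$ is not symmetric, yet both routes give the same $dp^c/dt$, and $K$ and $K^T$ share the same relaxation rates (eigenvalues); only their eigenvectors differ, which is why states and adjoints relax along different directions.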
In nonlinear systems with $x$-dependent $R^c$ and $R^d$ in Eq. (3.126), forward DP equations are applied to solve either the forward counterpart of the HJB equation (3.133) (with minus $K_p$) or the related Hamilton–Jacobi equation (3.145). In terms of the optimal cost function, which is just the minimum entropy production, $R^n \equiv s_p^n$, the recurrence equation is
\[
s_p^n(x^n) = \min_{v^n,\,h^n,\,k'^n} \left\{ K_p(x^n, v^n, k'^n)h^n + s_p^{n-1}(x^n - v^n h^n) \right\} , \tag{3.159}
\]
where $h^n = t^n - t^{n-1}$. This refers to Onsager's entropy as the maximum of an initial entropy coordinate in the forward DP. The dissipative Lagrangian $K_p$ at each stage $n$ has the form
\[
K_p(x, v, k') \equiv \tfrac{1}{2} R^c(x): v^c v^c + \tfrac{1}{2} R^d(x^c): v^d v^d + W(x^c, x^d) + k'\cdot(v^c + v^d) . \tag{3.160}
\]
In Eq. (3.159) the Lagrange multipliers $k'$ are extra coordinates of the control vector. To reduce the problem's dimensionality, the conservation constraint is eliminated and the set of independent state variables $a = x^c$ and controls $v^c$ is used. Note that $K_p^n = K_{p*}^n$ whenever $H^n \equiv 0$; thus the potential functions $s_p^n$ and $s_{p*}^n$ generated by Eq. (3.159) or its asterisk counterpart represent the same quantity, which describes the minimum entropy production, $s_p$. This quantity should be subtracted from the final entropy $S^f$ to get the actual entropy of the system [49]. Extension of the present theory to relaxations around nonequilibrium steady states should be the task of a future effort along this line.

3.4. Thermal rays traveling along paths of least resistivity

3.4.1. Resistivity integral and related HJB equation

In a rigid medium with fixed boundary temperatures, the flow of thermal energy in the thermal field can be described in terms of "thermal rays", the paths of heat flow in the direction of the negative temperature gradient. Their deviation from straight lines results from variable thermal conductivity. The original problem was formulated in the energy representation [50]; yet the entropy representation is the most appropriate framework to handle the thermodynamics of energy transfer [51]. While it would be worthwhile to develop an analogy with the index of refraction or the dielectric constant, a comparison of thermal and optical processes made in the cited publications shows that the physical results for thermal rays are essentially different from those for optical rays; in particular, the tangent law of refraction for the former replaces the sine law of refraction for the latter. As we are dealing here with a steady-state heat flow, we ignore the effect of thermal inertia, which may become pronounced in highly unsteady flows [52,53].
To describe time-independent heat flows which satisfy the Laplace equation or its nonlinear generalizations, it is unnecessary to deal with both (real and imaginary) parts of the Kramers–Kronig relations [54]. Yet there are some unexpected analogies too. The thermal rays travel along paths satisfying the principle of minimum entropy production, which at first glance looks different from the well-known Fermat principle of minimum time for optical rays. However, the minimum of entropy production for prescribed flows assures the minimum resistivity of the path. This means the maximum heat flux in a dual problem with a fixed resistivity, which implies that the residence time of heat in the medium is as short as possible. This is very similar to Fermat's principle for the propagation of light.
Our purpose here is to investigate thermal rays by the method of dynamic programming. We use a special coordinate system $(x, y)$ in which the local resistivity of heat flow changes along the axis $x$, the axis $y$ is tangent to a surface of constant specific resistivity $o$, and $u = dy/dx$ is the local direction of heat flow towards the gradient of the temperature reciprocal $T^{-1}$. The shape of thermal rays can be described as a control problem for a minimum of the resistivity integral
\[
(-S) = \int_{t_1}^{t_2} A_0^{-1} o(x)(1 + u^2)\,dx \tag{3.161}
\]
subject to the control $u = dy/dx$. $A_0$ is the constant area of the projection of the heat flux tube cross-sectional area on the surface of constant resistivity. The minimal resistance function of the problem, defined as
\[
R(x^i, y^i, x^f, y^f) \equiv \min \int_{t_1}^{t_2} A_0^{-1} o(x)(1 + u^2)\,dx , \tag{3.162}
\]
satisfies the HJB equation
\[
\partial R/\partial x + \max_u \left\{ (\partial R/\partial y)u - A_0^{-1} o(x)(1 + u^2) \right\} = 0 . \tag{3.163}
\]
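
A minimal numerical sketch of the resistivity integral (3.161) follows. It divides the $x$ axis into stages, restricts $y$ to a grid, and applies a forward dynamic-programming recursion to approximate the minimal resistance function (3.162). The linear resistivity profile, the grid sizes and all function names are illustrative assumptions; the reference value is the discrete counterpart of the closed form stated later in Eq. (3.169).

```python
import numpy as np


def thermal_ray_dp(resistivity, x0, x1, y0, y1, n_stages=40, n_grid=201, area=1.0):
    """Grid-based forward DP for the minimum of sum_n o(x_n) (1 + u_n^2) h / A_0."""
    h = (x1 - x0) / n_stages
    y = np.linspace(y0, y1, n_grid)
    # slope[j, k] is the control u that moves the ray from y[k] to y[j] in one stage
    slope = (y[:, None] - y[None, :]) / h
    cost = np.full(n_grid, np.inf)
    cost[0] = 0.0                          # the ray starts at (x0, y0)
    for n in range(1, n_stages + 1):
        stage = resistivity(x0 + n * h) * (1.0 + slope**2) * h / area
        cost = np.min(stage + cost[None, :], axis=1)
    return cost[-1]                        # minimal resistance at (x1, y1)


def discrete_reference(resistivity, x0, x1, y0, y1, n_stages=40, area=1.0):
    """Exact minimum of the same discrete sum when u_n is not restricted to a grid."""
    h = (x1 - x0) / n_stages
    o = resistivity(x0 + h * np.arange(1, n_stages + 1))
    return (h * o.sum() + (y1 - y0) ** 2 / (h * (1.0 / o).sum())) / area


def straight_line(resistivity, x0, x1, y0, y1, n_stages=40, area=1.0):
    """Resistance of the straight path with constant slope u = (y1 - y0)/(x1 - x0)."""
    h = (x1 - x0) / n_stages
    o = resistivity(x0 + h * np.arange(1, n_stages + 1))
    u = (y1 - y0) / (x1 - x0)
    return h * o.sum() * (1.0 + u**2) / area


def profile(x):
    """Hypothetical resistivity o(x) = 1 + x, chosen only for illustration."""
    return 1.0 + x


r_dp = thermal_ray_dp(profile, 0.0, 1.0, 0.0, 1.0)
r_ref = discrete_reference(profile, 0.0, 1.0, 0.0, 1.0)
r_line = straight_line(profile, 0.0, 1.0, 0.0, 1.0)
print("reference, DP, straight line:", r_ref, r_dp, r_line)
```

Because the grid restricts the admissible slopes, the DP value is an upper bound on the unrestricted discrete minimum, and refining the $y$ grid tightens it. For a nonuniform resistivity the bent ray found by the recursion has a lower resistance than the straight line joining the same end points.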
3.4.2. Bending law and Hamilton–Jacobi equation

Maximizing the Hamiltonian expression in the above HJB equation yields as an optimal control
\[
u = (A_0/2o(x))\,\partial R/\partial y . \tag{3.164}
\]
This optimality condition can be written in the form of the tangent law of bending for a thermal ray,
\[
o(x)\,dy/dx = \tfrac{1}{2} A_0\,\partial R/\partial y \equiv b , \tag{3.165}
\]
where $b$ may be either a positive or a negative constant. The constancy of the partial derivative $\partial R/\partial y$ follows from the explicit independence of the model Lagrangian with respect to $y$. A suitable integral formula for the bending constant in terms of the deviation $y - y^0$ is
\[
b = (y - y^0) \left( \int_{x^0}^{x} o^{-1}(x')\,dx' \right)^{-1} . \tag{3.166}
\]
Expressing the optimal control $u$ in the HJB equation (3.163) in terms of $p = \partial R/\partial y$ yields the Hamilton–Jacobi equation
\[
\partial R/\partial x + A_0^{-1} o(x)\left( \left( (A_0/2o(x))\,\partial R/\partial y \right)^2 - 1 \right) = 0 , \tag{3.167}
\]
where the second term of the left-hand side expression is the optimal Hamiltonian. The solution to this equation can always be reduced to quadratures.

3.4.3. Numerical DP approaches for complicated resistivity functions

However, if the function of specific resistivity $o(x)$ is too complicated, the integrals cannot be evaluated analytically. Hence the role of the discrete approach, which solves numerically Bellman's
recurrence equation of the problem
\[
R^n(y^n, x^n) = \min_{u^n,\,h^n} \left\{ A_0^{-1} o(x^n)(1 + (u^n)^2)h^n + R^{n-1}(y^n - u^n h^n,\ x^n - h^n) \right\} , \tag{3.168}
\]
where $h^n = x^n - x^{n-1}$. Eq. (3.168) has no analytical solution for an arbitrary $o(x^n)$, thus the sequence of functions $R^n$ must be generated numerically. Yet, in the limit of an infinite number of stages, it may be shown in a purely analytical way that the potential function satisfying Eq. (3.168) takes the limiting form
\[
R(x, y) = \int_{x^0}^{x} A_0^{-1} o(x')\,dx' + A_0^{-1}(y - y^0)^2 \left( \int_{x^0}^{x} o^{-1}(x')\,dx' \right)^{-1} . \tag{3.169}
\]
The above function satisfies both the HJB equation (3.163) and the Hamilton–Jacobi equation (3.167). The numerical solution to Eq. (3.168) for a finite number of stages $n$ represents the finite-stage generalization of the solution (3.169); this numerical solution automatically accomplishes the numerical integration required in Eq. (3.169).

3.5. Fermat's principle for propagating diffusion–reaction fronts

3.5.1. Reaction–diffusion mechanism for chemical wave motion

In this section we consider the propagation of concentration disturbances in the form of (bio)chemical waves satisfying Fermat's principle of minimum time. The dynamic programming approach leads to the Hamilton–Jacobi–Bellman equation and its characteristic set for chemical waves. All these equations describe the link between the constrained motions of wave fronts and the associated "rays" or extremal trajectories. Usually important are some "geodesic" constraints due to an obstacle influencing the state changes, and the entering (leaving) conditions of a ray, such as the tangentiality condition for rays that begin to slide over the boundary of an obstacle. A suitable analysis can determine complex shapes of rays in terms of their deviations from straight lines in inhomogeneous media.

Traveling chemical disturbances have been discovered by experiments in reaction–diffusion systems in fluids [55–59], solids [60–64] and porous systems [65–67]. Effects such as excitability of chemical media, the dispersion relation, and the curvature effect have been shown for chemical waves [68]. Of all the wave motions known, chemical waves were the last to be discovered, and their properties are relatively unknown or insufficiently understood. Autocatalytic chemical systems were shown to be responsible for the wave propagation, and autocatalytic models were applied to provide expressions for the wave propagation speed [69,70]. Excitability properties were recognized to be responsible for wave behavior [56,68,71–75].
An example of a general model implying dissipative oscillations is the Brusselator [76,77]. However, it is not capable of modeling the phenomenon of bistability, which frequently causes oscillations in real processes, as, for example, in the Belousov–Zhabotinsky reaction (B–Z reaction). Schlögl's model [76] incorporates a simple bistable mechanism as an inherent property of a nonequilibrium steady state and leads to one of the basic catastrophes (cusp [78]) in nonequilibrium states.
The mechanism of the B–Z reaction is complicated and not well explained to date, especially with regard to the role and form of intermediates and side reactions. Amongst various explanatory concepts, the best known and most widely accepted is the so-called Field–Körös–Noyes (FKN) mechanism, which predicts periodic switching of the system (caused by concentration changes of bromide ions, Br$^-$) between the two pseudostationary states [68,79]. The FKN mechanism also explains the creation of space–time dissipative structures, kinematic waves and Winfree's trigger waves in the B–Z reaction [37]. The FKN mechanism can be reduced to a simpler model form called the Oregonator, introduced by Field and Noyes [68,79]. While the Oregonator does not predict some experimental effects (e.g. chaos), its invention substantially advanced the modeling of oscillatory reactions. Its flow extension, in particular, describes in a lucid way the creation of wave fronts and the link between bistability and oscillations [79]. A modified Oregonator model, the so-called SNB model, served to simulate complicated limit cycles and chaos in flow systems [80]. Yet this model did not describe chaotic phenomena well enough [79]. However, coupling of the Oregonator with an oscillator based on dynamic changes of the bromomalonic acid concentration has led to a model called the Montanator [79,81,82]. That model can simulate the increase of complexity of oscillations through period doubling and subsequent chaos. Also, models were constructed [83,84] that both meet the requirements of thermodynamics and exhibit chaos. According to Ref. [85] there is no general rule that the entropy production is smaller in an oscillating mode than it would be if the system stayed in one of the available steady states. As stated in the book [86], a stationary state is not always favorable in terms of minimizing entropy production, nor is an oscillating mode.
In other words, a general minimum entropy production principle for time averages over periodic oscillations cannot be valid. In a further step, similar investigations were performed for chemical oscillators with period doubling, bifurcations and chaos, to find the behavior of the average entropy production in periodic and chaotic models as well as in coexisting steady states [85]. These results also imply the restrictive nature of the theorem of minimum entropy production. However, when the thermodynamic formalism of chemical reactions applies a nonlinear chemical resistance consistent with the mass action law, and the entropy production functional is properly expressed as the sum of two dissipation functions, an integral theorem of minimum entropy production is satisfied which can describe even unstable systems [87]. This suggests attempts to reconcile the above statements in a form that preserves the implications of the variational formulation. A resolution of the controversy is that the common criterion of minimum generated entropy can act locally in all cases, although the related global-extremum statements can be invalid far from equilibrium.

Spiral waves have recently drawn considerable attention because of analogous phenomena in biology [65,66]. A number of dynamical properties observed in experiments have been explained in terms of interactions of the elementary wave properties with the geometry of the chemical system. It was Winfree [57,58] who first suggested that the shape of the spiral waves should be the involute of a small circle, the 'core' of the spiral wave. An involute of a given curve $C$ is a curve $C^*$ which lies on the tangent surface of $C$ (the surface generated by the tangent lines to $C$) and intersects the tangent lines orthogonally. (See Refs. [65,66] and books on differential geometry, e.g. Ref. [88].)
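The definition of the involute quoted above can be checked numerically for the case of a circle. The sketch below is only an illustration: it assumes the standard parametrization of the involute of a circle of radius `r`, and the function names are hypothetical, not taken from the cited references. It verifies that each involute point lies on a tangent line of the circle and that the involute crosses that tangent line at a right angle.

```python
import math


def circle_point(r, t):
    """Point of the core circle of radius r at parameter t."""
    return (r * math.cos(t), r * math.sin(t))


def circle_tangent(t):
    """Unit tangent vector of the circle at parameter t."""
    return (-math.sin(t), math.cos(t))


def involute_point(r, t):
    """Standard parametrization of the involute of a circle (assumed form)."""
    return (r * (math.cos(t) + t * math.sin(t)),
            r * (math.sin(t) - t * math.cos(t)))


def tangent_line_defect(r, t):
    """Cross product of (involute point - circle point) with the tangent.

    It vanishes when the involute point lies on the tangent line.
    """
    px, py = involute_point(r, t)
    cx, cy = circle_point(r, t)
    tx, ty = circle_tangent(t)
    return (px - cx) * ty - (py - cy) * tx


def orthogonality_defect(r, t, dt=1e-6):
    """Cosine of the angle between the involute direction and the tangent.

    The involute direction is estimated by a central difference; the value
    is close to zero when the two directions are orthogonal.
    """
    x1, y1 = involute_point(r, t - dt)
    x2, y2 = involute_point(r, t + dt)
    vx, vy = x2 - x1, y2 - y1
    tx, ty = circle_tangent(t)
    return (vx * tx + vy * ty) / math.hypot(vx, vy)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 4.0):
        print(t, tangent_line_defect(1.5, t), orthogonality_defect(1.5, t))
```

Both defects stay at the level of rounding error for any positive parameter value, which is the geometric content of the definition for this particular curve.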
Initially, experiments on concentric and spiral-shaped waves were carried out in thin layers in closed systems, where the reagents were gradually consumed, so it was impossible to maintain constant conditions for a longer time. The spatial patterns changed and ultimately decayed as the mixture evolved towards thermodynamic equilibrium. The real breakthrough
was when real open systems and the so-called continuously fed unstirred reactors (CFUR) appeared, which made it possible to study chemical waves under steady conditions. Noszticzius and coworkers [89] developed the first CFUR with a ring geometry using acrylamide gel and created chemical pinwheels in that reactor; soon other CFURs were designed. A chemical pinwheel system consists of a circular strip of gel which separates two concentric CSTRs. The wave motion can be understood as an interaction between diffusion and kinetics [73,90,91]. Chemical components diffuse into the gel, react and create a medium with excitable properties. The concentrations of species in the outer and inner reservoirs usually differ, thus resulting in radial concentration gradients and inhomogeneous properties of the excitable medium. The simplest structures observed in early experiments, wave trains moving with a constant velocity and the same shape, were replaced by more complex spatio-temporal structures after the resting state of the system became inhomogeneous. These phenomena have been ingeniously modelled; for example, in Ref. [92] the researchers reduced their studies to a single wave which moves in a chemical excitable system with periodic boundary conditions. Diffusion drives the chemical species out of the front of the wave towards the area of lower concentration; the concentration of autocatalyst builds up, and after it crosses a threshold limit, kinetics takes over. In effect, the concentration increases strongly due to autocatalytic reactions, thus creating the front of the wave. In inhomogeneous and (possibly) anisotropic media the description of the propagation of the chemical wave is a very difficult task due to the constraints on the boundary coordinates caused by the presence of obstacles. Our analysis of this motion [93] treats the chemical system as an optimal control system which is analogous to a burning prairie.
Due to (nonrelativistic) causality, the real path of ignition is the one along which the fire first arrives at a point, since at all subsequent instants the grass at that point will already be burned out. In other words, the activation at a point is elicited by the first possible impulse; later impulses have no effect at this point. This means that the system satisfies the Fermat principle of least time for the propagation of the disturbance [94]. The approach based on the Fermat principle embeds the chemical problem into dynamical system theory. The usual formulation of Fermat's principle states that the path of a ray between two points results from the vanishing of the first variation of the transition time. This is the simplest approach to chemical fronts, which can be contrasted with a second, underlying approach based on reaction–diffusion equations. The simplest approach requires that the propagation speed $v$ be given as a known function of the coordinates and directions; when this condition is satisfied, the simplest approach can be used in its natural form, which deals with ordinary differential equations. For a lumped system, the speed $v$ can be obtained from the second approach, which applies an autocatalytic model of reaction and diffusion described by partial differential equations of the type

$$\partial_t u = Lu + N(u) + D\nabla^2 u , \tag{3.170}$$

see Refs. [69,70,72,91,95]. This result incorporates the classical mass action kinetics [96,97]. In Eq. (3.170) $u$ is the set of fields, $L$ and $N$ are, respectively, the linear and nonlinear reaction dynamics, and $D$ is the transport (diffusion) matrix. The analysis proceeds by assuming that a constant wave profile emerges and propagates with a constant speed. Rectilinear, solitary wavefronts that propagate with a constant speed $v_0$ are one-dimensional solutions, $u(x, t) = w(x - c_0 t)$, of the ordinary differential equation

$$Lw + N(w) + Dw'' + c_0 w' = 0 , \tag{3.171}$$
obtained from Eq. (3.170) for $u = w(x - c_0 t)$, where the prime refers to differentiation with respect to the traveling wave coordinate $r = x - c_0 t$, and the rest state corresponds to $u = 0$. The solution to Eq. (3.171) represents a shape traveling with a speed $c_0$; diverse models yield a constant speed $c_0$ or a state dependent $c$ [90,91]. The physical propagation speed $c = dl/dt$ usually depends on both the diffusion coefficient of the autocatalytic species and the rate constants of the autocatalytic reactions; it is a function of the rest state. To apply the propagation speed in the framework of the minimum time approach, state coordinates must be assigned to each point of the physical space where the wave motion occurs.

3.5.2. Chemical Fermat principle and related HJB theory

The Fermat principle approach is associated with perfect-medium models of wave motion, as a train of chemical waves can be made to travel unidirectionally around a ring almost indefinitely, somewhat like an induced electric current around a superconducting coil [98,99]. From the thermodynamic viewpoint, this sustainability of motion is a behavior that can be attributed to open systems only, which assure sustained diffusion–reaction couplings. The extremals of the related variational problems may be regarded as curves of lowest resistance and shortest transition time of the chemical wave within the medium. A more general form of this concept is associated with the minimum entropy production and the associated maximum rate of the wave front, thus linking the wave motion with Pontryagin's principle of optimal control [16,17]. When a function $c$ describing the propagation speed is known, an HJB equation can be formulated. Consider, for example, the case of constrained rays that can slide over the boundary of a spherical obstacle. A general constraint-adjoining Lagrangian for the minimum time problem can be written in the form

$$\Lambda \equiv dx^0/dt = 1 + \mu(x^2 + y^2 + z^2 - R^2) + \varphi\,[\dot{x}^2 + \dot{y}^2 + \dot{z}^2 - c^2(x, y, z, \dot{y}/\dot{x}, \dot{z}/\dot{x})] \tag{3.172}$$
[93]. The symbols with dots in Eq. (3.172) refer to the dependent derivatives of the geometrical coordinates $(x, y, z)$ with respect to the time $t$. The symbol $x^0$ designates the time coordinate which equals numerically the initial time $t^0$. The coordinates $(x, y, z)$ are the state variables; the parameters $\mu$ and $\varphi$ are Lagrange multipliers adjoining the trajectory constraint and the rate definition constraint. Here the dependent time derivatives $\dot{x}$, $\dot{y}$ and $\dot{z}$ are regarded as constrained controls that satisfy the usual state equations defining rates as derivatives of state variables and the local constraint

$$\dot{x}^2 + \dot{y}^2 + \dot{z}^2 - c^2(x, y, z, \dot{y}/\dot{x}, \dot{z}/\dot{x}) = 0 . \tag{3.173}$$
In this case there are three dependent controls and their multiplier $\varphi$ is nonvanishing. However, when two independent controls $u = dy/dx$ and $v = dz/dx$ are applied to the problem in which the time $t$ is the independent variable, the dependent state equations

$$dx/dt = c(\mathbf{x}, u, v)/\sqrt{1 + u^2 + v^2} , \tag{3.174}$$

$$dy/dt = c(\mathbf{x}, u, v)\,u/\sqrt{1 + u^2 + v^2} , \tag{3.175}$$

$$dz/dt = c(\mathbf{x}, u, v)\,v/\sqrt{1 + u^2 + v^2} \tag{3.176}$$
satisfy constraint (3.173) identically. This is equivalent to eliminating the wave speed constraint, Eq. (3.173), from the optimization model. In this setting the Lagrange multiplier $\varphi$ equals zero, and the general Lagrangian (3.172) becomes a $\varphi$-independent function

$$\Lambda_1 = 1 + \mu(x^2 + y^2 + z^2 - R^2) . \tag{3.177}$$

In the more general case of a nonspherical obstacle whose equation is $\phi(\mathbf{x}) = 0$, the original Lagrangian takes the $\varphi$-independent form $\Lambda_1 = 1 + \mu\phi(\mathbf{x})$, which contains only the operative constraint $\phi(\mathbf{x})$. Yet it should be stressed that the role of that constraint is restricted to rays attached to the boundary; all others, which are free, are governed by the Lagrangian $\Lambda_1 = 1$. The forward HJB equation of the problem deals with the dual formulation in which the maximum of an initial time $S \equiv t^0$ is considered as a function of the final coordinates $\mathbf{x}$ and final time $t$. That HJB equation expresses a condition for the local maximum of the total derivative $dS/dt$ along an extremal. In terms of the function $T = t - t^0$, which represents the shortest transition time of the chemical ray in the medium, the HJB equation is
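$$\max_{u,v,\mu}\left\{ \frac{\partial T}{\partial x}\,\frac{c(\mathbf{x}, u, v)}{\sqrt{1 + u^2 + v^2}} + \frac{\partial T}{\partial y}\,\frac{c(\mathbf{x}, u, v)\,u}{\sqrt{1 + u^2 + v^2}} + \frac{\partial T}{\partial z}\,\frac{c(\mathbf{x}, u, v)\,v}{\sqrt{1 + u^2 + v^2}} - \Lambda_1(\mathbf{x}, \mu) \right\} \equiv \max_{u,v,\mu}\left\{\left(\frac{dS}{dt}\right)\right\} = 0 , \tag{3.178}$$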
where $\Lambda_1 = 1 + \mu\phi(\mathbf{x})$. Again, free rays are governed by the Lagrangian $\Lambda_1 = 1$. The constraint $\phi(x, y, z) = 0$ is operative when the ray slides over the surface of a boundary; this is the only case when the multiplier $\mu$ is nonvanishing. Note that the multipliers of the derivatives $\partial T/\partial x$, $\partial T/\partial y$ and $\partial T/\partial z$ in Eq. (3.178) represent the rates $dx/dt$, $dy/dt$ and $dz/dt$ that satisfy identically the wave speed constraint (3.173).

3.5.3. Dynamic programming model for chemical waves

When the equation describing the speed function is complicated, an analytical solution cannot be found. Yet a numerical solution for the function $T(x, y, z)$ is still possible; it can be obtained with the aid of Bellman's equation for the minimum time $\sum\theta^n$

$$T^n(x, y, z) = \min_{u^n, v^n, \theta^n, \mu^n}\left\{ (1 + \mu^n\phi(x, y, z))\,\theta^n + T^{n-1}\!\left( x - \frac{c(x, y, z, u^n, v^n)\,\theta^n}{\sqrt{1 + (u^n)^2 + (v^n)^2}},\; y - \frac{c(x, y, z, u^n, v^n)\,u^n\theta^n}{\sqrt{1 + (u^n)^2 + (v^n)^2}},\; z - \frac{c(x, y, z, u^n, v^n)\,v^n\theta^n}{\sqrt{1 + (u^n)^2 + (v^n)^2}} \right) \right\} . \tag{3.179}$$
In the simplest possible case of a homogeneous medium, within its interior where the constraint $\phi$ is inoperative and $\mu = 0$, the solution describes straight rays. Therefore, in a complex chemical medium, the quantity $T$, which describes the shortest transition time, is a generalization of the simplest transition function of a homogeneous and isotropic medium in which the wave moves with the constant speed $c_0$

$$T(x, y, z, x^0, y^0, z^0) = c_0^{-1}\sqrt{(x - x^0)^2 + (y - y^0)^2 + (z - z^0)^2} . \tag{3.180}$$
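As a consistency check, one can verify that the straight-ray transition time (3.180) satisfies the HJB equation (3.178) for free rays. The short derivation below is a sketch under the assumptions $\Lambda_1 = 1$, $c = c_0 = \mathrm{const}$ and $\partial T/\partial x > 0$ (so that the parametrization by $u = dy/dx$, $v = dz/dx$ can reach the optimal direction); the shorthand $\rho$ for the distance from the source is introduced here only for convenience.

```latex
% Free ray in a homogeneous, isotropic medium: \Lambda_1 = 1, c = c_0.
% Eq. (3.178) then reads
\max_{u,v}\; c_0\,
  \frac{\partial_x T + u\,\partial_y T + v\,\partial_z T}{\sqrt{1+u^2+v^2}} = 1 .
% By the Cauchy-Schwarz inequality the fraction is bounded by |\nabla T|,
% with equality when (1,u,v) is parallel to \nabla T, so that
c_0\,|\nabla T| = 1 .
% For Eq. (3.180), with \rho = \sqrt{(x-x^0)^2+(y-y^0)^2+(z-z^0)^2},
\nabla T = \frac{(x-x^0,\; y-y^0,\; z-z^0)}{c_0\,\rho},
\qquad |\nabla T| = \frac{1}{c_0},
% hence c_0 |\nabla T| = 1 holds identically away from the source point.
```

In this limit the HJB equation therefore reduces to an eikonal-type equation whose solution is the distance from the source divided by $c_0$.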
Eq. (3.179) allows for numerical generation of the function $T(x, y, z)$ for the case of constrained wave motions in confined regions and in complex media. Experiments confirming the behavior of chemical fronts predicted by the theory are available [65–67,99].
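A hedged sketch of how a first-arrival-time field such as $T$ might be generated numerically is given below. It does not implement the recursion (3.179) itself; instead it swaps in Dijkstra's shortest-path algorithm on a regular two-dimensional grid, which is a dynamic-programming scheme built on the same idea that only the first impulse reaching a point matters. The function name `first_arrival_times`, the 8-neighbour stencil, the grid spacing `h` and the convention that a non-positive speed marks an obstacle cell are all assumptions made for illustration.

```python
import heapq
import math


def first_arrival_times(speed, source, h=1.0):
    """Earliest arrival time at every grid cell (Dijkstra on an 8-neighbour grid).

    speed  : 2D list of local propagation speeds; values <= 0 mark obstacles
    source : (row, col) of the ignition point (must not be an obstacle)
    h      : grid spacing (illustrative assumption)
    """
    rows, cols = len(speed), len(speed[0])
    times = [[math.inf] * cols for _ in range(rows)]
    si, sj = source
    times[si][sj] = 0.0
    heap = [(0.0, si, sj)]
    moves = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]
    while heap:
        t, i, j = heapq.heappop(heap)
        if t > times[i][j]:
            continue  # stale entry: this cell was already reached earlier
        for di, dj in moves:
            ni, nj = i + di, j + dj
            if not (0 <= ni < rows and 0 <= nj < cols):
                continue
            if speed[ni][nj] <= 0.0:
                continue  # obstacle cell: the front cannot enter
            step = h * math.hypot(di, dj)
            mean_speed = 0.5 * (speed[i][j] + speed[ni][nj])
            t_new = t + step / mean_speed
            if t_new < times[ni][nj]:
                times[ni][nj] = t_new
                heapq.heappush(heap, (t_new, ni, nj))
    return times


if __name__ == "__main__":
    c0 = 2.0
    grid = [[c0] * 11 for _ in range(11)]
    T = first_arrival_times(grid, (0, 0))
    print(T[0][10], T[10][10])
```

On a homogeneous grid the arrival times along the axes and the diagonals reproduce distance divided by $c_0$, in line with Eq. (3.180); in other directions the restricted stencil overestimates the time, which is a limitation of this simplified scheme and not of the theory.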
4. Concluding remarks

4.1. Advantages stemming from Hamiltonian descriptions

In this treatment we have shown the benefits that follow from the use of dynamic programming, a stage criterion and a discrete Hamiltonian of Pontryagin's structure. The main technical virtue of these approaches is that they are capable of providing numerical solutions for nonlinear formulations in which diverse coefficients are state dependent and analytical solutions cannot be found. However, there are also conceptual advantages. They refer to the invariance of the results of discrete optimization when the independent variable is changed, e.g. when a state variable $x_j$ is used in place of the time $t$. Unchanged, objective data are then produced which depend only on the number of discrete stages and are not influenced by the choice of $x_j$. In particular, optimal time intervals remain the same regardless of whether $t^n$ is an independent or a dependent variable. This objectivity is lost when the rule of discretizing rests on a subjective basis, as in the standard numerical approaches. According to a basic rule in optimization, which assures that the addition of a constraint to an optimization model can only worsen the optimal value of its performance criterion, relaxation of the constraints on the intervals $\theta^n$ can only improve the approximation of a continuous process by its discrete counterpart. Thus, in order to assure a required accuracy of the approximation of the continuous system by a cascade, a lower number of stages is required when using the constant-$H$ algorithm in place of constant-$\theta$ algorithms. This conclusion was confirmed by computations. By applying the discrete algorithm with a constant Hamiltonian, processes which are discrete by nature, i.e. those with a finite number of finite stages, can be optimized. Yet the algorithm is also capable of yielding (numerically or analytically) discrete approximations to solutions of continuous processes.
The discrete solution converges to the continuous solution in the limit of an infinite $N$. Due to the strong formal similarity of our discrete algorithm to the continuous algorithm of Pontryagin, the full power and diversity of the computational tools that were originally designed for Pontryagin's algorithm can be exploited. An extra insight is gained when studying the discrete systems considered. The class of discrete processes which is linear with respect to a special decision $\theta$ is, of course, narrower than the class of all possible discrete processes. But precisely because of this, the considered class has properties which are not typical of all discrete processes. In particular, a Lagrangian approach can be constructed for this class; in this approach an energy function $E = H$ is a discrete integral of motion. A sort of discrete mechanics is valid, with discrete Euler–Lagrange equations closely resembling those known for continuous systems. A special equation characteristic of the theory, such as the discrete Hamilton–Jacobi equation or the constant-$H$ condition, links the discrete $H$ with the Lagrange multiplier of the time constraint, which makes a suitable dimensionality reduction possible.
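The convergence of a discrete dynamic-programming solution to its continuous limit as $N$ grows can be illustrated on a toy problem. The sketch below is an assumption-laden example that is not taken from this paper: it minimizes $\int_0^1 (x^2 + u^2)\,dt$ subject to $dx/dt = -x + u$, uses equal intervals $\theta = 1/N$ (so it is a constant-$\theta$ scheme, not the constant-$H$ algorithm), and relies on the fact that the optimal cost is $p\,x^2$ with $p$ obeying a scalar Riccati recursion. The function names are hypothetical.

```python
import math


def discrete_value_coefficient(n_stages, horizon=1.0):
    """Backward dynamic programming for the N-stage toy problem.

    Stage model (assumed): x_next = x + theta * (-x + u),
    stage cost theta * (x**2 + u**2), optimal cost-to-go p * x**2.
    Minimizing over u in closed form gives the recursion used below.
    """
    theta = horizon / n_stages
    a = 1.0 - theta
    p = 0.0  # no terminal cost
    for _ in range(n_stages):
        p = theta + p * a * a / (1.0 + p * theta)
    return p


def continuous_value_coefficient(horizon=1.0):
    """Closed-form solution of dP/dtau = 1 - 2P - P**2 with P(0) = 0."""
    root2 = math.sqrt(2.0)
    return root2 * math.tanh(root2 * horizon + math.atanh(1.0 / root2)) - 1.0


if __name__ == "__main__":
    exact = continuous_value_coefficient()
    for n in (10, 100, 1000):
        print(n, abs(discrete_value_coefficient(n) - exact))
```

In this example the gap between the discrete and continuous optimal costs shrinks roughly in proportion to $1/N$; a scheme with free intervals would be expected to do at least as well for the same $N$, in line with the relaxation argument above, although this sketch does not test that claim.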
However, the limitations of the basic theory should also be remembered. The proof of the strong maximum principle for the discrete $H^n$ requires convexity assumptions for rate functions and constraining sets, i.e. much stronger conditions than those in Pontryagin's theorem. An enhanced weak theorem, in which the discrete Hamiltonian attains a local maximum, is usually applied; its basic content is that the necessary conditions for the maximum of a discrete functional $S$ are collections of those for the related Hamiltonians $H^n$ [19]. Linearity with respect to $\theta^n$ is sufficient for the existence of the Hamiltonian structure but it is not necessary; thus models may exist which are nonlinear in $\theta^n$ and still satisfy the basic theory. See, for example, Eqs. (2.59′) and (2.108) to confirm this conclusion. On the other hand, there are models nonlinear in $\theta^n$ which do not admit this theory. For example, in some models of heat exchanger transfer area all controls appear nonlinearly, and the constant-$H$ algorithm cannot be used [11]. In such cases the generalization of the basic theory, described below, should be applied.

4.2. Outline of generalized theory for arbitrary discrete processes

Here we briefly show how the basic theory of Section 2.2 can be generalized so that it can handle constrained intervals of time and arbitrary state transformations in which there is originally no control which appears linearly in the optimization model. When optimizing general discrete processes we deal most commonly with state transformations consistent with the backward algorithm of dynamic programming. In this case the original constraints are in the form of $s+1$ transformations

$$x_k^n = T_k^n(\mathbf{x}^{n-1}, \mathbf{u}^n) , \tag{4.1}$$

where $k = 0, 1, \ldots, s, s+1$, the 0th state variable is the profit coordinate and the $(s+1)$th state variable is accepted as a time-like coordinate. As long as the coordinate $x_{s+1}$ is the usual time $t$, the state vector $\mathbf{x}$ used here includes the space–time vector $\tilde{\mathbf{x}}$ of Section 2.2.
Assuming that changes of the profit coordinate $x_0$ on the right-hand sides of transformations (4.1) can only be additive, i.e. that $T_0^n$ has the structure $x_0^{n-1} + g_0(\tilde{\mathbf{x}}^n, \mathbf{u}^n)$ and the remaining $T_k^n$ do not contain $x_0^{n-1}$, we can deal with the optimal functions
(4.2)
(4.3)
(4.4)
Now, we introduce the interval of the time-like variable either as $\theta^n \equiv t^n - t^{n-1}$ (whenever $x_{s+1} \equiv t$) or as the quantity defined by the equation

$$x_{s+1}^n - x_{s+1}^{n-1} = \theta^n . \tag{4.5}$$

Note that the function $g_{s+1}$ in Eq. (4.4) constrains the intervals of time; they must satisfy Eq. (4.4) for $k = s+1$. We also define functions of relative rates

$$f_k(\mathbf{x}^n, \mathbf{u}^n) \equiv g_k(\mathbf{x}^n, \mathbf{u}^n)/g_{s+1}(\mathbf{x}^n, \mathbf{u}^n) . \tag{4.6}$$

Clearly, $f_{s+1} \equiv 1$. Applying these definitions in Eq. (4.4), we recover the basic difference model of Section 2.2 in the form of the state equations

$$x_k^n - x_k^{n-1} = f_k^n(\mathbf{x}^n, \mathbf{u}^n)\,\theta^n \tag{4.7}$$

($k = 0, 1, \ldots, s+1$ and $f_{s+1} \equiv 1$), where the controls $(\mathbf{u}^n, \theta^n)$ should satisfy the standard constraint $\mathbf{u}^n \in U$ and one extra constraint

$$\theta^n - g_{s+1}(\mathbf{x}^n, \mathbf{u}^n) = 0 , \tag{4.8}$$

which limits the time intervals. Later we consider a generalization of Eq. (4.8) in which the inequality constraint $\theta^n - g_{s+1} \le 0$ replaces the original equation (4.8). In the present notation, the discrete Bolza criterion $S^N$, Eq. (2.51) in Section 2.2, has the form

$$S^N \equiv \sum_{n=1}^{N} f_0^n(\mathbf{x}^n, \mathbf{u}^n)\,\theta^n + G(\mathbf{x}^N) - G(\mathbf{x}^0) . \tag{4.9}$$

This quantity should attain a maximum subject to the above set of constraints. As the intensity of profit generation is independent of the profit coordinate $x_0$, the latter does not appear in Eqs. (4.6), (4.8) and (4.9). The presence of constraint (4.8) changes the form of the effective profit intensity which has to be applied in the stage criterion when deriving a canonical set from this criterion. A modified profit function $f_{0\lambda}^n$, which adjoins constraint (4.8) to the original function $f_0^n$ by a Lagrange multiplier $\lambda$, has to be used in the stage criterion of the type of Eq. (2.76). In the present case the stage criterion has the form

$$\max_{\mathbf{u}^n, \theta^n, \mathbf{x}^n}\left\{ f_0^n(\mathbf{x}^n, \mathbf{u}^n)\,\theta^n + \lambda^n\big(\theta^n - g_{s+1}(\mathbf{x}^n, \mathbf{u}^n)\big) - \big(P^n(\mathbf{x}^n) - P^{n-1}(\mathbf{x}^n - \mathbf{f}^n(\mathbf{x}^n, \mathbf{u}^n)\,\theta^n)\big) \right\} = 0 . \tag{4.10}$$

Dealing with this criterion we define state adjoints in a manner consistent with Eq. (2.85)

$$z_k^{n-1} \equiv -\partial P^{n-1}(\mathbf{x}^{n-1})/\partial x_k^{n-1} \tag{4.11}$$

($k = 1, \ldots, s, s+1$).
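We also preserve the original definition of the enlarged Hamiltonian including the time adjoint

$$H^{n-1}(\mathbf{x}^n, \mathbf{z}^{n-1}, \mathbf{u}^n) \equiv f_0^n(\mathbf{x}^n, \mathbf{u}^n) + \sum_{k=1}^{s+1} z_k^{n-1} f_k^n(\mathbf{x}^n, \mathbf{u}^n) . \tag{4.12}$$

From Eq. (4.10), the necessary condition for the maximum of $S^N$ with respect to the intervals $\theta^n$ is

$$\big(H^{n-1}(\mathbf{x}^n, \mathbf{z}^{n-1}, \mathbf{u}^n) + \lambda^n\big)\,\delta\theta^n \le 0 \tag{4.13}$$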
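and that with respect to the controls $\mathbf{u}^n$ is

$$\theta^n\big(\partial H^{n-1}/\partial \mathbf{u}^n - \lambda^n\,\partial \ln g_{s+1}^n/\partial \mathbf{u}^n\big)\cdot\delta\mathbf{u}^n \le 0 . \tag{4.14}$$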
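The step from the stage criterion (4.10) to conditions (4.13) and (4.14) can be spelled out as follows. This is a sketch of the intermediate algebra, assuming that the adjoints $z_k^{n-1}$ of Eq. (4.11) are evaluated at $\mathbf{x}^{n-1} = \mathbf{x}^n - \mathbf{f}^n\theta^n$ and that constraint (4.8) is active; the symbol $\mathcal{A}^n$ for the braced expression in Eq. (4.10) is introduced here only as shorthand.

```latex
% Shorthand for the braced expression of Eq. (4.10):
\mathcal{A}^n \equiv f_0^n\theta^n + \lambda^n(\theta^n - g_{s+1}^n)
   - P^n(\mathbf{x}^n) + P^{n-1}(\mathbf{x}^n - \mathbf{f}^n\theta^n).
% Variation with respect to the interval \theta^n, using Eqs. (4.11), (4.12):
\frac{\partial\mathcal{A}^n}{\partial\theta^n}
  = f_0^n + \lambda^n
    - \sum_{k=1}^{s+1}\frac{\partial P^{n-1}}{\partial x_k^{n-1}}\, f_k^n
  = f_0^n + \sum_{k=1}^{s+1} z_k^{n-1} f_k^n + \lambda^n
  = H^{n-1} + \lambda^n .
% Variation with respect to the controls u^n:
\frac{\partial\mathcal{A}^n}{\partial\mathbf{u}^n}
  = \theta^n\frac{\partial f_0^n}{\partial\mathbf{u}^n}
    + \theta^n\sum_{k=1}^{s+1} z_k^{n-1}\frac{\partial f_k^n}{\partial\mathbf{u}^n}
    - \lambda^n\frac{\partial g_{s+1}^n}{\partial\mathbf{u}^n}
  = \theta^n\frac{\partial H^{n-1}}{\partial\mathbf{u}^n}
    - \lambda^n\frac{\partial g_{s+1}^n}{\partial\mathbf{u}^n} .
% On the active constraint \theta^n = g_{s+1}^n one has
% \partial g_{s+1}^n/\partial u^n = \theta^n\,\partial\ln g_{s+1}^n/\partial u^n, hence
\frac{\partial\mathcal{A}^n}{\partial\mathbf{u}^n}
  = \theta^n\left(\frac{\partial H^{n-1}}{\partial\mathbf{u}^n}
    - \lambda^n\frac{\partial\ln g_{s+1}^n}{\partial\mathbf{u}^n}\right).
```

Requiring that no admissible variation $\delta\theta^n$ or $\delta\mathbf{u}^n$ increases $\mathcal{A}^n$ then gives the inequalities (4.13) and (4.14).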
Eq. (4.13) implies that for stationary and positive intervals $\theta^n$, Eq. (4.5), and active constraint (4.8) the enlarged Hamiltonian function is not constant but satisfies the condition

$$H^{n-1}(\mathbf{x}^n, \mathbf{z}^{n-1}, \mathbf{u}^n) + \lambda^n = 0 . \tag{4.15}$$

Eq. (4.14) shows that interior optimal controls $\mathbf{u}^n$ satisfy the condition

$$\partial H^{n-1}/\partial \mathbf{u}^n - \lambda^n\,\partial \ln g_{s+1}^n/\partial \mathbf{u}^n = 0 \tag{4.16}$$
which becomes the local maximum condition for $H^{n-1}$ only in the case of unconstrained $\theta^n$. Eq. (4.15) shows that the negative value of the Lagrange multiplier for the active local constraint (4.8) equals the value of the enlarged Hamiltonian function. Now we perform the variation of state and time coordinates in the stage criterion (4.10). Splitting the effect of space and time and using condition (4.15) we get a 'quasicanonical' set of discrete optimality conditions

$$(x_k^n - x_k^{n-1})/\theta^n = \partial H^{n-1}/\partial z_k^{n-1} , \tag{4.17}$$

$$(x_t^n - x_t^{n-1})/\theta^n = \partial H^{n-1}/\partial z_t^{n-1} = f_{s+1} = 1 , \tag{4.18}$$

$$(z_k^n - z_k^{n-1})/\theta^n = -\partial H^{n-1}/\partial x_k^n - H^{n-1}\,\partial \ln g_{s+1}^n/\partial x_k^n , \tag{4.19}$$

$$(z_t^n - z_t^{n-1})/\theta^n = -\partial H^{n-1}/\partial t^n - H^{n-1}\,\partial \ln g_{s+1}^n/\partial t^n , \tag{4.20}$$

$$\big(\partial H^{n-1}/\partial \mathbf{u}^n + H^{n-1}\,\partial \ln g_{s+1}^n/\partial \mathbf{u}^n\big)\cdot\delta\mathbf{u}^n \le 0 , \tag{4.21}$$
where $k = 1, \ldots, s$. The above set is purposely written in a form suitable to comprise both the present theory of constrained $\theta^n$ and that of free $\theta^n$ described in Section 2.2. For the former, the identification $\theta^n = g_{s+1}^n$ should be made on the left-hand sides of Eqs. (4.17)–(4.21). For the latter, constraint (4.8) is absent or the corresponding inequality $\theta^n - g_{s+1} \le 0$ is inoperative, in which cases $\lambda^n = 0$ and $\theta^n$ is free; then the basic algorithm of Section 2.2 is recovered. In this case the enlarged Hamiltonian $H^{n-1}$ vanishes and set (4.17)–(4.21) becomes canonical.

The optimal-performance-based choice of time intervals, which involves global or integral criteria, may be compared with the group of special-purpose integration methods for ordinary differential equations collectively called structure-preserving integrators (also called mechanical or geometric integrators). In these methods the local discretizing structure may be established without explicit recourse to an optimization criterion, although it has to preserve exactly a number of important properties known for ordinary differential equations. Examples are symplectic integrators for Hamiltonian equations, volume-preserving integrators for divergence-free equations, integrators preserving time-reversal symmetries, and integrators preserving the structure of gradient and Lyapunov systems [7–9,100–102].
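The practical meaning of structure preservation can be seen on the simplest Hamiltonian system. The sketch below is an illustration under stated assumptions and is not an algorithm from this paper: it compares the explicit Euler method with the symplectic Euler method for the harmonic oscillator $H = (p^2 + q^2)/2$, with hypothetical function names, step size and step count.

```python
def explicit_euler(q, p, h):
    """One explicit Euler step for dq/dt = p, dp/dt = -q (not symplectic)."""
    return q + h * p, p - h * q


def symplectic_euler(q, p, h):
    """One symplectic Euler step: update p first, then q with the new p."""
    p_new = p - h * q
    q_new = q + h * p_new
    return q_new, p_new


def energy(q, p):
    """Hamiltonian of the unit harmonic oscillator."""
    return 0.5 * (q * q + p * p)


def max_energy_error(step, h, n_steps, q0=1.0, p0=0.0):
    """Largest deviation of the energy from its initial value along the run."""
    q, p = q0, p0
    e0 = energy(q, p)
    worst = 0.0
    for _ in range(n_steps):
        q, p = step(q, p, h)
        worst = max(worst, abs(energy(q, p) - e0))
    return worst


if __name__ == "__main__":
    print("explicit  :", max_energy_error(explicit_euler, 0.1, 1000))
    print("symplectic:", max_energy_error(symplectic_euler, 0.1, 1000))
```

With these settings the explicit scheme multiplies the energy by $1 + h^2$ at every step, so the error grows without bound, whereas the symplectic scheme conserves the modified quantity $p^2 + q^2 - hpq$ exactly and its energy error stays bounded for all times.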
4.3. Thermodynamic limits for finite rates

Enhanced thermodynamic bounds limiting the work or energy delivered (consumed) in a finite time, or suitable finite-time bounds limiting other performance criteria, constitute the main benefit resulting from the presented macroscopic analyses, in particular from the generalized exergies of Section 3.1. Because they include dissipative factors and dynamic effects, these are stronger, more informative and more useful than the bounds of reversible thermodynamics. To describe these bounds quantitatively it is possible, and useful, to introduce two types of optimal work functions which we call $V$ and $V_H$ (or $R$ and $R_H$). The first depends on the boundary states, number of stages and duration, and the second contains the Hamiltonian $h$ in place of the duration. The same refers to generalized exergies. This dualism has its counterpart in classical mechanics, where two different action functions exist, one a duration-based action and the other a Hamiltonian-based action (sometimes called the abbreviated action), with the two linked by a Legendre transformation. To understand the problem of bounds and their distinction for work production and consumption, recall the work-producing process as the inverse of the work-consuming process (the final state of the second process is the initial state of the first, and conversely), with the durations of the two processes fixed to be the same. In thermostatics the two bounds on the work, the bound on the work produced and that on the work consumed, coincide. However, the static limits are often too far from reality to be really useful. The generalized exergy provides bounds stronger than those predicted by the classical exergy. They do not coincide for processes of work production and work consumption, and they are 'thermokinetic' rather than 'thermostatic' bounds.
Only for infinitely long durations or for processes with excellent transfer (an infinite number of transfer units) do the thermokinetic bounds reduce to their classical thermostatic limits. The hysteresis, or divergence of the bounds (Fig. 4), proves that many idealized processes allowed by classical thermostatics are prohibited by the more severe and realistic constraints of
Fig. 4. Finite-time exergy $A$ of a limiting continuous process ($N \to \infty$) prohibits processes from operating below the heat-pump mode line, which is the lower bound for work supplied (lbws), or above the engine mode line, which is the upper bound for work produced (ubwp). These bounds, more realistic than those based on the classical, reversible definition of exergy, are the result of finite rates consistent with the finite duration of the process.
thermokinetics. The hysteretic effect, caused by the dissipation and the associated increase of the exergy supplied in the pump mode (and decrease of the exergy released in the engine mode), reveals the extent to which thermokinetic bounds are stricter than thermostatic bounds. The greater the mean rate we demand, the greater is the range of performance excluded by the limits predicted by the generalized exergy. A real process which does not apply the optimal protocol but has the same boundary states and duration as the optimal sequence requires a real work supply that can only be larger than the finite-rate limit obtained by optimization. Similarly, the real work delivered from a nonequilibrium work-producing system (with the same boundary states and duration but with a suboptimal control) can only be lower than the corresponding finite-rate limit. Thus the two bounds for a process and its inverse, which coincide in thermostatics, diverge in thermokinetics, at a rate that grows with any of the indices of deviation from static behavior, $\xi$, $S_\sigma$, $L$ or $H$. For sufficiently high values of the rate indices, the work consumed may far exceed the classical work; the work produced can even vanish. Thus, we can confront and surmount the limitations of applying classical thermodynamic bounds to real processes. This is a direction with many open opportunities, especially for separation and chemical systems.
Acknowledgements

The author acknowledges support in the framework of the grant T09C 063 from the Polish Committee of National Research. The referee's remarks resulted in an improvement of the original manuscript. Helpful discussions with Profs. R.S. Berry and M. Grmela are gratefully acknowledged.
Appendix A

A.1. Notation

$A$: available energy (exergy)
$A$, $A^*$: action and abbreviated action in problems of classical mechanics
$A^n(\mathbf{x}^n, \mathbf{u}^n, t^n, \theta^n)$: criterion of stage optimality
$A_0$: constant area of transfer projected on axis $y$ (in thermal Fermat problem)
$A$, $a$: cumulative and local heat exchange area, respectively
$a_v$: specific area of heat exchange (per unit volume)
$b$: bending constant for a thermal ray
$b_g$: specific exergy of controlling gas
$c$: specific heat at constant pressure, propagation speed
$c_g$, $c_w$: specific heats of the dry gas and active substance, respectively
$c_e$: unit price of electric energy
$D^n$: profit at stage $n$
$\tilde{D}^n$: gauged profit at stage $n$
E e e F f 0 fI 0 f"( f ,2, f ) 1 s fn G *Gn GI "G#x 0 G(x, t) u(x, u, t)40 g ,g 1 2 g H(x, u, z, t) Hn~1(xn, un, zn~1, tn) Hn~1 HI n~1(xn, tn, un, zn~1, zn~1, tn) t H "U!W p H TU h I In 4 i ' i 4 J i J 4 J 6 K, KT Kc"LcCc Kn"!Pn KI n ,KI n#hhn H k L,R~1 ¸ ln "!f n 0 0 ¸ "UH#W p l M M ' M k N n
energy type function in terms of rates and state coordinates internal energy coordinate in relaxation process economic value of the exergy cross-sectional area of the system pro"t intensity gauged pro"t intensity, Eq. (2.37), in an HJB equation describing < vector of process rates rate vector at stage n mass #ux of resource #uid, total #ow rate gas #ow rate through stage n of a cascade mass exchanger (dryer) extended gauging function, Eq. (2.57) gauging function depending on state x and time t constraining vector function partial thermal conductances overall conductance, mass conductance in Section 3.1.2. standard Hamiltonian function of a continuous process Hamiltonian function of a discrete process at stage n extremum Hamiltonian at stage n enlarged Hamiltonian of a discrete process at stage n thermodynamic Hamiltonian height of transfer unit numerical Hamiltonian, Lagrange multiplier of time constraint heat current through the area A solid enthalpy at stage n speci"c inlet enthalpy of controlling gas speci"c gas enthalpy at equilibrium with solid molar #ux density of ith component density of entropy #ux density of energy #ux relaxation matrix and its transpose, respectively relaxation matrix in phase c cost counterpart of function Pn modi"ed cost function at stage n Onsager's conductivity related to gradient of ¹~1 overall conductance matrix, Onsagers matrix mechanical Lagrangian, rate of cost generation process Lagrangian at stage n dissipative kinetic potential of thermodynamic system distance from the beginning of exchanger matrix of mass #ows in a thermodynamic system molar mass of the dry gas molar mass of kth component total number of stages in a multistage process vector of mole numbers in thermodynamic process
n8 "(n, e) n P"G*!G#< Pn p pH p pc *
enlarged state vector including energy current stage number of a multistage process, refraction coe$cient e!ective optimal pro"t function with gauging term optimal gauged pro"t at nth stage in terms of the state xn and time tn momentum-type vector, vector of transfer potentials vector of interphase transfer potentials momentum type integral, RR/Ry in thermal Fermat problem transfer potentials in phase c as derivatives of entropy with respect to n and e k cumulative heat exchanged optimal performance function in terms of the complete initial state (x* , x*, t*) 0 optimal performance function in terms of the complete "nal state (x& , x&, t&) 0 optimal performance function of the forward DP algorithm, Eq. (2.64) driving heat in the engine mode of the process resistance matrix of phase c universal gas constant total resistance of a thermal path, radius of a spherical obstacle optimal performance function of cost type in terms of state and time minimum resistance potential optimal cost function at stage n modi"ed optimal cost function at stage n thermodynamic entropy, current entropy of the system potential of "nal entropy generated along extremum path Onsager's entropy function performance index, optimization criterion for N-stage process speci"c entropy, number of species, number of coordinates integral entropy production vector function describing the state transformation at nth stage temperatures of upper and lower reservoirs (usually ¹ "¹% and 2 ¹ ,¹) 1 constant equilibrium temperature of reservoir upper and lower temperature of the Carnot #uid temperature of #uid leaving the stage n temperature of #uid entering the stage n temperature of solid phase from stage n temperature of gaseous phase from stage n upper admissible temperature of gas duration of the process physical time, contact time admissible control set control vector of a general process, "elds in chemical wave problem
Q Q* Q& Qn"xn #Gn!
u8 n un"*¹n/*qn V <,max S
enlarged control vector at stage n rate of the temperature change as the control variable in the NCA process volume of the physical system optimal performance function of pro"t type optimal pro"t function at stage n modi"ed optimal pro"t function at stage n rate vector in the thermodynamic process linear velocity of the driving #uid symmetric matrix of second dissipation function cumulative power output total speci"c work or total power per unit mass #ux moisture content in solid phase at stage n power delivered at stage n thermodynamic forces as deviations of transfer potentials from equilibrium, Eq. (3.153) interphase thermodynamic forces as di!erence of transfer potentials pc!pd lowest admissible humidity of gas gas humidity at equilibrium with solid absolute humidity of gas at stage n state vector of a general process enlarged state vector with the last coordinate x8 s`1,t 1 transfer area coordinate, direction perpendicular to the resistivity gradient performance coordinate direction tangent to the resistivity gradient adjoint vector of a general process adjoint variable for coordinate xn i adjoint variable in the work minimization problem, momentum-type variable R¸/R¹Q adjoint variable in the entropy generation minimization, momentumtype variable R¸ /R¹Q p
A.2. Greek letters a,xc a@ C"Cc#Cd c c ,c 1 2 + d d2
independent Onsager's coordinates overall heat transfer coefficient overall Hessian matrix of entropy function coordinate of overall cumulative conductance coordinates of partial conductances nabla operator first differential, first-order perturbation second differential, second-order perturbation
e g"w/q 1 hn K K p k@,pH k k k8 "k M M~1!k k n k n k m o p s ¹ q U,UT W / (x, t) a
error criterion engine efficiency free interval of an independent variable or time interval at stage n generalized Lagrangian, cost intensity dissipative Lagrangian including constraints vector of Lagrange multipliers identified with the interphase potentials molar chemical potential of kth component transfer potential of kth component logarithmic intensity variable density, thermal resistance entropy production of unit volume passage time for a chemical ray nondimensional time, number of the heat transfer units ($x/H_{TU}$) first dissipation function and its Legendre transform, respectively state dependent dissipation function constraining function, αth component of the vector φ
A.3. Subscripts
C     Carnot point
g     gas
i     ith state variable
q     heat
s     entropy, solid
t     time
w     moisture
σ     dissipative quantity
0     zeroth variable, profit, reference state
*     modified quantity, minimum admissible quantity
1,2   first and second fluid
A.4. Superscripts
e        environment, equilibrium
f        final state
i        initial state
k or n   number of kth or nth stage
N        last stage
n        stage n
T        transpose matrix, transform
*        interface, maximum admissible quantity
I        enlarged quantity
0        inlet stage, fixed point
γ        phase γ
S. Sieniutycz / Physics Reports 326 (2000) 165}258
255
References [1] R. Bellman, Dynamic Programming, Princeton University Press, Princeton, 1957***. [2] R. Aris, Discrete Dynamic Programming, Blaisdell, New York, 1964***. [3] W. Findeisen, J. Szymanowski, A. Wierzbicki, Theory and Computational Methods of Optimization, PanH stwowe Wydawnictwa Naukowe, Warsawa, 1980. [4] H. Rund, The Hamilton}Jacobi Theory in the Calculus of Variations, Its role in Mathematics and Physics, Van Nostrand, London, 1966*. [5] L. Landau, E. Lifshitz, Mechanics, Pergamon, Oxford, 1971**. [6] V.I. ManH ko, M.A. Markov, (Eds.), Theory of Interaction of Multilevel Systems with Quantized Fields, Proceedings of P.N. Lebedev Physics Institute, Vol. 209, Nova Publishers, New York, 1996. [7] J.E. Marsden, Lectures on Mechanics, University Press, Cambridge, 1992. [8] R.I. McLachlan, Explicit Lie}Poisson integration and the Euler equations, Phys. Rev. Lett. 71 (1993) 3043 and references therein. [9] G.R.W. Quispel, Volume preserving integrators, Phys. Lett. A 206 (1995) 26}30. [10] S. Sieniutycz, The constant Hamiltonian problem and an introduction to the mechanics of optimal discrete systems, Rep. Inst. Chem. Eng. Warsaw Tech. Univ. 3 (1974) 27}53***. [11] S. Sieniutycz, Optimization in Process Engineering, 1st Edition, Wydawnictwa Naukowo Techniczne, Warsaw, 1978**. [12] Z. Szwast, Discrete algorithms of maximum principle with constant Hamiltonian and their selected applications in chemical engineering, Ph.D. Thesis, Institute of Chemical Engineering at the Warsaw University of Technology, Warsaw, 1979**. [13] S. Sieniutycz, Z. Szwast, Practice in Optimization: Process Problems, Wydawnictwa Naukowo Techniczne, Warsaw, 1982**. [14] L.T. Fan, C.S. Wang, The Discrete Maximum Principle, A Study of Multistage System Optimization, Wiley, New York, 1964. [15] V.G. Boltyanski, Optimal Control of Discrete Systems, Nauka, Moscow, 1973. [16] L.T. Fan, The Continuous Maximum Principle, A study of Complex System Optimization, Wiley, New York, 1966. [17] L.S. 
Pontryagin, V.A. Boltyanski, R.V. Gamkrelidze, E.F. Mischenko, The Mathematical Theory of the Optimal Processes, Wiley, New York, 1962**. [18] V.G. Boltyanski, Mathematical Methods of Optimal Control, Nauka, Moscow, 1969*. [19] Z. Szwast, Enhanced version of a discrete algorithm for optimization with a constant Hamiltonian, Inz. Chem. Proc. 3 (1988) 529}545. [20] G. Gutowski, Analytical Mechanics, Panstwowe Wydawnictwa Naukowe, Warszawa, 1971. [21] S. Sieniutycz, Thermodynamic methods in optimization of #uidized drying and moistening, Rep. Inst. Chem. Eng. Warsaw Tech. Univ. 2 (1973) 181}309*. [22] S. Sieniutycz, A general theory of optimal discrete drying processes with a constant Hamiltonian, in: A. Mujumdar (Ed.), Drying 84, Hemisphere, New York, 1984. [23] S. Sieniutycz, Examples of one-dimensional variational problems for selected #uidization processes (heating and sorption), Inz. Chem. 5 (1975) 651}661. [24] G. Leitman, An Introduction to Optimal Control, McGraw-Hill, New York, 1966. [25] S. Sieniutycz, The constancy of Hamiltonian in a discrete optimal process, Rep. Inst. Chem. Eng. Warsaw Tech. Univ. 2 (1973) 399}429. [26] S. Sieniutycz, The thermodynamic approach to #uidized drying and moistening optimization, A.I.Ch.E. J. 19 (1973) 277}285. [27] I.I. Novikov, The e$ciency of atomic power stations, Nucl. Energy II 7 (1958) 125}128 (English translation from At. Energy 3 (1957) 409}412)*. [28] F.L. Curzon, B. Ahlborn, E$ciency of Carnot engine at maximum power output, Am. J. Phys. 43 (1975) 22}24**. [29] S. Sieniutycz, R.S. Berry, Discrete Hamiltonian analysis of endoreversible thermal cascades, in: S. Sieniutycz, A. de Vos (Eds.), Thermodynamics of Energy Conversion and Transport, Springer, New York, 2000 (Chapter 6).
256
S. Sieniutycz / Physics Reports 326 (2000) 165}258
[30] de Vos, Endoreversible Thermodynamics of Solar Energy Conversion, Clarendon Press, Oxford, 1992, pp. 29}51*. [31] S. Sieniutycz, Nonlinear thermokinetics of maximum work in "nite time, Int. J. Eng. Sci. 36 (1998) 577}597. [32] B. Andresen, P. Salamon, R.S. Berry, Thermodynamics in "nite-time: extremals for imperfect heat engines, J. Phys. Chem. 66 (1977) 1571}1577. [33] B. Andresen, Finite-Time Thermodynamics, University of Copenhagen, 1983. [34] M.J. Ondrechen, B. Andresen, M. Mozurkiewicz, R.S. Berry, Maximum work form a "nite reservoir by sequential Carnot cycles, Am. J. Phys. 49 (1981) 681}685. [35] S. Sieniutycz, Optimal control framework for multistage endoreversible engines with heat and mass transfer, J. Non-Equilibrium Thermodyn. 24 (1999) 40}74. [36] S. Sieniutycz, Endoreversible modeling and optimization of thermal machines by dynamic programming, in: Ch. Wu (Ed.), Advance in Recent Finite Time Thermodynamics, Nova Science, New York, 1999, Ch. 11, pp. 189}220. [37] R.E. Bellman, S.E. Dreyfus, Applied Dynamic Programming, Panstwowe Wydawnictwa Ekonomiczne, Warsaw, 1967**. [38] B. Andresen, M.H. Rubin, R.S. Berry, Availability for "nite time processes general theory and model, J. Phys. Chem. 87 (1983) 2704}2713*. [39] V.A. Mironova, A.M. Tsirlin, V.A. Kazakov, R.S. Berry, Finite-time thermodynamics, Exergy and optimization of time-constrained processes, J. Appl. Phys. 76 (1994) 629}636. [40] S. Sieniutycz, Hamilton}Jacobi}Bellman theory of dissipative thermal availability, Phys. Rev. E 56 (1997) 5051}5064*. [41] P. Salamon, A. Nitzan, B. Andresen, R.S. Berry, Thermodynamics in "nite time IV: minimum entropy production and the optimization of heat engines, Phys. Rev. A 21 (1980) 2115}2129. [42] B. Andresen, J. Gordon, Optimal heating and cooling strategies for heat exchanger design, J. Appl. Phys. 71 (1992) 76}79. [43] S. 
Sieniutycz, Carnot problem of maximum work form a "nite resource interacting with environment in a "nite time, Physica A 264 (1999) 234}263. [44] L. Onsager, Reciprocal relations in irreversible processes, I. Phys. Rev. 37 (1931) 405}426 and II. Phys. Rev. 38 (1931) 265}2279*. [45] L. Onsager, S. Machlup, Fluctuations and irreversible processes, I. Phys. Rev. 91 (1953) 1505}1512 (see also, Machlup S. and Onsager L., ibid, 1953). [46] J. Keizer, Variational principles in nonequilibrium thermodynamics, Biosystems 8 (1977) 219}226. [47] L.E. Elsgolc, Variational Calculus, Panstwowe Wydawnictwa Naukowe, Warsaw 1960. [48] H. Goldstein, Classical Mechanics, Addison-Wesley, Reading, MA, 1980. [49] S. Sieniutycz, A.N. Beris, A nonequilibrium internal exchange of energy and matter and its Onsager's-type variational theory of relaxation, Int. J. Heat Mass Transfer 42 (1999) 2695}2715. [50] A. Tan, L.R. Holland, Tangent law of refraction for heat conduction through an interface and underlying variational principle, Am. J. Phys. 58 (1990) 988}991. [51] S. Sieniutycz, Dynamic programming approach to a Fermat-type principle for heat #ow, Int. J. Heat Mass Transfer (1999), submitted. [52] S. Sieniutycz, R.S. Berry, Conservation laws from Hamilton's principle for nonlocal thermodynamic equilibrium #uids with heat #ow, Phys. Rev. A 40 (1989) 348}361. [53] S. Sieniutycz, R.S. Berry, Thermal mass and thermal inertia: a comparison of hypotheses, Open System Inform. Dyn. 4 (1997) 15}43. [54] L. Landau, E. Lifshitz, Electrodynamics of Continua, Pergamon, Oxford, 1971. [55] A.M. Zhaboutinsky, Concentrational Oscillations, Moscow, Nauka, 1974*. [56] A.N. Zaikin, A.M. Zhaboutinsky, Concentration wave propagation in a two-dimensional, liquid-phase selfoscillating system, Nature 225 (1970) 535. [57] A.T. Winfree, Spiral waves of chemical acitivity, Science 175 (1972) 634}676*. [58] A.T. Winfree, Scroll-shaped waves of chemical acitivity in three dimensions, Science 181 (1973) 937. [59] A.T. 
Winfree, Rotating chemical reactions, Sci. Am. 230 (1974) 82}95. [60] H.W. Dandekar, J. Puszynski, J. Degreve, V. Hlavacek, Reaction front propagation characteristics in noncatalytic exotermic gas-solid systems, Chem. Eng. Commun. 92 (1990a) 199}224.
S. Sieniutycz / Physics Reports 326 (2000) 165}258
257
[61] H.W. Dandekar, C.C. Agra"otis, J. Puszynski, V. Hlavacek, Modeling and analysis of "ltration combustion for synthesis of transition metal nitrides, Chem. Eng. Sci. 45 (1990b) 2499}2504. [62] P. Dimitriou, J. Puszynski, Hlavacek, On the dynamics of equations describing gaseous combustion in condensed systems, Combin. Sci. Tech. 68 (1989) 101}111. [63] J. Puszynski, J. Degreve, V. Hlavacek, Modeling of exothermic solid}solid noncatalytic reactions, Ind. Eng. Chem. 92 (1990) 199}224. [64] J. Puszynski, S. Kumar, P. Dimitriou, V. Hlavacek, A numerical and experimental study of reaction front propagation in condensed phase systems, Z. Naturforsch. 43 A (1988) 1017}1025. [65] A. Lazar, Z. Noszticzius, H. Farkas, Involutes: the geometry of chemical waves rotating in annular membranes, Chaos 5 (1995) 443}447. [66] A. Lazar, Z. Noszticzius, H.-D. FoK rstelling, Z. Nagy-Ungvarai, Chemical waves in modi"ed membranes I. Developing the technique, Physica D 84 (1995b) 112}119. [67] A. Lazar, H.-D. FoK rstelling, A. Volford, Z. Noszticzius, Refraction of chemical waves propagating in modi"ed membranes, J. Chem. Soc. Faraday Trans. 92 (16) (1996) 2903}2909. [68] R.J. Field, M. Burger, Oscillations and Traveling Waves in Chemical Systems, Wiley, New York, 1985*. [69] P. Gray, S.K. Scott, Chemical Oscillations and Instabilities, Oxford University Press, Oxford, 1990*. [70] P. Gray, S.K. Scott, K. Showalter, The in#uence of the form of autocatalysis on the speed of chemical waves, Phil. Trans. Roy. Soc. London A 337 (1994) 249}260. [71] N. Wiener, A. Rosenblueth, The mathematical formulation of the problem of conduction of impulses in a network of connected excitable elements, speci"cally in cardiac muscle, Arch. del Instituto de Cardiologia de Mexico 16 (1946) 205. [72] J.J. Tyson, The Belousov}Zhaboutinsky Reaction, Lecture Notes in Mathematics, Vol. 10, Springer, Berlin, 1976. [73] J.P. Keener, A geometrical theory for spiral waves in excitable media, SIAM J. Appl. Math. 
46 (1986) 1039}1056. [74] J.J. Tyson, J.P. Keener, Singular perturbation theory of traveling waves in excitable media, Physica D 32 (1988) 327}361. [75] A.S. Mikhailov, in: H. Haken (Ed.), Foundations of Synergetics I: Distributed Active Systems, Springer Series in Synergetics, Vol. 51, Springer, Berlin, 1991. [76] G. Nicolis, I. Prigogine, Self-Organization in Nonequilibrium Systems, Wiley, New York, 1977. [77] G. Nicolis, J. Portnow, Chemical oscillations, Chem. Rev. 73 (1973) 365}384. [78] R. Thom, Stabilite Structurelle et Morphogenese, Benjamin, New York, 1972. [79] M. Orlik, Oscillatory Reactions: Order and Chaos, Wydawnictwa Naukowo Techniczne, Warszawa, 1996 (in Polish). [80] K. Showalter, R.M. Noyes, K. Bar-Eli, A modi"ed Oregonator model exhibiting complicated limit cycle behavior in a #ow system, J. Phys. Chem. 69 (1978) 2514}2524. [81] R.I. GyoK rgyi, R.J. Rempe, J. Field, cited according Orlik's 1996 book, J. Phys. Chem. 95 (1991) 3159. [82] R.I. GyoK rgyi, R.J. Rempe, J. Field, cited according Orlik's 1996 book, J. Phys. Chem. 92 (1988) 7079. [83] K.-D. Willamowski, O.E. RoK ssler, Irregular oscillations in a realistic abstract quadratic mass-action systems, Z. Naturforsch. 35A (1980) 317. [84] J.S. Nicolis, G. Mayer-Kress, G. Haubs, Non-uniform chaotic dynamics with implications to information processing, Z. Naturforsch 38A (1983) 1157. [85] K. Lindgren, B.AG . Mannson, Entropy production in a chaotic chemical system, Z. Naturforsch 41A (1986) 1111. [86] K.-E. Erickson, K. Lindgren, B.AG . Mannson, Structure, Context, Complexity, Organization, Physical Aspects of Information and Value, World Scienti"c, Singapore, 1987. [87] S. Sieniutycz, Variational thermomechanical processes and chemical reactions in distributed systems, Int. J. Heat Mass Transfer 40 (1997) 3467}3485. [88] M.M. Lipschutz, Theory and Problems of Di!erential Geometry, Schaum's Outline Series, McGraw-Hill, New York, 1969. [89] Z. Noszticzius, W. Horsthemke, W.D. McCormick, H.L. 
Swinney, W.Y. Tam, Sustained chemical waves in an angular gel reactor: a chemical pinwheel, Nature 329 (1987) 619}620*. [90] E. Meron, The role of curvature and wavefront interactions in spiral-wave dynamics, Physica D 49 (1991) 98}106.
258
S. Sieniutycz / Physics Reports 326 (2000) 165}258
[91] D. Horvath, V. Petrov, S.K. Scott, Instabilities in propagating reaction}di!usion fronts, J. Chem. Phys. 98 (1993) 6332}6343. [92] D.A. Vasquez, W. Horsthemke, Chemical waves in a continuously fed reactor: numerical studies of the chemical pinwheel, J. Chem. Phys. 80 (1991) 3829}3834. [93] S. Sieniutycz, H. Farkas, Chemical waves in con"ned regions by Hamilton}Jacobi}Bellman theory, Chem. Eng. Sci. 52 (1997) 2927}2945. [94] P.L. Simon, H. Farkas, Geometric theory of trigger waves } a dynamical system approach, J. Math. Chem. 19 (1996) 301}315. [95] S.K. Scott, Oscillations, Waves and Chaos in Chemical Kinetics, Oxford University Press, Oxford, 1994. [96] C.M. Guldberg, P. Waage, Etudes sur les A$niteH s Chimiques, Brogger et Christie, Christiana, 1867. [97] S. Sieniutycz, From a least action principle to mass action law and extended a$nity, Chem. Eng. Sci. 11 (1987) 2697}2711. [98] R.M. Noyes, Reactions with directed motions, Nature 329 (1987) 581. [99] A.G. Merzhanov, E.N. Rumanov, Physics of reaction waves, Rev. Modern Phys. 71 (1999) 1173}1211. [100] G.R.W. Quispel, C.P. Dyt, Volume preserving integrators have linear error growth, Phys. Lett. A 242 (1998) 25}30, and references therein. [101] R.I. McLachlan, G.R.W. Quispel, Generating functions for dynamical systems with symmetries, integrals and di!erential invariants, Physica D 112 (1998) 298}309 (1998), and references therein. [102] R.I. McLachlan, G.R.W. Quispel, N. Robidoux, Geometric integration using discrete gradients, Phil. Trans. Roy. Soc. London A 357 (1999) 1021}1045.
A.D. Mirlin / Physics Reports 326 (2000) 259–382
STATISTICS OF ENERGY LEVELS AND EIGENFUNCTIONS IN DISORDERED SYSTEMS
Alexander D. MIRLIN
Institut für Theorie der kondensierten Materie, Universität Karlsruhe, 76128 Karlsruhe, Germany
Physics Reports 326 (2000) 259–382
Statistics of energy levels and eigenfunctions in disordered systems
Alexander D. Mirlin¹
Institut für Theorie der kondensierten Materie, Postfach 6980, Universität Karlsruhe, 76128 Karlsruhe, Germany
Received July 1999; editor: C.W.J. Beenakker
Contents
1. Introduction
2. Energy level statistics: random matrix theory and beyond
2.1. Supersymmetric σ-model formalism
2.2. Deviations from universality
3. Statistics of eigenfunctions
3.1. Eigenfunction statistics in terms of the supersymmetric σ-model
3.2. Quasi-one-dimensional geometry
3.3. Arbitrary dimensionality: metallic regime
4. Asymptotic behavior of distribution functions and anomalously localized states
4.1. Long-time relaxation
4.2. Distribution of eigenfunction amplitudes
4.3. Distribution of local density of states
4.4. Distribution of inverse participation ratio
4.5. 3D systems
4.6. Discussion
5. Statistics of energy levels and eigenfunctions at the Anderson transition
5.1. Level statistics. Level number variance
5.2. Strong correlations of eigenfunctions near the Anderson transition
5.3. Power-law random banded matrix ensemble: Anderson transition in 1D
6. Conductance fluctuations in quasi-one-dimensional wires
6.1. Modeling a disordered wire and mapping onto 1D σ-model
6.2. Conductance fluctuations
7. Statistics of wave intensity in optics
8. Statistics of energy levels and eigenfunctions in a ballistic system with surface scattering
8.1. Level statistics, low frequencies
8.2. Level statistics, high frequencies
8.3. The level number variance
8.4. Eigenfunction statistics
9. Electron–electron interaction in disordered mesoscopic systems
9.1. Coulomb blockade: fluctuations in the addition spectra of quantum dots
10. Summary and outlook
Acknowledgements
Appendix A. Abbreviations
References
¹ Tel.: +49-721-6083368; fax: +49-721-698150. Also at Petersburg Nuclear Physics Institute, 188350 Gatchina, St. Petersburg, Russia.
0370-1573/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0370-1573(99)00091-5
Abstract
The article reviews recent developments in the theory of fluctuations and correlations of energy levels and eigenfunction amplitudes in diffusive mesoscopic samples. Various spatial geometries are considered, with emphasis on low-dimensional (quasi-1D and 2D) systems. Calculations are based on the supermatrix σ-model approach. The method reproduces, in the so-called zero-mode approximation, the universal random matrix theory (RMT) results for the energy-level and eigenfunction fluctuations. Going beyond this approximation allows us to study system-specific deviations from universality, which are determined by the diffusive classical dynamics in the system. These deviations are especially strong in the far "tails" of the distribution function of the eigenfunction amplitudes (as well as of some related quantities, such as local density of states, relaxation time, etc.). These asymptotic "tails" are governed by anomalously localized states which are formed in rare realizations of the random potential. The deviations of the level and eigenfunction statistics from their RMT form strengthen with increasing disorder and become especially pronounced at the Anderson metal–insulator transition. In this regime, the wave functions are multifractal, while the level statistics acquires a scale-independent form with distinct critical features. Fluctuations of the conductance and of the local intensity of a classical wave radiated by a point-like source in the quasi-1D geometry are also studied within the σ-model approach. For a ballistic system with rough surface an appropriately modified ("ballistic") σ-model is used. Finally, the interplay of the fluctuations and the electron–electron interaction in small samples is discussed, with application to the Coulomb blockade spectra. © 2000 Elsevier Science B.V. All rights reserved.
PACS: 05.45.Mt; 71.23.An; 71.30.+h; 72.15.Rn; 73.23.-b; 73.23.Ad; 73.23.Hk
Keywords: Level correlations; Wave function statistics; Disordered mesoscopic systems; Supermatrix sigma model
1. Introduction
Statistical properties of energy levels and eigenfunctions of complex quantum systems have attracted much interest from physicists since the work of Wigner [1], who formulated a statistical point of view on nuclear spectra. In order to describe excitation spectra of complex nuclei, Wigner proposed to replace a complicated and unknown Hamiltonian by a large $N \times N$ random matrix. This was the beginning of the random matrix theory (RMT), further developed by Dyson and Mehta in the early 1960s [2,3]. This theory predicts a universal form of the spectral correlation functions determined solely by some global symmetries of the system (time-reversal invariance and value of the spin).
Later it was realized that the random matrix theory is not restricted to strongly interacting many-body systems, but has a much broader range of applicability. In particular, Bohigas et al. [4] put forward a conjecture (strongly supported by accumulated numerical evidence) that the RMT adequately describes statistical properties of spectra of quantum systems whose classical analogs are chaotic.
Another class of systems to which the RMT applies and which is of special interest to us here is that of disordered systems. More specifically, we mean a quantum particle (an electron) moving in a random potential created by some kind of impurities. It was conjectured by Gor'kov and Eliashberg [5] that statistical properties of the energy levels in such a disordered granule can be described by the random matrix theory. This statement remained a conjecture until 1982, when it was proved by Efetov [6]. This became possible due to Efetov's development of a very powerful tool for treating the disordered systems under consideration: the supersymmetry method (see the review [6] and the recent book [7]).
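Wigner's idea of replacing the Hamiltonian by a random matrix can be illustrated in the smallest possible case. The sketch below is a standard textbook illustration, not a calculation from this review: it draws random 2 × 2 real symmetric matrices and compares the probability of finding a small level spacing with the value for uncorrelated (Poisson) levels of the same mean spacing. The function names `goe_2x2_spacings` and `fraction_below` are hypothetical, and the variance convention (diagonal variance 1, off-diagonal variance 1/2) is an assumption chosen so that the ensemble is rotationally invariant.

```python
import math
import random


def goe_2x2_spacings(n_samples, seed=0):
    """Level spacings of random 2x2 real symmetric matrices.

    The matrix is [[a, b], [b, c]] with a, c ~ N(0, 1) and b ~ N(0, 1/2)
    (an assumed normalization); its two eigenvalues differ by
    sqrt((a - c)^2 + 4 b^2). Spacings are rescaled to unit sample mean.
    """
    rng = random.Random(seed)
    raw = []
    for _ in range(n_samples):
        a = rng.gauss(0.0, 1.0)
        c = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, math.sqrt(0.5))
        raw.append(math.sqrt((a - c) ** 2 + 4.0 * b * b))
    mean = sum(raw) / len(raw)
    return [s / mean for s in raw]


def fraction_below(values, threshold):
    """Fraction of entries smaller than `threshold`."""
    return sum(1 for v in values if v < threshold) / len(values)


if __name__ == "__main__":
    spacings = goe_2x2_spacings(20000)
    small = fraction_below(spacings, 0.1)
    poisson = 1.0 - math.exp(-0.1)  # uncorrelated levels, same mean spacing
    print(f"P(s < 0.1): random matrix {small:.4f}, Poisson {poisson:.4f}")
```

Small spacings are strongly suppressed for the random matrix compared with uncorrelated levels, which is the level repulsion referred to later in the text. The universal correlation functions of large $N \times N$ matrices are not reproduced by this two-level toy model.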
This method allows one to map the problem of the particle in a random potential onto a certain deterministic field-theoretical model (supermatrix σ-model), which generates the disorder-averaged correlation functions of the original problem. As Efetov showed, under certain conditions one can neglect spatial variation of the σ-model supermatrix field (the so-called zero-mode approximation), which allows one to calculate the correlation functions. The corresponding results for the two-level correlation function reproduced precisely the RMT results of Dyson. The supersymmetry method can also be applied to problems of the RMT type. In this connection, we refer the reader to the paper [8], where the technical aspects of the method are discussed in detail.
More recently, the focus of research interest has shifted from the proof of the applicability of RMT to the study of system-specific deviations from the universal (RMT) behavior. For the problem of level correlations in a disordered system, this question was addressed for the first time by Altshuler and Shklovskii [9] in the framework of the diffuson-cooperon diagrammatic perturbation theory. They showed that the diffusive motion of the particle leads to a high-frequency behavior of the level correlation function completely different from its RMT form. Their perturbative treatment was, however, restricted to frequencies much larger than the level spacing and was not able to reproduce the oscillatory contribution to the level correlation function. Inclusion of non-zero spatial modes (which means going beyond universality) within the σ-model treatment of the level correlation function was performed in Ref. [10]. The method developed in [10] was later used for calculation of deviations from the RMT of various statistical characteristics of a disordered system. For the case of level statistics, the calculation of [10] valid for not too large
frequencies (below the Thouless energy equal to the inverse time of diffusion through the system) was complemented by Andreev and Altshuler [11] whose saddle-point treatment was, in contrast, applicable for large frequencies. Level statistics in diffusive disordered samples is discussed in detail in Section 2 of the present article.
Not only the energy level statistics but also the statistical properties of wave functions are of considerable interest. In the case of nuclear spectra, they determine fluctuations of widths and heights of the resonances [12]. In the case of disordered (or chaotic) electronic systems, eigenfunction fluctuations govern, in particular, statistics of the tunnel conductance in the Coulomb blockade regime [13]. Note also that the eigenfunction amplitude can be directly measured in microwave cavity experiments [14–16] (though in this case one considers the intensity of a classical wave rather than of a quantum particle, all the results are equally applicable; see also Section 7). Within the random matrix theory, the distribution of eigenvector amplitudes is simply Gaussian, leading to the $\chi^2$ distribution of the "intensities" $|\psi_i|^2$ (Porter–Thomas distribution) [12].
A theoretical study of the eigenfunction statistics in a disordered system is again possible with use of the supersymmetry method. The corresponding formalism, which was developed in Refs. [17–20] (see Section 3.1), allows one to express various distribution functions characterizing the eigenfunction statistics through the σ-model correlators. As in the case of the level correlation function, the zero-mode approximation to the σ-model reproduces the RMT results, in particular the Porter–Thomas distribution of eigenfunction amplitudes. However, one can go beyond this approximation.
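The Porter–Thomas statement above can be checked numerically in a few lines. The following is a minimal sketch under stated assumptions, not the σ-model calculation of the review: a random real unit vector stands in for an eigenvector of a large real symmetric random matrix (the case with time-reversal invariance), and the rescaled intensities $y = N|\psi_i|^2$ are compared with the chi-squared density with one degree of freedom, $P(y) = e^{-y/2}/\sqrt{2\pi y}$. All function names are hypothetical.

```python
import math
import random


def eigenvector_intensities(dim, n_vectors, seed=0):
    """Rescaled intensities y = dim * |psi_i|^2 of random real unit vectors.

    A vector of independent Gaussian components, normalized to unit length,
    is uniformly distributed on the sphere; it is used here as a stand-in
    for an eigenvector of a large real symmetric random matrix.
    """
    rng = random.Random(seed)
    intensities = []
    for _ in range(n_vectors):
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm_sq = sum(x * x for x in vec)
        intensities.extend(dim * x * x / norm_sq for x in vec)
    return intensities


def porter_thomas_pdf(y):
    """Chi-squared density with one degree of freedom and unit mean."""
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)


def mean_and_variance(values):
    """Sample mean and (biased) sample variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


if __name__ == "__main__":
    ys = eigenvector_intensities(dim=200, n_vectors=200)
    mean, var = mean_and_variance(ys)
    # For the chi-squared law with one degree of freedom: mean 1, variance 2.
    print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

For a finite dimension the variance is slightly below 2; the Porter–Thomas form is the large-$N$ limit. Deviations caused by diffusion in a real disordered sample, which are the subject of the review, are absent from this toy model by construction.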
In particular, in the case of a quasi-one-dimensional geometry, considered in Section 3.2, this σ-model has been solved exactly using the transfer-matrix method, yielding exact analytical results for the eigenfunction statistics for arbitrary length of the system, from the weak to the strong localization regime [17,18,21–23]. The case of a quasi-1D geometry is of great interest not only from the point of view of condensed matter theory (as a model of a disordered wire) but also for quantum chaos.
In Section 3.3 we consider the case of arbitrary spatial dimensionality of the system. Since for $d > 1$ an exact solution of the problem cannot be found, one has to use some approximate methods. In Refs. [24,25] the scheme of [10] was generalized to the case of the eigenfunction statistics. This allowed us to calculate the distribution of eigenfunction intensities and its deviation from the universal (Porter–Thomas) form. Fluctuations of the inverse participation ratio and long-range correlations of the eigenfunction amplitudes, which are determined by the diffusive dynamics in the corresponding classical system [25–27], are considered in Section 3.3.3.
Section 4 is devoted to the asymptotic "tails" of the distribution functions of various fluctuating quantities (local amplitude of an eigenfunction, relaxation time, local density of states) characterizing a disordered system. It turns out that the asymptotics of all these distribution functions are determined by rare realizations of disorder leading to formation of anomalously localized eigenstates. These states show some kind of localization while all "normal" states are ergodic; in the quasi-one-dimensional case they have an effective localization length much shorter than the "normal" one. Existence of such states was conjectured by Altshuler et al. [28] who studied distributions of various quantities in $2+\epsilon$ dimensions via the renormalization group approach.
More recently, Muzykantskii and Khmelnitskii [29] suggested a new approach to the problem. Within this method, the asymptotic "tails" of the distribution functions are obtained by finding a non-trivial saddle-point configuration of the supersymmetric σ-model. Further development and generalization of the method allowed one to calculate the asymptotic behavior of the distribution
functions of relaxation times [29–31], eigenfunction intensities [32,33], local density of states [34], inverse participation ratio [35,36], level curvatures [37,38], etc. The saddle-point solution describes directly the spatial shape of the corresponding anomalously localized state [29,36].
Section 5 deals with statistical properties of the energy levels and wave functions at the Anderson metal–insulator transition point. As is well known, in $d > 2$ dimensions a disordered system undergoes, with increasing strength of disorder, a transition from the phase of extended states to that of localized states (see, e.g. [39] for a review). This transition changes drastically the statistics of energy levels and eigenfunctions. While in the delocalized phase the levels repel each other strongly and their statistics is described by RMT (up to the deviations discussed above and in Section 2), in the localized regime the level repulsion disappears (since states nearby in energy are located far from each other in real space). As a result, the levels form an ideal 1D gas (on the energy axis) obeying the Poisson statistics. In particular, the variance of the number $N$ of levels in an interval $\Delta E$ increases linearly, $\mathrm{var}(N) = \langle N \rangle$, in contrast to the slow logarithmic increase in the RMT case. What happens to the level statistics at the transition point? This question was addressed for the first time by Altshuler et al. [40], where a Poisson-like increase, $\mathrm{var}(N) = \chi \langle N \rangle$, was found numerically with a spectral compressibility $\chi \approx 0.3$. More recently, Shklovskii et al. [41] put forward the conjecture that the nearest level spacing distribution $P(s)$ has a universal form at the critical point, combining the RMT-like level repulsion at small $s$ with the Poisson-like behavior at large $s$. However, these results were questioned by Kravtsov et al. [42] who developed an analytical approach to the problem and found, in particular, a sublinear increase of $\mathrm{var}(N)$.
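The spectral compressibility used above is simply the ratio of the level number variance to the mean level number. As a hedged illustration of the two limiting cases only, the sketch below estimates this ratio for uncorrelated (Poisson) levels, where it should be close to 1, and for a perfectly rigid, equally spaced spectrum, where it vanishes. It does not simulate the Anderson transition, and the critical value quoted in the text is not reproduced here. The helper names are hypothetical.

```python
import random


def level_counts(levels, width, n_windows):
    """Number of levels in each of the windows [k*width, (k+1)*width)."""
    counts = [0] * n_windows
    for e in levels:
        k = int(e // width)
        if 0 <= k < n_windows:
            counts[k] += 1
    return counts


def compressibility(levels, width, n_windows):
    """Ratio var(N) / <N> of the level number N over the given windows."""
    counts = level_counts(levels, width, n_windows)
    mean = sum(counts) / n_windows
    var = sum((c - mean) ** 2 for c in counts) / n_windows
    return var / mean


def poisson_levels(n_levels, seed=0):
    """Uncorrelated levels with unit mean spacing on [0, n_levels)."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, n_levels) for _ in range(n_levels)]


def rigid_levels(n_levels):
    """Equally spaced levels, an extreme caricature of level repulsion."""
    return [i + 0.5 for i in range(n_levels)]


if __name__ == "__main__":
    n = 200000
    print("Poisson:", compressibility(poisson_levels(n), 20.0, n // 20))
    print("rigid:  ", compressibility(rigid_levels(n), 20.0, n // 20))
```

A critical spectrum in the sense of the text would give a ratio strictly between these two limits.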
This controversy was resolved in [43,44] where the analysis of [42] was critically reconsidered and the level number variance was shown to have generally a linear behavior at the transition point. By now, this result has been confirmed by numerical simulations done by several groups [45–48]. Recently, a connection between this behavior and multifractal properties of eigenfunctions has been conjectured [49].
Multifractality is a formal way to characterize strong fluctuations of the wave function amplitude at the mobility edge. It follows from the renormalization group calculation of Wegner [50] (though the term "multifractality" was not used there). Later the multifractality of the critical wave functions was discussed in [51] and confirmed by numerical simulations of the disordered tight-binding model [52–56]. It implies, in very rough terms, that the eigenfunction is effectively located in a vanishingly small portion of the system volume. A natural question then arises: why do such extremely sparse eigenfunctions show the same strong level repulsion as the ergodic states in the RMT? This problem is addressed in Section 5.2. It is shown there that the wave functions of nearby-in-energy states exhibit very strong correlations (they have essentially the same multifractal structure), which preserves the level repulsion despite the sparsity of the wave functions.
In Section 5.3 we consider a "power-law random banded matrix ensemble" (PRBM) which describes a kind of one-dimensional system with a long-range hopping whose amplitude decreases as $r^{-\alpha}$ with distance [57]. Such a random matrix ensemble arises in various contexts in the theory of quantum chaos [58,59] and disordered systems [60–62]. The problem can again be mapped onto a supersymmetric σ-model. It is further shown that at $\alpha = 1$ the system is at a critical point of the localization–delocalization transition.
More precisely, there exists a whole family of such critical points labeled by the coupling constant of the σ-model (which can in turn be related to the parameters of the microscopic PRBM ensemble). Statistics of levels and eigenfunctions in this model are studied. At the critical point they show the critical features discussed above, such as the multifractality of eigenfunctions and a finite spectral compressibility $0<\chi<1$.
A.D. Mirlin / Physics Reports 326 (2000) 259–382
The energy level and eigenfunction statistics characterize the spectrum of an isolated sample. For an open system (coupled to external conducting leads), different quantities become physically relevant. In particular, we have already mentioned the distributions of the local density of states and of the relaxation times discussed in Section 4 in connection with anomalously localized states. In Section 6 we consider one of the most famous issues in the physics of mesoscopic systems, namely that of conductance fluctuations. We focus on the case of the quasi-one-dimensional geometry. The underlying microscopic model describing a disordered wire coupled to freely propagating modes in the leads was proposed by Iida et al. [63]. It can be mapped onto a 1D σ-model with boundary terms representing coupling to the leads. The conductance is given in this approach by the multichannel Landauer–Büttiker formula. The average conductance $\langle g\rangle$ of this system for an arbitrary value of the ratio of its length $L$ to the localization length $\xi$ was calculated by Zirnbauer [64], who developed for this purpose the Fourier analysis on supersymmetric manifolds. The variance of the conductance was calculated in [65] (in the case of a system with strong spin–orbit interaction there was a subtle error in the papers [64,65], corrected by Brouwer and Frahm [66]). The analytical results, which describe the whole range of $L/\xi$ from the weak localization ($L\ll\xi$) to the strong localization ($L\gg\xi$) regime, were confirmed by numerical simulations [67,68]. As has been already mentioned, the σ-model formalism is not restricted to quantum-mechanical particles, but is equally applicable to classical waves. Section 7 deals with the problem of intensity distribution in the optics of disordered media. In an optical experiment, a source and a detector of the radiation can be placed in the bulk of disordered media.
The distribution of the detected intensity is then described in the leading approximation by the Rayleigh law [69], which follows from the assumption of a random superposition of independent traveling waves. This result can also be reproduced within the diagrammatic technique [70]. Deviations from the Rayleigh distribution governed by the diffusive dynamics were studied in [71] for the quasi-1D geometry. When the source and the detector are moved toward the opposite edges of the sample, the intensity distribution transforms into the distribution of transmission coefficients [72–74]. Recently, it has been suggested by Muzykantskii and Khmelnitskii [75] that the supersymmetric σ-model approach developed previously for diffusive systems is also applicable in the case of ballistic systems. Muzykantskii and Khmelnitskii derived the "ballistic σ-model", where the diffusion operator was replaced by the Liouville operator governing the ballistic dynamics of the corresponding classical system. This idea was further developed by Andreev et al. [76,77], who derived the same action via the energy averaging for a chaotic ballistic system with no disorder. (There are some indications that one has to include a certain amount of disorder to justify the derivation of [76,77].) Andreev et al. replaced, in this case, the Liouville operator by its regularization known as the Perron–Frobenius operator. However, this approach has so far failed to provide explicit analytical results for any particular chaotic billiard. This is because the eigenvalues of the Perron–Frobenius operator are usually not known, while its eigenfunctions are highly singular. To overcome these difficulties and to make further analytical progress possible, a ballistic model with surface disorder was considered in [78,79]. The corresponding results are reviewed in Section 8.
It is assumed that roughness of the sample surface leads to diffusive surface scattering, modelling a ballistic system with strongly chaotic classical dynamics. Considering the simplest (circular) shape of the system allows one to find the spectrum of the corresponding Liouville operator and to study statistical properties of energy levels and eigenfunctions. The
results for the level statistics show important differences as compared to the case of a diffusive system and are in agreement with arguments of Berry [80,81] concerning the spectral statistics in a generic chaotic billiard. In Section 9 we discuss a combined effect of the level and eigenfunction fluctuations and the electron–electron interaction on thermodynamic properties of quantum dots. Section 9.1 is devoted to statistics of the so-called addition spectrum of a quantum dot in the Coulomb blockade regime. The addition spectrum, which is determined by the positions of the Coulomb blockade conductance peaks with varying gate voltage, corresponds to a successive addition of electrons to the dot coupled very weakly to the outside world [82]. The two important energy scales characterizing such a dot are the charging energy $e^2/C$ and the electron level spacing $\Delta$ (the former being much larger than the latter for a dot with a large number of electrons). Statistical properties of the addition spectrum were experimentally studied for the first time by Sivan et al. [83]. It was conjectured in Ref. [83] that fluctuations in the addition spectrum are of the order of $e^2/C$ and are thus of classical origin. However, it was found in Refs. [84,85] that this is not the case and that the magnitude of fluctuations is set by the level spacing $\Delta$, as in the non-interacting case. The interaction modifies, however, the shape of the distribution function. In particular, it is responsible for breaking the spin degeneracy of the quantum dot spectrum. These results have been confirmed recently by thorough experimental studies [86,87]. The research activity in the field of disordered mesoscopic systems, random matrix theory, and quantum chaos has been growing enormously in recent years, so that a review article clearly cannot give an account of the progress in the whole field.
Many of the topics which are not covered here have been extensively discussed in the recent reviews by Beenakker [88] and by Guhr et al. [89].
2. Energy level statistics: random matrix theory and beyond

2.1. Supersymmetric σ-model formalism

The problem of energy level correlations has been attracting a lot of research interest since the work of Wigner [1]. The random matrix theory (RMT) developed by Wigner et al. [2,3] was found to describe well the level statistics of various classes of complex systems. In particular, in 1965 Gor'kov and Eliashberg [5] put forward a conjecture that the RMT is applicable to the problem of energy level correlations of a quantum particle moving in a random potential. To prove this hypothesis, Efetov developed the supersymmetry approach to the problem [6,7]. The quantity of primary interest is the two-level correlation function²
\[ R(s) = \frac{1}{\langle\nu\rangle^2}\,\big\langle \nu(E-\omega/2)\,\nu(E+\omega/2)\big\rangle , \tag{2.1} \]

² The two-level correlation function is conventionally denoted [3,4] as $R_2(s)$. Since we will not consider higher-order correlation functions, we will omit the subscript "2".
where $\nu(E)=V^{-1}\,\mathrm{Tr}\,\delta(E-\hat H)$ is the density of states at the energy $E$, $V$ is the system volume, $\hat H$ is the Hamiltonian, $\Delta=1/\langle\nu\rangle V$ is the mean level spacing, $s=\omega/\Delta$, and $\langle\ldots\rangle$ denotes averaging over realizations of the random potential. As was shown by Efetov [6], the correlator (2.1) can be expressed in terms of a Green function of a certain supermatrix σ-model. Depending on whether the time reversal and spin rotation symmetries are broken or not, one of three different σ-models is relevant, with unitary, orthogonal or symplectic symmetry group. We will consider first the technically simplest case of the unitary symmetry (corresponding to broken time reversal invariance); the results for the two other cases will be presented at the end. We give only a brief sketch of the derivation of the expression for $R(s)$ in terms of the σ-model. One begins with representing the density of states in terms of the Green's functions,
\[ \nu(E) = \frac{1}{2\pi i V}\int d^dr\,\big[G_A^E(r,r)-G_R^E(r,r)\big] , \tag{2.2} \]
where
\[ G^E_{R,A}(r_1,r_2) = \langle r_1|(E-\hat H\pm i\eta)^{-1}|r_2\rangle, \qquad \eta\to+0 . \tag{2.3} \]
The Hamiltonian $\hat H$ consists of the free part $\hat H_0$ and the disorder potential $U(r)$:
\[ \hat H = \hat H_0 + U(r), \qquad \hat H_0 = \frac{1}{2m}\hat p^2 , \tag{2.4} \]
the latter being defined by the correlator
\[ \langle U(r)U(r')\rangle = \frac{1}{2\pi\nu\tau}\,\delta(r-r') . \tag{2.5} \]
A non-trivial part of the calculation is the averaging of the $G_R G_A$ terms entering the correlation function $\langle\nu(E+\omega/2)\nu(E-\omega/2)\rangle$. The following steps are:
(i) to write the product of the Green's functions in terms of the integral over a supervector field $\Phi=(S_1,\chi_1,S_2,\chi_2)$:
\[ G_R^{E+\omega/2}(r_1,r_1)\,G_A^{E-\omega/2}(r_2,r_2) = \int D\Phi\,D\Phi^\dagger\; S_1(r_1)S_1^*(r_1)S_2(r_2)S_2^*(r_2)\,\exp\Big\{ i\int dr'\,\Phi^\dagger(r')\Lambda^{1/2}\big[E+(\omega/2+i\eta)\Lambda-\hat H\big]\Lambda^{1/2}\Phi(r')\Big\} , \tag{2.6} \]
where $\Lambda=\mathrm{diag}\{1,1,-1,-1\}$;
(ii) to average over the disorder;
(iii) to introduce a $4\times4$ supermatrix variable $R_{\mu\nu}(r)$ conjugate to the tensor product $\Phi_\mu(r)\Phi^\dagger_\nu(r)$;
(iv) to integrate out the $\Phi$ fields;
(v) to use the saddle-point approximation, which leads to the following equation for $R$:
\[ R(r) = \frac{1}{2\pi\nu\tau}\,g(r,r) , \tag{2.7} \]
\[ g(r_1,r_2) = \langle r_1|(E-\hat H_0-R)^{-1}|r_2\rangle . \tag{2.8} \]
The relevant set of the solutions (the saddle-point manifold) has the form
\[ R = \sigma\cdot I - \frac{i}{2\tau}\,Q , \tag{2.9} \]
where $I$ is the unit matrix, $\sigma$ is a certain constant, and the $4\times4$ supermatrix $Q=T^{-1}\Lambda T$ satisfies the condition $Q^2=1$, with $T$ belonging to the coset space $U(1,1|2)/U(1|1)\times U(1|1)$. The expression for the two-level correlation function $R(s)$ then reads
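\[ R(s) = \Big(\frac{1}{4V}\Big)^2\,\mathrm{Re}\int DQ(r)\,\Big[\int d^dr\,\mathrm{Str}\,Q\Lambda k\Big]^2 \exp\Big\{-\frac{\pi\nu}{4}\int d^dr\,\mathrm{Str}\big[-D(\nabla Q)^2-2i\omega\Lambda Q\big]\Big\} . \tag{2.10} \]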
Here $k=\mathrm{diag}\{1,-1,1,-1\}$, Str denotes the supertrace, and $D$ is the classical diffusion constant. We do not give here a detailed description of the model and mathematical entities involved, which can be found, e.g., in Refs. [6–8,90], and restrict ourselves to a qualitative discussion of the structure of the matrix $Q$. The size 4 of the matrix is due to (i) two types of Green's functions (advanced and retarded) entering the correlation function (2.1), and (ii) the necessity to introduce bosonic and fermionic degrees of freedom to represent these Green's functions in terms of a functional integral. The matrix $Q$ thus consists of four $2\times2$ blocks according to its advanced–retarded structure, each of them being a supermatrix in the boson–fermion space. To proceed further, Efetov [6] neglected spatial variation of the supermatrix field $Q(r)$ and approximated the functional integral in Eq. (2.10) by an integral over a single supermatrix $Q$ (the so-called zero-mode approximation). The resulting integral can be calculated, yielding precisely the Wigner–Dyson distribution:³
\[ R^{U}_{WD}(s) = 1-\frac{\sin^2(\pi s)}{(\pi s)^2} , \tag{2.11} \]
the superscript U standing for the unitary ensemble. The corresponding results for the orthogonal (O) and the symplectic (Sp) ensemble are
\[ R^{O}_{WD}(s) = 1-\frac{\sin^2(\pi s)}{(\pi s)^2}-\Big[\frac{\pi}{2}\,\mathrm{sgn}(s)-\mathrm{Si}(\pi s)\Big]\Big[\frac{\cos\pi s}{\pi s}-\frac{\sin\pi s}{(\pi s)^2}\Big] , \tag{2.12} \]
³ Strictly speaking, the level correlation functions (2.11)–(2.13) contain an additional term $\delta(s)$ corresponding to the "self-correlation" of an energy level. Furthermore, in the symplectic case all the levels are doubly degenerate (Kramers degeneracy). This degeneracy is disregarded in (2.13), which thus represents the correlation function of distinct levels only, normalized to the corresponding level spacing.
\[ R^{Sp}_{WD}(s) = 1-\frac{\sin^2(2\pi s)}{(2\pi s)^2}+\mathrm{Si}(2\pi s)\Big[\frac{\cos 2\pi s}{2\pi s}-\frac{\sin 2\pi s}{(2\pi s)^2}\Big] , \tag{2.13} \]
\[ \mathrm{Si}(x) = \int_0^x \frac{\sin y}{y}\,dy . \]
The aim of Section 2.2 will be to study the deviations of the level correlation function from the universal RMT results (2.11)–(2.13).
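The three Wigner–Dyson functions (2.11)–(2.13) are easy to tabulate numerically. The following is a minimal sketch, not part of the original derivation: the function names `sine_integral` and `r_wigner_dyson` are illustrative, the sine integral is summed from its Maclaurin series (adequate only for arguments up to about 20), and only $s>0$ is handled, so that $\mathrm{sgn}(s)=1$.

```python
import math

def sine_integral(x, terms=80):
    """Si(x) = int_0^x sin(y)/y dy from its Maclaurin series (fine for |x| < ~20)."""
    total = 0.0
    term = x  # holds (-1)^k x^(2k+1) / (2k+1)!
    for k in range(terms):
        total += term / (2 * k + 1)
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def r_wigner_dyson(s, beta):
    """Two-level correlation function for s > 0; beta = 1 (O), 2 (U), 4 (Sp)."""
    # The symplectic formula (2.13) depends on 2*pi*s, the other two on pi*s.
    x = math.pi * s if beta in (1, 2) else 2.0 * math.pi * s
    sinc2 = (math.sin(x) / x) ** 2
    bracket = math.cos(x) / x - math.sin(x) / x ** 2
    if beta == 2:
        return 1.0 - sinc2
    if beta == 1:
        return 1.0 - sinc2 - (math.pi / 2 - sine_integral(x)) * bracket
    if beta == 4:
        return 1.0 - sinc2 + sine_integral(x) * bracket
    raise ValueError("beta must be 1, 2 or 4")

# Level repulsion: R(s) vanishes faster at small s as beta grows.
for beta in (1, 2, 4):
    print(beta, [round(r_wigner_dyson(s, beta), 4) for s in (0.1, 0.5, 1.0, 2.0)])
```

The printed rows make the stronger suppression of $R(s)$ at small $s$ for larger $\beta$ visible directly.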
2.2. Deviations from universality

The procedure we use to calculate deviations from universality is as follows [10]. We first decompose $Q$ into the constant part $Q_0$ and the contribution $\tilde Q$ of higher modes with non-zero momenta. Then we use the renormalization group ideas and integrate out all fast modes. This can be done perturbatively provided the dimensionless (measured in units of $e^2/h$) conductance $g=2\pi E_c/\Delta=2\pi\nu D L^{d-2}\gg1$ (here $E_c=D/L^2$ is the Thouless energy). As a result, we get an integral over the matrix $Q_0$ only, which has to be calculated non-perturbatively. We begin with presenting the correlator $R(s)$ in the form
\[ R(s) = \frac{1}{(2\pi i)^2}\,\frac{\partial^2}{\partial u^2}\int DQ\,\exp\{-S[Q]\}\Big|_{u=0} , \qquad S[Q] = -\frac{1}{t}\int \mathrm{Str}(\nabla Q)^2+\tilde s\int \mathrm{Str}\,\Lambda Q+\tilde u\int \mathrm{Str}\,Q\Lambda k , \tag{2.14} \]
where $1/t=\pi\nu D/4$, $\tilde s=\pi s/2iV$, $\tilde u=\pi u/2iV$. Now we decompose $Q$ in the following way:
\[ Q(r) = T_0^{-1}\,\tilde Q(r)\,T_0 , \tag{2.15} \]
where $T_0$ is a spatially uniform matrix and $\tilde Q$ describes all modes with non-zero momenta. When $\omega\ll E_c$, the matrix $\tilde Q$ fluctuates only weakly near the origin $\Lambda$ of the coset space. In the leading order, $\tilde Q=\Lambda$, thus reducing (2.14) to a zero-dimensional σ-model, which leads to the Wigner–Dyson distribution (2.11). To find the corrections, we should expand the matrix $\tilde Q$ around the origin $\Lambda$:
\[ \tilde Q = \Lambda(1+W/2)(1-W/2)^{-1} = \Lambda\Big(1+W+\frac{W^2}{2}+\frac{W^3}{4}+\ldots\Big) , \tag{2.16} \]
where $W$ is a supermatrix with the following block structure:
\[ W = \begin{pmatrix} 0 & t_{12} \\ t_{21} & 0 \end{pmatrix} . \]
Substituting this expansion into Eq. (2.14), we get
\[ S = S_0+S_1+O(W^3) , \tag{2.17} \]
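\[ S_0 = \int \mathrm{Str}\Big[\frac{1}{t}(\nabla W)^2+\tilde s\,Q_0\Lambda+\tilde u\,Q_0\Lambda k\Big] , \qquad S_1 = \frac12\int \mathrm{Str}\big[\tilde s\,U_0\Lambda W^2+\tilde u\,U_{0k}\Lambda W^2\big] , \tag{2.18} \]
where $Q_0=T_0^{-1}\Lambda T_0$, $U_0=T_0\Lambda T_0^{-1}$, $U_{0k}=T_0\Lambda k T_0^{-1}$. Let us define $S_{\rm eff}[Q_0]$ as the result of elimination of the fast modes:
\[ e^{-S_{\rm eff}[Q_0]} = e^{-S_0[Q_0]}\,\big\langle e^{-S_1[Q_0,W]+\ln J[W]}\big\rangle_W , \tag{2.19} \]
where $\langle\ldots\rangle_W$ denotes the integration over $W$ and $J[W]$ is the Jacobian of the transformation (2.15), (2.16) from the variable $Q$ to $\{Q_0,W\}$ (the Jacobian does not contribute to the leading order correction calculated here, but is important for higher-order calculations [25,91]). Expanding up to the order $W^4$, we get
\[ S_{\rm eff} = S_0+\langle S_1\rangle-\frac12\langle S_1^2\rangle+\frac12\langle S_1\rangle^2+\ldots \tag{2.20} \]
The integral over the fast modes can now be calculated using the Wick theorem and the contraction rules [6,28]:
\[ \langle \mathrm{Str}\,W(r)PW(r')R\rangle = \Pi(r,r')\,(\mathrm{Str}\,P\,\mathrm{Str}\,R-\mathrm{Str}\,P\Lambda\,\mathrm{Str}\,R\Lambda) , \qquad \langle \mathrm{Str}[W(r)P]\,\mathrm{Str}[W(r')R]\rangle = \Pi(r,r')\,\mathrm{Str}(PR-P\Lambda R\Lambda) , \tag{2.21} \]
where $P$ and $R$ are arbitrary supermatrices. The diffusion propagator $\Pi$ is the solution of the diffusion equation
\[ -D\nabla^2\Pi(r_1,r_2) = (\pi\nu)^{-1}\big[\delta(r_1-r_2)-V^{-1}\big] \tag{2.22} \]
with the Neumann boundary condition (normal derivative equal to zero at the sample boundary) and can be presented in the form
\[ \Pi(r,r') = \frac{1}{\pi\nu}\sum_{\mu;\,\epsilon_\mu\neq0}\frac{1}{\epsilon_\mu}\,\phi_\mu(r)\phi_\mu(r') , \tag{2.23} \]
where $\phi_\mu(r)$ are the eigenfunctions of the diffusion operator $-D\nabla^2$ corresponding to the eigenvalues $\epsilon_\mu$ (equal to $Dq^2$ for a rectangular geometry). As a result, we find
\[ \langle S_1\rangle = 0 , \qquad \langle S_1^2\rangle = \frac12\int dr\,dr'\,\Pi^2(r,r')\,\big(\tilde s\,\mathrm{Str}\,Q_0\Lambda+\tilde u\,\mathrm{Str}\,Q_0\Lambda k\big)^2 . \tag{2.24} \]
Substitution of Eq. (2.24) into Eq. (2.20) yields
\[ S_{\rm eff}[Q_0] = \frac{\pi}{2i}\,s\,\mathrm{Str}\,Q_0\Lambda+\frac{\pi}{2i}\,u\,\mathrm{Str}\,Q_0\Lambda k+\frac{\pi^2a_d}{4g^2}\big(s\,\mathrm{Str}\,Q_0\Lambda+u\,\mathrm{Str}\,Q_0\Lambda k\big)^2 , \]
\[ a_d = \frac{g^2}{4V^2}\int dr\,dr'\,\Pi^2(r,r') = \frac{1}{\pi^4}\sum_{\substack{n_1,\ldots,n_d=0\\ n_1^2+\cdots+n_d^2>0}}^{\infty}\frac{1}{(n_1^2+\cdots+n_d^2)^2} . \tag{2.25} \]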
The value of the coefficient $a_d$ depends on the spatial dimensionality $d$ and on the sample geometry; in the last line of Eq. (2.25) we assumed a cubic sample with hard-wall boundary conditions. Then for $d=1,2,3$ we have $a_1=1/90\approx0.0111$, $a_2\approx0.0266$, and $a_3\approx0.0527$ respectively. In the case of a cubic sample with periodic boundary conditions we get instead
\[ a_d = \frac{1}{(2\pi)^4}\sum_{\substack{n_1,\ldots,n_d=-\infty\\ n_1^2+\cdots+n_d^2>0}}^{\infty}\frac{1}{(n_1^2+\cdots+n_d^2)^2} , \tag{2.26} \]
so that $a_1=1/720\approx0.00139$, $a_2\approx0.00387$, and $a_3\approx0.0106$. Note that for $d<4$ the sum in Eqs. (2.25) and (2.26) converges, so that no ultraviolet cut-off is needed. Using now Eq. (2.14) and calculating the remaining integral over the supermatrix $Q_0$, we finally get the following expression for the correlator to the $1/g^2$ order:
\[ R(s) = 1-\frac{\sin^2(\pi s)}{(\pi s)^2}+\frac{4a_d}{g^2}\,\sin^2(\pi s) . \tag{2.27} \]
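The quoted values of $a_d$ in Eqs. (2.25) and (2.26) can be checked by brute-force summation. The sketch below is only an illustration: the function name `a_d` and the cut-off `n_max` are hypothetical, and the sums are simply truncated to a cube of side `n_max`, so the $d=3$ values converge slowly (the neglected tail falls off only as $1/n_{\max}$).

```python
import itertools
import math

def a_d(d, n_max, periodic=False):
    """Truncated lattice sum for a_d: Eq. (2.25) (hard wall) or Eq. (2.26) (periodic)."""
    lo = -n_max if periodic else 0
    total = 0.0
    for n in itertools.product(range(lo, n_max + 1), repeat=d):
        n2 = sum(c * c for c in n)
        if n2 > 0:  # the zero mode n = 0 is excluded from both sums
            total += 1.0 / (n2 * n2)
    prefactor = (2.0 * math.pi) ** 4 if periodic else math.pi ** 4
    return total / prefactor

print("hard wall:", a_d(1, 2000), a_d(2, 300), a_d(3, 60))
print("periodic: ", a_d(1, 2000, True), a_d(2, 150, True), a_d(3, 30, True))
```

For $d=1$ the sums reduce to $\zeta(4)/\pi^4=1/90$ and $2\zeta(4)/(2\pi)^4=1/720$, which the truncated sums reproduce to high accuracy; the $d=3$ outputs sit slightly below the quoted limits because of the truncation.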
The last term in Eq. (2.27) just represents the correction of order $1/g^2$ to the Wigner–Dyson distribution. The formula (2.27) is valid for $s\ll g$. Let us note that the smooth (non-oscillating) part of this correction in the region $1\ll s\ll g$ can be found by using the purely perturbative approach of Altshuler and Shklovskii [9,40]. For $s\gg1$ the leading perturbative contribution to $R(s)$ is given by a two-diffuson diagram:
\[ R_{AS}(s)-1 = \frac{\Delta^2}{2\pi^2}\,\mathrm{Re}\sum_{\substack{q_i=\pi n_i/L\\ n_i=0,1,2,\ldots}}\frac{1}{(Dq^2-i\omega)^2} = \frac{1}{2\pi^2}\,\mathrm{Re}\sum_{n_i\ge0}\frac{1}{[-is+(\pi/2)gn^2]^2} . \tag{2.28} \]
At $s\ll g$ this expression is dominated by the $q=0$ term, with other terms giving a correction of order $1/g^2$:
\[ R_{AS}(s) = 1-\frac{1}{2\pi^2s^2}+\frac{2a_d}{g^2} , \tag{2.29} \]
where $a_d$ was defined in Eq. (2.25). This formula is obtained in the region $1\ll s\ll g$ and is perturbative in both $1/s$ and $1/g$. It does not contain oscillations (which cannot be found perturbatively) and gives no information about the actual small-$s$ behavior of $R(s)$. The result (2.27) is much stronger: it represents the exact (non-perturbative in $1/s$) form of the correction in the whole region $s\ll g$. The important feature of Eq. (2.27) is that it relates corrections to the smooth and oscillatory parts of the level correlation function (represented by the contributions to the last term proportional to unity and to $\cos2\pi s$ respectively). While appearing naturally in the framework of the supersymmetric σ-model, this fact is highly non-trivial from the point of view of semiclassical theory [80,81], which represents the level structure factor $K(\tau)$ (Fourier transform of $R(s)$) in terms of a sum over periodic orbits. The smooth part of $R(s)$ corresponds then to the small-$\tau$ behavior of $K(\tau)$, which is related to the properties of short periodic orbits. On the other hand, the oscillatory part of $R(s)$ is related to the behavior of $K(\tau)$ in the vicinity of the Heisenberg time $\tau=2\pi$ ($t=2\pi/\Delta$ in dimensionful units), and thus to the properties of long periodic orbits.
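As a worked check of how Eq. (2.29) follows from Eq. (2.28), the sum can be split into the $n=0$ term and the rest. This is a sketch under the stated condition $1\ll s\ll g$, with the shorthand $n^2\equiv n_1^2+\cdots+n_d^2$ introduced here for brevity:

```latex
\begin{align*}
n=0:\quad & \frac{1}{2\pi^2}\,\mathrm{Re}\,\frac{1}{(-is)^2} = -\frac{1}{2\pi^2 s^2},\\
n\neq 0,\ s\ll g:\quad & \frac{1}{2\pi^2}\sum_{n^2>0}\frac{1}{[(\pi/2)\,g\,n^2]^2}
  = \frac{1}{2\pi^2}\cdot\frac{4}{\pi^2 g^2}\sum_{n^2>0}\frac{1}{(n^2)^2}
  = \frac{2}{g^2}\cdot\frac{1}{\pi^4}\sum_{n^2>0}\frac{1}{(n^2)^2}
  = \frac{2a_d}{g^2}.
\end{align*}
```

The last equality uses the hard-wall lattice sum of Eq. (2.25); the same value $2a_d/g^2$ is the $s$-average of the last term of Eq. (2.27), since $4\sin^2(\pi s)=2-2\cos 2\pi s$.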
The calculation presented above can be straightforwardly generalized to the other symmetry classes. The result can be presented in a form valid for all three cases:⁴
\[ R^{(\beta)}(s) = \Big(1+\frac{2a_d}{\beta g^2}\,\frac{d^2}{ds^2}\,s^2\Big)R^{(\beta)}_{WD}(s) , \tag{2.30} \]
where $\beta=1\,(2,4)$ for the orthogonal (unitary, symplectic) symmetry; $R^{(\beta)}_{WD}$ denotes the corresponding Wigner–Dyson distribution (2.11)–(2.13). For $s\to0$ the Wigner–Dyson distribution has the following behavior:
\[ R^{(\beta)}_{WD} \simeq c_\beta s^\beta , \quad s\to0 ; \qquad c_1=\frac{\pi^2}{6} , \quad c_2=\frac{\pi^2}{3} , \quad c_4=\frac{(2\pi)^4}{135} . \tag{2.31} \]
As is clear from Eq. (2.30), the correction found does not change the exponent $\beta$, but renormalizes the prefactor $c_\beta$:
\[ R^{(\beta)}(s) = \Big(1+\frac{2(\beta+2)(\beta+1)}{\beta}\,\frac{a_d}{g^2}\Big)c_\beta s^\beta , \quad s\to0 . \tag{2.32} \]
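Two short checks of Eq. (2.30) may be useful here; they are a sketch that reads the operator in Eq. (2.30) as $R_{WD}+\frac{2a_d}{\beta g^2}\frac{d^2}{ds^2}\big[s^2R_{WD}\big]$. The first line recovers Eq. (2.27) for $\beta=2$, the last recovers Eq. (2.32):

```latex
\begin{align*}
\beta=2:\quad & s^2 R^{U}_{WD}(s) = s^2-\frac{\sin^2(\pi s)}{\pi^2},\qquad
\frac{d^2}{ds^2}\Big[s^2 R^{U}_{WD}(s)\Big] = 2-2\cos(2\pi s) = 4\sin^2(\pi s),\\
& R^{(2)}(s) = R^{U}_{WD}(s)+\frac{2a_d}{2g^2}\cdot 4\sin^2(\pi s)
  = 1-\frac{\sin^2(\pi s)}{(\pi s)^2}+\frac{4a_d}{g^2}\sin^2(\pi s),\\
s\to 0:\quad & \frac{d^2}{ds^2}\Big[s^2\,c_\beta s^\beta\Big] = (\beta+2)(\beta+1)\,c_\beta s^\beta
 \;\Longrightarrow\;
 R^{(\beta)}(s)\simeq\Big[1+\frac{2(\beta+2)(\beta+1)}{\beta}\,\frac{a_d}{g^2}\Big]c_\beta s^\beta .
\end{align*}
```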
The correction to $c_\beta$ is positive, which means that the level repulsion becomes weaker. This is related to a tendency of eigenfunctions to localization with decreasing $g$. What is the behavior of the level correlation function in its high-frequency tail s
(2.33)
(note that in 2D the coefficient of the term (2.33) vanishes, and the result for $R_{AS}$ is smaller by an additional factor $1/g$, see [44]). What is the fate of the oscillations in $R(s)$ in this regime? The answer to this question was given by Andreev and Altshuler [11], who calculated $R(s)$ using the stationary-point method for the σ-model integral (2.10). Their crucial observation was that on top of the trivial stationary point $Q=\Lambda$ (expansion around which is just the usual perturbation theory), there exists another one, $Q=k\Lambda$, whose vicinity generates the oscillatory part of $R(s)$. (In the case of symplectic symmetry there exists an additional family of stationary points, see [11].) The saddle-point approximation of Andreev and Altshuler is valid for $s\gg1$; at $1\ll s\ll g$ it reproduces the above results of Ref. [10] (we recall that the method of [10] works for all $s\ll g$). The result of [11] has the following form:
\[ R^{U}_{\rm osc}(s) = \frac{\cos2\pi s}{2\pi^2}\,D(s) , \tag{2.34} \]
⁴ For all the ensembles, we denote by $g$ the conductance per one spin projection: $g=2\pi\nu DL^{d-2}$, without multiplication by the factor 2 due to the spin.
\[ R^{O}_{\rm osc}(s) = \frac{\cos2\pi s}{2\pi^4}\,D^2(s) , \tag{2.35} \]
\[ R^{Sp}_{\rm osc}(s) = \frac{\cos2\pi s}{4}\,D^{1/2}(s)+\frac{\cos4\pi s}{32\pi^4}\,D^2(s) , \tag{2.36} \]
where $D(s)$ is the spectral determinant
\[ D(s) = \frac{1}{s^2}\prod_{\mu}\Big(1+\frac{s^2\Delta^2}{\epsilon_\mu^2}\Big)^{-1} . \tag{2.37} \]
The product in Eq. (2.37) goes over the non-zero eigenvalues $\epsilon_\mu$ of the diffusion operator (which are equal to $Dq^2$ for the cubic geometry). This demonstrates again the relation between $R_{\rm osc}(s)$ and the perturbative part (2.28), which can also be expressed through $D(s)$,
\[ R^{(\beta)}_{AS}(s)-1 = -\frac{1}{2\beta\pi^2}\,\frac{\partial^2\ln D(s)}{\partial s^2} . \tag{2.38} \]
In the high-frequency region s
\[ D(s) \sim \exp\Big\{-\frac{\pi}{\Gamma(d/2)\,d\,\sin(\pi d/4)}\Big(\frac{2s}{g}\Big)^{d/2}\Big\} , \tag{2.39} \]
so that the amplitude of the oscillations vanishes exponentially with $s$ in this region. Taken together, the results of [10,11] provide a complete description of the deviations of the level correlation function from universality in the metallic regime $g\gg1$. They show that in the whole region of frequencies these deviations are controlled by the classical (diffusion) operator governing the dynamics in the corresponding classical system.
3. Statistics of eigenfunctions

3.1. Eigenfunction statistics in terms of the supersymmetric σ-model

Within the RMT, the distribution of eigenfunction amplitudes is simply Gaussian, leading to the $\chi^2$ distribution of the "intensities" $y_i=N|\psi_i^2|$ (we normalized $y_i$ in such a way that $\langle y\rangle=1$) [12]:
\[ P^{U}(y) = e^{-y} , \tag{3.1} \]
\[ P^{O}(y) = \frac{e^{-y/2}}{\sqrt{2\pi y}} . \tag{3.2} \]
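The moments of the two distributions (3.1) and (3.2) follow from Gaussian statistics of the amplitudes: $\langle y^q\rangle=q!$ in the unitary case and $\langle y^q\rangle=(2q-1)!!$ in the orthogonal case. The sketch below checks this by simple midpoint quadrature; the function names, integration limits and step counts are illustrative choices, and the substitution $y=x^2$ is used in the orthogonal case to remove the integrable $y^{-1/2}$ singularity.

```python
import math

def moment_unitary(q, upper=60.0, steps=50000):
    """<y^q> for P_U(y) = exp(-y), Eq. (3.1), by the midpoint rule."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        y = (j + 0.5) * h
        total += y ** q * math.exp(-y)
    return total * h

def moment_orthogonal(q, upper=12.0, steps=50000):
    """<y^q> for P_O(y) = exp(-y/2)/sqrt(2*pi*y), Eq. (3.2).

    With y = x^2 the integral becomes 2/sqrt(2*pi) * int_0^inf x^(2q) exp(-x^2/2) dx,
    i.e. a moment of a Gaussian amplitude x.
    """
    h = upper / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += x ** (2 * q) * math.exp(-0.5 * x * x)
    return 2.0 / math.sqrt(2.0 * math.pi) * total * h

for q in range(4):
    print(q, round(moment_unitary(q), 5), round(moment_orthogonal(q), 5))
```

Both distributions are normalized with $\langle y\rangle=1$; the faster growth of the orthogonal moments reflects the broader Porter–Thomas form.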
Eq. (3.2) is known as the Porter–Thomas distribution; it was originally introduced to describe fluctuations of widths and heights of resonances in nuclear spectra [12]. Recently, interest in properties of eigenfunctions in disordered and chaotic systems has started to grow. On the experimental side, it was motivated by the possibility of fabrication of small systems (quantum dots) with well resolved electron energy levels [92,93,82]. Fluctuations in the tunneling
conductance of such a dot measured in recent experiments [94,95] are related to statistical properties of wave function amplitudes [13,96–98]. When the electron–electron Coulomb interaction is taken into account, the eigenfunction fluctuations determine the statistics of matrix elements of the interaction, which is in turn important for understanding the properties of excitation and addition spectra of the dot [99,100,84]. Furthermore, the microwave cavity technique allows one to observe experimentally spatial fluctuations of the wave amplitude in chaotic and disordered cavities [13–15]. Theoretical study of the eigenfunction statistics in a $d$-dimensional disordered system is again possible with use of the supersymmetry method [17–20]. The distribution function of the eigenfunction intensity $u=|\psi^2(r_0)|$ at a point $r_0$ is defined as
\[ P(u) = \frac{1}{\nu V}\Big\langle\sum_\alpha\delta\big(|\psi_\alpha(r_0)|^2-u\big)\,\delta(E-E_\alpha)\Big\rangle . \tag{3.3} \]
The moments of $P(u)$ can be written through the Green's functions in the following way:
\[ \langle|\psi(r_0)|^{2q}\rangle = \frac{i^{q-2}}{2\pi\nu V}\lim_{\eta\to0}(2\eta)^{q-1}\big\langle G_R^{q-1}(r_0,r_0)\,G_A(r_0,r_0)\big\rangle . \tag{3.4} \]
The product of the Green's functions can be expressed in terms of the integral over a supervector field $\Phi=(S_1,\chi_1,S_2,\chi_2)$,
\[ G_R^{q-1}(r_0,r_0)\,G_A(r_0,r_0) = \frac{i^{2-q}}{(q-1)!}\int D\Phi\,D\Phi^\dagger\,\big(S_1(r_0)S_1^*(r_0)\big)^{q-1}S_2(r_0)S_2^*(r_0)\,\exp\Big\{i\int dr'\,\Phi^\dagger(r')\Lambda^{1/2}(E+i\eta\Lambda-\hat H)\Lambda^{1/2}\Phi(r')\Big\} . \tag{3.5} \]
Proceeding now in the same way as in the case of the level correlation function (Section 2.1), we represent the r.h.s. of Eq. (3.5) in terms of a σ-model correlation function. As a result, we find⁵
\[ \langle|\psi(r_0)|^{2q}\rangle = -\frac{q}{2V}\lim_{\eta\to0}(2\pi\nu\eta)^{q-1}\int DQ\,Q^{q-1}_{11,bb}\,Q_{22,bb}\,e^{-S[Q]} , \tag{3.6} \]
where $S[Q]$ is the σ-model action,
\[ S[Q] = -\frac{\beta}{2}\int d^dr\,\mathrm{Str}\Big[\frac{\pi\nu D}{4}(\nabla Q)^2-\pi\nu\eta\Lambda Q\Big] \tag{3.7} \]
($\beta=2$ for the considered case of the unitary symmetry). Let us define now the function $Y(Q_0)$ as
\[ Y(Q_0) = \int_{Q(r_0)=Q_0}DQ(r)\,\exp\{-S[Q]\} . \tag{3.8} \]

⁵ The first two indices of $Q$ correspond to the advanced–retarded and the last two to the boson–fermion decomposition.
Here $r_0$ is the spatial point at which the statistics of eigenfunction amplitudes is studied. For invariance reasons, the function $Y(Q_0)$ turns out to depend, in the unitary symmetry case, only on the two scalar variables $1\le\lambda_1<\infty$ and $-1\le\lambda_2\le1$, which are the eigenvalues of the "retarded–retarded" block of the matrix $Q_0$. Moreover, in the limit $\eta\to0$ (at a fixed value of the system volume) only the dependence on $\lambda_1$ persists:
\[ Y(Q_0) \equiv Y(\lambda_1,\lambda_2) \to Y_a(2\pi\nu\eta\lambda_1) . \tag{3.9} \]
With this definition, Eq. (3.6) takes the form of an integral over the single matrix $Q_0$,
\[ \langle|\psi(r_0)|^{2q}\rangle = -\frac{q}{2V}\lim_{\eta\to0}(2\pi\nu\eta)^{q-1}\int DQ_0\,Q^{q-1}_{0;11,bb}\,Q_{0;22,bb}\,Y(Q_0) . \tag{3.10} \]
Evaluating this integral, we find
\[ \langle|\psi(r_0)|^{2q}\rangle = \frac{1}{V}\,q(q-1)\int du\,u^{q-2}\,Y_a(u) . \tag{3.11} \]
Consequently, the distribution function of the eigenfunction intensity is given by [17]
\[ P(u) = \frac{1}{V}\,\frac{d^2}{du^2}\,Y_a(u) \quad (U) , \tag{3.12} \]
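where $V$ is the sample volume. In the case of the orthogonal symmetry, $Y(Q_0)\equiv Y(\lambda_1,\lambda_2,\lambda)$, where $1\le\lambda_1,\lambda_2<\infty$ and $-1\le\lambda\le1$. In the limit $\eta\to0$, the relevant region of values is $\lambda_1\gg\lambda_2,\lambda$, where
\[ Y(Q_0) \to Y_a(\pi\nu\eta\lambda_1) . \tag{3.13} \]
The distribution of eigenfunction intensities is expressed in this case through the function $Y_a$ as follows [17]:
\[ P(u) = \frac{1}{\pi Vu^{1/2}}\int_{u/2}^{\infty}dz\,(2z-u)^{-1/2}\,\frac{d^2}{dz^2}\,Y_a(z) = \frac{2\sqrt2}{\pi Vu^{1/2}}\,\frac{d^2}{du^2}\int_0^\infty\frac{dz}{z^{1/2}}\,Y_a(z+u/2) \quad (O) . \tag{3.14} \]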
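In the zero-mode approximation,
\[ Y_a(z) \approx e^{-Vz} \quad (O,\,U) , \tag{3.15} \]
and consequently,
\[ P(u) \approx Ve^{-uV} \quad (U) , \tag{3.16} \]
\[ P(u) \approx \sqrt{\frac{V}{2\pi u}}\;e^{-uV/2} \quad (O) , \tag{3.17} \]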
which are just the RMT results for the Gaussian Unitary Ensemble (GUE) and Gaussian Orthogonal Ensemble (GOE) respectively, Eqs. (3.1) and (3.2). Therefore, as in the case of the level correlations, the zero-mode approximation yields the RMT results for the distribution of the eigenfunction amplitudes. To calculate deviations from RMT, one has to go beyond the zero-mode approximation and to evaluate the function $Y_a(z)$ determined by Eqs. (3.8) and (3.9) for a $d$-dimensional diffusive system. In the case of a quasi-1D geometry this can be done exactly via the transfer-matrix method, see Section 3.2. For higher $d$, the exact solution is not possible, and one should rely on approximate methods. Corrections to the "main body" of the distribution can be found by treating the non-zero modes perturbatively, while the asymptotic "tail" can be found via a saddle-point method (see Sections 3.3 and 4). Let us note that the formulas (3.8), (3.9) can be written in a slightly different, but completely equivalent form [32,33]. Making in (3.8) the transformation
\[ Q(r) \to \tilde Q(r) = V^{-1}(r_0)\,Q(r)\,V(r_0) , \]
where the matrix $V(r_0)$ is defined from $Q(r_0)=V(r_0)\Lambda V^{-1}(r_0)$, one gets in the unitary case
\[ Y_a(u) = \int_{Q(r_0)=\Lambda}DQ\,\exp\Big\{\int d^dr\,\mathrm{Str}\Big[\frac{\pi\nu D}{4}(\nabla Q)^2-\frac{u}{2}\,\Lambda QP_{bb}\Big]\Big\} , \tag{3.18} \]
where $P_{bb}$ denotes the projector onto the boson–boson sector, and a similar formula in the orthogonal case. The above derivation can be extended to a more general correlation function representing a product of eigenfunction amplitudes at different points,
\[ C_{\{q\}}(r_1,\ldots,r_k) = \frac{1}{\nu V}\Big\langle\sum_\alpha|\psi_\alpha^{2q_1}(r_1)|\,|\psi_\alpha^{2q_2}(r_2)|\cdots|\psi_\alpha^{2q_k}(r_k)|\,\delta(E-E_\alpha)\Big\rangle . \tag{3.19} \]
If all the points $r_i$ are separated by sufficiently large distances (much larger than the mean free path $l$), one finds for the unitary ensemble [18]
\[ C_{\{q\}}(r_1,\ldots,r_k) = -\frac{1}{2V}\,\frac{q_1!\,q_2!\cdots q_k!}{(q_1+q_2+\cdots+q_k-1)!}\,\lim_{\eta\to0}(2\pi\nu\eta)^{q_1+\cdots+q_k-1}\int DQ\,Q^{q_1-1}_{11,bb}(r_1)\,Q_{22,bb}(r_1)\,Q^{q_2}_{11,bb}(r_2)\cdots Q^{q_k}_{11,bb}(r_k)\,e^{-S[Q]} . \tag{3.20} \]
In the case of the quasi-1D system one can again evaluate Eq. (3.20) via the transfer-matrix method, while in higher $d$ one has to use approximate schemes. The correlation functions of the type (3.19) appear, in particular, when one calculates the distribution of the inverse participation ratio (IPR) $P_2=\int d^dr\,|\psi_\alpha^4(r)|$, the moments of which are given by Eq. (3.19) with $q_1=q_2=\cdots=q_k=2$. We will discuss the IPR distribution function below in Sections 3.2.4 and 3.3.3. The case of $k=2$ in Eq. (3.19) corresponds to the correlations of the amplitudes of an eigenfunction at two different points; we will discuss such correlations in Sections 3.3.3 and 4.1.1 (where they will describe the shape of an anomalously localized state).
3.2. Quasi-one-dimensional geometry

3.2.1. Exact solution of the σ-model

In the case of quasi-1D geometry an exact solution of the σ-model is possible due to the transfer-matrix method. The idea of the method, quite general for one-dimensional problems, is to reduce the functional integral (3.8) or (3.20) to the solution of a differential equation. This is completely analogous to constructing the Schrödinger equation from the quantum-mechanical Feynman path integral. In the present case, the role of the time is played by the coordinate along the wire, while the role of the particle coordinate is played by the supermatrix $Q$. In general, at a finite value of the frequency $\eta$ in Eq. (3.7) (more precisely, $\eta$ plays the role of an imaginary frequency), the corresponding differential equation is too complicated and cannot be solved analytically [6]. However, a simplification appearing in the limit $\eta\to0$, when only the non-compact variable $\lambda_1$ survives, allows one to find an analytical solution [18] of the 1D σ-model.⁶ There are several different microscopic models which can be mapped onto the 1D supermatrix σ-model. First of all, this is a model of a particle in a random potential (discussed above) in the case of a quasi-1D sample geometry. Then one can neglect transverse variation of the $Q$-field in the σ-model action, thus reducing it to the 1D form [101,6]. Secondly, this is the random banded matrix (RBM) model [102,17,18], which is relevant to various problems in the field of quantum chaos [103,104]. In particular, the evolution operator of a kicked rotor (a paradigmatic model of a periodically driven quantum system) has the structure of a quasi-random banded matrix, which makes this system belong to the "quasi-1D universality class" [18,105]. Finally, the Iida–Weidenmüller–Zuk random matrix model [63] of the transport in a disordered wire (see Section 6 for more detail) can also be mapped onto the 1D σ-model.
The result for the function $Y_a(u)$ determining the distribution of the eigenfunction intensity $u=|\psi^2(r_0)|$ reads (for the unitary symmetry)
\[ Y_a(u) = W^{(1)}(uA\xi,\tau_+)\,W^{(1)}(uA\xi,\tau_-) . \tag{3.21} \]
Here $A$ is the wire cross-section, $\xi=2\pi\nu DA$ the localization length, $\tau_+=L_+/\xi$, $\tau_-=L_-/\xi$, with $L_+$, $L_-$ being the distances from the observation point $r_0$ to the right and left edges of the sample. For the orthogonal symmetry, $\xi$ is replaced by $\xi/2$. The function $W^{(1)}(z,\tau)$ satisfies the equation
\[ \frac{\partial W^{(1)}(z,\tau)}{\partial\tau} = \Big(z^2\frac{\partial^2}{\partial z^2}-z\Big)W^{(1)}(z,\tau) \tag{3.22} \]
and the boundary condition
\[ W^{(1)}(z,0) = 1 . \tag{3.23} \]
The solution to Eqs. (3.22) and (3.23) can be found in terms of the expansion in eigenfunctions of the operator $z^2\partial^2/\partial z^2-z$. The functions $2z^{1/2}K_{i\mu}(2z^{1/2})$, with $K_\nu(x)$ being the modified Bessel function

⁶ Let us stress that we consider a sample with hard-wall (not periodic) boundary conditions in the longitudinal direction, i.e. a wire with two ends (not a ring).
(Macdonald function), form the proper basis for such an expansion [106], which is known as the Lebedev–Kontorovich expansion; the corresponding eigenvalues are $-(1+\mu^2)/4$. The result is
\[ W^{(1)}(z,\tau) = 2z^{1/2}\Big\{K_1(2z^{1/2})+\frac{2}{\pi}\int_0^\infty d\mu\,\frac{\mu}{1+\mu^2}\,\sinh\frac{\pi\mu}{2}\,K_{i\mu}(2z^{1/2})\,e^{-\frac{1+\mu^2}{4}\tau}\Big\} . \tag{3.24} \]
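The eigenfunction property used in this expansion can be verified numerically. The sketch below is an illustration only: the helper names are hypothetical, the Macdonald function is evaluated from the standard integral representation $K_\nu(x)=\int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$ (with $\cosh\to\cos$ for imaginary order) by the trapezoidal rule, and the operator $z^2\partial_z^2-z$ is applied by a central finite difference.

```python
import math

def bessel_k(x, order, imaginary=False, t_max=12.0, steps=6000):
    """K_order(x), or K_{i*order}(x) if imaginary=True, for x > 0.

    Uses K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt with the trapezoidal rule;
    for nu = i*mu the weight cosh(nu t) becomes cos(mu t).
    """
    weight = math.cos if imaginary else math.cosh
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # endpoint t = 0, where the weight equals 1
    for j in range(1, steps + 1):
        t = j * h
        total += math.exp(-x * math.cosh(t)) * weight(order * t)
    return h * total

def kl_basis(z, mu):
    """Basis function 2 z^(1/2) K_{i mu}(2 z^(1/2))."""
    x = 2.0 * math.sqrt(z)
    return x * bessel_k(x, mu, imaginary=True)

def zero_mode(z):
    """The tau-independent term 2 z^(1/2) K_1(2 z^(1/2)) of Eq. (3.24)."""
    x = 2.0 * math.sqrt(z)
    return x * bessel_k(x, 1.0)

def apply_operator(f, z, h=1e-3):
    """Apply z^2 d^2/dz^2 - z to f at the point z (central difference)."""
    d2 = (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)
    return z * z * d2 - z * f(z)

z, mu = 0.7, 1.5
f = lambda y: kl_basis(y, mu)
print(apply_operator(f, z) / f(z), -(1.0 + mu ** 2) / 4.0)  # the two numbers should agree
print(apply_operator(zero_mode, z))                         # should be close to zero
```

The second check shows why the first term in the braces of Eq. (3.24) survives at $\tau\to\infty$: it is annihilated by the operator on the right-hand side of Eq. (3.22).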
The formulas (3.12), (3.14), (3.21) and (3.24) therefore give the exact solution for the eigenfunction statistics for an arbitrary value of the parameter $X=L/\xi$ (ratio of the total system length $L=L_++L_-$ to the localization length). The form of the distribution function $P(u)$ is essentially different in the metallic regime $X\ll1$ (in this case $X=1/g$) and in the insulating one $X\gg1$. We will discuss these two limiting cases below, in Sections 3.2.3 and 3.2.4 respectively.

3.2.2. Global statistics of eigenfunctions

The multipoint correlation functions (3.19) determining the global statistics of eigenfunctions can also be computed in a similar way. Let us first assume that the points $r_i$ lie sufficiently far from each other, $|r_i-r_j|$
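\[ \cdots \times \int_0^\infty dz\, z^{q_k-2}\, W^{(1)}\Big(z;\, X - \sum_{s=1}^{k}\tau_s\Big)\, W^{(k)}(z; \tau_1,\ldots,\tau_k) , \tag{3.25} \]
where the functions $W^{(s)}(z;\tau_1,\ldots,\tau_s)$ are defined by the equation (identical to Eq. (3.22))
\[ \frac{\partial W^{(s)}(z;\tau_1,\ldots,\tau_s)}{\partial \tau_s} = \left(z^2\frac{\partial^2}{\partial z^2} - z\right) W^{(s)}(z;\tau_1,\ldots,\tau_s) \tag{3.26} \]
and the boundary conditions
\[ W^{(s)}(z;\tau_1,\tau_2,\ldots,\tau_{s-1},0) = z^{q_{s-1}}\, W^{(s-1)}(z;\tau_1,\tau_2,\ldots,\tau_{s-1}) . \tag{3.27} \]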
Solving these equations consecutively via the Lebedev–Kontorovich transformation, one can find all the correlation functions
\[ \hat H = -\partial_\theta^2 + e^{\theta} + \frac14 \tag{3.28} \]
with the boundary conditions
\[ \tilde W^{(s)}(\theta;\tau_1,\ldots,\tau_{s-1},0) = e^{q_{s-1}\theta}\, \tilde W^{(s-1)}(\theta;\tau_1,\ldots,\tau_{s-1}) , \qquad \tilde W^{(1)}(\theta;0) = e^{-\theta/2} . \tag{3.29} \]
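The form of the Hamiltonian (3.28) can be traced back to Eq. (3.26) by a change of variables. The sketch below assumes the substitution $z = e^{\theta}$, $W^{(s)} = e^{\theta/2}\tilde W^{(s)}$; this substitution is inferred here from the boundary conditions (3.23) and (3.29) rather than quoted from the text:

```latex
\begin{align*}
z\frac{\partial}{\partial z} = \frac{\partial}{\partial\theta}
\quad&\Longrightarrow\quad
z^2\frac{\partial^2}{\partial z^2} - z
  = \partial_\theta^2 - \partial_\theta - e^{\theta}, \\
e^{-\theta/2}\big(\partial_\theta^2 - \partial_\theta - e^{\theta}\big)e^{\theta/2}
  &= \big(\partial_\theta + \tfrac12\big)^2 - \big(\partial_\theta + \tfrac12\big) - e^{\theta}
   = \partial_\theta^2 - \tfrac14 - e^{\theta} = -\hat H , \\
\frac{\partial \tilde W^{(s)}}{\partial \tau_s} = -\hat H\,\tilde W^{(s)},
\qquad
W^{(1)}(z,0) = 1 \quad&\Longrightarrow\quad \tilde W^{(1)}(\theta;0) = e^{-\theta/2} .
\end{align*}
```

Under this assumption the factor $z^{q_{s-1}}$ of (3.27) becomes $e^{q_{s-1}\theta}$, in agreement with (3.29), and the spectrum $(1+k^2)/4$ of $\hat H$ matches the eigenvalues of the Lebedev–Kontorovich basis.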
This allows one to rewrite Eq. (3.25) as a matrix element,
\[ q_1! \cdots q_k!\, \cdots \tag{3.30} \]
(these transformations are completely analogous to those performed by Kolokolov in [107], where the eigenfunction statistics in the strictly 1D case was studied). Furthermore, the matrix element can be represented as a Feynman path integral,
\[ q_1! \cdots q_k!\, \cdots \times e^{q_1\theta(t_1) + q_2\theta(t_2) + \cdots + q_k\theta(t_k)} , \tag{3.31} \]
with $N$ being the normalization constant, $N^{-1} = \int D\theta\, \exp\{-\frac14\int dt\, \dot\theta^2\}$. The quantum mechanics defined by the Hamiltonian (3.28) (or, equivalently, by the path integral (3.31)) is known as Liouville quantum mechanics [108,109]; the corresponding spectral expansion is obviously equivalent to the Lebedev–Kontorovich expansion. Inserting here the decomposition of unity, $1 = \int dw\, \delta(X^{-1}\int dt\, e^{\theta} - w)$ and making a shift $\theta \to \theta + \ln w$, we get
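\[ \Gamma^{\{q\}}(r_1,\ldots,r_k) = \frac{q_1!\, q_2! \cdots q_k!}{V^{q_1+\cdots+q_k}}\; e^{-X/4} N \int D\theta(t)\, e^{-(\theta(0)+\theta(X))/2} \exp\Big\{-\frac14\int dt\, \dot\theta^2\Big\}\, e^{q_1\theta(t_1)+\cdots+q_k\theta(t_k)}\, \delta\Big(X^{-1}\int dt\, e^{\theta} - 1\Big) . \tag{3.32} \]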
According to (3.32), the eigenfunction intensity can be written as a product
\[ \psi(r) = \Phi(r)\Psi(t) , \tag{3.33} \]
where $\Phi(r)$ is a quickly fluctuating (in space) function, which has the Gaussian Ensemble statistics, $\langle|\Phi^{2q}|\rangle = q!/V^q$,
\[ P\{\theta(t) = \ln\Psi^2(t)\} = N e^{-X/4}\, e^{-(\theta(0)+\theta(X))/2} \exp\Big\{-\frac14\int dt\, \dot\theta^2\Big\} . \tag{3.34} \]
The above calculation can be repeated for the case when some of the points $r_i$ lie closer than $l$ to each other. The result (3.33), (3.34) is reproduced also in this case, with the function $\Phi(r)$ having the ideal metal statistics given by the zero-dimensional $\sigma$-model. This statistics [110–112] is Gaussian and is determined by the (short-range) correlation function
\[ V\langle\Phi^*(r)\Phi(r')\rangle = k_d^{1/2}(|r-r'|) , \tag{3.35} \]
see Eq. (3.71) below. The physics of these results is as follows. The short-range fluctuations of the wave function (described by the function $\Phi(r)$) have the same origin as in a strongly chaotic system, where superposition of plane waves with random amplitudes and phases leads to the Gaussian fluctuations of eigenfunctions with the correlation function (3.35) and, in particular, to the RMT statistics of the local amplitude, $\langle|\Phi^{2q}|\rangle = q!/V^q$.
\[ P\{\theta(x)\} = \exp\Big\{-\frac{\pi\nu AD}{2}\int dx\, \Big(\frac{d\theta}{dx}\Big)^2\Big\} . \tag{3.36} \]
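The coefficient in the exponent of (3.36) follows from the kinetic term of (3.34) once the dimensional coordinate is restored. A short sketch, assuming the dimensionless coordinate is $t = x/\xi$ with $\xi = 2\pi\nu AD$ (the unitary-symmetry value quoted after Eq. (3.21)):

```latex
\begin{align*}
\frac14\int dt\,\dot\theta^{2}
  = \frac14\int \frac{dx}{\xi}\,\Big(\xi\,\frac{d\theta}{dx}\Big)^{2}
  = \frac{\xi}{4}\int dx\,\Big(\frac{d\theta}{dx}\Big)^{2}
  = \frac{\pi\nu AD}{2}\int dx\,\Big(\frac{d\theta}{dx}\Big)^{2} .
\end{align*}
```

The remaining factors of (3.34), $e^{-X/4}$ and the boundary term, do not depend on the bulk shape of $\theta(x)$ and are not reproduced in this step.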
We will see in Section 4, while studying the statistics of anomalously localized states in $d \ge 1$ dimensions, that the probability of appearance in a metallic sample of such a rare state with an envelope $e^{\theta(r)}$ is given (within the exponential accuracy) by the $d$-dimensional generalization of (3.36) (see, in particular, Eqs. (4.10) and (4.77)). Finally, we compare the eigenfunction statistics in the quasi-1D case with that in a strictly 1D disordered system. In the latter case, the eigenfunction can be written as
\[ \psi_{1D}(x) = \sqrt{\frac{2}{L}}\, \cos(kx+\delta)\, \Psi(x) , \tag{3.37} \]
where $\Psi(x)$ is a smooth envelope function. The local statistics of $\psi_{1D}(x)$ (i.e. the moments (3.4)) was studied in [113], while the global statistics (the correlation functions of the type (3.19)) in [107]. Comparing the results for the quasi-1D and 1D systems, we find that the statistics of the smooth envelopes $\Psi$ is exactly the same in the two cases, for a given value of the ratio of the system length $L$ to the localization length (equal to $\beta\pi\nu AD$ in quasi-1D and to the mean free path $l$ in 1D). In particular, the moments $\Gamma^{(q)}(r) = \langle|\psi^{2q}(r)|\rangle$ are found to be related as
\[ A^q\, \Gamma^{(q)}_{Q1D} = \frac{(q!)^2}{(2q-1)!!}\, \Gamma^{(q)}_{1D} , \tag{3.38} \]
where the factor $(q!)^2/(2q-1)!!$ represents precisely the ratio of the GUE moments, $\langle|\Phi^{2q}|\rangle = q!/V^q$
\[ P_2 = \int dr\, |\psi(r)|^4 , \tag{3.39} \]
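is identical in the 1D [114,115] and quasi-1D [18,22] cases (the form of this distribution in the localized limit $L/\xi \gg 1$ is explicitly given in Section 3.2.4 below; for arbitrary $L/\xi$ the result is very cumbersome [115]).

3.2.3. Short wire

In the case of a short wire, $X = 1/g \ll 1$, Eqs. (3.12), (3.14), (3.21) and (3.24) yield [17,18,36]
\[ P^{(U)}(y) = e^{-y}\Big[1 + \frac{\alpha X}{6}(2 - 4y + y^2) + \cdots\Big] , \quad y \lesssim X^{-1/2} , \tag{3.40} \]
\[ P^{(O)}(y) = \frac{e^{-y/2}}{\sqrt{2\pi y}}\Big[1 + \frac{\alpha X}{6}\Big(\frac32 - 3y + \frac{y^2}{2}\Big) + \cdots\Big] , \quad y \lesssim X^{-1/2} , \tag{3.41} \]
\[ P^{(U)}(y) = \exp\Big\{-y + \frac{\alpha}{6}y^2X + \cdots\Big\} , \quad X^{-1/2} \lesssim y \lesssim X^{-1} , \tag{3.42} \]
\[ P^{(O)}(y) = \frac{1}{\sqrt{2\pi y}}\exp\Big\{\frac12\Big[-y + \frac{\alpha}{6}y^2X + \cdots\Big]\Big\} , \quad X^{-1/2} \lesssim y \lesssim X^{-1} , \tag{3.43} \]
\[ P(y) \sim \exp\big[-2\beta\sqrt{y/X}\big] , \quad y \gtrsim X^{-1} \tag{3.44} \]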
(a more accurate formula for the far "tail" (3.44) can be found in Section 4.2.1, Eq. (4.74)). Here the coefficient $\alpha$ is equal to $\alpha = 2[1 - 3L_-L_+/L^2]$. We see that there exist three different regimes of the behavior of the distribution function. For not too large amplitudes $y$, Eqs. (3.40) and (3.41) are just the RMT results with relatively small corrections. In the intermediate range (3.42), (3.43) the correction in the exponent is small compared to the leading term but much larger than unity, so that $P(y)$
(3.45)
\[ P^{(O)}(u) \simeq \frac{2\xi^2A}{L}\, \frac{K_1(2\sqrt{uA\xi})}{\sqrt{uA\xi}} , \tag{3.46} \]
with $\xi = 2\pi\nu AD$ as before. Note that in this case the natural variable is not $y = uV$, but rather $uA\xi$, since typical intensity of a localized wave function is $u \sim 1/A\xi$ in contrast to $u \sim 1/V$ for a delocalized one. The asymptotic behavior of Eqs. (3.45) and (3.46) at $u \gg 1/A\xi$ has precisely the same form,
\[ P(u) \sim \exp\big(-2\beta\sqrt{uA\xi}\big) , \tag{3.47} \]
as in the region of very large amplitude in the metallic sample, Eq. (3.44). On this basis, it was conjectured in [18] that the asymptotic behavior (3.44) is controlled by the probability to have a quasi-localized eigenstate with an effective spatial extent much less than $\xi$ ("anomalously localized state"). This conjecture was proven rigorously in [36] where the shape of the anomalously localized state (ALS) responsible for the large-$u$ asymptotics was calculated via the transfer-matrix method. We will discuss this in Section 4 devoted to ALS and to asymptotics of different distribution functions. Distribution of the inverse participation ratio (IPR) is also found to have a simple form in the limit $L \gg \xi$ [22,25]:
\[ \begin{aligned} P(z) &= 2\pi^2\sum_{k=1}^{\infty}(2\pi^2zk^4 - 3k^2)\, e^{-\pi^2k^2z} \\ &= \frac{4}{\sqrt\pi}\frac{\partial}{\partial z}\Big\{z^{-3/2}\sum_{k=1}^{\infty}k^2e^{-k^2/z}\Big\} , \end{aligned} \tag{3.48} \]
where $z = \pi\nu DA^2P_2$ in the unitary case and $z = (\pi\nu DA^2/3)P_2$ in the orthogonal case. (The second line in (3.48) can be obtained from the first one by using the Poisson summation formula.) Therefore, the spatial extent of a localized eigenfunction measured by IPR fluctuates strongly (of order of 100%) from one eigenfunction to another. More precisely, the ratio of the r.m.s. deviation of IPR to its mean value is equal to $1/\sqrt5$ according to Eq. (3.48). The first form of Eq. (3.48) is more suitable for extracting the asymptotic behavior of $P(z)$ at $z \gg 1$, whereas the second line gives us the leading behavior of $P(z)$ at small $z \ll 1$:
\[ P(z) = \begin{cases} 4\pi^4z\, e^{-\pi^2z} , & z \gg 1 , \\ 4\pi^{-1/2}z^{-7/2}e^{-1/z} , & z \ll 1 . \end{cases} \tag{3.49} \]
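The quoted mean $\langle z\rangle = 1/3$ and the ratio $1/\sqrt5$ can be checked numerically from Eq. (3.48). The sketch below is an illustration only: the function names, the crossover point $z = 1/\pi$ between the two forms, the truncation of the sums, and the Simpson grid are all choices made here, not taken from the text.

```python
import math

def ipr_density(z, kmax=20):
    """P(z) of Eq. (3.48), using in each regime the rapidly convergent form."""
    if z <= 0.0:
        return 0.0
    ks = range(1, kmax + 1)
    if z >= 1.0 / math.pi:
        # First form of (3.48): terms decay as exp(-pi^2 k^2 z).
        return 2 * math.pi**2 * sum(
            (2 * math.pi**2 * z * k**4 - 3 * k**2) * math.exp(-math.pi**2 * k**2 * z)
            for k in ks)
    # Second form of (3.48), with the z-derivative carried out term by term:
    # d/dz [z^(-3/2) e^(-k^2/z)] = e^(-k^2/z) (k^2 z^(-7/2) - 1.5 z^(-5/2)).
    return 4 / math.sqrt(math.pi) * sum(
        k**2 * math.exp(-k**2 / z) * (k**2 * z**-3.5 - 1.5 * z**-2.5)
        for k in ks)

def moment(n, zmax=6.0, steps=6000):
    """Composite Simpson estimate of the n-th moment of P(z) on [0, zmax]."""
    h = zmax / steps
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * z**n * ipr_density(z)
    return total * h / 3

norm, mean, second = moment(0), moment(1), moment(2)
rms_over_mean = math.sqrt(second - mean**2) / mean
```

Switching between the two forms near $z = 1/\pi$ keeps both sums short, since each then decays at least as fast as $e^{-\pi k^2}$; the values `norm`, `mean` and `rms_over_mean` should come out close to $1$, $1/3$ and $1/\sqrt5$ respectively.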
Therefore, the probability to have atypically large or atypically small IPR is exponentially suppressed. The function $P(z)$ is presented in Fig. 1. The above fluctuations of IPR are due to fluctuations in the "central bump" of a localized eigenfunction. They should be distinguished from the fluctuations in the rate of exponential decay of eigenfunctions (Lyapunov exponent). The latter can be extracted from another important
Fig. 1. Distribution function $P(z)$ of the normalized (dimensionless) inverse participation ratio $z = [\beta^2/(\beta+2)]\pi\nu DA^2P_2$ in a long ($L \gg \xi$) quasi-1D sample. The average value is $\langle z\rangle = 1/3$. From [18].
physical quantity – the distribution function $P(v)$, where
\[ v = (2\pi\nu DA^2)^2\, |\psi_\alpha^2(r_1)\psi_\alpha^2(r_2)| \]
is the product of the eigenfunction intensity in the two points close to the opposite edges of the sample, $r_1 \to 0$, $r_2 \to L$. The result is [23,18]
\[ P(-\ln v) = F[-(\beta\ln v)/2X]\, \frac{1}{2(2\pi X/\beta)^{1/2}}\exp\Big\{-\frac{\beta}{8X}(2X/\beta + \ln v)^2\Big\} , \]
\[ F^{U}(u) = u\, \frac{\Gamma^2[(3-u)/2]}{\Gamma(u)} , \qquad F^{O}(u) = \frac{u\, \Gamma^2[(1-u)/2]}{\pi\Gamma(u)} . \tag{3.50} \]
Therefore, $\ln v$ is asymptotically distributed according to the Gaussian law with the mean value $\langle-\ln v\rangle = (2/\beta)X = L/\beta\pi\nu AD$ and the variance $\mathrm{var}(-\ln v) = 2\langle-\ln v\rangle$. The same log-normal distribution is found for the conductance and for transmission coefficients of a quasi-1D sample from the Dorokhov–Mello–Pereyra–Kumar formalism [116,74] (see end of Section 6.2). Note that the formula (3.50) is valid in the region of $v \ll 1$ (i.e. negative $\ln v$) only, which contains almost all normalization of the distribution function. In the region of still higher values of $v$ the log-normal form of $P(v)$ changes into the much faster stretched-exponential fall-off $\propto \exp\{-2\sqrt2\,\beta v^{1/4}\}$, as can be easily found from the exact solution given in [23,18]. The decay rate of all the moments $\langle v^k\rangle$, $k \ge 2$, is four times less than $\langle-\ln v\rangle$ and does not depend on $k$: $\langle v^k\rangle \propto e^{-X/2\beta}$. This is because the moments $\langle v^k\rangle$, $k \ge 2$, are determined by the probability to find an "anomalously delocalized state" with $v \sim 1$.

3.3. Arbitrary dimensionality: metallic regime

3.3.1. Distribution of eigenfunction amplitudes

In the case of arbitrary dimensionality $d$, deviations from the RMT distribution $P(y)$ for not too large $y$ can be calculated [24,25] via the method described in Section 2. Applying this method to the moments (3.6), one gets
\[ \langle|\psi(r)|^{2q}\rangle = \frac{q!}{V^q}\Big[1 + \frac12\kappa q(q-1) + \cdots\Big] \quad (U) , \tag{3.51} \]
\[ \langle|\psi(r)|^{2q}\rangle = \frac{(2q-1)!!}{V^q}\big[1 + \kappa q(q-1) + \cdots\big] \quad (O) , \tag{3.52} \]
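where $\kappa = \Pi(r, r)$. Correspondingly, the correction to the distribution function reads
\[ P(y) = e^{-y}\Big[1 + \frac{\kappa}{2}(2 - 4y + y^2) + \cdots\Big] \quad (U) , \tag{3.53} \]
\[ P(y) = \frac{e^{-y/2}}{\sqrt{2\pi y}}\Big[1 + \frac{\kappa}{2}\Big(\frac32 - 3y + \frac{y^2}{2}\Big) + \cdots\Big] \quad (O) . \tag{3.54} \]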
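The corrections to the distribution can be checked against the moment corrections (3.51) and (3.52) with exact rational arithmetic. The sketch below assumes the normalized intensity $y = V|\psi^2(r)|$, so that the RMT moments are $\langle y^m\rangle = m!$ (unitary) and $(2m-1)!!$ (orthogonal); the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def double_factorial(n):
    """n(n-2)(n-4)...; returns 1 for n <= 1, so that (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def unitary_moment_factor(q, kappa):
    """<y^q> with the corrected density of Eq. (3.53), divided by q!."""
    m = lambda n: Fraction(factorial(n))  # moments of e^{-y}
    corrected = m(q) + Fraction(kappa) / 2 * (2 * m(q) - 4 * m(q + 1) + m(q + 2))
    return corrected / m(q)

def orthogonal_moment_factor(q, kappa):
    """<y^q> with the corrected density of Eq. (3.54), divided by (2q-1)!!."""
    m = lambda n: Fraction(double_factorial(2 * n - 1))  # moments of e^{-y/2}/sqrt(2 pi y)
    corrected = m(q) + Fraction(kappa) / 2 * (
        Fraction(3, 2) * m(q) - 3 * m(q + 1) + m(q + 2) / 2)
    return corrected / m(q)

kappa = Fraction(1, 50)  # arbitrary small value chosen for the check
factors = [(unitary_moment_factor(q, kappa), orthogonal_moment_factor(q, kappa))
           for q in range(6)]
```

For every $q$ the two factors reduce to $1 + \kappa q(q-1)/2$ and $1 + \kappa q(q-1)$; the case $q = 0$ confirms that the correction terms leave the normalization of $P(y)$ unchanged.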
Deviations of the eigenfunction distribution function P(y) from its RMT form are illustrated for the orthogonal symmetry case in Fig. 2. Numerical studies of the statistics of eigenfunction amplitudes
Fig. 2. Distribution P(y) of the normalized eigenfunction intensities y"
in weak localization regime have been performed in Ref. [117] for the 2D and in Ref. [118] for the 3D case. The found deviations from RMT are well described by the above theoretical results. Experimentally, statistical properties of the eigenfunction intensity have been studied for microwaves in a disordered cavity [15]. For a weak disorder the found deviations are in good agreement with (3.54) as well. In the quasi-one-dimensional case (with hard wall boundary conditions in the longitudinal direction), the one-diffuson loop $\Pi(r, r)$ is equal to
\[ \kappa \equiv \Pi(r,r) = \frac{2}{g}\Big[\frac13 - \frac{x}{L}\Big(1 - \frac{x}{L}\Big)\Big] , \quad 0 \le x \le L , \tag{3.55} \]
so that Eqs. (3.53) and (3.54) agree with results (3.40) and (3.41) obtained from the exact solution. For the periodic boundary conditions in the longitudinal direction (a ring) we have $\kappa = 1/6g$. In the case of 2D geometry,
\[ \Pi(r,r) = \frac{1}{\pi g}\ln\frac{L}{l} , \tag{3.56} \]
with $g = 2\pi\nu D$. Finally, in the 3D case the sum over the momenta $\Pi(r,r) = (\pi\nu V)^{-1}\sum_q(Dq^2)^{-1}$ diverges linearly at large $q$. The diffusion approximation is valid up to $q \sim l^{-1}$; the corresponding cut-off gives $\Pi(r,r) \sim 1/2\pi\nu Dl = g^{-1}(L/l)$. This divergency indicates that more accurate evaluation of $\Pi(r,r)$ requires taking into account also the contribution of the ballistic region ($q > l^{-1}$) which depends on microscopic details of the random potential. We will return to this question in Section 3.3.4. The formulas (3.53) and (3.54) are valid in the region of not too large amplitudes, where the perturbative correction is smaller than the RMT term, i.e. at $y \ll \kappa^{-1/2}$. In the region of large amplitudes, $y > \kappa^{-1/2}$, the distribution function was found by Fal'ko and Efetov [32,33], who applied to Eqs. (3.12) and (3.14) the saddle-point method suggested by Muzykantskii and Khmelnitskii [29]. We relegate the discussion of the method to Section 4 and only present the results here:
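\[ P(y) \simeq \exp\Big\{\frac{\beta}{2}\Big(-y + \frac{\kappa y^2}{2} + \cdots\Big)\Big\} \times \begin{cases} 1 & (U) \\ \dfrac{1}{\sqrt{2\pi y}} & (O) \end{cases} , \quad \kappa^{-1/2} \lesssim y \lesssim \kappa^{-1} , \tag{3.57} \]
\[ P(y) \sim \exp\Big\{-\frac{\beta}{4\kappa}\ln^d(\kappa y)\Big\} , \quad y \gtrsim \kappa^{-1} . \tag{3.58} \]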
Again, as in the quasi-one-dimensional case, there is an intermediate range where a correction in the exponent is large compared to unity, but small compared to the leading RMT term [Eq. (3.57)] and a far asymptotic region (3.58), where the decay of $P(y)$ is much slower than in RMT. In the next section we will discuss the structure of anomalously localized eigenstates, which are responsible for the asymptotic behavior (3.44), (3.58).

3.3.2. 2D: Weak multifractality of eigenfunctions

Since $d = 2$ is the lower critical dimension for the Anderson localization problem, metallic 2D samples (with $g \gg 1$) share many common properties with systems at the critical point of the metal–insulator transition. Although the localization length $\xi$ in 2D is not infinite (as for truly critical systems), it is exponentially large, and the criticality takes place in the very broad range of the system size $L \ll \xi$.

3.3.2.1. Multifractality: basic definitions. The criticality of eigenfunctions shows up via their multifractality. Multifractal structures first introduced by Mandelbrot [119] are characterized by an infinite set of critical exponents describing the scaling of the moments of a distribution of some quantity. Since then, this feature has been observed in various objects, such as the energy dissipating set in turbulence [120–122], strange attractors in chaotic dynamical systems [123–126], and the growth probability distribution in diffusion-limited aggregation [127–129]; see Ref. [130] for a review. The fact that an eigenfunction at the mobility edge has the multifractal structure was noticed for the first time in [51], though the underlying renormalization group calculations were done by Wegner several years earlier [50]. For this problem, the probability distribution is just the eigenfunction intensity $|\psi^2(r)|$ and the corresponding moments are the inverse participation ratios,
\[ P_q = \int d^dr\, |\psi^{2q}(r)| . \tag{3.59} \]
The multifractality is characterized by the anomalous scaling of $P_q$ with the system size $L$,
\[ P_q \propto L^{-D_q(q-1)} \equiv L^{-\tau(q)} , \tag{3.60} \]
with $D_q$ different from the spatial dimensionality $d$ and dependent on $q$. Equivalently, the eigenfunctions are characterized by the singularity spectrum $f(\alpha)$ describing the measure $L^{f(\alpha)}$ of the set of those points $r$ where the eigenfunction takes the value $|\psi^2(r)| \propto L^{-\alpha}$. The two sets of exponents $\tau(q)$ and $f(\alpha)$ are related via the Legendre transformation,
\[ \tau(q) = q\alpha - f(\alpha) , \quad f'(\alpha) = q , \quad \tau'(q) = \alpha . \tag{3.61} \]
For a recent review on multifractality of critical eigenfunctions the reader is referred to [55,131].
3.3.2.2. Multifractality in 2D. We note first that the formulas (3.51) and (3.52) for the IPRs with $q \lesssim \kappa^{-1/2}$ can be rewritten in the 2D case (with (3.56) taken into account) as
\[ \frac{\langle P_q\rangle}{P_q^{RMT}} \simeq \Big(\frac{L}{l}\Big)^{(1/\beta\pi g)q(q-1)} , \tag{3.62} \]
where $P_q^{RMT}$ is the RMT value of $P_q$ equal to $q!\, L^{-2(q-1)}$ for GUE and $(2q-1)!!\, L^{-2(q-1)}$ for GOE. We see that (3.62) has precisely the form (3.60) with
\[ D_q = 2 - \frac{q}{\beta\pi g} . \tag{3.63} \]
As was found in [32,33], the eigenfunction amplitude distribution (3.57), (3.58) leads to the same result (3.63) for all $q \ll 2\beta\pi g$. Since deviation of $D_q$ from the normal dimension 2 is proportional to the small parameter $1/\pi g$, it can be termed "weak multifractality" (in analogy with weak localization). The result (3.63) was in fact obtained for the first time by Wegner [50] via the renormalization group calculations. The limits of validity of Eq. (3.63) are not unambiguous and should be commented here. The singularity spectrum $f(\alpha)$ corresponding to (3.63) has the form
\[ f(\alpha) = 2 - \frac{\beta\pi g}{4}\Big(2 + \frac{1}{\beta\pi g} - \alpha\Big)^2 , \tag{3.64} \]
so that $f(\alpha_\pm) = 0$ for
\[ \alpha_\pm = 2\Big[1 \pm \frac{1}{(2\beta\pi g)^{1/2}}\Big]^2 . \tag{3.65} \]
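The step from (3.63) to (3.64) and (3.65) is a Legendre transform and can be verified numerically. The following sketch uses the shorthand $c = \beta\pi g$ and function names introduced here for illustration; it assumes nothing beyond Eqs. (3.60), (3.61) and (3.63).

```python
import math

def tau(q, c):
    """tau(q) = (q - 1) D_q with D_q = 2 - q/c, where c = beta*pi*g (Eq. (3.63))."""
    return (q - 1.0) * (2.0 - q / c)

def alpha_of_q(q, c):
    """alpha = tau'(q), the third relation in Eq. (3.61)."""
    return 2.0 + 1.0 / c - 2.0 * q / c

def f_legendre(q, c):
    """f(alpha) = q*alpha - tau(q), evaluated at alpha = alpha_of_q(q, c)."""
    return q * alpha_of_q(q, c) - tau(q, c)

def f_closed_form(alpha, c):
    """Parabolic singularity spectrum of Eq. (3.64)."""
    return 2.0 - 0.25 * c * (2.0 + 1.0 / c - alpha) ** 2

def alpha_pm(c):
    """Zeros of f(alpha) from Eq. (3.65): returns (alpha_minus, alpha_plus)."""
    s = 1.0 / math.sqrt(2.0 * c)
    return 2.0 * (1.0 - s) ** 2, 2.0 * (1.0 + s) ** 2

c = 2 * math.pi * 10.0           # example: beta = 2, g = 10
alpha_minus, alpha_plus = alpha_pm(c)
q_plus = math.sqrt(2.0 * c)      # q at which alpha reaches alpha_minus
```

In this parametrization the Legendre transform reproduces the parabola (3.64) for every $q$, and $\alpha$ reaches $\alpha_\mp$ at $q = \pm(2\beta\pi g)^{1/2}$, where $f$ vanishes.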
If $\alpha$ lies outside the interval $(\alpha_-, \alpha_+)$, the corresponding $f(\alpha) < 0$, which means that most likely the singularity $\alpha$ will not be found for a given eigenfunction. However, if one considers the average $\langle P_q\rangle$ over a sufficiently large ensemble of eigenfunctions (corresponding to different realizations of disorder), a negative value of $f(\alpha)$ makes sense (see a related discussion in [132,133]). This is the definition which was assumed in [32,33] where Eq. (3.63) was obtained for all positive $q \ll 2\beta\pi g$. In contrast, if one studies a typical value of $P_q$, the regions $\alpha > \alpha_+$ and $\alpha < \alpha_-$ will not contribute. In this case, Eq. (3.63) is valid only within the interval $q_- \le q \le q_+$ with $q_\pm = \pm(2\beta\pi g)^{1/2}$; outside this region one finds [134,135]
\[ \tau(q) \equiv D_q(q-1) = \begin{cases} q\alpha_- , & q > q_+ , \\ q\alpha_+ , & q < q_- . \end{cases} \tag{3.66} \]
Therefore, within this definition the multifractal dimensions $D_q$ saturate at the values $\alpha_-$ and $\alpha_+$ for $q \to +\infty$ and $q \to -\infty$, respectively. This is in agreement with results of numerical simulations [52–56].
3.3.3. Correlations of eigenfunction amplitudes and fluctuations of the inverse participation ratio

In this subsection, we study correlations of eigenfunctions in the regime of a good conductor [25–27,136,137]. The correlation function of amplitudes of one and the same eigenfunction with
energy $E$ can be formally defined as follows:
\[ \alpha(r_1,r_2,E) = \langle|\psi_k(r_1)\psi_k(r_2)|^2\rangle_E \equiv \Delta\Big\langle\sum_k|\psi_k(r_1)\psi_k(r_2)|^2\, \delta(E-\epsilon_k)\Big\rangle . \tag{3.67} \]
An analogous correlation function for two different eigenfunctions is defined as
\[ \sigma(r_1,r_2,E,\omega) = \langle|\psi_k(r_1)\psi_l(r_2)|^2\rangle_{E,\omega} \equiv \Delta^2R^{-1}(\omega)\Big\langle\sum_{k\ne l}|\psi_k(r_1)\psi_l(r_2)|^2\, \delta(E-\epsilon_k)\delta(E+\omega-\epsilon_l)\Big\rangle , \tag{3.68} \]
where $R(\omega)$ is the two-level correlation function (2.1). To evaluate $\alpha(r_1,r_2,E)$ and $\sigma(r_1,r_2,E,\omega)$, we employ an identity
\[ 2\pi^2\big[\Delta^{-1}\alpha(r_1,r_2,E)\delta(\omega) + \Delta^{-2}\tilde R(\omega)\sigma(r_1,r_2,E,\omega)\big] = \mathrm{Re}\big[\langle G^R(r_1,r_1,E)G^A(r_2,r_2,E+\omega) - G^R(r_1,r_1,E)G^R(r_2,r_2,E+\omega)\rangle\big] , \tag{3.69} \]
where $G^{R,A}(r,r',E)$ are retarded and advanced Green's functions and $\tilde R(\omega)$ is the non-singular part of the level-level correlation function: $R(\omega) = \tilde R(\omega) + \delta(\omega/\Delta)$. A natural question, which arises at this point, is whether the r.h.s. of Eq. (3.69) cannot be simply found within the diffuson-Cooperon perturbation theory [9]. Such a calculation would, however, be justified only for $\omega$
\[ k_d(r) = \exp(-r/l) \times \begin{cases} J_0^2(p_Fr) , & 2D , \\ (p_Fr)^{-2}\sin^2p_Fr , & 3D . \end{cases} \tag{3.71} \]
We consider the unitary ensemble first; results for the orthogonal symmetry will be presented in the end. Evaluating the $\sigma$-model correlation functions in the r.h.s. of Eq. (3.70) and separating the result into the singular (proportional to $\delta(\omega)$) and regular at $\omega = 0$ parts, one can obtain the correlation functions $\alpha(r_1,r_2,E)$ and $\sigma(r_1,r_2,E,\omega)$. The two-level correlation function $R(\omega)$ entering Eq. (3.70) was studied in Section 2. We employ again the method of [10] described in Section 2 to calculate the sigma-model correlation functions $\langle Q^{11}_{bb}(r_1)Q^{22}_{bb}(r_2)\rangle_S$ and $\langle Q^{12}_{bb}(r_1)Q^{21}_{bb}(r_2)\rangle_S$ for relatively low frequencies $\omega \ll E_c$. First, we restrict ourselves to the terms of order $g^{-1}$. Then, the result for the first correlation function reads as
\[ \langle Q^{11}_{bb}(r_1)Q^{22}_{bb}(r_2)\rangle_S = -1 - 2i\, \frac{\exp(i\pi s)\sin\pi s}{(\pi s)^2} - \frac{2i}{\pi s}\, \Pi(r_1,r_2) , \tag{3.72} \]
where $s = \omega/\Delta + i0$. The first two terms in Eq. (3.72) represent the result of the zero-mode approximation; the last term is the correction of order $g^{-1}$. An analogous calculation for the second correlator yields:
\[ \langle Q^{12}_{bb}(r_1)Q^{21}_{bb}(r_2)\rangle_S = -2\Big\{\frac{i}{\pi s} + \Big[1 + i\, \frac{\exp(i\pi s)\sin\pi s}{(\pi s)^2}\Big]\Pi(r_1,r_2)\Big\} . \tag{3.73} \]
Now, separating regular and singular parts in r.h.s. of Eq. (3.70), we obtain the following result [27] for the autocorrelations of the same eigenfunction:
\[ V^2\langle|\psi_k(r_1)\psi_k(r_2)|^2\rangle_E - 1 = k_d(r)[1 + \Pi(r_1,r_1)] + \Pi(r_1,r_2) , \tag{3.74} \]
and for the correlation of amplitudes of two different eigenfunctions
\[ V^2\langle|\psi_k(r_1)\psi_l(r_2)|^2\rangle_{E,\omega} - 1 = k_d(r)\Pi(r_1,r_1) , \quad k \ne l . \tag{3.75} \]
In particular, for $r_1 = r_2$ we have
\[ V^2\langle|\psi_k(r)\psi_l(r)|^2\rangle_{E,\omega} - 1 = \delta_{kl} + (1 + \delta_{kl})\Pi(r,r) . \tag{3.76} \]
Note that the result (3.74) for $r_1 = r_2$ is the inverse participation ratio calculated above (Section 3.3.1); on the other hand, neglecting the terms with the diffusion propagator (i.e. making the zero-mode approximation), we reproduce the result of Refs. [110–112]. Eqs. (3.75) and (3.76) show that the correlations between different eigenfunctions are relatively small in the weak disorder regime. Indeed, they are proportional to the small parameter $\Pi(r,r)$. The correlations are enhanced by disorder; when the system approaches the strong localization regime, the relative magnitude of correlations, $\Pi(r,r)$, ceases to be small. The correlations near the Anderson localization transition will be discussed in Section 5. Another correlation function, generally used for the calculation of the linear response of the system,
\[ \gamma(r_1,r_2,E,\omega) = \langle\psi_k^*(r_1)\psi_l(r_1)\psi_k(r_2)\psi_l^*(r_2)\rangle_{E,\omega} \equiv \Delta^2R^{-1}(\omega)\Big\langle\sum_{k\ne l}\psi_k^*(r_1)\psi_l(r_1)\psi_k(r_2)\psi_l^*(r_2)\, \delta(E-\epsilon_k)\delta(E+\omega-\epsilon_l)\Big\rangle \tag{3.77} \]
can be calculated in a similar way. The result reads
\[ V^2\langle\psi_k^*(r_1)\psi_l(r_1)\psi_k(r_2)\psi_l^*(r_2)\rangle_{E,\omega} = k_d(r) + \Pi(r_1,r_2) , \quad k \ne l . \tag{3.78} \]
As is seen from Eqs. (3.74), (3.75) and (3.78), in the $1/g$ order the correlation functions $\alpha(r_1,r_2,E)$ and $\gamma(r_1,r_2,E,\omega)$ survive for the large separation between the points, $r$
\[ \cdots\; \frac{\exp(2i\pi s)}{\pi s}(f_2 - f_3) - \frac{\exp(2i\pi s) - 1}{2(\pi s)^2}(f_1 - 4f_2 + 3f_3 - 2f_4) , \tag{3.79} \]
where we defined the functions
\[ \begin{aligned} f_1(r_1,r_2) &= \Pi^2(r_1,r_2) , \\ f_2(r_1,r_2) &= (2V)^{-1}\int dr\, [\Pi^2(r,r_1) + \Pi^2(r,r_2)] , \\ f_3 &= V^{-2}\int dr\, dr'\, \Pi^2(r,r') , \\ f_4(r_1,r_2) &= V^{-1}\int dr\, \Pi(r,r_1)\Pi(r,r_2) . \end{aligned} \tag{3.80} \]
Consequently, we obtain the following results for the correlations of different ($k \ne l$) eigenfunctions at $r > l$:
\[ V^2\langle|\psi_k(r_1)\psi_l(r_2)|^2\rangle_{E,\omega} - 1 = \frac12(f_1 - f_3 - 2f_4) + 2(f_2 - f_3)\Big(\frac{\sin^2\pi s}{(\pi s)^2} - \frac{\sin2\pi s}{2\pi s}\Big)\Big(1 - \frac{\sin^2\pi s}{(\pi s)^2}\Big)^{-1} . \tag{3.81} \]
As should be expected, the double integral over both coordinates of this correlation function is equal to zero. This property is just the normalization condition and should hold in arbitrary order of expansion in $g^{-1}$. The quantities $f_2$, $f_3$, and $f_4$ are proportional to $g^{-2}$, with some (geometry-dependent) prefactors of order unity. On the other hand, $f_1$ in 2D and 3D geometry depends essentially on the distance $r = |r_1 - r_2|$. In particular, for $l \ll r \ll L$
\[ f_1(r_1,r_2) = \Pi^2(r_1,r_2) \approx \begin{cases} \dfrac{1}{(\pi g)^2}\ln^2\dfrac{L}{r} , & 2D , \\ \dfrac{1}{(4\pi^2\nu Dr)^2} , & 3D . \end{cases} \]
Thus, for $l < r \ll L$, the contributions proportional to $f_1$ dominate in Eq. (3.81), yielding
\[ V^2\langle|\psi_k(r_1)\psi_l(r_2)|^2\rangle_{E,\omega} - 1 = \frac12\Pi^2(r_1,r_2) , \quad k \ne l . \tag{3.82} \]
On the other hand, for the case of the quasi-1D geometry (as well as in 2D and 3D for $r \sim L$), all quantities $f_1$, $f_2$, $f_3$, and $f_4$ are of order of $1/g^2$. Thus, the correlator $\sigma(r_1,r_2,E,\omega)$ acquires a non-trivial (oscillatory) frequency dependence on a scale $\omega \sim \Delta$ described by the second term in the r.h.s. of Eq. (3.81). In particular, in the quasi-1D case the function $f_2 - f_3$ determining the spatial dependence of this term has the form
\[ f_2 - f_3 = -\frac{2}{3g^2}\Big[B_4\Big(\frac{r_1}{L}\Big) + B_4\Big(\frac{r_2}{L}\Big)\Big] , \tag{3.83} \]
where $B_4(x) = x^4 - 2x^3 + x^2 - 1/30$ is the Bernoulli polynomial.
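The quasi-1D expressions (3.55) and (3.83) can be cross-checked against a mode sum. The sketch below assumes the hard-wall (Neumann) expansion $\Pi(x,x') = (4/g)\sum_{n\ge1}\cos(\pi nx)\cos(\pi nx')/(\pi n)^2$ with $x$ measured in units of $L$; this expansion is an assumption made here for illustration and is not written out in the text. The function names and truncation lengths are likewise illustrative.

```python
import math
from fractions import Fraction

def bernoulli4(x):
    """Bernoulli polynomial B_4(x) as quoted below Eq. (3.83)."""
    return x**4 - 2 * x**3 + x**2 - 1.0 / 30.0

def pi_diag(x, g, nmax=20000):
    """Pi(x, x) from the assumed Neumann mode sum, x in units of L."""
    return 4.0 / g * sum(math.cos(math.pi * n * x) ** 2 / (math.pi * n) ** 2
                         for n in range(1, nmax + 1))

def f2_minus_f3(x1, x2, g, nmax=2000):
    """f_2 - f_3 of Eq. (3.80) from the same mode sum.

    Mode orthogonality on [0, 1] gives
    int dx Pi^2(x, x1) - f_3 = (4/g^2) sum_n cos(2 pi n x1) / (pi n)^4.
    """
    def excess(x):
        return 4.0 / g**2 * sum(math.cos(2 * math.pi * n * x) / (math.pi * n) ** 4
                                for n in range(1, nmax + 1))
    return 0.5 * (excess(x1) + excess(x2))

def kappa_closed_form(x, g):
    """Right-hand side of Eq. (3.55)."""
    return 2.0 / g * (1.0 / 3.0 - x * (1.0 - x))

def f2_minus_f3_closed_form(x1, x2, g):
    """Right-hand side of Eq. (3.83)."""
    return -2.0 / (3.0 * g**2) * (bernoulli4(x1) + bernoulli4(x2))

# Exact integral of B_4 over [0, 1]: it vanishes, so f_2 - f_3 averages to zero.
b4_integral = Fraction(1, 5) - Fraction(2, 4) + Fraction(1, 3) - Fraction(1, 30)
```

Under the stated assumption both closed forms are reproduced by the mode sums, and the vanishing integral of $B_4$ over the sample is consistent with the normalization property noted after Eq. (3.81).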
Let us remind the reader that the above derivation is valid for $\omega \ll E_c$. In the range $\omega \gtrsim E_c$ the $\sigma$-model correlation functions entering Eqs. (3.70) can be calculated by means of the perturbation theory [9], yielding
\[ \begin{aligned} V^2\langle|\psi_k(r_1)\psi_l(r_2)|^2\rangle_{E,\omega} &= 1 + \mathrm{Re}\Big\{k_d(r)\Pi_\omega(r_1,r_2) + \frac12\Big[\Pi_\omega^2(r_1,r_2) - \frac{1}{V^2}\int dr\, dr'\, \Pi_\omega^2(r,r')\Big]\Big\} , \\ V^2\langle\psi_k^*(r_1)\psi_l(r_1)\psi_k(r_2)\psi_l^*(r_2)\rangle_{E,\omega} &= k_d(r) + \mathrm{Re}\, \Pi_\omega(r_1,r_2) , \end{aligned} \tag{3.84} \]
where $\Pi_\omega(r_1,r_2)$ is the finite-frequency diffusion propagator
\[ \Pi_\omega(r_1,r_2) = (\pi\nu)^{-1}\sum_q\frac{\phi_q(r_1)\phi_q(r_2)}{Dq^2 - i\omega} , \tag{3.85} \]
and the summation in Eq. (3.85) now includes $q = 0$. As was mentioned, the perturbation theory should give correctly the non-oscillatory (in $\omega$) part of the correlation functions at $\omega$
(3.86)
\[ V^2\langle|\psi_k(r_1)\psi_l(r_2)|^2\rangle_{E,\omega} - 1 = 2k_d(r)\Pi(r_1,r_2) , \tag{3.87} \]
\[ V^2\langle\psi_k^*(r_1)\psi_l(r_1)\psi_k(r_2)\psi_l^*(r_2)\rangle_{E,\omega} = k_d(r) + [1 + k_d(r)]\Pi(r_1,r_2) , \quad k \ne l . \tag{3.88} \]
3.3.3.1. IPR fluctuations. Using the supersymmetry method, one can calculate also higher-order correlation functions of the eigenfunction amplitudes. In particular, the correlation function $\langle|\psi_k^4(r_1)||\psi_k^4(r_2)|\rangle_E$ determines fluctuations of the inverse participation ratio (IPR) $P_2 = \int dr\, |\psi^4(r)|$. Details of the corresponding calculation can be found in Ref. [25]; the result for the relative variance of IPR, $\delta(P_2) = \mathrm{var}(P_2)/\langle P_2\rangle^2$ reads
\[ \delta(P_2) = \frac{8}{\beta^2}\int\frac{dr\, dr'}{V^2}\, \Pi^2(r,r') = \frac{32a_d}{\beta^2g^2} , \tag{3.89} \]
with a numerical coefficient $a_d$ defined in Section 2 (see Eqs. (2.25) and (2.26)). The fluctuations (3.89) have the same relative magnitude ($\sim 1/g$) as the famous universal conductance fluctuations. Note also that extrapolating Eq. (3.89) to the Anderson transition point, where $g \sim 1$, we find $\delta(P_2) \sim 1$, so that the magnitude of IPR fluctuations is of the order of its mean value (which is, in turn, much larger than in the metallic regime; see Section 5).
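Eq. (3.89) can be generalized onto higher IPRs $P_q$ with $q > 2$,
\[ \frac{\mathrm{var}(P_q)}{\langle P_q\rangle^2} \simeq \frac{2}{\beta^2}q^2(q-1)^2\int\frac{dr\, dr'}{V^2}\, \Pi^2(r,r') = \frac{8q^2(q-1)^2a_d}{\beta^2g^2} , \tag{3.90} \]
so that the relative magnitude of fluctuations of $P_q$ is $\sim q(q-1)/g$. Furthermore, the higher irreducible moments (cumulants) $\langle\langle P_q^n\rangle\rangle$, $n = 2, 3, \ldots$, have the form
\[ \frac{\langle\langle P_q^n\rangle\rangle}{\langle P_q\rangle^n} = \frac{(n-1)!}{2}\Big[\frac{2}{\beta}q(q-1)\Big]^n\int\frac{dr_1\cdots dr_n}{V^n}\, \Pi(r_1,r_2)\cdots\Pi(r_n,r_1) = \frac{(n-1)!}{2}\, \mathrm{Tr}\Big[\frac{2}{\beta}q(q-1)\Pi\Big]^n , \tag{3.91} \]
where $\Pi$ is the integral operator with the kernel $\Pi(r,r')/V$. This is valid provided $q^2n \ll 2\beta\pi g$. Prigodin and Altshuler [137] obtained Eq. (3.91) starting from the assumption that the eigenfunction statistics is described by the Liouville theory. According to (3.91), the distribution function $P(P_q)$ of the IPR $P_q$ (with $q^2/\beta\pi g \ll 1$) decays exponentially in the region $q(q-1)/g \ll P_q/\langle P_q\rangle - 1 \ll 1$,
\[ P(P_q) \sim \exp\Big\{-\frac{\pi\beta}{2}\, \frac{\epsilon_1}{\Delta}\, \frac{P_q/\langle P_q\rangle - 1}{q(q-1)}\Big\} , \tag{3.92} \]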
where $\epsilon_1$ is the lowest non-zero eigenvalue of the diffusion operator $-D\nabla^2$. The perturbative calculations show that the cumulants of the IPRs are correctly reproduced (in the leading order in $1/g$) if one assumes [137] that the statistics of the eigenfunction envelopes $|\psi^2(r)|_{\rm smooth} = e^{\theta(r)}$ is governed by the Liouville theory (see, e.g. [139,140]) defined by the functional integral
\[ \int D\theta\; \delta\Big(\int\frac{d^dr}{V}\, e^{\theta} - 1\Big)\exp\Big\{-\frac{\beta\pi\nu D}{4}\int d^dr\, (\nabla\theta)^2\Big\}\cdots . \tag{3.93} \]
We will return to this issue in Section 4 where the asymptotics of the IPR distribution function will be discussed. We will see that these "tails" governed by rare realizations of disorder are described by saddle-point solutions which can be also obtained from the Liouville theory description (3.93). The multifractal dimensions (3.63) can be found from the Liouville theory as well [139,140]. It should be stressed, however, that this agreement between the supermatrix $\sigma$-model governing the eigenfunctions statistics and the Liouville theory is not exact, but only holds in the leading order in $1/g$. Let us note that the correlations of eigenfunction amplitudes determine also fluctuations of matrix elements of an operator of some (say, Coulomb) interaction computed on eigenfunctions $\psi_k$ of the one-particle Hamiltonian in a random potential. Such a problem naturally arises, when one wishes to study the effect of interaction onto statistical properties of excitations in a mesoscopic sample (see Section 9).

3.3.4. Ballistic effects

3.3.4.1. Ballistic systems. The above consideration can be generalized to a ballistic chaotic system, by applying a recently developed ballistic generalization of the $\sigma$-model [75–77]. The results are
then expressed [78] in terms of the (averaged over the direction of velocity) kernel $g(r_1,n_1;r_2,n_2)$ of the Liouville operator $\hat K = v_Fn\nabla$ governing the classical dynamics in the system,
\[ \Pi_B(r_1,r_2) = \int dn_1\, dn_2\, g(r_1,n_1;r_2,n_2) , \qquad \hat Kg(r_1,n_1;r_2,n_2) = (\pi\nu)^{-1}[\delta(r_1-r_2)\delta(n_1-n_2) - V^{-1}] . \tag{3.94} \]
Here $n$ is a unit vector determining the direction of momentum, and normalization $\int dn = 1$ is used. Equivalently, the function $\Pi_B(r_1,r_2)$ can be defined as
\[ \Pi_B(r_1,r_2) = \int_0^\infty dt\int dn_1\, \tilde g(r_1,n_1,t;r_2) , \tag{3.95} \]
where $\tilde g$ is determined by the evolution equation
\[ \Big(\frac{\partial}{\partial t} + v_Fn_1\nabla_1\Big)\tilde g(r_1,n_1,t;r_2) = 0 , \quad t > 0 \tag{3.96} \]
with the boundary condition
\[ \tilde g|_{t=0} = (\pi\nu)^{-1}[\delta(r_1-r_2) - V^{-1}] . \tag{3.97} \]
Eq. (3.94) is a natural "ballistic" counterpart of Eq. (2.22). Note, however, that $\Pi_B(r_1,r_2)$ contains a contribution $\Pi_B^{(0)}(r_1,r_2)$ of the straight line motion from $r_2$ to $r_1$ (equal to $1/(\pi p_F|r_1-r_2|)$ in 2D and to $1/2(p_F|r_1-r_2|)^2$ in 3D), which is nothing else but the smoothed version of the function $k_d(|r_1-r_2|)$. For this reason, $\Pi(r_1,r_2)$ in Eqs. (3.86)–(3.88) should be replaced in the ballistic case by $\Pi(r_1,r_2) = \Pi_B(r_1,r_2) - \Pi_B^{(0)}(r_1,r_2)$. At large distances $|r_1-r_2| \gg \lambda_F$ the (smoothed) correlation function takes in the leading approximation the form
\[ V^2\alpha(r_1,r_2,E) = 1 + \frac{2}{\beta}\Pi_B(r_1,r_2) . \tag{3.98} \]
A formula for the variance of matrix elements closely related to Eq. (3.98) was obtained in the semiclassical approach in Ref. [141]. In a recent paper [142] a similar generalization of the Berry formula for $\langle\psi_k^*(r_1)\psi_k(r_2)\rangle$ was proposed. Eq. (3.98) shows that correlations in eigenfunction amplitudes in remote points are determined by the classical dynamics in the system. It is closely related to the phenomenon of scarring of eigenfunctions by the classical orbits [143,144]. Indeed, if $r_1$ and $r_2$ belong to a short periodic orbit, the function $\Pi_B(r_1,r_2)$ is positive, so that the amplitudes $|\psi_k(r_1)|^2$ and $|\psi_k(r_2)|^2$ are positively correlated. This is a reflection of the "scars" associated with this periodic orbit and a quantitative characterization of their strength in the coordinate space. Note that this effect gets smaller with increasing energy $E$ of eigenfunctions. Indeed, for a strongly chaotic system and for $|r_1-r_2| \sim L$ ($L$ being the system size), we have in the 2D case $\Pi_B(r_1,r_2) \sim \lambda_F/L$, so that the magnitude of correlations decreases as $E^{-1/2}$. The function $\Pi_B(r_1,r_2)$ was explicitly calculated in Ref. [78] for a circular billiard with diffuse surface scattering (see Section 8).

3.3.4.2. Ballistic effects in diffusive systems. We return now to the question of deviations of the eigenfunction amplitude distribution from the RMT in a diffusive 3D sample. As was shown in
Section 3.3.1, such deviations are controlled by the parameter $\kappa = \Pi(r,r)$, see Eqs. (3.53)–(3.56). The physical meaning of the parameter $\kappa$ is the time-integrated return probability, see Eq. (3.95) generalizing the definition of $\Pi(r_1,r_2)$ to the ballistic case. The contribution to this return probability from the times larger than the momentum relaxation time, $t > \tau$, is given by
\[ \Pi^{\rm diff}(r,r) = (\pi\nu V)^{-1}\sum_{|q|\lesssim1/l}(Dq^2)^{-1} . \]
The sum over the momenta diverges on the ultraviolet bound in $d \ge 2$, so that the cut-off at $q \sim 1/l$ is required. This results in Eq. (3.56) in 2D and in $\Pi^{\rm diff}(r,r) \sim 1/(k_Fl)^2$ in the 3D case. There exists, however, an additional, ballistic, contribution to $\Pi(r,r)$, which comes from the times $t$ shorter than the mean free time $\tau$. Diagrammatically, it is determined by the first term of the diffuson ladder contributing to $\Pi(r,r)$ (that with one impurity line), i.e. by the probability to hit an impurity and to be scattered back after a time $t \ll \tau$. Contrary to the diffusive contribution, which has a universal form and is determined by the value of the diffusion constant $D$ only, the ballistic one is strongly dependent on the microscopic structure of the disorder. In particular, in the case of the white-noise disorder we find
\[ \Pi^{\rm ball}(r,r) = \begin{cases} \dfrac{1}{\pi\nu v_F^2\tau}\displaystyle\int\frac{(dq)}{q^2} = \frac{1}{2\pi g}\ln\frac{l}{\lambda_F} , & 2D , \\[2mm] \dfrac{\pi}{4\nu v_F^2\tau}\displaystyle\int\frac{(dq)}{q^2} \sim \frac{1}{k_Fl} , & 3D . \end{cases} \tag{3.99} \]
Note that the integrals over the momenta are again divergent at large $q$ (precisely in the same way as in the diffusive region, but with different numerical coefficients) and are now cut off at $q \sim k_F$. The total return probability is given by the sum of the short-scale (ballistic) and long-scale (diffusive) contributions. It is important to notice, however, that the single-scattering contribution (3.99) should be divided by 2 in the orthogonal symmetry case, because the corresponding trajectory is identical to its time reversal. Thus, $\kappa = \Pi^{\rm diff}(r,r) + \Pi^{\rm ball}(r,r)$ for $\beta = 2$ and $\kappa = \Pi^{\rm diff}(r,r) + (1/2)\Pi^{\rm ball}(r,r)$ for $\beta = 1$. We see that for the white-noise random potential in 3D the return probability is dominated by the ballistic contribution, yielding $\kappa \sim 1/k_Fl$. In the 2D case, taking into account the ballistic contribution modifies only the argument of the logarithm in (3.56). Furthermore, even in the quasi-1D geometry the non-universal short-scale effects can be important. Indeed, if we consider a 3D sample of the quasi-1D geometry ($L_x, L_y \ll L_z$), the diffusion contribution will be given by Eq. (3.55), $\Pi^{\rm diff}(r,r) \sim 1/g$, while the ballistic one will be $\Pi^{\rm ball}(r,r) \sim 1/k_Fl$. Therefore, the diffusion contribution is dominant only provided $g \ll k_Fl$. On the other hand, let us consider the opposite case of a smooth random potential with correlation length $d \gg \lambda_F$. Then the scattering is of small-angle nature and the probability for a particle to return back in a time $t \ll \tau$ is exponentially small, so that $\Pi^{\rm ball}(r,r)$ can be neglected. Therefore, the return probability $\kappa$ in Eqs. (3.51)–(3.54) is correctly given by the diffusion contribution, see Eq. (3.56) for 2D and the estimate below it for 3D. Thus the corrections to the "body" of the distribution function are properly given by the $\sigma$-model in this case.
A.D. Mirlin / Physics Reports 326 (2000) 259}382
4. Asymptotic behavior of distribution functions and anomalously localized states

In this section, we discuss asymptotics of distribution functions of various quantities characterizing wave functions in a disordered system. The asymptotic behavior of these distribution functions is determined by rare realizations of the disorder producing states which show much stronger localization features than typical states in the system. We call such states "anomalously localized states" (ALS). It was found by Altshuler, Kravtsov and Lerner (AKL) [28] that distribution functions of conductance, density of states, local density of states, and relaxation times have slowly decaying logarithmically normal (LN) asymptotics at large values of the arguments. These results were obtained within the renormalization group treatment of the $\sigma$-model. The validity of this RG approach is restricted to 2D and $(2+\epsilon)$-dimensional systems, with $\epsilon \ll 1$. On the other hand, the conductance, LDOS and relaxation time fluctuations in strictly 1D disordered chains, where all states are strongly localized, were studied with the use of the Berezinski and Abrikosov–Ryzhkin techniques [113,145–148]. The corresponding distributions were found to be of the LN form, too. It was conjectured on the basis of this similarity [28,113,149] that even in a metallic sample there is a finite probability to find "almost localized" eigenstates, and that these states govern the slow asymptotic decay of the distribution functions. A similar conclusion [25] is implied by the exact results for the statistics of the eigenfunction amplitude in the quasi-one-dimensional case, which shows identical asymptotic behavior in the localized and metallic regimes, see Section 3, Eqs. (3.44) and (3.47).
A new boost to the activity in this direction was given by the paper of Muzykantskii and Khmelnitskii [29], who proposed to use the saddle-point method for the supersymmetric $\sigma$-model in order to calculate the long-time dispersion of the average conductance $G(t)$. Their idea was to reproduce the AKL result by means of a more direct calculation. However, they found a different, power-law decay of $G(t)$ in an intermediate range of times $t$ in 2D. As was shown by the author [30] (and then reproduced in [31] within the ballistic $\sigma$-model approach), the far asymptotic behavior is of log-normal form and is thus in agreement with AKL. The saddle-point method of Muzykantskii and Khmelnitskii also made it possible to study the asymptotic behavior of distribution functions of other quantities: relaxation times [29–31], eigenfunction intensities [32,33], local density of states [34], inverse participation ratio [35,36], level curvatures [37], etc. The form of the saddle-point solution describes directly the spatial shape of the corresponding anomalously localized state [29,36].
We will consider the unitary symmetry ($\beta = 2$) throughout this section; in the general case, the conductance $g$ in the exponent of the distribution functions is replaced by $(\beta/2)g$ (we will sometimes do it explicitly at the end of the calculation).

4.1. Long-time relaxation

In this subsection we study (mainly following Refs. [29,30]) the asymptotic (long-time) behavior of the relaxation processes in an open disordered conductor. One possible formulation of the problem is to consider the time dependence of the average conductance $G(t)$ defined by the non-local (in time) current–voltage relation
$$I(t) = \int_{-\infty}^{t} dt'\, G(t - t')\, V(t') . \tag{4.1}$$
Alternatively, one can study the decay law, i.e. the survival probability $P_s(t)$ for a particle injected into the sample at $t = 0$ to be found there after a time $t$. Classically, $P_s(t)$ decays according to the exponential law, $P_s(t) \sim e^{-t/t_D}$, where $t_D^{-1}$ is the lowest eigenvalue of the diffuson operator $-D\nabla^2$ with the proper boundary conditions. The time $t_D$ has the meaning of the time of diffusion through the sample, and $t_D^{-1}$ is the Thouless energy (see Section 2). The same exponential decay holds for the conductance $G(t)$, where it is induced by the weak-localization correction. The quantities of interest can be expressed in the form of the $\sigma$-model correlation function
$$G(t),\ P_s(t) \sim \int \frac{d\omega}{2\pi}\, e^{-i\omega t} \int DQ(r)\, A\{Q\}\, e^{-S[Q]} , \tag{4.2}$$
where $S[Q]$ is given by Eq. (3.7) with $\eta \to -2i\omega$. The preexponential factor $A\{Q\}$ depends on the specific formulation of the problem, but is not important for the leading exponential behavior studied here. Varying the exponent in Eq. (4.2) with respect to $Q$ and $\omega$, one gets the equations [29]
$$2D\nabla(Q\nabla Q) + i\omega[\Lambda, Q] = 0 , \tag{4.3}$$
$$\frac{\pi\nu}{2}\int d^dr\, {\rm Str}(\Lambda Q) = t . \tag{4.4}$$
[We assume unitary symmetry ($\beta = 2$); in the orthogonal symmetry case the calculation is applicable with minor modifications and we will present the result at the end.] Note that in fact $\omega$ plays in (4.2) the role of a Lagrange multiplier corresponding to the condition (4.4). Therefore, it remains (i) to find a solution $Q_\omega$ of Eq. (4.3) (which will depend on $\omega$); (ii) to substitute it into the self-consistency equation (4.4) and thus to fix $\omega$ as a function of $t$; (iii) to substitute the found solution $Q_t$ into Eq. (4.2), which will yield
$$P_s(t) \sim \exp\left\{\frac{\pi\nu D}{4}\int d^dr\, {\rm Str}\,(\nabla Q_t)^2\right\} . \tag{4.5}$$
Note that Eq. (4.3) is to be supplemented by the boundary conditions
$$Q|_{\rm leads} = \Lambda \tag{4.6}$$
at the open part of the boundary, and
$$\nabla_n Q|_{\rm insulator} = 0 \tag{4.7}$$
at the insulating part of the boundary (if it exists); $\nabla_n$ denotes here the normal derivative. It is not difficult to show [29] that the solution of Eq. (4.3) has in the standard parametrization only one non-trivial variable, the bosonic "non-compact angle" (see footnote 7) $0 \le \theta_1 < \infty$, all other coordinates being equal to zero. As a result, Eq. (4.3) reduces to an equation for $\theta_1(r)$ (we drop the subscript "1" below)
$$\nabla^2\theta + \frac{i\omega}{D}\sinh\theta = 0 , \tag{4.8}$$
Footnote 7: $\theta_1$ is related to the eigenvalue $\lambda_1$ used in Section 3.1 as $\lambda_1 = \cosh\theta_1$.
the self-consistency condition (4.4) takes the form
$$\pi\nu\int d^dr\, (\cosh\theta - 1) = t , \tag{4.9}$$
and Eq. (4.5) can be rewritten as
$$\ln P_s(t) = -\frac{\pi\nu D}{2}\int d^dr\, (\nabla\theta)^2 . \tag{4.10}$$
For sufficiently small times, $\theta$ is small according to (4.9), so that Eqs. (4.8) and (4.9) can be linearized:
$$\nabla^2\theta + 2\gamma\theta = 0 , \qquad 2\gamma = i\omega/D , \tag{4.11}$$
$$\frac{\pi\nu}{2}\int d^dr\, \theta^2 = t . \tag{4.12}$$
This yields
$$\theta = \left(\frac{2t}{\pi\nu}\right)^{1/2} \phi_1(r) , \tag{4.13}$$
where $\phi_1$ is the eigenfunction of the Laplace operator corresponding to the lowest eigenvalue $2\gamma_1 = 1/Dt_D$. The survival probability (4.10) reduces thus to
$$\ln P_s(t) = \frac{\pi\nu D}{2}\int d^dr\, \theta\nabla^2\theta = -\pi\nu D\gamma_1\int d^dr\, \theta^2 = -t/t_D , \tag{4.14}$$
as expected. Eq. (4.14) is valid (up to relatively small corrections) as long as $\theta \ll 1$, i.e. for $t\Delta \ll 1$ ($\Delta = 1/\nu V$ being the mean level spacing). To find the behavior at $t \gtrsim \Delta^{-1}$, as well as the corrections at $t < \Delta^{-1}$, one should consider the exact (non-linear) equation (4.8), the solution of which depends on the sample geometry.

4.1.1. Quasi-1D geometry

We consider a wire of length $L$ and cross-section $A$ with open boundary conditions at both edges, $\theta(-L/2) = \theta(L/2) = 0$. Eqs. (4.8) and (4.9) take the form
$$\theta'' + 2\gamma\sinh\theta = 0 , \tag{4.15}$$
$$\int_{-L/2}^{L/2} dx\, (\cosh\theta - 1) = t/\pi\nu A . \tag{4.16}$$
From symmetry considerations, $\theta(x) = \theta(-x)$ and $\theta'(0) = 0$, so that it is sufficient to consider the region $x > 0$. The solution of Eq. (4.15) reads
$$x = \int_{\theta(x)}^{\theta_0} \frac{d\vartheta}{2\sqrt{\gamma(\cosh\theta_0 - \cosh\vartheta)}} , \tag{4.17}$$
where $\theta_0$ is determined by the condition $\theta(L/2) = 0$, yielding
$$L = \int_0^{\theta_0} \frac{d\vartheta}{\sqrt{\gamma(\cosh\theta_0 - \cosh\vartheta)}} . \tag{4.18}$$
In the large-$t$ region ($t\Delta \gg 1$) we will have $\theta_0 \gg 1$, and Eqs. (4.17) and (4.18) can be simplified to give
$$\theta(x) \simeq \theta_0\left(1 - \frac{2x}{L}\right) , \tag{4.19}$$
$$\theta_0 \simeq \ln\frac{2\theta_0^2}{\gamma L^2} . \tag{4.20}$$
Substituting this into condition (4.16) allows one to relate $\theta_0$ to $t$,
$$e^{\theta_0} = \frac{2}{\pi}\, t\Delta\, \theta_0 . \tag{4.21}$$
Finally, substitution of Eqs. (4.19) and (4.21) into (4.10) yields the log-normal asymptotic behavior of $P_s(t)$ [29]:
$$\ln P_s(t) \simeq -g\ln^2\frac{t\Delta}{\ln(t\Delta)} ; \qquad t\Delta \gg 1 , \tag{4.22}$$
with $g = 2\pi\nu AD/L$ being the dimensionless conductance. We recall that Eq. (4.22) has been derived for the unitary ensemble ($\beta = 2$); in the general case, its r.h.s. should be multiplied by $\beta/2$. Eq. (4.22) has essentially the same form as the asymptotic formula for $G(t)$ found by Altshuler and Prigodin [148] for the strictly 1D sample with a length much exceeding the localization length:
$$G(t) \sim \exp\left\{-\frac{l}{L}\ln^2(t/\tau)\right\} . \tag{4.23}$$
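The large-time quasi-1D estimate can be illustrated numerically. The sketch below solves Eq. (4.21) for $\theta_0$ by fixed-point iteration and then evaluates $\ln P_s = -(\beta/2)\, g\, \theta_0^2$, which is what one obtains by inserting the linear profile (4.19) into the action (4.10) for a wire (that intermediate step is supplied here and is not written out in the text). The function names, the guard value $t\Delta \ge 10$ and the sample parameters are assumptions of this example; the result agrees with Eq. (4.22) only to leading logarithmic accuracy.

```python
import math


def theta0_from_time(t_delta: float, tol: float = 1e-12, max_iter: int = 200) -> float:
    """Large root of exp(theta0) = (2/pi) * t_delta * theta0, cf. Eq. (4.21).

    The iteration theta <- ln(c * theta) is a contraction for theta > 1,
    so it converges to the larger of the two roots.
    """
    if t_delta < 10.0:
        # Arbitrary guard: the asymptotic regime requires t*Delta >> 1.
        raise ValueError("t_delta must be well above 1 for the large-time branch")
    c = 2.0 * t_delta / math.pi
    theta = math.log(t_delta)
    for _ in range(max_iter):
        new_theta = math.log(c * theta)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


def log_survival_quasi1d(t_delta: float, g: float, beta: int = 2) -> float:
    """ln P_s from the linear saddle-point profile: -(beta/2) * g * theta0**2."""
    theta0 = theta0_from_time(t_delta)
    return -(beta / 2.0) * g * theta0**2


# Hypothetical parameters: t*Delta = 1e6, dimensionless conductance g = 10.
theta0_example = theta0_from_time(1e6)
log_ps_example = log_survival_quasi1d(1e6, g=10.0)
leading_log_estimate = -10.0 * math.log(1e6) ** 2
```

Comparing `log_ps_example` with `leading_log_estimate` shows how much the logarithmic correction inside $\ln^2$ matters at a finite value of $t\Delta$.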
If we replace in Eq. (4.23) the 1D localization length $\xi = l$ by the quasi-1D localization length $\xi = \beta\pi\nu AD$, we reproduce the asymptotics (4.22) (up to a normalization of $t$ in the argument of $\ln^2$, which does not affect the leading term in the exponent for $t \to \infty$). This is one more manifestation of the equivalence of statistical properties of smooth envelopes of the wave functions in 1D and quasi-1D samples [18] (see Section 3). Furthermore, agreement of the results for the metallic and the insulating samples demonstrates clearly that the asymptotic "tail" (4.22) in the metallic sample is indeed due to anomalously localized eigenstates. As another manifestation of this fact, Eq. (4.22) can be represented as a superposition of simple relaxation processes with mesoscopically distributed relaxation times [28]:
$$P_s(t) \sim \int dt_\phi\, e^{-t/t_\phi}\, P(t_\phi) . \tag{4.24}$$
The distribution function $P(t_\phi)$ then behaves as follows:
$$P(t_\phi) \sim \exp\{-g\ln^2(g\Delta t_\phi)\} ; \qquad t_\phi \gg 1/g\Delta,\ t_D . \tag{4.25}$$
This can be easily checked by substituting Eq. (4.25) into Eq. (4.24) and calculating the integral via the stationary point method, the stationary point equation being
$$2g t_\phi \ln(g\Delta t_\phi) = t . \tag{4.26}$$
Note that the Thouless energy $t_D^{-1}$ determines the typical width of a level of an open system. Therefore, formula (4.25) concerns indeed the states with anomalously small widths $t_\phi^{-1}$ in the energy space.
The saddle-point solution $\theta(r)$ provides direct information on the spatial shape of the corresponding ALS. This was conjectured by Muzykantskii and Khmelnitskii [29] and was explicitly proven in [36] for the states determining the distribution of eigenfunction amplitudes, see Section 4.2.1 below. Specifically, the smoothed (over a scale larger than the Fermi wavelength) intensity of the ALS is $|\psi^2(r)|_{\rm smooth} = N^{-1}e^{\theta(r)}$, where $N$ is the normalization factor determined by the requirement $\int d^dr\, |\psi^2(r)| = 1$. We get thus from Eq. (4.19)
$$|\psi^2(r)|_{\rm smooth} = \frac{\theta_0}{AL}\, e^{-2\theta_0|x|/L} \tag{4.27}$$
with
$$\theta_0 \simeq \ln\left(\frac{2}{\pi}\, t\Delta \ln(t\Delta)\right) \simeq \ln\left(\frac{4}{\pi}\, g\Delta t_\phi \ln^2(g\Delta t_\phi)\right) . \tag{4.28}$$
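The stationary-point check of Eqs. (4.24)–(4.26) can be reproduced numerically. The sketch below solves Eq. (4.26) for $t_\phi$ by bisection and verifies that the exponent $-t/t_\phi - g\ln^2(g\Delta t_\phi)$ of the integrand in (4.24) is indeed maximal there. Function names and the parameter values are illustrative assumptions, not taken from the text.

```python
import math


def stationary_relaxation_time(t: float, g: float, delta: float) -> float:
    """Solve 2 g t_phi ln(g delta t_phi) = t, Eq. (4.26), by bisection.

    For t_phi > 1/(g delta) the left-hand side grows monotonically from 0,
    so for t > 0 there is exactly one root in that range.
    """
    def mismatch(t_phi: float) -> float:
        return 2.0 * g * t_phi * math.log(g * delta * t_phi) - t

    lo = 1.0 / (g * delta)          # mismatch(lo) = -t < 0
    hi = 2.0 * lo
    while mismatch(hi) <= 0.0:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def integrand_exponent(t_phi: float, t: float, g: float, delta: float) -> float:
    """Exponent of exp(-t/t_phi) * P(t_phi) with P(t_phi) taken from Eq. (4.25)."""
    return -t / t_phi - g * math.log(g * delta * t_phi) ** 2


# Hypothetical parameters, in units where Delta = 1.
t_ex, g_ex, delta_ex = 1.0e4, 10.0, 1.0
t_phi_star = stationary_relaxation_time(t_ex, g_ex, delta_ex)
```

Evaluating `integrand_exponent` slightly above and below `t_phi_star` gives smaller values, confirming that Eq. (4.26) locates a maximum of the integrand.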
Thus, the ALS, which gives a minimum to the level width $t_\phi^{-1}$, has an exponential shape (4.27), (4.28).
The saddle-point method allows one also to find the corrections to Eq. (4.14) in the intermediate region $t_D \ll t \ll \Delta^{-1}$, where $\theta_0 \ll 1$ [150]. For this purpose, we expand $\cosh\theta_0$ and $\cosh\vartheta$ in Eq. (4.18) up to the 4th order terms, which leads to the following relation between $\gamma$ and $\theta_0$:
$$\gamma = \frac{\pi^2}{2L^2}\left(1 - \frac{1}{8}\theta_0^2 + \cdots\right) . \tag{4.29}$$
Further, we substitute Eq. (4.17) into (4.16),
$$\frac{t}{2\pi\nu A} = \int_0^{\theta_0} \frac{d\vartheta\, (\cosh\vartheta - 1)}{2\sqrt{\gamma(\cosh\theta_0 - \cosh\vartheta)}} \tag{4.30}$$
and expand $\cosh\vartheta$ in (4.30) up to the 4th order terms. This gives the relation
$$\frac{t}{2\pi\nu A} = \frac{\pi}{8\sqrt{2\gamma}}\, \theta_0^2\left(1 - \frac{1}{96}\theta_0^2\right) . \tag{4.31}$$
Using Eqs. (4.15) and (4.16), we can rewrite the action in the form
$$S \equiv \frac{\pi\nu AD}{2}\int dx\, (\theta')^2 = 2\pi\nu ADL\gamma(\cosh\theta_0 - 1) - 2Dt\gamma . \tag{4.32}$$
Expressing now $\theta_0$ and $\gamma$ through $t$ according to Eqs. (4.30) and (4.31), we find
$$-\ln P_s(t) = S = \frac{t}{t_D}\left(1 - \frac{1}{2\pi^2 g}\frac{t}{t_D} + \cdots\right) , \tag{4.33}$$
with $t_D = L^2/\pi^2 D$. In the general case, $g$ is replaced by $(\beta/2)g$ here. Eq. (4.33) is completely analogous to the formula (3.42), (3.43) for the statistics of eigenfunction amplitudes. It shows that the correction to the leading term $-t/t_D$ in $\ln P_s$ becomes large compared to unity at $t \gtrsim \sqrt{g}\, t_D$, though it remains small compared to the leading term up to $t \sim g t_D \sim \Delta^{-1}$.
Result (4.33) was also obtained by Frahm [151] from rather involved calculations based on the equivalence between the 1D $\sigma$-model and the Fokker–Planck approach and employing the approximate solution of the Dorokhov–Mello–Pereyra–Kumar equations in the metallic limit. The fact that the logarithm of the quantum decay probability, $\ln P_s(t)$, starts to deviate strongly (compared to unity) from the classical law, $\ln P_s^{\rm cl}(t) = -t/t_D$, at $t \sim t_D\sqrt{g}$ was observed in numerical simulations by Casati et al. [152]. For related results in the framework of the random matrix model see Section 4.1.3.

4.1.2. 2D geometry

We consider now a 2D disk-shaped sample of radius $R$ with an open boundary. If the problem is formulated in terms of the conductance, we can assume the two leads attached to the disk boundary to be of almost semicircular shape, with relatively narrow insulating intervals between them. Then we can approximate the boundary conditions by using Eq. (4.6) for the whole boundary. In fact, in view of the logarithmic dependence of the saddle-point action on $R$ (see below), the result does not depend to the leading approximation on the specific shape of the sample and the leads attached. With the rotationally invariant form of the boundary condition, the minimal action corresponds to the function $\theta$ depending on the radial coordinate $r$ only. We get therefore the radial equation
$$\theta'' + \theta'/r + 2\gamma\sinh\theta = 0 , \qquad 0 \le r \le R \tag{4.34}$$
(the prime denotes the derivative $d/dr$) with the boundary conditions:
$$\theta(R) = 0 , \tag{4.35}$$
$$\theta'(0) = 0 . \tag{4.36}$$
Condition (4.36) follows from the requirement of analyticity of the field in the disk center. Assuming that characteristic values of $\theta$ satisfy the condition $\theta \gg 1$ (which corresponds to $t\Delta \gg 1$), one can replace $\sinh\theta$ by $e^\theta/2$. Eq. (4.34) can be then easily integrated, and its general solution reads:
$$e^{\theta(r)} = \frac{2C_1^2}{\gamma}\, \frac{C_2 r^{C_1 - 2}}{(C_2 r^{C_1} + 1)^2} , \tag{4.37}$$
with two integration constants $C_1$ and $C_2$. To satisfy the boundary condition (4.36), we have to choose $C_1 = 2$. Furthermore, the above assumption $\theta(0) \gg 1$ implies that $2C_2/\gamma \gg 1$. Therefore, the second boundary condition (4.35) is satisfied if $C_2 \simeq 8/\gamma R^4$, and the solution can be written in the form
$$e^{\theta(r)} \simeq [(r/R)^2 + \gamma R^2/8]^{-2} . \tag{4.38}$$
Using now the self-consistency equation (4.9) one finds $\gamma = 4\pi^2\nu/t$. Finally, calculating the action on the saddle point (4.38), we find [29]
$$-\ln P_s(t) = S \simeq 8\pi^2\nu D\ln(t\Delta) . \tag{4.39}$$
The above treatment is valid provided $\theta'(r) < l^{-1}$ on the saddle-point solution, which is the condition of the applicability of the diffusion approximation (here $l$ is the mean free path). In combination with the assumption $\theta(0) \gg 1$ this means that $1 \ll t\Delta \ll (R/l)^2$. Now let us consider the region of still longer times, t
The function $\theta(r)$ is meant as being constant within the vicinity $|r| \le r_*$ of the disk center. The condition $\theta' \le l^{-1}$ yields $r_* \sim lC$. It is important to note that the result does not depend on details of the cut-off procedure. For example, one gets the same results if one chooses the boundary condition in the form $\theta'(r_*) = 1/l$. The crucial point is that the maximum derivative $\theta'$ should not exceed $1/l$. The constant $C$ is to be found from the self-consistency equation (4.9), which can be reduced to the following form:
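$$\left(\frac{R}{r_*}\right)^{C} = \frac{2t}{\pi^2\nu R^2}\, \frac{C^2}{C - 2} . \tag{4.41}$$
Neglecting corrections of the $\ln(\ln\,\cdot)$ form, we find
$$C \simeq \frac{\ln(t\Delta)}{\ln(R/r_*)} \simeq \frac{\ln(t\Delta)}{\ln(R/l)} . \tag{4.42}$$
The action (4.10) is then equal to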
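$$S \simeq \pi^2\nu D(C + 2)^2\ln(R/r_*) \simeq \pi^2\nu D\, \frac{\ln^2[t\Delta(R/l)^2]}{\ln(R/l)} . \tag{4.43}$$
Combining Eqs. (4.39) and (4.43) and introducing the factor $\beta/2$ for generality, we get thus the following long-time asymptotics of $P_s(t)$ (or $G(t)$) [30]:
$$P_s(t) \sim (t\Delta)^{-2\pi\beta g} , \qquad 1 \ll t\Delta \ll (R/l)^2 , \tag{4.44}$$
$$P_s(t) \sim \exp\left\{-\frac{\pi\beta g}{4}\, \frac{\ln^2(t/g\tau)}{\ln(R/l)}\right\} , \qquad t\Delta \gg (R/l)^2 , \tag{4.45}$$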
where $g = 2\pi\nu D$ is the dimensionless conductance per square in 2D and $\tau$ is the mean free time. The far asymptotic behavior (4.45) is of the log-normal form and very similar to that found by AKL (see Eq. (7.8) in Ref. [28]). It differs only by the factor $1/g$ in the argument of $\ln^2$. It is easy to see, however, that this difference disappears if one does the last step of the AKL calculation with a better accuracy. Let us consider for this purpose the intermediate expression of AKL (Ref. [28], Eq. (7.11)):
$$G(t) \propto -\frac{\sigma}{\tau}\int_0^\infty \frac{dt_\phi}{t_\phi}\, e^{-t/t_\phi} \exp\left[-\frac{1}{4u}\ln^2\frac{t_\phi}{\tau}\right] , \tag{4.46}$$
where
$$u \simeq \frac{1}{2\pi^2\nu D}\ln\frac{R}{l}$$
in the weak localization region in 2D, which we are considering. Evaluating the integral (4.46) by the saddle-point method, we find
$$G(t) \sim \exp\left\{-\frac{1}{4u}\ln^2\frac{2ut}{\tau}\right\} \sim \exp\left\{-\frac{\pi g}{4}\, \frac{\ln^2(t/g\tau)}{\ln(R/l)}\right\} , \tag{4.47}$$
where we have kept only the leading term in the exponent. Eq. (4.47) is in exact agreement with Eq. (4.45) for $\beta = 1$ (AKL assumed the orthogonal symmetry of the ensemble). Therefore, the supersymmetric treatment confirms the AKL result and also establishes the region of its validity.
It is instructive to represent the obtained results in terms of the superposition of simple relaxation processes with mesoscopically distributed relaxation times $t_\phi$:
$$G(t),\ P_s(t) \sim \int \frac{dt_\phi}{t_\phi}\, e^{-t/t_\phi}\, P(t_\phi) . \tag{4.48}$$
Eqs. (4.44) and (4.45) lead then to the following result for the distribution function $P(t_\phi)$ [30]:
$$P(t_\phi) \sim \begin{cases} (t_\phi/t_D)^{-2\pi\beta g} , & t_D \ll t_\phi \ll t_D(R/l)^2 , \\[1ex] \exp\left\{-\dfrac{\pi\beta g}{4}\, \dfrac{\ln^2(t_\phi/\tau)}{\ln(R/l)}\right\} , & t_\phi \gg t_D(R/l)^2 , \end{cases} \tag{4.49}$$
where $t_D \simeq R^2/D$ is the time of diffusion through the sample.
The smooth envelope of the ALS corresponding to the intermediate region $t_D \ll t_\phi \ll t_D(R/l)^2$ has according to Eq. (4.37) the following spatial structure:
$$|\psi^2(r)|_{\rm smooth} = N^{-1}e^{\theta(r)} = \frac{1}{8\beta\pi Dt_\phi}\, \frac{1}{[(r/R)^2 + R^2/8\beta Dt_\phi]^2} , \tag{4.50}$$
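As a consistency check on the envelope (4.50), one can integrate it over the disk and confirm that it is normalized up to a correction of relative order $R^2/8\beta Dt_\phi$. The sketch below does this with a plain midpoint rule; the closed form $1/(1 + a)$ with $a = R^2/8\beta Dt_\phi$ follows from an elementary integration and is supplied here, not quoted from the text. Function names, grid size and parameter values are assumptions of the example.

```python
import math


def als_envelope_2d(r: float, R: float, D: float, t_phi: float, beta: int = 2) -> float:
    """Smoothed ALS intensity of Eq. (4.50) at radius r."""
    a = R**2 / (8.0 * beta * D * t_phi)
    prefactor = 1.0 / (8.0 * beta * math.pi * D * t_phi)
    return prefactor / ((r / R) ** 2 + a) ** 2


def disk_integral(R: float, D: float, t_phi: float, beta: int = 2, n: int = 20000) -> float:
    """Midpoint-rule integral of the envelope over the disk 0 <= r <= R."""
    h = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += 2.0 * math.pi * r * als_envelope_2d(r, R, D, t_phi, beta) * h
    return total


# Hypothetical units R = D = 1; t_phi = 6.25 gives a = 0.01 for beta = 2.
norm_numeric = disk_integral(R=1.0, D=1.0, t_phi=6.25, beta=2)
norm_closed_form = 1.0 / (1.0 + 0.01)
```

The small deficit from unity reflects the part of the $1/r^4$ tail that would lie outside the sample.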
Thus, this ALS has an effective localization length $\xi_{\rm ef} \sim R(t_D/t_\phi)^{1/2}$, with the intensity decreasing as $1/r^4$ outside the region of the extent $\xi_{\rm ef}$. As to the ultra-long-time region, Eqs. (4.40) and (4.42) indicate that now the effective localization length is given by $l_*(t)$ defined as
$$l_*(t) = c_t l ; \qquad c_t \simeq \frac{\ln(t_\phi/t_D)}{\ln(R/l)} . \tag{4.51}$$
The ALS intensity decays in a power-law manner for $r > l_*(t)$, with an exponent depending on $t_\phi$:
$$|\psi^2(r)| \sim \frac{1}{l_*(t)^2}\left(\frac{r}{l_*(t)}\right)^{-c_t - 2} , \qquad l_*(t) \le r \le R . \tag{4.52}$$
4.1.3. Random matrix model

Here we mention briefly the results on the quantum decay law obtained by Savin and Sokolov [153] within the RMT model. This will allow us to see the similarities and the differences between the diffusive systems and the random matrix model. The model describes the Hamiltonian of an open chaotic system by a Gaussian random matrix coupled to $M$ external (decay) channels. The decay law found has the form
$$P_s(t) \sim (1 + \Gamma_W t/M)^{-M} , \tag{4.53}$$
where $\Gamma_W = MT\Delta/2\pi$ is a typical width of the eigenstate, with $T$ characterizing the channel coupling ($T = 1$ for ideal coupling, see also Section 6). In this case, the product $MT$ plays the role of the dimensionless conductance $g$ (in contrast to the diffusive case where $g$ is governed by the bulk of the system, here it is determined by the number of decay channels and the strength of their coupling). For not too large $t$ ($t\Delta T \ll 1$), Eq. (4.53) yields the classical decay law, $P_s(t) \sim e^{-t\Gamma_W}$, with corrections of the form
$$\ln P_s(t) = -t\Gamma_W(1 - \Gamma_W t/2M + \cdots) , \tag{4.54}$$
which is similar to the results found for the diffusive systems (see, e.g., Eq. (4.33)). At large $t \gg (\Delta T)^{-1}$, the decay takes the power-law asymptotic form [154]
$$\ln P_s(t) \simeq -M\ln(\Gamma_W t/M) . \tag{4.55}$$
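The two limits of the RMT decay law are easy to verify numerically. The sketch below evaluates $\ln P_s$ from Eq. (4.53) and compares it with the short-time expansion (4.54) and the long-time form (4.55). Function names and the sample values of $M$, $\Gamma_W$ and $t$ are assumptions made for the illustration.

```python
import math


def typical_width(M: int, T: float, delta: float) -> float:
    """Gamma_W = M * T * Delta / (2 pi), the typical eigenstate width."""
    return M * T * delta / (2.0 * math.pi)


def log_survival_rmt(t: float, M: int, gamma_w: float) -> float:
    """ln P_s(t) for the decay law (4.53); log1p keeps precision at small t."""
    return -M * math.log1p(gamma_w * t / M)


def log_survival_short_time(t: float, M: int, gamma_w: float) -> float:
    """Classical decay with the leading correction, Eq. (4.54)."""
    return -t * gamma_w * (1.0 - gamma_w * t / (2.0 * M))


def log_survival_long_time(t: float, M: int, gamma_w: float) -> float:
    """Power-law asymptotics, Eq. (4.55)."""
    return -M * math.log(gamma_w * t / M)


# Hypothetical example: M = 10 channels, Gamma_W = 1 in arbitrary units.
M_ex, gamma_ex = 10, 1.0
short_gap = abs(log_survival_rmt(0.01, M_ex, gamma_ex)
                - log_survival_short_time(0.01, M_ex, gamma_ex))
long_gap = abs(log_survival_rmt(1.0e9, M_ex, gamma_ex)
               - log_survival_long_time(1.0e9, M_ex, gamma_ex))
```

Both gaps are tiny in their respective regimes, while at $\Gamma_W t \sim M$ neither approximation is accurate, which marks the crossover between exponential and power-law decay.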
4.1.4. Distribution of total density of states

Here we discuss the contribution of ALS to the asymptotic behavior of the distribution function $P(\nu)$ of the total density of states (DOS),
$$\nu(E) = -\frac{1}{\pi V}\, {\rm Im}\int d^dr\, G_R(r, r; E) \tag{4.56}$$
(in the present subsection we denote the average DOS as $\nu_0$ to distinguish it from the fluctuating quantity $\nu(E)$). A resonance state with an energy $E$ and width $t_\phi^{-1}$ gives the following contribution to $\nu(E)$:
$$\nu_{\rm ALS}(E) = \frac{2t_\phi}{\pi V} = \frac{2}{\pi}\, t_\phi\Delta\nu_0 . \tag{4.57}$$
We expect that the asymptotic behavior of $P(\nu)$ at ν
in the 2D geometry. The far LN asymptotic tail in Eq. (4.59) is in full agreement with the RG calculation by Altshuler et al. [28]. We find also an intermediate power-law behavior, which could not be obtained from the study of cumulants in Ref. [28]. We note, however, that this power-law form is fully consistent with the change of the behavior of cumulants $\langle\langle\nu^n\rangle\rangle$ at $n \sim \pi g$ discovered in [28].

4.2. Distribution of eigenfunction amplitudes

4.2.1. Quasi-1D geometry

The spatial shape of the ALS determining the asymptotics of the distribution function of eigenfunction intensities can be found [36] via the exact solution of the $\sigma$-model. We define
$$\langle|\psi^2(r)|\rangle_u = \frac{Q(u, r)}{P(u)} , \tag{4.60}$$
where
$$Q(u, r) = \frac{1}{\nu V}\left\langle\sum_\alpha |\psi_\alpha(r)|^2\, \delta(|\psi_\alpha(0)|^2 - u)\, \delta(E - E_\alpha)\right\rangle , \tag{4.61}$$
and $P(u)$ is the distribution function of $u = |\psi^2(0)|$ defined formally by Eq. (3.3). According to Eq. (4.60), $\langle|\psi^2(r)|\rangle_u$ is the average intensity of an eigenstate which has at the point $r = 0$ the intensity $u$ (which will be assumed to be atypically large). The exact result for $P(u)$ is given in the form of the Lebedev–Kontorovich expansion by Eqs. (3.12), (3.21) and (3.24). Calculating the moments $\langle|\psi(0)|^2|\psi(r)|^{2q}\rangle$, and restoring the function $Q(u, r)$, we find for $r > l$
$$Q(u, r) = -\frac{1}{V\xi A}\, \frac{\partial}{\partial u}\left[\frac{W^{(2)}(u\xi A; \tau_1, \tau_2)\, W^{(1)}(u\xi A, \tau_-)}{u}\right] , \tag{4.62}$$
where the function $W^{(2)}(z; \tau_1, \tau_2)$ satisfies the same equation as $W^{(1)}$,
$$\frac{\partial W^{(2)}(z; \tau_1, \tau_2)}{\partial\tau_1} = \left(z^2\frac{\partial^2}{\partial z^2} - z\right)W^{(2)}(z; \tau_1, \tau_2) \tag{4.63}$$
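with the boundary condition
$$W^{(2)}(z; 0, \tau_2) = zW^{(1)}(z, \tau_2) . \tag{4.64}$$
The solution of Eqs. (4.63) and (4.64) is
$$W^{(2)}(z, \tau_1, \tau_2) = 2\sqrt{z}\int_0^\infty d\mu\, b(\mu, \tau_2)\, K_{i\mu}(2\sqrt{z})\, e^{-((1+\mu^2)/4)\tau_1} , \qquad b(\mu, \tau_2) = \frac{\mu\sinh(\pi\mu)}{2\pi^2}\int_0^\infty dt\, K_{i\mu}(t)\, W^{(1)}(t^2/4, \tau_2) . \tag{4.65}$$
Substituting here the formula (3.24) for $W^{(1)}(z, \tau_2)$ and evaluating the integral over $z$, we can reduce Eq. (4.65) for $b(\mu, \tau_2)$ to the form
$$b(\mu, \tau_2) = \frac{\mu\sinh(\pi\mu)}{16\pi^2}\left|\Gamma\left(\frac{1 + i\mu}{2}\right)\right|^4(1 + \mu^2) + \frac{\mu\sinh(\pi\mu)}{2\pi^3}\int d\mu_1\, \frac{\mu_1}{1 + \mu_1^2}\, \sinh\frac{\pi\mu_1}{2}\, \left|\Gamma\left(1 + i\frac{\mu + \mu_1}{2}\right)\right|^2\left|\Gamma\left(1 + i\frac{\mu - \mu_1}{2}\right)\right|^2 e^{-((1+\mu_1^2)/4)\tau_2} . \tag{4.66}$$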
In the opposite case $r < l$ we find
$$Q(u, r) = \frac{1}{V}\left\{k_d(r)\left(u\frac{d^2}{du^2} + \frac{d}{du}\right) - \frac{d}{du}\right\}Y_a(u) , \tag{4.67}$$
where the function $Y_a(u)$ was defined in Eq. (3.9). This formula is valid for any sample which is locally $d$-dimensional. In the case of the quasi-1D geometry we get
$$Q(u, r) = \frac{1}{V}\left\{k_d(r)\left(u\frac{d^2}{du^2} + \frac{d}{du}\right) - \frac{d}{du}\right\}[W^{(1)}(u\xi A, \tau_-)\, W^{(1)}(u\xi A, \tau_+)] , \qquad r \ll l . \tag{4.68}$$
4.2.1.1. Insulating sample ($L \gg \xi$). The distribution $P(u)$ is given by Eqs. (3.45) and (3.46). The "tail", Eq. (3.47), at $u \gg 1/A\xi$ corresponds to atypically large local amplitudes. Analyzing the general formula for $\langle|\psi^2(r)|\rangle_u$ in this case, we find [36] the following spatial structure of the ALS with $|\psi^2(0)| = u$:
$$\langle|\psi^2(r)|\rangle_u = \frac{\pi^{3/2}}{16}\, u^{-1/2}A^{-3/2}r^{-3/2}e^{-r/4\xi} , \qquad r \gg \xi , \tag{4.69}$$
$$\langle|\psi^2(r)|\rangle_u = \frac{1}{2}\left(\frac{u}{\xi A}\right)^{1/2}\frac{1}{\left(1 + r\sqrt{uA/\xi}\right)^2} , \qquad l < r \ll \xi , \tag{4.70}$$
$$\langle|\psi^2(r)|\rangle_u = \frac{1}{2}\left(\frac{u}{\xi A}\right)^{1/2}[1 + 2\sqrt{uA\xi}\, k_d(r)] , \qquad r < l . \tag{4.71}$$
We see from Eqs. (4.69), (4.70) and (4.71) that the eigenfunction normalization is dominated by the region $r \sim \xi_{\rm ef}$, where $\xi_{\rm ef} \sim \sqrt{\xi/uA} \ll \xi$ plays the role of an effective localization length. In the
region $\xi_{\rm ef} \ll r \ll \xi$ the wave intensity falls down as $1/r^2$, and crosses over to the conventional localization behavior at $r \gg \xi$. Therefore, the appearance of an anomalously high amplitude $|\psi^2(0)| = u \gg 1/A\xi$ is not just a local fluctuation, but rather a kind of a cooperative phenomenon corresponding to the existence of a whole region $r \lesssim \xi_{\rm ef}$ with an unusually large amplitude $|\psi^2(r)| = \frac{1}{2}\sqrt{u/\xi A} \sim 1/A\xi_{\rm ef}$.
Eq. (4.71) describes a sharp drop of the amplitude from $|\psi^2(0)| = u$ to $\langle|\psi^2(r)|\rangle = \frac{1}{2}\sqrt{u/\xi A}$ at $r \sim l$. This "quasi-jump" happens on a short scale $r_0 \sim k_F^{-1}(uA\xi)^{1/[2(d-1)]}$. To understand the reason for it, let us recall that the above formulas represent the average intensity $|\psi^2(r)|$ (under the condition $|\psi^2(0)| = u$). One can also study the fluctuations of the intensity. It turns out [36] that in the region $r_0 \ll r \ll \xi$ the fluctuations are of usual GUE type superimposed on the envelope (4.70). It is not difficult to understand that the quasi-jump has the same origin as the GUE-like fluctuations at r
$$u = |\psi^2(0)|_{\rm smooth}\, J(u) , \qquad |\psi^2(0)|_{\rm smooth} = \frac{1}{2}\left(\frac{u}{\xi A}\right)^{1/2} , \qquad J(u) = 2(uA\xi)^{1/2} . \tag{4.72}$$
For arbitrary geometry of the sample, the magnitude of the "quasi-jump" $J(u)$ is given according to Eq. (4.67) by
$$J(u) \simeq -u\frac{d}{du}\ln Y_a(u) \simeq -u\frac{d}{du}\ln P(u) \tag{4.73}$$
(in the quasi-1D case this reduces to $J(u) = 2(uA\xi)^{1/2}$ as stated above). The formula (4.73) can be reproduced (within the saddle-point approximation) by writing the quantity $u$ as a product $u = u_s J$ of the smooth part $u_s$ and the local fluctuating quantity $J$, with the latter distributed according to $P(J) = e^{-J}$.

4.2.1.2. Metallic sample ($L \ll \xi$). The asymptotic behavior of the intensity distribution function has the same stretched-exponential form as in the localized regime, see Eq. (3.44). More accurately (with the subleading factors included), this formula reads [36]
S
G
A
B
Am J¸ ¸ p2m Jm/uA ` ~ exp !4JumA# 1! #2 ¸ ¸ u 4¸ ` ` p2m Jm/uA # 1! #2 . (4.74) ¸ 4¸ ~ ~ Calculation of Q(u, r) shows [36] that the ALS intensity has for l(r;¸ the same form (4.70), provided the condition
A
BH
behavior (3.44), (4.74) is valid, acquires now a very transparent meaning. This is just the condition that the effective localization length of an ALS, $\xi_{\rm ef} = \sqrt{\xi/uA}$, is much less than the sample size $L$. Indeed, $\xi_{\rm ef}/L = \sqrt{\xi/uAL^2} = \sqrt{g/uV}$.
Near the sample edges, $r \sim L \gg \xi_{\rm ef}$, the form of the ALS intensity is slightly modified by the boundary of the sample, see [36]. Finally, the "quasi-jump" of $\langle|\psi^2(r)|\rangle_u$ at $r \ll l$ has the same form (4.71) as in the insulating regime.

4.2.1.3. Saddle-point method. The saddle-point method of Muzykantskii and Khmelnitskii can be also applied to the problem of the statistics of eigenfunction amplitudes, as was done by Fal'ko and Efetov [32,33]. In this case, one should look for the saddle point of the functional integral (3.8) determining the function $Y_a(u)$ (which in turn determines the eigenfunction statistics, see Eqs. (3.12) and (3.14)). The saddle point is again parametrized by the bosonic non-compact angle $\theta(r)$ only, and the corresponding saddle-point equation has the form
$$\pi\nu D\nabla^2\theta - ue^\theta = 0 . \tag{4.75}$$
It is similar to Eq. (4.8) of the long-time relaxation problem, but with a different sign of the second term. Also, the boundary conditions have now a different form:
$$\theta(0) = 0 \tag{4.76}$$
and condition (4.7) at the boundary (since we consider now a closed sample). Alternatively, one can write Eqs. (4.75) and (4.76) in a slightly different form by shifting the variable $\theta + \ln(uV) \to \theta$. Then $u$ is removed from the saddle-point equation and from the action, but appears in the boundary condition. The action determining the distribution function $P(u)$ is given by
$$-\ln P(u) = S = \int d^dr\left[\frac{\pi\nu D}{2}(\nabla\theta)^2 + ue^\theta\right] . \tag{4.77}$$
The formula (4.77) acquires a very transparent meaning if we take into account what was written in Section 4.2.1 concerning the two factors contributing to the large amplitude $|\psi^2(0)| = u$. Firstly, this is the non-uniform smooth envelope $\propto e^{\theta(r)}$ yielding
$$|\psi^2(0)|_{\rm smooth} = \frac{e^{\theta(0)}}{\int d^dr\, e^{\theta(r)}} = \frac{1}{\int d^dr\, e^{\theta(r)}} ;$$
the corresponding weight is represented by the first term in action (4.77). Secondly, these are the local Gaussian (GUE-like) fluctuations of the wave function amplitude, which should provide the remaining factor ("quasi-jump")
$$J = \frac{u}{|\psi^2(0)|_{\rm smooth}} = u\int d^dr\, e^{\theta(r)} ;$$
the corresponding probability $P(J) = e^{-J}$ reproduces the second term in action (4.77).
In the quasi-1D case and under the condition u
$$e^{\theta(r)} = \frac{1}{\left(1 + r\sqrt{u/2\pi\nu D}\right)^2} , \qquad 0 < r \ll L_+ . \tag{4.78}$$
Comparing Eq. (4.78) with Eq. (4.70), we see that the saddle-point solution nicely reproduces the average intensity of the ALS, $\langle|\psi^2(r)|\rangle_u$ for $r > l$, up to an overall normalization factor. Also, the form of $P(u)$ found in the quasi-1D case by the saddle-point method [33] is in very good agreement with the exact results presented above. This agreement does credit to the saddle-point method and allows one to use the saddle-point configuration for characterizing the shape of ALS in higher dimensions and for other distribution functions, where the exact solution is not available.

4.2.2. 2D geometry

For a 2D disk-shaped sample of radius $L$ with the high amplitude point $r = 0$ in the center of the disk, the saddle-point solution of Eqs. (4.75) and (4.76) is found to have the form [32,33]
$$e^{\theta(r)} = \left(\frac{r}{l_*}\right)^{-2\mu}\left\{1 - \frac{l_*^2 u}{8(1 - \mu)^2\pi\nu D}\left(\frac{r}{l_*}\right)^{2-2\mu}\right\}^{-2} , \quad r \ge l_* ; \qquad e^{\theta(r)} \approx \left(\frac{r}{l_*}\right)^{-2\mu} \ \text{for}\ l_* \le r \ll L , \tag{4.79}$$
where the exponent $0 < \mu < 1$ depends on $u$ and satisfies the equation
$$\left(\frac{L}{l_*}\right)^{2\mu} = \frac{2 - \mu}{8\mu(1 - \mu)^2}\, \frac{L^2 u}{\pi\nu D} . \tag{4.80}$$
We are interested in the asymptotic region u¸2
$$\mu \simeq \frac{\ln\left(\dfrac{L^2 u}{2\pi\nu D}\ln\dfrac{L}{l_*}\right)}{2\ln(L/l_*)} . \tag{4.81}$$
The lower cut-off scale $l_*$ appears in Eq. (4.79) for the same reason as in the long-time relaxation problem (Section 4.1.2), i.e. because of the restriction of the diffusion approximation on the momenta $q$ of the $\sigma$-model field: $q < l^{-1}$. It is determined by the condition $\theta'(r)|_{r=l_*} \sim l^{-1}$, which yields $l_* \sim \mu l$. The corresponding asymptotic behavior of $Y_a(u)$ (and consequently of the intensity distribution function), which was already quoted in Section 3.3.1, is
G
> (u), P(u)& exp !p2lD a
A
ln2
¸
B
H
.
(4.82)
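The relation between the exact condition (4.80) for the exponent $\mu$ and its logarithmic approximation (4.81) can be explored numerically. The sketch below treats $a = L^2u/\pi\nu D$ and $\ln(L/l_*)$ as independent inputs (in the text $l_*$ itself depends weakly on $\mu$, which is ignored here as a simplifying assumption) and solves Eq. (4.80) by fixed-point iteration started from Eq. (4.81). Function names and parameter values are hypothetical.

```python
import math


def mu_leading(a: float, log_ratio: float) -> float:
    """Logarithmic approximation (4.81), with a = L^2 u / (pi nu D)
    and log_ratio = ln(L / l_*)."""
    return math.log(0.5 * a * log_ratio) / (2.0 * log_ratio)


def mu_self_consistent(a: float, log_ratio: float, n_iter: int = 100) -> float:
    """Solve (L/l_*)^(2 mu) = a (2 - mu) / (8 mu (1 - mu)^2), Eq. (4.80),
    by iterating on the logarithm of both sides."""
    mu = mu_leading(a, log_ratio)
    for _ in range(n_iter):
        if not 0.0 < mu < 1.0:
            raise ValueError("iteration left the interval 0 < mu < 1")
        rhs = a * (2.0 - mu) / (8.0 * mu * (1.0 - mu) ** 2)
        mu = math.log(rhs) / (2.0 * log_ratio)
    return mu


def residual(mu: float, a: float, log_ratio: float) -> float:
    """Difference of the logarithms of the two sides of Eq. (4.80)."""
    rhs = a * (2.0 - mu) / (8.0 * mu * (1.0 - mu) ** 2)
    return 2.0 * mu * log_ratio - math.log(rhs)


# Hypothetical inputs: L / l_* = 1000 and L^2 u / (pi nu D) = 50.
a_ex, log_ratio_ex = 50.0, math.log(1000.0)
mu_exact = mu_self_consistent(a_ex, log_ratio_ex)
mu_approx = mu_leading(a_ex, log_ratio_ex)
```

For these inputs the two values differ by a correction of the $\ln\ln$ type, consistent with Eq. (4.81) being a leading-logarithm estimate.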
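Normalizing the expression (4.79), we find that the average ALS density for $r > l_*$ is equal to
$$\langle|\psi^2(r)|\rangle_u = \frac{u}{4\pi^2\nu D\mu}\left(\frac{r}{l_*}\right)^{-2\mu}\left\{1 - \frac{l_*^2 u}{8(1 - \mu)^2\pi\nu D}\left(\frac{r}{l_*}\right)^{2-2\mu}\right\}^{-2} , \qquad r \ge l_* . \tag{4.83}$$
The saddle-point calculation assumes that $\theta(r)$ is constant for $r < l_*$, so that Eq. (4.83) gives
$$\langle|\psi^2(r)|\rangle_u \simeq \frac{u}{4\pi^2\nu D\mu}$$
in this region. However, for very small $r < l$ the average intensity $\langle|\psi^2(r)|\rangle_u$ changes sharply, as we have seen in the quasi-1D case. Using Eq. (4.67) for $Q(u, r)$ and Eq. (3.12) for $P(u)$, we get
$$\langle|\psi^2(r)|\rangle_u \equiv \frac{Q(u, r)}{P(u)} = [1 - k_2(r) + J(u)k_2(r)]\, \langle|\psi^2(r = l_*)|\rangle_u , \qquad r < l_* . \tag{4.84}$$
According to Eqs. (4.73) and (4.82), the height of the quasi-jump is given by (with $V = \pi L^2$ the disk area)
$$J(u) \simeq -u\frac{d\ln Y_a(u)}{du} \simeq \frac{2\pi^2\nu D}{\ln(L/l_*)}\ln\left(\frac{Vu}{2\pi^2\nu D}\ln\frac{L}{l_*}\right) \simeq 4\pi^2\nu D\mu , \tag{4.85}$$
which is precisely the factor by which the value of $\langle|\psi^2(r = l_*)|\rangle_u$ found above differs from $\langle|\psi^2(0)|\rangle_u \equiv u$. Combining Eqs. (4.83)–(4.85), we get
$$\langle|\psi^2(r)|\rangle_u = \frac{u}{J(u)}[1 - k_2(r) + k_2(r)J(u)] , \qquad r < l_* . \tag{4.86}$$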
Therefore, in the 2D case the ALS determining the asymptotics of the amplitude distribution function has the power-law shape (4.83) with the short-scale bump (4.86).

4.2.3. States localized near the boundary

We assumed in the above calculations that the center of an ALS is located far enough from the sample edge. For a quasi-1D sample, this means that $\xi_{\rm ef} \ll L_+, L_-$. In the 2D case this implies that the distance from the observation point to the boundary is of the same order of magnitude in all directions, so that $\ln(L/l)$ is defined without ambiguity. Here, we will consider briefly the role of ALS situated close to the boundary, when these conditions are violated [36].
We first consider the quasi-1D geometry. Let us calculate the distribution function $P(u)$ at a point located very close to one of the sample edges. Formally, this means that $L_- \ll \xi_{\rm ef}$. Then the function $W^{(1)}(uA\xi, \tau_-)$ in Eq. (3.21) can be approximated by unity, and we get
$$P(u) = \frac{2}{\pi}\, \xi^{3/4}A^{1/4}L^{-1/2}u^{-3/4}\exp\left\{-2\sqrt{u\xi A} + \frac{\pi^2\xi}{4L_+}\left(1 - \frac{\sqrt{\xi/uA}}{L_+} + \cdots\right)\right\} . \tag{4.87}$$
We see therefore that close to the boundary the distribution $P(u)$ has the asymptotic decay $P(u) \sim \exp\{-2\sqrt{uA\xi}\}$, which is slower than in the bulk of the sample, $P(u) \sim \exp\{-4\sqrt{uA\xi}\}$. This means that if we consider the distribution $P(u)$ averaged over the position of the observation point, its asymptotic tail will be always dominated by the contribution of the points located close to the
boundary, $P(u) \sim \exp\{-2\sqrt{uA\xi}\}$. This could be already anticipated from Eq. (4.74), where the factor
$$\exp\left\{\frac{\pi^2}{4}\left(\frac{\xi}{L_-} + \frac{\xi}{L_+}\right)\right\}$$
increases strongly with approaching one of the sample edges. The same tendency, but in a weaker form, is observed in Eqs. (3.40)–(3.43). Calculating the average intensity $\langle|\psi^2(r)|\rangle_u$ of the corresponding ALS, we find that at $r > l$ the ALS spatial shape retains the form (4.70), with an additional overall factor of 2. At small $r$, Eq. (4.71) is slightly modified:
$$\langle|\psi^2(r)|\rangle_u = \left(\frac{u}{\xi A}\right)^{1/2}[1 + \sqrt{uA\xi}\, k_d(r)] . \tag{4.88}$$
In 2D, we can consider a sample of the semicircular shape, with the observation point located in the center of the diameter serving as a boundary. The saddle-point solution then has exactly the same form (4.79), and the ALS intensity is still given by Eq. (4.83), with an additional factor 2. The asymptotic form of the distribution function P(u) contains an extra factor 1/2 in the exponent:
G
P(u)&exp !p2lD 2
A
ln2
¸
B
H
.
(4.89)
This result is expected to be applicable to any 2D sample of a characteristic size $L$, with a smooth boundary and the observation point taken in the vicinity of the boundary. We see, therefore, that, very generally, the probability of formation of an ALS centered at a given point is strongly enhanced (via an extra factor 1/2 in the exponent) if this point lies close to the sample edge. This leads to the additional factor 1/2 in the exponent in the asymptotic form of the distribution $P(u)$ near the boundary.

4.3. Distribution of local density of states

We again assume the sample to be open, as in the problem of the distribution of relaxation times, Section 4.1. Then it is meaningful to speak about the statistics of the local density of states (LDOS) $\rho(E,r)=(-1/\pi)\,\mathrm{Im}\,G_R(r,r;E)$. In a metallic sample, the LDOS is a weakly fluctuating quantity, whose distribution $P(\rho)$ is mostly concentrated in a narrow Gaussian peak [28,155] with mean value $\langle\rho\rangle=\nu$ and the variance $\mathrm{var}(\rho/\nu)\sim\kappa\ll1$, where $\kappa$ is the usual parameter of the perturbation theory $\kappa=\Pi(r,r)=\sum 1/\pi\nu$
of $P(\rho)$ is determined by the probability to have a single narrow resonance, which gives this value of the LDOS $\rho(E,r)$. The most favorable situation occurs when the resonance is located around the point $r$ in real space and around the energy $E$ in energy space. The LDOS provided by such a resonance is
\[ \rho_{\rm ALS}=|\psi^2(r)|\,\frac{2t_\phi}{\pi}, \tag{4.90} \]
where $t_\phi^{-1}$ is the resonance width. Thus, the optimal fluctuation should now maximize the product of the local amplitude $u=|\psi^2(r)|$ and the inverse level width $t_\phi$, and the asymptotics of the distribution $P(\rho)$ should be related to those of $P(u)$ and $P(t_\phi)$. In particular, in the quasi-1D case, where the distribution $P(t_\phi)$, Eq. (4.25), decays much more slowly than $P(u)$, Eq. (3.47), one should expect the asymptotic behavior of $P(\rho)$ to be mainly determined by $P(t_\phi)$. We will see below that this is indeed the case.

Now we turn to a formal calculation. The distribution function $P(\rho)$ of the LDOS can be expressed through the function $Y(\lambda_1,\lambda_2)$ introduced in Section 3.1 as follows [19,20,322,323]:
\[ P(\rho)=\delta(\rho-1)+\frac{1}{4\pi}\frac{\partial^2}{\partial\rho^2}\left\{\int_{(\rho^2+1)/2\rho}^{\infty}d\lambda_1\,\bar Y(\lambda_1)\left(\frac{2\rho}{\lambda_1-\frac{\rho^2+1}{2\rho}}\right)^{1/2}\right\}, \tag{4.91} \]
where
\[ \bar Y(\lambda_1)=\int_{-1}^{1}d\lambda_2\,\frac{Y(\lambda_1,\lambda_2)}{\lambda_1-\lambda_2} \tag{4.92} \]
and $\rho$ is normalized by its mean value: $\rho/\nu\to\rho$. Let us note the symmetry relation found in [19,20],
\[ P(\rho^{-1})=\rho^3P(\rho). \tag{4.93} \]
It follows from Eq. (4.91) and is completely independent of a particular form of the function >(j , j ). Obviously, Eq. (4.93) relates the small-o asymptotic behavior of the distribution P(o) to 1 2 its large-o asymptotics. Asymptotic behavior of P(o) was studied in [34] via the saddle-point method supplemented in the quasi-1D case by the exact solution. The saddle-point equation has for this problem a very simple form, +2h"0 , with the boundary conditions hD "0, h(0)"o/2 . -%!$4 4.3.1. Quasi-1D geometry In the quasi-1D case, the solution reads
\[ e^{\theta(r)}\simeq\begin{cases}(\rho/\nu)^{1-r/L_+}, & r>0,\\ (\rho/\nu)^{1-|r|/L_-}, & r<0,\end{cases} \tag{4.94} \]
where, as before, $L_+$ and $L_-$ are the distances from the observation point $r=0$ to the sample edges. This yields the asymptotics of the distribution function
\[ P(\rho)\sim\exp\left\{-\frac{\xi}{4}\left(\frac{1}{L_+}+\frac{1}{L_-}\right)\ln^2(\rho/\nu)\right\}. \tag{4.95} \]
Let us note that this asymptotic behavior of $P(\rho)$ in an open sample is strongly different from the asymptotics of $P(u)$ in a closed sample, Eq. (3.47). As was explained above, the difference originates from the fact that $P(\rho)$ is essentially determined by $P(t_\phi)$. To demonstrate this explicitly, we put the observation point in the middle of the sample, $L_+=L_-=L/2$. The configuration (4.94) then acquires precisely the same form as the optimal configuration (4.19) for the relaxation time $t_\phi$, and the asymptotics (4.95) of $P(\rho)$ is identical to that of $P(t_\phi)$, Eq. (4.25). The corresponding values of $t_\phi$ and $\rho$ are related as follows:
\[ \frac{4}{\pi}\,g\Delta t_\phi\ln^2(g\Delta t_\phi)=\rho/\nu. \tag{4.96} \]
Now we calculate the value of the local amplitude $|\psi^2(0)|$ for an ALS corresponding to the configuration (4.94). First, its smoothed intensity is given by
\[ |\psi^2(r)|_{\rm smooth}=N^{-1}e^{\theta(r)}=\frac{\ln(\rho/\nu)}{V}\left(\frac{\rho}{\nu}\right)^{-2|r|/L}. \tag{4.97} \]
Second, the quasi-jump induced by the GUE-type fluctuations gives an additional factor, which is found according to Eq. (4.73) to be
\[ J(\rho)=-\rho\frac{\partial}{\partial\rho}\ln P(\rho)=2g\ln(\rho/\nu). \tag{4.98} \]
Combining Eqs. (4.96)–(4.98), we can compute the LDOS (4.90) determined by this resonance state:
\[ \rho_{\rm ALS}(E,0)=|\psi^2(0)|_{\rm smooth}\,J(\rho)\,\frac{2t_\phi}{\pi}=\frac{\ln(\rho/\nu)}{V}\;2g\ln(\rho/\nu)\;\frac{\rho V}{2g\ln^2(\rho/\nu)}=\rho. \tag{4.99} \]
We have explicitly checked, therefore, that the LDOS $\rho$ is indeed determined by a single ALS, the smoothed intensity of which is given by Eq. (4.97). There are three sources of the enhancement of the LDOS: (i) the amplitude of the smooth envelope of the wave function, (ii) the short-scale GUE "bump", and (iii) the inverse resonance width. They are represented by the three factors in Eq. (4.99), respectively. The result (4.95) can also be obtained [34] from the exact solution of the $\sigma$-model. It was also shown in [34] that in the long-wire limit ($L\gg\xi$) the whole distribution function takes a log-normal form analogous to that found in [113] for the case of a strictly 1D sample by the Berezinskii technique.
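The bookkeeping behind Eq. (4.99) can be checked numerically. The following Python sketch is an illustration only, under stated assumptions: the function name and its arguments are hypothetical, $\rho$ is measured in units of $\nu$, the level spacing is taken as $\Delta=1/(\nu V)$, and $\ln(g\Delta t_\phi)$ in Eq. (4.96) is replaced by $\ln(\rho/\nu)$, which is adequate only with leading logarithmic accuracy.

```python
import math


def ldos_from_single_resonance(rho_over_nu, g, volume=1.0):
    """Multiply the three enhancement factors entering Eq. (4.99).

    Assumptions (not part of the original text): rho is given in units of nu,
    Delta = 1/(nu*V), and ln(g*Delta*t_phi) ~ ln(rho/nu) to leading-log accuracy.
    Returns the resulting LDOS, again in units of nu.
    """
    log_rho = math.log(rho_over_nu)
    # (i) smooth envelope at the observation point, Eq. (4.97) at r = 0
    envelope = log_rho / volume
    # (ii) short-scale GUE "bump", Eq. (4.98)
    bump = 2.0 * g * log_rho
    # (iii) inverse resonance width, 2*t_phi/pi expressed through Eq. (4.96)
    two_t_over_pi = rho_over_nu * volume / (2.0 * g * log_rho ** 2)
    return envelope * bump * two_t_over_pi


if __name__ == "__main__":
    for rho in (1e2, 1e4, 1e8):
        print(rho, ldos_from_single_resonance(rho, g=10.0, volume=3.0))
```

In this approximation the product of the three factors returns the input value of $\rho/\nu$ independently of $g$ and $V$, which is the content of Eq. (4.99).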
4.3.2. 2D geometry

In the 2D case, we again introduce (as in Sections 4.1.2 and 4.2.2) the small-$r$ cutoff $l_*(\rho)$. The saddle-point solution reads
\[ e^{\theta(r)}\simeq\frac{\rho}{\nu}\left(\frac{l_*(\rho)}{r}\right)^{\gamma_\rho}, \tag{4.100} \]
where $l_*(\rho)=\gamma_\rho l$, and
\[ \gamma_\rho=\frac{\ln(\rho/\nu)}{\ln(L/l_*(\rho))}. \]
The distribution $P(\rho)$ has the following asymptotics [34]:
\[ P(\rho)\sim\exp\left\{-\pi^2\nu D\,\frac{\ln^2\rho}{\ln(L/l_*(\rho))}\right\}. \tag{4.101} \]
So, the LDOS distribution has (as in the quasi-1D case) the log-normal form, which is now similar to the distribution (4.82) of the eigenfunction intensities (while the distribution of relaxation times has an intermediate power-law regime). This result is in perfect agreement with the asymptotic behavior of $P(\rho)$ found by the renormalization group method in [28]. As for all the other distribution functions studied, the relevant ALS decay in a power-law fashion, $|\psi^2(r)|\propto(r/l_*(\rho))^{-\gamma_\rho}$. As in the quasi-1D case, one can explicitly verify [36] that a single state with the spatial shape determined by (4.100) indeed provides, by virtue of Eq. (4.90), the value of the LDOS equal to $\rho$.

4.4. Distribution of inverse participation ratio

In this subsection, we study the asymptotics of the distribution function of the IPR $P_2$, Eq. (3.39). We have already considered the fluctuations of $P_2$ in Section 3.3.3. As was explained there, the relative magnitude of the fluctuations is $[\mathrm{r.m.s.}(P_2)]/\langle P_2\rangle\sim1/g$. At $1/g\ll P_2/\langle P_2\rangle-1\ll1$ the distribution function is of the exponential form,
\[ P(P_2)\sim\exp\left\{-\frac{\pi\beta}{4}\frac{\epsilon_1}{\Delta}\left(\frac{P_2}{\langle P_2\rangle}-1\right)\right\}. \tag{4.102} \]
Note that for negative deviations $P_2/\langle P_2\rangle-1$ with $|P_2/\langle P_2\rangle-1|\gg1/g$ the distribution function decays much faster [137], so that the distribution is strongly asymmetric. As was mentioned in Section 3.3.2, the "body" of the distribution $P(P_2)$ is described properly (in the leading order in $1/g$) by the Liouville theory. We will see below that this is also true for the asymptotic "tail" of $P(P_2)$.

Our consideration of the asymptotics of the IPR distribution is based on unpublished results [35] (partially announced in [36]). We first derive a relation between $P(P_2)$ and the distribution of level velocities $P_v(v)$ [25]. To this end, we consider a Hamiltonian $H+\alpha W$, where $W$ is a random perturbation. Specifically, the matrix elements $W_{r_1r_2}$ are supposed to be independent Gaussian-distributed random variables with zero mean and the variance
\[ \langle W^*_{r_1r_2}W_{r_1'r_2'}\rangle=W_0(|r_1-r_2|)\,\delta(r_1-r_1')\,\delta(r_2-r_2'). \]
We will assume that $W_0(r)$ is a short-ranged function with some characteristic scale $\zeta$. The level velocity $v_n$ corresponding to an energy level $E_n$ is defined as $v_n=dE_n(\alpha)/d\alpha$ (where $E_n(\alpha)$ is the level of the perturbed Hamiltonian $H+\alpha W$) and can be found within conventional perturbation theory as
\[ v_n=\int d^dr\,d^dr'\,W_{rr'}\,\psi_n^*(r)\psi_n(r'). \tag{4.103} \]
Using (4.103), we find
\[ \langle v_n^2\rangle=w_0P_2,\qquad w_0=c\int d^dr\,W_0(r), \tag{4.104} \]
where $c=1$ ($1/2$) if $k_F\zeta\ll1$ (resp. $k_F\zeta\gg1$). This consideration can be extended to higher moments of the level velocity $\langle v_n^{2q}\rangle$ as well. This leads to the following relation between the two distributions:
\[ P_v(v)\equiv\langle\delta(v-v_n)\rangle=\int_0^\infty\frac{dP_2}{[2\pi w_0P_2]^{1/2}}\exp\left(-\frac{v^2}{2w_0P_2}\right)P_I(P_2). \tag{4.105} \]
On the other hand, the level velocity distribution can be expressed through the $\sigma$-model correlation function in the following way [156]. According to the definition,
\[ P_v(v)=\frac{1}{\nu V}\left\langle\sum_n\delta(E-E_n)\,\delta(v-\partial E_n/\partial\alpha)\right\rangle\Big|_{\alpha\to0}=\lim_{\alpha\to0}\frac{\alpha}{\nu V}\left\langle\sum_n\delta(E-E_n(0))\,\delta(E+\alpha v-E_n(\alpha))\right\rangle, \tag{4.106} \]
so that $P_v(v)$ is determined by the $\alpha\to0$ limit of the parametric level correlation function. The latter can be represented [as a generalization of Eq. (2.10)] in terms of a correlation function of the $\sigma$-model [157–159]. This yields
T
P AP
U
BAP
B
!1 g P (v)"lim DQ ddr Str Q k ddr Str Q k e~Sv *Q+; v 11 22 8<2 l
P
C
D
(4.107)
Combining Eqs. (4.105) and (4.107), we find the expression for the IPR distribution function in terms of the $\sigma$-model,
P BAP
!1 1 c`*= du P (P )" lim gl< I 2 8<2 2iJ2pP3@2 c~*= Ju 2 g?0
P AP
] DQ
ddr Str Q k 11
B
ddr Str Q k e~Su *Q+ ; 22
(4.108)
where $S_u[Q]=-u/2P_2+S_v[Q]|_{v^2/w_0=u}$. The saddle-point configuration is again parametrized by the bosonic "angle" $\theta(r)$ only; the action on such a configuration is
\[ S_u[\theta]=-\frac{u}{2P_2}+\int d^dr\left[\frac{\pi\nu D}{2}(\nabla\theta)^2-\frac{i\pi\nu\eta}{2}e^{\theta}+\frac{(\pi\nu\eta)^2}{4u}e^{2\theta}\right]. \tag{4.109} \]
The corresponding saddle-point equations can be readily obtained by varying $S_u[\theta]$ with respect to $\theta(r)$ and $u$:
\[ -D\nabla^2\theta-\frac{i\eta}{2}e^{\theta}+\frac{\pi\nu\eta^2}{2u}e^{2\theta}=0,\qquad \frac{(\pi\nu\eta)^2}{4u^2}\int d^dr\,e^{2\theta}+\frac{1}{2P_2}=0. \tag{4.110} \]
Making a shift $\theta=\tilde\theta+\ln(iu/\pi\nu\eta)$ and dropping the tilde, we can reduce them to the form
\[ \nabla^2\theta-\gamma(e^{\theta}-e^{2\theta})=0, \tag{4.111} \]
\[ P_2=2\Big/\int d^dr\,e^{2\theta}, \tag{4.112} \]
where $\gamma=u/2\pi\nu D$. Integration of Eq. (4.111) with the Neumann boundary condition yields $\int d^dr\,(e^{\theta}-e^{2\theta})=0$, so that Eq. (4.112) can be rewritten in the following form (invariant with respect to a shift of the variable $\theta$):
\[ P_2=2\,\frac{\int d^dr\,e^{2\theta}}{\left(\int d^dr\,e^{\theta}\right)^2}. \tag{4.113} \]
The meaning of Eq. (4.113) is completely transparent if we recall that $|\psi^2(r)|_{\rm smooth}\propto e^{\theta(r)}$. The factor 2 comes from the GUE-like short-scale fluctuations. Taking into account Eqs. (4.111) and (4.112), we find the action (4.109) on the saddle-point configuration to be equal to
\[ S_u[\theta]=\frac{\pi\nu D}{2}\int d^dr\,(\nabla\theta)^2. \tag{4.114} \]
The problem that we are solving is easily seen to be equivalent to searching for the minimum of (4.114) under the conditions $\int e^{\theta}d^dr=1$, $\int e^{2\theta}d^dr=P_2$ (or, equivalently, under the condition (4.113), which is invariant with respect to normalization of $e^{\theta}$). This is nothing else but the optimum fluctuation problem for $P(P_2)$ within the Liouville theory (3.93). Therefore, the Liouville theory (3.93) describes properly the asymptotics of the IPR distribution.

4.4.1. Quasi-1D geometry

The equation
\[ \theta''-\gamma e^{\theta}+\gamma e^{2\theta}=0 \tag{4.115} \]
has the following general solution:
\[ e^{-\theta}=\frac{\gamma}{C_1}\left[1+\sqrt{1-C_1/\gamma}\,\sin(\sqrt{C_1}\,x-C_2)\right]. \tag{4.116} \]
The constants $C_1$ and $C_2$ should be found from the boundary conditions $\theta'(-L/2)=\theta'(L/2)=0$, yielding
\[ \cos\left(\sqrt{C_1}\,\frac{L}{2}\pm C_2\right)=0. \]
The solution providing the minimum to the action corresponds to $C_2=0$, $\sqrt{C_1}=\pi/L$ and gives the wave function intensity
\[ A|\psi^2(r)|_{\rm smooth}=\frac{e^{\theta(x)}}{\int e^{\theta(x)}dx}=\frac{\pi}{\gamma^{1/2}L^2}\,\frac{1}{1+\sqrt{1-\pi^2/L^2\gamma}\,\sin(\pi x/L)} \tag{4.117} \]
(with $A$ being the sample transverse cross-section) and the IPR value
\[ P_2=\frac{2\gamma^{1/2}}{A\pi}=P_2^{\rm GUE}\,\frac{\gamma^{1/2}L}{\pi}. \tag{4.118} \]
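The solution (4.116)–(4.118) is easy to verify numerically. The Python sketch below is an illustration under stated assumptions: the function names, the sample values of $\gamma$ and $L$, the finite-difference step and the midpoint integration grid are all hypothetical choices. It checks that the profile with $C_2=0$, $\sqrt{C_1}=\pi/L$ satisfies Eq. (4.115) and that Eq. (4.113), with $d^dr=A\,dx$, reproduces the IPR value (4.118).

```python
import math


def theta(x, gamma, length):
    """theta(x) from Eq. (4.116) with C2 = 0 and sqrt(C1) = pi/L (needs gamma > C1)."""
    c1 = (math.pi / length) ** 2
    b = math.sqrt(1.0 - c1 / gamma)
    return -math.log((gamma / c1) * (1.0 + b * math.sin(math.pi * x / length)))


def ode_residual(x, gamma, length, h=1e-3):
    """Left-hand side of Eq. (4.115), with theta'' from a central difference."""
    t0 = theta(x, gamma, length)
    second = (theta(x + h, gamma, length) - 2.0 * t0 + theta(x - h, gamma, length)) / h ** 2
    return second - gamma * math.exp(t0) + gamma * math.exp(2.0 * t0)


def ipr(gamma, length, area=1.0, n=20000):
    """P_2 from Eq. (4.113), using d^d r = A dx and a midpoint rule on [-L/2, L/2]."""
    dx = length / n
    s1 = s2 = 0.0
    for i in range(n):
        w = math.exp(theta(-0.5 * length + (i + 0.5) * dx, gamma, length))
        s1 += w * dx
        s2 += w * w * dx
    return 2.0 * s2 / (area * s1 ** 2)


if __name__ == "__main__":
    gamma, length = 4.0, 5.0
    print("ODE residual at x = 1:", ode_residual(1.0, gamma, length))
    print("numerical P_2:", ipr(gamma, length))
    print("Eq. (4.118):  ", 2.0 * math.sqrt(gamma) / math.pi)
```

The two printed IPR values should agree to within the accuracy of the quadrature, and the residual should vanish up to the discretization error of the second difference.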
Calculating the action (4.114), we find the following asymptotic behavior of the IPR distribution function:
\[ -\ln P(P_2)\simeq S_u=\frac{\pi^3\nu DA^2}{4}\left(P_2-\frac{2}{LA}\right)=\frac{\pi^2}{4}\,g\left(\frac{P_2}{P_2^{\rm GUE}}-1\right). \tag{4.119} \]
Eq. (4.119) is valid for $1/g\ll P_2/P_2^{\rm GUE}-1\ll L/l$. Therefore, in the quasi-1D case the exponential behavior (4.102) is not restricted to the region of small deviations from the average value; there is no change in the behavior of $P(P_2)$ at $P_2/P_2^{\rm GUE}-1\sim1$ (we will find such a change below in the 2D geometry case). The far asymptotics at P
from Eq. (4.116) by taking a solution which has a maximum in the middle of the sample ($C_2=\pi/2$, $\sqrt{C_1}=2\pi/L$); the result is
\[ A|\psi^2(r)|_{\rm smooth}=\frac{1}{\pi}\,\frac{\xi_{\rm ef}}{x^2+\xi_{\rm ef}^2},\qquad \xi_{\rm ef}=\frac{1}{\pi AP_2}. \tag{4.121} \]
The only difference between (4.120) and (4.121) is that the ALS is now located in the bulk of the sample. This leads to an extra factor of 4 in the action (see the similar discussion in Section 4.2.3 for the case of the statistics of eigenfunction amplitudes). However, in the limit of a long sample, $L/\xi\gg1$, the contribution of the states located near the boundary is additionally suppressed by a factor $\sim1/L$ as compared to that of the bulk ALS (which may be located anywhere in the sample). Therefore, if one fixes $P_2$ and considers $P_I(P_2)$ in the limit $L\to\infty$, only the contribution of the bulk ALS survives, despite the fact that its exponent is 4 times larger than that of the ALS located near the boundary. In other words, the contribution of the states located near the boundary to the first line of Eq. (3.49) (large-$P_2$ asymptotics of $P_I(P_2)$ at $X=L/\xi\gg1$) is $\propto(1/X)e^{-\pi^2z/4}$.

On the other hand, if a sample with periodic boundary conditions in the longitudinal direction (a ring) is considered, only the bulk solution ($C_2=\pi/2$, $\sqrt{C_1}=2\pi/L$) will survive. Consequently, the asymptotic form of $\ln P(P_2)$ will differ from (4.119) by an extra factor of 4.

4.4.2. 2D geometry

Now we calculate the far asymptotics of $P(P_2)$ at P
(4.122)
(In fact, for the hard-wall boundary conditions the asymptotics is determined by the states located near the boundary; however, such states can be obtained from the symmetric solution by putting the center at the boundary and restricting the solution to the interior of the sample; see below.) From our experience in the quasi-1D case, we expect the solution to have a form of a bump concentrated in a region r[l and decreasing with r outside this region. For r
\[ l_p\sim\begin{cases}P_2^{-1/2}, & \alpha>2,\\ (L^{4-2\alpha}P_2)^{-1/[2(\alpha-1)]}, & \alpha<2,\end{cases} \tag{4.123} \]
up to a numerical coefficient of order unity. Therefore, the action (4.114) is equal to
\[ S_u=\alpha^2\pi^2\nu D\ln(L/l_p)=F(\alpha)\,\pi^2\nu D\ln(L^2P_2), \tag{4.124} \]
where
\[ F(\alpha)=\begin{cases}\alpha^2/2, & \alpha\ge2,\\[4pt] \dfrac{\alpha^2}{2(\alpha-1)}, & 1<\alpha\le2.\end{cases} \tag{4.125} \]
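The location of the minimum of the piecewise function (4.125) can be confirmed by a direct scan. The following Python sketch is a simple illustration; the function names and the grid parameters are hypothetical choices, not part of the original derivation.

```python
def F(alpha):
    """Piecewise coefficient F(alpha) of Eq. (4.125), defined for alpha > 1."""
    if alpha >= 2.0:
        return alpha ** 2 / 2.0
    if alpha > 1.0:
        return alpha ** 2 / (2.0 * (alpha - 1.0))
    raise ValueError("F(alpha) is defined only for alpha > 1")


def argmin_F(lo=1.05, hi=4.0, steps=2951):
    """Brute-force scan of F on a uniform grid (grid spacing 0.001 by default)."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=F)


if __name__ == "__main__":
    a = argmin_F()
    print("minimizing alpha:", a, " F at the minimum:", F(a))
```

Both branches of $F(\alpha)$ take the same value at $\alpha=2$; the branch with $1<\alpha\le2$ decreases towards this point and the branch with $\alpha\ge2$ increases away from it, so the scan returns a value of $\alpha$ within one grid step of 2.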
Thus, the minimum of $S_u$ corresponds to $\alpha=2$, yielding
\[ P(P_2)\sim\left(\frac{P_2}{\langle P_2\rangle}\right)^{-\beta\pi g/2}. \tag{4.126} \]
The upper border of validity of (4.126) is $P_2\sim P_2^{\rm RMT}(L/l)^2\sim1/l^2$.

For hard-wall boundary conditions, the asymptotic behavior of $P(P_2)$ will be, however, determined by configurations with a maximum located near the sample boundary. Assuming that the sample has a smooth boundary with a single characteristic scale $L$ (for example, it is of circular form), we can get such a state from the rotationally invariant bulk state by putting its center on the boundary and removing the half of the state that lies outside the sample. Such a truncated state will have twice the IPR and half the action of its parent bulk state. Consequently, the asymptotics of the distribution function for a sample with a boundary will differ from (4.126) by an extra factor 1/2 in the exponent.

Using the Liouville theory description (3.93), one can generalize the above consideration to the distribution $P(P_q)$ of higher IPRs $P_q$ (3.90) with $q>2$ [160]. We will assume that $q$ is not too large, $q^2<2\beta\pi g$, so that the average value $\langle P_q\rangle$ is at the same time the typical value of $P_q$ (see Section 3.3.2). Then in the region $q^2/\beta\pi g\lesssim P_q/\langle P_q\rangle-1\lesssim1$ the distribution has the exponential form (3.92). At larger $P_q$ the optimal configuration is again of the form $|\psi^2(r)|_{\rm smooth}\equiv e^{\theta(r)}=A/r^{\alpha}$ for $r>l_p$; minimizing the action, we find $\alpha=4/q$ and the distribution function
\[ P(P_q)\sim\frac{1}{P_q}\left(\frac{P_q}{\langle P_q\rangle}\right)^{-2\beta\pi g/q^2}. \tag{4.127} \]
This is valid for $P_q<P_q^{\rm RMT}(L/l)^2$; for still larger $P_q$ the corresponding optimal fluctuation would violate the condition of applicability of the diffusion approximation, $\theta'\lesssim1/l$. Incorporating this restriction (cf. the similar situation for the distribution of relaxation times, Section 4.1.2) leads to
\[ \alpha=\frac{1}{q}\left[\frac{\ln(P_q/P_q^{\rm RMT})}{\ln(L/l)}+2\right] \tag{4.128} \]
and to the log-normal far asymptotics
\[ P(P_q)\sim\frac{1}{P_q}\exp\left\{-\frac{\beta\pi g}{4q^2}\,\frac{\ln^2[(P_q/P_q^{\rm RMT})(L/l)^2]}{\ln(L/l)}\right\} \tag{4.129} \]
for $(L/l)^2\lesssim P_q/P_q^{\rm RMT}\lesssim(L/l)^{2q-2}$.
4.5. 3D systems

As we will see below, in the 3D case the states determining the asymptotics of the distribution functions have just a local short-scale spike on top of a homogeneous background. In this sense, no
ALS is formed, in contrast to the quasi-1D and 2D situations. As a closely related feature, we find that the results in 3D are strongly dependent on microscopic details of the random potential. We start by discussing what the $\sigma$-model calculation gives when applied to the 3D geometry. Then we compare this with the results of the direct optimal fluctuation method [161].

Let us consider, e.g., the long-time relaxation problem. The solution of the $\sigma$-model saddle-point equation has the form [30]
\[ \theta(r)\simeq C\left(\frac{l_*}{r}-\frac{l_*}{R}\right),\qquad l_*\le r\le R, \tag{4.130} \]
where $R$ is the system size (which is, in fact, irrelevant here) and
\[ l_*\sim l\ln\left[\frac{t}{\tau(k_Fl)^2}\right];\qquad C\sim\ln\left[\frac{t}{\tau(k_Fl)^2}\right]. \tag{4.131} \]
The reason for introducing the short-scale cut-off length $l_*$ is the same as in 2D (see Sections 4.1.2 and 4.2.2): the gradient $\theta'$ should not exceed $\sim1/l$. Calculating the action, we find
\[ -\ln P_s(t)\sim(k_Fl)^2\ln^3\left[\frac{t}{\tau(k_Fl)^2}\right] \tag{4.132} \]
with an uncertainty in the numerical prefactor. This uncertainty originates from the fact that the action is dominated by the ultraviolet (short-distance) region $r\sim l_*$. A similar result is obtained for the eigenfunction amplitude statistics [33,36]
(4.133)
(4.134)
As we see, all these distribution functions are found to have the exponential-log-cube asymptotic form. The corresponding eigenstates have the shape
G
H
l 1 Dt2(r)D K exp C ln2Z , ir i 4.005) < where C &1, Z "t/[q(k l)2], Z "
"direct optimal fluctuation method"). While having confirmed the exponential-log-cube form of the asymptotics in the 3D case, they found a prefactor smaller by $\sim k_Fl$ compared to the $\sigma$-model result, i.e. they obtained
\[ -\ln P(u)\sim k_Fl\,\ln^3(uV). \tag{4.135} \]
The physical reason for this difference lies in the ballistic effects which have already been discussed in Section 3.3.3 in connection with perturbative corrections to the eigenfunction amplitude distribution. It was shown there that in the 3D case and for the white-noise random potential the parameter $\kappa$ governing these corrections is dominated by a non-universal (depending on the type of the disorder) ballistic contribution yielding $\kappa\sim1/k_Fl$, while the diffusive contribution is $\sim1/(k_Fl)^2$. This is in direct correspondence with Eqs. (4.133) and (4.135), which show precisely the same difference. Eq. (4.135) is again non-universal; its derivation by Smolyarenko and Altshuler [161] relies on the white-noise disorder assumption. The corresponding optimal configuration of the potential found in [161] is nothing else but a potential wall surrounding the observation point, with a height several times larger than $E_F$ and a thickness $\sim\lambda_F\ln(uV)$. Such configurations are not included in the $\sigma$-model consideration, which assumes that the absolute value of the particle velocity does not change appreciably in space. If one considered a smooth random potential, whose magnitude is limited from above by some value $U_{\max}\ll E_F$, such configurations would not be allowed. Whether in this case the $\sigma$-model result would hold remains to be seen.

4.6. Discussion

In Section 4, we have studied the asymptotic behavior of various distribution functions characterizing the eigenfunction statistics in a disordered sample. For this purpose, we used two methods of treatment of the $\sigma$-model: exact solution (in the quasi-1D case) and the saddle-point method.
Physically, the saddle-point solution describes the relevant optimum #uctuations of the wave function envelope; probability of formation of such a #uctuation is found to be governed by the Liouville theory (3.93). In the quasi-1D case, the results of the saddle-point method are in agreement with those of the exact solution of the p-model. The 2D geometry is of special interest, since the eigenfunctions show the features of criticality. In this case, a full agreement between the saddle-point calculation and the renormalization-group (RG) treatment of Altshuler et al. [28] was found for all the distributions, where such a comparison was possible, namely for P(t ), P(o) and P(l). This agreement is highly non-trivial, for the ( following reason. The RG treatment is based on a resummation of the perturbation theory expansion and can be equally well performed within the replica (bosonic or fermionic) or supersymmetric formalism. At the same time, the present approach based on the supersymmetric formalism relies heavily on the topology of the saddle-point manifold combining non-compact (j ) 1 and compact (j ) degrees of freedom. The asymptotic behavior of the distribution functions 2 considered is determined by the region j <1 which is very far from the `perturbativea region of 1 the manifold QKK (i.e. j , j K1). It is well known [164] that for the problem of energy level 1 2 correlation, the replica approach fails, since it does not re#ect properly the topology of the p-model manifold. The success of the RG treatment of [28] seems to be determined by the fact that for the present problem (in contrast to that of level correlation) only the non-compact sector of the supersymmetric p-model is essential, with compact one playing an auxiliary role. Let us note that
the same situation appears in the vicinity of the Anderson transition [90,19,20], where the function $Y(\lambda_1,\lambda_2)$ acquires the role of the order parameter function and depends on the non-compact variable $\lambda_1$ only. The above agreement found for the "tails" of the distributions in the metallic region therefore provides support to the results concerning the Anderson transition obtained by making use of the renormalization group approach and the $2+\epsilon$ expansion.

We have found that the spatial structure of the ALS relevant to the asymptotic behavior of different distributions may be different. This is because an ALS constitutes an optimal fluctuation for one of the above quantities, and the form of this fluctuation depends on the specific characteristic that is to be optimized. Finally, we have discussed interrelations between the asymptotics of various distribution functions. In the quasi-1D and 2D cases, we thus presented a comprehensive picture which explains all the studied asymptotics as governed by exponentially rare events of formation of ALS.
5. Statistics of energy levels and eigenfunctions at the Anderson transition

In $d>2$ dimensions a disordered system undergoes, with increasing strength of the disorder, a transition from the phase of extended states to that of localized states (see, e.g. [39] for review). This transition drastically changes the statistics of energy levels and eigenfunctions. In particular, at the mobility edge these statistics acquire distinct features reflecting the criticality of the theory. This is the subject of the present section. Section 5.1 is devoted to the level statistics and Section 5.2 to the eigenfunction correlations at the mobility edge. In Section 5.3 we study the level and eigenfunction statistics in a quasi-one-dimensional model with long-range (power-law) hopping, which undergoes the Anderson transition and shows at criticality all the features characteristic of a conventional metal-insulator transition point in $d>2$.

5.1. Level statistics. Level number variance

The problem of the energy level statistics at the mobility edge was addressed for the first time by Altshuler et al. [40], who considered the variance $\langle\delta N^2(E)\rangle=\langle N^2(E)\rangle-\langle N(E)\rangle^2$ of the number of levels within a band of width $E$. This quantity is related to the two-level correlator (2.1) via
\[ \langle\delta N(E)^2\rangle=\int_{-\langle N(E)\rangle}^{\langle N(E)\rangle}\bigl(\langle N(E)\rangle-|s|\bigr)R^{(c)}(s)\,ds, \tag{5.1} \]
or, equivalently,
\[ \frac{d}{d\langle N(E)\rangle}\langle\delta N(E)^2\rangle=\int_{-\langle N(E)\rangle}^{\langle N(E)\rangle}R^{(c)}(s)\,ds, \tag{5.2} \]
where $\langle N(E)\rangle=E/\Delta$ and $R^{(c)}(s)=R(s)-1$ is the connected part of the two-level correlation function. In RMT, the $1/s^2$ behavior of $R^{(c)}(s)$ leads to the logarithmic behavior of the variance $\langle\delta N^2\rangle_{\rm WD}\simeq(2/\pi^2\beta)\ln\langle N\rangle$ for $\langle N\rangle\gg1$. In the opposite situation, characteristic of the phase of localized states, when all energy levels are completely uncorrelated (known as the Poisson statistics), one gets $\langle\delta N^2\rangle_{\rm P}=\langle N\rangle$. Supported by their numerical simulations, Altshuler et al. [40]
put forward a conjecture that at the critical point
\[ \langle\delta N^2\rangle\simeq\chi\langle N\rangle, \tag{5.3} \]
where $0<\chi<1$ is a numerical coefficient (which is now conventionally called the "spectral compressibility"). More recently, Shklovskii et al. [41] introduced the concept of new universal statistics at the mobility edge (see also Ref. [165]). In Ref. [42] the correlator $R(s)$ at the mobility edge was studied by means of perturbation theory combined with scaling assumptions about the form of the diffusion propagator. It was found that for $s\gg1$,
\[ R^{(c)}(s)\propto s^{-2+\gamma}, \tag{5.4} \]
where $\gamma<1$ is a certain critical index. The consideration of Ref. [42] led to the conclusion that $\gamma=1-1/\nu d$ (where $\nu$ is the critical exponent of the localization length), which was, however, questioned later [166] in view of the oversimplified treatment of the diffusion propagator at the transition point in [42]. At any rate, the behavior (5.4) with some $\gamma$ is what one expects to hold at the mobility edge; the condition $\gamma<1$ follows from the requirement of convergence of $\int R(s)\,ds$. Using Eqs. (5.1) and (5.4) and the sum rule
\[ \int R(s)\,ds=0 \tag{5.5} \]
(implied by the conservation of the number of energy levels), the authors of [42] concluded that
\[ \langle\delta N^2\rangle\propto N^{\gamma} \tag{5.6} \]
with $\gamma<1$, in contradiction with Ref. [40]. This conclusion was critically reexamined in Refs. [43,44], where it was shown that the asymptotic behavior (5.4) of the correlator $R(s)$ at $s\gg1$ does not imply the absence of the linear term (5.3). The flaw in the reasoning of Ref. [42] was the assumption that the universal part of the correlator $R(s)$ (which is the one surviving in the limit $E/\Delta=\langle N\rangle={\rm const}$, $L\to\infty$) satisfies the sum rule (5.5). It turns out, however, that the sum rule is fulfilled only if all contributions are taken into account, including the non-universal contribution of the "ballistic" region $\omega\sim1/\tau$, where $\tau$ is the elastic mean free time. To demonstrate this, we estimate below (following Ref. [43]) the contributions to the sum rule from all regions of the variable $s$.

In fact, for the conventional model of a particle in a random potential the definition (2.1) of the correlator and the sum rule relation (5.5) should be modified when the vicinity of the critical point is considered. The reason is that the Anderson transition point corresponds to a strong disorder regime $E_F\tau\sim1$, so that the condition $\omega\sim1/\tau$ implies $\omega\sim E_F$. On the other hand, the density of states $\nu(E_F)$ can be considered as a constant only for small variations of energy $\omega\ll E_F$. This means that the variation of $\nu(E_F)$ should be taken into account in (2.1) and (5.5). Besides, the condition $E_F\tau\sim1$ leads to a breakdown of the perturbation theory, which complicates the analysis of the "ballistic" region contribution. To get rid of these problems, we consider a different microscopic model which has exactly the same universal part of $R(s)$, but whose density of states does not change within the range of $\omega\sim1/\tau$. This is the so-called $n$-orbital Wegner model [167], which can be considered as a system of metallic granules forming a $d$-dimensional lattice, each granule being coupled to its nearest neighbors. In the limit $n\gg1$ this model can be mapped onto a supersymmetric $\sigma$-model defined on a lattice.
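The relation (5.1) between the level number variance and the two-level correlator can be illustrated numerically for the two limiting statistics quoted in Section 5.1. The Python sketch below is an illustration under stated assumptions: the correlator is taken to contain the self-correlation term, $R^{(c)}(s)=\delta(s)-Y_2(s)$, with the standard GUE cluster function $Y_2(s)=[\sin(\pi s)/\pi s]^2$ and $Y_2=0$ for Poisson statistics; the function names and the integration grid are hypothetical choices.

```python
import math


def y2_gue(s):
    """Two-level cluster function of the GUE, Y2(s) = (sin(pi s) / (pi s))^2."""
    if s == 0.0:
        return 1.0
    x = math.pi * s
    return (math.sin(x) / x) ** 2


def number_variance(mean_n, cluster=y2_gue, steps_per_unit=2000):
    """Evaluate Eq. (5.1) assuming R^(c)(s) = delta(s) - Y2(s).

    The delta function contributes <N>; the smooth part is integrated with a
    midpoint rule on [0, <N>] and doubled, since the integrand is even in s.
    """
    n = int(round(mean_n * steps_per_unit))
    ds = mean_n / n
    integral = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        integral += (mean_n - s) * cluster(s) * ds
    return mean_n - 2.0 * integral


if __name__ == "__main__":
    mean_n = 20.0
    print("GUE variance:    ", number_variance(mean_n))
    print("Poisson variance:", number_variance(mean_n, cluster=lambda s: 0.0))
```

For the GUE ($\beta=2$) the result grows only logarithmically and approaches $(1/\pi^2)[\ln(2\pi\langle N\rangle)+\gamma_E+1]$ at large $\langle N\rangle$, where $\gamma_E$ is the Euler constant, whereas the uncorrelated case returns $\langle N\rangle$ itself. A linear term of the form (5.3) with $0<\chi<1$ lies between these two limits.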
The action of this $\sigma$-model reads [90] (we consider the unitary symmetry for definiteness)
\[ S\{Q\}=\frac{\Gamma}{2}\sum_{\langle ij\rangle}{\rm Str}\,Q_iQ_j+\epsilon\sum_i{\rm Str}\,Q_i\Lambda, \tag{5.7} \]
where the supermatrices $Q_i$ are defined on sites $i$ of a $d$-dimensional lattice with lattice spacing $a$. Summation in the first term of Eq. (5.7) goes over the pairs $\langle ij\rangle$ of nearest neighbors. The parameters $\Gamma$ and $\epsilon$ are related to the classical diffusion constant $D$, the density of states $\nu$, and the frequency $\omega$ as follows:
\[ \Gamma=\pi\nu Da^{d-2},\qquad \epsilon=-i\omega\pi\nu a^d/2. \tag{5.8} \]
The two-level correlator can be expressed through a correlation function of this $\sigma$-model via the discretized version of Eq. (2.10):
\[ R(s)={\rm Re}\int\prod_iDQ_i\left\{\left[\frac{a^d}{4V}\sum_j{\rm Str}\,Q_jk\Lambda\right]^2-1\right\}\exp(-S\{Q\}), \tag{5.9} \]
where $V$ is the system volume. It is not difficult to prove explicitly [43] that the two-level correlation function of this lattice $\sigma$-model, Eq. (5.9), satisfies the sum rule (5.5) exactly. As will become clear below, the continuum version of the $\sigma$-model does not possess this property: it has a sum rule deficiency, which is related to the contribution of the range of $\omega$ close to the ultraviolet cut-off.

Let us stress that in the region $\omega\ll D/a^2\equiv E_c(a)$ the correlator $R(s)$ is universal, i.e. does not depend on microscopic details of the model. The region $\omega\sim D/a^2$ plays a role analogous to that of the ballistic region, $\omega\sim1/\tau$, in the case of the usual model of a particle in a random potential. Despite the non-universality of the correlation function $R_{\rm nu}(s)$ in this "ballistic" domain, the corresponding integral contribution $I_{\rm nu}$ to the sum rule is universal, because it determines, according to Eq. (5.5), the sum rule deficiency for the universal part $R_{\rm u}(s)$:
\[ I_{\rm u}+I_{\rm nu}=0,\qquad I_{\rm u}=\int R^{(c)}_{\rm u}(s)\,ds,\qquad I_{\rm nu}=\int R^{(c)}_{\rm nu}(s)\,ds. \tag{5.10} \]
As will be seen below, when the system is close to the Anderson transition the two regions of the variable $s$ dominating the integrals $I_{\rm u}$ and $I_{\rm nu}$, respectively, are separated by a parametrically broad range of $s$ giving a negligible contribution to the sum rule. For an energy band width $E$ lying in this range, we get from Eqs. (5.2) and (5.10)
\[ \langle\delta N(E)^2\rangle\simeq I_{\rm u}\langle N(E)\rangle=-I_{\rm nu}\langle N(E)\rangle, \tag{5.11} \]
i.e. just the linear term (5.3) with $\chi=I_{\rm u}=-I_{\rm nu}$. We turn now to the analysis of the correlator $R(s)$ and of the sum rule in various situations. For completeness, we start from the case of a good metal, then we consider the critical point and the critical region cases.

(1) Good metal. Here the following three regions with different behavior of $R(s)$ can be found:

(A) "Wigner–Dyson" (WD) region: $\omega\ll E_c$. The correlator $R(s)$ in this region was studied in Section 2.2, see Eqs. (2.27) and (2.30). The corresponding contribution to the sum rule can be
estimated as (we omit numerical factors of order unity)
\[ I_A\simeq\int_0^{g}R(s)\,ds\propto+1/g. \tag{5.12} \]
(B) "Altshuler–Shklovskii" (AS) region: $E_c\ll\omega\ll E_c(a)$. Here $E_c(a)=D/a^2$ is the Thouless energy at the scale of the lattice spacing $a$, which plays the role of the ultraviolet cut-off for the diffusion theory. The level correlation function is given by Eq. (2.33), yielding
\[ I_B\simeq\int_{g}^{D/a^2\Delta}R(s)\,ds\propto g^{-d/2}\left(\frac{D}{a^2\Delta}\right)^{d/2-1}\propto+1/g(a), \tag{5.13} \]
where $g(a)=g(a/L)^{d-2}$ is the conductance at the scale $a$. Note that for the case of a particle in a random potential, the following substitutions should be made: $a\to l$; $E_c(a)\to E_c(l)=1/\tau$; $1/g(a)\to1/g(l)\propto(\epsilon\tau)^{-(d-1)}$.

(C) "Ballistic" region: $\omega\gtrsim E_c(a)$. To find the correlator $R(s)$ in this range, we can neglect, in the leading approximation, the first term in Eq. (5.7), which gives
\[ R^{(c)}(s)\propto-\left(\frac{L}{a}\right)^d\frac{1}{s^2}, \tag{5.14} \]
\[ I_C\simeq\int_{D/a^2\Delta}^{\infty}R^{(c)}(s)\,ds\propto-1/g(a). \tag{5.15} \]
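The scaling of the three contributions (5.12), (5.13) and (5.15) can be checked with elementary arithmetic. The Python sketch below is an illustration under stated assumptions: numerical factors of order unity are dropped, as in the estimates themselves; the level spacing is taken as $\Delta=1/(\nu L^d)$ and the conductance as $g=D/(L^2\Delta)$; the function name, argument names and sample values are hypothetical.

```python
def sum_rule_estimates(d, nu, diff, size, a):
    """Order-of-magnitude estimates of I_A, I_B, I_C for a good metal.

    Assumptions (not part of the original text): Delta = 1/(nu L^d),
    g = D/(L^2 Delta), g(a) = g (a/L)^(d-2); prefactors of order unity are dropped.
    """
    delta = 1.0 / (nu * size ** d)              # mean level spacing
    g = diff / (size ** 2 * delta)              # dimensionless conductance g = E_c/Delta
    g_a = g * (a / size) ** (d - 2)             # conductance at the scale a
    cutoff = diff / (a ** 2 * delta)            # ultraviolet cut-off D/(a^2 Delta) in units of s
    i_a = 1.0 / g                                       # Eq. (5.12)
    i_b = g ** (-d / 2.0) * cutoff ** (d / 2.0 - 1.0)   # Eq. (5.13)
    i_c = -(size / a) ** d / cutoff                     # Eqs. (5.14), (5.15)
    return {"g": g, "g_a": g_a, "I_A": i_a, "I_B": i_b, "I_C": i_c}


if __name__ == "__main__":
    est = sum_rule_estimates(d=3, nu=1.0, diff=1.0, size=100.0, a=2.0)
    for key, value in est.items():
        print(key, value)
```

With these assumptions $I_B$ and $-I_C$ both reduce to $1/g(a)$, while $I_A=1/g$ is smaller by the factor $(a/L)^{d-2}$, in line with the statement that the contributions (5.13) and (5.15) come from the vicinity of the ultraviolet cut-off.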
Contribution (5.12) to the sum rule is dominated by the region $\omega\sim\Delta$, whereas the contributions (5.13) and (5.15) are dominated by the vicinity of the ultraviolet cut-off $\omega\sim D/a^2$. In $d=3$ g
\[ E_c\ll E\ll D/a^2, \tag{5.16} \]
but the corresponding coefficient is of order $1/g\ll1$. The numerical coefficient in (5.16) can be calculated using the explicit form of $R^{(c)}(s)$, yielding
\[ \chi\equiv I_{\rm u}\simeq\frac{1}{2\beta\pi g},\qquad g\gg1. \tag{5.17} \]
With an increase in disorder strength, the coupling constant $\Gamma$ in Eq. (5.7) decreases. When it approaches a critical value $\Gamma_c$, corresponding to the metal–insulator transition, the correlation length $\xi$ becomes large: $\xi$
(2) Metal in the critical region: $a \ll \xi \ll L$. In this case $g(a) \simeq g_*$, but $g \equiv g(L)$ …
$$I_B \simeq \int_{g}^{g_*\Delta_\xi/\Delta} R^{(c)}(s)\,ds \propto g^{-d/2}\left(\frac{g_*\Delta_\xi}{\Delta}\right)^{d/2-1} \propto +1/g_* , \tag{5.18}$$
where we have used the relations $\Delta_\xi/\Delta = (L/\xi)^d$ and $g/g_* \propto (L/\xi)^{d-2}$.

(C) "Kravtsov–Lerner–Altshuler–Aronov" (KLAA) region: $E_c(\xi) \equiv g_*\Delta_\xi \ll \omega \ll g_*\Delta_a \equiv E_c(a)$. The correlator $R^{(c)}(s)$ in this range was studied in Ref. [170]; the result reads:
$$R^{(c)}(s) \simeq -g_*^{-\gamma}(\Delta_\xi/\Delta)^{1-\gamma}s^{-2+\gamma} \tag{5.19}$$
and, consequently,
$$I_C \simeq \int_{g_*\Delta_\xi/\Delta}^{g_*\Delta_a/\Delta} R^{(c)}(s)\,ds \propto -1/g_* . \tag{5.20}$$

(D) "Ballistic" region: $\omega \gtrsim E_c(a)$. Here Eqs. (5.14) and (5.15) hold, yielding
$$I_D \propto -1/g_* . \tag{5.21}$$
The contribution $I_A$ is dominated by $\omega \sim \Delta$, the contributions $I_B$, $I_C$ by $\omega \sim \Delta_\xi$, and finally, the contribution $I_D$ comes from the "non-universal" region $\omega \sim E_c(a)$. Therefore, this last contribution determines $I_{\rm nu}$ in Eq. (5.10), leading according to Eq. (5.11) to the linear behavior of the variance
$$\langle \delta N^2\rangle \propto (1/g_*)\langle N\rangle , \qquad E_c(\xi) \ll E \ll E_c(a) , \tag{5.22}$$
with a coefficient $\sim 1/g_*$ (which is of order unity in 3D).

(3) Finally, the system is at the critical point when $\xi \gg L$. In this case, $g(L) = g(a) = g_*$. The AS region disappears, and we have, in full analogy with the previous estimates:

(A) WD region: $\omega \lesssim E_c = g_*\Delta$.
$$I_A \propto +1/g_* , \tag{5.23}$$
(B) KLAA region: $g_*\Delta \ll \omega \ll \Delta_a g_*$.
$$R(s) \propto -g_*^{-\gamma}s^{-2+\gamma} , \qquad I_B \propto -1/g_* , \tag{5.24}$$
(C) "Ballistic" region: $\omega \gtrsim \Delta_a g_*$.
$$I_C \propto -1/g_* . \tag{5.25}$$
8 In 3D the value of $g_*$ is, of course, of order unity. However, if one considers formally $d = 2+\epsilon$ with $\epsilon \ll 1$, then $g_* \sim 1/\epsilon$ is parametrically large. It is thus instructive to keep $g_*$ as a parameter in all estimates.
Again the same conclusion as for the critical region case can be drawn: the "ballistic" contribution (5.25) can be identified as $I_{\rm nu}$ in Eq. (5.10), yielding
$$\langle \delta N^2\rangle \simeq \chi\langle N\rangle , \qquad E_c \ll E \ll E_c(a) , \tag{5.26}$$
with $\chi \sim 1/g_*$.

Recently, a relation between the spectral compressibility $\chi$ and the multifractal dimension $D_2$ (for a discussion of multifractality and the corresponding bibliographic references, see Sections 3.3.2 and 5.2) was proposed [49,166]:
$$\chi = (d - D_2)/2d . \tag{5.27}$$
The central idea of the derivation [49] is to consider the motion of energy levels when the system is subject to a random perturbation. This allows one to link the spectral statistics with the wave function correlations. In 2D, one can check, by comparing Eqs. (3.63) and (5.27), that Eq. (5.27) is indeed satisfied in leading order in $1/g$. The general derivation of Eq. (5.27) is, however, based on a certain approximate decoupling of a higher-order correlation function [171,172,49], so that it is not completely clear whether this is indeed an exact formula, as argued in [49], or only an approximation valid for $g_* \gg 1$.

The linear behavior (5.3) of the level number variance at the Anderson transition has by now been confirmed in numerical simulations by several groups [45–48]. The spectral compressibility $\chi$ is an important universal parameter characterizing the critical point of the Anderson transition. Its universality is of the same sort as that of the critical indices, i.e. $\chi$ depends only on the spatial dimensionality and on the symmetry (universality) class. Let us note that the whole level correlation function $R^{(c)}(s)$ is not as universal, since it depends also on the shape of the sample and on the boundary conditions [48,173–175]. This can be expected already from the perturbative $1/g^2$ correction, Eqs. (2.27) and (2.30), where the coefficient $a_d$ does depend on the shape and on the boundary conditions.
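As a small numerical companion to Eq. (5.27), the following minimal Python sketch evaluates $\chi$ for a given dimensionality $d$ and multifractal dimension $D_2$. The function name and the sample values of $D_2$ are illustrative assumptions; the only input taken from the text is the relation (5.27) itself.

```python
def spectral_compressibility(d: float, d2: float) -> float:
    """Evaluate chi = (d - D_2) / (2 d), Eq. (5.27).

    d  : spatial dimensionality
    d2 : multifractal dimension D_2, assumed to satisfy 0 <= D_2 <= d
    """
    if d <= 0.0 or not 0.0 <= d2 <= d:
        raise ValueError("expected d > 0 and 0 <= D_2 <= d")
    return (d - d2) / (2.0 * d)


if __name__ == "__main__":
    # Illustrative values of D_2 in d = 3 (not taken from the text).
    for d2 in (3.0, 1.5, 0.0):
        print(f"d = 3, D_2 = {d2}: chi = {spectral_compressibility(3.0, d2)}")
```

In this form the relation gives $\chi = 0$ when $D_2 = d$ (no multifractality) and $\chi = 1/2$ in the formal limit $D_2 \to 0$.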
Also, the shape dependence becomes evident if one considers the limit of an elongated sample with a length considerably larger than the transverse sample size. Indeed, let us consider (in 3D) a rectangular sample with $L_y = L_z = L_x/a$, where $a$ is a numerical factor [173]. If we fix $a$ and consider the limit $L \to \infty$, the level correlation function has a limiting form which, however, depends on $a$. In particular, at $a \gg 1$ the sample is of quasi-1D geometry with a ratio of the sample length to the localization length $\sim a$, so that the level statistics will be close to Poissonian.

The asymptotic behavior of the nearest-neighbor level spacing distribution function $P(s)$ for $s \gg 1$ at the mobility edge has also been a controversial issue. Recall that for the Poissonian statistics $P(s) = e^{-s}$, while in RMT $P(s) \sim e^{-{\rm const}\,s^2}$. While Refs. [40,41] conjectured that $P(s) \sim e^{-{\rm const}\,s}$, the authors of Ref. [176] found $P(s) \sim e^{-{\rm const}\,s^{2-\gamma}}$ with the same index $\gamma$ as in Eq. (5.4). Recent numerical studies [177,178] support the former result, $P(s) \sim e^{-{\rm const}\,s}$.

5.2. Strong correlations of eigenfunctions near the Anderson transition

In this section, we discuss correlations between the amplitudes of different, but close in energy, eigenstates at and near the critical point of the Anderson transition. Let us recall that in the metallic phase far from the transition point a typical eigenfunction covers the sample volume essentially uniformly. This is reflected in the inverse participation ratio $P_2$, as well as in the higher moments
$P_q = \int d^dr\,|\psi^{2q}(r)|$, which differ only weakly from their RMT values, see Eqs. (3.15) and (3.52). When the system approaches the point of the Anderson transition $E_c$, these extended eigenfunctions become less and less homogeneous in space, showing regions with larger and smaller amplitudes and eventually forming a multifractal structure in the vicinity of $E_c$, see Section 3.3.2. This multifractal behavior is characterized by the following behavior of the moments $P_q$ at the critical point:
$$P_q \propto L^{-D_q(q-1)} , \qquad L < \xi , \tag{5.28}$$
as well as in the conducting phase in the vicinity of the critical point:
$$P_q \propto \xi^{(d-D_q)(q-1)}L^{-d(q-1)} , \qquad L > \xi . \tag{5.29}$$
Here $\xi \propto |E-E_c|^{-\nu}$ is the correlation length and $D_q$ is the set of multifractality exponents, $d > D_2 > D_3 > \ldots$. As Eq. (5.29) indicates, the eigenfunctions become more and more sparse when the system approaches the critical point of the Anderson transition from the metallic phase (i.e. when $\xi$ increases). Just at the mobility edge the scaling of the IPR with the system size $L$ becomes different, see Eq. (5.28), so that the eigenfunction effectively occupies a vanishing fraction of the system volume. Finally, in the insulating phase any eigenstate is localized in a domain of finite extension $\xi$, and the IPR remains finite in the limit of infinite system size $L \to \infty$.

This transparent picture serves as a basis for a qualitative understanding of the spectral properties of disordered conductors. Indeed, as long as eigenstates are well extended and cover the whole sample, they overlap substantially, and the corresponding energy levels repel each other almost in the same way as in RMT. As a result, the Wigner–Dyson (WD) statistics describes well the energy levels in a good metal, see Section 2. In contrast, in the insulating phase different eigenfunctions corresponding to levels close in energy are localized far apart from one another and their overlap is negligible. This is the reason for the absence of correlations of energy levels in this regime, i.e. the Poisson statistics.

A naive extrapolation of this argument to the vicinity of the transition point would lead to a wrong conclusion. Indeed, one might expect that sparse (multifractal at the critical point) eigenstates fail to overlap, which would result in a substantial weakening of level correlations close to the mobility edge and vanishing level repulsion at $E = E_c$. However, numerical simulations show [40,41,179–181] that even at the mobility edge levels repel each other strongly (although the level statistics is different from RMT).
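The scaling statements (5.28) and (5.29) can be made concrete with a minimal Python sketch that computes the moments $P_q$ on a 1D lattice and extracts an effective exponent from two system sizes. The discrete sum replacing the integral, the function names, and the two toy states (a uniform state and an exponentially localized one with an arbitrary decay length) are assumptions made for illustration only; they are not critical eigenfunctions.

```python
import math


def ipr(intensity, q):
    """P_q = sum_r |psi(r)|^(2q), after normalising sum_r |psi(r)|^2 = 1."""
    norm = sum(intensity)
    return sum((p / norm) ** q for p in intensity)


def effective_dimension(n_small, n_large, make_state, q=2):
    """Estimate D_q from P_q ~ L^(-D_q (q - 1)) using two lattice sizes."""
    p_small = ipr(make_state(n_small), q)
    p_large = ipr(make_state(n_large), q)
    slope = (math.log(p_large) - math.log(p_small)) / math.log(n_large / n_small)
    return -slope / (q - 1)


def uniform_state(n):
    """|psi(r)|^2 = const: the metallic limit, where D_q = d = 1."""
    return [1.0] * n


def localized_state(n, xi=5.0):
    """|psi(r)|^2 ~ exp(-2 r / xi): the insulating limit, P_q independent of L."""
    return [math.exp(-2.0 * r / xi) for r in range(n)]


if __name__ == "__main__":
    print("uniform  :", effective_dimension(100, 1000, uniform_state))
    print("localized:", effective_dimension(100, 1000, localized_state))
```

For the uniform state the estimate equals the lattice dimension, while for the localized state $P_q$ does not depend on the system size and the estimate vanishes; a multifractal state would give a value between these two limits.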
As we are going to explain now, this apparent contradiction is resolved in the following way: the critical eigenstates for nearby levels are so strongly correlated that they overlap well in spite of their sparse structure. To demonstrate the strong correlations, we will consider the relation between the overlap function $\sigma(r,r,E,\omega)$, Eq. (3.68), and the "self-overlap" $\alpha(r,r,E)$, Eq. (3.67) (the latter determining the IPR $P_2$ according to $P_2 = \int dr\,\alpha(r,r,E)$). As was shown$^9$ in [182], everywhere in the metallic phase

9 Eq. (5.30) was explicitly derived in [182] for the case of the sparse random matrix model [183,184] corresponding to the limit of infinite dimensionality, $d = \infty$. So, even at $d = \infty$, where the wave function sparsity (multifractality) takes its extreme form, eigenfunctions that are nearby in energy are fully correlated.
(including the vicinity of the critical point)
$$\sigma(r,r,E,\omega) = \frac{\beta}{\beta+2}\,\alpha(r,r,E) \tag{5.30}$$
for $\omega \ll \Delta_\xi$, where $\Delta_\xi \sim 1/\nu\xi^d$ is the level spacing in the correlation volume. Eq. (5.30) implies the following structure of eigenfunctions within an energy interval $\omega < \Delta_\xi$. Each eigenstate can be represented as a product $\Psi_i(r) = \psi_i(r)\Phi_E(r)$. Here the function $\Phi_E(r)$ is an eigenfunction envelope of "humps and dips". It is the same for all eigenstates around the energy $E$, reflects the underlying gross (multifractal) spatial structure, and governs the divergence of the inverse participation ratios $P_q$ at the critical point. In contrast, $\psi_i(r)$ shows RMT-like fluctuations on the scale of the wavelength. It fills the envelope function $\Phi_E(r)$ in an individual way for each eigenfunction, but is not critical, i.e. is not sensitive to the vicinity of the Anderson transition. These Gaussian fluctuations are responsible for the factor $\beta/(\beta+2)$ (which is the same as in the corresponding Gaussian Ensemble) in Eq. (5.30).

As was already mentioned, this picture is valid in the energy window $\omega \sim \Delta_\xi$ around the energy $E$; the number of levels in this window is large, $\Delta_\xi/\Delta \sim (L/\xi)^d \gg 1$, in the limit $L \gg \xi$. These states form a kind of Gaussian Ensemble on a spatially non-uniform (multifractal for $E \to E_c$) background $\Phi_E(r)$. Since the eigenfunction correlations are described by the formula (5.30), which has exactly the same form as in the Gaussian Ensemble, it is not surprising that the level statistics has the WD form everywhere in the extended phase [183,184]. For larger $\omega$, $\sigma(r,r,E,\omega)$ is expected to decrease as $\omega^{-\eta/d}$, where $\eta = d - D_2$, according to the scaling arguments [131,185–189] (see below), so that
$$\sigma(r,r,E,\omega)/\alpha(r,r,E) \sim (\omega/\Delta_\xi)^{-\eta/d} , \qquad \omega > \Delta_\xi , \tag{5.31}$$
up to a numerical coefficient of order unity. The above formulas are valid in the metallic regime, i.e. for $L \gg \xi$. Exactly at the critical point ($\xi \gg L$) they take the form
$$\sigma(r,r,E,\omega)/\alpha(r,r,E) \sim 1 , \qquad \omega < \Delta \tag{5.32}$$
and
$$\sigma(r,r,E,\omega)/\alpha(r,r,E) \sim (\omega/\Delta)^{-\eta/d} , \qquad \omega > \Delta . \tag{5.33}$$
Of course, Eq. (5.32) is not sufficient to ensure the WD statistics at the critical point, since there is only of order of one level within its validity range $\omega \sim \Delta$. Indeed, the numerical simulations show that the level statistics on the mobility edge is different from the WD one [40,41,179–181]. However, Eq. (5.32) allows us to make an important conclusion concerning the behavior of $R(s)$ (or, which is essentially the same, of the nearest-neighbor spacing distribution $P(s)$) at small $s = \omega/\Delta$. For this purpose, it is sufficient to consider only two neighboring levels. Let their energy difference be $\omega_0 \sim \Delta$. Let us now perturb the system by a random potential $V(r)$ with $\langle V(r)\rangle = 0$, $\langle V(r)V(r')\rangle = C\delta(r-r')$. For the two-level system it reduces to a $2\times 2$ matrix $\{V_{ij}\}$, $i,j = 1,2$, with elements $V_{ij} = \int d^dr\,V(r)\Psi_i^*(r)\Psi_j(r)$. The crucial point is that the variances of the diagonal and off-diagonal matrix elements are, according to Eq. (5.32), equal to each other up to a factor of order unity:
$$\langle V_{11}^2\rangle/\langle |V_{12}^2|\rangle = \sigma(r,r,E,\omega)/\alpha(r,r,E) \sim 1 . \tag{5.34}$$
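The two-level argument that starts here and leads to the small-$s$ behavior $P(s) \propto s^\beta$ can be illustrated by a minimal Monte Carlo sketch for the orthogonal class ($\beta = 1$). The Gaussian distribution of the matrix elements, their unit variances (comparable diagonal and off-diagonal elements, in the spirit of Eq. (5.34)), the sample size, and all names are assumptions chosen for illustration. The sketch uses the exact eigenvalue splitting of a real symmetric $2\times 2$ matrix, which carries a factor 4 in front of $V_{12}^2$; this affects only the prefactor, not the exponent.

```python
import math
import random


def small_spacing_exponent(n_samples=200_000, omega0=1.0, eps=0.2, seed=1):
    """Estimate beta in Prob(omega < eps) ~ eps^(beta + 1) for two perturbed levels.

    The unperturbed splitting is omega0; V11, V22, V12 are independent
    real Gaussian variables of unit variance (an illustrative assumption).
    """
    rng = random.Random(seed)
    below_eps = 0
    below_2eps = 0
    for _ in range(n_samples):
        v11 = rng.gauss(0.0, 1.0)
        v22 = rng.gauss(0.0, 1.0)
        v12 = rng.gauss(0.0, 1.0)
        # Level separation of [[v11 + omega0, v12], [v12, v22]].
        omega = math.hypot(v11 - v22 + omega0, 2.0 * v12)
        if omega < 2.0 * eps:
            below_2eps += 1
            if omega < eps:
                below_eps += 1
    # Doubling eps multiplies the cumulative probability by 2^(beta + 1).
    return math.log2(below_2eps / below_eps) - 1.0


if __name__ == "__main__":
    print("estimated beta:", small_spacing_exponent())
```

Because both a diagonal and an off-diagonal combination must vanish simultaneously for the levels to cross, the probability of a small separation is suppressed as $\omega^{\beta}$, and the estimate comes out close to 1 for real symmetric matrices.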
The distance between the perturbed levels is given by $\omega = [(V_{11} - V_{22} + \omega_0)^2 + |V_{12}|^2]^{1/2}$. Choosing the amplitude of the potential in such a way that the typical energy shift $V_{11} \sim \Delta$ and using Eq. (5.34), we find $\langle |V_{12}|^2\rangle \sim \Delta$. As a result, the probability density for the level separation $\omega$ is for $\omega \ll \Delta$ of the form $dP \sim (\omega/\Delta)^\beta d\omega/\Delta$, with some prefactor of order unity. We thus conclude that at the critical point
$$P(s) \simeq c_\beta s^\beta , \qquad s \ll 1 , \tag{5.35}$$
with a coefficient $c_\beta$ of order unity, in agreement with the numerical findings [41,179,181].

5.2.1. Spatial correlations at the mobility edge

Another question that one can ask concerning the wave function correlations at the mobility edge is how the correlations decay with distance in real space. Let us consider correlations $\alpha(r,r',E)$ of the eigenfunction amplitudes at two different points. Matching the behavior of $\alpha(r,r',E)$ at $r = r'$ with that at $|r-r'| \sim \min\{L,\xi\}$ (where no strong correlations are expected), we find in the metallic phase ($L \gg \xi$ …
(5.36)
and at the critical point ($\xi \gg L$)
$$\alpha(r,r',E) \sim L^{-2d+\eta}|r-r'|^{-\eta} , \qquad l \lesssim |r-r'| \lesssim L . \tag{5.37}$$
The same consideration can be straightforwardly applied to the higher correlation functions $\langle |\psi^{2q_1}(r)||\psi^{2q_2}(r')|\rangle$, leading to the conclusion that they decay as
$$\langle |\psi^{2q_1}(r)||\psi^{2q_2}(r')|\rangle \propto |r-r'|^{-[d+\tau(q_1)+\tau(q_2)-\tau(q_1+q_2)]} ,$$
where $\tau(q) \equiv (q-1)D_q$ has already been introduced in Section 3.3.2. These results were obtained for the first time by Wegner [190] from renormalization-group calculations.

The above formulas for the spatial correlations at the mobility edge can be generalized to the correlator $\sigma(r,r',E,\omega)$ of two different eigenfunctions by using the scaling assumption that it is the length $L_\omega = L(\Delta/\omega)^{1/d} = (1/\omega\nu)^{1/d}$ which sets the correlation range at finite $\omega$. This gives for the correlations at criticality ($l < |r-r'| < L_\omega < L, \xi$)
$$\sigma(r,r',E,\omega) \sim L^{-2d}(\omega\nu)^{-\eta/d}|r-r'|^{-\eta} . \tag{5.38}$$
Chalker and Daniell [185,186] conjectured that the diffusion propagator $\gamma(r,r',E,\omega)$, Eq. (3.77), has the same behavior (5.38) at the mobility edge. Extensive numerical simulations [185,187–189] fully confirm this conjecture, which determines the form of the anomalous diffusion at the critical point of the Anderson transition.

5.3. Power-law random banded matrix ensemble: Anderson transition in 1D

The ensemble of random banded matrices (RBM) is defined as the set of matrices with elements
$$H_{ij} = G_{ij}\,a(|i-j|) , \tag{5.39}$$
where the matrix $G$ runs over the Gaussian ensemble, and $a(r)$ is some function satisfying the condition $\lim_{r\to\infty}a(r) = 0$ and determining the shape of the band. In the most frequently considered
case of RBM the function $a(r)$ is considered to be (exponentially) fast decaying when $r$ exceeds some typical value $b$ called the bandwidth. Matrices of this sort were first introduced as an attempt to describe an intermediate level statistics for Hamiltonian systems in a transitional regime between complete integrability and fully developed chaos [191] and then appeared in various contexts ranging from atomic physics (see [192] and references therein) to solid state physics [18] and especially in the course of investigations of the quantum behavior of periodically driven Hamiltonian systems [193,194]. The most studied system of the latter type is the quantum kicked rotor [104] characterized by the Hamiltonian
$$\hat H = \frac{\hat l^2}{2I} + V(\theta)\sum_{m=-\infty}^{\infty}\delta(t - mT) , \tag{5.40}$$
where $\hat l = -i\hbar\,\partial/\partial\theta$ is the angular momentum operator conjugate to the angle $\theta$. The constants $T$ and $I$ are the period of the kicks and the moment of inertia, respectively, and $V(\theta)$ is usually taken to be $V(\theta) = k\cos\theta$. Classically, the kicked rotor exhibits unbounded diffusion in the angular momentum space when the strength of the kicks $k$ exceeds some critical value. It was found, however, that quantum effects suppress the classical diffusion, in close analogy with the localization of a quantum particle by a random potential [104,195]. It is natural to consider the evolution (Floquet) operator $\hat U$ that relates the values of the wavefunction over one period of the perturbation, $\psi(\theta,t+T) = \hat U\psi(\theta,t)$, in the "unperturbed" basis of eigenfunctions of the operator $\hat l$:
$$|n\rangle = \frac{1}{(2\pi)^{1/2}}\exp(in\theta) , \qquad n = 0, \pm 1, \ldots .$$
The matrix elements $\langle m|U|n\rangle$ tend to zero when $|m-n| \to \infty$. In the case $V(\theta) = k\cos\theta$ this decay is faster than exponential when $|m-n|$ exceeds $b \approx k/\hbar$, whereas within the band of the effective width $b$ the matrix elements prove to be pseudorandom [104]. Let us note, however, that the exponentially rapid decay of $\langle m|U|n\rangle$ in the above-mentioned situation is due to the infinite differentiability of $V(\theta) = \cos\theta$.
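The decay of the Floquet matrix elements can be checked with a minimal Python sketch. It assumes that the free rotation between kicks is diagonal in the basis $|n\rangle$ with unit-modulus entries, so that $|\langle m|U|n\rangle|$ equals the modulus of the Fourier coefficient of the kick factor $\exp[-i\kappa v(\theta)]$, with $\kappa = k/\hbar$. The value $\kappa = 5.5$, the grid size, the function names, and the kinked test potential $v(\theta) = |\theta - \pi|$ are illustrative assumptions and are not taken from the text.

```python
import cmath
import math


def kick_matrix_element(j, kappa, potential, n_grid=4096):
    """|<m| exp(-i kappa v(theta)) |n>| for j = m - n, by uniform-grid quadrature."""
    total = 0j
    for step in range(n_grid):
        theta = 2.0 * math.pi * step / n_grid
        total += cmath.exp(-1j * (j * theta + kappa * potential(theta)))
    return abs(total / n_grid)


def smooth_potential(theta):
    """v(theta) = cos(theta), infinitely differentiable."""
    return math.cos(theta)


def kinked_potential(theta):
    """Hypothetical example: continuous on the circle, kinks at 0 and pi."""
    return abs(theta - math.pi)


if __name__ == "__main__":
    kappa = 5.5
    for j in (0, 5, 10, 20, 40):
        smooth = kick_matrix_element(j, kappa, smooth_potential)
        kinked = kick_matrix_element(j, kappa, kinked_potential)
        print(f"|m-n| = {j:2d}   cos: {smooth:.3e}   kinked: {kinked:.3e}")
```

For the cosine kick the element equals $|J_{m-n}(\kappa)|$ and collapses once $|m-n|$ exceeds $\kappa$, whereas the kinked potential produces a tail that falls off only as $|m-n|^{-2}$.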
If we take a function $V(\theta)$ having a discontinuity in a derivative of some order, the corresponding matrix elements of the evolution operator would decay in a power-law fashion in the limit $|n-m| \to \infty$. In fact, there is an interesting example of a periodically driven system where the matrix elements of the evolution operator decay in a power-law way, namely the so-called Fermi accelerator [58]. Power-law (pseudo-)random banded matrices appear also in other models of quantum chaos, such as a close-to-circular Bunimovich stadium [196–198] or a Coulomb center inside an integrable billiard [59]. One may also consider the random matrix (5.39) as the Hamiltonian of a one-dimensional tight-binding model with long-ranged off-diagonal disorder (random hopping). A closely related problem with non-random long-range hopping and diagonal disorder was studied numerically in Ref. [199]. The effect of weak long-range hopping on the localized states in a 3D Anderson insulator was discussed by Levitov [200,201,324]. Similar models with a power-law hopping appear also in other physical contexts [60–62,202]. As was shown in [18,102], the conventional RBM model can be mapped onto a 1D supermatrix non-linear $\sigma$-model, which allows for an exact analytical solution. The same $\sigma$-model was derived
initially [6,101] for a particle moving in a quasi-1D system (a wire) and being subject to a random potential. All states are found to be asymptotically localized, with the localization length
$$\xi = \frac{B_2(2B_0 - E^2)}{8B_0^2} \propto b^2 , \qquad B_k = \sum_{r=-\infty}^{\infty}a^2(r)\,r^k . \tag{5.41}$$
However, for the case of a power-like shape of the band,
$$a(r) \propto r^{-\alpha} \quad \text{for large } r , \tag{5.42}$$
this derivation should be reconsidered. As explained below, this leads [57] to a more general 1D $\sigma$-model with long-range interaction, which is much richer than the conventional short-range one. In particular, it exhibits the Anderson localization transition at $\alpha = 1$. We mainly follow Ref. [57] in our presentation.

5.3.1. Mapping onto the effective σ-model

Let us consider the ensemble of large $N \times N$ matrices ($N \to \infty$) defined by Eq. (5.39) with a function $a(r)$ having the form
$$a(r) \simeq \begin{cases} 1 , & r < b ,\\ (r/b)^{-\alpha} , & r > b . \end{cases} \tag{5.43}$$
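As an illustration of the definitions (5.39) and (5.43), the following minimal Python sketch draws one real symmetric matrix from such an ensemble. The normalization of the Gaussian factor $G$ (diagonal variance twice the off-diagonal one, as in a GOE-type convention), the treatment of the point $r = b$, and all function names are assumptions made here; the text only specifies that $G$ runs over the Gaussian ensemble.

```python
import random


def band_profile(r, b, alpha):
    """a(r) of Eq. (5.43): flat inside the band, power-law tail outside it."""
    return 1.0 if r < b else (r / b) ** (-alpha)


def sample_prbm(n, b, alpha, seed=0):
    """Draw one n x n real symmetric matrix H_ij = G_ij a(|i - j|), Eq. (5.39)."""
    rng = random.Random(seed)
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Assumed convention: Var(G_ii) = 2, Var(G_ij) = 1 for i != j.
            sigma = 2.0 ** 0.5 if i == j else 1.0
            value = rng.gauss(0.0, sigma) * band_profile(abs(i - j), b, alpha)
            h[i][j] = value
            h[j][i] = value
    return h


if __name__ == "__main__":
    matrix = sample_prbm(n=8, b=2.0, alpha=1.0)
    for row in matrix:
        print(" ".join(f"{x:+.2f}" for x in row))
```

With $\alpha = 1$ the typical size of an element at distance $|i-j| > b$ from the diagonal falls off as $b/|i-j|$, which is the critical case labelled by $b$ in the discussion that follows.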
The parameter $b$ will serve to label the critical models with $\alpha = 1$. We will consider $b$ to be large, $b \gg 1$, in order to justify formally the derivation of the $\sigma$-model. We will argue later on that our conclusions are qualitatively valid for arbitrary $b$ as well. We will call the ensemble (5.39), (5.43) the power-law random banded matrix (PRBM) model. While considering localization properties, we will restrict ourselves to a one-loop calculation and, for this reason, to the orthogonal ensemble.

The PRBM model can be mapped onto the $\sigma$-model on a lattice,
$$S\{Q\} = -\frac{1}{4}(\pi\nu A_0)^2\,{\rm Str}\sum_{ij}\left[(A^{-1})_{ij} - A_0^{-1}\delta_{ij}\right]Q(i)Q(j) - \frac{i\pi\nu\omega}{4}\sum_i {\rm Str}\,Q(i)\Lambda . \tag{5.44}$$
Here $Q(i) = T_i^{-1}\Lambda T_i$ satisfies the constraint $Q^2(i) = 1$, $A$ is the matrix with elements $A_{ij} = a^2(|i-j|)$, $A_0$ is given by$^{10}$ $A_0 = \sum_l A_{kl} \approx \sum_{r=-\infty}^{\infty}a^2(r)$, and $\nu$ is the density of states:
$$\nu = \frac{1}{2\pi A_0}(4A_0 - E^2)^{1/2} . \tag{5.45}$$
The standard next step is to restrict oneself to the long-wavelength fluctuations of the $Q$-field. For the usual RBM, characterized by a function $a(r)$ decreasing faster than any power of $r$ as $r \to \infty$, this is
10 The expression for $A_0$ is valid for $\alpha > 1/2$ and in the limit $N \to \infty$. When $\alpha < 1/2$, $A_0$ starts to have a dependence on $N$, which can be removed by a proper rescaling of the matrix elements $H_{ij}$ in Eq. (5.39). Then, the properties of the model with $\alpha < 1/2$ turn out to be equivalent to those of the GOE, so we will not consider this case any longer.
achieved by the momentum expansion of the first term in the action (5.44):
$$\sum_{ij}\left[(A^{-1})_{ij} - A_0^{-1}\delta_{ij}\right]Q(i)Q(j) \equiv N^{-1}\sum_q\left[A_q^{-1} - A_0^{-1}\right]Q_qQ_{-q} \approx \frac{B_2}{2A_0^2}N^{-1}\sum_q q^2Q_qQ_{-q} = \frac{B_2}{2A_0^2}\int dx\,(\partial_xQ)^2 , \tag{5.46}$$
where $Q_q$ is the Fourier transform of $Q(i)$, and $B_2 = \sum_l A_{kl}(k-l)^2$, as defined in Eq. (5.41). This immediately leads to the standard continuous version of the non-linear $\sigma$-model:
$$S\{Q\} = -\frac{\pi\nu}{4}\,{\rm Str}\int dx\left[\frac{1}{2}D_0(\partial_xQ)^2 + i\omega Q\Lambda\right] \tag{5.47}$$
with the classical diffusion constant $D_0 = \pi\nu B_2$, which implies the exponential localization of eigenstates with the localization length $\xi = \pi\nu D_0 \propto b^2$.

Let us try to implement the same procedure for the present case of the power-like bandshape (5.43). Restricting ourselves to the lowest-order term in the momentum expansion, we arrive again at Eq. (5.47) as long as $\alpha \ge 3/2$. This suggests that for $\alpha \ge 3/2$ the eigenstates of the present model should be localized in a spatial domain of the extension $\xi \propto \nu D_0$. However, in contrast to the usual RBM model, this localization is power-like rather than exponential: $|\psi(r)|^2 \propto r^{-2\alpha}$ at $r \gg \xi$. This is quite evident due to the possibility of direct hopping with the power-law amplitude. On a more formal level, the appearance of power-law tails of wave functions is a consequence of the breakdown of the momentum expansion for the function $A_q^{-1} - A_0^{-1}$ in higher orders in $q^2$. The presence of power-law "tails" of the wave functions, with an exponent $\alpha$ determined by the decay of hopping elements, was found in numerical simulations in Refs. [197–199].

The most interesting region $1/2 < \alpha < 3/2$ requires a separate consideration. In this case, Eq. (5.46) loses its validity in view of the divergence of the coefficient $B_2$. We find instead
$$A_0^2\left(A_q^{-1} - A_0^{-1}\right) \approx A_0 - A_q = 2\int_0^\infty dr\,a^2(r)(1-\cos qr) = \frac{2}{|q|}\left\{\int_0^{b|q|}dx\,(1-\cos x) + (b|q|)^{2\alpha}\int_{b|q|}^{\infty}\frac{dx}{x^{2\alpha}}(1-\cos x)\right\} \approx c_\alpha b^{2\alpha}|q|^{2\alpha-1} \quad \text{for } 1/2 < \alpha < 3/2 \text{ and } |q| \ll 1/b , \tag{5.48}$$
where
$$c_\alpha = 2\int_0^\infty\frac{dx}{x^{2\alpha}}(1-\cos x)$$
is a numerical constant, $c_\alpha = [2\Gamma(2-2\alpha)/(2\alpha-1)]\sin\pi\alpha$ for $\alpha \neq 1$ and $c_1 = \pi$. The corresponding long-wavelength part of the action,
$$S_0\{Q\} = -\frac{1}{t}\,{\rm Str}\int(dq)\,|q|^{2\alpha-1}Q_qQ_{-q} , \tag{5.49}$$
where
$$\int(dq) \equiv \int\frac{dq}{2\pi} \equiv N^{-1}\sum_q ,$$
cannot be reduced to a local-in-space form in the coordinate representation any longer. Here $1/t = \frac{1}{4}(\pi\nu)^2c_\alpha b^{2\alpha} \propto b^{2\alpha-1} \gg 1$ plays the role of the coupling constant, justifying the perturbative and renormalization group treatment of the model described below.

Let us mention that if we consider the RBM model as a tight-binding Hamiltonian, the corresponding classical motion described by the master equation on the 1D lattice is superdiffusive for $1/2 < \alpha < 3/2$, with a typical displacement in a time $t$ being $r \propto t^{1/(2\alpha-1)}$ (Lévy law of index $2\alpha-1$; see Ref. [203] for a review). As will be discussed in Section 5.3.3, this influences the asymptotic behavior of the spectral correlation function for the corresponding quantum system.
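The small-momentum asymptotics (5.48) and the closed form of the constant $c_\alpha$ can be cross-checked with a minimal Python sketch. It compares the lattice sum $A_0 - A_q = 2\sum_{r\ge 1}a^2(r)(1-\cos qr)$, with $a(r)$ taken from Eq. (5.43), against $c_\alpha b^{2\alpha}|q|^{2\alpha-1}$. The truncation of the sum at a finite `r_max`, the parameter values $b = 5$, $q = 0.01$, and the function names are assumptions made for illustration.

```python
import math


def c_alpha(alpha):
    """c_alpha = [2 Gamma(2 - 2 alpha) / (2 alpha - 1)] sin(pi alpha), with c_1 = pi.

    Intended for 1/2 < alpha < 3/2, the range quoted in the text.
    """
    if abs(alpha - 1.0) < 1e-12:
        return math.pi
    return (2.0 * math.gamma(2.0 - 2.0 * alpha) / (2.0 * alpha - 1.0)
            * math.sin(math.pi * alpha))


def a0_minus_aq(q, b, alpha, r_max=200_000):
    """Truncated lattice sum 2 * sum_{r=1}^{r_max} a^2(r) (1 - cos(q r))."""
    total = 0.0
    for r in range(1, r_max + 1):
        a_squared = 1.0 if r < b else (r / b) ** (-2.0 * alpha)
        total += 2.0 * a_squared * (1.0 - math.cos(q * r))
    return total


if __name__ == "__main__":
    b, q, alpha = 5.0, 0.01, 1.0
    lattice = a0_minus_aq(q, b, alpha)
    asymptotic = c_alpha(alpha) * b ** (2.0 * alpha) * abs(q) ** (2.0 * alpha - 1.0)
    print("lattice sum:", lattice, " asymptotic form:", asymptotic)
```

For these values the two numbers agree to within a few percent, the difference coming from corrections of relative order $b|q|$ and from the truncation of the sum; the closed form also tends continuously to $\pi$ as $\alpha \to 1$.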
5.3.1.1. Perturbative treatment of the non-linear σ-model: General formulas. Here we derive one-loop perturbative corrections to the density–density correlation function and inverse participation ratios. Analysis of these expressions for various values of the power-law parameter $\alpha$ will be presented in Section 5.3.2. The density–density correlation function (diffusion propagator)
$$K(r_1,r_2;\omega) = G_R^{E+\omega/2}(r_1,r_2)\,G_A^{E-\omega/2}(r_2,r_1)$$
can be expressed in terms of the $\sigma$-model as follows [6]:
$$K(r_1,r_2;\omega) = -(\pi\nu)^2\int DQ\,Q_{12,\alpha\beta}(r_1)\,k_{\beta\beta}\,Q_{21,\beta\alpha}(r_2)\,e^{-S\{Q\}} . \tag{5.50}$$
Here the indices $p, p'$ of the matrix $Q_{pp',\alpha\beta}$ correspond to its advanced–retarded block structure, whereas $\alpha, \beta$ discriminate between bosonic and fermionic degrees of freedom. The matrix $k_{\beta\beta}$ is equal to 1 for bosons and $(-1)$ for fermions. To calculate the correlation function (5.50) perturbatively, we use here the following parametrization [6] of the matrix $Q$:
$$Q = \Lambda\left(W + \sqrt{1+W^2}\right) = \Lambda\left(1 + W + \frac{W^2}{2} - \frac{W^4}{8} + \ldots\right) , \tag{5.51}$$
where $W$ is block-off-diagonal in the advanced–retarded representation. To get the perturbative expansion for $K(r_1,r_2;\omega)$, one has to substitute Eq. (5.51) into (5.50), to separate the part quadratic in $W$ from the rest in the exponent, and to apply the Wick theorem. In the usual case, when the action is given by Eq. (5.47), the leading order (tree level) result reads in the momentum space as follows:
$$K_0(q,\omega) = \frac{2\pi\nu}{D_0q^2 - i\omega} . \tag{5.52}$$
The perturbative quantum corrections do not modify the general form (5.52), but change the value of the diffusion constant. In particular, in one-loop order one gets Eq. (5.52) with $D_0$ replaced
by [204]
$$D = D_0\left\{1 - \frac{1}{\pi\nu V}\sum_{q_i = \pi n_i/L_i}\frac{1}{D_0q^2 - i\omega}\right\} . \tag{5.53}$$
This induces the standard weak-localization correction to the conductivity.

Now we implement an analogous procedure for the non-local $\sigma$-model of the type of Eqs. (5.44) and (5.49):
$$S\{Q\} = \frac{1}{t}\,{\rm Str}\sum_{r,r'}U(r-r')Q(r)Q(r') - i\frac{\pi\nu\omega}{4}\sum_r{\rm Str}\,\Lambda Q(r) , \tag{5.54}$$
with $U(r) \propto r^{-2\alpha}$ as $r \to \infty$, so that the Fourier transform of $U(r)$ behaves at small momenta as
$$\tilde U(q) = -|q|^\sigma , \qquad 1/2 < \sigma < 2 . \tag{5.55}$$
The exponent $\sigma$ is related to the parameter $\alpha$ of the RBM model by $\sigma = 2\alpha - 1$. In leading order, we keep in the action the terms quadratic in $W$ only, which yields
$$K_0(q,\omega) = \frac{2\pi\nu}{8(\pi\nu t)^{-1}|q|^\sigma - i\omega} , \tag{5.56}$$
corresponding to a superdiffusive behavior.

To calculate the one-loop correction to $K_0(q)$ (we set $\omega = 0$ for simplicity) we expand the kinetic term in $S\{Q\}$ up to fourth order in $W$:
$$\sum_{r,r'}{\rm Str}\,U(r-r')Q(r)Q(r')\Big|_{\rm 4th\ order} = \frac{1}{4}\sum_{r,r'}U(r-r')\,{\rm Str}\,W^2(r)W^2(r') . \tag{5.57}$$
The contraction rules are given by Eqs. (8) and (16) of Ref. [25] (reproduced below as Eq. (2.21) for the unitary symmetry), with the propagator $P(q)$ replaced by
$$P(q) = t/8|q|^\sigma . \tag{5.58}$$
Evaluating the one-loop correction, we get the following expression for the density–density correlation function up to the one-loop order:
$$K^{-1}(q) = K_0^{-1}(q) - \frac{1}{2(\pi\nu)^2}\int(dk)\,\frac{|q+k|^\sigma - |k|^\sigma}{|k|^\sigma} . \tag{5.59}$$
Now we calculate the perturbative correction to the inverse participation ratios $P_q$. The results of Ref. [25] presented in Section 3.3.1 are straightforwardly applicable to the present case of power-law RBM, provided the appropriate modification of the diffusion propagator entering the contraction rules is made, see the text preceding Eq. (5.58). One finds
$$\langle P_q\rangle = \frac{(2q-1)!!}{N^{q-1}}\left\{1 + \frac{1}{N}q(q-1)\sum_rP(r,r)\right\} , \tag{5.60}$$
where $P(r,r') = (1/N)\sum_qP(q)\exp[iq(r-r')]$ and $P(q)$ is given by Eq. (5.58).
5.3.1.2. Renormalization group treatment. The effective $\sigma$-model, Eq. (5.54), is actually of one-dimensional nature. However, for the sake of generality, we find it convenient to consider it here to be defined in $d$-dimensional space with arbitrary $d$. The form (5.56) of the generalized diffusion propagator implies that $d = \sigma$ plays the role of the logarithmic dimension for the problem. In the vicinity of this critical value we can carry out a renormalization group (RG) treatment of the model, following the procedure developed for general non-linear $\sigma$-models in [205–208]. We begin by expressing the action in terms of the renormalized coupling constant $t = Z_1^{-1}t_B\mu^{d-\sigma}$, where $t_B$ is the bare coupling constant, $Z_1$ is the renormalization constant, and $\mu^{-1}$ is the length scale governing the renormalization:$^{11}$
$$S = \frac{\mu^{d-\sigma}}{2tZ_1}\sum_{rr'}U(|r-r'|)\,{\rm Str}\left[-W(r)W(r') + \sqrt{1+W^2(r)}\sqrt{1+W^2(r')}\right] - \frac{i\pi\nu\omega}{4}\sum_r{\rm Str}\sqrt{1+W^2(r)} . \tag{5.61}$$
Expanding the action in powers of $W(r)$ and keeping terms up to 4th order, we get
$$S = S_0 + S_1 + O(W^6) ,$$
$$S_0 = \frac{\mu^{d-\sigma}}{4tZ_1}\sum_{rr'}U(|r-r'|)\,{\rm Str}\,(W(r)-W(r'))^2 - \frac{i\pi\nu\omega}{8}\sum_r{\rm Str}\,W^2(r) , \tag{5.62}$$
$$S_1 = \frac{\mu^{d-\sigma}}{8tZ_1}\sum_{rr'}U(|r-r'|)\,{\rm Str}\,W^2(r)W^2(r') + \frac{i\pi\nu\omega}{32}\sum_r{\rm Str}\,W^4(r) .$$
We have restricted ourselves to 4th order terms, since they are sufficient for obtaining the renormalized quadratic part of the action in one-loop order. The calculation yields, after the cancelation of an $\int(dk) \propto \delta(0)$ term with the contribution of the Jacobian:
$$S_{\rm quad} = S_0 + \langle S_1\rangle = \frac{1}{2}\int(dq)\,{\rm Str}\,W_qW_{-q}\left\{-\frac{\mu^{d-\sigma}}{tZ_1}\left[\tilde U(q) - \frac{1}{2}\int(dk)\,\frac{\tilde U(k) - \tilde U(k+q)}{\frac{\mu^{d-\sigma}}{tZ_1}\tilde U(k) + \frac{i\pi\nu\omega}{4}}\right] - \frac{i\pi\nu\omega}{4}\right\} . \tag{5.63}$$
According to the renormalization group idea, one has to choose the constant $Z_1(t) = 1 + at + \ldots$ so as to cancel the divergence in the coefficient in front of the leading $|q|^\sigma$ term.

5.3.2. Localization and eigenfunction statistics

In this subsection we analyze the model in different regions of the exponent $\sigma = 2\alpha - 1$ using the general formulas derived in Section 5.3.1.
11 Note that the $W$-field renormalization is absent due to the supersymmetric character of the problem, which is physically related to the particle number conservation [6].
5.3.2.1. Localized regime: $1 < \sigma < 2$ ($1 < \alpha < 3/2$). To evaluate the one-loop correction (5.59) to the diffusion propagator, we use the expansion
$$\frac{|q+k|^\sigma}{|k|^\sigma} - 1 \simeq \begin{cases}\sigma\dfrac{qk}{k^2} + \dfrac{\sigma}{2}\dfrac{q^2}{k^2} + \sigma\left(\dfrac{\sigma}{2}-1\right)\left(\dfrac{qk}{k^2}\right)^2 + \ldots , & q \ll k ,\\[2mm] \dfrac{|q|^\sigma}{|k|^\sigma} , & q \gg k .\end{cases} \tag{5.64}$$
Thus, the integral in Eq. (5.59) can be estimated as
$$I \equiv \int(dk)\left[\frac{|q+k|^\sigma}{|k|^\sigma} - 1\right] \approx \int_{k\gg q}(dk)\,\sigma\left(1 + \frac{\sigma-2}{d}\right)\frac{q^2}{2k^2} + \int_{k\ll q}(dk)\,\frac{|q|^\sigma}{|k|^\sigma} , \tag{5.65}$$
and we are interested in the particular case of the dimension $d = 1$. For $\sigma > d = 1$ the integral diverges at low $k$, and the second term in Eq. (5.65) dominates. This gives
$$I \sim L^{\sigma-1}|q|^\sigma , \tag{5.66}$$
where $L$ is the system size determining the infrared cut-off (in the original RBM formulation it is just the matrix size $N$). This leads to the following one-loop expression for the diffusion correlator:
$$K(q) = \frac{(\pi\nu)^2\tilde t}{4|q|^\sigma} , \qquad \tilde t^{-1} = t^{-1} - {\rm const}\,L^{\sigma-1} . \tag{5.67}$$
Now we turn to the renormalization group analysis, as described in Section 5.3.1. In the one-loop order, the expression for the renormalization constant $Z_1$ is determined essentially by the same integral $I$, Eq. (5.65), with the RG scale $\mu$ playing the role of the infrared cut-off (analogous to that of the system size $L$ in Eq. (5.67)). This yields (in the minimal subtraction scheme):
$$Z_1(t) = 1 - \frac{1}{2\pi}\frac{t}{\sigma-1} + O(t^2) , \tag{5.68}$$
leading to the following relation between the bare and the renormalized coupling constant analogous to Eq. (5.67):
$$\frac{Z_1}{t_B} = \frac{1}{t}\mu^{1-\sigma} , \qquad \frac{1}{t}\mu^{1-\sigma} = \frac{1}{t_B} - \frac{1}{2\pi}\frac{1}{\sigma-1}\mu^{1-\sigma} . \tag{5.69}$$
From Eq. (5.68), we get the expression for the $\beta$-function:
$$\beta(t) = \frac{\partial t}{\partial\ln\mu}\bigg|_{t_B} = \frac{(1-\sigma)t}{1 + t\,\partial_t\ln Z_1(t)} = -(\sigma-1)t - \frac{t^2}{2\pi} + O(t^3) . \tag{5.70}$$
Both Eqs. (5.67) and (5.70) show that the coupling constant $t$ increases with the system size $L$ (resp. the length scale $\mu^{-1}$), which is analogous to the behavior found in the conventional scaling theory of localization in $d < 2$ dimensions [167,209]. The RG flow reaches the strong coupling regime $t \sim 1$ at the scale $\mu \sim t_B^{1/(\sigma-1)}$. Remembering the relation of the bare coupling constant $t_B$ and the index $\sigma$ to the
parameters of the original PRBM model, $t_B^{-1} \propto b^{2\alpha-1}$ and $\sigma = 2\alpha - 1$, we conclude that the length scale
$$\xi \sim t_B^{-1/(\sigma-1)} \sim b^{(2\alpha-1)/(2\alpha-2)} \tag{5.71}$$
plays the role of the localization length for the PRBM model. This conclusion is also supported by an inspection of the expression for the IPR, Eq. (5.60). Evaluation of the one-loop perturbative correction in Eq. (5.60) yields
$$\langle P_q\rangle = \frac{(2q-1)!!}{N^{q-1}}\left\{1 + q(q-1)\frac{t}{8\pi^\sigma}\zeta(\sigma)N^{\sigma-1}\right\} , \tag{5.72}$$
where $\zeta(\sigma)$ is Riemann's zeta-function; $\zeta(\sigma) \simeq 1/(\sigma-1)$ for $\sigma$ close to unity. The correction term becomes comparable to the leading (GOE) contribution for the system size $N \sim t^{-1/(\sigma-1)}$, parametrically coinciding with the localization length $\xi$. For larger $N$ the perturbative expression (5.72) loses its validity, and the IPR is expected to saturate at a constant value $\langle P_q\rangle \sim \xi^{1-q}$ for $N \gg \xi$. In concluding this subsection, let us stress once more that the localized eigenstates in the present model are expected to have integrable power-law tails: $|\psi^2(r)| \propto r^{-2\alpha} = r^{-\sigma-1}$ at $r \gg \xi$.

5.3.2.2. Delocalized regime: $0 < \sigma < 1$ ($1/2 < \alpha < 1$). We begin again by considering the perturbative corrections to the diffusion propagator (5.59). The first term in the r.h.s. of Eq. (5.65), proportional to $\int dk/k^2$, is determined by the vicinity of its lower cut-off (i.e. $k \sim q$), whereas the second one is proportional to $\int dk/k^\sigma$ and thus determined by the vicinity of its upper cut-off (i.e. again $k \sim q$). Therefore, the integral (5.65) is now dominated by the region $k \sim q$, and is proportional to $|q|$. We get, therefore:
$$(\pi\nu)^2K^{-1}(q) = \frac{4|q|^\sigma}{t} - C_\sigma|q| \tag{5.73}$$
with a numerical constant $C_\sigma$. We see that the correction term is of higher order in $|q|$ as compared to the leading one. Thus, it does not lead to a renormalization of the coupling constant $t$. This is readily seen also in the framework of the RG scheme, where the one-loop integral in Eq. (5.63) does not give rise to terms of the form $|q|^\sigma$. One can check that this feature is not specific to the one-loop RG calculation, but holds in higher orders as well. For the case of a vector model with long-ranged interaction this conclusion was reached by Brezin et al. [210]. Thus, the renormalization constant $Z_1$ is equal to unity and the $\beta$-function is trivial:
$$\beta(t) = (1-\sigma)t . \tag{5.74}$$
This means that the model does not possess a critical point, and, for $\sigma < 1$, all states are delocalized, for any value of the bare coupling constant $t$. This property should be contrasted with the behavior of a $d$-dimensional conductor described by the conventional local non-linear $\sigma$-model and undergoing an Anderson transition at some critical coupling $t = t_c$ [167].

Though all states of the model are delocalized, their statistical properties are different from GOE. In particular, calculating the variance of the inverse participation ratio $P_2$ (see Section 3.3.3, Eq. (3.89)) we get
$$\delta(P_2) = \frac{\langle P_2^2\rangle - \langle P_2\rangle^2}{\langle P_2\rangle^2} = \frac{8}{N^2}\sum_{rr'}\Pi^2(r,r') = \frac{8}{N^2}\sum_{q=\pi n/N;\ n=1,2,\ldots}\Pi^2(q). \tag{5.75}$$
At $\sigma > 1/2$ the sum over $q$ is convergent, yielding
$$\delta(P_2) = \frac{t^2\,\zeta(2\sigma)}{8\pi^{2\sigma}}\,\frac{1}{N^{2-2\sigma}}. \tag{5.76}$$
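The role of the threshold $\sigma = 1/2$ can be checked numerically. The sketch below assumes the low-momentum form $\Pi(q) = t/(8|q|^{\sigma})$, which is inferred here by matching Eq. (5.75) to Eq. (5.76) and is not quoted from the text; the function names are illustrative only.

```python
import math


def ipr_variance_sum(t, sigma, N, n_max):
    """Evaluate (8/N^2) * sum_{n=1}^{n_max} Pi(q_n)^2 with q_n = pi*n/N.

    Assumption: Pi(q) = t / (8 |q|^sigma), inferred by matching
    Eq. (5.75) to Eq. (5.76); it is not quoted from the text.
    """
    total = 0.0
    for n in range(1, n_max + 1):
        q = math.pi * n / N
        total += (t / (8.0 * q ** sigma)) ** 2
    return 8.0 * total / N ** 2


def ipr_variance_closed_form(t, sigma, N, zeta_terms):
    """Right-hand side of Eq. (5.76), with zeta(2*sigma) summed directly
    over the first zeta_terms terms."""
    zeta = sum(k ** (-2.0 * sigma) for k in range(1, zeta_terms + 1))
    return t ** 2 * zeta / (8.0 * math.pi ** (2 * sigma)) / N ** (2 - 2 * sigma)


# sigma > 1/2: the momentum sum reproduces the closed form term by term.
converged = ipr_variance_sum(0.5, 0.75, 1000, 20000)
closed = ipr_variance_closed_form(0.5, 0.75, 1000, 20000)

# sigma < 1/2: the partial sums keep growing, roughly as n_max**(1 - 2*sigma),
# so an ultraviolet cut-off is needed.
growth = ipr_variance_sum(0.5, 0.25, 1000, 4000) / ipr_variance_sum(0.5, 0.25, 1000, 1000)
```

For $\sigma = 0.25$ quadrupling the upper limit roughly doubles the sum, which illustrates why the $\sigma < 1/2$ case is controlled by the high-momentum cut-off rather than by the system size.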
Thus, in this regime the fluctuations of the IPR are much stronger than for the GOE, where $\delta(P_2) \propto 1/N$. Only for $\sigma < 1/2$ ($\alpha < 3/4$) do the IPR fluctuations acquire the GOE character. Considering higher irreducible moments (cumulants) of the IPR, $\langle\langle P_2^n \rangle\rangle$, one finds that the GOE behavior is restored at $\sigma < \sigma_c^{(n)} \equiv 1/n$. In this sense, the model is analogous to a $d$-dimensional conductor at $d = 2/\sigma$. Therefore, only when $\sigma \to 0$ (correspondingly, $\alpha \to 1/2$ in the original PRBM formulation) do all statistical properties become equivalent to those characteristic of GOE.

5.3.2.3. Critical regime: $\sigma = 1$ ($\alpha = 1$). As we have seen, $\sigma = 1$ separates the regions of localized ($\sigma > 1$) and extended ($\sigma < 1$) states. It is then natural to expect some critical properties showing up just at $\sigma = 1$. Let us again consider the generalized diffusion propagator, Eq. (5.59). At $\sigma = 1$ the one-loop correction yields
$$\frac{1}{(\pi\nu)^2}\,\Pi^{-1}(q) = 4|q|\left[ t^{-1} - \frac{1}{8\pi}\ln(|q|L) \right]. \tag{5.77}$$
As expected at the critical point, the correction to the coupling constant is of logarithmic nature. However, Eq. (5.77) differs essentially from that typical for a 2D disordered conductor:
$$t^{-1} = t_B^{-1} - \ln(L/l), \tag{5.78}$$
where the bare coupling constant $t_B$ corresponds to scale $l$. Comparing the two formulas, we see that in Eq. (5.77) the mean free path $l$ is replaced by the inverse momentum $q^{-1}$. Therefore, the correction to the bare coupling constant is small for low momenta $q \sim 1/L$, and the correlator $\Pi(q)$ is not renormalized. This implies the absence of eigenstate localization, in contrast to the 2D diffusive conductor case, where Eq. (5.78) results in an exponentially large localization length $\xi \propto \exp t_B^{-1}$. On a more formal level, the absence of essential corrections to the low-$q$ behavior of $\Pi(q)$ is due to the fact that the region $k > q$ does not give a logarithmic contribution. This is intimately connected with the absence of $t$ renormalization at $\sigma < 1$.

To study in more detail the structure of critical eigenfunctions, let us consider the set of IPR $P_q$. The perturbative correction, Eq. (5.60), is evaluated at $\sigma = 1$ as
$$\langle P_q \rangle = \frac{(2q-1)!!}{N^{q-1}} \left\{ 1 + q(q-1)\,\frac{t}{8\pi}\,\ln(N/b) \right\}, \tag{5.79}$$
where the microscopic scale $b$, Eq. (5.43), enters as the ultraviolet cut-off for the $\sigma$-model, the role usually played by the mean free path $l$. This formula is valid as long as the correction is small:
$$q \ll \left[ \frac{t}{8\pi}\ln(N/b) \right]^{-1/2}.$$
For larger $q$ the perturbation theory breaks down, and one has to use the renormalization group approach. This requires [28,50] introduction of higher vertices of the type $z_q \int \mathrm{Str}^q(Qk\Lambda)\,dr$ into the action of the non-linear $\sigma$-model and their subsequent renormalization. The resulting RG equations for the charges $z_q$ read, in the one-loop order,
$$\frac{dz_q}{d\ln\mu^{-1}} = q(q-1)\,\frac{t}{8\pi}\,z_q, \tag{5.80}$$
where $\mu^{-1}$ is the renormalization scale. Integrating Eq. (5.80), we find
$$\langle P_q \rangle = \frac{(2q-1)!!}{N^{q-1}} \left( \frac{N}{b} \right)^{q(q-1)t/8\pi}. \tag{5.81}$$
Note that this formula reduces to the perturbative expression, Eq. (5.79), in the regime
$$q \ll \left[ \frac{t}{8\pi}\ln(N/b) \right]^{-1/2}.$$
The behavior described by Eq. (5.81) is characteristic of a multifractal structure of wave functions (see Sections 3.3.2 and 5.2). Comparing (5.81) with (3.60), we find the multifractality dimensions $D_q = 1 - qt/8\pi$. The general formula valid for both orthogonal ($\beta = 1$) and unitary ($\beta = 2$) universality classes is
$$D_q = 1 - q\,\frac{t}{8\pi\beta}. \tag{5.82}$$
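The relation between the perturbative result (5.79), the RG result (5.81) and the fractal dimensions (5.82) can be made concrete with a few lines of code. This is a minimal sketch for the orthogonal class; the function names and the sample parameter values are illustrative assumptions, not taken from the text.

```python
import math


def double_factorial_odd(q):
    """(2q-1)!! for a positive integer q."""
    result = 1
    for k in range(1, 2 * q, 2):
        result *= k
    return result


def ipr_perturbative(q, t, N, b):
    """One-loop result, Eq. (5.79)."""
    goe = double_factorial_odd(q) / N ** (q - 1)
    return goe * (1.0 + q * (q - 1) * t / (8.0 * math.pi) * math.log(N / b))


def ipr_rg(q, t, N, b):
    """RG-improved result, Eq. (5.81)."""
    goe = double_factorial_odd(q) / N ** (q - 1)
    return goe * (N / b) ** (q * (q - 1) * t / (8.0 * math.pi))


def fractal_dimension(q, t, beta=1):
    """Eq. (5.82): D_q = 1 - q t / (8 pi beta), valid for q << 8 pi / t."""
    return 1.0 - q * t / (8.0 * math.pi * beta)


# While the logarithmic correction is small, (5.81) and (5.79) agree
# up to terms of second order in the correction.
ratio = ipr_rg(2, 0.1, 1000, 10) / ipr_perturbative(2, 0.1, 1000, 10)

# The N dependence of (5.81) is a pure power law with exponent -(q-1) D_q.
slope = math.log(ipr_rg(2, 0.1, 2000, 10) / ipr_rg(2, 0.1, 1000, 10)) / math.log(2.0)
```

The power-law slope extracted from (5.81) equals $-(q-1)D_q$ exactly, which is how (5.82) follows from the comparison with (3.60).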
This form of the fractal dimensions is reminiscent of that found in two and $2+\epsilon$ dimensions for the usual diffusive conductor, see Eq. (3.63). The one-loop result (5.82) holds for $q \ll 8\pi/t$.

Now we can understand the reason for the $q$-dependent logarithmic correction to the diffusion propagator $\Pi(q)$, Eq. (5.77). As was mentioned at the end of Section 5.2, the multifractality of eigenfunctions determines the momentum dependence of the diffusion propagator at high $q$ in the critical point [185,186]
$$\Pi^{-1}(q) \propto |q|\,(|q|L)^{-\eta}, \qquad \eta = d - D_2 \tag{5.83}$$
(at finite frequency $\omega$ the system size $L$ is replaced by $L_{\omega} \sim (D/\omega)^{1/2}$). On the other hand, the logarithmic correction in Eq. (5.77) is the first term of the expansion
$$\frac{1}{(\pi\nu)^2}\,\Pi^{-1}(q) = \frac{4}{t}\,|q| \left\{ 1 - \frac{t}{8\pi}\ln(|q|L) + C_2 \left[ \frac{t}{8\pi}\ln(|q|L) \right]^2 + \ldots \right\} = \frac{|q|}{t}\,F\!\left( \frac{t}{8\pi}\ln(|q|L) \right), \tag{5.84}$$
where $F(x)$ is some parameterless function. Since $\eta = 1 - D_2 \approx t/4\pi$, Eq. (5.84) has precisely the form expected from (5.83), assuming that $F(x) \sim e^{-2x}$ at $x \gg 1$.

The set of fractal dimensions $D_q$ (as well as spectral properties at $\sigma = 1$, see Section 5.3.3) is parametrized by the coupling constant $t$. Strictly speaking, the above $\sigma$-model derivation is justified for $t \ll 1$ (i.e. $b \gg 1$). However, the opposite limiting case can also be studied, following Levitov [200,201,324]. It corresponds to a $d$-dimensional Anderson insulator, perturbed by a weak long-range hopping with an amplitude decreasing with distance as $r^{-\sigma}$. The arguments of Levitov imply that the states delocalize at $\sigma \le d$, carrying fractal properties at $\sigma = d$. The PRBM model in the limit $b \ll 1$ is just the 1D version of this problem. This shows that the conclusion about localization (delocalization) of eigenstates for $\sigma > 1$ (resp. $\sigma < 1$), with $\sigma = 1$ being a critical point, holds irrespective of the particular value of the parameter $b$. The $b \ll 1$ limit of the PRBM model was also studied in Ref. [61], where the IPR $P_2$ was calculated, yielding the fractal dimension $D_2 \propto b$. Alternatively, the regime of the Anderson insulator with weak power-law hopping can be described in the framework of the non-linear $\sigma$-model, Eq. (5.54), by considering the limit $t \gg 1$. Formally, the non-linear $\sigma$-model for arbitrary $t$ can be derived from a microscopic tight-binding model by allowing $n \gg 1$ "orbitals" per site [167]. To summarize, the PRBM model (5.39), (5.43) with $0 < b < \infty$ or the $\sigma$-model (5.54) with a coupling constant $0 < t < \infty$ represent at $\sigma = 1$ a continuous family of critical theories parametrized by the value of $b$ (respectively, $t$).

5.3.2.4. Numerical simulations. Numerical simulations of the PRBM model were performed in Ref. [57] for values of $\alpha \in [0,2]$ and $b = 1$. In Fig. 3 we present typical eigenfunctions for four
Fig. 3. Typical eigenfunctions for the matrix size $N = 800$ and four different values of $\alpha$: 0.375, 0.875, 1.250, and 1.625. From [57].
different regions of $\alpha$. In agreement with the theoretical picture presented above, the eigenstates corresponding to $\alpha = 0.375$ and $\alpha = 0.875$ are extended, whereas those corresponding to $\alpha = 1.25$ and $\alpha = 1.625$ are localized. At the same time, one can notice that the states with $\alpha = 0.875$ and $\alpha = 1.25$ exhibit a quite sparse structure, as opposed to the other two cases. This can be explained by the proximity of the former two values of $\alpha$ to the critical value $\alpha = 1.0$, where eigenstates should show the multifractal behavior.

To get quantitative information about the properties of the eigenstates, the mean value of the IPR, $\langle P_2 \rangle$, and its relative variance, $\delta = (\langle P_2^2 \rangle - \langle P_2 \rangle^2)/\langle P_2 \rangle^2$, were calculated in [57]. At any given $\alpha$ the dependences of the quantities $\langle P_2 \rangle$ and $\delta$ on the matrix size $N$ were approximated by the power laws $\langle P_2 \rangle \propto 1/N^{\nu}$, $\delta \propto 1/N^{\mu}$ for $N$ ranging from 100 to 2400. In Figs. 4 and 5 the values of the exponents $\nu$ and $\mu$ obtained in this way are plotted versus the PRBM parameter $\alpha$. The expected theoretical curves are presented as well. We see from Fig. 4 that the data show a crossover from the behavior typical for extended states ($\nu = 1$) to that typical for localized states ($\nu = 0$), centered approximately at the critical point $\alpha = 1$. The deviations from the sharp step-like theoretical curve $\nu(\alpha)$ can presumably be attributed to finite-size effects, which are unusually pronounced in the PRBM model due to the long-range nature of the off-diagonal coupling. The data for the exponent $\mu$ (Fig. 5) also show a reasonable agreement with the expected linear crossover, $\mu = 4(1-\alpha)$ for $3/4 < \alpha < 1$, see Eq. (5.76). While the presented data are in good agreement with the above theoretical picture, a more detailed numerical investigation of the structure of eigenstates and of spectral statistics is certainly desirable. In particular, it would be especially interesting to study the critical manifold, $\alpha = 1$, where the multifractal properties of eigenstates and intermediate level statistics are predicted by the theory.
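The fitting step described above, extracting the exponents $\nu$ and $\mu$ from the $N$ dependence, amounts to a least-squares fit in log-log coordinates. The following is a minimal sketch of that step only. The data are synthetic, generated from an exact power law purely to exercise the fit; they are not the values of Ref. [57], and the function names are hypothetical.

```python
import math


def fit_power_law_exponent(sizes, values):
    """Least-squares slope of log(values) versus log(sizes).

    For values proportional to 1/N**nu the function returns nu
    (note the sign flip relative to the raw slope).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return -sxy / sxx


def predicted_mu(alpha):
    """Theoretical exponent of the relative IPR variance:
    1 for alpha < 3/4, 4(1 - alpha) for 3/4 < alpha < 1, 0 for alpha > 1."""
    if alpha <= 0.75:
        return 1.0
    if alpha < 1.0:
        return 4.0 * (1.0 - alpha)
    return 0.0


# Synthetic input: an exact power law with exponent 0.8 (illustration only).
sizes = [100, 200, 400, 800, 1600, 2400]
values = [3.0 / n ** 0.8 for n in sizes]
nu_fit = fit_power_law_exponent(sizes, values)
```

With real finite-size data the fitted exponent depends on the range of $N$ used, which is the effect visible in the difference between the squares and circles of Fig. 4.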
Fig. 4. Index $\nu$ characterizing the dependence of the inverse participation ratio $\langle P_2 \rangle$ on the matrix size $N$ via $\langle P_2 \rangle \propto 1/N^{\nu}$, as a function of $\alpha$. Points refer to the best-fit values obtained from matrix sizes between $N = 100$ and $N = 1000$ (squares) or $N = 2400$ (circles). The dashed line is the theoretical prediction for the transition from $\nu = 1$, at small $\alpha$, to $\nu = 0$, at large $\alpha$. From [57].

Fig. 5. The same as Fig. 4, but for the index $\mu$, derived from the $N$ dependence of the variance $\delta$ of the inverse participation ratio: $\delta \equiv (\langle P_2^2 \rangle - \langle P_2 \rangle^2)/\langle P_2 \rangle^2 \propto 1/N^{\mu}$. The dashed line corresponds to the predicted linear crossover from $\mu = 1$ at $\alpha < 3/4$ to $\mu = 0$ at $\alpha > 1$. From [57].
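5.3.3. Spectral properties

Let us consider now the spectral statistics of the PRBM model. In the "metallic" regime, the leading correction to the Wigner–Dyson form $R_{\rm WD}(s)$ of the level correlation function is given by the properly modified Eq. (2.30), i.e.
$$R(s) = \left[ 1 + \frac{1}{2\beta}\,C\,\frac{d^2}{ds^2}\,s^2 \right] R_{\rm WD}(s), \tag{5.85}$$
where
$$C = \frac{1}{N^2}\sum_{rr'}\Pi^2(r,r') = \frac{1}{N^2}\sum_{q=\pi n/N;\ n=1,2,\ldots}\Pi^2(q) = \begin{cases} \dfrac{t^2}{64\pi^{2\sigma}}\,N^{2\sigma-2}, & \sigma > 1/2, \\[2mm] \mathrm{const}\,\dfrac{t^2}{64\pi^{2\sigma}b^{1-2\sigma}}\,N^{-1}, & \sigma < 1/2. \end{cases} \tag{5.86}$$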
At $\sigma < 1/2$ the sum, which diverges at high momenta, is cut off at $q \sim \pi/b$; this procedure leaves undetermined a constant of order unity. The correlation function $R(s)$ is close to its RMT value if $\sigma < 1$ (the region of delocalized states), or else if $\sigma > 1$ and the system size $N$ is much less than the localization length $\xi$, Eq. (5.71). Under these conditions, Eq. (5.85) holds as long as the correction term is small compared to the leading one. This requirement produces the following restriction on the frequency $s = \omega/\Delta$:
$$s < s_c \sim \begin{cases} t^{-1}N^{1-\sigma}, & \sigma > 1/2, \\ t^{-1}b^{1/2-\sigma}N^{1/2} \propto (Nb)^{1/2}, & \sigma < 1/2. \end{cases} \tag{5.87}$$
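At larger frequencies ($s > s_c$), the form of the level correlation function changes from the $1/s^2$ behavior typical for RMT to a different one, in full similarity with the Altshuler–Shklovskii regime (2.33) in the case of a conventional $d$-dimensional conductor. Extending Eq. (2.28) to the present case, we find
$$R^{(c)}(s) = \frac{\Delta^2}{\beta\pi^2}\,\mathrm{Re}\sum_{n=0,1,\ldots}\frac{1}{\left[ \dfrac{8}{\pi\nu t}\left( \dfrac{\pi n}{N} \right)^{\sigma} - i\omega \right]^2} \propto \begin{cases} N^{1-1/\sigma}\,t^{1/\sigma}\,s^{1/\sigma-2}, & \sigma > 1/2\ (\sigma \ne 1), \\ t^2N^{-1}b^{2\sigma-1} \propto (Nb)^{-1}, & \sigma < 1/2. \end{cases} \tag{5.88}$$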
At last, let us consider the level statistics at criticality, $\sigma = 1$. In this case the coefficient of proportionality in the asymptotic expression (5.88) vanishes in view of analyticity:
$$R^{(c)}(s) \sim \frac{\Delta t}{16\pi^2\beta}\int_{-\infty}^{\infty}\frac{dx}{(x-i\omega)^2} = 0. \tag{5.89}$$
This is similar to what happens in the case of a 2D diffusive conductor (see Section 5.1). A more accurate consideration requires taking into account the high-momentum cut-off at $q \sim b^{-1}$. In full
analogy with the 2D situation mentioned, we then find a linear term in the level number variance:
$$\langle \delta N^2(E) \rangle \approx \chi\,\langle N(E) \rangle; \qquad \chi = \int R^{(c)}(s)\,ds = \frac{t}{8\pi\beta}. \tag{5.90}$$
The presence of the linear term (5.90) (as well as the multifractality of eigenfunctions, Section 5.3.2) makes the case $\sigma = 1$ similar to the situation on the mobility edge of a disordered conductor in $d > 2$ discussed in Sections 5.1 and 5.2. Let us finally mention that the value of $\chi$, Eq. (5.90), is in agreement with the formula (5.27), where $D_2$ is given by Eq. (5.82) with $q = 2$, and $d = 1$ in the present case. We stress, however, that both Eqs. (5.82) and (5.90) have been derived in the leading order in $t/8\pi \ll 1$. As we discuss below, at $t/8\pi \gtrsim 1$ Eq. (5.27) appears to be violated.

In fact, one can calculate the whole two-level correlation function using the results of Refs. [10,11] presented in Section 2.2. We consider the unitary symmetry ($\beta = 2$) here for simplicity. In this case, the level correlation function is described by a single formula $R(s) = R_{\rm pert}(s) + R_{\rm osc}(s)$ [with $R_{\rm pert}(s)$ and $R_{\rm osc}(s)$ given by Eqs. (2.38) and (2.34), respectively] in the whole range of $s$. The spectral determinant $D(s)$, Eq. (2.37), can be easily calculated [211], yielding
$$D(s) = \frac{1}{s^2}\,\frac{(\pi st/16)^2}{\sinh^2(\pi st/16)} \tag{5.91}$$
in the case of periodic boundary conditions and
$$D(s) = \frac{1}{s^2}\,\frac{\pi st/8}{\sinh(\pi st/8)} \tag{5.92}$$
in the case of hard-wall boundary conditions. The resulting expressions for $R(s)$ are
$$R^{(c)}(s) = -\frac{\sin^2 \pi s}{(\pi s)^2}\,\frac{(\pi st/16)^2}{\sinh^2(\pi st/16)} \tag{5.93}$$
for periodic boundary conditions and
$$R^{(c)}(s) = -\frac{1}{4\pi^2s^2}\left[ 1 + \frac{(\pi st/8)^2}{\sinh^2(\pi st/8)} - 2\,\frac{\pi st/8}{\sinh(\pi st/8)}\cos(2\pi s) \right] \tag{5.94}$$
for the hard-wall boundary conditions. Calculating the spectral compressibility $\chi = \int R^{(c)}(s)\,ds$, we reproduce Eq. (5.90).

There exists a deep connection between the PRBM ensemble and the random matrix ensemble introduced by Moshe et al. in Ref. [212]. This latter ensemble is determined by the probability density
$$P(H) \propto \int dU\,\exp\left\{ -\mathrm{Tr}\,H^2 - h^2N^2\,\mathrm{Tr}\left( [U,H][U,H]^{\dagger} \right) \right\}, \tag{5.95}$$
where $\int dU$ denotes the integration over the group of unitary matrices with the Haar measure. The parameter $h$ allows one to interpolate between the limits of the Wigner–Dyson ($h = 0$) and the Poisson ($h \to \infty$) statistics. The relation between the ensemble (5.95) and the PRBM ensemble is established as follows [211]. For any given unitary matrix $U$ the level statistics in the ensemble determined by the density (5.95) (without integration over $U$) is determined by the eigenvalues $e^{i\theta_k}$ ($k = 1,\ldots,N$) of $U$. For $N \gg 1$ a typical matrix $U$ has an essentially uniform density of eigenvalues, so that we can consider as a typical representative the matrix with equidistant eigenvalues, $\theta_k = 2\pi k/N$. Indeed, weak fluctuations of $\theta_k$ around these values do not essentially affect the behavior of $a(r)$ in Eq. (5.97) below. On the other hand, the matrices with strongly non-uniform density of eigenvalues represent a vanishingly small fraction of the whole group volume and can be neglected. Making a transformation to the eigenbasis of $U$, we can transform $P(H)$ to
$$P(H) \propto \exp\left\{ -\sum_{ij}|H_{ij}|^2\left[ 1 + (2Nh)^2\sin^2\frac{\pi}{N}(i-j) \right] \right\}, \tag{5.96}$$
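so that the matrix $H$ has the form of the PRBM in the critical regime $\alpha = 1$ with
$$a^2(r) = \frac{\beta}{4}\left[ 1 + \frac{1}{b^2}\,\frac{\sin^2(\pi r/N)}{(\pi/N)^2} \right]^{-1} \approx \frac{\beta}{4}\,\frac{1}{1+(r/b)^2} \quad \text{for } r \ll N, \tag{5.97}$$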
and $b = 1/2\pi h$. Calculating (for $h \ll 1$) the coupling constant in the center of the band ($E = 0$) according to Section 5.3.1, we get $t = 4/b = 8\pi h$.

A nice feature of the ensemble (5.95) is that at $\beta = 2$ the level correlation function can be calculated exactly for an arbitrary value of $h$ [212] (we remind the reader that the above $\sigma$-model study of the PRBM ensemble was restricted to the regime $1/2\pi b = h \ll 1$). After the integration over the unitary group using the Itzykson–Zuber formula, the joint probability distribution of the eigenvalues of $H$ is found to be equal to the probability density of coordinates of a system of non-interacting 1D fermions in a harmonic confinement potential at finite temperature. The parameter $h$ of the model (5.95) determines, in this formulation, the ratio of the temperature to the Fermi energy, $h = T/E_F$, or, in other words, the degree of degeneracy of the Fermi gas. Calculating the two-particle correlation function for non-interacting fermions, one finds the sought level correlation function [212]. In the center of the band the result reads
$$\langle \nu(-\omega/2)\,\nu(\omega/2) \rangle - \langle \nu(0) \rangle^2 = -\left[ \frac{Nh}{\pi}\int_{-\infty}^{\infty}dp\,\frac{e^{2iphN\omega}}{1+Ce^{p^2}} \right]^2, \tag{5.98}$$
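where $C = [e^{1/h}-1]^{-1}$ and the mean level spacing $\Delta$ is given by
$$\Delta^{-1} = \langle \nu(0) \rangle = \frac{Nh}{\pi}\int\frac{dp}{1+Ce^{p^2}}. \tag{5.99}$$
The spectral compressibility $\chi$ is found from (5.98) and (5.99) to be
$$\chi = 1 - \frac{\int dp\,(1+Ce^{p^2})^{-2}}{\int dp\,(1+Ce^{p^2})^{-1}}. \tag{5.100}$$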
The formulas (5.98) and (5.99) can be simplified in the limits $h \ll 1$ and $h \gg 1$. For $h \ll 1$, when the gas is strongly degenerate, one gets (as usual, we introduce $s = \omega/\Delta$)
$$R^{(c)}(s) = -\left( \frac{\sin \pi s}{\pi s} \right)^2\left( \frac{\pi^2hs/2}{\sinh(\pi^2hs/2)} \right)^2, \tag{5.101}$$
in precise agreement with the result (5.93) obtained directly for the PRBM ensemble (with the above identification of the parameters of the two models, $t = 8\pi h$). In the opposite limit, $h \gg 1$, the gas is almost classical and the correlations are weak,
$$R^{(c)}(s) = -e^{-2\pi h^2s^2}. \tag{5.102}$$
Note that the corresponding spectral compressibility $\chi \approx 1 - 1/(\sqrt{2}\,h)$ approaches unity in the limit $h \to \infty$, whereas the formula (5.27) would imply $\chi \le 1/2$. Therefore, at least for the PRBM model, Eq. (5.27) is not an exact relation, but rather an approximation valid in the close-to-RMT regime $1/2\pi b = h \ll 1$.
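The crossover of the compressibility between the two limits can be traced numerically from Eq. (5.100). The sketch below is a plain Simpson-rule evaluation under stated assumptions: the integrals are truncated where the integrand has dropped below roughly $e^{-40}$, and the helper names are illustrative. For $h \ll 1$ the result should approach $h/2$, which is Eq. (5.90) with $\beta = 2$ and $t = 8\pi h$; for $h \gg 1$ it should approach $1 - 1/(\sqrt{2}\,h)$.

```python
import math


def fermi_weight(p, log_c):
    """1 / (1 + C exp(p^2)), evaluated through log C to avoid overflow."""
    x = p * p + log_c
    if x > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3.0


def compressibility(h, n=20000):
    """Spectral compressibility chi(h) from Eq. (5.100)."""
    inv_h = 1.0 / h
    # log C with C = 1 / (exp(1/h) - 1); for very small h use the asymptote.
    log_c = -inv_h if inv_h > 700.0 else -math.log(math.expm1(inv_h))
    # Both integrands are even in p, so half-range integrals give the same ratio.
    p_max = math.sqrt(inv_h + 40.0)
    first = simpson(lambda p: fermi_weight(p, log_c), 0.0, p_max, n)
    second = simpson(lambda p: fermi_weight(p, log_c) ** 2, 0.0, p_max, n)
    return 1.0 - second / first


chi_degenerate = compressibility(0.05)  # expected close to h/2 = 0.025
chi_classical = compressibility(20.0)   # expected close to 1 - 1/(sqrt(2) h)
```

The second value lies well above $1/2$, which is the numerical counterpart of the statement that Eq. (5.27) cannot hold outside the regime $h \ll 1$.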
6. Conductance fluctuations in quasi-one-dimensional wires

This section is devoted to a study of the conductance fluctuations of a quasi-one-dimensional disordered system [64,65]. As was already mentioned in Section 3, there exist different microscopic models which can be mapped onto the same 1D supersymmetric $\sigma$-model and thus belong to the same "quasi-one-dimensional universality class". Our treatment below will be based on the Iida–Weidenmüller–Zuk (IWZ) model [63] representing a wire as a sequence of coupled $N \times N$ random matrices, the first and the last of which are coupled to the states propagating in the leads. Using the multi-channel Büttiker–Landauer formula [213–216], the mean conductance and its variance can be expressed in terms of end-to-end correlation functions of the $\sigma$-model. Similarly to what was done in Section 3.2, where the eigenfunction statistics was studied, these correlation functions can be calculated exactly via the transfer-matrix method. This means that calculation of the functional integral can be reduced to solution of a "Schrödinger equation", which can be found in terms of an expansion in corresponding eigenfunctions. The results depend on a single parameter $L/\xi$ (where $L$ is the sample length and $\xi$ the localization length) and do not depend on details of the underlying microscopic model.

On top of the IWZ-model and two other models from the quasi-1D class already discussed in Section 3.2 (the thick wire model of Efetov and Larkin and the random banded matrix model), we mention here a system of weakly coupled 2D layers in strong magnetic field [68,217–224]. In the quantum Hall regime the transport in each layer is due to edge states. Tunneling between the layers leads to the appearance of transport in the transverse direction. The coupled edge states propagating on the surface of the cylinder form a 2D chiral metal, which can be described by a directed network model introduced by Chalker and Dohmen [217]. Mapping of this problem onto the supersymmetric spin chain was done in [218,221]; further mapping onto the supersymmetric $\sigma$-model was presented in [222]. If the number of layers is sufficiently large, the system is of quasi-one-dimensional nature. A very recent numerical study of the directed network model [68] showed perfect agreement with the analytical results presented below. The first experimental realization of the multilayer quantum Hall system has also been reported recently [224].
6.1. Modeling a disordered wire and mapping onto 1D $\sigma$-model

The presentation below is based on Ref. [65]. We consider a quasi one-dimensional disordered wire of length $L$, decomposed into $K$ boxes with linear dimension $l$. With each box $i$ ($i = 1,\ldots,K$) we associate $N$ electronic states $|i\mu\rangle$ ($\mu = 1,\ldots,N$). The boxes 1 and $K$ are coupled to ideal leads $c$ on the left ($c = L$) and right ($c = R$) side of the disordered region. In each of these leads we have a number of modes $|E,a,c\rangle$ ($a = 1,2,\ldots$) with transverse energy $\varepsilon_a$ whose longitudinal momentum $k$ is defined by their total energy $E$ being equal to $\varepsilon_a + \hbar^2k^2/(2m_e^*)$. Here $m_e^*$ is the effective mass of the electrons. We work at zero temperature. The Hamiltonian of the system reads
$$H = \sum_{c,a}\int_{\varepsilon_a}dE\,|E,a,c\rangle\,E\,\langle E,a,c| + \sum_{i,j,\mu,\nu}|i\mu\rangle\,H^{ij}_{\mu\nu}\,\langle j\nu| + \sum_{i=1,K}\sum_{\mu,a,c}\int_{\varepsilon_a}dE\left( |E,a,c\rangle\,W^i_{a\mu}(E,c)\,\langle i\mu| + \mathrm{c.c.} \right). \tag{6.1}$$
The $N \times N$ matrices $(H^{ii}_{\mu\nu})$ are taken to be members of the Gaussian unitary (GUE), orthogonal (GOE) or symplectic (GSE) ensemble. For definiteness, we will consider the orthogonal symmetry case in the present section; the results for all three symmetry classes will be presented in Section 6.2. In the GOE case, the elements $H^{ii}_{\mu\nu}$ are independent real Gaussian random variables with zero mean value and
$$\langle H^{ii}_{\mu\nu}H^{ii}_{\mu'\nu'} \rangle = \frac{\lambda^2}{N}\left( \delta_{\mu\mu'}\delta_{\nu\nu'} + \delta_{\mu\nu'}\delta_{\nu\mu'} \right). \tag{6.2}$$
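As an illustration of the normalization in Eq. (6.2), a single diagonal block $H^{ii}$ can be sampled as follows. This is only a sketch of the variance convention, written with the standard library; the function name and the chosen block size are assumptions made for the example.

```python
import random


def sample_goe_block(n, lam, rng):
    """Draw an n x n real symmetric matrix with the variances of Eq. (6.2).

    Off-diagonal entries have variance lam**2 / n; diagonal entries have
    variance 2 * lam**2 / n, which follows from setting mu = nu in Eq. (6.2).
    """
    block = [[0.0] * n for _ in range(n)]
    off_sigma = lam / n ** 0.5
    diag_sigma = (2.0 ** 0.5) * off_sigma
    for mu in range(n):
        block[mu][mu] = rng.gauss(0.0, diag_sigma)
        for nu in range(mu + 1, n):
            # Real symmetric matrix: H_{mu nu} = H_{nu mu}.
            block[mu][nu] = block[nu][mu] = rng.gauss(0.0, off_sigma)
    return block


rng = random.Random(0)
n = 200
block = sample_goe_block(n, 1.0, rng)
off_diagonal = [block[i][j] for i in range(n) for j in range(i + 1, n)]
# n times the sample variance should be close to lam**2 = 1.
scaled_variance = n * sum(x * x for x in off_diagonal) / len(off_diagonal)
```

With this normalization the spectrum of a block stays of order $\lambda$ as $N$ grows, which is what makes the large-$N$ saddle point used below well defined.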
States in adjacent boxes are coupled by another set of Gaussian random variables with zero mean values and
$$\langle H^{i\,i+1}_{\mu\nu}H^{i+1\,i}_{\nu\mu} \rangle = v^2/N^2. \tag{6.3}$$
All other matrix elements $H^{ij}_{\mu\nu}$ vanish. The coupling between channel and box states is effected by the matrix elements $W^i_{a\mu}(E,c)$. We assume that they do not depend on $E$ and $c$, that they obey the symmetry $W^i_{\mu a} = W^{i*}_{a\mu} = W^i_{a\mu}$ and that they fulfill the orthogonality relation
$$\pi\sum_{\mu}W^i_{a\mu}W^i_{\mu b} = \delta_{ab}\,x \qquad (i = 1, K) \tag{6.4}$$
with $x$ a normalization constant. Eq. (6.4) is convenient but does not result in any loss of generality [8]. We note that $W^i_{a\mu}$ vanishes unless $i = 1, K$.

The conductance of the system is given by the many-channel Landauer–Büttiker formula [213–216],
$$g = \sum_{ab}\left( |S^{RL}_{ab}|^2 + |S^{LR}_{ba}|^2 \right) = 2\sum_{ab}|S^{RL}_{ab}|^2. \tag{6.5}$$
The S-matrix $S^{cc'}_{ab}$ of the IWZ-model reads [225]
$$S^{cc'}_{ab} = \delta_{ab}\delta^{cc'} - 2\pi i\sum_{i,j,\mu,\nu}W^i_{a\mu}(c)\,(D^{-1})^{ij}_{\mu\nu}\,W^j_{\nu b}(c'), \tag{6.6}$$
where
$$D^{ij}_{\mu\nu} = E\,\delta^{ij}\delta_{\mu\nu} - H^{ij}_{\mu\nu} + i\,\Omega^1_{\mu\nu}\delta^{i1}\delta^{j1} + i\,\Omega^K_{\mu\nu}\delta^{iK}\delta^{jK}, \tag{6.7}$$
$$\Omega^i_{\mu\nu} = \pi\sum_{a=1}^{N_{\rm ch}}W^i_{\mu a}W^i_{a\nu} \qquad (i = 1, K). \tag{6.8}$$
The S-matrix in Eq. (6.5) is taken at the Fermi energy $E = E_F$, and $N_{\rm ch}$ denotes the number of open channels at this energy. It follows that in order to obtain $\langle g \rangle$ and $\langle g^2 \rangle$ we have to calculate the ensemble average of a product of two, respectively four, propagators $D^{-1}$. Further manipulations are completely analogous to those outlined in Section 2.1 for the case of the level correlation function. We define a supervector $\psi = (S^1_1, S^2_1, \chi_1, \chi_1^*, S^1_2, S^2_2, \chi_2, \chi_2^*)$ of real commuting ($S$) and complex anticommuting ($\chi$) variables, with the first four components corresponding to the retarded and the last four to the advanced sector. Then
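$$(D^{-1})^{ij}_{\mu\nu} = \int d[\psi]\,(S^1_1)^j_{\nu}\,(S^1_1)^i_{\mu}\,\exp\left( \frac{i}{2}\,\psi^{\dagger}\Lambda^{1/2}\hat{D}\Lambda^{1/2}\psi \right), \tag{6.9}$$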
where $\hat{D} = \mathrm{diag}(D, D, D, D, D^{\dagger}, D^{\dagger}, D^{\dagger}, D^{\dagger})$, $\Lambda = \mathrm{diag}(1, 1, 1, 1, -1, -1, -1, -1)$. Products of two and four propagators can be expressed in similar fashion (see Eq. (6.14) below). After averaging Eq. (6.9) over the Gaussian distribution of random variables entering $\hat{D}$, we perform the Hubbard–Stratonovich transformation introducing $8 \times 8$ supermatrices $R_i$ ($i = 1,\ldots,K$) conjugate to the dyadic product $\psi_i\psi_i^{\dagger}$ and then take the large-$N$ limit. As a result, the integration over $R_i$ is restricted to solutions of the saddle-point equation $R_i = \lambda^2(E - R_i)^{-1}$, which have the form
$$R = \frac{E}{2} - i\sqrt{\lambda^2 - \frac{E^2}{4}}\;T\Lambda T^{-1} \equiv \sigma \cdot I - i\delta \cdot Q, \tag{6.10}$$
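with $Q = T\Lambda T^{-1}$, $Q^2 = 1$. As a result, Eq. (6.9) takes the form
$$\langle (D^{-1})^{ij}_{\mu\nu} \rangle = \int D[Q]\,\exp\left( -\frac{\delta^2v^2}{4\lambda^4}\sum_i\mathrm{Str}\,Q_iQ_{i+1} \right) \int d[\psi]\,(S^1_1)^j_{\nu}\,(S^1_1)^i_{\mu}\,\exp\left( \frac{i}{2}\,\psi^{\dagger}\Lambda^{1/2}(\hat{E} + i\hat{\Omega}\Lambda + i\delta\hat{Q})\Lambda^{1/2}\psi \right) \equiv \langle (S^1_1)^j_{\nu}\,(S^1_1)^i_{\mu} \rangle_Q. \tag{6.11}$$
Here $(\hat{E})^{ij}_{\mu\nu} = E\,\delta_{\mu\nu}\delta^{ij}$, $(\hat{\Omega})^{ij}_{\mu\nu} = (\Omega^1)_{\mu\nu}\delta^{i1}\delta^{j1} + (\Omega^K)_{\mu\nu}\delta^{iK}\delta^{jK}$ and $(\hat{Q})^{ij}_{\mu\nu} = \delta_{\mu\nu}Q_i\delta^{ij}$. The last line of Eq. (6.11) introduces a short-hand notation $\langle \ldots \rangle_Q$.

According to Eqs. (6.5) and (6.6), the first two moments of the conductance are given by
$$\langle g \rangle = 8\sum_{\mu\nu\mu'\nu'}\Omega^1_{\nu\nu'}\,\Omega^K_{\mu'\mu}\,\langle (D^{-1})^{K1}_{\mu\nu}\,(D^{-1\,\dagger})^{1K}_{\nu'\mu'} \rangle, \tag{6.12}$$
$$\langle g^2 \rangle = 64\sum_{\mu\nu\mu'\nu'\rho\tau\rho'\tau'}\Omega^1_{\nu\nu'}\,\Omega^K_{\mu'\mu}\,\Omega^1_{\tau\tau'}\,\Omega^K_{\rho'\rho}\,\langle (D^{-1})^{K1}_{\mu\nu}\,(D^{-1\,\dagger})^{1K}_{\nu'\mu'}\,(D^{-1})^{K1}_{\rho\tau}\,(D^{-1\,\dagger})^{1K}_{\tau'\rho'} \rangle. \tag{6.13}$$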
The averages of products of the Green's functions relevant for $\langle g \rangle$ and $\langle g^2 \rangle$ can be expressed similarly to Eq. (6.11) as follows:
$$\langle (D^{-1})^{K1}_{\mu\nu}\,(D^{-1\,\dagger})^{1K}_{\nu'\mu'} \rangle = \langle (S^1_1)^1_{\nu}\,(S^1_1)^K_{\mu}\,(S^1_2)^K_{\mu'}\,(S^1_2)^1_{\nu'} \rangle_Q,$$
$$\langle (D^{-1})^{K1}_{\mu\nu}\,(D^{-1\,\dagger})^{1K}_{\nu'\mu'}\,(D^{-1})^{K1}_{\rho\tau}\,(D^{-1\,\dagger})^{1K}_{\tau'\rho'} \rangle = \langle (S^1_1)^1_{\nu}\,(S^1_1)^K_{\mu}\,(S^1_2)^K_{\mu'}\,(S^1_2)^1_{\nu'}\,(S^2_1)^1_{\tau}\,(S^2_1)^K_{\rho}\,(S^2_2)^K_{\rho'}\,(S^2_2)^1_{\tau'} \rangle_Q. \tag{6.14}$$
For simplicity, we set $E = 0$ below. Evaluating the contractions in (6.14), performing the convolution with the projectors $\Omega^1$, $\Omega^K$ in (6.12), (6.13) and using $N_{\rm ch} \gg 1$, one comes after some algebraic manipulations to the following $\sigma$-model representation$^{12}$ of $\langle g \rangle$ and $\langle g^2 \rangle$ [64,65]:
$$\langle g \rangle = \frac{\tilde{N}_{\rm ch}^2}{2}\int D[Q]\,Q_1^{51}Q_K^{51}\,\exp\{-S[Q]\}, \tag{6.15}$$
$$\langle g^2 \rangle = \frac{\tilde{N}_{\rm ch}^4}{4}\int D[Q]\,Q_1^{51}Q_1^{62}Q_K^{51}Q_K^{62}\,\exp\{-S[Q]\}, \tag{6.16}$$
$$S[Q] = \frac{\xi}{8l}\sum_i\mathrm{Str}\,Q_iQ_{i+1} + \frac{\tilde{N}_{\rm ch}}{8}\,\mathrm{Str}\,\Lambda(Q_1 + Q_K). \tag{6.17}$$
Here $\tilde{N}_{\rm ch} = N_{\rm ch}T_0$, with $T_0 = 4\delta x/(\delta + x)^2$ being the so-called transmission coefficient, and $\xi = (4v^2/\lambda^2)\,l$. In the weak-disorder limit, $\xi \gg l$, the first term in Eq. (6.17) takes the continuum form
$$-\frac{\xi}{16}\int_0^L dz\,\mathrm{Str}\left( \frac{\partial Q}{\partial z} \right)^2, \tag{6.18}$$
which is precisely the continuous version of the 1D $\sigma$-model (see Sections 2 and 3) with $\xi$ being the localization length equal to $\xi = 2\pi\nu AD$ for a thick wire. The second term in Eq. (6.17) containing $Q_1$ and $Q_K$ describes the coupling of the wire to the leads.

Let us note that the duplication of the number of field components in the supervector $\psi$ (which is forced by the supersymmetric formulation) gave us enough flexibility to write down the expressions Eq. (6.14) for products of two as well as of four propagators. As a result, we were able to express both $\langle g \rangle$ and $\langle g^2 \rangle$ in terms of correlation functions of the usual "minimal" $\sigma$-model. We did not have to introduce $Q$-matrices of larger dimension. This fact no longer holds true when higher moments of the conductance are to be calculated. For that purpose we would have to introduce supervectors of larger size to express higher-order products of propagators in a form analogous to (6.14). The increase in the number of supervector components would force us to deal with $Q$-matrices of larger dimension. This enlargement, while posing no difficulty to perturbative calculations [28], gives rise to substantial complications in the case of the exact treatment.
12 We use here a brief notation for the matrix elements of the Q-matrices. Each of the two upper indices runs from 1 to 8 according to ordering of components of a supervector introduced before Eq. (6.9).
To introduce the transfer operator for the "partition sum" in Eq. (6.16), we define the function
$$W(Q_K, Q_1; L/2\xi) = \int D[Q_2] \ldots D[Q_{K-1}]\,\exp\left( \frac{\xi}{32l}\sum_{i=1}^{K-1}\mathrm{Str}(Q_{i+1} - Q_i)^2 \right), \tag{6.19}$$
which has the property
$$W(Q_{K+1}, Q_1; (L+l)/2\xi) = \int D[Q_K]\,W(Q_{K+1}, Q_K; l/2\xi)\,W(Q_K, Q_1; L/2\xi) = \int D[Q_K]\,\exp\left( \frac{\xi}{32l}\,\mathrm{Str}(Q_{K+1} - Q_K)^2 \right) W(Q_K, Q_1; L/2\xi). \tag{6.20}$$
In the continuum limit $\xi \gg l$ […]
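GOE:
$$\langle g^n \rangle(L) = \frac{\pi}{2}\int_0^{\infty}d\lambda\,\tanh^2(\pi\lambda/2)\,(\lambda^2+1)^{-1}\,p_n(1,\lambda,\lambda)\,\exp\left[ -\frac{L}{2\xi}(1+\lambda^2) \right]$$
$$\qquad + 2^4\sum_{l \in 2\mathbb{N}+1}\int_0^{\infty}d\lambda_1\,d\lambda_2\;l(l^2-1)\,\lambda_1\tanh(\pi\lambda_1/2)\,\lambda_2\tanh(\pi\lambda_2/2)\,p_n(l,\lambda_1,\lambda_2) \prod_{\sigma,\sigma_1,\sigma_2=\pm1}(-1 + \sigma l + i\sigma_1\lambda_1 + i\sigma_2\lambda_2)^{-1}\,\exp\left[ -\frac{L}{4\xi}(l^2 + \lambda_1^2 + \lambda_2^2 + 1) \right],$$
GUE:
$$\langle g^n \rangle(L) = 2^2\sum_{l \in 2\mathbb{N}-1}\int_0^{\infty}d\lambda\;\lambda\tanh(\pi\lambda/2)\,l\,(\lambda^2+l^2)^{-2}\,p_n(l,\lambda)\,\exp\left[ -\frac{L}{4\xi}(l^2+\lambda^2) \right], \tag{6.23}$$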
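GSE:
$$\langle g^n \rangle(L) = 2^5\sum_{\substack{l_1,l_2 \in 2\mathbb{N}-1 \\ l_1+l_2 \in 4\mathbb{N}-2}}\int_0^{\infty}d\lambda\;\lambda(\lambda^2+1)\tanh(\pi\lambda/2)\,l_1l_2\,p_n(\lambda,l_1,l_2) \prod_{\sigma,\sigma_1,\sigma_2=\pm1}(-1 + i\sigma\lambda + \sigma_1l_1 + \sigma_2l_2)^{-1}\,\exp\left[ -\frac{L}{8\xi}(l_1^2 + l_2^2 + \lambda^2 - 1) \right].$$
The polynomials $p_n$ in the above expressions are given by

GOE: $p_1(l,\lambda_1,\lambda_2) = l^2 + \lambda_1^2 + \lambda_2^2 + 1$,
$p_2(l,\lambda_1,\lambda_2) = \frac{1}{2}\left( \lambda_1^4 + \lambda_2^4 + 2l^4 + 3l^2(\lambda_1^2 + \lambda_2^2) + 2l^2 - \lambda_1^2 - \lambda_2^2 - 2 \right)$,

GUE: $p_1(l,\lambda) = l^2 + \lambda^2$,
$p_2(l,\lambda) = \frac{1}{2}(l^2 + \lambda^2)^2$,

GSE: $p_1(\lambda,l_1,l_2) = \lambda^2 + l_1^2 + l_2^2 - 1$,
$p_2(\lambda,l_1,l_2) = \frac{1}{4}\left( l_1^4 + l_2^4 + 2\lambda^4 + 3\lambda^2(l_1^2 + l_2^2) - 2\lambda^2 + l_1^2 + l_2^2 - 2 \right)$. (6.24)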
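The GUE expression is simple enough to evaluate directly. The sketch below sums over odd $l$ and integrates over $\lambda$ with Simpson's rule, using the GUE line of Eq. (6.23) and the GUE polynomials of Eq. (6.24). The truncation of the sum and the integral where the Gaussian factor has dropped below $e^{-50}$, the step count, and the function name are choices made for this example. For a short wire the output can be compared with the perturbative expansion quoted in Eq. (6.28).

```python
import math


def gue_moment(n, length_ratio, steps=4000):
    """<g^n> for n = 1, 2 from the GUE line of Eq. (6.23).

    length_ratio is L / xi. The lambda integral uses Simpson's rule
    (steps must be even); the sum over odd l and the integral are cut
    where exp(-(L/4 xi)(l^2 + lambda^2)) is below exp(-50).
    """
    if n not in (1, 2):
        raise ValueError("only p_1 and p_2 are given in Eq. (6.24)")
    a = length_ratio / 4.0
    cutoff = math.sqrt(50.0 / a)

    def integrand(l, lam):
        r2 = l * l + lam * lam
        p_n = r2 if n == 1 else 0.5 * r2 * r2
        return (lam * math.tanh(math.pi * lam / 2.0) * l / r2 ** 2
                * p_n * math.exp(-a * r2))

    step = cutoff / steps
    total = 0.0
    l = 1
    while l <= cutoff + 2:
        s = integrand(l, 0.0) + integrand(l, cutoff)
        for k in range(1, steps):
            s += (4 if k % 2 else 2) * integrand(l, k * step)
        total += s * step / 3.0
        l += 2  # only odd l contribute
    return 4.0 * total


ratio = 0.5  # L / xi, a short wire
g1 = gue_moment(1, ratio)
g2 = gue_moment(2, ratio)
variance = g2 - g1 ** 2
```

At $L/\xi = 0.5$ the mean should be close to $2\xi/L - (2/45)(L/\xi)$ and the variance close to the universal value $4/15$; the residual differences are of the order of the neglected higher terms in $L/\xi$.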
The results (6.23) are presented in Figs. 6 and 7 for all the three symmetry classes. The above GUE formula is written for the case of broken time-reversal symmetry, but preserved spin-rotation
Fig. 6. Average conductance $\langle g \rangle$ normalized to its Ohm's law value as a function of the sample length (measured in units of $\xi = 2\pi\nu AD$). The full, dashed and dot-dashed lines correspond to the orthogonal, unitary, and symplectic symmetry, respectively.

Fig. 7. The same as Fig. 6, but for the conductance variance var($g$).
Fig. 8. Numerical data [67] for the average conductance in the case of symplectic symmetry. The box size $N$ and the number of channels $N_{\rm ch}$ are equal to 10 (squares), 20 (diamonds), 60 (triangles), and 100 (stars). Each data point corresponds to an average over 100 realizations of disorder. The full line is the theoretical prediction.

Fig. 9. The same as Fig. 8, but for the conductance variance.
invariance. The result for systems with broken time-reversal symmetry and with strong spin interactions (e.g. systems with magnetic impurities, or systems with strong spin-orbit interaction in magnetic field; to be denoted as GUE′ below) is
$$\langle g^n \rangle_{\rm GUE'}(L) = (1/2)^n\,\langle g^n \rangle_{\rm GUE}(L/2). \tag{6.25}$$
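The results (6.23) have been confirmed by numerical simulations of the IWZ-model [67] and of the directed network model [68]. In particular, we present in Figs. 8 and 9 the numerical results of [67] for the symplectic symmetry class.

Let us discuss the behavior of Eqs. (6.23) in the limits of short ($L \ll \xi$) and long ($L \gg \xi$) wire. The condition $L \ll \xi$ corresponds to the metallic (weak localization) region, $\langle g \rangle \gg 1$. We find in this case the following perturbative (in $L/\xi$) expansion for $\langle g \rangle$, $\langle g^2 \rangle$ and var($g$):
$$\langle g \rangle(L) = \frac{2\xi}{L} - \frac{2}{3} + \frac{2}{45}\frac{L}{\xi} + \frac{4}{945}\left( \frac{L}{\xi} \right)^2 + O\!\left( \left( \frac{L}{\xi} \right)^3 \right),$$
$$\langle g^2 \rangle(L) = \left( \frac{2\xi}{L} \right)^2 - \frac{8}{3}\frac{\xi}{L} + \frac{52}{45} - \frac{136}{945}\frac{L}{\xi} + O\!\left( \left( \frac{L}{\xi} \right)^2 \right), \tag{6.26}$$
$$\mathrm{var}(g(L)) = \frac{8}{15} - \frac{32}{315}\frac{L}{\xi} + O\!\left( \left( \frac{L}{\xi} \right)^2 \right).$$
The perturbative results for the symplectic case are related to those for the orthogonal class via the symmetry relations [226–228] $\langle g^n \rangle_{\rm Sp}(L) = (-1/2)^n\,\langle g^n \rangle_{\rm O}(-L/2)$ and have the form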
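$$\langle g \rangle(L) = \frac{2\xi}{L} + \frac{1}{3} + \frac{1}{90}\frac{L}{\xi} - \frac{1}{1890}\left( \frac{L}{\xi} \right)^2 + O\!\left( \left( \frac{L}{\xi} \right)^3 \right),$$
$$\langle g^2 \rangle(L) = \left( \frac{2\xi}{L} \right)^2 + \frac{4}{3}\frac{\xi}{L} + \frac{13}{45} + \frac{17}{945}\frac{L}{\xi} + O\!\left( \left( \frac{L}{\xi} \right)^2 \right), \tag{6.27}$$
$$\mathrm{var}(g(L)) = \frac{2}{15} + \frac{4}{315}\frac{L}{\xi} + O\!\left( \left( \frac{L}{\xi} \right)^2 \right).$$
Finally, for unitary symmetry we have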
$$\langle g \rangle(L) = \frac{2\xi}{L} - \frac{2}{45}\frac{L}{\xi} + O\!\left( \left( \frac{L}{\xi} \right)^3 \right),$$
$$\langle g^2 \rangle(L) = \left( \frac{2\xi}{L} \right)^2 + \frac{4}{45} + O\!\left( \left( \frac{L}{\xi} \right)^2 \right), \tag{6.28}$$
$$\mathrm{var}(g(L)) = \frac{4}{15} + O\!\left( \left( \frac{L}{\xi} \right)^2 \right).$$
The expressions for $\langle g \rangle$ start from the Ohm's law term $2\xi/L = 4\pi\nu AD/L$ (we remind the reader that $g$ is measured in units of $e^2/h = e^2/2\pi\hbar$ and includes a factor 2 due to the spin, while $\nu$ is the density of states per spin projection). The other terms constitute the weak-localization corrections. The leading terms in var($g$) are the well-known values of the universal conductance fluctuations in the case of quasi-1D geometry.

The opposite condition $L \gg \xi$ defines the region of strong localization. In this case, the asymptotic behavior of $\langle g^n \rangle$, $n = 1, 2$, is as follows:

GOE: $\langle g^n \rangle = 2^{-3/2-n}\,\pi^{7/2}\,(\xi/L)^{3/2}\,e^{-L/2\xi}$,
GUE: $\langle g^n \rangle = 2^{3-n}\,\pi^{3/2}\,(\xi/L)^{3/2}\,e^{-L/4\xi}$, (6.29)
GSE: $\langle g^n \rangle = 2^{15/2-n}\,\pi^{3/2}\,(\xi/L)^{3/2}\,e^{-L/8\xi}$.

Let us recall that $\xi$ is defined here as $\xi = 2\pi\nu AD$ independently of the symmetry. The formulas (6.29) therefore demonstrate the well-known dependence of the localization length on the symmetry of the ensemble, $L_{\rm loc} \propto \beta$. Let us stress, however, that in the GUE′ case (broken time-reversal and spin-rotation symmetries), the results for which can be obtained via the relation (6.25), the localization length is the same as in GSE [101,229]. This is because the transition GSE → GUE′ not only changes $\beta$ from 4 to 2, but also breaks the Kramers degeneracy, increasing by a factor of 2 the number of coupled channels. These two effects compensate each other. More generally, the localization length is proportional to $\beta/s$, where $s$ is the degeneracy factor.

6.2.1. DMPK equations

As has been already mentioned, calculation of higher moments $\langle g^n \rangle$ with $n > 2$ and thus of the whole distribution function $P(g)$ has not been achieved in the supersymmetry approach because of technical difficulties (necessity to increase the size of the $Q$-matrix with increasing $n$). The conductance distribution in the localized regime $L \gg \xi$ can be approximately calculated from the Dorokhov–Mello–Pereyra–Kumar (DMPK) approach. Within this approach, pioneered by Dorokhov [230] and developed by Mello et al. [231], one derives a Fokker–Planck (diffusion)
A.D. Mirlin / Physics Reports 326 (2000) 259–382
equation for the distribution of the transmission eigenvalues $T_n$ (the eigenvalues of the transmission matrix product $tt^\dagger$). This equation is conveniently written in terms of the distribution function $P(\lambda_1, \lambda_2, \ldots; L)$, where $\lambda_i = (1 - T_i)/T_i$, and has the form
$$l\,\frac{\partial P}{\partial L} = \frac{2}{\gamma}\sum_{n=1}^{N}\frac{\partial}{\partial\lambda_n}\,\lambda_n(1+\lambda_n)\,J\,\frac{\partial}{\partial\lambda_n}\,J^{-1}P, \qquad J = \prod_{i=1}^{N}\prod_{j=i+1}^{N}|\lambda_i - \lambda_j|^{\beta}, \tag{6.30}$$
where $l$ is the mean free path, $N$ is the number of transverse modes, and $\gamma = \beta N + 2 - \beta$. An overview of the results obtained within the DMPK approach can be found in the review article of Beenakker [88]. It has been shown recently [66] that in the limit $N \gg 1$ the DMPK approach is equivalent to the supersymmetric $\sigma$-model considered above. Let us note that the DMPK approach is restricted to the calculation of transport properties of the quasi-1D conductor (which can be expressed through the transmission eigenvalues $T_i$), while the supersymmetry method allows one to study all sorts of quantities which can be expressed through the Green's functions (e.g. statistics of levels, eigenfunction amplitudes, local density of states, etc.; see other sections of this article). In the localized regime $L$
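$$g = G/(e^2/h) = 2\sum_{n=1}^{N} T_n \tag{6.32}$$
is dominated by $x_1$, i.e. $g \simeq 2/\cosh^2 x_1 \simeq 8e^{-2x_1}$, which implies the Gaussian distribution of $\ln g$,
$$P(\ln g) \simeq \left(\frac{\gamma l}{8\pi L}\right)^{1/2}\exp\left[-\frac{\gamma l}{8L}\left(\ln g + \frac{2L}{\gamma l}\right)^{2}\right], \tag{6.33}$$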
with the average $\langle\ln g\rangle = -2L/\gamma l$ and the variance $\mathrm{var}(\ln g) = 4L/\gamma l = -2\langle\ln g\rangle$. We have already encountered the same log-normal distribution in Section 3.2.4, when we calculated the distribution of the product of the wave function intensities at two points located close to the opposite edges of the sample. The result (6.33) is fully confirmed by numerical simulations, as illustrated in Fig. 10. The log-normal form (6.33) of the conductance distribution holds for $g < 1$ only (since one transmission eigenvalue cannot produce a conductance larger than unity); for $g > 1$ the distribution $P(g)$ decays very fast. The moments $\langle g^n\rangle$ with $n \ge 1$ are thus determined by the probability to have $g \sim 1$, which yields (with exponential accuracy)
$$\langle g^n\rangle \sim \exp(-L/2\gamma l), \tag{6.34}$$
in full agreement with the asymptotics of the first two moments, Eqs. (6.29), found from the supersymmetric approach. [Eqs. (6.29) also contain preexponential factors, for the derivation
Fig. 10. The average and the variance of the conductance logarithm as a function of the sample length (for the symplectic symmetry class). The box size $N$ and the number of channels $N_{\rm ch}$ are equal to 10. Each data point corresponds to an average over 100 realizations of disorder. The dashed line corresponds to the formula (6.33).
of which the approximation (6.31), (6.33) of the solution of the DMPK equations is not sufficient.] Finally, we note that in the strictly 1D case Abrikosov [147] derived the exact result for the conductance distribution function $P(g)$ for an arbitrary value of $L/l$. In the limit $L/l \gg 1$ it approaches the log-normal distribution (also found by Melnikov [145,146]), which can be obtained from Eq. (6.33) by setting $N = 1$ (i.e. $\gamma = 2$). In fact, essentially equivalent results were obtained much earlier in the context of classical wave propagation, see Refs. [237–239].
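As a brief check of the step from Eq. (6.33) to Eq. (6.34), one can evaluate the Gaussian weight of $\ln g$ at the edge of its range of validity, $g \sim 1$ (i.e. $\ln g \approx 0$). The following worked line is only a sketch: it keeps the exponent and ignores all prefactors, in line with the "exponential accuracy" stated in the text.

```latex
\begin{align*}
P(\ln g)\big|_{\ln g = 0}
  &\propto \exp\left[-\frac{\gamma l}{8L}\left(0 + \frac{2L}{\gamma l}\right)^{2}\right]
   = \exp\left[-\frac{\gamma l}{8L}\cdot\frac{4L^{2}}{\gamma^{2} l^{2}}\right]
   = \exp\left(-\frac{L}{2\gamma l}\right).
\end{align*}
```

The result does not depend on $n$, which is why all moments $\langle g^n\rangle$ with $n \ge 1$ share the same exponent in Eq. (6.34).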
7. Statistics of wave intensity in optics

As was already mentioned in the introduction, our statistical considerations are applicable not only to properties of eigenfunctions and energy levels of quantum particles (electrons), but also to those of intensities and eigenfrequencies of classical waves. This is related to the similarity of the stationary Schrödinger equation and the wave equation. Propagation of the classical field $\psi_\omega$ with frequency $\omega$ in an inhomogeneous medium is described by the wave equation
$$[\nabla^2 + k_0^2(1 + \mu(r))]\,\psi_\omega(r) = 0, \tag{7.1}$$
supplemented by appropriate sources. Here $k_0 = \omega/c$, with $c$ the speed of propagation in the average medium, and $\mu(r)$ describes fluctuations of the refraction index. The field $\psi_\omega$ can describe a component of the electromagnetic or acoustic wave. The impurity diagrammatic technique [240–243] and the $\sigma$-model approach [244,245] can be developed in full analogy with the case of the Schrödinger equation in a random potential. Let us consider an open system with a permanently radiating source (for example, a point-like source would correspond to the addition of a term $\propto \delta(r - r_0)$ on the r.h.s. of Eq. (7.1)). The problem of fluctuations of the wave intensity $\psi_\omega^2(r)$ in such a situation has a very long history. Almost
a century ago Rayleigh proposed a distribution which bears his name:
$$P_0(\tilde I) = \exp(-\tilde I), \tag{7.2}$$
where $\tilde I$ is the intensity normalized to its average value, $\tilde I = I/\langle I\rangle$. A simple statistical argument leading to Eq. (7.2) is based on representing $\psi_\omega(r)$ as a sum of many random contributions (plane waves with random amplitudes and phases). This is essentially the same argument that was used by Berry to describe fluctuations of eigenfunctions $\psi_i$ in chaotic billiards and which leads to the RMT statistics of $|\psi_i^2(r)|$ (see Section 3). Let us mention, however, a difference between the two cases (emphasized by Pnini and Shapiro [246]). In the case of an open system, $\psi_\omega$ is a sum of traveling waves, while for the closed system $\psi_i$ is represented as a sum of standing waves. As a result, the Rayleigh statistics has the same form as the statistics of eigenfunction amplitudes in a closed system with broken time-reversal invariance (unitary class), where the eigenfunctions are complex. The diagrammatic derivation of Eq. (7.2) is very simple [247] (see below); for the case of smooth randomness an essentially equivalent derivation [248] using path-integral arguments was given. However, similarly to the distributions of other quantities studied above (eigenfunction amplitude, local DOS, relaxation time, etc.), the distribution of optical intensities shows deviations from the Rayleigh law, which will be studied below. More specifically, we will consider fluctuations of the intensity $I(r, r_0)$ at a point $r$ induced by a point-like source at $r_0$, with both points $r$ and $r_0$ located in the bulk of the sample. We will assume a quasi-1D geometry of the sample, with length $L$ much larger than the transverse dimension $W$ (see Fig. 11). Let us note that there has recently been considerable activity in studying the statistics of the transmission coefficients $T_{ab}$ of a disordered waveguide [72–74,249,250]. In this formulation of the problem, a source and a detector of the radiation are located outside the sample.
The source produces a plane wave injected in an incoming channel $a$, and the intensity in an outgoing channel $b$ is measured. The transmission coefficients are related to the transmission matrix $t$ (already mentioned in Section 6) as $T_{ab} = |t_{ab}|^2$. One can also define the transmittance $T_a = \sum_b T_{ab}$ summed over the outgoing channels. Finally, the total transmittance $T = \sum_{a,b} T_{ab}$ is the optical analogue of the conductance $g$ (fluctuations of which were studied in Section 6). Combining the diagrammatic approach with the results on the distribution of transmission eigenvalues in the metallic regime (following from the DMPK equations), Nieuwenhuizen and van Rossum [72] calculated the distribution functions $P(T_a)$ and $P(T_{ab})$ for $g \gg 1$. The results have the following form:
$$P(s_a) = \int_{-i\infty}^{i\infty}\frac{dx}{2\pi i}\,\exp[x s_a - \Phi_{\rm con}(x)]; \tag{7.3}$$
Fig. 11. Geometry of the problem. Points $r_0 = (x_0, y_0, z_0)$ and $r = (x, y, z)$ are the positions of the source and of the observation point (detector), respectively. From [71].
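$$P(s_{ab}) = \int_0^{\infty}\frac{dv}{v}\int_{-i\infty}^{i\infty}\frac{dx}{2\pi i}\,\exp\left[-\frac{s_{ab}}{v} + xv - \Phi_{\rm con}(x)\right]; \tag{7.4}$$
$$\Phi_{\rm con}(x) = g\,\ln^2\left(\sqrt{1 + x/g} + \sqrt{x/g}\right), \tag{7.5}$$
where $s_a = T_a/\langle T_a\rangle$ and $s_{ab} = T_{ab}/\langle T_{ab}\rangle$. For large $g \gg 1$ (i.e. in the "metallic" regime), the distribution $P(s_a)$, Eq. (7.3), has a Gaussian form,
$$P(s_a) \simeq \sqrt{\frac{3g}{4\pi}}\,\exp\left[-\frac{3g}{4}(s_a - 1)^2\right], \tag{7.6}$$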
with log-normal tails at small $s_a \ll 1$ and an exponential asymptotics at large $s_a \gg 1$ of the form $P(s_a) \sim \exp(-g s_a)$. The distribution of the channel-to-channel transmission coefficients $s_{ab}$ is a close relative of the distribution of the point-to-point transmitted intensity $I(r, r_0)$, which was discussed above and will be studied in detail below. For not too large $s_{ab}$ it has the Rayleigh form, $P(s_{ab}) \sim e^{-s_{ab}}$, with the leading perturbative correction given by
$$P(s_{ab}) \simeq e^{-s_{ab}}\left[1 + \frac{1}{3g}\left(s_{ab}^2 - 4s_{ab} + 2\right)\right], \qquad s_{ab} \ll \sqrt{g} \tag{7.7}$$
(this form of the correction to the Rayleigh law for the intensity distribution was found in a number of papers, see Refs. [248,249,251,252]). In the intermediate region $\sqrt{g} < s_{ab} < g$, Eq. (7.4) yields
$$P(s_{ab}) \simeq \exp\left[-s_{ab}\left(1 - \frac{1}{3g}s_{ab} + \ldots\right)\right] \tag{7.8}$$
(a correction of this type was also obtained earlier, Refs. [252,253]). For large $s_{ab} > g$ the distribution acquires a stretched-exponential form
$$P(s_{ab}) \sim \exp(-2\sqrt{g s_{ab}}). \tag{7.9}$$
We will return to Eq. (7.9) below, when comparing the results for the distribution of $I(r_0, r)$ with $P(s_{ab})$. In the localized regime both distributions $P(s_a)$ and $P(s_{ab})$ are determined by the single (largest) transmission eigenvalue and acquire the same log-normal form as the conductance distribution, Eq. (6.33). We note also that van Langen et al. [74] were able to calculate $P(s_a)$ and $P(s_{ab})$ in the case of unitary symmetry, $\beta = 2$, in the whole range of the parameter $L/\xi$ (from weak to strong localization). The distributions $P(s_a)$ and $P(s_{ab})$ were studied experimentally by Garcia and Genack [254,255] and by Stoychev and Genack [256]; their findings are in good agreement with the theoretical results.

Now we return to the statistics of the intensity $I(r, r_0)$. The field at the point $r$ is given by the (retarded) Green's function $G_R(r, r_0)$, and the radiation intensity is $I(r, r_0) = |G_R(r, r_0)|^2$. The average intensity $\langle I(r, r_0)\rangle$ is given diagrammatically by a diffuson $T(r_1, r_2)$ attached to two external vertices. The vertices are short-range objects and can be approximated by a $\delta$-function times $(l/4\pi)$, so that $\langle I(r, r_0)\rangle = (l/4\pi)^2\,T(r, r_0)$. For the quasi-one-dimensional geometry, the expression for the diffuson reads
$$T(r, r_0) = \left(\frac{4\pi}{l}\right)^{2}\frac{3}{4\pi}\,\frac{z_<(L - z_>)}{AlL}, \tag{7.10}$$
where $l$ is the elastic mean free path, $A$ is the cross-section of the tube, the $z$-axis is directed along the sample, $z_< = \min(z, z_0)$ and $z_> = \max(z, z_0)$. We assume that $|z - z_0| \gg W$. The intensity distribution $P(I)$ is obtained, in the diagrammatic approach, by calculating the moments $\langle I^n\rangle$ of the intensity. In the leading approximation [247], one should draw $n$ retarded and $n$ advanced Green's functions and insert ladders between pairs $\{G_R, G_A\}$ in all possible ways. This leads to $\langle I^n\rangle = n!\langle I\rangle^n$ and, thus, to Eq. (7.2). Corrections to the Rayleigh result come from diagrams with intersecting ladders, which describe interaction between diffusons. The leading correction is due to pairwise interactions. The diagram in Fig. 12 represents a pair of "colliding" diffusons. The algebraic expression for this diagram is
$$C(r, r_0) = 2\left(\frac{l}{4\pi}\right)^{4}\int\left(\prod_{n=1}^{4} d^3r_n\right) T(r, r_1)\,T(r, r_2)\,T(r_3, r_0)\,T(r_4, r_0)
\times\left\{\frac{l^5}{48\pi k_0^2}\int d^3\rho\,\left[(\nabla_1 + \nabla_2)\cdot(\nabla_3 + \nabla_4) + 2(\nabla_1\cdot\nabla_2) + 2(\nabla_3\cdot\nabla_4)\right]\prod_{n=1}^{4}\delta(\rho - r_n)\right\}, \tag{7.11}$$
where $k_0$ is the wave number and $\nabla_n$ acts on $r_n$. The factor $(l/4\pi)^4$ comes from the 4 external vertices of the diagram, the $T$'s represent the two incoming and two outgoing diffusons, and the expression in the curly brackets corresponds to the internal (interaction) vertex ("Hikami box") [257,258]. Finally, the factor 2 accounts for the two possibilities of inserting a pair of ladders between the outgoing Green's functions. Integrating by parts and employing the quasi-one-dimensional
Fig. 12. Diagram for a pair of interacting diffusons. The external vertices contribute the factor $(l/4\pi)^4$. The shaded region denotes the internal interaction vertex, see Eq. (7.11). From [71].
geometry of the problem, we obtain (for $z_0 < z$):
$$C(z, z_0) \simeq 2\langle I(z, z_0)\rangle^2\left[1 + \frac{4}{3\gamma}\right], \tag{7.12}$$
where
$$\langle I(z, z_0)\rangle = \frac{3}{4\pi}\,\frac{z_0(L - z)}{AlL}$$
is the average intensity,
$$\gamma = 2g\,\frac{L^3}{L^2(3z + z_0) - 2Lz(z + z_0) + 2z_0^2(z - z_0)} \gg 1, \tag{7.13}$$
and $g = k_0^2 lA/3\pi L \gg 1$ is the dimensionless conductance of the tube. For simplicity, we will assume that the source and the detector are located relatively close to each other, so that $|z - z_0| \ll L$, in which case Eq. (7.13) reduces to $\gamma = gL^2/2z(L - z)$. (All the results are found to be qualitatively the same in the generic situation $z_0 \sim z - z_0 \sim L - z \sim L$.) In order to calculate $\langle I^n\rangle$ one has to compute a combinatorial factor $N_i$ which counts the number of diagrams with $i$ pairs of interacting diffusons. This number is $N_i = (n!)^2/[2^{2i}\,i!\,(n - 2i)!] \simeq (n!/i!)(n/2)^{2i}$. For not too large $n \ll \gamma^{1/2}$ it is sufficient to keep the $i = 1$ contribution, yielding the leading perturbative correction [71],
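$$\langle\tilde I^n\rangle = n!\left[1 + \frac{2}{3\gamma}\,n(n - 1)\right], \tag{7.14}$$
or in terms of the distribution function,
$$P(\tilde I) = e^{-\tilde I}\left[1 + \frac{2}{3\gamma}\left(\tilde I^2 - 4\tilde I + 2\right)\right], \qquad \tilde I \ll \gamma^{1/2}. \tag{7.15}$$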
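The equivalence of Eqs. (7.14) and (7.15) can be checked by integrating the corrected distribution term by term. The sketch below is illustrative only: the function names are hypothetical, and $\gamma$ is restricted to an integer purely so that exact rational arithmetic can be used. It relies on the elementary integral $\int_0^\infty x^k e^{-x}\,dx = k!$.

```python
from fractions import Fraction
from math import factorial


def corrected_moment(n, gamma):
    """Exact n-th moment of exp(-x) * [1 + (2/(3*gamma)) * (x**2 - 4*x + 2)].

    Each power x**k integrated against exp(-x) on [0, inf) gives k!.
    gamma is assumed to be a positive integer (illustration only).
    """
    rayleigh = factorial(n)
    correction = factorial(n + 2) - 4 * factorial(n + 1) + 2 * factorial(n)
    return rayleigh + Fraction(2, 3 * gamma) * correction


def moment_from_eq_7_14(n, gamma):
    """Right-hand side of Eq. (7.14): n! * [1 + (2/(3*gamma)) * n * (n - 1)]."""
    return factorial(n) * (1 + Fraction(2, 3 * gamma) * n * (n - 1))


if __name__ == "__main__":
    gamma = 50  # arbitrary illustrative value
    for n in range(6):
        print(n, corrected_moment(n, gamma), moment_from_eq_7_14(n, gamma))
```

The case $n = 0$ confirms that the correction term does not spoil the normalization, and $n = 1$ confirms that the mean of $\tilde I$ stays equal to unity.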
For larger $n$ (or, equivalently, $\tilde I$) we have to sum up the series over $i$, yielding
$$\frac{\langle I^n\rangle}{\langle I\rangle^n} = n!\sum_{i=0}^{[n/2]}\frac{1}{i!}\left(\frac{2n^2}{3\gamma}\right)^{i} \simeq n!\,\exp(2n^2/3\gamma). \tag{7.16}$$
Although $i$ cannot exceed $n/2$, the sum in Eq. (7.16) can be extended to $\infty$ if the value of $n$ is restricted by the condition $n \ll \gamma$. Eq. (7.16) represents the leading exponential correction to the Rayleigh distribution. Let us now discuss the effect of higher-order "interactions" of diffusons. Diagrams with 3 intersecting diffusons will contribute a correction of order $n^3/\gamma^2$ in the exponent of Eq. (7.16), which is small compared to the leading correction in the whole region $n \ll \gamma$, but becomes larger than unity for $n \gtrsim \gamma^{2/3}$. Likewise, diagrams with 4 intersecting diffusons produce an $n^4/\gamma^3$ correction, etc. Restoring the distribution $P(I)$, we find [71]
$$P(\tilde I) \simeq \exp\left\{-\tilde I + \frac{2}{3\gamma}\tilde I^2 + O\!\left(\frac{\tilde I^3}{\gamma^2}\right) + \ldots\right\}. \tag{7.17}$$
Eqs. (7.14), (7.15) and (7.17) are analogous to those found in the case of the transmission coefficient statistics, (7.7), (7.8), with the only difference that the parameter $g$ is now replaced by $\gamma/2$ (if we
consider the limit $z_0 \to 0$, $z \to L$, then $\gamma \to 2g$, so that both results are consistent). As has already been mentioned, deviations of this form from the Rayleigh distribution of intensities were found earlier by various authors [248,249,251–253]; the value of the parameter governing the strength of the deviations (here $1/\gamma$) depends, however, on the geometry of the problem. It should be realized that Eq. (7.17) is applicable only for $\tilde I \ll \gamma \sim g$ and, thus, does not determine the far asymptotics of $P(I)$. The latter is inaccessible to the perturbative diagram technique and is handled below by the supersymmetry method. For technical simplicity, we will now assume that the time-reversal symmetry is broken by some magnetooptical effects (unitary ensemble). The moments of the intensity at the point $r$ due to the source at $r_0$ are given by
$$\langle I^n\rangle = \left(-\frac{k_0^2}{16\pi^2}\right)^{n}\int [DQ]\,\left(Q_{12,bb}(z)\right)^n\left(Q_{21,bb}(z_0)\right)^n e^{-S[Q]}, \tag{7.18}$$
where $S[Q]$ is the zero-frequency $\sigma$-model action,
$$S[Q] = -\frac{\pi\nu D}{4}\int d^3r\,\mathrm{Str}(\nabla Q)^2, \tag{7.19}$$
which reduces in the quasi-1D geometry to
$$S[Q] = -(gL/8)\int dz\,\mathrm{Str}(dQ/dz)^2.$$
Assuming again that the two points $r$ and $r_0$ are sufficiently close to each other, $|z - z_0| \ll L$, and taking into account the slow variation of the $Q$-field along the sample, we can replace the product $Q_{12,bb}(z)\,Q_{21,bb}(z_0)$ by $Q_{12,bb}(z)\,Q_{21,bb}(z)$. We then get the following result for the distribution of the dimensionless intensity $y = (16\pi^2/k_0^2)\,I$:
$$P(y) = \int dQ\,\delta(y + Q_{12,bb}Q_{21,bb})\,Y(Q), \tag{7.20}$$
where $Y(Q)$ is a function of a single supermatrix $Q$ defined by Eq. (3.8). Using the fact that the function $Y(Q)$ depends only on the eigenvalues $1 \le \lambda_1 < \infty$, $-1 \le \lambda_2 \le 1$, we find
$$P(y) = \left(\frac{d}{dy} + y\frac{d^2}{dy^2}\right)\int d\lambda_1\,d\lambda_2\,\frac{\lambda_1 + \lambda_2}{\lambda_1 - \lambda_2}\,Y(\lambda_1, \lambda_2)\,\delta(y + 1 - \lambda_1^2). \tag{7.21}$$
The function $Y(\lambda_1, \lambda_2)$ can be evaluated at $g \gg 1$ via the saddle-point method (see [34] and Section 4.3) with the result
$$Y(\lambda_1, \lambda_2) \simeq \exp\left\{-\frac{\gamma}{2}\left[\theta_1^2 + \theta_2^2\right]\right\}, \tag{7.22}$$
where $\lambda_1 \equiv \cosh\theta_1$, $\lambda_2 \equiv \cos\theta_2$ ($0 \le \theta_1 < \infty$, $0 \le \theta_2 \le \pi$). In fact, the dependence of $Y$ on $\theta_2$ is not important, within the exponential accuracy, because it simply gives a prefactor after the integration in Eq. (7.21). Therefore, the distribution function $P(y)$ is given by
$$P(y) \sim Y(\lambda_1 = \sqrt{1 + y},\ \lambda_2 = 1) \sim \exp(-\gamma\theta_1^2/2), \tag{7.23}$$
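where $\theta_1 = \ln(\sqrt{1 + y} + \sqrt{y})$. Finally, after normalizing $y$ to its average value $\langle y\rangle = 2/\gamma$, we obtain [71]:
$$P(\tilde I) \simeq \exp\left\{-\frac{\gamma}{2}\left[\ln^2\left(\sqrt{1 + 2\tilde I/\gamma} + \sqrt{2\tilde I/\gamma}\right)\right]\right\}. \tag{7.24}$$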
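The two limits of Eq. (7.24) can be explored numerically. The sketch below compares the exponent of (7.24) with the quadratic truncation of (7.17) at small $\tilde I/\gamma$, and with a large-intensity form that is not quoted from the text but obtained here by dropping the 1 under the square root, so that $\theta_1 \approx \tfrac12\ln(8\tilde I/\gamma)$. Function names and parameter values are illustrative assumptions.

```python
import math


def log_p_full(intensity, gamma):
    """Exponent of Eq. (7.24): -(gamma/2) * ln^2(sqrt(1 + y) + sqrt(y)), y = 2*I/gamma."""
    y = 2.0 * intensity / gamma
    theta = math.log(math.sqrt(1.0 + y) + math.sqrt(y))
    return -0.5 * gamma * theta * theta


def log_p_perturbative(intensity, gamma):
    """Exponent of Eq. (7.17), truncated after the quadratic term."""
    return -intensity + 2.0 * intensity ** 2 / (3.0 * gamma)


def log_p_large(intensity, gamma):
    """Large-intensity limit derived here (assumption): -(gamma/8) * ln^2(8*I/gamma)."""
    return -(gamma / 8.0) * math.log(8.0 * intensity / gamma) ** 2


if __name__ == "__main__":
    gamma = 100.0  # arbitrary illustrative value
    for intensity in (1.0, 5.0, 50.0, 1e4, 1e6):
        print(intensity,
              log_p_full(intensity, gamma),
              log_p_perturbative(intensity, gamma),
              log_p_large(intensity, gamma))
```

For $\tilde I \ll \gamma$ the first two columns agree closely, while for $\tilde I \gg \gamma$ the full expression approaches the log-normal form in the last column.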
For $\tilde I \ll \gamma$, Eq. (7.24) reproduces the perturbative expansion (7.17), while for $\tilde I$
(7.25)
The log-normal "tail" (7.25) should be contrasted with the stretched-exponential asymptotic behavior of the distribution of transmission coefficients, Eq. (7.9). As was found in [71], these two results match each other in the following way. When the points $z$ and $z_0$ approach the sample edges, $z_0 = L - z \ll L$, an intermediate regime of stretched-exponential behavior emerges:
$$-\tilde I + \frac{1}{3g}\tilde I^2 + \ldots, \qquad \tilde I \ll g,$$
G
A B A B
¸ 2 (7.26) , z 0 z 2 II ¸ 2 g¸ ln2 16 0 , II
!2JgII ,
CA B D
g;II ;g
$$G^R(r_0, r) = \sum_i \psi_i^*(r_0)\,\psi_i(r)\,(k_0^2 - E_i + i\gamma_i)^{-1}.$$
Since the level widths $\gamma_i$ are typically of the order of the Thouless energy $E_c \sim D/L^2$, there are typically $\sim g$ levels contributing appreciably to the sum. In view of the random phases of the wave functions, this leads to a Gaussian distribution of $G^R(r_0, r)$ with zero mean, and thus to the Rayleigh distribution of $I(r_0, r) = |G^R(r_0, r)|^2$, with the moments $\langle\tilde I^n\rangle = n!$. The stretched-exponential behavior results from such disorder realizations where one of the states $\psi_i$ has large amplitudes at both points $r_0$ and $r$. Considering both $\psi_i(r_0)$ and $\psi_i(r)$ as independent random variables with Gaussian distribution, and taking into account that only one (out of $g$) term contributes in this case to the sum for $G^R$, we find $\langle\tilde I^n\rangle \sim n!\,n!/g^n$, corresponding to the above stretched-exponential form of $P(\tilde I)$. Finally, the log-normal asymptotic behavior corresponds to those disorder realizations where $G^R$ is dominated by an anomalously localized state, which has an atypically small width $\gamma_i$
(the same mechanism determines the log-normal asymptotics of the distribution of local density of states, see Section 4.3).
8. Statistics of energy levels and eigenfunctions in a ballistic system with surface scattering

In the preceding part of this article we considered statistical properties of spectra of disordered diffusive systems. Using the supersymmetric $\sigma$-model approach, we were able to demonstrate the relevance of the random matrix theory (RMT) and to calculate deviations from its predictions both for the level and eigenfunction statistics. Generalization of these results to the case of a chaotic ballistic system (i.e. a quantum billiard) has become a topic of great research interest. For ballistic disordered systems the $\sigma$-model has been proposed [75], with the Liouville operator replacing the diffusion operator in the action. It has also been conjectured that the same $\sigma$-model in the limit of vanishing disorder describes statistical properties of spectra of an individual classically chaotic system. This conjecture was further developed in [76,77,259], where the $\sigma$-model was obtained by means of energy averaging, and the Liouville operator was replaced by its regularization, the Perron–Frobenius operator. However, straightforward application of the results of Refs. [10,11,25,27] to the case of an individual chaotic system is complicated by the fact that the eigenvalues of the Perron–Frobenius operator are unknown, while its eigenfunctions are extremely singular. For this reason the $\sigma$-model approach has so far failed to provide explicit results for any particular ballistic system.

In this section, we consider a ballistic system with surface disorder leading to diffusive scattering of a particle in each collision with the boundary. This models the behavior of a quantum particle in a box with a rough boundary which is irregular on the scale of the wavelength. Since the particle loses memory of its direction of motion after a single collision, this model describes the limit of an "extremely chaotic" ballistic system, with the typical relaxation time being of the order of the flight time.
(This should be contrasted with the case of a relatively slight distortion of an integrable billiard [196,260,261].) One might naively think that all results for such a model could be obtained by setting $l \approx L$ in a system with bulk disorder. In fact, the level statistics in a system with bulk disorder and an arbitrary relation between the mean free path $l$ and the system size $L$ was studied in [168,169,262]. However, the results presented below show that systems with bulk and surface disorder are not equivalent. To simplify the calculations, we will assume a circular geometry of the billiard. A similar problem was studied numerically in Refs. [263,264] for a square geometry. We consider only the case of unitary symmetry (broken time-reversal invariance); generalization to the orthogonal case is straightforward. We follow Ref. [78] in the presentation below. The level statistics for the same problem was independently studied in Ref. [79]. Very recently, the same approach was used [265] to calculate the persistent current in a ring with diffusive scattering. Our starting point is the $\sigma$-model for ballistic disordered systems [31,75]. The effective action for this model has the form
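$$F[g(r, n)] = \frac{\pi\nu}{4}\int dr\,\mathrm{Str}\left[i\omega\Lambda\langle g(r)\rangle - \frac{1}{2\tau(r)}\langle g(r)\rangle^2 - 2v_F\langle\Lambda U^{-1}n\nabla U\rangle\right]. \tag{8.1}$$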
Here an $8\times 8$ supermatrix field $g$ is defined on the energy shell of the phase space, i.e. $g$ depends on the coordinate $r$ and the direction of the momentum $n$. The momentum dependence of the field $g$ distinguishes the ballistic $\sigma$-model from the diffusive case, where the supermatrix field $Q$ depends on $r$ only. The angular brackets denote averaging over $n$: $\langle O(n)\rangle = \int dn\,O(n)$ with the normalization $\int dn = 1$. As in the case of the diffusive $\sigma$-model, the matrix $g$ is constrained by the condition $g^2(r, n) = 1$ and can be represented as $g = U\Lambda U^{-1}$, with $\Lambda = \mathrm{diag}(1, 1, 1, 1, -1, -1, -1, -1)$. Since we are interested in the clean limit with no disorder in the bulk, the second term in the action (8.1), containing the elastic mean free time $\tau$, is zero everywhere except at the boundary, where it modifies the boundary condition (see below). As was explained in Section 2, the statistical properties of energy levels are governed by the structure of the action in the vicinity of the homogeneous configuration of the $g$-field, $g(r, n) = \Lambda$. Writing $U = 1 - W/2 + \ldots$, we find the action in the leading order in $W$,
$$F_0[W] = -\frac{\pi\nu}{4}\int dr\,dn\,\mathrm{Str}\left[W_{21}(\hat K - i\omega)W_{12}\right], \tag{8.2}$$
where the indices 1, 2 refer to the "advanced–retarded" decomposition of $W$, and $\hat K$ is the Liouville operator, $\hat K \equiv v_F n\nabla$. This "linearized" action has the same form as that of a diffusive system, with the diffusion operator being replaced by the Liouville operator. This enables us to use the results derived for the diffusive case by substituting the eigenvalues and eigenfunctions of the operator $\hat K$ for those of the diffusion operator. The operator $\hat K$ should be supplemented by a boundary condition, which depends on the form of the surface roughness. As a model approximation we consider purely diffuse scattering [266,267], for which the distribution function $u(r, n)$ of the outgoing particles is constant and is fixed by flux conservation (see footnote 13):
$$u(r, n) = \pi\int_{(Nn')>0}(Nn')\,u(r, n')\,dn', \qquad (Nn) < 0. \tag{8.3}$$
Here the point $r$ lies at the surface, and $N$ is an outward normal to the surface. This boundary condition should be satisfied by the eigenfunctions of $\hat K$. The eigenvalues $\lambda$ of the operator $\hat K$ corresponding to angular momentum $l$ obey the equation
$$\tilde J_l(\xi) \equiv -1 + \frac{1}{2}\int_0^{\pi} d\theta\,\sin\theta\,\exp[2il\theta + 2\xi\sin\theta] = 0, \tag{8.4}$$
where $\xi \equiv R\lambda/v_F$, and $R$ is the radius of the circle. For each value of $l = 0, \pm1, \pm2, \ldots$ Eq. (8.4) has a set of solutions $\xi_{lk}$ with $\xi_{lk} = \xi_{-l,k} = \xi^*_{l,-k}$, which can be labeled with $k = 0, \pm1, \pm2, \ldots$ (even $l$) or $k = \pm1/2, \pm3/2, \ldots$ (odd $l$). For $l = k = 0$ we have $\xi_{00} = 0$, corresponding to the zero mode
13 The exact form of the boundary condition depends on the underlying microscopic model. In particular, the diffuse scattering can be modelled by surrounding the cavity by a disordered layer with a bulk mean free path $l$ and a thickness $d$
Fig. 13. First $11\times 11$ ($0 \le k, l < 11$) eigenvalues of the Liouville operator $\hat K$ in units of $v_F/R$, as given by Eq. (8.4). From Ref. [78].
$u(r, n) = \mathrm{const}$. All other eigenvalues have a positive real part, $\mathrm{Re}\,\xi_{lk} > 0$, and govern the relaxation of the corresponding classical system to the homogeneous distribution in the phase space. The asymptotic form of the solutions of Eq. (8.4) for large $|k|$ and/or $|l|$ can be obtained by using the saddle-point method,
$$\xi_{kl} \approx \begin{cases} 0.66\,l + 0.14\ln l + 0.55\pi i k, & 0 \le k \ll l,\\ (\ln k)/4 + \pi i(k + 1/8), & 0 \le l \ll k. \end{cases} \tag{8.5}$$
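To get a feeling for Eq. (8.4), the function $\tilde J_l(\xi) = -1 + \tfrac12\int_0^\pi d\theta\,\sin\theta\,\exp[2il\theta + 2\xi\sin\theta]$ can be evaluated by simple quadrature. The sketch below is an illustration under stated assumptions, not the method of Ref. [78]: it uses composite Simpson integration, the function names are hypothetical, and only a real root (the $k = 0$ branch for an even $l$) is located, by bisection.

```python
import cmath
import math


def j_tilde(l, xi, steps=2000):
    """Left-hand side of Eq. (8.4) via composite Simpson integration on [0, pi].

    l is the integer angular momentum; xi may be real or complex.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = math.pi / steps
    total = 0j
    for m in range(steps + 1):
        theta = m * h
        weight = 1 if m in (0, steps) else (4 if m % 2 else 2)
        total += weight * math.sin(theta) * cmath.exp(2j * l * theta + 2 * xi * math.sin(theta))
    return -1 + 0.5 * total * h / 3


def real_root(l, lo, hi, tol=1e-10):
    """Bisection for a real zero of j_tilde(l, .) in [lo, hi]; needs a sign change."""
    f_lo = j_tilde(l, lo).real
    if f_lo * j_tilde(l, hi).real > 0:
        raise ValueError("no sign change on the given interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = j_tilde(l, mid).real
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("J_0(0) =", j_tilde(0, 0.0))           # zero mode: should vanish
    print("J_2(0) =", j_tilde(2, 0.0))           # elementary integral gives -16/15
    print("real root for l = 2:", real_root(2, 0.0, 5.0))
```

The root found for $l = 2$ can be compared with the first line of Eq. (8.5), bearing in mind that the latter is an asymptotic estimate intended for large $l$.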
Note that for $k = 0$ all eigenvalues are real, while for high values of $k$ they lie close to the imaginary axis and do not depend on $l$ (see Fig. 13).

8.1. Level statistics, low frequencies

As was explained in Section 2.2 [see Eq. (2.27)], in the range of relatively low frequencies (which for our problem means $\omega \ll v_F/R$, see below) the level correlation function $R(s = \omega/\Delta)$ has the form
$$R(s) - 1 = \delta(s) - \frac{\sin^2\pi s}{(\pi s)^2} + A\left(\frac{R\Delta}{\pi v_F}\right)^{2}\sin^2\pi s. \tag{8.6}$$
The first two terms correspond to the zero-mode approximation and are given by RMT, while the last one is the non-universal correction to the RMT results. The information about the operator $\hat K$ enters through the dimensionless constant $A = \sum'_{kl}\xi_{kl}^{-2}$, where the prime indicates that the eigenvalue $\xi_{00} = 0$ is excluded. The value of $A$, as well as the high-frequency behavior of $R(s)$ (see below), can be extracted from the spectral function [9]
$$S(\omega) = \sum_l S_l(\omega); \qquad S_l(\omega) \equiv \sum_k (\lambda_{kl} - i\omega)^{-2}. \tag{8.7}$$
According to the Cauchy theorem, $S_l$ can be represented as an integral in the complex plane,
$$S_l(\omega) = \left(\frac{R}{v_F}\right)^{2}\frac{1}{2\pi i}\oint_C \frac{1}{(z - i\omega R/v_F)^2}\,\frac{\tilde J_l'(z)}{\tilde J_l(z)}\,dz,$$
where the contour $C$ encloses all zeroes of the function $\tilde J_l(z)$. Evaluating the residue at $z = i\omega R/v_F$, we find
$$S_l(\omega) = -(R/v_F)^2\left.\frac{d^2}{dz^2}\ln\tilde J_l(z)\right|_{z = i\omega R/v_F}. \tag{8.8}$$
Considering the limit $\omega \to 0$ and subtracting the contribution of $\lambda_{00} = 0$, we get
$$A = -19/27 - 175\pi^2/1152 + 64/(9\pi^2) \approx -1.48. \tag{8.9}$$
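The numerical value in Eq. (8.9) and the size of the resulting correction in Eq. (8.6) are easy to check. The sketch below is illustrative: the function name is hypothetical, the $\delta(s)$ term is omitted (so it applies to $s \neq 0$ only), and it uses the relation $(R\Delta/v_F)^2 = 1/N$ that follows from the definition $N = (v_F/R\Delta)^2$ given later in this section.

```python
import math

# Constant A as quoted in Eq. (8.9).
A_SURFACE = -19 / 27 - 175 * math.pi ** 2 / 1152 + 64 / (9 * math.pi ** 2)


def correlation_minus_one(s, n_electrons):
    """R(s) - 1 from Eq. (8.6) for s != 0 (the delta-function term is omitted).

    n_electrons plays the role of N = (v_F / (R * Delta))**2.
    """
    sin2 = math.sin(math.pi * s) ** 2
    rmt_part = -sin2 / (math.pi * s) ** 2
    correction = A_SURFACE * sin2 / (math.pi ** 2 * n_electrons)
    return rmt_part + correction


if __name__ == "__main__":
    print("A =", A_SURFACE)
    for s in (0.25, 0.5, 1.0, 1.5):
        print(s, correlation_minus_one(s, n_electrons=100))
```

Since $A$ is negative, the correction lowers $R(s)$ below the RMT value at every non-integer $s$, which is the enhanced level repulsion discussed in the text.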
In contrast to the diffusive case, this constant is negative: the level repulsion is enhanced with respect to the RMT result. Eq. (8.6) is valid as long as the correction is small compared to the RMT result, i.e. provided $\omega$ is below the inverse time of flight, $v_F/R$.

8.2. Level statistics, high frequencies

In the range $\omega$
$$D(s) = s^{-2}\prod_{kl \neq (00)}(1 - is\Delta/\lambda_{kl})^{-1}(1 + is\Delta/\lambda_{kl})^{-1}.$$
Since $\Delta^{-2}\,\partial^2\ln D(s)/\partial s^2 = -2\,\mathrm{Re}\,S(\omega)$, we can restore $D(s)$ from Eqs. (8.7), (8.8) up to a factor of the form $\exp(c_1 + c_2 s)$, with $c_1$ and $c_2$ being arbitrary constants. These constants are fixed by the requirement that Eq. (2.34) in the range $\Delta \ll \omega \ll v_F/R$ should reproduce the low-frequency behavior (8.6). As a result, we obtain
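$$D(s) = \left(\frac{\pi}{2}\right)^{6}\frac{1}{N}\prod_l \frac{1}{\tilde J_l(isN^{-1/2})\,\tilde J_l(-isN^{-1/2})}. \tag{8.11}$$
Here $N = (v_F/R\Delta)^2 = (p_F R/2)^2$ is the number of electrons below the Fermi level. For high frequencies $\omega$
$$R_{\rm osc}(\omega) = \frac{\pi^4}{128}\left(\frac{\Delta R}{v_F}\right)^{2}\cos\left(\frac{2\pi\omega}{\Delta}\right). \tag{8.12}$$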
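The normalization of Eq. (8.11) can be cross-checked against the product form $D(s) \to s^{-2}$ at small $s$. The sketch below assumes that Eq. (8.11) reads $D(s) = (\pi/2)^6 N^{-1}\prod_l [\tilde J_l(isN^{-1/2})\tilde J_l(-isN^{-1/2})]^{-1}$, and uses two elementary consequences of Eq. (8.4) that are derived here rather than quoted from the text: $\tilde J_0(z) \approx \pi z/2$ for small $z$, and $\tilde J_l(0) = -4l^2/(4l^2 - 1)$ for $l \neq 0$. With these, the coefficient of $s^{-2}$ becomes $(\pi/2)^4\prod_{l \ge 1}[(4l^2 - 1)/4l^2]^4$, which should tend to unity by the Wallis product.

```python
import math


def small_s_prefactor(l_max):
    """Coefficient of s**-2 in Eq. (8.11) as s -> 0, keeping modes with |l| <= l_max.

    The l = 0 mode contributes N / ((pi/2)**2 * s**2); each pair (l, -l) with
    l >= 1 contributes [(4*l*l - 1) / (4*l*l)]**4.
    """
    product = 1.0
    for l in range(1, l_max + 1):
        product *= ((4 * l * l - 1) / (4 * l * l)) ** 4
    return (math.pi / 2) ** 4 * product


if __name__ == "__main__":
    for l_max in (10, 1000, 100000):
        print(l_max, small_s_prefactor(l_max))
```

The slow approach to 1 reflects the slow convergence of the Wallis product; the truncation error decreases roughly as $1/l_{\max}$.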
It is remarkable that the amplitude of the oscillating part does not depend on frequency. This is in contrast to the diffusive case, where in the AS regime ($\omega$ above the Thouless energy) the oscillating part $R_{\rm osc}(\omega)$ is exponentially small, see Eq. (2.39).

8.3. The level number variance

The smooth part of the level correlation function can be best illustrated by plotting the variance of the number of levels in an energy interval of width $E = s\Delta$,
$$\Sigma_2(s) = \int_{-s}^{s}(s - |\tilde s|)\,R(\tilde s)\,d\tilde s. \tag{8.13}$$
A direct calculation gives for $s \ll N^{1/2}$
$$\pi^2\Sigma_2(s) = 1 + \gamma + \ln(2\pi s) + As^2/(2N) \tag{8.14}$$
and for $s \gg N^{1/2}$
$$\pi^2\Sigma_2(s) = 1 + \gamma + \ln\frac{16N^{1/2}}{\pi^2} - \frac{\pi^2}{16}\left(\frac{2N^{1/2}}{\pi s}\right)^{1/2}\cos\left(\frac{4s}{N^{1/2}} - \frac{\pi}{4}\right). \tag{8.15}$$
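The two asymptotic forms can be tabulated side by side to see the crossover near $s \sim N^{1/2}$. The sketch below is an illustration under explicit assumptions: Eq. (8.14) is taken as $\pi^2\Sigma_2 = 1 + \gamma + \ln(2\pi s) + As^2/(2N)$ and Eq. (8.15) as $\pi^2\Sigma_2 = 1 + \gamma + \ln(16N^{1/2}/\pi^2) - (\pi^2/16)(2N^{1/2}/\pi s)^{1/2}\cos(4s/N^{1/2} - \pi/4)$, with $\gamma$ Euler's constant and $A$ from Eq. (8.9); the function names and the value of $N$ are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329
A_SURFACE = -19 / 27 - 175 * math.pi ** 2 / 1152 + 64 / (9 * math.pi ** 2)


def variance_low(s, n):
    """Sigma_2(s) from Eq. (8.14), intended for s << sqrt(n)."""
    return (1 + EULER_GAMMA + math.log(2 * math.pi * s)
            + A_SURFACE * s * s / (2 * n)) / math.pi ** 2


def variance_high(s, n):
    """Sigma_2(s) from Eq. (8.15), intended for s >> sqrt(n)."""
    root_n = math.sqrt(n)
    oscillation = ((math.pi ** 2 / 16) * math.sqrt(2 * root_n / (math.pi * s))
                   * math.cos(4 * s / root_n - math.pi / 4))
    return (1 + EULER_GAMMA + math.log(16 * root_n / math.pi ** 2)
            - oscillation) / math.pi ** 2


def saturation_value(n):
    """Large-s limit of Eq. (8.15), around which Sigma_2 oscillates."""
    return (1 + EULER_GAMMA + math.log(16 * math.sqrt(n) / math.pi ** 2)) / math.pi ** 2


if __name__ == "__main__":
    n = 400  # arbitrary illustrative number of electrons
    for s in (1, 5, 20, 100, 1000):
        print(s, variance_low(s, n), variance_high(s, n), saturation_value(n))
```

Each formula is only meaningful in its own regime, so the columns should be read accordingly: the first for $s$ well below $N^{1/2}$, the second well above it.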
Here $\gamma \approx 0.577$ is Euler's constant, and $A$ is defined by Eq. (8.9). The first three terms on the r.h.s. of Eq. (8.14) represent the RMT contribution (curve 1 in Fig. 14). As seen from Fig. 14, the two asymptotics (8.14) and (8.15) perfectly match in the intermediate regime, $s \sim N^{1/2}$. Taken together, they provide a complete description of $\Sigma_2(s)$. According to Eq. (8.15), the level number variance saturates at the value $\Sigma_2^{(0)} = \pi^{-2}(1 + \gamma + \ln(16N^{1/2}/\pi^2))$, in
Fig. 14. Level number variance $\Sigma_2(E)$ as a function of energy; $s = E/\Delta$. Curve 1 shows the RMT result, while curves 2 and 3 correspond to the asymptotic regimes of low (8.14) and high (8.15) frequencies. The saturation value $\Sigma_2^{(0)}$ is given in the text. From Ref. [78].
contrast to the behavior found for diffusive systems [9] or ballistic systems with weak bulk disorder [168,262]. The saturation occurs at energies $s \sim N^{1/2}$, or in conventional units $E \sim v_F/R$. This saturation of $\Sigma_2(s)$, as well as its oscillations on the scale set by short periodic orbits, is expected for a generic chaotic billiard [80,81]. It is also in good agreement with the results for $\Sigma_2(s)$ found numerically for a tight-binding model with moderately strong disorder on boundary sites [263,264].

8.4. Eigenfunction statistics

Now we turn to correlations of the amplitudes of an eigenfunction at two different points, defined by Eq. (3.67). As was discussed in Section 3.3.3, these correlations are governed by the ballistic propagator $P_B(r_1, r_2)$ (see Eq. (3.98) and the text preceding it). Direct calculation gives:
$$P_B(r_1, r_2) = P_1(r_1, r_2) + P_2(r_1, r_2), \tag{8.16}$$
$$\begin{aligned}
P_1(r_1, r_2) ={}& P_B^{(0)}(r_1 - r_2) - V^{-1}\int dr_1'\,P_B^{(0)}(r_1' - r_2) - V^{-1}\int dr_2'\,P_B^{(0)}(r_1 - r_2') \\
&+ V^{-2}\int dr_1'\,dr_2'\,P_B^{(0)}(r_1' - r_2'),\\
P_2(r_1, r_2) ={}& \frac{1}{4\pi p_F R}\sum_{k=1}^{\infty}\frac{4k^2 - 1}{4k^2}\left(\frac{r_1 r_2}{R^2}\right)^{k}\cos k(\theta_1 - \theta_2),
\end{aligned} \tag{8.17}$$
where $P_B^{(0)}(r) = 1/(\pi p_F|r|)$, and $(r, \theta)$ are the polar coordinates. This formula has a clear interpretation. The function $P_B$ can be represented as a sum over all paths leading from $r_1$ to $r_2$, with possible surface scattering in between. In particular, $P_1$ corresponds to direct trajectories from $r_1$ to $r_2$ with no reflection from the surface, while the contribution $P_2$ is due to the surface scattering. The first term in the numerator $4k^2 - 1$ comes from trajectories with only one surface reflection, while the second sums up contributions from multiple reflections.

Let us summarize the main results of this section. We have used the ballistic $\sigma$-model approach to study statistical properties of levels and eigenfunctions in a billiard with diffusive surface scattering, which exemplifies a ballistic system in the regime of strong chaos. It was found that the level repulsion and the spectral rigidity are enhanced compared to RMT. In particular, the level number variance saturates at the scale of the inverse time of flight, in agreement with Berry's prediction for a generic chaotic system [80,81]. As another manifestation of the strong spectral rigidity, the oscillating part of the level correlation function does not vanish at large level separation. We also calculated the ballistic analog of the diffusion propagator in this model, which governs correlations of eigenfunction amplitudes at different spatial points. While we focused our attention on the statistics of levels and wave functions in a closed ballistic sample, the surface nature of scattering will also modify statistical properties of the transport characteristics for an open system. In this connection, we mention the recent papers [271,272] where quantum localization and fluctuations of the transmission coefficients and of the conductance were studied for a quasi-1D wire with a rough surface.
A number of important differences compared to the case of bulk disorder were found.
Note that the motion in a quasi-1D wire with surface scattering is closely related to the PRBM ensemble of Section 5.3. According to (8.3), the probability density for a particle to leave the surface after a scattering event with an angle h;1 with respect to the surface is P(h)&h. Since the distance of the ballistic #ight is r&h~1 for small h, this yields P(r)&r~3 for r larger than the transverse size of the wire. Thus, we get a power-law `taila at large r of the form (5.43) with a"3/2. According to Section 5.2, this is precisely the marginal value separating the regions of conventional K~1(q, 0)Jq2 and unconventional K~1(q, 0)Jq2a~1 behavior of the di!usion propagator 0 0 K (q, u). It is clear that at a"3/2 the propagator acquires a logarithmic correction, 0 K~1(q, 0)Jq2 ln q, see Ref. [271] for details. 0 9. Electron}electron interaction in disordered mesoscopic systems In the preceding sections we have considered statistical properties of energy levels and eigenfunctions of a single particle in a disorder potential. However, if an electronic system is considered, the Coulomb interaction between the electrons has to be taken into account. The in#uence of the electron}electron interaction on transport properties of disordered systems has been intensively studied during the last two decades, in particular in connection with such phenomena as weak localization [273] and universal conductance #uctuations [274]. The electron}electron interaction sets the length scale l (phase breaking length) below which the electron wave function preserves its ( coherence. Also, interplay of the interaction and disorder leads to a singular (at Fermi energy or at zero temperature) correction to the density of states and to the conductivity [273]. More recently, another kind of problems has attracted the research interest: to what extent does the electron}electron interaction in#uence the properties of the electron spectrum in a disordered dot? 
This interest is largely motivated by recent progress in nanotechnology, which allows one to observe experimentally discrete electronic levels in semiconductor quantum dots [93,275] and in small metallic grains [276,277]. In fact, there are two types of quantum dot spectra studied experimentally by measuring their I–V characteristics: (i) the excitation spectrum, when excited levels are probed in a dot with a given number of electrons by increasing the source-drain voltage, and (ii) the addition spectrum, when electrons are added one by one by changing the gate voltage. As concerns the excitation spectrum, it was found in [99,278] that the quasiparticle levels with energies below the Thouless energy $E_c$ (counted from the Fermi energy) have a width smaller than the mean level spacing and thus form a well-defined discrete spectrum. To show this, let us calculate the r.m.s. value of the matrix element $V_{ijkl}$ of the screened Coulomb interaction which describes a decay of the quasiparticle state $|i\rangle$ to a three-particle (more precisely, two particles + one hole) state $|jkl\rangle$. Using Eqs. (3.78), (3.88) for a diffusive dot, one finds [99] $\langle |V_{ijkl}|^2 \rangle \sim (\Delta/g)^2$. On the other hand, the density of states of the three-particle states is $\nu_3(E) \sim E^2/\Delta^3$, so that the Golden Rule width of the one-particle states is $\Gamma(E) = 2\pi\langle|$
considered as a tight-binding model in the Fock space, with matrix elements of the Coulomb interaction playing the role of the hopping terms. It was found that only for energies above $E^* \sim \sqrt{g}\,\Delta$ is the Golden Rule applicable and the levels have the regular Lorentzian shape. For smaller energies the quasiparticle excitation consists essentially of a single peak with a small admixture of other many-particle exact eigenstates. This corresponds to Anderson localization in the Fock space. Though the properties of the single-particle excitations discussed above are of most interest, one can also discuss statistical properties of exact many-body levels. For excitation energy E
inequality is met: $\Gamma \ll kT \ll \Delta$, where $\Gamma = \Gamma_l + \Gamma_r$ is the sum of the tunneling rates to the leads. The peak width is then set by the temperature, while the height is given by [298]

$$g_{\max} = \frac{e^2}{h}\,\frac{\pi}{2kT}\,\frac{\Gamma_l \Gamma_r}{\Gamma_l + \Gamma_r} . \tag{9.1}$$

The peak heights show strong fluctuations induced by RMT-like fluctuations of eigenfunction amplitudes, as predicted in [13,96] and observed experimentally in [94,95]. In the valleys between the peaks, it is energetically costly (of the order of the charging energy $e^2/C$, where $C$ is the dot capacitance) to add an electron to (or to remove one from) the dot, and the conductance is determined by virtual processes (so-called elastic cotunneling [299]) and is strongly suppressed.

The issue of the statistics of the peak spacings was addressed for the first time in [83]. Based on numerical data for very small systems, the authors of [83] concluded that the r.m.s. deviation of the peak spacing $S_N$ is proportional to the charging energy $e^2/C$, with a coefficient $\approx 0.15$. We will show below (our consideration will closely follow Refs. [84,136]) that the fluctuations are in fact much smaller, of the order of the mean level spacing $\Delta$. Let us note that in an analogous problem for classical particles the fluctuation magnitude r.m.s.$(S_N)$ would indeed be proportional to the mean value $\langle S_N \rangle$ [300,301]. The physical reason for smaller fluctuations in the quantum case is the delocalized nature of the electronic wave functions, which are spread roughly uniformly over the system. These theoretical conclusions were confirmed recently by thorough experimental studies, as discussed at the end of the section.

The simplest theoretical model which may be used to study the distribution of the peak spacings is as follows. One considers a dot as a fixed-size diffusive mesoscopic sample and assumes that changing the gate voltage by an amount $\delta V_g$ simply reduces to a uniform change of the potential inside the dot by a constant $\gamma\,\delta V_g$, with a certain numerical coefficient $\gamma$ ("lever arm").
Such a model was used for numerical simulations of the addition spectra in Refs. [83,302]. We start by considering the statistics of peak spacings within this model [84]; we will later return to the approximations involved and relax some of them. We will neglect the spin degree of freedom of electrons first; inclusion of the spin will also be discussed at the end of the section. The distance between two consecutive conductance peaks is given by (Fig. 15)

$$\begin{aligned} S_N &= (E_{N+2} - E_{N+1}) - (E_{N+1} - E_N) \\ &= \mu^{N+2}_{N+1} - \mu^{N+1}_{N} , \end{aligned} \tag{9.2}$$

where $E_N$ is the ground state energy of a sample with $N$ electrons. In the second line of Eq. (9.2) we rewrote $S_N$ in terms of the Hartree–Fock single electron energy levels, with $\mu^j_i$ denoting the energy of the state #$j$ in the dot containing $i$ electrons. It is convenient to decompose $S_N$ in the following way:

$$\begin{aligned} S_N &= (\mu^{N+2}_{N+1} - \mu^{N+2}_{N}) + (\mu^{N+2}_{N} - \mu^{N+1}_{N}) \\ &\equiv E_1 + E_2 . \end{aligned} \tag{9.3}$$

The quantity $E_2$ is the distance between two levels of the same one-particle (Hartree–Fock) Hamiltonian $\hat{H}_N$ (describing a dot with $N$ electrons) and is expected to obey RMT; in particular, $\langle E_2 \rangle = \Delta$ and r.m.s.$(E_2) = a\Delta$ with a numerical coefficient $a$ of order unity [$a = 0.52$ (0.42) for the orthogonal (resp. unitary) ensemble]. On the other hand, $E_1$ is a shift of the level #$(N+2)$ due to the change of the Hamiltonian $\hat{H}_N \to \hat{H}_{N+1}$ accompanying the addition of the electron #$(N+1)$ to the system.
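The second-difference definition in Eq. (9.2) can be illustrated with a minimal Python sketch. It assumes a constant-interaction toy model for the ground state energy, $E_N = (e^2/2C)N^2$ plus the sum of the $N$ lowest single-particle levels; this model, the function names and the level values are illustrative assumptions and are not taken from the text. In this toy model $E_1$ reduces to $e^2/C$ and $E_2$ to a bare level spacing.

```python
def ground_state_energy(n, levels, charging_energy):
    """Toy constant-interaction model (an assumption, not the Hartree-Fock
    treatment of the text): E_N = (e^2/2C) N^2 + sum of the N lowest levels."""
    return 0.5 * charging_energy * n**2 + sum(sorted(levels)[:n])


def peak_spacing(n, levels, charging_energy):
    """S_N = (E_{N+2} - E_{N+1}) - (E_{N+1} - E_N), first line of Eq. (9.2)."""
    def energy(m):
        return ground_state_energy(m, levels, charging_energy)
    return (energy(n + 2) - energy(n + 1)) - (energy(n + 1) - energy(n))


# Hypothetical spinless single-particle levels and charging energy (arbitrary units).
levels = [0.0, 0.9, 2.1, 2.8, 4.2]
e2_over_c = 10.0

# For N = 1 the spacing is e^2/C plus the gap between levels #3 and #2.
s1 = peak_spacing(1, levels, e2_over_c)
```

In this simplified picture all fluctuations of $S_N$ come from the level spacing; the discussion that follows concerns the additional fluctuations of $E_1$ that the toy model ignores.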
Fig. 15. Energy levels before and after the electron #(N+1) has been added to the dot. From Ref. [84].
The effective interaction $U(r, r')$ between the electrons #$(N+1)$ and #$(N+2)$ can be found from the RPA-type equation,

$$U(r, r') = U_0(r - r') - \int dr_1\, dr_2\, U_0(r - r_1) P_0(r_1, r_2) U(r_2, r') , \tag{9.4}$$

where $U_0(r) = e^2/\varepsilon r$, $\varepsilon$ is the dielectric constant, and

$$P_0(r_1, r_2) = \nu[\delta(r_1 - r_2) - V^{-1}] \tag{9.5}$$

is the polarization operator. Its solution has the form [84]

$$U(r, r') = \bar{U} + \delta U(r) + \delta U(r') + U_\kappa(r, r') . \tag{9.6}$$

Here $\bar{U} \equiv e^2/C$ is a constant (charging energy), $\delta U(r)$ is the change of the self-consistent potential due to addition of one electron (i.e. the difference in the self-consistent potential in the dots with $N$ and $N+1$ electrons), and $U_\kappa(r)$ is the screened Coulomb interaction. In particular, in the experimentally most relevant 2D case (which we will consider below) and assuming a circular form of the dot with radius $R$, we have

$$\delta U(r) = -\frac{e^2}{2\varepsilon\kappa_s R}\,(R^2 - r^2)^{-1/2} , \tag{9.7}$$

while $U_\kappa$ is given in the Fourier space by $\tilde{U}_\kappa(q) = 2\pi e^2/\varepsilon(q + \kappa_s)$ with the inverse screening length $\kappa_s = 2\pi e^2\nu/\varepsilon$.

According to Eq. (9.6), the term $E_1$ can be decomposed into the following three contributions:

$$\begin{aligned} E_1 &= e^2/C + \int dr\, \left(|\psi^2_{N+1}(r)| + |\psi^2_{N+2}(r)|\right) \delta U(r) + \int dr\, dr'\, |\psi^2_{N+1}(r)|\,|\psi^2_{N+2}(r')|\, U_\kappa(|r - r'|) \\ &= E_1^{(0)} + E_1^{(1)} + E_1^{(2)} . \end{aligned} \tag{9.8}$$
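As a quick consistency check on the screened interaction quoted after Eq. (9.7), the following Python sketch evaluates $\tilde{U}_\kappa(q)$ and verifies two limits: at $q \to 0$ it equals $1/\nu$ irrespective of $e^2$ and $\varepsilon$, and at $q \gg \kappa_s$ it approaches the bare 2D Coulomb form $2\pi e^2/\varepsilon q$. The expression is read here as $2\pi e^2/[\varepsilon(q + \kappa_s)]$; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def inverse_screening_length(e2, nu, eps):
    """kappa_s = 2*pi*e^2*nu/eps, the 2D inverse screening length."""
    return 2.0 * math.pi * e2 * nu / eps


def screened_interaction_q(q, e2, nu, eps):
    """U_kappa(q) = 2*pi*e^2 / [eps * (q + kappa_s)] (assumed reading of the text)."""
    return 2.0 * math.pi * e2 / (eps * (q + inverse_screening_length(e2, nu, eps)))


# Hypothetical parameter values in arbitrary units.
e2, nu, eps = 1.3, 0.7, 12.0

# Long-wavelength limit: the screened interaction saturates at 1/nu.
u_zero = screened_interaction_q(0.0, e2, nu, eps)

# Short-wavelength limit: screening becomes irrelevant for q >> kappa_s.
q_large = 1.0e6 * inverse_screening_length(e2, nu, eps)
u_bare = 2.0 * math.pi * e2 / (eps * q_large)
ratio = screened_interaction_q(q_large, e2, nu, eps) / u_bare
```

The saturation at $1/\nu$ means that integrating $U_\kappa$ over space gives the inverse density of states, which is how a mean level spacing can emerge from expressions built from $U_\kappa$.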
The first term in Eq. (9.8) (the charging energy) determines the average value $\langle E_1 \rangle$ and thus the average peak spacing $\langle S_N \rangle$ (since $e^2/C$
$$\delta(e^2/C) = 2\int dr\, dr'\, \delta U(r)\, \delta P(r, r')\, \delta U(r') . \tag{9.9}$$
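Evaluating the fluctuations of the polarization operator, we find [138]

$$\mathrm{var}(E_1^{(0)}) = \frac{48}{\beta}\,\nu^2 \ln g \left[\frac{1}{V}\int dr_1\, dr_2\, \delta U(r_1) P(r_1, r_2) \delta U(r_2)\right]^2 \propto \frac{1}{\beta}\left(\frac{\Delta}{g}\right)^2 \ln g . \tag{9.10}$$

Now we consider fluctuations of the last term, $E_1^{(2)}$, in Eq. (9.8). Using Eqs. (3.74) and (3.86) for the correlations of eigenfunction amplitudes at two remote points, the variance of $E_1^{(2)}$ is found to be

$$\begin{aligned} \mathrm{var}(E_1^{(2)}) &= \frac{4}{\beta^2 V^4}\int dr_1\, dr_1'\, dr_2\, dr_2'\, U_\kappa(|r_1 - r_1'|)\, U_\kappa(|r_2 - r_2'|)\, P(r_1, r_2) P(r_1', r_2') \\ &\approx \frac{4\Delta^2}{\beta^2 V^2}\int dr_1\, dr_2\, P^2(r_1, r_2) \propto \frac{1}{\beta^2}\left(\frac{\Delta}{g}\right)^2 . \end{aligned} \tag{9.11}$$

Finally, fluctuations of the term $E_1^{(1)}$ can also be evaluated with the help of Eqs. (3.74), (3.86), yielding

$$\mathrm{var}(E_1^{(1)}) = \frac{4}{\beta V^2}\int dr_1\, dr_2\, \delta U(r_1) P(r_1, r_2) \delta U(r_2) \propto \frac{\Delta^2}{\beta g} . \tag{9.12}$$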
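The parametric dependences in Eqs. (9.10)–(9.12) can be compared numerically with a short Python sketch. It only encodes the scaling with $g$ and $\beta$; all numerical prefactors are set to one, which is an assumption made purely for illustration, so the output indicates the ordering of the three contributions rather than their actual values.

```python
import math


def rms_estimates(g, beta, delta=1.0):
    """Square roots of the parametric variances in Eqs. (9.10)-(9.12).

    Prefactors of order unity are set to 1 (an illustrative assumption);
    results are in units of the mean level spacing when delta = 1.
    """
    return {
        "E1_0": delta * math.sqrt(math.log(g) / beta) / g,  # capacitance fluctuations
        "E1_1": delta / math.sqrt(beta * g),                # self-consistent potential
        "E1_2": delta / (beta * g),                         # screened interaction
    }


# Example: dimensionless conductance g = 100, orthogonal ensemble (beta = 1).
est = rms_estimates(g=100.0, beta=1.0)
```

For $g = 100$ and $\beta = 1$ the sketch gives the $E_1^{(1)}$ term as the largest of the three, with all of them below the mean level spacing, in line with the scaling of the variances above.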
It is seen that for $g \gg 1$ all the contributions of Eqs. (9.10)–(9.12) are parametrically small compared to the RMT fluctuations (which are $\sim \Delta$). Fluctuations of the term $E_1^{(1)}$, related to the change $\delta U(r)$ of the self-consistent potential, represent the parametrically leading contribution to the enhancement of the peak spacing fluctuations with respect to RMT.

Let us now discuss the approximations made in the course of the above derivation:

(i) The dot was supposed to be diffusive in the calculation. For a ballistic dot one should replace $P(r, r')$ by its ballistic counterpart. This would mean that the parameter $g$ is replaced by $\sim N^{1/2} \sim L/\lambda_F$, where $N$ is the number of electrons in the dot and $L$ the characteristic linear dimension. The numerical coefficient would depend, however, on "how strongly chaotic" the dot is. The role of the eigenfunction fluctuations and correlations ("scars") in enhancement of
the peak spacing fluctuations was studied in [303] via numerical simulations of a dot with $N \approx 100$ electrons.

(ii) It was assumed that changing the gate voltage results in a spatially uniform change of the potential in the sample. This has led us to the expression Eq. (9.7) for the change of the self-consistent potential $\delta U(r)$ accompanying the addition of one electron to the dot. This result would correspond to a gate located far enough from the sample. In a more realistic situation, when the gate is relatively narrow and located close to the sample, the potential change $\delta U(r)$ (as well as the additional electron density) will be concentrated on the side of the dot facing the gate. The change of the potential $\delta U(r)$ then corresponds to a slight deformation of the dot with each electron added to it. To estimate $\delta U(r)$ in this case, we can consider a model problem of a point-like charged object (modeling the gate) located a distance $d$ from the edge of the dot. Assuming that the dot size is larger than $d$, we can approximate the dot [while calculating $\delta U(r)$] by a half plane. In one period of the Coulomb blockade oscillations the gate charge changes by $e$ (electron charge). The charge distribution induced in the dot is

$$\delta\rho(x, y) = -\frac{e}{\pi^2}\sqrt{\frac{d}{x}}\,\frac{1}{\sqrt{(d + x)^2 + y^2}} , \tag{9.13}$$

where the point of the dot closest to the gate is chosen as the coordinate origin and the $x$ axis is directed along the dot edge. The corresponding change of the potential is $\delta U(r) = (e^2\nu/e)^{-1}\delta\rho(r)$. Substituting this into the first line of Eq. (9.12), we come to the same result r.m.s.$(E_1^{(1)}) \propto \Delta/\sqrt{\beta g}$ in the diffusive limit $d > l$, and to r.m.s.$(E_1^{(1)}) \sim \Delta/\sqrt{\beta k_F d}$ in the case $d < l$. The effect of the quantum dot deformation on the peak spacing fluctuations has recently been considered in Ref. [304]. The authors of [304] characterized the strength of the deformation by a phenomenological dimensionless parameter $x$ and assumed that it can be large ($x \gtrsim 1$), strongly affecting the spacing distribution. However, the above estimates indicate that for a typical geometry this parameter is much less than unity, $x \sim 1/\sqrt{g}$ or $x \sim 1/\sqrt{k_F d}$.

(iii) It was assumed that the dot energy and the measured gate voltage are related through a constant (or smoothly varying) coefficient $\gamma$. This "lever arm" $\gamma$ depends, however, on the dot–gate capacitance, which is also a fluctuating quantity. If the gate size and the distance to the gate are of the order of the size of the dot, these fluctuations should be of the same order as the fluctuations of the dot self-capacitance and thus lead to additional fluctuations parametrically small compared to $\Delta$, see Eq. (9.10).

(iv) The calculation was done within the random phase approximation, which assumes that the ratio of the interparticle Coulomb energy to the kinetic energy is small, $r_s \equiv \sqrt{2}e^2/\varepsilon\hbar v_F \ll 1$ ($v_F$ is the Fermi velocity). However, most of the experimental realizations of semiconductor quantum dots correspond to $r_s \approx 1$. Since this value is still considerably lower than the Wigner crystallization threshold, the calculations should still be valid, up to a numerical factor $\alpha(r_s)$ [depending on $r_s$ only and such that $\alpha(r_s \ll 1) = 1$].
(v) We have considered the model of spinless electrons up to now. Let us briefly discuss the role of the spin degree of freedom. Within the constant interaction model, it would lead to a bimodal distribution [302] of peak spacings

$$P(S_N) = \frac{1}{2}\left[\delta(S_N - e^2/C) + \frac{1}{2\Delta} P_{WD}\!\left(\frac{S_N - e^2/C}{2\Delta}\right)\right] , \tag{9.14}$$
where $P_{WD}(s)$ is the Wigner–Dyson distribution and $\Delta$ denotes the level spacing in the absence of spin degeneracy. The value of the coefficient $a$ in the relation r.m.s.$(S_N) = a\Delta$ is then increased (compared to the spinless case) and is equal to 1.24 (1.16) for the orthogonal (resp. unitary) ensemble. Taking into account fluctuations of eigenfunctions (and thus of $E_1$), however, modifies the form of the distribution. The value of the term $E_1^{(2)}$ representing the interaction between two electrons is larger in the case when $\psi_{N+2}$ and $\psi_{N+1}$ correspond to two spin-degenerate states (i.e. have the same spatial dependence of the wave function), since
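$$\begin{aligned} &\left\langle \int dr\, dr'\, |\psi_i^2(r)|\,|\psi_i^2(r')|\, U_\kappa(|r - r'|) \right\rangle - \left\langle \int dr\, dr'\, |\psi_i^2(r)|\,|\psi_j^2(r')|\, U_\kappa(|r - r'|) \right\rangle \\ &\qquad = \frac{2}{\beta V^2}\int dr\, dr'\, k_d(|r - r'|)\, U_\kappa(|r - r'|) \sim \Delta \end{aligned} \tag{9.15}$$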
for $r_s \sim 1$ (the coefficient depends on $r_s$, see [84]). Therefore, filling a state $\psi_{i\uparrow}$ pushes up the level $\psi_{i\downarrow}$ (with respect to other eigenstates) by an amount of order $\Delta$. This removes the bimodal structure of the distribution of peak spacings and slightly modifies the value of the coefficient $a$.

Based on the above analysis, we can make the following general statement. Imagine that we fix $r_s \sim 1$ (i.e. fix the electron density and thus the Fermi wave length) and the system geometry, and then start to increase the linear dimension $L$ of the system. Then, while the average value of the peak spacing $S_N$ scales as $\langle S_N \rangle \approx e^2/C \propto 1/L$, its fluctuations will scale differently: r.m.s.$(S_N) \sim \Delta \propto 1/L^2$. This conclusion is also corroborated by diagrammatic calculations of Ref. [85]. As was mentioned in the beginning of this section, this result should be contrasted with that for classical particles [300,301], where the fluctuations are proportional to the mean value $\langle S_N \rangle$. The parametrically smaller fluctuations in the quantum case are due to the delocalized nature of the electronic wave functions, which are spread roughly uniformly over the system.

The above prediction was confirmed by recent experiments [86,87,290], where a thorough study of the peak spacing statistics was carried out. It was found that the low-temperature value of r.m.s.$(S_N)$, as well as the typical temperature scale for its change, are approximately given by the mean level spacing $\Delta$ (while in units of $E_c$ the magnitude of fluctuations was as small as 1–4%).

Several recent papers studied statistical properties of the peak spacing numerically. Stopa [303] used density functional theory and found the fluctuations of the addition energies to be approximately $0.7\Delta$, in agreement with the above results. Disappearance of the bimodal character of the peak spacing distribution with increasing $r_s$ was observed recently in numerical simulations by Berkovits [305] via exact diagonalization.
However, the system size and the number of particles
in the exact diagonalization studies are very small, so that it is difficult to draw any quantitative conclusions concerning dots with a large number of electrons from the results of [305].
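The numerical coefficients quoted in this section for the RMT spacing fluctuations, 0.52 (0.42) in the spinless case and 1.24 (1.16) for the bimodal distribution (9.14), can be reproduced with a short Python sketch. It assumes that the Wigner–Dyson distribution is approximated by the Wigner surmise, whose second moments are $4/\pi$ (orthogonal) and $3\pi/8$ (unitary) for unit mean spacing; the use of the surmise and the function names are assumptions of this sketch, not statements from the text.

```python
import math

# Second moment <s^2> of the Wigner surmise, with the mean spacing <s> = 1.
# The surmise is used here as an approximation to the exact Wigner-Dyson law.
SECOND_MOMENT = {"orthogonal": 4.0 / math.pi, "unitary": 3.0 * math.pi / 8.0}


def a_spinless(ensemble):
    """r.m.s. of a single level spacing in units of the mean level spacing."""
    return math.sqrt(SECOND_MOMENT[ensemble] - 1.0)


def a_bimodal(ensemble):
    """r.m.s. of S_N for Eq. (9.14), in units of the mean level spacing.

    With x = (S_N - e^2/C)/Delta: x = 0 with probability 1/2 and x = 2s with
    probability 1/2, where s follows the Wigner surmise.
    """
    mean = 0.5 * 2.0 * 1.0
    second = 0.5 * 4.0 * SECOND_MOMENT[ensemble]
    return math.sqrt(second - mean**2)


coefficients = {
    ens: (round(a_spinless(ens), 2), round(a_bimodal(ens), 2))
    for ens in ("orthogonal", "unitary")
}
```

Within this approximation the increase of the coefficient in the spin-degenerate case comes entirely from the delta-function peak in Eq. (9.14), which is the feature that the interaction term $E_1^{(2)}$ is argued to remove.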
10. Summary and outlook

In this article we have reviewed the recent progress in the study of statistical properties of disordered electronic systems. We have discussed statistics of energy levels and eigenfunction amplitudes, as well as of several related quantities (local density of states, escape time, conductance, etc.). In most of the article the supersymmetric σ-model approach was used, as a unique and powerful tool allowing one to calculate various distribution and correlation functions. Within this approach, we have employed a number of complementary methods of treating the σ-model: exact solution (in particular, the transfer-matrix method in 1D), perturbation theory, renormalization group, and the saddle-point method. The main emphasis has been put on system-specific deviations from the universal predictions of the random matrix theory.

The results presented constitute a detailed and, in many respects, complete description of fluctuations of spectra and wave functions of disordered systems. Still, there are a number of directions in this field in which a more complete understanding is needed, so that the corresponding research remains active at present. Let us point out some of them.

• Statistical properties of energy levels and wave functions of ballistic systems whose classical counterparts are chaotic require more theoretical studies. It remains to be understood what the conditions of applicability of the ballistic σ-model derived in [75–77] are. Apparently, a certain amount of disorder present in a system (or, in other words, some ensemble averaging) is needed to justify the derivation in these papers, but the precise conditions have not been quantified yet (see Ref. [306] where this problem is discussed in the context of correlations of quantum maps). Also, a discrepancy between the results of the σ-model and of the semiclassical approach in treating repetitions of periodic orbits [307,308] is to be resolved.
It remains to be seen whether the ballistic σ-model approach can be developed to predict the weak-localization corrections (recent attempts [309,310] did not lead to any definite conclusions), the asymptotic "tails" of the distributions (like those discussed in Section 4 for diffusive systems), etc.

• Interplay of disorder and electron–electron interaction in mesoscopic systems continues to attract considerable research interest. In Section 9 of this article we discussed only the issue of fluctuations in an (almost) closed system. In the experiment, coupling of the dot to the outside world is controlled by the gate(s) and may be varied. There arises a rich variety of regimes depending on the coupling strength, magnetic field, and the strength of the Coulomb interaction (parameter $r_s$). Fluctuations in open quantum dots have been studied recently, both theoretically [311–313] and experimentally [289,290,314,315], but the issue is not fully understood yet. In Ref. [316] it was found that application of a strong magnetic field dramatically affects the addition spectrum of a quantum dot, leading to strong fluctuations and to bunching of the Coulomb blockade peaks. A consistent explanation of these experimental results is still missing. Very recently, several groups [317–321] simultaneously performed self-consistent Hartree–Fock calculations of addition spectra of quantum dots. Though the method allows one to study
considerably larger systems as compared to the exact diagonalization method, the obtained results were not sufficient to extract unambiguously a scaling dependence of the spacing distribution function on the parameters of the problem. A drawback of the Hartree–Fock method with the bare Coulomb potential is that the exchange interaction does not get screened, whereas the screening was crucially important for the theoretical consideration in Section 9.1. More work in this direction may be expected in the near future.

• The physics of the Anderson metal–insulator transition, including fluctuations at the critical point, remains an actively studied field. Parameters of modern computers allow one to evaluate numerically critical indices and various distribution functions with a high accuracy. It would be very interesting to study numerically the critical point in higher dimensions (see, in this respect, the recent paper [178]) as well as in an effectively infinite-dimensional tree-like sparse random matrix model and in a long-range 1D model (the power-law random banded matrix ensemble considered in Section 5.3). It remains to be seen what the status is of the conjecture of Ref. [49] relating the spectral compressibility to the eigenfunction multifractality. Also, the form of the conductance distribution at the critical point remains an open problem. Ref. [28] predicted a log-normal form of the distribution function at $g \gg \langle g \rangle \sim 1$, similarly to the distribution functions of local density of states and of relaxation times. However, in contrast to the latter two quantities, an anomalously large conductance cannot be explained by a single anomalously localized state (since a single state cannot produce a conductance larger than 1). It seems much more probable that the conductance distribution falls off much faster, in a Gaussian (or similar) fashion at large $g$, but no corresponding calculation has been available so far.
Technically, the problem is that calculating higher moments of the conductance requires increasing the size of the Q-matrix in the σ-model, which tremendously complicates a non-perturbative treatment of the distribution function. For recent numerical studies of the conductance distribution at criticality, see Refs. [327–329].

Acknowledgements

My own work reviewed in this paper was done in collaboration with the late A.G. Aronov, Ya.M. Blanter, F.-M. Dittes, Y.V. Fyodorov, V.E. Kravtsov, A. Müller-Groeling, B.A. Muzykantskii, R. Pnini, T. Seligman, B. Shapiro, M.R. Zirnbauer. It is my pleasure to thank all of them. I am also indebted to many other colleagues, in particular, O. Agam, B.L. Altshuler, R. Berkovits, S. Fishman, K. Frahm, Y. Gefen, D.E. Khmelnitskii, I.V. Lerner, L.S. Levitov, C. Marcus, K. Muttalib, D.G. Polyakov, P.G. Silvestrov, I.E. Smolyarenko, A.M. Tsvelik, P. Wölfle, and I.Kh. Zharekeshev for numerous stimulating discussions of the topics addressed in the review. This work was supported by SFB195 der Deutschen Forschungsgemeinschaft.

Appendix A. Abbreviations

AKL         Altshuler, Kravtsov, and Lerner, Ref. [28]
ALS         anomalously localized state
DMPK        Dorokhov-Mello-Pereira-Kumar (equations)
DOS         density of states
GOE         Gaussian orthogonal ensemble
GSE         Gaussian symplectic ensemble
GUE         Gaussian unitary ensemble
IPR         inverse participation ratio
LDOS        local density of states
LN          logarithmically normal
PRBM        power-law random banded matrix
RBM         random banded matrix
RG          renormalization group
RPA         random phase approximation
RMT         random matrix theory
WD          Wigner–Dyson (level statistics)
1D, 2D, 3D  one dimensional, two dimensional, three dimensional
References

[1] E.P. Wigner, Ann. Math. 53 (1951) 36.
[2] F.J. Dyson, J. Math. Phys. 3 (1962) 140.
[3] M.L. Mehta, Random Matrices, Academic Press, Boston, 1991.
[4] O. Bohigas, M.J. Giannoni, C. Schmit, Phys. Rev. Lett. 52 (1984) 1.
[5] L.P. Gor'kov, G.M. Eliashberg, Zh. Eksp. Teor. Fiz. 48 (1965) 1407 [Sov. Phys. JETP 21 (1965) 940].
[6] K.B. Efetov, Adv. Phys. 32 (1983) 53.
[7] K.B. Efetov, Supersymmetry in Disorder and Chaos, Cambridge University Press, Cambridge, 1997.
[8] J.J.M. Verbaarschot, H.A. Weidenmüller, M.R. Zirnbauer, Phys. Rep. 129 (1985) 367.
[9] B.L. Altshuler, B.I. Shklovskii, Zh. Eksp. Teor. Fiz. 91 (1986) 220 [Sov. Phys. JETP 64 (1986) 127].
[10] V.E. Kravtsov, A.D. Mirlin, Pis'ma Zh. Eksp. Teor. Fiz. 60 (1994) 645 [JETP Lett. 60 (1994) 656].
[11] A.V. Andreev, B.L. Altshuler, Phys. Rev. Lett. 75 (1995) 902.
[12] C.E. Porter, Statistical Theories of Spectra: Fluctuations, Academic Press, New York, 1965.
[13] R.A. Jalabert, A.D. Stone, Y. Alhassid, Phys. Rev. Lett. 68 (1992) 3468.
[14] J. Stein, H.-J. Stöckmann, Phys. Rev. Lett. 68 (1992) 2867.
[15] A. Kudrolli, V. Kidambi, S. Sridhar, Phys. Rev. Lett. 75 (1995) 822.
[16] H. Alt, H.-D. Gräf, H.L. Harney, R. Hofferbert, H. Lengeler, A. Richter, P. Schardt, H.A. Weidenmüller, Phys. Rev. Lett. 74 (1995) 62.
[17] A.D. Mirlin, Y.V. Fyodorov, J. Phys. A: Math. Gen. 26 (1993) L551.
[18] Y.V. Fyodorov, A.D. Mirlin, Int. J. Mod. Phys. B 8 (1994) 3795.
[19] A.D. Mirlin, Y.V. Fyodorov, Phys. Rev. Lett. 72 (1994) 526.
[20] A.D. Mirlin, Y.V. Fyodorov, J. de Phys. I France 4 (1994) 655.
[21] Y.V. Fyodorov, A.D. Mirlin, Phys. Rev. Lett. 69 (1992) 1093, 1832 (E).
[22] Y.V. Fyodorov, A.D. Mirlin, Phys. Rev. Lett. 71 (1993) 412.
[23] Y.V. Fyodorov, A.D. Mirlin, Pis'ma Zh. Eksp. Teor. Fiz. 58 (1993) 636 [JETP Lett. 58 (1993) 615].
[24] Y.V. Fyodorov, A.D. Mirlin, Pis'ma Zh. Eksp. Teor. Fiz. 60 (1994) 779 [JETP Lett. 60 (1994) 790].
[25] Y.V. Fyodorov, A.D. Mirlin, Phys. Rev. B 51 (1995) 13 403.
[26] Ya.M. Blanter, A.D. Mirlin, Phys. Rev. B 53 (1996) 12 601.
[27] Ya.M. Blanter, A.D. Mirlin, Phys. Rev. E 55 (1997) 6514.
[28] B.L. Altshuler, V.E. Kravtsov, I.V. Lerner, in: B.L. Altshuler, P.A. Lee, R.A. Webb (Eds.), Mesoscopic Phenomena in Solids, North-Holland, Amsterdam, 1991, p. 449.
[29] B.A. Muzykantskii, D.E. Khmelnitskii, Phys. Rev. B 51 (1995) 5480.
[30] A.D. Mirlin, Pis'ma Zh. Eksp. Teor. Fiz. 62 (1995) 583 [JETP Lett. 62 (1995) 603].
[31] B.A. Muzykantskii, D.E. Khmelnitskii, preprint cond-mat/9601045.
[32] V.I. Fal'ko, K.B. Efetov, Europhys. Lett. 32 (1995) 627.
[33] V.I. Fal'ko, K.B. Efetov, Phys. Rev. B 52 (1995) 17 413.
[34] A.D. Mirlin, Phys. Rev. B 53 (1996) 1186.
[35] A.D. Mirlin, Y.V. Fyodorov (1995), unpublished.
[36] A.D. Mirlin, J. Math. Phys. 38 (1997) 1888.
[37] V.E. Kravtsov, I.V. Yurkevich, Phys. Rev. Lett. 78 (1997) 3354.
[38] C. Basu, C.M. Canali, V.E. Kravtsov, I.V. Yurkevich, Phys. Rev. B 57 (1998) 14 174.
[39] P.A. Lee, T.V. Ramakrishnan, Rev. Mod. Phys. 57 (1985) 287.
[40] B.L. Altshuler, I.K. Zharekeshev, S.A. Kotochigova, B.I. Shklovskii, Zh. Eksp. Teor. Fiz. 94 (1988) 343 [Sov. Phys. JETP 67 (1988) 625].
[41] B.I. Shklovskii, B. Shapiro, B.R. Sears, P. Lambrianides, H.B. Shore, Phys. Rev. B 47 (1993) 11 487.
[42] V.E. Kravtsov, I.V. Lerner, B.L. Altshuler, A.G. Aronov, Phys. Rev. Lett. 72 (1994) 888.
[43] A.G. Aronov, A.D. Mirlin, Phys. Rev. B 51 (1995) 6131.
[44] V.E. Kravtsov, I.V. Lerner, Phys. Rev. Lett. 74 (1995) 2563.
[45] B.R. Sears, H.B. Shore, Statistical fluctuations of energy levels spectra near the metal-insulator transition, 1994, unpublished.
[46] I. Zharekeshev, B. Kramer, in: H.A. Cerdeira, B. Kramer, G. Schön (Eds.), Quantum Transport in submicron structures, NATO ASI Series E, Vol. 291, Kluwer Academic Publishers, Dordrecht, 1995, p. 93.
[47] I. Zharekeshev, B. Kramer, Jpn. J. Appl. Phys. 34 (1995) 4361.
[48] D. Braun, G. Montambaux, M. Pascaud, Phys. Rev. Lett. 81 (1998) 1062.
[49] J.T. Chalker, V.E. Kravtsov, I.V. Lerner, Pis'ma Zh. Eksp. Teor. Fiz. 64 (1996) 355 [JETP Lett. 64 (1996) 386].
[50] F. Wegner, Z. Phys. B 36 (1980) 209.
[51] C. Castellani, L. Peliti, J. Phys. A: Math. Gen. 19 (1986) L429.
[52] M. Schreiber, H. Grussbach, Phys. Rev. Lett. 67 (1991) 607.
[53] W. Pook, M. Janssen, Z. Phys. B 82 (1991) 295.
[54] B. Huckestein, B. Kramer, L. Schweitzer, Surf. Sci. 263 (1992) 125; B. Huckestein, L. Schweitzer, in: G. Landwehr (Ed.), High Magnetic Fields in Semiconductor Physics III. Proc. Int. Conf. Würzburg 1990, Springer Series in Solid State Sciences, Vol. 101, Springer, Berlin, 1992, p. 84.
[55] M. Janssen, Int. J. Mod. Phys. B 8 (1994) 943.
[56] B. Huckestein, Rev. Mod. Phys. 67 (1995) 357.
[57] A.D. Mirlin, Y.V. Fyodorov, F.-M. Dittes, J. Quezada, T.H. Seligman, Phys. Rev. B 54 (1996) 3221.
[58] J.V. Jose, R. Cordery, Phys. Rev. Lett. 56 (1986) 290.
[59] B.L. Altshuler, L.S. Levitov, Phys. Rep. 288 (1997) 487.
[60] A.V. Balatsky, M.I. Salkola, Phys. Rev. Lett. 76 (1996) 2386.
[61] I.V. Ponomarev, P.G. Silvestrov, Phys. Rev. B 56 (1997) 3742.
[62] M.A. Skvortsov, V.E. Kravtsov, M.V. Feigel'man, Pis'ma Zh. Eksp. Teor. Fiz. 68 (1998) 78 [JETP Lett. 68 (1998) 84].
[63] S. Iida, H.A. Weidenmüller, J.A. Zuk, Ann. Phys. (NY) 200 (1990) 219.
[64] M.R. Zirnbauer, Phys. Rev. Lett. 69 (1992) 1584.
[65] A.D. Mirlin, A. Müller-Groeling, M.R. Zirnbauer, Ann. Phys. (NY) 236 (1994) 325.
[66] P.W. Brouwer, K. Frahm, Phys. Rev. B 53 (1996) 1490.
[67] A.D. Mirlin, A. Müller-Groeling (1994), unpublished.
[68] V. Plerou, Z. Wang, Phys. Rev. B 58 (1998) 1967.
[69] A. Ishimaru, Wave Propagation and Scattering in Random Media, Academic Press, New York, 1978.
[70] B. Shapiro, Phys. Rev. Lett. 57 (1986) 2168.
[71] A.D. Mirlin, R. Pnini, B. Shapiro, Phys. Rev. E 57 (1998) R6285.
[72] Th.M. Nieuwenhuizen, M.C.W. van Rossum, Phys. Rev. Lett. 74 (1995) 2674.
[73] E. Kogan, M. Kaveh, Phys. Rev. B 52 (1995) R3813.
[74] S.A. van Langen, P.W. Brouwer, C.W.J. Beenakker, Phys. Rev. E 53 (1996) R1344.
[75] B.A. Muzykantskii, D.E. Khmelnitskii, Pis'ma Zh. Eksp. Teor. Fiz. 62 (1995) 68 [JETP Lett. 62 (1995) 76].
[76] A.V. Andreev, O. Agam, B.D. Simons, B.L. Altshuler, Phys. Rev. Lett. 76 (1996) 3947.
[77] A.V. Andreev, B.D. Simons, O. Agam, B.L. Altshuler, Nucl. Phys. B 482 (1996) 536.
[78] Ya.M. Blanter, A.D. Mirlin, B.A. Muzykantskii, Phys. Rev. Lett. 80 (1998) 4161.
[79] V. Tripathi, D.E. Khmelnitskii, Phys. Rev. B 58 (1998) 1122.
[80] M.V. Berry, Proc. Roy. Soc. London A 400 (1985) 229.
[81] M.V. Berry, in: M.-J. Giannoni, A. Voros, J. Zinn-Justin (Eds.), Chaos and Quantum Physics, North-Holland, Amsterdam, 1991, p. 251.
[82] L.P. Kouwenhoven, C.M. Marcus, P.L. McEuen, S. Tarucha, R.M. Westerwelt, N.S. Wingreen, in: L.L. Sohn, L.P. Kouwenhoven, G. Schön (Eds.), Mesoscopic Electron Transport, Kluwer, 1997, p. 105.
[83] U. Sivan, R. Berkovits, Y. Aloni, O. Prus, A. Auerbach, G. Ben-Joseph, Phys. Rev. Lett. 77 (1996) 1123.
[84] Ya.M. Blanter, A.D. Mirlin, B.A. Muzykantskii, Phys. Rev. Lett. 78 (1997) 2449.
[85] R. Berkovits, B.L. Altshuler, Phys. Rev. B 55 (1997) 5297.
[86] S.R. Patel, S.M. Cronenwett, D.R. Stewart, A.G. Huibers, C.M. Marcus, C.I. Duruöz, J.S. Harris, K. Campman, A.C. Gossard, Phys. Rev. Lett. 80 (1998) 4522.
[87] D.R. Stewart, D. Sprinzak, C.M. Marcus, C.I. Duruöz, J.S. Harris, Science 278 (1997) 1784.
[88] C.W.J. Beenakker, Rev. Mod. Phys. 69 (1997) 731.
[89] T. Guhr, A. Müller-Groeling, H.A. Weidenmüller, Phys. Reports 299 (1998) 189.
[90] M.R. Zirnbauer, Nucl. Phys. B 265 (1986) 375.
[91] V.E. Kravtsov, C.M. Canali, I.V. Yurkevich, in: I.V. Lerner, J.P. Keating, D.E. Khmelnitskii (Eds.), Supersymmetry and Trace Formulae: Chaos and Disorder, Proceedings of the NATO ASI, Plenum Press, New York, 1998, pp. 273–292.
[92] U. Meirav, M.A. Kastner, S.J. Wind, Phys. Rev. Lett. 65 (1990) 771.
[93] U. Sivan, F.P. Milliken, K. Milkove, S. Rishton, Y. Lee, J.M. Hong, V. Boegli, D. Kern, M. De Franza, Europhys. Lett. 25 (1994) 605.
[94] A.M. Chang, H.U. Baranger, L.N. Pfeiffer, K.W. West, T.Y. Chang, Phys. Rev. Lett. 76 (1996) 1695.
[95] J.A. Folk, S.R. Patel, S.F. Godijn, A.G. Huibers, S.M. Cronenwett, C.M. Marcus, K. Campman, G. Gossard, Phys. Rev. Lett. 76 (1996) 1699.
[96] V.N. Prigodin, K.B. Efetov, S. Iida, Phys. Rev. Lett. 71 (1993) 1230.
[97] Y. Alhassid, H. Attias, Phys. Rev. Lett. 76 (1996) 1711.
[98] H. Bruus, H.C. Lewenkopf, E.R. Mucciolo, Phys. Rev. B 53 (1996) 9968.
[99] Ya.M. Blanter, Phys. Rev. B 54 (1996) 12 807.
[100] B.L. Altshuler, Y. Gefen, A. Kamenev, L.S. Levitov, Phys. Rev. Lett. 78 (1997) 2803.
[101] K.B. Efetov, A.I. Larkin, Zh. Eksp. Teor. Fiz. 85 (1983) 764 [Sov. Phys. JETP 58 (1983) 444].
[102] Y.V. Fyodorov, A.D. Mirlin, Phys. Rev. Lett. 67 (1991) 2405.
[103] G. Casati, B.V. Chirikov (Eds.), Quantum Chaos: Between Order and Disorder, Cambridge University Press, Cambridge, 1994.
[104] F.M. Izrailev, Phys. Rep. 129 (1990) 299.
[105] A. Altland, M.R. Zirnbauer, Phys. Rev. Lett. 77 (1996) 4536.
[106] O.I. Marichev, Handbook on integral transforms of higher transcendental functions, Ellis Horwood Ltd., New York, 1983, p. 137.
[107] I.V. Kolokolov, Zh. Eksp. Teor. Fiz. 103 (1993) 2196 [Sov. Phys. JETP 76 (1993) 1099].
[108] E. D'Hoker, R. Jackiw, Phys. Rev. D 26 (1982) 3517.
[109] D.G. Shelton, A.M. Tsvelik, Phys. Rev. B 57 (1998) 14 242.
[110] M.V. Berry, J. Phys. A 10 (1977) 2083.
[111] V.N. Prigodin, Phys. Rev. Lett. 74 (1995) 1566; V.N. Prigodin et al., Phys. Rev. Lett. 75 (1995) 2392.
[112] M. Srednicki, Phys. Rev. E 54 (1996) 954.
[113] B.L. Altshuler, V.N. Prigodin, Zh. Eksp. Teor. Fiz. 95 (1989) 348 [Sov. Phys. JETP 68 (1989) 198].
[114] V.L. Berezinski, L.P. Gor'kov, Zh. Eksp. Teor. Fiz. 78 (1980) 812 [Sov. Phys. JETP 51 (1980) 409].
[115] I.V. Kolokolov, Europhys. Lett. 28 (1994) 193.
[116] A.D. Stone, P.A. Mello, K.A. Muttalib, J.-L. Pichard, in: B.L. Altshuler, P.A. Lee, R.A. Webb (Eds.), Mesoscopic Phenomena in Solids, North-Holland, Amsterdam, 1991.
[117] K. Müller, B. Mehlig, F. Milde, M. Schreiber, Phys. Rev. Lett. 78 (1997) 215.
[118] V. Uski, B. Mehlig, R.A. Römer, Ann. Phys. (Leipzig) 7 (1998) 437.
[119] B.B. Mandelbrot, J. Fluid. Mech. 62 (1974) 331.
[120] U. Frisch, G. Parisi, in: M. Ghil, R. Benzi, G. Parisi (Eds.), Turbulence and Predictability in Geophysical Fluid Dynamics and Climate Dynamics, International School of Physics Enrico Fermi, Varenna, 1983, North-Holland, Amsterdam, 1985, p. 84.
[121] R. Benzi, G. Paladin, G. Parisi, A. Vulpiani, J. Phys. A: Math. Gen. 17 (1984) 3521.
[122] U. Frisch, Turbulence: The Legacy of A.N. Kolmogorov, Cambridge University Press, Cambridge, 1995.
[123] P. Grassberger, Phys. Lett. 97 A (1983) 227.
[124] H.G.E. Hentschel, I. Procaccia, Physica D 8 (1983) 435.
[125] R. Benzi, G. Paladin, G. Parisi, A. Vulpiani, J. Phys. A: Math. Gen. 18 (1985) 2157.
[126] T.C. Halsey, M.H. Jensen, L.P. Kadanoff, I. Procaccia, B.I. Shraiman, Phys. Rev. A 33 (1986) 1141.
[127] T.C. Halsey, P. Meakin, I. Procaccia, Phys. Rev. Lett. 56 (1986) 854.
[128] J. Lee, H.E. Stanley, Phys. Rev. Lett. 61 (1988) 2945.
[129] R. Blumenfeld, A. Aharony, Phys. Rev. Lett. 62 (1989) 2977.
[130] G. Paladin, A. Vulpiani, Phys. Rep. 156 (1987) 147.
[131] M. Janssen, Phys. Rep. 295 (1998) 1.
[132] B. Mandelbrot, Physica A 163 (1990) 306.
[133] B. Mandelbrot, Proc. Roy. Soc. London A 434 (1991) 79.
[134] C. de Chamon, C. Mudry, X.-G. Wen, Phys. Rev. Lett. 77 (1996) 4194.
[135] H.E. Castillo, C. de Chamon, E. Fradkin, P.M. Goldbart, C. Mudry, Phys. Rev. B 56 (1997) 10 668.
[136] A.D. Mirlin, in: I.V. Lerner, J.P. Keating, D.E. Khmelnitskii (Eds.), Supersymmetry and Trace Formulae: Chaos and Disorder, Proceedings of the NATO ASI, Plenum Press, New York, 1998, pp. 245–260.
[137] V.N. Prigodin, B.L. Altshuler, Phys. Rev. Lett. 80 (1998) 1944.
[138] Ya.M. Blanter, A.D. Mirlin, Phys. Rev. B 57 (1998) 4566.
[139] A.B. Zamolodchikov, Al.B. Zamolodchikov, Nucl. Phys. B 477 (1996) 577.
[140] I.I. Kogan, C. Mudry, A.M. Tsvelik, Phys. Rev. Lett. 77 (1996) 707.
[141] B. Eckhardt, S. Fishman, J. Keating, O. Agam, J. Main, K. Müller, Phys. Rev. E 52 (1995) 5893.
[142] S. Hortikar, M. Srednicki, Phys. Rev. Lett. 80 (1998) 1646.
[143] E. Heller, in: M.-J. Giannoni, A. Voros, J. Zinn-Justin (Eds.), Chaos and Quantum Physics, North-Holland, Amsterdam, 1991, p. 547, and references therein.
[144] O. Agam, S. Fishman, J. Phys. A 26 (1993) 2113.
[145] V.I. Melnikov, Pis'ma Zh. Exp. Teor. Fiz. 32 (1980) 244 [JETP Lett. 32 (1980) 225].
[146] V.I. Melnikov, Fiz. Tverd. Tela 22 (1980) 2404 [Sov. Phys. Solid State 22 (1980) 1398].
[147] A.A. Abrikosov, Solid State Commun. 37 (1981) 997.
[148] B.L. Altshuler, V.N. Prigodin, Pis'ma Zh. Eksp. Teor. Fiz. 47 (1988) 36 [JETP Lett. 47 (1988) 43].
[149] V.E. Kravtsov, Habilitationsschrift, Heidelberg, 1992.
[150] A.D. Mirlin, B.A. Muzykantskii (1997), unpublished.
[151] K. Frahm, Phys. Rev. E 56 (1997) R6237.
[152] G. Casati, G. Maspero, D.L. Shepelyansky, Phys. Rev. E 56 (1997) R6233.
[153] D.V. Savin, V.V. Sokolov, Phys. Rev. E 56 (1997) R4911.
[154] C.H. Lewenkopf, H.A. Weidenmüller, Ann. Phys. (NY) 212 (1991) 53.
[155] I.V. Lerner, Phys. Lett. A 133 (1988) 253.
[156] Y.V. Fyodorov, Phys. Rev. Lett. 73 (1994) 2688.
[157] A. Szafer, B.L. Altshuler, Phys. Rev. Lett. 70 (1993) 587.
[158] B.D. Simons, B.L. Altshuler, Phys. Rev. B 48 (1993) 5422.
[159] B.D. Simons, P.A. Lee, B.L. Altshuler, Nucl. Phys. B 409 [FS] (1993) 487.
[160] A.D. Mirlin, Y.V. Fyodorov (1998), unpublished.
[161] I.E. Smolyarenko, B.L. Altshuler, Phys. Rev. B 55 (1997) 10 451.
[162] B.I. Halperin, M. Lax, Phys. Rev. 148 (1966) 722.
[163] J. Zittartz, J.S. Langer, Phys. Rev. 148 (1966) 742.
[164] J.J.M. Verbaarschot, M.R. Zirnbauer, J. Phys. A: Math. Gen. 17 (1985) 1093.
[165] E. Hofstetter, M. Schreiber, Phys. Rev. B 48 (1993) 16 979.
[166] V.E. Kravtsov, cond-mat/9603166, in: Proceedings of the Moriond Conference 'Correlated Fermions and Transport in Mesoscopic Systems', Les Arcs, 1996.
[167] F. Wegner, Phys. Rep. 67 (1980) 15.
[168] A. Altland, Y. Gefen, Phys. Rev. Lett. 71 (1993) 3339.
[169] A. Altland, Y. Gefen, Phys. Rev. B 51 (1995) 10 671.
[170] A.G. Aronov, V.E. Kravtsov, I.V. Lerner, Phys. Rev. Lett. 74 (1995) 1174.
[171] J.T. Chalker, I.V. Lerner, R.A. Smith, Phys. Rev. Lett. 77 (1996) 554.
[172] J.T. Chalker, I.V. Lerner, R.A. Smith, J. Math. Phys. 37 (1996) 5061.
[173] H. Potempa, L. Schweitzer, J. Phys.: Condens. Matter 10 (1998) L431.
[174] L. Schweitzer, H. Potempa, Physica A 266 (1999) 486.
[175] V.E. Kravtsov, V.I. Yudson, Phys. Rev. Lett. 82 (1999) 157.
[176] A.G. Aronov, V.E. Kravtsov, I.V. Lerner, Pis'ma Zh. Eksp. Teor. Fiz. 59 (1994) 39 [JETP Lett. 59 (1994) 40].
[177] I.Kh. Zharekeshev, B. Kramer, Phys. Rev. Lett. 79 (1997) 717.
[178] I.Kh. Zharekeshev, B. Kramer, Ann. Phys. (Leipzig) 7 (1998) 442.
[179] S.N. Evangelou, Phys. Rev. B 49 (1994) 16 805.
[180] D. Braun, G. Montambaux, Phys. Rev. B 52 (1995) 13 903.
[181] I.Kh. Zharekeshev, B. Kramer, Phys. Rev. B 51 (1995) 17 356.
[182] Y.V. Fyodorov, A.D. Mirlin, Phys. Rev. B 55 (1997) R16001.
[183] A.D. Mirlin, Y.V. Fyodorov, J. Phys. A: Math. Gen. 24 (1991) 2273.
[184] Y.V. Fyodorov, A.D. Mirlin, Phys. Rev. Lett. 67 (1991) 2049.
[185] J.T. Chalker, G.J. Daniell, Phys. Rev. Lett. 61 (1988) 593.
[186] J.T. Chalker, Physica A 167 (1990) 253.
[187] B. Huckestein, L. Schweitzer, Phys. Rev. Lett. 72 (1994) 713.
[188] T. Brandes, B. Huckestein, L. Schweitzer, Ann. Phys. (Leipzig) 5 (1996) 633.
[189] K. Pracz, M. Janssen, P. Freche, J. Phys.: Condens. Matter 8 (1996) 7147.
[190] F. Wegner, in: H. Fritsche, D. Adler (Eds.), Localisation and Metal Insulator Transitions, Plenum, New York, 1985, p. 337.
[191] T.H. Seligman, J.J.M. Verbaarschot, M.R. Zirnbauer, J. Phys. A: Math. Gen. 18 (1985) 2751.
[192] V.V. Flambaum, A.A. Gribakina, G.F. Gribakin, M.G. Kozlov, Phys. Rev. A 50 (1994) 267.
[193] F.M. Izrailev, Chaos, Solitons, Fractals 5 (1995) 1219.
[194] G. Casati, L. Molinari, F.M. Izrailev, Phys. Rev. Lett. 64 (1990) 1851.
[195] S. Fishman, D.R. Grempel, R.E. Prange, Phys. Rev. Lett. 49 (1982) 509.
[196] F. Borgonovi, G. Casati, B. Li, Phys. Rev. Lett. 77 (1996) 4744.
[197] G. Casati, T. Prosen, Phys. Rev. E 59 (1999) R2516.
[198] F. Borgonovi, P. Conti, D. Rebuzzi, B. Hu, B. Li, Physica D 131 (1999) 317.
[199] G. Yeung, Y. Oono, Europhys. Lett. 4 (1987) 1061.
[200] L. Levitov, Europhys. Lett. 9 (1989) 83.
[201] L. Levitov, Phys. Rev. Lett. 64 (1990) 547.
[202] P. Cizeau, J.P. Bouchaud, Phys. Rev. E 50 (1994) 1810.
[203] J.P. Bouchaud, A. Georges, Phys. Rep. 195 (1990) 127.
[204] L.P. Gor'kov, A.I. Larkin, D.E. Khmelnitskii, Pis'ma Zh. Eksp. Teor. Fiz. 30 (1979) 248 [JETP Lett. 30 (1979) 228].
[205] E. Brézin, J. Zinn-Justin, Phys. Rev. B 14 (1976) 3110.
[206] E. Brézin, J. Zinn-Justin, J.C. Le Guillou, Phys. Rev. D 14 (1976) 2615.
[207] E. Brézin, S. Hikami, J. Zinn-Justin, Nucl. Phys. B 165 (1980) 528.
[208] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, Clarendon Press, Oxford, 1989.
[209] E. Abrahams, P.W. Anderson, D.C. Licciardello, T.V. Ramakrishnan, Phys. Rev. Lett. 42 (1979) 673.
[210] E. Brézin, J. Zinn-Justin, J.C. Le Guillou, J. Phys. A: Math. Gen. 9 (1976) L119.
[211] V.E. Kravtsov, K.A. Muttalib, Phys. Rev. Lett. 79 (1997) 1913.
[212] M. Moshe, H. Neuberger, B. Shapiro, Phys. Rev. Lett. 73 (1994) 1497.
[213] R. Landauer, Philos. Mag. 21 (1970) 863.
[214] D.S. Fisher, P.A. Lee, Phys. Rev. B 23 (1981) 6851.
[215] M. Büttiker, Phys. Rev. Lett. 57 (1986) 1761.
[216] H.U. Baranger, A.D. Stone, Phys. Rev. B 40 (1989) 8169.
[217] J.T. Chalker, A. Dohmen, Phys. Rev. Lett. 75 (1995) 4496.
[218] L. Balents, M.P.A. Fisher, Phys. Rev. Lett. 76 (1996) 2782.
[219] Y.B. Kim, Phys. Rev. B 53 (1996) 16 320.
[220] H. Mathur, Phys. Rev. Lett. 78 (1997) 2429.
[221] L. Balents, M.P.A. Fisher, M.R. Zirnbauer, Nucl. Phys. B 483 (1996) 601.
[222] I.A. Gruzberg, N. Read, S. Sachdev, Phys. Rev. B 55 (1997) 10 593.
[223] I.A. Gruzberg, N. Read, S. Sachdev, Phys. Rev. B 56 (1997) 13 218.
[224] D.P. Druist, P.J. Turley, K.D. Maranowski, E.G. Gwinn, A.C. Gossard, Phys. Rev. Lett. 80 (1998) 365.
[225] C. Mahaux, H.A. Weidenmüller, Shell-Model Approach to Nuclear Reactions, North-Holland, Amsterdam, 1969.
[226] K. Jüngling, R. Oppermann, Z. Phys. B 38 (1980) 93.
[227] R. Oppermann, K. Jüngling, Phys. Lett. A 76 (1980) 449.
[228] F. Wegner, Z. Phys. B 49 (1983) 297.
[229] A.D. Mirlin, Phys. Rev. Lett. 72 (1994) 3437.
[230] O.N. Dorokhov, Pis'ma Zh. Eksp. Teor. Fiz. 36 (1982) 259 [JETP Lett. 36 (1982) 318].
[231] P.A. Mello, P. Pereyra, N. Kumar, Ann. Phys. (NY) 181 (1988) 290.
[232] O.N. Dorokhov, Zh. Eksp. Teor. Fiz. 85 (1983) 1040 [Sov. Phys. JETP 58 (1983) 606].
[233] J.-L. Pichard, in: B. Kramer (Ed.), Quantum Coherence in Mesoscopic Systems, NATO ASI Series B254, Plenum Press, New York, 1991, p. 369.
[234] V.I. Oseledec, Trans. Moscow Math. Soc. 19 (1968) 197.
[235] A. Crisanti, G. Paladin, A. Vulpiani, Products of Random Matrices in Statistical Physics, Springer, Berlin, 1993.
[236] J.-L. Pichard, G. Sarma, J. Phys. C 14 (1981) L127.
[237] M.E. Gertsenshtein, V.B. Vasil'ev, Teor. Veroyatn. Primen. 4 (1959) 424; 5 (1960) 3 (E) [Theor. Probab. Appl. 4 (1959) 391; 5 (1960) 340 (E)].
[238] Yu.L. Gazaryan, Zh. Eksp. Teor. Fiz. 56 (1969) 1856 [Sov. Phys. JETP 29 (1969) 996].
[239] G.C. Papanicolaou, SIAM J. Appl. Math. 21 (1971) 13.
[240] U. Frisch, in: A.T. Bharucha-Reid (Ed.), Probabilistic Methods in Applied Mathematics, Vol. 1, Academic Press, New York, 1968, pp. 76–198.
[241] P. Sheng, Introduction to Wave Scattering, Localization, and Mesoscopic Phenomena, Academic Press, New York, 1995.
[242] P. Sheng (Ed.), Scattering and Localization of Classical Waves in Random Media, World Scientific Series on Directions in Condensed Matter Physics, Vol. 8, World Scientific, Singapore, 1990.
[243] B. Shapiro, in: Y. Avishai (Ed.), Recent Progress in Many-Body Theories, Vol. 2, Plenum Press, New York, 1990, pp. 95–104.
[244] S. John, M.J. Stephen, Phys. Rev. B 28 (1983) 6358.
[245] B. Elattari, V. Kagalovsky, H.A. Weidenmüller, Phys. Rev. B 57 (1998) 11 258.
[246] R. Pnini, B. Shapiro, Phys. Rev. E 54 (1996) 1032R.
[247] B. Shapiro, Phys. Rev. Lett. 57 (1986) 2168.
[248] R. Dashen, J. Math. Phys. 20 (1979) 894.
[249] N. Shnerb, M. Kaveh, Phys. Rev. B 43 (1991) 1279.
[250] E. Kogan, M. Kaveh, R. Baumgartner, R. Berkovits, Phys. Rev. B 48 (1993) 9404.
[251] V.V. Zavorotnyi, V.I. Klyatskin, V.I. Tatarskii, Zh. Eksp. Teor. Fiz. 73 (1977) 481 [Sov. Phys. JETP 46 (1977) 252].
[252] K. Gochelashvili, V. Shishov, Zh. Eksp. Teor. Fiz. 74 (1978) 1974 [Sov. Phys. JETP 47 (1978) 1028].
[253] R. Dashen, Optics Lett. 9 (1984) 110.
[254] N. Garcia, A.Z. Genack, Optics Lett. 16 (1991) 1132.
[255] A.Z. Genack, N. Garcia, Europhys. Lett. 21 (1993) 753.
[256] M. Stoychev, A.Z. Genack, Phys. Rev. Lett. 79 (1997) 309.
[257] S. Hikami, Phys. Rev. B 24 (1981) 2671.
[258] K.B. Efetov, A.I. Larkin, D.E. Khmelnitskii, Zh. Eksp. Teor. Fiz. 79 (1980) 1120 [Sov. Phys. JETP 52 (1980) 568].
[259] O. Agam, B.L. Altshuler, A.V. Andreev, Phys. Rev. Lett. 75 (1995) 4389.
[260] K.M. Frahm, D.L. Shepelyansky, Phys. Rev. Lett. 78 (1997) 1440.
[261] K.M. Frahm, D.L. Shepelyansky, Phys. Rev. Lett. 79 (1997) 1833.
[262] O. Agam, S. Fishman, J. Phys. A 29 (1996) 2013.
[263] E. Cuevas, E. Louis, J.A. Vergés, Phys. Rev. Lett. 77 (1996) 1970.
[264] E. Louis, E. Cuevas, J.A. Vergés, M. Ortuño, Phys. Rev. B 56 (1997) 2120.
[265] K.V. Samokhin, Phys. Rev. B 60 (1999) 1511.
[266] K. Fuchs, Proc. Cambridge Phil. Soc. 34 (1938) 100.
[267] A.A. Abrikosov, Fundamentals of the Theory of Metals, North-Holland, Amsterdam, 1988, p. 122.
[268] S. Chandrasekhar, Radiative Transfer, Oxford University Press, Oxford, 1950.
[269] P.M. Morse, H. Feshbach, Methods of Theoretical Physics, Part II, Section 12.2, McGraw-Hill, New York, 1953.
[270] V.I. Okulov, V.V. Ustinov, Fiz. Nizk. Temp. 5 (1979) 213 [Sov. J. Low. Temp. Phys. 5 (1979) 101].
[271] M. Leadbeater, V.I. Falko, C.J. Lambert, Phys. Rev. Lett. 81 (1998) 1274.
[272] J.A. Sánchez-Gil, V. Freilikher, A.A. Maradudin, I.V. Yurkevich, Phys. Rev. B 59 (1999) 5915.
[273] B.L. Altshuler, A.G. Aronov, in: A.L. Efros, M. Pollak (Eds.), Electron–Electron Interactions in Disordered Systems, North-Holland, Amsterdam, 1985, p. 1.
[274] P.A. Lee, A.D. Stone, H. Fukuyama, Phys. Rev. B 35 (1987) 1039.
[275] S. Tarucha, D.G. Austing, T. Honda, R.J. van der Hage, L.P. Kouwenhoven, Phys. Rev. Lett. 77 (1996) 3613.
[276] D.C. Ralph, C.T. Black, M. Tinkham, Phys. Rev. Lett. 74 (1995) 3241.
[277] O. Agam, N.S. Wingreen, B.L. Altshuler, D.C. Ralph, M. Tinkham, Phys. Rev. Lett. 78 (1997) 1956.
[278] U. Sivan, Y. Imry, A.G. Aronov, Europhys. Lett. 28 (1994) 115.
[279] A.D. Mirlin, Y.V. Fyodorov, Phys. Rev. B 56 (1997) 13 393.
[280] P.G. Silvestrov, Phys. Rev. Lett. 79 (1997) 3394.
[281] Ph. Jacquod, D.L. Shepelyansky, Phys. Rev. Lett. 79 (1997) 1837.
[282] P.G. Silvestrov, Phys. Rev. E 58 (1998) 5629.
[283] R. Berkovits, Europhys. Lett. 25 (1994) 681.
[284] R. Berkovits, Y. Avishai, J. Phys.: Condens. Matter 8 (1996) 389.
[285] D. Weinmann, J.-L. Pichard, Y. Imry, J. Phys. I France 7 (1997) 1559.
[286] R. Berkovits, Y. Avishai, Phys. Rev. Lett. 80 (1998) 568.
[287] M. Pascaud, G. Montambaux, preprint cond-mat/9810211.
[288] F. Simmel, T. Heinzel, D.A. Wharam, Europhys. Lett. 38 (1997) 123.
[289] S.R. Patel, D.R. Stewart, C.M. Marcus, M. Gokcedag, Y. Alhassid, A.D. Stone, C.I. Duruoz, J.S. Harris Jr., Phys. Rev. Lett. 81 (1998) 5900.
[290] S.M. Maurer, S.R. Patel, C.M. Marcus, C.I. Duruöz, J.S. Harris Jr., Phys. Rev. Lett. 83 (1999) 1403.
[291] D.L. Shepelyansky, Phys. Rev. Lett. 73 (1994) 2707.
[292] O.N. Dorokhov, Zh. Eksp. Teor. Fiz. 98 (1990) 646 [Sov. Phys. JETP 71 (1990) 360].
[293] Ph. Jacquod, D.L. Shepelyansky, Phys. Rev. Lett. 75 (1995) 3501.
[294] Y.V. Fyodorov, A.D. Mirlin, Phys. Rev. B 52 (1995) R11580.
[295] K. Frahm, A. Müller-Groeling, Europhys. Lett. 32 (1995) 385.
[296] Y. Imry, Europhys. Lett. 30 (1995) 405.
[297] C.M. Marcus, S.R. Patel, A.G. Huibers, S.M. Cronenwett, M. Switkes, I.H. Chan, R.M. Clarke, J.A. Folk, S.F. Godijn, K. Campman, A.C. Gossard, Chaos, Solitons, and Fractals 8 (1997) 1261.
[298] C.W.J. Beenakker, Phys. Rev. B 44 (1991) 1646.
[299] D.V. Averin, Y. Nazarov, Phys. Rev. Lett. 65 (1990) 2446.
[300] J.R. Morris, D.M. Deaven, K.M. Ho, Phys. Rev. B 53 (1996) R1740.
[301] A.A. Koulakov, F.G. Pikus, B.I. Shklovskii, Phys. Rev. B 55 (1997) 9223.
[302] O. Prus, A. Auerbach, Y. Aloni, U. Sivan, R. Berkovits, Phys. Rev. B 54 (1996) R14289.
[303] M. Stopa, preprint cond-mat/9709119.
[304] R.O. Vallejos, C.H. Lewenkopf, E.R. Mucciolo, Phys. Rev. Lett. 81 (1998) 677.
[305] R. Berkovits, Phys. Rev. Lett. 81 (1998) 2128.
[306] M.R. Zirnbauer, preprint chao-dyn/9812023, in: I.V. Lerner, J.P. Keating, D.E. Khmelnitskii (Eds.), Supersymmetry and Trace Formulae: Chaos and Disorder, Plenum Press, New York, 1999, p. 153.
[307] N. Argaman, Y. Imry, U. Smilansky, Phys. Rev. B 47 (1993) 440.
[308] E.B. Bogomolny, J.P. Keating, Phys. Rev. Lett. 77 (1996) 1472.
[309] I.L. Aleiner, A.I. Larkin, Phys. Rev. B 54 (1996) 14 423.
[310] R. Smith, I.V. Lerner, B.L. Altshuler, Phys. Rev. B 58 (1998) 10 343.
[311] I.L. Aleiner, L.I. Glazman, Phys. Rev. Lett. 77 (1996) 2057.
[312] I.L. Aleiner, L.I. Glazman, Phys. Rev. B 57 (1998) 9608.
[313] A. Kaminski, I.L. Aleiner, L.I. Glazman, Phys. Rev. Lett. 81 (1998) 685.
[314] S.M. Cronenwett, S.R. Patel, C.M. Marcus, K. Campman, A.C. Gossard, Phys. Rev. Lett. 79 (1997) 2312.
[315] S.M. Cronenwett, S.M. Maurer, S.R. Patel, C.M. Marcus, C.I. Duruöz, J.S. Harris Jr., Phys. Rev. Lett. 81 (1998) 5904.
[316] N.B. Zhitenev, R.C. Ashoori, L.N. Pfeiffer, K.W. West, Phys. Rev. Lett. 79 (1997) 2308.
[317] S. Levit, D. Orgad, Phys. Rev. B 60 (1999) 5549.
[318] L. Bonci, R. Berkovits, preprint cond-mat/9901332.
[319] P.N. Walker, Y. Gefen, G. Montambaux, Phys. Rev. Lett. 82 (1999) 5329.
[320] P.N. Walker, G. Montambaux, Y. Gefen, Phys. Rev. B 60 (1999) 2541.
[321] A. Cohen, K. Richter, R. Berkovits, Phys. Rev. B 60 (1999) 2536.
[322] K.B. Efetov, V.N. Prigodin, Phys. Rev. Lett. 70 (1993) 1315; Mod. Phys. Lett. B 7 (1993) 981.
[323] A.D. Mirlin, Y.V. Fyodorov, Europhys. Lett. 25 (1994) 669.
[324] L.S. Levitov, preprint cond-mat/9908178.
[325] C. Mejia et al., Phys. Rev. Lett. 81 (1998) 5189.
[326] X. Leyronas, J. Tworzydło, C.W.J. Beenakker, Phys. Rev. Lett. 82 (1999) 4894.
[327] K. Slevin, T. Ohtsuki, Phys. Rev. Lett. 78 (1997) 4083; 82 (1999) 669.
[328] X. Wang, Q. Li, C.M. Soukoulis, Phys. Rev. B 58 (1998) R3576; C.M. Soukoulis, X. Wang, Q. Li, M.M. Sigalas, Phys. Rev. Lett. 82 (1999) 668.
[329] P. Markoš, preprint cond-mat/9901285.
CONTENTS VOLUME 326
S.V. Shabanov, Geometry of the physical phase space in quantum gauge systems, p. 1
S. Sieniutycz, Hamilton–Jacobi–Bellman framework for optimal control in multistage energy systems, p. 165
A.D. Mirlin, Statistics of energy levels and eigenfunctions in disordered systems, p. 259