Communications in Mathematical Physics - Volume 212

Commun. Math. Phys. 212, 1 – 27 (2000) Communications in Mathematical Physics © Springer-Verlag 2000 On the Magnetiz...

Author: A. Jaffe (Chief Editor)


Commun. Math. Phys. 212, 1 – 27 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

On the Magnetization of a Charged Bose Gas in the Canonical Ensemble

Horia D. Cornean

Institute of Mathematics of the Romanian Academy, P.O. Box 1-764, 70700 Bucharest, Romania. E-mail: [email protected]; [email protected]

Received: 15 July 1999 / Accepted: 29 November 1999

Abstract: Consider a charged Bose gas without self-interactions, confined in a three-dimensional cubic box of side $L \ge 1$ and subjected to a constant magnetic field $B \neq 0$. If the bulk density of particles $\rho$ and the temperature $T$ are fixed, define the canonical magnetization as the partial derivative with respect to $B$ of the reduced free energy. Our main result is that it admits a thermodynamic limit for all strictly positive $\rho$, $T$ and $B$. It is also proven that the canonical and grand canonical magnetizations (the latter at fixed average density) are equal up to surface order corrections.

1. Introduction

Much work has been done on the thermodynamic behavior of large systems composed of independent quantum particles in the presence of external magnetic fields. As is well known, the fundamental problem consists in proving the existence of the thermodynamic limit for the potentials and the equations of state defined at finite volume. In the particular case of the canonical magnetization (defined as the partial derivative with respect to the magnetic field of the reduced free energy), one has to prove that the derivative (performed at finite volume) commutes with the thermodynamic limit of the reduced free energy.

Although the quantum canonical ensemble is (from the physical point of view) the most important one, most of the previous works were carried out either using the Maxwell–Boltzmann statistics or in the framework of the quantum grand canonical ensemble, because in those settings many physically relevant quantities can be expressed using the integral kernel of the Gibbs semigroup associated to the one particle problem. Moreover, one is able to go beyond the bulk terms and investigate finite size effects.

Take for example the grand canonical pressure of a quantum gas in a constant magnetic field. The rigorous proof of its thermodynamic limit goes back at least to Angelescu and Corciovei [A, A-C]; its surface correction (in the regime in which the fugacity is less than one) was obtained by Kunz [K]. As for the Maxwell–Boltzmann magnetization,


nice results were obtained by Macris et al. [M-M-P 1,2]; they even wrote down the corner corrections. Notice that in these papers the domain $\Lambda$ was allowed to be more general, typically convex with piecewise smooth boundary.

Another result concerning the thermodynamic limit and the surface corrections for the magnetization and susceptibility of a Fermi gas at zero magnetic field was obtained by Angelescu et al. [A-B-N 2]. Because this paper motivated our work, we give some more details about it. Firstly, as in our setting, their domain was a rectangular parallelepiped and the magnetic field was oriented along the third direction. They defined the grand canonical magnetization $m_\Lambda(\beta, z)$ (susceptibility $\chi_\Lambda(\beta, z)$) as the first (second) derivative with respect to the magnetic field of the grand canonical pressure at $B = 0$, for all $z \in \mathbb{C} \setminus (-\infty, -1]$. Their main result can be roughly stated as follows:

i. $m_\Lambda(\beta, z) = 0$, $\forall z \in \mathbb{C} \setminus (-\infty, -1]$;

ii. There exists $\chi_\infty(\beta, z)$ analytic in $\mathbb{C} \setminus (-\infty, -1]$ such that for any compact $K \subset \mathbb{C} \setminus (-\infty, -1]$ one has:

$$\lim_{\Lambda \to \infty} \sup_{z \in K} |\chi_\Lambda(\beta, z) - \chi_\infty(\beta, z)| = 0.$$

More than that, they even gave the surface correction for the susceptibility and proved that this expansion is uniform on compacts. Because the relation between the fugacity and the grand canonical average density of Fermi particles can always be inverted, they were able to express the grand canonical susceptibility in terms of the canonical parameters $\rho$ and $\beta$. Let us stress that $B = 0$ and $\Lambda$ a rectangular parallelepiped were crucial ingredients in [A-B-N 2], the uniform convergence on compacts being obtained via a substantial use of the explicit formula of the integral kernel of the Gibbs semigroup associated to the Dirichlet Laplacian.

In this paper, we study the "true" canonical problem for a Bose gas at nonzero magnetic field $B_0 > 0$ (in order to avoid the Bose condensation). Using a standard procedure (see [K-U-Z, H]) of deriving the canonical partition function from the grand canonical pressure (see (2.27)), we are able to transform the uniform convergence on compacts of the grand canonical magnetization (see Lemma 1) into a pointwise convergence ($\beta$, $\rho$ fixed and $L \to \infty$) of the canonical magnetization; this result is given in Theorem 2. Moreover, we obtain that the canonical magnetization $m_L$ (see (2.29)) and the grand canonical magnetization at fixed average density (see (2.30)) are equal up to surface order corrections.

Two natural questions arise: what about Fermi statistics, and what about higher derivatives with respect to $B$ (the susceptibility, for example)? Partial answers and a few open problems are outlined at the end of the proofs.

2. Preliminaries and the Results

Let $\Lambda = \{ x \in \mathbb{R}^3 \mid -L/2 < x_j < L/2,\ j \in \{1, 2, 3\} \}$, $L > 1$, be a cubic box with its side equal to $L$. Then the "one particle" Hilbert space is $\mathcal{H}_{1,L} := L^2(\Lambda)$; denote with $\mathcal{H}_{n,L}$ the proper subspace of $\otimes_{j=1}^{n} \mathcal{H}_{1,L} \cong L^2(\Lambda^n)$ which contains all totally symmetric functions. Denote with $\mathcal{H}_{0,L} = \mathbb{C}$ the space with no particles; then the Fock space is defined as $\mathcal{F}_L := \bigoplus_{n \ge 0} \mathcal{H}_{n,L}$. One can introduce the "number of particles" operator $N_L$ as the unique self-adjoint extension of the multiplication with $n$ on each $\mathcal{H}_{n,L}$.


Assume that the particles (each having an electric charge $e$) are subjected to a constant magnetic field $\mathbf{B} = B\mathbf{e}_3$, which corresponds to a magnetic vector potential $B\mathbf{a} = \frac{B}{2}\,\mathbf{e}_3 \wedge \mathbf{x}$. If $c$ stands for the speed of light, define $\omega := (e/c)B$. Then the "one particle" Hamiltonian (denoted with $H_{1,L}(\omega)$) will be the Friedrichs extension of the symmetric and positive operator $\frac{1}{2}(-i\nabla - \omega\mathbf{a})^2$ defined on $C_0^\infty(\Lambda)$. Due to the regularity of $\Lambda$, $H_{1,L}(\omega)$ is essentially self-adjoint on

$$\mathcal{D} = \left\{ f \in C^2(\Lambda) \cap C^1(\overline{\Lambda}),\ f|_{\partial\Lambda} = 0,\ \Delta f \in L^2(\Lambda) \right\}.$$

The Hamiltonian which describes $n$ particles reads as:

$$H_{n,L}(\omega) = \underbrace{H_{1,L}(\omega) \otimes \cdots \otimes I + \cdots + I \otimes \cdots \otimes H_{1,L}(\omega)}_{n\ \text{terms}}. \tag{2.1}$$

The second quantized Hamiltonian $H_L(\omega)$ is defined as the unique self-adjoint operator on $\mathcal{F}_L$ whose restrictions to $\mathcal{H}_{n,L}$ coincide with $H_{n,L}(\omega)$. If $T > 0$ stands for the temperature and $\mu \in \mathbb{R}$ for the chemical potential, then define $\beta = \frac{1}{k_B T} > 0$ and $z = \exp(\beta\mu)$ (the fugacity), where $k_B$ is the Boltzmann constant. When working in the canonical ensemble, one considers that the bulk density of particles $\rho$ is constant, therefore the number of particles is defined as $N(L) := \rho L^3$.

As is well known (see [R-S 4]), $H_{1,L}(\omega)$ is positive, unbounded and has compact resolvent; these imply that its spectrum is purely discrete with accumulation point at infinity. Moreover, from the min-max principle it follows:

$$\inf \sigma(H_{1,L}(\omega)) \ge \inf \sigma(H_{1,\infty}(\omega)) = \frac{\omega}{2}. \tag{2.2}$$

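It is also known that the semigroup $W_L(\beta, \omega) := \exp(-\beta H_{1,L}(\omega))$ is trace class and admits an integral kernel $G_{\omega,L}(\mathbf{x}, \mathbf{x}'; \beta)$, which is continuous in both its "spatial" variables. The diamagnetic inequality at finite volume (see [B-H-L]) reads as:

$$|G_{\omega,L}(\mathbf{x}, \mathbf{x}'; \beta)| \le G_{0,L}(\mathbf{x}, \mathbf{x}'; \beta) \le \frac{1}{(2\pi\beta)^{3/2}} \exp\left( -\frac{|\mathbf{x} - \mathbf{x}'|^2}{2\beta} \right). \tag{2.3}$$

If $I_1(L^2(\Lambda))$ denotes the Banach space of trace class operators, it follows that:

$$\|W_L(\beta, \omega)\|_{I_1} = \mathrm{tr}\, W_L(\beta, \omega) \le \frac{L^3}{(2\pi\beta)^{3/2}}. \tag{2.4}$$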

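Denote with $\{E_j(\omega)\}_{j \in \mathbb{N}}$ the set of the eigenvalues of $H_{1,L}(\omega)$. If $\mu < 0$, the grand canonical partition function reads as:

$$\Xi_L(\beta, z, \omega) = \mathrm{tr}_{\mathcal{F}_L} \exp[-\beta(H_L(\omega) - \mu N_L)] = \prod_{j=0}^{\infty} \left[ 1 - z \exp(-\beta E_j(\omega)) \right]^{-1}. \tag{2.5}$$

The canonical partition function of our system is:

$$Z_L(\beta, \rho, \omega) = \mathrm{tr}_{\mathcal{H}_{N(L),L}} \exp(-\beta H_{N(L),L}(\omega)). \tag{2.6}$$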


The link between them is contained in the following equality:

$$\Xi_L(\beta, z, \omega) = \sum_{n=0}^{\infty} z^n\, \mathrm{tr}_{\mathcal{H}_{n,L}} \exp[-\beta H_{n,L}(\omega)]. \tag{2.7}$$

Throughout the entire paper, by $\log z$ we shall understand the logarithm function restricted to $\mathbb{C} \setminus (-\infty, 0]$. Let $C$ be a contour which surrounds the origin, does not intersect the cut $[1, \infty)$ but contains the spectrum of the trace class operator $zW_L$, where $z \in \mathbb{C} \setminus [\exp(\beta\omega/2), \infty)$. Let $q(\xi) = \frac{1}{\xi} \log(1 - \xi)$ be an analytic function in the interior of $C$. Define the following bounded operator:

$$q(zW_L) = \frac{1}{2\pi i} \int_C d\xi\, q(\xi)\, (\xi - zW_L)^{-1}. \tag{2.8}$$

It is easy to see that $\log(1 - zW_L) = zW_L \cdot q(zW_L)$, and using (2.5) one obtains:

$$\log \Xi_L(\beta, z, \omega) = -\mathrm{tr}\left( zW_L \cdot q(zW_L) \right). \tag{2.9}$$

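Employing the above expression, one can easily prove that the grand canonical potential (seen as a function of $z$) is analytic in $\mathbb{C} \setminus [\exp(\beta\omega/2), \infty)$. When $|z| < 1$, (2.9) becomes:

$$\log \Xi_L(\beta, z, \omega) = \sum_{n=1}^{\infty} \frac{z^n}{n}\, \mathrm{tr}\, W_L^n = \sum_{n=1}^{\infty} \frac{z^n}{n} \int_\Lambda d\mathbf{x}\, G_{\omega,L}(\mathbf{x}, \mathbf{x}; n\beta). \tag{2.10}$$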

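The grand canonical pressure and density are defined as:

$$P_L(\beta, z, \omega) := \frac{1}{\beta L^3} \log \Xi_L(\beta, z, \omega) = -\frac{1}{\beta L^3} \sum_j \log\left( 1 - z e^{-\beta E_j(\omega)} \right), \tag{2.11}$$

and

$$\rho_L(\beta, z, \omega) := \beta z\, \frac{\partial P_L}{\partial z}(\beta, z, \omega). \tag{2.12}$$

Let us remark that $\rho_L(\beta, x, \omega)$ is an increasing function if $0 < x < \exp(\beta\omega/2)$:

$$\frac{\partial \rho_L}{\partial x}(\beta, x, \omega) = \frac{1}{L^3}\, \mathrm{tr}\left[ (1 - xW_L)^{-2} W_L \right] > 0. \tag{2.13}$$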

The proof of the thermodynamic limit for these two quantities goes back at least to Angelescu and Corciovei [A, A-C]. Because this result plays an important role in our work, we shall reproduce it here. In order to do that, let us define ($\omega > 0$):

$$P_\infty(\beta, z, \omega) := \omega\, \frac{1}{(2\pi\beta)^{3/2}} \sum_{k=0}^{\infty} g_{3/2}\left( z e^{-(k+1/2)\omega\beta} \right), \tag{2.14}$$


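and

$$\rho_\infty(\beta, z, \omega) := \beta z\, \frac{\partial P_\infty}{\partial z}(\beta, z, \omega) = \beta\omega\, \frac{1}{(2\pi\beta)^{3/2}} \sum_{k=0}^{\infty} g_{1/2}\left( z e^{-(k+1/2)\omega\beta} \right), \tag{2.15}$$

where $g_\sigma(\zeta)$ are the usual Bose functions:

$$g_\sigma(\zeta) = \frac{\zeta}{\Gamma(\sigma)} \int_0^\infty dt\, \frac{t^{\sigma-1} e^{-t}}{1 - \zeta e^{-t}}, \tag{2.16}$$

analytic in $\mathbb{C} \setminus [1, \infty)$; if $|\zeta| < 1$, they are given by the following expansion:

$$g_\sigma(\zeta) = \sum_{n=1}^{\infty} \frac{\zeta^n}{n^\sigma}.$$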
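As a numerical sanity check of the two representations of the Bose functions, the sketch below compares the power series with the integral (2.16) for $\sigma = 1/2$ and $\sigma = 3/2$, the two indices used in (2.14) and (2.15). The function names, the substitution $t = u^2$ (which removes the endpoint singularity for these two indices) and the quadrature parameters are illustrative choices, not part of the paper.

```python
import math

def bose_series(sigma, zeta, terms=400):
    """Power series g_sigma(zeta) = sum_{n>=1} zeta^n / n^sigma, for |zeta| < 1."""
    return sum(zeta ** n / n ** sigma for n in range(1, terms + 1))

def bose_integral(sigma, zeta, u_max=8.0, steps=4000):
    """Integral representation (2.16), evaluated with t = u^2 and the midpoint rule.

    Assumption: sigma is 1/2 or 3/2, so that the transformed integrand is smooth at u = 0.
    """
    h = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        e = math.exp(-u * u)
        # dt = 2 u du and t^(sigma-1) = u^(2 sigma - 2)
        total += 2.0 * u ** (2.0 * sigma - 1.0) * e / (1.0 - zeta * e)
    return zeta * total * h / math.gamma(sigma)

for sigma in (0.5, 1.5):
    print(sigma, bose_series(sigma, 0.5), bose_integral(sigma, 0.5))
```

For $\sigma = 1$ the series reduces to $-\log(1 - \zeta)$, which gives an independent check of `bose_series`.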

Then the following result is true (see [A, A-C]):

Theorem 1. Let $K \subset \mathbb{C} \setminus [\exp(\beta\omega/2), \infty)$ be a compact set. Then the grand canonical pressure and density admit the thermodynamic limit, i.e.:

$$\lim_{L \to \infty} \sup_{z \in K} |P_L(\beta, z, \omega) - P_\infty(\beta, z, \omega)| = 0, \tag{2.17}$$

and

$$\lim_{L \to \infty} \sup_{z \in K} |\rho_L(\beta, z, \omega) - \rho_\infty(\beta, z, \omega)| = 0. \tag{2.18}$$

Firstly, because $P_L$ and $\rho_L$ are analytic functions, it follows via the Cauchy integral formula that all their complex derivatives admit a limit which is uniform on compacts. In particular:

$$\lim_{L \to \infty} \sup_{z \in K} \left| \frac{\partial \rho_L}{\partial z}(\beta, z, \omega) - \frac{\partial \rho_\infty}{\partial z}(\beta, z, \omega) \right| = 0. \tag{2.19}$$

It can be seen from (2.15) that $\lim_{x \nearrow e^{\beta\omega/2}} \rho_\infty(\beta, x, \omega) = \infty$, which means that the Bose condensation is absent when a nonzero magnetic field is present. A very important consequence of the theorem is that the relation between the fugacity and density can be inverted for all temperatures. Moreover, if $0 < x_\infty(\beta, \rho, \omega) < e^{\beta\omega/2}$ is the unique real and positive solution of the equation $\rho_\infty(\beta, x, \omega) = \rho$, and if $x_L(\beta, \rho, \omega)$ is the unique real and positive solution of $\rho_L(\beta, x, \omega) = \rho$, then $\lim_{L \to \infty} x_L = x_\infty$. Let us perform the Legendre transform at finite volume:

$$\tilde{f}_L(\beta, \rho, \omega) := -P_L(\beta, x_L(\beta, \rho, \omega), \omega) + \frac{\rho}{\beta} \log x_L(\beta, \rho, \omega). \tag{2.20}$$

A straightforward result is that $\tilde{f}_L$ has the following limit:

$$f_\infty(\beta, \rho, \omega) := -P_\infty(\beta, x_\infty(\beta, \rho, \omega), \omega) + \frac{\rho}{\beta} \log x_\infty(\beta, \rho, \omega). \tag{2.21}$$
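The inversion of the fugacity-density relation can be illustrated numerically. The sketch below solves $\rho_\infty(\beta, x, \omega) = \rho$ by bisection on $(0, e^{\beta\omega/2})$, using the series for $g_{1/2}$ and a truncated sum over $k$ in (2.15). The function names, truncation parameters and sample values are hypothetical, and the series evaluation is only practical for moderate densities, where the solution stays well below $e^{\beta\omega/2}$.

```python
import math

def g_half(zeta, tol=1e-16, max_terms=200000):
    """Series for the Bose function g_{1/2}(zeta), 0 <= zeta < 1 (truncated at tol)."""
    total, power, n = 0.0, zeta, 1
    while power / math.sqrt(n) > tol and n <= max_terms:
        total += power / math.sqrt(n)
        power *= zeta
        n += 1
    return total

def rho_inf(beta, x, omega, levels=200):
    """Bulk density (2.15), with the sum over k truncated to `levels` terms."""
    pref = beta * omega / (2.0 * math.pi * beta) ** 1.5
    return pref * sum(g_half(x * math.exp(-(k + 0.5) * omega * beta))
                      for k in range(levels))

def solve_fugacity(beta, rho, omega, iters=80):
    """Bisection for x_inf in (0, exp(beta*omega/2)); relies on rho_inf increasing in x."""
    lo, hi = 0.0, math.exp(beta * omega / 2.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rho_inf(beta, mid, omega) < rho:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_inf = solve_fugacity(beta=1.0, rho=0.05, omega=1.0)
print(x_inf, rho_inf(1.0, x_inf, 1.0))
```

Bisection is used here only because monotonicity of the density in the fugacity makes it robust; any bracketing root finder would serve equally well.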


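Let us denote with $\frac{\partial W_L}{\partial\omega}(\beta, \omega_0)$ the following integral, which makes sense in the norm topology of $B(L^2(\Lambda))$ (see [A-B-N 1]):

$$-\int_0^\beta d\tau\, W_L(\beta - \tau, \omega_0)\, [\mathbf{a} \cdot (\mathbf{p} - \omega_0 \mathbf{a})]\, W_L(\tau, \omega_0). \tag{2.22}$$

A particular case of the problem treated in [A-B-N 1] is that $\frac{\partial W_L}{\partial\omega}(\beta, \omega_0)$ is even trace class and moreover, for $\delta\omega$ sufficiently small one has:

$$\left\| W_L(\beta, \omega_0 + \delta\omega) - W_L(\beta, \omega_0) - \delta\omega\, \frac{\partial W_L}{\partial\omega} \right\|_{I_1} = O((\delta\omega)^2). \tag{2.23}$$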

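Using for $P_L(\beta, z, \omega_0)$ the following expression:

$$-\frac{1}{\beta L^3}\, \mathrm{tr}[\log(1 - zW_L(\beta, \omega_0))],$$

the estimate (2.23) justifies the definition of the grand canonical magnetization:

$$\Gamma_L(\beta, z, \omega_0) := -\frac{e}{c}\, \frac{\partial P_L}{\partial\omega}(\beta, z, \omega_0) = -\frac{ez}{c\beta L^3}\, \mathrm{tr}\left[ (1 - zW_L(\beta, \omega_0))^{-1}\, \frac{\partial W_L}{\partial\omega} \right]. \tag{2.24}$$

From its definition, one can easily see that $\Gamma_L$ has the same domain of analyticity in $z$. Now let us define the natural candidate for its thermodynamic limit:

$$\Gamma_\infty(\beta, z, \omega_0) := -\frac{e}{c}\, \frac{\partial P_\infty}{\partial\omega}(\beta, z, \omega_0). \tag{2.25}$$

Our main technical result is presented in the following lemma:

Lemma 1. Let $K \subset \mathbb{C} \setminus [\exp(\beta\omega/2), \infty)$ be a compact set. Then the grand canonical magnetization admits the thermodynamic limit, i.e.:

$$\lim_{L \to \infty} \sup_{z \in K} |\Gamma_L(\beta, z, \omega) - \Gamma_\infty(\beta, z, \omega)| = 0. \tag{2.26}$$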

Let us go back to the canonical ensemble. From (2.7), (2.11) and (2.6), one can write down a useful representation of the canonical partition function:

$$Z_L(\beta, \rho, \omega) = \frac{1}{2\pi i} \int_{C_1} d\xi\, \frac{1}{\xi} \left[ \frac{1}{\xi} \exp\left( \frac{\beta}{\rho} P_L(\beta, \xi, \omega) \right) \right]^{N(L)}, \tag{2.27}$$

where $C_1$ is a contour which surrounds the origin and avoids the cut. The reduced free energy reads as:

$$f_L(\beta, \rho, \omega) := -\frac{1}{\beta L^3} \log Z_L(\beta, \rho, \omega). \tag{2.28}$$
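The contour representation (2.27) simply extracts the coefficient of $z^{N}$ from the grand canonical partition function, since $[\exp(\beta P_L/\rho)]^{N(L)} = \Xi_L$. The sketch below illustrates this on a toy spectrum with finitely many levels: it compares a trapezoid-rule evaluation of the contour integral on a circle with a direct computation of the coefficient. The spectrum, the radius and all function names are hypothetical and chosen only for illustration.

```python
import cmath
import math

def canonical_by_contour(weights, n_particles, radius, points=512):
    """Z_N as the contour integral of Xi(xi) / xi^(N+1) over |xi| = radius (trapezoid rule)."""
    total = 0.0 + 0.0j
    for k in range(points):
        phi = 2.0 * math.pi * k / points
        xi = radius * cmath.exp(1j * phi)
        grand = 1.0 + 0.0j
        for w in weights:
            grand /= (1.0 - xi * w)          # Xi(xi) = prod_j (1 - xi w_j)^(-1), cf. (2.5)
        total += grand * cmath.exp(-1j * n_particles * phi)
    return (total / points).real / radius ** n_particles

def canonical_by_counting(weights, n_particles):
    """Coefficient of z^N in prod_j (1 - z w_j)^(-1), built one level at a time."""
    coeff = [1.0] + [0.0] * n_particles
    for w in weights:
        for n in range(1, n_particles + 1):
            coeff[n] += w * coeff[n - 1]
    return coeff[n_particles]

beta = 1.0
energies = [0.5, 1.5, 2.5, 3.5]               # hypothetical one-particle levels
weights = [math.exp(-beta * e) for e in energies]
print(canonical_by_contour(weights, 5, radius=1.0),
      canonical_by_counting(weights, 5))
```

The radius must stay below the inverse of the largest Boltzmann weight so that the circle avoids the singularities of $\Xi$, which mirrors the requirement that $C_1$ avoids the cut.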

The canonical magnetization is defined as follows:

$$m_L(\beta, \rho, \omega) := \frac{e}{c}\, \frac{\partial f_L}{\partial\omega}(\beta, \rho, \omega). \tag{2.29}$$


We expect that $m_L$ should be close to the following quantity:

$$\tilde{m}_L(\beta, \rho, \omega) := \frac{e}{c}\, \frac{\partial \tilde{f}_L}{\partial\omega}(\beta, \rho, \omega) = \Gamma_L(\beta, x_L(\beta, \rho, \omega), \omega), \tag{2.30}$$

which converges to $\Gamma_\infty(\beta, x_\infty(\beta, \rho, \omega), \omega)$. We are now able to give our main result:

Theorem 2. Fix $0 < \delta < 1/2$. For all strictly positive temperatures, bulk densities and magnetic fields, the canonical magnetization $m_L(\beta, \rho, \omega)$ admits the thermodynamic limit. Moreover, there exist two positive constants $C_\delta(\beta, \rho, \omega)$ and $L_\delta(\beta, \rho, \omega)$ such that for all $L \ge L_\delta$ one has:

$$|m_L - \tilde{m}_L| \le C_\delta\, L^{-3/2+\delta}. \tag{2.31}$$

Remark. It is clear that $\tilde{m}_L$ is a much more convenient quantity. If (at least for dilute gases) one would be able to write down an expansion for $\tilde{m}_L$ which takes into account the surface corrections:

$$\tilde{m}_L = m_\infty + \frac{1}{L}\, m_S + o(1/L), \tag{2.32}$$

then the estimate (2.31) would imply that the same expansion is true for $m_L$, too.

3. The Proof of Theorem 2

At this point, we shall consider that Lemma 1 is true and give its proof in the next section. In order to simplify the notations, we shall drop the dependence on $\beta$, $\rho$ and $\omega$, but we shall reintroduce it when needed. The main idea of the proof consists in isolating the principal part of the integral from (2.27). Although this procedure is far from being new (in the physical literature it is known as the Darwin-Fowler method; see [K-U-Z, H] and references therein), we decided to give a rather detailed proof in order to have a clearer image of the remainder from (2.31). Firstly, let us choose the contour $C_1$ as follows:

$$C_1 := \{ x_L e^{i\phi},\ \phi \in [-\pi, \pi] \}. \tag{3.1}$$

Using (2.20), the formula (2.27) can be rewritten as:

$$Z_L = \exp(-\beta \tilde{f}_L L^3)\, \frac{1}{2\pi} \int_{-\pi}^{\pi} d\phi\, \exp\left\{ \frac{N(L)\beta}{\rho} \left( P_L(x_L e^{i\phi}) - P_L(x_L) \right) \right\} e^{-iN(L)\phi} = \exp(-\beta \tilde{f}_L L^3)\, N(L)^{-1/2} A_L, \tag{3.2}$$

where $A_L(\beta, \rho, \omega)$ is given by:

$$A_L = \frac{\sqrt{N(L)}}{2\pi} \int_{-\pi}^{\pi} d\phi\, e^{\frac{N(L)\beta}{\rho} \left( \Re P_L(x_L e^{i\phi}) - P_L(x_L) \right)}\, e^{iN(L) \left( \frac{\beta}{\rho} \Im P_L(x_L e^{i\phi}) - \phi \right)}. \tag{3.3}$$

Denote with $\tilde{p}_L(\phi)$ the function $\Re P_L(x_L e^{i\phi})$ and with $\tilde{p}_\infty(\phi)$ the function given by $\Re P_\infty(x_\infty e^{i\phi})$.

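We shall prove that $\frac{d^2 \tilde{p}_\infty}{d\phi^2}(0) < 0$ and that the following limit is true:

$$\lim_{L \to \infty} A_L := A_\infty = \sqrt{ -\frac{\rho}{2\pi\beta\, \frac{d^2 \tilde{p}_\infty}{d\phi^2}(0)} }. \tag{3.4}$$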

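Firstly, let us remark that $P_L(\overline{\xi}) = \overline{P_L(\xi)}$, where the overline means complex conjugation. Let $0 < \epsilon < \frac{e^{\beta\omega/2} - x_\infty}{2}$ be a fixed positive number, and let $C_\epsilon$ be a circle centered in $x_\infty$ with radius $\epsilon$. If $\xi$ belongs to the interior of $C_\epsilon$ then the Cauchy integral formula gives ($L \le \infty$):

$$P_L(\xi) = \frac{1}{2\pi i} \int_{C_\epsilon} d\zeta\, \frac{P_L(\zeta)}{\zeta - \xi}. \tag{3.5}$$

Take $0 < \delta < 1/2$ and define $\phi_L := N(L)^{-1/2+\delta/3}$. Then for $L$ big enough and $|\phi| \le \phi_L$, one has that $x_L e^{i\phi}$ belongs to the interior of $C_{\epsilon/2}$ and (3.5) implies:

$$\tilde{p}_L(\phi) = \frac{1}{4\pi i} \int_{C_\epsilon} d\zeta\, P_L(\zeta) \left[ \frac{1}{\zeta - x_L e^{i\phi}} + \frac{1}{\zeta - x_L e^{-i\phi}} \right]. \tag{3.6}$$

Theorem 1 now implies that:

$$\lim_{L \to \infty} \frac{d^2 \tilde{p}_L}{d\phi^2}(0) = \frac{d^2 \tilde{p}_\infty}{d\phi^2}(0) \quad \text{and} \quad \forall |\phi| \le \phi_L,\ \left| \frac{d^3 \tilde{p}_L}{d\phi^3}(\phi) \right| \le \text{const}. \tag{3.7}$$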

Denote with $s_L(\phi)$ the function $\Im P_L(x_L e^{i\phi})$. In a similar way, one can obtain that for all $|\phi| \le \phi_L$:

$$\left| \frac{d^3 s_L}{d\phi^3}(\phi) \right| \le \text{const}. \tag{3.8}$$

Finally, let us prove (3.4). This will be done in two steps:

1. The first one consists in showing that the contribution to $A_L$ coming from the region $|\phi| \ge \phi_L$ is exponentially small in $L$; we shall use that $\tilde{p}_L(\phi)$ is an even function, is decreasing on the interval $[0, \pi]$ and has a non-degenerate maximum in 0;

2. The second one consists in a more careful study of the case in which $|\phi| \le \phi_L$; the most important thing here is showing that the oscillations of the imaginary part are small.

Let us prove 1. Firstly, because $\tilde{p}_L(\phi) = \tilde{p}_L(-\phi)$, it is sufficient to look only at $\phi \in [\phi_L, \pi]$. From (2.11) one has:

$$\tilde{p}_L(\phi) = -\frac{1}{2\beta L^3} \sum_j \log\left[ \left( 1 - x_L \cos\phi\, e^{-\beta E_j} \right)^2 + x_L^2 \sin^2\phi\, e^{-2\beta E_j} \right], \tag{3.9}$$

and:

$$\frac{d\tilde{p}_L}{d\phi}(\phi) = -\frac{x_L \sin\phi}{\beta L^3} \sum_j \frac{e^{-\beta E_j}}{\left| 1 - x_L e^{i\phi} e^{-\beta E_j} \right|^2}. \tag{3.10}$$


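This means that $\tilde{p}_L$ is decreasing on $[0, \pi]$. Its second derivative in 0 reads as:

$$\frac{d^2 \tilde{p}_L}{d\phi^2}(0) = -\frac{x_L}{\beta L^3} \sum_j \frac{e^{-\beta E_j}}{\left( 1 - x_L e^{-\beta E_j} \right)^2}. \tag{3.11}$$

For $L$ big enough, one has that $x_L \ge x_\infty/2$. The above equality implies:

$$\frac{d^2 \tilde{p}_L}{d\phi^2}(0) \le -\frac{x_\infty}{2\beta}\, \frac{1}{L^3}\, \mathrm{tr}\, W_L. \tag{3.12}$$

To obtain a uniform estimate in $L$, we employ (see [K]):

$$\frac{1}{L^3}\, \mathrm{tr}\, W_L = \frac{1}{(2\pi\beta)^{3/2}}\, \frac{\omega\beta/2}{\sinh(\omega\beta/2)}\, (1 + O(1/L)). \tag{3.13}$$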
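The leading term in (3.13) is consistent with the bulk pressure (2.14): by (2.10), $\mathrm{tr}\, W_L / L^3$ is the coefficient of $z$ in $\beta P_L$, and the coefficient of $z$ in $\beta P_\infty$ is $\beta\omega (2\pi\beta)^{-3/2} \sum_{k \ge 0} e^{-(k+1/2)\omega\beta}$. The sketch below checks numerically that this sum over $k$ equals the closed form appearing in (3.13). The function names and the truncation are illustrative assumptions.

```python
import math

def bulk_trace_closed(beta, omega):
    """Leading term of (3.13): (2 pi beta)^(-3/2) * (omega beta / 2) / sinh(omega beta / 2)."""
    y = omega * beta / 2.0
    return y / math.sinh(y) / (2.0 * math.pi * beta) ** 1.5

def bulk_trace_level_sum(beta, omega, levels=2000):
    """Coefficient of z in beta * P_inf from (2.14), with the k-sum truncated."""
    level_sum = sum(math.exp(-(k + 0.5) * omega * beta) for k in range(levels))
    return beta * omega * level_sum / (2.0 * math.pi * beta) ** 1.5

for beta, omega in [(0.7, 1.3), (2.0, 0.4), (0.1, 5.0)]:
    print(beta, omega, bulk_trace_closed(beta, omega), bulk_trace_level_sum(beta, omega))
```

The agreement follows from the geometric series $\sum_{k \ge 0} e^{-(k+1/2)y} = 1/(2\sinh(y/2))$ with $y = \omega\beta$.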

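Then (3.7), (3.12) and (3.13) imply that for $L$ sufficiently large:

$$\frac{d^2 \tilde{p}_L}{d\phi^2}(0) \le -\frac{x_\infty}{4\beta}\, \frac{1}{(2\pi\beta)^{3/2}}\, \frac{\omega\beta/2}{\sinh(\omega\beta/2)} \quad \text{and} \quad \frac{d^2 \tilde{p}_\infty}{d\phi^2}(0) < 0. \tag{3.14}$$

From (3.14) and (3.7) it follows that for all $\phi \in [\phi_L, \pi]$ one has:

$$\tilde{p}_L(\phi) - P_L(x_L) \le \tilde{p}_L(\phi_L) - \tilde{p}_L(0) \le \frac{1}{4}\, \frac{d^2 \tilde{p}_\infty}{d\phi^2}(0)\, \phi_L^2, \tag{3.15}$$

or in other words,

$$\exp\left\{ \frac{N(L)\beta}{\rho} \left[ \tilde{p}_L(\phi) - P_L(x_L) \right] \right\} \le \exp\left\{ \frac{\beta}{4\rho}\, \frac{d^2 \tilde{p}_\infty}{d\phi^2}(0)\, N(L)^{2\delta/3} \right\}, \tag{3.16}$$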

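therefore the contribution to $A_L$ coming from $|\phi| \ge \phi_L$ is exponentially small. Now let us study the region $|\phi| \le \phi_L$. We are mainly interested in the behavior of the imaginary part of the pressure (see (3.8)). Similarly as in (3.9), one has:

$$s_L(\phi) = -\frac{1}{\beta L^3} \sum_j \arg\left( 1 - x_L e^{i\phi} e^{-\beta E_j} \right) = \frac{1}{\beta L^3} \sum_j \arctan \frac{x_L \sin\phi\, e^{-\beta E_j}}{1 - x_L \cos\phi\, e^{-\beta E_j}}. \tag{3.17}$$

By direct computation, one obtains:

$$s_L(0) = 0, \quad \frac{ds_L}{d\phi}(0) = \frac{\rho_L(x_L)}{\beta} = \frac{\rho}{\beta} \quad \text{and} \quad \frac{d^2 s_L}{d\phi^2}(0) = 0. \tag{3.18}$$

Together with (3.8), one has:

$$\frac{\beta}{\rho}\, s_L(\phi) - \phi = O(\phi_L^3). \tag{3.19}$$

We are interested now in the following integral:

$$\frac{\sqrt{N(L)}}{2\pi} \int_{-\phi_L}^{\phi_L} d\phi\, e^{\frac{N(L)\beta}{\rho} \left( \tilde{p}_L(\phi) - P_L(x_L) \right)}\, e^{iN(L) \left( \frac{\beta}{\rho} s_L(\phi) - \phi \right)}. \tag{3.20}$$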


Changing the variable to $t = \sqrt{N(L)}\, \phi$ one obtains:

$$\frac{1}{2\pi} \int_{-N(L)^{\delta/3}}^{N(L)^{\delta/3}} dt\, e^{\frac{\beta}{2\rho} \frac{d^2 \tilde{p}_L}{d\phi^2}(0)\, t^2}\, e^{O[N(L)^{-1/2+\delta}]}. \tag{3.21}$$

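Using (3.14), (3.7) and the Lebesgue dominated convergence theorem, it is easy to see that the above integral converges to (3.4). From (2.28) and (3.2), the reduced free energy reads as:

$$f_L = \tilde{f}_L + \frac{\log N(L)}{2\beta L^3} - \frac{\log A_L}{\beta L^3}. \tag{3.22}$$

From (2.29) and (2.30) it follows:

$$m_L = \tilde{m}_L - \frac{e}{\beta c L^3}\, \frac{1}{A_L}\, \frac{\partial A_L}{\partial\omega}, \tag{3.23}$$

or:

$$m_L - \tilde{m}_L = \frac{N(L)^{1/2}}{2\pi A_L} \int_{-\pi}^{\pi} d\phi\, e^{\frac{N(L)\beta}{\rho} \left[ P_L(x_L e^{i\phi}) - P_L(x_L) \right]}\, e^{-iN(L)\phi} \left\{ -\frac{e}{c}\, \frac{d}{d\omega} \left[ P_L(x_L e^{i\phi}) - P_L(x_L) \right] \right\}, \tag{3.24}$$

where:

$$-\frac{e}{c}\, \frac{d}{d\omega} \left[ P_L(x_L e^{i\phi}) - P_L(x_L) \right] = \Gamma_L(x_L e^{i\phi}) - \Gamma_L(x_L) - \frac{e}{c} \left[ \frac{\partial P_L}{\partial z}(x_L e^{i\phi})\, e^{i\phi}\, \frac{\partial x_L}{\partial\omega} - \frac{\partial P_L}{\partial z}(x_L)\, \frac{\partial x_L}{\partial\omega} \right]. \tag{3.25}$$

From (2.12) it follows that $-\frac{e}{c} \frac{\partial \rho_L}{\partial\omega}(\beta, z, \omega) = \beta z\, \frac{\partial \Gamma_L}{\partial z}(\beta, z, \omega)$. This implies:

$$-\frac{e}{c}\, \frac{\partial x_L}{\partial\omega} = -\beta x_L\, \frac{\partial \Gamma_L}{\partial z}(x_L) \left[ \frac{\partial \rho_L}{\partial z}(x_L) \right]^{-1}.$$

Lemma 1 and Theorem 1 imply (via the Cauchy integral formula) that the above quantity remains bounded when $L$ goes to infinity. Performing an analysis of the integral from (3.24) similar to the one made for $A_L$, it follows that the contribution coming from the region $|\phi| \ge \phi_L$ is exponentially small in $L$. When we analyze the region $|\phi| \le \phi_L$, the factor $N(L)^{1/2}$ disappears when we change the variable, therefore $m_L - \tilde{m}_L$ will behave as the quantity from (3.25), and this one behaves like $\phi_L \sim L^{-3/2+\delta}$. The proof of (2.31) is now completed.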


4. The Proof of Lemma 1

The idea of the proof is borrowed from [A]. Namely, the uniform convergence on the compacts which belong to $D := \mathbb{C} \setminus [e^{\beta\omega_0/2}, \infty)$ will be obtained by applying the Vitali Theorem to the sequence of analytic functions $\Gamma_L(\beta, z, \omega_0)$ when $L$ goes to infinity. Therefore, we have to make the following three steps:

I. Prove the uniform boundedness in $L$ of the functions $\Gamma_L(\beta, z, \omega_0)$ on any compact $K \subset D$;

II. Identify the limit ($\Gamma_\infty(\beta, z, \omega_0)$ in our case) and prove that it has the same domain of analyticity $D$;

III. Prove the existence of a set $D_0 \subset D$ having at least one point of accumulation, such that $\Gamma_L(\beta, z, \omega_0)$ converges pointwise to $\Gamma_\infty(\beta, z, \omega_0)$ on $D_0$.

4.1. The proof of I.

Here is the "nontrivial part" of our paper. In order to see where the main difficulty is, let us take a look at (2.24). One could try a bound on $\Gamma_L(\beta, z, \omega)$ using the following inequality (see also (2.22)):

$$\sup_{z \in K} \left| \mathrm{tr}\left[ (1 - zW_L)^{-1}\, \frac{\partial W_L}{\partial\omega} \right] \right| \le \sup_{z \in K} \left\| (1 - zW_L)^{-1} \right\|_{B(L^2)} \left\| \frac{\partial W_L}{\partial\omega} \right\|_{I_1} \le C(\beta, K, \omega) \left\| \frac{\partial W_L}{\partial\omega} \right\|_{I_1}.$$

Unfortunately, the linear growth of the vector potential $\mathbf{a}$ leads to a bad trace norm estimate for $\frac{\partial W_L}{\partial\omega}$, which behaves like $L^4$ and not like $L^3$ as needed. In the case of the pressure [A-C], this kind of problem does not appear (see (2.9), (2.8) and (2.4)), because for the Gibbs semigroup the trace and the trace norm are equal and grow like $L^3$.

Because the proof is rather lengthy, we shall outline our strategy in a few words. Firstly, having (2.8) in mind, let us define ($\omega_0, \beta, \tau > 0$):

$$g_{\omega_0}(\xi, z; \beta, \tau) := [\xi - zW_L(\beta, \omega_0)]^{-1}\, zW_L(\tau, \omega_0), \tag{4.1}$$

where $\xi$ belongs to the contour $C$. Then the pressure will admit the following representation (see (2.9)):

$$P_L(\beta, z, \omega_0) = -\frac{1}{2\pi i} \int_C d\xi\, q(\xi)\, \frac{1}{\beta L^3}\, \mathrm{tr}[g_{\omega_0}(\xi, z; \beta, \beta)], \tag{4.2}$$

where the norm convergent integral from (2.8) commutes with the trace from (2.9). Let us argue that we can choose the same contour of integration if $\omega$ varies in a small interval $\Omega := [\omega_0, \omega_1]$, $0 < \omega_0 < \omega_1$. In other words, we search for a $C$ included in $\mathbb{C} \setminus [1, \infty)$ such that for all $L > 1$, $z \in K$, $\omega \in \Omega$ and $\xi \in C$ one has the following bound:

$$\left\| [\xi - zW_L(\beta, \omega)]^{-1} \right\| \le M < \infty. \tag{4.3}$$

Assume that every $z \in K$ verifies the following condition:

$$\mathrm{dist}\{ z, [e^{\beta\omega_0/2}, \infty) \} \ge \delta > 0, \qquad |z| \le d < \infty. \tag{4.4}$$


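We know that the spectrum of $W_L(\beta, \omega)$ is included in $[0, e^{-\beta\omega_0/2}]$ for all $L > 1$ and $\omega \in \Omega$. We claim that $C$ can be chosen as the union $C_1 \cup C_2$, where $C_2$ is given by ($\eta > 0$):

$$\{ (1 + t, \pm\eta) \mid -\eta \le t \le 2d \} \cup \{ (1 - \eta, t) \mid -\eta \le t \le \eta \},$$

and $C_1$ is chosen such that if $\xi \in C_1$, then $|\xi| \ge 2d + 1$. It is not difficult to prove that by choosing $\eta$ sufficiently small one has:

$$\sup_{z \in K}\ \sup_{\xi \in C}\ \sup_{0 \le r \le e^{-\beta\omega_0/2}} |(\xi - zr)^{-1}| \le M < \infty, \tag{4.5}$$

and via the Spectral Theorem, (4.3) takes place. If we manage to prove the existence of a numerical constant $c(\beta, K, \omega_0)$ such that for all $\delta\omega \in (0, \omega_1 - \omega_0)$ and $z \in K$ one has:

$$\sup_z\ \sup_{\delta\omega}\ \frac{1}{\delta\omega}\, |P_L(\beta, z, \omega_0 + \delta\omega) - P_L(\beta, z, \omega_0)| \le c(\beta, K, \omega_0), \tag{4.6}$$

then the magnetization will be bounded by the same constant. This estimate is straightforward if a stronger one takes place:

$$\sup_z\ \sup_\xi\ \sup_{\delta\omega}\ \frac{1}{L^3 \delta\omega}\, \left| \mathrm{tr}[g_{\omega_0+\delta\omega}(\xi, z; \beta, \beta)] - \mathrm{tr}[g_{\omega_0}(\xi, z; \beta, \beta)] \right| \le C(\beta, K, \omega_0). \tag{4.7}$$

Our main task will consist in constructing a trace class operator $A_{\omega_0+\delta\omega}(\xi, z; \beta)$ having the following two properties: $\mathrm{tr}\, A_{\omega_0+\delta\omega}(\xi, z; \beta) = \mathrm{tr}\, g_{\omega_0}(\xi, z; \beta, \beta)$ (i.e. its trace does not depend on $\delta\omega$) and moreover,

$$\sup_z\ \sup_\xi\ \sup_{\delta\omega}\ \frac{1}{L^3 \delta\omega}\, \left\| g_{\omega_0+\delta\omega}(\xi, z; \beta, \beta) - A_{\omega_0+\delta\omega}(\xi, z; \beta) \right\|_{I_1} \le C'(\beta, K, \omega_0), \tag{4.8}$$

which would clearly end the problem. We will see that (with $\omega = \omega_0 + \delta\omega$) $A_\omega(\xi, z; \beta)$ can be chosen as a product $\tilde{g}_\omega(\xi, z; \beta, \beta/2)\, S_\omega(\beta/2)$, where the first term is bounded, the second one is trace class and moreover:

$$\sup_z\ \sup_\xi\ \left\| g_\omega(\xi, z; \beta, \beta/2) - \tilde{g}_\omega(\xi, z; \beta, \beta/2) \right\|_{B(L^2)} \le C_1(\beta, K, \omega_0)\, \delta\omega, \tag{4.9}$$

and

$$\left\| W_L(\beta/2, \omega) - S_\omega(\beta/2) \right\|_{I_1} \le C_2(\beta, \omega_0)\, L^3\, \delta\omega. \tag{4.10}$$

In particular, (4.9) and (4.10) imply:

$$\| S_\omega(\beta/2) \|_{I_1} \le C(\beta, \omega_0)\, L^3 \quad \text{and} \quad \sup_z\ \sup_\xi\ \| \tilde{g}_\omega(\xi, z; \beta, \beta/2) \|_{B(L^2)} \le C_3(\beta, K, \omega_0). \tag{4.11}$$

Employing these estimates together with the following identity (see (4.1)):

$$g_\omega(\xi, z; \beta, \beta) = g_\omega(\xi, z; \beta, \beta/2)\, W_L(\beta/2, \omega),$$

the proof of (4.8) follows easily. The rest of this subsection is dedicated to the rigorous proofs of these estimates and will be structured in a sequence of technical propositions. We start with a well known result, given without any other comments: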


Proposition 1. The Dirichlet Laplacian defined in $\Lambda$ admits a trace class semigroup which has an integral kernel given by the following formula:

$$G_{0,L}(\mathbf{x}, \mathbf{x}'; \beta) = \prod_{j=1}^{3} g_{0,L}(x_j, x_j'; \beta), \tag{4.12}$$

where the "one dimensional" kernels read as:

$$g_{0,L}(x, x'; \beta) = \frac{1}{(2\pi\beta)^{1/2}} \sum_{m \in \mathbb{Z}} \left[ \exp\left( -\frac{(x - x' + 2mL)^2}{2\beta} \right) - \exp\left( -\frac{(x + x' - 2mL - L)^2}{2\beta} \right) \right] := g_{0,\infty}(x, x'; \beta) + \zeta_{0,L}(x, x'; \beta). \tag{4.13}$$
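The image sum (4.13) can be compared numerically with the eigenfunction expansion of the same one-dimensional Dirichlet semigroup on $(-L/2, L/2)$. The sketch below assumes that the generator is $-\frac{1}{2} \frac{d^2}{dx^2}$ (consistent with the Gaussian $e^{-(x-x')^2/2\beta}$), so that the eigenvalues are $\frac{1}{2}(n\pi/L)^2$ with eigenfunctions $\sqrt{2/L}\, \sin(n\pi(x + L/2)/L)$. Function names, truncation orders and sample points are illustrative.

```python
import math

def kernel_images(x, xp, beta, L, m_max=50):
    """One-dimensional Dirichlet kernel from the image sum (4.13), truncated at |m| <= m_max."""
    total = 0.0
    for m in range(-m_max, m_max + 1):
        total += math.exp(-(x - xp + 2 * m * L) ** 2 / (2 * beta))
        total -= math.exp(-(x + xp - 2 * m * L - L) ** 2 / (2 * beta))
    return total / math.sqrt(2 * math.pi * beta)

def kernel_spectral(x, xp, beta, L, n_max=400):
    """Same kernel from the sine eigenfunctions of -(1/2) d^2/dx^2 on (-L/2, L/2)."""
    total = 0.0
    for n in range(1, n_max + 1):
        k = n * math.pi / L
        total += (math.exp(-beta * k * k / 2)
                  * math.sin(k * (x + L / 2)) * math.sin(k * (xp + L / 2)))
    return 2.0 * total / L

L, beta = 1.5, 0.3
print(kernel_images(0.2, -0.4, beta, L), kernel_spectral(0.2, -0.4, beta, L))
print(kernel_images(L / 2, -0.4, beta, L))  # Dirichlet condition at the wall
```

The second printed value illustrates that the two sums in (4.13) cancel term by term on the boundary $x = L/2$.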

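Using the previous proposition, one can write:

$$G_{0,L}(\mathbf{x}, \mathbf{x}'; \beta) = G_{0,\infty}(\mathbf{x}, \mathbf{x}'; \beta) + Z_{0,L}(\mathbf{x}, \mathbf{x}'; \beta), \tag{4.14}$$

where:

$$G_{0,\infty}(\mathbf{x}, \mathbf{x}'; \beta) = \frac{1}{(2\pi\beta)^{3/2}} \exp\left( -\frac{|\mathbf{x} - \mathbf{x}'|^2}{2\beta} \right). \tag{4.15}$$

The purpose of the next proposition is to give a few properties of smoothness and localization of the remainder $Z_{0,L}(\mathbf{x}, \mathbf{x}'; \beta)$:

Proposition 2. For all $\beta > 0$ and $L > 1$, there exist two positive numerical constants $c_1$ and $c_2$ such that:

i. $\left| \frac{\partial Z_{0,L}}{\partial x_j} \right| (\mathbf{x}, \mathbf{x}'; \beta) \le c_1\, \frac{1+\beta}{\beta^{1/2}}\, G_{0,\infty}(\mathbf{x}, \mathbf{x}'; c_2\beta)$;

ii. $\max\left\{ \left| \frac{\partial Z_{0,L}}{\partial\beta} \right| (\mathbf{x}, \mathbf{x}'; \beta),\ \left| \frac{\partial^2 Z_{0,L}}{\partial x_j\, \partial x_k'} \right| (\mathbf{x}, \mathbf{x}'; \beta) \right\} \le c_1\, \frac{1+\beta}{\beta}\, G_{0,\infty}(\mathbf{x}, \mathbf{x}'; c_2\beta)$;

iii. $|Z_{0,L}|(\mathbf{x}, \mathbf{x}'; \beta) \le c_1 (1 + \beta)\, G_{0,\infty}(\mathbf{x}, \mathbf{x}'; c_2\beta)$.

Proof. For $x, x' \in (-L/2, L/2)$ define:

$$\zeta_1(x, x'; \beta) = \frac{1}{\sqrt{2\pi\beta}} \sum_{m \in \mathbb{Z} \setminus \{0\}} \exp\left( -\frac{(x - x' + 2mL)^2}{2\beta} \right), \qquad \zeta_2(x, x'; \beta) = \frac{1}{\sqrt{2\pi\beta}} \sum_{m \in \mathbb{Z}} \exp\left( -\frac{[x + x' - (2m+1)L]^2}{2\beta} \right). \tag{4.16}$$

It is clear that if one obtains uniform estimates in $L$ and $\beta$ for these two quantities, the same would be true for $Z_{0,L}$, too. A very useful estimate is the following:

$$\forall t \ge 0, \qquad t e^{-t} = 2(t/2) e^{-t/2} e^{-t/2} \le 2 e^{-t/2}. \tag{4.17}$$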


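Then we have the inequalities:

$$\left| \frac{\partial \zeta_1}{\partial x} \right| (x, x'; \beta) \le \frac{\text{const.}}{\beta} \sum_{m \neq 0} \exp\left( -\frac{(x - x' + 2mL)^2}{4\beta} \right), \qquad \left| \frac{\partial^2 \zeta_1}{\partial x\, \partial x'} \right| (x, x'; \beta) \le \frac{\text{const.}}{\beta^{3/2}} \sum_{m \neq 0} \exp\left( -\frac{(x - x' + 2mL)^2}{4\beta} \right), \tag{4.18}$$

and similarly for $\zeta_2$. At this point we have to control the summation over $m$. Let us prove the following inequality:

$$\sum_{m \neq 0} \exp\left( -\frac{(x - x' + 2mL)^2}{4\beta} \right) \le c_1 (1 + \beta) \exp\left( -\frac{(x - x')^2}{c_2 \beta} \right). \tag{4.19}$$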
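The argument that follows yields (4.19) with the explicit constants $c_1 = 2$ and $c_2 = 4$. As an illustration, the sketch below tests that bound on random samples with $L \ge 1$, $\beta > 0$ and $|x - x'| < L$. The sampling ranges, the truncation of the sum and the function names are hypothetical choices.

```python
import math
import random

def image_sum(d, beta, L, m_max=200):
    """Left-hand side of (4.19) with d = x - x', truncated at |m| <= m_max."""
    return sum(math.exp(-(d + 2 * m * L) ** 2 / (4 * beta))
               for m in range(-m_max, m_max + 1) if m != 0)

def gaussian_bound(d, beta):
    """Right-hand side of (4.19) with the constants c1 = 2 and c2 = 4."""
    return 2.0 * (1.0 + beta) * math.exp(-d * d / (4 * beta))

random.seed(0)
worst_ratio = 0.0
for _ in range(2000):
    L = random.uniform(1.0, 5.0)
    beta = random.uniform(0.05, 20.0)
    d = random.uniform(-L, L)
    worst_ratio = max(worst_ratio, image_sum(d, beta, L) / gaussian_bound(d, beta))
print(worst_ratio)
```

A ratio that never exceeds one on the sampled points is consistent with the inequality; it is of course not a proof.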

Because $|x - x'| < L$ one has ($|m|, L \ge 1$):

$$(x - x' + 2mL)^2 = (x - x')^2 + 4mL(x - x') + 4m^2 L^2 \ge (x - x')^2 + 4(|m| - 1). \tag{4.20}$$

Therefore:

$$\sum_{m \neq 0} \exp\left( -\frac{(x - x' + 2mL)^2}{4\beta} \right) \le \exp\left( -\frac{(x - x')^2}{4\beta} \right) 2 \left[ 1 + \sum_{m \ge 1} \exp\left( -\frac{m}{\beta} \right) \right] = 2 \exp\left( -\frac{(x - x')^2}{4\beta} \right) \left[ 1 + \frac{\exp\left( -\frac{1}{2\beta} \right)}{2 \sinh\left( \frac{1}{2\beta} \right)} \right] \le 2(1 + \beta) \exp\left( -\frac{(x - x')^2}{4\beta} \right). \tag{4.21}$$

In order to control $\zeta_2$, one has to study the following quantity:

$$A := \sum_{m \in \mathbb{Z}} \exp\left( -\frac{[x + x' - (2m+1)L]^2}{4\beta} \right). \tag{4.22}$$

Denote with $\xi = x + L/2$ and $\xi' = x' + L/2$; then $0 < \xi, \xi' < L$, $x - x' = \xi - \xi'$ and $(\xi + \xi')^2 \ge (\xi - \xi')^2$. It follows:

$$[x + x' - (2m+1)L]^2 = [\xi + \xi' - 2(m+1)L]^2 = (\xi + \xi')^2 - 4(m+1)L(\xi + \xi') + 4(m+1)^2 L^2. \tag{4.23}$$

If $m \le -1$, then:

$$[x + x' - (2m+1)L]^2 \ge (x - x')^2 + 4(|m| - 1).$$

If $m = 0$, then:

$$[x + x' - (2m+1)L]^2 = [(L/2 - x) + (L/2 - x')]^2 \ge (x - x')^2.$$


If $m \ge 1$, then:

$$[x + x' - (2m+1)L]^2 \ge (\xi + \xi')^2 + 4L^2 (m+1)(m-1) \ge (x - x')^2 + 4(m - 1),$$

and we can repeat the summation procedure used in (4.19). Putting all these things together, the proof is completed. □

The next proposition is a variant of the perturbation theory for self-adjoint Gibbs semigroups (see [H-P]). Instead of starting with a perturbation of its generator, we start with an approximation of the semigroup. Although simple, this proposition contains the main technical core of our paper.

Proposition 3. Let $\mathcal{H} := L^2(\Lambda)$ and let $H$ be a self-adjoint and positive operator having the domain $\mathcal{D}$. Fix $\beta_0 > 0$. Assume that there exists an application $0 < \beta \le \beta_0 \to S(\beta) \in B(\mathcal{H})$ with the following properties:

A. $\sup_{0 < \beta \le \beta_0} \|S(\beta)\| \le c_1 < \infty$;

B. It is strongly differentiable, $\mathrm{Ran}\, S(\beta) \subset \mathcal{D}$ and $\text{s-}\lim_{\beta \searrow 0} S(\beta) = 1$;

C. There exists a norm continuous application $0 < \beta \le \beta_0 \to R(\beta) \in B(\mathcal{H})$ such that $\|R(\beta)\| \le c_2/\beta^\alpha$ where $0 \le \alpha < 1$ and:

$$\frac{\partial S}{\partial\beta} f + H S(\beta) f = R(\beta) f. \tag{4.24}$$

Then the following two statements are true:

i. The sequence of bounded operators ($n > [1/\beta]$):

$$T_n(\beta) := \int_{1/n}^{\beta - 1/n} d\tau\, \exp[-(\beta - \tau)H]\, R(\tau)$$

converges in norm; let $T(\beta)$ be its limit;

ii. The following equality takes place in $B(\mathcal{H})$:

$$\exp(-\beta H) = S(\beta) - T(\beta). \tag{4.25}$$

Proof. i. The norm convergence is assured by the integrability condition imposed on the norm of $R(\beta)$. Moreover, when $\beta$ is near zero:

$$\sup_n \|T_n(\beta)\| \le c(\alpha)\, \beta^{1-\alpha}, \tag{4.26}$$

therefore the same thing is true for $T(\beta)$.

ii. Let $0 < \beta_1 < \beta < \beta_0$. If $n > [1/\beta_1]$ and $\phi \in \mathcal{H}$, define the vector:

$$\psi_n(\beta) := \exp(-\beta H)\phi - S(\beta)\phi + T_n(\beta)\phi. \tag{4.27}$$

From (4.26) and condition A it follows:

$$\lim_{n \to \infty} \psi_n(\beta) := \psi(\beta) = \exp(-\beta H)\phi - S(\beta)\phi + T(\beta)\phi \quad \text{and} \quad \sup_{n, \beta} \|\psi_n(\beta)\| \le \text{const}. \tag{4.28}$$

Define $f_n(\beta) = \|\psi_n(\beta)\|^2$ and $f(\beta) = \|\psi(\beta)\|^2$. From the strong convergence to one of $S(\beta)$ when $\beta$ goes to zero and from the norm convergence to zero of $T(\beta)$, it follows that $\lim_{\beta \searrow 0} f(\beta) = 0$. If we manage to prove that $f(\beta)$ is decreasing, then it would be identically zero and this would end the proof.

Notice that $T_n(\beta)$ is norm differentiable and:

$$\frac{\partial T_n}{\partial\beta} \phi + H T_n(\beta) \phi = \exp\left( -\frac{1}{n} H \right) R(\beta - 1/n) \phi. \tag{4.29}$$

The positivity of $H$ implies:

$$\frac{\partial f_n}{\partial\beta} \le 2\, \Re\left[ \left\langle \psi_n(\beta),\ \exp\left( -\frac{1}{n} H \right) R(\beta - 1/n)\phi - R(\beta)\phi \right\rangle \right]. \tag{4.30}$$

Using $f_n(\beta) = f_n(\beta_1) + \int_{\beta_1}^{\beta} d\tau\, \frac{\partial f_n}{\partial\tau}$, one has:

$$0 \le f_n(\beta) \le f_n(\beta_1) + \text{const} \int_{\beta_1}^{\beta} d\tau\, \left\| \exp\left( -\frac{1}{n} H \right) R(\tau - 1/n)\phi - R(\tau)\phi \right\|. \tag{4.31}$$

Because the above integrand is bounded on the domain of integration and converges pointwise to zero, the dominated convergence theorem implies that the whole integral converges to zero. Taking the limit, it follows that $f(\beta)$ is decreasing and therefore is identically zero. □

Define the "magnetic phase":

$$\varphi(\mathbf{x}, \mathbf{x}') = \mathbf{x} \cdot \mathbf{a}(\mathbf{x}') = \frac{1}{2}\, \mathbf{e}_3 \cdot (\mathbf{x}' \wedge \mathbf{x}). \tag{4.32}$$
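Since $\mathbf{a}$ is linear and the phase (4.32) is linear in $\mathbf{x}$, its gradient with respect to $\mathbf{x}$ is $\mathbf{a}(\mathbf{x}') = \mathbf{a}(\mathbf{x}) - \mathbf{a}(\mathbf{x} - \mathbf{x}')$. The sketch below checks this numerically with central differences, taking $\mathbf{a}(\mathbf{x}) = \frac{1}{2}\, \mathbf{e}_3 \wedge \mathbf{x}$ as defined in Section 2. The helper names and sample points are illustrative.

```python
def cross(u, v):
    """Vector product u ^ v in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

E3 = (0.0, 0.0, 1.0)

def a(x):
    """Vector potential a(x) = (1/2) e_3 ^ x (symmetric gauge)."""
    return tuple(0.5 * c for c in cross(E3, x))

def phase(x, xp):
    """Magnetic phase (4.32): x . a(x')."""
    return dot(x, a(xp))

def grad_phase(x, xp, h=1e-6):
    """Central-difference gradient of the phase with respect to x."""
    grad = []
    for j in range(3):
        plus = list(x); plus[j] += h
        minus = list(x); minus[j] -= h
        grad.append((phase(plus, xp) - phase(minus, xp)) / (2 * h))
    return tuple(grad)

x, xp = (0.3, -1.2, 0.7), (1.1, 0.4, -0.5)
shifted = tuple(xi - xpi for xi, xpi in zip(x, xp))
lhs = tuple(ai - bi for ai, bi in zip(a(x), a(shifted)))   # a(x) - a(x - x')
print(lhs, grad_phase(x, xp))
```

The same points also confirm the second expression in (4.32), since `phase(x, xp)` agrees with `0.5 * dot(E3, cross(xp, x))`.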

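We shall use this quantity as a local gauge transformation (see [C-N] for another use of this idea); namely, it will alter the magnetic vector potential by making it depend on $\mathbf{x} - \mathbf{x}'$ only: $\mathbf{a}(\mathbf{x}) - \mathbf{a}(\mathbf{x} - \mathbf{x}') = \nabla_{\mathbf{x}} \varphi(\mathbf{x}, \mathbf{x}')$. To see how this transformation acts (at a formal level) on $H_{1,L}(\omega)$, let us notice the following equation:

$$[-i\nabla_{\mathbf{x}} - \omega \mathbf{a}(\mathbf{x})]\, e^{i\omega\varphi(\mathbf{x}, \mathbf{x}')} = e^{i\omega\varphi(\mathbf{x}, \mathbf{x}')} \left[ -i\nabla_{\mathbf{x}} - \omega \mathbf{a}(\mathbf{x} - \mathbf{x}') \right]. \tag{4.33}$$

Proposition 4. The bounded operator $S(\beta) \in B(L^2(\Lambda))$ given by the integral kernel $e^{i\omega\varphi(\mathbf{x}, \mathbf{x}')} G_{0,L}(\mathbf{x}, \mathbf{x}'; \beta)$ verifies the hypotheses of Proposition 3, having the following properties:

i. $\text{s-}\lim_{\beta \searrow 0} S(\beta) = 1$;

ii. The application $(0, \infty) \ni \beta \mapsto S(\beta)$ is strongly differentiable and uniformly bounded;

iii. $\mathrm{Ran}\, S(\beta) \subset \mathrm{Dom}(H_{1,L}(\omega))$ and $\frac{\partial S}{\partial\beta}(\beta) f + H_{1,L}(\omega) S(\beta) f = R(\beta) f$, where $R(\beta)$ has an integral kernel $R(\mathbf{x}, \mathbf{x}'; \beta)$ given by:

$$e^{i\omega\varphi(\mathbf{x}, \mathbf{x}')} \left[ \omega^2 \mathbf{a}^2(\mathbf{x} - \mathbf{x}')\, G_{0,L}(\mathbf{x}, \mathbf{x}'; \beta) + 2i\omega\, \mathbf{a}(\mathbf{x} - \mathbf{x}') \cdot \nabla_{\mathbf{x}} Z_{0,L}(\mathbf{x}, \mathbf{x}'; \beta) \right].$$

Proof. i. Throughout the whole section, we shall use a few well known results, which are given without proof. Let us begin with a useful boundedness criterion for integral operators (see [S]): Let $A$ be an integral operator, given by a continuous integral kernel $A(\mathbf{x}, \mathbf{x}') \in C(\Lambda \times \Lambda)$. If the next inequality holds:

$$\max\left\{ \sup_{\mathbf{x}' \in \Lambda} \int_\Lambda d\mathbf{x}\, |A(\mathbf{x}, \mathbf{x}')|,\ \sup_{\mathbf{x} \in \Lambda} \int_\Lambda d\mathbf{x}'\, |A(\mathbf{x}, \mathbf{x}')| \right\} \le C < \infty, \tag{4.34}$$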


then the operator norm of A in B(L2 (3)) is bounded by C. If c, β and τ < β are three strictly positive numbers, then we have the following three identities: 3 Z 3 |y − x0 |2 |x − y|2 |x − x0 |2 (β − τ )τ 2 2 − = (π c) , dy exp − exp − c(β − τ ) cτ β cβ R3 (4.35) Z R3

dy

|x − y|2 = 1, exp − 3 cβ (π cβ) 2 1

1

Z dy |G0,∞ (x, y; β)|

2

R3

2

= (2πβ)− 4 . 3

A very useful inequality will be the next one (t > 0, n ≥ 1): 2 2 n t t n 2 ≤ const(n)β exp − . t exp − β 2β

(4.36)

(4.37)

(4.38)
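The Gaussian convolution identity (4.35) and the inequality (4.38) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it uses the fact that the three dimensional integral in (4.35) factorizes into three one dimensional integrals of the same form and checks one such factor by quadrature, and it takes $\mathrm{const}(n) = (n/e)^{n/2}$ in (4.38), a value obtained here by maximizing $s^n e^{-s^2/2}$ and not quoted from the paper. All function names are hypothetical.

```python
import numpy as np

def lhs_435_1d(d, c, beta, tau, half_width=20.0, n_points=20001):
    """One Cartesian factor of the left-hand side of (4.35), with x - x' = d and x' = 0."""
    y = np.linspace(-half_width, half_width, n_points)
    dy = y[1] - y[0]
    integrand = np.exp(-(d - y) ** 2 / (c * (beta - tau))) * np.exp(-y ** 2 / (c * tau))
    # A plain Riemann sum is very accurate for smooth, rapidly decaying integrands.
    return np.sum(integrand) * dy

def rhs_435_1d(d, c, beta, tau):
    """One Cartesian factor of the right-hand side of (4.35)."""
    return np.sqrt(np.pi * c * (beta - tau) * tau / beta) * np.exp(-d ** 2 / (c * beta))

def ratio_438(t, beta, n):
    """Left-hand side of (4.38) divided by beta^(n/2) * exp(-t^2 / (2 beta))."""
    # The two exponentials are combined analytically to avoid underflow.
    return t ** n / beta ** (n / 2) * np.exp(-t ** 2 / (2 * beta))

def best_constant_438(n):
    """Assumed explicit constant: the maximum of s^n exp(-s^2/2), attained at s = sqrt(n)."""
    return (n / np.e) ** (n / 2)
```

Since each Cartesian factor obeys the one dimensional identity, the product of the three factors reproduces the prefactor $(\pi c)^{3/2} [(\beta - \tau)\tau/\beta]^{3/2}$ of (4.35).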

Let us get back to the proof of i. Firstly, because

\[ |e^{i\omega\varphi(x,x')} G_{0,L}(x, x'; \beta)| \le G_{0,\infty}(x, x'; \beta), \]

it follows that $S(\beta)$ obeys the condition (4.34) with $C \le 1$, which means that it is uniformly bounded in $\beta > 0$. Let us show now that the operator given by the integral kernel

\[ (e^{i\omega\varphi(x,x')} - 1)\, G_{0,L}(x, x'; \beta) \]

converges in norm to zero. Let us remark first that $|\varphi(x, x')| \le L\, |x - x'|$ and moreover:

\[ |e^{i\omega\varphi(x,x')} - 1| \le \omega |\varphi(x, x')| \le \omega L\, |x - x'|. \tag{4.39} \]

Then (using (4.38) with $n = 1$):

\[ |(e^{i\omega\varphi(x,x')} - 1)\, G_{0,L}(x, x'; \beta)| \le \mathrm{const}\, L\, \beta^{1/2}\, G_{0,\infty}(x, x'; 2\beta), \tag{4.40} \]

therefore its operator norm behaves near zero like $\beta^{1/2}$ at least. The proof of i is now straightforward.

ii. We will prove that the application is in fact differentiable in norm. For $\beta > 0$ and $\delta\beta$ sufficiently small, one has:

\[ S(\beta + \delta\beta) f - S(\beta) f = \delta\beta \int_\Lambda dx'\, e^{i\omega\varphi(\cdot,x')}\, \frac{\partial G_{0,L}}{\partial \beta}(\cdot, x'; \beta)\, f(x') + \frac{(\delta\beta)^2}{2} \int_\Lambda dx'\, e^{i\omega\varphi(\cdot,x')}\, \frac{\partial^2 G_{0,L}}{\partial \beta^2}(\cdot, x'; \tilde\beta)\, f(x'), \tag{4.41} \]

where $\tilde\beta$ is situated between $\beta$ and $\beta + \delta\beta$. It is not difficult now to see that the "operator derivative" is an integral operator whose kernel is the derivative with respect to $\beta$ of the initial one.


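iii. Denote with $D_0$ the common domain of essential self-adjointness for $H_{1,L}(\omega)$, $\omega \ge 0$:

\[ D_0 = \{ \psi \in C^2(\Lambda) \cap C^1(\overline{\Lambda}) \mid \psi|_{\partial\Lambda} = 0,\ \Delta\psi \in L^2(\Lambda) \}. \tag{4.42} \]

The action of $H_{1,L}(\omega)$ on a function from $D_0$ is as follows:

\[ [H_{1,L}(\omega)\psi](x) = -(\Delta\psi)(x) + 2 i \omega\, a(x) \cdot (\nabla\psi)(x) + \omega^2 a^2(x)\, \psi(x). \tag{4.43} \]

Now take $f \in C_0^\infty(\Lambda)$. After integration by parts, using (4.33) and the fact that $G_{0,L}(x, x'; \beta)$ solves the heat equation in the interior of $\Lambda$, one obtains:

\[ \langle H_{1,L}(\omega)\psi, S(\beta) f \rangle = \int_{\Lambda^2} dx'\, dx\, \overline{\psi(x)}\, f(x')\, e^{i\omega\varphi(x,x')} \left[ -\Delta_x + 2 i \omega\, a(x - x') \cdot \nabla_x + \omega^2 a^2(x - x') \right] G_{0,L}(x, x'; \beta) = -\langle \psi, S'(\beta) f \rangle + \langle \psi, R(\beta) f \rangle. \tag{4.44} \]

The result follows easily after a density argument and with the remark that:

\[ a(x - x') \cdot \nabla_x G_{0,\infty}(x, x'; \beta) = 0. \tag{4.45} \]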

Finally, let us notice that the norm of $R(\beta)$ is independent of $L$ and is integrable in zero. To do that, one has to employ the estimates from Proposition 2, (4.38) and the criterion from (4.34). $\square$

In order to perform a similar perturbative treatment of the semigroup near a nonzero magnetic field, we need the estimate given by the next proposition:

Proposition 5. Let $n$ be a unit vector in $\mathbb{R}^3$. Then there exist three positive numerical constants $s$, $c_4$ and $c_5$ such that for all $\omega \in \Omega$, $x, x' \in \Lambda$ and $\beta > 0$, one has the following uniform estimate in $L$:

\[ |n \cdot (-i\nabla - \omega a(x))\, G_{\omega,L}(x, x'; \beta)| \le c_4\, \frac{(1 + \beta)^s}{\beta^2} \exp\left( -\frac{|x - x'|^2}{c_5 \beta} \right). \tag{4.46} \]

Proof. Proposition 3 allows us to write down the following integral equation:

\[ W_L(\beta, \omega) = S(\beta) - \int_0^\beta d\tau\, W_L(\beta - \tau, \omega)\, R(\tau). \tag{4.47} \]

Because $S(\beta)$ and $W_L$ are self-adjoint, one can rewrite (4.47) as:

\[ W_L(\beta, \omega) = S(\beta) - \int_0^\beta d\tau\, R^*(\tau)\, W_L(\beta - \tau, \omega). \tag{4.48} \]

In terms of integral kernels, (4.48) reads as:

\[ G_{\omega,L}(x, x'; \beta) = S(x, x'; \beta) - \int_0^\beta d\tau \int_\Lambda dy\, R^*(x, y; \tau)\, G_{\omega,L}(y, x'; \beta - \tau), \tag{4.49} \]

where the equality is between continuous functions in $C(\Lambda \times \Lambda)$ and the integral in $\tau$ has to be understood as "$\int_\epsilon^{\beta - \epsilon}$" in the limit $\epsilon \searrow 0$.


Because the kernel of $R^*$ reads as $R^*(x, y; \tau) = \overline{R(y, x; \tau)}$, by direct computation one can obtain the estimate (see Proposition 2):

\[ |n \cdot (-i\nabla - \omega a(x))\, R^*(x, x'; \tau)| \le c_4'\, \frac{(1 + \tau)^{s'}}{\tau^2} \exp\left( -\frac{|x - x'|^2}{c_5' \tau} \right), \tag{4.50} \]

where one has to apply (4.33) and then use the estimates from Proposition 2; finally, introducing (4.50), (2.3), (4.35) and (4.38) in (4.49) and because the singularity in $\tau$ is integrable, the result for $G_{\omega,L}$ is straightforward. $\square$

Take $\omega = \omega_0 + \delta\omega \in \Omega$. The analogue of Proposition 4 at nonzero magnetic field is:

Proposition 6. The bounded operator denoted with $S_\omega(\beta)$ and given by the kernel $e^{i\delta\omega\varphi(x,x')} G_{\omega_0,L}(x, x'; \beta)$ has the following properties:

i. $(0, \infty) \ni \beta \mapsto S_\omega(\beta)$ is strongly differentiable and $s\text{-}\lim_{\beta \searrow 0} S_\omega(\beta) = 1$;
ii. $\mathrm{Ran}\, S_\omega(\beta) \subset \mathrm{Dom}(H_{1,L}(\omega))$ and $\frac{\partial}{\partial\beta} S_\omega(\beta) f + H_{1,L}(\omega) S_\omega(\beta) f = R_\omega(\beta) f$, where $R_\omega(\beta)$ is given by:

\[ R_\omega(x, x'; \beta) = e^{i(\delta\omega)\varphi(x,x')} \left[ (\delta\omega)^2 a^2(x - x')\, G_{\omega_0,L}(x, x'; \beta) + 2(\delta\omega)\, a(x - x') \cdot (i\nabla_x + \omega_0 a(x))\, G_{\omega_0,L}(x, x'; \beta) \right]. \tag{4.51} \]

Proof. i. Rewriting the integral kernel of $S_\omega(\beta)$ as:

\[ S_\omega(x, x'; \beta) = G_{\omega_0,L}(x, x'; \beta) + \left[ e^{i(\delta\omega)\varphi(x,x')} - 1 \right] G_{\omega_0,L}(x, x'; \beta), \tag{4.52} \]

and using the diamagnetic inequality, one can reproduce the argument from (4.40) in order to prove that the second term converges in norm to zero. Clearly, the first one converges strongly to one. If $\{\psi_j\}$ and $\{E_j\}$ denote the sets of eigenvectors and eigenvalues of $H_{1,L}(\omega)$, then:

\[ G_{\omega,L}(x, x'; \beta) = \sum_j e^{-\beta E_j}\, \psi_j(x)\, \overline{\psi_j(x')}, \tag{4.53} \]

where the series is absolutely and uniformly convergent on $\Lambda \times \Lambda$. This can be seen from the fact that the semigroup is trace class and that the eigenfunctions belong to $D_0$ and admit the estimate:

\[ |\psi_j|(x) \le \mathrm{const}(L)\, (E_j + 1), \tag{4.54} \]

obtained from the fact that the resolvent $[H_{1,L}(\omega) + 1]^{-1}$ is bounded between $L^2(\Lambda)$ and $L^\infty(\Lambda)$. It follows that uniformly in $\Lambda$:

\[ \left| G_{\omega,L}(\cdot, \cdot; \beta + \delta\beta) - G_{\omega,L}(\cdot, \cdot; \beta) - \delta\beta\, \frac{\partial G_{\omega,L}}{\partial \beta}(\cdot, \cdot; \beta) \right| \le \mathrm{const}(L)\, (\delta\beta)^2, \tag{4.55} \]

which is sufficient for the strong differentiability (see also (4.41)).

ii. One has to make the same steps as in the proof of the third point of Proposition 4. As for the norm of $R_\omega(\beta)$, let us see that it is independent of $L$ and is integrable in zero: from


(4.51), (4.46), (4.38) and (2.3), one can obtain an estimate on the kernel of $R_\omega(\beta)$ of the following form:

\[ |R_\omega(x, x'; \beta)| \le c_9\, \delta\omega\, (1 + \beta)^s\, G_{0,\infty}(x, x'; c_{10}\beta), \tag{4.56} \]

which implies that its $B(L^2(\Lambda))$ norm is bounded by a constant multiplied by $\delta\omega$ (see (4.34)). $\square$

We now give without proof a result which provides sufficient conditions for an operator in $B(L^2(\Lambda))$ to be trace class:

Proposition 7. Let $\{T_n\}$ be a sequence of trace class operators, converging to $T$ in $B(L^2(\Lambda))$. If $\sup_n \|T_n\|_{I_1} \le c < \infty$, then $T \in I_1$ and $\|T\|_{I_1} \le c$.

Remark. Assume that an operator $T$ is defined by a $B(L^2(\Lambda))$-norm Riemann integral on the interval $[a, b]$, with a continuous trace class integrand $S(t)$. If $\int_a^b dt\, \|S(t)\|_{I_1} \le c < \infty$, then $T$ is trace class and

\[ \|T\|_{I_1} \le c, \qquad \mathrm{tr}\, T = \int_a^b dt\, \mathrm{tr}\, S(t). \]

Denote with $\tilde R(\omega, \beta)$ the bounded operator given by the kernel:

\[ \tilde R(x, x'; \beta) = 2\, a(x - x') \cdot (-i\nabla - \omega_0 a(x))\, G_{\omega_0,L}(x, x'; \beta). \tag{4.57} \]

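Among other things, the next proposition proves (4.10):

Proposition 8. Take $\beta > 0$ and $\omega = \omega_0 + \delta\omega \in \Omega$.

i. The operator $S_\omega(\beta)$ is trace class and moreover, there exists a positive numerical constant $c$ such that:

\[ \|W_L(\beta, \omega) - S_\omega(\beta)\|_{B(L^2(\Lambda))} \le c\, \delta\omega \tag{4.58} \]

and

\[ \|W_L(\beta, \omega) - S_\omega(\beta)\|_{I_1} \le c\, \delta\omega\, L^3. \tag{4.59} \]

ii. For all $x, x' \in \Lambda$ and uniformly in $L$, one has:

\[ G_{\omega,L}(x, x'; \beta) = e^{i\delta\omega\varphi(x,x')}\, G_{\omega_0,L}(x, x'; \beta) + \delta\omega \int_0^\beta d\tau \int_\Lambda dy\, e^{i\delta\omega\varphi(x,y)}\, G_{\omega_0,L}(x, y; \beta - \tau)\, e^{i\delta\omega\varphi(y,x')}\, \tilde R(y, x'; \tau) + O((\delta\omega)^2). \tag{4.60} \]

Proof. i. We know that as bounded operators:

\[ W_L(\beta, \omega) = S_\omega(\beta) - \int_0^\beta d\tau\, W_L(\beta - \tau, \omega)\, R_\omega(\tau). \tag{4.61} \]

We have already seen that the $B(L^2(\Lambda))$ norm of $R_\omega(\tau)$ is bounded by a constant multiplied by $\delta\omega$ (see (4.56)). Its Hilbert-Schmidt norm is bounded by (see (4.37)):

\[ \|R_\omega(\tau)\|_{I_2} \le \mathrm{const}\, \delta\omega\, \frac{(1 + \tau)^s}{\tau^{3/4}}\, L^{3/2}. \tag{4.62} \]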


For the semigroup, we know that $\|W_L\|_{B(L^2(\Lambda))} \le 1$ and from (2.3) and (4.37) it follows that:

\[ \|W_L(\beta - \tau, \omega)\|_{I_2} \le \mathrm{const}\, \frac{1}{(\beta - \tau)^{3/4}}\, L^{3/2}. \tag{4.63} \]

With the help of the well known inequality $\|A B\|_{I_1} \le \|A\|_{I_2} \|B\|_{I_2}$, it follows that in both situations ($B(L^2)$ and $I_1(L^2)$) the singularities in $\tau$ are integrable (see the previous remark), and the desired bounds follow easily.

ii. The formula (4.60) is obtained from (4.61) by isolating the term which contains $\delta\omega$. $\square$

The next proposition gives sufficient conditions on a trace class integral operator for its trace to be equal to the integral of the kernel's diagonal (see [R-S 1]):

Proposition 9. Let $T \in I_1(L^2(\Lambda))$ be given by the integral kernel $T(x, x') \in C(\Lambda \times \Lambda)$. Then:

\[ \mathrm{tr}\, T = \int_\Lambda dx\, T(x, x). \tag{4.64} \]
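As a finite dimensional illustration of the inequality $\|AB\|_{I_1} \le \|A\|_{I_2} \|B\|_{I_2}$ used in the proof of Proposition 8, the following sketch compares the nuclear (trace) norm of a product of matrices with the product of their Frobenius (Hilbert-Schmidt) norms. Matrices stand in for the operators of the text; this is an assumption made for illustration only, and the function name is hypothetical.

```python
import numpy as np

def trace_norm_bound(a, b):
    """Return (||AB||_1, ||A||_2 * ||B||_2) for square matrices a and b.

    ||.||_1 is the nuclear norm (sum of singular values) and
    ||.||_2 is the Frobenius norm, the matrix analogue of the
    Hilbert-Schmidt norm.
    """
    trace_norm_ab = np.linalg.norm(a @ b, ord="nuc")
    hs_a = np.linalg.norm(a, ord="fro")
    hs_b = np.linalg.norm(b, ord="fro")
    return trace_norm_ab, hs_a * hs_b
```

For matrices the trace is trivially the sum of the diagonal entries; Proposition 9 is the statement that, for continuous kernels of trace class operators, the analogous formula holds with the sum replaced by an integral.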

The rest of this subsection is dedicated to the proof of (4.9). Fix $\beta, \tau > 0$. Let $0 < \omega_0 < \omega_1$ and let $\Omega = [\omega_0, \omega_1]$. Clearly, $g_\omega(\xi, z; \beta, \tau)$ is trace class and admits a continuous integral kernel given by the following series, which is absolutely and uniformly convergent on $\Lambda \times \Lambda$ (see also (4.53)):

\[ T_\omega(x, x') = \sum_j [\xi - z \exp(-\beta E_j)]^{-1}\, z \exp(-\tau E_j)\, \psi_j(x)\, \overline{\psi_j(x')}. \tag{4.65} \]

Notice that in order to simplify the notations, we did not specify the dependence on $\xi$, $z$, $\beta$ and $\tau$. Let us start with the equation satisfied by $T_\omega(x, x')$:

Proposition 10. As continuous functions:

\[ \xi\, T_\omega(x, x') - z \int_\Lambda dy\, G_{\omega,L}(x, y; \beta)\, T_\omega(y, x') = z\, G_{\omega,L}(x, x'; \tau). \tag{4.66} \]

Proof. The above equality is nothing but the rewriting in terms of integral kernels of an identity between bounded operators:

\[ [\xi - z W_L(\beta, \omega)]\, g_\omega(\xi, z; \beta, \tau) = z W_L(\tau, \omega). \tag{4.67} \]

$\square$

For further purposes, we shall prove that for all $L \ge 1$, $|T_\omega(x, x')| \sim e^{-\alpha |x - x'|}$ for some positive $\alpha$. We need first a few definitions: let $\rho(x) := (1 + x^2)^{1/2}$ and $\alpha \ge 0$. It is known that the partial derivatives up to the second order of $\rho$ are bounded by a numerical constant and moreover:

\[ \forall x \in \mathbb{R}^3, \quad e^{\pm\alpha\rho(x)}\, e^{\mp\alpha |x|} \le \mathrm{const}(\alpha). \tag{4.68} \]
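The bound (4.68) can be made explicit: since $0 < \rho(x) - |x| = 1/(\rho(x) + |x|) \le 1$, one may take $\mathrm{const}(\alpha) = e^{\alpha}$. This explicit constant is not stated in the text and is used here only as an assumption. A minimal numerical sketch, with hypothetical function names:

```python
import numpy as np

def rho(x):
    """Smoothed distance rho(x) = (1 + |x|^2)^(1/2) for points x of shape (..., 3)."""
    return np.sqrt(1.0 + np.sum(x ** 2, axis=-1))

def weight_ratios(x, alpha):
    """Return the two products in (4.68): exp(+alpha*(rho - |x|)) and exp(-alpha*(rho - |x|))."""
    r = np.sqrt(np.sum(x ** 2, axis=-1))
    gap = rho(x) - r  # lies in (0, 1] for every x
    return np.exp(alpha * gap), np.exp(-alpha * gap)
```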

Fix $x_0 \in \Lambda$. Denote with $A(\alpha)$ the multiplication operator with $e^{-\alpha\rho(\cdot - x_0)}$. Then $A(\alpha)$ and $A^{-1}(\alpha) = A(-\alpha)$ are bounded operators and leave $D_0$ invariant (see (4.42)). A useful result is contained in the next proposition:


Proposition 11. i. The operator $A(\alpha) W_L(\tau, \omega) A(-\alpha)$ belongs to $B(L^2)$, and has a norm which is uniformly bounded in $L$, $x_0 \in \Lambda$, $\omega \in \Omega$ and $0 < \tau \le \beta$;
ii. The operator $A(\alpha) W_L(\beta, \omega) A(-\alpha)$ belongs to $B(L^2, L^\infty)$, having a norm which is uniformly bounded in $L$, $\omega$ and $x_0$.

Proof. i. The inequality (4.68) allows us to replace $A$ with the multiplication operator given by $e^{\alpha |x - x_0|}$. Let $\psi \in L^2(\Lambda)$ and define:

\[ \phi := A(\alpha) W_L(\tau, \omega) A(-\alpha)\psi = \int_\Lambda dy\, e^{-\alpha\rho(\cdot - x_0)}\, G_{\omega,L}(\cdot, y; \tau)\, e^{\alpha\rho(y - x_0)}\, \psi(y). \tag{4.69} \]

Let us remark an elementary estimate, which is true for all $0 < \tau \le \beta$:

\[ e^{\alpha |x - x'|} \exp\left( -\frac{|x - x'|^2}{4\tau} \right) \le \mathrm{const}(\alpha, \beta). \tag{4.70} \]

Applying (4.68), the triangle inequality, (2.3) and (4.70) one obtains:

\[ |\phi|(x) \le \mathrm{const}(\alpha) \int_\Lambda dx'\, e^{\alpha |x - x'|}\, |G_{\omega,L}(x, x'; \tau)|\, |\psi|(x') \le \mathrm{const}(\alpha, \beta) \int_\Lambda dx'\, G_{0,\infty}(x, x'; 2\tau)\, |\psi|(x'). \tag{4.71} \]

The result follows from (4.36) and (4.34).

ii. We apply the Schwarz inequality in (4.71) with $\tau = \beta$ and then use (4.37). $\square$

Proposition 12. Under the same conditions as above, there exists a sufficiently small $0 < \alpha < 1$ such that the following inequalities are true in $B(L^2(\Lambda))$, uniformly in $L$, $\omega$ and $x_0$:

i. $\|W_L(\beta, \omega) - A(\alpha) W_L(\beta, \omega) A(-\alpha)\| \le \alpha\, \mathrm{const}(\beta)$;
ii. Uniformly in $\xi \in C$ and $z \in K$ one has: $\|A(\alpha) [\xi - z W_L(\beta, \omega)]^{-1} A(-\alpha)\|_{B(L^2(\Lambda))} \le \mathrm{const}(\beta)$.

Proof. i. Let $S(\beta) = A(\alpha) W_L(\beta, \omega) A(-\alpha)$. We will see that $S(\beta)$ obeys the conditions of Proposition 3. From Proposition 11 follows condition A. Then $S(\beta)$ is strongly differentiable, has its range included in the domain of $H_{1,L}(\omega)$ and converges strongly to one. Define $B := H_{1,L}(\omega) - A(\alpha) H_{1,L}(\omega) A(-\alpha)$, or in other form:

\[ B = 2 i \alpha\, (p - \omega a) \cdot \nabla\rho(\cdot - x_0) + \alpha\, (\Delta\rho)(\cdot - x_0) - \alpha^2 |\nabla\rho(\cdot - x_0)|^2. \tag{4.72} \]

A well known result says (see [S]) that $\| B [H_{1,L}(\omega) + 1]^{-1/2} \| \le \mathrm{const}$. By direct computation, $R(\tau) = B A(\alpha) W_L(\tau, \omega) A(-\alpha)$; a rough estimate gives $\|R(\tau)\| \le \mathrm{const}(L)/\sqrt{\tau}$ and even if the constant behaves badly with $L$, the norm is integrable in zero with respect to $\tau$. In conclusion:

\[ W_L(\beta, \omega) - S(\beta) = -\int_0^\beta d\tau\, W_L(\beta - \tau, \omega)\, R(\tau) = -\int_0^\beta d\tau\, W_L(\beta - \tau, \omega)\, B\, S(\tau), \tag{4.73} \]


where the integral converges in norm. But uniformly in $L$ and $x_0$ there exists a numerical constant such that:

\[ \|W_L(\beta - \tau, \omega)\, B\| \le \alpha\, \frac{\mathrm{const}}{\sqrt{\beta - \tau}}. \tag{4.74} \]

From (4.74), (4.73) and point i of Proposition 11, the needed estimate follows.

ii. Using point i, the estimate (4.3) and the identity ($\alpha$ small enough):

\[ A(\alpha) [\xi - z W_L(\beta, \omega)]^{-1} A(-\alpha) = [\xi - z A(\alpha) W_L(\beta, \omega) A(-\alpha)]^{-1} = \sum_{j \ge 0} [\xi - z W_L(\beta, \omega)]^{-1} z^j \left\{ [A(\alpha) W_L(\beta, \omega) A(-\alpha) - W_L(\beta, \omega)]\, [\xi - z W_L(\beta, \omega)]^{-1} \right\}^j, \]

the result follows. $\square$

Corollary 1. The operator $A(\alpha) g_\omega(\xi, z; \beta, \tau) A(-\alpha)$ belongs to $B(L^2, L^\infty)$ if $\alpha$ is small enough, and uniformly in $\xi$, $z$, $x_0$, $\omega$ and $L$ one has:

i. $\|A(\alpha) g_\omega(\xi, z; \beta, \tau) A(-\alpha)\| \le \mathrm{const}(\beta, \tau)$;
ii. $\int_\Lambda dy\, e^{2\alpha |y - x_0|}\, |T_\omega(x_0, y)|^2 \le \mathrm{const}(\beta, \tau)$;
iii. $e^{\alpha |x - y|}\, |T_\omega(x, y)| \le \mathrm{const}(\beta, \tau)$.

Proof. i. It is an immediate consequence of Propositions 11 and 12.

ii. Let $\phi = A(\alpha) g_\omega(\xi, z; \beta, \tau) A(-\alpha)\psi$, where $\phi$ is bounded and continuous. From i it follows:

\[ |\phi(x_0)| = \left| \int_\Lambda dy\, e^{\alpha\rho(y - x_0)}\, T_\omega(x_0, y)\, \psi(y) \right| \le \mathrm{const}(\beta, \tau)\, \|\psi\|_{L^2}, \tag{4.75} \]

and the result follows from the representation theorem of linear and continuous functionals on $L^2$.

iii. Rewrite the identity $g_\omega(\xi, z; \beta, \tau) = g_\omega(\xi, z; \beta, \tau/2)\, W_L(\tau/2, \omega)$ in terms of integral kernels, use (4.68), (2.3), ii, the triangle and Schwarz inequalities, and the proof is completed. $\square$

Let $\delta\omega > 0$ be such that $\omega = \omega_0 + \delta\omega \in \Omega$. Define the bounded operator $\tilde g_\omega(\xi, z; \beta, \tau)$ given by the following integral kernel:

\[ \tilde T_\omega(x, x') := e^{i\delta\omega\varphi(x,x')}\, T_{\omega_0}(x, x'). \tag{4.76} \]

Equations (4.3) and (4.58) imply that if $\delta\omega$ is sufficiently small then there exists a numerical constant such that uniformly in $L$:

\[ \sup_\xi \sup_z \|[\xi - z S_\omega(\beta)]^{-1}\| \le \mathrm{const} \tag{4.77} \]

and:

\[ \sup_\xi \sup_z \|[\xi - z S_\omega(\beta)]^{-1} - [\xi - z W_L(\beta, \omega)]^{-1}\| \le \mathrm{const}\, \delta\omega. \tag{4.78} \]


We now state an important property of $\tilde g_\omega(\xi, z; \beta, \tau)$:

Proposition 13. Under the above conditions, there exists a numerical constant such that if $\delta\omega$ is small enough, then uniformly in $\xi$, $z$ and $L$, the following $B(L^2)$ estimate takes place:

\[ \|[\xi - z S_\omega(\beta)]^{-1} z S_\omega(\tau) - \tilde g_\omega(\xi, z; \beta, \tau)\| \le \mathrm{const}\, \delta\omega. \tag{4.79} \]

Proof. The integral kernel of the operator $[\xi - z S_\omega(\beta)]\, \tilde g_\omega(\xi, z; \beta, \tau)$ is given by:

\[ \xi\, \tilde T_\omega(x, x') - z \int_\Lambda dy\, S_\omega(x, y; \beta)\, \tilde T_\omega(y, x'). \tag{4.80} \]

Let us notice a crucial property of the magnetic phase:

\[ \varphi(x, y) + \varphi(y, x') = \varphi(x, x') + \mathrm{fl}(x, y, x'), \tag{4.81} \]

where $\mathrm{fl}(x, y, x') = \frac{1}{2}\, B \cdot [(y - x') \wedge (x - y)]$. Using (4.81) and (4.66) in (4.80) we obtain:

\[ [\xi - z S_\omega(\beta)]\, \tilde g_\omega(\xi, z; \beta, \tau) = z S_\omega(\tau) + R, \tag{4.82} \]

where $R$ is an integral operator given by:

\[ -z\, e^{i\delta\omega\varphi(x,x')} \int_\Lambda dy\, \left( e^{i\delta\omega\, \mathrm{fl}(x,y,x')} - 1 \right) G_{\omega_0,L}(x, y; \beta)\, T_{\omega_0}(y, x'). \tag{4.83} \]

Because $|e^{i\delta\omega\, \mathrm{fl}(x,y,x')} - 1| \le \delta\omega\, |x - y|\, |y - x'|$, denoting with $P$ the operator given by $|x - y|\, |z G_{\omega_0,L}(x, y; \beta)|$ and with $Q$ the operator corresponding to $|y - x'|\, |T_{\omega_0}(y, x')|$, it follows that $\|R\| \le \delta\omega\, \|P\|\, \|Q\|$. Using (2.3), Corollary 1 iii., (4.34) and (4.77), the proof is completed. $\square$

Employing (4.58), (4.79), (4.78) and (4.77) in the next equality:

\[ g_\omega(\xi, z; \beta, \beta/2) - \tilde g_\omega(\xi, z; \beta, \beta/2) = [(\xi - z W_L(\beta, \omega))^{-1} - (\xi - z S_\omega(\beta))^{-1}]\, z W_L(\beta/2, \omega) + (\xi - z S_\omega(\beta))^{-1} z\, [W_L(\beta/2, \omega) - S_\omega(\beta/2)] + (\xi - z S_\omega(\beta))^{-1} z S_\omega(\beta/2) - \tilde g_\omega(\xi, z; \beta, \beta/2), \tag{4.84} \]

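(4.9) is straightforward. Let us end this subsection by proving that the operator $A_\omega(\xi, z; \beta) = \tilde g_\omega(\xi, z; \beta, \beta/2)\, S_\omega(\beta/2)$ has the same trace as $g_{\omega_0}(\xi, z; \beta, \beta)$. Indeed, because $A_\omega$ fulfills the conditions of Proposition 9 and noticing that $\varphi(x, x') = -\varphi(x', x)$, one can write:

\[ \mathrm{tr}\, A_\omega(\xi, z; \beta) = \int_{\Lambda^2} dx\, dx'\, T_{\omega_0}(\xi, z; \beta, \beta/2; x, x')\, G_{\omega_0,L}(x', x; \beta/2) = \int_\Lambda dx\, T_{\omega_0}(\xi, z; \beta, \beta; x, x) = \mathrm{tr}\, g_{\omega_0}(\xi, z; \beta, \beta). \tag{4.85} \]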
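The algebraic properties of the magnetic phase used above, namely the composition rule (4.81), the antisymmetry $\varphi(x, x') = -\varphi(x', x)$ and the bound on the flux term that enters the estimate of $R$, can be checked numerically. The sketch below is a hedged illustration: it assumes that the field direction in $\mathrm{fl}$ is the unit vector $e_3$ appearing in (4.32), and the function names are hypothetical.

```python
import numpy as np

# Assumed field direction: the unit vector e_3 of (4.32).
E3 = np.array([0.0, 0.0, 1.0])

def phase(x, xp):
    """Magnetic phase phi(x, x') = (1/2) e_3 . (x' ^ x), cf. (4.32)."""
    return 0.5 * E3 @ np.cross(xp, x)

def flux(x, y, xp):
    """Flux term fl(x, y, x') = (1/2) e_3 . [(y - x') ^ (x - y)], cf. (4.81)."""
    return 0.5 * E3 @ np.cross(y - xp, x - y)
```

With these definitions, `phase(x, y) + phase(y, xp)` should agree with `phase(x, xp) + flux(x, y, xp)` up to rounding, and `abs(flux(x, y, xp))` should not exceed half the product of the lengths of `x - y` and `y - xp`, which is slightly stronger than the bound used in the text.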


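4.2. The proof of II and III. The analyticity of $\Gamma_\infty(\beta, z, \omega_0)$ in $D$ follows from the bound (see (2.16)) $|g_\sigma(\zeta)| \le \mathrm{const}(\sigma, K)\, |\zeta|$ where $K$ is some compact in $\mathbb{C} \setminus [1, \infty)$. In what follows, we will prove that if $z \in D_0 := \{|z| < 1\}$, then:

\[ \lim_{L \to \infty} \Gamma_L(\beta, z, \omega_0) = \Gamma_\infty(\beta, z, \omega_0). \tag{4.86} \]

Because $|z| < 1$, the grand canonical pressure will be (see (2.10)):

\[ P_L(\beta, z, \omega) = \sum_{n=1}^\infty \frac{z^n}{n}\, \frac{1}{\beta L^3}\, \mathrm{tr}\, W_L(n\beta, \omega) = \sum_{n=1}^\infty \frac{z^n}{n}\, \frac{1}{\beta L^3} \int_\Lambda dx\, G_{\omega,L}(x, x; n\beta). \tag{4.87} \]

Under the same conditions, the magnetization reads as:

\[ \Gamma_L(\beta, z, \omega_0) = -\frac{e}{c} \sum_{n=1}^\infty \frac{z^n}{n}\, \frac{1}{\beta L^3}\, \mathrm{tr}\, \frac{\partial W_L}{\partial \omega}(n\beta, \omega_0). \tag{4.88} \]

An important quantity is the integral kernel of the semigroup defined on the whole space:

\[ G_{\omega,\infty}(x, x'; \beta) = \frac{e^{i\omega\varphi(x,x')}}{(2\pi\beta)^{3/2}}\, \frac{\omega\beta/2}{\sinh(\omega\beta/2)} \exp\left\{ -\frac{1}{2\beta} \left[ \frac{\omega\beta/2}{\tanh(\omega\beta/2)}\, (e_3 \wedge (x - x'))^2 + (e_3 \cdot (x - x'))^2 \right] \right\}. \tag{4.89} \]

Denote with:

\[ g(\beta, \omega) = G_{\omega,\infty}(x, x; \beta) = \frac{1}{(2\pi\beta)^{3/2}}\, \frac{\omega\beta/2}{\sinh(\omega\beta/2)}. \tag{4.90} \]

Then it is easy to see that if $|z| < 1$:

\[ P_\infty(\beta, z, \omega_0) = \sum_{n=1}^\infty \frac{z^n}{n}\, \frac{g(n\beta, \omega_0)}{\beta}, \qquad \Gamma_\infty(\beta, z, \omega_0) = -\frac{e}{\beta c} \sum_{n=1}^\infty \frac{z^n}{n}\, \frac{\partial g}{\partial \omega}(n\beta, \omega_0). \]

One of the results in [M-M-P 1] can be adapted to our problem and gives:

\[ \lim_{L \to \infty} \frac{1}{L^3}\, \mathrm{tr}\, \frac{\partial W_L}{\partial \omega}(\beta, \omega_0) = \frac{\partial g}{\partial \omega}(\beta, \omega_0). \tag{4.91} \]

If we prove the existence of a positive function $f$ with at most polynomial growth such that:

\[ \left| \frac{1}{L^3}\, \mathrm{tr}\, \frac{\partial W_L}{\partial \omega}(n\beta, \omega_0) \right| \le f(n\beta), \tag{4.92} \]

then (4.86) would be true.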
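The infinite volume series for $P_\infty$ and $\Gamma_\infty$ are easy to evaluate numerically for $|z| < 1$. The following sketch truncates both series and evaluates $\partial g / \partial \omega$ from (4.90) in closed form. It is only an illustration under stated assumptions: the ratio $e/c$ is set to 1 by default, the truncation order `n_terms` is an arbitrary choice, and all function names are hypothetical.

```python
import numpy as np

def g(beta, omega):
    """Diagonal of the free magnetic heat kernel, cf. (4.90)."""
    u = 0.5 * omega * beta
    return (2.0 * np.pi * beta) ** -1.5 * u / np.sinh(u)

def dg_domega(beta, omega):
    """Derivative of g with respect to omega, obtained by differentiating u / sinh(u)."""
    u = 0.5 * omega * beta
    return (2.0 * np.pi * beta) ** -1.5 * (0.5 * beta) * (np.sinh(u) - u * np.cosh(u)) / np.sinh(u) ** 2

def pressure_inf(beta, z, omega, n_terms=200):
    """Truncated series for P_infinity; valid for |z| < 1."""
    n = np.arange(1, n_terms + 1)
    return np.sum(z ** n / n * g(n * beta, omega)) / beta

def magnetization_inf(beta, z, omega, e_over_c=1.0, n_terms=200):
    """Truncated series for Gamma_infinity; e_over_c = 1 is an assumed unit choice."""
    n = np.arange(1, n_terms + 1)
    return -e_over_c / beta * np.sum(z ** n / n * dg_domega(n * beta, omega))
```

With these definitions, a centered finite difference of `pressure_inf` in `omega`, multiplied by `-e_over_c`, should reproduce `magnetization_inf`, which mirrors the relation between (4.87) and (4.88) at infinite volume.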


The next corollary is a direct consequence of Proposition 8 ii.:

Corollary 2. Under the conditions of Proposition 8 one has:

\[ \lim_{\delta\omega \searrow 0} \frac{1}{\delta\omega} \int_\Lambda dx\, [G_{\omega_0 + \delta\omega,L}(x, x; \beta) - G_{\omega_0,L}(x, x; \beta)] = \mathrm{tr}\, \frac{\partial W_L}{\partial \omega}(\beta, \omega_0) = 2 \int_0^\beta d\tau \int_{\Lambda^2} dx\, dy\, G_{\omega_0,L}(x, y; \beta - \tau)\, a(y - x) \cdot (-i\nabla_y - \omega_0 a(y))\, G_{\omega_0,L}(y, x; \tau). \tag{4.93} \]

Use (2.3), (4.46), (4.38) and (4.35) in (4.93) and (4.92) follows. The proof of (4.86) is now completed.

Remarks. 1. What can we say about the same problem for Fermi particles (say electrons, where $H_{1,L}(\omega)$ should be replaced with the Pauli operator)? Knowing that in this case $\inf \sigma(H_{1,\infty}(\omega)) = 0$ (i.e. is independent of $\omega$), the grand canonical result (Lemma 1) can be easily restated in terms of Fermi statistics: the only thing that changes is the domain on which the limit takes place, i.e. $\mathbb{C} \setminus (-\infty, -1]$. As for Theorem 2, its proof was based on the fact that there exists a compact $K \subset \mathbb{C} \setminus [e^{\beta\omega/2}, \infty)$ which contains the circle centered in the origin with radius equal to $x_L(\beta, \rho, \omega)$, for all $L \ge L_0$ and for all strictly positive $\beta$, $\omega$ and $\rho$. For Fermi particles, if one fixes $\rho$ and $\omega$ but makes $\beta$ very large (lowers the temperature), then $x_L(\beta, \rho, \omega)$ would be a very large positive quantity, therefore the above circle could intersect the negative cut. Of course, if the gas is diluted (say $\rho$ and $\omega$ fixed and $\beta$ small), then it could happen that $x_\infty(\beta, \rho, \omega) < 1$, which means that the circle never intersects the cut when $L \ge L_0$, therefore a similar proof can be provided. Our conclusion is that the extension of Theorem 2 to a Fermi gas at low temperature is not trivial and remains an interesting problem.

2. What about the higher derivatives with respect to $\omega$ (the susceptibility for example) at $\omega_0 \ne 0$? This also remains an open problem, even for the grand canonical ensemble. Nevertheless, we think that our approach (the modified perturbation theory for Gibbs semigroups) could provide an answer to it.

Acknowledgement. Part of this work was done during a visit to Centre de Physique Théorique in Marseille, at the invitation of Professor P. Duclos. The financial support of CNCSU grant 13(C) is hereby gratefully acknowledged. Finally, the author wishes to thank Professors N. Angelescu, M. Bundaru and G. Nenciu for their encouragement and fruitful discussions.

References

[A] Angelescu, N.: Ph.D thesis, I.F.A., Bucharest, 1976
[A-B-N 1] Angelescu, N., Bundaru, M., Nenciu, G.: On the perturbation of Gibbs semigroups. Commun. Math. Phys. 42, 29–30 (1975)
[A-B-N 2] Angelescu, N., Bundaru, M., Nenciu, G.: On the Landau diamagnetism. Commun. Math. Phys. 42, 9–28 (1975)
[A-C] Angelescu, N., Corciovei, A.: On free quantum gases in a homogeneous magnetic field. Rev. Roum. Phys. 20, 661–671 (1975)
[B-H-L] Broderix, K., Hundertmark, D., Leschke, H.: Continuity properties of Schrödinger semigroups with magnetic fields. Mathematical-Physics preprint archive of University of Texas at Austin.
[C-N] Cornean, H.D., Nenciu, G.: On eigenfunction decay for two dimensional magnetic Schrödinger operators. Commun. Math. Phys. 192, 671–685 (1998)
[H] Huang, K.: Statistical mechanics. New York–London: John Wiley & Sons, Inc., 1963
[H-P] Hille, E., Phillips, R.S.: Functional analysis and semi-groups. Providence, RI: Am. Math. Soc., 1957
[K-U-Z] Kac, M., Uhlenbeck, G.E., Ziff, R.M.: The ideal Bose-Einstein gas, revisited. Phys. Rep. 32C, no 4, 169–248 (1977)
[K] Kunz, H.: Surface orbital magnetism. J. Stat. Phys. 76, 183–207 (1994)
[M-M-P 1] Macris, N., Martin, Ph.A., Pulé, J.V.: Diamagnetic currents. Commun. Math. Phys. 117, 215–241 (1988)
[M-M-P 2] Macris, N., Martin, Ph.A., Pulé, J.V.: Large volume asymptotics of Brownian integrals and orbital magnetism. Ann. I.H.P. Phys. Theor. 66, 147–183 (1997)
[R-S 1,4] Reed, M., Simon, B.: Methods of modern mathematical physics I, IV. New York: Academic Press, 1975
[S] Simon, B.: Schrödinger semigroups. Bull. Am. Math. Soc. (N. S.) 7, 447–510 (1982)

Communicated by H. Araki

Commun. Math. Phys. 212, 29 – 61 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Trace Construction of a Basis for the Solution Space of $sl_N$ qKZ Equation

Atsushi Nakayashiki

Graduate School of Mathematics, Kyushu University, Ropponmatsu 4-2-1, Fukuoka 810-8560, Japan. E-mail: [email protected]

Received: 26 March 1999 / Accepted: 4 January 2000

Abstract: The trace of intertwining operators over the level one irreducible highest weight modules of the quantum affine algebra of type $A^{(1)}_{N-1}$ is studied. It is proved that the trace function gives a basis of the solution space of the qKZ equation at a generic level. The highest-highest matrix elements of the composition of intertwining operators are explicitly determined as rational functions up to an overall scalar function. The integral formula for the trace is presented.

1. Introduction

In this paper we shall study solutions of the quantized Knizhnik–Zamolodchikov (qKZ) equation associated with the quantum group $U_q(sl_N)$. The idea in this paper stems from the study of solvable lattice models.

The qKZ equation was introduced in [6] as the equation satisfied by the highest-highest matrix elements of the intertwining operators of quantum affine algebra. For generic values of parameters the set of matrix elements gives a basis of the solution space over the field of appropriate periodic functions. The connection matrix of two solutions with different asymptotics has been calculated from the commutation relation of intertwining operators.

The solutions of the qKZ equation associated with $U_q(sl_2)$ are systematically studied by Tarasov and Varchenko [15] (see also references therein). In [15] the solutions are described as multidimensional q-hypergeometric integrals. It is proved that, for generic values of parameters, the q-hypergeometric solutions give a basis of the solution space over the field of appropriate periodic functions. The connection matrix is determined as the representation of Felder's elliptic quantum group.

In this paper we propose another basis of the solution space of the qKZ equation as the trace of intertwining operators of quantum affine algebra. The trace depends on two kinds of variables. It satisfies a qKZ equation in one kind of variable and another qKZ


equation in the other kind of variables. Those two qKZ equations are dual to each other. The solutions to one qKZ equation are parametrized by solutions to the dual equation.

Let us consider the $U_q(sl_N)$ modules $V_1, \ldots, V_n$ and the trigonometric R matrix $R_{ij}(z_i/z_j)$ acting on the tensor product $V_i \otimes V_j$. The qKZ equation is the q-difference equation for the $V_1 \otimes \cdots \otimes V_n$ valued function $f(z_1, \cdots, z_n)$ of the form

\[ f(\cdots, p z_j, \cdots) = R_{j\,j-1}(p z_j / z_{j-1}) \cdots R_{j1}(p z_j / z_1)\, (\kappa^{-H})_j\, R_{jn}(z_j / z_n) \cdots R_{j\,j+1}(z_j / z_{j+1})\, f(z_1, \cdots, z_n), \tag{1} \]

where $\kappa^{-H} = \prod_{i=1}^{N-1} \kappa_i^{-h_i}$, $h_1, \cdots, h_{N-1}$ is a basis of the Cartan subalgebra of $sl_N$ and $(\kappa^{-H})_j$ means that $\kappa^{-H}$ acts on $V_j$. The complex numbers $p$ and $\kappa_i$'s are the parameters of the equation. If we write $p = q^{2(k+N)}$ the number $k$ is called level.

Let $\Lambda_i$ ($0 \le i \le N-1$) be the fundamental weights of $\widehat{sl}_N$. We identify $\Lambda_i$ ($1 \le i \le N-1$) with the fundamental weights of $sl_N$. In this paper we consider the case where all $V_i$ are isomorphic to the $N$ dimensional irreducible $U_q(sl_N)$ module $V$ with the highest weight $\Lambda_1$ or $\Lambda_{N-1}$.

Let $V(\Lambda_i)$ be the irreducible highest weight $U_q(\widehat{sl}_N)$ module with the highest weight $\Lambda_i$ and $V_\zeta$ the evaluation module of $V$. Then there exist, up to normalization, unique intertwining operators $\Phi(\zeta)$ and $\Psi^*(\xi)$:

\[ \Phi(\zeta) : V(\Lambda_{i+1}) \longrightarrow V(\Lambda_i) \otimes V_\zeta, \qquad \Psi^*(\xi) : V_\xi \otimes V(\Lambda_i) \longrightarrow V(\Lambda_{i+1}). \]

We extend the index $i$ of $\Lambda_i$ to the set of integers and read it modulo $N$. The operators $\Phi(\zeta)$ and $\Psi^*(\xi)$ are sometimes called of type I and type II respectively [8]. The difference between type I and type II is in the place where the evaluation module is. For type I it is on the right of the highest weight module while for type II it is on the left. Denote by $D$ the grading operator of the principal gradation of $V(\Lambda_i)$ and consider the trace of the form

\[ G(\zeta_1, \cdots, \zeta_m | \xi_1, \cdots, \xi_n | x, \kappa) = F(\zeta|\xi|x)^{-1} \sum_{i=0}^{N-1} \mathrm{tr}_{V(\Lambda_i)} \left( x^D \kappa^H\, \Phi(\zeta_1) \cdots \Phi(\zeta_m)\, \Psi^*(\xi_n) \cdots \Psi^*(\xi_1) \right) \tag{2} \]

which is a function taking the value in $\mathrm{Hom}_{\mathbb{C}}(V^{\otimes n}, V^{\otimes m})$. Here $F(\zeta|\xi|x)$ is some scalar function (cf. (16)). By the commutation relation of the intertwining operators, the cyclic property of the trace and the functional equations of $F(\zeta|\xi|x)$, $G$ satisfies

\[ G(\zeta | \cdots, x\xi_i, \cdots | x, \kappa) = G(\zeta|\xi|x, \kappa)\, \bar R_{i\,i+1}(\xi_i/\xi_{i+1}) \cdots \bar R_{in}(\xi_i/\xi_n)\, (\kappa^{-H})_{\xi_i}\, \bar R_{i1}(x\xi_i/\xi_1) \cdots \bar R_{i\,i-1}(x\xi_i/\xi_{i-1}), \tag{3} \]

\[ G(\cdots, x^{-1}\zeta_i, \cdots | \xi | x, \kappa) = \bar R_{i\,i-1}(x^{-1}\zeta_i/\zeta_{i-1}) \cdots \bar R_{i1}(x^{-1}\zeta_i/\zeta_1)\, (\kappa^{-H})_{\zeta_i}\, \bar R_{im}(\zeta_i/\zeta_m) \cdots \bar R_{i\,i+1}(\zeta_i/\zeta_{i+1})\, G(\zeta|\xi|x, \kappa), \tag{4} \]

where $\bar R(\zeta)$ is the trigonometric R matrix (cf. (7)), $\bar R_{i\,i+1}(\xi_i/\xi_{i+1})$ acts non-trivially on $V_{\xi_i} \otimes V_{\xi_{i+1}}$ in $V^{\otimes n}$, etc. Equation (4) has precisely the same form as the qKZ equation (1). Let ${}^tG$ be the transpose of $G$, that is, ${}^tG \in \mathrm{Hom}_{\mathbb{C}}(V^{*\otimes m}, V^{*\otimes n})$, $V^*$ being the dual vector space of $V$. Then, as the equation for ${}^tG$, (3) is of the same form as (1). Since we use the principal gradation in this paper, to make a precise correspondence between the parameter $x$ and the parameter $p$ in (1) we need to consider $G$ as a function of $z_j = \zeta_j^N$


and $u_j = \xi_j^N$. Then if $x^N = p = q^{2(k+N)}$, ${}^tG$ and $G$ satisfy the qKZ equation of level $k$ and level $-k-2N$ in the variables $u$ and $z$ respectively. In this paper, if $x^{-N} = q^{2(k+N)}$, we call (4) the qKZ equation of level $k$ with the value in $V^{\otimes m}$. Let $S^n_k$ and $S^{*n}_k$ be the spaces of meromorphic solutions of the qKZ equation of level $k$ with the value in $V^{\otimes n}$ and $V^{*\otimes n}$ respectively and $F$ the field of $x$ periodic meromorphic functions in $n$ variables. Then the function $G$ defines two maps simultaneously:

\[ {}^tG(\zeta | \cdot | x, \kappa) : V^{*\otimes m} \otimes F \longrightarrow S^{*n}_k, \tag{5} \]

\[ G(\cdot | \xi | x, \kappa) : V^{\otimes n} \otimes F \longrightarrow S^m_{-k-2N}. \tag{6} \]

In (5), $\zeta_1, \cdots, \zeta_m$ are parameters of the map and in (6), $\xi_1, \cdots, \xi_n$ are parameters of the map. We consider the case $n = m$. We assume $|x| < 1$. We shall prove that if $x$ and $\kappa$ are generic, (5) is an isomorphism for the generic values of $\zeta_1, \ldots, \zeta_m$ and (6) is an isomorphism for the generic values of $\xi_1, \ldots, \xi_n$. It is proved by showing that the determinant of $G$ does not vanish identically. We calculate the determinant at $x = 0$, where $G$ reduces to the highest-highest matrix element. For the level one irreducible module $V(\Lambda_i)$ the matrix elements can be calculated explicitly as rational functions up to an overall scalar function. This is expected because at $q = 1$ such a formula is given with the help of the Frenkel–Kac bosonization of $V(\Lambda_i)$ [5]. In the case of $U_q(\widehat{sl}_2)$ the formula of this type is given in [8]. For $U_q(\widehat{sl}_N)$ it is possible to carry out the integral of the integral formula, which is obtained from the bosonization of intertwining operators based on the Frenkel–Jing bosonization of $V(\Lambda_i)$ in [10], in a similar manner to the $N = 2$ case.

The case $x = q^2$ is relevant to the physical quantities in solvable lattice models. In fact at this value of $x$, if we further specialize the variables $\zeta_i$ and $\xi_j$ appropriately, the trace functions give correlation functions and form factors of the solvable lattice model whose Boltzmann weight is given by $\bar R(\zeta)$. We have calculated the determinant of $G$ for $N = n = 2$ and $x = q^2$ explicitly. By the $q$ series expansion we checked that $\det G$ does not vanish identically for $n = 3$. We conjecture that the determinant does not vanish identically at $x = q^2$. This suggests that the trace description can be effective for the completeness problem of the space of local fields [13, 1].

The bosonization of intertwining operators makes it possible not only to derive the integral formula for the matrix elements but also to derive the integral formula for the trace. Therefore the integral formulae of the basis of the solution space of (3) and (4) are given.

The plan of this paper is as follows. In the second section we summarize necessary notations of the quantum affine algebra $U_q(\widehat{sl}_N)$. We review the properties of the intertwining operators for the level one integrable $U_q(\widehat{sl}_N)$ modules in the principal picture in Sect. 3. In Sect. 4 we give the relation between principal picture and homogeneous picture. It serves for translating the results in the references, in which the homogeneous gradation is used, into principal picture and vice versa. In Sect. 5 the traces of intertwining operators are introduced and the equations satisfied by them are derived. The non-vanishing of the determinant of the trace function is properly formulated and proved in Sect. 6. In Sect. 7 an example of the concrete expression of the determinant of the trace of intertwining operators in the case $N = 2$ is given. In Sect. 8 we give the integral formulae for the matrix elements of the intertwining operators. The rational function formula for the extremal component of the normalized matrix element is given in Sect. 9. In Sect. 10 the integral formula of


the trace of intertwining operators is presented. In Appendix A we refer to the integral c2 ) in [8], since in this case it is formula for the trace of intertwining operators of Uq (sl possible to simplify the formula a bit. This simplification is used in the calculation in the example of Sect. 7. The bosonic expression of the intertwining operators are reviewed in Appendix B. The list of the expression of the operators in terms of their normal ordered operators is given in Appendix C. In Appendix D a derivation of the integral formula for the trace of intertwining operators is briefly explained. The explicit formulae of constants appeared in the formulae of matrix elements and the trace are given in Appendix E. 2. Preliminary Let

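$$P = \bigoplus_{i=0}^{N-1} \mathbb{Z}\Lambda_i \oplus \mathbb{Z}\delta$$

be the weight lattice of $\widehat{sl}_N$ and

$$P^* = \mathrm{Hom}_{\mathbb{Z}}(P, \mathbb{Z}) = \bigoplus_{i=0}^{N-1} \mathbb{Z}h_i \oplus \mathbb{Z}d$$

its dual. The pairing is given by

$$\langle \Lambda_i, h_j\rangle = \delta_{ij}, \quad \langle \Lambda_i, d\rangle = 0, \quad \langle \delta, h_i\rangle = 0, \quad \langle \delta, d\rangle = 1.$$

We say $\Lambda_i$ ($0 \le i \le N-1$) are the fundamental weights. Simple roots are given by

$$\alpha_i = -\Lambda_{i-1} + 2\Lambda_i - \Lambda_{i+1} + \delta_{i0}\,\delta,$$

where the index should be read modulo N. If we set $a_{ij} = \langle \alpha_i, h_j\rangle$, then $(a_{ij})$ is the generalized Cartan matrix of type $A^{(1)}_{N-1}$.

The quantum affine algebra $U'_q(\widehat{sl}_N)$ is the Hopf algebra generated by $e_i$, $f_i$, $t_i^{\pm 1}$ ($0 \le i \le N-1$) with the following defining relations:

$$t_i t_j = t_j t_i, \qquad t_i^{\pm 1} t_i^{\mp 1} = 1,$$
$$t_i e_j t_i^{-1} = q^{\langle h_i, \alpha_j\rangle} e_j, \qquad t_i f_j t_i^{-1} = q^{-\langle h_i, \alpha_j\rangle} f_j,$$
$$[e_i, f_j] = \delta_{ij}\,\frac{t_i - t_i^{-1}}{q - q^{-1}},$$
$$\sum_{k=0}^{1-a_{ij}} (-1)^k e_i^{(k)} e_j e_i^{(1-a_{ij}-k)} = \sum_{k=0}^{1-a_{ij}} (-1)^k f_i^{(k)} f_j f_i^{(1-a_{ij}-k)} = 0 \qquad (i \ne j),$$

where $e^{(k)} = e^k/[k]!$ and similarly for $f^{(k)}$, $[k]! = [k]\cdots[2][1]$, $[k] = (q^k - q^{-k})/(q - q^{-1})$. The coproduct $\Delta$ and the antipode $S$ are given by

$$\Delta(e_i) = e_i \otimes 1 + t_i \otimes e_i, \qquad \Delta(f_i) = f_i \otimes t_i^{-1} + 1 \otimes f_i, \qquad \Delta(t_i) = t_i \otimes t_i,$$

and

$$S(e_i) = -t_i^{-1} e_i, \qquad S(f_i) = -f_i t_i, \qquad S(t_i) = t_i^{-1}.$$

We extend the Hopf algebra $U'_q(\widehat{sl}_N)$ by adding the element $D$ such that

$$[D, e_i] = e_i, \qquad [D, f_i] = -f_i, \qquad [D, t_i^{\pm 1}] = 0, \qquad \Delta(D) = D \otimes 1 + 1 \otimes D.$$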
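The combinatorial data entering the defining relations can be tabulated directly. The following is a minimal sketch in Python, not part of the original paper: the function names `cartan_matrix`, `q_int` and `q_factorial` are hypothetical, and `q` is assumed to be a nonzero number with $q^2 \ne 1$. It builds $a_{ij} = \langle\alpha_i, h_j\rangle$ from the formula for the simple roots and evaluates the q-integers $[k]$ and q-factorials $[k]!$ used in the divided powers.

```python
def cartan_matrix(N):
    """Generalized Cartan matrix a_ij = <alpha_i, h_j> of type A^(1)_{N-1}, N >= 2."""
    a = [[0] * N for _ in range(N)]
    for i in range(N):
        # alpha_i = -Lambda_{i-1} + 2 Lambda_i - Lambda_{i+1} + delta_{i0} delta,
        # with <Lambda_k, h_j> = delta_{kj} and <delta, h_j> = 0; indices modulo N.
        a[i][i] += 2
        a[i][(i - 1) % N] -= 1
        a[i][(i + 1) % N] -= 1
    return a


def q_int(k, q):
    """q-integer [k] = (q^k - q^-k) / (q - q^-1)."""
    return (q**k - q**(-k)) / (q - 1 / q)


def q_factorial(k, q):
    """q-factorial [k]! = [k] ... [2][1], with [0]! = 1."""
    result = 1.0
    for j in range(1, k + 1):
        result *= q_int(j, q)
    return result
```

For N = 2 the two off-diagonal contributions fall on the same entry, which gives the expected value $a_{01} = a_{10} = -2$.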


The resulting algebra is denoted by $U_q(\widehat{sl}_N)$. We say that an element $X \in U_q(\widehat{sl}_N)$ has degree $n$ if $[D, X] = nX$. For a highest weight $U_q(\widehat{sl}_N)$ module $M$ with a highest weight vector $v$, $D$ defines a grading on $M$ by $D(Xv) = nXv$ for an element $X$ of degree $n$ in $U_q(\widehat{sl}_N)$. The evaluation module $V_\zeta = \oplus_{j=0}^{N-1} \mathbb{C}v_j$ of $U'_q(\widehat{sl}_N)$ associated with the vector representation of $U_q(sl_N)$ is given by

$$f_i v_j = \zeta^{-1}\delta_{i\,j+1}\, v_{j+1}, \qquad e_i v_j = \zeta\,\delta_{ij}\, v_{j-1}, \qquad t_i v_j = q^{-\delta_{ij}+\delta_{i\,j+1}} v_j,$$

where the index of $v_j$ should be read modulo N. In particular the weight of $v_j$, which we denote by $\mathrm{wt}\,v_j$, is given by $\mathrm{wt}\,v_j = \Lambda_{j+1} - \Lambda_j$.

We denote the binomial coefficient by ${}_nC_r$, that is, $(1+x)^n = \sum_{r=0}^{n} {}_nC_r\, x^r$. In this paper two kinds of variables appear, one is $u$ and $z$, the other is $\xi$ and $\zeta$. They are always related by $u = \xi^N$ and $z = \zeta^N$, except in Appendix A where $u = -\xi^2$ and $z = \zeta^2$.

3. Intertwining Operators

In [2,10] the evaluation module, R matrices and intertwining operators are described in terms of the homogeneous grading. We shall rewrite them in the principal picture so that the description is consistent with the N = 2 case in [8] and the equations for the trace of intertwining operators are free from cumbersome factors.

Let $P$ be the permutation operator, $P(v \otimes w) = w \otimes v$, and $P\bar R(\zeta_1/\zeta_2)$ the intertwining operator from $V_{\zeta_1} \otimes V_{\zeta_2}$ to $V_{\zeta_2} \otimes V_{\zeta_1}$ normalized as $\bar R(\zeta_1/\zeta_2)(v_0 \otimes v_0) = v_0 \otimes v_0$. We define the components of $\bar R(\zeta)$ by

$$\bar R(\zeta)(v_i \otimes v_j) = \sum_{i',j'} \bar R(\zeta)^{ij}_{i'j'}\, v_{i'} \otimes v_{j'}.$$

Explicitly they are given by (cf. [2])

$$\bar R(\zeta)^{jj}_{jj} = 1, \qquad \bar R(\zeta)^{jk}_{jk} = b(\zeta) = \frac{q(1-\zeta^N)}{1 - q^2\zeta^N} \quad (j \ne k),$$
$$\bar R(\zeta)^{jk}_{kj} = c_{jk}(\zeta) = \frac{1-q^2}{1-q^2\zeta^N}\,\zeta^{N\theta(k-j)+j-k} \quad (j \ne k), \qquad (7)$$

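where $\theta(k) = 1$ ($k \ge 0$), $= 0$ (otherwise) and $0 \le j, k \le N-1$.

Let $V(\Lambda_i)$ be the irreducible highest weight $U_q(\widehat{sl}_N)$ module with the highest weight $\Lambda_i$ and the highest weight vector $|\Lambda_i\rangle$, and $V(\Lambda_i)^*$ the restricted dual right highest weight module of $V(\Lambda_i)$ with the highest weight vector $\langle\Lambda_i|$ such that $\langle \langle\Lambda_i|, |\Lambda_i\rangle \rangle = 1$, where $\langle\ ,\ \rangle$ is the dual pairing. We denote $\langle \langle\Lambda_j|, X|\Lambda_i\rangle \rangle = \langle \langle\Lambda_j|X, |\Lambda_i\rangle \rangle = \langle\Lambda_j|X|\Lambda_i\rangle$ for any $X \in \mathrm{Hom}_{\mathbb{C}}(V(\Lambda_i), V(\Lambda_j))$, where $X$ acts on $V(\Lambda_j)^*$ from the right.

The type I and type II intertwining operators $\Phi^{(i)}(\zeta)$ and $\Psi^{*(i)}(\xi)$ are the $U'_q(\widehat{sl}_N)$ linear operators of the form

$$\Phi^{(i)}(\zeta) : V(\Lambda_{i+1}) \longrightarrow V(\Lambda_i) \otimes V_\zeta, \qquad \Phi^{(i)}(\zeta) = \sum_{\varepsilon=0}^{N-1} \Phi^{(i)}_\varepsilon(\zeta) \otimes v_\varepsilon,$$
$$\Psi^{*(i)}(\xi) : V_\xi \otimes V(\Lambda_i) \longrightarrow V(\Lambda_{i+1}), \qquad \Psi^{*(i)}(\xi)(v_\mu \otimes \cdot) = \Psi^{*(i)}_\mu(\xi).$$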


We normalize them by the condition that

$$\langle\Lambda_i|\Phi^{(i)}_i(\zeta)|\Lambda_{i+1}\rangle = 1, \qquad \langle\Lambda_{i+1}|\Psi^{*(i)}_i(\zeta)|\Lambda_i\rangle = 1.$$

Under these normalizations the operators $\Phi^{(i)}(\zeta)$ and $\Psi^{*(i)}(\xi)$ are unique. We sometimes omit the upper index $(i)$ of $\Phi^{(i)}(\zeta)$ and $\Psi^{*(i)}(\xi)$ for the sake of simplicity. The intertwining operators $\Phi(\zeta)$ and $\Psi^*(\xi)$ satisfy the following commutation relations ([2]):

$$R(\zeta_1/\zeta_2)\Phi(\zeta_1)\Phi(\zeta_2) = \Phi(\zeta_2)\Phi(\zeta_1), \qquad (8)$$
$$\Psi^*(\xi_2)\Psi^*(\xi_1)R^*(\xi_1/\xi_2) = \Psi^*(\xi_1)\Psi^*(\xi_2), \qquad (9)$$
$$\Phi(\zeta)\Psi^*(\xi) = \tau(\zeta/\xi)\Psi^*(\xi)\Phi(\zeta), \qquad (10)$$

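where

$$\tau(\zeta) = \zeta^{1-N}\,\frac{\theta_{q^{2N}}(q\zeta^N)}{\theta_{q^{2N}}(q\zeta^{-N})}$$

and for any complex number $p$ such that $|p| < 1$ we set

$$(z; p)_\infty = \prod_{k=0}^{\infty} (1 - p^k z), \qquad \theta_p(z) = (z; p)_\infty\,(pz^{-1}; p)_\infty\,(p; p)_\infty.$$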
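The infinite products above converge quickly for $|p| < 1$, so they are easy to evaluate numerically. The following is a minimal sketch, not taken from the paper: the names `qpoch`, `theta` and `tau` and the truncation parameter `terms` are assumptions made for illustration. It implements $(z;p)_\infty$, $\theta_p(z)$ and $\tau(\zeta)$ by truncating the products.

```python
def qpoch(z, p, terms=200):
    """Truncated (z; p)_infinity = prod_{k>=0} (1 - p^k z), assuming |p| < 1."""
    prod = 1.0
    pk = 1.0  # holds p^k
    for _ in range(terms):
        prod *= 1 - pk * z
        pk *= p
    return prod


def theta(z, p, terms=200):
    """theta_p(z) = (z; p)_inf (p/z; p)_inf (p; p)_inf for z != 0."""
    return qpoch(z, p, terms) * qpoch(p / z, p, terms) * qpoch(p, p, terms)


def tau(zeta, q, N, terms=200):
    """tau(zeta) = zeta^(1-N) theta_{q^2N}(q zeta^N) / theta_{q^2N}(q zeta^-N)."""
    p = q ** (2 * N)
    return (zeta ** (1 - N) * theta(q * zeta**N, p, terms)
            / theta(q * zeta ** (-N), p, terms))
```

Two consequences of the definitions make convenient sanity checks: the quasi-periodicity $\theta_p(pz) = -z^{-1}\theta_p(z)$, and the inversion relation $\tau(\zeta)\tau(\zeta^{-1}) = 1$, which follows because the two theta factors swap under $\zeta \to \zeta^{-1}$.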

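The matrices $R(\zeta)$ and $R^*(\zeta)$ are given by

$$R(\zeta) = r(\zeta)\bar R(\zeta), \qquad R^*(\zeta) = r^*(\zeta)\bar R(\zeta),$$

with

$$r(\zeta) = \zeta^{-1}\,\frac{(q^{2N}z^{-1}; q^{2N})_\infty\,(q^2 z; q^{2N})_\infty}{(q^{2N}z; q^{2N})_\infty\,(q^2 z^{-1}; q^{2N})_\infty}, \qquad r^*(\zeta) = -\zeta^{-1}\,\frac{(q^{2N}z^{-1}; q^{2N})_\infty\,(q^{2N-2} z; q^{2N})_\infty}{(q^{2N}z; q^{2N})_\infty\,(q^{2N-2} z^{-1}; q^{2N})_\infty}.$$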

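In (8)–(10) we use the following notation: for $v_i \otimes v_j \in V_{\zeta_1} \otimes V_{\zeta_2}$ and $v_{j'} \otimes v_{i'} \in V_{\zeta_2} \otimes V_{\zeta_1}$, the equation $v_i \otimes v_j = v_{j'} \otimes v_{i'}$ means $v_i = v_{i'}$ and $v_j = v_{j'}$. This is for the sake of simplifying the description of the equations. Thus in terms of components (8), (9) and (10) are written as

$$R(\zeta_1/\zeta_2)^{\varepsilon_1'\varepsilon_2'}_{\varepsilon_1\varepsilon_2}\,\Phi_{\varepsilon_1'}(\zeta_1)\Phi_{\varepsilon_2'}(\zeta_2) = \Phi_{\varepsilon_2}(\zeta_2)\Phi_{\varepsilon_1}(\zeta_1), \qquad (11)$$
$$R^*(\xi_1/\xi_2)^{\mu_1\mu_2}_{\mu_1'\mu_2'}\,\Psi^*_{\mu_2'}(\xi_2)\Psi^*_{\mu_1'}(\xi_1) = \Psi^*_{\mu_1}(\xi_1)\Psi^*_{\mu_2}(\xi_2), \qquad (12)$$
$$\Phi_\varepsilon(\zeta)\Psi^*_\mu(\xi) = \tau(\zeta/\xi)\,\Psi^*_\mu(\xi)\Phi_\varepsilon(\zeta). \qquad (13)$$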

Let $\sigma$ be the automorphism of $U_q(\widehat{sl}_N)$ induced by the Dynkin diagram automorphism,

$$\sigma(e_i) = e_{i+1}, \qquad \sigma(f_i) = f_{i+1}, \qquad \sigma(h_i) = h_{i+1}.$$

The indices should be read modulo N. The automorphism $\sigma$ induces the linear automorphism of $V_\zeta$, the linear isomorphism between the left highest weight modules $V(\Lambda_i)$ and $V(\Lambda_{i+1})$, and the linear isomorphism between the right highest weight modules $V(\Lambda_i)^*$ and $V(\Lambda_{i+1})^*$ by $\sigma(v_j) = v_{j+1}$, $\sigma(|\Lambda_i\rangle) = |\Lambda_{i+1}\rangle$, $\sigma(\langle\Lambda_i|) = \langle\Lambda_{i-1}|$, with the properties $\sigma(Xv) = \sigma(X)\sigma(v)$ and $\sigma(v^*X) = \sigma(v^*)\sigma^{-1}(X)$ for $X \in U_q(\widehat{sl}_N)$, $v \in V(\Lambda_i)$, $v^* \in V(\Lambda_i)^*$. Then the intertwining operators satisfy the following relations:

$$\Phi^{(i)}(\zeta) = (\sigma \otimes \sigma)\Phi^{(i-1)}(\zeta)\sigma^{-1}, \qquad \Psi^{*(i)}(\xi) = \sigma\Psi^{*(i-1)}(\xi)(\sigma^{-1} \otimes \sigma^{-1}). \qquad (14)$$

In terms of components these are

$$\Phi^{(i)}_\varepsilon(\zeta) = \sigma\Phi^{(i-1)}_{\varepsilon-1}(\zeta)\sigma^{-1}, \qquad \Psi^{*(i)}_\mu(\xi) = \sigma\Psi^{*(i-1)}_{\mu-1}(\xi)\sigma^{-1}.$$

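These relations are proved by checking the intertwining properties and the normalization conditions of the right hand side of the equations using the relation $(\sigma \otimes \sigma)\Delta = \Delta\sigma$. The R matrix $\bar R(\zeta)$ is also invariant with respect to $\sigma$:

$$\bar R(\zeta)^{\sigma(i)\sigma(j)}_{\sigma(i')\sigma(j')} = \bar R(\zeta)^{ij}_{i'j'},$$

where $\sigma(i) = i+1$ ($0 \le i \le N-2$), $\sigma(N-1) = 0$.

4. Principal-Homogeneous Correspondence

We shall give relations between the intertwining operators defined in the previous section and those in [10,2]. Let $V^{(h)}_z = \oplus_{j=0}^{N-1}\mathbb{C}v_j$ be the homogeneous evaluation module given by

$$f_i v_j = z^{-\delta_{i0}}\delta_{i\,j+1}\, v_{j+1}, \qquad e_i v_j = z^{\delta_{i0}}\delta_{ij}\, v_{j-1}, \qquad t_i v_j = q^{-\delta_{ij}+\delta_{i\,j+1}} v_j.$$

The map $V_\zeta \longrightarrow V^{(h)}_z$ given by $v_i \mapsto v_i\zeta^i$ commutes with the action of $U'_q(\widehat{sl}_N)$, where $z = \zeta^N$.

Let $\tilde\Phi^{\Lambda_i V}_{\Lambda_{i+1}}(z)$ and $\tilde\Phi^{V^*\Lambda_{i+1}}_{\Lambda_i}(z)$ be the intertwining operators in [10]:

$$\tilde\Phi^{\Lambda_i V}_{\Lambda_{i+1}}(z) : V(\Lambda_{i+1}) \longrightarrow V(\Lambda_i) \otimes V^{(h)}_z, \qquad \tilde\Phi^{V^*\Lambda_{i+1}}_{\Lambda_i}(z) : V(\Lambda_i) \longrightarrow V^{(h)*}_z \otimes V(\Lambda_{i+1}).$$

We set

$$\tilde\Phi^{h(i)}(z) = \tilde\Phi^{\Lambda_i V}_{\Lambda_{i+1}}(z), \qquad \tilde\Phi^{V^*\Lambda_{i+1}}_{\Lambda_i}(z) = \sum_{j=0}^{N-1} v_j^* \otimes \tilde\Psi^{*h(i)}_j(z),$$

where $\{v_j^*\}$ is the dual basis to $\{v_j\}$, $\langle v_i, v_j^*\rangle = \delta_{ij}$. Then

$$\Phi^{(i)}_j(\zeta) = \zeta^{i-j}\,\tilde\Phi^{h(i)}_j(\zeta^N), \qquad \Psi^{*(i)}_j(\zeta) = \zeta^{j-i}\,\tilde\Psi^{*h(i)}_j(\zeta^N),$$

where $0 \le i, j \le N-1$. We remark that the dual representation $V^*$ in $\tilde\Phi^{V^*\Lambda_{i+1}}_{\Lambda_i}(z)$ in [10] is taken with respect to the inverse of the antipode. Let $\bar R^{(h)}(z) = \bar R_{V^{(1)}V^{(1)}}(z)$ be the R matrix in [2]. Then

$$\bar R(\zeta)^{ij}_{i'j'} = \bar R^{(h)}(\zeta^N)^{ij}_{i'j'}\,\zeta^{i-i'}.$$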
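The explicit components (7) of $\bar R(\zeta)$ lend themselves to a numerical check. The sketch below is illustrative only and not part of the paper: the helper names are hypothetical, the R matrix is stored as a map from an input pair $(j,k)$ to its output components, and it is assumed (as a reading of (7)) that $v_j \otimes v_k$ is sent to $b(\zeta)\,v_j \otimes v_k + c_{jk}(\zeta)\,v_k \otimes v_j$ for $j \ne k$. It measures the defect of the Yang–Baxter equation $\bar R_{12}\bar R_{13}\bar R_{23} = \bar R_{23}\bar R_{13}\bar R_{12}$ on $V^{\otimes 3}$ and of the $\sigma$-invariance stated in Sect. 3.

```python
from itertools import product


def rbar(N, q, zeta):
    """Components of R-bar(zeta): R[(j, k)] maps output pairs to coefficients."""
    zN = zeta ** N
    denom = 1 - q * q * zN
    b = q * (1 - zN) / denom
    R = {}
    for j in range(N):
        for k in range(N):
            if j == k:
                R[(j, j)] = {(j, j): 1.0}
            else:
                theta = 1 if k - j >= 0 else 0
                c_jk = (1 - q * q) / denom * zeta ** (N * theta + j - k)
                # Assumed convention: input v_j (x) v_k, outputs (j,k) and (k,j).
                R[(j, k)] = {(j, k): b, (k, j): c_jk}
    return R


def apply(R, vec, s, t):
    """Apply R to tensor slots s and t of a sparse vector {index tuple: coeff}."""
    out = {}
    for idx, coeff in vec.items():
        for (i2, j2), r in R[(idx[s], idx[t])].items():
            new = list(idx)
            new[s], new[t] = i2, j2
            key = tuple(new)
            out[key] = out.get(key, 0.0) + coeff * r
    return out


def ybe_defect(N, q, z1, z2, z3):
    """Largest entry of R12 R13 R23 - R23 R13 R12 over all basis vectors."""
    R12, R13, R23 = rbar(N, q, z1 / z2), rbar(N, q, z1 / z3), rbar(N, q, z2 / z3)
    worst = 0.0
    for basis in product(range(N), repeat=3):
        v = {basis: 1.0}
        lhs = apply(R12, apply(R13, apply(R23, v, 1, 2), 0, 2), 0, 1)
        rhs = apply(R23, apply(R13, apply(R12, v, 0, 1), 0, 2), 1, 2)
        for key in set(lhs) | set(rhs):
            worst = max(worst, abs(lhs.get(key, 0.0) - rhs.get(key, 0.0)))
    return worst


def sigma_defect(N, q, zeta):
    """Largest violation of invariance under the cyclic shift sigma(i) = i + 1 mod N."""
    R = rbar(N, q, zeta)
    worst = 0.0
    for (j, k), row in R.items():
        for (j2, k2), val in row.items():
            shifted = R[((j + 1) % N, (k + 1) % N)][((j2 + 1) % N, (k2 + 1) % N)]
            worst = max(worst, abs(shifted - val))
    return worst
```

Both defects should be zero up to rounding for generic parameters. Swapping the roles of input and output indices transposes the matrix, which leaves both checks unaffected, so this sketch cannot by itself decide the index convention.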


5. Trace of Intertwining Operators

In order to normalize the trace of intertwining operators we first introduce scalar functions which satisfy some functional equations. For complex numbers $p_1, \cdots, p_k$ such that $|p_i| < 1$ for any $i$, we define

$$(z; p_1, \cdots, p_k)_\infty = \prod_{r_1,\cdots,r_k=0}^{\infty} (1 - p_1^{r_1}\cdots p_k^{r_k} z).$$

We assume $|q| < 1$, $|x| < 1$ and set

$$\{z\} = (z; q^{2N}, x^N)_\infty, \qquad h^{(\sigma)}(z|x) = \frac{\{q^{1+\sigma}x^N z^{-1}\}\{q^{1+\sigma}z\}}{\{q^{2N-1+\sigma}x^N z^{-1}\}\{q^{2N-1+\sigma}z\}},$$

where $\sigma = 0, \pm 1$. Let us define

$$\bar F(z|u|x) = \prod_{a<b} h^{(+)}\Big(\frac{z_b}{z_a}\Big|x\Big)\,\Big(\prod_{a,b} h^{(0)}\Big(\frac{u_a}{z_b}\Big|x\Big)\Big)^{-1}\,\prod_{a<b} h^{(-)}\Big(\frac{u_a}{u_b}\Big|x\Big),$$
and

a,b

(16)

a
Q

(15)

F (ζ |ξ |x) = F¯ (z|u|x) Q

N −1

a,b θx (−ξb /ζa )

a
Q

a
.

The function $F$ satisfies the following equations:

$$F(\cdots, \zeta_{j+1}, \zeta_j, \cdots|\xi|x) = r(\zeta_j/\zeta_{j+1})\,F(\zeta|\xi|x),$$
$$F(\zeta|\cdots, \xi_{j+1}, \xi_j, \cdots|x) = r^*(\xi_j/\xi_{j+1})\,F(\zeta|\xi|x),$$
$$F(x^{-1}\zeta_1, \cdots, \zeta_m|\xi|x) = \prod_{j=1}^{n}\tau(\xi_j/\zeta_1)\,F(\zeta_2, \cdots, \zeta_m, \zeta_1|\xi|x),$$
$$F(\zeta|x\xi_1, \cdots, \xi_n|x) = \prod_{j=1}^{m}\tau(\xi_1/\zeta_j)\,F(\zeta|\xi_2, \cdots, \xi_n, \xi_1|x).$$

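For complex numbers $y_1, \ldots, y_{N-1}$ let us set $y^{\pm H} = \prod_{j=1}^{N-1} y_j^{\pm h_j}$. Define the normalized trace function as

$$G^{(i)}(\zeta|\xi|x, y) = \frac{\mathrm{tr}_{V(\Lambda_i)}\big(x^D y^H \Phi(\zeta_1)\cdots\Phi(\zeta_m)\Psi^*(\xi_n)\cdots\Psi^*(\xi_1)\big)}{F(\zeta|\xi|x)}, \qquad (17)$$

and set

$$G(\zeta|\xi|x, y) = \sum_{i=0}^{N-1} G^{(i)}(\zeta|\xi|x, y). \qquad (18)$$

These functions take values in $\mathrm{Hom}_{\mathbb{C}}(V^{\otimes n}, V^{\otimes m})$. We define the components of $G$ by

$$G(\zeta|\xi|x, y)(v_{\mu_1} \otimes \cdots \otimes v_{\mu_n}) = \sum_{\varepsilon_1,\cdots,\varepsilon_m} G(\zeta|\xi|x, y)^{\varepsilon_1,\cdots,\varepsilon_m}_{\mu_1,\cdots,\mu_n}\, v_{\varepsilon_1} \otimes \cdots \otimes v_{\varepsilon_m}.$$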


By the functional equations of $F$ and the commutation relations of intertwining operators the function $G$ satisfies the following system of equations:

$$\bar R_{i\,i+1}(\zeta_i/\zeta_{i+1})\,G(\cdots, \zeta_i, \zeta_{i+1}, \cdots|\xi|x, y) = G(\cdots, \zeta_{i+1}, \zeta_i, \cdots|\xi|x, y),$$
$$G(\zeta|\cdots, \xi_i, \xi_{i+1}, \cdots|x, y)\,\bar R_{i\,i+1}(\xi_i/\xi_{i+1}) = G(\zeta|\cdots, \xi_{i+1}, \xi_i, \cdots|x, y),$$
$$G(x^{-1}\zeta_1, \zeta_2, \cdots, \zeta_m|\xi|x, y) = (y^{-H})_{\zeta_1}\,G(\zeta_2, \cdots, \zeta_m, \zeta_1|\xi|x, y),$$
$$G(\zeta|x\xi_1, \xi_2, \cdots, \xi_n|x, y) = G(\zeta|\xi_2, \cdots, \xi_n, \xi_1|x, y)\,(y^{-H})_{\xi_1},$$

where $\bar R_{ij}(\zeta_i/\zeta_j)$ acts nontrivially on the component $V_{\zeta_i} \otimes V_{\zeta_j}$ in $V_{\zeta_1} \otimes \cdots \otimes V_{\zeta_m}$, $\bar R_{ij}(\xi_i/\xi_j)$ acts nontrivially on the component $V_{\xi_i} \otimes V_{\xi_j}$ in $V_{\xi_1} \otimes \cdots \otimes V_{\xi_n}$, and $(y^{-H})_{\zeta_1}$ means that $y^{-H}$ acts on the component $V_{\zeta_1}$, etc. In these equations we use the same notation as in Eqs. (8)–(10) to avoid the use of permutation operators (see the comment before (11)–(13)). As a consequence of these equations $G$ satisfies

$$G(\cdots, x^{-1}\zeta_i, \cdots|\xi|x, y) = K^{(1)}_i(\zeta_1, \cdots, \zeta_m|x, y)\,G(\zeta|\xi|x, y), \qquad (19)$$
$$G(\zeta|\cdots, x\xi_i, \cdots|x, y) = G(\zeta|\xi|x, y)\,K^{(2)}_i(\xi_1, \cdots, \xi_n|x, y), \qquad (20)$$

where

$$K^{(1)}_i(\zeta_1, \cdots, \zeta_m|x, y) = \bar R_{i\,i-1}(x^{-1}\zeta_i/\zeta_{i-1})\cdots\bar R_{i1}(x^{-1}\zeta_i/\zeta_1)\,(y^{-H})_{\zeta_i}\,\bar R_{im}(\zeta_i/\zeta_m)\cdots\bar R_{i\,i+1}(\zeta_i/\zeta_{i+1}),$$
$$K^{(2)}_i(\xi_1, \cdots, \xi_n|x, y) = \bar R_{i\,i+1}(\xi_i/\xi_{i+1})\cdots\bar R_{in}(\xi_i/\xi_n)\,(y^{-H})_{\xi_i}\,\bar R_{i1}(x\xi_i/\xi_1)\cdots\bar R_{i\,i-1}(x\xi_i/\xi_{i-1}).$$

Note that

$${}^tK^{(2)}_i(\zeta_1, \cdots, \zeta_m|x, y) = K^{(1)}_i(\zeta_1, \cdots, \zeta_m|x^{-1}, y).$$

If we denote $x^N = q^{2(k+N)}$ then the corresponding equations (19) and the transpose of (20) are the qKZ equations of level $-k-2N$ and level $k$ respectively. From (14), $G^{(i)}$ and $G$ satisfy

$$G^{(i+1)}(\zeta|\xi|x, y_1, \cdots, y_{N-1})^{\sigma(\varepsilon_1),\cdots,\sigma(\varepsilon_m)}_{\sigma(\mu_1),\cdots,\sigma(\mu_n)} = y_1\,G^{(i)}(\zeta|\xi|x, y_1^{-1}y_2, \cdots, y_1^{-1}y_{N-1}, y_1^{-1})^{\varepsilon_1,\cdots,\varepsilon_m}_{\mu_1,\cdots,\mu_n}, \qquad (21)$$
$$G(\zeta|\xi|x, y_1, \cdots, y_{N-1}) = y_1\,G(\zeta|\xi|x, y_1^{-1}y_2, \cdots, y_1^{-1}y_{N-1}, y_1^{-1}). \qquad (22)$$

We define $\bar G^{(i)}(\zeta|\xi|x, y)$ by a formula similar to (17), where $F(\zeta|\xi|x)$ is replaced by $\bar F(z|u|x)$. Then define the function $\bar G(\zeta|\xi|x, y)$ similarly to (18). The function $\bar G$ satisfies

$$\Big(\frac{\zeta_{i+1}}{\zeta_i}\Big)^{N-1}\bar R_{i\,i+1}(\zeta_i/\zeta_{i+1})\,\bar G(\cdots, \zeta_i, \zeta_{i+1}, \cdots|\xi|x, y) = \bar G(\cdots, \zeta_{i+1}, \zeta_i, \cdots|\xi|x, y), \qquad (23)$$
$$\bar G(\zeta|\cdots, \xi_i, \xi_{i+1}, \cdots|x, y)\,\Big(\frac{\xi_i}{\xi_{i+1}}\Big)^{N-1}\bar R_{i\,i+1}(\xi_i/\xi_{i+1}) = \bar G(\zeta|\cdots, \xi_{i+1}, \xi_i, \cdots|x, y), \qquad (24)$$
$$\bar G(x^{-1}\zeta_1, \zeta_2, \cdots, \zeta_m|\xi|x, y) = \bar G(\zeta_2, \cdots, \zeta_m, \zeta_1|\xi|x, y)\,\prod_{j=1}^{n}\Big(\frac{\zeta_1}{\xi_j}\Big)^{N-1},$$
$$\bar G(\zeta|x\xi_1, \xi_2, \cdots, \xi_n|x, y) = \bar G(\zeta|\xi_2, \cdots, \xi_n, \xi_1|x, y)\,\prod_{j=1}^{m}\Big(\frac{\zeta_j}{\xi_1}\Big)^{N-1}.$$

The functions $\bar G^{(i)}$ and $\bar G$ also satisfy (21) and (22) respectively. As we shall see later, the scalar function $\bar F(z|u|x)$ naturally appears in the integral formula of the trace through the boson calculus (cf. §10, Appendix D). That is why we consider $\bar G^{(i)}$ and $\bar G$ in what follows.

Up to now we have not discussed in which sense the trace (17) exists, nor the validity of applying the commutation relations (8), (9) and (10) inside the trace. By definition the trace (17) exists as a formal power series in $x$ whose coefficients are finite sums of matrix elements of the intertwining operator $\Phi(\zeta_1)\cdots\Psi^*(\xi_1)$. It is known that these matrix elements, which are originally defined as series in $\zeta$ and $\xi$, are analytically continued to give meromorphic functions on $(\mathbb{C}^*)^{n+m}$, where $\mathbb{C}^* = \mathbb{C}\backslash\{0\}$ is the algebraic torus. In fact, as we show in Sect. 10, the series in $x$ can be summed up explicitly to express the trace (17) as a meromorphic function on $(\mathbb{C}^*)^{n+m}$. Since the commutation relations (8), (9) and (10) hold in the sense of analytically continued matrix elements, they are applicable to the series expression of the trace in $x$ and hence to the meromorphic expression of the trace.

6. Completeness of Trace Function

In this section we assume $n = m$ and study the determinant of $\bar G$. Let $k_0, \ldots, k_{N-1}$ be non-negative integers satisfying $k_0 + \cdots + k_{N-1} = n$ and

$$(V^{\otimes n})_{k_0,\cdots,k_{N-1}} = \{v \in V^{\otimes n} \mid t_i v = q^{k_{i-1}-k_i} v,\ 1 \le i \le N-1\}$$

the weight subspace of $V^{\otimes n}$ with respect to $U_q(sl_N)$. From the definition of the trace and the intertwining operators, $\bar G$ commutes with the action of $t_i$ ($1 \le i \le N-1$). Therefore the determinant of $\bar G$ is the product of the determinants taken at each weight subspace. In what follows we fix a set of numbers $k_0, \ldots, k_{N-1}$. The determinant is always taken at the weight subspace $(V^{\otimes n})_{k_0,\cdots,k_{N-1}}$.

Theorem 1. We assume $|x| < 1$. Then $\det\bar G(\zeta|\xi|x, y)$ does not vanish identically as a function of the $\zeta_i$'s, $\xi_j$'s, $x$, $q$ and $y_k$'s. Moreover $\det\bar G(\zeta|\xi|x, 1)$ does not vanish identically. Here $y = 1$ means that all $y_i = 1$.

We say that $(x, q, y) = (x, q, y_1, \cdots, y_{N-1})$ is generic if $\det\bar G(\zeta|\xi|x, y)$ does not vanish identically as a function of the $\zeta_i$'s and $\xi_j$'s. For fixed $(x, q, y)$ we say that the set of complex numbers $\zeta_1, \ldots, \zeta_n$ is generic if $\det\bar G(\zeta|\xi|x, y)$ does not vanish identically as a function of the $\xi_j$'s. Similarly, for fixed $(x, q, y)$ we say that the set of complex numbers $\xi_1, \ldots, \xi_n$ is generic if $\det\bar G(\zeta|\xi|x, y)$ does not vanish identically as a function of the $\zeta_i$'s.


Corollary 1. Suppose that $(x, q, y)$ is generic. The map (5) is an isomorphism for generic values of $\zeta_1, \ldots, \zeta_n$ and the map (6) is an isomorphism for generic values of $\xi_1, \ldots, \xi_n$.

In order to prove Theorem 1 we first express the determinant of $\bar G$ using a single scalar function. Let us set

$$f(\zeta|\xi|x, y) = \bar G(\zeta|\xi|x, y)^{0^{k_0}\cdots(N-1)^{k_{N-1}}}_{0^{k_0}\cdots(N-1)^{k_{N-1}}},$$

where $(0^{k_0}\cdots(N-1)^{k_{N-1}})$ means $\varepsilon = (\varepsilon_1, \cdots, \varepsilon_n)$ such that $\varepsilon_1 \le \cdots \le \varepsilon_n$ and the number of $i$ in $\varepsilon$ is $k_i$. We call $f$ the extremal component of $\bar G$. From (23) and (24) we have, for $l > k$,

$$\bar G(\cdots\zeta_i, \zeta_{i+1}\cdots|\xi|x, y)^{\cdots lk\cdots}_{\mu} = a^{(3)}_{kl}(\zeta_i/\zeta_{i+1})\,\bar G(\cdots\zeta_i, \zeta_{i+1}\cdots|\xi|x, y)^{\cdots kl\cdots}_{\mu} + a^{(1)}(\zeta_i/\zeta_{i+1})\,\bar G(\cdots\zeta_{i+1}, \zeta_i\cdots|\xi|x, y)^{\cdots kl\cdots}_{\mu}, \qquad (25)$$
$$\bar G(\zeta|\cdots\xi_i, \xi_{i+1}\cdots|x, y)^{\varepsilon}_{\cdots lk\cdots} = a^{(4)}_{kl}(\xi_i/\xi_{i+1})\,\bar G(\zeta|\cdots\xi_i, \xi_{i+1}\cdots|x, y)^{\varepsilon}_{\cdots kl\cdots} + a^{(2)}(\xi_i/\xi_{i+1})\,\bar G(\zeta|\cdots\xi_{i+1}, \xi_i\cdots|x, y)^{\varepsilon}_{\cdots kl\cdots}, \qquad (26)$$

where

$$a^{(1)}(\zeta) = \frac{q^{-1}\zeta^{N-1}(1-q^2\zeta^N)}{1-\zeta^N}, \qquad a^{(2)}(\zeta) = \frac{q^{-1}\zeta^{-N+1}(1-q^2\zeta^N)}{1-\zeta^N},$$
$$a^{(3)}_{kl}(\zeta) = -\frac{q^{-1}\zeta^{N+k-l}(1-q^2)}{1-\zeta^N}, \qquad a^{(4)}_{kl}(\zeta) = -\frac{q^{-1}\zeta^{l-k}(1-q^2)}{1-\zeta^N}.$$
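The coefficients $a^{(1)}, \ldots, a^{(4)}$ can be related back to the R-matrix entries $b(\zeta)$ and $c_{jk}(\zeta)$ of (7). The sketch below is a numerical cross-check under an assumption that is ours and not stated in the text: that (25) and (26) arise by solving the component form of (23) and (24) for the $lk$ component, which gives $a^{(1)} = \zeta^{N-1}/b$, $a^{(2)} = \zeta^{1-N}/b$, $a^{(3)}_{kl} = -c_{kl}/b$ and $a^{(4)}_{kl} = -c_{lk}/b$ for $l > k$. All function names are hypothetical.

```python
def b_coeff(zeta, q, N):
    """b(zeta) = q (1 - zeta^N) / (1 - q^2 zeta^N) from (7)."""
    return q * (1 - zeta**N) / (1 - q * q * zeta**N)


def c_coeff(j, k, zeta, q, N):
    """c_jk(zeta) from (7), for j != k."""
    theta = 1 if k - j >= 0 else 0
    return (1 - q * q) / (1 - q * q * zeta**N) * zeta ** (N * theta + j - k)


def a1(zeta, q, N):
    return zeta ** (N - 1) * (1 - q * q * zeta**N) / (q * (1 - zeta**N))


def a2(zeta, q, N):
    return zeta ** (1 - N) * (1 - q * q * zeta**N) / (q * (1 - zeta**N))


def a3(k, l, zeta, q, N):
    return -zeta ** (N + k - l) * (1 - q * q) / (q * (1 - zeta**N))


def a4(k, l, zeta, q, N):
    return -zeta ** (l - k) * (1 - q * q) / (q * (1 - zeta**N))


def max_mismatch(N, q, zeta):
    """Largest deviation from the assumed identities over all pairs k < l."""
    b = b_coeff(zeta, q, N)
    worst = max(abs(a1(zeta, q, N) - zeta ** (N - 1) / b),
                abs(a2(zeta, q, N) - zeta ** (1 - N) / b))
    for k in range(N):
        for l in range(k + 1, N):
            worst = max(worst,
                        abs(a3(k, l, zeta, q, N) + c_coeff(k, l, zeta, q, N) / b),
                        abs(a4(k, l, zeta, q, N) + c_coeff(l, k, zeta, q, N) / b))
    return worst
```

The check is purely algebraic, so it holds for any $q$ and $\zeta$ away from the zeros of $1 - \zeta^N$ and $1 - q^2\zeta^N$.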

Using these equations it is possible to express any component of $\bar G$ in terms of $f$. To this end let us introduce the lexicographical order on the set of $\varepsilon = (\varepsilon_1, \cdots, \varepsilon_n)$'s such that $\sharp\{j \mid \varepsilon_j = i\} = k_i$, comparing from left to right. The minimal element is $(0^{k_0}\cdots(N-1)^{k_{N-1}})$. It is convenient to associate $M = (M_0, \cdots, M_{N-1})$, $M_i \in \{1, \cdots, n\}^{k_i}$, with $\varepsilon = (\varepsilon_1, \cdots, \varepsilon_n)$ by the rule

$$M_i = \{m^{(i)}_1, \cdots, m^{(i)}_{k_i}\} = \{j \mid \varepsilon_j = i\}, \qquad m^{(i)}_1 < \cdots < m^{(i)}_{k_i}.$$

We freely use both the $M$ and the $\varepsilon$ notations to specify components of $\bar G$. In the $M$ notation we denote the minimal element by $M^0$: $M^0 = (M^0_0, \cdots, M^0_{N-1})$. For $M = (M_0, \cdots, M_{N-1})$ we set

$$(\zeta_{M_0}, \cdots, \zeta_{M_{N-1}}) = (\zeta_{m^{(0)}_1}, \cdots, \zeta_{m^{(N-1)}_{k_{N-1}}}).$$

Then

$$\bar G(\zeta|\xi|x, y)^{M}_{L} = \sum_{M' \le M} a^{MM'}\,\bar G(\zeta_{M'_0}, \cdots, \zeta_{M'_{N-1}}|\xi|x, y)^{M^0}_{L},$$

for some $a^{MM'}$ with

$$a^{MM} = \prod_{r>l}\ \prod_{a \in M_r,\ b \in M_l,\ a<b} a^{(1)}(\zeta_a/\zeta_b).$$

Similarly

$$\bar G(\zeta|\xi|x, y)^{M}_{L} = \sum_{L' \le L} b^{LL'}\,\bar G(\zeta|\xi_{L'_0}, \cdots, \xi_{L'_{N-1}}|x, y)^{M}_{L^0},$$

for some $b^{LL'}$ with

$$b^{LL} = \prod_{r>l}\ \prod_{a \in L_r,\ b \in L_l,\ a<b} a^{(2)}(\xi_a/\xi_b).$$

These equations mean that the matrix $\bar G$ and the matrix whose components consist of $f$ with permuted variables are connected by the product of two triangular matrices with diagonal components $(a^{MM})_M$ and $(b^{MM})_M$ respectively. Thus we have

Proposition 1.

$$\det\big(\bar G(\zeta|\xi|x, y)^{\varepsilon}_{\mu}\big) = \prod_{i<j}\Big(a^{(1)}(\zeta_i/\zeta_j)\,a^{(2)}(\xi_i/\xi_j)\Big)^{n(k_0,\cdots,k_{N-1})} \times \det\Big(f(\zeta_{M_0}, \cdots, \zeta_{M_{N-1}}|\xi_{L_0}, \cdots, \xi_{L_{N-1}}|x, y)\Big)_{M,L},$$

where

$$n(k_0, \cdots, k_{N-1}) = \sum_{0 \le l < r \le N-1} n_{l,r}(k_0, \cdots, k_{N-1}),$$
$$n_{l,r}(k_0, \cdots, k_{N-1}) = \prod_{j=0}^{N-2} {}_{n-2-\sum_{i=0}^{j-1}k'_i}C_{k'_j},$$
$$(k'_0, \cdots, k'_{N-1}) = (k_0, \cdots, k_l - 1, k_{l+1}, \cdots, k_{r-1}, k_r - 1, \cdots, k_{N-1}),$$

where the empty sum $\sum_{i=0}^{-1} k'_i$ should be understood as 0.
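The exponent in Proposition 1 and the size of the determinant are products of binomial coefficients and are easy to tabulate. The following is a small sketch, not from the paper: the function names are hypothetical, the sum is taken over pairs $0 \le l < r \le N-1$, and terms with $k_l = 0$ or $k_r = 0$ are skipped, which assumes the usual convention that a binomial coefficient with a negative lower index vanishes.

```python
from math import comb


def weight_space_dim(k):
    """Dimension of (V^{(x)n})_{k_0,...,k_{N-1}}: prod_a C(n - K_{a-1}, k_a)."""
    n = sum(k)
    dim, used = 1, 0
    for ka in k[:-1]:
        dim *= comb(n - used, ka)
        used += ka
    return dim


def exponent_n(k):
    """n(k_0,...,k_{N-1}) = sum over l < r of n_{l,r}(k_0,...,k_{N-1})."""
    N, n = len(k), sum(k)
    total = 0
    for l in range(N):
        for r in range(l + 1, N):
            if k[l] == 0 or k[r] == 0:
                continue  # assumed convention: such terms contribute 0
            kp = list(k)
            kp[l] -= 1
            kp[r] -= 1
            term, used = 1, 0
            for j in range(N - 1):
                term *= comb(n - 2 - used, kp[j])
                used += kp[j]
            total += term
    return total
```

For N = 2 the exponent reduces to ${}_{n-2}C_{k_0-1}$ and the dimension to ${}_nC_{k_0}$, which are the exponents appearing in the determinant equations of Sect. 7.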

By definition, at $x = 0$, $\bar G(\zeta|\xi|x, y)$ reduces to the matrix element:

$$\bar G(\zeta|\xi|0, y)^{\varepsilon}_{\mu} = \frac{1}{\bar F(z|u|0)} \times \sum_{i=0}^{N-1}\langle\Lambda_i|y^H\Phi_{\varepsilon_1}(\zeta_1)\cdots\Phi_{\varepsilon_m}(\zeta_m)\Psi^*_{\mu_n}(\xi_n)\cdots\Psi^*_{\mu_1}(\xi_1)|\Lambda_i\rangle.$$

By specializing the formula of Proposition 3 in Sect. 9 to $i = j$, $k_r = l_r$ ($0 \le r \le N-1$), multiplying by $y_i$ and summing over $i$ from 0 to $N-1$ we get


Proposition 2. f (ζ |ξ |0, y) = C

−1 Y r n (N−1)(1−a) N Y Y ζa ξa ξa ζ a 0

a=1

×

Y

r=0 a∈Mr

N −1 n Ki−1 (za − qub )(ua − qzb ) X Y ζa i Y ua y , i (za − q 2 zb )(ua − q 2 ub ) ξa za

Y

r
i=0

a=1

a=1

where Kj = k0 + · · · + kj , K−1 = 0, y0 = 1, C = (−1)

PN −2 r=0

(r+1)kr

q 2 KN −1 − 2 1

2

1

PN −2 r=0

kr2 +KN −2 kN −1

.

The empty product from 1 to 0 should be understood as one.

Proof of Theorem 1. It is sufficient to prove that

$$\det\Big(f(\zeta_{M_0}, \cdots, \zeta_{M_{N-1}}|\xi_{L_0}, \cdots, \xi_{L_{N-1}}|0, 1)\Big)_{M,L}$$

does not vanish identically, where 1 means $y_i = 1$ for any $i$. Let $P = \prod_{a=0}^{N-2} {}_{n-K_{a-1}}C_{k_a}$ be the size of the determinant. Set

$$f_1(\zeta_1, \cdots, \zeta_n) = \prod_{a=1}^{n}\zeta_a^{(N-1)(1-a)}\prod_{r=0}^{N-1}\prod_{a \in M^0_r}\zeta_a^{-r}, \qquad f_2(\xi_1, \cdots, \xi_n) = f_1(\xi_1, \cdots, \xi_n)^{-1}.$$

Then by Proposition 2 we have det f (ζM0 , · · · , ζMN −1 |ξL0 , · · · , ξLN −1 |0, 1) M,L Y = CP f1 (ζM0 , · · · , ζMN −1 )f2 (ξM0 , · · · , ξMN −1 ) M

×

YY

Y

(za − q 2 zb )−1 (ua − q 2 ub )−1 D(ζ |ξ |q),

M r
where D(ζ |ξ |q) = det D(ζ |ξ |q)M,L

M,L

and

D(ζ |ξ |q)M,L =

Yh r
Y

(za −qub )

a∈Mr ,b∈Ll

Y

j −1 n −1 Y i NX ζa Y ( )j ξa

(ua −qzb )

a∈Lr ,b∈Ml

j =0 a=1

r=0

Y a∈Mr

za−1

Y

ua .

a∈Lr

We consider the case q = 1 and ζj = ξj for any j . It is easy to see that if M is different from L then D(ζ |ζ |1)M,L = 0. Hence the matrix D(ζ |ζ |1)M,L M,L is a diagonal matrix and YY Y Y D(ζ |ζ |1)M,M = N P (za − zb )2 . det D(ζ |ζ |1)M,L M,L = M

This completes the proof. □

M r

7. Determinant Formula at N = 2 and $x = q^2$

Let us give examples of the explicit formulae for the determinants of $\bar G$. To understand the general structure of the formula in the example below we first present the system of equations satisfied by the determinant and give one solution of it. By taking the determinant of the equations of $\bar G$ at the weight space $(V^{\otimes n})_{k_0,k_1}$ we have

$$\Big(\frac{\zeta_{i+1}}{\zeta_i}\Big)^{{}_nC_{k_0}}\Big(\frac{z_i - q^2z_{i+1}}{z_{i+1} - q^2z_i}\Big)^{{}_{n-2}C_{k_0-1}}\det\bar G(\cdots, \zeta_i, \zeta_{i+1}, \cdots|\xi|x, y) = \det\bar G(\cdots, \zeta_{i+1}, \zeta_i, \cdots|\xi|x, y),$$
$$\Big(\frac{\xi_i}{\xi_{i+1}}\Big)^{{}_nC_{k_0}}\Big(\frac{u_i - q^2u_{i+1}}{u_{i+1} - q^2u_i}\Big)^{{}_{n-2}C_{k_0-1}}\det\bar G(\zeta|\cdots, \xi_i, \xi_{i+1}, \cdots|x, y) = \det\bar G(\zeta|\cdots, \xi_{i+1}, \xi_i, \cdots|x, y),$$
$$\det\bar G(x^{-1}\zeta_1, \zeta_2, \cdots, \zeta_n|\xi|x, y) = \det\bar G(\zeta_2, \cdots, \zeta_n, \zeta_1|\xi|x, y)\,(-1)^{(n-1)\,{}_{n-2}C_{k_0-1}}\Big(\prod_{j=1}^{n}\frac{\zeta_1}{\xi_j}\Big)^{{}_nC_{k_0}},$$
$$\det\bar G(\zeta|x\xi_1, \xi_2, \cdots, \xi_n|x, y) = \det\bar G(\zeta|\xi_2, \cdots, \xi_n, \xi_1|x, y)\,(-1)^{(n-1)\,{}_{n-2}C_{k_0-1}}\Big(\prod_{j=1}^{n}\frac{\zeta_j}{\xi_1}\Big)^{{}_nC_{k_0}},$$

where zi = ζi2 , ui = ξi2 . If x = q 2 and y = 1, one solution to these systems of equations is given by Q(ζ |ξ ) = n Ck n−2 Ck −1   0 0 n n n Y Y Y ξj n−1 Y zk 0 − q 2 zk ξj j −1 ξj   ( ) θq 2 (−  ( ) ) . ζj u − q 2 uk 0 ζj ζj 0 k j =1

k
j =1

j =1

Any other meromorphic solution of the equation is obtained by multiplying Q(ζ |ξ ) by a meromorphic function which is symmetric and q 2 periodic in ζi ’s and ξi ’s respectively. Example 1. We consider the case of k0 = 0. The following formula is from [8]: 2 ¯ 1 , · · · , ζn |ξ1 , · · · , ξn |x, y)−···− G(ζ −···− = (x )∞

n Y ξj j j =1

ζj

θx (−y

n Y ζj ). ξj

j =1


Example 2. Let us consider the case x = q 2 , y = 1, n = 2, k0 = 1. Then Q2 ξj 2 Y ξj θq 2 (−q j =1 ζj ) ζ1 ξ1 +− 2 4 ¯ 1 , ζ2 |ξ1 , ξ2 |q , 1)+− = q(q )∞ ( ) u2 (1 − ), G(ζ ζj u1 − q 2 u2 ζ2 ξ2

(27)

j =1

and ¯ 1 , ζ2 |ξ1 , ξ2 |q 2 , 1)µ1 µ2 det G(ζ 1 2

 2 2 2j 2 2 Y Y ξj ξj z2 − q z1  . θ 2 −q = −q 2 (q 4 )2∞ ζj u1 − q 2 u2 q ζj j =1

(28)

j =1

Here (z)∞ = (z : q 4 )∞ and zi = ζi2 , ui = ξi2 . This formula is calculated using the integral formula in Appendix A and the technique found in [12]. ¯ From (22) for G ¯ Using (25), (26) it is possible to calculate other components of G. we know a priori that +− 2 ¯ ¯ |ξ |q 2 , 1)−+ G(ζ −+ = G(ζ |ξ |q , 1)+− ,

+− 2 ¯ |ξ |q 2 , 1)−+ ¯ G(ζ +− = G(ζ |ξ |q , 1)−+

¯ is given by Then the concrete expression for G ¯ |ξ |q 2 , 1)(v+ ⊗ v−) = (q 4 )∞ G(ζ

2 θ 2 (−q Y q ξj j =1

Q2

ξj j =1 ζj u1 −q 2 u2

ζj

)

ξ1 ζ1 ξ1 2 ζ1 ξ2 · u2 q(1− )v+ ⊗ v−− (1−q )v− ⊗ v+ , ζ2 ξ2 ξ2 ζ2 ξ1 Q2 ξj 2 θ 2 (−q Y q j =1 ζj ) ξj 2 4 ¯ |ξ |q , 1)(v− ⊗ v+ ) = (q )∞ G(ζ ζj u1 −q 2 u2 j =1 ζ1 ξ1 ξ1 2 ζ1 ξ2 · u2 − (1−q )v+ ⊗ v− + q(1− )v− ⊗ v+ . ξ2 ζ2 ξ1 ζ2 ξ2 8. Integral Formula for Matrix Elements Set

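$$\bar G^{(ij)}(\zeta|\xi)^{\varepsilon_1,\cdots,\varepsilon_m}_{\mu_1,\cdots,\mu_n} = \frac{\langle\Lambda_i|\Phi_{\varepsilon_1}(\zeta_1)\cdots\Phi_{\varepsilon_m}(\zeta_m)\Psi^*_{\mu_n}(\xi_n)\cdots\Psi^*_{\mu_1}(\xi_1)|\Lambda_j\rangle}{\bar F(z|u|0)},$$

where $\bar F(z|u|0)$ is defined by (15) and (16). We need to assume $j + n = i + m \bmod N$ for the matrix element to be well defined. Let $k_r = \sharp\{j \mid \varepsilon_j = r\}$, $l_r = \sharp\{j \mid \mu_j = r\}$ for $0 \le r \le N-1$. Then $m = \sum_{r=0}^{N-1}k_r$ and $n = \sum_{r=0}^{N-1}l_r$. The function $\bar G^{(ij)}(\zeta|\xi)^{\varepsilon}_{\mu}$ is zero unless

$$\sum_{r=1}^{n}\mathrm{wt}\,v_{\mu_r} + \Lambda_j = \sum_{r=1}^{m}\mathrm{wt}\,v_{\varepsilon_r} + \Lambda_i.$$

Since $\mathrm{wt}\,v_r = \Lambda_{r+1} - \Lambda_r$ this condition is written as

$$k_r - l_r = k_{r-1} - l_{r-1} + \delta_{r,i} - \delta_{r,j}, \qquad 0 \le r \le N-1, \qquad (29)$$

where we understand $k_{-1} = k_{N-1}$ and $l_{-1} = l_{N-1}$. In particular we have

$$m - n = j - i + Nr_0, \qquad r_0 := k_{N-1} - l_{N-1}.$$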

We assume the condition (29). We set

$$w^{(a)}_N = q^{N+1}z_a, \qquad v^{(b)}_N = q^{N+1}u_b$$

for the sake of convenience. Then

$$\bar G^{(ij)}(\zeta|\xi)^{\varepsilon_1,\cdots,\varepsilon_m}_{\mu_1,\cdots,\mu_n}$$

= C¯ (ij ) (, µ) Z ×

m Y

(N−1)(m−n+1−a)+j −a

ζa

a=1 (1) dw1 +1

(1) C +1 1

2πi

b=1

Z ···

n Y

(n) dvN −1

2π i

(n) C˜ N −1

Y

(N−1)(b−1)−j +µb

ξb

Y

(a)

(wj )−1

a;a ≤j −1

b;µb ≤j −1

(b)

vj

m N−1 Y Y

(a) (q −1 − q)wk × (a) (a) (a) −1 (a) a=1 k=a +1 (wk − q wk+1 )(wk − qwk+1 )

×

(b)

N −1 n Y Y b=1 k=µb +1

×

YY

(b)

(q −1 − q)vk+1 (b)

(b)

(b)

(vk − q −1 vk+1 )(vk − qvk+1 ) YY −1 1

(a)

(b)

(a)

(b)

wk − qwk−1 a
a
YY

1

Y Y

−1

(b) a
(b)

(a)

(b)

(a)

− q −1 vk−1 (a)

(vk −q−2 vk )(vk −vk ) (b) −1 v (a) v −q a
where the constants $\bar C^{(ij)}(\varepsilon, \mu)$ are given in Appendix E.

The integration variables are $w^{(a)}_k$, $a = 1, \cdots, m$, $k = \varepsilon_a+1, \cdots, N-1$, and $v^{(b)}_k$, $b = 1, \cdots, n$, $k = \mu_b+1, \cdots, N-1$. If $\varepsilon_a = N-1$ (resp. $\mu_b = N-1$), we understand that there is no integration variable of the form $w^{(a)}_k$ (resp. $v^{(b)}_k$). Thus the number of integration variables is

$$\sum_{r=0}^{N-2}(N-1-r)(k_r + l_r).$$

Each product in $a$, $b$, $k$ which appears in the integrand is over all possible values satisfying the conditions written in the product symbol.


The integration contours $C^{(a)}_k$ of $w^{(a)}_k$ and $\tilde C^{(b)}_k$ of $v^{(b)}_k$ are as follows.

The contour $C^{(a)}_k$ is a simple closed curve going round the origin in the anticlockwise direction such that $qw^{(b)}_{k\pm 1}$ (any $b$), $q^{\pm 1}v^{(b)}_k$ (any $b$) are inside and $q^{-1}w^{(b)}_{k\pm 1}$ (any $b$) are outside.

The contour $\tilde C^{(b)}_k$ is a simple closed curve going round the origin in the anticlockwise direction such that $q^{-1}v^{(a)}_{k\pm 1}$ (any $a$) are inside and $qv^{(a)}_{k\pm 1}$ (any $a$), $q^{\pm 1}w^{(a)}_k$ (any $a$) are outside.

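9. Integrated Formula for the Extremal Component

In this section we express the extremal component of the matrix element as a rational function by carrying out the integration of the integral formula. For $0 \le r \le N-1$ we define $K_r$, $L_r$ by

$$K_r = \sum_{s=0}^{r}k_s, \qquad L_r = \sum_{s=0}^{r}l_s,$$

and $K_r = L_r = 0$ for $r < 0$ or $r \ge N$. For a proposition $P$ we define $\theta(P) = 1$ if $P$ is true and $\theta(P) = 0$ otherwise. The variables are related by $z_r = \zeta_r^N$, $u_r = \xi_r^N$.

Proposition 3. We have

$$\bar G^{(ij)}(\zeta|\xi)^{0^{k_0}\cdots(N-1)^{k_{N-1}}}_{0^{l_0}\cdots(N-1)^{l_{N-1}}} = C^{(ij)}(k|l)\prod_{a=1}^{m}\zeta_a^{(N-1)(m-n+1-a)-\varepsilon_a+j}\prod_{b=1}^{n}\xi_b^{(N-1)(b-1)+\mu_b-j}\prod_{a=1}^{K_{j-1}}z_a^{-1}\prod_{b=1}^{L_{j-1}}u_b$$
$$\times\ \frac{\prod_{a=1}^{m}\prod_{b=1}^{n}(z_a - qu_b)^{\theta(\varepsilon_a<\mu_b)}(u_b - qz_a)^{\theta(\varepsilon_a>\mu_b)}}{\prod_{a,b=1}^{m}(z_a - q^2z_b)^{\theta(\varepsilon_a<\varepsilon_b)}\prod_{a,b=1}^{n}(u_a - q^2u_b)^{\theta(\mu_a<\mu_b)}},$$

where $\varepsilon = (0^{k_0}, \cdots, (N-1)^{k_{N-1}})$, $\mu = (0^{l_0}, \cdots, (N-1)^{l_{N-1}})$. The constant $C^{(ij)}(k|l)$ is given in Appendix E.

In [8] this formula is given for the case $N = 2$ and $mn = 0$. Let us explain how to derive this formula. First we carry out the integration in the variable $w$ in the order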
where = (0k0 , · · · , (N −1)kN −1 ), µ = (0l0 , · · · , (N −1)lN −1 ). The constant C (ij ) (k|l) is given in Appendix E. In [8] this formula is given for the case N = 2 and mn = 0. Let us explain how to derive this formula. First we carry out the integration in the variable w by the order (1)

(1)

(1)

(2)

(K

N −3 w1 → w2 → · · · → wN−1 → w1 → · · · → wN−1

(1)

+1)

(K

)

N −2 → · · · → wN −1 ,

(1)

that is, first in w1 , next in w2 , etc. After the integration in w we integrate in the variable v by the order (1)

(1)

(1)

(2)

(K

N −3 v1 → v2 → · · · → vN −1 → v1 → · · · → vN−1

(1)

+1)

(K

)

N −2 → · · · → vN −1 .

In the variable w1 the poles of the differential form in the integrand outside the con(1) (1) (1) tour C1 is only at w1 = q −1 w2 . It means in particular that there are no poles at infin(1) (1) (1) ity. Thus we can calculate the integral in w1 by taking the residue at w1 = q −1 w2 . (1) After taking this residue the integrand have the same structure in the variable w2 and

46

A. Nakayashiki

so on. Therefore the integrals in w’s are calculated by taking residues successively. After (1) calculating the integral in the variables w the poles of the integrand in the variable v1 (1) (1) (1) inside the contour C˜ 1 is only at v1 = q −1 v2 . Hence the integral is calculated by (1) (1) (1) taking the residue at v1 = q −1 v2 . After taking the residue in v1 the integrand has (1) the same structure in the variable v2 and so on. Thus the integral is calculated by substituting (a)

(q −1 − q)wk

(a) = q −1 , wk = q k+1 za , (a) (a) (a) −1 − qwk+1 )(wk − q wk+1 ) (b) (q −1 − q)vk+1 (b) = 1, vk = q k+1 ub , (b) (b) (b) (b) −1 (vk − qvk+1 )(vk − q vk+1 ) PN −2 integrand and multiplying it by (−1) r=0 (N−1−r)kr , which comes (a) (wk

into the the residue in w outside the contour.

from taking
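The two substitution values above are simply the residues of the corresponding rational factors at the poles $w_k = q^{-1}w_{k+1}$ and $v_k = q^{-1}v_{k+1}$. The following sketch is a numerical illustration, not the paper's method: the helper `residue` and the sample values are assumptions. It evaluates each residue by a trapezoidal sum over a small circle around the pole, chosen so that the second pole at $q\,w_{k+1}$ (resp. $q\,v_{k+1}$) stays outside.

```python
import cmath


def residue(f, pole, radius, samples=64):
    """Approximate (1/(2 pi i)) times the integral of f over a circle around pole."""
    total = 0
    for s in range(samples):
        phase = cmath.exp(2j * cmath.pi * s / samples)
        # dw = i r e^{i t} dt, so the integrand becomes f(w) r e^{i t} / samples.
        total += f(pole + radius * phase) * radius * phase
    return total / samples


def w_factor(w, w_next, q):
    """(q^-1 - q) w / ((w - q w_next)(w - q^-1 w_next))."""
    return (1 / q - q) * w / ((w - q * w_next) * (w - w_next / q))


def v_factor(v, v_next, q):
    """(q^-1 - q) v_next / ((v - q v_next)(v - q^-1 v_next))."""
    return (1 / q - q) * v_next / ((v - q * v_next) * (v - v_next / q))


def residues_at_poles(q, w_next, v_next):
    """Residues of the two factors at w = w_next / q and v = v_next / q."""
    rw = 0.1 * abs(1 / q - q) * abs(w_next)  # keeps the pole at q w_next outside
    rv = 0.1 * abs(1 / q - q) * abs(v_next)
    res_w = residue(lambda w: w_factor(w, w_next, q), w_next / q, rw)
    res_v = residue(lambda v: v_factor(v, v_next, q), v_next / q, rv)
    return res_w, res_v
```

Iterating $w_k = q^{-1}w_{k+1}$ down from $w_N = q^{N+1}z_a$ gives $w_k = q^{k+1}z_a$, in agreement with the substitution rule, and likewise for $v_k$.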

Example (m = n = 1 case). In this case i = j , 1 = µ1 , kr = lr for any r and r0 = 0. We consider the case i = 0. The integral formula is read as ¯ (00) (ζ1 |ξ1 ) = C (00) (|)ζ − ξ1 × I, G 1 Z I=

Z

(1)

dw+1 (1)

C+1

2πi

···

(1)

dwN −1 2π i

(1)

CN −1

Z

Z

(1)

dv+1 2π i

(1) C˜ +1

···

N −1 (1) Y (q −1 − q)wk × (1) (1) (1) −1 (1) k=+1 (wk − q wk+1 )(wk − qwk+1 ) k=+1

(1)

dvN −1

(1) C˜ N −1

N−1 Y

×

(1)

(1)

(wk − vk+1 )

k=+1

N Y k=+2

(1)

(1)

(vk−1 − wk )

2π i

(1)

N−1 Y

(1)

(q −1 − q)vk+1 (1)

(1)

(1)

(vk − q −1 vk+1 )(vk − qvk+1 )

N −1 Y

1

(1) k=+1 (wk

(1) (1) − qvk )(wk

(1)

− q −1 vk )

.

Here, by calculation, C (00) (|) = 1. Let us denote the integrand of I by J . Consider (1) (1) (1) the integral in w+1 . By definition of the contour C+1 , q −1 w+2 is outside and all other (1)

(1)

poles on the complex plane are inside of C+1 . The differential form J dw+1 has no poles at ∞. We have (1)

Res

(1)

(1)

w+1 =q −1 w+2

−

N−1 Y k=+2

×

N−1 Y k=+2

·

J dw+1 =

(1)

(1)

(1)

(1)

(1)

(1)

(1)

(q −1 − q)vk+1 (1)

(1)

(1)

(wk − q −1 wk+1 )(wk − qwk+1 ) k=+1 (vk − q −1 vk+1 )(vk − qvk+1 ) (1)

(1)

(wk − vk+1 )

N Y

(vk−1 − wk )

k=+3

1 (1) (w+2

(1)

N −1 Y

(1)

(q −1 − q)wk

(1) (1) − q 2 v+1 )(w+2

(1)

− q −1 v+2 )

.

N −1 Y (1) k=+3 (wk

1 (1) (1) − qvk )(wk

(1)

− q −1 vk ) (30)


(1)

Consider this function (30) as a function of w+2 . By the definition of the contour

(1)

(1)

(1)

C+2 , q −1 w+3 is outside of C+2 and all other poles on the complex plane are inside. (1)

The differential forms (30)×dw+2 has no singularity at ∞. Thus Z

(1)

(1)

C+2

(30)dw+2 = −

(1)

Res

(1) (1) w+2 =q −1 w+3

(30)dw+2

and so on. Consequently we have Z

Z

(1)

dw+1 (1)

C+1

2πi

···

= (−1)N−1− = q +1

(1)

dwN −1 2π i

(1)

CN −1

Res

(1) wN −1 =q N z1

J

Res

(1) (1) wN −2 =q −1 wN −1

···

Res

(1) (1) w+1 =q −1 w+2

(1)

(1)

z1 − qui

N −1 Y

(q −1 − q)vk+1

q +1 z1 − v+1

k=+1

(vk − q −1 vk+1 )(vk − qvk+1 )

(1)

(1)

J dw+1 · · · dwN −1

(1)

(1)

(1)

(1)

.

(31)

A similar consideration is applicable to the function (31) in the variables v’s. Finally we have I=

Res

(1) vN −1 =q N u1

···

Res

(1) (1) v+1 =q −1 v+2

(1)

(1)

(31)dv+1 · · · dvN −1 = 1.

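Consequently

$$\bar G^{(00)}(\zeta_1|\xi_1)^{\varepsilon}_{\varepsilon} = (\zeta_1^{-1}\xi_1)^{\varepsilon}.$$

This reproduces the formula in [2].

10. Integral Formula for the Trace of Intertwining Operators

We recall the definition of $\bar G^{(i)}$:

$$\bar G^{(i)}(\zeta|\xi|x, y)^{\varepsilon_1,\cdots,\varepsilon_m}_{\mu_1,\cdots,\mu_n} = \frac{\mathrm{tr}_{V(\Lambda_i)}\big(x^D y^H\Phi_{\varepsilon_1}(\zeta_1)\cdots\Phi_{\varepsilon_m}(\zeta_m)\Psi^*_{\mu_n}(\xi_n)\cdots\Psi^*_{\mu_1}(\xi_1)\big)}{\bar F(z|u|x)},$$

where $\bar F(z|u|x)$ is given by (15) and (16). We define $\bar A_r = \{j \mid \varepsilon_j = r\}$, $A_r = \bar A_0 \sqcup \cdots \sqcup \bar A_r$, $\bar B_r = \{j \mid \mu_j = r\}$, $B_r = \bar B_0 \sqcup \cdots \sqcup \bar B_r$. Then $\sharp\bar A_r = k_r$, $\sharp\bar B_r = l_r$ and $\sharp A_r = K_r$, $\sharp B_r = L_r$. The condition that the weight, with respect to $U'_q(\widehat{sl}_N)$, of the composition of the intertwining operators is zero is

$$k_r - l_r = k_{N-1} - l_{N-1} =: r_0$$

for $0 \le r \le N-1$. We assume this condition. We set $(z)_\infty = (z; x^N)_\infty$, $z_a = \zeta_a^N$, $u_b = \xi_b^N$. Then for $0 \le i \le N-1$ we have

$$\bar G^{(i)}(\zeta|\xi)^{\varepsilon_1,\cdots,\varepsilon_m}_{\mu_1,\cdots,\mu_n}$$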

=C

tr(i)

(|µ)

m Y

a=1

Y

×

za−1

a
×

Z

Y b∈BN −2 ,k

Y

×

Y a
tr(b) C˜ k

b=1

Z

tr(a)

a∈AN −2 ,k Ck

Y

(b) 2πivk a
a∈AN −2

za−1

(a) dwk (a) 2π iwk

(a)

(wb +1 )θ (a ≤b ) (w(a) )−θ (a <b ) b

(b)

a∈AN −2 ,b∈BN −2

(wµ(a)b )θ(a <µb ) (v(b) )θ (a >µb ) a

Y

a
(a) (wN−1 )−1

Y

Y a∈AN −2

a
a
Y

(a)

(wN −1 )lN −1

(b) (vN −1 )−1

YY YY (a) (b) (b) (a) × (1 − qwk /wk+1 ) (1 − qwk+1 /wk ) ×

Y

(N−1)(b−1)+µb −i

ξb

(vµa +1 )θ(µa ≥µb ) (vµ(b)a )−θ (µa >µb )

Y

×

n Y

Y

u−1 b

dvk

a
×

ζa(N−1)(m−n−a+1)−a +i

(b)

b∈BN −2

Y

a∈AN −2

(vN −1 )kN −1

(a)

wa +1

a>b k

Y

1 (a)

(b)

(b)

(a)

(qwk /wk+1 )∞ (qwk+1 /wk )∞ YY YY (b) (a) (a) (b) × (1 − q −1 vk+1 /vk ) (1 − q −1 vk /vk+1 ) a,b,k

a
× ×

1

a,b,k

(q −1 vk /vk+1 )∞ (q −1 vk+1 /vk )∞

×

(a)

(b)

(b)

(a)

(b) (a) (a) (b) Y θx N (vk+1 /wk ) Y θx N (wk+1 /vk ) a,b,k

×

a>b k

Y

(x N )∞

Y Y

a,b,k

(x N )∞

(x N )2∞

(b) (a) (a) (b) a,b k≤N−1 θx N (qvk /wk )θx N (qwk /vk )

Y Y θx N (w(b) /w(a) ) (a) (b) (b) (a) k k (q 2 x N wk /wk )∞ (q 2 wk /wk )∞ (x N )∞

a
Y Y θx N (v (a) /v (b) ) (a) (b) (b) (a) k k (q −2 vk /vk )∞ (q −2 x N vk /vk )∞ (x N )∞ a
a:a +1≤i

b:µb +1≤i


where g0−1 = (−1)(m−n)(N−1) , Y

gj = yj

a:a +1≤j

(a) (wj )−1

θi (z1 , · · · , zN−1 |p) =

−1 gN = q (m−n)(N+1)

Y b:µb +1≤j

X

(b) vj

za

a=1

n Y b=1

u−1 b ,

(1 ≤ j ≤ N − 1), ¯

p 2 (α|α)+(α|3i ) 1

m Y

¯ α∈Q

N−1 Y j =1

¯ j) (α|3

zj

.

Here $\bar Q = \mathbb{Z}\alpha_1 \oplus \cdots \oplus \mathbb{Z}\alpha_{N-1}$ is the root lattice of $sl_N$. The constant $C^{tr(i)}(\varepsilon|\mu)$ is given in Appendix E.

The integration contours $C^{tr(a)}_k$ for $w^{(a)}_k$ and $\tilde C^{tr(b)}_k$ for $v^{(b)}_k$ are specified in the following manner.

The contour $C^{tr(a)}_k$ is a simple closed curve going round the origin in the anticlockwise direction such that $qx^{Nm}w^{(b)}_{k\pm 1}$ ($m \ge 0$, any $b$), $q^{\pm 1}x^{Nm}v^{(b)}_k$ ($m \ge 0$, any $b$) are inside and $q^{-1}x^{-Nm}w^{(b)}_{k\pm 1}$ ($m \ge 0$, any $b$), $q^{\pm 1}x^{-Nm}v^{(b)}_k$ ($m \ge 1$, any $b$) are outside.

The contour $\tilde C^{tr(b)}_k$ is a simple closed curve going round the origin in the anticlockwise direction such that $q^{-1}x^{Nm}v^{(a)}_{k\pm 1}$ ($m \ge 0$, any $a$), $q^{\pm 1}x^{Nm}w^{(a)}_k$ ($m \ge 1$, any $a$) are inside and $qx^{-Nm}v^{(a)}_{k\pm 1}$ ($m \ge 0$, any $a$), $q^{\pm 1}x^{-Nm}w^{(a)}_k$ ($m \ge 0$, any $a$) are outside.

It can be checked that those contours are well defined for $|x| < |q|^{2/N} < 1$. For other values of $x$ such that $|x| < 1$, the integral is defined by analytic continuation. If we set $x = 0$ in the formula above, we obtain the integral formula for the matrix element with $i = j$ in Sect. 8. We have verified that this formula coincides with the trace formula in [8] for $N = 2$. The derivation of the integral formula of the trace is similar to the $N = 2$ case [9, 8] and it is briefly explained in Appendix D. As a corollary of the integral formula for the trace we have

Corollary 2. The functions $G(\zeta|\xi|x, y)$ and $\bar G(\zeta|\xi|x, y)$ are meromorphic functions on $(\mathbb{C}^*)^{n+m}$, where $\mathbb{C}^* = \mathbb{C}\backslash\{0\}$ is the algebraic torus.

Proof. A singularity of the integral appears only when a pinch of the integration contour occurs. By the definition of the contour a pinch happens at $X = q^ax^bY$ for some integers $a$, $b$, where $X, Y \in \{\zeta_1, \cdots, \zeta_m, \xi_1, \cdots, \xi_n\}$. Suppose that a pinch occurs at $X = q^ax^bY$. We decompose the integral into the sum of residues and the integral with an integration contour for which the pinch does not occur at $X = q^ax^bY$. Since the integrand of the trace formula is a meromorphic function on $(\mathbb{C}^*)^{n+m}$, its residue is also a meromorphic function on $(\mathbb{C}^*)^{n+m}$. In the decomposition the singularity at $X = q^ax^bY$ appears only from the residue part. Thus the singularity of the trace function at $X = q^ax^bY$ is a pole. □

11. Discussion

In this paper we have proved that the trace of the composition of the intertwining operators of type I and type II gives a basis of the solution space of the qKZ equation at generic


values of parameters. The qKZ equation considered in this paper takes values in the tensor product of the vector representation of $U_q(sl_N)$. It is an open problem whether it is possible to construct solutions of the qKZ equation taking values in the tensor product of arbitrary finite dimensional irreducible $U_q(sl_N)$ modules as a trace of intertwining operators. For $N = 2$ it will be possible to construct solutions of the qKZ equation taking values in arbitrary irreducible $U_q(sl_2)$ modules by taking the trace of the intertwining operators introduced in [11] over the tensor product of integrable highest weight $U_q(\widehat{sl}_2)$ modules of level one [7]. It is natural to expect that the trace functions thus constructed give a basis of the solution space. For $N \ge 3$ a similar construction will be possible. For the moment it is not very clear what kind of modules we can treat.

Let us consider the qKZ equation (1) of $N = 2$ on the weight subspace of $V_1 \otimes \cdots \otimes V_n$ with a weight, say $\lambda$. At some special values of $\kappa$, which depend on $p$, $q$ and $\lambda$, the hypergeometric solution of Tarasov–Varchenko [15] takes values in the space of singular vectors with respect to a certain action of $U_q(sl_2)$. From the experience of the rational and level zero case [12], it is probable that the trace function still gives a basis of the full space of the tensor product at those special values of $\kappa$. This means that the hypergeometric solution and the trace solution have very different structures. It is an interesting and important problem to relate these two bases. A partial result in this direction is given in [12].

One of the important properties of the trace construction of the solution is that it gives a map from $V^{\otimes n}$ with fixed $n$ to the space of solutions of the qKZ equation taking values in $V^{\otimes m}$ for any $m$. This will be a key structure to relate finite and infinite dimensional modules. Note that it is nothing but the typical structure of the form factors in integrable quantum field theories [13, 12]. The above mentioned problem of connecting the two types of solutions is also important for understanding the completeness problem of local fields constructed by Smirnov [13, 1].

The value $x = q^2$ is of particular interest, since the correlation functions and the form factors of the solvable lattice model are given by some special case of the trace function at this value of $x$. We conjecture that $\det \bar G$ does not vanish identically at $x = q^2$.

The generalization of the results in this paper to other types of quantum affine algebra is also interesting. The traces of intertwining operators are also studied in [3]. Here we simply comment on the following. In [3] the trace is twisted by the Dynkin diagram automorphism and thus it is different from the trace considered in this paper. The difference equations satisfied by the trace in [3] and in this paper are also different.

Acknowledgement. I would like to thank Vitaly Tarasov for the helpful discussion. This work was done while the author stayed at LPTHE in Universite Pierre et Marie Curie. I am grateful to people in the laboratoire, in particular, to Olivier Babelon and Fedor Smirnov for their kind hospitality.

A. Integral Formula for Trace – $U_q(\widehat{sl}_2)$ Case –

In the case of $U_q(\widehat{sl}_2)$ the integral formula for the trace is given in [8]. Our formula at $N = 2$ in Sect. 10 recovers it. In this case the sum $\bar G$ of the trace over $V(\Lambda_0)$ and $V(\Lambda_1)$ simplifies a bit. It is used in the calculation of the example in Sect. 7. Thus we shall


present this simplified formula. It also serves as a simplest example of the trace formula. ,m ¯ |ξ |x, y)µ1 ,··· = Cstmn G(ζ 1 ,··· ,µn

×

m Y j =1

−j +(1+j )/2

ζj

t Z Y

C˜

r=1

n Y k=1

k−(1+µk )/2

ξk

s Z Y r=1 C

dwr 2π iwr

dvr F AB (ζ, ξ, w, v|x, y), 2π ivr

where F AB (ζ, ξ, w, v|x, y) =

s Y Y

(qzj − q −1 wr )

r=1 j <ar

×

t Y

Y

Y r,r 0

×

(qvr−1 − q −1 u−1 k )

j,r

Y k>br

1 (wr /zj )∞ (q 2 zj /wr )∞ Y

(vr−1 − u−1 k )

k,r

(q −2 v

(x 2 )∞

r,k

(x 2 )∞

(x 2 )2∞ θx 2 (−qvr /wr 0 )θx 2 (−q −1 vr /wr 0 )

Y (q 2 wr /wr 0 )∞ (q 2 wr 0 /wr )∞ w−1 0 r 0 θx 2 (wr /wr ) 2 2 wr 0 − q wr (x )∞ 0

r
×

1 r /uk )∞ (uk /vr )∞

Y θx 2 (−q −1 vr /zj ) Y θx 2 (−quk /wr ) r,j

×

Y

(zj − wr )

j >ar

r=1 k
×

Y

Y

(vr 0 − q −2 vr )(x 2 q −2 vr /vr 0 )∞ (x 2 q −2 vr 0 /vr )∞

r
× θx ((−1)

(−q)

m−n 2

Q Q ζj vr y Q Q ). ξk wr

vr 0 θx 2 (vr /vr 0 ) (x 2 )∞

Here $A = \{a_1 < \cdots < a_s\} = \{j \mid \varepsilon_j = +\}$ and $B = \{b_1 < \cdots < b_t\} = \{j \mid \mu_j = +\}$, $z_j = \zeta_j^2$, $u_k = -\xi_k^2$. The constant $Cst_{mn}$ is given in §8.2 of [8]. The integration contours $C$ and $\tilde C$ go round the origin such that for $C$: $q^2x^{2l}z_j$ ($l \ge 0$) are inside and $x^{-2l}z_j$ ($l \ge 0$) are outside; for $\tilde C$: $x^{2l}u_k$ ($l \ge 0$) are inside and $q^2x^{-2l}u_k$ ($l \ge 0$) are outside, $-q^{\pm1}x^{2l}w_r$ ($l \ge 1$) are inside and $-q^{\pm1}x^{-2l}w_r$ ($l \ge 0$) are outside. We have rewritten the formula in [8] using the following formula:

$\theta_{x^4}(-xX^2) + (-1)^t X\theta_{x^4}(-x^3X^2) = \theta_x((-1)^{t+1}X).$

B. Boson Expression of Intertwining Operators

Here we review the bosonic expression of intertwining operators for $U_q(\widehat{sl}_N)$ [10]. Recall that

$P = \oplus_{i=0}^{N-1}\mathbb{Z}\Lambda_i \oplus \mathbb{Z}\delta$

is the weight lattice of $\widehat{sl}_N$. The normalized invariant bilinear form on $P$ is given by

$(\Lambda_i|\Lambda_j) = \frac{i(N-j)}{N} \quad (i \le j), \qquad (\Lambda_i|\delta) = 1,$ (32)

which means in particular

$(\alpha_i|\Lambda_j) = \delta_{i,j}, \qquad (\alpha_i|\alpha_j) = -\delta_{i,j-1} + 2\delta_{i,j} - \delta_{i,j+1}.$

If we set $\bar\Lambda_i = \Lambda_i - \Lambda_0$ and $\bar P = \oplus_{i=1}^{N-1}\mathbb{Z}\bar\Lambda_i$, then $\bar\Lambda_i$ and $\bar P$ are identified with the fundamental weight and the weight lattice of $sl_N$ respectively. We have the orthogonal decomposition $P = \bar P \oplus \mathbb{Z}\Lambda_0 \oplus \mathbb{Z}\delta$ and the orthogonal projection from $P$ to $\bar P$ is given by

$\Lambda_i \mapsto \bar\Lambda_i, \qquad \delta \mapsto 0.$

Consider the Heisenberg algebra with the generators $\{a_i(k) \mid 1 \le i \le N-1,\ k \in \mathbb{Z}\setminus\{0\}\}$ and the commutation relation

$[a_i(k), a_j(l)] = \delta_{k+l,0}\,\frac{[(\alpha_i|\alpha_j)k][k]}{k}.$

Denote by $H$ the Fock space of this algebra: $H = \mathbb{C}[a_i(-k) \mid 1 \le i \le N-1,\ k \ge 1]$. Let $\bar Q = \oplus_{j=1}^{N-1}\mathbb{Z}\alpha_j$ be the root lattice of $sl_N$. The group algebra $\mathbb{C}[\bar Q]$ is the algebra generated by $e^{\alpha_1}, \ldots, e^{\alpha_{N-1}}$ with the defining relation $e^{\alpha_i}e^{\alpha_j} = (-1)^{(\alpha_i|\alpha_j)}e^{\alpha_j}e^{\alpha_i}$. It is known that there is an isomorphism of vector spaces [4]:

$V(\Lambda_i) \simeq H \otimes \mathbb{C}[\bar Q]e^{\bar\Lambda_i},$ (33)

in which a highest weight vector corresponds to $1 \otimes e^{\bar\Lambda_i}$. For the action of the generators of $U_q(\widehat{sl}_N)$ on the right-hand side of (33) see [4, 10].

To describe the intertwining operators we introduce the algebra containing $\mathbb{C}[\bar Q]e^{\bar\Lambda_i}$. Notice that $\bar P = \oplus_{j=2}^{N-1}\mathbb{Z}\alpha_j \oplus \mathbb{Z}\bar\Lambda_{N-1}$, since

$\bar\Lambda_i = -\sum_{r=i+1}^{N-1}(r-i)\alpha_r + (N-i)\bar\Lambda_{N-1}.$

In this description $\alpha_1$ is written as

$\alpha_1 = -\sum_{r=2}^{N-1} r\alpha_r + N\bar\Lambda_{N-1}.$

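The group algebra $\mathbb{C}[\bar P]$ is the algebra generated by $e^{\alpha_1}, \ldots, e^{\alpha_{N-1}}, e^{\bar\Lambda_{N-1}}$ with the defining relation [10]

$e^{\alpha}e^{\beta} = (-1)^{(\alpha|\beta)}e^{\beta}e^{\alpha}, \qquad \alpha, \beta \in \{\alpha_2, \cdots, \alpha_{N-1}, \bar\Lambda_{N-1}\}.$

As a convention, for $\alpha = \sum_{j=2}^{N-1} m_j\alpha_j + m_N\bar\Lambda_{N-1}$, we set

$e^{\alpha} = e^{m_2\alpha_2}\cdots e^{m_{N-1}\alpha_{N-1}}e^{m_N\bar\Lambda_{N-1}}.$

The algebra $\mathbb{C}[\bar Q]$ becomes a subalgebra of $\mathbb{C}[\bar P]$. We consider $\mathbb{C}[\bar Q]e^{\bar\Lambda_i}$ as a subspace of $\mathbb{C}[\bar P]$.

We define the action of the symbols $\partial_\gamma$ ($\gamma \in \bar P$), $e^{\alpha}$ ($\alpha \in \bar Q$), and $d$ on the space $H \otimes \mathbb{C}[\bar Q]e^{\bar\Lambda_i}$. Let $X = a_{j_1}(-n_1)\cdots a_{j_k}(-n_k) \in H$, $e^{\beta} \in \mathbb{C}[\bar Q]e^{\bar\Lambda_i}$, $\beta \in \bar P$, and $Y = X \otimes e^{\beta}$. Then

$e^{\alpha}Y = X \otimes e^{\alpha}e^{\beta}, \qquad \partial_\gamma Y = (\gamma|\beta)Y,$

$dY = \Big(-\sum_{r=1}^{k} n_r - \frac{(\beta|\beta)}{2} + \frac{(\Lambda_i|\Lambda_i)}{2}\Big)Y.$

The principal grading operator $D^{(i)}$ on $V(\Lambda_i)$ is given by

$D^{(i)} = -\rho + \frac{i(N-i)}{2}, \qquad \rho = Nd + \frac{1}{2}\sum_{r=1}^{N-1} r(N-r)\partial_{\alpha_r}.$

We set

$X_j^{\pm}(w) = \exp\Big(\pm\sum_{k=1}^{\infty}\frac{a_j(-k)}{[k]}q^{\mp\frac{k}{2}}w^{k}\Big)\exp\Big(\mp\sum_{k=1}^{\infty}\frac{a_j(k)}{[k]}q^{\mp\frac{k}{2}}w^{-k}\Big)e^{\pm\alpha_j}w^{\pm\partial_{\alpha_j}} = \sum_{n\in\mathbb{Z}} x^{\pm}_{j,n}w^{-n-1}, \qquad x_j^{\pm} = x^{\pm}_{j,0}.$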

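Then

Theorem 2 ([10]).

$\tilde\Phi^{h(i)}_{N-1}(z) = \exp\Big(\sum_{k=1}^{\infty} a^*_{N-1}(-k)q^{(N+\frac{3}{2})k}z^{k}\Big)\exp\Big(\sum_{k=1}^{\infty} a^*_{N-1}(k)q^{-(N+\frac{1}{2})k}z^{-k}\Big)$
$\qquad\times\, e^{\bar\Lambda_{N-1}}(q^{N+1}z)^{\partial_{\bar\Lambda_{N-1}}+\frac{N-1-i}{N}}(-1)^{(N-1)(\partial_{\bar\Lambda_{N-1}}-\frac{N-1-i}{N})}(-1)^{\frac{1}{2}(N-i)(N-1-i)},$

$\tilde\Phi^{h(i)}_{j}(z) = [\tilde\Phi^{h(i)}_{j+1}(z), x^{-}_{j+1}]_q, \qquad 0 \le j \le N-2,$

$\tilde\Psi^{*h(i)}_{N-1}(u) = \exp\Big(-\sum_{k=1}^{\infty} a^*_{N-1}(-k)q^{(N+\frac{1}{2})k}u^{k}\Big)\exp\Big(-\sum_{k=1}^{\infty} a^*_{N-1}(k)q^{-(N+\frac{3}{2})k}u^{-k}\Big)$
$\qquad\times\, e^{-\bar\Lambda_{N-1}}(q^{N+1}u)^{-\partial_{\bar\Lambda_{N-1}}+\frac{i}{N}}(-1)^{(N-1)(-\partial_{\bar\Lambda_{N-1}}+\frac{N-i}{N})}(-1)^{\frac{1}{2}(N-i)(N-1-i)},$

$\tilde\Psi^{*h(i)}_{j}(u) = [x^{+}_{j+1}, \tilde\Psi^{*h(i)}_{j+1}(u)]_{q^{-1}}, \qquad 0 \le j \le N-2,$

where $[X, Y]_q = XY - qYX$ and

$a^*_{N-1}(k) = \sum_{r=1}^{N-1}\frac{-[rk]}{[k][Nk]}\,a_r(k).$

The elements $a^*_{N-1}(k)$ satisfy the relations

$[a_j(k), a^*_{N-1}(-l)] = \delta_{k,l}\,\delta_{j,N-1}\,\frac{[k]}{k}, \qquad [a^*_{N-1}(k), a^*_{N-1}(-l)] = -\delta_{k,l}\,\frac{1}{k}\,\frac{[(N-1)k]}{[Nk]}.$

From (32) the inner product for $\bar\Lambda_i$ is given by

$(\alpha_i|\bar\Lambda_j) = \delta_{i,j}, \qquad (\bar\Lambda_i|\bar\Lambda_j) = \frac{i(N-j)}{N} \quad (i \le j).$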
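The relations for $a^*_{N-1}(k)$ stated above follow from the Heisenberg commutation relation and the definition of $a^*_{N-1}(k)$. The following is a minimal numerical sketch of that check, not part of the original derivation. The function names (`qnum`, `cartan`, `comm_a_astar`, `comm_astar_astar`) and the sample values of $N$, $q$, $k$ are illustrative assumptions; $[n] = (q^n - q^{-n})/(q - q^{-1})$ is the usual q-number.

```python
def qnum(n, q):
    """q-number [n] = (q^n - q^-n) / (q - q^-1)."""
    return (q ** n - q ** (-n)) / (q - 1.0 / q)


def cartan(i, j):
    """(alpha_i | alpha_j) for sl_N: 2 on the diagonal, -1 for neighbours, else 0."""
    if i == j:
        return 2
    if abs(i - j) == 1:
        return -1
    return 0


def comm_a_astar(j, k, N, q):
    """[a_j(k), a*_{N-1}(-k)], expanded with the Heisenberg relation.

    Substituting k -> -k in a*(k) = -sum_r [rk]/([k][Nk]) a_r(k) gives
    a*(-k) = +sum_r [rk]/([k][Nk]) a_r(-k), since [-n] = -[n].
    """
    total = 0.0
    for r in range(1, N):
        coeff = qnum(r * k, q) / (qnum(k, q) * qnum(N * k, q))
        total += coeff * qnum(cartan(j, r) * k, q) * qnum(k, q) / k
    return total


def comm_astar_astar(k, N, q):
    """[a*_{N-1}(k), a*_{N-1}(-k)], using linearity in the first argument."""
    total = 0.0
    for j in range(1, N):
        coeff = -qnum(j * k, q) / (qnum(k, q) * qnum(N * k, q))
        total += coeff * comm_a_astar(j, k, N, q)
    return total
```

For instance, with the assumed sample values $N = 4$ and $q = 0.7$, `comm_a_astar(j, k, 4, 0.7)` should vanish for $j < 3$ and equal $[k]/k$ for $j = 3$, while `comm_astar_astar(k, 4, 0.7)` should equal $-[3k]/(k[4k])$.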

C. List of Normal Ordering Rules

We define the normal ordered operator as an operator of the form

$\exp\Big(\sum_{j=1}^{N-1}\sum_{n=1}^{\infty} A^{(j)}_n a_j(-n)\Big)\exp\Big(\sum_{j=1}^{N-1}\sum_{n=1}^{\infty} B^{(j)}_n a_j(n)\Big)\times\exp\Big(\sum_{j=1}^{N-1} c_j\alpha_j\Big)\exp\Big(\sum_{j=1}^{N-1} c'_j\partial_{\alpha_j}\Big),$

where $A^{(j)}_n$, $B^{(j)}_n$, $c_j$, $c'_j$ are constants. Thus we define the normal order of the product of operators as

$:a_i(k)a_j(l): \; = a_i(k)a_j(l)$ if $k \le l$, $\quad = a_j(l)a_i(k)$ if $k > l$,
$:\partial_\alpha a_i(k): \; = \; :a_i(k)\partial_\alpha: \; = a_i(k)\partial_\alpha,$
$:e^{\alpha}a_i(k): \; = \; :a_i(k)e^{\alpha}: \; = a_i(k)e^{\alpha},$
$:\partial_\alpha e^{\beta}: \; = \; :e^{\beta}\partial_\alpha: \; = e^{\beta}\partial_\alpha.$

We give a list of expressions of operators in terms of their normal ordered operators.

$X^-_{j_1}(w_1)X^-_{j_2}(w_2) = \; :X^-_{j_1}(w_1)X^-_{j_2}(w_2):, \quad |j_1 - j_2| > 1,\ j_1, j_2 \ne 0,$
$X^-_1(w_1)X^-_j(w_2) = (-1)^{j+1} :X^-_1(w_1)X^-_j(w_2):, \quad j \ge 3,$
$X^-_j(w_1)X^-_1(w_2) = (-1)^{j-1} :X^-_j(w_1)X^-_1(w_2):, \quad j \ge 3,$
$X^-_j(w_1)X^-_{j+1}(w_2) = \frac{(-1)^{\delta_{j1}}}{w_1 - qw_2} :X^-_j(w_1)X^-_{j+1}(w_2):,$
$X^-_{j+1}(w_1)X^-_j(w_2) = \frac{(-1)^{1-\delta_{j1}}}{w_1 - qw_2} :X^-_{j+1}(w_1)X^-_j(w_2):,$
$X^-_j(w_1)X^-_j(w_2) = (w_1 - q^2w_2)(w_1 - w_2) :X^-_j(w_1)X^-_j(w_2):,$

$X^-_{j_1}(w)X^+_{j_2}(v) = \; :X^-_{j_1}(w)X^+_{j_2}(v):, \quad |j_1 - j_2| > 1,\ j_1, j_2 \ne 1,$
$X^-_1(w_1)X^+_j(w_2) = (-1)^{j+1} :X^-_1(w_1)X^+_j(w_2):, \quad j \ge 3,$
$X^-_j(w_1)X^+_1(w_2) = (-1)^{j-1} :X^-_j(w_1)X^+_1(w_2):, \quad j \ge 3,$
$X^-_j(w)X^+_{j+1}(v) = (-1)^{\delta_{j1}}(w - v) :X^-_j(w)X^+_{j+1}(v):,$
$X^-_{j+1}(w)X^+_j(v) = (-1)^{1-\delta_{j1}}(w - v) :X^-_{j+1}(w)X^+_j(v):,$
$X^-_j(w)X^+_j(v) = \frac{1}{(w - qv)(w - q^{-1}v)} :X^-_j(w)X^+_j(v):,$

$X^+_{j_1}(v)X^-_{j_2}(w) = \; :X^+_{j_1}(v)X^-_{j_2}(w):, \quad |j_1 - j_2| > 1,\ j_1, j_2 \ne 1,$
$X^+_1(w_1)X^-_j(w_2) = (-1)^{j+1} :X^+_1(w_1)X^-_j(w_2):, \quad j \ge 3,$
$X^+_j(w_1)X^-_1(w_2) = (-1)^{j-1} :X^+_j(w_1)X^-_1(w_2):, \quad j \ge 3,$
$X^+_j(v)X^-_{j+1}(w) = (-1)^{\delta_{j1}}(v - w) :X^+_j(v)X^-_{j+1}(w):,$
$X^+_{j+1}(v)X^-_j(w) = (-1)^{1-\delta_{j1}}(v - w) :X^+_{j+1}(v)X^-_j(w):,$
$X^+_j(v)X^-_j(w) = \frac{1}{(v - qw)(v - q^{-1}w)} :X^+_j(v)X^-_j(w):,$

$X^+_{j_1}(v_1)X^+_{j_2}(v_2) = \; :X^+_{j_1}(v_1)X^+_{j_2}(v_2):, \quad |j_1 - j_2| > 1,\ j_1, j_2 \ne 1,$
$X^+_1(w_1)X^+_j(w_2) = (-1)^{j+1} :X^+_1(w_1)X^+_j(w_2):, \quad j \ge 3,$
$X^+_j(w_1)X^+_1(w_2) = (-1)^{j-1} :X^+_j(w_1)X^+_1(w_2):, \quad j \ge 3,$
$X^+_j(v_1)X^+_{j+1}(v_2) = \frac{(-1)^{\delta_{j1}}}{v_1 - q^{-1}v_2} :X^+_j(v_1)X^+_{j+1}(v_2):,$
$X^+_{j+1}(v_1)X^+_j(v_2) = \frac{(-1)^{1-\delta_{j1}}}{v_1 - q^{-1}v_2} :X^+_{j+1}(v_1)X^+_j(v_2):,$
$X^+_j(v_1)X^+_j(v_2) = (v_1 - q^{-2}v_2)(v_1 - v_2) :X^+_j(v_1)X^+_j(v_2):,$

$\tilde\Phi^{h(i)}_{N-1}(z)X^-_j(w) = \; :\tilde\Phi^{h(i)}_{N-1}(z)X^-_j(w):, \quad j \ne N-1,$
$\tilde\Phi^{h(i)}_{N-1}(z)X^-_{N-1}(w) = \frac{q^{-1}}{w - q^{N}z} :\tilde\Phi^{h(i)}_{N-1}(z)X^-_{N-1}(w):,$
$\tilde\Phi^{h(i)}_{N-1}(z)X^+_j(v) = \; :\tilde\Phi^{h(i)}_{N-1}(z)X^+_j(v):, \quad j \ne N-1,$
$\tilde\Phi^{h(i)}_{N-1}(z)X^+_{N-1}(v) = (v - q^{N+1}z) :\tilde\Phi^{h(i)}_{N-1}(z)X^+_{N-1}(v):,$
$\tilde\Psi^{*h(i)}_{N-1}(u)X^+_j(v) = \; :\tilde\Psi^{*h(i)}_{N-1}(u)X^+_j(v):, \quad j \ne N-1,$
$\tilde\Psi^{*h(i)}_{N-1}(u)X^+_{N-1}(v) = \frac{q}{v - q^{N+2}u} :\tilde\Psi^{*h(i)}_{N-1}(u)X^+_{N-1}(v):,$

$X^-_j(w)\tilde\Phi^{h(i)}_{N-1}(z) = \; :X^-_j(w)\tilde\Phi^{h(i)}_{N-1}(z):, \quad j \ne N-1,$
$X^-_{N-1}(w)\tilde\Phi^{h(i)}_{N-1}(z) = \frac{1}{w - q^{N+2}z} :X^-_{N-1}(w)\tilde\Phi^{h(i)}_{N-1}(z):,$
$X^+_j(v)\tilde\Phi^{h(i)}_{N-1}(z) = \; :X^+_j(v)\tilde\Phi^{h(i)}_{N-1}(z):, \quad j \ne N-1,$
$X^+_{N-1}(v)\tilde\Phi^{h(i)}_{N-1}(z) = (v - q^{N+1}z) :X^+_{N-1}(v)\tilde\Phi^{h(i)}_{N-1}(z):,$
$X^+_j(v)\tilde\Psi^{*h(i)}_{N-1}(u) = \; :X^+_j(v)\tilde\Psi^{*h(i)}_{N-1}(u):, \quad j \ne N-1,$
$X^+_{N-1}(v)\tilde\Psi^{*h(i)}_{N-1}(u) = \frac{1}{v - q^{N}u} :X^+_{N-1}(v)\tilde\Psi^{*h(i)}_{N-1}(u):,$
$X^-_j(w)\tilde\Psi^{*h(i)}_{N-1}(u) = \; :X^-_j(w)\tilde\Psi^{*h(i)}_{N-1}(u):, \quad j \ne N-1,$
$X^-_{N-1}(w)\tilde\Psi^{*h(i)}_{N-1}(u) = (w - q^{N+1}u) :X^-_{N-1}(w)\tilde\Psi^{*h(i)}_{N-1}(u):,$

$\tilde\Phi^{h(i_1)}_{N-1}(z_1)\tilde\Phi^{h(i_2)}_{N-1}(z_2) = (-q^{N+1}z_1)^{\frac{N-1}{N}}\frac{(q^{2}z_2/z_1)_\infty}{(q^{2N}z_2/z_1)_\infty} :\tilde\Phi^{h(i_1)}_{N-1}(z_1)\tilde\Phi^{h(i_2)}_{N-1}(z_2):,$
$\tilde\Phi^{h(i_1)}_{N-1}(z)\tilde\Psi^{*h(i_2)}_{N-1}(u) = (-q^{N+1}z)^{-\frac{N-1}{N}}\frac{(q^{2N-1}u/z)_\infty}{(qu/z)_\infty} :\tilde\Phi^{h(i_1)}_{N-1}(z)\tilde\Psi^{*h(i_2)}_{N-1}(u):,$
$\tilde\Psi^{*h(i_1)}_{N-1}(u)\tilde\Phi^{h(i_2)}_{N-1}(z) = (-q^{N+1}u)^{-\frac{N-1}{N}}\frac{(q^{2N-1}z/u)_\infty}{(qz/u)_\infty} :\tilde\Psi^{*h(i_1)}_{N-1}(u)\tilde\Phi^{h(i_2)}_{N-1}(z):,$
$\tilde\Psi^{*h(i_2)}_{N-1}(u_2)\tilde\Psi^{*h(i_1)}_{N-1}(u_1) = (-q^{N+1}u_2)^{\frac{N-1}{N}}\frac{(u_1/u_2)_\infty}{(q^{2N-2}u_1/u_2)_\infty} :\tilde\Psi^{*h(i_2)}_{N-1}(u_2)\tilde\Psi^{*h(i_1)}_{N-1}(u_1):.$

Let us set

$\tilde\Phi^{h(i)}_{j}(z|w_{N-1}\cdots w_{j+1}) = [\tilde\Phi^{h(i)}_{j+1}(z|w_{N-1}\cdots w_{j+2}), X^-_{j+1}(w_{j+1})]_q,$
$\tilde\Psi^{*h(i)}_{j}(u|v_{N-1}\cdots v_{j+1}) = [X^+_{j+1}(v_{j+1}), \tilde\Psi^{*h(i)}_{j+1}(u|v_{N-1}\cdots v_{j+2})]_{q^{-1}}.$

Then

$\tilde\Phi^{h(i)}_{j}(z|w_{N-1}\cdots w_{j+1}) = (-1)^{\frac{1}{2}N(N+1)\delta_{j0}+\delta_{j0}}\prod_{k=j+1}^{N-1}\frac{(q^{-1}-q)w_k}{(w_k - q^{-1}w_{k+1})(w_k - qw_{k+1})}\times :\tilde\Phi^{h(i)}_{N-1}(z)X^-_{N-1}(w_{N-1})\cdots X^-_{j+1}(w_{j+1}):,$

$\tilde\Psi^{*h(i)}_{j}(u|v_{N-1}\cdots v_{j+1}) = (-1)^{\frac{1}{2}N(N+1)\delta_{j0}+\delta_{j0}}\prod_{k=j+1}^{N-1}\frac{(q^{-1}-q)v_{k+1}}{(v_k - q^{-1}v_{k+1})(v_k - qv_{k+1})}\times :\tilde\Psi^{*h(i)}_{N-1}(u)X^+_{N-1}(v_{N-1})\cdots X^+_{j+1}(v_{j+1}):,$

where we set $w_N = q^{N+1}z$ and $v_N = q^{N+1}u$. We have

h(i )

(1)

(1)

− ˜ 1 (z1 )X− (w :8 N−1 N−1 N−1 ) · · · Xj1 +1 (wj1 +1 ) :

h(i )

(2)


(2)

− ˜ 2 (z2 )X− (w ×:8 N−1 N−1 N −1 ) · · · Xj2 +1 (wj2 +1 ) :

= (−1) 2 (N−1−j2 )(N +2+j2 )δj1 0 + 2 (N−1−j1 )(N+2+j1 )δj2 0 1

1

(+)

×h

Q (1) (1) (2) 2 (2) N −1 z2 −1 k (wk − q wk )(wk − wk ) N+1 N ( )q (−q z1 ) (1) (2) z1 (w − q N +2 z2 )(w − q N z1 ) N −1

×

Y

Y

−1

(1) k wk

(2) − qwk−1

k

N −1

1 (1) wk

(2)

− qwk+1

˜ h(i1 ) (z1 ) · · · X− (w(2) ) :, :8 j2 +1 N −1 j2 +1

h(i )

˜ 1 (z)X− (wN−1 ) · · · X− (wj1 +1 ) : :8 N−1 j1 +1 N−1 ∗h(i )

+ + 2 ˜ ×:9 N−1 (u)XN−1 (vN −1 ) · · · Xj2 +1 (vj2 +1 ) :

= (−1) 2 (N−1−j2 )(N +2+j2 )δj1 0 + 2 (N−1−j1 )(N+2+j1 )δj2 0 1

1

−1 N +1 u)(v N +1 z) N −1 (wN −1 − q u N −1 − q Q (−q N+1 z)− N −1 z k (wk − qvk )(wk − q vk ) Y Y ˜ h(i1 ) (z) · · · X+ (vj2 +1 ) :, × (wk − vk+1 ) (−1)(wk − vk−1 ) : 8 j2 +1 N −1 × h(0)

k

k

∗h(i )

(2)

(2)

+ + 2 ˜ :9 N−1 (u2 )XN−1 (vN−1 ) · · · Xj2 +1 (vj2 +1 ) :

˜ ∗h(i1 ) (u1 )X + (v (1) ) · · · X+ (v (1) ) : ×:9 N−1 N −1 j1 +1 j1 +1 N−1 = (−1) 2 (N−1−j2 )(N +2+j2 )δj1 0 + 2 (N−1−j1 )(N−2+j1 )δj2 0 1

(−)

×h

1

Q (2) (2) (1) −2 (1) N −1 u1 k (vk − q vk )(vk − vk ) N+1 N ( )q(−q u2 ) (1) (2) u2 (v − q N +2 u2 )(v − q N u1 ) N −1

×

Y

1

(2) k vk

(1) − q −1 vk+1

Y (2) k vk

N −1

−1 (1)

− q −1 vk−1

∗h(i )

(1)

+ 2 ˜ :9 N−1 (u2 ) · · · Xj1 +1 (vj1 +1 ) : .

Here, denoting $(z)_\infty = (z; x^N)_\infty$, we set

$h^{(+)}(z) = \frac{(q^{2}z)_\infty}{(q^{2N}z)_\infty}, \qquad h^{(0)}(z)^{-1} = \frac{(q^{2N-1}z)_\infty}{(qz)_\infty}, \qquad h^{(-)}(z) = \frac{(z)_\infty}{(q^{2N-2}z)_\infty}.$

D. Derivation of Integral Formula for Trace

The calculation of the trace using the bosonic expression of the intertwining operators is similar to the case of $N = 2$ [9, 8]. Thus we simply present the necessary information for the calculation of the trace.


We use the following formula: ∞ ∞ N−1 N −1 X XX X (j ) (j ) An aj (−n) exp Bn aj (n) trV (3i ) x D exp j =1 n=1

× exp(

N−1 X

−∂3¯

cj αj )g0

j =1

1

j =1 n=1 −∂3¯

gN

N −1

N −1 Y j =1

∂α j

gj

¯ ¯ ) ¯ 1 |3 ¯ i ) −(3 |3 −(3 −1 −1 N 2 gN N −1 i gi1−δi0 θi (g0−1 g12 g2−1 , · · · , gN = (x N )−1 ∞ g0 −2 gN −1 gN |x ) ∞ N−1 ∞ X X Y 1 Nmn (j ) (j −1) (j ) (j +1) exp An (−[n]2 Bn + [n][2n]Bn − [n]2 Bn ) , x × n m=1 n=1 j =1

(34)

where we set $B^{(0)}_n = B^{(N)}_n = 0$. The derivation of this formula is similar to the $sl_2$ case. We refer to [8] for details. In the previous section we have given the expression of the operators in terms of their normally ordered operators. Therefore in this section we shall give a list of contributions to the trace from the normally ordered operators. Then using the formula (34) we can calculate the trace and the result is presented in Sect. 10. For an operator $O$ such that

$O = \exp\Big(\sum_{j=1}^{N-1}\sum_{n=1}^{\infty} A^{(j)}_n a_j(-n)\Big)\exp\Big(\sum_{j=1}^{N-1}\sum_{n=1}^{\infty} B^{(j)}_n a_j(n)\Big)\times\exp\Big(\sum_{j=1}^{N-1} c_j\alpha_j\Big)\,g_0^{-\partial_{\bar\Lambda_1}}\,g_N^{-\partial_{\bar\Lambda_{N-1}}}\prod_{j=1}^{N-1} g_j^{\partial_{\alpha_j}},$

if we write

$O \approx J,$

then it means that

$J = \prod_{j=1}^{N-1}\exp\Big(\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{1}{n}\,x^{Nmn}A^{(j)}_n\big(-[n]^2B^{(j-1)}_n + [n][2n]B^{(j)}_n - [n]^2B^{(j+1)}_n\big)\Big).$

The following is the list which is necessary for the calculation of the trace. The constants c1 , c2 , c± are defined by the first four equations: ˜ h(i) (z) ≈ 8 N−1

{q 2 x N } =: c1 , {q 2N x N }

Xj− (w) ≈ (q 2 x N )∞ (x N )∞ =: c− ,

˜ ∗h(i) (u) ≈ 9 N −1

Xj+ (w) ≈ (q −2 x N )∞ (x N )∞ =: c+ ,

(cσ1 cσ2 )−1 : Xjσ11 (w1 )Xjσ22 (w2 ) :≈ 1, −2 : Xj− (w1 )Xj−+1 (w2 ) :≈ c−

{x N } =: c2 , {q 2N−2 x N }

|j1 − j2 | > 1, σ1 , σ2 ∈ {±},

1 , (qx N w1 /w2 )∞ (qx N w2 /w1 )∞


−2 c− : Xj− (w1 )Xj− (w2 ) : ≈ (q 2 x N w1 /w2 )∞ (q 2 x N w2 /w1 )∞

· (x N w1 /w2 )∞ (x N w2 /w1 )∞ , (c− c+ )−1 : Xj− (w)Xj++1 (v) : ≈ (x N w/v)∞ (x N v/w)∞ , (c− c+ )−1 : Xj−+1 (w)Xj+ (v) : ≈ (x N w/v)∞ (x N v/w)∞ , (c− c+ )−1 : Xj− (w)Xj+ (v) : ≈

1 (qx N w/v)∞ (qx N v/w)∞ (q −1 x N w/v)∞ (q −1 x N v/w)∞ , −2 : Xj+ (v1 )Xj++1 (v2 ) :≈ c+

1 (q −1 x N v1 /v2 )∞ (q −1 x N v2 /v1 )∞ ,

−2 : Xj+ (v1 )Xj+ (v2 ) :≈ (x N v1 /v2 )∞ (x N v2 /v1 )∞ (q −2 x N v1 /v2 )∞ (q −2 x N v2 /v1 )∞ , c+ h(i)

± ˜ (c1 c± )−2 : 8 N−1 (z)Xj (w) : ≈ 1,

j 6= N − 1,

˜ ∗h(i) (u)X± (w) : ≈ 1, (c2 c± )−2 : 9 j N−1

j 6= N − 1,

˜ h(i) (z)X− (w) : ≈ (c1 c− )−1 : 8 N −1 N−1 h(i)

1 , (q N +2 x N z/w)∞ (q −N x N w/z)∞

+ N +1 N ˜ x z/v)∞ (q −N −1 x N v/z)∞ , (c1 c+ )−1 : 8 N−1 (z)XN −1 (v) : ≈ (q

˜ ∗h(i) (u)X− (w) : ≈ (q N +1 x N u/w)∞ (q −N −1 x N w/u)∞ , (c2 c− )−1 : 9 N −1 N−1 1 + −1 ˜ ∗h(i) , (c2 c+ ) : 9N−1 (u)XN −1 (v) : ≈ (q N x N u/v)∞ (q −N −2 x N v/u)∞ {q 2 x N z1 /z2 }{q 2 x N z2 /z1 } ˜ h(i1 ) (z1 )8 ˜ h(i2 ) (z2 ) : ≈ , c1−2 : 8 N−1 N−1 {q 2N x N z1 /z2 }{q 2N x N z2 /z1 } {q 2N−1 x N z/u}{q 2N−1 x N u/z} ˜ h(i1 ) (z)9 ˜ ∗h(i2 ) (u) : ≈ , (c1 c2 )−1 : 8 N−1 N−1 {qx N z/u}{qx N u/z} {x N u1 /u2 }{x N u2 /u1 } ˜ ∗h(i1 ) (u1 )9 ˜ ∗h(i2 ) (u2 ) : ≈ . c2−2 : 9 N−1 N−1 {q 2N−2 x N u1 /u2 }{q 2N−2 x N u2 /u1 } E. Explicit Formulae of Constants The constants C¯ (ij ) (, µ), C (ij ) (k|l) and C tr(i) (|µ) are given here. The constant C¯ (ij ) (, µ) appears in the integral formula of the matrix elements in Sect. 8. It is given by C¯ (ij ) (, µ) = (−1)sgnN (i,j )−ir0 (N−1)+δj 0 (n+m)(N−1)+( 2 N (N+1)+1)(k0 +l0 ) 1

× (−1) 2 (j −i)(N−j )(N−j −1)θ (1≤i<j )+ 2 (i−j )(N−i)(N−1+i−2j )θ (i>j ≥1) 1

× (−1) 2 (k0 +l0 ) 1

1

PN −1 r=1

(N−1−r)(N+2+r)(kr +lr )


× q 2 (N+1)((i−j )(i−j −1)+r0 N (N−1)+2j r0 N −2ir0 (N−1)) , 1

2

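where

$\mathrm{sgn}_N(i,j) = \frac{1}{2}\sum_{a=1}^{m}\Big(N - N\Big\{\frac{i+a-1}{N}\Big\}\Big)\Big(N - 1 - N\Big\{\frac{i+a-1}{N}\Big\}\Big) + \frac{1}{2}\sum_{b=1}^{n}\Big(N - N\Big\{\frac{j+b-1}{N}\Big\}\Big)\Big(N - 1 - N\Big\{\frac{j+b-1}{N}\Big\}\Big),$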

and, for a rational number r, {r} is the fractional part of r, that is, {r} = r − [r], [r] being the Gauss symbol. This notation appears only in the description of sgn and should not be confused with the double infinite product. The constant C (ij ) (k|l) appears in Proposition 3. It is given by C (ij ) (k|l) PN −2

= (−1)sgnN (i,j )−ir0 (N−1)+

r=0

(N −1−r)kr +δj 0 (n+m)(N−1)+( 21 N (N+1)+1)(k0 +l0 )

× (−1) 2 (j −i)(N−j )(N−j −1)θ (1≤i<j )+ 2 (i−j )(N−i)(N−1+i−2j )θ (i>j ≥1) 1

1

× (−1) 2 (k0 +l0 ) 1

× (−1)

PN −1 r=1

(N−1−r)(N+2+r)(kr +lr )

P

PKN−2 PLN−2

P

1≤a
1≤a
a=1

b=1

(N−1−max(a ,µb ))

× q 2 (N+1)((i−j )(i−j−1)+r0 N(N−1)+2j r0 N−2ir0 (N−1))+(j +1)(−Kj−1 +Lj−1 )−N KN−2 C2−(N−1)LN−2 C2 1

2

PN−2

× q N(−KN−2 +LN−2 )(kN−1−lN−1 )+LN−2 lN−1− PN−2

× q−

r=0 (r+1)kr lr +NKN−2 LN−2

PN−2

r=0 (N−1−r)kr +

PN−2

r=0 (r+1)kr C2 +

r=0 r lr C2

.

The constant C tr(i) (|µ) appears in the integral formula of the trace in Sect. 10. It is given by C tr(i) (|µ) = (−1)sgnN (i,i)+ir0 (N−1)+ 3 r0 (N−1)(N−2)(N−3)+ 2 r0 (r0 +1)(N−1)+KN −2 +nLN −2 1

PN −2

× (−1)

a=1

P

aka +

a∈AN −2

1

P

a+

P

× (−1)

P

b+

b∈BN −2

a∈AN −2 ,b∈BN −2 cab

P

a
a
PN −2

× q ir0 (N+1)+ 2 r0 N(N−1)(N +1)−n(N +1)LN −2 +(N−1)KN −2 LN −2 + 1 2

×q

(N+1)(−

P

a∈AN −2

P

a+

b∈BN −2

P

b)−

a∈AN −2 ,b∈BN −2 cab

b=1

blb


×


PN −2 {q 2 x N } m {x N } n N a=0 (N−1−a)(ka +la )−1 (x ) ∞ 2N N 2N−2 N {q x } {q x } PN −2

× (q 2 )∞a=0

(N−1−a)ka

PN −2

(q −2 )∞b=0

(N−1−b)lb

,

where we set $\varepsilon_{ab} = \max(\varepsilon_a, \varepsilon_b)$, $c_{ab} = \max(\varepsilon_a, \mu_b)$, $\mu_{ab} = \max(\mu_a, \mu_b)$, $(z)_\infty = (z; x^N)_\infty$.

References

1. Babelon, O., Bernard, D. and Smirnov, F.: Null-Vectors in Integrable Field Theory. Commun. Math. Phys. 186, 601–648 (1997)
2. Date, E. and Okado, M.: Calculation of excitation spectra of the spin model related with the vector representation of the quantized affine algebra of type $A_n^{(1)}$. Int. J. Mod. Phys. A 9, 399–417 (1994)
3. Etingof, P.: Difference equations with elliptic coefficients and quantum affine algebras. hep-th/9312057
4. Frenkel, I. and Jing, N.: Vertex representations of quantum affine algebras. Proc. Natl. Acad. Sci. USA 85, 9373–9377 (1988)
5. Frenkel, I. and Kac, V.: Basic representations of affine Lie algebras and dual resonance models. Invent. Math. 62, 23–66 (1980)
6. Frenkel, I. and Reshetikhin, N.: Quantum affine algebras and holonomic difference equations. Commun. Math. Phys. 146, 1–60 (1992)
7. Hong, J., Kang, S.-J., Miwa, T. and Weston, R.: Vertex models with alternating spins. (Special edition of AJM dedicated to Prof. M. Sato on his 70th birthday). Asian J. Math. 2, 711–758 (1998)
8. Jimbo, M. and Miwa, T.: Algebraic analysis of solvable lattice models. CBMS Regional Conference Series in Math. AMS 85, (1995)
9. Jimbo, M., Miki, K., Miwa, T. and Nakayashiki, A.: Correlation function of the XXZ model for $\Delta < -1$. Phys. Lett. A 168, 256–263 (1992)
10. Koyama, Y.: Staggered polarization of vertex models with $U_q(\widehat{sl}_N)$ symmetry. Commun. Math. Phys. 164, 277–291 (1994)
11. Nakayashiki, A.: Fusion of q-vertex operators and its application to solvable vertex models. Commun. Math. Phys. 177, 27–62 (1996)
12. Nakayashiki, A., Pakuliak, S. and Tarasov, V.: On solutions of the KZ and qKZ equations at level zero. To appear in Ann. Inst. Henri Poincare, q-alg/9712002
13. Smirnov, F.: Counting the local fields in SG theory. Nucl. Phys. B 453, 807–824 (1995)
14. Tarasov, V.: Completeness of the hypergeometric solutions of the qKZ equation at level zero. Max-Planck-Inst. Preprint Series 1998 (87)
15. Tarasov, V. and Varchenko, A.: Geometry of q-hypergeometric functions, quantum affine algebras, and elliptic quantum groups. Asterisque 246, 1–135 (1997)

Communicated by T. Miwa

Commun. Math. Phys. 212, 63 – 91 (2000)

Communications in

Mathematical Physics

Finite-Volume Excitations of the 111 Interface in the Quantum XXZ Model∗

Oscar Bolina, Pierluigi Contucci, Bruno Nachtergaele, Shannon Starr

Department of Mathematics, University of California, Davis, CA 95616-8633, USA. E-mail: [email protected]; [email protected]; [email protected]; [email protected]

Received: 30 August 1999 / Accepted: 5 January 2000

Abstract: We show that the ground states of the three-dimensional XXZ Heisenberg ferromagnet with a 111 interface have excitations localized in a subvolume of linear size R with energies bounded by O(1/R 2 ). As part of the proof we show the equivalence of ensembles for the 111 interface states in the following sense: In the thermodynamic limit the states with fixed magnetization yield the same expectation values for gauge invariant local observables as a suitable grand canonical state with fluctuating magnetization. Here, gauge invariant means commuting with the total third component of the spin, which is a conserved quantity of the Hamiltonian. As a corollary of equivalence of ensembles we also prove the convergence of the thermodynamic limit of sequences of canonical states (i.e., with fixed magnetization).

1. Introduction and Main Results

A determining factor in the stability of the magnetic state of small ferromagnetic particles is the structure of the spectrum of their low-lying excitations. Stability against thermal (and quantum) fluctuations is a major concern when one is interested in increasing the density of information stored on magnetic hard disks. Higher density of information requires smaller magnetic particles to store the bits. The smaller these particles get, the less stable their magnetic state tends to be. It is also well-known that ferromagnets spontaneously form domains with different orientations of the magnetization. These two facts motivate us to study the excitation spectrum of finite size ferromagnets with a domain wall or interface. From examples, it is known that the presence of an interface, in general, has an effect on the low-lying excitation spectrum [8, 9].

∗ Copyright © 2000 rests with the authors. Faithful reproduction for non-commercial purposes is permitted.


We consider the spin 1/2 XXZ Heisenberg model on the three-dimensional lattice $\mathbb{Z}^3$. For any finite volume $\Lambda \subset \mathbb{Z}^3$, the Hamiltonian is given by

$H_\Lambda = -\sum_{x,y\in\Lambda,\ |x-y|=1} \Delta^{-1}\big(S_x^{(1)}S_y^{(1)} + S_x^{(2)}S_y^{(2)}\big) + S_x^{(3)}S_y^{(3)},$ (1.1)

where $\Delta > 1$ is the anisotropy. It will be convenient to work with the usual parametrization $\Delta = (q + q^{-1})/2$, $0 < q < 1$. Note that in the limit $\Delta \to \infty$ ($q \to 0$), one recovers the Ising model. The case $\Delta = 1$ ($q = 1$) is the XXX Heisenberg model.

It is well-known that this model has two ferromagnetically ordered translation invariant ground states. What is less well-known is that there are also ground states describing an interface between two domains with opposite magnetization. The 100 interfaces are similar to the Dobrushin interfaces found in the Ising model. They exist for sufficiently small temperatures, as was recently proved in [3]. Unlike the Ising model, the XXZ model also possesses ground states with a rigid 111 interface at zero temperature [8]. Its stability at positive temperatures is still an open problem.

In this paper we are interested in estimating the low-lying excitations above the ground state with a 111 interface. It is easy to show that the excitation spectrum above the translation invariant ground states has a non-vanishing gap. In [8] it was proved that, in the corresponding two-dimensional model, the excitations above the 11 interface are gapless. By an extension of the methods in [10], Matsui [11] showed that the excitation spectrum has to be gapless in all dimensions ≥ 2. Here, we are interested in the nature of the low-lying excitations for the three-dimensional model, and in particular their dependence on size. We prove the following bound for the energy of an excitation localized in a finite domain $\Lambda_R$ of linear size $R$.

Main Result. Excitations localized in $\Lambda_R$ have a gap $\gamma_R$ bounded by

$\gamma_R \le 100\,\frac{q^{2(1-\delta(q,\nu))}}{(1-q^2)}\,\frac{1}{R^2}, \quad \text{for } R > 70,$ (1.2)

where $\delta(q, \nu)$ is an exponent between 0 and 1/2 that depends on the filling factor $\nu$ of the interface plane (see explanation below), as well as the parameter $q$.

The meaning of this bound is the following. We consider the model in a finite volume $\Lambda$, with a fixed magnetization and boundary conditions that induce an interface. By perturbing the ground state in a cylindrical subvolume $\Lambda_R$, with circular cross-section of radius $R$, we then construct an orthogonal state with the same magnetization. The bound (1.2) is an upper bound for the difference in energy of this state with respect to the ground state in the limit $\Lambda \nearrow \mathbb{Z}^3$. For finite volumes $\Lambda$, the same bound holds as long as $\Lambda$ is substantially larger than $R$. When $R$ and the finite volume are comparable in size, a similar bound holds but with a larger constant factor and additional error terms (see Sect. 4).

The dependence on $q$ of the bound (1.2) has some interesting features, which we explain next. First, in the limit $q \to 1$, the bound diverges. This means that our ansatz for the excitations of the 111 interface does not work for the isotropic model. This is not surprising as the isotropic model does not have a rigid 111 interface, although it does possess gapless excitations, as is well-known from spinwave theory. In the limit $q \to 0$, the Ising limit, the bound vanishes. This is to be expected, as the 111 interface contours of the Ising model are highly degenerate.
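To make the $q$- and $R$-dependence of the bound (1.2) concrete, the right-hand side can be evaluated directly. The following is a minimal sketch; the function names `anisotropy` and `gap_bound` are illustrative, and the exponent $\delta$ is simply passed in as a number in $[0, 1/2]$ rather than computed from $q$ and $\nu$.

```python
def anisotropy(q):
    """Anisotropy Delta = (q + 1/q) / 2 for the parametrization 0 < q < 1."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must satisfy 0 < q < 1")
    return (q + 1.0 / q) / 2.0


def gap_bound(q, delta, R):
    """Right-hand side of (1.2): 100 * q^(2(1 - delta)) / (1 - q^2) / R^2.

    delta stands for the exponent delta(q, nu); here it is an input,
    assumed to lie in [0, 1/2]. The bound is stated for R > 70.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must satisfy 0 < q < 1")
    if not 0.0 <= delta <= 0.5:
        raise ValueError("delta must lie in [0, 1/2]")
    if R <= 70:
        raise ValueError("the bound (1.2) is stated for R > 70")
    return 100.0 * q ** (2.0 * (1.0 - delta)) / (1.0 - q * q) / R ** 2
```

Evaluating `gap_bound` for fixed `delta` and `R` shows the two limits discussed above: the value grows without bound as `q` approaches 1 and tends to 0 as `q` approaches 0, while doubling `R` reduces it by a factor of four.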


In order to explain the role of the exponent $\delta(q, \nu)$ in (1.2) we first need to discuss some properties of the interface states themselves. For $0 < q < 1$, the model has a two-parameter family of pure ground states with an interface in the 111 direction. One parameter is an angle, playing the same role as the angles $\phi_x$ in the ansatz (1.4) for the excitations. The second parameter, which is relevant for the present discussion, corresponds to the mean position of the interface in the lattice. If we think of spin up at any site as describing an empty site, and spin down as a site occupied by a particle, the third component of the spin becomes equivalent to the number of particles. In Sect. 2, (2.8), we will introduce the chemical potential $\mu$ to control the expected number of particles, alias the third component of the total spin. In the limit $q \to 0$, the filling factor $\nu$ of the interface has a simple interpretation: $\nu = 0$ means that the interface separates a region entirely filled with particles from a region that is empty. A non-zero $\nu$ means that there is a partially filled plane in between the filled and the empty region, with filling factor $\nu$. It turns out that the exponent $\delta(q, \nu)$ can be considered as a function of $\mu$ alone. For each value of $\mu \in \mathbb{R}$, we get an interface state, and $\delta$ is the distance of $\mu$ to the integers, i.e., $\delta(\mu) = \min(|\mu - \lfloor\mu\rfloor|, |1 - \mu + \lfloor\mu\rfloor|)$, where $\lfloor\mu\rfloor$ is the integer part of $\mu$. In general, the relation between $\mu$ and $\nu$ depends nontrivially on $q$. But for all $q$, $0 < q < 1$, one has $\delta(q, 1/2) = 0$ and $\delta(q, 0) = 1/2$. For further details on the interdependence of the parameters $q$, $\delta$, $\mu$, and $\nu$, we refer to Subsect. 6.1.

We believe that $O(1/R^2)$ is the true behavior of the low-lying excitations. There are indications in the physics literature that this should indeed be the case [6]. Our rigorous bounds are obtained using the variational principle: If $\psi_0$ is a ground state of $H_\Lambda$, and $\psi$ is any other state that is linearly independent of $\psi_0$, then (q)

γ := E1 − E0 ≤

hψ| H3 |ψi · kψk2 1−

1 |hψ0 |ψi|2 kψ0 k2 kψk2

.

(1.3)
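As a sanity check of the structure of the bound (1.3), one can evaluate its right-hand side for a small matrix with a unique ground state of energy zero. The following is a toy sketch under those assumptions; the function name `variational_gap_bound` and the $3\times 3$ diagonal example are illustrative and are not taken from the paper.

```python
def variational_gap_bound(H, psi0, psi):
    """Energy of psi times the non-orthogonality correction factor.

    H is a real symmetric matrix given as a list of rows, assumed to have
    a unique ground state psi0 with eigenvalue 0; psi is any real vector
    that is linearly independent of psi0.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    H_psi = [dot(row, psi) for row in H]
    energy = dot(psi, H_psi) / dot(psi, psi)  # <psi|H|psi> / ||psi||^2
    overlap = dot(psi0, psi) ** 2 / (dot(psi0, psi0) * dot(psi, psi))
    return energy / (1.0 - overlap)


# Toy example: eigenvalues 0, 1, 3, so the true gap is 1.
H_toy = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 3.0]]
ground = [1.0, 0.0, 0.0]
```

In this toy example the trial vector `[1.0, 1.0, 1.0]` gives the value 2, an upper bound on the gap 1, while `[5.0, 1.0, 0.0]`, which has no component along the highest eigenvector, reproduces the gap itself up to rounding. The correction factor is what keeps a trial vector with large overlap with the ground state from producing an artificially small bound.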

The first factor in the RHS is the energy of the perturbed state ψ. The second factor is necessary to correct for the non-orthogonality of ψ and the ground state. In general, one would need to consider the orthogonal complement of ψ to the entire ground state subspace of H3 . In the present case however, we know that for each eigenvalue of the third component of the total spin, J (3) , there is exactly one ground state. As we will only consider perturbations that commute with J (3) , it is sufficient to take the orthogonal complement of ψ to ψ0 . Our ansatz for ψ is of the following form: Y (3) ei2φx Sx ψ0 . (1.4) ψ= x∈3R

The energy of such a state can be written as follows: hψ | H3 | ψi = kψk2

X

Px,y [1 − cos(φx − φy )],

(1.5)

x∈3R ,y∈3 |x−y|=1

where the Px,y are probabilities determined by the interface ground state. Px,y can be interpreted as the probability that the bond (x, y) belongs to “the interface contour”, i.e., one of the sites is occupied by an up spin and one by a down spin. These probabilities decay exponentially fast as a function of the distance to the expected location of the interface. In particular, this shows that the interface is rigid and that the problem of

66

O. Bolina, P. Contucci, B. Nachtergaele, S. Starr

calculating its excitation energies is quasi two-dimensional. In fact, the next step in our proof makes this explicit. We consider excitations of the form (1.4) with

\[
\phi_x = S\,\phi(x_\perp/R), \qquad R \ge 1,
\]

where $S$ is a suitable scale factor, $\phi$ is a smooth function with compact support in $\mathbb{R}^2$, and $x_\perp$ is the component of $x \in \mathbb{Z}^3$ orthogonal to the 111 direction. It is shown that the energy $\gamma_R$ of such excitations satisfies the bound

\[
\gamma_R \le \frac{C(q)}{R^2}\,\frac{\|\nabla\phi\|^2_{L^2}}{\|\phi\|^2_{L^2}}.
\]

In principle, $\phi$ is a map from $\mathbb{R}^2$ to the circle, and as such could have nontrivial topology. As we will only be considering small perturbations, this will be of no relevance here. It is, therefore, natural to take for $\phi$ an eigenfunction belonging to the smallest eigenvalue of $-\Delta$ on a circular domain with Dirichlet boundary conditions, which minimizes the Rayleigh quotient on the RHS, i.e., the Bessel function $J_0$. This is different from the so-called superinstanton ansatz of Patrascioiu and Seiler in [12], where they use the fundamental solution of the Laplace equation, instead of an eigenfunction.

All our results are for ground states that are eigenstates of the third component of the total spin, which is a conserved quantity, and for thermodynamic limits of such states. We will call this the canonical ensemble. Our derivation, however, relies on an equivalence of ensembles result for the interface ground states of the XXZ model. The state of the "small" volume $\Lambda_R$, immersed in the much larger volume $\Lambda$, is well approximated by a grand canonical state with suitable chemical potential (see Sect. 2 for the precise definitions), which does not have a fixed magnetization. As expected, this equivalence of ensembles holds only for observables that commute with the third component of the total spin, which are analogous to the gauge invariant observables in particle systems. This equivalence of ensembles result is non-trivial. Although we only give the proof in dimension 3, it is straightforward to generalize the proof to all dimensions $\ge 3$. Equivalence of ensembles (in the above sense) does not hold for the one-dimensional model. This can be derived from the results in [5]. In two dimensions, our method, without modifications, yields the equivalence of ensembles for volumes that grow as $\sqrt{L}$ in the 11 direction and as $L$ in the direction of the interface. With additional work one can obtain an equivalence of ensembles result for standard sequences of increasing volumes.
As another application of equivalence of ensembles we prove the existence of the thermodynamic limit of sequences of canonical ground states with a given density, i.e., magnetization per site, and filling factor of the interface. Concerning the gap above diagonal interface states in dimensions other than three we can make the following comments. First of all, diagonal interface states exist in all dimensions [1]. In one dimension there is a spectral gap above the ground states [7]. In two dimensions an upper bound of order $1/R$ was proved in [8]. The method of this paper can be used to obtain a bound of order $1/R^2$ also in two dimensions. In all dimensions greater than three our method can be applied without change to obtain equivalence of ensembles, the existence of the thermodynamic limit and an upper bound of order $1/R^2$ for the excitation energies. The paper is organized as follows. Section 2 introduces the model and the geometrical setting. Section 3 deals with the equivalence of ensembles result which is a main ingredient of our proofs. The bound on the excitation energy is a product of two factors

Quantum XXZ Model


Fig. 1. Example of a cylindrical $\Lambda$ embedded in $\mathbb{Z}^3$. A small cylindrical subvolume as used in the construction of the perturbed states is also shown

as in (1.3). A bound on the first factor, called the energy bound, is derived in Sect. 4. The second factor requires an estimate for the inner product of the ground state with the perturbed state, which is derived in Sect. 5. In Sect. 6 we prove a number of results for the grand canonical ensemble in one dimension that we use in the paper.

2. Interface States of the XXZ Model

Our magnet occupies a volume $\Lambda$ which is a subset of $\mathbb{Z}^3$. Let $e_1, e_2, e_3$ denote the standard basis vectors in $\mathbb{Z}^3$. (See Fig. 1.) We let $l(x)$ denote the signed distance from the origin: $l(x) = x^1 + x^2 + x^3$, where $x = (x^1, x^2, x^3) \in \mathbb{Z}^3$. Then

\[
B(\Lambda) = \{(x_0, x_1) : |x_0 - x_1| = 1,\ l(x_1) = l(x_0) + 1\}
\tag{2.1}
\]

describes the set of oriented bonds in $\mathbb{Z}^3$. The infinite stick $\Sigma_0^\infty$ is, by definition, the set of vertices of the form $\ldots, -e_2 - e_3, -e_3, 0, e_1, e_1 + e_2, e_1 + e_2 + e_3, e_1 + e_2 + e_3 + e_1, \ldots$. For any even integer $L$, the finite stick $\Sigma_0$ of length $L + 1$ is then given by $\Sigma_0 = \{x \in \Sigma_0^\infty \mid -L/2 \le l(x) \le L/2\}$. We will take for $\Lambda$ a cylindrical region whose axis points in the 111 direction, where by cylindrical we mean that $\Lambda$ can be obtained from a subset $\Gamma$ of the $l(x) = 0$ plane, which we will call the base, by adding to all vertices $x \in \Gamma$ the finite stick $\Sigma_0$: $\Lambda = \{x + y \mid x \in \Gamma,\ y \in \Sigma_0\}$. The equation $l(x) = c$, for any constant $c$, defines a cross-section of $\Lambda$, which contains exactly $A = |\Gamma|$ vertices. Hence, $|\Lambda| = (L + 1)A$. We refer to these cross-sections as planes. As an example, the projection onto the plane $l(x) = 0$ of the vertices of $\Lambda$ with triangular base is shown in Fig. 2, with different shades depending on the value of $l(x)$ modulo 3. The orientation of the bonds is indicated by arrows, and one may observe that


Fig. 2. The projection onto the 111 plane of a cylindrical volume $\Lambda$ with triangular base. The shading of the vertices depends on the value of $l(x)$ modulo 3. The orientation of the bonds is indicated by arrows. Observe that each site has an equal number of incoming and outgoing bonds

Fig. 3. The bonds connecting the vertices of a stick $\Sigma$ form a one-dimensional subsystem

each site on the interior of $\Lambda$ has an equal number of incoming and outgoing bonds. By construction, $\Lambda$ can be decomposed into one-dimensional sticks running parallel to the cylindrical axis, which we will generically call $\Sigma$. (See Fig. 3.) One should observe that $\Sigma$ is comprised entirely of nearest-neighbor pairs so that every site on $\Sigma$ is connected to every other site by a sequence of bonds. This will allow us to exploit the well-known properties of the one-dimensional Heisenberg XXZ model to describe $\Sigma$. The Hamiltonian for the spin-$\frac12$ ferromagnetic XXZ Heisenberg model is given by

\[
H_\Lambda = \sum_{(x_0,x_1)\in B(\Lambda)} h^q_{x_0,x_1},
\tag{2.2}
\]


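where

\[
h^q_{x_0,x_1} = -\Delta^{-1}\big(S^{(1)}_{x_0}S^{(1)}_{x_1} + S^{(2)}_{x_0}S^{(2)}_{x_1}\big) - S^{(3)}_{x_0}S^{(3)}_{x_1} + \frac14 + A(\Delta)\big(S^{(3)}_{x_1} - S^{(3)}_{x_0}\big),
\tag{2.3}
\]

and $\Delta \ge 1$ is the "anisotropic coupling", $A(\Delta) = \frac12\sqrt{1 - 1/\Delta^2}$, and $q$, $0 < q < 1$, is the solution of $\Delta = \frac12(q + q^{-1})$. The matrices $S_x^{(\alpha)}$ ($\alpha = 1, 2, 3$) are the Pauli spin matrices acting on the site $x$,

\[
S^{(1)} = \begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}, \qquad
S^{(2)} = \begin{pmatrix} 0 & -i/2 \\ i/2 & 0 \end{pmatrix}, \qquad
S^{(3)} = \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix}.
\tag{2.4}
\]

The terms containing $A(\Delta)$ cancel on all sites except at the top and bottom plane of the cylinder. The usefulness of the nearest-neighbor Hamiltonian stems from the fact that its action on any bond is given by

\[
\begin{aligned}
h^q\,|{\downarrow\downarrow}\rangle &= 0, &
h^q\,|{\downarrow\uparrow}\rangle &= \frac{1}{q + q^{-1}}\big(q\,|{\downarrow\uparrow}\rangle - |{\uparrow\downarrow}\rangle\big), \\
h^q\,|{\uparrow\uparrow}\rangle &= 0, &
h^q\,|{\uparrow\downarrow}\rangle &= -\frac{1}{q + q^{-1}}\big(|{\downarrow\uparrow}\rangle - q^{-1}\,|{\uparrow\downarrow}\rangle\big).
\end{aligned}
\]

In other words, $h^q$ is the orthogonal projection on the unit vector

\[
\xi_q = \frac{1}{\sqrt{1 + q^2}}\big(q\,|{\downarrow\uparrow}\rangle - |{\uparrow\downarrow}\rangle\big).
\tag{2.5}
\]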
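The statement that $h^q$ is the orthogonal projection on $\xi_q$ can be checked numerically on a single bond. The following is a minimal sketch, not taken from the paper: the basis ordering and the helper names `bond_projector`, `matmul` and `apply` are illustrative assumptions.

```python
import math

def bond_projector(q):
    """Rank-one projector |xi_q><xi_q| on one oriented bond (x0, x1).

    Assumed basis order (illustrative): |up up>, |up dn>, |dn up>, |dn dn>,
    the first arrow being the spin at x0 and the second the spin at x1.
    """
    norm = math.sqrt(1.0 + q * q)
    # xi_q = (q |dn up> - |up dn>) / sqrt(1 + q^2), cf. (2.5)
    xi = [0.0, -1.0 / norm, q / norm, 0.0]
    return [[xi[i] * xi[j] for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(h, v):
    """Apply the 4x4 matrix h to the vector v."""
    return [sum(h[i][j] * v[j] for j in range(4)) for i in range(4)]

if __name__ == "__main__":
    q = 0.5
    h = bond_projector(q)
    # Action on |dn up>: expected (q |dn up> - |up dn>) / (q + 1/q)
    print(apply(h, [0.0, 0.0, 1.0, 0.0]))
```

Applying `h` to the basis vector for $|{\downarrow\uparrow}\rangle$ reproduces the coefficients $q/(q+q^{-1})$ and $-1/(q+q^{-1})$ displayed above, and `matmul(h, h)` returns `h` up to rounding, which is the projection property.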

There is a $(|\Lambda| + 1)$-fold degeneracy in the ground states, with a unique ground state for each value of the total third component of the spin, $\sum_{x\in\Lambda} S_x^{(3)}$. The basis vectors of the Hilbert space $(\mathbb{C}^2)^{\otimes|\Lambda|}$ can be labeled with particle configurations $\alpha = \{\alpha(x)\}_{x\in\Lambda}$, where $\alpha(x)$ is 0 or 1, corresponding to $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$, respectively. We write $N$ for the operator defined by

\[
N\,|\alpha\rangle = \Big(\sum_{x\in\Lambda}\alpha(x)\Big)\,|\alpha\rangle,
\]

and let $\mathcal{A}(\Lambda, n)$ denote the collection of all configurations with $N(\alpha) = n$. Following [1] the ground states are given by

\[
\psi_0(\Lambda, n) = \sum_{\alpha\in\mathcal{A}(\Lambda,n)}\ \bigotimes_{x\in\Lambda} q^{l(x)\alpha(x)}\,|\alpha(x)\rangle.
\tag{2.6}
\]

Note that the weights of $\alpha$ are invariant under any permutation of the sites for which planes are invariant. These states describe an interface located, on the average, in the plane determined by $(L/2 + l_x)A = n$ [8]. We denote $\|\psi_0(\Lambda, n)\|^2$ by $Z(\Lambda, n)$. This quantity is given by

\[
Z(\Lambda, n) = \sum_{\alpha\in\mathcal{A}(\Lambda,n)}\ \prod_{x\in\Lambda} q^{2l(x)\alpha(x)}.
\tag{2.7}
\]

We will treat $Z(\Lambda, n)$ as a canonical partition function. It will be useful to consider, also, its grand canonical analogue:

\[
Z^{GC}(\Lambda, \mu) = \sum_{n=0}^{|\Lambda|} Z(\Lambda, n)\,q^{-2\mu n} = \prod_{x\in\Lambda}\big(1 + q^{2(l(x)-\mu)}\big).
\tag{2.8}
\]
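The product formula in (2.8) can be confirmed by brute-force enumeration on a very small volume. The sketch below is only an illustration: a volume is represented by the list of values $l(x)$ of its sites, and the function names are hypothetical.

```python
from itertools import product

def canonical_Z(levels, q):
    """Return [Z(n) for n = 0..|volume|], following (2.7).

    `levels` lists l(x) for every site x of the volume (an assumed encoding).
    """
    Z = [0.0] * (len(levels) + 1)
    for alpha in product((0, 1), repeat=len(levels)):
        weight = 1.0
        for l, occupied in zip(levels, alpha):
            if occupied:
                weight *= q ** (2 * l)
        Z[sum(alpha)] += weight
    return Z

def grand_canonical_sum(levels, q, mu):
    """Left-hand side of (2.8): sum over n of Z(n) q^(-2 mu n)."""
    return sum(z * q ** (-2.0 * mu * n)
               for n, z in enumerate(canonical_Z(levels, q)))

def grand_canonical_product(levels, q, mu):
    """Right-hand side of (2.8): product over sites of 1 + q^(2(l(x) - mu))."""
    out = 1.0
    for l in levels:
        out *= 1.0 + q ** (2.0 * (l - mu))
    return out

if __name__ == "__main__":
    # Two sticks of length 3 (L = 2, A = 2): sites with l(x) in {-1, 0, 1}.
    levels = [-1, 0, 1, -1, 0, 1]
    print(grand_canonical_sum(levels, 0.6, 0.3),
          grand_canonical_product(levels, 0.6, 0.3))
```

The enumeration is exponential in the number of sites, so it is only meant as a sanity check of the identity, not as a way to compute $Z(\Lambda, n)$ for realistic volumes.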


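Then it is easily seen that $Z^{GC}(\Lambda, \mu)$ is the squared norm of the grand canonical vector defined by

\[
\psi^{GC}(\Lambda, \mu) = \sum_{n=0}^{|\Lambda|} q^{-n\mu}\,\psi_0(\Lambda, n) = \bigotimes_{x\in\Lambda}\big(|{\uparrow}\rangle + q^{l(x)-\mu}\,|{\downarrow}\rangle\big).
\tag{2.9}
\]

Due to the product structure, the thermodynamic limit is simply given by

\[
\langle X\rangle^{GC}_{\mathbb{Z}^3,\mu} =
\Bigg(\bigotimes_{x\in\mathbb{Z}^3}\frac{\langle{\uparrow}| + q^{l(x)-\mu}\,\langle{\downarrow}|}{\sqrt{1 + q^{2(l(x)-\mu)}}}\Bigg)\; X\;
\Bigg(\bigotimes_{x\in\mathbb{Z}^3}\frac{|{\uparrow}\rangle + q^{l(x)-\mu}\,|{\downarrow}\rangle}{\sqrt{1 + q^{2(l(x)-\mu)}}}\Bigg)
\tag{2.10}
\]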

for all local observables $X$.

3. Equivalence of Ensembles

A key step in our argument is the development of an equivalence of ensembles. Specifically, we will show that for a gauge-invariant local observable the canonical expectation is close to the grand canonical expectation for some suitably chosen chemical potential $\mu$. Here $\mu$ only depends on the total spin of the canonical ensemble, not on the form of the observable. From this, a thermodynamic limit for gauge-invariant observables follows naturally. We begin with activity bounds that show that the ratio of two canonical partition functions with different particle numbers is approximately exponential in the difference of the particle numbers, i.e., $Z(\Lambda, n - k) \approx Z(\Lambda, n)\,q^{-2k\mu}$ for $|k| \ll n$. More precisely, we have the following lemma.

Lemma 3.1 (Activity Bounds). For every volume $\Lambda$, $|\Lambda| = (L + 1)A$, the ratio of canonical partition functions for different numbers of particles can be bounded from above and below by activity bounds as follows. Let $A_0$ be any constant. Suppose $n$, $0 \le n \le A(L + 1)$, and $\mu$ are such that

\[
\big|n - A\langle N\rangle^{GC}_{\Sigma,\mu}\big| \le \frac12 A_0 A^{1/2}.
\tag{3.1}
\]

Then, for every $k$ satisfying

\[
|k| \le \frac12 A_0 A^{1/2},
\tag{3.2}
\]

one has the bounds

\[
\frac{Z(\Lambda, n)}{Z(\Lambda, n - k)} \le C(A_0, A)\; q^{\,k\left[2\frac{n}{A} - 2\langle N\rangle^{GC}_{\Sigma,\mu} + 2\mu a\sigma^2 - \frac{k}{A}\right]/(a\sigma^2)},
\tag{3.3}
\]

and

\[
\frac{Z(\Lambda, n)}{Z(\Lambda, n - k)} \ge C(A_0, A)^{-1}\; q^{\,k\left[2\frac{n}{A} - 2\langle N\rangle^{GC}_{\Sigma,\mu} + 2\mu a\sigma^2 - \frac{k}{A}\right]/(a\sigma^2)},
\tag{3.4}
\]

where $a = 2|\ln q|$,

\[
\sigma^2 := \sigma^2(\mu, L) = \sum_{l=-L/2}^{L/2}\frac{1}{4\cosh^2\!\big(\frac{a}{2}(l - \mu)\big)},
\]

and

\[
C(A_0, A) = \frac{1 + \dfrac{A_0}{\sigma^2 A^{1/2}}}{1 - \dfrac{A_0}{\sigma^2 A^{1/2}}}.
\tag{3.5}
\]

Moreover, if $\mu$ is the solution of $\frac{n}{A} - \langle N\rangle^{GC}_{\Sigma,\mu} = 0$, then, also using the bounds for $\sigma^2$ given in (6.15), we obtain

\[
C(A_0/2, A)^{-1}\, q^{-\frac{k^2(1-q^2)}{2a(1+q^2)A}} \le q^{-2k\mu}\,\frac{Z(\Lambda, n)}{Z(\Lambda, n - k)} \le C(A_0/2, A)\, q^{-\frac{2k^2(1-q^2)}{a q^2 A}}.
\tag{3.6}
\]

Alternatively, if $\mu$ solves $\frac{n-k}{A} - \langle N\rangle^{GC}_{\Sigma,\mu} = 0$, then we obtain

\[
C(A_0/2, A)^{-1}\, q^{\frac{k^2(1-q^2)}{2a(1+q^2)A}} \le q^{-2k\mu}\,\frac{Z(\Lambda, n)}{Z(\Lambda, n - k)} \le C(A_0/2, A)\, q^{\frac{2k^2(1-q^2)}{a q^2 A}}.
\tag{3.7}
\]
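The quantity $\sigma^2(\mu, L)$ in the lemma is the variance of the occupation number of a single stick in the grand canonical state, where, by the product formula (2.8), site $l$ is occupied independently with probability $q^{2(l-\mu)}/(1 + q^{2(l-\mu)})$. The sketch below checks numerically that the sum of the site variances agrees with the $\cosh$ formula; the function names are illustrative and not from the paper.

```python
import math

def occupation_probability(q, l, mu):
    """Probability that site l of a stick is occupied, read off from (2.8)."""
    w = q ** (2.0 * (l - mu))
    return w / (1.0 + w)

def stick_mean_and_variance(q, L, mu):
    """Mean and variance of the particle number of one stick (L even)."""
    mean = 0.0
    var = 0.0
    for l in range(-(L // 2), L // 2 + 1):
        p = occupation_probability(q, l, mu)
        mean += p
        var += p * (1.0 - p)   # independent sites: variances add
    return mean, var

def sigma2_formula(q, L, mu):
    """sigma^2(mu, L) as the sum of 1 / (4 cosh^2(a (l - mu) / 2))."""
    a = 2.0 * abs(math.log(q))
    return sum(1.0 / (4.0 * math.cosh(0.5 * a * (l - mu)) ** 2)
               for l in range(-(L // 2), L // 2 + 1))

if __name__ == "__main__":
    print(stick_mean_and_variance(0.5, 6, 0.3), sigma2_formula(0.5, 6, 0.3))
    # At mu = 0 the mean is (L + 1)/2, i.e. density 1/2.
    print(stick_mean_and_variance(0.5, 6, 0.0)[0])
```

At $\mu = 0$ the occupation probabilities of the sites $l$ and $-l$ add up to one, so the mean is $(L+1)/2$, in line with the correspondence between $\rho = 1/2$ and $\mu = 0$ used later in the text.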

Proof. This can be obtained as follows. Let us consider the grand canonical probability

\[
p(\mu, \mathbf{n}) = q^{-2\mu|\mathbf{n}|}\,\frac{Z(\mathbf{n})}{Z_{GC}(\mu)},
\tag{3.8}
\]

with

\[
Z(\mathbf{n}) = \sum_{\alpha\in\mathcal{A}(\Sigma_1,n_1)\otimes\cdots\otimes\mathcal{A}(\Sigma_{A-A_0},n_{A-A_0})} q^{w(\alpha)},
\tag{3.9}
\]

where $\Sigma_i$ is the $i$-th one-dimensional stick that we are decomposing our volume in, and where $Z_{GC}(\mu)$ is the grand-canonical partition function. Clearly, we have

\[
Z(n) = \sum_{\mathbf{n}:\,|\mathbf{n}|=n} Z(\mathbf{n}).
\tag{3.10}
\]

Define

\[
p(\mu, n) = \sum_{\mathbf{n}:\,|\mathbf{n}|=n} p(\mu, \mathbf{n}),
\tag{3.11}
\]

and we have

\[
\frac{Z(n)}{Z(n - k)} = q^{2k\mu}\,\frac{p(\mu, n)}{p(\mu, n - k)}.
\tag{3.12}
\]

The idea now is to make use of the local central limit theorem for the probability distribution of the occupation number in the $i$-th stick (see [4, Theorem XVI.4.3.]). Let $\xi_i = \sum_{x\in\Sigma_i}\alpha_x$. For any integer $N$, consider the probability

\[
P_\mu(\xi_1 = n_1, \ldots, \xi_N = n_N) = p(\mu, \mathbf{n}).
\tag{3.13}
\]


Due to the factorization property of $p(\mu, \mathbf{n})$, the $\xi$'s are independent identically distributed random variables. For centered i.i.d. random variables $X_i$ with variance $\sigma^2$, the local central limit theorem guarantees that the random variable

\[
S_N = \frac{1}{\sigma\sqrt{N}}\sum_{n=1}^{N} X_n
\tag{3.14}
\]

is close to a Gaussian in the sense that the quantity

\[
P_N(x) := \mathrm{Prob}\Big(\sum_{n=1}^{N} X_n = x\Big)
\tag{3.15}
\]

fulfills the bounds

\[
\frac{1}{\sigma\sqrt{2\pi N}}\,e^{-\frac{x^2}{2\sigma^2 N}}\Big(1 - \frac{c}{\sqrt{N}}\Big) \le P_N(x) \le \frac{1}{\sigma\sqrt{2\pi N}}\,e^{-\frac{x^2}{2\sigma^2 N}}\Big(1 + \frac{c}{\sqrt{N}}\Big),
\tag{3.16}
\]

where $c$ is the constant

\[
c = \frac{\max(|x|, |x - k|)}{\sigma^2\sqrt{N}}.
\tag{3.17}
\]

By applying (3.16) to the centered quantity $X_n = \xi_n - \langle\xi_n\rangle$, we obtain the following bounds on the ratio of probabilities:

\[
C(N)^{-1}\,e^{-k(2x-k)/2\sigma^2 N} \le \frac{P_N(x)}{P_N(x - k)} \le C(N)\,e^{-k(2x-k)/2\sigma^2 N},
\tag{3.18}
\]

where

\[
C(N) = \frac{1 + cN^{-1/2}}{1 - cN^{-1/2}}.
\tag{3.19}
\]

In terms of the non-centered variables $\xi_i$ we have

\[
p(\mu, n) = P_A\big(n - A\langle N\rangle^{GC}_{\Sigma,\mu}\big),
\tag{3.20}
\]

where $\langle N\rangle^{GC}_{\Sigma,\mu}$ is the average number of particles of a 1D stick $\Sigma$, in the grand canonical ensemble with chemical potential $\mu$. From this and the hypotheses (3.1), (3.2), we obtain

\[
c = \frac{A_0}{\sigma^2}
\qquad\text{and}\qquad
C(A_0, A) = \frac{1 + \dfrac{A_0}{\sigma^2 A^{1/2}}}{1 - \dfrac{A_0}{\sigma^2 A^{1/2}}}.
\tag{3.21}
\]

Note that in case $\mu$ is chosen so that $\langle N\rangle^{GC}_{\Sigma,\mu} = n/A$ or $\langle N\rangle^{GC}_{\Sigma,\mu} = (n - k)/A$ then we can replace $c$ by $c/2$, with the result that $C(A_0, A)$ may be replaced by

\[
C(A_0/2, A) = \frac{1 + \dfrac{A_0}{2\sigma^2 A^{1/2}}}{1 - \dfrac{A_0}{2\sigma^2 A^{1/2}}}
\]

as well.
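The Gaussian ratio appearing in (3.18) can be compared with an exact computation for a modest number of sticks. The following sketch builds the exact law of one stick's occupation number from independent sites, convolves it $N$ times, and compares a ratio of point probabilities with $e^{-k(2x-k)/2\sigma^2 N}$. The parameter values and function names are assumptions chosen for illustration; the code does not compute the constant $c$ of (3.17).

```python
import math

def stick_distribution(q, L, mu):
    """Exact law of the occupation number of one stick (independent sites)."""
    dist = [1.0]
    for l in range(-(L // 2), L // 2 + 1):
        w = q ** (2.0 * (l - mu))
        p = w / (1.0 + w)
        nxt = [0.0] * (len(dist) + 1)
        for n, pr in enumerate(dist):
            nxt[n] += pr * (1.0 - p)
            nxt[n + 1] += pr * p
        dist = nxt
    return dist

def convolve(a, b):
    """Law of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def n_fold(single, N):
    """Law of the total occupation number of N independent sticks."""
    total = [1.0]
    for _ in range(N):
        total = convolve(total, single)
    return total

def mean_var(dist):
    """Mean and variance of a law given as a list of point probabilities."""
    m = sum(n * p for n, p in enumerate(dist))
    v = sum((n - m) ** 2 * p for n, p in enumerate(dist))
    return m, v

def gaussian_ratio(x, k, sigma2, N):
    """The factor exp(-k (2x - k) / (2 sigma^2 N)) of (3.18)."""
    return math.exp(-k * (2 * x - k) / (2.0 * sigma2 * N))

if __name__ == "__main__":
    single = stick_distribution(0.5, 4, 0.0)   # mean 2.5 per stick
    _, sigma2 = mean_var(single)
    total = n_fold(single, 200)                # mean 500 in total
    # centered x = 5, k = 3  <->  particle numbers 505 and 502
    print(total[505] / total[502], gaussian_ratio(5, 3, sigma2, 200))
```

For these values the exact ratio and the Gaussian factor agree closely, which is the mechanism behind the activity bounds: the ratio of canonical partition functions is controlled by a ratio of nearly Gaussian probabilities.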


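Also, from (3.20) and (3.18), we have

\[
C(A_0, A)^{-1}\,e^{-\frac{k\left(2n - 2A\langle N\rangle^{GC}_{\Sigma,\mu} - k\right)}{2\sigma^2 A}} \le \frac{p(\mu, n)}{p(\mu, n - k)} \le C(A_0, A)\,e^{-\frac{k\left(2n - 2A\langle N\rangle^{GC}_{\Sigma,\mu} - k\right)}{2\sigma^2 A}}.
\tag{3.22}
\]

Using (3.12) (and observing that $q^{2\mu k} = e^{-a\mu k}$), we have

\[
\frac{Z(n)}{Z(n - k)} \le C(A_0, A)\,e^{-k\left[2\frac{n}{A} - 2\langle N\rangle^{GC}_{\Sigma,\mu} + 2a\sigma^2\mu - \frac{k}{A}\right]/2\sigma^2},
\tag{3.23}
\]

and

\[
\frac{Z(n)}{Z(n - k)} \ge C(A_0, A)^{-1}\,e^{-k\left[2\frac{n}{A} - 2\langle N\rangle^{GC}_{\Sigma,\mu} + 2a\sigma^2\mu - \frac{k}{A}\right]/2\sigma^2}.
\tag{3.24}
\]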

Changing to base $q$ then leads to Eqs. (3.3) and (3.4) of the lemma. By the derivation of Sect. 6.2, we have the bounds on the variance for the number of particles in a 1D stick:

\[
\frac14\,\frac{q^2}{1 - q^2} \le \sigma^2(\mu) \le \frac{1 + q^2}{1 - q^2}.
\tag{3.25}
\]

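In conjunction with the remark about replacing $C(A_0, A)$ by $C(A_0/2, A)$, this gives Eqs. (3.6) and (3.7). □

As an application of this lemma, let us consider the case where $n$ is replaced by $\rho|\Lambda| - n_0$, $k$ is replaced by $\rho|\Lambda_0| - n_0$ and $\Lambda$ is replaced by $\Lambda_0^c := \Lambda \setminus \Lambda_0$. This means that in the lemma $A$ is replaced by $A - A_0$, and $(n - k)/A$ is replaced by $\rho(|\Lambda| - |\Lambda_0|)/(A - A_0) = \rho(L + 1)$. Then, direct substitution shows

\[
\frac{Z(\Lambda_0^c, \rho|\Lambda| - n_0)}{Z(\Lambda_0^c, \rho|\Lambda_0^c|)} \le C(A_0/2, A - A_0)\; q^{-2k\mu}\, e^{-k\left[2\rho(L+1) - 2\langle N\rangle^{GC}_{\Sigma,\mu} + \frac{k}{A - A_0}\right]/2\sigma^2},
\tag{3.26}
\]

\[
\frac{Z(\Lambda_0^c, \rho|\Lambda| - n_0)}{Z(\Lambda_0^c, \rho|\Lambda_0^c|)} \ge C(A_0/2, A - A_0)^{-1}\; q^{-2k\mu}\, e^{-k\left[2\rho(L+1) - 2\langle N\rangle^{GC}_{\Sigma,\mu} + \frac{k}{A - A_0}\right]/2\sigma^2},
\tag{3.27}
\]

where we have retained $k$, for the moment. If, further, we choose $\mu$ so that $\langle N\rangle^{GC}_{\Sigma,\mu} = \rho(L + 1)$, which is always possible (see Sect. 6.3), then, by Eq. (3.7), we have

\[
q^{2\mu k}\,\frac{Z(\Lambda_0^c, \rho|\Lambda| - n_0)}{Z(\Lambda_0^c, \rho|\Lambda_0^c|)} \le C(A_0/2, A - A_0)\, e^{-\frac{k^2}{2(A - A_0)\sigma^2}},
\tag{3.28}
\]

\[
q^{2\mu k}\,\frac{Z(\Lambda_0^c, \rho|\Lambda| - n_0)}{Z(\Lambda_0^c, \rho|\Lambda_0^c|)} \ge C(A_0/2, A - A_0)^{-1}\, e^{-\frac{k^2}{2(A - A_0)\sigma^2}}.
\tag{3.29}
\]

Using our bounds for $\sigma^2$, we have

\[
q^{2\mu k}\,\frac{Z(\Lambda_0^c, \rho|\Lambda| - n_0)}{Z(\Lambda_0^c, \rho|\Lambda_0^c|)} \le C(A_0/2, A - A_0)\, e^{-\frac{(1 - q^2)k^2}{2(1 + q^2)(A - A_0)}},
\tag{3.30}
\]

\[
q^{2\mu k}\,\frac{Z(\Lambda_0^c, \rho|\Lambda| - n_0)}{Z(\Lambda_0^c, \rho|\Lambda_0^c|)} \ge C(A_0/2, A - A_0)^{-1}\, e^{-\frac{2(1 - q^2)k^2}{2q^2(A - A_0)}}.
\tag{3.31}
\]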


By our choice of $\mu$, conditions (3.1) and (3.2) are satisfied as long as the order of $L$ does not exceed the order of $(A - A_0)^{1/2}$. This estimate will be of use in the next theorem. Let $\|X\|_{gs}$ denote the operator norm of $X$ restricted to the subspace of ground states. For observables $X$, localized in $\Lambda$ and commuting with $J^{(3)}$, $\|X\|_{gs}$ is also given by

\[
\|X\|_{gs} = \sup_{0\le n\le|\Lambda|} |\langle X\rangle_{\Lambda,n}|.
\]

Theorem 3.1 (Equivalence of Ensembles). Consider two cylindrical volumes $\Lambda$ and $\Lambda_0$, $\Lambda_0 \subset \Lambda$, of the type defined in Sect. 2 (in particular $|\Lambda| = A(L + 1)$, $|\Lambda_0| = A_0(L + 1)$), and fix a total number of particles $n_\Lambda$. Define $\rho = n_\Lambda/|\Lambda|$. Suppose $X$ is a local observable in the volume $\Lambda_0$, which commutes with $J^{(3)} := \sum_x S_x^{(3)}$. Then we have

\[
\big|\langle X\rangle_{\Lambda,n} - \langle X\rangle^{GC}_{\Lambda_0,\mu}\big| \le \varepsilon\,\|X\|_{gs},
\tag{3.32}
\]

where

\[
\varepsilon = \frac{\ln^2(A - A_0) + 2(1 + a^2)A_0^2 + 4}{2(A - A_0)} + \frac{4A_0}{q^2(A - A_0)^{1/2} - 2A_0},
\tag{3.33}
\]

$a = 2|\ln q|$, and the chemical potential $\mu$ is determined by the equation

\[
\langle N\rangle^{GC}_{\Sigma,\mu} = \rho(L + 1).
\tag{3.34}
\]

In particular, for $\rho = 1/2$ the calculations of Sect. 6.1 will show that $\mu = 0$.

Corollary 3.1 (Existence of the Thermodynamic Limit). (i) Suppose we have a sequence of pairs $(\Lambda_k, n_k)$ with $\Lambda_k$ cylindrical volumes and $\Lambda_k \nearrow \mathbb{Z}^3$ in such a way that the length does not grow faster than the linear size of the base. Let $\mu_k$ solve $\langle N\rangle^{GC}_{\Lambda_k,\mu_k} = n_k$. Then the convergence $\mu_k \to \mu$ guarantees the convergence of $\langle\,\cdot\,\rangle_{\Lambda_k,n_k}$ to $\langle\,\cdot\,\rangle^{GC}_{\mathbb{Z}^3,\mu}$, for all local observables $X$ commuting with $J^{(3)}$:

\[
\langle X\rangle_{\Lambda_k,n_k} \to \langle X\rangle^{GC}_{\mathbb{Z}^3,\mu}.
\tag{3.35}
\]

(ii) Moreover, for any choice of $\mu$, we may find a sequence of pairs $(\Lambda_k, n_k)$ such that

\[
\langle X\rangle_{\Lambda_k,n_k} \to \langle X\rangle^{GC}_{\mathbb{Z}^3,\mu}.
\tag{3.36}
\]

Proof of Corollary. It follows from the monotonicity of $\langle N\rangle^{GC}_{\Sigma,\mu}$ proved in Sect. 6.1, that the equation

\[
\langle N\rangle^{GC}_{\Lambda_k,\mu_k} = n_k
\tag{3.37}
\]

always has a unique solution for $\mu_k$. Then, (i) follows immediately from the inequality (3.32), once we observe that $\varepsilon \searrow 0$ as $\Lambda \nearrow \mathbb{Z}^3$ in the sense prescribed in the corollary. For (ii), take $\Lambda_k$, with base $A_k$, and $n_k$ such that

\[
n_k = \big\lfloor A_k\langle N\rangle^{GC}_{\Sigma,\mu}\big\rfloor,
\]


where $\lfloor x\rfloor$ denotes the largest integer $\le x$. Then, $\mu_k$ solving (3.37) is easily seen to converge to $\mu$, and (3.36) follows from (i). □

The interpretation of the condition $\mu_k \to \mu$ in (i) of the corollary is that, not only does $n_k/|\Lambda_k|$ converge to $\rho = 1/2$, but, more precisely, $n_k = \rho|\Lambda_k| + \nu A_k + o(A_k)$. The term proportional to $|\Lambda_k|$ guarantees that the interface is in the center of the volume, the second term fixes its filling factor.

Proof of Theorem 3.1. Let $\mu$ be determined by (3.34), and define $\Xi$ as follows:

\[
\Xi = \frac{Z(\Lambda, n_\Lambda)\,q^{-2\mu\rho|\Lambda_0|}}{Z(\Lambda_0^c, \rho|\Lambda_0^c|)\,Z^{GC}(\Lambda_0, \mu)},
\tag{3.38}
\]

where $\Lambda_0^c := \Lambda \setminus \Lambda_0$. We will obtain the equivalence of ensembles by combining two facts. The first is that $\Xi$ is approximately equal to 1, and the second is an estimate showing that $|\langle X\rangle_{\Lambda,n_\Lambda}\Xi - \langle X\rangle^{GC}_{\Lambda_0,\mu}| \le \varepsilon\|X\|_{gs}$. But first, let us recall the definitions of the expectation of an observable $X$:

\[
\langle X\rangle_{\Lambda,n} = \frac{\langle\psi(\Lambda, n)|\,X\,|\psi(\Lambda, n)\rangle}{\langle\psi(\Lambda, n)|\psi(\Lambda, n)\rangle},
\tag{3.39}
\]

\[
\langle X\rangle^{GC}_{\Lambda,\mu} = \frac{\langle\psi^{GC}(\Lambda, \mu)|\,X\,|\psi^{GC}(\Lambda, \mu)\rangle}{\langle\psi^{GC}(\Lambda, \mu)|\psi^{GC}(\Lambda, \mu)\rangle}.
\tag{3.40}
\]

Since $X$ is an observable localized in $\Lambda_0$, we note that $\langle X\rangle^{GC}_{\Lambda,\mu} = \langle X\rangle^{GC}_{\Lambda_0,\mu}$. Moreover, we may decompose the grand canonical state into a superposition of canonical states:

\[
\psi^{GC}(\Lambda_0, \mu) = \sum_{n_0=0}^{|\Lambda_0|} q^{-\mu n_0}\,\psi(\Lambda_0, n_0).
\tag{3.41}
\]

Since $X$ commutes with $J^{(3)}$, it does not have off-diagonal matrix elements between canonical states with different values of the total spin. Therefore,

\[
\langle X\rangle^{GC}_{\Lambda,\mu} = Z^{GC}(\Lambda_0, \mu)^{-1}\sum_{n_0=0}^{|\Lambda_0|} q^{-2\mu n_0}\,Z(\Lambda_0, n_0)\,\langle X\rangle_{\Lambda_0,n_0}.
\tag{3.42}
\]

Note also, that since we have a decomposition

\[
\psi(\Lambda, n) = \sum_{n_0=0}^{|\Lambda_0|} \psi(\Lambda \setminus \Lambda_0, n - n_0)\otimes\psi(\Lambda_0, n_0),
\tag{3.43}
\]


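and using the previously described properties, we have

\[
\langle X\rangle_{\Lambda,n} = \sum_{n_0=0}^{|\Lambda_0|}\frac{Z(\Lambda \setminus \Lambda_0, n - n_0)\,Z(\Lambda_0, n_0)}{Z(\Lambda, n)}\,\langle X\rangle_{\Lambda_0,n_0}
\tag{3.44}
\]

\[
= Z^{GC}(\Lambda_0, \mu)^{-1}\sum_{n_0=0}^{|\Lambda_0|} q^{-2\mu n_0}\,Z(\Lambda_0, n_0)\,\langle X\rangle_{\Lambda_0,n_0}\times\frac{Z(\Lambda_0^c, n - n_0)\,Z^{GC}(\Lambda_0, \mu)}{q^{-2\mu n_0}\,Z(\Lambda, n)}.
\tag{3.45}
\]

This differs from the definition of $\langle X\rangle^{GC}_{\Lambda_0,\mu}$ only by the final factor, which is a ratio of partition functions and hence amenable to our activity bounds. In fact, we have

\[
\langle X\rangle_{\Lambda,n}\,\Xi - \langle X\rangle^{GC}_{\Lambda,\mu} = Z^{GC}(\Lambda_0, \mu)^{-1}\sum_{n_0=0}^{|\Lambda_0|} q^{-2\mu n_0}\,\langle X\rangle_{\Lambda_0,n_0}\,Z(\Lambda_0, n_0)\times\left[q^{2\mu(n_0 - \langle n_0\rangle)}\,\frac{Z(\Lambda_0^c, n - n_0)}{Z(\Lambda_0^c, \lfloor\rho|\Lambda_0^c|\rfloor)} - 1\right],
\tag{3.46}
\]

where $\langle n_0\rangle = \langle N\rangle^{GC}_{\Lambda_0,\mu}$, which equals $\rho|\Lambda_0|$ for our choice of $\mu$. Thus we obtain $|\langle X\rangle_{\Lambda,n}\,\Xi - \langle X\rangle^{GC}_{\Lambda,\mu}| \le \|X\|_{gs}\,\langle|g|\rangle^{GC}_{\Lambda_0,\mu}$, where

\[
g = q^{2\mu(n_0 - \langle n_0\rangle)}\,\frac{Z(\Lambda_0^c, n - n_0)}{Z(\Lambda_0^c, \lfloor\rho|\Lambda_0^c|\rfloor)} - 1.
\tag{3.47}
\]

Now we use the activity bounds (3.30) and (3.31), but replacing $k$ by its actual value, $\langle n_0\rangle - n_0$. We arrive at the bounds

\[
g \le g_1 := C(A_0/2, A - A_0)\,e^{-\frac{(1 - q^2)(\langle n_0\rangle - n_0)^2}{2(1 + q^2)(A - A_0)}} - 1,
\tag{3.48}
\]

\[
g \ge g_2 := C(A_0/2, A - A_0)^{-1}\,e^{-\frac{2(1 - q^2)(\langle n_0\rangle - n_0)^2}{2q^2(A - A_0)}} - 1,
\tag{3.49}
\]

where

\[
C(A_0/2, A - A_0) = \frac{1 + \dfrac{A_0}{2\sigma^2(A - A_0)^{1/2}}}{1 - \dfrac{A_0}{2\sigma^2(A - A_0)^{1/2}}}.
\tag{3.50}
\]

Therefore, $|g| \le \max(|g_1|, |g_2|) \le |g_1| + |g_2|$. We now use the triangle inequality and the fact that the exponent is negative to obtain:

\[
|g_1| \le 1 - e^{-\frac{(1 - q^2)(\langle n_0\rangle - n_0)^2}{2(1 + q^2)(A - A_0)}} + |1 - C(A_0/2, A - A_0)|,
\tag{3.51}
\]

so that

\[
\langle|g_1|\rangle_{\Lambda_0,\mu} \le \Big\langle 1 - e^{-\frac{(1 - q^2)(\langle n_0\rangle - n_0)^2}{2(1 + q^2)(A - A_0)}}\Big\rangle^{GC}_{\Lambda_0,\mu} + C(A_0/2, A - A_0) - 1.
\tag{3.52}
\]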


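Similarly,

\[
\langle|g_2|\rangle_{\Lambda_0,\mu} \le \Big\langle 1 - e^{-\frac{2(1 - q^2)(\langle n_0\rangle - n_0)^2}{2q^2(A - A_0)}}\Big\rangle^{GC}_{\Lambda_0,\mu} + 1 - C(A_0/2, A - A_0)^{-1}.
\tag{3.53}
\]

We will use the Chebyshev inequality to control the expectation term in (3.52). Specifically, for any $B > 0$,

\[
\begin{aligned}
\Big\langle 1 - e^{-\frac{(1 - q^2)(\langle n_0\rangle - n_0)^2}{2(1 + q^2)(A - A_0)}}\Big\rangle^{GC}_{\Lambda_0,\mu}
&\le \mathrm{Prob}\big(2|n_0 - \langle n_0\rangle| \ge 2B\big) + 1 - e^{-\frac{(1 - q^2)B^2}{2(1 + q^2)(A - A_0)}} \\
&\le q^{2B}\,\big\langle q^{-2|n_0 - \langle n_0\rangle|}\big\rangle^{GC}_{\Lambda_0,\mu} + 1 - e^{-\frac{(1 - q^2)B^2}{2(1 + q^2)(A - A_0)}}.
\end{aligned}
\]

In Sect. 6.3 we show that $\langle q^{-2|n_0 - \langle n_0\rangle|}\rangle^{GC}_{\Lambda_0,\mu} \le 2(2q^{-2})^{A_0}$. One choice for $B$ is $a^{-1}[\ln(A - A_0) + A_0\ln(2q^{-2})]$. This gives the bound

\[
\Big\langle 1 - q^{\frac{(n_0 - \langle n_0\rangle)^2}{A - A_0}}\Big\rangle^{GC}_{\Lambda_0,\mu}
\le \frac{2 + \frac{1 - q^2}{a^2(1 + q^2)}\big(2(1 + a^2)A_0^2 + \ln^2(A - A_0)\big)}{A - A_0}
\le \frac{2 + (1 + a^2)A_0^2 + \frac12\ln^2(A - A_0)}{A - A_0} =: C_1(A, A_0, q).
\tag{3.54}
\]

The leading order term in the bound is $\frac{\ln^2(A - A_0)}{2(A - A_0)}$ for fixed $q$, strictly between 0 and 1. Also, let

\[
C_2(q, A, A_0) = \frac{4A_0}{q^2(A - A_0)^{1/2} - 2A_0},
\tag{3.55}
\]

which is greater than both $C(A_0/2, A - A_0) - 1$ and $1 - C(A_0/2, A - A_0)^{-1}$. Then $|\langle X\rangle_{\Lambda,n}\,\Xi - \langle X\rangle^{GC}_{\Lambda,\mu}| \le (C_1 + C_2)\|X\|_{gs}$. In particular, $|\langle 1\rangle_{\Lambda,n}\,\Xi - \langle 1\rangle^{GC}_{\Lambda,\mu}| \le (C_1 + C_2)\|1\|_{gs}$, which is to say that $|\Xi - 1| \le C_1 + C_2$. Then, using the triangle inequality, we have

\[
\big|\langle X\rangle_{\Lambda,n} - \langle X\rangle^{GC}_{\Lambda,\mu}\big| \le |1 - \Xi|\cdot|\langle X\rangle_{\Lambda,n}| + \big|\langle X\rangle_{\Lambda,n}\,\Xi - \langle X\rangle^{GC}_{\Lambda,\mu}\big| \le 2(C_1 + C_2)\|X\|_{gs}.
\]

So, defining $\varepsilon = 2C_1(q, \Lambda, \Lambda_0, n) + 2C_2(q, \Lambda, \Lambda_0)$, the theorem is proved. □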

Note that the restriction to observables X that commute with the third component of the total spin J (3) is necessary. E.g., the expectation of Sx+ obviously vanishes in any canonical state, while it is easy to see, by direct computation, that it does not vanish in the grand canonical states. This is entirely analogous to the restriction to gauge invariant observables in particle systems.
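To get a feeling for the size of the error term in Theorem 3.1, one can evaluate the expression (3.33) for a fixed small base $A_0$ and growing $A$. The sketch below is only an illustration with arbitrarily chosen sample values; the function name `epsilon` is an assumption, and the formula is meaningful only when the denominator $q^2(A - A_0)^{1/2} - 2A_0$ is positive.

```python
import math

def epsilon(q, A, A0):
    """Evaluate the error term (3.33) for 0 < q < 1 and base sizes A > A0."""
    a = 2.0 * abs(math.log(q))
    first = (math.log(A - A0) ** 2 + 2.0 * (1.0 + a * a) * A0 ** 2 + 4.0) \
        / (2.0 * (A - A0))
    denom = q * q * math.sqrt(A - A0) - 2.0 * A0
    if denom <= 0.0:
        # The bound is vacuous until A - A0 exceeds (2 A0 / q^2)^2.
        raise ValueError("A - A0 is too small for the bound (3.33) to apply")
    return first + 4.0 * A0 / denom

if __name__ == "__main__":
    for A in (10**4 + 1, 10**6 + 1, 10**8 + 1):
        print(A, epsilon(0.5, A, 1))
```

For these sample values the second term, of order $A_0/(q^2 A^{1/2})$, dominates, so the bound decays like the inverse linear size of the base of $\Lambda$.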


4. Bound on the Energy

In this section we will estimate the energy of a class of perturbations of the ground state $\psi_0$ given in (2.6). Let $\Lambda$ and $\Lambda_R$ be two cylindrical volumes as described in Sect. 2, $\Lambda_R \subset \Lambda$. E.g., $\Lambda_R$ and $\Lambda$ may have triangular cross-sections (see Fig. 1). We will generally assume that the radius $R$ of $\Lambda_R$ is much less than that of $\Lambda$. We consider $\psi$ of the form

\[
\psi(\Lambda, n, \phi) = \sum_{\alpha\in\mathcal{A}(\Lambda,n)}\ \bigotimes_{x\in\Lambda} e^{i\phi(x)\alpha(x)}\,q^{l(x)\alpha(x)}\,|\alpha(x)\rangle,
\tag{4.1}
\]

where $\mathrm{supp}(\phi) \subset \Lambda_R$. We will also suppose that

\[
\phi = \frac{S}{R}\,\tilde\phi(\tilde y_1, \tilde y_2),
\tag{4.2}
\]

where $\tilde\phi$ is a smooth function of its variables and $S$ is a parameter, which we will eventually take to zero independent of $R$. The coordinates $\tilde y_1$, $\tilde y_2$ are defined by

\[
\tilde y_1 = \frac{2x^1 - x^2 - x^3}{\sqrt{6}\,R}
\qquad\text{and}\qquad
\tilde y_2 = \frac{x^2 - x^3}{\sqrt{2}\,R},
\tag{4.3}
\]

and are to be viewed as rescaled coordinates for $x$ along the plane perpendicular to the 111 axis. There are two points to our assumptions on $\phi$: First, that $\phi$ is independent of the 111 component of $x$. Second, that $\phi$ is associated to a scale-invariant phase $\tilde\phi$ by $\phi(x) = R^{-1}\tilde\phi(x/R)$. Ultimately, the constant $S$ will vanish. The leading term in our estimate of the gap is independent of $S$ as long as $S \ll 1$. Let $\Gamma_R$ be the projection of $\Lambda_R$ onto the plane $l(x) = 0$, $A_R = |\Gamma_R|$, $\Omega_R$ be the convex hull of $\Gamma_R$, and $\tilde\Omega = \{x \in \mathbb{R}^2 : Rx \in \Omega_R\}$, the rescaled region, and let $m(\tilde\Omega)$ be the area of $\tilde\Omega$ (for the standard Lebesgue measure on $\mathbb{R}^2$). We will also use the following notation: $\partial_{\tilde y}\tilde\phi$ and $\partial^2_{\tilde y}\tilde\phi$ are the first- and second-derivative tensors of $\tilde\phi$, and by the $L^\infty$ norm of a tensor we mean the maximum of the $L^\infty$ norms of the components.

Theorem 4.1 (Bound on $\langle\psi|H_\Lambda^{(q)}|\psi\rangle/\|\psi\|^2$). Considering a perturbed state as in (4.1), the energy is bounded by

\[
\frac{\langle\psi|\,H_\Lambda^{(q)}\,|\psi\rangle}{\|\psi\|^2} \le 2\,\frac{1 + q^2}{1 - q^2}\left[\frac{A_R S^2}{R^4}\,\frac{\|\nabla_{\tilde y}\tilde\phi\|^2_{L^2(\tilde\Omega)}}{m(\tilde\Omega)} + E_{num}\right],
\tag{4.4}
\]

where

\[
E_{num} = \frac{6A_R S^2}{R^5}\,\|\partial^2_{\tilde y}\tilde\phi\|_{L^\infty}\,\|\partial_{\tilde y}\tilde\phi\|_{L^\infty}
\tag{4.5}
\]

is a correction to the main term which becomes negligible as $R \to \infty$.


Proof. We begin by calculating how a two-site Hamiltonian $h_b^q$ acts on the perturbed state. We consider the decomposition of our lattice into the relevant bond $b = (x_0, x_1)$ and everything else, $\Lambda \setminus b$. Thus

\[
h_b^q = 1_{\Lambda\setminus b}\otimes|\xi_b\rangle\langle\xi_b|,
\tag{4.6}
\]

where $\xi_b$ is the unit vector from (2.5) on the pair $b$, and

\[
\psi(\Lambda, n) = \sum_{n_b=0}^{2}\psi(\Lambda \setminus b, n - n_b)\otimes\psi(b, n_b).
\tag{4.7}
\]

Here $\psi(b, n_b)$ is as would be defined by (4.1), but with $\Lambda$ replaced by $b$ and $n$ replaced by $n_b$. For example $\psi(b, 1) = q^{l(x_0)}e^{i\phi(x_0)}\,|{\downarrow\uparrow}\rangle + q^{l(x_1)}e^{i\phi(x_1)}\,|{\uparrow\downarrow}\rangle$. But $\xi_b$ is orthogonal to $\psi(b, 0)$ and $\psi(b, 2)$, since $\xi_b$ lies in the sector of total spin 1. And

\[
\langle\xi_b|\psi(b, 1)\rangle = \frac{1}{\sqrt{1 + q^2}}\,q^{l(x_0)+1}\,e^{i\phi(x_0)}\big(1 - e^{i[\phi(x_1) - \phi(x_0)]}\big).
\tag{4.8}
\]

Now it is straightforward to see

\[
\langle\psi(\Lambda, n)|\,h_b^q\,|\psi(\Lambda, n)\rangle = \|\psi(\Lambda \setminus b, n - 1)\|^2\,|\langle\xi_b|\psi(b, 1)\rangle|^2
\tag{4.9}
\]

\[
= \frac{2}{(q + q^{-1})^2}\,Z(\Lambda, n)\,P^q(b)\,\big(1 - \cos[\phi(x_1) - \phi(x_0)]\big),
\tag{4.10}
\]

where we have defined

\[
P^q(b) = \frac{Z(\Lambda \setminus b, n - 1)\,Z(b, 1)}{Z(\Lambda, n)}.
\tag{4.11}
\]
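The step from (4.8) to (4.10) rests on the single-bond identity $|\langle\xi_b|\psi(b,1)\rangle|^2 = \frac{2}{(q+q^{-1})^2}\,Z(b,1)\,(1 - \cos[\phi(x_1) - \phi(x_0)])$ with $Z(b,1) = q^{2l(x_0)} + q^{2l(x_1)}$. A minimal numerical check, with hypothetical function names and arbitrary sample values, might look as follows.

```python
import cmath
import math

def bond_overlap_sq(q, l0, phi0, phi1):
    """|<xi_b|psi(b,1)>|^2 for a bond (x0, x1) with l(x0) = l0, l(x1) = l0 + 1.

    psi(b,1) = q^l0 e^{i phi0} |dn up> + q^(l0+1) e^{i phi1} |up dn>,
    xi_b     = (q |dn up> - |up dn>) / sqrt(1 + q^2)   (real coefficients).
    """
    amp_dn_up = q ** l0 * cmath.exp(1j * phi0)
    amp_up_dn = q ** (l0 + 1) * cmath.exp(1j * phi1)
    overlap = (q * amp_dn_up - amp_up_dn) / math.sqrt(1.0 + q * q)
    return abs(overlap) ** 2

def bond_closed_form(q, l0, phi0, phi1):
    """2 / (q + 1/q)^2 * Z(b,1) * (1 - cos(phi1 - phi0))."""
    z_b1 = q ** (2 * l0) + q ** (2 * (l0 + 1))
    return 2.0 / (q + 1.0 / q) ** 2 * z_b1 * (1.0 - math.cos(phi1 - phi0))

if __name__ == "__main__":
    print(bond_overlap_sq(0.7, -2, 0.3, 1.1),
          bond_closed_form(0.7, -2, 0.3, 1.1))
```

In particular the overlap vanishes when $\phi(x_1) = \phi(x_0)$, which is the statement that the unperturbed state has zero energy on every bond.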

Then we may write

\[
\frac{\langle\psi|\,H_\Lambda^{(q)}\,|\psi\rangle}{Z(\Lambda, n)} = \frac{2}{(q + q^{-1})^2}\sum_{b\in B(\Lambda)} P^q(b)\,\big(1 - \cos[\phi(x_1) - \phi(x_0)]\big).
\tag{4.12}
\]

Actually, $P^q(b)$ depends on $b$ only through $l(x_0)$. So from here on, we'll write it as $P^q(l(x_0))$, and observe the following:

\[
\frac{\langle\psi|\,H_\Lambda^{(q)}\,|\psi\rangle}{Z(\Lambda, n)} = \frac{2}{(q + q^{-1})^2}\sum_{l=-L/2}^{L/2-1} P^q(l)\sum_{x\in\Gamma_R^l}\sum_{j=1}^{3}\big(1 - \cos[\phi(x + e_j) - \phi(x)]\big),
\tag{4.13}
\]

where $\Gamma_R^l = \{x \in \Lambda_R : l(x) = l\}$. Let us estimate the term $\sum_{x\in\Gamma_R^l}\sum_{j=1}^{3}(1 - \cos[\phi(x + e_j) - \phi(x)])$. We have an inequality

\[
1 - \cos[\phi(x + e_j) - \phi(x)] \le \frac12[\phi(x + e_j) - \phi(x)]^2
\tag{4.14}
\]


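(which is actually an equality in the limit $R \to \infty$ for our ansatz). Also,

\[
\sum_{j=1}^{3}[\phi(x + e_j) - \phi(x)]^2 \approx |\nabla_x\phi(x)|^2 = \frac{S^2}{R^4}\,|\nabla_{\tilde y}\tilde\phi|^2.
\tag{4.15}
\]

In fact, using the inequality

\[
\big|[\tilde\phi(\tilde y + v) - \tilde\phi(\tilde y)]^2 - [v\cdot\nabla_{\tilde y}\tilde\phi(\tilde y)]^2\big| \le \|\partial^2_{\tilde y}\tilde\phi\|_{L^\infty}\,\|\partial_{\tilde y}\tilde\phi\|_{L^\infty}\,\|v\|^3_{l^1}
\tag{4.16}
\]

one may conclude that the error in (4.15) is bounded by $\frac{3S^2}{R^5}\|\partial^2_{\tilde y}\tilde\phi\|_{L^\infty}\|\partial_{\tilde y}\tilde\phi\|_{L^\infty}$. Incorporating this estimate into the inequality of (4.14), we have

\[
\sum_{x\in\Gamma_R^l}\sum_{j=1}^{3}\big(1 - \cos[\phi(x + e_j) - \phi(x)]\big) \le \frac{1}{2R^2}\sum_{x\in\Gamma_R^l}|\nabla_{\tilde y}\phi(x)|^2 + \frac{3S^2|\Gamma_R^l|}{2R^5}\,\|\partial^2_{\tilde y}\tilde\phi\|_{L^\infty}\,\|\partial_{\tilde y}\tilde\phi\|_{L^\infty}.
\tag{4.17}
\]

Finally, as $R \to \infty$, the sum over each $\Gamma_R^l$ becomes increasingly well-approximated by the integral over $\Omega_R$, which is proved in Lemma 4.1 immediately following this proof. The lemma gives us a bound

\[
\sum_{x\in\Gamma_R^l}|\nabla_{\tilde y}\phi(x)|^2 \le \frac{S^2|\Gamma_R^l|}{R^2}\left[\frac{1}{m(\tilde\Omega)}\int_{\tilde\Omega}|\nabla_{\tilde y}\tilde\phi|^2\,d^2\tilde y + \frac{\rho}{R}\,\|\nabla^2_{\tilde y}\tilde\phi\,\nabla_{\tilde y}\tilde\phi\|_{L^\infty(\tilde\Omega)}\right],
\tag{4.18}
\]

where $\nabla^2$ is the Laplacian and $\rho = \sqrt{2/3}$ is the maximum radius for the Voronoi domain. (Note that by its definition, as the trace of the second-derivative tensor, the Laplacian enjoys the bounds

\[
\|\nabla^2_{\tilde y}\tilde\phi\,\nabla_{\tilde y}\tilde\phi\|_{L^\infty(\tilde\Omega)} \le 2\,\|\partial^2_{\tilde y}\tilde\phi\|_{L^\infty}\,\|\partial_{\tilde y}\tilde\phi\|_{L^\infty},
\tag{4.19}
\]

which may be combined with the error term in (4.17).) Combining (4.18) and (4.19) gives us the theorem, modulo the term $\sum_{l=-L/2}^{L/2-1}P^q(l)$, for which we derive the necessary result in Lemma 4.2. □

Lemma 4.1. Suppose $\Gamma$ is a region in a regular lattice. For each $x \in \Gamma$, let $\Omega_x$ be the Voronoi domain of $x$ with respect to the whole lattice, and let $\Omega_\Gamma$ be the union of all the individual domains $\Omega_x$. If $f$ is a smooth function on $\Omega_\Gamma$, then

\[
\left|\frac{1}{|\Gamma|}\sum_{x\in\Gamma}f(x) - \frac{1}{m(\Omega_\Gamma)}\int_{\Omega_\Gamma}f(y)\,dy\right| \le \rho\,\|\nabla_y f\|_{L^\infty(\Omega_\Gamma)},
\tag{4.20}
\]

where $\rho$ is the maximum radius of a Voronoi domain.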


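Proof. For each $x \in \Gamma$,

\[
\begin{aligned}
f(x) - \frac{1}{m(\Omega_x)}\int_{\Omega_x}f(y)\,dy
&\le -\frac{1}{m(\Omega_x)}\int_{\Omega_x}[f(y) - f(x)]\,dy \\
&= -\frac{1}{m(\Omega_x)}\int_{\Omega_x}\int_0^1\frac{d}{dt}f(x + t(y - x))\,dt\,dy \\
&= -\frac{1}{m(\Omega_x)}\int_{\Omega_x}\int_0^1\nabla_y f(x + t(y - x))\cdot(y - x)\,dt\,dy.
\end{aligned}
\]

This clearly leads to the bound

\[
\left|f(x) - \frac{1}{m(\Omega_x)}\int_{\Omega_x}f(y)\,dy\right| \le \rho(\Omega_x)\,\|\nabla_y f\|_{L^\infty(\Omega_x)}.
\tag{4.21}
\]

From this, the lemma follows easily. □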
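The content of Lemma 4.1 can be illustrated in the simplest possible setting. The sketch below uses a one-dimensional lattice with spacing $h$ (so that the Voronoi domains are intervals of half-width $\rho = h/2$) and $f = \sin$, for which $\|f'\|_{L^\infty} \le 1$ and the integral is known in closed form. This one-dimensional setting and the function names are assumptions made for the example; the paper applies the lemma to the planar lattices $\Gamma_R^l$.

```python
import math

def lattice_average(f, points):
    """Average of f over the lattice points."""
    return sum(f(x) for x in points) / len(points)

def lemma_gap_1d(M, h):
    """|lattice average - integral average| of sin on {0, h, ..., (M-1) h}.

    The union of the Voronoi intervals is [-h/2, (M - 1/2) h].
    """
    points = [i * h for i in range(M)]
    lo, hi = -0.5 * h, (M - 0.5) * h
    integral_average = (math.cos(lo) - math.cos(hi)) / (hi - lo)
    return abs(lattice_average(math.sin, points) - integral_average)

if __name__ == "__main__":
    for h in (0.5, 0.1, 0.02):
        # The lemma predicts a gap of at most rho * sup|f'| = h / 2.
        print(h, lemma_gap_1d(40, h), 0.5 * h)
```

For a smooth function the actual discrepancy is typically much smaller than $\rho\|\nabla f\|_{L^\infty}$, but the linear bound is all that is needed in (4.18).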

Now, we will derive the necessary bound on $\sum_{l=-L/2}^{L/2-1}P^q(l)$. We will rely on bounds for similar quantities in the one-dimensional model proved in [2].

Lemma 4.2 (Bound on $\sum_{l=-L/2}^{L/2-1}P^q(l)$).

\[
\sum_{l=-L/2}^{L/2-1}P^q(l) \le 2\,\frac{1 + q^2}{1 - q^2}.
\tag{4.22}
\]

Proof. Recall

\[
P^q(l) = \frac{Z(\Lambda \setminus b, n - 1)\,Z(b, 1)}{Z(\Lambda, n)}.
\tag{4.23}
\]

The meaning of the ratio of partition functions in the equation above is clear: It is the probability of finding one particle shared by the sites of $b$, and $n - 1$ particles shared by the sites of $\Lambda \setminus b$, conditioned on finding $n$ total particles on $\Lambda$. We consider the operator $Y_b = 1_{\Lambda\setminus b}\otimes\big(|{\uparrow\downarrow}\rangle_b\langle{\uparrow\downarrow}|_b + |{\downarrow\uparrow}\rangle_b\langle{\downarrow\uparrow}|_b\big)$. Then

\[
\frac{Z(\Lambda \setminus b, n - 1)\,Z(b, 1)}{Z(\Lambda, n)} = \langle Y_b\rangle_{\Lambda,n},
\tag{4.24}
\]

and

\[
\sum_{l=-L/2}^{L/2-1}P^q(l) = \Big\langle\sum_{l=-L/2}^{L/2-1}Y_{b(l)}\Big\rangle_{\Lambda,n},
\tag{4.25}
\]


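where $b(l) = (x_0, x_1)$, where $l(x_0) = l$, and $(x_0, x_1)$ is a bond in the stick containing the origin, which we denote by $\Sigma_0$. The restriction of the state in $\Lambda$ with $n$ spins down is of the form

\[
\langle X\rangle_{\Sigma_0} = \sum_{k=0}^{L+1}c_k\,\langle X\rangle_{\Sigma_0,k},
\]

where $X$ is any observable commuting with $J^{(3)} = \sum_{x\in\Sigma_0}S_x^{(3)}$, as is, e.g., $Y_{b(l)}$, and the $c_k$ are non-negative numbers summing up to one. We will now derive an upper bound for $\langle\sum_{l=-L/2}^{L/2-1}Y_l\rangle_{\Sigma_0}$ that is independent of the coefficients $c_k$. We start from

\[
\langle Y_l\rangle_{\Sigma_0,k} \le \mathrm{Prob}_k\big(S_l^{(3)} = {\uparrow},\ S_{l+1}^{(3)} = {\downarrow}\big) + \mathrm{Prob}_k\big(S_l^{(3)} = {\downarrow},\ S_{l+1}^{(3)} = {\uparrow}\big),
\tag{4.26}
\]

where $\mathrm{Prob}_k$ denotes the probability in the ground state with $k$ spins down for a one-dimensional system on $[-L/2, L/2]$, the sites of which we label by $l$. Each term in the RHS of (4.26) can be estimated as follows:

\[
\mathrm{Prob}_k\big(S_l^{(3)} = {\uparrow},\ S_{l+1}^{(3)} = {\downarrow}\big) \le \min\Big\{\mathrm{Prob}_k\big(S_l^{(3)} = {\uparrow}\big),\ \mathrm{Prob}_k\big(S_{l+1}^{(3)} = {\downarrow}\big)\Big\}.
\tag{4.27}
\]

Theorem 7.1 of [2] gives the following bounds:

\[
\begin{aligned}
\mathrm{Prob}_k\big(S_{l+1}^{(3)} = {\downarrow}\big) &\le q^{2(l - (k + 1 - L/2))} && \text{if } l \ge k + 1 - L/2, \\
\mathrm{Prob}_k\big(S_l^{(3)} = {\uparrow}\big) &\le q^{2(k + 1 - L/2 - l)} && \text{if } l < k + 1 - L/2.
\end{aligned}
\]

Combining these inequalities and summing over $l$ yields

\[
\sum_{l=-L/2}^{L/2-1}\langle Y_l\rangle_{\Sigma_0,k} \le 2\,\frac{1 + q^2}{1 - q^2}
\tag{4.28}
\]

for all $k = 0, \ldots, L + 1$. Together with (4.25) this concludes the proof. □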
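The summation leading to (4.28) is a two-sided geometric series. The sketch below assumes, purely for illustration, that each of the two probabilities in (4.26) is controlled by the same envelope $q^{2|l - (k + 1 - L/2)|}$ obtained from the two displayed bounds, and checks that the resulting sum stays below $2(1 + q^2)/(1 - q^2)$ for every $k$. The function names are hypothetical.

```python
def envelope(q, L, k, l):
    """q^(2 |l - (k + 1 - L/2)|): the bound from the two cases above (L even)."""
    center = k + 1 - L // 2
    return q ** (2 * abs(l - center))

def summed_bound(q, L, k):
    """Twice the sum of the envelope over l = -L/2, ..., L/2 - 1."""
    return 2.0 * sum(envelope(q, L, k, l) for l in range(-(L // 2), L // 2))

def lemma_4_2_constant(q):
    """The right-hand side 2 (1 + q^2) / (1 - q^2) of (4.22) and (4.28)."""
    return 2.0 * (1.0 + q * q) / (1.0 - q * q)

if __name__ == "__main__":
    q, L = 0.6, 10
    print(max(summed_bound(q, L, k) for k in range(L + 2)),
          lemma_4_2_constant(q))
```

The bound is uniform in $k$ and in $L$ because $\sum_{j\in\mathbb{Z}} q^{2|j|} = (1 + q^2)/(1 - q^2)$, so the finite sums can only be smaller.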

5. Bound for the Denominator

Note that $\psi(\Lambda, n) = T(\phi)\psi_0(\Lambda, n)$, where $T(\phi)$ is the unitary operator defined by

\[
T(\phi) = \bigotimes_{x\in\Lambda}\big(|{\uparrow}\rangle\langle{\uparrow}| + e^{i\phi(x)}\,|{\downarrow}\rangle\langle{\downarrow}|\big).
\tag{5.1}
\]

In particular, $\|T(\phi)\psi_0(\Lambda, n)\|^2 = \|\psi(\Lambda, n)\|^2 = Z(\Lambda, n)$. For convenience, we will sometimes omit the arguments $\Lambda$ and $n$ from the notation. In this section we will consider the half-filled system, i.e., $\rho = n/|\Lambda| = 1/2$. This corresponds to $\mu = 0$.

Theorem 5.1 (Bound on $\left|\frac{\langle\psi_0|\psi\rangle}{\langle\psi_0|\psi_0\rangle}\right|$). Considering a perturbed state in the volume $\Lambda_0$ defined by (4.1) we have that canonical and grand-canonical expectations of the perturbed state are arbitrarily close for large volumes $\Lambda$ in the sense:

\[
\left|\frac{\langle\psi|\psi_0\rangle}{\langle\psi_0|\psi_0\rangle} - \langle T(\phi)\rangle^{GC}_{\Lambda,\mu}\right| \le \frac{\ln^2(A - A_0) + 2(1 + a^2)A_0^2 + 4}{2(A - A_0)} + \frac{4A_0}{q^2A^{1/2} - 2A_0}.
\tag{5.2}
\]


Moreover, with the ansatz defined by (4.1), the grand canonical expectation is bounded as

\[
\ln\big|\langle T(\phi)\rangle^{GC}_{\Lambda,\mu}\big|^2 \le -q^{2\delta(\mu)}\,\frac{A_R S^2}{4R^2}\left[\frac{\|\tilde\phi\|^2_{L^2(\tilde\Omega)}}{m(\tilde\Omega)} - \frac{\sqrt{6}}{R}\,\|\partial_{\tilde y}\tilde\phi\|_{L^\infty}\,\|\tilde\phi\|_{L^\infty} - \frac{S^2}{12R^2}\,\|\tilde\phi\|^4_{L^\infty}\right],
\tag{5.3}
\]

where $\delta(\mu)$ is the distance of $\mu$ from its closest integer neighbor. (Recall that we have defined the $L^\infty$-norm of a tensor to be the $L^\infty$-norm of its maximum component.)

Proof. The proof of Eq. (5.2) is a direct consequence of the equivalence of ensembles because, since $T(\phi)$ is a unitary operator, $\|T(\phi)\| = 1$. Let us now consider the proof of Eq. (5.3). We wish to bound the denominator from below; i.e., to demonstrate that $1 - |\langle T(\phi)\rangle_{\Lambda,n}|^2$ is not too small. This is tantamount to showing that $|\langle T(\phi)\rangle_{\Lambda,n}|^2$ is not too close to 1. Furthermore, we know this quantity lies between 0 and 1. We estimate the actual canonical average with the grand canonical average, and take the logarithm in order to exploit the factorization properties of the grand canonical ensemble. First, we note

\[
\langle T(\phi)\rangle^{GC}_{\Lambda,\mu} = \prod_{x\in\Lambda_0}\frac{1 + e^{i\phi(x)}\,q^{2(l(x)-\mu)}}{1 + q^{2(l(x)-\mu)}}.
\tag{5.4}
\]

Recall the definition $a = -2\ln q$. This allows us a more convenient form in place of (5.4),

\[
\begin{aligned}
\prod_{x\in\Lambda_0}\left|\frac{1 + e^{i\phi(x)}\,q^{2(l(x)-\mu)}}{1 + q^{2(l(x)-\mu)}}\right|^2
&= \prod_{x\in\Lambda_0}\frac{e^{2a(l(x)-\mu)} + 2\cos\phi(x)\,e^{a(l(x)-\mu)} + 1}{e^{2a(l(x)-\mu)} + 2e^{a(l(x)-\mu)} + 1} \\
&= \prod_{x\in\Lambda_0}\left[1 - \frac12\big(1 - \tanh^2[a(l(x) - \mu)/2]\big)\big(1 - \cos\phi(x)\big)\right].
\end{aligned}
\tag{5.5}
\]
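The single-site identity behind (5.5) is elementary but easy to get wrong by a factor of two, so a numerical check may be useful. The sketch below compares both sides for one site at a few sample values; the function names and parameter values are illustrative.

```python
import cmath
import math

def site_factor_modulus_sq(q, l, mu, phi):
    """|1 + e^{i phi} q^(2(l - mu))|^2 / (1 + q^(2(l - mu)))^2, from (5.4)."""
    w = q ** (2.0 * (l - mu))
    return abs((1.0 + cmath.exp(1j * phi) * w) / (1.0 + w)) ** 2

def site_factor_tanh_form(q, l, mu, phi):
    """1 - (1/2)(1 - tanh^2[a (l - mu)/2])(1 - cos phi), with a = -2 ln q."""
    a = -2.0 * math.log(q)
    t = math.tanh(0.5 * a * (l - mu))
    return 1.0 - 0.5 * (1.0 - t * t) * (1.0 - math.cos(phi))

if __name__ == "__main__":
    for l in (-2, 0, 3):
        print(site_factor_modulus_sq(0.5, l, 0.3, 0.8),
              site_factor_tanh_form(0.5, l, 0.3, 0.8))
```

Each factor lies in $[0, 1]$ and equals 1 when $\phi(x) = 0$, which is why the logarithm of the product can be bounded above by a sum of negative terms.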

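We partition the product over planes and estimate the logarithm, thus:

\[
\begin{aligned}
\ln\big|\langle T(\phi)\rangle^{GC}_{\Lambda,\mu}\big|^2
&= \ln\prod_{x\in\Lambda_0}\left[1 - \frac12\big(1 - \tanh^2[a(l(x) - \mu)/2]\big)\big(1 - \cos\phi(x)\big)\right] \\
&\le -\frac12\sum_{x\in\Lambda_0}\big(1 - \tanh^2[a(l(x) - \mu)/2]\big)\big(1 - \cos\phi(x)\big) \\
&= -\frac12\sum_{l=-L/2}^{L/2}\big(1 - \tanh^2[a(l - \mu)/2]\big)\sum_{x\in\Gamma_R^l}\big(1 - \cos\phi(x)\big).
\end{aligned}
\]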


We may approximate 1 − cos(φ(x)) by 21 φ(x)2 , with an error no larger than 4 ˜ 4 ∞ . In this case which is the same as S 4 kφk L

24R

ZGC (30 , µ, φ) 2 ln Z (3 , µ, 0) GC

1 ≤− 2

0

L/2 X





X 1 2 φ − (1 − tanh [a(l − µ)/2])  2 x l 2

l=−L/2

1 4 24 kφkL∞

x∈0R

S 4 |0Rl | ˜ 4∞  kφk . 24R 4

(5.6)

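We may approximate the sum over $\Gamma_R^l$ with an integral such that the error is bounded by $\frac{\rho S^2|\Gamma_R^l|}{R^3}\|\nabla_{\tilde y}\tilde\phi\|_{L^\infty}\|\tilde\phi\|_{L^\infty}$. We may bound the sum $\sum_{l=-L/2}^{L/2}(1 - \tanh^2[a(l-\mu)/2])$ from below by its largest term (since all the terms are positive). The largest term occurs for that integer $l$ which is closest to $\mu$. Thus, defining $\delta(\mu) = \min(\mu - \lfloor\mu\rfloor, \lceil\mu\rceil - \mu)$, we see
\[
\sum_{l=-L/2}^{L/2} \left(1 - \tanh^2[a(l-\mu)/2]\right) \ge 1 - \tanh^2[a\delta(\mu)/2] = \frac{4}{(q^{\delta(\mu)} + q^{-\delta(\mu)})^2} \ge q^{2\delta(\mu)}. \tag{5.7}
\]
Using these bounds, we may continue the estimate of (5.6). We arrive at
\[
\ln\left|\langle T(\phi)\rangle^{GC}_{\Lambda,\mu}\right|^2 \le -q^{2\delta(\mu)}\,\frac{S^2|\Gamma_R^l|}{4R^2} \left[ \frac{\|\tilde\phi\|^2_{L^2(\tilde\Omega)}}{m(\tilde\Omega)} - \frac{\rho}{R}\,\|\nabla_{\tilde y}\tilde\phi\|_{L^\infty}\|\tilde\phi\|_{L^\infty} - \frac{S^2}{12R^2}\,\|\tilde\phi\|^4_{L^\infty} \right]. \tag{5.8}
\]
Since $|\nabla_{\tilde y}\tilde\phi| \le 2\|\partial_{\tilde y}\tilde\phi\|_{l^\infty}$ and since $\rho = \sqrt{3/2}$, we have Eq. (5.3). □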

5.1. Bound on the ratio. We will now combine the results of the bound on the numerator and the bound on the denominator to get a true bound on the spectral gap. We first allow $\Lambda \nearrow \mathbb{Z}^3$ in the appropriate fashion so that $\varepsilon \searrow 0$. Then we consider the case that $S \to 0$, holding $R$ fixed. This means that we consider a perturbation to the ground state which is very small. But since the ground state has energy zero, the energy of the perturbed state is entirely due to the small perturbation. In fact it is proportional to the size of the perturbation, and from this we obtain a linearized (with respect to the amplitude of $\phi$) bound. Combining (1.3), (4.4), and (5.2), we have
\[
\gamma_1 \le \frac{16\,q^{2(1-\delta(\mu))}}{(1-q^2)R^2} \cdot \frac{\|\nabla_{\tilde y}\tilde\phi\|^2_{L^2(\tilde\Omega)}/m(\tilde\Omega) + \frac{6}{R}\,\|\partial^2_{\tilde y}\tilde\phi\|_\infty \|\partial_{\tilde y}\tilde\phi\|_\infty}{\|\tilde\phi\|^2_{L^2(\tilde\Omega)}/m(\tilde\Omega) - \frac{\sqrt{6}}{R}\,\|\partial_{\tilde y}\tilde\phi\|_\infty \|\tilde\phi\|_\infty}. \tag{5.9}
\]
Note that this bound is homogeneous with respect to the amplitude of $\phi$, which is the result of our linearization. We observe that, whatever the form for $\tilde\phi$, as long as it is smooth we have the same asymptotic behavior for the bound on the spectral gap, namely $\gamma_1 = O(1/R^2)$. This said, it is certainly worthwhile to find a best bound, which we take up presently.


5.2. The Bessel function ansatz. Let us write the leading-order term in the bound for the spectral gap:
\[
E(\tilde\phi) = \frac{\|\nabla_{\tilde y}\tilde\phi\|_2^2}{\|\tilde\phi\|_2^2}. \tag{5.10}
\]
In order to minimize the bound on the spectral gap, we will minimize the functional $E(\phi)$ amongst all functions $\phi$ which possess two continuous derivatives and which vanish on the boundary of the rescaled perturbed region $\tilde\Omega$. (In order that the “small” phase $\phi$ match the external phase of $0, \pm 2\pi, \ldots$ on $\partial\Omega$, it must be zero there. Thus $\tilde\phi \equiv 0$ on $\partial\tilde\Omega$.) Therefore, we consider the first variation
\[
\lim_{\tau\to 0} \frac{1}{\tau}\left[E(\phi + \tau\phi') - E(\phi)\right] = \frac{2\int \nabla\phi\cdot\nabla\phi'}{\int\phi^2} - \frac{2\int\phi\phi'}{\int\phi^2}\,\frac{\int|\nabla\phi|^2}{\int\phi^2}. \tag{5.11}
\]
Setting the first variation to zero for all test functions $\phi'$ leads to the eigenvalue problem for Laplace's equation
\[
-\nabla^2\tilde\phi = \lambda\tilde\phi \quad \text{in } \tilde\Omega, \qquad \tilde\phi = 0 \quad \text{on } \partial\tilde\Omega, \tag{5.12}
\]
where $\lambda = E(\phi)$. We choose, for our domain, the unit disk. We seek the solution to Eq. (5.12) which minimizes $\lambda$, but with the restriction that $\phi$ must possess two continuous derivatives. So the fundamental solution, which is the logarithm, is disallowed (and, in fact, has higher energy). We seek the first eigenstate of the Laplacian above the ground state. This is a classic problem, found in any elementary PDE text, with the Bessel function for the solution: $\tilde\phi(\tilde y) = J_0(z_0 r)$, where $r = |\tilde y|$, $J_0$ is the zeroth Bessel function, and $z_0 \approx 2.405$ is its first zero. Now, using this choice for $\phi$ and the bound (5.9), we obtain
\[
\gamma_1 \le \frac{16\,q^{2(1-\delta(\mu))}}{(1-q^2)R^2} \cdot \frac{1.56 + \frac{6}{R}(2.90)(1.40)}{0.27 - \frac{\sqrt{6}}{R}(1.40)(1)}. \tag{5.13}
\]
Thus,
\[
\gamma_1 \le \frac{100\,q^{2(1-\delta(\mu))}}{(1-q^2)R^2} \quad \text{for } R > 70. \tag{5.14}
\]
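The numerical constants entering (5.13) can be reproduced, approximately, from the ansatz $\tilde\phi = J_0(z_0 r)$ on the unit disk. The sketch below is an illustration only: it evaluates $J_0$ and $J_1$ by their power series, finds $z_0$ by bisection, and uses Simpson's rule for the disk averages. All function names are hypothetical, and the second-derivative constant is estimated from the radial second derivative alone, which is an assumption about how the tensor norm was evaluated.

```python
import math


def bessel_j(n, x, terms=40):
    """J_n(x) from its power series; adequate for the small arguments used here."""
    return sum(
        (-1) ** k * (x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
        for k in range(terms)
    )


def first_zero_j0(lo=2.0, hi=3.0, iterations=60):
    """Bisection on [2, 3], where J_0 changes sign exactly once."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if bessel_j(0, lo) * bessel_j(0, mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def simpson(values, h):
    """Composite Simpson rule on an odd number of equally spaced samples."""
    inner = 4.0 * sum(values[1:-1:2]) + 2.0 * sum(values[2:-1:2])
    return (values[0] + inner + values[-1]) * h / 3.0


def ansatz_constants(samples=2000):
    z0 = first_zero_j0()
    h = 1.0 / samples
    rs = [i * h for i in range(samples + 1)]
    xs = [z0 * r for r in rs]
    j0 = [bessel_j(0, x) for x in xs]
    j1 = [bessel_j(1, x) for x in xs]
    # Disk averages: (1/pi) * (integral over the unit disk) = 2 * integral_0^1 (...) r dr.
    l2_mean = 2.0 * simpson([f * f * r for f, r in zip(j0, rs)], h)
    grad_mean = 2.0 * simpson([(z0 * g) ** 2 * r for g, r in zip(j1, rs)], h)
    # |d phi / dr| = z0 |J_1(z0 r)|.
    sup_first = max(abs(z0 * g) for g in j1)
    # d^2 phi / dr^2 = -z0^2 (J_0(x) - J_1(x)/x), with J_1(x)/x -> 1/2 as x -> 0.
    sup_second = max(
        abs(z0 * z0 * (f - (g / x if x > 0.0 else 0.5)))
        for f, g, x in zip(j0, j1, xs)
    )
    return {
        "z0": z0,
        "l2_mean": l2_mean,
        "grad_mean": grad_mean,
        "sup_first": sup_first,
        "sup_second": sup_second,
    }
```

Under these assumptions the returned values should be close to the constants 0.27, 1.56, 1.40 and 2.90 that appear in (5.13).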

6. Results from the 1D Grand Canonical Ensemble

6.1. The mean number of particles in a stick. Recall that $\Sigma$ is a 1D stick running parallel to the 111 axis. So, it is actually a 1D spin chain. We wish to estimate the mean number of particles in $\Sigma$, for the grand canonical ensemble. This is
\[
\langle N\rangle^{GC}_{\Sigma,\mu} := Z^{GC}(\Sigma,\mu)^{-1} \sum_{n=1}^{L+1} n\,q^{-2\mu n} Z(\Sigma,n) = Z^{GC}(\Sigma,\mu)^{-1} \sum_{n=1}^{L+1} n\,e^{a\mu n} Z(\Sigma,n), \tag{6.1}
\]
where $\Sigma$ is the interval $\{-\frac{L}{2}, -\frac{L}{2}+1, \ldots, \frac{L}{2}\}$. (Recall $a = -2\log q$.) By a standard calculation, we have
\[
\langle N\rangle^{GC}_{\Sigma,\mu} = \frac{1}{a}\frac{\partial}{\partial\mu} \log Z^{GC}(\Sigma,\mu). \tag{6.2}
\]
On the other hand, the grand canonical partition function factorizes, as we have seen, so that
\[
\langle N\rangle^{GC}_{\Sigma,\mu} = \sum_{l=-L/2}^{L/2} \frac{e^{a(\mu-l)}}{1 + e^{a(\mu-l)}} = \sum_{l=-L/2}^{L/2} \frac12\left[ 1 - \tanh\frac{a}{2}(l-\mu) \right]. \tag{6.3}
\]
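The agreement between (6.2) and (6.3) can be illustrated numerically. The sketch below assumes the factorized form $Z^{GC}(\Sigma,\mu) = \prod_l (1 + q^{2(l-\mu)})$ and an even $L$; it approximates the derivative in (6.2) by a central difference. The function names and the step size are choices made for this example only.

```python
import math


def log_zgc(mu, q, L):
    """log of the factorized grand canonical partition function (L assumed even)."""
    return sum(math.log1p(q ** (2.0 * (l - mu))) for l in range(-L // 2, L // 2 + 1))


def mean_n_derivative(mu, q, L, step=1e-5):
    """Right-hand side of (6.2), with the derivative replaced by a central difference."""
    a = -2.0 * math.log(q)
    return (log_zgc(mu + step, q, L) - log_zgc(mu - step, q, L)) / (2.0 * step * a)


def mean_n_sum(mu, q, L):
    """Right-hand side of (6.3)."""
    a = -2.0 * math.log(q)
    return sum(
        0.5 * (1.0 - math.tanh(0.5 * a * (l - mu)))
        for l in range(-L // 2, L // 2 + 1)
    )
```

For instance, `mean_n_derivative(0.3, 0.5, 10)` and `mean_n_sum(0.3, 0.5, 10)` should agree up to the finite-difference error.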

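An examination of the graph of the function $x \mapsto 1 - \tanh(x)$ reveals an approximate Heaviside function, with support on the negative axis. We define the function
\[
\eta(x) = \begin{cases} 1, & x < 0, \\ 1/2, & x = 0, \\ 0, & x > 0. \end{cases} \tag{6.4}
\]
Then, as long as $-L/2 \le \mu \le L/2$, we remark
\[
\langle N\rangle^{GC}_{\Sigma,\mu} = \begin{cases} \lfloor\mu\rfloor + \frac{L}{2} + 1, & \mu \notin \mathbb{Z}, \\ \mu + \frac{L+1}{2}, & \mu \in \mathbb{Z} \end{cases} \; + \sum_{l=-L/2}^{L/2} \left[ \frac12\left(1 - \tanh\frac{a}{2}(l-\mu)\right) - \eta(l-\mu) \right]. \tag{6.5}
\]
We make the definition
\[
F_L(\mu) = \langle N\rangle^{GC}_{\Sigma,\mu} - \left(\mu + \frac{L+1}{2}\right). \tag{6.6}
\]
For $\mu$ in the range above one may determine (by combining the two tails in the series and estimating upwards by an integral) that
\[
|F_\infty(\mu) - F_L(\mu)| \le \frac{1}{a} \ln\left( \frac{1 + \exp(-\frac{a}{2}(\frac{L}{2} - \mu))}{1 + \exp(-\frac{a}{2}(\frac{L}{2} + \mu))} \right). \tag{6.7}
\]
Notice that in case $\mu = 0$, there is no error at all in estimating $F_L$ by $F_\infty$, and, furthermore, $F_\infty(0) = 0$. It is clear that $F_\infty(\mu)$ is periodic in $\mu$ with period 1, because it is a sum over the entire integer lattice, so it will suffice for us to consider $\mu$ in the range $]0,1[$. A straightforward calculation then yields
\[
\begin{aligned}
F_\infty(\mu) &= -\mu + \frac12 - \frac{1}{1 + e^{a\mu}} + \sum_{l=1}^{\infty} \left[ \frac{1}{1 + e^{a(l-\mu)}} - \frac{1}{1 + e^{a(l+\mu)}} \right] \\
&= -\mu + \frac12\tanh\left(\frac{a\mu}{2}\right) + \sum_{l=1}^{\infty} \frac{\sinh(a\mu)}{\cosh(a\mu) + \cosh(al)}.
\end{aligned}
\]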

Fig. 4. A plot of the functions $F_\infty(\mu)$ and $\sigma^2(\mu)$ for $-2 \le \mu \le 2$, with $q = e^{-10}$

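Defining $\{\mu\} = \mu - \lfloor\mu\rfloor$ we have
\[
F_\infty(\mu) = -\{\mu\} + \frac12\tanh\left(\frac{a\{\mu\}}{2}\right) + \sum_{l=1}^{\infty} \frac{\sinh(a\{\mu\})}{\cosh(a\{\mu\}) + \cosh(al)} \tag{6.8}
\]
for all values of $\mu$.

Lemma 6.1. The function $F_\infty$ defined in (6.8) has the following properties:
(i) $F_\infty$ is periodic with period 1, i.e., $F_\infty(\mu + 1) = F_\infty(\mu)$, for all $\mu \in \mathbb{R}$.
(ii) $F_\infty$ is odd about $\mu = 1/2$, i.e., $F_\infty(1 - \mu) = -F_\infty(\mu)$, for all $\mu \in \mathbb{R}$.
(iii) $-1 \le F_\infty(\mu) \le 1$, for all $\mu \in \mathbb{R}$.
(iv) $F_\infty(\mu) = 0$ for $\mu \in \mathbb{Z}$ and $\mu \in \frac12 + \mathbb{Z}$. I.e., the estimate $\langle N\rangle^{GC}_{\Sigma,\mu} = \mu + \frac{L+1}{2}$ is exact for the half-integer and integer filling.

Proof. The periodicity of $F_\infty$ follows directly from its definition. To prove (ii), define $F(\mu)$ for $0 < \mu < 1$ as
\[
F(\mu) = \sum_{l=1}^{\infty} \left[ \frac{1}{1 + e^{a(l-\mu)}} - \frac{1}{1 + e^{a(l+\mu)}} \right] - \frac{1}{1 + e^{a\mu}}. \tag{6.9}
\]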

Fig. 5. A plot of the function $\delta(\nu, q)$ for four different values of $q$ (panels: $q = 10^{-9}$, $q = 10^{-4}$, $q = 0.05$, $q = 0.9$; horizontal axis $\nu$ from 0 to 1, vertical axis $\delta$ from 0 to 0.5)

Then,
\[
\begin{aligned}
F(1-\mu) &= \sum_{l=1}^{\infty} \left[ \frac{1}{1 + e^{a(l-1+\mu)}} - \frac{1}{1 + e^{a(l+1-\mu)}} \right] - \frac{1}{1 + e^{a(1-\mu)}} \\
&= \sum_{l=1}^{\infty} \left[ \frac{1}{1 + e^{a(l+\mu)}} - \frac{1}{1 + e^{a(l-\mu)}} \right] + \frac{1}{1 + e^{a\mu}} + \frac{1}{1 + e^{a(1-\mu)}} - \frac{1}{1 + e^{a(1-\mu)}} \\
&= -F(\mu).
\end{aligned}
\]
And clearly the remainder term
\[
\begin{cases} \frac12 - \{\mu\}, & \text{if } \mu \notin \mathbb{Z}, \\ 0, & \text{if } \mu \in \mathbb{Z} \end{cases}
\]
satisfies property (ii). For the bounds, we first restrict ourselves to $\mu \in [0,1]$. For $\mu \ge 0$, we note that (6.8) implies $F_\infty(\mu) \ge -\{\mu\} \ge -1$. Then we use property (ii) in combination with this bound to also get the upper bound for $\mu \in [0,1]$, $F_\infty(\mu) = -F_\infty(1-\mu) \le 1$. Due to the periodicity property (i), the upper and lower bounds are automatically extended to all real $\mu$. The special values stated in (iv) are straightforward from (6.8) and (6.9). □

We can define the quantity $\delta(\mu) = \min(|\mu - \lfloor\mu\rfloor|, |1 - \mu + \lfloor\mu\rfloor|)$, where $\lfloor\mu\rfloor$ is the integer part of $\mu$. In general, the relation between $\mu$ and $\nu$ depends nontrivially on $q$ and the function $\delta$ can be thought of as $\delta(q,\nu)$. But for all $q$, $0 < q < 1$, one has $\delta(q,1/2) = 0$ and $\delta(q,0) = 1/2$. See Fig. 5.
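The properties listed in Lemma 6.1 lend themselves to a quick numerical check. The sketch below evaluates $F_\infty$ through the occupation numbers $1/(1+e^{x}) = \frac12(1 - \tanh(x/2))$, which avoids overflow for large arguments, and compares it with the finite-volume $F_L$ of (6.6). The truncation length, the function names and the sample values are assumptions made for this illustration.

```python
import math


def occupation(x):
    """1 / (1 + e^x), written with tanh so that large |x| cannot overflow."""
    return 0.5 * (1.0 - math.tanh(0.5 * x))


def f_inf(mu, q, terms=200):
    """F_infinity(mu) as a truncated series in the fractional part of mu."""
    a = -2.0 * math.log(q)
    frac = mu - math.floor(mu)
    series = sum(
        occupation(a * (l - frac)) - occupation(a * (l + frac))
        for l in range(1, terms + 1)
    )
    return -frac + 0.5 - occupation(a * frac) + series


def f_finite(mu, q, L):
    """F_L(mu) = <N> - (mu + (L + 1)/2) for a stick with sites -L/2, ..., L/2 (L even)."""
    a = -2.0 * math.log(q)
    mean_n = sum(occupation(a * (l - mu)) for l in range(-L // 2, L // 2 + 1))
    return mean_n - (mu + (L + 1) / 2.0)
```

With `q = 0.5`, for example, `f_inf(0.2, q) + f_inf(0.8, q)` and `f_inf(0.5, q)` should both vanish up to rounding, and `f_finite(0.2, q, 40)` should already be very close to `f_inf(0.2, q)`.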


6.2. The variance of the number of particles in a stick. In the same way as was done above for the mean, we can compute the variance of the number of particles in a stick in the grand canonical ensemble by using the standard formula
\[
\sigma^2(\mu, L) = \langle N^2\rangle^{GC}_{\Sigma,\mu} - \left(\langle N\rangle^{GC}_{\Sigma,\mu}\right)^2 = \frac{1}{a^2}\frac{\partial^2}{\partial\mu^2} \log Z^{GC}(\Sigma,\mu), \tag{6.10}
\]
which gives
\[
\sigma^2(\mu, L) = \sum_{l=-L/2}^{L/2} \frac{1}{4\cosh^2\left(\frac{a}{2}(l-\mu)\right)}. \tag{6.11}
\]
Define
\[
\sigma^2(\mu) = \lim_{L\to\infty} \sigma^2(\mu, L). \tag{6.12}
\]
Then, the speed of convergence of this limit is bounded as follows:
\[
|\sigma^2(\mu) - \sigma^2(\mu, L)| \le 2\sum_{n=0}^{\infty} e^{-a(n - \mu + L/2)} = \frac{2q^{2(L/2-\mu)}}{1 - q^2}. \tag{6.13}
\]
It is clear that $\sigma^2(\mu)$ is a periodic function of $\mu$ with period 1. It is not hard to see that $\sigma^2(\mu, L)$ is $C^\infty$ and attains its maximum at the integers and its minimum at the half-integers. It is easy to derive upper and lower bounds for $\sigma^2(\mu, L)$. An upper bound is given by
\[
\sigma^2(\mu, L) \le \sum_{l=-L/2}^{L/2} e^{-|a(l-\mu)|} \le \sum_{l=-L/2}^{L/2} e^{-a|l|} \le 1 + \frac{2e^{-a}}{1 - e^{-a}}, \tag{6.14}
\]
and a lower bound can be obtained using the crude bound $2\cosh x \le 2e^{|x|}$:
\[
\sigma^2(\mu, L) \ge \frac14\sum_{n=1}^{L} e^{-|an|} \ge \frac14\,\frac{e^{-a} - e^{-a(L+1)}}{1 - e^{-a}}. \tag{6.15}
\]
From (6.14) and (6.15) we see that the limit $\sigma^2(\mu)$ satisfies the bounds
\[
\frac14\,\frac{q^2}{1 - q^2} \le \sigma^2(\mu) \le \frac{1 + q^2}{1 - q^2}, \tag{6.16}
\]
for all real $\mu$ and where we have again used the relation $e^{-a} = q^2$. For the aficionados, one can also show that
\[
\lim_{q\downarrow 0} \sigma^2(\mu) = \begin{cases} 0 & \text{if } \mu \notin \mathbb{Z}, \\ \frac14 & \text{if } \mu \in \mathbb{Z}. \end{cases} \tag{6.17}
\]
The interpretation is simple. When $\mu \in \mathbb{Z}$, the interface (kink) in the one-dimensional system is located at a lattice site, which is occupied by a particle with probability 1/2. Clearly, the variance of the particle number is then 1/4. However, for $\mu \notin \mathbb{Z}$, the kink is centered at a position not belonging to the lattice and the state converges, as $q \downarrow 0$, to a deterministic configuration with zero variance for the particle number.
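The bounds (6.16) and the limit (6.17) can be illustrated by evaluating the sum (6.11) directly. In the sketch below, a large but finite $L$ stands in for the limit $L \to \infty$; the function name and the sample parameters are arbitrary choices for the example.

```python
import math


def variance(mu, q, L):
    """sigma^2(mu, L) from (6.11), for a stick with sites -L/2, ..., L/2 (L even)."""
    a = -2.0 * math.log(q)
    return sum(
        1.0 / (4.0 * math.cosh(0.5 * a * (l - mu)) ** 2)
        for l in range(-L // 2, L // 2 + 1)
    )


def within_bounds(mu, q, L=60):
    """Check the two-sided bound (6.16), using a large L in place of the limit."""
    lower = 0.25 * q * q / (1.0 - q * q)
    upper = (1.0 + q * q) / (1.0 - q * q)
    return lower <= variance(mu, q, L) <= upper
```

For a very small value such as `q = 1e-4`, `variance(0.0, q, 40)` should be close to 1/4 while `variance(0.5, q, 40)` should be close to zero, in line with the interpretation given above.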


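6.3. Estimating $\langle q^{2|N - \langle N\rangle|}\rangle^{GC}_{\Sigma,\mu}$. We begin with the obvious fact
\[
q^{2|N - \langle N\rangle|} \le q^{2N - 2\langle N\rangle} + q^{2\langle N\rangle - 2N}, \tag{6.18}
\]
from which it follows that
\[
\langle q^{2|N - \langle N\rangle|}\rangle^{GC}_{\Sigma,\mu} \le q^{-2\langle N\rangle}\langle q^{2N}\rangle^{GC}_{\Sigma,\mu} + q^{2\langle N\rangle}\langle q^{-2N}\rangle^{GC}_{\Sigma,\mu}. \tag{6.19}
\]
Now, we observe
\[
\langle q^{2N}\rangle_{\Sigma,\mu} = \frac{\sum_{n=0}^{L+1} q^{2n} q^{-2\mu n} Z(\Sigma,n)}{Z^{GC}(\Sigma,\mu)} = \frac{Z^{GC}(\Sigma,\mu-1)}{Z^{GC}(\Sigma,\mu)}. \tag{6.20}
\]
Since
\[
Z^{GC}(\Sigma,\mu) = \prod_{l=-L/2}^{L/2} \left(1 + q^{2(l-\mu)}\right), \tag{6.21}
\]
Eq. (6.20) leads us to conclude
\[
\langle q^{2N}\rangle_{\Sigma,\mu} = \frac{1 + q^{2(L/2+1-\mu)}}{1 + q^{-2(L/2+\mu)}} \le 2q^{2(L/2+\mu)}. \tag{6.22}
\]
Similarly,
\[
\langle q^{-2N}\rangle_{\Sigma,\mu} = \frac{1 + q^{-2(L/2+1+\mu)}}{1 + q^{2(L/2-\mu)}} \le 2q^{-2(L/2+1+\mu)}. \tag{6.23}
\]
Using the results of Subsect. 6.1, we then have
\[
\langle q^{2|N - \langle N\rangle|}\rangle^{GC}_{\Sigma,\mu} \le 4q^{-1-|F_L(\mu)|} \le 4q^{-2}. \tag{6.24}
\]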
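The telescoping behind (6.22) and (6.23) is easy to confirm numerically: shifting $\mu$ by one shifts the product (6.21) by one site, so all but two factors cancel in the ratio. The following sketch, with hypothetical function names and an even $L$, compares the ratio of partition functions with the closed forms and checks the stated upper bounds.

```python
def zgc(mu, q, L):
    """Product form (6.21) of the grand canonical partition function (L even)."""
    prod = 1.0
    for l in range(-L // 2, L // 2 + 1):
        prod *= 1.0 + q ** (2.0 * (l - mu))
    return prod


def mean_q2n(mu, q, L):
    """<q^{2N}> as the ratio in (6.20)."""
    return zgc(mu - 1.0, q, L) / zgc(mu, q, L)


def mean_q2n_closed(mu, q, L):
    """Closed form in (6.22)."""
    return (1.0 + q ** (2.0 * (L / 2 + 1 - mu))) / (1.0 + q ** (-2.0 * (L / 2 + mu)))


def mean_qm2n(mu, q, L):
    """<q^{-2N}> as the analogous ratio with mu shifted up by one."""
    return zgc(mu + 1.0, q, L) / zgc(mu, q, L)


def mean_qm2n_closed(mu, q, L):
    """Closed form in (6.23)."""
    return (1.0 + q ** (-2.0 * (L / 2 + 1 + mu))) / (1.0 + q ** (2.0 * (L / 2 - mu)))
```

For example, with `mu, q, L = 0.3, 0.5, 10` the two ways of computing each expectation should agree to rounding accuracy.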

If we wish to calculate $\langle q^{2|N - \langle N\rangle|}\rangle^{GC}_{\Lambda,\mu}$, where $\Lambda$ is comprised of $A$ sticks, then nothing changes except that each estimate is raised to the power $A$. Thus, $\langle q^{2|N - \langle N\rangle|}\rangle^{GC}_{\Lambda,\mu} \le 2^{A+1} q^{-2A}$.

Acknowledgements. O.B. was supported by Fapesp under grant 97/14430-2. B.N. was partially supported by the National Science Foundation under grant # DMS-9706599.

References

1. Alcaraz, F.C., Salinas, S.R., Wreszinski, W.F.: Anisotropic ferromagnetic quantum domains. Phys. Rev. Lett. 75, 930–933 (1995)
2. Bolina, O., Contucci, P., Nachtergaele, B.: Path Integral Representation for Interface States of the Anisotropic Heisenberg Model. To appear in Rev. Math. Phys., archived as math-ph/9908004
3. Borgs, C., Chayes, J., Fröhlich, J.: Dobrushin states in quantum lattice systems. Commun. Math. Phys. 189, 591–619 (1997)
4. Feller, W.: An Introduction to Probability Theory and Its Applications. New York: John Wiley & Sons, 1966, vol. 2, p. 512
5. Gottstein, C.-T., Werner, R.F.: Ground states of the infinite q-deformed Heisenberg ferromagnet. Preprint archived as cond-mat/9501123
6. Hasenfratz, P., Niedermayer, F.: Finite size and temperature effects in the AF Heisenberg model. Z. Phys. B 92, 91–112 (1993)
7. Koma, T., Nachtergaele, B.: The spectral gap of the ferromagnetic XXZ chain. Lett. Math. Phys. 40, 1–16 (1997)
8. Koma, T., Nachtergaele, B.: Low-lying spectrum of quantum interfaces. Abstracts of the AMS 17, 146 (1996) and unpublished notes
9. Koma, T., Nachtergaele, B.: Interface states of quantum lattice models. In: Matsui, T. (ed.) Recent Trends in Infinite Dimensional Non-Commutative Analysis. RIMS Kokyuroku # 1035, Kyoto, 1998, pp. 133–144
10. Landau, L., Fernando Perez, J., Wreszinski, W.F.: Energy gap, clustering, and the Goldstone theorem in statistical mechanics. J. Stat. Phys. 26, 755–766 (1981)
11. Matsui, T.: On the spectra of the kink for ferromagnetic XXZ models. Lett. Math. Phys. 42, 229–239 (1997)
12. Patrascioiu, A., Seiler, E.: Superinstanton and the Reliability of Perturbation Theory in non-Abelian Models. Phys. Rev. Lett. 74, 1920–1923 (1995)

Communicated by D. Brydges

Commun. Math. Phys. 212, 93 – 104 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Derivation of the Euler Equations from a Caricature of Coulomb Interaction

Yann Brenier

Institut Universitaire de France, et Laboratoire d’analyse numérique, Université Paris 6, France. E-mail: [email protected]

Received: 9 April 1999 / Accepted: 11 January 2000

Abstract: A caricature of collisionless plasma involving 2N particles of opposite charge is introduced. The first N particles are called “ions” and do not move. The other N particles are called “electrons”. At each time, there is a one-to-one matching between electrons and ions and each pair is linked by a “spring” so that each electron oscillates with fixed frequency $\varepsilon^{-1}$. The essential point is that the matching between electrons and ions is updated at every discrete time $n\tau$, $n = 0, 1, 2, \ldots$, so that the total potential energy of the system stays minimal. This leads to a nontrivial interaction which turns out to be a caricature of Coulomb interaction. It is proven that, provided the N ions are equally spaced in a bounded domain D and $\varepsilon$, $\tau$ and $N^{-1}$ tend to zero at appropriate rates, the electrons behave as the fluid parcels of an incompressible inviscid liquid moving inside D according to the Euler equations. Our proof relies on a result of P. Lax on the approximation of volume-preserving transformations by permutations.

1. Description of the Dynamical System

Consider a smooth compact domain D in $\mathbb{R}^d$ with unit volume and set N particles inside D. These particles are called “ions” and their positions $A_1, \ldots, A_N \in D$ are supposed to be fixed. Now, consider N other particles, called “electrons”, moving in the Euclidean space $\mathbb{R}^d$, with label $\alpha = 1, \ldots, N$ and position $X_\alpha(t) \in \mathbb{R}^d$ at time t. A time step $\tau > 0$ is fixed. In each time interval $n\tau \le t < (n+1)\tau$, there is a one-to-one pairing $\alpha \to \sigma_\alpha$ between each electron $X_\alpha$ and a corresponding ion $A_{\sigma_\alpha}$. A spring links each pair so that each electron oscillates around the corresponding ion with fixed frequency $\varepsilon^{-1}$:
\[
\varepsilon^2 X_\alpha'' + X_\alpha = A_{\sigma_\alpha}. \tag{1}
\]
Of course, during the time interval $n\tau < t < (n+1)\tau$, the total energy
\[
E(t) = \frac12\|X'(t)\|^2 + \frac{1}{2\varepsilon^2}\|X(t) - \sigma A\|^2 \tag{2}
\]
is preserved. (In this equation, the following notations have been used:
\[
X(t) = (X_\alpha(t))_{\alpha=1}^N, \quad A = (A_\alpha)_{\alpha=1}^N, \quad \|Y\|^2 = \frac1N\sum_\alpha |Y_\alpha|^2, \quad \sigma Y = (Y_{\sigma_\alpha})_{\alpha=1}^N,
\]
for all $Y = (Y_\alpha)_{\alpha=1}^N \in (\mathbb{R}^d)^N$.) At each discrete time $t = n\tau$, the pairing is subject to change and $\sigma$ is updated to keep minimal the potential energy, namely
\[
\frac{1}{2\varepsilon^2}\|X(n\tau) - \sigma A\|^2, \tag{3}
\]
among all permutations. (Notice that there may be several solutions, in which case we arbitrarily choose one of them.) So, $\sigma$ is time dependent, piecewise constant, and denoted by $\sigma(t)$. Of course, we assume both positions and velocities of each particle to be continuous at each discrete time $n\tau$ and we prescribe their values at time 0. This gives a complete description of the dynamical system. Notice that the total energy, defined by (2), is preserved on each interval $n\tau < t < (n+1)\tau$, and can only decay at each time $n\tau$, by definition of $\sigma(n\tau)$. So the total energy is a non-increasing function of time. The possible dissipation is due to the fact that $\sigma(t)$ is updated only at $t = n\tau$, and not continuously in time (in which case, the system would be formally conservative). We have chosen to introduce the time step $\tau$ in order to have an unambiguous definition of the dynamical system and also to get a system that can be exactly integrated on a computer, without further approximation.

2. Derivation of the Euler Equations

The motion of an incompressible inviscid liquid in $\mathbb{R}^d$ is classically described by the Euler equations (see [AK], [MP], for example)
\[
\partial_t v + (v\cdot\nabla)v + \nabla p = 0, \qquad \nabla\cdot v = 0, \tag{4}
\]

where $p = p(t,x) \in \mathbb{R}$ is the pressure field, $v = v(t,x) \in \mathbb{R}^d$ is the velocity field. If the initial value $v(0,x) \in \mathbb{R}^d$ is smooth, then the Euler equations have a unique smooth solution, which is globally defined in time if $d = 2$ and locally if $d = 3$. Let us consider such a solution and assume that D is preserved by the fluid motion, namely that the velocity field v is parallel to the boundary of D, so that there is no material flux across the boundary $\partial D$. We fix a time interval $[0,T]$ on which v is well defined and denote by C any constant depending only on D, T, v and p.

Theorem 2.1. Assume that D can be split into N disjoint subdomains $D_\alpha$ of equal volume and diameter not larger than $Ch$, where $h = N^{-1/d}$, each of them containing one and only one ion $A_\alpha$. Assume that the initial positions and velocities of the particles are given by
\[
X_\alpha(t=0) = A_\alpha, \qquad X_\alpha'(t=0) = v(t=0, A_\alpha), \tag{5}
\]
for $\alpha = 1, \ldots, N$. Scale the parameters $\varepsilon$, $\tau$ and $h = N^{-1/d}$ so that
\[
h \le C\varepsilon^8, \qquad \tau \le C\varepsilon^4. \tag{6}
\]
Then
\[
\frac1N\sum_\alpha |X_\alpha'(t) - v(t, X_\alpha(t))|^2 \le C\varepsilon^2. \tag{7}
\]

Before giving the proof, we provide in the next section a geometrical interpretation of the theorem and we explain why the interaction of the particles can be seen as a caricature of Coulomb interaction. Of course, the reader only interested in the proof of the derivation of the Euler equations may go directly to Sect. 4.

3. Geometric and Physical Interpretations

3.1. A geometric approximation to the Euler equations. Let us recall that the Euler equations, describing the motion of an inviscid incompressible fluid moving inside a smooth bounded domain D of the Euclidean space $\mathbb{R}^d$, have a nice geometric interpretation, for which we refer to [AK]. They describe the geodesics on the group G of all volume-preserving diffeomorphisms of D, where lengths are measured in the $L^2$ sense, G being viewed as a subset of the Hilbert space $H = L^2(D, \mathbb{R}^d)$. Of course, this correspondence is somewhat formal and a lot of analytical difficulties are left behind [Sh], one of them, for example, being that, for all $d \ge 2$, the $L^2$ closure of G is the (much larger) semi-group S of all Borel Lebesgue-measure preserving maps from D into itself. So, it is natural to look for either generalized ([Sh, Br2]) or approximate geodesics. A simple way to define approximate geodesics is to consider the formal dynamical system
\[
X'' + \nabla_X\left( \frac{d_H^2(X,G)}{2\varepsilon^2} \right) = 0, \tag{8}
\]
in the configuration space H, where $\nabla_X$ is the functional gradient in H and the potential involves the distance in H between X and G (or, equivalently, between X and S, the $L^2$ closure of G), namely:
\[
d_H(X,G) = \inf_{g\in G}\|X - g\|_H = \inf_{g\in S}\|X - g\|_H = d_H(X,S), \tag{9}
\]
where $\|\cdot\|_H$ is the Hilbert norm of H. (This approach is similar, but not identical, to Ebin’s slightly incompressible flow theory [Eb]; see also [RU] for finite dimensional mechanical systems.) To get an approximate finite-dimensional Hamiltonian system, we set N points $A_1, \ldots, A_N$ equally spaced in D, we substitute for H and G respectively $(\mathbb{R}^d)^N$ and the finite group of all permutations of the $A_\alpha$,
\[
\{(A_{\sigma_1}, \ldots, A_{\sigma_N}) \in (\mathbb{R}^d)^N\},
\]
where $\sigma$ is any permutation of $\{1, \ldots, N\}$, and we keep (8), (9) unchanged. Finally, a more tractable dynamical system is obtained by introducing a time step $\tau$ and updating the potential energy only at discrete times $n\tau$. This leads back exactly to our system of particles and it is no longer surprising that the Euler equations can be recovered as $\varepsilon$, $N^{-1}$ and $\tau$ go to zero.


3.2. A caricature of Coulomb interaction. In this subsection, we give a formal argument to show that our dynamical system evolves according to a caricature of Coulomb interaction, which is not so obvious. We first go back to formulation (8), where X should be considered as a time dependent square integrable map from D into $\mathbb{R}^d$. We introduce
\[
\rho(t,x) = \int_D \delta(x - X(t,a))\,da, \tag{10}
\]
and we claim that, for each time t such that $\rho(t,\cdot)$ is absolutely continuous with respect to the Lebesgue measure, equation (8) is equivalent to:
\[
X''(t,a) = E(t, X(t,a)), \quad \forall a \in D, \tag{11}
\]
where the acceleration field E is given by
\[
E(t,x) = \frac{\nabla\Psi(t,x) - x}{\varepsilon^2} \tag{12}
\]
and $\Psi$ is a solution of the Monge–Ampère equation
\[
\det(D^2\Psi(t,x)) = \rho(t,x), \tag{13}
\]
where $\det(D^2\Psi)$ stands for the determinant of the second derivatives of $\Psi(t,x)$ with respect to x. To justify this claim, we refer to the polar factorization theorem for maps (see [Br, Ca]). At each time t for which $\rho(t,\cdot)$ is absolutely continuous with respect to the Lebesgue measure, we write (Theorem 1.2, p. 377 in [Br])
\[
X(t) = \nabla\Phi(t) \circ g(t), \tag{14}
\]
where $\Phi(t)$ is a function on D, with convex extension to the convex hull of D, and $g(t) \in S$ is a Lebesgue measure-preserving map from D into itself. The factor $g(t)$ has additional properties (deduced from Theorem 1.2 and Proposition 2.2, p. 390, in [Br]). First, $g(t)$ is the unique point in S that minimizes the $L^2$ distance to $X(t)$. Next, $X(t) - g(t)$ is the gradient of
\[
X \in H \to \frac12 d_H^2(X,S)
\]
at point $X = X(t)$. Finally, $g(t)$ can be written
\[
g(t) = \nabla\Psi(t) \circ X(t), \tag{15}
\]
where $\Psi$ (the Legendre–Fenchel transform of $\Phi$ with respect to $x \in D$) is a convex solution (in a suitable sense [Br, Ca]) of the Monge–Ampère equation (13). Thus, it follows from the polar factorization theorem that
\[
\nabla_X \frac{d_H^2(X(t),S)}{2} = X(t) - \nabla\Psi(t) \circ X(t) \tag{16}
\]
and our claim is now justified. So, we have obtained for the approximate geodesic equation (8) a Monge–Ampère formulation with (10), (13), (11) and (12).


Now, as $\varepsilon$ is small, a natural ansatz for $\Psi$ is
\[
\Psi(t,x) = \frac{|x|^2}{2} - \varepsilon^2\phi(t,x), \tag{17}
\]
which, inserted in (12) and (13), respectively leads to
\[
E(t,x) = -\nabla\phi(t,x), \tag{18}
\]
\[
\rho(t,x) = 1 - \varepsilon^2\Delta\phi(t,x) + O(\varepsilon^4). \tag{19}
\]
Dropping the $O(\varepsilon^4)$ term in the last equation gives exactly the Poisson equation
\[
\rho(t,x) = 1 - \varepsilon^2\Delta\phi(t,x), \tag{20}
\]
which involves the Coulomb potential. Since Eqs. (10), (11), (18), (20) correctly describe a collisionless plasma of electrons with a uniform ion background and Coulomb interaction, we can say that our dynamical system of particles (which is an approximation as $h, \tau \to 0$, $\varepsilon$ being fixed, of equation (8)) is just a caricature which gets finer as $\varepsilon$ tends to zero. Presumably, a rigorous analysis can be achieved with appropriate uniform pseudo-differential energy estimates as in [Gr]. Let us finally observe that, in the very special case $d = 1$, there is no discrepancy between the Monge–Ampère equation (13) and the Poisson equation (20) (because of (17)). Then, our dynamical system turns out to be an exact model of collisionless plasma, as shown in the Appendix.
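The step from (17) to (19) can be made concrete in two dimensions, where the determinant of $D^2\Psi = I - \varepsilon^2 D^2\phi$ expands exactly as $1 - \varepsilon^2\Delta\phi + \varepsilon^4\det(D^2\phi)$. The sketch below checks this pointwise for a given $2\times 2$ Hessian of $\phi$; the function names and the sample Hessian are illustrative assumptions.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def monge_ampere_density(hess_phi, eps):
    """det(D^2 Psi) for Psi = |x|^2/2 - eps^2 phi, from the 2x2 Hessian of phi at a point."""
    e2 = eps * eps
    d2psi = [
        [1.0 - e2 * hess_phi[0][0], -e2 * hess_phi[0][1]],
        [-e2 * hess_phi[1][0], 1.0 - e2 * hess_phi[1][1]],
    ]
    return det2(d2psi)


def poisson_density(hess_phi, eps):
    """1 - eps^2 * Laplacian(phi), the right-hand side of the Poisson form."""
    return 1.0 - eps * eps * (hess_phi[0][0] + hess_phi[1][1])


def discrepancy(hess_phi, eps):
    """Difference between the Monge-Ampere and Poisson densities at a point."""
    return monge_ampere_density(hess_phi, eps) - poisson_density(hess_phi, eps)
```

In this two-dimensional setting `discrepancy(h, eps)` equals `eps**4 * det2(h)` up to rounding, which is the $O(\varepsilon^4)$ term dropped in passing to the Poisson equation. In one dimension the Hessian is a single number and the corresponding discrepancy vanishes identically.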

3.3. The semi-geostrophic equations. Our dynamical system has an interesting connection with another physical model, namely Hoskins’ frontogenesis model and the related semi-geostrophic equations in atmospheric sciences ([Ho], see also [CNP, BB]). A discrete version of this model has been discussed in [BN] and the corresponding particle system (in two dimensions) is given by
\[
i.X_\alpha'(t) + X_\alpha(t) = A_{\sigma_\alpha(t)}, \tag{21}
\]
where i is the rotation matrix of angle $\pi/2$ in $\mathbb{R}^2$ and $\sigma(t)$ is defined exactly in the same way as for our system of particles. In some vague sense, this Hamiltonian system has the same structural relationship with our system as the dynamical system for N vortex points in $\mathbb{R}^2$ [MP] has with the dynamical system of N electrons with Coulomb interaction in $\mathbb{R}^d$.

4. Proof of the Main Result

4.1. Notations. If $Y \in (\mathbb{R}^d)^N$, $v(t,Y)$ stands for $(v(t,Y_\alpha))_{\alpha=1}^N$, and notations $v(t,X(t))$, $v(t,\sigma(t)A)$, etc. will be used. Partial derivatives in $\partial_t f$ and $\partial_{x_i} f$ are denoted by $f_{,t}$ and $f_{,i}$. There will be automatic summation on repeated Latin indices i, j and the notation
\[
{\sum_\alpha}^{*} = \frac1N\sum_\alpha
\]
will be used. When possible, the explicit dependence on t of X and $\sigma$ will be omitted and capital letters will be used for functions of X, such as V for $v(t,X)$. For instance ${\sum_\alpha}^{*} X_{\alpha j} V_{\alpha i,j}$ means
\[
\frac1N\sum_{\alpha=1}^{N}\sum_{j=1}^{d} (X_\alpha)_j(t)\,(\partial_{x_j} v_i)(t, X_\alpha(t)).
\]

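4.2. Bounds. According to definition (2) and assumption (5), we have $2E(0) = \|v(t=0, A)\|^2 \le C$. Since the total energy is non-increasing, we deduce
\[
\|X'(t)\| \le C, \qquad \|X(t) - \sigma(t)A\| \le C\varepsilon, \qquad \|X(t)\| \le C. \tag{22}
\]

4.3. Modulated energy. Let us introduce
\[
E_v(t) = \frac12\|X'(t) - v(t,X(t))\|^2 + \frac{1}{2\varepsilon^2}\|X(t) - \sigma(t)A\|^2, \tag{23}
\]
which can be seen as a modulated energy depending on v. Let us compute its time derivative on each interval $n\tau < t < (n+1)\tau$, where we know that the total energy $E(t)$ is preserved. We find
\[
\frac{d}{dt}E_v(t) = {\sum_\alpha}^{*} \left( -X_{\alpha i}'' V_{\alpha i} + (V_{\alpha i} - X_{\alpha i}')(V_{\alpha i,t} + V_{\alpha i,j} X_{\alpha j}') \right), \tag{24}
\]
from which we get, using (1),
\[
\frac{d}{dt}E_v(t) + {\sum_\alpha}^{*} V_{\alpha i,j}(V_{\alpha i} - X_{\alpha i}')(V_{\alpha j} - X_{\alpha j}') = {\sum_\alpha}^{*} \left[ (V_{\alpha i} - X_{\alpha i}')(V_{\alpha i,t} + V_{\alpha j} V_{\alpha i,j}) + V_{\alpha i}\,\frac{X_{\alpha i} - \sigma A_{\alpha i}}{\varepsilon^2} \right]. \tag{25}
\]
Rearranging (25), we obtain
\[
\frac{d}{dt}E_v(t) + Q(t) = I_1 + I_2, \tag{26}
\]
where Q is defined by
\[
{\sum_\alpha}^{*} \left[ V_{\alpha i,j}(V_{\alpha i} - X_{\alpha i}')(V_{\alpha j} - X_{\alpha j}') - \frac{X_{\alpha i} - \sigma A_{\alpha i}}{\varepsilon^2}\,\bigl(v_i(t,X_\alpha) - v_i(t,\sigma A_\alpha)\bigr) \right] \tag{27}
\]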


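and $I_1$, $I_2$ by
\[
I_1 = {\sum_\alpha}^{*} \varepsilon^{-2}(X_{\alpha i} - \sigma A_{\alpha i})\,v_i(t,\sigma A_\alpha), \tag{28}
\]
\[
I_2 = {\sum_\alpha}^{*} (V_{\alpha i} - X_{\alpha i}')(V_{\alpha i,t} + V_{\alpha j} V_{\alpha i,j}). \tag{29}
\]
From the Euler equations (4), we get
\[
V_{\alpha i,t} + V_{\alpha j} V_{\alpha i,j} = -(p_{,i})(t,X_\alpha). \tag{30}
\]
After setting
\[
D_t p = p_{,t} + v_i p_{,i}, \tag{31}
\]
we see that
\[
(D_t p)(t,X_\alpha) = \frac{d}{dt}\bigl(p(t,X_\alpha)\bigr) + (p_{,i})(t,X_\alpha)(V_{\alpha i} - X_{\alpha i}').
\]
Thus $I_2$ becomes
\[
I_2 = -\frac{d}{dt}J(t) + I_3 + I_4, \tag{32}
\]
where
\[
J(t) = -{\sum_\alpha}^{*} p(t,X_\alpha(t)), \tag{33}
\]
\[
I_3 = -{\sum_\alpha}^{*} (D_t p)(t,A_\alpha), \tag{34}
\]
\[
I_4 = {\sum_\alpha}^{*} \bigl[(D_t p)(t,A_\alpha) - (D_t p)(t,X_\alpha)\bigr],
\]
which is also
\[
I_4 = {\sum_\alpha}^{*} \bigl[(D_t p)(t,\sigma A_\alpha) - (D_t p)(t,X_\alpha)\bigr]. \tag{35}
\]
We split $I_1 = I_5 + I_6$ with
\[
I_5 = -{\sum_\alpha}^{*} \varepsilon^{-2} A_{\alpha i}\,v_i(t,A_\alpha), \tag{36}
\]
\[
I_6 = {\sum_\alpha}^{*} \varepsilon^{-2} X_{\alpha i}\,v_i(t,\sigma A_\alpha), \tag{37}
\]
and (26) becomes
\[
\frac{d}{dt}\bigl(E_v(t) + J(t)\bigr) + Q(t) = I_3 + I_4 + I_5 + I_6. \tag{38}
\]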


Since v is smooth, by definitions (23) and (27), we have
\[
-Q \le CE_v(t), \tag{39}
\]
and therefore (38) implies
\[
\frac{d}{dt}(E_v + J) \le CE_v + I_3 + I_4 + I_5 + I_6, \tag{40}
\]
on each time interval $n\tau < t < (n+1)\tau$. Since $E(t)$ is preserved on each of these intervals with non-positive jumps at each $t = n\tau$, we deduce from definition (23) that $E_v(t)$ has the same jumps at $t = n\tau$ as $E(t)$ and, therefore, (40) is valid for all $0 < t < T$ (in the sense of distributions).

4.4. Error estimates. Let us first observe that, because of the assumption on the location of the $A_\alpha$, we have, for all Lipschitz continuous functions f,
\[
\left| \frac1N\sum_\alpha f(A_\alpha) - \int_D f(x)\,dx \right| \le C\,\mathrm{Lip}(f)\,h. \tag{41}
\]
(Indeed, each $A_\alpha$ is assumed to belong to a $D_\alpha$, where each $D_\alpha$ has volume $N^{-1}$ and diameter no larger than $Ch$.) We first consider $I_3 + I_5$, defined by (34), (36), which can be seen as a “quadrature formula” for the integral
\[
-\int_D \left( (\partial_t + v(t,x)\cdot\nabla)p(t,x) + \varepsilon^{-2}\,x\cdot v(t,x) \right) dx. \tag{42}
\]
Since v is divergence-free and parallel to $\partial D$, this integral is simply
\[
-\frac{d}{dt}\int_D p(t,x)\,dx. \tag{43}
\]
But, the pressure can be normalized so that
\[
\int_D p(t,x)\,dx = 0 \tag{44}
\]
at each time t and
\[
|I_3 + I_5| \le Ch\varepsilon^{-2} \tag{45}
\]
follows from (41). We immediately obtain for $I_4$, defined by (35),
\[
I_4 \le C\|X - \sigma A\| \le C\varepsilon E_v^{1/2} \le \frac12 E_v + C\varepsilon^2.
\]
To deal with J, defined by (33), we observe on one hand that
\[
\left| J + {\sum_\alpha}^{*} p(t,\sigma A_\alpha) \right| \le C\|X - \sigma A\| \le \frac12 E_v + C\varepsilon^2, \tag{46}
\]

and on the other hand that
\[
{\sum_\alpha}^{*} p(t,\sigma A_\alpha) = {\sum_\alpha}^{*} p(t,A_\alpha)
\]
is a quadrature formula for $\int_D p(t,x)\,dx$, which is null by (44). Therefore
\[
|J(t)| \le \frac12 E_v(t) + C(\varepsilon^2 + h). \tag{47}
\]
Moreover, at $t = 0$,
\[
J(0) = -{\sum_\alpha}^{*} p(0,A_\alpha) \tag{48}
\]
is a quadrature for $\int_D p(0,x)\,dx$ and
\[
|J(0)| \le Ch \tag{49}
\]
follows from (44) and (41).

Let us finally consider $I_6$, the most interesting term, defined by (37). Let us introduce
\[
I_7 = {\sum_\alpha}^{*} \varepsilon^{-2} X_{\alpha i}(t_\tau)\,v_i(t,\sigma(t_\tau)A_\alpha), \tag{50}
\]
where $t_\tau$ stands for the integer part of $t/\tau$, multiplied by $\tau$. Thus,
\[
I_7 - I_6 = {\sum_\alpha}^{*} \varepsilon^{-2}\bigl(X_{\alpha i}(t_\tau) - X_{\alpha i}(t)\bigr)\,v_i(t,\sigma(t_\tau)A_\alpha), \tag{51}
\]
where the dependence in t and $t_\tau$ is explicitly written. We immediately get that $I_8 = I_7 - I_6$ satisfies
\[
I_8 \le C\varepsilon^{-2}|t - t_\tau| \sup_{0\le\theta\le T}\|X'(\theta)\|
\]
and, therefore,
\[
|I_6 - I_7| \le C\varepsilon^{-2}\tau \tag{52}
\]
follows from (22). Let us introduce an artificial time step $\theta > 0$ so that
\[
\sup_{a\in D}\left| v(t,a) - \frac{M(t,t+\theta,a) - a}{\theta} \right| \le C\theta, \tag{53}
\]
where we denote by $M(t_0,t_1,a)$ the location in D at time $t_1$ of a point advected by the velocity field $v(t,x)$ and located at a at time $t_0$. Let us introduce
\[
I_9 = {\sum_\alpha}^{*} \varepsilon^{-2} X_{\alpha i}(t_\tau)\,\frac{M_i(t,t+\theta,\sigma(t_\tau)A_\alpha) - \sigma(t_\tau)A_{\alpha i}}{\theta}, \tag{54}
\]
which is an approximation of $I_7$, defined by (50). Indeed,
\[
|I_7 - I_9| \le C\theta\varepsilon^{-2}\|X(t_\tau)\| \le C\theta\varepsilon^{-2} \tag{55}
\]


follows from (53) and (22). Since v is a smooth divergence-free vector field, parallel to $\partial D$, the mapping $a \to M(t,t+\theta,a)$ is a Lebesgue measure-preserving Lipschitz transformation of D. Following a result of Lax [La], we observe that such a transformation can be approximated, in sup norm with an error of order h, by a permutation of the $A_\alpha$. More precisely, there exists a permutation $\eta$ such that
\[
\sup_{\alpha=1,\ldots,N} |M(t,t+\theta,A_\alpha) - \eta A_\alpha| \le Ch. \tag{56}
\]
Thus
\[
|I_9 - I_{10}| \le C\varepsilon^{-2}\,\frac{h}{\theta}, \tag{57}
\]
where
\[
I_{10} = {\sum_\alpha}^{*} \varepsilon^{-2} X_{\alpha i}(t_\tau)\,\frac{\eta\sigma(t_\tau)A_{\alpha i} - \sigma(t_\tau)A_{\alpha i}}{\theta}. \tag{58}
\]
This last expression is always non-positive, since, by construction (3), $\sigma(t_\tau)$ satisfies
\[
\|X(t_\tau) - \sigma(t_\tau)A\|^2 \le \|X(t_\tau) - \eta\sigma(t_\tau)A\|^2.
\]
So, from (52), (55), (57) we deduce
\[
I_6 \le C\left(\frac{h}{\theta} + \theta + \tau\right)\varepsilon^{-2}. \tag{59}
\]
Using (47), (45), (46) and (59) in the right-hand side of (40), we get
\[
\frac{d}{dt}(E_v + J) \le C(E_v + J) + C\left(\frac{h}{\theta} + \theta + \tau\right)\varepsilon^{-2} + C(\varepsilon^2 + h). \tag{60}
\]
By setting the artificial parameter $\theta$ equal to $h^{1/2}$ and using assumption (6), we get
\[
(E_v + J)(t) \le C(E_v + J)(0) + C\varepsilon^2.
\]
Since $E_v(0) = 0$ follows from assumption (5), we deduce from (47) and (49) that $E_v(t) \le C\varepsilon^2$, which concludes the proof, by definition (23).

Appendix: The One-Dimensional Case

As already mentioned in Subsect. 3.2, we can expect, in the special case $d = 1$, that our dynamical system is an exact description of a collisionless plasma of electrons with a fixed and uniform ion background (see [BR, ZM, MMZ] for some mathematical and numerical aspects of this model). To check this statement, let us consider a uniform background of non-moving ions in $\mathbb{R}^3$ and N layers of electrons, all of them being orthogonal to a fixed axis. Each layer can be considered as a particle moving along this axis, with position $X_\alpha(t) \in \mathbb{R}$ at time t, for $\alpha = 1, \ldots, N$. The electric field is scalar, depends on one space variable $x \in \mathbb{R}$ and satisfies
\[
\partial_x E(t,x) = 1 - \frac1N\sum_\alpha \delta(x - X_\alpha(t)). \tag{61}
\]


Thus we get for each particle
\[
X_\alpha''(t) + X_\alpha(t) = \frac1N\sum_{\beta=1}^{N} H(X_\alpha(t) - X_\beta(t)) - E_0(t), \tag{62}
\]
where H stands for the Heaviside function, with conventionally $H(0) = 1/2$, and $E_0(t)$ depends on the boundary conditions we choose. Let us now consider each time t when all particles have different locations, sort their positions by increasing order and denote by $\sigma_\alpha(t)$ the rank of the particle with label $\alpha$, for $\alpha = 1, \ldots, N$. This defines a (time-dependent) permutation $\sigma = \sigma(t)$ that can be easily seen as the one that minimizes
\[
\sum_\alpha |X_\alpha(t) - A_{\sigma_\alpha}|^2, \tag{63}
\]
where $A_\alpha = (\alpha - 1/2)/N$, for $\alpha = 1, \ldots, N$. Thus (62) becomes
\[
X_\alpha''(t) + X_\alpha(t) = A_{\sigma_\alpha(t)} - E_0(t). \tag{64}
\]
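The one-dimensional particle system is simple enough to simulate exactly. The sketch below is an illustration under stated assumptions, not the author's code: it takes $E_0 = 0$, uses the ions $A_\alpha = (\alpha - 1/2)/N$, keeps the parameter $\varepsilon$ of equation (1) of Section 1, re-matches electrons to ions by rank at each time $n\tau$, and integrates each harmonic oscillator exactly between matchings. The initial velocity profile and all function names are hypothetical.

```python
import math


def ion_positions(n):
    """Ions A_alpha = (alpha - 1/2) / N for alpha = 1, ..., N, in increasing order."""
    return [(k + 0.5) / n for k in range(n)]


def optimal_matching(xs, ions):
    """In d = 1 the cost sum |x_alpha - A_sigma(alpha)|^2 is minimized by matching ranks."""
    order = sorted(range(len(xs)), key=lambda alpha: xs[alpha])
    target = [0.0] * len(xs)
    for rank, alpha in enumerate(order):
        target[alpha] = ions[rank]
    return target


def energy(xs, vs, target, eps):
    """Total energy: kinetic part plus spring energy, both averaged over particles."""
    n = len(xs)
    kinetic = sum(v * v for v in vs) / (2.0 * n)
    potential = sum((x - b) ** 2 for x, b in zip(xs, target)) / (2.0 * n * eps * eps)
    return kinetic + potential


def step(xs, vs, target, eps, tau):
    """Exact solution of eps^2 x'' + x = b over a time tau, for each particle."""
    c, s = math.cos(tau / eps), math.sin(tau / eps)
    new_xs = [b + (x - b) * c + eps * v * s for x, v, b in zip(xs, vs, target)]
    new_vs = [-(x - b) * s / eps + v * c for x, v, b in zip(xs, vs, target)]
    return new_xs, new_vs


def simulate(n=16, eps=0.1, tau=0.01, steps=200):
    """Return the energy recorded just after each re-matching at times 0, tau, 2 tau, ..."""
    ions = ion_positions(n)
    xs = list(ions)
    vs = [0.2 * math.sin(2.0 * math.pi * a) for a in ions]  # hypothetical initial velocity
    energies = []
    for _ in range(steps):
        target = optimal_matching(xs, ions)
        energies.append(energy(xs, vs, target, eps))
        xs, vs = step(xs, vs, target, eps, tau)
    return energies
```

Because the flow between two matchings conserves the energy and each re-matching can only lower the potential part, the sequence returned by `simulate()` should be non-increasing up to rounding, in agreement with the discussion in Section 1.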

So, up to the choice of suitable boundary conditions to enforce $E_0(t) = 0$, we have recovered our system of particles (up to the further time discretization with time-step $\tau > 0$) with $D = [0,1]$ and the fictitious “ions” $A_\alpha$ standing for the ion background.

Acknowledgements. The author thanks the Erwin Schrödinger Institute (ESI), the University of Toronto and the Courant Institute where this work has been completed. This work has also been supported by TMR ERB FMBX CT97 0157 “asymptotic methods in kinetic equations” and the program on Charged Particle Kinetics at the ESI.

References

[AK] Arnold, V.I., Khesin, B.: Topological methods in hydrodynamics. New York: Springer, 1998
[BN] Baigent, S., Norbury, J.: Two discrete models for semi-geostrophic dynamics. Phys. D 109, 333–342 (1997)
[BR] Batt, J., Rein, G.: Global classical solutions of the periodic Vlasov–Poisson system. C.R. Acad. Sci. Paris 313, 411–416 (1991)
[BB] Benamou, J.-D., Brenier, Y.: Weak existence for the semigeostrophic equations formulated as a coupled Monge-Ampère/transport problem. SIAM J. Appl. Math. 58, 1450–1461 (1998)
[Br] Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math. 44, 375–417 (1991)
[Br2] Brenier, Y.: Minimal geodesics on groups of volume-preserving maps. Comm. Pure Appl. Math. 52, 411–452 (1999)
[Eb] Ebin, D.: The motion of slightly compressible fluids viewed as a motion with strong constraining force. Ann. of Math. 2 105, 141–200 (1977)
[Ca] Caffarelli, L.: Boundary regularity of maps with convex potentials. Comm. Pure Appl. Math. 45, 1141–1151 (1992)
[CNP] Cullen, M., Norbury, J., Purser, J.: Generalised Lagrangian solutions for atmospheric and oceanic flows. SIAM J. Appl. Math. 51, 20–31 (1991)
[Gr] Grenier, E.: Pseudo-differential energy estimates of singular perturbations. Comm. Pure Appl. Math. 50, 821–865 (1997)
[Ho] Hoskins, B.: The mathematical theory of frontogenesis. Annual review of fluid mechanics. Vol. 14, Palo Alto, 1982, pp. 131–151
[La] Lax, P.: Approximation of measure preserving transformations. Comm. Pure Appl. Math. 24, 133–135 (1971)
[MMZ] Majda, A., Majda, G., Zheng, Y.X.: Concentrations in the one-dimensional Vlasov–Poisson equations. I. Temporal development and non-unique weak solutions in the single component case. Phys. D 74, 268–300 (1994)
[MP] Marchioro, C., Pulvirenti, M.: Mathematical theory of incompressible nonviscous fluids. New York: Springer, 1994
[RU] Rubin, H., Ungar, P.: Motion under a strong constraining force. Comm. Pure Appl. Math. 10, 65–87 (1957)
[Sh] Shnirelman, A.I.: Generalized fluid flows, their approximation and applications. Geom. Funct. Anal. 4, 586–620 (1994)
[ZM] Zheng, Y.X., Majda, A.: Existence of global weak solutions to one-component Vlasov–Poisson and Fokker–Planck–Poisson systems in one space dimension with measures as initial data. Comm. Pure Appl. Math. 47, 1365–1401 (1994)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 212, 105 – 164 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Non-Equilibrium Statistical Mechanics of Strongly Anharmonic Chains of Oscillators

J.-P. Eckmann^{1,2}, M. Hairer^{1}

1 Dépt. de Physique Théorique, Université de Genève, 1211 Genève 4, Switzerland. E-mail: [email protected]; [email protected]
2 Section de Mathématiques, Université de Genève, 1211 Genève 4, Switzerland

Received: 27 September 1999 / Accepted: 11 January 2000

Abstract: We study the model of a strongly non-linear chain of particles coupled to two heat baths at different temperatures. Our main result is the existence and uniqueness of a stationary state at all temperatures. This result extends those of Eckmann, Pillet, Rey-Bellet [EPR99a, EPR99b] to potentials with essentially arbitrary growth at infinity. This extension is possible by introducing a stronger version of Hörmander’s theorem for Kolmogorov equations to vector fields with polynomially bounded coefficients on unbounded domains.

1. Introduction

In this paper, we study the statistical mechanics of a highly non-linear chain of oscillators coupled to two heat reservoirs which are at (arbitrary) different temperatures. We show that such systems have, under suitable conditions, a unique stationary state, in which heat flows from the hotter reservoir to the cooler one. These results are an extension of the same statements obtained by Eckmann, Pillet and Rey-Bellet in [EPR99a, EPR99b] where it was assumed that the Hamiltonian is essentially “quadratic at high energies”. Since quadratic Hamiltonians have been discussed much earlier by Lebowitz and Spohn [LS77], there is an issue here of whether the quadratic nature of the forces at infinite energies is an essential ingredient of existence and uniqueness of the stationary state. Our result shows that this is not the case, since we allow for potentials of arbitrary polynomial growth.

Our models, which are described in Sect. 2, treat a Hamiltonian of the form

$$H_S(p,q) = \sum_{i=0}^{N}\Bigl(\frac{p_i^2}{2} + V_1(q_i)\Bigr) + \sum_{i=1}^{N} V_2(q_i - q_{i-1}),$$

describing a chain of particles with nearest-neighbor interaction (see Fig. 3.1). This chain is linearly coupled to heat baths $B_i$ represented by free fields at temperatures $T_i$.


We proceed then, as in [EPR99a], to a reduction to a stochastic differential equation, see (2.2). Associated with it is an “effective energy” G, described in (2.6), which is equal to $H_S$ with some quadratic terms from the heat baths added. The generator corresponding to the stochastic differential equation above, represented in a space weighted with an exponential of G, will be called K and is the main object of study of this paper. It is for this generator that we show existence and uniqueness of an invariant state. This will be done by first showing that K has compact resolvent (which is really more than needed), and then using this result to derive the properties of the invariant measure.

Our conditions on $H_S$ are spelled out in Sect. 3 below. They basically say that the coupling between the oscillators must be stronger than the single particle potential. This condition might be physically relevant, since it implies that transport is favored over storing of energy, but we have not found a counterexample when this condition is violated. Furthermore, the interparticle coupling must be convex.

The main technical insight behind our generalization of the results of [EPR99a, EPR99b] is a new, and stronger version of the Hörmander theorem for Kolmogorov equations. We will develop this in more generality in Sect. 5, but here we just indicate how we use this result. The operator K is of the form

$$K = \sum_{i=1}^{n} X_i^* X_i + X_0, \tag{1.1}$$

where the $X_i$ are smooth vector fields on $\mathbb{R}^d$. For example, see Eq. (3.13), $X_0$ contains terms of the form $p_i\partial_{q_i}$ and $(\partial_{q_i}V)\partial_{p_i}$, where V is the interaction. The $X_i$ for $i \neq 0$ are first order operators. Here, ∂V is polynomially bounded, whereas, in [EPR99a], ∂V was assumed to be linearly bounded. Letting $g_0$ be an adequate inverse power of the effective energy G, one successively considers the finite sets of operators

$$\mathcal{A}_{-1} = \{X_1,\dots,X_n\}, \qquad \mathcal{A}_0 = \{g_0X_0, X_1,\dots,X_n\},$$

and then (see Sect. 6 for the detailed definition) $\mathcal{A}_\ell = \mathcal{A}_{\ell-1}\cup[g_0X_0,\mathcal{A}_{\ell-1}]$. We stop this iteration after at most 2N steps, where N is the number of particles in the chain, obtaining the set $\mathcal{A} = \mathcal{A}_{2N+1}$. We now define the operator $\Lambda_{\mathcal{A}}$ as the finite sum

$$\Lambda_{\mathcal{A}}^2 = 1 + \sum_{A\in\mathcal{A}} A^*A.$$

This is a generalization to our case of an elliptic operator of the type $\Lambda^2 = 1 - \sum_i \partial_{x_i}^2$ used in [Hör85] or $\Lambda^2 = 1 - \sum_i \partial_{x_i}^2 + \sum_i x_i^2$ used in [EPR99a]. With these definitions, one then has the bound

Proposition 1.1 (Momentum space bound). There is a constant C such that for all $f \in C_0^\infty(\mathbb{R}^d)$ one has in $L^2$:

$$\|\Lambda_{\mathcal{A}}^{16^{-N}} f\| \le C\bigl(\|Kf\| + \|f\|\bigr). \tag{1.2}$$

We also derive a similar bound in the conjugate variables:


Proposition 1.2 (Position space bound). There is a constant C such that for all $f \in C_0^\infty(\mathbb{R}^d)$ one has in $L^2$:

$$\|G^{\varepsilon} f\| \le C\bigl(\|Kf\| + \|f\|\bigr), \tag{1.3}$$

where ε > 0 depends on the asymptotic behavior of the potential V and on N.

Combining these two propositions one easily shows that K has compact resolvent. Then one derives from that result the existence of an invariant measure. Its properties are then found adapting the techniques of [EPR99a, EPR99b].

The remainder of this paper is organized as follows. In Sect. 2 we describe the physical model and in Sect. 3 we refine the setting and state the results. Section 4 will be devoted to the proof of the position space bound (Proposition 3.7). In Sect. 5, we present in detail the general scheme for studying operators of the form of (1.1), and show the inequality corresponding to (1.2). This section is as self-contained as possible, since it is of some independent interest. The detailed application of this general scheme to the problem of the chain allows us to prove the momentum space bound (Proposition 3.8) in Sect. 6. In Sect. 7 we combine these two bounds and prove Theorem 3.6 showing that K has compact resolvent and hence discrete spectrum. In Sect. 8 we show existence, uniqueness, and further properties of the invariant measure (Theorem 3.9). Appendix A contains some technical estimates used in Sect. 5. Appendix B contains the proof of a result concerning the domains of K and $K^*$. The method used there probably works for more general accretive second-order differential operators. Appendix C finally contains the proof of a technical result used in Sect. 8.

2. The Model

We will study the model of a (small) classical N-particle Hamiltonian system coupled to M stochastic heat baths proposed in [EPR99a]. The small system without the heat baths is governed by a Hamiltonian $H_S \in C^\infty(\mathbb{R}^{2N})$. (We stay here with d = 1 dimensional position space for each particle to simplify notation.)

The heat baths are modeled by classical field theories associated to the wave equation. The fields will be called $\varphi_i$ and their conjugate momenta $\pi_i$, where the index i ranges from 1 to M. The Hamiltonian for one heat bath is given by

$$H_B(\pi,\varphi) = \frac{1}{2}\int_{\mathbb{R}} \bigl(|\partial\varphi|^2 + |\pi|^2\bigr)\,dx.$$

The couplings allowed for the model are linear in the field variables. The total Hamiltonian for our model is then given by

$$H(p,q,\pi,\varphi) = \sum_{i=1}^{M}\Bigl(H_B(\pi_i,\varphi_i) + F_i(p,q)\int_{\mathbb{R}} \partial\varphi_i(x)\,\varrho_i(x)\,dx\Bigr) + H_S(p,q). \tag{2.1}$$

We assume the initial conditions describe the heat baths at equilibrium at inverse temperatures $\beta_i$, i.e. they are distributed in a sense according to the measure with “weight” $e^{-\beta_i H_B(\pi_i,\varphi_i)}$.


The paper [EPR99a] explains in detail how, and under which conditions on the coupling functions $\varrho_i$, one can reduce the resulting “big” system to a “small” system, where the heat baths are described by a finite number of variables. The price to pay for that is that we are now dealing with the following system of stochastic differential equations:

$$\begin{aligned}
dq_j &= \partial_{p_j}H_S\,dt - \sum_{i=1}^{M}\partial_{p_j}F_i\,r_i\,dt, \qquad j = 1,\dots,N,\\
dp_j &= -\partial_{q_j}H_S\,dt + \sum_{i=1}^{M}\partial_{q_j}F_i\,r_i\,dt,\\
dr_i &= -\gamma_i r_i\,dt + \lambda_i^2\gamma_i F_i(p,q)\,dt - \lambda_i\sqrt{2\gamma_i T_i}\,dw_i(t), \qquad i = 1,\dots,M,
\end{aligned} \tag{2.2}$$

where the $w_i$ are independent Wiener processes. The various constants appearing in (2.2) have the following meaning. $T_i$ is the temperature of the i-th heat bath, $\lambda_i$ is the strength of the coupling between that heat bath and the small system and $1/\gamma_i$ is the relaxation time of the i-th heat bath. The value of $\gamma_i$ depends on the choice of the coupling function $\varrho_i$. If we wanted to be more general, we would have to introduce for each bath a family of auxiliary variables $r_{i,m}$ as is done in [EPR99a]. This would only cause notational problems and does not change our argument.

If we consider a generic n-dimensional system of stochastic differential equations with additive noise of the form

$$dx_i(t) = b_i(x(t))\,dt + \sum_{j=1}^{n}\sigma_{ij}\,dw_j(t), \tag{2.3}$$

we can associate with it the second-order differential operator L formally defined by

$$L \equiv \frac{1}{2}\sum_{i,j=1}^{n}\partial_i\,(\sigma\sigma^T)_{ij}\,\partial_j + \sum_{i=1}^{n} b_i(x)\,\partial_i. \tag{2.4}$$

It is a classical result that if the solution of such a system of stochastic differential equations exists, the probability density of the solution satisfies the partial differential equation

$$\partial_t p(x,t) = (Lp)(x,t).$$

In our case, the differential operator L is given by

$$L = \sum_{i=1}^{M}\lambda_i^2\gamma_i T_i\,\partial_{r_i}^2 - \sum_{i=1}^{M}\gamma_i\bigl(r_i - \lambda_i^2 F_i(p,q)\bigr)\partial_{r_i} + X^{H_S} - \sum_{i=1}^{M} r_i X^{F_i}, \tag{2.5}$$

where the symbol $X^F$ denotes the Hamiltonian vector field associated to the function F. It is convenient to introduce the “effective energy” given by

$$G(p,q,r) = H_S(p,q) + \sum_{i=1}^{M}\Bigl(\frac{r_i^2}{2\lambda_i^2} - F_i(p,q)\,r_i\Bigr). \tag{2.6}$$

At this point, we make the following assumption on the asymptotic behavior of G.


Assumption 0. There exist constants $\tilde d_i$, C > 0 and α > 0, as well as constants $\tilde c_i > 2/\lambda_i^2$ such that

$$H_S(p,q) \ge C\bigl(1 + \|p\|^\alpha + \|q\|^\alpha\bigr), \tag{2.7a}$$

$$F_i^2(p,q) \le \tilde c_i H_S(p,q) + \tilde d_i. \tag{2.7b}$$

Remark 2.1. This assumption essentially means that the effective energy G grows at infinity at least like $1 + \|r\|^2 + \|p\|^\alpha + \|q\|^\alpha$. This implies the stability of the system, as follows easily from the inequality

$$|r_i F_i(p,q)| \le s^2 r_i^2 + \frac{F_i^2(p,q)}{s^2},$$

which holds for every s > 0. In particular, this implies that exp(−βG) is integrable for every β > 0.

We also define

$$W \equiv \sum_{i=1}^{M}\gamma_i T_i,$$

which is, in some sense that will be clear in a moment, the maximal power the heat baths can pull into the chain. We have the following result.

Proposition 2.2. Assume Assumption 0 holds. Then the solution $\xi(t; x_0, w)$ of (2.2) exists and is continuous for all t > 0 with probability 1. Moreover, the mean energy of the system satisfies for all values of t and $x_0$ the estimate

$$E[G(x(t; x_0, w))] - G(x_0) \le W t, \tag{2.8}$$

where E[·] denotes the expectation with respect to the M-dimensional Wiener process w.

Remark 2.3. The bound (2.8) allows the energy to grow forever, which would cause the system to “explode”. But this is not the case for the systems we consider in this paper. Indeed, we will prove that the process possesses a unique stationary state. This implies among other features that the mean time needed to reach any compact region is finite, and so the energy cannot grow forever.

Proof. A classical result (see e.g. [Has80, Thm 4.1]) states the following. Assume that the vector field b of (2.3) is locally Lipschitz and that there exists a confining $C^2$ function $G : \mathbb{R}^n \to \mathbb{R}$ and a constant k such that

$$(LG)(x) \le k \quad \text{for all } x \in \mathbb{R}^n.$$

Then there exists a unique stochastic process ξ(t) solving (2.3). The process ξ is regular (i.e. it does not blow up in a finite time) and continuous for all t > 0. It satisfies the statistics of a Markovian diffusion process with generator L. Moreover, we have the estimate

$$E[G(x(t; x_0, w))] - G(x_0) \le kt.$$

This result can be applied to our case, if we take for G the effective energy defined in (2.6). An explicit computation yields indeed

$$LG = W - \sum_{i=1}^{M}\frac{\gamma_i}{\lambda_i^2}\bigl(r_i - \lambda_i^2 F_i(p,q)\bigr)^2. \tag{2.9}$$

Moreover, G is confining by Assumption 0. This proves the assertion. □
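The identity (2.9) can be sanity-checked numerically in the simplest setting. The sketch below is an illustration under our own assumptions, not the paper's computation: it takes one particle and one bath with F(p, q) = q, an arbitrary confining potential, and arbitrary values for λ, γ and T. It applies the operator (2.5) to G using central finite differences and compares the result with the right-hand side of (2.9). All function and constant names are hypothetical.

```python
# Toy check of (2.9): one particle, one bath, F(p, q) = q.
# All numerical values below are illustrative assumptions.
LAM, GAMMA, TEMP = 1.3, 0.7, 2.0


def potential(q):
    """Illustrative confining potential (our choice)."""
    return q ** 4 - q ** 2 + 2.0


def effective_energy(p, q, r):
    """G of (2.6) with H_S = p^2/2 + V(q) and F = q."""
    return 0.5 * p * p + potential(q) + r * r / (2.0 * LAM ** 2) - q * r


def partial(fn, x, k, h=1e-4):
    """Central finite-difference first derivative of fn in coordinate k."""
    up = list(x)
    dn = list(x)
    up[k] += h
    dn[k] -= h
    return (fn(*up) - fn(*dn)) / (2.0 * h)


def second_partial(fn, x, k, h=1e-3):
    """Central finite-difference second derivative of fn in coordinate k."""
    up = list(x)
    dn = list(x)
    up[k] += h
    dn[k] -= h
    return (fn(*up) - 2.0 * fn(*x) + fn(*dn)) / (h * h)


def apply_generator(p, q, r):
    """Evaluate (L G)(p, q, r) for the operator L of (2.5)."""
    x = [p, q, r]
    g_p = partial(effective_energy, x, 0)
    g_q = partial(effective_energy, x, 1)
    g_r = partial(effective_energy, x, 2)
    g_rr = second_partial(effective_energy, x, 2)
    v_prime = partial(potential, [q], 0)
    diffusion = LAM ** 2 * GAMMA * TEMP * g_rr      # lambda^2 gamma T d_r^2
    bath_drift = -GAMMA * (r - LAM ** 2 * q) * g_r  # -gamma (r - lambda^2 F) d_r
    hamiltonian = p * g_q - v_prime * g_p           # X^{H_S} applied to G
    coupling = r * g_p                              # -r X^F with F = q
    return diffusion + bath_drift + hamiltonian + coupling


def rhs_2_9(p, q, r):
    """Right-hand side of (2.9) with W = gamma * T."""
    return GAMMA * TEMP - GAMMA / LAM ** 2 * (r - LAM ** 2 * q) ** 2
```

In this toy setting the two sides agree up to discretization error, and the negative square term makes the bound LG ≤ W visible directly.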


2.1. Definition and simple properties of the semigroup. In this paper, we will mainly be interested in studying under which assumptions on the chain Hamiltonian $H_S$ it is possible to prove the existence of a unique invariant measure for the stochastic process $\xi(t; x_0, w)$ solving (2.2). Throughout, we will use the notation $\mathcal{X} = \mathbb{R}^{2N+M}$ for the extended phase space (p, q, r). This stochastic process defines a semigroup $T^t$ on $C_0^\infty(\mathcal{X})$ by

$$T^t f(x_0) = E[f(\xi(t; x_0, w))]. \tag{2.10}$$

This semigroup satisfies the following

Proposition 2.4. Assume Assumption 0 holds. Then $T^t$ extends to a strongly continuous, quasi-bounded semigroup of positivity preserving operators on $L^2(\mathcal{X})$. Its generator L is the closure of the operator L with domain $C_0^\infty(\mathcal{X})$. The adjoint $L^*$ is the closure of the formal adjoint $L^T$ with domain $C_0^\infty(\mathcal{X})$.

Proof. The proof will be given in Appendix B. □

This in turn defines a dual semigroup $(T^t)^*$ by

$$\int T^t f(x)\,\nu(dx) = \int f(x)\,\bigl((T^t)^*\nu\bigr)(dx).$$

The generator of $(T^t)^*$ is given by the adjoint of L in $L^2$ that will be denoted $L^T$. It is possible to check that if the heat baths are all at the same temperature T = 1/β, we have

$$L^T\mu_0 = 0, \quad\text{where}\quad \mu_0(p,q,r) = e^{-\beta G(p,q,r)}.$$

Thus, the generalized Gibbs measure

$$d\mu_0 = e^{-\beta G(p,q,r)}\,dp\,dq\,dr = \mu_0(p,q,r)\,dp\,dq\,dr$$

is an invariant measure for the Markov process described by (2.2). This confirms our definition of G as the effective energy of our system.

We want to consider the more interesting case where the temperatures of the heat baths are not the same. The idea is to work in a Hilbert space that is weighted with a Gibbs measure for some reference temperature. We will therefore study an extension $T_0^t$ of $T^t$ acting on an auxiliary weighted Hilbert space $\mathcal{H}_0$, given by

$$\mathcal{H}_0 \equiv L^2\bigl(\mathcal{X},\, Z_0^{-1} e^{-2\beta_0 G(p,q,r)}\,dp\,dq\,dr\bigr),$$

where $Z_0$ is a normalization constant and $\beta_0$ is a “reference” inverse temperature that we choose such that

$$1/\beta_0 \equiv T_0 > \max\{T_i \mid i = 1,\dots,M\}. \tag{2.11}$$

We have the following


Proposition 2.5. Assume Assumption 0 holds. Then the semigroup $T^t$ given by (2.10) extends to a strongly continuous quasi-bounded semigroup $T_0^t$ on $\mathcal{H}_0$. Moreover, $T_0^t 1 = 1$ and $T_0^t$ is positivity preserving, i.e.

$$T_0^t f \ge 0 \quad\text{if}\quad f \ge 0.$$

Let $L_0$ be the generator of $T_0^t$. Then $L_0$ coincides on $C_0^\infty(\mathcal{X})$ with L of (2.5) and $C_0^\infty(\mathcal{X})$ is a core for both $L_0$ and $L_0^*$.

Proof. The statement can be proven by simply retracing the proof of Lemma 3.1 in [EPR99a]. There are only three points that have to be checked. We define the vector fields b and $b_0$ respectively by

$$b = -\sum_{i=1}^{M}\gamma_i\bigl(r_i - \lambda_i^2 F_i(p,q)\bigr)\partial_{r_i} + X^{H_S} - \sum_{i=1}^{M} r_i X^{F_i},$$

$$b_0 = 2\beta_0\sum_{i=1}^{M}\lambda_i^2\gamma_i T_i\,(\partial_{r_i}G)\,\partial_{r_i} = 2\beta_0\sum_{i=1}^{M}\gamma_i T_i\bigl(r_i - \lambda_i^2 F_i(p,q)\bigr)\partial_{r_i}.$$

In order to make the proof of [EPR99a] work, we have to check that

$$\|\operatorname{div} b\|_\infty < \infty, \qquad \|\operatorname{div} b_0\|_\infty < \infty, \qquad \sup_{x\in\mathcal{X}}\bigl(b + \tfrac{1}{2}b_0\bigr)G(x) < \infty,$$

where b and $b_0$ are considered as first-order differential operators in the last inequality. The divergence of any Hamiltonian vector field vanishes, so that $\operatorname{div} b = -\sum_{i=1}^{M}\gamma_i$ and we have

$$\|\operatorname{div} b\|_\infty = \sum_{i=1}^{M}\gamma_i < \infty.$$

The term involving the divergence of $b_0$ can easily be computed to give

$$\|\operatorname{div} b_0\|_\infty = 2\beta_0\sum_{i=1}^{M}\gamma_i T_i < \infty.$$

In order to check the last inequality, we compute the expression

$$\bigl(b + \tfrac{1}{2}b_0\bigr)G(p,q,r) = \sum_{i=1}^{M}\frac{\gamma_i}{\lambda_i^2}(\beta_0 T_i - 1)\bigl(r_i - \lambda_i^2 F_i(p,q)\bigr)^2.$$

We see that condition (2.11) on $\beta_0$ obviously implies $\beta_0 T_i - 1 < 0$, and so the desired inequality holds. The domains of $L_0$ and $L_0^*$ are controlled by the techniques of Appendix B. □

We are mainly interested in the case M = 2. The Hamiltonian $H_S$ will describe a chain of N + 1 strongly anharmonic oscillators coupled to two heat baths at the first and the last particle. In the case in which the Hamiltonian $H_S$ can be written as a quadratic function plus some bounded terms, the existence and uniqueness of a stationary state for every temperature difference has been proved in [EPR99a, EPR99b]. We will extend this result to the case where the potentials grow faster than quadratically at infinity. Besides some weak conditions on the derivatives of the one and two-body potentials, we will only require that they grow algebraically and that the two-body potentials grow asymptotically faster than the one-body potentials, i.e. at large separation the interaction energy between neighboring particles grows faster than the one-particle energy.


2.2. Notations. Throughout, the domain of an operator A will be denoted by D(A). Unless specified, the domain of any operator will always be the closure in the graph norm of $C_0^\infty$. For example, if we write [A, B], we mean in fact $(AB - BA)\restriction C_0^\infty$, so that the domain of [A, B] can be larger than that of A or B separately.

3. Setting and Results

In order to set up our model, we need to be able to describe precisely the growth rates of the potentials at infinity. This will be achieved with the following function spaces.

Definition 3.1. Choose $\alpha \in \mathbb{R}$. We call $\mathcal{F}_\alpha$ the set of all $C^\infty$ functions from $\mathbb{R}^n$ to $\mathbb{R}$ such that for every multi-index k there exists a constant $C_k$ for which

$$\|D^k f(x)\| \le C_k (1 + \|x\|^2)^{\alpha/2}, \quad\text{for all } x \in \mathbb{R}^n.$$

Definition 3.2. Choose $\alpha \in \mathbb{R}$ and $i \in \mathbb{N}\cup\{\infty\}$. We call $\mathcal{F}_\alpha^i$ the set of all $C^\infty$ functions from $\mathbb{R}^n$ to $\mathbb{R}$ such that for every multi-index k with |k| ≤ i, we have $D^k f(x) \in \mathcal{F}_{\alpha-|k|}$.

Remark 3.3. For any $\alpha \in \mathbb{R}$, the function

$$P^\alpha : \mathbb{R}^n \to \mathbb{R}, \qquad x \mapsto (1 + \|x\|^2)^{\alpha/2} \tag{3.1}$$

belongs to $\mathcal{F}_\alpha^\infty$. Moreover, any polynomial of degree n belongs to $\mathcal{F}_n^\infty$.

3.1. The chain. We consider the Hamiltonian

$$H_S(p,q) = \sum_{i=0}^{N}\Bigl(\frac{p_i^2}{2} + V_1(q_i)\Bigr) + \sum_{i=1}^{N} V_2(q_i - q_{i-1}), \tag{3.2}$$

describing a chain of particles with nearest-neighbor interaction (Fig. 3.1). We slightly modify the notations used so far. Because there are only two heat baths, we will not use for them the indices i ∈ {1, 2}, but rather i ∈ {L, R}. Concerning the coupling between the chain and the baths, we assume that we can make a dipole approximation, so we set

$$F_L = q_0 \quad\text{and}\quad F_R = q_N, \tag{3.3}$$

in Eq. (2.1). We will make the Assumptions 1–3 on $V_1$ and $V_2$.

[Fig. 3.1. Chain of oscillators. Labels in the figure: $T_L$, $q_0$, $V_1(q_0)$, $V_2(\tilde q_1)$, $q_N$, $T_R$.]


Assumption 1. The potential $V_1$ is in $\mathcal{F}_{2n}^2$ for some n > 1. Moreover, there are constants $c_i > 0$ such that

$$V_1(x) \ge c_1 P^{2n}(x), \tag{3.4a}$$

$$x V_1'(x) \ge c_2 P^{2n}(x) - c_3, \tag{3.4b}$$

for all $x \in \mathbb{R}$.

Assumption 2. The potential $V_2$ is in $\mathcal{F}_{2m}^2$ for some m > n. Moreover, there are constants $c_i' > 0$ such that

$$V_2(x) \ge c_1' P^{2m}(x), \tag{3.5a}$$

$$x V_2'(x) \ge c_2' P^{2m}(x) - c_3', \tag{3.5b}$$

for all $x \in \mathbb{R}$.

Assumption 3. The function

$$x \mapsto \frac{1}{V_2''(x)}$$

belongs to $\mathcal{F}_\ell$ for some ℓ.

Remark 3.4. It is clear that (3.3), together with Assumptions 1 and 2 immediately imply Assumption 0. Notice that the assumptions $V_1 \in \mathcal{F}_{2n}^2$ and $V_2 \in \mathcal{F}_{2m}^2$ give bounds not only on the asymptotic behavior of $V_1$ and $V_2$, but also of their derivatives. The numbers n, m and ℓ need not be integers. The generalization to a Hamiltonian with $V_1$, $V_2$ depending also on the number of the particle only creates notational problems and is left to the reader. An example of potentials that satisfy Assumptions 1–3 is

$$V_1(x) = x^4 - x^2 + 2 \quad\text{and}\quad V_2(x) = (1 + x^2)^{5/2} - \cos(x).$$
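Two of the quantitative features of this example are easy to probe numerically. The sketch below is a grid-based sanity check on a finite interval, not a proof, and all helper names are our own. It estimates the best constant $c_1$ in (3.4a) for $V_1$ with n = 2, and evaluates the hand-computed second derivative $V_2''(x) = 5\sqrt{1+x^2}\,(1+4x^2) + \cos x$, whose positivity is what makes $1/V_2''$ a bounded function as required for Assumption 3.

```python
import math


def v1(x):
    """Example one-body potential V1(x) = x^4 - x^2 + 2."""
    return x ** 4 - x ** 2 + 2.0


def v2_second(x):
    """Second derivative of V2(x) = (1 + x^2)^(5/2) - cos(x), computed by hand."""
    return 5.0 * math.sqrt(1.0 + x * x) * (1.0 + 4.0 * x * x) + math.cos(x)


def weight(x, alpha):
    """P^alpha(x) = (1 + x^2)^(alpha/2) as in (3.1), for scalar x."""
    return (1.0 + x * x) ** (alpha / 2.0)


def grid(lo=-50.0, hi=50.0, steps=20001):
    """Uniform grid on [lo, hi]; the range and resolution are arbitrary choices."""
    return [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]


def min_ratio_v1(n=2):
    """Grid estimate of inf V1 / P^{2n}, i.e. of the best c_1 in (3.4a)."""
    return min(v1(x) / weight(x, 2 * n) for x in grid())


def min_v2_second():
    """Grid estimate of inf V2''; a positive value keeps 1/V2'' bounded."""
    return min(v2_second(x) for x in grid())
```

On this grid the ratio $V_1/P^4$ stays close to 7/16 at its lowest point (attained near $x^2 = 5/3$), and $V_2''$ is smallest at x = 0, where it equals 6.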

The effective energy of the system chain+baths is given by

$$G(p,q,r) = H_S(p,q) + \frac{r_L^2}{2\lambda_L^2} + \frac{r_R^2}{2\lambda_R^2} - q_0 r_L - q_N r_R + \Gamma, \tag{3.6}$$

where we choose the constant Γ such that G ≥ 1, which is always possible, because n > 1. In fact, it is important that the function exp(−βG) be integrable for any β > 0. This could also be achieved with for example only one of the one-body potentials nonvanishing, but would cause some unimportant notational difficulties. The case n = 1 is marginal: the stability of the system depends on the values of the constants $\lambda_i$ and was treated in [EPR99a]. We will not treat this case, but it would not cause any difficulties, as long as G remains confining. In the sequel, we will extensively use the notations

$$\tilde q_i \equiv q_i - q_{i-1} \quad\text{and}\quad Q \equiv \sum_{i=0}^{N} q_i.$$


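The system of stochastic differential equations we consider is given by

$$\begin{aligned}
dq_i &= p_i\,dt,\\
dp_0 &= -V_1'(q_0)\,dt + V_2'(\tilde q_1)\,dt + r_L\,dt,\\
dp_j &= -V_1'(q_j)\,dt - V_2'(\tilde q_j)\,dt + V_2'(\tilde q_{j+1})\,dt,\\
dp_N &= -V_1'(q_N)\,dt - V_2'(\tilde q_N)\,dt + r_R\,dt,\\
dr_L &= -\gamma_L r_L\,dt + \lambda_L^2\gamma_L q_0\,dt - \lambda_L\sqrt{2\gamma_L T_L}\,dw_L(t),\\
dr_R &= -\gamma_R r_R\,dt + \lambda_R^2\gamma_R q_N\,dt - \lambda_R\sqrt{2\gamma_R T_R}\,dw_R(t),
\end{aligned} \tag{3.7}$$

where i = 0, . . . , N and j = 1, . . . , N − 1. Since Assumption 0 holds, the results of the preceding section apply. Therefore, there exists for any initial condition $x_0$ a unique stochastic process $\xi(t; x_0, w)$ solving (3.7). It obeys the statistics of a Markov diffusion process with generator

$$\begin{aligned}
L ={}& \lambda_L^2\gamma_L T_L\,\partial_{r_L}^2 + \lambda_R^2\gamma_R T_R\,\partial_{r_R}^2 - \gamma_L(r_L - \lambda_L^2 q_0)\,\partial_{r_L} - \gamma_R(r_R - \lambda_R^2 q_N)\,\partial_{r_R}\\
&+ r_L\,\partial_{p_0} + r_R\,\partial_{p_N} + \sum_{i=0}^{N}\bigl(p_i\,\partial_{q_i} - V_1'(q_i)\,\partial_{p_i}\bigr) - \sum_{i=1}^{N} V_2'(\tilde q_i)\bigl(\partial_{p_i} - \partial_{p_{i-1}}\bigr).
\end{aligned} \tag{3.8}$$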
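To make the structure of (3.7) concrete, here is a minimal simulation sketch using the explicit Euler–Maruyama scheme. This is our own illustration and not a method used in the paper: the potentials are the example ones from Remark 3.4, the function names and all parameter values are assumptions, and explicit Euler steps are known to become unstable for superquadratic potentials unless the time step and the energies are kept small.

```python
import math
import random


def v1_prime(x):
    """V1'(x) for the example V1(x) = x^4 - x^2 + 2."""
    return 4.0 * x ** 3 - 2.0 * x


def v2_prime(x):
    """V2'(x) for the example V2(x) = (1 + x^2)^(5/2) - cos(x)."""
    return 5.0 * x * (1.0 + x * x) ** 1.5 + math.sin(x)


def drift(p, q, r, lam, gamma):
    """Deterministic part of (3.7); p, q have N+1 entries, r/lam/gamma are (L, R) pairs."""
    n = len(q) - 1
    dq = list(p)                            # dq_i = p_i dt
    dp = [-v1_prime(x) for x in q]          # one-body forces
    for i in range(1, n + 1):               # nearest-neighbor forces, equal and opposite
        force = v2_prime(q[i] - q[i - 1])
        dp[i] -= force
        dp[i - 1] += force
    dp[0] += r[0]                           # left bath acts on particle 0
    dp[n] += r[1]                           # right bath acts on particle N
    ends = (q[0], q[n])
    dr = [-gamma[k] * r[k] + lam[k] ** 2 * gamma[k] * ends[k] for k in range(2)]
    return dq, dp, dr


def euler_maruyama(p, q, r, lam, gamma, temp, dt, steps, rng):
    """Integrate (3.7) with explicit Euler-Maruyama steps; returns the final (p, q, r)."""
    p, q, r = list(p), list(q), list(r)
    sigma = [lam[k] * math.sqrt(2.0 * gamma[k] * temp[k]) for k in range(2)]
    for _ in range(steps):
        dq, dp, dr = drift(p, q, r, lam, gamma)
        dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(2)]  # Wiener increments
        q = [x + dt * v for x, v in zip(q, dq)]
        p = [x + dt * v for x, v in zip(p, dp)]
        r = [r[k] + dt * dr[k] - sigma[k] * dw[k] for k in range(2)]
    return p, q, r


if __name__ == "__main__":
    final = euler_maruyama(
        p=[0.0, 0.0, 0.0], q=[0.5, 0.0, -0.5], r=[0.0, 0.0],
        lam=(1.0, 1.0), gamma=(1.0, 1.0), temp=(2.0, 1.0),
        dt=1e-3, steps=1000, rng=random.Random(0),
    )
    print(final)
```

Note that the nearest-neighbor forces cancel in the total momentum, so the sum of the `dp` entries only contains the one-body forces and the two bath variables; this gives a quick consistency check on any implementation of the drift.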

We want to prove the existence of a smooth invariant measure with density µ(p, q, r). It is the solution of $(T^t)^*\mu = \mu$, where $(T^t)^*$ is the dual semigroup of $T^t$. To achieve this, we introduce, as above, the Hilbert space

$$\mathcal{H}_0 \equiv L^2\bigl(\mathbb{R}^{2N+4},\, Z_0^{-1} e^{-2\beta_0 G(p,q,r)}\,dp\,dq\,dr\bigr),$$

where $Z_0$ is a normalization constant and $\beta_0$ is a “reference” inverse temperature that we choose such that

$$1/\beta_0 \equiv T_0 > \max\{T_L, T_R\}. \tag{3.9}$$

Proposition 2.4 holds, so the dynamics of our system is described by a semigroup $T_0^t$ acting in $\mathcal{H}_0$ with generator $L_0$, formally given by L. The extended phase space of our system will again be denoted by $\mathcal{X} \equiv \mathbb{R}^{2N+4}$. For convenience, we would like to work in $\mathcal{H} = L^2(\mathcal{X})$, so we define the unitary transformation $U : \mathcal{H} \to \mathcal{H}_0$ by

$$(Uf)(x) = e^{\beta_0 G(x)} f(x).$$

So $L_0$ is unitarily equivalent to the operator $L_{\mathcal{H}} : D(L_{\mathcal{H}}) \to \mathcal{H}$ defined by

$$L_{\mathcal{H}} = U^{-1} L_0 U = e^{-\beta_0 G} L_0\, e^{\beta_0 G}.$$

An explicit computation shows that $L_{\mathcal{H}}$ is given by

$$L_{\mathcal{H}} = \alpha - K,$$


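where the formal expression for the differential operator K is

$$\begin{aligned}
K ={}& \alpha_K - c_L^2\,\partial_{r_L}^2 + a_L^2 (r_L - \lambda_L^2 q_0)^2 - c_R^2\,\partial_{r_R}^2 + a_R^2 (r_R - \lambda_R^2 q_N)^2\\
&- r_L\,\partial_{p_0} + b_L (r_L - \lambda_L^2 q_0)\,\partial_{r_L} - r_R\,\partial_{p_N} + b_R (r_R - \lambda_R^2 q_N)\,\partial_{r_R}\\
&- \sum_{i=0}^{N}\bigl(p_i\,\partial_{q_i} - V_1'(q_i)\,\partial_{p_i}\bigr) + \sum_{i=1}^{N} V_2'(\tilde q_i)\bigl(\partial_{p_i} - \partial_{p_{i-1}}\bigr).
\end{aligned} \tag{3.10}$$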

Since C0∞ (X ) is invariant under the unitary transformation U , it remains a core for both K and K ∗ . The various constants appearing in (3.10) are given by ai2 = γi (β0 Ti − 1), γi β0 bi = 2 β0 Ti − 1 , λi p ci = λi γi Ti , bR bL − , αK = − 2 2X γi Ti . α = αK + β0

i ∈ {L, R},

i∈{L,R}

We see that condition (3.9) ensures the positivity of the constants $a_L^2$ and $a_R^2$, which in turn implies that the closure of $\operatorname{Re} K = (K + K^*)/2$ is a strictly positive self-adjoint operator.

The first feature we notice about K is that Assumption 3 implies the hypoellipticity of the operators K, $K^*$, $\partial_t + K$ and $\partial_t + K^*$. We recall that a differential operator L acting on functions in a finite-dimensional differentiable manifold M is called hypoelliptic if

$$\operatorname{sing\,supp} f = \operatorname{sing\,supp} Lf, \quad\text{for all } f \in \mathcal{D}'(M),$$

where $\mathcal{D}'(M)$ is the space of distributions on $C_0^\infty(M)$. In particular, the eigenfunctions of a hypoelliptic operator are $C^\infty$. The hypoellipticity of the above operators is a consequence of a theorem by Hörmander [Hör67, Hör85]. Consider a second-order differential operator

$$L = \sum_{i=1}^{n} L_i^* L_i + L_0 + c,$$

where $c : M \to \mathbb{C}$ is a smooth function and the $L_i$ are smooth vector fields. Then a sufficient condition for L to be hypoelliptic is that the Lie algebra generated by $\{L_i \mid i = 0,\dots,n\}$ has maximal rank everywhere. It is not hard to verify that Assumption 3 ensures that this condition is satisfied for K, $K^*$, $\partial_t + K$ and $\partial_t + K^*$.

Proposition 3.5. If Assumptions 0 and 3 are satisfied, the transition probabilities of the Markov process solving (3.7) have a smooth density $P(t,x,y) \in C^\infty\bigl((0,\infty)\times\mathcal{X}\times\mathcal{X}\bigr)$.

Proof. This is an immediate consequence of the Kolmogorov equations which state that

$$\partial_t P = LP \quad\Rightarrow\quad (\partial_t + K - \alpha)\,U^{-1}P = 0,$$

so $U^{-1}P$ is an eigenfunction of the operator $\partial_t + K - \alpha$, which is hypoelliptic. □


3.2. Main results. Our main technical result is

Theorem 3.6. If Assumptions 1–3 are satisfied, then the operator K defined in (3.10) has compact resolvent.

In order to prepare the proof of Theorem 3.6, we will prove the following two propositions.

Proposition 3.7. If Assumptions 1 and 2 are satisfied, there exist constants C and ε > 0 such that

$$\|G^{\varepsilon} f\| \le C\bigl(\|Kf\| + \|f\|\bigr), \quad\text{for all } f \in D(K), \tag{3.11a}$$

$$\|G^{\varepsilon} f\| \le C\bigl(\|K^* f\| + \|f\|\bigr), \quad\text{for all } f \in D(K^*). \tag{3.11b}$$

Proposition 3.8. If Assumptions 1–3 are satisfied, there exist constants C, ε > 0, a positive function $a_0 : \mathcal{X} \to \mathbb{R}$ and a finite number $\bar N$ of smooth vector fields $L_i$ with bounded coefficients such that, for every function $f \in C_0^\infty(\mathcal{X})$, we have

$$\|\tilde\Delta^{\varepsilon} f\| \le C\bigl(\|Kf\| + \|f\|\bigr), \tag{3.12}$$

where

$$\tilde\Delta = \sum_{i=1}^{\bar N} L_i^* L_i + a_0.$$

Moreover, the $L_i$ span the whole of $\mathbb{R}^{2N+4}$ at every point.

Given Theorem 3.6, we can state and prove the main result of this paper, namely the existence and uniqueness of an invariant measure for our Markov process. More precisely, we have the following result.

Theorem 3.9. If Assumptions 1–3 are satisfied, then the stochastic process ξ(t) solving (2.2) possesses a unique and strictly positive invariant measure µ. Its density h is $C^\infty$ and satisfies for any $\beta_0 < \min\{\beta_L, \beta_R\}$,

$$h(x) = \tilde h(x)\,e^{-\beta_0 G(x)},$$

where $\tilde h$ decays at infinity faster than any polynomial.

The above results say that the spectrum of K looks roughly like the one schematically depicted in Fig. 3.2. We see that it is discrete (compactness of the resolvent) and located in the right half of the complex plane (m-accretivity). Moreover, it is symmetric along the real axis, because K is a differential operator with real coefficients.

[Fig. 3.2. Spectrum of K. Axes in the figure: Re λ and Im λ.]

Most of the remainder of this paper is devoted to the proofs of Theorems 3.6 and 3.9. In the sequel, we will always use the notation

$$K = \sum_{i=1}^{4} X_i^* X_i + X_0,$$

where we define

$$X_1 = c_L\,\partial_{r_L}, \qquad X_2 = a_L (r_L - \lambda_L^2 q_0), \tag{3.13a}$$

$$X_3 = c_R\,\partial_{r_R}, \qquad X_4 = a_R (r_R - \lambda_R^2 q_N), \tag{3.13b}$$

$$\begin{aligned}
X_0 ={}& - r_L\,\partial_{p_0} + b_L (r_L - \lambda_L^2 q_0)\,\partial_{r_L} - r_R\,\partial_{p_N} + b_R (r_R - \lambda_R^2 q_N)\,\partial_{r_R}\\
&- \sum_{i=0}^{N}\bigl(p_i\,\partial_{q_i} - V_1'(q_i)\,\partial_{p_i}\bigr) + \sum_{i=1}^{N} V_2'(\tilde q_i)\bigl(\partial_{p_i} - \partial_{p_{i-1}}\bigr) - \alpha_K.
\end{aligned} \tag{3.13c}$$

The operator $X_0$ is antisymmetric, i.e.

$$X_0^* = -X_0. \tag{3.14}$$

This implies that

$$\operatorname{Re} K = \sum_{i=1}^{4} X_i^* X_i \quad\text{and}\quad X_0 = K - \operatorname{Re} K, \tag{3.15}$$

and thus Re K is a positive self-adjoint operator. We have one more estimate that will be extensively used in the sequel. If f is some function in $C_0^\infty(\mathcal{X})$ and i ∈ {1, . . . , 4} we have

$$\|X_i f\|^2 = \langle f, X_i^* X_i f\rangle \le \langle f, \operatorname{Re} K f\rangle = \operatorname{Re}\langle f, Kf\rangle \le \|f\|\,\|Kf\| \le \bigl(\|Kf\| + \|f\|\bigr)^2, \tag{3.16}$$

and by a similar argument also

$$\|X_i^* f\|^2 \le \bigl(\|Kf\| + \|f\|\bigr)^2. \tag{3.17}$$
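The chain of inequalities in (3.16) only uses the algebraic structure of K, so it can be illustrated in finite dimensions. The sketch below is a toy analogue with real matrices standing in for the operators and the transpose playing the role of the adjoint; all names are our own. It builds $K = \sum_i X_i^T X_i + X_0$ with $X_0$ antisymmetric and checks $\|X_i f\|^2 \le \|f\|\,\|Kf\|$ for a random vector f.

```python
import random


def matvec(m, v):
    """Matrix-vector product for lists of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def norm(v):
    return sum(x * x for x in v) ** 0.5


def random_matrix(d, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(d)]


def build_k(xs, x0):
    """K = sum_i X_i^T X_i + X_0, mirroring the notation K = sum X_i^* X_i + X_0."""
    k = [row[:] for row in x0]
    for x in xs:
        k = add(k, matmul(transpose(x), x))
    return k


def check_bound(d=4, n_fields=4, seed=0):
    """Return True if ||X_i f||^2 <= ||f|| ||K f|| holds for every i."""
    rng = random.Random(seed)
    xs = [random_matrix(d, rng) for _ in range(n_fields)]
    m = random_matrix(d, rng)
    # Antisymmetric part of a random matrix: <f, X_0 f> = 0 for every real f.
    x0 = [[m[i][j] - m[j][i] for j in range(d)] for i in range(d)]
    k = build_k(xs, x0)
    f = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    upper = norm(f) * norm(matvec(k, f))
    return all(norm(matvec(x, f)) ** 2 <= upper + 1e-12 for x in xs)
```

The antisymmetric part drops out of $\langle f, Kf\rangle$, which is exactly the mechanism behind the equality $\langle f, \operatorname{Re} K f\rangle = \operatorname{Re}\langle f, Kf\rangle$ in (3.16).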


4. Proof of the Bound in Position Space (Proposition 3.7)

First of all, we need a collection of functions belonging to $\mathcal{F}_0$, as defined in Definition 3.1. We have the following result.

Proposition 4.1. Let r, p, q and $\tilde q$ designate the vectors

$$r = (r_L, r_R), \quad p = (p_0,\dots,p_N), \quad q = (q_0,\dots,q_N), \quad \tilde q = (\tilde q_1,\dots,\tilde q_N).$$

Choose α ≥ 0 and let $h_k : \mathbb{R}^k \to \mathbb{R}$ be functions in $\mathcal{F}_\alpha$. Then the functions

$$G^{-\alpha/2} h_2(r), \quad G^{-\alpha/2} h_{N+1}(p), \quad G^{-\alpha/(2n)} h_{N+1}(q), \quad\text{and}\quad G^{-\alpha/(2m)} h_N(\tilde q)$$

belong to $\mathcal{F}_0$.

Proof. We will only sketch the proof of the statement for $G^{-\alpha/(2m)} h_N(\tilde q)$. The other expressions can easily be treated in a similar way. We first notice that $G^{-1}(D^k G)$ is bounded for every multi-index k. This is a straightforward consequence of two observations. The first one is that because of the lower bounds (3.4a) and (3.5a) of Assumptions 1 and 2 and the expression (3.6) of G, there exists a constant C > 0 for which

$$G(p,q,r) \ge C\bigl(r^2 + p^2 + P^{2n}(q) + P^{2m}(\tilde q)\bigr), \tag{4.1}$$

where $P^k$ was defined in (3.1). The second observation is that, because $V_1 \in \mathcal{F}_{2n}$ and $V_2 \in \mathcal{F}_{2m}$, we have for every multi-index k some constant $C_k$ for which

$$|D^k G(p,q,r)| \le C_k\bigl(r^2 + p^2 + P^{2n}(q) + P^{2m}(\tilde q)\bigr). \tag{4.2}$$

Notice that $G^{-\alpha/(2m)} D^k h_N(\tilde q)$ is bounded by a similar argument, in particular because $h_N \in \mathcal{F}_\alpha$. We set $\bar\alpha = -\alpha/(2m)$ and write

$$\partial_i\bigl(G^{\bar\alpha} h_N(\tilde q)\bigr) = \bar\alpha\,\bigl(G^{-1}\partial_i G\bigr)\,G^{\bar\alpha} h_N(\tilde q) + G^{\bar\alpha}\,\partial_i h_N(\tilde q).$$

Both terms are bounded by (4.1), (4.2) and the fact that $h_N \in \mathcal{F}_\alpha$. It is easy to see that all the derivatives can be bounded similarly. The proof of Proposition 4.1 is complete. □

Let us define

$$\Lambda_1 \equiv G^{1/2}.$$

The symbol $\Lambda_1$ was chosen in order to emphasize the similarity between the proof of Proposition 3.7 and the proof of the main result of Sect. 5, Theorem 5.5. Before we start the proof of Proposition 3.7, we notice two more facts. Let us choose $\alpha, \beta \in \mathbb{R}$ with 0 ≤ β ≤ 1, and let A, B be two operators of multiplication by positive functions A ≤ B. We then have

$$\langle \Lambda_1^{\alpha} A f, f\rangle \le \langle \Lambda_1^{\alpha} B f, f\rangle, \tag{4.3}$$

as well as the implication

$$\|\Lambda_1^{\alpha} A f\| \le C\bigl(\|Kf\| + \|f\|\bigr) \quad\Rightarrow\quad \|\Lambda_1^{\alpha\beta} A^{\beta} f\| \le C\bigl(\|Kf\| + \|f\|\bigr). \tag{4.4}$$

Both inequalities are trivial consequences of the fact that $\Lambda_1$ is an operator of multiplication by a positive function and the estimate $x^s \le 1 + x$ if x ≥ 0 and s ≤ 1.
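For the reader's convenience, one way to spell out the implication (4.4) is sketched below. It applies the scalar estimate pointwise to the multiplication operator $\Lambda_1^{\alpha}A$; the constant that comes out is C + 1 rather than C, which we take to be consistent with a generic use of the letter C.

```latex
% Pointwise, with g = \Lambda_1^{\alpha} A \ge 0 and 0 \le \beta \le 1,
% the scalar estimate gives g^{\beta} \le 1 + g. Hence:
\begin{align*}
\|\Lambda_1^{\alpha\beta} A^{\beta} f\|
  &= \|(\Lambda_1^{\alpha} A)^{\beta} f\|
   \le \|(1 + \Lambda_1^{\alpha} A) f\| \\
  &\le \|f\| + \|\Lambda_1^{\alpha} A f\|
   \le (C + 1)\bigl(\|Kf\| + \|f\|\bigr).
\end{align*}
```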


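4.1. The main tool of the proof. The main tool in the proof of Proposition 3.7 is the following lemma.

Lemma 4.2. Let $\Lambda_1$ and K be defined as above. Let A and B be multiplication operators represented by functions of the form

$$h(p,q,r) = c_L r_L + c_R r_R + \tilde h(p,q), \qquad \tilde h \in C^\infty(\mathbb{R}^{2N+2}).$$

Assume moreover that there are exponents $\alpha_i$ and $\beta_i$ and positive constants $C_i$ such that the following estimates are true for every $f \in C_0^\infty(\mathcal{X})$:

$$\begin{aligned}
\|\Lambda_1^{-\alpha_1} A f\| &\le C_1\bigl(\|Kf\| + \|f\|\bigr), & \|\Lambda_1^{-\beta_1} B f\| &\le C_2\bigl(\|Kf\| + \|f\|\bigr),\\
\|\Lambda_1^{-\alpha_2} A f\| &\le C_3\|f\|, & \|\Lambda_1^{-\beta_2} B f\| &\le C_4\|f\|,\\
\|\Lambda_1^{-\alpha_3} [X_0, A] f\| &\le C_5\bigl(\|Kf\| + \|f\|\bigr), & \|\Lambda_1^{-\beta_3} [X_0, B] f\| &\le C_6\bigl(\|Kf\| + \|f\|\bigr).
\end{aligned}$$

If γ satisfies the conditions

$$\gamma \ge \alpha_3 + \beta_1, \tag{4.5}$$

$$\gamma \ge \alpha_2 + \frac{\beta_1 + \max\{\beta_2, \beta_3\}}{2}, \tag{4.6}$$

$$\gamma \ge \min\{\alpha_1 + \beta_2,\ \alpha_2 + \beta_1\}, \tag{4.7}$$

then there exists a constant C such that

$$|\langle [X_0, B] f, \Lambda_1^{-\gamma} A f\rangle| \le C\bigl(\|Kf\| + \|f\|\bigr)^2, \quad\text{for all } f \in C_0^\infty(\mathcal{X}). \tag{4.8}$$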

Proof. The proof of this lemma involves some of the commutation techniques developed by Hörmander [Hör85], but it uses the fact that most operators involved are multiplication operators, i.e. they commute. An explicit computation, using (3.6) and (3.13), yields

$$[X_0, G] = \sum_{j \in \{R,L\}} \frac{b_j}{\lambda_j^2}\bigl(r_j - \lambda_j^2 F_j\bigr)^2, \qquad [X_1, G] = c_L\bigl(r_L/\lambda_L^2 - q_0\bigr), \tag{4.9a}$$
$$[X_2, G] = [X_4, G] = 0, \qquad [X_3, G] = c_R\bigl(r_R/\lambda_R^2 - q_N\bigr). \tag{4.9b}$$

We therefore see that, by Proposition 4.1, we have for $i = 0, \ldots, 4$,

$$G^{-1}[X_i, G] \in \mathcal F^0. \tag{4.10}$$

Since the $X_i$ are either differentiation operators or multiplicative operators, we have, for any $\alpha \in \mathbf R$, the relation $G^{-\alpha}[X_i, G^{\alpha}] = \alpha G^{-1}[X_i, G] \in \mathcal F^0$, and so, since $\Lambda_1^2 = G$,

$$\|\Lambda_1^{\alpha}[X_i, \Lambda_1^{-\alpha}]\| < \infty. \tag{4.11}$$

We can now start to bound (4.8). Since $[X_0, B] = -X_0^* B - B X_0$, we can write (4.8) as

$$|\langle [X_0, B] f, \Lambda_1^{-\gamma} A f\rangle| \le |\langle B X_0 f, \Lambda_1^{-\gamma} A f\rangle| + |\langle B f, X_0 \Lambda_1^{-\gamma} A f\rangle| \equiv T_1 + T_2.$$


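Both terms will be estimated separately.

Term $T_1$. Since we know by (3.15) that $X_0 = K - \operatorname{Re}K$, we can write it as

$$T_1 \le |\langle B(\operatorname{Re}K) f, \Lambda_1^{-\gamma} A f\rangle| + |\langle B K f, \Lambda_1^{-\gamma} A f\rangle| \equiv T_{11} + T_{12}.$$

The term $T_{12}$ can be estimated by using (4.7). We indeed have either $\gamma \ge \alpha_1 + \beta_2$, or $\gamma \ge \alpha_2 + \beta_1$. In the former case, we write

$$T_{12} \le \|\Lambda_1^{-\beta_2} B\|\,\|Kf\|\,\|\Lambda_1^{-\gamma+\beta_2} A f\| \le C(\|Kf\| + \|f\|)^2.$$

In the latter case, we use the fact that $A$, $B$ and $\Lambda_1$ commute and are self-adjoint to write similarly

$$T_{12} = |\langle A K f, \Lambda_1^{-\gamma} B f\rangle| \le \|\Lambda_1^{-\alpha_2} A\|\,\|Kf\|\,\|\Lambda_1^{-\gamma+\alpha_2} B f\| \le C(\|Kf\| + \|f\|)^2.$$

Let us now focus on the term $T_{11}$. Using the positivity of $\operatorname{Re}K$, it can be written as

$$T_{11} = \langle (\operatorname{Re}K)^{1/2} \Lambda_1^{-\gamma_1} B f, (\operatorname{Re}K)^{1/2} \Lambda_1^{-\gamma_2} A f\rangle + \langle [\Lambda_1^{-\gamma_1} B, \operatorname{Re}K] f, \Lambda_1^{-\gamma_2} A f\rangle \equiv T_{13} + T_{14},$$

where

$$\gamma_1, \gamma_2 > 0, \qquad \gamma_1 + \gamma_2 = \gamma,$$

are to be chosen later. We estimate both terms separately. The commutator in $T_{14}$ can be expanded to give

$$T_{14} = \langle \Lambda_1^{-\gamma_1} [B, \operatorname{Re}K] f, \Lambda_1^{-\gamma_2} A f\rangle + \langle [\Lambda_1^{-\gamma_1}, \operatorname{Re}K] B f, \Lambda_1^{-\gamma_2} A f\rangle.$$

In order to estimate these terms, we recall that $\operatorname{Re}K = \sum_{i=1}^{4} X_i^* X_i$. We therefore have

$$T_{14} = \sum_{i=1}^{4} \Bigl( \langle \Lambda_1^{-\gamma_1} [B, X_i^*] X_i f, \Lambda_1^{-\gamma_2} A f\rangle + \langle \Lambda_1^{-\gamma_1} X_i^* [B, X_i] f, \Lambda_1^{-\gamma_2} A f\rangle$$
$$\qquad\qquad + \langle [\Lambda_1^{-\gamma_1}, X_i^*] X_i B f, \Lambda_1^{-\gamma_2} A f\rangle + \langle X_i^* [\Lambda_1^{-\gamma_1}, X_i] B f, \Lambda_1^{-\gamma_2} A f\rangle \Bigr)$$
$$\equiv \sum_{i=1}^{4} \bigl( T_i^{(1)} + T_i^{(2)} + T_i^{(3)} + T_i^{(4)} \bigr).$$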

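Noticing that $[B, X_i^*]$ is a multiple of the identity operator and that $\Lambda_1$ is self-adjoint, we have

$$|T_i^{(1)}| \le C|\langle X_i f, \Lambda_1^{-\gamma} A f\rangle| \le C\|X_i f\|\,\|\Lambda_1^{-\gamma} A f\| \le C(\|Kf\| + \|f\|)^2,$$

where we used (3.16) and the fact that $\gamma > \alpha_2$ to get the last inequality. The term $T_i^{(2)}$ is bounded by $C(\|Kf\| + \|f\|)^2$ in a similar way. The term $T_i^{(3)}$ is written as

$$|T_i^{(3)}| = |\langle \Lambda_1^{\gamma_1} [\Lambda_1^{-\gamma_1}, X_i^*] X_i f, \Lambda_1^{-\gamma} A B f\rangle + \langle \Lambda_1^{\gamma_1} [\Lambda_1^{-\gamma_1}, X_i^*] [X_i, B] f, \Lambda_1^{-\gamma} A f\rangle|$$
$$\le C\|X_i f\|\,\|\Lambda_1^{-\gamma} A B f\| + C\|f\|\,\|\Lambda_1^{-\gamma} A f\|,$$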


where we used (4.11) and the fact that $[X_i, B]$ is bounded. Now we can bound $T_i^{(3)}$ by $C(\|Kf\| + \|f\|)^2$, using (3.16) to estimate $\|X_i f\|$ and (4.7) to estimate $\|\Lambda_1^{-\gamma} A B f\|$ and $\|\Lambda_1^{-\gamma} A f\|$. The term $T_i^{(4)}$ can be estimated in a similar way.

Let us now focus on the term $T_{13}$. We can write

$$|T_{13}| \le |\operatorname{Re}\langle K \Lambda_1^{-\gamma_1} B f, \Lambda_1^{-\gamma_1} B f\rangle|^{1/2} \sqrt{\sum_{i=1}^{4} \|X_i \Lambda_1^{-\gamma_2} A f\|^2}.$$

If we choose

$$\gamma_2 = \alpha_2, \tag{4.12}$$

the terms under the square root are easily estimated by writing them as

$$\|X_i \Lambda_1^{-\gamma_2} A f\| \le \|\Lambda_1^{-\gamma_2} A\|\,\|X_i f\| + \|[X_i, \Lambda_1^{-\gamma_2}] \Lambda_1^{\gamma_2}\|\,\|\Lambda_1^{-\gamma_2} A f\| + \|\Lambda_1^{-\gamma_2} [X_i, A] f\|,$$

and estimating the two commutators by (4.11) and (4.9) respectively. The term preceding the square root can be written as

$$\langle K \Lambda_1^{-\gamma_1} B f, \Lambda_1^{-\gamma_1} B f\rangle = \langle \Lambda_1^{-\gamma_1} B K f, \Lambda_1^{-\gamma_1} B f\rangle + \langle [K, \Lambda_1^{-\gamma_1} B] f, \Lambda_1^{-\gamma_1} B f\rangle \equiv T_{15} + T_{16}.$$

The term $T_{15}$ can be bounded if we choose

$$2\gamma_1 \ge \beta_1 + \beta_2, \tag{4.13}$$

because we have then

$$T_{15} \le \|Kf\|\,\|\Lambda_1^{-\beta_2} B\|\,\|\Lambda_1^{-\beta_1} B f\| \le C(\|Kf\| + \|f\|)^2.$$

In order to estimate the term $T_{16}$, we use $K = \operatorname{Re}K + X_0$ to write

$$T_{16} = \langle [X_0, \Lambda_1^{-\gamma_1} B] f, \Lambda_1^{-\gamma_1} B f\rangle + \langle [\operatorname{Re}K, \Lambda_1^{-\gamma_1} B] f, \Lambda_1^{-\gamma_1} B f\rangle \equiv T_{16}^{(1)} + T_{16}^{(2)}.$$

The term $T_{16}^{(1)}$ can be estimated by writing it as

$$T_{16}^{(1)} = \langle \Lambda_1^{-\gamma_1} [X_0, B] f, \Lambda_1^{-\gamma_1} B f\rangle + \langle [X_0, \Lambda_1^{-\gamma_1}] \Lambda_1^{\gamma_1} \Lambda_1^{-\gamma_1} B f, \Lambda_1^{-\gamma_1} B f\rangle.$$

The first term can be bounded by $C(\|Kf\| + \|f\|)^2$ if we choose

$$2\gamma_1 \ge \beta_1 + \beta_3. \tag{4.14}$$

In order to bound the second term, it suffices to have $\gamma_1 \ge \beta_1$, which is the case because of (4.13) and the fact that $\beta_2 \ge \beta_1$.

The term $T_{16}^{(2)}$ can be bounded by $C(\|Kf\| + \|f\|)^2$, by treating it in a similar way as the term $T_{14}$. We leave to the reader the verification that no additional conditions on $\gamma_1$ have to be made. This completes the estimate of $T_1$, because (4.12), (4.13) and (4.14) can be satisfied simultaneously by (4.6).


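Term $T_2$. We decompose this term as

$$T_2 \le |\langle B f, \Lambda_1^{-\gamma} A X_0 f\rangle| + |\langle B f, \Lambda_1^{-\gamma} [X_0, A] f\rangle| + |\langle B f, [X_0, \Lambda_1^{-\gamma}] A f\rangle| \equiv T_{21} + T_{22} + T_{23}.$$

Since $\gamma \ge \alpha_3 + \beta_1$, the term $T_{22}$ is easily estimated by

$$T_{22} \le \|\Lambda_1^{-\beta_1} B f\|\,\|\Lambda_1^{-\alpha_3} [X_0, A] f\| \le C(\|Kf\| + \|f\|)^2.$$

Noticing that we can assume $\alpha_1 \le \alpha_2$ and $\beta_1 \le \beta_2$, condition (4.7) implies $\gamma \ge \alpha_1 + \beta_1$. Since $[X_0, \Lambda_1^{-\gamma}]$ is a function, it commutes with $\Lambda_1$, and so $T_{23}$ can be estimated by writing

$$T_{23} \le |\langle \Lambda_1^{-\beta_1} B f, \Lambda_1^{\gamma} [X_0, \Lambda_1^{-\gamma}] \Lambda_1^{-\alpha_1} A f\rangle| \le \|\Lambda_1^{-\beta_1} B f\|\,\|[X_0, \Lambda_1^{-\gamma}] \Lambda_1^{\gamma}\|\,\|\Lambda_1^{-\alpha_1} A f\| \le C(\|Kf\| + \|f\|)^2,$$

where we used (4.11) to get the last bound. We finally bound $T_{21}$. Since $X_0 = K - \operatorname{Re}K$, it can be expanded as

$$T_{21} \le |\langle B f, \Lambda_1^{-\gamma} A K f\rangle| + |\langle B f, \Lambda_1^{-\gamma} A (\operatorname{Re}K) f\rangle| \equiv T_{21}^{(1)} + T_{21}^{(2)}.$$

The term $T_{21}^{(1)}$ can be estimated by writing

$$T_{21}^{(1)} \le \|Kf\|\,\|\Lambda_1^{-\gamma} A B f\|,$$

and using (4.7). The term $T_{21}^{(2)}$ can be written as

$$T_{21}^{(2)} = \langle B f, \Lambda_1^{-\gamma} A (\operatorname{Re}K) f\rangle = T_{13} + \langle \Lambda_1^{-\gamma_1} B f, [\Lambda_1^{-\gamma_2} A, \operatorname{Re}K] f\rangle.$$

The term $T_{13}$ has already been estimated. The other term can be treated like the term $T_{14}$. We leave to the reader the verification that one can indeed bound it by $C(\|Kf\| + \|f\|)^2$ without any further restriction on $\gamma_1$ and $\gamma_2$. This completes the proof of the lemma. □

4.2. The main step of the proof of Proposition 3.7. By an elementary approximation argument, it is sufficient to prove the inequalities (3.11) for $f \in C_0^{\infty}(\mathcal X)$, since this is a core for both $K$ and $K^*$. Moreover, we will prove only (3.11a). The interested reader may verify that the same arguments also apply for (3.11b). We want to show that we can find constants $\varepsilon$ and $C$ such that

$$\|\Lambda_1^{\varepsilon} f\| \le C(\|Kf\| + \|f\|), \qquad \text{for all } f \in C_0^{\infty}(\mathcal X).$$

In order to show this, we notice that there is a constant $C$ such that

$$\Lambda_1^2 \le C\Bigl(1 + (r_L - \lambda_L^2 q_0)^2 + (r_R - \lambda_R^2 q_N)^2 + \sum_{i=0}^{N} p_i^2 + P^{2n}(Q) + \sum_{i=1}^{N} P^{2m}(\tilde q_i)\Bigr) \equiv \tilde G.$$

The immediate consequence is that

$$\|\Lambda_1^{\varepsilon} f\|^2 = \langle f, \Lambda_1^{2\varepsilon} f\rangle \le \langle f, \Lambda_1^{2\varepsilon-2} \tilde G f\rangle.$$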


It is therefore enough to show that there exists a (small) constant $\varepsilon$ such that the terms

$$\|\Lambda_1^{\varepsilon-1} P^n(Q) f\|, \quad \|\Lambda_1^{\varepsilon-1} p_i f\|, \quad \|\Lambda_1^{\varepsilon-1} P^m(\tilde q_i) f\|, \quad \ldots$$

are bounded by $C(\|Kf\| + \|f\|)$. We are first going to bound the terms involving variables near the boundary of the chain. Then, we will proceed by induction towards the middle of the chain.

The term $\|\Lambda_1^{\varepsilon-1}(r_L - \lambda_L^2 q_0) f\|$. We have

$$\|(r_L - \lambda_L^2 q_0) f\|^2 = |\langle (r_L - \lambda_L^2 q_0)^2 f, f\rangle| \le C|\langle (\operatorname{Re}K) f, f\rangle| = C|\operatorname{Re}\langle K f, f\rangle| \le C\|Kf\|\,\|f\| \le C(\|Kf\| + \|f\|)^2, \tag{4.15}$$

where we used the fact that $a_L \ne 0$ to obtain the first inequality. Since $\Lambda_1 \ge 1$, we thus have the estimate

$$\|\Lambda_1^{\varepsilon-1}(r_L - \lambda_L^2 q_0) f\|^2 \le C(\|Kf\| + \|f\|)^2$$

if we take $\varepsilon \le 1$.

The term $\|\Lambda_1^{\varepsilon-1} p_0 f\|$. We will prove the estimate

$$\|\Lambda_1^{\varepsilon_0-1} p_0 f\| \le C(\|Kf\| + \|f\|), \tag{4.16}$$

for $\varepsilon_0 \le 1/(2m)$. An explicit computation yields the relation

$$[X_0, r_L - \lambda_L^2 q_0] = b_L(r_L - \lambda_L^2 q_0) - \lambda_L^2 p_0. \tag{4.17}$$

Solving (4.17) for $p_0$, we get

$$\|\Lambda_1^{\varepsilon_0-1} p_0 f\|^2 = \bigl\langle \bigl(\lambda_L^{-2} b_L (r_L - \lambda_L^2 q_0) - \lambda_L^{-2} [X_0, r_L - \lambda_L^2 q_0]\bigr) f, \Lambda_1^{2\varepsilon_0-2} p_0 f \bigr\rangle \equiv X_0^{(1)} - X_0^{(2)}.$$

The term $X_0^{(1)}$ can be estimated as

$$|X_0^{(1)}| \le \lambda_L^{-2} \|b_L (r_L - \lambda_L^2 q_0) f\|\,\|\Lambda_1^{2\varepsilon_0-2} p_0 f\| \le C(\|Kf\| + \|f\|)^2,$$

where the last inequality holds because $\varepsilon_0 \le 1/2$.

In order to estimate $X_0^{(2)}$, we apply Lemma 4.2 with $A = p_0$ and $B = r_L - \lambda_L^2 q_0$. An explicit computation yields $[X_0, A] = V_1'(q_0) + V_2'(\tilde q_1) - r_L$. The term $[X_0, B]$ has already been computed in (4.17). Because of Proposition 4.1 and of (4.15), we can choose

$$\alpha_1 = 1, \quad \alpha_2 = 1, \quad \alpha_3 = 2 - 1/m, \qquad \beta_1 = 0, \quad \beta_2 = 1, \quad \beta_3 = 1.$$

The hypotheses of Lemma 4.2 are thus fulfilled if we choose $\gamma = 2 - 1/m$. We therefore have the estimate (4.16) with $\varepsilon_0 = 1/(2m)$. We have a similar estimate for the symmetric term at the other end of the chain.
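As a sanity check of the exponent bookkeeping, the three conditions of Lemma 4.2 can be evaluated for the exponents just chosen. The sketch below reads condition (4.6) as $\gamma \ge \alpha_2 + (\beta_1 + \max\{\beta_2, \beta_3\})/2$ and assumes $m \ge 2$, which is what that condition requires for these values; it then uses $-\gamma = 2\varepsilon_0 - 2$ to read off $\varepsilon_0$.

```latex
% alpha_1 = alpha_2 = 1, alpha_3 = 2 - 1/m, beta_1 = 0, beta_2 = beta_3 = 1, gamma = 2 - 1/m
\begin{align*}
(4.5):\quad & \alpha_3 + \beta_1 = 2 - \tfrac{1}{m} \le \gamma , \\
(4.6):\quad & \alpha_2 + \tfrac12\bigl(\beta_1 + \max\{\beta_2, \beta_3\}\bigr) = \tfrac32 \le 2 - \tfrac{1}{m}
              \iff m \ge 2 , \\
(4.7):\quad & \min\{\alpha_1 + \beta_2,\ \alpha_2 + \beta_1\} = 1 \le \gamma , \\
\text{hence}\quad & 2\varepsilon_0 - 2 = -\gamma = -2 + \tfrac{1}{m}
              \iff \varepsilon_0 = \tfrac{1}{2m} .
\end{align*}
```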


The term $\|\Lambda_1^{\varepsilon-1} P^m(\tilde q_1) f\|$. We will prove the estimate

$$\|\Lambda_1^{\varepsilon_0'-1} P^m(\tilde q_1) f\| \le C(\|Kf\| + \|f\|),$$

for some $\varepsilon_0' < \varepsilon_0$. Because of the bound (3.5b) of Assumption 2, we can find some constants $c_1$ and $c_2$ such that

$$\langle \Lambda_1^{2\varepsilon_0'-2} P^{2m}(\tilde q_1) f, f\rangle \le c_1 \langle \Lambda_1^{2\varepsilon_0'-2} V_2'(\tilde q_1) f, \tilde q_1 f\rangle + c_2 \langle \Lambda_1^{2\varepsilon_0'-2} f, f\rangle, \tag{4.18}$$

where we also used (4.3). The second term is easily estimated because $\Lambda_1^{2\varepsilon_0'-2}$ is bounded if $\varepsilon_0' \le 1$. We once again use the fact that $[X_0, p_0] = V_1'(q_0) + V_2'(\tilde q_1) - r_L$ to write the first term as

$$|\langle \Lambda_1^{2\varepsilon_0'-2} V_2'(\tilde q_1) f, \tilde q_1 f\rangle| = |\langle \Lambda_1^{2\varepsilon_0'-2} \bigl([X_0, p_0] - V_1'(q_0) + r_L\bigr) f, \tilde q_1 f\rangle| \equiv |Y_1^{(1)} + Y_1^{(2)} + Y_1^{(3)}|.$$

The term $Y_1^{(2)}$ can be written as

$$|Y_1^{(2)}| = |\langle \Lambda_1^{2\varepsilon_0'-2+1/m} V_1'(q_0) f, \Lambda_1^{-1/m} \tilde q_1 f\rangle| \le \|\Lambda_1^{2\varepsilon_0'-2+1/m} V_1'(q_0) f\|\,\|\Lambda_1^{-1/m} \tilde q_1 f\|.$$

By Proposition 4.1 and the fact that $V_1' \in \mathcal F^{2n-1}$, this term is bounded by $C\|f\|^2$ if we take $\varepsilon_0'$ so small that

$$2\varepsilon_0' \le 1/n - 1/m. \tag{4.19}$$

The term $Y_1^{(3)}$ is bounded similarly by writing

$$|Y_1^{(3)}| \le \|\Lambda_1^{2\varepsilon_0'-2+1/m} r_L f\|\,\|\Lambda_1^{-1/m} \tilde q_1 f\|,$$

if we impose

$$2\varepsilon_0' \le 1 - 1/m. \tag{4.20}$$

Both conditions can be satisfied because we assumed that $1 < n < m$. In order to estimate $Y_1^{(1)}$, we apply once again Lemma 4.2. This time we have $A = \tilde q_1$ and $B = p_0$. Using (4.16) and Proposition 4.1, we see that we can choose

$$\alpha_1 = 1/m, \quad \alpha_2 = 1/m, \quad \alpha_3 = 1, \qquad \beta_1 = 1 - \varepsilon_0, \quad \beta_2 = 1, \quad \beta_3 = 2 - 1/m.$$

By using $m > 1$, we see that the hypotheses of Lemma 4.2 are fulfilled if (4.19) and (4.20) hold, together with $\varepsilon_0' < \varepsilon_0/2$. Once again, we have the same estimate at the other end of the chain.

We can now go along the chain by induction. At each step, we go one particle closer towards the middle of the chain. We present here only the terms arising when we go from the left to the right of the chain.


The term $\|\Lambda_1^{\varepsilon-1} p_i f\|$. We already treated the case $i = 1$. Let us therefore assume $i > 1$. We moreover assume that there exist constants $\varepsilon_{i-1}, \varepsilon_{i-1}' > 0$ such that we have the estimates

$$\|\Lambda_1^{\varepsilon_{i-1}-1} p_{i-1} f\| \le C(\|Kf\| + \|f\|), \qquad \|\Lambda_1^{\varepsilon_{i-1}'-1} P^m(\tilde q_i) f\| \le C(\|Kf\| + \|f\|). \tag{4.21}$$

We will show that this implies the existence of a constant $\varepsilon_i > 0$ such that

$$\|\Lambda_1^{\varepsilon_i-1} p_i f\| \le C(\|Kf\| + \|f\|). \tag{4.22}$$

We use $p_i = p_{i-1} + [X_0, \tilde q_i]$ to write

$$\|\Lambda_1^{\varepsilon_i-1} p_i f\|^2 = \langle \Lambda_1^{2\varepsilon_i-1} p_{i-1} f, \Lambda_1^{-1} p_i f\rangle + \langle [X_0, \tilde q_i] f, \Lambda_1^{2\varepsilon_i-2} p_i f\rangle \equiv X_i^{(1)} + X_i^{(2)}.$$

The term $X_i^{(1)}$ is easily bounded if we write

$$|X_i^{(1)}| \le \|\Lambda_1^{2\varepsilon_i-1} p_{i-1} f\|\,\|\Lambda_1^{-1} p_i f\| \le C(\|Kf\| + \|f\|)^2,$$

where the last inequality is obtained by using Proposition 4.1 and (4.21). We only have to make the assumption $2\varepsilon_i \le \varepsilon_{i-1}$.

In order to estimate the term $X_i^{(2)}$, we apply Lemma 4.2 with $A = p_i$ and $B = \tilde q_i$. Explicit computation yields $[X_0, p_i] = V_1'(q_i) - V_2'(\tilde q_{i+1}) - V_2'(\tilde q_i)$. Using the induction hypothesis (4.21) and Proposition 4.1, we see that we can choose

$$\alpha_1 = 1, \quad \alpha_2 = 1, \quad \alpha_3 = 2 - 1/m, \qquad \beta_1 = (1 - \varepsilon_{i-1}')/m, \quad \beta_2 = 1/m, \quad \beta_3 = 1.$$

If we take $\varepsilon_i \le \varepsilon_{i-1}'/(2m)$, we see that the hypotheses of Lemma 4.2 are satisfied. We thus have the desired bound (4.22).

The term $\|\Lambda_1^{\varepsilon-1} P^m(\tilde q_{i+1}) f\|$. We assume that there exist strictly positive constants $\varepsilon_i$ and $\varepsilon_{i-1}'$ such that

$$\|\Lambda_1^{\varepsilon_i-1} p_i f\| \le C(\|Kf\| + \|f\|), \qquad \|\Lambda_1^{\varepsilon_{i-1}'-1} P^m(\tilde q_i) f\| \le C(\|Kf\| + \|f\|).$$

We will show that this implies the existence of a constant $\varepsilon_i' > 0$ for which

$$\|\Lambda_1^{\varepsilon_i'-1} P^m(\tilde q_{i+1}) f\| \le C(\|Kf\| + \|f\|). \tag{4.23}$$

Expression (4.18) with $\tilde q_1$ replaced by $\tilde q_{i+1}$ holds. In order to prove (4.23), it suffices therefore to show that

$$|\langle \Lambda_1^{2\varepsilon_i'-2} V_2'(\tilde q_{i+1}) f, \tilde q_{i+1} f\rangle| \le C(\|Kf\| + \|f\|)^2.$$

Since, for $i > 1$, we have $[X_0, p_i] = V_1'(q_i) - V_2'(\tilde q_{i+1}) - V_2'(\tilde q_i)$, the preceding term can be written as

$$|\langle \Lambda_1^{2\varepsilon_i'-2} \bigl([X_0, p_i] + V_1'(q_i) + V_2'(\tilde q_i)\bigr) f, \tilde q_{i+1} f\rangle| \equiv |Y_i^{(1)} + Y_i^{(2)} + Y_i^{(3)}|.$$


We impose $2\varepsilon_i' \le 1/n - 1/m$. The term $Y_i^{(2)}$ is then estimated as

$$|Y_i^{(2)}| \le \|\Lambda_1^{-1/m} \tilde q_{i+1} f\|\,\|\Lambda_1^{2\varepsilon_i'-2+1/m} V_1'(q_i) f\| \le C(\|Kf\| + \|f\|)^2,$$

where the last step uses Proposition 4.1 and $V_1' \in \mathcal F^{2n-1}$. In order to estimate the term $Y_i^{(3)}$, we notice that by the Cauchy-Schwarz inequality and Assumption 2, we have

$$|Y_i^{(3)}| \le C\|\Lambda_1^{-1/m} \tilde q_{i+1} f\|\,\|\Lambda_1^{2\varepsilon_i'-2+1/m} P^{2m-1}(\tilde q_i) f\|$$
$$\le C\|f\|\,\|\Lambda_1^{1/m-1} P^{m-1}(\tilde q_i) \Lambda_1^{2\varepsilon_i'-1} P^m(\tilde q_i) f\|$$
$$\le C\|f\|\,\|\Lambda_1^{2\varepsilon_i'-1} P^m(\tilde q_i) f\|.$$

We can choose $2\varepsilon_i' < \varepsilon_{i-1}'$, so this term can be estimated by the induction hypothesis.

The term $Y_i^{(1)}$ is once again estimated by using Lemma 4.2, this time with $A = \tilde q_{i+1}$ and $B = p_i$. Using Proposition 4.1, it is easy to verify that one can take

$$\alpha_1 = 1/m, \quad \alpha_2 = 1/m, \quad \alpha_3 = 1, \qquad \beta_1 = 1 - \varepsilon_i, \quad \beta_2 = 1, \quad \beta_3 = 2 - 1/m.$$

It suffices then to choose $2\varepsilon_i' < \varepsilon_i$ to satisfy the assumptions of Lemma 4.2 and get the desired estimate.

It is obvious that this induction also works in the other direction, starting from the other end of the chain. It also accommodates slightly more complicated topologies, as long as the chain does not contain any closed loop. In order to complete the proof of the lemma, we have to estimate the last term, corresponding to the motion of the center of mass.

The term $\|\Lambda_1^{\varepsilon-1} P^n(Q) f\|$. Finally, we want to show the estimate

$$\|\Lambda_1^{\varepsilon-1} P^n(Q) f\| \le C(\|Kf\| + \|f\|), \tag{4.24}$$

for some $\varepsilon$. We start with a little computation. We write

$$(N+1) q_0 = Q + (q_{N-1} - q_N) + 2(q_{N-2} - q_{N-1}) + \ldots + N(q_0 - q_1).$$

Moreover, we have $q_i = q_0 + (q_1 - q_0) + \ldots + (q_i - q_{i-1})$. We can thus write

$$\frac{Q}{N+1} - q_i = \sum_{j=1}^{N} b_{ij} \tilde q_j, \qquad \text{with } b_{ij} \in \mathbf R.$$
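The formula for $(N+1) q_0$ can be checked by a telescoping count. The following short derivation is only a sketch; it assumes, as the displayed formula suggests, that $Q = \sum_{i=0}^{N} q_i$ is the sum of the positions.

```latex
\begin{align*}
(N+1)\,q_0 - Q
  &= \sum_{i=1}^{N} (q_0 - q_i)
   = \sum_{i=1}^{N} \sum_{k=1}^{i} (q_{k-1} - q_k) \\
  &= \sum_{k=1}^{N} (N - k + 1)\,(q_{k-1} - q_k) .
\end{align*}
% k = 1 contributes N (q_0 - q_1); k = N contributes (q_{N-1} - q_N).
% Example with N = 2:  Q + (q_1 - q_2) + 2 (q_0 - q_1) = 3 q_0 .
```

Combining this with the telescoping expression for $q_i$ gives the representation of $Q/(N+1) - q_i$ as a real linear combination of the differences of neighbouring positions, as displayed above.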

This, together with the mean-value theorem, implies the useful relation

$$(N+1)\,Q\,V_1'\bigl(Q/(N+1)\bigr) = Q \sum_{i=0}^{N} V_1'(q_i) + Q \sum_{i=0}^{N} \Bigl(V_1'\bigl(Q/(N+1)\bigr) - V_1'(q_i)\Bigr)$$
$$= Q \sum_{i=0}^{N} V_1'(q_i) + Q \sum_{i=0}^{N} V_1''(\xi_i) \sum_{j=1}^{N} b_{ij} \tilde q_j, \tag{4.25}$$


where $\xi_i$ is located somewhere between $Q/(N+1)$ and $q_i$. In the case of $d$-dimensional particles, the expression corresponding to (4.25) is

$$\bigl|(N+1)\,Q\,V_1'\bigl(Q/(N+1)\bigr)\bigr| \le (N+1)|Q|\,|\nabla V_1(q_i)| + |Q| \sum_{i=0}^{N} \sup_{t \in (0,1)} \Bigl|\nabla^2 V_1\Bigl(\frac{tQ}{N+1} + (1-t) q_i\Bigr)\Bigr| \sum_{j=1}^{N} b_{ij} |\tilde q_j|.$$

The subsequent expressions can be rewritten accordingly. We use Assumption 1 and (4.25) to write the left-hand side of (4.24) as

$$\|\Lambda_1^{\varepsilon-1} P^n(Q) f\|^2 = |\langle \Lambda_1^{2\varepsilon-2} P^{2n}(Q) f, f\rangle|$$
$$\le C(N+1)\,\bigl|\bigl\langle \Lambda_1^{2\varepsilon-2} V_1'\bigl(Q/(N+1)\bigr) f, Q f\bigr\rangle\bigr| + C\|f\|^2$$
$$\le C\,\Bigl|\Bigl\langle \Lambda_1^{2\varepsilon-2} \sum_{i=0}^{N} V_1'(q_i) f, Q f\Bigr\rangle\Bigr| + C \sum_{i,j=1}^{N} b_{ij}\,|\langle \Lambda_1^{2\varepsilon-2} \tilde q_j V_1''(\xi_i) f, Q f\rangle| + C\|f\|^2$$
$$\equiv Y^{(1)} + Y^{(2)} + C\|f\|^2.$$

The term $Y^{(2)}$ can be bounded because $V_1'' \in \mathcal F^{2n-2}$, and so

$$|V_1''(\xi_i)| \le C(1 + \xi_i^2)^{n-1} \le C P^{2n-2}(Q) + C P^{2n-2}(q_i) \le C \sum_{k=0}^{N} P^{2n-2}(q_k).$$

Thus, $Y^{(2)}$ can be split into terms of the form

$$|\langle \Lambda_1^{2\varepsilon-2} \tilde q_j P^{2n-2}(q_k) f, Q f\rangle| \le \|\Lambda_1^{1/n-2} P^{2n-2}(q_k) Q f\|\,\|\Lambda_1^{2\varepsilon-1/n} \tilde q_j f\|.$$

The first factor clearly can be bounded by $C\|f\|$ if we notice that $q \mapsto P^{2n-2}(q_k) Q$ belongs to $\mathcal F^{2n-1}$ and then apply Proposition 4.1. The second factor can also be bounded by $C\|f\|$ if we impose

$$0 < \varepsilon \le \frac{1}{2n} - \frac{1}{2m},$$

which can be done because we assumed $n < m$.

It remains to estimate $Y^{(1)}$. We define $P = \sum_{i=0}^{N} p_i$. Since it may easily be verified that $[X_0, P] = \sum_{i=0}^{N} V_1'(q_i) - r_L - r_R$, we can write $Y^{(1)}$ as

$$Y^{(1)} = \langle \Lambda_1^{2\varepsilon-2} \bigl([X_0, P] + r_L + r_R\bigr) f, Q f\rangle \equiv Y^{(3)} + Y^{(4)} + Y^{(5)}.$$

We leave to the reader the verification that the terms $Y^{(4)}$ and $Y^{(5)}$ can be bounded by $C\|f\|^2$ without introducing any stronger condition on $\varepsilon$. The term $Y^{(3)}$ can be estimated by using Lemma 4.2 with $A = Q$ and $B = P$. We have already verified that (4.22) holds for every $i$, so we can define

$$\varepsilon_P \equiv \min\{\varepsilon_i \mid i = 0, \ldots, N\}.$$


This, together with Proposition 4.1, allows us to choose

$$\alpha_1 = 1/n, \quad \alpha_2 = 1/n, \quad \alpha_3 = 1, \qquad \beta_1 = 1 - \varepsilon_P, \quad \beta_2 = 1, \quad \beta_3 = 2 - 1/n,$$

and thus (4.24) is fulfilled if we choose $2\varepsilon \le \varepsilon_P$. This completes the proof of the lemma. □

5. Generalization of Hörmander's Theorem

In a celebrated paper [Hör67], Hörmander studied second-order differential operators of the form

$$P = \sum_{j=1}^{r} L_j^* L_j + L_0, \tag{5.1}$$

where the $L_j$ are some smooth vector fields acting in $\mathbf R^d$. He showed that a sufficient condition for the operator $P$ to be hypoelliptic is that the Lie algebra generated by $\{L_0, \ldots, L_r\}$ has maximal rank everywhere. The main step in his proof is to show that there exists a constant $\varepsilon > 0$ and, for every compact domain $K \subset \mathbf R^d$, a constant $C_K$ such that

$$\|u\|_{(\varepsilon)} \le C_K(\|Pu\| + \|u\|), \qquad \forall\, u \in C_0^{\infty}(K). \tag{5.2}$$

In this expression, the norm $\|\cdot\|_{(\varepsilon)}$ is the natural norm associated to the Sobolev space $H^{\varepsilon}(\mathbf R^d)$, i.e.

$$\|u\|_{(\varepsilon)}^2 = \int_{\mathbf R^d} |\hat u(k)|^2 (1 + k^2)^{\varepsilon}\, d^d k \equiv \|(1 + \Delta)^{\varepsilon/2} u\|^2.$$

We base our discussion on the proof presented in [Hör85]. Hörmander first defines $\mathcal Q_1$ as the set of all properly supported symmetric first-order differential operators $q$ such that for every compact domain $K$, there exist constants $C_K'$ and $C_K''$ with

$$\|qu\|^2 \le C_K' \operatorname{Re}\langle Pu, u\rangle + C_K'' \|u\|^2, \qquad u \in C_0^{\infty}(K). \tag{5.3}$$

In particular, if we write $L_j^* = -L_j + c_j$, where $c_j$ is some function, $\mathcal Q_1$ contains all the operators of the form $(L_j - c_j/2)/i$, $j \ge 1$, as well as their linear combinations. It also contains every operator of order 0. Hörmander then defines $\mathcal Q_2$ as consisting of the operator $(P - P^*)/i$, as well as all the commutators of the form $[q, q']/i$ with $q, q' \in \mathcal Q_1$. For $k > 2$, he defines $\mathcal Q_k$ as the set of all commutators $[q, q']/i$ with $q \in \mathcal Q_{k-1}$ and $q' \in \mathcal Q_{k-2}$. One feature of this construction is that a finite number of steps suffices to catch every symmetric first-order differential operator. This is a consequence of the maximal rank hypothesis. The main point of Hörmander's proof is then the following result.

Lemma 5.1 (Hörmander). If $q_k \in \mathcal Q_k$ and $\varepsilon \le 2^{1-k}$, we have for every $K \subset \mathbf R^d$,

$$\|q_k u\|_{(\varepsilon-1)} \le C(\|Pu\| + \|u\|), \qquad u \in C_0^{\infty}(K). \tag{5.4}$$


The proof can be found in [Hör85, p. 355]. The result (5.2) then follows almost immediately, because the operators $i\partial_j$ all belong to some $\mathcal Q_k$. Thus there exists some $\varepsilon > 0$ such that

$$\sum_{j=1}^{d} \|\partial_j u\|_{(\varepsilon-1)}^2 \le C_K(\|Pu\| + \|u\|)^2, \qquad u \in C_0^{\infty}(K),$$

which implies (5.2). One of the major problems encountered in this paper is to find a global estimate analogous to (5.2), i.e. to find constants $C$ and $\varepsilon$ such that

$$\|\tilde\Delta^{\varepsilon} u\| \le C(\|Pu\| + \|u\|), \qquad \text{for all } u \in C_0^{\infty}(\mathbf R^d),$$

where $\tilde\Delta$ is some modified Laplacian. There are two major difficulties:

• If we were to construct the sets $\mathcal Q_k$ as in [Hör85], they would not necessarily "close" in the sense that the successive commutators could blow up, and the whole proof would break down. To avoid this we do not necessarily put $(P - P^*)/i$ into $\mathcal Q_2$, but rather $g_0 (P - P^*)/i$, where $g_0$ is some bounded function. This allows us to get decreasing bounds on the successive commutators. This problem does not appear in [EPR99a], where the successive commutators are all first-order differential operators with constant (or bounded) coefficients. On the other hand, the commutator technique is essentially the same as in [EPR99a].

• The above construction does not allow us to deal with arbitrary symmetric first-order differential operators. The reason is that if we want a global equivalent of (5.3), the set $\mathcal Q_1$ is no longer allowed to contain products of the $L_j$ and unbounded functions. We thus work with fewer operators, which means that we track much more closely the expressions which appear in the constructions.

5.1. General setting. Let us consider the Hilbert space $\mathcal H = L^2(\mathbf R^d, dx)$ for some integer $d \ge 1$. We define the set $\mathcal C(\mathcal H)$ as the set of closed operators on $\mathcal H$ and the algebra $\mathcal B(\mathcal H)$ as the everywhere defined bounded operators on $\mathcal H$. We define $\mathcal D \equiv C_0^{\infty}(\mathbf R^d)$, which is dense in $\mathcal H$. Let us fix some sub-algebra $\mathcal F \subset \mathcal B(\mathcal H)$ that is closed under conjugation and such that $F\mathcal D \subset \mathcal D$ for all $F \in \mathcal F$ (typically $\mathcal F$ is some algebra of bounded functions). The advantage of considering $C_0^{\infty}(\mathbf R^d)$ is that every differential operator with sufficiently smooth coefficients is closable on it (see [Yos80] for a justification). Moreover, every differential operator with smooth coefficients maps $\mathcal D$ into itself. This allows us to make a formal calculus, i.e. every relationship between operators appearing in this section is supposed to hold on $\mathcal D$. The actual operators are then the closures of the operators defined on $\mathcal D$.

We define $\mathcal L$ as the set of all formal expressions of the form

$$\sum_{|\ell| \le k} a_{\ell}(x) D^{\ell}, \qquad k \ge 0, \quad a_{\ell} \in C^{\infty}(\mathbf R^d),$$

where $D^{\ell}$ denotes the $|\ell|$th derivative with respect to the multi-index $\ell$. By the above remark, any element of $\mathcal L$ can naturally be identified with a differential operator in $\mathcal C(\mathcal H)$. Consider a differential operator $K$ that can be written as

$$K = \sum_{i=1}^{n} X_i^* X_i + X_0, \qquad X_j \in \mathcal L, \quad j = 1, \ldots, n, \tag{5.5}$$


where $X_0$ is such that

$$X_0^* = -X_0 + g, \qquad g \in \mathcal F. \tag{5.6}$$

We introduce now a definition that will be very useful in the sequel.

Definition 5.2. Let $\mathcal S \subset \mathcal L$ be a finite set of differential operators and $i \ge 0$ a natural number. We define the set $\mathcal Y_{\mathcal F}^{i}(\mathcal S)$ as the module on $\mathcal F$ generated by the terms

$$S_1 S_2 \cdots S_i, \qquad S_k \in \mathcal S \cup \{1\}, \quad k = 1, \ldots, i.$$

The elements of $\mathcal Y_{\mathcal F}^{i}(\mathcal S)$ are naturally identified with densely defined closed operators on $\mathcal H$. If $i = 0$, we use the convention $\mathcal Y_{\mathcal F}^{0}(\mathcal S) \equiv \mathcal F$. The subscript $\mathcal F$ will be dropped in the sequel when the algebra $\mathcal F$ is clear from the context.

We construct the sets

$$\mathcal A_{-1} = \{X_1, \ldots, X_n\}, \qquad \mathcal A_0 = \{g_0 X_0, X_1, \ldots, X_n\}, \qquad g_0 \in \mathcal F, \tag{5.7}$$

where the operator $g_0$ is assumed to be self-adjoint, positive and such that

$$[g_0, X_0] \in \mathcal F. \tag{5.8}$$

Let us now construct recursively up to a level $R < \infty$ some finite sets $\mathcal B_i, \mathcal A_i \subset \mathcal L$ by the following procedure. Assume $\mathcal A_{i-1}$ is known. Consider next the set $\mathcal B_i^{(0)}$ of all $A$ of the form

$$A = \sum_{B \in \mathcal A_{i-1}} \Bigl( f_B B + \sum_{X \in \mathcal A_0} f_{XB} [X, B] \Bigr), \qquad f_B, f_{XB} \in \mathcal F. \tag{5.9}$$

We then select a finite subset $\mathcal B_i \subset \mathcal B_i^{(0)}$. The set $\mathcal A_i$ is then defined as $\mathcal A_i \equiv \mathcal A_{i-1} \cup \mathcal B_i$.

Remark 5.3. It is here that our construction differs from similar ones where all elements of $\mathcal B_i^{(0)}$ would have been selected. This makes the set of operators which we study much smaller, but then we of course have to verify that the operators of interest are really covered by our construction.

We will make some working hypotheses on the sets $\mathcal A_i$.

Hypothesis 1. The pair $(\mathcal A_R, \mathcal F)$ satisfies the following. If $A, B \in \mathcal A_R$ and $f \in \mathcal F$, then

$$[A, B] \in \mathcal Y^1(\mathcal A_R), \qquad A^* \in \mathcal Y^1(\mathcal A_R), \qquad [A, f] \in \mathcal F.$$

Hypothesis 2. If $A \in \mathcal A_i$ with $i \ge -1$, we have $A^* \in \mathcal Y^1(\mathcal A_i)$.

Remark 5.4. Hypothesis 1 implies that if $X \in \mathcal Y^j(\mathcal A_R)$ and $Y \in \mathcal Y^k(\mathcal A_R)$, then $[X, Y] \in \mathcal Y^{k+j-1}(\mathcal A_R)$. This will be very useful in the sequel. Hypothesis 2 implies that the classes $\mathcal Y^k(\mathcal A_i)$ are closed under conjugation.


We define now the operator $\Lambda^2$ by

$$\Lambda^2 = 1 + \sum_{A \in \mathcal A_R} A^* A. \tag{5.10}$$

This is, in some sense that will immediately be clear from Lemma 5.6, the "biggest" operator contained in $\mathcal Y^2(\mathcal A_R)$. The operator $\Lambda^2$ is symmetric, densely defined and positive. We will moreover assume that

Hypothesis 3. $\Lambda^2$ is essentially self-adjoint on $\mathcal D$.

The powers $\Lambda^{\alpha}$ thus exist and are also essentially self-adjoint on $\mathcal D$ for $\alpha \le 2$.

5.2. Results and a preliminary lemma. The following theorem is the main result of this section.

Theorem 5.5. Let $K$ and $\Lambda$ be defined as above and assume Hypotheses 1–3 are satisfied for some $R$. Then there exist some constants $C, \varepsilon > 0$ such that for every $f \in \mathcal D$, we have

$$\|\Lambda^{\varepsilon} f\| \le C(\|Kf\| + \|f\|). \tag{5.11}$$

In the sequel, we will write $\mathcal A$ instead of $\mathcal A_R$ to simplify the notation. In order to prove Theorem 5.5, we need the following lemma, which will be extensively used in the sequel.

Lemma 5.6. Let $\Lambda$, $\mathcal F$ and $\mathcal A$ be as above and assume Hypotheses 1 and 3 hold. If $X \in \mathcal Y_{\mathcal F}^{j}(\mathcal A)$, then the operators

$$\Lambda^{\beta} X \Lambda^{\gamma} \qquad \text{with } \beta + \gamma \le -j$$

are bounded. If $Y \in \mathcal L$ is such that $[Y, \Lambda^2] \in \mathcal Y_{\mathcal F}^{j}(\mathcal A)$, then the operators

$$\Lambda^{\beta} [\Lambda^{\alpha}, Y] \Lambda^{\gamma} \qquad \text{with } \alpha + \beta + \gamma \le 2 - j$$

are bounded. If $X, Y \in \mathcal L$ are such that

$$[X, \Lambda^2] \in \mathcal Y_{\mathcal F}^{j}(\mathcal A), \qquad [Y, \Lambda^2] \in \mathcal Y_{\mathcal F}^{k}(\mathcal A) \qquad \text{and} \qquad \bigl[[\Lambda^2, X], Y\bigr] \in \mathcal Y_{\mathcal F}^{j+k-2}(\mathcal A),$$

then the operators

$$\Lambda^{\beta} \bigl[[\Lambda^{\alpha}, X], Y\bigr] \Lambda^{\gamma} \qquad \text{with } \alpha + \beta + \gamma \le 4 - j - k$$

are bounded.

Proof. The proof of this lemma is postponed to Appendix A. □

Remark 5.7. Lemma 5.6 allows us to count powers in the following sense. Each time we see an operator that is a monomial containing fractional powers of $\Lambda$ and some operators of $\mathcal Y^j(\mathcal A)$, we know that the operator is bounded if its "degree" is less than or equal to 0. The rule is that if $Y \in \mathcal Y^j(\mathcal A)$, its degree is $j$ and the degree of $\Lambda^{\alpha}$ is $\alpha$. Moreover, every time we encounter a commutator, we can lower the degree by one unit.
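As an illustration of this power counting, here are three instances that can be read off from Lemma 5.6. The symbols $f \in \mathcal F$, $A \in \mathcal A$ and the small parameter $\varepsilon$ are chosen here for illustration only; the third line uses that $[A, \Lambda^2] \in \mathcal Y^2(\mathcal A)$, which follows from Remark 5.4.

```latex
% degree(\Lambda^{\alpha}) = \alpha, degree(Y) = j for Y \in \mathcal{Y}^j(\mathcal{A}),
% and each commutator lowers the total degree by one.
\begin{align*}
\Lambda^{2\varepsilon-1} f\, \Lambda^{1-2\varepsilon}
  &:\quad (2\varepsilon - 1) + 0 + (1 - 2\varepsilon) = 0 \le 0 , \\
\Lambda^{-1} A
  &:\quad -1 + 1 = 0 \le 0 , \\
[A, \Lambda^{\alpha}]\,\Lambda^{-2}
  &:\quad (1 + \alpha - 1) - 2 = \alpha - 2 \le 0 \iff \alpha \le 2 .
\end{align*}
```

In each case the total degree is at most 0, so the corresponding operator is bounded.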


Lemma 5.6 also shows that if $f \in \mathcal D$, $A \in \mathcal A$ and $\alpha \le 2$, expressions such as $A \Lambda^{\alpha} f$ can be well defined by

$$A \Lambda^{\alpha} f \equiv \Lambda^{\alpha} A f + [A, \Lambda^{\alpha}] \Lambda^{-2} \Lambda^2 f,$$

where $[A, \Lambda^{\alpha}] \Lambda^{-2}$ is bounded and can therefore be defined on all of $\mathcal H$. Similar expressions hold to show that any expression of this section can be well defined. We are now ready to prove the theorem.

5.3. Proof of Theorem 5.5. The proof uses the commutation techniques developed by Hörmander [Hör85] and improved by Eckmann, Pillet, Rey-Bellet [EPR99a]. Large parts of this proof are inspired by this latter work. Before we start the proof itself, let us make a few computations, the results of which will be used repeatedly in the sequel. We first show that we can assume $\operatorname{Re}K$ positive. An explicit computation, using (5.5) and (5.6), shows that

$$\operatorname{Re}K = \sum_{i=1}^{n} X_i^* X_i + \frac{g}{2},$$

and thus also

$$X_0 = K - \operatorname{Re}K + g/2. \tag{5.12}$$
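For the reader's convenience, the computation behind these two formulas can be sketched as follows. It uses only (5.5) and (5.6), together with the convention $\operatorname{Re}K = (K + K^*)/2$ on $\mathcal D$, which is assumed here.

```latex
\begin{align*}
K^* &= \sum_{i=1}^{n} X_i^* X_i + X_0^*
     = \sum_{i=1}^{n} X_i^* X_i - X_0 + g , \\
\operatorname{Re} K &= \tfrac12 (K + K^*)
     = \sum_{i=1}^{n} X_i^* X_i + \tfrac{g}{2} , \\
X_0 &= K - \sum_{i=1}^{n} X_i^* X_i
     = K - \operatorname{Re} K + \tfrac{g}{2} .
\end{align*}
```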

Because $g \in \mathcal F$, we can add a sufficiently big constant to $X_0$ to make $\operatorname{Re}K$ positive. This will change neither the commutation relations, nor the estimate (5.11). Another useful equality is

$$g_0 \operatorname{Re}K = \operatorname{Re}(g_0 K + K_1) + K_2, \qquad K_1, K_2 \in \mathcal Y^1(\mathcal A_{-1}), \tag{5.13}$$

where $K_1$ is a self-adjoint operator such that $\operatorname{Re}(g_0 K + K_1)$ is a positive self-adjoint operator. This is a consequence of the following two equalities, which are easily verified by inspection:

$$g_0 \operatorname{Re}K = \sum_{i=1}^{n} X_i^* g_0 X_i + K_2, \qquad K_2 \in \mathcal Y^1(\mathcal A_{-1}),$$
$$\operatorname{Re}(g_0 K) = \sum_{i=1}^{n} X_i^* g_0 X_i - K_1, \qquad K_1 \in \mathcal Y^1(\mathcal A_{-1}).$$

We therefore have

$$\operatorname{Re}(g_0 K + K_1) = \sum_{i=1}^{n} X_i^* g_0 X_i.$$

This proves (5.13). Another useful identity will be

$$(g_0 X_0)^* = -X_0 g_0 + g g_0 = -g_0 X_0 + [g_0, X_0] + g g_0 = -g_0 X_0 + g_0', \qquad g_0' \in \mathcal F, \tag{5.14}$$

where the last equality is a consequence of (5.8). We will now verify the estimate (5.11) for some vector $f \in \mathcal D$. In the sequel, the symbol $C$ will be used to denote some constant depending only on the operator $K$. This constant can change from one line to the other. We will first prove that $A \in \mathcal Y^1(\mathcal A_i)$ with $0 \le i \le R$ implies

$$\|\Lambda^{1/4^{i+1} - 1} A f\| \le C(\|Kf\| + \|f\|). \tag{5.15}$$

In fact, an immediate consequence of the first part of Lemma 5.6 is that we only have to prove this assertion for $A \in \mathcal A_i$. The proof will proceed by induction on $i$.

5.3.1. Verification for $i = 0$. We want to verify the estimate

$$\|\Lambda^{-3/4} A f\| \le C(\|Kf\| + \|f\|), \qquad \text{for all } A \in \mathcal A_0.$$

The cases $A = g_0 X_0$ and $A = X_j$ with $j \ne 0$ will be treated separately.

The case $A = X_j$. We write

$$\|\Lambda^{-3/4} X_j f\|^2 \le C\|X_j f\|^2 \le C\langle f, X_j^* X_j f\rangle \le C\langle f, (K + K^* - g) f\rangle \le C \operatorname{Re}\langle f, K f\rangle + C\|f\|^2 \le C\|f\|(\|Kf\| + \|f\|).$$

This implies the desired estimate. Because $X_j^* \in \mathcal Y^1(\mathcal A_{-1})$ by hypothesis, this computation immediately implies the estimates

$$\|X_j f\| \le C(\|Kf\| + \|f\|), \tag{5.16a}$$
$$\|X_j^* f\| \le C(\|Kf\| + \|f\|), \tag{5.16b}$$

which hold for every $j \ge 1$.

The case $A = g_0 X_0$. We write, using expression (5.12),

$$\|\Lambda^{-3/4} A f\|^2 = \langle g_0 X_0 f, \Lambda^{-3/2} A f\rangle = \langle K f, g_0 \Lambda^{-3/2} A f\rangle + \langle g_0 g f, \Lambda^{-3/2} A f\rangle/2 - \langle (\operatorname{Re}K) f, g_0 \Lambda^{-3/2} A f\rangle \equiv S_1 + S_2 - S_3.$$

The terms $S_1$ and $S_2$ are easily bounded by $C(\|Kf\| + \|f\|)^2$, using the Cauchy-Schwarz inequality and the first part of Lemma 5.6. Using the positivity of $\operatorname{Re}K$ and the explicit form of $K$, the term $S_3$ can be bounded as

$$|S_3| = |\langle (\operatorname{Re}K)^{1/2} f, (\operatorname{Re}K)^{1/2} g_0 \Lambda^{-3/2} A f\rangle|$$
$$\le |\operatorname{Re}\langle K f, f\rangle|^{1/2}\, |\langle (\operatorname{Re}K) g_0 \Lambda^{-3/2} A f, g_0 \Lambda^{-3/2} A f\rangle|^{1/2}$$
$$\le \sqrt{\|Kf\|\,\|f\|}\, \Bigl( \langle g g_0 \Lambda^{-3/2} A f, g_0 \Lambda^{-3/2} A f\rangle/2 + \sum_{i=1}^{n} \|X_i g_0 \Lambda^{-3/2} A f\|^2 \Bigr)^{1/2}$$
$$\equiv \sqrt{\|Kf\|\,\|f\|}\, \sqrt{S_0 + \sum_{i=1}^{n} S_{0,i}^2}.$$

The term $S_0$ is estimated by simple power counting (the $\Lambda$'s contribute $-3$ and the $A$'s contribute $2$ to the total degree of the expression, hence $|S_0| \le C\|f\|^2$). The terms $S_{0,i}$ are estimated by writing

$$|S_{0,i}| \le \|g_0 \Lambda^{-3/2} A X_i f\| + \|[X_i, g_0 \Lambda^{-3/2} A] f\|.$$


The first term is estimated by using (5.16) and power counting. The second term is estimated by expanding the commutator as

$$[X_i, g_0 \Lambda^{-3/2} A] = [X_i, g_0] \Lambda^{-3/2} A + g_0 [X_i, \Lambda^{-3/2}] A + g_0 \Lambda^{-3/2} [X_i, A],$$

and estimating separately the resulting terms.

5.3.2. The induction hypothesis. We shall proceed by induction. Let us fix $j > 0$, take $A \in \mathcal A_j$ and assume (5.15) holds for $i < j$. Let us moreover define $\varepsilon \equiv 1/4^{j+1}$ in order to simplify the notation. Our assumption is therefore that

$$\|\Lambda^{4\varepsilon-1} B f\| \le C(\|Kf\| + \|f\|) \qquad \forall\, B \in \mathcal Y^1(\mathcal A_{j-1}). \tag{5.17}$$

We will now prove that this assumption implies the desired estimate, i.e.

$$\|\Lambda^{\varepsilon-1} A f\| \le C(\|Kf\| + \|f\|) \qquad \forall\, A \in \mathcal Y^1(\mathcal A_j). \tag{5.18}$$

This, together with the preceding paragraph, will imply the estimate (5.15).

5.3.3. Proof of the main estimate. Because of the induction hypothesis, we only have to check (5.18) for $A \in \mathcal A_j \setminus \mathcal A_{j-1}$. By (5.9), we can write

$$A = \sum_{B \in \mathcal A_{j-1}} \Bigl( f_B B + f_B^0 [g_0 X_0, B] + \sum_{i=1}^{n} f_B^i [X_i, B] \Bigr),$$

with all the $f$ belonging to $\mathcal F$. We have

$$\|\Lambda^{\varepsilon-1} A f\|^2 = \sum_{B \in \mathcal A_{j-1}} \Bigl\langle \Bigl( f_B B + f_B^0 [g_0 X_0, B] + \sum_{i=1}^{n} f_B^i [X_i, B] \Bigr) f, \Lambda^{2\varepsilon-2} A f \Bigr\rangle \equiv \sum_{B \in \mathcal A_{j-1}} \Bigl( T_B + T_B^0 + \sum_{i=1}^{n} T_B^i \Bigr).$$

We are going to bound each term of this sum separately by $C(\|Kf\| + \|f\|)^2$.

Term $T_B$. We have

$$|T_B| = |\langle \Lambda^{2\varepsilon-1} f_B \Lambda^{1-2\varepsilon} \Lambda^{2\varepsilon-1} B f, \Lambda^{-1} A f\rangle|.$$

The operators $\Lambda^{2\varepsilon-1} f_B \Lambda^{1-2\varepsilon}$ and $\Lambda^{-1} A$ are bounded by Lemma 5.6. Using the induction hypothesis (5.17), we thus get the bound $|T_B| \le C(\|Kf\| + \|f\|)^2$.

Term $T_B^i$ with $i \ne 0$. We define $h \equiv f_B^i$. The term $T_B^i$ is then written as

$$T_B^i = \langle B f, X_i^* h^* \Lambda^{2\varepsilon-2} A f\rangle - \langle X_i f, h^* B^* \Lambda^{2\varepsilon-2} A f\rangle \equiv Q_1 - Q_2.$$

Term $Q_1$. It can be estimated by writing

$$Q_1 = \langle B f, h^* \Lambda^{2\varepsilon-2} A X_i^* f\rangle + \langle B f, [X_i^*, h^* \Lambda^{2\varepsilon-2} A] f\rangle.$$


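The first term is estimated by rewriting it as

$$|\langle B f, h^* \Lambda^{2\varepsilon-2} A X_i^* f\rangle| = |\langle \Lambda^{2\varepsilon-1} B f, \Lambda^{1-2\varepsilon} h^* \Lambda^{2\varepsilon-2} A X_i^* f\rangle| \le \|\Lambda^{2\varepsilon-1} B f\|\,\|\Lambda^{1-2\varepsilon} h^* \Lambda^{2\varepsilon-2} A X_i^* f\| \le C(\|Kf\| + \|f\|)^2.$$

The last inequality has been obtained by using the induction hypothesis (5.17), the estimate (5.16b) and the fact that the operator $\Lambda^{1-2\varepsilon} h^* \Lambda^{2\varepsilon-2} A$ is bounded by Lemma 5.6. The second term is estimated as

$$|\langle B f, [X_i^*, h^* \Lambda^{2\varepsilon-2} A] f\rangle| = |\langle \Lambda^{2\varepsilon-1} B f, \Lambda^{1-2\varepsilon} [X_i^*, h^* \Lambda^{2\varepsilon-2} A] f\rangle| \le \|\Lambda^{2\varepsilon-1} B f\|\,\|\Lambda^{1-2\varepsilon} [X_i^*, h^* \Lambda^{2\varepsilon-2} A] f\|.$$

The term $\|\Lambda^{2\varepsilon-1} B f\|$ is bounded by the induction hypothesis (5.17). The other term can be estimated by writing the commutator as

$$[X_i^*, h^* \Lambda^{2\varepsilon-2} A] = [X_i^*, h^*] \Lambda^{2\varepsilon-2} A + h^* [X_i^*, \Lambda^{2\varepsilon-2}] A + h^* \Lambda^{2\varepsilon-2} [X_i^*, A].$$

The resulting terms are estimated by power counting, using the fact that $X_i^* \in \mathcal Y^1(\mathcal A)$.

Term $Q_2$. We bound this term as

$$|Q_2| = |\langle X_i f, h^* \Lambda^{2\varepsilon-2} A B^* f\rangle + \langle X_i f, h^* [B^*, \Lambda^{2\varepsilon-2} A] f\rangle|$$
$$\le \|X_i f\| \bigl( \|h^* \Lambda^{2\varepsilon-2} A B^* f\| + \|h^* [B^*, \Lambda^{2\varepsilon-2} A] f\| \bigr)$$
$$\le \|X_i f\| \bigl( \|h^* \Lambda^{2\varepsilon-2} A \Lambda^{1-2\varepsilon}\|\,\|\Lambda^{2\varepsilon-1} B^* f\| + \|h^* [B^*, \Lambda^{2\varepsilon-2} A] f\| \bigr).$$

We leave to the reader the task of verifying that it is indeed possible to get the bound $|Q_2| \le C(\|Kf\| + \|f\|)^2$ by estimates similar to those for the term $Q_1$.

Term $T_B^0$. We define $h \equiv f_B^0$. The term $T_B^0$ is thus equal to

$$T_B^0 = \langle [g_0 X_0, B] f, h^* \Lambda^{2\varepsilon-2} A f\rangle = \langle g_0 X_0 B f, h^* \Lambda^{2\varepsilon-2} A f\rangle - \langle B g_0 X_0 f, h^* \Lambda^{2\varepsilon-2} A f\rangle.$$

We use (5.14) to write this as

$$T_B^0 = -\langle B f, h^* \Lambda^{2\varepsilon-2} A g_0 X_0 f\rangle + \langle B f, g_0' h^* \Lambda^{2\varepsilon-2} A f\rangle - \langle B f, [g_0 X_0, h^* \Lambda^{2\varepsilon-2} A] f\rangle - \langle B g_0 X_0 f, h^* \Lambda^{2\varepsilon-2} A f\rangle \equiv -U_1 + U_2 - U_3 - U_4,$$

where $g_0' \in \mathcal F$. The term $U_2$ can easily be estimated by

$$|U_2| = |\langle \Lambda^{2\varepsilon-1} B f, \Lambda^{1-2\varepsilon} g_0' h^* \Lambda^{2\varepsilon-2} A f\rangle| \le \|\Lambda^{2\varepsilon-1} B f\|\,\|\Lambda^{1-2\varepsilon} g_0' h^* \Lambda^{2\varepsilon-2} A f\| \le C(\|Kf\| + \|f\|)\|f\|,$$

using the induction hypothesis. In order to estimate the term $U_3$, we notice that $g_0 X_0 \in \mathcal A$, and thus $[g_0 X_0, \Lambda^2] \in \mathcal Y^2(\mathcal A)$. We can therefore write

$$|U_3| = |\langle \Lambda^{2\varepsilon-1} B f, \Lambda^{1-2\varepsilon} [g_0 X_0, h^* \Lambda^{2\varepsilon-2} A] f\rangle| \le \|\Lambda^{2\varepsilon-1} B f\|\,\|\Lambda^{1-2\varepsilon} [g_0 X_0, h^* \Lambda^{2\varepsilon-2} A] f\|,$$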


J.-P. Eckmann, M. Hairer

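expand the commutator and estimate the resulting terms separately by power counting. We use the equality $X_0 = K - \mathrm{Re}K + g/2$ to write the terms $U_1$ and $U_4$ as
$$U_1 = \langle Bf, h^*\Lambda^{2\varepsilon-2}Ag_0\bigl(K - (\mathrm{Re}K) + g/2\bigr)f\rangle \equiv T_{B,1} - T_{B,2} + T_{B,3},$$
$$U_4 = \langle Bg_0\bigl(K - (\mathrm{Re}K) + g/2\bigr)f, h^*\Lambda^{2\varepsilon-2}Af\rangle \equiv T_{B,4} - T_{B,5} + T_{B,6}.$$
Each of these terms will now be estimated separately.

Terms $T_{B,3}$ and $T_{B,6}$. They are easily bounded like the term $U_2$ by power counting and using the induction hypothesis to bound $\|\Lambda^{2\varepsilon-1}Bf\|$. In the case of $T_{B,6}$, we first have to commute $B$ with $g_0 g/2$, but this does not cause any problem.

Term $T_{B,1}$. This term can be estimated by
$$|T_{B,1}| \le \|Kf\|\,\|g_0^*A^*\Lambda^{2\varepsilon-2}hBf\| \le \|Kf\|\,\|g_0^*A^*\Lambda^{2\varepsilon-2}h\Lambda^{2-2\varepsilon}\|\,\|\Lambda^{2\varepsilon-2}Bf\|.$$
The norm of $g_0^*A^*\Lambda^{2\varepsilon-2}h\Lambda^{2-2\varepsilon}$ is bounded by power counting. Using the induction hypothesis (5.17), we thus have $|T_{B,1}| \le C(\|Kf\| + \|f\|)^2$.

Term $T_{B,4}$. We have the estimate
$$|T_{B,4}| = |\langle Kf, g_0^*B^*h^*\Lambda^{2\varepsilon-2}Af\rangle| \le \|Kf\|\,\|g_0^*B^*h^*\Lambda^{2\varepsilon-2}Af\|.$$
The second norm can be estimated by writing
$$\|g_0^*B^*h^*\Lambda^{2\varepsilon-2}Af\| \le \|g_0^*h^*\Lambda^{2\varepsilon-2}A\Lambda^{1-2\varepsilon}\|\,\|\Lambda^{2\varepsilon-1}B^*f\| + \|g_0^*[B^*, h^*\Lambda^{2\varepsilon-2}A]f\|.$$
Here, the first term can be bounded by $C(\|Kf\| + \|f\|)$ because, by Hypothesis 2, we have $B^* \in \mathcal{Y}^1(\mathcal{A}_{j-1})$ and so we can use the induction hypothesis. The commutator can be expanded and bounded by power counting.

Term $T_{B,2}$. We can write this term as
$$T_{B,2} = \langle \Lambda^{2\varepsilon-1}hBf, (g_0\mathrm{Re}K)\Lambda^{-1}Af\rangle + \langle \Lambda^{2\varepsilon-1}hBf, [\Lambda^{-1}A, g_0\mathrm{Re}K]f\rangle$$
$$= \langle \Lambda^{2\varepsilon-1}hBf, K_2\Lambda^{-1}Af\rangle + \langle \Lambda^{2\varepsilon-1}hBf, [\Lambda^{-1}A, g_0\mathrm{Re}K]f\rangle + \langle \Lambda^{2\varepsilon-1}hBf, \mathrm{Re}(g_0K + K_1)\Lambda^{-1}Af\rangle \equiv M_1 + M_2 + M_3,$$
where the second equality has been obtained using (5.13). These terms can now be estimated separately.

Term $M_1$. We write this term as
$$M_1 = \langle \Lambda^{2\varepsilon-1}hBf, \Lambda^{-1}AK_2f\rangle + \langle \Lambda^{2\varepsilon-1}hBf, [K_2, \Lambda^{-1}A]f\rangle.$$
The first term is estimated by using
$$K_2 \in \mathcal{Y}^1(\mathcal{A}_{-1}) \quad\Rightarrow\quad \|K_2f\| \le C(\|Kf\| + \|f\|),$$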


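where the implication is a straightforward consequence of (5.16a). The second term can be estimated by power counting and the induction hypothesis, using the fact that $K_2 \in \mathcal{Y}^1(\mathcal{A}_{-1})$, so that $[K_2, \Lambda^{-1}A]$ is bounded.

Term $M_2$. We use the explicit form of $\mathrm{Re}K$ to write this term as
$$M_2 = \langle \Lambda^{2\varepsilon-1}hBf, [\Lambda^{-1}A, g_0](\mathrm{Re}K)f\rangle + \sum_{i=1}^{n}\Bigl(\langle \Lambda^{2\varepsilon-1}hBf, g_0X_i^*[\Lambda^{-1}A, X_i]f\rangle + \langle \Lambda^{2\varepsilon-1}hBf, g_0[\Lambda^{-1}A, X_i^*]X_if\rangle\Bigr) + \langle \Lambda^{2\varepsilon-1}hBf, g_0[\Lambda^{-1}A, g]f\rangle/2$$
$$\equiv M_{20} + \sum_{i=1}^{n}(M_{i1} + M_{i2}) + M_{21}.$$
The term $M_{20}$ is estimated by using the explicit form of $\mathrm{Re}K$ to decompose it into terms of the form
$$|\langle \Lambda^{2\varepsilon-1}hBf, [\Lambda^{-1}A, g_0]X_i^*X_if\rangle| \le \|[\Lambda^{-1}A, g_0]X_i^*\|\,\|\Lambda^{2\varepsilon-1}hBf\|\,\|X_if\|.$$
The norm $\|[\Lambda^{-1}A, g_0]X_i^*\|$ is finite by Lemma 5.6. The terms $\|\Lambda^{2\varepsilon-1}hBf\|$ and $\|X_if\|$ are bounded by $C(\|Kf\| + \|f\|)$, using the induction hypothesis (5.17) and the estimate (5.16a) respectively. The terms $M_{21}$ and $M_{i2}$ are estimated by power counting and the induction hypothesis. In order to estimate the term $M_{i1}$, we have to commute once more to find
$$M_{i1} = \langle \Lambda^{2\varepsilon-1}hBf, g_0[\Lambda^{-1}A, X_i]X_i^*f\rangle + \langle \Lambda^{2\varepsilon-1}hBf, g_0\bigl[X_i^*, [\Lambda^{-1}A, X_i]\bigr]f\rangle.$$
The first term is estimated by using (5.16b). The second term is estimated by expanding the double commutator and power counting.

Term $M_3$. We use the positivity of $\mathrm{Re}(g_0K + K_1)$ to write
$$|M_3| = \bigl|\langle (\mathrm{Re}(g_0K + K_1))^{1/2}\Lambda^{2\varepsilon-1}hBf, (\mathrm{Re}(g_0K + K_1))^{1/2}\Lambda^{-1}Af\rangle\bigr|$$
$$\le |\mathrm{Re}\langle (g_0K + K_1)\Lambda^{2\varepsilon-1}hBf, \Lambda^{2\varepsilon-1}hBf\rangle|^{1/2} \times |\langle \mathrm{Re}(g_0K + K_1)\Lambda^{-1}Af, \Lambda^{-1}Af\rangle|^{1/2} \le \bigl(\sqrt{|\mathrm{Re}M_4|} + \sqrt{|\mathrm{Re}M_5|}\bigr)\sqrt{|M_6|}.$$
We will now estimate $M_4$, $M_5$ and $M_6$ separately.

Term $M_4$. We want to put the operator $g_0K$ to the left of $f$. So we write
$$M_4 = \langle \Lambda^{-1}hBg_0Kf, \Lambda^{4\varepsilon-1}hBf\rangle + \langle [g_0K, \Lambda^{2\varepsilon-1}hB]f, \Lambda^{2\varepsilon-1}hBf\rangle \equiv M_{41} + M_{42}.$$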


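The term $M_{41}$ is estimated easily by using the induction hypothesis and the fact that $\Lambda^{-1}hBg_0$ is bounded. In order to estimate $M_{42}$, we use the explicit form of $K$ to write
$$M_{42} = \langle \Lambda^{-2\varepsilon}[g_0X_0, \Lambda^{2\varepsilon-1}hB]f, \Lambda^{4\varepsilon-1}hBf\rangle + \sum_{i=1}^{n}\Bigl(\langle g_0X_i^*[X_i, \Lambda^{2\varepsilon-1}hB]f, \Lambda^{2\varepsilon-1}hBf\rangle + \langle g_0[X_i^*, \Lambda^{2\varepsilon-1}hB]X_if, \Lambda^{2\varepsilon-1}hBf\rangle\Bigr) + \langle \Lambda^{-2\varepsilon}[g_0, \Lambda^{2\varepsilon-1}hB]Kf, \Lambda^{4\varepsilon-1}hBf\rangle$$
$$\equiv M_{40} + \sum_{i=1}^{n}(M_{i3} + M_{i4}) + M_{4K}.$$
The terms $M_{40}$ and $M_{4K}$ are estimated by expanding the commutator and power counting. The term $M_{i4}$ can be written as
$$|M_{i4}| = |\langle \Lambda^{-2\varepsilon}g_0[X_i^*, \Lambda^{2\varepsilon-1}hB]X_if, \Lambda^{4\varepsilon-1}hBf\rangle| \le \|\Lambda^{1-4\varepsilon}h^*\Lambda^{2\varepsilon-1}g_0[X_i^*, \Lambda^{2\varepsilon-1}hB]\|\,\|X_if\|\,\|\Lambda^{4\varepsilon-1}Bf\|.$$
It is then estimated by power counting, using moreover the induction hypothesis and the estimate (5.16). In order to estimate the term $M_{i3}$, we have to commute once more to write
$$M_{i3} = \langle \Lambda^{-2\varepsilon}g_0[X_i, \Lambda^{2\varepsilon-1}hB]X_i^*f, \Lambda^{4\varepsilon-1}hBf\rangle + \langle \Lambda^{-2\varepsilon}g_0\bigl[X_i^*, [X_i, \Lambda^{2\varepsilon-1}hB]\bigr]f, \Lambda^{4\varepsilon-1}hBf\rangle.$$
The first term is estimated exactly like $M_{i4}$. The second term can then be estimated by expanding the double commutator and power counting.

Term $M_5$. We write this term as
$$M_5 = \langle \Lambda^{-1}hBK_1f, \Lambda^{4\varepsilon-1}hBf\rangle + \langle \Lambda^{-2\varepsilon}[K_1, \Lambda^{2\varepsilon-1}hB]f, \Lambda^{4\varepsilon-1}hBf\rangle.$$
The first term is estimated using the induction hypothesis and the fact that (5.13) and (5.16) imply
$$K_1 \in \mathcal{Y}^1(\mathcal{A}_{-1}) \quad\Rightarrow\quad \|K_1f\| \le C(\|Kf\| + \|f\|). \tag{5.19}$$
The other term is estimated by using the fact that $[K_1, \Lambda^2] \in \mathcal{Y}^2(\mathcal{A})$ and $[K_1, hB] \in \mathcal{Y}^1(\mathcal{A})$, which follows from $\mathcal{A}_{-1} \subset \mathcal{A}$ and thus $K_1 \in \mathcal{Y}^1(\mathcal{A})$.

Term $M_6$. We use the explicit expression for $\mathrm{Re}(g_0K + K_1)$ to write this term as
$$M_6 = \sum_{i=1}^{n}\|g_0^{1/2}X_i\Lambda^{-1}Af\|^2 \le C\sum_{i=1}^{n}\|X_i\Lambda^{-1}Af\|^2.$$
These terms are easily estimated by putting the $X_i$ to the left of $f$, using (5.16) and estimating the commutators.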
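
The estimate of $M_3$ rests on the Cauchy-Schwarz inequality for the quadratic form of a positive operator. As a reminder, written in generic notation that is ours (a positive self-adjoint operator $P$ and vectors $u$, $v$ in the domain of $P^{1/2}$), it reads
$$|\langle P^{1/2}u, P^{1/2}v\rangle| \le \|P^{1/2}u\|\,\|P^{1/2}v\| = \langle u, Pu\rangle^{1/2}\,\langle v, Pv\rangle^{1/2}.$$
Taking $P = \mathrm{Re}(g_0K + K_1)$, $u = \Lambda^{2\varepsilon-1}hBf$ and $v = \Lambda^{-1}Af$ gives the first inequality in the bound on $|M_3|$. The second one then follows, presumably, from splitting $\langle u, Pu\rangle$ into the contributions of $g_0K$ and $K_1$ and using $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$ for $a, b \ge 0$.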


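Term $T_{B,5}$. This is the last term we have to estimate. Using the expression (5.13) and the positivity of $\mathrm{Re}(g_0K + K_1)$, it can be written in the form
$$T_{B,5} = \langle (\mathrm{Re}(g_0K + K_1))^{1/2}f, (\mathrm{Re}(g_0K + K_1))^{1/2}B^*h^*\Lambda^{2\varepsilon-2}Af\rangle + \langle K_2f, B^*h^*\Lambda^{2\varepsilon-2}Af\rangle \equiv N_1 + N_2.$$
These terms are now estimated separately.

Term $N_2$. We use the Cauchy-Schwarz inequality to write
$$|N_2| \le \|K_2f\|\,\|B^*h^*\Lambda^{2\varepsilon-2}Af\| \equiv \|K_2f\|\,\|N_3\|.$$
We can estimate $N_3$ by writing
$$B^*h^*\Lambda^{2\varepsilon-2}A = h^*\Lambda^{2\varepsilon-2}A\Lambda^{1-2\varepsilon}\Lambda^{2\varepsilon-1}B^* + [B^*, h^*\Lambda^{2\varepsilon-2}A],$$
and estimating the resulting terms using the induction hypothesis. We already noticed that we have the desired estimate for $\|K_2f\|$.

Term $N_1$. Using the Cauchy-Schwarz inequality, we write it as
$$N_1 \le \langle f, \mathrm{Re}(g_0K + K_1)f\rangle^{1/2}\,\langle \mathrm{Re}(g_0K + K_1)B^*h^*\Lambda^{2\varepsilon-2}Af, B^*h^*\Lambda^{2\varepsilon-2}Af\rangle^{1/2}$$
$$\le C(\|Kf\| + \|f\|)\,|\langle \Lambda^{-2\varepsilon}(g_0K + K_1)B^*h^*\Lambda^{2\varepsilon-2}Af, \Lambda^{2\varepsilon}B^*h^*\Lambda^{2\varepsilon-2}Af\rangle|^{1/2}$$
$$\equiv C(\|Kf\| + \|f\|)\sqrt{|\langle f_1 + f_2, f_3\rangle|} \le C(\|Kf\| + \|f\|)\sqrt{(\|f_1\| + \|f_2\|)\|f_3\|}.$$

Estimate of $\|f_3\|$. We write it as
$$f_3 = \Lambda^{2\varepsilon}h^*\Lambda^{2\varepsilon-2}A\Lambda^{1-4\varepsilon}\Lambda^{4\varepsilon-1}B^*f + \Lambda^{2\varepsilon}[B^*, h^*\Lambda^{2\varepsilon-2}A]f.$$
The first term is estimated by using the recurrence hypothesis and the fact that Hypothesis 2 implies $B^* \in \mathcal{Y}^1(\mathcal{A}_{j-1})$. The second term is estimated by power counting and by using the fact that $\varepsilon < 1/4$.

Estimate of $\|f_2\|$. We write it as
$$f_2 = \Lambda^{-2\varepsilon}B^*h^*\Lambda^{2\varepsilon-2}AK_1f + \Lambda^{-2\varepsilon}[K_1, B^*h^*\Lambda^{2\varepsilon-2}A]f.$$
The first term is estimated using the fact that $\|K_1f\| \le C(\|Kf\| + \|f\|)$ and power counting. The second term is simply estimated by power counting, and the fact that $K_1 \in \mathcal{Y}^1(\mathcal{A})$.

Estimate of $\|f_1\|$. We use the explicit form of $K$ to write $f_1$ as
$$f_1 = \Lambda^{-2\varepsilon}B^*h^*\Lambda^{2\varepsilon-2}Ag_0Kf + \Lambda^{-2\varepsilon}[g_0X_0, B^*h^*\Lambda^{2\varepsilon-2}A]f + \sum_{i=1}^{n}\Bigl(\Lambda^{-2\varepsilon}g_0X_i^*[X_i, B^*h^*\Lambda^{2\varepsilon-2}A]f + \Lambda^{-2\varepsilon}[g_0X_i^*, B^*h^*\Lambda^{2\varepsilon-2}A]X_if\Bigr)$$
$$\equiv Q_K + Q_0 + \sum_{i=1}^{n}(Q_{i,1} + Q_{i,2}).$$
These terms will now be estimated separately.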


Term $Q_K$. We notice that the operator $\Lambda^{-2\varepsilon}B^*h^*\Lambda^{2\varepsilon-2}Ag_0$ is bounded by power counting. This yields the desired estimate.

Term $Q_0$. This term is bounded by $C\|f\|$ by power counting, noticing that $g_0X_0 \in \mathcal{A}$.

Term $Q_{i,2}$. This term can be estimated by power counting if we expand the commutator and use the estimate (5.16).

Term $Q_{i,1}$. We use once more the trick that consists of putting the $X_i^*$ to the left of $f$. We therefore write
$$Q_{i,1} = \Lambda^{-2\varepsilon}g_0[X_i, B^*h^*\Lambda^{2\varepsilon-2}A]X_i^*f + \Lambda^{-2\varepsilon}g_0\bigl[X_i^*, [X_i, B^*h^*\Lambda^{2\varepsilon-2}A]\bigr]f.$$
The first term is estimated by using (5.16b) and expanding the commutator. The second term is estimated in a similar way by expanding the double commutator. We do not write the resulting terms here, because there are too many of them. They are all bounded by simple power counting and by using Lemma 5.6. This completes the proof of estimate (5.18).

It is now straightforward to prove the theorem. Recall that $R$ is the level up to which the $\mathcal{A}_i$ are defined. We put $\varepsilon = 1/4^{R+1}$, and we write:
$$\|\Lambda^{\varepsilon}f\|^2 = \langle f, \Lambda^{2\varepsilon-2}\Lambda^2f\rangle = \sum_{A \in \mathcal{A}}\langle f, \Lambda^{2\varepsilon-2}A^*Af\rangle = \sum_{A \in \mathcal{A}}\Bigl(\|\Lambda^{\varepsilon-1}Af\|^2 + \langle f, [A^*, \Lambda^{2\varepsilon-2}]Af\rangle\Bigr).$$
The first term in the sum is bounded by using (5.15), the second term by simple power counting. This finally completes the proof of Theorem 5.5. □

We next note a consequence of this theorem, namely a simple criterion to see if a quadratic differential operator has compact resolvent. It is an easy illustration of the technique that will be used in the sequel to show that $K$ has compact resolvent.

5.4. Quadratic differential operators.

Definition 5.8. An operator $A : D(A) \to \mathcal{H}$ is called accretive if it satisfies
$$\mathrm{Re}\langle f, Af\rangle \ge 0, \quad\text{for all } f \in D(A).$$
An operator $A$ is called quasi accretive if there exists $\lambda \in \mathbf{R}$ such that $A + \lambda$ is accretive. It is called strictly accretive if there exists $\lambda > 0$ such that $A - \lambda$ is still accretive. If $-A$ is accretive, $A$ is called dissipative. An operator $A$ is called m-accretive if it is accretive and if $(A + \lambda)^{-1}$ exists for all $\lambda > 0$ and satisfies $\|(A + \lambda)^{-1}\| \le \lambda^{-1}$. The expressions m-dissipative, quasi dissipative, etc. are defined similarly in an obvious way. An equivalent characterization of m-accretive operators is that they are accretive with no proper accretive extension. It is a classical result (see e.g. [Dav80]) that the quasi m-dissipative operators are precisely the generators of quasi-bounded semigroups. An immediate consequence is that if an operator $A$ is (quasi) m-accretive (m-dissipative), its adjoint $A^*$ is also (quasi) m-accretive (m-dissipative).
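
To see how accretivity and the resolvent bound in the definition of m-accretivity fit together, here is the standard one-line estimate. It is only a sketch: it shows that $A + \lambda$ is injective with an inverse bounded by $\lambda^{-1}$ on its range, whereas the surjectivity of $A + \lambda$ is the extra requirement contained in m-accretivity. For $f \in D(A)$ and $\lambda > 0$,
$$\lambda\|f\|^2 \le \mathrm{Re}\langle f, Af\rangle + \lambda\|f\|^2 = \mathrm{Re}\langle f, (A + \lambda)f\rangle \le \|f\|\,\|(A + \lambda)f\|, \qquad\text{hence}\qquad \|(A + \lambda)f\| \ge \lambda\|f\|.$$
Thus, whenever $(A + \lambda)^{-1}$ exists on all of $\mathcal{H}$, its norm is at most $\lambda^{-1}$.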


Proposition 5.9. Let $\mathcal{H}$ be a Hilbert space and $\mathcal{C}$ be a dense subset of $\mathcal{H}$. Let $K : D(K) \to \mathcal{H}$ be a quasi m-accretive (or quasi m-dissipative) operator and let $\Lambda^2 : D(\Lambda^2) \to \mathcal{H}$ be a self-adjoint positive operator such that $\mathcal{C} \subset D(\Lambda^2)$. Assume moreover that $\mathcal{C}$ is a core for $K$, that $\Lambda^2$ has compact resolvent and that there are constants $C > 0$ and $0 < \varepsilon < 2$ such that
$$\|\Lambda^{\varepsilon}f\| \le C(\|Kf\| + \|f\|), \quad\text{for all } f \in \mathcal{C}. \tag{5.20}$$
Then $K$ has compact resolvent too.

Proof. By assumption, there exists a constant $\lambda > 0$ such that $K + \lambda$ is strictly m-accretive. Moreover, (5.20) with $K$ replaced by $K + \lambda$ holds if we change the constant $C$. Since $\mathcal{C}$ is a core for $K$, a simple approximation argument shows that $D(K) \subset D(\Lambda^{\varepsilon})$ and that (5.20) holds for every $f \in D(K)$. This immediately implies that $(K + \lambda)^*(K + \lambda)$ has compact resolvent. Since $K + \lambda$ is strictly m-accretive, it is invertible and the operator
$$\bigl((K + \lambda)^*(K + \lambda)\bigr)^{-1} = (K + \lambda)^{-1}\bigl((K + \lambda)^{-1}\bigr)^*$$
is compact. Moreover, we know that $(K + \lambda)^{-1}$ is closed, so we can make the polar decomposition $(K + \lambda)^{-1} = PJ$, with $P$ self-adjoint and $J$ unitary. Thus $P^2$ is compact. By the spectral theorem and the characterization of compact operators, this immediately implies that $P$ is compact, and thus also that $PJ$ is compact. Thus $K$ has compact resolvent. □

We now consider $\mathcal{H} = L^2(\mathbf{R}^d)$ and $\mathcal{F} = \{\lambda I \mid \lambda \in \mathbf{R}\}$, where $I$ is the identity operator in $\mathcal{H}$. We define the formal expressions
$$x^T = (x_1, \ldots, x_d), \qquad \partial_x^T = (\partial_{x_1}, \ldots, \partial_{x_d}).$$
Let $A : \mathbf{R}^d \to \mathbf{R}^d$ be a linear map and
$$\mathcal{B} = \{b_i \in \mathbf{R}^d \mid i = 1, \ldots, s\}, \qquad \mathcal{C} = \{c_i \in \mathbf{R}^d \mid i = 1, \ldots, t\},$$
two vector families. Let us consider the differential operator $K$ defined as the closure on $C_0^\infty(\mathbf{R}^d)$ of
$$K = -\sum_{i=1}^{s}\partial_x^T b_i b_i^T \partial_x + \sum_{j=1}^{t}x^T c_j c_j^T x + x^T A\partial_x. \tag{5.21}$$
We are interested in giving a geometrical condition on $A$, $\mathcal{B}$ and $\mathcal{C}$ that implies the compactness of the resolvent of $K$, and therefore the discreteness of its spectrum. It is possible to prove that $K$ is quasi m-accretive. Just follow the proof of Proposition B.3, replacing $G(x)$ by $x^Tx$. We have the following result.


Proposition 5.10. A sufficient condition for the resolvent of the operator $K$ defined in (5.21) to be compact is that the vector families
$$\bigcup_{N \ge 0}(A^T)^N\mathcal{B} \quad\text{and}\quad \bigcup_{N \ge 0}A^N\mathcal{C} \tag{5.22}$$
span the whole space $\mathbf{R}^d$.

Remark 5.11. The intuitive meaning of this proposition is that we can apply Hörmander's criterion in both direct and Fourier space to obtain an estimate of the form
$$\|H^{\varepsilon}f\| \le C(\|Kf\| + \|f\|), \qquad H = -\partial_x^T\partial_x + x^Tx. \tag{5.23}$$
It is well known that $H$ has compact resolvent. By Proposition 5.9, (5.23) implies that $K$ has compact resolvent.

Proof. We have the following relations:
$$[x^TA\partial_x, b^T\partial_x] \equiv \sum_{i,j,k}[x_ia_{ij}\partial_{x_j}, b_k\partial_{x_k}] = \sum_{i,j,k}b_k[x_i, \partial_{x_k}]a_{ij}\partial_{x_j} = -\sum_{i,j,k}b_k\delta_{ki}a_{ij}\partial_{x_j} = -b^TA\partial_x,$$
$$[x^TA\partial_x, c^Tx] \equiv \sum_{i,j,k}[x_ia_{ij}\partial_{x_j}, c_kx_k] = \sum_{i,j,k}x_ia_{ij}[\partial_{x_j}, x_k]c_k = \sum_{i,j,k}x_ia_{ij}\delta_{jk}c_k = c^TA^Tx.$$
We take $g_0 = 1$, so we have $\mathcal{A}_0 = \mathcal{A}_{-1} \cup \{x^TA\partial_x\}$. We construct the remaining $\mathcal{A}_i$ by $B_i \equiv [x^TA\partial_x, B_{i-1}]$. It is very easy to verify Hypotheses 1 and 2, because the assumptions we made on $A$, $\mathcal{B}$ and $\mathcal{C}$ imply that $\mathcal{Y}^1(\mathcal{A})$ contains every operator of the form $b^T\partial_x$ or $c^Tx$. We have moreover
$$(b^T\partial_x)^* = -b^T\partial_x \quad\text{and}\quad (c^Tx)^* = c^Tx.$$
It is well-known that Hypothesis 3 concerning the essential self-adjointness of the $\Lambda^2$ constructed in Theorem 5.5 holds. Finally, it is straightforward that $\Lambda^2$ satisfies $\Lambda^2 \ge CH$, where $H$ is the "harmonic oscillator" defined in (5.23). This proves the validity of (5.23), and hence of the assertion. □

The interested reader may verify that Proposition 5.10 is quite stable under perturbations. A similar result indeed still holds when the coefficients $b_i$ and $c_i$ are not constants, but functions in $\mathcal{F}_0$. This is precisely what was proved in [EPR99a].
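
The span condition (5.22) can be checked mechanically for given data. The following is a minimal sketch and not part of the paper: the function names (`rank`, `orbit`, `satisfies_span_condition`) are ours, we read (5.22) as requiring each of the two families separately to span $\mathbf{R}^d$, and we rely on the Cayley-Hamilton theorem to truncate both unions at $N = d - 1$. Exact rational arithmetic is used so that the rank test is not affected by rounding.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next(
            (r for r in range(found, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def mat_vec(M, v):
    """Matrix-vector product M v for a square matrix given as a list of rows."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def orbit(M, family, d):
    """All vectors M^N v with v in `family` and 0 <= N <= d - 1.

    By Cayley-Hamilton, higher powers of M add nothing to the linear span.
    """
    out = []
    for v in family:
        w = list(v)
        for _ in range(d):
            out.append(w)
            w = mat_vec(M, w)
    return out


def satisfies_span_condition(A, B, C):
    """Our reading of (5.22): both families must span R^d separately."""
    d = len(A)
    spans_b = rank(orbit(transpose(A), B, d)) == d  # union of (A^T)^N B
    spans_c = rank(orbit(A, C, d)) == d             # union of A^N C
    return spans_b and spans_c


if __name__ == "__main__":
    # Hypothetical example with x = (q, p): x^T A d_x = p d_q - q d_p.
    A = [[0, -1], [1, 0]]
    print(satisfies_span_condition(A, B=[[0, 1]], C=[[1, 0]]))
    # With A = 0 nothing mixes the directions, so one vector cannot span R^2.
    print(satisfies_span_condition([[0, 0], [0, 0]], B=[[0, 1]], C=[[1, 0]]))
```

In the first example the drift $x^TA\partial_x$ rotates the single direction $b = (0, 1)$ into $(1, 0)$, which is the commutator mechanism used in the proof above; in the second example no such mixing takes place.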


6. Proof of the Bound in Momentum Space (Proposition 3.8)

This proposition is an application of Theorem 5.5. It is just a little bit cumbersome to verify the hypotheses of the theorem. In this section, the symbol $K$ will again denote the operator defined in (3.10). We choose $\mathcal{F} \equiv \mathcal{F}_0$, which is simply the set of bounded smooth functions with all their derivatives bounded. It is trivial to check that $\mathcal{F}$ is an algebra of closed operators. Moreover, they are all self-adjoint. We also define $\mathcal{D} \equiv C_0^\infty(\mathcal{X})$. In this section, we will first construct a set $\mathcal{A}$ according to the rules explained in Sect. 5. Then we will check that Hypotheses 1–3 are indeed satisfied, so we will be able to apply Theorem 5.5. This will prove Proposition 3.8 almost immediately.

Before we start this program, we write down once again the definition of $X_0$, as it will be used repeatedly throughout this section:
$$X_0 = -r_L\partial_{p_0} + b_L(r_L - \lambda_L^2q_0)\partial_{r_L} - r_R\partial_{p_N} + b_R(r_R - \lambda_R^2q_N)\partial_{r_R} - \sum_{i=0}^{N}\bigl(p_i\partial_{q_i} - V_1'(q_i)\partial_{p_i}\bigr) + \sum_{i=1}^{N}V_2'(\tilde q_i)\bigl(\partial_{p_i} - \partial_{p_{i-1}}\bigr) - \alpha_K.$$

6.1. Definition of $\mathcal{A}$.

We choose an exponent $\alpha < -3/2 - \ell/(2m)$ and we let $g_0$ be the operator of multiplication by $G^{\alpha}$. It is clear that $g_0$ is self-adjoint and positive. Moreover, we recall that
$$[X_0, G^{\alpha}] = \alpha G^{\alpha}G^{-1}[X_0, G] \in \mathcal{F}_0,$$
and so we have $[X_0, g_0] \in \mathcal{F}$. The set $\mathcal{A}_0$ is defined as
$$\mathcal{A}_0 = \{c_L\partial_{r_L},\ c_R\partial_{r_R},\ G^{\alpha}X_0\} \cup \bar{\mathcal{A}}, \quad\text{with}\quad \bar{\mathcal{A}} = \{a_L(r_L - \lambda_L^2q_0),\ a_R(r_R - \lambda_R^2q_N)\}.$$
Before we define the sets $\mathcal{A}_i$, we need a few functions. Let $i > 0$ be a natural number. The functions $V_L^{(i)}$ and $V_R^{(i)}$ are defined respectively by
$$V_L^{(i)}(\tilde q) = V_2''(\tilde q_i)V_2''(\tilde q_{i-1})\cdots V_2''(\tilde q_1), \qquad V_R^{(i)}(\tilde q) = V_2''(\tilde q_{N+1-i})\cdots V_2''(\tilde q_{N-1})V_2''(\tilde q_N).$$
It is useful to notice that
$$\partial_{q_j}V_L^{(i)}(\tilde q) = \begin{cases} 0, & \text{if } j > i, \\ V_2'''(\tilde q_1)V_2''(\tilde q_1)^{-1}V_L^{(i)}(\tilde q), & \text{if } j = 0, \\ V_2'''(\tilde q_i)V_2''(\tilde q_i)^{-1}V_L^{(i)}(\tilde q), & \text{if } j = i, \\ \bigl(V_2'''(\tilde q_i)V_2''(\tilde q_i)^{-1} - V_2'''(\tilde q_{i+1})V_2''(\tilde q_{i+1})^{-1}\bigr)V_L^{(i)}(\tilde q), & \text{otherwise}. \end{cases} \tag{6.1}$$


There are symmetric relations for the derivatives of $V_R^{(i)}$. At this point, we use Assumption 3 to write
$$\partial_{q_j}V_L^{(i)}(\tilde q) = f_{ij}(\tilde q)V_L^{(i)}(\tilde q), \qquad f_{ij} \in \mathcal{F}_{2m-2+\ell}. \tag{6.2}$$
This implies
$$[G^{\alpha}X_0, V_L^{(i)}(\tilde q)] = G^{\alpha}\sum_{j=0}^{N}p_jf_{ij}V_L^{(i)}(\tilde q) = f_iV_L^{(i)}(\tilde q), \qquad f_i \in \mathcal{F}, \tag{6.3}$$
because of Proposition 4.1 and by the choice $\alpha < -3/2 - \ell/(2m)$. Moreover, we notice that $G^{2i\alpha}V_R^{(i)} \in \mathcal{F}$, still because of Proposition 4.1. One more thing we have to remember is (4.10), which implies for example that there exists a function $f_0 \in \mathcal{F}$ such that
$$[G^{\alpha}X_0, G^{\beta}] = \beta f_0G^{\beta}, \quad\text{for any } \beta \in \mathbf{R}.$$
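
A plausible way to see where $f_0$ comes from is the following sketch. It is ours and does not reproduce (4.10): we assume only that the first-order part of $X_0$ acts as a derivation and that any zeroth-order part of $X_0$ commutes with multiplication operators. For smooth $G > 0$ and real $\beta$, the chain rule then gives
$$[X_0, G^{\beta}] = \beta G^{\beta-1}[X_0, G], \qquad [G^{\alpha}X_0, G^{\beta}] = G^{\alpha}[X_0, G^{\beta}] = \beta\bigl(G^{\alpha-1}[X_0, G]\bigr)G^{\beta}.$$
Under these assumptions one may take $f_0 = G^{\alpha-1}[X_0, G] = \alpha^{-1}[X_0, G^{\alpha}]$, which is bounded with bounded derivatives by the relation $[X_0, G^{\alpha}] \in \mathcal{F}_0$ recalled at the beginning of Sect. 6.1.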

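We are now ready to complete the construction of $\mathcal{A}$.

6.1.1. Definition of $\mathcal{A}_1$ and $\mathcal{A}_2$.

We verify that in the case of our model, we can find functions $f_B$ and $f_{XB}$ in (5.9) such that
$$\mathcal{A}_1 \setminus \mathcal{A}_0 = \{G^{\alpha}\partial_{p_0},\ G^{\alpha}\partial_{p_N}\}, \qquad \mathcal{A}_2 \setminus \mathcal{A}_1 = \{G^{2\alpha}\partial_{q_0},\ G^{2\alpha}\partial_{q_N}\}.$$
Considering the elements of $\mathcal{A}_1$, we see that it is indeed possible to write
$$G^{\alpha}\partial_{p_0} = c_L^{-1}[G^{\alpha}X_0, c_L\partial_{r_L}] - G^{-1}(\partial_{r_L}G)G^{\alpha}X_0 + G^{\alpha}b_L\partial_{r_L},$$
and a similar relation concerning $G^{\alpha}\partial_{p_N}$. The operators $G^{-1}(\partial_{r_L}G)$ and $G^{\alpha}b_Lc_L^{-1}$ belong to $\mathcal{F}$, so we succeeded in constructing $\mathcal{A}_1$ according to (5.9). Let us now focus on the elements of $\mathcal{A}_2$. We can write
$$G^{2\alpha}\partial_{q_0} = [G^{\alpha}X_0, G^{\alpha}\partial_{p_0}] - G^{\alpha-1}(\partial_{p_0}G)G^{\alpha}X_0 - \alpha f_0G^{\alpha}\partial_{p_0},$$
and an equivalent expression at the other end of the chain. Since $G^{\alpha-1}(\partial_{p_0}G) \in \mathcal{F}$ and $f_0 \in \mathcal{F}$, we succeeded in constructing $\mathcal{A}_2$ according to (5.9).

6.1.2. Definition of $\mathcal{A}_{2i-1}$ and $\mathcal{A}_{2i}$.

For $i \ge 1$, these sets are defined by
$$\mathcal{A}_{2i-1} \setminus \mathcal{A}_{2i-2} = \{G^{(2i+1)\alpha}V_L^{(i)}\partial_{p_i},\ G^{(2i+1)\alpha}V_R^{(i)}\partial_{p_{N-i}}\},$$
$$\mathcal{A}_{2i} \setminus \mathcal{A}_{2i-1} = \{G^{(2i+2)\alpha}V_L^{(i)}\partial_{q_i},\ G^{(2i+2)\alpha}V_R^{(i)}\partial_{q_{N-i}}\}.$$
We repeat this construction until $i = N - 1$, i.e. we do not stop at the middle of the chain, but we go on until we reach the other end. We want to check that these sets were constructed according to (5.9). In fact, we will see that any element $A$ of $\mathcal{A}_j \setminus \mathcal{A}_{j-1}$ with $j \ge 2$ can be written as
$$A = [G^{\alpha}X_0, B] + D, \qquad B \in \mathcal{A}_{j-1}, \qquad D \in \mathcal{Y}^1(\mathcal{A}_{j-1}). \tag{6.4}$$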


We will verify this only for $2 \le i \le N - 2$. We let the reader verify that (6.4) is also valid for the remaining sets.

Let us first take $j = 2i - 1$ and $A = G^{(2i+1)\alpha}V_L^{(i)}\partial_{p_i}$. We choose
$$B = G^{2i\alpha}V_L^{(i-1)}\partial_{q_{i-1}} \in \mathcal{A}_{j-1}$$
and write
$$[G^{\alpha}X_0, B] = f_{i-1}G^{2i\alpha}V_L^{(i-1)}\partial_{q_{i-1}} - G^{2i\alpha}V_L^{(i-1)}G^{-1}(\partial_{q_{i-1}}G)G^{\alpha}X_0 + 2i\alpha f_0B + G^{(2i+1)\alpha}V_L^{(i-1)}[X_0, \partial_{q_{i-1}}].$$
The first three terms belong to $\mathcal{Y}^1(\mathcal{A}_{2i-2})$ and can thus be absorbed into $D$. The last term can be written as
$$G^{(2i+1)\alpha}V_L^{(i-1)}[X_0, \partial_{q_{i-1}}] = G^{2\alpha}\bigl(V_1''(q_{i-1}) + V_2''(\tilde q_i) + V_2''(\tilde q_{i-1})\bigr)G^{(2i-1)\alpha}V_L^{(i-1)}\partial_{p_{i-1}} + G^{4\alpha}V_2''(\tilde q)V_2''(\tilde q)G^{(2i-3)\alpha}V_L^{(i-2)}\partial_{p_{i-2}} + G^{(2i+1)\alpha}V_L^{(i)}\partial_{p_i}.$$
The first two terms also belong to $\mathcal{Y}^1(\mathcal{A}_{2i-2})$, so they can be absorbed into $D$ as well. The remaining term is $G^{(2i+1)\alpha}V_L^{(i)}\partial_{p_i} = A$, thus we have verified that $A$ can be written as in (6.4). The procedure to get the symmetric term from the other end of the chain is similar.

We take now $j = 2i$ and $A = G^{(2i+2)\alpha}V_L^{(i)}\partial_{q_i}$. We choose
$$B = G^{(2i+1)\alpha}V_L^{(i)}\partial_{p_i} \in \mathcal{Y}^1(\mathcal{A}_{j-1})$$
and write
$$[G^{\alpha}X_0, B] = f_iG^{(2i+1)\alpha}V_L^{(i)}\partial_{p_i} + G^{(2i+1)\alpha}V_L^{(i)}G^{-1}(\partial_{p_i}G)G^{\alpha}X_0 + (2i+1)\alpha f_0B + G^{(2i+2)\alpha}V_L^{(i)}\partial_{q_i}.$$
The first three terms belong to $\mathcal{Y}^1(\mathcal{A}_{2i-1})$ and can be absorbed into $D$, so we verified that every element of $\mathcal{A}$ can indeed be written as in (5.9).

6.2. Verification of the hypotheses and proof.

In order to be able to apply Theorem 5.5, we verify Hypotheses 1–3.

Verification of Hypothesis 2. We want to check that $A \in \mathcal{A}_j$ implies $A^* \in \mathcal{Y}^1(\mathcal{A}_j)$. By Proposition 4.1, we can easily verify that $\mathcal{A} \setminus \bar{\mathcal{A}} \subset \mathcal{L}_0$. But we know that
$$A \in \mathcal{L}_0 \quad\Rightarrow\quad A^* = -A + g,$$
and so Hypothesis 2 holds for $\mathcal{A} \setminus \bar{\mathcal{A}}$. The elements of $\bar{\mathcal{A}}$ being self-adjoint, Hypothesis 2 holds trivially for them.

Verification of Hypothesis 3. The operator $\Lambda^2$ can be written as
$$\Lambda^2 = -\sum_{i,j}\partial_ia_{ij}(x)\partial_j + V(x).$$


It is well-known that if $a_{ij}$ and $V$ are sufficiently nice, such operators are essentially self-adjoint on $C_0^\infty(\mathcal{X})$ (see e.g. [Agm82, Thm. 3.2]).

Verification of Hypothesis 1. Let us define $\mathcal{L}_0 \subset \mathcal{L}$ as the set of first-order differential operators with coefficients in $\mathcal{F}_0$. We first verify that
$$A \in \mathcal{A},\ f \in \mathcal{F} \quad\Rightarrow\quad [A, f] \in \mathcal{F}.$$
This is trivial, noticing that $\mathcal{A} \subset \mathcal{L}_0 \cup \bar{\mathcal{A}}$, that $[\mathcal{L}_0, \mathcal{F}] \subset \mathcal{F}$ and that $[\bar{\mathcal{A}}, \mathcal{F}] = \{0\}$. We now verify that
$$A \in \mathcal{A} \quad\Rightarrow\quad A^* \in \mathcal{Y}^1(\mathcal{A}).$$
This is also trivial, because $A \in \mathcal{L}_0 \Rightarrow A^* = -A + g$, with $g \in \mathcal{F}_0$. Moreover, the elements of $\bar{\mathcal{A}}$ are self-adjoint. Finally, we want to verify that
$$A, B \in \mathcal{A} \quad\Rightarrow\quad [A, B] \in \mathcal{Y}^1(\mathcal{A}).$$
This takes a little longer to verify. Concerning the commutators of the elements of $\bar{\mathcal{A}}$ with the other elements of $\mathcal{A}$, the statement follows easily from the fact that if $F : \mathbf{R}^n \to \mathbf{R}$ is linear and $A \in \mathcal{L}_0$, then $[A, F] \in \mathcal{F}_0 \equiv \mathcal{F}$. Moreover, the commutator between two multiplication operators vanishes.

Concerning the commutators between the $\partial_r$ and the other elements, we notice that they commute with the functions $V_L^{(i)}(\tilde q)$ and $V_R^{(i)}(\tilde q)$. Moreover, we have for example
$$[\partial_{r_L}, G^{\gamma}] = \gamma G^{-1}[\partial_{r_L}, G]G^{\gamma}, \quad\text{if } \gamma \in \mathbf{R},$$
and $G^{-1}[\partial_{r_L}, G]$ belongs to $\mathcal{F}$. It is straightforward to verify that this implies the desired statement.

Concerning the commutators of $G^{\alpha}X_0$ with the other elements of $\mathcal{A}$, the statement has already been verified by the construction of $\mathcal{A}$ for every operator except those in $\mathcal{A}_{2N-2} \setminus \mathcal{A}_{2N-3}$. These operators are of the form
$$A = G^{2N\alpha}V_R^{(N-1)}\partial_{q_1},$$
and a similar term at the other end of the chain. We can make a computation very similar to the one we made when we constructed $\mathcal{A}_{2i-1}$, to show that
$$[G^{\alpha}X_0, A] = G^{(2N+1)\alpha}V_R^{(N)}\partial_{p_0} + C, \qquad C \in \mathcal{Y}^1(\mathcal{A}).$$
But $G^{2N\alpha}V_R^{(N)} \in \mathcal{F}$, so $[G^{\alpha}X_0, A] \in \mathcal{Y}^1(\mathcal{A})$.

It remains therefore only to verify the statement for commutators between elements of $\mathcal{A} \setminus \mathcal{A}_0$. We can divide these commutators into three classes.

Both operators contain a $\partial_p$. We notice that these operators can all be written in the form $G^{\alpha_i}W_i(q)\partial_{p_i}$. The commutator between two such elements is given by
$$[G^{\alpha_i}W_i(q)\partial_{p_i}, G^{\alpha_j}W_j(q)\partial_{p_j}] = G^{-1}(\partial_{p_i}G)G^{\alpha_i}W_i(q)G^{\alpha_j}W_j(q)\partial_{p_j} - G^{-1}(\partial_{p_j}G)G^{\alpha_j}W_j(q)G^{\alpha_i}W_i(q)\partial_{p_i}.$$
Both terms belong to $\mathcal{Y}^1(\mathcal{A})$, because $G^{-1}(\partial_pG) \in \mathcal{F}$.


One operator contains a $\partial_p$, one contains a $\partial_q$. Let us compute the commutator between $G^{(2i+2)\alpha}V_L^{(i)}\partial_{q_i}$ and $G^{(2j+1)\alpha}V_L^{(j)}\partial_{p_j}$. We have
$$[G^{(2i+2)\alpha}V_L^{(i)}\partial_{q_i}, G^{(2j+1)\alpha}V_L^{(j)}\partial_{p_j}] = G^{(2i+2)\alpha}V_L^{(i)}(\partial_{q_i}G)G^{-1}G^{(2j+1)\alpha}V_L^{(j)}\partial_{p_j} + G^{(2i+1)\alpha}V_L^{(i)}(\partial_{p_i}G)G^{-1}G^{(2i+2)\alpha}V_L^{(i)}\partial_{q_i} + G^{2i\alpha}V_L^{(i)}G^{2\alpha}f_{ij}G^{(2j+1)\alpha}V_L^{(j)}\partial_{p_j}.$$
All those terms belong to $\mathcal{Y}^1(\mathcal{A})$. The computation is similar if we take for example $G^{(2j+1)\alpha}V_R^{(j)}\partial_{p_{N-j}}$ instead of $G^{(2j+1)\alpha}V_L^{(j)}\partial_{p_j}$.

Both operators contain a $\partial_q$. The computation is similar to the preceding case and is left to the reader.

It is now easy to give the

Proof of Proposition 3.8. We have just verified that the hypotheses of Theorem 5.5 are satisfied. We apply it, so we have the estimate
$$\|\tilde\Delta^{\varepsilon}f\| \le C(\|Kf\| + \|f\|),$$
where $\tilde\Delta$ is given by
$$\tilde\Delta = 1 + \sum_{A \in \mathcal{A}}A^*A.$$
It is easy to see that $\tilde\Delta$ has exactly the form (3.12). This completes the proof of Proposition 3.8. □

7. Proof of Theorem 3.6

It is now possible to prove that the operator $K$ has compact resolvent, which is one of the main results of this paper. Before we start the proof itself, we need two preliminary results. The first one states

Lemma 7.1. Let $\tilde\Delta$ be the closure in $L^2(\mathbf{R}^n)$ of the operator acting on $C_0^\infty(\mathbf{R}^n)$ as
$$\tilde\Delta = \sum_{i=1}^{\bar N}L_i^*L_i + a_0,$$
where the $L_i$ are smooth vector fields with bounded coefficients spanning $\mathbf{R}^n$ at every point and $a_0$ is a smooth positive function. Let $V : \mathbf{R}^n \to \mathbf{R}$ be a continuous function such that for every constant $C > 0$, there exists a compact $K_C \subset \mathbf{R}^n$ with the property that $V(x) > C$ for every $x \in \mathbf{R}^n \setminus K_C$. We moreover assume that $V(x) \ge 1$. Define the operator $H$ as the closure in $L^2(\mathbf{R}^n)$ of the operator acting on $f \in C_0^\infty(\mathbf{R}^n)$ as
$$Hf(x) = \tilde\Delta f(x) + V(x)f(x).$$
Then the operator $H$ is self-adjoint.


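Suppose $V$ and the $L_i$ are such that the function
$$2a_0V + \sum_{i=1}^{\bar N}\Bigl((L_i^* + L_i)[L_i, V] - \bigl[L_i, [L_i, V]\bigr]\Bigr) \tag{7.1}$$
is bounded. We then have the estimate
$$\langle f, H^{\varepsilon}f\rangle \le \langle f, \tilde\Delta^{\varepsilon}f\rangle + \langle f, V^{\varepsilon}f\rangle + C\langle f, H^{\varepsilon-1}f\rangle, \qquad 0 < \varepsilon < 1, \tag{7.2}$$
which holds for any $f \in C_0^\infty(\mathbf{R}^n)$.

Proof. The result concerning the self-adjointness of $H$ and of $\tilde\Delta$ is classical, we will not prove it here. The interested reader can find a proof in [Agm82, Thm. 3.2]. We use the fact that if $T$ is a strictly positive self-adjoint operator and $\alpha = 1 - \varepsilon \in (0, 1)$, we can write
$$T^{-\alpha} = C_{\alpha}\int_0^\infty z^{-\alpha}(z + T)^{-1}\,dz, \qquad C_{\alpha} = \frac{\sin(\pi\alpha)}{\pi},$$
and thus
$$T^{\varepsilon} = C_{\alpha}\int_0^\infty z^{\varepsilon-1}\frac{T}{z + T}\,dz.$$
Moreover, a core of $T$ is again a core of $T^{\varepsilon}$, so (7.2) makes sense. For a proof of these statements, see [Kat80, §V.3]. This allows us to write inequality (7.2) as
$$\int_0^\infty z^{\varepsilon-1}\Bigl\langle f, \frac{H}{z + H}f\Bigr\rangle dz \le \int_0^\infty z^{\varepsilon-1}\Bigl\langle f, \frac{\tilde\Delta}{z + \tilde\Delta}f\Bigr\rangle dz + \int_0^\infty z^{\varepsilon-1}\Bigl\langle f, \frac{V}{z + V}f\Bigr\rangle dz + C\int_0^\infty z^{\varepsilon-1}\Bigl\langle f, \frac{1}{z + H}f\Bigr\rangle dz. \tag{7.3}$$
In order to prove (7.3), let us first show that the operator $\tilde\Delta V + V\tilde\Delta$ is lower bounded. This is an immediate consequence of (7.1) and the equality
$$L_i^*L_iV + VL_i^*L_i = 2L_i^*VL_i + (L_i + L_i^*)[L_i, V] - \bigl[L_i, [L_i, V]\bigr],$$
which is easily verified, using the fact that $L_i + L_i^*$ is simply a function. Therefore, there exists a constant $C > 0$ such that
$$\langle g, (\tilde\Delta V + V\tilde\Delta)g\rangle + C\langle g, g\rangle \ge 0, \qquad \forall g \in C_0^\infty(\mathbf{R}^n).$$
Since $H \ge 1$ in the sense of quadratic forms, we find
$$\langle g, (\tilde\Delta V + V\tilde\Delta)g\rangle + C\langle g, (z + H)g\rangle \ge 0,$$
which holds for every $z \ge 0$. Since $\tilde\Delta$ and $V$ are positive self-adjoint operators, this immediately implies
$$\Bigl\langle g, V\frac{\tilde\Delta}{z + \tilde\Delta}Vg\Bigr\rangle + \Bigl\langle g, \tilde\Delta\frac{V}{z + V}\tilde\Delta g\Bigr\rangle + \langle g, (\tilde\Delta V + V\tilde\Delta)g\rangle + C\langle g, (z + H)g\rangle \ge 0. \tag{7.4}$$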


We can easily check the following identities:
$$V\tilde\Delta(z + \tilde\Delta)^{-1}V + \tilde\Delta V = (z + H)\tilde\Delta(z + \tilde\Delta)^{-1}V,$$
$$\tilde\Delta V(z + V)^{-1}\tilde\Delta + V\tilde\Delta = (z + H)V(z + V)^{-1}\tilde\Delta.$$
Inserting this in (7.4), we get
$$\bigl\langle g, (z + H)\bigl(\tilde\Delta(z + \tilde\Delta)^{-1}V + V(z + V)^{-1}\tilde\Delta\bigr)g\bigr\rangle + C\langle g, (z + H)g\rangle \ge 0,$$
and thus
$$\langle g, (z + H)Hg\rangle \le \bigl\langle g, (z + H)\bigl(H + \tilde\Delta(z + \tilde\Delta)^{-1}V + V(z + V)^{-1}\tilde\Delta\bigr)g\bigr\rangle + C\langle g, (z + H)g\rangle.$$
We can check the equalities
$$\tilde\Delta(z + \tilde\Delta)^{-1}V + \tilde\Delta = \tilde\Delta(z + \tilde\Delta)^{-1}(z + H), \qquad V(z + V)^{-1}\tilde\Delta + V = V(z + V)^{-1}(z + H),$$
which allow us to write
$$\langle g, (z + H)Hg\rangle \le \bigl\langle (z + H)g, \bigl(\tilde\Delta(z + \tilde\Delta)^{-1} + V(z + V)^{-1}\bigr)(z + H)g\bigr\rangle + C\langle g, (z + H)g\rangle.$$
Let us define $f \equiv (z + H)g$. This immediately yields the estimate
$$\Bigl\langle f, \frac{H}{z + H}f\Bigr\rangle \le \Bigl\langle f, \frac{\tilde\Delta}{z + \tilde\Delta}f\Bigr\rangle + \Bigl\langle f, \frac{V}{z + V}f\Bigr\rangle + C\Bigl\langle f, \frac{1}{z + H}f\Bigr\rangle, \tag{7.5}$$
which holds for any $f$ in $\mathcal{W} \equiv (z + H)C_0^\infty(\mathbf{R}^n)$. But we know that $C_0^\infty(\mathbf{R}^n)$ is a core for $H$, therefore $\mathcal{W}$ is dense in $L^2(\mathbf{R}^n)$. Since the operators appearing in (7.5) are all bounded, the inequality (7.5) holds for every $f \in L^2(\mathbf{R}^n)$ and thus in particular also for $f \in C_0^\infty(\mathbf{R}^n)$. This implies the wanted estimate (7.3). □

The second result we want to use is

Proposition 7.2. Let $\tilde\Delta$, $V$ and $H$ be as in Lemma 7.1. Then $H$ has compact resolvent.

Proof. We know that $\tilde\Delta$ is a positive self-adjoint operator, so
$$T = (\tilde\Delta + 1)^{-1}$$
exists and $\|T\| \le 1$. The proof of compactness is a modification of the standard proof of the same theorem with $\tilde\Delta$ replaced by the true Laplacian $\Delta$, which can be found e.g. in [Agm82]. It is based on the fact that if $\chi$ is a function with compact support, then the multiplication operator $\chi$ is relatively compact with respect to $\tilde\Delta$. We want to prove that $\chi T$ is a compact operator, i.e. that the closure of
$$Y = \{\chi Tf \mid f \in C_0^\infty(\mathbf{R}^n) \text{ and } \|f\| \le 1\}$$
is compact.


Let us define $K = \operatorname{supp}\chi$. By hypothesis, $K$ is compact. Moreover, we have $Y \subset C_0^\infty(K)$. It is well-known that if $K$ is a compact domain of $\mathbf{R}^n$, then the set
$$\{u \in C_0^\infty(K) \mid \|u\| \le 1;\ \langle u, \Delta u\rangle \le 1\}$$
is compact (see e.g. [RS80, Thm. XIII.73]). This implies that $Y$ is compact if we are able to prove that there are strictly positive constants $\varepsilon$, $c_1$ and $c_2$ such that $u \in Y$ implies
$$\|u\| \le c_1 \quad\text{and}\quad \langle u, \Delta u\rangle \le c_2.$$
We take any element $u$ in $Y$ and write it as $u = \chi Tf$. We have
$$\|u\| \le \|\chi\|_\infty\|T\|\,\|f\| \le c_1.$$
Recall that we assumed the vector fields $L_i$ appearing in the construction of $\tilde\Delta$ span $\mathbf{R}^n$ at any point and that $a_0$ is a strictly positive function. Together with the compactness of the support of $u$, this implies that there are constants $C$ and $k_1$ such that
$$|\langle u, \Delta u\rangle| \le C|\langle u, \tilde\Delta u\rangle| = C|\langle u, \tilde\Delta\chi Tf\rangle| \le C\|u\|\,\|\tilde\Delta\chi Tf\| \le C\|\chi\tilde\Delta Tf\| + C\|[\tilde\Delta, \chi]Tf\| \le k_1 + C\|[\tilde\Delta, \chi]Tf\|, \tag{7.6}$$
where the last inequality is a consequence of $T = (1 + \tilde\Delta)^{-1}$. We therefore only need to bound the term containing the commutator of $\tilde\Delta$ and $\chi$. Explicit calculation yields
$$[\tilde\Delta, \chi] = \sum_{i=1}^{\bar N}\Bigl(-2[L_i, \chi]L_i + \bigl[L_i, [L_i, \chi]\bigr] + (L_i^* + L_i)[L_i, \chi]\Bigr) \equiv \sum_{i=1}^{\bar N}\eta_iL_i + \eta_0,$$
where the $\eta_i$ are bounded functions with $\operatorname{supp}\eta_i \subset K$. So the only terms that remain to be bounded are of the form $\|\eta_iL_iTf\|$. As $\eta_i$ is bounded, it is enough to bound $\|L_iTf\|$. We have
$$\|L_iTf\|^2 = \langle Tf, L_i^*L_iTf\rangle \le \langle Tf, \tilde\Delta Tf\rangle \le \|f\|^2. \tag{7.7}$$
This completes the proof of the statement about the relative compactness of $\chi$.

This implies that we can add to $H$ any function with compact support without changing its essential spectrum (see [RS80, Thm. XIII.14]). But the assumption we made concerning $V$ and the positivity of $\tilde\Delta$ imply that for any constant $C$, we can raise the spectrum of $H + \chi$ above $C$ by taking for $\chi$ a smooth function satisfying
$$\chi(x) = \begin{cases} C, & x \in K_C, \\ 0, & d(x, K_C) > 1. \end{cases}$$
Therefore, the essential spectrum of $H$ is empty and thus $H$ has compact resolvent. □

It is now easy to give the

Proof of Theorem 3.6. By Propositions 3.7 and 3.8, we can choose a constant $\varepsilon$ small enough to have, for every $f \in C_0^\infty(\mathcal{X})$, the estimates
$$\|\tilde\Delta^{\varepsilon}f\| \le C(\|Kf\| + \|f\|) \quad\text{and}\quad \|G^{\varepsilon}f\| \le C(\|Kf\| + \|f\|).$$


We moreover define
$$H \equiv \tilde\Delta + G. \tag{7.8}$$
By the proof of Proposition 3.8, we see that the assumptions of Lemma 7.1 are satisfied. We can thus write
$$\|H^{\varepsilon}f\|^2 = \langle f, H^{2\varepsilon}f\rangle = \langle f, (\tilde\Delta + G)^{2\varepsilon}f\rangle \le \langle f, \tilde\Delta^{2\varepsilon}f\rangle + \langle f, G^{2\varepsilon}f\rangle + C\|f\|^2 \le \|\tilde\Delta^{\varepsilon}f\|^2 + \|G^{\varepsilon}f\|^2 + C\|f\|^2 \le C(\|Kf\| + \|f\|)^2.$$
Because $G$ is confining, we can apply Proposition 7.2 to see that $H$, and therefore also $H^{\varepsilon}$, have compact resolvent. Therefore Proposition 5.9 applies, showing that $K$ has compact resolvent. □

Remark 7.3. The proof still works under slightly weaker assumptions. The coupling between the ends of the chain and the heat baths does not have to be of the dipolar type. It is enough for example that $F_L$ and $F_R$ belong to some $\mathcal{F}_{\beta}$ with $\beta < n$. Moreover, the potentials $V_1$ and $V_2$ can be different for each particle. We only have to impose that Assumptions 1–3 can be satisfied for every particle with the same constants $\ell$, $m$ and $n$.

Remark 7.4. Throughout this paper, we restricted ourselves to the one-dimensional case, i.e. each particle had only one degree of freedom. It is not very hard to generalize the results of this paper to the $d$-dimensional case. It is straightforward to generalize Assumptions 1 and 2, where $V'$ is now a vector. In Assumption 3, the inverse of $V_2''$ has to be read as the inverse matrix. A matrix or vector-valued function is said to belong to $\mathcal{F}_{\beta}$ if each of its components belongs to $\mathcal{F}_{\beta}$. The only point that could cause some trouble is the expression (6.1), because the $V_2''(\tilde q_j)$ are now matrices which do not commute, so the expression for $\partial_{q_j}V_L^{(i)}(\tilde q)$ will show terms of the form
$$V_2''(\tilde q_i)V_2''(\tilde q_{i-1})\cdots V_2'''(\tilde q_j)\cdots V_2''(\tilde q_1),$$
where $V_2'''$ is a trilinear form. Such a term can be written as
$$V_2''(\tilde q_i)V_2''(\tilde q_{i-1})\cdots V_2'''(\tilde q_j)\bigl(V_2''(\tilde q_{j+1})\bigr)^{-1}\cdots\bigl(V_2''(\tilde q_i)\bigr)^{-1}V_L^{(i)}.$$

If we want to get expressions similar to (6.2) and (6.3), we have to make |α| very big (of the order of N), but this is not a problem. Remark 7.5. One important assumption was that m > n, in other words, the interparticle coupling is stronger at infinity than the single particle potential. If this is not satisfied, our proof does not work. There could be some physical reason behind this. If a stationary state exists, this means that even if the chain is in a state of very high energy, the mean time to reach a region with low energy is finite (see e.g. [Has80]). But if m < n, the relative strength of the coupling versus the one-body potential goes to zero at high energy. The consequence is that there is almost no energy transmitted between particles. Since the only points where dissipation occurs are the ends of the chain, we see that the higher the energy of the chain is, the slower this energy will be dissipated. Probably this is not sufficient to destroy the existence of a stationary state, but it could explain why the proof does not work in this situation. It is even possible that this phenomenon destroys the compactness of the resolvent of K.
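
To make the heuristic of Remark 7.5 concrete, suppose purely for illustration that $V_1(q) = |q|^{2n}$ and $V_2(\tilde q) = |\tilde q|^{2m}$, and that all displacements are of a common size $R$. These model potentials are our assumption and are not the general hypotheses of the paper. The ratio of the coupling force to the on-site force then scales as
$$\frac{|V_2'(R)|}{|V_1'(R)|} = \frac{2mR^{2m-1}}{2nR^{2n-1}} = \frac{m}{n}R^{2(m-n)},$$
which tends to infinity as $R \to \infty$ when $m > n$ and to zero when $m < n$. In the second case the particles effectively decouple at high energy, which is the mechanism invoked in the remark.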


J.-P. Eckmann, M. Hairer

8. The Invariant Measure

This section is devoted to the proof of Theorem 3.9. Throughout this section, we denote by $T^t$ the semigroup generated by the system of stochastic differential equations (3.7). We also assume that Assumptions 1–3 are satisfied, so Propositions 3.7 and 3.8 hold, as well as Theorem 3.6. The proof of Theorem 3.9 is divided into three separate propositions, showing respectively the following properties of the invariant measure $\mu$:

(i) Existence and smoothness.
(ii) Decay properties.
(iii) Uniqueness and strict positivity.

Proposition 8.1. If Assumptions 1–3 are satisfied, the Markov process given by (3.7) possesses an invariant measure $\mu$. It has a density $h$, which is a $C^\infty$ function on $\mathbf{R}^{2N+4}$.

Proof. By Theorem 3.6, we know that $K$ has compact resolvent. This implies also the compactness of the resolvent of $L_H$ and thus of $L_0$. Since $G$ grows algebraically at infinity, we see that the constant function $1$ belongs to $\mathcal{H}_0$. Moreover, we notice that $L_0 1 = 0$, thus the operator $L_0$ has an eigenvalue $0$, which is isolated because of the compactness of its resolvent. This in turn implies that $L_0^*$ also has an isolated eigenvalue $0$. We denote the corresponding eigenvector by $g$ and normalize it so that $\langle g, 1\rangle_{\mathcal{H}_0} = 1$. Since $L_0^*$ is hypoelliptic, $g$ must be $C^\infty$. Assume first that $g \ge 0$. We then define
\[
h(p, q, r) = Z_0^{-1}\, g(p, q, r)\, e^{-2\beta_0 G}, \tag{8.1}
\]

where $Z_0$ is the normalization constant appearing in the definition of $\mathcal{H}_0$. Set $\mu(dx) = h(x)\,dx$; we want to check that $\mu$ is the invariant measure we are looking for. Notice that $\mu(dx)$ is a probability measure because
\[
\int \mu(dx) = Z_0^{-1}\int e^{-2\beta_0 G(x)} g(x)\,dx = \langle g, 1\rangle_{\mathcal{H}_0} = 1 .
\]
Let $A$ be a Borel set of $\mathbf{R}^{2N+4}$. Then the characteristic function $\chi_A$ of $A$ belongs to $\mathcal{H}_0$. We have
\[
\bigl((T^t)^*\mu\bigr)(A) = \int T^t\chi_A(x)\,\mu(dx) = Z_0^{-1}\int e^{-2\beta_0 G(x)} g(x)\, T^t\chi_A(x)\,dx
= Z_0^{-1}\int e^{-2\beta_0 G(x)} \bigl((T_0^t)^* g\bigr)(x)\,\chi_A(x)\,dx = \mu(A),
\]
thus $\mu$ is an invariant measure for the Markov process defined by (3.7). The argument showing that it was indeed justified to assume $g$ positive can be taken over from [EPR99a, Prop. 3.6]. □

We next turn to the decay properties of the invariant measure $h$. We first introduce a convenient family of Hilbert spaces.

Definition 8.2. Choose $\gamma \in \mathbf{R}$. We define the Hilbert space $W^{(\gamma)}$ as
\[
W^{(\gamma)} \equiv L^2(\mathcal{X}, G^{2\gamma}(x)\,dx) = D(G^{\gamma}).
\]

Non-Equilibrium Statistical Mechanics of Strongly Anharmonic Chains


We will denote by $\langle\cdot,\cdot\rangle_{(\gamma)}$ and $\|\cdot\|_{(\gamma)}$ the corresponding scalar product and norm. We also define
\[
W^{(\infty)} \equiv \bigcap_{\gamma>0} W^{(\gamma)},
\]
which is the set of all functions that decay at infinity faster than any polynomial. We already know that $h$ is a $C^\infty$ function, so we want to show that it is possible to write
\[
h(p, q, r) = \tilde h(p, q, r)\, e^{-\beta_0 G(p,q,r)}, \qquad \tilde h \in W^{(\infty)} .
\]
The function $\tilde h$ being an eigenfunction of the operator $K$, the decay properties of the invariant measure are a consequence of the following result.

Proposition 8.3. The eigenfunctions of $K$ and $K^*$ belong to $C^\infty(\mathcal{X}) \cap W^{(\infty)}$.

We will show Proposition 8.3 only for the eigenfunctions of $K$. It is a simple exercise left to the reader to retrace the proof for the eigenfunctions of $K^*$. We already know that $K$ and $K^*$ are hypoelliptic, so their eigenfunctions belong to $C^\infty(\mathcal{X})$. It remains to be proven that they also belong to $W^{(\infty)}$. To prove the proposition, we will show the implication
\[
f \in W^{(\gamma)} \quad\text{and}\quad Kf \in W^{(\gamma)} \quad\Rightarrow\quad f \in W^{(\gamma+\varepsilon)}, \tag{8.2}
\]

which immediately implies that the eigenvectors of $K$ belong to $W^{(\infty)}$. For this purpose, we introduce the family of operators $K_\gamma$ defined by
\[
K_\gamma : D(K_\gamma) \to W^{(\gamma)}, \qquad f \mapsto Kf,
\]
where $D(K_\gamma)$ is given by
\[
D(K_\gamma) = \{ f \in W^{(\gamma)} \mid Kf \in W^{(\gamma)} \}.
\]
The expression $Kf$ has to be understood in the sense of distributions. We have the following preliminary result.

Lemma 8.4. $C_0^\infty(\mathcal{X})$ is a core for $K_\gamma$.

Proof. The proof uses the tools developed in Appendix B and is postponed to Appendix C. □

The key lemma for the proof of Proposition 8.3 is the following.

Lemma 8.5. There are an $\varepsilon > 0$ and constants $C_\gamma > 0$ such that for every $\gamma > 0$ and every $u \in D(K_\gamma)$, the relation
\[
\|G^{\varepsilon} u\|^2_{(\gamma)} \le C_\gamma \bigl( \|K_\gamma u\|^2_{(\gamma)} + \|u\|^2_{(\gamma)} \bigr) \tag{8.3}
\]
holds.


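Proof. Since we know that $C_0^\infty(\mathcal{X})$ is a core for $K_\gamma$, it suffices to show (8.3) for $u \in C_0^\infty(\mathcal{X})$. Let $L$ be the first-order differential operator associated to a divergence-free vector field. Then we have for $f, g \in C_0^\infty$,
\[
\langle Lf, g\rangle_{(\gamma)} = -\langle f, L G^{2\gamma} g\rangle = -\langle f, G^{2\gamma} L g\rangle - 2\gamma \langle f, G^{2\gamma} G^{-1}(LG) g\rangle
= -\langle f, Lg\rangle_{(\gamma)} - 2\gamma \langle f, G^{-1}(LG) g\rangle_{(\gamma)} .
\]
We write this symbolically as $L_\gamma^* = -L_\gamma - 2\gamma G^{-1}(LG)$. We can use the latter equality to show that there are constants $c_\gamma^{(1)}$ and $c_\gamma^{(2)}$ such that
\[
-\frac{(L_\gamma)^2 + (L_\gamma^*)^2}{2} = L_\gamma^* L_\gamma + c_\gamma^{(1)} G^{-2}(LG)^2 + c_\gamma^{(2)} G^{-1}(L^2 G).
\]
Using the explicit form of $K$, this in turn yields the useful relation
\[
\mathrm{Re}\,\langle u, Ku\rangle_{(\gamma)} = c_L^2 \|\partial_{r_L} u\|^2_{(\gamma)} + c_R^2 \|\partial_{r_R} u\|^2_{(\gamma)}
+ a_L^2 \|(r_L - \lambda_L q_0) u\|^2_{(\gamma)} + a_R^2 \|(r_R - \lambda_R q_N) u\|^2_{(\gamma)} + \langle u, f_K u\rangle_{(\gamma)}, \tag{8.4}
\]
where $f_K$ is some bounded function. We now have the tools to prove the validity of (8.3). We use Proposition 3.7 to write
\[
\|u\|^2_{(\gamma+\varepsilon)} = \|G^{\varepsilon} G^{\gamma} u\|^2 \le C\bigl(\|K G^{\gamma} u\|^2 + \|G^{\gamma} u\|^2\bigr)
\le C\bigl(\|G^{\gamma} K u\|^2 + \|[K, G^{\gamma}] u\|^2 + \|G^{\gamma} u\|^2\bigr).
\]
An explicit computation yields
\[
[K, G^{\gamma}] u = G^{\gamma}\bigl( f_L \partial_{r_L} + f_R \partial_{r_R} + f_0 \bigr) u,
\]
for some smooth bounded functions $f_L$, $f_R$ and $f_0$. We are thus able to write
\[
\|u\|^2_{(\gamma+\varepsilon)} \le C\bigl( \|Ku\|^2_{(\gamma)} + \|u\|^2_{(\gamma)} + c_L^2 \|\partial_{r_L} u\|^2_{(\gamma)} + c_R^2 \|\partial_{r_R} u\|^2_{(\gamma)} \bigr). \tag{8.5}
\]
Using (8.4), we can write
\[
c_L^2 \|\partial_{r_L} u\|^2_{(\gamma)} + c_R^2 \|\partial_{r_R} u\|^2_{(\gamma)} \le |\mathrm{Re}\,\langle u, Ku\rangle_{(\gamma)}| + C\|u\|^2_{(\gamma)}
\le C\bigl( \|Ku\|^2_{(\gamma)} + \|u\|^2_{(\gamma)} \bigr).
\]
This, together with (8.5), completes the proof of the assertion. □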
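The symbolic identity relating $(L_\gamma)^2 + (L_\gamma^*)^2$ to $L_\gamma^* L_\gamma$ in the proof of Lemma 8.5 can be checked by a short computation. What follows is only a sketch, under the assumption that $L$ is a real first-order operator, so that $a \equiv 2\gamma G^{-1}(LG)$ acts as multiplication by a real function and $L^* = -L - a$ in $W^{(\gamma)}$. The symbol $a$ and the explicit values read off for the two constants are introduced here for illustration and do not appear in the original text.

```latex
% Assumption: L^* = -L - a, with a = 2\gamma G^{-1}(LG) a multiplication operator,
% and (La) denoting the function obtained by applying L to a.
\begin{align*}
(L^*)^2 &= (L + a)^2 = L^2 + 2aL + (La) + a^2, \\
L^* L &= -L^2 - aL, \\
-\tfrac12\bigl(L^2 + (L^*)^2\bigr)
  &= -L^2 - aL - \tfrac12\bigl((La) + a^2\bigr)
   = L^* L - \tfrac12\bigl((La) + a^2\bigr), \\
(La) &= 2\gamma\bigl(G^{-1}(L^2 G) - G^{-2}(LG)^2\bigr), \qquad
a^2 = 4\gamma^2\, G^{-2}(LG)^2, \\
-\tfrac12\bigl(L^2 + (L^*)^2\bigr)
  &= L^* L + (\gamma - 2\gamma^2)\, G^{-2}(LG)^2 - \gamma\, G^{-1}(L^2 G).
\end{align*}
```

Under these assumptions one would read off $c_\gamma^{(1)} = \gamma - 2\gamma^2$ and $c_\gamma^{(2)} = -\gamma$; only the existence of such constants is used in the proof.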

Proof of Proposition 8.3. Lemma 8.5 immediately shows that $D(K_\gamma) \subset W^{(\gamma+\varepsilon)}$ for every $\gamma > 0$. This proves the assertion (8.2). Let $f$ be an eigenfunction of $K$. We know that $f \in L^2(\mathcal{X})$ and, because it is an eigenvector of $K$, we have $Kf \in L^2$. Thus, by (8.2), $f \in W^{(\varepsilon)}$. Of course $Kf \in W^{(\varepsilon)}$ as well, so $f \in W^{(2\varepsilon)}$. This can be continued ad infinitum, and so we have $f \in W^{(\infty)}$, which is the desired result. □

Finally, we want to show the strict positivity and the uniqueness of the invariant measure. The proof of this result will only be sketched, as it simply retraces the proof of Theorem 3.6 in [EPR99b].


Proposition 8.6. The density $h$ of the invariant measure $\mu$ is a strictly positive function. Moreover, the invariant measure is unique.

Sketch of proof. The idea is to show that the control system associated with the stochastic differential equation (3.7) is strongly completely controllable. This means that, given an initial condition $x_0$, a time $\tau$ and an endpoint $x_\tau$, it is possible to find a realization of the Wiener process $w$ such that $\xi(\tau; x_0, w) = x_\tau$. The main assumption needed to show this is that the gradient of the two-body potential is a diffeomorphism. This is ensured by Assumption 3. The consequence is that, for every time $\tau$, every initial condition $x_0$ and every open set $U$, the transition probability $P(\tau, x_0, U)$ is strictly positive. Because $\mu$ is invariant, we have
\[
\mu(U) = \int P(t, x, U)\,\mu(dx) > 0 .
\]
This implies the strict positivity of $h$. Uniqueness follows from an elementary ergodicity argument. □

A. Proof of Lemma 5.6

Throughout this appendix, we will make use of the same notations as in Sect. 5, i.e. $\mathcal{H} = L^2(\mathbf{R}^n)$, $\mathcal{D} = C_0^\infty(\mathbf{R}^n)$ and $\mathscr{D}$ is the set of differential operators with smooth coefficients. Moreover, $\mathcal{A}$ denotes some finite subset of $\mathscr{D}$ and is identified with closed operators on $\mathcal{H}$. The operator $\Lambda^2$ is defined as
\[
\Lambda^2 \equiv 1 + \sum_{A\in\mathcal{A}} A^* A . \tag{A.1}
\]
We will moreover assume that Hypotheses 1 and 3 concerning $\mathcal{A}$ and $\mathcal{F}$ hold, i.e. $A, B \in \mathcal{A}$ and $f \in \mathcal{F}$ imply
\[
[A, B] \in \mathcal{Y}^1(\mathcal{A}), \qquad A^* \in \mathcal{Y}^1(\mathcal{A}), \qquad [A, f] \in \mathcal{F}. \tag{A.2}
\]

In order to prepare the proof of Lemma 5.6, we need a few auxiliary results.

Lemma A.1. Let $\mathcal{A}$, $\mathcal{F}$, $\mathcal{D}$ and $\Lambda$ be as above and assume Hypotheses 1 and 3 hold. Then, if $A \in \mathcal{Y}^j_{\mathcal{F}}(\mathcal{A})$, the operator $A\Lambda^{-j}$ is bounded.

The proof of this lemma will be a consequence of

Lemma A.2. Let $\mathcal{A}$, $\mathcal{F}$, $\mathcal{D}$ and $\Lambda$ be as above and assume Hypotheses 1 and 3 hold. Then, if $A_1, A_2 \in \mathcal{A}$, the operators $A_1\Lambda^{-1}$ and $A_1A_2\Lambda^{-2}$ are bounded.

Proof. Let us show first that $A_1\Lambda^{-1}$ is bounded. Since $\mathcal{D}$ is a core for $\Lambda$, it suffices to show that there is a constant $C$ such that
\[
\|A_1 f\|^2 \le C\|\Lambda f\|^2 \qquad \forall f \in \mathcal{D}.
\]
This is an immediate consequence of
\[
\|\Lambda f\|^2 = \|f\|^2 + \sum_{A\in\mathcal{A}} \|Af\|^2 .
\]


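In order to show that $A_1A_2\Lambda^{-2}$ is bounded, we will show that there are constants $\tau$ and $C$ such that
\[
\|A_1A_2 f\|^2 \le C\|\Lambda^2 f + (\tau - 1) f\|^2 . \tag{A.3}
\]
We can write the following equality:
\begin{align*}
\|(\Lambda^2 - 1) f + \tau f\|^2 &= \tau^2\|f\|^2 + 2\tau \sum_{A\in\mathcal{A}} \|Af\|^2 + \sum_{A,B\in\mathcal{A}} \langle f, A^*A B^*B f\rangle \\
&= \tau^2\|f\|^2 + 2\tau \sum_{A\in\mathcal{A}} \|Af\|^2 + \sum_{A,B\in\mathcal{A}} \bigl( \|ABf\|^2 + \langle f, [A^*A, B^*]B f\rangle \bigr).
\end{align*}
We can write the operator intervening in the last term as
\[
[A^*A, B^*]B = A^*[A, B^*]B + [B, A]^* AB .
\]
Because of Hypothesis 1, this implies that there are positive constants $C_{ABC}$ such that
\[
\|(\Lambda^2 - 1) f + \tau f\|^2 \ge \tau^2\|f\|^2 + 2\tau \sum_{A\in\mathcal{A}} \|Af\|^2 + \sum_{B,C\in\mathcal{A}} \|BCf\|^2 - \sum_{A,B,C\in\mathcal{A}} C_{ABC}\,\|Af\|\,\|BCf\| .
\]
If we use now
\[
2xy \le x^2 s^2 + \frac{y^2}{s^2}, \qquad x, y \ge 0, \quad s > 0,
\]
we see that we can choose $\tau$ big enough to have
\[
\|(\Lambda^2 - 1) f + \tau f\|^2 \ge \tau^2\|f\|^2 + \frac12 \sum_{A\in\mathcal{A}} \|Af\|^2 + \frac12 \sum_{B,C\in\mathcal{A}} \|BCf\|^2 .
\]
This immediately implies (A.3). □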
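The choice of $\tau$ in the proof of Lemma A.2 can be made explicit. The following is a sketch only: the symbols $N$ (the number of elements of the finite set $\mathcal{A}$) and $\bar C$ (the largest of the constants $C_{ABC}$), as well as the particular choice of $s$, are assumptions introduced for illustration and are not in the original text.

```latex
% Assumptions (illustration only): N = |\mathcal{A}|, \bar C = \max_{A,B,C} C_{ABC},
% and s^2 = \bar C N in the inequality 2xy \le x^2 s^2 + y^2 / s^2.
\begin{align*}
\sum_{A,B,C} C_{ABC}\,\|Af\|\,\|BCf\|
 &\le \frac{\bar C}{2} \sum_{A,B,C} \Bigl( s^2 \|Af\|^2 + s^{-2} \|BCf\|^2 \Bigr) \\
 &= \frac{\bar C N^2 s^2}{2} \sum_{A} \|Af\|^2 + \frac{\bar C N}{2 s^2} \sum_{B,C} \|BCf\|^2 \\
 &= \frac{\bar C^2 N^3}{2} \sum_{A} \|Af\|^2 + \frac12 \sum_{B,C} \|BCf\|^2 .
\end{align*}
% The stated lower bound then holds as soon as 2\tau - \bar C^2 N^3 / 2 \ge 1/2,
% i.e. \tau \ge (1 + \bar C^2 N^3)/4, and it gives (A.3) with C = 2:
\[
\|A_1 A_2 f\|^2 \le \sum_{B,C} \|BCf\|^2 \le 2\,\|\Lambda^2 f + (\tau - 1) f\|^2 .
\]
```

Any larger $\tau$ works equally well; only the existence of some admissible $\tau$ matters for the boundedness of $A_1A_2\Lambda^{-2}$.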

This lemma can now be used to prove Lemma A.1.

Proof of Lemma A.1. We want to show that $A \in \mathcal{Y}^i_{\mathcal{F}}(\mathcal{A})$ implies $A\Lambda^{-i}$ bounded. We already treated the cases $i = 1$ and $i = 2$. For the other cases, we proceed by induction. Let us fix $j > 2$ and assume the assertion has been proved for $i < j$. Then the operators of the form
\[
A_1A_2 \cdots A_j \Lambda^{-j}, \qquad A_i \in \mathcal{A}, \tag{A.4}
\]
are bounded. We distinguish two cases.

$j = 2n$. We write the operator of (A.4) as
\[
A_1A_2\Lambda^{-2} \cdot \Lambda^{2}A_3A_4\Lambda^{-4} \cdots \Lambda^{2n-2}A_{2n-1}A_{2n}\Lambda^{-2n}.
\]
We show that operators of the form
\[
\Lambda^{2m-2}AB\Lambda^{-2m}, \qquad A, B \in \mathcal{A}, \quad m \le n,
\]


are bounded. We write
\[
\Lambda^{2m-2}AB\Lambda^{-2m} = AB\Lambda^{-2} + [\Lambda^{2m-2}, AB]\,\Lambda^{-(2m-1)}\Lambda^{-1}.
\]
The first term is bounded by Lemma A.2. The second term is bounded by noticing that $[\Lambda^{2m-2}, AB] \in \mathcal{Y}^{2m-1}_{\mathcal{F}}(\mathcal{A})$ and using the induction hypothesis.

$j = 2n + 1$. We write the operator of (A.4) as
\[
A_1A_2\Lambda^{-2} \cdot \Lambda^{2}A_3A_4\Lambda^{-4} \cdots \Lambda^{2n}A_{2n+1}\Lambda^{-2n-1}.
\]
The first terms are bounded exactly the same way as before. Concerning the last term, we have
\[
\Lambda^{2n}A_{2n+1}\Lambda^{-2n-1} = A_{2n+1}\Lambda^{-1} + [\Lambda^{2n}, A_{2n+1}]\,\Lambda^{-2n}\Lambda^{-1},
\]
which is bounded by Lemma A.2 and the induction hypothesis, noticing that the commutator belongs to $\mathcal{Y}^{2n}(\mathcal{A})$. This completes the proof of the lemma. □

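We need another result from [EPR99a].

Lemma A.3. Let $\{A(z)\} \subset \mathcal{B}(\mathcal{H})$ be a family of uniformly bounded operators, $\Lambda \ge 1$ a self-adjoint operator and let $F(\lambda, z)$ be a real, positive bounded function. Then
\[
\Bigl\| \int_0^\infty A(z)\, F(\Lambda, z) f \,dz \Bigr\| \le \sup_{y\ge 0} \|A(y)\|\,\|f\| \int_0^\infty \sup_{\lambda\ge 1} F(\lambda, z)\,dz, \qquad \forall f \in \mathcal{H}. \tag{A.5}
\]
If furthermore $A = A(z)$ is independent of $z$, one has the bound
\[
\Bigl\| A \int_0^\infty F(\Lambda, z) f \,dz \Bigr\| \le \|A\|\,\|f\| \sup_{\lambda\ge 1} \int_0^\infty F(\lambda, z)\,dz, \qquad \forall f \in \mathcal{H}. \tag{A.6}
\]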

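Lemma A.4. Let $\Lambda$, $\mathcal{F}$ and $\mathcal{A}$ be as above and assume Hypotheses 1 and 3 hold. If $X \in \mathcal{Y}^j_{\mathcal{F}}(\mathcal{A})$, then the operators
\[
\Lambda^{\beta} X \Lambda^{\gamma} \qquad \text{with } \beta + \gamma \le -j
\]
are bounded. If $Y \in \mathscr{L}$ is such that $[Y, \Lambda^2] \in \mathcal{Y}^j_{\mathcal{F}}(\mathcal{A})$, then the operators
\[
\Lambda^{\beta} [\Lambda^{\alpha}, Y] \Lambda^{\gamma} \qquad \text{with } \alpha + \beta + \gamma \le 2 - j
\]
are bounded. If $X, Y \in \mathscr{L}$ are such that
\[
[X, \Lambda^2] \in \mathcal{Y}^j_{\mathcal{F}}(\mathcal{A}), \qquad [Y, \Lambda^2] \in \mathcal{Y}^k_{\mathcal{F}}(\mathcal{A}) \qquad\text{and}\qquad \bigl[[\Lambda^2, X], Y\bigr] \in \mathcal{Y}^{j+k-2}_{\mathcal{F}}(\mathcal{A}),
\]
then the operators
\[
\Lambda^{\beta} \bigl[[\Lambda^{\alpha}, X], Y\bigr] \Lambda^{\gamma} \qquad \text{with } \alpha + \beta + \gamma \le 4 - j - k
\]
are bounded.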


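Proof. Let us prove the first assertion. The case $\gamma = 0$ is handled by noticing that
\[
\Lambda^{\beta} X = \Lambda^{\beta+j} \bigl( X^* \Lambda^{-j} \bigr)^*,
\]
and that both operators of the latter product are bounded by Lemma A.1. The case $\beta = 0$ is handled in the same way by considering the adjoint. The proof for the other cases follows exactly [EPR99a].

We will demonstrate the techniques involved by proving the third assertion, assuming the first two assertions hold. The second assertion can be proved in a similar way without using the third one. We will first assume that $\alpha \in (-2, 0)$. In this case, we can write (see e.g. [Kat80, § V.3.11])
\[
\Lambda^{\alpha} = C_\alpha \int_0^\infty z^{\alpha/2} (z + \Lambda^2)^{-1}\,dz, \qquad C_\alpha = -\frac{\sin(\pi\alpha/2)}{\pi}. \tag{A.7}
\]
We notice moreover that it is possible to write
\begin{align*}
\bigl[[(z + \Lambda^2)^{-1}, X], Y\bigr] &= (z + \Lambda^2)^{-1} \bigl[[\Lambda^2, X], Y\bigr] (z + \Lambda^2)^{-1} \\
&\quad + (z + \Lambda^2)^{-1} [\Lambda^2, X] (z + \Lambda^2)^{-1} [\Lambda^2, Y] (z + \Lambda^2)^{-1} \tag{A.8} \\
&\quad + (z + \Lambda^2)^{-1} [\Lambda^2, Y] (z + \Lambda^2)^{-1} [\Lambda^2, X] (z + \Lambda^2)^{-1}.
\end{align*}
If we substitute the expression (A.7) in $\Lambda^{\beta} \bigl[[\Lambda^{\alpha}, X], Y\bigr] \Lambda^{\gamma}$ and use (A.8), we get three terms, which we call $T_1$, $T_2$ and $T_3$, and which will be estimated separately.

Term $T_1$. This term is given by
\[
T_1 = C_\alpha \int_0^\infty z^{\alpha/2}\, \frac{\Lambda^{\beta}}{z + \Lambda^2} \bigl[[\Lambda^2, X], Y\bigr] \frac{\Lambda^{\gamma}}{z + \Lambda^2}\,dz .
\]
We define $B = \bigl[[\Lambda^2, X], Y\bigr] \in \mathcal{Y}^{j+k-2}(\mathcal{A})$ and write
\[
T_1 = C_\alpha \int_0^\infty z^{\alpha/2}\, \Lambda^{\beta} B \frac{\Lambda^{\gamma}}{(z + \Lambda^2)^2}\,dz + C_\alpha \int_0^\infty z^{\alpha/2}\, \frac{\Lambda^{\beta}}{z + \Lambda^2} [\Lambda^2, B] \frac{\Lambda^{\gamma}}{(z + \Lambda^2)^2}\,dz \equiv C_\alpha (T_{11} + T_{12}).
\]
The term $T_{11}$ is estimated by writing, for any $f \in \mathcal{H}$,
\begin{align*}
\|T_{11} f\| &= \Bigl\| \Lambda^{\beta} B \Lambda^{2-j-k-\beta} \int_0^\infty z^{\alpha/2}\, \frac{\Lambda^{\gamma+\beta+j+k-2}}{(z + \Lambda^2)^2}\, f \,dz \Bigr\| \\
&\le \|f\|\, \bigl\| \Lambda^{\beta} B \Lambda^{2-j-k-\beta} \bigr\| \sup_{\lambda\ge 1} \int_0^\infty z^{\alpha/2}\, \frac{\lambda^{\gamma+\beta+j+k-2}}{(z + \lambda^2)^2}\,dz \\
&= \|f\|\, \bigl\| \Lambda^{\beta} B \Lambda^{2-j-k-\beta} \bigr\| \sup_{\lambda\ge 1} \lambda^{\alpha+\gamma+\beta+j+k-4} \int_0^\infty \frac{s^{\alpha/2}}{(s + 1)^2}\,ds .
\end{align*}
Since the assumption yields $B \in \mathcal{Y}^{j+k-2}(\mathcal{A})$, the norm is bounded. The integral is also bounded because, by assumption, we have $\alpha + \gamma + \beta \le 4 - j - k$. To bound $T_{12}$, we observe that $[\Lambda^2, B] \in \mathcal{Y}^{j+k-1}(\mathcal{A})$. Using (A.6), we find the bound
\begin{align*}
\|T_{12} f\| &= \Bigl\| \int_0^\infty z^{\alpha/2}\, \frac{\Lambda^{\beta}}{z + \Lambda^2} [\Lambda^2, B] \Lambda^{3-j-k-\beta}\, \frac{\Lambda^{\gamma+\beta+j+k-3}}{(z + \Lambda^2)^2}\, f \,dz \Bigr\| \\
&\le \|f\| \sup_{y>0} \Bigl\| \frac{\Lambda^{\beta}}{y + \Lambda^2} [\Lambda^2, B] \Lambda^{3-j-k-\beta} \Bigr\| \int_0^\infty z^{\alpha/2} \sup_{\lambda\ge 1} \frac{\lambda^{\gamma+\beta+j+k-3}}{(z + \lambda^2)^2}\,dz .
\end{align*}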


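This expression is bounded when $\alpha + \beta + \gamma \le 4 - j - k$ and $\alpha \in (-2, 0)$. This can be seen by making as before the substitution $z \mapsto \lambda^2 s$. Before we go on, we introduce the notation $\Lambda_z \equiv (z + \Lambda^2)^{-1}$.

Term $T_2$. This term is given by
\[
T_2 = C_\alpha \int_0^\infty z^{\alpha/2}\, \frac{\Lambda^{\beta}}{z + \Lambda^2}\, A\, \frac{1}{z + \Lambda^2}\, B\, \frac{\Lambda^{\gamma}}{z + \Lambda^2}\,dz,
\]
where we defined
\[
A = [\Lambda^2, X] \qquad\text{and}\qquad B = [\Lambda^2, Y].
\]
Since $[\Lambda_z, B] = \Lambda_z [B, \Lambda^2] \Lambda_z$, the term appearing under the integral can be written as
\[
\Lambda^{\beta} \Lambda_z A \Lambda_z B \Lambda_z \Lambda^{\gamma} = \Lambda^{\beta} \Lambda_z A B \Lambda_z^2 \Lambda^{\gamma} + \Lambda^{\beta} \Lambda_z A \Lambda_z [B, \Lambda^2] \Lambda_z^2 \Lambda^{\gamma}.
\]
According to this, the term $T_2$ is split into two terms $T_{21}$ and $T_{22}$. We have
\[
\|T_{21} f\| \le \|f\| \sup_{y>0} \bigl\| \Lambda^{\beta} \Lambda_y A B \Lambda^{-\beta-j-k} \bigr\| \int_0^\infty s^{\alpha/2} \sup_{\lambda\ge 1} \frac{\lambda^{\alpha+\beta+\gamma+j+k-4}}{(s + 1)^2}\,ds .
\]
The integral is bounded by hypothesis. The norm is also bounded, because $AB \in \mathcal{Y}^{j+k}(\mathcal{A})$. For the second term, we have
\[
\|T_{22} f\| \le \|f\| \sup_{y>0} \bigl\| \Lambda^{\beta} \Lambda_y A \Lambda_y [\Lambda^2, B] \Lambda^{-\beta-j-k} \bigr\| \int_0^\infty s^{\alpha/2} \sup_{\lambda\ge 1} \frac{\lambda^{\alpha+\beta+\gamma+j+k-4}}{(s + 1)^2}\,ds .
\]
This is bounded in the same fashion, noticing that
\[
\sup_{y>0} \bigl\| \Lambda^{\beta} \Lambda_y A \Lambda_y [\Lambda^2, B] \Lambda^{-\beta-j-k} \bigr\| \le \sup_{x>0} \bigl\| \Lambda^{\beta} \Lambda_x A \Lambda^{-\beta-j} \bigr\| \times \sup_{y>0} \Bigl\| \frac{\Lambda^2}{y + \Lambda^2}\, \Lambda^{j+\beta-2} [\Lambda^2, B] \Lambda^{-\beta-j-k} \Bigr\| .
\]
Term $T_3$. It can be bounded in the same way as $T_2$ by symmetry.

We now have to check the assertion for the other values of $\alpha$. If $\alpha = 0$ or $\alpha = 2$, it holds trivially. For $\alpha > 0$, we proceed by induction, using the equality
\[
\bigl[[\Lambda^{\alpha+2}, X], Y\bigr] = \Lambda^2 \bigl[[\Lambda^{\alpha}, X], Y\bigr] + \Lambda^{\alpha} \bigl[[\Lambda^2, X], Y\bigr] + [\Lambda^{\alpha}, Y][\Lambda^2, X] + [\Lambda^2, Y][\Lambda^{\alpha}, X]. \tag{A.9}
\]
For $\alpha = -2$, the assertion is proved using equality (A.8) with $z = 0$. For $\alpha < -2$, we also proceed by induction, using (A.9) with $2$ replaced by $-2$. This completes the proof of Lemma 5.6. □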
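Two scalar computations underlie the estimates in the proof above: the value of the constant $C_\alpha$ in (A.7) and the scaling $z = \lambda^2 s$ used for $T_{11}$. The sketch below spells them out for a number $\lambda \ge 1$ in place of the operator $\Lambda$, which is how the spectral theorem is applied. The exponent $\kappa$ is a placeholder introduced here for illustration (in the estimate of $T_{11}$ it stands for $\gamma + \beta + j + k - 2$), and the value of the first integral is the standard formula $\int_0^\infty s^{\mu-1}(1+s)^{-1}\,ds = \pi/\sin(\pi\mu)$ for $0 < \mu < 1$.

```latex
% Scalar check of (A.7) and of the scaling step, for \lambda \ge 1 and \alpha \in (-2,0).
% Substitution: z = \lambda^2 s, dz = \lambda^2 ds; \kappa is a hypothetical exponent.
\begin{align*}
\int_0^\infty \frac{z^{\alpha/2}}{z + \lambda^2}\,dz
  &= \lambda^{\alpha} \int_0^\infty \frac{s^{\alpha/2}}{s + 1}\,ds
   = \lambda^{\alpha}\, \frac{\pi}{\sin\bigl(\pi(1 + \alpha/2)\bigr)}
   = -\frac{\pi}{\sin(\pi\alpha/2)}\, \lambda^{\alpha}, \\
\int_0^\infty \frac{z^{\alpha/2}\, \lambda^{\kappa}}{(z + \lambda^2)^2}\,dz
  &= \lambda^{\alpha + \kappa - 2} \int_0^\infty \frac{s^{\alpha/2}}{(s + 1)^2}\,ds .
\end{align*}
```

The first line reproduces $C_\alpha = -\sin(\pi\alpha/2)/\pi$. In the second line the remaining integral converges for $\alpha \in (-2, 0)$, so the supremum over $\lambda \ge 1$ is finite exactly when $\alpha + \kappa - 2 \le 0$; with $\kappa = \gamma + \beta + j + k - 2$ this is the condition $\alpha + \beta + \gamma \le 4 - j - k$.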


B. Proof of Proposition 2.4

Proposition B.1. $T^t$, as defined in (2.10), extends uniquely to a quasi-bounded strongly continuous semigroup on $L^2(\mathcal{X}, dx)$. Its generator $L$ acts like $\mathcal{L}$ on functions in $C_0^\infty(\mathcal{X})$.

Proof. See the proof of Lemma A.1 in [EPR99a]. □

We now turn to the question of the domain of the generator $L$. Recall that $\mathscr{L}$ is the set of all formal expressions of the form
\[
\sum_{|l|\le k} a_l(x) D^l, \qquad k \ge 0, \quad a_l \in C^\infty(\mathbf{R}^n).
\]
To any element $\mathcal{L} \in \mathscr{L}$ having the above form, we associate its formal adjoint $\mathcal{L}^* \in \mathscr{L}$ in an obvious way. In the sequel, the notation $\langle f, g\rangle$ will be used to denote the scalar product in $L^2$ if $f, g \in L^2$ and the evaluation $f(g)$ if $f$ is a distribution and $g \in C_0^\infty(\mathbf{R}^n)$. We hope this slight ambiguity will not be too misleading. We associate to every $\mathcal{L} \in \mathscr{L}$ the operator $T_{\mathcal{L}} : D(T_{\mathcal{L}}) \to L^2(\mathbf{R}^n)$ by
\[
T_{\mathcal{L}} f(x) = \mathcal{L} f(x) \qquad\text{and}\qquad D(T_{\mathcal{L}}) = \{ f \in L^2 \mid \mathcal{L} f \in L^2 \},
\]
where $\mathcal{L} f$ has to be understood in the sense of distributions, i.e. $\mathcal{L} f(g) \equiv f(\mathcal{L}^* g)$ for all $g \in C_0^\infty(\mathbf{R}^n)$. We also define the operator $S_{\mathcal{L}} : D(S_{\mathcal{L}}) \to L^2(\mathbf{R}^n)$ by $S_{\mathcal{L}} = T_{\mathcal{L}} \restriction C_0^\infty$. The operators $S_{\mathcal{L}}$ and $T_{\mathcal{L}}$ are usually called the minimal operator and the maximal operator, respectively, constructed from the formal operator $\mathcal{L}$. The following result is classical, so we do not give its proof here:

Proposition B.2. For every $\mathcal{L} \in \mathscr{L}$, we have $T_{\mathcal{L}}^* = S_{\mathcal{L}^*}$ and $S_{\mathcal{L}}^* = T_{\mathcal{L}^*}$. In particular, this shows that $T_{\mathcal{L}}$ is closed. □

We prove now the quasi m-dissipativity of $S_{\mathcal{L}}$. We define
\[
\tilde{\mathcal{L}} \equiv \mathcal{L} - \sum_{i=1}^{M} \gamma_i - 1 .
\]
By definition, if $S_{\tilde{\mathcal{L}}}$ is strictly m-dissipative, $S_{\mathcal{L}}$ is quasi m-dissipative. It is well known that an equivalent characterization of strict m-dissipativity is that (a) $S_{\tilde{\mathcal{L}}}$ is strictly dissipative and (b) $\mathrm{Range}(S_{\tilde{\mathcal{L}}}) = \mathcal{H}$.

Proposition B.3. Assume Assumption 0 holds. Then $S_{\tilde{\mathcal{L}}}$ is strictly m-dissipative.

Remark B.4. It is clear that the statement holds if we consider the minimal operator in $L^2(K, dx)$, where $K$ is some compact domain of $\mathcal{X}$. The idea is to approximate $\mathcal{X}$ by a sequence of increasing compact domains and to control the rest terms. This proposition fills a gap in [EPR99a], since the statement “$\mathrm{Re}\,(f, L^* f) = -\frac12 \|\sigma^T \nabla f\|^2 + (f, \mathrm{div}\, b\, f) \le B\|f\|^2$” in the proof of Lemma A.1 is not justified for every $f \in D(L^*)$.


Fig. B.1. The function $\varphi(x)$

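Proof. Property (a) is immediate. By the closed-range theorem, property (b) is equivalent to the statement

(b') $f \in L^2$ and $\tilde{\mathcal{L}}^* f = 0$ imply $f = 0$.

Assume on the contrary that there exists a non-vanishing function $f \in L^2$ for which $\tilde{\mathcal{L}}^* f = 0$ holds in the sense of distributions. Since $\tilde{\mathcal{L}}^*$ is hypoelliptic, $f$ must be a $C^\infty$ function. Let us choose some function $\varphi \in C_0^\infty(\mathbf{R}_+)$ such that $\varphi(x) = 1$ if $x \in [0, 1]$. We also define
\[
\varphi_n : \mathcal{X} \to \mathbf{R}, \qquad x \mapsto \varphi\bigl(G(x)/n\bigr).
\]
By assumption, $\tilde{\mathcal{L}}^* f = 0$, so we have
\[
0 = 2\,\mathrm{Re}\,\langle \varphi_n f, \tilde{\mathcal{L}}^* f\rangle = \langle \varphi_n f, \tilde{\mathcal{L}}^* f\rangle + \langle \tilde{\mathcal{L}}^* f, \varphi_n f\rangle .
\]
Since $\varphi_n \in C_0^\infty$ and all the other functions are $C^\infty$, we can make all the formal manipulations we want. In particular, we have
\[
\langle \tilde{\mathcal{L}}^* f, \varphi_n f\rangle = \langle f, \tilde{\mathcal{L}} \varphi_n f\rangle \quad\Rightarrow\quad \bigl\langle f, (\varphi_n \tilde{\mathcal{L}}^* + \tilde{\mathcal{L}} \varphi_n) f \bigr\rangle = 0 . \tag{B.1}
\]
Recall that $\tilde{\mathcal{L}}$ is given by
\begin{align*}
\tilde{\mathcal{L}} &= \sum_{i=1}^{M} \lambda_i^2 \gamma_i T_i\, \partial_{r_i}^2 - \sum_{i=1}^{M} \gamma_i \bigl( r_i - \lambda_i^2 F_i(p, q) \bigr) \partial_{r_i} + X^{H_S} - \sum_{i=1}^{M} r_i X^{F_i} - \sum_{i=1}^{M} \gamma_i - 1 \\
&\equiv \sum_{i=1}^{M} \zeta_i\, \partial_{r_i}^2 + Y_0 - 1 . \tag{B.2}
\end{align*}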


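Straightforward computation yields
\begin{align*}
\varphi_n \tilde{\mathcal{L}}^* + \tilde{\mathcal{L}} \varphi_n &= 2 \sum_{i=1}^{M} \zeta_i\, \partial_{r_i} \varphi_n \partial_{r_i} + \sum_{i=1}^{M} \zeta_i \bigl( \partial_{r_i}^2 \varphi_n \bigr) + [Y_0, \varphi_n] - \varphi_n \\
&= 2 \sum_{i=1}^{M} \zeta_i\, \partial_{r_i} \varphi_n \partial_{r_i} + \sum_{i=1}^{M} \zeta_i \Bigl( \frac{1}{n} (\partial_{r_i}^2 G)\, \varphi''(G/n) + \frac{1}{n^2} (\partial_{r_i} G)^2\, \varphi'(G/n) \Bigr) \tag{B.3} \\
&\quad + \frac{1}{n} \sum_{i=1}^{M} \frac{\gamma_i}{\lambda_i^2} \bigl( r_i - \lambda_i^2 F_i(p, q) \bigr)^2 \varphi'(G/n) - \varphi_n \\
&\equiv 2 \sum_{i=1}^{M} \zeta_i\, \partial_{r_i} \varphi_n \partial_{r_i} + \Phi_n - \varphi_n .
\end{align*}
Using Assumption 0, we next verify that $|\Phi_n(x)| \le \tilde C$ for all $x \in \mathcal{X}$ and for all $n \ge 1$. We define
\[
c_1 \equiv \sup_{x\ge 0} \varphi''(x) \qquad\text{and}\qquad c_2 \equiv \sup_{x\ge 0} x\varphi'(x).
\]
An elementary computation shows that Assumption 0 implies that there are constants $c_3, \ldots, c_5 > 0$ for which
\[
\bigl| \partial_{r_i}^2 G(x) \bigr| \le c_3, \qquad \bigl| \partial_{r_i} G(x) \bigr|^2 \le c_4\, G(x), \qquad\text{and}\qquad \bigl| r_i - \lambda_i^2 F_i(p, q) \bigr|^2 \le c_5\, G(p, q, r).
\]
We thus have
\begin{align*}
|\Phi_n(x)| &\le \sum_{i=1}^{M} \Bigl[ \zeta_i \Bigl( \frac{c_3}{n}\, \varphi''(G/n) + \frac{c_4}{n}\, (G/n)\varphi'(G/n) \Bigr) + \frac{\gamma_i c_5}{\lambda_i^2}\, (G/n)\varphi'(G/n) \Bigr] \\
&\le \sum_{i=1}^{M} \Bigl[ \zeta_i\, \frac{c_1 c_3 + c_2 c_4}{n} + \frac{\gamma_i c_2 c_5}{\lambda_i^2} \Bigr] \le \tilde C,
\end{align*}
as asserted. Moreover, the first part of Assumption 0 implies that there exist constants $C, \alpha > 0$ such that
\[
\mathrm{supp}\, \Phi_n \subset \{ x \in \mathcal{X} \mid \|x\|^{\alpha} \ge n/C \}. \tag{B.4}
\]
Substituting (B.3) back into (B.1), we get
\[
0 = -2 \sum_{i=1}^{M} \zeta_i \bigl\| \sqrt{\varphi_n}\, \partial_{r_i} f \bigr\|^2 - \bigl\| \sqrt{\varphi_n}\, f \bigr\|^2 + \int_{\mathcal{X}} \Phi_n(x) |f(x)|^2\,dx .
\]
Since $f \in L^2(\mathcal{X})$, one has
\[
\lim_{n\to\infty} \bigl\| \sqrt{\varphi_n}\, f \bigr\|^2 = \|f\|^2 . \tag{B.5}
\]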


Moreover, the uniform boundedness of $\Phi_n$ together with property (B.4) imply that
\[
\lim_{n\to\infty} \int_{\mathcal{X}} \Phi_n(x) |f(x)|^2\,dx = 0 .
\]
This supplies the required contradiction to (B.5), thus establishing the strict m-dissipativity of $S_{\tilde{\mathcal{L}}}$. □

We complete now the

Proof of Proposition 2.4. It only remains to be proved that $L = S_{\mathcal{L}}$ and that $L^* = S_{\mathcal{L}^*}$. It is clear that the generator $L$ of $T^t$ satisfies $S_{\mathcal{L}} \subset L$. Since $S_{\mathcal{L}}$ is quasi m-dissipative, i.e. has no proper quasi dissipative extension, and since the generator of a quasi-bounded semigroup is always quasi m-dissipative, we must have $L = S_{\mathcal{L}}$. Concerning the adjoint, we have by Proposition B.2, $L^* = T_{\mathcal{L}^*}$. It is possible to retrace the above argument for $\mathcal{L}^*$ to show that $S_{\mathcal{L}^*}$ is quasi m-dissipative. Since $L^*$ is also quasi m-dissipative and $S_{\mathcal{L}^*} \subset L^*$, we must have $L^* = S_{\mathcal{L}^*}$. □

C. Proof of Lemma 8.4

Using the technique developed in Appendix B, we can now turn to the proof of Lemma 8.4. Recall that $K$ is given by (3.10) and that $W^{(\gamma)} = L^2(\mathcal{X}, G^{2\gamma}\,dx)$. Moreover, $K_\gamma$ is the maximal operator constructed from $K$ when considering it as a differential operator in $W^{(\gamma)}$. We have

Proposition C.1. $C_0^\infty(\mathcal{X})$ is a core for $K_\gamma$.

Proof. We introduce the unitary operator $U : W^{(\gamma)} \to L^2(\mathcal{X})$ defined by $Uf(x) = G^{\gamma}(x) f(x)$. We also define $K_\gamma^0 \equiv K_\gamma \restriction C_0^\infty(\mathcal{X})$. The operators $K_\gamma$ and $K_\gamma^0$ are unitarily equivalent to the operators $\tilde K_\gamma$ and $\tilde K_\gamma^0$ respectively by the following relations:
\[
\begin{array}{ccc}
D(K_\gamma) & \xrightarrow{\;K_\gamma\;} & W^{(\gamma)} \\
U \downarrow\uparrow U^{-1} & & U \downarrow\uparrow U^{-1} \\
D(\tilde K_\gamma) & \xrightarrow{\;\tilde K_\gamma\;} & L^2(\mathcal{X})
\end{array}
\qquad\qquad
\begin{array}{ccc}
D(K_\gamma^0) & \xrightarrow{\;K_\gamma^0\;} & W^{(\gamma)} \\
U \downarrow\uparrow U^{-1} & & U \downarrow\uparrow U^{-1} \\
D(\tilde K_\gamma^0) & \xrightarrow{\;\tilde K_\gamma^0\;} & L^2(\mathcal{X})
\end{array}
\]

By construction, $\tilde K_\gamma$ is maximal. Thus, by Proposition B.2, its adjoint $\tilde K_\gamma^*$ is minimal. It is immediate that the formal expressions for $\tilde K_\gamma^*$ and $\tilde K_\gamma^0$ are given by
\[
\tilde K_\gamma^* = G^{-\gamma} K^* G^{\gamma} \qquad\text{and}\qquad \tilde K_\gamma^0 = G^{\gamma} K G^{-\gamma}.
\]
It is now a simple exercise to retrace the proof of Proposition B.3 to see that $\tilde K_\gamma^*$ and $\tilde K_\gamma^0$ are both m-accretive. The remark of Sect. 2.2 concerning the adjoints of m-accretive operators implies that $\tilde K_\gamma$ is also m-accretive. Since $\tilde K_\gamma^0 \subset \tilde K_\gamma$, we must have $\tilde K_\gamma^0 = \tilde K_\gamma$ and thus $K_\gamma^0 = K_\gamma$. This proves the assertion. □


Acknowledgement. We have profited from helpful discussions with M. Mantoiu, C.-A. Pillet, L. Rey-Bellet, and J. Rougemont. This work was partially supported by the Fonds National Suisse.

References

[Agm82] Agmon, S.: Lectures on Exponential Decay of Solutions of Second-Order Elliptic Equations. Princeton, NJ: Princeton University Press, 1982
[Dav80] Davies, E.B.: One-Parameter Semigroups. London: Academic Press, 1980
[EPR99a] Eckmann, J.-P., Pillet, C.-A., and Rey-Bellet, L.: Non-Equilibrium Statistical Mechanics of Anharmonic Chains Coupled to Two Heat Baths at Different Temperatures. Commun. Math. Phys. 201, 657–697 (1999)
[EPR99b] Eckmann, J.-P., Pillet, C.-A., and Rey-Bellet, L.: Entropy Production in Non-Linear, Thermally Driven Hamiltonian Systems. J. Stat. Phys. 95, 305–331 (1999)
[Has80] Has'minskiĭ, R.Z.: Stochastic Stability of Differential Equations. Alphen aan den Rijn, The Netherlands: Sijthoff & Noordhoff, 1980
[Hör67] Hörmander, L.: Hypoelliptic Second Order Differential Equations. Acta Math. 119, 147–171 (1967)
[Hör85] Hörmander, L.: The Analysis of Linear Partial Differential Operators I–IV. New York: Springer, 1985
[Kat80] Kato, T.: Perturbation Theory for Linear Operators. New York: Springer, 1980
[LS77] Lebowitz, J.L. and Spohn, H.: Stationary Non-Equilibrium States of Infinite Harmonic Systems. Commun. Math. Phys. 54, 97–120 (1977)
[RS80] Reed, M. and Simon, B.: Methods of Modern Mathematical Physics I–IV. San Diego, CA: Academic Press, 1980
[Yos80] Yosida, K.: Functional Analysis. 6th ed., New York: Springer, 1980

Communicated by J. L. Lebowitz

Commun. Math. Phys. 212, 165 – 189 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Proof of the Symmetry of the Off-Diagonal Hadamard/Seeley–deWitt’s Coefficients in $C^\infty$ Lorentzian Manifolds by a “Local Wick Rotation”

Valter Moretti

Department of Mathematics, Trento University, 38050 Povo (TN), Italy. E-mail: [email protected]

Received: 13 September 1999 / Accepted: 12 January 2000

Abstract: Completing the results achieved in a previous paper, we prove the symmetry of Hadamard/Seeley–deWitt off-diagonal coefficients in smooth $D$-dimensional Lorentzian manifolds. This result is relevant because it plays a central rôle in Physics, in particular in the theory of the stress-energy tensor renormalization procedure in quantum field theory in curved spacetime. To this end, it is shown that, in any Lorentzian manifold, a sort of “local Wick rotation” of the metric can be performed provided the metric is a (locally) analytic function of the coordinates and the coordinates are appropriate. No time-like Killing field is necessary. Such a local Wick rotation analytically continues the Lorentzian metric in a neighborhood of any point (more generally, in a neighborhood of a space-like (Cauchy) hypersurface) into a Riemannian metric. The continuation locally preserves geodesically convex neighborhoods. In order to make the procedure rigorous, the concept of a complex pseudo-Riemannian (not Hermitian or Kählerian) manifold is introduced and some features are analyzed. Using these tools, the symmetry of Hadamard/Seeley–deWitt off-diagonal coefficients is proven in Lorentzian analytic manifolds by analytic continuation of the (symmetric) Riemannian heat-kernel coefficients. This continuation is performed in geodesically convex neighborhoods common to both metrics. Then, the symmetry is generalized to $C^\infty$ non-analytic Lorentzian manifolds by approximating Lorentzian $C^\infty$ metrics by analytic metrics in common geodesically convex neighborhoods.

1. Introduction, Generalities and Summary of Previous Results

1.1. In a previous paper [Mo99c] we have considered the problem of the symmetry of heat-kernel/Seeley–deWitt coefficients, taken off-diagonal, for a second order differential operator $A_0$ defined in a manifold $M$. As is well known [Wa78, Wa94], that symmetry property ensures the validity of some physically very important requirements (e.g. the conservation along the motion) of the quantum stress-energy tensor in quantum field theory in curved spacetime, whenever such a tensor is renormalized by means of


the “point-splitting” procedure. In [Mo99c], we considered the Euclidean case whereas, within this paper, we want to deal with the Lorentzian case, which is much more interesting on physical grounds. From now on, $M$ denotes a (real, Hausdorff, paracompact, connected, orientable) $D$-dimensional $C^\infty$ manifold endowed with a non-singular either Lorentzian (namely, the signature is $(-, +, \cdots, +)$) or Riemannian metric, $g$.¹ (In the next section we shall consider also complex manifolds.) The operator $A_0$ has the form
\[
A_0 = -\Delta + V : C_0^\infty(M) \to L^2(M, d\mu_g), \tag{1}
\]
whenever the metric is Riemannian. Conversely, in the Lorentzian case, the operator $A_0$ has the form
\[
A_0 = -\Delta + V : D(M) \to C^\infty(M), \tag{2}
\]
$D(M)$ being any domain of smooth functions, like $C_0^\infty(M)$ or $C^\infty(M)$. $\Delta := \nabla_a \nabla^a$ denotes the Laplace-Beltrami operator and $\nabla$ means the covariant derivative associated to the metric connection. $d\mu_g$ denotes the natural Borel measure induced by the metric, and $V$ is a real function of $C^\infty(M)$. (See [Mo99a, Mo99b, Mo99c] for discussions concerning the existence and the relevance of self-adjoint extensions of $A_0$ in $L^2(M, d\mu_g)$ in both cases.)

¹ In the gr-qc version of [Mo99c], we also assumed the positivity of $A_0$ in the Riemannian case and the geodesic completeness in general. Actually, these requirements are not necessary to assure the symmetry of the heat-kernel coefficients and they can be dropped, as can be shown with a little modification of Theorem 2.1 in [Mo99c].

Throughout the text, if $(U, \vec x)$ is a local chart of the differentiable structure of an $n$-dimensional manifold $M$ and thus $\vec x : U \to V : p \mapsto (x^1, \ldots, x^n)(p) = \vec x(p)$, $V \subset \mathbf{R}^n$, we shall identify $U$ with $V$, writing $(x^1, \ldots, x^n) \in U$ as well as $p \in V$, whenever it does not give rise to misunderstandings.

The heat-kernel coefficients for the Riemannian case and the Seeley–deWitt coefficients for the Lorentzian case, barring numerical factors, coincide with the coefficients which appear in the singular part of the Hadamard local solution (or Hadamard parametrix) for the linear homogeneous equation associated to the operator $A_0$ [Ch84, Ca90, Ga64, Fu91, BD82, Wa94] (see also [Mo99a, Mo99b, Mo99c], where the same notations used here are employed, for further references and comments). The heat-kernel/Seeley–deWitt coefficients are given by the following definition (see [Mo99c] for further comments and remarks and for the corresponding differential recursive definition).

Definition 1.1. Within the hypotheses on $M$ and $A_0$ given above, in any fixed open geodesically convex neighborhood $N \subset M$, both the heat-kernel (for the Riemannian case) and Seeley–deWitt (for the Lorentzian case) coefficients are the functions defined on $N \times N$ and labeled by $j \in \mathbf{N}$,
\begin{align*}
a_0(x, y) &= \Delta_{VVM}^{1/2}(x, y), \tag{3} \\
a_{(j+1)}(x, y) &= -\Delta_{VVM}^{1/2}(x, y) \int_0^1 \lambda^j \Bigl[ \Delta_{VVM}^{-1/2}\, A_{0x(\lambda)}\, a_j \Bigr](x(\lambda), y)\,d\lambda . \tag{4}
\end{align*}
$\lambda \mapsto x(\lambda)$ is the unique geodesic segment from $y \equiv x(0)$ to $x \equiv x(1)$ contained completely in $N$.
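As an illustration of the recursion (3)–(4) and of the symmetry question, one can work out the first two coefficients in the simplest possible setting. The sketch below rests on assumptions that are not part of the text: $M = \mathbf{R}^D$ with a flat metric, so that the van Vleck–Morette determinant is identically $1$ and the geodesic from $y$ to $x$ is the straight segment $x(\lambda) = y + \lambda(x - y)$.

```latex
% Assumptions (illustration only): flat R^D, \Delta_{VVM} \equiv 1,
% x(\lambda) = y + \lambda (x - y), and A_0 = -\Delta + V.
\begin{align*}
a_0(x, y) &= 1, \\
a_1(x, y) &= -\int_0^1 \bigl[ (-\Delta + V)\, a_0 \bigr](x(\lambda), y)\,d\lambda
           = -\int_0^1 V\bigl( y + \lambda (x - y) \bigr)\,d\lambda, \\
a_1(y, x) &= -\int_0^1 V\bigl( x + \lambda (y - x) \bigr)\,d\lambda
           = -\int_0^1 V\bigl( y + \lambda' (x - y) \bigr)\,d\lambda' = a_1(x, y),
           \qquad \lambda' = 1 - \lambda .
\end{align*}
```

In this special case the symmetry of $a_1$ follows from reversing the parametrization of the segment. For higher $j$ and for curved metrics, the weight $\lambda^j$ and the $x$-derivatives in (4) make the definition manifestly non-symmetric, which is why a separate argument is needed.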


Remark. This definition can be given as it stands also in the general case of a non-degenerate semi-Riemannian metric, namely, when more than one eigenvalue of the metric is negative and no eigenvalue vanishes. This is a straightforward consequence of the theory developed in 2.3 below.

$\Delta_{VVM}(x, y)$ is a (smooth or analytic², depending on the hypotheses on the metric) bi-scalar called the van Vleck-Morette determinant (see [Mo99c] for details). In any coordinate system $\vec u = (u^1, \cdots, u^D)$ defined in any open totally normal (or geodesically convex) neighborhood $N$, if $x, y \in N$ and $g := \det g_{ab}$, we have
\[
\Delta_{VVM}(x, y) := (-1)^D\, \frac{g(\vec x)}{|g(\vec x)|}\, \frac{1}{\sqrt{g(\vec x)\, g(\vec y)}}\, \det\left( \frac{\partial^2 \sigma(\vec x, \vec y)}{\partial x^a\, \partial y^b} \right) > 0 . \tag{5}
\]
Above, $x \equiv \vec x$, $y \equiv \vec y$ and $\sigma(\vec x, \vec y)$ is one half the “squared geodesical distance” of $x$ to $y$ (see [Mo99c] for details). The right-hand side of (5) is positive with the choice made for the first (constant) coefficient, not depending on the (fixed) non-singular semi-Riemannian signature of the metric (in particular, Riemannian or Lorentzian) and the coordinates used.

Remarks. (1) These definitions can be given also if, in any non-degenerate semi-Riemannian case, $M$ denotes a manifold with (smooth) boundary $\partial M$. In this case it is also required that the fixed open geodesically convex neighborhood $N$ does not intersect $\partial M$. The results obtained in this paper can be straightforwardly generalized to manifolds with boundary.

(2) Differently from Definition 1.1 in [Mo99c], here we prefer to distinguish explicitly between the Lorentzian and the Riemannian case by employing a different nomenclature (Seeley–deWitt or heat-kernel coefficients, respectively).

(3) The coefficients defined by (4) are either smooth if both the metric and $V$ are smooth or (real) analytic if both the metric and $V$ are (real) analytic (see [Mo99c]).

These coefficients have been shown to be symmetric in $x$ and $y$ whenever the metric and $V$ are smooth (or analytic) and the metric is Riemannian [Mo99c]. This holds true despite the non-symmetric definition (4) and despite several subtleties in the convergence properties of the off-diagonal heat-kernel expansion, which could be non-asymptotic. As we said previously, this result is physically relevant within the theory of the point-splitting renormalization of the stress-energy tensor in curved spacetime concerning so-called Hadamard quasi-free quantum states. Indeed, the symmetry of the heat-kernel or Seeley–deWitt coefficients trivially implies the symmetry of the coefficients which appear in the singular part of the (Euclidean or Lorentzian) Hadamard parametrix (see 1.3 of [Mo99c] for further details). The symmetry property is a sufficient³ condition which assures a final well-behaved renormalized (Euclidean or Lorentzian) stress-energy tensor (see [Wa78, FSW78] and references in [Wa94]). Such an important requirement has been assumed in the mathematical-physics literature without an explicit proof, to the knowledge of the author (see [Mo99c] for further comments). This paper is devoted to showing that the symmetry holds true also in the Lorentzian case, which is much more interesting on physical grounds. Please see the note added in proof.

² Throughout this work “smooth” means $C^\infty$ and “analytic” ($C^\omega$) means holomorphic whenever the considered functions are complex and complex valued.

³ It is not so clear whether or not this condition is necessary. After the appearance of the first version of [Mo99c], R.M. Wald pointed out to me that a weaker requirement should be, in practice, sufficient (see comment before Proposition 2.1 in [Mo99c]).


V. Moretti

1.2. The rough idea of the proof of the symmetry for the Lorentzian case.

In principle, a direct attempt to prove the symmetry could be performed as we have done in the Riemannian case [Mo99c], that is, by employing the so-called Seeley–deWitt (or Schwinger–deWitt) expansion of the integral kernel associated to the one-parameter group of unitary operators generated by some suitable self-adjoint extension of $A_0$ [BD82, Fu91, Ca90, Mo99c]. In fact, the Seeley–deWitt coefficients are just the coefficients of this expansion. This expansion is the direct analogue of the heat-kernel expansion [Ch84, Mo99c]. However, the convergence properties of the former are much more complicated than those of the latter (see discussions and references in [Fu91, Mo99c]), so we prefer to follow an alternative way, which also seems to be more interesting on mathematical-physics grounds.

The rough idea of the proof involves a sort of "local" Wick rotation, namely a continuation of the relevant coefficients from the Lorentzian theory into the Riemannian one. By the uniqueness theorem of analytic continuation, this should entail the generalization of the symmetry to the Lorentzian coefficients from the symmetry of the Riemannian coefficients. We shall see that a sort of "local" Wick rotation can be performed, independently of the presence of time-like Killing vectors, provided the metric and the potential V are (real) analytic functions of the coordinates. Finally, the generalization of the symmetry to the smooth non-analytic case can be obtained exactly as we have done in the Riemannian case, making use of Proposition 2.1 in [Mo99c], that is, by an approximation of smooth metrics by analytic metrics and of smooth functions V by analytic functions. The intriguing issue is the generalization of the Wick rotation from Minkowski spacetime to curved non-stationary spacetimes. This is the subject of the next section.

2. "Local Wick rotation"

2.1. A generalized local Wick rotation.
In QFT in flat spacetime the so-called (spatial) Wick rotation is a useful tool in spite of quite a vague definition. Roughly speaking, the Wick rotation is nothing but an analytical continuation of the Minkowskian time coordinate into imaginary values: t → iτ for τ ∈ R. This is done in order to produce a Riemannian background where one can define the Euclidean QFT. Formally, the metric changes as follows:

$$\Phi_L = -dt \otimes dt + \sum_{i=1}^{d} dx^i \otimes dx^i \;\to\; \Phi_R = d\tau \otimes d\tau + \sum_{i=1}^{d} dx^i \otimes dx^i. \tag{6}$$

Such a procedure is performed for several purposes, e.g., to make sense of the path integral as a Wiener measure or to build up thermal QFT. In curved spacetime, the use of the Wick rotation is much more problematic. In particular, there is no guarantee that the continued Riemannian metric is real whenever the initial Lorentzian metric is real. In principle, a sufficient condition which gives a real Riemannian metric is the requirement of a static metric [Wa79, FR87, Fu91, Wa94]. If the metric is only stationary, the Wick rotation is more problematic and generally involves the analytic continuation of further parameters besides the time coordinate [Ha77]. In spite of these difficulties, the Wick rotation is successfully used in QFT in curved spacetime, Quantum Gravity and black hole theory, where it is a very powerful tool, in particular in studying black hole thermodynamics [GH93]. In this work, to get the proof of the symmetry of Seeley–deWitt coefficients, we want to generalize the Minkowskian Wick rotation to quite a general Lorentzian manifold

Symmetry of the Off-Diagonal Hadamard/Seeley–deWitt’s Coefficients


dropping any hypothesis concerning the presence of temporal Killing fields. As far as we are concerned, a sort of Wick rotation of the metric is sufficient to prove the first step of our symmetry theorem. To this end, let us focus our attention on (6) once again. Notice that the same result can be obtained by an analytic continuation of the metric rather than of the time coordinate. In fact, we are free to interpret (6) as

$$\Phi_L = g_{Lab}\, dx^a \otimes dx^b \;\to\; \Phi_R = g_{Rab}\, dx^a \otimes dx^b, \tag{7}$$

where $g_{Lab} = \mathrm{diag}(-1, 1, 1, 1)$ and $g_{Rab} = \mathrm{diag}(1, 1, 1, 1)$.

Differently from the customary interpretation, the metric has now changed, since the eigenvalue −1 has been continued into a final eigenvalue +1, but the manifold has remained the initial one. The changes have taken place only in each (co-)tangent fiber. In this sense the continuation is "local". In principle, such a procedure could also be used in curved spacetime without the requirement of a static or stationary metric. We are not interested in the issue of which properties of the customary interpretation are preserved by the new interpretation. Only two important points have to be remarked. Following the new procedure, there is no guarantee that the Riemannian metric so obtained is a solution of the Euclidean Einstein equations if the initial Lorentzian metric is a solution of the Lorentzian Einstein equations; moreover, it is worthwhile stressing that the procedure found is highly non-unique.

The use of complex metrics is compulsory if one assumes the absence of pathologies in the structure of geodesically convex neighborhoods during the continuation procedure. Such an absence is essential as far as our main goal, i.e., the proof of symmetry of Hadamard coefficients, is concerned, because these coefficients are defined just in geodesically convex neighborhoods. If we want to pass from a Lorentzian to a Riemannian metric continuously, the signature has to change from (−, +, · · · , +) to (+, +, · · · , +). Therefore, the determinant of the metric, in any fixed coordinate frame, must change sign somewhere. This implies that, employing only real metrics, some of these must become degenerate somewhere during the continuation. Therefore, pathologies would arise concerning the exponential maps and the structure of geodesic neighborhoods. On the other hand, the use of complex coefficients of the metric makes sense only in complex manifolds.
For instance, in general, the equations of the geodesics admit no solution in real coordinates with complex coefficients of the metric. Therefore, we are forced to extend the initial Lorentzian metric and manifold to complex values in complex coordinates. A natural way to do this follows from the assumption of an initial real analytic metric. The "complex metrics" which arise in the continuation procedure are not Hermitian or Kählerian but generalize the concept of a pseudo-Riemannian metric into a complex context. To proceed with our idea of a local generalized Wick rotation we need a well-known preliminary definition.

Definition 2.1. Let (M, g) be a $C^k$ ($k \in \{2, \ldots, \infty, \omega\}$) Lorentzian (D = d + 1)-dimensional manifold. Choose any embedded spacelike hypersurface S and an open neighborhood O such that $O \cap S \neq \emptyset$. Taking O sufficiently small (preserving the condition above) if necessary, any admissible ($C^k$) local coordinate system defined in O, $\vec x = (x^0, \ldots, x^d)$ with $(x^1, \ldots, x^d) \in \Omega$, an open subset of $\mathbb{R}^d$, and $x^0 \in\, ]-\delta, \delta[$, $\delta > 0$, such that, in O,

$$S \cap O = \{(x^0, \ldots, x^d) \in\, ]-\delta, \delta[\, \times \Omega \mid x^0 = 0\}, \tag{8}$$
$$g_{00} = -1, \tag{9}$$
$$g_{0a} = 0 \quad \text{for } a > 0, \tag{10}$$

is said to be a local synchronous coordinate system with respect to S.

Concerning the existence of such coordinate systems, see [Wa84] and, for a more general mathematical discussion, see Chap. 7 of [ON83], where these coordinates are called "Fermi coordinates with respect to a given spacelike hypersurface". A sketch of the proof of their existence will also be given within the proof of Theorem 2.2.

Remark. Any point p ∈ M belongs to an embedded spacelike hypersurface: such a hypersurface can be obtained as $S_p = \{\exp_p(X^a e_a) \mid X^0 = 0,\ (X^1, \ldots, X^d) \in \Omega\}$, where $\Omega$ is a suitably small neighborhood of the origin of $\mathbb{R}^d$, $(e_0, \ldots, e_d)$ being an orthonormal base of $T_p(M)$ with $(e_0, e_0) = -1$. Notice that the hypersurface found about p is not uniquely defined.

Let us consider an analytic manifold M endowed with an analytic Lorentzian metric g. Take a complex analytic continuation of a synchronous coordinate system defined in an open neighborhood O about a point p ∈ M, $\vec z = (z^0, z^1, \cdots, z^d)$ with $z^a = x^a + i y^a$, into a complex open neighborhood $G \subset \mathbb{C}^D$ (containing the initial real domain of definition of the coordinates). Suppose that analytic continuations of the functions $\vec z \mapsto g_{ab}(\vec z)$ are defined in G. In general $\vec z \mapsto g_{ab}(\vec z)$ are complex-valued functions. By these functions it is possible to define a non-degenerate "complex pseudo-metric" (the rigorous definition will be given in Definition 2.2) $\vec z \mapsto g(\vec z) = g_{ab}(\vec z)\, dz^a \otimes dz^b$ on G. Finally, fix an arbitrary real λ > 0 and consider the class of "complex pseudo-metrics" $\{g_{(\lambda\theta)}\}_\theta$, where θ ∈ C, defined in the coordinates of G by

$$g_{(\lambda\theta)00}(\vec z) := g_{00}(\vec z)\, \lambda^{2\theta/\pi} e^{i\theta}, \tag{11}$$
$$g_{(\lambda\theta)ab}(\vec z) := g_{ab}(\vec z) \quad \text{for } (a, b) \neq (0, 0). \tag{12}$$

We want to use this class to continue the initial Lorentzian metric, obtained for θ = 0, $g = g_{(\lambda 0)}$, into a final Riemannian "Wick-rotated metric", obtained for θ = π. Indeed, within our hypotheses the "Wick-rotated metric" $\bar g_\lambda(\vec z) := g_{(\lambda\pi)}(\vec z)$ defines a real, nondegenerate and Riemannian metric for $\vec z = \vec x \in O$ when it acts on real (with respect to the considered coordinates) vectors. In particular, for any fixed positive real λ, $g_{(\lambda\theta)}(\vec z)$ is non-degenerate in a complex open neighborhood of [0, π] × O. More strongly, in a sense we shall specify later, for fixed λ the procedure preserves geodesically convex neighborhoods for complex values of θ. In practice, the presented procedure locally defines an analytic continuation of metrics (see footnote 4) which interpolates, through complex metrics, from Lorentzian to Riemannian metrics and preserves the local geodesic structure at each step. As we shall see, the apparently superfluous parameter λ plays a central role in using the local Wick rotation to get the symmetry of Seeley–deWitt coefficients. Let us state some of the results argued above as a precise theorem.

Footnote 4. Notice that the functions $(\theta, \vec z) \mapsto g_{(\lambda\theta)ab}(\vec z)$ are indeed analytic in $(\theta, \vec z)$.
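As a rough numerical illustration (not part of the original argument), the following Python sketch evaluates the time-time component defined in (11) in synchronous coordinates, where $g_{00} = -1$. It shows that θ = 0 returns the Lorentzian value −1, θ = π returns the positive value λ², and an intermediate θ gives a genuinely complex value. The function name `g00_interpolated` and the sample value λ = 2 are assumptions chosen here for illustration only.

```python
import cmath
import math


def g00_interpolated(g00: complex, lam: float, theta: complex) -> complex:
    """Time-time component of (11): g00 * lam**(2*theta/pi) * exp(i*theta)."""
    if lam <= 0:
        raise ValueError("lam must be a positive real number")
    # lam**(2*theta/pi) is written as exp((2*theta/pi) * ln(lam)) so that a
    # complex theta is handled with the real logarithm of lam.
    power = cmath.exp((2 * theta / math.pi) * math.log(lam))
    return g00 * power * cmath.exp(1j * theta)


lam = 2.0  # illustrative value of the free parameter lambda
start = g00_interpolated(-1.0, lam, 0.0)            # Lorentzian endpoint
halfway = g00_interpolated(-1.0, lam, math.pi / 2)  # complex interpolating value
end = g00_interpolated(-1.0, lam, math.pi)          # Wick-rotated endpoint

print(start, halfway, end)
```

With these inputs the endpoint at θ = π equals λ² up to rounding, so the rotated component is positive, while the halfway value is purely imaginary up to rounding.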


Theorem 2.1. Let (M, g) be a (D = d + 1)-dimensional Lorentzian manifold of class $C^\omega$. Let O ⊂ M be any open set endowed with local synchronous coordinates (with respect to some embedded hypersurface) $\vec x = (x^0, \cdots, x^d)$ and consider the coefficients of the metric $g_{ab}(\vec x)$ in these coordinates. Fix a real λ > 0. Then, there is a complex open set $G \subset \mathbb{C}^D$ endowed with a differentiable structure induced by coordinates $\vec z = (z^0, \cdots, z^d)$ with $z^a = x^a + i y^a$, a = 0, · · · , d, and O ⊂ G (in the obvious sense), where the components of the metric $\vec x \mapsto g_{ab}(\vec x)$, a, b = 0, · · · , d, can be analytically continued into analytic complex functions $\vec z \mapsto g_{ab}(\vec z)$. Moreover the functions defined in $\mathbb{C} \times G$, $(\theta, \vec z) \mapsto g_{(\lambda\theta)ab}(\vec z)$, where $g_{(\lambda\theta)ab}(\vec z)$ have been defined in (11) and (12), define a θ-parametrized class $\{g_{(\lambda\theta)}\}_\theta$, θ ∈ C, of complex analytic (0, 2)-degree symmetric fields

$$g_{(\lambda\theta)}(z) := g_{(\lambda\theta)ab}(\vec z)\, dz^a \otimes dz^b, \tag{13}$$

($z \equiv \vec z$) which are non-degenerate everywhere in the complex manifold G. In O, this class analytically continues, in the parameter θ, the initial Lorentzian metric $g = g_{(\lambda 0)}$ into the Riemannian Wick-rotated metric

$$\bar g_\lambda := g_{(\lambda\pi)}. \tag{14}$$

Proof. It is straightforward if one takes into account that the components $g_{(\lambda\theta)ab}(\vec z)$ with a, b > 0 do not depend on θ and λ, and notices that (9) and (10) hold true for the "complex-continued metrics" in (13). Hence,

$$\left|\det\{[g_{(\lambda\theta)ab}(\vec z)]_{a,b=0,\ldots,d}\}\right| = \left|e^{\theta[i + (2/\pi)\ln\lambda]}\right|\, \left|\det\{[g_{ab}(\vec z)]_{a,b=1,\ldots,d}\}\right|.$$

The former factor on the right-hand side is positive and cannot vanish if λ > 0, whatever θ ∈ C, and the latter factor defines a continuous function of the only variable $\vec z \in G$ which preserves the positive sign in an open complex neighborhood of each point $\vec z = \vec x \in O$, where the function is positive by hypothesis. We can redefine G as the union of all of these open neighborhoods. □

Remark. Notice that we have slightly abused notation in the last statement of the theorem. Indeed, the initial Lorentzian metric $g(x) = g_{ab}(\vec x)\, dx^a \otimes dx^b$ and the Wick-rotated Riemannian metric $\bar g_\lambda(x) = \bar g_{(\lambda)ab}(\vec x)\, dx^a \otimes dx^b$ are defined in the real (with respect to the base induced by the coordinates specified above, $z^a = x^a + i y^a$) cotangent space of O. Conversely, all interpolating fields $g_{(\lambda\theta)}$ in (13) (including those corresponding to the values θ = 0, π) are defined in the whole complex cotangent space of O. Anyway, we shall also use these notations in the following since they do not produce misunderstandings.

Note. The local Wick rotation we have defined above can be generalized to a larger class of coordinates with a precise physical meaning, namely, coordinates where $x^0$ represents a "true" time and $x^1, \ldots, x^d$ represent "true" spatial coordinates. In other words, in these coordinates, it must hold that $g_{00} < 0$ as well as $g^{00} < 0$. Within this general approach, fixing a point p ∈ M, the parameter λ is related to the relative velocity between the rest reference d-dimensional space (subspace of $T_p(M)$) of the infinitesimal observer evolving along $\partial_{x^0}|_p$ and the d-dimensional reference space (subspace of $T_p(M)$) "normal" to the vector $dx^0|_p$.
Theorem 2.1 and Theorem 2.4 below can also be generalized to these "physical" coordinates, but the proofs are much more complicated (see footnote 5).

Footnote 5. A general discussion on these arguments, and all the relevant proofs, can be found in the first gr-qc version of this paper.
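The non-degeneracy argument in the proof of Theorem 2.1 can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: it uses a hypothetical constant 2×2 positive-definite spatial block (so D = 3), builds the interpolated matrix in synchronous form with $g_{00} = -1$ and $g_{0a} = 0$, and compares the modulus of its determinant with the closed form $\lambda^{2\,\mathrm{Re}\,\theta/\pi} e^{-\mathrm{Im}\,\theta}$ times the spatial determinant, which is what $|e^{\theta[i + (2/\pi)\ln\lambda]}|$ evaluates to.

```python
import cmath
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def interpolated_metric(spatial, lam, theta):
    """Matrix of g_(lam, theta) in synchronous coordinates (g00 = -1, g0a = 0)."""
    g00 = -cmath.exp(theta * (1j + (2 / math.pi) * math.log(lam)))
    return [
        [g00, 0.0, 0.0],
        [0.0, spatial[0][0], spatial[0][1]],
        [0.0, spatial[1][0], spatial[1][1]],
    ]


# Hypothetical positive-definite spatial block, chosen only for illustration.
spatial = [[2.0, 0.5], [0.5, 1.0]]
det_spatial = spatial[0][0] * spatial[1][1] - spatial[0][1] * spatial[1][0]

lam = 3.0
samples = [0.0, math.pi / 3, math.pi, 0.7 + 0.4j, math.pi - 0.2j]
results = []
for theta in samples:
    theta = complex(theta)
    lhs = abs(det3(interpolated_metric(spatial, lam, theta)))
    # Closed form of |exp(theta * (i + (2/pi) ln lam))| times the spatial determinant.
    rhs = lam ** (2 * theta.real / math.pi) * math.exp(-theta.imag) * det_spatial
    results.append((lhs, rhs))

for lhs, rhs in results:
    print(f"{lhs:.12f}  {rhs:.12f}")
```

Every sampled modulus is strictly positive, in line with the statement that the interpolating fields stay non-degenerate for any complex θ when λ > 0.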


We also have the following almost straightforward result, which is interesting in its own right, independently of our final goal. It shows that the local procedure defined above can be relatively globalized about any spacelike embedded hypersurface (e.g. a Cauchy surface if M is globally hyperbolic), even when global synchronous coordinates with respect to it do not exist. Anyway, we stress that we shall use only the local result in the proof of the symmetry theorem.

Theorem 2.2. Let (M, g) be a (D = d + 1)-dimensional, time-oriented Lorentzian manifold of class $C^\omega$. Fix any real λ > 0. Let S ⊂ M be any fixed embedded space-like hypersurface of class $C^\omega$. Then, there is an open D-dimensional Lorentzian submanifold N of M, containing S, which admits the class A(N) of ($C^\omega$) time-oriented local synchronous coordinates with respect to S as an atlas. Moreover, the Wick-rotated metrics defined by (14) in each local coordinate system of A(N) induce a global Riemannian $C^\omega$ metric on the whole submanifold N.

Sketch of Proof. See the Appendix. □

In the next part we shall consider some features of the Wick-rotated metrics. To this end we need some definitions and results concerning complex metrics in complex analytic manifolds.

2.2. Complex pseudo-Riemannian manifolds.

To give a precise status to the complex field $g_{(\lambda\theta)}$ defined on the complex manifold G presented in Theorem 2.1, we introduce the concept of a complex pseudo-Riemannian manifold and a complex pseudo-Riemannian metric. Although with a different nomenclature, several results obtained in the following can be found in [LB83] (see also [Cs96]). First of all, we give some results concerning the existence and the analyticity of the exponential map in complex manifolds with a generally complex pseudo-Riemannian metric. Afterwards we discuss the existence of geodesically convex neighborhoods and related features. Let us start by giving the definition of a complex pseudo-Riemannian manifold.

Definition 2.2. A complex analytic manifold M ([KN63]) endowed with an analytic nondegenerate (0, 2)-degree symmetric tensor field g is said to be a complex pseudo-Riemannian manifold and the field g is said to be a complex pseudo-metric.

The complex pseudo-metric induces a non-degenerate complex quadratic form $V \mapsto g(z)(V, V)$ in the tangent space $T_z(M)$ at any point z ∈ M. We call such a quadratic form the complex pseudo scalar product induced in $T_z(M)$ by the complex pseudo-metric.

Remarks.
(1) It is worthwhile stressing that the complex pseudo scalar product induced on the tangent spaces is not Hermitian and the metric is not Kählerian.
(2) It is clear that the manifold G introduced in Theorem 2.1, endowed with any fixed field $g_{(\lambda\theta)}$, is a complex pseudo-Riemannian manifold.
(3) The equations of the geodesics take the usual form, with the difference that the connection coefficients of the Levi–Civita connection (see below) induced by the complex pseudo-metric are complex analytic functions of the considered coordinates.
Concerning the equation of the geodesics we can apply the following general lemma.

Lemma 2.1. Let $f : (z, Y, \alpha) \mapsto f(z, Y, \alpha) \in \mathbb{C}^n$ be a function in $C^\omega(\bar{\mathcal{C}}; \mathbb{C}^n)$ (see footnote 6), with $\mathcal{C} = B_{r_1}(z_0) \times B_{r_2}(y_0) \times B_{r_3}(\alpha_0)$, where $B_{r_1}(z_0), B_{r_3}(\alpha_0) \subset \mathbb{C}$ and $B_{r_2}(y_0) \subset \mathbb{C}^n$ are open balls with radii $r_1, r_2, r_3 > 0$ centered in $z_0, y_0, \alpha_0$ respectively. Consider the differential equation system depending on the parameter $\alpha \in \bar B_{r_3}(\alpha_0)$

$$\frac{dY}{dz} = f(z, Y, \alpha), \qquad Y \in C^1(B_{r_1'}(z_0); \mathbb{C}^n) \ \text{for some } r_1' > 0,\ r_1' < r_1, \tag{15}$$

and initial condition

$$Y(z_0) = \bar y_0, \qquad \bar y_0 \in \bar B_{r_2'}(y_0), \ \text{where } r_2' > 0 \text{ is fixed and } r_2' < r_2. \tag{16}$$

(a) A solution of Eq. (15) with initial condition (16) exists and is unique in any set $\bar B_{r_1'}(z_0)$, provided that

$$0 < r_1' < \mathrm{Min}\{r_1, \delta', \delta''\}, \tag{17}$$

where

$$\delta' = (r_2 - r_2') / \mathrm{Sup}\left\{\|f(z, y, \alpha)\| \;\middle|\; (z, y, \alpha) \in \bar{\mathcal{C}}\right\}$$

and

$$\delta'' = 1 / \mathrm{Sup}\left\{2\sqrt{n\, \mathrm{Tr}\, \nabla f^{*}(z, y, \alpha)^{T}\, \nabla f(z, y, \alpha)} \;\middle|\; (z, y, \alpha) \in \bar{\mathcal{C}}\right\}.$$

(b) This solution satisfies $Y(z) \in \bar B_{r_2}(y_0)$ for any $z \in \bar B_{r_1'}(z_0)$, whatever $\bar y_0 \in \bar B_{r_2'}(y_0)$ and $\alpha \in \bar B_{r_3}(\alpha_0)$.

(c) Moreover, varying also $\bar y_0$ and α, and writing down the dependence on these variables explicitly, the function $(z, \bar y_0, \alpha) \mapsto Y(z, \bar y_0, \alpha)$ is analytic. In particular, it belongs to $C^\omega(\bar B_{r_1'}(z_0) \times \bar B_{r_2'}(y_0) \times \bar B_{r_3'}(\alpha_0))$ for any $r_3' > 0$ with $r_3' < r_3$.

Proof. See the Appendix. □

Footnote 6. Notice that $\bar{\mathcal{C}}$ is closed. We say that f is analytic in a closed set when it is possible to continue f into an analytic function defined in an open set which includes the closed set.

We can apply the lemma above to the equations of the geodesics for a complex pseudometric g in coordinates $\vec z = (z^1, \cdots, z^D)$. For the moment we do not consider the further parameter α. The first-order geodesic equation system reads

$$\frac{dz^a(t, \vec y, V)}{dt} = U^a(t, \vec y, V), \tag{18}$$
$$\frac{dU^a(t, \vec y, V)}{dt} = -\Gamma^a_{bc}\big(\vec z(t, \vec y, V)\big)\, U^b(t, \vec y, V)\, U^c(t, \vec y, V), \tag{19}$$

for a = 1, · · · , D. (The sum over repeated indices is understood.) Above, the complex Levi–Civita connection coefficients are defined, as usual, by

$$\Gamma^a_{bc}(\vec z) := \frac{1}{2}\, g^{ad}(\vec z) \left(\frac{\partial g_{db}(\vec z)}{\partial z^c} + \frac{\partial g_{cd}(\vec z)}{\partial z^b} - \frac{\partial g_{bc}(\vec z)}{\partial z^d}\right); \tag{20}$$

$\vec y$ and V are, respectively, the initial position and the initial velocity of the geodesic segment evaluated at t = 0. Equations (18) and (19), for t ∈ C, locally admit a unique solution which satisfies the given initial conditions. The existence of sets where the


solution exists, is unique and is analytic is assured by Lemma 2.1. Let us indicate the local solution of the system above by

$$t \mapsto \gamma(t, \vec z, V), \tag{21}$$

for $t \in B_\delta(0)$, δ > 0, $(\vec y, V) \in B_\rho(\vec y_0) \times B_r(0)$, ρ, r > 0. Exactly as in the real Riemannian case, for any fixed complex number c ≠ 0, (18) and (19) entail the identity

$$\gamma(ct, \vec z, V/c) = \gamma(t, \vec z, V). \tag{22}$$
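The rescaling identity (22) can be illustrated numerically on a toy example. The sketch below is an assumption-laden illustration, not the construction used in the paper: it takes a one-dimensional complex "metric" $g(z) = e^{z}$, for which the connection coefficient $g'(z)/(2g(z))$ equals 1/2, integrates the first-order system (18)-(19) with a classical Runge-Kutta scheme along a straight ray in the complex t-plane, and compares γ(t, z, V) with γ(ct, z, V/c) for a complex c. For this toy metric the geodesic is also known in closed form, $z_0 + 2\log(1 + Vt/2)$, which serves as an independent check.

```python
import cmath


def christoffel(z: complex) -> complex:
    """Connection coefficient g'(z) / (2 g(z)) for the toy metric g(z) = exp(z)."""
    return 0.5


def geodesic(t: complex, z0: complex, v: complex, steps: int = 200) -> complex:
    """Integrate dz/dt = U, dU/dt = -Gamma(z) U^2 from 0 to t along a straight ray."""

    def rhs(z, u):
        return u, -christoffel(z) * u * u

    h = t / steps  # complex step when t is complex
    z, u = z0, v
    for _ in range(steps):
        k1z, k1u = rhs(z, u)
        k2z, k2u = rhs(z + 0.5 * h * k1z, u + 0.5 * h * k1u)
        k3z, k3u = rhs(z + 0.5 * h * k2z, u + 0.5 * h * k2u)
        k4z, k4u = rhs(z + h * k3z, u + h * k3u)
        z += h * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
    return z


# Illustrative initial data and rescaling factor (all hypothetical values).
z0, v, c = 0.1 + 0.2j, 0.3 + 0.2j, 0.5 + 0.5j

direct = geodesic(1.0, z0, v)            # gamma(t, z, V) at t = 1
rescaled = geodesic(c * 1.0, z0, v / c)  # gamma(c t, z, V / c) at t = 1
exact = z0 + 2 * cmath.log(1 + v / 2)    # closed form for this toy metric

print(direct, rescaled, exact)
```

The two numerical values agree to rounding error because the system (18)-(19) is homogeneous under the simultaneous rescaling of t and V, which is exactly the content of (22).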

This means that if, for instance, 2 > δ > 0, passing to the new variable $t' = (2/\delta)t$, we can work with geodesics defined in the disk $t' \in B_2(0)$, provided r is replaced by $r' = (\delta/2) r < r$. This can be done preserving all remaining properties concerning the analyticity, independently of the initial condition $\vec z$ (and of any further parameter α such as that in Lemma 2.1). Since there is no ambiguity, we can use the name r instead of r' and t instead of t'. Therefore, from now on, we suppose $t \in B_2(0)$. With this choice, a map (21) with the restriction $t \in \bar B_1(0)$ will be called a complex geodesic segment. It is worthwhile stressing that this is not a "usual" segment because the parameter t corresponds to two real parameters. A complex geodesic segment restricted to the real axis in the domain, s ∈ R,

$$s \mapsto \gamma(s, \vec z, V), \quad \text{where } s \in [0, 1], \tag{23}$$

will be called a real-parameter geodesic segment. Notice that it satisfies (18) and (19) in the variable s and is real analytic in this variable. It determines the whole complex geodesic segment by analytic continuation. Obviously, these definitions do not depend on the chosen coordinates, so sometimes we shall use, e.g., p instead of $\vec y$ in the second argument of a geodesic segment.

The exponential map is, as usual, given as the analytic map, defined in a suitable open set

$$E = \bigcup_{p \in M} \{p\} \times E_p \subset T(M), \tag{24}$$

where $E_p$ is an open neighborhood of the origin of $T_p(M)$ we shall specify shortly,

$$\exp : E \to M : (p, V) \mapsto \gamma(1, p, V). \tag{25}$$

The exponential map centered in p ∈ M is the map

$$\exp_p : E_p \to M : V \mapsto \gamma(1, p, V). \tag{26}$$

Obviously, these definitions do not depend on the coordinates used and, changing the domains, one finds restrictions or extensions of the same function. Since, for any point p ∈ M, it holds that

$$d(\exp_p)_0 V = \frac{d}{dt}\Big|_{t=0} \exp_p(tV) = \frac{d}{dt}\Big|_{t=0} \gamma(1, p, tV) = \frac{d}{dt}\Big|_{t=0} \gamma(t, p, V) = V,$$

there is an open neighborhood, which we can assume to be starshaped, of the origin of the tangent space at p, where the exponential map centered in p defines an analytic diffeomorphism onto a neighborhood of p in the manifold. By an open starshaped neighborhood of 0 we mean an open neighborhood of 0 such that if V belongs to this


neighborhood, also any λV, with λ ∈ R and 0 ≤ λ ≤ 1, belongs to the neighborhood (see footnote 7). For instance, any complex open ball centered in z ∈ C with positive radius r > |z| is an open starshaped neighborhood of the origin of C. We can take each $E_p$ above as a fixed open starshaped neighborhood of 0 where the exponential map centered on p defines a diffeomorphism. Working in fixed local coordinates, it is trivially possible to choose such sets $E_p$ such that E given in (24) is also open. Hence, the map

$$\phi : (p, X) \mapsto (p, \exp_p X), \tag{27}$$

defines a diffeomorphism from E onto φ(E) ⊂ M × M because it is injective and its differential is non-singular at each point of the domain.

As usual, a normal neighborhood of the point p ∈ M is an open neighborhood of p of the form $N_p = \exp_p(S)$, where $S \subset E_p \subset T_p(M)$ is an open starshaped neighborhood of the origin of $T_p(M)$ on which the exponential map centered in p defines an analytic diffeomorphism. Then, the components of the vectors $V \in T_p(M)$, with respect to a fixed base, contained in S, define normal coordinates on M centered in p via the function $V \mapsto \exp_p V$. Notice that any $q \in N_p$, due to (22) and the starshapedness of $\exp_p^{-1}(N_p)$, can be connected with p by only one complex geodesic segment "starting from p" at t = 0 and "terminating in q" at t = 1, such that the associated real-parameter geodesic segment is completely contained in $N_p$. (Using the stronger definition of starshaped neighborhood suggested in footnote 7, the whole complex geodesic segment would be contained in $N_p$.) Finally, in normal coordinates centered in p, due to (22), the equation of a complex geodesic which starts from p is a linear function of the parameter. This implies that the connection coefficients vanish at p if evaluated in these coordinates.

Similarly to the real case, we define a totally normal neighborhood of a point p ∈ M as a neighborhood (see footnote 8) of p, $V_p \subset M$, such that, if $q \in V_p$, there is a normal neighborhood of q, $N_q$, with $V_p \subset N_q$. Therefore, if q and q' belong to the same totally normal neighborhood, there is only one complex geodesic segment which "connects" these two points (respectively for t = 0 and t = 1) such that the associated real-parameter geodesic segment is completely contained in the normal neighborhoods centered in q and q' respectively, $N_q$ and $N_{q'}$.
Finally, a complex geodesically convex neighborhood of a point p ∈ M should be defined as a totally normal neighborhood of p, $U_p$, such that, for any pair $q, q' \in U_p$, there is only one complex geodesic segment which is completely contained in $U_p$ and "connects" q (for t = 0) and q' (for t = 1). In fact, even in the simplest case of a complex manifold M ⊂ C, these neighborhoods do not exist barring trivial cases (e.g., $U_p = M = \mathbb{C}$). However, a weaker definition can successfully be given. A geodesically linearly convex neighborhood of a point p ∈ M is defined as a totally normal neighborhood of p, $U_p$, such that, for any pair $q, q' \in U_p$, there is only one real-parameter geodesic segment which is completely contained in $U_p$ and connects q (for s = 0) and q' (for s = 1). It is not so obvious, at this point, that our complex pseudo-Riemannian structure admits totally normal and geodesically linearly convex neighborhoods. Actually this is the case.

Footnote 7. It is possible to give a stronger definition requiring λ ∈ C and |λ| ≤ 1. Anyway, throughout this paper we use the weaker definition.
Footnote 8. In this work, a neighborhood of a point is any set which includes an open set which contains the point.


Theorem 2.3. Let (M, g) be a complex pseudo-Riemannian manifold. For each point p ∈ M there is a local base of the topology $\{G_{pj}\}_{j \in \mathbb{R}}$ consisting of open totally normal, geodesically linearly convex neighborhoods of the point p. Moreover, each $\bar G_{pj}$ is also totally normal and geodesically linearly convex for any j ∈ R, and $\bar G_{pj} \subset G_{pj'}$ if j < j'.

Proof. See the Appendix. □

Remark. The definition of linearly geodesically convex neighborhoods may not seem very natural. Anyway, it can be given in a more natural way for open sets (see also [LB83]). To this end, we leave to the reader the proof of the following relevant proposition.

Proposition 2.1. Given a complex pseudo-Riemannian manifold (M, g), an open set U ⊂ M is linearly geodesically convex if and only if there is an open set E(U) ⊂ T(M), with

$$E(U) = \bigcup_{p \in U} \{p\} \times E(U)_p, \tag{28}$$

$E(U)_p \subset E_p$ being an open starshaped neighborhood of the origin of $T_p(M)$, such that the map

$$\phi_U : E(U) \to U \times U : (p, X) \mapsto (p, \exp_p X) \tag{29}$$

is an analytic diffeomorphism onto U × U.

The existence of totally normal and linearly geodesically convex neighborhoods allows us to define the one half squared complex pseudo-distance, or complex world function, similarly to the case of real metrics. Given an open linearly geodesically convex neighborhood or, more simply, an open totally normal neighborhood, U, the complex world function is given by

$$\sigma(p, q) := \frac{1}{2}\, g(p)\big(\exp_p^{-1}(q), \exp_p^{-1}(q)\big) \quad \text{for any } q, p \in U. \tag{30}$$
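For orientation, definition (30) can be evaluated by hand in the simplest real case. The following Python sketch assumes a flat space with a constant diagonal metric in Cartesian coordinates, where geodesics are straight lines and therefore $\exp_p^{-1}(q) = q - p$; this flat setting and the function name `world_function` are assumptions made here for illustration, not part of the paper's construction.

```python
def world_function(p, q, eta):
    """sigma(p, q) = (1/2) eta(q - p, q - p), using exp_p^{-1}(q) = q - p in flat space."""
    v = [qi - pi for pi, qi in zip(p, q)]
    return 0.5 * sum(e * vi * vi for e, vi in zip(eta, v))


minkowski = [-1.0, 1.0, 1.0, 1.0]  # diagonal entries of the flat Lorentzian metric
euclid = [1.0, 1.0, 1.0, 1.0]      # diagonal entries of the flat Riemannian metric

p = [0.0, 0.0, 0.0, 0.0]
timelike = world_function(p, [2.0, 1.0, 0.0, 0.0], minkowski)   # -1.5
spacelike = world_function(p, [1.0, 2.0, 0.0, 0.0], minkowski)  # 1.5
null = world_function(p, [1.0, 1.0, 0.0, 0.0], minkowski)       # 0.0
riemannian = world_function(p, [2.0, 1.0, 0.0, 0.0], euclid)    # 2.5

print(timelike, spacelike, null, riemannian)
```

In the Lorentzian case the sign of σ distinguishes timelike, spacelike and null separations, whereas in the Riemannian case σ is one half of the squared distance; in both cases the value is unchanged when p and q are exchanged.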

Since all functions involved on the right-hand side of (30) are analytic, it must hold that $\sigma \in C^\omega(U \times U)$. Moreover, essentially from the conservation of g(V, V) along any complex geodesic segment due to the geodesic transport, V being the tangent vector, we have the following properties, which generalize well-known Riemannian and Lorentzian results [Fu91]:

$$\sigma(p, q) = \sigma(q, p), \tag{31}$$
$$\sigma(p, q) = \frac{1}{2}\, \nabla_{(p)a}\sigma(p, q)\, \nabla_{(p)}^{a}\sigma(p, q), \tag{32}$$
$$\nabla_{(p)}\sigma(p, q) = \frac{d}{dt}\Big|_{t=1} \gamma(t, q, p), \tag{33}$$

where the function on the right-hand side of (33) is defined below. In an open geodesically linearly convex neighborhood or, more simply, in an open totally normal neighborhood, U, we can define the function

$$\gamma(t, p, q) := \gamma(t, p, \exp_p^{-1}(q)) \quad \text{for any } q, p \in U \text{ and } t \in \bar B_1(0), \tag{34}$$


which gives the complex geodesic segment "connecting" the point p (t = 0) and the point q (t = 1) as a function of the extreme points. Once again, trivially, $\gamma \in C^\omega(\bar B_1(0) \times U \times U)$. The property (32) is a consequence of the property (33). The latter is not very simple to prove. A direct way is the following. Consider a normal coordinate system centered in q. In these coordinates, if $\vec x = \vec x(t)\ (= t\,\vec x(1))$ is the equation of the real-parameter geodesic segment from $q \equiv \vec x(0) = \vec 0$ to $p \equiv \vec x(1)$, it holds trivially, since the integrand actually does not depend on t due to the parallel transport, that

$$\sigma(p, q) = \frac{1}{2} \int_0^1 g_{ab}(\vec x(t))\, \frac{dx^a}{dt}\, \frac{dx^b}{dt}\, dt.$$

Then we can vary the curve in the integrand within any family of (real-parameter segment) geodesics $\vec x_\alpha = \vec x_\alpha(t)$, $\alpha \in\, ]-\delta, \delta[$. Assume also that the dependence on α is smooth, $\vec x_0(t) := \vec x(t)$ and $\vec x_\alpha(0) = \vec 0$ for any α. This defines a functional $\sigma = \sigma[\vec x_\alpha]$. Using the equation of the geodesic for α = 0, it is quite trivial to get by integration by parts that

$$\frac{d\sigma[\vec x_\alpha]}{d\alpha}\Big|_{\alpha=0} = g_{ab}(\vec x(1))\, x^a(1)\, \frac{dx^b_\alpha(1)}{d\alpha}\Big|_{\alpha=0}.$$

On the other hand, since each curve of the family is a geodesic and $p \equiv \vec x(1)$, we have

$$\frac{d\sigma[\vec x_\alpha]}{d\alpha}\Big|_{\alpha=0} = \frac{d\sigma(\vec x_\alpha(1), \vec 0)}{d\alpha}\Big|_{\alpha=0} = \partial_{(p)b}\,\sigma(p, q)\, \frac{dx^b_\alpha(1)}{d\alpha}\Big|_{\alpha=0}.$$

Noticing that $\frac{d\vec x_\alpha(1)}{d\alpha}\big|_{\alpha=0}$ is arbitrary, one has (33).

Finally, let us consider the bi-scalar called the van Vleck-Morette determinant. In a real manifold, either Riemannian or Lorentzian, the definition (5) can be rewritten, employing any coordinate system $\vec z = (z^1, \cdots, z^D)$ defined in an open totally normal (or geodesically convex) neighborhood T, as

$$\Delta_{VVM}(x, y) := \frac{(-1)^D}{g(\vec x)}\, \sqrt{\frac{g(\vec x)}{g(\vec y)}}\, \det\left(\frac{\partial^2 \sigma(\vec x, \vec y)}{\partial x^a\, \partial y^b}\right). \tag{35}$$
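As a sanity check on the sign conventions in (35), one can evaluate the expression in flat space, where the world function is $\sigma(x, y) = \frac{1}{2}\eta_{ab}(x - y)^a (x - y)^b$ and the determinant should equal 1 for both signatures. The Python sketch below does this with a finite-difference mixed Hessian. It is only an illustration under the assumption of a constant diagonal metric (so that $g(\vec x) = g(\vec y)$ and the square root in (35) is 1); the helper names are hypothetical.

```python
def sigma(x, y, eta):
    """Flat-space world function (1/2) eta_ab (x - y)^a (x - y)^b, eta diagonal."""
    return 0.5 * sum(e * (xi - yi) ** 2 for e, xi, yi in zip(eta, x, y))


def mixed_hessian(x, y, eta, h=1e-3):
    """Central finite-difference estimate of d^2 sigma / dx^a dy^b."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            total = 0.0
            for sx in (1, -1):
                for sy in (1, -1):
                    xs, ys = list(x), list(y)
                    xs[a] += sx * h
                    ys[b] += sy * h
                    total += sx * sy * sigma(xs, ys, eta)
            hess[a][b] = total / (4 * h * h)
    return hess


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small D)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def van_vleck_morette(x, y, eta):
    """Evaluate (35) for a constant diagonal metric, where g(x) = g(y) = prod(eta)."""
    dim = len(eta)
    g = 1.0
    for e in eta:
        g *= e
    # (-1)^D / g(x) * sqrt(g(x) / g(y)) * det(mixed Hessian); the square root is 1 here.
    return (-1) ** dim / g * det(mixed_hessian(x, y, eta))


x = [0.3, -0.2, 0.5, 0.1]
y = [0.0, 0.4, -0.1, 0.2]
lorentzian = van_vleck_morette(x, y, [-1.0, 1.0, 1.0, 1.0])
riemannian = van_vleck_morette(x, y, [1.0, 1.0, 1.0, 1.0])
print(lorentzian, riemannian)
```

Both values come out equal to 1 up to finite-difference rounding, which is consistent with the positivity of the determinant for either signature.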

This expression can be generalized to open totally normal neighborhoods in complex pseudo-Riemannian manifolds. Notice that the bi-scalar so obtained is (complex) jointly analytic in T × T but, in principle, can be a multiple-valued function due to the square root. In any case, the branch point of the square root is harmless since $g(\vec z) \neq 0$. Computing $\Delta_{VVM}(x, y)$ in normal coordinates centered in x (these coordinates do exist and cover T under our hypotheses), making use of (33), we get that, in these coordinates,

$$\Delta_{VVM}(x, y) = \sqrt{\frac{g(\vec x)}{g(\vec y)}} \quad (\neq 0). \tag{36}$$

Therefore, the bi-scalar $\Delta_{VVM}(x, y)$ cannot vanish anywhere and is positive for either a Riemannian or a Lorentzian metric (all this independently of the coordinates used!).

We are now able to state and prove the most important theorem for our goal (we omit the index λ in some notation for the sake of simplicity).

Theorem 2.4. Let (M, g) be a (D = d + 1)-dimensional Lorentzian manifold of class $C^\omega$. Let O ⊂ M be any open set endowed with ($C^\omega$) local synchronous coordinates (with respect to some spacelike hypersurface) $\vec x = (x^0, \cdots, x^d)$. Fix a positive real λ and consider the set of complex pseudometrics $\{g_{(\lambda\theta)}\}$ defined in (13) of Theorem 2.1 in the analytically extended coordinates $z^0, \cdots, z^d$ ($z^a = x^a + i y^a$) varying in an open complex set $G \subset \mathbb{C}^D$ with O ⊂ G and θ ∈ C.
(a) For any p ∈ G, there is a local base of the topology of G, $\{G_{pj}\}_{j \in \mathbb{R}}$, consisting of open totally normal, geodesically linearly convex neighborhoods of p common to all of the complex pseudo-metrics $g_{(\lambda\theta)}$ for θ belonging to an open complex


neighborhood of $[0, \pi]$, $K_p$. Moreover, each $\bar{G}_{pj}$ is also totally normal and geodesically linearly convex with respect to all of the complex pseudometrics when $\theta \in K_p$, and $\bar{G}_{pj} \subset G_{pj'}$ if $j < j'$.

(b) If $p \in O$, posing (with obvious notations which refer to the coordinates $\vec{z}$) $U_{pj} := \mathrm{Re}\, G_{pj}$, $\{U_{pj}\}_{j \in \mathbb{R}}$ is a local base of the topology of $O$ about $p$, consisting of open totally normal, geodesically convex neighborhoods of the point $p$ in common with whichever real (Riemannian or Lorentzian) metric is produced, in the considered coordinates, by $g_{(\lambda\theta)}(\vec{x})$ for particular choices of the, generally complex, value of $\theta \in K_p$. In particular, this holds for the initial Lorentzian metric $g$ ($\theta = 0$) and for the final Riemannian Wick-rotated metric $\bar{g}_\lambda$ ($\theta = \pi$). Moreover, each $\bar{U}_{pj}$ is also totally normal and geodesically convex with respect to all of the real metrics considered above, and $\bar{U}_{pj} \subset U_{pj'}$ if $j < j'$.

(c) With an element $G_{pj}$ arbitrarily fixed, the complex functions obtained from (30), (35) and (34) specialized to the generic complex pseudometric $g_{(\lambda\theta)}$,

$$(\theta, q, q') \mapsto \sigma_\theta(q, q') \quad \text{for } (\theta, q, q') \in K_p \times G_{pj} \times G_{pj}, \tag{37}$$

$$(\theta, q, q') \mapsto \Delta^{1/2}_{VVM\theta}(q, q') \quad \text{for } (\theta, q, q') \in K_p \times G_{pj} \times G_{pj}, \tag{38}$$

$$(\theta, t, q, q') \mapsto \gamma_\theta(t, q, q') \quad \text{for } (\theta, t, q, q') \in K_p \times B_2(0) \times G_{pj} \times G_{pj}, \tag{39}$$

are jointly analytic functions. Moreover, $\Delta^{1/2}_{VVM\theta}(\vec{x}, \vec{y})$ is a single-valued function and can be defined such that it coincides with the usual real positive van Vleck–Morette determinant for the real metrics considered in (b).

Proof. See the Appendix. □
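The nesting property $\bar{G}_{pj} \subset G_{pj'}$ for $j < j'$ in item (a) can be checked directly once the neighborhoods are taken to be the coordinate balls introduced in (64) of the Appendix. The following is a minimal sketch under that assumption; the radius function $r(j)$ is a name introduced here only for illustration.

```latex
% Assumption: G_{pj} := B_{r(j)}(p) with r(j) := \rho\,(1+\tanh j)/2, as in (64),
% for a fixed \rho with 0 < \rho < \bar{\rho}.
\[
  r(j) = \frac{\rho}{2}\,(1 + \tanh j), \qquad
  \frac{dr}{dj} = \frac{\rho}{2\cosh^{2} j} > 0, \qquad
  \lim_{j \to -\infty} r(j) = 0, \qquad
  \lim_{j \to +\infty} r(j) = \rho .
\]
\[
  j < j' \;\Longrightarrow\; r(j) < r(j') \;\Longrightarrow\;
  \bar{G}_{pj} = \bar{B}_{r(j)}(p) \subset B_{r(j')}(p) = G_{pj'} .
\]
```

Since $r(j) \to 0$ as $j \to -\infty$, every open set containing $p$ contains some $G_{pj}$, which is what makes the family a local base at $p$.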

3. The Symmetry of Seeley–deWitt Coefficients in Smooth Manifolds

3.1. The analytic case. Let us consider the off-diagonal Seeley–deWitt coefficients given in Definition 1.1. It is possible to show that, if the Lorentzian metric and the function $V$ are real analytic functions of the local coordinates, then the coefficients are symmetric functions of the arguments $x$ and $y$. The argument is direct: we can use the local Wick rotation previously defined and, via Theorem 2.4, obtain the symmetry of the Seeley–deWitt coefficients from the symmetry of the heat-kernel coefficients defined with respect to the Wick-rotated Riemannian metric.

Theorem 3.1. Let $(M, g)$ be a (real, Hausdorff, paracompact, connected, orientable) $(D = d + 1)$-Lorentzian $C^\omega$ manifold. Suppose the function $V$ which appears in (2) is a (real) analytic function and consider the Seeley–deWitt/Hadamard coefficients given in Definition 1.1. Then any point $p \in M$ admits a (totally normal) geodesically convex neighborhood $N_p$ such that, if $x, y \in N_p$,

$$a_j(x, y) = a_j(y, x) \tag{40}$$

for any $j \in \mathbb{N}$.

Proof. Fix any point $p \in M$ and consider a synchronous coordinate system $x^0, \ldots, x^d$ defined in an open neighborhood $O$ of $p$. Fix $\lambda = 1$ and use Theorem 2.4 in $O$ with respect to the coordinates $\vec{x}$. From now on, we shall use the notations of Theorem 2.4.

Consider the local complex extension of the manifold defined on $G$ and fix a common geodesically linearly convex set $H_p = G_{pj_0}$ of the local base of the topology found in (a) of Theorem 2.4. The Seeley–deWitt coefficients defined in $N_p := \mathrm{Re}\, H_p$ by Definition 1.1 can be analytically continued to the whole set $H_p$. In particular, from Definition 1.1 and (c) of Theorem 2.4 we have that, fixing the index $j$ and $x, y \in N_p$, each function (with obvious notations)

$$\theta \mapsto a_j(\vec{x}, \vec{y}\,|\,g_\theta) - a_j(\vec{y}, \vec{x}\,|\,g_\theta) \tag{41}$$

is analytic for $\theta \in K_p$, where $K_p$ is a complex open neighborhood of $[0, \pi]$. $K_p$ can be assumed to be open and connected by dropping the connected components which do not contain $[0, \pi]$. In particular, we can consider the complex values $\theta = \pi + i\mu$, where $\mu$ ranges in $[0, \epsilon[$. If $\epsilon$ is small enough, all these values of $\theta$ belong to $K_p$. Then we notice that, using the notations defined in (11) and (12), we have

$$g_{(\lambda = 1,\ \theta = \pi + i\mu)} = g_{(\lambda = e^{-\mu/2},\ \theta = \pi)}. \tag{42}$$

The metric on the right-hand side is Riemannian. Since the analytic continuation of the metric preserves the form of the right-hand side of (4), the analytic continuation of the off-diagonal Seeley–deWitt coefficients of the initial Lorentzian metric $g$ to $\theta = \pi + i\mu$ produces the off-diagonal heat-kernel coefficients of the corresponding Riemannian metrics. Therefore, as we know by [Mo99c], the right-hand side of (41) vanishes whenever $\theta$ belongs to the set $\{\theta = \pi + i\mu \mid \mu \in [0, \epsilon[\} \subset K_p$ for some $\epsilon > 0$. The uniqueness of the analytic continuation in open connected sets entails that the right-hand side of (41) vanishes everywhere in $K_p$, and in particular for $\theta = 0$. This means that the off-diagonal Seeley–deWitt coefficients defined with respect to the initial metric are symmetric functions of $x$ and $y$ in $N_p$. □

The result just proved implies the following more general result in a direct way, as remarked in 2.2 of [Mo99c].

Theorem 3.2. Let $(M, g)$ be a (real, Hausdorff, paracompact, connected, orientable) $(D = d + 1)$-Lorentzian $C^\infty$ manifold. Consider the Seeley–deWitt/Hadamard coefficients given in Definition 1.1 when both the metric $g$ and the function $V$ which appear in (2) are smooth fields. Then any point $p \in M$ admits a (totally normal) geodesically convex neighborhood $N_p$ such that, if $x, y \in N_p$,

$$a_j(x, y) = a_j(y, x) \tag{43}$$

for any $j \in \mathbb{N}$.

Proof. The proof is exactly the same as that performed in the Riemannian case, Theorem 2.2 in [Mo99c]. □

Remark. This result can also be achieved if the manifold admits a smooth boundary, as pointed out in [Mo99c].

Finally, we have a trivial corollary based on the fact that the Seeley–deWitt coefficients are also the coefficients which appear in the Hadamard local solution [Mo99c].

Corollary of Theorem 3.2. Let $(M, g)$ be a (real, Hausdorff, paracompact, connected, orientable) $(D = d + 1)$-Lorentzian $C^\infty$ manifold. Let the metric $g$ and the function $V$ in (2) be smooth fields. Then, for any point $p \in M$ there is a (totally normal) geodesically convex neighborhood $N_p$ of $p$ such that, for any pair $(x, y) \in N_p$, the coefficients $u_j$, $v_j$ of the Hadamard parametrix, up to the order indicated (see (22) and (23) in [Mo99c]),

$$H_N(x, y) = \sum_{j=0}^{D/2-2} \left(\frac{2}{\sigma(x, y)}\right)^{D/2-j-1} u_j(x, y) + \sum_{j=0}^{N} \sigma^j(x, y)\, v_j(x, y) \ln\frac{\sigma(x, y)}{2} \tag{44}$$

($N \in \mathbb{N}$ fixed arbitrarily) for $D$ even (the former summation appears for $D \geq 4$ only), and

$$H(x, y) = \sum_{j=0}^{(D-5)/2} \left(\frac{2}{\sigma(x, y)}\right)^{D/2-j-1} u_j(x, y) + v_0(x, y) \sqrt{\frac{2\pi}{\sigma(x, y)}} + v_1(x, y) \sqrt{2\pi\sigma(x, y)} \tag{45}$$

for $D$ odd (the summation appears for $D \geq 5$ only), satisfy

$$u_j(x, y) = u_j(y, x), \tag{46}$$

$$v_j(x, y) = v_j(y, x). \tag{47}$$

3.2. Final remarks. The results proved in this work show that the Seeley–deWitt/Hadamard off-diagonal coefficients are symmetric, as required within the point-splitting renormalization procedure of the stress-energy tensor. Such a result has been proved for the case where the manifold, the metric and the potential $V$ are smooth. The result holds true in both Lorentzian and Riemannian manifolds. The intriguing general fact we have pointed out is the existence of quite a natural local Wick rotation of the metric which preserves the local geodesic structures of the manifold, making use of non-Hermitian complex manifolds. This procedure makes sense regardless of the presence of time-like Killing fields whenever the employed coordinates are somehow "physical". It is not so obvious what physics is involved in this procedure.

Acknowledgement. I am grateful to A. Cassa, S. Baldo, S. Delladio, S. Hollands, A. Tognoli and P. Vigna Suria for very useful remarks, suggestions and discussions. Part of this work was written during my visit at the Department of Mathematics of the University of York. I would like to thank Bernard S. Kay in particular, and C. J. Fewster and A. Higuchi, for very stimulating discussions and for the cordial hospitality they all provided during my stay there. This work has been financially supported by both a postdoctoral fellowship of the Department of Mathematics of the University of Trento and a grant by the MURST within the National Project "Young Researchers".

Appendix: Proof of Some Theorems and Lemmata

Sketch of Proof of Theorem 2.2. Fix a point $p \in S$. Since $S$ is embedded, it is possible to find a local coordinate system centered in $p$, $\vec{x} = (x^0, x^1, \ldots, x^d)$, defining a local chart $(U_p, \vec{x})$ about $p$ such that the set $U_p \cap S$ is given by the equation $x^0 = 0$. Then

$(x^1, \ldots, x^d)$ define local space-like coordinates on $S$ in a neighborhood of $p$. Now consider the local map $(t, x^1, \ldots, x^d) \mapsto \exp_{(0, x^1, \ldots, x^d)}(t N(x^1, \ldots, x^d))$, which is defined in an open neighborhood of $(t = 0, x^1 = 0, \ldots, x^d = 0)$. $N(x^1, \ldots, x^d)$ is the unique time-oriented vector normal to $S$ in $(x^0 = 0, x^1, \ldots, x^d)$ with $g(N, N) = -1$. It is a trivial task to compute the Jacobian determinant $J_p$ of the map $(t, x^1, \ldots, x^d) \mapsto \vec{x}^{-1} \circ \exp_{(0, x^1, \ldots, x^d)}(t N(x^1, \ldots, x^d))$ at $(t = 0, x^1 = 0, \ldots, x^d = 0)$, obtaining $J_p = dx^0|_p(N(p)) \neq 0$. Hence, the map $(t, x^1, \ldots, x^d) \mapsto \exp_{(0, x^1, \ldots, x^d)}(t N(x^1, \ldots, x^d))$ is a coordinate system in an open neighborhood of $p$, $U'_p \subset U_p$, and $(y^0, y^1, \ldots, y^d) := (t, x^1, \ldots, x^d)$ are local coordinates about $p \in S$. Using the equation of geodesics, it is a trivial task to get that (9) and (10) are fulfilled and thus (making $U_p$ smaller, if necessary, in order to have a $\vec{y}$ domain of the form ] − δ, δ[ × ) we have built up (time-oriented) locally synchronous coordinates with respect to $S$. So local synchronous coordinates do exist.

On the other hand, it is also easily proved that the temporal coordinate of a point $q$ in any (time-oriented) local synchronized coordinate system defined in Definition 2.1 represents the (positive) length $t_q$ of the unique geodesic segment which starts from $S$ with a unitary initial tangent vector, time-oriented and normal to $S$ at, say, $q' \in S$, and reaches $q$. The spatial synchronous coordinates are nothing but the coordinates of $q'$ on $S$. Then Proposition 26 in Chap. 7 of [ON83] entails that there is an open neighborhood $O$ of $S$ where any pair of geodesics starting from different points of $S$ with initial tangent vector normal to $S$ do not intersect each other anywhere (also if the starting points belong to different local synchronous coordinate system domains). Consequently, in $O$, the temporal coordinate $q \mapsto t_q$ of any point $q$ does not depend on the chosen local synchronous coordinate system. The coordinate transformation law between local synchronous coordinate systems reads, in any common domain,

$$y'^0_q = y^0_q = t_q, \tag{48}$$

$$y'^j_q = y'^j_q(y^1_q, \ldots, y^d_q), \quad j = 1, \ldots, d. \tag{49}$$

This trivially assures that the transformation law between different local synchronous coordinates preserves the form of the Wick-rotated metric in common domains for any globally fixed value of $\lambda$. This defines a Riemannian metric on $N$, which can be taken as the union of all of the intersections of $O$ with each synchronous chart domain. □

Proof of Lemma 2.1. The differential equation system in Lemma 2.1 is equivalent to the integral equation

$$Y(z, \bar{y}_0, \alpha) = \bar{y}_0 + \int_{z_0}^{z} f(u, Y(u, \bar{y}_0, \alpha), \alpha)\, du, \tag{50}$$

where the path of integration is the segment from $z_0$ to $z$ ($Y$ is $C^1$ and thus analytic in $z$, and the integration does not depend on the chosen path between the same extreme points). We can write the equation above as

$$Y = A_{\bar{y}_0 \alpha}(Y), \tag{51}$$

where $A_{\bar{y}_0 \alpha}$ is defined by the right-hand side of (50) and should be thought of as a function which maps the Banach space $B := C^0(\bar{B}_{r_1}(z_0); \mathbb{C}^n)$ (with the norm $\|\cdot\|_\infty$) into $B$ itself. Actually $f(z, Y(z), \alpha)$ may not be defined, in general, when $Y \in B$ because some $Y(z)$ may be out of the domain of $f$. However, once one has fixed $r'_2 > 0$ such that

$r'_2 < r_2$, taking a value $r'_1 > 0$ which satisfies (17), one sees that $A_{\bar{y}_0 \alpha}$ is well-defined on the closed subset of $B$,

$$B_0 := \{Y \in C^0(\bar{B}_{r_1}(z_0); \mathbb{C}^n) \mid Y(z) \in \bar{B}_{r_2}(y_0) \text{ for any } z\}, \tag{52}$$

which is invariant under $A_{\bar{y}_0 \alpha}$ provided $\bar{y}_0 \in \bar{B}_{r'_2}(y_0)$. In this domain, $A_{\bar{y}_0 \alpha}$ is a contraction map with contraction constant $\rho$, $0 < \rho < 1$, which does not depend on $\bar{y}_0 \in \bar{B}_{r'_2}(y_0)$ and $\alpha \in \bar{B}_{r_3}(\alpha_0)$ on that set. The Banach fixed-point theorem proves the existence and the uniqueness of the solution, which is nothing but the fixed point of $A_{\bar{y}_0 \alpha}$ and belongs to $B_0$. In particular, the solution can be found as the limit (in the norm $\|\cdot\|_\infty$ with respect to the variable $z$, the remaining variables being fixed)

$$Y = \lim_{k \to \infty} Y_k, \tag{53}$$

where $Y_k := A^k_{\bar{y}_0 \alpha}(Y_0)$ and $Y_0$ is the constant function $z \mapsto Y_0(z) = y_0$ everywhere. Using the contraction property one finds that, for $k > m$,

$$\mathrm{Sup}\, \|Y_k(z, y, \alpha) - Y_m(z, y, \alpha)\| \leq \frac{\rho^{m}}{1 - \rho}\, \mathrm{Sup}\, \|Y_1(z, y, \alpha) - Y_0(z, y, \alpha)\|,$$

where the Sup is evaluated for $(z, y, \alpha) \in \bar{B}_{r'_1}(z_0) \times \bar{B}_{r'_2}(y_0) \times \bar{B}_{r_3}(\alpha_0)$. This entails that the convergence of the sequence (53) is uniform in all variables jointly. Since each function of the sequence is analytic by construction in any set $\bar{B}_{r'_1}(z_0) \times \bar{B}_{r'_2}(y_0) \times \bar{B}_{r'_3}(\alpha_0)$, $0 < r'_3 < r_3$, the limit function must be analytic therein. □

Proof of Theorem 2.3. We follow and generalize the similar proof given in [KN63]. Let $n$ be the dimension of $M$. Take a coordinate system centered in $p \in M$, $\vec{z} = (z^1, \ldots, z^n)$, $p \equiv (0, \ldots, 0)$. We want to show that in these coordinates it is possible to find a class of open totally normal neighborhoods of $p$ of the form $B_\rho(\vec{0}) := \{\vec{z} \in \mathbb{C}^n \mid \sum_{i=1}^{n} |z^i|^2 < \rho^2\}$, $0 < \rho < \bar{\rho}$, which are also linear geodesically convex, and that the class of the sets $\bar{B}_\rho(\vec{0})$ enjoys the same properties. The remaining part of the thesis is trivially proven by defining, for a fixed $\rho$, $0 < \rho < \bar{\rho}$, $G_{pj} := B_{\rho(1 + \tanh j)/2}(\vec{0})$ with $j \in \mathbb{R}$. The existence of the class above follows from a pair of propositions indicated by (p1) and (p2) in the following.

(p1) Let $S_\rho(\vec{0}) := \{\vec{z} \in \mathbb{C}^n \mid \sum_{i=1}^{n} |z^i|^2 = \rho^2\}$, $\rho > 0$. Then there exists $c > 0$ such that, if $\rho \in\, ]0, c[$, any real-parameter geodesic which is tangent to $S_\rho(\vec{0})$ at a point, say $\vec{y}$, lies outside $S_\rho(\vec{0})$ in a neighborhood of $\vec{y}$.

Proof of (p1). Let $\vec{z} = \vec{z}(s)$, $s$ defined in a neighborhood of $s_0$, be a real-parameter geodesic which is tangent to $S_\rho(\vec{0})$ at $\vec{y} = \vec{z}(s_0)$ ($\rho$ will be restricted later). Formally speaking, this means that, setting

$$F(s) := \|\vec{z}(s)\|^2, \tag{54}$$

it holds $F(s_0) = \rho^2$ and

$$\frac{dF}{ds}\Big|_{s=s_0} = \sum_i \left[ z^{*i}(s_0)\, \frac{dz^i}{ds}\Big|_{s=s_0} + z^i(s_0)\, \frac{dz^{*i}}{ds}\Big|_{s=s_0} \right] = 0. \tag{55}$$

Let us consider the second derivative of $F$ at $s = s_0$. A trivial computation based on the derivation of the central term in (55) and the equation of geodesics shows that

$$\frac{d^2 F}{ds^2}\Big|_{s=s_0} = V^\dagger A(\vec{z}, \vec{z}^*) V, \tag{56}$$

where $V$ is a vector with components $V^j = dz^j/ds|_{s=s_0}$ for $j = 1, \ldots, n$, $*$ is the complex conjugation and $\dagger$ the Hermitian conjugation. $A(\vec{z}, \vec{z}^*)$ is a $2n \times 2n$ Hermitian matrix with

$$A(\vec{z}, \vec{z}^*)_{ij} = \delta_{ij} \tag{57}$$

for $i, j = 1, \ldots, n$ and $i, j = n + 1, \ldots, 2n$, and

$$A(\vec{z}, \vec{z}^*)_{ij} = -\sum_{a=1}^{n} z^a\, \Gamma^a_{j\, k-n}(\vec{z}) \tag{58}$$

for $i = 1, \ldots, n$ and $j = n + 1, \ldots, 2n$, and finally

$$A(\vec{z}, \vec{z}^*)_{ij} = -\sum_{a=1}^{n} z^{*a}\, \Gamma^{*a}_{j-n\, k}(\vec{z}) \tag{59}$$

for $i = n + 1, \ldots, 2n$ and $j = 1, \ldots, n$. The matrix $A$ becomes the identity matrix for $\vec{z} = \vec{z}^* = \vec{0}$ and thus is positive definite there. There is a neighborhood of $\vec{0}$, which can be chosen in the form of $B_c(\vec{0})$, where, by continuity, $A(\vec{z}, \vec{z}^*)$ is positive definite. In this neighborhood $d^2F/ds^2|_{s=s_0} > 0$ and hence $F(s) > \rho^2$ when $s \neq s_0$ belongs to a real neighborhood of $s_0$, provided $\rho \in\, ]0, c[$⁹. This concludes the proof of (p1). □

(p2) Choose a real $c > 0$ as in (p1). Then there exists a real $a$ with $0 < a < c$ such that: (1) any two points of $B_a(\vec{0})$ ($\bar{B}_a(\vec{0})$) can be joined by a complex geodesic segment which lies in $B_c(\vec{0})$; (2) each point of $B_a(\vec{0})$ ($\bar{B}_a(\vec{0})$) has a normal coordinate neighborhood containing $B_a(\vec{0})$ ($\bar{B}_a(\vec{0})$), and thus $B_a(\vec{0})$ ($\bar{B}_a(\vec{0})$) is a totally normal neighborhood.

Proof of (p2). Consider $M$ as a submanifold of $T(M)$ in a natural way and work in the coordinates $\vec{z}$ defined above; notice that $p \equiv \vec{0}$. Set

$$\varphi : X \mapsto (q, \exp X) \quad \text{for } X \in T_q(M). \tag{60}$$

In general, $\varphi$ is defined only in a neighborhood of $M$ in $T(M)$. Since the differential of $\varphi$ at $(\vec{0} \equiv p, 0)$ is nonsingular, there exist a neighborhood $V$ of $(\vec{0}, 0)$ in $T(M)$ and a positive number $b < c$ such that $\varphi$ defines a diffeomorphism of $V$ onto $B_b(\vec{0}) \times B_b(\vec{0})$. Taking $V$ and $b$ small, we can assume that $\exp(tX) \in B_c(\vec{0})$ for all $X \in V$ and $t \in \mathbb{C}$, $|t| \leq 1$. Item (1) holds true for any $a > 0$ with $a \leq b$, since the complex geodesic segment from $q \in B_b(\vec{0})$ to $q' \in B_b(\vec{0})$ is the map $t \mapsto \exp(tX)$, where $|t| \leq 1$ and $X := \varphi^{-1}(q, q') \in V$.

Let us consider Item (2). With the positive real $b$ and $V$ fixed as in the proof of Item (1), choosing $b' > 0$ and $\delta > 0$ small enough, we can fix an open subset of $V$, which is a neighborhood of $\vec{0}$ in $T(M)$, with the form $B_{b'}(\vec{0}) \times B_\delta$, where $0 < b' < b$ and $B_\delta$ is an open ball of radius $\delta > 0$ and center in $0 \in \mathbb{C}^n$. All the tangent spaces $T_q(M)$,

⁹ In general, this is not true in a complex neighborhood of 0, as one could trivially check in $M = \mathbb{C}$.

$q \in B_{b'}(\vec{0})$, have been identified with $\mathbb{C}^n$ by means of the bases induced by the considered coordinates. Then choose an open neighborhood $B_{b''}(\vec{0}) \times B_{b''}(\vec{0}) \subset \varphi(B_{b'}(\vec{0}) \times B_\delta)$. Finally notice that, if $q \in B_{b''}(\vec{0})$, since $\varphi$ is a diffeomorphism in $B_{b'}(\vec{0}) \times B_\delta$, we have $\{q\} \times B_{b''}(\vec{0}) \subset \varphi(\{q\} \times B_\delta)$, and in particular, from the definition of $\varphi$,

$$B_{b''}(\vec{0}) \subset \exp_q(B_\delta). \tag{61}$$

This means that $B_a(\vec{0})$ is a totally normal neighborhood if $a \leq b''$ (and (1) also holds true due to $b'' < b$). The proof for the closure of the considered neighborhoods is trivial and is obtained by taking $a$ smaller and noticing that $\bar{B}_{a'}(\vec{0}) \subset B_a(\vec{0})$ if $a' < a$. □

To complete the proof of the theorem, let $0 < \rho < a\ (< c)$ and let $q, q'$ be any pair of points in $B_\rho(p)$ ($\bar{B}_\rho(p)$). Let $\vec{z} = \vec{z}(s)$, $s \in [0, 1]$, be the real-parameter geodesic segment from $q$ to $q'$ in $B_c(p)$ (see (p1)). We shall show that this real-parameter geodesic segment lies completely in $B_\rho(p)$ ($\bar{B}_\rho(p)$). Consider the function $s \mapsto F(s)$ defined in (54) along this geodesic segment. Assume that $F(s) \geq \rho^2$ ($F(s) > \rho^2$) for some $s$, that is, $\vec{z}(s)$ lies outside $B_\rho(p)$ ($\bar{B}_\rho(p)$) for some $s$. Let $s_0 \in\, ]0, 1[$ be the value for which $F$ attains the maximum, say $\rho_0^2 \geq \rho^2$ ($\rho_0^2 > \rho^2$). Then

$$0 = \frac{dF}{ds}\Big|_{s=s_0}. \tag{62}$$

This means that the real-parameter geodesic segment is tangent to the sphere $S_{\rho_0}(p)$ at the point $\vec{z}(s_0)$. By the choice of $\rho$, the considered real-parameter geodesic segment lies inside the sphere $S_{\rho_0}(p)$, contradicting (p1). □

Proof of Theorem 2.4. First of all, we notice that item (c) is a direct consequence of items (a) and (b), Lemma 2.1 and the discussion which follows that lemma. Barring item (a), the only not completely trivial fact is the statement concerning the possibility of defining $\Delta^{1/2}_{VVM\theta}$ as a single-valued function which coincides with the usual one when evaluated on any $U_{pj}$ for real (Riemannian or Lorentzian) metrics. We shall prove this result at the end of the proof of this theorem. Let us prove the validity of items (a) and (b). The latter is a straightforward consequence of the former, taking into account that the initial, the Wick-rotated and any other real metric obtained for the corresponding values of $\theta$, when restricted to real coordinates, produce real exponential maps and geodesics. (Therefore, with respect to the considered coordinates, the exponential map transforms vectors with real components into points with real coordinates, and the real-parameter geodesic segments connecting pairs of points in any $\mathrm{Re}\, G_{pj}$ ($\mathrm{Re}\, \bar{G}_{pj}$) are real geodesic segments completely contained in $\mathrm{Re}\, G_{pj}$ ($\mathrm{Re}\, \bar{G}_{pj}$).) Then we have to prove item (a) only. To this end, we use the same proof as for Theorem 2.3 with the necessary modifications.

Proof of (a). Fix any $\lambda > 0$. (From now on, for the sake of simplicity, we omit the index $\lambda$ where not strictly necessary.) Take the complex coordinate system considered in the hypotheses. We are free to move the origin of the coordinates to $p \in G$ by a complex translation. Let $\vec{u} = (u^1, \ldots, u^D)$ be the new coordinate system. Therefore $p \equiv$

$(0, \ldots, 0)$. We want to show that, in these coordinates, it is possible to find a class of open totally normal neighborhoods of $p$, corresponding to the metric $g_{(\lambda\theta)}$, of the form

$$B_\rho(p) := \Big\{\vec{u} \in \mathbb{C}^D \;\Big|\; \sum_{i=1}^{D} |u^i|^2 < \rho^2\Big\}, \tag{63}$$

$0 < \rho < \bar{\rho}$, which are also linear geodesically convex, and that the class of the sets $\bar{B}_\rho(\vec{0})$ enjoys the same properties. Moreover, all these properties of a fixed neighborhood are preserved varying $\theta$ in a complex open neighborhood of $[0, \pi]$, $K_p$. The remaining part of the thesis is trivially proven by defining, for a fixed $\rho$ with $0 < \rho < \bar{\rho}$,

$$G_{pj} := B_{(1 + \tanh j)\rho/2}(p), \tag{64}$$

with $j \in \mathbb{R}$. The thesis follows from a pair of propositions indicated by (p1) and (p2) in the following.

(p1) Let $S_\rho(\vec{0}) := \{\vec{u} \in \mathbb{C}^D \mid \sum_{i=1}^{D} |u^i|^2 = \rho^2\}$, $\rho > 0$. Then there exist $c > 0$ and a complex open neighborhood of $[0, \pi]$, $K'_p$, such that, if $\rho \in\, ]0, c[$, any real-parameter geodesic defined with respect to any fixed metric $g_{(\lambda\theta)}$ with $\theta \in K'_p$ which is tangent to $S_\rho(\vec{0})$ at a point, say $\vec{y}_\theta$, lies outside $S_\rho(\vec{0})$ in a neighborhood of $\vec{y}_\theta$.

Proof of (p1). Fix $\theta \in [0, \pi]$ arbitrarily and let $\vec{u} = \vec{u}_\theta(s)$, $s$ being defined in a neighborhood of $s_{\theta 0}$ and with respect to the metric $g_{(\lambda\theta)}$, be a real-parameter geodesic which is tangent to $S_{\rho_\theta}(\vec{0})$ at $\vec{y}_\theta = \vec{u}_\theta(s_{\theta 0})$ ($\rho_\theta$ will be restricted later). Formally speaking, this means that, setting

$$F_\theta(s) := \|\vec{u}_\theta(s)\|^2, \tag{65}$$

it holds $F_\theta(s_{\theta 0}) = \rho_\theta^2$ and

$$\frac{dF_\theta}{ds}\Big|_{s=s_{\theta 0}} = \sum_i \left[ u_\theta^{*i}(s_{\theta 0})\, \frac{du_\theta^i}{ds}\Big|_{s=s_{\theta 0}} + u_\theta^i(s_{\theta 0})\, \frac{du_\theta^{*i}}{ds}\Big|_{s=s_{\theta 0}} \right] = 0. \tag{66}$$

Let us consider the second derivative of $F_\theta$ at $s = s_{\theta 0}$. A trivial computation based on the derivation of the central term in (66) and the equation of geodesics shows that

$$\frac{d^2 F_\theta}{ds^2}\Big|_{s=s_{\theta 0}} = V_\theta^\dagger A_\theta(\vec{u}_\theta, \vec{u}^*_\theta) V_\theta, \tag{67}$$

where $V_\theta$ is a vector with components $V_\theta^j = du_\theta^j/ds|_{s=s_{\theta 0}}$ for $j = 1, \ldots, D$, $*$ is the complex conjugation and $\dagger$ the Hermitian conjugation. $A_\theta(\vec{u}_\theta, \vec{u}^*_\theta)$ is a $2D \times 2D$ Hermitian matrix with

$$A_\theta(\vec{u}, \vec{u}^*)_{ij} = \delta_{ij} \tag{68}$$

for $i, j = 1, \ldots, D$ and $i, j = D + 1, \ldots, 2D$, and

$$A_\theta(\vec{u}, \vec{u}^*)_{ij} = -\sum_{a=1}^{D} u^a\, \Gamma^a_{(\theta)\, j\, k-D}(\vec{u}) \tag{69}$$

for $i = 1, \ldots, D$ and $j = D + 1, \ldots, 2D$, and finally

$$A_\theta(\vec{u}, \vec{u}^*)_{ij} = -\sum_{a=1}^{D} u^{*a}\, \Gamma^{*a}_{(\theta)\, j-D\, k}(\vec{u}) \tag{70}$$

for $i = D + 1, \ldots, 2D$ and $j = 1, \ldots, D$. The matrix $A_\theta$ becomes the identity matrix for $\vec{u} = \vec{u}^* = \vec{0}$ and thus is positive definite there. Now we let the coefficient $\theta$ in $A_\theta$ vary in a neighborhood of the initial value and rename the variable $\theta$ by $\eta$. Due to the joint continuity of the connection coefficients, there is an open neighborhood of $(\theta, \vec{0})$, which can be chosen in the form of $B_{\delta_\theta}(\theta) \times B_{c_\theta}(\vec{0})$, where $A_\eta(\vec{u}, \vec{u}^*)$ is positive definite. In this neighborhood $d^2 F_\eta/ds^2|_{s=s_{\theta 0}} > 0$ and hence $F_\eta(s) > \rho_\eta^2$ when $s \neq s_{\theta 0}$ belongs to a real neighborhood of $s_{\theta 0}$. This procedure can be performed for any point $\theta \in [0, \pi]$, obtaining a covering of this set made of complex open balls $B_{\delta_\theta}(\theta)$. Since $[0, \pi]$ is compact also as a complex set, we can extract a finite covering made of balls centered on the points $\theta_i$, $i = 1, \ldots, N$, whose union is a complex open neighborhood of $[0, \pi]$, $K'_p$. Let $c = \mathrm{Min}\{c_{\theta_i} \mid i = 1, \ldots, N\}$. If $\vec{u} \in B_c(\vec{0})$ and $\theta \in K'_p$, we have $d^2 F_\theta/ds^2|_{s=s_{\theta 0}} > 0$ and hence $F_\theta(s) > \rho_\theta^2$ when $s \neq s_{\theta 0}$ belongs to a real neighborhood of $s_{\theta 0}$, provided $\rho_\theta \in\, ]0, c[$. This concludes the proof of (p1). □

(p2) Choose a real $c > 0$ as in (p1). Then there exist a real $a$ with $0 < a < c$ and a complex open neighborhood of $[0, \pi]$, $K''_p$, such that: (1) fixing any $\theta \in K''_p$, any two points of $B_a(\vec{0})$ ($\bar{B}_a(\vec{0})$) can be joined by a complex geodesic segment of the metric $g_{(\lambda\theta)}$ which lies in $B_c(\vec{0})$; (2) fixing any $\theta \in K''_p$, each point of $B_a(\vec{0})$ ($\bar{B}_a(\vec{0})$) has a normal coordinate neighborhood, with respect to the metric $g_{(\lambda\theta)}$, containing $B_a(\vec{0})$ ($\bar{B}_a(\vec{0})$), and thus $B_a(\vec{0})$ ($\bar{B}_a(\vec{0})$) is a totally normal neighborhood with respect to any metric $g_{(\lambda\theta)}$.

Proof of (p2). Consider $G$ as a submanifold of $T(G)$ in a natural way and work in the coordinates $\vec{u}$ defined above; notice that $p \equiv \vec{0}$. Set

$$\Phi : (\theta, X) \mapsto (\theta, q, \exp_{(\theta)} X) \quad \text{for } X \in T_q(M),\ \theta \in \mathbb{C}. \tag{71}$$

In general, $\Phi$ is defined only in a neighborhood of $G$ in $T(G)$. Since the differential of $\Phi$ at $(\theta, \vec{0}, 0)$ for $\theta \in [0, \pi]$ is nonsingular, there exist an open neighborhood $V_\theta$ of $(\theta, \vec{0}, 0)$ in $\mathbb{C} \times T(G)$, a complex open neighborhood $B_{r_\theta}(\theta)$ of $\theta$ and a positive number $b_\theta < c$ such that $\Phi$ defines a diffeomorphism of $V_\theta$ onto $B_{r_\theta}(\theta) \times B_{b_\theta}(\vec{0}) \times B_{b_\theta}(\vec{0})$. Taking $V_\theta$, $r_\theta$ and $b_\theta$ small, we can assume that $\exp_{(\eta)}(tX) \in B_c(\vec{0})$ for all $X \in V_\theta$, $t \in \mathbb{C}$, $|t| \leq 1$ and $\eta \in B_{r_\theta}(\theta)$. Then extract a finite complex covering of $[0, \pi]$ made of balls $B_{r_{\theta_i}}(\theta_i)$, $i = 1, \ldots, M$. Let $K''_{p1}$ be the union of the sets of this finite covering. Item (1) holds true for any $a > 0$ with $a \leq b := \mathrm{Min}\{b_{\theta_i} \mid i = 1, \ldots, M\}$, since the complex geodesic segment corresponding to the metric $g_{(\lambda\theta)}$ with $\theta \in K''_{p1}$ from $q \in B_b(\vec{0})$ to $q' \in B_b(\vec{0})$ is the map $t \mapsto \exp_{(\theta)}(tX)$, where $|t| \leq 1$ and $(\theta, X) = \Phi^{-1}(\theta, q, q') \in B_{r_{\theta_k}} \times B_{b_{\theta_k}}(\vec{0})$ for some $k \in \{1, \ldots, M\}$.

Let us consider Item (2). Having fixed any $\theta \in [0, \pi]$ and the positive reals $r_\theta$, $b_\theta$ and $V_\theta$ exactly as in the proof of item (1), choosing $b'_\theta > 0$ and $\delta_\theta > 0$ small enough, we can fix an open subset of $B_{r_\theta}(\theta) \times V_\theta$, which is a neighborhood of $(\theta, \vec{0}, 0)$ with the form $B_{r'_\theta} \times B_{b'_\theta}(\vec{0}) \times B_{\delta_\theta}$, where $0 < r'_\theta < r_\theta$, $0 < b'_\theta < b_\theta$ and $B_{\delta_\theta}$ is an open ball of

radius $\delta_\theta > 0$ and center in $0 \in \mathbb{C}^D$. All the tangent spaces $T_q(G)$, $q \in B_{b'_\theta}(\vec{0})$, have been identified with $\mathbb{C}^D$ through the bases induced by the considered coordinates. Then choose an open neighborhood $B_{r''_\theta}(\theta) \times B_{b''_\theta}(\vec{0}) \times B_{b''_\theta}(\vec{0}) \subset \Phi(B_{r'_\theta}(\theta) \times B_{b'_\theta}(\vec{0}) \times B_{\delta_\theta})$. Finally notice that, if $q \in B_{b''_\theta}(\vec{0})$ and $\eta \in B_{r''_\theta}(\theta)$, since $\Phi$ is a diffeomorphism in $B_{r'_\theta}(\theta) \times B_{b'_\theta}(\vec{0}) \times B_{\delta_\theta}$, we have $\{\eta\} \times \{q\} \times B_{b''_\theta}(\vec{0}) \subset \Phi(\{\eta\} \times \{q\} \times B_{\delta_\theta})$, and in particular, from the definition of $\Phi$,

$$B_{b''_\theta}(\vec{0}) \subset \exp_{(\eta) q}(B_{\delta_\theta}), \tag{72}$$

for any $\eta \in B_{r''_\theta}$. All this can be performed for any fixed $\theta \in [0, \pi]$. Therefore, as we have done before, we can extract a finite covering of $[0, \pi]$ made of balls $B_{r''_{\theta_k}}(\theta_k)$, $k = 1, \ldots, L$, with union $K''_{p2}$. Then put $b'' := \mathrm{Min}\{b''_{\theta_k} \mid k = 1, \ldots, L\}$. Then $B_a(\vec{0})$ is a totally normal neighborhood, if $a \leq b''$, with respect to all the metrics $g_{(\lambda\theta)}$ whenever $\theta \in K''_{p2}$. In $K''_p := K''_{p1} \cap K''_{p2}$, (1) also holds true due to $b'' < b$. The proof for the closure of the considered neighborhoods is trivial and is obtained by taking $a$ smaller and noticing that $\bar{B}_{a'}(\vec{0}) \subset B_a(\vec{0})$ if $a' < a$. □

To complete the proof of Item (a), let $0 < \rho < a\ (< c)$ and let $q, q'$ be any pair of points in $B_\rho(p)$ ($\bar{B}_\rho(p)$), $p \equiv \vec{0}$. Let $\vec{u} = \vec{u}_\theta(s)$, $s \in [0, 1]$, be the real-parameter geodesic segment from $q$ to $q'$ in $B_c(p)$ (see (p1)) computed with respect to the metric $g_{(\lambda\theta)}$ with $\theta$ arbitrarily fixed in the complex open neighborhood of $[0, \pi]$ given by $K_p := K'_p \cap K''_p$. We shall show that this real-parameter geodesic segment lies completely in $B_\rho(p)$ ($\bar{B}_\rho(p)$). Consider the function $s \mapsto F_\theta(s)$ defined in (65) along this geodesic segment. Assume that $F_\theta(s) \geq \rho^2$ ($F_\theta(s) > \rho^2$) for some $s_\theta$, that is, $\vec{u}_\theta(s_\theta)$ lies outside $B_\rho(p)$ ($\bar{B}_\rho(p)$) for some $s_\theta$. Let $s_{\theta 0} \in\, ]0, 1[$ be the value for which $F_\theta$ attains the maximum, say $\rho_\theta^2 \geq \rho^2$ ($\rho_\theta^2 > \rho^2$). Then

$$0 = \frac{dF_\theta}{ds}\Big|_{s=s_{\theta 0}}. \tag{73}$$

This means that the real-parameter geodesic segment is tangent to the sphere $S_{\rho_\theta}(p)$ at the point $\vec{u}_\theta(s_{\theta 0})$. By the choice of $\rho$, the considered real-parameter geodesic segment lies inside the sphere $S_{\rho_\theta}(p)$, contradicting (p1).

To end the proof, let us prove that in any set $G_{pj}$ and for $\theta \in K_p$, the van Vleck–Morette determinant can be defined as a single-valued function which coincides with the ordinary van Vleck–Morette determinant for whatever value of $\theta$ such that the metric is real, in particular $\theta = 0, \pi$. By (35), we can assume that $\Delta^{1/2}_{VVM\theta}$ is single-valued if the functions defined in our coordinates by

$$(\theta, \vec{x}, \vec{y}) \mapsto F(\theta, \vec{x}, \vec{y}) := \frac{g_\theta(\vec{x})}{g_\theta(\vec{y})},$$

$$(\theta, \vec{x}, \vec{y}) \mapsto G(\theta, \vec{x}, \vec{y}) := \frac{(-1)^D}{g_\theta(\vec{x})} \det\left(\frac{\partial^2 \sigma_\theta(\vec{x}, \vec{y})}{\partial x^a\, \partial y^b}\right),$$

take values away from the cut of a sheet of the domain of definition of the functions $z \mapsto z^{1/4}$ and $z \mapsto z^{1/2}$. From now on, we fix this cut along the negative real axis and work in the sheet where both the functions $z \mapsto z^{1/4}$ and $z \mapsto z^{1/2}$ produce real and positive values when evaluated on positive real numbers. Let us prove that we can shrink the open set $B_\rho(p)$ in (63) (used to define the class of $G_{pj} := B_{(1 + \tanh j)\rho/2}(p)$) and $K_p$ such that, in $K_p \times B_\rho(p) \times B_\rho(p)$, the functions $F$ and $G$ take values with strictly positive real part. Fix $\theta \in [0, \pi]$ and $p \equiv \vec{z}$ (the arbitrary center of $B_\rho(p)$). Trivially $F(\theta, \vec{z}, \vec{z}) = 1$. On the other hand, it also holds that $\Delta_{VVM\theta}(\vec{z}, \vec{z}) = 1$, because the VVM determinant is a bi-scalar and this result can be trivially obtained in normal coordinates centered in $\vec{x}$ by (36). Therefore we also have, in our coordinates, $G(\theta, \vec{z}, \vec{z}) = 1$. Since $F$ and $G$ are jointly continuous in $(\theta, \vec{x}, \vec{y})$, there is a neighborhood of $(\theta, \vec{z}, \vec{z})$ of the form $B_{k_\theta}(\theta) \times B_{\rho_\theta}(p) \times B_{\rho_\theta}(p)$, $0 < \rho_\theta \leq \rho$, where both functions assume only values with strictly positive real part. This procedure can be performed for any point $\theta \in [0, \pi]$, obtaining a covering of this set made of complex open disks $B_{k_\theta}(\theta)$. By compactness, we can extract a finite sub-covering made of disks $B_{k_{\theta_i}}(\theta_i)$ centered in $\theta_i$, $i = 1, \ldots, L$, and a corresponding finite class of open neighborhoods of $p$, $B_{\rho_{\theta_i}}(p)$, $i = 1, \ldots, L$. Then we can take $\rho' > 0$ such that $B_{\rho'}(p) \subset \cap_i B_{\rho_{\theta_i}}(p)$ and use $B_{\rho'}(p)$ to define the class of $G_{pj}$ by (64). Finally, $K_p$ can be re-defined as the intersection between the initial $K_p$ and $\cup_i B_{k_{\theta_i}}(\theta_i)$. With these definitions both functions $F$ and $G$ assume values with strictly positive real part whenever $\theta \in K_p$ and $\vec{x}, \vec{y}$ belong to any $G_{pj}$. This implies that, in any $G_{pj}$, $\Delta^{1/2}_{VVM\theta}$ can be defined as a single-valued function for any $\theta \in K_p$.
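As a sanity check on the normalization $F = G = 1$ used above, the flat case can be worked out explicitly. This is only a sketch under two assumptions not restated in this passage: the metric components $g_{ab}$ are constant (possibly complex) in the chosen coordinates, and $\sigma$ is one half of the squared geodesic distance, which is the normalization consistent with $G(\theta, \vec{z}, \vec{z}) = 1$.

```latex
% Assumptions: constant components g_{ab}, g := \det(g_{ab}) \neq 0,
% and \sigma equal to one half of the squared geodesic distance.
\[
  \sigma(\vec{x}, \vec{y}) = \tfrac{1}{2}\, g_{ab}\,(x^a - y^a)(x^b - y^b)
  \quad\Longrightarrow\quad
  \frac{\partial^2 \sigma}{\partial x^a\, \partial y^b} = -\,g_{ab},
\]
\[
  F = \frac{g}{g} = 1, \qquad
  G = \frac{(-1)^D \det(-g_{ab})}{g}
    = \frac{(-1)^D (-1)^D\, g}{g} = 1 .
\]
```

In this special case $F$ and $G$ equal 1 everywhere, not only on the diagonal, so both stay away from the cut along the negative real axis. In the general case the same holds only near the diagonal, which is why the neighborhoods have to be shrunk by the continuity and compactness argument.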
Moreover, with the above choice of the sheet of definition of the functions $z \mapsto z^{1/4}$ and $z \mapsto z^{1/2}$, $\Delta^{1/2}_{VVM\theta}(\vec{x}, \vec{y})$ coincides with the usual real one for those $\theta$ such that the function takes real values. □

Note added in proof. S. Hollands pointed out to me that a partial but relevant result about the symmetry of the Lorentzian Hadamard coefficients $v_j(x, y)$ in (44) is contained in the final chapter of Friedlander's book The wave equation on a curved space-time (Cambridge University Press, Cambridge, 1975). This result overlaps with results of the present work through a direct corollary of Theorem 6.4.1 in Friedlander's book, taking into account the comment after Theorem 4.3.1, where it is specified that the coefficients considered in the book essentially coincide with Hadamard's coefficients $v_j(x, y)$ despite a different use and definition. This corollary and the comment show that, in a smooth Lorentzian manifold, when $x, y$ belong to a common, sufficiently small, causal domain and $\sigma(x, y) < 0$ is satisfied, then $v_j(x, y) = v_j(y, x)$. This result is achieved using the whole theory of Lorentzian distributions developed in the book (see in particular Theorems 5.2.1 and 6.3.2) and makes use of the method of descent, which explicitly requires a Lorentzian ($D$-dimensional) metric. Finally, within a short remark after Theorem 6.4.1, it is suggested that a generalization to a whole causal domain may be obtained in the analytic case. Then it is argued that the smooth non-analytic case could also be encompassed by means of some approximation procedure of smooth differential equations by analytic differential equations. This last part of the suggested procedure seems to be exactly what we have explicitly done in Theorem 3.2. I am very grateful to S. Hollands for his remark.

References

[BD82] Birrell, N.D. and Davies, P.C.W.: Quantum Fields in Curved Space. Cambridge: Cambridge University Press, 1982
[Ca90] Camporesi, R.: Phys. Rep. 196, 1 (1990)
[Cs96] Cassa, A.: Class. Quant. Grav. 12, 5, 1151 (1995)
[Ch84] Chavel, I.: Eigenvalues in Riemannian Geometry. Orlando, FL: Academic Press, Inc., 1984
[Fu91] Fulling, S.A.: Aspects of Quantum Field Theory in Curved Space-Time. Cambridge: Cambridge University Press, 1991
[FR87] Fulling, S.A. and Ruijsenaars, S.N.M.: Phys. Rep. 152, 135 (1987)
[FSW78] Fulling, S.A., Sweeny, M.R., Wald, M.: Commun. Math. Phys. 63, 257 (1978)
[Ga64] Garabedian, P.R.: Partial Differential Equations. New York: John Wiley and Sons, Inc., 1964
[GH93] Gibbons, G.W., Hawking, S.W. (eds.): Euclidean Quantum Gravity. Singapore: World Scientific, 1993
[Gi84] Gilkey, P.G.: Invariance theory, the heat equation and the Atiyah-Singer index theorem. Math. Lecture Series 11, Boston, MA: Publish or Perish Inc., 1984
[Ha77] Hawking, S.W.: Commun. Math. Phys. 55, 133 (1977)
[KN63] Kobayashi, S. and Nomizu, K.: Foundations of Differential Geometry Vol. 1. New York: Interscience Publishers, 1963
[LB83] LeBrun, C.: Trans. A.M.S. 278, 1, 209 (1983)
[Mo99a] Moretti, V.: Commun. Math. Phys. 201, 327 (1999)
[Mo99b] Moretti, V.: J. Math. Phys. 40, 3843 (1999)
[Mo99c] Moretti, V.: Commun. Math. Phys. 208, 283 (1999)
[ON83] O'Neill, B.: Semi-Riemannian Geometry with applications to Relativity. New York: Academic Press, 1983
[Wa78] Wald, R.M.: Phys. Rev. D 17, 1477 (1978)
[Wa79] Wald, R.M.: Commun. Math. Phys. 70, 226 (1979)
[Wa84] Wald, R.M.: General Relativity. Chicago, IL: The University of Chicago Press, 1984
[Wa94] Wald, R.M.: Quantum Field theory and Black Hole Thermodynamics in Curved Spacetime. Chicago, IL: The University of Chicago Press, 1994

Communicated by H. Araki

Commun. Math. Phys. 212, 191 – 204 (2000)


Uniform Spectral Properties of One-Dimensional Quasicrystals, III. α-Continuity

David Damanik1,2, Rowan Killip1, Daniel Lenz2

1 Department of Mathematics 253–37, California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]; [email protected]
2 Fachbereich Mathematik, Johann Wolfgang Goethe-Universität, 60054 Frankfurt, Germany. E-mail: [email protected]

Received: 29 September 1999 / Accepted: 14 January 2000

Abstract: We study the spectral properties of one-dimensional whole-line Schrödinger operators, especially those with Sturmian potentials. Building upon the Jitomirskaya–Last extension of the Gilbert–Pearson theory of subordinacy, we demonstrate how to establish α-continuity of a whole-line operator from power-law bounds on the solutions on a half-line. However, we require that these bounds hold uniformly in the boundary condition. We are able to prove these bounds for Sturmian potentials with rotation numbers of bounded density and arbitrary coupling constant. From this we establish purely α-continuous spectrum uniformly for all phases. Our analysis also permits us to prove that the point spectrum is empty for all Sturmian potentials.

1. Introduction

In this article we are interested in spectral properties of discrete one-dimensional Schrödinger operators on the whole line, that is, operators $H$ in $\ell^2(\mathbb{Z})$ of the form

$$[Hu](n) = u(n+1) + u(n-1) + V(n)u(n) \tag{1}$$

with arbitrary potential $V : \mathbb{Z} \to \mathbb{R}$. Among the most powerful tools that have been developed for the investigation of the spectral type of such operators are those which establish a correspondence to the behavior of the solutions of the associated difference equation

$$u(n+1) + u(n-1) + V(n)u(n) = Eu(n). \tag{2}$$

In what follows we shall always assume a solution of (2) to be normalized in the sense that

$$|u(0)|^2 + |u(1)|^2 = 1. \tag{3}$$


An elementary observation is that a support of the pure point part of a spectral measure associated to $H$ is given by

$$M_{\mathrm{pp}} = \{E \in \mathbb{R} : \exists \text{ solution } u \text{ of (2) which is } \ell^2 \text{ at both } \pm\infty\}.$$

The following notion, introduced by Gilbert and Pearson [7], allows an analogous description of a support of the singular part. Namely, a solution $u$ of (2) is called subordinate at $+\infty$ if

$$\lim_{L \to \infty} \frac{\|u\|_L}{\|v\|_L} = 0 \tag{4}$$

for any solution $v$ of (2) which is linearly independent of $u$. Here $\|\cdot\|_L$ denotes the norm of the solution over a lattice interval of length $L$, that is,

$$\|u\|_L^2 = \sum_{n=0}^{\lfloor L \rfloor} |u(n)|^2 + (L - \lfloor L \rfloor)\,|u(\lfloor L \rfloor + 1)|^2. \tag{5}$$
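As a purely illustrative sketch (not part of the original argument), the interval norm (5) can be evaluated numerically for a solution of (2) generated by forward recursion. The function names `solve_forward` and `interval_norm`, and the choice of the free case $V = 0$, $E = 2$, are assumptions made for this example only.

```python
import math

def solve_forward(V, E, u0, u1, n_max):
    """Return [u(0), ..., u(n_max)] for the solution of (2) with data u(0), u(1)."""
    u = [u0, u1]
    for n in range(1, n_max):
        # (2) rearranged: u(n+1) = (E - V(n)) u(n) - u(n-1)
        u.append((E - V(n)) * u[n] - u[n - 1])
    return u

def interval_norm(u, L):
    """Return ||u||_L from (5) for a list u with u[n] = u(n), n >= 0."""
    k = math.floor(L)
    total = sum(abs(u[n]) ** 2 for n in range(k + 1))
    # Linear interpolation term for non-integer L.
    total += (L - k) * abs(u[k + 1]) ** 2
    return math.sqrt(total)

# Free case V = 0 at energy E = 2: the normalized solution with
# u(0) = 0, u(1) = 1 is u(n) = n.
u = solve_forward(lambda n: 0.0, 2.0, 0.0, 1.0, 10)
norm_sq_3 = interval_norm(u, 3) ** 2      # 0 + 1 + 4 + 9
norm_sq_2_5 = interval_norm(u, 2.5) ** 2  # 0 + 1 + 4 + 0.5 * 9
```

In this example $\|u\|_L$ grows like $L^{3/2}$, which is the kind of power-law behavior the estimates later in the paper quantify.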

Subordinacy of a solution $u$ at $-\infty$ is defined analogously. Gilbert [6] then proves that

$$M_{\mathrm{sing}} = \{E \in \mathbb{R} : \exists \text{ solution } u \text{ of (2) which is subordinate at both } \pm\infty\}$$

is a support of the singular part of a spectral measure associated to $H$. Hence the standard decomposition of a spectral measure into its pure point, singular continuous, and absolutely continuous parts can be investigated by studying solutions of (2). Recall that by the RAGE theorem, each of these standard spectral parts is related to certain quantum dynamical behavior.

Remark 1. Note that these support descriptions require a certain condition to hold at both "endpoints" $+\infty$ and $-\infty$. That is, if one can show that for some energy $E$, there is an endpoint such that no solution of (2) satisfies this condition (square-summability and subordinacy, respectively) at this endpoint, then this energy does not belong to the respective support. In this sense the "more continuous half-line dominates" and this picture is consistent with heuristic quantum evolution in one dimension.

Recently, further decompositions of spectral measures have been proposed by Last [18]. These decompositions are motivated by the goal of answering more delicate questions arising in the study of quantum dynamics in the presence of purely singular continuous spectral measures. A finite positive measure $d\Lambda$ is said to be uniformly α-Hölder continuous (or UαH) if the distribution function

$$\Lambda(E) = \int_{-\infty}^{E} d\Lambda$$

is uniformly α-Hölder continuous. A measure is said to be α-continuous if it is absolutely continuous with respect to a UαH measure. This definition of α-continuity is equivalent to the more common "$\mu(S) = 0$ for all sets $S$ of zero α-Hausdorff measure" [22]. On the other hand, a measure is called α-singular if it is supported on a set of zero α-Hausdorff measure. Last discusses the decomposition of a measure into its α-continuous and its α-singular part and he obtains explicit quantum dynamical bounds in the case where the α-continuous part is non-trivial. Moreover, this decomposition is further motivated as there is apparently a very nice interpolation of the Gilbert–Pearson results. Namely,


Jitomirskaya and Last introduce in [11] the following notion: A solution $u$ of (2) is called α-subordinate at $+\infty$ if, setting $\beta = \alpha/(2 - \alpha)$,

$$\liminf_{L \to \infty} \frac{\|u\|_L}{\|v\|_L^{\beta}} = 0 \tag{6}$$

for any solution $v$ of (2) which is linearly independent of $u$. Again, α-subordinacy at $-\infty$ is defined analogously. In [12] these authors establish this interpolation for half-line operators. The natural whole-line correspondence accompanying the half-line result would be the following interpolation of the Gilbert result.

Conjecture. A support of the α-singular part of a spectral measure associated to $H$ is given by

$$M_{\alpha\text{-sing}} = \{E \in \mathbb{R} : \exists \text{ solution } u \text{ of (2) which is } \alpha\text{-subordinate at both } \pm\infty\}.$$

We shall obtain, in Theorem 1 below, a restricted version of this statement. In view of Remark 1 the goal is to establish the following implication: Pick some endpoint. If for all energies in some set $\Sigma$, no solution of (2) is α-subordinate at the chosen endpoint, then the α-singular part of a spectral measure associated to $H$ gives zero weight to $\Sigma$. There is a well-known way to prove non-existence of α-subordinate solutions for some fixed energy $E$ and a fixed endpoint which has been exploited in [2, 13]. Namely, power-law bounds of the form

$$C_1 L^{\gamma_1} \le \|u\|_L \le C_2 L^{\gamma_2}$$

for all normalized solutions $u$ of (2) imply non-existence of α-subordinate solutions at $+\infty$, where $\alpha = 2\gamma_1/(\gamma_1 + \gamma_2)$; similarly at $-\infty$. The restriction we have to impose on the conjecture in order to establish the desired connection is twofold. Firstly, we require that non-existence of α-subordinate solutions is established by this power-law criterion. Secondly, we need that the bounds are uniform in the solutions corresponding to a fixed energy. Under these assumptions one may conclude purely α-continuous spectrum on $\Sigma$.

Theorem 1. Let $\Sigma$ be a bounded set. Suppose there are constants $\gamma_1, \gamma_2$ such that for each $E \in \Sigma$, every normalized solution of (2) obeys the estimate

$$C_1(E) L^{\gamma_1} \le \|u\|_L \le C_2(E) L^{\gamma_2} \tag{7}$$

for $L > 0$ sufficiently large and suitable constants $C_1(E), C_2(E)$. Let $\alpha = 2\gamma_1/(\gamma_1 + \gamma_2)$. Then $H$ has purely α-continuous spectrum on $\Sigma$, that is, for any $\phi \in \ell^2$, the spectral measure for the pair $(H, \phi)$ is purely α-continuous on $\Sigma$. Moreover, if the constants $C_1(E), C_2(E)$ can be chosen independently of $E \in \Sigma$, then for any $\phi \in \ell^2$ of compact support, the spectral measure for the pair $(H, \phi)$ is uniformly α-Hölder continuous on $\Sigma$.

Remark 2. a) We have stated the theorem in "right half-line" form. Of course, there is an analogous "left half-line" version.
b) In particular, the intuition embodied in Remark 1 interpolates. For example, if one is able to establish uniform power-law bounds on the right half-line, then the resulting α-continuity is independent of the potential on the left half-line. In this sense the more continuous half-line dominates and bounds the dimensionality of the whole-line problem from below. Note, however, that the naive rule "dim(whole-line) = max(dim(left half-line), dim(right half-line))" is wrong. Indeed, using the analysis of sparse potentials by Jitomirskaya and Last in [12], one may construct examples where the two half-line problems have zero-dimensional spectrum (in a certain energy region) and the whole-line problem has one-dimensional spectrum.
c) By combining the results of [12] and the ideas we present to prove Theorem 1, one can prove analogs of this theorem for Jacobi matrices and Schrödinger operators in $L^2(\mathbb{R})$.

Our application of Theorem 1 is to Schrödinger operators with Sturmian potentials. That is, we shall consider the operators

$$[H_{\lambda,\theta,\beta} u](n) = u(n+1) + u(n-1) + \lambda v_{\theta,\beta}(n) u(n), \tag{8}$$

acting in $\ell^2(\mathbb{Z})$, along with the corresponding difference equation

$$(H_{\lambda,\theta,\beta} - E)u = 0. \tag{9}$$

Here

$$v_{\theta,\beta}(n) = \chi_{[1-\theta,1)}(n\theta + \beta \bmod 1),$$

with coupling constant $\lambda \in \mathbb{R} \setminus \{0\}$, irrational rotation number $\theta \in (0, 1)$, and phase $\beta \in [0, 1)$. The family of operators $(H_{\lambda,\theta,\beta})$ is commonly agreed to model a one-dimensional quasicrystal. It provides a natural generalization of the Fibonacci family of operators, which corresponds to rotation number $\theta = \theta_F = (\sqrt{5} - 1)/2$, the golden mean. This model was introduced independently by two groups in the early 1980's [16, 21] and has been studied extensively since. The review articles [3, 25] recount the history of generalizations of the basic Fibonacci model and the results obtained for each of them.

Before stating the result, let us recall some basic notions from continued fraction expansion theory; we mention [15, 17] as general references. Given $\theta \in (0, 1)$ irrational, we have an expansion

$$\theta = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}}$$

with uniquely determined $a_n \in \mathbb{N}$. The associated rational approximants $p_n/q_n$ are defined by

$$p_0 = 0, \quad p_1 = 1, \quad p_n = a_n p_{n-1} + p_{n-2},$$
$$q_0 = 1, \quad q_1 = a_1, \quad q_n = a_n q_{n-1} + q_{n-2}.$$

The number $\theta$ is said to have bounded density if

$$\limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} a_i < \infty.$$

The set of bounded density numbers is uncountable but has Lebesgue measure zero.
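The definitions above can be made concrete with a short numerical sketch. It is an illustration only: the helper names `convergents`, `partial_quotients` and `sturmian` are hypothetical, and the coefficients are extracted in floating point, which is reliable only for the first few $a_n$.

```python
import math

def convergents(a):
    """Given [a_1, ..., a_N], return the lists [p_0..p_N] and [q_0..q_N]."""
    p, q = [0, 1], [1, a[0]]
    for a_n in a[1:]:
        p.append(a_n * p[-1] + p[-2])
        q.append(a_n * q[-1] + q[-2])
    return p, q

def partial_quotients(theta, count):
    """First `count` coefficients a_n of theta in (0, 1) via the Gauss map.

    Floating point only: rounding errors grow with each step, so this is
    trustworthy for small `count` only.
    """
    a, x = [], theta
    for _ in range(count):
        x = 1.0 / x
        a_n = math.floor(x)
        a.append(a_n)
        x -= a_n
    return a

def sturmian(n, theta, beta=0.0):
    """v_{theta,beta}(n): indicator of [1 - theta, 1) at (n*theta + beta) mod 1."""
    return 1 if (n * theta + beta) % 1.0 >= 1.0 - theta else 0

theta_F = (math.sqrt(5) - 1) / 2   # golden mean, all a_n = 1
a = partial_quotients(theta_F, 8)
p, q = convergents(a)              # the q_n are Fibonacci numbers here
potential = [sturmian(n, theta_F) for n in range(1, 9)]
mean_a = sum(a) / len(a)           # finite-n average from the bounded density condition
```

For the golden mean the averages $\frac{1}{n}\sum a_i$ are identically 1, so $\theta_F$ is a bounded density number; more generally any $\theta$ with bounded $a_n$ qualifies.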


Theorem 2. Let $\theta$ be a bounded density number. Then for every $\lambda$, there exists $\alpha = \alpha(\lambda, \theta) > 0$ such that for every $\beta$ and every $\phi \in \ell^2(\mathbb{Z})$ of compact support, the spectral measure for the pair $(H_{\lambda,\theta,\beta}, \phi)$ is uniformly α-Hölder continuous. In particular, $H_{\lambda,\theta,\beta}$ has purely α-continuous spectrum.

In the course of the proof of Theorem 2 we will establish solution estimates which allow us to exclude eigenvalues for all parameter values.

Theorem 3. For every $\lambda, \theta, \beta$, the operator $H_{\lambda,\theta,\beta}$ has empty point spectrum.

Remark 3. This is the final result on a question with a long history. Building upon Sütő [23, 24], the paper [1] by Bellissard et al. proves zero-measure spectrum and hence absence of absolutely continuous spectrum for all parameter values. Moreover, the authors of [1] implicitly exclude eigenvalues for $\beta = 0$ and arbitrary $\lambda, \theta$. Absence of eigenvalues for $\beta \neq 0$ is listed as an open problem. Various partial results have been obtained since; see [4] for detailed remarks on the history of the problem and the first result that holds uniformly in the phase. The main improvement in the present article will be discussed in Sect. 4.

Combining Theorem 3 with the results from [1] we obtain a complete identification of the spectral type.

Corollary 1.1. For every $\lambda, \theta, \beta$, the operator $H_{\lambda,\theta,\beta}$ has purely singular continuous zero-measure spectrum.

The organization of this article is as follows. Section 2 discusses the transition from half-line eigenfunction estimates to spectral properties of the whole-line operator and so proves Theorem 1. In Sect. 3 we present some crucial properties of Sturmian potentials. We recall in particular the unique decomposition property and the uniform bounds on the traces of certain transfer matrices. Section 4 provides a study of the scaling properties of solutions to (9) with respect to the decomposition of the potentials on various levels and shows how Theorem 3 follows from these scaling properties. Uniform upper and lower power-law bounds on $\|u\|_L$ for certain rotation numbers are established in Sect. 5, where this information is then combined with Theorem 1 to prove Theorem 2.

2. Subordinacy Theory

In this section we demonstrate how the solution estimates discussed in the introduction may be used to prove α-continuity of spectral measures for some $\alpha > 0$. Although we shall only be applying Theorem 1 to Sturmian potentials, we believe the result is of broader interest. Moreover, it will cost us nothing in clarity to treat the operator

$$[Hu](n) = u(n+1) + u(n-1) + V(n)u(n)$$

with arbitrary potential $V : \mathbb{Z} \to \mathbb{R}$. To each such whole-line operator we associate two half-line operators, $H_+ = P_+^* H P_+$ and $H_- = P_-^* H P_-$, where $P_\pm$ denote the inclusions $P_+ : \ell^2(\{1, 2, \ldots\}) \hookrightarrow \ell^2(\mathbb{Z})$ and $P_- : \ell^2(\{0, -1, -2, \ldots\}) \hookrightarrow \ell^2(\mathbb{Z})$. The spectral properties of $H, H_\pm$ are typically studied via the Weyl m-functions. For each $z \in \mathbb{C} \setminus \mathbb{R}$ we define $\psi^\pm(n; z)$ to be the unique solutions to

$$H\psi^\pm = z\psi^\pm, \qquad \psi^\pm(0; z) = 1 \qquad \text{and} \qquad \sum_{n=0}^{\infty} |\psi^\pm(\pm n; z)|^2 < \infty.$$


With this notation we can define the Weyl functions by

$$m^+(z) = \langle \delta_1 | (H_+ - z)^{-1} \delta_1 \rangle = -\psi^+(1; z)/\psi^+(0; z),$$
$$m^-(z) = \langle \delta_0 | (H_- - z)^{-1} \delta_0 \rangle = -\psi^-(0; z)/\psi^-(1; z)$$

for each $z \in \mathbb{C} \setminus \mathbb{R}$. Here and elsewhere, $\delta_n$ denotes the vector in $\ell^2$ supported at $n$ with $\delta_n(n) = 1$. For the whole-line problem, the m-function role is played by the $2 \times 2$ matrix $M(z)$:

$$\begin{pmatrix} a \\ b \end{pmatrix}^{\dagger} M(z) \begin{pmatrix} a \\ b \end{pmatrix} = \big\langle a\delta_0 + b\delta_1 \,\big|\, (H - z)^{-1} (a\delta_0 + b\delta_1) \big\rangle.$$

Or, more explicitly,

$$M = \frac{1}{\psi^+(1)\psi^-(0) - \psi^+(0)\psi^-(1)} \begin{pmatrix} \psi^+(0)\psi^-(0) & \psi^+(1)\psi^-(0) \\ \psi^+(1)\psi^-(0) & \psi^+(1)\psi^-(1) \end{pmatrix} = \frac{1}{1 - m^+ m^-} \begin{pmatrix} m^- & -m^+ m^- \\ -m^+ m^- & m^+ \end{pmatrix}$$

with $z$ dependence suppressed. We define $m(z) = \operatorname{tr} M(z)$, that is, the trace of $M$. These definitions relate the m-functions to resolvents and hence to spectral measures. By pursuing these relations, one finds that:

$$m^\pm(z) = \int \frac{1}{t - z}\, d\rho^\pm(t), \qquad m(z) = \int \frac{1}{t - z}\, d\Lambda(t), \tag{10}$$

where $d\rho^+, d\rho^-$ are the spectral measures for the pairs $(H_+, \delta_1)$, $(H_-, \delta_0)$, respectively, and $d\Lambda$ is the sum of the spectral measures for the pairs $(H, \delta_0)$ and $(H, \delta_1)$. An immediate consequence of these representations is that each of the m-functions maps $\mathbb{C}^+ = \{x + iy : y > 0\}$ to itself.

The pair of vectors $\{\delta_0, \delta_1\}$ is cyclic for $H$; indeed, if $\phi$ is supported in $\{-N, \ldots, N, N+1\}$, then there exist polynomials $P_0, P_1$ of degree not exceeding $N$ such that $\phi = P_0(H)\delta_0 + P_1(H)\delta_1$. This may be proved readily, by induction, once it is observed that $\phi(-N), \phi(N+1)$ uniquely determine the leading coefficients of $P_0, P_1$, respectively.

Our immediate goal is to prove that $d\Lambda$ is uniformly α-Hölder continuous. This will follow quickly from

Theorem 4. Fix $E \in \mathbb{R}$. Suppose every solution of $(H - E)u = 0$ with $|u(0)|^2 + |u(1)|^2 = 1$ obeys the estimate

$$C_1 L^{\gamma_1} \le \|u\|_L \le C_2 L^{\gamma_2} \tag{11}$$

for $L > 0$ sufficiently large. Then

$$\sup_{\varphi} \left| \frac{\sin(\varphi) + \cos(\varphi)\, m^+(E + i\varepsilon)}{\cos(\varphi) - \sin(\varphi)\, m^+(E + i\varepsilon)} \right| \le C_3\, \varepsilon^{\alpha - 1}, \tag{12}$$

where $\alpha = 2\gamma_1/(\gamma_1 + \gamma_2)$.

Proof. This result lies within the Gilbert–Pearson theory of subordinacy [6, 7, 14]. A concise proof is available in [11, 12]. In this context, the $\varphi$ above corresponds to the choice of boundary conditions. □


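Corollary 2.1. Given a Borel set $\Sigma$, suppose that the estimate (11) holds for every $E \in \sigma(H)$ with $C_1, C_2$ independent of $E$. Then, given any function $m^- : \mathbb{C}^+ \to \mathbb{C}^+$, and any $E \in \Sigma$,

$$|m(E + i\varepsilon)| = \left| \frac{m^+(E + i\varepsilon) + m^-(E + i\varepsilon)}{1 - m^+(E + i\varepsilon)\, m^-(E + i\varepsilon)} \right| \le C_3\, \varepsilon^{\alpha - 1} \tag{13}$$

for all $\varepsilon > 0$. Consequently, $\Lambda(E)$ is uniformly α-Hölder continuous at all points $E \in \Sigma$. In particular, $d\Lambda$ is α-continuous on $\Sigma$.

Proof. Fix $E \in \Sigma$ and $\varepsilon > 0$. Then, by introducing new variables $z = e^{2i\varphi}$ and $\mu = (m^+ - i)/(m^+ + i)$, we may rewrite (12) as

$$\sup_{|z| = 1} \left| \frac{1 + \mu z}{1 - \mu z} \right| \le C_3\, \varepsilon^{\alpha - 1}.$$

Note that $\operatorname{Im}(m^+) > 0$ implies $|\mu| < 1$ and so $(1 + \mu z)/(1 - \mu z)$ defines an analytic function on $\{z : |z| \le 1\}$. The point $z = (i - m^-)/(i + m^-)$ lies inside the unit disk since $\operatorname{Im}(m^-) > 0$. The estimate (13) now follows from the maximum modulus principle and a few simple manipulations. This estimate and the representation (10) provide

$$\Lambda\big([E - \varepsilon, E + \varepsilon]\big) \le 2\varepsilon \operatorname{Im} m(E + i\varepsilon) \le 2 C_3\, \varepsilon^{\alpha}$$

for all $E \in \Sigma$, $\varepsilon > 0$, from which $\Lambda(E)$ is uniformly α-Hölder continuous on $\Sigma$. □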
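The "few simple manipulations" in the preceding proof can be spelled out as follows. This is a reader's sketch under the notation of (12) and (13), with the abbreviations $m^\pm = m^\pm(E + i\varepsilon)$ and $w = e^{i\varphi}$ introduced here for convenience; it is not taken from the original text.

```latex
% Abbreviations (assumed for this sketch): m^\pm = m^\pm(E + i\varepsilon),
% w = e^{i\varphi}, z = w^2, \mu = (m^+ - i)/(m^+ + i), i.e. m^+ = i(1+\mu)/(1-\mu).
\begin{align*}
\sin\varphi + \cos\varphi\, m^+ &= \frac{i(\bar w + \mu w)}{1 - \mu}, &
\cos\varphi - \sin\varphi\, m^+ &= \frac{\bar w - \mu w}{1 - \mu},
\end{align*}
so that, since $|w| = 1$,
\[
\left| \frac{\sin\varphi + \cos\varphi\, m^+}{\cos\varphi - \sin\varphi\, m^+} \right|
= \left| \frac{1 + \mu z}{1 - \mu z} \right| , \qquad |z| = 1 .
\]
% Evaluating the same analytic function at z_0 = (i - m^-)/(i + m^-), |z_0| < 1:
\[
\frac{1 + \mu z_0}{1 - \mu z_0}
= \frac{(m^+ + i)(m^- + i) + (m^+ - i)(i - m^-)}{(m^+ + i)(m^- + i) - (m^+ - i)(i - m^-)}
= \frac{2i\,(m^+ + m^-)}{2\,(m^+ m^- - 1)}
= -i\, \frac{m^+ + m^-}{1 - m^+ m^-} ,
\]
whose modulus is $|m(E + i\varepsilon)|$; the maximum modulus principle then gives (13).
% The final measure bound uses the Poisson-type kernel in (10):
\[
\operatorname{Im} m(E + i\varepsilon)
= \int \frac{\varepsilon}{(t - E)^2 + \varepsilon^2}\, d\Lambda(t)
\ge \frac{1}{2\varepsilon}\, \Lambda\big([E - \varepsilon, E + \varepsilon]\big) .
\]
```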

Remark 4. If we permit $C_1, C_2$ to depend on $E$, the only consequence is that now $C_3$ depends on $E$ and so $\Lambda$ need not be uniformly Hölder continuous. However, α-continuity is still guaranteed.

Proof of Theorem 1. Given $\phi \in \ell^2(\mathbb{Z})$ with compact support, the remarks preceding Theorem 4 show that the spectral measure for $\phi$ is bounded by $f(E)\,d\Lambda(E)$ for some polynomially bounded function $f(E)$. If $C_1, C_2$ are independent of $E$, then, by the above corollary, $d\Lambda$ is uniformly α-Hölder continuous, and as $\Sigma$ is bounded, this implies that $f\,d\Lambda$ is also UαH. In the case that $C_1, C_2$ are permitted to depend on $E$, the remark above shows that $d\Lambda$ is α-continuous. Given any $\phi \in \ell^2$, its spectral measure may be written as $f\,d\Lambda$ and so must be α-continuous. □

3. Basic Properties of Sturmian Potentials

In this section we recall some basic properties of Sturmian potentials. For further information we refer the reader to [1, 3, 4, 19, 20]. We focus, in particular, on the decomposition of Sturmian potentials into canonical words, which obey recursive relations, and on known results on the traces of the transfer matrices associated to these words.

Fix some rotation number, $\theta$, and let $a_n$ denote the coefficients in its continued fraction expansion. Define the words $s_n$ over the alphabet $A = \{0, 1\}$ by

$$s_{-1} = 1, \qquad s_0 = 0, \qquad s_1 = s_0^{a_1 - 1} s_{-1}, \qquad s_n = s_{n-1}^{a_n} s_{n-2}, \quad n \ge 2. \tag{14}$$

In particular, the word $s_n$ has length $q_n$ for each $n \ge 0$. By definition, $s_{n-1}$ is a prefix of $s_n$ for each $n \ge 2$. For later use, we recall the following elementary formula [4].


Proposition 3.1. For each $n \ge 2$, $s_n s_{n+1} = s_{n+1} s_{n-1}^{a_n - 1} s_{n-2} s_{n-1}$.
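The recursion (14) and the identity of Proposition 3.1 are easy to check on examples. The following sketch builds the words as Python strings; the function names and the sample coefficients $a_1, \ldots, a_6$ are arbitrary choices made for illustration.

```python
def canonical_words(a, n_max):
    """Return {n: s_n for n = -1, ..., n_max} following (14); a[i-1] holds a_i."""
    s = {-1: "1", 0: "0"}
    s[1] = s[0] * (a[0] - 1) + s[-1]
    for n in range(2, n_max + 1):
        s[n] = s[n - 1] * a[n - 1] + s[n - 2]
    return s

def check_proposition_3_1(s, a, n):
    """Test s_n s_{n+1} == s_{n+1} s_{n-1}^{a_n - 1} s_{n-2} s_{n-1} for one n >= 2."""
    lhs = s[n] + s[n + 1]
    rhs = s[n + 1] + s[n - 1] * (a[n - 1] - 1) + s[n - 2] + s[n - 1]
    return lhs == rhs

a = [1, 2, 1, 3, 2, 1]                  # arbitrary sample coefficients a_1, ..., a_6
s = canonical_words(a, 6)
lengths = [len(s[n]) for n in range(0, 7)]   # should match q_0, ..., q_6
all_ok = all(check_proposition_3_1(s, a, n) for n in range(2, 6))
```

The identity itself is a one-line consequence of (14): $s_n s_{n-1} = s_{n-1}\,(s_{n-1}^{a_n - 1} s_{n-2} s_{n-1})$, and both sides of Proposition 3.1 begin with $s_n^{a_{n+1}}$.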

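By Proposition 3.1, the word $s_n s_{n+1}$ has $s_{n+1}$ as a prefix. Note that the dependence of $a_n, p_n, q_n, s_n$ on $\theta$ is left implicit. Fix coupling constant $\lambda$ and energy $E$; then, for each $w = w_1 \ldots w_n \in A^n$, we define the transfer matrix $M(\lambda, E, w)$ by

$$M(\lambda, E, w) = \begin{pmatrix} E - \lambda w_n & -1 \\ 1 & 0 \end{pmatrix} \times \cdots \times \begin{pmatrix} E - \lambda w_1 & -1 \\ 1 & 0 \end{pmatrix}. \tag{15}$$

If $u$ is a solution to (9), we have

$$U(n+1) = M\big(\lambda, E, v_{\theta,\beta}(1) \ldots v_{\theta,\beta}(n)\big)\, U(1), \qquad \text{where} \qquad U(n) = \begin{pmatrix} u(n) \\ u(n-1) \end{pmatrix}.$$

When studying the power-law behavior of $\|u\|_L$, one can investigate as well the behavior of

$$\|U\|_L = \left( \sum_{n=1}^{\lfloor L \rfloor} \|U(n)\|^2 + (L - \lfloor L \rfloor)\, \|U(\lfloor L \rfloor + 1)\|^2 \right)^{1/2}, \tag{16}$$

where $\|U(n)\|^2 = |u(n)|^2 + |u(n-1)|^2$, since

$$\tfrac{1}{2} \|U\|_L^2 \le \|u\|_L^2 \le \|U\|_L^2. \tag{17}$$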
Now, the spectrum of $H_{\lambda,\theta,\beta}$ is independent of $\beta$ [1] and can thus be denoted by $\Sigma_{\lambda,\theta}$. Let us define

$$x_n = \operatorname{tr} M(\lambda, E, s_{n-1}), \qquad y_n = \operatorname{tr} M(\lambda, E, s_n), \qquad z_n = \operatorname{tr} M(\lambda, E, s_n s_{n-1}),$$

with dependence on $\lambda$ and $E$ suppressed.

Proposition 3.2. For every $\lambda$, there exists $C_\lambda \in (1, \infty)$ such that for every irrational $\theta$, every $E \in \Sigma_{\lambda,\theta}$, and every $n \in \mathbb{N}$, we have

$$\max\{|x_n|, |y_n|, |z_n|\} \le C_\lambda.$$

Proof. This result follows implicitly from [1]. It can be derived from the analysis in [1] by combining their bound on $|x_n|$ and $|y_n|$ with the fact that the traces obey the Fricke–Vogt invariant

$$x_n^2 + y_n^2 + z_n^2 - x_n y_n z_n = \lambda^2 + 4,$$

which was also shown in [1]. □
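The Fricke–Vogt invariant can be verified exactly on small examples. The sketch below is illustrative only: it uses integer $\lambda$ and $E$ (so there is no rounding), arbitrary sample coefficients $a_n$, and hypothetical helper names. Note that the invariant holds for every $E$, not only for energies in the spectrum; the spectral condition enters only through the bound $C_\lambda$.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )

def transfer_matrix(lam, E, word):
    """M(lam, E, w) as in (15): the factor for the last letter stands leftmost."""
    M = ((1, 0), (0, 1))
    for letter in word:
        M = mat_mul(((E - lam * int(letter), -1), (1, 0)), M)
    return M

def trace(M):
    return M[0][0] + M[1][1]

def canonical_words(a, n_max):
    """Return {n: s_n for n = -1, ..., n_max} following (14); a[i-1] holds a_i."""
    s = {-1: "1", 0: "0"}
    s[1] = s[0] * (a[0] - 1) + s[-1]
    for n in range(2, n_max + 1):
        s[n] = s[n - 1] * a[n - 1] + s[n - 2]
    return s

def fricke_vogt(lam, E, s, n):
    """x_n^2 + y_n^2 + z_n^2 - x_n y_n z_n for the traces defined in the text."""
    x = trace(transfer_matrix(lam, E, s[n - 1]))
    y = trace(transfer_matrix(lam, E, s[n]))
    z = trace(transfer_matrix(lam, E, s[n] + s[n - 1]))
    return x * x + y * y + z * z - x * y * z

lam, E = 3, 2                   # sample values, chosen only for illustration
a = [1, 2, 1, 3, 2]             # arbitrary sample coefficients a_1, ..., a_5
s = canonical_words(a, 5)
values = [fricke_vogt(lam, E, s, n) for n in range(0, 6)]
```

With these sample values every entry of `values` equals $\lambda^2 + 4 = 13$.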


The words $s_n$ are now related to the sequences $v_{\theta,\beta}$ in the following way. For each pair $(\theta, n)$, every sequence $v_{\theta,\beta}$ may be partitioned into words such that each word is either $s_n$ or $s_{n-1}$. This uniform combinatorial property, together with the uniform trace bounds given in Proposition 3.2, lies at the heart of the results contained in this paper and its precursors [4, 5]. Let us make this property explicit.

Definition 3.3. Let $n \in \mathbb{N}_0$ be given. An $(n, \theta)$-partition of a function $f : \mathbb{Z} \to \{0, 1\}$ is a sequence of pairs $(I_j, z_j)$, $j \in \mathbb{Z}$, such that:
i) the sets $I_j \subset \mathbb{Z}$ partition $\mathbb{Z}$;
ii) $1 \in I_0$;
iii) each block $z_j$ belongs to $\{s_n, s_{n-1}\}$; and
iv) the restriction of $f$ to $I_j$ is $z_j$. That is, $f_{d_j} f_{d_j + 1} \ldots f_{d_{j+1} - 1} = z_j$.

Notice that $d_j$ is defined implicitly to be the left-hand endpoint of the interval $I_j$. We will suppress the dependence on $\theta$ if it is understood to which $\theta$ we refer. In particular, we will write n-partition instead of $(n, \theta)$-partition. The unique decomposition property is now given in the following lemma, which was proved in [4].

Lemma 3.4. For every $n \in \mathbb{N}_0$ and every $\beta \in [0, 1)$, there exists a unique n-partition $(I_j, z_j)$ of $v_{\theta,\beta}$. Moreover, if $z_j = s_{n-1}$, then $z_{j-1} = z_{j+1} = s_n$. If $z_j = s_n$, then there is an interval $I = \{d, d+1, \ldots, d+l-1\} \subset \mathbb{Z}$ containing $j$ and of length $l \in \{a_{n+1}, a_{n+1} + 1\}$ such that $z_i = s_n$ for all $i \in I$ and $z_{d-1} = z_{d+l} = s_{n-1}$.

We finish this section with a short discussion of symmetry properties of the words $v_{\theta,\beta}$. This will show that the considerations below, based on a study of the operators $H_{\lambda,\theta,\beta}$ on the right half-line, could equally well be based on a study of the operators on the left half-line. In particular, this implies that for all parameter values and every energy in the spectrum, no solution of (9) tends to zero at $+\infty$, and none tends to zero at $-\infty$.

For a finite word $w = w_1 \ldots w_n$ over $\{0, 1\}$, define the reverse word $w^R$ by $w^R = w_n \ldots w_1$, and for a word $w \in \{0, 1\}^{\mathbb{Z}}$, define the reverse word $w^R$ by $w^R = v$ with $v_n = w_{-n}$ for $n \in \mathbb{Z}$. It is not hard to show that every $v_{\theta,\beta}$ allows a unique n-R-partition [19]. Here, an n-R-partition is defined by replacing $s_{n-1}$ and $s_n$ by $s_{n-1}^R$ and $s_n^R$, respectively, in the definition of n-partition. Mimicking the proof of Lemma 5.1 in [5] with the norm replaced by the trace immediately gives $x_n^R = x_n$, $y_n^R = y_n$ and $z_n^R = z_n$. Here, $x_n^R$, $y_n^R$ and $z_n^R$ are defined by replacing $s_{n-1}$, $s_n$ and $s_n s_{n-1}$ with their reverse words in the definition of $x_n$, $y_n$ and $z_n$, respectively. Thus, the analog of Proposition 3.2 holds for $x_n^R, y_n^R, z_n^R$ (in fact, this can also be established by remarking that the underlying trace map system is essentially unchanged by passing from $s_n$ to $s_n^R$). The n-R-partitions and the bound on the traces allow one to study the operators on the left half-line in exactly the same way as the operators on the right half-line are studied in the following two sections. Alternatively, it is possible to show that the map $R$ leaves the set $\overline{\{v_{\theta,\beta} : \beta \in [0, 1)\}} \subset \{0, 1\}^{\mathbb{Z}}$ invariant, where the bar denotes closure with respect to the product topology [19]. This could also be used to show that the two half-lines are equally well accessible.

4. Scaling Behavior of Solutions

In this section, we use the trace bounds and the partition lemma to study the growth of $\|U\|_L$ for energies in the spectrum and normalized solutions to (9). For our purposes it will be sufficient to consider this quantity only for $L = q_{8n}$, $n \in \mathbb{N}$. In Lemma 4.1 below it is shown that this growth has a lower bound which is exponential in $n$. In particular, this will imply absence of eigenvalues as claimed in Theorem 3 and it will also be used in our proof of power-law (in $L$) lower bounds for certain rotation numbers which will be given in the next section.

Lemma 4.1. Let $\lambda, \theta, \beta$ be arbitrary, $E \in \Sigma_{\lambda,\theta}$, and let $u$ be a normalized solution to (9). Then, for every $n \ge 8$, the inequality

$$\|U\|_{q_n} \ge D_\lambda \|U\|_{q_{n-8}}$$

holds, where

$$D_\lambda^2 = 1 + \left( \frac{1}{2 C_\lambda} \right)^2.$$
Proof of Theorem 3. It follows immediately from Lemma 4.1 that for all parameter values $\lambda, \theta, \beta$, the operator $H_{\lambda,\theta,\beta}$ has no eigenvalues. □

Before giving the proof of Lemma 4.1, let us recall a basic definition: A word $w = w_1 \ldots w_n$ is conjugate to a word $v = v_1 \ldots v_n$ if for some $i \in \{1, \ldots, n\}$, we have $w_1 \ldots w_n = v_i \ldots v_n v_1 \ldots v_{i-1}$, that is, if $w$ is obtained from $v$ by a cyclic permutation of its symbols. To prove Lemma 4.1 we shall employ the mass-reproduction technique that was used in [2]. This technique is based on the two-block version of the Gordon argument from [8]. More explicitly we have

Lemma 4.2. Fix $\lambda, \theta, \beta$. Suppose that $v_{\theta,\beta}(j) \ldots v_{\theta,\beta}(j + 2k - 1)$ is conjugate to $(s_{n-1})^2$, $(s_n)^2$, or $(s_{n-1} s_n)^2$ for some $n \in \mathbb{N}$, $l \le k$, and every $j \in \{1, \ldots, l\}$. Let $E \in \Sigma_{\lambda,\theta}$. Then every normalized solution $u$ to (9) satisfies

$$\|U\|_{l + 2k} \ge D_\lambda \|U\|_l.$$

Proof. Consider some $j \in \{1, \ldots, l\}$. By definition, we have

$$U(j + k) = M\big(\lambda, E, v_{\theta,\beta}(j) \ldots v_{\theta,\beta}(j + k - 1)\big)\, U(j),$$

and

$$U(j + 2k) = M\big(\lambda, E, v_{\theta,\beta}(j) \ldots v_{\theta,\beta}(j + 2k - 1)\big)\, U(j).$$

Since $v_{\theta,\beta}(j) \ldots v_{\theta,\beta}(j + 2k - 1)$ is conjugate to a square, it is itself a square, and

$$U(j + 2k) = M\big(\lambda, E, v_{\theta,\beta}(j) \ldots v_{\theta,\beta}(j + k - 1)\big)^2\, U(j).$$

Hence, applying the Cayley–Hamilton theorem,

$$U(j + 2k) - \operatorname{tr} M\big(\lambda, E, v_{\theta,\beta}(j) \ldots v_{\theta,\beta}(j + k - 1)\big)\, U(j + k) + U(j) = 0. \tag{18}$$

Moreover,

$$\big| \operatorname{tr} M\big(\lambda, E, v_{\theta,\beta}(j) \ldots v_{\theta,\beta}(j + k - 1)\big) \big| \le C_\lambda. \tag{19}$$

Combining (18) and (19), we obtain

$$\max\big\{ \|U(j + k)\|, \|U(j + 2k)\| \big\} \ge \frac{1}{2 C_\lambda} \|U(j)\| \tag{20}$$

for all $1 \le j \le l$. We can therefore proceed as follows,

$$\begin{aligned}
\|U\|_{l+2k}^2 &= \sum_{m=1}^{l+2k} \|U(m)\|^2 \\
&= \sum_{m=1}^{l} \|U(m)\|^2 + \sum_{m=l+1}^{l+2k} \|U(m)\|^2 \\
&\ge \sum_{m=1}^{l} \|U(m)\|^2 + \left( \frac{1}{2 C_\lambda} \right)^2 \sum_{m=1}^{l} \|U(m)\|^2 \\
&= \left( 1 + \left( \frac{1}{2 C_\lambda} \right)^2 \right) \|U\|_l^2.
\end{aligned}$$

This proves the assertion. □
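For readers who want the intermediate steps of the proof of Lemma 4.2, here is one way to fill them in. This is a sketch consistent with (18) to (20), written with the abbreviation $T$ for the trace in (19); it is not quoted from the original.

```latex
% T := tr M(\lambda, E, v(j) ... v(j+k-1)), so |T| \le C_\lambda by (19), and C_\lambda > 1.
% From (18): U(j) = T\,U(j+k) - U(j+2k). Hence
\begin{align*}
\|U(j)\| &\le |T|\, \|U(j+k)\| + \|U(j+2k)\| \\
&\le (C_\lambda + 1) \max\big\{ \|U(j+k)\|, \|U(j+2k)\| \big\} \\
&\le 2 C_\lambda \max\big\{ \|U(j+k)\|, \|U(j+2k)\| \big\},
\end{align*}
% which is (20). For the sum over m = l+1, ..., l+2k, note that l \le k gives
\[
\{ j + k : 1 \le j \le l \} \subset [l+1,\, 2k], \qquad
\{ j + 2k : 1 \le j \le l \} \subset [2k+1,\, l+2k],
\]
% so the two index sets are disjoint and both lie in {l+1, ..., l+2k}. Therefore
\[
\sum_{m=l+1}^{l+2k} \|U(m)\|^2
\ge \sum_{j=1}^{l} \max\big\{ \|U(j+k)\|^2, \|U(j+2k)\|^2 \big\}
\ge \left( \frac{1}{2 C_\lambda} \right)^2 \sum_{j=1}^{l} \|U(j)\|^2 .
\]
```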

Proof of Lemma 4.1. We make use of the information provided by Lemma 3.4 and exhibit squares in the potentials which are suitable in the sense that they satisfy the assumption of Lemma 4.2. In fact, we shall show

$$\|U\|_{2(q_{n+1} + q_n) + q_{n-1}} \ge D_\lambda \|U\|_{q_{n-4}} \tag{21}$$

for all $\lambda, \theta, \beta$, all $E \in \Sigma_{\lambda,\theta}$, all solutions $u$, and all $n \ge 4$. Since $q_{n+4} \ge 2(q_{n+1} + q_n) + q_{n-1}$, this proves the assertion. Fix $\lambda, \theta, \beta$ and some $n \ge 4$ and consider the n-partition of $v_{\theta,\beta}$. Since we want to exhibit squares close to the origin, we consider the following cases.

Case 1. $z_0 = s_{n-1}$. Applying (14) and Proposition 3.1, we see that this block is followed by $s_{n-1}^2 s_{n-4}$. We can therefore apply Lemma 4.2 with $l = q_{n-4}$ and $k = q_{n-1}$. This yields (21) and we are done in this case.

Case 2. $z_0 = s_n$ and $z_1 = s_n$. Proposition 3.1 yields that these two blocks are followed by $s_n s_{n-3}$. Lemma 4.2 now applies with $l = q_{n-3}$ and $k = q_n$.

Case 3. $z_0 = s_n$ and $z_1 = s_{n-1}$. Let $z_j'$ label the blocks in the $(n+1)$-partition of $v_{\theta,\beta}$. By uniqueness of the n-partition we therefore have $z_0' = s_{n+1}$. Let us consider the following subcases.

Case 3.1. $z_1' = s_{n+1}$. Similarly to Case 2, this implies that $z_0' z_1'$ is followed by $s_{n+1} s_{n-2}$ and hence Lemma 4.2 applies with $l = q_{n-2}$ and $k = q_{n+1}$.

Case 3.2. $z_1' = s_n$. It follows that $z_2' = s_{n+1}$. Again we consider two subcases.

Case 3.2.1. $z_3' = s_n$. Of course, this case can only occur if $a_{n+2} = 1$. We infer that $z_4' = s_{n+1}$. But this implies that we have squares conjugate to $s_n s_{n+1}$ and Lemma 4.2 is applicable with $l = q_{n-1}$ and $k = q_n + q_{n+1}$. Hence, (21) also holds in this case.

Case 3.2.2. $z_3' = s_{n+1}$. Let us consider the consequences of this particular case for the blocks in the n-partition. We have

$$z_0 z_1 \ldots z_{2a_{n+1} + 4} = s_n s_{n-1} s_n s_n^{a_{n+1}} s_{n-1} s_n^{a_{n+1}} s_{n-1}. \tag{22}$$

Since $s_n$ is a prefix of $s_{n+1}$, this must be followed by $s_n$. We therefore have the sequence of blocks

$$s_n s_{n-1} s_n s_n^{a_{n+1}} s_{n-1} s_n^{a_{n+1}} s_{n-1} s_n,$$

where the site $1 \in \mathbb{Z}$ is contained in the leftmost block. Using Proposition 3.1 this can be rewritten as

$$s_n s_{n-1} s_n s_n^{a_{n+1}} s_{n-1} s_n^{a_{n+1}} s_n s_{n-2}^{a_{n-1} - 1} s_{n-3} s_{n-2},$$

which can as well be interpreted as

$$s_n s_{n-1} s_n s_n^{a_{n+1}} s_{n-1} s_n s_n^{a_{n+1}} s_{n-2}^{a_{n-1} - 1} s_{n-3} s_{n-2}.$$

Thus, Lemma 4.2 is applicable with $l = q_{n-3}$ and $k = q_n + q_{n+1}$, which closes Case 3.2.2.

Between Cases 1, 2, and 3 we have covered all possible choices of $z_0, z_1$. □

Remark 5. While our analysis is similar in spirit to the analysis performed in [4], we want to note here that we were able to improve upon essential aspects. Not only are we now able to treat an arbitrary rotation number $\theta$ ([4] had to exclude the case $\limsup a_n = 2$), we are also able to restrict our attention to one half-line, which is of course crucial since we are aiming at an application of Theorem 1. The improvement stems from our considering the triple $\{s_{n-1}, s_n, s_{n-1} s_n\}$ as being the set of "good" words. This allows us to conclude as in Case 3.2.2, which is not possible when one is only working with the pair $\{s_{n-1}, s_n\}$ of "good" words as was done in [4].

5. Power-Law Upper and Lower Bounds on Solutions

In this section we provide power-law bounds for $\|u\|_L$ in the case where the rotation number $\theta$ has suitable number theoretic properties. Recall that $a_n$ denote the coefficients in the continued fraction expansion of $\theta$ and $q_n$ denote the denominators of the canonical continued fraction approximants to $\theta$.

Proposition 5.1. Let $\theta$ be such that for some $B < \infty$, $q_n \le B^n$ for every $n \in \mathbb{N}$. Then for every $\lambda$, there exist $0 < \gamma_1, C_1 < \infty$ such that for every $E \in \Sigma_{\lambda,\theta}$ and every $\beta$, every normalized solution $u$ of (9) obeys

$$\|u\|_L \ge C_1 L^{\gamma_1} \tag{23}$$

for $L$ sufficiently large.

Remark 6. The set of $\theta$'s obeying the assumption of Proposition 5.1 has full Lebesgue measure [15].

Proof. The bound (23) can be derived from the exponential lower bound on $\|U\|_{q_{8n}}$, $n \in \mathbb{N}$, given the exponential upper bound on $q_n$, $n \in \mathbb{N}$. Lemma 4.1 established the power-law bound for $L = q_{8n}$. It can then be interpolated to other values of $L$ (see [2] for details). □

Proposition 5.2. Let $\theta$ be a bounded density number. Then for every $\lambda$, there exist $0 < \gamma_2, C_2 < \infty$ such that for every $E \in \Sigma_{\lambda,\theta}$ and every $\beta$, every normalized solution $u$ of (9) obeys

$$\|u\|_L \le C_2 L^{\gamma_2} \tag{24}$$

for all $L$.

Proof. The proof is based upon local partitions and results by Iochum et al. [9, 10]. Up to interpolation to non-integer $L$'s, it was given in [5]. □

Remark 7. It is easy to see that bounded density numbers obey the assumption of Proposition 5.1. Thus, if $\theta$ is a bounded density number, we have

$$C_1 L^{\gamma_1} \le \|u\|_L \le C_2 L^{\gamma_2}$$

with $\lambda$-dependent constants $\gamma_i, C_i$, uniformly for all energies from the spectrum, all phases $\beta$, and all normalized solutions of (9).

We are now fully prepared for the

Proof of Theorem 2. We employ Theorem 1. Propositions 5.1 and 5.2 provide the estimate (11) for each $E$ in the spectrum $\Sigma_{\lambda,\theta}$ of $H_{\lambda,\theta,\beta}$. This set is bounded because the potential is bounded and hence, so is the operator $H_{\lambda,\theta,\beta}$. Of course, the spectral measure for the pair $(H_{\lambda,\theta,\beta}, \phi)$ is supported by $\Sigma_{\lambda,\theta}$ and so must be uniformly α-Hölder continuous. □

Acknowledgements. D. D. was supported by the German Academic Exchange Service through Hochschulsonderprogramm III (Postdoktoranden), R. K. was supported, in part, by an Alfred P. Sloan Doctoral Dissertation Fellowship, and D. L. received financial support from Studienstiftung des Deutschen Volkes (Doktorandenstipendium), all of which are gratefully acknowledged.

References 1. Bellissard, J., Iochum, B., Scoppola, E. and Testard, D.: Spectral properties of one-dimensional quasicrystals. Commun. Math. Phys. 125, 527–543 (1989) 2. Damanik, D.: α-continuity properties of one-dimensional quasicrystals. Commun. Math. Phys. 192, 169– 182 (1998) 3. Damanik, D.: Gordon-type arguments in the spectral theory of one-dimensional quasicrystals. Preprint (math-ph/9912005, mp-arc/99-472), to appear in Directions in Mathematical Quasicrystals, Eds. M. Baake and R. V. Moody, CRM Monograph Series, Providence, RI: AMS 4. Damanik, D. and Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, I. Absence of eigenvalues. Commun. Math. Phys. 207, 687–696 (1999) 5. Damanik, D. and Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, II. The Lyapunov exponent. Preprint (math-ph/9905008, mp-arc/99-184), to appear in Lett. Math. Phys. 6. Gilbert, D.J.: On subordinacy and analysis of the spectrum of Schrödinger operators with two singular endpoints. Proc. Roy. Soc. Edinburgh A 112, 213–229 (1989) 7. Gilbert, D.J. and Pearson, D.B.: On subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators. J. Math. Anal. Appl. 128, 30–56 (1987) 8. Gordon, A.: On the point spectrum of the one-dimensional Schrödinger operator. Usp. Math. Nauk 31, 257–258 (1976) 9. Iochum, B., Raymond, L. and Testard, D.: Resistance of one-dimensional quasicrystals. Physica A 187, 353–368 (1992) 10. Iochum, B. and Testard, D.: Power law growth for the resistance in the Fibonacci model. J. Stat. Phys. 65, 715–723 (1991) 11. Jitomirskaya, S. and Last, Y.: Dimensional Hausdorff properties of singular continuous spectra. Phys. Rev. Lett. 76, 1765–1769 (1996) 12. Jitomirskaya, S. and Last, Y.: Power law subordinacy and singular spectra, I. Half-line operators. Preprint (mp-arc/98-723), to appear in Acta Math. 13. Jitomirskaya, S. and Last, Y.: Power law subordinacy and singular spectra, II. Line operators. 
Preprint (mp-arc/99-364), to appear in Commun. Math. Phys. 14. Khan, S. and Pearson, D.B.: Subordinacy and spectral theory for infinite matrices. Helv. Phys. Acta 65, 505–527 (1992) 15. Khinchin, A.Ya.: Continued Fractions. Mineola: Dover Publications, 1997

16. Kohmoto, M., Kadanoff, L.P. and Tang, C.: Localization problem in one dimension: Mapping and escape. Phys. Rev. Lett. 50, 1870–1872 (1983) 17. Lang, S.: Introduction to Diophantine Approximations. New York: Addison-Wesley, 1966 18. Last, Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Funct. Anal. 142, 406–445 (1996) 19. Lenz, D.: Hierarchical structures in Sturmian dynamical systems. Preprint 20. Lothaire, M.: Algebraic Combinatorics on Words. In preparation 21. Ostlund, S., Pandit, R., Rand, D., Schellnhuber, H.J. and Siggia, E.D.: One-dimensional Schrödinger equation with an almost periodic potential. Phys. Rev. Lett. 50, 1873–1877 (1983) 22. Rogers, C.A.: Hausdorff Measures. London: Cambridge Univ. Press, 1970 23. Sütő, A.: The spectrum of a quasiperiodic Schrödinger operator. Commun. Math. Phys. 111, 409–415 (1987) 24. Sütő, A.: Singular continuous spectrum on a Cantor set of zero Lebesgue measure for the Fibonacci Hamiltonian. J. Stat. Phys. 56, 525–531 (1989) 25. Sütő, A.: Schrödinger difference equation with deterministic ergodic potentials. In: Beyond Quasicrystals (Les Houches, 1994), Eds. F. Axel and D. Gratias, Berlin: Springer, 1995, pp. 481–549

Communicated by B. Simon

Commun. Math. Phys. 212, 205 – 217 (2000)

Communications in Mathematical Physics

© Springer-Verlag 2000

Semiclassical Estimates in Asymptotically Euclidean Scattering

András Vasy, Maciej Zworski

Mathematics Department, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]; [email protected]

Received: 21 October 1999 / Accepted: 17 January 2000

Abstract: We consider long range semiclassical perturbations of the Laplacian on asymptotically Euclidean manifolds. We obtain precise resolvent estimates under nontrapping assumptions. The novelty lies in a systematic use of geometric microlocal methods.

1. Introduction

The purpose of this note is to obtain semiclassical resolvent estimates for long range perturbations of the Laplacian on asymptotically Euclidean manifolds. For an estimate which is uniform in the Planck constant $h$ we need to assume that the energy level is non-trapping. In the high energy limit (that is, when we consider $\Delta - \lambda^2$ as $\lambda \to \infty$, which is equivalent to $h^2\Delta - 1$, $h \to 0$), this corresponds to the global assumption that the geodesic flow is non-trapping. We note here that a sufficiently small neighbourhood of infinity is always non-trapping.

The resolvent estimates in the classical ($h = 1$) and semi-classical cases have a long tradition going back to the limiting absorption principle – see [1] and references given there. Various variants of the theorem we present were proved in Euclidean potential scattering by Jensen–Mourre–Perry [9], Robert–Tamura [14], Gérard–Martinez [5], Gérard [4] and Wang [15], and for more general elliptic operators by Robert [12, 13]. The proofs were based on Mourre theory, whose underlying feature is the positive commutator method accompanied by functional analytic techniques for obtaining a resolvent estimate. While the work of Gérard–Martinez [5] explains the role of geometry in the positive commutator estimate itself, it refers to Mourre's work for the functional analytic argument. We adopt a completely geometric approach based on direct microlocal ideas.

The classical version of the estimate on asymptotically Euclidean manifolds ($h = 1$, in which case there is no need for the non-trapping assumption) is essentially in Melrose's original paper on the subject [10] in which he introduced a fully microlocal point of view

to scattering. However, the proof presented here is somewhat different in spirit: a global positive commutator argument is used to derive an estimate on the resolvent directly. Referring to (2.3) below for the definition of a scattering metric, to (2.4), (2.5) for the definition of a long range semi-classical perturbation, and (2.7) for the definition of a non-trapping energy, we state our main

Theorem. Let $X$ be a manifold with boundary and let $\Delta$ be the Laplacian of a scattering metric on $X$. If $P = h^2\Delta + V$ is a semi-classical long range perturbation of $h^2\Delta$, and $R(\lambda) = (P - \lambda)^{-1}$ its resolvent, then for all $m \in \mathbb{R}$,

$$\|R(\lambda + it)f\|_{H^{m,-1/2-\epsilon}_{sc}(X)} \le C_0 h^{-1} \|f\|_{H^{m-2,1/2+\epsilon}_{sc}(X)}, \qquad \epsilon > 0, \tag{1.1}$$

with $C_0$ independent of $t \ne 0$ real and $\lambda \in I$, $I \subset (0, +\infty)$ a compact interval in the set of non-trapping energies for $P$.

Here $H^{m,k}_{sc}(X)$ denote Sobolev spaces adapted to the scattering calculus, that is, to asymptotically Euclidean structures. The index $m$ indicates smoothness and $k$ the rate of decay at infinity: the larger the better in both cases.

To indicate the main idea of the proof let $p - \lambda$ be the principal symbol of $P - \lambda$. Here the principal symbol is meant in both the semi-classical sense and the scattering sense – see Sect. 2. Near the characteristic variety of $P - \lambda$, we construct a function $q \ge 0$ such that $q$ is decreasing along the Hamilton vector field $H_p$. This gives the required estimate for the resolvent when we apply a variant of the well known commutator method – see [7] for the now standard application to the propagation of singularities for operators of principal type.

On the quantum level, propagation of singularities corresponds to the propagation by the classical flow. The use of commutators is natural as their symbols are given by the classical Poisson brackets. The microlocal approach is thus motivated by the quantum-classical correspondence. In a scattering problem, estimates on the resolvent are closely related to quantum propagation estimates. Hence we can apply the same strategy for directly relating analysis and geometry.

We stress that to prove, say, the outgoing resolvent estimate, one needs to keep the signs of both $q$ and $H_p q$ fixed throughout phase space, and in case of the outgoing estimate, these signs must be opposite. Indeed, it is the fixed sign of $q$ that makes it possible to eliminate the machinery of Mourre's method. The positivity of $q$ shows that in the outgoing region, where bicharacteristics tend as $t \to +\infty$, $q$ must be of the form $x^r a$, $a \in C^\infty({}^{sc}T^*X)$, $r > 0$ (here $x$ is a boundary defining function), and in the incoming region, where bicharacteristics tend as $t \to -\infty$, it must be of the form $x^{-s} b$, $b \in C^\infty({}^{sc}T^*X)$, $s > 0$. The difference between these two weights, which can be made arbitrarily small, but is never 0, plus the improvement by 1 in the order when calculating a commutator, explains how the weighting of the Sobolev spaces works.

For applications of the non-trapping estimates to more general operators we refer to a recent paper by Bruneau–Petkov [2]. It is clear that the “black box” set-up discussed there can be easily adapted to the manifold situation.

2. Preliminaries

Let $X$ be a $C^\infty$ manifold with boundary $\partial X$, and let $x$ be a boundary defining function. Thus, in a small collar neighborhood $[0, \epsilon_0) \times \partial X$ of the boundary $\partial X$, we have “semi-global coordinates” $(x, y, \xi, \eta)$ on $T^*_{[0,\epsilon_0)\times\partial X} X$.

Microlocal techniques adapted to asymptotically Euclidean structure near $\partial X$ (see (2.3)) were introduced by Melrose [10]. We start by recalling the scattering cotangent bundle ${}^{sc}T^*X$, which is the natural phase space. It is defined as the dual of the scattering tangent bundle ${}^{sc}TX$, which in turn is defined so that the space of vector fields $x\mathcal{V}_b(X)$, where $\mathcal{V}_b(X)$ are vector fields tangent to $\partial X$, is given by sections $x\mathcal{V}_b(X) = C^\infty(X; {}^{sc}TX)$; see [10] for a thorough discussion. Since ${}^{sc}TX \hookrightarrow TX$ we have a natural map $T^*X \to {}^{sc}T^*X$. In ‘semi-global coordinates’ $(x, y, \xi, \eta)$ on ${}^{sc}T^*_{[0,\epsilon_0)\times\partial X} X$ it is given by $(x, y, \tau, \mu) = (x, y, x^2\xi, x\eta)$, and this identification is worth keeping in mind since the symplectic and contact structures are inherited from $T^*X$, that is, from the $(x, \xi)$ coordinates. In particular, when we speak of the Hamilton vector fields on ${}^{sc}T^*X$, we mean the natural extension of the usual Hamilton vector field on ${}^{sc}T^*X^\circ \simeq T^*X^\circ$ to ${}^{sc}T^*X$ – see [10] and [11]. We also note that the variable $\mu$ is naturally identified with $\mu \in T^*_y\partial X$.

The fiber radial compactification of ${}^{sc}T^*X$ is denoted by ${}^{sc}\bar{T}^*X$; ${}^{sc}\bar{T}^*X$ is thus a ball bundle over $X$. Classical symbols, $a \in S^{m,l}_{cl}(X)$, are functions $a \in x^l \rho_\infty^{-m} C^\infty({}^{sc}\bar{T}^*X)$. By $a \in S^{m,l}(X)$ it is meant that $a \in C^\infty(T^*X)$, $x^{-l}\rho_\infty^{m} a \in L^\infty(T^*X)$, and the same estimate holds after the application (to $a$) of any b-differential operator on ${}^{sc}\bar{T}^*X$, that is, an operator in the algebra generated by $\mathcal{V}_b({}^{sc}\bar{T}^*X)$, vector fields tangent to $\partial\,{}^{sc}\bar{T}^*X$.

Fig. 1. The fiber compactification ${}^{sc}\bar{T}^*X$ of ${}^{sc}T^*X$ is a manifold with corners. Its boundary hyperfaces are ${}^{sc}\bar{T}^*_{\partial X}X$, which is a ball bundle over $\partial X$, and the cosphere bundle ${}^{sc}S^*X$, which is a sphere bundle over $X$. The zero section is denoted by $o$

The semiclassical calculus for the Weyl metric on $T^*X$,

$$\frac{dz^2}{1 + |z|^2} + \frac{d\zeta^2}{1 + |\zeta|^2},$$

is well known and, for instance, it is discussed in great generality in [6]. The natural generalization to manifolds with asymptotically Euclidean structure near infinity is given in the Appendix to [16]. We will review and slightly extend it below.

The semi-classical symbols are defined as follows: $a \in S^{m,l,k}(X)$ means that $a \in C^\infty((0, 1) \times T^*X)$, $h^k x^{-l} \rho_\infty^{m} a \in L^\infty((0, 1) \times T^*X)$, and the same estimate holds after the application of any b-differential operator on ${}^{sc}\bar{T}^*X$. Thus, $a(h, \cdot) \in S^{m,l}(X)$ for

Fig. 2. Local Euclidean coordinates near $p \in \partial X$ identify a neighborhood $U$ of $p$ in $X$ with a conic neighborhood $\phi(U)$ of infinity in $\mathbb{R}^n$

$h \in (0, 1)$, and the symbol estimates are uniform in $h$. The corresponding class of classical symbols, $a \in S^{m,l,k}_{cl}(X)$, are functions with $h^k x^{-l} \rho_\infty^{m} a \in C^\infty([0, 1) \times {}^{sc}\bar{T}^*X)$.

For $a \in S^{m,l,k}(X)$ we define a semiclassical operator $\mathrm{Op}(a) \in \Psi^{m,l,k}_{sc}(X)$ as in the Appendix to [16]: we first use local Euclidean coordinates in a cone near infinity, identified with a neighbourhood of a boundary point (see Fig. 2), to define

$$Au(z) = \left(\frac{1}{2\pi h}\right)^{n} \int e^{i(z-w)\cdot\xi/h}\, a(h, z, \xi)\, u(w)\, dw\, d\xi$$

with $a \in S^{m,l,k}(\mathbb{R}^n)$. Invariance under local changes of coordinates then gives $\mathrm{Op}(a)$ and leads to the definition of the class $\Psi^{m,l,k}_{sc}(X)$.

We then have the symbol map $\sigma_{sc,h} : \Psi^{m,l,k}_{h,sc}(X) \to S^{m,l,k}(X)$ with the usual properties, and in particular with the short exact sequence

$$0 \to \Psi^{m-1,l+1,k-1}_{h,sc}(X) \to \Psi^{m,l,k}_{sc}(X) \xrightarrow{\ \sigma^{m,l,k}_{h,sc}\ } S^{m,l,k}(X)/S^{m-1,l+1,k-1}(X) \to 0.$$

Another important property of $\Psi_{scc,h}(X)$ is that it is commutative to top order, and the principal symbol of a commutator is given by the Poisson bracket of the principal symbols of the commutants. That is, if $A \in \Psi^{m,l,k}_{scc,h}(X)$, $B \in \Psi^{m',l',k'}_{scc,h}(X)$, then $[A, B] \in \Psi^{m+m'-1,\,l+l'+1,\,k+k'-1}_{scc,h}(X)$ and

$$\sigma^{m+m'-1,\,l+l'+1,\,k+k'-1}_{h,sc}([A, B]) = \frac{h}{i} H_a b, \tag{2.1}$$

where $a$, $b$ are the principal symbols of $A$ and $B$, and $H_a$ denotes the Hamilton vector field of $a$. We will also make use of the sharp Gårding estimate:

Lemma 2.1. Suppose $A \in \Psi^{0,0,0}_{scc,h}(X)$ is self-adjoint, and its (joint semiclassical) principal symbol is $a \ge 0$. Then there exists $C > 0$ such that

$$\langle u, Au \rangle \ge -Ch\|u\|_{H^{-1/2,1/2}_{sc}(X)}. \tag{2.2}$$

In particular, if $A \in \Psi^{2m,-2l,0}_{scc,h}(X)$ has principal symbol $a \ge 0$, then $A \ge hR$ for some $R \in \Psi^{2m-1,-2l+1,0}_{scc,h}(X)$.

Proof. The inequality is well known in the case of $\mathbb{R}^n$ – see Sect. 18.4 of [8] (easily adapted to the semi-classical setting), [3] and [6]. The localization argument presented in the Appendix of [16] then gives the lemma. □

Now, let $g$ be a scattering metric on $X$, that is, a metric which near $\partial X$ takes the form

$$\frac{dx^2}{x^4} + \frac{h'}{x^2}, \qquad h'|_{\partial X} = h \ \text{is a metric on } \partial X. \tag{2.3}$$

This defines an asymptotically Euclidean structure near $\partial X$: a neighbourhood of $\partial X$ is isometric to a perturbation of the large end of the cone $\mathbb{R}_+ \times \partial X$ with the metric $dr^2 + r^2 h$. We will consider the following self-adjoint, classically elliptic operators in $\mathrm{Diff}^{2,0,0}_{h,sc}(X) \subset \Psi^{2,0,0}_{h,sc}(X)$:

$$P = h^2\Delta_g + V, \tag{2.4}$$

where in any compact set, $V$ is a second order semiclassical operator ($V = \sum_{|\alpha| \le 2} v_\alpha(z, h)(hD_z)^\alpha$ in local coordinates) and near the boundary $\partial X$, in local coordinates $y \in \partial X$,

$$V = x^\gamma V_0, \quad V_0 = \sum_{|\alpha|+k \le 2} v_{k\alpha}(x, y, h)(hx^2D_x)^k(hD_y)^\alpha, \quad v_{k\alpha} - v^0_{k\alpha} \in hS^{0,0,0}(X), \quad v^0_{k\alpha} \in S^{0,0}(X), \quad \gamma > 0. \tag{2.5}$$

The condition that the coefficients are symbols independent of the fiber variables means that $|(x\partial_x)^l \partial_y^\beta v_{k,\alpha}| \le C_{l\beta}$. In the Euclidean setting it corresponds to assuming that the coefficients are symbols in the Euclidean base variables. Due to the vanishing of $v_{k\alpha} - v^0_{k\alpha}$ in $S^{0,0,0}(X)$ when $h = 0$, the semiclassical principal symbol of $P$ is

$$p = g + x^\gamma \sum_{|\alpha|+k \le 2} v^0_{k\alpha}(x, y)\tau^k\mu^\alpha, \tag{2.6}$$

where $g$ also denotes the (dual) metric function of the metric $g$. Thus, $p$ can be represented by an $h$-independent function, which will be convenient for the construction in the last section of this paper. Note, however, that in (2.5), $v_{k\alpha} - v^0_{k\alpha} \in hS^{0,0,0}(X)$ could be replaced by $v_{k\alpha} - v^0_{k\alpha} \in h^\rho S^{0,0,0}(X)$, $\rho > 0$, or indeed by the assumption that $v_{k\alpha}$ is continuous on $[0, 1)_h$ with values in $S^{0,0}(X)$, at the expense of minor changes in the next section.

For obtaining the uniform resolvent estimates in $h$ for $R(\lambda \pm i0)$, we make the assumption that the Hamiltonian is non-trapping at energy $\lambda$: for any $\xi \in T^*X^\circ$ satisfying $p(\xi) = \lambda$,

$$\lim_{t \to \pm\infty} x(\exp(tH_p)(\xi)) = 0. \tag{2.7}$$
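As a hedged illustration of condition (2.7), consider the free Euclidean model. This setting is an assumption made here for concreteness and is not part of the argument: $X$ is the radial compactification of $\mathbb{R}^n$, $V = 0$, $p(z, \zeta) = |\zeta|^2$, and $x = |z|^{-1}$ for large $|z|$. The bicharacteristics are then straight lines, and every positive energy is non-trapping:

```latex
% Illustrative example (assumed setting): free particle on R^n, p(z,\zeta) = |\zeta|^2,
% boundary defining function x = |z|^{-1} near infinity.
\begin{align*}
H_p &= 2\zeta\cdot\partial_z, \\
\exp(tH_p)(z_0,\zeta_0) &= (z_0 + 2t\zeta_0,\ \zeta_0), \\
p(z_0,\zeta_0) = \lambda > 0 \ &\Longrightarrow\ |\zeta_0| = \sqrt{\lambda} \neq 0, \\
|z_0 + 2t\zeta_0| &\ge 2|t|\sqrt{\lambda} - |z_0| \to \infty \quad (t \to \pm\infty), \\
x\big(\exp(tH_p)(z_0,\zeta_0)\big) &= |z_0 + 2t\zeta_0|^{-1} \to 0 .
\end{align*}
```

In this model trapping is absent at every positive energy; for a non-flat metric or a nonzero $V$, (2.7) is a genuine hypothesis on the flow.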


As discussed in [5], this implies that an interval of energies around $\lambda$ is non-trapping: $\exists\,\delta_0 > 0$ such that for any $\xi \in T^*X^\circ$ satisfying $p(\xi) \in (\lambda - \delta_0, \lambda + \delta_0)$,

$$\lim_{t \to \pm\infty} x(\exp(tH_p)(\xi)) = 0. \tag{2.8}$$

The symbolic functional calculus applies in the semiclassical setting as well – see [3] and references given there. Here, we will restrict the discussion to the operator $P$ given by (2.4). The formula

$$f(P) = \frac{1}{2\pi i} \int_{\mathbb{C}} \bar\partial_z \tilde f(z)\,(P - z)^{-1}\, d\bar z \wedge dz, \qquad \tilde f \in C_c^\infty(\mathbb{C}), \quad \tilde f|_{\mathbb{R}} = f, \quad \bar\partial \tilde f = O(|\mathrm{Im}\, z|^\infty),$$

($\tilde f$ is an almost analytic extension of $f$) shows that for $f \in C_c^\infty(\mathbb{R})$,

$$f(P) \in \Psi^{-\infty,0,0}_{sc,h}(X).$$

Also $\sigma^{*,0,0}_{h,sc}(f(P)) = f(p)$. If $\psi \in C_c^\infty(\mathbb{R})$, $\psi \equiv 1$ near $\lambda$, then for $t \in \mathbb{R}$, $1 - \psi(\sigma) = \tilde\psi_t(\sigma)(\sigma - (\lambda + it))$, with $\tilde\psi \in S^{-1}_{cl}(\mathbb{R})$ satisfying uniform symbol estimates as $t$ varies over compact sets, so $\tilde\psi(P) \in \Psi^{-2,0,0}_{sc,h}(X)$, and we have proved the following lemma:

Lemma 2.2. Let $P$ be as in (2.4). Suppose that $\psi \in C_c^\infty(\mathbb{R})$, $\psi \equiv 1$ near $\lambda$, and suppose that $r, s \in \mathbb{R}$. Then there exists $C > 0$, independent of $t$ as long as $t$ varies in compact sets, such that for all $u \in C^{-\infty}(X)$ with $(P - (\lambda + it))u \in H^{r-2,s}_{sc}(X)$, the following estimate holds:

$$\|(\mathrm{Id} - \psi(P))u\|_{H^{r,s}_{sc}(X)} \le C\|(P - (\lambda + it))u\|_{H^{r-2,s}_{sc}(X)}. \tag{2.9}$$

3. Semiclassical Estimates

In this section we will prove the semi-classical resolvent estimates under the assumption that there exists $q \in S^{0,-\epsilon,0}(X)$, $\epsilon \in (0, \frac14)$, such that

$$2qH_pq = -b\psi(p)^2, \quad b \in S^{0,1-2\epsilon,0}(X), \quad \psi \in C_c^\infty(\mathbb{R}; [0, 1]), \ \psi \equiv 1 \text{ near } \lambda, \quad b \ge c_0 x^{1+2\epsilon} > 0. \tag{3.1}$$

The existence of $q$ under global non-trapping assumptions will be established in Sect. 4. If we write $Q = \mathrm{Op}(q)$ and $B = (\mathrm{Op}(b) + \mathrm{Op}(b)^*)/2$ then, as reviewed in Sect. 2,

$$i[Q^*Q, P] = h\psi(P)B\psi(P) + h^2R \tag{3.2}$$

with $R \in \Psi^{0,2-2\epsilon,0}_{scc,h}(X)$. Note that $\psi(P) \in \Psi^{-\infty,0,0}_{scc,h}(X)$, i.e. it is smoothing, so the differentiability order $r$ in the weighted Sobolev spaces $H^{r,s}_{sc}(X)$ is mostly irrelevant below. Suppose that $u \in H^{0,\frac12+\epsilon}_{sc}(X)$. Then for $t > 0$,

$$\langle u, i[Q^*Q, P]u \rangle = -2\,\mathrm{Im}\langle u, Q^*Q(P - (\lambda + it))u \rangle - 2t\|Qu\|^2. \tag{3.3}$$
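The identities (3.2) and (3.3) can be checked in a few lines. The sketch below rests on two assumptions that the text does not state explicitly: $q$ is real-valued, so that the principal symbol of $Q^*Q$ is $q^2$, and the pairing $\langle\cdot,\cdot\rangle$ is linear in its second argument. The shorthand $A = Q^*Q$ and $z = \langle u, APu\rangle$ is introduced here only for the computation.

```latex
% Principal symbol of i[Q^*Q,P], using (2.1), the antisymmetry H_a b = -H_b a, and (3.1):
\begin{align*}
\sigma\big(i[Q^*Q,P]\big)
  &= i\cdot\frac{h}{i}\,H_{q^2}\,p
   = -h\,H_p(q^2)
   = -2h\,q\,H_p q
   = h\,b\,\psi(p)^2 .
\end{align*}
% Identity (3.3): A = Q^*Q and P are self-adjoint, z = <u, A P u>.
\begin{align*}
\langle u, i[A,P]u\rangle
  &= i\big(\langle u,APu\rangle - \langle u,PAu\rangle\big)
   = i\,(z - \bar z) = -2\,\mathrm{Im}\,z, \\
\mathrm{Im}\,\langle u, A(P-(\lambda+it))u\rangle
  &= \mathrm{Im}\,z - t\,\langle u,Au\rangle
   = \mathrm{Im}\,z - t\,\|Qu\|^2, \\
\langle u, i[A,P]u\rangle
  &= -2\,\mathrm{Im}\,\langle u, A(P-(\lambda+it))u\rangle - 2t\,\|Qu\|^2 .
\end{align*}
```

The first line explains why the leading term of the commutator is $h\psi(P)B\psi(P)$; the last line is (3.3), and the sign of the $-2t\|Qu\|^2$ term is what allows it to be dropped for $t > 0$.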


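(Note that $Q^*QP$ and $PQ^*Q$ are in $\Psi^{0,-2\epsilon,0}_{scc,h}(X)$, so the expressions of the form $\langle u, Q^*QPu \rangle$ make sense.) Thus, taking into account that $2t\|Qu\|^2 \ge 0$,

$$h\langle u, \psi(P)B\psi(P)u \rangle \le 2|\langle u, Q^*Q(P - (\lambda + it))u \rangle| + h^2|\langle u, Ru \rangle|. \tag{3.4}$$

By the Cauchy–Schwarz inequality we have, for any $\delta > 0$,

$$\begin{aligned} |\langle u, Q^*Q(P - (\lambda + it))u \rangle| &\le \|x^{\frac12+\epsilon}u\|\,\|x^{-\frac12-\epsilon}Q^*Q(P - (\lambda + it))u\| \\ &\le \delta h\|x^{\frac12+\epsilon}u\|^2 + \delta^{-1}h^{-1}\|x^{-\frac12-\epsilon}Q^*Q(P - (\lambda + it))u\|^2. \end{aligned} \tag{3.5}$$

Note that $x^{-\frac12-\epsilon}Rx^{-\frac12-\epsilon} \in \Psi^{0,1-4\epsilon,0}_{scc,h}(X)$, hence bounded on $L^2_{sc}(X)$ since $\epsilon \in (0, \frac14)$. Similarly, $x^{-\frac12-\epsilon}Q^*Qx^{\frac12+3\epsilon} \in \Psi^{0,0,0}_{scc,h}(X)$ is also bounded on the $L^2$ space. Thus,

$$\begin{aligned} h\langle u, \psi(P)B\psi(P)u \rangle &- \big(\delta h + h^2\|x^{-\frac12-\epsilon}Rx^{-\frac12-\epsilon}\|_{\mathcal{B}(L^2_{sc}(X))}\big)\|x^{\frac12+\epsilon}u\|^2 \\ &\le \delta^{-1}h^{-1}\|x^{-\frac12-\epsilon}Q^*Qx^{\frac12+3\epsilon}\|^2_{\mathcal{B}(L^2_{sc}(X))}\,\|x^{-\frac12-3\epsilon}(P - (\lambda + it))u\|^2. \end{aligned} \tag{3.6}$$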

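We will now use the last assumption in (3.1): $x^{-1+2\epsilon}b\psi(p) \ge c_0 x^{2\epsilon}\psi(p)$. Hence by the sharp Gårding estimate,

$$x^{-\frac12+\epsilon}\psi(P)B\psi(P)x^{-\frac12+\epsilon} \ge c_0^2\, x^{2\epsilon}\psi(P)^2x^{2\epsilon} + hR_1, \qquad R_1 \in \Psi^{-\infty,1,0}_{sc,h}(X). \tag{3.7}$$

Adding $c_0^2\, x^{2\epsilon}(\mathrm{Id} - \psi(P)^2)x^{2\epsilon}$ to both sides gives

$$x^{-\frac12+\epsilon}\psi(P)B\psi(P)x^{-\frac12+\epsilon} + c_0^2\, x^{2\epsilon}(\mathrm{Id} - \psi(P)^2)x^{2\epsilon} \ge c_0^2\, x^{4\epsilon} + hR_1. \tag{3.8}$$

We also note that $|\langle x^{\frac12-\epsilon}u, R_1x^{\frac12-\epsilon}u \rangle| \le C'\|x^{1-\epsilon}u\|^2$. Thus, applying both sides of (3.8) to $x^{\frac12-\epsilon}u$, and pairing with $x^{\frac12-\epsilon}u$ afterwards yields

$$\begin{aligned} c_0^2\|x^{\frac12+\epsilon}u\|^2 &\le \langle u, \psi(P)B\psi(P)u \rangle + c_0^2\,|\langle (\mathrm{Id} + \psi(P))x^{\frac12+\epsilon}u, (\mathrm{Id} - \psi(P))x^{\frac12+\epsilon}u \rangle| + C'h\|x^{1-\epsilon}u\|^2 \\ &\le \langle u, \psi(P)B\psi(P)u \rangle + 2c_0^2\,\delta\|x^{\frac12+\epsilon}u\|^2 + \delta^{-1}\|(\mathrm{Id} - \psi(P))x^{\frac12+\epsilon}u\|^2 + C'h\|x^{1-\epsilon}u\|^2. \end{aligned} \tag{3.9}$$

The last term is clearly bounded by $C'h\|x^{\frac12+\epsilon}u\|^2$ and the second to last term can be estimated using (2.9). Choosing $\delta < 1/4$, $h_1 = c_0^2/4C'$ gives that for $h \in (0, h_1)$,

$$\|x^{\frac12+\epsilon}u\|^2 \le C_1\langle u, \psi(P)B\psi(P)u \rangle + C_2\|(P - (\lambda + it))u\|^2_{H^{-2,-\frac12-\epsilon}_{sc}(X)}. \tag{3.10}$$

The norm in the second term on the right hand side can be replaced by the $H^{-2,\frac12+\epsilon}_{sc}(X)$ norm. Combining (3.6) and (3.10), we thus conclude that there exists $h_0 > 0$ such that for $h \in (0, h_0)$,

$$\langle u, \psi(P)B\psi(P)u \rangle \le Ch^{-2}\|x^{-\frac12-3\epsilon}(P - (\lambda + it))u\|^2. \tag{3.11}$$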


Again using (3.10), we conclude that for all $\epsilon > 0$,

$$\|u\|_{H^{0,-1/2-\epsilon}_{sc}(X)} \le Ch^{-1}\|(P - (\lambda + it))u\|_{H^{0,1/2+3\epsilon}_{sc}(X)}, \qquad h \in (0, h_0). \tag{3.12}$$

We can modify this argument slightly by inserting $(P + i)(P + i)^{-1}$ in (3.5) between $Q$ and $P - (\lambda + it)$, to see that the last factor in (3.6) can be replaced by $\|(P + i)^{-1}x^{-\frac12-3\epsilon}(P - (\lambda + it))u\|^2$, and correspondingly the norm on the right-hand side of (3.12) can be replaced by $\|(P - (\lambda + it))u\|_{H^{-2,1/2+3\epsilon}_{sc}(X)}$. A further slight modification in the same spirit allows us to conclude that the smoothness order $r$ in $H^{r,s}_{sc}(X)$ can be shifted by the same amount on both sides of (3.12):

$$\|u\|_{H^{r,-1/2-\epsilon}_{sc}(X)} \le Ch^{-1}\|(P - (\lambda + it))u\|_{H^{r-2,1/2+3\epsilon}_{sc}(X)}, \qquad h \in (0, h_0). \tag{3.13}$$

Now let $u = u_t = R(\lambda + it)f$, $f \in H^{r,1/2+3\epsilon}_{sc}(X)$. Since $R(\lambda + it) = (P - (\lambda + it))^{-1} \in \Psi^{-2,0,0}_{scc,h}(X)$ for $t > 0$, we see that $u_t \in H^{r+2,1/2+3\epsilon}_{sc}(X)$ for $t > 0$. Thus, the above estimate is applicable and we conclude that

$$\|R(\lambda + it)f\|_{H^{r+2,-1/2-\epsilon}_{sc}(X)} \le Ch^{-1}\|f\|_{H^{r,1/2+3\epsilon}_{sc}(X)}, \qquad h \in (0, h_0). \tag{3.14}$$

Note that for a fixed $\psi$, we can let $\lambda$ be arbitrary inside the region where $\psi \equiv 1$, so a compactness argument gives the uniform estimate in $\lambda$ as stated in our main Theorem.

Remark 3.1. As in Melrose's paper [10], using these estimates one can show that for fixed $h > 0$, the limits $R(\lambda \pm i0)f$ exist in $H^{r+2,-1/2-\epsilon}_{sc}(X)$ for $f \in H^{r,1/2+\epsilon}_{sc}(X)$, $\epsilon > 0$. Hence, (3.14) yields

$$\|R(\lambda + i0)f\|_{H^{r+2,-1/2-\epsilon}_{sc}(X)} \le Ch^{-1}\|f\|_{H^{r,1/2+\epsilon}_{sc}(X)}, \qquad h \in (0, h_0), \tag{3.15}$$

as well.

4. Symbol Construction

Let $p$ be the principal symbol of $P$. Thus, near $\partial X$,

$$p = \tau^2 + g_\partial(y, \mu) + x^\gamma r, \qquad r \in S^{2,0}(X), \tag{4.1}$$

where $g_\partial$ is the metric on the boundary, and we denote the metric function on the cotangent bundle the same way. Its Hamilton vector field $H_p$ is of the form

$$x\big(2\tau(x\partial_x + \mu\cdot\partial_\mu) - 2g_\partial\partial_\tau + H_{g_\partial}\big) + x^{1+\gamma}W, \qquad W \in \mathcal{V}_b({}^{sc}T^*X) \otimes S^{1,0}(X); \tag{4.2}$$

see [10, Eq. (8.17)] for a detailed calculation. Here we will be mainly concerned with the $(x, \tau)$ variables, so we rewrite this as

$$H_p = x(2\tau + x^\gamma a)(x\partial_x) - x(2g_\partial + x^\gamma b)\partial_\tau + 2x\tau\,\mu\cdot\partial_\mu + xH_{g_\partial} + x^{1+\gamma}W', \tag{4.3}$$

where $a, b \in S^{1,0}(X)$, and $W'$ is now a vector field tangent to the $\partial X$ fibers, i.e. it is a vector field in $\partial_y$ and $\partial_\mu$ with coefficients in $S^{1,0}(X)$. In this section we take $\lambda^2$, not $\lambda$, as the spectral parameter.

Fig. 3. The projection of the characteristic variety $\Sigma(P - \lambda^2)$ and the bicharacteristics of $H_p$ inside it to the $(\tau, \mu)$-plane

As indicated, we make the assumption that a small interval of energies around $\lambda^2$ is non-trapping, i.e. $\exists\,\delta_0 > 0$ such that for any $\xi \in T^*X^\circ$ satisfying $p(\xi) \in (\lambda^2 - \delta_0, \lambda^2 + \delta_0)$,

$$\lim_{t \to \pm\infty} x(\exp(tH_p)(\xi)) = 0. \tag{4.4}$$

Now,

$$H_p(x^{-1}\tau) = -2(\tau^2 + g_\partial) + x^\gamma f, \qquad f \in S^{1,0}(X), \tag{4.5}$$

so there exists $\epsilon_1 > 0$ such that for $\xi \in {}^{sc}T^*X$ satisfying $p(\xi) \in (\lambda^2/2, 2\lambda^2)$, $x < \epsilon_1$, $-(H_p(x^{-1}\tau))(\xi) \ge c_0 > 0$. Since $p$ is constant along integral curves of $H_p$, we see that if $\exp(-tH_p)(\xi)$, $t \ge T$, stays in $x < \epsilon_1$ (which holds under our non-trapping assumption for sufficiently large $T$), then $x^{-1}\tau$ tends to $+\infty$; in particular $\tau$ is nonnegative for all large $t$. By reducing $\epsilon_1 > 0$ if necessary, we also see that there exist $\delta_1 > 0$, $\epsilon_1 > 0$ such that for $\xi \in {}^{sc}T^*X$,

$$|p(\xi) - \lambda^2| < \delta_1, \ x(\xi) < \epsilon_1, \ |\tau| < 7\lambda/8 \ \Rightarrow\ g_\partial(\xi) \ge c_1 > 0. \tag{4.6}$$
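The formula (4.5) follows from (4.3) by a direct computation, since $x^{-1}\tau$ depends only on $(x, \tau)$ while the remaining terms of $H_p$ in (4.3) differentiate only in $y$ and $\mu$. A worked version follows; the explicit expression for $f$ is our own bookkeeping under the form (4.3), not a formula quoted from the text.

```latex
% Apply (4.3) to x^{-1}\tau; only the x\partial_x and \partial_\tau terms contribute.
\begin{align*}
(x\partial_x)(x^{-1}\tau) &= -x^{-1}\tau, \qquad \partial_\tau(x^{-1}\tau) = x^{-1}, \\
H_p(x^{-1}\tau)
  &= x(2\tau + x^\gamma a)\,(-x^{-1}\tau) \;-\; x(2g_\partial + x^\gamma b)\,x^{-1} \\
  &= -2(\tau^2 + g_\partial) \;-\; x^\gamma(a\tau + b), \\
f &= -(a\tau + b).
\end{align*}
```

Since $\tau^2 + g_\partial = p - x^\gamma r$ by (4.1), the leading term is bounded away from zero on the energy range $p \in (\lambda^2/2, 2\lambda^2)$ once $x$ is small, which is how the lower bound $c_0$ arises.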

Reducing $\epsilon_1 > 0$ further if necessary, we can thus arrange that

$$|p(\xi) - \lambda^2| < \delta_1, \ x < \epsilon_1, \ |\tau| < 7\lambda/8 \ \Rightarrow\ -x^{-1}H_p\tau(\xi) \ge c_1 > 0. \tag{4.7}$$

Thus, we see that given any $x_0 > 0$, $\xi \in T^*X^\circ$ with $|p(\xi) - \lambda^2| < \delta_1$, there exists $T > 0$ such that

$$t \ge T \ \Rightarrow\ \tau(\exp(-tH_p)(\xi)) > 2\lambda/3, \quad x(\exp(-tH_p)(\xi)) < x_0/2. \tag{4.8}$$

We now define a symbol $q \in S^{-\infty,0}(X)$ whose most important properties are that

$$q \ge 0 \quad\text{and}\quad x^{-1}H_pq \le 0. \tag{4.9}$$

We will always use a localization in the energy via a factor $\psi(p)$, where $\psi \in C_c^\infty(\mathbb{R})$ is supported in $(\lambda^2 - \delta, \lambda^2 + \delta)$, where $\delta \in (0, \lambda^2)$ is a fixed small constant with $\delta < \delta_1$, $\delta_1$ as above. Let

$$M = \sup\{|a(\xi)| + |b(\xi)| : p(\xi) \le 2\lambda^2\} < +\infty; \tag{4.10}$$


here we used that $p^{-1}((-\infty, 2\lambda^2])$ is a compact subset of ${}^{sc}T^*X$. Also, let

$$x_0 = \min\{(\lambda/6(M + 1))^{1/\gamma},\ (c_1/2(M + 1))^{1/\gamma},\ \epsilon_1\}. \tag{4.11}$$

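Let $\chi_- \in C^\infty(\mathbb{R})$ be supported in $(\lambda/3, +\infty)$, identically 1 on $(2\lambda/3, +\infty)$, with $\chi_-' \ge 0$, and similarly let $\chi_+ \in C^\infty(\mathbb{R})$ be supported in $(-\infty, -\lambda/3)$, identically 1 on $(-\infty, -2\lambda/3)$, with $\chi_+' \le 0$. Also, let $\chi_\partial \in C_c^\infty(\mathbb{R})$ be supported in $(-7\lambda/8, 7\lambda/8)$, with $\chi_\partial' \ge (6\lambda/c_1)\chi_\partial \ge 0$ on $(-7\lambda/8, 3\lambda/4)$, and $\chi_\partial(-3\lambda/4) > 0$. Let $\rho \in C_c^\infty([0, +\infty))$ be identically 1 on $[0, 1/2]$, supported in $[0, 1)$, $\rho' \le 0$ on $[0, +\infty)$. In the incoming region we will take the symbol

$$q_- = x^{-\epsilon}\chi_-(\tau)\psi(p)\rho(x/x_0), \tag{4.12}$$

in the outgoing one the symbol

$$q_+ = x^{\epsilon}\chi_+(\tau)\psi(p)\rho(x/x_0), \tag{4.13}$$

with $\epsilon \in (0, \frac14)$. In the intermediate region we take

$$q_\partial = x^{-\epsilon}\chi_\partial(\tau)\psi(p)\rho(x/x_0). \tag{4.14}$$

Note that for any $\alpha \in \mathbb{R}$, $\chi, \rho \in C^\infty(\mathbb{R})$,

$$x^{-\alpha-1}H_p\big(x^\alpha\chi(\tau)\rho(x/r)\big) = (2\tau + x^\gamma a)\big(\alpha\rho(x/r) + r^{-1}\rho'(x/r)\big)\chi(\tau) - (2g_\partial(\xi) + x^\gamma b)\rho(x/r)\chi'(\tau). \tag{4.15}$$

Note that in the definition of $q_-$, $\alpha = -\epsilon < 0$, so $\alpha\rho(x/r) + r^{-1}\rho'(x/r) \le 0$ everywhere. Moreover, on $\mathrm{supp}\,\chi_-$, $\tau > \lambda/3 > 0$, so for $x(\xi) \le x_0$, $\xi \in \mathrm{supp}\,\psi(p)$, $\tau(\xi) \in \mathrm{supp}\,\chi_-$, $2\tau + x^\gamma a \ge \lambda/3 > 0$. In addition, $\tau \le 2\lambda/3$ on $\mathrm{supp}\,\chi_-'$, so if $\xi \in \mathrm{supp}(\rho(x/x_0)\chi_-'(\tau)\psi(p))$ then $g_\partial \ge c_1 > 0$, hence $2g_\partial + x^\gamma b \ge c_1 > 0$ there. Thus,

$$x^{-1+\epsilon}H_pq_- \le 0. \tag{4.16}$$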
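The two lower bounds used for (4.16) come from the choice of $x_0$ in (4.11) and the constant $M$ in (4.10). They are spelled out below, reading $\lambda/6(M+1)$ in (4.11) as $\lambda/(6(M+1))$ (an assumption about the intended grouping) and with slightly sharper intermediate constants than the text needs.

```latex
% For x \le x_0 and p(\xi) \le 2\lambda^2, (4.10)-(4.11) give
\begin{align*}
x^\gamma |a| &\le x_0^\gamma M \le \frac{\lambda}{6(M+1)}\,M < \frac{\lambda}{6}, &
x^\gamma |b| &\le x_0^\gamma M \le \frac{c_1}{2(M+1)}\,M < \frac{c_1}{2}.
\end{align*}
% On supp \chi_- (where \tau > \lambda/3), and where g_\partial \ge c_1 by (4.6):
\begin{align*}
2\tau + x^\gamma a &> \frac{2\lambda}{3} - \frac{\lambda}{6} = \frac{\lambda}{2} > \frac{\lambda}{3}, &
2g_\partial + x^\gamma b &> 2c_1 - \frac{c_1}{2} = \frac{3c_1}{2} > c_1 .
\end{align*}
% Signs of the factors in (4.15) with \alpha = -\epsilon, \chi = \chi_-, r = x_0:
\begin{align*}
\underbrace{(2\tau + x^\gamma a)}_{>0}\;
\underbrace{\big(\alpha\rho + r^{-1}\rho'\big)}_{\le 0}\;
\underbrace{\chi_-}_{\ge 0}
\;-\;
\underbrace{(2g_\partial + x^\gamma b)}_{>0}\;
\underbrace{\rho}_{\ge 0}\;
\underbrace{\chi_-'}_{\ge 0}
\;\le\; 0 .
\end{align*}
```

Multiplying by $\psi(p) \ge 0$, which commutes with $H_p$ because $H_p p = 0$, gives the sign stated in (4.16).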

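Moreover, $x \le x_0/2$ implies $\rho'(x/x_0) = 0$, and $\tau \ge 2\lambda/3$ implies $\chi_-'(\tau) = 0$, so

$$x \le x_0/2, \ \tau \ge 2\lambda/3 \ \Rightarrow\ -x^{-1+\epsilon}H_pq_- \ge c_2\psi(p), \qquad c_2 > 0. \tag{4.17}$$

The difference between $q_-$ and $q_+$ is that $\tau\rho'$ is positive on $\mathrm{supp}\,\chi_+$, and $-\chi_+'$ is also positive, so the negativity estimate only holds away from $\mathrm{supp}\,\rho'$ and $\mathrm{supp}\,\chi_+'$. Thus, there is no analogue of (4.16), but the following analogue of (4.17) still holds:

$$x \le x_0/2, \ \tau \le -2\lambda/3 \ \Rightarrow\ -x^{-1-\epsilon}H_pq_+ \ge c_3\psi(p), \qquad c_3 > 0. \tag{4.18}$$

Next, $q_\partial$ provides the connection between the incoming and outgoing regions. Since $\chi_\partial'$ can be used to estimate $\chi_\partial$ on $(-7\lambda/8, 3\lambda/4)$, we see that

$$\tau(\xi) \in (-7\lambda/8, 3\lambda/4), \ x(\xi) \le x_0/2, \ \xi \in \mathrm{supp}\,\psi(p) \ \Rightarrow\ |(2\tau + x^\gamma a)\chi_\partial(\tau)| \le c_1\chi_\partial'(\tau)/2. \tag{4.19}$$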


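Since $\alpha = -\epsilon$, $|\alpha| < 1$, so we conclude that

$$\tau(\xi) \in (-7\lambda/8, 3\lambda/4), \ x(\xi) \le x_0/2, \ \xi \in \mathrm{supp}\,\psi(p) \ \Rightarrow\ -x^{-1+\epsilon}H_pq_\partial \ge c_1\chi_\partial'(\tau)\psi(p) \ge 0. \tag{4.20}$$

Note that on $(-3\lambda/4, 3\lambda/4)$, $\chi_\partial' \ge C > 0$, so

$$x(\xi) \le x_0/2, \ \tau(\xi) \in (-3\lambda/4, 3\lambda/4), \ \xi \in \mathrm{supp}\,\psi(p) \ \Rightarrow\ -x^{-1+\epsilon}H_pq_\partial \ge c_4\psi(p), \qquad c_4 > 0. \tag{4.21}$$

For $\xi \in T^*X^\circ$ with $p(\xi) \in (\lambda^2 - \delta_0, \lambda^2 + \delta_0)$, take $T = T_\xi > 0$ as in (4.8), so for $t \ge T$ we have $\tau(\exp(-tH_p)(\xi)) > 2\lambda/3$, $x(\exp(-tH_p)(\xi)) < x_0/2$. We will define a symbol $q_\xi$ which is supported in a neighborhood of the bicharacteristic segment $\{\exp(-tH_p)(\xi) : t \in [0, T + 1]\}$, and which satisfies $H_pq_\xi \le 0$ over

$$K' = \{\xi' \in T^*X^\circ : x(\xi') \ge x_0/2 \ \text{or}\ (x(\xi') \le x_0/2 \ \text{and}\ \tau(\xi') \le 2\lambda/3)\}. \tag{4.22}$$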

Namely, let $\Sigma$ be a hypersurface through $\xi$ which is transversal to $H_p$. Then there is a neighborhood $U_\xi$ of $\xi$, such that $V_\xi = \{\exp(-tH_p)(U_\xi \cap \Sigma) : t \in (-1, T + 2)\}$ is a neighborhood of the above bicharacteristic segment, which we can think of as a product $(-1, T + 2) \times (U_\xi \cap \Sigma)$, and $(T + 1/2, T + 2) \times (U_\xi \cap \Sigma)$ is disjoint from $K'$. Now let $\phi_\xi \in C_c^\infty(U_\xi \cap \Sigma)$ be identically 1 near $\xi$, and let $\chi_\xi \in C_c^\infty(\mathbb{R})$ be supported in $(-1, T + 2)$, $\chi_\xi \ge 0$, $\chi_\xi' \ge 0$ on $(-1, T + 2/3)$. Using the product coordinates, we can think of $\phi_\xi$ and $\chi_\xi$ as functions of ${}^{sc}T^*X$ with compact support in $V_\xi$. Let

$$q_\xi = \chi_\xi\phi_\xi\psi(p), \tag{4.23}$$

so

$$H_pq_\xi = -\chi_\xi'\phi_\xi\psi(p). \tag{4.24}$$

Thus, for $\xi' \in K'$, $H_pq_\xi(\xi') \le 0$. Now let $K \subset T^*X^\circ$ be the compact set

$$K = \{\xi \in T^*X^\circ : \xi \in \mathrm{supp}\,\psi(p), \ x(\xi) \ge x_0/4\}. \tag{4.25}$$

Since $K$ is compact, applying the previous argument for every $\xi \in K$ gives a $U_\xi$, and a $U_\xi' \subset U_\xi$ on which $\phi_\xi = 1$. Since $\{U_\xi' : \xi \in K\}$ covers $K$, the compactness of $K$ shows that we can pass to a finite subcover, $\{U_{\xi_j}' : j = 1, \ldots, N\}$. We let

$$q_\circ = \sum_{j=1}^{N} q_{\xi_j}. \tag{4.26}$$

The symbol we use in the positive commutator estimate is

$$q = q_- + C''q_\partial + Cq_\circ + C'q_+, \tag{4.27}$$

with $C, C', C'' > 0$ chosen appropriately. Namely note that in the region $x \le x_0/2$, $\tau \ge 2\lambda/3$, which is the only place where $H_pq_\circ$ is positive, we have the estimate $-x^{-1+\epsilon}H_pq_- \ge c_2 > 0$. Since $x^{-1+\epsilon}H_pq_\circ$ is bounded, we can choose $C > 0$ sufficiently small so that $-x^{-1+\epsilon}H_p(q_- + Cq_\circ)$ is still bounded below by a positive constant

216

A. Vasy, M. Zworski

supp q+

supp q@


Fig. 4. Supports of $q_+$, $q_-$, $q_\circ$ and $q_\partial$ for $X = \mathbb{R}^n$. ${}^{sc}T^*X$ is identified with $\mathbb{R}^n \times \mathbb{R}^n_\xi$, and the covector $\xi$ is fixed in the picture

in this region. Then $-x^{-1+\epsilon} H_p(q_- + Cq_\circ)$ is non-negative everywhere, and it is bounded below by a positive constant on $x \ge x_0/2$ as well as on $x \le x_0/2$, $\tau \ge 2\lambda/3$. But this is the only region where the bounded function $x^{-1+\epsilon} H_p q_\partial$ is positive, so by choosing $C'' > 0$ sufficiently small, we can arrange that $-x^{-1+\epsilon} H_p(q_- + Cq_\circ + C''q_\partial)$ is non-negative everywhere, and it is bounded below by a positive constant on $x \ge x_0/2$, as well as on $x \le x_0/2$, $\tau \ge -3\lambda/4$. But this is the only region where $x^{-1-\epsilon} H_p q_+ > 0$. Thus, by choosing $C' > 0$ sufficiently small, and taking into account that $x^{-1+\epsilon} H_p q_+ = x^{2\epsilon} x^{-1-\epsilon} H_p q_+$, with $x^{-1-\epsilon} H_p q_+$ as well as $x^{2\epsilon}$ bounded, we can arrange that $-x^{-1+\epsilon} H_p q$ is non-negative everywhere, and $-x^{-1-\epsilon} H_p q$ is bounded below by a positive constant everywhere. In summary, we have proved the proposition needed in Sect. 3 (see (3.1)).

Proposition 4.1. There exist functions $q \in S^{-\epsilon,\infty}(X)$, $\psi \in C_c^\infty(\mathbb{R}; [0,1])$, $\psi \equiv 1$ near $\lambda^2$, and $c_0, c_0' > 0$ such that

$$q \ge c_0\, x^{\epsilon}\psi(p), \qquad -H_p q \ge c_0'\, x^{1+\epsilon}\psi(p). \tag{4.28}$$

Thus, the results of the previous section show that there exists h0 > 0 such that for h ∈ (0, h0 ), kR(λ2 + it)f kH ∗,−1/2− (X) ≤ C0 h−1 kf kH ∗,1/2+3 (X) . sc

sc

(4.29)

Acknowledgements. The authors are grateful to the National Science Foundation for partial support grants number DMS-99-70607 and DMS-99-70614.


References

1. Agmon, Sh.: Spectral theory of Schrödinger operators on Euclidean and non-Euclidean spaces. Comm. Pure Appl. Math. 39, 3–16 (1986)
2. Bruneau, V., and Petkov, V.: Semiclassical resolvent estimates for trapping perturbations. To appear in Commun. Math. Phys.
3. Dimassi, M., and Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. Cambridge University Press, 1999
4. Gérard, Ch.: Semiclassical resolvent estimates for two and three body Schrödinger operators. Comm. P.D.E. 15, 1161–1178 (1990)
5. Gérard, Ch., and Martinez, A.: Principe d’absorption limite pour des opérateurs de Schrödinger à longue portées. C.R. Acad. Sci. Paris 306, 121–123 (1988)
6. Helffer, B., and Sjöstrand, J.: Resonances en limite semi-classique. Mém. Soc. Math. France (N.S.) 24-25, (1986)
7. Hörmander, L.: On the existence and the regularity of solutions of linear pseudo-differential equations. Enseignement Math. 17 (2), 99–163 (1971)
8. Hörmander, L.: Linear partial differential equations. Vol. 3, Berlin: Springer Verlag, 1985
9. Jensen, A., Mourre, E. and Perry, P.: Multiple commutator estimates and resolvent smoothness in quantum scattering theory. Ann. Inst. H. Poincaré (phys. théor.) 41, 207–225 (1984)
10. Melrose, R.B.: Spectral and scattering theory for the Laplacian on asymptotically Euclidean spaces. In: Spectral and scattering theory (M. Ikawa, ed.), New York: Marcel Dekker, 1994, pp. 85–130
11. Melrose, R.B., and Zworski, M.: Scattering metrics and geodesic flow at infinity. Invent. Math. 124, 389–436 (1996)
12. Robert, D.: Asymptotique de la phase de diffusion à haute énergie pour des perturbations du second ordre du laplacien. Ann. Sci. École Norm. Sup. 25, 107–134 (1992)
13. Robert, D.: Relative time-delay for perturbations of elliptic operators and semiclassical asymptotics. J. Funct. Anal. 126, 36–82 (1994)
14. Robert, D., and Tamura, H.: Semiclassical estimates for resolvents and asymptotics for total scattering cross-sections. Ann. Inst. H. Poincaré (phys. théor.) 47, 415–442 (1987)
15. Wang, X.P.: Time-decay of scattering solutions and classical trajectories. Ann. Inst. H. Poincaré (phys. théor.) 47, 25–37 (1987)
16. Wunsch, J., and Zworski, M.: Distribution of resonances for asymptotically Euclidean manifolds. Preprint, 1999

Communicated by B. Simon

Commun. Math. Phys. 212, 219 – 243 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

On the Constraints Defining BPS Monopoles

C. J. Houghton, N. S. Manton, N. M. Romão

Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver Street, Cambridge CB3 9EW, UK. E-mail: [email protected]; [email protected]; [email protected]

Received: 15 October 1999 / Accepted: 19 January 2000

Abstract: We discuss the explicit formulation of the transcendental constraints defining spectral curves of $SU(2)$ BPS monopoles in the twistor approach of Hitchin, following Ercolani and Sinha. We obtain an improved version of the Ercolani–Sinha constraints, and show that the Corrigan–Goddard conditions for constructing monopoles of arbitrary charge can be regarded as a special case of these. As an application, we study the spectral curve of the tetrahedrally symmetric 3-monopole, an example where the Corrigan–Goddard conditions need to be modified. A particular 1-cycle on the spectral curve plays an important rôle in our analysis.

1. Introduction

BPS monopoles for the $SU(2)$ Yang–Mills–Higgs gauge theory have been studied for over twenty years, using a number of different approaches. Twistor methods, relating the solutions of the integrable differential equations of the model to holomorphic vector bundles over a so-called twistor space, were first introduced by Ward, adapting previous work on the self-duality equations for the pure Yang–Mills theory on $\mathbb{R}^4$. They enabled solutions of magnetic charge $k > 1$ to be constructed for the first time [15]. A twistor approach intrinsic to the geometry of $\mathbb{R}^3$ was developed later by Hitchin in [7] and [8]. In his formulation, a monopole is associated to a spectral curve, a compact complex curve in $T'\mathbb{CP}^1$, the total space of the holomorphic tangent bundle of $\mathbb{CP}^1$, satisfying a number of conditions which were stated in [8]. Based on this approach, new solutions have been constructed and new characterisations of monopoles developed; we refer to [14] for a brief overview.

The (reduced) moduli space $N_k$ of gauge-inequivalent BPS monopoles of a given charge $k$ is a $(4k - 1)$-dimensional manifold, which has been described in several ways. If we adopt the twistor formulation in terms of spectral curves, it can be characterised as the space of complex curves in $T'\mathbb{CP}^1$ satisfying a number of transcendental constraints. For the case where the curve is nonsingular, Ercolani and Sinha attempted to formulate


these constraints explicitly in [4]; they followed essentially the method of Hurtubise in [12], who achieved a satisfactory description of $N_2$. Their approach leads to a method of determining constraints on spectral curves by analysing objects of the function theory obtained on them. These constraints parallel the Corrigan–Goddard conditions [3] for constructing $SU(2)$ monopoles, and we shall clarify how they relate to each other.

This paper is organised as follows. We start by introducing the relevant aspects of the twistor approach for monopoles in Sect. 2, in order to fix the notation. In Sect. 3, we review the method of Ercolani and Sinha, and present a new version (22) of their constraint equations which involves a special 1-cycle $c$ on the spectral curve. The Corrigan–Goddard conditions were originally formulated in terms of integrals around the equator of $\mathbb{CP}^1$, but we show in Sect. 4 how to interpret them equivalently as integrals on the spectral curve. Moreover, we establish that the conditions in the two methods agree except for one detail: the Corrigan–Goddard approach enforces $c$ to be of a special sort, namely a combination of lifts of the equator of $\mathbb{CP}^1$ to the spectral curve. In Sect. 5, we apply the Ercolani–Sinha method to compute the spectral curve of the tetrahedral 3-monopole discussed in [9]. Thereby, the corresponding 1-cycle $c$ is determined and the result shows that the Corrigan–Goddard assumption about $c$ is too restrictive in general; we also consider the action of the tetrahedral group $A_4 \subset SO(3)$ on the homology of the spectral curve and show that it leaves $c$ invariant. Finally, we present some concluding remarks in Sect. 6.

2. Twistor Methods for BPS Monopoles

Magnetic BPS monopoles with gauge group $SU(2)$ are defined as gauge equivalence classes of solutions $(A, \phi)$ to the Bogomol'nyĭ equations in $\mathbb{R}^3$,

$$*F_A = \pm\nabla_A \phi, \tag{1}$$

satisfying boundary conditions (see [2]) that ensure finiteness of the energy functional; here $A$ is a connection 1-form (with covariant derivative $\nabla_A$ and curvature 2-form $F_A$), and $\phi$ (the Higgs field) a function, both taking values in $\mathfrak{su}(2)$. Such solutions can be interpreted as particle-like solitons carrying discrete magnetic charge. They are associated with an integer $k \in \mathbb{Z}$ (with $\pm k > 0$ according to the sign in Eq. (1)), which corresponds to the magnetic charge of the field configuration in suitable units and classifies the solutions homotopically; we take $k > 0$ throughout. The Bianchi identity together with (1) implies that BPS monopoles are also static classical solutions of the corresponding Yang–Mills–Higgs theory in the BPS limit, in which the Higgs potential is set to zero, and they correspond exactly to the minima of the energy functional.

Equations (1) are integrable and their solutions can be studied using methods of complex algebraic geometry. This was formulated by Hitchin in [7] as follows. The space $T$ of oriented geodesics (straight lines) of $\mathbb{R}^3$ is a 4-dimensional manifold: a point on it can be specified by a pair of vectors $(u, v) \in \mathbb{R}^{3+3}$, where $u$ has unit length and defines the orientation of the line, while $v$ gives the position of the point on the line closest to the origin and is thus orthogonal to $u$. This manifold admits a natural integrable almost complex structure, given at each point $(u, v)$ by taking the cross product with $u$ of each of the pair of vectors representing a tangent vector. It turns out that $T$, endowed with this complex structure, is isomorphic as a complex surface to the total space $T'\mathbb{CP}^1$ of the holomorphic tangent bundle to the Riemann sphere. The isomorphism takes $u$ to the corresponding point in $\mathbb{CP}^1 \cong S^2$ and $v$ to the obvious complex coordinate on the fibre. We will consider the standard affine pieces $U_0$ and $U_1$ of $\mathbb{CP}^1$, identifying


the affine coordinate $\zeta$ on $U_0$ with the stereographic projection from the south pole and letting $\eta$ denote the corresponding coordinate on the fibre; thus a tangent vector $\eta\,\frac{\partial}{\partial\zeta}\big|_{\zeta}$ is assigned the pair $(\eta, \zeta)$. We let $\pi$ be the natural projection $T \to \mathbb{CP}^1$, given in these coordinates by $(\eta, \zeta) \mapsto \zeta$, and denote again by $U_0$, $U_1$ the pre-images under $\pi$ of the affine pieces of $\mathbb{CP}^1$. In the literature, $T$ is often called mini-twistor space. It admits a real structure $\tau : T \to T$, which is the anti-holomorphic involution corresponding to the reversal of direction of oriented lines in $\mathbb{R}^3$; it obviously has no fixed points. In terms of our coordinates, it can be seen to be given by

$$\tau : (\eta, \zeta) \mapsto \left(-\frac{\bar\eta}{\bar\zeta^{2}},\, -\frac{1}{\bar\zeta}\right). \tag{2}$$

The group $SO(3)$ of rotations in $\mathbb{R}^3$ induces an action on $T$, which can be easily described in the coordinates $(\eta, \zeta)$ in terms of the corresponding $PSU(2)$ transformations: the matrix

$$\begin{pmatrix} p & q \\ -\bar q & \bar p \end{pmatrix} \in PSU(2), \qquad |p|^2 + |q|^2 = 1,$$

acts on the affine coordinate $\zeta$ as

$$\zeta \mapsto \frac{\bar p\,\zeta - \bar q}{q\,\zeta + p} \tag{3}$$

and this corresponds to a rotation by $\theta$ around the direction $n \in S^2$ with $n_1 \sin\frac{\theta}{2} = \mathrm{Im}\, q$, $n_2 \sin\frac{\theta}{2} = -\mathrm{Re}\, q$, $n_3 \sin\frac{\theta}{2} = -\mathrm{Im}\, p$ and $\cos\frac{\theta}{2} = \mathrm{Re}\, p$; $\eta$ transforms by multiplication by the derivative of (3),

$$\eta \mapsto \frac{\eta}{(q\zeta + p)^2},$$

since it is the fibre coordinate of $T'\mathbb{CP}^1$ corresponding to $\zeta$. It is clear from the definitions that the action of $SO(3)$ commutes with the $\mathbb{Z}_2$ action generated by $\tau$.

For each $s \in \mathbb{R}$ and $k \in \mathbb{Z}$, we define a holomorphic line bundle $L^s(k)$ on $T$ through the transition function

$$g_{01}^{(s,k)} : U_0 \cap U_1 \longrightarrow \mathbb{C}^*, \qquad (\eta, \zeta) \longmapsto e^{-s\eta/\zeta}\,\zeta^{k}$$

with respect to the trivialising cover $\{U_0, U_1\}$ of $T$; this definition is independent of the stereographic projection used on $\mathbb{CP}^1$. We shall use the notation $L^s$ for $L^s(0)$ and $\mathcal{O}(k)$ for $L^0(k)$. These line bundles play a rôle in the formulation of the twistor correspondence for monopoles, which we now describe.

To a monopole $(A, \phi)$ we associate the complex vector bundle $E \to T$, whose fibre at an oriented line $\gamma \in T$ is the complex 2-dimensional space of solutions $u : \gamma \to \mathbb{C}^2$ to the equation $(\nabla_\gamma - i\phi)u = 0$, where $\nabla_\gamma$ is the restriction of $\nabla_A$ to $\gamma$. The Bogomol'nyĭ equations (1) imply that $E$ is holomorphic; it can be regarded as an extension

$$0 \longrightarrow L^{\pm} \longrightarrow E \longrightarrow (L^{\pm})^* \longrightarrow 0 \tag{4}$$


of the line subbundles $L^{\pm} \subset E$ of solutions decaying exponentially as $t \to \pm\infty$, where $t \in \mathbb{R}$ is the natural coordinate on $\gamma$. It can be shown that, for any monopole of charge $k$, $L^{\pm}$ is isomorphic to $L^{\pm 1}(-k)$; different monopoles correspond to different extensions $E$. Given the two short exact sequences (4), we consider the composite morphism $L^- \to E \to (L^+)^*$, which defines a holomorphic section $P$ of the line bundle $(L^- \otimes L^+)^* \cong \mathcal{O}(2k)$. It will determine a compact curve $S \subset T$, which is given in our coordinates by an equation

$$P(\eta, \zeta) = \eta^k + \alpha_1(\zeta)\eta^{k-1} + \dots + \alpha_k(\zeta) = 0, \tag{5}$$

where each $\alpha_j$ is a complex polynomial of degree not exceeding $2j$. Notice that the real structure $\tau$ induces antiholomorphic morphisms $L^{\pm} \xrightarrow{\ \simeq\ } L^{\mp}$ and thus restricts to a real structure on $S$. This implies that the polynomials $\alpha_j$ in Eq. (5) must satisfy the reality conditions

$$\alpha_j(\zeta) = (-1)^j \zeta^{2j}\, \overline{\alpha_j\!\left(-\frac{1}{\bar\zeta}\right)}. \tag{6}$$
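The reality conditions (6) can be restated coefficientwise, which makes them easy to test. Writing $\alpha_j(\zeta) = \sum_m a_m \zeta^m$, comparing powers of $\zeta$ in (6) gives $a_m = (-1)^{j+m}\,\overline{a_{2j-m}}$. The following is only an illustrative sketch: the function names and the sample coefficients are assumptions made for this example and do not come from the paper.

```python
# Sketch: test the reality condition (6) for a polynomial alpha_j given by
# its coefficient list a[0..2j], where a[m] multiplies zeta**m.
# All names and sample values here are illustrative assumptions.

def evaluate(coeffs, z):
    """Evaluate sum_m coeffs[m] * z**m by Horner's rule."""
    total = 0j
    for c in reversed(coeffs):
        total = total * z + c
    return total

def satisfies_reality(coeffs, j, tol=1e-12):
    """Coefficient form of (6): a[m] == (-1)**(j+m) * conj(a[2j-m])."""
    if len(coeffs) != 2 * j + 1:
        raise ValueError("expected 2j+1 coefficients")
    return all(
        abs(coeffs[m] - (-1) ** (j + m) * complex(coeffs[2 * j - m]).conjugate()) <= tol
        for m in range(2 * j + 1)
    )

def reality_defect(coeffs, j, z):
    """|alpha(z) - (-1)^j z^(2j) conj(alpha(-1/conj(z)))| at a point z != 0."""
    lhs = evaluate(coeffs, z)
    rhs = (-1) ** j * z ** (2 * j) * evaluate(coeffs, -1 / z.conjugate()).conjugate()
    return abs(lhs - rhs)

# Arbitrary sample with j = 2: a4 = g0, a3 = g1, a2 real, a1 = -conj(g1), a0 = conj(g0).
g0, g1, g2 = 0.5 + 1.5j, -2.0 + 0.25j, 1.5
sample = [g0.conjugate(), -g1.conjugate(), g2, g1, g0]

print(satisfies_reality(sample, 2))
print(reality_defect(sample, 2, 0.3 + 0.7j))
```

The two checks are independent: the first compares coefficients, the second evaluates both sides of (6) at a point, so agreement between them guards against an index or sign slip in the coefficient form.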

It can be shown that the three independent real coefficients of $\alpha_1(\zeta)$ may be interpreted as giving the center $(x_1, x_2, x_3)$ of the monopole in $\mathbb{R}^3$,

$$\alpha_1(\zeta) = k\left(x_- \zeta^2 + 2x_3 \zeta - x_+\right),$$

where $x_{\pm} := x_1 \pm i x_2$, and are thus trivial moduli in the solution, related to the translational symmetry of (1). In the following, we shall only consider centred monopoles; these are defined as having the origin as center and thus have $\alpha_1(\zeta) = 0$.

In [8], Hitchin proved that, conversely, any compact real curve $S$ of the linear system $|\mathcal{O}(2k)|$ on $T$ for which $L^2|_S$ is trivial determines a charge $k$ monopole, which will be smooth if the additional condition

$$H^0\!\left(S, L^s(k-2)\right) = 0 \tag{7}$$

holds for $0 < s < 2$. $S$ is called the spectral curve of the monopole and completely determines the gauge equivalence class of the field configuration. It encodes all the information about the monopole; in particular, its genus $g$ is related to the magnetic charge $k$ by

$$g = (k-1)^2 \tag{8}$$

and every symmetry of $S$ is also a symmetry of the corresponding solution to (1).

3. A New Version of the Ercolani–Sinha Conditions

In [4], Ercolani and Sinha rephrase the condition of triviality of the line bundle $L^2|_S$ in terms of $g$ equations involving periods of 1-forms on the spectral curve $S$. Starting with these equations, which they call the “quantisation conditions”, they propose an algorithm for constructing monopoles in the case where the underlying spectral curve is nonsingular. We now review their argument.

Recall that when $S$ is nonsingular the group $H^0(S, \Omega^1_S)$ of global holomorphic 1-forms on $S$ is a finite-dimensional $\mathbb{C}$-vector space, whose dimension is the genus $g$ of $S$. Locally, these forms can be described, using the adjunction formula, as Poincaré


residues of meromorphic 1-forms on $T$ with at most simple poles along $S$. Imposing global regularity, it is easy to show that they can be written in our coordinates as

$$\Omega = \frac{\beta_0 \eta^{k-2} + \beta_1(\zeta)\eta^{k-3} + \dots + \beta_{k-2}(\zeta)}{\partial P/\partial\eta}\, d\zeta \tag{9}$$

(on $U_0 \cap S$ and away from the branch points of $\pi|_S$), where each $\beta_j$ is a polynomial of degree at most $2j$ with arbitrary coefficients. It is clear from this formula that Eq. (8) indeed holds.

From Eq. (5), it is clear that the spectral curve $S$ can be described as a $k$-sheeted branched cover of $\mathbb{CP}^1$, with projection $\pi|_S : S \to \mathbb{CP}^1$. The reality symmetry implies that the number of branch points is even and that they occur in antipodal pairs. To define the sheets of the cover, which we will label by integers $1, \dots, k$, we have to introduce appropriate branch cuts. We may start by choosing a great circle on the sphere passing through no branch points, and joining the branch points in one of the corresponding hemispheres by non-intersecting cuts; then we apply the antipodal map to these to get further cuts joining the branch points on the other side of the great circle we have chosen. To ensure that each sheet is simply connected, we have to make one last cut, connecting the cuts introduced on the two hemispheres. For the spectral curves we shall consider below, one can argue that this last cut has trivial monodromy and is thus unnecessary; in this situation, the reality structure maps cuts to cuts and can therefore be described in terms of the antipodal map together with an order two permutation of the sheets.

We will be interested in the local behaviour of certain meromorphic forms at the points of the fibre above 0, which we shall denote by $0_j$, $j = 1, \dots, k$, and assume to be distinct; this is no loss of generality since there is the freedom of rotating the monopole. Consider the meromorphic function on $S$ defined by $\eta/\zeta$ on $U_0 \cap S$; it is easy to see that it has simple poles at the $2k$ points of $(\pi|_S)^{-1}(\{0, \infty\})$ and is holomorphic elsewhere. In a neighbourhood of $0_j$,

$$d\!\left(\frac{\eta}{\zeta}\right) = \left(-\frac{\eta_j(0)}{\zeta^2} + O(1)\right) d\zeta \quad \text{as } \zeta \to 0, \tag{10}$$

where $\eta_j(\zeta)$ denotes the local solution of (5) on the $j$th sheet. Given a global holomorphic 1-form $\Omega$, we introduce the notation $g_j$ for the coefficient of $\Omega$ at the point $0_j$ in terms of the local coordinate $\zeta$, i.e.

$$\Omega|_{0_j} =: g_j\, d\zeta|_{\zeta=0}. \tag{11}$$

The triviality of the line bundle $L^2|_S$ is equivalent to the existence of a nowhere vanishing holomorphic section $f$; with respect to the trivialisation of $L^2|_S$ over the open sets $U_0 \cap S$ and $U_1 \cap S$, $f$ is given by two nowhere vanishing holomorphic functions $f_0$ and $f_1$ on $U_0 \cap S$, $U_1 \cap S$ respectively, satisfying

$$f_0(\eta, \zeta) = e^{-2\eta/\zeta} f_1(\eta, \zeta)$$

for $(\eta, \zeta) \in U_0 \cap U_1 \cap S$. This implies that the meromorphic 1-forms $d\log f_0$ ($:= df_0/f_0$) and $d\log f_1$ are related by

$$d\log f_0 = -2\, d\!\left(\frac{\eta}{\zeta}\right) + d\log f_1 \tag{12}$$


on $U_0 \cap U_1 \cap S$. Notice that

$$\oint_{\lambda} d\log f_\alpha \in 2\pi i\,\mathbb{Z}, \qquad \alpha = 0, 1 \tag{13}$$

for any homology 1-cycle $\lambda \in H_1(U_\alpha \cap S, \mathbb{Z})$; moreover, these integrals are nonzero in general, since the 1-forms $d\log f_\alpha$ do not have to be exact. From Eqs. (10) and (12), we conclude that $d\log f_1$ must have the local behaviour near $0_j$

$$d\log f_1 = \left(-\frac{2\eta_j(0)}{\zeta^2} + O(1)\right) d\zeta \quad \text{as } \zeta \to 0$$

in order for $f_0$ not to have an essential singularity at $0_j \in U_0 \cap S$. It should be noted that the section $f$ is uniquely determined up to a multiplicative constant, since the quotient of $f$ by any other nowhere vanishing section of $L^2|_S$ yields a global holomorphic function on the compact Riemann surface $S$. Notice also that the modulus of this constant can be fixed by imposing the symmetry

$$f_1(\eta, \zeta) = \frac{1}{\overline{f_0 \circ \tau(\eta, \zeta)}}$$

since the right-hand side has the regularity and nowhere vanishing properties of $f_1$, and $\eta/\zeta$ is conjugated under pull-back by $\tau$.

Let $\{a_1, \dots, a_g, b_1, \dots, b_g\}$ be a canonical basis of $H_1(S, \mathbb{Z}) \cong \mathbb{Z}^{\oplus 2g}$, i.e. satisfying the orthonormality conditions

$$\sharp(a_i, b_j) = \delta_{ij}, \qquad \sharp(a_i, a_j) = 0 = \sharp(b_i, b_j) \tag{14}$$

for the intersection pairing. Following Ercolani and Sinha, we apply the reciprocity law for differentials of the first and second kinds (cf. [6], p. 241) to an arbitrary holomorphic 1-form $\Omega$ and $d\log f_1$ to get

$$\sum_{i=1}^{k} \left(-2\eta_i(0)\right) g_i = \frac{1}{2\pi i} \sum_{j=1}^{g} \begin{vmatrix} \oint_{a_j} \Omega & \oint_{a_j} d\log f_1 \\ \oint_{b_j} \Omega & \oint_{b_j} d\log f_1 \end{vmatrix}. \tag{15}$$

Let $m_j$ and $n_j$ be the integers

$$m_j := -\frac{1}{2\pi i} \oint_{a_j} d\log f_1 \quad \text{and} \quad n_j := \frac{1}{2\pi i} \oint_{b_j} d\log f_1, \tag{16}$$

consistently with (13), and let us define the 1-cycle

$$c := \sum_{j=1}^{g} (n_j a_j + m_j b_j). \tag{17}$$

Then Eq. (15) can be rewritten as

$$-2 \sum_{i=1}^{k} \eta_i(0)\, g_i = \oint_{c} \Omega. \tag{18}$$


The existence of $c \in H_1(S, \mathbb{Z})$ satisfying (18) is equivalent to the line bundle $L^2|_S$ being trivial. Unfortunately, the condition (7) which would ensure smoothness cannot be implemented directly in the Ercolani–Sinha approach if $k > 2$, but we can include a weaker statement in the analysis as follows. Since for $k \ge 2$ there is an inclusion $H^0(S, L^s) \hookrightarrow H^0(S, L^s(k-2))$ given by tensoring with a section of $\mathcal{O}(k-2)|_S$, the condition

$$H^0(S, L^s) = 0 \tag{19}$$

is necessary for (7) to hold. Now we can repeat the argument above to investigate the existence of global sections of $L^s|_S$, arriving at the same Eq. (18) with 2 replaced by $s$, and we can conclude that there will be no nontrivial global sections of $L^s|_S$ for $0 < s < 2$ if and only if $c$ is primitive in $H_1(S, \mathbb{Z})$.

We can still simplify the left-hand side of (18). Consider a global holomorphic 1-form $\Omega$ on $S$, as given by (9). After defining the branch cuts, we can write

$$P(\eta, \zeta) = \prod_{j=1}^{k} \left(\eta - \eta_j(\zeta)\right),$$

and so

$$\frac{\partial P}{\partial \eta}(\eta, \zeta) = \sum_{i=1}^{k} \prod_{j \neq i}^{k} \left(\eta - \eta_j(\zeta)\right).$$

On sheet $i$, $\eta = \eta_i(\zeta)$ and all the terms in the sum above vanish except one,

$$\left.\frac{\partial P}{\partial \eta}(\eta, \zeta)\right|_{\text{sheet } i} = \prod_{j \neq i}^{k} \left(\eta_i(\zeta) - \eta_j(\zeta)\right). \tag{20}$$

We can use this to write the coefficient $g_i$ in (11) for $\Omega$ as

$$g_i = \frac{\beta_0 \eta_i^{k-2}(0) + \beta_1(0)\eta_i^{k-3}(0) + \dots + \beta_{k-2}(0)}{\prod_{j \neq i}^{k} \left(\eta_i(0) - \eta_j(0)\right)},$$

so the left-hand side of (18) takes the form

$$-2 \sum_{i=1}^{k} \eta_i(0)\, g_i = -2 \sum_{i=1}^{k} \frac{\beta_0 \eta_i^{k-1}(0) + \beta_1(0)\eta_i^{k-2}(0) + \dots + \beta_{k-2}(0)\eta_i(0)}{\prod_{j \neq i}^{k} \left(\eta_i(0) - \eta_j(0)\right)}.$$

This appears to be a very complicated expression, but we can simplify it considerably if we make use of the identity

$$\sum_{i=1}^{k} x_i^{n} \prod_{j \neq i}^{k} \frac{1}{x_i - x_j} = \begin{cases} 0, & 0 \le n \le k-2 \\ 1, & n = k-1. \end{cases} \tag{21}$$

Taking $x_i = \eta_i(0)$, we obtain

$$-2 \sum_{i=1}^{k} \eta_i(0)\, g_i = -2\beta_0$$


and substitution in (18) yields

$$\oint_{c} \Omega = -2\beta_0. \tag{22}$$
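The identity (21) used in the step above can be checked in exact rational arithmetic. The sketch below is a sanity check only, not part of the paper's argument; the function name and the sample points are arbitrary choices made for this illustration.

```python
from fractions import Fraction

def lagrange_power_sum(xs, n):
    """Return sum_i xs[i]**n / prod_{j != i} (xs[i] - xs[j]) exactly."""
    total = Fraction(0)
    for i, xi in enumerate(xs):
        denom = Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                denom *= xi - xj
        total += Fraction(xi) ** n / denom
    return total

# Arbitrary distinct sample points (k = 4); any distinct values would do.
xs = [Fraction(2), Fraction(-1), Fraction(5, 3), Fraction(7)]
values = [lagrange_power_sum(xs, n) for n in range(len(xs))]
print(values)  # entries for n = 0..k-2 vanish, the entry for n = k-1 equals 1
```

Using `Fraction` rather than floating point means the vanishing for $0 \le n \le k-2$ is verified exactly, with no tolerance to choose.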

So our version of the Ercolani–Sinha conditions amounts to the existence of a primitive 1-cycle $c$ such that Eq. (22) is satisfied for every global holomorphic 1-form $\Omega$, where $\beta_0$ is the coefficient in (9) for $\Omega$.

To prove (21), we first note that the cases $0 \le n \le k-2$ follow from the $n = k-1$ case: a translation $x_i \mapsto x_i - y$ of all the $x_i$'s leaves the denominators in the sum invariant,

$$\sum_{i=1}^{k} (x_i - y)^{k-1} \prod_{j \neq i}^{k} \frac{1}{x_i - x_j} = 1,$$

so by expanding the binomials and collecting equal powers of $y$ we get the statement for all $0 \le n \le k-2$. The proof of the $n = k-1$ case by induction on $k$ is rather lengthy and we prefer to argue as follows. It is readily seen that the whole sum is symmetric under the action of the symmetric group $S_k$ permuting the $x_i$'s. Reducing to a common fraction yields as denominator

$$\Delta(x_1, \dots, x_k) = \prod_{i<j}^{k} (x_i - x_j)$$

and this polynomial is completely antisymmetric under $S_k$; in fact, the space of antisymmetric polynomials in $k$ variables is generated by $\Delta$ over the ring of symmetric polynomials. The numerator is then necessarily antisymmetric and a homogeneous polynomial of degree $\frac{1}{2}k(k-1)$, which is also the degree of $\Delta$, so it has to be equal to $\Delta$ times a constant. Taking the asymptotic limit $x_1 \to \infty$ in the original sum, we conclude that this constant has to be 1.

It is convenient, when we come to investigate particular examples, to introduce bases for both the global holomorphic 1-forms and the homology 1-cycles on $S$. An obvious basis $\{\Omega^{(\ell)}, 1 \le \ell \le g\}$ for $H^0(S, \Omega^1_S)$ corresponds to taking monomials $\eta^r \zeta^s$ for the allowed powers $r$ and $s$ (in lexicographical order of decreasing $r$ and increasing $s$) as numerators of (9),

$$\Omega^{(1)} = \frac{\eta^{k-2}\, d\zeta}{\partial P/\partial\eta}, \quad \Omega^{(2)} = \frac{\eta^{k-3}\, d\zeta}{\partial P/\partial\eta}, \quad \Omega^{(3)} = \frac{\eta^{k-3}\zeta\, d\zeta}{\partial P/\partial\eta}, \quad \dots, \quad \Omega^{(g)} = \frac{\zeta^{2k-4}\, d\zeta}{\partial P/\partial\eta}. \tag{23}$$

The condition (22) for a general $\Omega$ is then equivalent to the $g$ conditions

$$\oint_{c} \Omega^{(\ell)} = -2\delta_{1\ell}. \tag{24}$$
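The ordering of the basis (23) and the count $g = (k-1)^2$ of Eq. (8) can be made concrete by listing the exponent pairs $(r, s)$ of the numerators $\eta^r \zeta^s$, with $0 \le r \le k-2$ and $0 \le s \le 2(k-r-2)$. This is a small illustrative sketch; the function name is an assumption made here.

```python
def holomorphic_basis_exponents(k):
    """Exponent pairs (r, s) of the numerators eta**r * zeta**s in (23),
    listed with r decreasing and, for each r, s increasing."""
    return [
        (r, s)
        for r in range(k - 2, -1, -1)
        for s in range(2 * (k - r - 2) + 1)
    ]

for k in range(2, 6):
    basis = holomorphic_basis_exponents(k)
    # The number of basis forms should equal the genus (k - 1)**2 of Eq. (8).
    print(k, len(basis), basis[0], basis[-1])
```

For each $k$ the first pair is $(k-2, 0)$, corresponding to $\Omega^{(1)}$, and the last is $(0, 2k-4)$, corresponding to $\Omega^{(g)}$.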

Let us also fix a canonical basis (14) for $H_1(S, \mathbb{Z})$. The $(g \times 2g)$ period matrix for $S$ corresponding to the two choices of bases is then defined as usual by $P = [A|B]$, where $A$ and $B$ are square matrices with entries

$$A_{\ell j} := \oint_{a_j} \Omega^{(\ell)} \quad \text{and} \quad B_{\ell j} := \oint_{b_j} \Omega^{(\ell)}.$$


Recalling (17), Eq. (24) can now be written as

$$\sum_{j=1}^{g} \left(A_{\ell j} n_j + B_{\ell j} m_j\right) = -2\delta_{1\ell}. \tag{25}$$

Although the number of integers to be determined in (25) is $2g$, they still have to satisfy constraints coming from the reality structure of $S$. We prove below that these imply that $c$ is antisymmetric under the action of $\tau$ on the first homology group,

$$\tau_* c = -c. \tag{26}$$

This imposes $g$ linear constraints on the $2g$ components of $c$. In fact, since $\tau$ is antiholomorphic, $\sharp(a, b) = -\sharp(\tau_* a, \tau_* b)$ for any $a, b \in H_1(S, \mathbb{Z})$, and this shows that the matrix $\boldsymbol{\tau}$ representing $\tau_*$ in the canonical basis (14) of $H_1(S, \mathbb{Z})$ satisfies

$$\boldsymbol{\tau}^{t} = \mathbf{J}\left(-\boldsymbol{\tau}^{-1}\right)\mathbf{J}^{-1}, \tag{27}$$

where $\mathbf{J}$ is the matrix representing the intersection pairing in this basis,

$$\mathbf{J} = \begin{pmatrix} 0 & 1_g \\ -1_g & 0 \end{pmatrix}. \tag{28}$$

Since $\boldsymbol{\tau}^2 = 1_{2g}$, $\boldsymbol{\tau}$ is diagonalisable and has eigenvalues $\pm 1$; then (27) implies that these have to occur with equal multiplicities. Hence the antisymmetric 1-cycles lie in a $\mathbb{Z}^{\oplus g}$ subgroup of $H_1(S, \mathbb{Z})$.

To prove (26), we consider the basis (23). Since $\tau$ is antiholomorphic, it pulls back holomorphic 1-forms on $S$ to antiholomorphic 1-forms and vice-versa; the forms above are mapped as

$$\tau^*\!\left(\frac{\eta^r \zeta^s\, d\zeta}{\frac{\partial P}{\partial\eta}(\eta, \zeta)}\right) = (-1)^{k+r+s+1}\, \overline{\left(\frac{\eta^r \zeta^{2(k-r-2)-s}\, d\zeta}{\frac{\partial P}{\partial\eta}(\eta, \zeta)}\right)} \tag{29}$$

for $0 \le r \le k-2$ and $0 \le s \le 2(k-r-2)$. Using (29) and (24), we obtain

$$\oint_{\tau_* c + c} \Omega^{(1)} = -\overline{\oint_{c} \Omega^{(1)}} + \oint_{c} \Omega^{(1)} = 2 - 2 = 0$$

and for $\ell \neq 1$

$$\oint_{\tau_* c + c} \Omega^{(\ell)} = \pm\overline{\oint_{c} \Omega^{(\ell')}} + \oint_{c} \Omega^{(\ell)} = \pm 0 + 0 = 0$$

for some $\ell' \neq 1$. We conclude that the integral of any global holomorphic 1-form around $\tau_* c + c$ vanishes, and this implies (26).

To illustrate how we can use the conditions (25) to determine spectral curves of monopoles, we take as example the well-known charge 2 monopole ([12, 2]), which is


also considered in [4]. The general spectral curve for a centred monopole of charge 2, after imposing the reality conditions (6), has the form

$$\eta^2 + \left(\gamma_0 \zeta^4 + \gamma_1 \zeta^3 + \gamma_2 \zeta^2 - \bar\gamma_1 \zeta + \bar\gamma_0\right) = 0,$$

where $\gamma_2$ is real. The four roots of the polynomial in brackets occur in antipodal pairs; we can use the $SO(3)$ action to take one pair to $\pm 1$ and the other one to $\pm e^{\pm 2i\theta}$, where $0 \le \theta \le \frac{\pi}{4}$. A further rotation by $\zeta \mapsto e^{i\theta}\zeta$ then takes the spectral curve to

$$\eta^2 + \left(\frac{\kappa}{2}\right)^{2} \left(\zeta^4 - 2\cos(2\theta)\,\zeta^2 + 1\right) = 0, \tag{30}$$

where $\kappa$ is a real number to be determined in terms of $\theta$. Equation (30) defines a double cover of $\mathbb{CP}^1$ with branch points at the four roots of the polynomial in brackets,

$$w_1 = e^{i\theta}, \qquad w_2 = -e^{-i\theta}, \qquad z_1 = -e^{i\theta}, \qquad z_2 = e^{-i\theta}.$$

We will be interested in the generic case where $S$ is nonsingular; this happens if and only if all the points above are distinct. $S$ is an elliptic curve and can be constructed by gluing together two copies of the Riemann sphere along two branch cuts, that we choose to be on the equator $\{\zeta : |\zeta| = 1\}$. We label the two sheets of $S$ by $j = 1, 2$, which correspond to the two possible choices of sign for $\eta$ when solving (30); sheet $j$ is defined by the function $\eta_j$ obtained by analytic continuation, avoiding the cuts above, of

$$\zeta \mapsto (-1)^{j-1}\, \frac{i\kappa}{2} \sqrt{\zeta^4 - 2\cos(2\theta)\,\zeta^2 + 1}$$

regarded as a germ at $0 \in \mathbb{C}$. Here, and elsewhere, we consider the principal branch of the root, viz. $-\frac{\pi}{q} < \arg z^{1/q} \le \frac{\pi}{q}$, $\forall\, z \in \mathbb{C}^*$.

Fig. 1. Branch cuts and 1-homology basis for the spectral curve of the charge 2 monopole


We choose a canonical basis $\{a, b\}$ of $H_1(S, \mathbb{Z})$ as in Fig. 1, where we draw the paths as dashed or dotted lines if they lie on sheets 1 or 2, and write $c = na + mb$. In this case $H^0(S, \Omega^1_S)$ is 1-dimensional and a generator is

$$\Omega = \frac{d\zeta}{2\eta}.$$

The periods can be expressed in terms of Legendre's complete elliptic integral of the first kind,

$$A = \oint_{a} \Omega = \frac{2}{\kappa \sin\theta} \int_{0}^{\theta} \frac{du}{\sqrt{1 - \csc^2\theta\, \sin^2 u}} = \frac{2}{\kappa} K(\sin\theta),$$

$$B = \oint_{b} \Omega = \frac{2i}{\kappa \cos\theta} \int_{0}^{\frac{\pi}{2} - \theta} \frac{du}{\sqrt{1 - \sec^2\theta\, \sin^2 u}} = \frac{2i}{\kappa} K(\cos\theta).$$
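The last equality in each period formula, the reduction of the trigonometric integral to $K$, can be checked numerically. The sketch below assumes that $K(k)$ denotes the complete elliptic integral with modulus $k$, computed here through the arithmetic-geometric mean, $K(k) = \pi / \bigl(2\,\mathrm{AGM}(1, \sqrt{1-k^2})\bigr)$. It does not check the contour integrals $\oint_a \Omega$ and $\oint_b \Omega$ themselves. The function names, the sample angle and the substitution used to tame the endpoint singularity are choices made for this illustration.

```python
import math

def ellipk_agm(k):
    """Complete elliptic integral of the first kind K(k) (modulus k) via the AGM."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):  # the AGM converges quadratically; 40 steps is ample
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def legendre_form_integral(theta, steps=20000):
    """Midpoint rule for int_0^theta du / sqrt(1 - (sin u / sin theta)**2).

    The substitution u = theta - t**2 removes the inverse-square-root
    singularity of the integrand at the endpoint u = theta.
    """
    h = math.sqrt(theta) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        u = theta - t * t
        total += 2.0 * t / math.sqrt(1.0 - (math.sin(u) / math.sin(theta)) ** 2)
    return total * h

theta = 0.6  # arbitrary sample angle in (0, pi/4]
lhs_a = legendre_form_integral(theta) / math.sin(theta)
rhs_a = ellipk_agm(math.sin(theta))
# The b-period integral is the same integral with theta replaced by pi/2 - theta.
lhs_b = legendre_form_integral(math.pi / 2 - theta) / math.cos(theta)
rhs_b = ellipk_agm(math.cos(theta))
print(lhs_a, rhs_a)
print(lhs_b, rhs_b)
```

Dividing the integral by $\sin\theta$ (respectively $\cos\theta$) mirrors the prefactors in the formulas for $A$ and $B$, so each printed pair should agree to within the quadrature error.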

So Eq. (25) reads

$$\frac{2}{\kappa} K(\sin\theta)\, n + \frac{2i}{\kappa} K(\cos\theta)\, m = -2.$$

Therefore $m = 0$, and $n$ must then be a generator of $\mathbb{Z}$, which we can take to be $-1$, obtaining $\kappa = K(\sin\theta)$. This can be checked to agree with the result of Hurtubise [12]. Note that in this case Eqs. (7) and (19) are equivalent, so the method recovers all nonsingular spectral curves of (centred and suitably oriented) monopoles of charge 2. In this example, the special 1-cycle $c$ in Eq. (22) is thus $-a$. It is readily checked that it is antisymmetric under $\tau$. We point out that, although here

$$\oint_{a} d\log f_1 = 0,$$

the $a$-periods of $d\log f_1$ do not vanish for general monopoles, and this cannot be avoided by just rescaling $f$ as claimed in [4]. This will be illustrated in Sect. 5.1, where we consider a monopole with a spectral curve of higher genus.

4. The Corrigan–Goddard Conditions

In [3], Corrigan and Goddard used the so-called $A_k$ Ansatz of Atiyah–Ward for instantons to construct a charge $k$ solution to the Bogomol'nyĭ equations (1) with $\dim N_k = 4k - 1$ free parameters. This construction was also obtained independently by Forgács et al. [5], and has been applied [13] to study monopoles in situations where the equations involved are simplified. Unlike the method we presented in Sect. 3, the Corrigan–Goddard approach does not assume smoothness of the underlying spectral curves; indeed, it can be used to obtain for example the axially symmetric monopole of arbitrary charge $k$, whose spectral curve is reducible to $k$ spherical components.

In the notation we have introduced, the construction goes as follows. Start with a polynomial $P(\eta, \zeta)$ as in (5), satisfying the reality constraints (6). Orient the monopole


so that there is an open annulus $A$ in $\mathbb{CP}^1$ which contains the equator $E = \{\zeta : |\zeta| = 1\}$ but does not contain any of the branch points of $\pi|_S$. Assume that $A$ lifts to $k$ disjoint annuli on the spectral curve; then one can define the branch cuts so that sheet $j$ contains one of the lifted annuli, which we denote by $A_j$. On $\pi^{-1}(A)$, consider the function

$$\Theta(\eta, \zeta) := 2\pi i \sum_{j=1}^{k} \frac{\nu_j}{2} \prod_{\ell \neq j}^{k} \frac{\eta - \eta_\ell(\zeta)}{\eta_j(\zeta) - \eta_\ell(\zeta)}, \tag{31}$$

where $\nu_j$ are some integers to be determined. This is a Lagrange interpolation polynomial in $\eta$ of the $k$ conditions that $\Theta$ should take the value $\pi i \nu_j$ on $A_j$. For $\zeta \in A$, define the functions $\Theta_r$ from the coefficients of $\eta^r$ in $\Theta$ as follows:

$$\Theta(\eta, \zeta) =: 2\pi i \sum_{r=0}^{k-1} \Theta_r(\zeta) \left(\frac{\eta}{2\zeta}\right)^{r}. \tag{32}$$

Corrigan and Goddard's analysis then leads to the conditions

$$\oint_{E} \Theta_1(\zeta)\, \frac{d\zeta}{\zeta} = 2 \tag{33}$$

and

$$\oint_{E} \Theta_r(\zeta)\, \zeta^{s}\, \frac{d\zeta}{\zeta} = 0, \qquad 2 \le r \le k-1, \quad |s| \le r-1. \tag{34}$$

These are $(k-1)^2$ constraints on the $k^2 + 2k$ coefficients of $P(\eta, \zeta)$, just as one obtains using the Ercolani–Sinha algorithm. When the spectral curve is nonsingular, we would expect them to be equivalent to (24). We now clarify how they relate to each other.

Denoting by $s_i$ the $i$th elementary symmetric polynomial in a given number of variables, we can expand the numerator of (31) to obtain

$$\Theta(\eta, \zeta) = 2\pi i \sum_{j=1}^{k} \sum_{r=0}^{k-1} (-1)^{k-r-1}\, \frac{\nu_j}{2}\, \frac{s_{k-r-1}\!\left(\eta_1(\zeta), \dots, \widehat{\eta_j(\zeta)}, \dots, \eta_k(\zeta)\right) \eta^{r}}{\prod_{\ell \neq j}^{k} \left(\eta_j(\zeta) - \eta_\ell(\zeta)\right)}.$$

The elementary symmetric polynomials satisfy the recurrence relation

$$s_i(x_1, \dots, \widehat{x_j}, \dots, x_k) = s_i(x_1, \dots, x_k) - x_j\, s_{i-1}(x_1, \dots, \widehat{x_j}, \dots, x_k)$$

for $0 \le i \le k$ (taking $s_0 := 1$), and iterating this one finds

$$s_i(x_1, \dots, \widehat{x_j}, \dots, x_k) = \sum_{h=0}^{i} (-1)^{h} x_j^{h}\, s_{i-h}(x_1, \dots, x_k).$$

Clearly, $(-1)^i s_i(\eta_1(\zeta), \dots, \eta_k(\zeta))$ are just the polynomials $\alpha_i(\zeta)$ in (5) for each $0 \le i \le k$ (with $\alpha_0 := 1$). Therefore, we can read off the functions $\Theta_r$ in (32) as

$$\Theta_r(\zeta) = \sum_{j=1}^{k} \sum_{h=0}^{k-r-1} \frac{\nu_j}{2}\, \frac{\eta_j^{h}(\zeta)\, \alpha_{k-r-h-1}(\zeta)}{\prod_{\ell \neq j}^{k} \left(\eta_j(\zeta) - \eta_\ell(\zeta)\right)}\, (2\zeta)^{r}.$$
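The iterated recurrence for the elementary symmetric polynomials used above is easy to confirm on sample data. The sketch below works in exact rational arithmetic; the helper names and the sample values are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elementary_symmetric(xs, i):
    """s_i(xs): sum of the products over all i-element subsets, with s_0 = 1."""
    return sum(
        (prod(subset, start=Fraction(1)) for subset in combinations(xs, i)),
        Fraction(0),
    )

def omit(xs, j):
    """The list xs with its j-th entry removed (the 'hatted' variable)."""
    return xs[:j] + xs[j + 1:]

def iterated_formula(xs, j, i):
    """Right-hand side: sum_{h=0}^{i} (-1)**h * xs[j]**h * s_{i-h}(xs)."""
    return sum(
        ((-1) ** h * xs[j] ** h * elementary_symmetric(xs, i - h) for h in range(i + 1)),
        Fraction(0),
    )

# Arbitrary sample values standing in for eta_1(zeta), ..., eta_k(zeta), k = 4.
xs = [Fraction(3), Fraction(-2), Fraction(1, 2), Fraction(5)]
checks = [
    elementary_symmetric(omit(xs, j), i) == iterated_formula(xs, j, i)
    for j in range(len(xs))
    for i in range(len(xs) + 1)
]
print(all(checks))
```

The case $i = k$ is included deliberately: the left-hand side is then zero because only $k-1$ variables remain, while the right-hand side vanishes because it equals, up to sign, $\prod_m (x_m - x_j)$.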


So far, we have shown that, for the $j$-th term in the sum, the numerator depends only on $\eta_j(\zeta)$ and $\zeta$. Using (20), we can eliminate altogether the dependence on the functions $\eta_\ell$ with $\ell \neq j$, and this allows us to write, for $1 \le r \le k-1$,

\[ \oint_E \Theta_r(\zeta)\,\frac{d\zeta}{\zeta} = \oint_{\sum_{j=1}^{k}\nu_j E_j} \frac{\sum_{h=0}^{k-r-1}\eta^{h}\,\alpha_{k-r-h-1}(\zeta)}{\dfrac{\partial P}{\partial\eta}(\eta,\zeta)}\,(2\zeta)^{r-1}\,d\zeta, \]

where $E_j := (\pi|_S)^{-1}(E)\cap A_j$ is the lift of $E$ to sheet $j$. The integrand no longer depends on the sheet label. It becomes clear now how to write the left-hand side of the Corrigan–Goddard conditions as integrals over 1-cycles on $S$. If we define the holomorphic 1-form $-\Xi_r$ on $\cup_{j=1}^{k} A_j$ to be the integrand in the above expression, then the conditions (33) and (34) can be written respectively as

\[ \oint_{\sum_{j=1}^{k}\nu_j E_j} \Xi_1 = -2 \tag{35} \]

and

\[ \oint_{\sum_{j=1}^{k}\nu_j E_j} \zeta^{s}\,\Xi_r = 0 \tag{36} \]

for $2 \le r \le k-1$ and $|s| \le r-1$. Equations (35) and (36) are very similar to the version (24) of the Ercolani–Sinha conditions. In fact, they turn out to be precisely equivalent to (24), provided we assume $c$ to be of the form

\[ c = \sum_{j=1}^{k} \nu_j E_j \tag{37} \]

rather than a general 1-cycle as in (17). To see this, we first remark that all the integrands in (35) and (36) are of the form (9), and hence global holomorphic 1-forms on $S$. For each $1 \le r \le k-1$, the highest power of $\eta$ in the numerator of $\Xi_r$ never exceeds $k-r-1$, and the coefficient of $\eta^{k-r-1}$ can be seen to be equal to $-(2\zeta)^{r-1}$. So multiplication of $\Xi_r$ by $\zeta^{s}$ with $-r+1 \le s \le r-1$ as in (36) gives monomials in $\zeta$ of all degrees between $0$ and $2(r-1)$ as coefficients for $\eta^{k-r-1}$. We conclude that all the homogeneous Eqs. ($\ell \neq 1$) in (24) can be obtained from (36) if we consider first the $2k-3$ equations corresponding to $r = k-1$ and continue decreasing $r$ down to 2, using at each stage the vanishing of the integrals for greater $r$ from the previous steps. The $\ell = 1$ equation also agrees with (35), since we can use (36) and the coefficient of $\eta^{k-2}$ in the numerator of $\Xi_1$ is $-1$. Conversely, the Ercolani–Sinha conditions in the form (24) also imply the Corrigan–Goddard conditions (35) and (36) if (37) holds.

The question to put now is of course: Is the Ansatz (37) for the special cycle $c$ in Eq. (22) valid in general? In the next section, we show that this is not the case, by explicit computation of $c$ for the tetrahedral 3-monopole.


C. J. Houghton, N. S. Manton, N. M. Romão

5. The Tetrahedral 3-Monopole Revisited

5.1. Spectral curve.

Now we apply the method of Sect. 3 to investigate the spectral curve of the tetrahedrally symmetric monopole of charge 3. This was first studied in [9], where the existence of the monopole was proved by imposing tetrahedral symmetry to simplify Nahm’s equations and solve them in terms of elliptic functions. A numerical treatment of the ADHMN construction was developed and applied to this monopole in [10], which allowed the fields to be computed and, using these, level surfaces for the energy density were plotted. As in [9], we start with the Ansatz

\[ \eta^3 + \alpha\,(\zeta^6 + 5\sqrt{2}\,\zeta^3 - 1) = 0 \tag{38} \]

for the spectral curve $S$, where $\alpha$ is a nonzero constant to be determined; the reality conditions imply $\alpha \in \mathbb{R}$. The branch points occur at the zeroes of the polynomial in brackets,

\[ w_1 = \frac{\sqrt3-1}{\sqrt2},\quad w_2 = \omega w_1,\quad w_3 = \bar\omega w_1,\qquad z_1 = -\frac{\sqrt3+1}{\sqrt2},\quad z_2 = \omega z_1,\quad z_3 = \bar\omega z_1, \]

where $\omega := e^{2\pi i/3}$. These are equidistant points on the Riemann sphere, antipodal in pairs, which are related by radial projection to the midpoints of the edges of a tetrahedron inscribed in the sphere. In the configuration we have chosen, the tetrahedron has a vertex at 0 and is oriented such that the radial projection of one of the three edges containing 0 passes through 1, as shown in Fig. 2.

Fig. 2. The inscribed tetrahedron underlying the symmetry of the spectral curve
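The stated properties of the six branch points can be checked numerically. The sketch below is only an illustration: the function names and the particular stereographic-projection convention are assumptions of ours. It verifies that the $w_i$ and $z_i$ are roots of the bracket in (38), that they are antipodal in pairs under $\zeta \mapsto -1/\bar\zeta$, and that on the unit sphere any two of them are either antipodal or orthogonal, which is what one expects of the edge midpoints of a regular tetrahedron.

```python
import cmath
import math

SQRT2, SQRT3 = math.sqrt(2.0), math.sqrt(3.0)
OMEGA = cmath.exp(2j * math.pi / 3)

w1 = complex((SQRT3 - 1) / SQRT2)
z1 = complex(-(SQRT3 + 1) / SQRT2)
w = [w1, OMEGA * w1, OMEGA.conjugate() * w1]
z = [z1, OMEGA * z1, OMEGA.conjugate() * z1]


def bracket(zeta):
    """The polynomial in brackets in (38)."""
    return zeta**6 + 5 * SQRT2 * zeta**3 - 1


def antipode(zeta):
    """Antipodal map on the Riemann sphere."""
    return -1 / zeta.conjugate()


def to_sphere(zeta):
    """Inverse stereographic projection (one common convention, assumed here)."""
    d = abs(zeta) ** 2 + 1
    return (2 * zeta.real / d, 2 * zeta.imag / d, (abs(zeta) ** 2 - 1) / d)


max_residual = max(abs(bracket(p)) for p in w + z)
antipodal_in_pairs = all(abs(antipode(w[i]) - z[i]) < 1e-12 for i in range(3))

# Pairwise dot products of the six points on the unit sphere.
points = [to_sphere(p) for p in w + z]
dots = sorted(
    round(sum(a * b for a, b in zip(points[i], points[j])), 9) + 0.0
    for i in range(6)
    for j in range(i + 1, 6)
)
print(max_residual, antipodal_in_pairs, dots)
```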

To define the branch cuts, we choose to connect the wi ’s and the zi ’s together along arcs of circles centred at the origin and antipodal to each other as shown in Fig. 3. No more cuts are needed, since each branch point is of cube root type and so any closed path on CP1 enclosing zero mod 3 branch points lifts to a closed path on S. Now we can


label the three sheets as before: for $j = 1, 2, 3$, we define sheet $j$ to correspond to the analytic continuation $\eta_j$ of

\[ \zeta \mapsto -\omega^{j-1}\,\alpha^{1/3}\,\zeta\,\sqrt[3]{\zeta^3 + 5\sqrt2 - \zeta^{-3}} \tag{39} \]

regarded as a germ at $1 \in \mathbb{C}$. In particular, notice that on each sheet $\eta$ is indeed given by (39) for all $\zeta$ in the annulus $C := \{\zeta : |w_1| < |\zeta| < |z_1|\}$. With these conventions, it can be checked that the rules for crossing the branch cuts are as given in Fig. 3, where the encircled ± signs mean that the label $j$ is to be increased/decreased by 1 mod 3 when the corresponding cut is crossed.

Fig. 3. Branch cuts for the spectral curve of the tetrahedral 3-monopole

It is not hard to see that one obtains a compact Riemann surface of genus four when three copies of the Riemann sphere are identified along the branch cuts as specified in Fig. 3. In fact, by identifying three copies of the upper or lower hemispheres along the pair of cuts as above, one obtains a torus with three discs removed; the circles of the boundary correspond to the equators of the spheres we started with. Gluing together the two surfaces obtained in this way along their boundaries gives a compact curve of genus four. This is sketched in Fig. 4; the three circles shown project under $\pi|_S$ to the equator $E$ of $\mathbb{CP}^1$, and they will be referred to as the equators on a given sheet. We shall adopt the convention of drawing the paths as dash-dotted, dashed or dotted curves if they lie on sheets 1, 2 or 3, respectively.

Now we choose a canonical basis for $H_1(S,\mathbb{Z}) \cong \mathbb{Z}^{\oplus 8}$ as in Fig. 5. The first two 1-cycles $a_1$ and $b_1$ are drawn close to the cut connecting the $z_i$’s so as to have the desired intersection number; for $a_2$ and $b_2$ we choose the equator on sheet 2 and a distorted meridian intersecting it as required; all the other intersections between these four 1-cycles are zero. Then we act with the reality map $\tau$ on these cycles to get the other elements of the basis:

\[ a_3 := \tau_* a_2,\quad a_4 := \tau_* a_1,\quad b_3 := -\tau_* b_2,\quad b_4 := -\tau_* b_1. \tag{40} \]

Fig. 4. Spectral curve of the tetrahedral 3-monopole

Our choice of branch cuts is such that τ sends cuts to cuts and hence maps a given sheet onto another sheet. It is easy to check that for ζ ∈ R, η as given by (39) for j = 1 also takes real values (cf. Eq. (2)). We then conclude that sheet 1 is invariant under τ , while the other two sheets are interchanged. It follows that the second half of our homology basis is as drawn in Fig. 5, and all the remaining intersection numbers for the elements in the basis are as required by (14).

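Fig. 5. The basis for $H_1(S,\mathbb{Z})$

Our chosen basis for $H^0(S,\Omega^1_S)$ is

\[ \Omega^{(1)} = \frac{d\zeta}{3\eta},\quad \Omega^{(2)} = \frac{d\zeta}{3\eta^2},\quad \Omega^{(3)} = \frac{\zeta\,d\zeta}{3\eta^2},\quad \Omega^{(4)} = \frac{\zeta^2\,d\zeta}{3\eta^2}. \]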


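According to (29) these forms are pulled back by the reality structure as

\[ \tau^*\Omega^{(1)} = -\overline{\Omega^{(1)}},\quad \tau^*\Omega^{(2)} = \overline{\Omega^{(4)}},\quad \tau^*\Omega^{(3)} = -\overline{\Omega^{(3)}},\quad \tau^*\Omega^{(4)} = \overline{\Omega^{(2)}}. \tag{41} \]

We are now ready to compute the period matrix. The reality properties (40) and (41) imply that the periods around $a_2$, $a_1$, $b_2$ and $b_1$ determine those around $a_3$, $a_4$, $b_3$ and $b_4$, respectively. For example,

\[ B_{23} = \oint_{-\tau_* b_2} \tau^*\overline{\Omega^{(4)}} = -\oint_{b_2} \overline{\Omega^{(4)}} = -\overline{B_{42}}. \]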

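This means that we only have to calculate half of the 32 entries of the period matrix. First we consider the periods around the equator $a_2$. Notice that $\zeta^3 + 5\sqrt2 - \zeta^{-3}$ is invariant under the change of variable $\zeta \mapsto \omega\zeta$. So $A_{22}$ is a constant multiple of

\[ (1+\omega+\bar\omega)\int_{1}^{\omega} \frac{d\zeta}{\zeta^{2}\,\bigl(\zeta^3 + 5\sqrt2 - \zeta^{-3}\bigr)^{2/3}} = 0, \]

and similarly $A_{42} = 0$. The two integrals $A_{12}$ and $A_{32}$ can be expressed in terms of the hypergeometric function ${}_2F_1$. Letting $F := {}_2F_1\bigl(\tfrac16,\tfrac23;1;-\tfrac{2}{25}\bigr)$, we find

\[ A_{12} = -\frac{2i\bar\omega}{3\,(5\sqrt2\,\alpha)^{1/3}} \int_{-\pi/2}^{\pi/2} \frac{du}{\bigl(1-\tfrac{i\sqrt2}{5}\sin u\bigr)^{1/3}} = -\frac{2\pi i\,\bar\omega\,F}{3\,\sqrt[3]{5}\,\sqrt[6]{2}\,\alpha^{1/3}} \]

and, using the relation (see [1], p. 559)

\[ {}_2F_1\Bigl(\tfrac13,\tfrac56;1;-\tfrac{2}{25}\Bigr) = \frac{\sqrt[3]{5}}{\sqrt3}\;{}_2F_1\Bigl(\tfrac16,\tfrac23;1;-\tfrac{2}{25}\Bigr), \]

we obtain

\[ A_{32} = \frac{2i\omega}{3\,(5\sqrt2\,\alpha)^{2/3}} \int_{-\pi/2}^{\pi/2} \frac{du}{\bigl(1-\tfrac{i\sqrt2}{5}\sin u\bigr)^{2/3}} = \frac{2\pi i\,\omega\,F}{3\sqrt3\,\sqrt[3]{10}\,\alpha^{2/3}}. \]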
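Both steps used for $A_{12}$ and $A_{32}$ can be checked numerically: the arc integral with exponent $1/3$ equals $\pi F$, and the hypergeometric relation holds with the factor $\sqrt[3]{5}/\sqrt3 = (1-x)^{-1/6}$ at $x = -2/25$, which is an instance of Euler's transformation ${}_2F_1(a,b;c;x) = (1-x)^{c-a-b}\,{}_2F_1(c-a,c-b;c;x)$. The following is a minimal sketch using only the power series and a midpoint rule; the helper names, term count and grid size are our own choices.

```python
import math


def hyp2f1(a, b, c, x, terms=60):
    """Gauss hypergeometric series, adequate for small |x| (here |x| = 0.08)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
    return total


def arc_integral(s, n=2000):
    """Midpoint rule for the integral over [-pi/2, pi/2] of (1 - (i sqrt2/5) sin u)^(-s)."""
    a = 1j * math.sqrt(2.0) / 5
    h = math.pi / n
    return h * sum(
        (1 - a * math.sin(-math.pi / 2 + (k + 0.5) * h)) ** (-s) for k in range(n)
    )


x = -2 / 25
F = hyp2f1(1 / 6, 2 / 3, 1, x)
G = hyp2f1(1 / 3, 5 / 6, 1, x)

relation_gap = abs(G - 5 ** (1 / 3) / math.sqrt(3.0) * F)
integral_gap_13 = abs(arc_integral(1 / 3) - math.pi * F)
integral_gap_23 = abs(arc_integral(2 / 3) - math.pi * G)
print(F, G, relation_gap, integral_gap_13, integral_gap_23)
```

The midpoint rule converges very quickly here because the integrand depends on $u$ only through $\sin u$, so its reflection about $u = \pm\pi/2$ is a smooth periodic function.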

Our choice of $a_1$ and $b_1$ implies that the periods around these two cycles are related by conjugation, $A_{i1} = \overline{B_{i1}}$. This follows from the fact that the paths $\pi\circ a_1$ and $\pi\circ b_1$ are complex conjugate, while $\overline{\eta_2(\zeta)} = \eta_3(\bar\zeta)$ from the definition in (39). There remain eight integrals to be calculated. By resorting to numerical integration, we have established that they are related to the periods around $a_2$ by simple numerical


factors. The conclusion is that the two blocks A and B of the period matrix can be written as   2π i ωF ¯ 2π iωF 2π F 2πF − √ − √ − √ √ √ √ √ √ √ √ 3 3 3 3 6 6 6 6  3 3 5 2α 1/3 3 5 2α 1/3 3 5 2α 1/3 3 3 5 2α 1/3      √   4 2π ωF ¯   0 0 0 √   3   9 10α 2/3  A=   2π F 2πF 2π iωF 2π i ωF ¯     − √ √ √ √ √ √ 3 3 2/3 3 3 3 10α 2/3 2/3   9 3 10α 2/3 3 3 10α 9 10α     √   4 2πωF 0 0 0 √ 9 3 10α 2/3 and  4π F 4π F 2π F 2πF − √ √ − √ √ √ √ √ √ √ √ √ √ 3 3 3 3 6 6 6 6  3 3 5 2α 1/3 3 3 5 2α 1/3 3 3 5 2α 1/3 3 3 5 2α 1/3      √ √ √  4 2π iF 4 2π iF 4 2π ωF    0 − √ √ − √ √ √    9 3 3 10α 2/3 9 3 3 10α 2/3 9 3 10α 2/3  .  B=  4π iF 2πF 4π iF 2π F     − √ √ √ √ √ √ 3 3 3 2/3 2/3 2/3   9 3 10α 2/3 9 3 10α 9 3 10α 9 10α     √ √ √   4 2π ωF 4 2π iF ¯ 4 2π iF − √ √ 0 √ √ √ 3 3 3 9 10α 2/3 9 3 10α 2/3 9 3 10α 2/3 

We have now all that is needed to determine $\alpha$ from the conditions (25). For a given $\alpha$, this is a system of eight real linear equations in eight (integer) unknowns. It has a solution given by

\[ \mathbf{n} = (0,0,0,0)^t,\qquad \mathbf{m} = m\,(0,1,1,0)^t, \]

where $m$ satisfies

\[ m = \frac{3\sqrt3\,\sqrt[3]{5}\,\sqrt[6]{2}\,\alpha^{1/3}}{4\pi F}. \]

Now $m$ must be either 1 or −1 for $(\mathbf{m},\mathbf{n})$ to be primitive in $\mathbb{Z}^{\oplus 8}$. If we take $m = 1$,

\[ \alpha = \frac{32\sqrt2\,\pi^3 F^3}{405\sqrt3}, \tag{42} \]

while $m = -1$ reverses the sign of $\alpha$. The two solutions can be seen to be a rotation of each other by $\zeta \mapsto -\tfrac{1}{\zeta}$; to fix ideas, we take $\alpha$ positive from now on. It can be checked


numerically that (42) agrees with the solution obtained in [9]. For our orientation of the spectral curve (38), the latter is given by

\[ \alpha = \frac{\sqrt2}{3\sqrt3}\cdot\frac{\Gamma(\tfrac16)^3\,\Gamma(\tfrac13)^3}{48\sqrt3\,\pi^{3/2}} = \frac{\Gamma(\tfrac13)^9}{48\sqrt6\,\pi^3}. \tag{43} \]

In Sect. 5.2, we use a change of variables projecting $S$ onto an elliptic curve to relate analytically the two results. The special 1-cycle $c$ for the tetrahedral 3-monopole with $\alpha$ positive is

\[ c = b_2 + b_3. \tag{44} \]
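The numerical agreement between (42) and (43) is easy to reproduce. The sketch below evaluates $F$ from its power series and compares the three expressions for $\alpha$; it also confirms that inserting (42) into the formula for $m$ returns $m = 1$. The helper `hyp2f1` and the variable names are our own, and only the Python standard library is assumed.

```python
import math


def hyp2f1(a, b, c, x, terms=60):
    """Gauss hypergeometric series, adequate for small |x|."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
    return total


F = hyp2f1(1 / 6, 2 / 3, 1, -2 / 25)
g13, g16 = math.gamma(1 / 3), math.gamma(1 / 6)

# Eq. (42): alpha in terms of the hypergeometric value F.
alpha_42 = 32 * math.sqrt(2) * math.pi**3 * F**3 / (405 * math.sqrt(3))

# Eq. (43): the two Gamma-function forms.
alpha_43_first = (
    math.sqrt(2) / (3 * math.sqrt(3))
    * g16**3 * g13**3 / (48 * math.sqrt(3) * math.pi**1.5)
)
alpha_43 = g13**9 / (48 * math.sqrt(6) * math.pi**3)

# The integer m obtained from the formula preceding (42).
m = (3 * math.sqrt(3) * 5 ** (1 / 3) * 2 ** (1 / 6) * alpha_42 ** (1 / 3)
     / (4 * math.pi * F))

print(alpha_42, alpha_43_first, alpha_43, m)
```

All three values of $\alpha$ come out close to 1.9494.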

It is sketched on the Riemann sphere in Fig. 6, after simplification using relations in homology – for example, the sum of the three equators with the same orientation is homologous to zero, which is clear from Fig. 4. Clearly, it is not a combination of a lift of equators, since the projections of b2 and b3 enclose different branch points on the Riemann sphere. In the next section, it will be proved that c is invariant under the tetrahedral group.

Fig. 6. The special 1-cycle c for the tetrahedral 3-monopole

5.2. Action of the tetrahedral group.

The spectral curve $S$ defined by (38) admits an action of the tetrahedral group $A_4 \subset SO(3)$ determined by $PSU(2)$ transformations on $\zeta$; the corresponding rotations are the symmetries of the tetrahedron drawn in Fig. 2. This induces an action on $H_1(S,\mathbb{Z})$, which we now describe. Recall that $A_4$ is generated by the 3-cycle (123) and the double transposition (12)(34). We represent these as the rotation by $2\pi/3$ about the direction defined by the top vertex 0 of the tetrahedron in Fig. 2,

\[ R : \zeta \mapsto e^{2\pi i/3}\,\zeta, \tag{45} \]

and the rotation by $\pi$ around the axis connecting the edge midpoints $w_1$ and $z_1$,

\[ T : \zeta \mapsto \frac{\sqrt2-\zeta}{1+\sqrt2\,\zeta}, \tag{46} \]

respectively. Later, we will also be interested in another element of order two,

\[ V = R\,T\,R^2, \tag{47} \]

which corresponds to a rotation by $\pi$ about the axis connecting $w_2$ and $z_2$. We also denote by $R$, $T$ and $V$ the maps induced on $S$ by (45), (46) and (47). On the complex plane, $R$ is of course just the rotation by $2\pi/3$ about the origin, while $T$ and $V$ are elliptic Möbius transformations of order two with $w_1, z_1$ and $w_2, z_2$ as fixed points, respectively. A way to visualize the action of $T$ or $V$ is to draw the (invariant) circles of Apollonius corresponding to the two fixed points; the other four branch points of $\pi|_S$ all lie on one of these circles and it is easy to verify that they are permuted as expected under the two transformations.

To describe the action of $A_4$ on $H_1(S,\mathbb{Z})$, we start by computing the matrices representing the generators $R$ and $T$. The effect of $R$ is easy to understand, since it leaves the three annuli over $C = \{\zeta : |w_1| < |\zeta| < |z_1|\}$ invariant. $T$ is harder to describe since it does not preserve the annuli, on which we can easily keep track of the sheet labels by using the expression (39) for $\eta_j(\zeta)$. But we can still use (39) when $\zeta$ is in the smaller region $C_+ \cup C_-$, where $C_\pm := \{\zeta \in C \cap T(C) : \pm\operatorname{Im}\zeta > 0\}$ are mapped onto each other by $T$. Denoting by $C_{\pm,j}$ the intersection of $(\pi|_S)^{-1}(C_\pm)$ with sheet $j$, it can be concluded that $T$ sends $C_{\pm,j}$ to $C_{\mp,j\mp1}$, where the labels are taken mod 3. The sheet that contains the image under $R$ or $T$ of any point on $S$ can now be easily identified from these data and analytic continuation. In particular, we conclude that the 1-cycles in our basis for $H_1(S,\mathbb{Z})$ are mapped as shown in Fig. 7.

We can now use the perfect intersection pairing (14) to compute the matrices of the maps $R_*$ and $T_*$ induced on homology from the intersection numbers of the 1-cycles $a_i$ and $b_i$ with their images. Let $c_i := a_i$ and $c_{4+i} := b_i$ for $i = 1, \ldots, 4$. Defining

\[ M_{ij} := \#(R_* c_i, c_j),\qquad N_{ij} := \#(T_* c_i, c_j), \]

we obtain the entries of the matrices $\mathbf{R}$ and $\mathbf{T}$ representing $R_*$ and $T_*$ as

\[ R_{ij} = \sum_{k=1}^{8} J_{ik} M_{jk},\qquad T_{ij} = \sum_{k=1}^{8} J_{ik} N_{jk}, \]

where, as in (28),

\[ J_{ij} := \#(c_i, c_j) = \begin{pmatrix} 0 & -1_4 \\ 1_4 & 0 \end{pmatrix}_{ij}. \]

Fig. 7. The action of R and T on the basis of $H_1(S,\mathbb{Z})$
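The elementary statements about the Möbius maps (45)–(47) can be confirmed numerically. The sketch below is an illustration under our own naming: it checks that $T$ and $V = RTR^2$ are involutions, that $T$ commutes with the antipodal map (as a rotation of the sphere must), and it records how $T$ and $V$ permute the six branch points by matching each image to the nearest branch point.

```python
import cmath
import math

SQRT2, SQRT3 = math.sqrt(2.0), math.sqrt(3.0)
OMEGA = cmath.exp(2j * math.pi / 3)

w1 = complex((SQRT3 - 1) / SQRT2)
z1 = complex(-(SQRT3 + 1) / SQRT2)
branch = {
    "w1": w1, "w2": OMEGA * w1, "w3": OMEGA.conjugate() * w1,
    "z1": z1, "z2": OMEGA * z1, "z3": OMEGA.conjugate() * z1,
}


def R(zeta):
    return OMEGA * zeta                             # Eq. (45)


def T(zeta):
    return (SQRT2 - zeta) / (1 + SQRT2 * zeta)      # Eq. (46)


def V(zeta):
    return R(T(R(R(zeta))))                         # Eq. (47): V = R T R^2


def nearest_label(point):
    """Name of the branch point closest to `point`."""
    return min(branch, key=lambda name: abs(branch[name] - point))


T_images = {name: nearest_label(T(pt)) for name, pt in branch.items()}
V_images = {name: nearest_label(V(pt)) for name, pt in branch.items()}

sample = 0.3 + 0.7j  # arbitrary test point
T_is_involution = abs(T(T(sample)) - sample) < 1e-12
V_is_involution = abs(V(V(sample)) - sample) < 1e-12
T_commutes_with_antipode = (
    abs(T(-1 / sample.conjugate()) - (-1 / T(sample).conjugate())) < 1e-12
)
print(T_images, V_images)
```

The output shows $T$ fixing $w_1, z_1$ and $V$ fixing $w_2, z_2$, while the remaining four branch points are exchanged with their antipodes in each case.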


The intersection numbers Mij and Nij  0 0 0 0 1 0  0 0 1  0 0 0 R=  1 0 0 0 0 0  0 0 0 0 0 0 and

      T=    

can be just read off from Fig. 7, and we get  0 0 0 −1 0 0 −1  0 −1 0  1 0 0 1  0  0 0 0 1  0  0 −1 −1 1 0  0 1 0 0  0  0 0 1 0  0 1 −1 −1 −1 0

0 0 0 0 −1 0 −1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 −1 0 −1 0 0 0 0 0 1 0 0 −1 0 0 0 0 1 0 −1 −1 0 −1 0 0 1 0 1 −1 0 1 0 0 0 0 −1 0 0 −1 0

      .    

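So the characters of the $A_4$ representation on 1-cycles are

\[ \operatorname{tr} 1_8 = 8,\qquad \operatorname{tr}\mathbf{R} = 2 = \operatorname{tr}\mathbf{R}^2,\qquad \operatorname{tr}\mathbf{T} = 0, \]

and this shows that $H_1(S,\mathbb{C})$ splits as $\mathbf{1}^{\oplus 2} \oplus \mathbf{3}^{\oplus 2}$. Another way to see this is to consider the action of $A_4$ by pull-back on the holomorphic 1-forms $\Omega^{(\ell)}$ by $R$ and $T$ and calculate the characters to conclude that $H^0(S,\Omega^1_S)$ splits as $\mathbf{1} \oplus \mathbf{3}$ under $A_4$ (with $\Omega^{(1)}$ spanning the trivial singlet and being orthogonal to the triplet), and use Poincaré duality. Using the matrices for $R_*$ and $T_*$, we can compute the projection onto the subspace $\mathbf{1}^{\oplus 2} \subset H_1(S,\mathbb{C})$ as

\[ \frac{1}{|A_4|}\sum_{\sigma\in A_4}\sigma = \frac14 \begin{pmatrix} 0 & -1 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 2 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & -3 & 1 & -1 & 2 & 2 & -1 \\ -1 & 3 & 0 & 1 & -1 & 2 & 2 & -1 \\ 0 & -1 & -1 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \]

The range of this matrix is spanned by

\[ (0,0,0,0,0,1,1,0)^t \quad\text{and}\quad (2,-4,-4,2,-2,3,-3,2)^t, \tag{48} \]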

so we conclude that the special cycle c given in (44) is invariant under the action of A4 . Notice that in (48) the first vector is antisymmetric whereas the second is symmetric under reality. We can explore the action of the Vierergruppe D2 ⊂ A4 generated by the two elements T and V to express the value α given by (42) in terms of elliptic integrals, as in [9]. The


actions of both $T$ and $V$ are much easier to describe in an alternative orientation of the monopole, obtained by rotation of (38) under

\[ \zeta \mapsto \frac{(z_3-z_1)(\zeta-w_1)}{(z_3-w_1)(\zeta-z_1)}. \]

Then the spectral curve is taken to the form

\[ \eta^3 + \frac{3\sqrt3}{\sqrt2}\,\alpha\, i\,\zeta(\zeta^4-1) = 0, \tag{49} \]

which can be described as a covering of $\mathbb{CP}^1$ with branch points at $0$, $\pm1$, $\pm i$ and $\infty$. In this configuration, $T$ is just $\zeta \mapsto -\zeta$, while $V$ is $\zeta \mapsto -\tfrac{1}{\zeta}$. The map

\[ p : \zeta \mapsto \frac12\Bigl(\zeta^2 + \frac{1}{\zeta^2}\Bigr) =: z \]

identifies points in the same orbit of $D_2$, having the first quadrant as fundamental region. Under the map induced on $T'\mathbb{CP}^1$ by $p$, the spectral curve (49) goes to

\[ w^3 + 24\sqrt6\,\alpha\, i\,(z^2-1)^2 = 0, \]

which is a torus by the Riemann–Hurwitz formula and corresponds to the quotient $S/D_2$. The two pairs of branch cuts on the original Riemann sphere are both identified with a cut connecting the new branch points $1$, $\infty$ and $-1$ along the real axis. With some care, it can be seen that the image of the 1-cycle $c$ in (44) can be identified with a cycle going four times along the imaginary axis in the negative direction, on the sheet containing the point $(w,z) = (2\sqrt[6]{2}\,\sqrt3\,\alpha^{1/3}\, i,\ 0)$. On the other hand, it is easy to see that the 1-form $\Omega^{(1)}$ is given by the same expression in the new orientation, and

\[ p^*\Bigl(\frac{dz}{3w}\Bigr) = \frac{d\zeta}{3\eta} = \Omega^{(1)}. \]

Thus we can write

\[ \oint_c \Omega^{(1)} = \oint_{p_* c} \frac{dz}{3w} = 4\int_{i\infty}^{-i\infty} \frac{dz}{6\,\sqrt[6]{2}\,\sqrt3\,\omega\,\alpha^{1/3}\,\sqrt[3]{-i\,(z^2-1)^2}} \]

and this can be reduced to an elliptic integral, yielding

\[ -\frac{\Gamma(\tfrac13)^3}{\sqrt6\,\pi\,\alpha^{1/3}}. \]

Now this has to be equal to $-2$ by (24). Thus we get

\[ \alpha = \frac{\Gamma(\tfrac13)^9}{48\sqrt6\,\pi^3}, \]

in agreement with (43).
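The steps of this reduction can be spot-checked numerically. The sketch below (our own illustration, with hypothetical helper names) verifies that $p$ is invariant under $\zeta \mapsto -\zeta$ and $\zeta \mapsto -1/\zeta$, that a point of (49) with $w = \eta\,dz/d\zeta$ lands on the quotient curve, and that the period evaluates to $-2$ at the value of $\alpha$ in (43). For the last step we assume, as our own elementary rewriting rather than the elliptic-integral reduction of the text, that on the imaginary axis $z = iy$ the chosen sheet has $w = 2\sqrt[6]{2}\sqrt3\,\alpha^{1/3}\, i\,(1+y^2)^{2/3}$, so that the integral becomes a Beta integral $\int_{-\infty}^{\infty}(1+y^2)^{-2/3}dy = \sqrt\pi\,\Gamma(\tfrac16)/\Gamma(\tfrac23)$.

```python
import math

SQRT2, SQRT3, SQRT6 = math.sqrt(2.0), math.sqrt(3.0), math.sqrt(6.0)


def p(zeta):
    """Quotient map by the Vierergruppe D2 in the orientation of (49)."""
    return 0.5 * (zeta**2 + zeta**-2)


def quotient_residual(zeta, alpha):
    """Residual of w^3 + 24 sqrt6 alpha i (z^2 - 1)^2 at the image of a point of (49)."""
    coeff = 3 * SQRT3 / SQRT2 * alpha * 1j
    eta = (-coeff * zeta * (zeta**4 - 1)) ** (1 / 3)   # any cube root solves (49)
    w = eta * (zeta - zeta**-3)                        # w = eta * dz/dzeta
    z = p(zeta)
    return abs(w**3 + 24 * SQRT6 * alpha * 1j * (z**2 - 1) ** 2)


sample = 0.7 + 0.4j  # arbitrary test point
invariant_under_T = abs(p(-sample) - p(sample)) < 1e-12
invariant_under_V = abs(p(-1 / sample) - p(sample)) < 1e-12
residual = quotient_residual(sample, alpha=1.0)

# Period of Omega^(1) along c at the value of alpha from (43).
g13, g16, g23 = math.gamma(1 / 3), math.gamma(1 / 6), math.gamma(2 / 3)
alpha_43 = g13**9 / (48 * SQRT6 * math.pi**3)
c0 = 2 * 2 ** (1 / 6) * SQRT3 * alpha_43 ** (1 / 3)
beta_integral = math.sqrt(math.pi) * g16 / g23   # integral of (1 + y^2)^(-2/3) over the real line
period = -(4 / (3 * c0)) * beta_integral
closed_form = -g13**3 / (SQRT6 * math.pi * alpha_43 ** (1 / 3))
print(residual, period, closed_form)
```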


6. Discussion

The version of the Ercolani–Sinha constraints that we derived in Sect. 3 generalises the Corrigan–Goddard conditions to all monopoles with a nonsingular spectral curve. An interesting aspect is the existence of a distinguished 1-cycle $c$ on the spectral curve. The premises in the Corrigan–Goddard approach lead to the constraint (37) for $c$, but their conditions are otherwise equivalent to Eq. (22). In Sect. 5, we have applied (22) to rederive the scale parameter $\alpha$ in the spectral curve of the tetrahedrally symmetric monopole of charge 3. We also verified that this monopole provides an example where our condition (22) can be satisfied but those of Corrigan and Goddard are not.

Let us make some remarks about the nature of the special 1-cycle $c$. Given a nonsingular spectral curve $S$ in $T$, $c$ is uniquely determined as the solution to Eq. (22); we have established that it is always antisymmetric under the real structure. Moreover, although the left-hand side of (22) depends on the spatial orientation of the monopole, $c$ remains constant along the $SO(3)$ orbit of $S$ in the moduli space $N_k$. In fact, its components in a given homology basis are integer solutions to a linear equation and cannot change when the spectral curve is rotated, since the period matrix occurring in (15) never becomes singular. This argument applies to more general deformations in $N_k$ that do not pass through monopoles with a singular spectral curve. It also implies that $c$ has to be invariant under any rotational symmetry of the spectral curve, and this imposes further restrictions: for example, in the case of the tetrahedrally symmetric 3-monopole that we studied in Sect. 5, this consideration together with the $\tau$-antisymmetry completely determines $c$ up to sign.
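For the tetrahedral example, the symmetry restriction on $c$ can be made concrete with the data of Sect. 5.2. The sketch below takes the traces quoted there ($\operatorname{tr}1_8 = 8$, $\operatorname{tr}\mathbf{R} = \operatorname{tr}\mathbf{R}^2 = 2$, $\operatorname{tr}\mathbf{T} = 0$) and the standard character table of $A_4$ to recover the multiplicities in $\mathbf{1}^{\oplus 2}\oplus\mathbf{3}^{\oplus 2}$, and then checks that the projection matrix, as we read it from the text, is idempotent of rank two and fixes $c = b_2 + b_3$. The matrix entries are transcribed by hand and the variable names are our own, so this is a consistency check rather than an independent derivation.

```python
import cmath
import math
from fractions import Fraction

# Conjugacy classes of A4: identity, double transpositions (T), 3-cycles (R), inverse 3-cycles (R^2).
class_sizes = [1, 3, 4, 4]
chi_H1 = [8, 0, 2, 2]  # traces quoted in Sect. 5.2
om = cmath.exp(2j * math.pi / 3)
irreps = {
    "1": [1, 1, 1, 1],
    "1'": [1, 1, om, om.conjugate()],
    "1''": [1, 1, om.conjugate(), om],
    "3": [3, -1, 0, 0],
}
multiplicity = {
    name: round(
        (sum(n * x * complex(c).conjugate()
             for n, x, c in zip(class_sizes, chi_H1, chi)) / 12).real
    )
    for name, chi in irreps.items()
}

# Four times the projector onto the invariant subspace, as transcribed from Sect. 5.2.
M = [
    [0, -1, -1, 0, 0, 0, 0, 0],
    [0, 2, 2, 0, 0, 0, 0, 0],
    [0, 2, 2, 0, 0, 0, 0, 0],
    [0, -1, -1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [-1, 0, -3, 1, -1, 2, 2, -1],
    [-1, 3, 0, 1, -1, 2, 2, -1],
    [0, -1, -1, 0, 0, 0, 0, 0],
]
P = [[Fraction(x, 4) for x in row] for row in M]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(8)) for j in range(8)] for i in range(8)]


def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(8)) for i in range(8)]


c = [0, 0, 0, 0, 0, 1, 1, 0]       # c = b2 + b3 in the basis (a1..a4, b1..b4)
v2 = [2, -4, -4, 2, -2, 3, -3, 2]  # second vector in (48)

is_idempotent = matmul(P, P) == P
trace = sum(P[i][i] for i in range(8))
fixes_c = apply(P, c) == c
fixes_v2 = apply(P, v2) == v2
print(multiplicity, is_idempotent, trace, fixes_c, fixes_v2)
```

The trace of the projector equals the multiplicity of the trivial representation, which gives a second check on the decomposition.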
As implied in [4], the components of the 1-cycle $c$ are the characteristics of the line bundle $L^2|_S$ and can thus be interpreted as giving the direction of the linear flow determined by Nahm’s equations on the Jacobian of the spectral curve $S$. Another interpretation for $c$ is afforded by Eq. (16). Recall that the triviality of $L^2|_S$ provides for two nowhere vanishing functions $f_0$ and $f_1$ on the open sets $U_0 \cap S$ and $U_1 \cap S$. We may wonder whether we can define logarithms of these functions. And of course the answer is no: the nonzero components of $c$ correspond to nontrivial periods of both $d\log f_0$ and $d\log f_1$, and so they cannot be exact 1-forms. To define the logarithms, one should eliminate the 1-cycles corresponding to the nonzero periods, by cutting $S$ along their conjugate homology 1-cycles in the canonical basis (14). But we can see from (17) that this is equivalent to cutting $S$ along $c$. The Riemann surface of $\log f_0$ or $\log f_1$ is then obtained from the cut surfaces $U_0 \cap S$ or $U_1 \cap S$ by analytic continuation across the cuts, and this yields an infinite cover of the original open sets. So we may regard $c$ as a topological obstruction to defining the logarithms of the nowhere vanishing functions $f_0$ and $f_1$ on the spectral curve punctured at the points lying over $\zeta = \infty$ and $\zeta = 0$, respectively.

We should emphasise that the Ercolani–Sinha algorithm is still not sufficient to ensure smoothness of the fields if $k > 2$, since it does not include the condition (7). The family of nonsingular spectral curves of monopoles has codimension zero in the family of real curves in $|O(2k)|$ satisfying Eq. (22), but the inclusion is proper in general. For example, it can be shown that the icosahedrally symmetric curve

\[ \eta^6 + \alpha\,\zeta(\zeta^{10} + 11\zeta^5 - 1) = 0 \tag{50} \]

satisfies (22) for some constant $\alpha$, but not (7); this follows from the conclusion in [9] that there is no 6-monopole with icosahedral symmetry. It is known [11] that an icosahedrally symmetric monopole of charge 7 exists, and its spectral curve is reducible to a projective line and a smooth genus 25 curve of the form (50), with $\alpha = \dfrac{3^3\,\Gamma(\tfrac13)^{18}}{2^8\pi^6}$.

An interesting question is to understand how (22) degenerates when a spectral curve becomes singular. Some singularities arise by imposing interesting symmetries on the monopoles, as in the case of the axially symmetric monopoles that we have mentioned already. We may expect that the condition still holds for other singular spectral curves, but it is not clear how the 1-cycle $c$ is to be determined in general.

Acknowledgements. We thank Roger Bielawski for advice. CJH thanks Fitzwilliam College, Cambridge, for a research fellowship. NMR is supported by Fundação para a Ciência e a Tecnologia, Portugal, through the research grant BD/15939/98.

References

1. Abramowitz, M. and Stegun, I.A.: Handbook of Mathematical Functions. National Bureau of Standards, 1965
2. Atiyah, M.F. and Hitchin, N.J.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, NJ: Princeton University Press, 1988
3. Corrigan, E. and Goddard, P.: An n Monopole Solution with 4n − 1 Degrees of Freedom. Commun. Math. Phys. 80, 575–587 (1981)
4. Ercolani, N. and Sinha, A.: Monopoles and Baker Functions. Commun. Math. Phys. 125, 385–416 (1989)
5. Forgács, P., Horváth, Z. and Palla, L.: Finitely Separated Multimonopoles Generated as Solitons. Phys. Lett. B 109, 200–204 (1982)
6. Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. New York: Wiley, 1978
7. Hitchin, N.J.: Monopoles and Geodesics. Commun. Math. Phys. 83, 579–602 (1982)
8. Hitchin, N.J.: On the Construction of Monopoles. Commun. Math. Phys. 89, 145–190 (1983)
9. Hitchin, N.J., Manton, N.S. and Murray, M.K.: Symmetric Monopoles. Nonlinearity 8, 661–692 (1995); dg-ga/9503016
10. Houghton, C.J. and Sutcliffe, P.M.: Tetrahedral and Cubic Monopoles. Commun. Math. Phys. 180, 343–361 (1996); hep-th/9601146
11. Houghton, C.J. and Sutcliffe, P.M.: Octahedral and Dodecahedral Monopoles. Nonlinearity 9, 385–401 (1996); hep-th/9601147
12. Hurtubise, J.: SU(2) Monopoles of Charge 2. Commun. Math. Phys. 92, 195–202 (1983)
13. O’Raifeartaigh, L., Rouhani, S. and Singh, L.P.: Explicit Solution of the Corrigan–Goddard Conditions for n Monopoles for Small Values of the Parameters. Phys. Lett. B 112, 369–372 (1982)
14. Sutcliffe, P.M.: BPS Monopoles. Int. J. Mod. Phys. A 12, 4663–4705 (1997); hep-th/9707009
15. Ward, R.S.: A Yang–Mills–Higgs Monopole of Charge 2. Commun. Math. Phys. 79, 317–325 (1981)

Communicated by A. Kupiainen

Commun. Math. Phys. 212, 245 – 256 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

On the Statistical Mechanics of 2D Euler Equation

Raoul Robert

Institut Fourier CNRS, BP 74, 38402 Saint Martin d’Hères Cedex, France. E-mail: [email protected]

Received: 25 November 1997 / Accepted: 27 January 2000

Abstract: We address the issue of a rigorous justification of the statistical mechanics of 2D Euler equation. We construct a converging sequence of approximations of this equation for which a Liouville theorem holds and such that the sequence of Liouville measures has a large deviation property. This provides an important step in the justification of the use of the entropy functional previously introduced in [8, 11, 13].

1. Introduction

The most striking feature of 2D hydrodynamical turbulence is the emergence of a large-scale organization of the flow, leading to structures usually called coherent structures (see references in [2, 13]). Jupiter’s Great Red Spot, a huge vortex persisting for more than three centuries in the turbulent shear between two zonal jets, is probably related to this general property [7, 14]. Such hydrodynamical vortices, whose dynamics is governed by Euler equation or some quasi-geostrophic variant, occur in a wide variety of geophysical phenomena. The common remarkable feature of these structures is that they occur and persist in a strongly turbulent environment, and their robustness demands a general understanding. Onsager [10] was the first to suggest that an explanation might be found in terms of statistical mechanics of Euler equation. Our previous work [8] was an attempt to provide a rigorous basis to the statistical theory of such systems. In that paper we focused on the Sanov-type large deviation estimates for empirical Young measures which were necessary to justify the thermodynamic limit. But the only rigorous link between the statistics and the dynamics of the system that was appealed to was the invariance by the flow of a concentration property associated to our entropy functional (this property was called an ersatz of Liouville theorem). At the time, it was argued by Eyink and Spohn [3] that this invariance argument was too weak and did not provide a good justification of the statistical theory.
Their criticism was justified since we can have many different notions of concentration conserved by the


flow and corresponding to different entropy functionals. Then we tried to give a stronger argument: we tried to construct an approximation of the flow on the finite dimensional space of piecewise constant vorticity functions, the approximate dynamical system preserving the natural product measure for which we have derived large deviation estimates. Despite some efforts, related in [8], we did not succeed, and the problem remained open. Meanwhile experimental and numerical works made some progress, accumulating evidence that the (vorticity – stream function) relationship derived from our entropy functional was fairly well satisfied inside the coherent structures in a variety of cases [12, 15, 16]. So the theoretical justification of the special form of the entropy became a crucial question. Our aim in this paper is to address as far as possible this remaining issue.

Let us formulate precisely the problem at hand. To provide an appropriate justification I think we have to construct a sequence of finite dimensional approximations of the Euler flow (of course with good convergence properties such as strong $L^2$ convergence uniformly on any finite time interval), satisfying the two following properties:

(i) A Liouville theorem holds for the finite dimensional approximations.

(ii) For the family of measures given by (i), we can prove the Sanov-type large deviation estimates for empirical Young measures which are necessary to take the thermodynamic limit (as in [8]).

Of course, it is easy to satisfy the point (i) by considering the spectral approximation; but then, it is a very difficult issue to prove that the associated family of measures satisfies (ii). Our approach here is to get a Liouville theorem for a general class of approximations, including approximations on spaces of functions which are spatially localized like finite element approximants.
For such approximations we are not able to prove directly the large deviation estimates (ii), but we can use them as an intermediate to construct the final approximation on the space of piecewise constant functions for which the large deviation estimates hold (see [8]); so that the use of the finite-element approximants appears here as an essential intermediate step in order to both ensure the convergence of the approximations and keep the large deviation estimates.

One may worry about the fact that our approximate dynamical system retains only the enstrophy among the infinite family of the Casimir functionals which are conserved by the continuous system (in contrast with the finite mode Hamiltonian approximation of [20, 21]). Of course it would be more satisfactory to construct approximations having in addition a large number of constants of the motion. But we think that this is not truly necessary to our microcanonical approach. Indeed if we are interested in the long-time behavior of 2D Euler flow, and if we believe that a statistical mechanics approach can bring some light to this issue, then we expect that we will finally have to solve some constrained variational problem: find the maximum value of some entropy functional under a set of constraints. But while we have no doubts about the set of constraints which is directly derived from the constants of the motion of the system (energy, integrals of functions of the vorticity field...), it is hard to guess what the relevant entropy functional is. So a key issue is to find a pertinent justification for our special form of the entropy functional. But in our microcanonical approach the entropy is not related to the fact that many constants of the motion are (or are not) exactly conserved by the approximate flow; it is only associated to large deviation estimates for the invariant measures.
Up to now we only considered the issue of the justification of the entropy functional via invariant measures of approximate systems and large deviation estimates. Of course this necessary step is not sufficient to give a conclusive justification of the equilibrium


statistical mechanics. Such a task would involve intricate dynamical considerations (involving an ergodicity assumption and a precise estimate of the mixing time for the approximations). It seems that such an analysis is out of reach at the present time. Nevertheless we give in Sect. 5 some elements of discussion which may help us to delimit the field of validity of the theory.

2. 2D Euler Equation

Euler equation. The motion of a two-dimensional incompressible inviscid fluid in a bounded domain $\Omega$ is governed by Euler equation, which we write in the classical velocity-vorticity formulation:

\[ (E)\qquad \begin{cases} \omega_t + \operatorname{div}(\omega u) = 0, \\ \operatorname{curl} u = \omega,\quad \operatorname{div} u = 0,\quad u\cdot n = 0 \ \text{on } \partial\Omega, \end{cases} \]

where $u(t,x)$ is the velocity field of the fluid, $\omega = \operatorname{curl} u$ the scalar vorticity, $n$ the outward unit normal vector to $\partial\Omega$. Because of incompressibility we introduce the stream function $\psi(t,x)$: $\omega = -\Delta\psi$, $\psi = 0$ on $\partial\Omega$. The constants of the motion of this dynamical system are:

– the energy

\[ \Xi(\omega) = \frac12\int_\Omega u^2\,dx = \frac12\int_\Omega \psi\,\omega\,dx; \]

– the integrals

\[ F_\theta(\omega) = \int_\Omega \theta(\omega(x))\,dx, \]

for any continuous function $\theta$. These constants of the motion, which are associated to the degeneracy of the (infinite dimensional) Hamiltonian system, are usually called Casimir functionals.

– If $\Omega$ is the ball $B(0,R)$, we must consider also the angular momentum with respect to 0:

\[ M(\omega) = \int_\Omega x\wedge u(x)\,dx = \frac12\int_\Omega \bigl(R^2 - x^2\bigr)\,\omega(x)\,dx\; e_3. \]

The Cauchy problem. Youdovitch’s theorem [18] gives a satisfactory existence-uniqueness result for the Cauchy problem for (E): For any given initial datum $\omega_0(x)$ in the space $L^\infty(\Omega)$, there is a unique weak solution of (E); this solution $\omega(t,x)$ is in $L^\infty(\Omega)$ for all $t$, and furthermore belongs to the space $C\bigl([0,\infty[;\,L^p(\Omega)\bigr)$ for all $p$, $1 \le p < \infty$. We will define the flow $\Gamma_t$ of the Euler equation on the phase space $L^\infty(\Omega)$ by $\omega(t,\cdot) = \Gamma_t\,\omega_0$. Furthermore this weak solution satisfies the following useful stability property: If $\omega_0^\varepsilon$ is a bounded sequence in the space $L^\infty(\Omega)$, which converges in the strong $L^2$ topology towards $\omega_0$, then $\Gamma_t\,\omega_0^\varepsilon$ converges $L^2$-strongly towards $\Gamma_t\,\omega_0$, uniformly on any bounded time interval.

3. The Entropy Associated to the Turbulent Mixing Process

The mechanism of turbulent mixing responsible for the self-organization of the flow in Euler equation is studied at a physical level in [2]. Our concern here is to justify, as rigorously as possible, the introduction of the entropy functional which we use to give a precise content to the vague notion of turbulent disorder of the flow. As previously discussed [8], this issue is based on the existence of finite dimensional approximations which admit invariant Liouville measures. This is the very root of any thermodynamical approach. It is well known that, at a formal level, Euler equation is an infinite dimensional Hamiltonian system; but, in contrast with the finite dimensional case, this does not imply the existence of an invariant Liouville measure on the natural phase space $L^\infty$. Although we can find finite dimensional approximations of Euler equation which preserve the Hamiltonian structure [20, 21], this structure is broken by any kind of approximation of practical use. But for the needs of thermodynamics the Hamiltonian structure is not truly necessary; it is the Liouville theorem and the constants of the motion which are the key ingredients. In the case of Euler equation, it is well known that a Liouville theorem holds for the usual spectral approximation. We shall show that this is a particular case of a general property: there is a natural way to approximate Euler equation on any finite dimensional space in such a way that the volume measure is conserved. It does not seem that this simple fact was previously noticed. Then the problem of defining an equilibrium statistical mechanics for (E) amounts to the study of families of measures.
For an arbitrary choice of the approximating spaces the study of the asymptotic behavior of these measures seems intractable, but fortunately we can choose spaces for which the thermodynamic limit of these measures can be carried out [8].

3.1. Finite dimensional approximations. A classical way to construct finite dimensional approximations of Euler equation is as follows. Let $F_N$ be an $N$-dimensional subspace of $L^\infty$ and denote by $P_N$ the orthogonal projector from $L^2(\Omega)$ onto $F_N$. Then we define the approximate solution $\omega^N(t)$ as the solution of the ordinary differential equation in $F_N$:

\[ (E_N) \qquad \begin{cases} \omega^N_t + P_N\left(u^N \cdot \nabla\omega^N\right) = 0, \\ \omega^N(0) = P_N\omega_0, \end{cases} \]

where $u^N = \operatorname{curl}\psi^N$, and $-\Delta\psi^N = \omega^N$, $\psi^N = 0$ on $\partial\Omega$. If $F_N$ is properly chosen and $\omega_0$ regular enough, then $\omega^N(t)$ converges towards $\omega(t)$ for the strong $L^2$ topology, uniformly on any bounded time interval [6]. The constants of the motion of the dynamical system $(E_N)$ are:

– the energy
\[ \frac{1}{2}\int_\Omega \psi^N\omega^N\,dx, \]

– the enstrophy
\[ \int_\Omega \left(\omega^N\right)^2 dx. \]

Let us notice here that $(E_N)$ is a differential system with quadratic non-linearity so that the solution always exists on a small time interval; but due to the conservation of the enstrophy the solution cannot blow up and it exists globally in time. Now, it is well known that if we take for $F_N$ a subspace generated by $N$ eigenvectors of the operator $-\Delta$ (with the Dirichlet boundary condition), the volume measure on $F_N$ is conserved by $(E_N)$. This is in fact a particular case of what follows. We consider the modification of $(E_N)$ which consists in replacing, in the definition of $\psi^N$, the Dirichlet problem by the variational formulation:

\[ \psi^N \in F_N \quad\text{and}\quad \int_\Omega \nabla\psi^N \cdot \nabla\varphi\,dx = \int_\Omega \omega^N\varphi\,dx, \quad \forall\varphi \in F_N. \]

For the sake of simplicity, from now on we shall also denote by $(E_N)$ this modified dynamical system. Of course, we shall suppose at least that $F_N$ is included in the Sobolev space $H^1_0(\Omega)$, so that for any given $\omega^N$, the above variational problem possesses a unique solution $\psi^N$ (by the Lax–Milgram theorem). One can easily check that the energy and the enstrophy are still conserved, but now we have in addition

Theorem 1. The volume measure on $F_N$ is conserved by the dynamical system $(E_N)$.

Proof. $F_N$ is endowed with the $L^2$ scalar product. Let us write $(E_N)$ in the form $\omega^N_t = G_N(\omega^N)$, where $G_N(\omega^N) = -P_N\left(u^N \cdot \nabla\omega^N\right)$ is a nonlinear transformation of $F_N$. Then to prove the theorem it suffices to show that the trace of the derivative $G_N'(\omega^N)$ vanishes. Let us compute the first variation of $G_N$ corresponding to a small variation $\delta\omega^N$:

\[ \delta G_N = G_N'(\omega^N)\left[\delta\omega^N\right] = -P_N\left(\delta u^N \cdot \nabla\omega^N\right) - P_N\left(u^N \cdot \nabla\delta\omega^N\right). \]

By definition, we have $\operatorname{tr} G_N'(\omega^N) = \sum_i \left(G_N'(\omega^N)[e_i], e_i\right)$, for any orthonormal basis $e_i$ of $F_N$. Let us denote by $u_i$ the vector field associated to $e_i$; we have:

\[ \left(G_N'(\omega^N)[e_i], e_i\right) = -\int_\Omega \left(u_i \cdot \nabla\omega^N\right) e_i\,dx - \int_\Omega \left(u^N \cdot \nabla e_i\right) e_i\,dx, \]

but since $\operatorname{div} u^N = 0$, the last term vanishes, and after integration by parts we get:

\[ \left(G_N'(\omega^N)[e_i], e_i\right) = \int_\Omega \omega^N \operatorname{curl}\psi_i \cdot \nabla e_i\,dx. \]

Let us consider now the positive definite and symmetric linear operator $A$ defined on $F_N$ by $\int_\Omega \nabla\psi \cdot \nabla\varphi\,dx = (A\psi, \varphi)$, and take for $e_i$ an orthonormal basis of eigenvectors of $A$ ($\lambda_i$ is the eigenvalue corresponding to $e_i$). We obviously have $e_i = \lambda_i\psi_i$, so that $\operatorname{curl}\psi_i \cdot \nabla e_i = 0$ and $\operatorname{tr} G_N'(\omega^N) = 0$. □

Two main concerns then remain.

(i) Prove the convergence (when $N \to \infty$) of the approximate solution $\omega^N(t)$ towards the solution $\omega(t)$ of the Euler equation.

(ii) In order to properly define an equilibrium statistical mechanics, one has to study the asymptotic behavior of the ($N$-dependent) family of invariant probability distributions on $F_N$:

\[ \mu_N = \frac{1}{Z}\exp\left(-\alpha\left\|\omega^N\right\|^2_{L^2(\Omega)}\right) d\omega^N, \]

where $d\omega^N$ is the volume measure on $F_N$ given by the $L^2$ metric and the exponential factor is introduced to normalize to a probability.

Point (ii) will be addressed later, and we will now focus on (i). We shall take for approximating space $F_N$ the space $F_h(\Omega)$ of the finite-element approximation of the Sobolev space $H^m(\mathbb{R}^2)$, with compact support in $\Omega$ ($m$ is an integer $> 5$ and $h$ a small positive parameter, see the Appendix). Then we have the following convergence result whose proof is classical.

Proposition 2. Let $\omega(t)$ be any weak solution of (E), with $\omega_0(x)$ in the space $L^\infty(\Omega)$, and let $T > 0$ be fixed. Then for all $\varepsilon > 0$, there is $h(\varepsilon) > 0$, such that for all $h$, $0 < h \le h(\varepsilon)$, there is a solution $\omega^h(t)$ of $(E_h)$ such that:

\[ \left\|\omega(t) - \omega^h(t)\right\|_{L^2(\Omega)} \le \varepsilon, \quad\text{for all } t \text{ in } [0,T]. \]

The measures $\mu_h$ on $F_h(\Omega)$ associated to this approximation are not easy to handle, but it appears that a slight change in the approximating dynamical system greatly improves the situation with a view to (ii). Let us denote by $\Gamma^h_t$ the flow on $F_h(\Omega)$ defined by the system $(E_h)$. Let $p_h : L^2_h \to F_h$ be the classical prolongation operator of the finite-element method (see the Appendix), and $\pi_h = p_h^{-1}$. Let us define $L_h(\Omega) = \pi_h F_h(\Omega)$, and denote by $\Theta^h_t = \pi_h \circ \Gamma^h_t \circ p_h$ the flow induced on $L_h(\Omega)$. Obviously $\Theta^h_t$ preserves the volume measure on $L_h(\Omega)$. And from Proposition 2 we deduce the following.

Corollary 3. Let $\omega(t)$ be any weak solution of (E), with $\omega_0(x)$ in the space $L^\infty(\Omega)$, and let $T > 0$ be fixed. Then for all $\varepsilon > 0$, there is $h(\varepsilon) > 0$ such that for all $h$, $0 < h \le h(\varepsilon)$, there is $\omega_0^h$ in $L_h(\Omega)$ such that:

\[ \left\|\omega(t) - \Theta^h_t\omega_0^h\right\|_{L^2(\Omega)} \le \varepsilon, \quad\text{for all } t \text{ in } [0,T]. \]

Proof. By the $L^2$-stability property of Euler equation, we only need to prove the result for $\omega_0$ in $C_c^\infty$. Using Proposition 2, we have, for $h \le h(\varepsilon)$: $\left\|\omega(t) - \omega^h(t)\right\| \le \varepsilon$ on $[0,T]$. Let us denote $\omega_h(t) = \pi_h\omega^h(t)$; we have:

\[ \|\omega(t) - \omega_h(t)\| \le \|\omega(t) - r_h\omega(t)\| + \|r_h\omega(t) - \omega_h(t)\|, \]

where $r_h$ is the classical restriction operator (see the Appendix). But since

\[ \|r_h\omega(t) - \omega_h(t)\| \le c\left\|p_h r_h\omega(t) - \omega^h(t)\right\|, \]

we obtain:

\[ \|\omega(t) - \omega_h(t)\| \le \|\omega(t) - r_h\omega(t)\| + c\,\|p_h r_h\omega(t) - \omega(t)\| + c\left\|\omega(t) - \omega^h(t)\right\|. \]

Now we have (see the Appendix)

\[ \|\omega(t) - r_h\omega(t)\| \le c\,h\,\|\omega(t)\|_{H^1(\Omega)} \le C(T)\,h \quad\text{on } [0,T], \]

and similarly

\[ \|p_h r_h\omega(t) - \omega(t)\| \le C(T)\,h, \]

thus

\[ \|\omega(t) - \omega_h(t)\| \le C(T)\,h + c\,\varepsilon, \]

and the result follows. □

Let us summarize our results: we have constructed a flow $\Theta^h_t$ on $L_h(\Omega)$ which approximates the Euler flow and preserves the measure $d\omega_h = \otimes_j\, d\omega_h^j$, where $\omega_h(x) = \sum_j \omega_h^j\,\chi\!\left(\frac{x}{h} - j\right)$ (finite sum).

3.2. Long time dynamics and Young measures. As we have seen, Euler system describes the advection of a scalar function (the vorticity) by an incompressible velocity field, thus the vorticity $\omega$ remains bounded in $L^\infty(\Omega)$. The functionals $C_\Theta(\omega) = \int_\Omega \Theta(\omega(x))\,dx$ are constants of the motion (for any continuous function $\Theta$). That is to say, the distribution measure of $\omega$, $\pi_\omega$, defined by $\langle\pi_\omega, \Theta\rangle = C_\Theta(\omega)$, is conserved by the flow. Let us consider an initial datum $\omega_0$. It is well known that, in general, as time evolves, $\Gamma_t\omega_0$ becomes a very intricate oscillating function. Let us denote $r = \|\omega_0\|_{L^\infty(\Omega)}$. Since the measure $\pi_\omega$ is conserved, $\Gamma_t\omega_0$ will remain, for all time, in the ball $L^\infty_r = \{\omega : \|\omega\|_\infty \le r\}$. Extracting a subsequence (if necessary), we may suppose that, as time goes to infinity, $\Gamma_t\omega_0$ converges weakly (for the weak-star topology $\sigma(L^\infty, L^1)$) towards some function $\omega^*$: $\Gamma_t\omega_0 \rightharpoonup \omega^*$. We can easily see that $C_\Theta(\Gamma_t\omega_0)$ does not converge towards $C_\Theta(\omega^*)$ if $\Theta$ is nonlinear, whereas some other invariants can converge, as is the case for the energy. So much information (given by the constants of the motion) is lost in this limit process. Thus the weak space $L^\infty(\Omega)$ is not the right one to describe the long-time limits of our system. Fortunately, the relevant space to do this is well known. The need to describe in some macroscopic way the small-scale oscillations of functions was understood a long time ago by L.C. Young [19]. To solve problems from the calculus of variations, Young introduced a natural generalization of the notion of function: to each point $x$ in $\Omega$ we no longer associate a well determined real value, but only some probability distribution on $\mathbb{R}$ (such a mapping is called a Young measure on $\Omega \times \mathbb{R}$).
More precisely, a Young measure $\nu$ on $\Omega \times \mathbb{R}$ is a measurable mapping $x \to \nu_x$ from $\Omega$ to the set $M_1(\mathbb{R})$ of the Borel probability measures on $\mathbb{R}$, endowed with the narrow topology (weak topology associated to the continuous bounded functions). Clearly, $\nu$ defines a positive Borel measure on $\Omega \times \mathbb{R}$ (that we will also denote by $\nu$) by:

\[ \langle\nu, \varphi\rangle = \int_\Omega \langle\nu_x, \varphi(x,\cdot)\rangle\,dx, \]

for every real function $\varphi(x,z)$, continuous and compactly supported on $\Omega \times \mathbb{R}$. To any measurable real function $g$ on $\Omega$, we associate the Young measure $\delta_g : x \to \delta_{g(x)}$, the Dirac mass at $g(x)$. We shall denote by $M$ the convex set of Young measures on $\Omega \times \mathbb{R}$, and we recall some useful properties:
– $M$ is closed in the space of all bounded Borel measures on $\Omega \times \mathbb{R}$ (with the narrow topology). In the sequel, $M$ will be endowed with the narrow topology. If we replace $\mathbb{R}$ by the compact interval $[-r, r]$, the space $M_r$ of Young measures on $\Omega \times [-r, r]$ is compact.
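The loss of information under weak convergence can be illustrated on a toy example that is not taken from the paper: a square wave on $[0,1]$ stands in for the oscillating vorticity, and the function names below are hypothetical. The pairing with a test function tends to zero as the oscillation frequency grows, while the nonlinear functional $\int \Theta(f)\,dx$ with $\Theta(z) = z^2$ stays equal to 1. A minimal sketch:

```python
import math

def square_wave(k, x):
    """Toy oscillating function f_k(x) = +1 or -1 with k periods on [0, 1]."""
    return 1.0 if math.sin(2.0 * math.pi * k * x) >= 0.0 else -1.0

def integral(g, n_points=20000):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    return sum(g((i + 0.5) / n_points) for i in range(n_points)) / n_points

def pairing(k, phi):
    """Integral of f_k * phi: probes weak convergence against a test function."""
    return integral(lambda x: square_wave(k, x) * phi(x))

def casimir(k, theta):
    """Integral of theta(f_k): the analogue of the conserved functional C_Theta."""
    return integral(lambda x: theta(square_wave(k, x)))

if __name__ == "__main__":
    for k in (1, 10, 50):
        print(k, pairing(k, lambda x: x), casimir(k, lambda z: z * z))
```

In this toy case the weak limit is the zero function, for which $\Theta$ gives 0, whereas the Young measure $\nu_x = \frac{1}{2}\delta_{-1} + \frac{1}{2}\delta_{+1}$ retains the value $\langle\nu_x, z^2\rangle = 1$.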

We can now identify the long time limits of the system as Young measures. Indeed, $M_r$ is a suitable compactification of $L^\infty_r$ since the narrow convergence (when $t$ goes to infinity) of $\delta_{\Gamma_t\omega_0}$ towards some Young measure $\nu$ preserves the information given by the constants of the motion, that is, for all functions $\Theta(z)$:

\[ \int_\Omega \Theta\left(\Gamma_t\omega_0(x)\right) dx \to \int_\Omega \langle\nu_x, \Theta\rangle\,dx, \]

but the left-hand side is constant and equal to $\langle\pi_{\omega_0}, \Theta\rangle$, so that:

\[ \int_\Omega \nu_x\,dx = \pi_{\omega_0}. \tag{$*$} \]

The same kind of argument applies to the other invariants. For example, since $\Gamma_t\omega_0$ converges weakly towards $\bar\nu(x) = \int z\,d\nu_x(z)$, we have, for the energy, $\Xi(\Gamma_t(\omega_0)) \to \Xi(\bar\nu)$, which is the energy of the Young measure $\nu$, and thus: $\Xi(\bar\nu) = \Xi(\omega_0)$. We shall denote by $(**)$ the set of constraints (associated to the constants of the motion) other than $(*)$ that $\bar\nu$ has to satisfy: $(**)$ = {energy constraint, angular momentum constraint (when applicable)}. Thus we see that the constants of the motion bring the constraints $(*)$, $(**)$ on the possible long time limits. Since we do not know anything (in the general case) about the long time behavior of the solutions of Euler equation, we will consider Young measures merely as a convenient framework in which we can perform the thermodynamic limit of a family of invariant measures.

3.3. A large deviation property. In order to define relevant statistical equilibrium states, we have to take the thermodynamic limit of the invariant Liouville measures with the conditioning given by all the constants of the motion. Let $\Omega$ be a bounded open subset of $\mathbb{R}^d$; the space $F_h(\Omega)$ is composed of the functions of the form $\sum_j f_h^j\,\beta\!\left(\frac{x}{h} - j\right)$ which are compactly supported in $\Omega$ (see the Appendix). And the space $L_h(\Omega) = \pi_h(F_h(\Omega))$ is composed of functions of the form $\sum_j f_h^j\,\chi\!\left(\frac{x}{h} - j\right)$ which vanish in a neighborhood of the boundary $\partial\Omega$ (whose width goes to zero with $h$). Let us write a function of $L_h(\Omega)$: $f_h = \sum_{j \in J_h} f_h^j\,\chi\!\left(\frac{x}{h} - j\right)$. We denote $df_h = \otimes_{j \in J_h}\, df_h^j$, and

\[ \mu_h = \frac{1}{Z}\exp\left(-\frac{1}{h^d}\int_\Omega f_h^2\,dx\right) df_h \]

the probability measure on $L_h(\Omega)$, where the scaling factor $1/h^d$ is introduced in order to give a finite value to the mean $\int_{L_h(\Omega)}\left(\int_\Omega f_h^2\,dx\right) d\mu_h(f_h)$ in the limit $h \to 0$. We will write $\mu_h = \otimes_{j \in J_h}\, d\pi_*\!\left(f_h^j\right)$, where $d\pi_*(y) = \frac{1}{\sqrt{\pi}}\,e^{-y^2}\,dy$. We will consider now $f_h$ as a random function with probability distribution $\mu_h$. Thus $\delta_{f_h}$ is a random Young measure on $\Omega \times \mathbb{R}$.
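The role of the scaling factor $1/h^d$ can be checked numerically. The sketch below is an illustration only: it assumes the unit square ($d = 2$, $h = 1/N$, all $N^2$ cells kept, so the vanishing boundary layer is ignored), and the helper names are hypothetical. Each coefficient $f_h^j$ is drawn with density $e^{-y^2}/\sqrt{\pi}$, that is, a centered Gaussian of variance $1/2$, so the mean of $\int f_h^2\,dx$ should stay close to $|\Omega|/2$ whatever the value of $h$.

```python
import math
import random

def sample_coefficients(n_cells, rng):
    """Draw independent coefficients with density exp(-y^2)/sqrt(pi), i.e. N(0, 1/2)."""
    sigma = math.sqrt(0.5)
    return [rng.gauss(0.0, sigma) for _ in range(n_cells)]

def l2_norm_squared(coeffs, h, d):
    """Integral of f_h^2 for a piecewise-constant f_h: each cell has volume h**d."""
    return (h ** d) * sum(c * c for c in coeffs)

def mean_l2_norm_squared(h, d, n_cells, n_samples, seed=0):
    """Monte Carlo estimate of the mean of the squared L2 norm under mu_h."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += l2_norm_squared(sample_coefficients(n_cells, rng), h, d)
    return total / n_samples

if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, mean_l2_norm_squared(1.0 / n, 2, n * n, 200))
```

The estimate does not drift as the grid is refined, which is the finiteness that the $1/h^d$ scaling is meant to secure.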

It follows from Theorem 3.1 in [8] that the family (depending on $h$) of the random Young measures $\delta_{f_h}$ has the large deviation property with constants $1/h^d$ and rate function $I_\pi(\nu)$, where we denote $\pi = dx \otimes \pi_*$, and $I_\pi(\nu)$ is the classical Kullback information functional, defined on $M$ by:

\[ I_\pi(\nu) = \begin{cases} \displaystyle\int \operatorname{Log}\frac{d\nu}{d\pi}\,d\nu, & \text{if } \nu \text{ is absolutely continuous with respect to } \pi, \\ +\infty & \text{otherwise.} \end{cases} \]

A straightforward consequence of this large deviation property is that the random Young measures $\delta_{f_h}$ which in addition satisfy the constraints $(*)$, $(**)$ are exponentially concentrated (see [8] for a precise statement) about the set $E^*$ of the solutions of the variational problem

\[ I_\pi(\nu^*) = \inf\{I_\pi(\nu) : \nu \in E\}, \]

where $E$ is the closed subset of the Young measures on $\Omega \times \mathbb{R}$ satisfying the constraints $(*)$, $(**)$. Notice that this variational problem has at least one solution since $E$ is non-empty and closed and $I_\pi(\nu)$ is a lower semi-continuous and inf-compact functional on $M$.

Now, let us denote $\pi_0 = \frac{1}{|\Omega|}\pi_{\omega_0}$, and $\pi^0 = dx \otimes \pi_0$. For all $\nu$ satisfying $(*)$, one can easily get the relationship:

\[ I_\pi(\nu) = I_{\pi^0}(\nu) + |\Omega|\,I_{\pi_*}(\pi_0). \]

Thus if $I_{\pi_*}(\pi_0) < \infty$, minimizing $I_\pi$ or $I_{\pi^0}$ on $E$ gives the same equilibrium set $E^*$. In fact the use of the functional $I_{\pi^0}$ is more natural since it is associated to the invariant distribution $\pi_{\omega_0}$. To justify the use of $I_{\pi^0}$ in the degenerate case $I_{\pi_*}(\pi_0) = \infty$, one can, for instance, modify the definition of the measures $\mu_h$, and consider $\mu_h = \otimes_{j \in J_h}\, d\pi_h\!\left(f_h^j\right)$, where $d\pi_h(y) = \frac{1}{Z}\exp(-Q_h(y))\,dy$, and the polynomial function $Q_h(y)$ is such that $\pi_h$ converges towards $\pi_0$ in the narrow topology when $h \to 0$. Of course, we have

\[ \mu_h = \frac{1}{Z}\exp\left(-\frac{1}{h^d}\int_\Omega Q_h(f_h)\,dx\right) df_h. \]

It is not hard to see that the proof of Theorem 3.1 in [8] works for these measures; it follows that $\delta_{f_h}$ has the large deviation property with constants $1/h^d$ and rate function $I_{\pi^0}(\nu)$. Notice that $-I_{\pi^0}(\nu)$ is the entropy, that is, the functional which measures the disorder created in the fluid by the turbulent mixing.

Remark 1. For Euler equation we have $d = 2$.

Remark 2. In order to get probability measures, we multiply $df_h$ by $\frac{1}{Z}\exp\left(-\frac{1}{h^d}\int_\Omega f_h^2\,dx\right)$ despite the fact that this functional is (possibly) not conserved by the flow $\Theta^h_t$. Indeed, we regard it as legitimate to multiply the measures by any functional which is conserved by the flow of the infinite dimensional dynamical system.
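The relationship $I_\pi(\nu) = I_{\pi^0}(\nu) + |\Omega|\,I_{\pi_*}(\pi_0)$ can be checked on a discrete analogue. The sketch below is an illustration under assumptions that are not in the text: $\Omega$ is replaced by finitely many cells with given weights, each $\nu_x$ is a distribution on a finite set of values, and $\pi_0$ is computed from $\nu$ through the constraint $(*)$. The function names are hypothetical.

```python
import math

def kl(p, q):
    """Kullback information sum_z p(z) log(p(z)/q(z)) for finite distributions."""
    return sum(pz * math.log(pz / qz) for pz, qz in zip(p, q) if pz > 0.0)

def check_decomposition(nu, weights, pi_star):
    """nu[x] is the distribution nu_x on a finite value set, weights[x] the measure
    of cell x. Returns the pair (I_pi(nu), I_pi0(nu) + |Omega| * I_pi_star(pi_0))."""
    volume = sum(weights)
    n_values = len(pi_star)
    # pi_0 = (1/|Omega|) * integral of nu_x dx, as dictated by constraint (*)
    pi_0 = [sum(w * nu_x[z] for w, nu_x in zip(weights, nu)) / volume
            for z in range(n_values)]
    lhs = sum(w * kl(nu_x, pi_star) for w, nu_x in zip(weights, nu))
    rhs = sum(w * kl(nu_x, pi_0) for w, nu_x in zip(weights, nu))
    rhs += volume * kl(pi_0, pi_star)
    return lhs, rhs

if __name__ == "__main__":
    nu = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
    print(check_decomposition(nu, [0.4, 1.6], [0.25, 0.5, 0.25]))
```

Since the second term on the right does not depend on $\nu$ once $(*)$ is imposed, both functionals have the same minimizers, which is the point made in the text.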

4. The Statistical Equilibrium States

Once we have identified the relevant entropy functional, the determination of the equilibrium states comes down to the solution of a variational problem: i.e. find the minimum value of $I_{\pi^0}(\nu)$ under the constraints given by the constants of the motion of the system. After that it remains to discuss at a physical level the relevance of these states. The discussion of the equilibrium states for Euler equation was done at a mathematical and physical level in [11, 13, 15, 16], and we refer to these papers.

5. Comments

Let us now address, at a heuristic level, the remaining difficult issue of a complete justification of this equilibrium statistical mechanics. We will consider the well known phenomenon of the formation of coherent structures in 2D turbulence. We can observe, in meteorology, experiments or numerical simulations, that such structures form. Let us scrutinize what chain of logic would lead us to identify these structures with the statistical equilibrium states previously described. Notice first that we observe the phenomenon (the formation of the structure) over some finite time interval [0, T]. Obviously the turbulent real fluid has some very small dimensionless viscosity, so that we may suppose that in our time interval the flow is well approximated, in a strong $L^2$ sense, by a solution of 2D Euler (we consider for example the case of periodic boundary conditions to avoid the problem of boundary layer formation). Then we can approximate (still in a strong $L^2$ sense), uniformly over [0, T], the flow by a solution of our finite dimensional system, taking the number of degrees of freedom N large enough. Now we have to make the assumption that this finite dimensional system is ergodic and comes close to equilibrium in a mixing time T(N) which is less than T.
Of course, to have a good approximation of the flow over [0, T] we have to take N very large (this is well known in hydrodynamical simulations) and the crucial question is: how does T(N) increase with N? We clearly do not have any rigorous argument to ensure that T(N) does not increase dramatically with N, so that the above justification might fail. From a careful examination of the results of many tests in various situations, the following facts emerge (see [12] and references therein). 1) If the turbulent flow reaches an equilibrium state after a mixing process occupying the whole domain, then the description of the final state as a global maximum entropy state is accurate. 2) In many cases the flow reaches a kind of equilibrium which is not a statistical equilibrium in the whole domain occupied by the flow. This indicates clearly that difficulties may arise with the ergodic hypothesis. 3) In such cases, inside the subdomain occupied by the coherent structure, the relationship (vorticity-stream function) associated to our entropy functional is fairly well satisfied. This also indicates clearly that our entropy functional retains some relevance even when ergodicity fails. In conclusion, from the above considerations, it seems unrealistic to seek a complete mathematical justification (involving the dynamics) of the statistical equilibrium states. Nevertheless we can go on to study at a physical level and investigate the relaxation process about the equilibrium; this can shed some light on the dynamical mechanisms responsible for a possible lack of ergodicity [12].

The picture of a turbulent mixing of the vorticity driving the system towards its equilibrium is very similar to the violent relaxation process in Vlasov–Poisson system that was suggested by astrophysicists to explain the formation of galaxies [5]. But in the case of stellar systems a true difficulty occurs: the stars are not naturally confined in a bounded container, and there is no equilibrium state in the whole space. One can put forward physical arguments [2] to impose such a confinement, but this point is rather controversial at this time. Nevertheless, once a spatial confinement is imposed the above analysis on 2D Euler extends with only minor technical changes to 6D Vlasov–Poisson system.

Appendix

Finite-Element Approximation. For the convenience of the reader, we briefly recall some standard notations and properties [1].

Approximation of the Sobolev space $H^m(\mathbb{R}^d)$. We denote $Q_d = \,]-1/2, 1/2[^d$, $\chi$ the characteristic function of $Q_d$, and $\beta = \chi * \dots * \chi$ ($m + 1$ terms). For a given parameter $h > 0$, we define a prolongation operator $p_h$: to any function $f_h = \sum_j f_h^j\,\chi\!\left(\frac{x}{h} - j\right)$ ($j$ belongs to $\mathbb{Z}^d$), we associate the function

\[ p_h f_h = \sum_j f_h^j\,\beta\!\left(\frac{x}{h} - j\right). \]

Let us consider now a compactly supported measurable bounded function $\lambda(x)$ satisfying $\int \lambda(x)\,dx = 1$ and

\[ \iint \beta(x)\,\lambda(y)\,(x - y)^k\,dx\,dy = \begin{cases} 1 & \text{for } k = 0, \\ 0 & \text{for } 0 < |k| \le m, \end{cases} \]

where $k = (k_1, \dots, k_d)$, $|k| = k_1 + \dots + k_d$, and $(x - y)^k = (x_1 - y_1)^{k_1} \dots (x_d - y_d)^{k_d}$. Then we define a restriction operator $r_h$: for $f \in L^1_{loc}(\mathbb{R}^d)$, we denote

\[ f_h^j = \frac{1}{h^d}\int \lambda\!\left(\frac{x}{h} - j\right) f(x)\,dx, \]

and define

\[ r_h f = \sum_j f_h^j\,\chi\!\left(\frac{x}{h} - j\right). \]

We have the well known estimates (where $c$ denotes different constants which do not depend on $h$):

(1) If $f \in L^2(\mathbb{R}^d)$, we have $\|r_h f\|_{L^2} \le c\,\|f\|_{L^2}$.

(2) If $f_h \in L^2(\mathbb{R}^d)$, we have $p_h f_h \in H^m$ and $\|p_h f_h\|_{H^m} \le \frac{c}{h^m}\|f_h\|_{L^2}$; moreover $c\,\|f_h\|_{L^2} \le \|p_h f_h\|_{L^2} \le \|f_h\|_{L^2}$, where $c > 0$ does not depend on $h$. It follows that $p_h$ is an isomorphism from the space $L^2_h$ of the functions $f_h$ which are square integrable onto a subspace $F_h$ of $H^m$.

(3) If $f \in H^{m+1}(\mathbb{R}^d)$, for $0 \le k \le s \le m + 1$ and $k \le m$, we have: $\|f - p_h r_h f\|_{H^k} \le c\,h^{s-k}\,\|f\|_{H^s}$.
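To make the prolongation operator concrete, here is a minimal sketch in the simplest setting, which is an assumption chosen for illustration ($d = 1$, $m = 1$) and not the case $m > 5$ used in the paper. For $m = 1$, $\beta = \chi * \chi$ is the hat function supported on $[-1, 1]$, and $p_h f_h$ is the piecewise-linear interpolant of the coefficients $f_h^j$ at the nodes $jh$. The function names are hypothetical.

```python
def beta(x):
    """beta = chi * chi for m = 1: the hat function supported on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def prolong(coeffs, h, x):
    """Evaluate p_h f_h at x, where coeffs maps the integer index j to f_h^j (d = 1)."""
    return sum(c * beta(x / h - j) for j, c in coeffs.items())

if __name__ == "__main__":
    # Constant coefficients are reproduced exactly away from the ends of the index range.
    ones = {j: 1.0 for j in range(-5, 6)}
    print(prolong(ones, 0.1, 0.03))
    # Two nonzero coefficients: linear interpolation between the nodes 0 and h.
    print(prolong({0: 2.0, 1: -1.0}, 0.5, 0.25))
```

The translates of $\beta$ form a partition of unity, so $p_h$ maps constant coefficient sequences to constant functions in the interior of the index range; smoother $\beta$ (larger $m$) keep this property while giving $H^m$ regularity.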

The periodic case. Let us suppose that $h = 1/N$. We denote by $H^m_{per}$ the Sobolev space $H^m$ on the $d$-dimensional torus $(\mathbb{R}/\mathbb{Z})^d$, and by $F_h(Q_d)$ the space of the restrictions to $Q_d$ of the functions of the form $\sum_j f_h^j\,\beta\!\left(\frac{x}{h} - j\right)$ which are $\mathbb{Z}^d$-periodic (i.e. $f_h^j = f_h^{j + Ni}$ for all $j$, $i$ in $\mathbb{Z}^d$). $F_h(Q_d)$ is endowed with the $L^2$ scalar product. Obviously, if $f$ and $f_h$ are $\mathbb{Z}^d$-periodic, so are $r_h f$ and $p_h f_h$. And the following estimates hold:

(1') If $f \in L^2_{per}$, we have $\|r_h f\|_{L^2(Q_d)} \le c\,\|f\|_{L^2(Q_d)}$.

(2') If $f_h$ is $\mathbb{Z}^d$-periodic, we have $\|p_h f_h\|_{H^m_{per}} \le \frac{c}{h^m}\|f_h\|_{L^2(Q_d)}$, and $c\,\|f_h\|_{L^2(Q_d)} \le \|p_h f_h\|_{L^2(Q_d)} \le \|f_h\|_{L^2(Q_d)}$.

(3') If $f \in H^{m+1}_{per}$, for $0 \le k \le s \le m + 1$ and $k \le m$, we have $\|f - p_h r_h f\|_{H^k_{per}} \le c\,h^{s-k}\,\|f\|_{H^s_{per}}$.

References

1. Aubin, J.P.: Approximation of elliptic boundary-value problems. New York: Wiley-Interscience, 1972
2. Chavanis, P.H., Sommeria, J., Robert, R.: Statistical mechanics of two-dimensional vortices and collisionless stellar systems. The Astrophysical J. 471, 385–399 (1996)
3. Eyink, G.L., Spohn, H.: Negative states and large-scale long-lived vortices in two-dimensional turbulence. J. Stat. Phys. 70, 833–886 (1993)
4. Jordan, R.: A statistical equilibrium model of coherent structures in magnetohydrodynamics. Nonlinearity 8, 585–614 (1995)
5. Lynden-Bell, D.: Statistical mechanics of violent relaxation in stellar systems. Mon. Not. R. Astr. Soc. 181, 405 (1967)
6. Marchioro, C., Pulvirenti, M.: Mathematical Theory of Incompressible Nonviscous Fluids. New York: Springer-Verlag, 1994
7. Michel, J., Robert, R.: Statistical mechanical theory of the great red spot of Jupiter. J. Stat. Phys. 77, 3/4, 645–666 (1994)
8. Michel, J., Robert, R.: Large deviations for Young measures and statistical mechanics of infinite dimensional dynamical systems with conservation law. Commun. Math. Phys. 159, 195–215 (1994)
9. Miller, J., Weichman, P.B., Cross, M.C.: Statistical mechanics, Euler equations, and Jupiter's red spot. Phys. Rev. A 45, 2328–2359 (1992)
10. Onsager, L.: Statistical hydrodynamics. Nuovo Cimento suppl. 6, 279 (1949)
11. Robert, R.: A maximum entropy principle for two-dimensional Euler equations. J. Stat. Phys. 65, 3/4, 531–553 (1991)
12. Robert, R., Rosier, C.: On the modelling of small scales for 2D turbulent flows. J. Stat. Phys. 86, 3/4 (1997)
13. Robert, R., Sommeria, J.: Statistical equilibrium states for two-dimensional flows. J. Fluid Mech. 229, 291–310 (1991)
14. Sommeria, J., Nore, C., Dumont, T., Robert, R.: Théorie statistique de la tache rouge de Jupiter. C. R. Acad. Sci. Paris, 312 Série II, 999–1005 (1991)
15. Sommeria, J., Staquet, C., Robert, R.: Final equilibrium state of a two-dimensional shear layer. J. Fluid Mech. 233, 661–689 (1991)
16. Thess, A., Sommeria, J., Jüttner, B.: Inertial organization of a two-dimensional turbulent vortex street. Phys. Fluids 6 (7), 2417–2429 (1994)
17. Turkington, B., Jordan, R.: Turbulent relaxation of a magnetofluid: A statistical equilibrium model. In: Proceedings, International Conference on Advances in Geometric Analysis and Continuum Mechanics. Stanford University, August 1993
18. Youdovitch, V.I.: Non-stationary flow of an incompressible liquid. Zh. Vych. Mat. 3, 1032–1066 (1963)
19. Young, L.C.: Generalized surfaces in the calculus of variations. Ann. Math. 43, 84–103 (1942)
20. Zachos, C.K.: Hamiltonian flows, SU(8), SO(8), USp(8) and strings. In: Differential geometric methods in theoretical physics, L.L. Chau, W. Nahm (eds.). New York: Plenum Press, 1990
21. Zeitlin, V.: Finite mode analogs of 2D ideal hydrodynamics: Coadjoint orbits and local canonical structure. Physica D 49, 353–362 (1991)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 212, 257 – 275 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

The Stability of Magnetic Vortices∗

S. Gustafson†, I. M. Sigal

Dept. of Mathematics, University of Toronto, 100 St. George St., Toronto, ON, Canada, M5S 3G3

Received: 16 November 1998 / Accepted: 3 January 2000

∗ Research on this paper was supported by NSERC under grant N7901
† Present address: Courant Institute, 251 Mercer St., New York, NY 10012, USA. E-mail: [email protected]

Abstract: We study the linearized stability of n-vortex ($n \in \mathbb{Z}$) solutions of the magnetic Ginzburg–Landau (or Abelian Higgs) equations. We prove that the fundamental vortices ($n = \pm 1$) are stable for all values of the coupling constant, $\lambda$, and we prove that the higher-degree vortices ($|n| \ge 2$) are stable for $\lambda < 1$, and unstable for $\lambda > 1$. This resolves a long-standing conjecture (see, e.g., [JT]).

1. Introduction

In this paper, we determine the stability of magnetic (or Abelian Higgs) vortices. These are certain critical points of the energy functional

\[ E(\psi, A) = \frac{1}{2}\int_{\mathbb{R}^2}\left\{ |\nabla_A\psi|^2 + (\nabla \times A)^2 + \frac{\lambda}{4}\left(|\psi|^2 - 1\right)^2 \right\} \tag{1} \]

for the fields

\[ A : \mathbb{R}^2 \to \mathbb{R}^2 \quad\text{and}\quad \psi : \mathbb{R}^2 \to \mathbb{C}. \]

Here $\nabla_A = \nabla - iA$ is the covariant gradient, and $\lambda > 0$ is a coupling constant. For a vector $A$, $\nabla \times A$ is the scalar $\partial_1 A_2 - \partial_2 A_1$, and for a scalar $\xi$, $\nabla \times \xi$ is the vector $(-\partial_2\xi, \partial_1\xi)$. Critical points of $E(\psi, A)$ satisfy the Ginzburg–Landau (GL) equations

\[ -\Delta_A\psi + \frac{\lambda}{2}\left(|\psi|^2 - 1\right)\psi = 0, \tag{2} \]

\[ \nabla \times \nabla \times A + \operatorname{Im}\left(\bar\psi\,\nabla_A\psi\right) = 0, \tag{3} \]

where $\Delta_A = \nabla_A \cdot \nabla_A$.

Physically, the functional $E(\psi, A)$ gives the difference in free energy between the superconducting and normal states near the transition temperature in the Ginzburg–Landau theory. $A$ is the vector potential ($\nabla \times A$ is the induced magnetic field), and $\psi$ is an order parameter. The modulus of $\psi$ is interpreted as describing the local density of superconducting Cooper pairs of electrons.

The functional $E(\psi, A)$ also gives the energy of a static configuration in the Yang–Mills–Higgs classical gauge theory on $\mathbb{R}^2$, with abelian gauge group $U(1)$. In this case $A$ is a connection on the principal $U(1)$-bundle $\mathbb{R}^2 \times U(1)$, and $\psi$ is the Higgs field (see [JT] for details).

A central feature of the functional $E(\psi, A)$ (and the GL equations) is its infinite-dimensional symmetry group. Specifically, $E(\psi, A)$ is invariant under $U(1)$ gauge transformations,

\[ \psi \mapsto e^{i\gamma}\psi, \tag{4} \]

\[ A \mapsto A + \nabla\gamma \tag{5} \]

for any smooth $\gamma : \mathbb{R}^2 \to \mathbb{R}$. In addition, $E(\psi, A)$ is invariant under coordinate translations, and under the coordinate rotation transformation

\[ \psi(x) \mapsto \psi(g^{-1}x), \qquad A(x) \mapsto gA(g^{-1}x) \tag{6} \]

for $g \in SO(2)$. Finite energy field configurations satisfy

\[ |\psi| \to 1 \quad\text{as}\quad |x| \to \infty, \tag{7} \]

which leads to the definition of the topological degree, $\deg(\psi)$, of such a configuration:

\[ \deg(\psi) = \deg\left( \left.\frac{\psi}{|\psi|}\right|_{|x| = R} : S^1 \to S^1 \right) \]

($R$ sufficiently large). The degree is related to the phenomenon of flux quantization. Indeed, an application of Stokes' theorem shows that a finite-energy configuration satisfies

\[ \deg(\psi) = \frac{1}{2\pi}\int_{\mathbb{R}^2} (\nabla \times A). \]

We study, in particular, "radially-symmetric" or "equivariant" fields of the form

\[ \psi^{(n)}(x) = f_n(r)\,e^{in\theta}, \qquad A^{(n)}(x) = n\,\frac{a_n(r)}{r}\,\hat{x}^\perp, \tag{8} \]

where $(r, \theta)$ are polar coordinates on $\mathbb{R}^2$, $\hat{x}^\perp = \frac{1}{r}(-x_2, x_1)^t$, $n$ is an integer, and $f_n, a_n : [0, \infty) \to \mathbb{R}$. It is easily checked that such configurations (if they satisfy (7)) have degree $n$. The existence of critical points of this form is well-known (see Sect. 2.1). They are called n-vortices.
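The flux quantization formula can be checked numerically for fields of the form (8). For such fields $\nabla \times A = n\,a'(r)/r$, so $\frac{1}{2\pi}\int_{\mathbb{R}^2}(\nabla \times A)$ reduces to $n\,(a(\infty) - a(0))$. The sketch below is illustrative only: the profile `a_trial` is a hypothetical stand-in with $a \sim r^2$ near 0 and $a \to 1$ at infinity, not the actual vortex profile, and the integral is truncated at a finite radius.

```python
import math

def a_trial(r):
    """Hypothetical profile with a ~ r^2 near 0 and a -> 1 as r -> infinity."""
    return 1.0 - math.exp(-r * r)

def degree_from_flux(n, a, r_max=10.0, steps=20000):
    """(1/2 pi) * integral over the disc of radius r_max of curl A, for
    A = n a(r)/r x_perp_hat. In polar coordinates curl A = n a'(r)/r; the
    derivative a' is approximated by central differences on a midpoint grid."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        curl_a = n * (a(r + 0.5 * dr) - a(r - 0.5 * dr)) / (dr * r)
        total += curl_a * r * dr  # radial part of the area element r dr dtheta
    # The angular integral contributes 2 pi, which cancels the prefactor 1/(2 pi).
    return total

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, degree_from_flux(n, a_trial))
```

Any profile with the same boundary values gives the same result, which reflects the topological nature of the degree.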

Our main results concern the stability of these n-vortex solutions. Let

\[ L^{(n)} = \operatorname{Hess} E\left(\psi^{(n)}, A^{(n)}\right) \]

be the linearized operator for GL around the n-vortex, acting on the space $X = L^2(\mathbb{R}^2, \mathbb{C}) \oplus L^2(\mathbb{R}^2, \mathbb{R}^2)$. The symmetry group of $E(\psi, A)$ gives rise to an infinite-dimensional subspace of $\ker(L^{(n)}) \subset X$ (see Sect. 3.2), which we denote here by $Z_{sym}$. We say the n-vortex is (linearly) stable if for some $c > 0$, $L^{(n)}|_{Z_{sym}^\perp} \ge c$, and unstable if $L^{(n)}$ has a negative eigenvalue. The basic result of this paper is the following linearized stability statement:

Theorem 1.
1. (Stability of fundamental vortices) For all $\lambda > 0$, the $\pm 1$-vortex is stable.
2. (Stability / instability of higher-degree vortices) For $|n| \ge 2$, the n-vortex is stable for $\lambda < 1$, unstable for $\lambda > 1$.

Theorem 1 is the basic ingredient in a proof of the nonlinear dynamical stability / instability of the n-vortex for certain dynamical versions of the GL equations. These include the GL gradient flow equations, and the Abelian Higgs (Lorentz-invariant) equations. These dynamical stability results are established in a separate work ([G2]). Other work on dynamics of magnetic vortices appears in [DS, S, S2]. The statement of Theorem 1 was conjectured in [JT] on the basis of numerical observations (see [JR]). Bogomolnyi ([B]) gave an argument for instability of vortices for $\lambda > 1$, $|n| \ge 2$. Our result rigorously establishes this property. The instability of higher-degree vortices for sufficiently large $\lambda$ was established in [ABG]. The stability of vortices of Ginzburg–Landau equations without magnetic field was studied in [LL, M, OS1]. The stability of "monopole" solutions of a non-abelian generalization of (2)–(3) was studied in [AD] (see also [G1]). The solutions of (2)–(3) are well-understood in the case of critical coupling, $\lambda = 1$.
In this case, the Bogomolnyi method ([B]) gives a pair of first-order equations whose solutions are global minimizers of E(ψ, A) among fields of fixed degree (and hence solutions of the GL equations). Taubes ([T1,T2]) has shown that all solutions of GL with λ = 1 are solutions of these first-order equations, and that for a given degree n, the gauge-inequivalent solutions form a 2|n|-parameter family. The 2|n| parameters describe the locations of the zeros of the scalar field. This is discussed in more detail in [JT] (see also [BGP]) and Sect. 6. We remark that for λ = 1, an n-vortex solution (8) corresponds to the case when all |n| zeros of the scalar field lie at the origin. The remainder of this paper is organized as follows. In Sect. 2 we describe in detail various properties of the n-vortex. In particular, we establish an important estimate on the n-vortex profiles which differentiates between the cases λ < 1 and λ > 1. In Sect. 3, we introduce the linearized operator, fix the gauge on the space of perturbations, and identify the zero-modes due to symmetry-breaking. Sections 4 through 7 comprise

260

S. Gustafson, I. M. Sigal

a proof of Theorem 1. A block-decomposition for the linearized operator is described in Sect. 4. This approach is similar to that used to study the stability of non-magnetic vortices in [OS1] and [G1]. In Sect. 5, we establish the positivity of certain blocks (those corresponding to the radially-symmetric variational problem, and those containing the translational zero-modes) for all λ, which completes the stability proof for the ±1vortices. The basic techniques are the characterization of symmetry-breaking in terms of zero-modes of the Hessian (or linearized operator), and a Perron-Frobenius type argument, based on a version of the maximum principle for systems (Proposition 6), which shows that the translational zero-modes correspond to the bottom of the spectrum of the linearized operator. A more careful analysis is needed for |n| ≥ 2. This requires us to review some aspects of the critical case (λ = 1) in Sect. 6. The stability / instability proof for |n| ≥ 2 is completed in Sect. 7. We use an extension of Bogomolnyi’s instability argument, and another application of the Perron-Frobenius theory. 2. The n-Vortex In this section we discuss the existence, and properties, of n-vortex solutions. 2.1. Vortex solutions. The existence of solutions of (GL) of the form (8) is well-known: Theorem 2 (Vortex existence; [P, BC]). For every integer n, and every λ > 0, there is a solution an (r) ⊥ A(n) (x) = n (9) xˆ ψ (n) (x) = fn (r)einθ r of the variational equations (2)–(3). In particular, the radial functions (fn , an ) minimize the radial energy functional Z 2 2 0 2 1 ∞ λ 2 (n) 0 2 2 (1 − a) f 2 (a ) 2 +n + (f − 1) rdr (10) Er (f, a) = (f ) + n 2 0 r2 r2 4 (which is the full energy functional (1) restricted to fields of the form (8)) in the class a a0 ∈ L2loc (rdr), ∈ L2 (rdr)}. r r The functions fn , an are smooth, and have the following properties (for n 6 = 0): {f, a : [0, ∞) → R | 1 − f ∈ H 1 (rdr),

1. $0 < f_n < 1$, $0 < a_n < 1$ on (0, ∞),
2. $f_n', a_n' > 0$,
3. $f_n \sim c r^n$, $a_n \sim d r^2$, as r → 0 (c > 0 and d > 0 are constants),
4. $1 - f_n$, $1 - a_n \to 0$ as r → ∞, with an exponential rate of decay.

We call $(\psi^{(n)}, A^{(n)})$ an n-vortex (centred at the origin). It follows immediately that the functions $f_n$ and $a_n$ satisfy the ODEs
$$-\Delta_r f_n + \frac{n^2(1-a_n)^2}{r^2} f_n + \frac{\lambda}{2}(f_n^2-1)f_n = 0 \tag{11}$$
and
$$-a_n'' + \frac{a_n'}{r} - f_n^2(1-a_n) = 0. \tag{12}$$
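As a consistency check, the ODEs (11)–(12) can be recovered as the Euler–Lagrange equations of the radial energy (10). The sketch below assumes that $\Delta_r$ denotes the radial Laplacian $\partial_r^2 + r^{-1}\partial_r$ (a notational assumption) and that boundary terms vanish for variations in the admissible class.

```latex
% Assumptions: \Delta_r = \partial_r^2 + r^{-1}\partial_r; boundary terms vanish.
\begin{align*}
% variation in f: integrate f'\,\delta f'\, r by parts
\frac{d}{d\epsilon}E_r^{(n)}(f+\epsilon\,\delta f, a)\Big|_{\epsilon=0}
 &= \int_0^\infty \delta f\,\Big[-\tfrac{1}{r}(r f')' + \tfrac{n^2(1-a)^2}{r^2}f
    + \tfrac{\lambda}{2}(f^2-1)f\Big]\, r\,dr , \\
% variation in a: the weight r cancels one power of r
\frac{d}{d\epsilon}E_r^{(n)}(f, a+\epsilon\,\delta a)\Big|_{\epsilon=0}
 &= n^2\int_0^\infty \delta a\,\Big[-\Big(\tfrac{a'}{r}\Big)' - \tfrac{(1-a)f^2}{r}\Big]\,dr .
\end{align*}
% Since (r f')'/r = \Delta_r f, the first bracket vanishing is (11).
% Since (a'/r)' = a''/r - a'/r^2, multiplying the second bracket by r gives
% -a'' + a'/r - f^2(1-a) = 0, which is (12).
```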


Remark 1. The n-vortex is known to be the unique solution of (GL) of the form (8) when $\lambda \ge 2n^2$ [ABGi]. In the appendix, we show that for $\lambda \ge 2n^2$, any such solution minimizes $E_r^{(n)}$.

Remark 2. The functions $f_n$ and $a_n$ also depend on λ, but we suppress this dependence for ease of notation. When it will cause no confusion, we will also drop the subscript n.

Remark 3. The discrete symmetry $\psi \mapsto \bar\psi$, $A \mapsto -A$ of (GL) interchanges $(\psi^{(n)}, A^{(n)})$ and $(\psi^{(-n)}, A^{(-n)})$. Thus, we can assume n ≥ 0.

2.2. An estimate on the vortex profiles. The following inequality, relating the exponentially decaying quantities $f'$ and $1 - a$, plays a crucial role in the stability / instability proof.

Proposition 1. We have
$$\begin{cases} f'(r) > \dfrac{n(1-a(r))}{r}\, f(r) & \text{for } \lambda < 1, \\[2mm] f'(r) < \dfrac{n(1-a(r))}{r}\, f(r) & \text{for } \lambda > 1. \end{cases} \tag{13}$$

Proof. Define $e(r) \equiv f'(r) - \frac{n(1-a(r))}{r} f(r)$. The properties listed in Theorem 2 imply that e(r) → 0 as r → 0 and as r → ∞. Using the ODEs (11)–(12) we can derive the equation
$$(-\Delta_r + \alpha)e + \frac{e}{f}\,e' = (1-\lambda)f^2 f',$$
where
$$\alpha(r) = \frac{1}{r^2}\big(1 + n(1-a)\big)\Big(1 + \frac{r f'}{f}\Big) + f^2 + \frac{n a'}{r} > 0,$$
and the result follows from the maximum principle. □

3. The Linearized Operator

In this section, we introduce the linearized operator (or Hessian) around the n-vortex, and identify its symmetry zero-modes.

3.1. Definition of the linearized operator. We work on the real Hilbert space $X = L^2(\mathbb{R}^2;\mathbb{C}) \oplus L^2(\mathbb{R}^2;\mathbb{R}^2)$ with inner-product
$$\langle (\xi,B), (\eta,C) \rangle_X = \int_{\mathbb{R}^2} \{\mathrm{Re}(\bar\xi\eta) + B\cdot C\}.$$

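We define the linearized operator, $L_{\psi,A}$ (= the Hessian of $E(\psi,A)$) at a solution $(\psi,A)$ of (2)–(3) through the quadratic form
$$\frac{\partial^2}{\partial\epsilon\,\partial\delta} E(\psi + \epsilon\xi + \delta\eta,\ A + \epsilon B + \delta C)\Big|_{\epsilon=\delta=0} = \langle (\eta,C),\ L_{\psi,A}(\xi,B) \rangle_X$$
for all $(\xi,B), (\eta,C) \in X$. The result is
$$L_{\psi,A}\begin{pmatrix}\xi\\ B\end{pmatrix} = \begin{pmatrix} [-\Delta_A + \frac{\lambda}{2}(2|\psi|^2-1)]\xi + \frac{\lambda}{2}\psi^2\bar\xi + i[2\nabla_A\psi + \psi\nabla]\cdot B \\[1mm] \mathrm{Im}([\overline{\nabla_A\psi} - \bar\psi\nabla_A]\xi) + (-\Delta + \nabla\nabla + |\psi|^2)\cdot B \end{pmatrix}.$$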


3.2. Symmetry zero-modes. We identify the part of the kernel of the operator $L^{(n)} \equiv L_{\psi^{(n)},A^{(n)}}$ which is due to the symmetry group.

Proposition 2. We have
1. $$L^{(n)}\begin{pmatrix} i\gamma\psi^{(n)} \\ \nabla\gamma \end{pmatrix} = 0 \tag{14}$$ for any $\gamma : \mathbb{R}^2 \to \mathbb{R}$.
2. $$L^{(n)}\begin{pmatrix} \partial_j\psi^{(n)} \\ \partial_j A^{(n)} \end{pmatrix} = 0 \tag{15}$$ for j = 1, 2.

Proof. We use the basic result that the generator of a one-parameter group of symmetries of $E(\psi,A)$, applied to the n-vortex, lies in the kernel of $L^{(n)}$. The vector in (14) is easily seen to be the generator of a one-parameter family of gauge transformations (4)–(5) applied to the n-vortex. Similarly, the vector in (15) is the generator of coordinate translations applied to the n-vortex. □

Remark 4. Applying the generator of the coordinate rotational symmetry (6) to the n-vortex gives us nothing new. This is covered by the gauge-symmetry case.

We define $Z_{sym}$ to be the subspace of X spanned by the $L^2$ zero-modes described in Proposition 2. We recall that the n-vortex is called stable if there is a constant c > 0 such that
$$L^{(n)}\big|_{Z_{sym}^{\perp}} \ge c, \tag{16}$$
and unstable if $L^{(n)}$ has a negative eigenvalue.

3.3. Gauge fixing. In order to remove the infinite dimensional kernel of $L^{(n)}$ arising from gauge symmetry, we restrict the class of perturbations. Specifically, we restrict $L^{(n)}$ to the space of those perturbations $(\xi,B) \in X$ which are orthogonal to the $L^2$ gauge zero-modes (14). That is,
$$\left\langle \begin{pmatrix}\xi\\ B\end{pmatrix}, \begin{pmatrix} i\gamma\psi^{(n)} \\ \nabla\gamma \end{pmatrix} \right\rangle_X = 0$$
for all γ. Integration by parts gives the gauge condition
$$\mathrm{Im}(\bar\psi^{(n)}\xi) = \nabla\cdot B. \tag{17}$$
As is done in [S], we consider a modified quadratic form $\tilde L^{(n)}$, defined by
$$\langle \alpha, \tilde L^{(n)}\alpha \rangle = \langle \alpha, L^{(n)}\alpha \rangle + \int (\mathrm{Im}(\bar\psi^{(n)}\xi) - \nabla\cdot B)^2$$


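for $\alpha = (\xi,B) \in X$. Clearly, $\tilde L^{(n)}$ agrees with $L^{(n)}$ on the subspace of X specified by the gauge condition (17). This modification has the important effect of shifting the essential spectrum away from zero (see (26)). A straightforward computation gives the following expression for $\tilde L^{(n)}$:
$$\tilde L^{(n)}\begin{pmatrix}\xi\\ B\end{pmatrix} = \begin{pmatrix} [-\Delta_A + \frac{\lambda}{2}(2|\psi|^2-1) + \frac{1}{2}|\psi|^2]\xi + \frac{1}{2}(\lambda-1)\psi^2\bar\xi + 2i\nabla_A\psi\cdot B \\[1mm] 2\,\mathrm{Im}[\overline{\nabla_A\psi}\,\xi] + [-\Delta + |\psi|^2]B \end{pmatrix}.$$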
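The "straightforward computation" can be sketched as follows. The sketch assumes $\nabla_A = \nabla - iA$ with A real (a convention not restated in this section), writes $g$ for the gauge defect, and uses the hypothetical symbol $G$ for the operator generated by the added term; both names are introduced here for illustration only.

```latex
% Assumptions: \nabla_A = \nabla - iA, A real; g and G are illustrative names.
\begin{align*}
g &\equiv \mathrm{Im}(\bar\psi\xi) - \nabla\cdot B, \qquad
\langle (\eta,C), G(\xi,B)\rangle_X \equiv \int g\,\big(\mathrm{Im}(\bar\psi\eta) - \nabla\cdot C\big), \\
% Im(\bar\psi\eta) = Re(\bar\eta\, i\psi), and -\int g\,\nabla\cdot C = \int \nabla g\cdot C
G\begin{pmatrix}\xi\\ B\end{pmatrix} &= \begin{pmatrix} i\psi\, g \\ \nabla g \end{pmatrix}, \\
i\psi\, g &= \tfrac12|\psi|^2\xi - \tfrac12\psi^2\bar\xi - i\psi\,\nabla\cdot B, \\
\nabla g &= \mathrm{Im}\big(\overline{\nabla_A\psi}\,\xi + \bar\psi\,\nabla_A\xi\big) - \nabla(\nabla\cdot B).
\end{align*}
% Adding G to L_{\psi,A}:
%   \tfrac{\lambda}{2}\psi^2\bar\xi - \tfrac12\psi^2\bar\xi = \tfrac12(\lambda-1)\psi^2\bar\xi,
%   i[2\nabla_A\psi + \psi\nabla]\cdot B - i\psi\nabla\cdot B = 2i\nabla_A\psi\cdot B,
%   Im([\overline{\nabla_A\psi} - \bar\psi\nabla_A]\xi) + Im(\overline{\nabla_A\psi}\xi + \bar\psi\nabla_A\xi)
%      = 2 Im(\overline{\nabla_A\psi}\,\xi),
%   (-\Delta + \nabla\nabla + |\psi|^2)B - \nabla\nabla\cdot B = (-\Delta + |\psi|^2)B.
```

The cancellation of the $\nabla\nabla\cdot B$ term is what makes the B-block uniformly elliptic, consistent with the remark that the essential spectrum is shifted away from zero.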

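To establish Theorem 1, it suffices to prove that $\tilde L^{(n)} \ge c > 0$ on the subspace of X orthogonal to the translational zero-modes (15). $\tilde L^{(n)}$ is a real-linear operator on X. It is convenient to identify $L^2(\mathbb{R}^2;\mathbb{R}^2)$ with $L^2(\mathbb{R}^2;\mathbb{C})$ through the correspondence
$$B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} \leftrightarrow B^c \equiv B_1 - iB_2, \tag{18}$$
and then to complexify the space $X \mapsto \tilde X = [L^2(\mathbb{R}^2;\mathbb{C})]^4$ via
$$(\xi, B) \mapsto (\xi, \bar\xi, B^c, \bar B^c). \tag{19}$$
As a result, $\tilde L^{(n)}$ is replaced by the complex-linear operator
$$\tilde{\tilde L}^{(n)} = \mathrm{diag}\{-\Delta_A, -\overline{\Delta_A}, -\Delta, -\Delta\} + V^{(n)},$$
where
$$V^{(n)} = \begin{pmatrix}
\frac{\lambda}{2}(2|\psi|^2-1) + \frac12|\psi|^2 & \frac12(\lambda-1)\psi^2 & -i(\partial_A^*\psi) & i(\partial_A\psi) \\
\frac12(\lambda-1)\bar\psi^2 & \frac{\lambda}{2}(2|\psi|^2-1) + \frac12|\psi|^2 & -i(\overline{\partial_A\psi}) & i(\overline{\partial_A^*\psi}) \\
i(\overline{\partial_A^*\psi}) & i(\partial_A\psi) & |\psi|^2 & 0 \\
-i(\overline{\partial_A\psi}) & -i(\partial_A^*\psi) & 0 & |\psi|^2
\end{pmatrix}.$$
Here we have used the notation $\partial_A \equiv \partial_z - iA$, where $\partial_z = \partial_1 - i\partial_2$ (and the superscript c has been dropped from the complex function A obtained from the vector-field A via (18)). The components of $V^{(n)}$ are bounded, and it follows from standard results ([RSII]) that $\tilde{\tilde L}^{(n)}$ is a self-adjoint operator on $\tilde X$, with domain
$$D(\tilde{\tilde L}^{(n)}) = [H^2(\mathbb{R}^2;\mathbb{C})]^4.$$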
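To illustrate how the off-diagonal entries of $V^{(n)}$ arise, the following sketch rewrites the coupling term $2i\nabla_A\psi\cdot B$ in the complex variables of (18). It assumes $\nabla_A = \nabla - iA$ and uses $\partial_A^* = -\partial_{\bar z} + i\bar A$ with $\partial_{\bar z} = \partial_1 + i\partial_2$, as stated in Sect. 6.1; the shorthand $w_j$ is introduced only for this computation.

```latex
% Shorthand (illustrative): w_j = (\nabla_A\psi)_j = (\partial_j - iA_j)\psi.
\begin{align*}
\partial_A\psi &= w_1 - i w_2, \qquad \partial_A^*\psi = -(w_1 + i w_2), \\
B_1 &= \tfrac12(B^c + \bar B^c), \qquad B_2 = \tfrac{i}{2}(B^c - \bar B^c), \\
2i\,\nabla_A\psi\cdot B &= 2i(w_1 B_1 + w_2 B_2)
   = i(w_1 + i w_2)B^c + i(w_1 - i w_2)\bar B^c \\
  &= -i(\partial_A^*\psi)\,B^c + i(\partial_A\psi)\,\bar B^c .
\end{align*}
% These are the (1,3) and (1,4) entries of V^{(n)}; the remaining off-diagonal
% entries follow by complex conjugation and by Hermitian symmetry.
```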


4. Block Decomposition

We write functions on $\mathbb{R}^2$ in polar coordinates. Precisely,
$$\tilde X = [L^2(\mathbb{R}^2;\mathbb{C})]^4 = [L^2_{rad} \otimes L^2(S^1;\mathbb{C})]^4, \tag{20}$$
where $L^2_{rad} \equiv L^2(\mathbb{R}^+, r\,dr)$. Let $\rho_n : U(1) \to \mathrm{Aut}([L^2(S^1;\mathbb{C})]^4)$ be the representation whose action is given by
$$\rho_n(e^{i\theta})(\xi,\eta,B,C)(x) = (e^{in\theta}\xi,\ e^{-in\theta}\eta,\ e^{-i\theta}B,\ e^{i\theta}C)(R_{-\theta}x),$$
where $R_\alpha$ is a counter-clockwise rotation in $\mathbb{R}^2$ through the angle α. It is easily checked that the linearized operator $\tilde{\tilde L}^{(n)}$ commutes with $\rho_n(g)$ for any $g \in U(1)$. It follows that $\tilde{\tilde L}^{(n)}$ leaves invariant the eigenspaces of $d\rho_n(s)$ for any $s \in i\mathbb{R} = \mathrm{Lie}(U(1))$. The resulting block decomposition of $\tilde{\tilde L}^{(n)}$, which is described in this section, is essential to our analysis. In particular, the translational zero-modes each lie within a single subspace of this decomposition.

4.1. The decomposition of $L^{(n)}$. In what follows, we define, for convenience, $b(r) = \frac{n(1-a(r))}{r}$.

Proposition 3. There is an orthogonal decomposition
$$\tilde X = \bigoplus_{m\in\mathbb{Z}} \big(e^{i(m+n)\theta}L^2_{rad} \oplus e^{i(m-n)\theta}L^2_{rad} \oplus -ie^{i(m-1)\theta}L^2_{rad} \oplus ie^{i(m+1)\theta}L^2_{rad}\big), \tag{21}$$
under which the linearized operator around the vortex, $\tilde{\tilde L}^{(n)}$, decomposes as
$$\tilde{\tilde L}^{(n)} = \bigoplus_{m\in\mathbb{Z}} \hat L_m^{(n)},$$
where
$$\hat L_m^{(n)} = -\Delta_r(Id) + \hat V_m^{(n)}$$
with
$$\hat V_m^{(n)} = \frac{1}{r^2}\,\mathrm{diag}\{[m + n(1-a)]^2,\ [m - n(1-a)]^2,\ [m-1]^2,\ [m+1]^2\} + V'$$
and
$$V' = \begin{pmatrix}
\frac{\lambda}{2}(2f^2-1) + \frac12 f^2 & \frac12(\lambda-1)f^2 & f' - bf & -[f' + bf] \\
\frac12(\lambda-1)f^2 & \frac{\lambda}{2}(2f^2-1) + \frac12 f^2 & -[f' + bf] & f' - bf \\
f' - bf & -[f' + bf] & f^2 & 0 \\
-[f' + bf] & f' - bf & 0 & f^2
\end{pmatrix}. \tag{22}$$


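Proof. The decomposition (21) of $\tilde X$ follows from the usual Fourier decomposition of $L^2(S^1;\mathbb{C})$, and the relation (20). An easy computation shows that $\tilde{\tilde L}^{(n)}$ preserves the space of vectors of the form
$$(\xi e^{i(m+n)\theta},\ \eta e^{i(m-n)\theta},\ -i\alpha e^{i(m-1)\theta},\ i\beta e^{i(m+1)\theta}) \tag{23}$$
and that it acts on such vectors via (22). □

It follows that $\hat L_m^{(n)}$ is self-adjoint on $[L^2_{rad}]^4$. It will also be convenient to work with a rotated version of the operator $\hat L_m^{(n)}$,
$$L_m^{(n)} \equiv \begin{cases} R\,\hat L_m^{(n)} R^T & m \ge 0, \\ R'\,\hat L_m^{(n)} (R')^T & m < 0, \end{cases}$$
where
$$R = \frac{1}{\sqrt2}\begin{pmatrix} 1 & 1 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix}, \qquad
R' = \frac{1}{\sqrt2}\begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix}.$$
We have
$$L_m^{(n)} = -\Delta_r(Id) + V_m^{(n)}, \tag{24}$$
where
$$V_m^{(n)} = \begin{pmatrix}
\frac{m^2}{r^2} + b^2 + \frac{\lambda}{2}(3f^2-1) & -2|m|\frac{b}{r} & -2bf & 0 \\
-2|m|\frac{b}{r} & \frac{m^2}{r^2} + b^2 + \frac{\lambda}{2}(f^2-1) + f^2 & 0 & -2f' \\
-2bf & 0 & \frac{m^2+1}{r^2} + f^2 & -2\frac{|m|}{r^2} \\
0 & -2f' & -2\frac{|m|}{r^2} & \frac{m^2+1}{r^2} + f^2
\end{pmatrix}.$$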
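The rotation can be checked entry by entry. As a sketch for m ≥ 0, the upper-left 2×2 block of $V_m^{(n)}$ follows from the first two rows of R acting on the corresponding block of $\hat V_m^{(n)}$; the shorthands $D_\pm$, $p$, $s$, $u_1$, $u_2$ below are introduced only for this illustration.

```latex
% Illustrative shorthands: u_1 = (e_1 + e_2)/\sqrt2, u_2 = (-e_1 + e_2)/\sqrt2 (rows of R),
% p = \tfrac{\lambda}{2}(2f^2-1) + \tfrac12 f^2, s = \tfrac12(\lambda-1)f^2, and n(1-a) = rb.
\begin{align*}
D_\pm &\equiv \frac{[m \pm n(1-a)]^2}{r^2} + p
       = \frac{m^2}{r^2} \pm \frac{2mb}{r} + b^2 + p, \\
\langle u_1, \hat V u_1\rangle &= \tfrac12(D_+ + D_-) + s
       = \frac{m^2}{r^2} + b^2 + \tfrac{\lambda}{2}(3f^2-1), \\
\langle u_2, \hat V u_2\rangle &= \tfrac12(D_+ + D_-) - s
       = \frac{m^2}{r^2} + b^2 + \tfrac{\lambda}{2}(f^2-1) + f^2, \\
\langle u_1, \hat V u_2\rangle &= \tfrac12(D_- - D_+) = -\frac{2mb}{r}.
\end{align*}
% The lower-right block is analogous:
% \tfrac12[(m-1)^2 + (m+1)^2] = m^2 + 1 and \tfrac12[(m-1)^2 - (m+1)^2] = -2m.
```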

4.2. Properties of $L_m^{(n)}$.

Proposition 4. We have the following:
1. $$L_m^{(n)} = L_{-m}^{(n)}. \tag{25}$$
2. $$\sigma_{ess}(L_m^{(n)}) = [\min(1,\lambda), \infty). \tag{26}$$
3. For |n| = 1 and |m| ≥ 2, $$L_m^{(n)} - L_1^{(n)} \ge 0 \text{ with no zero-eigenvalue.} \tag{27}$$


Proof. The first statement is obvious. The second statement follows in a standard way from the fact that
$$\lim_{r\to\infty} V_m^{(n)}(r) = \mathrm{diag}\{\lambda, 1, 1, 1\}.$$
To prove the third statement, we compute
$$\hat L_m^{(n)} - \hat L_1^{(n)} = \frac{m-1}{r^2}\,\mathrm{diag}\{m + 1 + 2n(1-a),\ m + 1 - 2n(1-a),\ m - 1,\ m + 3\},$$
which is non-negative, with no zero-eigenvalue for m ≥ 2, n = 1. □
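The diagonal difference above is a difference of squares in each entry. A short worked check, using only the diagonal part of $\hat V_m^{(n)}$ from Proposition 3:

```latex
\begin{align*}
[m + n(1-a)]^2 - [1 + n(1-a)]^2 &= (m-1)\big(m + 1 + 2n(1-a)\big), \\
[m - n(1-a)]^2 - [1 - n(1-a)]^2 &= (m-1)\big(m + 1 - 2n(1-a)\big), \\
(m-1)^2 - 0^2 &= (m-1)(m-1), \\
(m+1)^2 - 2^2 &= (m-1)(m+3).
\end{align*}
% For n = 1 and 0 < a < 1 one has 0 < 2n(1-a) < 2, so for m >= 2 every entry is
% strictly positive on (0, \infty); the matrix V' cancels in the difference.
```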

Remark 5. In light of (25), we can assume from now on that m ≥ 0. This degeneracy is a result of the complexification (19) of the space of perturbations.

4.3. Translational zero-modes. The gauge fixing (Sect. 3.3) has eliminated the zero-modes arising from gauge symmetry. The translational zero-modes remain. As written in (15), the translational zero-modes fail to satisfy the gauge condition (17). Further, they do not lie in $L^2$. A straightforward computation shows that if we adjust the vectors in (15) by gauge zero-modes given by (14) with $\gamma = -A_j$, j = 1, 2, we obtain
$$T_1 = \begin{pmatrix} (\nabla_A\psi)_1 \\ (\nabla\times A)e_2 \end{pmatrix}, \qquad T_2 = \begin{pmatrix} (\nabla_A\psi)_2 \\ -(\nabla\times A)e_1 \end{pmatrix},$$
where $e_1 = (1,0)$ and $e_2 = (0,1)$. $T_1$ and $T_2$ satisfy (17), and are zero-modes of the linearized operator. Note also that $T_{\pm1}$ decay exponentially as |x| → ∞, and hence lie in $L^2$. It is easily checked that $T_1 \pm iT_2$ lie in the m = ±1 blocks for $\hat L_m^{(n)}$. After rotation by R, we have
$$L_{\pm1}^{(n)} T = 0, \quad \text{where } T = \Big(f',\ bf,\ n\frac{a'}{r},\ n\frac{a'}{r}\Big).$$
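The gauge adjustment that produces $T_1$ and $T_2$ can be written out explicitly. The sketch below assumes $\nabla_A = \nabla - iA$ and the two-dimensional scalar curl $\nabla\times A = \partial_1 A_2 - \partial_2 A_1$ (conventions not restated in this section).

```latex
% Assumptions: \nabla_A = \nabla - iA, \nabla\times A = \partial_1 A_2 - \partial_2 A_1.
\begin{align*}
\begin{pmatrix}\partial_j\psi \\ \partial_j A\end{pmatrix}
 + \begin{pmatrix} i\gamma\psi \\ \nabla\gamma \end{pmatrix}\Big|_{\gamma = -A_j}
 &= \begin{pmatrix} (\partial_j - iA_j)\psi \\ \partial_j A - \nabla A_j \end{pmatrix}
  = \begin{pmatrix} (\nabla_A\psi)_j \\ \partial_j A - \nabla A_j \end{pmatrix}, \\
j = 1:\quad \partial_1 A - \nabla A_1 &= (0,\ \partial_1 A_2 - \partial_2 A_1) = (\nabla\times A)\,e_2, \\
j = 2:\quad \partial_2 A - \nabla A_2 &= (\partial_2 A_1 - \partial_1 A_2,\ 0) = -(\nabla\times A)\,e_1 .
\end{align*}
```

Both components are now gauge-covariant quantities, which is consistent with the exponential decay noted above.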

5. Stability of the Fundamental Vortices

In this section we prove the first part of Theorem 1. Specifically, we show that for some c > 0, $L_m^{(\pm1)} \ge c$ for m ≠ 1, and $L_1^{(\pm1)}|_{T^\perp} \ge c$. In light of the discussions in Sects. 3.3, 4.1, and 4.3, this will establish the stability of the ±1-vortices.


5.1. Non-negativity of $L_0^{(n)}$ and radial minimization.

Proposition 5. $L_0^{(n)} \ge 0$ for all λ.

Proof. From the expression (24) we see that $L_0^{(n)}$ breaks up:
$$L_0^{(n)} = N_0 \oplus M_0 \tag{28}$$
(abusing notation slightly) where $M_0 = -\Delta_r(Id) + W_0$ with
$$W_0 = \begin{pmatrix} b^2 + \frac{\lambda}{2}(3f_n^2 - 1) & -2bf \\ -2bf & \frac{1}{r^2} + f^2 \end{pmatrix}$$
and
$$N_0 = \begin{pmatrix} -\Delta_r + b^2 + \frac{\lambda}{2}(f^2-1) + f^2 & -2f' \\ -2f' & -\Delta_r + \frac{1}{r^2} + f^2 \end{pmatrix}.$$
An easy computation shows that $M_0$ is precisely the Hessian of the radial energy, $\mathrm{Hess}\,E_r^{(n)}$ (see (10)). Since the n-vortex minimizes $E_r^{(n)}$, we have $M_0 \ge 0$. It remains to show $N_0 \ge 0$. We establish the stronger result, $N_0 > 0$. Note that $N_0 = G_0^* G_0$, where
$$G_0 = \begin{pmatrix} \partial_r - f'/f & f \\ f & \partial_r + 1/r \end{pmatrix}.$$
In fact, $G_0$ has no zero-eigenvalue. To see this, we exploit some known results about the kernel of $G_0$ at λ = 1. In Sect. 6, we will show that at λ = 1, the full linearized operator is the square of a first-order differential operator, F: $\tilde L^{(n)}|_{\lambda=1} = F^*F$. The operator F was analyzed in [S], where it was shown to be Fredholm with index 2|n|. The operator $F_0 \equiv G_0|_{\lambda=1}$ is F restricted to a particular invariant subspace. Thus $F_0$ is a Fredholm operator from its domain to $L^2_{rad}$. The kernels of F and $F^*$ are known precisely (see [S] and Sect. 6), and it follows that $F_0$ has index zero. Now, $G_0$ is a relatively compact perturbation of $F_0$ (due to the decay of the field components; see, again, [S]), and hence $G_0$ is also Fredholm with index zero. Finally, it is a simple matter to check that $G_0^*$ has trivial kernel. If
$$G_0^*\begin{pmatrix}\xi\\ \beta\end{pmatrix} = 0,$$
it follows that $(-\Delta_r + f^2)\beta = 0$ and hence that β = 0, and so ξ = 0. The relation $N_0 > 0$ follows from this, and the fact that $\sigma_{ess}(N_0) = [1,\infty)$. □
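The factorization $N_0 = G_0^* G_0$ used in the proof can be verified directly. The sketch below assumes adjoints are taken in $L^2(r\,dr)$, so that $\partial_r^* = -\partial_r - 1/r$, and that $\Delta_r = \partial_r^2 + r^{-1}\partial_r$.

```latex
% Assumptions: adjoints in L^2(r\,dr), \partial_r^* = -\partial_r - 1/r.
\begin{align*}
(\partial_r - f'/f)^* &= -\partial_r - \tfrac1r - \tfrac{f'}{f}, \qquad
(\partial_r + \tfrac1r)^* = -\partial_r, \\
(\partial_r - f'/f)^*(\partial_r - f'/f) &= -\Delta_r + \frac{\Delta_r f}{f}
   = -\Delta_r + b^2 + \tfrac{\lambda}{2}(f^2-1) \quad\text{by (11)}, \\
(\partial_r + \tfrac1r)^*(\partial_r + \tfrac1r) &= -\partial_r^2 - \tfrac1r\partial_r + \tfrac{1}{r^2}
   = -\Delta_r + \tfrac{1}{r^2}, \\
(\partial_r - f'/f)^* f + f(\partial_r + \tfrac1r) &= -2f' .
\end{align*}
% Adding f^2 to each diagonal entry (from the off-diagonal f in G_0) reproduces N_0.
% Similarly, G_0^*(\xi,\beta) = 0 gives f\xi = \beta' and then (-\Delta_r + f^2)\beta = 0.
```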


5.2. A maximum principle argument. Removing the equality in Proposition 5 requires more work. First, we establish an extension of the maximum principle to systems (see, e.g., [LM,PA] for related results). We will use this also in the proof that the translational zero-mode is the ground state of $L_1^{(n)}$ (Sect. 5.4).

Proposition 6. Let L be a self-adjoint operator on $L^2(\mathbb{R}^n;\mathbb{R}^d)$ of the form $L = -\Delta(Id) + V$, where V is a d × d matrix-multiplication operator with smooth entries. Suppose that L ≥ 0 and that for i ≠ j, $V_{ij}(x) \le 0$ for all x. Further, suppose V is irreducible in the sense that for any splitting of the set {1, ..., d} into disjoint sets $S_1$ and $S_2$, there is an $i \in S_1$ and a $j \in S_2$ with $V_{ij}(x) < 0$ for all x. Finally, suppose that $L\xi = \eta \in L^2$ with η ≥ 0 component-wise, and ξ ≢ 0. Then either
1. ξ > 0 or
2. η ≡ 0 and ξ < 0.

Proof. We write $\xi = \xi^+ - \xi^-$ with $\xi^+, \xi^- \ge 0$ component-wise, and compute
$$0 \le \langle \xi^-, L\xi^- \rangle = \langle \xi^-, L\xi^+ \rangle - \langle \xi^-, L\xi \rangle.$$
Since $\xi_j^+$ and $\xi_j^-$ have disjoint support, we have
$$\text{r.h.s.} = \sum_{j\ne k} \langle \xi_j^-, V_{jk}\xi_k^+ \rangle - \langle \xi^-, \eta \rangle \le 0.$$
Thus we have
1. $0 = \langle \xi^-, L\xi^- \rangle$.
2. $0 = \langle \xi_j^-, V_{jk}\xi_k^+ \rangle$ for all j ≠ k.
Since L ≥ 0, the first of these implies $L\xi^- = 0$ and hence $L\xi^+ = \eta$. So if η ≢ 0, then $\xi^+ \not\equiv 0$. If η ≡ 0 and $\xi^+ \equiv 0$, replace ξ with −ξ in what follows. An application of the strong maximum principle (e.g. [GT], Thm. 8.19) to each component of the equation $L\xi^+ = \eta$ now allows us to conclude that for each k, either $\xi_k^+ > 0$ or $\xi_k^+ \equiv 0$. We know that for some k, $\xi_k^+ > 0$. Looking back at the second listed equation above, and using the irreducibility of V, we then see that $\xi_j^- \equiv 0$ for all j. Finally, we can easily rule out the possibility $\xi_k \equiv 0$ for some k, by looking back at the equation satisfied by $\xi_k$. Thus we have ξ > 0. □

5.3. Positivity of $L_0^{(n)}$. Now we apply Proposition 6 to show $M_0 > 0$. The trick here is to find a function ξ which satisfies $M_0\xi \ge 0$. This allows us to rule out the existence of a zero-eigenvector, which would be positive by Proposition 6. To obtain such a ξ, we differentiate the vortex with respect to the parameter λ. Specifically, differentiation of the Ginzburg–Landau equations with respect to λ results in
$$M_0\xi = \eta, \tag{29}$$
where
$$\xi = \begin{pmatrix} \partial_\lambda f \\ n\,\partial_\lambda a / r \end{pmatrix} \quad\text{and}\quad \eta = \begin{pmatrix} \frac12(1 - f^2)f \\ 0 \end{pmatrix} \ge 0.$$
We can now establish

Proposition 7. For all λ, $L_0^{(n)} \ge c > 0$.

Proof. We have already shown in the proof of Proposition 5 that $N_0 > 0$ and $M_0 \ge 0$. Hence, due to (28) and (26), it suffices to show that $\mathrm{Null}(M_0) = \{0\}$. Suppose $M_0\zeta = 0$, ζ ≢ 0. Proposition 6 then implies ζ > 0 (or else take −ζ). Now
$$0 = \langle M_0\zeta, \xi \rangle = \langle \zeta, M_0\xi \rangle = \langle \zeta, \eta \rangle > 0$$
gives a contradiction. □
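Equation (29) can be checked by differentiating the ODEs (11)–(12) in λ. The sketch below assumes the profiles depend differentiably on λ and writes $\dot f \equiv \partial_\lambda f$, $\dot a \equiv \partial_\lambda a$ (notation introduced here for brevity), with $b = n(1-a)/r$ so that $\partial_\lambda b = -n\dot a/r$.

```latex
% Assumption: f, a are differentiable in \lambda; \dot{f} = \partial_\lambda f, \dot{a} = \partial_\lambda a.
\begin{align*}
\partial_\lambda\,(11):\quad
 &\big[-\Delta_r + b^2 + \tfrac{\lambda}{2}(3f^2-1)\big]\dot f
   - 2bf\,\frac{n\dot a}{r} = \tfrac12(1-f^2)f, \\
\partial_\lambda\,(12):\quad
 &-\ddot{a}\,'' + \frac{\dot a'}{r} + f^2\dot a = 2f\dot f\,(1-a)
 \quad\text{with } \ddot{a}\,'' \text{ read as } (\dot a)'' , \\
\text{using}\quad
 &\big(-\Delta_r + \tfrac{1}{r^2}\big)\frac{\dot a}{r}
   = \frac{1}{r}\Big(-(\dot a)'' + \frac{\dot a'}{r}\Big), \\
\Longrightarrow\quad
 &-2bf\,\dot f + \big[-\Delta_r + \tfrac{1}{r^2} + f^2\big]\frac{n\dot a}{r} = 0 .
\end{align*}
% The two resulting lines are the rows of M_0 \xi = \eta with
% \xi = (\dot f, n\dot a/r) and \eta = (\tfrac12(1-f^2)f, 0).
```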

Remark 6. Proposition 6 applied to Eq. (29) also gives ξ > 0. That is, the vortex profiles increase monotonically with λ. This can be used to show that the rescaled vortex $(f_n(r/\sqrt\lambda), a_n(r/\sqrt\lambda))$ converges as λ → ∞ to $(f^*, 0)$, where $f^*$ is (the profile of) the n-vortex solution of the ordinary GL equation: $-\Delta_r f^* + n^2 f^*/r^2 + (f^{*2} - 1)f^* = 0$. This result was established by different means in [ABG].

5.4. Positivity of $L_1^{(\pm1)}$.

Proposition 8. $L_1^{(\pm1)} \ge 0$ with non-degenerate zero-eigenvalue given by T.

Proof. Let $\mu = \inf \mathrm{spec}\,L_1^{(\pm1)} \le 0$, which is an eigenvalue by (26). Suppose $L_1^{(\pm1)} S = \mu S$. Applying Proposition 6 to $L_1^{(\pm1)} - \mu$ (note that $V_1^{(\pm1)}$ satisfies the irreducibility requirement) gives S > 0 (or S < 0). Further, µ is non-degenerate, as if µ were degenerate, we would have two strictly positive eigenfunctions which are orthogonal, an impossibility. Now if µ < 0, we have $\langle S, T \rangle = 0$, which is also impossible. Thus S is a multiple of T, and µ = 0. □

5.5. Completion of stability proof for n = ±1. We are now in a position to complete the proof of the first statement of Theorem 1. By Proposition 7, $L_0^{(\pm1)} \ge c > 0$. By Proposition 8 and (26), $L_1^{(\pm1)}|_{T^\perp} \ge \tilde c > 0$. Finally, by (27), $L_m^{(\pm1)} \ge c' > 0$ for |m| ≥ 2. It follows from Proposition 3 that $\tilde L^{(n)} \ge c > 0$ on the subspace of X orthogonal to the translational zero-modes. By the discussion of Sect. 3.3, this gives Theorem 1 for n = ±1. □

6. The Critical Case, λ = 1

In order to prove the remainder of Theorem 1, we exploit some results from the λ = 1 case.


6.1. The first-order equations. Following [B], we use an integration by parts to rewrite the energy (1) as
$$E(\psi,A) = \frac12\int_{\mathbb{R}^2}\Big\{ |\partial_A^*\psi|^2 + \Big[\nabla\times A + \frac12(|\psi|^2-1)\Big]^2 + \frac14(\lambda-1)(|\psi|^2-1)^2 \Big\} + \pi\deg(\psi) \tag{30}$$
(recall, since we work in dimension two, ∇ × A is a scalar) where deg(ψ) is the topological degree of ψ, defined in the introduction. We assume, without loss of generality, that deg(ψ) ≥ 0. Clearly, when λ = 1, a solution of the first-order equations
$$\partial_A^*\psi = 0, \tag{31}$$
$$\nabla\times A + \frac12(|\psi|^2-1) = 0 \tag{32}$$
minimizes the energy within a fixed topological sector, deg(ψ) = n, and hence solves GL. Note that we have identified the vector-field A with a complex field as in (18). The n-vortices (9) are solutions of these equations (when λ = 1). Specifically,
$$n\frac{a'}{r} = \frac12(1 - f^2) \tag{33}$$
and
$$f' = n\frac{(1-a)f}{r}. \tag{34}$$
In fact, it is shown in [T2] that for λ = 1, any solution of the variational equations solves the first-order equations (31)–(32). Beginning from expression (30) for the energy, the variational equations (previously written as (2)–(3)) can be written as
$$\partial_A[\partial_A^*\psi] + \psi\Big[\nabla\times A + \frac12(|\psi|^2-1)\Big] + \frac12(\lambda-1)(|\psi|^2-1)\psi = 0, \tag{35}$$
$$i\bar\psi[\partial_A^*\psi] - i\partial_{\bar z}\Big[\nabla\times A + \frac12(|\psi|^2-1)\Big] = 0 \tag{36}$$
(here $\partial_A^* \equiv -\partial_{\bar z} + i\bar A$ is the adjoint of $\partial_A$).

6.2. First-order linearized operator. We show that the linearized operator at λ = 1 is the square of the linearized operator for the first-order equations. Linearizing the first-order equations (31)–(32) about a solution, (ψ, A) (of the first-order equations) results in the following equations for the perturbation, α ≡ (ξ, B):
$$\partial_A^*\xi + i\psi\bar B = 0,$$
$$\nabla\times B + \mathrm{Re}(\bar\psi\xi) = 0.$$


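Now using $i\partial_{\bar z}B = \nabla\times B + i(\nabla\cdot B)$, and adding in the gauge condition (17), we can rewrite this as
$$F\alpha = 0, \quad\text{where}\quad F = \begin{pmatrix} \partial_A^* & i\psi(\bar{\,\cdot\,}) \\ \psi(\bar{\,\cdot\,}) & i\partial_{\bar z} \end{pmatrix}. \tag{37}$$
If we linearize the full (second order) variational equations (in the form (35)–(36)) around (ψ, A), we obtain
$$\partial_A[\partial_A^*\xi + i\bar B\psi] + i\bar B[\partial_A^*\psi] + \psi[\nabla\times B + \mathrm{Re}(\bar\psi\xi)] + \xi\Big[\nabla\times A + \frac12(|\psi|^2-1)\Big] + \frac12(\lambda-1)\big[(|\psi|^2-1)\xi + 2\psi\,\mathrm{Re}(\bar\psi\xi)\big] = 0$$
and
$$i\bar\psi[\partial_A^*\xi + i\bar B\psi] + i\bar\xi[\partial_A^*\psi] - i\partial_{\bar z}[\nabla\times B + \mathrm{Re}(\bar\psi\xi)] = 0.$$

Proposition 9. When λ = 1, these linearized equations can also be written $F^*F\alpha = 0$.

Proof. This is a simple computation using the fact that the first-order equations (31)–(32) hold. □

This relation holds also on the level of the blocks. A straightforward computation gives
$$L_m^{(n)}\big|_{\lambda=1} = F_m^* F_m,$$
where
$$F_m = \begin{pmatrix}
\partial_r - b & \frac{m}{r} & f & 0 \\
\frac{m}{r} & \partial_r - b & 0 & f \\
f & 0 & \partial_r + 1/r & -\frac{m}{r} \\
0 & f & -\frac{m}{r} & \partial_r + 1/r
\end{pmatrix}.$$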

6.3. Zero-modes for λ = 1. It was predicted in [W] (and proved rigorously in [S]) that for λ = 1, the linearized operator around any degree-n solution of the first-order equations has a 2|n|-dimensional kernel (modulo gauge transformations). This kernel arises because the Taubes solutions form a 2|n|-parameter family, and all have the same energy. The zero-eigenvalues are identified in [B], and we describe them here. Let $\chi_m$ be the unique solution of
$$\Big(-\Delta_r + \frac{m^2}{r^2} + f^2\Big)\chi_m = 0$$
on (0, ∞) with
$$\chi_m \sim r^{-m} \quad\text{as}\quad r \to 0$$
and
$$\chi_m \to 0 \quad\text{as}\quad r \to \infty$$
for m = 1, 2, ..., n. Then it is easy to check that when λ = 1,
$$F_m W_m = 0, \tag{38}$$
where
$$W_m = \begin{pmatrix} f\chi_m \\ f\chi_m \\ -(\chi_m' + m\chi_m/r) \\ -(\chi_m' + m\chi_m/r) \end{pmatrix}.$$
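The identity (38) can be checked row by row. The sketch below uses the block form of $F_m$ from Sect. 6.2, the first-order equation (34) in the form $f' = bf$ (valid at λ = 1), and the defining equation of $\chi_m$; the abbreviation $\chi \equiv \chi_m$ is for brevity only.

```latex
% At \lambda = 1: f' = bf by (34). Write \chi = \chi_m.
\begin{align*}
\text{rows 1, 2:}\quad
 &(\partial_r - b)(f\chi) + \tfrac{m}{r}f\chi - f\big(\chi' + \tfrac{m}{r}\chi\big)
   = (f' - bf)\chi = 0, \\
\text{rows 3, 4:}\quad
 &f^2\chi - \big(\partial_r + \tfrac1r\big)\big(\chi' + \tfrac{m}{r}\chi\big)
   + \tfrac{m}{r}\big(\chi' + \tfrac{m}{r}\chi\big) \\
 &\qquad = -\chi'' - \tfrac1r\chi' + \tfrac{m^2}{r^2}\chi + f^2\chi
   = \Big(-\Delta_r + \tfrac{m^2}{r^2} + f^2\Big)\chi = 0 .
\end{align*}
```

Rows 1 and 2 (and likewise rows 3 and 4) coincide because the first two and the last two components of $W_m$ are equal.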

We remark that
$$\chi_1 = \frac{1-a}{r}$$
and it is easily verified that for λ = 1, $W_1 = \frac1n T$ gives the translational zero-modes.

7. The (In)stability Proof for |n| ≥ 2

Here we complete the proof of Theorem 1. The idea is to decompose $L_m^{(n)}$ into a sum of two terms, each of which has the same (translational) zero-mode (for m = 1) as $L_m^{(n)}$. One term is manifestly positive, and the other satisfies restrictions of Perron-Frobenius theory. We begin by modifying $F_m$, and defining, for any λ,
$$\tilde F_m \equiv \begin{pmatrix}
(\partial_r - \frac{f'}{f})\cdot q & \frac{m}{r} & f & 0 \\
\frac{m}{r}\,q & \partial_r - \frac{f'}{f} & 0 & f \\
f q & 0 & \partial_r + 1/r & -\frac{m}{r} \\
0 & f & -\frac{m}{r} & \partial_r + 1/r
\end{pmatrix},$$
where we have defined
$$q(r) \equiv \frac{n(1-a)f}{r f'} \tag{39}$$
and $\partial_r \cdot q$ denotes an operator composition. By (34), we have q ≡ 1 for λ = 1. We also set, for m = 1, ..., n,
$$\tilde W_m = \begin{pmatrix} q^{-1} f\chi_m \\ f\chi_m \\ -(\chi_m' + m\frac{\chi_m}{r}) \\ -(\chi_m' + m\frac{\chi_m}{r}) \end{pmatrix}.$$
Now $\tilde W_m$ has the following properties:
1. $\tilde W_1$ is the translational zero-mode $\frac1n T$ for all λ.


2. When λ = 1, $\tilde W_m = W_m$, m = 1, ..., n, give the 2|n| zero-modes (38) of the linearized operator.

These $\tilde W_m$ were chosen in [B] as candidates for directions of energy decrease (for |m| ≥ 2) when λ > 1. Intuitively, we think of $\tilde W_m$ as a perturbation that tends to break the n-vortex into separate vortices of lower degree. Now, $\tilde F_m$ was designed to have the following properties:
1. $\tilde F_m = F_m$ when λ = 1 (this is clear).
2. $\tilde F_m \tilde W_m = 0$ for all m and λ (this is easily checked).
A straightforward computation gives
$$L_m^{(n)} = \tilde F_m^* \tilde F_m + J M_m, \tag{40}$$
where J = diag{1, 0, 0, 0} and
$$M_m = l_m - q\,l_m\,q + (\lambda - q^2)f^2$$
with
$$l_m = -\Delta_r + \frac{m^2}{r^2} + b^2 + \frac{\lambda}{2}(f^2-1).$$
By construction, when m = 1, the second term in the decomposition (40) must have a zero-mode corresponding to the original translational zero-mode. In fact, one can easily check that $M_1 f' = 0$.

Proposition 10. For |n| ≥ 2, $M_1$ has a non-degenerate zero-eigenvalue corresponding to $f'$, and
$$\begin{cases} M_1 \ge 0 & \lambda < 1, \\ M_1 \le 0 & \lambda > 1 \end{cases}$$
on $L^2_{rad}$.

Proof. We recall inequality (13), which implies that for λ < 1, q < 1, and for λ > 1, q > 1. The operator $M_1$ is of the form
$$M_1 = (1 - q^2)(-\Delta_r) + \text{first order} + \text{multiplication}. \tag{41}$$
One can show that $M_1$ is bounded from below (resp. above) for λ < 1 (resp. λ > 1). We stick with the case λ < 1 for concreteness. Suppose $M_1\eta = \mu\eta$ with $\mu = \inf \mathrm{spec}\,M_1 \le 0$. Applying the maximum principle (e.g. Proposition 6 for d = 1) to (41), we conclude that η > 0. If µ < 0, we have $\langle \eta, f' \rangle = 0$, a contradiction. Thus µ = 0, and is non-degenerate by a similar argument. □

We also have

Lemma 1. For m ≥ 2, $M_m - M_1$ is non-negative for λ < 1, non-positive for λ > 1, and has no zero-eigenvalue.

Proof. This follows from the equation
$$M_m - M_1 = (1 - q^2)\,\frac{m^2-1}{r^2}.$$
□
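Since J = diag{1, 0, 0, 0}, the decomposition (40) differs from a pure square only in the (1,1) entry, which can be checked directly. The sketch below assumes adjoints in $L^2(r\,dr)$ and uses (11) in the form $\Delta_r f/f = b^2 + \frac{\lambda}{2}(f^2-1)$; the remaining entries are not verified here.

```latex
% Assumption: adjoints in L^2(r\,dr); only the (1,1) entry is checked.
\begin{align*}
(\tilde F_m^*\tilde F_m)_{11}
 &= q\,(\partial_r - \tfrac{f'}{f})^*(\partial_r - \tfrac{f'}{f})\,q
    + q\,\frac{m^2}{r^2}\,q + q\,f^2\,q \\
 &= q\Big[-\Delta_r + b^2 + \tfrac{\lambda}{2}(f^2-1) + \frac{m^2}{r^2}\Big]q + q^2 f^2
  = q\,l_m\,q + q^2 f^2, \\
(L_m^{(n)})_{11}
 &= -\Delta_r + \frac{m^2}{r^2} + b^2 + \tfrac{\lambda}{2}(3f^2-1) = l_m + \lambda f^2, \\
(L_m^{(n)})_{11} - (\tilde F_m^*\tilde F_m)_{11}
 &= l_m - q\,l_m\,q + (\lambda - q^2)f^2 = M_m .
\end{align*}
% Lemma 1 then follows from l_m - l_1 = (m^2-1)/r^2, a multiplication operator
% that commutes with q: M_m - M_1 = (1 - q^2)(m^2-1)/r^2.
```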


Completion of the proof of Theorem 1. Suppose now λ < 1. Since $\tilde F_m^* \tilde F_m$ is manifestly non-negative, and $M_m > M_1$ for m ≥ 2, we have $L_m^{(n)} \ge 0$ for m ≥ 1 (with only the translational 0-mode). Combined with (26) and Propositions 7 and 3, this gives stability of the n-vortex for λ < 1. Now suppose λ > 1. By (40), Proposition 10 and Lemma 1, we have for m = 2, ..., n,
$$\langle \tilde W_m, L_m^{(n)} \tilde W_m \rangle < 0.$$
We remark that $\tilde W_m$ corresponds to an element of the un-complexified space X, and so $L^{(n)}$ has negative eigenvalues. This establishes the instability of the n-vortex for |n| ≥ 2, λ > 1, and completes the proof of Theorem 1. □

8. Appendix: Vortex Solutions are Radial Minimizers

Proposition 11. For $\lambda \ge 2n^2$, a solution of Eqs. (11)–(12) locally minimizes $E_r^{(n)}$.

Proof. It suffices then to show $M_0 = \mathrm{Hess}\,E_r^{(n)} > 0$ (see Sect. 5.1). We write $M_0 = L_0 + Z_0$, where $L_0 = \mathrm{diag}\{l, -\Delta_r\}$ with $l = -\Delta_r + b^2 + \frac{\lambda}{2}(f^2-1)$ and
$$Z_0 = \begin{pmatrix} 2\lambda f^2 & -2bf \\ -2bf & \frac{1}{r^2} + f^2 \end{pmatrix}.$$
We note that lf = 0 (one of the GL equations). It follows from the fact that f > 0 and a Perron-Frobenius type argument (see [OS1]) that l ≥ 0 with no zero-eigenvalue. It suffices to show $Z_0 \ge 0$. Clearly $\mathrm{tr}(Z_0) > 0$, and
$$\det(Z_0) = 2\lambda f^4 + \frac{2f^2[\lambda - 2n^2(1-a)^2]}{r^2}$$
is strictly positive for $\lambda \ge 2n^2$. □

Acknowledgements. The first author would like to thank the Courant Institute for its hospitality during part of the preparation of this paper, and especially J. Shatah for some helpful discussions. Part of this work is toward fulfillment of the requirements of the first author’s PhD at the University of Toronto. The second author thanks Yu. N. Ovchinnikov for many fruitful discussions. The authors would also like to thank the referee for helpful remarks.

References

[ABG] Almeida, L., Bethuel, F., Guo, Y.: A remark on the instability of symmetric vortices with large coupling constant. Commun. Pure Appl. Math. 50, 1295–1300 (1997)
[ABGi] Alama, S., Bronsard, L., Giorgi, T.: Uniqueness of symmetric vortex solutions in the Ginzburg–Landau model of superconductivity. Preprint (1998)
[AD] Androulakis, G., Dostoglou, S.: On the stability of monopole solutions. Nonlinearity 11, 377–408 (1998)
[BC] Berger, M.S., Chen, Y.Y.: Symmetric vortices for the nonlinear Ginzburg–Landau equations of superconductivity, and the nonlinear desingularization phenomenon. J. Funct. Anal. 82, 259–295 (1989)
[B] Bogomol'nyi, E.B.: The stability of classical solutions. Yad. Fiz. 24, 861–870 (1976)
[BGP] Boutet de Monvel–Berthier, A., Georgescu, V., Purice, R.: A boundary value problem related to the Ginzburg–Landau model. Commun. Math. Phys. 142, 1–23 (1991)
[DS] Demoulini, S., Stuart, D.: Gradient flow of the superconducting Ginzburg–Landau functional on the plane. Commun. Anal. Geom. 5, no. 1, 121–198 (1997)
[GT] Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer-Verlag, 1977
[G1] Gustafson, S.: Symmetric solutions of Ginzburg–Landau equations in all dimensions. Intern. Math. Res. Notices No. 16, 807–816 (1997)
[G2] Gustafson, S.: Dynamic stability of magnetic vortices. In preparation
[JT] Jaffe, A., Taubes, C.: Vortices and Monopoles. Boston: Birkhauser, 1980
[JR] Jacobs, L., Rebbi, C.: Interaction of superconducting vortices. Phys. Rev. B19, 4486–4494 (1979)
[LL] Lieb, E.H., Loss, M.: Symmetry of the Ginzburg–Landau Minimizer in a Disc. Math. Res. Lett. 1, 701–715 (1994)
[LM] Lopez-Gomez, J., Molina-Meyer, M.: The maximum principle for cooperative weakly coupled elliptic systems and some applications. Diff. Int. Eqns. 7, no. 2, 383–398 (1994)
[M] Mironescu, P.: On the stability of radial solutions of the Ginzburg–Landau equation. J. Funct. Anal. 130, 334–344 (1995)
[OS1] Ovchinnikov, Y., Sigal, I.M.: Ginzburg–Landau equation I: Static vortices. In: Partial Differential Equations and their Applications, Greiner et al., eds. Providence, RI: AMS, 1997, pp. 199–220
[P] Plohr, B.: The existence, regularity, and behaviour at infinity of isotropic solutions of classical gauge field theories. Princeton thesis
[PA] Pao, C.V.: Nonlinear elliptic systems in unbounded domains. Nonlinear Analysis: Theory, Methods, and Applications 22, No. 11, 1391–1407 (1994)
[RSII] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
[RSIV] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol IV: Analysis of Operators. New York: Academic Press, 1978
[S] Stuart, D.: Dynamics of Abelian Higgs vortices in the near Bogomolny regime. Commun. Math. Phys. 159, 51–91 (1994)
[S2] Stuart, D.: Periodic solutions of the Abelian Higgs model and rigid rotation of vortices. GAFA 9, 568–595 (1999)
[T1] Taubes, C.: Arbitrary n-vortex solutions to the first order Ginzburg–Landau equations. Commun. Math. Phys. 72, 277–292 (1980)
[T2] Taubes, C.: On the equivalence of the first and second order equations for gauge theories. Commun. Math. Phys. 75, 207–227 (1980)
[W] Weinberg, E.: Multivortex solutions of the Ginzburg–Landau equations. Phys. Rev. D 19, 3008–3012 (1979)

Communicated by A. Jaffe

Commun. Math. Phys. 212, 277 – 296 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Stochastic Stability for Contracting Lorenz Maps and Flows∗ R. J. Metzger Instituto de Matemática y Ciencias Afines, IMCA, UNI. Jr. Ancash 536, Casa de las Trece Monedas, Lima 1, Perú. E-mail: [email protected] Received: 24 February 1999 / Accepted: 7 January 2000

Abstract: In a previous work [M], we proved the existence of absolutely continuous invariant measures for contracting Lorenz-like maps, and constructed Sinai–Ruelle–Bowen measures for the flows that generate them. Here, we prove stochastic stability for such one-dimensional maps and use this result to prove that the corresponding flows generating these maps are stochastically stable under small diffusion-type perturbations, even though, as shown by Rovella [Ro], they are persistent only in a measure theoretical sense in a parameter space. For the one-dimensional maps we also prove strong stochastic stability in the sense of Baladi and Viana [BV].

1. Introduction

Lorenz flows are related to the system numerically studied in [Lo] as a truncation of a Navier-Stokes equation. Guckenheimer and Williams [GW] introduced a geometric model called the expanding Lorenz attractor, in which it was supposed that the eigenvalues λ2 < λ1 < 0 < λ3 at the singularity of the flow satisfy the expanding condition λ1 + λ3 > 0. In [Ro], the expanding condition is replaced by the contracting condition λ1 + λ3 < 0. The general assumptions used to construct the geometric models also permit the reduction of the 3-dimensional problem, first to a 2-dimensional Poincaré section and then to a one-dimensional map. These maps are also called Lorenz-like. In a previous work we proved the existence and uniqueness of an ergodic absolutely continuous invariant measure (a.c.i.m.) for certain one-dimensional Lorenz-like maps (see [M]). In the same work we related this result to the case of flows and constructed an SRB measure for them. Since the a.c.i.m. found for the one-dimensional case is unique, the SRB measure constructed for the flow is also unique. SRB measures are related to the statistical properties of a system.
On the other hand, stochastic stability means, in a general sense, that the statistical properties are persistent under small random perturbations. It can be stated as follows. Consider the family of measures $P^\varepsilon(t,x,\cdot)$ on M given for every x ∈ M and t ∈ R or t ∈ Z⁺ and ε > 0 small enough, and define Markov chains $x_t^\varepsilon$, t ∈ R, in the following way: if $x_t^\varepsilon = x$ then $x_{t+\tau}^\varepsilon$ has probability $P^\varepsilon(\tau,x,\Gamma)$ of being in Γ. The Markov chain $x_t^\varepsilon$ for t ∈ R is called a small random perturbation of a flow $f^t$ if for every continuous function h on M, we have
$$\lim_{\varepsilon\to0} \int_M P^\varepsilon(t,x,dy)\,h(y) - h(f^t(x)) = 0.$$
Similarly, the Markov chain $x_n^\varepsilon$ for n ∈ Z⁺ is called a small random perturbation of a map f if for every continuous function h on M, we have
$$\lim_{\varepsilon\to0} \int_M P^\varepsilon(n,x,dy)\,h(y) - h(f^n(x)) = 0.$$
We say that $\nu^\varepsilon$ on M is an invariant measure for the Markov chain $x_t^\varepsilon$ if for all Borel sets Γ and any τ > 0,
$$\int_M \nu^\varepsilon(dx)\,P^\varepsilon(\tau,x,\Gamma) = \nu^\varepsilon(\Gamma).$$

∗ This work was partially supported by CNPq-Brazil, during a stay at IMPA.

Stochastic stability concerns the convergence of these measures. Under general hypotheses, weak limits of invariant measures ν^ε when ε → 0 are invariant under the flow (or the map, depending on the case). So, the question is whether all limits are equal. At least for the case when there is only a finite number of ergodic attractors (that is, when only a finite number of SRB measures exist), we say that a system is stochastically stable if all these limits are the same, and in this case the common limit ν is in the convex hull of the SRB measures (cf. [Ki2, Theorem II.5.5]). For the Lorenz-like maps, we prove their stochastic stability in Theorem A, in the general setting of [KK] for the a.c.i.m. obtained in [M]. This case will help us to prove stochastic stability for flows under small diffusion-type random perturbations in Theorem B, even though, as proved in [Ro], they are not robust in parameter space but only persistent in a measure theoretical sense: the contracting Lorenz-like attractors only exist with positive Lebesgue probability in parameter space. Perturbations of diffusion type are often introduced when we try to model the actual behavior of systems; they account for Brownian motion or random collisions of particles in a medium, see [Y]. Finally, for the case of maps, it is common to consider stochastic stability for local perturbations, i.e. when the family of measures P^ε have compact support. Actually, for the contracting Lorenz-like maps we prove more (Theorem C): they are strongly stochastically stable in the sense of [BV]. Let us remark that the case when the perturbations are not local helps us to prove stochastic stability for flows, where the diffusion type perturbations are considered. For the stochastic stability of the Lorenz flow, it is important that there exists only one ergodic attractor because that allows us to reduce the problem to a compact neighborhood of the attractor and to Markov chains that remain in it, as in [Ki2].
The systems commonly considered have finitely many SRB measures $\mu_1, \dots, \mu_N$, and their basins of attraction cover Lebesgue almost all of the phase space $M$. Also, it seems that finding SRB measures is a necessary condition to show stochastic stability. A global point of view on this subject is the Palis conjecture: every dynamical system can be approximated by another having only finitely many attractors, supporting physical measures that describe the time average of Lebesgue almost all points, and, moreover, the statistical properties of these measures are stable under small random perturbations, see [Pa,Vi2]. It is in this spirit that the present work was done. Concerning SRB measures,

Stochastic Stability for Lorenz Attractors


we were much inspired by the works of Sinai, Ruelle, Bowen and Kifer [Si, Ru, BR, Ki1, Ki2], and more recently [Vi1].

2. Lorenz-Like Maps and Flows

In this section we recall the strange attractor first discovered by Lorenz [Lo] as a truncation of a Navier-Stokes equation. Actually, we will be dealing with the geometric model introduced by Guckenheimer and Williams in [GW], called the expanding Lorenz attractor. That is, a family of $C^r(\mathbb{R}^3)$ vector fields, each linear in a neighborhood of the origin containing the cube $\{(x, y, z) \in \mathbb{R}^3 : |x|, |y|, |z| \le 1\}$, with eigenvalues $\lambda_1, \lambda_2, \lambda_3$ satisfying $\lambda_2 < \lambda_1 < 0 < \lambda_3$ and $\lambda_1 + \lambda_3 > 0$, and with both trajectories of the unstable manifold intersecting the top of the cube, as in Fig. 1. So if we call $U$ the union of the cube with a neighborhood of the unstable manifold, there exists an attractor $\Lambda = \bigcap_{t\ge 0} X_t(U)$, where $X_t$ is the flow of the vector field.

Fig. 1. The Lorenz flow (the figure labels the cross section Q and the curve Σ)

The contracting Lorenz attractor arises in a similar way if we replace the expanding condition $\lambda_1 + \lambda_3 > 0$ by the contracting condition $\lambda_1 + \lambda_3 < 0$, see [Ro]. By construction, the top of the cube is a cross section $Q$ for the flow. More explicitly, there exists a curve $\Sigma$, which we can assume to be the intersection of $Q$ with the plane $\{x = 0\}$, so that there exists a first return map (a Poincaré map) of the form
$$P : Q\setminus\Sigma \to Q, \qquad (x, y) \mapsto P(x, y) = (f(x), g(x, y)).$$
This Poincaré map reduces, in a wide sense, the study of the dynamics of the Lorenz attractor to the study of the map $P$. But also the form of this map, which says that the leaves with $x = \text{const}$ are mapped to leaves with $x = f(\text{const})$, allows another simplification if we


R. J. Metzger

project along the stable leaves, see [Ro]. In other words, studying the one-dimensional map defined by $f$ gives a great amount of information on the flow that generates it. The one-dimensional maps constructed in this way are called Lorenz-like maps.

Let $I \subset [-1, 1]$ be a compact interval and $f : I \to I$ be a map such that $f(I) \subset I$, with a discontinuity at the origin. Set $c_{\pm k} = \lim_{x\to 0^\pm} f^k(x)$ for $k \ge 0$. We will require $f$ to satisfy conditions A0–A3 below.

A0) Outside the origin $f$ is of class $C^3$ with negative Schwarzian derivative, and also satisfies $K_2|x|^{s-1} \le f'(x) \le K_1|x|^{s-1}$ for some constants $K_1$, $K_2$ and $s$ with $s > 1$.
A1) $(f^n)'(c_{\pm 1}) > \lambda_c^n$, for some $\lambda_c > 1$ and for all $n \ge 1$.
A2) $|f^{n-1}(c_{\pm 1})| > e^{-\alpha n}$, for some $\alpha$ small enough and all $n \ge 1$.
A3) For any interval $J \subset I$ there exists a number $n(J) > 0$ such that $I_* \subset f^{n(J)}(J)$ ($f$ is topologically mixing on $I_* = [c_{+1}, c_{-1}]$).

Sinai–Ruelle–Bowen measures, or physical measures, are those measures for which the Birkhoff averages converge to a constant on a large Lebesgue set. More precisely: if $f : M \to M$ is a transformation on a manifold $M$, we call an $f$-invariant measure $\mu$ an SRB measure if there exists a positive Lebesgue measure set $B(\mu)$ of points $x \in M$ such that
$$\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} \varphi(f^i(x)) = \int_M \varphi\,d\mu \quad\text{for every } \varphi \in C^0(M, \mathbb{R}),$$

and the set $B(\mu)$ is called the (ergodic) basin of attraction of $\mu$. For a flow $f^t : M \to M$ the definition is
$$\lim_{T\to\infty} \frac{1}{T}\int_0^T \varphi(f^t(x))\,dt = \int_M \varphi\,d\mu \quad\text{for every } \varphi \in C^0(M, \mathbb{R}).$$
It is clear from our definitions that if $\mu$ is an absolutely continuous invariant measure for $f$ and is ergodic, then it is an SRB measure. In [M] we have shown the following:

Theorem 1. Under Conditions A0–A3, $f$ admits an absolutely continuous invariant probability measure. This measure is unique and ergodic.

This theorem also implies that there exists a unique SRB measure for the original Lorenz flow, see [Vi1,M].

3. General Random Perturbations

Consider the family of measures $Q^\varepsilon(x, \cdot)$ on $I$ given for every $x \in I$ and $\varepsilon > 0$ small enough. Define a Markov chain $x_n^\varepsilon$ in the following way: if $x_n^\varepsilon = x$ then $x_{n+1}^\varepsilon$ has probability $Q^\varepsilon(fx, \Gamma)$ of being in $\Gamma$. The Markov chain $x_n^\varepsilon$ is called a small random perturbation of $f$ if for every function $h$ continuous on $I$ we have
$$\lim_{\varepsilon\to 0}\,\Bigl|\int_I Q^\varepsilon(x, dy)\,h(y) - h(x)\Bigr| = 0.$$
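As a rough illustration of the definition above, the following Python sketch builds a Markov chain whose kernel $Q^\varepsilon(x,\cdot)$ is a Gaussian of width $\varepsilon$ wrapped onto $[-1,1]$, and estimates the defect $|\int Q^\varepsilon(x,dy)h(y) - h(x)|$ by Monte Carlo. The toy map `f`, the exponent `S`, the Gaussian kernel and all function names are assumptions made only for illustration; they are not taken from [KK] or [M], and the toy map is not claimed to satisfy A1–A3.

```python
import math
import random

S = 1.5  # hypothetical order of the singularity (s > 1), chosen for illustration


def f(x):
    """Toy Lorenz-like map on [-1, 1] with a discontinuity at 0 (illustrative only)."""
    if x > 0:
        return -1.0 + 2.0 * x ** S
    if x < 0:
        return 1.0 - 2.0 * (-x) ** S
    return -1.0  # arbitrary convention at the discontinuity


def wrap(y):
    """Identify the end points of [-1, 1], as in the distance (3) used later."""
    return (y + 1.0) % 2.0 - 1.0


def sample_Q(x, eps, rng):
    """Draw one point from the assumed kernel Q^eps(x, .): a wrapped Gaussian."""
    return wrap(x + eps * rng.gauss(0.0, 1.0))


def step(x, eps, rng):
    """One step of the perturbed chain: x_{n+1} is distributed as Q^eps(f(x_n), .)."""
    return sample_Q(f(x), eps, rng)


def kernel_defect(h, x, eps, rng, n=20000):
    """Monte Carlo estimate of |integral of Q^eps(x, dy) h(y) - h(x)|."""
    mean = sum(h(sample_Q(x, eps, rng)) for _ in range(n)) / n
    return abs(mean - h(x))
```

For this Gaussian kernel and $h(y) = \cos(\pi y)$ at $x = 0$, the exact defect is $1 - e^{-\pi^2\varepsilon^2/2}$, which tends to zero with $\varepsilon$, in line with the limit required in the definition.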


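We say that $\mu^\varepsilon$ on $I$ is an invariant measure for the Markov chain $x_n^\varepsilon$ if for all Borel sets $\Gamma$ we have
$$\int_I \mu^\varepsilon(dx)\,P^\varepsilon(x, \Gamma) = \mu^\varepsilon(\Gamma),$$
where $P^\varepsilon(x, \Gamma) = Q^\varepsilon(fx, \Gamma)$, and we will consider the family $Q^\varepsilon_x(\cdot)$ to be of the form
$$Q^\varepsilon_x(\Gamma) = \int_\Gamma q^\varepsilon_x(y)\,dy, \tag{1}$$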

and we impose some restrictions on the density $q^\varepsilon_x$. These conditions are Assumption A in [KK], namely:

1. Transition probabilities of the Markov chains $x_n^\varepsilon$ have the form (1).

2. There exist constants $\alpha < 1$, $C > 0$ and a family of non-negative functions $\{r_x(\xi),\ x \in I,\ \xi \in \mathbb{R}\}$ such that
$$q^\varepsilon_x(y) \le C\varepsilon^{-1} e^{-\frac{\alpha}{\varepsilon}\,\mathrm{dist}(x,y)} \quad\text{for all } x, y \in I, \tag{2}$$
where
$$\mathrm{dist}(x, y) = \min(|y - x|,\ |y - x + 2|,\ |y - x - 2|), \tag{3}$$
and $q^\varepsilon_x(y) \le (1 + \varepsilon^\alpha)\,\varepsilon^{-1}\, r_x\bigl(\sigma(x,y)/\varepsilon\bigr)$, provided that $\mathrm{dist}(x, y) \le \varepsilon^{1-\alpha}$, where $\sigma(x, y)$ equals one of the numbers $(y - x)$, $(y - x + 2)$, or $(y - x - 2)$, chosen so that $|\sigma(x, y)| = \mathrm{dist}(x, y)$. Definition (3) is used mainly because we are considering the interval $[-1, 1]$ with identification of end points^1.

3. The functions $r_x(\xi)$, $x \in I$, $\xi \in \mathbb{R}$, satisfy:
– $\int_{\mathbb{R}} r_x(\xi)\,d\xi = 1$.
– $r_x(\xi) \le Ce^{-\alpha|\xi|}$ for $\alpha, C > 0$ independent of $x$ and $\xi$.
– There exists $C > 0$ such that if $V_x^+ = \{\xi : r_x(\xi) > 0\}$ and $\partial V_x^+(\delta)$ denotes the $\delta$-neighborhood in $\mathbb{R}$ of the boundary $\partial V_x^+$ of $V_x^+$, then
$$\int_{\partial V_x^+(\delta)} r_x(\xi)\,d\xi \le C\delta$$
and
$$r_x(\xi) \le r_y(\eta) + C\rho + \chi_{\partial V_x^+(C\rho)}(\xi)\,r_x(\xi), \tag{4}$$
where $\rho = \rho((x, \xi), (y, \eta)) = \mathrm{dist}(x, y) + \|\xi - \pi_{yx}\eta\|$, and $\pi_{yx}$ is the parallel transport.

These assumptions imply that instead of taking into account whole Markov chains we can work only with Markov chains that are $\delta$-pseudo-orbits. This is shown in greater generality in Lemma 1.1 of [Ki2, Chapter 2], which says that the error we make by computing the $n$-step transition probability of arriving at a Borel set $\Gamma$ only over $\delta$-pseudo-orbits is of the order of $Cn\varepsilon^2 m(\Gamma)e^{-\alpha\delta/(2\varepsilon)}$. That is to say,
$$\Bigl| P^\varepsilon(n, x, \Gamma) - P^\varepsilon_x\bigl\{\mathrm{dist}(f(x_i^\varepsilon), x_{i+1}^\varepsilon) < \delta,\ i = 0, \dots, n-1,\ \text{and } x_n^\varepsilon \in \Gamma\bigr\}\Bigr| \le Cn\varepsilon^2 m(\Gamma)\,e^{-\frac{\alpha\delta}{2\varepsilon}} \tag{5}$$
for appropriately chosen constants.

1 We can consider it also without identification.


3.1. The shadowing lemma. To make a proof of stochastic stability similar to that of [KK] we need the following lemma.

Lemma 1 (Shadowing). Suppose that $f$ satisfies hypotheses A0–A3. Let $x_0, \dots, x_n$ be an $\varepsilon^\alpha$-pseudo-orbit of $f$, i.e.
$$\mathrm{dist}(f x_i, x_{i+1}) < \varepsilon^\alpha, \quad i = 0, \dots, n-1, \tag{6}$$
where dist is defined by (3) and $\varepsilon > 0$ is small enough. There exists $C > 0$ depending only on $f$ such that if $0 < \beta \le \alpha/s$ and
$$|x_i - c_0| \ge 2C\varepsilon^\beta, \quad i = 0, \dots, n, \tag{7}$$
then one can find a point $y \in I$ so that
$$\mathrm{dist}(f^i(y), x_i) \le C\varepsilon^{\alpha-\beta(s-1)}, \quad i = 0, \dots, n. \tag{8}$$
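The hypotheses (6) and (7) of the shadowing lemma are simple to check numerically on a finite sequence. The following Python sketch implements the distance (3) and the two conditions; the toy map `f`, the exponent `S` and the function names are hypothetical and serve only to make the example self-contained.

```python
S = 1.5  # hypothetical order of the singularity (s > 1)


def f(x):
    """Toy Lorenz-like map on [-1, 1] with a discontinuity at 0 (illustrative only)."""
    if x > 0:
        return -1.0 + 2.0 * x ** S
    if x < 0:
        return 1.0 - 2.0 * (-x) ** S
    return -1.0  # arbitrary convention at the discontinuity


def dist(x, y):
    """Distance (3): [-1, 1] with its end points identified."""
    d = y - x
    return min(abs(d), abs(d + 2.0), abs(d - 2.0))


def is_pseudo_orbit(f, xs, delta):
    """Condition (6): dist(f(x_i), x_{i+1}) < delta for consecutive points of xs."""
    return all(dist(f(a), b) < delta for a, b in zip(xs, xs[1:]))


def avoids_critical_point(xs, radius, c0=0.0):
    """Condition (7): every point of xs stays at least `radius` away from c0."""
    return all(abs(x - c0) >= radius for x in xs)
```

In the lemma one would take `delta` equal to $\varepsilon^\alpha$ and `radius` equal to $2C\varepsilon^\beta$; the constant $C$ depends on $f$ and is not computed here.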

Proof. Choose $\rho_3 < \rho_0/2$ for sufficiently small $\rho_0$ such that
$$f(U_{2\rho_3}(c_0)) \cap U_{2\rho_3}(c_0) = \emptyset.$$
Let $i_1 < \dots < i_k$ be such that $x_{i_j} \in U_{\rho_3}(c_0)$ for $j = 1, \dots, k$, and $x_l \notin U_{\rho_3}(c_0)$ if $l \ne i_j$ for $j = 1, \dots, k$. Put also $i_0 = 0$ and $i_{k+1} = n$. Fixing $\rho_2$, we have fixed $M_{3\rho_3/4}$, so if $x_l \notin U_{\rho_3}(c_0)$ for $i = l+1, \dots, l + M_{3\rho_3/4} - 1$, then $f^q(x_l) \notin U_{3\rho_3/4}(c_0)$ for $q = 1, \dots, M_{3\rho_3/4} - 1$, from relation (6).

Therefore, in our case, a lemma similar to Lemma 5.2 of [Vi1] (shown in [Ro]) enables us to employ the standard argument yielding the shadowing in the expanding case for the pieces $x_{i_j+1}, \dots, x_{i_{j+1}}$ of the pseudo-orbit, to conclude that there exist $C_{\rho_3} > 0$ independent of the whole pseudo-orbit and some points $y_j$, $j = 0, \dots, k$, such that
$$\mathrm{dist}(x_{i_j+l}, f^l(x_{i_j})) \le C_{\rho_3}\varepsilon^\alpha \tag{9}$$
for all $l = 1, \dots, i_{j+1} - i_j$ and $j = 0, \dots, k$. Next, we shall prove that there exists a point $y \in f^{-i_k}y_k$ satisfying (8). By (6) and (9),
$$\mathrm{dist}(f x_{i_j}, f y_j) \le (C_{\rho_3} + 1)\varepsilon^\alpha, \tag{10}$$
and from (A0) and (10) we have
$$\bigl|\,|x_{i_j}|^s - |y_j|^s\,\bigr| \le \frac{s\,(C_{\rho_3} + 1)\,\varepsilon^\alpha}{K_2},$$
and since $s > 1$ we have
$$\frac{\bigl|\,|x_{i_j}|^s - |y_j|^s\,\bigr|}{|x_{i_j}|^s} \ge \frac{\bigl|\,|x_{i_j}| - |y_j|\,\bigr|}{|x_{i_j}|},$$
so (11) becomes
$$|x_{i_j} - y_j| \le |x_{i_j}|\,\frac{s\,(C_{\rho_3} + 1)}{K_2\,|x_{i_j}|^s}\,\varepsilon^\alpha \le |x_{i_j}|\,\frac{s\,(C_{\rho_3} + 1)}{K_2\,(2C)^s}\,\varepsilon^{\alpha - s\beta}, \tag{11}$$


since $|x_{i_j}| = |x_{i_j} - c_0| \ge 2C\varepsilon^\beta$ and $s\beta \le \alpha$. Now, by (9), if C6 is chosen big enough we have
$$\mathrm{dist}(y_j, f^{\,i_j - i_{j-1}} y_{j-1}) \le \tilde{C}_{\rho_3}\,\frac{\varepsilon^\alpha}{|x_{i_j}|^{s-1}} \tag{12}$$
for some $\tilde{C}_{\rho_3}$ independent of $\varepsilon$, $j$ and the points $\{x_i\}$. Since $|x_{i_j}| \ge 2C\varepsilon^\beta$, for $\varepsilon$ small enough it follows from (9) and (12) that
$$\mathrm{dist}(y_j, c_0) \le \mathrm{dist}(y_j, x_{i_j}) + \mathrm{dist}(x_{i_j}, c_0) \le \tilde{C}_{\rho_3}\,\frac{\varepsilon^\alpha}{|x_{i_j}|^{s-1}} + \rho_3 \le \frac{2}{3}\rho_0.$$
Also we have $\mathrm{dist}(f^{\,i_j - i_{j-1}} y_{j-1}, c_0) \le \frac{2}{3}\rho_0$. From these two relations and the lemmas corresponding to Lemmas (3.2) and (3.5) of [KK] we have
$$\mathrm{dist}\bigl(f^{-l} y_j,\ f^{\,i_j - i_{j-1} - l} y_{j-1}\bigr) \le C_1\tilde{C}_{\rho_3}\,\frac{\varepsilon^\alpha}{|x_{i_j}|^{s-1}}\,\gamma_0^{-l}$$
for appropriate preimages of $y_j$, where $l \ge 0$ and $C_1 > 0$ depends only on $\rho_0$. It follows from here that
$$\mathrm{dist}\bigl(f^{-(i_k - i_j) + l} y_k,\ f^l y_j\bigr) \le \tilde{C}_{\rho_3}\,\varepsilon^{\alpha - (s-1)\beta}\sum_{r=j}^{k}\bigl(C_1\gamma_0^{-(i_j - i_{j-1}) + l}\bigr)^{r-j} \tag{13}$$
for corresponding preimages of $y_k$, where $l = 1, \dots, i_{j+1} - i_j$. From the assumptions on our maps, $(i_{j+1} - i_j)$ is of the order $C_2^{-1}\log\rho_3^{-1}$, where $C_2$ is independent of $\rho_3$, $i_j$ and the choice of points $\{x_i\}$, $\{y_j\}$. Therefore if $\rho_3$ is taken small enough, then $C_1\gamma_0^{-(i_{j+1} - i_j)} < 1$ and the sum on the right side of (13) is bounded. This, together with (9) and (13), yields (8) for some $y \in f^{-i_k}(y_k)$, and proves Lemma 1. □

In [KK], there are two crucial lemmas: Lemma 4.1 and Lemma 4.4. Lemma 4.4 shows that the closure of the critical orbit has Lebesgue measure zero. This is necessary since [KK] proves Lemma 4.1 for intervals that are far from the critical orbit. In other words, [KK] shows that the limit measure $\mu$ of a weakly convergent sequence $\mu^{\varepsilon_i}$ of stationary measures is absolutely continuous with respect to Lebesgue in $I\setminus A$, where $m(A)$ can be made arbitrarily small. We do the same, only choosing the subset $A$ carefully. More precisely, we are going to estimate probabilities of arriving at intervals $\Gamma \subset I$ such that $\Gamma = \pi(\hat{f}^n(\eta))$ for some $\eta \subset E_0$ and $\eta \in Q(n, N)$, where $N$ is chosen in such a way that the sum of the measures of the intervals that do not belong to $Q(n, N)$ is less than $\tilde{\varepsilon}$. The definitions of $\hat{f}$ and the tower extension $\hat{I}$ ($\pi$ is the natural projection of the tower) can be found in Section 5 and in [M]. The collection of intervals $Q(n, N)$ is defined in Sect. 6 of [M], and the principal property we are using is stated in Lemma 6.1 of the same reference (similar definitions and properties can be found in [Vi1]).

Let $\Gamma \subset I$ be a Borel set and define
$$J_1^\varepsilon(\rho, n, x, \Gamma) = P_x\Bigl\{\min_{0\le k\le n-1}\mathrm{dist}(x_k^\varepsilon, c_0) > \rho \ \text{and}\ x_n^\varepsilon \in \Gamma\Bigr\} = \int_{I\setminus U_\rho(c_0)}\!\!\cdots\int_{I\setminus U_\rho(c_0)}\int_\Gamma q^\varepsilon_{f(x)}(y_1)\,q^\varepsilon_{f(y_1)}(y_2)\cdots q^\varepsilon_{f(y_{n-1})}(y_n)\,dy_1\cdots dy_n.$$


Lemma 2. For any $\tilde{\varepsilon}$ there exists $N$ such that if $\Gamma = \pi(\hat{f}^n(\eta))$, where $\eta \in Q(n, N)$ as before, then there exists $\gamma_0$ such that for any $x \in I$ we have
$$J_1^\varepsilon(\varepsilon^\gamma, n, x, \Gamma) \le D\,m(\Gamma),$$
provided that $(\log\varepsilon)^4 \ge n \ge (\log\varepsilon)^2$, $\gamma \le \gamma_0$ and $\varepsilon$ is small enough.

Proof. This lemma is similar to a corollary in [KK]. Let us give a sketch of the proof. First we define $J_2^\varepsilon(\rho, \delta_1, n, x, \Gamma)$ similarly to $J_1$ but considering only $\delta_1$-pseudo-orbits. This lets us approximate $J_1$ by $J_2$ with the same error as in (5). From here we use the shadowing lemma to reduce the problem to calculating the probability $J_3^\varepsilon(\delta_2, n, x; z, \Gamma)$ for $n$-step Markov chains beginning in $x$ and ending in $\Gamma$ that can be $\delta_2$-shadowed by $z$, i.e. for Markov chains that stay in iterates of a dynamical ball. After this we can conclude the lemma as in [KK], using the corresponding bounded distortion properties of [Vi1] shown in [M]. □

As in [KK] we must take care of the Markov chains $x_k^\varepsilon$ which sometimes approach the critical point $c_0$. This is done in Lemma 4.3 of [KK], which can be translated with few modifications. So we already have

Theorem 2 (Theorem B). Stationary measures $\mu^\varepsilon$ for perturbations converge weakly to the a.c.i.m. $\mu_0$ of $f$.

Proof. We are assuming that the a.c.i.m. for $f$ is unique, and the methods in [KK] give that if $\mu^\varepsilon \to \mu$ then $\mu$ is an a.c.i.m.; therefore $\mu = \mu_0$. □

4. Stochastic Stability for the Lorenz Flow

For the previously defined Lorenz flow $X_t$ we consider the family of measures $P^\varepsilon(t, x, \cdot)$ on $M$ given for every $x \in M$, $t \in \mathbb{R}$ and $\varepsilon > 0$ small enough. Define the Markov chain $x_t^\varepsilon$, $t \in \mathbb{R}$, in the following way: if $x_t^\varepsilon = x$ then $x_{t+\tau}^\varepsilon$ has probability $P^\varepsilon(\tau, x, \Gamma)$ of being in $\Gamma$. The Markov chain $x_t^\varepsilon$ is called a small random perturbation of $X_t$ if for every function $h$ continuous on $M$ we have
$$\lim_{\varepsilon\to 0}\,\Bigl|\int_M P^\varepsilon(t, x, dy)\,h(y) - h(X_t(x))\Bigr| = 0.$$
We say that $\nu^\varepsilon$ on $M$ is an invariant measure for the Markov chain $x_t^\varepsilon$ if for all Borel sets $\Gamma$ and any $\tau > 0$
$$\int_M \nu^\varepsilon(dx)\,P^\varepsilon(\tau, x, \Gamma) = \nu^\varepsilon(\Gamma).$$
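A common concrete choice of $P^\varepsilon(t, x, \cdot)$ is the diffusion type perturbation treated in this section. The Python sketch below simulates such a perturbation with the Euler–Maruyama scheme. It rests on several assumptions that are not in the text: the phase space is flat $\mathbb{R}^3$, so that a generator of the form $\varepsilon^2\Delta + B\cdot\nabla$ corresponds to the stochastic equation $dX = B(X)\,dt + \varepsilon\sqrt{2}\,dW$; the classical Lorenz equations with the usual parameters stand in for the geometric model's vector field $B$; and the function names and step sizes are hypothetical.

```python
import math
import random


def lorenz_field(p, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Classical Lorenz vector field, used here only as a stand-in for B."""
    x, y, z = p
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def euler_maruyama_step(p, eps, dt, rng, field=lorenz_field):
    """One Euler-Maruyama step for dX = B(X) dt + eps * sqrt(2) dW."""
    drift = field(p)
    noise_scale = eps * math.sqrt(2.0 * dt)
    return tuple(c + dt * b + noise_scale * rng.gauss(0.0, 1.0)
                 for c, b in zip(p, drift))


def simulate(p0, eps, dt, n_steps, seed=0):
    """Return the list of visited points, starting at p0."""
    rng = random.Random(seed)
    path = [tuple(p0)]
    for _ in range(n_steps):
        path.append(euler_maruyama_step(path[-1], eps, dt, rng))
    return path
```

With `eps = 0` the step reduces to the explicit Euler scheme for the unperturbed flow, which mirrors the requirement that the perturbed transition probabilities concentrate on $X_t(x)$ as $\varepsilon \to 0$.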

It is a standard fact that, in the case we are treating, weak limits of invariant measures $\nu^\varepsilon$ as $\varepsilon \to 0$ exist and are invariant for the flow itself. As in [Ki2], we are going to deal with diffusion type random perturbations. This is the most common perturbation used for flows because it models a particle that moves under the action of the vector field and is also affected by random collisions in a medium. The operator which models this process has the following form: $L^\varepsilon = \varepsilon^2 L + B$,


where $B$ is the vector field (Lorenz type in our case) and $L$ is the Laplace–Beltrami operator (a second order elliptic operator acting on the space of coordinates [IW]). This operator generates a Markov diffusion process $x_t^\varepsilon$ with transition probability $P^\varepsilon(t, x, \cdot)$ having densities $p^\varepsilon(t, x, y)$ with respect to the Riemannian volume satisfying Kolmogorov's equation
$$\frac{\partial p^\varepsilon(t, x, y)}{\partial t} = L^\varepsilon p^\varepsilon(t, x, y),$$
where $L^\varepsilon$ acts in the variable $x$, see [IW,Y]. It is known that if we consider $P^\varepsilon(x, \Gamma) = P^\varepsilon(\tau, x, \Gamma)$ for $\tau$ fixed, with $P^\varepsilon(\tau, x, \cdot)$ being the diffusion transition probabilities, we arrive at Markov chains $x_n = x^\varepsilon_{n\tau}$ that satisfy properties similar to those in Assumption A of [KK]. We already know the existence of a Poincaré section which has all the good properties, including the stochastic stability for diffeomorphisms. Before going to the theorem we need the following lemma, as required in [Ki2].

Lemma 3. Let $U$ be a sufficiently small neighborhood of the attractor $\Lambda$. There exists a constant $C > 0$ such that if $x_i$, $i = 0, \dots, n$, is an $\varepsilon^\alpha$-pseudo-orbit of $X_1$ staying in $U$ and satisfying
$$\min_{0\le i\le n}\mathrm{dist}(x_i, O) > C\varepsilon^\beta, \tag{14}$$
then we can find a point $y \in U$ such that
$$\max_{0\le i\le n}\mathrm{dist}(x_i, X_i(y)) \le Cn\varepsilon^{\alpha - \beta(s-1)},$$
where $O$ in (14) represents the origin and dist here is the Euclidean distance.

Proof. This lemma is the combination of the shadowing property in the Poincaré section with the shadowing property in the neighborhood of the hyperbolic fixed point of the flow. □

Our main theorem is the following.

Theorem 3 (Theorem C). Let $x_t^\varepsilon$ be a diffusion type small random perturbation of the Lorenz flow introduced in Sect. 2 and let $\nu^\varepsilon$ be an invariant measure for this process. If $\nu^\varepsilon \to \nu$ then $\nu$ is an SRB measure for the Lorenz flow.

Proof. Since invariant measures for the perturbation of the flow are also invariant measures for the diffeomorphism $X_\tau$ for $\tau$ fixed, we can consider the invariant measures $\nu^\varepsilon$ for this diffeomorphism. Let $\nu^{\varepsilon_i}$ be a weakly convergent subsequence of measures having as limit the measure $\nu$. Define a measure $\nu^*$ on the Poincaré section $\Sigma$ (see Sect. 2 for the definition of $\Sigma$) as follows:
$$\nu^*(\Gamma) = \frac{d\,\nu\bigl(\cup_{0\le t\le s} X_t(\Gamma)\bigr)}{ds}\bigg|_{s=0} \tag{15}$$
for all $\Gamma \subset \Sigma$. We claim that $\nu^*$ satisfies the property that the measure $\mu$ defined as $\mu(B) = \nu^*(\pi^{-1}(B))$ is absolutely continuous with respect to Lebesgue in the interval $I$.


From the claim the theorem follows, since invariant measures for diffusion type perturbations are unique ([IW], Theorem 4.5), and since the SRB measure for the Lorenz flow is uniquely defined by the property asked in the claim and by definition (15). The claim follows from the methods in [Ki2] provided that we prove a distortion property in "rectangles" formed by a cartesian product of boxes in the flow, the stable manifold and the Poincaré section. If we choose the partition carefully to make the "rectangles", the distortion property will be an easy consequence of a similar property in the one-dimensional case. We know the existence of $Q(n, N)$, which is a partition except for a small Lebesgue set (small depending only on $N$). With this "partition" we induce a similar one in the Poincaré section $\Sigma$. That is, property (15) is shown for $\Gamma \subset \Sigma$ which belongs to this induced "partition", similarly to Lemma 2 and using the methods of Chapter 2 of [Ki2]. □

5. Strong Stochastic Stability

In what follows we use the same constants as in [M]. It will be clear in the development of the sections that if these constants change, they do so in such a way that the proofs that use them remain true. We will use here other constants, for example $\beta_1$ and $\beta_2$, that are very close to $\beta$. From now on we will use an open interval only a little larger than $I$, and it will still be denoted by $I$. Fix some small $\varepsilon_0$ so that $f_t(I) \subset I$ for all $|t| < \varepsilon_0$, where $f_t(x) = f(x) + t$, and we also write $f_t^n = f^n_{t_n \dots t_1} = f_{t_n}\circ\dots\circ f_{t_1}$ for each $n \ge 1$ and $t = (t_1, \dots, t_n)$. We are interested in Markov chains $\chi^\varepsilon$, for $0 < \varepsilon < \varepsilon_0$, whose transition probabilities $P^\varepsilon(x, \cdot)$ have densities $\theta_\varepsilon(y - f(x))$. Each $\theta_\varepsilon$ is a probability distribution on $[-\varepsilon, \varepsilon]$, bounded from below as in [BY]. We assume also
$$\mathrm{supp}\,\theta_\varepsilon \subset [-\varepsilon, \varepsilon] \quad\text{and}\quad \int\theta_\varepsilon(x)\,dx = 1.$$
We also assume that $\theta_\varepsilon$ satisfies $M = \sup_\varepsilon\,(\varepsilon\sup|\theta_\varepsilon|) < \infty$. Denote $J_\varepsilon = \{t : \theta_\varepsilon(t) > 0\}$.
It is clear that $J_\varepsilon$ is an interval containing zero, and we assume that $\phi_\varepsilon = \log(\theta_\varepsilon|J_\varepsilon)$ is concave. Clearly, $\phi_\varepsilon$ is concave if $\theta_\varepsilon|J_\varepsilon$ is. On the other hand, $\theta_\varepsilon|J_\varepsilon$ is at most two-to-one if $\phi_\varepsilon$ is concave, since log is a homeomorphism on $(0, \infty)$. It follows from our assumptions on $P^\varepsilon$ that, for all $\varepsilon$ small enough, the Markov process $\chi^\varepsilon$ has a unique invariant probability measure $\mu_\varepsilon$, and this measure is absolutely continuous with respect to Lebesgue. The uniqueness comes from the property that $\theta_\varepsilon$ is bounded from below. Uniqueness also implies that $\mu_\varepsilon$ satisfies an ergodic property, namely, the product measure $\mu_\varepsilon\times\theta_\varepsilon^{\mathbb{N}}$ is ergodic (and invariant) with respect to the map on $I\times\mathbb{R}^{\mathbb{N}}$ defined by $(x, t_1, t_2, t_3, \dots) \mapsto (f_{t_1}(x), t_2, t_3, \dots)$ (see [Ki1], Theorem 2.1). It follows, using the Ergodic Theorem, that Birkhoff averages of random trajectories $x_j = f_{t_j \dots t_1}(x)$ converge to $\mu_\varepsilon$ for Lebesgue almost every $(x, t_1, t_2, t_3, \dots) \in \mathrm{supp}(\mu_\varepsilon)\times\mathrm{supp}\,\theta_\varepsilon^{\mathbb{N}}$. In this context we say that the dynamics of $f$ is strongly stochastically stable if the densities $h_\varepsilon$ of $\mu_\varepsilon$ converge to the density of $\mu_0$ as $\varepsilon$ goes to zero in the BV topology, where $\mu_0$ is the unique invariant measure of $f$. In this section we are going to prove

Theorem 4 (Theorem D). The dynamics of $f$ is strongly stochastically stable.
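The random trajectories $x_j = f_{t_j \dots t_1}(x)$ described above are easy to generate. The following Python sketch draws the $t_j$ from the uniform density on $[-\varepsilon, \varepsilon]$, which is one admissible choice (its logarithm is constant, hence concave, on its support, and $\varepsilon\sup\theta_\varepsilon = 1/2$), and builds a histogram as a crude stand-in for the density $h_\varepsilon$. The toy map `f`, the constants `A` and `S`, and the function names are hypothetical; the sketch does not verify A1–A3 and says nothing about convergence in the BV topology.

```python
import random

A = 0.9  # hypothetical amplitude: f maps [-1, 1] into [-A, A], leaving room for noise
S = 1.5  # hypothetical order of the singularity (s > 1)


def f(x):
    """Toy Lorenz-like map with a discontinuity at 0 (illustrative only)."""
    if x > 0:
        return A * (-1.0 + 2.0 * x ** S)
    if x < 0:
        return A * (1.0 - 2.0 * (-x) ** S)
    return -A  # arbitrary convention at the discontinuity


def random_orbit(x0, eps, n, rng):
    """Iterate x -> f_t(x) = f(x) + t with t uniform on [-eps, eps]; needs eps <= 1 - A."""
    xs = [x0]
    for _ in range(n):
        t = rng.uniform(-eps, eps)
        xs.append(f(xs[-1]) + t)
    return xs


def empirical_density(xs, bins=40):
    """Histogram of xs on [-1, 1], normalised so that it integrates to 1."""
    width = 2.0 / bins
    counts = [0] * bins
    for x in xs:
        k = min(int((x + 1.0) / width), bins - 1)
        counts[k] += 1
    return [c / (len(xs) * width) for c in counts]
```

Comparing `empirical_density` for decreasing values of `eps` gives a qualitative picture of how the stationary densities vary with the noise level.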


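5.1. The construction. Besides the tower extension $\hat{f}$, we construct its deterministic perturbation $\hat{f}_t$, for $|t| < \varepsilon < \varepsilon_0$. Take the constants $\beta_1$ and $\beta_2$, with $\beta_1 < \beta < \beta_2$, very close to $\beta$ so that it is still true that $e^{\beta_i/2}\lambda\rho < \lambda_c^{1/s}$, for $i = 1, 2$. For $k \in \mathbb{Z}$, $(x, k) \in E_k$ and $|t| < \varepsilon$ we set
$$\hat{f}_t(x, k) = \begin{cases} (f_t(x), k^{(+)}) & \text{if } |k| \ge 1 \text{ and } f_t(x) \in B_k,\\ (f_t(x), -1) & \text{if } k = 0 \text{ and } x \in (-\delta, 0),\\ (f_t(x), 1) & \text{if } k = 0 \text{ and } x \in (0, \delta),\\ (f_t(x), 0) & \text{otherwise.}\end{cases}$$
Define also $\hat{f}_{t_n \dots t_1} = \hat{f}_{t_n}\circ\dots\circ\hat{f}_{t_1}$ for $(t_1, \dots, t_n, \dots) \in J_\varepsilon^{\mathbb{N}}$. Observe that also $f_t\circ\pi = \pi\circ\hat{f}_t$ on $\hat{I}$. We now allow $H(\delta)$ to depend on $\varepsilon_0$ in the following way: let $H(\delta) = H(\delta, \varepsilon_0)$ be the minimal $k \ge 1$ such that there exist some $x \in (-\delta, \delta)$ and some $t = (t_1, \dots, t_k, t_{k+1}) \in J_{\varepsilon_0}^{k+1}$ such that $\hat{f}_t^{k+1}(x, 0) \in E_0$. We observe that, by continuity, $H(\delta)$ again can be made arbitrarily large by choosing $\delta$ and $\varepsilon_0$ small enough. We define the Markov chain $\hat{\chi}^\varepsilon$ by considering the transition probabilities
$$\hat{P}^\varepsilon((x, k), E) = \sum_{j=-\infty}^{\infty}\int_{\pi(E)}\hat{\theta}_\varepsilon((y, j), \hat{f}(x, k))\,dy,$$
where $\hat{\theta}_\varepsilon((y, j), \hat{f}(x, k)) = 0$ if $\hat{f}_{y-f(x)}(x, k) \notin E_j$ and $\hat{\theta}_\varepsilon((y, j), \hat{f}(x, k)) = \theta_\varepsilon(y - f(x))$ otherwise, in which case $\hat{f}_{y-f(x)}(x, k) = (y, j)$. In particular $j = k + 1$, and when there is no ambiguity we simply write $\theta_\varepsilon(y - f(x))$. We wish to consider the transfer operator $L_\varepsilon$ related to the unique absolutely continuous invariant measure of $\hat{f}$ and the Markov chain $\hat{\chi}^\varepsilon$, so we first define the cocycle $\omega_\varepsilon$ in the following way:
$$\omega_\varepsilon(x, k) = \begin{cases} 1 & k = 0,\\ \lambda\int_{(-\delta, 0)}\theta_\varepsilon(x - f(y))\,dy & k = 1,\\ \lambda\int_{(0, \delta)}\theta_\varepsilon(x - f(y))\,dy & k = -1,\\ \lambda\int_{B^*_{k^{(-)}}}\omega_\varepsilon(y, k^{(-)})\,\theta_\varepsilon(x - f(y))\,dy & |k| \ge 2.\end{cases}$$
Define $(x_{t_l, \dots, t_1}, k - l(k)) = \hat{f}^{-l}_{t_l, \dots, t_1}(x, k)$, with $|k| \ge l \ge 1$, to be the unique point such that $\hat{f}^{\,l}_{t_l, \dots, t_1}(x_{t_l, \dots, t_1}, k - l(k)) = (x, k)$, with $l(k) = l$ if $k \ge 0$ and $l(k) = -l$ otherwise. With the previous definition we have
$$\omega_\varepsilon(x, \pm 1) = \lambda\int (f'(x_t))^{-1}\,\theta_\varepsilon(t)\,dt,$$
and also
$$\omega_\varepsilon(x, k) = \lambda\int\omega_\varepsilon(x_t, k^{(-)})\,(f'(x_t))^{-1}\,\theta_\varepsilon(t)\,dt, \quad |k| \ge 2.$$
Integration is over $t$ such that $x_t$ is defined, with $x_t \in (-\delta, 0)$ or $(0, \delta)$ for the first integral (depending on the "sign of the level"), and $x_t \in B_{k^{(-)}}$ for the second integral. Introducing the notation
$$d\theta_\varepsilon(t) = \theta_\varepsilon(t_1)\cdots\theta_\varepsilon(t_{|k|-1})\,dt_1\cdots dt_{|k|-1},$$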


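we also have for $|k| \ge 2$,
$$\omega_\varepsilon(x, k) = \lambda^{|k|-1}\int\omega_\varepsilon(x_{t_{|k|-1}\dots t_1}, 1)\,\Bigl[\bigl(\hat{f}^{\,|k|-1}_{t_{|k|-1}\dots t_1}\bigr)'(x_{t_{|k|-1}\dots t_1})\Bigr]^{-1}\,d\theta_\varepsilon(t),$$
where the integral is over the $t \in J_\varepsilon^{|k|-1}$ such that $x_{t_{|k|-1}\dots t_1} \in B_{\pm 1}$ exists. Our assumptions imply that $\theta_\varepsilon$ converges to the Dirac function as $\varepsilon \to 0$. It follows that $\omega_\varepsilon(x, k) \to \omega_0(x, k)$ pointwise as $\varepsilon \to 0$. Moreover, for $\varepsilon$ small enough, and for $|k| > 0$, the support of $\omega_\varepsilon$ in $E_k$ is an interval with endpoints close to the endpoints of the support of $\omega_0$ in $E_k$. Similarly to what we do for $\omega_0$, we write $m_\varepsilon = \omega_\varepsilon m$, and note that this measure is also finite. We use the cocycles $\omega_\varepsilon$ to define nonnegative weights $g_{\varepsilon,t}$ on $\hat{I}$ by
$$g_{\varepsilon,t}(y, k) = \frac{\omega_\varepsilon(y, k)}{\omega_\varepsilon(\hat{f}_t(y, k))}\,\frac{1}{f'(y)},$$
for $|t| < \varepsilon$. We use the notations $g = g_0$ and $g^{(n)} = \prod_{j=0}^{n-1} g\circ\hat{f}^{\,j}$, and similarly $g_t^{(n)} = \prod_{j=0}^{n-1} g\circ\hat{f}^{\,j}_{t_j, \dots, t_1}$.

We will denote
$$[a_k^\varepsilon, b_k^\varepsilon]\times\{k\} = \bigcup_{t\in J_\varepsilon^{|k|}}\mathrm{Im}\,\hat{f}_t^{\,|k|}\ \cap\ E_k.$$
Note that this definition implies that for all $x$ in $[a_k^\varepsilon, b_k^\varepsilon]\times\{k\}$ there exist a $t$ and a point $x_t$ in the ground level such that $\hat{f}_t^{\,|k|}(x_t, 0) = (x, k)$. We denote
$$L_{\varepsilon,t}\varphi(x, k) = \frac{1}{\omega_\varepsilon(x, k)}\sum_{\hat{f}_t(y, j) = (x, k)}\frac{\varphi(y, j)\,\omega_\varepsilon(y, j)}{f'(y)} = \sum_{\hat{f}_t(y, j) = (x, k)}\varphi(y, j)\,g_{\varepsilon,t}(y, j)$$
and
$$L_\varepsilon\varphi(x, k) = \int L_{\varepsilon,t}\varphi(x, k)\,\theta_\varepsilon(t)\,dt = \frac{1}{\omega_\varepsilon(x, k)}\sum_j\int_{B_j}\varphi(y, j)\,\omega_\varepsilon(y, j)\,\hat{\theta}_\varepsilon((x, k), \hat{f}(y, j))\,dy,$$
for $k = 0$ or $|k| \ge 1$ with $a_k^\varepsilon < x < b_k^\varepsilon$. For $|k| \ge 1$ and $x \notin [a_k^\varepsilon, b_k^\varepsilon]$ we make definitions using limits as before.

An interval $\eta \subset E_k$ is called an interval of monotonicity for a map $\hat{F} : \hat{I} \to \hat{I}$ if the map $F = \pi\circ\hat{F}$ is monotone on $\eta$ and if there is a $j$ such that $\hat{F}(\eta) \subset E_j$. Observe that this definition coincides with that of $P^{(n)}$ given in Sect. 3 of [M]. For $t = (t_1, \dots, t_n) \in J_\varepsilon^{\mathbb{N}}$, let $P_t^{(n)}$ be the set of intervals of monotonicity of $\hat{f}_t^{\,n}$. That is,
$$P_t^{(n)} = \bigl\{\eta_1\cap\hat{f}_{t_1}^{-1}(\eta_2)\cap\dots\cap(\hat{f}^{\,n-1}_{t_{n-1}, \dots, t_1})^{-1}(\eta_n) : \eta_i \text{ monotonicity interval of } \hat{f}_{t_i},\ 1 \le i \le n\bigr\}.$$
Clearly, endpoints of nontrivial intervals in $P_t^{(n)}$ vary continuously with $t$. It follows that given any $\eta_0$ in $P^{(n)}$, for each $t$ close enough to 0 there exists an interval $\eta(t, \eta_0) \in P_t^{(n)}$ with endpoints depending continuously on $t$ and such that $\eta(0, \eta_0) = \eta_0$.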


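5.2. The lemmas. We are not going to prove all the equivalent statements of [BV], but give some lemmas that show how the others go through with the necessary modifications. The next lemma is related to the weight $g_{\varepsilon,t}(y, k)$ (compare with the definition of $g_{\varepsilon,t}$), evaluated at points in the support of $m_\varepsilon$ which "fall down" from the tower, i.e., $|k| \ge H(\delta)$ and $\hat{f}_t(y, k) \in E_0$.

Lemma 4 (BV 3). There is $c > 0$ so that $\omega_\varepsilon(y, k)\,|f'(y)|^{-1} \le c\rho^{-|k|}$ for all $\varepsilon \ge 0$ and $|k| \ge 1$, and all $(y, k) \in E_k$ having $\hat{f}_t(y, k) \in E_0$ for some $|t| < \varepsilon$.

Proof. The case $\varepsilon = 0$ is easy. Assuming $\varepsilon > 0$, we derive a preliminary estimate for $\omega_\varepsilon$ on $E_{\pm 1}$. Consider $\omega_\varepsilon$ on $E_{-1}$. (For $E_1$ the proof is similar.) If $z \ge c_{-1} + \varepsilon$ then $\omega_\varepsilon(z, -1) = 0$. Otherwise
$$\omega_\varepsilon(z, -1) = \lambda\int_{(-\delta, 0)}\theta_\varepsilon(z - f(y))\,dy,$$
$$\omega_\varepsilon(z, -1) = \lambda\int_{z - c_{-1}}^{z - f(-\delta)}\frac{\theta_\varepsilon(t)}{f'(f_t^{-1}(z))}\,dt \le \lambda\sup\theta_\varepsilon\int\frac{1}{f'(z_t)}\,dt.$$
Note that we write $f_t^{-1}(z)$ because it is well defined for all $z$ such that $(z, -1) \in E_{-1}$, for $\delta$ small enough. The first integral is taken over $\{t \ge z - c_{-1} : |z_t| \le \delta\}$ and the second one over $\{t \ge z - c_{-1} : |t| \le \varepsilon\}$. Hence, if $c_{-1} - \varepsilon \le z \le c_{-1} + \varepsilon$ then
$$\omega_\varepsilon(z, -1) \le \lambda\sup\theta_\varepsilon\int_{z - c_{-1}}^{\varepsilon}\frac{dt}{f'(f_t^{-1}(z))} = \lambda\sup\theta_\varepsilon\int_0^{z_\varepsilon} -dx \tag{16}$$
$$= \lambda(\sup\theta_\varepsilon)(-z_\varepsilon) \le \lambda M\,\frac{|z_\varepsilon|}{\varepsilon} \tag{17}$$
$$\le \lambda MC/\varepsilon^{1-1/s}, \tag{18}$$
because property A0 implies
$$K_1\frac{|x|^s}{s} < |f(x) - f(0^-)|,$$
leading to
$$K_1\frac{|z_\varepsilon|^s}{s} < 2\varepsilon.$$
On the other hand, for $z \le c_{-1} - \varepsilon$ we have
$$\omega_\varepsilon(z, -1) \le \lambda\sup\theta_\varepsilon\int_{-\varepsilon}^{\varepsilon}\frac{dt}{f'(f_t^{-1}(z))} = \lambda\sup\theta_\varepsilon\,\bigl(-(z_\varepsilon - z_{-\varepsilon})\bigr) \le \lambda\sup\theta_\varepsilon\,\bigl(|z_\varepsilon| - |z_{-\varepsilon}|\bigr)$$
$$\le \lambda\sup\theta_\varepsilon\left(\frac{|z_\varepsilon|^s - |z_{-\varepsilon}|^s}{|z_\varepsilon|^{s-1} + |z_{-\varepsilon}|^{s-1}} + \frac{|z_\varepsilon||z_{-\varepsilon}|^{s-1} - |z_{-\varepsilon}||z_\varepsilon|^{s-1}}{|z_\varepsilon|^{s-1} + |z_{-\varepsilon}|^{s-1}}\right) \le C\lambda\sup\theta_\varepsilon\,\frac{|z_\varepsilon|^s - |z_{-\varepsilon}|^s}{|z_\varepsilon|^{s-1} + |z_{-\varepsilon}|^{s-1}}.$$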


So it becomes
$$\omega_\varepsilon(z, -1) \le \frac{C\lambda M}{|z_0|^{s-1}}, \tag{19}$$
since $(f_t^{-1})^s = (z_t)^s$ is a smooth function of $t$ and since $|z_\varepsilon|^{s-1} + |z_{-\varepsilon}|^{s-1} \ge |z_0|^{s-1}$.

Now we consider a general $|k| \ge H(\delta)$. Without loss of generality suppose here that $k > 0$. From the definition we have
$$\omega_\varepsilon(y, k)\,|f'(y)|^{-1} = \lambda^{k-1}\int\omega_\varepsilon(y_{t_{k-1}\dots t_1}, 1)\,\Bigl[\bigl(f^k_{0, t_{k-1}\dots t_1}\bigr)'(y_{t_{k-1}\dots t_1})\Bigr]^{-1}\,d\theta_\varepsilon(t).$$
Now, split this into a sum $W_1 + W_2$, where the two terms correspond to restricting the domain of integration, respectively, to $\{|c_1 - y_{t_{k-1}\dots t_1}| \ge \varepsilon\}$ and to $\{|c_1 - y_{t_{k-1}\dots t_1}| < \varepsilon\}$. In order to bound $W_1$ and $W_2$, we note that
$$e^{-\beta_2(k+1)} \le c\,|(f^k)'(c_1)|\,\bigl(|c_1 - y_{t_{k-1}\dots t_1}| + \varepsilon\bigr). \tag{20}$$
This is a translation of the relation deduced in the proof of Lemma 2 in [BV], using $\hat{f}^{\,j-1}_{t_{j-1}\dots t_1}(y_{t_{k-1}\dots t_1}, 1) \in E_j$ for $1 \le j \le k$ and $\hat{f}^{\,k}_{t, t_{k-1}\dots t_1}(y_{t_{k-1}\dots t_1}, 1) \in E_0$ for some $|t| < \varepsilon$.

Let us first take $|c_1 - y_{t_{k-1}\dots t_1}| \ge \varepsilon$. Then (20) gives $|c_1 - y_{t_{k-1}\dots t_1}| \ge \dfrac{e^{-\beta_2 k}}{C\,|(f^k)'(c_1)|}$, and since $|z_0| \ge \dfrac{|c_1 - z|^{1/s}}{C}$, (19) yields
$$\omega_\varepsilon(y_{t_{k-1}\dots t_1}, 1) \le C\lambda M\left(\frac{|(f^k)'(c_1)|}{e^{-\beta_2 k}}\right)^{\frac{s-1}{s}} \le Ce^{\beta_2 k(s-1)/s}\,\bigl((f^k)'(c_1)\bigr)^{(s-1)/s}.$$
Replacing in $W_1$ and using again the distortion inequality (3.9) of [BV] we obtain
$$W_1 \le \lambda^{k-1}\int Ce^{\beta_2 k(s-1)/s}\,\frac{\bigl((f^k)'(c_1)\bigr)^{(s-1)/s}}{(f^k)'(c_1)}\,d\theta_\varepsilon(t) \le \lambda^{k-1}C\,\bigl(e^{\beta_2(s-1)/s}\,\lambda_c^{-1/s}\bigr)^k \le C\rho^{-k}.$$
For $W_2$, we use (20) to get that $\varepsilon \ge \dfrac{e^{-\beta_2 k}}{C\,(f^k)'(c_1)}$ if $|c_1 - y_{t_{k-1}\dots t_1}| \le \varepsilon$, and (18) to conclude that $\omega_\varepsilon(y_{t_{k-1}\dots t_1}, 1) \le Ce^{\beta_2 k(s-1)/s}\,|(f^k)'(c_1)|^{(s-1)/s}$. The same calculations as before give $W_2 \le C\rho^{-k}$, ending the proof of Lemma 4. □

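For $|k| \ge 1$ we introduce subintervals of $E_k$:
$$\beta_k^L = \{(y, k) \mid f(y) < a_{k+1} - \varepsilon\}, \qquad \beta_k^R = \{(y, k) \mid f(y) > b_{k+1} + \varepsilon\},$$
where $[a_j, b_j] = B_j$, i.e., $a_j$, $b_j$ are the endpoints of the interval $B_j$. Note that $(y, k) \in \beta_k^R\cup\beta_k^L$ if and only if $\hat{f}_t(y, k) \in E_0$ for some $|t| \le \varepsilon$.

Lemma 5 (BV 4). There is a constant $c > 0$ such that for all $\varepsilon \ge 0$ and $|k| \ge 1$ we have
$$\mathrm{var}_{\beta_k^{L,R}}\ \omega_\varepsilon(y, k)\,(f'(y))^{-1} \le c\,(e^\alpha\rho^{-1})^{|k|}.$$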


Proof. For each fixed $\varepsilon \ge 0$ and $|k| \ge 1$, we have that $\{\omega_\varepsilon(y, k) \ne 0\}$ is an interval. Denote by $\gamma_k^{L,R}$ its intersection with $\beta_k^{L,R}$. We suppose $|k| \ge H(\delta)$, otherwise $\gamma_k^{L,R}$ is empty. Now, suppose that $\varepsilon = 0$. For $(y, k) \in \gamma_k^{L,R}$ we have
$$\omega_0(y, k)\,|f'(y)|^{-1} = \frac{\lambda^{|k|}}{(f^{|k|+1})'(\hat{f}^{-|k|}(y, k))}.$$
Note that $f^{|k|+1}$ has negative Schwarzian derivative, because $f$ does. Now, $f^{|k|+1}$ does not have critical points in $\hat{f}^{-|k|}(\gamma_k^{L,R})$, because this last set does not contain the critical point, and neither does $\pi(E_j\cap\mathrm{supp}(\omega_0))$ for $j \ge 1$, for appropriately chosen constants, see Sect. 2 of [M]. This implies that $(f^{|k|+1})'(\hat{f}^{-|k|}(y, k))$ has a unique maximum and so $\omega_0(y, k)\,|f'(y)|^{-1}$ has a unique minimum, restricted to $\gamma_k^{L,R}$, hence
$$\mathrm{var}_{\beta_k^{L,R}}\bigl(\omega_0(y, k)\,(f'(y))^{-1}\bigr) \le 2\sup_{\gamma_k^{L,R}}\bigl(\omega_0(y, k)\,(f'(y))^{-1}\bigr),$$
and the claim for $\varepsilon = 0$ follows from Lemma 4.

Assume now that $\varepsilon > 0$. The main step is to prove that $\omega_\varepsilon$ is at most two-to-one on each $E_k$. For this we use the assumption that $\phi_\varepsilon = \log(\theta_\varepsilon|J_\varepsilon)$ is concave. Observe that a function $\Psi$ is concave if and only if $\Psi(x_1) + \Psi(x_4) \le \Psi(x_2) + \Psi(x_3)$ for every $x_1 < x_2 \le x_3 < x_4$ with $x_1 + x_4 = x_2 + x_3$. Let $j \ge 0$ be given (for $j < 0$ we have similar relations). If $j = 0$ replace $B_j$ by $[-\delta, 0)$ or $(0, \delta]$; thus
$$\omega_\varepsilon(x_1, j+1)\,\omega_\varepsilon(x_4, j+1) - \omega_\varepsilon(x_2, j+1)\,\omega_\varepsilon(x_3, j+1)$$
$$= \lambda^2\int_{B_j}\int_{B_j}\omega_\varepsilon(y, j)\,\omega_\varepsilon(z, j)\cdot\bigl[\theta_\varepsilon(x_1 - fy)\,\theta_\varepsilon(x_4 - fz) - \theta_\varepsilon(x_2 - fy)\,\theta_\varepsilon(x_3 - fz)\bigr]\,dy\,dz \le 0,$$
for all $x_1 < x_2 \le x_3 < x_4$ with $x_1 + x_4 = x_2 + x_3$. For the last inequality observe that the term in the integral is always non-positive, since we have $(x_1 - f(y)) + (x_4 - fz) = (x_2 - fy) + (x_3 - fz)$ and $\log(\theta_\varepsilon|J_\varepsilon)$ is concave. This proves that $\log\omega_\varepsilon$ is concave and so $\omega_\varepsilon$ is at most two-to-one on $E_{j+1}$, see Sect. 2.1 in [BV]. As a consequence
$$\mathrm{var}_{\beta_k^{L,R}}\ \omega_\varepsilon(y, k) \le 2\sup_{\gamma_k^{L,R}}\bigl(\omega_\varepsilon(y, k)\,(f'(y))^{-1}\bigr),$$
by Lemma 4. Moreover, since $|c_k| \ge e^{-\alpha|k|}$ (condition A2), $f'$ has a unique maximum on each $B_k$ for $|k| \ge H(\delta)$; here we use once more that $f$ has negative Schwarzian derivative.


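Therefore, we have
$$\mathrm{var}_{\beta_k^{L,R}}\bigl(\omega_\varepsilon(y, k)\,(f'(y))^{-1}\bigr) \le \mathrm{var}_{\beta_k^{L,R}}\bigl(\omega_\varepsilon(y, k)\bigr)\,\sup_{\beta_k^{L,R}}\bigl((f'(y))^{-1}\bigr) + \sup_{\beta_k^{L,R}}\bigl(\omega_\varepsilon(y, k)\bigr)\,\mathrm{var}_{\beta_k^{L,R}}\bigl((f'(y))^{-1}\bigr)$$
$$\le c\rho^{-|k|}\,ce^{\alpha|k|} + c\rho^{-|k|}\,e^{\alpha|k|}.$$
□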

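We now proceed with some preliminary bounds on $L_\varepsilon$ concerning points which are "climbing" the tower, i.e., $(y, k) \in E_k$ and $\hat{f}_t(y, k) \in E_{k^{(+)}}$.

Lemma 6 (BV 5). Let $\varphi \in BV(\hat{I})$ and $\varepsilon \ge 0$.
1) For $|k| > 0$ and each $\beta \subset E_{k^{(+)}}\cap\mathrm{supp}\,m_\varepsilon$, we have $\sup_\beta|L_\varepsilon\varphi| \le \frac{1}{\lambda}\sup_\gamma|\varphi|$, where $\gamma = \bigcup_{t\in J_\varepsilon}\bigl(\hat{f}_t|_{E_k}\bigr)^{-1}(\beta)\cap\mathrm{supp}\,m_\varepsilon$.
2) For each $\beta \subset E_{\pm 1}\cap\mathrm{supp}\,m_\varepsilon$ we have $\sup_\beta|L_\varepsilon\varphi| \le \frac{K}{\lambda}\sup_{\gamma^\pm}|\varphi|$, where $\gamma^\pm = \bigcup_{t\in J_\varepsilon}\bigl(\hat{f}_t|_{E_0}\bigr)^{-1}(\beta)$.

Proof. The proof for $\varepsilon = 0$ is easy. Assume $\varepsilon > 0$. By definition, for $|k| \ge 1$ and $x \in B_{k^{(+)}}$ such that $\omega_\varepsilon(x, k^{(+)}) \ne 0$,
$$L_\varepsilon\varphi(x, k+1) = \frac{\int_{B_k}\omega_\varepsilon(z, k)\,\varphi(z, k)\,\theta_\varepsilon(x - f(z))\,dz}{\lambda\int_{B_k}\omega_\varepsilon(z, k)\,\theta_\varepsilon(x - f(z))\,dz},$$
and part (1) follows. Now, note that if $\omega_\varepsilon(x, \pm 1) \ne 0$, then
$$L_\varepsilon\varphi(x, 1) = \frac{\int_{[-\delta, 0)}\varphi(z, k)\,\theta_\varepsilon(x - f(z))\,dz}{\lambda\int_{[-\delta, 0)}\theta_\varepsilon(x - f(z))\,dz}, \qquad L_\varepsilon\varphi(x, -1) = \frac{\int_{(0, \delta]}\varphi(z, k)\,\theta_\varepsilon(x - f(z))\,dz}{\lambda\int_{(0, \delta]}\theta_\varepsilon(x - f(z))\,dz},$$
and part (2) follows. □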

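Lemma 7 (BV 6). Let $\varphi \in BV(\hat{I})$ and $\varepsilon \ge 0$.
1) For all $|k| \ge 1$ and each interval $\beta \subset E_{k+1}$, we have $\mathrm{var}_\beta L_\varepsilon(\varphi) \le \frac{1}{\lambda}\mathrm{var}_\gamma(\varphi)$, where $\gamma = \bigcup_{t\in J_\varepsilon}\bigl(\hat{f}_t|_{E_k}\bigr)^{-1}(\beta)\cap\mathrm{supp}(m_\varepsilon)$.
2) For each interval $\beta \subset E_{\pm 1}$, we have $\mathrm{var}_\beta L_\varepsilon(\varphi) \le \frac{K}{\lambda}\mathrm{var}_{\gamma^\pm}\varphi$, where $\gamma^+ = \bigcup_{t\in J_\varepsilon}\bigl(\hat{f}_t|_{(-\delta, 0)\times\{0\}}\bigr)^{-1}(\beta)$ and $\gamma^- = \bigcup_{t\in J_\varepsilon}\bigl(\hat{f}_t|_{(0, \delta)\times\{0\}}\bigr)^{-1}(\beta)$.

Proof. The case $\varepsilon = 0$ is easy. We start with $|k| \ge 1$. Consider first $\varphi|E_k = H_u = \chi_{[u, b_k]\times\{k\}}$ for some point $u \in B_k$. We shall prove that $L_\varepsilon\varphi$ is monotone on $E_{k^{(+)}}$.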


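Obviously, we may disregard the points $(x, k^{(+)})$ where $L_\varepsilon$ is defined by a limit. At all other points, we have
$$L_\varepsilon\varphi(x, k+1) = \frac{\int_u^{b_k}\omega_\varepsilon(z, k)\,\theta_\varepsilon(x - f(z))\,dz}{\lambda\int_{a_k}^{b_k}\omega_\varepsilon(y, k)\,\theta_\varepsilon(x - f(y))\,dy}.$$
Fix $x_1 > x_2$ in $\pi(\beta)$ with $\omega_\varepsilon(x_i, k+1) \ne 0$ for $i = 1, 2$. Up to a positive factor, the difference $L_\varepsilon\varphi(x_1, k+1) - L_\varepsilon\varphi(x_2, k+1)$ is equal to
$$\int_{a_k}^{b_k}dy\int_u^{b_k}dz\ \omega_\varepsilon(z, k)\,\omega_\varepsilon(y, k)\,\bigl[\theta_\varepsilon(x_1 - fz)\,\theta_\varepsilon(x_2 - fy) - \theta_\varepsilon(x_2 - fz)\,\theta_\varepsilon(x_1 - fy)\bigr]. \tag{21}$$
Since $f|_{B_k\cap\mathrm{supp}\,m_\varepsilon}$ is increasing, $f(y) \le f(z)$ in (21). Thus $x_1 - fy \ge \max\{x_1 - fz,\ x_2 - fy\}$ and $x_2 - fz \le \min\{x_2 - fy,\ x_1 - fz\}$. So, using $(x_1 - fy) + (x_2 - fz) = (x_1 - fz) + (x_2 - fy)$ together with the concavity of $\log(\theta_\varepsilon|J_\varepsilon)$, we get
$$\theta_\varepsilon(x_1 - fz)\,\theta_\varepsilon(x_2 - fy) \ge \theta_\varepsilon(x_2 - fz)\,\theta_\varepsilon(x_1 - fy).$$
Hence $L_\varepsilon\varphi(x_1, k+1) \ge L_\varepsilon\varphi(x_2, k+1)$, i.e., $L_\varepsilon\varphi$ is non-decreasing on $\beta$. This proves, using Lemma 6, that
$$\mathrm{var}_\beta L_\varepsilon\varphi = \sup_\beta L_\varepsilon\varphi - \inf_\beta L_\varepsilon\varphi \le \frac{1}{\lambda}(1 - 0).$$
Consider now the case where
$$\varphi|E_k = \sum_{j=1}^{m}d_j H_{u_j}, \tag{22}$$
for some $u_j \in B_k$ and $d_j > 0$. Then
$$\mathrm{var}_\beta L_\varepsilon\varphi = \mathrm{var}_\beta L_\varepsilon\Bigl(d_0\chi_\gamma + \sum_{u_j\in\gamma}d_j H_{u_j}\Bigr),$$
for some constant $d_0 \ge 0$. Observe that $L_\varepsilon(d_0\chi_\gamma)$ is constant on $\beta$. Therefore, by linearity, $\mathrm{var}_\beta L_\varepsilon\varphi \le \frac{1}{\lambda}\sum_{u_j\in\gamma}d_j = (1/\lambda)\,\mathrm{var}_\gamma\varphi$. If $\varphi|E_k$ is nonnegative and non-decreasing, we take a sequence $\varphi_n$ of the form (22) with $\varphi_n|E_k \le \varphi|E_k$ and converging uniformly to $\varphi|E_k$. Since $L_\varepsilon\varphi_n$ converges pointwise to $L_\varepsilon\varphi$ on $E_{k^{(+)}}$, we get
$$\mathrm{var}_\beta L_\varepsilon\varphi \le \liminf_n\ \mathrm{var}_\beta L_\varepsilon\varphi_n \le \frac{1}{\lambda}\limsup_n\ \mathrm{var}_\gamma\varphi_n \le \frac{1}{\lambda}\mathrm{var}_\gamma\varphi.$$
Finally, if $\varphi|E_k$ is any function with bounded variation, we may write $\varphi|E_k = \varphi_1 - \varphi_2$ with $\varphi_j$ nonnegative, nondecreasing, and such that $\mathrm{var}_\gamma\varphi = \mathrm{var}_\gamma\varphi_1 + \mathrm{var}_\gamma\varphi_2$; then
$$\mathrm{var}_\beta L_\varepsilon\varphi \le \sum_j\mathrm{var}_\beta L_\varepsilon\varphi_j \le \sum_j\frac{1}{\lambda}\mathrm{var}_\gamma\varphi_j = \frac{1}{\lambda}\mathrm{var}_\gamma\varphi.$$
For $\beta \subset E_{\pm 1}$ the argument above, applied to a function $\varphi$ which vanishes on $(0, \delta]$ or $[-\delta, 0)$, yields $\mathrm{var}_\beta L_\varepsilon\varphi \le \frac{1}{\lambda}\mathrm{var}_{\gamma^\pm}\varphi$. □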

Lemma 8 (BV 7).

\[ \lim_{\epsilon\to 0} \sum_k \int_{B_k} |\omega_\epsilon(x,k) - \omega_0(x,k)|\,dx = 0. \]

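Proof. The term for $k = 0$ vanishes. For $k = \pm 1$ we have

\[ \omega_0(x,1) = \begin{cases} 0 & x \notin f([-\delta,0)) \\ \dfrac{\lambda}{f'(x_0)} & \text{otherwise} \end{cases}; \qquad \omega_0(x,-1) = \begin{cases} 0 & x \notin f((0,\delta]) \\ \dfrac{\lambda}{f'(x_0)} & \text{otherwise,} \end{cases} \]

where, respectively, $x_0 = \hat f_{t=0}^{-1}(x,1)$ and $x_0 = \hat f_{t=0}^{-1}(x,-1)$, and

\[ \omega_\epsilon(x,1) = \begin{cases} 0 & x \notin \bigcup_{t\in J} f_t([-\delta,0)) \\ \lambda\displaystyle\int_J \frac{\theta_\epsilon(t)}{f'(x_t)}\,dt & \text{otherwise} \end{cases}; \qquad \omega_\epsilon(x,-1) = \begin{cases} 0 & x \notin \bigcup_{t\in J} f_t((0,\delta]) \\ \lambda\displaystyle\int_J \frac{\theta_\epsilon(t)}{f'(x_t)}\,dt & \text{otherwise,} \end{cases} \]

where, respectively, $x_t = \hat f_t^{-1}(x,1)$ and $x_t = \hat f_t^{-1}(x,-1)$. In what follows we shall consider $k = 1$, since the case $k = -1$ is similar. For small fixed $\zeta > 0$ we have, by a computation similar to (18),

\[ \int_{|1-x|\le\zeta} \omega_0(x,1)\,dx = \lambda\,[x_0(1-\zeta,1) - x_0(c_1,1)] \le \lambda\zeta^{1/s}. \]

Since $\omega_\epsilon(x)$ converges uniformly to $\omega_0(x)$ on $|c_1 - x| \ge \zeta$, the integral $\int_{|c_1-x|\ge\zeta} |\omega_\epsilon - \omega_0|\,dx$ can be made arbitrarily small by taking $\epsilon$ small. We split $\int_{|c_1-x|\le\zeta} \omega_\epsilon(x,1)\,dx$ into a sum $W_1 + W_2$, where $W_1$, $W_2$ correspond to restricting the domain of integration to $\zeta \ge |c_1 - x| \ge 2\epsilon$, respectively $|c_1 - x| \le \min(2\epsilon,\zeta)$. The first term vanishes if $\zeta < 2\epsilon$; otherwise it satisfies

\[ W_1 \le \int_{|c_1-x|\le\zeta} \frac{C\lambda}{f'(x_0)}\,dx \le \lambda C\zeta^{1/s}, \]

since $|c_1 - x + t| \ge |c_1 - x|/2$. For the second term, we have (recall (18))

\[ W_2 \le \int_{|c_1-x|<\min(2\epsilon,\zeta)} \frac{C\lambda M}{\epsilon^{1-1/s}}\,dx \le \frac{C}{\epsilon^{1-1/s}}\min(2\epsilon,\zeta) \le C\min(\epsilon^{1/s},\zeta^{1/s}). \]

We have just proved:

\[ \int_{|c_1-x|\le\zeta} \omega_\epsilon(x,1)\,dx \le C\zeta^{1/s}, \qquad \lim_{\epsilon\to 0}\int_{E_1} |\omega_0 - \omega_\epsilon|\,dx = 0. \]

Now, for levels with $|k| > 2$ we can apply the same ideas contained in [BV], and the previous relations and lemmas. □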


Lemmas (4)–(8) are the ones needed to prove Theorem D. It suffices to follow the methods in [BV], together with the ideas developed here that solve the technical problems concerning the maps we are dealing with. One remark has to be made: our construction allows a simplification concerning the differentiability of $f$. We do not require $f$ to be $C^4$ (or piecewise $C^4$) nor to be symmetric, as is required in [BV]. We can see this as follows. Since our tower extension is injective for $(-\delta,\delta)$, we do not have to make a choice between two points in $(-\delta,\delta)$ to define the cocycle. For each point $p$ in $E_k$, there is at most one point $q$ in $E_0$ that goes to $E_k$ in $|k|$ iterates. This makes it unnecessary to compare derivatives of points going to the same image in the tower. We already used this fact in Lemmas (4)–(8) and also in the previous sections. As an example we finish this section by stating two properties of $g^{(n)}$ and $g_t^{(n)}$ that simplify the proof of the Integral Lemma in [BV].

Let $\eta_0 \in P^{n+N}$, and $\eta_1(\epsilon,\eta_0) = \bigcap_{t\in J^n} \eta(t,\eta_0)$. Define $\eta_2(\epsilon,\eta_0)$ in a similar way, replacing intersection by union. Let $l = k(0)$, i.e., $l$ such that $\eta_0 \subset E_l$. It is clear that given $(x,k) \in \eta_1(\epsilon,\eta_0) \subset E_k$ and $t \in J^n$, there exists exactly one point $y_t = y_t(\eta_0)$ such that $(y_t,l) \in \eta(t,\eta_0)$ and $\hat f_t^n(y_t,l) = (x,k)$.

Lemma 9. Given $(x,k) \in E_k$ we have

\[ g^{(n)}(y,l) = \frac{\lambda^{|l|-|k|}}{(f^{\,n+|l|-|k|})'(\hat f^{-|l|}(y,l))}, \]

where $\hat f^n(y,l) = (x,k)$, and we write $y$ for $y_0$.

Proof. This relation comes from our definition of $g^{(n)}$ and $\omega_0$. We only have to keep in mind that for $(x,k) \in E_k$ there is only one choice for a point $z \in B_0$ satisfying $\hat f^{|k|}(z,0) = (x,k)$. The same is true for $(y,l)$: there exists only one point in $B_0$ (which we denote by $\hat f^{-|l|}(y,l)$) with the property that $\hat f^{|l|}((\hat f^{-|l|}(y,l),0)) = (y,l)$. □

Now, given $n \ge 1$ and $l, k \in \mathbb{Z}$, for $t \in J^n$, $u \in J^{|k|}$, and $v \in J^{|l|}$, we denote $x_u = \hat f_u^{-|k|}(x,k)$ and $y_{t,v} = \hat f_v^{-|l|}(y_t,l)$. Note that there is no ambiguity in the choice of $x_u$ and $y_{t,v}$. With these definitions we have the following lemma.

Lemma 10. For $n \ge |l| - |k|$ we have

\[ \begin{aligned} \int g_t^{(n)}(y_t,l)\,d\theta_\epsilon(t) &= \lambda^{|l|-|k|} \int \frac{\int \big((f_v^{|l|})'\,y_{t,v}\big)^{-1} d\hat\theta(v)}{\int \big((f_u^{|k|})'\,x_u\big)^{-1} d\hat\theta(v)} \cdot \big((f_t^{n})'\,y_t\big)^{-1}\,d\hat\theta(t) \\ &= \lambda^{|l|-|k|}\, \frac{\int \big((f_r^{|k|})'\,x_r\big)^{-1} d\hat\theta(r)}{\int \big((f_u^{|k|})'\,x_u\big)^{-1} d\hat\theta(v)} \int \big((f_s^{\,n+|l|-|k|})'\,y_{t,v}\big)^{-1} d\hat\theta(s) \\ &= \lambda^{|l|-|k|} \int \big((f_s^{\,n+|l|-|k|})'\,y_{t,v}\big)^{-1} d\hat\theta(s), \end{aligned} \]

where we denote $r = (t_{n-|k|},\ldots,t_n)$ and $s = (v_1,\ldots,v_{|l|},t_1,\ldots,t_{n-|k|})$.

Proof. The first equality comes from the definition of $g_t^{(n)}$ and the second is obtained by rearranging terms in the integral. □

Acknowledgements. This paper was completed during a visit to IMPA – Rio de Janeiro. It is part of my Ph.D. Thesis. I am grateful to Prof. Jacob Palis, my advisor, who contributed in a significant way to achieve the final form of this work.

References

[BV] Baladi, V. and Viana, M.: Strong stochastic stability and rate of mixing for unimodal maps. Ann. Scient. E.N.S. 29-4, 483–517 (1996)
[BR] Bowen, R. and Ruelle, D.: The ergodic theory of Axiom A flows. Inv. Math. 29, 181–202 (1975)
[GW] Guckenheimer, J. and Williams, R.F.: Structural stability of Lorenz attractors. Publ. Math. IHES 50, 307–320 (1979)
[IW] Ikeda, N. and Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. Amsterdam: North-Holland/Kodansha, 1981
[KK] Katok, A. and Kifer, Y.: Random perturbations of transformations of an interval. J. de Analyse Math. 47, 193–237 (1986)
[Ki1] Kifer, Y.: Ergodic Theory of Random Perturbations. Boston-Basel: Birkhäuser, 1986
[Ki2] Kifer, Y.: Random Perturbations of Dynamical Systems. Boston-Basel: Birkhäuser, 1988
[Lo] Lorenz, E.N.: Deterministic non periodic flow. J. Atmosph. Sci. 20, 130–141 (1963)
[M] Metzger, R.: Sinai–Ruelle–Bowen measures for contracting Lorenz maps and flows. To appear in: Annales de L’I.H.P. Jour. d’Analyse
[Pa] Palis, J.: A global view of dynamics and a conjecture on the denseness of finitude of attractors. Asterisque (1998)
[Ro] Rovella, A.: The dynamics of perturbations of the contracting Lorenz Attractor. Bull. Braz. Math. Soc. 24, 233–259 (1993)
[Ru] Ruelle, D.: A measure associated with Axiom A attractors. Am. J. of Math. 98, 619–654 (1976)
[Si] Sinai, Ya.: Gibbs measure in ergodic theory. Russ. Math. Surv. 27, 21–79 (1972)
[Vi1] Viana, M.: Stochastic Dynamics of Deterministic Systems. 21o Colóquio Brasileiro de Matemática, IMPA 1997
[Vi2] Viana, M.: Dynamics: A probabilistic and geometric perspective. Doc. Math. J., Extra Volume ICM 1998
[Y] Yosida, K.: Functional Analysis. Berlin: Springer-Verlag, 1980

Communicated by Ya. G. Sinai

Commun. Math. Phys. 212, 297 – 321 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Factorization Dynamics and Coxeter–Toda Lattices

Tim Hoffmann¹, Johannes Kellendonk¹, Nadja Kutz¹, Nicolai Reshetikhin¹,²

¹ Fachbereich Mathematik, Sekr. MA 8-5, Technische Universität Berlin, Strasse des 17. Juni 136, 10623 Berlin, Germany. E-mail: [email protected]; [email protected]; [email protected]
² Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 1 October 1999 / Accepted: 18 January 2000

Abstract: It is shown that the factorization relation on simple Lie groups with standard Poisson Lie structure restricted to Coxeter symplectic leaves gives an integrable dynamical system. This system can be regarded as a discretization of the Toda flow. In the case of $SL_n$ the integrals of the factorization dynamics are integrals of the relativistic Toda system. A substantial part of the paper is devoted to the description of symplectic leaves in simple complex Lie groups, their Borel subgroups and their doubles.

Contents

Introduction
1. Basic Facts About Simple Poisson Lie Groups
   1.1 Basic facts about Poisson Lie groups
   1.2 Standard Poisson structure on a simple Lie group
2. Symplectic Leaves of G
   2.1 Bruhat decomposition of the double of G
   2.2 Left cosets $D(G)/j(G^-)$
   2.3 Double cosets $j(G^-)\backslash D(G)/j(G^-)$
   2.4 Symplectic leaves of G and double Bruhat cells
3. Symplectic Leaves of B
   3.1 $B^-$ double cosets in $D(B)$
   3.2 Factorization of left cosets and Darboux coordinates on symplectic leaves of B
   3.3 Coxeter symplectic leaves of B
   3.4 Symplectic leaves of $B^-$
4. Symplectic Leaves of D(B)
   4.1 Symplectic leaves of $D(B)$
   4.2 Relation between symplectic leaves of B and $D(B)$
   4.3 Relation between symplectic leaves of $D(B)$ and G
5. Factorization Dynamics on Poisson Lie Groups
   5.1 Dynamics of Poisson relations
   5.2 Factorization relations on Poisson Lie groups
6. Factorization Dynamics on Coxeter Symplectic Leaves
   6.1 Integrals
   6.2 Factorization map
   6.3 Real positive form
7. The Interpolating Flow and Continuous Time Nonlinear Toda Lattices
   7.1 Interpolating flow
   7.2 Linearization in a neighborhood of 1
8. Conclusion

Introduction

An integrable Hamiltonian system on a symplectic manifold consists of a Hamiltonian that generates the dynamics together with a Lagrangian fibration on the manifold such that the flow lines generated by the Hamiltonian are parallel to the fibers. Usually, the fibers are level surfaces of functions called higher integrals. The fibration by level surfaces is Lagrangian when the integrals Poisson commute, and the flow lines are parallel to the fibers when the integrals Poisson commute with the Hamiltonian. The level surfaces of the integrals are equipped with natural affine coordinates in which the dynamics is linear [Arn89, HZ94].

Integrable systems on Poisson Lie groups have the following characteristic features:
• The phase space of such a system is a symplectic manifold which is a symplectic leaf of a factorizable Poisson Lie group G.
• The level surfaces of integrals are G-orbits with respect to the adjoint action of the group on itself.

One should notice that for some symplectic leaves the G-invariant functions do not form a complete set of Poisson commuting integrals (their level sets are not Lagrangian submanifolds, but only co-isotropic). In such cases there is still a complete system of integrals, but the complementary integrals may have singularities. An example is the so-called full Toda system [Kos79, DLNT86].

Since symplectic leaves of the Poisson Lie group G are connected components of orbits of the dressing action of the dual Poisson Lie group $G^*$ on G, the invariant tori of such systems lie in the intersection of $\mathrm{Ad}_G$- and $G^*$-orbits in G. Surprisingly enough, most of the known integrable systems on Poisson Lie groups are of this type. Such integrable systems have a Lax representation. A systematic treatment of such integrable systems was given by Semenov-Tian-Shanskii [STS85].

Linearization of this construction in a neighborhood of the identity gives the similar construction based on Lie algebras, which was pioneered by Kostant [Kos79] on the example of Toda lattices and by Adler [Adl79] on the example of the KdV equation.

An integrable discrete dynamical system on a symplectic manifold is a symplectomorphism which acts parallel to the fibers of a Lagrangian fibration given by level surfaces of integrals. More generally, it can be a Poisson relation preserving the fibration; for details see [Ves91].

In this paper we derive integrable systems related to Toda models [Tod88] (and references therein). We show that for simple Lie groups the factorization relation restricted to symplectic leaves that are associated with a Coxeter element in the Weyl group yields


a discrete integrable evolution. Such a dynamical system will be called a Coxeter–Toda lattice and the dynamics the factorization dynamics. It turns out that different choices of Coxeter element produce isomorphic integrable systems. The integrals for the factorization dynamics are, in the case of $G = SL_n$, the integrals of the so-called relativistic Toda lattice introduced in [Rui90]. (Since we will deal only with simple Lie groups with standard Poisson Lie structure we can avoid going into the general discussion of factorizable Poisson Lie groups.)

The phase space of the Coxeter–Toda lattice is the symplectic leaf mentioned above. On a Zariski open subset of such a leaf which is isomorphic to $\mathbb{C}^{2r}$ one can introduce coordinates $\chi_i^\pm$, $i = 1,\ldots,r = \operatorname{rank} G$, with the following Poisson brackets:

\[ \{\chi_i^\pm,\chi_j^\pm\} = 0, \qquad \{\chi_i^+,\chi_j^-\} = -2d_iC_{ij}\,\chi_i^+\chi_j^-. \]

Here $C_{ij}$ is the Cartan matrix of G and the $d_i$ are co-prime positive integers symmetrizing it. The factorization relation restricted to a Coxeter symplectic leaf gives a symplectomorphism which acts on the coordinates $\chi_i^\pm$ as follows:

\[ \alpha(\chi_i^+) = \chi_i^-, \qquad \alpha(\chi_i^-) = \frac{(\chi_i^-)^2}{\chi_i^+}\prod_{j=1}^{r}(1-\chi_j^-)^{-C_{ji}}. \]

This symplectomorphism is integrable. We will call it the discrete Toda evolution. Its integrals have the following description in terms of characters of finite dimensional representations of G. Let $x_i^-, h_i, x_i^+$ be Chevalley generators of the Lie algebra $g = \mathrm{Lie}(G)$ and let $\varphi_i : SL_2(i) \subset G$ be the natural embedding of the $SL_2$ subgroup generated by the elements $x_i^-, h_i, x_i^+$ corresponding to the simple root $\alpha_i$. For a Coxeter element $w$ of the Weyl group $W$ of G fix a reduced decomposition $w = s_{i_1}\cdots s_{i_r}$, where $r = \operatorname{rank}(g)$, and define the element of G

\[ g = \prod_{j=1}^{r}\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j} \exp(-\chi_{i_1}^+x_{i_1}^+)\exp(x_{i_1}^-)\cdots\exp(-\chi_{i_r}^+x_{i_r}^+)\exp(x_{i_r}^-). \]

Here $\{h^j\}_{j=1}^r$ are elements of the Cartan subalgebra of $g$ corresponding to fundamental weights, $h_i = \sum_{j=1}^{r} C_{ij}h^j$. The functions

\[ \mathrm{Ch}_V(\chi^+,\chi^-) = \mathrm{Tr}_V(g), \tag{1} \]

where $V$ is a finite dimensional representation of G, form a Poisson commutative subalgebra in the algebra of functions on the phase space. They are the integrals of the map $\alpha$. The characters of fundamental irreducible representations of G generate the subalgebra of integrals. Consider the function

\[ H_d(\chi^\pm) = \frac{1}{2}(\xi,\xi), \]


where $g = \exp(\xi)$ and $(\cdot,\cdot)$ is the Killing form on $\mathrm{Lie}\,G$. The Hamiltonian flow generated by this function interpolates the map $\alpha$. For $G = SL_n$ the integrals (1) are the integrals of the so-called relativistic Toda lattice [Rui90]. In a neighborhood of the identity these integrals turn into the integrals of the (usual) Toda lattice. In the same sense as a Lie algebra can be regarded as a linearization of a Lie group, the usual Toda lattices are linearizations of Coxeter–Toda lattices.

Integrable discretizations of Toda lattices were discovered by Hirota [Hir77], who studied their solitonic aspects (see also [DJM82]). Later they were re-derived in [Sur90, Sur91b] from a discrete time version of a Lax pair. The Hamiltonian interpretation based on classical r-matrices was derived in [Sur91a] and generalized to Toda systems related to all classical Lie groups (and their affine extensions). In [KR97] a discrete version of Toda field theory was described together with the Hamiltonian structure and its quantization. The role of matrix factorization in discrete integrable systems was noticed quite some time ago. The references include [Sym82, QNCvdL84, MV91, DLT89].

The primary goal of this article is not to produce new discrete integrable systems (although those related to exceptional Lie groups are new) but rather to demonstrate how the discrete Toda evolution together with its integrals (1) can be derived in a systematic way from the geometry of Poisson Lie groups and from the factorization relation. A large part of this paper is devoted to the study of the phase space of these systems. This requires a careful study of symplectic leaves of B (a Borel subgroup in a simple algebraic Lie group G with the standard Poisson Lie structure) and of its double.

In Sect. 1 we recall basic facts about Poisson Lie groups. Section 2 contains the analysis of symplectic leaves of simple complex algebraic groups G with a standard Poisson Lie structure. In Sect. 3 we describe symplectic leaves of the Borel subgroup B of a simple Poisson Lie group G. Section 4 contains the description of symplectic leaves of the double of B and of how they are related to symplectic leaves of B and of G. The factorization dynamics on factorizable Poisson Lie groups is described in Sect. 5, and the factorization dynamics on Coxeter symplectic leaves is studied in Sect. 6. The interpolating flow and the relation to the (usual) Toda lattices are described in Sect. 7. In the conclusion we point out what may be done next in this direction.
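The statement in the Introduction that the discrete Toda evolution $\alpha$ preserves the brackets $\{\chi_i^+,\chi_j^-\} = -2d_iC_{ij}\chi_i^+\chi_j^-$ can be sanity-checked numerically. The sketch below is not the authors' argument: it assumes the formulas for $\alpha$ and for the brackets exactly as written in the Introduction, uses the $B_2$ Cartan matrix with $d = (1,2)$ as an illustrative choice satisfying $d_iC_{ij} = d_jC_{ji}$, and all function names are hypothetical. It compares $J P(x) J^T$ with $P(\alpha(x))$, where $P$ is the matrix of brackets and $J$ the Jacobian of $\alpha$.

```python
def toda_map(a, b, C):
    """alpha: (chi^+, chi^-) = (a, b) -> (b, new_b), formula from the Introduction."""
    r = len(a)
    new_b = []
    for i in range(r):
        prod = 1.0
        for j in range(r):
            prod *= (1.0 - b[j]) ** (-C[j][i])
        new_b.append(b[i] ** 2 / a[i] * prod)
    return list(b), new_b


def poisson_matrix(a, b, C, d):
    """Bracket matrix in coordinates (a_1..a_r, b_1..b_r)."""
    r = len(a)
    P = [[0.0] * (2 * r) for _ in range(2 * r)]
    for i in range(r):
        for j in range(r):
            val = -2.0 * d[i] * C[i][j] * a[i] * b[j]
            P[i][r + j] = val      # {a_i, b_j}
            P[r + j][i] = -val     # {b_j, a_i}
    return P


def jacobian(a, b, C):
    """Analytic Jacobian of toda_map with respect to (a, b)."""
    r = len(a)
    _, new_b = toda_map(a, b, C)
    J = [[0.0] * (2 * r) for _ in range(2 * r)]
    for i in range(r):
        J[i][r + i] = 1.0                      # d(alpha(a_i)) / d b_i
        J[r + i][i] = -new_b[i] / a[i]         # d(alpha(b_i)) / d a_i
        for m in range(r):
            diag = 2.0 / b[m] if m == i else 0.0
            J[r + i][r + m] = new_b[i] * (diag + C[m][i] / (1.0 - b[m]))
    return J


def poisson_defect(a, b, C, d):
    """Largest entry of |J P(x) J^T - P(alpha(x))|; zero for a Poisson map."""
    n = 2 * len(a)
    J = jacobian(a, b, C)
    P = poisson_matrix(a, b, C, d)
    new_a, new_b = toda_map(a, b, C)
    Q = poisson_matrix(new_a, new_b, C, d)
    defect = 0.0
    for p in range(n):
        for q in range(n):
            pushed = sum(J[p][s] * P[s][t] * J[q][t]
                         for s in range(n) for t in range(n))
            defect = max(defect, abs(pushed - Q[p][q]))
    return defect


# Illustrative data (an assumption, not taken from the paper): type B_2.
C_B2 = [[2, -2], [-1, 2]]
D_B2 = [1, 2]

if __name__ == "__main__":
    print(poisson_defect([0.3, 0.7], [0.2, 0.4], C_B2, D_B2))
```

With the symmetrizing $d$ the defect is zero up to rounding; replacing $d$ by a vector that does not symmetrize $C$ gives a visibly nonzero defect, which illustrates where the condition $d_iC_{ij} = d_jC_{ji}$ enters.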

1. Basic Facts About Simple Poisson Lie Groups

1.1. Basic facts about Poisson Lie groups. A Poisson Lie group is a Lie group equipped with a Poisson structure which is compatible with the group multiplication. There is a functorial correspondence between connected, simply connected Poisson Lie groups and Lie bialgebras [Dri87]. The Lie bialgebra corresponding to a given Poisson Lie group is called the tangent Lie bialgebra. The dual of a Lie bialgebra $p$ is the dual vector space $p^*$ equipped with the Lie bracket dual to the Lie cobracket of $p$ and with the Lie cobracket dual to the Lie bracket on $p$. The dual $P^*$ of a Poisson Lie group $P$ is, by definition, the connected, simply connected Poisson Lie group having the dual $p^*$ of the Lie bialgebra $p$ corresponding to $P$ as its Lie bialgebra. Denote by $p^{*op}$ the Lie bialgebra $p^*$ with opposite cobracket (which is minus the original cobracket). The double $D(p)$ of $p$ is the direct sum $p \oplus p^{*op}$ as a Lie coalgebra and its Lie bracket is determined uniquely by the requirement that the natural inclusions $i : p \to D(p)$ and $j : p^{*op} \to D(p)$ (into the first and second summand, respectively) are Lie bialgebra


homomorphisms and by the fact that the natural bilinear form $\langle (x,l),(y,m)\rangle = m(x) + l(y)$ is $D(p)$-invariant. The double $D(P)$ of $P$ is the connected, simply connected Poisson Lie group having $D(p)$ as its Lie bialgebra. The maps $i$ and $j$ lift to injective Poisson maps $i : P \to D(P)$, $j : P^{*op} \to D(P)$ and consequently to a map $\mu\circ(i\times j) : P\times P^{*op} \to D(P)$, $(x,y) \mapsto i(x)j(y)$, which is also a local Poisson isomorphism. By a local isomorphism we mean an isomorphism between neighborhoods of the identity.

A symplectic leaf of a Poisson manifold is an equivalence class of points which can be joined by piecewise Hamiltonian flow lines. When the Poisson manifold is a Poisson Lie group $P$, there is another description of these leaves which involves the dressing action of the dual Poisson Lie group on $P$. The Poisson Lie group $P^*$ acts on $D(P)$ via left multiplication, $y\cdot x := j(y)x$. We also have a map $\varphi : P \to D(P)/j(P^{*op})$ which is the composition of $i$ with the natural projection. In a neighborhood of the identity this map $\varphi$ is a Poisson isomorphism and induces the dressing action of $P^*$ on $P$ [STS85]. The map $\varphi$ is a finite cover and has open dense range. The symplectic leaves of $P$ are orbits of the dressing action of $P^{*op}$ and are connected components of preimages of left $P^*$-orbits in $D(P)/j(P^{*op})$.

Among the cases which have been investigated we point out the following three: $P = G$ (a complex connected and simply connected simple Lie group with standard Poisson structure), $P = B$ (a Borel (Poisson) subgroup of G), and $P = K$ (the compact real form of G). For $P = K$, the double, which can be identified with G as a real group, is globally isomorphic to $K\times K^{*op}$ as a real manifold via Iwasawa factorization. The map $\varphi$ in this case is a global Poisson isomorphism [LW90]. There is a particularly simple relation between the Bruhat decomposition of $K$ and its symplectic leaves [Soi90, LW90].
It is worth noticing that, as the double of $K$, the complex simple Lie group is equipped with a real Poisson structure which is different from the standard Lie Poisson structure on G. In the first two cases, which are the ones we shall consider in detail below, the double is only locally isomorphic to $P\times P^{*op}$. The symplectic leaves of G have been studied in [HL93]. Symplectic leaves for B were described in [DCKP95]. We reproduce the results of [HL93, DCKP95] below but will describe symplectic leaves in G and B more explicitly.

1.2. Standard Poisson structure on a simple Lie group. Let G be a simple complex Lie group. Fix a labeling of the nodes on the Dynkin diagram associated with the Lie algebra $\mathrm{Lie}\,G$ by integers $i = 1,\ldots,r = \operatorname{rank}(G)$. Assign the simple root $\alpha_i$ to the node labeled by $i$. Let $C$ be the Cartan matrix, that is,

\[ C_{ji} = 2\,\frac{(\alpha_i,\alpha_j)}{(\alpha_j,\alpha_j)}. \]

Denote by $d_i$ the length of the $i$-th simple root; then $d_iC_{ij} = d_jC_{ji}$. Fix a Borel subgroup $B \subset G$. This fixes the polarization of the root system and, together with the enumeration of nodes of the Dynkin diagram, fixes the generators $\{h_i, x_i^\pm\}_{i=1,\ldots,r}$ of the Lie algebra $\mathrm{Lie}\,G$ corresponding to simple roots of $\mathrm{Lie}\,G$. The determining relations for these generators are:

\[ [h_i,h_j] = 0, \qquad [x_i^+,x_j^-] = \delta_{ij}h_i, \qquad [h_i,x_j^\pm] = \pm C_{ij}x_j^\pm, \qquad \mathrm{ad}(x_i^\pm)^{1-C_{ij}}x_j^\pm = 0, \quad i \neq j. \]

The standard Lie bialgebra structure on $\mathrm{Lie}\,G$ compatible with the chosen Borel subgroup B is given by the cobracket acting on generators as follows:

\[ \delta(h_i) = 0, \qquad \delta(x_i^\pm) = d_i\,x_i^\pm\wedge h_i. \]

This induces the Poisson Lie structure on G for which the Lie bialgebra described above is the tangent Lie bialgebra. The Borel subgroup $B$ and its opposite $B^-$ are Poisson Lie subgroups. The Lie bialgebra $\mathrm{Lie}(G)$ is isomorphic to the double of the Lie bialgebra $\mathrm{Lie}(B)$ quotiented by the diagonally embedded Cartan subalgebra [Dri87]. We denote by $N$ and $N^-$ the nilpotent subgroups of $B$ and $B^-$, respectively. Since $H = B\cap B^-$ we have two natural projections and isomorphisms $\theta : B \to B/N \cong H$ and $\theta^- : B^- \to B^-/N^- \cong H$. We shall also write $B^+$ and $N^+$ for $B$ and $N$, respectively.

2. Symplectic Leaves of G

2.1. Bruhat decomposition of the double of G. A simple Lie group G with fixed Borel subgroup B admits a Bruhat decomposition with respect to B:

\[ G = \bigsqcup_{w\in W} BwB. \]

Here $BwB \stackrel{\mathrm{def}}{=} B\dot wB$, where $\dot w$ is a representative of $w \in N_G(H)/H$ in $N_G(H)$ (clearly $B\dot wB$ depends only on the class $w \in N_G(H)/H$). There is also a Bruhat decomposition of G with respect to $B^-$:

\[ G = \bigsqcup_{w} B^-wB^-. \]

Recall [KS98] that the double $D(G)$ is, as a group, isomorphic to $G\times G$. The cell decompositions of G therefore give the Bruhat decomposition of $D(G)$ with respect to $D^- = B^-\times B$:

\[ D(G) = \bigsqcup_{(w_1,w_2)\in W\times W} D^-(w_1,w_2)D^-, \]

where $D^-(w_1,w_2)D^- = B^-w_1B^-\times Bw_2B$ and $W\times W = N_{D(G)}(H\times H)/H\times H$ is the Weyl group of $D(G)$. We can also represent $D^- \subset D(G)$ as $D^- = (H\times H)(N^-\times N^+) = (N^-\times N^+)(H\times H)$. Then for the Bruhat cell $D^-(w_1,w_2)D^-$ we can write

\[ D^-(w_1,w_2)D^- = (N_{w_1}^-\times N_{w_2}^+)(H\times H)(\dot w_1,\dot w_2)D^-, \tag{2} \]

where $N_w^\pm = \{n\in N^\pm \mid \dot w^{-1}n\dot w \in N^\mp\}$ (clearly this definition of $N_w^\pm$ does not depend on the choice of $\dot w$).


2.2. Left cosets $D(G)/j(G^-)$. Let $G^- = G^{*op}$, which may be identified with $\{(b^-,b)\in B^-\times B \mid \theta^-(b^-) = \theta(b)^{-1}\}$, a subgroup of $B^-\times B$ [KS98]. We write $j : G^- \hookrightarrow B^-\times B$ for this identification. There is a natural isomorphism:

\[ D^-/j(G^-) \simeq H. \tag{3} \]

The group $H\times H$ acts on cosets $(\dot w_1,\dot w_2)D^-/j(G^-)$ by left multiplication:

\[ \begin{aligned} (h,h')(\dot w_1,\dot w_2)(b^-,b)\,j(G^-) &= (\dot w_1,\dot w_2)(h_{w_1}b^-,\,h'_{w_2}b)\,j(G^-) \\ &= (\dot w_1,\dot w_2)(\mathrm{Ad}_{h_{w_1}}b^-,\,h'_{w_2}h_{w_1}(\mathrm{Ad}_{h_{w_1}^{-1}}b))\,j(G^-) \\ &= (\dot w_1,\dot w_2)(\theta^-(b^-),\,h'_{w_2}h_{w_1}\theta(b))\,j(G^-). \end{aligned} \]

Here $(b^-,b)\in B^-\times B$ and we write $h_w = \dot w^{-1}h\dot w$. Using also (3) we conclude that this action has stationary subgroup

\[ H^{w_1,w_2} = \{(h,h')\in H\times H \mid h_{w_1} = (h'_{w_2})^{-1}\}. \tag{4} \]

Thus, we have an isomorphism $D^-(w_1,w_2)D^-/j(G^-) \cong N_{w_1}^-\times N_{w_2}^+\times H$ and, in particular, $\dim(D^-(w_1,w_2)D^-/j(G^-)) = l(w_1) + l(w_2) + r$.

2.3. Double cosets $j(G^-)\backslash D(G)/j(G^-)$. For double cosets, we have

\[ j(G^-)(n_{w_1}^-,n_{w_2}^+)(h_1,h_2)(\dot w_1,\dot w_2)(b^-,b)\,j(G^-) = j(G^-)(\tilde h_1,\tilde h_2)(\dot w_1,\dot w_2)\,j(G^-), \]

where $\tilde h_1 = h_1\theta^-(b^-)_{w_1^{-1}}$, $\tilde h_2 = h_2\theta^-(b^-)_{w_2^{-1}}$. The set of such double cosets is, according to (4), isomorphic to

\[ j(H)\backslash H\times H/j_{w_1,w_2}(H), \tag{5} \]

where $j(H)\subset H\times H$ is the subgroup that consists of elements $(h,h^{-1})$, $h\in H$, and $j_{w_1,w_2}(h) = (h_{w_1},h_{w_2}^{-1})$. The coset of $(h_1,h_2)\in H\times H$ in (5) is the set $\{(h,h^{-1})(1,\,h_1h_2h''_{w_2}h''^{-1}_{w_1}) \mid h,h''\in H\}$. Thus (5) is isomorphic to $H_{w_2^{-1}w_1}$, where $H_w$ is the space of $H$-orbits on $H$ with respect to the action

\[ h : h' \to h'hh_w^{-1}. \tag{6} \]

All orbits are naturally isomorphic and we denote the one through 1 by $H^w$. Furthermore, $H_{w_2^{-1}w_1}$ is isomorphic to $\ker(w_2^{-1}w_1 - \mathrm{id}) = \{h\in H \mid h_{w_2^{-1}w_1} = h\}$. Thus, we proved:

Proposition 1. We have an isomorphism

\[ j(G^-)\backslash D^-(w_1,w_2)D^-/j(G^-) \simeq H_{w_2^{-1}w_1}. \tag{7} \]

Each $j(G^-)$ orbit corresponding to an element of this set is isomorphic to

\[ N_{w_1}^-\times N_{w_2}^+\times H^{w_2^{-1}w_1}. \tag{8} \]

In particular, each such orbit has the dimension $\ell(w_1) + \ell(w_2) + \dim(\operatorname{coker}(w_2^{-1}w_1 - 1))$.

Notice that the isomorphism (7) and the isomorphisms between $j(G^-)$-orbits and sets (8) are not canonical but depend on the choice of representatives $\dot w_1,\dot w_2$. What we really have here is the fiber bundle

\[ D^-(w_1,w_2)D^-/j(G^-) \to j(G^-)\backslash D^-(w_1,w_2)D^-/j(G^-) \tag{9} \]

over the torus $H_{w_2^{-1}w_1}$ whose fibers are $j(G^-)$-orbits.
2.4. Symplectic leaves of G and double Bruhat cells. Double Bruhat cells are defined as intersections of $B$-Bruhat cells and $B^-$-Bruhat cells: $G^{w_1,w_2} = B^-w_1B^-\cap Bw_2B$. It is known that $\dim(G^{w_1,w_2}) = l(w_1) + l(w_2) + r$ (see, for example, [FZ99]).

Let $\varphi : G \stackrel{i}{\hookrightarrow} D \to D/j(G^-)$ be the composition of the diagonal embedding with the natural projection. According to (2) we have $D^-(w_1,w_2)D^-/j(G^-) \cong (N_{w_1}^-\times N_{w_2}^+)(\dot w_1,\dot w_2)\,i(H)$. Define

\[ \Gamma := \{\varepsilon\in H \mid \varepsilon^2 = 1\}. \]

Theorem 1.
1. $\varphi(G^{w_1,w_2}) \subset D^-(w_1,w_2)D^-/j(G^-)$.
2. The image of $\varphi$ is Zariski open in $D^-(w_1,w_2)D^-/j(G^-)$.
3. For each $x\in\operatorname{Im}\varphi$ the group $\Gamma$ acts by left translations on $\varphi^{-1}(x)$.
4. The restriction of the map $\varphi$ to $G^{w_1,w_2}$ is a covering map with group of deck transformations $\Gamma$.

Here is the outline of the proof. Let $g = n_{w_1}^-\dot w_1b^- = n_{w_2}^+\dot w_2b^+ \in B^-w_1B^-\cap Bw_2B$, where $n_{w_1}^-\in N_{w_1}^-$, $n_{w_2}^+\in N_{w_2}^+$ and $b^\pm\in B^\pm$. Then we have

\[ \varphi(g) = (g,g)\,j(G^-) = (n_{w_1}^-\dot w_1b^-,\,n_{w_2}^+\dot w_2b^+)\,j(G^-). \]

Therefore $\varphi(g)$ is an element of $D^-(w_1,w_2)D^-/j(G^-)$.

Conversely, assume $x_1 = n_{w_1}^-\dot w_1b^-$ and $x_2 = n_{w_2}^+\dot w_2b^+$, and consider the class $(x_1,x_2)\,j(G^-)$. This class has a representative of the form $(g,g)\,j(G^-)$ if and only if there exists $(\eta^+,\eta^-)\in G^-$ such that $n_{w_1}^-\dot w_1\eta^- = n_{w_2}^+\dot w_2\eta^+$. According to [FZ99] such elements exist when $(n_{w_1}^-,n_{w_2}^+)$ belongs to an open dense subset of $N_{w_1}^-\times N_{w_2}^+$. Therefore the image of $\varphi$ is open dense in $D^-(w_1,w_2)D^-/j(G^-)$.

Furthermore $\varphi(g\varepsilon) = (g\varepsilon,\,g\varepsilon^{-1})\,j(G^-) = \varphi(g)$ for each $\varepsilon\in\Gamma$. This shows that $\Gamma$ acts (fixed point freely) on the preimages of points. Since $i(\Gamma) = i(H)\cap j(H)$ is the kernel of $\varphi$, $\Gamma$ is the group of deck transformations for the covering map $\varphi : G^{w_1,w_2} \to D^-(w_1,w_2)D^-/j(G^-)$.

Since the symplectic leaves in G are connected components of preimages of $j(G^-)$-orbits in $D(G)/j(G^-)$ we obtain the following description of leaves.

Corollary 1. Connected components of preimages of $G^-$ orbits in $D^-(w_1,w_2)D^-/j(G^-)$ with respect to the map $\varphi$ are symplectic leaves of G which belong to the double Bruhat cell $G^{w_1,w_2}$.


3. Symplectic Leaves of B

3.1. $B^-$ double cosets in $D(B)$. The double of a Borel subgroup B of G is isomorphic to $G\times H$ as a group (for the details see, for example, [KS98]). Furthermore, $B^{*op}\cong B^-$, sitting inside $G\times H$ as $j : B^- \to G\times H$, $j(b^-) = (b^-,\theta^-(b^-))$. In particular, $D(B)$ has the following cell decompositions:

\[ D(B) = G\times H = \bigsqcup_{w\in W} B^-wB^-\times H = \bigsqcup_{w\in W} BwB\times H. \tag{10} \]

Denote $D(B)_w = B^-wB^-\times H$ and $D(B)^w = BwB\times H$. For the quotient $D(B)/j(B^-)$ we have

\[ D(B)/j(B^-) = \bigsqcup_{w\in W}(B^-wB^-\times H)/j(B^-) \cong \bigsqcup_{w\in W} N_w^-\times H. \]

Let us compute double cosets:

\[ \begin{aligned} j(B^-)(b^-\dot w\tilde b^-,h)\,j(B^-) &= j(B^-)(hb^-\dot w\tilde b^-,1)\,j(B^-) \\ &= j(B^-)(hh^-\dot w\tilde h^-,1)\,j(B^-) \\ &= j(B^-)(\dot wh',1)\,j(B^-) \simeq j(H)(\dot wh',1)\,j(H). \end{aligned} \]

Clearly $j(h) = (h,h^{-1})\in H\times H\subset G\times H$. Therefore we have the isomorphism

\[ j(B^-)\backslash D(B)_w/j(B^-) \cong H_w, \]

where, we recall, $H_w$ is the space of $H$-orbits on $H$ for the action (6). In particular, $\dim(H_w) = \dim(\ker(w - \mathrm{id}))$.

Choose a point $(n_w^-\dot w,h)$ in $D(B)$ representing an equivalence class in $D(B)/j(B^-)$. The left $j(B^-)$ orbit passing through this point is the set of elements

\[ \begin{aligned} &\{(b^-n_w^-\dot w,\,\theta(b^-)^{-1}h)\,j(B^-),\ b^-\in B^-\} \\ &\quad= \{(\tilde n_w^-\dot w\,\theta(b^-)_w,\,\theta(b^-)^{-1}h)\,j(B^-) \mid b^-\in B^-,\ b^-n_w^- = \tilde n_w^-\theta(b^-)\} \\ &\quad= \{(\tilde n_w^-\dot w,\,\theta(b^-)^{-1}\theta(b^-)_wh)\,j(B^-) \mid b^-\in B^-,\ b^-n_w^- = \tilde n_w^-\theta(b^-)\}. \end{aligned} \]

Thus, we proved the following

Proposition 2.
1. $j(B^-)\backslash D(B)_w/j(B^-) \cong (\mathbb{C}^\times)^{\dim(\ker(w-\mathrm{id}))}$.
2. Each orbit is isomorphic to $N_w^-\times H^w$.

Similar to the case of G, the isomorphisms are not canonical but we have a fiber bundle $D(B)_w/j(B^-) \to j(B^-)\backslash D(B)/j(B^-)$ whose fibers are the $j(B^-)$ orbits. For $w\in W$ define the subset $B_w = B\cap B^-wB^-$. According to the general theory, symplectic leaves of B are connected components of preimages of $j(B^-)$-orbits in $D(B)/j(B^-)$ with respect to the map $\varphi : B\subset D(B)\to D(B)/j(B^-)$.

Theorem 2.
1. The subset $\varphi(B_w)\subset BwB\times H/j(B^-)$ is Zariski open.
2. For each $x\in\operatorname{Im}\varphi$ the group $\Gamma$ acts freely on $\varphi^{-1}(x)$ by left translations.
3. The restriction of $\varphi$ to $B_w$ is a covering map $B_w\to D(B)_w/j(B^-)$ with the group of deck transformations isomorphic to $\Gamma$.
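The dimension count implied by Proposition 2 can be illustrated for $G = SL_n$, where $W$ is the symmetric group. The sketch below is an illustration under stated assumptions rather than part of the paper: it takes the orbit dimension to be $\ell(w) + r - \dim\ker(w - \mathrm{id})$ (using $\dim N_w^- = \ell(w)$ and $\dim H^w = r - \dim\ker(w-\mathrm{id})$), uses the standard facts that the length of a permutation is its number of inversions and that $\dim\ker(w-\mathrm{id})$ on the rank $r = n-1$ Cartan subalgebra equals the number of cycles minus one, and all function names are hypothetical.

```python
from itertools import permutations


def length(w):
    """Coxeter length of a permutation: its number of inversions."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def num_cycles(w):
    """Number of cycles of the permutation i -> w[i] (fixed points included)."""
    seen, count = set(), 0
    for start in range(len(w)):
        if start not in seen:
            count += 1
            k = start
            while k not in seen:
                seen.add(k)
                k = w[k]
    return count


def orbit_dimension(w):
    """l(w) + r - dim ker(w - id) for w in S_n, with r = n - 1."""
    r = len(w) - 1
    kernel_dim = num_cycles(w) - 1
    return length(w) + r - kernel_dim


def coxeter_element(n):
    """The n-cycle i -> i + 1 (mod n), a product of all simple reflections."""
    return tuple((i + 1) % n for i in range(n))


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(n, orbit_dimension(coxeter_element(n)))
```

For a Coxeter element the count gives $2r$, in agreement with the Zariski open subset isomorphic to $\mathbb{C}^{2r}$ mentioned in the Introduction, and for every $w$ the count is even, as it must be for a symplectic leaf.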


3.2. Factorization of left cosets and Darboux coordinates on symplectic leaves of B. Fix a reduced decomposition of $w\in W$, $w = s_{i_1}\cdots s_{i_{\ell(w)}}$, where $\ell(w)$ is the length of w. Consider the subset $B_{i_1,\dots,i_{\ell(w)}} = B_{s_{i_1}}\cdots B_{s_{i_{\ell(w)}}}\subset B$, which is the image of $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$ under the multiplication in G. Here $B_{s_i} = B(i)\cap B^-(i)s_iB^-(i)$ and $B(i) = B\cap SL_2(i)$ is the intersection of the Borel subgroup in G and of the $SL_2$-subgroup corresponding to the i-th simple root. For $w\in W$ define numbers of "repetitions" $n_i = \#\{\text{occurrences of } i \text{ in the sequence } (i_1,\dots,i_{\ell(w)})\}$, and define the support of w as $I(w) = \{i \mid 1\le i\le r,\ n_i\ne 0\}$. If $n_i\ge 1$ consider the following action of $(\mathbb{C}^\times)^{n_i-1}$ on $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$:
$$ (x_1,\dots,x_{n_i-1}) : (b_{i_1},\dots,b_{i_{\ell(w)}}) \mapsto (\dots, b_i\varphi_i(x_1),\dots,\mathrm{Ad}_{\varphi_i(x_1)}(b_j),\dots,\varphi_i(x_1)^{-1}b_i\varphi_i(x_2),\dots,\mathrm{Ad}_{\varphi_i(x_2)}(b_k),\dots,\varphi_i(x_2)^{-1}b_i\varphi_i(x_3),\dots). \tag{11} $$

Here $\varphi_i:\mathbb{C}^\times\hookrightarrow SL_2\hookrightarrow G$ is the composition of the embeddings of $\mathbb{C}^\times$ into $SL_2$ as the (complex) Cartan subgroup and of $SL_2$ into G as the i-th $SL_2$-triple. It is clear that for different i, j with $n_i, n_j$ both greater than 1, the corresponding actions commute, so that w gives rise to an action of the torus J, the product of all $(\mathbb{C}^\times)^{n_i-1}$ over i with $n_i>1$.

Proposition 3. The multiplication map $B_{s_{i_1}}\times\dots\times B_{s_{i_{\ell(w)}}}\to B_{i_1,\dots,i_{\ell(w)}}$ commutes with the J-action, assuming J acts trivially on $B_{i_1,\dots,i_{\ell(w)}}$, and establishes an isomorphism $B_{i_1,\dots,i_{\ell(w)}}\simeq (B_{s_{i_1}}\times\dots\times B_{s_{i_{\ell(w)}}})/J$.

Here is the outline of the proof. We can choose the elements of $(\mathbb{C}^\times)^{n_i-1}$ in such a way that the Cartan parts of all but one of the elements $b_i$ of $(b_{i_1},\dots,b_{i_{\ell(w)}})$ are trivial. If we do this for each $i\in I(w)$ we obtain a cross-section of the action of J. Then it quickly follows that this cross-section is a birational isomorphism.

The support I(w) of w naturally defines a sub-diagram of the Dynkin diagram of G (by deleting all nodes not in I(w)) and hence a subgroup of G. Let $B'_w$ be the image in G of the Bruhat cell corresponding to w in this subgroup. Then multiplication provides an isomorphism between $B'_w\times H(w)$ and $B_w$, where H(w) is the subgroup of H corresponding to the simple roots $\alpha_i$ with $i\notin I(w)$. The following is known (see for example [FZ99]).

Theorem 3.
• For each $w\in W$ with fixed reduced decomposition the set $B_{i_1,\dots,i_{\ell(w)}}$ is Zariski open in $B'_w$.
• For each two reduced decompositions $w = s_{i_1}\cdots s_{i_{\ell(w)}}$ and $w = s_{j_1}\cdots s_{j_{\ell(w)}}$ there is a birational isomorphism between $B_{i_1,\dots,i_{\ell(w)}}$ and $B_{j_1,\dots,j_{\ell(w)}}$.
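The bookkeeping behind the torus J can be made concrete with a few lines of code. The sketch below is only an illustration: the function name `torus_data` is hypothetical, and the input word is assumed to be reduced (this is not checked). It counts the repetitions $n_i$, lists the support I(w), and adds up the exponents $n_i - 1$, which gives $\dim J = \ell(w) - |I(w)|$.

```python
from collections import Counter


def torus_data(word):
    """Repetition numbers n_i, support I(w) and dim J for a word in simple reflections.

    `word` lists the indices (i_1, ..., i_l) of a decomposition of w.  The word is
    assumed to be reduced; this helper does not verify that.
    """
    n = Counter(word)                         # n[i] = number of occurrences of i
    support = sorted(n)                       # I(w) = {i : n_i != 0}
    dim_j = sum(k - 1 for k in n.values())    # J is the product of the (C^x)^(n_i - 1)
    return dict(n), support, dim_j


if __name__ == "__main__":
    # w = s_1 s_2 s_1 in type A_2: the index 1 occurs twice, so J is one-dimensional.
    print(torus_data([1, 2, 1]))
    # A word without repetitions gives a trivial torus.
    print(torus_data([2, 1, 3]))
```

For a word in which every index occurs exactly once, the returned dimension is zero, so no quotient by J is needed.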


Let us describe the symplectic leaves of $B_w$ more explicitly, using results of the previous subsection. There is a natural coordinate system in a neighborhood of the identity of the subgroup B(i) in which the group elements are written as
$$ \exp(a_ih_i + b_ix_i^+) = \exp(a_ih_i)\exp(b_i'x_i^+), \qquad\text{where}\quad b_i' = e^{-a_i}\,\frac{b_i}{a_i}\,\sinh(a_i). $$
The corresponding global coordinates on B(i) are $A_i = e^{a_i}$, $B_i = b_i\,\frac{\sinh(a_i)}{a_i}$. In these coordinates the above element is represented by the 2 × 2 matrix $\begin{pmatrix} A_i & B_i\\ 0 & A_i^{-1}\end{pmatrix}$ in the two-dimensional representation of $SL_2$. The subgroup B(i) is a Poisson Lie subgroup in $SL_2(i)$ with the following Poisson brackets between coordinate functions:
$$ \{A_i, B_i\} = -d_iA_iB_i. $$
Here and below we will abuse notation and denote coordinates and coordinate functions by the same letters. The symplectic leaves of B(i) are one 2-dimensional leaf $B_{s_i} = \{(A_i, B_i) \mid A_i\in\mathbb{C}^\times,\ B_i\in\mathbb{C}^\times\}$ and a 1-dimensional family of zero-dimensional leaves $\{A_i = t,\ B_i = 0\}$.

The product $B_{s_{i_1}}\times\dots\times B_{s_{i_{\ell(w)}}}$ carries the natural product symplectic structure. Since the multiplication map is Poisson, the sub-manifold $B_{i_1,\dots,i_{\ell(w)}}\subset B'_w$ is a Poisson submanifold. According to Theorem 3, $B_{i_1,\dots,i_{\ell(w)}}$ is Zariski open in $B'_w$, which implies that the symplectic leaves of $B_{i_1,\dots,i_{\ell(w)}}$ are Zariski open sub-varieties in the symplectic leaves of B. The following result, combined with the product Poisson structure on $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$, allows one to describe symplectic leaves of $B_{i_1,\dots,i_{\ell(w)}}$ explicitly via Hamiltonian reduction.

Proposition 4. The action (11) of $J = (\mathbb{C}^\times)^{\ell(w)-|I(w)|}$ on $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$ is Hamiltonian. Here $|I(w)|$ is the cardinality of the support of w.

The Hamiltonians generating this action can be constructed explicitly as linear functions of log A and log B. They do not commute with respect to the Poisson brackets. This means that the pull-back of the moment map maps the Poisson algebra of functions on $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$ to the Poisson algebra of functions on a hyper-plane in the vector space dual to a central extension of the Lie algebra $\mathfrak{j}$. The symplectic leaves of the quotient space $(B_{s_{i_1}}\times\dots\times B_{s_{i_{\ell(w)}}})/J$ are preimages of the corresponding coadjoint orbits (of the centrally extended Lie algebra $\mathfrak{j}$) with respect to the moment map. In other words, the symplectic leaves of these quotient spaces can be obtained via Hamiltonian reduction. We will leave the details of this Hamiltonian reduction to a separate publication. Below we will consider symplectic leaves corresponding to Coxeter elements. In this case all $n_i = 1$ and so J is trivial. This means that for these symplectic leaves the coordinates described above are global.
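The passage from the local coordinates $(a_i, b_i)$ to the global coordinates $(A_i, B_i)$ on B(i) introduced at the start of this discussion can be spot-checked numerically in the two-dimensional representation, where $h_i = \mathrm{diag}(1,-1)$ and $x_i^+$ is the elementary matrix $E_{12}$. The following is a minimal sketch under those assumptions; the helper names are hypothetical and the matrix exponential is a plain Taylor series, adequate only for moderate entries and $a \ne 0$.

```python
import math


def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_exp(m, terms=60):
    """Exponential of a 2x2 matrix via its Taylor series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[e / k for e in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def global_coordinates(a, b):
    """(A, B) = (e^a, b sinh(a)/a), assuming a != 0."""
    return math.exp(a), b * math.sinh(a) / a


def check(a, b, tol=1e-12):
    """Compare exp(a h + b x^+), exp(a h) exp(b' x^+) and [[A, B], [0, 1/A]]."""
    lhs = mat_exp([[a, b], [0.0, -a]])
    b_prime = math.exp(-a) * (b / a) * math.sinh(a)
    rhs = mat_mul(mat_exp([[a, 0.0], [0.0, -a]]),
                  mat_exp([[0.0, b_prime], [0.0, 0.0]]))
    big_a, big_b = global_coordinates(a, b)
    target = [[big_a, big_b], [0.0, 1.0 / big_a]]
    return all(abs(lhs[i][j] - rhs[i][j]) < tol and
               abs(lhs[i][j] - target[i][j]) < tol
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(check(0.7, -1.3))
```

The check works because the square of the matrix $a h + b x^+$ is $a^2$ times the identity, so the exponential series collapses to $\cosh(a)$ and $\sinh(a)/a$ terms.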


3.3. Coxeter symplectic leaves of B. An element of the Weyl group W is called a Coxeter element if its reduced decomposition into the product of simple reflections $w = s_{i_1}\cdots s_{i_{l(w)}}$ does not have repetitions in the sequence of sub-indices and if $l(w) = r$ (i.e. in this product each generator of W appears exactly once). It is not difficult to see that if w is a Coxeter element, then $w - \mathrm{id}$ has rank r, i.e. $\dim(\ker(w - \mathrm{id})) = 0$, and therefore (by Proposition 2) the subset $B_w$ is a symplectic leaf of B. We will call such leaves Coxeter symplectic leaves. Let $U_i: B_{s_i}\hookrightarrow B$ be the natural inclusion of $B_{s_i}\subset B(i)$ into B. Then any element of $B_{i_1,\dots,i_{\ell(w)}}$ can be written as $U_{i_1}(A_{i_1}, B_{i_1})\cdots U_{i_r}(A_{i_r}, B_{i_r})$. Thus, for Coxeter symplectic leaves $A_i, B_i$ (more precisely, their logarithms) are Darboux coordinates.

3.4. Symplectic leaves of $B^-$. Symplectic leaves of $B^-$ can be described similarly to how it was done for B. They can also be obtained from the ones for B, since B is anti-isomorphic to $B^-$ as a Poisson manifold (there is an isomorphism of groups which maps one Poisson tensor to the negative of the other). Let $C_i, D_i$ be coordinates on the lower triangular part of $SL_2(i)$ in which group elements are represented by matrices $L_i(D_i, C_i) = \begin{pmatrix} D_i & 0\\ C_i & D_i^{-1}\end{pmatrix}$ in the two-dimensional irreducible representation of $SL_2$. These coordinate functions have the following Poisson brackets: $\{D_i, C_i\} = d_iD_iC_i$. Denote by $B^-_{s_i}$ the sub-variety of the lower triangular part of $SL_2(i)$ where $C_i\ne 0$. Fix the Coxeter element $w\in W$ and its reduced decomposition $w = s_{i_1}\cdots s_{i_r}$. On a Zariski open subset of the Coxeter symplectic leaf $B^-_w$ one can introduce the natural coordinates $C_i, D_i$, $i = 1,\dots,r$. Every element of this subset can be written as $L_{i_r}(D_{i_r}, C_{i_r})\cdots L_{i_1}(D_{i_1}, C_{i_1})$, where $L_i: B^-_{s_i}\hookrightarrow B^-$ are the natural inclusions.

4. Symplectic Leaves of D(B)

4.1. Symplectic leaves of D(B). As above, let us identify D(B) with G × H as a group. The Poisson structure on D(B) = G × H is not the product structure.
Symplectic leaves of D(B) can be described similarly to how it was done for G. Since D(B) is a factorizable Poisson Lie group, $D(D(B))\simeq D(B)\times D(B)$. Fix this isomorphism together with the identification D(B) = G × H. This gives the following cell decomposition for D(D(B)):
$$ D(B)\times D(B) = \bigsqcup_{w_1,w_2}(B^-w_1B^-\times H)\times(Bw_2B\times H). $$

The Poisson Lie group D − (B) = D(B)∗op can naturally be identified with B − × B.


Let $D(B)_w$ and $D(B)^w$ be Bruhat cells of D(B) defined in (10). The double cosets $j(D^-(B))\backslash D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B))$ can be computed similarly to Proposition 1:
$$ j(D^-(B))\backslash D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B)) \simeq H_{w_1}\times H_{w_2}. $$
The $D^-(B)$-orbit passing through the coset class of $((\dot w_1, h_1), (\dot w_2, h_2))\in D(B)_{w_1}\times D(B)^{w_2}$ is isomorphic to
$$ (N^-_{w_1}\times H^{w_1})\times(N^+_{w_2}\times H^{w_2}). \tag{12} $$

Notice that $j(D^-(B))$-orbits in $D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B))$ are isomorphic to the product of corresponding orbits for B and for $B^-$. Again we have a natural fiber bundle
$$ (D(B)_{w_1}\times D(B)^{w_2})/j(D^-(B)) \to j(D^-(B))\backslash D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B)) $$
and $j(D^-(B))$-orbits are fibers of this bundle. Symplectic leaves of D(B) are connected components of preimages of $j(D^-(B))$-orbits under the map $\varphi: D(B)\to(D(B)\times D(B))/j(D^-(B))$. Symplectic leaves whose image is a $j(D^-(B))$-orbit in $D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B))$ will be denoted as $S_{w_1,w_2}$.

4.2. Relation between symplectic leaves of B and D(B). Embeddings $i: B\hookrightarrow D(B)$ and $j: B^-\hookrightarrow D(B)$ combined with the multiplication and inversion in D(B) give rise to the map
$$ I: B\times B^-\to D(B) = G\times H, \qquad I(b, b^-) = (b(b^-)^{-1},\ \theta(b)\theta^-(b^-)), \tag{13} $$

which is essential for defining the factorization relation, see below. The image of this map is Zariski open in D(B). This map is Poisson and therefore it maps symplectic leaves of $B\times B^-$ to symplectic leaves of D(B). The intersection of the image of I and of any of the symplectic leaves is Zariski open in this leaf. This "explains" the formula (12).

4.3. Relation between symplectic leaves of D(B) and G. The Cartan subgroup H acts naturally on $B\times B^-$ by diagonal multiplication from the right,
$$ h(b, b^-) = (bh, b^-h). \tag{14} $$
The following is clear.

Lemma 1. The map I commutes with the H-action, $I(bh, b^-h) = I(b, b^-)(1, h^2)$, and induces a Poisson map $\tilde I$ between corresponding cosets:
$$ \tilde I: (B\times B^-)/H \to G \cong D(B)/H. $$
Here the coset is taken with respect to the action of H on D(B) by the multiplication by $(1, h^2)$ from the right.
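The equivariance property in Lemma 1 is easy to confirm in the simplest case. The sketch below works in $SL_2$ with exact rational arithmetic and assumes, as the notation suggests, that $\theta$ and $\theta^-$ are the projections of B and $B^-$ onto the Cartan subgroup, so that a Cartan element is recorded by its upper-left entry. All function names are hypothetical.

```python
from fractions import Fraction as Fr


def upper(a, b):
    """Element of B in SL_2 with Cartan part a."""
    return [[a, b], [Fr(0), 1 / a]]


def lower(d, c):
    """Element of B^- in SL_2 with Cartan part d."""
    return [[d, Fr(0)], [c, 1 / d]]


def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv(m):
    """Inverse of a 2x2 matrix of determinant 1."""
    return [[m[1][1], -m[0][1]], [-m[1][0], m[0][0]]]


def i_map(b, b_minus):
    """I(b, b^-) = (b (b^-)^{-1}, theta(b) theta^-(b^-)); Cartan part as a scalar."""
    return mul(b, inv(b_minus)), b[0][0] * b_minus[0][0]


def act(h, m):
    """Right multiplication by the Cartan element diag(h, 1/h)."""
    return mul(m, [[h, Fr(0)], [Fr(0), 1 / h]])


if __name__ == "__main__":
    b, b_minus, h = upper(Fr(2), Fr(3)), lower(Fr(5), Fr(7)), Fr(3, 2)
    g1, t1 = i_map(b, b_minus)
    g2, t2 = i_map(act(h, b), act(h, b_minus))
    print(g1 == g2, t2 == t1 * h * h)
```

The G-component is unchanged because the two factors of h cancel in $b\,h\,h^{-1}(b^-)^{-1}$, while the Cartan component picks up $h^2$.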


It is also clear that the image of $\tilde I$ is Zariski open in G and that $\tilde I$ is a birational isomorphism. Since the action (14) is Hamiltonian, generic symplectic leaves of $(B\times B^-)/H$ can be obtained via Hamiltonian reduction from generic symplectic leaves of $B\times B^-$. Therefore, symplectic leaves of G can be obtained via Hamiltonian reduction from symplectic leaves of D(B). Symplectic leaves of G can also be described via Hamiltonian reduction, similarly to how it was done for symplectic leaves of B in Sect. 3.2. For this consider two elements $u, v\in W$ and fix their reduced decompositions $u = s_{i_1}\cdots s_{i_l}$, $v = s_{j_1}\cdots s_{j_m}$. Consider the image of
$$ B_{s_{i_1}}\times\dots\times B_{s_{i_l}}\times B^-_{s_{j_m}}\times\dots\times B^-_{s_{j_1}} $$
under the multiplication and inverse map:
$$ G_{i_1,\dots,i_l,j_1,\dots,j_m} = B_{s_{i_1}}\cdots B_{s_{i_l}}\,(B^-_{s_{j_1}})^{-1}\cdots(B^-_{s_{j_m}})^{-1}. $$

The double Bruhat cell $G^{u,v}$ has the natural decomposition $G^{u,v} = G'^{\,u,v}\times H(u, v)$, where H(u, v) is the subgroup of H generated by elements corresponding to simple roots which do not belong to $I(u)\cup I(v)$. It follows from [FZ99] that the variety $G_{i_1,\dots,i_l,j_1,\dots,j_m}$ is birationally isomorphic to $G'^{\,u,v}$. On the other hand it is also isomorphic to the quotient of
$$ B_{s_{i_1}}\times\dots\times B_{s_{i_l}}\times B^-_{s_{j_m}}\times\dots\times B^-_{s_{j_1}} $$
with respect to the appropriate Hamiltonian toric action (see Sect. 3.2). This allows one to construct all symplectic leaves of G via Hamiltonian reduction. We will leave the details of this construction for another publication.

5. Factorization Dynamics on Poisson Lie Groups

5.1. Dynamics of Poisson relations. Here we will recall basic facts about Poisson relations and their dynamics. Let (M, p) be a Poisson manifold with the Poisson tensor $p\in\wedge^2TM$. Denote by $p^{(2)}\in\wedge^2T(M\times M)$ the Poisson tensor corresponding to the following product of Poisson manifolds: $(M, -p)\times(M, p)$. A smooth relation of finite type on a manifold M is a submanifold $R\subset M\times M$ such that the natural projections $\pi_1, \pi_2: M\times M\to M$, $\pi_1(x, y) = x$, $\pi_2(x, y) = y$, restricted to R, have a finite number of preimages. Denote by $T^\perp R$ the forms on $M\times M$ which vanish on $TR\subset T(M\times M)$. A smooth relation on a Poisson manifold M is called a Poisson relation if $p^{(2)}|_{T^\perp R} = 0$ and dim(R) = dim(M). If a relation $R = \{(x, \phi(x)) \mid x\in M\}$ is a graph of a map $\phi: M\to M$, it is Poisson if and only if $\phi$ is a Poisson map. An n-th iteration of a relation R on M is a submanifold $R^{(n)}\subset M^{\times(n+1)}$ such that $R^{(n)} = \{(x_1,\dots,x_{n+1}) \mid x_i\in M,\ (x_i, x_{i+1})\in R\subset M\times M\}$. A function $F\in C^\infty(M)$ is called an integral of a smooth relation $R\subset M\times M$ if
$$ F(x) = F(y) \quad\text{for all}\quad (x, y)\in R. $$


A smooth relation on a symplectic manifold is Poisson if and only if it is a Lagrangian submanifold in M × M (equipped with the product symplectic structure). It is called integrable if there exist n independent Poisson commuting functions I1 , . . . , In which are integrals of R. Similarly one can define Poisson and symplectic relations in an algebro-geometric setting. For more details about the dynamics of symplectic relations see [Ves91].

5.2. Factorization relations on Poisson Lie groups. We will study very specific Poisson relations on Poisson Lie groups which we will call factorization relations. Let P be a Poisson Lie group and D(P) be its double. A factorization relation on $P\times P^{op}$ is a sub-variety $F\subset(P\times P^{op})\times(P\times P^{op})$ defined as
$$ F = \{((g^+, g^-), (h^+, h^-)) \mid i(g^+)j(g^-)^{-1} = j(h^-)^{-1}i(h^+)\}, $$
where $i: P\hookrightarrow D(P)$ and $j: P^{op}\hookrightarrow D(P)$ are the natural inclusions of Poisson Lie groups.

Proposition 5.
• Functions on D(P) which are invariant with respect to the adjoint action of D(P) form a Poisson commutative subalgebra in the Poisson algebra of functions on D(P).
• A function on $P\times P^{op}$ which is the composition of the map $M(i\times j): P\times P^{op}\to D(P)$, $(g^+, g^-)\mapsto i(g^+)j(g^-)^{-1}$, and of an Ad-invariant function on D(P) is an integral of the factorization map.

Part 1 of this proposition is well known [STS85]; Part 2 is obvious: $f(i(g^+)j(g^-)^{-1}) = f(j(h^-)^{-1}i(h^+)) = f(i(h^+)j(h^-)^{-1})$.

Let $\Sigma_1$ and $\bar\Sigma_2$ be symplectic leaves in P and $P^{op}$ respectively. Restricting the relation F to the symplectic leaf $\Sigma = \Sigma_1\times\bar\Sigma_2\subset P\times P^{op}$ we obtain a Poisson relation on $\Sigma$. Functions on D(P) which are Ad-invariant are invariant with respect to the factorization relation and therefore produce its integrals. It may happen that one can make a Hamiltonian reduction of $\Sigma$ in such a way that on the reduced space we have enough central functions, in the sense that their level surfaces are half of the dimension of the reduced symplectic manifold. In this case the factorization dynamics on $\Sigma$ or on the reduced space will be integrable. In the next sections we will show that this is exactly what happens with symplectic leaves corresponding to the Coxeter elements. As we will see, this gives an integrable system which is a "nonlinear" version of an open Toda system corresponding to the Lie algebra $\mathfrak{g} = \mathrm{Lie}(G)$. It becomes the usual Toda system in a neighborhood of the identity.

Remark. One can argue that factorization dynamics is integrable on all (appropriately reduced) symplectic leaves of $P\times P^{op}$ when P is a Borel subgroup of a simple Lie group G. In a neighborhood of the identity such systems become "complete" Toda systems (corresponding to parabolic subgroups in G) [Kos79, DLT89]. But this will be the subject of a separate publication.


6. Factorization Dynamics on Coxeter Symplectic Leaves

6.1. Integrals. Consider a Coxeter symplectic leaf $S_{w,w}$ of D(B) corresponding to the Coxeter element w. Fix a reduced decomposition $w = s_{i_1}\cdots s_{i_r}$. On a Zariski open subset $S'_{w,w}$ of $S_{w,w}$ each element of $G\cap S'_{w,w}$ (provided that G is embedded into D(B) = G × H as (G, e)) can be represented by the product
$$ UL^{-1} = U_{i_1}\cdots U_{i_r}L^{-1}_{i_1}\cdots L^{-1}_{i_r}. \tag{15} $$

Here we abbreviated $U_i\equiv U_i(A_i, B_i)$, $L_i\equiv L_i(D_i, C_i)$. This subset depends on the choice of reduced decomposition of w. We will suppress this dependence since different reduced decompositions give birationally isomorphic subsets. For each $i = 1,\dots,r$ and given reduced decomposition $w = s_{i_1}\cdots s_{i_r}$ define $\{i\}^+ = \{i_\alpha\in\{1,\dots,r\} \mid \alpha>\beta,\ i = i_\beta\}$ and $\{i\}^- = \{i_\alpha\in\{1,\dots,r\} \mid \alpha<\beta,\ i = i_\beta\}$.

Proposition 6. The following identities hold:
$$ UL^{-1} = \prod_iA_i^{h_i}\ U_{i_1}(1,\tilde V_{i_1})L^{-1}_{i_1}(1,\tilde W_{i_1})\cdots U_{i_r}(1,\tilde V_{i_r})L^{-1}_{i_r}(1,\tilde W_{i_r})\ \prod_iD_i^{-h_i}, $$
$$ UL^{-1} = U_{i_1}(1, V_{i_1})\cdots U_{i_r}(1, V_{i_r})\ \prod_i\Big(\frac{A_i}{D_i}\Big)^{h_i}\ L^{-1}_{i_1}(1, W_{i_1})\cdots L^{-1}_{i_r}(1, W_{i_r}), \tag{16} $$
where
$$ V_i = B_iA_i\prod_{j\in\{i\}^-}A_j^{C_{ji}}, \qquad W_i = C_iD_i^{-1}\prod_{j\in\{i\}^+}D_j^{-C_{ji}}, $$
$$ \tilde V_i = B_iA_i^{-1}\prod_{j\in\{i\}^+}A_j^{-C_{ji}}, \qquad \tilde W_i = C_iD_i\prod_{j\in\{i\}^-}D_j^{C_{ji}}. $$
The proof of this proposition and of the next lemma is a simple exercise.

Lemma 2. $\tilde V_i = V_i\prod_jA_j^{-C_{ji}}$, $\tilde W_i = W_i\prod_jD_j^{C_{ji}}$.

Define variables $\chi_i^\pm$, $G_i$, $F_i$ as
$$ \chi_i^+ = V_iW_i, \qquad \chi_i^- = \chi_i^+\prod_j\Big(\frac{A_j}{D_j}\Big)^{-C_{ji}}, \qquad G_i = \frac{B_iA_iD_i}{C_i}\prod_{j\in\{i\}^+}A_j^{C_{ji}}\prod_{j\in\{i\}^-}D_j^{C_{ji}}, \qquad F_i = A_iD_i. $$

Proposition 7. Considered as functions on $S'_{w,w}$, $\chi_i^\pm$, $F_i$, and $G_i$ have the following Poisson brackets:
$$ \{\chi_i^+,\chi_j^+\} = \{\chi_i^-,\chi_j^-\} = 0, \qquad \{\chi_i^+,\chi_j^-\} = -2d_iC_{ij}\chi_i^+\chi_j^-, $$
$$ \{\chi_i^\pm, F_j\} = \{\chi_i^\pm, G_j\} = 0, \qquad \{F_i, G_j\} = -2d_iF_iG_j\delta_{i,j}. $$


The proof is a straightforward computation based on the definition of $\chi_i^\pm$, $F_i$, and $G_i$ and on the Poisson brackets between $A_i, B_i, C_i, D_i$:
$$ \{A_i, B_j\} = -d_i\delta_{ij}A_iB_j, \qquad \{D_i, C_j\} = d_i\delta_{ij}D_iC_j, $$
$$ \{A_i, A_j\} = \{A_i, C_j\} = \{A_i, D_j\} = \{B_i, C_j\} = \{B_i, D_j\} = \{D_i, D_j\} = 0, \qquad \{B_i, B_j\} = \{C_i, C_j\} = 0. $$
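Since all the functions involved are monomials in $A_i, B_i, C_i, D_i$ and the basic brackets are log-canonical, the brackets of Proposition 7 reduce to integer linear algebra: for monomials $f = e^{u\cdot z}$ and $g = e^{v\cdot z}$ in the logarithmic coordinates z one has $\{f, g\} = (u^T\Omega v)\,fg$ with a constant matrix $\Omega$. The sketch below uses this to test the brackets for a given Cartan matrix, symmetrizing integers $d_i$ and ordering of the simple reflections. It is an independent check written for illustration, with hypothetical function names and 0-based root indices; it assumes the definitions of $\chi_i^\pm$, $F_i$, $G_i$ given above.

```python
def log_bracket(u, v, omega):
    """Coefficient c in {f, g} = c f g for monomials f = exp(u.z), g = exp(v.z)."""
    n = len(u)
    return sum(u[p] * omega[p][q] * v[q] for p in range(n) for q in range(n))


def build(cartan, d, order):
    """Exponent vectors of chi^+, chi^-, F, G in (log A, log B, log C, log D)."""
    r = len(cartan)
    pos = {i: k for k, i in enumerate(order)}
    A, B, C, D = 0, r, 2 * r, 3 * r            # offsets of the four blocks
    omega = [[0] * (4 * r) for _ in range(4 * r)]
    for i in range(r):
        omega[A + i][B + i], omega[B + i][A + i] = -d[i], d[i]   # {A_i, B_i}
        omega[D + i][C + i], omega[C + i][D + i] = d[i], -d[i]   # {D_i, C_i}
    chi_p, chi_m, F, G = [], [], [], []
    for i in range(r):
        before = [j for j in range(r) if pos[j] < pos[i]]        # {i}^-
        after = [j for j in range(r) if pos[j] > pos[i]]         # {i}^+
        v = [0] * (4 * r)                                        # log(V_i W_i)
        v[B + i] += 1; v[A + i] += 1; v[C + i] += 1; v[D + i] -= 1
        for j in before:
            v[A + j] += cartan[j][i]
        for j in after:
            v[D + j] -= cartan[j][i]
        m = v[:]                                                 # log chi_i^-
        for j in range(r):
            m[A + j] -= cartan[j][i]
            m[D + j] += cartan[j][i]
        f = [0] * (4 * r); f[A + i] = 1; f[D + i] = 1            # log F_i
        g = [0] * (4 * r)                                        # log G_i
        g[B + i] += 1; g[A + i] += 1; g[D + i] += 1; g[C + i] -= 1
        for j in after:
            g[A + j] += cartan[j][i]
        for j in before:
            g[D + j] += cartan[j][i]
        chi_p.append(v); chi_m.append(m); F.append(f); G.append(g)
    return omega, chi_p, chi_m, F, G


def check_prop7(cartan, d, order):
    """True if all brackets listed in Proposition 7 come out as stated."""
    omega, cp, cm, F, G = build(cartan, d, order)
    r = len(cartan)
    br = lambda u, v: log_bracket(u, v, omega)
    for i in range(r):
        for j in range(r):
            ok = (br(cp[i], cp[j]) == 0 and br(cm[i], cm[j]) == 0
                  and br(cp[i], cm[j]) == -2 * d[i] * cartan[i][j]
                  and br(cp[i], F[j]) == 0 and br(cm[i], F[j]) == 0
                  and br(cp[i], G[j]) == 0 and br(cm[i], G[j]) == 0
                  and br(F[i], G[j]) == (-2 * d[i] if i == j else 0))
            if not ok:
                return False
    return True


if __name__ == "__main__":
    print(check_prop7([[2, -1], [-1, 2]], [1, 1], [0, 1]))       # type A2
    print(check_prop7([[2, -2], [-1, 2]], [1, 2], [1, 0]))       # type B2
```

The check relies on the symmetrizability condition $d_iC_{ij} = d_jC_{ji}$, which is why the integers $d_i$ must be supplied together with the Cartan matrix.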

Using Proposition 6, the definition of $\chi_i^\pm$ and elementary algebra we arrive at the following

Proposition 8. Let V be a finite-dimensional representation of G and $\mathrm{Ch}_V$ be its character. Then
$$ \mathrm{Ch}_V(UL^{-1}) = \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j}\phi_{i_1}(g_{i_1})\cdots\phi_{i_r}(g_{i_r})\Big) = \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j}\phi_{i_1}(\bar g_{i_1})\cdots\phi_{i_r}(\bar g_{i_r})\Big). $$
Here $\phi_i: SL_2(i)\hookrightarrow G$ is the embedding of the $SL_2$ generated by $x_i^+, h_i, x_i^-$ into G, and $g_i$ and $\bar g_i$ are elements of $SL_2$ whose images in the 2-dimensional irreducible representation are given, in a weight basis, by
$$ g_i = \begin{pmatrix} 1 & \chi_i^+\\ -1 & 1-\chi_i^+\end{pmatrix}, \qquad \bar g_i = \begin{pmatrix} 1-\chi_i^- & \chi_i^-\\ -1 & 1\end{pmatrix}. $$
The elements $\{h^j\}$ form the basis in $\mathfrak{h}\subset\mathfrak{g}$ corresponding to fundamental weights: $h_j = \sum_iC_{ji}h^i$. Observe that $[h^j, X_i^\pm] = \pm\delta_{ij}X_i^\pm$, hence by conjugating $UL^{-1}$ with an element $\exp(ah^i)$ of H one can alter the off-diagonals of the $g_i$'s. This was used in the proof of Proposition 8.

Now let us interpret these two propositions from the point of view of Hamiltonian reduction.

Proposition 9. (1) Functions $\log G_i$ generate the H-action (14) on $S'_{w,w}$. (2) Functions $\log F_i$ generate the adjoint action of H on $S'_{w,w}\subset D(B)$, $h: (g, h')\mapsto(hgh^{-1}, h')$.

This proposition can be derived immediately from formulae (16) and from the explicit form of Poisson brackets in terms of coordinates $A_i, B_i, C_i, D_i$. Characters as functions on the group are invariant with respect to the adjoint action. Therefore Proposition 9 implies that characters, computed on $G\cap S'_{w,w}$, do not depend on $F_i$, $G_i$, which can be seen also by direct computation (Proposition 8). As it follows from Sect. 4.2, we can naturally identify
$$ G\cap S'_{w,w} = S'_{w,w}/H, $$
where the H action is generated by $\log F_i$. Level surfaces of functions $G_i$ are symplectic leaves of $G\cap S'_{w,w}$, and $\log\chi_i^\pm$ are Darboux coordinates on these symplectic leaves. All this is clear from the structure of Poisson brackets in Proposition 7.
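The equality of the two character formulas in Proposition 8 can be illustrated in the smallest case $G = SL_2$ with V the two-dimensional representation. There $r = 1$, $h_1 = 2h^1$, so the Cartan factor $(\chi^+/\chi^-)^{h^1}$ is $\mathrm{diag}(s, 1/s)$ with $s = \sqrt{\chi^+/\chi^-}$. The sketch below is an illustration under those assumptions (real $\chi^\pm$ of equal sign, hypothetical function names), not the general statement.

```python
import math


def mat_mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def character_sl2(chi_plus, chi_minus, barred=False):
    """Trace of (chi^+/chi^-)^{h^1} g in the 2-dimensional representation of SL_2.

    With barred=False the matrix g = [[1, chi^+], [-1, 1 - chi^+]] is used,
    with barred=True the matrix [[1 - chi^-, chi^-], [-1, 1]].
    chi^+ and chi^- are assumed to have the same sign so that s is real.
    """
    s = math.sqrt(chi_plus / chi_minus)
    cartan = [[s, 0.0], [0.0, 1.0 / s]]
    if barred:
        g = [[1.0 - chi_minus, chi_minus], [-1.0, 1.0]]
    else:
        g = [[1.0, chi_plus], [-1.0, 1.0 - chi_plus]]
    m = mat_mul(cartan, g)
    return m[0][0] + m[1][1]


if __name__ == "__main__":
    print(character_sl2(-2.0, -0.5), character_sl2(-2.0, -0.5, barred=True))
```

In closed form both expressions equal $s + 1/s - \sqrt{\chi^+\chi^-}$ when $\chi^\pm > 0$, and $s + 1/s + \sqrt{\chi^+\chi^-}$ when $\chi^\pm < 0$.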


6.2. Factorization map. Consider the map $\alpha: (\mathbb{C}^\times)^{2r}\to(\mathbb{C}^\times)^{2r}$,
$$ \alpha(\chi_i^+) = \chi_i^-, \qquad \alpha(\chi_i^-) = \frac{(\chi_i^-)^2}{\chi_i^+}\prod_j(1-\chi_j^-)^{-C_{ji}}, \qquad \alpha(F_i) = F_i, \qquad \alpha(G_i) = G_i\prod_jF_j^{C_{ij}}, $$
defined outside of the hyper-planes ($\chi_j^- = 1$, $\chi_i^+ = 0$). Since we are interested in integrable systems whose Hamiltonians are given by functions on G invariant under conjugations, and since these functions restricted to a Coxeter orbit do not depend on the F and G variables, we will focus on the action of the factorization dynamics on $\chi^\pm$. Here we will continue the practice of abusing notation and will denote coordinates and coordinate functions by the same letter. Let $\mathrm{Ch}_V(\chi^+,\chi^-)$ be functions on $(\mathbb{C}^\times)^{2r}$ as defined in Proposition 8.

Theorem 4. $\mathrm{Ch}_V(\alpha(\chi^+),\alpha(\chi^-)) = \mathrm{Ch}_V(\chi^+,\chi^-)$.

Proof. We will use two formulae for these functions derived in Proposition 8. From the first one we have:
$$ \mathrm{Ch}_V(\alpha(\chi^+),\alpha(\chi^-)) = \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\alpha(\chi_j^+)}{\alpha(\chi_j^-)}\Big)^{h^j}\ \overrightarrow{\prod_i}\ \phi_i\begin{pmatrix} 1 & \alpha(\chi_i^+)\\ -1 & 1-\alpha(\chi_i^+)\end{pmatrix}\Big) $$
$$ = \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j}\prod_{j=1}^r(1-\chi_j^-)^{h_j}\ \overrightarrow{\prod_i}\ \phi_i\begin{pmatrix} 1 & \chi_i^-\\ -1 & 1-\chi_i^-\end{pmatrix}\Big) $$
$$ = \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j}\ \overrightarrow{\prod_i}\ \phi_i\begin{pmatrix} 1-\chi_i^- & \chi_i^-\\ -1 & 1\end{pmatrix}\Big). $$
Here the product is taken in the order $(i_1,\dots,i_r)$ and in the last equality we used the cyclic property of the trace and the Lie brackets between $h_j$ and $e_i$ and $f_i$. The last expression is exactly the second formula for $\mathrm{Ch}_V$, which proves the theorem. On the other hand the theorem follows from the next statement and from the fact that Ad-invariant functions on a Poisson Lie group are invariant with respect to the factorization map. □

Proposition 10. Let $F\subset S'_{w,w}\times S'_{w,w}$ be the factorization relation restricted to Coxeter symplectic leaves of D(B). The diagram
$$ \begin{array}{ccc} & F & \\ {}^{\chi_L}\swarrow & & \searrow{}^{\chi_R}\\ (\mathbb{C}^\times)^{2r} & \xrightarrow{\ \alpha\ } & (\mathbb{C}^\times)^{2r} \end{array} $$
is commutative. Here $\chi_L$ is the composition of the projection to the first component in $S'_{w,w}\times S'_{w,w}$ and the map $\chi: (A_i, B_i, C_i, D_i)\mapsto(\chi_i^\pm)$, and $\chi_R$ is the composition of the projection to the right component and $\chi$.


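Proof. On the image of the factorization map $I: B\times B^-\to D(B)$, elements of $S_{w,w^{-1}}\subset D(B)$ can be represented as $(UL^{-1}, \mathrm{diag}(U)\mathrm{diag}(L))$, where U and L are as above. The factorization relation $F\subset S'_{w,w}\times S'_{w,w}$ consists of points
$$ \big((UL^{-1}, \mathrm{diag}(U)\mathrm{diag}(L)),\ (\bar U\bar L^{-1}, \mathrm{diag}(\bar U)\mathrm{diag}(\bar L))\big) $$
satisfying conditions
$$ UL^{-1} = \bar L^{-1}\bar U, \qquad \mathrm{diag}(U)\mathrm{diag}(L) = \mathrm{diag}(\bar U)\mathrm{diag}(\bar L). $$
Let $U_i, U_i', U_i'', \bar U_i, L_i, L_i', L_i'', \bar L_i$ be factors of $U, L, \dots$ satisfying relations
$$ UL^{-1} = U_{i_1}\cdots U_{i_r}L^{-1}_{i_1}\cdots L^{-1}_{i_r} = U'_{i_1}L'^{-1}_{i_1}\cdots U'_{i_r}L'^{-1}_{i_r} = L''^{-1}_{i_1}U''_{i_1}\cdots L''^{-1}_{i_r}U''_{i_r} = \bar L^{-1}_{i_1}\cdots\bar L^{-1}_{i_r}\bar U_{i_1}\cdots\bar U_{i_r}. $$
Then the coordinates A, B, C, D of these elements have to satisfy the relations
$$ A_i' = A_i, \qquad D_i' = D_i, \qquad B_i' = B_i\prod_{j\in\{i\}^-}D_j^{C_{ji}}, \qquad C_i' = C_i\prod_{j\in\{i\}^+}A_j^{-C_{ji}}, $$
$$ A_i'D_i'^{-1} - B_i'C_i' = A_i''D_i''^{-1}, \qquad B_i'D_i' = B_i''D_i''^{-1}, \qquad C_i'A_i'^{-1} = C_i''A_i'', \qquad A_i'^{-1}D_i' = D_i''A_i''^{-1} - B_i''C_i'', $$
$$ A_i'D_i' = A_i''D_i'', $$
$$ \bar A_i = A_i'', \qquad \bar D_i = D_i'', \qquad \bar B_i = B_i''\prod_{j\in\{i\}^+}D_j''^{\,C_{ji}}, \qquad \bar C_i = C_i''\prod_{j\in\{i\}^-}A_j''^{\,-C_{ji}}. $$
Let us find $\bar\chi_i^+$ from these relations:
$$ \bar\chi_i^+ = \bar B_i\bar C_i\,\frac{\bar A_i}{\bar D_i}\prod_{j\in\{i\}^-}\bar A_j^{C_{ji}}\prod_{j\in\{i\}^+}\bar D_j^{-C_{ji}} = B_i''C_i''\,\frac{A_i''}{D_i''}\prod_{j\in\{i\}^+}D_j''^{\,C_{ji}}\prod_{j\in\{i\}^-}A_j''^{\,-C_{ji}}\prod_{j\in\{i\}^-}A_j''^{\,C_{ji}}\prod_{j\in\{i\}^+}D_j''^{\,-C_{ji}} $$
$$ = B_i''C_i''\,\frac{A_i''}{D_i''} = B_i'C_i'\,\frac{D_i'}{A_i'} = B_iC_i\,\frac{D_i}{A_i}\prod_{j\in\{i\}^-}D_j^{C_{ji}}\prod_{j\in\{i\}^+}A_j^{-C_{ji}} = \chi_i^+\,\frac{\chi_i^-}{\chi_i^+} = \chi_i^-. $$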


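Similarly,
$$ \bar\chi_i^- = \bar\chi_i^+\prod_j\Big(\frac{\bar D_j}{\bar A_j}\Big)^{C_{ji}} = \chi_i^-\prod_j\Big(\frac{D_j''}{A_j''}\Big)^{C_{ji}} = \chi_i^-\prod_j\Big(1 - B_j'C_j'\frac{D_j'}{A_j'}\Big)^{-C_{ji}}\prod_j\Big(\frac{D_j'}{A_j'}\Big)^{C_{ji}} = \frac{(\chi_i^-)^2}{\chi_i^+}\prod_j(1-\chi_j^-)^{-C_{ji}}. $$
Here we used the identities $\prod_j(D_j'/A_j')^{C_{ji}} = \chi_i^-/\chi_i^+$ and $B_j'C_j'D_j'/A_j' = \chi_j^-$. This proves the proposition. □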

Corollary 2. The map $\alpha$ is Poisson.

This can also be checked by direct calculation using Poisson brackets between $\chi_i^\pm$. Thus, we have a Poisson map $\alpha: (\mathbb{C}^\times)^{2r}\to(\mathbb{C}^\times)^{2r}$ defined outside of the hyperplanes $\chi_i^- = 1$, $\chi_i^+ = 0$, which preserves the functions $\mathrm{Ch}_V(\chi^+,\chi^-)$.

Proposition 11. (1) $\{\mathrm{Ch}_V, \mathrm{Ch}_W\} = 0$ for every pair of finite dimensional representations V and W. (2) $\mathrm{Ch}_V$, as a function of the $\chi_i^\pm$, is independent of the choice of the Coxeter element w.

Proof. The first part of this proposition is a general fact about factorizable Poisson Lie groups. For the second we have to show that $\mathrm{Ch}_V(\chi^+,\chi^-)$ does not depend on the order $(i_1,\dots,i_r)$ of the indices. Clearly $\mathrm{Ch}_V(\chi^+,\chi^-)$ does not change if we change the order by an elementary transposition (exchange of two consecutive indices) of two indices which are not linked in the Coxeter diagram. Let us call these transpositions free elementary transpositions. Furthermore, $\mathrm{Ch}_V(\chi^+,\chi^-)$ is also invariant under a cyclic permutation, as may be seen using the observation made after Proposition 8. Thus the proposition follows from the easily established fact that every elementary transposition can be obtained by successive applications of cyclic permutations and free elementary transpositions. □

To summarize, with each Coxeter symplectic leaf of G we associated a (complex holomorphic, algebraic) integrable system on $(\mathbb{C}^\times)^{2r}$ for which the integrals are given by characters (exactly r of them are independent), but all these systems are trivially isomorphic. The coordinates $\chi_i^\pm$ simply describe different points in the group if one changes the Coxeter element. The factorization relation restricted to a Coxeter symplectic leaf gives a discrete-time evolution preserving these integrals.

6.3. Real positive form. Consider the real form $G_{\mathbb{R}}$ of the complex algebraic group G. Introduce variables $\chi_i^\pm = -u_i^\pm$. The domain $u_i^\pm > 0$ we will call the positive domain. The following is clear.

Proposition 12. Functions $\mathrm{Ch}_V(u^+, u^-)$ are positive for $u_i^+, u_i^- > 0$ and
$$ \mathrm{Ch}_V(u^+, u^-) = \mathrm{Tr}_V\Big(\prod_{j=1}^r\Big(\frac{u_j^+}{u_j^-}\Big)^{h^j}\phi_{i_1}\begin{pmatrix} 1 & u_{i_1}^+\\ 1 & 1+u_{i_1}^+\end{pmatrix}\cdots\phi_{i_r}\begin{pmatrix} 1 & u_{i_r}^+\\ 1 & 1+u_{i_r}^+\end{pmatrix}\Big) $$
$$ = \mathrm{Tr}_V\Big(\prod_{j=1}^r\Big(\frac{u_j^+}{u_j^-}\Big)^{h^j}\phi_{i_1}\begin{pmatrix} 1+u_{i_1}^- & u_{i_1}^-\\ 1 & 1\end{pmatrix}\cdots\phi_{i_r}\begin{pmatrix} 1+u_{i_r}^- & u_{i_r}^-\\ 1 & 1\end{pmatrix}\Big). $$


It is also clear that the map $\alpha$ is defined globally on the positive domain:
$$ \alpha(u_i^+) = u_i^-, \qquad \alpha(u_i^-) = \frac{(u_i^-)^2}{u_i^+}\prod_j(1+u_j^-)^{-C_{ji}}. $$
Let $G_{>0}$ be the positive part of $G_{\mathbb{R}}$ (see [Lus94] and [FZ98] for definitions). For SL(n) the positive part consists of all real unimodular n × n matrices with positive principal minors.

Lemma 3. On $G_{>0}$ there exists a unique factorization $g = g_+(g_-)^{-1}$, where $g_\pm^{\pm1}\in B^\pm_{>0} = B^\pm\cap G_{>0}$ and $\theta(g_+) = \theta^-(g_-)^{-1}$.

Let $S^+_{w,w}$ be the positive Coxeter symplectic leaf of $G_{\mathbb{R}}$. It is the connected component of $\varphi^{-1}$ of the corresponding orbit in $D(G_{\mathbb{R}})/j(G_{\mathbb{R}}^-)$ which lies in $G_{>0}$. The positive domain described above is essentially a positive symplectic leaf and thus, on the positive domain the factorization map $\alpha$ is the restriction of the factorization map $g = g_+g_-^{-1}\mapsto\bar g = g_-^{-1}g_+$.
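On the positive domain the invariance of the characters under $\alpha$ can be spot-checked numerically. The sketch below does this for type $A_2$ in the three-dimensional representation of $SL_3$. It assumes the ordering $(i_1, i_2) = (1, 2)$, the fundamental coweights $h^1 = \mathrm{diag}(2,-1,-1)/3$ and $h^2 = \mathrm{diag}(1,1,-2)/3$, and the standard block embeddings of $SL_2$; the function names are hypothetical and indices are 0-based in the code.

```python
import math

CARTAN = [[2, -1], [-1, 2]]   # Cartan matrix of type A2


def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def embed(g, k):
    """Place a 2x2 matrix g in rows/columns k, k+1 of a 3x3 identity matrix."""
    m = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for a in range(2):
        for b in range(2):
            m[k + a][k + b] = g[a][b]
    return m


def character(u_plus, u_minus, barred=False):
    """Tr of prod_j (u_j^+/u_j^-)^{h^j} phi_1(.) phi_2(.) in the 3-dim representation."""
    t1, t2 = u_plus[0] / u_minus[0], u_plus[1] / u_minus[1]
    diag = [t1 ** (2 / 3) * t2 ** (1 / 3),
            t1 ** (-1 / 3) * t2 ** (1 / 3),
            t1 ** (-1 / 3) * t2 ** (-2 / 3)]
    m = [[diag[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
    for i in range(2):
        if barred:
            g = [[1.0 + u_minus[i], u_minus[i]], [1.0, 1.0]]
        else:
            g = [[1.0, u_plus[i]], [1.0, 1.0 + u_plus[i]]]
        m = mat_mul(m, embed(g, i))
    return m[0][0] + m[1][1] + m[2][2]


def alpha(u_plus, u_minus):
    """One step of the factorization map on the positive domain."""
    new_minus = []
    for i in range(2):
        value = u_minus[i] ** 2 / u_plus[i]
        for j in range(2):
            value *= (1.0 + u_minus[j]) ** (-CARTAN[j][i])
        new_minus.append(value)
    return list(u_minus), new_minus


if __name__ == "__main__":
    up, um = [0.5, 1.5], [0.8, 0.3]
    for step in range(4):
        print(step, character(up, um))
        up, um = alpha(up, um)
```

In this representation the character reduces to $T_1 + T_2(1+u_1^+) + T_3(1+u_2^+)$, where $T_k$ are the diagonal entries of the Cartan factor, which makes positivity evident.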

7. The Interpolating Flow and Continuous Time Nonlinear Toda Lattices

7.1. Interpolating flow. From now on we consider the factorization dynamics in the positive real domain. As was already pointed out, the factorization dynamics on the positive real domain is a graph of a Poisson map. The trajectory of this map is defined recursively as $x(n+1) = x_-(n)^{-1}x_+(n)$ for $x(n) = x_+(n)x_-(n)^{-1}$.

Proposition 13. The trajectory of the factorization map restricted to the positive real domain which starts at x(0) has the form
$$ x(n) = g_+(n)^{-1}x(0)g_+(n), \qquad g(n) = x(0)^n = g_+(n)g_-(n)^{-1}. $$

Proof. The identity $x(0)^n = x_+(0)x(1)^{n-1}x_-(0)^{-1} = x_+(0)\cdots x_+(n-1)\,x_-(n-1)^{-1}\cdots x_-(0)^{-1}$ shows that $g_+(n) = x_+(0)\cdots x_+(n-1)$, which quickly leads to the statement. □

This proposition is a discrete analogue of the following theorem of Semenov-Tian-Shansky [STS85] for continuous time systems, which describes the trajectories of Hamiltonian systems on Poisson Lie groups generated by Ad-invariant functions. Define the Lie G-valued gradient $\nabla f$ of a function $f: G\to\mathbb{R}$ by $(\nabla f(g), \eta) := \langle df(g), X_\eta\rangle$, where we write $X_\eta$ for the left invariant vector field on G corresponding to $\eta$ and $\langle\omega, X\rangle$ is the value of the form $\omega$ on the vector field X.

Theorem 5. Let H be an $\mathrm{Ad}_G$-invariant function on G. The trajectory x(t) of the Hamiltonian equations of motion generated by H is given by
$$ x(t) = g_+(t)^{-1}x(0)g_+(t), \qquad\text{where}\quad g(t) = \exp(t\nabla H(x(0))) = g_+(t)g_-(t)^{-1}. $$
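Proposition 13 can be tried out on 2 × 2 matrices. The sketch below is an illustration for $SL_2$ only: it assumes a real matrix of determinant 1 with positive entries, and realizes the factorization $g = g_+g_-^{-1}$ with $g_+$ upper triangular, $g_-$ lower triangular and inverse diagonal parts (the normalization of Lemma 3). The function names are hypothetical.

```python
import math


def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv(m):
    """Inverse of a 2x2 matrix of determinant 1."""
    return [[m[1][1], -m[0][1]], [-m[1][0], m[0][0]]]


def factor(g):
    """Return (g_plus, g_minus_inverse) with g = g_plus * g_minus_inverse.

    g_plus is upper triangular with diagonal (t, 1/t); g_minus_inverse is lower
    triangular with the same diagonal, so diag(g_plus) = diag(g_minus)^{-1}.
    Requires det g = 1 and g[1][1] > 0.
    """
    s = g[1][1]
    t = 1.0 / math.sqrt(s)
    x, y = g[0][1] / s, g[1][0] / s
    return [[t, x / t], [0.0, 1.0 / t]], [[t, 0.0], [y / t, 1.0 / t]]


def step(x):
    """One step x = x_+ x_-^{-1}  ->  x_-^{-1} x_+ of the factorization dynamics."""
    g_plus, g_minus_inv = factor(x)
    return mul(g_minus_inv, g_plus)


def conjugated(x0, n):
    """g_+(n)^{-1} x0 g_+(n), where x0^n = g_+(n) g_-(n)^{-1}."""
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        power = mul(power, x0)
    g_plus, _ = factor(power)
    return mul(mul(inv(g_plus), x0), g_plus)


def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


if __name__ == "__main__":
    x0 = [[2.0, 1.0], [1.0, 1.0]]
    x = x0
    for n in range(5):
        print(n, close(x, conjugated(x0, n)), x[0][0] + x[1][1])
        x = step(x)
```

The trace printed in the last column stays constant along the trajectory, as expected for an Ad-invariant function.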


Now we are in a position to derive a Hamiltonian flow which interpolates the factorization dynamics. Obviously, a Hamiltonian $H_d$ which has a flow whose time 1 map is given by factorization as above has to solve the equation $g = \exp(\nabla H_d)$. Thus, for $g = e^\xi$ and $\xi\in\mathrm{Lie}\,G$ we should have $\xi = \nabla H_d(e^\xi)$.

Proposition 14. In a neighborhood of the identity all $\mathrm{Ad}_G$-invariant solutions of the equation
$$ \xi = \nabla H(e^\xi) \tag{17} $$
have the form
$$ H_d(e^\xi) = \tfrac12(\xi, \xi) + \mathrm{const}. $$

Proof. Let H be an $\mathrm{Ad}_G$-invariant solution of the above equation and $\tilde H = H\circ\exp$. Then $\tilde H$ is $\mathrm{ad}_{\mathfrak{g}}$-invariant and hence $d\tilde H_\xi(\mathrm{ad}_\eta(\xi)) = 0$ for all $\xi, \eta\in\mathrm{Lie}\,G$. By (17), $(\xi, \eta) = \langle dH(e^\xi), X_\eta\rangle = \langle d\tilde H(\xi), \eta\rangle$. Here we trivialized the tangent bundle on G by left translations. Thus, for $\tilde H$ we have the equation $(\xi, \eta) = d\tilde H|_\xi(\eta)$. Integration now yields the statement of the proposition. □

If $G = SL(n, \mathbb{R})$ then $H_d(g) = \tfrac12\mathrm{tr}(\log^2(g))$ in a sufficiently small neighborhood of the identity [Sur91a]. The Hamiltonian $H_d$ is quite remarkable since it gives the so-called classical quantum R-matrix [WX92, Res92]. The function $H_d$ is the most singular part of the quantum R-matrix in the appropriate semi-classical limit [Skl82, Res95, Res96]. The map $\alpha$ generated by the time 1 flow of $H_d$ is the classical quantum R-matrix in the sense of [WX92] restricted to the product of Coxeter symplectic leaves and reduced by Hamiltonian reduction.

7.2. Linearization in a neighborhood of 1. Consider $\mathbb{R}^{2n}_+$ with coordinates $(u_i^+, u_i^-)$ and with the following Poisson brackets between coordinate functions:
$$ \{u_i^+, u_j^+\} = \{u_i^-, u_j^-\} = 0, \qquad \{u_i^+, u_j^-\} = -2d_iC_{ij}u_i^+u_j^-. $$

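Consider the family of diffeomorphisms $\beta: \mathbb{R}_+\times\mathbb{R}^{2n}\to\mathbb{R}^{2n}_+$ acting on coordinate functions as
$$ \beta_\varepsilon(\pi_i, \phi_i) = (\varepsilon^2e^{\phi_i+\varepsilon\pi_i},\ \varepsilon^2e^{\phi_i}). $$
Here $(\pi_i, \phi_i)$ are coordinates in $\mathbb{R}^{2n}$ such that $(\beta_\varepsilon(\phi_i), \beta_\varepsilon(\pi_i))$ are the coordinates $(u_i^+, u_i^-)$ which were used above. Assume that $\mathbb{R}^{2n}$ is equipped with the following symplectic structure:
$$ \{\phi_i, \phi_j\} = \{\pi_i, \pi_j\} = 0, \qquad \{\phi_i, \pi_j\} = 2d_iC_{ij}. \tag{18} $$
Then the maps $\beta_\varepsilon$ are symplectomorphisms.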


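For each $\varepsilon > 0$ define the map $\alpha_\varepsilon: \mathbb{R}^{2n}\to\mathbb{R}^{2n}$ as $\alpha_\varepsilon = \beta_\varepsilon^{-1}\circ\alpha\circ\beta_\varepsilon$. The map $\alpha_\varepsilon$ acts on coordinates $(\phi_i, \pi_i)$ as
$$ \alpha_\varepsilon(\pi_i) = \pi_i + \sum_{j=1}^rC_{ji}\,\frac1\varepsilon\ln(1+\varepsilon^2e^{\phi_j}), \qquad \alpha_\varepsilon(\phi_i) = \phi_i + \varepsilon\pi_i + \sum_{j=1}^rC_{ji}\,\frac1\varepsilon\ln(1+\varepsilon^2e^{\phi_j}). \tag{19} $$
By construction these maps are symplectomorphisms for the bracket (18). In the limit $\varepsilon\to0$ Eq. (19) defines a vector field on $\mathbb{R}^{2n}$ with coordinates
$$ \dot\phi_i = \lim_{\varepsilon\to0}\frac{\alpha_\varepsilon(\phi_i)-\phi_i}{\varepsilon} = \pi_i, \qquad \dot\pi_i = \lim_{\varepsilon\to0}\frac{\alpha_\varepsilon(\pi_i)-\pi_i}{\varepsilon} = \sum_jC_{ji}e^{\phi_j}. $$
This vector field is the Hamiltonian vector field (for the Poisson brackets (18)) generated by the (usual) Toda Hamiltonian $H_{Toda} = \tfrac12(\xi_0, \xi_0)$, where
$$ \xi_0 = \sum_{i=1}^r(\pi_ih^i + e^{\phi_i}x_i^+ + x_i^-). $$
Thus the family of maps (19) "retracts" to the Toda Hamiltonian flow in the neighborhood of the identity. Equivalently, we have:
$$ \lim_{n\to\infty}\alpha_\varepsilon^n(\phi, \pi) = (\phi(t), \pi(t)), $$
where $t = n\varepsilon$ is fixed and $(\phi(t), \pi(t))$ is the Hamiltonian flow generated by $H_{Toda}$ passing through $(\phi, \pi)$ at t = 0. It is easy to find the leading terms of the asymptotic expansion of the integrals in the limit $\varepsilon\to0$. Indeed, composing the map $\beta_\varepsilon$ with the functions $\mathrm{Ch}_V$ and $H_d$ we have:
$$ H_V(\phi, \pi) = (\mathrm{Ch}_V\circ\beta_\varepsilon)(\phi, \pi) = \mathrm{Tr}_V(\exp(\xi_\varepsilon)), \qquad H_d(\phi, \pi) = \tfrac12(\xi_\varepsilon, \xi_\varepsilon), $$
where
$$ \exp(\xi_\varepsilon) = \prod_{j=1}^r\exp(\varepsilon\pi_jh^j)\ \overrightarrow{\prod_i}\ \exp(\varepsilon e^{\phi_i}x_i^+)\exp(\varepsilon x_i^-). $$
As $\varepsilon\to0$, $\xi_\varepsilon = \varepsilon\xi_0 + O(\varepsilon^2)$.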


Thus, for $H_V$ and $H_d$ we have
$$ H_V = \dim V\Big(1 + \varepsilon^2\frac{c_V}{\dim(\mathfrak{g})}H_{Toda} + O(\varepsilon^3)\Big), \qquad H_d = \varepsilon^2H_{Toda} + O(\varepsilon^3). $$
Here we assumed that V is irreducible and $c_V$ is the value of the Casimir operator on V. Higher Toda Hamiltonians can be obtained from higher order terms of the $\varepsilon$-expansion of $H_V$.

8. Conclusion

As was mentioned in the introduction, the main goal of this paper was a systematic derivation of Coxeter–Toda systems from the symplectic geometry of Poisson Lie groups. Naturally, such an analysis can be done for loop groups as well. The corresponding models will be affine versions of Coxeter–Toda systems. For the $A_n$ root system this will give the relativistic Toda chain first described by Ruijsenaars [Rui90]. In a similar way one can construct discrete versions of Toda field theories. For the $A_n$ case this has been done in [KR97]. Notice also that, somewhat unexpectedly, the same Hirota equations appear as a system of equations for transfer-matrices of some solvable models in statistical mechanics [BR90, KNS94]. Although it is clear that the explanation of this coincidence lies in the theory of q-W-algebras [ER97], the complete picture is still missing. The factorization dynamics restricted to other symplectic leaves will give "nonlinear" Toda–Kostant systems which are related to general coadjoint orbits.

Acknowledgements. The authors thank Yuri Suris for helpful discussions. N.R. thanks S. Fomin and A. Zelevinsky for valuable discussions and the Technische Universität Berlin for hospitality. The research of N.R. was partially supported by the NSF grant DMS-9603239. T. H., J. K. and N. K. were supported by the grant "Discrete integrable systems" of the Deutsche Akademische Austauschdienst and by the Sonderforschungsbereich 288 supported by the Deutsche Forschungsgemeinschaft.

References

[Adl79] Adler, M.: On a trace functional for formal pseudo-differential operators and the symplectic structure of the KdV-type equations. Inv. Math. 50, 219–248 (1979)
[Arn89] Arnold, V.I.: Mathematical Methods of Classical Mechanics, Second Edition. Berlin–Heidelberg–New York: Springer, 1989
[BR90] Bazhanov, V., Reshetikhin, N.: Restricted solid-on-solid models connected with simply laced algebras and conformal field theory. J. Phys. A 23, 1477–1492 (1990)
[DCKP95] De Concini, C., Kac, V.G. and Procesi, C.: Some quantum analogues of solvable Lie groups. In Geometry and analysis. Papers presented at the Bombay colloquium, India, January 6–14, 1992, Oxford: Oxford University Press, 1995, pp. 41–65
[DJM82] Date, E., Jimbo, M., Miwa, T.: Method for generating discrete soliton equations I-IV. J. Phys. Soc. Japan 51, 4116–4131 (1982)
[DLNT86] Deift, P., Li, L.C., Nanda, T. and Tomei, C.: The Toda flow on a generic orbit is integrable. Comm. Pure Appl. Math. 39, 183–232 (1986)
[DLT89] Deift, P., Li, L.C. and Tomei, C.: Matrix factorization and integrable systems. Comm. Pure Appl. Math. 42, 443–521 (1989)
[Dri87] Drinfeld, V.G.: Quantum groups. In Proc. Intern. Congress of Math. (Berkeley 1986), pp. 798–820. Providence, RI: AMS, 1987
[ER97] Frenkel, E. and Reshetikhin, N.: Deformations of W-algebras associated to simple Lie algebras. Preprint q-alg/9707012, 1997
[FZ98] Fomin, S. and Zelevinsky, A.: Totally nonnegative and oscillatory elements in semisimple groups. Preprint, 1998
[FZ99] Fomin, S. and Zelevinsky, A.: Double Bruhat cells and total positivity. J. of the AMS 12, 335–380 (1999)
[Hir77] Hirota, R.: Nonlinear partial difference equations II. Discrete-time Toda equation. J. Phys. Soc. Japan 43 (6), 2074–2078 (1977)
[HL93] Hodges, T. and Levasseur, T.: Primitive ideals of $C_q[SL(3)]$. Commun. Math. Phys. 156, 581–605 (1993)
[HZ94] Hofer, H. and Zehnder, E.: Symplectic invariants and Hamiltonian dynamics. Basel, Boston: Birkhäuser Verlag, 1994
[KNS94] Kuniba, A., Nakanishi, T. and Suzuki, J.: Functional relations in solvable lattice models. Int. J. of Mod. Phys. 9, 5215–5311 (1994)
[Kos79] Kostant, B.: The solution to a generalized Toda lattice and representation theory. Adv. Math. 34, 195–338 (1979)
[KR97] Kashaev, R. and Reshetikhin, N.: Affine Toda systems as an integrable 3-dimensional quantum field theory. Comm. Math. Phys. 188, 251–266 (1997)
[KS98] Korogodski, L. and Soibelman, Y.: Algebras of Functions on Quantum Groups, Part I. Providence, RI: American Mathematical Society, 1998
[Lus94] Lusztig, G.: Total positivity in reductive groups. In Lie theory and geometry: In honor of Bertram Kostant. Boston: Birkhäuser, 1994
[LW90] Lu, J.-H. and Weinstein, A.: Poisson Lie groups, dressing transformations and Bruhat decompositions. J. Differ. Geom. 31, 501–526 (1990)
[MV91] Moser, J. and Veselov, A.: Discrete versions of some classical integrable systems and factorization of matrix polynomials. Commun. Math. Phys. 139, 217–243 (1991)
[QNCvdL84] Quispel, G., Nijhoff, F., Capel, H. and van der Linden, J.: Linear integral equations and nonlinear differential-difference equations. Physica A 125, 344–380 (1984)
[Res92] Reshetikhin, N.: Quasitriangularity of quantum groups and quasi-triangular Hopf-Poisson algebras. In AMS Summer Research Institute on Algebras, Groups and Their Generalization, Providence, RI: AMS, 1992, pp. 111–133
[Res95] Reshetikhin, N.: Quasitriangularity of quantum groups at roots of 1. Commun. Math. Phys. 170, 79–100 (1995)
[Res96] Reshetikhin, N.: Integrable discrete systems. In Quantum Groups and their Applications in Physics. Bologna: IOS Press, 1996, pp. 445–487
[Rui90] Ruijsenaars, S.: Relativistic Toda systems. Commun. Math. Phys. 122, 217–247 (1990)
[Skl82] Sklyanin, E.: On some algebraic structures related to the Yang–Baxter equation. Funct. Anal. and its Appl. 16, 27–34 (1982)
[Soi90] Soibelman, Y.: Algebra of functions on a compact quantum group and its representations. Algebra i Analiz 2, 193–225 (1990)
[STS85] Semenov-Tian-Shansky, M.: Dressing transformations and Poisson group actions. Pub. Res. Inst. Math. Sci. Kyoto Univ. 21, 1237–1260 (1985)
[Sur90] Suris, Y.: Discrete time generalized Toda lattices: Complete integrability and relation with relativistic Toda lattices. Phys. Lett. A 145, 113–119 (1990)
[Sur91a] Suris, Y.: Algebraic structure of discrete-time and relativistic Toda lattices. Phys. Lett. A 156, 467–474 (1991)
[Sur91b] Suris, Y.: Generalized Toda chains in discrete time. Leningrad Math. J. 2, 339–352 (1991)
[Sym82] Symes, W.: The QR-algorithm and scattering for the finite non-periodic Toda lattice. Physica D 4, 275–290 (1982)
[Tod88] Toda, M.: Theory of nonlinear lattices. Berlin–Heidelberg–New York: Springer, 1988
[Ves91] Veselov, A.P.: Integrable maps. Russ. Math. Surv. 46, 3–45 (1991)
[WX92] Weinstein, A. and Xu, P.: Classical solutions to the quantum Yang–Baxter equation. Commun. Math. Phys. 143, 309–344 (1992)

Communicated by T. Miwa

Commun. Math. Phys. 212, 323 – 336 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Resonance Wave Expansions: Two Hyperbolic Examples

T. Christiansen (1), M. Zworski (2)

(1) Department of Mathematics, University of Missouri, Columbia, MO 65211, USA. E-mail: [email protected]

(2) Department of Mathematics, University of California, Evans Hall, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 1 October 1999 / Accepted: 24 January 2000

Abstract: For scattering on the modular surface and on the hyperbolic cylinder, we show that the solutions of the wave equations can be expanded in terms of resonances, despite the presence of trapping. Expansions of this type are expected to hold in greater generality but have been understood only in non-trapping situations.

1. Introduction In this note we give two examples for which we can obtain, on compact sets, an asymptotic expansion of solutions to the wave equation with smooth, compactly supported initial data, although there is trapping. The expansions are given in terms of resonances and they generalize the standard “separation of variables” expansions in terms of eigenvalues. The examples are the modular surface where we can use detailed information about the zeta function (see Fig. 1 and Theorem 1) and the hyperbolic cylinder where the resonances are particularly simple (see Fig. 2(b) and Theorem 2). Resonances or scattering poles are defined as poles of the meromorphic continuation of the resolvent or the scattering matrix and they constitute a natural replacement of discrete spectral data for problems on exterior domains. That point of view was emphasized early by Lax–Phillips ([12]) – see [23] for a light-hearted overview of recent results. Although resonances are most frequently defined in the stationary framework of scattering theory they are a dynamical concept: the real part of a resonance describes the rest energy of a state and the imaginary part its rate of decay. Consequently they should be understood in terms of long time behaviour of solutions to evolution equations, and, in particular, to the wave equation. For the Schrödinger evolution equation we refer to the recent paper by Soffer-Weinstein [17] and references given there; in that case one considers resonances which come from perturbing embedded eigenvalues. Under a quantum dynamical condition that the obstacle O is non-trapping (that is, a condition on the behaviour of solutions of the evolution equation) Lax–Phillips [12]


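and Vainberg [21,22] showed that for n odd, for some $\varepsilon > 0$,
\[
\left.
\begin{array}{l}
(D_t^2 - \Delta)u(t, x) = 0, \quad x \in \mathbb{R}^n \setminus \mathcal{O}, \quad u|_{\mathbb{R} \times \partial\mathcal{O}} = 0, \\[2pt]
u|_{t=0} = f \in C_c^\infty(\mathbb{R}^n \setminus \mathcal{O}), \\[2pt]
\frac{1}{i}\partial_t u|_{t=0} = g \in C_c^\infty(\mathbb{R}^n \setminus \mathcal{O})
\end{array}
\right\}
\Longrightarrow
u(t, x) = \sum_{\operatorname{Im} \lambda_l \le C} \; \sum_{j=1}^{m_{\mathcal{O}}(\lambda_l)} w_{\lambda_l, j}(x)\, e^{i t \lambda_l}\, t^{j-1} + O(e^{-(C+\varepsilon)t}), \quad x \in K,
\tag{1.1}
\]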

$K \subset \mathbb{R}^n \setminus \mathcal{O}$ compact, where $m_{\mathcal{O}}(\lambda_l)$ is the multiplicity of the resonance. Here we took the convention that $\operatorname{Im} \lambda_l \ge 0$, that is, that the resonances lie in the upper half-plane. That the quantum non-trapping condition follows from the classical non-trapping condition was shown in the works of Andersson, Melrose, Morawetz, Ralston, Strauss, Sjöstrand and Taylor – see the appendix to the second edition of [12] and references given there.

In the ultimate trapping situation of a compact manifold, the expansion (1.1) is simply the expansion in terms of eigenvalues. Hence a naïve “interpolation” argument suggests its validity for all perturbations. That, however, is far from clear and not much is known. Tang-Zworski [19] recently showed, using the methods of [18], that the expansion (1.1) is valid for general “black box” perturbations when we sum over resonances satisfying $\operatorname{Im} \lambda_j \le |\lambda_j|^{-M}$, M sufficiently large, and when we replace the error by $O(t^{-N})$ for any N. The result is at the moment conditional and the following generically reasonable yet unverifiable assumption has to be made:
\[
|\lambda_l - \lambda_k| > \frac{1}{C} \left( \max\{|\lambda_l|, |\lambda_k|\} \right)^{-L}, \quad \text{for some fixed } L > 0.
\tag{1.2}
\]
This motivates our unconditional results in two rather explicit examples.

Remark. After this paper was written we learned that in a recent paper [1], Beyer obtained results very close to Theorem 2 here. His work was motivated by gravitational scattering – see [15] for another discussion of resonances in that setting and the relation to the hyperbolic cylinder.

In the paper, C will stand for a constant whose value may change from line to line. The notation $\langle s \rangle$ means $(1 + |s|^2)^{1/2}$.

2. The Quotient by the Modular Group

We begin by working on a general surface with one constant curvature cusp end, which we shall identify with $(a, \infty)_y \times S^1_\theta$ for some a > 0. We will use z to denote a variable in the surface and, as is common for scattering on hyperbolic surfaces, we use the spectral variable s(1 − s). We refer to [14] for the spectral and scattering theories of such surfaces. Let E(z, s) be the generalized eigenfunction of the positive Laplacian $\Delta$, so that $(\Delta - s(1 - s))E(z, s) = 0$, such that on the cusp end
\[
E(z, s) = y^s + S(s)\, y^{1-s} + O(y^{-\varepsilon})
\]


for some $\varepsilon > 0$. With this convention, the scattering matrix S(s) is holomorphic when Re s > 1/2, except for a finite number of values of s which correspond to the eigenvalues of $\Delta$ which lie below 1/4. The function
\[
u(t) = \frac{\sin(t\sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}\, f
\]
satisfies the equation
\[
(D_t^2 - (\Delta - 1/4))u(t) = 0, \quad u(0) = 0, \quad u_t(0) = f
\]
when f is sufficiently well-behaved. Using the spectral representation, we have
\[
\frac{\sin(t\sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}
= \frac{1}{2i} \sum_{\lambda_j \in \sigma_p(\Delta)} \frac{e^{i\sqrt{\lambda_j - 1/4}\,t} - e^{-i\sqrt{\lambda_j - 1/4}\,t}}{\sqrt{\lambda_j - 1/4}}\, \varphi_j(z)\, \overline{\varphi_j}(z')
+ \frac{1}{8\pi} \int_{-\infty}^{\infty} \frac{e^{i\tau t} - e^{-i\tau t}}{i\tau}\, E(z, 1/2 + i\tau)\, E(z', 1/2 - i\tau)\, d\tau
\tag{2.1}
\]
in the sense of distributions, where $\varphi_j$ is a normalized eigenfunction of $\Delta$, associated to $\lambda_j$. Here the $\lambda_j$'s can take the same value depending on the multiplicity and the $\varphi_j$'s form an orthonormal set.

Let $X_0 = \mathbb{H}^2 / PSL(2; \mathbb{Z})$ be the quotient of the hyperbolic upper half plane by $PSL(2; \mathbb{Z})$, which has scattering matrix
\[
S(s) = \sqrt{\pi}\, \frac{\Gamma(s - \tfrac{1}{2})}{\Gamma(s)}\, \frac{\zeta(2s - 1)}{\zeta(2s)},
\tag{2.2}
\]
where $\Gamma$ is the Euler $\Gamma$-function and $\zeta$ is the Riemann $\zeta$-function. One consequence of this is that the poles of the scattering matrix other than s = 1 correspond to the nontrivial zeros of $\zeta(2s)$. We recall that if N(T) is the number of zeros of the function $\zeta(s)$ in $0 \le \operatorname{Re} s \le 1$, $0 \le \operatorname{Im} s \le T$, then
\[
N(T) = \frac{1}{2\pi} T \log T - \frac{1 + \log 2\pi}{2\pi} T + O(\log T)
\]

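as $T \to \infty$ ([20, Theorem 9.4]). That is, the scattering matrix for the Laplacian on $X_0$ has far fewer than the maximum number of poles in a ball of radius T centered at the origin, which is $O(T^2)$ for general surfaces with cusps.

Let $f, \chi \in C_c^\infty(X_0)$. We wish to obtain an asymptotic expansion as $t \to \infty$ of $\sin(t\sqrt{\Delta - 1/4})/\sqrt{\Delta - 1/4}$ applied to f and truncated by $\chi$.

Theorem 1. Let $f, \chi \in C_c^\infty(X_0)$. Then there exist $v_{jk} \in C_c^\infty(X_0)$ such that as $t \to \infty$,
\[
\chi\, \frac{\sin(t\sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}\, f
= \frac{1}{2i} \sum_{\lambda_j \in \sigma_p(\Delta)} \left( \frac{e^{i\sqrt{\lambda_j - 1/4}\,t} - e^{-i\sqrt{\lambda_j - 1/4}\,t}}{\sqrt{\lambda_j - 1/4}} \right) \chi(z)\, \varphi_j(z)\, (f, \varphi_j)
+ \sum_{s_j \text{ poles of } S(s)} e^{(s_j - 1/2)\, t\, \operatorname{sign}(1/2 - \operatorname{Re} s_j)} \sum_{k \le \operatorname{mult}(s_j) - 1} v_{jk}\, t^k + O(e^{-Nt})
\]
for any N.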


The sum over the poles should be understood as follows: for $n \in \mathbb{Z}$, there exist $\tau_n \in \mathbb{R}$, $n \le 2\tau_n < n + 1$, such that if
\[
b_n(z, t) = \sum_{s_j :\, \tau_n < \operatorname{Im} s_j < \tau_{n+1}} \; \sum_{k \le \operatorname{mult}(s_j) - 1} e^{(s_j - 1/2)t}\, v_{jk}\, t^k,
\]
then $\sum_{n=-\infty}^{\infty} b_n(z, t)$ is absolutely convergent when $t \ge 0$.

We remark that to obtain an expected expansion in the case of the modular surface, that is, an expansion in which there is no need to cluster terms to obtain absolute convergence, would require more precise information about the zeta function: simplicity of zeros, lower bounds on the derivatives.

The proof of the theorem involves a contour deformation. In order to justify the deformation, we will need some bounds on S(s) and on E(z, s). Let $\chi_1 \in C_c^\infty(X_0)$ be such that the support of $(1 - \chi_1)$ is contained in the cusp end of $X_0$. Then
\[
E(z, s) = (1 - \chi_1)\, y^s + (\Delta - s(1 - s))^{-1} [\Delta, \chi_1]\, y^s.
\tag{2.3}
\]
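To get a feeling for how sparse the poles of S(s) on the modular surface are, the leading terms of the zero-counting function N(T) from [20, Theorem 9.4] can be evaluated directly. The following is only a minimal sketch: the function name is ours, the O(log T) remainder is ignored, and the value N(100) = 29 used for comparison is the commonly tabulated count of zeta zeros with imaginary part between 0 and 100, not something taken from this paper.

```python
import math


def zero_count_main_term(T: float) -> float:
    """Leading terms of N(T): (T / 2 pi) log T - ((1 + log 2 pi) / 2 pi) T.

    The O(log T) remainder of the Riemann-von Mangoldt formula is dropped.
    """
    two_pi = 2.0 * math.pi
    return T / two_pi * math.log(T) - (1.0 + math.log(two_pi)) / two_pi * T


if __name__ == "__main__":
    for T in (100.0, 1.0e3, 1.0e4):
        main = zero_count_main_term(T)
        # Compare with the T^2 growth allowed for general surfaces with cusps.
        print(f"T = {T:8.0f}   main term = {main:10.1f}   T^2 = {T * T:12.0f}")
```

At T = 100 the main term differs from the tabulated count by less than one, well within the O(log T) remainder, while the comparison with T^2 makes the remark about "far fewer than the maximum number of poles" concrete.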

Fig. 1. The fundamental domain of the modular group and the resonances of the modular surface in the λ-plane, s = 1/2 − iλ

If $\chi \in C_c^\infty(X_0)$, then the basic spectral estimate on the resolvent
\[
(\Delta - s(1 - s))^{-1} = O\big(d(s(1 - s), \sigma(\Delta))^{-1}\big) : L^2(X_0) \longrightarrow L^2(X_0),
\]
the Sobolev embedding theorem, and interpolation ($\Delta^k([\Delta, \chi_1] y^s) = O(|s|^{2k+1} e^{|\operatorname{Re} s|}) \in L^2(X_0)$) immediately give
\[
|\chi E(z, s)| \le C\, \frac{\langle s \rangle^{2+\varepsilon} e^{\operatorname{Re} s}}{\operatorname{Re} s - 1/2} \quad \text{and} \quad |S(s)| \le C e^{\operatorname{Re} s}
\tag{2.4}
\]
when $\operatorname{Re} s \ge 1/2$ and $|\operatorname{Im} s| > 1$ or $\operatorname{Re} s > 2$. (For the bound on S, see [14, (3.26),(3.27)].) The bounds (2.4) are valid for other surfaces with one cusp end, but we shall need an improved bound in the special case of $X_0$. As a by-product of their work on $L^\infty$ bounds on $L^2$-eigenfunctions, Iwaniec and Sarnak obtain the bound
\[
|\chi(z) E(z, 1/2 + it)| \ll |t|^{\frac{5}{12} + \varepsilon},
\]


see [11, (A.12)] where now the positive contribution of the Eisenstein series should be kept (see also the discussion around [16, (2.18)]). Here we give a direct proof of a weaker estimate, in the spirit of general scattering theory: the estimate depends only on the separation of the poles of E from the continuous spectrum.

Lemma 2.1. For the generalized eigenfunctions E(z, s) on $X_0$,
\[
|\chi(z) E(z, s)| \le C \langle s \rangle^{2+\varepsilon} e^{\operatorname{Re} s} \quad \text{if } \operatorname{Re} s \ge 1/2, \; |s - 1| > \varepsilon > 0.
\]

Proof. We shall use (2.2) to improve our bound on E(z, s) when $\operatorname{Re} s \ge 1/2$. We start by recalling an exponential bound on the resolvent:
\[
\| \chi_2 (\Delta - s(1 - s))^{-1} \chi_2 \|_{L^2 \to L^2} \le C \exp(C|s|^{N_0}),
\quad \text{if } s \notin \bigcup_{j=1}^{\infty} D(s_j, \langle s_j \rangle^{-3}) \; \cup \bigcup_{\sigma_j :\, \sigma_j(1 - \sigma_j) = \lambda_j \in \sigma_p(\Delta)} D(\sigma_j, \langle \sigma_j \rangle^{-3})
\tag{2.5}
\]
for any $\chi_2 \in C_c^\infty(X_0)$ for some $N_0$ (see the representation of the resolvent in [7, Sect. 5] and [8, Lemma 3.6]; the estimate also follows from the general “black box” scattering estimate, see [18, Lemma 1]). Moreover, by Theorem 3.8 of [20] there is a constant A > 0 such that $\zeta(s)$ is not zero for $\operatorname{Re} s \ge 1 - A/\log|\operatorname{Im} s|$, $|\operatorname{Im} s| > t_0$. Since $(\Delta - s(1 - s))^{-1}[\Delta, \chi_1] y^s$ is regular away from the poles of the scattering matrix, using (2.3), (2.5), and the maximum principle this leads to a bound on the generalized eigenfunctions:
\[
|\chi(z) E(z, s)| \le C \exp(C|s|^{N_0})
\tag{2.6}
\]
when $|\operatorname{Im} s|$ is sufficiently large and $\operatorname{Re} s \ge 1/2 - A/(4 \log|\operatorname{Im} s|)$. Then, just as in [18, Lemma 2], we can use the maximum principle, the existence of a pole free region, the bound in the good half plane (2.4) and the exponential bound (2.6) to obtain, for $|\operatorname{Im} s|$ sufficiently large,
\[
|\chi(z) E(z, s)| \le C \langle s \rangle^{2+\varepsilon} e^{\operatorname{Re} s} \quad \text{if } \operatorname{Re} s \ge 1/2.
\tag{2.7}
\]
□

Proof of Theorem 1. We use the representation (2.1) of $\sin(t\sqrt{\Delta - 1/4})/\sqrt{\Delta - 1/4}$. For the term of (2.1) with an $e^{it\tau}$ we deform the contour of integration into the upper half plane; the term with $e^{-it\tau}$ is deformed into the lower half-plane. We note that this is possible despite the singularity at $\tau = 0$ since
\[
\lim_{\varepsilon \downarrow 0} \int_{\operatorname{Im} \tau = \varepsilon} e^{i\tau t}\, E(z, 1/2 + i\tau)\, E(z', 1/2 - i\tau)\, \frac{d\tau}{\tau} \;\cdots
\tag{2.8}
\]

We consider the term with $e^{it\tau}$, for which we deform the contour of integration into the upper half-plane. There $E(z, 1/2 - i\tau)$ is bounded as in Lemma 2.1. We will use that $E(z, 1/2 + i\tau) = S(1/2 + i\tau) E(z, 1/2 - i\tau)$, and this, combined with some estimates on S(s) and the estimates above, will allow us to justify the contour deformation.


We give some bounds on the scattering matrix, using the fact that we know it explicitly in terms of $\Gamma$ and $\zeta$. If $|\operatorname{Re}(s)| < C_1$, $|\operatorname{Im} s| > 1$, then
\[
\left| \frac{\Gamma(s - 1/2)}{\Gamma(s)} \right| \le C_2
\tag{2.9}
\]
by Stirling's formula. If $\operatorname{Re}(s) \ge -\delta$,
\[
|\zeta(s)| = O(|\operatorname{Im} s|^{3/2 + \delta})
\tag{2.10}
\]
([20, p. 95]). Moreover, by [20, Theorem 9.7], there is a constant A such that each interval (T, T + 1) contains a value $\tau$ of Im s such that
\[
|\zeta(s)| \ge |\operatorname{Im} s|^{-A}, \quad \operatorname{Im} s = \tau, \quad -1 \le \operatorname{Re} s \le 2.
\tag{2.11}
\]

For each $n \in \mathbb{Z}$ we choose such a $2\tau_n \in (n, n + 1)$. Using Eqs. (2.2), (2.7), (2.9), (2.10), (2.11), the fact that $\Delta E(z, 1/2 - i\tau) = (1/4 + \tau^2) E(z, 1/2 - i\tau)$ and $\Delta^k f \in C_c^\infty(X_0)$ for any $k \in \mathbb{N}$, by integrating by parts in $z'$ we can show that
\[
\left| \chi(z) E(z, 1/2 + i\tau) \int_{z' \in X_0} E(z', 1/2 - i\tau) f(z') \right| \le C (1 + |\tau|)^{-m}
\tag{2.12}
\]
for any $m \in \mathbb{R}$, if $\operatorname{Im} \tau = 0$, $\operatorname{Im} \tau = 1$, or $\operatorname{Re} \tau = \tau_n$, $0 \le \operatorname{Im} \tau \le 1$. We have also used the proof of [20, Theorem 9.7] to bound $(\zeta(s))^{-1}$ when Re s = −1 or Re s = 2.

Now we do a contour deformation, deforming the integral over $\tau_n \le \tau \le \tau_{n+1}$, $\operatorname{Im} \tau = 0$, to the contour consisting of the three line segments $\operatorname{Re} \tau = \tau_n$, $0 \le \operatorname{Im} \tau \le 1$; $\operatorname{Im} \tau = 1$, $\tau_n \le \operatorname{Re} \tau \le \tau_{n+1}$; and $\operatorname{Re} \tau = \tau_{n+1}$, $0 \le \operatorname{Im} \tau \le 1$. The estimate (2.12) and the residue theorem show that the sum of the residues in this region is bounded in absolute value by $C_k (1 + |\tau_n|)^{-k}$ for any k; this will give us the convergence of the sum over the poles as desired. By doing a contour deformation to the line $\operatorname{Im} \tau = 1$ and using the estimates (2.7), (2.9), (2.10), and (2.11), along with the residue theorem, we obtain a sum over the poles, with an error term of order $e^{-t}$.

To improve the error term, we will need bounds on S(s) when Re s < −1. By [20, Theorem 8.7],
\[
\left| \frac{1}{\zeta(s)} \right| \le \frac{\zeta(\operatorname{Re} s)}{\zeta(2 \operatorname{Re} s)}
\tag{2.13}
\]
if Re s > 1. Since
\[
\zeta(s) = 2^s \pi^{s-1} \sin\left(\frac{\pi s}{2}\right) \Gamma(1 - s)\, \zeta(1 - s),
\]
the estimate (2.13) combined with our earlier estimates is enough to show that the contour of integration can be deformed to any line $\operatorname{Im} \tau = c_0$, resulting in an error as claimed. The term with $e^{-it\tau}$ is treated similarly. □
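The functional equation used at the end of the proof to control S(s) for Re s < −1 can be sanity-checked on the real axis with the standard library alone. This is a minimal sketch, not part of the argument: `zeta_real` and `zeta_reflected` are hypothetical helper names, `zeta_real` approximates ζ(x) for real x > 1 by a partial sum with an Euler–Maclaurin tail correction, and only real s < 0 is treated (the proof itself needs complex s). The comparison values ζ(−1) = −1/12 and ζ(−3) = 1/120 are the classical ones.

```python
import math


def zeta_real(x: float, cutoff: int = 2000) -> float:
    """Approximate zeta(x) for real x > 1.

    Sums n^(-x) for n < cutoff and adds the first Euler-Maclaurin
    terms for the tail starting at n = cutoff.
    """
    partial = sum(n ** (-x) for n in range(1, cutoff))
    tail = (
        cutoff ** (1.0 - x) / (x - 1.0)
        + 0.5 * cutoff ** (-x)
        + x * cutoff ** (-x - 1.0) / 12.0
    )
    return partial + tail


def zeta_reflected(s: float) -> float:
    """zeta(s) for real s < 0 via
    zeta(s) = 2^s pi^(s-1) sin(pi s / 2) Gamma(1 - s) zeta(1 - s).
    """
    return (
        2.0 ** s
        * math.pi ** (s - 1.0)
        * math.sin(math.pi * s / 2.0)
        * math.gamma(1.0 - s)
        * zeta_real(1.0 - s)
    )


if __name__ == "__main__":
    print(zeta_reflected(-1.0), -1.0 / 12.0)
    print(zeta_reflected(-3.0), 1.0 / 120.0)
```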

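Suppose the Riemann hypothesis is true. Then, using a contour deformation argument similar to that above, one can show that
\[
\chi\, \frac{\sin(t\sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}\, f
= \frac{1}{2i} \sum_{\lambda_j \in \sigma_p(\Delta)} \left( \frac{e^{i\sqrt{\lambda_j - 1/4}\,t} - e^{-i\sqrt{\lambda_j - 1/4}\,t}}{\sqrt{\lambda_j - 1/4}} \right) \chi(z)\, \varphi_j(z)\, (f, \varphi_j) + O(e^{(-1/4 + \varepsilon)t})
\]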


for any $\varepsilon > 0$ and where we have used the notation and assumptions of Theorem 1. On the other hand, just using the known fact that the $\zeta$-function has no zeros in $\{s : \operatorname{Re} s > 1 - C(\log|\operatorname{Im} s|)^{-1}, \; |\operatorname{Im} s| > t_0\}$ for some C > 0, one can show, by moving the contour of integration off the real axis, to $\operatorname{Im} \tau = C/(4 \log \operatorname{Re} \tau)$ for sufficiently large $\tau$, that
\[
\chi\, \frac{\sin(t\sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}\, f
= \frac{1}{2i} \sum_{\lambda_j \in \sigma_p(\Delta)} \left( \frac{e^{i\sqrt{\lambda_j - 1/4}\,t} - e^{-i\sqrt{\lambda_j - 1/4}\,t}}{\sqrt{\lambda_j - 1/4}} \right) \chi(z)\, \varphi_j(z)\, (f, \varphi_j) + O(t^{-N})
\]
for any N.

3. The Hyperbolic Half-Cylinder

The second example we consider is the hyperbolic half-cylinder $Y_l^0 \simeq (\mathbb{R}_+)_r \times (\mathbb{R}/l\mathbb{Z})_\theta$ with metric $dr^2 + \cosh^2 r\, d\theta^2$. The analysis is equally applicable to the case of the full cylinder $Y_l \simeq (\mathbb{R})_r \times (\mathbb{R}/l\mathbb{Z})_\theta$ with the same metric. In both cases the trapped set consists of one closed hyperbolic orbit which is well known to generate resonances on a lattice (as was pointed out by Guillopé [6] and Epstein [2]; see also [7, Appendix] and Fig. 2(b)).

We begin by recalling some results of [7, Sect. 3 and Appendix]. The Laplacian
\[
\Delta_{Y_l^0} = D_r^2 - i \tanh r\, D_r + \Delta_{\mathbb{R}/l\mathbb{Z}} (\cosh r)^{-2}
\]
on the hyperbolic half-cylinder $Y_l^0$ is, through conjugation by $\cosh^{1/2} r$, equivalent to the operator
\[
D_r^2 + \frac{\Delta_{\mathbb{R}/l\mathbb{Z}} + 1/4}{\cosh^2 r} + \frac{1}{4}
\]
on $L^2(\mathbb{R}_+ \times \mathbb{R}/l\mathbb{Z}, dr\, d\theta)$. We can expand this in terms of the eigenfunctions on $\mathbb{R}/l\mathbb{Z}$ to obtain
\[
\bigoplus_{m \in \mathbb{Z}} \left( D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + \frac{1}{4} \right)
\]
on $l^2(\mathbb{Z}, L^2(\mathbb{R}_+, dr))$. Modifying slightly the notation of [7], we have a generalized eigenfunction satisfying the Dirichlet boundary condition for the one-dimensional problem $D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} - k^2$,
\[
\tilde{E}_{\nu_m}(r, k) = \tilde{a}(\nu_m, k)\, \sinh r\, \cosh^{1 + \nu_m} r \cdot {}_2F_1\big( (\nu_m - ik + 2)/2, \; (\nu_m + ik + 2)/2, \; 3/2; \; -\sinh^2 r \big),
\]
where $\nu_m = -1/2 + i(2\pi m/l)$ and
\[
\tilde{a}(\nu_m, k) = \frac{2^{-ik}\, \Gamma((\nu_m - ik + 2)/2)\, \Gamma((-\nu_m - ik + 1)/2)}{\Gamma(-ik)\, \Gamma(3/2)}.
\]
Then, for $r, r' > 0$,
\[
\delta(r - r') = \frac{1}{4\pi} \int_{-\infty}^{\infty} \tilde{E}_{\nu_m}(r, k)\, \tilde{E}_{\nu_m}(r', -k)\, dk.
\tag{3.1}
\]
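The conjugation by cosh^{1/2} r recalled above can be checked numerically on a single Fourier mode, where the Laplacian on R/lZ acts as multiplication by (2πm/l)^2. The sketch below is only an illustration under stated assumptions: the helper names, the test function v, the sample points and the finite-difference step are our own choices, and D_r is taken to be −i d/dr, so that D_r^2 = −d^2/dr^2 and −i tanh r D_r = −tanh r d/dr.

```python
import math


def d1(f, r, h=1e-4):
    """Central finite-difference approximation of f'(r)."""
    return (f(r + h) - f(r - h)) / (2.0 * h)


def d2(f, r, h=1e-4):
    """Central finite-difference approximation of f''(r)."""
    return (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)


def half_cylinder_mode(u, r, k2):
    """(D_r^2 - i tanh r D_r + k2 / cosh^2 r) u at r, with D_r = -i d/dr."""
    return -d2(u, r) - math.tanh(r) * d1(u, r) + k2 / math.cosh(r) ** 2 * u(r)


def conjugated_mode(v, r, k2):
    """(D_r^2 + (k2 + 1/4) / cosh^2 r + 1/4) v at r."""
    return -d2(v, r) + (k2 + 0.25) / math.cosh(r) ** 2 * v(r) + 0.25 * v(r)


def conjugation_defect(r, k2):
    """Difference between cosh^{1/2} Delta cosh^{-1/2} v and the conjugated operator."""
    v = lambda x: math.exp(-x * x) * math.sin(3.0 * x)  # arbitrary smooth test function
    u = lambda x: v(x) / math.sqrt(math.cosh(x))
    lhs = math.sqrt(math.cosh(r)) * half_cylinder_mode(u, r, k2)
    return lhs - conjugated_mode(v, r, k2)


if __name__ == "__main__":
    k2 = (2.0 * math.pi * 1 / 1.0) ** 2  # eigenvalue (2 pi m / l)^2 with m = 1, l = 1
    for r in (0.3, 0.7, 1.5):
        print(r, conjugation_defect(r, k2))
```

The printed defects should be at the level of the finite-difference error, which is consistent with the extra 1/4 appearing both in the numerator of the potential and as an additive constant.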


Fig. 2. (a) Resonances associated to two strictly convex bodies: in every fixed strip, the resonances become closer to points on the lattice as the real part increases. (b) Resonances for a hyperbolic cylinder: all resonances lie exactly on a lattice. The underlying dynamical structure, exactly one hyperbolic closed orbit, is the same in the two examples

In order to be consistent with our first example, we shall use as the variable s = 1/2 − ik and set $E_{\nu_m}(r, s) = \tilde{E}_{\nu_m}(r, k)$. The scattering matrix $S_l^0(s)$ for the hyperbolic half-cylinder with Dirichlet boundary conditions is
\[
S_l^0(s) = \bigoplus_{m \in \mathbb{Z}} s(H_{0, -1/2 + 2im\pi/l})(k), \quad s = 1/2 - ik,
\]
where $H_{0, -1/2 + 2im\pi/l}$ is $D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r}$ with Dirichlet boundary conditions. This gives us ([7, Lemmas 3.3, 3.4]) that
\[
S_l^0(s) = \bigoplus_{m \in \mathbb{Z}} s_{lm}(s)
\tag{3.2}
\]
with
\[
s_{lm}(s) = \frac{2^{2s-1}\, \Gamma(1/2 - s)\, \Gamma((1 + s - i2\pi m/l)/2)\, \Gamma((1 + s + i2\pi m/l)/2)}{\Gamma(s - 1/2)\, \Gamma((2 - s - i2\pi m/l)/2)\, \Gamma((2 - s + i2\pi m/l)/2)}
\tag{3.3}
\]
and the resonances of the Dirichlet Laplacian associated to $s_{lm}(s)$ are $\pm i2\pi m/l - n$, $n \in 2\mathbb{N} - 1$. Notice that this means that for any $\beta \in \mathbb{R}$, corresponding to each eigenfunction on $\mathbb{R}/l\mathbb{Z}$ there are only a finite number of resonances with real part greater than $-\beta$. Note too that for $m \ne 0$, the resonance $i2\pi m/l - n$ has multiplicity two as a resonance of $S_l^0(s)$ but only one as a resonance of $s_{lm}(s)$ and $s_{l(-m)}(s)$. For this reason, when $m \ne 0$ we can rule out the possibility of needing terms with t dependence in the expansion someplace other than the exponential.

If X is a manifold with boundary, we use the notation $\dot{C}_c^\infty(X)$ to denote the smooth, compactly supported functions on X that vanish to infinite order at the boundary.
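For m = 0 the two Gamma factors in the numerator of (3.3) coincide, so the poles of the m = 0 component at s = −n, n odd, should be double rather than simple. A quick way to see this on the real axis is to compare the size of that component at distances δ and δ/2 from the pole: for a pole of order p the ratio tends to 2^p. The following minimal sketch uses only `math.gamma`, which accepts negative non-integer arguments; the function names and the value of δ are our own choices, and the link to the t-dependent terms in the expansion is our reading of the remark above, not a statement from [7].

```python
import math


def s_l0(s: float) -> float:
    """The m = 0 component of (3.3), evaluated at real s away from its poles."""
    g = math.gamma
    numerator = 2.0 ** (2.0 * s - 1.0) * g(0.5 - s) * g((1.0 + s) / 2.0) ** 2
    denominator = g(s - 0.5) * g((2.0 - s) / 2.0) ** 2
    return numerator / denominator


def pole_order_estimate(n: int, delta: float = 1e-3) -> float:
    """Estimate the order of the pole of s_l0 at s = -n.

    If s_l0(s) ~ c (s + n)^(-p) near s = -n, then halving the distance
    to the pole multiplies |s_l0| by about 2^p, so log2 of the ratio is p.
    """
    near = abs(s_l0(-n + delta / 2.0))
    far = abs(s_l0(-n + delta))
    return math.log2(near / far)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, pole_order_estimate(n))
    print("value at the regular point s = -2:", s_l0(-2.0))
```

The estimates come out close to 2 at the odd negative integers, while the function stays finite at s = −2, in line with the resonance set n ∈ 2N − 1.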


Theorem 2. Let $f \in \dot{C}_c^\infty(Y_l^0)$ and let $\chi \in C_c^\infty(Y_l^0)$. Then there exist $v_{m,n}, w_n \in C_c^\infty(Y_l^0)$ such that as $t \to \infty$,
\[
\chi\, \frac{\sin\big(t\sqrt{\Delta_{Y_l^0} - 1/4}\big)}{\sqrt{\Delta_{Y_l^0} - 1/4}}\, f
= \sum_{\substack{0 < n < \beta,\; n \in 2\mathbb{N} - 1 \\ m \in \mathbb{Z}}} e^{(i2m\pi/l - n - 1/2)t}\, v_{m,n}
+ \sum_{\substack{0 < n < \beta \\ n \in 2\mathbb{N} - 1}} e^{(-n - 1/2)t}\, w_n\, t + O(e^{(-\beta - 1/2)t})
\]
if $\beta \notin 2\mathbb{N} - 1$. The same conclusion holds for $f \in C_c^\infty(Y_l)$ and $\Delta_{Y_l}$ with the resonance set replaced by the resonance set of the full hyperbolic cylinder $\pm 2\pi i m/l - n$, $m \in \mathbb{N}_0$, $n \in \mathbb{N}_0$.

Let $\{\varphi_m\}$, $m \in \mathbb{Z}$, be an orthonormal set of eigenfunctions of $\Delta_{\mathbb{R}/l\mathbb{Z}}$, with corresponding eigenvalue $(2\pi m/l)^2$. As in the first example, we shall use the spectral representation
\[
\frac{\sin(t\sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}
= \frac{1}{8\pi} \sum_{m \in \mathbb{Z}} \varphi_m(\theta)\, \overline{\varphi_m}(\theta') \int_{\operatorname{Re} s = 1/2} \frac{e^{(s - 1/2)t} - e^{-(s - 1/2)t}}{s - 1/2}\, E_{\nu_m}(r, s)\, E_{\nu_m}(r', 1 - s)\, \cosh^{-1/2} r\, \cosh^{-1/2} r'\, ds.
\tag{3.4}
\]
As for the first example, the proof involves a contour deformation and the difficulty lies in justifying it, bounding the residues, and bounding the integral over $\operatorname{Re} s = -\beta$. We prove the theorem by proving a series of lemmas which justify the contour deformation and bound the remainder.

Let $\chi, \tilde{\chi} \in C_c^\infty(\mathbb{R}_+)$. We will need bounds in s and m on $\chi(r) E_{\nu_m}(r, s) E_{\nu_m}(r', 1 - s) \tilde{\chi}(r')$ for fixed Re s and bounds, in m, on its residues. We note that polynomial bounds will be sufficient for our purposes, since we shall be pairing with a function $f \in C_c^\infty(Y_l^0)$. Since $(2\pi m/l)^2$ are the eigenvalues on the cross section $\mathbb{R}/l\mathbb{Z}$, and since $\Delta_{\mathbb{R}/l\mathbb{Z}} f \in C_c^\infty(Y_l^0)$ when $f \in C_c^\infty(Y_l^0)$, a polynomial bound in m suffices. Similarly, because f vanishes to infinite order at r = 0, repeated integration by parts shows that a polynomial bound in s is enough. We first concern ourselves with the residues.

Lemma 3.1. Let $\chi, \tilde{\chi} \in C_c^\infty(\mathbb{R}_+)$. If $0 < n < N_0$, the residues of $\chi(r) E_{\nu_m}(r, s) E_{\nu_m}(r', 1 - s) \tilde{\chi}(r')$ at $s = \pm 2\pi i m/l - n$ are bounded by $C \langle m \rangle^{N_0 + 1/2}$.

Proof. We will use that $E_{\nu_m}(r, s) = s_{lm}(s) E_{\nu_m}(r, 1 - s)$ so that we need only bound $s_{lm}(s)$ and its residues and bound $E_{\nu_m}(r, 1 - s)$ in the good half plane $\operatorname{Re}(1 - s) > 1/2$. At the pole $s = \pm i2\pi m/l - n$, $n \in 2\mathbb{N} - 1$,
\[
E_{\nu_m}(r, 1 - s)\big|_{s = \pm i2\pi m/l - n} = \tilde{a}(\nu_m, i/2 \pm 2\pi m/l + in)\, \sinh r\, (\cosh r)^{1/2 + 2\pi i m/l}
\times {}_2F_1\Big( 1 + \frac{n}{2} + \frac{\pi i m}{l} \mp \frac{\pi i m}{l}, \; \frac{1}{2} + \frac{\pi i m}{l} \pm \frac{\pi i m}{l} - \frac{n}{2}, \; \frac{3}{2}, \; -\sinh^2 r \Big).
\tag{3.5}
\]


From [3, Sect. 2.3.2], for bounded n and r,
\[
\Big| {}_2F_1\Big( 1 + \frac{n}{2} + \frac{\pi i m}{l} \mp \frac{\pi i m}{l}, \; \frac{1}{2} + \frac{\pi i m}{l} \pm \frac{\pi i m}{l} - \frac{n}{2}, \; \frac{3}{2}, \; -\sinh^2 r \Big) \Big| \le C \langle m \rangle^{n/2 - 1/2}.
\tag{3.6}
\]
Using Stirling's formula, we find that
\[
|\tilde{a}(\nu_m, i/2 \pm 2\pi m/l + in)| \le C \langle m \rangle^{-n/2 + 1/2}.
\tag{3.7}
\]
For $m \ne 0$, the residue of $s_{lm}$ at $s = \pm i2\pi m/l - n$ is
\[
\frac{(-1)^{(1-n)/2}\, 2^{\pm i4\pi m/l - 2n}\, \Gamma(1/2 \mp i2\pi m/l + n)\, \Gamma((1 - n \pm i4\pi m/l)/2)}{\Gamma(\pm i2\pi m/l - n - 1/2)\, \Gamma((2 \mp i4\pi m/l + n)/2)\, \Gamma((2 + n)/2)\, \Gamma((n + 1)/2)}
\]
which, for fixed n, is easily seen to be bounded by $C \langle m \rangle^{n + 1/2}$ using the relation $\Gamma(z + 1) = z\Gamma(z)$ and Stirling's formula. This, along with (3.5), (3.6) and (3.7), shows that the residues are bounded as claimed. □

The bound given by this lemma is enough to show that the sum over the $v_{m,n}$ in Theorem 2, which corresponds to a sum over residues, converges absolutely, since we may integrate by parts in the tangential variable to obtain decay in |m|.

We state the following lemma for the slightly easier case of $e^{t(s - 1/2)}$ rather than for $e^{t(s - 1/2)}/(s - 1/2)$ in order to avoid the difficulty at s = 1/2. However, this can be easily treated as indicated in (2.8) by considering at the same time the term of (3.4) with $e^{-t(s - 1/2)}/(s - 1/2)$. As the main difficulty here is the behaviour of the integrand at infinity, we avoid the question of the behaviour near s = 1/2. This lemma, combined with the previous one and the remarks above, justifies the contour deformation to $\operatorname{Re} s = -\beta$.

Lemma 3.2. Let $\chi \in C_c^\infty(\mathbb{R}_+)$, $f \in \dot{C}_c^\infty(Y_l^0)$, and
\[
h_m(r, s; f) = \chi(r) \cosh^{-1/2} r\, \varphi_m(\theta) \int_{\mathbb{R}_+} \int_{\mathbb{R}/l\mathbb{Z}} e^{t(s - 1/2)}\, E_{\nu_m}(r, s)\, E_{\nu_m}(r', 1 - s)\, \cosh^{1/2} r'\, f(r', \theta')\, \overline{\varphi_m}(\theta')\, d\theta'\, dr'.
\]
Then if $\beta > 0$, $\beta \notin 2\mathbb{N} - 1$,
\[
\int_{\operatorname{Re} s = 1/2} h_m(r, s; f)\, ds = \int_{\operatorname{Re} s = -\beta} h_m(r, s; f)\, ds + 2\pi i \sum_{\substack{n < \beta \\ n \in 2\mathbb{N} - 1}} \operatorname{Res}_{s = \pm i2\pi m/l - n}\, h_m(r, s; f).
\]

Proof. In order to prove the lemma, we need only show that
\[
\lim_{R \to \infty} \int_{\substack{|\operatorname{Im} s| = R \\ 1/2 > \operatorname{Re} s > -\beta}} h_m(r, s; f)\, ds = 0.
\]
Since
\[
(s(1 - s))^{l'}\, h_m(r, s; f) = h_m(r, s; \Delta_{Y_l^0}^{l'} f)
\tag{3.8}
\]


and the right-hand side has the same properties as $h_m(r, s; f)$, we need only show that $\chi(r) E_{\nu_m}(r, s) E_{\nu_m}(r', 1 - s) \chi(r')$ is polynomially bounded in $\langle s \rangle$ as $|\operatorname{Im} s| \to \infty$ with Re s bounded. Using [13, 7.2.8], we see that
\[
\big| \chi(r)\, {}_2F_1\big( (s + i2\pi m/l + 1)/2, \; (i2\pi m/l - s + 2)/2, \; 3/2; \; -\sinh^2 r \big) \big| \le C
\]
when s is in such a region and |s| is sufficiently large. It only remains to bound $\tilde{a}(\nu_m, is - i/2)\, \tilde{a}(\nu_m, i/2 - is)$, which is easily done using Stirling's formula, to get that
\[
|\tilde{a}(\nu_m, is - i/2)\, \tilde{a}(\nu_m, i/2 - is)| \le C \langle s \rangle
\]
when Re s is in a compact set and $\operatorname{Re} s \notin -(2\mathbb{N} - 1)$, $\operatorname{Re} s \notin 2\mathbb{N}$. This completes the proof. □

Notice that the previous proof is made simple by the fact that we do not need uniform bounds in m. This is not the case in the next lemma, which is why its proof is more involved than the earlier ones.

Lemma 3.3. Let $\chi \in C_c^\infty(Y_l^0)$, $f \in \dot{C}_c^\infty(Y_l^0)$, and let $\beta > 0$, $\beta \notin (\mathbb{N} \cup (\mathbb{N} - 1/2))$. Then
\[
\sum_{m \in \mathbb{Z}} \varphi_m(\theta)\, \chi(r) \int_{\operatorname{Re} s = -\beta} \int_{\mathbb{R}_+} \int_{\mathbb{R}/l\mathbb{Z}} \frac{e^{t(s - 1/2)}}{s - 1/2}\, E_{\nu_m}(r, s)\, E_{\nu_m}(r', 1 - s)\, f(r', \theta')\, \overline{\varphi_m}(\theta') \left( \frac{\cosh r'}{\cosh r} \right)^{1/2} d\theta'\, dr'\, ds = O(e^{-t(\beta + 1/2)})
\]
as $t \to \infty$.

Proof. Here we need uniform polynomial bounds on $E_{\nu_m}(r, s)$ and $E_{\nu_m}(r, 1 - s)$ when r is in a compact set and as $|m| \to \infty$, $|s| \to \infty$ with $\operatorname{Re} s = -\beta$. Recall that
\[
E_{\nu_m}(r, s) = \tilde{a}(\nu_m, is - i/2)\, \sinh r\, (\cosh r)^{1/2 + 2\pi i m/l} \times {}_2F_1\big( (s + 2\pi i m/l + 1)/2, \; (2\pi i m/l - s + 2)/2, \; 3/2; \; -\sinh^2 r \big).
\]
It is relatively easy to bound the hypergeometric function either in s (as was done in the previous lemma) or in m, but we are unaware of a bound in both independently. Instead we will take the approach of writing
\[
{}_2F_1\big( (s + 2\pi i m/l + 1)/2, \; (2\pi i m/l - s + 2)/2, \; 3/2; \; -\sinh^2 r \big)
\tag{3.9}
\]
as a sum of derivatives of ${}_2F_1((s' + 2\pi i m/l + 1)/2, (2\pi i m/l - s' + 2)/2, 3/2; -\sinh^2 r)$ with $1/2 < \operatorname{Re} s' < 5/2$, where we can obtain bounds on it using properties of the resolvent. We will use
\[
(c - a)\, {}_2F_1(a - 1, b, c; z) + (2a - c - az + bz)\, {}_2F_1(a, b, c; z) + a(z - 1)\, {}_2F_1(a + 1, b, c; z) = 0
\tag{3.10}
\]


([13, 3.4.19]). Let $\gamma$ be the greatest integer strictly less than $(5/2 + \beta)/2$ and let $s = -\beta + i\tau$, $\tau \in \mathbb{R}$. We can write
\[
{}_2F_1\big( (i\tau - \beta + 2\pi i m/l + 1)/2, \; (2\pi i m/l - i\tau + \beta + 2)/2, \; 3/2; \; -\sinh^2 r \big)
\]
as a sum of
\[
{}_2F_1\Big( \big( i\tau - \beta + 2\gamma + \tfrac{2\pi i m}{l} + 1 \big)/2 + j_1, \; \big( \tfrac{2\pi i m}{l} - i\tau + \beta - 2\gamma + 2 \big)/2 + j_2, \; \tfrac{3}{2}; \; -\sinh^2 r \Big),
\tag{3.11}
\]
where $j_1, j_2 \in \{0, 1\}$ and the coefficients are rational functions in $2\pi i m/l$, $\tau = \operatorname{Im} s$, and $\sinh^2 r$, the degree of which does not exceed $2\gamma$, and whose denominators are bounded away from 0 when $\beta \notin 2\mathbb{N} - 2$. These coefficients can be bounded by $C \langle m \rangle^{2\gamma} \langle s \rangle^{2\gamma}$, where the constant depends on $\beta$.

There are four functions (3.11) to bound now. We use the notation $(b)_j = b(b + 1) \cdots (b + j - 1)$. If at least one of $j_1, j_2 \ne 0$, then we use
\[
\frac{d^j}{dz^j} \big[ z^{a + j - 1}\, {}_2F_1(a, b, c; z) \big] = (a)_j\, z^{a - 1}\, {}_2F_1(a + j, b, c; z)
\tag{3.12}
\]
for positive integers j ([13, 3.4.4]) to write (3.11) as a sum of
\[
{}_2F_1\Big( \big( i\tau - \beta + 2\gamma + \tfrac{2\pi i m}{l} + 1 \big)/2, \; \big( \tfrac{2\pi i m}{l} - i\tau + \beta - 2\gamma + 2 \big)/2, \; 3/2; \; -\sinh^2 r \Big)
\tag{3.13}
\]
and its derivative in r (and second derivative, if $j_1 = j_2 = 1$) with coefficients which are rational functions of order no greater than two in $\sinh r$, $\cosh r$, m, and $\tau$, with the denominator bounded away from 0. Note that $1/2 < 2\gamma - \beta < 5/2$ when $\beta \notin 2\mathbb{N} - 1/2$, $\beta > 0$. The derivatives are not a difficulty, since once we have a polynomial bound on the function (3.13) we get a similar bound on the derivative using the fact that (3.13) is closely related to solutions of the equation
\[
\Big( D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + 1/4 - s'(1 - s') \Big) h = 0.
\]
Let $s' = i\tau - \beta + 2\gamma$, and note that (3.13) is equal to
\[
(\tilde{a}(\nu_m, is' - i/2))^{-1} (\sinh r)^{-1} (\cosh r)^{-1/2 - 2\pi i m/l}\, E_{\nu_m}(r, s')
\]
with $1/2 < \operatorname{Re} s' < 5/2$. In order to bound $E_{\nu_m}(r, s')$ when $1/2 < \operatorname{Re} s' < 5/2$, we use, in analogy with (2.3),
\[
E_{\nu_m}(r, s') = 2 \sinh\big( (s' - 1/2) r \big) - \Big( D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + \frac{1}{4} - s'(1 - s') \Big)^{-1} \Big( \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} \cdot 2 \sinh\big( (s' - 1/2) r \big) \Big)
\tag{3.14}
\]


when $1/2 < \operatorname{Re} s' < 5/2$ and we recall that we have chosen the convention that $\big( D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + \frac{1}{4} - s'(1 - s') \big)^{-1}$ is bounded on $L^2(\mathbb{R}_+)$ when $\operatorname{Re} s' > 1/2$. We have
\[
\Big\| \Big( D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + \frac{1}{4} - s'(1 - s') \Big)^{-1} \Big\| \le \frac{C}{|\operatorname{Im} s'|\, |\operatorname{Re} s' - 1/2|}
\tag{3.15}
\]
when $\operatorname{Re} s' > 1/2$. Since
\[
\Big\| \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} \big( e^{(s' - 1/2) r} - e^{-(s' - 1/2) r} \big) \Big\|_{L^2(\mathbb{R}_+)} \le C\, \frac{\langle m \rangle^2}{|\operatorname{Re} s' - 5/2|^{1/2}}
\tag{3.16}
\]
when $1/2 \le \operatorname{Re} s' < 5/2$, we find that for $1/2 + \varepsilon < \operatorname{Re} s' < 5/2 - \varepsilon$,
\[
|E_{\nu_m}(r, s')| \le C \langle m \rangle^2 \langle s' \rangle^{1 + \varepsilon} e^{(s' - 1/2) r},
\]
where the constant depends on $\varepsilon$ and we have used (3.14), (3.15), (3.16) as well as the Sobolev embedding theorem.

It remains to bound $(\tilde{a}(\nu_m, i(s + 2\gamma) - i/2))^{-1}$ and $\tilde{a}(\nu_m, is - i/2)$ when $\operatorname{Re} s = -\beta$ – actually, we need only bound their product. We have
\[
\frac{\tilde{a}(\nu_m, is - i/2)}{\tilde{a}(\nu_m, i(s + 2\gamma) - i/2)}
= 2^{-2\gamma}\, \frac{\Gamma(s + 2\gamma - 1/2)}{\Gamma(s - 1/2)}\, \frac{\Gamma((2\pi i m/l + s + 1)/2)}{\Gamma((2\pi i m/l + s + 2\gamma + 1)/2)}\, \frac{\Gamma((-2\pi i m/l + s + 1)/2)}{\Gamma((-2\pi i m/l + s + 2\gamma + 1)/2)}
= \frac{2^{-2\gamma}\, (s - 1/2)_{2\gamma}}{((2\pi i m/l + s + 1)/2)_\gamma\, ((-2\pi i m/l + s + 1)/2)_\gamma},
\]
which is bounded in absolute value by $C \langle s \rangle^{2\gamma}$ when $\operatorname{Re} s \notin -(2\mathbb{N} - 1)$. Finally, this shows that
\[
|\chi(r) E_{\nu_m}(r, s)| \le C \langle m \rangle^{2\gamma + 4} \langle s \rangle^{4\gamma + 7}
\]
when $\operatorname{Re} s = -\beta$, $\beta > 0$, $\beta \notin \mathbb{N} \cup (\mathbb{N} - 1/2)$. A polynomial bound for $E_{\nu_m}(r, 1 - s)$ is found in a similar way. These polynomial bounds are enough then to show that the integral in the statement of the lemma is of order $O(e^{-t(\beta + 1/2)})$. □

We point out that the same method applies to the case of the Neumann Laplacian on $Y_l^0$, using the generalized eigenfunctions as in [5]. The Laplacian on the full cylinder $Y_l$ can be built from the Dirichlet and Neumann Laplacians on the half-cylinder (by decomposing $L^2(Y_l)$ into subspaces of odd and even functions) and hence we obtain the second part of Theorem 2.

As a final remark we point out that a more complex example with the same dynamical structure is given by two convex obstacles – see Fig. 2(a). The resonances are shown to be asymptotic to a lattice of points, [9, 4]. The estimates on the resolvent given in [9] seem sufficient for obtaining an analogue of Theorem 2 above ([10]).

Acknowledgements. The first author is grateful for the partial support of a University of Missouri S.R.F. The second author is grateful to the National Science and Engineering Research Council of Canada and the National Science Foundation of the U.S. for partial support. The authors would like to thank Bill Banks for helpful conversations and Laurent Guillopé for allowing them to use the figures he created for another occasion.


References
1. Beyer, H.: On the completeness of the quasinormal modes of the Pöschl-Teller potential. Commun. Math. Phys. 204, 397–423 (1999)
2. Epstein, Ch.: Unpublished
3. Erdélyi, A. et al.: Higher Transcendental Functions. Vol. I. New York: McGraw-Hill Book Company, Inc., 1953
4. Gérard, Ch.: Asymptotique des pôles de la matrice de scattering pour deux obstacles strictement convexes. Mém. Soc. Math. France (N.S.) 31, (1988)
5. Guillopé, L.: Pöschl-Teller potentials and Laplacians on hyperbolic spaces. Unpublished
6. Guillopé, L.: Sur la distribution des longueurs des géodésiques fermées d’une surface compacte à bord totalement géodésique. Duke Math. J. 53, 827–848 (1986)
7. Guillopé, L. and Zworski, M.: Upper bounds on the number of resonances for non-compact Riemann surfaces. J. Funct. Anal. 129 (2), 364–389 (1995)
8. Guillopé, L. and Zworski, M.: Scattering asymptotics for Riemann surfaces. Ann. Math. 145, 597–660 (1997)
9. Ikawa, M.: On the poles of the scattering matrix for two strictly convex obstacles. J. Math. Kyoto Univ. 23, 127–194 (1983)
10. Ikawa, M.: Private communication
11. Iwaniec, H. and Sarnak, P.: L∞ norms of eigenfunctions of arithmetic surfaces. Ann. of Math. 141, 301–320 (1995)
12. Lax, P. and Phillips, R.: Scattering Theory. New York: Academic Press, 1st edition 1969, 2nd edition 1989
13. Luke, Y.: The Special functions and their approximations. Vol. 1. New York: Academic Press, 1969
14. Müller, W.: Spectral geometry and scattering theory for certain complete surfaces of finite volume. Invent. Math. 109, 265–305 (1992)
15. Sá Barreto, A. and Zworski, M.: Distribution of resonances for spherical black holes. Math. Res. Lett. 4, 103–121 (1997)
16. Sarnak, P.: Arithmetic quantum chaos. The Schur lectures (1992, Tel Aviv). Israel Math. Conf. Proc. 8, 183–236 (1995)
17. Soffer, A. and Weinstein, M.: Resonances, radiation damping and instability in Hamiltonian nonlinear wave equations. Invent. Math. 136, 9–74 (1999)
18. Tang, S.-H. and Zworski, M.: From quasimodes to resonances. Math. Res. Lett. 5 (3), 261–272 (1998)
19. Tang, S.-H. and Zworski, M.: Resonance expansions of scattered waves. Preprint, 1999
20. Titchmarsh, E.C.: The Theory of the Riemann zeta-function. Second edition. Oxford: Clarendon Press, 1986
21. Vainberg, B.R.: Exterior elliptic problems that depend polynomially on the spectral parameter, and the asymptotic behavior for large values of the time of the solutions of nonstationary problems. (Russian) Mat. Sb. (N.S.) 92 (134), 224–241 (1973)
22. Vainberg, B.R.: Asymptotic methods in equations of mathematical physics. London: Gordon and Breach, 1989
23. Zworski, M.: Resonances in physics and geometry. Notices Amer. Math. Soc. 46, 319–328 (1999)

Communicated by P. Sarnak

Commun. Math. Phys. 212, 337 – 370 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Classical Dynamical r-Matrices and Homogeneous Poisson Structures on G/H and K/T

Jiang-Hua Lu∗

Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA. E-mail: [email protected]

Received: 4 September 1999 / Accepted: 25 January 2000

Abstract: Let G be a finite dimensional simple complex group equipped with the standard Poisson Lie group structure. We show that all G-homogeneous (holomorphic) Poisson structures on G/H, where H ⊂ G is a Cartan subgroup, come from solutions to the Classical Dynamical Yang–Baxter equations which are classified by Etingof and Varchenko. A similar result holds for a maximal compact subgroup K, and we get a family of K-homogeneous Poisson structures on K/T, where T = K ∩ H is a maximal torus of K. This family exhausts all K-homogeneous Poisson structures on K/T up to isomorphisms. We study some Poisson geometrical properties of members of this family such as their symplectic leaves, their modular classes, and the moment maps for the T-action.

∗ Research partially supported by an NSF Postdoctoral Fellowship and by NSF grant DMS 9803624.

Contents
1. Introduction
2. The Classical Dynamical Yang–Baxter Equation
3. r-Matrices and Homogeneous Poisson Structures on G/H
   3.1 The main theorem
   3.2 The Poisson structures $\pi_{r_X(\lambda)}$ on G/H
   3.3 Comparison with Karolinsky’s classification
4. r-Matrices and Homogeneous Poisson Structures on K/T
5. The Poisson Structures $\pi_{X,X_1,\lambda}$ on K/T
   5.1 Definition
   5.2 Connections via taking limits in λ
   5.3 The Lagrangian subalgebras of $\mathfrak{g}$ corresponding to $\pi_{X,X_1,\lambda}$
   5.4 Geometrical interpretation of $\pi_{X,X_1,\lambda}$
   5.5 $\pi_{X,X_1,\lambda}$ as the result of Poisson induction
   5.6 The symplectic leaves of $\pi_{X,X_1,\lambda}$
   5.7 The modular vector fields and the leaf-wise moment maps for the T-actions

1. Introduction

This paper is motivated by the work of Etingof and Varchenko [E-V] on classical dynamical r-matrices for the pair $(\mathfrak{g}, \mathfrak{h})$, where $\mathfrak{g}$ is a complex simple Lie algebra and $\mathfrak{h} \subset \mathfrak{g}$ a Cartan subalgebra. A classical dynamical r-matrix is, by definition, a meromorphic function $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ satisfying the so-called Classical Dynamical Yang–Baxter Equation (CDYBE):
\[ \mathrm{Alt}(dr) + [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0 \]
(see Sect. 2 for details). One such r-matrix has the form
\[ r(\lambda) = \frac{\varepsilon}{2}\Omega + \frac{\varepsilon}{2} \sum_{\alpha \in \Sigma} \coth\Big(\frac{\varepsilon}{2} \ll \alpha, \lambda \gg\Big) E_\alpha \otimes E_{-\alpha}, \]
where $\Omega \in (S^2\mathfrak{g})^{\mathfrak{g}}$ corresponds to the Killing form $\ll\, ,\, \gg$ of $\mathfrak{g}$, $\Sigma$ is the set of roots of $\mathfrak{g}$ with respect to $\mathfrak{h}$, the $E_\alpha$ and $E_{-\alpha}$'s are root vectors, and $\coth(x) = \frac{e^x + e^{-x}}{e^x - e^{-x}}$ is the hyperbolic cotangent function. Other r-matrices can be obtained by performing certain “gauge transformations” to the one above and by taking various limits of it. See Sect. 2.

We want to understand the geometrical meaning of these r-matrices. In [E-V], Etingof and Varchenko show that every classical dynamical r-matrix defines a Poisson groupoid over an open subset of $\mathfrak{h}^*$. In this paper, we give another geometrical interpretation of the r-matrices by connecting them with Poisson structures on the spaces G/H and K/T, where G is a complex Lie group with Lie algebra $\mathfrak{g}$, H ⊂ G its connected subgroup corresponding to $\mathfrak{h}$, K a compact real form of G, and T = K ∩ H. We then study some Poisson geometrical properties of these Poisson structures on K/T such as their symplectic leaves, their modular classes, and the moment maps for the T-action. We now explain this in more detail.

A special example of a classical dynamical r-matrix is one that is not “dynamical”, i.e., independent of λ. It is given by
\[ r_0 = \frac{\varepsilon}{2}\Omega + c + \frac{\varepsilon}{2} \sum_{\alpha \in \Sigma_+} E_\alpha \wedge E_{-\alpha} \]
for a choice of positive roots $\Sigma_+$ and an element $c \in \mathfrak{h} \wedge \mathfrak{h}$. It defines a (holomorphic) Poisson structure $\pi_G$ on G by
\[ \pi_G(g) = R_g r_0 - L_g r_0, \]
where $R_g$ and $L_g$ are respectively the right and left translations on G by g ∈ G, making $(G, \pi_G)$ into a Poisson Lie group. This Poisson structure is the semi-classical limit of the quantum group corresponding to G [D1, D2]. A Poisson structure on G/H is said to be $(G, \pi_G)$-homogeneous if the action map G × (G/H) → G/H is a Poisson map [D3].

The first result of this paper, Theorem 3.2, is on the construction of a surjective map from the set of all classical dynamical r-matrices for the pair $(\mathfrak{g}, \mathfrak{h})$ together with their


domains to the set of all (holomorphic) $(G, \pi_G)$-homogeneous Poisson structures on G/H. More precisely, for any classical dynamical r-matrix r and $\lambda \in \mathfrak{h}^*$ such that r(λ) is defined, we show that the bi-vector field $\tilde\pi_{r(\lambda)}$ on G defined by
\[ \tilde\pi_{r(\lambda)} = R_g r_0 - L_g r(\lambda) \]
projects to a holomorphic $(G, \pi_G)$-homogeneous Poisson structure on G/H under the projection G → G/H, and that all $(G, \pi_G)$-homogeneous Poisson structures on G/H arise this way. See also [L-X] for another interpretation of classical dynamical r-matrices.

Let K ⊂ G be a compact real form of G, and let T = K ∩ H be the maximal torus of K. Then K also carries a natural Poisson structure $\pi_K$ such that $(K, \pi_K)$ is a Poisson Lie group. Theorem 3.2 is then modified to Theorem 4.1, which states that classical dynamical r-matrices give rise to $(K, \pi_K)$-homogeneous Poisson structures on K/T and that all $(K, \pi_K)$-homogeneous Poisson structures on K/T arise this way.

We point out that a classification of all $(G, \pi_G)$ or $(K, \pi_K)$-homogeneous Poisson structures, not necessarily on G/H or on K/T, has already been obtained by E. Karolinsky [Ka2, Ka3]. We want to emphasize that what is brought out here is the connection of such Poisson spaces with the CDYBE.

Among all $(K, \pi_K)$-homogeneous Poisson structures on K/T, we single out a family denoted by $\pi_{X,X_1,\lambda}$, where X is any subset of the set $S(\Sigma_+)$ of all simple roots, $X_1 \subset X$, and $\lambda \in \mathfrak{h}$ satisfies some regularity condition (Theorem 5.1). This family exhausts all $(K, \pi_K)$-homogeneous Poisson structures on K/T up to K-equivariant isomorphisms. Moreover, these Poisson structures are related to each other by taking various limits of the parameter λ (see Sect. 5.2).

We study several Poisson geometrical properties of this family: The Lagrangian subalgebra of $\mathfrak{g}$ corresponding to each $\pi_{X,X_1,\lambda}$ is described in Sect. 5.3. In Sect. 5.4, we recall the construction in [E-L2] of a Poisson structure $\Pi$ on the variety $\mathcal{L}$ of all Lagrangian subalgebras in $\mathfrak{g}$ and the fact that each $(K/T, \pi_{X,X_1,\lambda})$ sits inside $(\mathcal{L}, \Pi)$ as a Poisson submanifold (possibly up to a covering map). The two special cases of $\pi_{X,X_1,\lambda}$ when $X = X_1 = \emptyset$ and when $X = S(\Sigma_+)$, $X_1 = \emptyset$ are considered in more detail here. In Sect. 5.5, we show that each $\pi_{X,X_1,\lambda}$ on K/T can be obtained via Poisson induction from a Poisson structure on a smaller manifold. In Sect. 5.6, we describe the symplectic leaves of $\pi_{X,X_1,\lambda}$ when $X_1$ is the empty set. We show that in this case $\pi_{X,X_1,\lambda}$ has a finite number of symplectic leaves. For an arbitrary $\pi_{X,X_1,\lambda}$, we show that it always has at least one open symplectic leaf. In Sect. 5.7, we show that with respect to a K-invariant volume form $\mu_0$ on K/T, all the Poisson structures $\pi_{X,X_1,\lambda}$ have the same modular vector field. In the case when $X_1$ is the empty set, we also describe the moment map for the T-action on each symplectic leaf of $\pi_{X,\emptyset,\lambda}$.

Some applications of results in this paper are given in [E-L1], where a Poisson geometrical interpretation of the Kostant harmonic forms on K/T [Ko] is given using the Bruhat Poisson structure $\pi_\infty := \pi_{X,X_1,\lambda}$ for $X = X_1 = \emptyset$. Set $\pi_\lambda = \pi_{X,X_1,\lambda}$ when $X = S(\Sigma_+)$ and $X_1 = \emptyset$. The fact that $\pi_\lambda \to \pi_\infty$ as $\lambda \to \infty$ is used in [E-L1] to show that the Kostant harmonic forms are limits of the usual Hodge harmonic forms.

Results in this paper also motivate our work in [E-L2], where, among other things, we show that there is a Poisson manifold $(\mathcal{L}_0, \Pi)$ such that every $(K/T, \pi_{X,X_1,\lambda})$ is a Poisson submanifold (possibly up to a covering map) of $(\mathcal{L}_0, \Pi)$. In fact, $\mathcal{L}_0$ is an irreducible component of the variety $\mathcal{L}$ of all Lagrangian subalgebras of $\mathfrak{g}$, and the


Poisson structure $\Pi$ is defined on all of $\mathcal{L}$. We show in [E-L2] that all the K-orbits in $\mathcal{L}$ with respect to the Adjoint action are $(K, \pi_K)$-homogeneous Poisson spaces, and that every $(K, \pi_K)$-homogeneous Poisson space maps to $(\mathcal{L}, \Pi)$ by a Poisson map. Thus, $(\mathcal{L}, \Pi)$ is a setting for studying all $(K, \pi_K)$-homogeneous Poisson spaces.

We point out that many more properties of the Poisson structures $\pi_{X,X_1,\lambda}$ can be studied, among these their Poisson cohomology, their Poisson harmonic forms [E-L1], and their symplectic groupoids. We hope to do this in the future.

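2. The Classical Dynamical Yang–Baxter Equation

Definition 2.1 ([F,E-V]). A meromorphic function $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ is called a classical (quasi-triangular) dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$ if it satisfies the following three conditions:
1. The zero weight condition: $\mathrm{ad}_x r(\lambda) = 0$ for all $x \in \mathfrak{h}$ and $\lambda \in \mathfrak{h}^*$ such that r(λ) is defined;
2. The generalized unitarity condition: $r^{12} + r^{21} = \varepsilon\Omega$ for some complex number ε and for all $\lambda \in \mathfrak{h}^*$ such that r(λ) is defined, where $\Omega \in (S^2\mathfrak{g})^{\mathfrak{g}}$ is the element corresponding to the Killing form on $\mathfrak{g}$;
3. The Classical Dynamical Yang–Baxter Equation (CDYBE):
\[ \mathrm{Alt}(dr) + [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0, \]
where, for $r = \sum_i u_i \otimes v_i$, we have $r^{12} = \sum_i u_i \otimes v_i \otimes 1$, $r^{13} = \sum_i u_i \otimes 1 \otimes v_i$, $r^{23} = \sum_i 1 \otimes u_i \otimes v_i$,
\[ \mathrm{CYB}(r) := [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = \sum_{i,j} [u_i, u_j] \otimes v_i \otimes v_j + u_i \otimes [v_i, u_j] \otimes v_j + u_i \otimes u_j \otimes [v_i, v_j], \]
and $\mathrm{Alt}(dr)(\lambda) \in \wedge^3\mathfrak{g}$ is the skew-symmetrization of $dr(\lambda) \in \mathfrak{h} \otimes \mathfrak{g} \otimes \mathfrak{g} \subset \mathfrak{g} \otimes \mathfrak{g} \otimes \mathfrak{g}$.
The complex number ε is called the coupling constant for r.

We now recall the classification of classical dynamical r-matrices for the pair $(\mathfrak{g}, \mathfrak{h})$ as given in [E-V]. Let $\Sigma$ be the set of all roots for $\mathfrak{g}$ with respect to $\mathfrak{h}$. For each $\alpha \in \Sigma$, choose root vectors $E_\alpha$ and $E_{-\alpha}$ such that $\ll E_\alpha, E_{-\alpha} \gg = 1$, where $\ll\, ,\, \gg$ is the Killing form on $\mathfrak{g}$. Let ε be a non-zero complex number, let $\mu \in \mathfrak{h}^*$, and let $C = \sum_{i,j} C_{ij}\, dx_i \wedge dx_j$ be a closed meromorphic 2-form on $\mathfrak{h}^*$. Let $\Sigma_+$ be a choice of positive roots, and let X be a subset of the set $S(\Sigma_+)$ of simple roots in $\Sigma_+$. For each $\alpha \in \Sigma$, define a (scalar-valued) meromorphic function $\phi_\alpha$ on $\mathfrak{h}^*$ according to the rule: If α is a linear combination of simple roots in X, then
\[ \phi_\alpha(\lambda) = \frac{\varepsilon}{2} \coth\Big(\frac{\varepsilon}{2} \ll \alpha, \lambda - \mu \gg\Big), \]
where $\coth(x) = \frac{e^x + e^{-x}}{e^x - e^{-x}}$ is the hyperbolic cotangent function; otherwise, set $\phi_\alpha(\lambda) = \frac{\varepsilon}{2}$ if α is positive and $\phi_\alpha(\lambda) = -\frac{\varepsilon}{2}$ if α is negative.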


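Theorem 2.2 (Etingof–Varchenko [E-V]). 1. With the above choices of μ, C, $\Sigma_+$, $X \subset S(\Sigma_+)$ and $\phi_\alpha$, the meromorphic function $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ defined by
\[ r(\lambda) = \frac{\varepsilon}{2}\Omega + \sum_{i,j} C_{ij}(\lambda)\, x_i \otimes x_j + \sum_{\alpha \in \Sigma} \phi_\alpha(\lambda)\, E_\alpha \otimes E_{-\alpha} \qquad (1) \]
is a classical dynamical r-matrix with non-zero coupling constant ε;
2. Every classical dynamical r-matrix with non-zero coupling constant has this form.

3. r-Matrices and Homogeneous Poisson Structures on G/H

3.1. The main theorem. Let $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ be any classical dynamical r-matrix as in Definition 2.1. Let
\[ A_r(\lambda) = r(\lambda) - \frac{\varepsilon}{2}\Omega \]
be the skew-symmetric part of r(λ). Using the fact that $\Omega$ is symmetric and ad-invariant, one easily shows that the terms $[\Omega^{ij}, A_r^{kl}]$ in the CDYBE for r all cancel. Moreover, it is well-known that
\[ [\Omega^{12}, \Omega^{13}] + [\Omega^{12}, \Omega^{23}] + [\Omega^{13}, \Omega^{23}] = [\Omega^{12}, \Omega^{13}] = [\Omega^{13}, \Omega^{23}] = -[\Omega^{12}, \Omega^{23}] \in (\wedge^3\mathfrak{g})^{\mathfrak{g}}. \]
Therefore, $A_r$ satisfies the following modified CDYBE (see also [E-V]):
\[ \mathrm{Alt}(dA_r) + [A_r^{12}, A_r^{13}] + [A_r^{12}, A_r^{23}] + [A_r^{13}, A_r^{23}] = \frac{\varepsilon^2}{4} [\Omega^{12}, \Omega^{23}] \in (\wedge^3\mathfrak{g})^{\mathfrak{g}}. \qquad (2) \]
Recall that there is the Schouten bracket $[\ ,\ ]$ on $\wedge\mathfrak{g}$. For $x_1, x_2, \ldots, x_k \in \mathfrak{g}$, we use the convention
\[ x_1 \wedge x_2 \wedge \cdots \wedge x_k = \sum_{\sigma \in S_k} \mathrm{sign}(\sigma)\, x_{\sigma(1)} \otimes x_{\sigma(2)} \otimes \cdots \otimes x_{\sigma(k)} \in \mathfrak{g}^{\otimes k}. \]
Then for $X \in \wedge^2\mathfrak{g}$, the element CYB(X) and the Schouten bracket [X, X] are related by [D2]
\[ \mathrm{CYB}(X) = [X^{12}, X^{13}] + [X^{12}, X^{23}] + [X^{13}, X^{23}] = \frac{1}{2}[X, X]. \]
Thus, we can rewrite Eq. (2) as
\[ [A_r(\lambda), A_r(\lambda)] = \frac{\varepsilon^2}{2} [\Omega^{12}, \Omega^{23}] - 2\,\mathrm{Alt}(dA_r)(\lambda). \qquad (3) \]
It is this form of the CDYBE that we will use to define Poisson structures on G/H. Recall [D2] that a classical quasi-triangular r-matrix with coupling constant ε is an element $r_0 \in \mathfrak{g} \otimes \mathfrak{g}$ such that
\[ r_0 + r_0^{21} = \varepsilon\Omega, \qquad \mathrm{CYB}(r_0) = 0. \]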
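The two defining conditions of a classical quasi-triangular r-matrix can be checked by hand in the smallest case. The following is only an illustrative sketch under stated assumptions, not part of the paper's argument: it takes $\mathfrak{g} = sl(2, \mathbb{C})$ in its 2-dimensional representation, uses the trace form in place of the Killing form (the two differ by a scalar, so this corresponds to a rescaled ε), and tests the element $r_0 = e \otimes f + c\, h \otimes h$. The helper names (`kron`, `place`, `cyb`, `symmetric_part_defect`) are hypothetical.

```python
from fractions import Fraction

# Fundamental representation of sl(2, C); an illustrative choice, not from the paper.
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
ID = [[1, 0], [0, 1]]

def kron(a, b):
    """Kronecker product of matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def is_zero(m):
    return all(x == 0 for row in m for x in row)

def place(u, v, i, j):
    """Put u in tensor slot i and v in slot j of a triple tensor product."""
    factors = [ID, ID, ID]
    factors[i], factors[j] = u, v
    return kron(kron(factors[0], factors[1]), factors[2])

def cyb(c):
    """CYB(r) = [r12, r13] + [r12, r23] + [r13, r23] for r = E (x) F + c H (x) H."""
    def r(i, j):
        return add(place(E, F, i, j), scale(c, place(H, H, i, j)))
    r12, r13, r23 = r(0, 1), r(0, 2), r(1, 2)
    return add(add(comm(r12, r13), comm(r12, r23)), comm(r13, r23))

def symmetric_part_defect(c):
    """r + r^{21} - Omega, with Omega the Casimir element of the trace form."""
    r = add(kron(E, F), scale(c, kron(H, H)))
    r21 = add(kron(F, E), scale(c, kron(H, H)))
    omega = add(add(kron(E, F), kron(F, E)), scale(Fraction(1, 2), kron(H, H)))
    return add(add(r, r21), scale(-1, omega))

quarter = Fraction(1, 4)
print(is_zero(cyb(quarter)), is_zero(symmetric_part_defect(quarter)))
print(is_zero(cyb(Fraction(1, 2))))
```

With $c = 1/4$ both conditions hold exactly, while any other value of c leaves a nonzero multiple of $e \otimes h \otimes f$ in CYB; exact rational arithmetic is used so that the check does not depend on a floating-point tolerance.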


Remark 3.1. If $r_0$ has the zero-weight property, i.e., if $r_0 \in (\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}}$, then by Theorem 2.2, it must be of the form
\[ r_0 = \frac{\varepsilon}{2}\Omega + \sum_{i,j} c_{ij}\, x_i \wedge x_j + \frac{\varepsilon}{2} \sum_{\alpha \in \Sigma_+} E_\alpha \wedge E_{-\alpha} \qquad (4) \]
for some choice $\Sigma_+$ of positive roots and $\sum_{i,j} c_{ij}\, x_i \wedge x_j \in \mathfrak{h} \wedge \mathfrak{h}$. But not every quasi-triangular $r_0$ has the zero-weight property. For example, for $\mathfrak{g} = sl(3, \mathbb{C})$, we can take $r_0 = \frac{\varepsilon}{2}\big(\Omega + \sum_{\alpha \in \Sigma_+} E_\alpha \wedge E_{-\alpha} + \frac{1}{6} E_{21} \wedge E_{23}\big)$, where $E_{ij}$ has 1 at the (i, j)-th entry and 0 everywhere else. See [B-D] for more examples.

Let $r_0$ be a classical quasi-triangular r-matrix with coupling constant ε (not necessarily of zero weight for $\mathfrak{h}$). Let $\Lambda = r_0 - \frac{\varepsilon}{2}\Omega \in \mathfrak{g} \wedge \mathfrak{g}$ be the skew-symmetric part of $r_0$. Then, as a special case of (3), $\Lambda$ satisfies the modified Classical Yang–Baxter Equation (CYBE)
\[ [\Lambda, \Lambda] = \frac{\varepsilon^2}{2} [\Omega^{12}, \Omega^{23}]. \qquad (5) \]
It is well known that the bi-vector field $\pi_G$ on the group G defined by
\[ \pi_G(g) = R_g\Lambda - L_g\Lambda, \qquad (6) \]
where $R_g$ and $L_g$ denote respectively the right and left translations from the identity element to g, defines a holomorphic Poisson structure on G, and that $(G, \pi_G)$ is a (holomorphic) Poisson Lie group [D2, STS1]. All Poisson structures in this section are assumed to be holomorphic.

Recall that an action of the Poisson Lie group $(G, \pi_G)$ on a Poisson manifold P is said to be Poisson if the action map $G \times P \to P : (g, p) \mapsto gp$ is a Poisson map, where G × P is equipped with the product Poisson structure. When the action of G on P is transitive, the Poisson structure on P is said to be $(G, \pi_G)$-homogeneous [D3]. The following theorem makes a connection between classical dynamical r-matrices and $(G, \pi_G)$-homogeneous Poisson structures on G/H.

Theorem 3.2. Let $r_0 = \frac{\varepsilon}{2}\Omega + \Lambda$ be any classical quasi-triangular r-matrix (not necessarily of zero-weight) with skew-symmetric part $\Lambda$. Let $r(\lambda) = \frac{\varepsilon}{2}\Omega + A_r(\lambda)$ be any classical dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$ as in Definition 2.1. For each value λ such that r(λ) is defined, define a bi-vector field $\tilde\pi_{r(\lambda)}$ on G by
\[ \tilde\pi_{r(\lambda)}(g) = R_g\Lambda - L_g A_r(\lambda), \qquad g \in G. \]
Let $\pi_{r(\lambda)} = p_*\tilde\pi_{r(\lambda)}$ be the projection of $\tilde\pi_{r(\lambda)}$ to G/H by the map $p : G \to G/H : g \mapsto gH$. Then
1) $\pi_{r(\lambda)}$ is well-defined and it defines a Poisson structure on G/H;
2) Equip G with the Poisson structure $\pi_G$ as defined by (6). Then $\pi_{r(\lambda)}$ is a $(G, \pi_G)$-homogeneous Poisson structure on G/H.
3) When $r_0$ has the zero-weight property, i.e., $r_0 \in (\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}}$, every $(G, \pi_G)$-homogeneous Poisson structure on G/H arises this way.


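The rest of this section is devoted to the proof of this theorem. We first prove the first two parts.

Proof of 1) and 2) in Theorem 3.2. It follows from $A_r(\lambda) \in (\wedge^2\mathfrak{g})^{\mathfrak{h}}$ that $\pi_{r(\lambda)}$ is well-defined. To show that $\pi_{r(\lambda)}$ defines a Poisson structure on G/H, we calculate the Schouten bracket $[\pi_{r(\lambda)}, \pi_{r(\lambda)}]$ of $\pi_{r(\lambda)}$ with itself. Set $\Lambda^R(g) = R_g\Lambda$ and $A_r(\lambda)^L(g) = L_g A_r(\lambda)$. Then $\tilde\pi_{r(\lambda)} = \Lambda^R - A_r(\lambda)^L$. Hence
\[ [\tilde\pi_{r(\lambda)}, \tilde\pi_{r(\lambda)}] = [\Lambda^R, \Lambda^R] - 2[\Lambda^R, A_r(\lambda)^L] + [A_r(\lambda)^L, A_r(\lambda)^L] = -[\Lambda, \Lambda]^R + [A_r(\lambda), A_r(\lambda)]^L = -2\,\mathrm{Alt}(dA_r(\lambda))^L \in (\mathfrak{h} \wedge \mathfrak{g} \wedge \mathfrak{g})^L, \]
where in the last step, we used Eqs. (3) and (5). This shows that $\tilde\pi_{r(\lambda)}$ is in general not a Poisson bi-vector field on G. However, for $\pi_{r(\lambda)} = p_*\tilde\pi_{r(\lambda)}$, we have
\[ [\pi_{r(\lambda)}, \pi_{r(\lambda)}] = p_*[\tilde\pi_{r(\lambda)}, \tilde\pi_{r(\lambda)}] = -2 p_*\mathrm{Alt}(dA_r(\lambda))^L = 0. \]
Therefore, $\pi_{r(\lambda)}$ is a Poisson structure on G/H. Now for any $g_1$ and $g_2 \in G$, we have
\[ \tilde\pi_{r(\lambda)}(g_1 g_2) = R_{g_1 g_2}\Lambda - L_{g_1 g_2} A_r(\lambda) = L_{g_1}(R_{g_2}\Lambda - L_{g_2} A_r(\lambda)) + R_{g_2}(R_{g_1}\Lambda - L_{g_1}\Lambda) = L_{g_1}\tilde\pi_{r(\lambda)}(g_2) + R_{g_2}\pi_G(g_1). \]
Projecting $\tilde\pi_{r(\lambda)}$ to $\pi_{r(\lambda)}$, this says that the action map of G on G/H by left translations is a Poisson map. Thus $\pi_{r(\lambda)}$ is a $(G, \pi_G)$-homogeneous Poisson structure on G/H. This finishes the proof of 1) and 2) in Theorem 3.2.

We now prove 3) of Theorem 3.2. Assume that $r_0 \in (\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}}$. Then by Theorem 2.2, it must be of the form (4) for some choice $\Sigma_+$ of positive roots and some $\sum_{i,j} u_{ij}\, x_i \wedge x_j \in \mathfrak{h} \wedge \mathfrak{h}$. Let e = eH be the base point of G/H. Recall [D3] that a $(G, \pi_G)$-homogeneous Poisson structure π on G/H is determined by its value π(e) at e in such a way that
\[ \pi(gH) = L_g\pi(e) + p_*\pi_G(g). \qquad (7) \]
Moreover, since $\pi_G(g) = 0$ for g ∈ H (this is why we need the zero weight condition on $r_0$), we see that π(e) is H-invariant, i.e., $\pi(e) \in (\wedge^2 T_e(G/H))^H \cong (\wedge^2(\mathfrak{g}/\mathfrak{h}))^H$. Let $\mathfrak{n}_+$ and $\mathfrak{n}_-$ be the nilpotent Lie subalgebras of $\mathfrak{g}$ spanned by the root vectors for the roots in $\Sigma_+$ and $-\Sigma_+$ respectively. Identify $\mathfrak{g}/\mathfrak{h} \cong \mathfrak{n}_- + \mathfrak{n}_+$.

Lemma 3.3. Write
\[ \pi(e) = \sum_{\alpha \in \Sigma_+} \Big(\frac{\varepsilon}{2} - \phi_\alpha\Big) E_\alpha \wedge E_{-\alpha} \in (\wedge^2(\mathfrak{g}/\mathfrak{h}))^H \qquad (8) \]
and set $\phi_{-\alpha} = -\phi_\alpha$. Then the bi-vector field π on G/H defined by (7) is Poisson if and only if the function $\phi : \Sigma \to \mathbb{C}$ satisfies
\[ \phi_\alpha\phi_\beta + \phi_\beta\phi_\gamma + \phi_\gamma\phi_\alpha = -\frac{\varepsilon^2}{4}, \quad \text{whenever } \alpha, \beta, \gamma \in \Sigma \text{ and } \alpha + \beta + \gamma = 0. \qquad (9) \]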


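Proof of Lemma 3.3. For any given π(e) in the form of (8), set
\[ A = \sum_{\alpha \in \Sigma_+} \phi_\alpha\, E_\alpha \wedge E_{-\alpha} \in \wedge^2\mathfrak{g} \]
and introduce the following bi-vector field $\hat\pi$ on G:
\[ \hat\pi(g) = R_g\Lambda - L_g A. \]
Then $\pi = p_*\hat\pi$, and hence $[\pi, \pi] = p_*[\hat\pi, \hat\pi]$. But as in the proof of 1) of Theorem 3.2, we have
\[ [\hat\pi, \hat\pi] = [\Lambda^R, \Lambda^R] - 2[\Lambda^R, A^L] + [A^L, A^L] = -[\Lambda, \Lambda]^R + [A, A]^L. \]
Since $\Lambda$ satisfies the modified CYBE (5), by writing
\[ B = [A, A] - \frac{\varepsilon^2}{2} [\Omega^{12}, \Omega^{23}] \in \wedge^3\mathfrak{g}, \]
we see that $[\hat\pi, \hat\pi] = B^L$, the left invariant 3-vector field on G with value B at e. Thus [π, π] = 0 if and only if $B \in \mathfrak{h} \wedge \mathfrak{g} \wedge \mathfrak{g}$, or, if and only if
\[ [A, A] = \frac{\varepsilon^2}{2} [\Omega^{12}, \Omega^{23}] \mod \mathfrak{h} \wedge \mathfrak{g} \wedge \mathfrak{g}. \]
A direct calculation shows that
\[ [A, A] = \sum_{\alpha \in \Sigma} \phi_\alpha^2\, h_\alpha \wedge E_\alpha \wedge E_{-\alpha} - 2 \sum_{[(\alpha,\beta,\gamma)] \in \tilde\Sigma^3} (\phi_\alpha\phi_\beta + \phi_\beta\phi_\gamma + \phi_\gamma\phi_\alpha) N_{\alpha,\beta}\, E_\alpha \wedge E_\beta \wedge E_\gamma \]
and
\[ [\Omega^{12}, \Omega^{23}] = \frac{1}{2} \sum_{\alpha \in \Sigma} h_\alpha \wedge E_\alpha \wedge E_{-\alpha} + \sum_{[(\alpha,\beta,\gamma)] \in \tilde\Sigma^3} N_{\alpha,\beta}\, E_\alpha \wedge E_\beta \wedge E_\gamma, \]
where $h_\alpha = [E_\alpha, E_{-\alpha}] \in \mathfrak{h}$, $[E_\alpha, E_\beta] = N_{\alpha,\beta} E_{\alpha+\beta}$ when $\alpha, \beta \in \Sigma$ and $\alpha + \beta \in \Sigma$, and the summation over $[(\alpha, \beta, \gamma)] \in \tilde\Sigma^3$ means that the summation index runs over all triples $(\alpha, \beta, \gamma) \in \Sigma^3$ such that α + β + γ = 0 but two such triples are considered the same if they only differ by a reordering of the three roots. It then follows immediately that π is a Poisson structure on G/H if and only if Condition (9) is satisfied. This finishes the proof of Lemma 3.3.

It now remains to classify all odd functions φ on $\Sigma$ such that Condition (9) is satisfied. Note that the Weyl group W for $(\mathfrak{g}, \mathfrak{h})$ acts on the set of such functions by $(w \cdot \phi)_\alpha := \phi_{w\alpha}$. We say that two such functions φ and ψ are W-related if ψ = w · φ for some w ∈ W.

Notation 3.4. Let $S(\Sigma_+)$ be the set of simple roots in $\Sigma_+$. For a subset X of $S(\Sigma_+)$, we will use [X] to denote the set of roots in $\Sigma$ that are in the linear span of X. Also set
\[ \mathfrak{h}_X = \mathrm{span}_{\mathbb{C}}\{h_\gamma = [E_\gamma, E_{-\gamma}] : \gamma \in X\}. \]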


Lemma 3.5. For any $X \subset S(\Sigma_+)$ and $h \in \mathfrak{h}_X$ such that $\alpha(h) \notin \pi i\mathbb{Z}$ for any α ∈ [X], where π = 3.14159... (we hope that there is no confusion between this notation of π = 3.14159... and π as a Poisson structure), and $\mathbb{Z}$ is the set of integers, define $\phi : \Sigma \to \mathbb{C}$ by
\[ \phi_\alpha = \begin{cases} \frac{\varepsilon}{2}\coth\alpha(h), & \alpha \in [X], \\ \frac{\varepsilon}{2}, & \alpha \in \Sigma_+ \setminus [X], \\ -\frac{\varepsilon}{2}, & \alpha \in -(\Sigma_+ \setminus [X]). \end{cases} \]
Then
(1) φ satisfies Condition (9);
(2) Any odd function $\phi : \Sigma \to \mathbb{C}$ satisfying Condition (9) is W-related to one obtained this way.

Proof. (1) can be checked directly. We only show (2). Suppose that $\phi : \Sigma \to \mathbb{C}$ satisfies Condition (9). Set $Y = \{\alpha \in \Sigma : \phi_\alpha = \frac{\varepsilon}{2}\}$. Then because of (9), Y has two properties:
(A) If α, β ∈ Y and $\alpha + \beta \in \Sigma$, then α + β ∈ Y;
(B) If α ∈ Y, then −α ∉ Y.
It follows [E-V] that there exists a choice of positive roots $\Sigma_+'$ such that $Y \subset \Sigma_+'$. Since there exists w ∈ W such that $w\Sigma_+ = \Sigma_+'$, by considering w · φ instead of φ, we can assume that $\Sigma_+' = \Sigma_+$. Set $X = S(\Sigma_+) \cap (\Sigma_+ \setminus Y)$. Since Condition (9) implies that Y has the additional property:
(C) If α ∈ Y, $\beta \in \Sigma \setminus (-Y)$ are such that $\alpha + \beta \in \Sigma$, then α + β ∈ Y,
we claim that $\Sigma_+ = ([X] \cap \Sigma_+) \cup Y$ is a disjoint union.

Indeed, suppose that $\alpha \in [X] \cap \Sigma_+$. We first use induction on the height ht(α) of α with respect to $S(\Sigma_+)$ to show that α ∉ Y. If ht(α) = 1, then α is simple, so α ∉ Y by definition. Suppose that ht(α) = k. We can [Se] write α as $\alpha = \alpha_1 + \cdots + \alpha_k$ such that each $\alpha_j$ is in X and that each $\alpha_1 + \cdots + \alpha_j$ is a root, for j = 1, ..., k. Set $\alpha' = \alpha_1 + \cdots + \alpha_{k-1}$. By induction assumption, α′ ∉ Y. If α ∈ Y, then we know by (C) that $\alpha_k = \alpha - \alpha' \in Y$, which is a contradiction. Thus α ∉ Y. This shows that $([X] \cap \Sigma_+) \cap Y = \emptyset$.

Next, suppose that $\alpha \in \Sigma_+ \setminus Y$. We use induction on ht(α) again to show that α ∈ [X]. If ht(α) = 1, then α ∈ X ⊂ [X] by the definition of X. Suppose that ht(α) = k. Write α as $\alpha = \alpha' + \alpha_k$, where $\alpha' \in \Sigma_+$ and $\alpha_k$ is a simple root. If $\alpha_k \in Y$, then by (C), we have $-\alpha' = \alpha_k - \alpha \in Y$, which is absurd. Thus $\alpha_k \notin Y$, so $\alpha_k \in X$. If α′ ∈ Y, then again by (C), we have $-\alpha_k = \alpha' - \alpha \in Y$, which is also absurd, so α′ ∉ Y. By induction assumption, α′ ∈ [X]. Thus α ∈ [X]. Hence we have shown that $\Sigma_+ = ([X] \cap \Sigma_+) \cup Y$ is a disjoint union.

For γ ∈ X, since $\phi_\gamma \ne \pm\frac{\varepsilon}{2}$, there exists $\lambda_\gamma \in \mathbb{C}$, $\lambda_\gamma \notin \pi i\mathbb{Z}$, such that $\phi_\gamma = \frac{\varepsilon}{2}\coth\lambda_\gamma$. Choose $h \in \mathfrak{h}_X$ such that $\gamma(h) = \lambda_\gamma$ for every γ ∈ X. We now show that $\alpha(h) \notin \pi i\mathbb{Z}$ and that $\phi_\alpha = \frac{\varepsilon}{2}\coth\alpha(h)$ for all $\alpha \in [X] \cap \Sigma_+$ by using induction on the height ht(α). This is true when ht(α) = 1. Suppose that ht(α) = k. As before, write $\alpha = \alpha' + \alpha_k$, where $\alpha' \in [X] \cap \Sigma_+$, ht(α′) = k − 1, and $\alpha_k \in X$. Then by induction assumption, $\alpha'(h) \notin \pi i\mathbb{Z}$ and $\phi_{\alpha'} = \frac{\varepsilon}{2}\coth\alpha'(h)$. By Condition (9),
\[ -\phi_\alpha(\phi_{\alpha'} + \phi_{\alpha_k}) = -\frac{\varepsilon^2}{4} - \phi_{\alpha'}\phi_{\alpha_k}. \]


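If $\phi_{\alpha'} + \phi_{\alpha_k} = 0$, we would have $\phi_{\alpha'}\phi_{\alpha_k} = -\frac{\varepsilon^2}{4}$ and thus $\phi_{\alpha'} = \pm\frac{\varepsilon}{2}$ and $\phi_{\alpha_k} = \mp\frac{\varepsilon}{2}$. This is not possible since $([X] \cap \Sigma_+) \cap Y = \emptyset$. Thus $\phi_{\alpha'} + \phi_{\alpha_k} \ne 0$, so $\alpha(h) = \alpha'(h) + \alpha_k(h) \notin \pi i\mathbb{Z}$, and
\[ \phi_\alpha = \frac{\frac{\varepsilon^2}{4} + \phi_{\alpha'}\phi_{\alpha_k}}{\phi_{\alpha'} + \phi_{\alpha_k}} = \frac{\varepsilon}{2}\coth\alpha(h). \]
□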
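Part (1) of Lemma 3.5 and the last step of the proof both reduce to the addition formula for coth. The following numerical sketch is an illustration only; the sample values of ε, α(h) and β(h) are arbitrary assumptions, and the function names are hypothetical. It checks that $\phi = \frac{\varepsilon}{2}\coth$ satisfies Condition (9) on a triple of roots in [X] summing to zero, and that the quotient formula above reproduces $\frac{\varepsilon}{2}\coth$ of the sum.

```python
import cmath

def coth(z):
    """Hyperbolic cotangent for complex z (undefined on pi*i*Z)."""
    return cmath.cosh(z) / cmath.sinh(z)

def phi(eps, a):
    """phi_alpha = (eps/2) coth(alpha(h)), where a stands for the value alpha(h)."""
    return (eps / 2) * coth(a)

def condition9_residual(eps, a, b):
    """Left side minus right side of Condition (9) for gamma = -(alpha + beta)."""
    c = -(a + b)
    pa, pb, pc = phi(eps, a), phi(eps, b), phi(eps, c)
    return pa * pb + pb * pc + pc * pa + eps ** 2 / 4

def addition_residual(eps, a, b):
    """(eps^2/4 + phi_a phi_b) / (phi_a + phi_b) minus phi at a + b."""
    pa, pb = phi(eps, a), phi(eps, b)
    return (eps ** 2 / 4 + pa * pb) / (pa + pb) - phi(eps, a + b)

# Arbitrary sample values; none of a, b, a + b lies in pi*i*Z.
eps = 0.7 + 0.3j
a, b = 0.4 + 0.2j, -1.1 + 0.5j
print(abs(condition9_residual(eps, a, b)), abs(addition_residual(eps, a, b)))
```

Both residuals should vanish up to floating-point rounding. The mixed case (one root in [X], one in Y, one in −Y) needs no numerics, since then the left side of (9) is $\frac{\varepsilon}{2}\phi_\alpha - \frac{\varepsilon^2}{4} - \frac{\varepsilon}{2}\phi_\alpha = -\frac{\varepsilon^2}{4}$ for any value of $\phi_\alpha$.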

We now continue with the proof of 3) of Theorem 3.2. Let π be a $(G, \pi_G)$-homogeneous Poisson structure on G/H. Then by Lemmas 3.3 and 3.5, there exist a choice $\Sigma_+'$ of positive roots, a subset X′ of the set of simple roots in $\Sigma_+'$, and an element $\lambda_0 \in \mathfrak{h}^*$ such that $\pi = \pi_{r_{X'}(\lambda_0)}$, where
\[ r_{X'}(\lambda) = \frac{\varepsilon}{2}\Omega + \frac{\varepsilon}{2} \sum_{\alpha \in [X'] \cap \Sigma_+'} \coth\Big(\frac{\varepsilon}{2} \ll \alpha, \lambda \gg\Big) E_\alpha \wedge E_{-\alpha} + \frac{\varepsilon}{2} \sum_{\alpha \in \Sigma_+' \setminus [X']} E_\alpha \wedge E_{-\alpha} \qquad (10) \]
is a classical dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$. This proves part 3) of Theorem 3.2. □

Remark 3.6. For any Lie subalgebra $\mathfrak{h}'$ of $\mathfrak{g}$, one can define classical dynamical r-matrices for the pair $(\mathfrak{g}, \mathfrak{h}')$. It is clear from the proof that 1) and 2) in Theorem 3.2 still hold when H is replaced by any closed subgroup H′ of G and when r is a classical dynamical r-matrix for $(\mathfrak{g}, \mathfrak{h}')$ with $\mathfrak{h}'$ being the Lie algebra of H′. For 3), assume that $r_0 \in (\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}'}$ and that H′ is a subgroup of H. Let π be a $(G, \pi_G)$-homogeneous Poisson structure on G/H′. Consider again
\[ \pi(eH') \in (\wedge^2 T_{eH'}(G/H'))^{H'} \cong (\wedge^2(\mathfrak{g}/\mathfrak{h}'))^{H'}. \]
By picking an H′-invariant complement $\mathfrak{h}_0$ of $\mathfrak{h}'$ in $\mathfrak{g}$, and by identifying $\mathfrak{g}/\mathfrak{h}' \cong \mathfrak{h}_0$, we can consider
\[ A = \pi(eH') \in (\wedge^2\mathfrak{h}_0)^{H'} \subset \wedge^2\mathfrak{g}. \]
The discussions in the proof of 3) of Theorem 3.2 show that
\[ \mathrm{CYB}\Big(\frac{\varepsilon}{2}\Omega + A\Big) \in \mathfrak{h}' \wedge \mathfrak{g} \wedge \mathfrak{g}. \]
In [Sc], Schiffmann shows that if $\mathfrak{h}'$ contains a regular semi-simple element, then under certain conditions on A, there is a classical dynamical r-matrix r for $(\mathfrak{g}, \mathfrak{h}')$ such that A is the skew-symmetric part of r(0). Thus π comes from r in our sense. We thank the referee for pointing this out.


3.2. The Poisson structures $\pi_{r_X(\lambda)}$ on G/H. In this section, we consider in more detail the case when the Poisson structure on G is defined by a classical quasi-triangular r-matrix $r_0$ with the zero weight property. In other words, we fix a choice $\Sigma_+$ of positive roots, and consider $r_0$ of the form
\[ r_0 = \frac{\varepsilon}{2}\Omega + \sum_{i,j} c_{ij}\, x_i \wedge x_j + \frac{\varepsilon}{2} \sum_{\alpha \in \Sigma_+} E_\alpha \wedge E_{-\alpha}, \qquad (11) \]
where $\sum_{i,j} c_{ij}\, x_i \wedge x_j \in \mathfrak{h} \wedge \mathfrak{h}$. When $\sum_{i,j} c_{ij}\, x_i \wedge x_j = 0$, the corresponding $r_0$ is often called the standard r-matrix. The corresponding Poisson structure $\pi_G$ on G is the semi-classical limit of the quantum group corresponding to G [D2]. For $X \subset S(\Sigma_+)$, set
\[ r_X(\lambda) = \frac{\varepsilon}{2}\Omega + \frac{\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} \coth\Big(\frac{\varepsilon}{2} \ll \alpha, \lambda \gg\Big) E_\alpha \wedge E_{-\alpha} + \frac{\varepsilon}{2} \sum_{\alpha \in \Sigma_+ \setminus [X]} E_\alpha \wedge E_{-\alpha}. \qquad (12) \]
Clearly, the domain $D(r_X)$ of $r_X$ consists of those $\lambda \in \mathfrak{h}^*$ such that $\ll \lambda, \alpha \gg \notin \frac{2\pi i}{\varepsilon}\mathbb{Z}$ for all α ∈ [X]. For each such λ, we have the $(G, \pi_G)$-homogeneous Poisson structure $\pi_{r_X(\lambda)}$ on G/H: let $p_*\pi_G$ be the projection to G/H of $\pi_G$ by $p : G \to G/H : g \mapsto gH$. Then
\[ \pi_{r_X(\lambda)} = p_*\pi_G + \Big( \sum_{\alpha \in [X] \cap \Sigma_+} \frac{\varepsilon}{1 - e^{\varepsilon \ll \alpha, \lambda \gg}}\, E_\alpha \wedge E_{-\alpha} \Big)^L, \]
where the second term on the right hand side is the G-invariant bi-vector field on G/H whose value at e = eH is the expression given in the parenthesis.

Theorem 3.7. With the Poisson structure $\pi_G$ on G defined by $r_0$ in (11), every holomorphic $(G, \pi_G)$-homogeneous Poisson structure on G/H is isomorphic, via a G-equivariant diffeomorphism, to a $\pi_{r_X(\lambda)}$ for some subset $X \subset S(\Sigma_+)$ and $\lambda \in D(r_X)$, where $r_X$ is given in (12).

Proof. Let π be a $(G, \pi_G)$-homogeneous Poisson structure on G/H. By Theorem 3.2, we know that there exists a choice $\Sigma_+'$ of positive roots and a subset X′ of the set of simple roots in $\Sigma_+'$ such that $\pi = \pi_{r_{X'}(\lambda_0)}$ for some $\lambda_0 \in \mathfrak{h}^*$, where $r_{X'}$ is the classical dynamical r-matrix given by (10). Let $\Lambda = r_0 - \frac{\varepsilon}{2}\Omega$ and let $A_{X'}(\lambda_0)$ be the skew-symmetric part of $r_{X'}(\lambda_0)$. Then recall from Sect. 3 that $\pi = p_*\hat\pi$, where $\hat\pi$ is the bi-vector field on G given by
\[ \hat\pi(g) = R_g\Lambda - L_g A_{X'}(\lambda_0), \qquad g \in G. \]
Pick w ∈ W such that $w\Sigma_+' = \Sigma_+$. Set X = wX′. Let $\dot w$ be a representative of w in G. We will use $R_{\dot w^{-1}}$ to denote the right translation on G by $\dot w^{-1}$ as well as the induced diffeomorphism on G/H. Then for any g ∈ G,
\[ R_{\dot w^{-1}}\hat\pi(g) = R_{\dot w^{-1}} R_g\Lambda - L_g L_{\dot w^{-1}} \mathrm{Ad}_{\dot w} A_{X'}(\lambda_0) = R_{g\dot w^{-1}}\Lambda - L_{g\dot w^{-1}} A_X(w\lambda_0), \]
where $A_X$ is the skew-symmetric part of the r-matrix $r_X$ given by (12). It follows from the definition of $\pi_{r_X(w\lambda_0)}$ that $\pi = R_{\dot w}\pi_{r_X(w\lambda_0)}$. The map $R_{\dot w} : G/H \to G/H$ is G-equivariant. □
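As a consistency check (a worked computation added for the reader, using only the definition of coth), the coefficient $\varepsilon/(1 - e^{\varepsilon \ll \alpha, \lambda \gg})$ in the formula for $\pi_{r_X(\lambda)}$ stated before Theorem 3.7 is the coefficient $\frac{\varepsilon}{2} - \phi_\alpha$ of (8) evaluated on the function φ read off from (12). Writing $x = \ll \alpha, \lambda \gg$ as a shorthand:

```latex
\begin{align*}
\frac{\varepsilon}{2} - \phi_\alpha
  &= \frac{\varepsilon}{2}\Bigl(1 - \coth\frac{\varepsilon x}{2}\Bigr)
   = \frac{\varepsilon}{2}\Bigl(1 - \frac{e^{\varepsilon x} + 1}{e^{\varepsilon x} - 1}\Bigr)
   = \frac{\varepsilon}{2}\cdot\frac{-2}{e^{\varepsilon x} - 1}
   = \frac{\varepsilon}{1 - e^{\varepsilon x}}
  && \text{for } \alpha \in [X] \cap \Sigma_+, \\
\frac{\varepsilon}{2} - \phi_\alpha
  &= \frac{\varepsilon}{2} - \frac{\varepsilon}{2} = 0
  && \text{for } \alpha \in \Sigma_+ \setminus [X].
\end{align*}
```

So only the roots in $[X] \cap \Sigma_+$ contribute to the G-invariant term, in agreement with the displayed formula.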


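3.3. Comparison with Karolinsky’s classification. When $\sum_{i,j} c_{ij}\, x_i \wedge x_j = 0$ in the definition of $r_0$, all $(G, \pi_G)$-homogeneous Poisson structures on G/H have been classified by Karolinsky [Ka3] by using Drinfeld’s theorem on Poisson homogeneous spaces. We now look at the Poisson structures $\pi_{r_X(\lambda)}$ on G/H in terms of Karolinsky’s classification.

Recall that the double Lie algebra associated to the Poisson Lie group $(G, \pi_G)$ can be identified with the direct sum Lie algebra $\mathfrak{d} = \mathfrak{g} + \mathfrak{g}$ equipped with the ad-invariant non-degenerate scalar product given by
\[ \langle (x_1, x_2), (y_1, y_2) \rangle = \frac{1}{\varepsilon}\big( \ll x_2, y_2 \gg - \ll x_1, y_1 \gg \big). \]
The Lie algebra $\mathfrak{g}$ is identified with the diagonal of $\mathfrak{d}$, and the Lie algebra $\mathfrak{g}^*$ is identified with the subspace
\[ \mathfrak{g}^* \cong \{(x_-, x_+) : x_\pm \in \mathfrak{b}_\pm,\ (x_-)_{\mathfrak{h}} + (x_+)_{\mathfrak{h}} = 0\}. \]
Here, $\mathfrak{b}_\pm = \mathfrak{h} + \mathfrak{n}_\pm$ and $(x_\pm)_{\mathfrak{h}} \in \mathfrak{h}$ is the $\mathfrak{h}$-component of $x_\pm$. A theorem of Drinfeld [D3] says that $(G, \pi_G)$-homogeneous Poisson structures on G/H correspond to Lagrangian (with respect to the scalar product $\langle\, ,\, \rangle$) subalgebras $\mathfrak{l}$ of the double $\mathfrak{d} \cong \mathfrak{g} + \mathfrak{g}$ such that $\mathfrak{l} \cap \mathfrak{g} = \mathfrak{h}$.

Theorem 3.8 (Karolinsky [Ka3]). Lagrangian subalgebras $\mathfrak{l}$ of $\mathfrak{g} + \mathfrak{g}$ such that $\mathfrak{l} \cap \mathfrak{g} = \mathfrak{h}$ are in 1 − 1 correspondence with triples $(\mathfrak{p}, \mathfrak{p}', \eta)$, where $\mathfrak{p}$ and $\mathfrak{p}'$ are parabolic subalgebras of $\mathfrak{g}$ such that $\mathfrak{q} = \mathfrak{p} \cap \mathfrak{p}'$ is the Levi subalgebra, $\mathfrak{h} \subset \mathfrak{q}$, and η is an interior orthogonal automorphism of $\mathfrak{q}$ with $\mathfrak{q}^\eta = \mathfrak{h}$. If $(\mathfrak{p}, \mathfrak{p}', \eta)$ is such a triple, the corresponding subalgebra $\mathfrak{l}$ of $\mathfrak{g} + \mathfrak{g}$ is $\mathfrak{l} = \{(x', x) \in \mathfrak{p}' \times \mathfrak{p} : \eta(x'_{\mathfrak{q}}) = x_{\mathfrak{q}}\}$, where $x_{\mathfrak{q}} \in \mathfrak{q}$ (resp. $x'_{\mathfrak{q}} \in \mathfrak{q}$) is the projection of x (resp. x′) to $\mathfrak{q}$ with respect to the Levi decomposition of $\mathfrak{p}$ (resp. $\mathfrak{p}'$).

For a $(G, \pi_G)$-homogeneous Poisson structure π on G/H, the Lagrangian subalgebra $\mathfrak{l}_{\pi(e)}$ of $\mathfrak{g} + \mathfrak{g}$ is by definition [D3]
\[ \mathfrak{l}_{\pi(e)} = \{x + \xi : x \in \mathfrak{g},\ \xi \in \mathfrak{g}^*,\ \xi|_{\mathfrak{h}} = 0, \text{ and } \xi \,\lrcorner\, \pi(e) = x + \mathfrak{h}\}. \]
For π(e) of the form $\pi(e) = \sum_{\alpha \in \Sigma_+} (\frac{\varepsilon}{2} - \phi_\alpha) E_\alpha \wedge E_{-\alpha}$, it is an easy calculation to see that
\[ \mathfrak{l}_{\pi(e)} = \mathfrak{h} + \mathrm{span}_{\mathbb{C}}\{\xi_\alpha : \alpha \in \Sigma\}, \]
where for $\alpha \in \Sigma$,
\[ \xi_\alpha = \Big( \big(\phi_\alpha - \frac{\varepsilon}{2}\big) E_\alpha,\ \big(\phi_\alpha + \frac{\varepsilon}{2}\big) E_\alpha \Big) \in \mathfrak{g} + \mathfrak{g}. \]
Thus, for the Poisson structure $\pi_{r_X(\lambda)}$ on G/H, we have
\[ \xi_\alpha = \begin{cases} (-\varepsilon E_\alpha,\ 0) & \text{if } \alpha \in -Y, \\ \dfrac{\varepsilon}{e^{\varepsilon \ll \alpha, \lambda \gg} - 1} \big(E_\alpha,\ e^{\varepsilon \ll \alpha, \lambda \gg} E_\alpha\big) & \text{if } \alpha \in [X], \\ (0,\ \varepsilon E_\alpha) & \text{if } \alpha \in Y, \end{cases} \]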


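where $Y = \Sigma_+ \setminus [X]$. Let
\[ \mathfrak{p}_X = \mathfrak{h} + \mathrm{span}_{\mathbb{C}}\{E_\alpha : \alpha \in [X] \cup Y\} \]
be the parabolic subalgebra of $\mathfrak{g}$ defined by X, and let
\[ \mathfrak{p}_X' = \mathfrak{h} + \mathrm{span}_{\mathbb{C}}\{E_\alpha : \alpha \in [X] \cup (-Y)\} \]
be its opposite parabolic subalgebra. Set
\[ \mathfrak{m}_X = \mathfrak{h} + \mathrm{span}_{\mathbb{C}}\{E_\alpha : \alpha \in [X]\} \qquad (13) \]
so that $\mathfrak{m}_X = \mathfrak{p}_X \cap \mathfrak{p}_X'$. Let η be the interior automorphism of $\mathfrak{m}_X$ given by $\mathrm{Ad}_{e^{\varepsilon h_\lambda}}$, where $h_\lambda \in \mathfrak{h}$ corresponds to $\lambda \in \mathfrak{h}^*$ under the Killing form. Then the triple $(\mathfrak{p}_X, \mathfrak{p}_X', \eta)$ is the one corresponding to the Poisson structure $\pi_{r_X(\lambda)}$ in the Karolinsky classification.

4. r-Matrices and Homogeneous Poisson Structures on K/T

We pick a compact real form $\mathfrak{k}$ of $\mathfrak{g}$ as follows: For each $\alpha \in \Sigma_+$, set
\[ X_\alpha = E_\alpha - E_{-\alpha}, \qquad Y_\alpha = i(E_\alpha + E_{-\alpha}) \]
and $h_\alpha = [E_\alpha, E_{-\alpha}]$. Then the real subspace
\[ \mathfrak{k} = \mathrm{span}_{\mathbb{R}}\{i h_\alpha, X_\alpha, Y_\alpha : \alpha \in \Sigma_+\} \]
is a compact real form of $\mathfrak{g}$. Set $\mathfrak{t} = \mathrm{span}_{\mathbb{R}}\{i h_\alpha : \alpha \in \Sigma\} \subset \mathfrak{k}$. Let K and T ⊂ K be respectively the connected compact subgroups of G with Lie algebras $\mathfrak{k}$ and $\mathfrak{t}$. It is well-known [So] that every Poisson structure $\pi_K$ on K such that $(K, \pi_K)$ is a Poisson Lie group is of the form
\[ \pi_K(k) = R_k\Lambda - L_k\Lambda, \qquad (14) \]
where
\[ \Lambda = u - \frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+} \frac{X_\alpha \wedge Y_\alpha}{2} \in \mathfrak{k} \wedge \mathfrak{k} \qquad (15) \]
for some $u \in \mathfrak{t} \wedge \mathfrak{t}$, an imaginary complex number ε and a choice $\Sigma_+$ of positive roots. In this section, we will show how $(K, \pi_K)$-homogeneous Poisson structures on K/T are related to classical dynamical r-matrices. We remark again that one classification of all $(K, \pi_K)$-homogeneous Poisson spaces (by the corresponding Lagrangian Lie subalgebras) has been given by Karolinsky [Ka2].

If we regard $\wedge\mathfrak{g}$ as a real vector space, then
\[ \wedge\mathfrak{k} \longrightarrow \wedge\mathfrak{g} : \wedge^l\mathfrak{k} \ni x_1 \wedge \cdots \wedge x_l \longmapsto x_1 \wedge \cdots \wedge x_l \in \wedge^l\mathfrak{g} \]
is an embedding of $\wedge\mathfrak{k}$ into $\wedge\mathfrak{g}$ as a real subspace. This embedding also preserves the Schouten bracket. Thus, for $A \in \wedge^2\mathfrak{k}$ of the form
\[ A = \sum_{\alpha \in \Sigma_+} a_\alpha \frac{X_\alpha \wedge Y_\alpha}{2}, \qquad a_\alpha \in \mathbb{R} \text{ for } \alpha \in \Sigma_+, \]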


J.-H. Lu

we can calculate [A, A] ∈ ∧³k by first writing A = Σ_{α∈Σ+} i a_α E_α ∧ E_{−α} ∈ ∧²g and then calculating [A, A] inside ∧g. Indeed, as in the proof of Lemma 3.3, in ∧³g we have

\[ [A, A] = \frac{1}{2} \sum_{\alpha \in \Sigma_+} a_\alpha^2\, (i h_\alpha \wedge X_\alpha \wedge Y_\alpha) + 2 \sum_{[(\alpha,\beta,\gamma)] \in \tilde\Sigma^3} (a_\alpha a_\beta + a_\beta a_\gamma + a_\gamma a_\alpha)\, N_{\alpha,\beta}\, E_\alpha \wedge E_\beta \wedge E_\gamma. \tag{16} \]

Clearly, ih_α ∧ E_α ∧ E_{−α} ∈ ∧³k for each α ∈ Σ+. Suppose that (α, β, γ) ∈ Σ³ are such that α + β + γ = 0. Without loss of generality, we can assume that α, β ∈ Σ+ and γ ∈ −Σ+. Then

\[ N_{\alpha,\beta} E_\alpha \wedge E_\beta \wedge E_\gamma + N_{-\alpha,-\beta} E_{-\alpha} \wedge E_{-\beta} \wedge E_{-\gamma} = N_{\alpha,\beta}\,(E_\alpha \wedge E_\beta \wedge E_\gamma - E_{-\alpha} \wedge E_{-\beta} \wedge E_{-\gamma}). \]

This element is in ∧³k because it is fixed by θ ∈ End_R(∧³g) defined by

\[ \theta(x_1 \wedge x_2 \wedge x_3) = \theta(x_1) \wedge \theta(x_2) \wedge \theta(x_3), \qquad x_1, x_2, x_3 \in g, \]

where θ ∈ End_R(g) is the complex conjugation of g defined by k. The right hand side of (16) is thus the Schouten bracket of A with itself inside ∧k.

Now suppose that r is a classical dynamical r-matrix for the pair (g, h) as given in Theorem 2.2. Suppose that λ ∈ h∗ is in the domain of r such that the skew-symmetric part A_r(λ) = r(λ) − (ε/2)Ω of r(λ) lies in ∧²k. Then

\[ [A_r(\lambda), A_r(\lambda)] - [\Lambda, \Lambda] \in (\wedge^3 k) \cap (h \wedge k \wedge k) = t \wedge k \wedge k. \]

By abuse of notation, we still use π̃_{r(λ)} (already used in Theorem 3.2) to denote the bi-vector field on K given by

\[ \tilde\pi_{r(\lambda)}(k) = R_k \Lambda - L_k A_r(\lambda), \qquad k \in K, \]

where R_k and L_k are respectively the right and left translations on K by k. We use π_{r(λ)} to denote the projection of π̃_{r(λ)} to K/T by the map p : K → K/T : k ↦ kT.

Theorem 4.1. Let r be any classical dynamical r-matrix for the pair (g, h) given in Theorem 2.2. Suppose that λ ∈ h∗ is in the domain of r such that A_r(λ) = r(λ) − (ε/2)Ω is in ∧²k. Then,
1) the bi-vector field π_{r(λ)} on K/T defines a (K, π_K)-homogeneous Poisson structure on K/T;
2) with the Poisson structure π_K on K given by (14), every (K, π_K)-homogeneous Poisson structure on K/T arises this way.

Proof. The proof of 1) is similar to that of Theorem 3.2. We prove 2). Assume that π is a (K, π_K)-homogeneous Poisson structure on K/T. Since π is T-invariant, we can write

\[ \pi(e) = \sum_{\alpha \in \Sigma_+} \Big(-\frac{i\varepsilon}{2} + i\phi_\alpha\Big) \frac{X_\alpha \wedge Y_\alpha}{2} \in \wedge^2 (k/t), \]


where e = eT ∈ K/T and φ_α ∈ iR for each α ∈ Σ+. (Recall that ε ∈ iR is fixed at the beginning.) Set φ_{−α} = −φ_α for α ∈ Σ+. Using the same trick for calculating the Schouten bracket in ∧k, i.e., by embedding ∧k into ∧g, and by using arguments similar to those in the proof of Lemma 3.3, we know that the φ_α's must satisfy Condition (9). Exactly as in the proof of the second part of Theorem 3.2, we know that there exist a choice of positive roots Σ′+, a choice of a subset X′ of the set of simple roots for Σ′+, and some (not necessarily unique) λ0 ∈ h∗ such that

\[ \phi_\alpha = \begin{cases} \dfrac{\varepsilon}{2} \coth\Big(\dfrac{\varepsilon}{2} \ll\alpha, \lambda_0\gg\Big) & \text{if } \alpha \in [X'], \\ \pm \dfrac{\varepsilon}{2} & \text{if } \alpha \in \pm(\Sigma'_+ \setminus [X']). \end{cases} \]

Letting r be the classical dynamical r-matrix for the pair (g, h) defined by Σ′+ and X′ as in Theorem 2.2 (µ = 0 and C = 0), we see that π coincides with the Poisson structure π_{r(λ0)} on K/T. □

5. The Poisson Structures π_{X,X1,λ} on K/T

5.1. Definition. As in the case for G/H, we will single out a family of (K, π_K)-homogeneous Poisson structures on K/T which exhausts all such Poisson structures on K/T up to K-equivariant isomorphisms.

For a subset X ⊂ S(Σ+), set a_X = span_R{h_γ = [E_γ, E_{−γ}] : γ ∈ X}. Denote by {ȟ_γ : γ ∈ S(Σ+)} the set of fundamental co-weights for S(Σ+), i.e., ȟ_γ ∈ a for each γ ∈ S(Σ+) and γ1(ȟ_γ) = δ_{γ1,γ} for all γ1, γ ∈ S(Σ+). For X1 ⊂ S(Σ+), set

\[ \check\rho_{X_1} = \sum_{\gamma \in X_1} \check h_\gamma. \]

Define ρ̌_{X1} to be 0 if X1 is the empty set.

Theorem 5.1. For X ⊂ S(Σ+), X1 ⊂ X and λ = λ1 + (iπ/2) ρ̌_{X1} ∈ a_X + (iπ/2) ρ̌_{X1} such that α(λ1) ≠ 0 for all α ∈ [X] with α(ρ̌_{X1}) even, let π_{X,X1,λ} be the bi-vector field on K/T given by

\[ \pi_{X,X_1,\lambda} = p_* \pi_K - \Big( \frac{i\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} \frac{1}{1 - e^{2\alpha(\lambda)}}\, X_\alpha \wedge Y_\alpha \Big)^L, \]

where the second term on the right hand side is the K-invariant bi-vector field on K/T whose value at e = eT is the expression given in the parentheses. Then
1) π_{X,X1,λ} is a (K, π_K)-homogeneous Poisson structure on K/T, and
2) every (K, π_K)-homogeneous Poisson structure on K/T is K-equivariantly isomorphic to some π_{X,X1,λ}.

Remark 5.2. Note that the condition on λ1 ∈ a_X is equivalent to α(λ) ∉ πiZ for all α ∈ [X], so that e^{2α(λ)} ≠ 1 for all α ∈ [X].
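The conditions in Theorem 5.1 and Remark 5.2 can be checked numerically. The sketch below is only an illustration: it assumes a root α is represented by the two numbers t = α(λ1) ∈ R and m = α(ρ̌_{X1}) ∈ Z, so that α(λ) = t + iπm/2, and the function name `coefficient` is ours, not the paper's. It confirms that the coefficient 1/(1 − e^{2α(λ)}) is real, lies outside [0, 1] when m is even, and lies in (0, 1) when m is odd.

```python
import cmath
import math


def coefficient(t, m):
    """Return 1 / (1 - e^{2 alpha(lambda)}) for alpha(lambda) = t + i*pi*m/2.

    t = alpha(lambda_1) is real and m = alpha(rho_check_{X1}) is an integer.
    Both inputs are illustrative stand-ins, not notation from the paper.
    """
    if m % 2 == 0 and t == 0:
        # Here alpha(lambda) lies in pi*i*Z, which Remark 5.2 excludes.
        raise ValueError("e^{2 alpha(lambda)} = 1, coefficient undefined")
    q = cmath.exp(2 * complex(t, math.pi * m / 2))  # equals (-1)^m e^{2t}
    return 1 / (1 - q)


for t in (-1.5, -0.2, 0.3, 2.0):
    even = coefficient(t, 0)  # alpha(rho_check) even
    odd = coefficient(t, 1)   # alpha(rho_check) odd
    # Both coefficients are real up to rounding.
    assert abs(even.imag) < 1e-12 and abs(odd.imag) < 1e-12
    # Even case: values in (-inf, 0) or (1, inf). Odd case: values in (0, 1).
    assert even.real < 0 or even.real > 1
    assert 0 < odd.real < 1
```

In the odd case the coefficient simplifies to 1/(1 + e^{2t}), which is the form that appears in Case 3 of Example 5.4.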


Proof. 1) The number e^{2α(λ)} is real for each α ∈ [X]. Thus π_{X,X1,λ} is a (K, π_K)-homogeneous Poisson structure coming from a classical dynamical r-matrix.

2) Assume that π is a (K, π_K)-homogeneous Poisson structure on K/T. By Theorem 4.1 and by a proof similar to that of Theorem 3.7, there exist X ⊂ S(Σ+) and some λ0 ∈ h∗ such that π is isomorphic, via a K-equivariant diffeomorphism of K/T, to the Poisson structure π′ given by

\[ \pi' = p_* \pi_K - \Big( \frac{i\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} k_\alpha\, X_\alpha \wedge Y_\alpha \Big)^L, \]

where

\[ k_\alpha = \frac{1}{2}\Big(1 - \coth\Big(\frac{\varepsilon}{2} \ll\alpha, \lambda_0\gg\Big)\Big) = \frac{1}{1 - e^{\varepsilon \ll\alpha, \lambda_0\gg}} \in \mathbb R. \]

Let h_{λ0} ∈ h be the element in h corresponding to λ0 under the Killing form, so that ≪α, λ0≫ = α(h_{λ0}) for all α ∈ Σ. It remains to show that (ε/2) h_{λ0} can be replaced by some λ ∈ a_X + (iπ/2) ρ̌_{X1}. To this end, consider the function f(z) = 1/(1 − e^z) for z ∈ C. It takes values in all of C except for 0 and 1. Moreover, f(R\{0}) = (−∞, 0) ∪ (1, ∞) and f(R + iπ) = (0, 1). Set X1 = {γ ∈ X : k_γ ∈ (0, 1)}. Then for each γ ∈ X, there exists µ_γ ∈ R such that

\[ k_\gamma = f(\mu_\gamma + i\pi) \ \text{ if } \gamma \in X_1, \qquad k_\gamma = f(\mu_\gamma) \ \text{ if } \gamma \in X \setminus X_1. \]

Let λ1 ∈ a_X be such that 2γ(λ1) = µ_γ for each γ ∈ X (such a λ1 exists), and let λ = λ1 + (πi/2) ρ̌_{X1}. Then k_γ = f(2γ(λ)) for all γ ∈ X. Consequently, by writing α ∈ [X] ∩ Σ+ as a linear combination of elements in X, we see that k_α = f(2α(λ)) for all α ∈ [X]. □

Notation 5.3. For reasons given in Sect. 5.2, we will use π∞ to denote the Poisson structure p∗π_K on K/T. It is called the Bruhat Poisson structure [Lu-We], because its symplectic leaves are Bruhat cells in K/T. See Sect. 5.6 for more details.

Example 5.4. Consider

\[ K = SU(2) = \Big\{ \begin{pmatrix} u & v \\ -\bar v & \bar u \end{pmatrix} : u, v \in \mathbb C,\ |u|^2 + |v|^2 = 1 \Big\}, \]

T = {diag(e^{ix}, e^{−ix}) : x ∈ R} ≅ S¹, and the root α(x, −x) = 2x is taken to be the positive root. Then

\[ X_\alpha = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad Y_\alpha = \frac{1}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \]

With

\[ \Lambda = -\frac{i\varepsilon}{2}\, \frac{X_\alpha \wedge Y_\alpha}{2} \in su(2) \wedge su(2) \]


and the Poisson structure π_K on K = SU(2) defined by π_K = Λ^R − Λ^L, the Poisson brackets among the coordinate functions u, v, ū and v̄ on SU(2) are given by

\[ \{u, \bar u\} = -\frac{\varepsilon}{4} |v|^2, \qquad \{u, v\} = \frac{\varepsilon}{8} uv, \qquad \{u, \bar v\} = \frac{\varepsilon}{8} u \bar v, \qquad \{v, \bar v\} = 0. \]

Let π0 be the SU(2)-invariant bivector field on SU(2)/S¹ whose value at the point e = eS¹ is X_α ∧ Y_α. It is symplectic.

Case 1. X = X1 = ∅. Then π_{X,X1,λ} = π∞.

Case 2. X = {α}, X1 = ∅. Then λ = diag(λ1, −λ1) with λ1 ≠ 0, and

\[ \pi_{X,X_1,\lambda} = \pi_\infty - \frac{i\varepsilon}{2}\, \frac{1}{1 - e^{4\lambda_1}}\, \pi_0. \]

Case 3. X = X1 = {α}. Then λ = diag(λ1 + πi/4, −λ1 − πi/4) with λ1 ∈ R arbitrary, and

\[ \pi_{X,X_1,\lambda} = \pi_\infty - \frac{i\varepsilon}{2}\, \frac{1}{1 + e^{4\lambda_1}}\, \pi_0. \]

Note that the range of the function 1/(1 − e^{4λ1}) for λ1 ∈ R\{0} is (−∞, 0) ∪ (1, +∞), and the range of 1/(1 + e^{4λ1}) for λ1 ∈ R is (0, 1). Thus, for all possible choices of X, X1 and λ, we get all the Poisson structures of the form

\[ \pi^a = \pi_\infty - \frac{i\varepsilon}{2}\, a\, \pi_0 \]

for a ∈ R except for a = 1. But the Poisson structure π^a when a = 1 is easily seen to be isomorphic to π∞ (corresponding to a = 0) by the SU(2)-equivariant diffeomorphism on SU(2)/S¹ defined by the right translation by the non-trivial Weyl group element. The fact that every (SU(2), π_K)-homogeneous Poisson structure on S² is of the form π^a for some a ∈ R is very easy to check directly [Sh].

Identify the Lie algebra su(2) with R³ by

\[ \begin{pmatrix} ix & y + iz \\ -y + iz & -ix \end{pmatrix} \longmapsto (x, y, z), \]

so the Adjoint orbit through diag(i, −i) can be identified with the sphere S² = {(x, y, z) ∈ R³ : x² + y² + z² = 1}. Consequently, we have the identification

\[ SU(2)/S^1 \to S^2 : \ kS^1 \longmapsto \mathrm{Ad}_k \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \]


or

\[ \begin{pmatrix} u & v \\ -\bar v & \bar u \end{pmatrix} S^1 \longmapsto \big(|u|^2 - |v|^2,\ -i(uv - \bar u \bar v),\ -(uv + \bar u \bar v)\big). \]
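The coordinate formula for the identification SU(2)/S¹ → S² can be checked by computing Ad_k diag(i, −i) directly. The following sketch uses only the Python standard library; the helper names (`su2`, `orbit_point`, `closed_form`, `sample`) are ours and chosen for illustration.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def su2(u, v):
    """The matrix [[u, v], [-conj(v), conj(u)]] with |u|^2 + |v|^2 = 1."""
    return [[u, v], [-v.conjugate(), u.conjugate()]]


def dagger(k):
    """Conjugate transpose; for k in SU(2) this is the inverse of k."""
    return [[k[j][i].conjugate() for j in range(2)] for i in range(2)]


def orbit_point(u, v):
    """(x, y, z) of Ad_k diag(i, -i) under [[ix, y+iz], [-y+iz, -ix]] -> (x, y, z)."""
    k = su2(u, v)
    m = matmul(matmul(k, [[1j, 0], [0, -1j]]), dagger(k))
    return (m[0][0].imag, m[0][1].real, m[0][1].imag)


def closed_form(u, v):
    """The formula (|u|^2 - |v|^2, -i(uv - conj(uv)), -(uv + conj(uv)))."""
    uv = u * v
    return (abs(u) ** 2 - abs(v) ** 2,
            (-1j * (uv - uv.conjugate())).real,
            -(uv + uv.conjugate()).real)


def sample(theta, a, b):
    """A point (u, v) with |u| = cos(theta), |v| = sin(theta), 0 <= theta <= pi/2."""
    return cmath.rect(math.cos(theta), a), cmath.rect(math.sin(theta), b)


for theta, a, b in [(0.3, 0.1, 2.0), (1.1, -0.7, 0.4), (0.0, 1.0, 0.0)]:
    u, v = sample(theta, a, b)
    p, q = orbit_point(u, v), closed_form(u, v)
    assert all(abs(s - t) < 1e-12 for s, t in zip(p, q))  # both formulas agree
    assert abs(sum(c * c for c in p) - 1) < 1e-12          # the image lies on S^2
```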

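The induced Bruhat Poisson structure π∞ on S² is given by

\[ \{x, y\} = -\frac{\varepsilon i}{4}(x - 1)z, \qquad \{y, z\} = -\frac{\varepsilon i}{4}(x - 1)x, \qquad \{z, x\} = -\frac{\varepsilon i}{4}(x - 1)y, \]

and the Poisson structure π^a on S² is given by

\[ \{x, y\} = -\frac{\varepsilon i}{4}(x + 2a - 1)z, \qquad \{y, z\} = -\frac{\varepsilon i}{4}(x + 2a - 1)x, \qquad \{z, x\} = -\frac{\varepsilon i}{4}(x + 2a - 1)y. \]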
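The bracket for π^a can be sanity-checked numerically. The sketch below assumes the standard fact that a bracket on R³ written as ({y,z}, {z,x}, {x,y}) = F satisfies the Jacobi identity exactly when F · curl F = 0. The constant `c` stands for −iε/4, which is real because ε ∈ iR, and all function names are ours. The last helper reads off where the bivector vanishes on S² (the locus x = 1 − 2a) and sorts the parameter a into the three regimes this produces.

```python
def bracket_vector(p, a, c=1.0):
    """Return ({y,z}, {z,x}, {x,y}) at p = (x, y, z) for pi^a.

    c stands for the real constant -i*eps/4 (an illustrative normalisation).
    """
    x, y, z = p
    g = x + 2 * a - 1
    return (c * g * x, c * g * y, c * g * z)


def curl_of_bracket_vector(p, a, c=1.0):
    """Curl of F = c*(x + 2a - 1)*(x, y, z), worked out by hand: c*(0, -z, y)."""
    x, y, z = p
    return (0.0, -c * z, c * y)


def jacobi_defect(p, a, c=1.0):
    """F . curl F, which vanishes identically iff the Jacobi identity holds."""
    f = bracket_vector(p, a, c)
    w = curl_of_bracket_vector(p, a, c)
    return sum(fi * wi for fi, wi in zip(f, w))


def leaf_structure(a):
    """Classify pi^a on S^2 by the zero locus x = 1 - 2a of the bivector."""
    x0 = 1 - 2 * a
    if abs(x0) > 1:
        return "symplectic"
    if abs(x0) == 1:
        return "one point leaf and one open leaf"
    return "two open leaves and a circle of point leaves"


for p in [(0.6, 0.0, 0.8), (0.0, 1.0, 0.0), (-0.36, 0.48, 0.8)]:
    for a in (-1.0, 0.0, 0.3, 1.0, 2.5):
        assert abs(jacobi_defect(p, a, c=0.7)) < 1e-12
```

Because F is everywhere proportional to (x, y, z), the function x² + y² + z² is a Casimir, so the bracket restricts to the unit sphere.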

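Note that π^a is symplectic when a < 0 or a > 1. When a = 0, it has two symplectic leaves, the point (1, 0, 0) being a one-point leaf and the rest of S² as another leaf. Similarly for a = 1. When 0 < a < 1, it has infinitely many symplectic leaves: two open leaves respectively given by x < 1 − 2a and x > 1 − 2a, and every point on the circle x = 1 − 2a as a one-point leaf.

Example 5.5. Let g = sl(3, C) and K = SU(3). The three positive roots are chosen to be

\[ \alpha_1(x) = x_1 - x_2, \qquad \alpha_2(x) = x_2 - x_3, \qquad \alpha_3(x) = x_1 - x_3 \]

for a diagonal matrix x = diag(x1, x2, x3). Take X = S(Σ+) = {α1, α2} and X1 = {α1}. In this case

\[ \check\rho_{X_1} = \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ 0 & -\frac{1}{3} & 0 \\ 0 & 0 & -\frac{1}{3} \end{pmatrix} \]

and

\[ \lambda = \begin{pmatrix} \lambda_1 + \frac{\pi i}{3} & 0 & 0 \\ 0 & \lambda_2 - \frac{\pi i}{6} & 0 \\ 0 & 0 & -(\lambda_1 + \lambda_2) - \frac{\pi i}{6} \end{pmatrix}, \qquad \lambda_1 + 2\lambda_2 \neq 0. \]

Then

\[ \pi_{X,X_1,\lambda} = \pi_\infty + \Big( \frac{2 X_{\alpha_1} \wedge Y_{\alpha_1}}{1 + e^{2(\lambda_1 - \lambda_2)}} + \frac{2 X_{\alpha_2} \wedge Y_{\alpha_2}}{1 - e^{2\lambda_1 + 4\lambda_2}} + \frac{2 X_{\alpha_3} \wedge Y_{\alpha_3}}{1 + e^{4\lambda_1 + 2\lambda_2}} \Big)^L. \]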
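The denominators in Example 5.5 can be cross-checked against the general coefficient 1/(1 − e^{2α(λ)}) of Theorem 5.1. The sketch below assumes real parameters λ1, λ2 (called `l1`, `l2`), builds the diagonal of λ = λ1-part + (iπ/2) ρ̌_{X1} with ρ̌_{X1} = diag(2/3, −1/3, −1/3), and evaluates the three roots. The function names are ours, and the overall normalisation of the bivector is not examined here.

```python
import cmath
import math


def example_lambda(l1, l2):
    """Diagonal of lambda = diag(l1, l2, -(l1+l2)) + (i*pi/2) * rho_check_{X1}."""
    rho_check = (2 / 3, -1 / 3, -1 / 3)  # fundamental co-weight dual to alpha_1
    real_part = (l1, l2, -(l1 + l2))
    shift = 1j * math.pi / 2
    return tuple(r + shift * c for r, c in zip(real_part, rho_check))


def coefficients(l1, l2):
    """1 / (1 - e^{2 alpha(lambda)}) for the three positive roots of sl(3)."""
    x = example_lambda(l1, l2)
    roots = {
        "alpha1": x[0] - x[1],
        "alpha2": x[1] - x[2],
        "alpha3": x[0] - x[2],
    }
    return {name: 1 / (1 - cmath.exp(2 * value)) for name, value in roots.items()}


l1, l2 = 0.3, 0.2  # any real values with l1 + 2*l2 != 0
c = coefficients(l1, l2)
# alpha_1 and alpha_3 take odd values on rho_check, alpha_2 takes an even value.
assert abs(c["alpha1"] - 1 / (1 + math.exp(2 * (l1 - l2)))) < 1e-12
assert abs(c["alpha2"] - 1 / (1 - math.exp(2 * l1 + 4 * l2))) < 1e-12
assert abs(c["alpha3"] - 1 / (1 + math.exp(4 * l1 + 2 * l2))) < 1e-12
```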


5.2. Connections via taking limits in λ. As noted in [E-V], the dynamical r-matrices are related to each other via taking various limits in λ. Correspondingly, the Poisson structures π_{X,X1,λ} are also related this way. We study these relations in this section.

Proposition 5.6. For any X1 ⊂ X ⊂ Y ⊂ S(Σ+) and λ = λ1 + (iπ/2) ρ̌_{X1} ∈ a_X + (iπ/2) ρ̌_{X1} such that α(λ1) ≠ 0 for all α ∈ [X] with α(ρ̌_{X1}) even, we have

\[ \pi_{X,X_1,\lambda} = \lim_{t \to +\infty} \pi_{Y,X_1,\lambda + t\check\rho_{Y \setminus X}}. \tag{17} \]

In particular,

\[ \pi_\infty = \lim_{t \to +\infty} \pi_{Y,\emptyset,t\check\rho_Y}. \]

Moreover, we also have

\[ \pi_\infty = \lim_{t \to +\infty} \pi_{X,X_1,\lambda + t\check\rho_X}. \tag{18} \]

Proof. Set µ_t = λ + t ρ̌_{Y\X} for t > 0. Let α ∈ [Y] ∩ Σ+. If α ∈ [X], then α(ρ̌_{Y\X}) = 0 so α(µ_t) = α(λ). If α ∈ [Y]\[X], then v := α(ρ̌_{Y\X}) is positive, so

\[ \lim_{t \to \infty} \frac{1}{1 - e^{\alpha(\mu_t)}} = \lim_{t \to \infty} \frac{1}{1 - e^{tv}} = 0. \]

Hence (17) follows from the definition of π_{X,X1,λ}. The limit in (18) is obvious. □
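The limit used in this proof is easy to observe numerically. The sketch below assumes a real value `alpha_lambda` for α(λ) (for instance when X1 = ∅) and a positive number v = α(ρ̌_{Y\X}), and evaluates the coefficient 1/(1 − e^{2α(µ_t)}) from Theorem 5.1 as t grows. The factor 2 in the exponent follows the formula in Theorem 5.1; it does not affect the limit. The function name is ours.

```python
import math


def leaf_coefficient(alpha_lambda, v, t):
    """1 / (1 - e^{2 (alpha(lambda) + t v)}) for real alpha(lambda) and v > 0."""
    return 1 / (1 - math.exp(2 * (alpha_lambda + t * v)))


# For a root outside [X], the coefficient decays to 0 as t -> +infinity,
# so the corresponding term drops out of the Poisson structure in the limit.
values = [leaf_coefficient(0.4, 1.0, t) for t in (1, 5, 10, 20)]
assert all(abs(b) < abs(a) for a, b in zip(values, values[1:]))
assert abs(values[-1]) < 1e-12

# For a root in [X] one has v = 0, and the coefficient does not depend on t.
assert leaf_coefficient(0.4, 0.0, 1) == leaf_coefficient(0.4, 0.0, 20)
```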

5.3. The Lagrangian subalgebras of g corresponding to π_{X,X1,λ}. The Lie bialgebra of the Poisson Lie group (K, π_K) is (k, a + n), where the pairing between k and a + n is given by (2i/ε) Im≪ , ≫, where Im≪ , ≫ stands for the imaginary part of the Killing form ≪ , ≫. We will call a real subalgebra l of g a Lagrangian algebra if 1) dim l = dim k, and 2) (2i/ε) Im≪x, y≫ = 0 for all x, y ∈ l. By a theorem of Drinfeld [D3], (K, π_K)-homogeneous Poisson structures on K/T correspond to Lagrangian subalgebras l of g with l ∩ k = t. In this section, we calculate the Lagrangian subalgebras l_{X,X1,λ} corresponding to the Poisson structures π_{X,X1,λ}.

By definition [D3],

\[ l_{X,X_1,\lambda} = \{x + \xi : x \in k,\ \xi \in a + n : \ \xi|_t = 0,\ \xi \lrcorner\, \pi_{X,X_1,\lambda}(e) = x + t\}. \]

A direct calculation gives

\[ l_{X,X_1,\lambda} = t + \mathrm{span}_{\mathbb R}\{E_\beta, iE_\beta : \beta \in \Sigma_+ \setminus [X]\} + \mathrm{span}_{\mathbb R}\Big\{ \frac{1}{e^{2\alpha(\lambda)} - 1} X_\alpha + E_\alpha,\ \frac{1}{e^{2\alpha(\lambda)} - 1} Y_\alpha + iE_\alpha : \alpha \in [X] \cap \Sigma_+ \Big\}. \]

On the other hand, for α ∈ [X], since e^{2α(λ)} ≠ 1, we have

\[ \mathrm{Ad}_{e^\lambda} X_\alpha = \mathrm{Ad}_{e^\lambda}(E_\alpha - E_{-\alpha}) = (e^{\alpha(\lambda)} - e^{-\alpha(\lambda)}) \Big( \frac{1}{e^{2\alpha(\lambda)} - 1} X_\alpha + E_\alpha \Big), \]

\[ \mathrm{Ad}_{e^\lambda} Y_\alpha = \mathrm{Ad}_{e^\lambda}(iE_\alpha + iE_{-\alpha}) = (e^{\alpha(\lambda)} - e^{-\alpha(\lambda)}) \Big( \frac{1}{e^{2\alpha(\lambda)} - 1} Y_\alpha + iE_\alpha \Big). \]


Note that e^{α(λ)} is real or imaginary depending on whether α(ρ̌_{X1}) is even or odd. Set

\[ n_X = \mathrm{span}_{\mathbb R}\{E_\beta, iE_\beta : \beta \in \Sigma_+ \setminus [X]\}. \tag{19} \]

Then we have proved the following proposition.

Proposition 5.7. Denote by l_{X,X1,λ} the Lagrangian subalgebra of g corresponding to the Poisson structure π_{X,X1,λ} on K/T. It is given by

\[ l_{X,X_1,\lambda} = \mathrm{Ad}_{e^\lambda} \big( t + n_X + \mathrm{span}_{\mathbb R}\{X_\alpha, Y_\alpha : \alpha \in [X],\ \alpha(\check\rho_{X_1}) \text{ is even}\} + \mathrm{span}_{\mathbb R}\{iX_\alpha, iY_\alpha : \alpha \in [X],\ \alpha(\check\rho_{X_1}) \text{ is odd}\} \big). \]

Remark 5.8. Let θ be the complex conjugation on g defined by k. Let τ_{X,X1} be the complex conjugation on g given by

\[ \tau_{X,X_1} = \mathrm{Ad}_{\exp(\pi i \check\rho_{X_1})}\, \theta = \theta\, \mathrm{Ad}_{\exp(-\pi i \check\rho_{X_1})}. \]

Denote by m_X^{τ_{X,X1}} the set of fixed points of τ_{X,X1} in m_X, where m_X = h + span_C{E_α : α ∈ [X]}. Then

\[ l_{X,X_1,\lambda} = \mathrm{Ad}_{e^\lambda} \big( m_X^{\tau_{X,X_1}} + n_X \big). \]

Remark 5.9. Let n = dim k and consider l_{X,X1,λ} as a point in Gr(n, g), the Grassmannian of n-dimensional real subspaces of g. Then, corresponding to Proposition 5.6, we have, for X1 ⊂ X ⊂ Y ⊂ S(Σ+) and for any λ = λ1 + (iπ/2) ρ̌_{X1} ∈ a_X + (iπ/2) ρ̌_{X1} such that α(λ1) ≠ 0 for all α ∈ [X] with α(ρ̌_{X1}) even,

\[ \lim_{t \to +\infty} l_{Y,X_1,\lambda + t\check\rho_{Y \setminus X}} = l_{X,X_1,\lambda} \tag{20} \]

in Gr(n, g). Indeed, under the Plücker embedding of Gr(n, g) into P¹(∧ⁿg), the Lie subalgebra l_{Y,X1,λ} corresponds to the point in P¹(∧ⁿg) defined by the vector

\[ v_{Y,X_1,\lambda} := Z_0 \wedge \prod_{\alpha \in [Y] \cap \Sigma_+} \Big( \frac{1}{e^{2\alpha(\lambda)} - 1} X_\alpha + E_\alpha \Big) \wedge \Big( \frac{1}{e^{2\alpha(\lambda)} - 1} Y_\alpha + iE_\alpha \Big) \wedge \prod_{\alpha \in \Sigma_+ \setminus [Y]} E_\alpha \wedge E_{-\alpha}, \]

where Z0 ∈ ∧^{dim t} t and Z0 ≠ 0 is fixed. Since v_{Y,X1,λ+tρ̌_{Y\X}} → v_{X,X1,λ} as t → +∞, we see that (20) holds in P¹(∧ⁿg) and thus also in Gr(n, g).

Example 5.10. When X = X1 are the empty set, we have l_{X,X1,λ} = t + n, and when X = S(Σ+) and X1 is the empty set, we have l_{X,X1,λ} = Ad_{e^λ} k. In general, when X = S(Σ+), the Lie subalgebra l_{X,X1,λ} is a real form of g.


5.4. Geometrical interpretation of π_{X,X1,λ}. Denote by L the set of all Lagrangian subalgebras of g with respect to the imaginary part of the Killing form ≪ , ≫. (Here g is regarded as a real vector space.) It is an algebraic subvariety of the Grassmannian Gr(n, g) of n-dimensional subspaces of g, where n = dim k. In [E-L2], we show that there is a smooth bivector field Π on Gr(n, g) such that the Schouten bracket [Π, Π] vanishes at every l ∈ L. More precisely, consider the G-action on Gr(n, g) by the Adjoint action. It defines a Lie algebra anti-homomorphism κ : g → χ¹(Gr(n, g)), where χ¹(Gr(n, g)) is the space of vector fields on Gr(n, g). Denote by the same letter its multi-linear extension from ∧²g to the space of bi-vector fields on Gr(n, g). Then the bivector field Π on Gr(n, g) is defined to be

\[ \Pi = \frac{1}{2} \kappa(R), \]

where R ∈ ∧²g is the r-matrix for g given by

\[ \langle R, (x_1 + y_1) \wedge (x_2 + y_2) \rangle_\varepsilon = \langle x_1, y_2 \rangle_\varepsilon - \langle x_2, y_1 \rangle_\varepsilon \tag{21} \]

for x1, x2 ∈ k and y1, y2 ∈ a + n with ⟨ , ⟩_ε = (2i/ε) Im≪ , ≫. Explicitly,

\[ R = -\frac{\varepsilon}{2i} \Big( \sum_{j=1}^{l} (i h_j) \wedge h_j + \sum_{\alpha \in \Sigma_+} \big( -X_\alpha \wedge (iE_\alpha) + Y_\alpha \wedge E_\alpha \big) \Big), \]

where {h1, . . . , hl} is a basis for a such that ≪h_j, h_k≫ = δ_{jk}. It now follows from the definition of Π that it defines a Poisson structure on every G-invariant smooth submanifold of L. One particular G-invariant smooth submanifold of L is the (unique) irreducible component L0 of L that contains k. We show in [E-L2] that each l_{X,X1,λ} ∈ L0 and that its K-orbit in L0 is a Poisson submanifold of (L0, Π). (We also show in [E-L2] that L0 is diffeomorphic to the set of real points in the De Concini–Procesi compactification of G [D-P].)

For each Poisson structure π_{X,X1,λ} on K/T, consider the map

\[ P : (K/T, \pi_{X,X_1,\lambda}) \longrightarrow (L_0, \Pi) : \ kT \longmapsto \mathrm{Ad}_k\, l_{X,X_1,\lambda}. \]

It is shown in [E-L2] that P is a Poisson map. When the normalizer subgroup of l_{X,X1,λ} in K is T, this map is an embedding of K/T into L0 whose image is the K-orbit of l_{X,X1,λ} in L0. In general, P is a covering map onto the K-orbit of l_{X,X1,λ} in L0. Thus, every (K/T, π_{X,X1,λ}) is a Poisson submanifold of (L0, Π) (possibly up to a covering map). This can be considered as one geometrical interpretation of π_{X,X1,λ}.

Two special cases of π_{X,X1,λ} deserve more attention. The first is when X = X1 = ∅ (λ = 0 in this case). Then π_{X,X1,λ} = π∞ is the Bruhat Poisson structure. It has been the most interesting example in terms of connections to Lie theory. For its relations with the Kostant harmonic forms [Ko], see [Lu3] and [E-L1].

The second special case is when X = S(Σ+) and X1 = ∅. The condition on λ is that λ ∈ a is regular. We will show that π_{X,X1,λ} is symplectic in this case. In fact, we will show that π_{X,X1,λ} can be identified with the symplectic structure on a dressing orbit of K in its dual Poisson Lie group. We also remark that this symplectic structure has been used in [L-R] to give a symplectic proof of Kostant's nonlinear convexity theorem.


Recall that the Manin triple (g, k, a + n, (2i/ε) Im≪ , ≫) gives rise to a Poisson structure π_AN on the group AN making (AN, π_AN) into the dual Poisson Lie group of (K, π_K). The group K acts on AN by the (left) dressing action:

\[ K \times AN \longrightarrow AN : \ (k, b) \longmapsto k \cdot b := b_1, \quad \text{if } bk^{-1} = k_1 b_1 \text{ for } k_1 \in K \text{ and } b_1 \in AN. \]

The K-orbits of this dressing action of K in AN, called the dressing orbits, are precisely all the symplectic leaves of the Poisson structure on AN and they are parametrized by a fundamental W-chamber in a. Thus each dressing orbit inherits a symplectic, and thus Poisson, structure as a symplectic leaf. Since the dressing action is Poisson [STS2, Lu-We], these dressing orbits are examples of (K, π_K)-homogeneous Poisson spaces.

Let λ ∈ a be regular and consider the element e^{−λ} ∈ A. The stabilizer subgroup of K in AN at e^{−λ} is T. Thus, by identifying K/T with the dressing orbit through e^{−λ}, we get a Poisson structure on K/T which is in fact symplectic.

Notation 5.11. We will use π_λ to denote the Poisson structure on K/T obtained by identifying K/T with the symplectic leaf in AN through the point e^{−λ}, and we call it the dressing orbit Poisson structure corresponding to e^{−λ} ∈ A.

Proposition 5.12. When X = S(Σ+), X1 = ∅, and λ ∈ a is regular, the Poisson structure π_{X,X1,λ} on K/T is nothing but the dressing orbit Poisson structure π_λ corresponding to e^{−λ}. Explicitly, we have

\[ \pi_\lambda = -\Big( \frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+} \frac{1}{1 - e^{2\alpha(\lambda)}}\, X_\alpha \wedge Y_\alpha \Big)^L + \pi_\infty, \tag{22} \]

where the first term is the K-invariant bi-vector field on K/T whose value at e = eT is the expression given in the parentheses.

Proof. Since π_{X,X1,λ} is given by the right-hand side of (22), we only need to show that the dressing orbit Poisson structure π_λ is also given by the same formula.

Denote the Poisson structure on AN by π_AN. Since we are identifying k with (a + n)∗ via (2i/ε) Im≪ , ≫, an element x ∈ k can be regarded as a left invariant 1-form on AN which we denote by x^l. Let p_k : g → k be the projection from g to k with respect to the Iwasawa decomposition g = k + a + n. We know that (see [Lu1]) for any a ∈ A,

\[ \pi_{AN}(x^l, y^l)(a) = \frac{2i}{\varepsilon}\, \mathrm{Im} \ll \mathrm{Ad}_a x,\ p_k \mathrm{Ad}_a y \gg \]

for all x, y ∈ k. Here, Ad_a is the Adjoint action of a ∈ A on g. Thus, when x and y run over the basis vectors {iH_α, X_α, Y_α : α ∈ Σ+} for k, we have π_AN(x^l, y^l)(a) = 0 except that

\[ \pi_{AN}(X_\alpha^l, Y_\alpha^l) = \frac{2i}{\varepsilon}\, \mathrm{Im} \ll \mathrm{Ad}_a X_\alpha,\ p_k \mathrm{Ad}_a Y_\alpha \gg = \frac{2i}{\varepsilon}\, \mathrm{Im} \ll a^\alpha E_\alpha - a^{-\alpha} E_{-\alpha},\ a^{-\alpha}(iE_\alpha + iE_{-\alpha}) \gg = \frac{2i}{\varepsilon}\, (1 - a^{-2\alpha}). \]


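Let σ_x be the (left) dressing vector field on AN defined by x ∈ k, i.e., σ_x = −x^l ⌟ π_AN. Then, taking a = e^{−λ}, we have

\[ \pi_{AN}(a) = \sum_{\alpha \in \Sigma_+} \frac{1}{\pi_{AN}(X_\alpha^l, Y_\alpha^l)}\, \sigma_{X_\alpha}(a) \wedge \sigma_{Y_\alpha}(a) = -\frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+} \frac{1}{1 - e^{2\alpha(\lambda)}}\, \sigma_{X_\alpha}(a) \wedge \sigma_{Y_\alpha}(a) \in \wedge^2 T_a(K \cdot a). \]

Identifying K/T with K · a by kT ↦ k · a, we get

\[ \pi_\lambda(eT) = -\frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+} \frac{1}{1 - e^{2\alpha(\lambda)}}\, X_\alpha \wedge Y_\alpha. \]

Thus π_λ is as given by (22). □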

5.5. π_{X,X1,λ} as the result of Poisson induction. We now look at the general case of π_{X,X1,λ}. Set

\[ k_X = t + \mathrm{span}_{\mathbb R}\{X_\alpha, Y_\alpha : \alpha \in [X] \cap \Sigma_+\}, \]

and let K_X ⊂ K be the connected subgroup of K with Lie algebra k_X. We will show that l_{X,X1,λ} can be obtained via Poisson induction (see Remark 5.15 below) from a Poisson structure on the smaller space K_X/T. To this end, consider

\[ k_X^0 = \{\xi \in k^* : \xi(x) = 0 \ \ \forall x \in k_X\}. \]

Since we are identifying k∗ with a + n, we have k_X^0 ≅ n_X as real Lie algebras, where n_X is given in (19). Since n_X ⊂ a + n is an ideal, we know that K_X ⊂ K is a Poisson subgroup [Lu-We]. In fact, set

\[ \Lambda_1 = -\frac{i\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} \frac{X_\alpha \wedge Y_\alpha}{2}, \qquad \Lambda_2 = -\frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+ \setminus [X]} \frac{X_\alpha \wedge Y_\alpha}{2}. \]

Then, we have

Proposition 5.13. 1) For any x ∈ k_X, ad_x Λ2 = 0.
2) The Poisson structure on K_X (as a Poisson submanifold of K) is given by π_{K_X}(k1) = R_{k1} Λ1 − L_{k1} Λ1, where R_{k1} and L_{k1} are respectively the right and left translations on K_X by k1 ∈ K_X.
3) The Manin triple for the Poisson Lie group (K_X, π_{K_X}) is (m_X, k_X, a + u_X, (2i/ε) Im≪ , ≫), where m_X, given in (13), is considered as over R, and

\[ u_X = \mathrm{span}_{\mathbb R}\{E_\alpha, iE_\alpha : \alpha \in [X] \cap \Sigma_+\}. \]


Proof. 1) Using the embedding of ∧•k into ∧•g as a real subspace, it is enough to show that ad_x Λ2 = 0 for x = E_α with α ∈ [X]. Let α ∈ [X] ∩ Σ+. Then,

\[ \frac{2}{\varepsilon}\, \mathrm{ad}_{E_\alpha} \Lambda_2 = \sum_{\beta \in \Sigma_+ \setminus [X]} \big( [E_\alpha, E_\beta] \wedge E_{-\beta} + E_\beta \wedge [E_\alpha, E_{-\beta}] \big). \]

Set

\[ Y_1 = \{\beta \in \Sigma_+ \setminus [X] : \alpha + \beta \in \Sigma\} \quad \text{and} \quad Y_2 = \{\beta \in \Sigma_+ \setminus [X] : \beta - \alpha \in \Sigma\}. \]

Since Y = Σ+ \ [X] has the property that if α ∈ [X] ∩ Σ+ and β ∈ Y are such that α + β ∈ Σ, then α + β ∈ Y, the map Y1 → Y2 : β ↦ α + β is a bijection. Thus

\[ \frac{2}{\varepsilon}\, \mathrm{ad}_{E_\alpha} \Lambda_2 = \sum_{\beta \in Y_1} \big( [E_\alpha, E_\beta] \wedge E_{-\beta} + E_{\alpha+\beta} \wedge [E_\alpha, E_{-(\alpha+\beta)}] \big) = \sum_{\beta \in Y_1} (N_{\alpha,\beta} + N_{\alpha,-(\alpha+\beta)})\, E_{\alpha+\beta} \wedge E_{-\beta} = 0. \]

Similarly, ad_{E_{−α}} Λ2 = 0. This proves 1).

2) By definition, the induced Poisson structure π_{K_X} on K_X is the restriction of π_K to K_X. Using the definition of π_K and 1), we know that π_{K_X} is as given.

3) From the general theory of Poisson Lie groups [Lu-We], we know that the induced Lie algebra structure on k_X∗ is isomorphic to the quotient Lie algebra k∗/k_X^0. Through the identifications k∗ ≅ a + n and k_X^0 ≅ n_X via (2i/ε) Im≪ , ≫, we get k_X∗ ≅ a + u_X via (2i/ε) Im≪ , ≫, which is now considered as a symmetric scalar product on m_X by restriction. □

Notation 5.14. Let X1 ⊂ X and let λ = λ1 + (πi/2) ρ̌_{X1} ∈ a_X + (πi/2) ρ̌_{X1} be such that α(λ1) ≠ 0 for any α ∈ [X] with α(ρ̌_{X1}) even. By replacing K by K_X and by regarding X as the set of all simple roots for the root system for (K_X, T), we know that there is a (K_X, π_{K_X})-homogeneous Poisson structure on K_X/T corresponding to X, X1 and λ. We will denote it by π^X_{X1,λ}.

We now show that the Poisson structure π_{X,X1,λ} on K/T can be obtained via Poisson induction from the Poisson structure π^X_{X1,λ} on K_X/T. To this end, consider the product space K × (K_X/T) with the product Poisson structure π_K ⊕ π^X_{X1,λ}. Even though the diagonal (right) action of K_X on K × (K_X/T) given by

\[ k_1 : (k, k'T) \longmapsto (kk_1,\ k_1^{-1} k'T) \]

is in general not Poisson, there is nevertheless a unique Poisson structure on the quotient space K ×_{K_X} (K_X/T) such that the projection map

\[ K \times (K_X/T) \longrightarrow K \times_{K_X} (K_X/T) : \ (k, k'T) \longmapsto [(k, k'T)] \]

is a Poisson map. We temporarily denote this Poisson structure on K ×_{K_X} (K_X/T) by π0.

Remark 5.15. In general, suppose that K is a Poisson Lie group and K1 ⊂ K is a Poisson subgroup. Suppose that M is a Poisson manifold on which there is a Poisson action by K1. Then there is a unique Poisson structure on K ×_{K1} M such that the natural projection from K × M to K ×_{K1} M is a Poisson map. Moreover, the left action of K on K ×_{K1} M by left translations on the first factor is a Poisson action. We call this procedure of producing the Poisson K-space K ×_{K1} M from the Poisson K1-space M Poisson induction.


Proposition 5.16. We have F∗π0 = π_{X,X1,λ}, where F is the identification

\[ F : K \times_{K_X} (K_X/T) \xrightarrow{\ \sim\ } K/T : \ [(k, k'T)] \longmapsto kk'T. \]

Proof. Recall that π_{X,X1,λ} is the image of π̃_{r_X(λ)} = Λ^R − A_X(λ)^L under the projection p1 : K → K/T, where Λ^R (resp. A_X(λ)^L) is the right (resp. left) invariant bivector field on K with value Λ (resp. A_X(λ)) at e, and A_X(λ) ∈ k ∧ k is the skew symmetric part of the r-matrix r_X(λ) given in (12). On the other hand, π0 is the image of π_K ⊕ π̄ under the projection

\[ p_2 : K \times K_X \longrightarrow K \times_{K_X} (K_X/T) : \ (k, k') \longmapsto [(k, k'T)], \]

where π̄ is the bi-vector field on K_X defined by π̄ = Λ1^R − Λ3^L with

\[ \Lambda_3 = -\frac{i\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} \coth \alpha(\lambda)\, \frac{X_\alpha \wedge Y_\alpha}{2}. \]

Because of the commutative diagram

\[ \begin{array}{ccc} K \times K_X & \xrightarrow{\ m\ } & K \\ \downarrow p_2 & & \downarrow p_1 \\ K \times_{K_X} (K_X/T) & \xrightarrow{\ F\ } & K/T, \end{array} \]

where m : K × K_X → K : (k, k′) ↦ kk′, it is enough to show that m∗(π_K ⊕ π̄) = π̃_{r_X(λ)}, or

\[ \tilde\pi_{r_X(\lambda)}(kk_1) = L_k \bar\pi(k_1) + R_{k_1} \pi_K(k), \qquad \forall k \in K,\ k_1 \in K_X. \]

But this follows easily from the definitions and the fact that Ad_{k1} Λ2 = Λ2 for all k1 ∈ K_X. □

We state some more properties of π_{X,X1,λ} which can be proved either by definitions or as corollaries of Proposition 5.16.

Proposition 5.17. 1) The embedding (K_X/T, π^X_{X1,λ}) ↪ (K/T, π_{X,X1,λ}) is a Poisson map;
2) With the Poisson structure π_K on K, the Poisson structure π^X_{X1,λ} on K_X/T and the Poisson structure π_{X,X1,λ} on K/T, the map

\[ m_1 : K \times (K_X/T) \longrightarrow K/T : \ (k, k'T) \longmapsto kk'T \]

is a Poisson map;
3) Let p∗π_K be the projection to K/K_X of π_K by p : K → K/K_X : k ↦ kK_X. Then the projection map (K/T, π_{X,X1,λ}) → (K/K_X, p∗π_K) is a Poisson map.

Remark 5.18. The Poisson structure p∗π_K on K/K_X is known as the Bruhat-Poisson structure, because its symplectic leaves are exactly the Bruhat cells in K/K_X. See [Lu-We].


5.6. The symplectic leaves of π_{X,X1,λ}. In this section, we first describe the symplectic leaves of π_{X,X1,λ} for any X ⊂ S(Σ+) but X1 = ∅. The description of symplectic leaves for general π_{X,X1,λ} is somewhat complicated, and we will leave it to the future. However, we will show that each π_{X,X1,λ}, for any X, X1 and λ, has at least one open symplectic leaf.

Notation 5.19. We will use π_{X,∅,λ} to denote the Poisson structure π_{X,X1,λ} when X1 is the empty set.

We first recall that the space K/T has the well-known Bruhat decomposition: Because of the Iwasawa decomposition G = KAN of G, the natural map K/T → G/B : kT ↦ kB is a diffeomorphism. Its inverse map is G/B → K/T : gB ↦ kT if g = kan is the Iwasawa decomposition of g. Thus we have

\[ K/T \cong G/B = \bigcup_{w \in W} NwB \]

as a disjoint union. The set NwB is called the Bruhat (or Schubert) cell corresponding to w ∈ W. We denote it by Σ_w. For w ∈ W, set

\[ \Phi_w = (-w\Sigma_+) \cap \Sigma_+ = \{\alpha \in \Sigma_+ : w^{-1}\alpha \in -\Sigma_+\}. \]

Set n_w = span_C{E_α : α ∈ Φ_w} and N_w = exp n_w. Then Σ_w is parametrized by N_w by the map

\[ j_w : N_w \longrightarrow \Sigma_w : \ n \longmapsto nwB. \]

Define

\[ j_1 : G \longrightarrow K : \ g = kb \longmapsto k \quad \text{for } k \in K,\ b \in AN; \]
\[ j_2 : G \longrightarrow K : \ g = bk \longmapsto k \quad \text{for } k \in K,\ b \in AN. \]

Then we have a left action of G on K by

\[ G \times K \longrightarrow K : \ (g, k) \longmapsto g \circ k := j_1(gk), \]

and a right action of G on K:

\[ K \times G \longrightarrow K : \ (k, g) \longmapsto k^g := j_2(kg). \]

The parameterization of Σ_w by N_w is then also given by

\[ j_w : N_w \longrightarrow \Sigma_w : \ n \longmapsto (n \circ \dot w)T, \]

where ẇ ∈ K is any representative of w in K.

Notation 5.20. For k ∈ K and a subgroup G1 ⊂ G, we set

\[ G_1 \circ k = \{g \circ k : g \in G_1\}, \qquad k^{G_1} = \{k^g : g \in G_1\}. \]


It is easy to show that (AN) ∘ k = k^{AN} for any k ∈ K. This set is the symplectic leaf of π_K in K through the point k (see [So, Lu-We]). Since K_X ⊂ K is a Poisson submanifold, we know that (AN) ∘ k = k^{AN} ⊂ K_X for k ∈ K_X. Moreover, if w ∈ W and if ẇ ∈ K is a representative of w in K, set C_ẇ = (AN) ∘ ẇ ⊂ K. Then

\[ C_{\dot w} = (AN) \circ \dot w = N \circ \dot w = N_w \circ \dot w = \dot w^{AN} = \dot w^{N} = \dot w^{N_{w^{-1}}}. \tag{23} \]

Its image under the projection K → K/T is the Bruhat cell Σ_w, which is also the symplectic leaf of the Bruhat Poisson structure π∞ in K/T. See [So, Lu-We].

Let X ⊂ S(Σ+). Denote by W_X the subgroup of W generated by the simple reflections corresponding to elements in X. It is the Weyl group for (m_X, h). Introduce the subset W^X of W:

\[ W^X = \{w \in W : \Phi_{w^{-1}} \subset \Sigma_+ \setminus [X]\}. \]

It follows from the definition that w ∈ W^X if and only if w([X] ∩ Σ+) ⊂ Σ+. Moreover, we have C_{ẇ1} = ẇ1^{N_X} for w1 ∈ W^X because N_{w1^{−1}} ⊂ N_X, where N_X = exp n_X with n_X given by (19).

The following lemma says that each w1 ∈ W^X is the minimal length representative for the coset w1 W_X, and that the set W^X is a "cross section" for the canonical projection from W to the coset space W/W_X. For a proof of the Lemma, see [Ko], Prop. 5.13.

Lemma 5.21. For any w ∈ W, there exist a unique w1 ∈ W^X and a unique w2 ∈ W_X such that w = w1 w2. Moreover,

\[ \Phi_{w^{-1}} = \Phi_{w_2^{-1}} \cup w_2^{-1} \Phi_{w_1^{-1}} \]

is a disjoint union, and the components on the right hand side are the respective intersections of Φ_{w^{−1}} with [X] and Σ+ \ [X]. Hence, l(w) = l(w1) + l(w2).

We can now describe the symplectic leaves of π_{X,∅,λ} in K/T.

Theorem 5.22. 1) For each w1 ∈ W^X, the union ⋃_{w2∈W_X} Σ_{w1w2} is the symplectic leaf of π_{X,∅,λ} in K/T through the point w1 ∈ K/T.
2) These are all the symplectic leaves of π_{X,∅,λ} in K/T.

Proof. Set L_{X,λ} = e^λ K_X e^{−λ} N_X = N_X e^λ K_X e^{−λ}. It is the connected subgroup of G with Lie algebra l_{X,λ} = Ad_{e^λ}(n_X + k_X). Notice that each l ∈ L_{X,λ} can be written as a unique product l = n_X e^λ k e^{−λ} for n_X ∈ N_X and k ∈ K_X.

Denote by S_{w1} the symplectic leaf of π_{X,∅,λ} through the point w1 ∈ K/T. Pick a representative ẇ1 of w1 in K. By Theorem 7.2 of [Lu2] (see also [Ka1]), the symplectic leaf S_{w1} is the image of the set ẇ1^{L_{X,λ}} under the projection K → K/T.

We define a map

\[ M : L_{X,\lambda} \longrightarrow N_{w_1^{-1}} \times K_X \]


as follows: For l = n_X e^λ k e^{−λ} ∈ L_{X,λ}, write k e^{−λ} = bk′, where b ∈ AU_X with U_X = exp u_X and k′ ∈ K_X, so that l = n_X e^λ b k′. Since the map N_{w1^{−1}} → C_{ẇ1} : n ↦ ẇ1^n is a diffeomorphism, there exists a unique n′ ∈ N_{w1^{−1}} such that ẇ1^{n′} = ẇ1^{n_X e^λ b}. Now define M(l) = (n′, k′). It is easy to see that the map M is onto and that ẇ1^l = ẇ1^{n′} k′ ∈ C_{ẇ1} K_X. This shows that

\[ \dot w_1^{L_{X,\lambda}} = C_{\dot w_1} K_X. \]

It is easy to show that the map C_{ẇ1} × K_X → C_{ẇ1} K_X : (c, k) ↦ ck is a diffeomorphism, and that the image of C_{ẇ1} K_X in K/T under the projection K → K/T is the union ⋃_{w2∈W_X} Σ_{w1w2}, which is thus the symplectic leaf of the Poisson structure π_{X,∅,λ} through the point w1 ∈ K/T. Now since

\[ K/T = \bigcup_{w_1 \in W^X} S_{w_1} \]

is already a disjoint union, we conclude that the collection {S_{w1} : w1 ∈ W^X} is that of all symplectic leaves of π_{X,∅,λ} in K/T. □

Let w1 ∈ W^X. The following proposition identifies the symplectic manifold S_{w1} = ⋃_{w2∈W_X} Σ_{w1w2}, as a symplectic leaf of π_{X,∅,λ} in K/T, with the product of two symplectic manifolds. Recall that for w ∈ W with a representative ẇ in K, the set C_ẇ ⊂ K is the symplectic leaf of π_K through the point ẇ. Recall also from Notation 5.14 the definition of the Poisson structure π^X_{∅,λ} on K_X/T. Note that it is symplectic by Proposition 5.12.

Proposition 5.23. Let w1 ∈ W^X and let ẇ1 be a representative of w1 in K. Equip C_{ẇ1} with the symplectic structure as a symplectic leaf of π_K in K; equip K_X/T with the symplectic structure π^X_{∅,λ}; and finally, equip S_{w1} with the symplectic structure as a symplectic leaf of π_{X,∅,λ}. Then the map

\[ m_1 : C_{\dot w_1} \times K_X/T \longrightarrow S_{w_1} : \ (k, k'T) \longmapsto kk'T \]

is a diffeomorphism between symplectic manifolds.

Proof. This is a direct consequence of 2) in Proposition 5.17. □

Among all the elements in W^X, there is one which is the longest. We denote this element by w^X, so l(w^X) ≥ l(w1) for all w1 ∈ W^X.

Proposition 5.24. The symplectic leaf S_{w^X} of π_{X,∅,λ} in K/T through the point w^X is open and dense.

Proof. Consider the projection K/T → K/K_X : kT ↦ kK_X. The image of Σ_{w^X} ⊂ K/T under this projection is an open dense subset (in fact a cell) in K/K_X. Since K/T → K/K_X is a fibration, we know that S_{w^X} is open and dense in K/T. □

Corollary 5.25. Each Poisson structure π_{X,∅,λ} has a finite number of symplectic leaves with at least one of them open and dense.
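Lemma 5.21 and the leaf count in Theorem 5.22 can be illustrated for K = SU(n), where W is the symmetric group acting on the roots e_i − e_j. The sketch below is an illustration under our own conventions, not the paper's: indices start at 0, the simple root α_k = e_k − e_{k+1} is represented by the integer k, a permutation w is a tuple with w[i] the image of i, and the length l(w) is the number of inversions. With these conventions w lies in W^X exactly when w[k] < w[k+1] for every k in X.

```python
from itertools import permutations


def compose(w1, w2):
    """The permutation w1 * w2, i.e. i -> w1[w2[i]]."""
    return tuple(w1[w2[i]] for i in range(len(w2)))


def length(w):
    """Number of inversions of w, which is its length in the Weyl group S_n."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def parabolic_subgroup(n, X):
    """W_X: the subgroup generated by the simple reflections s_k, k in X."""
    identity = tuple(range(n))
    generators = []
    for k in X:
        s = list(identity)
        s[k], s[k + 1] = s[k + 1], s[k]
        generators.append(tuple(s))
    group, frontier = {identity}, [identity]
    while frontier:
        w = frontier.pop()
        for s in generators:
            ws = compose(w, s)
            if ws not in group:
                group.add(ws)
                frontier.append(ws)
    return group


def minimal_coset_representatives(n, X):
    """W^X: permutations sending every positive root in [X] to a positive root."""
    return {w for w in permutations(range(n)) if all(w[k] < w[k + 1] for k in X)}


def check_factorization(n, X):
    """Verify w = w1 w2 uniquely with l(w) = l(w1) + l(w2); return |W^X|."""
    w_sub = parabolic_subgroup(n, X)
    w_reps = minimal_coset_representatives(n, X)
    for w in permutations(range(n)):
        pairs = [(w1, w2) for w1 in w_reps for w2 in w_sub if compose(w1, w2) == w]
        assert len(pairs) == 1
        w1, w2 = pairs[0]
        assert length(w) == length(w1) + length(w2)
    return len(w_reps)


# |W^X| is the number of symplectic leaves of pi_{X, empty, lambda} on K/T.
leaves_su3 = {tuple(X): check_factorization(3, X) for X in ([], [0], [1], [0, 1])}
```

For SU(3) this gives six leaves when X is empty (the Bruhat cells), three when X consists of one simple root, and a single leaf when X = S(Σ+), in line with Proposition 5.12.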


Remark 5.26. Note that the statement in Corollary 5.25 may not be true if $X_1 \neq \emptyset$, as is seen from Case 3 of Example 5.4. The description of the symplectic leaves of $\pi_{X,X_1,\lambda}$ in general is somewhat complicated. However, we have

Proposition 5.27. The Poisson structure $\pi_{X,X_1,\lambda}$ for $X = S(\Sigma_+)$ (and $X_1 \subset X$ arbitrary) is non-degenerate at every element in the Weyl group $W$ of $(K, T)$ considered as a point in $K/T$. Consequently, the symplectic leaves of $\pi_{X,X_1,\lambda}$ through these points are open.

Proof. Let $w \in W$ and let $\dot w \in K$ be a representative of $w$ in $K$. Recall from the definition of $\pi_{X,X_1,\lambda}$ that $\pi_{X,X_1,\lambda} = p_*\tilde\pi_1$, where $p : K \to K/T$ is the natural projection and $\tilde\pi_1$ is the bi-vector field on $K$ defined by

$$\tilde\pi_1 = \Lambda^R - A^L,$$

with

$$\Lambda = -\frac{i\varepsilon}{4}\sum_{\alpha\in\Sigma_+} X_\alpha \wedge Y_\alpha \quad\text{and}\quad A = -\frac{i\varepsilon}{4}\sum_{\alpha\in\Sigma_+} \frac{e^{2\alpha(\lambda)}+1}{e^{2\alpha(\lambda)}-1}\, X_\alpha \wedge Y_\alpha .$$

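Thus

$$l_{\dot w^{-1}}\tilde\pi_1(\dot w) = \mathrm{Ad}_{\dot w^{-1}}\Lambda - A = -\frac{i\varepsilon}{4}\sum_{\alpha\in\Sigma_+}\left( X_{w^{-1}\alpha}\wedge Y_{w^{-1}\alpha} + \frac{e^{2\alpha(\lambda)}+1}{e^{2\alpha(\lambda)}-1}\, X_\alpha\wedge Y_\alpha \right)$$

$$= -\frac{i\varepsilon}{4}\sum_{\alpha\in\Sigma_+,\ w\alpha<0}\left(1 + \frac{e^{2\alpha(\lambda)}+1}{e^{2\alpha(\lambda)}-1}\right) X_\alpha\wedge Y_\alpha \;-\; \frac{i\varepsilon}{4}\sum_{\alpha\in\Sigma_+,\ w\alpha>0}\left(-1 + \frac{e^{2\alpha(\lambda)}+1}{e^{2\alpha(\lambda)}-1}\right) X_\alpha\wedge Y_\alpha .$$

Since $\frac{e^{2\alpha(\lambda)}+1}{e^{2\alpha(\lambda)}-1} \neq \pm 1$, $l_{\dot w^{-1}}\pi_{X,X_1,\lambda}(\dot wT) = p_*\, l_{\dot w^{-1}}\tilde\pi_1(\dot w) \in \wedge^2 T_e(K/T)$ is non-degenerate. Hence $\pi_{X,X_1,\lambda}$ is non-degenerate at $w = \dot wT \in K/T$. □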

Corollary 5.28. For any $X$, $X_1$ and $\lambda$, the Poisson structure $\pi_{X,X_1,\lambda}$ on $K/T$ has at least one open symplectic leaf.

Proof. We use Proposition 5.16, which says that $\pi_{X,X_1,\lambda}$ can be obtained via Poisson induction from the Poisson structure $\pi^X_{X_1,\lambda}$ on $K_X/T$. Recall the definition of $\pi^X_{X_1,\lambda}$ from Notation 5.14. Since $X$ is the set of all simple roots for the root system of $(K_X, T)$, we know from Proposition 5.27 that $\pi^X_{X_1,\lambda}$ is non-degenerate at every Weyl group element in $W_X$, regarded as a point in $K_X/T$. Let $w_2 \in W_X$. Recall that $w^X$ is the longest element in the set $W^X$. Let $\dot w^X$ be any representative of $w^X$ in $K$. Recall that $C_{\dot w^X}$ is the symplectic leaf of $\pi_K$ in $K$ through $\dot w^X$. By Proposition 5.17, the map

$$(C_{\dot w^X}, \pi_K) \times (K_X/T, \pi^X_{X_1,\lambda}) \longrightarrow (K/T, \pi_{X,X_1,\lambda}) : (k, k'T) \longmapsto kk'T$$


is a Poisson map. But this map is a diffeomorphism onto its image, which is open because it is the inverse image under the natural projection $K/T \to K_X/T$ of the biggest cell in $K_X/T$. Thus the symplectic leaf of $\pi_{X,X_1,\lambda}$ through the point $\dot w^X w_2 \in K/T$ is open. □

Note that the proof of Corollary 5.28 shows that $\pi_{X,X_1,\lambda}$ is open at every point in the coset $w^X W_X \subset K/T$.

Example 5.29. Corollary 5.28 can be checked directly for the case of $\mathfrak g = sl(2, \mathbb C)$ by looking at the explicit formulas in Example 5.4.

5.7. The modular vector fields and the leaf-wise moment maps for the T-actions. For an orientable Poisson manifold $(P, \pi)$ and a given volume form $\mu$ on $P$, the modular vector field of $\pi$ associated to $\mu$ is defined to be the vector field $v_\mu$ on $P$ satisfying $v_\mu \lrcorner\, \mu = d(\pi \lrcorner\, \mu)$. It measures how Hamiltonian flows on $P$ fail to preserve $\mu$. More details can be found in [W].

Coming back to $(K, \pi_K)$-homogeneous Poisson structures on $K/T$, we set $\rho = \frac{1}{2}\sum_{\alpha\in\Sigma_+}\alpha$ for the choice of $\Sigma_+$ in the definition of $\pi_K$. Then we have $iH_\rho \in \mathfrak t$. We use $\sigma_{iH_\rho}$ to denote the infinitesimal generator of the $T$-action on $K/T$ by left translations in the direction of $iH_\rho$.

Proposition 5.30. For the Poisson structure $\pi_K$ on $K$ defined by (14) with $\Lambda$ given in (15), all $(K, \pi_K)$-homogeneous Poisson structures on $K/T$, and in particular all the $\pi_{X,X_1,\lambda}$'s, have the same modular vector field $v$, namely $v = -i\varepsilon\sigma_{iH_\rho}$, with respect to a (and thus any) $K$-invariant volume form on $K/T$.

Remark 5.31. Proposition 5.30 is a statement about any Poisson Lie group structure on $K$, since the Poisson structure $\pi_K$ on $K$ defined by (14) with $\Lambda$ given in (15) is the most general form of such structures.

Proof of Proposition 5.30. Let $\pi$ be an arbitrary $(K, \pi_K)$-homogeneous Poisson structure. Then we know that $\pi$ is the sum $\pi = \pi(e)^L + p_*\pi_K$, where $\pi(e)^L$ is the $K$-invariant bi-vector field on $K/T$ whose value at $e = eT$ is $\pi(e)$, and $p_*\pi_K$ is the projection of $\pi_K$ from $K$ to $K/T$ by $p : K \to K/T : k \mapsto kT$ (it is the Bruhat Poisson structure $\pi_\infty$ when $u = 0$ in the definition of $\Lambda$). Let $\mu$ be a $K$-invariant volume form on $K/T$. Let $b_\mu$ be the degree $-1$ operator on $\chi^\bullet(K/T)$ defined by $b_\mu(U) = (-1)^{|U|} d(U \lrcorner\, \mu)$, so that $v = b_\mu(\pi)$ [E-L-W]. Then $b_\mu(\pi) = b_\mu(\pi(e)^L) + b_\mu(p_*\pi_K)$. Since $\mu$ is $K$-invariant, the operator $b_\mu$ maps a $K$-invariant multi-vector field to another such. Hence $b_\mu(\pi(e)^L)$ must be a $K$-invariant (1-)vector field, so it must be zero. Thus $b_\mu(\pi) = b_\mu(p_*\pi_K)$. It is proved in [E-L-W] that $b_\mu(p_*\pi_K) = -i\varepsilon\sigma_{iH_\rho}$, which is therefore the modular vector field for any $\pi$. □

The modular vector field is always a Poisson vector field [W], but it is not necessarily Hamiltonian in general. For the rest of this section, we study this problem for the modular vector field $v = -i\varepsilon\sigma_{iH_\rho}$ for the Poisson structure $\pi_{X,\emptyset,\lambda}$. We will show that although $v$ is not globally Hamiltonian unless $X = S(\Sigma_+)$, it is leaf-wise, and we describe its Hamiltonian function on each leaf. In fact, since every $\pi_{X,\emptyset,\lambda}$ is $T$-invariant (for the $T$-action on $K/T$ by left translations), we will describe the moment map for the $T$-action


on each symplectic leaf of $\pi_{X,\emptyset,\lambda}$. We are particularly interested in the behavior of these moment maps when $\lambda$ goes to infinity in various directions as in Sect. 5.2.

We first look at the Bruhat Poisson structure $\pi_\infty$ corresponding to $X = \emptyset$. This case (when $\varepsilon = i$) is studied in detail in [Lu3]. We recall the results there. Denote by

$$P_A : G = KAN \longrightarrow A : g = kan \longmapsto a,$$

where $G = KAN$ is the Iwasawa decomposition of $G$ (as a real Lie group). For each $w \in W$, choose a representative $\dot w \in K$ of $w$ in $K$, and use

$$j_w : N_w \longrightarrow \Sigma_w : n \longmapsto (n \circ \dot w)T$$

to parameterize the Bruhat cell $\Sigma_w$. For $n \in N_w$, let $a_w(n) = P_A(n\dot w) \in A$. The element $a_w(n)$ is independent of the choice of $\dot w$, so we have a well-defined map

$$a_w : N_w \longrightarrow A : n \longmapsto a_w(n).$$

Denote by $\Omega_w$ the symplectic structure on $\Sigma_w$ as a symplectic leaf of $\pi_\infty$. Then each $(\Sigma_w, \Omega_w)$ is a Hamiltonian $T$-space. The following fact is proved in [Lu3].

Proposition 5.32. The map

$$\varphi_w : \Sigma_w \longrightarrow \mathfrak t^* : \quad \langle \varphi_w, x\rangle(kT) = \frac{2i}{\varepsilon}\,\mathrm{Im}\,\langle\!\langle \mathrm{Ad}_{\dot w}\log a_w(j_w^{-1}(kT)),\, x\rangle\!\rangle, \qquad x \in \mathfrak t,$$

is the moment map for the $T$-action on $(\Sigma_w, \Omega_w)$ such that $\varphi_w(w) = 0$.

In [Lu3], we have written down an explicit formula for $\varphi_w$ in certain Bott-Samelson type coordinates $\{z_1, \bar z_1, z_2, \bar z_2, \ldots, z_{l(w)}, \bar z_{l(w)}\}$. It takes the form

$$\langle \varphi_w, x\rangle = -\frac{1}{\varepsilon}\sum_{j=1}^{l(w)} \frac{2\alpha_j(x)}{\langle\!\langle \alpha_j, \alpha_j\rangle\!\rangle}\,\log(1 + |z_j|^2),$$

where $\{\alpha_1, \alpha_2, \ldots, \alpha_{l(w)}\} = \Sigma_+ \cap (-w\Sigma_+)$. In particular, letting $x = -i\varepsilon(iH_\rho) = \varepsilon H_\rho$, we get a Hamiltonian function for the vector field $v = -i\varepsilon\sigma_{iH_\rho}$ on $(\Sigma_w, \Omega_w)$ as

$$\langle \varphi_w, \varepsilon H_\rho\rangle = -\sum_{j=1}^{l(w)} \frac{2\langle\!\langle \rho, \alpha_j\rangle\!\rangle}{\langle\!\langle \alpha_j, \alpha_j\rangle\!\rangle}\,\log(1 + |z_j|^2).$$

This function goes to $-\infty$ as $|z_j| \to \infty$, which corresponds to the boundary of $\Sigma_w$. Thus, the modular vector field $v$ cannot be globally Hamiltonian on $K/T$.

Next, we look at the case when $X = S(\Sigma_+)$, so $\pi_{X,\emptyset,\lambda} = \pi_\lambda$ is the symplectic structure on $K/T$ obtained by identifying $K/T$ with the dressing orbit in the group $AN$ through the point $e^{-\lambda}$ (see Proposition 5.12). Since $K/T$ is simply connected, the $T$-action on $K/T$ is Hamiltonian. The following fact is proved in [L-R].

Proposition 5.33. The moment map for the $T$-action on $(K/T, \pi_\lambda)$ is given by

$$\Phi_\lambda : K/T \longrightarrow \mathfrak t^* : \quad \langle \Phi_\lambda, x\rangle(kT) = \frac{2i}{\varepsilon}\,\mathrm{Im}\,\langle\!\langle \log(P_A(ke^{-\lambda}k^{-1})),\, x\rangle\!\rangle, \qquad x \in \mathfrak t.$$


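Remark 5.34. This fact plays the key role in the symplectic proof of Kostant's nonlinear convexity theorem given in [L-R].

Corresponding to the fact that $\lim_{t\to+\infty}\pi_{\lambda+t\check\rho} = \pi_\infty$, where $\check\rho$ is the sum of all fundamental co-weights, the two moment maps are related as follows.

Proposition 5.35. For any $\lambda \in \mathfrak a$, $w \in W$ and $kT \in \Sigma_w$,

$$\lim_{t\to+\infty}\left(\Phi_{\lambda+t\check\rho}(kT) - \Phi_{\lambda+t\check\rho}(w)\right) = \varphi_w(kT), \qquad \lim_{t\to+\infty} d\Phi_{\lambda+t\check\rho}(kT) = d\varphi_w(kT).$$

Proof. Using the parameterization of $\Sigma_w$ by $N_w$, we regard both $\Phi_{\lambda+t\check\rho}|_{\Sigma_w}$ and $\varphi_w$ as ($\mathfrak t^*$-valued) functions on $N_w$. Let $n \in N_w$ with $k = n \circ \dot w$. Write

$$n\dot w = k\,a_w(n)\,m(n)$$

with $m(n) \in N_w$. Then

$$e^{-\lambda}k^{-1} = \left(e^{-\lambda}a_w(n)m(n)a_w(n)^{-1}e^{\lambda}\dot w^{-1}\right)\left(\dot w e^{-\lambda}a_w(n)\dot w^{-1}\right)n^{-1}.$$

Thus, for any $x \in \mathfrak t$,

$$\langle \Phi_{\lambda+t\check\rho}(n) - \Phi_{\lambda+t\check\rho}(e) - \varphi_w(n),\, x\rangle = \frac{2i}{\varepsilon}\,\mathrm{Im}\,\langle\!\langle \log P_A\!\left(e^{-\lambda-t\check\rho}a_w(n)m(n)a_w(n)^{-1}e^{\lambda+t\check\rho}\dot w^{-1}\right),\, x\rangle\!\rangle,$$

where $e \in N_w$ is the identity element. Consider now the map

$$\psi_t : N_w \longrightarrow N_w : m \longmapsto e^{-\lambda-t\check\rho}\,m\,e^{\lambda+t\check\rho}.$$

Under the identification of $\mathfrak n_w$ with $N_w$ by the exponential map of $N_w$, this is the linear map $\mathrm{Ad}_{e^{-\lambda-t\check\rho}}$ on $\mathfrak n_w$, which goes to 0 as $t \to +\infty$. Thus

$$\lim_{t\to+\infty}\psi_t(m) = 0 \quad\text{and}\quad \lim_{t\to+\infty} d\psi_t(m) = 0$$

for all $m \in N_w$. But we have the composition of maps

$$\langle \Phi_{\lambda+t\check\rho}(n) - \Phi_{\lambda+t\check\rho}(e) - \varphi_w(n),\, x\rangle = \eta_x(\psi_t(\xi(n))),$$

where $\eta_x : N_w \to \mathbb R : m \mapsto \frac{2i}{\varepsilon}\,\mathrm{Im}\,\langle\!\langle \log P_A(m\dot w^{-1}),\, x\rangle\!\rangle$ and $\xi : N_w \to N_w : n \mapsto a_w(n)m(n)a_w(n)^{-1}$. Thus the two limits in Proposition 5.35 hold. □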

Now consider the general case of $\pi_{X,\emptyset,\lambda}$. Recall that the symplectic leaves of $\pi_{X,\emptyset,\lambda}$ in $K/T$ are indexed by elements in $W^X$. We keep the notation in Proposition 5.23, in which we have used the map $m_1$ to identify the symplectic leaf $S_{w_1}$ of $\pi_{X,\emptyset,\lambda}$ in $K/T$ with the product symplectic manifold $C_{\dot w_1} \times K_X/T$. We use the projection map $C_{\dot w_1} \to \Sigma_{w_1} : k \mapsto kT$ to identify $C_{\dot w_1}$ and $\Sigma_{w_1}$. This identification is $T$-equivariant if we equip $C_{\dot w_1}$ with the $T$-action

$$T \times C_{\dot w_1} \longrightarrow C_{\dot w_1} : t \cdot k \longmapsto tk(\dot w_1^{-1}t^{-1}\dot w_1).$$


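Equip $C_{\dot w_1} \times K_X/T$ with the $T$-action

$$T \times (C_{\dot w_1} \times K_X/T) \longrightarrow C_{\dot w_1} \times K_X/T : t \cdot (k, k'T) \longmapsto \left(tk(\dot w_1^{-1}t^{-1}\dot w_1),\ \dot w_1^{-1}t\dot w_1 k'T\right).$$

Then the map $m_1$ in Proposition 5.23 is $T$-equivariant. Denote by $\Phi_{\lambda,X}$ the moment map for the $T$-action on $(K_X/T, \pi^X_{\emptyset,\lambda})$. Then the moment map for the $T$-action on $S_{w_1} \cong C_{\dot w_1} \times K_X/T$ is given by

$$\langle \varphi_{\lambda,X,w_1}(k, k'T),\, x\rangle = \langle \varphi_{w_1}(kT),\, x\rangle + \langle \Phi_{\lambda,X}(k'T),\, \mathrm{Ad}_{\dot w_1^{-1}}x\rangle$$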

for all $x \in \mathfrak t$.

Remark 5.36. There remain many problems to be addressed concerning the Poisson structures $\pi_{X,X_1,\lambda}$. Other than the description of their symplectic leaves in the general case, one can try to compute their Poisson cohomology according to the theory developed in [Lu2]. One can also study the $K$-invariant Poisson harmonic forms [E-L1] of $\pi_{X,X_1,\lambda}$. Another problem is to construct the symplectic groupoids for $\pi_{X,X_1,\lambda}$. We hope to treat these problems in the future.

Acknowledgement. The author would like to thank P. Etingof for explaining to her the results in [E-V] and Professors V. Drinfeld, S. Evens, Y. Kosmann-Schwarzbach, A. Weinstein and P. Xu for helpful discussions. She would also like to thank the Mathematics Department of Hong Kong University of Science and Technology for its hospitality. Special thanks to the referee for useful comments.

References

[B-D] Belavin, A. and Drinfeld, V.: Solutions of the classical Yang–Baxter equations for simple Lie algebras. Funct. Anal. Appl. 16, 159–180 (1982)
[D1] Drinfeld, V. G.: Hamiltonian structures on Lie groups, Lie bialgebras and the geometric meaning of the classical Yang–Baxter equations. Soviet Math. Dokl. 27 (1), 68–71 (1983)
[D2] Drinfeld, V.: Quantum groups. Proc. Intern. Congr. Math., Berkeley, 1, 1986, pp. 798–820
[D3] Drinfeld, V. G.: On Poisson homogeneous spaces of Poisson-Lie groups. Theo. Math. Phys. 95 (2), 226–227 (1993)
[D-P] De Concini, C. and Procesi, C.: Complete symmetric varieties. In Invariant Theory (Montecatini, 1982), Lecture Notes in Math., Vol. 996, Berlin–New York: Springer, 1983, pp. 1–44
[E-V] Etingof, P. and Varchenko, A.: Geometry and classification of solutions of the classical dynamical Yang–Baxter equation. Commun. Math. Phys. 192, 77–120 (1998)
[E-L-W] Evens, S., Lu, J.-H., and Weinstein, A.: Transverse measures, the modular class, and a cohomology pairing for Lie algebroids. Quarterly J. Math. 50, 417–436 (1999)
[E-L1] Evens, S., and Lu, J.-H.: Poisson harmonic forms, the Kostant harmonic forms, and the $S^1$-equivariant cohomology of $K/T$. Adv. Math. 142, 171–220 (1999)
[E-L2] Evens, S., and Lu, J.-H.: On the variety of Lagrangian subalgebras. Preprint, 1999
[F] Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Proceedings of the ICM, Zurich, 1994
[Ka1] Karolinsky, E.: The symplectic leaves on Poisson homogeneous spaces of Poisson-Lie groups. Mathematical Physics, Analysis, Geometry 2 No. 3/4 (in Russian), 306–311 (1999)
[Ka2] Karolinsky, E.: The classification of Poisson homogeneous spaces of compact Poisson Lie groups. Mathematical Physics, Analysis, and Geometry 3 No. 3/4 (in Russian), 274–289 (1996)
[Ka3] Karolinsky, E.: A classification of Poisson homogeneous spaces of complex reductive Poisson Lie groups. math.QA/9901073
[Ko] Kostant, B.: Lie algebra cohomology and generalized Schubert cells. Ann. of Math. 77 (1), 72–144 (1963)
[L-X] Liu, Z.-J., Xu, P.: Dirac structures and dynamical r-matrices. Preprint.
[Lu-We] Lu, J. H., Weinstein, A.: Poisson Lie groups, dressing transformations, and Bruhat decompositions. J. Diff. Geom. 31, 501–526 (1990)
[Lu1] Lu, J. H.: Multiplicative and affine Poisson structures on Lie groups. UC Berkeley thesis, 1990
[L-R] Lu, J. H., Ratiu, T.: On the nonlinear convexity theorem of Kostant. J. of AMS 4, No. 2, 349–363 (1991)
[Lu2] Lu, J. H.: Poisson homogeneous spaces and Lie algebroids associated to Poisson actions. Duke Math. J. 86, No. 2, 261–304 (1997)
[Lu3] Lu, J. H.: Coordinates on Schubert cells, Kostant's harmonic forms, and the Bruhat Poisson structure on G/B. Trans. Groups 4, No. 4, 355–374 (1999)
[O-S] Oshima, T., and Sekiguchi, J.: Eigenspaces of invariant differential operators on an affine symmetric space. Inventiones Math. 57, 1–81 (1980)
[Sc] Schiffmann, O.: On classification of dynamical r-matrices. Math. Res. Letters 5, 13–30 (1998)
[STS1] Semenov-Tian-Shansky, M. A.: What is a classical r-matrix? Funct. Anal. Appl. 17 (4), 259–272 (1983)
[STS2] Semenov-Tian-Shansky, M. A.: Dressing transformations and Poisson Lie group actions. Publ. RIMS, Kyoto University 21, 1237–1260 (1985)
[Se] Serre, J.-P.: Complex semisimple Lie algebras. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[Sh] Sheu, A.: Quantization of the Poisson SU(2) and its Poisson homogeneous space, the 2-sphere. Commun. Math. Phys. 135, 217–232 (1991)
[So] Soibelman, Y.: The algebra of functions on a compact quantum group, and its representations. St. Petersburg Math. J. 2 (1), 161–178 (1991)
[W] Weinstein, A.: The modular automorphism group of a Poisson manifold. J. Geom. Phys. 23, 379–394 (1997)

Communicated by T. Miwa

Commun. Math. Phys. 212, 371 – 394 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Renormalization and Periodic Orbits for Hamiltonian Flows

Juan J. Abad, Hans Koch∗

Department of Mathematics, University of Texas at Austin, Austin, TX 78712, USA

Received: 5 October 1999 / Accepted: 2 February 2000

Abstract: We consider a renormalization group transformation $\mathcal R$ for analytic Hamiltonians in two or more dimensions, and use this transformation to construct invariant tori, as well as sequences of periodic orbits with rotation vectors approaching that of the invariant torus. The construction of periodic and quasiperiodic orbits is limited to near-integrable Hamiltonians. But as a first step toward a non-perturbative analysis, we extend the domain of $\mathcal R$ to include any Hamiltonian for which a certain non-resonance condition holds.

∗ Supported in part by the National Science Foundation under Grant No. DMS-9705095.

1. Introduction and Results

In this paper we complement and extend the results given in [15], by using a renormalization group (RG) transformation to construct sequences of periodic orbits for near-integrable Hamiltonians, and by extending the domain of this transformation to a larger set of Hamiltonians. The construction of periodic orbits that approximate quasiperiodic motion is a canonical application of RG ideas; see for example [1, 7, 9, 11, 18, 21]. It relates observed universal accumulation rates to eigenvalues of the linearized RG transformation. This part of our analysis is restricted to near-integrable Hamiltonians, since RG fixed points relevant to critical cases have not yet been obtained rigorously. But the first part, which includes the definition of the RG transformation, does not require near-integrability. The work presented here is essentially self-contained, but for a motivation of some of our choices, and other background material, the reader is referred to [15].

We start with some definitions. On $\mathbb C^d$, consider the two norms $|v| = \sum_j |v_j|$ and $\|v\| = \max_j |v_j|$. Let $V$ and $W$ be two fixed but arbitrary $d \times d$ matrices over $\mathbb C$, satisfying $V^T W = I$, where $V^T$ denotes the transpose of $V$. Define $\mathcal D_{\rho,1} = \{q \in \mathbb C^d : |V\,\mathrm{Im}\,q| < \rho\}$ and $\mathcal D_{\rho,2} = \{p \in \mathbb C^d : \|Wp\| < \rho\}$, for every $\rho > 0$. Unless stated otherwise, we will identify $\mathcal D_{\rho,1}$ with a complexified $d$-torus. In particular, a function on $\mathcal D_\rho = \mathcal D_{\rho,1} \times \mathcal D_{\rho,2}$ is assumed to be $2\pi$-periodic in each component of its first argument. If analytic, such a function $H$ may be written as a Fourier–Taylor series

$$H(q, p) = \sum_{(\nu,\alpha)\in I} H_{\nu,\alpha}\,(Wp)^\alpha e^{iq\cdot\nu}, \qquad (q, p) \in \mathcal D_\rho, \tag{1.1}$$

where $I = \mathbb Z^d \times \mathbb N^d$. Here, and in what follows, $x \cdot y$ denotes the standard dot product of two vectors in $\mathbb C^d$, and $x^\alpha = x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_d^{\alpha_d}$.

Definition 1.1. Given any $\rho > 0$, define $\mathcal A_\rho$ to be the Banach space of all analytic Hamiltonians $H$ on $\mathcal D_\rho$ of the form (1.1), for which the norm

$$|H|_\rho = \sum_{(\nu,\alpha)\in I} |H_{\nu,\alpha}|\,\rho^{|\alpha|}e^{\rho\|W\nu\|} \tag{1.2}$$

is finite. The identity operator on $\mathcal A_\rho$ will be denoted by $\mathbb I$. On the product space $\mathcal A_\rho^d$ we define the two norms $|f|_\rho = \sum_j |f_j|_\rho$ and $\|f\|_\rho = \max_j |f_j|_\rho$. Let $\mathcal A'_\rho$ be the space of all functions in $\mathcal A_\rho$ whose first partial derivatives belong to $\mathcal A_\rho$. On this space, we consider the following two seminorms: Denoting by $\nabla_j H$ the partial gradient of $H$ with respect to the $j$th argument, $j = 1, 2$, we define $|H|'_\rho$ and $\|H\|'_\rho$ to be the sum and maximum, respectively, of the numbers $\|W\nabla_1 H\|_\rho$ and $|V\nabla_2 H|_\rho$.

Our first result is concerned with the possibility of finding a canonical change of variables that eliminates the component of a Hamiltonian in the direction of some predefined subspace of $\mathcal A_\rho$. The coordinate changes are restricted to those canonical transformations $U : (q, p) \mapsto (q + Q, p + P)$ for which the one-form $P \cdot dq + (p + P) \cdot dQ$ is not just closed, but the differential of a function $S$. The function $\varphi = p \cdot Q - S$, expressed in terms of $q$ and $p + P$ (if possible), satisfies the equation

$$\left(Q(q, p),\, P(q, p)\right) = (J\nabla\varphi)\left(q,\, p + P(q, p)\right), \tag{1.3}$$

where $J(q, p) = (p, -q)$. Conversely, given $\varphi \in \mathcal A_\rho$ sufficiently small, we can use (1.3) to define a canonical transformation $U = U_\varphi$. Denote by $\widehat H$ the Hamiltonian vector field associated with a Hamiltonian $H$, that is, $\widehat H f = (J\nabla H) \cdot \nabla f$.

Theorem 1.2. Let $\rho, c > 0$, and let $\mathcal H$ be some non-empty open set of Hamiltonians $H \in \mathcal A_\rho$ satisfying $|H|_\rho < c$. Let $\mathbb I^-$ be a projection operator on $\mathcal A_{\rho'}$, where $0 < \rho' < \rho$, and assume that $\mathbb I^-$ satisfies the following non-resonance condition: There are constants $a, b > 0$, such that for all values of $r$ in the interval $[\rho', (\rho' + 8\rho)/9]$,

(a) if $f \in \mathcal A_r$, then $\mathbb I^- f \in \mathcal A_r$ and $|\mathbb I^- f|_r \le a|f|_r$;
(b) $|\mathbb I^- H|_\rho < ab^2c(1 - \rho'/\rho)^2/1231$, for all $H \in \mathcal H$;
(c) for every $H \in \mathcal H$, $\mathbb I^-\widehat H$ maps $\mathbb I^-\mathcal A'_r$ onto $\mathbb I^-\mathcal A_r$, and if $\varphi \in \mathbb I^-\mathcal A'_r$ is nonzero, then $|\mathbb I^-\widehat H\varphi|_r > abc\rho^{-1}\|\varphi\|'_r$.

Then there exists a map $U$ that assigns to each $H \in \mathcal H$ a canonical transformation $U_H$ from $\mathcal D_{\rho'}$ to $\mathcal D_{(\rho'+\rho)/2}$, such that

$$H \circ U_H \in \mathcal A_{\rho'}, \qquad \mathbb I^-(H \circ U_H) = 0. \tag{1.4}$$

The map $\mathcal N : H \mapsto H \circ U_H$ is analytic from $\mathcal H$ to $\mathcal A_{\rho'}$. Furthermore, if $H \in \mathcal H$ satisfies $\mathbb I^- H = 0$, then $U_H = I$ and $D\mathcal N(H) = \mathbb I^+ - \mathbb I^+\widehat H\left(\mathbb I^-\widehat H\,\mathbb I^-\right)^{-1}\mathbb I^-$, where $\mathbb I^+ = \mathbb I - \mathbb I^-$.


This theorem was proved in [15], in the special case where $\mathcal H$ is a small neighborhood of a linear Hamiltonian $(q, p) \mapsto \omega \cdot p$, and $\mathbb I^-$ is a specific projection adapted to the choice of the vector $\omega$. A similar projection will be considered again here. Its usefulness for renormalization derives from a combination of Theorem 1.2, and Lemma 1.3 below.

Given a nonzero vector $\omega \in \mathbb R^d$, and two positive constants $\sigma$ and $\kappa$, let

$$I^- = \left\{(\nu, \alpha) \in I : |\omega \cdot \nu| > \sigma\|W\nu\| + \kappa|\alpha|\right\}, \qquad I^+ = I \setminus I^-, \tag{1.5}$$

and define $\mathbb I^\pm H$ by restricting the sum in (1.1) to the corresponding index sets $I^\pm$,

$$\mathbb I^\pm H(q, p) = \sum_{(\nu,\alpha)\in I^\pm} H_{\nu,\alpha}\,(Wp)^\alpha e^{iq\cdot\nu}. \tag{1.6}$$

The functions $\mathbb I^- H$ and $\mathbb I^+ H$ will be referred to as the non-resonant and resonant part, respectively, of $H$. Clearly, the projection $\mathbb I^-$ satisfies the condition (a) of Theorem 1.2, with $a = 1$. Notice that if $H(q, p)$ depends on $p$ only, then $\mathbb I^- H = 0$.

For the purpose of renormalization, we now focus on vectors $\omega = (1, \omega_2, \ldots, \omega_d)$ whose components span a real algebraic number field of degree $d$. In particular, $\omega \cdot \nu \neq 0$ for every nonzero vector $\nu$ in $\mathbb Z^d$. Another consequence of this assumption [15] is that there exists an integral $d \times d$ matrix $T$ such that

(T1) $T$ has a simple real eigenvalue $\vartheta > 1$, and $T\omega = \vartheta\omega$.
(T2) All other eigenvalues of $T$ are simple, and of modulus less than 1.
(T3) $\det(T) = \pm 1$.

Such a matrix $T$ provides a way of approximating $\omega$ by vectors with rational components: If $w \in \mathbb Q^d$ is nonzero, then the vector $T^n w$, when rescaled such that its first component is one, approaches $\omega$ as $n \to \infty$. The same approximating sequences can also be found in some Hamiltonian systems [1, 3, 9–13, 18, 21, 26], in the form of periodic orbits that accumulate at an invariant $\omega$-torus. In order to investigate this Hamiltonian "representation" of the arithmetic related to $\omega$, we "lift" the inverse of $T$, viewed as a map on frequency vectors, to a transformation acting on a space of Hamiltonians. A transformation that has some of the required properties is $H \mapsto \mu^{-1}H \circ \mathcal T_\mu$, where

$$\mathcal T_\mu(q, p) = \left(Tq,\ \mu(T^*)^{-1}p\right), \qquad \mu \neq 0. \tag{1.7}$$

It combines a canonical change of variables (case $\mu = 1$) with a scaling in $p$, and is part of most RG schemes for Hamiltonians [1, 4–6, 15, 17, 19, 20]. By itself, this transformation is not a dynamical system on any of the spaces $\mathcal A_\rho$, since the domain $\mathcal D_\rho$ is not left invariant by $\mathcal T_\mu$. But we can combine it with Theorem 1.2: The following lemma shows that a canonical change of variables that eliminates the non-resonant part of a Hamiltonian can "transfer analyticity" from the variable $p$ to the variable $q$.

After fixing an integral $d \times d$ matrix $T$ satisfying (T1–T3), we adapt the matrices $W$ and $V$ to our choice of $T$, by assuming that the row vectors $W_1, W_2, \ldots, W_d$ of $W$ are an ordered basis of eigenvectors for $T$,

$$TW_j = \vartheta_j W_j, \qquad |\vartheta_j| \le |\vartheta_i|, \qquad 1 \le i \le j \le d, \tag{1.8}$$

with $W_1 = \omega$. In addition we fix the parameters $\sigma, \kappa > 0$, with $\sigma$ restricted by the condition $|\vartheta_2| + \sigma(\vartheta - |\vartheta_2|) < 1$.
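The rational approximation property of a matrix satisfying (T1–T3) is easy to observe numerically. As a hedged illustration (the specific matrix is our assumption, not a choice made in the text), take $d = 2$ and $T = \begin{pmatrix}1 & 1\\ 1 & 0\end{pmatrix}$, whose eigenvalues are the golden mean $\vartheta$ and $-\vartheta^{-1}$, with $\det T = -1$ and $\omega = (1, \vartheta^{-1})$. The sketch below iterates $T$ on an integer vector and rescales so that the first component is one.

```python
from fractions import Fraction
import math

def apply_T(v):
    """Apply the integer matrix T = [[1, 1], [1, 0]] (assumed example)."""
    return (v[0] + v[1], v[0])

def approximants(w, n):
    """Second components of T^k w, k = 1..n, after rescaling the first to 1."""
    result = []
    v = w
    for _ in range(n):
        v = apply_T(v)
        result.append(Fraction(v[1], v[0]))
    return result

theta = (1 + math.sqrt(5)) / 2
ratios = approximants((1, 0), 30)
# Distance of each rescaled iterate from omega_2 = 1/theta.
errors = [abs(float(r) - 1 / theta) for r in ratios]
```

Starting from $w = (1, 0)$, the rescaled iterates are ratios of consecutive Fibonacci numbers, and `errors` decreases monotonically toward zero.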


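Lemma 1.3. Let $0 < \rho' < \rho$ and $\mu \in \mathbb C$ be given such that

$$|\vartheta_2| + \sigma\left(\vartheta - |\vartheta_2|\right) < \frac{\rho'}{\rho}, \qquad 0 < \left|\frac{\mu}{\vartheta_d}\right| e^{\rho\kappa(\vartheta - |\vartheta_2|)} < \frac{\rho'}{\rho}. \tag{1.9}$$

Then every function $H \in \mathbb I^+\mathcal A_{\rho'}$ extends analytically to $\mathcal T_\mu\mathcal D_\rho$, and $H \mapsto H \circ \mathcal T_\mu$ is a compact linear map from $\mathbb I^+\mathcal A_{\rho'}$ to $\mathcal A_\rho$, whose operator norm is $\le 1$.

Let now $\mathcal H$ be some subset of $\mathcal A_\rho$ for which the given projection $\mathbb I^-$ satisfies a non-resonance condition, as described in Theorem 1.2.

Definition 1.4. Given a nonzero complex number $\mu$ of modulus less than $|\vartheta_d|$, define

$$\mathcal R_\mu(H) = \frac{\eta}{\mu}\,H \circ U_H \circ \mathcal T_\mu, \qquad H \in \mathcal H, \tag{1.10}$$

where $\eta = \eta(H)$ is determined such that the coefficient $\widetilde H_{0,(1,0,\ldots,0)}$ of the renormalized Hamiltonian $\widetilde H = \mathcal R_\mu(H)$ is equal to one, if this is possible.

The action of this transformation is particularly simple when restricted to functions of the form

$$h_w(q, p) = w \cdot p, \qquad w \in \mathbb C^d. \tag{1.11}$$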

In particular, it is independent of the choice of $\mu$, since $h_w$ is invariant under the scaling $\mathcal S_z : H \mapsto z^{-1}H(\cdot, z\,\cdot)$, for any $z \neq 0$. More explicitly, if $w = \beta_1\omega + \beta_2 W_2 + \ldots + \beta_d W_d$, with $\beta_1$ nonzero, then $\mathcal R_\mu(h_w) = h_{w'}$, where $w' = \eta T^{-1}w$ and $\eta = \vartheta/\beta_1$. This shows e.g. that $h_\omega$ is a (trivial) fixed point of $\mathcal R_\mu$. For particular choices of the scaling parameter $\mu$, there are other trivial fixed points as well. Such fixed points can be found easily by restricting $\mathcal R_\mu$ to the space of Hamiltonians $H \in \mathcal A_\rho$ for which $\nabla_1 H = 0$, and thus $U_H = I$.

Theorem 1.2 and Lemma 1.3 can be combined as follows. (Additional symmetries of $\mathcal R_\mu$ are mentioned at the end of Sect. 3.)

Theorem 1.5. Let $0 < \rho < \sigma/\kappa$. If $\rho' < \rho$ is sufficiently close to $\rho$, and $\mu \in \mathbb C$ satisfies (1.9), then there exists an open neighborhood $\mathcal H$ of $\{H \in \mathbb I^+\mathcal A'_s : H_{0,0} = 0,\ |H - h_\omega|'_s < \kappa\rho\}$ in $\mathcal A_\rho$, where $s = (\rho' + 8\rho)/9$, such that the transformation $\mathcal R_\mu$ is well defined, analytic, and compact, as a map from $\mathcal H$ to $\mathcal A_\rho$. The same holds for $\mathcal R_{z\mu} = \mathcal S_z \circ \mathcal R_\mu$, for all $z$ in some open neighborhood $Z \subset \mathbb C$ of the unit circle, and $\mathcal S_z \circ \mathcal R_\mu = \mathcal R_\mu \circ \mathcal S_z$, for all $z \in Z$.

This theorem is obtained by verifying conditions (b) and (c) of Theorem 1.2 for the given domain $\mathcal H$. Since the rest of this paper will deal with near-integrable Hamiltonians only, we have centered $\mathcal H$ at the trivial fixed point $h_\omega$, which simplifies the task of verifying (c). Condition (b) is satisfied by taking $b > 0$ small, i.e., we work with near-resonant Hamiltonians. This is not as restrictive or unusual as it may seem. The same can be achieved e.g. in the neighborhood of a Hamiltonian $F$ that is near-resonant modulo a canonical change of coordinates $U$, by considering the modified transformation $H \mapsto \mathcal R_\mu(H \circ U)$. An example of such a pair $(F, U)$ could be the approximate RG fixed point $F$ found in [1], and an approximation $U$ for the corresponding canonical transformation $U_F$.

As mentioned earlier, $\mathcal R_\mu$ acts trivially on Hamiltonians that only depend on the action variable $p$.
This makes it possible to compute all eigenvalues and eigenvectors of the derivative $D\mathcal R_\mu(h_\omega)$ of $\mathcal R_\mu$ at the fixed point $h_\omega$. The eigenvectors are precisely the monomials $(q, p) \mapsto (Wp)^\alpha$, and the point 0 in the spectrum of $D\mathcal R_\mu(h_\omega)$ corresponds to functions with torus-average zero; see [15] for details. In what follows, $\rho$ is a fixed positive real number less than $\sigma/\kappa$, and the scaling parameter $\mu$ is assumed to be real, satisfying

$$0 < \mu < \frac{\vartheta_d^2}{\vartheta_1}. \tag{1.12}$$

In this case, $h_\omega$ is an isolated fixed point of $\mathcal R_\mu$, and $D\mathcal R_\mu(h_\omega)$ has precisely $d$ eigenvalues outside the open unit disk. One of them is $\lambda_0 = \vartheta/\mu$, associated with constant Hamiltonians, and the other $d - 1$ eigenvalue-eigenvector pairs are

$$\lambda_j = \vartheta_1/\vartheta_j, \qquad h_{W_j}(q, p) = W_j \cdot p, \qquad j = 2, \ldots, d. \tag{1.13}$$
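The action of the renormalization transformation on the linear Hamiltonians $h_w$, namely $w \mapsto w' = \eta T^{-1}w$ with $\eta = \vartheta/\beta_1$, together with the expanding eigenvalue $\lambda_2 = \vartheta_1/\vartheta_2$ of (1.13), can be checked in a small example. The sketch below is an illustration under our own assumptions: $d = 2$, the symmetric matrix $T = \begin{pmatrix}1 & 1\\ 1 & 0\end{pmatrix}$, $\omega = (1, \vartheta^{-1})$ and $W_2 = (1, -\vartheta)$; the function names are hypothetical.

```python
import math

THETA = (1 + math.sqrt(5)) / 2     # theta_1, expanding eigenvalue of T
THETA2 = -1 / THETA                # theta_2, contracting eigenvalue of T
OMEGA = (1.0, 1 / THETA)           # T omega = theta_1 omega
W2 = (1.0, -THETA)                 # T W2 = theta_2 W2

def T_inverse(w):
    """Inverse of T = [[1, 1], [1, 0]], i.e. [[0, 1], [1, -1]]."""
    return (w[1], w[0] - w[1])

def coefficients(w):
    """Solve w = b1 * OMEGA + b2 * W2 by Cramer's rule."""
    det = OMEGA[0] * W2[1] - W2[0] * OMEGA[1]
    b1 = (w[0] * W2[1] - W2[0] * w[1]) / det
    b2 = (OMEGA[0] * w[1] - w[0] * OMEGA[1]) / det
    return b1, b2

def renormalize(w):
    """Map w to w' = eta * T^{-1} w with eta = theta_1 / b1 (requires b1 != 0)."""
    b1, _ = coefficients(w)
    eta = THETA / b1
    tw = T_inverse(w)
    return (eta * tw[0], eta * tw[1])

s = 1e-3
w = (OMEGA[0] + s * W2[0], OMEGA[1] + s * W2[1])   # small perturbation of omega
b1_new, b2_new = coefficients(renormalize(w))
growth = b2_new / s                                # compare with theta_1 / theta_2
```

In this example $\omega$ is mapped to itself, while the component along $W_2$ is multiplied by $\vartheta_1/\vartheta_2 = -\vartheta^2$ at each step, so $h_\omega$ is a fixed point with one expanding direction among the linear Hamiltonians.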

The corresponding local unstable manifold $W^u$ of $h_\omega$ is simply the $d$-dimensional affine subspace of $\mathcal A_\rho$ that is tangent to the expanding eigenspace at $h_\omega$.

As usual in the theory of renormalization, the transformation $\mathcal R_\mu$ is merely a tool for constructing and analyzing certain objects that are of interest outside this theory. We start with a discussion of the local stable manifold $W^s$ of $\mathcal R_\mu$ at the fixed point $h_\omega$. By definition, if $H$ lies on $W^s$, then the sequence of Hamiltonians $H_n = \mathcal R_\mu^n(H)$ converges to $h_\omega$, as $n \to \infty$. This fact can be used to define a sequence of canonical transformations

$$\mathcal V_n(H) = V_0(H) \circ V_1(H) \circ \ldots \circ V_{n-1}(H), \qquad V_k(H) = \mathcal T_\mu^k \circ U_{H_k} \circ \mathcal T_\mu^{-k}. \tag{1.14}$$

Formally, $H \circ \mathcal V_n(H)$ approaches a constant multiple of $h_\omega$, as $n$ tends to infinity. But we cannot expect convergence on an open subset of phase space, unless $H$ is integrable. One of the things that can be extracted from the transformations $\mathcal V_n(H)$ is the limit of $\mathcal V_n(H) \circ \Upsilon$, as $n \to \infty$, where $\Upsilon(q) = (q, 0)$. This limit yields the function $\Gamma_H$ described below.

Theorem 1.6. Given two positive real numbers $r'$ and $r > r' + \rho$, there exists an open neighborhood $B$ of $h_\omega$ in $\mathcal A_r$, and for every $H \in B$ a complex number $c_H$ and an analytic function $\Gamma_H : \mathcal D_{r',1} \to \mathcal D_r$, such that the following holds. If $H \in W^s \cap B$ then

$$J\nabla H \circ \Gamma_H = c_H\,\omega \cdot \nabla\Gamma_H \quad\text{on } \mathcal D_{r',1}. \tag{1.15}$$

For $H = h_\omega$, the values of $c_H$ and $\Gamma_H$ are 1 and $\Upsilon$, respectively. Furthermore, $H \mapsto c_H$ is an analytic function on $B$, and $H \mapsto \Gamma_H - \Upsilon$ is an analytic map from $B$ to some Banach space of analytic $2\pi$-periodic functions on $\mathcal D_{r',1}$.

This theorem is essentially identical to [15, Theorem 1.7]. Thus, we will not repeat a proof here. Notice that (1.15) is the equation of an invariant $d$-torus for $H$, with rotation vector proportional to $\omega$; or more precisely (if $c_H$ is not real), the equation of an invariant $d$-torus for $c_H^{-1}H$, with rotation vector $\omega$. By construction, this torus lies on the energy surface $H^{-1}(0)$. Another property of this torus is that it is "centered at $p = 0$", in the sense that the integral

$$K(\gamma) = \frac{1}{2\pi}\oint_\gamma p \cdot dq \tag{1.16}$$

vanishes, if $\gamma$ is any closed curve on the torus. This follows from the fact that $\Gamma_H$ is the limit of canonical transformations $\mathcal V_n(H)$ that leave $p \cdot dq$ invariant up to a differential of a one-form.

The assumption $H \in W^s$, used to prove (1.15), replaces the non-degeneracy assumption in traditional KAM theory. To discuss the connection between these two conditions, we consider families $\beta \mapsto H_\beta$,

$$H_\beta = H \circ R_\beta, \qquad R_\beta(q, p) = (q, p + \beta), \tag{1.17}$$
generated from a fixed Hamiltonian H by a translation in the action variable p. To be more specific, let r > ρ, and consider a Hamiltonian h ∈ Ar of the form h(q, p) = ω · p +

1 2

p · Mp + f (p),

f (p) = O(|p|3 ),

(1.18)

where M is a real symmetric d × d matrix, such that the quadratic form p 7 → p · Mp is non-degenerate, when restricted to the d − 1 dimensional contracting subspace of T ∗ . It is straightforward to check that if h is sufficiently close to hω , then the family β 7→ Hβ , for H = h, intersects the stable manifold W s transversally. Since this property persists under small perturbations, every Hamiltonian H ∈ Ar near h has an invariant cω-torus on the energy surface H −1 (0). The torus is given by 0 0 = Rβ 0 ◦ 0Hβ 0 , where β 0 is the value of the parameter β for which Hβ belongs to W s . Under suitable assumptions on H , the renormalization transformation Rµ can also be used to construct periodic orbits for H , whose rotation vectors are “rational approximants” for cω. Recall that a curve γ : R → Dρ is a (lifted) orbit for H if it satisfies the first order differential equation γ 0 = (J∇H ) ◦ γ . A periodic orbit that closes (modulo 2πZd in the variable q) after a time 2π τ > 0, but not earlier, can be parametrized as follows: (1.19) γ (t) = tw + Q0 + Q(t/τ ), P0 + P (t/τ ) , where w = (γ (2πτ ) − γ (0))/(2π τ ), and where Q, P are periodic functions with fundamental period 2π and average zero. The rotation vector w belongs to RZd , that is, w is a real scalar multiple of some vector in Zd . In what follows, given a nonzero vector w ∈ RZd , we denote by τ (w), or τ for short, the value of the smallest positive real number t such that tw belongs to Zd . In order to construct periodic orbits that depend continuously on H , near the Hamiltonian (1.18) that is invariant under translations q 7 → q + u, we will need to limit the number of ways this symmetry can be broken by a perturbation. We shall do this by restricting to Hamiltonians H (q, p) that are even functions of q – a property that is preserved under the transformation Rµ . The corresponding “even” subspace of Ar , r > 0, will be denoted by Br . 
We note that the approximate RG fixed point of [1] lies in such a space Br . Let ρ 0 and ε be fixed positive real numbers (to be chosen below). Definition 1.7. For every nonzero vector w ∈ RZd , we define H(w) to be the set of all Hamiltonians H ∈ Bρ , such that a constant multiple of H has a periodic orbit γ with rotation vector w, on the surface of constant energy zero, with K(γ ) = 0. The functions Q and P in (1.19) are assumed to be 2π -periodic, and to have average zero. In addition, we require P to be even, Q odd, and Q0 = 0. A subset 6(w) of H(w) is defined by requiring also that P0 = kV ∗ V w, with |k| < ε2 /τ , and that Q, P extend analytically to the strip |Imz| < ρ 0 /τ , satisfying the bounds |V Q(z)| < ε and kW P (z)k < ε.
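The quantity $\tau(w)$, the smallest positive $t$ with $tw \in \mathbb Z^d$, has a simple closed form when all components of $w$ are rational. The sketch below covers only that rational case, which is an assumption made for illustration; a general $w \in \mathbb R\mathbb Z^d$ may be an irrational multiple of an integer vector and is not handled. The helper names are ours.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def tau(w):
    """Smallest t > 0 with t*w in Z^d, for a nonzero vector w of Fractions."""
    # Write w_i = n_i / D with a common denominator D.
    common_den = reduce(lcm, (x.denominator for x in w), 1)
    numerators = [int(x * common_den) for x in w]
    # t*w is integral exactly when t is a multiple of D / gcd(n_1, ..., n_d).
    g = reduce(gcd, (abs(n) for n in numerators), 0)
    return Fraction(common_den, g)

w = (Fraction(1), Fraction(2, 3))
t = tau(w)
closing_vector = tuple(t * x for x in w)   # the integer vector tau(w) * w
```

For $w = (1, 2/3)$ this yields $\tau = 3$ and the integer vector $(3, 2)$, while for $w = (2, 4)$ it yields $\tau = 1/2$.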


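[Figure: the family $\beta \mapsto H_\beta$ crossing the hypersurfaces $\Sigma_0, \Sigma_1, \Sigma_2, \ldots$ at $H_{\beta_0}, H_{\beta_1}, H_{\beta_2}, \ldots$ and the stable manifold $W^s$ at $H_{\beta'}$; the unstable manifold $W^u$ passes through $h_\omega$ and $h_w$.]

Fig. 1. Accumulation of the hypersurfaces $\Sigma_n$ at the stable manifold $W^s$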

Theorem 1.8. There exist $\rho', \varepsilon > 0$ such that the following holds. If $w \in \mathbb{R}\mathbb{Z}^d$ is sufficiently close to $\omega$, and $h_w$ lies on $W^u$, then there exists an open neighborhood $B(w)$ of $h_w$ in $\mathcal{B}_\rho$, such that $\Sigma(w) \cap B(w)$ is an analytic manifold of codimension $d$ that intersects $W^u$ transversally at $h_w$.

Consider now such a rotation vector $w \in \mathbb{R}\mathbb{Z}^d$, and let $\Sigma_n(w) = \mathcal{R}_\mu^{-n}(\Sigma(w) \cap B(w))$, for all $n \ge 0$. By the λ-Lemma [23, 24], this defines a sequence of codimension $d$ manifolds that accumulate at $W^s$ as follows (see also Sect. 4). Assume that $\beta \mapsto H_\beta$ is an analytic $d$-parameter family of Hamiltonians in the domain of $\mathcal{R}_\mu$ that intersects $W^s$ transversally at $\beta = \beta'$. Then there exists an open neighborhood $B'$ of $\beta'$ in $\mathbb{C}^d$, such that for sufficiently large $n$, the set $\{\beta \in B' : H_\beta \in \Sigma_n(w)\}$ contains a single point, say $\beta_n$, and the ratio $|\beta_n - \beta'|/|\beta_{n+1} - \beta'|$ converges to $|\lambda_2|$, as $n \to \infty$. The reason for considering the manifolds $\Sigma_n(w)$ is the fact that $\Sigma_n(w) \subset \mathcal{H}(T^n w)$. Thus, by considering families of the type (1.17), we can construct infinite sequences of periodic orbits for a single Hamiltonian $H$. As part of the proof of the theorem below, and under its assumptions, we will show that

$$ -\frac{1}{n} \ln\big|\beta_n - \beta'\big| = \ln|\lambda_2| + O\Big(\frac{1}{n}\Big). \tag{1.20} $$
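The two ways of stating the accumulation rate (the ratio of successive distances, and the logarithmic form (1.20)) can be compared on a model sequence. The sketch below is only an illustration and is not taken from the paper: the distances $|\beta_n - \beta'|$ are replaced by a synthetic sequence $c\,\lambda^{-n}(1 + 2^{-n})$, and the values of `c` and `lam` are hypothetical.

```python
import math

def distances(n_max, c=2.0, lam=1.7):
    """Synthetic stand-in for |beta_n - beta'|: c * lam**(-n) * (1 + 2**(-n)).

    Both c and lam are hypothetical; lam plays the role of |lambda_2|.
    """
    return [c * lam ** (-n) * (1.0 + 0.5 ** n) for n in range(n_max + 1)]

def ratio(d, n):
    """Ratio of successive distances, expected to approach lam."""
    return d[n] / d[n + 1]

def log_rate(d, n):
    """-(1/n) ln d_n, expected to equal ln(lam) up to a term of order 1/n."""
    return -math.log(d[n]) / n

if __name__ == "__main__":
    d = distances(40)
    for n in (5, 10, 20, 30):
        print(n, ratio(d, n), log_rate(d, n), math.log(1.7))
```

In this model the ratio converges geometrically, while the logarithmic rate differs from $\ln\lambda$ by $\ln(c)/n$ plus smaller terms, which is why an $O(1/n)$ remainder is the natural form for a statement like (1.20).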

Theorem 1.9. Let $r > \rho$ and $w \in \mathbb{R}\mathbb{Z}^d$ be given, $w \ne 0$. Let $h \in \mathcal{B}_r$ be a Hamiltonian of the form (1.18), with $M$ as described after (1.18). If $h$ is sufficiently close to $h_\omega$ in $\mathcal{B}_r$, then there exists an open neighborhood $B$ of $h$ in $\mathcal{B}_r$, and a positive integer $N$, such that for every Hamiltonian $H \in B$, and every $n \ge N$, some constant multiple of $H$ has a periodic orbit $\gamma_n$ with frequency vector $w_n = (\vartheta^{-1}T)^n w$, lying on the energy surface $H^{-1}(0)$, and satisfying

$$ -\frac{1}{n} \ln\big|\gamma_n(0) - \Gamma'(0)\big| = \ln|\lambda_2| + O\Big(\frac{1}{n}\Big), \tag{1.21} $$

where $\Gamma'$ is the invariant torus described after (1.18).

Our reason for using $K(\gamma) = 0$ as one of the conditions in the definition of $\Sigma(w)$ is the resulting identity

$$ \frac{1}{2\pi\tau(w_n)} \oint_{\gamma_n} p \cdot dq = w_n \cdot \beta_n, \tag{1.22} $$

valid for large $n$, if $w \in \mathbb{R}\mathbb{Z}^d$ is chosen sufficiently close to $\omega$. The normalized integral in this equation can be regarded as the coordinate of $\gamma_n$ in the direction of $w_n$, since it changes by an amount $w_n \cdot v$ under a translation $p \mapsto p + v$. It appears that in this direction, the orbits $\gamma_n$ accumulate faster than in some other directions: A straightforward calculation shows that if $\nabla_1 H = 0$, then the difference $w_n \cdot \beta_n - \omega \cdot \beta'$ is of the order $|\lambda_2|^{-2n}$. The same might be true more generally, as $K$ is the functional that appears in the variational equation for orbits on fixed energy surfaces.

2. Eliminating Non-Resonant Modes

Our goal in this section is to prove Theorem 1.2, concerning the existence of a canonical change of coordinates that eliminates the component of a Hamiltonian in the direction of a given “non-resonant” subspace of $\mathcal{A}_\rho$. We start by giving some basic estimates involving the evaluation, multiplication, differentiation, and composition of functions in the spaces $\mathcal{A}_\rho$.

Proposition 2.1. Let $\rho, \delta > 0$. Consider $f, g \in \mathcal{A}_\rho$, and $P, Q \in \mathcal{A}_\rho^d$, and $h \in \mathcal{A}_{\rho+\delta}$. Define $U(q,p) = \big(q + Q(q,p),\, P(q,p)\big)$, for all $(q,p)$ in $D_\rho$. Then

(a) $|f(q,p)| \le |f|_\rho$ for all $(q,p)$ in $D_\rho$.
(b) $fg \in \mathcal{A}_\rho$ and $|fg|_\rho \le |f|_\rho |g|_\rho$.
(c) $|h|_\rho + \delta |h|'_\rho \le |h|_{\rho+\delta}$.
(d) $h \circ U \in \mathcal{A}_\rho$ and $|h \circ U|_\rho \le |h|_{\rho+\delta}$, if $|VQ|_\rho \le \delta$ and $\|WP\|_\rho \le \rho + \delta$.

The proof of these estimates is straightforward and will be omitted. Define $\{H, \phi\} = \nabla_1 H \cdot \nabla_2 \phi - \nabla_2 H \cdot \nabla_1 \phi$.

Proposition 2.2. Let $r, \delta > 0$ and $0 < \varepsilon < \frac{1}{2}$. Denote by $B'$ the set of all functions $\phi \in \mathcal{A}'_r$ that satisfy $\|\phi\|'_{r+2\delta} < \varepsilon\delta$. Then for every function $\phi \in B'$, Eq. (1.3) has a unique solution $Q, P \in \mathcal{A}_r^d$ satisfying $\|WP\|_r \le \delta$. The corresponding canonical transformation $U_\phi : (q,p) \mapsto \big(q + Q(q,p),\, p + P(q,p)\big)$ is analytic from $D_r$ to $D_{r+2\delta}$. If $H$ is any function in $\mathcal{A}_{r+2\delta}$, then $H \circ U_\phi$ belongs to $\mathcal{A}_r$, and

$$ \begin{aligned}
\big|H \circ U_\phi\big|_r &\le |H|_{r+2\delta}, \\
\big|H \circ U_\phi - H\big|_r &\le \tfrac{2}{3}\, \varepsilon\, |H|_{r+2\delta}, \\
\big|H \circ U_\phi - H - \{H, \phi\}\big|_r &\le \tfrac{1}{3}\, \varepsilon^2 |H|_{r+2\delta}.
\end{aligned} \tag{2.1} $$

Furthermore, the maps $\phi \mapsto (Q, P)$ and $\phi \mapsto H \circ U_\phi$ are analytic on $B'$.


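Proof. Denote by $B$ the set of all $P \in \mathcal{A}_r^d$ satisfying $\|WP\|_r \le \delta$. Let $\phi \in B'$, and define a map $F : B \to \mathcal{A}_r^d$ by setting $F(P) = -(\nabla_1 \phi) \circ G$, where $G(q,p) = (q, p + P(q,p))$. If $P \in B$ then, by using Proposition 2.1, we obtain

$$ \begin{aligned}
\|W\, DF(P) h\|_r &= \max_i \big| h \cdot \nabla_2 (W \nabla_1 \phi)_i \circ G \big|_r \\
&\le \max_i \|Wh\|_r\, \big| V \nabla_2 (W \nabla_1 \phi)_i \circ G \big|_r \\
&\le \max_i \|Wh\|_r\, \big| V \nabla_2 (W \nabla_1 \phi)_i \big|_{r+\delta} \\
&\le \max_i \|Wh\|_r\, \delta^{-1} \big| (W \nabla_1 \phi)_i \big|_{r+2\delta} \\
&= \|Wh\|_r\, \delta^{-1} \|W \nabla_1 \phi\|_{r+2\delta} \le \tfrac{1}{2} \|Wh\|_r,
\end{aligned} \tag{2.2} $$

for all $h \in \mathcal{A}_r^d$. This, together with the bound $\|W F(0)\|_r \le \delta/2$, shows that $F$ is a contraction on $B$, for the norm $\|W\,.\|_r$. Thus, Eq. (1.3) has a unique solution $(Q, P) \in \mathcal{A}_r^d \times B$. By Proposition 2.1, this solution satisfies

$$ \begin{aligned}
|VQ|_r &= \big|(V \nabla_2 \phi) \circ G\big|_r \le |V \nabla_2 \phi|_{r+\delta} < \varepsilon\delta, \\
\|WP\|_r &= \|(W \nabla_1 \phi) \circ G\|_r \le \|W \nabla_1 \phi\|_{r+\delta} < \varepsilon\delta.
\end{aligned} \tag{2.3} $$

Consider now $H \in \mathcal{A}_{r+2\delta}$. In order to prove (2.1), let

$$ \big(f(z)\big)(q,p) = H\Big(q + z \nabla_2 \phi\big(q, p + zP(q,p)\big),\; p - z \nabla_1 \phi\big(q, p + zP(q,p)\big)\Big). \tag{2.4} $$

From the bounds (2.3) and Proposition 2.1, it follows that this equation defines a function $f$, from an open neighborhood of the disk $|z| \le 2/\varepsilon$ to $\mathcal{A}_r$, satisfying $|f(z)|_r \le |H|_{r+2\delta}$. By using the representation

$$ f(s) = f(0) + s f'(0) + \frac{1}{2\pi i} \oint_{|z| = 2/\varepsilon} \frac{f(z)}{z - s} \Big(\frac{s}{z}\Big)^2 dz, \tag{2.5} $$

we obtain the bound

$$ \big|H \circ U_\phi - H - \{H, \phi\}\big|_r = \big|f(1) - f(0) - f'(0)\big|_r = \bigg| \frac{1}{2\pi i} \oint_{|z| = 2/\varepsilon} \frac{f(z)\, dz}{(z - 1) z^2} \bigg|_r \le \frac{1}{3}\, \varepsilon^2 |H|_{r+2\delta}. \tag{2.6} $$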
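The constant in (2.6) comes from estimating the contour integral directly: on $|z| = 2/\varepsilon$ one has $|z - 1| \ge 2/\varepsilon - 1$, so a function bounded by $M$ gives a remainder of at most $M \varepsilon^2 / (2(2 - \varepsilon))$, which is at most $M\varepsilon^2/3$ when $\varepsilon \le 1/2$. The following sketch checks the representation (2.5) and this bound numerically. It is an illustration under simplifying assumptions, not the argument of the paper: the Banach-space-valued function $f$ is replaced by the scalar entire function $z \mapsto e^{\varepsilon z/2}$, and the helper names are hypothetical.

```python
import cmath
import math

def contour_remainder(f, s, radius, samples=4096):
    """Trapezoid approximation of (1/(2*pi*i)) * integral over |z| = radius
    of f(z)/(z - s) * (s/z)**2 dz, i.e. the remainder term in (2.5)."""
    total = 0j
    step = 2.0 * math.pi / samples
    for k in range(samples):
        z = radius * cmath.exp(1j * step * k)
        dz = 1j * z * step  # dz = i z d(theta)
        total += f(z) / (z - s) * (s / z) ** 2 * dz
    return total / (2j * math.pi)

def check(eps):
    """Compare the contour form of the remainder at s = 1 with its direct value.

    The scalar model f(z) = exp(eps*z/2) satisfies |f(z)| <= e on |z| = 2/eps.
    """
    f = lambda z: cmath.exp(eps * z / 2.0)
    sup_norm = math.e
    direct = f(1.0) - f(0.0) - eps / 2.0  # f'(0) = eps/2 for this model
    contour = contour_remainder(f, 1.0, 2.0 / eps)
    bound = eps ** 2 * sup_norm / 3.0
    return contour, direct, bound

if __name__ == "__main__":
    print(check(0.25))
```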

The first two inequalities in (2.1) are proved similarly. The analyticity of the maps that assign to $\phi \in B'$ the functions $P, Q \in \mathcal{A}_r^d$ and $H \circ U_\phi \in \mathcal{A}_r$ follows by the implicit function theorem and the chain rule. □

Proof of Theorem 1.2. We start with an informal description of the proof. Consider $H_0 = H \in \mathcal{H}$. Our goal is to define functions $\phi_0, \phi_1, \phi_2, \ldots$ such that if we set

$$ G_n = U_{\phi_0} \circ U_{\phi_1} \circ \ldots \circ U_{\phi_{n-1}} - I, \qquad H_n = H \circ (I + G_n), \qquad n = 1, 2, \ldots, \tag{2.7} $$

then $G_n \to G_\infty$ and $H_n \to H_\infty = H \circ (I + G_\infty)$ as $n \to \infty$, with $\mathbb{I}^- H_\infty = 0$. Starting with $n = 0$, we define $\phi_n = \mathbb{I}^- \phi_n$ to be the solution of the equation

$$ \mathbb{I}^- \{H_n, \phi_n\} = -\mathbb{I}^- H_n. \tag{2.8} $$


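If $\mathbb{I}^- H_n$ is small, say of the order $\varepsilon_n$, then the same should be true for $\phi_n$. By using that $H_{n+1} = H_n \circ U_{\phi_n}$, and thus

$$ \mathbb{I}^- H_{n+1} = \mathbb{I}^- \big[ H_n \circ U_{\phi_n} - H_n - \{H_n, \phi_n\} \big], \tag{2.9} $$

we see from Eq. (2.1) that $\mathbb{I}^- H_{n+1}$ is of the order $\varepsilon_{n+1} \approx \varepsilon_n^2$. Now the process is repeated for $n = 1, 2, \ldots$.

In order to be more precise, assume now that $H$ and $\mathbb{I}^-$ satisfy the assumptions of Theorem 1.2. Let $t_0 = (1 - \rho'/\rho)/9$ and $\rho_0 = \rho$. Define

$$ t_n = (2/3)^n t_0, \qquad \delta_n = t_n \rho, \qquad \rho_n = \rho' + 9\delta_n, \qquad n = 1, 2, \ldots, \tag{2.10} $$

so that $\rho_{n+1} = \rho_n - 3\delta_n$, for all $n \ge 0$. Given $H \in \mathcal{H}$, we intend to verify inductively that for all $m > 0$, the function $H_m$ belongs to $\mathcal{A}_{\rho_m}$ and satisfies the bounds

$$ \begin{aligned}
|H_m|_{\rho_m} &< c, \\
|H_m - H_{m-1}|_{\rho_m} &< \frac{c}{4}\, t_0 b\, (2/3)^{2(3/2)^m}, \\
|\mathbb{I}^- H_m|_{\rho_m} &< \frac{ac}{3}\, (t_0 b)^2 (2/3)^{4(3/2)^m}.
\end{aligned} \tag{2.11} $$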
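The bookkeeping behind (2.10) is easy to check by hand: since $\delta_n - \delta_{n+1} = \delta_n/3$, one gets $\rho_n - \rho_{n+1} = 9(\delta_n - \delta_{n+1}) = 3\delta_n$, and $\rho' + 9 t_0 \rho = \rho$ recovers $\rho_0 = \rho$. A minimal sketch that tabulates these scales and verifies the recursion numerically is given here; the function names and the sample values of $\rho$ and $\rho'$ are hypothetical and serve only as an illustration.

```python
import math

def scales(rho, rho_prime, n_max):
    """Return the list of (t_n, delta_n, rho_n) for n = 0..n_max, following (2.10)."""
    t0 = (1.0 - rho_prime / rho) / 9.0
    rows = []
    for n in range(n_max + 1):
        t_n = (2.0 / 3.0) ** n * t0
        delta_n = t_n * rho
        rho_n = rho_prime + 9.0 * delta_n
        rows.append((t_n, delta_n, rho_n))
    return rows

def satisfies_recursion(rows, tol=1e-12):
    """Check rho_{n+1} = rho_n - 3*delta_n for consecutive rows."""
    return all(
        math.isclose(rows[n + 1][2], rows[n][2] - 3.0 * rows[n][1], abs_tol=tol)
        for n in range(len(rows) - 1)
    )

if __name__ == "__main__":
    rows = scales(rho=1.0, rho_prime=0.5, n_max=10)
    print(rows[0][2], rows[-1][2], satisfies_recursion(rows))
```

Each step of the iteration therefore consumes a strip of width $3\delta_n$ from the domain, and the widths $\rho_n$ decrease to $\rho'$ without ever reaching it.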

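By assumption, the bounds (2.11) hold for $m = 0$, if we set $H_{-1} = 0$. Let now $n \ge 0$ be fixed, and assume that (2.11) has been verified for all $m \le n$. We start by showing that Eq. (2.8) can be solved. By using Proposition 2.1.c we obtain

$$ |H_n - H|'_{\rho_n - \delta_n} \le \sum_{m=1}^{n} |H_m - H_{m-1}|'_{\rho_m - \delta_m} \le \sum_{m=1}^{n} \delta_m^{-1} |H_m - H_{m-1}|_{\rho_m} < \frac{bc}{4\rho} \sum_{m=1}^{n} (3/2)^m (2/3)^{2(3/2)^m} < \frac{bc}{2\rho}. \tag{2.12} $$

Thus, if $\phi$ belongs to $\mathbb{I}^- \mathcal{A}'_{\rho_n - \delta_n}$, then

$$ \big| \mathbb{I}^- \{H_n - H, \phi\} \big|_{\rho_n - \delta_n} \le a\, |H_n - H|'_{\rho_n - \delta_n} \|\phi\|'_{\rho_n - \delta_n} \le \frac{abc}{2\rho} \|\phi\|'_{\rho_n - \delta_n}. \tag{2.13} $$

This, together with the assumption (c) of Theorem 1.2, implies that $\mathbb{I}^- \widehat{H_n}$ maps $\mathbb{I}^- \mathcal{A}'_{\rho_n - \delta_n}$ onto $\mathbb{I}^- \mathcal{A}_{\rho_n - \delta_n}$, and that

$$ \big| \mathbb{I}^- \widehat{H_n} \phi \big|_{\rho_n - \delta_n} \ge \big| \mathbb{I}^- \widehat{H} \phi \big|_{\rho_n - \delta_n} - \big| \mathbb{I}^- \{H_n - H, \phi\} \big|_{\rho_n - \delta_n} \ge \frac{abc}{2\rho} \|\phi\|'_{\rho_n - \delta_n}, \tag{2.14} $$

for all $\phi \in \mathbb{I}^- \mathcal{A}'_{\rho_n - \delta_n}$. Consequently, Eq. (2.8) has a unique solution $\phi_n \in \mathbb{I}^- \mathcal{A}'_{\rho_n - \delta_n}$, and this solution satisfies the bound

$$ \|\phi_n\|'_{\rho_n - \delta_n} \le \frac{2\rho}{abc} |\mathbb{I}^- H_n|_{\rho_n - \delta_n} < \frac{2\rho}{3}\, t_0^2 b\, (2/3)^{4(3/2)^n} \le \delta_n \varepsilon_n, \tag{2.15} $$

where

$$ \varepsilon_n = \frac{2}{3}\, t_0 b\, (2/3)^{3(3/2)^n}. \tag{2.16} $$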


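In fact, the solution can be obtained by using a convergent Neumann series (for the operator being inverted) that is dominated in norm by a geometric series with ratio 1/2. We note that $t_0 b \le 1$, and thus $\varepsilon_n < 1/2$. This follows from the fact that

$$ abc\, \|\phi\|'_{\rho_0 - \delta_0} \le \rho\, \big| \mathbb{I}^- \{H, \phi\} \big|_{\rho_0 - \delta_0} \le \rho a\, |H|'_{\rho_0 - \delta_0} \|\phi\|'_{\rho_0 - \delta_0} \le a t_0^{-1} c\, \|\phi\|'_{\rho_0 - \delta_0}, \tag{2.17} $$

for all $\phi \in \mathbb{I}^- \mathcal{A}'_{\rho_0 - \delta_0}$. Now we can use Proposition 2.2 to prove the three bounds (2.11) for $m = n + 1$. The first of these bounds is straightforward. For the second one we have

$$ \begin{aligned}
|H_{n+1} - H_n|_{\rho_n - 3\delta_n} &\le \frac{2}{3}\, \varepsilon_n |H_n|_{\rho_n - \delta_n} < (2/3)^2 c\, t_0 b\, (2/3)^{3(3/2)^n} \\
&\le (2/3)^{7/2} c\, t_0 b\, (2/3)^{(3/2)^{n+1}} < \frac{c}{4}\, t_0 b\, (2/3)^{(3/2)^{n+1}},
\end{aligned} \tag{2.18} $$

and the bound on $\mathbb{I}^- H_{n+1}$ is obtained by using identity (2.9):

$$ \begin{aligned}
\big| \mathbb{I}^- H_{n+1} \big|_{\rho_n - 3\delta_n} &\le \frac{a}{3}\, \varepsilon_n^2 |H_n|_{\rho_n - \delta_n} \\
&< \frac{4ac}{27} (t_0 b)^2 (2/3)^{6(3/2)^n} < \frac{ac}{4} (t_0 b)^2 (2/3)^{4(3/2)^{n+1}}.
\end{aligned} \tag{2.19} $$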

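This proves that (2.11) holds for all positive integers $m$. In particular, we find that the sequence of Hamiltonians $H_n$ converges in $\mathcal{A}_{\rho'}$ to a function $H_\infty$ that satisfies $\mathbb{I}^- H_\infty = 0$. Next, we estimate the functions $G_n$ defined in equation (2.7). The bounds (2.15) and (2.3) show that $g_n = U_{\phi_n} - I$ satisfies $\|g_n\|_{\rho_n - 3\delta_n} < \varepsilon_n \delta_n$, for any $n \ge 0$, where

$$ \|(Q, P)\|_r = \max\big\{ |VQ|_r,\, \|WP\|_r \big\}, \qquad (Q, P) \in \mathcal{A}_r^d \times \mathcal{A}_r^d. \tag{2.20} $$

Thus, by using the identity

$$ G_n = \sum_{m=0}^{n-1} g_m \circ U_{\phi_{m+1}} \circ \ldots \circ U_{\phi_{n-1}}, \tag{2.21} $$

and Proposition 2.1.d, we obtain the bound

$$ \|G_n\|_{\rho_n} \le \sum_{m=0}^{n-1} \|g_m\|_{\rho_{m+1}} < \frac{1}{2} \sum_{m=0}^{n-1} \delta_m < \frac{3}{2}\, \delta_0. \tag{2.22} $$

By Proposition 2.2, we also have

$$ \begin{aligned}
\|G_{n+1} - G_n\|_{\rho'} &= \big\| g_n + (G_n \circ U_{\phi_n} - G_n) \big\|_{\rho'} \\
&\le \|g_n\|_{\rho_n - 3\delta_n} + \big\| G_n \circ U_{\phi_n} - G_n \big\|_{\rho_n - 3\delta_n} \\
&< \varepsilon_n \delta_n + \frac{2}{3}\, \varepsilon_n \|G_n\|_{\rho_n - \delta_n} < 2 \varepsilon_n \delta_0.
\end{aligned} \tag{2.23} $$

This shows that the sequence $\{G_n\}$ converges in $\mathcal{A}_{\rho'}^d \times \mathcal{A}_{\rho'}^d$, and that the limit $G_\infty$ defines an analytic map $U_H = I + G_\infty$ from $D_{\rho'}$ to $D_{(\rho' + \rho)/2}$. The identity $H_\infty = H \circ (I + G_\infty)$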


follows from the fact that $H$ is analytic on $D_\rho$. Since $I + G_n$ is a canonical transformation, for every $n \ge 0$, the limit $U_H$ is a canonical transformation as well.

In order to verify the remaining claims in Theorem 1.2, let us consider a fixed but arbitrary curve $z \mapsto H_z \in \mathcal{H}$, which is analytic on an open neighborhood $Z$ of zero in $\mathbb{C}$. For each $z \in Z$, our construction of $U_{H_z}$ defines a sequence of functions $\phi_{z,n} \in \mathcal{A}'_{\rho_n - \delta_n}$ and $H_{z,n} \in \mathcal{A}_{\rho_n}$ and $G_{z,n} \in \mathcal{A}_{\rho_n}^{2d}$. Since we have only used bounds that are uniform on $\mathcal{H}$, all of these functions depend analytically on $z$. The same applies to the limit $n \to \infty$, since convergence was proved to be uniform on $\mathcal{H}$. Thus, the maps $H \mapsto H_\infty$ and $H \mapsto G_\infty$ are analytic on $\mathcal{H}$. Assume now that $|\mathbb{I}^- H_{z,n}|_{\rho_n} = O\big(|z|^k\big)$, for some $n \ge 0$ and $k \ge 1$. Then the inequality (2.14) shows that $\|\phi_{z,n}\|'_{\rho_n - \delta_n}$ is also of order $|z|^k$. This, together with the identity (2.9), and the last inequality in (2.1), implies that $|\mathbb{I}^- H_{z,n+1}|_{\rho_{n+1}} = O\big(|z|^{2k}\big)$. By applying this to a curve of the form $H_z = H + zh$ with $\mathbb{I}^- H = 0$, we see e.g. that $U_H = I$, and that

$$ \begin{aligned}
H_z \circ U_{H_z} &= H_{z,1} + z^2 R_1(z) = \mathbb{I}^+ H_{z,1} + z^2 R_2(z) \\
&= \mathbb{I}^+ \big[ H_z + \{H_z, \phi_{z,0}\} \big] + z^2 R_3(z) \\
&= \mathbb{I}^+ H + z\, \mathbb{I}^+ h - \mathbb{I}^+ \widehat{H} \phi_{z,0} + z^2 R_4(z),
\end{aligned} \tag{2.24} $$

with $R_j(z) \in \mathcal{A}_{\rho'}$ for all $z \in Z$. Furthermore, we have

$$ \mathbb{I}^- \phi_{z,0} = z \big( \mathbb{I}^- \widehat{H_z} \big)^{-1} \mathbb{I}^- h = z \big( \mathbb{I}^- \widehat{H} \big)^{-1} \mathbb{I}^- h + z^2 R_5(z), \tag{2.25} $$

with $R_5(z) \in \mathcal{A}'_{\rho - \delta_0}$, for all $z \in Z$. The last two inequalities yield the given formula for the derivative of $H \mapsto H \circ U_H$. This concludes the proof of Theorem 1.2. □

We note that Theorem 1.2 could be applied iteratively, using an increasing sequence of projections $\mathbb{I}_0^- \le \mathbb{I}_1^- \le \mathbb{I}_2^- \le \ldots$. The part of $H$ that is resonant at step $k$ but becomes non-resonant (and gets eliminated) at step $k + 1$, could be called a resonance of order $k$. The iteration scheme of classical KAM theory [2, 8, 16, 22, 25] is of this type, except e.g. that the canonical transformation $U_{\phi_0}$ is used in place of $U_H$. This is sufficient for near-integrable Hamiltonians, since the remaining non-resonant term is very small.

3. The Renormalization Group Transformation

The transformation $\mathcal{R}_\mu$ implements the above-mentioned iteration scheme as a dynamical system. Roughly speaking, the step $H \mapsto H \circ U_H$ eliminates the non-resonant part of $H$, or resonance of order zero. In the next step $H \mapsto H \circ T_1$, the order of each remaining resonance is lowered by one. Two additional steps are included in the definition of $\mathcal{R}_\mu$, in order to re-normalize the resulting Hamiltonian: a scaling $S_\mu$ of the action variable, and a scaling $H \mapsto \eta H$ of the energy (or time). The main effect of iterating $\mathcal{R}_\mu$, besides scaling, is to eliminate resonances of higher and higher order, by successively decreasing orders and eliminating the lowest one. The remaining part of this section contains our proofs of Lemma 1.3 and Theorem 1.5, followed by some remarks on symmetries.


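Proof of Lemma 1.3. For every index $(\nu, \alpha)$ in $I^+$ we have

$$ \begin{aligned}
\|W T^* \nu\| &= \max_j |W_j \cdot T^* \nu| = \max_j |\vartheta_j| |W_j \cdot \nu| \\
&\le \big( \vartheta - |\vartheta_2| \big) |\omega \cdot \nu| + |\vartheta_2| \|W\nu\| \\
&\le \big( \vartheta - |\vartheta_2| \big) \kappa |\alpha| + \big[ |\vartheta_2| + \sigma \big( \vartheta - |\vartheta_2| \big) \big] \|W\nu\|.
\end{aligned} \tag{3.1} $$

Choose $r > \rho$ in such a way that (1.9) remains true if $\rho$ is replaced by $r$. Then the inequality (3.1) implies that

$$ e^{r \|W T^* \nu\|} \le \Big( \frac{\rho' \vartheta_d}{r\mu} \Big)^{|\alpha|} e^{\rho' \|W\nu\|}. \tag{3.2} $$

Let now $H \in \mathbb{I}^+ \mathcal{A}_{\rho'}$. Then for $H \circ T_\mu$ we have the representation

$$ H \circ T_\mu (q, p) = H\big( Tq,\, \mu (T^*)^{-1} p \big) = \sum_{(\nu, \alpha) \in I^+} H_{\nu,\alpha} \prod_{j=1}^{d} \Big( \frac{\mu}{\vartheta_j} \Big)^{\alpha_j} e^{i q \cdot (T^* \nu)} (Wp)^\alpha. \tag{3.3} $$

Thus, by using the bound (3.2), we obtain

$$ \big| H \circ T_\mu \big|_r = \sum_{(\nu, \alpha) \in I^+} \bigg| H_{\nu,\alpha} \prod_{j=1}^{d} \Big( \frac{r\mu}{\rho' \vartheta_j} \Big)^{\alpha_j} \bigg|\, \rho'^{\,|\alpha|} e^{r \|W T^* \nu\|} \le |H|_{\rho'}. \tag{3.4} $$

This shows that $H \mapsto H \circ T_\mu$ defines a bounded linear operator from $\mathbb{I}^+ \mathcal{A}_{\rho'}$ to $\mathcal{A}_r$. The assertion now follows from the fact that the inclusion map from $\mathcal{A}_r$ into $\mathcal{A}_\rho$ is compact, and that $|H \circ T_\mu|_\rho \le |H \circ T_\mu|_r$. □

Proof of Theorem 1.5. Assume that $\kappa\rho < \sigma$. Then we can choose $\rho' < \rho$ and $\kappa' > \kappa$ such that $\kappa \rho^2 < \kappa' \rho \rho' < \sigma \rho'$, and such that the first inequality in (1.9) holds, if $\kappa$ is replaced by $\kappa'$. Let $s = (\rho' + 8\rho)/9$, and define $B = \{H \in \mathbb{I}^+ \mathcal{A}'_s : H_{0,0} = 0,\ |H - h_\omega|'_s < \kappa\rho\}$. If $\rho' \le r \le s$, then for every function $\phi \in \mathbb{I}^- \mathcal{A}_r$ we have

$$ \begin{aligned}
\big| \{h_\omega, \phi\} \big|_r = |\omega \cdot \nabla_1 \phi|_r &= \sum_{(\nu, \alpha) \in I^-} |\phi_{\nu,\alpha}|\, |\omega \cdot \nu|\, r^{|\alpha|} e^{r \|W\nu\|} \\
&\ge \sigma \sum_{(\nu, \alpha) \in I^-} |\phi_{\nu,\alpha}|\, r^{|\alpha|} \|W\nu\| e^{r \|W\nu\|} + \kappa' \sum_{(\nu, \alpha) \in I^-} |\phi_{\nu,\alpha}|\, |\alpha|\, r^{|\alpha|} e^{r \|W\nu\|} \\
&\ge \sigma \|W \nabla_1 \phi\|_r + \kappa' r\, |V \nabla_2 \phi|_r \ge \kappa' r\, \|\phi\|'_r,
\end{aligned} \tag{3.5} $$

which yields the bound

$$ \begin{aligned}
\big| \{H, \phi\} \big|_r &\ge \big| \{h_\omega, \phi\} \big|_r - \big| \{H - h_\omega, \phi\} \big|_r \\
&\ge \big[ \kappa' r - |H - h_\omega|'_r \big] \|\phi\|'_r \ge (\kappa' \rho' - \kappa\rho) \|\phi\|'_r,
\end{aligned} \tag{3.6} $$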

for all $H \in B$. This shows that $\mathbb{I}^-$ satisfies a non-resonance condition (with respect to the set $B$), as defined in Theorem 1.2, for some constants $a, b, c > 0$. The same resonance condition remains satisfied if we replace $B$ by the set $\mathcal{H}$ of all Hamiltonians $H \in \mathcal{A}_\rho$ that lie within a small distance $\varepsilon > 0$ of $B$. If necessary, we decrease $\varepsilon > 0$ such that $(H \circ U_H)_{0,(1,0,\ldots,0)}$ is bounded away from zero, for all $H \in \mathcal{H}$. This is possible since $|H_{0,(1,0,\ldots,0)}| > 1 - \sigma > 0$, for all $H \in B$. Now Theorem 1.2 and Lemma 1.3 imply that (1.10) defines a compact analytic map $\mathcal{R}_\mu$ from $\mathcal{H}$ to $\mathcal{A}_\rho$, provided that $\mu$ satisfies the second inequality in (1.9). By using the analyticity improving property of the map described in Lemma 1.3, we obtain the same result for $\mathcal{S}_z \circ \mathcal{R}_\mu$ and $\mathcal{R}_\mu \circ \mathcal{S}_z$, uniformly in $z$, for all $z$ in some open neighborhood of the unit circle in $\mathbb{C}$. In order to prove that $\mathcal{S}_z \circ \mathcal{R}_\mu = \mathcal{R}_\mu \circ \mathcal{S}_z$, as claimed, it suffices to consider $|z| = 1$. In this case, it is straightforward to check that $\mathcal{S}_z$ “commutes through” each of the steps used in the proof of Theorem 1.2, yielding

$$ U_{\mathcal{S}_z H} \circ S_z = S_z \circ U_H, \tag{3.7} $$

where $S_z(q, p) = (q, zp)$. This can be done by using the fact that $\mathcal{S}_z \mathbb{I}^- = \mathbb{I}^- \mathcal{S}_z$ and $\{\mathcal{S}_z H, \mathcal{S}_z \phi\} = \mathcal{S}_z \{H, \phi\}$, which implies that $U_{\mathcal{S}_z \phi} \circ S_z = S_z \circ U_\phi$. The details of this computation are left to the reader; see also [15]. □

Remarks.

• Due to the symmetry $\mathcal{S}_z \circ \mathcal{R}_\mu = \mathcal{R}_\mu \circ \mathcal{S}_z$, the transformation $\mathcal{R}_\mu$ maps the scaling orbit $z \mapsto \mathcal{S}_z H$ of a Hamiltonian $H$ to the corresponding scaling orbit for $\mathcal{R}_\mu(H)$. Thus, in situations where these orbits are non-degenerate, a normalization condition can be used to pick an arbitrary representative from each of them. This leads naturally to a transformation (1.10) where the scaling $\mu$ is chosen to depend on $H$, in such a way that the given normalization is preserved.

• A relation analogous to (3.7) holds if the scaling transformations $S_z$ and $\mathcal{S}_z$ are replaced by translations of the angles, given by $J_\gamma(q, p) = (q - \gamma, p)$ and $\mathcal{J}_\gamma H = H \circ J_\gamma$, respectively. The corresponding symmetry $\mathcal{R}_\mu \circ \mathcal{J}_\gamma = \mathcal{J}_{T^{-1}\gamma} \circ \mathcal{R}_\mu$ can be related to observations for certain periodic orbits [1].

• As was mentioned in the introduction, if $H$ is an even function of the angle variable $q$, then the same is true for $\mathcal{R}_\mu(H)$. This can be seen easily from our construction of $U_H$ in the proof of Theorem 1.2.

• Another invariance property of $\mathcal{R}_\mu$ is the following. Given $v \in \mathbb{C}^d$, denote by $A(v)$ the set of Hamiltonians $H$ (in the appropriate domain) for which $v \cdot \nabla_2 H$ is constant. Our proof of Theorem 1.2 shows that $H \circ U_H$ belongs to $A(v)$ whenever $H$ does. Thus, $\mathcal{R}_\mu$ maps $A(v)$ to $A((T^*)^{-1} v)$. The same holds if we define $A(v)$ by the condition $v \cdot \nabla_2 H = 0$.

• The symmetry properties mentioned above, either separately or combined, can be used e.g. to restrict the search for nontrivial RG fixed points (or invariant families) to appropriate invariant subspaces. Such a fixed point (or family) may be relevant only for Hamiltonians that share all of its symmetries. But as the trivial fixed point $h_\omega$ and other examples show, the “domain of relevance” (universality class) may actually be larger; see also [1, 6].

4. Periodic Orbits

We begin by showing that under a certain condition on $w \in \mathbb{R}^d$, every Hamiltonian $H$ near $h_w$ determines a “counterterm” $\Phi(H)$, within a $d$-dimensional subspace of $\mathcal{B}_\rho$ that is roughly parallel to $W^u$, such that a constant multiple of $H + \Phi(H)$ has a periodic


orbit of the type described in Definition 1.7. The subsequent proof of Theorem 1.8 is based on identifying $\Sigma(w)$ locally with $\Phi^{-1}(0)$. A proof of Theorem 1.9 is given at the end of this section, after some results on $d$-parameter families.

Given any $r > 0$, define $A_r$ to be the Banach space of all analytic functions $g$ on the strip $|\mathrm{Im}\, z| < r$, with finite norm $|g|_r = \sum_n |g_n| \exp(r|n|)$, where $g_0, g_{\pm 1}, g_{\pm 2}, \ldots$ are the Fourier coefficients of $g$. In other words, $A_r$ is the one-variable analogue of $\mathcal{A}_r$. On the product space $A_r^d$ we use norms $|.|_r$ and $\|.\|_r$ analogous to those introduced in Definition 1.1.

Theorem 4.1. Let $\delta, r, \rho$ be positive real numbers satisfying $\delta \le r \le 1$ and $r + \delta \le \rho/2$. Let $w$ be a nonzero vector in $\mathbb{R}^d$ such that $\tau w \in \mathbb{Z}^d$, $\tau \ge 4$, and assume that $|V(w - \omega)| \le 2^{-4}$. Define $b = 2^{-5} \delta^2 / \tau$. Then for every Hamiltonian $H$ in $B = \{H \in \mathcal{B}_\rho : |H - h_w|_\rho < b\}$, there exist two complex numbers $\xi$ and $E$, and a vector $u$ perpendicular to $v = V^* V w$, such that the Hamiltonian $\xi(H + h_u) + E$ has a periodic orbit (1.19) at energy zero, with $K(\gamma) = 0$, $Q_0 = 0$, $P_0 = kv$ for some $k \in \mathbb{C}$, and with $Q$ (odd) and $P$ (even) belonging to $A_r^d$. The quantities $\xi, u, E, k, Q, P$ satisfy the bounds

$$ \begin{aligned}
|\xi - 1| &\le \delta'/\delta, & |Vu| &\le \delta'/\delta, & |E| &\le \delta', \\
|k| &\le \delta', & |VQ|_{r/\tau} &\le \tau\delta'/\delta, & \|WP\|_{r/\tau} &\le \tau\delta'/\delta,
\end{aligned} \tag{4.1} $$

with $\delta' = 4|H - h_w|_\rho$, and they are uniquely determined if we require that (4.1) holds for $\delta' = 4b$. Furthermore, the dependence of $\xi, u, E, k, Q, P$ on the Hamiltonian $H \in B$ is analytic.

Proof. Given $\xi \in \mathbb{C}$, and $u \in \mathbb{C}^d$ satisfying $u \cdot v = 0$, we define $x = (\xi - 1) w + \xi u$. If $\xi$ is bounded away from zero, as is the case for the parameter values considered here, the map $(\xi, u) \mapsto x$ is one-to-one. Thus, we shall use $x$ as a parameter and consider $\xi, u$ to be functions of $x$. Let now $H = h_w + h$ be a Hamiltonian in $\mathcal{B}_\rho$, with $|h|_\rho \le 2b$, and consider the family $(x, E) \mapsto G$ defined by

$$ G = \xi (H + h_u) + E = h_w + \xi h + h_x + E. \tag{4.2} $$

Let $r' = r/\tau$, and denote by $D$ the derivative operator on $A_{r'}^d$. Then the equation for a periodic orbit (1.19) of $G$, with $Q_0 = 0$ and $P_0 = kv$, can be written as

$$ \begin{aligned}
DQ &= \tau \big[ \nabla_2 (\xi h + h_x) \big] (\zeta + Q, kv + P), \\
DP &= -\tau \big[ \nabla_1 (\xi h + h_x) \big] (\zeta + Q, kv + P),
\end{aligned} \tag{4.3} $$

where $g(\zeta + Q, kv + P)$ stands for the composition of a given function $g$ with the map $t' \mapsto (t' \tau w + Q(t'), kv + P(t'))$. The variable $t'$ used here is the rescaled time $t' = t/\tau$. Denote by $\mathbb{E} f$ the average of a periodic function $f$. It is convenient to split the first equation in (4.3) into two equations, one for the average, and one for the remaining zero-average part. The result can be written in the form $x = \tilde{x}$ and $Q = \tilde{Q}$, where

$$ \begin{aligned}
\tilde{x} &= -\xi\, \mathbb{E} (\nabla_2 h)(\zeta + Q, kv + P), \\
\tilde{Q} &= \tau \xi\, D^{-1} (I - \mathbb{E}) (\nabla_2 h)(\zeta + Q, kv + P).
\end{aligned} \tag{4.4} $$


Here, $D^{-1}$ denotes the antiderivative operator on $(I - \mathbb{E}) A_{r'}^d$. Similarly, the second equation in (4.3) is equivalent to $P = \tilde{P}$, with

$$ \tilde{P} = -\tau \xi\, D^{-1} (\nabla_1 h)(\zeta + Q, kv + P). \tag{4.5} $$

Notice that in this equation, the function to the right of $D^{-1}$ has automatically a zero average, due to the parity conditions on $Q$, $P$, and $h$. The condition that the integral (1.16) along the periodic orbit $\gamma$ be zero will be written as $k = \tilde{k}$, where (omitting the argument $t'$)

$$ \begin{aligned}
\tilde{k} &= k - \frac{1}{2\pi\tau c} \int_0^{2\pi} (kv + P) \cdot (\tau w + DQ)\, dt' = -\frac{1}{2\pi\tau c} \int_0^{2\pi} P \cdot DQ\, dt' \\
&= -\frac{\xi}{2\pi c} \int_0^{2\pi} P \cdot (\nabla_2 h)(\zeta + Q, kv + P)\, dt',
\end{aligned} \tag{4.6} $$

with $c = v \cdot w$. In addition, we impose the condition that $G = 0$ on the orbit $\gamma$, or equivalently (if $k = \tilde{k}$), that the integral of $G\, dt - p \cdot dq$ along $\gamma$ be zero. This condition can be written in the form $E = \tilde{E}$, where (omitting the argument $t'$)

$$ \begin{aligned}
\tilde{E} &= E - \frac{1}{2\pi} \int_0^{2\pi} G(\zeta + Q, kv + P)\, dt' + \frac{1}{2\pi\tau} \int_0^{2\pi} (kv + P) \cdot (\tau w + DQ)\, dt' \\
&= -\frac{\xi}{2\pi} \int_0^{2\pi} \Big[ h(\zeta + Q, kv + P) - P \cdot (\nabla_2 h)(\zeta + Q, kv + P) \Big] dt' - k\, v \cdot x.
\end{aligned} \tag{4.7} $$

The problem of finding a pair $(x, E)$, such that the Hamiltonian (4.2) has an orbit $\gamma$ with the desired properties, has now been reduced to a fixed point problem for the map $F : (x, E, k, Q, P) \mapsto (\tilde{x}, \tilde{E}, \tilde{k}, \tilde{Q}, \tilde{P})$ defined by Eqs. (4.4) to (4.7). Denote by $\mathcal{X}$ the Banach space of all quintuples $X = (x, E, k, Q, P)$ in $\mathbb{C}^d \times \mathbb{C} \times \mathbb{C} \times A_{r'}^d \times A_{r'}^d$, whose components $Q$ and $P$ have average zero, equipped with the norm

$$ \|X\| = \max\big\{ \tau |Vx|,\ \delta^{-1} \tau |E|,\ \delta^{-1} \tau |k|,\ |VQ|_{r'},\ \|WP\|_{r'} \big\}. \tag{4.8} $$

We note that the conditions (4.1), with $\delta' = 4b$, imply that $\|X\| < \delta/2$. To see this, one first proves e.g. that $|Vx| \le \frac{17}{16} |\xi - 1| + |\xi| |Vu|$. Let us now assume that $\|X\| \le \delta$. Then, by using part (b) of Proposition 2.1, we obtain

$$ \big| g(\zeta + Q, kv + P) \big|_{r'} \le |g|_{\rho - \delta}, \qquad g \in \mathcal{A}_{\rho - \delta}, \tag{4.9} $$

and this bound can be used to show that

$$ \|F(X)\| \le \frac{2\tau}{\delta} |h|_\rho + \frac{2}{\tau} \|X\|^2 < \frac{\delta}{4}. \tag{4.10} $$

The proof of the first inequality in (4.10) is mildly tedious, but straightforward. This estimate, together with Cauchy’s formula, can also be used to show that the derivative of $F$, at every point in the ball $\|X\| \le \delta/2$, is of norm less than $\frac{1}{2}$. Thus, by the contraction


mapping principle, $F$ has a unique fixed point $X_0$ in this ball. From the bound (4.10), it follows that

$$ \|X_0\| \le \frac{16\tau}{7\delta} |h|_\rho. \tag{4.11} $$
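The step from (4.10) to (4.11) can be traced on a scalar model. If a number $x \ge 0$ with $x < \delta/4$ satisfies $x \le A + (2/\tau) x^2$, where $A = (2\tau/\delta)|h|_\rho$, then $x \le A + \frac{\delta}{2\tau} x \le A + \frac{1}{8} x$ for $\tau \ge 4$ and $\delta \le 1$, hence $x \le \frac{8}{7} A$, which is the right-hand side of (4.11). The sketch below iterates the scalar map $x \mapsto A + (2/\tau) x^2$ as a stand-in for $F$; this majorant and the function name are assumptions made for illustration, not the map used in the proof.

```python
import math

def majorant_fixed_point(tau, delta, h_norm, iterations=200):
    """Iterate x -> a + c*x**2 from x = 0, with a = 2*tau*h_norm/delta, c = 2/tau.

    This scalar map only mimics the right-hand side of (4.10); it is a
    hypothetical model and not the map F of the paper.
    """
    a = 2.0 * tau / delta * h_norm
    c = 2.0 / tau
    x = 0.0
    for _ in range(iterations):
        x = a + c * x * x
    return x

if __name__ == "__main__":
    tau, delta = 4.0, 1.0
    h_norm = 2.0 * (2.0 ** -5) * delta ** 2 / tau  # |h| = 2b, the largest value allowed
    x = majorant_fixed_point(tau, delta, h_norm)
    print(x, 16.0 * tau * h_norm / (7.0 * delta), 1.0 - math.sqrt(0.75))
```

For these sample values the iteration converges to $1 - \sqrt{3}/2 \approx 0.134$, which lies below both $\delta/4$ and $16\tau|h|_\rho/(7\delta) = 1/7$.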

By combining (4.11) with the two estimates $|\xi - 1| \le \frac{8}{7} |Vx|$ and $|Vu| \le \frac{4}{3} |Vx|$, we obtain the inequalities (4.1) with $\delta' = 4|h|_\rho$. The analytic dependence of $\xi, u, E, k, Q, P$ on $H \in B$ follows from the uniform convergence of $F^n(0) \to X_0$ as $n \to \infty$. □

Proof of Theorem 1.8. We will use the notation and assumptions made in Theorem 4.1. In addition, we assume that $h_w$ lies on $W^u$, and that $\delta > 0$ is sufficiently small such that $B$ is contained in the domain of $\mathcal{R}_\mu$. Denote by $X$ the subspace of $\mathcal{B}_\rho$, consisting of all Hamiltonians of the form $c h_w + f$, with $c \in \mathbb{C}$, and with $f$ a function in $\mathcal{B}_\rho$, whose Fourier–Taylor coefficients $f_{\nu,\alpha}$ are zero whenever $\nu = 0$ and $|\alpha| < 2$. In addition, let $Y$ be the space of all Hamiltonians of the form $h_y + C$, with $C$ a constant function, $y \in \mathbb{C}^d$, and $y \cdot v = 0$. Then we can identify $\mathcal{B}_\rho$ with $X \oplus Y$. Define a map $\varphi$ from $X \cap B$ to $Y$, by setting $\varphi(H) = h_u + E/\xi$, with $H \mapsto (\xi, u, E)$ as described in Theorem 4.1. The graph $\Sigma'$ of this map $\varphi$ is clearly an analytic manifold of codimension $d$ in $\mathcal{B}_\rho$. In addition, $\Sigma'$ intersects $W^u$ transversally at $h_w$. This follows essentially from the fact that $\varphi(c h_w) = 0$: Due to the identity $D\varphi(h_w) h_w = 0$, it suffices to verify the transversality property in the $(d+1)$-dimensional subspace of all Hamiltonians of the form $(q, p) \mapsto y \cdot p + C$, where it is trivial.

In order to compare $\Sigma'$ with the set $\Sigma(w)$ defined in Definition 1.7, consider two different choices $(\delta_1, r_1)$ and $(\delta_2, r_2)$ for the parameters $(\delta, r)$ in Theorem 4.1, with $\delta_1 < \delta_2 < r_2 < r_1$. Denote by $B_1$ and $B_2$ the corresponding balls $B$, and by $\Sigma'_1$ and $\Sigma'_2$ the corresponding manifolds $\Sigma'$. Then, by the uniqueness part of Theorem 4.1, the intersection of $\Sigma'_2$ with $B_1$ is equal to $\Sigma'_1$. Furthermore, if we set $\rho' = r_1$ and choose $\varepsilon > 0$ sufficiently small, then $\Sigma(w) \cap B_1$ is contained in $\Sigma'_2$. And by choosing $\delta_1 > 0$ sufficiently small, we also have $\Sigma'_1 \subset \Sigma(w)$. Thus, $\Sigma(w)$ agrees with $\Sigma'_1$ in the ball $B_1$.

We note that the same choice of parameters works for all rotation vectors $w \in \mathbb{R}\mathbb{Z}^d$ that are sufficiently close to $\omega$. (This fact is not used later on.) The reason for this is that our bounds depend on $w$ only through the constant $\tau = \tau(w)$. □

Our proof of Theorem 1.9 is based on the graph transform method; see e.g. [14]. We start by introducing some notation. Denote by $\mathcal{X}$ and $\mathcal{Y}$ the stable and unstable subspaces, respectively, of the linearized RG transformation $D\mathcal{R}_\mu(h_\omega)$, restricted to $\mathcal{B}_\rho$. Denote by $\psi$ the (analytic) function, defined on an open neighborhood of $h_\omega$ in $\mathcal{X}$, with values in $\mathcal{Y}$, whose graph is the local stable manifold $W^s$ of $\mathcal{R}_\mu$ at $h_\omega$. As was mentioned in the introduction, $\mathcal{Y}$ is spanned by the $d - 1$ functions listed in (1.13), together with the constant function. The canonical projections onto $\mathcal{X}$ and $\mathcal{Y}$ will be denoted by $P_s$ and $P_u$, respectively. Due to our choice of norm on $\mathcal{B}_\rho$, both $P_s$ and $P_u$ have operator norm 1. Consider the transformation $\mathcal{N}_\mu$,

$$ \mathcal{N}_\mu = \Psi^{-1} \circ \mathcal{R}_\mu \circ \Psi, \qquad \Psi = I + \psi \circ P_s, \tag{4.12} $$

defined on a neighborhood of $h_\omega$ in $\mathcal{B}_\rho$. Notice that $\mathcal{N}_\mu$ and $\mathcal{R}_\mu$ have the same derivatives at the fixed point $h_\omega$, and the same local unstable manifolds. But the local stable manifold of $\mathcal{N}_\mu$ is trivial; that is, it agrees with $\mathcal{X}$ near $h_\omega$. The largest (in modulus) contracting eigenvalue of $D\mathcal{N}_\mu(h_\omega)$ is $\lambda = \mu \vartheta_1 \vartheta_d^{-2}$. Let $\theta$ be some fixed real number larger than


$\vartheta_1 |\vartheta_d|^{-2}$. In the remaining part of this section, we restrict the possible choices of $\mu$ by imposing the condition $\theta\mu < |\lambda_2|^{-1}$. Since $D\mathcal{N}_\mu(h_\omega)$ is compact, we can choose a norm $\|.\|$ on $\mathcal{X}$ that is equivalent to $\|.\|_\rho$, but for which the restriction of $D\mathcal{N}_\mu(h_\omega)$ to $\mathcal{X}$ has (operator) norm less than $\theta\mu$. Assume now that such a norm has been chosen. The extension to $\mathcal{B}_\rho = \mathcal{X} \oplus \mathcal{Y}$ is defined by setting $\|x + y\| = \|x\| + \|y\|_\rho$, for every $x \in \mathcal{X}$ and $y \in \mathcal{Y}$. Given any $\delta > 0$, let $D_\delta = \{y \in \mathcal{Y} : \|y\| < \delta\}$, and define $\mathcal{F}_\delta$ to be the Banach space of all analytic $d$-parameter families $F : D_\delta \to \mathcal{B}_\rho$, that extend continuously to the boundary of $D_\delta$ and satisfy $P_u F(0) = 0$, equipped with the norm

$$ \|F\|_\delta = \sup_{y \in D_\delta} \|F(y)\|. \tag{4.13} $$

Of particular interest is the family $F^*$, defined by $F^*(y) = h_\omega + y$, as it parametrizes the local unstable manifold of $\mathcal{R}_\mu$.

Proposition 4.2. If $\delta > 0$ is sufficiently small, then the equation

$$ \mathcal{M}_\mu(F) = \mathcal{N}_\mu \circ F \circ Y_F, \qquad Y_F = (P_u \circ \mathcal{N}_\mu \circ F)^{-1}, \tag{4.14} $$

defines an analytic contraction mapping $\mathcal{M}_\mu$, on some open neighborhood $B$ of $F^*$ in $\mathcal{F}_\delta$, with fixed point $F^*$, and contraction rate less than $\theta\mu$.

The proof of this proposition is straightforward: $P_u \circ \mathcal{N}_\mu \circ F^*$ agrees with the restriction of $D\mathcal{R}_\mu$ to $\mathcal{Y}$, so that by the inverse function theorem, $(F, y) \mapsto Y_F(y)$ is well defined and analytic near $(F^*, 0)$ in $\mathcal{F}_r \times \mathcal{Y}$; and the asserted contraction property of $\mathcal{M}_\mu$ follows from the identity

$$ D\mathcal{M}_\mu(F^*) f(y) = P_s\, D\mathcal{N}_\mu\big( F^*(Y_{F^*}(y)) \big) f\big( Y_{F^*}(y) \big), \qquad f \in \mathcal{F}_r,\ y \in D_r, \tag{4.15} $$

which can be verified by an explicit computation. Notice that $Y_{F^*}$ is linear, and that $Y_F(0) = 0$ for any $F$. The following proposition will be used to estimate the composition of maps $f_n = Y_{F_n}$, associated with an orbit $(F_0, F_1, F_2, \ldots)$ of $\mathcal{M}_\mu$.

Proposition 4.3. Let $U$, $V$ be normed linear spaces, and let $Z$ be an open ball in $U \oplus V$, centered at zero. Let $L$ be a bounded linear operator on $U \oplus V$ that commutes with the projection $(u, v) \mapsto (u, 0)$, and that satisfies

$$ \|L(u, 0)\| = a \|u\|, \qquad \|L(0, v)\| \le b \|v\|, \qquad u \in U,\ v \in V, \tag{4.16} $$

with $0 < b < a$ fixed. Let $f_0, f_1, \ldots, f_{n-1}$ be Lipschitz maps on $Z$, that leave the origin fixed. Then

$$ \pm a^{-n} \big\| f_0(f_1(\cdots f_{n-1}(u, v) \cdots)) \big\| \le \pm \|u\| + e^{c_0} c_0 \|u\| + (b/a)^n e^{c_0 + c_1} \|v\|, \tag{4.17} $$

where

$$ c_m = \frac{1}{a} \sum_{j=0}^{n-1} \Big( \frac{a}{b} \Big)^{m(j+1)} \sup_{z \ne 0} \frac{\|f_j(z) - Lz\|}{\|z\|}, \qquad m = 0, 1. \tag{4.18} $$
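The two-sided estimate (4.17) can be tried out on a small example. The sketch below is a hedged illustration rather than part of the proof: it takes $U = V = \mathbb{R}$ with the norm $\|(u, v)\| = |u| + |v|$, the diagonal operator $L(u, v) = (au, bv)$, and the hypothetical perturbed maps $f_j(u, v) = (au + s_j v,\ bv + s_j u)$ with $s_j \ge 0$, for which the supremum in (4.18) equals $s_j$ exactly.

```python
import math

def compose_bounds(a, b, s, u, v):
    """Evaluate both sides of (4.17) for f_j(u, v) = (a*u + s[j]*v, b*v + s[j]*u).

    Returns (lower, value, upper), where value = a**(-n) * ||f_0(...f_{n-1}(u, v))||
    in the norm |u| + |v|. The maps f_j are a hypothetical test family.
    """
    n = len(s)
    x, y = u, v
    for j in reversed(range(n)):  # f_{n-1} is applied first, f_0 last
        x, y = a * x + s[j] * y, b * y + s[j] * x
    value = (abs(x) + abs(y)) / a ** n
    c0 = sum(s) / a
    c1 = sum((a / b) ** (j + 1) * s[j] for j in range(n)) / a
    tail = (b / a) ** n * math.exp(c0 + c1) * abs(v)
    upper = abs(u) + math.exp(c0) * c0 * abs(u) + tail
    lower = abs(u) - math.exp(c0) * c0 * abs(u) - tail
    return lower, value, upper

if __name__ == "__main__":
    s = [0.05 * 0.25 ** j for j in range(10)]
    print(compose_bounds(2.0, 1.0, s, 1.0, 1.0))
    print(compose_bounds(2.0, 1.0, s, 0.0, 1.0))
```

The perturbation sizes $s_j$ are chosen to decay faster than $(b/a)^j$, so that $c_1$ stays bounded as $n$ grows; this mirrors the role of the condition $\theta\mu < b$ in the application of the proposition.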


Proof. Denote by $s_j$ the value of the supremum in (4.18). Let $(u_0, v_0)$ be an arbitrary point in $Z$. For $k = 1, 2, \ldots, n$, define $(u_k, v_k) = L^k(u_0, v_0)$ and

$$ r_k = f_{n-k} \circ f_{n-k+1} \circ \ldots \circ f_{n-1}(u_0, v_0) - (u_k, v_k). \tag{4.19} $$

By (4.16) we have $\|u_k\| = a^k \|u_0\|$ and $\|v_k\| \le b^k \|v_0\|$. If we set $r_0 = 0$, then

$$ \begin{aligned}
\|r_k\| &= \big\| f_{n-k}\big( (u_{k-1}, v_{k-1}) + r_{k-1} \big) - L(u_{k-1}, v_{k-1}) \big\| \\
&\le \|L r_{k-1}\| + s_{n-k} \big\| (u_{k-1}, v_{k-1}) + r_{k-1} \big\| \\
&\le (a + s_{n-k}) \|r_{k-1}\| + s_{n-k} a^{k-1} \|u_0\| + s_{n-k} b^{k-1} \|v_0\|,
\end{aligned} \tag{4.20} $$

for all positive $k \le n$. Define $p_k = \prod_{j=n-k}^{n-1} (1 + s_j/a)$. Then, by applying the bound (4.20) recursively, we obtain

$$ \begin{aligned}
\frac{1}{a^k p_k} \|r_k\| &\le \frac{1}{a^{k-1} p_{k-1}} \|r_{k-1}\| + \frac{s_{n-k}}{a} \|u_0\| + \Big( \frac{b}{a} \Big)^n \Big( \frac{a}{b} \Big)^{n-k+1} \frac{s_{n-k}}{a} \|v_0\| \\
&\le \sum_{j=n-k}^{n-1} \frac{s_j}{a} \|u_0\| + \Big( \frac{b}{a} \Big)^n \sum_{j=n-k}^{n-1} \Big( \frac{a}{b} \Big)^{j+1} \frac{s_j}{a} \|v_0\|.
\end{aligned} \tag{4.21} $$

In particular, $a^{-n} \|r_n\| \le p_n c_0 \|u_0\| + (b/a)^n p_n c_1 \|v_0\|$. The assertion follows by combining this inequality with the trivial bounds $p_n \le e^{c_0}$ and $1 + p_n c_1 \le e^{c_0 + c_1}$. □

Proposition 4.3 can be used to prove uniform upper and lower bounds on $\beta_n - \beta'$, which we will need in order to estimate the accumulation rate of periodic orbits, as described in Theorem 1.9. The following proposition is the first step toward this goal. We note that a general result of this type was proved in [23], but it is not sufficient for our purpose.

Proposition 4.4. If $\delta > 0$ is sufficiently small, and if $w \in \mathbb{R}\mathbb{Z}^d$ is sufficiently close to $\omega$ and normalized, such that $h_w - h_\omega$ belongs to $D_{\delta/2}$, then there exists an open neighborhood $B'$ of $F^*$ in $\mathcal{F}_\delta$, and constants $k_1, k_2, N > 0$, such that the following holds. For every $F \in B'$, and for every non-negative integer $n$, the condition $\Psi(F(y)) \in \Sigma_n(w)$ defines a unique parameter value $y = y_n$ in $D_\delta$, and this value satisfies the bound

$$ k_1 |\lambda_2|^{-n} < \|y_n\| < k_2 |\lambda_2|^{-n}, \qquad n \ge N. \tag{4.22} $$

Proof. Under the given conditions on $\delta$ and $w$, Theorem 1.8 guarantees that $F^*$ intersects the manifold $\Psi^{-1}(\Sigma_0(w))$ transversally, and in a single point. Since transversality is preserved under small perturbations, the same holds for every family $F$ in some open ball $B' \subset \mathcal{F}_\delta$ centered at $F^*$, and the intersection parameter $y_0$ depends analytically on $F$. Denote by $Z$ the map $F \mapsto y_0$, and by $r'$ the radius of $B'$. In addition, let $z = Z(F^*)$ and $L = DY_{F^*}(0)$. Assume that $\delta, r' > 0$ have been chosen sufficiently small, such that the restriction of $\mathcal{M}_\mu$ to $B'$ has the properties described in Proposition 4.2, and such that the derivatives of $F \mapsto Y_F$ and $\mathcal{M}_\mu$ are uniformly bounded on $B'$. Given any $F_0 \in B'$, we set $F_n = \mathcal{M}_\mu^n(F_0)$, for $n = 1, 2, \ldots$. By the definition of $\mathcal{M}_\mu$, the family $F_0$ intersects $\Psi^{-1}(\Sigma_n(w))$ at a single point $y_n$, given by the equation

$$ y_n = Y_{F_0} \circ Y_{F_1} \circ \ldots \circ Y_{F_{n-1}}(z_n), \qquad z_n = Z(F_n). \tag{4.23} $$


By Proposition 4.2, the norm of Fn − F ∗ is bounded by (θ µ)n kF0 − F ∗ k, for all n ≥ 0. By analyticity, analogous bounds hold for kzn − zk and kDYFn (y) − Lk, up to constant factors that we can choose to be independent of n, F ∈ B, and y ∈ Dδ/2 . Notice that L is the inverse of the restriction of DRµ (hω ) to Y. All eigenvalues of L are either of modulus a, or of modulus less than b, with θ µ < b < a. Given these properties of the maps YFn and L, the bound (4.22) now follows from Proposition 4.3, provided that z has a nonzero component in a spectral subspace of L corresponding to an eigenvalue of modulus a. But z = hw − hω meets this requirement, since w = c1 W1 + . . . + cd Wd , with cj 6 = 0 for all j . This follows from the fact that the fields Q[ϑj ] are all isomorphic, and that Q[ϑ1 ] is spanned by the rationally independent components of ω. u t Proof of Theorem 1.9. Let w 0 be a nonzero vector in RZd . In order to prove the assertion for w = w 0 , it suffices to prove it for w = c(ϑ −1 T )k w0 , where c can be any nonzero real number, and k any nonnegative integer. Thus, since c(ϑ −1 T )k w 0 → ω for some value of c, we can, without loss of generality, assume that w is as close to ω as required by Proposition 4.4, after choosing δ > 0 sufficiently small. In what follows, c0 , . . . , c18 denote positive constants that do not depend on the choice of the Hamiltonian H . But unless stated otherwise, ci may depend on µ. We will also introduce constants b1 , . . . , b6 that only depend on the choice of T . We assume that µ > 0 has been chosen sufficiently small such that bi µ < |λ2 |−1 . Since r > ρ, there exists an open neighborhood 3 of zero in Cd , such that (β, H ) 7 → Hβ is analytic and has bounded derivatives, as a map from 3 × Br to Bρ . When considering families β 7 → Hβ , we will implicitly assume that β ∈ 3. 
Let now $h$ be a Hamiltonian in $B_r$, of the form (1.18), with $M$ as described after (1.18), and assume that $h$ is sufficiently close to $h_\omega$, such that $N_\mu$ is well defined and analytic on an open ball in $B_\rho$ containing $h$ and $h_\omega$. Define $Y(\beta) = P_u \Psi^{-1}(h \circ R_\beta)$. Then $Y(0) = 0$, and our assumption on $M$ implies that $Y$ is invertible as a map from an open neighborhood of zero in $\mathbb{C}^d$, to $\mathcal{Y}$. Define $f_0(y) = \Psi^{-1}(h \circ R_\beta)$, with $\beta = Y^{-1}(y)$. An explicit computation shows that for $n = 1, 2, \dots$, Eq. (4.14) defines a family $f_n = M_\mu(f_{n-1})$, which belongs to $F_\delta$ for large $n$, and that $f_n \to F^*$ in $F_\delta$, as $n$ tends to infinity. Thus, given any open neighborhood $B'$ of $F^*$ in $F_\delta$, there exists a positive integer $\ell$, such that if $H \in B_r$ is sufficiently close to $h$, then the equation

$$F(y) = \Psi^{-1}\bigl(R_\mu^{\ell}(H_{\beta'} \circ R_{Z_\ell(y)})\bigr), \qquad Z_\ell = Y^{-1} \circ Y_{f_0} \circ \dots \circ Y_{f_{\ell-1}}, \tag{4.24}$$

defines a family $F \in B'$. Here, $\beta'$ denotes the parameter value where $\beta \mapsto H_\beta$ intersects $W^s$. This value is well defined and depends analytically on $H$, since $\beta \mapsto h_\beta$ intersects $W^s$ transversally at $h$. By using Proposition 4.4, and the fact that $Z_\ell$ is invertible near the origin, we can find constants $k_1', k_2', N' > 0$, such that for all $n \ge N'$, and for all Hamiltonians $H$ in some open neighborhood $B$ of $h$ in $B_r$, the condition $H_\beta \in \Sigma_n(w)$ defines a unique parameter value $\beta = \beta_n$, and this value satisfies the bound

$$k_1' |\lambda_2|^{-n} < |\beta_n - \beta'| < k_2' |\lambda_2|^{-n}. \tag{4.25}$$
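A bound of the form (4.25) says that the parameter values $\beta_n$ accumulate at $\beta'$ geometrically, at rate $|\lambda_2|^{-1}$. The following minimal sketch illustrates how such a rate could be read off from a sequence by taking ratios of successive differences. Everything in it is an assumption made for illustration: the sequence is synthetic and scalar (in the text $\beta$ ranges over a subset of $\mathbb{C}^d$), and the names `estimate_rate`, `BETA_PRIME`, `LAM`, `C`, `D` are hypothetical and do not come from the paper.

```python
def estimate_rate(betas):
    """Ratios of successive differences |b_n - b_{n+1}| / |b_{n+1} - b_{n+2}|.

    For a sequence converging geometrically like lam**(-n), these ratios
    tend to lam.
    """
    diffs = [abs(a - b) for a, b in zip(betas, betas[1:])]
    return [d0 / d1 for d0, d1 in zip(diffs, diffs[1:])]


# Synthetic data (NOT from the paper): a leading geometric term plus a
# faster-decaying correction, beta_n = beta' + C*lam**(-n) + D*lam**(-2n).
BETA_PRIME, LAM, C, D = 0.3, 2.5, 1.0, 0.7
betas = [BETA_PRIME + C * LAM ** (-n) + D * LAM ** (-2 * n) for n in range(1, 16)]

ratios = estimate_rate(betas)

# Analogue of (4.25): the rescaled distances |beta_n - beta'| * lam**n stay
# between two positive constants (here they equal C + D*lam**(-n)).
scaled = [abs(b - BETA_PRIME) * LAM ** n for n, b in enumerate(betas, start=1)]
```

In this toy setting the last entries of `ratios` approach `LAM`, and `scaled` stays inside a fixed positive interval, which is the content of a two-sided bound like (4.25).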

In what follows, we assume that $n$ is larger than $N'$. Define $H_{\beta,m} = R_\mu^m(H_\beta)$, whenever $H_\beta$ belongs to the domain of $R_\mu^m$. Consider now a fixed but arbitrary Hamiltonian $H \in B$. Given that $H_{\beta_n,n}$ lies on $\Sigma(w)$, a constant multiple of this Hamiltonian has a periodic orbit with rotation vector $w$, given by an analytic curve $g_n$ in $D_\rho$ with fixed domain, as described in Definition 1.7. By the definition of $R_\mu$, the Hamiltonian $H_{\beta_n,n}$ is related to $H_{\beta_n}$ by a canonical change of coordinates homotopic to $T_1^n$, and a scaling. Formally, a constant multiple of $H_{\beta_n}$ has a periodic orbit $G_n$ with rotation vector $w_n = (\vartheta^{-1}T)^n w$, and this orbit is given by the equation

$$G_n = U_{H_{\beta_n,0}} \circ T_\mu \circ U_{H_{\beta_n,1}} \circ T_\mu \circ \dots \circ U_{H_{\beta_n,n-1}} \circ T_\mu \circ g_n \circ \Theta^{-n} = V_n(H_{\beta_n}) \circ T_\mu^n \circ g_n \circ \Theta^{-n}, \tag{4.26}$$

where $\Theta(t) = \vartheta t$ for all $t$, and where $V_n(H_{\beta_n})$ is the transformation given by (1.14). But in order to establish the existence of such an orbit $G_n$, we need to show that the maps in Eq. (4.26) can be composed as indicated. By construction, the size of the domain (width of the strip in the angle variable, and diameter of the ball in the action variable) of the transformation $V_n(H_{\beta_n})$ decreases exponentially with $n$. But the rate of decrease is independent of $\mu$: Due to the identity (3.7), the transformations $V_k(H_{\beta_n})$, defined in (1.14), are not only canonical, but also independent of the choice of $\mu$. Thus, the same is true for $V_n(H_{\beta_n})$. On the other hand, the nonlinear part of $g_n$ is bounded by a constant times $|H_{\beta_n,n} - h_w|_\rho$, as was shown in Theorem 4.1. And this norm is less than $c_0(\theta\mu)^n$. This follows from Proposition 4.2, and from the fact that $\Sigma(w)$ intersects $F^*$ transversally at $h_w$. Thus, if $\mu > 0$ is chosen sufficiently small, we find that the range of $T_\mu^n \circ g_n$ is contained in the domain of $V_n(H_{\beta_n})$, for large $n$. A more detailed discussion of the transformations $V_n(H)$ can be found in [15]. The estimates obtained there are formulated for $H = H_{\beta'}$ only, but they are easy to adapt to the Hamiltonians considered here. To be more precise, the starting point for these estimates is a bound

$$\|\phi(H_{\beta',m})\|'_{r_0} \le c_2 \|I^- H_{\beta',m}\|_\rho \le c_2 c_1 (b_1\mu)^m \|H_{\beta'} - h_\omega\|_\rho, \tag{4.27}$$

where $\phi(H_{\beta',m})$ is the generating function of the canonical transformation $U_{H_{\beta',m}}$. The constants $c_2$ and $b_1$, and the parameter $r_0$ defining the domain of $\phi_m(H_{\beta',m})$, are independent of $\mu$. We have omitted an additional factor $(b_1\mu)^m$ that appears in Eq. (5.11) of [15], since it is not used or needed. Concerning the replacement of $H_{\beta'}$ by $H_{\beta_n}$, we note that the first inequality in (4.27) is a general bound on $\phi$ that applies directly to $H_{\beta_n,m}$. The second inequality in (4.27) carries over as well. In fact, it can be improved if $H$ is close to $h$:

$$\|I^- H_{\beta,m}\|_\rho \le c_3 (\theta\mu)^m \|H - h\|_r, \qquad H \in B, \tag{4.28}$$

for all $\beta$ in some open set $\Lambda_m \subset \mathbb{C}^d$ containing $\beta'$ and $\beta_n$. This inequality is trivial for $m \le \ell$, and if $m = \ell + k$ with $k$ positive, it follows from the bound

$$\bigl\|I^- \Psi(M_\mu^k(F))(y)\bigr\|_\rho = \bigl\|I^- \Psi(M_\mu^k(F))(y) - I^- \Psi(M_\mu^k(f_\ell))(y)\bigr\|_\rho \le c_4 \bigl\|M_\mu^k(F) - M_\mu^k(f_\ell)\bigr\| \le c_4 (\theta\mu)^k \|F - f_\ell\| \le c_5 (\theta\mu)^k \|H - h\|_r. \tag{4.29}$$

Here, $F$ is the family defined in (4.24), and $y$ is an arbitrary point in $D_\delta$. We note that for $m = \ell + k$, the abovementioned set $\Lambda_m$ can be taken to be the image of $D_\delta$ under the map $Z_\ell \circ Y_{F_0} \circ \dots \circ Y_{F_{k-1}}$, where $F_n = M_\mu^n(F)$. Since $Y_{F_n} \to Y_{F^*}$ uniformly on $D_\delta$, there exists a universal constant $b > 0$, such that $\Lambda_m$ contains a union of balls of radii $c_6 b^m$, whose centers trace out a path of length $\le c_7 b^{-m} |\beta_n - \beta'|$, connecting $\beta_n$ and $\beta'$. Thus, by using (4.29), together with Cauchy’s formula to estimate the derivative of $\beta \mapsto I^- H_{\beta,m}$ along the path from $\beta_n$ to $\beta'$, we obtain the first of the following two bounds:

$$\|I^- H_{\beta_n,m} - I^- H_{\beta',m}\|_\rho \le c_8 (b_2\mu)^m |\beta_n - \beta'| \, \|H - h\|_r, \qquad \|H_{\beta_n,m} - H_{\beta',m}\|_\rho \le c_9 b_3^m |\beta_n - \beta'|. \tag{4.30}$$

The second bound is obtained similarly. Consider now the curve $\gamma_n = R_{\beta_n} \circ G_n$. Clearly, $\gamma_n$ is a periodic orbit for a constant multiple of $H$, with rotation vector $w_n$. Our next goal is to show that the $n$-dependence of $\gamma_n(0)$ is mostly due to the translations $R_{\beta_n}$. To this end, we split $G_n(0) - \Gamma_{H_{\beta'}}(0)$ into three pieces and use the triangle inequality:

$$\bigl|G_n(0) - \Gamma_{H_{\beta'}}(0)\bigr| \le \bigl|V_n(H_{\beta_n})\bigl(T_\mu^n(g_n(0))\bigr) - V_n(H_{\beta_n})(0)\bigr| + \bigl|V_n(H_{\beta_n})(0) - V_n(H_{\beta'})(0)\bigr| + \bigl|V_n(H_{\beta'})(0) - \Gamma_{H_{\beta'}}(0)\bigr|. \tag{4.31}$$

The first term on the right hand side of (4.31) can be estimated by using the results from [15], with $H_{\beta'}$ replaced by $H_{\beta_n}$, as mentioned above. The relevant fact is that $V_n(H_{\beta_n})$ is uniformly bounded on a domain whose size decreases exponentially in $n$, independently of $\mu$, while $|g_n(0)| \le c_{10}(\theta\mu)^n$, as mentioned earlier. Thus,

$$\bigl|V_n(H_{\beta_n})\bigl(T_\mu^n(g_n(0))\bigr) - V_n(H_{\beta_n})(0)\bigr| \le c_{11}(b_4\mu)^n. \tag{4.32}$$

In order to bound the second term on the right hand side of (4.31), we define an interpolating family of transformations $s \mapsto V_{n,s}(H)$, such that $V_{n,0}(H) = V_n(H_{\beta'})$ and $V_{n,1}(H) = V_n(H_{\beta_n})$. To obtain $V_{n,s}(H)$, each of the transformations $U_{H_{\beta',m}}$ that enter the definition (1.14) of $V_n(H_{\beta'})$, is replaced by the canonical transformation with generating function

$$\phi_{n,m,s}(H) = \phi(H_{\beta',m}) + s\bigl[\phi(H_{\beta_n,m}) - \phi(H_{\beta',m})\bigr]. \tag{4.33}$$

By using the analyticity of the map $H \mapsto \phi(H)$, and the bounds (4.25), (4.29), and (4.30), we find that for $|s| \le |\lambda_2|^n$,

$$\|\phi_{n,m,s}(H)\|'_{r_0} \le c_{12}\|I^- H_{\beta',m}\|_\rho + c_{13}|s| \, \|I^- H_{\beta_n,m} - I^- H_{\beta',m}\|_\rho + c_{14}|s| \, \|I^- H_{\beta',m}\|_\rho \|H_{\beta_n,m} - H_{\beta',m}\|_\rho \le c_{15}(b_5\mu)^m \|H - h\|_r. \tag{4.34}$$

With this bound replacing (4.27), we now obtain the analogue of Lemma 5.5 in [15], which implies that $v(s) = V_{n,s}(H)(0)$ is bounded in modulus by $c_{16}\|H - h\|_r$, if $n$ is sufficiently large and $|s| \le |\lambda_2|^n$. Thus, by writing $v(1) - v(0)$ as the integral of $v'$ over $[0, 1]$, and estimating $v'$ by using Cauchy’s formula with contour $|s| = |\lambda_2|^n$, we obtain the bound

$$\bigl|V_n(H_{\beta_n})(0) - V_n(H_{\beta'})(0)\bigr| \le c_{17}|\lambda_2|^{-n}\|H - h\|_r. \tag{4.35}$$

The last term in (4.31) satisfies again a bound of the form (4.32), as was shown in [15]. Putting the pieces together, we now have

$$\bigl|G_n(0) - \Gamma_{H_{\beta'}}(0)\bigr| \le c_{17}|\lambda_2|^{-n}\|H - h\|_r + c_{18}(b_6\mu)^n, \tag{4.36}$$

provided that $H$ is sufficiently close to $h$ in $B_r$, and $n$ sufficiently large. Since

$$\pm|\gamma_n(0) - \Gamma'(0)| \le \pm|\beta_n - \beta'| + \bigl|G_n(0) - \Gamma_{H_{\beta'}}(0)\bigr|, \tag{4.37}$$

as a result of the identities $\gamma_n(0) = G_n(0) + (0, \beta_n)$ and $\Gamma'(0) = \Gamma_{H_{\beta'}}(0) + (0, \beta')$, the bound (1.21) now follows from (4.25) and (4.36), if $H$ is sufficiently close to $h$. □

We conclude with a proof of (1.22), using the same notation as above. By the definition of $\Sigma(w)$, we have $H_{\beta_n,n} \circ g_n = 0$, and thus $H \circ \gamma_n = H_{\beta_n} \circ G_n = a_n H_{\beta_n,n} \circ g_n \circ \Theta^{-n} = 0$, where $a_n$ is some nonzero constant. If $U$ is a canonical transformation with globally defined generating function, or a composition of such transformations, then $K(U \circ \gamma) = K(\gamma)$ for any closed curve $\gamma$. The corresponding identity for $T_\mu$ is $K(T_\mu \circ \gamma) = \mu K(\gamma)$. As a result, we have $K(G_n) = \mu^n K(g_n) = 0$, and thus $K(\gamma_n) = \tau(w_n)\, w_n \cdot \beta_n$, which is equivalent to Eq. (1.22).

Acknowledgements. We would like to thank R. de la Llave and P. Wittwer for helpful discussions.

References

1. Abad, J.J., Koch, H., Wittwer, P.: A Renormalization Group for Hamiltonians: Numerical Results. Nonlinearity 11, 1185–1194 (1998)
2. Arnold, V.I.: Proof of A.N. Kolmogorov’s Theorem on the Preservation of Quasi-Periodic Motions under Small Perturbations of the Hamiltonian. Usp. Mat. Nauk 18, No. 5, 13–40 (1963); Russ. Math. Surv. 18, No. 5, 9–36 (1963)
3. Bernstein, D., Katok, A.: Birkhoff Periodic Orbits for Small Perturbations of Completely Integrable Hamiltonian Systems with Convex Hamiltonians. Invent. Math. 88, 225–241 (1987)
4. Chandre, C., Govin, M., Jauslin, H.R.: KAM-Renormalization Group Analysis of Stability in Hamiltonian Flows. Phys. Rev. Lett. 79, 3881–3884 (1997)
5. Chandre, C., Govin, M., Jauslin, H.R., Koch, H.: Universality for the Breakup of Invariant Tori in Hamiltonian Flows. Phys. Rev. E 57, 6612–6617 (1998)
6. Chandre, C., Jauslin, H.R., Benfatto, G., Celletti, A.: An Approximate Renormalization-Group Transformation for Hamiltonian Systems with Three Degrees of Freedom. Preprint U. Texas, mp_arc 99–74 (1999)
7. Collet, P., Eckmann, J.-P.: Iterated Maps on the Interval as Dynamical Systems. Basel–Boston–Berlin: Birkhäuser Verlag, 1980
8. de la Llave, R.: Introduction to KAM Theory. Preprint U. Texas, mp_arc 93–8 (1993)
9. del Castillo-Negrete, J., Greene, J.M., Morrison, P.: Area Preserving Non-Twist Maps: Periodic Orbits and Transition to Chaos. Phys. D 91, 1–23 (1996)
10. Delshams, A., de la Llave, R.: KAM Theory and a Partial Justification of Greene’s Criterion for Non-Twist Maps. Preprint U. Texas, mp_arc 98–732 (1998)
11. Escande, D.F., Doveil, F.: Renormalisation Method for Computing the Threshold of the Large Scale Stochastic Instability in Two Degree of Freedom Hamiltonian Systems. J. Stat. Phys. 26, 257–284 (1981)
12. Falcolini, C., de la Llave, R.: A Rigorous Partial Justification of Greene’s Criterion. J. Stat. Phys. 67, 609–643 (1992)
13. Greene, J.M.: A Method for Determining a Stochastic Transition. J. Math. Phys. 20, 1183–1201 (1979)
14. Hirsch, M.W., Pugh, C.C., Shub, M.: Invariant Manifolds. Lecture Notes in Math. 583, Berlin–New York: Springer-Verlag, 1977
15. Koch, H.: A Renormalization Group for Hamiltonians, with Applications to KAM Tori. Erg. Theor. Dyn. Syst. 19, 1–47 (1999)
16. Kolmogorov, A.N.: On Conservation of Conditionally Periodic Motions Under Small Perturbations of the Hamiltonian. Dokl. Akad. Nauk SSSR 98, 527–530 (1954)
17. Kosygin, D.: Multidimensional KAM Theory from the Renormalization Group Viewpoint. In: Dynamical Systems and Statistical Mechanics, Ya.G. Sinai (ed), AMS, Adv. Sov. Math. 3, 99–129 (1991)
18. MacKay, R.S.: Renormalisation in Area Preserving Maps. Thesis, Princeton (1982), London: World Scientific, 1993
19. MacKay, R.S.: Three Topics in Hamiltonian Dynamics. In: Dynamical Systems and Chaos, Vol. 2, Y. Aizawa, S. Saito, K. Shiraiwa (eds), London: World Scientific, 1995
20. MacKay, R.S., Meiss, J.D., Stark, J.: An Approximate Renormalization for the Break-up of Invariant Tori with Three Frequencies. Phys. Lett. A 190, 417–424 (1994)
21. Mehr, A., Escande, D.F.: Destruction of KAM Tori in Hamiltonian Systems: Link with the Destabilization of nearby Cycles and Calculation of Residues. Physica 13D, 302–338 (1984)
22. Moser, J.: On Invariant Curves of Area-Preserving Mappings of an Annulus. Nachr. Akad. Wiss. Gött., II. Math. Phys. Kl 1962, 1–20 (1962)
23. Palis, J.: A Note on the Inclination Lemma (λ-Lemma) and Feigenbaum’s Rate of Approach. In: Geometric dynamics (Rio de Janeiro, 1981), J. Palis (ed), Lecture Notes in Math. 1007, Berlin–New York: Springer-Verlag, pp. 630–635, 1983
24. Palis, J., de Melo, W.: Geometric Theory of Dynamical Systems. An Introduction. Berlin–New York: Springer-Verlag, 1982
25. Thirring, W.: A Course in Mathematical Physics I: Classical Dynamical Systems. Berlin–New York–Wien: Springer-Verlag, 1978
26. Tompaidis, S.: Approximation of Invariant Surfaces by Periodic Orbits in High-Dimensional Maps: Some Rigorous Results. Experimental Math. 5, 197–209 (1996)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 212, 395 – 413 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Noncommutative Geometry and Gauge Theory on Fuzzy Sphere

Ursula Carow-Watamura, Satoshi Watamura

Department of Physics, Graduate School of Science, Tohoku University, Aoba-ku, Sendai 980-8577, Japan. E-mail: [email protected]; [email protected]

Received: 21 January 1998 / Accepted: 4 February 2000

Abstract: The differential algebra on the fuzzy sphere is constructed by applying Connes’ scheme. The U(1) gauge theory on the fuzzy sphere based on this differential algebra is defined. The local U(1) gauge transformation on the fuzzy sphere is identified with the left U(N + 1) transformation of the field, where a field is a bimodule over the quantized algebra $A_N$. The interaction with a complex scalar field is also given.

1. Introduction The concept of quantized spaces is discussed in a variety of fields in physics and mathematics. From the physicists’ viewpoint, the main motivation for investigating noncommutative spaces stems from the need of an appropriate framework to describe the quantum theory of gravity. Recently quantized spaces are also discussed in connection with M(atrix) theory which has been proposed as a nonperturbative formulation of string theory [1,2]. This development in string theory supports the idea that the noncommutative structure of spacetime becomes relevant when constructing the theory of gravitation at Planck scale. To describe noncommutative spaces, the noncommutative geometry is now investigated by many authors and using this framework one can even consider the differential geometry of singular spaces like, for example, a 2-point space which has been shown to provide a geometrical interpretation of the Higgs mechanism [3]. On the other hand, in order to describe gravity we have to know the theory of a wider class of noncommutative geometry. In this context, the class of noncommutative spaces which can be considered as deformations of continuous spaces is especially interesting. In general, such noncommutative spaces can be obtained by quantizing a given space with its Poisson structure. Furthermore, if the original space is compact one obtains a finite dimensional matrix algebra as a quantized algebra of functions over this space. In this case, we may consider the deformation as a kind of regularization with the special


property that we can keep track of the geometric structure, a feature which is missing in the conventional regularization schemes. In physics the algebra of the fuzzy sphere is well known and has been investigated in a variety of contexts: as an example for a general quantization procedure [4, 5] (see also for example [6–8, 10, 11] and references therein) and in relation with geometric quantization. It is also discussed as the algebra appearing in membranes [12, 13], in relation with coherent states [14, 15], and recently in connection with noncommutative geometry [16–18]. The same structure also appears in the context of the quantum Hall effect [19, 20]. In this paper, we investigate the differential geometry of the fuzzy sphere and the field theory on it. We formulate the U(1) gauge theory on the fuzzy sphere. The fuzzy sphere is one example in the above mentioned class of noncommutative geometry and thus the field theory on this space is a very instructive model to examine the ideas of noncommutative geometry. Besides that, it is a deformation of the sphere obtained by quantization based on the Poisson structure on $S^2$, and the resulting algebra $A_N$ is a finite dimensional matrix algebra. Thus, what we obtain is a regularized field theory on the sphere. From this point of view, we are also interested in the gauge theory on this noncommutative space. In order to formulate the local U(1) gauge theory on the fuzzy sphere, we first have to define the differential algebra based on the above algebra $A_N$. We apply Connes’ framework of noncommutative differential geometry [9] by using a spectral triple $(A_N, H_N, D)$ proposed recently by the authors [25], where $D$ is the Dirac operator and $H_N$ is the corresponding Hilbert space of spinors. We analyze the space of 1-forms which corresponds to the gauge potential and give the 2-forms to define the field strength.

This paper is organized as follows. In Sect. 2, we summarize the definitions of the Dirac operator, the chirality operator and the spectral triple. We give a complete derivation of the spectrum of the Dirac operator and discuss its properties in detail. Then we define the differential algebra on the fuzzy sphere. In Sect. 3, the gauge field and the field strength are defined using this differential algebra. We examine the structure of the U(1) gauge transformation of the charged scalar field. Then the corresponding invariant actions are formulated. Section 4 contains the discussion. We also discuss the commutative limit.

2. Noncommutative Differential Algebra

2.1. Algebra of fuzzy sphere. The algebra of the fuzzy sphere can be obtained by quantizing the function algebra over the sphere by using its Poisson structure. To this end we adopt the Berezin-Toeplitz quantization, which gives the quantization procedure for a Kähler manifold [4, 5]. Applying this method to the function algebra over the sphere we obtain the algebra $A_N$. $A_N$ can be represented by operators acting on an $(N + 1)$-dimensional Hilbert space $F_N$. The algebra $A_N$ can thus be identified with the algebra of the complex $(N + 1) \times (N + 1)$ matrices. The basic algebra to be quantized is the function algebra $A_\infty$ of the square integrable functions over a 2-sphere. The basis of this algebra is given by the spherical harmonics $Y_{lm}$ and the multiplication of the algebra is the usual pointwise product of functions. The fuzzy sphere may also be introduced as an approximation of the function algebra over the sphere by taking a finite number of spherical harmonics, limited by the maximal angular momentum $N$: $\{Y_{lm};\ l \le N\}$. However, with respect to the usual multiplication this set of functions does not form a closed algebra, since the product of two spherical harmonics $Y_{lm}$ and $Y_{l'm'}$ contains $Y_{l+l',m}$. A new multiplication rule resolves this problem and gives a closed function algebra with a finite

number of basis elements. The resulting algebra $A_N$ is noncommutative. We can identify the algebra of the fuzzy sphere with the algebra of complex matrices $M_{N+1}(\mathbb{C})$ and thus we can consider it as a special case of matrix geometry [21–24]. The operator algebra $A_N$ and the Hilbert space $F_N$ can be formulated keeping the symmetry properties under the rotation group. We introduce a pair of creation-annihilation operators $a_b^\dagger$, $a^b$ ($b = 1, 2$) which transform as a fundamental representation under the SU(2) action of rotation,

$$[a^a, a_b^\dagger] = \delta^a_b. \tag{1}$$

Define the number operator by $\mathbf{N} = a_b^\dagger a^b$; then the set of states $|v\rangle$ in the Fock space associated with the creation-annihilation operators satisfying

$$\mathbf{N}|v\rangle = N|v\rangle \tag{2}$$

provides an $(N + 1)$-dimensional Hilbert space $F_N$. The orthogonal basis $|k\rangle$ of $F_N$ can be defined as

$$|k\rangle = \frac{1}{\sqrt{k!\,(N-k)!}}\,(a_1^\dagger)^k (a_2^\dagger)^{N-k}|0\rangle, \tag{3}$$

where $k = 0, \dots, N$ and $|0\rangle$ is the vacuum. The operator algebra $A_N$ acting on $F_N$ is unital and given by operators $\{O;\ [\mathbf{N}, O] = 0\}$. The generators of the algebra $A_N$ are defined by

$$x_i = \frac{1}{2}\,\alpha\,\sigma_i{}^{a}{}_{b}\,a_a^\dagger a^b, \tag{4}$$

where the normalization factor $\alpha$ is a central element, $[\alpha, x_i] = 0$, and is defined by the constraint

$$x_i x_i = \frac{\alpha^2}{4} N(N+2) = \ell^2. \tag{5}$$

The above equation means that $\ell > 0$ is the radius of the 2-sphere and we get for $\alpha$,

$$\alpha = \frac{2\ell}{\sqrt{N(N+2)}}. \tag{6}$$

The algebra of the fuzzy sphere is generated by $x_i$ and the basic relation is

$$[x_i, x_j] = i\alpha\,\epsilon_{ijk}\,x_k. \tag{7}$$
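Relations (5)–(7) can be checked numerically for small $N$. The sketch below assumes that the generators can be realized as $x_i = \alpha J_i$, where $J_i$ are the standard spin-$N/2$ angular momentum matrices (the usual matrix form of the oscillator construction (4), up to a choice of basis). The helper names `matmul`, `generators` and `check_fuzzy_sphere` are hypothetical and introduced only for this illustration.

```python
import math


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def generators(N, ell=1.0):
    """Return (alpha, [x1, x2, x3]) with x_i = alpha * J_i (assumed realization)."""
    dim, s = N + 1, N / 2.0
    alpha = 2.0 * ell / math.sqrt(N * (N + 2))          # Eq. (6)
    # Raising operator J_+ in the basis m = s, s-1, ..., -s:
    # <m+1| J_+ |m> = sqrt(s(s+1) - m(m+1)), column c corresponds to m = s - c.
    Jp = [[math.sqrt(s * (s + 1) - (s - c) * (s - c + 1)) if r == c - 1 else 0.0
           for c in range(dim)] for r in range(dim)]
    x1 = [[0.5 * alpha * (Jp[r][c] + Jp[c][r]) + 0j for c in range(dim)]
          for r in range(dim)]
    x2 = [[-0.5j * alpha * (Jp[r][c] - Jp[c][r]) for c in range(dim)]
          for r in range(dim)]
    x3 = [[alpha * (s - r) if r == c else 0j for c in range(dim)]
          for r in range(dim)]
    return alpha, [x1, x2, x3]


def check_fuzzy_sphere(N, ell=1.0):
    """Largest entrywise violations of Eq. (5) and Eq. (7)."""
    alpha, x = generators(N, ell)
    dim = N + 1
    squares = [matmul(xi, xi) for xi in x]
    err_radius = max(
        abs(sum(sq[r][c] for sq in squares) - (ell ** 2 if r == c else 0.0))
        for r in range(dim) for c in range(dim))
    err_comm = 0.0
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:   # cyclic triples, epsilon = +1
        ab, ba = matmul(x[i], x[j]), matmul(x[j], x[i])
        err_comm = max(err_comm, max(
            abs(ab[r][c] - ba[r][c] - 1j * alpha * x[k][r][c])
            for r in range(dim) for c in range(dim)))
    return err_radius, err_comm
```

For example, `check_fuzzy_sphere(4)` returns two deviations that are zero up to floating-point rounding, which is consistent with $\alpha$ playing the role of a Planck constant that shrinks as $N$ grows at fixed radius $\ell$.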

On the Hilbert space $F_N$, $\alpha$ is constant and plays the role of the “Planck constant”. The commutative limit corresponds to $\alpha \to 0$, i.e., $N \to \infty$. (Footnote 1: Another possible choice is to take $\alpha = 2/N$ as in ref. [4]. With this choice, the radius of the fuzzy sphere depends on $N$.) Now let us consider the derivations of $A_N$. Among them, the derivative operator $L_i$ is defined by the adjoint action of $x_i$ [16],

$$\frac{1}{\alpha}\,\mathrm{ad}_{x_i} a = \frac{1}{\alpha}[x_i, a] \equiv L_i a, \tag{8}$$

where $a \in A_N$. These objects are the noncommutative analogue of the Killing vector fields on the sphere, and the algebra of $L_i$ closes. We thus obtain

$$[L_i, x_j] = i\epsilon_{ijk}x_k, \qquad [L_i, L_j] = i\epsilon_{ijk}L_k. \tag{9}$$

Finally, the integration is given by the trace over the Hilbert space $F_N$. The integration over the fuzzy sphere which corresponds to the standard integration over the sphere in the commutative limit is defined by

$$\langle O \rangle = \frac{1}{N+1}\,\mathrm{Tr}\{O\} = \frac{1}{N+1}\sum_k \langle k|O|k\rangle, \tag{10}$$

where $O \in A_N$.

2.2. Chirality operator and Dirac operator. We introduce the spinor field $\Psi$ as an $A_N$-bimodule $\Gamma A_N \equiv \mathbb{C}^2 \otimes A_N$, which is the noncommutative analogue of the space of sections of a spin bundle. $\Psi$ is represented by 2-component spinors $\Psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$, where each entry is an element of $A_N$ and we require that it transforms as a spinor under rotation of the sphere. Since left multiplication and right multiplication commute, the $A_N$-bimodule can be considered as a left module over the algebra $A_N \otimes A_N^o$, where $A_N^o$ denotes the opposite algebra, which is defined by

$$x_i^o x_j^o \equiv (x_j x_i)^o, \qquad x_i \in A_N. \tag{11}$$

The action of $a, b \in A_N$ onto the $A_N$-bimodule $\Psi \in \Gamma A_N$ is

$$a b^o \Psi \equiv a \Psi b. \tag{12}$$

We define the Dirac operator and the chirality operator in the algebra $A_N \otimes A_N^o$ [25], i.e. as $2 \times 2$ matrices the entries of which are elements in the algebra $A_N \otimes A_N^o$. The construction of the Dirac operator is performed by the following steps:

(a) Define a chirality operator which commutes with the elements of $A_N$ and which has a standard commutative limit.

(b) Define the Dirac operator by requiring that it anticommutes with the chirality operator and that, in the commutative limit, it reproduces the standard Dirac operator on the sphere.

Requiring the above condition (a) we obtain for the chirality operator [25]

$$\gamma_\chi = \frac{1}{\mathcal{N}}\Bigl(\sigma_i x_i^o - \frac{\alpha}{2}\Bigr). \tag{13}$$

$\mathcal{N}$ is a normalization constant defined by the condition

$$(\gamma_\chi)^2 = 1, \tag{14}$$

as $\mathcal{N} = \frac{\alpha}{2}(N + 1)$, and $\sigma_i$ ($i = 1, 2, 3$) are the Pauli matrices. In the commutative limit, the operator $x_i$ can be identified with the homogeneous coordinate $x_i$ of the sphere and the chirality operator given in Eq. (13) becomes $\frac{1}{\ell}\sigma_i x_i$, which is the standard chirality operator invariant under rotation [26]. The chirality operator (13) defines a $\mathbb{Z}_2$ grading of the differential algebra and it commutes with the algebra $A_N$.
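The normalization $\mathcal{N} = \frac{\alpha}{2}(N+1)$ in (13)–(14) can be tested numerically by applying $\gamma_\chi$ twice to a random spinor. The sketch below is an illustration under stated assumptions: the generators are taken as $x_i = \alpha J_i$ with spin-$N/2$ matrices and $\ell = 1$, and $x_i^o$ acts on a spinor component by right multiplication, as in (12). The names `generators`, `apply_chirality` and `chirality_squared_error` are hypothetical.

```python
import math
import random


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def generators(N, ell=1.0):
    """Return (alpha, [x1, x2, x3]) with x_i = alpha * J_i (assumed realization)."""
    dim, s = N + 1, N / 2.0
    alpha = 2.0 * ell / math.sqrt(N * (N + 2))
    Jp = [[math.sqrt(s * (s + 1) - (s - c) * (s - c + 1)) if r == c - 1 else 0.0
           for c in range(dim)] for r in range(dim)]
    x1 = [[0.5 * alpha * (Jp[r][c] + Jp[c][r]) + 0j for c in range(dim)]
          for r in range(dim)]
    x2 = [[-0.5j * alpha * (Jp[r][c] - Jp[c][r]) for c in range(dim)]
          for r in range(dim)]
    x3 = [[alpha * (s - r) if r == c else 0j for c in range(dim)]
          for r in range(dim)]
    return alpha, [x1, x2, x3]


# Pauli matrices sigma_1, sigma_2, sigma_3
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def apply_chirality(psi, x, alpha, N):
    """Apply gamma_chi = (sigma_i x_i^o - alpha/2) / norm to psi = [psi_1, psi_2]."""
    norm = 0.5 * alpha * (N + 1)
    out = [[[-0.5 * alpha * z for z in row] for row in comp] for comp in psi]
    for i in range(3):
        right = [matmul(comp, x[i]) for comp in psi]   # x_i^o acts from the right
        for r in range(2):
            for t in range(2):
                c = SIGMA[i][r][t]
                if c != 0:
                    out[r] = [[o + c * w for o, w in zip(ro, rw)]
                              for ro, rw in zip(out[r], right[t])]
    return [[[z / norm for z in row] for row in comp] for comp in out]


def chirality_squared_error(N, seed=0):
    """Largest entry of gamma_chi^2 psi - psi for a random spinor psi."""
    rng = random.Random(seed)
    dim = N + 1
    alpha, x = generators(N)
    psi = [[[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(dim)]
            for _ in range(dim)] for _ in range(2)]
    twice = apply_chirality(apply_chirality(psi, x, alpha, N), x, alpha, N)
    return max(abs(a - b) for ca, cb in zip(twice, psi)
               for ra, rb in zip(ca, cb) for a, b in zip(ra, rb))
```

Applying the operator twice multiplies on the right first by $x_j$ and then by $x_i$, which is the ordering encoded in the opposite-algebra product (11); with that ordering the cross terms cancel and the returned error is zero up to rounding.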


Proposition 1. The Dirac operator $D$ satisfying the condition (b), i.e., $\{\gamma_\chi, D\} = 0$, is given by

$$D = \frac{i}{\ell\alpha}\,\gamma_\chi\,\epsilon_{ijk}\,\sigma_i x_j^o x_k. \tag{15}$$

Proof. See [25]. (Footnote 2: Note that this Dirac operator is different from the one given in [17]. The difference is that the operator in [17] contains a product of the Pauli matrix and angular momentum operator, whereas the operator defined here contains a product of $\chi_i$ and angular momentum operator as in Eq. (16), i.e. it also contains $x_i$. Consequently, the spectra are not the same.)

Note that this Dirac operator is selfadjoint, $D^\dagger = D$. Acting with this operator on a spinor $\Psi \in \Gamma A_N$, we obtain

$$D\Psi = \frac{i}{\ell}\,\gamma_\chi\,\chi_i J_i \Psi, \tag{16}$$

where

$$J_i = L_i + \frac{1}{2}\sigma_i, \tag{17}$$

and

$$\chi_i \equiv \epsilon_{ijk}\,x_j \sigma_k. \tag{18}$$

The action of the angular momentum operator on the bimodule is defined by

$$L_i \Psi \equiv \frac{1}{\alpha}[x_i, \Psi] = \frac{1}{\alpha}(x_i \Psi - \Psi x_i). \tag{19}$$

The second condition of (b) concerning the commutative limit of the Dirac operator is also satisfied. If we replace each operator $\chi_i$, $J_i$ and $\gamma_\chi$ in Eq. (16) by the corresponding quantity which is obtained in the commutative limit, we get

$$D_\infty = \frac{i}{\ell}\,\gamma_\chi\,\chi_i J_i = \frac{i}{\ell^2}(\sigma_l x_l)\,\epsilon_{ijk}\,x_i \sigma_j \Bigl(iK_k + \frac{1}{2}\sigma_k\Bigr) = -(i\sigma_i K_i + 1), \tag{20}$$

where $x_i$ is the homogeneous coordinate of $S^2$ and $K_i$ is the Killing vector. Therefore, in the commutative limit this Dirac operator is equivalent to the standard Dirac operator.

2.3. Spectral triple. In order to establish Connes’ triple we have to identify the Hilbert space. The space of the fermions $\Psi \in A_N \otimes \mathbb{C}^2$ defines a Hilbert space $H_N$ with norm

$$\langle \Psi | \Psi \rangle = \mathrm{Tr}_F(\Psi^\dagger \Psi) = \sum_{\rho=1}^{2} \mathrm{Tr}_F\{(\psi^\rho)^* \psi^\rho\}, \tag{21}$$

where $\mathrm{Tr}_F$ is the trace over the $(N + 1)$-dimensional Hilbert space $F_N$.


The dimension of the Hilbert space $H_N$ is $2(N + 1)^2$ and the trace over $H_N$ is the trace over the spin suffices and over the $(N + 1)^2$ dimensional space of the matrices. Since the Dirac operator is defined in the algebra $A_N \otimes A_N^o$, the trace must be taken for operators of the form $ab^o$, with $a, b \in A_N$, and it is given by

$$\mathrm{Tr}_H\{ab^o\} = \sum_{K=1}^{2(N+1)^2} \langle \Psi_K | ab^o \Psi_K \rangle = 2\,\mathrm{Tr}_F\{a\}\,\mathrm{Tr}_F\{b\}. \tag{22}$$

Here $\Psi_K$ is an appropriate basis in $H_N$ labeled by an integer $K \in \{1, \dots, 2(N + 1)^2\}$. The factor 2 on the r.h.s. comes from the trace over the spin suffices. To examine the structure of the Hilbert space we compute the spectrum $\lambda_j$ of the Dirac operator:

$$D^2 \Psi_{jm} = \lambda_j^2 \Psi_{jm}. \tag{23}$$

$\Psi_{jm}$ is a state with total angular momentum $j$, $J^2 \Psi_{jm} = j(j + 1)\Psi_{jm}$, and $J_3 \Psi_{jm} = m\Psi_{jm}$, where $J_3$ is the $x_3$ component of the total angular momentum operator $J_i$ in Eq. (17). $j$ and $m$ are half integers and run over $\frac{1}{2} \le j \le N + \frac{1}{2}$ and $-j \le m \le j$.

Proposition 2. The spectrum of the Dirac operator is given by

$$\lambda_j^2 = \Bigl(j + \frac{1}{2}\Bigr)^2 \Bigl[1 + \frac{1 - (j + \frac{1}{2})^2}{N(N + 2)}\Bigr]. \tag{24}$$

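Proof.

$$\frac{\ell^2}{\alpha^2} D^2 = (\epsilon_{ijk}\,\sigma_i X_j Y_k)(\epsilon_{i'j'k'}\,\sigma_{i'} X_{j'} Y_{k'}) = X^2 Y^2 - (XY)\bigl[(XY) + 1 + (X\sigma) + (Y\sigma)\bigr], \tag{25}$$

where $X_i = \frac{1}{\alpha}x_i$, $Y_i = -\frac{1}{\alpha}x_i^o$ and $(XY) = \sum_i X_i Y_i$. Using the relations

$$L_i = X_i + Y_i \quad \text{and} \quad J_i = L_i + \frac{1}{2}\sigma_i, \tag{26}$$

we obtain $(XY) = \frac{1}{2}[L^2 - X^2 - Y^2]$ and $(\sigma X) + (\sigma Y) = J^2 - L^2 - \frac{3}{4}$. In order to evaluate the spectrum we use the representation of the spinor and substitute

$$J^2 = j(j + 1) \quad \text{and} \quad L^2 = (j + s)(j + s + 1), \tag{27}$$

where $j \le N + \frac{1}{2}$ is a half integer and $s = \pm\frac{1}{2}$. With this value we get

$$(XY) = \frac{1}{2}\Bigl[j(j + 1) + s(2j + 1) + \frac{1}{4} - X^2 - Y^2\Bigr], \qquad (\sigma X) + (\sigma Y) = -s(2j + 1) - 1. \tag{28}$$

Thus, the eigenvalue is

$$\frac{\ell^2}{\alpha^2}\lambda_j^2 = -\frac{1}{4}\Bigl(j + \frac{1}{2}\Bigr)^2 \Bigl[\Bigl(j + \frac{1}{2}\Bigr)^2 - 2(X^2 + Y^2) - 1\Bigr] - \frac{1}{4}(X^2 - Y^2)^2. \tag{29}$$

If we substitute $X^2 = Y^2 = \frac{N}{2}\bigl(\frac{N}{2} + 1\bigr)$ we obtain the relation (24). □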
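Formula (24) is easy to tabulate exactly. The sketch below evaluates $\lambda_j^2$ with rational arithmetic and counts states. The multiplicity bookkeeping is an assumption of this illustration, based on standard angular momentum addition: each $j$ arises from $l = j - \frac{1}{2}$ and $l = j + \frac{1}{2}$ with $0 \le l \le N$, so the top value $j = N + \frac{1}{2}$ occurs only once. The function names are hypothetical.

```python
from fractions import Fraction


def dirac_eigenvalue_squared(j, N):
    """lambda_j^2 from Eq. (24), for half-integer j given as a Fraction."""
    u = j + Fraction(1, 2)
    return u * u * (1 + (1 - u * u) / Fraction(N * (N + 2)))


def spectrum(N):
    """Rows (j, lambda_j^2, number of states) for 1/2 <= j <= N + 1/2."""
    rows = []
    for k in range(N + 1):
        j = Fraction(2 * k + 1, 2)
        # Assumed bookkeeping: two orbital values l = j -/+ 1/2 below the top,
        # only l = N for j = N + 1/2.
        copies = 1 if k == N else 2
        rows.append((j, dirac_eigenvalue_squared(j, N), copies * (2 * j + 1)))
    return rows


rows = spectrum(5)
total_states = sum(count for _, _, count in rows)      # compare with 2(N+1)^2
zero_modes = sum(count for _, lam2, count in rows if lam2 == 0)
```

With this counting, the states add up to the dimension $2(N+1)^2$ of $H_N$, the only vanishing eigenvalue sits at $j = N + \frac{1}{2}$, and for fixed $j$ the correction term $\frac{1 - (j + 1/2)^2}{N(N+2)}$ tends to zero as $N$ grows, so that $\lambda_j^2 \to (j + \frac{1}{2})^2$.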


This spectrum coincides with the classical spectrum of the Dirac operator in the limit $N \to \infty$. For finite $N$, it contains zeromodes. When the angular momentum takes its maximal value we see that $\lambda_{N+\frac{1}{2}} = 0$. This happens since there is no chiral pair for the spin $N + \frac{1}{2}$ state and therefore this part must be a zeromode for consistency. We can also confirm this property by computing $\mathrm{Tr}_H(\gamma_\chi) = 2(N + 1)$. Since these zeromodes have no classical analogue, one way to treat them is to project them out from the Hilbert space. On the other hand, the contribution of the zeromodes in the integration is of order $\frac{1}{N}$ and thus their contribution vanishes in the limit $N \to \infty$. Therefore, considering the differential algebra on the fuzzy sphere as a kind of regularization of the differential algebra on the sphere, it is sufficient to take the full Hilbert space $H_N$. In this way we obtain Connes’ triple $(A_N, D, H_N)$. We thus can apply the construction of the differential algebra.

2.4. Differential algebra. In this section we construct the differential algebra associated with $(A_N, D, H_N)$ by using Connes’ method [9]. See also [27]. We define the universal differential algebra $\Omega^*(A_N)$ over $A_N$. An element $\omega \in \Omega^*(A_N)$ is in general given by

$$\omega = \sum_{\lambda \in I} a_\lambda^{(0)}\,da_\lambda^{(1)}\,da_\lambda^{(2)} \cdots da_\lambda^{(p)}, \tag{30}$$

where $p$ is an integer, $a_\lambda^{(k)} \in A_N$ ($k = 0, \dots, p$) and $I$ is an appropriate set labeling the elements. $da$ is a symbol defined by the operation of the differential $d$ on $a \in A_N$, which satisfies the Leibnitz rule $d(ab) = (da)b + a(db)$ for $a, b \in A_N$, and $d1 = 0$ for the identity $1 \in A_N$. We also require $(da)^* = -da^*$. The Leibnitz rule provides a natural product among the elements in $\Omega^*(A_N)$ and the differential $d$ on $\Omega^*(A_N)$ is defined by

$$d\Bigl(\sum_{\lambda \in I} a_\lambda^{(0)}\,da_\lambda^{(1)}\,da_\lambda^{(2)} \cdots da_\lambda^{(p)}\Bigr) = \sum_{\lambda \in I} da_\lambda^{(0)}\,da_\lambda^{(1)}\,da_\lambda^{(2)} \cdots da_\lambda^{(p)}. \tag{31}$$

Then, it follows $d^2\omega = 0$ and the graded Leibnitz rule. In order to define the $p$-forms as operators on $H_N$, a representation $\pi$ is defined by

$$\pi\Bigl(\sum_{\lambda \in I} a_\lambda^{(0)}\,da_\lambda^{(1)}\,da_\lambda^{(2)} \cdots da_\lambda^{(p)}\Bigr) = \sum_{\lambda \in I} a_\lambda^{(0)}\,[D, a_\lambda^{(1)}]\,[D, a_\lambda^{(2)}] \cdots [D, a_\lambda^{(p)}]. \tag{32}$$

Recall that $A_N$ is defined as an algebra of operators in $H_N$. Then the graded differential algebra is defined by

$$\Omega_D^*(A_N) = \Omega^*(A_N)/J, \tag{33}$$

where $J = \ker\pi + d\ker\pi$ is the differential ideal of $\Omega^*(A_N)$. In order to establish the differential calculus on the fuzzy sphere, we have to examine the structure of the differential kernel $J$. For this we denote the kernel of each level as

$$\ker\pi^{(p)} \equiv \Omega^p(A_N) \cap \ker\pi; \tag{34}$$

then the differential kernel $J^{(p)}$ for the $p$-form is

$$J^{(p)} = \ker\pi^{(p)} + d\ker\pi^{(p-1)}. \tag{35}$$


Since the elements of the algebra $A_N$ are defined as operators in $H_N$, $\ker\pi^{(0)} = \{0\}$, i.e., $J^{(0)} = \{0\}$. It means that $\Omega_D^0(A_N) = A_N$. The differential kernel of the 1-form is $J^{(1)} = \ker\pi^{(1)} + d\ker\pi^{(0)} = \ker\pi^{(1)}$, and thus for any element $a \in A_N$ the derivative is defined by

$$\pi(da) = [D, a]. \tag{36}$$

The space of 1-forms $\omega \in \Omega_D^1(A_N)$ can be identified with the operators $\pi(\omega)$ in $H_N$:

$$\pi(\Omega_D^1(A_N)) = \Bigl\{\pi(\omega)\ \Big|\ \pi(\omega) = \sum_{\lambda \in I} a_\lambda [D, b_\lambda];\ a_\lambda, b_\lambda \in A_N\Bigr\}. \tag{37}$$

Thus, with the above identification, the exterior derivative $d$ defines a map

$$d : A_N \to M_2(\mathbb{C}) \otimes (A_N \otimes A_N^o), \tag{38}$$

where $M_2(\mathbb{C})$ is the algebra of $2 \times 2$ complex matrices. Using the definition of the Dirac operator (15), a 1-form is expressed as follows: Take a 1-form $\pi(\omega) \in \pi(\Omega_D^1(A_N))$ in Eq. (37). Using Eq. (15) we obtain

$$\pi(\omega) = \frac{i}{\ell\alpha}\,\gamma_\chi\,\epsilon_{ijk}\,\sigma_i x_j^o \sum_\lambda a_\lambda [x_k, b_\lambda] = \frac{i}{\ell}\,\gamma_\chi\,\chi_k^o\,\omega_k, \tag{39}$$

where

$$\chi_k^o \equiv -\epsilon_{ijk}\,x_i^o \sigma_j, \tag{40}$$

and the components $\omega_k$ of $\pi(\omega)$ can be rewritten by using the definition (8) of $L$ as

$$\omega_k \equiv \frac{1}{\alpha}\sum_\lambda a_\lambda [x_k, b_\lambda] = \sum_\lambda a_\lambda (L_k b_\lambda). \tag{41}$$

Here, $\omega_k \in A_N$ may be considered as the component of a vector field. In order to write the gauge field action, we have to define the 2-form. A 2-form $\eta \in \Omega_D^2(A_N)$ can be given in general as

$$\pi(\eta) = \sum_\lambda a_\lambda^{(1)} [D, a_\lambda^{(2)}][D, a_\lambda^{(3)}] = \frac{1}{\ell^2}\,\chi_i^o \chi_j^o \sum_\lambda a_\lambda^{(1)} (L_i a_\lambda^{(2)})(L_j a_\lambda^{(3)}), \tag{42}$$

where $a_\lambda^{(i)} \in A_N$. Since the 2-form in Eq. (42) is defined up to the differential kernel $\pi(d\ker\pi^{(1)})$, $\pi(\eta)$ contains redundant components. Note that when we perform the calculation, we do not use the $\Omega_D^2(A_N)$, but its representation $\pi(\Omega_D^2(A_N))$, thus it is sufficient to compute $\pi(d\ker\pi^{(1)})$, since $\pi(\Omega_D^*(A_N))$ is isomorphic to $\pi(\Omega^*(A_N))/\pi(d\ker\pi)$. The nontrivial contribution of $\pi(d\ker\pi^{(1)})$ is proportional to the traceless part of the symmetric product $\chi^o_{\{i}\chi^o_{j\}}$ as we shall see in the following.

The exterior derivative of a general 1-form $\omega$ defined in Eq. (37) is

$$\pi(d\omega) = \sum_\lambda [D, a_\lambda][D, b_\lambda]. \tag{43}$$

Using the Dirac operator we obtain

$$\pi(d\omega) = \frac{1}{\ell^2}\sum_\lambda \chi_i^o \chi_{i'}^o \Bigl[ L_i(a_\lambda L_{i'} b_\lambda) - \frac{i}{2}\,\epsilon_{ii'k}\,a_\lambda L_k b_\lambda + \frac{2}{3\alpha}\,\delta_{ii'}\,(a_\lambda L_j b_\lambda)\,x_j - a_\lambda \Bigl[\frac{1}{2}\{L_i, L_{i'}\} - \frac{1}{3}\delta_{ii'} L^2\Bigr] b_\lambda \Bigr]. \tag{44}$$

The first three terms vanish for $\omega \in \ker\pi^{(1)}$. Only the last term gives a nontrivial contribution for the differential kernel and thus $\pi(d\ker\pi^{(1)})$ is proportional to the symmetric traceless product of $\chi_i^o \chi_{i'}^o$. The proof of the existence of the nontrivial 1-form kernels which contribute to $d\ker\pi^{(1)}$ is given in the appendix. Using the explicit expression $\omega_{p,q} = x_A^p\,dx_A^q$ of $\ker\pi^{(1)}$ obtained in the appendix (see Eq. (87)) we compute $\pi(d\ker\pi^{(1)})$. We find that $d\omega_{p,q}$ gives an element of $d\ker\pi^{(1)}$:

Proposition 3. $\pi(d\omega_{p,q}) \ne 0$, for $p + q = N + 2$ and $p, q > 1$.

Proof.

$$\begin{aligned}
\pi(d\omega_{p,q}) &= [D, x_+^p][D, x_+^q] \\
&= \frac{-1}{\ell^2}\,\gamma_\chi \chi_+^o \gamma_\chi \chi_+^o \Bigl[-2p\,x_+^{p-1}\Bigl(x_3 + \frac{\alpha}{2}(p - 1)\Bigr)\Bigr]\Bigl[-2q\,x_+^{q-1}\Bigl(x_3 + \frac{\alpha}{2}(q - 1)\Bigr)\Bigr] \\
&= \frac{1}{\ell^2}\,\chi_+^o \chi_+^o\,4pq\,x_+^{p+q-2}\Bigl[x_3^2 + x_3\Bigl(\frac{\alpha}{2}p + \frac{3\alpha}{2}q - 2\alpha\Bigr) + \frac{\alpha^2}{4}(p + 2q - 3)(q - 1)\Bigr].
\end{aligned} \tag{45}$$

Using the identity $x_+ x_- = \ell^2 + \alpha x_3 - x_3^2$, this expression can be simplified to

$$\pi(d\omega_{p,q}) = 4pq\,\frac{1}{\ell^2}\,\chi_+^o \chi_+^o\,x_+^{p+q-2}\,[A(q)x_3 + B(q)], \tag{46}$$

where

$$A(q) = \frac{\alpha}{2}p + \frac{3\alpha}{2}q - \alpha, \tag{47}$$

$$B(q) = \ell^2 + \frac{\alpha^2}{4}(p + 2q - 3)(q - 1) - x_+ x_-. \tag{48}$$

This means that $\pi(d\omega_{p,q})$ does not vanish for $p + q = N + 2$ although $\omega_{p,q} \in \ker\pi^{(1)}$ for $p + q = N + 2$. □

The result of Proposition 3 and the corresponding contribution from the kernels of the other directions show that there exist nontrivial differential kernel elements $\pi(d\ker\pi^{(1)})$ proportional to the symmetric traceless product of $\chi_i^o \chi_j^o$. With this result we can prove Proposition 4 below.
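The bracketed factors in (45) come from commuting $x_-$ through powers of $x_+$. The sketch below checks this numerically, under assumptions that are not spelled out in this section: $x_\pm = x_1 \pm i x_2$, the lowering derivation acts as $L_- a = \frac{1}{\alpha}[x_-, a]$, and the generators are realized as $x_i = \alpha J_i$ with spin-$N/2$ matrices and $\ell = 1$. It tests $L_- x_+^p = -2p\,x_+^{p-1}\bigl(x_3 + \frac{\alpha}{2}(p-1)\bigr)$ and the identity $x_+ x_- = \ell^2 + \alpha x_3 - x_3^2$. All function names are hypothetical.

```python
import math


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(A, p):
    """A**p for a nonnegative integer p (A**0 is the identity)."""
    n = len(A)
    out = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        out = matmul(out, A)
    return out


def generators(N, ell=1.0):
    """Return (alpha, [x1, x2, x3]) with x_i = alpha * J_i (assumed realization)."""
    dim, s = N + 1, N / 2.0
    alpha = 2.0 * ell / math.sqrt(N * (N + 2))
    Jp = [[math.sqrt(s * (s + 1) - (s - c) * (s - c + 1)) if r == c - 1 else 0.0
           for c in range(dim)] for r in range(dim)]
    x1 = [[0.5 * alpha * (Jp[r][c] + Jp[c][r]) + 0j for c in range(dim)]
          for r in range(dim)]
    x2 = [[-0.5j * alpha * (Jp[r][c] - Jp[c][r]) for c in range(dim)]
          for r in range(dim)]
    x3 = [[alpha * (s - r) if r == c else 0j for c in range(dim)]
          for r in range(dim)]
    return alpha, [x1, x2, x3]


def ladder(N):
    """Return (alpha, x_plus, x_minus, x3) with x_pm = x1 +/- i x2."""
    alpha, (x1, x2, x3) = generators(N)
    dim = N + 1
    xp = [[x1[r][c] + 1j * x2[r][c] for c in range(dim)] for r in range(dim)]
    xm = [[x1[r][c] - 1j * x2[r][c] for c in range(dim)] for r in range(dim)]
    return alpha, xp, xm, x3


def lowering_identity_error(N, p):
    """Max entry of (1/alpha)[x_-, x_+^p] + 2p x_+^(p-1) (x3 + alpha (p-1)/2)."""
    alpha, xp, xm, x3 = ladder(N)
    dim = N + 1
    xp_p, xp_pm1 = matpow(xp, p), matpow(xp, p - 1)
    a, b = matmul(xm, xp_p), matmul(xp_p, xm)
    shifted = [[x3[r][c] + (0.5 * alpha * (p - 1) if r == c else 0.0)
                for c in range(dim)] for r in range(dim)]
    rhs = matmul(xp_pm1, shifted)
    return max(abs((a[r][c] - b[r][c]) / alpha + 2 * p * rhs[r][c])
               for r in range(dim) for c in range(dim))


def product_identity_error(N):
    """Max entry of x_+ x_- - (ell^2 + alpha x3 - x3^2), with ell = 1."""
    alpha, xp, xm, x3 = ladder(N)
    dim = N + 1
    lhs, x3sq = matmul(xp, xm), matmul(x3, x3)
    return max(abs(lhs[r][c] - ((1.0 if r == c else 0.0) + alpha * x3[r][c] - x3sq[r][c]))
               for r in range(dim) for c in range(dim))
```

Both error functions return values at the level of floating-point rounding for small $N$, which supports the step from the first to the second line of (45) and the substitution used to reach (46)–(48).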


Proposition 4.
\[ \pi(d\ker\pi^{(1)}) = \Big\{\Lambda \,\Big|\, \Lambda = \frac{1}{\ell^2}\chi^o_i\chi^o_j\, a_{ij}, \text{ where } a_{ij} \in A_N,\ a_{ij} = a_{ji} \text{ and } \sum_{i=1}^{3} a_{ii} = 0\Big\}. \tag{49} \]
Proof. Using Proposition 3, we obtain a nontrivial element by multiplying $a'_\lambda, b'_\lambda \in A_N$ and
\[ \pi\Big(d\Big(\sum_\lambda a'_\lambda\,\omega_{p,q}\,b'_\lambda\Big)\Big) = \pi\Big(\sum_\lambda a'_\lambda (d\omega_{p,q}) b'_\lambda\Big) = \frac{1}{\ell^2}\chi^o_+\chi^o_+\sum_\lambda a'_\lambda (L_-x_+^p)(L_-x_+^q)\, b'_\lambda \]
\[ = 4pq\,\frac{1}{\ell^2}\chi^o_+\chi^o_+\sum_\lambda a'_\lambda\, x_+^N[A(q)x_3 + B(q)]\, b'_\lambda, \tag{50} \]
where we have used $p + q = N + 2$. Choosing appropriate elements $a'_\lambda, b'_\lambda \in A_N$, the factor $\sum_\lambda a'_\lambda x_+^N[A(q)x_3 + B(q)]b'_\lambda$ can become any element in $A_N$. We have six independent directions for $\omega_{p,q}$ and combining the results from them we get the traceless symmetric combinations of suffices $i, i'$ in $\chi^o_i\chi^o_{i'}$. □

Identifying $\Omega^2_D(A_N)$ with its representation $\pi(\Omega^2_D(A_N))$, a general 2-form $\tilde\eta \in \pi(\Omega^2_D(A_N))$ is given by
\[ \tilde\eta = \sum_\lambda a^{(1)}_\lambda [D, a^{(2)}_\lambda][D, a^{(3)}_\lambda], \tag{51} \]
where $a^{(i)}_\lambda \in A_N$, up to $\pi(d\ker\pi^{(1)})$. Combining Eqs. (43) and (44) we can compute the operation of the derivative $d$ on a general 1-form in Eq. (37) and we obtain
\[ \pi(d\omega) = \frac{1}{2\ell^2}\chi^o_k\chi^o_{k'}\Big[\{L_k\omega_{k'} - L_{k'}\omega_k\} - i\epsilon_{kk'k''}\omega_{k''} + \frac{2}{3\alpha}\delta_{kk'}[x_i\omega_i + \omega_i x_i]\Big], \tag{52} \]
where we have used the definition of the components $\omega_k$ in Eq. (39). Since the trace part does not belong to the differential kernel, the last term in the above equation is not removed by dividing differential kernels. We continue here our construction of the gauge field action with this definition of the differential algebra and we shall obtain a kind of mass term in the gauge theory. The commutative limit $\alpha \to 0$ becomes singular, as can be seen from Eq. (39). However, as we discuss in the following, we can still interpret the resulting theory as a regularization of the corresponding commutative theory.

An alternative strategy to the one taken here would be to restrict the above defined 2-form. With the above 2-form as it stands the naive commutative limit does not give the standard differential calculus. One possibility to handle this situation is to use the property of the trace: $\chi^o_i\chi^o_i = 2N^2 - \alpha N\gamma_\chi$. It turns out that the trace part $J_T$ is an ideal of $\pi(\Omega^2(A_N))$. Furthermore, in each $p$-form space $\pi(\Omega^p(A_N))$, the set $J_T\,\pi(\Omega^{p-2}(A_N)) \cup \pi(\Omega^{p-2}(A_N))\,J_T$ is an ideal and thus there is a possibility to divide the differential algebra so that we can take the commutative limit and obtain the standard differential calculus. This procedure will be discussed elsewhere.


3. U(1) Gauge Field Theory

3.1. Vector field. Using the geometric notions defined in the previous sections, we formulate the U(1) gauge theory on the fuzzy sphere. We identify the differential algebra $\Omega^*_D(A_N)$ with its representation $\pi(\Omega^*_D(A_N))$ and do not write the map $\pi$ explicitly. First, to formulate the gauge field theory we define the real vector field $A$ which is a 1-form on the fuzzy sphere. We impose the reality condition for this 1-form by
\[ A^\dagger = A. \tag{53} \]
Using the general definition of a 1-form, $A$ can be written as³
\[ A = \sum_\lambda a_\lambda [D, b_\lambda], \tag{54} \]
where $a_\lambda, b_\lambda \in A_N$ are appropriate elements. According to the general discussion about 1-forms in the previous section we can write
\[ A = \frac{i}{\ell}\gamma_\chi\chi^o_k A_k, \tag{55} \]
where $A_k$ is the component field of $A$ given by
\[ A_k = \sum_\lambda a_\lambda (L_k b_\lambda). \tag{56} \]
For the component field the reality condition gives
\[ A^*_k = A_k. \tag{57} \]
Thus each component of the gauge field is represented by an $(N+1)\times(N+1)$ hermitian matrix. Note that, in the commutative case, the 1-form satisfies the constraint $x_iA_i = 0$, which shows the reduction of the degrees of freedom. However, in the noncommutative case the 1-form defined by Eq. (56) does not satisfy the similar constraint on $A_k$ in general. Further discussion on the treatment of this property is given in Sect. 4.

In the remaining part, let us push forward the construction of the gauge theory on the noncommutative sphere. In the commutative case, we obtain the field strength of the U(1) gauge theory by taking the exterior derivative of the 1-form. In the noncommutative case, the exterior derivative gives
\[ dA = \sum_\lambda [D, a_\lambda][D, b_\lambda]. \tag{58} \]
Applying the result of the previous section we obtain
\[ dA = \frac{i}{2\ell^2}\chi^o_k\chi^o_{k'}F_{kk'}, \tag{59} \]

³ The hermiticity condition requires the form $A = \sum_\rho a_\rho[D, b_\rho] + b^*_\rho[D, a^*_\rho] - \frac{1}{2}[D, a_\rho b_\rho + b^*_\rho a^*_\rho]$. This can be again written in the form (54).


with
\[ F_{kk'} = -i\{L_kA_{k'} - L_{k'}A_k\} - \epsilon_{kk'k''}A_{k''} - i\delta_{kk'}\frac{2}{3\alpha}[A_ix_i + x_iA_i]. \tag{60} \]
We can show that the above $F_{kk'}$ corresponds to the field strength for the abelian gauge field in the commutative limit. To see this, we use the following correspondence which holds in the commutative limit:
\[ A_k = K_k^\mu A_\mu \quad\text{and}\quad L_k = iK_k. \tag{61} \]
Here $A_\mu$ is a gauge field and $K_k^\mu$ ($k = 1, 2, 3$, $\mu = 1, 2$) is the Killing vector on the sphere with appropriate coordinates $\rho^\mu$, and $K_k = K_k^\mu\partial_\mu$. With the above identification we get
\[ F_{kk'} = K_k^\mu K_{k'}^\nu F_{\mu\nu}, \tag{62} \]
where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Here we have used the relation $A_ix_i = 0$ which holds in the commutative case. In the noncommutative case, however, the exterior derivative of the 1-form $dA$ does not give the field strength.

3.2. U(1) Gauge transformation. For the formulation of the U(1) gauge theory on the fuzzy sphere, let us consider the U(1) gauge transformation of a charged scalar field, i.e., a complex scalar field [25]. The algebraic object corresponding to the complex scalar field on the fuzzy sphere is the $A_N$-bimodule $\Phi \in A_N$. Its action is given by
\[ S = \frac{1}{2(N+1)^2}\mathrm{Tr}_H\{(d\Phi)^\dagger d\Phi\}. \tag{63} \]
Apparently, the above action is invariant under the global U(1) transformation of the phase
\[ \Phi' = e^{i\phi}\Phi. \tag{64} \]
Following the standard approach, the local U(1) gauge transformation can be defined if we let the phase $e^{i\phi}$ be a function on the fuzzy sphere. In the present algebraic formulation this means we multiply an element $u \in A_N$ on the field $\Phi$, where unitarity is implemented by
\[ u^*u = 1. \tag{65} \]
When we generalize the transformation, we may take either left or right multiplication of $u$ on the field $\Phi$ due to the ordering ambiguity. Here we take the left multiplication as the U(1) gauge transformation for $\Phi$:
\[ \Phi' = u\Phi. \tag{66} \]
The transformation of the conjugate field $\bar\Phi = \Phi^*$ is given by $\bar\Phi' = \bar\Phi u^*$. Since the algebra $A_N$ is isomorphic to the algebra of $(N+1)\times(N+1)$ matrices, the condition (65) shows that, as a matrix, $u$ is an element of $U(N+1)$. In other words, the local U(1) gauge transformation on the fuzzy sphere in matrix representation is defined as the left $U(N+1)$ transformation.


Therefore, we define the covariant derivative $\nabla_A$ as
\[ \nabla_A\Phi = d\Phi + A\Phi. \tag{67} \]
Then the gauge transformation of the gauge field can be defined by requiring the covariance of $\nabla_A\Phi$:
\[ \nabla_{A'}(u\Phi) = u\nabla_A\Phi. \tag{68} \]
This defines the standard form of the gauge transformation
\[ A' = u\,du^* + uAu^*. \tag{69} \]
In components it reads
\[ A'_k = u(L_ku^*) + uA_ku^*. \tag{70} \]
The above transformation keeps the hermiticity condition (53) and may be interpreted as the transformation of the $U(N+1)$ gauge theory on a one-point space, and thus the covariant field strength is given by the standard curvature form [9]
\[ \Theta = dA + AA. \tag{71} \]
In components the curvature 2-form is
\[ \Theta = \frac{-i}{2\ell^2}\chi^o_k\chi^o_{k'}\Theta_{kk'}, \tag{72} \]
where the component of the field strength is
\[ \Theta_{kk'} = i\{L_kA_{k'} - L_{k'}A_k\} + \epsilon_{kk'k''}A_{k''} + i[A_k, A_{k'}] + \frac{2i}{3\alpha}\delta_{kk'}[A_ix_i + x_iA_i + \alpha A_iA_i]. \tag{73} \]

3.3. The action of gauge field and matter. With the above results we define the noncommutative analogue of the gauge invariant action. The action of the charged scalar is
\[ S_M = \frac{1}{2(N+1)^2}\mathrm{Tr}_H\{(\nabla_A\Phi)^\dagger\nabla_A\Phi\}. \tag{74} \]
The action of the gauge field is given by
\[ S_G \equiv \frac{1}{2(N+1)^2}\mathrm{Tr}_H\{\Theta^2\}. \tag{75} \]
Both actions are invariant under the local U(1) gauge transformation. Thus, combining these two actions, we obtain the action of the U(1) gauge theory with scalar matter on the fuzzy sphere. Note that we may introduce the gauge coupling constant $g$ by rescaling the gauge field $A$ to $gA$.

In order to see the detailed structure of the above actions, we take a part of the trace. We perform the trace relating to the opposite algebra and the spin suffices. Then we obtain the action which contains only the fields $A$ and $\Phi$, and the trace of this action is taken over the Hilbert space $F_N$.


Then the matter action (74) can be reduced as
\[ S_M = \frac{2}{3(N+1)}\mathrm{Tr}_F\{(L_i\Phi + A_i\Phi)^*(L_i\Phi + A_i\Phi)\}. \tag{76} \]
Similarly, the gauge field action (75) is reduced to
\[ S_G = \frac{C_A}{(N+1)}\mathrm{Tr}_F\{\Theta^A_{ii'}\Theta^A_{ii'}\} + \frac{C_S}{(N+1)}\mathrm{Tr}_F\{\Theta^S_{ii'}\Theta^S_{ii'}\}, \tag{77} \]

where N 2 n −α 2 2o , + 2`2 3N 2 3 1 CS = 1 + N (N + 2)

CA =

(78)

and $\Theta^S_{ij}$ ($\Theta^A_{ij}$) is the (anti)symmetric part of the field strength given in Eq. (73). Since the trace over the Hilbert space $F_N$ corresponds to the volume integration in the commutative limit, the actions $S_M$ and $S_G$ given in Eqs. (76) and (77), respectively, should correspond to the standard action on the sphere in the limit $N \to \infty$. Apparently the $\Theta^S$ in the gauge action does not have a classical correspondence. Furthermore, as we see below, this term is singular in the naive $N \to \infty$ limit. This is unavoidable since our differential algebra is singular in this limit. However, under certain conditions we may consider the above action as a regularized theory of the commutative case as follows. The symmetric part of the action is
\[ (\Theta^S)^2 \sim \frac{1}{\alpha^2}[(A_ix_i + x_iA_i) + \alpha A_iA_i]^2. \tag{79} \]
The above combination is gauge invariant under the gauge transformation given in Eq. (69). This term can be understood as the gauge invariant mass term of the radial component of the gauge field. Thus, physically we can understand the effect of the symmetric part as follows: when we consider the quantization of the above regularized theory using the path integral which respects the gauge symmetry, then in the $\alpha \to 0$ limit the symmetric term behaves like a (gauge invariant) delta function which drops the radial component.

Furthermore, from the point of view of gauge theory it is not necessary to take $\Theta^2$ as an action. Instead, we can simply take any linear combination of the gauge invariant terms. This means that we can take $C_A$ and $C_S$ as independent parameters. Thus, we obtain in general the following action for the gauge field:
\[ S = \frac{1}{(N+1)}\mathrm{Tr}_F\{C_1G_{kk'}G_{kk'} + C_2G_0^2\}, \tag{80} \]
where $C_1$ and $C_2$ are c-numbers and
\[ G_{kk'} = iL_kA_{k'} - iL_{k'}A_k + \epsilon_{kk'k''}A_{k''} + i[A_k, A_{k'}], \qquad G_0 = x_iA_i + A_ix_i + \alpha A_iA_i. \tag{81} \]
The above action (77) is a special case of the general form given here.
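The gauge covariance of the building blocks $G_{kk'}$ and $G_0$ can be checked numerically. The following is a sketch under stated assumptions, not the authors' computation: it takes $x_i = \alpha J_i$ (spin-$N/2$ matrices, so that $[x_i, x_j] = i\alpha\epsilon_{ijk}x_k$) and $L_k a = \frac{1}{\alpha}[x_k, a]$, draws random hermitian $A_k$ and a random unitary $u$, applies the component transformation (70), and verifies that $G_{kk'} \to uG_{kk'}u^*$ and $G_0 \to uG_0u^*$. All function names are illustrative.

```python
import numpy as np

def fuzzy_xyz(N, alpha=1.0):
    """Coordinates x_1, x_2, x_3 as matrices. ASSUMPTION: x_i = alpha * J_i."""
    s = N / 2.0
    m = s - np.arange(N + 1)
    Jp = np.diag(np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    Jm = Jp.conj().T
    J1 = (Jp + Jm) / 2
    J2 = (Jp - Jm) / 2j            # 2j is the imaginary literal 2i
    J3 = np.diag(m).astype(complex)
    return [alpha * J1, alpha * J2, alpha * J3]

def L(xk, a, alpha):
    """Adjoint action L_k a = [x_k, a] / alpha (assumed definition)."""
    return (xk @ a - a @ xk) / alpha

EPS = np.zeros((3, 3, 3))
EPS[0, 1, 2] = EPS[1, 2, 0] = EPS[2, 0, 1] = 1.0
EPS[0, 2, 1] = EPS[2, 1, 0] = EPS[1, 0, 2] = -1.0

def G_kk(x, A, alpha):
    """Antisymmetric combination G_{kk'} as written in Eq. (81)."""
    return [[1j * L(x[k], A[l], alpha) - 1j * L(x[l], A[k], alpha)
             + sum(EPS[k, l, n] * A[n] for n in range(3))
             + 1j * (A[k] @ A[l] - A[l] @ A[k])
             for l in range(3)] for k in range(3)]

def G0(x, A, alpha):
    """Symmetric combination G_0 as written in Eq. (81)."""
    return sum(x[i] @ A[i] + A[i] @ x[i] + alpha * A[i] @ A[i] for i in range(3))

def gauge_transform(x, A, u, alpha):
    """Component form of the gauge transformation, Eq. (70)."""
    ud = u.conj().T
    return [u @ L(x[k], ud, alpha) + u @ A[k] @ ud for k in range(3)]

def covariance_residuals(N=3, alpha=0.7, seed=0):
    rng = np.random.default_rng(seed)
    n = N + 1
    x = fuzzy_xyz(N, alpha)
    rand_c = lambda: rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    A = []
    for _ in range(3):
        M = rand_c()
        A.append((M + M.conj().T) / 2)            # hermitian components, Eq. (57)
    u, _ = np.linalg.qr(rand_c())                 # random unitary matrix
    ud = u.conj().T
    Ap = gauge_transform(x, A, u, alpha)
    G, Gp = G_kk(x, A, alpha), G_kk(x, Ap, alpha)
    r_G = max(np.abs(Gp[k][l] - u @ G[k][l] @ ud).max()
              for k in range(3) for l in range(3))
    r_G0 = np.abs(G0(x, Ap, alpha) - u @ G0(x, A, alpha) @ ud).max()
    r_herm = max(np.abs(a - a.conj().T).max() for a in Ap)
    return r_G, r_G0, r_herm

if __name__ == "__main__":
    print(covariance_residuals())
```

Since both quantities transform by conjugation with $u$, the traces in (80) are unchanged; the third residual confirms that the transformed components stay hermitian.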


4. Discussions and Conclusion

In this paper we have formulated the U(1) gauge theory on the fuzzy sphere, following Connes' framework of noncommutative differential geometry. The differential algebra on the fuzzy sphere has been constructed by applying the chirality operator and Dirac operator proposed in ref. [25]. This chirality operator anticommutes with the Dirac operator and the structure of the differential algebra becomes simple. Then we analyzed the structure of the 1-forms and 2-forms which are necessary to construct the gauge field action.

In ref. [25], the action of a complex scalar field on the fuzzy sphere which is invariant under the global U(1) transformation of the phase of the complex scalar field has been formulated. Here, the local U(1) gauge transformation on the fuzzy sphere is introduced by making the global phase transformation into a local transformation, i.e. the phase becomes a function over the fuzzy sphere. By construction, a function over the fuzzy sphere is simply given by elements of the algebra $A_N$. Thus, the local U(1) gauge transformation is defined by multiplication of an element $u \in A_N$, satisfying unitarity $u^*u = 1$. Since the algebra $A_N$ is noncommutative, there is an ambiguity of operator ordering when replacing the global phase by the algebra elements $u$. We have chosen here the left multiplication. Thus, when we represent the algebra $A_N$ by matrices, the local U(1) gauge transformation is identified with the left transformation by a unitary $(N+1)\times(N+1)$ matrix. Therefore, the gauge field action is analogous to the Yang-Mills action.

Once we know the Dirac operator, the construction of the differential calculus is rather straightforward; however, as we have seen when defining the 1-forms, their components $A_i$ do not satisfy $A_ix_i = 0$ in general. In the commutative case this relation holds since the Killing vector is perpendicular to the normal direction of the sphere. However, in the noncommutative case $x_iL_i$ is not necessarily zero. Since the relation⁴
\[ x_iA_i + A_ix_i = \frac{1}{\alpha}[x_i, a][x_i, b] \tag{82} \]
holds, this property is related with the trace part of the 2-form as follows. As we have seen in the construction of the 2-forms performed here, the differential kernel $\pi^{(2)}$ does not contain a trace part, i.e., the part proportional to $\chi^o_i\chi^o_i$. In the course of deriving Eq. (44) we get $[x_i, a][x_i, b]$ as a coefficient of $\chi^o_i\chi^o_i$. Up to the kernel condition this product of commutators is equivalent to $aL^2b$. The reason why the trace part drops from the differential kernel is due to the relation $aL^2b = -\frac{2}{\alpha}a(L_ib)x_i$. This relation is a direct consequence of the condition that $\ell^2$ is central. This type of problem relating to the reduction of degrees of freedom as well as to the structure of the differential kernel is a rather general feature when defining the differential forms by the adjoint action $L_i$.⁵ Thus, in the noncommutative case the construction gives 1-forms which have three independent components. One possibility to drop the trace part (which is proportional to the third component) in the present approach has been indicated in Sect. 2.4. On the other hand, although the 2-form is singular in the $N \to \infty$ limit, the action given in Eq. (77) still allows the interpretation as a regularized theory of the gauge theory on the sphere.

⁴ This relation follows from Definition (56).

⁵ The structure of the Dirac operator depends on the choice of the fermion, but on the other hand if the Dirac operator has the form $\theta^ix_i$, and if $\theta^i$ commutes with $x_i$, where $\theta^i \sim \gamma_\chi\chi^o_i$ in our case, then the derivative $d$ is always given by $da = \theta^i(L_ia)$ with $L_i$ being the adjoint action.
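Both relations used in this paragraph follow from a short computation. The sketch below assumes a single term $A_i = a(L_ib)$ with $L_ib = \frac{1}{\alpha}[x_i, b]$ (the adjoint action of footnote 5) and $x_ix_i = \ell^2$ central; the general case follows by summing over $\lambda$ in Definition (56).

```latex
\begin{align*}
x_iA_i + A_ix_i
  &= \tfrac{1}{\alpha}\big(x_i\,a\,[x_i,b] + a\,[x_i,b]\,x_i\big) \\
  &= \tfrac{1}{\alpha}\big([x_i,a][x_i,b] + a\,x_i[x_i,b] + a\,[x_i,b]\,x_i\big) \\
  &= \tfrac{1}{\alpha}\big([x_i,a][x_i,b] + a\,[x_ix_i,\,b]\big)
   = \tfrac{1}{\alpha}[x_i,a][x_i,b],
   \qquad\text{since } [\ell^2, b] = 0. \\[4pt]
L^2 b &= \tfrac{1}{\alpha^2}[x_i,[x_i,b]]
   = \tfrac{1}{\alpha^2}\big(x_ix_i\,b - 2\,x_i b\,x_i + b\,x_ix_i\big)
   = \tfrac{2}{\alpha^2}\big(\ell^2 b - x_i b\,x_i\big), \\
(L_ib)\,x_i &= \tfrac{1}{\alpha}\big(x_i b\,x_i - b\,\ell^2\big)
   \quad\Longrightarrow\quad
   a L^2 b = -\tfrac{2}{\alpha}\,a\,(L_ib)\,x_i .
\end{align*}
```

The first block is Eq. (82); the second shows explicitly where the centrality of $\ell^2$ enters.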


It is easy to check that both terms in the action Eq. (77) are invariant under the gauge transformation (70). Thus, the most general gauge action can be written as in Eq. (81). The first term corresponds to the standard gauge action in the commutative limit. This term is usually taken as the action for the gauge field in the fuzzy sphere. The second term approaches simply $(2x_iA_i)^2$ in this limit. As we mentioned, the symmetric part of the action can be understood as a gauge invariant mass for the radial component of the gauge field. Furthermore, in the action (75), this mass is diverging in the limit of $N \to \infty$ and can be treated as a delta function constraint under the path integral. Thus by taking a limit which respects the gauge symmetry, the freedom corresponding to $x_iA_i + A_ix_i + \alpha A^2$ is frozen and thus effectively drops from the theory. Since in this limit this procedure is equivalent to the constraint $x_iA_i = 0$, it reduces the freedom of the vector potential in the commutative theory properly.

From the point of view of constructing a gauge theory on the fuzzy sphere, we have an even simpler choice to treat the degrees of freedom of the theory. If we require only the gauge invariance under the gauge transformation (69), we can take the symmetric term as a constraint for the gauge field from the beginning. Then the action contains only the antisymmetric part, i.e., $C_2 = 0$ in Eq. (81), and the gauge field is constrained by
\[ G_0 = (A_ix_i + x_iA_i) + \alpha A_iA_i = 0. \tag{83} \]
Then in this construction, the gauge field has correct degrees of freedom, even in the noncommutative case. Apparently, this theory also gives the correct commutative limit. To complete our discussion, we want to mention that the use of the constraint $G_0 = 0$ to restrict the differential calculus is not straightforward, since $dG_0$ does not automatically vanish. The treatment of this constraint within the differential calculus needs more investigation.

The fuzzy sphere is one of the easiest examples of a noncommutative space. We can consider the U(1) gauge theory on the fuzzy sphere formulated in this paper as a regularized version of a gauge theory on the sphere. The gauge theory on the noncommutative sphere is also investigated in ref. [28]. The differential calculus there is based on the supersymmetric fuzzy sphere and the structure of the fermion is different from the one discussed here. Thus the structure of the differential algebra is also different. However, this is not a contradiction since, in principle, there are many types of differential algebra associated with the fuzzy sphere algebra, depending on the choice of the spectral triple.

In the formulation given here we can also see an interesting analogue with the M(atrix) theory. If we introduce a new field
\[ \nabla_i = \frac{1}{\alpha}x_i + A_i, \tag{84} \]
then the field strength $\Theta^A_{ij}$ is given by
\[ \Theta^A_{ij} = i[\nabla_i, \nabla_j] - \epsilon_{ijk}\nabla_k. \tag{85} \]

Using the same replacement for the symmetric part, the action is
\[ S_G = \frac{1}{(N+1)}\mathrm{Tr}_F\Big\{C_1\big(i[\nabla_i, \nabla_j] - \epsilon_{ijk}\nabla_k\big)^2 + C_2\Big(\nabla_i\nabla_i - \frac{\ell^2}{\alpha^2}\Big)^2\Big\}. \tag{86} \]
After rewriting the gauge field action in the above form, we can make the following reinterpretation: There is a general theory defined by the matrix $\nabla_i$ and the action (86).
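The role of the field $\nabla_i$ can be made explicit with two short computations. This is a sketch, assuming $L_ka = \frac{1}{\alpha}[x_k, a]$, $x_ix_i = \ell^2$, and the component transformation (70); the symbol $\nabla'_i$ for the transformed field is introduced here only for illustration.

```latex
\begin{align*}
\nabla_i\nabla_i - \frac{\ell^2}{\alpha^2}
  &= \frac{x_ix_i}{\alpha^2} + \frac{1}{\alpha}\big(x_iA_i + A_ix_i\big) + A_iA_i - \frac{\ell^2}{\alpha^2}
   = \frac{1}{\alpha}\big(x_iA_i + A_ix_i + \alpha A_iA_i\big) = \frac{G_0}{\alpha}, \\[4pt]
\nabla'_i
  &= \frac{x_i}{\alpha} + u\,(L_iu^*) + uA_iu^*
   = \frac{x_i}{\alpha} + \frac{1}{\alpha}\big(u\,x_i\,u^* - x_i\big) + uA_iu^*
   = u\,\nabla_i\,u^* .
\end{align*}
```

So the symmetric term is, up to a factor of $\alpha$, the constraint function $G_0$ of Eq. (83), and any trace of a polynomial in the $\nabla_i$ is invariant because $\nabla_i$ transforms by conjugation.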


The geometry of the base space is then defined by the vacuum expectation value of the field $\nabla_i$ given by $\langle\nabla_i\rangle = x_i/\alpha$. Then the original gauge field action can be obtained by expanding the field around this vacuum expectation value.

Acknowledgement. The authors would like to thank H. Ishikawa for helpful discussions. This work is supported by the Grant-in-Aid of Monbusho (the Japanese Ministry of Education, Science, Sports and Culture) #09640331.

5. Appendix

5.1. One form kernels. We show the existence of nontrivial elements of $J^{(1)}$ which contribute to $\pi(d\ker\pi^{(1)})$. Consider the 1-forms:
\[ (x_A)^p\, d(x_B)^q \tag{87} \]
with $A, B = +, -, 3$, where we have used the coordinates $x_\pm = x_1 \pm ix_2$. Since $A_N$ is the algebra of $(N+1)\times(N+1)$ matrices, corresponding to the $(N+1)$ dimensional representation of the algebra of the angular momentum up to the normalization, the identity $(x_\pm)^{N+1} = 0$ holds, and thus one easily finds that elements of the differential kernel appear for $A = B = \pm$. Here, we give the proof for $A = B = +$. (The proof for $A = B = -$ works correspondingly.) Let us define the 1-forms $\omega_{p,q}$ as
\[ \omega_{p,q} = x_+^p\, dx_+^q. \tag{88} \]
Then the following proposition holds.

Proposition 5. $\omega_{p,q}$ is an element of $\ker\pi^{(1)}$, for integers $p, q$ satisfying $1 < p, q < N + 1$ and $p + q \geq N + 2$.

Proof. Using the Dirac operator given in (15) we obtain
\[ \pi(\omega_{p,q}) = x_+^p[D, x_+^q] = \frac{i}{\ell}\sum_{A=+,3,-}\gamma_\chi\chi^o_A\, x_+^p L_A x_+^q. \tag{89} \]
A straightforward calculation yields
\[ L_+x_+^q = 0, \qquad L_3x_+^q = q\,x_+^q, \qquad L_-x_+^q = -2q\,x_+^{q-1}\Big(x_3 + \frac{\alpha}{2}(q-1)\Big). \tag{90} \]
Substituting the above relations, and using that $x_+^{N+1} = 0$, the r.h.s. of Eq. (89) vanishes. □

Note that there are six different elements $x_\lambda$ which correspond to the raising (lowering) operators of the three different directions $x_\lambda = x_j \pm ix_k$, where $j < k$ and $j, k \in \{1, 2, 3\}$, satisfying
\[ (x_\lambda)^{N+1} = 0. \tag{91} \]
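The relations (90), the nilpotency $x_+^{N+1} = 0$, and the vanishing of the components in (89) are easy to confirm with explicit matrices. The sketch below assumes the realization $x_i = \alpha J_i$ (spin-$N/2$ matrices) and $L_A a = \frac{1}{\alpha}[x_A, a]$ with $x_\pm = x_1 \pm ix_2$; these conventions and the function names are assumptions made for illustration, chosen so that (90) holds as printed.

```python
import numpy as np

def fuzzy_coords(N, alpha=1.0):
    """Return (x_plus, x_minus, x3). ASSUMPTION: x_i = alpha * J_i, spin N/2."""
    s = N / 2.0
    m = s - np.arange(N + 1)                      # m = s, s-1, ..., -s
    Jp = np.diag(np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    return alpha * Jp, alpha * Jp.conj().T, alpha * np.diag(m).astype(complex)

def appendix_residuals(N, q, alpha=1.0):
    """Max-norm residuals of the three relations in Eq. (90) and of x_+^{N+1} = 0."""
    xp, xm, x3 = fuzzy_coords(N, alpha)
    I = np.eye(N + 1)
    mp = np.linalg.matrix_power
    ad = lambda x, a: (x @ a - a @ x) / alpha     # assumed adjoint action L
    xq = mp(xp, q)
    r_plus = np.abs(ad(xp, xq)).max()
    r_three = np.abs(ad(x3, xq) - q * xq).max()
    r_minus = np.abs(ad(xm, xq)
                     + 2 * q * mp(xp, q - 1) @ (x3 + alpha / 2 * (q - 1) * I)).max()
    r_nil = np.abs(mp(xp, N + 1)).max()
    return r_plus, r_three, r_minus, r_nil

def pi_omega_norm(N, p, q, alpha=1.0):
    """Largest entry among the components x_+^p L_A x_+^q, A = +, 3, -, of Eq. (89)."""
    xp, xm, x3 = fuzzy_coords(N, alpha)
    mp = np.linalg.matrix_power
    ad = lambda x, a: (x @ a - a @ x) / alpha
    xq = mp(xp, q)
    return max(np.abs(mp(xp, p) @ ad(x, xq)).max() for x in (xp, x3, xm))

if __name__ == "__main__":
    print(appendix_residuals(N=5, q=3))
    print(pi_omega_norm(5, 3, 4))   # p + q = N + 2: all components vanish
    print(pi_omega_norm(5, 2, 3))   # p + q < N + 1: components do not vanish
```

The last two calls illustrate why the bound $p + q \geq N + 2$ appears in Proposition 5: the $L_-$ component lowers the total power of $x_+$ by one, so $p + q - 1 \geq N + 1$ is needed for it to vanish.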

For each direction $x_\lambda$ we can obtain kernels of the type $\omega_{p,q} = x_\lambda^p\, dx_\lambda^q$.⁶ These one forms, as well as all one forms obtained by multiplying elements $a \in A_N$ onto them, belong to the kernel $J^{(1)} = \ker\pi^{(1)}$. We may still find other elements of $\ker\pi^{(1)}$. However, the above kernel $\omega_{p,q}$ is sufficient to prove that $\pi(d\ker\pi^{(1)})$ is not empty and contains the symmetric traceless part of $\chi^o_i\chi^o_j$.

⁶ In fact we have a whole "tower" of kernels
\[ \prod_{k=0}^{m}\Big(x_3 - \frac{(N-2k)}{2}\alpha\Big)\omega_{p,q} \in \ker\pi^{(1)}, \quad\text{for } p + q \geq N + 1 - m,\ m = 0, \ldots, N-1, \tag{92} \]
since $\prod_{k=0}^{m}\big(x_3 - \frac{(N-2k)}{2}\alpha\big)x_+^{N-m} = 0$. However, the above kernel $\omega_{p,q}$ is enough for the following discussions.

References

1. Banks, T., Fischler, W., Shenker, S.H., Susskind, L.: M theory as a matrix model. Nucl. Phys. B 497, 41–55 (1997)
2. Ishibashi, N., Kawai, H., Kitazawa, Y., Tsuchiya, A.: A large-N reduced model as superstring. Nucl. Phys. B 498, 467–491 (1997)
3. Connes, A., Lott, J.: Nucl. Phys. B 18 (Proc. Suppl.) 29 (1990); see also chapter VI of ref. [9]
4. Berezin, F.A.: Quantization. Math. USSR Izvestija 8, 1109–1165 (1974)
5. Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 40, 153–174 (1975)
6. Bordemann, M., Hoppe, J., Schaller, P., Schlichenmaier, M.: gl(∞) and Geometric Quantization. Commun. Math. Phys. 138, 209–244 (1991)
7. Bordemann, M., Meinrenken, E., Schlichenmaier, M.: Toeplitz Quantization of Kähler Manifolds and gl(N), N → ∞ Limits. Commun. Math. Phys. 165, 281–296 (1994)
8. Coburn, L.A.: Deformation Estimates for the Berezin-Toeplitz Quantization. Commun. Math. Phys. 149, 415–424 (1992)
9. Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
10. Klimek, S., Lesniewski, A.: Quantum Riemann Surfaces I. The Unit Disk. Commun. Math. Phys. 146, 103 (1992)
11. Cahen, M., Gutt, S., Rawnsley, J.: Quantization of Kähler Manifolds II. Transactions of the American Math. Soc. 337, 73–98 (1993)
12. Hoppe, J.: Quantum Theory of a Massless Relativistic Surface and a Two-Dimensional Bound State Problem. PhD Thesis, MIT (1982), published in Soryushiron Kenkyu (Kyoto) Vol. 80, 145–202 (1989)
13. de Wit, B., Hoppe, J., Nicolai, H.: On the Quantum Mechanics of Supermembranes. Nucl. Phys. B305, 545–581 (1988)
14. Bargmann, V.: On a Hilbert Space of Analytic Functions and an Associated Integral Transform, Part I. Comm. Pure Appl. Math. 14, 187–214 (1961)
15. Perelomov, A.M.: Coherent states for arbitrary Lie groups. Commun. Math. Phys. 26, 222 (1972); Generalized Coherent States and their Application. Berlin–Heidelberg–New York: Springer Verlag, 1986
16. Madore, J.: The fuzzy sphere. Class. Quant. Grav. 9, 69–87 (1992)
17. Grosse, H., Presnajder, P.: The Dirac Operator on the Fuzzy Sphere. Lett. Math. Phys. 33, 171–181 (1995)
18. Grosse, H., Klimčík, C., Presnajder, P.: Towards a finite Quantum Field Theory in Noncomm. Geometry. Int. J. Theor. Phys. 35, 231–244 (1996); Field Theory on a Supersymmetric Lattice. Commun. Math. Phys. 185, 155–175 (1997); Topological Nontrivial Field Configurations in Noncommutative Geometry. Commun. Math. Phys. 178, 507–526 (1996)
19. Haldane, F.D.M.: Fractional Quantization of the Hall Effect: A Hierarchy of Incompressible Quantum Fluid States. Phys. Rev. Lett. 51, 605 (1983)
20. Fano, G., Ortolani, F., Colombo, E.: Configuration-interaction calculations on the fractional quantum Hall effect. Phys. Rev. B 34, 2670–2680 (1986)
21. Dubois-Violette, M., Kerner, R., Madore, J.: Gauge Bosons in a Noncommutative Geometry. Phys. Lett. B 217, 485–488 (1989)
22. Dubois-Violette, M., Kerner, R., Madore, J.: Noncommutative differential geometry of matrix algebras. J. Math. Phys. 31, 316 (1990)
23. Dubois-Violette, M., Kerner, R., Madore, J.: Noncommutative differential geometry and new models of gauge theory. J. Math. Phys. 31, 323 (1990)
24. Dubois-Violette, M., Madore, J., Kerner, R.: Super Matrix Geometry. Class. Quantum Grav. 8, 1077 (1991)
25. Carow-Watamura, U., Watamura, S.: Differential Calculus on Fuzzy Sphere and Scalar Field. Int. J. of Mod. Phys. A 13, 3235–3243 (1998)
26. Jayewardena, C.: Schwinger model on $S^2$. Helvetica Physica Acta 61, 636–711 (1988)
27. Chamseddine, A.H., Fröhlich, J.: Some Elements of Connes' Noncommutative Geometry, and Spacetime Geometry. Preprint ETH-TH-93-24 (93, rec. Jul.), hep-th/9307012
28. Klimčík, C.: Gauge theories on the noncommutative sphere. IHES/P/97/77, hep-th/9710153

Communicated by H. Araki

Commun. Math. Phys. 212, 415 – 436 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

On the Large Time Asymptotics of Decaying Burgers Turbulence

Roger Tribe, Oleg Zaboronski

Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK. E-mail: [email protected]; [email protected]

Received: 4 October 1999 / Accepted: 4 February 2000

Abstract: The decay of Burgers turbulence with compactly supported Gaussian "white noise" initial conditions is studied in the limit of vanishing viscosity and large time. Probability distribution functions and moments for both velocities and velocity differences are computed exactly, together with the "time-like" structure functions $T_n(t, \tau) \equiv \langle(u(t+\tau) - u(t))^n\rangle$. The analysis of the answers reveals both well known features of Burgers turbulence, such as the presence of dissipative anomaly, the extreme anomalous scaling of the velocity structure functions and self similarity of the statistics of the velocity field, and new features such as the extreme anomalous scaling of the "time-like" structure functions and the non-existence of a global inertial scale due to multiscaling of the Burgers velocity field. We also observe that all the results can be recovered using the one point probability distribution function of the shock strength and discuss the implications of this fact for Burgers turbulence in general.

1. Introduction

The study of decaying Burgers turbulence (DBT) is largely motivated by the observation that this is a system which falls into the phenomenological class of turbulent systems which can be treated in principle by means of Kolmogorov theory. Yet the answers which can be derived analytically for Burgers turbulence are in sharp contradiction to the predictions of Kolmogorov theory. The understanding of the reasons for such a discrepancy and their relevance for the general theory of turbulence is one of the major aims of the study of Burgers turbulence. The history of the subject (see e.g. [11, 23, 27, 32, 16, 28, 19, 2–4, 1, 5, 6, 18, 22, 21, 33, 34, 9, 24, 31, 35, 8]; see [17] for a review) shows however that the problem is hard, so hard in fact that it has a tendency to become self justifying, getting more and more alienated from the main body of turbulence research. However, until recently there existed no model of Burgers turbulence which can be used as a testing ground for general phenomenological theories of turbulence on one hand and admits a complete and simple analytical treatment on the other.

In the present paper we introduce and analyse such a model. Namely, we study the decay of Burgers turbulence with compactly supported Gaussian "white noise" initial conditions. In physical terms the turbulence in our model is excited by an initial disturbance localized at a fixed scale much less than the size of the reservoir and which can occur with equal probability around any point of the reservoir. Note that DBT driven by "white noise" plays a special role for the theory of DBT in general. The reason is that the integral scale of turbulence in this problem is not imposed by initial conditions but rather is generated by time evolution. Thus, the answers one obtains for "white noise" DBT are in some sense universal. Consider for example DBT driven by Gaussian initial conditions characterized by the two point function $\chi(r)$ which is approximately constant for $r \ll R$ and goes to 0 exponentially fast for $r \gg R$. Then the statistics of the velocity field in this model at scales much larger than $R$ and much less than the integral scale is asymptotically equivalent, in the limit as $\nu \to 0$, $t \to \infty$, to that of "white noise" DBT. Likewise, compactly supported "white noise" DBT defines a universality class of models of DBT driven by compactly supported Gaussian initial conditions.

The choice of a simple initial condition and the choice to look for answers only in the vanishing viscosity and large time limits lead to a model that is exactly solvable. Explicit asymptotics can be obtained for statistics that are hard to estimate in more general models. The main reason for the exact solvability of our model is the fact that the statistics of the velocity field in the case of compactly supported initial conditions are dominated in the limit $\nu \to 0$, $t \to \infty$ by two shock configurations, the statistics of which are easily computable as functionals of white noise. We would like to stress that our model is in a different universality class than the original Burgers model in which turbulence is initiated by white noise initial conditions but no restriction of compactness is imposed: a solution to Burgers equation corresponding to an initial condition supported on a whole line will generically contain infinitely many shocks at any moment of time, not just two as in our case. Accordingly, the large time statistics of the velocity field in our case is very different from that in Burgers' model. For instance, energy density decays as $t^{-1/2}$ in our case (see Sect. 3.1) and as $t^{-2/3}$ in Burgers' [11].

The paper is organized as follows. In Sect. 2 we give a precise statement of the problem, construct a large time limit of the solution to the inviscid Burgers equation corresponding to compactly supported initial conditions and formulate the main statements about the statistics of these solutions. In Sect. 3 we obtain asymptotics for a variety of statistics: the moments of velocity field, the probability distribution function of velocities, the velocity structure functions, the probability distribution function of velocity differences, time-like velocity structure functions. In Sect. 4 the analysis of these results is given. In particular, the validity of one shock approximation and multiscaling in the problem are discussed.


2. The Limiting Velocity Field

Consider the following initial value problem connected to the Burgers equation:
\[ \frac{\partial u}{\partial t}(x,t) + u(x,t)\frac{\partial u}{\partial x}(x,t) = \nu\frac{\partial^2 u}{\partial x^2}(x,t), \quad x \in \mathbb{R},\ t > 0, \tag{1} \]
\[ u(x, 0) = u_0(x), \tag{2} \]
where $u_0(x)$ is a bounded function which is compactly supported in the interval $[x_0 - l, x_0 + l]$. Here $l$ is a fixed positive constant and $x_0$ is a random variable uniformly distributed in the interval $[-L, L]$. The fixed positive constant $L$ plays the role of a normalization length. Conditional on $x_0$ the initial velocity $u_0(x)$ will be a white noise over the interval $[x_0 - l, x_0 + l]$, so that it has a formal density
\[ P(u_0|x_0) = \frac{1}{Z}\exp\Big(-\frac{1}{2J}\int_{x_0-l}^{x_0+l}u_0^2(x)\,dx\Big), \tag{3} \]
where $Z$ is a normalization constant chosen in such a way that, formally,
\[ \int_{-L}^{L}\frac{dx_0}{2L}\int P(u_0|x_0)\,D(u_0) = 1. \]
$J$, the Gaussian variance, is a positive constant which plays the role of the Loitsansky integral for the problem at hand.

Since we have a compact initial condition, the distribution of the velocities $u_t(x)$ is not translation invariant. The role of $x_0$ is to randomise the location of the initial disturbance uniformly over the interval $[-L, L]$. The values of $u_t(x)$ at a fixed $x$ will then typically be non-zero only with probability $O(L^{-1})$. We take the limit as $L \to \infty$ and all the answers concerning the statistics of the velocity field will be expressed in the form of the leading term in an asymptotic expansion in $L^{-1}$. This has the advantage that the answers are then translation invariant and we are free to consider statistics centered at the origin.

In what follows we will compute asymptotics of the following statistics: the moments of velocity distribution $M_n = \langle u^n(x,t)\rangle$; the velocity structure functions $S_n(y) = \langle(u(x+y,t) - u(x,t))^n\rangle$; the probability distribution function of the velocity field $P(u) = \langle\theta(u - u(x,t))\rangle$; the probability distribution function of velocity differences $P(u, y) = \langle\theta(u - u(x+y,t) + u(x,t))\rangle$; and the "time-like" velocity structure functions, $T_n(\tau, t) = \langle(u(x,t+\tau) - u(x,t))^n\rangle$. Here $\theta(z) = \chi(z \geq 0)$ is the Heaviside function and $\langle\ldots\rangle$ denotes the average with respect to the random initial velocity field $u_0(x)$.

The solution of the initial value problem (1), (2) for $\nu > 0$ via the Cole-Hopf transformation and the evaluation of the limit as $\nu \to 0$ for fixed $t > 0$ are well known. We refer the reader to [20] and [11] for a detailed description and give here a quick summary, sufficient for our needs. The vanishing viscosity solution can be obtained by plotting a chain of parabolic arcs such that each is touching the graph of the function $-q(x) = -\int_{-\infty}^{x}u_0(y)\,dy$ at two points exactly. The $i$-th parabolic arc is given by the graph of the function $\Phi_i(x,t) = \Phi_i + \frac{(x - x_i)^2}{2t}$. As time grows the parabolic arcs flatten out and merge, and there exists a time $T^*$ such that for any $t > T^*$ there are generically only two arcs left. The velocity field associated with such a configuration is then given by
\[ u^*(x,t) = U(x_0 + x^*, x, t, P, Q) \equiv \frac{(x - x_0 - x^*)}{t}\,\chi_{[x_0 + x^* - \sqrt{-2Qt},\; x_0 + x^* + \sqrt{2(P-Q)t}]}(x), \tag{4} \]


R. Tribe, O. Zaboronski

where $\chi_I$ is the indicator function of the interval $I$, $P = q(+\infty)$ is the momentum corresponding to a given $u_0$, $Q = \min_x q(x)$ is the global minimum of $q(x)$ and $x_0 + x^* \in [x_0 - l, x_0 + l]$ is the point where this minimum is achieved. (Such a point exists and is unique almost surely, as $q(x)$ is continuous and the global minimum is almost surely unique.) The limiting solution (4) was originally constructed in [20]. The time $T^*$ at which the limiting velocity field $u^*$ is attained depends on the random initial condition $u_0$, but it will be shown that the statistics of the velocity field are well approximated at large times by the statistics of the limiting velocity field $u^*$. The latter are determined in turn by the joint distribution of the momentum $P$ and the global minimum $Q$. Indeed, although the expression for $u^*$ depends explicitly on $(P, Q, x^*)$, the dependence on $x^*$ does not influence the statistics of $u^*$ in the limit $L \to \infty$, where the translational invariance is restored. We defer the detailed discussion of this point to the next section.

The choice of white noise as an initial distribution leads to the distribution of the pair $(P, Q)$ being exactly calculable. Indeed it is a well known consequence of the 'reflection principle' for Brownian paths [29]. Since it is key to all our asymptotics we include a quick derivation of the joint density function $\rho(P, Q)$. We start with a computation of the probability distribution function of the momentum, $\rho(P)$. Writing $\delta$ for the delta function at zero, we have by definition

\[ \rho(P) = \Big\langle \delta\Big(P - \int_{-\infty}^{\infty} dx\, u_0(x)\Big) \Big\rangle = \int_{-\infty}^{\infty} \frac{d\lambda}{2\pi}\, e^{i\lambda P} \Big\langle e^{-i\lambda \int_{-\infty}^{\infty} dx\, u_0(x)} \Big\rangle. \]

Using (3) this functional integral is Gaussian and can be simply computed to give

\[ \Big\langle e^{-i\lambda \int_{-\infty}^{\infty} dx\, u_0(x)} \Big\rangle = e^{-lJ\lambda^2}. \]

The integral over $\lambda$ is a Gaussian integral and we conclude that the distribution of $P$ is also Gaussian, as could have been guessed from the very beginning, and given by

\[ \rho(P) = \frac{1}{\sqrt{\pi}\, P_0}\, e^{-(P/P_0)^2}, \qquad \text{where } P_0 = 2\sqrt{lJ}. \tag{5} \]
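The Gaussian law (5) can be illustrated with a small Monte Carlo experiment. The sketch below is not part of the original derivation: it assumes that white noise with density (3) may be discretized as independent Gaussian increments of variance $J\,dx$, and the names `sample_momentum` and `empirical_variance`, as well as the parameter values, are hypothetical. Density (5) has variance $P_0^2/2 = 2lJ$, which is what the sample variance should approach.

```python
import math
import random


def sample_momentum(l, J, steps, rng):
    """Approximate P = integral of u0 over [-l, l] by a sum of Gaussian increments.

    Assumption: white noise with density (3) is discretized so that each cell
    of width dx contributes an independent N(0, J * dx) increment to q(x).
    """
    dx = 2.0 * l / steps
    sigma = math.sqrt(J * dx)
    return sum(rng.gauss(0.0, sigma) for _ in range(steps))


def empirical_variance(samples):
    """Unbiased sample variance."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)


l, J = 1.0, 0.5                      # illustrative values only
rng = random.Random(12345)           # fixed seed for reproducibility
samples = [sample_momentum(l, J, steps=50, rng=rng) for _ in range(4000)]

P0 = 2.0 * math.sqrt(l * J)          # scale appearing in (5)
var_theory = P0 ** 2 / 2.0           # variance of the density (5), equal to 2*l*J
var_est = empirical_variance(samples)
print("sample variance:", var_est, "theory:", var_theory)
```

The same discretized path could in principle be reused to sample the minimum $Q$, although the discrete minimum converges to the continuum one only slowly in the step size.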

The joint probability distribution function can now be computed as follows. Fix $q$, $p$ satisfying $q < 0$, $q < p$. Let $x'$ be the first value of $x$ for which $q(x) = q$. Define $q'(x)$ to equal $q(x)$ for $x \le x'$ and to equal the reflection of $q(x)$ in the horizontal line $y = q$ for $x \ge x'$. Then if $Q' = \min_x q'(x)$ and $P' = q'(\infty)$, the reflection principle (see [29]), which exploits the white noise nature of $u_0$, states that $Q'$, $P'$ have the same distribution as $Q$, $P$. Then

\[ \mathrm{Prob}(Q \le q,\, P \ge p) = \mathrm{Prob}(Q' \le q,\, P' \le 2q - p) = \mathrm{Prob}(P' \le 2q - p) = \int_{-\infty}^{2q-p} \frac{dz}{\sqrt{\pi}\, P_0}\, e^{-(z/P_0)^2}. \]

Large Time Asymptotics of Decaying Burgers Turbulence


Differentiating in $p$ and $q$ we conclude that

\[ \rho(p, q) = \frac{4(p - 2q)}{\sqrt{\pi}\, P_0^3}\, e^{-\left(\frac{p - 2q}{P_0}\right)^2}, \qquad \text{if } q \le \min\{0, p\}, \tag{6} \]

and is zero for all other values of $p$ and $q$. With the help of (6) we are able to average functionals $F[u^*(t)] = F[u^*(x_i, t) : i = 1, 2, \dots]$ with respect to the initial distribution. If however we are interested in the statistics of $u(x,t)$ at zero viscosity and large times there is still a question: is it true that in this limit $\langle F[u(t)]\rangle \sim \langle F[u^*(t)]\rangle$, or are there, even at large times, statistically many initial conditions such that the corresponding velocity profiles have not converged to the limiting ones? It so happens that the first alternative prevails. The detailed proofs of this fact for relevant functionals are carried out in the next section and in the appendix and are based on the following estimate on the time $T^*$ of convergence to the limiting profile:

\[ \mathrm{Prob}(T^* > t) \le C \left(\frac{t_c}{t}\right)^{1/2}, \tag{7} \]

where $t_c = \sqrt{l^3/J}$ and $C$ is a positive number. The proof of this estimate is fairly complicated and is relegated to the appendix. However, the result itself is so important for the validity of the conclusions of our paper that we decided to present here a convincing and very simple heuristic derivation of it.

By definition, $\mathrm{Prob}(T^* < t \mid P, Q, x^*) = \mathrm{Prob}(q \le \Phi_t \mid P, Q, x^*)$, where $\Phi_t$ coincides for $x < x^*$ with the parabolic arc $\Phi_{1,t}$ passing through the point $(x^*, -Q)$ and touching the line $y = 0$, and for $x > x^*$ with the parabolic arc $\Phi_{2,t}$ passing through the point $(x^*, -Q)$ and touching the line $y = -P$ (we used the translation invariance of the random variable $T^*$ to set $x_0 = 0$; consequently, $x^* \in [-l, l]$). It is convenient to think of a Brownian walk $q(x)$ passing through $(x^*, -Q)$ as a collection of two independent walks $q^+(x)$ and $q^-(x)$ starting at this point and moving in opposite directions in "time" $x$. Therefore,

\[ \mathrm{Prob}(T^* < t \mid P, Q, x^*) = \mathrm{Prob}(q^- < \Phi_{1,t} \mid Q, x^*) \cdot \mathrm{Prob}(q^+ < \Phi_{2,t} \mid P, Q, x^*). \tag{8} \]

To estimate, say, $\mathrm{Prob}(q^- < \Phi_{1,t} \mid Q, x^*)$ from below we note that $\mathrm{Prob}(q^- < \Phi_{1,t} \mid Q, x^*) \ge \mathrm{Prob}(q^- < -Q + \theta \cdot (x - x^*) \mid Q, x^*)$, where $y = -Q + \theta \cdot (x - x^*)$ is the equation of the line tangent to the parabola $\Phi_{1,t}$ at the point $(x^*, -Q)$, with $\theta = \sqrt{-2Q/t}$. Hence,

\[ \mathrm{Prob}(q^- < \Phi_{1,t} \mid Q, x^*) \ge \lim_{\epsilon \to +0} \frac{\int_{q(-l)=0}^{q(x^*)=-Q-\epsilon} Dq\; \Theta\big[q < -Q + \theta \cdot (x - x^*)\big]\, e^{-\frac{1}{2J}\int_{-l}^{x^*} \dot q^2\, dx}}{\int_{q(-l)=0}^{q(x^*)=-Q-\epsilon} Dq\; \Theta\big[q < -Q\big]\, e^{-\frac{1}{2J}\int_{-l}^{x^*} \dot q^2\, dx}} \cdot \Theta\big(0 < -Q - \theta \cdot (l + x^*)\big), \tag{9} \]


where $\Theta[\dots]$ is a functional step function and $\Theta(\dots)$ is a usual one. The functional integral in the numerator of (9) can be transformed into an integral over all paths satisfying $q(x) \ge 0$ by the change of variables $q(x) \to q(x) - Q + \theta (x - x^*)$. (A counterpart of this transformation in quantum mechanics is a Galilean transformation.) Now the functional integrals in both the numerator and the denominator of (9) can be expressed in terms of the Green's function of the heat equation $\dot q = \frac{J}{2} q''$ on a half-line, i.e. the antisymmetrization of the Green's function of the same equation on the whole line. A simple computation then shows that

\[ \mathrm{Prob}(q^- < \Phi_{1,t} \mid Q, x^*) \ge \left(1 - \sqrt{\frac{2l^2}{-Qt}}\right) \cdot \theta\left(1 - \frac{2l^2}{-Qt}\right). \tag{10} \]

A similar estimate holds for $\mathrm{Prob}(q^+ < \Phi_{2,t} \mid P, Q, x^*)$ if one replaces $-Q$ with $P - Q$ in the right-hand side of (10). Substituting these two estimates into (8) and integrating both sides of the resulting inequality with respect to $P$, $Q$ using (6), we find that $\mathrm{Prob}(T^* < t) \ge 1 - \mathrm{Const}\, \sqrt{l^2/(t P_0)}$, which is equivalent to the estimate (7) for $\mathrm{Prob}(T^* > t) = 1 - \mathrm{Prob}(T^* < t)$.

3. The Statistics of the Velocity Field in the ν → 0, t → ∞ Limit

3.1. Moments of the velocity distribution. The aim of the present section is to compute the large-$t$ limit of the moments of the velocity distribution

\[ M_n(t) = \langle u^n(0,t) \rangle, \qquad n = 1, 2, \dots. \tag{11} \]

Odd order moments vanish identically due to the symmetry: both the Burgers equation and the initial distribution are invariant with respect to the transformation $u \to -u$, $x \to -x$. On the other hand, $M_{2k+1} \to -M_{2k+1}$ under this transformation, which implies that $M_{2k+1}(t) \equiv 0$ for $k = 1, 2, \dots$. We concentrate therefore on the computation of the moments of even order and assume everywhere below that $n$ is even. We may write, using the fact that $u(x,t) = u^*(x,t)$ for $t > T^*$,

\[ M_n(t) = \langle u^{*n}(0,t) \rangle + R_n(t), \tag{12} \]

where

\[ R_n(t) = \Big\langle \big(u^n(0,t) - u^{*n}(0,t)\big)\, \theta(T^* - t) \Big\rangle \tag{13} \]

is an error term to be estimated. The first term on the right-hand side of (12) can be written in the following form:

\[ \langle u^{*n}(0,t) \rangle = \int dp\,dq\, \rho(p,q) \int_{-L}^{L} \frac{dx_0}{2L}\, U^n(x_0, 0, t, p, q) + r_n(t), \tag{14} \]

where

\[ r_n(t) = \int_{-l}^{l} dx^* \int dp\,dq\, \rho(p,q,x^*) \left( \int_{-L+x^*}^{-L} + \int_{L}^{L+x^*} \right) \frac{dx_0}{2L}\, U^n(x_0, 0, t, p, q) \tag{15} \]


is an error term appearing due to neglecting $x^*$ in comparison to $L$, and $\rho(p, q, x^*)$ is the joint probability density of $P$, $Q$ and $x^*$. It is shown in the appendix that the error term $r_n(t)$ does not affect the asymptotics as $\nu \to 0$, $L \to \infty$, $t \to \infty$. Informally this fact can be explained by noticing that the integrand in (15) is non-zero only for velocity profiles which are "stretched" over an interval of length $L$ and thus are exponentially improbable. The remaining integral on the right-hand side of (14) can be evaluated exactly using the explicit expressions (4) and (6), leading to the following result:

\[ \langle u^{*n}(0,t) \rangle \sim \frac{\Gamma((n+3)/4)}{\sqrt{\pi}\,(n+1)}\, \frac{L(t)}{L}\, U(t)^n, \tag{16} \]

where

\[ L(t) = \sqrt{2 P_0 t}, \qquad U(t) = \frac{L(t)}{t} \tag{17} \]

are parameters, with dimensions of length and velocity, which should be interpreted as the scale of turbulence and the turbulent velocity respectively. Here we write the symbol $\sim$ to mean asymptotic equivalence in the limit as $L \to \infty$ and then $t \to \infty$. Another computation presented in the appendix leads to the following estimate of the error term $R_n(t)$ from (12):

\[ |R_n(t)| \le C_n\, \frac{L(t)}{L}\, U(t)^n \left(\frac{t_c}{t}\right)^{1/4}, \tag{18} \]

where $t_c = \sqrt{l^3/J}$ is a constant having the dimension of time and $C_n$ is a positive constant. Comparing (18) with (16) we see that for $t \gg t_c$, $|R_n(t)| \ll \langle u^{*n}(0,t) \rangle$, which permits us to conclude that

\[ M_{2k}(t) \sim \frac{\Gamma(k/2 + 3/4)}{\sqrt{\pi}\,(2k+1)}\, \frac{L(t)}{L}\, U(t)^{2k}, \qquad k = 1, 2, \dots. \tag{19} \]

It is important to stress however that the coefficient $C_n$ from (18) grows faster with $n$ than the numerical factor in the right-hand side of (19). Thus it takes a long time for a moment of high order to converge to the limiting value (19). It follows from (19) that the energy density $E(t) \equiv \frac{1}{2} M_2(t)$ decays like $t^{-1/2}$ as $t \to \infty$. This is the result to be expected: dissipation of energy occurs in Burgers turbulence due to shock collisions and at each separate shock. The energy of a separate shock decays as $t^{-1/2}$ and, due to the absence of shock collisions in the limiting profile (4), this also gives the law of decay of the total energy density. This argument is due to J. M. Burgers, see [11].

We will also see below that the statistics of the velocity field in our model are self-similar, with the scales of length and velocity given by (17). These scales depend on time exactly as their counterparts in Kida's model. The statistics of the velocity field in our case are however different from those of Kida$^1$. Thus we conclude that self-similarity alone does not determine the large time asymptotics of the statistics of the velocity field in DBT. Note also that $E(t)$ decays in time, showing the presence of a dissipation anomaly in the model: the rate of energy dissipation does not vanish but converges to a finite non-zero limit when the viscosity $\nu$ approaches zero.

$^1$ There exists no complete solution of Kida's model. Yet the answers which can be obtained within Kida's model are different from their counterparts in our model.
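The decay law for the energy density can be read off directly from (17) and (19). The following sketch simply evaluates those two formulas; the function names and the numerical values of $P_0$ and $L$ are illustrative assumptions, not quantities fixed by the text. Since $M_{2k} \propto L(t)^{2k+1}/t^{2k} \propto t^{1/2-k}$, quadrupling the time should halve $E(t)$.

```python
import math


def turbulence_scales(t, P0):
    """Length and velocity scales L(t), U(t) from (17)."""
    L_t = math.sqrt(2.0 * P0 * t)
    return L_t, L_t / t


def even_moment(k, t, P0, L):
    """Leading-order M_{2k}(t) from (19); L is the normalization length."""
    L_t, U_t = turbulence_scales(t, P0)
    coeff = math.gamma(k / 2.0 + 0.75) / (math.sqrt(math.pi) * (2 * k + 1))
    return coeff * (L_t / L) * U_t ** (2 * k)


def energy_density(t, P0, L):
    """E(t) = M_2(t) / 2."""
    return 0.5 * even_moment(1, t, P0, L)


P0, L = 1.0, 1.0e3                   # illustrative values only
for t in (10.0, 40.0, 160.0):
    print(t, energy_density(t, P0, L))
```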


3.2. The probability distribution function of velocities. In this section we will concern ourselves with computing the probability distribution function (PDF) of velocities given by

\[ P(u,t) \equiv \mathrm{Prob}(u(0,t) > u) = \langle \theta(u(0,t) - u) \rangle. \tag{20} \]

Reasoning exactly as in the previous section we find that

\[ P(u,t) = \langle \theta(u^*(0,t) - u) \rangle + R(u,t), \tag{21} \]

where

\[ R(u,t) = \Big\langle \theta(T^* - t) \big( \theta(u(0,t) - u) - \theta(u^*(0,t) - u) \big) \Big\rangle \tag{22} \]

is an error due to the replacement $u \to u^*$;

\[ \langle \theta(u^*(0,t) - u) \rangle = \theta(-u) + \int dp\,dq\, \rho(p,q) \int_{-L}^{L} \frac{dx_0}{2L} \big( \theta(U(x_0, 0, t, p, q) - u) - \theta(-u) \big) + r(u,t), \tag{23} \]

where $r(u,t)$ is an error due to neglecting $x^*$ in comparison with $L$:

\[ r(u,t) = \int_{-l}^{l} dx^* \int dp\,dq\, \rho(p,q,x^*) \left( \int_{-L+x^*}^{-L} + \int_{L}^{L+x^*} \right) \frac{dx_0}{2L} \big( \theta(U(x_0, 0, t, p, q) - u) - \theta(-u) \big). \tag{24} \]

The reason that the term $\theta(-u)$ is added and subtracted is that, due to the averaging of the position of the initial condition over the block $[-L, L]$, the velocity is typically zero and so the PDF is an $O(L^{-1})$ perturbation of $\theta(-u)$. An estimate of $r(u,t)$ similar to that of the term $r_n(t)$ in Sect. 3.1 shows that $r(u,t)$ does not affect the final asymptotics. An exact calculation using the known density $\rho(p,q)$ for the other terms on the right-hand side of (23) leads to

\[ \langle \theta(u^*(0,t) - u) \rangle \sim \theta(-\bar u) + \frac{L(t)}{L} \int_{\bar u^2}^{\infty} \frac{d\alpha}{\sqrt{\pi}}\, e^{-\alpha^2} \big( \sqrt{\alpha} - |\bar u| \big)\, \mathrm{sgn}(\bar u), \tag{25} \]

where $\bar u = u/U(t)$. A computation performed in the appendix shows that

\[ |R(u,t)| \le C\, \frac{L(t)}{L} \left(\frac{t_c}{t}\right)^{1/4}, \tag{26} \]

where $C$ is a positive constant. Comparing (26) with (25) we see that for $t \gg t_c$ we have $\langle \theta(u^*(0,t) - u) \rangle \gg |R(u,t)|$, with the last inequality being pointwise in $\bar u$ rather than uniform. We conclude that

\[ P(U(t)\bar u, t) \sim \theta(-\bar u) + \frac{L(t)}{L} \int_{\bar u^2}^{\infty} \frac{d\alpha}{\sqrt{\pi}}\, e^{-\alpha^2} \big( \sqrt{\alpha} - |\bar u| \big)\, \mathrm{sgn}(\bar u). \tag{27} \]


If in particular $|\bar u| \to \infty$ this simplifies to

\[ P(U(t)\bar u, t) \sim \theta(-\bar u) + \frac{L(t)}{L}\, \frac{1}{8\sqrt{\pi}}\, \mathrm{sgn}(\bar u)\, \frac{e^{-\bar u^4}}{|\bar u|^5}. \tag{28} \]

Note that the answer (27) for $P(u,t)$ is self-similar, with $U(t)$ playing the role of the integral velocity scale. Note also that the form of $P(u,t)$ is not Gaussian. This confirms the non-triviality of our model: the output (the strongly non-Gaussian statistics of the velocity field in the limit of small viscosity and large time) is not the same as the input (a trivial Gaussian distribution of the initial velocity field). This non-triviality will be re-emphasised in the subsequent sections, where it will be shown that the limiting statistics of the velocity field are intermittent.

Finally we would like to make the following technical comment. Of course, the moments of the distribution (27) are exactly those given by (19). We could therefore try to compute the distribution (20) first and then argue that the moments of this asymptotic distribution coincide with the asymptotics of the moments of the actual distribution. Unfortunately the analysis of error terms within this approach becomes very involved. For this reason we have two separate computations, the asymptotics of the moments of the velocity distribution and the asymptotics of the velocity distribution itself.

3.3. Velocity structure functions. Now we will turn to the two-point statistics of the velocity field and compute asymptotics for the velocity structure functions given by

\[ S_n(y,t) = \big\langle \big(u(y,t) - u(0,t)\big)^n \big\rangle, \qquad n = 1, 2, \dots. \tag{29} \]

We find as in the previous subsections that

\[ S_n(y,t) = \big\langle \big(u^*(y,t) - u^*(0,t)\big)^n \big\rangle + R_n(y,t), \tag{30} \]

where $R_n(y,t)$ accounts for the error due to the replacement of $u$ with $u^*$. As shown in the appendix this error can be estimated as follows: for $t$ such that $L(t) \ge y$,

\[ |R_n(y,t)| \le C_n\, U(t)^n\, \frac{y}{L} \left(\frac{t_c}{t}\right)^{1/4}. \tag{31} \]

We express the first term in (30) as

\[ \big\langle \big(u^*(y,t) - u^*(0,t)\big)^n \big\rangle = \int_{-L}^{L} \frac{dx_0}{2L} \int dp\,dq\, \rho(p,q) \big( U(x_0, y, t, p, q) - U(x_0, 0, t, p, q) \big)^n + r_n(y,t), \]

where $r_n(y,t)$ accounts for an error arising due to neglecting $x^*$ in comparison with $L$. Again it can be shown that the term $r_n(y,t)$ does not contribute to the asymptotics. Now a direct computation using the density $\rho(p,q)$ shows, for $y \ge 0$, that

\[ \big\langle \big(u^*(y,t) - u^*(0,t)\big)^n \big\rangle \sim (-1)^n \frac{1}{\sqrt{\pi}}\, \Gamma\Big(\frac{n+2}{4}\Big) \frac{L(t)}{L}\, U(t)^n\, \bar y + O(\bar y^2), \qquad n = 2, 3, \dots, \]


where $\bar y = y/L(t)$. In addition $S_1(y,t) \sim 0$, which confirms the restoration of translation invariance in the large $L$ limit. Comparing this with (31) we see that the asymptotics of the velocity structure functions are given, for fixed $\bar y \le 1$, by

\[ S_n(L(t)\bar y, t) \sim (-1)^n \frac{1}{\sqrt{\pi}}\, \Gamma\Big(\frac{n+2}{4}\Big) \frac{L(t)}{L}\, U(t)^n\, \bar y + O(\bar y^2), \qquad n = 2, 3, \dots. \tag{32} \]

It has been assumed in our computations that $\bar y \ge 0$. Extending (32) to negative $y$ by the symmetry $y \to -y$, $u \to -u$, we see that $S_{2k}(L(t)\bar y, t)$ is proportional to $|\bar y|$ and $S_{2k+1}(L(t)\bar y, t)$ is proportional to $\bar y$ for $k \ge 1$ and $|\bar y| \ll 1$. Thus the velocity structure functions of the problem exhibit in the inertial range the extreme anomalous (non-Kolmogorov) scaling which is typical for Burgers turbulence in general and is due to the presence of shocks in the limiting velocity profile. The Burgers anomalous scaling is well known from heuristic arguments (see e.g. [15, 7, 9]). In our case however it has been derived as a part of the complete solution of the problem.

3.4. The probability distribution function of velocity differences. Here we will compute the PDF for velocity differences

\[ P(u, y, t) = \mathrm{Prob}\big(u > \Delta u(y,t)\big) = \big\langle \theta\big(u - \Delta u(y,t)\big) \big\rangle, \tag{33} \]

where $\Delta u(y,t) = u(y/2, t) - u(-y/2, t)$ and $y \ge 0$. Definition (33) is tailored for the study of negative velocity differences and we consider only the case $u < 0$. Negative differences are the interesting case since they occur when the velocities are evaluated on either side of a shock. A lengthy but straightforward computation shows that for fixed $\bar u < 0$, $\bar y > 0$,

\[ P(U(t)\bar u, L(t)\bar y, t) \sim 2\, \frac{L(t)}{L} \left[ \int_{\bar u^2}^{(\bar y - \bar u)^2} \frac{d\alpha}{\sqrt{\pi}}\, e^{-\alpha^2} \big( \sqrt{\alpha} + \bar u \big) + \bar y \int_{(\bar y - \bar u)^2}^{\infty} \frac{d\alpha}{\sqrt{\pi}}\, e^{-\alpha^2} \right] + R(\bar u, \bar y, t), \tag{34} \]

where, as shown in the appendix,

\[ |R(\bar u, \bar y, t)| \le C\, \frac{L(t)}{L}\, \frac{\bar y}{\bar y + |\bar u|} \left(\frac{t_c}{t}\right)^{1/4}. \tag{35} \]

Due to the presence of the extra factor of $(t_c/t)^{1/4}$ decaying with time, $R(\bar u, \bar y, t)$ becomes small compared to the first term on the right-hand side of (34) for fixed $\bar u$, $\bar y$. It is easy to analyse (34) in the following limiting cases. We suppose that $\bar y \ll 1$. If $|\bar u| \ll 1$, then

\[ P(U(t)\bar u, L(t)\bar y, t) \sim \frac{L(t)\bar y}{L} \left( 1 - \frac{\sqrt{\pi}}{2}\, \bar y - \frac{\sqrt{\pi}}{2}\, |\bar u|^2 + O(\bar y^2) + O(|\bar u|^3) \right). \tag{36} \]

If $1 \ll |\bar u| \ll \bar y^{-1/3}$ then

\[ P(U(t)\bar u, L(t)\bar y, t) \sim \frac{1}{\sqrt{\pi}}\, \frac{L(t)\bar y}{L}\, \frac{1}{|\bar u|^2}\, e^{-|\bar u|^4} \left( 1 + O\Big(\frac{1}{|\bar u|^4}\Big) + O\big(\bar y |\bar u|^3\big) \right). \tag{37} \]


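If $|\bar u| \gg \bar y^{-1/3}$ then

\[ P(U(t)\bar u, L(t)\bar y, t) \sim \frac{1}{4\sqrt{\pi}}\, \frac{L(t)}{L}\, \frac{1}{|\bar u|^5}\, e^{-|\bar u|^4} \left( 1 + O\Big(\frac{1}{|\bar u|^8}\Big) \right). \tag{38} \]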

To summarize, for negative $u$, $P(u,y,t)$ decays algebraically for $|u| \ll U(t)$ and super-exponentially for $|u| \gg U(t)$. Moreover, $P(u,y,t) \sim O(y)$ if $1 \ll |\bar u| \ll \bar y^{-1/3}$ and does not depend on $y$ if $|\bar u| \gg \bar y^{-1/3}$. This information alone enables one to conclude that velocity structure functions of sufficiently high order exhibit anomalous scaling. In addition we observe a crossover between regimes (37) and (38). This crossover is actually responsible for the presence of many scales in the description of the statistics of the velocity field and the absence of a universal inertial range in Burgers turbulence. We refer the reader to Sect. 3.7 for a detailed discussion of this point.

3.5. The multi-time statistics of the velocity field. The simplicity of our model allows us to compute the correlation between values of the velocity field at different moments of time. Let

\[ T_n(\tau, t) = \big\langle \big(u(0, t+\tau) - u(0,t)\big)^n \big\rangle \tag{39} \]

be the velocity structure functions corresponding to the same point in space but different moments of time. We write (39) in the already familiar form

\[ T_n(\tau, t) = \big\langle \big(u^*(0, t+\tau) - u^*(0,t)\big)^n \big\rangle + R_n(\tau, t), \tag{40} \]

with $R_n(\tau, t)$ accounting for an error due to the replacement of $u$ with $u^*$. An estimate in the appendix shows that

\[ |R_n(\tau, t)| \le C_n\, U(t)^n\, \frac{\tau U(t)}{L} \left(\frac{t_c}{t}\right)^{1/4}. \tag{41} \]

The computation of the first term on the right-hand side of (40) is very close to the computation performed in previous sections and leads to

\[ T_n(\tau, t) \sim (-1)^n \frac{\Gamma\big(\frac{n+3}{4}\big)}{2\sqrt{\pi}}\, U^n(t)\, \frac{U(t)\tau}{L} + R_n(\tau, t) \sim (-1)^n \frac{\Gamma\big(\frac{n+3}{4}\big)}{2\sqrt{\pi}}\, U^n(t)\, \frac{U(t)\tau}{L}, \qquad n = 2, 3, \dots. \tag{42} \]

We therefore conclude that the time-like structure functions exhibit in Burgers turbulence the extreme anomalous scaling in $\tau$ given by $T_n(\tau, t) \sim \tau$, $n = 2, 3, \dots$. Comparing this with the expression (32) for the space-like structure functions, we see that

\[ S_n(y, t) = T_n(\tau, t), \qquad n = 2, 3, \dots \tag{43} \]

at $y = C(n) U(t) \tau$, given that $y \ll L(t)$ and $\tau \ll t$. The identity (43) means that the "isotropic" Taylor conjecture, stating the equivalence of the space-like and time-like statistics in isotropic turbulence at small scales, becomes a theorem for our model of


Burgers turbulence. A similar observation was also independently made in [8] in the context of Burgers turbulence generated by correlated Gaussian initial conditions. Let us finally note that if one wishes to compare $S_n(y,t)$ with $T_n(\tau,t)$ at arbitrarily high orders $n$, the condition of applicability of relation (43) has to be changed to $y \ll L_n(t)$, where $L_n(t)$ is the correlation length associated with the $n$th order structure function introduced in Sect. 3.7. For $n \gg 1$, $L_n(t) \sim L(t)/n^{3/4}$, see (46) below.

3.6. One-shock approximation. We wish to show that all of the results obtained in the previous sections can be easily obtained from heuristic arguments, given the knowledge of the probability density of a velocity jump at a shock. In our case the latter is easy to compute: a simple computation which uses the knowledge of the limiting velocity profile (4) and the density $\rho(p,q)$ gives

\[ \rho(\mu) \equiv \Big\langle \delta\Big(\mu - \sqrt{2(P-Q)/t}\Big) \Big\rangle = \frac{4}{\sqrt{\pi}\, U(t)}\, \bar\mu\, e^{-\bar\mu^4}, \tag{44} \]

where $\mu$ is the velocity jump at the (right) shock and $\bar\mu = \mu/U(t)$. The probability density of the velocity jump at the left shock has exactly the same form, so we will be referring to (44) as the probability density of the velocity jump at a shock.

Now let us assume: firstly, that the large-$t$ statistics of $u$ are approximated by those of $u^*$; secondly, that a one-shock approximation is valid, i.e. that one can disregard in the analysis the contributions coming from configurations with shocks separated by distances much less than the average separation $L(t)$. To derive $P(u,y,t)$ for $u < 0$, $y \ll L(t)$ using these assumptions, note that $u(y,t) - u(0,t)$ can be negative only if there is a shock at some point in $[0,y]$. If the right-hand shock lies at $x \in [0,y]$ then $u(y,t) - u(0,t) = -\mu + x/t$. A similar formula holds if the left-hand shock lies in $[0,y]$. So neglecting the contribution from the configurations with 2 shocks inside the interval $[0,y]$, we see that

\[ \mathrm{Prob}\big(u(y,t) - u(0,t) < u\big) \approx 2 \int_0^y \frac{dx}{2L}\, \mathrm{Prob}\Big(\text{Size of Jump} > \frac{x}{t} - u\Big). \]

This can be easily computed using the density of the shock jump (44), giving

\[ \mathrm{Prob}\big(u(y,t) - u(0,t) < u\big) \approx \frac{2L(t)}{L} \left[ \int_{\bar u^2}^{(\bar y - \bar u)^2} \frac{d\alpha}{\sqrt{\pi}}\, e^{-\alpha^2} \big( \sqrt{\alpha} + \bar u \big) + \bar y \int_{(\bar y - \bar u)^2}^{\infty} \frac{d\alpha}{\sqrt{\pi}}\, e^{-\alpha^2} \right], \tag{45} \]

which coincides with the exact answer (34). With the knowledge of the PDF of velocity differences we can compute velocity structure functions, thus moments of velocities, thus the PDF of velocities. In other words all of the results of the previous sections concerning single time statistics of the velocity field can be obtained using a one-shock approximation. Moreover, the $\tau$-dependence of the time-like structure functions (42) is also entirely due to the one-shock effects: if $n = 2, 3, \dots$ and $\tau \ll t$, then the main contribution to $T_n(\tau, t)$ comes from the configurations with a shock passing through $x = 0$ between


the moments of time $t$ and $t + \tau$. A shock with velocity jump $\mu$ travels a distance of approximately $\mu\tau/2$ over the interval $[t, t+\tau]$. Therefore,

\[ T_n(\tau) \approx \Big\langle (-\mu)^n\, \chi\big(\text{Shock passed through 0 during } [t, t+\tau]\big) \Big\rangle \approx \Big\langle (-\mu)^n\, \frac{\mu\tau}{2L} \Big\rangle. \]

Computing this average using the PDF of shock strength (44) we arrive exactly at (42), which again shows that the one-shock approximation is asymptotically exact. These calculations support the following statement about decaying Burgers turbulence: all one needs to know in order to describe the statistics of the velocity field at scales much less than the average distance between shocks is the one-point PDF of shock velocity and strength (or just shock strength if the correlation functions which we are trying to compute are Galilean-invariant). Thus the problem is much simpler than one might have thought: recall for example that exact formulae expressing velocity correlation functions in terms of the statistics of shocks are such ([11, 23]) that one seemingly needs to know the $n$-point joint PDF of shock strengths in order to compute the $n$th order correlation function. A rigorous proof of the above statement, together with estimates on the errors of the one-shock approximation, will make DBT analytically tractable for a wide class of initial conditions, as a great deal is known about the one-point function of shock strength, see e.g. [11, 23, 32, 3, 4].

Is there a universal technique for the computation of the one-point PDF of the shock strength? It has been known since Burgers [11], but never really exploited, that shocks behave (almost) as a system of sticky particles. One might try therefore to extract the information about the one-point PDF of shock characteristics by studying the kinetics of this system, for example by analyzing the Smoluchowski-Bogoliubov chain of equations for the one-point, two-point, ... PDFs of shocks.
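The one-shock bookkeeping can be made concrete with a few lines of quadrature. The sketch below assumes only the jump density (44); the function names are hypothetical. It checks that (44) is normalized, that its moments $\langle \bar\mu^n \rangle = \Gamma((n+2)/4)/\sqrt{\pi}$ reproduce the coefficient of (32), and that $\langle \bar\mu^{n+1} \rangle / 2$ reproduces the coefficient of (42). Equating the two leading terms then gives one possible explicit form of the constant $C(n)$ in (43), which the text leaves unspecified.

```python
import math


def shock_jump_density(mu_bar):
    """Density (44) in the rescaled variable mu_bar = mu / U(t)."""
    return 4.0 / math.sqrt(math.pi) * mu_bar * math.exp(-mu_bar ** 4)


def jump_moment_numeric(n, cutoff=4.0, steps=4000):
    """<mu_bar^n> by composite Simpson quadrature on [0, cutoff]."""
    h = cutoff / steps
    f = lambda x: x ** n * shock_jump_density(x)
    total = f(0.0) + f(cutoff)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def jump_moment_exact(n):
    """Closed form of <mu_bar^n>: Gamma((n + 2) / 4) / sqrt(pi)."""
    return math.gamma((n + 2) / 4.0) / math.sqrt(math.pi)


def spacelike_coefficient(n):
    """Coefficient of (L(t)/L) U^n y_bar in (32), up to the sign (-1)^n."""
    return jump_moment_exact(n)


def timelike_coefficient(n):
    """Coefficient of U^n (U tau / L) in (42), up to the sign (-1)^n."""
    return 0.5 * jump_moment_exact(n + 1)


def taylor_constant(n):
    """C(n) in (43), obtained here by equating (32) and (42) (an inference)."""
    return timelike_coefficient(n) / spacelike_coefficient(n)


for n in (2, 3, 4, 8):
    print(n, jump_moment_numeric(n), jump_moment_exact(n), taylor_constant(n))
```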

3.7. On multiscaling in Burgers turbulence. In statistical physics the term "multiscaling", instead of "anomalous scaling", is used to stress the inherently multiscale nature of a system exhibiting anomalous scaling of correlation functions. Burgers turbulence is no exception. In this section we will show that the crossover between the tails (37) and (38) of the PDF for velocity differences is actually a reflection of the presence of many correlation lengths in the problem, which in turn is a consequence of the anomalous scaling of correlation functions and, ultimately, the intermittency of the velocity field in Burgers turbulence. Let $n \gg 1$ be a large even positive integer. We know from (32) that as $\bar y$ approaches zero,

\[ S_n(L(t)\bar y, t) \approx \frac{1}{\sqrt{\pi}}\, \Gamma\Big(\frac{n+2}{4}\Big) \frac{L(t)}{L}\, U(t)^n\, \bar y. \]


For large $\bar y$ however one expects the quantities $u(L(t)\bar y, t)$ and $u(0,t)$ to become independent. When this happens we have

\[ S_n(L(t)\bar y, t) = \big\langle \big(u(L(t)\bar y, t) - u(0,t)\big)^n \big\rangle \sim \big\langle u^n(L(t)\bar y, t) \big\rangle + \big\langle u^n(0,t) \big\rangle \sim 2M_n = 2\, \frac{\Gamma\big(\frac{n+3}{4}\big)}{\sqrt{\pi}\,(n+1)}\, \frac{L(t)}{L}\, U(t)^n. \]

Here the cross terms in expanding the $n$th power are, using the independence, of order $O(L^{-2})$. The region in between these two formulae for large and small $\bar y$ marks the correlation length for the $n$th moments. If we assume there is a simple crossover then we can locate the scale at which it occurs by equating the expressions for large $\bar y$ and small $\bar y$. These become equal, i.e. $S_n(L(t)\bar y, t) \approx 2M_n$, at the value $\bar y \sim n^{-3/4}$, and so the correlation length for the $n$th structure function is

\[ L_n \sim \frac{L(t)}{n^{3/4}}, \qquad n \gg 1, \tag{46} \]

and this shows the presence of many scales in our problem.

To show how this multiscaling is related to the crossover between the asymptotic regimes (37) and (38) we shall use the PDF for velocity differences to compute $S_n(y,t)$ for $n$ positive and large. Writing $S_n(L(t)\bar y, t)$ as an integral against the PDF of $\Delta u(L(t)\bar y, t)$ and treating $n$ as a large parameter, we see that the integral is dominated by values of $|\bar u|$ coming from the neighbourhood of the negative critical point of the function $F(\bar u) = |\bar u|^n \exp(-|\bar u|^4)$, namely near $\bar u_c = -n^{1/4}$. Note that this value is much less than $-1$ for $n \gg 1$, and so we may neglect the part of the integral that uses the PDF in the form (36) and also neglect positive values of $\Delta u(y)$. Now, if in addition $|\bar u_c| \ll \bar y^{-1/3}$, we have to use the asymptotics (37) to evaluate the contribution from the critical point, which yields $S_n(L(t)\bar y, t) \approx C \bar y$. If $|\bar u_c| \gg \bar y^{-1/3}$ we have to use the asymptotics (38) in our computations, which gives $S_n(L(t)\bar y, t) \approx \mathrm{Constant}$. The crossover between these two answers corresponds to the crossover between the asymptotics (37) and (38) and occurs when $\bar y = |\bar u_c|^{-3} = n^{-3/4}$, exactly as in our computed correlation length $L_n$ for the $n$th structure function.

It remains to remark that multiscaling, and consequently a PDF for velocity differences which has a crossover between a regime scaling like $y$ and one that is independent of $y$, should be a general feature of DBT regardless of the initial distribution. All related questions concerning other statistics can be studied in more general situations, if one assumes a one-shock approximation is valid, by using the information about the tails of the one-point PDF of shock strength obtained in [32, 3, 4]. It is worth noting that the presence of the multitude of correlation lengths in Burgers turbulence was understood long ago by Robert Kraichnan [25], and rediscovered within the instanton approach to forced Burgers turbulence [12].
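The crossover scale behind (46) can be located explicitly by equating the small-$\bar y$ form of $S_n$ with $2M_n$. The sketch below does exactly that under the simple-crossover assumption stated in the text; the function name is hypothetical, and `math.lgamma` is used only to avoid overflow of the Gamma function at large $n$. By Stirling's formula $\Gamma(x + 1/4)/\Gamma(x) \approx x^{1/4}$, so the product $\bar y_c\, n^{3/4}$ should tend to the $n$-independent constant $\sqrt{2}$.

```python
import math


def crossover_scale(n):
    """Value of y_bar at which Gamma((n+2)/4) * y_bar / sqrt(pi) equals
    2 * Gamma((n+3)/4) / (sqrt(pi) * (n + 1)), i.e. S_n(small y) = 2 M_n."""
    log_ratio = math.lgamma((n + 3) / 4.0) - math.lgamma((n + 2) / 4.0)
    return 2.0 * math.exp(log_ratio) / (n + 1)


for n in (10, 100, 1000, 10 ** 6):
    y_c = crossover_scale(n)
    # The last column should level off as n grows, reflecting L_n ~ L(t) / n^(3/4).
    print(n, y_c, y_c * n ** 0.75)
```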
It is also worth stressing that in models of chaotic systems which do not account for the effects of intermittency, there is always a single universal correlation length. A good example is provided by random matrix models, see [26] for a review. Finally, let us remark that if we define the integral scale as the scale of scaling behaviour of correlation functions, we must immediately conclude that there is no such unique scale; there is rather a family of them parameterized by the order of the correlation


function. In other words, the notion of the integral scale becomes local, and the notion of the universal inertial range disappears. (See also [14] for a general discussion of the multitude of dissipative scales based on multifractal models.) This should be a general feature of all intermittent turbulent systems, for instance Navier–Stokes turbulence.

Acknowledgements. We are grateful to E. Balkovski, D. Elworthy, G. Falkovich, U. Frisch$^2$, J. Gibbon, K. Khanin, S. Kuksin, S. Nazarenko, A. Newell and C. Vassilicos for most illuminating discussions. We are most grateful for the hospitality of the Department of Complex Systems of the Weizmann Institute of Science, where part of this work was carried out. The financial support through the research grant MA1117 from the University of Warwick is also greatly appreciated.

Note added in proof. We are grateful to the referee of our paper, who drew our attention to a recent preprint by L. Frachebourg and Ph. A. Martin [13], in which the study of the model of decaying Burgers turbulence initiated by white noise initial conditions (without the compactness assumption) has been effectively completed. This model was originally considered by Burgers himself about forty years ago, but the complexity of the analysis prevented him from obtaining explicit answers for anything but the two- and three-point correlation functions of the velocity field. Now most of the questions about the statistics of the velocity field in Burgers' model can be effectively resolved using the integral representation of the Green's function of a diffusion equation in the $(x,t)$-domain with parabolic boundary derived in the above mentioned paper.

4. Appendix

In order to bound the various error terms in Sect. 3 we will need to bound the size of the true solution $u$, the asymptotic solution $u^*$ and the size of their supports (i.e. the intervals on which they are non-zero). We use details from the method of construction of the vanishing viscosity solution as described in [20] and recalled in Sect. 2. Suppose that the initial velocity profile is supported in the interval $(x_0 - l, x_0 + l)$. The rightmost (respectively leftmost) parabola in the chain of parabolic arcs built on the initial potential will always lie to the left (respectively right) of the parabola with the same curvature that passes through the point $(x_0 + l, -Q)$ (respectively $(x_0 - l, -Q)$) and assumes a minimal value equal to $-P$ (respectively 0). This immediately implies that both $u$ and $u^*$ are supported in the interval $[y_*, y^*]$, where

\[ y_* = x_0 - l - \sqrt{-2tQ}, \qquad y^* = x_0 + l + \sqrt{2t(P-Q)}. \tag{47} \]

Using the fact that both $u$ and $u^*$ vanish at the point within $[x_0 - l, x_0 + l]$ at which $q(x)$ achieves its global minimum, we also find that $|u|$ and $|u^*|$ are bounded by $u_{\max}$, where

\[ u_{\max} = \max\{ (y^* - (x_0 - l))/t,\; ((x_0 + l) - y_*)/t \} = \max\Big\{ \big(2l + \sqrt{2t(P-Q)}\big)/t,\; \big(2l + \sqrt{-2tQ}\big)/t \Big\}. \tag{48} \]
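The bounds (47) and (48) are simple enough to evaluate directly. The following sketch transcribes them; the function names and the sample values of $x_0$, $l$, $t$, $P$, $Q$ are illustrative assumptions. It also confirms that the two expressions for $u_{\max}$ in (48) agree.

```python
import math


def support_bounds(x0, l, t, P, Q):
    """Endpoints (y_lower, y_upper) of the interval [y_*, y^*] from (47).

    Requires Q <= min(0, P), as for the joint density (6).
    """
    if Q > min(0.0, P):
        raise ValueError("expected Q <= min(0, P)")
    y_lower = x0 - l - math.sqrt(-2.0 * t * Q)
    y_upper = x0 + l + math.sqrt(2.0 * t * (P - Q))
    return y_lower, y_upper


def velocity_bound(x0, l, t, P, Q):
    """Bound u_max on |u| and |u*| from the first expression in (48)."""
    y_lower, y_upper = support_bounds(x0, l, t, P, Q)
    return max((y_upper - (x0 - l)) / t, ((x0 + l) - y_lower) / t)


x0, l, t, P, Q = 0.0, 1.0, 4.0, 1.0, -2.0     # illustrative values only
print(support_bounds(x0, l, t, P, Q), velocity_bound(x0, l, t, P, Q))
```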

Estimates (47), (48) and the bound (7) will be used to estimate all relevant error terms in Sect. 3. The careful analysis of these error terms leads to a better understanding of when the asymptotics for various statistics start to hold.

$^2$ Who asked the very useful, perhaps rhetorical, question "Why study white noise Burgers turbulence at all?"


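4.1. Proof of the estimate (18). Applying the above estimates to the error term (13) we obtain

\[
\begin{aligned}
|R_n(t)| &= \big\langle |u^n(0,t) - u^{*n}(0,t)|\, \theta(T^* - t) \big\rangle \\
&= \big\langle |u^n(0,t) - u^{*n}(0,t)|\, \chi_{[y_*, y^*]}(0)\, \theta(T^* - t) \big\rangle \\
&\le 2 \big\langle u_{\max}^n\, \chi_{[y_*, y^*]}(0)\, \theta(T^* - t) \big\rangle \\
&\le \frac{2}{L} \big\langle (y^* - y_*)\, u_{\max}^n\, \theta(T^* - t) \big\rangle \quad \text{(averaging over } x_0\text{)} \\
&\le \frac{2}{L} \big\langle (y^* - y_*)^2\, u_{\max}^{2n} \big\rangle^{1/2} \big\langle \theta(T^* - t) \big\rangle^{1/2} \quad \text{(Cauchy–Schwarz)} \\
&\le C_n\, \frac{L(t)}{L}\, U^n(t) \left(\frac{t_c}{t}\right)^{1/4},
\end{aligned}
\]

where the last inequality uses the estimate (7) and an explicit calculation using $\rho(p,q)$. Comparing the first and the last entries of the presented chain of inequalities we obtain a proof of (18).

4.2. Proof of the estimate on $r_n(t)$ from Sect. 3.1. We may bound $r_n(t)$ as follows:

\[
\begin{aligned}
|r_n(t)| &\le \int_{-l}^{l} dx^* \int dp\,dq\, \rho(p,q,x^*) \left( \int_{-L+x^*}^{-L} + \int_{L}^{L+x^*} \right) \frac{dx_0}{2L}\, |U(x_0, 0, t, p, q)|^n \\
&\le \int dp\,dq\, \rho(p,q) \left( \int_{-L-l}^{-L+l} + \int_{L-l}^{L+l} \right) \frac{dx_0}{2L}\, |U(x_0, 0, t, p, q)|^n \\
&\le \frac{2l}{L} \left(\frac{L+l}{L}\right)^n \int dp\,dq\, \rho(p,q)\, \theta\big(\sqrt{-2Qt} - (L - l)\big)\, U^n(t) \\
&\le \frac{2l}{L} \left(\frac{L+l}{L}\right)^n \left(\frac{L(t)}{L-l}\right)^2 \exp\left( -\left(\frac{L-l}{L(t)}\right)^4 \right) U^n(t),
\end{aligned}
\]

where the last inequality follows by an explicit calculation using $\rho(p,q)$. This is exponentially small in $L$ and so does not affect the asymptotics, which take the limit $L \to \infty$ first and preserve only the $O(L^{-1})$ terms. A similar argument controls error terms of this form for the other statistics considered.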

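4.3. Proof of the estimate (26). The proof of (26) is similar to that of (18):
\[
\begin{aligned}
|R(u, t)| &\le \langle |\theta(u(0,t) - u) - \theta(u_*(0,t) - u)|\, \chi_{[y_*, y^*]}(0)\, \theta(T^* - t) \rangle \\
&\le 2 \langle \chi_{[y_*, y^*]}(0)\, \theta(T^* - t) \rangle \\
&\le \frac{1}{L} \langle (y^* - y_*)\, \theta(T^* - t) \rangle \\
&\le \frac{1}{L} \langle (y^* - y_*)^2 \rangle^{1/2} \langle \theta(T^* - t) \rangle^{1/2} \\
&\le C\, \frac{L(t)}{L} \Big( \frac{t_c}{t} \Big)^{1/4}.
\end{aligned}
\]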


4.4. Proof of the estimate (31). We can split this error term into two via
\[
|R_n(y, t)| \le 2^n \langle \Delta u(y,t)^n\, \theta(T^* - t) \rangle + 2^n \langle \Delta u_*(y,t)^n\, \theta(T^* - t) \rangle, \tag{49}
\]

where $\Delta u(y) = u(y/2, t) - u(-y/2, t)$. We show how to bound the first of these terms, the other being entirely similar. The vanishing viscosity solution $u$ takes the form, within its support, of a line with slope $1/t$ plus a series of downward jumps. So we may define $F(x, t)$ to be a non-increasing piecewise constant function so that, for $y$ in the support of $u$,
\[
u(y, t) = \frac{y - x_0}{t} + F(y - x_0, t).
\]
It is easy to see that $|\Delta F(y, t)| = |F(y/2, t) - F(-y/2, t)| \le 2u_{\max}$. Also $|\Delta u(y, t)| \le |y/t| + |\Delta F(y - x_0, t)|$ whenever one of the points $y/2$ or $-y/2$ is in the support of $u$. So we bound the first term on the right-hand side of (49) by
\[
\begin{aligned}
&\langle (|y/t| + |\Delta F(y - x_0)|)^n\, \chi_{[y_* - (y/2),\, y^* + (y/2)]}(0)\, \theta(T^* - t) \rangle \\
&\qquad \le 2^n \langle |\Delta F(y - x_0)|^n\, \theta(T^* - t) \rangle + 2^n \langle |y/t|^n\, \chi_{[y_* - (y/2),\, y^* + (y/2)]}(0)\, \theta(T^* - t) \rangle.
\end{aligned} \tag{50}
\]

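The first term on the right-hand side of (50) can be bounded by averaging over $x_0$ first and using
\[
\int_{-L}^{L} \frac{dx_0}{2L}\, |\Delta F(y - x_0, t)|^n \le (2u_{\max})^{n-1} \int_{-L}^{L} \frac{dx_0}{2L}\, |\Delta F(y - x_0, t)| \le (2u_{\max})^{n-1}\, \frac{y\, u_{\max}}{L},
\]
using in the last inequality the fact that $F$ is decreasing and bounded by $2u_{\max}$. Substituting into (50) one can take the further averaging as for previous error bounds. By taking $t$ large enough that $L(t) \ge y$ and combining the various terms one arrives at the desired error bound.

4.5. The proof of the estimate (35). The proof of this estimate is similar to that of (31). Noting that $\Delta u(y, t) = \Delta F(y - x_0, t) + (y/t)$ we may write
\[
\begin{aligned}
\int \frac{dx_0}{2L}\, \theta(u - \Delta u(y, t)) &= \int \frac{dx_0}{2L}\, \theta\big( |\Delta F(y - x_0)| - (y/t) - |u| \big) \\
&\le \int \frac{dx_0}{2L}\, \frac{|\Delta F(y - x_0)|}{(y/t) + |u|} \\
&\le \frac{2y\, u_{\max}}{2L}\, \frac{1}{(y/t) + |u|}.
\end{aligned}
\]
A similar estimate holds for $\Delta u_*(y, t)$. Hence
\[
\begin{aligned}
|R(u, y, t)| &\le \big\langle \big[ \theta(u - \Delta u(y, t)) + \theta(u - \Delta u_*(y, t)) \big]\, \theta(T^* - t) \big\rangle \\
&\le \frac{2y}{L}\, \frac{1}{(y/t) + |u|}\, \langle u_{\max}\, \theta(T^* - t) \rangle \\
&\le \frac{L(t)}{L}\, \frac{\bar{y}}{\bar{y} + |\bar{u}|} \Big( \frac{t_c}{t} \Big)^{1/4}.
\end{aligned}
\]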


4.6. Proof of the estimate (41). The proof of this estimate is similar to that of (35). The key change is to obtain a bound for
\[
\int \frac{dx_0}{2L}\, |F(y - x_0, t) - F(y - x_0, t + \tau)|. \tag{51}
\]
The piecewise constant profile $F(y, t)$ consists of a series of shocks which may travel forwards or backwards but move with a maximum speed $u_{\max}$. The total height of the shocks is also bounded by $u_{\max}$. So the integral (51) can be bounded by $u_{\max}^2 \tau / 2L$. The possibility of infinitely many shocks, or the merging of shocks between times $t$ and $t + \tau$, does not affect this upper bound.

4.7. The proof of the estimate (7). The construction of the two shock profile uses two parabolas that pass through the graph of the Brownian motion $-q(x) = -\int_{-\infty}^{x} u_0(z)\,dz$ at its point of maximum. Below is a lemma about the behavior of a Brownian path near its maximum.

Lemma 1. Let $(B_t : 0 \le t \le 1)$ be a standard Brownian motion started at zero. Define
\[
M = \sup_{t \in [0,1]} B_t, \qquad \Sigma = \inf\{t : B_t = M\}.
\]
We consider the pieces of the path $(B_t)$ either side of its maximum by defining
\[
X_t = M - B_{\Sigma - t} \ \text{ for } t \in [0, \Sigma], \qquad \bar{X}_t = M - B_{\Sigma + t} \ \text{ for } t \in [0, 1 - \Sigma].
\]
Define the slopes of two lines that pass through the maximum and lie above the path by
\[
\Theta = \inf\{X_t / t : 0 < t \le \Sigma\}, \qquad \bar{\Theta} = \inf\{\bar{X}_t / t : 0 < t \le 1 - \Sigma\}.
\]
a) The triples $(M, \Sigma, (X_t : t \le \Sigma))$ and $(M - B_1, 1 - \Sigma, (\bar{X}_t : t \le 1 - \Sigma))$ are identically distributed.
b) The law of $(M, \Sigma)$ is given by
\[
P(M \in dm, \Sigma \in d\sigma) = \frac{m \sigma^{-1} \exp(-m^2 / 2\sigma)}{\pi (\sigma(1 - \sigma))^{1/2}}\, dm\, d\sigma.
\]
c) Conditional on $M \in dm$, $\Sigma \in d\sigma$ the path $(X_t : t \le \sigma)$ satisfies $X_0 = 0$ and solves the stochastic differential equation, driven by a Brownian motion $(W_t)$,
\[
dX_t = f(t, X_t)\,dt + dW_t, \quad \text{where} \quad f(t, x) = \frac{m - x}{\sigma - t} + \frac{2m}{\sigma - t} \Big( \exp\Big( \frac{2mx}{\sigma - t} \Big) - 1 \Big)^{-1}. \tag{52}
\]
d) For $(X_t)$ that solves (52) we have the estimate
\[
P(\Theta \le \theta) \le C\theta (m + \sigma m^{-1}) + I(m \le \theta\sigma).
\]
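As a sanity check on the formula in part b), the following sketch integrates the stated joint density numerically and confirms that its total mass is close to one. The substitution $m = \sqrt{\sigma}\,x$, $\sigma = \sin^2\varphi$ and the grid sizes are choices made here for illustration only; they remove the endpoint singularities of the arcsine-type factor and are not taken from the text.

```python
import math

def joint_density(m, s):
    """Density of (M, Sigma) as stated in part b), for m > 0 and 0 < s < 1."""
    return (m / s) * math.exp(-m * m / (2.0 * s)) / (math.pi * math.sqrt(s * (1.0 - s)))

def total_mass(nx=400, nphi=200, x_max=10.0):
    """Midpoint rule in the variables (x, phi), with m = sqrt(s) * x, s = sin(phi)**2."""
    hx = x_max / nx
    hphi = (math.pi / 2.0) / nphi
    total = 0.0
    for j in range(nphi):
        phi = (j + 0.5) * hphi
        s = math.sin(phi) ** 2
        ds_dphi = 2.0 * math.sin(phi) * math.cos(phi)
        jac = math.sqrt(s) * ds_dphi   # Jacobian of (x, phi) -> (m, s)
        for i in range(nx):
            x = (i + 0.5) * hx
            m = math.sqrt(s) * x
            total += joint_density(m, s) * jac * hx * hphi
    return total
```

In these variables the integrand reduces to $(2/\pi)\, x e^{-x^2/2}$, which reflects the fact that the marginal law of $\Sigma$ is the arcsine law.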


We delay the proof of this lemma until the end of this appendix and first use it to prove the estimate (7) on the tail $P(T^* \ge t)$ of the time $T^*$ at which the two shock profile is obtained. The construction of the two shock profile uses the function $q(x) = \int_{-\infty}^{x} u_0(z)\,dz$, its global minimum $Q$ and the position $x_0 + x^*$ at which the minimum is attained. Two parabolas of the form $\pi(z) = (z - x)^2 / 2t$ (and $\bar{\pi}(z) = P + (z - \bar{x})^2 / 2t$) are constructed to pass through the point $(x_0 + x^*, -Q)$. The slopes of the parabolas at the point $x_0 + x^*$ are $(-2Q/t)^{1/2}$ (respectively $(2(P - Q)/t)^{1/2}$). Let $T$ (respectively $\bar{T}$) be the smallest time $t$ at which the parabola $\pi$ (respectively $\bar{\pi}$) lies above the graph of $-q(x)$. Then the two shock profile is attained for times $t \ge T^* = \max\{T, \bar{T}\}$.

To apply the lemma we must rescale to obtain a standard Brownian path of length one. Set $B_t = -(2lJ)^{-1/2} q(x_0 - l + 2lt)$ for $t \in [0, 1]$. Then $(B_t)$ is a standard Brownian motion and its maximum $M$ takes the value $-Q/(2lJ)^{1/2}$. The construction of the parabola $\pi$ (respectively $\bar{\pi}$) shows that if $\Theta \le (-4lQ/tJ)^{1/2}$ then $t \le T$ (respectively if $\bar{\Theta} \le (4l(P - Q)/tJ)^{1/2}$ then $t \le \bar{T}$). Part a) of the lemma shows that both of these events have the same probability. So, applying part d) of the lemma,
\[
\begin{aligned}
P(T^* \ge t) &\le 2P(\Theta \le (-4lQ/tJ)^{1/2}) \\
&= 2P(\Theta \le 2^{5/4} M^{1/2} t^{-1/2} l^{3/4} J^{-1/4}) \\
&\le C t^{-1/2} l^{3/4} J^{-1/4}\, E\big( M^{1/2} (M + \Sigma M^{-1}) \big) + 2P\big( M \le 2^{5/4} M^{1/2} t^{-1/2} l^{3/4} J^{-1/4}\, \Sigma \big) \\
&\le C t^{-1/2} l^{3/4} J^{-1/4},
\end{aligned}
\]
using Markov’s inequality in the last inequality and the exact distribution of $(M, \Sigma)$ in part b) of the lemma. This completes the proof of (7) and it remains to describe the proof of the lemma.

Part a) of the lemma follows from the symmetry of the problem with respect to the time reversal $t \to 1 - t$. The distribution of $(M, \Sigma)$ is well known and may be obtained for example by exploiting the reflection principle. Conditional on $M \in dm$, $\Sigma \in d\sigma$ the path $(X_t)$ becomes a Brownian bridge, taking the value zero at time zero and the value $m$ at time $\sigma$, that is conditioned to never take negative values. The equation describing the evolution can then be obtained using an h-transform as in Rogers and Williams [30] Sect. 4.23.

We first sketch the idea for estimating $P(\Theta \le \theta) = P(X_s < \theta s \text{ for some } s \le \sigma)$. The drift $f(t, x)$ in Eq. (52) is approximately $1/x$ for small $t$ and $x$. If this approximation were exact the process $(X_t)$ would satisfy $dX = X^{-1}dt + dW$, which is uniquely solved by the three dimensional Bessel process (the radius of a three dimensional Brownian motion). For a Bessel process one can make use of time inversion via the identity in distribution $(X_t : t > 0) = (tX_{1/t} : t > 0)$ and potential theory for three dimensional Brownian motion, which gives
\[
P(X_s < \theta \text{ for some } s \ge 0 \mid X_0 = x) = \min\{\theta x^{-1}, 1\}.
\]


Then
\[
\begin{aligned}
P(X_s < \theta s \text{ for some } s \le \sigma) &= P(X_s < \theta \text{ for some } s \ge 1/\sigma) \\
&= E(\min\{\theta X_{1/\sigma}^{-1}, 1\}) \\
&= E(\min\{\theta \sigma^{1/2} X_1^{-1}, 1\}) \\
&= \int_0^\infty (2\pi)^{-3/2} r^2 \exp(-r^2/2) \min\{\theta \sigma^{1/2} r^{-1}, 1\}\, dr \\
&\le C\theta \sigma^{1/2},
\end{aligned}
\]
where the penultimate equality follows from Brownian scaling and the final equality from a calculation using the density of the Gaussian variable $X_1$. To exploit this idea we divide the interval $[0, \sigma]$ into two parts, over the first of which the approximation $f(t, x) \approx 1/x$ is sufficiently good.

We first estimate $P(X_s < \theta s \text{ for some } s \le \sigma/2)$. Using the elementary inequalities $(1 - z)/2z \le (e^{2z} - 1)^{-1} \le 1/2z$ for all $z > 0$ one obtains the bounds $x^{-1} - x(\sigma - t)^{-1} \le f(t, x) \le x^{-1} + 2m\sigma^{-1}$. Hence
\[
X_t^{-1} - 2\sigma^{-1} X_t \le f(t, X_t) \le X_t^{-1} + 2m(\sigma - t)^{-1} \quad \text{for } t \le \sigma/2. \tag{53}
\]

So the solution of the equation $dY_t = Y_t^{-1}dt - 2\sigma^{-1} Y_t\,dt + dW_t$, $Y_0 = 0$, satisfies $Y_t \le X_t$ for all $t \le \tau$. To remove the unwanted $-2\sigma^{-1} Y_t\,dt$ in the drift of $(Y_t)$ we use a change of measure. Define a new probability measure $Q$ by defining the Radon-Nikodym derivative $M$ by
\[
\begin{aligned}
M = \frac{dQ}{dP}\Big|_{\mathcal{F}_{\sigma/2}} &= \exp\Big( 2\sigma^{-1} \int_0^{\sigma/2} Y_s\, dW_s - 2\sigma^{-2} \int_0^{\sigma/2} Y_s^2\, ds \Big) \\
&= \exp\Big( \sigma^{-1} Y_{\sigma/2}^2 + 2\sigma^{-2} \int_0^{\sigma/2} Y_s^2\, ds - 3/2 \Big) \\
&\ge \exp(-3/2).
\end{aligned}
\]
The second equality here follows from Ito’s formula. By Girsanov’s theorem (see [29]) the process $(Y_t)$ solves $dY = Y^{-1}dt + d\tilde{W}$ with respect to some Brownian motion $(\tilde{W})$ under $Q$, implying that $(Y_t)$ is a three dimensional Bessel process under $Q$. Writing $E_Q$ for the expectation under $Q$ we have
\[
\begin{aligned}
P(X_s < \theta s \text{ for some } s \le \sigma/2) &\le P(Y_s < \theta s \text{ for some } s \le \sigma/2) \\
&= E_Q\big( M^{-1} I(Y_s < \theta s \text{ for some } s \le \sigma/2) \big) \\
&\le e^{3/2}\, Q(Y_s < \theta s \text{ for some } s \le \sigma/2) \\
&\le C\theta \sigma^{1/2}
\end{aligned} \tag{54}
\]
using the argument given above.
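The Bessel-process bound used for (54) can be illustrated numerically. The sketch below evaluates $E(\min\{a/|Z|, 1\})$ for a standard Gaussian vector $Z$ in three dimensions, with $a$ playing the role of $\theta\sigma^{1/2}$, and compares it with the linear bound $a\,E(|Z|^{-1}) = a\sqrt{2/\pi}$. The normalized radial density $\sqrt{2/\pi}\, r^2 e^{-r^2/2}$ used in the code is an assumption of this sketch (it is the standard density of $|Z|$), and the function names are hypothetical.

```python
import math

def radial_density(r):
    """Density of |Z| for a standard Gaussian vector Z in three dimensions."""
    return math.sqrt(2.0 / math.pi) * r * r * math.exp(-0.5 * r * r)

def hitting_estimate(a, n=20000, r_max=12.0):
    """Midpoint-rule value of E[min(a / |Z|, 1)], with a = theta * sqrt(sigma)."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += radial_density(r) * min(a / r, 1.0) * h
    return total

def linear_bound(a):
    """a * E[1 / |Z|] = a * sqrt(2 / pi); also capped by 1 since it bounds a probability."""
    return min(1.0, a * math.sqrt(2.0 / math.pi))
```

Under this normalization one admissible value of the constant in the bound $C\theta\sigma^{1/2}$ would be $C = \sqrt{2/\pi}$; the text itself does not specify $C$.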


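It remains to estimate the probability $P(X_s < \theta s \text{ for some } \sigma/2 \le s \le \sigma)$. We shall further condition on the value of $X_{\sigma/2}$. If $X_{\sigma/2} \in dr$ the evolution of $(X_s : s \in [\sigma/2, \sigma])$ is that of a Brownian bridge starting at $r$, ending at $m$ and conditioned to take non-negative values. We write $Q_x$ for the law of a one-dimensional Brownian motion $(W_t)$ started at $x$ and we define $H_a = \inf\{t : W_t \le a\}$. Then, supposing $r, m \ge \theta\sigma$, we have
\[
\begin{aligned}
P(X_s \le \theta s \text{ for some } s \in [\sigma/2, \sigma] \mid X_{\sigma/2} \in dr)
&= 1 - P(X_s > \theta s \text{ for all } s \in [\sigma/2, \sigma] \mid X_{\sigma/2} \in dr) \\
&\le 1 - P(X_s > \theta\sigma \text{ for all } s \in [\sigma/2, \sigma] \mid X_{\sigma/2} \in dr) \\
&= 1 - Q_r(H_{\theta\sigma} > \sigma/2 \mid W_{\sigma/2} \in dm, H_0 > \sigma/2) \\
&= 1 - \frac{Q_r(H_{\theta\sigma} > \sigma/2, W_{\sigma/2} \in dm)}{Q_r(H_0 > \sigma/2, W_{\sigma/2} \in dm)}.
\end{aligned}
\]
The reflection principle can be used to show that, for $a \le r, m$,
\[
Q_r(H_a > t, W_t \in dm) = \big( p_t(m - r) - p_t(m + r - 2a) \big)\, dm, \tag{55}
\]
where $p_t(z) = (2\pi t)^{-1/2} \exp(-z^2/2t)$. Using this we rewrite the last expression as
\[
\begin{aligned}
&\frac{\exp((m + r)^2/\sigma) - \exp((m + r - 2\theta\sigma)^2/\sigma)}{\exp((m + r)^2/\sigma) - \exp((m - r)^2/\sigma)} \\
&\qquad = \Big( 1 - \exp\Big( \frac{-4mr}{\sigma} \Big) \Big)^{-1} 4\theta (m + r - 2\eta) \exp\Big( \frac{(m + r - 2\eta)^2 - (m + r)^2}{\sigma} \Big) \\
&\qquad\qquad \text{for some } \eta \in [0, \theta\sigma] \text{ by the mean value theorem} \\
&\qquad \le \Big( 1 - \exp\Big( \frac{-4mr}{\sigma} \Big) \Big)^{-1} 4\theta (m + r - 2\eta) \\
&\qquad \le C\theta \Big( 1 + \frac{\sigma}{4mr} \Big) (m + r) \quad \text{(using } (1 - e^{-z})^{-1} \le C(1 + z^{-1})\text{)} \\
&\qquad \le C\theta (m + r + \sigma r^{-1} + \sigma m^{-1}).
\end{aligned}
\]
Thus
\[
P(X_s \le \theta s \text{ for some } s \in [\tau, \sigma] \mid X_\tau \in dr) \le C\theta (m + r + \sigma r^{-1} + \sigma m^{-1}) + I(r \le \theta\sigma) + I(m \le \theta\sigma). \tag{56}
\]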
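The conditional probability appearing before (56) can be evaluated directly from the reflection-principle formula (55). The following sketch does this with the Gaussian kernel $p_t$ as defined in the text and compares the result with the intermediate bound $4\theta(m + r)(1 - e^{-4mr/\sigma})^{-1}$. The function names are hypothetical, and the comparison is only claimed under the standing assumption $r, m \ge \theta\sigma$.

```python
import math

def heat_kernel(t, z):
    """p_t(z) = (2 pi t)^(-1/2) exp(-z^2 / 2t), as defined after (55)."""
    return math.exp(-z * z / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def killed_density(t, r, m, a):
    """Right-hand side of (55): start at r, end near m at time t, stay above a."""
    return heat_kernel(t, m - r) - heat_kernel(t, m + r - 2.0 * a)

def crossing_probability(theta, sigma, r, m):
    """1 - Q_r(H_{theta*sigma} > sigma/2 | W_{sigma/2} = m, H_0 > sigma/2)."""
    t = 0.5 * sigma
    return 1.0 - killed_density(t, r, m, theta * sigma) / killed_density(t, r, m, 0.0)

def intermediate_bound(theta, sigma, r, m):
    """4 * theta * (m + r) / (1 - exp(-4 m r / sigma))."""
    return 4.0 * theta * (m + r) / (1.0 - math.exp(-4.0 * m * r / sigma))
```

For example, with $\sigma = 1$, $m = r = 1$ and $\theta = 0.1$ the crossing probability is roughly $0.02$, well below the bound of about $0.8$; the bound is useful only for its linear dependence on $\theta$.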

We now undo the conditioning on $X_{\sigma/2} \in dr$. Using the upper bound in (53) and Ito’s formula one obtains $dX_t^2 \le (3 + 4m\sigma^{-1} X_t)\,dt + 2X_t\,dW_t$. Taking expectations one has
\[
\begin{aligned}
E(X_t^2) &\le 3t + 4m\sigma^{-1} \int_0^t E(X_s)\,ds \\
&\le (3 + m^2 \sigma^{-1})\, t + 4\sigma^{-1} \int_0^t E(X_s^2)\,ds.
\end{aligned}
\]
Applying Gronwall’s inequality shows that $E(X_{\sigma/2}) \le (E(X_{\sigma/2}^2))^{1/2} \le C(\sigma^{1/2} + m)$. By Markov’s inequality $P(X_\tau \le \theta\sigma) \le \theta\sigma\, E(X_\tau^{-1})$. Using the comparison with a Bessel process as before we have $E(X_\tau^{-1}) \le e^{3/2} E_Q(Y_{\sigma/2}^{-1}) \le C\sigma^{-1/2}$. Using these bounds in (56) and combining with (54) leads to the estimate in part d) of the lemma.


References

1. Aurell, E., Frisch, U., Noullez, A., Blank, M.: Bifractality of the Devil’s staircase appearing in the Burgers equation with Brownian initial velocity. chao-dyn/961101, published in J. Stat. Phys. 88, 1151–1164
2. Avellaneda, M., Ryan, R., E, Weinan: PDFs for velocity and velocity gradients in Burgers’ turbulence. Commun. Math. Phys. 172, no. 1, 13–38 (1995)
3. Avellaneda, M., E, Weinan: Statistical properties of shocks in Burgers turbulence. Commun. Math. Phys. 172, no. 1, 13–38 (1995)
4. Avellaneda, M.: Statistical properties of shocks in Burgers turbulence. II. Tail probabilities for velocities, shock-strengths and rarefaction intervals. Commun. Math. Phys. 169, no. 1, 45–59 (1995)
5. Balkovsky, E., Falkovich, G., Kolokolov, I., Lebedev, V.: Intermittency of Burgers’ Turbulence. Phys. Rev. Lett. 78, 1452 (1997)
6. Balkovsky, E., Falkovich, G., Kolokolov, I., Lebedev, V.: Viscous Instanton for Burgers’ Turbulence. Phys. Rev. Lett. 78, 1452 (1997)
7. Balkovski, E. and Falkovich, G.: Private communication
8. Bec, J., Frisch, U.: Pdf’s of Derivatives and Increments for Decaying Burgers Turbulence. cond-mat/9906047
9. Bernard, D. and Gawedzki, K.: Scaling and exotic regimes in the decaying Burgers turbulence. chao-dyn/9805002
10. Borodin, A.N., Salminen, P.: Handbook of Brownian motion – facts and formulae. Probability and its Applications. Basel: Birkhäuser Verlag, 1996
11. Burgers, J.M.: Statistical problems connected with the solution of a simple non-linear partial differential equation. I, II, III. Nederl. Akad. Wetensch. Proc. Ser. B. 57, 403–413, 414–424, 425–433 (1954)
12. Falkovich, G.: Unpublished
13. Frachebourg, L., Martin, Ph.A.: Exact statistical properties of the Burgers equation. cond-mat/9909056
14. Frisch, U., Vergassola, M.: A prediction of the multifractal model: the intermediate dissipation range. In: New approaches and concepts in turbulence (Monte Verità, 1991), Basel: Birkhäuser, 1993, pp. 29–34
15. Frisch, U.: Turbulence. The legacy of A. N. Kolmogorov. Cambridge: Cambridge University Press, 1995
16. Gotoh, T. and Kraichnan, R.: Statistics of decaying Burgers turbulence. Phys. Fluids A 5, (2), (1993)
17. Gurbatov, S.N., Malakhov, A.N., Saichev, A.I.: Nonlinear random waves and turbulence in nondispersive media: Waves, rays, particles. Translated from the Russian. Supplement 1 by Adrian L. Melott and Sergei F. Shandarin. Supplement 2 by V. I. Arnol’d, Yu. M. Baryshnikov and I. A. Bogayevsky. Translation edited and with a preface by D. G. Crighton. Nonlinear Science: Theory and Applications. Manchester: Manchester University Press, 1991
18. Gurbatov, S.N., Simdyankin, S.I., Aurell, E., Frisch, U., Tóth, G.: On the decay of Burgers turbulence. J. Fluid Mech. 344, 339–374 (1997)
19. Gurarie, V., Migdal, A.: Instantons in Burgers Equation. Phys. Rev. E 54, 4908–4914 (1996)
20. Hopf, E.: The partial differential equation $u_t + uu_x = \mu u_{xx}$. Comm. Pure Appl. Math. 3, 201–230 (1950)
21. E, W., Khanin, K., Mazel, A., and Sinai, Ya.G.: Invariant measures for the random forced Burgers equation. Submitted to Ann. Math.
22. E, W., Khanin, K., Mazel, A., and Sinai, Ya.G.: Probability distribution functions for the random forced Burgers equation. Phys. Rev. Letters 78, 1904–1907 (1997)
23. Kida, S.: Asymptotic properties of Burgers turbulence. J. Fluid Mech. 93, no. 2, 337–377 (1979)
24. Kraichnan, R.H.: Note on Forced Burgers Turbulence. chao-dyn/9901023
25. Kraichnan, R.: Unpublished
26. Mehta, M.L.: Random matrices. Second edition. Boston, MA: Academic Press, Inc., 1991
27. Parker, D.F.: The decay of sawtooth solutions to the Burgers equation. Proc. Roy. Soc. London Ser. A 369, no. 1738, 409–424 (1980)
28. Polyakov, A.M.: Turbulence without pressure. Phys. Rev. E (3) 52, no. 6, part A, 6183–6188 (1995)
29. Revuz, D. and Yor, M.: Continuous Martingales and Brownian Motion. Berlin–Heidelberg–New York: Springer Verlag, 1991
30. Rogers, L.C.G. and Williams, D.: Diffusions, Markov Processes, and Martingales. Volume 2. New York: Wiley, 1986
31. Ryan, R., Avellaneda, M.: The one-point statistics of viscous Burgers turbulence initialized with Gaussian data. Commun. Math. Phys. 200, no. 1, 1–23 (1999)
32. Sinai, Ya.G.: Statistics of shocks in solutions of inviscid Burgers equation. Commun. Math. Phys. 148, no. 3, 601–621 (1992)
33. Truman, A., Zhao, H.Z.: On stochastic diffusion equations and stochastic Burgers’ equations. J. Math. Phys. 37, no. 1, 283–307 (1996)
34. Truman, A., Zhao, H.Z.: Stochastic Burgers’ equations and their semi-classical expansions. Commun. Math. Phys. 194, 1, 231–248 (1998)
35. E, W., Vanden Eijnden, E.: Statistical Theory for the Stochastic Burgers Equation in the Inviscid Limit. chao-dyn/9904028

Communicated by Ya. G. Sinai

Commun. Math. Phys. 212, 437 – 467 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Slow Motion of Charges Interacting Through the Maxwell Field

Markus Kunze1, Herbert Spohn2

1 Mathematisches Institut der Universität Köln, Weyertal 86, 50931 Köln, Germany. E-mail: [email protected]
2 Zentrum Mathematik and Physik Department, TU München, 80290 München, Germany. E-mail: [email protected]

Received: 13 January 2000 / Accepted: 4 February 2000

Abstract: We study the Abraham model for $N$ charges interacting with the Maxwell field. On the scale of the charge diameter, $R_\varphi$, the charges are a distance $\varepsilon^{-1} R_\varphi$ apart and have a velocity $\sqrt{\varepsilon}\, c$, with $\varepsilon$ a small dimensionless parameter. We follow the motion of the charges over times of the order $\varepsilon^{-3/2} R_\varphi / c$ and prove that on this time scale their motion is well approximated by the Darwin Lagrangian. The mass is renormalized. The interaction is dominated by the instantaneous Coulomb forces, which are of the order $\varepsilon^2$. The magnetic fields and first order retardation generate the Darwin correction of the order $\varepsilon^3$. Radiation damping would be of the order $\varepsilon^{7/2}$.

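1. Introduction

Classical charges interact through Coulomb forces, as one learns in every course on electromagnetism. Presumably the best realization in nature is a strongly ionized gas, for which the Darwin correction to the Coulomb forces is of importance, since under standard conditions the velocities cannot be considered small as compared to the velocity of light, cf. [7, §65]. Thus, given $N$ charges, with positions $r_\alpha$, velocities $u_\alpha$, charges $e_\alpha$, and masses $m_\alpha$, $\alpha = 1, \ldots, N$, their motion is governed by the Lagrangian
\[
\begin{aligned}
L_D = {} & \sum_{\alpha=1}^{N} \Big( \frac{1}{2} m_\alpha u_\alpha^2 + \frac{1}{8c^2} m_\alpha^* u_\alpha^4 \Big) - \frac{1}{2} \sum_{\substack{\alpha,\beta=1 \\ \alpha \neq \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|} \\
& + \frac{1}{4c^2} \sum_{\substack{\alpha,\beta=1 \\ \alpha \neq \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|} \Big( u_\alpha \cdot u_\beta + |r_\alpha - r_\beta|^{-2} (u_\alpha \cdot [r_\alpha - r_\beta]) (u_\beta \cdot [r_\alpha - r_\beta]) \Big),
\end{aligned} \tag{1.1}
\]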


$c$ denoting the velocity of light. The first term is the kinetic energy with a $u_\alpha^4$-correction of a strength $m_\alpha^*$ depending on the precise model ($m_\alpha^* = m_\alpha$ for a relativistic particle). The second term is the Coulomb potential, whereas the third term is the Darwin potential, which decays as the Coulomb potential and has a velocity dependent strength.

On a more fundamental level, the forces between the charges are mediated through the electromagnetic field. The instantaneous Coulomb–Darwin interaction is a derived concept only. To understand the emergence of such an interaction, in this paper we will investigate the coupled system, charges and Maxwell field, and we will prove that in a certain limit the motion of the charges is well approximated by the Lagrange equations for $L_D$.

Let us first describe how the charges are coupled to the Maxwell field. To avoid short-distance singularities, we assume that the charge is spread out over a distance $R_\varphi$, which physically is of the order of the classical electron radius. Thus charge $\alpha$ has a charge distribution $\rho_\alpha$ which for simplicity we take to be of the form
\[
\rho_\alpha(x) = e_\alpha \varphi(x), \quad x \in \mathbb{R}^3,
\]
where the form factor $\varphi$ satisfies
\[
0 \le \varphi \in C_0^\infty(\mathbb{R}^3), \qquad \varphi(x) = \varphi_r(|x|), \qquad \varphi(x) = 0 \ \text{ for } |x| \ge R_\varphi. \tag{C}
\]

To distinguish the true solution from the approximation (1.1), the position of a charge $\alpha$ in the coupled system is denoted by $q_\alpha$ and its velocity by $v_\alpha$, $\alpha = 1, \ldots, N$. The charges then generate the charge distribution $\rho$ and the current $j$ given by
\[
\rho(x, t) = \sum_{\alpha=1}^{N} \rho_\alpha(x - q_\alpha(t)) \quad \text{and} \quad j(x, t) = \sum_{\alpha=1}^{N} \rho_\alpha(x - q_\alpha(t))\, v_\alpha(t), \tag{1.2}
\]

which satisfy charge conservation by fiat. The Maxwell field, consisting of the electric field $E$ and the magnetic field $B$, evolves according to
\[
c^{-1} \frac{\partial}{\partial t} B(x, t) = -\nabla \wedge E(x, t), \qquad c^{-1} \frac{\partial}{\partial t} E(x, t) = \nabla \wedge B(x, t) - c^{-1} j(x, t) \tag{1.3}
\]
with the constraints
\[
\nabla \cdot E(x, t) = \rho(x, t), \qquad \nabla \cdot B(x, t) = 0. \tag{1.4}
\]
The charges generate the electromagnetic field which in turn determines the forces on the charges through the Lorentz force equation
\[
\frac{d}{dt} \big[ m_{b\alpha} \gamma_\alpha v_\alpha(t) \big] = \int d^3x\, \rho_\alpha(x - q_\alpha(t)) \big[ E(x, t) + v_\alpha(t) \wedge B(x, t) \big], \quad t \in \mathbb{R}, \tag{1.5}
\]
for $\alpha = 1, \ldots, N$. Here $m_{b\alpha}$ is the bare mass of charge $\alpha$ and $\gamma_\alpha$ the relativistic factor $\gamma_\alpha = (1 - v_\alpha^2/c^2)^{-1/2}$, which ensures $|v_\alpha| < c$. Note that there are no direct forces acting between the particles. Equations (1.2)–(1.5) are known as the Abraham model for $N$ charges.


We define the energy function by
\[
\mathcal{H}(E, B, q, v) = \sum_{\alpha=1}^{N} m_{b\alpha} \gamma_\alpha + \frac{1}{2} \int d^3x\, [E^2(x) + B^2(x)], \tag{1.6}
\]
with $q = (q_1, \ldots, q_N)$ and $v = (v_1, \ldots, v_N)$. It then may be seen that the initial value problem corresponding to (1.2)–(1.5) has a unique weak solution of finite energy and that $\mathcal{H}$ is conserved by this solution, compare with [4] for the case of a single particle.

We assume that initially the particles are very far apart on the scale set by $R_\varphi$. Thus we require, for $\alpha \neq \beta$, that
\[
|q_\alpha(0) - q_\beta(0)| \cong \varepsilon^{-1} R_\varphi \tag{1.7}
\]
with $\varepsilon > 0$ small. If particles came as close together as $R_\varphi$, our equations of motion would not be trustworthy anyhow. In addition, we require that the initial velocities be small compared to the speed of light,
\[
|v_\alpha(0)| \cong \sqrt{\varepsilon}\, c. \tag{1.8}
\]
Subject to these restrictions, in essence, the initial electromagnetic field is chosen so as to minimize the energy function $\mathcal{H}$ from (1.6), cf. Sect. 5.1 for precise statements and estimates. With these initial conditions, for the particles to travel a distance of order $\varepsilon^{-1} R_\varphi$ it will take a time of order $\varepsilon^{-3/2} R_\varphi / c$, which will be the time scale of interest. Thus physically we consider slow particles that are far apart, and we want to follow their motion over long times.

Next note that it takes a time of order $\varepsilon^{-1} R_\varphi / c$ for a signal to travel between the particles. This means that on the time scale of interest, retardation effects are small. If particles interact through Coulomb forces, as will have to be proved, the strength of the forces is of order $\varepsilon^2$ since the distance is of order $\varepsilon^{-1} R_\varphi$. Followed over a time span $\varepsilon^{-3/2} R_\varphi / c$, this yields a change in velocity of order $\sqrt{\varepsilon}\, c$. On this basis we expect the orders of magnitude (1.7) and (1.8) to remain valid over times of order $\varepsilon^{-3/2} R_\varphi / c$. There is one subtle point here, however. The self-interaction of a charge with the fields renormalizes its mass. Thus in (1.1) the quantity $m_\alpha$ cannot be the bare mass of the charge; the electromagnetic mass has to be added.
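The order-of-magnitude bookkeeping in the preceding paragraphs can be tracked mechanically by working with exponents of $\varepsilon$. The sketch below does this in units where $R_\varphi = c = 1$; the variable names are hypothetical, and the only inputs are the scalings (1.7) and (1.8).

```python
from fractions import Fraction

# Exponents of epsilon, in units where R_phi = c = 1 (a convention of this sketch).
distance = Fraction(-1)        # |q_alpha - q_beta| ~ eps^(-1), see (1.7)
velocity = Fraction(1, 2)      # |v_alpha| ~ eps^(1/2), see (1.8)

# Time needed to travel a distance of the order of the separation.
travel_time = distance - velocity             # eps^(-3/2)

# Coulomb force ~ 1 / distance^2.
coulomb_force = -2 * distance                 # eps^2

# Velocity change accumulated over the time scale of interest.
velocity_change = coulomb_force + travel_time # eps^(1/2), same order as (1.8)

# Signal travel time between particles, and its size relative to travel_time.
signal_time = distance                        # eps^(-1)
retardation_ratio = signal_time - travel_time # eps^(1/2), small as eps -> 0
```

The velocity change comes out at the same order $\varepsilon^{1/2}$ as the initial velocities, which is the consistency of (1.7) and (1.8) argued in the text, while the ratio of signal time to the time scale of interest is of order $\varepsilon^{1/2}$.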
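In theoretical physics it is common practice to count the post-Coulombian corrections in orders of $v/c$ relative to the motion through pure Coulomb forces. Thus the Darwin term is the first correction and of order $(v/c)^2$. The next correction is of order $(v/c)^3$ and accounts for damping through radiation. If we push the Taylor expansion in Sect. 3 one term further, one obtains
\[
\frac{d}{dt} \frac{\partial L_D}{\partial u_\alpha} = \frac{\partial L_D}{\partial r_\alpha} + \frac{e_\alpha}{6\pi c^3} \sum_{\beta=1}^{N} e_\beta \ddot{v}_\beta, \tag{1.9}
\]
$\alpha = 1, \ldots, N$. The physical solution has to be on the center manifold for (1.9). At the present level of precision it suffices to substitute the Hamiltonian dynamics to lowest order, which yields
\[
\frac{d}{dt} \frac{\partial L_D}{\partial u_\alpha} = \frac{\partial L_D}{\partial r_\alpha} + \frac{e_\alpha}{6\pi c^3}\, \frac{1}{2} \sum_{\substack{\beta,\beta'=1 \\ \beta \neq \beta'}}^{N} \frac{e_\beta e_{\beta'}}{4\pi |r_\beta - r_{\beta'}|^3} \Big( \frac{e_\beta}{m_\beta} - \frac{e_{\beta'}}{m_{\beta'}} \Big) \Big( (u_\beta - u_{\beta'}) - 3\, \frac{(r_\beta - r_{\beta'}) \cdot (u_\beta - u_{\beta'})}{|r_\beta - r_{\beta'}|^2}\, (r_\beta - r_{\beta'}) \Big).
\]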
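The substitution of the lowest-order dynamics can be cross-checked numerically. The sketch below assumes pure Coulomb dynamics, $m_\beta \dot{u}_\beta = \sum_{\beta' \neq \beta} e_\beta e_{\beta'} (r_\beta - r_{\beta'}) / (4\pi |r_\beta - r_{\beta'}|^3)$, and evaluates $\sum_\beta e_\beta \ddot{u}_\beta$ in two ways: directly, and via the symmetrized double sum over pairs. All function names are hypothetical, and the test configuration is arbitrary illustrative data.

```python
import math

FOUR_PI = 4.0 * math.pi

def _kernel(r, u):
    """Time derivative of r / |r|^3 along dr/dt = u."""
    r2 = sum(x * x for x in r)
    ru = sum(x * y for x, y in zip(r, u))
    return [(ui - 3.0 * ru * ri / r2) / r2 ** 1.5 for ri, ui in zip(r, u)]

def _pair_sum(e, m, pos, vel, coefficient):
    """Sum coefficient(b, bp) * kernel over ordered pairs b != bp."""
    out = [0.0, 0.0, 0.0]
    n = len(e)
    for b in range(n):
        for bp in range(n):
            if b == bp:
                continue
            r = [x - y for x, y in zip(pos[b], pos[bp])]
            u = [x - y for x, y in zip(vel[b], vel[bp])]
            c = coefficient(b, bp)
            out = [o + c * k for o, k in zip(out, _kernel(r, u))]
    return out

def dipole_jerk_direct(e, m, pos, vel):
    """sum_b e_b * d/dt (Coulomb force on b / m_b)."""
    return _pair_sum(e, m, pos, vel,
                     lambda b, bp: (e[b] / m[b]) * e[b] * e[bp] / FOUR_PI)

def dipole_jerk_symmetrized(e, m, pos, vel):
    """(1/2) sum over pairs with the factor (e_b/m_b - e_bp/m_bp)."""
    return _pair_sum(e, m, pos, vel,
                     lambda b, bp: 0.5 * e[b] * e[bp] / FOUR_PI
                     * (e[b] / m[b] - e[bp] / m[bp]))
```

The two evaluations agree because the kernel is antisymmetric under exchange of the pair, and both vanish when $e_\beta / m_\beta$ is the same for every charge.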


Note that if the ratio $e_\alpha / m_\alpha$ does not depend on $\alpha$, then the radiation reaction vanishes and the system does not emit dipole radiation. The next order correction is $(v/c)^4$ and of Lagrangian form. It is discussed in [7] and [1].

While in electrodynamics corrections of order higher than the radiation reaction are of marginal interest, in general relativity there is a huge effort to obtain very precise corrections to the Newtonian orbits, a problem that is similar to the one discussed here. The most famous example is the Hulse–Taylor binary pulsar, where two highly compact neutron stars of roughly solar mass revolve around each other with a period of 7.8 h [9]. In this case $(v/c) \cong 10^{-3}$. For gravitational systems there is only quadrupole radiation, which is of order $(v/c)^5$. To this order the theory agrees with the observed radio signals within 0.3%. In newly designed experiments one expects highly improved precision which will require corrections up to order $(v/c)^{11}$.

2. Main Results

We recall the initial conditions for the Abraham model (1.2)–(1.5), where we set $c = 1$ throughout for simplicity. For the initial positions $q_\alpha^0 = q_\alpha(0)$ we require
\[
C_1 \varepsilon^{-1} \le |q_\alpha^0 - q_\beta^0| \le C_2 \varepsilon^{-1}, \quad \alpha \neq \beta, \tag{2.1}
\]

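for some constants $C_1, C_2 > 0$. For the initial velocities $v_\alpha^0 = v_\alpha(0)$ we assume
\[
|v_\alpha^0| \le C_3 \sqrt{\varepsilon} \tag{2.2}
\]
with $C_3 > 0$. The initial fields are a sum over charge solitons,
\[
E(x, 0) = E^0(x) = \sum_{\alpha=1}^{N} E_{v_\alpha^0}(x - q_\alpha^0) \quad \text{and} \quad B(x, 0) = B^0(x) = \sum_{\alpha=1}^{N} B_{v_\alpha^0}(x - q_\alpha^0). \tag{2.3}
\]
Here
\[
E_v(x) = -\nabla \phi_v(x) + (v \cdot \nabla \phi_v(x))\, v \quad \text{and} \quad B_v(x) = -v \wedge \nabla \phi_v(x) \tag{2.4}
\]
and the Fourier transform of $\phi_v$ is given by
\[
\hat{\phi}_v(k) = e\, \hat{\varphi}(k) / [k^2 - (k \cdot v)^2], \tag{2.5}
\]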

where it is understood that in $\phi_{v_\alpha^0}$ we have to set $e = e_\alpha$. For this choice of data, the constraints (1.4) are satisfied for $t = 0$ and therefore for all $t$. In case $N = 1$, the particle would travel freely, $q_1(t) = q_1^0 + v_1^0 t$, $t \ge 0$, and the co-moving electromagnetic fields would maintain their form (2.3). In spirit, the bounds (2.1) and (2.2) should propagate in time, as should the form (2.3) of the electromagnetic fields, at least approximately. On the other hand, for two particles with opposite charge one particular solution is the head-on collision, which violates the lower bound in (2.1). Considerably more delicate are solutions where some particles reach infinity in finite time, [8,10]. Thus we simply require that for given constants $C_*, C^* > 0$ the bound
\[
C_* \varepsilon^{-1} \le \sup_{t \in [0, T\varepsilon^{-3/2}]} |q_\alpha(t) - q_\beta(t)| \le C^* \varepsilon^{-1}, \quad \alpha \neq \beta, \tag{2.6}
\]


holds, which implicitly defines the first time, $T$, at which (2.6) is violated. In fact (2.6) looks like an uncheckable assumption. But, as will be shown, the optimal $T$ can be computed on the basis of the approximation dynamics generated by the Lagrangian (1.1).

Under the assumption (2.6) the velocity bound propagates through the conservation of energy. We define the electrostatic energy of the charge distributions as
\[
E_{\mathrm{stat}} = \sum_{\alpha=1}^{N} e_\alpha^2\, \frac{1}{2} \int d^3k\, |\hat{\varphi}(k)|^2 k^{-2}, \tag{2.7}
\]
and compute the energy (1.6) for the given initial data. Then
\[
\mathcal{H}(0) := \mathcal{H}(t = 0) = \sum_{\alpha=1}^{N} m_{b\alpha} \gamma(v_\alpha^0) + E_{\mathrm{stat}} + O(\varepsilon)
\]
with $\gamma(v) = (1 - v^2)^{-1/2}$. We minimize the electromagnetic field energy $\mathcal{H}_f(t) = \frac{1}{2} \int d^3x\, [E^2(x, t) + B^2(x, t)]$ at time $t$ for given $\rho$ and $j$, i.e., for given positions $q(t)$ and velocities $v(t)$. Using (2.6) it may be shown that
\[
\mathcal{H}(t) \ge \sum_{\alpha=1}^{N} m_{b\alpha} \gamma(v_\alpha(t)) + E_{\mathrm{stat}} + O(\varepsilon).
\]
Since by energy conservation $\mathcal{H}(0) = \mathcal{H}(t)$ and since the dominant contributions $E_{\mathrm{stat}}$ cancel exactly, we thus will continue to have the bound $|v_\alpha(t)| \cong C\sqrt{\varepsilon}$. (We refer to Sect. 5.1 in Appendix A for the complete argument.) Therefore
\[
\sup_{t \in [0, T\varepsilon^{-3/2}]} |v_\alpha(t)| \le C_v \sqrt{\varepsilon} \tag{2.8}
\]

with some constant $C_v > 0$.

As a next step we solve the inhomogeneous Maxwell equations for the fields and insert them into the Lorentz force equations. According to the retarded part of the fields, retarded positions $q_\alpha(s)$, $s \in [0, t]$, will show up. To control the Taylor expansion of $q_\alpha(t) - q_\alpha(s)$ and thus of the retarded force, including the Darwin term, we will need bounds not only on positions and velocities, but also on $\dot{v}_\alpha$ and $\ddot{v}_\alpha$. Implicitly they use that the true fields remain close to the fields of the form (2.3) evaluated at current positions and velocities.

Lemma 2.1. Let the initial data for the Abraham model satisfy (2.1), (2.2), and (2.3). Moreover, assume
\[
C_* \varepsilon^{-1} \le \sup_{t \in [0, T\varepsilon^{-3/2}]} |q_\alpha(t) - q_\beta(t)|, \quad \alpha \neq \beta, \tag{2.9}
\]
for some $T > 0$. Then there exist constants $C^*, C_v > 0$ such that (2.6) and (2.8) hold. In particular, $\sup_{t \in [0, T\varepsilon^{-3/2}]} |v_\alpha(t)| \le \bar{v} < 1$ for some $\bar{v}$. In addition, we find $C > 0$ and $\bar{e} > 0$ such that
\[
\sup_{t \in [0, T\varepsilon^{-3/2}]} |\dot{v}_\alpha(t)| \le C\varepsilon^2 \quad \text{and} \quad \sup_{t \in [0, T\varepsilon^{-3/2}]} |\ddot{v}_\alpha(t)| \le C\varepsilon^{7/2} \tag{2.10}
\]
in case that $|e_\alpha| \le \bar{e}$, $\alpha = 1, \ldots, N$. In the estimates (2.6), (2.8), and (2.10), $C$ and $\bar{e}$ depend only on $T$ and the bounds for the initial data, but not on $\varepsilon$.


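The proof of this lemma is rather technical and will be given in Appendix A. Using the bounds of Lemma 2.1, we expand the Lorentz force up to an error of order $\varepsilon^{7/2}$, cf. Lemma 3.5, which is the order of radiation damping (the Coulomb force is order $\varepsilon^2$ and radiation damping a relative order $\varepsilon^{3/2}$ smaller). The terms up to order $\varepsilon^3$ then can be collected in the form of the Darwin Lagrangian (1.1). We set
\[
m_\alpha = m_{b\alpha} + \frac{4}{3} e_\alpha^2 m_e \quad \text{and} \quad m_\alpha^* = m_{b\alpha} + \frac{16}{15} e_\alpha^2 m_e
\]
with the electromagnetic mass $m_e = \frac{1}{2} \int d^3k\, |\hat{\varphi}(k)|^2 k^{-2}$ and the Darwin Lagrangian
\[
\begin{aligned}
L_D(r, u) = {} & \sum_{\alpha=1}^{N} \Big( \frac{1}{2} m_\alpha u_\alpha^2 + \frac{\varepsilon}{8} m_\alpha^* u_\alpha^4 \Big) - \frac{1}{2} \sum_{\substack{\alpha,\beta=1 \\ \alpha \neq \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|} \\
& + \frac{\varepsilon}{4} \sum_{\substack{\alpha,\beta=1 \\ \alpha \neq \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|} \Big( u_\alpha \cdot u_\beta + |r_\alpha - r_\beta|^{-2} (u_\alpha \cdot [r_\alpha - r_\beta]) (u_\beta \cdot [r_\alpha - r_\beta]) \Big)
\end{aligned}
\]
for $r = (r_1, \ldots, r_N)$ and $u = (u_1, \ldots, u_N)$. The comparison dynamics is then
\[
\frac{d}{dt} \frac{\partial L_D}{\partial u_\alpha} = \frac{\partial L_D}{\partial r_\alpha}, \quad \alpha = 1, \ldots, N. \tag{2.11}
\]
It conserves the energy
\[
H_D(r, u) = \sum_{\alpha=1}^{N} \Big( \frac{1}{2} m_\alpha u_\alpha^2 + \varepsilon \frac{3}{8} m_\alpha^* u_\alpha^4 \Big) + \frac{1}{2} \sum_{\substack{\alpha,\beta=1 \\ \alpha \neq \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|}. \tag{2.12}
\]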

Because of the Coulomb singularity, in general the solutions to (2.11) will exist only locally in time, the only exception being when all charges have the same sign, in which case energy conservation yields global existence. In the corresponding gravitational problem, for a set of positive phase space measure, mass can be transported to infinity in a finite time, [10]. We do not know whether this can happen also for the Coulomb problem. We set
\[
q_\alpha^0 = \varepsilon^{-1} r_\alpha^0 \quad \text{and} \quad v_\alpha^0 = \sqrt{\varepsilon}\, u_\alpha^0, \quad \alpha = 1, \ldots, N, \tag{2.13}
\]
with $r_\alpha^0 \neq r_\beta^0$ for $\alpha \neq \beta$. Then (2.1) and (2.2) are satisfied. During the initial time slip of order $\varepsilon^{-1}$ the fields build up the forces between particles and adjust to their motion. Thus during that period the dynamics of the particles is not well approximated by the Darwin Lagrangian and we correct the initial data of the comparison dynamics to the true positions and velocities only at the end of the initial time slip. To take into account that the comparison dynamics will have no global solutions in time, in general, we define $\tau \in\, ]0, \infty]$ to be the first time when either $\lim_{t \to \tau^-} |r_\alpha(t) - r_\beta(t)| = 0$ for some $\alpha \neq \beta$ or $\lim_{t \to \tau^-} |r_\alpha(t)| = \infty$ for some $\alpha$ holds for the comparison dynamics (2.11).
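The scaling (2.13) relates the microscopic variables $(q, v)$ on times $t$ to the macroscopic variables $(r, u)$ on times $\varepsilon^{3/2} t$. A minimal sketch, assuming a scalar toy trajectory $r(s) = 2 + \sin s$ with $u = dr/ds$ (chosen here purely for illustration and not taken from the text), checks that the rescaled position and velocity are consistent, i.e. that $\frac{d}{dt}[\varepsilon^{-1} r(\varepsilon^{3/2} t)] = \sqrt{\varepsilon}\, u(\varepsilon^{3/2} t)$.

```python
import math

def micro_position(r, eps, t):
    """q(t) = eps^(-1) * r(eps^(3/2) * t), the position scaling from (2.13)."""
    return r(eps ** 1.5 * t) / eps

def micro_velocity(u, eps, t):
    """v(t) = sqrt(eps) * u(eps^(3/2) * t), the velocity scaling from (2.13)."""
    return math.sqrt(eps) * u(eps ** 1.5 * t)

def r_toy(s):
    """Hypothetical macroscopic trajectory, for illustration only."""
    return 2.0 + math.sin(s)

def u_toy(s):
    """Its derivative dr/ds."""
    return math.cos(s)

def finite_difference_velocity(eps, t, h=1e-3):
    """Central difference of the microscopic position in the microscopic time t."""
    return (micro_position(r_toy, eps, t + h) - micro_position(r_toy, eps, t - h)) / (2.0 * h)
```

With $\varepsilon = 0.01$ and $t = 500$ the macroscopic time is $\varepsilon^{3/2} t = 0.5$, and the finite difference of $q$ reproduces $\sqrt{\varepsilon} \cos(0.5)$ up to discretization error.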


As our main approximation result we state

Theorem 2.2. Let $T > 0$ be fixed. Define $\tau \in\, ]0, \infty]$ as above and fix some $\delta_0 \in\, ]0, \tau[$. For the Abraham model let the initial data be given by (2.13) and (2.3). Furthermore we require $|e_\alpha| \le \bar{e}$, with $\bar{e} = \bar{e}(T, \text{data}) > 0$ from Lemma 2.1. Let $t_0 = 4(R_\varphi + C^* \varepsilon^{-1})$. We adjust the initial data of the comparison dynamics such that $q_\alpha(t_0) = \varepsilon^{-1} r_\alpha(\varepsilon^{3/2} t_0)$ and $v_\alpha(t_0) = \sqrt{\varepsilon}\, u_\alpha(\varepsilon^{3/2} t_0)$, $\alpha = 1, \ldots, N$. Then there exists a constant $C > 0$ such that for all $t \in [t_0, \min\{\tau - \delta_0, T\}\, \varepsilon^{-3/2}]$ we have
\[
|q_\alpha(t) - \varepsilon^{-1} r_\alpha(\varepsilon^{3/2} t)| \le C\sqrt{\varepsilon}, \qquad |v_\alpha(t) - \sqrt{\varepsilon}\, u_\alpha(\varepsilon^{3/2} t)| \le C\varepsilon^2, \quad \alpha = 1, \ldots, N. \tag{2.14}
\]
Remarks. (i) If we are satisfied with the precision from the pure Coulomb dynamics, then in (2.14) we lose one power in $\varepsilon$. In this case, we can adjust the initial data of the comparison dynamics at time $t = 0$, and then (2.14) holds for all $t \in [0, \min\{\tau - \delta_0, T\}\, \varepsilon^{-3/2}]$.
(ii) In fact the initial data need not be adjusted exactly at $t = t_0$; a bound $|q_\alpha(t_0) - \varepsilon^{-1} r_\alpha(\varepsilon^{3/2} t_0)| \sim \sqrt{\varepsilon}$ and $|v_\alpha(t_0) - \sqrt{\varepsilon}\, u_\alpha(\varepsilon^{3/2} t_0)| \sim \varepsilon^2$ would be sufficient.

3. Self-Action and Mutual Interaction

In this section we expand the Lorentz force term
\[
F_\alpha(t) = \int d^3x\, \rho_\alpha(x - q_\alpha(t)) \big[ E(x, t) + v_\alpha(t) \wedge B(x, t) \big]. \tag{3.1}
\]

Since the fields $(E, B)$ are a solution to the inhomogeneous Maxwell equations, we may decompose them into the initial and the retarded fields,
\[
E(x, t) = E^{(0)}(x, t) + E^{(r)}(x, t) \quad \text{and} \quad B(x, t) = B^{(0)}(x, t) + B^{(r)}(x, t),
\]
where
\[
\begin{aligned}
\hat{E}^{(0)}(k, t) &= \cos |k|t\; \hat{E}(k, 0) - i\, \frac{\sin |k|t}{|k|}\, k \wedge \hat{B}(k, 0), \\
\hat{B}^{(0)}(k, t) &= \cos |k|t\; \hat{B}(k, 0) + i\, \frac{\sin |k|t}{|k|}\, k \wedge \hat{E}(k, 0), \\
\hat{E}^{(r)}(k, t) &= -\int_0^t ds\, \cos |k|(t - s)\, \hat{j}(k, s) + i \int_0^t ds\, \frac{\sin |k|(t - s)}{|k|}\, \hat{\rho}(k, s)\, k, \\
\hat{B}^{(r)}(k, t) &= -i \int_0^t ds\, \frac{\sin |k|(t - s)}{|k|}\, k \wedge \hat{j}(k, s),
\end{aligned}
\]
cf. [6, Sect. 4], with $j(x, t)$ and $\rho(x, t)$ from (1.2). Accordingly we can rewrite $F_\alpha(t)$ in (3.1) as
\[
\begin{aligned}
F_\alpha(t) &= \int d^3x\, \rho_\alpha(x - q_\alpha(t)) \big[ E^{(0)}(x, t) + v_\alpha(t) \wedge B^{(0)}(x, t) \big] + \int d^3x\, \rho_\alpha(x - q_\alpha(t)) \big[ E^{(r)}(x, t) + v_\alpha(t) \wedge B^{(r)}(x, t) \big] \\
&= F_\alpha^{(0)}(t) + F_\alpha^{(r)}(t).
\end{aligned} \tag{3.2}
\]

First we consider $F_\alpha^{(0)}(t)$.

Lemma 3.1. For $t \in [t_0, T\varepsilon^{-3/2}]$, with $t_0 = 4(R_\varphi + C^*\varepsilon^{-1})$, we have $F_\alpha^{(0)}(t) = 0$.

Proof. If $S(t)$ denotes the solution group generated by the free wave equation in $D^{1,2}(\mathbb{R}^3)\oplus L^2(\mathbb{R}^3)$, it follows from (2.3) through Fourier transform that
\[ \begin{pmatrix} E^{(0)}(x,t)\\ \dot E^{(0)}(x,t)\end{pmatrix} = \Big[S(t)\begin{pmatrix} E^{(0)}(\cdot,0)\\ \dot E^{(0)}(\cdot,0)\end{pmatrix}\Big](x) = -\sum_{\beta=1}^N e_\beta\int_{-\infty}^0 ds\,\big[S(t-s)\Phi_E^\beta(\cdot - q_\beta^0 - v_\beta^0 s)\big](x), \]
where $\Phi_E^\beta(x) = (\varphi(x)v_\beta^0, \nabla\varphi(x))$. The analogous formula is valid for $B^{(0)}(x,t)$, with $\Phi_E^\beta$ to be replaced with $\Phi_B^\beta(x) = (0, v_\beta^0\wedge\nabla\varphi(x))$. For fixed $1\le\beta\le N$ and $x\in\mathbb{R}^3$ with $|x - q_\beta^0| \le t - R_\varphi$, assumption (C) yields $[S(t-s)\Phi_E^\beta(\cdot - q_\beta^0 - v_\beta^0 s)]_1(x) = 0$ for all $s\le 0$ by means of Kirchhoff's formula and Lemma 2.1, $[\dots]_1$ denoting the first component. As for $t\in[t_0, T\varepsilon^{-3/2}]$ and $|x - q_\beta^0| > t - R_\varphi$ we obtain
\[ |x - q_\alpha(t)| \ge |x - q_\beta^0| - |q_\alpha(t) - q_\beta(t)| - |q_\beta(t) - q_\beta^0| \ge t - R_\varphi - C^*\varepsilon^{-1} - C\sqrt{\varepsilon}\,t \ge t_0/2 - R_\varphi - C^*\varepsilon^{-1} \ge R_\varphi \]
for $\varepsilon$ small by Lemma 2.1, the claim follows. □

Turning then to $F_\alpha^{(r)}(t)$ in (3.2), we write this term in Fourier transformed form and use (1.2) to obtain
\[ F_\alpha^{(r)}(t) = e_\alpha^2 F_{\alpha\alpha}^{(r)}(t) + \sum_{\beta=1,\ \beta\neq\alpha}^{N} e_\alpha e_\beta F_{\alpha\beta}^{(r)}(t), \tag{3.3} \]
with
\[ F_{\alpha\beta}^{(r)}(t) = \int_0^t ds\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot[q_\alpha(t) - q_\beta(s)]}\Big\{ -\cos|k|(t-s)\,v_\beta(s) + i\,\frac{\sin|k|(t-s)}{|k|}\,k - i\,\frac{\sin|k|(t-s)}{|k|}\,v_\alpha(t)\wedge(k\wedge v_\beta(s))\Big\}, \tag{3.4} \]
$\alpha,\beta = 1,\dots,N$. The term $F_{\alpha\alpha}^{(r)}(t)$ accounts for the self-force, whereas $F_{\alpha\beta}^{(r)}(t)$ for $\beta\neq\alpha$ represents the mutual interaction force between particle $\alpha$ and particle $\beta$. Both contributions are dealt with separately in the following two subsections. Before turning to this, we state an auxiliary result.


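Lemma 3.2. Let $1\le\alpha,\beta\le N$, $\alpha\neq\beta$. For $t\in[t_0, T\varepsilon^{-3/2}]$ we have

(a)
\[ -\int_0^t ds\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot[q_\alpha(t)-q_\beta(s)]}\cos|k|(t-s)\,v_\beta(s) = -\int_0^\infty d\tau\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\cos|k|\tau\,\big\{v_\beta - i\tau(k\cdot v_\beta)v_\beta - \tau\dot v_\beta\big\} + O(\varepsilon^{7/2}), \]

(b)
\[ i\int_0^t ds\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot[q_\alpha(t)-q_\beta(s)]}\frac{\sin|k|(t-s)}{|k|}\,k = i\int_0^\infty d\tau\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\frac{\sin|k|\tau}{|k|}\,k\,\Big[1 - ik\cdot\Big(\tau v_\beta - \frac12\tau^2\dot v_\beta\Big) - \frac12\tau^2(k\cdot v_\beta)^2\Big] + O(\varepsilon^{7/2}), \]

(c)
\[ (-i)\int_0^t ds\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot[q_\alpha(t)-q_\beta(s)]}\frac{\sin|k|(t-s)}{|k|}\,v_\alpha(t)\wedge(k\wedge v_\beta(s)) = (-i)\int_0^\infty d\tau\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\frac{\sin|k|\tau}{|k|}\,v_\alpha\wedge(k\wedge v_\beta) + O(\varepsilon^{7/2}). \]

Here $v_\alpha = v_\alpha(t)$, etc., and $\xi_{\alpha\beta} = q_\alpha(t) - q_\beta(t)$. The proof is somewhat tedious and given in Appendix B.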

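3.1. Self-action. For $t\in[t_0, T\varepsilon^{-3/2}]$ we have
\[ F_{\alpha\alpha}^{(r)}(t) = \int_0^\infty d\tau\int dk\,|\hat\varphi(k)|^2 e^{-i(k\cdot v_\alpha)\tau}\Big[1 + \frac i2(k\cdot\dot v_\alpha)\tau^2\Big]\Big\{-\cos|k|\tau\,[v_\alpha - \dot v_\alpha\tau] + i\,\frac{\sin|k|\tau}{|k|}\,k - i\,\frac{\sin|k|\tau}{|k|}\,v_\alpha\wedge(k\wedge[v_\alpha - \dot v_\alpha\tau])\Big\} + O(\varepsilon^{7/2}). \tag{3.5} \]
The rigorous proof of this relation is omitted since it is very similar to the proof of Lemma 3.2 given in Appendix B. It once more relies on the fact that we may Taylor expand
\[ q_\alpha(s) \cong q_\alpha - v_\alpha\tau + \frac12\dot v_\alpha\tau^2 + O(\varepsilon^{7/2}), \qquad v_\alpha(s) \cong v_\alpha - \dot v_\alpha\tau + O(\varepsilon^{7/2}) \]
by Lemma 2.1, with $q_\alpha = q_\alpha(t)$, etc. and $\tau = t - s$, whence
\[ e^{-ik\cdot[q_\alpha(t) - q_\alpha(s)]} \cong e^{-i(k\cdot v_\alpha)\tau}\Big[1 + \frac i2(k\cdot\dot v_\alpha)\tau^2\Big] + O(\varepsilon^{7/2}). \]
Introducing
\[ I_p = \int_0^{\bar t} d\tau\,\frac{\sin(|k|\tau)}{|k|}\,e^{-i(k\cdot v_\alpha)\tau}\tau^p, \qquad J_p = \int_0^{\bar t} d\tau\,\cos(|k|\tau)\,e^{-i(k\cdot v_\alpha)\tau}\tau^p, \qquad p\in\mathbb{N}_0, \]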


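Eq. (3.5) may be rewritten as
\[ \begin{aligned}
F_{\alpha\alpha}^{(r)}(t) = \lim_{\bar t\to\infty}\Big( &-\int dk\,|\hat\varphi(k)|^2\Big\{v_\alpha J_0 - \dot v_\alpha J_1 + \frac i2(k\cdot\dot v_\alpha)v_\alpha J_2\Big\}\\
&+ \int dk\,|\hat\varphi(k)|^2\Big\{ i[(1 - v_\alpha^2)k + (k\cdot v_\alpha)v_\alpha]I_0 + i[(v_\alpha\cdot\dot v_\alpha)k - (k\cdot v_\alpha)\dot v_\alpha]I_1\\
&\qquad - \frac12(k\cdot\dot v_\alpha)[(1 - v_\alpha^2)k + (k\cdot v_\alpha)v_\alpha]I_2\Big\}\Big) + O(\varepsilon^{7/2}),
\end{aligned} \tag{3.6} \]
since $\dot v_\alpha^2 = O(\varepsilon^4)$. Denote the term containing the $J_p$ by $J$ and the one containing the $I_p$ by $I$. To evaluate the limits $\bar t\to\infty$, we can rely on the results from [6, Sect. 4]. We first recall that
\[ \int dk\,|\hat\varphi(k)|^2 J_0 \to 0, \qquad \int dk\,|\hat\varphi(k)|^2 J_1 \to -2m_e\gamma_\alpha^2 \quad\text{as } \bar t\to\infty, \]
with $\gamma_\alpha = (1 - v_\alpha^2)^{-1/2}$ and $m_e = \frac12\int dk\,|\hat\varphi(k)|^2 k^{-2}$. Moreover, $\nabla_v J_1 = -ikJ_2$, and therefore
\[ \begin{aligned}
J &\to (-2m_e\gamma_\alpha^2)\dot v_\alpha + \frac12\big[\dot v_\alpha\cdot\nabla_v(-2m_e\gamma_\alpha^2)\big]v_\alpha = -2m_e\big[\gamma_\alpha^2\dot v_\alpha + \gamma_\alpha^4(v_\alpha\cdot\dot v_\alpha)v_\alpha\big]\\
&= -2m_e\big[(1 + v_\alpha^2)\dot v_\alpha + (v_\alpha\cdot\dot v_\alpha)v_\alpha\big] + O(\varepsilon^4)
\end{aligned} \tag{3.7} \]
as $\bar t\to\infty$, the latter equality according to the expansion $\gamma_\alpha^2 = 1 + v_\alpha^2 + O(v_\alpha^4) = 1 + v_\alpha^2 + O(\varepsilon^2)$ and $\gamma_\alpha^4 = 1 + O(\varepsilon)$.

Concerning $I$, we know from [3, 6] that $\int dk\,|\hat\varphi(k)|^2 kI_0 \to 0$,
\[ \int dk\,|\hat\varphi(k)|^2 I_0 \to 2m_e|v_\alpha|^{-1}\operatorname{arth}|v_\alpha|, \qquad \int dk\,|\hat\varphi(k)|^2 (k\cdot\dot v_\alpha)kI_2 \to -2m_e\,\mu(v_\alpha)\dot v_\alpha \]
as $\bar t\to\infty$, where
\[ \mu(v)z = \Big[\frac{\gamma^2}{v^2} - |v|^{-3}\operatorname{arth}|v|\Big]z + \Big[\frac{\gamma^4}{v^4}(5v^2 - 3) + 3|v|^{-5}\operatorname{arth}|v|\Big](v\cdot z)v \]
for $|v| < 1$ and $z\in\mathbb{R}^3$. Consequently, since $s^{-1}\operatorname{arth}(s) = 1 + s^2/3 + s^4/5 + O(s^6)$ for $s$ close to zero, it follows after some calculation that
\[ \begin{aligned}
I &\to -(v_\alpha\cdot\dot v_\alpha)\nabla_v\big[2m_e|v_\alpha|^{-1}\operatorname{arth}|v_\alpha|\big] + \dot v_\alpha\,v_\alpha\cdot\nabla_v\big[2m_e|v_\alpha|^{-1}\operatorname{arth}|v_\alpha|\big]\\
&\quad - \frac12(1 - v_\alpha^2)\big[-2m_e\mu(v_\alpha)\dot v_\alpha\big] - \frac12 v_\alpha\big(v_\alpha\cdot\big[-2m_e\mu(v_\alpha)\dot v_\alpha\big]\big)\\
&= \Big(\frac23 + \frac{22}{15}v_\alpha^2\Big)m_e\dot v_\alpha + \frac{14}{15}m_e(v_\alpha\cdot\dot v_\alpha)v_\alpha + O(\varepsilon^4).
\end{aligned} \tag{3.8} \]
Summarizing (3.6), (3.7), and (3.8), we arrive at

Lemma 3.3. For $t\in[t_0, T\varepsilon^{-3/2}]$ we have
\[ F_{\alpha\alpha}^{(r)}(t) = -\Big(\frac43 + \frac{8}{15}v_\alpha^2\Big)m_e\dot v_\alpha - \frac{16}{15}m_e(v_\alpha\cdot\dot v_\alpha)v_\alpha + O(\varepsilon^{7/2}). \]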
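The arithmetic leading from (3.7) and (3.8) to Lemma 3.3 can be checked mechanically. The following Python sketch is an illustration only and not part of the proof. It stores the coefficients of $\dot v_\alpha$, $v_\alpha^2\dot v_\alpha$ and $(v_\alpha\cdot\dot v_\alpha)v_\alpha$ (in units of $m_e$) as exact fractions, adds the limits of $J$ and $I$, and checks that the last two coefficients are in the ratio 1:2, which is what the matrix form $M_\alpha(v)z = (m_\alpha + \frac12 m_\alpha^* v^2)z + m_\alpha^*(v\cdot z)v$ of Lemma 3.5 requires. The tuple layout and all variable names are assumptions made for this sketch.

```python
from fractions import Fraction as F

# Coefficients of (v_dot, v^2 * v_dot, (v . v_dot) * v), in units of m_e.
# Limit of J from (3.7): -2 [ (1 + v^2) v_dot + (v . v_dot) v ].
J_limit = (F(-2), F(-2), F(-2))
# Limit of I from (3.8): (2/3 + 22/15 v^2) v_dot + 14/15 (v . v_dot) v.
I_limit = (F(2, 3), F(22, 15), F(14, 15))

# The self-force of Lemma 3.3 is the sum of the two limits.
self_force = tuple(j + i for j, i in zip(J_limit, I_limit))
print(self_force)

# Structure required by Lemma 3.5: the (v . v_dot) v coefficient must be
# twice the v^2 * v_dot coefficient (both belong to the same mass m*).
ratio_ok = self_force[2] == 2 * self_force[1]
print(ratio_ok)
```

The sum reproduces the coefficients $-\frac43$, $-\frac8{15}$ and $-\frac{16}{15}$ of Lemma 3.3.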

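3.2. Mutual interaction. In this section we expand $F_{\alpha\beta}^{(r)}(t)$ from (3.4) with $\beta\neq\alpha$. For $p\in\mathbb{N}_0$ we have that
\[ A_p := \int_0^\infty d\tau\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\frac{\sin|k|\tau}{|k|}\,\tau^p = (4\pi)^{-1}\int\!\!\int dx\,dy\,\varphi(x)\varphi(y)\,|\xi_{\alpha\beta} + x - y|^{p-1} \]
and
\[ B_p := \int_0^\infty d\tau\int dk\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\cos(|k|\tau)\,\tau^p = (-p)(4\pi)^{-1}\int\!\!\int dx\,dy\,\varphi(x)\varphi(y)\,|\xi_{\alpha\beta} + x - y|^{p-2} = (-p)A_{p-1}, \]
as may be seen through Fourier transform. We hence obtain from Lemma 3.2 that for $\beta\neq\alpha$ and $t\in[t_0, T\varepsilon^{-3/2}]$,
\[ \begin{aligned}
F_{\alpha\beta}^{(r)}(t) = {}& -v_\beta(v_\beta\cdot\nabla_\xi)B_1 + \dot v_\beta B_1 - \nabla_\xi A_0 + \frac12(\dot v_\beta\cdot\nabla_\xi)\nabla_\xi A_2 - \frac12(v_\beta\cdot\nabla_\xi)^2\nabla_\xi A_2\\
& + (v_\alpha\cdot v_\beta)\nabla_\xi A_0 - v_\beta(v_\alpha\cdot\nabla_\xi)A_0 + O(\varepsilon^{7/2}),
\end{aligned} \tag{3.9} \]
taking also into account that $A_1 = (4\pi)^{-1}$, thus $\nabla_\xi A_1 = 0$. As a consequence of $|\xi_{\alpha\beta}| = O(\varepsilon^{-1})$, cf. Lemma 2.1, of assumption (C), and of Lemma 2.1, it follows that in (3.9) we have $-\nabla_\xi A_0 = O(\varepsilon^2)$, while all other terms are $O(\varepsilon^3)$. Since e.g.
\[ \Big|(v_\alpha\cdot v_\beta)\nabla_\xi A_0 - (v_\alpha\cdot v_\beta)\Big(-\frac{\xi_{\alpha\beta}}{4\pi|\xi_{\alpha\beta}|^3}\Big)\Big| \le C\varepsilon^4, \]
with an obvious similar estimate for the other terms besides $-\nabla_\xi A_0$, we find from (3.9) and after some calculation that for $\beta\neq\alpha$ and $t\in[t_0, T\varepsilon^{-3/2}]$,
\[ \begin{aligned}
F_{\alpha\beta}^{(r)}(t) = {}& v_\beta(v_\beta\cdot\nabla_\xi)\frac{1}{4\pi|\xi_{\alpha\beta}|} - \dot v_\beta\frac{1}{4\pi|\xi_{\alpha\beta}|} - \nabla_\xi A_0 + \frac12(\dot v_\beta\cdot\nabla_\xi)\nabla_\xi\frac{|\xi_{\alpha\beta}|}{4\pi} - \frac12(v_\beta\cdot\nabla_\xi)^2\nabla_\xi\frac{|\xi_{\alpha\beta}|}{4\pi}\\
& + (v_\alpha\cdot v_\beta)\nabla_\xi\frac{1}{4\pi|\xi_{\alpha\beta}|} - v_\beta(v_\alpha\cdot\nabla_\xi)\frac{1}{4\pi|\xi_{\alpha\beta}|} + O(\varepsilon^{7/2})\\
= {}& -\nabla_\xi A_0 - \frac{1}{8\pi|\xi_{\alpha\beta}|}\dot v_\beta - \frac{(\dot v_\beta\cdot\xi_{\alpha\beta})}{8\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{v_\beta^2}{8\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} - \frac{3(v_\beta\cdot\xi_{\alpha\beta})^2}{8\pi|\xi_{\alpha\beta}|^5}\xi_{\alpha\beta}\\
& - \frac{(v_\alpha\cdot v_\beta)}{4\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{(v_\alpha\cdot\xi_{\alpha\beta})}{4\pi|\xi_{\alpha\beta}|^3}v_\beta + O(\varepsilon^{7/2}).
\end{aligned} \]
Finally, to deal with the lowest-order term we observe that with $n = \xi_{\alpha\beta}/|\xi_{\alpha\beta}|$,
\[ \nabla_\xi A_0 + \frac{\xi_{\alpha\beta}}{4\pi|\xi_{\alpha\beta}|^3} = -\frac{1}{4\pi|\xi_{\alpha\beta}|^2}\int\!\!\int dx\,dy\,\varphi(x)\varphi(y)\Bigg[\frac{n + \frac{x-y}{|\xi_{\alpha\beta}|}}{\big|n + \frac{x-y}{|\xi_{\alpha\beta}|}\big|^3} - n\Bigg]. \tag{3.10} \]
Defining $R = (x-y)/|\xi_{\alpha\beta}| = O(\varepsilon)$ for $|x|, |y|\le R_\varphi$, we can expand $\psi(R) = (n+R)/|n+R|^3$ to obtain that $\psi(R) = n + R - 3(n\cdot R)n + O(\varepsilon^2)$. As $\int\!\!\int dx\,dy\,\varphi(x)\varphi(y)(x-y) = 0$, we hence conclude that the right-hand side of (3.10) is $O(\varepsilon^4)$. Thus we can summarize our estimates on the mutual interaction force as follows.

Lemma 3.4. For $\beta\neq\alpha$ and $t\in[t_0, T\varepsilon^{-3/2}]$ we have
\[ \begin{aligned}
F_{\alpha\beta}^{(r)}(t) = {}& \frac{\xi_{\alpha\beta}}{4\pi|\xi_{\alpha\beta}|^3} - \frac{1}{8\pi|\xi_{\alpha\beta}|}\dot v_\beta - \frac{(\dot v_\beta\cdot\xi_{\alpha\beta})}{8\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{v_\beta^2}{8\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} - \frac{3(v_\beta\cdot\xi_{\alpha\beta})^2}{8\pi|\xi_{\alpha\beta}|^5}\xi_{\alpha\beta}\\
& - \frac{(v_\alpha\cdot v_\beta)}{4\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{(v_\alpha\cdot\xi_{\alpha\beta})}{4\pi|\xi_{\alpha\beta}|^3}v_\beta + O(\varepsilon^{7/2}).
\end{aligned} \]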
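The first-order expansion $\psi(R) = n + R - 3(n\cdot R)n + O(|R|^2)$ used for (3.10) can be illustrated numerically. The Python sketch below is a sanity check under stated assumptions, not a proof: the unit vector `n` and the direction of `R` are arbitrary hypothetical choices, and the script only confirms that the remainder shrinks quadratically as $|R|\to 0$.

```python
import math

def psi(n, R):
    """Exact value of (n + R) / |n + R|^3 for 3-vectors n and R."""
    s = [n[i] + R[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in s))
    return [c / norm**3 for c in s]

def psi_linear(n, R):
    """First-order expansion n + R - 3 (n . R) n, valid for unit n."""
    n_dot_R = sum(n[i] * R[i] for i in range(3))
    return [n[i] + R[i] - 3.0 * n_dot_R * n[i] for i in range(3)]

n = (0.0, 0.6, 0.8)             # hypothetical unit vector
direction = (0.5, -1.0, 0.25)   # hypothetical direction of R

errors = []
for eps in (1e-1, 1e-2, 1e-3):
    R = [eps * c for c in direction]
    err = max(abs(a - b) for a, b in zip(psi(n, R), psi_linear(n, R)))
    errors.append(err)
    print(eps, err, err / eps**2)  # last column stays bounded
```

Each tenfold reduction of `eps` reduces the error by roughly a factor of one hundred, consistent with a remainder of order $|R|^2$.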

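3.3. Summary of the estimates. By (3.1), (3.2), and Lemma 3.1 we find $F_\alpha(t) = F_\alpha^{(r)}(t)$ for $t\in[t_0, T\varepsilon^{-3/2}]$. According to (3.3) and Lemmas 3.3 and 3.4 we hence have obtained the following expansion of the Lorentz force in (3.1). For $t\in[t_0, T\varepsilon^{-3/2}]$ we have
\[ F_\alpha(t) = -\Big(\frac43 + \frac{8}{15}v_\alpha^2\Big)m_e\dot v_\alpha - \frac{16}{15}m_e(v_\alpha\cdot\dot v_\alpha)v_\alpha + G_\alpha(q,v,\dot v) + O(\varepsilon^{7/2}), \]
\[ \begin{aligned}
G_\alpha(q,v,\dot v) = \sum_{\beta=1,\ \beta\neq\alpha}^N \frac{e_\alpha e_\beta}{4\pi}\Big[ & \frac{\xi_{\alpha\beta}}{|\xi_{\alpha\beta}|^3} - \frac{1}{2|\xi_{\alpha\beta}|}\dot v_\beta - \frac{(\dot v_\beta\cdot\xi_{\alpha\beta})}{2|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{v_\beta^2}{2|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta}\\
& - \frac{3(v_\beta\cdot\xi_{\alpha\beta})^2}{2|\xi_{\alpha\beta}|^5}\xi_{\alpha\beta} - \frac{(v_\alpha\cdot v_\beta)}{|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{(v_\alpha\cdot\xi_{\alpha\beta})}{|\xi_{\alpha\beta}|^3}v_\beta\Big],
\end{aligned} \tag{3.11} \]
where $t_0 = 4(R_\varphi + C^*\varepsilon^{-1})$, $\xi_{\alpha\beta} = q_\alpha(t) - q_\beta(t)$, $v_\alpha = v_\alpha(t)$, and $v_\beta = v_\beta(t)$. Due to the Lorentz equation $\frac{d}{dt}(m_{b\alpha}\gamma_\alpha v_\alpha) = F_\alpha(t)$, cf. (1.5), we finally obtain the following lemma by calculating the left-hand side and expanding $\gamma_\alpha$.

Lemma 3.5. For $t\in[t_0, T\varepsilon^{-3/2}]$ we have
\[ M_\alpha(v_\alpha)\dot v_\alpha = G_\alpha(q,v,\dot v) + O(\varepsilon^{7/2}), \qquad 1\le\alpha\le N, \]
with $G_\alpha$ from (3.11) and $M_\alpha(v)$ the $(3\times3)$-matrix $M_\alpha(v)(z) = (m_\alpha + \frac12 m_\alpha^* v^2)z + m_\alpha^*(v\cdot z)v$ for $v, z\in\mathbb{R}^3$.

4. Proof of Theorem 2.2

We need to compare a solution $(q_\alpha(t), v_\alpha(t))$ of (1.2)–(1.5) with data (2.13) to $(\tilde r_\alpha(t), \tilde u_\alpha(t))$, where we let
\[ \tilde r_\alpha(t) = \varepsilon^{-1}r_\alpha(\varepsilon^{3/2}t), \qquad \tilde u_\alpha(t) = \sqrt{\varepsilon}\,u_\alpha(\varepsilon^{3/2}t), \tag{4.1} \]
and where the $(r_\alpha(t), u_\alpha(t))$ are the solution to the system induced by (2.11) with data $(r_\alpha^0, u_\alpha^0)$. A somewhat lengthy but elementary calculation shows that $(\tilde r_\alpha(t), \tilde u_\alpha(t))$ satisfy
\[ M_\alpha(\tilde u_\alpha)\dot{\tilde u}_\alpha = G_\alpha(\tilde r, \tilde u, \dot{\tilde u}), \qquad 1\le\alpha\le N, \tag{4.2} \]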


cf. Lemma 3.5 for the notation. Recalling that $\tau\in\,]0,\infty]$ was defined to be the first time when either $\lim_{t\to\tau^-}|r_\alpha(t) - r_\beta(t)| = 0$ for some $\alpha\neq\beta$ or $\lim_{t\to\tau^-}|r_\alpha(t)| = \infty$ for some $\alpha$ holds, we find that (4.2) is valid for $t\in[0, (\tau-\delta_0)\varepsilon^{-3/2}]$, for any $\delta_0\in\,]0,\tau[$ which we consider to be fixed throughout. This leads to some useful estimates on the effective dynamics.

Lemma 4.1. For suitable constants $C_0, C', C > 0$ (depending on $\tau$, $\delta_0$, and the data) we have
\[ C_0\varepsilon^{-1} \le \sup_{t\in[0,(\tau-\delta_0)\varepsilon^{-3/2}]}|\tilde r_\alpha(t) - \tilde r_\beta(t)| \le C'\varepsilon^{-1}, \qquad \alpha\neq\beta, \tag{4.3} \]
and
\[ \sup_{t\in[0,(\tau-\delta_0)\varepsilon^{-3/2}]}|\tilde u_\alpha(t)| \le C\sqrt{\varepsilon}. \tag{4.4} \]

Proof. The bounds in (4.3) follow from (4.1) and the fact that $|r_\alpha(t) - r_\beta(t)| \ge \delta_1$ and $|r_\alpha(t)| \le C$ on $[0, \tau-\delta_0]$ for some $\delta_1 > 0$, $C > 0$, by definition of $\tau$. Concerning (4.4), by conservation of the energy $H_D$ from (2.12) we obtain $C \ge H_D(r(0), u(0)) = H_D(r(t), u(t)) \ge \frac12 m_\alpha u_\alpha^2(t)$ as long as the solution exists, in particular for $t\in[0, \tau-\delta_0]$. □

To simplify the presentation, we henceforth omit the tilde and write $(r, u)$ instead of $(\tilde r, \tilde u)$ to denote the rescaled solution. Utilizing the bounds from Lemma 2.1 and from (4.3), (4.4), it may be seen after some calculation that
\[ \big|G_\alpha(q,v,\dot v)(t) - G_\alpha(r,u,\dot u)(t)\big| \le C\sum_{\beta=1}^N\Big(\varepsilon^3|q_\beta(t) - r_\beta(t)| + \varepsilon^{5/2}|v_\beta(t) - u_\beta(t)| + \varepsilon|\dot v_\beta(t) - \dot u_\beta(t)|\Big) \tag{4.5} \]
for $1\le\alpha\le N$ and $t\in[0, T\varepsilon^{-3/2}]\cap[0, (\tau-\delta_0)\varepsilon^{-3/2}] = [0, \min\{\tau-\delta_0, T\}\varepsilon^{-3/2}]$. Note that the term $\varepsilon^3|q_\beta - r_\beta|$ appears through comparison of $\xi_{\alpha\beta}/|\xi_{\alpha\beta}|^3$ to $r_{\alpha\beta}/|r_{\alpha\beta}|^3$, cf. the form of $G_\alpha$ in (3.11). Next, a general $(3\times3)$-matrix $M(v) = a(v)\,\mathrm{id} + b\,(v\otimes v)$ has the inverse
\[ M(v)^{-1} = a(v)^{-1}\,\mathrm{id} - \frac{b}{a(v)[a(v) + bv^2]}\,(v\otimes v). \]
This remark shows $|M_\alpha(v_\alpha)^{-1}| = O(1)$ and $|M_\alpha(v_\alpha)^{-1} - M_\alpha(u_\alpha)^{-1}| \le C\sqrt{\varepsilon}\,|v_\alpha - u_\alpha|$ for $t\in[0, \min\{\tau-\delta_0, T\}\varepsilon^{-3/2}]$. Since $|G_\alpha(q,v,\dot v)| = O(\varepsilon^2)$ it follows from Lemma 3.5, (4.2), and (4.5) that
\[ |\dot v_\alpha(t) - \dot u_\alpha(t)| \le C\sum_{\beta=1}^N\Big(\varepsilon^3|q_\beta(t) - r_\beta(t)| + \varepsilon^{5/2}|v_\beta(t) - u_\beta(t)| + \varepsilon|\dot v_\beta(t) - \dot u_\beta(t)|\Big) + O(\varepsilon^{7/2}) \]

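for $1\le\alpha\le N$ and $t\in[t_0, \min\{\tau-\delta_0, T\}\varepsilon^{-3/2}]$. Summing over $\alpha$ and choosing $\varepsilon > 0$ sufficiently small, so that the terms $C\varepsilon|\dot v_\beta - \dot u_\beta|$ on the right can be absorbed into the left-hand side, this results in
\[ \sum_{\alpha=1}^N|\dot v_\alpha(t) - \dot u_\alpha(t)| \le C\sum_{\alpha=1}^N\Big(\varepsilon^3|q_\alpha(t) - r_\alpha(t)| + \varepsilon^{5/2}|v_\alpha(t) - u_\alpha(t)|\Big) + O(\varepsilon^{7/2}) \tag{4.6} \]
for $t\in[t_0, \min\{\tau-\delta_0, T\}\varepsilon^{-3/2}]$. To use this basic estimate, we write $d_\alpha(t) = q_\alpha(t) - r_\alpha(t)$ as
\[ d_\alpha(t) = d_\alpha(t_0) + (t - t_0)\dot d_\alpha(t_0) + \int_{t_0}^t(t-s)\ddot d_\alpha(s)\,ds, \qquad \dot d_\alpha(t) = \dot d_\alpha(t_0) + \int_{t_0}^t\ddot d_\alpha(s)\,ds. \]
We then obtain for $t\in[t_0, \min\{\tau-\delta_0, T\}\varepsilon^{-3/2}]$ from (4.6) that
\[ D(t) \le D(t_0) + (t - t_0)\bar D(t_0) + C\varepsilon^3\int_{t_0}^t(t-s)D(s)\,ds + C\varepsilon^{5/2}\int_{t_0}^t(t-s)\bar D(s)\,ds + C\sqrt{\varepsilon}, \tag{4.7} \]
\[ \bar D(t) \le \bar D(t_0) + C\varepsilon^3\int_{t_0}^t D(s)\,ds + C\varepsilon^{5/2}\int_{t_0}^t\bar D(s)\,ds + C\varepsilon^2, \tag{4.8} \]
where
\[ D(t) = \max_{1\le\alpha\le N}\ \max_{s\in[t_0,t]}|d_\alpha(s)| \quad\text{and}\quad \bar D(t) = \max_{1\le\alpha\le N}\ \max_{s\in[t_0,t]}|\dot d_\alpha(s)|. \]
Application of Gronwall's lemma to (4.8) yields
\[ \bar D(t) \le C\Big(\bar D(t_0) + \varepsilon^2 + \varepsilon^3\int_{t_0}^t D(s)\,ds\Big), \tag{4.9} \]
and utilizing this in (4.7) implies
\[ D(t) \le D(t_0) + (t - t_0)\bar D(t_0) + C\sqrt{\varepsilon} + C\varepsilon^{-1/2}\big(\bar D(t_0) + \varepsilon^2\big) + C\varepsilon^3\int_{t_0}^t(t-s)D(s)\,ds. \]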
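The powers of $\varepsilon$ appearing in (4.7)–(4.9) come from integrating over a time interval of length at most $C\varepsilon^{-3/2}$. The following Python sketch only tracks exponents and is meant as bookkeeping, not as part of the argument; the variable names are assumptions made for this illustration.

```python
from fractions import Fraction as F

time_scale = F(-3, 2)   # interval length is O(eps^(-3/2))
remainder = F(7, 2)     # remainder O(eps^(7/2)) in (4.6)

# One time integration of the remainder gives the C*eps^2 term in (4.8).
velocity_error = remainder + time_scale
# Two time integrations give the C*sqrt(eps) term in (4.7).
position_error = remainder + 2 * time_scale

# Size of the Gronwall kernels over the whole interval:
# eps^(5/2) * length in (4.8), and eps^3 * length^2 in (4.7).
kernel_velocity = F(5, 2) + time_scale
kernel_position = F(3) + 2 * time_scale

print(velocity_error, position_error, kernel_velocity, kernel_position)
```

Both kernel exponents are nonnegative, so the Gronwall factors stay bounded uniformly in $\varepsilon$ on the time interval considered.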

Finally, $(t - s) \le C\varepsilon^{-3/2}$ yields upon a further application of Gronwall's lemma that
\[ D(t) \le C\big(D(t_0) + \varepsilon^{-3/2}\bar D(t_0) + \sqrt{\varepsilon}\big), \qquad t\in[t_0, \min\{\tau-\delta_0, T\}\varepsilon^{-3/2}]. \tag{4.10} \]
By assumption $D(t_0) = 0 = \bar D(t_0)$. Therefore (4.10) and (4.9) imply (2.14). This completes the proof of Theorem 2.2. □

5. Appendix A: Proof of Lemma 2.1

This appendix concerns the proof of Lemma 2.1. We split the proof into three subsections.


5.1. Bounding the particle distances and the velocities. We intend to use energy conservation to show (2.8), and for that reason we calculate with (2.3) the field energy
\[ \begin{aligned}
H_F(0) &= \frac12\int d^3x\,[E^2(x,0) + B^2(x,0)]\\
&= \frac12\sum_{\alpha=1}^N\int d^3x\,\big[E_{v_\alpha^0}^2(x - q_\alpha^0) + B_{v_\alpha^0}^2(x - q_\alpha^0)\big]\\
&\quad + \frac12\sum_{\alpha,\beta=1,\ \alpha\neq\beta}^N\int d^3x\,\big[E_{v_\alpha^0}(x - q_\alpha^0)\cdot E_{v_\beta^0}(x - q_\beta^0) + B_{v_\alpha^0}(x - q_\alpha^0)\cdot B_{v_\beta^0}(x - q_\beta^0)\big].
\end{aligned} \]
According to (2.4) and [6, Sect. 2] the first term equals
\[ H_F^{(1)}(0) = \sum_{\alpha=1}^N e_\alpha^2\,\frac12\int d^3k\,|\hat\varphi(k)|^2k^{-2}\Big[\frac{1}{|v_\alpha^0|}\log\frac{1 + |v_\alpha^0|}{1 - |v_\alpha^0|} - 1\Big]. \]
Denoting the term in $[\dots]$ as $\psi(|v_\alpha^0|)$, $\psi(r)$ is even, and hence Taylor expansion implies $\psi(r) = 1 + O(r^2)$ for $r$ small. Therefore (2.2) yields
\[ H_F^{(1)}(0) = E_{\mathrm{Coul}} + O(\varepsilon), \]
with $E_{\mathrm{Coul}}$ from (2.7). To deal with the contributions for $\alpha\neq\beta$ in the second term, we obtain by passing to Fourier transformed form and observing (2.2) that e.g.
\[ \int d^3x\,E_{v_\alpha^0}(x - q_\alpha^0)\cdot E_{v_\beta^0}(x - q_\beta^0) = e_\alpha e_\beta\int d^3k\,|\hat\varphi(k)|^2k^{-2}e^{ik\cdot(q_\alpha^0 - q_\beta^0)} + O(\varepsilon) = O(\varepsilon), \]
the latter with (2.1) and by passing to polar coordinates. Thus we have shown
\[ H_F(0) = E_{\mathrm{Coul}} + O(\varepsilon). \tag{5.1} \]
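The behaviour of the bracket $\psi(r) = r^{-1}\log\frac{1+r}{1-r} - 1$ used above can be checked numerically. The short Python sketch below is an illustration only: it evaluates $\psi$ at a few hypothetical sample points, compares $\psi(r)$ with $\psi(-r)$, and prints $(\psi(r) - 1)/r^2$, which should approach $2/3$ because $\log\frac{1+r}{1-r} = 2(r + r^3/3 + r^5/5 + \dots)$.

```python
import math

def psi(r: float) -> float:
    """psi(r) = (1/r) * log((1 + r) / (1 - r)) - 1 for 0 < |r| < 1."""
    return math.log((1.0 + r) / (1.0 - r)) / r - 1.0

for r in (0.2, 0.1, 0.05):
    # columns: r, psi(r), psi(-r), (psi(r) - 1) / r^2
    print(r, psi(r), psi(-r), (psi(r) - 1.0) / r**2)
```

Since $|v_\alpha^0| \le C_3\sqrt{\varepsilon}$ by (2.2), the correction $\psi(|v_\alpha^0|) - 1$ is of order $\varepsilon$, as used in the estimate of $H_F^{(1)}(0)$.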

Next we will investigate the field energy at time $t > 0$. We claim that
\[ H_F(t) = \frac12\int d^3x\,[E^2(x,t) + B^2(x,t)] \ge \frac12\int d^3x\,E^2(x,t) \ge -\frac12\big(\rho(\cdot,t), \Delta^{-1}\rho(\cdot,t)\big)_{L^2(\mathbb{R}^3)}. \tag{5.2} \]
The easiest way to see this is to introduce potentials $A$ and $\phi$,
\[ B(x,t) = \nabla\wedge A(x,t), \qquad E(x,t) = -\nabla\phi(x,t) - F(x,t), \quad\text{with } F(x,t) = \frac{\partial A}{\partial t}(x,t), \]
for the electromagnetic field. Then $\rho = \nabla\cdot E = -\Delta\phi - \nabla\cdot F$, and the estimate in (5.2) follows by passing to Fourier transformed form. On the other hand, substituting $\rho$ from (1.2) into
\[ -\frac12\big(\rho(\cdot,t), \Delta^{-1}\rho(\cdot,t)\big)_{L^2(\mathbb{R}^3)} = \frac12\int d^3k\,|\hat\rho(k,t)|^2k^{-2}, \]

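by assumption (2.9) we can argue exactly as before to show that the terms with $\alpha\neq\beta$ are $O(\varepsilon)$, and thus
\[ H_F(t) \ge \frac12\sum_{\alpha=1}^N\int d^3k\,|\hat\rho_\alpha(k)|^2k^{-2} + O(\varepsilon) = E_{\mathrm{Coul}} + O(\varepsilon) \tag{5.3} \]
for $t\in[0, T\varepsilon^{-3/2}]$. Consequently for $t\in[0, T\varepsilon^{-3/2}]$ by energy conservation, cf. (1.6), by (5.1) and (5.3),
\[ \begin{aligned}
\sum_{\alpha=1}^N m_{b\alpha}\gamma(v_\alpha^0) + E_{\mathrm{Coul}} + O(\varepsilon) &= \sum_{\alpha=1}^N m_{b\alpha}\gamma(v_\alpha^0) + H_F(0)\\
&= \sum_{\alpha=1}^N m_{b\alpha}\gamma(v_\alpha(t)) + H_F(t)\\
&\ge \sum_{\alpha=1}^N m_{b\alpha}\gamma(v_\alpha(t)) + E_{\mathrm{Coul}} + O(\varepsilon),
\end{aligned} \]
with $\gamma(v) = (1 - v^2)^{-1/2}$. Thus
\[ \sum_{\alpha=1}^N m_{b\alpha}\gamma(v_\alpha^0) + C\varepsilon \ge \sum_{\alpha=1}^N m_{b\alpha}\gamma(v_\alpha(t)), \qquad t\in[0, T\varepsilon^{-3/2}], \tag{5.4} \]
with some constant $C$ depending on $C_1$, $C_3$, $C_*$, $T$. This estimate now allows us to prove (2.8). Define
\[ I_+ = \{\alpha\in\{1,\dots,N\} : \gamma(v_\alpha(t)) \le \gamma(v_\alpha^0)\} \quad\text{and}\quad I_- = \{\alpha\in\{1,\dots,N\} : \gamma(v_\alpha(t)) > \gamma(v_\alpha^0)\}. \]
For $\alpha\in I_+$ we have $|v_\alpha(t)| \le |v_\alpha^0| \le C_3\sqrt{\varepsilon}$ by (2.2). Thus for $\varepsilon$ so small that $C_3^2\varepsilon \le 1/2$,
\[ \gamma(v_\alpha^0) - \gamma(v_\alpha(t)) \le \sqrt2\,\big|(v_\alpha^0)^2 - (v_\alpha(t))^2\big| \le C\varepsilon. \]
Therefore by (5.4),
\[ C\varepsilon \ge \sum_{\alpha\in I_-} m_{b\alpha}\big[\gamma(v_\alpha(t)) - \gamma(v_\alpha^0)\big]. \]
Since $m_{b\alpha} > 0$ we deduce that
\[ \gamma(v_\alpha(t)) \le \gamma(v_\alpha^0) + C\varepsilon, \qquad \alpha\in I_-, \]
and according to $|v_\alpha^0| \le C_3\sqrt{\varepsilon}$ it then follows that $|v_\alpha(t)| \le C\sqrt{\varepsilon}$ also for $\alpha\in I_-$. This concludes the proof of (2.8). Using (2.1) and (2.8) it is finally easy to derive the upper bound in (2.6), since for $t\in[0, T\varepsilon^{-3/2}]$ we have
\[ |q_\alpha(t) - q_\beta(t)| \le |q_\alpha^0 - q_\beta^0| + |q_\alpha(t) - q_\alpha^0| + |q_\beta(t) - q_\beta^0| \le C_2\varepsilon^{-1} + 2C_vT\sqrt{\varepsilon}\,\varepsilon^{-3/2} = C^*\varepsilon^{-1}, \]
with $C^* = C_2 + 2C_vT$. We remark that for the estimates in this section the smallness of the $e_\alpha$ was not needed.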
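The elementary inequality $|\gamma(a) - \gamma(b)| \le \sqrt2\,|a^2 - b^2|$ for $a^2, b^2 \le 1/2$, used above for $\alpha\in I_+$, follows from the mean value theorem, since $\frac{d}{dx}(1-x)^{-1/2} = \frac12(1-x)^{-3/2} \le \sqrt2$ for $x\le 1/2$. The Python sketch below is a randomized sanity check rather than a proof; the sample size, the seed and the cutoff for nearly equal arguments are arbitrary choices made for this illustration.

```python
import math
import random

def gamma_of_square(x2: float) -> float:
    """Lorentz factor as a function of the squared speed x2 = v^2 < 1."""
    return 1.0 / math.sqrt(1.0 - x2)

random.seed(0)
worst_ratio = 0.0
for _ in range(10_000):
    a2 = random.uniform(0.0, 0.5)
    b2 = random.uniform(0.0, 0.5)
    if abs(a2 - b2) < 1e-6:
        continue  # skip nearly equal samples to avoid rounding noise
    ratio = abs(gamma_of_square(a2) - gamma_of_square(b2)) / abs(a2 - b2)
    worst_ratio = max(worst_ratio, ratio)

print(worst_ratio, math.sqrt(2.0))
```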


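5.2. Bounding $|\dot v_\alpha(t)|$. Since
\[ \frac{d}{dt}\big(m_{b\alpha}\gamma_\alpha v_\alpha(t)\big) = m_{0\alpha}(v_\alpha(t))\dot v_\alpha(t), \]
with the $(3\times3)$-matrices $m_{0\alpha}(v_\alpha)$ given through $m_{0\alpha}(v_\alpha)(z) = m_{b\alpha}(\gamma_\alpha z + \gamma_\alpha^3(v_\alpha\cdot z)v_\alpha)$, $z\in\mathbb{R}^3$, we obtain from (1.5) that for $\alpha = 1,\dots,N$,
\[ \begin{aligned}
\dot v_\alpha &= m_{0\alpha}(v_\alpha)^{-1}\int d^3x\,\rho_\alpha(x - q_\alpha)\Big\{[E(x) - E_{v_\alpha}(x - q_\alpha)] + v_\alpha\wedge[B(x) - B_{v_\alpha}(x - q_\alpha)]\Big\}\\
&= m_{0\alpha}(v_\alpha)^{-1}\int d^3x\,\rho_\alpha(x)\big[Z_1(x + q_\alpha, t) + v_\alpha\wedge Z_2(x + q_\alpha, t)\big] + R_\alpha(t),
\end{aligned} \tag{5.5} \]
where $m_{0\alpha}(v_\alpha)^{-1}z = m_{b\alpha}^{-1}\gamma_\alpha^{-1}(z - (v_\alpha\cdot z)v_\alpha)$, $z\in\mathbb{R}^3$, is the matrix inverse of $m_{0\alpha}(v_\alpha)$. For (5.5) it is important to note that adding the $E_{v_\alpha}(x - q_\alpha)$-term and the $v_\alpha\wedge B_{v_\alpha}(x - q_\alpha)$-term does not change the integral, as may be seen through Fourier transform using (2.4) and (2.5). Moreover, in (5.5) we have set
\[ R_\alpha(t) = m_{0\alpha}(v_\alpha)^{-1}\sum_{\beta=1,\ \beta\neq\alpha}^N\int d^3x\,\rho_\alpha(x - q_\alpha)\big[E_{v_\beta}(x - q_\beta) + v_\alpha\wedge B_{v_\beta}(x - q_\beta)\big] \tag{5.6} \]
and
\[ Z(x,t) = \begin{pmatrix} Z_1(x,t)\\ Z_2(x,t)\end{pmatrix} = \begin{pmatrix} E(x,t) - \sum_{\beta=1}^N E_{v_\beta(t)}(x - q_\beta(t))\\ B(x,t) - \sum_{\beta=1}^N B_{v_\beta(t)}(x - q_\beta(t))\end{pmatrix}. \]
Maxwell's equations and the relations $(v\cdot\nabla)E_v(x) = -\nabla\wedge B_v(x) + e\varphi(x)v$, $(v\cdot\nabla)B_v(x) = \nabla\wedge E_v(x)$, $e = e_\alpha$ for index $\alpha$, yield
\[ \dot Z(t) = AZ(t) - f(t), \quad\text{with } A = \begin{pmatrix} 0 & \nabla\wedge\\ -\nabla\wedge & 0\end{pmatrix} \tag{5.7} \]
and
\[ f(x,t) = \sum_{\beta=1}^N\begin{pmatrix}(\dot v_\beta(t)\cdot\nabla_v)E_{v_\beta(t)}(x - q_\beta(t))\\ (\dot v_\beta(t)\cdot\nabla_v)B_{v_\beta(t)}(x - q_\beta(t))\end{pmatrix}. \tag{5.8} \]
The Maxwell operator $A$ generates a $C_0$-group $U(t)$, $t\in\mathbb{R}$, of isometries in $L^2(\mathbb{R}^3)^3\oplus L^2(\mathbb{R}^3)^3$; see [2, p. 435; (H2)]. Therefore we have the mild solution representation
\[ Z(x,t) = [U(t)Z(\cdot,0)](x) - \int_0^t ds\,[U(t-s)f(\cdot,s)](x). \tag{5.9} \]
According to (2.3), $Z(0) = 0$, so the first term drops out. To estimate the remaining term, we first state and prove some auxiliary lemmas that will be used frequently.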
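The explicit inverse $m_{0\alpha}(v_\alpha)^{-1}z = m_{b\alpha}^{-1}\gamma_\alpha^{-1}(z - (v_\alpha\cdot z)v_\alpha)$ is a special case of the rank-one inversion formula $(a\,\mathrm{id} + b\,v\otimes v)^{-1} = a^{-1}\mathrm{id} - \frac{b}{a(a + bv^2)}\,v\otimes v$ with $a = m_{b\alpha}\gamma_\alpha$, $b = m_{b\alpha}\gamma_\alpha^3$ and $1 + \gamma_\alpha^2v_\alpha^2 = \gamma_\alpha^2$. The following Python sketch checks this numerically for one sample; the velocity and the bare mass are hypothetical values chosen only for illustration.

```python
import math

def outer_form(a, b, v):
    """Return the 3x3 matrix a*Id + b*(v outer v) as nested lists."""
    return [[a * (i == j) + b * v[i] * v[j] for j in range(3)] for i in range(3)]

def matmul(A, B):
    """Plain 3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

v = (0.3, -0.2, 0.1)            # hypothetical velocity with |v| < 1
v2 = sum(c * c for c in v)
gamma = 1.0 / math.sqrt(1.0 - v2)
m_b = 1.7                       # hypothetical bare mass

a, b = m_b * gamma, m_b * gamma**3
M = outer_form(a, b, v)
# General rank-one formula (note the minus sign).
M_inv_general = outer_form(1.0 / a, -b / (a * (a + b * v2)), v)
# Explicit form stated in the text.
M_inv_explicit = outer_form(1.0 / (m_b * gamma), -1.0 / (m_b * gamma), v)

product = matmul(M, M_inv_general)
print(product)
```

The product is the identity up to rounding, and the two expressions for the inverse agree entry by entry.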

Lemma 5.1. For given $f = (f_1, f_2)$ with $\nabla\cdot f_1 = 0$ and $\nabla\cdot f_2 = 0$ we have for $W(t,s,x) = (W_1(t,s,x), W_2(t,s,x)) = [U(t-s)f(\cdot,s)](x)$,
\[ W_1(t,s,x) = \frac{1}{4\pi(t-s)^2}\int_{|y-x|=(t-s)}d^2y\,\Big[(t-s)\nabla\wedge f_2(y,s) + f_1(y,s) + ((y-x)\cdot\nabla)f_1(y,s)\Big], \]
\[ W_2(t,s,x) = \frac{1}{4\pi(t-s)^2}\int_{|y-x|=(t-s)}d^2y\,\Big[-(t-s)\nabla\wedge f_1(y,s) + f_2(y,s) + ((y-x)\cdot\nabla)f_2(y,s)\Big]. \]

Proof. See [6, Lemma 8.1]. □

Lemma 5.2. (a) Let $\xi(s)\ge0$ be some function. Assume that for $y\in\mathbb{R}^3$, $s\in[0,t]$, and some $f(y,s) = (f_1(y,s), f_2(y,s))$ with $\nabla\cdot f_1 = 0 = \nabla\cdot f_2$,
\[ |f_1(y,s)| + |f_2(y,s)| \le C\xi(s)\sum_{\beta=1}^N\frac{1}{1 + |y - q_\beta(s)|^2}, \tag{5.10} \]
\[ |\nabla f_1(y,s)| + |\nabla f_2(y,s)| \le C\xi(s)\sum_{\beta=1}^N\frac{1}{1 + |y - q_\beta(s)|^3}. \tag{5.11} \]
Then for each $\alpha = 1,\dots,N$, $t\in[0, T\varepsilon^{-3/2}]$, and $|x|\le R_\varphi$,
\[ \Big|\int_0^t ds\,[U(t-s)f(\cdot,s)](x + q_\alpha(t))\Big| \le C\sup_{s\in[0,t]}\xi(s). \]

(b) Under the hypotheses of (a), if instead of (5.10) and (5.11) it holds for fixed $1\le\alpha\le N$ that
\[ |f_1(y,s)| + |f_2(y,s)| \le C\xi(s)\sum_{\beta=1,\ \beta\neq\alpha}^N\frac{1}{1 + |y - q_\beta(s)|^3}, \tag{5.12} \]
\[ |\nabla f_1(y,s)| + |\nabla f_2(y,s)| \le C\xi(s)\sum_{\beta=1,\ \beta\neq\alpha}^N\frac{1}{1 + |y - q_\beta(s)|^4}, \tag{5.13} \]
then for $t\in[0, T\varepsilon^{-3/2}]$ and $|x|\le R_\varphi$ we even have
\[ \Big|\int_0^t ds\,[U(t-s)f(\cdot,s)](x + q_\alpha(t))\Big| \le C\Big(\sup_{s\in[0,t]}\xi(s)\Big)\varepsilon. \]

(c) Let $\xi(\tau,s)\ge0$ be some function. Assume that for $y\in\mathbb{R}^3$, $\tau\in[0,t]$, $s\in[0,\tau]$, and some $g(y,\tau,s) = (g_1(y,\tau,s), g_2(y,\tau,s))$ with $\nabla\cdot g_1 = 0 = \nabla\cdot g_2$ that
\[ |g_1(y,\tau,s)| + |g_2(y,\tau,s)| \le C\xi(\tau,s)\sum_{\alpha=1}^N\frac{1}{1 + |y - q_\alpha(s)|^3}, \tag{5.14} \]
\[ |\nabla g_1(y,\tau,s)| + |\nabla g_2(y,\tau,s)| \le C\xi(\tau,s)\sum_{\alpha=1}^N\frac{1}{1 + |y - q_\alpha(s)|^4}. \tag{5.15} \]
Then for each $\alpha = 1,\dots,N$, $t\in[0, T\varepsilon^{-3/2}]$, and $|x|\le R_\varphi$,
\[ \Big|\int_0^t d\tau\int_0^\tau ds\,[U(t-s)g(\cdot,\tau,s)](x + q_\alpha(t))\Big| \le C\sup_{(\tau,s)\in\Delta_t}\xi(\tau,s), \]
where $\Delta_t = \{(\tau,s) : \tau\in[0,t],\ s\in[0,\tau]\}$. In (a)–(c), all constants $C$ on the right-hand sides are independent of $\alpha$, $t$, and $x$.

Proof. (a) Define $W$ as in Lemma 5.1. We derive the estimates with $W_1$. Fix $1\le\alpha\le N$, $t\in[0, T\varepsilon^{-3/2}]$, $s\in[0,t]$, and $|x|\le R_\varphi$. According to Lemma 5.1, (5.10), and (5.11),
\[ |W_1(t,s,x + q_\alpha(t))| \le C\,\frac{\xi(s)}{(t-s)^2}\sum_{\beta=1}^N I_{\alpha\beta}^{(2)}(t,s,x), \]
with
\[ I_{\alpha\beta}^{(n)}(t,s,x) = \int_{|y - x - q_\alpha(t)| = (t-s)}d^2y\,\Big[\frac{(t-s)}{1 + |y - q_\beta(s)|^{n+1}} + \frac{1}{1 + |y - q_\beta(s)|^n}\Big]. \tag{5.16} \]
In the sum, with general $n\ge2$ in (5.16), we first consider the term $I_{\alpha\alpha}^{(n)}(t,s,x)$, i.e., the one with $\beta = \alpha$. In this case according to (2.8),
\[ |y - q_\beta(s)| \ge |y - x - q_\alpha(t)| - |x| - |q_\alpha(t) - q_\alpha(s)| \ge (t-s) - R_\varphi - C\sqrt{\varepsilon}(t-s) \ge (t-s)/2 - R_\varphi \]
for $\varepsilon$ small. Therefore $|y - q_\beta(s)| \ge (t-s)/4$ for $s\le t - 4R_\varphi$. We hence obtain for $\beta = \alpha$ and $s\le t - 4R_\varphi$,
\[ I_{\alpha\alpha}^{(n)}(t,s,x) \le C\,\frac{(t-s)^2}{1 + (t-s)^n}. \tag{5.17} \]
On the other hand, for $s\in[t - 4R_\varphi, t]$,
\[ I_{\alpha\alpha}^{(n)}(t,s,x) \le C(t-s)^2[(t-s) + 1] \le C(t-s)^2[4R_\varphi + 1] \le C(t-s)^2\frac{1}{1 + (4R_\varphi)^n} \le C\,\frac{(t-s)^2}{1 + (t-s)^n}. \]
Hence the estimate (5.17) holds for any $s\in[0,t]$. Since
\[ \int_0^t\frac{ds}{1 + (t-s)^2} \le C, \qquad \int_0^t d\tau\int_0^\tau\frac{ds}{1 + (t-s)^3} \le C, \]
the term with $\beta = \alpha$ will satisfy the claimed estimates not only in (a), but also in (c).

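Next we turn to deriving a bound for $I_{\alpha\beta}^{(2)}(t,s,x)$ with $\beta\neq\alpha$. First note that for some portion of the interval $[0,t]$ the preceding argument applies again. For this, define $t_0 = 4(R_\varphi + C^*\varepsilon^{-1})$. Then for $s\le t - t_0$ we find by (2.8) for $\varepsilon$ small that on the $y$-sphere,
\[ \begin{aligned}
|y - q_\beta(s)| &\ge |y - x - q_\alpha(t)| - |x| - |q_\alpha(t) - q_\beta(s)|\\
&\ge (t-s) - R_\varphi - |q_\alpha(t) - q_\beta(t)| - |q_\beta(t) - q_\beta(s)|\\
&\ge (t-s) - R_\varphi - C^*\varepsilon^{-1} - C\sqrt{\varepsilon}(t-s)\\
&\ge (t-s)/2 - R_\varphi - C^*\varepsilon^{-1} \ge (t-s)/4.
\end{aligned} \]
Therefore as in (5.17) for general $n\ge2$,
\[ I_{\alpha\beta}^{(n)}(t,s,x) \le C\,\frac{(t-s)^2}{1 + (t-s)^n}, \qquad s\in[0, t - t_0], \tag{5.18} \]
and it remains to estimate $I_{\alpha\beta}^{(2)}(t,s,x)$ for $\beta\neq\alpha$ and $s\in[t - t_0, t]$. To do so, we note that an explicit computation shows for $z_1, z_2\in\mathbb{R}^3$ and $\gamma\ge0$,
\[ \begin{aligned}
\int_{|y - z_1| = \gamma}d^2y\,\frac{1}{1 + |y - z_2|^2} &= \frac{\pi\gamma}{|z_1 - z_2|}\log\frac{1 + (\gamma + |z_1 - z_2|)^2}{1 + (\gamma - |z_1 - z_2|)^2}\\
&= \frac{\pi\gamma}{|z_1 - z_2|}\log\Big(1 + \frac{4\gamma|z_1 - z_2|}{1 + (\gamma - |z_1 - z_2|)^2}\Big)\\
&\le \frac{4\pi\gamma^2}{1 + (\gamma - |z_1 - z_2|)^2},
\end{aligned} \tag{5.19} \]
as $\log(1 + A)\le A$ for $A\ge0$. Similarly, for $n\ge2$,
\[ \begin{aligned}
\int_{|y - z_1| = \gamma}d^2y\,\frac{1}{1 + |y - z_2|^{n+1}} &= 2\pi\gamma^2\int_{-1}^1 dr\,\frac{1}{1 + \big(|z_1 - z_2|^2 + 2\gamma|z_1 - z_2|r + \gamma^2\big)^{(n+1)/2}}\\
&\le C\gamma^2\int_{-1}^1 dr\,\frac{1}{\big(1 + |z_1 - z_2|^2 + 2\gamma|z_1 - z_2|r + \gamma^2\big)^{(n+1)/2}}\\
&= C_n\frac{\gamma}{|z_1 - z_2|}\Big[\frac{1}{\big(1 + (|z_1 - z_2| - \gamma)^2\big)^{(n-1)/2}} - \frac{1}{\big(1 + (|z_1 - z_2| + \gamma)^2\big)^{(n-1)/2}}\Big].
\end{aligned} \tag{5.20} \]
So in particular
\[ \int_{|y - z_1| = \gamma}d^2y\,\frac{1}{1 + |y - z_2|^{n+1}} \le C\,\frac{\gamma}{|z_1 - z_2|}, \qquad n\ge2. \tag{5.21} \]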
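The closed form and the upper bound in (5.19) can be compared against a direct quadrature. The Python sketch below is illustrative only: it assumes the standard reduction of the surface integral to a one-dimensional integral over $r = \cos\theta\in[-1,1]$, using $|y - z_2|^2 = \gamma^2 + d^2 + 2\gamma dr$ with $d = |z_1 - z_2|$ (the same parametrization that appears in (5.20)), and evaluates it with the midpoint rule. The sample values of $\gamma$ and $d$ and the function names are hypothetical.

```python
import math

def sphere_integral(radius: float, d: float, steps: int = 20_000) -> float:
    """Midpoint rule for 2*pi*radius^2 * int_{-1}^{1} dr / (1 + radius^2 + d^2 + 2*radius*d*r)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        r = -1.0 + (i + 0.5) * h
        total += 1.0 / (1.0 + radius * radius + d * d + 2.0 * radius * d * r)
    return 2.0 * math.pi * radius * radius * total * h

def closed_form(radius: float, d: float) -> float:
    """First line of (5.19)."""
    return (math.pi * radius / d) * math.log(
        (1.0 + (radius + d) ** 2) / (1.0 + (radius - d) ** 2))

def upper_bound(radius: float, d: float) -> float:
    """Last line of (5.19)."""
    return 4.0 * math.pi * radius**2 / (1.0 + (radius - d) ** 2)

sample_radius, sample_d = 3.0, 2.5   # hypothetical sample values
numeric = sphere_integral(sample_radius, sample_d)
exact = closed_form(sample_radius, sample_d)
bound = upper_bound(sample_radius, sample_d)
print(numeric, exact, bound)
```

The quadrature agrees with the logarithmic closed form, and both lie below the bound obtained from $\log(1 + A)\le A$.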

Below we will also need some more refined estimates, and for this purpose we note that according to (5.20) also
\[ \int_{|y - z_1| = \gamma}d^2y\,\frac{1}{1 + |y - z_2|^3} \le C\,\frac{\gamma^2}{1 + (|z_1 - z_2| + \gamma)^2} \le C\,\frac{\gamma^2}{|z_1 - z_2|^2}. \tag{5.22} \]
Analogously we obtain
\[ \int_{|y - z_1| = \gamma}d^2y\,\frac{1}{1 + |y - z_2|^4} \le C\min\Big\{1, \frac{\gamma^2}{|z_1 - z_2|^2}\Big\}\frac{1}{1 + (|z_1 - z_2| - \gamma)^2}. \tag{5.23} \]
To bound $I_{\alpha\beta}^{(2)}(t,s,x)$ for $\beta\neq\alpha$ and $s\in[t - t_0, t]$ we then use (5.21) and (5.19) with $z_1 = x + q_\alpha(t)$, $z_2 = q_\beta(s)$, and $\gamma = t - s$ to obtain for $s\in[t - t_0, t]$,
\[ I_{\alpha\beta}^{(2)}(t,s,x) \le C\Big[\frac{(t-s)^2}{|x + q_\alpha(t) - q_\beta(s)|} + \frac{(t-s)^2}{1 + [(t-s) - |x + q_\alpha(t) - q_\beta(s)|]^2}\Big]. \tag{5.24} \]
Therefore by (5.18) and (5.24),
\[ \begin{aligned}
\int_0^t ds\,\frac{\xi(s)}{(t-s)^2}I_{\alpha\beta}^{(2)}(t,s,x) &\le \sup_{s\in[0,t]}\xi(s)\Big[\int_0^{t-t_0}\frac{ds}{(t-s)^2}I_{\alpha\beta}^{(2)}(t,s,x) + \int_{t-t_0}^t\frac{ds}{(t-s)^2}I_{\alpha\beta}^{(2)}(t,s,x)\Big]\\
&\le C\sup_{s\in[0,t]}\xi(s)\Big[\int_0^{t-t_0}\frac{ds}{1 + (t-s)^2} + \int_{t-t_0}^t\frac{ds}{|x + q_\alpha(t) - q_\beta(s)|}\\
&\qquad\qquad + \int_{t-t_0}^t\frac{ds}{1 + [(t-s) - |x + q_\alpha(t) - q_\beta(s)|]^2}\Big].
\end{aligned} \tag{5.25} \]
The first of the three integrals is bounded by a constant. Concerning the second, we have
\[ |x + q_\alpha(t) - q_\beta(s)| \ge |q_\alpha(t) - q_\beta(t)| - |x| - |q_\beta(t) - q_\beta(s)| \ge C_*\varepsilon^{-1} - R_\varphi - C\sqrt{\varepsilon}(t-s) \]
by (2.6) and (2.8). In the domain of integration $[t - t_0, t]$ it holds that $t - s\le t_0\le C\varepsilon^{-1}$, whence
\[ |x + q_\alpha(t) - q_\beta(s)| \ge C_*\varepsilon^{-1} - R_\varphi - C\varepsilon^{-1/2} \ge (C_*/2)\varepsilon^{-1}, \qquad s\in[t - t_0, t],\quad \beta\neq\alpha,\quad |x|\le R_\varphi, \tag{5.26} \]
for $\varepsilon$ small. Therefore the second integral can be bounded by $C\varepsilon\int_{t-t_0}^t ds \le C\varepsilon t_0 \le C$. To estimate the last integral $=: J$ on the right-hand side of (5.25), we substitute $\theta = t - s$ to obtain
\[ J = \int_0^{t_0}\frac{d\theta}{1 + [\theta - r(\theta)]^2} \tag{5.27} \]

with $r(\theta) = |x + q_\alpha(t) - q_\beta(t - \theta)|$. Observe that $|\dot r(\theta)| \le |\dot q_\beta(t - \theta)| \le C\sqrt{\varepsilon}$ by (2.8). Thus $\theta\mapsto\chi(\theta) = \theta - r(\theta)$ is strictly increasing, and we can substitute $\theta = \theta(\chi)$ to get
\[ J = \int_{\chi(0)}^{\chi(t_0)}\frac{d\chi}{(1 - \dot r(\theta))(1 + \chi^2)} \le C\int_{\mathbb{R}}\frac{d\chi}{1 + \chi^2} \le C. \]
Summarizing these estimates we obtain the bound claimed in part (a) of the lemma.

(b) Defining $I_{\alpha\beta}^{(n)}$ as in (5.16), we need to show
\[ \int_0^t\frac{ds}{(t-s)^2}I_{\alpha\beta}^{(3)}(t,s,x) \le C\varepsilon, \qquad \beta\neq\alpha. \tag{5.28} \]
By (5.18),
\[ \int_0^{t-t_0}\frac{ds}{(t-s)^2}I_{\alpha\beta}^{(3)}(t,s,x) \le C\int_0^{t-t_0}ds\,\frac{1}{(t-s)}\,\frac{1}{1 + (t-s)^2}. \]
In the domain of integration, $(t-s)\ge t_0\ge C\varepsilon^{-1}$, so that $(t-s)^{-1}\le C\varepsilon$, and hence
\[ \int_0^{t-t_0}\frac{ds}{(t-s)^2}I_{\alpha\beta}^{(3)}(t,s,x) \le C\varepsilon\int_0^t\frac{ds}{1 + (t-s)^2} \le C\varepsilon. \tag{5.29} \]
Thus it remains to estimate the part of the integral in (5.28) for $s\in[t - t_0, t]$. Firstly, by (5.22),
\[ \int_{t-t_0}^t\frac{ds}{(t-s)^2}\int_{|y - x - q_\alpha(t)| = (t-s)}d^2y\,\frac{1}{1 + |y - q_\beta(s)|^3} \le C\int_{t-t_0}^t\frac{ds}{|x + q_\alpha(t) - q_\beta(s)|^2} \le C\varepsilon^2\int_{t-t_0}^t ds = C\varepsilon^2t_0 \le C\varepsilon. \tag{5.30} \]
Here we have used $|x + q_\alpha(t) - q_\beta(s)| \ge (C_*/2)\varepsilon^{-1}$ for $\varepsilon$ small, cf. (5.26). Reference to this is possible, since we again have that $\beta\neq\alpha$. Analogously we infer from (5.23) that
\[ \begin{aligned}
\int_{t-t_0}^t\frac{ds}{(t-s)^2}\int_{|y - x - q_\alpha(t)| = (t-s)}d^2y\,\frac{(t-s)}{1 + |y - q_\beta(s)|^4} &\le C\int_{t-t_0}^t ds\,\frac{(t-s)}{|x + q_\alpha(t) - q_\beta(s)|^2}\,\frac{1}{1 + [(t-s) - |x + q_\alpha(t) - q_\beta(s)|]^2}\\
&\le C\varepsilon^2t_0\int_{t-t_0}^t\frac{ds}{1 + [(t-s) - |x + q_\alpha(t) - q_\beta(s)|]^2} = C\varepsilon J \le C\varepsilon,
\end{aligned} \]
with the bounded $J$ from (5.27). This together with (5.30) and (5.29) shows that (5.28) is satisfied.

(c) Due to the remarks in (a), (5.14), and (5.15) we only have to prove
\[ \int_0^t d\tau\int_0^\tau\frac{ds}{(t-s)^2}I_{\alpha\beta}^{(3)}(t,s,x) \le C, \qquad \beta\neq\alpha,\quad t\in[0, T\varepsilon^{-3/2}],\quad |x|\le R_\varphi. \tag{5.31} \]

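We decompose the domain of integration $\Delta_t = \{(\tau,s) : \tau\in[0,t],\ s\in[0,\tau]\}$ into $\Delta_{t,1} = \Delta_t\cap\{(\tau,s) : s\in[0, t - t_0]\}$ and $\Delta_{t,2} = \{(\tau,s) : \tau\in[t - t_0, t],\ s\in[t - t_0, \tau]\}$. On $\Delta_{t,1}$ we can utilize (5.18) to get
\[ \int\!\!\int_{\Delta_{t,1}}d\tau\,ds\,\frac{1}{(t-s)^2}I_{\alpha\beta}^{(3)}(t,s,x) \le C\int_0^t d\tau\int_0^\tau ds\,\frac{1}{1 + (t-s)^3} \le C. \tag{5.32} \]
Since again $t - s\le t_0\le C\varepsilon^{-1}$ for $(\tau,s)\in\Delta_{t,2}$, by (5.26) and (5.21)
\[ \begin{aligned}
\int\!\!\int_{\Delta_{t,2}}d\tau\,ds\,\frac{1}{(t-s)^2}\int_{|y - x - q_\alpha(t)| = (t-s)}d^2y\,\frac{1}{1 + |y - q_\beta(s)|^3} &\le C\int\!\!\int_{\Delta_{t,2}}d\tau\,ds\,\frac{1}{(t-s)}\,\frac{1}{|x + q_\alpha(t) - q_\beta(s)|}\\
&\le C\varepsilon\int_{t-t_0}^t d\tau\int_{t-t_0}^\tau\frac{ds}{t-s} = C\varepsilon\int_{t-t_0}^t ds = C\varepsilon t_0 \le C.
\end{aligned} \tag{5.33} \]
In addition, by (5.23)
\[ \begin{aligned}
\int\!\!\int_{\Delta_{t,2}}d\tau\,ds\,\frac{1}{(t-s)}\int_{|y - x - q_\alpha(t)| = (t-s)}d^2y\,\frac{1}{1 + |y - q_\beta(s)|^4} &\le C\int\!\!\int_{\Delta_{t,2}}d\tau\,ds\,\frac{1}{(t-s)}\,\frac{1}{1 + [(t-s) - |x + q_\alpha(t) - q_\beta(s)|]^2}\\
&= C\int_{t-t_0}^t\frac{ds}{1 + [(t-s) - |x + q_\alpha(t) - q_\beta(s)|]^2} \le C,
\end{aligned} \tag{5.34} \]
since the last integral is just $J$ from (5.27) and hence bounded. By (5.32), (5.33), and (5.34) we thus have proved (5.31). □

Lemma 5.3. Define $\phi_v(x)$ through $\hat\phi_v(k) = e\hat\varphi(k)/[k^2 - (k\cdot v)^2]$. Then for $x\in\mathbb{R}^3$ and $|v|\le\bar v < 1$, with $\nabla = \nabla_x$,
\[ \begin{aligned}
|\nabla\phi_v(x)| + |\nabla_v\nabla\phi_v(x)| + |\nabla_v^2\nabla\phi_v(x)| + |\nabla_v^3\nabla\phi_v(x)| &\le C|e|(1 + |x|)^{-2},\\
|\nabla^2\phi_v(x)| + |\nabla_v\nabla^2\phi_v(x)| + |\nabla_v^2\nabla^2\phi_v(x)| + |\nabla_v^3\nabla^2\phi_v(x)| &\le C|e|(1 + |x|)^{-3},\\
|\nabla^3\phi_v(x)| + |\nabla_v\nabla^3\phi_v(x)| + |\nabla_v^2\nabla^3\phi_v(x)| + |\nabla_v^3\nabla^3\phi_v(x)| &\le C|e|(1 + |x|)^{-4},\\
|\nabla^4\phi_v(x)| + |\nabla_v\nabla^4\phi_v(x)| + |\nabla_v^2\nabla^4\phi_v(x)| + |\nabla_v^3\nabla^4\phi_v(x)| &\le C|e|(1 + |x|)^{-5}.
\end{aligned} \]

Proof. Tedious calculations; see also the appendices of [5, 6]. □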

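Now we can estimate $\int_0^t ds\,[U(t-s)f(\cdot,s)](x + q_\alpha(t))$, cf. (5.9), for $t\in[0, T\varepsilon^{-3/2}]$ and $|x|\le R_\varphi$, using Lemma 5.1 and Lemma 5.2(a), with $f = (f_1, f_2)$ defined by (5.8). Since $\nabla\cdot B_v = 0$, and $\nabla\cdot E_v = e\varphi$ is independent of $v$, we have $\nabla\cdot f_1 = 0 = \nabla\cdot f_2$. Concerning (5.10) and (5.11), note $|\nabla_vE_v(x)| + |\nabla_vB_v(x)| \le C(|\nabla\phi_v(x)| + |\nabla_v\nabla\phi_v(x)|) \le C|e|(1 + |x|)^{-2}$ and $|\nabla_v\nabla E_v(x)| + |\nabla_v\nabla B_v(x)| \le C(|\nabla^2\phi_v(x)| + |\nabla_v\nabla^2\phi_v(x)|) \le C|e|(1 + |x|)^{-3}$ by Lemma 5.3. Thus (5.10) and (5.11) are satisfied with $\xi(s) = \big(\max_{1\le\beta\le N}|\dot v_\beta(s)|\big)\big(\max_{1\le\beta\le N}|e_\beta|\big)$. As $Z(x,0) = 0$, (5.9) in conjunction with Lemma 5.2(a) hence yields for $\alpha = 1,\dots,N$,
\[ |Z(x + q_\alpha(t), t)| \le C\Big(\sup_{s\in[0,t]}\max_{1\le\beta\le N}|\dot v_\beta(s)|\Big)\Big(\max_{1\le\beta\le N}|e_\beta|\Big), \qquad t\in[0, T\varepsilon^{-3/2}],\quad |x|\le R_\varphi. \tag{5.35} \]
We will utilize this further in (5.5), and to this end we also need to bound $R_\alpha(t)$ from (5.6). For fixed $\beta\neq\alpha$ one calculates for the interaction terms
\[ \begin{aligned}
\Psi_{\alpha\beta}(t) &= \int d^3x\,\rho_\alpha(x - q_\alpha(t))\nabla\phi_{v_\beta(t)}(x - q_\beta(t))\\
&= (-i)e_\alpha e_\beta\int d^3k\,k\,\frac{|\hat\varphi(k)|^2}{k^2 - (k\cdot v_\beta(t))^2}\,e^{ik\cdot[q_\beta(t) - q_\alpha(t)]}\\
&= \frac{e_\alpha e_\beta}{4\pi}\int\!\!\int d^3x\,d^3y\,\varphi(x - q_\alpha(t))\varphi(y - q_\beta(t))\nabla\zeta_{v_\beta(t)}(x - y),
\end{aligned} \tag{5.36} \]
with
\[ \zeta_v(x) = \frac{1}{[(1 - v^2)x^2 + (x\cdot v)^2]^{1/2}}, \qquad \hat\zeta_v(k) = \sqrt{\frac{2}{\pi}}\,\frac{1}{k^2 - (k\cdot v)^2}, \qquad |v| < 1. \tag{5.37} \]
Then $\sup_{t\in[0,T\varepsilon^{-3/2}]}|\nabla\zeta_{v_\beta(t)}(x)| \le C(1 + |x|)^{-2}$ due to (2.8). By (C), in (5.36) we only need to integrate over $(x,y)$ that have $|x - q_\alpha(t)|\le R_\varphi$ and $|y - q_\beta(t)|\le R_\varphi$. Then by (2.6), $|x - y| \ge |q_\alpha(t) - q_\beta(t)| - 2R_\varphi \ge C_*\varepsilon^{-1} - 2R_\varphi \ge (C_*/2)\varepsilon^{-1}$ for $\varepsilon$ small. Therefore (5.36) shows
\[ |\Psi_{\alpha\beta}(t)| \le C\varepsilon^2, \qquad t\in[0, T\varepsilon^{-3/2}],\quad \alpha\neq\beta. \tag{5.38} \]
By definition of $B_v(x)$ and $E_v(x)$ we have
\[ R_\alpha(t) = m_{0\alpha}(v_\alpha(t))^{-1}\sum_{\beta\neq\alpha}\Big(-\Psi_{\alpha\beta}(t) + [v_\beta(t)\cdot\Psi_{\alpha\beta}(t)]v_\beta(t) + v_\alpha(t)\wedge[-v_\beta(t)\wedge\Psi_{\alpha\beta}(t)]\Big), \tag{5.39} \]
cf. (5.6), and therefore (5.38) together with (2.8) implies
\[ |R_\alpha(t)| \le C\varepsilon^2, \qquad t\in[0, T\varepsilon^{-3/2}]. \tag{5.40} \]
Hence (5.5), (5.35), and (5.40) finally yield
\[ |\dot v_\alpha(t)| \le C\Big(\sup_{s\in[0,t]}\max_{1\le\beta\le N}|\dot v_\beta(s)|\Big)\Big(\max_{1\le\beta\le N}|e_\beta|\Big) + C\varepsilon^2 \]
for every $\alpha = 1,\dots,N$ and $t\in[0, T\varepsilon^{-3/2}]$. Choosing $\max_{1\le\beta\le N}|e_\beta| \le \bar e$ with $\bar e > 0$ sufficiently small, we find that for $\alpha = 1,\dots,N$,
\[ \sup_{t\in[0,T\varepsilon^{-3/2}]}|\dot v_\alpha(t)| \le C\varepsilon^2. \tag{5.41} \]
For later reference we also note that then according to (5.35),
\[ |Z(x + q_\alpha(t), t)| \le C\varepsilon^2, \qquad \alpha = 1,\dots,N,\quad t\in[0, T\varepsilon^{-3/2}],\quad |x|\le R_\varphi. \tag{5.42} \]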

5.3. Bounding $|\ddot v_\alpha(t)|$. By (2.8) we have in particular that
\[ |v_\alpha(t) - v_\beta(t)| \le C\sqrt{\varepsilon}, \qquad t\in[0, T\varepsilon^{-3/2}]. \tag{5.43} \]
In order to estimate the derivative of Eq. (5.5), first note that using the explicit form of $m_{0\alpha}(v_\alpha)^{-1}$ we obtain from (5.41) that
\[ \Big|\frac{d}{dt}\,m_{0\alpha}(v_\alpha(t))^{-1}\Big| \le C|\dot v_\alpha(t)| \le C\varepsilon^2. \tag{5.44} \]
Hence by (5.5), (5.42) and (5.40),
\[ |\ddot v_\alpha(t)| \le C\big(\varepsilon^4 + |M_\alpha(t)| + |\dot R_\alpha(t)|\big), \tag{5.45} \]
with $R_\alpha$ defined in (5.6), and
\[ M_\alpha(t) = \int d^3x\,\rho_\alpha(x)\Big[(L_\alpha(t)Z_1)(x + q_\alpha(t), t) + v_\alpha(t)\wedge(L_\alpha(t)Z_2)(x + q_\alpha(t), t)\Big], \tag{5.46} \]
where $L_\alpha(t)\phi = (v_\alpha(t)\cdot\nabla)\phi + \partial_t\phi$ for a function $\phi = \phi(x,t)$. We first estimate $M_\alpha(t)$. Let $\Sigma_\alpha(x,t) = (L_\alpha(t)Z)(x,t)$. Since generally $\frac{d}{dt}[L_\alpha(t)\phi] = L_\alpha(t)\dot\phi + (\dot v_\alpha\cdot\nabla)\phi$ and, see (5.7), $\dot Z = AZ - f$ with $f$ from (5.8), we obtain
\[ \dot\Sigma_\alpha = A\Sigma_\alpha + (\dot v_\alpha\cdot\nabla)Z - L_\alpha(t)f. \]
According to (2.3) it may be shown that $\Sigma_\alpha(x,0) = 0$. We hence get
\[ \Sigma_\alpha(x + q_\alpha(t), t) = \int_0^t d\tau\,\Big[U(t-\tau)\Big((\dot v_\alpha(\tau)\cdot\nabla)Z(\cdot,\tau) - L_\alpha(\tau)f(\cdot,\tau)\Big)\Big](x + q_\alpha(t)) =: \Sigma_{\alpha,1}(x + q_\alpha(t), t) - \Sigma_{\alpha,2}(x + q_\alpha(t), t). \]
As a consequence of $\frac{d}{dt}(\nabla Z) = \nabla(AZ - f) = A(\nabla Z) - \nabla f$ and $Z(x,0) = 0$, we obtain from the group property of $U(\cdot)$ that
\[ \begin{aligned}
\Sigma_{\alpha,1}(x + q_\alpha(t), t) &= \int_0^t d\tau\,\Big[U(t-\tau)\big((\dot v_\alpha(\tau)\cdot\nabla)Z(\cdot,\tau)\big)\Big](x + q_\alpha(t))\\
&= -\int_0^t d\tau\int_0^\tau ds\,\Big[U(t-s)\big((\dot v_\alpha(\tau)\cdot\nabla)f(\cdot,s)\big)\Big](x + q_\alpha(t)).
\end{aligned} \]
With $g(y,\tau,s) = \dot v_\alpha(\tau)\cdot\nabla f(y,s)$ it follows from the definitions of $f$, $E_v(x)$, and $B_v(x)$ that $\nabla\cdot g = 0$. Moreover, by (5.41) and Lemma 5.3 we find that (5.14) and (5.15) are satisfied with $\xi(\tau,s) = \varepsilon^4$. Therefore Lemma 5.2(c) applies to yield for $\alpha = 1,\dots,N$,
\[ |\Sigma_{\alpha,1}(x + q_\alpha(t), t)| \le C\varepsilon^4, \qquad t\in[0, T\varepsilon^{-3/2}],\quad |x|\le R_\varphi. \tag{5.47} \]
To estimate
\[ \Sigma_{\alpha,2}(x + q_\alpha(t), t) = \int_0^t d\tau\,\Big[U(t-\tau)\big(L_\alpha(\tau)f(\cdot,\tau)\big)\Big](x + q_\alpha(t)), \]

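observe that
\[ \begin{aligned}
[L_\alpha(\tau)f(\cdot,\tau)](x) &= v_\alpha(\tau)\cdot\nabla f(x,\tau) + \partial_tf(x,\tau)\\
&= \sum_{\beta=1}^N\Big\{(\ddot v_\beta\cdot\nabla_v)\Phi_{v_\beta}(x - q_\beta) + (\dot v_\beta\cdot\nabla_v)^2\Phi_{v_\beta}(x - q_\beta)\Big\} + \sum_{\beta=1,\ \beta\neq\alpha}^N\nabla_{xv}^2\Phi_{v_\beta}(x - q_\beta)(v_\alpha - v_\beta, \dot v_\beta)\\
&=: f^\natural(\tau,y) + f^\flat(\tau,y),
\end{aligned} \]
with all time arguments taken at time $\tau$, and $\Phi_v = (E_v, B_v)$. Since $\nabla\cdot B_v = 0$ and $\nabla\cdot E_v = e\varphi$ is independent of $v$, we have that $\nabla\cdot f^\natural = 0 = \nabla\cdot f^\flat$. In addition, $f^\natural$ satisfies (5.10) and (5.11) with
\[ \xi^\natural(\tau) = \Big(\max_{1\le\beta\le N}|\ddot v_\beta(\tau)|\Big)\Big(\max_{1\le\beta\le N}|e_\beta|\Big) + \varepsilon^4. \]
Because $f^\flat$ has an additional $x$-derivative, moreover (5.12) and (5.13) hold for $f^\flat$, with
\[ \xi^\flat(\tau) = \Big(\max_{1\le\beta\le N}|v_\alpha(\tau) - v_\beta(\tau)|\Big)\varepsilon^2, \]
as again follows from Lemma 5.3 and (5.41). Thus Lemma 5.2(a) and (b) imply that for all $\alpha = 1,\dots,N$, $t\in[0, T\varepsilon^{-3/2}]$, and $|x|\le R_\varphi$,
\[ \begin{aligned}
|\Sigma_{\alpha,2}(x + q_\alpha(t), t)| &\le \Big|\int_0^t d\tau\,[U(t-\tau)f^\natural(\cdot,\tau)](x + q_\alpha(t))\Big| + \Big|\int_0^t d\tau\,[U(t-\tau)f^\flat(\cdot,\tau)](x + q_\alpha(t))\Big|\\
&\le C\Big(\sup_{\tau\in[0,t]}\xi^\natural(\tau) + \sup_{\tau\in[0,t]}\xi^\flat(\tau)\,\varepsilon\Big)\\
&\le C\Big(\varepsilon^4 + \Big(\sup_{\tau\in[0,t]}\max_{1\le\beta\le N}|\ddot v_\beta(\tau)|\Big)\Big(\max_{1\le\beta\le N}|e_\beta|\Big) + \Big(\sup_{\tau\in[0,t]}\max_{1\le\beta\le N}|v_\alpha(\tau) - v_\beta(\tau)|\Big)\varepsilon^3\Big).
\end{aligned} \]
Hence by (5.47) and (5.43) for $\alpha = 1,\dots,N$, $t\in[0, T\varepsilon^{-3/2}]$, and $|x|\le R_\varphi$,
\[ |\Sigma_\alpha(x + q_\alpha(t), t)| \le C\Big(\varepsilon^{7/2} + \Big(\sup_{\tau\in[0,t]}\max_{1\le\beta\le N}|\ddot v_\beta(\tau)|\Big)\Big(\max_{1\le\beta\le N}|e_\beta|\Big)\Big). \]
According to the definition of $M_\alpha(t)$ in (5.46) we therefore have
\[ |M_\alpha(t)| = \Big|\int_{|x|\le R_\varphi}d^3x\,\rho_\alpha(x)\Big[\Sigma_{\alpha,1}(x + q_\alpha(t), t) + v_\alpha(t)\wedge\Sigma_{\alpha,2}(x + q_\alpha(t), t)\Big]\Big| \le C\Big(\varepsilon^{7/2} + \Big(\sup_{\tau\in[0,t]}\max_{1\le\beta\le N}|\ddot v_\beta(\tau)|\Big)\Big(\max_{1\le\beta\le N}|e_\beta|\Big)\Big). \tag{5.48} \]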


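To further estimate the right-hand side of (5.45), we have to bound $\dot R_\alpha(t)$, with $R_\alpha(t)$ from (5.6). Calculating $\dot R_\alpha(t)$ explicitly we obtain
\[ \dot R_\alpha(t) = \Big(\frac{d}{dt}\,m_{0\alpha}(v_\alpha)^{-1}\Big)m_{0\alpha}(v_\alpha)R_\alpha(t) \]
\[ \quad + m_{0\alpha}(v_\alpha)^{-1}\sum_{\substack{\beta=1\\ \beta\neq\alpha}}^{N}\int d^3x\,\rho_\alpha(x-q_\alpha)\Big[(\dot v_\beta\cdot\nabla_v)E_{v_\beta}(x-q_\beta) + v_\alpha\wedge(\dot v_\beta\cdot\nabla_v)B_{v_\beta}(x-q_\beta)\Big] \]
\[ \quad + m_{0\alpha}(v_\alpha)^{-1}\sum_{\substack{\beta=1\\ \beta\neq\alpha}}^{N}\int d^3x\,\rho_\alpha(x-q_\alpha)\Big[((v_\alpha-v_\beta)\cdot\nabla)E_{v_\beta}(x-q_\beta) + v_\alpha\wedge((v_\alpha-v_\beta)\cdot\nabla)B_{v_\beta}(x-q_\beta)\Big] \]
\[ \quad + m_{0\alpha}(v_\alpha)^{-1}\sum_{\substack{\beta=1\\ \beta\neq\alpha}}^{N}\int d^3x\,\rho_\alpha(x-q_\alpha)\,\dot v_\alpha\wedge B_{v_\beta}(x-q_\beta) =: \dot R_{\alpha,1}(t) + \dot R_{\alpha,2}(t) + \dot R_{\alpha,3}(t) + \dot R_{\alpha,4}(t), \]
with all time arguments at time $t$. Firstly,
\[ |\dot R_{\alpha,1}(t)| = \Big|\Big(\frac{d}{dt}\,m_{0\alpha}(v_\alpha)^{-1}\Big)m_{0\alpha}(v_\alpha)R_\alpha(t)\Big| \le C\varepsilon^4 \tag{5.49} \]
for $\alpha=1,\dots,N$ and $t\in[0,T\varepsilon^{-3/2}]$ by (5.44) and (5.40). Since $B_v(x) = -v\wedge\nabla\phi_v(x)$, by (5.41), (2.8), and (5.38) also
\[ |\dot R_{\alpha,4}(t)| \le C\varepsilon^{9/2}. \tag{5.50} \]
As for $\dot R_{\alpha,2}(t)$, we may repeat the calculation in (5.36) to obtain
\[ \nabla_v\Psi_{\alpha\beta}(t) := \int d^3x\,\rho(x-q_\alpha(t))\,\nabla^2_{xv}\phi_{v_\beta(t)}(x-q_\beta(t)) = \frac{1}{4\pi}\int\!\!\int d^3x\,d^3y\,\rho(x-q_\alpha(t))\rho(y-q_\beta(t))\,\nabla^2_{xv}\zeta_{v_\beta(t)}(x-y), \]
with $\zeta_v(x)$ from (5.37). Since $\sup_{t\in[0,T\varepsilon^{-3/2}]}|\nabla^2_{xv}\zeta_{v_\beta(t)}(x)| \le C(1+|x|)^{-2}$, we get as before that
\[ |\nabla_v\Psi_{\alpha\beta}(t)| \le C\varepsilon^2, \qquad t\in[0,T\varepsilon^{-3/2}], \quad \alpha\neq\beta, \]
and hence by (5.41),
\[ |\dot R_{\alpha,2}(t)| \le C\varepsilon^4. \tag{5.51} \]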

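So finally we have to bound $\dot R_{\alpha,3}(t)$, and this relies on a similar argument. Here we have
\[ \nabla\Psi_{\alpha\beta}(t) := \int d^3x\,\rho(x-q_\alpha(t))\,\nabla^2\phi_{v_\beta(t)}(x-q_\beta(t)) = \frac{1}{4\pi}\int\!\!\int d^3x\,d^3y\,\rho(x-q_\alpha(t))\rho(y-q_\beta(t))\,\nabla^2\zeta_{v_\beta(t)}(x-y), \]
and $\sup_{t\in[0,T\varepsilon^{-3/2}]}|\nabla^2\zeta_{v_\beta(t)}(x)| \le C(1+|x|)^{-3}$. This in turn yields
\[ |\nabla\Psi_{\alpha\beta}(t)| \le C\varepsilon^3, \qquad t\in[0,T\varepsilon^{-3/2}], \quad \alpha\neq\beta. \]
Using the explicit form of $E_v(x)$ and $B_v(x)$, as in (5.39), we then get for $t\in[0,T\varepsilon^{-3/2}]$,
\[ |\dot R_{\alpha,3}(t)| \le C\varepsilon^3|v_\alpha(t)-v_\beta(t)| \le C\varepsilon^{7/2}, \tag{5.52} \]
by (5.43). Summarizing (5.49), (5.50), (5.51), and (5.52) it follows that
\[ |\dot R_\alpha(t)| \le C\varepsilon^{7/2}, \qquad \alpha=1,\dots,N, \quad t\in[0,T\varepsilon^{-3/2}]. \tag{5.53} \]
Consequently, by (5.45), (5.48), and (5.53) for $\alpha=1,\dots,N$ and $t\in[0,T\varepsilon^{-3/2}]$,
\[ |\ddot v_\alpha(t)| \le C\big(\varepsilon^4 + |M_\alpha(t)| + |\dot R_\alpha(t)|\big) \le C\Big(\varepsilon^{7/2} + \sup_{\tau\in[0,t]}\max_{1\le\beta\le N}|\ddot v_\beta(\tau)|\,\max_{1\le\beta\le N}|e_\beta|\Big). \]
Choosing $\max_{1\le\beta\le N}|e_\beta| \le \bar e$ with $\bar e$ sufficiently small we hence obtain
\[ \sup_{t\in[0,T\varepsilon^{-3/2}]}|\ddot v_\alpha(t)| \le C\varepsilon^{7/2}, \qquad \alpha=1,\dots,N. \]
This completes the proof of Lemma 2.1. □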

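6. Appendix B: Proof of Lemma 3.2

Here we give the proof of Lemma 3.2. We verify e.g. (b). To compare the left-hand side to the right-hand side of the assertion, we will insert some additional terms and estimate the corresponding differences $D_j(t)$, $j=1,2,3$, for $t\in[t_0,T\varepsilon^{-3/2}]$, where $t_0 = 4(R_\varphi + C^*\varepsilon^{-1})$. First we introduce
\[ D_1(t) = i\int_0^t d\tau\int d^3k\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\Big(e^{-ik\cdot[q_\beta(t)-q_\beta(t-\tau)]} - e^{-ik\cdot[\tau v_\beta-\frac12\tau^2\dot v_\beta]}\Big)\frac{\sin|k|\tau}{|k|}\,k \]
\[ = -\nabla_\xi\int_0^t d\tau\int d^3k\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\Big(e^{-ik\cdot[q_\beta(t)-q_\beta(t-\tau)]} - e^{-ik\cdot[\tau v_\beta-\frac12\tau^2\dot v_\beta]}\Big)\frac{\sin|k|\tau}{|k|} \]
\[ = -\nabla_\xi\int\!\!\int d^3x\,d^3y\,\varphi(x)\varphi(y)\int_0^t d\tau\,\Big\{\psi_\tau\big([\xi_{\alpha\beta}+x-q_\beta(t-\tau)]-[y-q_\beta(t)]\big) - \psi_\tau\big([\xi_{\alpha\beta}+x-\tfrac12\tau^2\dot v_\beta]-[y-\tau v_\beta]\big)\Big\}, \tag{6.1} \]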


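as follows through application of the Fourier transform, with $\xi_{\alpha\beta} = q_\alpha(t)-q_\beta(t)$, and $\psi_\tau(x) = (4\pi|x|)^{-1}$ for $|x|=\tau$ whereas $\psi_\tau(x)=0$ otherwise. We claim that for $x,y\in\mathbb{R}^3$ with $|x|,|y|\le R_\varphi$ and $t\in[t_0,T\varepsilon^{-3/2}]$ there exists a unique $\tau_0 = \tau_0(x,y,t,\xi_{\alpha\beta})\in[0,t_0]\subset[0,t]$ such that
\[ \tau_0 = \big|[\xi_{\alpha\beta}+x-q_\beta(t-\tau_0)]-[y-q_\beta(t)]\big|. \tag{6.2} \]
To see this, observe with $\theta(\tau) = \tau - |[\xi_{\alpha\beta}+x-q_\beta(t-\tau)]-[y-q_\beta(t)]|$ that $0\ge\theta(0)\ge-(2R_\varphi+C^*\varepsilon^{-1})$ and $\theta'(\tau)\ge 1-C_v\sqrt{\varepsilon}$ by (2.6) and (2.8). For $\varepsilon$ so small that $1-C_v\sqrt{\varepsilon}\ge 1/2$ we hence obtain $\theta(t_0)\ge-(2R_\varphi+C^*\varepsilon^{-1})+t_0/2 = C^*\varepsilon^{-1} > 0$. This shows $\theta$ has a unique zero $\tau_0\in[0,t_0]$. Moreover (6.2) together with (2.6) implies
\[ \tau_0 \ge |\xi_{\alpha\beta}| - \big|[x-q_\beta(t-\tau_0)]-[y-q_\beta(t)]\big| \ge C_*\varepsilon^{-1} - 2R_\varphi - C_v\sqrt{\varepsilon}\,\tau_0, \]
whence also $\tau_0\ge C\varepsilon^{-1}$ for $\varepsilon$ small. Similarly, we find a unique $\tau_1 = \tau_1(x,y,t,\xi_{\alpha\beta})$ satisfying
\[ \tau_1 = \big|[\xi_{\alpha\beta}+x-\tfrac12\tau_1^2\dot v_\beta]-[y-\tau_1 v_\beta]\big|, \tag{6.3} \]
with $\tau_1$ having the same properties as $\tau_0$. By definition of $\psi_\tau$ we therefore may simply write
\[ D_1(t) = -\int\!\!\int d^3x\,d^3y\,\varphi(x)\varphi(y)\,\nabla_\xi\big(\tau_0^{-1}-\tau_1^{-1}\big). \tag{6.4} \]
To estimate this, we calculate from (6.2) that
\[ \nabla_\xi\tau_0^{-1} = -\tau_0^{-3}\Big([\xi_{\alpha\beta}+x-q_\beta(t-\tau_0)]-[y-q_\beta(t)] + \big([\xi_{\alpha\beta}+x-q_\beta(t-\tau_0)]-[y-q_\beta(t)]\big)\cdot v_\beta(t-\tau_0)\,\nabla_\xi\tau_0\Big), \]
with an analogous expression for $\nabla_\xi\tau_1^{-1}$. Therefore
\[ \big|\nabla_\xi\big(\tau_0^{-1}-\tau_1^{-1}\big)\big| \le C\Big(\tau_0^{-3}|q_\beta(t-\tau_0)-q_\beta(t-\tau_1)|\big[1+|v_\beta(t-\tau_1)||\nabla_\xi\tau_1|\big] + |\tau_0^{-3}-\tau_1^{-3}|\,\big|[\xi_{\alpha\beta}+x-q_\beta(t-\tau_1)]-[y-q_\beta(t)]\big|\big[1+|v_\beta(t-\tau_1)||\nabla_\xi\tau_1|\big] \]
\[ \qquad + \tau_0^{-2}|v_\beta(t-\tau_0)-v_\beta(t-\tau_1)||\nabla_\xi\tau_1| + \tau_0^{-2}|v_\beta(t-\tau_0)||\nabla_\xi(\tau_0-\tau_1)|\Big). \tag{6.5} \]
From (6.2), (6.3), and according to the Taylor expansion
\[ q_\beta(t-\tau) = q_\beta(t) - \tau v_\beta + \tfrac12\tau^2\dot v_\beta + O(\varepsilon^{7/2}\tau^3), \]
cf. Lemma 2.1, it follows that
\[ |\tau_0-\tau_1| \le \big|\tau_0 v_\beta - \tfrac12\tau_0^2\dot v_\beta - \tau_1 v_\beta + \tfrac12\tau_1^2\dot v_\beta\big| + O(\varepsilon^{7/2}\tau_0^3) \le C\sqrt{\varepsilon}\,|\tau_0-\tau_1| + C\varepsilon^2(\tau_0+\tau_1)|\tau_0-\tau_1| + O(\varepsilon^{7/2}\tau_0^3), \]
whence
\[ |\tau_0-\tau_1| = O(\sqrt{\varepsilon}), \qquad |\tau_0^{-3}-\tau_1^{-3}| = O(\varepsilon^{9/2}), \]
recall $C\varepsilon^{-1}\le\tau_0,\tau_1\le t_0 = O(\varepsilon^{-1})$. Differentiating (6.2) and (6.3) with respect to $\xi=\xi_{\alpha\beta}$ we moreover get $|\nabla_\xi\tau_0|+|\nabla_\xi\tau_1| = O(1)$, and after a longer calculation which we omit also $|\nabla_\xi(\tau_0-\tau_1)| \le C\big(\varepsilon^{3/2}+\sqrt{\varepsilon}\,|\nabla_\xi(\tau_0-\tau_1)|\big)$, thus $|\nabla_\xi(\tau_0-\tau_1)|\le C\varepsilon^{3/2}$. Utilizing these estimates and Lemma 2.1 in (6.5), we consequently obtain $|\nabla_\xi(\tau_0^{-1}-\tau_1^{-1})|\le C\varepsilon^{7/2}$. Hence (6.4) yields
\[ \sup_{t\in[t_0,T\varepsilon^{-3/2}]}|D_1(t)| \le C\varepsilon^{7/2} \tag{6.6} \]
as desired. Next, with
\[ D_2(t) = i\int_0^t d\tau\int d^3k\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\Big(e^{-ik\cdot[\tau v_\beta-\frac12\tau^2\dot v_\beta]} - \Big[1 - ik\cdot\big(\tau v_\beta-\tfrac12\tau^2\dot v_\beta\big) - \tfrac12\tau^2(k\cdot v_\beta)^2\Big]\Big)\frac{\sin|k|\tau}{|k|}\,k, \]
it may be shown in a similar way that
\[ \sup_{t\in[t_0,T\varepsilon^{-3/2}]}|D_2(t)| \le C\varepsilon^{7/2}. \tag{6.7} \]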
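The step from the Taylor-expansion inequality for $|\tau_0-\tau_1|$ to the two order estimates used in the bound on $D_1$ is brief in the text. A sketch of the arithmetic, using only the bounds $C\varepsilon^{-1}\le\tau_0,\tau_1\le t_0=O(\varepsilon^{-1})$ quoted above (the constant $C'$ is generic and introduced here for illustration):

```latex
\begin{align*}
|\tau_0-\tau_1|
 &\le C\sqrt{\varepsilon}\,|\tau_0-\tau_1| + C\varepsilon^{2}(\tau_0+\tau_1)\,|\tau_0-\tau_1|
      + O(\varepsilon^{7/2}\tau_0^{3}) \\
 &\le \big(C\sqrt{\varepsilon}+C'\varepsilon\big)\,|\tau_0-\tau_1| + O(\varepsilon^{1/2}),
 \qquad\text{since } \tau_0+\tau_1=O(\varepsilon^{-1}),\ \ \tau_0^{3}=O(\varepsilon^{-3}), \\
\Longrightarrow\quad |\tau_0-\tau_1| &= O(\sqrt{\varepsilon}) \quad\text{for $\varepsilon$ small}, \\
|\tau_0^{-3}-\tau_1^{-3}|
 &\le 3\,|\tau_0-\tau_1|\,\max_{\tau \text{ between } \tau_0,\tau_1}\tau^{-4}
  = O(\sqrt{\varepsilon})\cdot O(\varepsilon^{4}) = O(\varepsilon^{9/2}).
\end{align*}
```

The last line is the mean value theorem applied to $\tau\mapsto\tau^{-3}$ together with the lower bound $\tau\ge C\varepsilon^{-1}$.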

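Finally we need to compare $\int_0^t d\tau\,(\dots)$ to the infinite $d\tau$-integral and thus let
\[ D_3(t) = i\int_t^\infty d\tau\int d^3k\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\Big[1 - ik\cdot\big(\tau v_\beta-\tfrac12\tau^2\dot v_\beta\big) - \tfrac12\tau^2(k\cdot v_\beta)^2\Big]\frac{\sin|k|\tau}{|k|}\,k. \]
With the notation
\[ K_p = e^{-ik\cdot\xi_{\alpha\beta}}\int_t^\infty d\tau\,\frac{\sin|k|\tau}{|k|}\,\tau^p, \qquad p=0,\dots,2, \]
this may be rewritten as
\[ D_3(t) = \int d^3k\,|\hat\varphi(k)|^2\Big(-\nabla_\xi K_0 - (v_\beta\cdot\nabla_\xi)\nabla_\xi K_1 + \tfrac12(\dot v_\beta\cdot\nabla_\xi)\nabla_\xi K_2 - \tfrac12(v_\beta\cdot\nabla_\xi)^2\nabla_\xi K_2\Big). \]
Thus we only need to estimate
\[ \int d^3k\,|\hat\varphi(k)|^2 K_p = \int d^3k\,|\hat\varphi(k)|^2 e^{-ik\cdot\xi_{\alpha\beta}}\int_t^\infty d\tau\,\frac{\sin|k|\tau}{|k|}\,\tau^p = \int\!\!\int d^3x\,d^3y\,\varphi(x)\varphi(y)\int_t^\infty d\tau\,\psi_\tau(\xi_{\alpha\beta}+x-y)\,\tau^p, \tag{6.8} \]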


the latter equality follows analogously to (6.1). However, for $|x|,|y|\le R_\varphi$ and $t\in[t_0,T\varepsilon^{-3/2}]$ we obtain in case $\tau=|\xi_{\alpha\beta}+x-y|$ from (2.6) the contradiction
\[ 4(R_\varphi+C^*\varepsilon^{-1}) = t_0 \le t \le \tau \le 2R_\varphi+|\xi_{\alpha\beta}| \le 2R_\varphi+C^*\varepsilon^{-1}. \]
This shows the term in (6.8) is identically zero for $t\in[t_0,T\varepsilon^{-3/2}]$, and thus $D_3(t)=0$ for $t\in[t_0,T\varepsilon^{-3/2}]$. Together with (6.6) and (6.7) this completes the proof of Lemma 3.2(b). □

Acknowledgement. We are grateful to A. Komech for discussions. HS thanks G. Schäfer for useful hints on post-Newtonian corrections in general relativity and for insisting on (1.9).

References

1. Damour T., Schäfer G.: Redefinition of position variables and the reduction of higher-order Lagrangians. J. Math. Phys. 22, 127–134 (1991)
2. Dautray R., Lions J.-L.: Mathematical Analysis and Numerical Methods for Science and Technology. Vol. 5: Evolution Problems I. Berlin–Heidelberg–New York: Springer, 1992
3. Komech A., Kunze M., Spohn H.: Effective dynamics for a mechanical particle coupled to a wave field. Commun. Math. Phys. 203, 1–19 (1999)
4. Komech A., Spohn H.: Long-time asymptotics for the coupled Maxwell-Lorentz equations. Comm. Partial Differential Equations 25, 559–584 (2000)
5. Kunze M., Spohn H.: Radiation reaction and center manifolds. To appear in SIAM J. Math. Anal.
6. Kunze M., Spohn H.: Adiabatic limit for the Maxwell–Lorentz equations. To appear in Annales Henri Poincaré
7. Landau L.D., Lifschitz E.M.: The Theory of Classical Fields. Oxford: Pergamon Press, 1962
8. Moser J.: Dynamical systems – past and present. In: Proc. of the ICM, Vol. 1 (Berlin 1998), Doc. Math., Extra Vol. I, 381–402 (1998)
9. Taylor J.H.: Binary pulsars and relativistic gravity. Rev. Mod. Phys. 66, 711–719 (1994)
10. Xia J.: The existence of noncollision singularities in Newtonian systems. Ann. Math. 135, 411–468 (1991)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 212, 469 – 501 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Statistical Properties of Locally Free Groups with Applications to Braid Groups and Growth of Random Heaps

A. M. Vershik¹, S. Nechaev²,³, R. Bikbov³

¹ St. Petersburg Branch of Steklov Mathematical Institute, Fontanka 27, 119011 St. Petersburg, Russia
² UMR 8626, CNRS - Université Paris XI, LPTMS, Bat. 100, Université Paris Sud, 91405 Orsay Cedex, France
³ L.D. Landau Institute for Theoretical Physics, Kosygin str. 2, 117940 Moscow, Russia

Received: 7 June 1999 / Accepted: 21 April 2000

Abstract: The main statistical characteristics of locally free groups (the growth, the drift and the entropy) are considered and relations between them are established. Our results assert that: (i) the statistical properties of random walks (Markov chains) on locally free and braid groups are not the same as the uniform statistics on these groups, and (ii) the statistical characteristics stabilize when the number of generators of the group grows.

Contents

1. Introduction
2. Main Definitions and Statement of a Problem
3. Asymptotics of the Number of Words in the Locally Free Group (Logarithmic Volume)
4. Random Walk on Locally Free Group: The Drift
   4.1 Mathematical expectation of the heap's roof
   4.2 Drift as mathematical expectation of number of cells in the heap
5. Random Walk on Locally Free Group and Semi-Group: The Entropy
   5.1 Entropy of random walk on semi-group $LF^+_{n+1}$
   5.2 Entropy of random walk on groups $LF_{n+1}$ and $LFI^+_{n+1}$
6. Conclusion
   6.1 Bounds for logarithmic volume and drift of random walk on braid group
   6.2 Physical interpretation of results

1. Introduction

Recent years have been marked by growing interest in a number of problems of physical origin dealing with probabilistic processes on noncommutative groups. Among these are the problems of statistics and topology of chain molecules and related statistical problems of knots (see, for example, [Ne]). Along with the known problems of constructing topological invariants of knots and links and of investigating homotopy classes and fibre bundles, a set of similar but less investigated problems dealing with the statistics of knots and braids should be noted. In a series of works [GN], problems concerning the mathematical expectation of the "complexity" of randomly generated knots were formulated, where the degree of a known algebraic invariant (the polynomials of Jones, Alexander, HOMFLY and others) served as the characteristic of knot complexity. As for the theory of random walks on braid groups, only very few results are known, devoted to the limiting behavior of random walks and Brownian bridges on the simplest braid group $B_3$ [NGV]. Thus, neither the Poisson boundary nor an explicit expression for harmonic functions on braid groups has yet been found. At the same time it is clear that this set of problems is connected, by and large, to random walks on noncommutative groups.

In the present paper we consider statistical properties of locally free and braid groups, following an idea of the first author (A.V.) that was extended in the papers [Ve1, DN1, DN2]. To study braid groups we introduce the concept of so-called locally free groups, which are a particular case of local groups in the sense of [Ve1,Ve2,Ve3]. This concept gives us a very useful tool for two-sided estimates of the number of nonequivalent words in the braid groups and semi-groups. A very important and apparently rather new aspect of this problem consists in passing to the limit $n\to\infty$ in the group $B_n$; it is precisely this limit that is considered in our work. We have found that various statistical characteristics of the local groups stabilize in this limit of a large number of generators.
In [Ve2,Ve3] a systematic approach to the computation of various numerical characteristics of countable groups is proposed. The essence of this approach is the simultaneous consideration of three numerical constants characterizing the logarithmic volume, the entropy and the escape (the drift) of the uniform random walk on the group. It turns out that these characteristics are related by the strict fundamental inequality (see also [Ve2,Ve3]), which means that the statistics of convolutions of measures on the generators is not the same as the uniform statistics on the set of words of a given length. In other words, generating the group step by step by a Monte Carlo method yields only an exponentially small fraction of the group. The same statement also holds for braid groups. The locally free groups play the role of approximants to the braid groups.

2. Main Definitions and Statement of a Problem

We begin with definitions of a locally free group and semi-groups, which are special cases of a noncommutative local group and semi-groups [Ve1,Ve2,Ve3].

Definition 1 (Locally free group and semi-group). The locally free group (semi-group) $LF_{n+1}$ ($LF^+_{n+1}$) with $n$ generators $\{f_1,\dots,f_n\}$ is a group (semi-group) determined by the following relations:
\[ f_j f_k = f_k f_j \qquad \forall\,|j-k|\ge 2, \quad \{j,k\}=1,\dots,n. \tag{1} \]
Each pair of neighboring generators $(f_j, f_{j\pm1})$ produces a free subgroup (sub-semigroup) of the group $LF_{n+1}$ (semi-group $LF^+_{n+1}$). In addition to the locally free group $LF_{n+1}$ we define a few similar objects:

Definition 2. 1. Locally free semi-group $LF^{+(r)}_{n+1}$ of finite order $r$. The semi-group $LF^{+(r)}_{n+1}$ is the locally free semi-group $LF^+_{n+1}$ subject to extra "finite order" relations
\[ (f_i)^r = 1; \qquad i=1,\dots,n. \]
2. Locally free idempotent semi-group $LFI^+_{n+1}$. The idempotent semi-group $LFI^+_{n+1}$ is the locally free semi-group $LF^+_{n+1}$ subject to extra "idempotent" relations
\[ (f_i)^2 = f_i; \qquad i=1,\dots,n. \]
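To make the defining relations concrete for the semi-group $LF^+_{n+1}$, the following sketch decides when two positive words represent the same element. It uses the heap picture of partially commutative monoids: each letter is dropped onto a pile and comes to rest one level above the highest earlier letter it does not commute with. The function names (`commute`, `heap`, `count_elements`) are illustrative choices, not notation from the paper, and the use of the set of (generator, level) pairs as a complete invariant is the standard Cartier-Foata fact, assumed here rather than proved.

```python
from itertools import product


def commute(j, k):
    """Generators f_j and f_k commute exactly when |j - k| >= 2 (relation (1))."""
    return abs(j - k) >= 2


def heap(word):
    """Return the heap of a positive word as a frozenset of (generator, level) pairs."""
    top = {}      # generator index -> highest level it occupies so far
    pieces = []
    for g in word:
        # The new piece rests on the highest piece it does not commute with.
        blockers = [lvl for h, lvl in top.items() if not commute(g, h)]
        level = 1 + max(blockers, default=0)
        top[g] = level
        pieces.append((g, level))
    return frozenset(pieces)


def count_elements(n, K):
    """Number of distinct elements of LF^+ with n generators written by words of length K."""
    return len({heap(w) for w in product(range(1, n + 1), repeat=K)})


if __name__ == "__main__":
    print(heap((1, 3)) == heap((3, 1)))   # f1 f3 and f3 f1 are the same element
    print(heap((1, 2)) == heap((2, 1)))   # f1 f2 and f2 f1 are different elements
    print([count_elements(3, K) for K in range(1, 5)])
```

Swapping two adjacent commuting letters leaves every level unchanged, which is why equal heaps correspond to equal elements.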

A concept equivalent to that of the locally free semi-group $LF^+_{n+1}$ appeared earlier in [CF], devoted to the investigation of combinatorial properties of substitutions of sequences and so-called "partially commutative monoids" (see [Vi] and references therein). Especially productive is the geometrical interpretation of monoids in the form of a "heap", proposed by G. X. Viennot and connected with various questions of statistics of directed growth and parallel computations. The case of a group (instead of a semi-group) introduces a number of additional complications to the model of a heap and apparently has not been considered in the literature. We discuss it in more detail below.

It makes sense to give a more general definition of $\Gamma$-locally free groups.

Definition 3 ($\Gamma$-locally free group). Let $\Gamma$ be a graph. Call the group $LF(\Gamma)$ $\Gamma$-locally free if the generators $g_\gamma$ of $LF(\Gamma)$ can be labeled by the vertices $\gamma$ of the graph $\Gamma$ and two generators commute if the vertices are not neighbors in the graph. The semi-group can be defined in the same way. If $\Gamma$ is a $p$-cycle, we call the corresponding locally free group the cyclic locally free group and denote it by $CLP_p$. For more details see [Ve2,Ve3,Ve1].

A more general concept of a locally free group is the locally free group of depth $m$.

Definition 4 (Locally free group of depth $m$). The group $G$ with the set of generators $f_1, f_2, \dots, f_n$, $n\ge m$, is called locally free of depth $m$ if
\[ f_j f_k = f_k f_j \qquad \forall\,|j-k|\ge m+1, \quad \{j,k\}=1,\dots,n. \tag{2} \]
For $m=1$ we return to the previous notion. Let us finally recall the definition of the local group [Ve2,Ve3,Ve1].

Definition 5 (Local group). If the generators $f_1, f_2, \dots, f_n$ of the group $G$ satisfy the commutation relation $f_j f_k = f_k f_j$ $\forall\,|j-k|\ge 2$ and might have additional relations $R$ between neighbors $f_j, f_{j+1}$ $\forall\,j=1,\dots,n-1$, then $G$ is a local group. If the relations $R$ are the same for all $j$ then $G$ is called a local stationary group.

Many important groups, semi-groups and algebras are of the type of local groups, for example, the Coxeter groups, Hecke algebras, etc. Obviously, a locally free group with $n$ generators ($n\le\infty$) is a universal object in the manifold of all local groups with the same number of generators. Now we give the definition of the braid group and establish the link between a braid group and a locally free group.

Fig. 1. Graphic representation of generators of braid group $B_{n+1}$ (strings labeled $1, 2, \dots, i, i+1, \dots, n$; the two panels show $\sigma_j$ and $\sigma_j^{-1}$)

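Definition 6 (Artin braid group). The braid group $B_{n+1}$ of $n+1$ "strings" has $n$ generators $\{\sigma_1,\dots,\sigma_n\}$ with the following relations:
\[ \begin{cases} \sigma_i\sigma_{i+1}\sigma_i = \sigma_{i+1}\sigma_i\sigma_{i+1} & (1\le i<n) \\ \sigma_i\sigma_j = \sigma_j\sigma_i & (|i-j|\ge 2) \end{cases} \tag{3} \]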

There exists an extensive literature on general properties of braid groups, see [Bi1]; for recent work on the normal forms of words we cite [Bi2]. An element of the braid group $B_n$ is given by a word in the alphabet $\{\sigma_1,\dots,\sigma_n; \sigma_1^{-1},\dots,\sigma_n^{-1}\}$, see Fig. 1. By the length $N$ of a record of a braid we mean just the length of a word in a given record of the braid, and by the irreducible length (or simple length) the minimal length of a word in which the given braid can be written. The irreducible length can also be viewed as the distance from the unity on the Cayley graph $\Gamma$ of the group. Graphically the braid is represented by a set of strings, going from above downwards in accordance with the growth of the braid length. A closed braid is obtained by gluing the "top" and "bottom" free ends on a cylinder. A closed braid defines a link (in particular, a knot). The homotopy type of the link can be described in terms of algebraic characteristics of a braid [Jo]. A positive braid is by definition an element of the sub-semigroup generated by the generators of the braid group. The braid group $B_n$ is a local group. Moreover,

• The braid group $B_n$ is a factor-group of the locally free group $LF_n$, since $B_n$ is obtained from $LF_n$ by introducing the Yang–Baxter (braid) relations into $LF_n$;
• The locally free group $LF_n$ is simultaneously a subgroup of the braid group $B_n$, generated by the squares of the generators of $B_n$:

Lemma 1. Consider the subgroup $\bar B_n$ of the group $B_n$: $\bar B_n = \langle \bar\sigma_1,\dots,\bar\sigma_{n-1} \mid \bar\sigma_i = \sigma_i^2,\ i=1,\dots,n\rangle$. The correspondence $\bar\sigma_i \leftrightarrow f_i$ sets the isomorphism of the groups $\bar B_n$ and $LF_n$.

Fig. 2. Graphic representation of generators of locally-free group $LF_{n+1}$ (strings labeled $1, 2, \dots, i, i+1, \dots, n$; the two panels show $f_j$ and $f_j^{-1}$)

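This lemma has been proved in [Hu, Co] and is a particular case of a general conjecture of J. Tits. We skip the full proof, giving only a hint of it. Consider the Burau representation
\[ \sigma_1(t) = \begin{pmatrix} -t & 1 \\ 0 & 1 \end{pmatrix}; \qquad \sigma_2(t) = \begin{pmatrix} 1 & 0 \\ t & -t \end{pmatrix}, \]
being the exact (faithful) representation of $B_3$ over $\mathbb{C}[t]$. It is obvious that
\[ \bar\sigma_1 = \sigma_1^2(t) = \begin{pmatrix} t^2 & -t+1 \\ 0 & 1 \end{pmatrix}; \qquad \bar\sigma_2 = \sigma_2^2(t) = \begin{pmatrix} 1 & 0 \\ t-t^2 & t^2 \end{pmatrix}. \tag{4} \]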
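The matrix identities in this hint are easy to confirm by direct multiplication. The sketch below assumes the $2\times2$ Burau matrices $\sigma_1(t)$, $\sigma_2(t)$ in the form written above, checks the braid relation $\sigma_1\sigma_2\sigma_1 = \sigma_2\sigma_1\sigma_2$ at a few integer values of $t$, and evaluates the squared generators at $t=-1$, the value used in the argument. Helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sigma1(t):
    return [[-t, 1], [0, 1]]


def sigma2(t):
    return [[1, 0], [t, -t]]


def braid_relation_holds(t):
    """Check sigma1 sigma2 sigma1 == sigma2 sigma1 sigma2 at the given t."""
    s1, s2 = sigma1(t), sigma2(t)
    lhs = matmul(matmul(s1, s2), s1)
    rhs = matmul(matmul(s2, s1), s2)
    return lhs == rhs


def squared_generators(t):
    """Return (sigma1(t)^2, sigma2(t)^2)."""
    s1, s2 = sigma1(t), sigma2(t)
    return matmul(s1, s1), matmul(s2, s2)


if __name__ == "__main__":
    print(all(braid_relation_holds(t) for t in (-3, -1, 2, 5)))
    print(squared_generators(-1))
```

Integer arithmetic keeps the comparison exact; a symbolic check in $t$ would need a computer algebra package and is not attempted here.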

Putting $t=-1$, we see that (4) reduces to
\[ f_1 = \sigma_1^2(-1) = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}; \qquad f_2 = \sigma_2^2(-1) = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}, \]
which are the generators of the free group $\Gamma_2$. That the matrices $\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}$ generate a free group was apparently first proved by I. Sanov [Sa].

Corollary 1. The locally free group $LF_n$ is simultaneously a super- and a subgroup of the braid group $B_n$.

This consequence will hereafter be used for transferring estimates from the locally free group to the braid group. The geometrical interpretation of the group $LF_{n+1}$ is shown in Fig. 2.

Let us formulate the main problems concerning the determination of asymptotic characteristics of locally free and similar groups. This is the realization of the general program which was discussed in [Ve2,Ve3] and concerns the asymptotic properties of the local groups and similar objects. Namely, we introduce three statistical characteristics of the group, the logarithmic volume, the drift and the entropy, and study the fundamental inequality ([Ve2,Ve3]) which links these characteristics.

1. Asymptotics of number of words in a group (logarithmic volume). Let $G$ be a group with fixed framing $\{g_1,\dots,g_n\}$. The definition that follows makes sense for any group with a fixed and finite set of generators. Denote by $K(g)$, and call the length, the minimal length of a word representing $g$ in terms of the generators $\{g_1,\dots,g_n; g_1^{-1},\dots,g_n^{-1}\}$. The length defines a metric (the metric of words [Gr]) on the group. Denote by $V(G,K)$ the number of elements of the group $G$ of length $K$.

Definition 7. Call $v(G)$ the logarithmic volume of a given group $G$:
\[ v(G) = \lim_{K\to\infty}\frac{\log V(G,K)}{K}. \tag{5} \]

The existence of the limit is discussed in [Ve2,Ve3]. We call $G$ a group of exponential growth if $v>0$. In Sect. 3 we investigate the asymptotic behavior of the logarithmic volumes $v(B_n)$, $v(LF_n)$ in the limit $n\to\infty$.

2. Random walk and average drift on a group. Consider the (right-hand side) random walk on a group $G$ with fixed framing $\{g_1,\dots,g_n; g_1^{-1},\dots,g_n^{-1}\}$, i.e. regard the Markov chain with the following transition probabilities: the word $w$ transforms into $w\,g_i^{\pm1}$ with probability $\frac{1}{2n}$; $i=1,\dots,n$. Similarly one can build a left-hand Markov chain. Let $L(G,N)$ be the mathematical expectation of the length of a random word obtained after $N$ steps of the random walk on the group $G$.

Definition 8. Call $l(G)$ the drift on the group $G$ (see [Ve2,Ve3]):
\[ l(G) = \limsup_{N\to\infty}\frac{L(G,N)}{N}. \tag{6} \]
Thus, the drift is the average speed of the flow to infinity in the metric of words. In Sect. 4 we calculate the drift $l(LF_n)$ on the locally free group and its limit for $n\to\infty$.

3. Entropy of a random walk on a group. Let $\mu^N$ be the $N$-fold convolution of the uniform measure $\mu$ on the generators $\{f_1,\dots,f_n; f_1^{-1},\dots,f_n^{-1}\}$.

Definition 9. The entropy (see [Av,De,KaiV,Ve2,Ve3]) of a random walk on a group with respect to $\mu$ is
\[ h(G) = \lim_{N\to\infty}\frac{H(\mu^N)}{N} = \inf_N\frac{H(\mu^N)}{N}, \tag{7} \]
where $H(\nu) = -\sum_{x\in\mathrm{supp}\,\nu}\nu(x)\log\nu(x)$.
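Definition 9 can be illustrated on the simplest case $n=1$, where the group generated by $f_1$ alone is infinite cyclic and $\mu^N$ is the distribution of the simple random walk on the integers after $N$ steps. The sketch below computes $H(\mu^N)/N$ exactly from binomial coefficients; choosing this example and the helper names is an assumption made for illustration, not something taken from the paper.

```python
from math import comb, log


def entropy(nu):
    """Shannon entropy H(nu) = -sum nu(x) log nu(x) over the support of nu."""
    return -sum(p * log(p) for p in nu.values() if p > 0)


def walk_distribution(N):
    """N-fold convolution of the uniform measure on {+1, -1}: position -> probability."""
    return {2 * k - N: comb(N, k) / 2 ** N for k in range(N + 1)}


def entropy_per_step(N):
    """The ratio H(mu^N) / N appearing in the definition of h(G)."""
    return entropy(walk_distribution(N)) / N


if __name__ == "__main__":
    for N in (1, 10, 100, 400):
        print(N, entropy_per_step(N))
```

For this commutative example the ratio decreases towards zero as $N$ grows, in line with the infimum form of (7); positive entropy requires a group of exponential growth.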

Section 5 is devoted to the computation of $h(LF_n)$ in the limit $n=\mathrm{const}\gg1$. The question of the simultaneous study of these three numerical characteristics (volume, drift and entropy) was posed by the first author (A.V.), see [Ve2,Ve3], and represents a serious and deep problem. In particular, the desire to find the above-defined characteristics for the braid group motivates our consideration of locally free and similar groups. These three quantities are connected by the basic fundamental inequality, which was suggested and proved for arbitrary groups in [Ve2,Ve3] (see also special earlier cases in [Av,Kai]):
\[ v\,l \ge h. \tag{8} \]
For many groups (like the free group, for example) Eq. (8) reduces to an equality ([Ve2,Ve3]). In general it is an interesting problem to classify the groups in a given framing for which Eq. (8) becomes an equality. As we show below, for locally free groups in the standard framing the fundamental inequality is strict. We propose an explanation of this phenomenon and discuss its possible applications and physical consequences. (For a more detailed consideration of the mathematical aspects see [Ve2,Ve3].)
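For orientation, the equality case of (8) can be written out for the free group $F_m$ with $m$ free generators in its standard framing. The values of the drift and entropy below are standard facts about the simple random walk on a free group, quoted here as an illustration and not derived in the text:

```latex
\begin{align*}
V(F_m,K) &= 2m\,(2m-1)^{K-1}
  &&\Longrightarrow\quad v(F_m) = \log(2m-1),\\
l(F_m) &= \frac{2m-1}{2m}-\frac{1}{2m} = 1-\frac{1}{m}
  &&\text{(a step lengthens the reduced word with probability } \tfrac{2m-1}{2m}\text{)},\\
h(F_m) &= \Big(1-\frac{1}{m}\Big)\log(2m-1)
  &&\Longrightarrow\quad v(F_m)\,l(F_m) = h(F_m).
\end{align*}
```

For $m=4$ this gives $v=\log 7$, the value that appears later as the limiting logarithmic volume of the locally free group.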

3. Asymptotics of the Number of Words in the Locally Free Group (Logarithmic Volume)

In this section we find the asymptotics for $n\gg1$ of the logarithmic volume and precise expressions for the numbers of words of locally free groups and semi-groups (see also [NGV, CN]). Later on, in Sect. 6, we use the results obtained here for the two-sided estimation of the logarithmic volume of the braid group.

Lemma 2. Any element of length $K$ in the group $LF_{n+1}$ can be uniquely written in the normal form
\[ W = f_{\alpha_1}^{m_1} f_{\alpha_2}^{m_2}\cdots f_{\alpha_s}^{m_s}, \tag{9} \]
where $\sum_{i=1}^{s}|m_i| = K$ ($m_i\neq0$ $\forall\,i$; $1\le s\le K$), and the indices $\alpha_1,\dots,\alpha_s$ satisfy the following conditions:
(i) If $\alpha_i=1$ then $\alpha_{i+1}=2,\dots,n$;
(ii) If $\alpha_i=k$ ($2\le k\le n-1$) then $\alpha_{i+1}=k-1,k+1,\dots,n$;
(iii) If $\alpha_i=n$ then $\alpha_{i+1}=n-1$.

Proof. The proof directly follows from the definition of the commutation relations in the group $LF_{n+1}$. □

Let $\theta_n(s)$ be the number of all different sequences $\alpha_1,\dots,\alpha_s$ of $s$ indices ($1\le s\le K$) satisfying the rules (i)–(iii). In other words, the local rules (i)–(iii) define a Markov chain of length $s$ on the set of indices $\{\alpha_1,\dots,\alpha_n\}$ with the $n\times n$ transition matrix $\hat T_n$:
\[ \hat T_n = \begin{pmatrix} 0&1&1&1&\cdots&1&1\\ 1&0&1&1&\cdots&1&1\\ 0&1&0&1&\cdots&1&1\\ 0&0&1&0&\cdots&1&1\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&0&0&\cdots&0&1\\ 0&0&0&0&\cdots&1&0 \end{pmatrix}. \tag{10} \]
Thus, $\theta_n(s)$ is a partition function, determined as follows:
\[ \theta_n(s) = v_{\mathrm{in}}\,\big(\hat T_n\big)^{s-1}\,v_{\mathrm{out}}; \qquad v_{\mathrm{in}} = (\underbrace{1\ 1\ \dots\ 1}_{n}); \qquad v_{\mathrm{out}} = v_{\mathrm{in}}^{T}. \tag{11} \]
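The transfer-matrix formula (11) can be checked against a direct enumeration of index sequences obeying rules (i)-(iii). In the sketch below the matrix entry for the pair (k, j) is 1 exactly when j = k - 1 or j > k, which is how the rules are read here; function names are illustrative.

```python
from itertools import product


def transfer_matrix(n):
    """T[k-1][j-1] = 1 if index j may follow index k under rules (i)-(iii)."""
    return [[1 if (j == k - 1 or j > k) else 0 for j in range(1, n + 1)]
            for k in range(1, n + 1)]


def theta_matrix(n, s):
    """theta_n(s) = v_in T^(s-1) v_out with v_in = (1, ..., 1) and v_out its transpose."""
    T = transfer_matrix(n)
    vec = [1] * n                                   # start from v_out
    for _ in range(s - 1):                          # apply T from the left s-1 times
        vec = [sum(T[k][j] * vec[j] for j in range(n)) for k in range(n)]
    return sum(vec)                                 # contract with v_in


def theta_brute(n, s):
    """Count sequences (alpha_1..alpha_s) in {1..n}^s satisfying the rules directly."""
    return sum(
        1
        for seq in product(range(1, n + 1), repeat=s)
        if all(b == a - 1 or b > a for a, b in zip(seq, seq[1:]))
    )


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, [theta_matrix(n, s) for s in range(1, 7)])
```

The brute-force count grows like $n^s$ and is only meant as a cross-check for small $n$ and $s$.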

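First of all compute the spectrum of the matrix $\hat T_n$. Consider the determinant
\[ D_n(\lambda) = \det\big(\hat T_n - \lambda\hat I\big) = \begin{vmatrix} -\lambda & 1 & 1 & \cdots \\ 1 & -\lambda & 1 & \cdots \\ 0 & 1 & -\lambda & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{vmatrix}. \tag{12} \]
It satisfies the recursion relation
\[ D_n(\lambda) = -(\lambda+1)D_{n-1}(\lambda) - (\lambda+1)D_{n-2}(\lambda) \tag{13} \]
with the boundary conditions
\[ \begin{cases} D_0(\lambda) = 1 \\ D_1(\lambda) = -\lambda \end{cases}. \tag{14} \]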
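The recursion (13) with boundary conditions (14) can be compared with a direct evaluation of $\det(\hat T_n-\lambda\hat I)$. The sketch uses exact rational arithmetic and a plain Gaussian-elimination determinant; the helper names and the sample value of $\lambda$ are arbitrary choices for illustration.

```python
from fractions import Fraction


def transfer_matrix(n):
    """T[k-1][j-1] = 1 if j == k - 1 or j > k (rules (i)-(iii))."""
    return [[1 if (j == k - 1 or j > k) else 0 for j in range(1, n + 1)]
            for k in range(1, n + 1)]


def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result                      # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def D_direct(n, lam):
    """det(T_n - lam * I), computed from the matrix itself."""
    T = transfer_matrix(n)
    return det([[T[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)])


def D_recursive(n, lam):
    """D_n from D_n = -(lam+1) D_{n-1} - (lam+1) D_{n-2}, D_0 = 1, D_1 = -lam."""
    prev, cur = Fraction(1), Fraction(-lam)
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, -(lam + 1) * cur - (lam + 1) * prev
    return cur


if __name__ == "__main__":
    lam = Fraction(3, 7)
    print([D_recursive(n, lam) == D_direct(n, lam) for n in range(1, 8)])
```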

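For $\lambda>-1$ one may set
\[ D_n(\lambda) = (\lambda+1)^{\frac{n-1}{2}}(-1)^n\varphi_n(\lambda), \tag{15} \]
which gives for the function $\varphi_n(\lambda)$,
\[ \varphi_n(\lambda) = \sqrt{\lambda+1}\,\varphi_{n-1}(\lambda) - \varphi_{n-2}(\lambda). \tag{16} \]
The general solution of (16) satisfying the previously defined boundary conditions (14) is given in [CN] in terms of Chebyshev polynomials of the second kind
\[ \varphi_n(\lambda) = U_{n+1}(\cos\vartheta), \tag{17} \]
where
\[ \cos\vartheta = \frac{\sqrt{\lambda+1}}{2}, \qquad 0<\vartheta<\frac{\pi}{2}. \tag{18} \]
Therefore
\[ D_n(\lambda) = (-1)^n(\lambda+1)^{\frac{n-1}{2}}U_{n+1}(\cos\vartheta) = (-1)^n(\lambda+1)^{\frac{n-1}{2}}\frac{\sin(n+2)\vartheta}{\sin\vartheta}. \tag{19} \]
The last expression enables us to obtain all the eigenvalues of the matrix $\hat T_n$. In fact, it is convenient to distinguish them according to the parity of $n$ (see [CN]):

1. $n=2m+1$:
\[ \begin{cases} \lambda_0 = -1 & (m \text{ such values}) \\ \lambda_k = 4\cos^2\dfrac{k\pi}{2m+3} - 1 & k=[1,m+1] \end{cases}. \tag{20} \]
2. $n=2m$:
\[ \begin{cases} \lambda_0 = -1 & (m \text{ such values}) \\ \lambda_k = 4\cos^2\dfrac{k\pi}{2m+2} - 1 & k=[1,m] \end{cases}. \tag{21} \]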
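The eigenvalue lists (20) and (21) can be sanity-checked numerically: each listed value should be a root of $D_n(\lambda)$ as given by the recursion (13), the list should have exactly $n$ entries, and since $\hat T_n$ has zero diagonal the entries should sum to zero. The sketch below does this in floating point; the function names are illustrative.

```python
from math import cos, pi


def D(n, lam):
    """D_n(lam) from the recursion D_n = -(lam+1)(D_{n-1} + D_{n-2}), D_0 = 1, D_1 = -lam."""
    prev, cur = 1.0, -lam
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, -(lam + 1) * (cur + prev)
    return cur


def spectrum(n):
    """Eigenvalues of T_n as listed in (20)-(21), with multiplicity."""
    m, odd = divmod(n, 2)
    count = m + 1 if odd else m          # number of values of the cosine type
    return [-1.0] * m + [4 * cos(k * pi / (n + 2)) ** 2 - 1 for k in range(1, count + 1)]


if __name__ == "__main__":
    for n in (3, 4, 10, 50):
        values = spectrum(n)
        print(n, len(values), max(values), sum(values))
```

The largest entry approaches 3 from below as $n$ grows, as the formula for $k=1$ suggests.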

Since in each case we have exactly $n$ states, this exhausts the complete set of eigenvalues, showing that they are all real and lie in the interval $[-1,3]$. One also recovers the result obtained earlier in [NGV,DN1] for the asymptotics of the highest eigenvalue of the matrix $\hat T_n$ (in the limit $n\gg1$):
\[ \lambda_{\max} = 4\cos^2\frac{\pi}{n+2} - 1 \underset{n\gg1}{\approx} 3 - \frac{4\pi^2}{n^2}; \qquad (k=1). \tag{22} \]
The number $V(n,K)$ of different words of length $K$ of the group $LF_n$ can be computed as follows:
\[ V(n,K) = \sum_{s=1}^{K} 2^s\,\frac{(K-1)!}{(s-1)!\,(K-s)!}\,\theta_n(s) = 2\,v_{\mathrm{in}}\big(2\hat T_n+\hat I\big)^{K-1}v_{\mathrm{out}}, \tag{23} \]
where $\hat I$ is the identity matrix and the appearance of the binomial coefficient is explained at length in the proof of Theorem 1. In the limit $K\to\infty$, $n=\mathrm{const}\gg1$ one can approximately (with exponential accuracy) estimate (23) as follows:
\[ \lim_{n\to\infty}\lim_{K\to\infty}\frac{\log V(n,K)}{K} \approx \lim_{n\to\infty}\lim_{K\to\infty}\frac{\log\Big[2\sum_{i=1}^{n}(2\lambda_i+1)^{K-1}\Big]}{K} = \lim_{n\to\infty}\log(2\lambda_{\max}+1) = \log 7. \tag{24} \]
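The two expressions for $V(n,K)$ in (23) and the growth rate in (24) can be evaluated exactly with integer arithmetic. The sketch below assumes the transfer matrix reading of rules (i)-(iii) used above (entry 1 when j = k - 1 or j > k) and compares the finite-$K$ growth rate with $\log(2\lambda_{\max}+1) = \log\big(3+4\cos\frac{2\pi}{n+2}\big)$; all function names are illustrative.

```python
from math import comb, cos, log, pi


def transfer_matrix(n):
    return [[1 if (j == k - 1 or j > k) else 0 for j in range(1, n + 1)]
            for k in range(1, n + 1)]


def theta(n, s):
    """theta_n(s) = v_in T^(s-1) v_out."""
    T = transfer_matrix(n)
    vec = [1] * n
    for _ in range(s - 1):
        vec = [sum(T[k][j] * vec[j] for j in range(n)) for k in range(n)]
    return sum(vec)


def V_sum(n, K):
    """First form in (23): sum over s of 2^s * C(K-1, s-1) * theta_n(s)."""
    return sum(2 ** s * comb(K - 1, s - 1) * theta(n, s) for s in range(1, K + 1))


def V_matrix(n, K):
    """Second form in (23): 2 * v_in (2T + I)^(K-1) v_out."""
    T = transfer_matrix(n)
    M = [[2 * T[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    vec = [1] * n
    for _ in range(K - 1):
        vec = [sum(M[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return 2 * sum(vec)


def growth_rate(n, K):
    """log V(n, K) / K at finite K."""
    return log(V_matrix(n, K)) / K


def limiting_rate(n):
    """log(2 * lambda_max + 1) with lambda_max = 4 cos^2(pi/(n+2)) - 1."""
    return log(3 + 4 * cos(2 * pi / (n + 2)))


if __name__ == "__main__":
    print(V_sum(3, 2), V_matrix(3, 2))
    print(growth_rate(10, 2000), limiting_rate(10), log(7))
```

Note the order of limits in (24): for $K$ small compared with $n$ the count grows faster than $7^K$, so the comparison is only meaningful for $K$ much larger than $n$.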

The exact expression of the function $V(n,K)$ is given by the following theorem:

Theorem 1. 1) The number $V(n,K)$ of elements of length $K$ of the locally free group $LF_n$ reads
\[ V(n,K) = \sum_{s=1}^{K}2^s\binom{K-1}{s-1}\theta_n(s) = \sum_{k=1}^{n+1}\frac{(-1)^{k-1}}{n+2}\,2^{n+2}\cos^{n}\frac{\pi k}{n+2}\,\sin^{2}\frac{2\pi k}{n+2}\,\Big(3+4\cos\frac{2\pi k}{n+2}\Big)^{K-1}. \tag{25} \]
2) In the limit of an infinite number of generators ($n\to\infty$) the logarithmic volume of a locally free group is
\[ v = \lim_{n\to\infty}\lim_{K\to\infty}\frac{\log V(n,K)}{K} = \log 7, \tag{26} \]
i.e. $v$ asymptotically corresponds to the logarithmic volume of a free group with four generators.

Proof. The value $V(n,K)$ can be represented in the form
\[ V(n,K) = \sum_{s=1}^{K}N(K,s)\,\theta_n(s), \tag{27} \]

where
\[ N(K,s) = {\sum_{\{m_1,\dots,m_s\}}}'\ \delta\Big[\sum_{i=1}^{s}|m_i| - K\Big] \tag{28} \]
is the number of all representations of a word of length $K$ with a fixed sequence of indices $\alpha_1,\dots,\alpha_s$; the prime means that the sum does not contain the terms with $m_i=0$ ($1\le i\le s$); and $\delta(x)$ is the Kronecker $\delta$-function: $\delta(x)=1$ for $x=0$ and $\delta(x)=0$ for $x\neq0$. The sum $N(K,s)$ is independent of $\theta_n(s)$ and can be easily evaluated, which gives for the group $LF_n$ the following expression:
\[ N(K,s) = 2^s\,\frac{(K-1)!}{(s-1)!\,(K-s)!}. \tag{29} \]
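The closed form (29) counts the ways to choose nonzero integer exponents $m_1,\dots,m_s$ with $\sum|m_i|=K$: a composition of $K$ into $s$ positive parts, times a sign for each part. A brute-force check for small $K$ (function names are illustrative):

```python
from itertools import product
from math import comb


def N_brute(K, s):
    """Count s-tuples of nonzero integers (m_1..m_s) with |m_1| + ... + |m_s| = K."""
    values = [m for m in range(-K, K + 1) if m != 0]
    return sum(1 for ms in product(values, repeat=s)
               if sum(abs(m) for m in ms) == K)


def N_formula(K, s):
    """Right-hand side of (29): 2^s * (K-1)! / ((s-1)! (K-s)!)."""
    return 2 ** s * comb(K - 1, s - 1)


if __name__ == "__main__":
    for K in range(1, 6):
        print(K, [(N_brute(K, s), N_formula(K, s)) for s in range(1, K + 1)])
```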

(The above expression (23) is based on (27)–(29).) Now we turn to the computation of the function $\theta_n(s)$. Our approach is based on the exact evaluation of a "correlation function" $\theta_n(x,s)$, which determines the number of various sequences of $s$ generators satisfying the rules (i), (ii), (iii) and ending with index $\alpha_s=x$ (the index of the first generator in the sequence is not fixed and can be arbitrary). Using the representation (10) we write down an evolution equation in "time" $s$ for the function $\theta(x,s)\equiv\theta_n(x,x_0,s)$:
\[ \begin{cases} \theta(1,s+1) = \displaystyle\sum_{y=2}^{n}\theta(y,s) \\ \theta(x,s+1) = \theta(x-1,s) + \displaystyle\sum_{y=x+1}^{n}\theta(y,s), & x=2,\dots,n-1 \\ \theta(n,s+1) = \theta(n-1,s) \\ \theta(x,1) = 1, & x=1,\dots,n \end{cases}. \tag{30} \]
Define a generating function $\Theta(x)\equiv\Theta_n(x,\lambda) = \sum_{s=1}^{\infty}\theta_n(x,s)\lambda^{-s}$. Rewriting (30) in terms of the function $\Theta(x)$ we arrive at the set of algebraic equations
\[ \begin{cases} -\lambda\Theta(1) + \displaystyle\sum_{y=2}^{n}\Theta(y) = -1 \\ \Theta(x-1) - \lambda\Theta(x) + \displaystyle\sum_{y=x+1}^{n}\Theta(y) = -1, & x=2,\dots,n-1 \\ \Theta(n-1) - \lambda\Theta(n) = -1 \end{cases}. \tag{31} \]

Hence, $\Theta(n) = \dfrac{T_n}{D_n}$, where
\[ D_n = \begin{vmatrix} -\lambda&1&1&1&\cdots&1&1\\ 1&-\lambda&1&1&\cdots&1&1\\ 0&1&-\lambda&1&\cdots&1&1\\ 0&0&1&-\lambda&\cdots&1&1\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&0&0&\cdots&-\lambda&1\\ 0&0&0&0&\cdots&1&-\lambda \end{vmatrix}; \qquad T_n = \begin{vmatrix} -\lambda&1&1&1&\cdots&1&-1\\ 1&-\lambda&1&1&\cdots&1&-1\\ 0&1&-\lambda&1&\cdots&1&-1\\ 0&0&1&-\lambda&\cdots&1&-1\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&0&0&\cdots&-\lambda&-1\\ 0&0&0&0&\cdots&1&-1 \end{vmatrix} \tag{32} \]
(compare to (19)). The determinants $D_m$ and $T_m$ ($0\le m\le n$) satisfy the following recursion relations:
\[ \begin{cases} D_{m+1} + (\lambda+1)D_m + (\lambda+1)D_{m-1} = 0 \\ D_0 = 1,\ D_1 = -\lambda \end{cases}; \qquad \begin{cases} T_{m+1} + (\lambda+1)T_m + (\lambda+1)T_{m-1} = 0 \\ T_0 = 0,\ T_1 = -1 \end{cases}. \tag{33} \]
Solving (33) we get
\[ D_n = (-1)^n(\lambda+1)^{\frac{n-1}{2}}\frac{\sin\varphi(n+2)}{\sin\varphi}, \qquad T_n = (-1)^n(\lambda+1)^{\frac{n-1}{2}}\frac{\sin\varphi n}{\sin\varphi}, \tag{34} \]
where $\cos\varphi = \frac12\sqrt{\lambda+1}$. Thus the boundary value $\Theta(n)$ reads
\[ \Theta(n) = \frac{\sin\varphi n}{\sin(n+2)\varphi}. \tag{35} \]
All other functions $\Theta(x)$ for $x=1,\dots,n-1$ can be easily found from the equation derived on the basis of (31):
\[ \begin{cases} (\lambda+1)\,\Theta(x+1) - (\lambda+1)\,\Theta(x) + \Theta(x-1) = 0, & x=2,\dots,n \\ \Theta(1) = \Theta(2) \end{cases}, \tag{36} \]
where the right boundary condition is given by (35). The solution of (36) is as follows:
\[ \Theta(x)\equiv\Theta_n(x,\lambda) = (\lambda+1)^{\frac{n-x}{2}}\frac{\sin\varphi x}{\sin(n+2)\varphi}. \tag{37} \]

The generating function 2n (λ) is a sum of the functions 2n (x, λ) over all x = 1, . . . , n. After simple algebra we arrive at the following expression for 2n (λ): √ (2 cos φ)n+1 sin φn λ+1 ; cos φ = . (38) 2n (λ) = −1 + sin(n + 2)φ 2 It is convenient to express the function θn (s) via the contour integral I 1 2(λ) dλ, θn (s) = 2π i λ−s+1 C0


A. M. Vershik, S. Nechaev, R. Bikbov

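where the contour $C_0$ surrounds the pole $\lambda = 0$ and lies in the regularity area of the function $\Theta_n(\lambda)$. Hence,

\[
\theta_n(s) = - \sum_{\lambda_k \ne 0} \operatorname{Res}_{\lambda = \lambda_k} \left( \frac{\Theta(\lambda)}{\lambda^{-s+1}} \right),
\tag{39}
\]

where $\lambda_k$ are the poles of the function $\Theta(\lambda)$, i.e. the zeros of $\sin (n+2)\varphi$:

\[
\lambda_k = 4\cos^2 \frac{\pi k}{n+2} - 1, \qquad k = 1, \dots, \left[ \frac{n+1}{2} \right].
\tag{40}
\]

Let us note that the $\lambda_k$ correspond to the zeros of Chebyshev polynomials of the second kind. Evaluating the residues in (39), we get the following exact expression:

\[
\theta_n(s) = \sum_{k=1}^{n+1} (-1)^{k-1}\, \frac{2^{n+1}}{n+2}\, \cos^{n} \frac{\pi k}{n+2}\, \sin^{2} \frac{2\pi k}{n+2} \left( 1 + 2\cos \frac{2\pi k}{n+2} \right)^{s-1}.
\tag{41}
\]

□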
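The closed form (41) can be checked numerically against the evolution equation (30). The following Python sketch is only an illustration under stated assumptions: the function names `theta_by_recursion` and `theta_closed_form` are ours (not the authors'), $\theta_n(s)$ is taken to be the sum of $\theta_n(x, s)$ over $x = 1, \dots, n$, and (41) is read as $\sum_{k=1}^{n+1} (-1)^{k-1} \frac{2^{n+1}}{n+2} \cos^{n}\frac{\pi k}{n+2} \sin^{2}\frac{2\pi k}{n+2} \left(1 + 2\cos\frac{2\pi k}{n+2}\right)^{s-1}$. The comparison is done in floating point.

```python
import math


def theta_by_recursion(n: int, s: int) -> int:
    """Sum over x of theta_n(x, s), obtained by iterating Eq. (30)."""
    theta = [1] * n  # theta(x, 1) = 1 for x = 1..n (stored 0-based)
    for _ in range(s - 1):
        new = []
        for x in range(n):
            left = theta[x - 1] if x >= 1 else 0  # theta(x - 1, s), absent for x = 1
            right = sum(theta[x + 1:])            # sum over y = x + 1 .. n
            new.append(left + right)
        theta = new
    return sum(theta)


def theta_closed_form(n: int, s: int) -> float:
    """Right-hand side of Eq. (41), evaluated in floating point."""
    total = 0.0
    for k in range(1, n + 2):
        a = math.pi * k / (n + 2)
        weight = (-1) ** (k - 1) * 2 ** (n + 1) / (n + 2)
        total += (weight * math.cos(a) ** n * math.sin(2 * a) ** 2
                  * (1 + 2 * math.cos(2 * a)) ** (s - 1))
    return total
```

For small $n$ and $s$ the two functions should agree up to rounding errors, which gives a quick consistency check of the residue computation.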

Theorem 2. 1. The number $V(n, K \,|\, LFI^+_{n+1})$ of elements of length $K$ of the idempotent semi-group $LFI^+_{n+1}$ is

\[
V^{+i}(n, K) = \sum_{k=1}^{n+1} (-1)^{k-1}\, \frac{2^{n+1}}{n+2}\, \cos^{n} \frac{\pi k}{n+2}\, \sin^{2} \frac{2\pi k}{n+2} \left( 1 + 2\cos \frac{2\pi k}{n+2} \right)^{K-1}.
\tag{42}
\]

2. The number $V(n, K \,|\, LF^+_{n+1})$ of elements of length $K$ of the semi-group $LF^+_{n+1}$ is

\[
V^{+}(n, K) = \sum_{k=1}^{n+1} (-1)^{k-1}\, \frac{2^{n+1}}{n+2}\, \cos^{n} \frac{\pi k}{n+2}\, \sin^{2} \frac{2\pi k}{n+2} \left( 2 + 2\cos \frac{2\pi k}{n+2} \right)^{K-1}.
\tag{43}
\]

3. The number $V(n, K \,|\, LF^{+(r)}_{n+1})$ of elements of length $K$ of the semi-group $LF^{+(r)}_{n+1}$ of local order $r$ is

\[
V^{+(r)}(n, K) = \sum_{k=1}^{n+1} (-1)^{k-1}\, \frac{2^{n+1}}{n+2}\, \cos^{n} \frac{\pi k}{n+2}\, \sin^{2} \frac{2\pi k}{n+2}\; p_k^{(r)}(K),
\tag{44}
\]

where

\[
p_k^{(r)}(K) = \frac{1}{2\pi i} \oint_{C} \sum_{j=1}^{K} \left( 1 + 2\cos \frac{2\pi k}{n+2} \right)^{j-1} \left( z + \dots + z^{r-1} \right)^{j} \frac{dz}{z^{K+1}}.
\tag{45}
\]

In particular, the following asymptotic expression is valid at $K \to \infty$:

\[
V^{+(r)}(n, K) = \frac{2^{n+2}}{n+2}\, \cos^{n} \frac{\pi}{n+2}\, \sin^{2} \frac{2\pi}{n+2}\; \frac{\left( 1 - z_0^{r-1} \right)^2}{1 - r z_0^{r-1} + (r-1) z_0^{r}}\; \frac{1}{z_0^{K-1}},
\tag{46}
\]

where $z_0$ is the positive root of the algebraic equation

\[
z + \dots + z^{r-1} = \frac{1}{1 + 2\cos \frac{2\pi}{n+2}}.
\tag{47}
\]
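The $K$-dependent factor of the asymptotic expression (46) can be compared numerically with the exact value of $p_1^{(r)}(K)$ from (45). The Python sketch below is a hedged illustration only: the helper names `p_exact`, `z0_root` and `p_asymptotic` are hypothetical, the exponent in (45) is read as $K + 1$ (as in (53)), the parameter `p` stands for $1 + 2\cos\frac{2\pi}{n+2}$, and the bisection assumes $p\,(r-1) > 1$ so that the root of (47) lies in $(0, 1)$.

```python
def p_exact(r: int, p: float, K: int) -> float:
    """Coefficient of z^K in sum_{j=1..K} p^(j-1) (z + ... + z^(r-1))^j, cf. (45)."""
    power = [0.0] * (K + 1)
    power[0] = 1.0                    # coefficients of g(z)^0
    total = 0.0
    for j in range(1, K + 1):
        nxt = [0.0] * (K + 1)
        for deg, c in enumerate(power):
            if c:
                for m in range(1, r):  # multiply by g(z) = z + ... + z^(r-1)
                    if deg + m <= K:
                        nxt[deg + m] += c
        power = nxt                    # coefficients of g(z)^j, truncated at z^K
        total += p ** (j - 1) * power[K]
    return total


def z0_root(r: int, p: float) -> float:
    """Positive root of z + ... + z^(r-1) = 1/p, Eq. (47), by bisection on (0, 1)."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if sum(mid ** m for m in range(1, r)) < 1.0 / p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def p_asymptotic(r: int, p: float, K: int) -> float:
    """K-dependent factor of Eq. (46)."""
    z0 = z0_root(r, p)
    numerator = (1 - z0 ** (r - 1)) ** 2
    denominator = 1 - r * z0 ** (r - 1) + (r - 1) * z0 ** r
    return numerator / (denominator * z0 ** (K - 1))
```

For instance, $n = 2$ and $r = 3$ give $p = 1$ and $g(z) = z + z^2$, so `p_exact` counts the compositions of $K$ into parts 1 and 2 (Fibonacci numbers), and the ratio `p_exact / p_asymptotic` tends to 1 as $K$ grows.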

Proof. All the values $V(n, K \,|\, LFI^+_{n+1})$, $V(n, K \,|\, LF^+_{n+1})$, $V(n, K \,|\, LF^{+(r)}_{n+1})$ can be computed on the basis of Eq. (27), where the function $N(n, K)$ is model-dependent:

\[
N(n, K) =
\begin{cases}
N^{+i}(n, K) & \text{for } LFI^+_{n+1}, \\
N^{+}(n, K) & \text{for } LF^+_{n+1}, \\
N^{+(r)}(n, K) & \text{for } LF^{+(r)}_{n+1},
\end{cases}
\tag{48}
\]

and the function $\theta_n(s)$ is universal: in all cases $\theta_n(s)$ is given by (41). For the semi-groups $LFI^+_{n+1}$ and $LF^+_{n+1}$ we have correspondingly (compare to (29))

\[
N^{+i}(n, K, s) = \delta_{K,s}, \qquad
N^{+}(n, K, s) = \frac{(K-1)!}{(s-1)!\,(K-s)!},
\tag{49}
\]

where $\delta_{K,s}$ is the Kronecker $\delta$-function. The value $V(n, K \,|\, LFI^+_{n+1})$ coincides with $\theta_n(s)$ at $s = K$, which is obvious from the structure of the matrix (10).

The case of the semi-group $LF^{+(r)}_{n+1}$ needs a more advanced analysis. According to the definition, the function $N^{+(r)}(s, K)$ is the number of all partitions $m_1 + \dots + m_s = K$ such that $m_i \in \{1, \dots, r-1\}$ for all $i \in \{1, \dots, s\}$. Consider the function $g(z) = z + \dots + z^{r-1}$. The function $g^s(z)$ is a generating function for the values $N^{+(r)}(s, K)$:

\[
g^s(z) = \left( z + \dots + z^{r-1} \right)^s = \sum_{K=1}^{\infty} N^{+(r)}(s, K)\, z^K,
\tag{50}
\]

i.e.

\[
N^{+(r)}(s, K) = \frac{1}{2\pi i} \oint_{C} \frac{g^s(z)}{z^{K+1}}\, dz,
\tag{51}
\]

where the contour $C$ surrounds the origin. Thus, we have

\[
V^{+(r)}(n, K) = \sum_{k=1}^{\left[ \frac{n+1}{2} \right]} (-1)^{k-1}\, \frac{2^{n+2}}{n+2}\, \cos^{n} \frac{\pi k}{n+2}\, \sin^{2} \frac{2\pi k}{n+2}\; p_k^{(r)}(K),
\tag{52}
\]

and

\[
p_k^{(r)}(K) = \frac{1}{2\pi i} \oint_{C} \sum_{s=1}^{K} p_k^{s-1}\, g^s(z)\, \frac{dz}{z^{K+1}}, \qquad k = 1, \dots, \left[ \frac{n+1}{2} \right],
\tag{53}
\]

where $p_k = 1 + 2\cos \frac{2\pi k}{n+2}$ and we have rewritten (44) as (52). Let us now represent (53) in the following form:

\[
p_k^{(r)}(K) = \frac{1}{2\pi i} \oint_{C} \frac{g\, dz}{(1 - p_k g)\, z^{K+1}} - \frac{1}{2\pi i} \oint_{C} \frac{p_k^{K} g^{K+1}\, dz}{(1 - p_k g)\, z^{K+1}}.
\tag{54}
\]

The second integral is identically zero because the integrand is regular in the vicinity of the point $z = 0$. So, we have

\[
p_k^{(r)}(K) = \frac{1}{2\pi i} \oint_{C} \frac{g\, dz}{(1 - p_k g)\, z^{K+1}}.
\tag{55}
\]
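The passage from the finite sum (53) to the single integral (55) can be illustrated on power series coefficients. In the Python sketch below (the function names are hypothetical, and an integer value of `p` is used only to keep the arithmetic exact) the coefficient of $z^K$ in $g/(1 - p\,g)$ is obtained from the identity $f = g + p\,g\,f$ and compared with the sum over $s$ built from a brute-force count of $N^{+(r)}(s, K)$.

```python
from itertools import product


def n_plus_r(r: int, s: int, K: int) -> int:
    """N^{+(r)}(s, K): number of (m_1, ..., m_s), 1 <= m_i <= r - 1, with sum K."""
    return sum(1 for parts in product(range(1, r), repeat=s) if sum(parts) == K)


def p_from_sum(r: int, p: int, K: int) -> int:
    """Eq. (53) read on coefficients: sum_{s=1..K} p^(s-1) N^{+(r)}(s, K)."""
    return sum(p ** (s - 1) * n_plus_r(r, s, K) for s in range(1, K + 1))


def p_from_series(r: int, p: int, K: int) -> int:
    """Coefficient of z^K in f = g / (1 - p g), Eq. (55), using f = g + p g f."""
    f = [0] * (K + 1)
    for deg in range(1, K + 1):
        g_coeff = 1 if deg <= r - 1 else 0     # coefficient of z^deg in g(z)
        f[deg] = g_coeff + p * sum(f[deg - m] for m in range(1, r) if deg - m >= 1)
    return f[K]
```

Both functions return the same integer for every $K \ge 1$, which reflects the fact that the terms with $s > K$ do not contribute to the coefficient of $z^K$.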

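For large $K$ the sum (52) is dominated by the term with $k = 1$, and the asymptotic behavior of (55) is governed by the pole closest to the origin $z = 0$, which in turn is the solution of the equation

\[
g(z)\, p_k = 1.
\tag{56}
\]

Thus, we arrive at the expression for the function $V^{+(r)}(n, K)$:

\[
V^{+(r)}(n, K) = \frac{2^{n+2}}{n+2}\, \cos^{n} \frac{\pi}{n+2}\, \sin^{2} \frac{2\pi}{n+2}\; p_1^{(r)}(K).
\tag{57}
\]

Now, taking into account that one can rewrite $p_1^{(r)}$ in the form

\[
p_1^{(r)} = - \operatorname{Res}_{z = z_0} \frac{g(z)}{\left( 1 - p_1 g(z) \right) z^{K+1}},
\tag{58}
\]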

where $z_0$ is the positive root of the equation $g(z)\, p_1 = 1$ (compare to (47)), and evaluating the residue (58), we arrive at the statement of the theorem. □

4. Random Walk on Locally Free Group: The Drift

The computation of the drift of the random walk on the locally free group proposed below generalizes the corresponding results for the free group. Recall that a symmetric $N$-step random walk on a free group $\Gamma_n$ with $n$ generators is a cross product of a nonsymmetric $N$-step random walk on a half-line $\mathbb{Z}_+$ and a layer over $N \in \mathbb{Z}_+$ giving a set of all words of length $N$ with the uniform distribution. The transition probabilities in the base are:

\[
\begin{cases}
\text{step forward on } \mathbb{Z}_+ & \text{with the probability } \frac{2n-1}{2n}, \\
\text{step backward on } \mathbb{Z}_+ & \text{with the probability } \frac{1}{2n}.
\end{cases}
\]

Thus, the mathematical expectation of a word's length after $N$ steps reads

\[
N \left[ (+1) \times \frac{2n-1}{2n} + (-1) \times \frac{1}{2n} \right] = N\, \frac{n-1}{n},
\]

and hence the drift is

\[
l = \lim_{N \to \infty} \frac{1}{N}\, E \sum_{i=1}^{N} \xi_i = \frac{n-1}{n}.
\]
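The value $l = \frac{n-1}{n}$ can be illustrated by a direct simulation of the walk on the free group. The following Python sketch is not part of the original derivation: the function name `free_group_drift`, the seed and the number of steps are arbitrary choices, and the reduced word is stored as a list of signed generator indices.

```python
import random


def free_group_drift(n: int, steps: int, seed: int = 0) -> float:
    """Estimate the drift of the simple random walk on the free group with n generators."""
    rng = random.Random(seed)
    word = []                                   # reduced word; letters are +-1, ..., +-n
    for _ in range(steps):
        letter = rng.choice((1, -1)) * rng.randint(1, n)
        if word and word[-1] == -letter:
            word.pop()                          # cancellation: step backward
        else:
            word.append(letter)                 # step forward
    return len(word) / steps
```

For $n = 2$ the estimate should be close to $1/2$, in agreement with the formula above, up to statistical fluctuations of order $N^{-1/2}$.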

Fig. 3. Typical configuration of a colored heap. Elements of a roof are shown by filled squares

For example, for the group $\Gamma_2$ with two generators ($n = 2$) the drift is equal to $\frac{1}{2}$.

To compute the drift of the random walk on the group $LF_{n+1}$ one should understand in more detail the structure of the normal form of elements of $LF_{n+1}$. It is useful to reformulate the indicated concepts geometrically, following the ideas of G.X. Viennot (see [Vi] for a review). This representation arose in connection with the theory of partially commutative monoids [CF]. We imagine a word in the group $LF_n$ as a finite configuration of cells in a set called hereafter a heap or a colored heap, depending on whether the semi-group or the group is considered (see Fig. 3). Namely, we consider the strip $P = \mathbf{n} \times \mathbb{Z}_+ \subset \mathbb{Z}^2$ ($\mathbf{n} = \{1, 2, \dots, n\}$; $\mathbb{Z}_+ = \{0, 1, \dots\}$) as a subset of the lattice $\mathbb{Z}^2$.

Definition 10. 1. We call a heap a finite set $H \subset P$ satisfying the conditions: (a) in a row the cells from the set $H$ cannot be neighbors; (b) each cell from $H$ not standing in the first row has at least one cell in the previous row touching it. (The touching of cells means that the horizontal coordinates of such cells differ by not more than 1.)
2. Let each cell from $H$ have one of two colors $(+, -)$. We shall assume that besides the conditions 1 and 2 the following one is fulfilled:
3. In one and the same column $i = 1, \dots, n$, cells of different colors cannot be neighbors. In the last case $H$ is a colored heap.

The set of heaps with the number of cells $K \ge 0$ is denoted by $H_K^{(n)}$ ($H^{(n)} = \bigcup_{K=0}^{\infty} H_K^{(n)}$, and $H_0^{(n)}$ is the empty heap). The concept of a heap was introduced and investigated by Viennot [Vi] in connection with combinatorial problems of partially commutative monoids of Cartier and Foata [CF] and the so-called "directed lattice animals" considered in [HNDV, Vi]. Denote by $CH_K^{(n)}$ the set of colored heaps with the number of cells $K \ge 0$; thus $CH^{(n)} = \bigcup_{K=0}^{\infty} CH_K^{(n)}$. As far as we know, colored heaps have not been considered so far.

Definition 11. The numbered heap is a heap whose cells are enumerated by natural numbers so that the enumeration is monotone, i.e. if two cells touch each other, the top cell has the larger number.

Definition 12. We call the roof of the heap (of the colored heap) the set $T(H)$ of those elements of a heap which have no upper neighbors in the same and closest columns. In other words, an element belongs to the roof of a heap of $K$ elements if after the removal of this element we get a heap of $K - 1$ elements.

Lemma 3. There exists a bijection $\tau$ between the set of words of a locally free group $LF_{n+1}$ and the set of colored heaps $CH^{(n)}$. For the semi-group $LF^+_{n+1}$ the same statement is true with the replacement $CH^{(n)} \to H^{(n)}$, and for this case the bijection was given in the work [Vi].

Proof. By induction. The unity of the group corresponds to the empty heap. Let $\tau$ be determined for words of length $\le K$. Associate with a word of length $K + 1$ the colored heap obtained by adding the element $g_i^{\pm 1}$ in the $i$-th column to an already existing colored heap, i.e. put a cell so that Conditions 1 and 2 of Definition 11 are satisfied and the new cell belongs to the roof of the heap. If directly under the new cell there was already a cell with the same coordinate $i$ and the opposite color, these cells cancel. It is easy to see that if two words are equal, the corresponding (colored) heaps coincide. Let us show now that any (colored) numbered heap is uniquely associated with some word in $LF_{n+1}$. Namely, we construct an algorithm which assigns a word in the normal order to a given numbered (colored) heap (see Fig. 4):

1. Denote the left-most cell at the bottom as Cell No. 1, corresponding to the first letter in the normal order form for the given heap. For definiteness assume that this cell is in column $j$. Cell No. 2 is a cell located in a column $k$ ($k \le j$) as close as possible to Cell No. 1. Now we look for the left-most cell closest to Cell No. 2, and so on. Continuing such enumeration we get a part of the heap called a "cluster".
2. If there are no more cells satisfying rule 1, we continue numbering with the bottommost cell which is the closest right-hand neighbor of the given cluster such that this new cell leaves the roof of the cluster unchanged. This new cell is added to the cluster and the enumeration is continued recursively.

As a result, we get a numbering corresponding to the normal form of the given word. Thus, Lemma 3 is proved. □

Remark 1. There is an analogy between heaps and Young diagrams, as well as between numbered heaps and Young tables [Ve2, Ve3].

Remark 2. Continuing the analogy between heaps and Young diagrams, we can say that the roof is analogous to the corners of the Young diagrams.

The dynamics of the words' growth, i.e. the random walk on $LF_{n+1}$ ($LF^+_{n+1}$), acquires the following obvious geometrical sense: it is a Markov chain with the states taken from the set of colored heaps (or just heaps). The transitions consist in the addition of cells (in view of Conditions 1–3 of Definition 7) with the probabilities $\frac{1}{2n}$ for $LF_{n+1}$ and $\frac{1}{n}$ for $LF^+_{n+1}$.

Fig. 4. Example of construction of a normally ordered word by a given heap. The configuration of some current cluster is shown by filled cells. This particular heap defines the following word in normal order form: $g_1 g_1 g_3 g_2 g_1 g_4 g_3 g_3 g_2 g_6 g_5 g_5 g_6$

For each element $w$ of the locally free group $LF_{n+1}$, written in the normal form, we assign a set of removable generators $T(w)$ (as proposed by J. Desbois in [DN2]):

\[
T(w) = \{\, i = 1, \dots, n : K(w\, g_i) = K(w) - 1 \,\},
\]

where $K(w)$ is the word's length. The removable generator $g_i$ in the word $w$ is a generator in the representation $w = w_1 g_i^m w_2$ such that, on multiplication of the element $w$ from the right-hand side by $g_i^{\pm 1}$, the new word $w'$ can be written as $w' = w_1 g_i^{m \pm 1} w_2$, i.e. $w_2$ and $g_i$ commute. In particular, the generator $g_i^{m \pm 1}$ can be reduced if $m \pm 1 = 0$. It is easy to see that removable generators are those which can be reduced in one step of a random walk. Below we number the generators $g_i$ by the index $i$ and the generators $g_i^{-1}$ by the index $-i$. The set $T(w)$ has the following obvious properties:

(i) If $i \in T(w)$, then $-i \notin T(w)$, and if $-i \in T(w)$, then $i \notin T(w)$;
(ii) If $i \in T(w)$ or $-i \in T(w)$, then $\pm(i-1), \pm(i+1) \notin T(w)$.

The last property entails the inequality $\#T(w) \le \frac{n+1}{2}$. Continuing the geometric interpretation, we can identify the set $T(w)$ of removable elements with the roof of the heap.

4.1. Mathematical expectation of the heap's roof. In the geometrical interpretation described above, the set $T(H)$ of elements of the roof is the set of those cells of a heap whose removal in one step leaves an allowable configuration (see Definition 12).

The basis of the roof² of a heap $H$ is the subset $T(H) \subset \{1, \dots, n\}$ of the set of removable generators. This subset, as can be seen from the properties (i)–(ii), satisfies the condition: if $k_1, k_2 \in T(H)$, then $|k_1 - k_2| > 1$. Denote by $\mathcal{T}_n$ the family of all such subsets of the set $\{1, \dots, n\}$. In the case of a colored heap the basis consists of subsets of $\{1, \dots, n\}$ painted in two colors $(+, -)$. We denote the family of these subsets by $\mathcal{T}_n^c$.

Remark 3. Let $w$ be an element of the group $LF_{n+1}$ and $H$ be the corresponding heap. Then $T_n(H)$ is exactly the set of removable generators. It is convenient to characterize $T \in \mathcal{T}_n$ by a vector $(\varepsilon_1, \dots, \varepsilon_n)$ with elements 0 and 1, where $\{\varepsilon_r = 1\} \Leftrightarrow r \in T$.

Lemma 4. The power $\#\mathcal{T}_n$ of the set $\mathcal{T}_n$ is equal to the Fibonacci number $F_n$ and hence it grows as $\lambda^n$, where $\lambda = \frac{\sqrt{5}+1}{2} \approx 1.618$ is the golden mean. The power of the set $\mathcal{T}_n^c$ is equal to $2^n$.

Proof. The power $\#\mathcal{T}_n$ is equal to the number of sequences of elements 0 and 1 of length $n$ such that these sequences do not have the elements 1 in succession, i.e. satisfy the recursion relation

\[
F_n = F_{n-1} + F_{n-2}, \qquad F_1 = 1;\ F_2 = 2,
\tag{59}
\]

which defines the Fibonacci sequence. Similarly, the number of elements of the set $\mathcal{T}_n^c$ satisfies the recursion relation

\[
F_n^c = F_{n-1}^c + 2F_{n-2}^c, \qquad F_0^c = 1;\ F_1^c = 2;\ F_2^c = 4.
\tag{60}
\]

Indeed, if a sequence from $\mathcal{T}_n^c$ begins with 0, the part remaining after the removal of 0 is any sequence from $\mathcal{T}_{n-1}^c$. If the sequence begins with 1, then by definition its 2nd element is 0. Deleting these two elements (1 and the 0 following it), we get a sequence from $\mathcal{T}_{n-2}^c$. Thus, the power of the set $\mathcal{T}_n^c$ satisfies the recursion relation (60), and consequently $F_n^c = 2^n$. □

Define the time-homogeneous Markov chain whose states at any moment of time are the sets $T \in \mathcal{T}_n$ and whose transition probabilities from the state $T$ to the state $T'$ are determined by time-independent rules. Let $T = \{\varepsilon_1, \dots, \varepsilon_n\}$; $T' = \{\varepsilon'_1, \dots, \varepsilon'_n\}$. Then the transition matrix is as follows. The transition probability $T \to T'$ is nonzero and is equal to $\frac{1}{n}$ only for the cases when $\varepsilon_i = \varepsilon'_i$ for all $i$ except not more than three consecutive numbers, say $(\varepsilon_{r-1}, \varepsilon_r, \varepsilon_{r+1})$ and $(\varepsilon'_{r-1}, \varepsilon'_r, \varepsilon'_{r+1})$, and for these triples one of the following conditions is satisfied:

\[
\begin{array}{lll}
\text{If } \varepsilon_{r-1} = \varepsilon_{r+1} = 1 & \text{then} & \varepsilon'_r = 1,\ \varepsilon'_{r-1} = \varepsilon'_{r+1} = 0; \\
\text{If } \varepsilon_{r-1} = 1,\ \varepsilon_{r+1} = 0 & \text{then} & \varepsilon'_r = 1;\ \varepsilon'_{r-1} = \varepsilon'_{r+1} = 0; \\
\text{If } \varepsilon_{r+1} = 1,\ \varepsilon_{r-1} = 0 & \text{then} & \varepsilon'_r = 1;\ \varepsilon'_{r-1} = \varepsilon'_{r+1} = 0; \\
\text{If } \varepsilon_{r-1} = \varepsilon_r = \varepsilon_{r+1} = 0 & \text{then} & \varepsilon'_{r-1} = \varepsilon'_{r+1} = 0,\ \varepsilon'_r = 1.
\end{array}
\tag{61}
\]

Thus, the Markov chain is determined on the set of states $\mathcal{T}_n$. Later on we will be interested in the asymptotics of the mathematical expectation of the size of a roof. This computation was first carried out in [DN2]. We repeat in Theorem 3 the main steps of the derivation of [DN2].

² Hereafter, unless stipulated otherwise, we shall use the term "roof" to designate both the roof and the basis of a roof.

Theorem 3. The limit of the mathematical expectation of the number of removable generators for a random walk on the semi-group $LF^+_{n+1}$ for $n \gg 1$ (i.e. the limit of the mathematical expectation of the roof of a heap) is

\[
\lim_{N \to \infty} E\,\#T(w_N) = \frac{n}{3}.
\tag{62}
\]

Proof. Compute the mathematical expectation of the number of removable elements when we do not distinguish between generators and their inverses, i.e. for the random walk on the semi-group $LF^+_{n+1}$. Let us represent the elements of the roof $T(w)$ (i.e. the removable generators) graphically by filled boxes on a diagram.

(Diagram: a row of $n$ boxes in which the boxes at positions 1, 4, 8 and 10 are filled.)

Here $n = 11$, $\#T = 4$. Denote by $h_j = k_j - k_{j-1} - 1$ the lengths of the intervals between neighboring boxes or between a box and the edge of the diagram. Let $T$ consist of the set $\{k_1, \dots, k_s\}$. If the edge points (1 and $n$) do not belong to $T$, then $h_1 = k_1$, $h_{s+1} = n - k_s - 1$; if one or both edge points belong to $T$, then $h_1 = k_1 - 1$, $h_{s+1} = n - k_s$. For example, if $k_1 = 1$ then $h_1 = 0$, or if $k_s = n$ then $h_{s+1} = 0$. (On the above diagram $h_1 = 0$, $h_2 = 2$, $h_3 = 3$, $h_4 = 1$.) The values $h_j$ satisfy the following relation, valid when neglecting the "boundary effects" at $\#T \gg 1$, $n \gg 1$:

\[
\sum_j h_j = n - \#T.
\tag{63}
\]

It is not hard to establish the rules according to which the diagram is changed by a multiplication of $w$ by $g_r$ (or by $g_r^{-1}$) which increases $\#T(w)$ by 1: a point appears in position $r$, while the points in positions $(r-1)$ and/or $(r+1)$ (if they were present) disappear. Having this rule in mind, let us write explicit expressions for the one-step increment $\Delta T(w)$ of the size of the roof in terms of $h_j(w)$, provided that the boundary points do not belong to $T$:

\[
\begin{cases}
\Delta T(w) = +1 & \text{with the probability } q_+ = \frac{1}{n} \sum\limits_{j:\, h_j \ge 3} (h_j - 2), \\
\Delta T(w) = 0 & \text{with the probability } q_0 = \frac{1}{n} \left( \#T + 2\,\#\{j : h_j \ge 2\} \right), \\
\Delta T(w) = -1 & \text{with the probability } q_- = \frac{1}{n}\, \#\{j : h_j = 1\}.
\end{cases}
\tag{64}
\]

Summing (64), we obtain the conditional mathematical expectation of the local rearrangement of the roof for the fixed element $w$:

\[
\begin{aligned}
E_w \Delta T &= 1 \times q_+ + 0 \times q_0 + (-1) \times q_- \\
&= \frac{1}{n} \sum_{j:\, h_j \ge 3} (h_j - 2) + 0 \times \frac{1}{n}\left( \#T + 2\,\#\{j : h_j \ge 2\} \right) + (-1) \times \frac{1}{n}\, \#\{j : h_j = 1\} \\
&= \frac{1}{n} \sum_j (h_j - 2) = \frac{1}{n} \sum_j h_j - \frac{2}{n}\, \#T = 1 - \frac{3}{n}\, \#T(w).
\end{aligned}
\tag{65}
\]
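The identity $E_w \Delta T = 1 - \frac{3}{n}\#T(w)$ can be verified exhaustively for small $n$ in the case of periodic boundary conditions. The Python sketch below is an illustration with hypothetical helper names: it enumerates all admissible roofs on a ring of $n \ge 3$ columns, applies the update rule "the chosen column joins the roof and its neighbors leave it", and computes the expected increment exactly with rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def roofs_on_ring(n: int):
    """All 0/1 vectors of length n with no two adjacent 1s (periodic boundary)."""
    for eps in product((0, 1), repeat=n):
        if all(not (eps[i] and eps[(i + 1) % n]) for i in range(n)):
            yield eps


def expected_increment(eps) -> Fraction:
    """E_w[Delta T] when a cell is dropped into a uniformly chosen column r."""
    n = len(eps)
    total = Fraction(0)
    for r in range(n):
        if eps[r] == 0:
            # column r joins the roof; neighbours that were in the roof leave it
            total += 1 - eps[r - 1] - eps[(r + 1) % n]
        # if eps[r] == 1 the roof is unchanged (increment 0)
    return total / n
```

For every configuration `eps` returned by `roofs_on_ring(n)`, the value `expected_increment(eps)` equals `1 - Fraction(3 * sum(eps), n)` exactly, in line with the statement that (65) holds for any finite $n$ under periodic boundary conditions.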

Let us mention that, depending on whether the boundary points belong or do not belong to the set $T(w)$, the right-hand side of Eq. (65) is changed by terms which do not exceed $\frac{4}{n}$. Therefore in the large $n$ limit the expression (65) is exact. In the case of periodic boundary conditions Eq. (65) is exact for any finite value of $n$. Since our Markov chain has a finite set of states and is ergodic, it has a unique invariant measure. The Markov chain with this invariant measure is stationary. So, the mathematical expectation $E[\Delta T(w)]$ over all elements $w$ with respect to the invariant measure exists and is finite, therefore $E[\Delta T(w)] = 0$. Thus, from the strong law of large numbers (or, equivalently, from the individual ergodic theorem) it follows that for the random walk on the semi-group we have Eq. (62) for the mathematical expectation of the number of removable elements (i.e. the number of elements of a roof). □

The distinction between the semi-group $LF^+_{n+1}$ (i.e. the heap) and the group $LF_{n+1}$ (i.e. the colored heap) is due to the fact that for the random walk on the group there is a possibility of the word's reduction with the probability $\frac{1}{2n}\,\#T^c(w)$. To account for that, we introduce the probabilities $p_w^+$ and $p_w^-$ to increase and to reduce the size of the roof $\#T^c(w)$ of a colored heap by one under the condition of the word's length reduction. The difference of the conditional probabilities $p_w^+$ and $p_w^-$ to change the value $\#T^c(w)$ by one, provided that a reduction of the word occurs, should be added to the mathematical expectation of the change of $\#T(w)$ in the case of a semi-group (61):

\[
E_w \Delta T^c(w) = 1 - \frac{3}{n}\, \#T^c(w) + \frac{p_w^+ - p_w^-}{2n}\, \#T^c(w).
\tag{66}
\]

For the idempotent semi-group $LFI^+_{n+1}$ there is no possibility of the word's reduction, and the mathematical expectation of the size of a roof is the same as for the semi-group $LF^+_{n+1}$. Thus we arrive at the following corollaries of Theorem 3:

Corollary 2. The limit of the mathematical expectation of the size of a roof for the random walk on the group $LF_{n+1}$ and the idempotent semi-group $LFI^+_{n+1}$ is

\[
\begin{aligned}
E\,\#T^c &= \frac{n}{3 - \alpha} && \text{for the group } LF_{n+1}, \\
E\,\#T &= \frac{n}{3} && \text{for the idempotent semi-group } LFI^+_{n+1},
\end{aligned}
\tag{67}
\]

where $\alpha = \frac{1}{2}(p^+ - p^-)$; $p^+ = E p_w^+$ and $p^- = E p_w^-$.

One can easily see that for some configurations of heaps $w$ we could have $p_w^+ - p_w^- \ne 0$, and in these cases the mathematical expectations $E_w \Delta T$ for the group (colored heap) and for the semi-group (heap) do not coincide. However, we believe that at $N \to \infty$ (i.e. in a stationary mode) $E p_w^+ = E p_w^-$ and the following hypothesis (expressed first by J. Desbois in [DN2]) is valid:

Conjecture 1. The mathematical expectations of a roof (a set of removable elements) for the heap (the locally free semi-group $LF^+_{n+1}$) and for the colored heap (the locally free group $LF_{n+1}$) coincide at $n \gg 1$. Hence,

\[
\lim_{N \to \infty} E\,\#T(w_N) = \frac{n}{3}.
\tag{68}
\]

The concept of a roof is the same for the heap (the semi-group) and for the colored heap (the group); however, the dynamics in these two cases is distinct. The random walk on the locally free semi-group (group) has been reduced to a Markov dynamics of heaps (colored heaps). We have defined a new dynamics, the dynamics of the roofs, which is Markovian in the case of the locally free semi-group, from which the general dynamics is restored and which is convenient for computations. In the case of the group this dynamics is not Markovian anymore, but it nevertheless enables us to get some nontrivial estimates. Using the subadditive ergodic theorem we can now prove the following important fact:

Lemma 5. 1. For almost all sequences of heaps, i.e. for almost all trajectories $\{w_N\}$ of the random walk on the semi-group $LF^+_{n+1}$ (or on the group $LF_{n+1}$), the limit

\[
\lim_{N \to \infty} \frac{\#T(w_N)}{n} = \kappa_n
\]

exists and does not depend on the trajectory.
2. From Theorem 3 we know that $\kappa_n = 1/3 + o_n(1)$ for $n$ large.

This lemma is used below in the proof of Theorems 5 and 7.

4.2. Drift as mathematical expectation of number of cells in the heap. Let us now compute the change of the length of some fixed word $w$ for a random walk on the group $LF_{n+1}$. It is obvious that in one step of the random walk the length of a word changes by $\pm 1$. The multiplication by a given generator, or by its inverse, occurs with the probability $\frac{1}{2n}$, and thus the conditional mathematical expectation $E_w K$ of the change of a word's length is determined for a fixed element $w$. Below we shall compute $E_w K$ and see that the answer depends only on the size of the roof, i.e. on the size $\#T^c(w)$ of the set of removable generators $T^c(w)$.

Consider a fixed element $w$ of the group $LF_{n+1}$ such that the set of removable generators of $w$ is $\{1 \le k_1 < k_2 < \dots < k_s \le n\}$. Assume that with the probability $\frac{1}{2n}$ the word $w$ is multiplied by a generator $g_r$ or $g_r^{-1}$ (for definiteness let us choose $g_r$). Denote the set of removable generators of the element $w' = w\, g_r$ by $T' \equiv T(w')$. Then the dynamics of the set $T(w)$ is governed by the following possibilities (compare to the above relations (61)):

I. Provided that the word's length is increased, i.e. $K(w g_r) = K(w) + 1$, the dynamics of the roof is described by the relations (61) valid for the semi-group $LF^+_{n+1}$;
II. Provided that the word is reduced, i.e. $K(w g_r) = K(w) - 1$, we have

\[
T^c \to T^{c\,\prime} \equiv T^{c-} \quad \text{if } \varepsilon_r = 1,
\tag{69}
\]

where $T^{c-}$ is the configuration of the roof obtained by the cancellation of the element of the roof $T^c$ located in position $r$. (This rule cannot be described in local terms.)

The probability of a word's length reduction is $\frac{\#T^c}{2n}$, because each element of the roof is reduced if and only if at the following step the element inverse to it arrives. Accordingly, the probability of increasing a word's length is $1 - \frac{\#T^c}{2n}$, since in one step the word's length changes by $\pm 1$. As a result, the mathematical expectation of the total change of a word's length in one step of the random walk on the group $LF_{n+1}$ is

\[
E_{g_r} \left[ K(w\, g_r) - K(w) \right] = - \frac{E\,\#T^c(w)}{2n} + \left( 1 - \frac{E\,\#T^c(w)}{2n} \right) = 1 - \frac{E\,\#T^c(w)}{n}.
\tag{70}
\]

The indicated computation proves the following lemma:

Lemma 6. The conditional mathematical expectation of the word's length $K(w)$ after $N$ steps of the random walk on the group $LF_{n+1}$ for the fixed last element $w$ is

\[
E_w K = N \left( 1 - \frac{E\,\#T^c(w)}{n} \right),
\]

hence the drift (i.e. the mathematical expectation of the normalized word's length) is

\[
l = \lim_{N \to \infty} \frac{1}{N}\, E_w K = 1 - \frac{E\,\#T^c(w)}{n}.
\]

Corollary 3. The drift of the random walk on the idempotent locally free semi-group $LFI^+_{n+1}$ is

\[
l = \lim_{N \to \infty} \frac{1}{N}\, E_w K = 1 - \frac{E\,\#T(w)}{n}.
\]

Proof. Despite the fact that the expressions for the drifts for the group $LF_{n+1}$ and the idempotent semi-group $LFI^+_{n+1}$ coincide, their origins are different. For the idempotent semi-group there is no cancellation of the word's length; however, the relation $g_i^2 = g_i$ provides the existence of configurations of the heap which do not change when a new letter is added. The probability of such an event is $\frac{1}{n}$ for each element of the roof. Hence the word's length increases by $+1$ if and only if the newly added letter changes the configuration of the roof. The probability of such an event is $1 - \frac{E\,\#T(w)}{n}$. By the corollary of Theorem 3 we know that the sizes of roofs for the random walks on the semi-group and the idempotent semi-group coincide. □

Thus, for the calculation of the drift it is sufficient to know the mathematical expectation $E\,\#T(w)$ of the roof, see Eq. (67).

Theorem 4. The mathematical expectations of the drift of a random walk on a locally free group and idempotent semi-group at $n \gg 1$ are

\[
\begin{aligned}
l &= 1 - \frac{E\,\#T^c(w)}{n} = \frac{2 - \alpha}{3 - \alpha} && \text{for the group } LF_{n+1}, \\
l &= 1 - \frac{E\,\#T(w)}{n} = \frac{2}{3} && \text{for the idempotent semi-group } LFI^+_{n+1},
\end{aligned}
\tag{71}
\]

where $\alpha$ is defined in (67).

Conjecture 2. The mathematical expectation of the drift on the locally free group at $n \gg 1$ is

\[
l = \frac{2}{3}.
\tag{72}
\]

Conjecture 2 is a direct consequence of Conjecture 1 (J. Desbois in [DN2]), but it is still not proved rigorously.
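The value $l = 2/3$ for the idempotent semi-group can be illustrated by simulating the dynamics of the roof alone. The Python sketch below rests on assumptions that are ours: periodic boundary conditions (a ring of $n \ge 3$ columns), a hypothetical function name `lfi_drift`, and an arbitrary seed and number of steps. A letter $g_r$ with $r$ in the roof leaves the heap unchanged; otherwise the word grows by one letter and the roof is updated according to (61).

```python
import random


def lfi_drift(n: int, steps: int, seed: int = 0) -> float:
    """Estimate the drift of the walk on the idempotent semi-group via roof dynamics."""
    rng = random.Random(seed)
    roof = [0] * n                 # roof[r] == 1 iff column r belongs to the roof
    length = 0
    for _ in range(steps):
        r = rng.randrange(n)
        if roof[r]:
            continue               # g_r g_r = g_r: the heap does not change
        roof[r] = 1                # the new cell joins the roof ...
        roof[r - 1] = 0            # ... and its neighbours leave it
        roof[(r + 1) % n] = 0
        length += 1
    return length / steps
```

With, say, `n = 30` and a few hundred thousand steps the estimate should be close to $2/3$, up to statistical fluctuations and a short initial transient.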

5. Random Walk on Locally Free Group and Semi-Group: The Entropy

The entropy $h(G)$ of the random walk on the group $G$ with the uniform measure $\mu$ on the set of generators, according to a theorem similar to the Shannon–McMillan–Breiman one and proved in [KaiV, De] (see also [Ve2, Ve3]), can be written as follows (see Definition 7):

\[
h(w_N) = \lim_{N \to \infty} \frac{1}{N}\, H(\mu^N)(w_N) = - \lim_{N \to \infty} \frac{1}{N} \log \mu^N(w_N),
\tag{73}
\]

for almost all elements $w_N$ of the group; $N$ is the number of steps of the random walk; $\mu^N$ is the $N$-fold convolution of the measure $\mu$. In turn, the measure $\mu^N$ itself can be defined in the following way:

\[
\mu^N(g) =
\begin{cases}
\dfrac{\#L_N(g)}{(2n)^N} & \text{for the group}, \\[2mm]
\dfrac{\#L_N^+(g)}{n^N} & \text{for the semi-group}.
\end{cases}
\tag{74}
\]

By $L_N(g)$ and $L_N^+(g)$ we denote the sets of different dynamical representations of the word $g$ of record length $N$ in the alphabets $\{g_1, \dots, g_n, g_1^{-1}, \dots, g_n^{-1}\}$ (for the group) and $\{g_1, \dots, g_n\}$ (for the semi-group), respectively. Hence, $\#L_N(g)$ and $\#L_N^+(g)$ are the numbers of different dynamical representations of the element $g$ by words of record length $N$ in the given framing. The values $\#L_N(g)$ ($\#L_N^+(g)$) can be viewed as the numbers of different paths on the Cayley graph of the group (semi-group) leading from the root point of the graph to the element $g$. Let us note that in the case of the group the element $g = w_N$ can have a length $K(g)$ shorter than the record length $N$. This question has been considered in detail in the previous section.

As was found in the previous section during the study of the drift, the dynamics of the increments of words (i.e. the dynamics of the heap $H$) for random uniform addition of cells is uniquely determined by the dynamics of the roof $T$ of the heap $H$. Moreover, we have found (see Eq. (62)) that in the limit $N \to \infty$ and at $n \gg 1$ the mathematical expectation $E\,\#T$ of the size of a roof, normalized by $n$, is $1/3$. Let us prove the lemma:

Lemma 7. The fluctuations of the mathematical expectation of the roof for $n \gg 1$ have the asymptotic behavior

\[
\frac{E\,\#T^2 - (E\,\#T)^2}{(E\,\#T)^2} \le \frac{\mathrm{const}}{n},
\]

where we have denoted $E\,\#T = \lim_{N \to \infty} E\,\#T(w_N)$.

Proof. Rewrite (62) in the form

\[
(E\,\#T)^2 = \left( \lim_{N \to \infty} E\,\#T(w_N) \right)^2 = \frac{n^2}{9}.
\tag{75}
\]

Using Eqs. (64)–(65) for the probabilities of local rearrangements of the roof, we get the mathematical expectation of the fluctuations of a roof:

\[
\begin{aligned}
E\Delta(\#T^2) &= E\left[ (\#T')^2 - (\#T)^2 \right] \\
&= E\left[ q_+ \left( (\#T + 1)^2 - (\#T)^2 \right) + 0 \times q_0 + q_- \left( (\#T - 1)^2 - (\#T)^2 \right) \right] \\
&= E\left[ 2(q_+ - q_-)\,\#T + (q_+ + q_-) \right],
\end{aligned}
\tag{76}
\]

where

\[
q_+ = \frac{1}{n} \sum_{j:\, h_j \ge 3} (h_j - 2); \qquad
q_0 = \frac{1}{n} \left( \#T + 2\,\#\{j : h_j \ge 2\} \right); \qquad
q_- = \frac{1}{n}\, \#\{j : h_j = 1\}.
\]

Taking into account that $q_+ - q_- = 1 - \frac{3}{n}\,\#T$, we obtain from (76):

\[
E\Delta(\#T^2) = E\left[ 2\left( 1 - \frac{3}{n}\,\#T \right) \#T + (q_+ + q_-) \right].
\tag{77}
\]

For the invariant initial distribution we should set $E\Delta(\#T^2) = 0$; therefore the mathematical expectation of the square of the size of a roof can be obtained from the following relation:

\[
E\Delta(\#T^2) = 2E\,\#T - \frac{6}{n}\, E\,\#T^2 + E(q_+ + q_-) = 0,
\]

hence we get

\[
E\,\#T^2 = \frac{n}{3}\, E\,\#T + \frac{n}{6}\, E(q_+ + q_-).
\]

Estimating the mathematical expectation from above as $E(q_+ + q_-) < \mathrm{const}$, we arrive at the equation

\[
E\,\#T^2 = \frac{n^2}{9} + \frac{\mathrm{const}\; n}{6} = \frac{n^2}{9} + o(n^2).
\]

Comparing the last expression with (75), we get the statement of Lemma 7. □

It is convenient to split the problem of computation of the entropy of random walks on the locally free group and semi-group into two parts and to begin with the case of the semi-group $LF^+_n$, for which the computations seem to be more transparent.

5.1. Entropy of random walk on semi-group $LF^+_{n+1}$.

Theorem 5. The entropy of the random walk on the locally free semi-group $LF^+_{n+1}$ for $n \gg 1$ is

\[
h = \log 3 + o(1).
\tag{78}
\]

Proof. We have

\[
\#L_N(g) = \sum_{g_i \in T} \#L_{N-1}(g \setminus g_i),
\tag{79}
\]

where the sum is taken over all elements $g_i$ from the roof $T$. Let us write $g \succ g^{(1)}$ if the heap of $g^{(1)}$ is the result of the removal of one cell from the roof of the heap $g$. Using the definition (74) for the semi-group $LF^+_{n+1}$, we have

\[
\frac{\#L(g)}{n^N} = n^{-1} \sum_{g^{(1)}:\, g \succ g^{(1)}} \frac{\#L(g^{(1)})}{n^{N-1}},
\tag{80}
\]

which, consequently, can be rewritten in the following way:

\[
\mu^N(g) = n^{-1} \sum_{g^{(1)}:\, g \succ g^{(1)}} \mu^{N-1}(g^{(1)}).
\tag{81}
\]
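The recursion (79) can be tested on small examples for the semi-group $LF^+_{n+1}$, reading its right-hand side as a sum of $\#L_{N-1}$ over the heaps obtained by removing one roof cell. The Python sketch below is an illustration under our own conventions (all names are hypothetical): a heap is stored as a set of (column, level) cells, a new cell in column $i$ lands one level above the highest cell in columns $i-1$, $i$, $i+1$ (the generators $g_i$ and $g_j$ commute if and only if $|i - j| \ge 2$), and the roof consists of the cells with nothing above them in the same or adjacent columns.

```python
from collections import Counter
from functools import lru_cache
from itertools import product


def heap_of_word(word, n):
    """Heap (frozenset of (column, level) cells) of a word in g_1, ..., g_n."""
    top = [0] * (n + 2)           # top[i]: highest occupied level in column i
    cells = []
    for i in word:
        level = 1 + max(top[i - 1], top[i], top[i + 1])
        top[i] = level
        cells.append((i, level))
    return frozenset(cells)


def roof(heap):
    """Cells with no cell above them in the same or an adjacent column."""
    return [(i, lv) for (i, lv) in heap
            if not any(abs(i - j) <= 1 and lw > lv for (j, lw) in heap)]


@lru_cache(maxsize=None)
def count_representations(heap):
    """#L_N(g) computed by the recursion (79): remove one roof cell at a time."""
    if not heap:
        return 1
    return sum(count_representations(heap - {cell}) for cell in roof(heap))


def brute_force_counts(n, N):
    """Number of words of length N that produce each heap, by full enumeration."""
    return Counter(heap_of_word(w, n) for w in product(range(1, n + 1), repeat=N))
```

For every heap $g$ with $N$ cells the two counts coincide, and dividing by $n^N$ gives the measure $\mu^N(g)$ of (74) for the semi-group.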

Both sums in (80) and (81) run over all $g^{(1)}$ for which $g \succ g^{(1)}$, and the number of cells in $g$ is indicated by the exponent in $\mu$. Let us rewrite (81) equivalently as

\[
\mu^N(g) = \frac{\#T(g)}{n} \times \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \mu^{N-1}(g^{(1)}).
\tag{82}
\]

The second factor on the right-hand side of (82) is the average of the measures of heaps obtained by the exclusion of one cell from the initial heap; we denote this term by

\[
A(g) = \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \mu^{N-1}(g^{(1)}).
\tag{83}
\]

Taking the logarithm of (82) divided by $N$ and using (83), we get

\[
N^{-1} \log \mu^N(g) = N^{-1} \log \frac{\#T(g)}{n} + N^{-1} \log A(g).
\tag{84}
\]

This is true for all $N$ and all $g$ with $N$ cells. Now we iterate the second term on the right-hand side:

\[
\begin{aligned}
A(g) &= \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \mu^{N-1}(g^{(1)}) \\
&= \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \left[ n^{-1} \sum_{g^{(2)}:\, g^{(1)} \succ g^{(2)}} \mu^{N-2}(g^{(2)}) \right] \\
&= \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \frac{\#T(g^{(1)})}{n} \left[ \frac{1}{\#T(g^{(1)})} \sum_{g^{(2)}:\, g^{(1)} \succ g^{(2)}} \mu^{N-2}(g^{(2)}) \right].
\end{aligned}
\tag{85}
\]

Using Lemma 5 we expect that $\frac{\#T(g^{(1)})}{n}$ for $N$ large is close to $\kappa_n$ up to $\varepsilon$. Thus,

\[
A(g) = \kappa_n \times \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \left[ \frac{1}{\#T(g^{(1)})} \sum_{g^{(2)}:\, g^{(1)} \succ g^{(2)}} \mu^{N-2}(g^{(2)}) \right] + \varepsilon.
\tag{86}
\]

We can iterate Eq. (86), assuming that $g^{(1)}, g^{(2)}, \dots, g^{(k)}$ run over all sequences of heaps $g \succ g^{(1)} \succ g^{(2)} \succ \dots \succ g^{(k)}$ and that for all of those heaps

\[
\frac{\#T(g^{(i)})}{n}
\tag{87}
\]

is $\varepsilon$-close to $\kappa_n$. After $m$ iterations we obtain

\[
A(g) = (\kappa_n)^{m-1} \times \frac{1}{\#T(g)} \times \sum_{g^{(1)}, \dots, g^{(m)}} C(g^{(m)})\, \mu^{N-m}(g^{(m)}),
\tag{88}
\]

where the coefficients $C(g^{(m)})$ are positive with sum equal to 1, being average values of the convolution of the measures $\mu^{N-m}(g^{(m)})$. Coming back to equality (84) and iterating it $m = N - k$ times, we get

\[
\begin{aligned}
-N^{-1} \log \mu^N(g) &= -N^{-1} \log \kappa_n - N^{-1} \log A(g) \\
&\ \ \vdots \\
&= -\frac{N-k}{N} \log \kappa_n - N^{-1} \log \sum_{g^{(1)}, \dots, g^{(N-k)}} C(g^{(N-k)})\, \mu^{k}(g^{(N-k)}) + \varepsilon.
\end{aligned}
\tag{89}
\]

Let us now make a shift $g \to g^{N+k}$, $g^{N+k} \succ g^{(1)} \succ \cdots$, so that $g^{(j)}$ becomes a heap with $N + k - j$ cells; in particular $g^{(N)}$ now has $k$ cells. With this shift we have

\[
-\frac{1}{N+k} \log \mu^{N+k}(g^{N+k}) = -\frac{N}{N+k} \log \kappa_n - \frac{1}{N+k} \log \left[ \sum_{g^{N+k} \succ g^{(1)} \succ \cdots \succ g^{(k)}} C(g^{(k)})\, \mu^{N}(g^{(k)}) \right] + \varepsilon.
\tag{90}
\]

Now we fix sufficiently large k and set $N \to \infty$. The convergence of $\lim_{N\to\infty} N^{-1}\log\mu(w_N)$ for almost all sequences of heaps, i.e. for almost all trajectories $\{w_N\}$ of the random walk on the semigroup $LF^+_{n+1}$ (or the group $LF_{n+1}$), follows now from the theorem of Shannon–McMillan–Breiman type mentioned in the beginning of Sect. 5. Hence,

$$\lim_{N\to\infty} N^{-1}\log\mu_N(w_N) = h. \tag{91}$$

The limit in (91) exists in $L^2$ in the space of trajectories. So, the left-hand side of (90) tends to the entropy h for almost all sequences when $N \to \infty$. The second summand in the right-hand side tends to 0 because the logarithm of the sum is bounded by the average of the measures $\mu_N(g^{(k)})$. Thus, for $N \to \infty$ we have

$$\lim_{N\to\infty}\frac{1}{N+k}\log\mu(g^{(N+k)}) = \lim_{N\to\infty}\frac{N}{N+k}\log\kappa_n = \log\kappa_n = \log 3 + o_n(1) \tag{92}$$

and

$$h = \log 3 + o_n(1). \tag{93}$$

□

Theorem 6. For the random walk on the locally free semi-group $LF^+_{n+1}$ the logarithmic volume v, the drift l and the entropy h satisfy at $n \gg 1$ the strict inequality $v\,l > h$, where $v \equiv v(LF^+_n)$; $l \equiv l(LF^+_n)$; $h \equiv h(LF^+_n)$.

Proof.
(i) By Theorem 2 we have $v(LF^+_n) \to \log 4$.
(ii) The drift of the random walk on $LF^+_n$ is strictly equal to 1, i.e. $l = 1$.
(iii) By Theorem 5 we have $h(LF^+_n) \to \log 3$.

Comparing the values of v, l and h, we get the strict inequality $v > h$ for the random walk on $LF^+_{n+1}$. □

5.2. Entropy of random walk on groups $LF_{n+1}$ and $LFI^+_{n+1}$.

Theorem 7. The entropy h of the random walk on the group $LF_{n+1}$ and idempotent semi-group $LFI^+_{n+1}$ at $n \gg 1$ is

$$h = \begin{cases} \log(3-\alpha) + o(1) & \text{for the group } LF_{n+1}, \\ \log\frac{3}{2} + o(1) & \text{for the idempotent semi-group } LFI^+_{n+1}, \end{cases} \tag{94}$$

where $\alpha = \frac{1}{2}(p^+ - p^-)$ (see Eq. (67)).

Proof. In the case of $LF_{n+1}$ and $LFI^+_{n+1}$ the element g can be reached at the N-th step of the random walk not only by adding a new cell to the previous roof (as it was for the semi-group) but also by cancelling some already existing cell of the roof. This behavior is manifested in the following modifications of the recursion relation (79):

$$\#L_N(g) = \sum_{g_i\in T}\#L_{N-1}(g\setminus g_i) + \begin{cases} \displaystyle\sum_{g_i\in T}\#L_{N-1}(g\cup g_i) & \text{for } LF_{n+1}, \\ \displaystyle\sum_{g_i\in T}\#L_{N-1}(g) & \text{for } LFI^+_{n+1}. \end{cases} \tag{95}$$


Following the outline of the proof of Theorem 5 and using Lemma 7 we compute h regarding the dynamics of the long ($n \gg 1$) roof in the stationary regime for $N \gg 1$. It means that we replace the time-ordered product of j roofs $T_j$ by the j-th power of an averaged roof $T$. For the group $LF_{n+1}$ the exact value of $E\#T^c$ is unknown insofar as $E\#T^c$ depends on the unknown parameter $\alpha = \frac{1}{2}(Ep^+_w - Ep^-_w)$. Let us recall that $p^+_w$ and $p^-_w$ are the probabilities of the change of the value $\#T^c(w)$ by ±1 provided the reduction of a word (see Eq. (67)). Nevertheless, we can follow directly the outline of the proof of Theorem 5 with a single replacement $\xi_j \to \xi_j - \alpha$. In the stationary regime for $LF_{n+1}$ and $LFI^+_{n+1}$ both terms in (95) amount to the same contribution, which results in the following expression for the entropy h:

$$h = \begin{cases} \displaystyle -\lim_{N\to\infty}\frac{1}{N}\sum_{j=1}^{N}\log\frac{2\#T_j}{2n} = -\log\left(\frac{1}{2n}\,\frac{2n}{3-\alpha}\right) + o(1) = -\log\frac{1}{3-\alpha} + o(1) & \text{for } LF_{n+1}, \\ \displaystyle -\lim_{N\to\infty}\frac{1}{N}\sum_{j=1}^{N}\log\frac{2\#T_j}{n} = -\log\left(\frac{1}{n}\,\frac{2n}{3}\right) + o(1) = -\log\frac{2}{3} + o(1) & \text{for } LFI^+_{n+1}. \end{cases} \tag{96}$$

Thus Eq. (94) is proved. □

Now we are in a position to prove the following main theorem.

Theorem 8. For the random walk on the locally free group $LF_n$, the logarithmic volume v, the drift l and the entropy h satisfy at $n \gg 1$ the strict inequality $v\,l > h$, where $v \equiv v(LF_n)$; $l \equiv l(LF_n)$; $h \equiv h(LF_n)$.

Proof. For the group $LF_n$, as in the case of the semi-group $LF^+_n$, the entropy h and the drift l of the random walk are determined by the mathematical expectation of the size of a roof $E\#T$. In the case of the group, however, the numerical value of the mathematical expectation of a colored heap's roof depends on the value α. Since our purpose is to prove that for a locally free group in the limit of infinite number of generators the strict inequality $l\,v > h$ holds, it is sufficient to estimate appropriately the interval over which α varies. For the proof we shall use again the statements of Theorems 1, 4, and 7.

(i) By Theorem 1:
$$v(LF_n) \to \log 7. \tag{97}$$

(ii) By Theorem 4:
$$l(LF_n) \to \frac{2-\alpha}{3-\alpha}. \tag{98}$$

(iii) By Theorem 7:
$$h(LF_n) \to \log(3-\alpha). \tag{99}$$


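By definition $\alpha = \frac{1}{2}(p^+ - p^-)$. Because $p^+ + p^- = 1$, the following estimate is valid: $|\alpha| < \frac{1}{2}$. Thus, the values of the drift and the entropy lie within the intervals

$$\frac{3}{5} < l < \frac{5}{7}, \qquad \log\frac{5}{2} < h < \log\frac{7}{2}.$$

Define the discrepancy $\varepsilon = l\,v - h$ and check that $\varepsilon(\alpha) > 0$ for all values of α from the interval $|\alpha| < \frac{1}{2}$. Consider the function

$$\varepsilon(\alpha) = \frac{2-\alpha}{3-\alpha}\log 7 - \log(3-\alpha).$$

Computing the derivative $\frac{d\varepsilon(\alpha)}{d\alpha} = \frac{3-\alpha-\log 7}{(3-\alpha)^2}$, which is positive for $|\alpha| < \frac{1}{2}$, and noting that $\varepsilon(-\frac{1}{2}) = \frac{5}{7}\log 7 - \log\frac{7}{2} \approx 0.137 > 0$, one can easily verify that on the interval $-\frac{1}{2} < \alpha < \frac{1}{2}$ the function $\varepsilon(\alpha)$ is strictly positive, hence $l\,v - h > 0$. The theorem is proved. □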
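The positivity of the discrepancy can also be checked numerically. The following is a minimal sketch, assuming only the limiting values (97)–(99); the function names and the sampling grid are illustrative choices and are not part of the original proof.

```python
import math

def drift(alpha):
    # Limiting drift l = (2 - alpha) / (3 - alpha), cf. Eq. (98)
    return (2.0 - alpha) / (3.0 - alpha)

def entropy(alpha):
    # Limiting entropy h = log(3 - alpha), cf. Eq. (99)
    return math.log(3.0 - alpha)

def discrepancy(alpha):
    # epsilon(alpha) = l * v - h with v = log 7, cf. Eq. (97)
    return drift(alpha) * math.log(7.0) - entropy(alpha)

def discrepancy_derivative(alpha):
    # Closed form of d(epsilon)/d(alpha) = (3 - alpha - log 7) / (3 - alpha)^2
    return (3.0 - alpha - math.log(7.0)) / (3.0 - alpha) ** 2

# Illustrative grid on the closed interval [-1/2, 1/2]
grid = [-0.5 + i / 1000.0 for i in range(1001)]
min_eps = min(discrepancy(a) for a in grid)
min_slope = min(discrepancy_derivative(a) for a in grid)
print(min_eps, min_slope)
```

Both printed numbers are positive: the slope never changes sign on the interval, so the smallest value of the discrepancy is attained at the left end point.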

Theorem 9. For the random walk on the locally free idempotent group $LFI_n$ for $n \gg 1$ the strict inequality $v\,l > h$ holds, where $v \equiv v(LFI_n)$; $l \equiv l(LFI_n)$; $h \equiv h(LFI_n)$.

Proof. By Theorems 1, 2, 4 and 7 we have at $n \to \infty$:

$$v(LFI_n) \to \log 3; \qquad l(LFI_n) \to \frac{2}{3}; \qquad h(LFI_n) \to \log\frac{3}{2}, \tag{100}$$

which gives $v\,l - h = \frac{2}{3}\log 3 - \log\frac{3}{2} > 0$. □

6. Conclusion

Let us extend the results obtained above for locally free groups and semi-groups to the case of braid groups and semi-groups.

6.1. Bounds for logarithmic volume and drift of random walk on braid group. As we have pointed out already in Lemma 1, the braid group $B_n$ is the factor-group of the locally free group $LF_n$ and simultaneously $LF_n$ is the subgroup of $B_n$. The same relations are valid for the semi-group of positive braids $B_n^+$ and the locally free semi-group $LF^+_n$.

Theorem 10. The logarithmic volumes $v(B_n)$ and $v(B_n^+)$ for $n \gg 1$ satisfy the bilateral estimates

$$\frac{1}{2}\log 7 < v(B_n) \le \log 7, \qquad \log 2 < v(B_n^+) \le \log 4. \tag{101}$$

Proof. The proof is based on Lemma 1 and its corollary. The upper bounds in (101) are the direct consequence of the fact that $B_n$ ($B_n^+$) is a factor-group of $LF_n$ ($LF^+_n$). Thus, $v_{B_n} \le v_{LF_n} \equiv \log 7$.


In order to get the lower bounds let us point out that the embedding $\rho_n$ of $LF_n$ in $B_n$ and of $LF^+_n$ in $B_n^+$ is realized by the isomorphism $f_i \leftrightarrow \sigma_i^2$. Thus, in the case of the group we have

$$V(\rho_n : LF_n, K) \subset V(B_n, 2K)$$

and

$$\frac{\log V(\rho_n : LF_n, K)}{K} \le \frac{\log V(B_n, 2K)}{K},$$

hence

$$v_{LF_n} \equiv \log 7 \le 2\,v_{B_n},$$

therefore

$$\frac{1}{2}\,v_{LF_n} \le v_{B_n}.$$

The case of the semi-group $LF^+_n$ can be treated along the same lines. □

Apparently, the upper estimate in Eq. (101) is closer to the true value than the lower one.

Theorem 11. The drift $l(B_n)$ on the braid group $B_n$ at $n \gg 1$ satisfies the inequality

$$\frac{2-\alpha}{2(3-\alpha)} < l(B_n) \le \frac{2-\alpha}{3-\alpha}. \tag{102}$$

Proof. The bilateral estimate (102) is again a direct consequence of Lemma 1, showing that the braid group is the factor-group of the locally free group and in turn the locally free group is the subgroup of the braid group. The value α has been defined above and varies in the interval $-\frac{1}{2} < \alpha < \frac{1}{2}$. □

For the entropy of the random walk on the braid group the corresponding bilateral estimates have not yet been obtained, but nevertheless we can assert that the fundamental inequality is strict. We shall return to this question in a forthcoming publication. The fundamental inequality $v\,l > h$ for locally free and braid groups has a deep connection to the multifractal structure of harmonic measure on the Poisson boundary for the corresponding groups.

6.2. Physical interpretation of results. Let us give a physical interpretation of the strict inequality $l\,v > h$ for the locally free group and for the ballistic deposition process (see Fig. 5). The relations (73)–(74) permit one to estimate the probability of various dynamical representations of typical elements g by words of length $N \gg 1$ with respect to the uniform measure µ:

$$\lim_{n\to\infty}\frac{\#L_N(g)}{(2n)^N} \approx 2^{Nh},$$

which for the locally free group gives with exponential accuracy (see Eq. (94))

$$\lim_{n\to\infty}\frac{\#L(g)}{(2n)^N} \approx (3-\alpha)^N.$$

The value $\#L(g)/(2n)^N$ is the measure of a set of trajectories on the Cayley graph of the locally free group, visited by the N-step random walk.


Fig. 5. Typical configuration obtained in numerical simulations of the uniform heap’s growth

On the other hand, the expression

$$\frac{\#V(g)}{(2n)^N} \approx 2^{N l v}$$

gives the exponential estimate for the probability to find the element g for a time of random walk on the group without any reference to dynamics. For the locally free group the value #V(g) can be written as follows:

$$\frac{\#V(g)}{(2n)^N} \approx 7^{\frac{2-\alpha}{3-\alpha}N},$$

where $|\alpha| < \frac{1}{2}$. In other words, $\#V(g)/(2n)^N$ is the measure of the set of all different states of the Cayley graph of the locally free group, located at a distance of typical drift $L = \frac{2-\alpha}{3-\alpha}N$ of the N-step walk from the root point of the graph. The inequality

$$\frac{\#L(g)}{(2n)^N} \ll \frac{\#V(g)}{(2n)^N} \tag{103}$$

means that the measure of the set of typical trajectories covered dynamically by the N-step random walk on $LF_n$ is an exponentially small fraction of the set of statistically available trajectories of the same length. The inequality similar to (103) in the case of the locally free semi-group reads

$$\frac{\#L^+(g)}{n^N} \ll \frac{\#V^+(g)}{n^N}, \tag{104}$$

where $\#V^+(g) \approx 4^N$ is the volume of the locally free semi-group $LF^+_n$ for $n \gg 1$ and $\#L^+(g) \approx 3^N$ is the entropy of the random walk on $LF^+_n$ in the same limit.

The dynamically induced probabilistic measure on the group (semi-group), i.e. the representation of words by the random walks on a group (semi-group), essentially differs from the uniform (on the words) measure. This difference is manifested in the exponential divergence of the two quantities #V(g) and #L(g), see Eq. (103) (the same is valid for the semi-group and is described by Eq. (104)). The origin of this exponential difference between the number of (i) dynamical representations of some typical group element g by N-step random walks and (ii) all different representations of the same element g by N-step trajectories in the locally free group lies in the locking mechanism of the dynamical construction of words.

Let us explain this mechanism with a simple example. Suppose that after some steps of the (right-hand) random walk we have arrived at some word, say $w_N = g_1 g_4 g_3 g_2$ (it is written in the normal form). By randomly adding a letter from the right-hand side we can never create a word like $w_{N+1} = g_1 g_2 g_4 g_3 g_2$. However, the word $w_{N+1}$ can be created if we allow insertion of a new letter anywhere in the given word $w_N$ (and not only at the right-most end of $w_N$). Hence for the random walk many configurations which are statistically available are locked dynamically by the condition that we add the new letters at the right-most end of the current word. This locking is permanent for the locally free semi-group $LF^+_n$ and temporary for the locally free group $LF_n$. In the latter case we can release the locked structure by consecutively adding opposite generators which would cancel the roof step by step; however, the probability of such an event is exponentially small (the detailed explanation of this effect is given in [Ve2,Ve3]).
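The locking mechanism in the example above can be illustrated by a brute-force check. This is only a sketch for the semi-group case, assuming the usual defining relation of the locally free semi-group (generators $g_i$ and $g_j$ commute when $|i - j| \ge 2$); the number of generators and all function names are illustrative.

```python
from collections import deque

def commute(a, b):
    # Assumed defining relation: g_a and g_b commute iff |a - b| >= 2.
    return abs(a - b) >= 2

def equivalence_class(word):
    """All words equal to `word` in the semi-group, found by breadth-first
    search over swaps of adjacent commuting letters."""
    start = tuple(word)
    seen = {start}
    queue = deque([start])
    while queue:
        w = queue.popleft()
        for i in range(len(w) - 1):
            if commute(w[i], w[i + 1]):
                swapped = w[:i] + (w[i + 1], w[i]) + w[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    queue.append(swapped)
    return frozenset(seen)

n_generators = 4                               # illustrative: g_1, ..., g_4
w_N = (1, 4, 3, 2)                             # the word g1 g4 g3 g2
target = equivalence_class((1, 2, 4, 3, 2))    # the word g1 g2 g4 g3 g2

# (a) right-hand random walk: one letter appended at the right-most end only
by_appending = any(
    equivalence_class(w_N + (k,)) == target
    for k in range(1, n_generators + 1)
)

# (b) one letter inserted at an arbitrary position of the word
by_inserting = any(
    equivalence_class(w_N[:pos] + (k,) + w_N[pos:]) == target
    for pos in range(len(w_N) + 1)
    for k in range(1, n_generators + 1)
)
print(by_appending, by_inserting)
```

Under these assumptions the target word is unreachable by appending a letter but reachable by insertion, which is the distinction between dynamically and statistically available configurations. Cancellations of generators, relevant for the group $LF_n$, are not modelled here.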
The inequality (104) seems to be the origin of the fact that in the numerical simulations of a random heap's growth (Fig. 5) a strong divergence is observed between the normalized mathematical expectation (averaged density) of the roof $\rho_{\rm roof} = \frac{E\#T}{n}$ (where $\rho_{\rm roof} = \frac{1}{3}$) and the mathematical expectation (averaged density) of a whole heap $\rho_{\rm heap} = \frac{N}{nH}$ (where H is the maximal height of a heap). The value of H obtained in various computer simulations is evaluated as $H \approx 4.05\,\frac{N}{n}$ (for references see [HZh]), which corresponds to the density $\rho_{\rm heap} \approx 0.247$. The same value of the density is observed on average for any flat horizontal section of a heap. An essential numerical distinction between $\rho_{\rm roof}$ and $\rho_{\rm heap}$ means that the roof of a heap has a nontrivial fractal structure lying in a strip of nonzero width. In Fig. 5 we have shown by black points a few current configurations of the roofs in the course of the heap's growth. As can be seen, the configurations of the roof are far from flat and exhibit apparently nontrivial fractal behavior, which would be interesting to compare with the continuous models of surface growth described by Kardar–Parisi–Zhang (KPZ) theory (for a review see [HZh]).


Acknowledgements. We would like to thank S. Fomin for pointing us to the connection between heaps and partially commutative monoids; G. X. Viennot, B. Derrida, A. Comtet and J. Lebowitz for fruitful discussions and comments; S.N. highly appreciates deep suggestions made by J. Desbois (see [DN2]). The authors are grateful to the RFBR grants 99–01–17931 and RFBR 00–15–96060 for partial support.

References

[Av] Avez, A.: C.R. Acad. Sci. Paris 275 (A), 1363 (1972)
[Bi1] Birman, J.: Knots, Links and Mapping Class Groups. Ann. Math. Studies, 82, Princeton: Princeton Univ. Press, 1976
[Bi2] Birman, J., Ko, K.H., Lee, J.S.: Adv. Math. 139, 322 (1998)
[CF] Cartier, P., Foata, D.: Lect. Not. Math. 85 New York–Berlin: Springer, 1969
[CN] Comtet, A., Nechaev, S.: J. Phys. (A): Math. Gen. 31, 5609 (1998)
[Co] Collins, P.J.: Invent. Math. 117, 525 (1994)
[De] Derriennic, Y.: Astérisque 74, 183 (1980)
[DN1] Desbois, J., Nechaev, S.: J. Stat. Phys. 88, 201 (1997)
[DN2] Desbois, J., Nechaev, S.: J. Phys. (A): Math. Gen. 31, 2767 (1998)
[FK+] Frank-Kamenetskii, M.D., Vologodskii, A.V.: Sov. Phys. Uspekhi 134, 641 (1981); Vologodskii, A.V., Lukashin, A.V., Frank-Kamenetskii, M.D., Anshelevich, V.V.: Zh. Exp. Teor. Fiz. (JETP) 66, 2153 (1974); Vologodskii, A.V., Lukashin, A.V., Frank-Kamenetskii, M.D.: Zh. Exp. Teor. Fiz. (JETP) 67, 1875 (1974); Frank-Kamenetskii, M.D., Vologodskii, A.V., Lukashin, A.V.: Nature (London) 258, 398 (1975)
[GN] Grosberg, A.Yu., Nechaev, S.K.: J. Phys. (A): Math. Gen. 25, 4659 (1992); Grosberg, A.Yu., Nechaev, S.K.: Europhys. Lett. 20, 613 (1992)
[Gr] Gromov, M.: Hyperbolic Groups. In: Essays in Group Theory 8, 75 MSRI Publishing: Springer, 1987
[HZh] Halpin-Healy, T., Zhang, Y.C.: Phys. Rep. 254, 215 (1995)
[Hu] Humphries, S.: Journ. of Algebra 169, 847 (1994)
[HNDV] Hakim, V., Nadal, J.P.: J. Phys. A 18, L-213 (1983); Nadal, J.P., Derrida, B., Vannimenus, J.: J. de Physique 43, 1561 (1982)
[Jo] Jones, V.F.R.: Bull. Am. Math. Soc. 12, 103 (1985); Jones, V.F.R.: Pacific J. Math. 137, 311 (1989)
[Kai] Kaimanovich, V.: Ergodic Theory and Dyn. Syst. 18, 631 (1998)
[KaiV] Kaimanovich, V., Vershik, A.M.: Ann. Prob. 11, 457 (1983)
[LGP] Lifshitz, I.M., Gredeskul, A., Pastur, L.A.: Introduction to the theory of disordered systems. Moscow: Nauka, 1982
[Ne] Nechaev, S.K.: Statistics of Knots and Entangled Random Walks. (WSPC: Springer, 1996); Nechaev, S.K.: Sov. Phys. Uspekhi 168, 369 (1998)
[NGV] Nechaev, S., Grosberg, A., Vershik, A.: J. Phys. (A): Math. Gen. 29, 2411 (1996)
[Sa] Sanov, I.: Dokl. Ac. Sci. USSR 57, 657 (1940)
[Ve1] Vershik, A.M.: In: Topics in Algebra 26, pt.2, 467 (1990) (Banach Center Publication, Warszawa); Vershik, A.M.: Proc. Am. Math. Soc. 148, 1 (1991)
[Ve2] Vershik, A.M.: Zapiski Sem. POMI 256 (1999)
[Ve3] Vershik, A.M.: Russ. Math. Surv. (2000), No.4 (to appear)
[Vi] Viennot, G.X.: Ann. N. Y. Ac. Sci. 576, 542 (1989)

Communicated by A. Jaffe

Commun. Math. Phys. 212, 503 – 533 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Instantons, Monopoles and Toric HyperKähler Manifolds

Thomas C. Kraan

Instituut-Lorentz for Theoretical Physics, University of Leiden, PO Box 9506, 2300 RA Leiden, The Netherlands. E-mail: [email protected]

Received: 20 November 1998 / Accepted: 11 October 1999

Abstract: In this paper, the metric on the moduli space of the k = 1 SU (n) periodic instanton – or caloron – with arbitrary gauge holonomy at spatial infinity is explicitly constructed. The metric is toric hyperKähler and of the form conjectured by Lee and Yi. The torus coordinates describe the residual U (1)n−1 gauge invariance and the temporal position of the caloron and can also be viewed as the phases of n monopoles that constitute the caloron. The (1, 1, . . . , 1) monopole is obtained as a limit of the caloron. The calculation is performed on the space of Nahm data, which is justified by proving the isometric property of the Nahm construction for the cases considered. An alternative construction using the hyperKähler quotient is also presented. The effect of massless monopoles is briefly discussed.

1. Introduction

Moduli spaces of instantons [2] and Bogomol'nyi–Prasad–Sommerfield (BPS) monopoles [3] have been subject to investigation for some time. The moduli space, quotient of the set of self-dual gauge connections by the group of gauge transformations, is a subset of the configuration space and its geometry therefore reflects physical properties of the system. In this paper instantons on $\mathbb{R}^3 \times S^1$ [17], or calorons, are studied for gauge group SU(n). Calorons are composed out of elementary BPS monopoles [29], as is seen from the action density [24]. This becomes clear for small compactification lengths when the constituents are far apart. In particular, removing one of the monopoles to spatial infinity turns the k = 1 caloron into a BPS SU(n) monopole. In contrast, the situation of all monopoles nearly coalescing (in appropriate units corresponding to an infinite compactification length) gives back the ordinary instanton on $\mathbb{R}^4$. These various aspects are respected by the corresponding limits in the metric. The form of the metric was conjectured by Lee and Yi [29], using considerations of D-brane constructions and


asymptotic monopole interactions. This paper addresses the explicit calculation of the metric for the caloron moduli space and its limits. Metric properties of moduli spaces of selfdual connections play an important role in the study of non-perturbative effects of gauge theories. For instantons the metric appears through the bosonic zero modes in the background of the charge one SU (2) instanton in a calculation to study its physical effects [19]. The scattering of monopoles can be described as the geodesic motion on the moduli space [33], relating the metric to the Lagrangian of the interacting monopole system [34]. The metrics on these moduli spaces are hyperKähler [18]. This property derives formally from the nature of the selfduality equations themselves [1, 10]. It also appears in the Atiyah–Drinfeld–Hitchin–Manin (ADHM) construction of instantons of higher charge, as well as in the Nahm construction for monopoles as a hyperKähler structure on the space of data [8, 9]. The Nahm formalism first appeared as a generalisation of the ADHM construction to construct the BPS monopole [36]. In its extension to selfdual monopoles for arbitrary group and charge [37, 38], the Nahm data in terms of which the monopole is obtained can be constructed in terms of the Weyl zero modes in the background of the monopole. A similar scheme was set up for the caloron [12, 38], which up to very recently [22, 23, 26] had not resulted in explicit solutions. This reciprocity idea could be applied to instantons on R4 as well [7]. Extended to the four-torus T 4 , the involutive property of the Nahm transformation preserves the metric and hyperKähler structure [4]. These ideas fit in a programme of studying the Nahm transformation on generalised tori M = R4 /H , where H is the isometry group of the selfdual connection. The calorons correspond to M = R3 × S 1 , H = Z. 
This compactification provides a smooth interpolation between instantons and monopoles, adding to the understanding of both objects and the formalism to study them. The incorporation of both instanton and monopole-like aspects by calorons is read off from the topological characteristics of selfdual gauge connections $A_\mu dx_\mu$ on $\mathbb{R}^3 \times S^1$ [16]. These are related to the properties of the vacuum which the solution necessarily approaches at spatial infinity in order for the action to be finite. The homotopy class of the gauge transformation connecting the vacuum at infinity with the connection near the origin gives the instanton number $k \in \pi_3(SU(n)) = \mathbb{Z}$. The vacuum itself can be nontrivial, due to the non-trivial topology of the asymptotic boundary of the base manifold $S^2 \times S^1$. This leads to extra labels for the solution which are studied in terms of the gauge holonomy $\mathcal{P}(\vec{x})$ along $S^1$. In the periodic gauge ($A_\mu(\vec{x}, x_0 + T) = A_\mu(\vec{x}, x_0)$), $\mathcal{P}(\vec{x})$ is defined as

$$\mathcal{P}(\vec{x}) = P\exp\left(\int_0^T A_0(\vec{x}, x_0)\,dx_0\right), \tag{1}$$

where P denotes path ordering and T the circumference of $S^1$, which we set to 1. In a zero curvature background, continuous deformations of the loop do not affect $\mathcal{P}(\vec{x})$. Its eigenvalues at spatial infinity are topological invariants. Therefore, the gauge holonomy at infinity is diagonal up to an $\hat{x}$ dependent gauge transformation V,

$$\lim_{|\vec{x}|\to\infty}\mathcal{P}(\vec{x}) = \mathcal{P}_\infty = V\,\mathcal{P}^0_\infty\,V^{-1}, \qquad \mathcal{P}^0_\infty = \exp[2\pi i\,\mathrm{diag}(\mu_1, \ldots, \mu_n)]. \tag{2}$$

The eigenvalues can be ordered such that

$$\mu_1 < \ldots < \mu_n < \mu_{n+1} \equiv \mu_1 + 1, \qquad \sum_{m=1}^{n}\mu_m = 0, \tag{3}$$


using the gauge symmetry and assuming maximal symmetry breaking for the moment. For later use, we define $\nu_m = \mu_{m+1} - \mu_m$, related to the mass of the mth constituent monopole. Asymptotically,

$$A_0 = 2\pi i\,\mathrm{diag}(\mu_1, \cdots, \mu_n) - i\,\mathrm{diag}(k_1, \cdots, k_n)/(2r) + O(r^{-2}), \qquad \sum_i k_i = 0, \tag{4}$$

up to the gauge transformation $V(\hat{x})$ that induces a map from $S^2$ to $SU(n)/H_\infty$, with $H_\infty$ the isotropy group of $\exp[2\pi i\,\mathrm{diag}(\mu_1, \cdots, \mu_n)]$. The maps $V(\hat{x}) \to SU(n)/H_\infty$ are classified according to the fundamental group of $H_\infty$. Generically, $H_\infty$ consists of several U(1) and SU(N), N > 1 subgroups. Each U(1) gives rise to a monopole winding number, related to the integers $k_i$. The enhanced residual gauge symmetry described by the SU(N) subgroups arises when there is non-maximal symmetry breaking, $\nu_m = \mu_{m+1} - \mu_m = 0$ for some value(s) of m, giving rise to massless constituent monopoles. A non-trivial value of $\mathcal{P}_\infty$ breaks the gauge symmetry. This makes calorons very similar to BPS monopoles [20, 36, 37], which fit in the above classification as $S^1$ invariant selfdual connections, classified according to the magnetic charges $(m_1, \ldots, m_{n-1})$, where $m_i = k_1 + \ldots + k_i$. The k = 1 SU(n) caloron studied in this paper has no magnetic charges, and its only nontrivial topological labels are the instanton number k = 1 and the eigenvalues $\mu_m$ of the holonomy.
An explicit calculation of the k = 1 SU (2) caloron is extended here to SU (n), generalising the techniques in [22, 23]. An alternative derivation using the hyperKähler quotient will also be given. There we will greatly benefit from the formalism in [35, 15], due to the similarity between the caloron and monopole Nahm data. The outline of this paper is as follows. In Sect. 2, some aspects of hyperKähler manifolds are presented, mostly to fix notation and to give some identities used throughout. Crucial in the ability to handle the caloron is that the infinite matrices of the ADHM construction are converted by Fourier transformation to functions on S 1 . This translates ADHM to the Nahm formulation and allows one to keep track of crucial delta-function singularities. In Sect. 3, to define notation, we summarise the ADHMN formalism for calorons as developed in refs. [22, 23, 30] based on the ADHM construction for instantons, rather than following [12, 38]. The caloron metric is calculated in Sect. 4. The instanton and monopole limits of the caloron are discussed in Sect. 5. A unified description of instantons, calorons and monopoles is thus achieved. Other aspects of the caloron, among which the effect of massless constituents, are commented on in the discussion. The appendix contains some technicalities on the (1, 1, . . . , 1) monopole.
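The ordering convention for the holonomy eigenvalues in Eq. (3) and the differences $\nu_m$ can be made concrete with a short sketch. The function name and the SU(3) example values below are hypothetical and chosen only for illustration; the check assumes maximal symmetry breaking, i.e. strictly ordered eigenvalues.

```python
def holonomy_gaps(mu):
    """Return nu_m = mu_{m+1} - mu_m, using mu_{n+1} = mu_1 + 1 as in Eq. (3)."""
    n = len(mu)
    ordered = all(mu[m] < mu[m + 1] for m in range(n - 1)) and mu[-1] < mu[0] + 1
    if not ordered:
        raise ValueError("need mu_1 < ... < mu_n < mu_1 + 1")
    if abs(sum(mu)) > 1e-12:
        raise ValueError("eigenvalues must sum to zero")
    extended = list(mu) + [mu[0] + 1.0]      # append mu_{n+1} = mu_1 + 1
    return [extended[m + 1] - extended[m] for m in range(n)]

# Hypothetical SU(3) holonomy eigenvalues (illustration only)
mu_example = [-0.3, 0.05, 0.25]
nu_example = holonomy_gaps(mu_example)
print(nu_example, sum(nu_example))
```

Because the sum of the $\nu_m$ telescopes to $\mu_{n+1} - \mu_1$, the gaps always add up to 1, and a vanishing gap would correspond to the non-maximal symmetry breaking mentioned in the text.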


2. Preliminaries

Manifolds with metric g are hyperKähler if they have three independent complex structures I, J, K that satisfy the quaternion algebra, $IJ = -JI = K$ and cyclic, whose associated Kähler forms $\omega_I(\cdot,\cdot) = g(\cdot, I\cdot)$, $\omega_J(\cdot,\cdot) = g(\cdot, J\cdot)$, $\omega_K(\cdot,\cdot) = g(\cdot, K\cdot)$ are closed. As will be outlined in Sect. 4.1, the moduli spaces of selfdual connections inherit their hyperKähler property from the hyperKähler structure of the base space manifold $M = \mathbb{R}^4/H$, where $H = \emptyset, \mathbb{Z}, \mathbb{R}$ for instantons, calorons and monopoles respectively. The position coordinate on $\mathbb{R}^4$ will be denoted as a quaternion, $x = x_\mu\sigma_\mu$. Here the unit quaternions are defined as $\sigma_\mu = (1_2, -i\vec{\tau}) = (1, i, j, k)$ and $\bar{\sigma}_\mu = (1_2, i\vec{\tau})$, with $ij = -ji = k$ and $\vec{\tau}$ the Pauli matrices. As M has a flat metric, there is no difference between upper and lower indices. Repeated indices imply summation. We introduce the selfdual, resp. anti-selfdual quaternionic tensors [19] $\eta_{\mu\nu} \equiv \eta^i_{\mu\nu}\sigma_i \equiv \frac{1}{2}(\sigma_\mu\bar{\sigma}_\nu - \sigma_\nu\bar{\sigma}_\mu)$ and $\bar{\eta}_{\mu\nu} \equiv \bar{\eta}^i_{\mu\nu}\sigma_i \equiv \frac{1}{2}(\bar{\sigma}_\mu\sigma_\nu - \bar{\sigma}_\nu\sigma_\mu)$, and $\varepsilon_{0123} = 1$. Identifying the tangent space to $\mathbb{H} = \mathbb{R}^4$ with the vector space itself, the complex structures act on x as right multiplication with $-i, -j, -k$, such that $(I, J, K)_{\mu\nu} = \bar{\eta}^{1,2,3}_{\mu\nu}$. It is convenient to combine the metric and Kähler forms into one quaternion,

$$(g, \vec{\omega}) = g\,\sigma_0 + \vec{\omega}\cdot\vec{\sigma}. \tag{5}$$

This implies for $\mathbb{R}^4$,

$$(g, \vec{\omega}) = d\bar{x}\otimes dx, \qquad g = ds^2 = (dx_\mu)^2, \qquad \vec{\omega}\cdot\vec{\sigma} = d\bar{x}\wedge dx = \bar{\eta}_{\mu\nu}\,dx_\mu\wedge dx_\nu = (2\,dx_0\wedge d\vec{x} - d\vec{x}\wedge d\vec{x})\cdot\vec{\sigma}. \tag{6}$$

Here, $(d\vec{a}\wedge d\vec{b})_i = \varepsilon_{ijk}\,da_j\wedge db_k$. One extends to $\mathbb{H}^N$ by replacing $d\bar{x}$ in Eq. (6) by $dx^\dagger = d\bar{x}^t$.

Many examples of hyperKähler manifolds emerge as hyperKähler quotients [18]. Consider a hyperKähler manifold M acted upon freely by a group G (with algebra $\mathfrak{g}$) of isometries, $L_X g = 0$, L denoting the Lie derivative and $X \in \mathfrak{g}$. When G preserves the complex structures, $L_X\vec{\omega} = 0$, the isometries are called triholomorphic and the moment map $\vec{\mu} : M \to \mathfrak{g}^*\otimes\mathbb{R}^3$ can be defined as $X^\mu\vec{\omega}_{\mu\nu} = \partial_\nu\vec{\mu}_X$. The manifold $\vec{\mu}^{-1}(\vec{c})/G$, with $\vec{c} \in \mathbb{R}^3\otimes Z^*_{\mathfrak{g}}$ ($Z^*_{\mathfrak{g}}$ the center of $\mathfrak{g}^*$), obtained by taking the quotient of the level set $\vec{\mu}^{-1}(\vec{c})$ by G is then hyperKähler itself. Isometries commuting with G descend to the quotient. When they are also triholomorphic, this property is preserved.

The relevant example is provided by the moduli space of ADHM data in the construction of charge k instantons on $\mathbb{R}^4$ for gauge group SU(n) [2, 7]. The caloron will be constructed using an infinite-dimensional version of the ADHM construction which we therefore review here, to establish conventions. One considers the set $\hat{A}$ of matrices

$$\Delta = \begin{pmatrix}\lambda \\ B\end{pmatrix}, \tag{7}$$

with $\lambda \in \mathbb{C}^{n,2k}$ and the $2k \times 2k$ dimensional matrix $B = B_\mu\otimes\sigma_\mu$, where $B_\mu$ are $k \times k$ dimensional hermitian matrices. With metric and Kähler forms on $\hat{A}$ defined as

$$g = \tfrac{1}{2}\,\mathrm{Tr}\,\mathrm{tr}_2\left(dB^\dagger dB + 2\,d\lambda^\dagger d\lambda\right), \qquad \vec{\omega}\cdot\vec{\sigma} = \tfrac{1}{2}\,\sigma_i\,\mathrm{Tr}\,\mathrm{tr}_2\,\bar{\sigma}_i\left(dB^\dagger\wedge dB + 2\,d\lambda^\dagger\wedge d\lambda\right), \tag{8}$$


$\hat{A}$ is hyperKähler. The U(k) transformations

$$\lambda \to \lambda T^\dagger, \qquad B_\mu \to T B_\mu T^\dagger, \qquad T \in U(k), \tag{9}$$

leave $(g, \vec{\omega})$ in Eq. (8) invariant and therefore form a group of triholomorphic isometries of $\hat{A}$. The associated moment map reads ($\mathrm{tr}_2$ denoting the trace associated with quaternions)

$$\vec{\mu} = \tfrac{1}{2}\,\mathrm{tr}_2\left((B^\dagger B + \lambda^\dagger\lambda)\,\vec{\bar{\sigma}}\right). \tag{10}$$

Its zero set $\vec{\mu}^{-1}(0)$ is formed by the solutions to the ADHM constraint

$$\bar{\eta}_{\mu\nu}B_\mu B_\nu + \tfrac{1}{2}\,\tau_a\,\mathrm{tr}_2(\tau_a\lambda^\dagger\lambda) = 0. \tag{11}$$

The instanton gauge connection corresponding to a solution $\Delta \in \vec{\mu}^{-1}(0)$ is obtained as

$$A_\mu(x) = v^\dagger(x)\,\partial_\mu v(x), \tag{12}$$

in terms of the $(2k + n) \times n$ dimensional complex matrix $v(x)$ containing the normalised zero modes of $\Delta^\dagger(x) = \Delta^\dagger - x^\dagger b^\dagger$, where $b^\dagger = (0, 1_k)$. For $A_\mu$ to be an SU(n) gauge potential, $B^\dagger B + \lambda^\dagger\lambda$ should be invertible, implying the existence of a $k \times k$ dimensional hermitian matrix $f_x$ commuting with the quaternions,

$$\Delta^\dagger(x)\Delta(x) = f_x^{-1}\otimes\sigma_0. \tag{13}$$

This matrix features in the expression for the curvature,

$$F_{\mu\nu} = 2\,v^\dagger(x)\,b\,\eta_{\mu\nu}\,f_x\,b^\dagger v(x), \tag{14}$$

showing it to be self-dual. It also appears in the formula for the action density [40],

$$\mathrm{Tr}F^2_{\mu\nu}(x) = -\partial^2_\mu\partial^2_\nu\log\det f_x, \tag{15}$$

from which it follows that the topological charge is k, because of the asymptotic behaviour

$$f_x = 1_k/x^2, \qquad x^2 \to \infty. \tag{16}$$

Thus it is shown that an element $\Delta \in \vec{\mu}^{-1}(0)$ corresponds to a charge k instanton solution. The gauge connection (12) is not affected by the U(k) transformations (9), which therefore have to be divided out to obtain the instanton moduli space $\vec{\mu}^{-1}(0)/U(k)$ (its isometry with the moduli space of instantons is discussed later). This reduces the dimension of the instanton moduli space to 4kn. As it is a hyperKähler quotient, this space is hyperKähler [8, 10]. Global gauge transformations of the instanton, which are included as moduli, are realised by the action

$$\lambda \to g\lambda, \qquad g \in SU(n), \tag{17}$$

which is a triholomorphic isometry, as follows from Eq. (8). As SU(n) acts on the left, it commutes with U(k) acting on the right. Therefore, SU(n) descends as a group of triholomorphic isometries to the moduli space of ADHM data, the hyperKähler quotient $\vec{\mu}^{-1}(0)/U(k)$, reflecting the gauge symmetry of the instanton solution.
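The quaternion conventions of this section can be verified numerically. The sketch below is an illustration only: it assumes the representation $\sigma_\mu = (1_2, -i\vec{\tau})$, $\bar{\sigma}_\mu = (1_2, i\vec{\tau})$ and $\varepsilon_{0123} = 1$ stated above, uses plain 2×2 tuples instead of a matrix library, and all helper names are hypothetical.

```python
def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)

def add(a, b):
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ONE = ((1, 0), (0, 1))
TAU = (((0, 1), (1, 0)), ((0, -1j), (1j, 0)), ((1, 0), (0, -1)))  # Pauli matrices
SIGMA = (ONE,) + tuple(scale(-1j, t) for t in TAU)      # sigma_mu = (1, -i tau)
SIGMA_BAR = (ONE,) + tuple(scale(1j, t) for t in TAU)   # bar sigma_mu = (1, i tau)

def eta(mu, nu):
    # eta_{mu nu} = (sigma_mu bar sigma_nu - sigma_nu bar sigma_mu) / 2
    return scale(0.5, add(matmul(SIGMA[mu], SIGMA_BAR[nu]),
                          scale(-1, matmul(SIGMA[nu], SIGMA_BAR[mu]))))

def eta_bar(mu, nu):
    # bar eta_{mu nu} = (bar sigma_mu sigma_nu - bar sigma_nu sigma_mu) / 2
    return scale(0.5, add(matmul(SIGMA_BAR[mu], SIGMA[nu]),
                          scale(-1, matmul(SIGMA_BAR[nu], SIGMA[mu]))))

def levi_civita(idx):
    # Sign of the permutation idx of (0, 1, 2, 3); zero if an index repeats.
    if len(set(idx)) < 4:
        return 0
    sign = 1
    for i in range(4):
        for j in range(i + 1, 4):
            if idx[i] > idx[j]:
                sign = -sign
    return sign

def dual(tensor, mu, nu):
    # (1/2) eps_{mu nu rho sigma} tensor_{rho sigma}
    total = ((0, 0), (0, 0))
    for rho in range(4):
        for sig in range(4):
            e = levi_civita((mu, nu, rho, sig))
            if e:
                total = add(total, scale(0.5 * e, tensor(rho, sig)))
    return total

pairs = [(m, n) for m in range(4) for n in range(4)]
ij_equals_k = close(matmul(SIGMA[1], SIGMA[2]), SIGMA[3])
eta_selfdual = all(close(eta(m, n), dual(eta, m, n)) for m, n in pairs)
eta_bar_antiselfdual = all(
    close(eta_bar(m, n), scale(-1, dual(eta_bar, m, n))) for m, n in pairs)
print(ij_equals_k, eta_selfdual, eta_bar_antiselfdual)
```

With these conventions the three checks confirm that $ij = k$, that $\eta_{\mu\nu}$ equals its dual and that $\bar{\eta}_{\mu\nu}$ equals minus its dual, which is the property used in Eq. (14) to conclude that the curvature is self-dual.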


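At this place we recall a frequently used U(1) fibration over $\mathbb{R}^3$, physically interpreted as a monopole phase and position. It is presented in terms of complex row 2-vectors that feature in the ADHM matrix λ. Specifically, for a 2-dimensional complex row vector $\varsigma = (\varsigma_1, \varsigma_2)$, describing $\mathbb{R}^4$, the metric and Kähler forms read

$$g = \tfrac{1}{2}\,\mathrm{tr}_2(d\varsigma^\dagger d\varsigma), \qquad \vec{\omega}\cdot\vec{\sigma} = \tfrac{1}{2}\,\sigma_i\,\mathrm{tr}_2\left(\bar{\sigma}_i\,d\varsigma^\dagger\wedge d\varsigma\right). \tag{18}$$

The complex structures act on ς by right multiplication with $-\sigma_i$. There is a triholomorphic U(1) isometry with associated moment map

$$\varsigma \to e^{it}\varsigma, \qquad \vec{\mu} = \tfrac{1}{2}\,\mathrm{tr}_2(-i\,\varsigma^\dagger\varsigma\,\vec{\bar{\sigma}}) = \tfrac{1}{2}\,\vec{r}. \tag{19}$$

The level sets are U(1) fibres due to the phase ambiguity in defining ς from $\vec{r}$, which becomes more manifest upon introducing new coordinates,

$$\varsigma = \varsigma^0 e^{i\psi/2}, \qquad \psi \in \mathbb{R}/(4\pi\mathbb{Z}), \tag{20}$$

with for example $\varsigma^0_2(\vec{r})$ chosen real. A useful identity is

$$\tfrac{1}{2}\,\mathrm{tr}_2\left(\delta\varsigma_0^\dagger\varsigma_0 - \varsigma_0^\dagger\delta\varsigma_0\right) = -i\,|\vec{r}|\,\vec{w}(\vec{r})\cdot d\vec{r}, \tag{21}$$

where $\vec{w}(\vec{r})$ is the vector potential of the abelian Dirac monopole,

$$\vec{\nabla}_r\times\vec{w}(\vec{r}) = \vec{\nabla}_r\frac{1}{|\vec{r}|}. \tag{22}$$

In the present form, the Dirac string lies along the positive z axis; other gauges are obtained by allowing for $\vec{r}$ dependent phase ambiguities. In terms of $(\vec{r}, \psi)$, the metric and Kähler forms on $\mathbb{R}^4$ read

$$ds^2 = \frac{1}{4}\left(\frac{1}{|\vec{r}|}\,d\vec{r}^{\,2} + |\vec{r}|\,(d\psi + \vec{w}(\vec{r})\cdot d\vec{r})^2\right), \qquad \vec{\omega} = (d\psi + \vec{w}(\vec{r})\cdot d\vec{r})\wedge d\vec{r} - \frac{1}{2r}\,d\vec{r}\wedge d\vec{r}. \tag{23}$$

The U(1) isometry is equivalent to a linear action

$$\psi \to \psi + 2t, \qquad t \in \mathbb{R}/(2\pi\mathbb{Z}). \tag{24}$$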
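The fibration just described can be illustrated numerically. The sketch below is not taken from the paper: it assumes that the moment map (19) amounts to $\vec{r} = \varsigma\,\vec{\tau}\,\varsigma^\dagger$, and it uses one explicit representative of the Dirac potential, $\vec{w} = (-y, x, 0)/(r(r - z))$, chosen here because its singularity lies on the positive z axis. Function names, sample values and the finite-difference step are illustrative.

```python
import cmath
import math

def hopf_vector(s1, s2):
    # r_i = varsigma tau_i varsigma^dagger for the row vector varsigma = (s1, s2)
    z = s1 * s2.conjugate()
    return (2.0 * z.real, 2.0 * z.imag, abs(s1) ** 2 - abs(s2) ** 2)

def dirac_potential(p):
    # Assumed gauge: w = (-y, x, 0) / (r (r - z)), singular on the positive z axis
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    d = r * (r - z)
    return (-y / d, x / d, 0.0)

def curl(field, p, h=1e-5):
    # Central finite-difference curl of a vector field at the point p
    def partial(comp, axis):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        return (field(plus)[comp] - field(minus)[comp]) / (2.0 * h)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

def grad_inverse_r(p):
    # Gradient of 1/|r|, i.e. -r / |r|^3
    x, y, z = p
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-x / r3, -y / r3, -z / r3)

# Illustrative sample point in R^4 written as a complex 2-vector
s = (0.3 + 0.4j, -0.2 + 0.7j)
r_vec = hopf_vector(*s)
phase = cmath.exp(0.9j)                       # a U(1) rotation e^{it}
r_rotated = hopf_vector(phase * s[0], phase * s[1])

point = (0.3, -0.4, 0.5)                      # away from the Dirac string
curl_w = curl(dirac_potential, point)
target = grad_inverse_r(point)
print(r_vec, curl_w, target)
```

Under these assumptions $\vec{r}$ is unchanged by the U(1) phase, its length equals $|\varsigma_1|^2 + |\varsigma_2|^2$, and the numerical curl of $\vec{w}$ agrees with $\vec{\nabla}(1/|\vec{r}|)$ as in Eq. (22).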

The moduli spaces we will encounter are all so-called toric hyperKähler manifolds [42]. These manifolds have coordinates consisting of N three-vectors xa ∈ R3 , a = 1, . . . , N, and N torus variables φa , generalising the U (1) in the previous example. Metric and Kähler forms read dφb dφa ac · d xc (:−1 )ab bd · d xd , +; +; g = d xa :ab · d xb + 4π 4π

ω = 2(

dφa ab · d xb ) ∧ d xa − :ab d xb ∧ d xa . +; 4π

(25)


are φa independent, giving Here we adopted the notation of [14]. The potentials : and ; rise to N commuting triholomorphic isometries ∂/∂φa , corresponding to shifts on the torus. Closure of the Kähler forms is equivalent to ∂ i ∂ ∂ j ; − ; = #ij k k :ab , ∂xai bc ∂xcj ba ∂xc

∀a, b, c, i, j.

(26)

These equations are therefore called hyperKähler conditions [42,13], and generalise Eq. (22). The metric in Eq. (25) has an SO(3) isometry, acting on the vectors $\vec x_a$, that rotates the complex structures. Toric hyperKähler manifolds are torus bundles over $(\mathbb{R}^3)^N$ [14]. Physically, the $\mathbb{R}^3$ vectors $\vec x_a$ are (relative) constituent monopole positions, whereas the torus describes the phases of the monopoles. In the Lagrangian interpretation of the metric, : and ; denote retarded interaction potentials for the constituents [14, 34], and it was considerations of this kind that led to the conjectures for the metric in [27, 29].

3. The ADHM-Nahm Formalism

We will construct the caloron in the so-called algebraic gauge, related to the periodic gauge by the non-periodic gauge transformation $g(\vec x, x_0) = V \exp[2\pi i x_0\,\mathrm{diag}(\mu_1, \ldots, \mu_n)] V^{-1}$. In this gauge, the background field $2\pi i\,\mathrm{diag}(\mu_1, \ldots, \mu_n)$ in Eq. (4) is removed and we have the alternative boundary condition,

\[ A_\mu(\vec x, x_0 + T) = P_\infty A_\mu(\vec x, x_0) P_\infty^{-1}. \tag{27} \]

Since in the absence of magnetic windings $P_\infty$ can always be gauged to a constant diagonal form, we assume henceforth $P_\infty = P_\infty^0$ without loss of generality. The periodic instanton of charge one is obtained in the algebraic gauge (27) by taking an infinite array of elementary instantons, relatively gauge-rotated by $P_\infty$. To implement this in the ADHM formalism we take a specific solution for the zero mode vector $v(x)$ in the ADHM construction,

\[ v(x) = \begin{pmatrix} -1_n \\ u(x) \end{pmatrix} \varphi^{-1/2}(x), \quad u(x) = (B^\dagger - x^\dagger 1_k)^{-1}\lambda^\dagger, \quad \varphi(x) = 1_n + u^\dagger(x)u(x), \tag{28} \]

where $\varphi$ is an n × n positive hermitian matrix. In terms of these, one obtains

\[ A_\mu(x) = \varphi^{-1/2}(x)\,\bigl(u^\dagger(x)\partial_\mu u(x)\bigr)\,\varphi^{-1/2}(x) + \varphi^{1/2}(x)\,\partial_\mu \varphi^{-1/2}(x). \tag{29} \]

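For Eq. (27) to hold, it is then required that

\[ u_{p+1}(x + 1) = u_p(x) P_\infty^{-1}, \quad p \in \mathbb{Z}. \tag{30} \]

This imposes periodicity constraints on the data

\[ \lambda_{p+1} = P_\infty \lambda_p, \quad B_{p,p'}(x + 1) = B_{p-1,p'-1}(x), \tag{31} \]

with $B(x) = B - x 1_k$, which imply

\[ \lambda_p = P_\infty^p \zeta, \quad B_{p,p'} = p\,\sigma_0 \delta_{p,p'} + \hat A_{p-p'}, \quad p, p' \in \mathbb{Z}. \tag{32} \]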

T. C. Kraan

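The off-diagonal part $\hat A$ is still to be determined. Fourier transformation translates the ADHM formalism to the Nahm language. $B$ is cast into a Weyl operator,

\[ \sum_{p,p'\in\mathbb{Z}} B_{p,p'}(x)\, e^{2\pi i(pz - p'z')} = \frac{\delta(z - z')}{2\pi i}\,\hat D_x(z'), \quad \hat D_x(z) = \sigma_\mu \hat D^\mu_x(z) = \frac{d}{dz} + \hat A(z) - 2\pi i x, \]
\[ \hat A(z) = \sigma_\mu \hat A_\mu(z), \quad \hat A_\mu(z) = 2\pi i \sum_{p\in\mathbb{Z}} e^{2\pi i p z} \hat A^\mu_p, \tag{33} \]

and $\lambda^\dagger\lambda$ into a singularity structure describing the matching conditions for $\hat A(z)$,

\[ \sum_{p\in\mathbb{Z}} e^{-2\pi i p z}\lambda_p = \sum_{p\in\mathbb{Z}}\ \sum_{m\in\mathbb{Z}/n\mathbb{Z}} e^{2\pi i p(\mu_m - z)} P_m \zeta = \hat\lambda(z), \quad \hat\lambda(z) = \sum_{m\in\mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m) P_m \zeta, \]
\[ \sum_{p,p'\in\mathbb{Z}} \lambda_p^\dagger\, e^{2\pi i(pz - p'z')} \lambda_{p'} = \delta(z - z')\,\hat C(z), \quad \hat C(z) = \sum_{m\in\mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\,\zeta^\dagger P_m \zeta = \zeta^\dagger \hat\lambda(z). \tag{34} \]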

m∈Z/nZ t , where e is the mth unit vector, Here we introduced the projection operators Pm = em em m in terms of which P∞ = exp(2πiµm )Pm and λp = exp(2π ipµm )Pm ζ. m∈Z/nZ

m∈Z/nZ

The group index m ∈ Z/nZ is a cyclic variable. We also used that for any two obp jects a, b of type ap = P∞ α, p ∈ Z, the Fourier transforms defined as a(z) ˆ = p∈Z exp(−2πipz)ap , have the property ˆ ) = δ(z − z ) a(z) ˆ ˆ † < bˆ >= δ(z − z ) < aˆ † > b(z) aˆ † (z)b(z δ(z − µm )α † Pm β, = δ(z − z ) where < H >≡

S1

(35)

m∈Z/nZ

H (z)dz. The quadratic ADHM constraint translates into 1 2

ˆ [Dˆ µ (z), Dˆ ν (z)]η¯ µν = 4π 2 C(z),

(36)

where is introduced to act on a 2×2 matrix as W ≡ 21 [W −τ2 W t τ2 ] (W ≡ 21 tr 2 W ). We use the U (1) fibration over R3 (Eq. (19)) to write 1 (ρm + ρm · τ), ρm = |ρm |. 2π This leads to the caloron Nahm equation d ˆ j δ(z − µm )ρm , Aj (z) = 2π i dz † ζ † Pm ζ = ζ(m) ζ(m) =

m∈Z/nZ

(37)

(38)

which is abelian in the k = 1 situation at hand, see [24, 38]. The phase ambiguity in defining $\zeta_{(m)}$ from $\vec\rho_m$ is resolved later. As integration of Eq. (38) over $S^1$ gives a constraint on $\zeta$,

\[ \sum_{m\in\mathbb{Z}/n\mathbb{Z}} \vec\rho_m = \pi\,\mathrm{tr}_2(\vec\tau\,\zeta^\dagger\zeta) = 0, \tag{39} \]

we can introduce vectors $\vec y_m$, $m \in \mathbb{Z}/n\mathbb{Z}$, such that $\vec\rho_m = \vec y_m - \vec y_{m-1}$. The vectors $\vec y_m$ are to be interpreted as the constituent monopole positions. We now find for the spacelike components of $\hat A(z)$,

\[ \hat A_j(z) = 2\pi i \sum_{m\in\mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\, y_m^j, \tag{40} \]

where $\chi_{[\mu_m,\mu_{m+1}]}(z) = 1$ for $z \in [\mu_m, \mu_{m+1}]$ and 0 elsewhere, extended periodically. Note that the Nahm equations determine $\vec y_m$ up to the global $\mathbb{R}^3 \times S^1$ position variable

\[ \xi = \frac{1}{2\pi i}\int_{S^1} \hat A(z)\,dz, \quad \vec\xi = \sum_{m\in\mathbb{Z}/n\mathbb{Z}} \nu_m \vec y_m. \tag{41} \]

Here $\nu_m = \mu_{m+1} - \mu_m$ is related to the mass of the mth constituent. The T symmetry Eq. (9) in the ADHM construction is mapped to a U(1) gauge symmetry, with gauge group $\hat{\mathcal G} = \{g(z)\,|\,g : z \to e^{-ih(z)} \in U(1)\}$, acting as

\[ \hat A(z) \to \hat A(z) + i\frac{d}{dz}h(z), \quad \zeta_m \to \zeta_m e^{ih(\mu_m)}. \tag{42} \]
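As a quick numerical illustration of Eq. (41), the following sketch integrates a piecewise constant profile of the type in Eq. (40) over the circle and compares the result with the weighted sum of the $y_m$. It is only a consistency check under stated assumptions: the values of $\mu_m$ and $y_m$ are arbitrary illustrative choices (a single spatial component, with the overall factor $2\pi i$ divided out), and the helper names are hypothetical.

```python
from fractions import Fraction as F

def nahm_profile(z, mus, ys):
    """One component of sum_m chi_[mu_m, mu_{m+1}](z) * y_m at the point z.

    mus are sorted in [0, 1); the last interval wraps around to mu_1 + 1.
    """
    z = z % 1
    n = len(mus)
    for m in range(n - 1):
        if mus[m] <= z < mus[m + 1]:
            return ys[m]
    return ys[n - 1]  # wrap-around interval [mu_n, mu_1 + 1)

def circle_integral(mus, ys, steps):
    """Midpoint rule for the integral over S^1 (circumference 1)."""
    h = F(1, steps)
    return sum(nahm_profile((k + F(1, 2)) * h, mus, ys) for k in range(steps)) * h

# Illustrative data (assumed, not taken from the paper): n = 3 constituents.
mus = [F(1, 10), F(4, 10), F(8, 10)]
ys = [F(2), F(-1), F(5)]
# nu_m = mu_{m+1} - mu_m, with mu_{n+1} = mu_1 + 1.
nus = [mus[1] - mus[0], mus[2] - mus[1], mus[0] + 1 - mus[2]]

xi_from_integral = circle_integral(mus, ys, steps=10)
xi_from_sum = sum(nu * y for nu, y in zip(nus, ys))
```

Because the grid is aligned with the jump points, the midpoint rule is exact here, so both expressions agree as rational numbers.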

For calorons, $g(z)$ is periodic and can be used to set $\hat A_0(z)$ to a constant. A piecewise linear U(1) gauge function $h(z)$ shifts the U(1) phase ambiguities in $\zeta_{(m)}$ to $\hat A_0(z)$, which thus becomes piecewise constant. Therefore, all 4n moduli are included in the following solution to the Nahm equations:

\[ \hat A(z) = 2\pi i \sum_{m\in\mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\Bigl(\frac{\tau_m}{4\pi\nu_m}\sigma_0 + \vec y_m\cdot\vec\sigma\Bigr), \tag{43} \]

where $\tau = (\tau_1, \ldots, \tau_n)^t$ takes values in $\mathbb{R}^n$. Using the gauge functions

\[ g(z) = \sum_{m\in\mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\exp\Bigl(2\pi i (z - \mu_m)\frac{k_m}{\nu_m}\Bigr), \quad k_m \in \mathbb{Z}, \tag{44} \]

which leave the U(1) phases of $\zeta$ unaffected, $\tau$ can be restricted to the torus $\mathbb{R}^n/(4\pi\mathbb{Z})^n$. In this gauge, the moduli describing the general caloron are the position vectors $\vec y_m$, comprised in $\vec y = (\vec y_1, \ldots, \vec y_n)$, and the torus coordinate $\tau$ describing the $U(1)^{n-1}$ residual gauge symmetry and the temporal position of the caloron. Strictly speaking, these variables are coordinates on the cover of the moduli space of framed calorons. The true moduli space is obtained by dividing out the center of the gauge group. This leads to orbifold singularities.
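The role of the integers $k_m$ in Eq. (44) can be checked numerically. The sketch below evaluates $g(z)$ for illustrative values of $\mu_m$ (assumed, not from the paper) and confirms that $g(\mu_m) = 1$ for integer $k_m$, so the phases of $\zeta_m$ in Eq. (42) are untouched, while a non-integer $k_m$ would spoil this. The function name is hypothetical. Since the phase of $g$ winds by $2\pi k_m$ across the mth subinterval, Eqs. (42) and (43) then shift $\tau_m$ by a multiple of $4\pi$, consistent with the torus quoted in the text.

```python
import cmath
import math

def gauge_function(z, mus, ks):
    """g(z) of Eq. (44): exp(2 pi i (z - mu_m) k_m / nu_m) on [mu_m, mu_{m+1}].

    mus are sorted in [0, 1); the last interval wraps around to mu_1 + 1.
    """
    n = len(mus)
    z = z % 1.0
    for m in range(n):
        lo = mus[m]
        hi = mus[m + 1] if m + 1 < n else mus[0] + 1.0
        # Lift z by one period when it lies before the start of this interval,
        # which only matters for the wrap-around interval.
        zz = z if z >= lo else z + 1.0
        if lo <= zz <= hi:
            nu = hi - lo
            return cmath.exp(2j * math.pi * (zz - lo) * ks[m] / nu)
    raise ValueError("z is not covered by any subinterval")

# Illustrative data (assumed): three subintervals and integer windings.
mus = [0.1, 0.4, 0.8]
ks = [1, -2, 3]
values_at_jumps = [gauge_function(mu, mus, ks) for mu in mus]
```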


Under Fourier transformation, the Green’s function fx (Eq. (13)) for calorons be 2πi(pz−p z ) and is a solution of the differential comes fˆx (z, z ) ≡ p,p ∈Z fx,p,p e equation

1 d − x0 2πi dz

2 +

m∈Z/nZ

1 + 2π

2 χ[µm ,µm+1 ] (z) rm

δ(z − µm )| ym − ym−1 | fˆx (z, z ) = δ(z − z ),

(45)

m∈Z/nZ

in the gauge with Aˆ 0 (z) constant. Here rm = | x − rm | is the center of mass radius of the mth constituent. Expressions for fˆx in other gauges are obtained by using that under the ˆ fˆx transforms as action of G, fˆx (z, z ) → g(z)fˆx (z, z )g(z )∗ ,

ˆ g(z) ∈ G.

(46)

The Nahm construction of the (1, 1, . . . , 1) monopole, later obtained as a special limit of the caloron, is discussed in the appendix.

4. The Caloron Metric

4.1. Moduli spaces of selfdual connections. The metric on the moduli space M of selfdual connections on the manifold M = R4 /H is computed as the L2 norm of its tangent vectors. These are gauge orthogonal variations of the connections with respect to their moduli. Specifically, $Z_\mu$ is tangent to the moduli space when it is a solution of the deformation equation and of the gauge orthogonality condition, which requires it to be a zero mode of the covariant derivative $D_\mu^{ad} = \partial_\mu + [A_\mu, \cdot\,]$,

\[ D^{ad}_{[\mu} Z_{\nu]} = \tfrac{1}{2}\varepsilon_{\mu\nu\alpha\beta} D^{ad}_{[\alpha} Z_{\beta]}, \quad D^{ad}_\mu(A) Z_\mu = 0. \tag{47} \]

Written in terms of quaternions, these equations are concisely expressed as D ad† Z = 0, from which one reads off the tangent space to admit three almost complex structures I, J, K acting as −i, −j, −k on the right. Metric and Kähler forms read 1 (g, ω) M (Z, Z ) = d4 xTr Z † (x)Z (x) , (48) 2 4π M where Z, Z are any two tangent vectors. Gauge orthogonality of a general variation δAµ of the selfdual connection can be achieved by applying an infinitesimal gauge transformation :, Zµ = δAµ + Dµad :, implying for the metric g=−

1 4π 2

M

(Dνad )2 : = −Dµad δAµ ,

d4 xTr(δAµ − Dµad (Dνad )−2 Dρad δAρ )2 .

(49)

(50)

The hyperKähler property of the moduli space follows formally from considering it as the infinite dimensional hyperKähler quotient of the space of general connections


A by the triholomorphic action of the group of gauge transformations G[1, 10]. The moment map is µ G = η¯ µν Fµν /8π 2 , so that the zero set is formed by the space of selfdual solutions, which quotiented by G gives the moduli space. That this quotient is well defined follows from the invariance of the Kähler forms 1 ω rs · σ = − 2 d4 x η¯ µν Tr(δr Aµ δs Aν ), (51) 4π M under infinitesimal gauge transformations, which is seen by adding arbitrary Dµad : to the deformations. For the caloron the boundary condition Eq. (27) is consistent with complex structures acting as η¯ µν , i.e. the non- trivial holonomy is compatible with the hyperKähler structure. One therefore expects caloron moduli spaces to be hyperKähler. For practical purposes the formal reasoning above is of little use. Computing metrics on moduli spaces with the techniques presented depends crucially on the construction of the Green’s function of the covariant Laplacian and in the present situation, we do not even have an expression for Aµ readily available. We take a different route which uses multi-instanton calculus, suitably adapted to the caloron situation. This allows for calculating the metric in terms of the ADHMN data and makes it thus feasible to find a compensating gauge transformation or to perform the hyperKähler quotient. Moduli spaces of selfdual connections can usually be written as a product of the base space M, describing the center of mass and the non-trivial relative moduli space Mrel , M = M × Mrel .

(52)

In the metric this corresponds to a part describing the flat metric on the base space M and one for the relative or centered metric on Mrel , containing the nontrivial part. However, in the case at hand, where we want to take particular limits, it will be preferable to work with the full metric on M. 4.2. Isometric properties of the ADHM-Nahm construction. We first recall the computation of the metric on the moduli space of instantons on R4 which can be entirely performed using ADHM techniques. Adapted to the caloron situation, this will translate into the formalism to calculate metrics in terms of Nahm data. A tangent vector to the instanton moduli space is given by Zµ (C) = v † (x)C σ¯ µ fx u(x)ϕ − 2 (x) − ϕ − 2 (x)u† (x)fx σµ C † v(x), 1

1

where C is a tangent vector to the moduli space of ADHM data, c C= , Y

(53)

(54)

which satisfies tr 2 (.† (x)C σ¯ i ) = −tr 2 (C † .(x)σ¯ i ),

tr 2 (.† (x)C) = tr 2 (C † .(x)).

(55)

Here the first equation is the deformation of the ADHM constraint and the second guarantees gauge orthogonality. Using an infinitesimal U (k) transformation (9) T = exp(−iδX), where δX = δX† , the tangent vectors can be constructed as δλ + iλδX C = δ. + δX . = , (56) δB + i[B, δX]


which automatically satisfy the deformation equation. Gauge orthogonality imposes tr 2 B † [B, iδX] − [B † , iδX]B + 2iδXλ† λ + λ† δλ − δλ† λ + B † δB − δB † B = 0. (57) The complex structures acting on tangent vectors Z extend to C in a natural way, i σ . The metric can Z(C)σ¯ i = Z(C σ¯ i ), as is seen from Eq. (53) and σµ σ¯ i = −η¯ µν ν 1 be evaluated using a powerful expression due to Corrigan [6], † (58) Tr(Zµ† (x)Zµ (x)) = − 21 ∂ 2 tr 2 Tr C † (1 − .(x)fx .† (x))C fx + C Cfx . The integral to compute the L2 norm in Eq. (48) is reduced to a boundary term corresponding with x 2 → ∞, where fx is known, compare Eq. (16). Using that Z(C)σ¯ i = Z(C σ¯ i ) and identifying the tangent space to the ADHM data with the vector space itself, the well-known (see also [32]) hyperKähler isometric property of the ADHM construction is proven † gM (Z, Z ) = 21 Tr tr 2 Y † Y + c† c + c c , (59) † ω M (Z, Z ) · σ = 21 σi Tr tr 2 σ¯ i Y † Y + c† c − c c . The right-hand side of Eq. (59) explains why Eq. (8) gives the natural metric and Kähler forms on the space Aˆ of ADHM matrices .. As the ADHM construction is an isometry and the moduli space of ADHM data µ −1 (0)/U (k) is hyperKähler the same holds for the moduli space of instantons on R4 . In employing the metric properties of the ADHM construction in the caloron case, one has – in addition to the deformation equation and gauge orthogonality – the algebraic gauge condition Eq. (27) to be satisfied −1 Zµ (x + 1) = P∞ Zµ (x)P∞ .

(60)

This requires Yp,p = Yp−1,p −1 ,

cp+1 = P∞ cp ,

δXp,p = δXp−p .

(61)

The compatibility of periodicity and nontrivial holonomy with the hyperKähler structure on the level of the ADHM-Nahm construction can be seen from the complex structures acting on Y and c as multiplication by −i, −j, −k on the right. We define the Fourier transforms of the tangent vector c(z) ˆ =

p∈Z

δ(z − z )Yˆ (z) =

exp(−2π ipz)cp =

m∈Z/nZ

e2πi(pz−p z ) Yp,p ,

δ(z − µm )cˆm , (62)

p,p ∈Z 1 The expression given in eq. 3.17 of [41] is incorrect for gauge group SU (n) and should be replaced by the one given here in Eq. (58).


and find after Fourier transformation of Eqs. (55, 56) the analogues of Eq. (47) as the deformation of the Nahm equation and a gauge orthogonality condition d ˆ † Yi (z) = −iπ tr 2 σ¯ i (ζm† cˆm + cˆm ζm )δ(z − µm ), dz m∈Z/nZ

d ˆ Y0 (z) = −iπ dz

m∈Z/nZ

† tr 2 (ζm† cˆm − cˆm ζm )δ(z − µm ).

(63)

To evaluate the caloron metric we use Eq. (58) and closely follow the reasoning in [23]. By Fourier transformation, Corrigan’s formula is cast into † 1 2 dz [Yˆ † (z)Yˆ (z) + Yˆ † (z)Yˆ (z) (64) TrZµ (x)Zµ (x) = − 2 ∂ tr 2 S1 + cˆ† (z) < cˆ > +cˆ† (z) < cˆ >]fˆx (z, z) ˆ + Yˆ x (z)]fˆx (z, z )[Yˆ x† (z ) + Cˆ† (z )]fˆx (z , z) , + 21 ∂ 2 tr 2 dzdz [C(z) S1

where we introduced the shorthand notation ˆ = cˆ† (z) < λˆ >, Yˆ x (z) = (2π i)−1 Yˆ † (z)Dˆ x (z). C(z)

(65)

In evaluating the integral over R3 × S 1 , the ∂02 term gives no contribution because of periodicity. The term involving ∂i2 is evaluated by partial integration as a boundary term at spatial infinity, for which the asymptotic behaviour of the Green’s function fˆx (z, z ) is needed. Since the asymptotic expression for the Green’s function is independent of n, π −2π|x ||z−z |+2πix0 (z−z ) fˆx (z, z ) = e + O(| x |−2 ), (66) | x| we can use, slightly adapted, the analysis for SU (2) in [23]. Combining the first line in Eq. (64) with the only surviving term of the second, we find the following gauge independent expression: (67) gM (Z, Z ) = 21 tr 2 < Yˆ † Yˆ > + < cˆ† >< cˆ > + < cˆ† >< cˆ > , i ωM (Z, Z ) = 21 tr 2 σ¯ i < Yˆ † Yˆ > + < cˆ† >< cˆ > − < cˆ† >< cˆ > . This proves that the metric and Kähler forms on the caloron moduli space can be computed as the metric on the Nahm data. In other words, for k = 1 SU (n) calorons, the Nahm construction is a hyperKähler isometry. A slightly modified proof shows this for monopoles of type (1, 1, . . . , 1) and can be found in the appendix. The isometric property is essential for what follows. The metric on the caloron moduli spaces can now be calculated in terms of tangent vectors to the space of solutions to the Nahm equations, with infinitesimal gauge transformations performed where needed. This method, used in Sect. 4.3, is called direct as it concentrates on the gauge orthogonal tangent vectors to the moduli space. An alternative method, given in Sect. 4.4, uses the fact that the moduli space of data is an infinite dimensional hyperKähler quotient. It proceeds by using part of the U (k) gauge symmetry to embed the moduli in a finite dimensional hyperKähler space. The metric on the moduli space is then found as the metric on a finite dimensional hyperKähler quotient, with the remaining gauge action to be divided out.


ˆ 4.3. Direct computation. In the direct approach a compensating gauge function δ X(z) = X exp(2πipz) has to be found to account for the tangent vectors p∈Z p c(z) ˆ =

ˆ m) , δ(z − µm ) δζm + iζm δ X(µ

m∈Z/nZ

1 Yˆ (z) = 2πi

d ˆ ˆ δ A(z) + i δ X(z) , dz

(68)

to be gauge orthogonal, Eq. (63). The gauge orthogonality of Yˆ (z) implies for the comˆ pensating gauge function δ X(z) −

ˆ 1 d 2 δ X(z) ˆ + 2δ X(z) δ(z − µm )|ρm | 2 2π dz m∈Z/nZ

dτm dτm−1 δ(z − µm ) − − |ρm |w m (ρm ) · d ρm , = 4π νm 4π νm−1

(69)

m∈Z/nZ

where we used Eq. (21). This differential equation implies that $\delta\hat X(z)$ is continuous and piecewise linear. Therefore, $\delta\hat X(z)$ is fully determined by the values $\delta\hat X_m$ it takes at $z = \mu_m$, which are comprised in the vector $\delta\hat X = (\delta\hat X_1, \ldots, \delta\hat X_n) \in \mathbb{R}^n$. In the gauge chosen, all functions are either constants on the subintervals $(\mu_m, \mu_{m+1})$, or fixed by values at $z = \mu_m$. Therefore, the entire computation can conveniently be performed in terms of n dimensional vectors and n × n matrix operators acting thereon, at the cost of introducing some extra notation. For taking derivatives, we will use the n × n matrix

\[ S = \begin{pmatrix} 1 & -1 & & & \\ & 1 & -1 & & \\ & & \ddots & \ddots & \\ & & & 1 & -1 \\ -1 & & & & 1 \end{pmatrix}, \tag{70} \]

with unspecified entries zero. In addition we introduce the vector $\vec\rho = (\vec\rho_1, \ldots, \vec\rho_n) \in \mathbb{R}^{3n}$ and diagonal matrices

\[ N = \mathrm{diag}(\nu_1, \ldots, \nu_n), \quad V^{-1} = 4\pi\,\mathrm{diag}(\rho_1, \ldots, \rho_n), \quad \vec W = \frac{1}{4\pi}\mathrm{diag}(\vec w_1(\vec\rho_1), \ldots, \vec w_n(\vec\rho_n)). \tag{71} \]

Introducing the symbol V anticipates its later interpretation as potential. In the sequel, all matrix multiplications between n-dimensional objects are implicitly assumed. The transpose t acts only on the indices running from 1 to n. The Nahm connection is now represented by the n dimensional vector

\[ \hat A = (\hat A_1, \ldots, \hat A_n)^t = 2\pi\Bigl(N^{-1}\frac{\tau}{4\pi} + \vec y\cdot\vec\sigma\Bigr), \tag{72} \]

where $i\hat A_m$ is the value of $\hat A(z)$ on $(\mu_m, \mu_{m+1})$. The Nahm equation reduces to $\vec\rho = S^t\vec y$. Similarly $\hat c(z) = \sum_{m\in\mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\hat c_m$ and $\hat Y(z) = i\sum_{m\in\mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\hat Y_m$ are fixed by

\[ \hat c_m = \delta\zeta_m + i\zeta_m\,\delta\hat X_m, \quad \hat Y = \frac{1}{2\pi}\delta\hat A - \frac{1}{2\pi}N^{-1}S\,\delta\hat X. \tag{73} \]
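The way the matrix $S$ encodes cyclic differences can be made concrete with a small sketch. It assumes the convention $S_{m,m} = 1$, $S_{m,m+1} = -1$ (indices mod n), which is the one compatible with $\vec\rho = S^t\vec y$ and $\vec\rho_m = \vec y_m - \vec y_{m-1}$; the numerical values and helper names are illustrative assumptions. The code checks that the $\rho_m$ sum to zero, as in Eq. (39), and that $-N^{-1}S\,\delta\hat X$ returns the slopes of a piecewise linear function with nodal values $\delta\hat X_m$.

```python
from fractions import Fraction as F

def cyclic_difference_matrix(n):
    """n x n matrix S with S[m][m] = 1 and S[m][m+1] = -1 (indices mod n)."""
    S = [[0] * n for _ in range(n)]
    for m in range(n):
        S[m][m] += 1
        S[m][(m + 1) % n] -= 1
    return S

def transpose(A):
    return [list(row) for row in zip(*A)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

n = 3
S = cyclic_difference_matrix(n)

# One spatial component of the constituent positions (illustrative values).
y = [2, -1, 5]
rho = matvec(transpose(S), y)          # rho_m = y_m - y_{m-1}, cyclically

# Nodal values of a piecewise linear function and interval lengths nu_m.
dX = [1, 4, 2]
nus = [F(3, 10), F(4, 10), F(3, 10)]
# Slope on (mu_m, mu_{m+1}) is (dX_{m+1} - dX_m) / nu_m = -(N^{-1} S dX)_m.
slopes = [-sd / nu for sd, nu in zip(matvec(S, dX), nus)]
```

The slopes weighted by the interval lengths add up to zero, which is just the statement that the function is periodic.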

ˆ Integrating the differential equation (69) for δ X(z) over small intervals [µm −#, µm +#], # ↓ 0, gives conditions on the values δ Xˆ m . This yields 1 t −1 dτ S t · d y), S N S + V −1 δ Xˆ = (S t N −1 − V −1 W 2π 4π

2 dz contributes ˆ where we used that −d 2 δ X(z)/dz − δ Xˆ (µm +) − δ Xˆ (µm −) 1 ˆ 1 ˆ ˆ ˆ =− δ Xm+1 − δ Xm − δ Xm − δ Xm−1 . νm νm−1

(74)

(75)

Equation (74) is solved by δ Xˆ dτ S t · d y, = V S t G−1 − 1 − V S t G−1 S W 2π 4π

(76)

such that dτ dτ 1 −1 ˆ S t · d y), Yˆ = d y · σ + N −1 − N Sδ X = d y · σ + G−1 ( + SW 4π 2π 4π

(77)

where we defined G = N + SV S t . The integration over S 1 to evaluate the metric on the Nahm data in Eq. (67) is carried out as < Yˆ † Yˆ >= Yˆ † N Yˆ using that each subinterval has length νm = µm+1 − µm . Thus we obtain tr 2 < Yˆ † Yˆ > = d y t · N d y t dτ S t · d y G−1 N G−1 dτ + S W S t · d y , + + SW 4π 4π † t 1 1 ¯ i < Yˆ ∧ Yˆ > = − 2 d y N ∧ d y · σ 4 σi tr 2 σ t −1 dτ t + NG ( + S W S · d y) ∧ d y · σ . 4π 1 2

Using the properties (21, 23) of ζm , the contribution to the metric of cˆm defined in Eq. (73) is found. One obtains tr 2 < cˆ† >< cˆ >= d y t · SV S t d y t dτ S t · d y G−1 SV S t G−1 dτ + S W S t · d y , + SW + 4π 4π

1 2

σi tr 2 σ¯ i < cˆ† > ∧ < cˆ >= − 21 (SV S t d y)t ∧ d y · σ t dτ S t · d y) ∧ d y · σ , + SW + SV S t G−1 ( 4π

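where it is used that in the gauge chosen the phases of $\zeta$ are fixed. The metric and Kähler forms on the moduli space of the uncentered caloron are now readily obtained:

\[ ds^2 = d\vec y^{\,t} G\cdot d\vec y + \Bigl(\frac{d\tau}{4\pi} + \vec{\mathcal W}\cdot d\vec y\Bigr)^t G^{-1}\Bigl(\frac{d\tau}{4\pi} + \vec{\mathcal W}\cdot d\vec y\Bigr), \tag{78} \]
\[ \vec\omega = 2\Bigl(\frac{d\tau}{4\pi} + \vec{\mathcal W}\cdot d\vec y\Bigr)^t \wedge d\vec y - (G\,d\vec y)^t \wedge d\vec y, \tag{79} \]
\[ G = N + S V S^t, \quad \vec{\mathcal W} = S\vec W S^t. \]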
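A short sketch can make the structure of $G = N + SVS^t$ explicit. It builds $G$ from the cyclic matrix $S$ (convention $S_{m,m} = 1$, $S_{m,m+1} = -1$ mod n, assumed as above) and compares it with the componentwise form with nearest-neighbour couplings $1/(4\pi\rho_m)$. To keep the arithmetic exact, the inputs are the quantities $4\pi\rho_m$ given as rational numbers; these values, like the function names, are purely illustrative assumptions.

```python
from fractions import Fraction as F

def cyclic_difference_matrix(n):
    """n x n matrix S with S[m][m] = 1 and S[m][m+1] = -1 (indices mod n)."""
    S = [[0] * n for _ in range(n)]
    for m in range(n):
        S[m][m] += 1
        S[m][(m + 1) % n] -= 1
    return S

def caloron_G(nus, four_pi_rho):
    """G = N + S V S^t with N = diag(nu_m) and V = diag(1 / (4 pi rho_m))."""
    n = len(nus)
    S = cyclic_difference_matrix(n)
    V = [1 / F(x) for x in four_pi_rho]
    G = [[F(0)] * n for _ in range(n)]
    for m in range(n):
        G[m][m] += nus[m]
        for mp in range(n):
            G[m][mp] += sum(S[m][k] * V[k] * S[mp][k] for k in range(n))
    return G

def caloron_G_componentwise(nus, four_pi_rho):
    """Same matrix written entry by entry (nearest-neighbour couplings)."""
    n = len(nus)
    V = [1 / F(x) for x in four_pi_rho]
    G = [[F(0)] * n for _ in range(n)]
    for m in range(n):
        G[m][m] += nus[m] + V[m] + V[(m + 1) % n]
        G[m][(m - 1) % n] -= V[m]
        G[m][(m + 1) % n] -= V[(m + 1) % n]
    return G

# Illustrative data (assumed): nu_m sum to one, 4*pi*rho_m positive.
nus = [F(1, 5), F(3, 10), F(1, 2)]
four_pi_rho = [2, 4, 5]
G = caloron_G(nus, four_pi_rho)
```

Each row of $G$ sums to $\nu_m$, because the constant vector is annihilated by $S^t$; this is the matrix counterpart of the center of mass decoupling from the interaction potentials.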

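Equivalently writing

\[ G_{m,m'} = \delta_{m,m'}\Bigl(\nu_m + \frac{1}{4\pi\rho_m} + \frac{1}{4\pi\rho_{m+1}}\Bigr) - \frac{\delta_{m-1,m'}}{4\pi\rho_m} - \frac{\delta_{m+1,m'}}{4\pi\rho_{m+1}}, \quad m, m' \in \mathbb{Z}/n\mathbb{Z}, \tag{80} \]

reveals the form of G as given in [29]; thus we confirm the conjectured form for the metric in [29]. As is readily checked, from Eqs. (22, 71) it follows that $G$ and $\vec{\mathcal W}$ satisfy the hyperKähler conditions (26),

\[ \vec\nabla_{y} G = \vec\nabla_{y} \times \vec{\mathcal W}, \quad \partial^i_m G_{m',m''} = \varepsilon_{ijk}\,\partial^j_m \mathcal W^k_{m',m''}, \tag{81} \]

($\partial^i_m = \partial/\partial y^i_m$), which implies the Kähler forms in (79) to be closed and the caloron metric to be hyperKähler. The metric has n commuting triholomorphic isometries,

\[ \frac{\partial}{\partial\tau_m}, \quad m = 1, \ldots, n, \tag{82} \]

as $G$ and $\vec{\mathcal W}$ are $\tau$ independent. The isometries correspond to shifts on the n-torus $\mathbb{R}^n/(4\pi\mathbb{Z})^n$ which describe the residual $U(1)^{n-1}$ gauge invariance and the temporal position

\[ \xi_0 = \frac{1}{4\pi}\sum_{m\in\mathbb{Z}/n\mathbb{Z}} \tau_m \in S^1, \tag{83} \]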

of the caloron. Therefore, the caloron moduli space is a toric hyperKähler manifold, with dimension 4n: 3n coordinates describe the monopole positions and n phase angles parameterise the temporal position and residual $U(1)^{n-1}$ gauge invariance in the case of maximal symmetry breaking. From the uncentered caloron metric in Eq. (78), all other metrics discussed in this paper can be obtained by taking suitable limits. In the next subsection the caloron metric will be obtained using the hyperKähler quotient. The non-trivial part of the metric is obtained by splitting off the center of mass coordinate $\xi$ in Eq. (41). To this aim, we express the metric in terms of $\vec\xi$ and n − 1 relative monopole position vectors $\vec\rho_m$, using that $\vec\rho_n = -\sum_{m=1}^{n-1}\vec\rho_m$ because of Eq. (39). The two sets of coordinates are related by the n × n dimensional "centering matrix" $F_c$,

\[ F_c = (S_c, N e), \quad \begin{pmatrix} \tilde{\vec\rho} \\ \vec\xi \end{pmatrix} = F_c^t\,\vec y. \tag{84} \]

Here, the n × (n − 1) dimensional matrix $S_c$ is obtained from $S$ by omitting its last column, and we defined $e = (1, \ldots, 1)^t \in \mathbb{R}^n$. A tilde denotes from now on the restriction to the first n − 1 coordinates, e.g. $\tilde{\vec\rho} = (\vec\rho_1, \ldots, \vec\rho_{n-1})^t$. New torus coordinates $\tilde\upsilon = (\upsilon_1, \ldots, \upsilon_{n-1})^t$ are introduced as well,

\[ \tau = F_c \begin{pmatrix} \tilde\upsilon \\ 4\pi\xi_0 \end{pmatrix}. \tag{85} \]

The centered metric will again be hyperKähler, as splitting off the center of mass metric amounts to taking the hyperKähler quotient under the U(1) action

\[ \tau_m \to \tau_m + \nu_m t_c, \quad m = 1, \ldots, n, \quad t_c \in \mathbb{R}. \tag{86} \]

From Eqs. (78, 79) it is seen that this action is a triholomorphic isometry whose moment map gives the center of mass of the caloron

\[ \vec\mu = \frac{1}{4\pi}\sum_{m\in\mathbb{Z}/n\mathbb{Z}} \nu_m \vec y_m = \frac{\vec\xi}{4\pi}. \tag{87} \]

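Indeed, the phase variables $\tilde\upsilon$ are invariant under the U(1) action and can serve as coordinates on the quotient whereas the fibre coordinate $\xi_0$ changes as $\xi_0 \to \xi_0 + t_c$. In the new basis the relative metric is expressed in terms of a relative mass matrix and relative interaction potentials

\[ F_c^{-1} G (F_c^{-1})^t = \begin{pmatrix} \tilde G_{rel} & \\ & 1 \end{pmatrix}, \quad \tilde G_{rel} = \tilde M + \tilde V_{rel}, \]
\[ (\tilde V_{rel})_{mm'} = \tilde V_{mm'} + \frac{1}{4\pi|\vec\rho_n|}, \quad (\tilde{\vec W}_{rel})_{mm'} = \tilde{\vec W}_{mm'} + \frac{\vec w_n(\vec\rho_n)}{4\pi}, \tag{88} \]

where $m, m' = 1, \ldots, n - 1$ and $\vec\rho_n = -\sum_{m=1}^{n-1}\vec\rho_m$. The relative mass matrix $\tilde M$ is defined as

\[ F_c^t N^{-1} F_c = \begin{pmatrix} \tilde M^{-1} & \\ & 1 \end{pmatrix}, \]
\[ \tilde M^{-1} = \begin{pmatrix} \frac{1}{\nu_n} + \frac{1}{\nu_1} & -\frac{1}{\nu_1} & & & \\ -\frac{1}{\nu_1} & \frac{1}{\nu_1} + \frac{1}{\nu_2} & -\frac{1}{\nu_2} & & \\ & \ddots & \ddots & \ddots & \\ & & -\frac{1}{\nu_{n-3}} & \frac{1}{\nu_{n-3}} + \frac{1}{\nu_{n-2}} & -\frac{1}{\nu_{n-2}} \\ & & & -\frac{1}{\nu_{n-2}} & \frac{1}{\nu_{n-2}} + \frac{1}{\nu_{n-1}} \end{pmatrix}, \tag{89} \]

its explicit form allowing one to take limits that correspond to massless monopoles

\[ \tilde M = \tilde M^t, \quad \tilde M_{mm'} = (\nu_m + \cdots + \nu_{n-1})(1 - \nu_{m'} - \cdots - \nu_{n-1}) \ \text{for } m \geq m', \quad m, m' = 1, \ldots, n - 1. \tag{90} \]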
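The closed form in Eq. (90) can be checked against the tridiagonal matrix of Eq. (89) with exact rational arithmetic. The sketch below builds $\tilde M^{-1} = S_c^t N^{-1} S_c$ from the first n − 1 columns of the cyclic matrix $S$ (convention $S_{m,m} = 1$, $S_{m,m+1} = -1$ mod n, assumed), builds $\tilde M$ from Eq. (90), and multiplies the two. The mass values are illustrative assumptions normalised to $\sum_m \nu_m = 1$, and the function names are hypothetical.

```python
from fractions import Fraction as F

def relative_mass_inverse(nus):
    """M~^{-1} = S_c^t N^{-1} S_c, with S_c the first n-1 columns of S."""
    n = len(nus)
    Sc = [[0] * (n - 1) for _ in range(n)]
    for m in range(n):
        if m < n - 1:
            Sc[m][m] += 1            # diagonal entry S[m][m] = 1
        k = (m + 1) % n
        if k < n - 1:
            Sc[m][k] -= 1            # entry S[m][m+1] = -1, cyclic
    return [[sum(F(Sc[m][a] * Sc[m][b]) / nus[m] for m in range(n))
             for b in range(n - 1)] for a in range(n - 1)]

def relative_mass(nus):
    """M~ from Eq. (90); requires sum(nus) == 1. Indices are 0-based here."""
    n = len(nus)
    tail = [sum(nus[a:n - 1]) for a in range(n - 1)]   # nu_m + ... + nu_{n-1}
    M = [[F(0)] * (n - 1) for _ in range(n - 1)]
    for a in range(n - 1):
        for b in range(a + 1):                          # a >= b
            M[a][b] = M[b][a] = tail[a] * (1 - tail[b])
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

nus = [F(1, 10), F(2, 10), F(3, 10), F(4, 10)]          # illustrative masses
product = matmul(relative_mass_inverse(nus), relative_mass(nus))
```

For n = 2 the same functions reduce to the familiar reduced mass $\nu_1\nu_2$ of a two-body system with unit total mass.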


The centered metric and Kähler forms now read t ˜ d υ˜ ˜ ˜ −1 d υ t ˜ ˜ ˜ ˜ ˜ ˜ + W rel · d ρ Grel + W rel · d ρ , g = dξµ dξµ + d ρ Grel · d ρ + 4π 4π t d υ˜ ˜ · d ρ˜ ∧ d ρ˜ − (G ˜ t ∧ d ρ. ˜ (91) ˜ rel d ρ) ω = 2dξ0 ∧ d ξ − d ξ ∧ d ξ + 2 +W rel 4π The first terms give the center of mass metric on R3 × S 1 , the other terms represent the non-trivial part of the metric. Both are toric hyperKähler, and have an SO(3) invariance corresponding to spatial rotations. 4.4. HyperKähler quotient construction. We follow the approach in [35] for BPS monopoles of type (1, 1, . . . , 1) and consider the natural metric and Kähler forms on the ˆ compare Eq. (67), space of caloron Nahm data A, gAˆ = 21 tr 2 < d Aˆ † d Aˆ > +2 < d λˆ † >< d λˆ > , (92) ωi ˆ = 21 tr 2 σ¯ i < d Aˆ † ∧ d Aˆ > +2 < d λˆ † > ∧ < d λˆ > . A

One then notes that the group Gˆ of U (1) gauge transformations on Sˆ 1 acts triholomorˆ The zero set of the associated moment map is formed by the set N of phically on A. solutions to the Nahm equations, which after quotienting by the U (1) gauge action Gˆ on the dual S 1 gives the moduli space of Nahm data. By virtue of Eq. (67) this quotient is isometric to the caloron moduli space, ˆ M = N /G.

(93)

As both N and Gˆ are infinite dimensional, it is not obvious that this procedure is welldefined. However, using the gauge action we can restrict to those solutions N0 to the Nahm equations which have constant Aˆ 0 (z) on the subintervals (µm , µm+1 ). As the Nahm equations force Aˆ i (z) to be piecewise constant, there are n quaternions specifying the Nahm connection, denoted by y ∈ Hn . The singularities (or matching data) are described by n complex two-component vectors ζm , denoted by ζ ∈ Cn,2 . Hence, N0 is a subset of the space Aˆ 0 = Hn × Cn,2 of possible piecewise constant data, which has metric and Kähler forms (94) g = 21 Tr tr 2 dy † N dy + 2dζ † dζ , † † 1 ωi = 2 Tr tr 2 σ¯ i dy ∧ N dy + 2dζ ∧ dζ , as is natural from Eq. (67). On Aˆ 0 , the gauge action Gˆ is restricted to the set Gˆ0 of gauge functions with piecewise linear and continuous log. These are determined by the values h assumes at z = µm . Under these gauge transformations, Aˆ and ζ change according to ζm → eitm ζm ,

ψ → ψ + 2t,

y→y−

1 −1 N St, 2π

(95)

where t = (h(µ1 ), . . . , h(µn )) ∈ Rn /(2π Z)n and ψ = (ψ1 , . . . , ψn )/(4π Z)n denotes the phases of ζ . The lattices correspond to gauge transformations of type (44). Therefore


the action of the restriction Gˆ0 of Gˆ on Aˆ 0 is equivalent to an Rn /(2π Z)n action on Hn × Cn,2 . Thus we reduced the infinite dimensional hyperKähler quotient to a finite dimensional. This technique was also used for the (1, 1, . . . , 1) monopole metric [35]. The metric on the moduli space of Nahm data can now be computed as a metric on a hyperKähler quotient of a finite dimensional euclidean space by a toric group action. To do this we follow [15]. From the metric and Kähler forms on Aˆ 0 , determined by inserting Eqs. (6, 23) in Eq. (94), t dψ · d ρ V −1 dψ + W · d ρ , +W (96) ds 2 = dy † N dy + d ρ t V · d ρ + 4π 4π t dψ · d ρ ∧ d ρ − (V d ρ) ω = −(N d y)t ∧ d y + 2(N dy0 )t ∧ d y + 2 +W t ∧ d ρ, 4π the action (95) is seen to be triholomorphic. The moment map for this Rn /(2π Z)n action reads 1 ¯ − iζ † P ζ, (97) µ · σ = − S t (y − y) 4π −1 (0) given by the solutions Aˆ corwhere P = (P1 , . . . , Pn )t , and has a zero set µ t responding to ρ = S y. Therefore, the space of piecewise constant solutions to the ˆ ζ ) ∈ N0 = µ Nahm data is (A, −1 (0) ⊂ Aˆ 0 . The moduli space of Nahm data is this set quotiented by the reduction of the gauge action in Eq. (42), or equivalently Rn . Hence M = N /Gˆ = N0 /Gˆ0 = µ −1 (0)/(Rn /(2π )n ).

(98)

The metric on µ −1 (0) reads ds 2 = d y t (SV S t + N ) · d y t dψ t −1 dψ t + + W · S d y V + W · S d y + dy0t N dy0 , 4π 4π t dτ · d y ∧ d y − (Gd y)t ∧ d y. ω =2 +W 4π

(99) (100)

The n vector ψ τ =S + Ny0 , 4π 4π

(101)

is invariant under the Rn /(2πZ)n action (95) and can therefore be used as coordinate on the quotient µ −1 (0)/Rn = M, together with y. Cotangent vectors involving dψ have a vertical component, i.e. lie along the Rn /(2π Z)n fibre. The horizontal and vertical 1 part of the metric are separated by inserting y0 = 4π N −1 (τ − Sψ) and completing the squares to obtain dτ t −1 dτ · S t d y V −1 W ds 2 = d y t G · d y + N + d y t S · W 4π 4π t dτ S t · d y (V −1 + S t N −1 S)−1 − S t N −1 − V −1 W 4π dτ S t · d y + ϕ t (V −1 + S t N −1 S)ϕ, · S t N −1 − V −1 W 4π

(102)


where the one form ϕ denotes the component along the Rn /(2π Z)n fibre ϕ=

dψ S t · d y − (V −1 + S t N −1 S)−1 S t N −1 dτ . + (V −1 + S t N −1 S)−1 V −1 W 4π 4π (103)

Horizontal projecting to the metric on µ −1 (0)/(Rn /(2π Z)n ) amounts to discarding the last term in Eq. (102) and one obtains (after reorganising) the metric on the caloron moduli space M given in Eq. (78). For the Kähler forms, this projection is generally not necessary: Eq. (100) is precisely the Kähler form in Eq. (79). This is a manifestation of the degeneracy of the Kähler forms along the gauge orbit, needed for the hyperKähler quotient to be well defined. 5. Instanton and Monopole Limits of the Caloron From the caloron metric, other toric hyperKähler manifolds can be obtained by taking suitable limits. For large T or equivalently all ρm small, one expects the metric to approach the moduli space for k = 1 SU (n) instantons on R4 . To study this limit, we consider the centered metric Eq. (91). For small ρm , the elements of the relative mass −1 terms in V˜ , matrix M˜ in Eq. (88) are dominated by the ρm rel ˜ rel G V˜rel → , ρm → 0, m = 1, . . . , n − 1, (104) Fc−1 G(Fc−1 )t = 1 1 resulting in the asymptotic form for the non-trivial part of the metric and Kähler forms t ˜ d υ˜ ˜ ˜ −1 d υ t ˜ ˜ ˜ ˜ ˜ ˜ + W rel · d ρ Vrel + W rel · d ρ , glimit = d ρ Vrel · d ρ + 4π 4π ω limit = 2

t d υ˜ ˜ · d ρ˜ ∧ d ρ˜ − (V˜ d ρ) ˜ t ˜ +W rel rel ∧ d ρ. 4π

(105)

The caloron with trivial gauge holonomy has the same limiting metric, as follows directly from taking the limit ν1 , . . . , νn−1 → 0, νn → 1 of the caloron relative mass matrix in Eq. (90). The phase variables are now given by υm = τm + . . . + τn−1 ∈ R/(4π Z), cf. Eq. (85). The Kähler forms ω limit are closed, since the hyperKähler conditions (26) are satisfied ˜ , ˜ rel = ∇ ρG ρ × W ∇ rel

(106)

hence the limiting metric for large T is hyperKähler. It is known as the Calabi metric. This limit was discussed in [29] using indirect arguments. With the techniques presented in this paper, it is easy to prove explicitly that the limiting metric is indeed the metric for both the ordinary k = 1 SU (n) instantons on R4 and the calorons with trivial holonomy. It follows immediately when realising that the 4(n − 1) dimensional Calabi space can be obtained as the hyperKähler quotient of Hn by a U (1) action [15]. This quotient emerges naturally from both the construction of the charge one SU (n) instanton and the trivial holonomy caloron. First note that there is a one to one correspondence between the ADHM data of the k = 1 SU (n) instanton and the Nahm data of the trivial holonomy caloron in the Gˆ gauge with constant Aˆ 0 (z). The latter are


ˆ ˆ given in terms of (ξ, ζ ) ∈ H × Cn,2 as A(z) = 2π iξ, λ(z) = δ(z)ζ and directly translate into ADHM data λ = ζ, B = ξ for the instanton. With only one subinterval, the metric on the Nahm data now reduces to the expression for the instanton (8). Having restricted to constant Aˆ 0 (z), the remaining transformations in Gˆ0 leave ξ invariant, apart from confining ξ0 to the circle through g(z) = exp(2π ipz), p ∈ Z. For their action on the matching data only the U (1) formed by the values g(0) is relevant. Therefore, in both cases the nontrivial part of the moduli space is the quotient of Cn,2 (with g = 21 tr 2 dζ † dζ, ωi = 21 tr 2 (σ¯ i dζ † ∧ dζ )) by the U (1) action ζm → eit ζm ,

ψm → ψm + 2t,

m = 1, . . . n,

t ∈ R/(2π Z).

(107)

(Identifying C2 and H, this quotient is readily seen to be equivalent to that discussed in eq. (36) of [15]). The corresponding moment map, zero set and invariants are given by µ =

1 2π

m∈Z/nZ

ρm ,

ρm = 0,

υ˜ m = ψm − ψn ,

m = 1, . . . , n − 1.

m∈Z/nZ

(108) Expressing the metric on the zero set in terms of invariants and the terms involving dψn describing the fibre part, one obtains [15] the Calabi metric in Eq. (105). The Calabi metric has an SU (n) triholomorphic isometry, reflecting the SU (n) gauge symmetry of the k = 1 instanton and trivial holonomy caloron. As explained in Sect. 2 for the instanton, it emerges as the SU (n) acting on ζ in Eq. (107) on the left, commuting with U (1), and descending to the quotient. A direct calculation using a compensating gauge transformation gives the same result. In [23, 25], it was explicitly shown from the action density that removing one of the constituent monopoles of the caloron to spatial infinity, | yn | → ∞ turns it into a static selfdual SU (n) solution, i.e. a monopole in the BPS limit. Indeed, this limit corresponds to the compactification length going to zero. The Nahm data suggest that the remnant is the (1, 1, . . . , 1) monopole. We will show indeed that the metric in this limit has the required form. Removing a constituent is described by a hyperKähler quotient. Consider the U (1) action that changes the phase of the mth monopole in the uncentered caloron τm → τm + t,

t ∈ R/(4π Z).

(109)

It is a triholomorphic isometry as follows from Eqs. (78, 79). Its moment map µ fix is exactly the position of the mth monopole, µ fix = ym /(4π ). Therefore, the metric on the quotient, the caloron moduli space with the mth constituent fixed, is hyperKähler −1 ym )/R irrespective of its position. For finite | ym |, the resulting metric on the quotient µ fix ( is complicated, and no longer SO(3) symmetric. Removing the constituent, | ym | → ∞, i.e. fixing it at spatial infinity, gives the hyperKähler metric of the remnant BPS monopole, with a simple form and SO(3) symmetry restored. The metric and Kähler forms with the nth monopole far away, in which case ρ1−1 , ρn−1 → 0, reads (g, ω) = (gn , ω n ) + (gm , ω m ).

(110)


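Here the removed monopole is described by the metric $g_n = \nu_n\, d\vec y_n^{\,2} + \nu_n^{-1}\, d\tau_n^2/(4\pi^2)$ and Kähler forms $\vec\omega_n = d\tau/(2\pi) \wedge d\vec y_n - \nu_n\, d\vec y_n \wedge d\vec y_n$, and the remnant by

$$g_m = d\vec y_m^{\,t}\, G_m \cdot d\vec y_m + \left(\frac{d\tau_m}{4\pi} + \vec W_m \cdot d\vec y_m\right)^t G_m^{-1} \left(\frac{d\tau_m}{4\pi} + \vec W_m \cdot d\vec y_m\right),$$
$$\vec\omega_m = -(G_m\, d\vec y_m)^t \wedge d\vec y_m + 2\left(\frac{d\tau_m}{4\pi} + \vec W_m \cdot d\vec y_m\right)^t \wedge d\vec y_m, \tag{111}$$

where

$$G_m = N_m + S_m V_m S_m^t, \qquad \vec W_m = S_m \tilde{\vec W}_m S_m^t,$$
$$\tilde{\vec W}_m = \mathrm{diag}(\vec w_2(\vec\rho_2), \ldots, \vec w_{n-1}(\vec\rho_{n-1}))/(4\pi), \qquad V_m^{-1} = 4\pi\, \mathrm{diag}(\rho_2, \ldots, \rho_{n-1}),$$
$$N_m = \mathrm{diag}(\nu_1, \ldots, \nu_{n-1}), \qquad \vec\rho_m = (\vec\rho_2, \ldots, \vec\rho_{n-1})^t,$$
$$\vec y_m = (\vec y_1, \ldots, \vec y_{n-1})^t, \qquad \tau_m = (\tau_1, \ldots, \tau_{n-1})^t, \tag{112}$$

$$S_m = \begin{pmatrix} -1 & & & \\ 1 & -1 & & \\ & \ddots & \ddots & \\ & & 1 & -1 \\ & & & 1 \end{pmatrix} \in \mathbb{R}^{n-1,n-2}. \tag{113}$$

More explicitly, the potential term in Eq. (111) reads

$$4\pi S_m V_m S_m^t = \begin{pmatrix} \frac{1}{\rho_2} & -\frac{1}{\rho_2} & & & \\ -\frac{1}{\rho_2} & \frac{1}{\rho_2} + \frac{1}{\rho_3} & -\frac{1}{\rho_3} & & \\ & \ddots & \ddots & \ddots & \\ & & -\frac{1}{\rho_{n-2}} & \frac{1}{\rho_{n-2}} + \frac{1}{\rho_{n-1}} & -\frac{1}{\rho_{n-1}} \\ & & & -\frac{1}{\rho_{n-1}} & \frac{1}{\rho_{n-1}} \end{pmatrix} \in \mathbb{R}^{n-1,n-1}. \tag{114}$$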
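The tridiagonal structure of the potential term can be checked mechanically. The sketch below is only an illustration: it builds $S_m$ with $-1$ on the diagonal and $+1$ just below it, takes $V_m^{-1} = 4\pi\,\mathrm{diag}(\rho_2, \ldots, \rho_{n-1})$, and forms $4\pi S_m V_m S_m^t$ in exact rational arithmetic. The function names `build_S` and `potential_matrix` and the sample separations are assumptions made for this example, not notation from the text.

```python
from fractions import Fraction


def build_S(n):
    """(n-1) x (n-2) matrix with -1 on the diagonal and +1 directly below it."""
    S = [[Fraction(0)] * (n - 2) for _ in range(n - 1)]
    for j in range(n - 2):
        S[j][j] = Fraction(-1)
        S[j + 1][j] = Fraction(1)
    return S


def potential_matrix(rhos):
    """Return 4*pi * S V S^t, where V^{-1} = 4*pi*diag(rhos).

    rhos lists the separations rho_2, ..., rho_{n-1}; the factor 4*pi
    cancels, so only the inverse separations 1/rho_j enter.
    """
    n = len(rhos) + 2
    S = build_S(n)
    inv = [Fraction(1) / Fraction(r) for r in rhos]
    size = n - 1
    return [[sum(S[i][l] * inv[l] * S[k][l] for l in range(n - 2))
             for k in range(size)] for i in range(size)]


if __name__ == "__main__":
    # Hypothetical separations rho_2, rho_3, rho_4 for n = 5.
    for row in potential_matrix([2, 3, 5]):
        print([str(entry) for entry in row])
```

Every row of the result sums to zero, because the columns of $S_m$ sum to zero; this is the statement that the potential term does not see the overall center of mass.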

The vector potential $\vec W_m$ has a similar structure. The metric in Eq. (111) is that of the uncentered SU(n) monopole of type (1, 1, ..., 1). The calculation of the metric on its space of Nahm data was performed in [15, 35]. Details on the Nahm construction of the (1, 1, ..., 1) monopole and a proof of its isometric property as well as an outline of the calculation of the metric can be found in the appendix. To connect with [27], we have to center the monopole. We introduce

$$F_m = \left(S_m,\ \nu^{-1} N_m e_m\right) \in \mathbb{R}^{n-1,n-1}, \tag{115}$$

where $e_m = (1, \ldots, 1)^t \in \mathbb{R}^{n-1}$ and $\nu = \sum_{m=1}^{n-1} \nu_m$ denotes the mass of the monopole. The relative position variables $\vec\rho_m$ are reinstated and the center of mass $\mathbb{R}^3$ position is separated off using

$$\vec y_m = (F_m^t)^{-1} \begin{pmatrix} \vec\rho_m \\ \vec\xi_m \end{pmatrix}, \qquad \vec\xi_m = \frac{1}{\nu} \sum_{m=1}^{n-1} \nu_m \vec y_m. \tag{116}$$


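The mass matrix in this basis is given by

$$F_m^t N_m^{-1} F_m = \begin{pmatrix} M_m^{-1} & \\ & \nu^{-1} \end{pmatrix}, \qquad M_m^t = M_m,$$

$$M_m^{-1} = \begin{pmatrix} \frac{1}{\nu_1} + \frac{1}{\nu_2} & -\frac{1}{\nu_2} & & & \\ -\frac{1}{\nu_2} & \frac{1}{\nu_2} + \frac{1}{\nu_3} & -\frac{1}{\nu_3} & & \\ & \ddots & \ddots & \ddots & \\ & & -\frac{1}{\nu_{n-3}} & \frac{1}{\nu_{n-3}} + \frac{1}{\nu_{n-2}} & -\frac{1}{\nu_{n-2}} \\ & & & -\frac{1}{\nu_{n-2}} & \frac{1}{\nu_{n-2}} + \frac{1}{\nu_{n-1}} \end{pmatrix},$$

$$(M_m)_{m,m'} = \nu^{-1} (\nu_1 + \cdots + \nu_m)(\nu_{m'+1} + \cdots + \nu_{n-1}), \quad \text{for } m' \ge m. \tag{117}$$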
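The block structure of Eq. (117) and the closed form for $M_m$ can be verified in exact arithmetic. The following is a minimal sketch under stated assumptions: the masses $\nu_m$ are taken to be positive integers, and the helper names `congruence`, `reduced_mass` and `check` are invented for this illustration. It forms $F_m^t N_m^{-1} F_m$ with $F_m = (S_m, \nu^{-1} N_m e_m)$ and compares the upper-left block with the inverse of the closed-form $M_m$.

```python
from fractions import Fraction


def congruence(nus):
    """Return F^t N^{-1} F for F = (S, N e / nu), with N = diag(nus)."""
    k = len(nus)                      # k = n - 1 constituents
    nu = sum(nus)
    F = [[Fraction(0)] * k for _ in range(k)]
    for j in range(k - 1):            # columns of S: -1 on the diagonal, +1 below
        F[j][j] = Fraction(-1)
        F[j + 1][j] = Fraction(1)
    for j in range(k):                # last column: nu_j / nu
        F[j][k - 1] = Fraction(nus[j], nu)
    return [[sum(F[j][a] * F[j][b] / nus[j] for j in range(k))
             for b in range(k)] for a in range(k)]


def reduced_mass(nus):
    """Closed form (M)_{m,m'} = (nu_1+..+nu_m)(nu_{m'+1}+..+nu_{n-1})/nu for m' >= m."""
    k = len(nus)
    nu = sum(nus)
    M = [[Fraction(0)] * (k - 1) for _ in range(k - 1)]
    for a in range(k - 1):
        for b in range(a, k - 1):
            M[a][b] = M[b][a] = Fraction(sum(nus[:a + 1]) * sum(nus[b + 1:]), nu)
    return M


def check(nus):
    """True if F^t N^{-1} F = diag(M^{-1}, 1/nu) with M from the closed form."""
    k = len(nus)
    B = congruence(nus)
    M = reduced_mass(nus)
    # Upper-left block of B times M should be the identity.
    block_ok = all(
        sum(B[a][c] * M[c][b] for c in range(k - 1)) == (1 if a == b else 0)
        for a in range(k - 1) for b in range(k - 1)
    )
    # Center of mass decouples from the relative coordinates.
    decoupled = all(B[a][k - 1] == 0 and B[k - 1][a] == 0 for a in range(k - 1))
    return block_ok and decoupled and B[k - 1][k - 1] == Fraction(1, sum(nus))


if __name__ == "__main__":
    print(check([1, 2, 3]), check([2, 5, 7, 11]))
```

For two constituents the closed form reduces to the familiar reduced mass $\nu_1\nu_2/(\nu_1+\nu_2)$.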

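Furthermore, alternative torus coordinates $\chi_m = (\chi_1, \ldots, \chi_{n-2})$ are introduced, as well as a global U(1) phase $\xi_{0,m}$,

$$\tau_m = F_m \begin{pmatrix} \chi_m \\ \xi_{0,m} \end{pmatrix}, \qquad \xi_{0,m} = \sum_{m=1}^{n-1} \tau_m. \tag{118}$$

In the new coordinates, the uncentered metric is the sum of the center of mass and relative metric

$$g_m = \nu\, d\vec\xi_m \cdot d\vec\xi_m + \nu^{-1}\, d\xi_{0,m}^2 + g_m^c, \tag{119}$$

where the nontrivial part

$$g_m^c = d\vec\rho_m^{\,t} (M_m + V_m) \cdot d\vec\rho_m + \left(\frac{d\chi_m}{4\pi} + \tilde{\vec W}_m \cdot d\vec\rho_m\right)^t (M_m + V_m)^{-1} \left(\frac{d\chi_m}{4\pi} + \tilde{\vec W}_m \cdot d\vec\rho_m\right) \tag{120}$$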

is the Lee–Weinberg–Yi metric [27]. It is of toric hyperKähler form. Thus we proved that the (1, 1, ..., 1) monopole is a limit of the caloron, identifying the static remnant in [24, 25]. Finally, we note that the (1, 1, ..., 1) monopole has only one magnetic winding, as explained in the introduction. It is opposite to the winding of the removed monopole, and hence, we can apply the reasoning in [43] explaining how the instanton charge arises also for SU(n) from braiding two monopoles [23].

6. Discussion

Since the metric describes the Lagrangian for adiabatic motion on the moduli space [33], it reflects the interactions of the monopole constituents. The constituent nature of the caloron solution, easily extracted from the action density, should therefore also be reflected in the metric. The action density of the k = 1 SU(n) caloron [24] is derived from Eq. (15) employing Green’s function techniques and reads

$$-\tfrac{1}{2} \mathrm{Tr} F_{\mu\nu}^2 = -\tfrac{1}{2} \partial_\mu^2 \partial_\nu^2 \log O. \tag{121}$$


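Here the positive scalar potential $O$ is defined as

$$O(x) = \tfrac{1}{2} \mathrm{tr} \prod_{m=1}^{n} A_m - \cos(2\pi x_0), \tag{122}$$

where

$$A_m = \frac{1}{r_m} \begin{pmatrix} r_m & |\vec y_m - \vec y_{m+1}| \\ 0 & r_{m+1} \end{pmatrix} \begin{pmatrix} c_m & s_m \\ s_m & c_m \end{pmatrix}, \tag{123}$$

given in terms of the center of mass radii $r_m = |\vec x - \vec y_m|$ of the mth constituent monopole, $c_m = \cosh(2\pi\nu_m r_m)$, $s_m = \sinh(2\pi\nu_m r_m)$ and $\prod_{m=1}^{n} A_m = A_n \cdots A_1$. The energy density for the (1, 1, ..., 1) monopole is obtained from it by sending the nth constituent to infinity, which gives [25]

$$E(\vec x) = -\tfrac{1}{2} \mathrm{tr} F_{\mu\nu}^2(\vec x) = -\tfrac{1}{2} \Delta^2 \log \tilde O_m(\vec x), \tag{124}$$

$$\tilde O_m(\vec x) = \tfrac{1}{2} \mathrm{tr} \left\{ \frac{1}{r_{n-1}} \begin{pmatrix} s_{n-1} & c_{n-1} \\ 0 & 0 \end{pmatrix} \prod_{m=1}^{n-2} A_m \right\} \tag{125}$$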

(see [31] for some special cases). These densities allow for an unambiguous identification of elementary BPS monopoles as constituents of calorons and (1, 1, ..., 1) monopoles, as in the limit where $r_m \ll r_l$ for all $l \neq m$ the action density approaches that of the single BPS monopole [24]. The corresponding limit in the uncentered metrics reveals

$$ds^2|_m = \nu_m\, d\vec y_m \cdot d\vec y_m + \frac{1}{\nu_m}\, d\tau_m^2 \tag{126}$$

for the part describing the mth constituent, as all interaction potentials approach zero with the other constituents far away. Equation (126) is the flat metric on $\mathbb{R}^3 \times S^1$, the twofold cover of the moduli space for the elementary BPS monopole. Therefore the limit of the (cover of the) caloron moduli space – corresponding to all monopoles well separated – can be seen as a product of elementary BPS monopole moduli spaces.

We obtained the metric for the k = 1 SU(n) caloron assuming symmetry breaking to the maximal torus $U(1)^{n-1}$ with arbitrarily chosen holonomy eigenvalues $\mu_m$. In the situation of non-maximal breaking, some of the eigenvalues of the holonomy become equal, resulting in some monopoles acquiring zero mass. The form of the relative mass matrices defined as inverses suggests that dramatic things happen when one or more of the constituents acquire zero mass. However, as is clear from the explicit forms of $M$, $M_m$ in Eqs. (71, 90, 112, 117), all limits can be taken smoothly. This assertion was explicitly checked for the trivial holonomy caloron, with all but one monopole having zero mass. Therefore one can study most efficiently all symmetry breaking patterns, both for k = 1 calorons and for monopoles of type (1, 1, ..., 1), just by inserting the proper values for $\mu_m$, rather than having to calculate the metric for each case separately. Consider, both for the caloron and for the (1, 1, ..., 1) monopole, the situation of N − 1 monopoles turning massless

$$\nu_K, \ldots, \nu_{K+N-2} = 0, \qquad \mu_K = \ldots = \mu_{K+N-1}, \tag{127}$$


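resulting in an enhanced residual symmetry to $SU(N) \times U(1)^{n-N}$. The corresponding center of mass radii no longer appear in the expression for the action and energy densities [24], as follows from

$$\prod_{m=1}^{n} A_m \to \prod_{m=K+N-1}^{n} A_m \left[ \frac{1}{r_{K-1}} \begin{pmatrix} r_{K-1} & R_c \\ 0 & r_{K+N-1} \end{pmatrix} \begin{pmatrix} c_{K-1} & s_{K-1} \\ s_{K-1} & c_{K-1} \end{pmatrix} \right] \prod_{m=1}^{K-2} A_m. \tag{128}$$

Here

$$R_c = |\vec\rho_K| + \ldots + |\vec\rho_{K+N-1}| = \pi\, \mathrm{tr}_2 \sum_{m=K}^{K+N-1} \zeta_{(m)}^\dagger \zeta_{(m)} \tag{129}$$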

denotes what is known in the monopole literature as the "non-abelian cloud" parameter [28]. It is seen from the right-hand side of Eq. (129) that it is SU(N) invariant. From the ADHM-Nahm construction (28, 29), this SU(N) symmetry is seen to leave the holonomy invariant. It will descend to the quotient in the hyperKähler quotient construction of the metric, and therefore, the metric will be SU(N) invariant as well, much like in the case of the trivial holonomy caloron. As the explicit form of the metric can readily be found by inserting Eq. (127) in the mass matrices (71, 112), it will not be given here. The SU(N) transformation mixes the positions of the massless monopoles, which therefore do not exist as individual particles. A way of seeing this physically is that the intrinsic length scales of the monopoles, proportional to their inverse masses, become infinitely large as their masses become small, so that they overlap and lose their identities. This appearance of massless particles and infinite length scales illustrates a very general feature of systems near a transition to a more symmetric phase.

The fact that the SU(n + 1) (1, 1, ..., 1) monopole and the SU(n) k = 1 caloron both consist of n constituent BPS monopoles, in combination with the fact that the former can be obtained from an SU(n + 1) caloron, suggests a great similarity between their metrics. We consider the relevant situation for quantum chromodynamics, the SU(3) caloron. Removing one monopole to infinity gives the SU(3) monopole of type (1, 1). There remain two constituents, of masses proportional to $\nu_1$, $\nu_2$. The relative metric of the (1, 1) monopole is Taub-NUT with positive mass parameter,

$$g_{TN} = U(\rho)\, d\vec\rho^{\,2} + U(\rho)^{-1} \left( \frac{d\psi}{4\pi} + \frac{Q\, \vec w(\vec\rho)}{4\pi} \cdot d\vec\rho \right)^2, \qquad U(\rho) = \frac{\nu_1 \nu_2}{\nu_1 + \nu_2} + \frac{Q}{4\pi |\vec\rho|}, \tag{130}$$

ρ denoting the separation of the constituents, Q = 1. The relative metric for the SU (2) caloron is also a Taub-NUT [22, 23]. (The metric obtained there checks with Eq. (130) apart from the normalisation 4π 2 , as πρ 2 , ϒ in [23] is related to |r |, ψ in Eq. (130)). However, the interaction strength, depending on the distance between the monopoles, for the caloron is Q = 2, twice that of the SU (3) monopole. Both solitons can be considered as built out of two interacting constituent BPS monopoles, and have a four-dimensional relative moduli space. As each matching point in the Nahm construction gives rise to an interaction between monopoles of distinct type, this is to be expected. The SU (3) (1, 1) monopole has one matching point, at z = µ2 whereas the SU (2) caloron has, in the situation of two constituents, one additional, equal to the other, at z = µ1 + 1 to close the circle. In [26] this was attributed to the fact that the constituent monopoles in the


SU(3) (1, 1) case are charged with respect to different U(1), whereas for the caloron, they are oppositely charged with respect to the same U(1), generated by ω · τ.

In conclusion, we have presented results for the metric on moduli spaces in a unified description that incorporates instantons, calorons and monopoles.

Acknowledgements. Pierre van Baal is gratefully acknowledged for discussions and critically reading earlier versions of the manuscript. Conversations and correspondence with Conor Houghton have been very stimulating. This work was financially supported by a grant from the FOM/SWON Association for Mathematical Physics.

Appendix

The (1, 1, ..., 1) monopole. The Nahm construction of the (1, 1, ..., 1) monopole is similar to that of the k = 1 SU(n) caloron. The main difference is that the circle is replaced by the interval $[\mu_1, \mu_n]$. For the (1, 1, ..., 1) monopole, the singularities reside at $z = \mu_2, \ldots, \mu_{n-1}$ [37, 21, 44]. Like for the caloron we introduce

$$\Delta^\dagger = \left( \lambda^\dagger(z),\ \frac{1}{2\pi i} \hat D_x^\dagger(z) \right), \qquad \lambda(z) = \sum_{m=2}^{n-1} \delta(z - \mu_m) \zeta_m,$$
$$\hat D_x(z) = \sigma_\mu \hat D_x^\mu(z) = \frac{d}{dz} + \hat A(z) - 2\pi i x, \tag{131}$$

where $\hat A(z)$ is now defined on $[\mu_1, \mu_n]$. The Nahm construction is performed in terms of the normalised zero modes $v(x)$ of $\Delta(x)$,

$$v(x) = \begin{pmatrix} s_x \\ \hat\psi_x(z) \end{pmatrix}, \qquad \frac{1}{2\pi i} \hat D_x^\dagger(z) \hat\psi_x^m(z) + \sum_{m'=2}^{n-1} \delta(z - \mu_{m'}) \zeta_{m'}^\dagger s_x^{m'm} = 0, \tag{132}$$

$$v^\dagger(x) v(x) = s_x^\dagger s_x + \int_{\mu_1}^{\mu_n} dz\, \hat\psi_x^\dagger(z) \hat\psi_x(z) = 1_n, \tag{133}$$

where $\hat\psi_x(z) = (\hat\psi_x^1(z), \ldots, \hat\psi_x^n(z))$ contains the n two-spinors defined on the interval $[\mu_1, \mu_n]$, and $s_x \in \mathbb{C}^{n-2,n}$. (The equation for $\hat\psi_x^m(z)$ is readily seen to have n solutions for fixed $s_x$ [21].) Though the monopole is a static solution, it is preferable to have $x_0$ included as a dummy variable, the $x_0$ dependence trivially being implemented by $v(x) = e^{2\pi i x_0 z} v(\vec x)$, so as to write concisely

$$A_\mu(\vec x) = (\Phi(\vec x), \vec A(\vec x)) = v^\dagger(x) \partial_\mu v(x), \tag{134}$$

with the inner product defined as in Eq. (133). Performing all monopole calculations in terms of $\Delta(x)$ and $v(x)$, we can copy the caloron formalism. In particular, it follows that for Eq. (134) to be selfdual, $\Delta^\dagger(x)\Delta(x)$ should commute with the quaternions. This is equivalent to the monopole Nahm equation,

$$\frac{d}{dz} \hat A_j(z) = 2\pi i \sum_{m=2}^{n-1} \delta(z - \mu_m)\, \rho_m^j. \tag{135}$$

Its solution $\hat A_j(z)$ can be written in terms of n − 1 position vectors $\vec y_m$, $\vec\rho_m = \vec y_m - \vec y_{m-1}$, comprised in $\vec y_m = (\vec y_1, \ldots, \vec y_{n-1})^t$,

$$\vec\rho_m = S_m^t \vec y_m, \tag{136}$$

implying

$$\hat A_j = 2\pi i \sum_{m=1}^{n-1} \chi_{[\mu_m, \mu_{m+1}]}(z)\, y_m^j. \tag{137}$$

Like for the caloron, there is a gauge action on the Nahm data

$$\hat A(z) \to \hat A(z) + i \frac{d}{dz} h(z), \qquad \zeta_m \to \zeta_m e^{i h(\mu_m)}, \qquad m = 2, \ldots, n - 1, \tag{138}$$

with gauge group Gˆm = {g(z)|g : z → e−ih(z) ∈ U (1), g(µ1 ) = g(µn ) = 1}. The condition at the endpoints is required for id/dz to be hermitian on the space of gauge functions. Hence, for the monopole Gˆm = {g(z)|g : z → e−ih(z) ∈ U (1), g(µ1 ) = g(µn ) = 1}. The Gm action can be used to set Aˆ 0 (z) constant, and to undo the U (1) phase ambiguities in relating ζm to ρm , m = 2, . . . , n − 1, hence ζm can be considered to have fixed phase. The monopole Nahm data can then be expressed in terms of n − 1 quaternions, τm + ym · σ ), Aˆ m = (Aˆ 1 , . . . , Aˆ n−1 )t = 2π(Nm−1 4π

(139)

ˆ takes on (µm , µm+1 ). i Aˆ m,m denoting the value A(z) In the gauge with constant Aˆ 0 (z), the Green’s function fx in the monopole Nahm construction is the solution to the differential equation

$$\left\{ \left( \frac{1}{2\pi i} \frac{d}{dz} - x_0 \right)^2 + \sum_{m=1}^{n-1} \chi_{[\mu_m, \mu_{m+1}]}(z)\, r_m^2 + \frac{1}{2\pi} \sum_{m=2}^{n-1} \delta(z - \mu_m) |\vec y_m - \vec y_{m-1}| \right\} \hat f_x(z, z') = \delta(z - z'), \tag{140}$$

whereas transformations to other gauges are realised by

$$\hat f_x(z, z') \to g(z) \hat f_x(z, z') g(z')^*, \qquad g(z) \in \hat G_m. \tag{141}$$

The boundary condition for the monopole Green’s function is determined by the requirement that $i\, d/dz$ be a hermitian operator; therefore the eigenfunctions of the left-hand side of Eq. (140) vanish at the endpoints. This imposes by standard Sturm–Liouville theory

$$\hat f_x(\mu_1, z') = \hat f_x(\mu_n, z') = 0 \tag{142}$$

for the Green’s function. This boundary condition is automatically satisfied when obtaining the monopole Green’s function from the caloron Green’s function, taking the limit $|\vec y_n| \to \infty$. The $x_0$ dependence of the monopole Green’s function is trivial,

$$\hat f_x(z, z') = e^{2\pi i x_0 (z - z')} \hat f_{\vec x}(z, z'). \tag{143}$$

The metric on the monopole moduli space is determined in terms of the $L^2$ norm of gauge orthogonal solutions $Z_m$ to the linearised Bogomol’nyi equations. With $A_0$ identified as the Higgs field, and assuming all fields and zero modes to be static, the conditions for a tangent vector to the monopole moduli space are identical to those on the tangent vector to an instanton moduli space, hence $Z_m$ satisfies

$$D_{\rm ad}^\dagger(A) Z_m = 0, \tag{144}$$

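where $\partial_0$ acts trivially, but is kept to make later derivations more transparent. Metric and Kähler forms read

$$(g, \vec\omega)(Z_m, Z'_m) = \frac{1}{4\pi^2} \int_{\mathbb{R}^3} d^3x\, \mathrm{Tr}\, Z_m^\dagger(\vec x) Z'_m(\vec x). \tag{145}$$

The formalism to compute the metric is copied from the caloron case. A tangent vector to the monopole moduli space is given by

$$Z_{m\mu}(\vec x) = \int_{[\mu_1, \mu_n]^2} dz\, dz' \left( \sum_{m'=2}^{n-1} s_x^\dagger \hat c_{m'} \delta(z - \mu_{m'}) + \hat\psi_x^\dagger(z) \hat Y(z) \right) \hat f_x(z, z')\, \bar\sigma_\mu \hat\psi_x(z') - \text{h.c.} \tag{146}$$

in terms of a tangent vector to the moduli space of monopole Nahm data

$$C = \begin{pmatrix} \hat c(z) \\ \hat Y(z) \end{pmatrix}, \qquad \hat c(z) = \sum_{m=2}^{n-1} \hat c_m \delta(z - \mu_m), \tag{147}$$

satisfying the deformation and gauge orthogonality equations

$$\frac{d}{dz} \hat Y_i(z) = -i\pi \sum_{m=2}^{n-1} \mathrm{tr}_2\, \bar\sigma_i (\zeta_m^\dagger \hat c_m + \hat c_m^\dagger \zeta_m) \delta(z - \mu_m),$$
$$\frac{d}{dz} \hat Y_0(z) = -i\pi \sum_{m=2}^{n-1} \mathrm{tr}_2 (\zeta_m^\dagger \hat c_m - \hat c_m^\dagger \zeta_m) \delta(z - \mu_m). \tag{148}$$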

To derive the analogue for monopoles of Corrigan’s formula we trade each matrix multiplication in Eq. (58) for an integration over [µ1 , µn ] or an inner product of type (133) and use the trivial x0 dependence of v(x) and fx (z, z ) for the monopole to obtain † TrZm (x)Zm (x) = − 21 ∇ 2 tr 2 dz [Yˆ † (z)Yˆ (z) + Yˆ † (z)Yˆ (z) [µ1 ,µn ]

ˆ + cˆ (z) < cˆ > +cˆ (z) < cˆ >]fx (z, z) †

+ ∇ tr 2 1 2

[µ1 ,µn ]2

†

ˆ + Y(z)] ˆ dzdz [C(z)

2

fˆx (z, z )[Yˆ x† (z ) + Cˆ† (z )]fˆx (z , z) , (149)

n−1 † −1 ˆ † ˆ ˆ ˆ where now m=2 cˆm ζm δ(z − µm ), Yx (z) = (2π i) Y (z)Dx (z) and <

C(z) = H >≡ [µ1 ,µn ] H (z)dz. Compare Eq. (64). The monopole metric is evaluated from


Eqs. (145,149) by partial integration, along the lines of the derivation in Sect. 4.2. The monopole Green’s function fx (z, z ) behaves as in Eq. (66). Thus we arrive at the isometric property of the Nahm construction for (1, 1, . . . , 1) monopoles, gM (Zm , Zm ) = 21 tr 2 < Yˆ † Yˆ > + < cˆ† >< cˆ > + < cˆ† >< cˆ > , i ωM (Zm , Zm ) = 21 tr 2 σ¯ i < Yˆ † Yˆ > + < cˆ† >< cˆ > − < cˆ† >< cˆ > . (150) ˆ An infinitesimal gauge transformation δ X(z) is applied to obtain gauge orthogonality of the tangent vector C, c(z) ˆ =

n−1

δ(z − µm )cˆm =

m=2 n−1

Yˆ (z) = i

m=1

n−1

ˆ m) , δ(z − µm ) δζm + iζm δ X(µ

m=2

1 χ[µm ,µm+1 ] Yˆm = 2π i

(151)

d ˆ ˆ δ A(z) + i δ X(z) . dz

It vanishes in the endpoints z = µ1 , z = µn and satisfies −

n−1 ˆ 1 d 2 δ X(z) ˆ + 2δ X(z) δ(z − µm )|ρm | 2π dz2 m=2

=

n−1 m=2

δ(z − µm )

dτm dτm−1 − − |ρm |w m (ρm ) · d ρm . 4π νm 4π νm−1

(152)

Therefore, it is piecewise linear and fixed by δ Xˆ = (δ Xˆ 2 , . . . , δ Xˆ n−1 )t , δ Xˆ m = ˆ m ), m = 2, . . . , n − 1, where δ X(µ 1 t −1 dτm t t m Sm (S N Sm + Vm−1 )δ Xˆ = (Sm − Vm−1 W Nm−1 · d ym ) 2π m m 4π

(153)

(see Eqs. (112, 113) for definitions). With the compensating gauge function found, the remaining manipulations to retrieve the uncentered monopole metric in Eq. (111) from Eqs. (150, 151) differ only in the m label and the dimensions of the matrices from those in Sect. 4.3 and are therefore not repeated here.

To compute the metric using the hyperKähler quotient construction we follow and summarise the reasoning in [35, 15] and Sect. 4.4. We have to find the metric on $\mathcal{N}_m/\hat G_m$, where $\mathcal{N}_m$ is the subset of the space $\hat{\mathcal{A}}_m$ of monopole Nahm data containing the solutions to the Nahm equations. Making use of the U(1) gauge symmetry for the monopole in Eq. (138), we can restrict ourselves to piecewise constant $\hat A(z)$, characterised by n − 1 quaternions corresponding to its values on the subintervals. Together with the n − 2 complex two-vectors giving the matching data, these form the space $\hat{\mathcal{A}}_m^0 = \mathbb{H}^{n-1} \times \mathbb{C}^{n-2,2} \ni (y_m, \zeta_m)$. This space has natural metric

$$g = \tfrac{1}{2} \mathrm{Tr}\, \mathrm{tr}_2 \left( dy_m^\dagger N_m\, dy_m + 2\, d\zeta_m^\dagger d\zeta_m \right). \tag{154}$$

The set of piecewise constant solutions to the Nahm equations form $\mathcal{N}_{0,m}$, which is a subset of $\hat{\mathcal{A}}_m^0$. The vector part of a piecewise constant solution to the monopole


Nahm equation (i.e. $\mathcal{N}_{0,m}$) is fixed by Eq. (136). We introduce the phases of $\zeta_m$ as $\psi_m = (\psi_2, \ldots, \psi_{n-1})^t$. Having gauge fixed to constant $\hat A(z)$, the residual U(1) gauge symmetry consists of gauge functions having piecewise linear and continuous logarithms, which vanish at the endpoints $z = \mu_1$ and $z = \mu_n$. This results in an $\mathbb{R}^{n-2}$ action on $\hat{\mathcal{A}}_m^0$, characterised by

$$y_m \to y_m - \frac{1}{2\pi} N_m^{-1} S_m t_m, \qquad \psi_m \to \psi_m + 2 t_m, \qquad t_m \in \mathbb{R}^{n-2}, \tag{155}$$

with moment map, zero set and invariants given by

$$\vec\mu_m = -\frac{1}{2\pi} S_m^t \vec y_m + \frac{\vec\rho_m}{2\pi}, \qquad \vec\rho_m = S_m^t \vec y_m, \qquad \tau_m = 4\pi N_m y_{0m} + S_m \psi_m. \tag{156}$$

A suitable notation being established, the algebra to obtain the metric and Kähler forms for the uncentered monopole in Eq. (111) is now nearly identical to the hyperKähler quotient construction of the uncentered caloron metric, and one readily retrieves Eq. (111). Actually, one only has to insert the m labels at appropriate places, just realising that the dimensionalities of the objects are slightly different.

References

1. Atiyah, M.F. and Hitchin, N.J.: The Geometry and Dynamics of Magnetic Monopoles. Princeton: Princeton Univ. Press, 1988
2. Atiyah, M.F., Hitchin, N.J., Drinfeld, V.G., Manin, Yu.I.: Phys. Lett. 65 A, 185 (1978); Atiyah, M.F.: Geometry of Yang–Mills fields. Fermi lectures, Scuola Normale Superiore, Pisa, 1979
3. Bogomol’nyi, E.B.: Yad. Fiz. 24, 861 (1976); Sov. J. Nucl. 24, 449 (1976); Prasad, M.K. and Sommerfield, C.M.: Phys. Rev. Lett. 35, 760 (1975)
4. Braam, P.J. and van Baal, P.: Commun. Math. Phys. 122, 267 (1989)
5. Connell, S.A.: The dynamics of the SU(3) charge (1,1) magnetic monopole. (1991), ftp://maths.adelaide.edu.au/pure/murray/oneone.tex, unpublished preprint
6. Corrigan, E.: Unpublished, quoted in [41]
7. Corrigan, E. and Goddard, P.: Ann. Phys. (N.Y.) 154, 253 (1984)
8. Donaldson, S.K.: Commun. Math. Phys. 93, 453 (1984)
9. Donaldson, S.K.: Commun. Math. Phys. 96, 387 (1984)
10. Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds, Oxford: Clarendon Press, 1990
11. Gauntlett, J.P. and Lowe, D.A.: Nucl. Phys. B 472, 194 (1996) (hep-th/9601085); Lee, K., Weinberg, E.J. and Yi, P.: Phys. Lett. B 376, 97 (1996) (hep-th/9601097)
12. Garland, H. and Murray, M.K.: Commun. Math. Phys. 120, 335 (1988)
13. Gauntlett, J.P., Gibbons, G.W., Papadopoulos, G. and Townsend, P.K.: Nucl. Phys. B 500, 133 (1997) (hep-th/9702202); Papadopoulos, G. and Townsend, P.K.: Nucl. Phys. B 444, 245 (1995) (hep-th/9501069)
14. Gibbons, G.W. and Manton, N.S.: Phys. Lett. B 356, 32 (1995) (hep-th/9506052)
15. Gibbons, G.W., Rychenkova, P. and Goto, R.: Commun. Math. Phys. 186, 581 (1997) (hep-th/9608085)
16. Gross, D.J., Pisarski, R.D. and Yaffe, L.G.: Rev. Mod. Phys. 53, 43 (1981)
17. Harrington, B.J. and Shepard, H.K.: Phys. Rev. D 17, 2122 (1978); ibid. D 18, 2990 (1978)
18. Hitchin, N.J., Karlhede, A., Lindström, U. and Roček, M.: Commun. Math. Phys. 108, 535 (1987)
19. ’t Hooft, G.: Phys. Rev. D 14, 3432 (1976)
20. Houghton, C.J. and Sutcliffe, P.M.: J. Math. Phys. 38, 5576 (1997)
21. Hurtubise, J. and Murray, M.K.: Commun. Math. Phys. 122, 35 (1989)
22. Kraan, T.C. and van Baal, P.: Phys. Lett. B 428, 268 (1998) (hep-th/9802049)
23. Kraan, T.C. and van Baal, P.: Nucl. Phys. B 533, 627–659 (1998) (hep-th/9805168); Nucl. Phys. A 642, 299c (1998) (hep-th/9805201)
24. Kraan, T.C. and van Baal, P.: Phys. Lett. B 435, 389 (1998) (hep-th/9806034)
25. Kraan, T.C. and van Baal, P.: Nucl. Phys. Suppl. 73, 554 (1999) (hep-lat/9808015)
26. Lee, K.: Phys. Lett. B 426, 323 (1998) (hep-th/9802012); Lee, K. and Lu, C.: Phys. Rev. D 58, 25011 (1998) (hep-th/9802108)
27. Lee, K., Weinberg, E.J., Yi, P.: Phys. Rev. D 54, 1633 (1996)
28. Lee, K., Weinberg, E.J., Yi, P.: Phys. Rev. D 54, 6351 (1996)

29. Lee, K. and Yi, P.: Phys. Rev. D 56, 3711 (1997) (hep-th/9702107)
30. Lee, K. and Yi, P.: Phys. Rev. D 58, 066005 (1998) (hep-th/9804174)
31. Lu, C.: Phys. Rev. D 58, 125010 (1998) (hep-th/9806237)
32. Maciocia, A.: Commun. Math. Phys. 135, 467 (1991)
33. Manton, N.S.: Phys. Lett. B 110, 54 (1982)
34. Manton, N.S.: Phys. Lett. B 154, 397 (1985) [Err. 157B, (1985) 475]
35. Murray, M.K.: J. Geom. Phys. 23, 31–41 (1997) (hep-th/9605054)
36. Nahm, W.: Phys. Lett. B 90, 413 (1980)
37. Nahm, W.: All self-dual multimonopoles for arbitrary gauge groups. CERN preprint TH-3172 (1981), published in Freiburg ASI 301 (1981); The construction of all self-dual multimonopoles by the ADHM method. In: Monopoles in quantum field theory, eds. N. Craigie, e.a. Singapore: World Scientific, 1982, p. 87
38. Nahm, W.: Self-dual monopoles and calorons. In: Lect. Notes in Physics. 201, eds. G. Denardo, e.a. 1984, p. 189
39. Nakajima, H.: Monopoles and Nahm’s Equations. In: Einstein metrics and Yang–Mills connections, Sanda, 1990, eds. T. Mabuchi and S. Mukai, New York: Dekker, 1993
40. Osborn, H.: Nucl. Phys. B 159, 497 (1979)
41. Osborn, H.: Ann. Phys. (N.Y.) 135, 373 (1981)
42. Pedersen, H. and Poon, Y.: Commun. Math. Phys. 117, 569 (1988)
43. Taubes, C.: Morse theory and monopoles: Topology in long range forces. In: Progress in gauge field theory, eds. G. ’t Hooft et al, New York: Plenum Press, 1984, p. 563
44. Weinberg, E.J. and Yi, P.: Phys. Rev. D 58, 046001 (1998)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 212, 535 – 556 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

On Covariant Realizations of the Euclid Group

R. Z. Zhdanov, V. I. Lahno, W. I. Fushchych

Institute of Mathematics, 3 Tereshchenkivska Street, 01004 Kiev, Ukraine. E-mail: [email protected]; [email protected]

Received: 12 September 1997 / Accepted: 30 January 2000

Abstract: We classify realizations of the Lie algebras of the rotation O(3) and Euclid E(3) groups within the class of first-order differential operators in arbitrary finite dimensions. It is established that there are only two distinct realizations of the Lie algebra of the group O(3) which are inequivalent within the action of a diffeomorphism group. Using this result we describe a special subclass of realizations of the Euclid algebra which are called covariant ones by analogy to similar objects considered in classical representation theory. Furthermore, we give an exhaustive description of realizations of the Lie algebra of the group O(4) and construct covariant realizations of the Lie algebra of the generalized Euclid group E(4).

1. Introduction

The standard approach to constructing linear relativistic motion equations contains as a subproblem the one of describing inequivalent matrix representations of the Poincaré group P(1, 3). Thus, if one succeeds in obtaining an exhaustive (in some sense) description of all inequivalent representations of the latter, then it is possible to construct all possible Poincaré-invariant linear wave equations (for more details see, e.g., [1–3]). It would be only natural to apply the same approach to describing nonlinear relativistically invariant models with the help of Lie’s infinitesimal technique. However, in the overwhelming majority of the papers devoted to symmetry classification of nonlinear differential equations admitting some Lie transformation group G the realization of the group was fixed a priori. As a result, only particular classes of partial differential equations invariant with respect to a prescribed group G were obtained.

One of the possible reasons for this is that the problem of describing inequivalent realizations of a given Lie transformation group reduces to constructing a general solution of some over-determined system of nonlinear partial differential equations (in contrast to the case of classical matrix representation theory where one has to solve nonlinear matrix equations).


We recall that given a fixed realization of a Lie transformation group G, the problem of describing partial differential equations invariant under the group G is reduced with the help of the infinitesimal Lie method to integrating some over-determined linear system of partial differential equations (called determining equations) [4–7]. However, to solve the problem of constructing all differential equations admitting the transformation group G whose realization is not fixed a priori one has

• to construct all inequivalent (in some sense) realizations of the Lie transformation group G,
• to solve the determining equations for each realization obtained.

What is more, the first problem, in contrast to the second one, reduces to solving nonlinear systems of partial differential equations. In this respect one should mention Lie’s classification of integrable ordinary differential equations based on his classification of complex Lie algebras of first-order differential operators in one and two variables [8]. However, it seems impossible to give an exhaustive description of all Lie algebras of first-order differential operators. To date there is no complete classification of them even for the case of first-order differential operators in three variables, though a partial classification was obtained by Lie a century ago [8].

The classification problem is substantially simplified if we are looking for inequivalent realizations of a specific Lie algebra. It has been completely solved by Rideau and Winternitz [9], Zhdanov and Fushchych [10] for the generalized Galilei (Schrödinger) group $G_2(1, 1)$ acting in the space of two dependent and two independent variables. Yehorchenko [11] and Fushchych, Tsyfra and Boyko [12] have constructed new (nonlinear) realizations of the Poincaré groups P(1, 2) and P(1, 3), respectively (see also [13, 14]). Some new realizations of the Galilei group G(1, 3) were suggested in [15].
A complete description of covariant realizations of the conformal group C(n, m) in the space of n + m independent and one dependent variables was obtained by Fushchych, Zhdanov and Lahno [16, 17] (see also [18]). It has been established, in particular, that any covariant realization of the Poincaré group P(n, m) with max{n, m} ≥ 3 in the case of one dependent variable is equivalent to the standard realization. But given the condition max{n, m} < 3, there exist essentially new realizations of the corresponding Poincaré groups.

The present paper is devoted mainly to classification of inequivalent realizations of the Euclid group E(3), which is a semi-direct product of the three-parameter rotation group O(3) and of the three-parameter Abelian translation group T(3), acting in the space of three independent $(x_1, x_2, x_3)$ and $n \in \mathbb{N}$ dependent $(u_1, \ldots, u_n)$ variables. Being a subgroup of such fundamental groups as the Poincaré and Galilei groups, the Euclid group plays an exceptional role in modern mathematical and theoretical physics, since it is admitted both by equations of relativistic and non-relativistic theories. In particular, group E(3) is an invariance group of the Klein–Gordon–Fock, Maxwell, heat, Schrödinger, Dirac, Weyl, Navier–Stokes, Lamé and Yang–Mills equations.

The paper is organized as follows. The second section contains the necessary notations, conventions and definitions used throughout the paper. In the third section we give an exhaustive classification of inequivalent realizations of the Lie algebra of the rotation group O(3) within the class of first-order differential operators. The fourth section is devoted to description of covariant realizations of the Euclid algebra AE(3). We give a complete classification of them and, furthermore, demonstrate how to reduce the realizations of AE(3) realized on the sets of solutions of the Navier–Stokes, Lamé, Weyl, Maxwell and Dirac equations to one of the two canonical forms. In the fifth section


the results obtained are applied to describe covariant realizations of the Lie algebra AE(4) of the generalized Euclid group E(4).

2. Basic Notations and Definitions

It is common knowledge that investigation of realizations of a Lie transformation group G reduces to study of realizations of its Lie algebra AG whose basis elements are the first-order differential operators (Lie vector fields) of the form

$$Q = \xi_\alpha(x, u)\, \partial_{x_\alpha} + \eta_i(x, u)\, \partial_{u_i}, \tag{1}$$

where $\xi_\alpha$, $\eta_i$ are some real-valued smooth functions of $x = (x_1, x_2, \ldots, x_m) \in \mathbb{R}^m$ and $u = (u_1, u_2, \ldots, u_n) \in \mathbb{R}^n$, $\partial_{x_\alpha} = \partial/\partial x_\alpha$, $\partial_{u_i} = \partial/\partial u_i$, $\alpha = 1, 2, \ldots, m$, $i = 1, 2, \ldots, n$. Hereafter, a summation over the repeated indices is understood.

In the above formulae we have two “sorts” of variables. The variables $x_1, x_2, \ldots, x_m$ and $u_1, u_2, \ldots, u_n$ will be referred to as independent and dependent variables, respectively. The difference between these becomes essential when we consider AG as an invariance algebra of some system of partial differential equations for $u_1(x), \ldots, u_n(x)$.

Due to properties of the corresponding Lie transformation group G, basis operators $Q_a$, $a = 1, \ldots, N$, of a Lie algebra AG satisfy commutation relations

$$[Q_a, Q_b] = C_{ab}^c Q_c, \qquad a, b = 1, \ldots, N, \tag{2}$$

where $[Q_a, Q_b] \equiv Q_a Q_b - Q_b Q_a$ is the commutator.

In (2) $C_{ab}^c = \mathrm{const} \in \mathbb{R}$ are structure constants which determine uniquely the Lie algebra AG. A fixed set of Lie vector fields (LVFs) $Q_a$ satisfying (2) is called a realization of the Lie algebra AG.

Thus the problem of description of all realizations of a given Lie algebra AG reduces to solving relations (2) with some fixed structure constants $C_{ab}^c$ within the class of LVFs (1).

It is easy to check that relations (2) are not altered by an arbitrary invertible transformation of variables x, u,

$$y_\alpha = f_\alpha(x, u), \quad \alpha = 1, \ldots, m, \qquad v_i = g_i(x, u), \quad i = 1, \ldots, n, \tag{3}$$

where $f_\alpha$, $g_i$ are smooth functions. That is why we can introduce on the set of realizations of a Lie algebra AG the following relation: two realizations $Q_1, \ldots, Q_N$ and $Q'_1, \ldots, Q'_N$ are called equivalent if they are transformed one into another by means of an invertible transformation (3). As invertible transformations of the form (3) form a group (called diffeomorphism group), the relation above is an equivalence relation. It divides the set of all realizations of a Lie algebra AG into equivalence classes $A_1, \ldots, A_r$. Consequently, to describe all possible realizations of AG it suffices to construct one representative of each equivalence class $A_j$, $j = 1, \ldots, r$.

Definition 1. First-order linearly-independent differential operators

$$P_a = \xi_{ab}^{(1)}(x, u)\, \partial_{x_b} + \eta_{ai}^{(1)}(x, u)\, \partial_{u_i}, \qquad J_a = \xi_{ab}^{(2)}(x, u)\, \partial_{x_b} + \eta_{ai}^{(2)}(x, u)\, \partial_{u_i}, \tag{4}$$


R. Z. Zhdanov, V. I. Lahno, W. I. Fushchych

where the indices $a, b$ take the values 1, 2, 3 and the index $i$ takes the values $1, 2, \ldots, n$, form a realization of the Euclid algebra $AE(3)$ provided the following commutation relations are fulfilled:

$$[P_a, P_b] = 0, \tag{5}$$

$$[J_a, P_b] = \varepsilon_{abc} P_c, \tag{6}$$

$$[J_a, J_b] = \varepsilon_{abc} J_c, \tag{7}$$

where

$$\varepsilon_{abc} = \begin{cases} 1, & (abc) = \text{cycle}\,(123), \\ -1, & (abc) = \text{cycle}\,(213), \\ 0, & \text{in the remaining cases.} \end{cases}$$

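Definition 2. A realization of the Euclid algebra within the class of LVFs (4) is called covariant if the coefficients of the basis elements $P_a$ satisfy the following condition:

$$\operatorname{rank} \begin{pmatrix} \xi^{(1)}_{11} & \xi^{(1)}_{12} & \xi^{(1)}_{13} & \eta^{(1)}_{11} & \cdots & \eta^{(1)}_{1n} \\ \xi^{(1)}_{21} & \xi^{(1)}_{22} & \xi^{(1)}_{23} & \eta^{(1)}_{21} & \cdots & \eta^{(1)}_{2n} \\ \xi^{(1)}_{31} & \xi^{(1)}_{32} & \xi^{(1)}_{33} & \eta^{(1)}_{31} & \cdots & \eta^{(1)}_{3n} \end{pmatrix} = 3. \tag{8}$$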

3. Realizations of the Lie Algebra of the Rotation Group O(3)

It is well-known from classical representation theory that there are infinitely many inequivalent matrix representations of the Lie algebra of the rotation group $O(3)$ [1]. A natural equivalence relation on the set of matrix representations of $AO(3)$ is defined as follows:

$$J_a \to V J_a V^{-1}$$

with an arbitrary constant nonsingular matrix $V$. If we represent the matrices $J_a$ as first-order differential operators (see, e.g., [7])

$$J_a = -\{J_a u\}_\alpha\, \partial_{u_\alpha}, \tag{9}$$

where $u$ is a vector-column of corresponding dimension, then the above equivalence relation means that the representations of the algebra $AO(3)$ are searched for within the class of LVFs (9) up to invertible linear transformations $u \to v = V u$.

It is proved below that, provided realizations of $AO(3)$ are classified up to arbitrary invertible transformations of variables

$$v_i = F_i(u), \qquad i = 1, \ldots, n, \tag{10}$$

there are only two inequivalent realizations.

On Covariant Realizations of the Euclid Group


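Theorem 1. Let first-order differential operators

$$J_a = \eta_{ai}(u)\,\partial_{u_i}, \qquad a = 1, 2, 3 \tag{11}$$

satisfy the commutation relations of the Lie algebra of the rotation group $O(3)$ (7). Then either all of them are equal to zero, i.e.

$$J_a = 0, \qquad a = 1, 2, 3, \tag{12}$$

or there exists a transformation (10) reducing these operators to one of the following forms:

1.

$$\begin{aligned} J_1 &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2}, \\ J_2 &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2}, \\ J_3 &= \partial_{u_1}; \end{aligned} \tag{13}$$

2.

$$\begin{aligned} J_1 &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} + \sin u_1 \sec u_2\, \partial_{u_3}, \\ J_2 &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} + \cos u_1 \sec u_2\, \partial_{u_3}, \\ J_3 &= \partial_{u_1}. \end{aligned} \tag{14}$$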
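As a quick consistency check (a sketch added for illustration, not part of the original proof), one can verify by hand that the operators (13) satisfy the relation $[J_1, J_2] = J_3$ from (7). The shorthand $s$, $c$, $t$ is introduced only for this computation.

```latex
% Shorthand (assumed for this sketch only): s = \sin u_1, c = \cos u_1, t = \tan u_2,
% so that J_1 = -st\,\partial_{u_1} - c\,\partial_{u_2}, \quad J_2 = -ct\,\partial_{u_1} + s\,\partial_{u_2}.
\begin{align*}
[J_1, J_2]\big|_{\partial_{u_1}}
  &= J_1(-ct) - J_2(-st) \\
  &= (-st)(st) + (-c)(-c\sec^2 u_2) - (-ct)(-ct) - (s)(-s\sec^2 u_2) \\
  &= (s^2 + c^2)\,(\sec^2 u_2 - t^2) = 1, \\
[J_1, J_2]\big|_{\partial_{u_2}}
  &= J_1(s) - J_2(-c) = (-st)(c) - (-ct)(s) = 0 .
\end{align*}
% Hence [J_1, J_2] = \partial_{u_1} = J_3.
```

The two remaining relations, $[J_3, J_1] = J_2$ and $[J_3, J_2] = -J_1$, follow by differentiating the coefficients of $J_1$ and $J_2$ with respect to $u_1$.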

Proof. If at least one of the operators $J_a$ (say $J_3$) is equal to zero, then due to the commutation relations (7) the two other operators $J_1$, $J_2$ are also equal to zero and we arrive at Formulae (12).

Let $J_3$ be a non-zero operator. Then, using a transformation (10), we can always reduce the operator $J_3$ to the form $J_3 = \partial_{v_1}$ (we should write $J'_3$, but to simplify the notation we hereafter omit the primes). Next, from the commutation relations $[J_3, J_1] = J_2$, $[J_3, J_2] = -J_1$ it follows that the coefficients of the operators $J_1$, $J_2$ satisfy the system of ordinary differential equations with respect to $v_1$,

$$\partial_{v_1}\eta_{1i} = \eta_{2i}, \qquad \partial_{v_1}\eta_{2i} = -\eta_{1i}, \qquad i = 1, \ldots, n.$$

Solving the above system yields

$$\eta_{1i} = f_i \cos v_1 + g_i \sin v_1, \qquad \eta_{2i} = g_i \cos v_1 - f_i \sin v_1, \tag{15}$$

where $f_i$, $g_i$ are arbitrary smooth functions of $v_2, \ldots, v_n$, $i = 1, \ldots, n$.

Case 1. $f_j = g_j = 0$, $j \ge 2$. In this case the operators $J_1$, $J_2$ read

$$J_1 = f \cos v_1\, \partial_{v_1}, \qquad J_2 = -f \sin v_1\, \partial_{v_1}$$

with an arbitrary smooth function $f = f(v_2, \ldots, v_n)$. Inserting the above expressions into the remaining commutation relation $[J_1, J_2] = J_3$ and computing the commutator on the left-hand side, we arrive at the equality $f^2 = -1$, which cannot be satisfied by a real-valued function $f$.

Case 2. Not all $f_j$, $g_j$, $j \ge 2$, are equal to 0. Making a change of variables

$$w_1 = v_1 + V(v_2, \ldots, v_n), \qquad w_j = v_j, \qquad j = 2, \ldots, n,$$

we transform the operators $J_a$, $a = 1, 2, 3$, with coefficients (15) as follows:

$$\begin{aligned} J_1 &= \tilde f \sin w_1\, \partial_{w_1} + \sum_{j=2}^{n} (\tilde f_j \cos w_1 + \tilde g_j \sin w_1)\, \partial_{w_j}, \\ J_2 &= \tilde f \cos w_1\, \partial_{w_1} + \sum_{j=2}^{n} (\tilde g_j \cos w_1 - \tilde f_j \sin w_1)\, \partial_{w_j}, \\ J_3 &= \partial_{w_1}. \end{aligned} \tag{16}$$

Here $\tilde f$, $\tilde f_j$, $\tilde g_j$ are some functions of $w_2, \ldots, w_n$.

Subcase 2.1. Not all $\tilde f_j$ are equal to 0. Making a transformation

$$z_1 = w_1, \qquad z_j = W_j(w_2, \ldots, w_n), \qquad j = 2, \ldots, n,$$

where $W_2$ is a particular solution of the partial differential equation

$$\sum_{j=2}^{n} \tilde f_j\, \partial_{w_j} W_2 = 1$$

and $W_3, \ldots, W_n$ are functionally-independent first integrals of the partial differential equation

$$\sum_{j=2}^{n} \tilde f_j\, \partial_{w_j} W = 0,$$

we reduce the operators (16) to the form

$$\begin{aligned} J_1 &= F \sin z_1\, \partial_{z_1} + \cos z_1\, \partial_{z_2} + \sum_{j=2}^{n} G_j \sin z_1\, \partial_{z_j}, \\ J_2 &= F \cos z_1\, \partial_{z_1} - \sin z_1\, \partial_{z_2} + \sum_{j=2}^{n} G_j \cos z_1\, \partial_{z_j}, \\ J_3 &= \partial_{z_1}. \end{aligned} \tag{17}$$

Substituting the operators (17) into the commutation relation $[J_1, J_2] = J_3$ and equating coefficients of the linearly-independent operators $\partial_{z_1}, \ldots, \partial_{z_n}$, we arrive at the following system of partial differential equations for the functions $F, G_2, \ldots, G_n$:

$$F_{z_2} - F^2 = 1, \qquad G_{j z_2} - F G_j = 0, \qquad j = 2, \ldots, n.$$

Integrating the above equations yields

$$F = \tan(z_2 + c_1), \qquad G_j = \frac{c_j}{\cos(z_2 + c_1)},$$

where $c_1, \ldots, c_n$ are arbitrary smooth functions of $z_3, \ldots, z_n$, $j = 2, \ldots, n$. Replacing, if necessary, $z_2$ by $z_2 + c_1(z_3, \ldots, z_n)$, we can put $c_1$ equal to zero. Next, making a transformation

$$y_a = z_a, \quad a = 1, 2, 3, \qquad y_k = Z_k(z_3, \ldots, z_n), \quad k = 4, \ldots, n,$$

where $Z_k$ are functionally-independent first integrals of the partial differential equation

$$\sum_{j=3}^{n} G_j\, \partial_{z_j} Z = 0,$$

we can put $G_k = 0$, $k = 4, \ldots, n$. With these remarks the operators (17) take the form

$$\begin{aligned} J_1 &= \sin y_1 \tan y_2\, \partial_{y_1} + \cos y_1\, \partial_{y_2} + \frac{\sin y_1}{\cos y_2}\,(f\, \partial_{y_2} + g\, \partial_{y_3}), \\ J_2 &= \cos y_1 \tan y_2\, \partial_{y_1} - \sin y_1\, \partial_{y_2} + \frac{\cos y_1}{\cos y_2}\,(f\, \partial_{y_2} + g\, \partial_{y_3}), \\ J_3 &= \partial_{y_1}, \end{aligned} \tag{18}$$

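where $f$, $g$ are arbitrary smooth functions of $y_3, \ldots, y_n$.

If $g \equiv 0$, then making a transformation

$$\tilde u_1 = y_1 - \arctan\frac{f}{\cos y_2}, \qquad \tilde u_2 = -\arctan\frac{\sin y_2}{\sqrt{\cos^2 y_2 + f^2}}, \qquad \tilde u_k = y_k,$$

where $k = 3, \ldots, n$, we reduce the operators (18) to the form (13).

If in (18) $g \not\equiv 0$, then changing $y_3$ to $\tilde y_3 = \int g^{-1}\, dy_3$ and $y_2$ to $\tilde y_2 = -y_2$, we transform the above operators to become

$$\begin{aligned} J_1 &= -\sin \tilde y_1 \tan \tilde y_2\, \partial_{\tilde y_1} - \left(\cos \tilde y_1 - \alpha\, \frac{\sin \tilde y_1}{\cos \tilde y_2}\right) \partial_{\tilde y_2} + \frac{\sin \tilde y_1}{\cos \tilde y_2}\, \partial_{\tilde y_3}, \\ J_2 &= -\cos \tilde y_1 \tan \tilde y_2\, \partial_{\tilde y_1} + \left(\sin \tilde y_1 + \alpha\, \frac{\cos \tilde y_1}{\cos \tilde y_2}\right) \partial_{\tilde y_2} + \frac{\cos \tilde y_1}{\cos \tilde y_2}\, \partial_{\tilde y_3}, \\ J_3 &= \partial_{\tilde y_1}. \end{aligned} \tag{19}$$

Here $\alpha$ is an arbitrary smooth function of $\tilde y_3, \ldots, \tilde y_n$. Finally, making the transformation

$$\tilde u_1 = \tilde y_1 + f, \qquad \tilde u_2 = g, \qquad \tilde u_3 = h, \qquad \tilde u_k = \tilde y_k,$$

where $k = 4, \ldots, n$ and $f = f(\tilde y_2, \ldots, \tilde y_n)$, $g = g(\tilde y_2, \ldots, \tilde y_n)$, $h = h(\tilde y_2, \ldots, \tilde y_n)$ satisfy the compatible over-determined system of nonlinear partial differential equations

$$\begin{aligned} f_{\tilde y_2} &= \sin f \tan g, & f_{\tilde y_3} &= \sin \tilde y_2 - \alpha \sin f \tan g - \cos \tilde y_2 \cos f \tan g, \\ g_{\tilde y_2} &= \cos f, & g_{\tilde y_3} &= \sin f \cos \tilde y_2 - \alpha \cos f, \\ h_{\tilde y_2} &= -\sin f \sec g, & h_{\tilde y_3} &= (\cos f \cos \tilde y_2 + \alpha \sin f) \sec g, \end{aligned}$$

reduces the operators (19) to the form (14).

Subcase 2.2. $\tilde f_j = 0$, $j = 2, \ldots, n$. Substituting the operators (16) with $\tilde f_j = 0$ into the commutation relation $[J_1, J_2] = J_3$ and equating coefficients of the linearly-independent operators $\partial_{w_1}, \ldots, \partial_{w_n}$ yields a system of algebraic equations

$$-\tilde f^2 = 1, \qquad \tilde f\, \tilde g_j = 0, \qquad j = 2, \ldots, n.$$

As the function $\tilde f$ is a real-valued one, the system obtained is inconsistent.

Thus we have proved that Formulae (12)–(14) give all possible inequivalent realizations of the Lie algebra (7) within the class of first-order differential operators (11). The theorem is proved.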
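Two of the computational steps in the proof above can be spelled out explicitly. The following is an illustrative sketch only: it treats $z_3, \ldots, z_n$ as parameters and uses no notation beyond that of the proof.

```latex
% Case 1: f does not depend on v_1, so
\begin{align*}
[J_1, J_2]
  &= \bigl(f\cos v_1\,\partial_{v_1}(-f\sin v_1) - (-f\sin v_1)\,\partial_{v_1}(f\cos v_1)\bigr)\,\partial_{v_1} \\
  &= -f^2(\cos^2 v_1 + \sin^2 v_1)\,\partial_{v_1} = -f^2\,\partial_{v_1},
\end{align*}
% and [J_1, J_2] = J_3 = \partial_{v_1} forces f^2 = -1.

% Subcase 2.1: integrating the system for F and G_j with respect to z_2,
\begin{align*}
F_{z_2} = 1 + F^2
  &\;\Longrightarrow\; \arctan F = z_2 + c_1
  \;\Longrightarrow\; F = \tan(z_2 + c_1), \\
G_{j z_2} = \tan(z_2 + c_1)\,G_j
  &\;\Longrightarrow\; \ln|G_j| = -\ln|\cos(z_2 + c_1)| + \mathrm{const}
  \;\Longrightarrow\; G_j = \frac{c_j}{\cos(z_2 + c_1)} .
\end{align*}
```

The integration "constants" $c_1, c_j$ are functions of the remaining variables $z_3, \ldots, z_n$, exactly as stated in the proof.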


If we realize the rotation group as the group of transformations of the space of spherical functions, then the basis elements of its Lie algebra are exactly of the form (13) [1]. Hence it follows that the realization space $V$ of the Lie algebra (13) is a direct sum of subspaces $V_{2l+1}$ of spherical functions of order $l$. Furthermore, if we consider $O(3)$ as the group of transformations of the space of generalized spherical functions [1], then the operators (14) are the basis elements of the corresponding Lie algebra.

4. Realizations of the Algebra AE(3)

First we will prove an auxiliary assertion giving inequivalent realizations of Lie algebras of the translation group $T(3)$ within the class of LVFs.

Lemma 1. Let mutually commuting LVFs

$$P_a = \xi^{(1)}_{ab}(x, u)\,\partial_{x_b} + \eta^{(1)}_{ai}(x, u)\,\partial_{u_i},$$

where $a, b = 1, \ldots, N$, satisfy the relation

$$\operatorname{rank} \begin{pmatrix} \xi^{(1)}_{11} & \cdots & \xi^{(1)}_{1N} & \eta^{(1)}_{11} & \cdots & \eta^{(1)}_{1n} \\ \vdots & & \vdots & \vdots & & \vdots \\ \xi^{(1)}_{N1} & \cdots & \xi^{(1)}_{NN} & \eta^{(1)}_{N1} & \cdots & \eta^{(1)}_{Nn} \end{pmatrix} = N. \tag{20}$$

Then there exists a transformation of the form (3) reducing the operators $P_a$ to the form $P'_a = \partial_{y_a}$, $a = 1, \ldots, N$.

Proof. To avoid inessential technicalities we will give the detailed proof of the lemma for the case $N = 3$. Given the condition $N = 3$, relation (20) reduces to the form (8). Due to the latter, $P_a \neq 0$ for all $a = 1, 2, 3$. It is well-known that a non-zero operator

$$P_1 = \xi^{(1)}_{1b}(x, u)\,\partial_{x_b} + \eta^{(1)}_{1i}(x, u)\,\partial_{u_i}$$

can always be reduced to the form $P'_1 = \partial_{y_1}$ by a transformation (3) with $m = 3$. If we denote by $P'_2$, $P'_3$ the operators $P_2$, $P_3$ written in the new variables $y$, $v$, then owing to the commutation relations (5) they commute with the operator $P'_1 = \partial_{y_1}$. Hence, we conclude that their coefficients are independent of $y_1$.

Furthermore, due to condition (8) at least one of the coefficients $\xi^{(1)}_{22}, \xi^{(1)}_{23}, \eta^{(1)}_{21}, \ldots, \eta^{(1)}_{2n}$ of the operator $P'_2$ is not equal to zero. Summing up, we conclude that the operator $P'_2$ is of the form

$$P'_2 = \xi^{(1)}_{2b}(y_2, y_3, v)\,\partial_{y_b} + \eta^{(1)}_{2i}(y_2, y_3, v)\,\partial_{v_i} \neq 0,$$

not all the functions $\xi^{(1)}_{22}, \xi^{(1)}_{23}, \eta^{(1)}_{21}, \ldots, \eta^{(1)}_{2n}$ being identically equal to zero. Making a transformation

$$\begin{aligned} z_1 &= y_1 + F(y_2, y_3, v), \\ z_2 &= G(y_2, y_3, v), \\ z_3 &= \omega_0(y_2, y_3, v), \\ w_i &= \omega_i(y_2, y_3, v), \quad i = 1, \ldots, n, \end{aligned} \tag{21}$$


where the functions $F$, $G$ are particular solutions of the differential equations

$$\begin{aligned} &\xi^{(1)}_{22} F_{y_2} + \xi^{(1)}_{23} F_{y_3} + \eta^{(1)}_{2i} F_{v_i} + \xi^{(1)}_{21} = 0, \\ &\xi^{(1)}_{22} G_{y_2} + \xi^{(1)}_{23} G_{y_3} + \eta^{(1)}_{2i} G_{v_i} = 1 \end{aligned}$$

(all coefficients being functions of $y_2, y_3, v$), and $\omega_0, \omega_1, \ldots, \omega_n$ are functionally-independent first integrals of the Euler–Lagrange system

$$\frac{dy_2}{\xi^{(1)}_{22}} = \frac{dy_3}{\xi^{(1)}_{23}} = \frac{dv_1}{\eta^{(1)}_{21}} = \cdots = \frac{dv_n}{\eta^{(1)}_{2n}},$$

which has exactly $n + 1$ functionally-independent integrals, we reduce the operator $P'_2$ to the form $P''_2 = \partial_{z_2}$. It is easy to check that transformation (21) does not alter the form of the operator $P'_1$. Being rewritten in the new variables $z$, $w$, it reads $P''_1 = \partial_{z_1}$. As the right-hand sides of (21) are functionally-independent by construction, the transformation (21) is invertible.

Consequently, the operators $P_a$ are equivalent to operators $P''_a$, where $P''_1 = \partial_{z_1}$, $P''_2 = \partial_{z_2}$ and

$$P''_3 = \xi^{(1)}_{3b}(z_3, w)\,\partial_{z_b} + \eta^{(1)}_{3i}(z_3, w)\,\partial_{w_i} \neq 0.$$

(The coefficients of the above operator are independent of $z_1$, $z_2$ because it commutes with the operators $P''_1$, $P''_2$.) Moreover, due to (8) at least one of the coefficients $\xi^{(1)}_{33}, \eta^{(1)}_{31}, \ldots, \eta^{(1)}_{3n}$ of the operator $P''_3$ is not identically equal to zero. Making a transformation

$$\begin{aligned} Z_1 &= z_1 + F(z_3, w), \\ Z_2 &= z_2 + G(z_3, w), \\ Z_3 &= H(z_3, w), \\ W_i &= \Omega_i(z_3, w), \quad i = 1, \ldots, n, \end{aligned}$$

where $F$, $G$, $H$ are particular solutions of the partial differential equations

$$\begin{aligned} \xi^{(1)}_{33}(z_3, w) F_{z_3} + \eta^{(1)}_{3i}(z_3, w) F_{w_i} &= -\xi^{(1)}_{31}(z_3, w), \\ \xi^{(1)}_{33}(z_3, w) G_{z_3} + \eta^{(1)}_{3i}(z_3, w) G_{w_i} &= -\xi^{(1)}_{32}(z_3, w), \\ \xi^{(1)}_{33}(z_3, w) H_{z_3} + \eta^{(1)}_{3i}(z_3, w) H_{w_i} &= 1, \end{aligned}$$

and $\Omega_1, \ldots, \Omega_n$ are functionally-independent first integrals of the Euler–Lagrange system

$$\frac{dz_3}{\xi^{(1)}_{33}} = \frac{dw_1}{\eta^{(1)}_{31}} = \cdots = \frac{dw_n}{\eta^{(1)}_{3n}},$$

we reduce the operators $P''_a$, $a = 1, 2, 3$, to the form $P'''_a = \partial_{Z_a}$, $a = 1, 2, 3$, which is the same as what was to be proved.

Note 1. In the papers [9, 17] mentioned above, a classification of realizations of the groups $G_2(1, 1)$, $C(n, m)$ was carried out under the assumption that mutually commuting LVFs

$$Q_a = \xi_{a\alpha}(x)\,\partial_{x_\alpha}, \qquad a = 1, \ldots, N$$


can be simultaneously reduced by the map

$$y_\alpha = f_\alpha(x), \qquad \alpha = 1, \ldots, n \tag{22}$$

to the form $Q'_a = \partial_{y_a}$. It is not difficult to see that this is possible if and only if the condition

$$\operatorname{rank}\, \|\xi_{a\alpha}\|_{a=1}^{N}{}_{\alpha=1}^{n} = N \tag{23}$$

holds. The sufficiency of the above statement is a consequence of Lemma 1. The necessity follows from the fact that the function-rows of coefficients of the operators $Q'_1, \ldots, Q'_N$ transformed according to Formulae (22) are obtained by multiplying the function-rows of coefficients of the operators $Q_1, \ldots, Q_N$ by the Jacobi matrix of the map (22), i.e.

$$\xi'_{a\alpha} = \xi_{a\beta}\, f_{\alpha x_\beta}, \qquad a = 1, \ldots, N, \quad \alpha = 1, \ldots, n,$$

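which leaves relation (23) invariant. Consequently, in [9, 17] only covariant realizations of the corresponding Lie algebras were considered, which, generally speaking, do not exhaust the set of all possible realizations.

Now we can prove a principal theorem giving a description of all inequivalent covariant realizations of the Euclid algebra $AE(3)$.

Theorem 2. Any covariant realization of the algebra $AE(3)$ within the class of first-order differential operators is equivalent to one of the following realizations:

1.

$$P_a = \partial_{x_a}, \qquad J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c}, \qquad a = 1, 2, 3; \tag{24}$$

2. $P_a = \partial_{x_a}$, $a = 1, 2, 3$,

$$\begin{aligned} J_1 &= -x_2 \partial_{x_3} + x_3 \partial_{x_2} + f\, \partial_{x_1} - f_{u_2} \sin u_1\, \partial_{x_3} - \sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2}, \\ J_2 &= -x_3 \partial_{x_1} + x_1 \partial_{x_3} + f\, \partial_{x_2} - f_{u_2} \cos u_1\, \partial_{x_3} - \cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2}, \\ J_3 &= -x_1 \partial_{x_2} + x_2 \partial_{x_1} + \partial_{u_1}; \end{aligned} \tag{25}$$

3. $P_a = \partial_{x_a}$, $a = 1, 2, 3$,

$$\begin{aligned} J_1 &= -x_2 \partial_{x_3} + x_3 \partial_{x_2} + g\, \partial_{x_1} - (\sin u_1\, g_{u_2} + \cos u_1 \sec u_2\, g_{u_3})\, \partial_{x_3} \\ &\quad - \sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} + \sin u_1 \sec u_2\, \partial_{u_3}, \\ J_2 &= -x_3 \partial_{x_1} + x_1 \partial_{x_3} + g\, \partial_{x_2} - (\cos u_1\, g_{u_2} - \sin u_1 \sec u_2\, g_{u_3})\, \partial_{x_3} \\ &\quad - \cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} + \cos u_1 \sec u_2\, \partial_{u_3}, \\ J_3 &= -x_1 \partial_{x_2} + x_2 \partial_{x_1} + \partial_{u_1}. \end{aligned} \tag{26}$$

Here $f = f(u_2, \ldots, u_n)$ is given by the formula

$$f = \alpha \sin u_2 + \beta \left( \sin u_2\, \ln \frac{\sin u_2 + 1}{\cos u_2} - 1 \right), \tag{27}$$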


$\alpha$, $\beta$ are arbitrary smooth functions of $u_3, \ldots, u_n$, and $g = g(u_2, \ldots, u_n)$ is a solution of the following linear partial differential equation:

$$\cos^2 u_2\; g_{u_2 u_2} + g_{u_3 u_3} - \sin u_2 \cos u_2\; g_{u_2} + 2 \cos^2 u_2\; g = 0. \tag{28}$$
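A short check, offered as a sketch under the assumption $\cos u_2 > 0$: for a function that does not depend on $u_3$, dividing (28) by $\cos^2 u_2$ gives the ordinary differential equation $g'' - \tan u_2\, g' + 2g = 0$ (primes denote $\partial_{u_2}$), and both terms of the function $f$ from (27) satisfy it. The abbreviation $L$ is introduced only for this illustration.

```latex
% Assumed shorthand: L(u_2) = \ln\frac{\sin u_2 + 1}{\cos u_2}, for which L' = \sec u_2.
\begin{align*}
f_1 &= \sin u_2: \quad
  f_1'' - \tan u_2\, f_1' + 2 f_1 = -\sin u_2 - \sin u_2 + 2\sin u_2 = 0, \\
f_2 &= \sin u_2\, L - 1: \quad
  f_2' = \cos u_2\, L + \tan u_2, \qquad
  f_2'' = -\sin u_2\, L + 1 + \sec^2 u_2, \\
f_2'' - \tan u_2\, f_2' + 2 f_2
  &= (-\sin u_2\, L + 1 + \sec^2 u_2) - (\sin u_2\, L + \tan^2 u_2) + (2\sin u_2\, L - 2) \\
  &= \sec^2 u_2 - \tan^2 u_2 - 1 = 0 .
\end{align*}
```

Since $\alpha$ and $\beta$ do not depend on $u_2$, the combination $f = \alpha f_1 + \beta f_2$ satisfies the same equation.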

Proof. Due to Lemma 1 the operators $P_a$ can always be reduced to the form $P_a = \partial_{x_a}$ by means of a properly chosen transformation (3). Inserting the operators

$$P_a = \partial_{x_a}, \qquad J_a = \xi_{ab}(x, u)\,\partial_{x_b} + \eta_{ai}(x, u)\,\partial_{u_i}$$

into the commutation relations (6) and equating the coefficients of the linearly-independent operators $\partial_{x_1}, \partial_{x_2}, \partial_{x_3}, \partial_{u_1}, \ldots, \partial_{u_n}$, we arrive at the system of partial differential equations for the functions $\xi_{ab}(x, u)$, $\eta_{ai}(x, u)$,

$$\partial_{x_b} \xi_{ac} = -\varepsilon_{abc}, \qquad \partial_{x_b} \eta_{ai} = 0, \qquad a, b, c = 1, 2, 3, \quad i = 1, \ldots, n.$$

Integrating the above system we conclude that the operators $J_a$ have the form

$$J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c} + j_{ab}(u)\,\partial_{x_b} + \tilde\eta_{ai}(u)\,\partial_{u_i}, \qquad a = 1, 2, 3, \tag{29}$$

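where $j_{ab}$, $\tilde\eta_{ai}$ are arbitrary smooth functions.

Inserting (29) into the commutation relations (7) and equating coefficients of $\partial_{u_1}, \ldots, \partial_{u_n}$ shows that the operators $\mathcal{J}_a = \tilde\eta_{ai}\, \partial_{u_i}$, $a = 1, 2, 3$, have to fulfill (7) with $J_a \to \mathcal{J}_a$. Hence, taking into account Theorem 1, we conclude that any covariant realization of the algebra $AE(3)$ is equivalent to the following one:

$$P_a = \partial_{x_a}, \qquad J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c} + j_{ab}(u)\,\partial_{x_b} + \mathcal{J}_a, \qquad a = 1, 2, 3, \tag{30}$$

the operators $\mathcal{J}_a$ being given by one of the Formulae (12)–(14).

Making a transformation

$$y_a = x_a + F_a(u), \qquad v_i = u_i, \qquad a = 1, 2, 3, \quad i = 1, \ldots, n,$$

we reduce the operators $J_a$ from (30) to the form

$$\begin{aligned} J_1 &= -y_2 \partial_{y_3} + y_3 \partial_{y_2} + A\, \partial_{y_1} + B\, \partial_{y_2} + C\, \partial_{y_3} + \mathcal{J}_1, \\ J_2 &= -y_3 \partial_{y_1} + y_1 \partial_{y_3} + F\, \partial_{y_2} + G\, \partial_{y_3} + \mathcal{J}_2, \\ J_3 &= -y_1 \partial_{y_2} + y_2 \partial_{y_1} + H\, \partial_{y_3} + \mathcal{J}_3, \end{aligned} \tag{31}$$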

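where $A, B, C, F, G, H$ are arbitrary smooth functions of $v_1, \ldots, v_n$. Substituting the operators (31) into (7) and equating coefficients of the linearly-independent operators $\partial_{y_1}, \partial_{y_2}, \partial_{y_3}, \partial_{v_1}, \ldots, \partial_{v_n}$ results in the following system of partial differential equations:

$$\begin{aligned} &1)\ \mathcal{J}_2 A = -C, & &6)\ \mathcal{J}_3 C - \mathcal{J}_1 H = G, \\ &2)\ \mathcal{J}_3 F = -B, & &7)\ \mathcal{J}_1 G - \mathcal{J}_2 C = H - A - F, \\ &3)\ \mathcal{J}_3 A = B, & &8)\ \mathcal{J}_3 B = F - A - H, \\ &4)\ \mathcal{J}_1 F - \mathcal{J}_2 B = G, & &9)\ A - F - H = 0. \\ &5)\ \mathcal{J}_2 H - \mathcal{J}_3 G = C, & & \end{aligned} \tag{32}$$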


Case 1. All operators $\mathcal{J}_1, \mathcal{J}_2, \mathcal{J}_3$ are equal to zero. Then (32) reduces to the system of linear algebraic equations

$$B = C = G = 0, \qquad H - A - F = 0, \qquad F - A - H = 0, \qquad A - F - H = 0,$$

whence it immediately follows that $A = F = H = 0$. Substituting the above results into Formulae (31) we arrive at realization (24).

Case 2. Suppose now that not all operators $\mathcal{J}_1, \mathcal{J}_2, \mathcal{J}_3$ vanish. Then they are given either by Formulae (13) or by Formulae (14), where one should replace $u_1, \ldots, u_n$ by $v_1, \ldots, v_n$. As in both cases $\mathcal{J}_3 = \partial_{v_1}$, the subsystem of Eqs. 2, 3, 8, 9 forms a system of linear ordinary differential equations for the functions $A, B, F, H$ with respect to $v_1$. Integrating it we have

$$\begin{aligned} A &= B_0 + B_1 \sin 2v_1 - B_2 \cos 2v_1, & B &= 2B_1 \cos 2v_1 + 2B_2 \sin 2v_1, \\ F &= B_0 + B_2 \cos 2v_1 - B_1 \sin 2v_1, & H &= 2B_1 \sin 2v_1 - 2B_2 \cos 2v_1, \end{aligned} \tag{33}$$

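where $B_0, B_1, B_2$ are arbitrary smooth functions of $v_2, \ldots, v_n$.

Subcase 2.1. Let the operators $\mathcal{J}_1, \mathcal{J}_2, \mathcal{J}_3$ be of the form (13). Then, making a transformation

$$\begin{aligned} z_1 &= y_1 + R_1 \cos v_1 + R_2 \sin v_1, \\ z_2 &= y_2 + R_2 \cos v_1 - R_1 \sin v_1, \\ z_3 &= y_3 + \tfrac{1}{2}(R_{2 v_2} + \tan v_2\, R_2) \cos 2v_1 - \tfrac{1}{2}(R_{1 v_2} + \tan v_2\, R_1) \sin 2v_1 + \tfrac{1}{2}(\tan v_2\, R_2 - R_{2 v_2}), \end{aligned}$$

where the functions $R_1$, $R_2$ are solutions of the system of partial differential equations

$$R_{1 v_2} + \tan v_2\, R_1 = -2B_2, \qquad R_{2 v_2} + \tan v_2\, R_2 = 2B_1,$$

we reduce the operators (31) with $A, B, F, H$ given by (33) to the form

$$\begin{aligned} J_1 &= -z_2 \partial_{z_3} + z_3 \partial_{z_2} + \tilde A\, \partial_{z_1} + \tilde C\, \partial_{z_3} + \mathcal{J}_1, \\ J_2 &= -z_3 \partial_{z_1} + z_1 \partial_{z_3} + \tilde A\, \partial_{z_2} + \tilde G\, \partial_{z_3} + \mathcal{J}_2, \\ J_3 &= -z_1 \partial_{z_2} + z_2 \partial_{z_1} + \mathcal{J}_3. \end{aligned} \tag{34}$$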

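Here $\tilde A$, $\tilde C$, $\tilde G$ are arbitrary smooth functions of $v_1, \ldots, v_n$ and, moreover, $\tilde A$ does not depend on $v_1$. Given such a form of the operators $J_a$, system (32) reduces to three differential equations

$$\mathcal{J}_2 \tilde A = -\tilde C, \qquad \mathcal{J}_1 \tilde A = \tilde G, \qquad \mathcal{J}_1 \tilde G - \mathcal{J}_2 \tilde C = -2\tilde A. \tag{35}$$

Inserting the expressions for the operators $\mathcal{J}_1$, $\mathcal{J}_2$ from (13) into the first two equations we have

$$\tilde C = -\sin v_1\, \tilde A_{v_2}, \qquad \tilde G = -\cos v_1\, \tilde A_{v_2}.$$

Substituting the above formulae into the third equation of system (35) we conclude that it is equivalent to the differential equation

$$\tilde A_{v_2 v_2} - \tan v_2\, \tilde A_{v_2} + 2\tilde A = 0,$$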


whose general solution is given by (27). Finally, inserting the results obtained into (34) we get Formulae (25).

Subcase 2.2. Let the operators $\mathcal{J}_1, \mathcal{J}_2, \mathcal{J}_3$ be of the form (14). Then, making a transformation

$$\begin{aligned} z_1 &= y_1 + R_1 \cos v_1 + R_2 \sin v_1, \\ z_2 &= y_2 + R_2 \cos v_1 - R_1 \sin v_1, \\ z_3 &= y_3 + \tfrac{1}{2}(R_{2 v_2} - \sec v_2\, R_{1 v_3} + \tan v_2\, R_2) \cos 2v_1 - \tfrac{1}{2}(R_{1 v_2} + \sec v_2\, R_{2 v_3} + \tan v_2\, R_1) \sin 2v_1 \\ &\quad + \tfrac{1}{2}(\tan v_2\, R_2 - \sec v_2\, R_{1 v_3} - R_{2 v_2}), \end{aligned}$$

where the functions $R_1$, $R_2$ are solutions of the system of partial differential equations

$$\begin{aligned} 2B_1 &= R_{2 v_2} - \sec v_2\, R_{1 v_3} + \tan v_2\, R_2, \\ 2B_2 &= -R_{1 v_2} - \sec v_2\, R_{2 v_3} - \tan v_2\, R_1, \end{aligned}$$

we reduce the operators (31) with $A, B, F, H$ given by (33) to the form (34), where $\tilde A$, $\tilde C$, $\tilde G$ are arbitrary smooth functions and, moreover, $\tilde A$ does not depend on $v_1$. Given such a form of the operators $J_a$, system (32) reduces to the three differential equations (35). Inserting the expressions for the operators $\mathcal{J}_1$, $\mathcal{J}_2$ from (14) into the first two equations of (35) we have

$$\begin{aligned} \tilde C &= -\sin v_1\, \tilde A_{v_2} - \cos v_1 \sec v_2\, \tilde A_{v_3}, \\ \tilde G &= -\cos v_1\, \tilde A_{v_2} + \sin v_1 \sec v_2\, \tilde A_{v_3}. \end{aligned} \tag{36}$$

Substituting the above formulae into the third equation from (35), after some algebra we arrive at the conclusion that it is equivalent to Eq. (28). Inserting (36) into (34) yields Formulae (26).

Thus we have proved that if LVFs $P_a$, $J_a$ form a covariant realization of the Euclid algebra $AE(3)$, then they can be reduced to one of the forms (24)–(26) by means of an invertible transformation (3). The theorem is proved.

While proving Theorem 2, we have established, in particular, that any realization of the Euclid algebra satisfying condition (8) can be transformed to become

$$P_a = \partial_{x_a}, \qquad J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c} + j_{ab}(u)\,\partial_{x_b} + \tilde\eta_{ai}(u)\,\partial_{u_i}, \qquad a = 1, 2, 3.$$

If we choose in the above formulae

$$j_{ab}(u) = 0, \qquad \tilde\eta_{ai}(u) = -\Lambda_{aij}\, u_j, \qquad a, b = 1, 2, 3, \quad i = 1, \ldots, n,$$

where $\Lambda_{aij} = \mathrm{const}$, then the following realization is obtained:

$$P_a = \partial_{x_a}, \qquad J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c} + \mathcal{J}_a, \qquad a = 1, 2, 3, \tag{37}$$

with $\mathcal{J}_a = -\Lambda_{aij}\, u_j\, \partial_{u_i}$.
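For orientation, the $x$-part $-\varepsilon_{abc} x_b \partial_{x_c}$ shared by (24)–(26) and (37) can be checked against relations (6) and (7) by hand. The following is a minimal sketch; the summation index $d$ is introduced only to avoid a clash with the free index $b$.

```latex
\begin{align*}
[-\varepsilon_{adc}\,x_d\,\partial_{x_c},\ \partial_{x_b}]
  &= \varepsilon_{adc}\,\delta_{db}\,\partial_{x_c}
   = \varepsilon_{abc}\,\partial_{x_c} = \varepsilon_{abc}\,P_c, \\
[-x_2\partial_{x_3} + x_3\partial_{x_2},\ -x_3\partial_{x_1} + x_1\partial_{x_3}]
  &= x_2\,\partial_{x_1} - x_1\,\partial_{x_2}
   = -\varepsilon_{3bc}\,x_b\,\partial_{x_c} .
\end{align*}
```

In (37) the additional term $\mathcal{J}_a$ depends on $u$ only, so it commutes with $P_b = \partial_{x_b}$ and does not affect the first line.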


A realization of the Euclid algebra with generators of the form (37) is called a covariant realization in classical linear representation theory. That is why it is natural to preserve the same terminology for a realization of the algebra $AE(3)$ within the class of LVFs obeying (8).

As an illustration of Theorem 2 we will demonstrate how to reduce the realizations of the Euclid algebras forming the symmetry algebras of the heat, wave, Laplace, Navier–Stokes, Lamé, Weyl, Dirac and Maxwell equations to one of the three canonical forms (24)–(26).

First of all, we note that realization (24) is exactly the one realized on the sets of solutions of the linear and nonlinear heat (Schrödinger), wave and Laplace equations.

Symmetry algebras of the Navier–Stokes and Lamé equations contain as a subalgebra the Euclid algebra having basis elements (37), where (see, e.g. [6])

$$\mathcal{J}_a = -\varepsilon_{abc}\, v_b\, \partial_{v_c}, \qquad a = 1, 2, 3. \tag{38}$$

The change of variables

$$v_1 = u_3 \sin u_1 \cos u_2, \qquad v_2 = u_3 \cos u_1 \cos u_2, \qquad v_3 = u_3 \sin u_2$$

reduces these LVFs to the form (25) with $f = 0$.

Next, if we consider the Weyl equation as a system of four real equations for four real-valued functions $v_1, v_2, w_1, w_2$, then on the set of its solutions realization (37) of the algebra $AE(3)$ is realized, where [3, 7]

$$\begin{aligned} \mathcal{J}_1 &= \tfrac{1}{2}(w_2 \partial_{v_1} - v_1 \partial_{w_2} + w_1 \partial_{v_2} - v_2 \partial_{w_1}), \\ \mathcal{J}_2 &= \tfrac{1}{2}(v_2 \partial_{v_1} - v_1 \partial_{v_2} + w_2 \partial_{w_1} - w_1 \partial_{w_2}), \\ \mathcal{J}_3 &= \tfrac{1}{2}(w_1 \partial_{v_1} - v_1 \partial_{w_1} + v_2 \partial_{w_2} - w_2 \partial_{v_2}). \end{aligned} \tag{39}$$

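Making the change of variables

$$\begin{aligned} v_1 &= u_4 \left( \sin\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \cos\tfrac{u_3}{2} + \cos\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \sin\tfrac{u_3}{2} \right), \\ v_2 &= u_4 \left( \cos\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \cos\tfrac{u_3}{2} - \sin\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \sin\tfrac{u_3}{2} \right), \\ w_1 &= u_4 \left( \cos\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \cos\tfrac{u_3}{2} - \sin\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \sin\tfrac{u_3}{2} \right), \\ w_2 &= u_4 \left( \sin\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \cos\tfrac{u_3}{2} + \cos\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \sin\tfrac{u_3}{2} \right) \end{aligned}$$

reduces the above LVFs to the form (26) with $g = 0$.

On the solution set of the Maxwell equations the realization of the Euclid algebra (37), where

$$\mathcal{J}_a = -\varepsilon_{abc}\,(E_b\, \partial_{E_c} + H_b\, \partial_{H_c}), \qquad a = 1, 2, 3,$$

is realized [19].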


This realization is reduced to the form (26) with $g = 0$ with the help of the change of variables

$$\begin{aligned} E_1 &= u_6 \sin u_1 \cos u_2, \\ E_2 &= u_6 \cos u_1 \cos u_2, \\ E_3 &= u_6 \sin u_2, \\ H_1 &= u_4 (\cos u_1 \sin u_3 + \sin u_1 \sin u_2 \cos u_3) + u_5 \sin u_1 \cos u_2, \\ H_2 &= u_4 (\cos u_1 \sin u_2 \cos u_3 - \sin u_1 \sin u_3) + u_5 \cos u_1 \cos u_2, \\ H_3 &= -u_4 \cos u_2 \cos u_3 + u_5 \sin u_2. \end{aligned}$$

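Taking the Dirac matrices $\gamma_\mu$ in the Majorana representation we can represent the Dirac equation as a system of eight real equations for eight real-valued functions $\psi_1^0, \ldots, \psi_1^3, \psi_2^0, \ldots, \psi_2^3$ (for details, see e.g. [7]). With this choice of $\gamma$-matrices, the realization of the Euclid algebra (37) with

$$\begin{aligned} \mathcal{J}_1 &= \tfrac{1}{2}\left( \psi_1^3 \partial_{\psi_1^0} + \psi_1^2 \partial_{\psi_1^1} - \psi_1^1 \partial_{\psi_1^2} - \psi_1^0 \partial_{\psi_1^3} + \psi_2^3 \partial_{\psi_2^0} + \psi_2^2 \partial_{\psi_2^1} - \psi_2^1 \partial_{\psi_2^2} - \psi_2^0 \partial_{\psi_2^3} \right), \\ \mathcal{J}_2 &= \tfrac{1}{2}\left( -\psi_1^2 \partial_{\psi_1^0} + \psi_1^3 \partial_{\psi_1^1} + \psi_1^0 \partial_{\psi_1^2} - \psi_1^1 \partial_{\psi_1^3} - \psi_2^2 \partial_{\psi_2^0} + \psi_2^3 \partial_{\psi_2^1} + \psi_2^0 \partial_{\psi_2^2} - \psi_2^1 \partial_{\psi_2^3} \right), \\ \mathcal{J}_3 &= \tfrac{1}{2}\left( \psi_1^1 \partial_{\psi_1^0} - \psi_1^0 \partial_{\psi_1^1} + \psi_1^3 \partial_{\psi_1^2} - \psi_1^2 \partial_{\psi_1^3} + \psi_2^1 \partial_{\psi_2^0} - \psi_2^0 \partial_{\psi_2^1} + \psi_2^3 \partial_{\psi_2^2} - \psi_2^2 \partial_{\psi_2^3} \right) \end{aligned}$$

is realized on the set of solutions of the Dirac equation. Making the change of variables

$$\begin{aligned} \psi_1^0 &= u_4 \left( \cos\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \sin\tfrac{u_3}{2} + \sin\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \cos\tfrac{u_3}{2} \right), \\ \psi_1^1 &= u_4 \left( \sin\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \sin\tfrac{u_3}{2} - \cos\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \cos\tfrac{u_3}{2} \right), \\ \psi_1^2 &= -u_4 \left( \cos\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \cos\tfrac{u_3}{2} - \sin\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \sin\tfrac{u_3}{2} \right), \\ \psi_1^3 &= -u_4 \left( \sin\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \cos\tfrac{u_3}{2} + \cos\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \sin\tfrac{u_3}{2} \right), \\ \psi_2^0 &= u_5 \left( \sin\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \sin\tfrac{u_3 + u_6}{2} - \cos\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \cos\tfrac{u_3 + u_6}{2} \right) \\ &\quad + u_7 \left( \sin\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \sin\tfrac{u_3 + u_8}{2} - \cos\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \cos\tfrac{u_3 + u_8}{2} \right), \\ \psi_2^1 &= -u_5 \left( \sin\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \cos\tfrac{u_3 + u_6}{2} + \cos\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \sin\tfrac{u_3 + u_6}{2} \right) \\ &\quad - u_7 \left( \sin\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \cos\tfrac{u_3 + u_8}{2} - \cos\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \sin\tfrac{u_3 + u_8}{2} \right), \\ \psi_2^2 &= -u_5 \left( \cos\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \sin\tfrac{u_3 + u_6}{2} + \sin\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \cos\tfrac{u_3 + u_6}{2} \right) \\ &\quad + u_7 \left( \cos\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \sin\tfrac{u_3 + u_8}{2} + \sin\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \cos\tfrac{u_3 + u_8}{2} \right), \\ \psi_2^3 &= u_5 \left( \cos\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \cos\tfrac{u_3 + u_6}{2} - \sin\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \sin\tfrac{u_3 + u_6}{2} \right) \\ &\quad - u_7 \left( \cos\tfrac{u_1}{2} \cos\tfrac{u_2}{2} \cos\tfrac{u_3 + u_8}{2} - \sin\tfrac{u_1}{2} \sin\tfrac{u_2}{2} \sin\tfrac{u_3 + u_8}{2} \right) \end{aligned}$$

reduces the above realization to the form (26) with $g = 0$.

5. Covariant Realizations of the Lie Algebra of the Group E(4)

We recall that the basis elements of the Lie algebra of the Euclid group $E(4)$ fulfill the following commutation relations:

$$[P_\alpha, P_\beta] = 0, \tag{40}$$

$$[J_{\mu\nu}, P_\alpha] = \delta_{\mu\alpha} P_\nu - \delta_{\nu\alpha} P_\mu, \tag{41}$$

$$[J_{\alpha\beta}, J_{\mu\nu}] = \delta_{\alpha\mu} J_{\beta\nu} + \delta_{\beta\nu} J_{\alpha\mu} - \delta_{\alpha\nu} J_{\beta\mu} - \delta_{\beta\mu} J_{\alpha\nu}, \tag{42}$$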

where $\alpha, \beta, \mu, \nu = 1, 2, 3, 4$.

Using the results of the previous sections and the fact that the Lie algebra of the rotation group $O(4)$ is the direct sum of two algebras $AO(3)$, we will obtain a description of covariant realizations of the Lie algebra (40)–(42) within the class of LVFs

$$P_\mu = \xi_{\mu\nu}(x, u)\,\partial_{x_\nu} + \eta_{\mu i}(x, u)\,\partial_{u_i}, \qquad J_{\mu\nu} = \xi_{\mu\nu\alpha}(x, u)\,\partial_{x_\alpha} + \eta_{\mu\nu i}(x, u)\,\partial_{u_i}$$

with $J_{\mu\nu} = -J_{\nu\mu}$. Here the indices $\mu, \nu, \alpha$ take the values 1, 2, 3, 4 and the index $i$ takes the values $1, \ldots, n$.

As we consider covariant realizations, the mutually commuting operators $P_\mu$ satisfy (20) with $N = 4$. Hence, due to Lemma 1, they can be reduced to the form $P_\mu = \partial_{x_\mu}$, $\mu = 1, 2, 3, 4$. Next, using the commutation relations (41) we establish that the operators $J_{\mu\nu}$ have the following structure:

$$J_{\mu\nu} = x_\nu \partial_{x_\mu} - x_\mu \partial_{x_\nu} + f_{\mu\nu\alpha}(u)\,\partial_{x_\alpha} + g_{\mu\nu i}(u)\,\partial_{u_i} \tag{43}$$

with arbitrary sufficiently smooth $f_{\mu\nu\alpha}$, $g_{\mu\nu i}$.

In what follows we will restrict our considerations to the case when $f_{\mu\nu\alpha} \equiv 0$ in (43). Geometrically, this means that the transformation groups generated by the operators $J_{\mu\nu}$ in the space of independent variables are standard rotations in the planes $(x_\mu, x_\nu)$. With this restriction the LVFs $J_{\mu\nu}$ take the form

$$J_{\mu\nu} = x_\nu \partial_{x_\mu} - x_\mu \partial_{x_\nu} + \mathcal{J}_{\mu\nu}, \tag{44}$$

where

$$\mathcal{J}_{\mu\nu} = g_{\mu\nu i}(u)\,\partial_{u_i} \tag{45}$$

and, furthermore, $g_{\mu\nu i}(u) = -g_{\nu\mu i}(u)$.


Inserting the LVFs (44) into (42) we come to the conclusion that the operators $\mathcal{J}_{\mu\nu}$ satisfy the commutation relations of the Lie algebra of the rotation group $O(4)$,

$$[\mathcal{J}_{\alpha\beta}, \mathcal{J}_{\mu\nu}] = \delta_{\alpha\mu} \mathcal{J}_{\beta\nu} + \delta_{\beta\nu} \mathcal{J}_{\alpha\mu} - \delta_{\alpha\nu} \mathcal{J}_{\beta\mu} - \delta_{\beta\mu} \mathcal{J}_{\alpha\nu}. \tag{46}$$

An exhaustive description of inequivalent realizations of the above Lie algebra within the class of LVFs (45) is given below. It is based on results of Sect. 2 and on the well-known fact that the algebra $AO(4)$ is decomposed into the direct sum of two algebras $AO(3)$. This is achieved by choosing the basis of $AO(4)$ in the following way:

$$J_a^{\pm} = \frac{1}{2}\left( \frac{1}{2}\, \varepsilon_{abc}\, \mathcal{J}_{bc} \pm \mathcal{J}_{a4} \right), \tag{47}$$

where the indices $a, b, c$ take the values 1, 2, 3. Due to (46) the LVFs $J_a^-$, $J_a^+$ fulfill the following commutation relations:

$$[J_a^+, J_b^+] = \varepsilon_{abc} J_c^+, \tag{48}$$

$$[J_a^+, J_b^-] = 0, \tag{49}$$

$$[J_a^-, J_b^-] = \varepsilon_{abc} J_c^-, \tag{50}$$
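which is the same as what was required.

Now we are ready to formulate an assertion giving an exhaustive description of LVFs (45) satisfying the commutation relations (46) or, equivalently, (48)–(50).

Theorem 3. Any realization of the Lie algebra $AO(4)$ within the class of LVFs (45) is given by Formulae (47) and by one of the Formulae 1–6 presented below.

1.

$$\begin{aligned} J_1^+ &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2}, & J_1^- &= -\sin u_3 \tan u_4\, \partial_{u_3} - \cos u_3\, \partial_{u_4}, \\ J_2^+ &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2}, & J_2^- &= -\cos u_3 \tan u_4\, \partial_{u_3} + \sin u_3\, \partial_{u_4}, \\ J_3^+ &= \partial_{u_1}, & J_3^- &= \partial_{u_3}; \end{aligned}$$

2.

$$\begin{aligned} J_1^+ &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2}, & J_1^- &= -\sin u_3 \tan u_4\, \partial_{u_3} - \cos u_3\, \partial_{u_4} - \sin u_3 \sec u_4\, \partial_{u_5}, \\ J_2^+ &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2}, & J_2^- &= -\cos u_3 \tan u_4\, \partial_{u_3} + \sin u_3\, \partial_{u_4} - \cos u_3 \sec u_4\, \partial_{u_5}, \\ J_3^+ &= \partial_{u_1}, & J_3^- &= \partial_{u_3}; \end{aligned}$$

3.

$$\begin{aligned} J_1^+ &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} - \sin u_1 \sec u_2\, \partial_{u_3}, & J_1^- &= \sec u_2 \cos u_3\, \partial_{u_1} + \sin u_3\, \partial_{u_2} - \tan u_2 \cos u_3\, \partial_{u_3}, \\ J_2^+ &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} - \cos u_1 \sec u_2\, \partial_{u_3}, & J_2^- &= -\sec u_2 \sin u_3\, \partial_{u_1} + \cos u_3\, \partial_{u_2} + \tan u_2 \sin u_3\, \partial_{u_3}, \\ J_3^+ &= \partial_{u_1}, & J_3^- &= \partial_{u_3}; \end{aligned}$$

4.

$$\begin{aligned} J_1^+ &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} - \sin u_1 \sec u_2\, \partial_{u_3}, & J_1^- &= -\sin u_4 \tan u_5\, \partial_{u_4} - \cos u_4\, \partial_{u_5} - \sin u_4 \sec u_5\, \partial_{u_6}, \\ J_2^+ &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} - \cos u_1 \sec u_2\, \partial_{u_3}, & J_2^- &= -\cos u_4 \tan u_5\, \partial_{u_4} + \sin u_4\, \partial_{u_5} - \cos u_4 \sec u_5\, \partial_{u_6}, \\ J_3^+ &= \partial_{u_1}, & J_3^- &= \partial_{u_4}; \end{aligned}$$

5.

$$\begin{aligned} J_1^+ &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} - \sin u_1 \sec u_2\, \partial_{u_3}, & J_1^- &= k \sin u_4 \sec u_5\, \partial_{u_3} - \sin u_4 \tan u_5\, \partial_{u_4} - \cos u_4\, \partial_{u_5}, \\ J_2^+ &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} - \cos u_1 \sec u_2\, \partial_{u_3}, & J_2^- &= k \cos u_4 \sec u_5\, \partial_{u_3} - \cos u_4 \tan u_5\, \partial_{u_4} + \sin u_4\, \partial_{u_5}, \\ J_3^+ &= \partial_{u_1}, & J_3^- &= \partial_{u_4}; \end{aligned}$$

6.

$$\begin{aligned} J_1^+ &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} - \sin u_1 \sec u_2\, \partial_{u_3}, & J_1^- &= u_6 \sin u_4 \sec u_5\, \partial_{u_3} - \sin u_4 \tan u_5\, \partial_{u_4} - \cos u_4\, \partial_{u_5}, \\ J_2^+ &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} - \cos u_1 \sec u_2\, \partial_{u_3}, & J_2^- &= u_6 \cos u_4 \sec u_5\, \partial_{u_3} - \cos u_4 \tan u_5\, \partial_{u_4} + \sin u_4\, \partial_{u_5}, \\ J_3^+ &= \partial_{u_1}, & J_3^- &= \partial_{u_4}, \end{aligned}$$

where $k = \mathrm{const}$, $k \neq 0$.

Proof. We will give the principal steps of the proof, omitting intermediate computations. According to Theorem 1, there are two inequivalent realizations of the algebra $AO(3)$ with basis elements $J_1^+, J_2^+, J_3^+$:

1.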

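$$\begin{aligned} J_1^+ &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2}, \\ J_2^+ &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2}, \\ J_3^+ &= \partial_{u_1}; \end{aligned}$$

2.

$$\begin{aligned} J_1^+ &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} - \sin u_1 \sec u_2\, \partial_{u_3}, \\ J_2^+ &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} - \cos u_1 \sec u_2\, \partial_{u_3}, \\ J_3^+ &= \partial_{u_1}. \end{aligned} \tag{51}$$

To complete the classification of inequivalent realizations of $AO(4)$ we have to find all the triplets of operators $J_1^-, J_2^-, J_3^-$ which together with the operators (51) satisfy (49), (50).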


Analyzing the commutation relations (49) we arrive at the following expressions for the operators $J_1^-, J_2^-, J_3^-$:

1. $J_a^- = \sum\limits_{i=3}^{n} f_{ai}(u_3, \ldots, u_n)\,\partial_{u_i}$,

2. $J_a^- = \sum\limits_{b=1}^{3} f_{ab}(u_4, \ldots, u_n)\, Q_b + \sum\limits_{i=4}^{n} f_{ai}(u_4, \ldots, u_n)\,\partial_{u_i}$,

where $f_{ij}$ are arbitrary smooth functions and

$$\begin{aligned} Q_1 &= \sec u_2 \cos u_3\, \partial_{u_1} + \sin u_3\, \partial_{u_2} - \tan u_2 \cos u_3\, \partial_{u_3}, \\ Q_2 &= -\sec u_2 \sin u_3\, \partial_{u_1} + \cos u_3\, \partial_{u_2} + \tan u_2 \sin u_3\, \partial_{u_3}, \\ Q_3 &= \partial_{u_3}. \end{aligned}$$

Note that the operators $Q_a$ fulfill the commutation relations of the algebra $AO(3)$. Hence, we conclude that for Case 1 from (51) the operators $J_a^-$ are given by the Formulae (51), where one should replace $u_i$ by $u_{i+2}$, correspondingly.

Let us turn now to the second realization of the algebra $AO(3)$ from (51).

Case 1. $f_{ai} = 0$, $a = 1, 2, 3$, $i = 4, \ldots, n$. In this case we can reduce $J_1^-$ to the form $J_1^- = \tilde r(u_4, \ldots, u_n)\, Q_1$ with the help of the equivalence transformation

$$X \to \tilde X = V X V^{-1}, \qquad V = \exp\left\{ \sum_{a=1}^{3} F_a Q_a \right\}, \tag{52}$$

where $F_a$ are some functions of $u_4, \ldots, u_n$. Note that transformation (52) does not change the form of the operators $J_a^+$, since $[J_a^+, Q_b] = 0$, $a, b = 1, 2, 3$. From the commutation relations (50) it follows that $\tilde r = 1$ and furthermore $J_2^- = Q_2$, $J_3^- = Q_3$. Thus we get the following forms of the operators $J_a^-$:

$$\begin{aligned} J_1^- &= \sec u_2 \cos u_3\, \partial_{u_1} + \sin u_3\, \partial_{u_2} - \tan u_2 \cos u_3\, \partial_{u_3}, \\ J_2^- &= -\sec u_2 \sin u_3\, \partial_{u_1} + \cos u_3\, \partial_{u_2} + \tan u_2 \sin u_3\, \partial_{u_3}, \\ J_3^- &= \partial_{u_3}. \end{aligned}$$

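Case 2. Not all $f_{ai}$ vanish. Then the operators $J_1^-, J_2^-, J_3^-$ can be transformed to become

$$J_a^- = f_a(u_4, \ldots, u_n)\, Q_1 + g_a(u_4, \ldots, u_n)\, Q_2 + h_a(u_4, \ldots, u_n)\, Q_3 + Z_a,$$

where $a = 1, 2, 3$,

$$\begin{aligned} Z_1 &= -\sin u_4 \tan u_5\, \partial_{u_4} - \cos u_4\, \partial_{u_5} - \varepsilon \sin u_4 \sec u_5\, \partial_{u_6}, \\ Z_2 &= -\cos u_4 \tan u_5\, \partial_{u_4} + \sin u_4\, \partial_{u_5} - \varepsilon \cos u_4 \sec u_5\, \partial_{u_6}, \\ Z_3 &= \partial_{u_4}, \end{aligned}$$

and $\varepsilon = 0, 1$.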


Now using transformation (52) we reduce the operator $J_3^-$ to the form $Z_3 = \partial_{u_4}$. Next, from the commutation relations

$$[J_3^-, J_1^-] = J_2^-, \qquad [J_3^-, J_2^-] = -J_1^-$$

we get

$$\begin{aligned} J_1^- &= \sum_{a=1}^{3} (G_a \cos u_4 + H_a \sin u_4)\, Q_a + Z_1, \\ J_2^- &= \sum_{a=1}^{3} (H_a \cos u_4 - G_a \sin u_4)\, Q_a + Z_2, \end{aligned}$$

where $G_a$, $H_a$ are arbitrary smooth functions of $u_5, \ldots, u_n$. Making use of the equivalence transformation (52) with $F_a$ being functions of $u_5, \ldots, u_n$, we can cancel the coefficients $G_a$. The remaining commutation relation $[J_1^-, J_2^-] = J_3^-$ yields equations for $H_1, H_2, H_3$,

$$H_{a u_5} - \tan u_5\, H_a = 0,$$

whence

$$H_a = \tilde H_a \sec u_5, \qquad a = 1, 2, 3,$$

$\tilde H_a$ being arbitrary functions of $u_6, \ldots, u_n$. Consequently, the operators $J_a^-$ read

$$\begin{aligned} J_1^- &= \sum_{a=1}^{3} \sin u_4 \sec u_5\, \tilde H_a Q_a + Z_1, \\ J_2^- &= \sum_{a=1}^{3} \cos u_4 \sec u_5\, \tilde H_a Q_a + Z_2, \\ J_3^- &= Z_3. \end{aligned}$$

If $\varepsilon = 1$, then using the transformation (52) with $F_a$ being functions of $u_6, \ldots, u_n$ we can cancel $\tilde H_a$, thus getting $J_a^- = Z_a$, $a = 1, 2, 3$.

If $\varepsilon = 0$, then making use of the transformation (52) with $F_a$ being functions of $u_6, \ldots, u_n$ we can put $\tilde H_1 = \tilde H_2 = 0$. Provided $\tilde H_3 = 0$, we get the realization which is reduced to that given by Formulae 2 from the statement of the theorem. Provided $\tilde H_3 = \mathrm{const} \neq 0$, we get Formulae 5. Finally, if $\tilde H_3 \neq \mathrm{const}$, then performing a proper change of variables we arrive at the realization given by Formulae 6 from the statement of the theorem. The theorem is proved.

It follows from the above theorem that Formulae (47) and 1–6 of the statement of Theorem 3 give six inequivalent realizations of the Lie algebra of the Euclid group $E(4)$ having the basis elements $P_\mu = \partial_{x_\mu}$ and (44), (45). To get all possible realizations of the algebra in question belonging to the above class, it is necessary to add to the list of realizations of the algebra $AO(4)$ obtained in Theorem 3 the following three realizations of the operators $J_a^-$, $J_a^+$:

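1. $J_1^+ = -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2}$, $J_2^+ = -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2}$, $J_3^+ = \partial_{u_1}$, $J_a^- = 0$;

2. $J_1^+ = -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} - \sin u_1 \sec u_2\, \partial_{u_3}$, $J_2^+ = -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} - \cos u_1 \sec u_2\, \partial_{u_3}$, $J_3^+ = \partial_{u_1}$, $J_a^- = 0$;

3. $J_a^+ = 0$, $J_a^- = 0$,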

where a = 1, 2, 3. This yields nine inequivalent realizations of the Lie algebra of the group E(4). In particular, the basis generators of the Euclid groups realized on the sets of solutions of the Dirac and self-dual Yang-Mills equations in the Euclidean space R4 are reduced to such a form that the generators of the rotation groups are given by (44), (45), Jµν being adduced in Formulae 4 of the statement of Theorem 3.
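The commutation relations used in Case 2 above, $[Z_3, Z_1] = Z_2$, $[Z_3, Z_2] = -Z_1$, $[Z_1, Z_2] = Z_3$, can be spot-checked numerically. The following is only an illustrative sketch, not part of the original argument: the helper names (`make_fields`, `lie_bracket`, `max_residual`) are hypothetical, the fields are restricted to the three variables $(u_4, u_5, u_6)$ they involve, and derivatives are approximated by central finite differences.

```python
import math


def make_fields(eps):
    """Return (Z1, Z2, Z3) as coefficient functions of u = (u4, u5, u6).

    Each field is given by its coefficients of (d/du4, d/du5, d/du6),
    following the formulas for Z1, Z2, Z3 in Case 2 (eps = 0 or 1).
    """
    def z1(u):
        u4, u5, _ = u
        return (-math.sin(u4) * math.tan(u5),
                -math.cos(u4),
                -eps * math.sin(u4) / math.cos(u5))

    def z2(u):
        u4, u5, _ = u
        return (-math.cos(u4) * math.tan(u5),
                math.sin(u4),
                -eps * math.cos(u4) / math.cos(u5))

    def z3(u):
        return (1.0, 0.0, 0.0)

    return z1, z2, z3


def lie_bracket(X, Y, u, h=1e-5):
    """Coefficients of [X, Y] at the point u.

    Uses [X, Y]^i = X^j d_j Y^i - Y^j d_j X^i with central differences.
    """
    n = len(u)
    Xu, Yu = X(u), Y(u)
    result = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            up = list(u)
            um = list(u)
            up[j] += h
            um[j] -= h
            dY = (Y(up)[i] - Y(um)[i]) / (2 * h)
            dX = (X(up)[i] - X(um)[i]) / (2 * h)
            total += Xu[j] * dY - Yu[j] * dX
        result.append(total)
    return tuple(result)


def max_residual(eps, u):
    """Largest deviation from the three so(3) relations at the point u."""
    z1, z2, z3 = make_fields(eps)
    checks = [
        (lie_bracket(z3, z1, u), z2(u)),                      # [Z3, Z1] = Z2
        (lie_bracket(z3, z2, u), tuple(-c for c in z1(u))),   # [Z3, Z2] = -Z1
        (lie_bracket(z1, z2, u), z3(u)),                      # [Z1, Z2] = Z3
    ]
    return max(abs(a - b) for lhs, rhs in checks for a, b in zip(lhs, rhs))


if __name__ == "__main__":
    point = (0.3, 0.4, 0.5)  # arbitrary sample point with cos(u5) != 0
    for eps in (0, 1):
        print(eps, max_residual(eps, point))
```

A residual at the level of the finite-difference error for both values of $\varepsilon$ is consistent with the relations holding identically; it is of course not a proof.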

6. Concluding Remarks

Summarizing the results of Sects. 3 and 4 yields the following structure of realizations of the Lie algebra of the rotation group by LVFs in n variables:

• If n = 1, then there are no realizations.
• As there is no realization of AO(3) by real non-zero 2×2 matrices, the only realization for the case n = 2 is given by (13). Furthermore, this realization is essentially nonlinear (i.e., it is not equivalent to a realization of the form (9)).
• In the case n = 3 there are two more realizations, given by formula (38) (which is equivalent to (13)) and by formula (14). The latter realization is essentially nonlinear.
• Provided n > 3, there is no new realization of AO(3) and, furthermore, any realization can be reduced to a linear one (say, to (39)).

An evident (and very important) consequence of Theorem 1 is that there are only two inequivalent classes of O(3)-invariant partial differential equations of order r. They are obtained via differential invariants of order not higher than r of the Lie algebras having the basis elements (13), (14). In particular, the Weyl, Maxwell and Dirac equations are special cases of the general system of first-order partial differential equations in n ≥ 8 dependent variables invariant with respect to the algebra (14). We intend to devote one of our future publications to the description of first-order differential invariants of the Lie algebra of the Euclid group E(3) having the basis elements (13), (14) and (37). Let us note that this problem has been completely solved provided the basis elements of AE(3) are given by Formulae (12) [20].

Acknowledgements. One of the authors (R. Zh.) gratefully acknowledges financial support from the Alexander von Humboldt Foundation and from the International Renaissance Foundation.


References

1. Gel’fand, I.M., Minlos, R.A. and Shapiro, Z.Ya.: Representations of the Rotation Group and of the Lorentz Group and Their Applications. New York: Macmillan, 1963
2. Barut, A.O. and Raczka, R.: Theory of Group Representations and Their Applications. Warszawa: Polish Scientific Publ., 1984
3. Fushchych, W.I. and Nikitin, A.G.: Symmetry of Equations of Quantum Mechanics. New York: Allerton Press, 1994
4. Ovsjannikov, L.V.: Group Analysis of Differential Equations. New York: Academic Press, 1982
5. Olver, P.J.: Applications of Lie Groups to Differential Equations. New York: Springer, 1986
6. Fushchych, W.I., Shtelen, W.M. and Serov, N.I.: Symmetry Analysis and Exact Solutions of Nonlinear Equations of Mathematical Physics. Kiev: Naukova Dumka, 1989 (translated into English by Dordrecht: Kluwer Academic Publishers, 1993)
7. Fushchych, W.I. and Zhdanov, R.Z.: Nonlinear Spinor Equations: Symmetry and Exact Solutions. Kiev: Naukova Dumka, 1992
8. Lie, S.: Theorie der Transformationsgruppen. Vol. 3, Leipzig: Teubner, 1893
9. Rideau, G. and Winternitz, P.: Evolution equations invariant under two-dimensional space-time Schrödinger group. J. Math. Phys. 34, N 2, 558–570 (1993)
10. Zhdanov, R.Z. and Fushchych, W.I.: On new representations of Galilei groups. J. Non. Math. Phys. 4, N 3–4, 417–424 (1997)
11. Yehorchenko, I.A.: Nonlinear representation of the Poincaré algebra and invariant equations. In: Symmetry Analysis of Equations of Mathematical Physics, Kiev, Ukraine: Math. Acad. of Sci., 1992, pp. 62–66
12. Fushchych, W.I., Tsyfra, I.M. and Boyko, W.M.: Nonlinear representations for Poincaré and Galilei algebras and nonlinear equations for electro-magnetic field. J. Non. Math. Phys. 1, N 2, 210–221 (1994)
13. Rideau, G. and Winternitz, P.: Nonlinear equations invariant under the Poincaré, similitude and conformal groups in two-dimensional space-time. J. Math. Phys. 31, N 9, 1095–1105 (1990)
14. Lahno, V.I.: On the new representations of the Poincaré and Euclid groups. Proc. Acad. of Sci. Ukraine N 8, 14–19 (1996)
15. Fushchych, W.I. and Cherniha, R.M.: Galilei-invariant nonlinear systems of evolution equations. J. Phys. A: Math. Gen. 28, N 19, 5569–5579 (1995)
16. Fushchych, W.I., Lahno, V.I. and Zhdanov, R.Z.: On nonlinear representations of the conformal algebra AC(2, 2). Proc. Acad. of Sci. Ukraine N 9, 44–47 (1993)
17. Fushchych, W.I., Zhdanov, R.Z. and Lahno, V.I.: On linear and nonlinear representations of the generalized Poincaré groups in the class of Lie vector fields. J. Non. Math. Phys. 1, N 3, 295–308 (1994)
18. Fushchych, W.I. and Zhdanov, R.Z.: Symmetries and Exact solutions of Nonlinear Dirac Equations. Kyiv: Mathematical Ukraina Publ., 1997
19. Fushchych, W.I. and Nikitin, A.G.: Symmetries of Maxwell’s Equations. Dordrecht: Reidel, 1987
20. Fushchych, W.I. and Yegorchenko, I.A.: Second-order differential invariants of the rotation group O(n) and of its extention E(n), P (1, n). Acta Appl. Math. 28, N 1, 69–92 (1992)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 212, 557 – 569 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Periodic Instantons and the Loop Group Paul Norbury Department of Mathematics and Statistics, University of Melbourne, Parkville 3052, Australia. E-mail: [email protected] Received: 10 August 1998 / Accepted: 30 January 2000

Abstract: We construct a large class of periodic instantons. Conjecturally we produce all periodic instantons. This confirms a conjecture of Garland and Murray that relates periodic instantons to orbits of the loop group acting on an extension of its Lie algebra.

1. Introduction

Periodic instantons are solutions of the anti-self-dual equations $F_B = -{*F_B}$ for a connection $B$ on a trivial vector bundle with structure group $G$ over $S^1 \times \mathbb{R}^3$. In this paper, $G$ is a compact Lie group with complexification $G^c$ equipped with a representation acting on $\mathbb{C}^n$ that is unitary on $G$. Put $B = A + \Phi\, d\theta$ so
\[ *F_A = d_A\Phi - \mu\,\partial_\theta A, \tag{1} \]

where we use the three-dimensional Hodge star operator and $\mu$ is the reciprocal of the radius of the circle. One can think of the connection and Higgs field as defined over $\mathbb{R}^3$ and dependent on the circle-valued $\theta$. Nahm studied periodic instantons, calling them calorons [17]. Later, Garland and Murray studied periodic instantons from the twistor viewpoint [7]. To remedy the fact that there was so far no existence theorem for periodic instantons nor an understanding of the topology of the moduli space of instantons (if they were to exist), they conjectured that periodic instantons can be constructed using holomorphic spheres in a flag manifold associated to the loop group. This conjecture is confirmed by the main result of this paper, Theorem 1.

Recently the study of super-symmetric Yang–Mills theory over $S^1_{1/\mu} \times \mathbb{R}^3$ has been used as further evidence for the existence of dualities in physical theories. In [21] Seiberg and Witten obtained a result for periodic instantons analogous to their 1994 work on instantons, [20], by studying the limiting behaviour when $\mu \to 0$ and $\mu \to \infty$. This led to the Rozansky-Witten invariants [19]. We will not discuss these developments here.

2. Loop Groups

Define $LG$ to be the group of smooth gauge transformations of the trivial $G$-bundle over the circle. Equivalently, $LG$ is the space of smooth maps from $S^1$ to the compact Lie group $G$. Following [7], intertwine the gauge transformations with the isometries of the circle to get the twisted product $\widehat{LG} \cong LG \,\tilde{\times}\, S^1$, where the action of $S^1$ is given by rotation. It has Lie algebra $\widehat{L\mathfrak{g}} = L\mathfrak{g} \oplus \mathbb{R}d$ with Lie bracket
\[ [X + xd, Y + yd] = [X, Y] - y\,\partial X/\partial\theta + x\,\partial Y/\partial\theta. \]
Put $\hat A = A + a\,d$, $\hat\Phi = \Phi + \phi\, d$. Then the Bogomolny equations over $\mathbb{R}^3$ for this pair are given by
\[ *F_{\hat A} = d_{\hat A}\hat\Phi. \tag{2} \]

The $d$-component is given by $*da = d\phi$ so a finite energy condition will force $a = 0$ and $\phi = \text{constant} = \mu$, say. The remaining part of (2) is then (1). Thus, one can think of periodic instantons as monopoles over $\mathbb{R}^3$ with structure group $LG$. Monopoles for finite-dimensional groups are well-studied [10, 16, 18]. In particular, the topology of the moduli space of monopoles is understood. The moduli space of monopoles with structure group $G$ is diffeomorphic to the space of holomorphic maps from the two-sphere to a homogeneous space of $G$, or equivalently to an adjoint orbit of $G$ [4, 12]. In analogy with the finite-dimensional case this led Garland and Murray to conjecture that periodic instantons are in one-to-one correspondence with based holomorphic maps from $S^2$ to orbits of $\widehat{LG}$ in $\widehat{L\mathfrak{g}}$. The following theorem addresses half of this conjecture. The action of $\widehat{LG}$ is really an action of $LG$. For $(\xi, \mu) \in \widehat{L\mathfrak{g}}$ denote its orbit by $LG \cdot (\xi, \mu)$.

Theorem 1. There is an injective map from
(i) the space of based holomorphic maps from $S^2$ to $LG \cdot (\xi, \mu)$, to
(ii) the moduli space of instantons over $S^1_{1/\mu} \times \mathbb{R}^3$.

The basing condition on the space of holomorphic maps distinguishes an element of the orbit of $LG$ that is conjecturally the asymptotic value of the Higgs field. See Sect. 6. The moduli space consists of gauge equivalence classes of connections where the gauge group consists of gauge transformations independent of $\theta$ in the limit at infinity. The full conjecture, that the map is also surjective, is equivalent to a conjecture for decay properties of finite energy periodic instantons analogous to known decay properties for monopoles. We discuss this in Sect. 6. Theorem 1 can be thought of as an extension of [13] from finite dimensional Lie groups to the loop group.


2.1. Orbits of the loop group. The loop group $LG$ acts on $\widehat{L\mathfrak{g}}$ by
\[ \gamma \cdot (\xi, \mu) = (\gamma \cdot \xi - \mu\,\gamma'\gamma^{-1}, \mu). \]
For $\xi = 0$ the orbit is given by the based loop group $\Omega G$. More generally, we get $LG \cdot (\xi, \mu) \cong LG/Z_\xi$, where the isotropy subgroup $Z_\xi$ is described explicitly in the following proposition.

Proposition 2.1 (Pressley and Segal). For $\pi_1 G = 0$ and $\mu \neq 0$ the orbits of $LG$ on $\widehat{L\mathfrak{g}}$ correspond precisely to the conjugacy classes of $G$ under the map $(\xi, \mu) \mapsto M_\xi \in G$, where $M_\xi$ is obtained by solving the ordinary differential equation $h'h^{-1} = -\mu^{-1}\xi$ and noticing $h(\theta + 2\pi) = h(\theta)M_\xi$. The isotropy subgroup of $\xi$ is given by
\[ Z_\xi = \{\gamma \in LG \mid \gamma(0) \in C[M_\xi],\ \gamma(\theta) = h(\theta)\gamma(0)h(\theta)^{-1}\}, \tag{3} \]

where C[Mξ ] is the centraliser of the conjugacy class of Mξ in G. Equivalently, the orbits are given by gauge equivalence classes of connections on a trivialised bundle over the circle of radius 1/µ. Each orbit is labeled by the underlying connection which is determined by its holonomy. In the next section we will equip the orbit of the loop group with a complex structure. 2.2. Loop groups and flat connections. Donaldson [5] re-interpreted elements of the loop group in terms of holomorphic bundles over the disk framed on the boundary, and the factorisation theorem in terms of flat connections on these bundles. He showed that each framed holomorphic bundle over the disk possesses a unique Hermitian-Yang–Mills (flat) connection. Theorem 2.2 (Donaldson). There is a 1 − 1 correspondence between (i) holomorphic bundles over D framed over ∂D; (ii) unitary Hermitian-Yang–Mills connections over D on a bundle with a unitary framing over ∂D. Donaldson’s argument generalises to parabolic bundles – holomorphic bundles over the disk with a flag specified over the origin [15]. In this case the flat connection must be singular at the origin. Proposition 2.3. There is a 1 − 1 correspondence between (i) parabolic bundles over D framed over ∂D; (ii) unitary Hermitian-Yang–Mills connections over D − {0} on a bundle with a unitary framing over ∂D. The singularity of the connection at 0 encodes the flag at 0. Following Donaldson, we can re-interpret this result in terms of a factorisation theorem for loop groups as follows. A parabolic bundle over the disk has an underlying trivial holomorphic bundle and a trivialisation compared to the framing over the boundary produces a loop γ ∈ LGc . Any other trivialisation that preserves the parabolic structure at 0 ∈ D changes γ by an element of L+ P – those loops that are boundary values of holomorphic maps from the disk to GL(n, C) with value at 0 lying in P . 
So (i) in the statement of Proposition 2.3 is equivalent to choosing an element of LGc /L+ P .


A unitary Hermitian-Yang–Mills (or, equivalently, flat) connection over $D - \{0\}$ is determined uniquely by the parabolic structure at $0 \in D$. (This would not be true if there was more than one puncture.) With respect to the unitary framing over the boundary, the flat connection defines an element of the orbit $LG \cdot \xi \subset L\mathfrak{g}$. We saw in the previous section that the orbit is isomorphic to $LG/Z_\xi$. Thus we get the following restatement of Proposition 2.3.

Corollary 2.4. For any $\xi \in L\mathfrak{g}$ we have $LG^c/L^+P \cong LG/Z_\xi$.

We could have proven the factorisation theorem in a different way. In the special case that $Z_\xi$ consists of only constant loops, Corollary 2.4 follows from the standard factorisation theorem for loop groups. In general, each orbit of $LG$ possesses a nice representative which simplifies the isotropy subgroup to consist only of constant loops, so the general case follows from the special case. The importance of the treatment here is that at the same time as establishing a complex structure on the orbit space, $\xi$ remains the natural base-point for the holomorphic map and we get an interpretation of the orbit space in terms of flat connections over the disk on a bundle framed over the boundary. In the next section we will see how a holomorphic map from $S^2$ into a space of flat connections is related to an instanton over an associated four-manifold.
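As a small worked illustration of Proposition 2.1, the holonomy $M_\xi$ can be written down explicitly in the simplest case. This is only a sketch under assumptions that are not made in the text: $\xi$ is taken to be constant in $\theta$, and $h$ is normalised by $h(0) = I$.

```latex
% Assumptions (ours, for illustration): \xi constant in \theta, \mu \neq 0, h(0) = I.
h'(\theta)\,h(\theta)^{-1} = -\mu^{-1}\xi
\;\Longrightarrow\;
h(\theta) = \exp(-\theta\,\xi/\mu),
\qquad
h(\theta + 2\pi) = h(\theta)\,\exp(-2\pi\,\xi/\mu),
\qquad\text{so}\qquad
M_\xi = \exp(-2\pi\,\xi/\mu).
% Periodicity of \gamma(\theta) = h(\theta)\gamma(0)h(\theta)^{-1} then reads
% M_\xi\,\gamma(0)\,M_\xi^{-1} = \gamma(0), i.e. \gamma(0) commutes with M_\xi.
```

In this special case the condition $\gamma(2\pi) = \gamma(0)$ in (3) reduces to $\gamma(0)$ commuting with $M_\xi$, which matches the description of $Z_\xi$ in terms of the centraliser.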

3. Instantons and Holomorphic Maps into Spaces of Flat Connections

Atiyah showed that there is a one-to-one correspondence between instantons over the four-sphere and holomorphic maps from the two-sphere to the loop group [1]. The interpretation of elements of the loop group in terms of flat connections means that Atiyah’s result can be viewed as a relationship between instantons and holomorphic maps from the two-sphere to a space of flat connections. This approach was exploited in [14]. Another result of this type was obtained by Dostoglou and Salamon [6] in their proof of the Atiyah-Floer conjecture. They showed that the instanton Floer homology associated to the three-manifold given by a mapping torus $S^1 \tilde{\times} \Sigma$ is the same as the symplectic Floer homology of the space of flat connections over $\Sigma$.

The relationship between instantons and holomorphic maps into spaces of flat connections can be understood as follows. Suppose that locally a four-manifold is given by a product of two complex curves $U \times V$ equipped with the product metric. The anti-self-dual equations with respect to local coordinates $\{w\} \times \{z\}$ are given by:
\[ [\partial^A_{\bar w}, \partial^A_{\bar z}] = 0, \qquad [\partial^A_{\bar z}, \partial^A_z] = \rho(w, z)\,[\partial^A_{\bar w}, \partial^A_w], \tag{4} \]
where $\rho(w, z)$ depends on the metrics on $U$ and $V$. Let $f : U \to M_V$ be a holomorphic map from $U$ into the space of flat connections $M_V$ over $V$. (The conformal structure on $V$ equips the space of flat connections with a natural complex structure.) Define a connection over $U \times V$ by
\[ A = df + f(w), \tag{5} \]

where $df$ is a Lie algebra valued 1-form over $U \times V$ and $f(w)$ is a flat connection over $\{w\} \times V$. Then $A$ satisfies the following equations which resemble (4):
\[ [\partial^A_{\bar w}, \partial^A_{\bar z}] = 0, \qquad [\partial^A_{\bar z}, \partial^A_z] = 0. \tag{6} \]
The first equation is equivalent to the holomorphic condition on the map $f$ and the second equation uses the fact that $f$ maps to a space of flat connections. We can think of the second equation of each of (4) and (6) as a type of moment map. One can move from solutions of (6) to solutions of (4) using the Yang–Mills flow, as we do in this paper or, say, by using the implicit function theorem.

In order to apply this to periodic instantons we exploit the conformal invariance of the anti-self-dual equations. Let $\Sigma$ be the punctured disk $D^2 - \{0\}$ equipped with the complete hyperbolic metric $|dz|^2/(|z| \ln |z|)^2$. There is a conformal equivalence:
\[ S^1 \times (\mathbb{R}^3 - \{0\}) \simeq S^2 \times \Sigma, \]
where $S^1 \times (\mathbb{R}^3 - \{0\})$ is equipped with the flat metric and $S^2 \times \Sigma$ is equipped with the product metric
\[ ds^2 = \frac{4\, d\bar w\, dw}{(1 + |w|^2)^2} + \frac{d\bar z\, dz}{|z|^2 (\ln |z|)^2}. \tag{7} \]
On $S^2 \times \Sigma$ the anti-self-dual equations are given by (4) with
\[ \rho(w, z) = \left( \frac{1 + |w|^2}{|z| \ln |z|} \right)^2. \]
Our course is set. We have shown that a holomorphic map from $S^2$ to $LG \cdot (\xi, \mu)$ is the same as a holomorphic map from $S^2$ to a space of flat connections which gives an approximate instanton over $S^2 \times \Sigma$. In Sect. 4 we will use rather standard techniques to move from an approximate instanton to an exact one. Under the conformal equivalence described above, this instanton will correspond to a periodic instanton.

3.1. Approximate instantons. Beginning with a holomorphic map from the two-sphere to an orbit of $LG$, we will construct an approximate instanton over $S^1 \times \mathbb{R}^3$. This will be an explicit realisation of (5). The map $f : S^2 \to LG/Z_\xi$ is holomorphic when $f^{-1}\partial_{\bar w} f : S^2 \to L^+\mathfrak{p}$, where $L^+\mathfrak{p} \subset L^+\mathfrak{g}^c$ is given by those loops that extend to a holomorphic map of the disk whose value at the origin lies in $\mathfrak{p}$. Put $\eta$ equal to the holomorphic extension of $f^{-1}\partial_{\bar w} f$ to the disk. Over $S^2 \times \Sigma = \{(w, z)\}$, define a connection
\[ A = \eta\, d\bar w - H_\xi^{-1}\eta^* H_\xi\, dw + i\xi\, dz/z \tag{8} \]
which is Hermitian with respect to the Hermitian metric
\[ H_\xi = \exp(i\xi \ln z)^* \exp(i\xi \ln z) \tag{9} \]

and flat on each $\{w\} \times D$. Over $S^1 \times \mathbb{R}^3$ in a radially-free gauge we get:
\[ (A, \Phi) = (\exp(i\xi r)\,\eta\,\exp(-i\xi r)\, d\bar w - \exp(-i\xi r)\,\eta^* \exp(i\xi r)\, dw,\ \xi). \]
Furthermore,
\[ *F_A = d_A\Phi - \mu\,\partial_\theta A + (1 + |w|^2)^2 F_{w\bar w}\, dr/r^2 \tag{10} \]
which resembles the periodic instanton equation, (1).

4. Construction

In this section we will use the Yang–Mills flow to move from the “approximate” periodic instanton (8) to an exact one. Instead of working directly with the connections, we will follow Donaldson [3] and work with a Hermitian metric on a holomorphic bundle which gives a Hermitian connection. In fact, we will work with a pair $(H, \eta)$ consisting of a Hermitian metric $H$ on a holomorphic bundle and a map $\eta : S^2 \times D^2 \to \mathfrak{g}^c$ that is holomorphic in the second factor. A connection $A$ is obtained from the pair $(H, \eta)$ by:
\[ A = H^{-1}\partial_z H\, dz + \eta(w, z)\, d\bar w + (H^{-1}\partial_w H - H^{-1}\eta(w, z)^* H)\, dw. \tag{11} \]
Associate to the pair $(H, \eta)$ the Hermitian-Yang–Mills tensor
\[ B(H, \eta) = |z|^2 (\ln |z|)^2\, \partial_{\bar z}(H^{-1}\partial_z H) + (1 + |w|^2)^2 \{\partial_{\bar w}(H^{-1}\partial_w H) - \partial_{\bar w}(H^{-1}\eta^* H) - \partial_w \eta + [\eta, H^{-1}\partial_w H - H^{-1}\eta^* H]\}. \]
When $B(H, \eta) \equiv 0$, the connection (11) is anti-self-dual. Following Donaldson [3] we study the heat flow for the Hermitian metric $H$ in place of the Yang–Mills flow for the associated connection. Since the Hermitian metrics we deal with here are not bounded we need to extend Donaldson’s results and their generalisations due to Simpson [23]. Essentially we need to understand properties of the Laplacian of the Kähler manifold $S^2 \times \Sigma$ with metric (7) and properties of the initial Hermitian metric (9). Similar results specialised to other non-compact Kähler manifolds exist in [8, 14].

4.1. The heat flow. Associate to a holomorphic map $f : S^2 \to LG/Z_\xi$ the map $\eta : S^2 \times D^2 \to \mathfrak{g}^c$ given by the holomorphic extension of $f^{-1}\partial_{\bar w} f$ to the disks in the second factor. We would like to construct a Hermitian metric $H$ that satisfies the equation $B(H, \eta) = 0$. This would produce an anti-self-dual connection associated to the map $f$. Consider the heat flow equation over $S^2 \times \Sigma$,
\[ H^{-1}\partial H/\partial t = B(H, \eta), \qquad H(w, z, 0) = H_\xi, \tag{12} \]
where $H_\xi$ is defined in (9). A solution of (12) will converge to the required solution of $B(H, \eta) = 0$ as $t \to \infty$. Instead of solving (12) we will work with a family of boundary value problems. Put
\[ S^2 \times \Sigma_{\varepsilon,\delta} = \{(w, z) \in S^2 \times \Sigma \mid \varepsilon \le |z| \le \delta\} \]
so the $S^2 \times \Sigma_{\varepsilon,\delta}$ exhaust $S^2 \times \Sigma$ as $\delta \to 1$ and $\varepsilon \to 0$.
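The conformal equivalence of Sect. 3 and the exhaustion just introduced can be made concrete in coordinates. The following is a sketch under assumptions that are ours rather than the paper's: we take $\mu = 1$ (so $\theta$ has period $2\pi$) and use the coordinate $z = e^{-(r + i\theta)}$, which sends $r \to \infty$ to the puncture $z = 0$ and $r \to 0$ to $|z| = 1$; the paper's orientation conventions may differ.

```latex
% Assumptions (ours): \mu = 1 and z = e^{-(r + i\theta)}, so |z| = e^{-r}, \ln|z| = -r,
% and dz = -z\,(dr + i\,d\theta), hence |dz|^2 = |z|^2 (dr^2 + d\theta^2).
d\theta^2 + dr^2 + r^2 g_{S^2}
  = r^2 \left( g_{S^2} + \frac{dr^2 + d\theta^2}{r^2} \right),
\qquad
\frac{|dz|^2}{|z|^2 (\ln|z|)^2}
  = \frac{dr^2 + d\theta^2}{r^2},
\qquad
g_{S^2} = \frac{4\, d\bar w\, dw}{(1 + |w|^2)^2}.
% Under this identification the shell \varepsilon \le |z| \le \delta
% corresponds to -\ln\delta \le r \le -\ln\varepsilon.
```

So the flat metric is $r^2$ times the product metric (7), and letting $\delta \to 1$, $\varepsilon \to 0$ corresponds to exhausting $\mathbb{R}^3 - \{0\}$ by spherical shells whose inner radius shrinks to $0$ and whose outer radius grows to infinity.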

Proposition 4.1. Over each $S^2 \times \Sigma_{\varepsilon,\delta}$ there is a unique solution of the boundary value problem
\[ H^{-1}\partial H/\partial t = B(H, \eta), \qquad H(w, z, 0) = H_\xi, \qquad H|_{\partial(S^2 \times \Sigma_{\varepsilon,\delta})} = H_\xi \tag{13} \]
given by $H^{\varepsilon,\delta}(w, z, t)$ and converging to a smooth metric $H^{\varepsilon,\delta}(w, z, \infty)$ that satisfies $B(H^{\varepsilon,\delta}(w, z, \infty), \eta) = 0$.

Proof. Since we have fixed $S^2 \times \Sigma_{\varepsilon,\delta}$ for the moment we will omit the superscript in $H^{\varepsilon,\delta}(w, z, t)$ during this proof. Short-time existence of a solution of (13) is automatic since $B(H, \eta)$ is elliptic in $H$ and we have Dirichlet boundary conditions. In order to extend this to long-time existence we will take the approach given by Donaldson [3] and extended by Simpson [23] and show that a solution on $[0, T)$ gives a limit at $T$ which is a good initial condition to start the flow again. The lemmas we need to prove on the way use the details of our particular case and allow us to proceed with Donaldson’s proof.

A Hermitian metric $H$ takes its values in the space $G^c/G$ which comes equipped with the complete metric $d$ given locally by $\mathrm{tr}(H^{-1}\delta H)^2$. Following Donaldson, we will use both this metric and the convenient function
\[ \sigma(H_1, H_2) = \mathrm{tr}(H_1^{-1}H_2) + \mathrm{tr}(H_1 H_2^{-1}) - 2n \]
that satisfies $c_1 d^2 \le \sigma \le c_2 d^2$ for constants $c_1$, $c_2$. (Aside: if we take the loop group perspective described in [7], then a Hermitian metric takes its values in the space $LG^c/LG$. We have not checked that this is a complete metric space.)

Lemma 4.2. If $H_1$ and $H_2$ are two solutions of the heat equation then
\[ \partial_t \sigma + \Delta\sigma \le 0 \tag{14} \]
for $\sigma = \sigma(H_1, H_2)$.

Proof. See [14].

Apply (14) to $H(w, z, t)$ and $H(w, z, t + \tau)$, the flow at two times. Since they obey the same boundary conditions on $S^2 \times \Sigma_{\varepsilon,\delta}$, $\sigma$ vanishes on the boundary. By the maximum principle $\sup_{S^2 \times \Sigma_{\varepsilon,\delta}} \sigma$ is a non-increasing function of $t$. By continuity, for any $\rho > 0$ there exists a $\tau$ small enough so that
\[ \sup_{S^2 \times \Sigma_{\varepsilon,\delta}} \sigma(H(w, z, t), H(w, z, t')) < \rho \]
for $0 < t, t' < \tau$. It follows from the non-increasing property of $\sigma$ that
\[ \sup_{S^2 \times \Sigma_{\varepsilon,\delta}} \sigma(H(w, z, t), H(w, z, t')) < \rho \]
for $T - \tau < t, t' < T$. Since $\rho$ can be made arbitrarily small, $H(w, z, t)$ is a Cauchy sequence in the $C^0$ norm as $t \to T$. The metrics take their values in a complete metric space (described above) and the function $\sigma$ acts like the metric so there is a continuous limit $H_T$ of the sequence. Notice also that (14) and the maximum principle show that this short-time solution to the heat flow equation is unique.
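The comparison function $\sigma$ used above is easy to experiment with. For positive-definite $H_1$, $H_2$ one has $\sigma = \sum_i (\lambda_i + \lambda_i^{-1} - 2) \ge 0$, where $\lambda_i > 0$ are the eigenvalues of $H_1^{-1}H_2$, with equality exactly when $H_1 = H_2$. The sketch below evaluates $\sigma$ for real symmetric $2 \times 2$ matrices only; the restriction to $n = 2$ and real entries, and the helper names, are simplifying assumptions for illustration.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace2(m):
    """Trace of a 2x2 matrix."""
    return m[0][0] + m[1][1]


def sigma(h1, h2):
    """sigma(H1, H2) = tr(H1^{-1} H2) + tr(H1 H2^{-1}) - 2n, here with n = 2."""
    n = 2
    return (trace2(matmul2(inv2(h1), h2))
            + trace2(matmul2(h1, inv2(h2)))
            - 2 * n)


if __name__ == "__main__":
    identity = ((1.0, 0.0), (0.0, 1.0))
    h = ((2.0, 1.0), (1.0, 2.0))  # positive definite, eigenvalues 1 and 3
    print(sigma(identity, identity), sigma(h, identity), sigma(identity, h))
```

The function is symmetric in its two arguments and vanishes on the diagonal, which is what allows it to play the role of a squared distance in the maximum principle arguments.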

Using the heat equation and the metric on $G^c/G$, we have
\[ d(H(w, z, t), H(w, z, 0)) \le \int_0^t |B(H(w, z, s), \eta)|\, ds, \]
where $|B(H(w, z, s), \eta)|^2 = \mathrm{tr}(B^*B)$ and the adjoint is taken with respect to the metric $H_s$. Notice that $B^* = B$ so $|B(H(w, z, s), \eta)|^2 = \mathrm{tr}(B^2)$.

Lemma 4.3. If $H(w, z, t)$ is a solution of the heat equation then
\[ (d/dt + \Delta)|B(H(w, z, t), \eta)| \le 0 \tag{15} \]
whenever $|B| > 0$.

Proof. See [14].

The next two lemmas use the particular features of the Kähler manifold $S^2 \times \Sigma$ together with the initial Hermitian metric $H_\xi$ to get $C^0$ control on $H(w, z, t)$ during the flow.

Lemma 4.4. When $\eta$ is the holomorphic extension of $f^{-1}\partial_{\bar w} f$, for a given holomorphic map $f : S^2 \to LG/Z_\xi$, there exists a constant $M$ such that $|B(H_\xi, \eta)| \le M(1 - |z|)$ on $S^2 \times \Sigma$.

Proof. $B(H_\xi, \eta) = -(1 + |w|^2)^2 (\partial_w \eta + \partial_{\bar w}(H_\xi^{-1}\eta^* H_\xi) + [\eta, H_\xi^{-1}\eta^* H_\xi])$, and since $[\eta(0), \xi] = 0$, $|B(H_\xi, \eta)|$ is bounded near $z = 0$. Since $f$ takes its values in the unitary loop group and $H_\xi = I$ on $|z| = 1$, we can identify $B(H_\xi, \eta)$ with the curvature of a flat connection which is 0. Furthermore, $B(H_\xi, \eta)$ is continuous and differentiable up to $|z| = 1$ so it vanishes like $1 - |z|$ there.

Lemma 4.5. There is a constant $C$ independent of $\varepsilon$ and $\delta$ such that $d(H^{\varepsilon,\delta}(w, z, t), H_\xi) \le C \ln(1 - \ln |z|)$ for all $(w, z, t) \in S^2 \times \Sigma_{\varepsilon,\delta} \times \mathbb{R}$.

Proof. It follows from (15) and the maximum principle that if there is a function $b(w, z, t)$ defined on $S^2 \times \Sigma_{\varepsilon,\delta} \times \mathbb{R}$ that satisfies $(\partial_t + \Delta)b = 0$ and $|B(H_\xi, \eta)| \le b(w, z, 0)$, then $|B(H(w, z, t), \eta)| \le b(w, z, t)$ for all $t$. Put $b(w, z, 0) = M(1 - |z|)$. Notice that $b(w, z, 0) = b(|z|)$, so we only need use the one-dimensional Laplacian and $b(w, z, t) = b(|z|, t)$. From the flow equation (13) we have
\[ d(H(w, z, t), H_\xi(w, z)) = \int_0^t B(H(w, z, \tau))\, d\tau \le \int_0^t b(w, z, \tau)\, d\tau \le \int_0^\infty b(w, z, \tau)\, d\tau. \tag{16} \]

Now, $b(|z|, t) = \int b(s, 0)\, k(|z|, s, t)\, ds$, where $k$ is the one-dimensional heat kernel operator. Since $\int_0^\infty k(|z|, s, t)\, dt = G(|z|, s)$, the Green’s operator, is finite, Fubini’s theorem allows us to interchange the order of integration in (16). So
\[ d(H(w, z, t), H_\xi(w, z)) \le M \int_\varepsilon^\delta (1 - s)\, G(|z|, s)\, ds \le M \int_0^1 (1 - s)\, G(|z|, s)\, ds. \]
With respect to the Laplacian
\[ \Delta = -(1 + |w|^2)^2\, \partial_{\bar w}\partial_w - 4|z|^2 (\ln |z|)^2\, \partial_{\bar z}\partial_z = -(\ln |z|)^2\, \partial^2_{\ln |z|} \]
reduced to one dimension, the Green’s operator is given by $G(|z|, s) = \min\{-\ln |z|, -\ln s\}/s(\ln s)^2$. Actually, this Green’s operator is only valid for the entire interval ($\varepsilon = 1$) and Fubini’s theorem doesn’t apply there. There is a monotone property of heat kernels which means that our choice of $G$ is simply an overestimate when $\varepsilon < 1$ so the calculation is valid. Thus
\[ d(H(w, z, t), H_\xi(w, z)) \le M \left( -\ln |z| \int_0^{|z|} \frac{(1 - s)\, ds}{s(\ln s)^2} - \int_{|z|}^1 \frac{(1 - s)\, ds}{s \ln s} \right) \le C \ln(1 - \ln |z|), \]
where the last inequality simply encodes the fact that the distance vanishes as $|z| \to 1$ and grows like $\ln(1 - \ln |z|)$ as $|z| \to 0$.

The preceding lemmas have shown that there is a solution to the heat equation that satisfies $H(w, z, t) \to H(w, z, T)$ in $C^0$ and $H(w, z, t)$ is uniformly bounded with bound independent of $t$ (though depending on $\varepsilon$). These are the conditions required to use Simpson’s extension of Donaldson’s result to show that $H(w, z, t)$ is bounded in $L^p_2$ uniformly in $t$. Hamilton’s methods [9] then give control of all higher Sobolev norms. Thus we get a solution, $H(w, z, t)$, of (13) for all $t$ that converges to a smooth limit $H^{\varepsilon,\delta}(w, z, \infty)$ defined on $S^2 \times \Sigma_{\varepsilon,\delta}$ and satisfying $B(H^{\varepsilon,\delta}(w, z, \infty), \eta) = 0$ and $H^{\varepsilon,\delta}(w, z, \infty) = H_\xi$ on $\partial(S^2 \times \Sigma_{\varepsilon,\delta})$ so Proposition 4.1 is proven.

Proposition 4.6. For each holomorphic map $f : S^2 \to LG/Z_\xi$ there is a periodic instanton $A_f$ on $S^1 \times \mathbb{R}^3$.

Proof. We have proven the existence of a family of Hermitian metrics $H^{\varepsilon,\delta}$ respectively defined over $S^2 \times \Sigma_{\varepsilon,\delta}$ and satisfying $B(H^{\varepsilon,\delta}, \eta) = 0$. Since $\sigma(H^{\varepsilon,\delta}, H^{\varepsilon',\delta'})$ is subharmonic its maximum occurs at the boundary of the set on which it is defined. For $\varepsilon \le \varepsilon' \le \delta \le \delta'$, the common set is $S^2 \times \Sigma_{\varepsilon',\delta}$. If we fix $\varepsilon = \varepsilon'$ and let $\delta \to 1$, then $\sigma = 0$ on $|z| = \varepsilon$ and the maximum of $\sigma$ occurs on $|z| = \delta$. Since the metrics $\sigma$ and $d$ on $G^c/G$ are equivalent, the maximum value of $\sigma$ is bounded by a constant times $d(H^{\varepsilon,\delta}, H_\xi) \le C \ln(1 - \ln \delta)$ using Lemma 4.5. This tends to 0 as $\delta \to 1$, thus we have a Cauchy sequence that converges uniformly to a Hermitian metric $H^\varepsilon$ defined on $|z| \ge \varepsilon$. The convergence can be improved to $L^p_2$ to ensure that $B(H^\varepsilon, \eta) = 0$ [23].
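Returning briefly to the final estimate in the proof of Lemma 4.5, the two integrals there become more transparent after a logarithmic change of variable. The following is a sketch of that substitution with our own notation ($t = -\ln s$, $L = -\ln |z|$); it only illustrates the growth as $|z| \to 0$ and the vanishing as $|z| \to 1$, and is not taken from the paper.

```latex
% Substitution (ours): t = -\ln s, \; L = -\ln|z| > 0, \; ds/s = -dt.
-\ln|z| \int_0^{|z|} \frac{(1-s)\,ds}{s(\ln s)^2} \;-\; \int_{|z|}^{1} \frac{(1-s)\,ds}{s \ln s}
  \;=\; L \int_L^{\infty} \frac{1 - e^{-t}}{t^2}\,dt \;+\; \int_0^{L} \frac{1 - e^{-t}}{t}\,dt .
% First term:  0 \le L \int_L^\infty \frac{1 - e^{-t}}{t^2}\,dt \le L \int_L^\infty \frac{dt}{t^2} = 1 .
% Second term: since (1 - e^{-t})/t \le \min\{1, 1/t\},
%              \int_0^L \frac{1 - e^{-t}}{t}\,dt \le \min\{L,\; 1 + \max(\ln L, 0)\} .
```

Hence the expression is at most $2 + \ln L$ for $L \ge 1$, which is the $\ln(-\ln |z|)$ growth as $|z| \to 0$, and both terms tend to $0$ as $L \to 0$, that is, as $|z| \to 1$.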

In order to deal with $\varepsilon \to 0$, notice that since $\ln |z|$ is harmonic on $S^2 \times \Sigma$, $\sigma + a \ln |z|$ is subharmonic for any $a$. Put $a = \sup_{|z| = \varepsilon} \sigma/|\ln \varepsilon|$. Then $\sigma + a \ln |z| \le 0$ on $|z| = 1$ and $|z| = \varepsilon$. Thus
\[ \sigma \le -\ln |z| \sup_{|z| = \varepsilon} \sigma/|\ln \varepsilon|. \tag{17} \]
By Lemma 4.5, $d(H^{\varepsilon,\delta}, H_\xi) \le C \ln(1 - \ln \varepsilon)$ so $\sigma = o(|\ln \varepsilon|)$ as $\varepsilon \to 0$. Thus the right hand side of (17) tends uniformly to 0 on compact sets away from $z = 0$. Again we conclude that the $\{H^\varepsilon\}$ form a Cauchy sequence as $\varepsilon \to 0$, converging uniformly on the complement of any neighbourhood of $S^2 \times \{0\}$ to a Hermitian metric $H$ that satisfies $B(H, \eta) = 0$ on $S^2 \times \Sigma$.

Using $S^1 \times (\mathbb{R}^3 - \{0\}) \cong S^2 \times \Sigma$ we see that the limit $H$ is smooth on $S^1 \times (\mathbb{R}^3 - \{0\})$ and continuous on all of $S^1 \times \mathbb{R}^3$, converging to $I$ on $S^1 \times \{0\}$. The connection $A$ obtained from $H$ via (11) is defined and anti-self-dual on $S^1 \times (\mathbb{R}^3 - \{0\})$. By the following lemma, $A$ has finite charge. Since codimension three singularities of finite charge anti-self-dual connections can be removed [22], $A$ is smooth on all of $S^1 \times \mathbb{R}^3$.

Lemma 4.7. The curvature of the limiting connection $A$ has finite $L^2$ norm.

Proof. The Yang–Mills flow decreases the $L^2$ norm of a connection, and any bubbling in the limit just decreases the $L^2$ norm further, so it is sufficient to show that the initial connection has finite $L^2$ norm. For any connection $A$, we have

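\[ 8\pi^2 \|F_A\|_2^2 = 2\int |F_A^+|^2 - \int F_A \wedge F_A, \tag{18} \]
where $F_A^+$ is the self-dual part of the curvature. We can calculate this explicitly for the initial connection defined in (8). Notice that $F_A^+ = B(H_\xi, \eta)$ and by Lemma 4.4 we have $|B(H_\xi, \eta)| \le M(1 - |z|)$. This is square-integrable over $S^2 \times \Sigma$ since $S^2$ is compact and $\Sigma$ has finite area near $z = 0$ and grows like $1/(1 - |z|)^2$ near $|z| = 1$. As one might expect, the topological term in (18) will coincide with the topological degree of the map $f : S^2 \to LG/Z_\xi$:
\[ k(E) = \frac{1}{8\pi^2} \int_{S^2 \times D} \mathrm{tr}(F_A^2) = -\frac{1}{8\pi^2} \int_{S^2 \times D} \mathrm{tr}(\partial_{\bar z}\eta^*\, \partial_z \eta)\, d\bar z\, dz\, d\bar w\, dw, \]
since only the $F_{z\bar w}$ and $F_{\bar z w}$ terms contribute. Since $\eta$ is holomorphic in $z$, then on the disk $d\{\mathrm{tr}(\eta^* \partial_z \eta)\, dz\} = \mathrm{tr}(\partial_{\bar z}\eta^*\, \partial_z \eta)\, d\bar z\, dz$ so
\[ k(E) = -\frac{1}{8\pi^2} \int_{S^2} \int_{|z| = 1} \mathrm{tr}(\eta^* \partial_z \eta)\, dz\, d\bar w\, dw = \frac{1}{4\pi} \int_{S^2} \|f^{-1}\partial_{\bar w} f\|^2\, \frac{d\bar w\, dw}{i}, \]
where $\|f^{-1}\partial_{\bar w} f\|^2$ uses the Kähler metric on $LG/Z_\xi$. This expression is the degree of $f$.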

Remark. In the construction of this section we started with parabolic bundles over the disk. However, the reverse is not true that a periodic instanton gives rise to a family of parabolic bundles. By this we mean that the holomorphic structure defined on each punctured disk by the restriction of the periodic instanton does not extend to the entire disk. The curvature just fails to satisfy FA ∈ Lp for p > 1 as required in [2].

5. Injection

In this section we will show that the map produced in Sect. 4 is injective.

Proposition 5.1. Let $f : S^2 \to LG/Z_\xi$ and $g : S^2 \to LG/Z_\nu$ be two based holomorphic maps. Then the instantons $A_f$ and $A_g$ are gauge equivalent precisely when $\nu - \xi$ is in the root lattice and $g = f \cdot \exp(i(\nu - \xi) \ln z)$.

Proof. The instanton $A_f$ is given by the expression (11) which depends on a pair $(H, \eta)$ consisting of a Hermitian metric, $H$, and the holomorphic extension of $f^{-1}\partial_{\bar w} f$ denoted by $\eta$, and likewise for $A_g$. These expressions are independent of the unitary gauge so $A_f \sim A_g$ only if $A_f = A_g$ or possibly if we have used different holomorphic trivialisations of the holomorphically trivial bundle restricted to each $\{w\} \times \Sigma$ for $A_f$ and $A_g$.

If $A_f = A_g$ then $f^{-1}\partial_{\bar w} f = \eta = g^{-1}\partial_{\bar w} g$, so $\partial_{\bar w}(g f^{-1}) = 0$ and this is global over $S^2$, thus $g = \gamma(z) f$ for some loop $\gamma(z)$ independent of $w$. The requirement that $f$ and $g$ map $\infty \in S^2$ to the constant loop $I$ forces $\gamma(z) = I$.

If $A_f \neq A_g$ and $A_f \sim A_g$ then $A_f$ uses the pair $(H, \eta)$ in (11) and $A_g$ uses the pair $(p^* H p,\ p^{-1}\eta p + p^{-1}\partial_{\bar w} p)$ for a map $p : S^2 \times \Sigma \to G^c$ which is holomorphic on each $\{w\} \times \Sigma$ and unitary on its boundary. Note that this implies that $g = fp$ though since $p$ is not a priori in $L^+P$, the maps $f$ and $g$ can be distinct. The proof of the proposition is completed by the following two lemmas that show that $g = fp$ together with the known growth of the Hermitian metrics associated to $f$ and $g$ forces $p$ to be constant or to be a standard holomorphic gauge change.

Lemma 5.2. If $\xi = \nu$ then $A_f \sim A_g$ only if $f = gu$ for $u \in P \cap G \cong Z_\xi$.

Proof. We can apply Lemma 4.5 to the Hermitian-Yang–Mills metric $H$ over all of $S^2 \times \Sigma$ even though it is only stated for $0 < \varepsilon < \delta < 1$. Thus $d(H, H_\xi) + d(p^* H p, H_\xi) \le C \ln(1 - \ln |z|)$ for the initial metric $H_\xi$ defined in (9). Using the identity $d(p^* H p, H_\xi) = d(H, (p^*)^{-1} H_\xi p^{-1})$ and the triangle inequality we have
\[ d(H, H_\xi) + d(p^* H p, H_\xi) \ge d((p^*)^{-1} H_\xi p^{-1}, H_\xi), \tag{19} \]
and the right-hand side is bounded by $C \ln(1 - \ln |z|)$ only if $p$ is bounded near $z = 0$ by $C \ln(1 - \ln |z|)$. Since it satisfies $\lim_{z \to 0} z p(z) = 0$, $p$ extends across $z = 0$ and is holomorphic there. Furthermore we must have $p(0) \in P$ in order that the right-hand side of (19) is bounded by $C \ln(1 - \ln |z|)$. Since $p$ is holomorphic on the disk and unitary on the boundary it must be unitary on the disk (by the maximum principle applied to the subharmonic function $\mathrm{tr}(p^* p) + \mathrm{tr}((p^* p)^{-1})$), and thus constant there, and moreover lie in $P \cap G$.

Lemma 5.3. If $A_f \sim A_g$ then $\nu - \xi$ lies in the root lattice and $g = f \exp(i(\nu - \xi) \ln z)$.


Proof. As described above, $g = fp$. Then $\lim_{z \to 0} z p^{-1}\partial_z p = \nu - \xi$. Since $z p^{-1}\partial_z p$ is bounded and holomorphic on the punctured disk, it extends to a holomorphic function of $z$ on the disk. In fact $p^{-1}\partial_z p = q(z)/z$, so $p(z) = \exp\left(\int q(\zeta)\, d\zeta/\zeta\right)$ and $\nu - \xi = q(0)$ must lie in the integer lattice. Thus $p \cdot \exp(-i(\nu - \xi) \ln z)$ is holomorphic on the disk and unitary on the boundary, and hence constant, which we absorb in the unitary ambiguity of $f$. So $g = f \cdot \exp(i(\nu - \xi) \ln z)$.

The proposition allowed for gauge transformations that have angular dependence at infinity (corresponding to $z = 0$). When we restrict the gauge transformations to have no angular dependence at infinity, the maps $f$ and $f \cdot \exp(i(\nu - \xi) \ln z)$ define inequivalent connections. Thus the map $f \mapsto A_f$ is injective.

6. Boundary Conditions

There are natural boundary conditions that the periodic instantons constructed in this paper conjecturally satisfy: as r → ∞, − ξ = O(1/r), ∂ − ξ /∂ = O(1/r 2 ), ∇( − ξ ) = O(1/r 2 ), where ξ is a given constant Higgs field, r is the radial coordinate in $\mathbb{R}^3$, ∂/∂ is an angular derivative, and the asymptotic constants are uniform in θ. In order to prove these conditions we would need to understand the precise elliptic constants for the Hermitian Yang–Mills Laplacian on $S^2 \times D$ near the puncture at $z = 0$. This would enable us to get estimates on the second derivatives of $H$ from the estimates on $H$ given in this paper and estimates on first derivatives of $H$ obtained from a maximum principle argument [5]. We hope to show this in future work. Alternatively, one might prove the stronger conjecture that all finite energy periodic instantons satisfy these boundary conditions. Such a proof would again require a good understanding of the Laplacian on $S^2 \times D$, as in the special case of monopoles [11]. This stronger conjecture implies that the construction of this paper yields all periodic instantons.
This can be proven by using a scattering argument to retrieve a holomorphic map from S 2 to an orbit of the loop group from a given periodic instanton. Acknowledgements. I would like to thank Michael Murray for useful discussions and the University of Adelaide for its hospitality over a period when part of this work was carried out.

References

1. Atiyah, M.F.: Instantons in two and four dimensions. Commun. Math. Phys. 93, 437–451 (1984)
2. Biquard, Olivier: Fibrés paraboliques stables et connexions singulières plates. Bull. Soc. Math. France 119, 231–257 (1991)
3. Donaldson, S.K.: Anti-self-dual Yang–Mills connections over complex algebraic surfaces and stable vector bundles. Proc. London Math. Society 30, 1–26 (1985)
4. Donaldson, S.K.: Nahm’s equations and the classification of monopoles. Commun. Math. Phys. 96, 387–407 (1984)
5. Donaldson, S.K.: Boundary value problems for Yang–Mills fields. J. Geom. and Phys. 8, 89–122 (1992)
6. Dostoglou, Stamatis and Salamon, Dietmar: Self-dual instantons and holomorphic curves. Ann. of Math. 139, 581–640 (1994)
7. Garland, H. and Murray, M.K.: Kac-Moody monopoles and periodic instantons. Commun. Math. Phys. 120, 335–351 (1988)
8. Guo, G.-Y.: On an analytic proof of a result by Donaldson. Int. J. Math. 7, 1–17 (1996)
9. Hamilton, R.S.: Harmonic maps of manifolds with boundary. Lecture Notes in Math. 471, New York: Springer, 1975
10. Hitchin, N.J.: On the construction of monopoles. Commun. Math. Phys. 89, 145–190 (1983)
11. Jaffe, A. and Taubes, C.H.: Vortices and monopoles. Boston: Birkhäuser, 1980
12. Jarvis, S.: Euclidean monopoles and rational maps. Proc. LMS 77, 170–192 (1998)
13. Jarvis, S.: Monopoles to rational maps via radial scattering. Preprint (1996)
14. Jarvis, S. and Norbury, P.: Degenerating metrics and instantons on the four-sphere. J. Geom. Phys. 27, 79–98 (1998)
15. Mehta and Seshadri: Parabolic bundles. Math. Ann. 248, 205–239 (1980)
16. Murray, M.K.: Monopoles and spectral curves for arbitrary Lie groups. Commun. Math. Phys. 90, 263–271 (1983)
17. Nahm, W.: Self-dual monopoles and calorons. Lecture Notes in Phys. 201, Berlin: Springer, 1983, pp. 189–200
18. Nahm, W.: The construction of all self-dual multimonopoles by the ADHM method. In Monopoles in quantum field theory (Trieste), Singapore: World Sci. Pub., 1981, pp. 87–94
19. Rozansky, L. and Witten, E.: Hyper-Kahler geometry and invariants of three-manifolds. Selecta Math. 3, 401–458 (1997)
20. Seiberg, N. and Witten, E.: Electric-magnetic duality, monopole condensation, and confinement in n = 2 supersymmetric Yang–Mills theory. Nuclear Phys. B 426, 19–52 (1994)
21. Seiberg, N. and Witten, E.: Gauge dynamics and compactifications to three dimensions. Adv. Ser. Math. Phys. 24, 333–366 (1997)
22. Sibner, L.M. and Sibner, R.J.: Classification of singular Sobolev connections by their holonomy. Commun. Math. Phys. 144, 337–350 (1992)
23. Simpson, C.T.: Constructing variations of Hodge structure using Yang–Mills theory and applications to uniformization. J. Amer. Math. Soc. 1, 867–918 (1988)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 212, 571 – 590 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Wigner Symbols and Combinatorial Invariants of Three-Manifolds with Boundary

Gaspare Carbone¹, Mauro Carfora²,³, Annalisa Marzuoli²,³

¹ S.I.S.S.A.-I.S.A.S., Via Beirut 2–4, 34013 Trieste, Italy. E-mail: [email protected]
² Dipartimento di Fisica Nucleare e Teorica, Università degli Studi di Pavia, via A. Bassi 6, 27100 Pavia, Italy
³ Istituto Nazionale di Fisica Nucleare, Sezione di Pavia, via A. Bassi 6, 27100 Pavia, Italy. E-mail: [email protected]; [email protected]

Received: 14 December 1998 / Accepted: 30 January 2000

To Giorgio Ponzano and Tullio Regge

Abstract: In this paper we generalize the partition function proposed by Ponzano and Regge in 1968 to the case of a compact 3-dimensional simplicial pair (M, ∂M). The resulting state sum Z[(M, ∂M)] contains both Wigner 6j symbols associated with tetrahedra and Wigner 3jm symbols associated with triangular faces lying in ∂M. In order to show the invariance of Z[(M, ∂M)] under PL-homeomorphisms we exploit some results due to Pachner on the equivalence of n-dimensional PL-pairs both under bistellar moves on n-simplices in the interior of M and under elementary boundary operations (shellings and inverse shellings) acting on n-simplices which have some component in ∂M. We find, in particular, the algebraic identities – involving a suitable number of Wigner symbols – which realize the complete set of Pachner’s boundary operations in n = 3. The results established for the classical SU(2)-invariant Z[(M, ∂M)] are further extended to the case of the quantum enveloping algebra U_q(sl(2, C)) (q a root of unity). The corresponding quantum invariant, Z_q[(M, ∂M)], turns out to be the counterpart of the Turaev–Viro invariant for a closed 3-dimensional PL-manifold.

1. Introduction

The search for combinatorial invariants of compact n-dimensional manifolds (n = 3, 4) plays a key role both in topological lattice field theories and in quantum gravity discretized according to Regge’s prescription [R]. From a historical point of view, the typical examples of this class of models in dimension three are provided in [P-R] and in [T-V] (further developments can be found in [M-T, O92,a, O92,b, C-F-S] and in [C-K-S] for what concerns in particular a categorical approach to the subject). As a matter of fact, all the papers quoted above deal essentially with state sum invariants for a closed 3- or 4-dimensional manifold M.
The interest in dealing with an n-dimensional compact pair (M, ∂M) (where ∂M is the (n − 1)-dimensional boundary manifold of M) relies on the fact that in typical physical situations we have to consider


probability amplitudes between different (n − 1)-dimensional Riemannian geometries which represent the boundary of an n-dimensional (pseudo)Riemannian manifold. Borrowing the language from the Euclidean functional integral approach to the quantization of gravity, we have to evaluate quantities such as < (N1 , h1 ) | O | (N2 , h2 ) > /< (N1 , h1 ) | (N2 , h2 ) >, where (N1 , h1 ) and (N2 , h2 ) are (n − 1)-dimensional manifolds equipped with fixed Riemannian metrics h1 and h2 respectively, and O is some observable. The symbol < | > denotes a functional integration over (a suitable class of) n-dimensional Riemannian metrics, up to diffeomorphisms, interpolating between (N1 , h1 ) and (N2 , h2 ). If we are interested in studying either the topological sector of quantum gravity or just a topological model, the requirement of taking fixed geometries on the boundaries appears to be much too restrictive. Indeed, a topological n-dimensional field theory, when projected onto a (n − 1)-dimensional boundary, keeps on taking its topological character (namely, it is independent from the metric on the boundary as well), and thus topological invariants of pairs (M, ∂M) should come into play in a more natural way. Notice that combinatorial invariants for PL-pairs may appear also in the loop quantum gravity approach, in particular when introducing a spin networks basis (see e.g. [Ro-S, D-R] and references therein). In this paper we extend both the Ponzano–Regge partition function and the Turaev– Viro invariant to compact 3-dimensional simplicial PL-manifolds with non- empty boundaries. Although this issue has already been addressed some years ago in [K-M-S], our proposal turns out to be quite natural and more closely related to the original idea of a discretized spacetime partition function arising from a recoupling scheme of angular momenta variables, much in the spirit of [P-R, Pe] and [M]. 
In our approach the presence of a 2-dimensional boundary will be taken into account through the introduction in the state sum of a Wigner 3jm symbol associated with each triangle lying in the boundary itself. (Conversely, in [K-M-S] the authors introduce a new kind of mixed symbol, involving both the angular momenta variables and the vertices in the boundary.) The next step will consist in summing over all angular momenta variables j and simultaneously over all momentum projections, or m-variables; in such a way we get – up to regularization – a state sum Z[(M, ∂M)] for the simplicial pair (M, ∂M) which is the natural counterpart of the Ponzano–Regge partition function in the case of manifolds with boundary (Sect. 3). In order to show the invariance of the above state sum under PL-homeomorphisms we shall go back to a basic theorem (established by Pachner in [P91]) which states that two simplicial compact PL-manifolds of dimension n with non-empty boundaries are PL-homeomorphic if, and only if, they are equivalent under elementary shellings and their inverse operations. Such elementary shellings are topological operations, or moves, which act on simplices that have some components in the boundary of the manifold by deleting – or adding – one n-simplex at a time. Since this theorem and some other related results (see also [P90]) do not seem widely known, we shall briefly discuss them in Sect. 2. In Sect. 4 we show how the three different types of elementary shellings defined in the 3-dimensional case can be associated with identities involving one 6j and four 3jm symbols. (Notice that in [K-M-S] it is necessary to assume that the new symbols satisfy some identities of Biedenharn–Elliott type in order to show that the state sum is actually invariant under a suitable class of subdivisions and isotopies of the boundary.)
As a consequence of such a natural algebrization of the elementary shellings, the state sum Z[(M, ∂M)] will turn out to be automatically invariant under such moves. Moreover,


by Pachner’s results, it will be also an invariant of the PL-structure of the simplicial pair $(M, \partial M)$. In Sect. 5 we present the extension of the model to the case of the quantized enveloping algebra $U_q(sl(2, \mathbb{C}))$ ($q$ a root of unity). The resulting invariant, $Z_q[(M, \partial M)]$, is thus the counterpart of the Turaev–Viro quantum invariant and reduces to it in the case $\partial M = \emptyset$ (and to $Z[(M, \partial M)]$ for $q \to 1$). Finally, Sect. 6 contains some remarks concerning possible developments of the methods proposed in the present paper.

2. Equivalence of PL-Manifolds with Boundary Under Pachner’s Elementary Shellings

We first list some well known, but necessary, preliminaries on Piecewise-Linear manifolds (see e.g. [G, R-S]). By a $p$-simplex $\sigma^p \equiv (x_0, x_1, \ldots, x_p)$ with vertices $x_0, x_1, \ldots, x_p$ we mean the subspace of $\mathbb{R}^d$ ($d > p$) defined by $\sigma^p \doteq \sum_{i=0}^{p} \lambda_i x_i$, where $(x_0, x_1, \ldots, x_p)$ are $(p + 1)$ points in general position in $\mathbb{R}^d$, with $\sum_i \lambda_i = 1$ and $\lambda_i \ge 0$, $\forall i$.

Definition 1. Let $\sigma^p$ and $\tau^q$ be simplices in $\mathbb{R}^d$ with distinct vertices and such that the totality of these vertices is at most $(d + 1)$ and they are in general position in $\mathbb{R}^d$. Then such vertices span a simplex, $\sigma^p \tau^q$, the join of $\sigma^p$ and $\tau^q$, defined as the $(p + q + 1)$-simplex obtained by taking the convex hull in $\mathbb{R}^d$, viz.:

$$\sigma^p \tau^q \doteq \mathrm{conv}(\sigma^p \cup \tau^q). \qquad (1)$$

A face of a $p$-simplex $\sigma^p$ is any simplex the vertices of which are a subset of those of $\sigma^p$.

Definition 2. A finite simplicial complex $T$ (or, more precisely, the geometrical realization of an abstract simplicial complex) is a finite collection of simplices in $\mathbb{R}^d$ such that: i) if $\sigma^p \in T$, then so are all of its faces; ii) if $\sigma^p, \tau^q \in T$, then $\sigma^p \cap \tau^q$ is either a (common) face or is empty.

$T$ has dimension $n$ if $n$ is the maximum dimension of its faces. The faces of maximal dimension, $\sigma^n$, are called facets of $T$.

Definition 3. If $T_1$ and $T_2$ are simplicial complexes, then the join of $T_1$ with $T_2$ is defined according to:

$$T_1 T_2 \doteq \{\sigma_1 \sigma_2 \ \text{s.t.}\ \sigma_1 \in T_1,\ \sigma_2 \in T_2\}, \qquad (2)$$

where $\sigma_1 \sigma_2$ is given in Definition 1. In particular, the join of a complex $T$ with the empty simplex gives $T\{\emptyset\} = T$, while the join of $T$ with the empty complex gives $T\emptyset = \emptyset$. (Notice however that in the following the join of a complex $T$ with a simplex $\tau$ will be denoted by $T\tau$ for short.)

A simplicial complex is pure provided that all its facets have the same dimension. The boundary complex of a pure simplicial $n$-complex $T$ is denoted by $\partial T$; it is the subcomplex of $T$ whose facets are the $(n - 1)$-faces of $T$ which are contained in only one facet of $T$. The set of the interior faces of $T$ is denoted by $\mathrm{int}(T) \doteq T \setminus \partial T$. If $\sigma$ is a simplex, then by $B(\sigma)$ we mean the complex made up of all the faces of $\sigma$, except $\sigma$ itself. Moreover

$$F(\sigma) \doteq B(\sigma) \cup \{\sigma\} \qquad (3)$$

is the complex made up of $\sigma$ and all its proper faces.
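The definitions of join and boundary complex above can be made concrete on an abstract simplicial complex. The following is a minimal sketch, assuming that a simplex is stored as a `frozenset` of vertex labels and a pure complex as the list of its facets; the function names `faces`, `boundary_facets` and `join` are illustrative choices, not notation from the paper.

```python
from collections import Counter
from itertools import combinations


def faces(simplex, dim):
    """All faces of dimension `dim` (i.e. with dim + 1 vertices) of a simplex."""
    return [frozenset(c) for c in combinations(sorted(simplex), dim + 1)]


def boundary_facets(facets):
    """Facets of the boundary complex of a pure n-complex:
    the (n-1)-faces contained in exactly one facet."""
    n = len(next(iter(facets))) - 1
    count = Counter(f for s in facets for f in faces(s, n - 1))
    return {f for f, k in count.items() if k == 1}


def join(sigma, tau):
    """Join of two abstract simplices with disjoint vertex sets."""
    sigma, tau = frozenset(sigma), frozenset(tau)
    if sigma & tau:
        raise ValueError("the join requires distinct vertices")
    return sigma | tau


# Two tetrahedra glued along the triangle {1, 2, 3}.
T = [frozenset({0, 1, 2, 3}), frozenset({1, 2, 3, 4})]
bd = boundary_facets(T)
```

In this example the shared triangle {1, 2, 3} lies in two facets and is therefore interior, while the remaining six triangles form the boundary complex.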


Given a (finite) simplicial complex $T$, consider the set theoretic union $|T| \subset \mathbb{R}^d$ of all simplices from $T$, namely

$$|T| \doteq \cup_{\sigma \in T}\, \sigma. \qquad (4)$$

Introduce on the set $|T|$ a topology that is the strongest of all topologies in which the embedding of each simplex into $|T|$ is continuous (the set $A \subset |T|$ is closed iff $A \cap \sigma^p$ is closed in $\sigma^p$ for any $\sigma^p \in T$). The topological space $|T|$ is the underlying polyhedron, or geometric carrier, of the simplicial complex $T$; the polyhedron $|T|$ is said to be triangulated by the simplicial complex $T$. More generally, a triangulation of a topological space $M$ is a simplicial complex $T$ together with a homeomorphism $|T| \to M$.

Definition 4. A simplicial map $f : T_1 \to T_2$ between two simplicial complexes $T_1$, $T_2$ is a continuous map $f : |T_1| \to |T_2|$ between the corresponding underlying polyhedra which takes $p$-simplices to $p$-simplices for all $p$. The map $f$ is a simplicial isomorphism if $f^{-1} : T_2 \to T_1$ is also a simplicial map.

A subdivision $T'$ of $T$ is a simplicial complex such that: i) $|T'| \cong |T|$, where $\cong$ denotes a homeomorphism between topological spaces; ii) each $p$-simplex of $T'$ is contained in a $p$-simplex of $T$, for every $p$. A property of a simplicial complex $T$ which is invariant under subdivisions is a combinatorial (or Piecewise Linear) property of $T$. More precisely:

Definition 5. A PL-homeomorphism

$$f : T_1 \longrightarrow T_2 \qquad (5)$$

between two simplicial complexes (of the same dimension) is a map which is a simplicial isomorphism for some subdivisions $T_1'$ and $T_2'$ of $T_1$ and $T_2$, respectively.

Definition 6. A PL-manifold of dimension $n$ is a polyhedron $M \cong |T|$, each point of which has a neighborhood, in $M$, PL-homeomorphic to an open set in $\mathbb{R}^n$.

PL-manifolds are realized by simplicial manifolds under the equivalence relation generated by PL-homeomorphisms:

Definition 7. Two PL-manifolds $M_1 \cong |T_1|$ and $M_2 \cong |T_2|$ are PL-homeomorphic, or

$$M_1 \cong_{PL} M_2, \qquad (6)$$

if there exists a map $g : M_1 \to M_2$ which is both a homeomorphism and a simplicial isomorphism, in the sense of Definition 5.

In what follows we shall use the notation

$$T \longrightarrow M \cong |T| \qquad (7)$$

to denote a particular triangulation of the closed PL-manifold $M$ and, when dealing with a PL-pair $(M, \partial M)$, we shall write:

$$(T, \partial T) \longrightarrow (M, \partial M) \cong (|T|, |\partial T|), \qquad (8)$$

where the triangulation on $\partial M$ is the unique triangulation induced on it by the chosen triangulation $T$ in $M$. The extension of Definitions 5, 6 and 7 to PL-pairs is quite straightforward and can be found e.g. in [R-S]. Recall also (see e.g. [T]) that a sufficient condition for characterizing a triangulated space as a PL-manifold follows from:


Theorem 1. A simplicial n-complex K is a (simplicial) PL-manifold of dimension n if, for all p-simplices σ p ∈ K, the link of σ p , link(σ p ), has the topology of the boundary of the standard (n − p)-simplex, namely if: link(σ p ) ∼ = Sn−p−1 (the (n − p − 1)dimensional sphere). In the above statement, link(σ p ) ⊂ K is the union of all faces τ of all simplices in the star of σ satisfying σ ∩ τ = ∅ (the star of σ in K is simply the union of all simplices of which σ is a face). Notice however that in this paper we shall deal only with triangulations underlying PL-manifolds, and thus the content of Theorem 1 will not be discussed any further. The point that we are going to examine now concerns PL-equivalence of polyhedra. Notice that Definition 7 turns out to be quite difficult to be handled in practice, since one should go over and over through subdivisions in order to find out isomorphic triangulations. The issue of combinatorial equivalence was first addressed by Alexander in [A], where he proved the following theorem (which indeed holds true for more general complexes too): Theorem 2. For any polyhedron M which is dimensionally homogeneous (viz., its underlying simplicial complex is pure) any two triangulations of M can be transformed one into the other by a finite sequence of stellar subdivisions and their inverse transformations. The stellar subdivisions, typically known also as Alexander’s transformations (or moves), are not elementary, in the sense that each one of them involves a variable number of n-simplices of the triangulation T we are considering. Thus, being interested in transformations between different triangulations of a PL-manifold M, one should implement Alexander’s moves over a lot of local arrangements of simplices which cannot be factorized into simpler blocks. 
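To make the link condition of Theorem 1 concrete, the sketch below computes the link of a simplex in an abstract complex and its Euler characteristic, which is a necessary (not sufficient) check for the link being a sphere. It assumes simplices are `frozenset`s of vertex labels and the complex is given by its facets; `all_faces`, `link` and `euler_characteristic` are hypothetical helper names.

```python
from itertools import combinations


def all_faces(simplex):
    """All non-empty faces of a simplex, the simplex itself included."""
    v = sorted(simplex)
    return {frozenset(c) for k in range(1, len(v) + 1) for c in combinations(v, k)}


def link(sigma, facets):
    """Faces tau of the simplices in the star of sigma with sigma ∩ tau = ∅."""
    sigma = frozenset(sigma)
    out = set()
    for f in facets:
        if sigma <= f:  # f belongs to the star of sigma
            out |= {t for t in all_faces(f) if not (t & sigma)}
    return out


def euler_characteristic(simplices):
    """Alternating sum of the numbers of simplices by dimension."""
    return sum((-1) ** (len(s) - 1) for s in simplices)


# Cone from the interior vertex 0 over the boundary of the tetrahedron {1,2,3,4}.
cone = [frozenset({0, a, b, c}) for a, b, c in combinations(range(1, 5), 3)]
lk_vertex = link({0}, cone)    # expected to triangulate a 2-sphere
lk_edge = link({0, 1}, cone)   # expected to triangulate a circle
```

Here the link of the interior vertex is the boundary of a tetrahedron (Euler characteristic 2) and the link of an interior edge is a triangle's boundary (Euler characteristic 0), in agreement with the dimensions $n - p - 1 = 2$ and $1$ for $n = 3$.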
On the other hand, owing to Theorem 2, PL-manifolds are mapped homeomorphically into PL-manifolds, and moreover all admissible triangulations of a given M are related to each other by a suitable sequence of Alexander’s moves. The way out of this situation is to look for a different set of moves, which are both elementary (i.e. they involve just a fixed number of simplices in any dimension n) and equivalent to Alexander’s transformations, namely topology – preserving and ergodic (i.e. they must span all the possible triangulations of a given M). A set of moves that shares these requirements for the case of closed n-dimensional PL-manifolds has been found by Pachner: the bistellar elementary operations (see [P87] and also Appendix A of [A-C-M] for an account on this subject in connection with simplicial quantum gravity models in n = 3, 4). Pachner has also introduced a set of moves which are suitable in the case of compact n-dimensional PL-manifolds with a non–empty boundary, the elementary shellings (see [P90] and [P91]). As the term “elementary shelling” suggests, this kind of operation involves the cancellation of one n-simplex (facet) at a time in a given triangulation (T , ∂T ) → (M, ∂M) of a PL-pair of dimension n. In order to be deleted, the facet must have some of its faces lying in the boundary ∂T . Moreover, using Definition 1, we may decompose a facet of this kind (considered now as a complex) into the join of two suitable faces belonging to it. This decomposition is obviously not unique, although in each dimension n there are only a finite number of possibilities of carrying it out (up to relabelling the faces of a given dimension). Definition 8. Let (T , ∂T ) → (M, ∂M) be a triangulation of a PL-pair of dimension n and let σ n be a facet decomposed according to: σn = τ σr,

(9)


where τ is a face of σ n of dimension p ≥ 0 such that τ ∈ int (T ), and the second factor represents a face of σ n of dimension r ≥ 0 with the following property: B(τ ) σ r ⊆ ∂T ,

(10)

where B(τ ) is the complex made up of all the faces of τ except τ itself. Then an elementary r-shelling of (T , ∂T ) is defined according to: . '−σ n (T , ∂T ) = (T , ∂T ) \ {F(τ ) σ r } ≡ (T˜ , ∂ T˜ ), (11) where F(τ ) is given in (3). Notice that the dimension p of τ is given, in terms of n and r, by p = n − r − 1; moreover, if τ is a 0-simplex then B(τ ) = ∅ and the remark at the end of Definition 3 has to be kept in mind. The inverse operation amounts to adding a new facet to (T˜ , ∂ T˜ ) along some faces in ˜ ∂ T , and can be simply defined as . '+σ n (T˜ , ∂ T˜ ) = ('−σ n )−1 (T˜ , ∂ T˜ ). (12) If we set '+ ≡ '+σ n and '− ≡ '−σ n for some facet (or missing facet) σ n , we can establish an equivalence relation between triangulations according to: Definition 9. Two triangulations (T , ∂T ) and (T˜ , ∂ T˜ ) are said to be equivalent under elementary shellings if, and only if, they are connected by a finite number of elementary boundary operations, namely: (T , ∂T ) ≈sh (T˜ , ∂ T˜ ) ⇐⇒ (T˜ , ∂ T˜ ) = 'k± · · · '1± (T , ∂T ),

(13)

where '± are defined in (12) and (11) respectively and k is an integer. Remark 1. It may happen that there exist one face τ ∈ int (T ) and different faces, say σ1r and σ2r , with both B(τ ) σ1r and B(τ ) σ2r in ∂T and such that σ n = τ σ1r , σ n = τ σ2r for a fixed σ n . However, for each σ r belonging to ∂T , there exists at most one τ ∈ int (T ) such that: i) τ σ r is a facet; ii) B(τ ) σ r ⊆ ∂T . Hence the elementary operation '−σ n defined in (11) in uniquely determined by σ r , thus the set of the possible elementary shellings (performed on a single facet) is equal to the dimension n of the facet itself, since r = 0, 1, . . . , n − 1. The statement of the main theorem in [P91] can be rewritten in our notation as: Theorem 3. Let (T1 , ∂T1 ) → (M1 , ∂M1 ) and (T2 , ∂T2 ) → (M2 , ∂M2 ) be triangulations of compact n-dimensional manifolds with boundary. Then (M1 , ∂M1 ) and (M2 , ∂M2 ) are PL-homeomorphic if, and only if, (T1 , ∂T1 ) and (T2 , ∂T2 ) are equivalent under elementary shellings, namely: |(T1 , ∂T1 )| ∼ =PL |(T2 , ∂T2 )| ⇐⇒ (T1 , ∂T1 ) ≈sh (T2 , ∂T2 ),

(14)

where |(T1 , ∂T1 )| ∼ = (M1 , ∂M1 ), |(T2 , ∂T2 )| ∼ = (M2 , ∂M2 ), the equivalence ≈sh being in the sense of Definition 9. Notice that in one direction (⇐) the result is quite straightforward and moreover, as a particular application, we may consider different triangulations (T , ∂T ), (T˜ , ∂ T˜ ) of the same PL-pair (M, ∂M). Indeed, Pachner has proved a weaker version of the above result in [P90], namely


Theorem 4. Let (T1 , ∂T1 ) → (M1 , ∂M1 ) and (T2 , ∂T2 ) → (M2 , ∂M2 ) be triangulations of PL, compact n-dimensional pairs. Then |(T1 , ∂T1 )| ∼ =PL |(T2 , ∂T2 )| ⇐⇒ (T1 , ∂T1 ) ≈sh,bst (T2 , ∂T2 ),

(15)

where the equivalence ≈sh,bst is both under elementary shellings and under bistellar elementary operations on n-simplices in int(T1 ) or int(T2 ). Remark 2. The advantage of having to deal with elementary shellings is quite evident (although we shall use Theorem 4 when handling our combinatorial invariants). Moreover, there exists a correspondence between bistellar moves in dimension n and elementary shellings in dimension (n − 1), as discussed in Sect. 6. In this respect, Theorem 3 represents exactly the counterpart of Pachner’s theorem for closed (n − 1)-PL-manifolds (see [P87]). Example. Elementary shellings in n = 3. Let (T , ∂T ) → (M, ∂M) represent a triangulation of a 3-dimensional PL-pair and let σ 3 be a facet with some component in ∂T . According to Definition 8 we can write: σ 3 = τ σ r (r = 0, 1, 2),

(16)

where τ ∈ int(T ) and σ r ∈ ∂T . As we noticed in Remark 1, for every σ r ∈ ∂T there exists at most one τ ∈ int(T ) which satisfies (16). Then we can classify the possible facets and the corresponding elementary shellings according to the dimensionality of σ r and using (11) (the different configurations are also illustrated in the figures of Sect. 4). 1. TYPE I (r = 0). The facet σ 3(I) admits the decomposition σ 3(I) = τ (I) σ 0 , where τ (I) is a 2-simplex and belongs to int(T ). The vertex σ 0 and the three 2-dimensional faces of σ 3(I) which have σ 0 as a common subsimplex are in ∂T . The shelling of σ 3(I) is represented by the map: '−(I) : (T , ∂T ) −→ (T , ∂T ) \ {F(τ (I) ) σ 0 }.

(17)

2. TYPE II (r = 1). The facet σ 3(II) admits the decomposition σ 3(II) = τ (II) σ 1 , where τ (II) is a 1-simplex and belongs to int(T ). The 1-simplex σ 1 and the two 2-dimensional faces of σ 3(II) which have σ 1 as a common subsimplex are in ∂T . The shelling of σ 3(II) is represented by the map: '−(II) : (T , ∂T ) −→ (T , ∂T ) \ {F(τ (II) ) σ 1 }.

(18)

3. TYPE III (r = 2). The facet σ 3(III) admits the decomposition σ 3(III) = τ (III) σ 2 , where τ (III) is a vertex and belongs to int(T ). The 2-simplex σ 2 is in ∂T . The shelling of σ 3(III) is represented by the map: '−(III) : (T , ∂T ) −→ (T , ∂T ) \ {F(τ (III) ) σ 2 }.

(19)

The inverse elementary shellings '+(I) , '+(II) and '+(III) are nothing but maps which are the inverse operations with respect to the former ones, and represent attachments of 3-simplices of Types I, II, III respectively.
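The classification into Types I, II and III can be sketched on an abstract triangulation. The code below is an illustration under stated assumptions: tetrahedra are `frozenset`s of four vertex labels, σ^r is taken as the common face of the boundary triangles of the tetrahedron, τ as the complementary face, and τ is required not to lie in the boundary. The names `shelling_type` and `shell` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def triangles(tet):
    """The four 2-dimensional faces of a tetrahedron."""
    return [frozenset(c) for c in combinations(sorted(tet), 3)]


def boundary_triangles(tets):
    """Triangles contained in exactly one tetrahedron."""
    count = Counter(t for s in tets for t in triangles(s))
    return {t for t, k in count.items() if k == 1}


def shelling_type(tet, tets):
    """Return 'I', 'II' or 'III' if tet can be removed by an elementary
    shelling with r = 0, 1 or 2 respectively, and None otherwise."""
    bd = boundary_triangles(tets)
    on_bd = [t for t in triangles(tet) if t in bd]
    if not 1 <= len(on_bd) <= 3:
        return None
    sigma = frozenset.intersection(*on_bd)   # the face sigma^r in the boundary
    tau = frozenset(tet) - sigma             # the complementary face tau
    if any(tau <= t for t in bd):            # tau must be an interior face
        return None
    return {0: "I", 1: "II", 2: "III"}[len(sigma) - 1]


def shell(tet, tets):
    """Elementary shelling: delete the facet tet from the triangulation."""
    return [t for t in tets if t != tet]


# A 3-ball triangulated as a cone from the interior vertex 0.
ball = [frozenset({0, a, b, c}) for a, b, c in combinations(range(1, 5), 3)]
first = frozenset({0, 1, 2, 3})
step1 = shell(first, ball)
second = frozenset({0, 1, 2, 4})
step2 = shell(second, step1)
```

On this small ball the three types appear in succession: the first tetrahedron has one boundary triangle (Type III), after its removal the next one has two (Type II), and the last two tetrahedra each have three (Type I).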


3. Generalization of Ponzano–Regge Partition Function in Terms of Wigner 3jm and 6j Symbols

In this section we generalize the partition function proposed by Ponzano and Regge in [P-R] for closed manifolds to the case of a compact 3-dimensional simplicial PL-manifold with a non-empty boundary, $(M, \partial M)$. The resulting state sum will contain both 6j symbols associated with 3-simplices in $(M, \partial M)$ and 3jm symbols associated with 2-simplices in $\partial M$. We start with some basic definitions from the recoupling theory of angular momenta of $SU(2)$ following the standard notation of [V-M-K]. If $j_1$, $j_2$ are two angular momenta (spin) labelling irreducible representations of $SU(2)$ and $m_1$, $m_2$ are the corresponding projections on the quantization axis, then the Clebsch–Gordan coefficient $C^{jm}_{j_1 m_1 j_2 m_2}$ represents the probability amplitude that $j_1$ and $j_2$ are coupled to give a resultant angular momentum $j$ with projection $m = m_1 + m_2$. In what follows the Wigner 3jm symbols will be used instead of the C-G coefficients owing to their symmetry properties. A 3jm symbol represents the probability amplitude that three angular momenta $j_1$, $j_2$, $j_3$ with projections $m_1$, $m_2$, $m_3$ respectively, are coupled to yield zero angular momentum; in terms of the corresponding C-G coefficient it can be expressed as:

$$\begin{pmatrix} a & b & c \\ \alpha & \beta & -\gamma \end{pmatrix} = (-1)^{a-b+\gamma}\, (2c+1)^{-1/2}\, C^{c\gamma}_{a\alpha\, b\beta}. \qquad (20)$$

Here we adopt the Latin letters $\{a, b, c, \ldots\}$ to denote angular momenta, and the Greek letters $\{\alpha, \beta, \gamma, \ldots\}$ to denote momentum projections in the arguments of the coefficients. (When convenient the notation $j_1, j_2, \ldots$ for angular momenta and $m_1, m_2, \ldots$ for the corresponding momentum projections will be restored.) In any case, a variable of type $j$ is any integer or half-integer non-negative number and its corresponding variable of type $m$ is such that $|m| \le j$, both in $\hbar$ units.
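For readers who want numerical values, a 3jm symbol can be evaluated directly. The sketch below does not use the Clebsch–Gordan route of (20); it assumes instead Racah's closed single-sum formula for the 3jm symbol, a standard result that is not stated in the text. The names `admissible` and `wigner_3jm` are illustrative, and spins are assumed to be integers or half-integers (exactly representable as `Fraction`s).

```python
from fractions import Fraction
from math import factorial, isclose, sqrt


def _fact(x):
    """Factorial of a Fraction that must be a non-negative integer."""
    x = Fraction(x)
    if x.denominator != 1 or x < 0:
        raise ValueError("factorial argument must be a non-negative integer")
    return factorial(x.numerator)


def admissible(a, b, c):
    """Triangular inequalities |a - b| <= c <= a + b, with a + b + c an integer."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    return abs(a - b) <= c <= a + b and (a + b + c).denominator == 1


def wigner_3jm(j1, j2, j3, m1, m2, m3):
    """3jm symbol with upper row (j1 j2 j3) and lower row (m1 m2 m3),
    evaluated with Racah's formula (assumed here, not taken from the text)."""
    j1, j2, j3, m1, m2, m3 = (Fraction(x) for x in (j1, j2, j3, m1, m2, m3))
    pairs = ((j1, m1), (j2, m2), (j3, m3))
    if m1 + m2 + m3 != 0 or not admissible(j1, j2, j3):
        return 0.0
    if any(abs(m) > j or (j - m).denominator != 1 for j, m in pairs):
        return 0.0
    # Squared prefactor: triangle coefficient times the (j + m)! (j - m)! factors.
    pref = Fraction(
        _fact(j1 + j2 - j3) * _fact(j1 - j2 + j3) * _fact(-j1 + j2 + j3),
        _fact(j1 + j2 + j3 + 1),
    )
    for j, m in pairs:
        pref *= _fact(j + m) * _fact(j - m)
    # Summation range: all factorial arguments must stay non-negative.
    kmin = max(Fraction(0), j2 - j3 - m1, j1 - j3 + m2)
    kmax = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    total = Fraction(0)
    for k in range(int(kmin), int(kmax) + 1):
        den = (_fact(k) * _fact(j1 + j2 - j3 - k) * _fact(j1 - m1 - k)
               * _fact(j2 + m2 - k) * _fact(j3 - j2 + m1 + k)
               * _fact(j3 - j1 - m2 + k))
        total += Fraction((-1) ** k, den)
    sign = -1 if (j1 - j2 - m3) % 2 else 1
    return sign * sqrt(pref) * float(total)


half = Fraction(1, 2)
value = wigner_3jm(half, half, 0, half, -half, 0)
```

As a quick check, the symbol with spins (1/2, 1/2, 0) and projections (1/2, -1/2, 0) evaluates to 1/√2, and a cyclic permutation of the columns leaves the value unchanged.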
We do not need at this point either the explicit expression of the C-G coefficient or the list of the properties of the 3jm symbol; we just mention the fact that the phase factor $(-1)^{a-b+\gamma}$ in (20) is chosen in such a way that any cyclic permutation of columns leaves the symbol unchanged. Moreover, both the 3jm symbol and the C-G coefficient vanish unless the triangular inequalities $|a - b| \le c \le a + b$ (and their cyclic permutations) hold true. A triad of such kind will be called admissible, borrowing the language used in the context of quantum invariants introduced in [T-V]. Thus, from a geometrical point of view, a 3jm symbol can be associated with a triangle lying in a Euclidean 3-space, the edge lengths of which are $(2a + 1)$, $(2b + 1)$, $(2c + 1)$ and having projections $\alpha$, $\beta$, $-\gamma$ respectively along a fixed reference axis. (Strictly speaking, such a picture arises only in the semiclassical limit, when the vector model for the recoupling of angular momenta can be applied.) The following combination of four 3jm symbols, summed over their magnetic quantum numbers, provides the expression of the Racah–Wigner 6j symbol:

$$\begin{Bmatrix} a & b & c \\ d & e & f \end{Bmatrix} = \sum (-1)^{\eta} \begin{pmatrix} a & b & c \\ -\alpha & -\beta & -\gamma \end{pmatrix} \begin{pmatrix} a & e & f \\ \alpha & -\varepsilon & \varphi \end{pmatrix} \cdot \begin{pmatrix} d & b & f \\ \delta & \beta & -\varphi \end{pmatrix} \begin{pmatrix} d & e & c \\ -\delta & \varepsilon & \gamma \end{pmatrix}, \qquad (21)$$

where $\eta = a + b + c + d + e + f - \alpha - \beta - \gamma - \delta - \varepsilon - \varphi$ and the sum is extended over all possible values of the m-variables (notice however that only three summation indices are independent). The 6j symbols satisfy orthogonality conditions which read:

$$\sum_X (2X + 1) \begin{Bmatrix} a & b & X \\ c & d & e \end{Bmatrix} \begin{Bmatrix} a & b & X \\ c & d & f \end{Bmatrix} = (2e + 1)^{-1}\, \delta_{ef}\, \{ade\}\{bce\}, \qquad (22)$$


where the notation $\{ade\}$ stands for the triangular delta (viz., $\{ade\}$ is equal to 1 if its three arguments satisfy triangular inequalities, and is zero otherwise), while $\delta_{ef} \equiv \delta(e, f)$. Each triad in (21), $(abc)$, $(aef)$, $(dbf)$, $(dec)$, must be admissible, or, in other words, each 3jm symbol is different from zero. Thus the 6j symbol has the symmetries of a nondegenerate tetrahedron embedded in a Euclidean 3-space with edge lengths $(2a + 1), \ldots, (2f + 1)$ (the non-degeneracy is given by the supplementary condition $V^2 > 0$, where $V$ is the Euclidean volume of the tetrahedron, see e.g. [P-R]). Any arrangement of six spin variables $\{j_1, j_2, \ldots, j_6\}$ with $j_p = 0, 1/2, 1, 3/2, \ldots$ ($p = 1, 2, \ldots, 6$) satisfying all the above requirements will be called admissible.

After these preliminary remarks, the connection between a recoupling scheme of a (finite number of) angular momenta and the combinatorial structure of a compact 3-dimensional simplicial PL-manifold $M$ without boundary is given by the classical result by Ponzano and Regge. On the basis of the notation introduced in (7), let the map:

$$T(j) \longrightarrow M \qquad (23)$$

represent here a particular triangulation of the 3-dimensional PL-manifold $M$ associated with an admissible assignment of spin variables to the collection of the edges in $T$. Moreover, we set $j \equiv \{j_A\}$, $A = 1, 2, \ldots, N_1$, where $N_1$ is the number of the edges in $T(j)$; therefore $(2j_A + 1)$ is the length of the edge labelled by $A$. The compatibility conditions on the assignment of spin variables are encoded in the requirement that each 3-simplex $\sigma^3$ in $T(j)$ is actually associated, apart from a phase factor, with a 6j symbol of $SU(2)$:

$$\sigma^3_B \longleftrightarrow (-1)^{\sum_{p=1}^{6} j_p} \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix}_B, \qquad (24)$$

where $B = 1, 2, \ldots, N_3$ labels the tetrahedra of $T(j)$. Then the Ponzano–Regge partition function for the manifold $M$ is rewritten here as:

$$Z[M] = \lim_{L \to \infty} \sum_{\{T(j),\, j \le L\}} Z[T(j) \to M; L], \qquad (25)$$

where the sum is extended to all assignments of spin variables such that each of them is not greater than the cut-off $L$, and each term under the sum is given by:

$$Z[T(j) \to M; L] = \Lambda(L)^{-N_0} \prod_{A=1}^{N_1} (2j_A + 1) \prod_{B=1}^{N_3} (-1)^{\sum_{p=1}^{6} j_p} \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix}_B. \qquad (26)$$

Here $\Lambda(L) \equiv 4L^3/3a$ ($a$ is an arbitrary constant) and $N_0$ is the number of vertices in $T(j)$. As is well known (see e.g. [P-R, P87] and [C-F-S]) the state sum given in (25) and (26) is invariant under bistellar elementary operations. Recall that such bistellar moves can be expressed in terms of the Biedenharn–Elliott identity (representing the moves (2 tetrahedra) ↔ (3 tetrahedra)) and of both the B-E identity and the orthogonality conditions (22) (which represent the moves (1 tetrahedron) ↔ (4 tetrahedra)). As an intermediate step toward the generalization to the case of a 3-manifold with boundary, we recall the extension of (25) and (26) to a manifold $M$ with a fixed triangulation on its boundary $\partial M$ (see e.g. [O92,a]). Let the map


G. Carbone, M. Carfora, A. Marzuoli

\[ (T(j, \bar{j}), \partial\bar{T}(\bar{j})) \longrightarrow (M, \partial M \equiv \partial\bar{T}) \tag{27} \]

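denote now a particular triangulation of the PL-pair $(M, \partial M)$ associated with an admissible assignment of spin variables $j \equiv \{j_A\}$, $A = 1, 2, \ldots, n_1$, to the edges belonging to the interior of $T(j, \bar{j})$, and such that the assignment of variables $\bar{j} \equiv \{\bar{j}_C\}$, $C = 1, 2, \ldots, \bar{n}_1$, to the edges belonging to $\partial M \equiv \partial\bar{T}$ is kept fixed. Then the following state sum can be defined:

\[ Z[(M, \partial M \equiv \partial\bar{T})] = \lim_{L\to\infty} \sum_{T(j,\bar{j}),\; j \le L;\; \bar{j}\ \mathrm{fixed}} Z[(T(j, \bar{j}), \partial\bar{T}(\bar{j})) \to (M, \partial M \equiv \partial\bar{T}); L], \tag{28} \]

where

\[ Z[(T(j, \bar{j}), \partial\bar{T}(\bar{j})) \to (M, \partial M \equiv \partial\bar{T}); L] = \Lambda(L)^{-n_0} \prod_{A=1}^{n_1} (-1)^{2j_A} (2j_A + 1) \prod_{B=1}^{N_3} (-1)^{\sum_{p=1}^{6} j_p} \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix}_B \cdot \prod_{C=1}^{\bar{n}_1} (-1)^{\bar{j}_C} (2\bar{j}_C + 1)^{1/2} . \]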

In this last expression $n_0$ is the number of vertices in the interior of $T(j, \bar{j})$, $N_3$ is the total number of 3-simplices, while $N_1 \equiv n_1 + \bar{n}_1$ is the total number of edges in $(T(j, \bar{j}), \partial\bar{T}(\bar{j}))$. The above state sum is invariant under bistellar moves performed in $\mathrm{int}(T(j))$. Moreover, it behaves correctly with respect to spacetime compositions (or cobordisms): starting for instance with two PL-pairs $(M_1, \partial M_1 \equiv \partial T)$ and $(M_2, \partial M_2 \equiv \partial T)$ with fixed isomorphic triangulations on their boundaries, the composite state sum is obtained by glueing along the boundaries and is given by (25) and (26) with $M = M_1 \cup M_2$.

We turn now to the general case of a 3-dimensional compact PL-pair $(M, \partial M)$, the boundary of which will be equipped with the unique triangulation induced on it by the triangulation we choose in $T$ according to (8). In the present context, let the map

\[ (T(j), \partial T(j', m)) \longrightarrow (M, \partial M) \tag{29} \]

represent a triangulation associated with an admissible assignment of both spin variables to the collection of the edges in $(T, \partial T)$ and of momentum projections to the subset of edges lying in $\partial T$. With a slight change of notation, let $j \equiv \{j_A\}$, $A = 1, 2, \ldots, N_1$, denote all the spin variables, $n'_1$ of which are associated with the edges in the boundary. This last subset is labelled both by $j' \equiv \{j'_C\}$, $C = 1, 2, \ldots, n'_1$, and by $m \equiv \{m_C\}$, where $m_C$ is the projection of $j'_C$ along the fixed reference axis. The consistency in the assignment of $j$, $j'$, $m$ is ensured if we require that each 3-simplex $\sigma^3_B$ ($B = 1, 2, \ldots, N_3$) in $(T, \partial T)$ must be associated with a $6j$ symbol as in (24), while each 2-simplex $\sigma^2_D$, $D = 1, 2, \ldots, n_2$, in $\partial T$ must be associated with a $3jm$ symbol of $SU(2)$ according to

\[ \sigma^2_D \longleftrightarrow (-1)^{(\sum_{s=1}^{3} m_s)/2} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_D . \tag{30} \]

Wigner Symbols and Combinatorial Invariants of Three-Manifolds


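Then the following state sum can be defined:

\[ Z[(M, \partial M)] = \lim_{L\to\infty} \sum_{\substack{(T(j),\,\partial T(j',m)) \\ j, j', m \le L \\ -j \le m \le j}} Z[(T(j), \partial T(j', m)) \to (M, \partial M); L], \tag{31} \]

where

\[ Z[(T(j), \partial T(j', m)) \to (M, \partial M); L] = \Lambda(L)^{-N_0} \prod_{A=1}^{N_1} (-1)^{2j_A} (2j_A + 1) \prod_{B=1}^{N_3} (-1)^{\sum_{p=1}^{6} j_p} \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix}_B \cdot \prod_{D=1}^{n_2} (-1)^{(\sum_{s=1}^{3} m_s)/2} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_D . \tag{32} \]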
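The counts $N_0$, $N_1$, $N_3$ and $n_2$ that enter (32) can be read off mechanically from a list of tetrahedra. The following is only a minimal sketch: the input format (4-tuples of vertex labels) and the function name `simplex_counts` are illustrative assumptions, and a triangle is taken to lie in $\partial T$ when it belongs to exactly one tetrahedron.

```python
from itertools import combinations


def simplex_counts(tetrahedra):
    """Return (N0, N1, N3, n2) for a 3-dimensional simplicial complex.

    `tetrahedra` is a list of 4-tuples of vertex labels (hypothetical
    input format). A triangle counts as a boundary face when it belongs
    to exactly one tetrahedron.
    """
    vertices, edges, face_count = set(), set(), {}
    for tet in tetrahedra:
        vertices.update(tet)
        edges.update(frozenset(e) for e in combinations(tet, 2))
        for face in combinations(tet, 3):
            key = frozenset(face)
            face_count[key] = face_count.get(key, 0) + 1
    n2 = sum(1 for count in face_count.values() if count == 1)
    return len(vertices), len(edges), len(tetrahedra), n2


# One tetrahedron, and two tetrahedra glued along the face (1, 2, 3).
print(simplex_counts([(0, 1, 2, 3)]))                # (4, 6, 1, 4)
print(simplex_counts([(0, 1, 2, 3), (1, 2, 3, 4)]))  # (5, 9, 2, 6)
```

In the second example the shared triangle is interior, so only six of the seven triangles would carry a $3jm$ symbol in (32).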

$N_0$, $N_1$, $N_3$ denote respectively the total number of vertices, edges and tetrahedra in $(T(j), \partial T(j', m))$, while $n_2$ is the number of 2-simplices lying in $\partial T(j', m)$. Notice that there appears a factor $\Lambda(L)^{-1}$ for each vertex in $\partial T(j', m)$ too (cf. the corresponding expression in the case of a boundary with a fixed triangulation). Moreover, (32) is manifestly invariant under bistellar moves which involve 3-simplices in $\mathrm{int}(T)$, and thus (31) and (32) reduce to (25) and (26) respectively if $\partial M = \emptyset$. It is worthwhile to remark also that products of $6j$ and $3jm$ coefficients of the kind which appear in (32) are known as $jm$ coefficients in the quantum theory of angular momentum (see e.g. [Y-L-V]). Their semiclassical limit can be defined in a consistent way by requiring that simultaneously $j, j' \to \infty$ and $m \to \infty$ with the constraint $-j \le m \le j$. The summation in (31) has precisely this meaning, apart from the introduction of the cut-off $L$.

4. Identities Representing Elementary Shellings and Invariance of the State Sum

The aim of this section is to show the invariance of the state sum proposed in (31) and (32) under the set of elementary shellings in $n = 3$ illustrated at the end of Sect. 2. Then $Z[(M, \partial M)]$ will turn out to be a PL (or combinatorial) invariant of the PL-pair $(M, \partial M)$ according to Theorem 4. It should be clear at this point that we have to find suitable identities (involving $jm$ coefficients) which could be associated with the three types of elementary shellings and inverse shellings. Once these identities are established, the state sum $Z[(M, \partial M)]$ will comply with them in a manifest way. Notice that this kind of proof essentially mimics the procedure followed both in [P-R] and [T-V] and, more recently, in [C-K-S]: we have actually inferred the expression of the state sum $Z[(M, \partial M)]$ from the set of identities which implements its topological invariance.
In what follows we show that one of the identities collected in [V-M-K], together with the orthogonality conditions for the $6j$ and $3jm$ symbols, is all we need in order to characterize completely both the three types of elementary shellings and their inverse transformations. Let us start with the shelling of a facet of TYPE III and with the corresponding maps $\rho^{\pm(\mathrm{III})}$, the action of which (given in (19)) is depicted schematically in Fig. 1.

Fig. 1. On the left-hand side, the 2-simplex $\sigma^2 \subset \sigma^{3(\mathrm{III})}$ lying in $\partial T$ is associated with the triad $(abc)$; its opposite vertex $\tau^{(\mathrm{III})}$, together with the other faces, are in the interior of $T$. The action of the map $\rho^{-(\mathrm{III})}$ amounts to cancelling $\sigma^2$ and the interior of the facet; the surviving triangles, labelled by $(apq)$, $(bqr)$, $(cpr)$, are in the boundary of the new complex. The action of the map $\rho^{+(\mathrm{III})}$ can be read in the opposite direction

According to the basic rules given in (24) and (30), the configuration on the left-hand side must be represented in the state sum by a product between one $6j$ symbol, associated with the facet, and one $3jm$ symbol associated with the unique face which is in $\partial T$. The three faces that survive after the shelling appear on the right-hand side, thus the corresponding side of the identity should contain a suitable sum of a product of three $3jm$ symbols. As a general remark, notice that variables appearing in a $3jm$ symbol are associated with edges lying in $\partial T$ in the particular configuration we are dealing with, while variables appearing only in $6j$ symbols correspond to internal edges. The labelling we adopt here agrees with the notation at the beginning of Sect. 3, namely Latin letters $a, b, c, r, p, q, \ldots$ denote angular momentum variables and Greek letters $\alpha, \beta, \gamma, \rho, \psi, \kappa, \ldots$ are the corresponding momentum projections. From these elementary remarks it follows that the maps $\rho^{\pm(\mathrm{III})}$ are represented by the following identity:

\[ \begin{pmatrix} a & b & c \\ \alpha & \beta & \gamma \end{pmatrix} (-1)^{\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} = \sum_{\kappa\psi\rho} (-1)^{-\psi-\kappa-\rho} \begin{pmatrix} p & a & q \\ \psi & \alpha & -\kappa \end{pmatrix} \begin{pmatrix} q & b & r \\ \kappa & \beta & -\rho \end{pmatrix} \begin{pmatrix} r & c & p \\ \rho & \gamma & -\psi \end{pmatrix} , \tag{33} \]

where $\Phi \equiv a + b + c + r + p + q$. Here we have made use, with respect to the expression given in [V-M-K], of the symmetry properties of the $3jm$ symbols and of the fact that $(-1)^{2(a+b+c)} = 1$. The triple sum over magnetic numbers (which appear in pairs with opposite signs) is interpreted as a glueing along the edges labelled by the corresponding $j$-variables.

The shelling and inverse shelling of a facet of TYPE II, given in (18), are depicted in Fig. 2.

Fig. 2. On the left-hand side, the 1-simplex $\sigma^1 \subset \sigma^{3(\mathrm{II})}$ (corresponding to the edge labelled by $c$) is in $\partial T$, together with the triangles associated with the triads $(abc)$ and $(cpr)$. The 1-simplex $\tau^{(\mathrm{II})} \leftrightarrow q$, together with the triangles associated with $(aqp)$ and $(bqr)$, are in the interior of $T$. The map $\rho^{-(\mathrm{II})}$ deletes both the interior of the facet and $(abc)$, $(cpr)$: the resulting complex has the two remaining faces in its boundary. The action of the map $\rho^{+(\mathrm{II})}$ can be read in the opposite direction

The configuration on the left-hand side is associated with a suitable sum of a product of two $3jm$ symbols and one $6j$ symbol; on the other side we have just the two faces which survive after the shelling, represented by a sum (over the magnetic number corresponding to their common edge) of the product of the remaining $3jm$'s. Recall that the expression of the orthogonality conditions for the $3jm$ symbols with respect to magnetic numbers reads:

\[ \sum_{c\gamma} (-1)^{2c-\gamma} (2c + 1) \begin{pmatrix} a & b & c \\ -\alpha & -\beta & \gamma \end{pmatrix} \begin{pmatrix} b & a & c \\ -\beta' & -\alpha' & \gamma \end{pmatrix} = (-1)^{\alpha+\beta}\, \delta_{\alpha\alpha'}\, \delta_{\beta\beta'} . \tag{34} \]

Consider now (33) again, multiply each side by $(-1)^{-\gamma+2c} (2c + 1) \begin{pmatrix} r & p & c \\ -\rho & \psi & -\gamma \end{pmatrix}$ and sum over the pair $c, \gamma$. Then, using (34) and the symmetry properties of the $3jm$'s in order to adjust phase factors, we get the identity representing the shellings of TYPE II:

\[ \sum_{c\gamma} (2c + 1) (-1)^{2c-\gamma} (-1)^{\Phi} \begin{pmatrix} a & b & c \\ \alpha & \beta & \gamma \end{pmatrix} \begin{pmatrix} c & r & p \\ -\gamma & \rho & \psi \end{pmatrix} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} = (-1)^{-2\rho} \sum_{\kappa} (-1)^{-\kappa} \begin{pmatrix} p & a & q \\ \psi & \alpha & -\kappa \end{pmatrix} \begin{pmatrix} q & b & r \\ \kappa & \beta & -\rho \end{pmatrix} . \tag{35} \]

Fig. 3. On the left-hand side, the 2-simplex $\tau^{(\mathrm{I})} \subset \sigma^{3(\mathrm{I})}$ is in the interior of $T$ and corresponds to the triad $(abc)$; its opposite vertex $\sigma^0$, together with the other three faces, are in $\partial T$. The action of $\rho^{-(\mathrm{I})}$ amounts to cancelling both the edges labelled by $r$, $p$, $q$ and the faces which share one of them. In the boundary of the new complex we have just the triangle associated with $(abc)$. The action of the map $\rho^{+(\mathrm{I})}$ can be read in the opposite direction

Since in this kind of shelling an edge must disappear, it is natural to find that in (35) we have indeed a sum both over the variable $c$ and over its corresponding $\gamma$, since $c$ is shared by the two faces on the left-hand side.

The shelling and inverse shelling of a facet of TYPE I are characterized by the fact that a vertex, together with the three edges arising from it, is now involved. More precisely, as the configurations in Fig. 3 show, we have to sum over external edges a product containing three $3jm$ and one $6j$ symbols, getting a single $3jm$ (this is the action of the map $\rho^{-(\mathrm{I})}$, recall also (17)).

We start again from (33), multiply both sides by $(2r + 1)(2p + 1)(2q + 1) \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} \cdot (-1)^{2(p+q+r)}$ and sum over $q, p, r$. Then we get the expression:

\[ \sum_{q\kappa,\, p\psi,\, r\rho} (-1)^{-\psi-\kappa-\rho} (-1)^{2(p+q+r)} (2p + 1)(2r + 1)(2q + 1) \cdot (-1)^{\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} \begin{pmatrix} p & a & q \\ \psi & \alpha & -\kappa \end{pmatrix} \begin{pmatrix} q & b & r \\ \kappa & \beta & -\rho \end{pmatrix} \begin{pmatrix} r & c & p \\ \rho & \gamma & -\psi \end{pmatrix} = \begin{pmatrix} a & b & c \\ \alpha & \beta & \gamma \end{pmatrix} \sum_{p, r, q} (-1)^{2(p+q+r)} (2p + 1)(2r + 1)(2q + 1) (-1)^{\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix}^2 . \]

Using the orthogonality conditions (22) for the $6j$'s, the sum over $q$ on the right-hand side gives $(2c + 1)^{-1}$; the two remaining summations reduce to $(2c + 1) \sum_p (2p + 1)^2$, which diverges in the semiclassical limit. Hence, as in the Ponzano–Regge model, we have to introduce a cut-off $L$ and denote the above weight by $\Lambda(L)$ according to the notation of Sect. 3. The following steps consist in an interchange of the first two columns of each $3jm$ and in a relabelling: $\rho, \kappa, \psi \to -\rho', -\kappa', -\psi'$. Thus the final expression representing the maps $\rho^{\pm(\mathrm{I})}$ reads:

\[ \Lambda(L)^{-1} \sum_{q\kappa',\, p\psi',\, r\rho'} (-1)^{-\psi'-\kappa'-\rho'} (-1)^{2(p+q+r)} (2p + 1)(2r + 1)(2q + 1) \cdot \begin{pmatrix} a & p & q \\ \alpha & -\psi' & \kappa' \end{pmatrix} \begin{pmatrix} b & q & r \\ \beta & -\kappa' & \rho' \end{pmatrix} \begin{pmatrix} c & r & p \\ \gamma & -\rho' & \psi' \end{pmatrix} (-1)^{\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} = \begin{pmatrix} b & a & c \\ \beta & \alpha & \gamma \end{pmatrix} . \tag{36} \]
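The two summation steps used on the way to (36) can be checked numerically. The sketch below is not part of the original derivation: it assumes that (22) is the standard orthogonality relation of the classical $6j$ symbols, evaluates the $6j$ symbol with the Racah single-sum formula, and uses doubled spins (`a2` stands for $2a$) so that all arguments are integers. All function names are illustrative.

```python
from math import factorial, sqrt


def triangle_ok(a2, b2, c2):
    """Triangular delta for doubled spins a2 = 2a, b2 = 2b, c2 = 2c."""
    return (a2 + b2 + c2) % 2 == 0 and abs(a2 - b2) <= c2 <= a2 + b2


def _delta(a2, b2, c2):
    """Square-root triangle coefficient of the Racah formula."""
    f = factorial
    num = f((a2 + b2 - c2) // 2) * f((a2 - b2 + c2) // 2) * f((b2 + c2 - a2) // 2)
    return sqrt(num / f((a2 + b2 + c2) // 2 + 1))


def six_j(j1, j2, j3, j4, j5, j6):
    """Classical 6j symbol {j1 j2 j3; j4 j5 j6}; arguments are doubled spins."""
    triads = [(j1, j2, j3), (j1, j5, j6), (j4, j2, j6), (j4, j5, j3)]
    if not all(triangle_ok(*t) for t in triads):
        return 0.0
    lo = [sum(t) // 2 for t in triads]
    hi = [(j1 + j2 + j4 + j5) // 2, (j2 + j3 + j5 + j6) // 2,
          (j3 + j1 + j6 + j4) // 2]
    total = 0.0
    for t in range(max(lo), min(hi) + 1):
        den = 1
        for x in lo:
            den *= factorial(t - x)
        for x in hi:
            den *= factorial(x - t)
        total += (-1) ** t * factorial(t + 1) / den
    pre = 1.0
    for triad in triads:
        pre *= _delta(*triad)
    return pre * total


def sum_over_q(a2, b2, c2, r2, p2):
    """Sum over q of (2q+1) {a b c; r p q}^2, expected to equal 1/(2c+1)."""
    return sum((q2 + 1) * six_j(a2, b2, c2, r2, p2, q2) ** 2
               for q2 in range(a2 + p2 + 1))


def sum_over_r(c2, p2):
    """Sum of (2r+1) over all r forming an admissible triad with c and p."""
    return sum(r2 + 1 for r2 in range(abs(c2 - p2), c2 + p2 + 1, 2))


def cutoff_sum(L2):
    """Sum of (2p+1)^2 over p = 0, 1/2, ..., L, with L2 = 2L."""
    return sum((p2 + 1) ** 2 for p2 in range(L2 + 1))


print(sum_over_q(2, 2, 2, 2, 2))       # close to 1/3 for a = b = c = r = p = 1
print(sum_over_r(2, 3), (2 + 1) * (3 + 1))
print([cutoff_sum(L2) for L2 in (2, 20, 200)])
```

The last line only illustrates that the truncated sum grows cubically with the cut-off, which is the behaviour a weight of the form $\Lambda(L) \propto L^3$ is meant to compensate; the normalization constant is not fixed by this check.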

The above analysis of the identities representing the elementary shellings and their inverse moves, together with a comparison with the expression given in (32), completes the proof of the following:

Theorem 5. The state sum $Z[(M, \partial M)]$ for the 3-dimensional PL-pair $(M, \partial M)$ is formally invariant both under bistellar moves in the interior of $M$ and under elementary boundary operations (shellings and inverse shellings). Then, by Pachner's Theorem 4, $Z[(M, \partial M)]$ is an invariant of the PL-structure.

Remark 3. As we have just seen, the complete set of elementary shellings can be derived from a single identity, namely (33), together with orthogonality conditions for $3jm$ and $6j$ symbols. However, it is quite clear that we could have started either from (35) or from (36) as well. The expression given in (35) appears preferable since its structure closely resembles the Biedenharn–Elliott identity, both in the number of symbols involved and in the presence of a single sum over a $j$-variable. Recall also that the complete set of bistellar moves is actually derived from the B-E identity together with the orthogonality conditions for the $6j$, apart from regularization. This similarity in the algebraic structure of the two sets of moves does not happen by chance, although the topological content of the fundamental identity is different in the two cases.

Remark 4. It is worthwhile to stress that both the PL-invariants $Z[M]$ in (25) and $Z[(M, \partial M)]$ in (31), regularized in the same way with the introduction of the cut-off $L$, are notoriously difficult to handle. However, as with the state sum proposed by Turaev and Viro in [T-V], improved regularizations can be obtained by exploiting quantum group technology.

5. Extension to the q-Deformed Case

In this section we extend our previous results in order to get a quantum invariant of a 3-dimensional PL-pair $(M, \partial M)$ which is the counterpart of the Turaev–Viro one defined in [T-V].
We limit ourselves to the analysis of the case in which representations of the quantized enveloping algebra $U_q(sl(2, \mathbb{C}))$, $q$ a root of unity, are involved. The notation, at least in the initial part, is the standard one (see e.g. [M-T, C-F-S]). Thus in particular a $q$-$6j$ symbol can be associated with each 3-simplex of a given triangulation $(T(j), \partial T(j', m)) \to (M, \partial M)$ according to

\[ \sigma^3 \longleftrightarrow \begin{vmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{vmatrix}_q , \tag{37} \]

where the phase factor $(-1)^{\sum_{p=1}^{6} j_p}$ has been absorbed. Notice that in the present case the spin variables $j$ take their values in a finite set $I \equiv \{0, 1/2, 1, 3/2, \ldots, (k/2) - 1\}$, where $\exp(\pi i/k) = q$. Moreover, the 6-tuple $(j_1, j_2, \ldots, j_6) \in I^6$ is said to be admissible if each of its unordered triples $(j_1 j_2 j_3)$, $(j_1 j_5 j_6)$, $(j_4 j_2 j_6)$, $(j_4 j_5 j_3)$ is admissible in the sense already explained in Sect. 3. To be more precise, the $q$-$6j$ symbol is associated with a map $I^6 \to K$, where $K$ is a commutative ring with unity, and the corresponding 6-tuple is admissible. The following step consists in defining, for each $j \in I$, a function $w^2(j) \equiv w_j^2 \doteq (-1)^{2j} [2j + 1]_q \in K^*$, where $K^* = K \setminus \{0\}$ and $[n]_q$ denotes a $q$-integer, namely $[n]_q = (q^n - q^{-n})/(q - q^{-1})$. Moreover, a distinguished element $w \in K^*$ is chosen in such a way that $w^2 = -2k/(q - q^{-1})^2$. The symbol $|\cdots|_q$, the functions $w_j$ and the element $w$ are collectively referred to as initial data. According to [T-V], they have to satisfy some conditions, among which we just need here the following ones:

\[ \sum_{j} w_j^2\, w_{j_4}^2 \begin{vmatrix} j_2 & j_1 & j \\ j_3 & j_5 & j_4 \end{vmatrix}_q \begin{vmatrix} j_2 & j_1 & j \\ j_3 & j_5 & j_6 \end{vmatrix}_q = \delta_{j_4 j_6} , \tag{38} \]

representing the orthogonality relations for the $q$-$6j$ symbols, and

\[ w^2 = w_j^{-2} \sum_{k, l\,:\,(j,k,l) \in \mathrm{adm}} w_k^2\, w_l^2 . \tag{39} \]
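Condition (39) lends itself to a direct numerical check from the definitions of $[n]_q$, $w_j^2$ and $w^2$ given above. The sketch below makes one assumption that is not spelled out in the text: besides the triangular conditions of Sect. 3, an admissible triple is required to satisfy $j + s + t \le k - 2$, which is the usual Turaev–Viro convention at a root of unity. Spins are again doubled (`j2` stands for $2j$), and the function names are illustrative.

```python
import cmath
from math import isclose


def q_integer(n, k):
    """[n]_q = (q^n - q^-n) / (q - q^-1) for q = exp(i*pi/k); the value is real."""
    q = cmath.exp(1j * cmath.pi / k)
    return ((q ** n - q ** (-n)) / (q - 1 / q)).real


def w2_spin(j2, k):
    """w_j^2 = (-1)^(2j) [2j+1]_q, with j2 = 2j."""
    return (-1) ** j2 * q_integer(j2 + 1, k)


def w2_global(k):
    """w^2 = -2k / (q - q^-1)^2."""
    q = cmath.exp(1j * cmath.pi / k)
    return (-2 * k / (q - 1 / q) ** 2).real


def q_admissible(a2, b2, c2, k):
    """Triangular conditions plus the assumed bound a + b + c <= k - 2."""
    return ((a2 + b2 + c2) % 2 == 0
            and abs(a2 - b2) <= c2 <= a2 + b2
            and a2 + b2 + c2 <= 2 * (k - 2))


def check_39(k):
    """Return True if (39) holds for every spin j in I = {0, 1/2, ..., k/2 - 1}."""
    spins = range(k - 1)  # doubled spins 0, 1, ..., k-2
    for j2 in spins:
        total = sum(w2_spin(s2, k) * w2_spin(t2, k)
                    for s2 in spins for t2 in spins
                    if q_admissible(j2, s2, t2, k))
        if not isclose(total / w2_spin(j2, k), w2_global(k), rel_tol=1e-9):
            return False
    return True


for k in (3, 4, 5, 7):
    print(k, check_39(k))
```

Without the extra bound on $j + s + t$ the sum in (39) would pick up additional triples, so the check is sensitive to the admissibility convention adopted.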

The summations in (38) and (39) are carried out over those $j$-variables for which the symbols are defined, and adm is the set of admissible triples.

In order to deal with the generalization of the Turaev–Viro state sum to the case of a 3-dimensional PL-pair $(M, \partial M)$, and following the framework of Sect. 3, we have to introduce the $q$-analog of the Wigner $3jm$ symbol (20). This symbol turns out to be associated with a triangular face $\sigma^2$ in the boundary of a triangulation $(T, \partial T)$ according to the prescription:

\[ \sigma^2 \longleftrightarrow \chi_q \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_q , \tag{40} \]

where $\chi_q$ is a term containing both a phase factor as in (30) and a suitable normalization factor depending on $q$. The correct choice of the normalization in (40), and consequently in the definition of the $q$-$6j$ symbol (37), is discussed in the Appendix. Then, following the procedure of Sect. 4 with a slight change of notation, it turns out that the maps $\rho^{\pm(\mathrm{III})}$ representing the elementary shellings and inverse shellings of a facet of TYPE III correspond to the identity:

\[ \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q \begin{pmatrix} j_1 & j_{23} & j \\ m_1 & m_{23} & -m \end{pmatrix}_q = \sum_{m_2 m_3 m_{12}} (-1)^{m_2} q^{-m_2/6} (-1)^{m_3} q^{-m_3/6} (-1)^{m_{12}} q^{-m_{12}/6} \begin{pmatrix} j_3 & j_2 & j_{23} \\ m_3 & -m_2 & m_{23} \end{pmatrix}_q \begin{pmatrix} j_{12} & j_3 & j \\ m_{12} & -m_3 & m \end{pmatrix}_q \begin{pmatrix} j_1 & j_2 & j_{12} \\ m_1 & m_2 & -m_{12} \end{pmatrix}_q . \tag{41} \]

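The relation describing the maps $\rho^{\pm(\mathrm{II})}$ is found by multiplying (41) by

\[ \begin{pmatrix} j_2 & j_3 & j_{23} \\ m_2 & m_3 & -m_{23} \end{pmatrix}_q (-1)^{m_{23}+m_2+m_3} q^{(m_3-m_2)/3} , \]

summing both sides over $j_{23}$, $m_3$ and using the orthogonality conditions for the $q$-$3jm$'s given in (49). Then, up to the substitution $m_3 \to -m_3$, the final expression reads:

\[ \sum_{j_{23} m_{23}} w_{j_{23}}^2 (-1)^{m_{23}} q^{-m_{23}/6} \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q \begin{pmatrix} j_1 & j_{23} & j \\ m_1 & m_{23} & -m \end{pmatrix}_q \begin{pmatrix} j_2 & j_3 & j_{23} \\ m_2 & -m_3 & -m_{23} \end{pmatrix}_q = (-1)^{2m_3} q^{2m_3/6} \sum_{m_{12}} (-1)^{m_{12}} q^{-m_{12}/6} \begin{pmatrix} j_{12} & j_3 & j \\ m_{12} & -m_3 & -m \end{pmatrix}_q \begin{pmatrix} j_1 & j_2 & j_{12} \\ m_1 & m_2 & -m_{12} \end{pmatrix}_q , \tag{42} \]

where $w_{j_{23}}^2 \equiv (-1)^{2j_{23}} [2j_{23} + 1]_q$ as defined before. In order to find the identity representing the last type of shelling notice that (41) can be rewritten in terms of the deformation parameter $1/q$ and that $|\cdots|_q = |\cdots|_{1/q}$ (see e.g. [K-R]). Multiplying each side of the resulting expression by $w_{j_2}^2 w_{j_3}^2 w_{j_{12}}^2 \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q$, summing over $j_3$, $j_2$, $j_{12}$ and taking into account (38) and (39), we get:

\[ \begin{pmatrix} j_1 & j_{23} & j \\ m_1 & m_{23} & m \end{pmatrix}_{1/q} = w^{-2} \sum_{j_3 m_3} w_{j_3}^2 (-1)^{m_3} q^{m_3/6} \sum_{j_2 m_2} w_{j_2}^2 (-1)^{m_2} q^{m_2/6} \sum_{j_{12} m_{12}} w_{j_{12}}^2 (-1)^{m_{12}} q^{m_{12}/6} \begin{pmatrix} j_3 & j_2 & j_{23} \\ m_3 & -m_2 & m_{23} \end{pmatrix}_{1/q} \cdot \begin{pmatrix} j_{12} & j_3 & j \\ m_{12} & -m_3 & m \end{pmatrix}_{1/q} \begin{pmatrix} j_1 & j_2 & j_{12} \\ m_1 & m_2 & -m_{12} \end{pmatrix}_{1/q} \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q . \]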

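Interchanging the first two $j$-variables of each $q$-$3jm$ symbol, and up to the substitutions $m_2, m_3, m_{12} \to -m_2, -m_3, -m_{12}$, we obtain the correct identity representing the maps $\rho^{\pm(\mathrm{I})}$, namely:

\[ \begin{pmatrix} j_{23} & j_1 & j \\ m_{23} & m_1 & m \end{pmatrix}_q = w^{-2} \sum_{j_3 m_3} w_{j_3}^2 (-1)^{m_3} q^{-m_3/6} \sum_{j_2 m_2} w_{j_2}^2 (-1)^{m_2} q^{-m_2/6} \sum_{j_{12} m_{12}} w_{j_{12}}^2 (-1)^{m_{12}} q^{-m_{12}/6} \cdot \begin{pmatrix} j_3 & j_{12} & j \\ m_3 & -m_{12} & m \end{pmatrix}_q \begin{pmatrix} j_2 & j_3 & j_{23} \\ m_2 & -m_3 & m_{23} \end{pmatrix}_q \begin{pmatrix} j_2 & j_1 & j_{12} \\ -m_2 & m_1 & m_{12} \end{pmatrix}_q \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q . \tag{43} \]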

Collecting the results of this section we can now state the following:

Theorem 6. Let $(M, \partial M)$ be a 3-dimensional compact PL-pair and $(T(j), \partial T(j', m)) \to (M, \partial M)$ a triangulation associated with an admissible assignment of both $j$-variables ($j'$ of which in $\partial T$) and $m$-variables according to the rules given at the beginning of this section. Then the state sum

\[ Z[(M, \partial M)]_q = \sum_{\{(T(j),\, \partial T(j', m))\}} Z_q[(T(j), \partial T(j', m)) \to (M, \partial M)], \tag{44} \]

where

\[ Z_q[(T(j), \partial T(j', m)) \to (M, \partial M)] = w^{-2N_0} \prod_{A=1}^{N_1} w_A^2 \prod_{B=1}^{N_3} \begin{vmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{vmatrix}_q^{(B)} \cdot \prod_{D=1}^{n_2} \chi_q^{(D)} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_q^{(D)} \tag{45} \]

is a quantum invariant of the PL-pair $(M, \partial M)$.

Proof. The state sum (44) is manifestly invariant both under bistellar moves in the interior of $T$ and under elementary shellings represented by (41), (42), (43). Then, by Pachner's Theorem 4, $Z_q[(M, \partial M)]$ is, for each $q$ a root of unity, a quantum PL-invariant. □

6. Concluding Remarks

The topological elementary moves introduced by Pachner and discussed in Sect. 2 are characterized by other remarkable properties. For instance, as pointed out recently in [C-K-S], one can recover the fundamental $n$-simplex by glueing together, in $\mathbb{R}^n$, the two different configurations (representing any one of the bistellar moves in dimension $(n - 1)$) along their common fixed boundary. Obviously, there cannot be any straightforward relationship between bistellar moves in contiguous dimensions (recall that the number of such moves in dimension $n$ is $(n + 1)$). However, if we allow the elementary boundary operations to be involved, new possibilities arise. As we have already noticed, the different types of elementary shellings acting on a simplicial $n$-dimensional pair $(T, \partial T)$ amount exactly to $n$. Moreover, the central projection of each elementary shelling onto $\partial T$ gives a particular bistellar move in dimension $(n - 1)$ ($\partial T$ being a triangulation of a closed $(n - 1)$-dimensional manifold). It is also easy to check that the same kind of projection of the complete set of boundary operations reproduces the complete set of bistellar moves in the lower dimensional case. Thus the deep link between the two basic theorems proved by Pachner and quoted in our Remark 2 becomes evident. As far as Theorem 3 in particular is concerned, it should be feasible to build up state sum models for triangulated 3-dimensional pairs that implement the elementary shellings alone, leaving aside the requirement of being generalizations of the Ponzano–Regge and Turaev–Viro ones.
Turning now to some possible developments of our approach toward models in dimension different from three, we are currently addressing a 2-dimensional closed model and a 4-dimensional model with boundary which are reminiscent of the 3-dimensional one. On both sides the starting point is one of the fundamental identities which we have looked at in Remark 3 (namely the Biedenharn–Elliott identity involving five $6j$ symbols and identity (35) involving one $6j$ and four $3jm$ symbols). Since the structure of a 2-dimensional local term of the state sum given e.g. in (32) is naturally encoded in (35), we can, in a definite sense, project this last expression in order to get the correct form of the corresponding bistellar move in $n = 2$. The partition function arising in such a way includes suitable sums of products of double $3jm$ symbols, each one of them being associated with a triangular facet of the closed 2-manifold. On the other hand, the B-E identity represents in our view the projected counterpart of the identity associated with a particular elementary shelling in $n = 4$. We are confident that, notwithstanding the complexity of the algebraic relations involved, a state sum which generalizes the known results in dimension 4 (see e.g. [O92,b, C-K-S] and references therein) could be established.

Coming back again to the 3-dimensional partition function given in (32), we can investigate its semiclassical limit much in the spirit of the original approach of Ponzano and Regge. As is well known, in the case of a closed 3-manifold, the state sum given in (25) and (26) can be related to the semiclassical Euclidean partition function containing the Regge action $S_R(M)$ of the manifold $M$ according to $Z[M] \sim \cos(S_R(M) + \pi/4)$. In order to perform a similar analysis on our state sum we have to consider also the semiclassical limit of each $3jm$ symbol involved in (32). Such a limit can be found in [P-R] as well, and involves both the angular momenta $j$ and the momentum projections $m$, together with suitable angular variables. Without entering into technical details, the asymptotic structure of our state sum can be summarized in the expression $Z[(M, \partial M)] \sim \cos(S_R(M) + S(\partial M) + \mathrm{const})$, where $S(\partial M)$ is an action containing both the Euler characteristic of $\partial M$ and other terms depending on the orientation of the components of $\partial M$ with respect to the quantization axis.

Appendix

In what follows we collect first some relations involving $q$-$3jm$ symbols which can be found for instance in [K-R, N]. Recall that the relation between the quantum Clebsch–Gordan coefficient $(j_1 m_1 j_2 m_2 | j_3 m_3)_q$ and the $q$-$3jm$ symbol is given by:

\[ (j_1 m_1 j_2 m_2 | j_3 m_3)_q = (-1)^{j_1 - j_2 + m_3} \left([2j_3 + 1]_q\right)^{1/2} \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_q , \tag{46} \]

where, as usual, an $m$-variable runs in integer steps between $-j$ and $+j$, and the classical expression (20) is recovered when $q = 1$. The symmetry properties of the $q$-$3jm$ symbol read:

\[ \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_q = (-1)^{j_1+j_2+j_3} \begin{bmatrix} j_2 & j_1 & j_3 \\ m_2 & m_1 & -m_3 \end{bmatrix}_{1/q} , \]
\[ \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_q = (-1)^{j_1+j_2+j_3} q^{-m_1/2} \begin{bmatrix} j_1 & j_3 & j_2 \\ m_1 & m_3 & -m_2 \end{bmatrix}_{1/q} , \]
\[ \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_q = (-1)^{j_1+j_2+j_3} \begin{bmatrix} j_1 & j_2 & j_3 \\ -m_1 & -m_2 & m_3 \end{bmatrix}_q . \tag{47} \]

The above relations make clear the necessity of choosing different normalization factors in order to comply with the cyclic-permutation property which ensures the correspondence (triangle) ↔ ($q$-$3jm$). Thus we define the normalized $q$-$3jm$ symbols, for deformation parameters $q$ and $1/q$ respectively, according to (normalized symbols, written with round brackets as in Sect. 5, on the left; the symbols of (46) and (47), written with square brackets, on the right):

\[ \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_q \doteq q^{(m_1-m_2)/6} \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_q , \]
\[ \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_{1/q} \doteq q^{(m_2-m_1)/6} \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_{1/q} . \tag{48} \]


The form of the orthogonality relations involving the normalized symbols which is used in Sect. 5 reads:

\[ \sum_{j m} w_j^2 (-1)^{\mu} q^{(m_2-m_1)/3} \begin{pmatrix} j_1 & j_2 & j \\ m_1 & m_2 & -m \end{pmatrix}_q \begin{pmatrix} j_2 & j_1 & j \\ -m'_2 & -m'_1 & -m \end{pmatrix}_q = \delta_{m_1 m'_1}\, \delta_{m_2 m'_2} , \tag{49} \]

where $\mu = m_1 + m_2 + m_3$.

References

[A] Alexander, J.W.: The combinatorial theory of complexes. Ann. of Math. 31, 292–320 (1930)
[A-C-M] Ambjørn, J., Carfora, M., Marzuoli, A.: The Geometry of Dynamical Triangulations. Lect. Notes in Physics m50. Berlin: Springer, 1997
[C-F-S] Carter, J.S., Flath, D.E., Saito, M.: The Classical and Quantum 6j-symbols. Math. Notes 43. Princeton, NJ: Princeton University Press, 1995
[C-K-S] Carter, J.S., Kauffman, L.H., Saito, M.: Structure and diagrammatics of four dimensional topological lattice field theories. Preprint, math.GT/9806023 (1998)
[D-R] De Pietri, R., Rovelli, C.: Geometry eigenvalues and the scalar product from recoupling theory in loop quantum gravity. Phys. Rev. D54, 2664–2690 (1996)
[G] Glaser, L.C.: Geometric Combinatorial Topology. Vol. 1. New York: van Nostrand Reinhold, 1970
[K-M-S] Karowski, M., Müller, W., Schrader, R.: State sum invariants of compact 3-manifolds with boundary and 6j-symbols. J. Phys. A: Math. Gen. 25, 4847–4860 (1992)
[K-R] Kirillov, A.N., Reshetikhin, N.Y.: Representations of the algebra Uq(sl2), q-orthogonal polynomials and invariants of links. In: Kac, V.G. (ed.) Infinite dimensional Lie algebras and groups, Adv. Ser. in Math. Phys. 7. Singapore: World Scientific, 1988, pp. 285–339
[M] Moussouris, J.P.: Quantum models of space-time based on recoupling theory. Oxford: Ph.D. thesis, 1983
[M-T] Mizoguchi, S., Tada, T.: 3-dimensional gravity and the Turaev-Viro invariant. Progr. Theor. Phys. Suppl. 110, 207–227 (1992)
[N] Nomura, M.: Relations for Clebsch–Gordan and Racah coefficients in suq(2) and Yang–Baxter equation. J. Math. Phys. 30, 2397–2405 (1989)
[O92,a] Ooguri, H.: Partition functions and topology-changing amplitudes in the three-dimensional lattice gravity of Ponzano and Regge. Nucl. Phys. B 382, 276–304 (1992)
[O92,b] Ooguri, H.: Topological lattice models in four dimensions. Mod. Phys. Lett. A 7, 2799–2810 (1992)
[P87] Pachner, U.: Ein Henkeltheorem für geschlossene semilineare Mannigfaltigkeiten. Result. Math. 12, 386–394 (1987)
[P90] Pachner, U.: Shellings of simplicial balls and p.l. manifolds with boundary. Discr. Math. 81, 37–47 (1990)
[P91] Pachner, U.: P.L. homeomorphic manifolds are equivalent by elementary shellings. Europ. J. Combinatorics 12, 129–145 (1991)
[Pe] Penrose, R.: Angular momentum: an approach to combinatorial space-time. In: Bastin, T. (ed.) Quantum Theory and beyond. Cambridge: Cambridge University Press, 1971, pp. 151–180
[P-R] Ponzano, G., Regge, T.: Semiclassical limit of Racah coefficients. In: Bloch, F. et al (eds.) Spectroscopic and Group Theoretical Methods in Physics. Amsterdam: North-Holland, 1968, pp. 1–58
[R] Regge, T.: General Relativity without coordinates. Nuovo Cimento 19, 558–571 (1961)
[R-S] Rourke, C., Sanderson, B.: Introduction to Piecewise Linear Topology. New York: Springer-Verlag, 1982
[Ro-S] Rovelli, C., Smolin, L.: Spin networks and quantum gravity. Phys. Rev. D52, 5743–5759 (1995)
[T] Thurston, W.P.: Three-dimensional Geometry and Topology. Vol. 1, Levy, S. (ed.). Princeton, NJ: Princeton University Press, 1997
[T-V] Turaev, V., Viro, O.Ya.: State sum invariants of 3-manifolds and quantum 6j-symbols. Topology 31, 865–902 (1992)
[V-M-K] Varshalovich, D.A., Moskalev, A.N., Khersonskii, V.K.: Quantum Theory of Angular Momentum. Singapore: World Scientific, 1988
[Y-L-V] Yutsis, A.P., Levinson, I.B., Vanagas, V.V.: The Mathematical Apparatus of the Theory of Angular Momentum. Jerusalem: Israel Program for Sci. Transl. Ltd. 1962

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 212, 591 – 611 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

A Path Integral Approach to the Kontsevich Quantization Formula

Alberto S. Cattaneo^1, Giovanni Felder^2

1 Institut für Mathematik, Universität Zürich, 8057 Zürich, Switzerland. E-mail: [email protected]
2 Departement Mathematik, ETH-Zentrum, 8092 Zürich, Switzerland. E-mail: [email protected]

Received: 10 March 1999 / Accepted: 30 January 2000

Abstract: We give a quantum field theory interpretation of Kontsevich's deformation quantization formula for Poisson manifolds. We show that it is given by the perturbative expansion of the path integral of a simple topological bosonic open string theory. Its Batalin–Vilkovisky quantization yields a superconformal field theory. The associativity of the star product, and more generally the formality conjecture can then be understood by field theory methods. As an application, we compute the center of the deformed algebra in terms of the center of the Poisson algebra.

1. Introduction

In a recent paper [K], M. Kontsevich gave a general formula for the deformation quantization [BFFLS] of the algebra of functions on a Poisson manifold. The deformed product (the "star product") is given in terms of an expansion reminiscent of the Feynman perturbation expansion of a two dimensional field theory on a disc with boundary. We review Kontsevich's formula in Sect. 2. The purpose of this paper is to describe this quantum field theory explicitly. It turns out that it is a simple bosonic topological quantum field theory on a disc $D$ with a field $X : D \to M$ taking values in the Poisson manifold $M$ and a one-form $\eta$ on $D$ taking values in the pull-back $X^*(T^*M)$ of the cotangent bundle. The formula for the star product is

\[ f \star g\,(x) = \int_{X(\infty) = x} f(X(1))\, g(X(0))\, e^{\frac{i}{\hbar} S[X, \eta]}\, dX\, d\eta , \]

where 0, 1, ∞ are three distinct points on the boundary of D. The integral is normalized in such a way that in the case of the trivial Poisson structure the star product reduces to the ordinary product. The action S is described in Sect. 3 and was originally studied for manifolds without boundary in [I] and [SchStr]. In particular the canonical quantization on the cylinder was considered.


In the symplectic case the above formula essentially reduces to the original Feynman path integral formula for quantum mechanics, as pointed out to us by H. Ooguri. The quantization of the theory is somewhat subtle, due to the presence of a gauge symmetry which only closes on shell, as already noticed in [I]. In other words, the action S is a function of the fields annihilated by a distribution of vector fields which is only integrable on the set of critical points of S. As a consequence, the BRST quantization fails and one has to resort to the Batalin–Vilkovisky method (see for example [BV,W1, S1,AKSZ]). This method yields a gauge fixed action, which turns out to have a superconformal invariance. Its perturbative expansion around constant classical solutions reproduces Kontsevich’s formula. As an application, we show in Sect. 4 by quantum field theory methods that there exists a star product equivalent to Kontsevich’s whose center consists of the power series in h¯ whose coefficients are in the center of the Poisson algebra. A rigorous proof of this statement will appear elsewhere [CFT]. More generally, we may consider a path integral associated to an arbitrary polyvector field, a formal sum of skew-symmetric contravariant tensor fields of arbitrary rank, the star product being the special case of bivector fields. Correlation functions of boundary fields yield then a map U from polyvector fields to polydifferential operators. Formal properties of this map can be deduced from BV and factorization methods of quantum field theory. This leads to identities, also found by Kontsevich, which may be thought of as the open string analog of the WDVV equations [W2, DDV]. They may be formulated by saying that U is an L∞ morphism [SchlSt, LS]. They imply the associativity of the star product and, in the general setting of arbitrary polyvector fields, the formality conjecture [K]. These constructions are explained in Sect. 5. 
Although the non-rigorous quantum field theory arguments of this paper are of course no substitute for the proofs in [K], this approach offers an explanation for why Kontsevich’s construction works, and puts it in the context of Feynman’s original picture of quantization [F]. Moreover, our approach indicates the way for more general constructions. In particular, one can consider the perturbative expansion around a non-trivial classical solution, one can insert a Hamiltonian and one can consider this quantum field theory on a complex curve of higher genus. We plan to study these variants in the future.

2. The Kontsevich Formula

In [K], M. Kontsevich wrote a beautiful explicit solution to the problem of deformation quantization of the algebra of functions on a Poisson manifold M. The problem is to find a deformation of the product on the algebra of smooth functions on a Poisson manifold, which to first order in Planck's constant is given by the Poisson bracket. If M is an open set in $\mathbb{R}^d$ with a Poisson structure

$$\{f, g\}(x) = \sum_{i,j=1}^{d} \alpha^{ij}(x)\,\partial_i f(x)\,\partial_j g(x)$$

given by a skew-symmetric bivector field $\alpha$, obeying the Jacobi identity

$$\alpha^{il}\partial_l\alpha^{jk} + \alpha^{jl}\partial_l\alpha^{ki} + \alpha^{kl}\partial_l\alpha^{ij} = 0, \qquad (1)$$

the problem is to find an associative product $\star$ on $C^\infty(M)[[\hbar]]$, such that for $f, g \in C^\infty(M)$,

$$f \star g\,(x) = f(x)g(x) + \frac{i\hbar}{2}\{f, g\}(x) + O(\hbar^2).$$
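As a concrete instance of the Jacobi identity (1), the following sketch checks it for one linear bivector field on $\mathbb{R}^3$. The choice $\alpha^{ij}(x) = \epsilon_{ijk}x^k$ and all function names are assumptions made for illustration; they are not taken from the text.

```python
from itertools import product

def levi_civita(i, j, k):
    """Totally antisymmetric symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def alpha(x):
    """Illustrative linear bivector: alpha^{ij}(x) = eps_{ijk} x^k."""
    return [[sum(levi_civita(i, j, k) * x[k] for k in range(3))
             for j in range(3)] for i in range(3)]

def d_alpha(l, j, k):
    """Partial derivative d_l alpha^{jk}; constant because alpha is linear."""
    return levi_civita(j, k, l)

def jacobi_defect(x):
    """Largest absolute value of the left-hand side of (1) over all i, j, k."""
    a = alpha(x)
    worst = 0
    for i, j, k in product(range(3), repeat=3):
        total = sum(a[i][l] * d_alpha(l, j, k)
                    + a[j][l] * d_alpha(l, k, i)
                    + a[k][l] * d_alpha(l, i, j) for l in range(3))
        worst = max(worst, abs(total))
    return worst

print(jacobi_defect([1, -2, 5]))
```

Integer arithmetic is used throughout, so a vanishing defect here is exact rather than approximate.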

Kontsevich's solution¹ to this problem may be described as follows. The coefficient of $(i\hbar/2)^n$ in $f \star g$ is given by a sum of terms labeled by diagrams of order n. A diagram of order n is a graph consisting of n vertices numbered from 1 to n and two vertices labeled by letters L and R, for Left and Right. From each of the numbered vertices there emerge two ordered oriented edges that end at numbered vertices or at vertices labeled by letters, so that no edge starts and ends at the same vertex. The two edges emerging from vertex i are called $e_i^1, e_i^2$. They are of the form $e_i^a = (i, v_a(i))$ for some maps $v_a : \{1, \dots, n\} \to \{1, \dots, n, L, R\}$. In fact, a diagram can be thought of as an ordered pair $(v_1, v_2)$ of maps $\{1, \dots, n\} \to \{1, \dots, n, L, R\}$, such that $v_a(i)$ is never equal to i. To each diagram $\Gamma$ of order n there corresponds a bidifferential operator $D_\Gamma$ whose coefficients are differential polynomials, homogeneous of degree n in the components $\alpha^{ij}$ of the Poisson structure. The edges indicate how the partial derivatives are acting. For instance the bidifferential operator² $(f, g) \mapsto \alpha^{ij}(x)\partial_i f(x)\partial_j g(x)$ corresponds to the diagram with vertices 1, L, R and edges $e_1^1 = (1, L)$, $e_1^2 = (1, R)$. The bidifferential operator $D_\Gamma(f \otimes g) = \alpha^{ij}\,\partial_i\alpha^{kl}\,\partial_j\partial_l f\,\partial_k g$ corresponds to the diagram with vertices 1, 2, L, R and edges $e_1^1 = (1, 2)$, $e_1^2 = (1, L)$, $e_2^1 = (2, R)$, $e_2^2 = (2, L)$. Kontsevich's formula is then

$$f \star g = fg + \sum_{n=1}^{\infty} \left(\frac{i\hbar}{2}\right)^{n} \sum_{\Gamma \text{ of order } n} w_\Gamma\, D_\Gamma(f \otimes g).$$
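The combinatorial definition of a diagram can be made concrete with a short enumeration. The sketch below lists diagrams of order n as pairs of maps $(v_1, v_2)$; the function name and the `allow_double_edges` switch are illustrative assumptions. The switch exists because a diagram in which both edges of a vertex end at the same target contributes nothing: its weight integrand contains the same one-form twice.

```python
from itertools import product

def diagrams(n, allow_double_edges=True):
    """Yield diagrams of order n as pairs (v1, v2) of dicts i -> target."""
    vertices = list(range(1, n + 1))
    targets = vertices + ["L", "R"]
    per_vertex = []
    for i in vertices:
        # An edge may end anywhere except at the vertex it starts from.
        allowed = [t for t in targets if t != i]
        pairs = [(a, b) for a in allowed for b in allowed
                 if allow_double_edges or a != b]
        per_vertex.append(pairs)
    for choice in product(*per_vertex):
        v1 = {i: choice[i - 1][0] for i in vertices}
        v2 = {i: choice[i - 1][1] for i in vertices}
        yield v1, v2

for n in (1, 2):
    print(n, len(list(diagrams(n))), len(list(diagrams(n, False))))
```

With double edges excluded, order one leaves exactly two diagrams, which differ only in the ordering of the two edges.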

The weight $w_\Gamma$ is the integral of a differential form over the configuration space $C_n(H) = \{u \in H^n,\ u_i \neq u_j\ (i \neq j)\}$ of n ordered points on the upper half plane H. It is defined as follows: for any two distinct points z, w in the upper half plane with the Poincaré metric $ds^2 = (dx^2 + dy^2)/y^2$, let $\phi(z, w)$ be the angle between the (vertical) geodesic connecting z to $i\infty$ and the geodesic connecting z to w, measured in counterclockwise direction. Let $d\phi(z, w) = dz\,\frac{\partial}{\partial z}\phi(z, w) + dw\,\frac{\partial}{\partial w}\phi(z, w)$ denote the differential of this angle. Then the weight is

$$w_\Gamma = \frac{1}{(2\pi)^{2n}\, n!} \int_{C_n(H)} \bigwedge_{i=1}^{n} d\phi(u_i, u_{v_1(i)}) \wedge d\phi(u_i, u_{v_2(i)}),$$

where we set $u_L = 0$ and $u_R = 1$. The orientation is induced from the product of the standard orientation of the upper half plane.

For example, we have two graphs of order one, differing in the ordering of edges. Let us compute the weight of these diagrams. Let $\Gamma$ be the diagram with $e_1^1 = (1, L)$, $e_1^2 = (1, R)$. To compute the integral over $u = u^1 + iu^2 \in H$ we introduce new variables $\phi_0 = \phi(u, 0)$, $\phi_1 = \phi(u, 1)$. As arg(u) varies between 0 and $\pi$, the angle $\phi_0$ varies from 0 to $2\pi$. As we vary u on the half-line of constant $\phi_0$, the angle $\phi_1$ varies between $\phi_0$ (at infinity) and $2\pi$ (at u = 0). Thus this change of variables is a diffeomorphism from the upper half plane to the domain $0 < \phi_0 < \phi_1 < 2\pi$ in $\mathbb{R}^2$. The above description also shows that this diffeomorphism is orientation preserving. Thus

$$w_\Gamma = (2\pi)^{-2} \int_{0 < \phi_0 < \phi_1 < 2\pi} d\phi_0 \wedge d\phi_1 = \frac{1}{2}$$

($\phi_j = \phi(u, j)$ and $d\phi_0 \wedge d\phi_1$ is positively oriented). The other diagram has $e_1^1 = (1, R)$, $e_1^2 = (1, L)$ and gives the same contribution with the opposite sign. Therefore the coefficient of $(i\hbar/2)$ is

$$\frac{1}{2}\alpha^{ij}\partial_i f\,\partial_j g - \frac{1}{2}\alpha^{ij}\partial_j f\,\partial_i g = \alpha^{ij}\partial_i f\,\partial_j g,$$

by the skew-symmetry of $\alpha$.

Let us conclude this section with some remarks about involutions. The opposite product $f \star_{\mathrm{op}} g$ is related to the product $\star$ by a change of sign of $\hbar$. Indeed, $D_\Gamma(g \otimes f) = D_{\bar\Gamma}(f \otimes g)$, where $\bar\Gamma$ is obtained from $\Gamma$ by exchanging R and L, and $w_{\bar\Gamma} = (-1)^n w_\Gamma$ if $\Gamma$ is of order n, since $w_{\bar\Gamma}$ is the integral of the pull-back of the differential form defining $w_\Gamma$ by the reflection about the axis $\mathrm{Re}(z) = \frac{1}{2}$, which reverses the orientation of H. Since the weights $w_\Gamma$ are real, this implies that complex conjugation, extended to $C^\infty(M)[[\hbar]]$ by setting $\bar\hbar = \hbar$, is an antilinear antiautomorphism for the star product.

¹ In [K], $\hbar$ is what is here $i\hbar/2$. We adopt the notation of the physics literature and work accordingly over the complex numbers. With Kontsevich's conventions one may formulate the problem over the real numbers, which in terms of the physics conventions would mean to have an imaginary Planck constant.

² We use throughout the paper the Einstein summation convention, meaning that sums over repeated indices are understood.

3. A Sigma Model

3.1. The classical action and its symmetries. We start by introducing a sigma model action. The perturbative expansion of correlation functions of boundary fields on the disc will then be related to the star product. The model has two real bosonic fields X, $\eta$. X is a map from the disc $D = \{u \in \mathbb{R}^2,\ |u| \le 1\}$ to M and $\eta$ is a differential 1-form on D taking values in the pull-back by X of the cotangent bundle of M, i.e. a section of $X^*(T^*M) \otimes T^*D$. In local coordinates, X is given by d functions $X^i(u)$ and $\eta$ by d differential 1-forms $\eta_i(u) = \eta_{i,\mu}(u)\,du^\mu$. The action reads

$$S[X, \eta] = \int_D \eta_i(u) \wedge dX^i(u) + \frac{1}{2}\alpha^{ij}(X(u))\,\eta_i(u) \wedge \eta_j(u).$$

The boundary condition for $\eta$ is that for $u \in \partial D$, $\eta_i(u)$ vanishes on vectors tangent to $\partial D$.
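For orientation, the first variation of S can be written out explicitly. The following is a sketch under the assumptions that the variations $\delta X^i$, $\delta\eta_i$ are smooth and that Stokes' theorem is applied on D; it is not a computation quoted from the text.

```latex
\begin{align*}
\delta S
 &= \int_D \delta\eta_i \wedge \bigl(dX^i + \alpha^{ij}(X)\,\eta_j\bigr)
  + \int_D \delta X^i \,\bigl(d\eta_i + \tfrac12\,\partial_i\alpha^{jk}(X)\,\eta_j\wedge\eta_k\bigr)
  - \int_{\partial D} \eta_i\,\delta X^i .
\end{align*}
```

The boundary integral vanishes because $\eta_i$ vanishes on vectors tangent to $\partial D$, so the critical points are characterized by $dX^i + \alpha^{ij}(X)\eta_j = 0$ and $d\eta_i + \frac12\partial_i\alpha^{jk}(X)\eta_j\wedge\eta_k = 0$.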
We then claim that the star product is given by the semiclassical expansion of the path integral³

$$f \star g\,(x) = \int_{X(\infty)=x} f(X(1))\,g(X(0))\,e^{\frac{i}{\hbar}S[X,\eta]}\,dX\,d\eta.$$

Here 0, 1, $\infty$ are any three cyclically ordered points on the unit circle (which we secretly view as the completed real line by stereographic projection). Cyclically ordered means that if we start from 0 and move on the circle counterclockwise we first meet 1 and then $\infty$. The path integral is over all $X : D \to M$, $\eta \in \Gamma(D, X^*(T^*M) \otimes T^*D)$ subject to the boundary conditions $X(\infty) = x$, $\eta(u)(\xi) = 0$ if $u \in \partial D$ and $\xi$ is tangent to $\partial D$. Its semiclassical expansion is to be understood as an expansion around the classical solution $X(u) = x$, $\eta(u) = 0$. To evaluate this path integral we have as usual to take gauge fixing and renormalization into account.

³ In the symplectic case, where $\alpha$ comes from a symplectic form $\omega$, one can integrate formally over $\eta$ and this formula may, in the spirit of Feynman [F], be written as

$$f \star g\,(x) = \int_{\gamma(\pm\infty)=x} f(\gamma(1))\,g(\gamma(0))\,e^{\frac{i}{\hbar}\int_\gamma d^{-1}\omega}\,d\gamma.$$

The integral over trajectories $\gamma : \mathbb{R} \to M$ is to be understood as an expansion around the classical solution $\gamma(t) = x$, which is a constant function of time since the Hamiltonian vanishes.

The action S is invariant under the following infinitesimal gauge transformations with infinitesimal parameter $\beta_i$, which is a section of $X^*(T^*M)$ and vanishes on the boundary of D:

$$\delta_\beta X^i = \alpha^{ij}(X)\beta_j, \qquad \delta_\beta \eta_i = -d\beta_i - \partial_i\alpha^{jk}(X)\,\eta_j\beta_k.$$

This symmetry is an extension of more familiar gauge symmetries encountered in special cases. On one extreme we have $\alpha = 0$ and the action is invariant under translations of $\eta$ by exact one-forms on D. On the other extreme we have the symplectic case where $\alpha^{ij}$ is an invertible matrix so that integrating formally over $\eta$ we get the action $\int_D X^*\omega$ which is invariant under arbitrary translations $X^i \to X^i + \xi^i$, with $\xi^i(u) = 0$ on the boundary of D. Another special case is the case when M is a vector space and $\alpha$ is a linear function on M. In this case M is the dual space to a Lie algebra $\mathfrak{g}$ with Kirillov–Kostant Poisson structure. The Lie bracket of two linear functions $f, g \in \mathfrak{g} = M^*$ is just the Poisson bracket and is again a linear function on M. Then the classical action is best viewed as a function of a field X taking values in $\mathfrak{g}^*$ and a connection $d + \eta$ on a trivial principal bundle on D. After an integration by parts, the action becomes the "BF action" [S2, BT] $S = \int_D \langle X, F(\eta)\rangle$, where $F(\eta)$ is the curvature of $d + \eta$. In this case the gauge transformation is the usual gauge transformation (with gauge parameter $-\beta$) of a connection and a field X in the coadjoint representation.
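The simplest of the special cases above, $\alpha = 0$, can be verified in one line. This is a sketch assuming only the stated transformation rules, which reduce to $\delta_\beta X^i = 0$ and $\delta_\beta\eta_i = -d\beta_i$ when $\alpha$ vanishes, together with Stokes' theorem.

```latex
\begin{align*}
\delta_\beta S
 = \int_D \delta_\beta\eta_i \wedge dX^i
 = -\int_D d\beta_i \wedge dX^i
 = -\int_D d\bigl(\beta_i\, dX^i\bigr)
 = -\int_{\partial D} \beta_i\, dX^i
 = 0 .
\end{align*}
```

The last step uses the requirement that the gauge parameter $\beta_i$ vanishes on the boundary of D.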
In the general case, the commutator of two gauge transformations is a gauge transformation only on shell, i.e., modulo the equations of motion:

$$[\delta_\beta, \delta_{\beta'}]X^i = \delta_{\{\beta,\beta'\}}X^i, \qquad [\delta_\beta, \delta_{\beta'}]\eta_i = \delta_{\{\beta,\beta'\}}\eta_i - \partial_i\partial_k\alpha^{rs}\beta_r\beta'_s\,(dX^k + \alpha^{kj}(X)\eta_j).$$

Here $\{\beta, \beta'\}_i = -\partial_i\alpha^{jk}(X)\beta_j\beta'_k$ and $dX^k + \alpha^{kj}\eta_j = 0$ is an Euler–Lagrange equation for the action S. In this calculation the Jacobi identity (1) plays an essential role. Thus the gauge transformations form a Lie algebra only when acting on critical points (classical solutions) of S.

In the BRST formalism one then promotes the infinitesimal gauge parameter $\beta_i$ to an anticommuting ghost field (vanishing on the boundary of the disc) and introduces the BRST operator $\delta_0$, an odd derivation on the functions of X, $\eta$, $\beta$ such that

$$\delta_0 X^i = \alpha^{ij}(X)\beta_j, \qquad \delta_0\eta_i = -d\beta_i - \partial_i\alpha^{kl}(X)\,\eta_k\beta_l, \qquad \delta_0\beta_i = \frac{1}{2}\partial_i\alpha^{jk}(X)\,\beta_j\beta_k.$$

Then $\delta_0$ is a differential on shell, i.e., it squares to zero modulo the equations of motion. More precisely we have $\delta_0^2 X^i = \delta_0^2\beta_i = 0$ and $\delta_0^2\eta_i = -\frac{1}{2}\partial_i\partial_k\alpha^{rs}\beta_r\beta_s\,(dX^k + \alpha^{kj}(X)\eta_j)$. We assign a gradation, the ghost number, to our fields: $\mathrm{gh}(X^i) = \mathrm{gh}(\eta_i) = 0$, $\mathrm{gh}(\beta_i) = 1$. The BRST operator has then ghost number one. Additionally we have the gradation of the fields as differential forms on the disc, which will be denoted by deg: $\deg(X^i) = \deg(\beta_i) = 0$, $\deg(\eta_i) = 1$. In the case $M = \mathfrak{g}^*$ of linear Poisson structures, the second derivatives of $\alpha$ vanish, and the BRST operator squares to zero.

3.2. The Batalin–Vilkovisky action. If the BRST operator squares to zero only modulo the equations of motion, the usual BRST procedure to evaluate the path integral does not quite work, since it essentially requires a well-defined cohomology to construct physical observables. The generalization of the BRST procedure that works in this case is the Batalin–Vilkovisky method. The recipe is as follows. One first adds antifields $X^+$, $\eta^+$, $\beta^+$ with complementary ghost number and degree as differential forms on D. The assignments of degree (from left to right) and ghost number (from top to bottom) are given by

$$\begin{array}{r|ccc}
 & 0 & 1 & 2 \\ \hline
-2 & & & \beta^{+i} \\
-1 & & \eta^{+i} & X_i^+ \\
0 & X^i & \eta_i & \\
1 & \beta_i & &
\end{array}$$

One then looks for a Batalin–Vilkovisky action $S_{BV}[\phi, \phi^+]$ of ghost number zero depending on fields $\phi^1, \phi^2, \dots$ (here $X^i$, $\eta_i$, $\beta_i$) and antifields $\phi_1^+, \phi_2^+, \dots$, with $\mathrm{gh}(\phi_\alpha^+) = -1 - \mathrm{gh}(\phi^\alpha)$ and $\deg(\phi_\alpha^+) = 2 - \deg(\phi^\alpha)$ subject to two requirements. The first requirement is that $S_{BV}[\phi, 0]$ reduces to the classical action $S[\phi]$ when the antifields are set to zero and the second requirement is that $S_{BV}$ obeys the quantum master equation $(S_{BV}, S_{BV}) - 2i\hbar\,\Delta S_{BV} = 0$. The BV Laplacian $\Delta$ and the BV antibracket $(\ ,\ )$ are defined as follows. Let us introduce temporarily a Riemannian metric on D, and denote by $\langle\ ,\ \rangle_u$ the induced scalar product on the exterior algebra of the cotangent space at u. The volume form $\sqrt{g}\,du^1 du^2$ will be denoted by $dv(u)$. The Hodge star $*$ then obeys $\langle\alpha, \beta\rangle_u\,dv(u) = \alpha \wedge *\beta$. The expression for the Laplacian is better expressed in terms of the Hodge dual antifields $\phi_\alpha^* = *\phi_\alpha^+$.
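The grading rule for antifields is mechanical, so the table above can be regenerated from it. The sketch below does this; the dictionary layout and the convention of marking antifields with a trailing "+" are assumptions made for illustration.

```python
# (ghost number, form degree) of the fields, as assigned in the text.
FIELDS = {"X": (0, 0), "eta": (0, 1), "beta": (1, 0)}

def antifield_grading(gh, deg):
    """Grading of the antifield: gh -> -1 - gh, deg -> 2 - deg."""
    return (-1 - gh, 2 - deg)

# Hypothetical naming: a trailing "+" marks the antifield of each field.
ANTIFIELDS = {name + "+": antifield_grading(gh, deg)
              for name, (gh, deg) in FIELDS.items()}

def total_degree(grading):
    """Total degree = ghost number + form degree."""
    gh, deg = grading
    return gh + deg

for name, grading in {**FIELDS, **ANTIFIELDS}.items():
    print(f"{name:6s} gh={grading[0]:2d} deg={grading[1]} "
          f"total={total_degree(grading)}")
```

Each field and its antifield have total degrees adding up to one, so the six objects split into three of total degree zero and three of total degree one.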
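The Laplacian of a function A of fields and antifields is

$$\Delta A = \sum_\alpha (-1)^{\mathrm{gh}(\phi^\alpha)} \frac{\delta^2 A}{\delta\phi^\alpha(u)\,\delta\phi_\alpha^*(u)}.$$

The functional derivatives of a function of fields and antifields, collectively denoted by $\psi^\alpha$, are the distributions (de Rham currents) defined by

$$\frac{d}{dt}\bigg|_{t=0} A(\psi + t\rho) = \int_D \Big\langle \rho^\alpha(u), \frac{\delta A}{\delta\psi^\alpha(u)} \Big\rangle_u dv(u) = \int_D \Big\langle \frac{A\overleftarrow{\delta}}{\delta\psi^\alpha(u)}, \rho^\alpha \Big\rangle_u dv(u),$$

for any test forms $\rho^\alpha$ of the same degree and ghost number as $\psi^\alpha$.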

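Note that the Laplacian is the restriction of a distribution on $D^2$ to the diagonal, and is thus a singular object in this infinite dimensional context. It should be understood as the limit of a suitably regularized expression. The Laplacian obeys

$$\Delta(AB) = (\Delta A)B + (-1)^{\mathrm{gh}(A)}(A, B) + (-1)^{\mathrm{gh}(A)}A\,(\Delta B), \qquad (2)$$

where the Batalin–Vilkovisky antibracket is

$$(A, B) = \int_D \sum_\alpha \left( \Big\langle \frac{A\overleftarrow{\delta}}{\delta\phi^\alpha(u)}, \frac{\delta B}{\delta\phi_\alpha^*(u)} \Big\rangle - \Big\langle \frac{A\overleftarrow{\delta}}{\delta\phi_\alpha^*(u)}, \frac{\delta B}{\delta\phi^\alpha(u)} \Big\rangle \right) dv(u).$$

This antibracket is better defined than the Laplacian, in the sense that if A and B are local functionals of the fields and antifields, such as the action S, then the functional derivatives are regular distributions and (A, B) is again a local functional. Moreover, it is independent of the choice of Riemannian metric: it can be expressed without reference to the metric at the cost of introducing signs:

$$(A, B) = \int_D \sum_\alpha \left( \frac{A\overleftarrow{\partial}}{\partial\phi^\alpha} \wedge \frac{\partial B}{\partial\phi_\alpha^+} - (-1)^{\deg\phi^\alpha}\, \frac{\partial B}{\partial\phi^\alpha} \wedge \frac{A\overleftarrow{\partial}}{\partial\phi_\alpha^+} \right).$$

Here the derivatives of a function A of fields and antifields $\psi^\alpha$ are the distributions defined by

$$\frac{d}{dt}\bigg|_{t=0} A(\psi + t\rho) = \int_D \rho^\alpha \wedge \frac{\partial A}{\partial\psi^\alpha} = \int_D \frac{A\overleftarrow{\partial}}{\partial\psi^\alpha} \wedge \rho^\alpha,$$

for any test forms $\rho^\alpha$ of the same degree and ghost number as $\psi^\alpha$. The antibracket obeys the graded commutativity relation $(A, B) = -(-1)^{(\mathrm{gh}(A)-1)(\mathrm{gh}(B)-1)}(B, A)$, and the Leibnitz rule

$$(A, BC) = (A, B)C + (-1)^{(\mathrm{gh}(A)-1)\,\mathrm{gh}(B)}B(A, C). \qquad (3)$$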

In the general case of field theories with non-trivial renormalization the BV action depends on $\hbar$ through counterterms and the full quantum master equation is solved by a recursive procedure order by order in $\hbar$. Here, as we shall see, the renormalization is rather trivial and the Batalin–Vilkovisky action satisfies separately the equation $\Delta S_{BV} = 0$ and the classical master equation $(S_{BV}, S_{BV}) = 0$. The classical master equation implies that the BV version of the BRST operator $\delta$ defined by $\delta A = (S_{BV}, A)$ is a differential. It obeys the Leibnitz rule $\delta(AB) = \delta A\,B + (-1)^{\mathrm{gh}(A)}A\,\delta B$ and it acts on fields and antifields by the rule

$$\delta\phi^\alpha = (-1)^{\mathrm{gh}(\phi^\alpha)}\,\frac{\partial S_{BV}}{\partial\phi_\alpha^+}, \qquad \delta\phi_\alpha^+ = (-1)^{\mathrm{gh}(\phi^\alpha)+\deg(\phi^\alpha)}\,\frac{\partial S_{BV}}{\partial\phi^\alpha}.$$

One semi-systematic way to find the BV action, which is under suitable hypotheses unique up to the BV version of canonical transformations, is to start with the obvious action $S^0_{BV} = S + \int_D X_i^+\,\delta_0 X^i + \eta^{+i} \wedge \delta_0\eta_i - \beta^{+i}\,\delta_0\beta_i$, which has BRST operator $\delta = \delta_0$ and then add suitable terms, so that the new BRST operator obeys $\delta^2 = 0$. Since $\delta_0\eta^{+i} = \partial S^0_{BV}/\partial\eta_i$ contains a term proportional to the equations of motion (plus terms involving antifields) which we need to cancel from $\delta_0^2\eta_i$, it is natural to add a term quadratic in $\eta^+$ to achieve our goal. It turns out that

$$\begin{aligned}
S_{BV} &= S^0_{BV} - \frac{1}{4}\int_D \eta^{+i} \wedge \eta^{+j}\,\partial_i\partial_j\alpha^{kl}(X)\,\beta_k\beta_l \\
&= \int_D \eta_i \wedge dX^i + \frac{1}{2}\alpha^{ij}(X)\,\eta_i \wedge \eta_j + X_i^+\alpha^{ij}(X)\beta_j - \eta^{+i} \wedge \bigl(d\beta_i + \partial_i\alpha^{kl}(X)\,\eta_k\beta_l\bigr) \\
&\qquad - \frac{1}{2}\beta^{+i}\,\partial_i\alpha^{jk}(X)\,\beta_j\beta_k - \frac{1}{4}\eta^{+i} \wedge \eta^{+j}\,\partial_i\partial_j\alpha^{kl}(X)\,\beta_k\beta_l
\end{aligned}$$

does the job. Moreover $S_{BV}$ is BRST closed (i.e., it obeys $\delta S_{BV} = 0$), which is equivalent to the classical master equation. This is more conveniently shown in the superfield formalism of the next subsection. We claim that, if the regularization is appropriate, $\Delta S_{BV} = 0$. Indeed the only terms contributing to the Laplacian of the BV action contain both a field and its antifield:

$$\begin{aligned}
\Delta S_{BV} &= \Delta\int_D X_i^+\alpha^{ij}(X)\beta_j - \eta^{+i} \wedge \partial_i\alpha^{kl}(X)\,\eta_k\beta_l - \frac{1}{2}\beta^{+i}\,\partial_i\alpha^{jk}(X)\,\beta_j\beta_k \\
&= (1 - 2 + 1)\,C\int_D \partial_i\alpha^{ij}(X)\,\beta_j\,dv = 0.
\end{aligned}$$

Here C is an infinite constant. The factor takes into account the contribution of the first term (1), of the second term ($-2$ since the one-form $\eta_i$ has two components) and the third term (1). In an appropriate regularization scheme, this cancellation is supposed to be valid before removing the regularization, in spite of the fact that C tends to infinity.

Let us conclude this subsection by discussing the boundary conditions of the various fields. The rule is that Hodge dual antifields must have the same boundary conditions as the fields. The boundary conditions for the fields are that, for $u \in \partial D$, $\beta_i(u) = 0$ and $\eta_i(u)$ vanishes on vectors tangent to the boundary. Thus $\beta^{+i}(u) = 0$ and $\eta^{+i}(u)$ vanishes on vectors normal to the boundary.

3.3. Superfield formalism. It turns out that the calculations simplify if we combine our fields and antifields into superfields. These are functions of the even coordinates $u^1, u^2$ on D and odd (anticommuting) coordinates $\theta^1, \theta^2$. Thus a superfield $\phi$ has the form $\phi(u, \theta) = \phi^{(0)}(u) + \theta^\mu\phi^{(1)}_\mu(u) + \frac{1}{2}\theta^\mu\theta^\nu\phi^{(2)}_{\mu\nu}$. Its component fields are a scalar function $\phi^{(0)}$, a one-form $\phi^{(1)} = \phi^{(1)}_\mu du^\mu$ and a two-form $\phi^{(2)} = \frac{1}{2}\phi^{(2)}_{\mu\nu}\,du^\mu \wedge du^\nu$. The fields of total degree (degree + ghost number) zero combine into even superfields $\tilde X^i$, the "supercoordinates",

$$\tilde X^i = X^i + \theta^\mu\eta^{+i}_\mu - \frac{1}{2}\theta^\mu\theta^\nu\beta^{+i}_{\mu\nu},$$

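and the fields of total degree one combine into odd superfields $\tilde\eta_i$, the "super-one-forms":

$$\tilde\eta_i = \beta_i + \theta^\mu\eta_{i,\mu} + \frac{1}{2}\theta^\mu\theta^\nu X^+_{i,\mu\nu}.$$

Let $D = \theta^\mu\,\partial/\partial u^\mu$. This operator acts on component fields as the de Rham differential. Then the BRST operator $\delta$ acts as an odd derivation on functions of the superfields $\tilde X^i$, $\tilde\eta_i$ by the rule

$$\delta\tilde X^i = D\tilde X^i + \alpha^{ij}(\tilde X)\,\tilde\eta_j, \qquad \delta\tilde\eta_i = D\tilde\eta_i + \frac{1}{2}\partial_i\alpha^{jk}(\tilde X)\,\tilde\eta_j\tilde\eta_k.$$

It is easy to check that the Jacobi identity implies $\delta^2 = 0$. The action of $\delta$ on component fields can then easily be evaluated by comparing coefficients and taking into account the sign rule $\delta\phi = \delta\phi^{(0)} - \theta^\mu\delta\phi^{(1)}_\mu + \frac{1}{2}\theta^\mu\theta^\nu\delta\phi^{(2)}_{\mu\nu}$. One gets

$$\begin{aligned}
\delta X^i &= \alpha^{ij}(X)\beta_j, \\
\delta\eta^{+i} &= -dX^i - \alpha^{ij}(X)\eta_j - \partial_k\alpha^{ij}(X)\,\eta^{+k}\beta_j, \\
\delta\beta^{+i} &= -d\eta^{+i} - \alpha^{ij}(X)X_j^+ + \frac{1}{2}\partial_k\partial_l\alpha^{ij}(X)\,\eta^{+k} \wedge \eta^{+l}\beta_j + \partial_k\alpha^{ij}(X)\,\eta^{+k} \wedge \eta_j + \partial_k\alpha^{ij}(X)\,\beta^{+k}\beta_j,
\end{aligned}$$

and

$$\begin{aligned}
\delta\beta_i &= \frac{1}{2}\partial_i\alpha^{kl}(X)\,\beta_k\beta_l, \\
\delta\eta_i &= -d\beta_i - \partial_i\alpha^{kl}(X)\,\eta_k\beta_l - \frac{1}{2}\partial_i\partial_j\alpha^{kl}(X)\,\eta^{+j}\beta_k\beta_l, \\
\delta X_i^+ &= d\eta_i + \partial_i\alpha^{kl}(X)\,X_k^+\beta_l - \partial_i\partial_j\alpha^{kl}(X)\,\eta^{+j} \wedge \eta_k\beta_l + \frac{1}{2}\partial_i\alpha^{kl}(X)\,\eta_k \wedge \eta_l \\
&\qquad - \frac{1}{4}\partial_i\partial_j\partial_p\alpha^{kl}(X)\,\eta^{+j} \wedge \eta^{+p}\beta_k\beta_l - \frac{1}{2}\partial_i\partial_j\alpha^{kl}(X)\,\beta^{+j}\beta_k\beta_l.
\end{aligned}$$

This BRST operator coincides with the one obtained by the Batalin–Vilkovisky procedure. The Batalin–Vilkovisky action is the integral $S_{BV} = \int_D L^{(2)}$ of the two-form part $L^{(2)} = \int d^2\theta\,L$ of

$$L = \tilde\eta_i D\tilde X^i + \frac{1}{2}\alpha^{ij}(\tilde X)\,\tilde\eta_i\tilde\eta_j.$$

It is BRST closed, i.e., it obeys the master equation. In fact one has $\delta L = D(\tilde\eta_i D\tilde X^i)$, so that $\delta L^{(2)}$ is the differential of a one form which vanishes along the boundary.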
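The lowest components of the superfield rule can be read off directly. The following sketch assumes only that $D = \theta^\mu\partial/\partial u^\mu$ raises the $\theta$-degree by one, so that the $D$-terms drop out at $\theta = 0$.

```latex
\begin{align*}
\delta X^i
 &= \bigl[\,D\tilde X^i + \alpha^{ij}(\tilde X)\,\tilde\eta_j\,\bigr]_{\theta=0}
  = \alpha^{ij}(X)\,\beta_j ,\\
\delta\beta_i
 &= \bigl[\,D\tilde\eta_i + \tfrac12\,\partial_i\alpha^{jk}(\tilde X)\,\tilde\eta_j\tilde\eta_k\,\bigr]_{\theta=0}
  = \tfrac12\,\partial_i\alpha^{jk}(X)\,\beta_j\beta_k .
\end{align*}
```

Both results agree with the component formulas listed above and with the action of $\delta_0$ on $X^i$ and $\beta_i$; the antifield-dependent corrections only enter at higher order in $\theta$.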

3.4. The gauge fixed action. We compute the path integral in the Lorentz-type gauge $d{*}\eta_i = 0$. The Hodge $*$ operator (alias the almost complex structure) depends on the conformal structure and the orientation of $D \subset \mathbb{R}^2$: in terms of the standard coordinates, ${*}du^1 = du^2$, ${*}du^2 = -du^1$.

Let us briefly recall the main idea of the Batalin–Vilkovisky formalism in the general setting of 3.2. For any function $\Psi$, the "gauge fixing fermion", of the fields of ghost number $-1$, one considers the integral $\int_{\mathcal{L}} O\,e^{\frac{i}{\hbar}S_{BV}}$ for an observable O, i.e., a function of fields and antifields which is closed with respect to the quantum BRST operator $\Omega = -i\hbar\Delta + \delta$:

$$\Omega O = -i\hbar\,\Delta O + (S_{BV}, O) = 0.$$

The integral is taken over the "Lagrangian" submanifold $\mathcal{L}$ defined by the equations $\phi_\alpha^+ = \partial\Psi/\partial\phi^\alpha$. Using formally the master equation and the fact that O is BRST closed, one then sees that these integrals are invariant under variations of $\Psi$ and thus "equal" to the original (ill-defined) path integral with action $S[\phi] = S_{BV}[\phi, 0]$, which is what one gets if $\Psi = 0$. The problem is then to find a function $\Psi$ which makes the integral well-defined, at least as a perturbative series. One way to do this is to add new fields, called antighosts and Lagrange multipliers together with their antifields, and choose $\Psi$ as the scalar product of the antighost and the gauge fixing condition. The action for these new fields is the simplest and is added to the Batalin–Vilkovisky action $S_{BV}$.

Let us do this in the case at hand. We first introduce new anticommuting scalar fields (antighosts) $\gamma^i$ of ghost number $-1$ on D, and scalar Lagrange multiplier fields $\lambda^i$ of ghost number zero, together with their antifields $\gamma_i^+$, $\lambda_i^+$. The boundary condition for $\lambda^i$ is Dirichlet: $\lambda^i(u) = 0$, $u \in \partial D$, and $\gamma^i$ is constant on the boundary. The action for these fields and antifields is $-\int_D \lambda^i\gamma_i^+$ and is just added to the BV action. The BRST operator acts then as

$$\delta\lambda = \delta\gamma^+ = 0, \qquad \delta\lambda^+ = -\gamma^+, \qquad \delta\gamma = \lambda.$$

Clearly the new action also obeys the master equation. The gauge fixing condition $d{*}\eta = 0$ is encoded in the gauge fixing fermion $\Psi = -\int_D d\gamma^i\,{*}\eta_i$. On the Lagrangian submanifold we then have $X^+ = \beta^+ = \lambda^+ = 0$, $\gamma_i^+ = d{*}\eta_i$ plus a boundary term whose form will not matter, and $\eta^{+i} = {*}d\gamma^i$. The boundary condition for $\gamma^i$ was chosen so as to fulfill the boundary condition for $\eta^+$ (vanishing on normal vectors). The gauge fixed action is then

$$\begin{aligned}
S_{gf} = \int_D\ &\eta_i \wedge dX^i + \frac{1}{2}\alpha^{ij}(X)\,\eta_i \wedge \eta_j - {*}d\gamma^i \wedge \bigl(d\beta_i + \partial_i\alpha^{kl}(X)\,\eta_k\beta_l\bigr) \\
&- \frac{1}{4}\,{*}d\gamma^i \wedge {*}d\gamma^j\,\partial_i\partial_j\alpha^{kl}(X)\,\beta_k\beta_l - \lambda^i\,d{*}\eta_i.
\end{aligned}$$

3.5. Superconformal invariance of the gauge fixed action. The original action is invariant under arbitrary diffeomorphisms of the disc. As the gauge fixing condition depends on a choice of conformal structure, the gauge fixed action is only invariant under conformal diffeomorphisms. In fact this invariance is part of a (twisted) superconformal invariance, as we now show. For each vector field $\epsilon(u) = \epsilon^\mu(u)\,\partial/\partial u^\mu$ on D, tangent to the boundary on $\partial D$, we introduce an odd derivation $\bar\delta_\epsilon$, depending linearly on $\epsilon$, on functions of our fields:

$$\bar\delta_\epsilon X^i = i(\epsilon){*}d\gamma^i, \qquad \bar\delta_\epsilon\lambda^i = -i(\epsilon)\,d\gamma^i, \qquad \bar\delta_\epsilon\eta_i = 0, \qquad \bar\delta_\epsilon\beta_i = i(\epsilon)\,\eta_i, \qquad \bar\delta_\epsilon\gamma^i = 0.$$

Here $i(\epsilon)$ is the interior multiplication of a differential form on D with a vector field $\epsilon$. A straightforward calculation shows that these derivations, together with the BRST operator obey the twisted supersymmetry algebra relations

$$[\delta, \delta]_+ = [\bar\delta_\epsilon, \bar\delta_{\epsilon'}]_+ = 0, \qquad [\delta, \bar\delta_\epsilon]_+ = -L_\epsilon,$$

modulo the equations of motion for $S_{gf}$, with Lie derivative $L_\epsilon = i(\epsilon) \circ d + d \circ i(\epsilon)$. The gauge fixed action obeys $\delta S_{gf} = 0$ and

$$\bar\delta_\epsilon S_{gf} = \int_D \eta_i \wedge \bigl(L_\epsilon{*}d\gamma^i - {*}L_\epsilon d\gamma^i\bigr).$$

The latter expression vanishes if $\epsilon$ is conformal, i.e., if $L_\epsilon$ commutes with $*$. Conformal vector fields on the disc form a three dimensional Lie algebra, isomorphic to $su(1, 1)$.

3.6. BRST cohomology classes. Observables can be obtained from differential forms on M. To a differential p-form $\omega = \omega_{i_1,\dots,i_p}(x)\,dx^{i_1} \wedge \cdots \wedge dx^{i_p}$ and a point u on the boundary of the disc D, one associates the observable $\hat\omega(u) = \omega_{i_1,\dots,i_p}(X(u))\,\gamma^{i_1}(u) \cdots \gamma^{i_p}(u)$. In general, functions of the components of $\tilde X(u)$ with u on the boundary are observables, since with our boundary conditions we have $\delta\tilde X = 0$ on the boundary and the Laplacian of a function depending only on a field but not its antifield or on an antifield but not its field, vanishes. More general observables are considered in Sect. 4. The gauge fixed action still has a (finite dimensional) residual infinitesimal symmetry $X^i \to X^i + a^i$, $\gamma^i \to \gamma^i + g^i$, where $a^i$, $g^i$ are constant functions on the disc. This translates into zero modes in the integration over the fermions $\gamma^i$, and it follows that the only observables that have non-zero integral have ghost number $-\dim(M)$.

3.7. Feynman rules. The Feynman perturbation expansion in powers of $\hbar$ around the classical solution $X(u) = x$, $\eta(u) = 0$ can be now computed. Thus we write $X(u) = x + \xi(u)$ with a fluctuation field $\xi(u)$ with $\xi(\infty) = 0$. The Feynman propagators can then be deduced from the "kinetic" part

$$S^0_{gf} = \int_D \eta_i \wedge d\xi^i - {*}d\gamma^i \wedge d\beta_i - \lambda^i\,d{*}\eta_i = \int_D \eta_i \wedge (d\xi^i + {*}d\lambda^i) + \beta_i\,d{*}d\gamma^i$$

of the gauge fixed action. The other terms of $S_{gf}$ are considered as perturbations. Thus we have to invert the operators $d \oplus {*}d : \Omega^0(D) \oplus \Omega^0_0(D) \to \Omega^1(D)$ and $d{*}d : \Omega^0(D) \to \Omega^2(D)$. Here $\Omega^p(D)$ is the space of smooth p-forms on D and $\Omega^0_0(D)$ denotes the space of functions with Dirichlet boundary conditions $\lambda^i(u) = 0$, $u \in \partial D$. Both operators are surjective but have a one-dimensional kernel consisting of constant functions. Inverses (modulo these kernels) are integral operators: to describe them it is useful to map conformally the disc onto the upper half plane $H_+$ and use the standard complex coordinate of $H_+$. The integration kernel of $(d{*}d)^{-1}$ is the Green function $\frac{1}{2\pi}\psi(z, w)$, with

$$\psi(z, w) = \ln\left|\frac{z - w}{z - \bar w}\right|.$$

The integration kernel of $(d \oplus {*}d)^{-1}$ is the Green function $G(w, z) = \frac{1}{2\pi}\bigl({*}d_z\psi(z, w) \oplus d_z\phi(z, w)\bigr)$, where $d_z = dz\,\frac{\partial}{\partial z} + d\bar z\,\frac{\partial}{\partial\bar z}$ is the differential with respect to z and

$$\phi(z, w) = \frac{1}{2i}\ln\frac{(z - w)(z - \bar w)}{(\bar z - \bar w)(\bar z - w)}.$$

We have $d_w{*}d_w\psi(z, w) = d_w{*}d_w\phi(z, w) = 2\pi\delta_z(w)$, where $\delta_z(w)$ is the Dirac distribution two-form, and the boundary conditions for $w \in \partial H_+$ are Dirichlet for $\psi$ and Neumann for $\phi$. The propagators are then

$$\langle\gamma^k(w)\beta_j(z)\rangle = \frac{i\hbar}{2\pi}\delta^k_j\,\psi(z, w), \qquad \langle\xi^k(w)\eta_j(z)\rangle = \frac{i\hbar}{2\pi}\delta^k_j\,d_z\phi(z, w), \qquad \langle\lambda^k(w)\eta_j(z)\rangle = \frac{i\hbar}{2\pi}\delta^k_j\,{*}d_z\psi(z, w).$$
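The two Green functions are conjugate harmonic functions of w away from $w = z$, which in the convention ${*}du^1 = du^2$, ${*}du^2 = -du^1$ reads ${*}d_w\psi = d_w\phi$. The sketch below checks this relation and the Dirichlet condition for $\psi$ numerically. The finite-difference step, the sample points and the use of $\arg(z - w) + \arg(z - \bar w)$ as a local branch of $\phi$ are assumptions made for illustration.

```python
import cmath
import math

def psi(z, w):
    """psi(z, w) = ln |(z - w) / (z - conj(w))|."""
    return math.log(abs((z - w) / (z - w.conjugate())))

def phi(z, w):
    """Local branch of phi(z, w): arg(z - w) + arg(z - conj(w))."""
    return cmath.phase(z - w) + cmath.phase(z - w.conjugate())

def grad_w(f, z, w, h=1e-6):
    """Central-difference partial derivatives of f(z, .) in Re w and Im w."""
    d_re = (f(z, w + h) - f(z, w - h)) / (2 * h)
    d_im = (f(z, w + 1j * h) - f(z, w - 1j * h)) / (2 * h)
    return d_re, d_im

def conjugacy_defect(z, w):
    """Mismatch between *d_w psi and d_w phi at the point w."""
    psi_a, psi_b = grad_w(psi, z, w)
    phi_a, phi_b = grad_w(phi, z, w)
    # *d psi = psi_a db - psi_b da, to be compared with d phi = phi_a da + phi_b db
    return max(abs(phi_a + psi_b), abs(phi_b - psi_a))

z, w = 0.3 + 1.2j, -0.5 + 0.7j
print(conjugacy_defect(z, w), psi(z, complex(2.0, 0.0)))
```

The sample points are chosen so that neither $z - w$ nor $z - \bar w$ lies near the negative real axis, where `cmath.phase` jumps by $2\pi$.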

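Note that ${*}d_w\psi(z, w) = d_w\phi(z, w)$ so that $\langle{*}d\gamma^k(w)\beta_j(z)\rangle = \delta^k_j\,\frac{i\hbar}{2\pi}\,d_w\phi(z, w)$. It follows that the propagators combine into a superpropagator

$$\langle\xi^k(w)\eta_j(z)\rangle + \langle{*}d\gamma^k(w)\beta_j(z)\rangle = \frac{i\hbar}{2\pi}\delta^k_j\,d\phi(z, w),$$

where $d = d_z + d_w$. In terms of superfields $\tilde\eta_j(z, \theta) = \beta_j(z) + \theta^\mu\eta_{j,\mu}(z)$, $\tilde\xi^k(w, \zeta) = \xi^k(w) + \zeta^\mu\eta^{+k}_\mu(w)$, with $\eta^{+j} = {*}d\gamma^j$, the superpropagator is

$$\langle\tilde\xi^k(w, \zeta)\,\tilde\eta_j(z, \theta)\rangle = \frac{i\hbar}{2\pi}\delta^k_j\,D\phi(z, w),$$

where $D = \theta^\mu\frac{\partial}{\partial z^\mu} + \zeta^\mu\frac{\partial}{\partial w^\mu}$.

The perturbation expansion is then obtained by writing $S_{gf} = S^0_{gf} + S^1_{gf}$ and expanding:

$$\int e^{\frac{i}{\hbar}S_{gf}}\,O = \sum_{n=0}^{\infty} \frac{i^n}{\hbar^n\,n!} \int e^{\frac{i}{\hbar}S^0_{gf}}\,(S^1_{gf})^n\,O.$$

This expression is calculated using the Wick theorem for Gaussian integrals

$$\begin{aligned}
&\int e^{\frac{i}{\hbar}S^0_{gf}}\,\tilde\xi^{k_1}(w_1, \zeta_1) \cdots \tilde\xi^{k_N}(w_N, \zeta_N)\,\tilde\eta_{j_1}(z_1, \theta_1) \cdots \tilde\eta_{j_N}(z_N, \theta_N)\,\delta_x(X(\infty)) \\
&\qquad = \sum_{\sigma \in S_N} \langle\tilde\xi^{k_{\sigma(1)}}(w_{\sigma(1)}, \zeta_{\sigma(1)})\,\tilde\eta_{j_1}(z_1, \theta_1)\rangle \cdots \langle\tilde\xi^{k_{\sigma(N)}}(w_{\sigma(N)}, \zeta_{\sigma(N)})\,\tilde\eta_{j_N}(z_N, \theta_N)\rangle.
\end{aligned}$$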

The normalization of the integral is such that $\int \exp\bigl(\frac{i}{\hbar}S^0_{gf}\bigr)\,\delta_x(X(\infty)) = 1$, so that for $\alpha = 0$ the star product coincides with the ordinary product. Here $\delta_x(X(t)) = \prod_{i=1}^{d} \delta(X^i(t) - x^i)\,\gamma^i(t)$ fixes the value of the zero modes (constant functions) of X and the $\gamma$'s are needed since the integral is otherwise zero, owing to the presence of zero modes in the integration over $\gamma$. More generally we could insert instead of the delta distribution a factor $\rho(X(\infty))\,\gamma^1(\infty) \cdots \gamma^d(\infty)$, for some top differential form $\omega = \rho(x)\,dx^1 \cdots dx^d$ on M, resulting in a factor $\int_M \omega$ in the right-hand side. The Feynman perturbation expansion is then obtained by expanding the interaction term $S^1_{gf}$ and the observable in powers of $\tilde\xi$, $\tilde\eta$. This gives the vertices

$$S^1_{gf} = \frac{1}{2}\int_D\!\int d^2\theta \sum_{k=0}^{\infty} \frac{1}{k!}\,\partial_{j_1} \cdots \partial_{j_k}\alpha^{ij}(x)\,\tilde\xi^{j_1} \cdots \tilde\xi^{j_k}\,\tilde\eta_i\tilde\eta_j. \qquad (4)$$

Here the Berezin integral selects the two-form part of the superfield. We consider the observable of the correct ghost number

$$O = f(\tilde X(1))\,g(\tilde X(0))\,\delta_x(X(\infty)), \qquad (5)$$

where $f, g \in C^\infty(M)$. Then expanding f and g in powers of $\tilde\xi$ we get an expansion in Feynman diagrams⁴. The terms with n vertices (4) are then labeled by the Kontsevich diagrams of order n, but possibly with tadpoles, i.e., lines that start and end at the same vertex. The term labeled by a diagram $\Gamma$ with lines $(j, v_1(j))$, $(j, v_2(j))$, $j = 1, \dots, n$ is $D_\Gamma(g \otimes f)$ (see Sect. 2) times

$$\frac{1}{n!}\left(\frac{i}{\hbar}\right)^{n}\left(\frac{i\hbar}{2\pi}\right)^{2n}\frac{1}{2^n}\int \bigwedge_{j=1}^{n} d\phi(u_j, u_{v_1(j)}) \wedge d\phi(u_j, u_{v_2(j)}) = (-1)^n\left(\frac{i\hbar}{2}\right)^{n} w_\Gamma.$$

The factor $1/k_j!$, where $k_j$ is the number of lines pointing to j, is compensated by the fact that there are as many terms in the Wick theorem which give the same contribution because $k_j$ arguments of $\tilde\xi$ are equal to each other. As explained at the end of Sect. 2, we have $(-1)^n w_\Gamma D_\Gamma(g \otimes f) = w_{\bar\Gamma} D_{\bar\Gamma}(f \otimes g)$, where $\bar\Gamma$ is $\Gamma$ with R and L interchanged. Thus the product obtained here coincides with Kontsevich's, except that it also involves tadpole diagrams. These have to be considered separately, and require (finite) renormalization, which we proceed to discuss.
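The numerical prefactor of a diagram with n vertices can be collected step by step. The sketch below assumes one factor $i/\hbar$ and one factor $\frac12$ per vertex (4), one factor $i\hbar/2\pi$ for each of the 2n propagators, and the overall $1/n!$ from the expansion of the exponential.

```latex
\begin{align*}
\frac{1}{n!}\left(\frac{i}{\hbar}\right)^{n}\left(\frac{i\hbar}{2\pi}\right)^{2n}\frac{1}{2^{n}}
 &= \frac{1}{n!}\,\frac{i^{3n}\,\hbar^{n}}{2^{n}\,(2\pi)^{2n}}
  = (-1)^{n}\left(\frac{i\hbar}{2}\right)^{n}\frac{1}{(2\pi)^{2n}\,n!},
 \qquad\text{since } i^{3n} = (-1)^{n}\, i^{n}.
\end{align*}
```

The remaining factor $1/((2\pi)^{2n} n!)$ is exactly the normalization in the definition of the weight $w_\Gamma$ in Sect. 2, so each diagram enters with the power $(i\hbar/2)^n$ required by Kontsevich's formula, up to the sign $(-1)^n$.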

3.8. Renormalization. In the perturbation expansion described above, all integrals are absolutely convergent except for those containing tadpole diagrams, which are diagrams with one edge connecting a vertex to itself. The corresponding amplitude contains an ill-defined factor $d\phi(z, z)$, the superpropagator taken at coinciding points. To make sense of this expression we introduce a point-splitting regularization and define $d\phi(z, z)$ as the limit

$$d\phi(z, z) = \kappa(z; \zeta) = \lim_{\epsilon \to 0} d\phi(z, z + \epsilon\zeta(z)).$$

Here $\zeta(z)$ is a vector field on D which does not vanish in the interior of D. This limit exists but depends on the regularizing vector field $\zeta(z)$. Indeed, if we write $\zeta(z) = r(z)e^{i\vartheta(z)}$ in polar coordinates, then $\kappa(z; \zeta) = d\vartheta(z)$. Thus the Feynman integrals have a finite renormalization ambiguity. One way to fix it is to add a counterterm

$$S_{c.t.} = \frac{i\hbar}{2\pi}\int_D\!\int d^2\theta\ \partial_i\alpha^{ij}(\tilde X)\,\tilde\eta_j\,\tilde\kappa, \qquad \tilde\kappa = \theta^\mu\kappa_\mu, \qquad (6)$$

(or more simply choose the slightly singular $\vartheta$ = constant) which removes the tadpole diagrams, and one gets precisely the Kontsevich formula. One easily checks that the action with the addition of the counterterm still obeys the classical master equation and, by the same argument as at the end of 3.2, also the quantum master equation.

⁴ Actually, the arguments of f and g of our original integral are X(1), X(0), rather than the superfields. However, the additional terms in $\tilde X(1)$, $\tilde X(0)$ do not contribute to the integral since they are of negative ghost number.

4. Central Functions

Using a non-rigorous quantum field theory argument based on BRST cohomology, we can prove the following claim: There is a star product, equivalent to Kontsevich's, so that every function that is central in the Poisson algebra is also central for the star product. Two star products $\star$, $\star'$ corresponding to the same Poisson bracket are called equivalent if there is a series $R = R_0 + \hbar R_1 + \hbar^2 R_2 + \cdots$ with $R_i$ differential operators and $R_0 = \mathrm{Id}$, such that $f \star' g = R^{-1}(Rf \star Rg)$. The argument will also give us a formula for R, see (8). Observe first that the BRST variation of a function on M is given by

$$\delta f(\tilde X) = \partial_i f(\tilde X)\,\bigl(D\tilde X^i + \alpha^{ij}(\tilde X)\,\tilde\eta_j\bigr) = Df(\tilde X) + \tilde\eta_j\,\alpha^{ij}(\tilde X)\,\partial_i f(\tilde X).$$

If f is central in the Poisson algebra, that is, if $\alpha^{ij}(X)\partial_j f(X) = 0$, then the second term on the right-hand side vanishes.⁵ Writing $f(\tilde X)$ in components, $f(\tilde X) = f(X) + \theta^\mu\eta^{+i}_\mu\partial_i f(X) + \cdots$, we get the descent equations

$$\delta f(X) = 0, \qquad \delta\bigl(\eta^{+i}\partial_i f(X)\bigr) = -df(X).$$

The first equation means that f(X) is an observable. Therefore, the expectation value

$$h(u; f, g)(x) = \int f(X(u))\,g(X(0))\,\delta_x(X(\infty))\,e^{\frac{i}{\hbar}S} \qquad (7)$$

is well defined for any u in the upper half plane. Observe that we put no additional hypotheses on g, so that, as in the previous sections, g(X(v)) is an observable only if v is on the real axis.

⁵ In the presence of the counterterm (6) the BRST operator is modified and we get an additional term $(i\hbar/2\pi)\,\tilde\kappa\,\partial_j(\alpha^{ji}\partial_i f)(\tilde X)$ in the formula for $\delta f$. This term also vanishes if f is central in the Poisson algebra.
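The notion of equivalence used in this section can be checked to first order in $\hbar$. The sketch below assumes the expansions $f \star g = fg + \hbar B_1(f, g) + O(\hbar^2)$ with $B_1(f, g) = \frac{i}{2}\{f, g\}$ and $R = \mathrm{Id} + \hbar R_1 + O(\hbar^2)$, so that $R^{-1} = \mathrm{Id} - \hbar R_1 + O(\hbar^2)$; the symbol $B_1$ is introduced here only for illustration.

```latex
\begin{align*}
f \star' g &= R^{-1}(Rf \star Rg)
 = fg + \hbar\,\bigl[\,B_1(f,g) + (R_1 f)\,g + f\,(R_1 g) - R_1(fg)\,\bigr] + O(\hbar^2),\\
f \star' g - g \star' f &= \hbar\,\bigl[\,B_1(f,g) - B_1(g,f)\,\bigr] + O(\hbar^2)
 = i\hbar\,\{f, g\} + O(\hbar^2).
\end{align*}
```

The terms involving $R_1$ are symmetric in f and g, so they drop out of the commutator: equivalent star products induce the same Poisson bracket, whatever $R_1$ is.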

The second descent equation may then be used to prove that h is independent of u. In fact, denoting by d the exterior derivative on the upper half plane, we get dh(u; f, g)(x) =

=−

i

df (X(u))g(X(0))δx (X(∞))e h¯ S = i

δ[η+i (u)∂i f (X(u))]g(X(0))δx (X(∞))e h¯ S = i = − δ[η+i (u)∂i f (X(u))g(X(0))δx (X(∞))]e h¯ S = 0.

Observe that to obtain the last equality one must also check that $\Delta[\eta^{+i}(u)\partial_i f(X(u))\,g(X(0))\,\delta_x(X(\infty))] = 0$. As a consequence, we eventually get
\[ Rf\star g\,(x) = \lim_{\epsilon\downarrow0} h(1+i\epsilon;f,g)(x) = \lim_{\epsilon\downarrow0} h(-1+i\epsilon;f,g)(x) = g\star Rf\,(x). \]

Here for $v$ on the real axis, $Rf(X(v)) = \lim_{\epsilon\downarrow0} f(X(v+i\epsilon))$ is the limit of the observable $f(X(u))$ defined on the upper half plane as $u$ tends to the real axis. We claim that $Rf$ is given by the one-point function
\[ Rf(x) = \int f(X(u))\,\delta_x(X(\infty))\,e^{\frac{i}{\hbar}S} = f(x) + O(\hbar^2) \tag{8} \]
for any point $u$ not on the boundary. This is based on the following factorization argument: if in the integral $h(u;f,g)$ the point $u$ approaches the boundary, it is as if we considered the integral on two discs connected by a small bridge, with $u$ in the middle of one disc and the insertion point for the observable $g$ on the boundary of the other. In the limit one obtains a path integral for a disc with two points (and the point at infinity) on the boundary. One point is the insertion point for $g$ and at the other the result of the path integral on the disc with one point in the interior is inserted. See Fig. 1 for the case when $u$ approaches $-1$. This argument can be made precise looking at the perturbation

Fig. 1. The expectation value (7) in the limit as $u$ approaches the boundary reduces to a path integral on this surface


A. S. Cattaneo, G. Felder

expansion [CFT] with the result that $Rf = f + \hbar^2 R_2 f + \cdots$, with $R_j$ differential operators. Thus $f$ is central for the star product $g\star' h = R^{-1}(Rg\star Rh)$, proving the claim at the beginning of the section.

Using this result we may strengthen our claim: The center of $C^\infty(M)[[\hbar]]$ with the star product $\star'$ is $Z[[\hbar]]$, where $Z = \{f\in C^\infty(M) \mid \{f,\cdot\} = 0\}$ is the center of the Poisson algebra $C^\infty(M)$.

The proof goes as follows. We need to show that if $f = f_0 + \hbar f_1 + \cdots$ is central for $\star'$ then all coefficients $f_i$ are in $Z$. If the star commutator $f\star' g - g\star' f$ vanishes for arbitrary $g\in C^\infty(M)$, then in particular the coefficient of $\hbar$ vanishes, so $\{f_0,g\} = 0$ and $f_0\in Z$. But this implies, by what we showed above, that $f_0$ is central for the star product. Thus $(f-f_0)/\hbar = f_1 + \hbar f_2 + \cdots$ is central. By proceeding in this way we see that $f_1, f_2, \dots$ are all in $Z$.

5. $L_\infty$ Morphism and Formality

5.1. The general path integral as a map from polyvector fields to polydifferential operators. The path integral we considered so far is a special case of the following general construction. A polyvector field of degree $p$ is a section of $\wedge^{p+1}TM$, i.e., a skew-symmetric contravariant tensor field of rank $p+1$,
\[ \frac{1}{(p+1)!}\,\alpha^{j_0,\dots,j_p}(x)\,\frac{\partial}{\partial x^{j_0}}\wedge\cdots\wedge\frac{\partial}{\partial x^{j_p}}. \]
A polyvector field is a sum $\alpha = \sum_{p=0}^{d-1}\alpha^{(p)}$ of polyvector fields of all nonnegative degrees. The space of polyvector fields is denoted by $T_{\mathrm{poly}}(M)$.

For a multi-index $I = (i_1,\dots,i_d)\in\mathbb{Z}^d_{\ge0}$, let $\partial_I = \prod_k(\partial/\partial x^k)^{i_k}$. A polydifferential operator of degree $m$ is an operator $V: C^\infty(M)^{\otimes(m+1)}\to C^\infty(M)$ of the form
\[ V(f_0\otimes\cdots\otimes f_m)(x) = \sum V_{I_0,\dots,I_m}(x)\,\partial_{I_0}f_0\cdots\partial_{I_m}f_m, \]
with a finite sum over sets of multi-indices $I_j$. A polydifferential operator is a formal sum of polydifferential operators of arbitrary nonnegative degrees. The space of polydifferential operators is denoted by $D_{\mathrm{poly}}(M)$.

To a polyvector field $\alpha$ we may associate a function of fields and antifields: $S = S_0 + S_\alpha$, with
\[ S_0 = \int_D d^2\theta\,\tilde\eta_j D\tilde X^j - \int_D \lambda^i\gamma^+_i, \]
as above, and
\[ S_\alpha = \sum_{p=0}^{d-1}\int_D d^2\theta\,\frac{1}{(p+1)!}\,\alpha^{j_0,\dots,j_p}(\tilde X(u,\theta))\,\tilde\eta_{j_0}(u,\theta)\cdots\tilde\eta_{j_p}(u,\theta). \]
We may then consider correlation functions of boundary fields associated to the functions $f_0,\dots,f_m$ on $M$,
\[ U(\alpha)(f_0\otimes\cdots\otimes f_m)(x) = \int e^{\frac{i}{\hbar}(S_0+S_\alpha)}\,O_x(f_0,\dots,f_m), \]
\[ O_x(f_0,\dots,f_m) = \int_{B_m}\big[f_0(\tilde X(t_0,\theta_0))\cdots f_m(\tilde X(t_m,\theta_m))\big]^{(m-1)}\,\delta_x(X(\infty)). \]


The path integral is, as before, the integral over the Lagrangian submanifold in the space of fields and antifields determined by our gauge condition $d{*}\eta = 0$. The integral over the $t_i$ is the integral over the $(m-1)$-form part (the coefficient of $\theta_1\cdots\theta_{m-1}$) of the integrand over the simplex $1 = t_0 > t_1 > \cdots > t_{m-1} > t_m = 0$, with the orientation given by the volume form $dt_1\wedge\cdots\wedge dt_{m-1}$. It may be viewed as an integral over the moduli space $B_m$ of $m+1$ cyclically ordered points on the circle modulo conformal transformations. More explicitly,
\[ O_x(f_0,\dots,f_m) = \int_{1>t_1>\cdots>t_{m-1}>0} f_0(X(1))\,\prod_{k=1}^{m-1}\partial_{i_k}f_k(X(t_k))\,\eta^{+i_k}(t_k)\;f_m(X(0))\,\delta_x(X(\infty)). \]

Expanding the path integral in powers of $\hbar$ as in the previous section, we get a map $U$ that associates, to each polyvector field $\alpha$, a formal power series whose coefficients are polydifferential operators. The perturbative expansion has the form $U(\alpha) = \sum_{n=0}^\infty U_n(\alpha,\dots,\alpha;\hbar)$. Here $U_n$ is a multilinear function of $n$ arguments in $T_{\mathrm{poly}}(M)$ with values in $D_{\mathrm{poly}}(M)$. The formula for $U_n$ is
\[ U_n(\alpha_1,\dots,\alpha_n;\hbar)(f_0\otimes\cdots\otimes f_m)(x) = \int e^{\frac{i}{\hbar}S_0}\,\frac{i}{\hbar}S_{\alpha_1}\cdots\frac{i}{\hbar}S_{\alpha_n}\,O_x(f_0,\dots,f_m). \]

Suppose now that, for $i = 1,\dots,n$, $\alpha_i$ is homogeneous of degree $p_i$. Then $S_{\alpha_i}$ is the integral of the two-form component of $\frac{1}{(p_i+1)!}\,\alpha_i^{j_0\dots j_{p_i}}(\tilde X)\,\tilde\eta_{j_0}\cdots\tilde\eta_{j_{p_i}}$, and has thus ghost number $p_i - 1$. This has two consequences: first, since the integral over the $t_i$ picks the $(m-1)$-form component of $\prod_i f_i(\tilde X(t_i))$, which has ghost number $1-m$, we have the ghost number condition $1 - m + \sum_{i=1}^n(p_i-1) = 0$, or
\[ m = 1 - n + \sum_{i=1}^n p_i, \tag{9} \]
for the path integral to be non-zero. This means that $U_n$ is a map of degree $1-n$ from $T_{\mathrm{poly}}(M)^{\otimes n}$ to $D_{\mathrm{poly}}(M)$. Using this formula we may compute the dependence on $\hbar$ of this integral: the path integral has an overall $\hbar$ to the power $-n + \sum_i(p_i+1) = n+m-1$ (each vertex has $1/\hbar$ and each propagator has an $\hbar$), and we have
\[ U_n(\alpha_1,\dots,\alpha_n;\hbar)(\otimes_0^m f_i) = (i\hbar)^{n+m-1}\,U_n(\alpha_1,\dots,\alpha_n)(\otimes_0^m f_i), \]
with $U_n(\alpha_1,\dots,\alpha_n) = U_n(\alpha_1,\dots,\alpha_n;\hbar = 1/i)$ independent of $\hbar$. The second consequence is that
\[ U_n(\dots,\alpha_i,\dots,\alpha_j,\dots) = (-1)^{(p_i-1)(p_j-1)}\,U_n(\dots,\alpha_j,\dots,\alpha_i,\dots), \tag{10} \]
i.e., $U_n$ is symmetric in a graded sense.
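The degree counting in (9) and the accompanying power of $\hbar$ are simple enough to check mechanically. The following is only an illustrative sketch; the function names `output_degree` and `hbar_power` are hypothetical and not part of the paper.

```python
def output_degree(degrees):
    """Degree m of U_n(alpha_1, ..., alpha_n) from condition (9):
    m = 1 - n + sum(p_i), where degrees = [p_1, ..., p_n]."""
    n = len(degrees)
    return 1 - n + sum(degrees)


def hbar_power(degrees):
    """Overall power of hbar: each of the n vertices contributes 1/hbar
    and a vertex of degree p_i carries p_i + 1 propagators."""
    n = len(degrees)
    return -n + sum(p + 1 for p in degrees)


# n bivector fields (degree p = 1 each) give a bidifferential operator
# (m = 1) at order hbar^n, which is the star-product case.
for n in range(5):
    degrees = [1] * n
    m = output_degree(degrees)
    assert m == 1
    assert hbar_power(degrees) == n + m - 1 == n
```

For a general list of degrees the two functions always satisfy `hbar_power(d) == len(d) + output_degree(d) - 1`, which is the exponent appearing in the rescaling formula above.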


5.2. Special cases. Let us consider in detail some special cases. For $n = 0$, $U_n$ is a polydifferential operator of degree $m = 1$, and
\[ U_0(f_0\otimes f_1) = \int e^{\frac{i}{\hbar}S_0}\,f_0(\tilde X(1))\,f_1(\tilde X(0))\,\delta_x(X(\infty)) = f_0(x)f_1(x) \]
is the undeformed product on $C^\infty(M)$. If $n = 1$ and $\alpha$ is a polyvector field of degree $p$ then $U_1(\alpha)$ is of degree $p$. Let
\[ \alpha = \frac{1}{(p+1)!}\,\alpha^{j_0,\dots,j_p}(x)\,\frac{\partial}{\partial x^{j_0}}\wedge\cdots\wedge\frac{\partial}{\partial x^{j_p}}, \]
with $\alpha^{j_0,\dots,j_p}$ antisymmetric. The Wick theorem yields in this case
\[ U_1(\alpha;\hbar)(f_0\otimes\cdots\otimes f_p)(x) = \frac{i}{\hbar}\left(\frac{i\hbar}{2\pi}\right)^{p+1} I_p\,\alpha^{j_0,\dots,j_p}\,\partial_{j_0}f_0(x)\cdots\partial_{j_p}f_p(x). \]
Here $I_p$ is the integral
\[ I_p = \int d\phi(u,1)\wedge d\phi(u,t_1)\wedge\cdots\wedge d\phi(u,0), \]
over $u = u_1 + iu_2\in H$ and $1 > t_1 > \cdots > t_{p-1} > 0$, with orientation given by the form $du_1\wedge du_2\wedge dt_1\cdots dt_{p-1}$. To compute this integral we proceed as in Sect. 2 and introduce new variables $\phi_0 = \phi(u,1)$, $\phi_j = \phi(u,t_j)$ ($j = 1,\dots,p-1$; recall that $m = p$ here by (9)) and $\phi_p = \phi(u,0)$. In the new variables the integration is over the region $2\pi > \phi_0 > \cdots > \phi_p > 0$. We claim that the sign of the Jacobian of the change of variables is $(-1)^p$. This follows from the fact that $d\phi(u,0)\wedge d\phi(u,1) = J\,du_1\wedge du_2$ with $J > 0$ and that $\partial\phi(u,t)/\partial t > 0$. Hence,
\[ \int d\phi(u,1)\wedge d\phi(u,t_1)\wedge\cdots\wedge d\phi(u,0) = (-1)^p\int_{2\pi>\phi_0>\cdots>\phi_p>0} d\phi_0\cdots d\phi_p = (-1)^p\,\frac{(2\pi)^{p+1}}{(p+1)!}, \]
and we obtain
\[ U_1(\alpha)(f_0\otimes\cdots\otimes f_p)(x) = \frac{(-1)^{p+1}}{(p+1)!}\,\alpha^{j_0,\dots,j_p}(x)\,\partial_{j_0}f_0(x)\cdots\partial_{j_p}f_p(x). \]
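The two numerical ingredients of this computation, the volume of the ordered simplex and the way the powers of $i$, $\hbar$ and $2\pi$ combine, can be cross-checked with exact rational arithmetic. This is a sketch under the stated formulas only; the helper names are hypothetical, and the simplex is rescaled to unit side so that the factor $(2\pi)^{p+1}$ is treated symbolically.

```python
from fractions import Fraction
from math import factorial


def ordered_simplex_volume(p):
    """Volume of {1 > x_0 > x_1 > ... > x_p > 0}, by exact iterated integration."""
    poly = [Fraction(1)]  # poly[k] is the coefficient of x**k
    for _ in range(p + 1):
        # integrate from 0 to x: x**k -> x**(k+1) / (k+1)
        poly = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]
    return sum(poly)  # evaluate the resulting polynomial at x = 1


def u1_coefficient(p):
    """Numerical prefactor of U_1(alpha) for a polyvector field of degree p.

    I_p = (-1)**p * (2*pi)**(p+1) * volume, and the remaining factors
    (i/hbar) * (i*hbar/(2*pi))**(p+1) / (i*hbar)**p reduce to i**2 / (2*pi)**(p+1),
    so the powers of 2*pi cancel and an overall factor -1 is left.
    """
    return -((-1) ** p) * ordered_simplex_volume(p)


for p in range(6):
    assert ordered_simplex_volume(p) == Fraction(1, factorial(p + 1))
    assert u1_coefficient(p) == Fraction((-1) ** (p + 1), factorial(p + 1))
```

The division by $(i\hbar)^p$ in the docstring uses the rescaling $U_n(\dots;\hbar) = (i\hbar)^{n+m-1}U_n(\dots)$ with $n = 1$ and $m = p$.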

5.3. $U$ is an $L_\infty$ morphism. The formal properties of the map $U$ can be deduced using the main trick of the BV formalism, which is to use the fact that the integral of the Laplacian of anything is zero. In our situation we have, with $S_0 = \int_D d^2\theta\,\tilde\eta_i D\tilde X^i - \int_D\lambda^i\gamma^+_i$, and $\alpha_j$ ($j = 1,\dots,n$) homogeneous polyvector fields of degree $p_j$,
\[ \int \Delta\Big(e^{\frac{i}{\hbar}S_0}\,\prod_{i=1}^n S_{\alpha_i}\,O_x(f_0,\dots,f_m)\Big) = 0. \]


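To evaluate the left-hand side we use (2) and $\Delta S_0 = \Delta S_\alpha = 0$. Also, we use the fact that $(S_0,S_\alpha)$ is proportional to $\int_D d^2\theta\,D(\alpha^{j_0\dots j_p}(\tilde X)\,\tilde\eta_{j_0}\cdots\tilde\eta_{j_p})$, which vanishes because of the boundary conditions for $\tilde\eta_j$. Thus we get
\[
0 = (-1)^{m-1}\int e^{\frac{i}{\hbar}S_0}\,\prod_{i=1}^n S_{\alpha_i}\;\frac{i}{\hbar}\,(S_0, O_x(f_0,\dots,f_m))
+ \sum_{1\le j<k\le n}\epsilon_{jk}\int e^{\frac{i}{\hbar}S_0}\,(S_{\alpha_j},S_{\alpha_k})\prod_{i\ne j,k}S_{\alpha_i}\;O_x(f_0,\dots,f_m).
\]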

The sign is $\epsilon_{jk} = (-1)^{(g_1+\cdots+g_j)g_j + (g_1+\cdots+g_{j-1}+g_{j+1}+\cdots+g_{k-1})g_k}$, where $g_j = p_j - 1$ is the ghost number of $S_{\alpha_j}$. Now the BV bracket $(S_{\alpha_j},S_{\alpha_k})$ is again of the form $S_\alpha$: $(S_{\alpha_j},S_{\alpha_k}) = -S_{[\alpha_j,\alpha_k]}$. The Schouten–Nijenhuis bracket $[\ ,\ ]$ is a graded super Lie algebra structure on $T_{\mathrm{poly}}(M)$. On vector fields it is defined to be the usual Lie bracket, and it is extended to polyvector fields by the Leibniz rule $[\alpha_1,\alpha_2\wedge\alpha_3] = [\alpha_1,\alpha_2]\wedge\alpha_3 + (-1)^{p_1(p_2-1)}\,\alpha_2\wedge[\alpha_1,\alpha_3]$. The Jacobi identity for a bivector field $\alpha$ is $[\alpha,\alpha] = 0$.

Moreover, $(S_0, f(\tilde X(t,\theta))) = Df(\tilde X(t,\theta))$, which in components reads
\[ (S_0, f(X(t))) = 0, \qquad \big(S_0, \partial_i f(X(t))\,\eta^{+i}(t)\big) = -df(X(t)). \]
Using this identity and the Leibniz rule (3) we see that the integral over $B_m$ reduces to an integral over the boundary (of a suitable compactification). The compactification may be understood by thinking of $B_m$ as the moduli space of discs with $m+1$ marked points on the boundary modulo the action of $SU(1,1)$. The boundary of $\bar B_m$ consists of discs degenerated into pairs of discs with a point in common. Its various connected components are obtained by distributing the points on the two discs in all possible ways compatible with the cyclic ordering, see Fig. 2. By the usual factorization arguments of quantum field theory, we obtain the following formulae.

Let $S_{K,n-K}$ be the subset of the group $S_n$ of permutations of $n$ letters consisting of permutations such that $\sigma(1) < \cdots < \sigma(K)$ and $\sigma(K+1) < \cdots < \sigma(n)$. For $\sigma\in S_{K,n-K}$ let us introduce the sign
\[ \epsilon(\sigma) = (-1)^{\sum_{r=1}^{K} g_{\sigma(r)}\left(\sum_{s=1}^{\sigma(r)-1} g_s \;-\; \sum_{s=1}^{r-1} g_{\sigma(s)}\right)}. \]

Fig. 2. A component of the boundary of $B_m$. The point $\infty$ is $t_{m+1}$


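It is the sign one gets if one puts a product of $n$ elements of degree $g_1,\dots,g_n$ of a graded commutative algebra in the order given by $\sigma$. Then we have
\[
\begin{aligned}
&\sum_{K=0}^{n}\sum_{k=1}^{m-1}\sum_{i=0}^{m-k}\sum_{\sigma\in S_{K,n-K}} \epsilon(\sigma)\,(-1)^{k(i+1)}(-1)^m\; U_K(\alpha_{\sigma(1)},\dots,\alpha_{\sigma(K)})\big(f_0\otimes\cdots\otimes f_{i-1} \\
&\qquad\otimes\, U_{n-K}(\alpha_{\sigma(K+1)},\dots,\alpha_{\sigma(n)})(f_i\otimes\cdots\otimes f_{i+k})\otimes f_{i+k+1}\otimes\cdots\otimes f_m\big) \\
&\quad= \sum_{i<j}\epsilon_{ij}\,U_{n-1}([\alpha_i,\alpha_j],\alpha_1,\dots,\hat\alpha_i,\dots,\hat\alpha_j,\dots,\alpha_n)(f_0\otimes\cdots\otimes f_m),
\end{aligned}
\]
where a hat denotes an omitted argument.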
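The statement that $\epsilon(\sigma)$ is the sign obtained by reordering a graded commutative product can be tested by brute force. The sketch below compares the exponent formula for $\epsilon(\sigma)$ with a direct Koszul-sign count over inversions; all function names are hypothetical, and permutations are stored as tuples $(\sigma(1),\dots,\sigma(n))$ with 1-based values.

```python
from itertools import combinations


def shuffles(K, n):
    """Yield the (K, n-K)-shuffles: sigma(1)<...<sigma(K) and sigma(K+1)<...<sigma(n)."""
    for first in combinations(range(1, n + 1), K):
        rest = tuple(j for j in range(1, n + 1) if j not in first)
        yield first + rest


def epsilon_formula(sigma, K, g):
    """Sign from the exponent formula in the text; g[j-1] is the degree g_j."""
    exponent = 0
    for r in range(1, K + 1):
        before = sum(g[s - 1] for s in range(1, sigma[r - 1]))   # sum_{s < sigma(r)} g_s
        placed = sum(g[sigma[s - 1] - 1] for s in range(1, r))   # sum_{s < r} g_{sigma(s)}
        exponent += g[sigma[r - 1] - 1] * (before - placed)
    return -1 if exponent % 2 else 1


def koszul_sign(sigma, g):
    """Sign from transposing graded-commutative factors: one factor
    (-1)**(g_a * g_b) for every pair that ends up in inverted order."""
    exponent = 0
    for a in range(len(sigma)):
        for b in range(a + 1, len(sigma)):
            if sigma[a] > sigma[b]:
                exponent += g[sigma[a] - 1] * g[sigma[b] - 1]
    return -1 if exponent % 2 else 1


degrees = [1, 0, 1, 1, -1]  # sample ghost numbers g_j = p_j - 1
for K in range(len(degrees) + 1):
    for sigma in shuffles(K, len(degrees)):
        assert epsilon_formula(sigma, K, degrees) == koszul_sign(sigma, degrees)
```

The agreement rests on the fact that, for a shuffle, the elements with index below $\sigma(r)$ that have not yet been placed are exactly the second-block elements that $\alpha_{\sigma(r)}$ must pass.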

The sign $-(-1)^{k(i+1)}$ comes from the orientation of the faces of the boundary of $\bar B_m$. A sequence of maps $U_n$ with this property and the symmetry (10) is called an $L_\infty$ morphism [SchlSt, LS, K]. A consequence of it in this case is the formality conjecture: $U_1$, which is (up to sign) the obvious map sending a polyvector field to itself viewed as a polydifferential operator, induces an isomorphism of graded Lie algebras from the graded Lie algebra of polyvector fields to the cohomology of the polydifferential operators, viewed as a complex of the Hochschild cochains of $C^\infty(M)$, see [K].

A special case of this identity is the associativity of the star product: in this case $\alpha$ is a bivector field (a polyvector field of degree 1) obeying the Jacobi identity $[\alpha,\alpha] = 0$. Then we have $U(\alpha) = \sum_{n=0}^\infty \hbar^n\,U_n(\alpha,\dots,\alpha)$ and, by the ghost number condition (9), every $U_n$ is a bidifferential operator. The $L_\infty$ identity reduces to the associativity of $U(\alpha)$:
\[ \sum_{k=0}^{n} U_k(U_{n-k}(f_0\otimes f_1)\otimes f_2) - \sum_{k=0}^{n} U_k(f_0\otimes U_{n-k}(f_1\otimes f_2)) = 0, \]
where we have suppressed the dependence on $\alpha$ in the notation.

5.4. Tadpoles. The perturbative expansion of $U(\alpha)$ contains tadpoles which can be removed as in 3.8 either by choosing a constant angle $\vartheta$, or by replacing $S_\alpha$ by
\[ S'_\alpha = S_\alpha - \frac{i\hbar}{2\pi}\,\frac{1}{p!}\int_D d^2\theta\,\tilde\kappa\,\partial_k\alpha^{k,j_1,\dots,j_p}\,\tilde\eta_{j_1}\cdots\tilde\eta_{j_p}. \]
The arguments of the previous subsections remain valid with the addition of these counterterms, since $\Delta S'_\alpha = (S_0,S'_\alpha) = 0$ and $(S'_\alpha,S'_\beta) = -S'_{[\alpha,\beta]}$.

Acknowledgements. We thank H. Ooguri, whose clarifying comments at an early stage of this work were essential to our understanding of the problem. We also thank J. Fröhlich and C. Schweigert for interesting discussions and J. Stasheff for useful comments on the first draft of this paper.

References

[AKSZ] Alexandrov, M., Kontsevich, M., Schwarz, A. and Zaboronsky, O.: The geometry of the master equation and topological quantum field theory. Internat. J. Modern Phys. A 12, no. 7, 1405–1429 (1997)
[BV] Batalin, I. and Vilkovisky, G.: Gauge algebra and quantization. Phys. Lett. B 102, 27 (1981); Quantization of gauge theories with linearly dependent generators. Phys. Rev. D 29, 2567 (1983)
[BT] Blau, M., Thompson, G.: Topological gauge theories of antisymmetric tensor fields. Ann. Physics 205, no. 1, 130–172 (1991)
[BFFLS] Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A. and Sternheimer, D.: Deformation theory and quantization I, II. Ann. Phys. 111, 61–110, 111–151 (1978)
[CFT] Cattaneo, A.S., Felder, G. and Tomassini, L.: In preparation
[DDV] Dijkgraaf, R., Verlinde H. and Verlinde, E.: Topological strings in d < 1. Nucl. Phys. B 352, 59–86 (1991)
[F] Feynman, R.P.: Space–time approach to non-relativistic quantum mechanics. Rev. Modern Physics 20, 367–387 (1948)
[I] Ikeda, N.: Two-dimensional gravity and nonlinear gauge theory. Ann. Phys. 235, 435–464 (1994)
[K] Kontsevich, M.: Deformation quantization of Poisson manifolds. q-alg/9709040
[LS] Lada, T. and Stasheff, J.: Introduction to sh Lie algebras for physicists. Internat. J. Theoret. Phys. 32, no. 7, 1087–1103 (1993)
[SchStr] Schaller, P. and Strobl, T.: Poisson structure induced (topological) field theories. Modern Phys. Lett. A 9, no. 33, 3129–3136 (1994)
[SchlSt] Schlessinger, M. and Stasheff, J.: The Lie algebra structure of tangent cohomology and deformation theory. J. Pure Appl. Alg. 38, 313–322 (1985)
[S1] Schwarz, A.S.: Geometry of Batalin–Vilkovisky quantization. Commun. Math. Phys. 155, 249–260 (1993)
[S2] Schwarz, A.S.: The partition function of a degenerate quadratic functional and the Ray–Singer invariants. Lett. Math. Phys. 2, 247–252 (1978)
[W1] Witten, E.: A note on the antibracket formalism. Modern Phys. Lett. A 5, no. 7, 487–494 (1990)
[W2] Witten, E.: On the structure of the topological phase of two-dimensional gravity. Nucl. Phys. B 340, 281–332 (1990)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 212, 613 – 624 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Gravitational Anomalies, Gerbes, and Hamiltonian Quantization

C. Ekstrand, J. Mickelsson

Department of Theoretical Physics, Royal Institute of Technology, 100 44 Stockholm, Sweden

Received: 5 May 1999 / Accepted: 30 January 2000

Abstract: In ref. [1], Schwinger terms in hamiltonian quantization of chiral fermions coupled to vector potentials were computed, using some ideas from the theory of gerbes, with the help of the family index theorem for a manifold with boundary. Here, we generalize this method to include gravitational Schwinger terms.

1. Introduction

Chiral anomalies in quantum field theory appear in several different forms. Historically they were first observed in perturbative calculations of certain 1-loop scattering amplitudes, as a breaking of the (classically valid) chiral symmetry, [2]. Later nonperturbative methods were developed for understanding the chiral symmetry breaking in the euclidean path integral formalism, [3, 4]. It was also understood that even the symmetry under coordinate transformations could be broken when quantizing massless fermions.

In the hamiltonian approach to chiral anomalies one considers the equal time commutation relations for the infinitesimal generators of the classical symmetry group. First one constructs the bundle of fermionic Fock spaces parametrized by various external (classical) fields: gauge potentials, metrics, scalar potentials, etc. The quantization of the algebra of currents in the Fock spaces requires some renormalization procedure. In 1 + 1 space-time dimensions usually a normal ordering is sufficient but in higher dimensions certain additional subtractions are needed. Typically the renormalization modifies the algebra of classical symmetries by the so-called Schwinger terms, [5]. In 1 + 1 dimensions the Schwinger term is normally just a c-number, leading to an affine Lie algebra (gauge transformations) or to the Virasoro algebra (diffeomorphisms). In higher dimensions the algebra is more complicated; instead of a central (c-number) extension the Schwinger terms lead to an extension by an abelian ideal [6].
Direct analytic computations of the Schwinger terms in higher dimensions, although they can be carried out [7] in the Yang-Mills case, are very complicated in the case of an external gravitational field. However, there are topological and geometrical methods which give


directly the structure of the quantized current algebra. As in the case of euclidean path integral anomalies, a central ingredient in this discussion is the families index theorem. In a previous paper [1] (see also [8] for a review) it was shown how the Schwinger terms in the gauge current algebra are related, via Atiyah–Patodi–Singer index theory, to the structure of a system of local determinant line bundles in odd dimensions. This system provides an example of a mathematical structure called a bundle gerbe, [9]. In the present paper we want to extend the methods of [1] for constructing the Schwinger terms which arise in fermionic Fock space quantization of the algebra of vector fields on an odd dimensional manifold. In Sect. 2 we set up the notation and recall some basic results in the families index theory in case of compact manifolds with boundaries. In Sect. 3 we compute the curvature forms for a local system of complex line bundles over a parameter space B which consists of gauge potentials and Riemannian metrics. In Sect. 4 we explain how the Schwinger terms in the Fock space quantization of vector fields (and infinitesimal gauge transformations) are obtained from the local curvature formulas. Finally, in Sect. 5 we give some results of explicit computations of the Schwinger terms.

2. The Family Index Theorem

Let $\tilde\pi: \tilde M\to B$ be a smooth fibre bundle with fibres diffeomorphic to a compact oriented spin manifold $M$ of even dimension $2n$. Assume that each fibre $M_z = \tilde\pi^{-1}(z)$, $z\in B$, is equipped with a Riemannian metric. Assume further that $\tilde M$ is equipped with a connection. This means that at each point $x\in\tilde M$ the tangent space splits in a horizontal and a vertical part: $T_x\tilde M = H_x\oplus V_x$, where $V_x$ consists of vectors tangential to the fibers. Let $\tilde E$ be a vector bundle over $\tilde M$ which along each fiber $M_z$ is the tensor product $E_z$ of the Dirac spinor bundle over $M_z$ (with Clifford action of the vertical vectors $V_x$) and a finite dimensional vector bundle $W_z$ (with trivial Clifford multiplication). It has a $\mathbb{Z}_2$ structure provided by the chirality operator $\Gamma$ according to $c(\Gamma) = \pm1$ on $\tilde E^\pm$, where $c$ denotes the Clifford action. $\tilde E$ is assumed to be equipped with a hermitian fiber metric. This naturally induces an $L^2$-metric in the space of sections $\Gamma(M_z,E_z)$. Finally, let $D$ be an operator on $\Gamma(\tilde M,\tilde E)$ which is fiberwise defined as a family of Dirac operators $D_z: \Gamma(M_z,E_z)\to\Gamma(M_z,E_z)$ with $z\in B$. By a Dirac operator we mean an operator that can be written as the sum of compositions of a covariant derivative and a Clifford multiplication.

We would now like to apply the family index theorem to the case described above. Before doing this, some additional assumptions are needed. These are of a different nature depending on whether $M$ has a non-empty boundary or not. When the boundary is empty, it will only be assumed that the set $\{D_z\}_{z\in B}$ consists of self-adjoint Dirac operators. The assumptions in the more difficult case of a non-empty boundary $\partial M$ will now be described. We make the common simplifying assumption that for all $z\in B$ there exists a collar neighbourhood of $\partial M_z$ such that all structures in $E_z$ are of "product type" (see ref. [10]).
This implies that $D_z^+ = D|_{\Gamma(M_z,E_z^+)}$ can be written as $c_t\big(\frac{\partial}{\partial t} + D_z^\partial\big)$ near the boundary, where $c_t$ is the Clifford multiplication by an element corresponding to the coordinate vector field $\frac{\partial}{\partial t}$ (which is along the unit inward normal vector field at the boundary) and $D_z^\partial$ is a self-adjoint Dirac operator on $\tilde E^+|_{\partial M_z}$. Our conventions are such that $c_t$ is unitary.


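Let $\operatorname{Ind}D_\lambda$ be the family $\{\operatorname{Ind}D_{z,\lambda}\}_{z\in U_\lambda}$, where $\operatorname{Ind}D = \ker D^+\ominus\ker D^-$ is the index bundle in the sense of K-theory and $U_\lambda = \{z\in B;\ \lambda\notin\operatorname{spec}D_z^\partial\}$. The notation means that every operator $D_z^+$ will be restricted to the domain $\{\psi\in\Gamma(M_z,E_z^+) \mid P_{z,\lambda}\psi|_{\partial M_z} = 0\}$, while $D_z^-$ will be restricted to $\{\psi\in\Gamma(M_z,E_z^-) \mid (1-P_{z,\lambda})c_t\psi|_{\partial M_z} = 0\}$, where $P_{z,\lambda}$ is the spectral projection of $D_z^\partial$ corresponding to eigenvalues $\ge\lambda$. We will assume that the Dirac operators $D_{z,\lambda}$ are self-adjoint. The family index theorem has been proven in [11] when $\partial M = \emptyset$ and in [12], based on [10], when $\partial M\ne\emptyset$. It reads:
\[
\begin{aligned}
\operatorname{ch}(\operatorname{Ind}D)(z) &= \int_{M_z}\hat A(M_z)\operatorname{ch}(W_z), && \partial M = \emptyset, \\
\operatorname{ch}(\operatorname{Ind}D_\lambda)(z) &= \int_{M_z}\hat A(M_z)\operatorname{ch}(W_z) - \frac12\,\tilde\eta_\lambda(z), && z\in U_\lambda,\ \partial M\ne\emptyset,
\end{aligned}
\tag{1}
\]
where
\[ \hat A(M_z) = {\det}^{1/2}\left(\frac{iR_z/4\pi}{\sinh(iR_z/4\pi)}\right), \qquad \operatorname{ch}(W_z) = \operatorname{tr}\exp(iF_z/2\pi). \]
We choose not to write down the definitions of the form $\tilde\eta_\lambda$ or the curvature 2-forms $\tilde R$ and $\tilde F$ in $\tilde M$, where $R_z = \tilde R|_{M_z}$ and $F_z = \tilde F|_{M_z}$, since we will only need their explicit expression in a simple special case. The only thing about the $\tilde\eta$ form we need to know later is that it depends only on the boundary spectral data. The zero degree part of $\tilde\eta$ is just the $\eta$-invariant of the boundary Dirac operator.

For $\partial M = \emptyset$, the determinant line bundle DET is an object closely related with the index bundle $\operatorname{Ind}D$. It is a line bundle over $B$, fibre-wise defined as $(\det\ker D_z^+)^*\otimes\det\ker D_z^-$. To define it globally over $B$ we must also account for the fact that the dimension of $\ker D_z^+$ and $\ker D_z^-$ can jump as $z$ varies. For a detailed construction, see [13]. For $\partial M\ne\emptyset$ there exists a similar construction of a bundle $\mathrm{DET}_\lambda$ over $U_\lambda$, closely related to $\operatorname{Ind}D_\lambda$, see ref. [14]. In [13] and [14] it has been shown that for $\partial M = \emptyset$ and $\partial M\ne\emptyset$ there exists a connection on DET and $\mathrm{DET}_\lambda$, respectively, naturally associated with the Quillen metric, [15], with curvature given by
\[
\begin{aligned}
\frac{i}{2\pi}F^{\mathrm{DET}}(z) &= \left(\int_{M_z}\hat A(M_z)\operatorname{ch}(W_z)\right)_{[2]}, && \partial M = \emptyset, \\
\frac{i}{2\pi}F^{\mathrm{DET}_\lambda}(z) &= \left(\int_{M_z}\hat A(M_z)\operatorname{ch}(W_z) - \frac12\,\tilde\eta_\lambda(z)\right)_{[2]}, && z\in U_\lambda,\ \partial M\ne\emptyset,
\end{aligned}
\tag{2}
\]
where $[2]$ denotes the part that is a 2-form. Notice that the 2-form part of the right-hand side of the family index theorem, Eq. (1), is equal to the right-hand side of Eq. (2).

In the case of an odd dimensional manifold $N$ one can produce in a similar way an element $\Omega\in H^3(B,\mathbb{Z})$, by an integration over the fibers $N_z$,
\[ \Omega(z) = \left(\int_{N_z}\hat A(N_z)\operatorname{ch}(W_z)\right)_{[3]}, \qquad \partial N = \emptyset, \tag{3} \]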


where this time we pick up the component of the form which is of degree 3 in the tangential directions on the parameter space $B$. This form plays an important role in the hamiltonian quantization of external field problems, [1]. It is the Dixmier–Douady class of a gerbe. A nonvanishing Dixmier–Douady class is an obstruction to quantizing chiral fermions in a gauge invariant manner.

3. Local Line Bundles Over Boundary Geometries

Let $P$ be a fixed principal $G$ bundle over $M$ with a projection $\pi: P\to M$ and $FM$ the oriented frame bundle of $M$. The bundle $FM$ is also a principal bundle, with the structure group $GL^+(2n,\mathbb{R})$, the group of real $2n\times2n$ matrices with positive determinant. Let $Q$ denote the product bundle $P\times FM$. Let $B = \mathcal{A}\times\mathcal{M}$ where $\mathcal{A}$ is the affine space of connections on $P$ and $\mathcal{M}$ is the space of Riemannian metrics on $M$. Locally, an element of $\mathcal{A}$ is written as a $\mathrm{Lie}(G)$ valued 1-form on $M$. We may view $Q$ as a principal bundle $\tilde Q$ over $\tilde M = M\times B$ in a natural way, as the pull-back under the projection $M\times B\to M$. With notations as in the previous section we define $M_z$ as the manifold $M$ with metric given by $z\in B$. Along the model fiber $M$, let $E$ be the tensor product of the Dirac spinor bundle and a vector bundle $W$ over $M$, the latter being an associated bundle to $P(M,G)$. We view $E$ in a natural way as a vector bundle $\tilde E$ over $M\times B$. Finally, we let $D_z: \Gamma(M_z,E_z)\to\Gamma(M_z,E_z)$ be the Dirac operator constructed from $z\in B$ in the usual way; in terms of local coordinates $A = A_\mu dx^\mu$, $\Gamma = \Gamma_\mu dx^\mu$, and with respect to a local orthonormal frame $\{e_a\}_{a=1}^{2n}$ of $TM_z$ we have
\[ D_{(A,\Gamma)} = \sum_{a,\mu=1}^{2n}\gamma^a e_a{}^\mu\,(\partial_\mu + A_\mu + \Gamma_\mu), \]
where the $e_a{}^\mu$'s are the components of the basis vectors $e_a$ in the coordinate frame and $\gamma^a$ is the Clifford multiplication by $e_a$, with $\gamma^a\gamma^b + \gamma^b\gamma^a = 2\delta^{ab}$.

Let $\mathcal{D}$ be the group of orientation preserving diffeomorphisms of $M$ and $\mathcal{G}$ the group of gauge transformations in $P$, i.e., the group of automorphisms of $P(M,G)$ which project to the identity map on the base. The groups $\mathcal{G}$ and $\mathcal{D}$ act through pull-backs on $\mathcal{A}$ and $\mathcal{M}$. The group actions induce a fiber structure in $B = \mathcal{A}\times\mathcal{M}$ but in order to obtain smooth moduli spaces we restrict to the subgroups $\mathcal{G}_0\subset\mathcal{G}$ and $\mathcal{D}_0\subset\mathcal{D}$. The former is the group of based gauge transformations $\phi$ leaving invariant some fixed base point $p_0\in P$, $\phi(p_0) = p_0$. The group $\mathcal{D}_0$ is defined as $\mathcal{D}_0 = \{\phi\in\mathcal{D} \mid \phi(x_0) = x_0 \text{ and } T_{x_0}\phi = \mathrm{id}\}$ for some fixed $x_0\in M$. With these choices, we obtain the smooth fiber bundles $\mathcal{M}\to\mathcal{M}/\mathcal{D}_0$ and $\mathcal{A}\to\mathcal{A}/\mathcal{G}_0$. This leads also to a fibering $B = \mathcal{A}\times\mathcal{M}\to(\mathcal{A}\times\mathcal{M})/(\mathcal{D}_0\ltimes\mathcal{G}_0)$. Note that the group of symmetries is the semidirect product of $\mathcal{D}_0$ and $\mathcal{G}_0$ since locally an element of $\mathcal{G}_0$ is a $G$-valued function on $M$ and the diffeomorphisms act on the argument of the function.

Following [16] we have a connection form $\omega$ on $\tilde Q = P\times FM\times B\to M\times B$ which can be pushed forward to a connection form on $\tilde Q/(\mathcal{D}_0\ltimes\mathcal{G}_0)$. Along $P\times FM$ the form $\omega$ is given by a connection $A\in\mathcal{A}$ and by the Levi–Civita connection given by a metric $g\in\mathcal{M}$. Restricted to the second factor $B = \mathcal{A}\times\mathcal{M}$ the form $\omega$ is called the BRST ghost and will be denoted by $v$. Since the total form $\omega$ should vanish along


gauge and diffeomorphism directions it follows that its value along these directions in $B$ is uniquely determined by the value of the corresponding vector field on $P\times FM$. In the case of gauge potentials, $B = \mathcal{A}$, an infinitesimal gauge transformation is given locally as a $\mathrm{Lie}(G)$ valued function $Z$ on $M$ and then $v_{p,A}(Z_A) = Z(x)$, where $x = \pi(p)$ and $Z_A = \delta_Z A$ is the vector field on $\mathcal{A}$ defined by the infinitesimal gauge transformation $Z$. In the case of diffeomorphisms, $Z$ is the $\mathfrak{gl}(2n,\mathbb{R})$ valued function defined as the Jacobian (in local coordinates) of a vector field on $M$. Again, $v(Z_\Gamma)$ is the "tautological 1-form", $Z$ evaluated at a point $x\in M$. The ghost $v$ in other directions is a nonlocal expression involving the Green's function of a gauge covariant Laplacian, [16, 17], but we shall make explicit use of $v$ only in the gauge and diffeomorphism directions.

Next let $N$ be an odd dimensional manifold without boundary and let $M = [0,1]\times N$, $\dim M = 2n$. Given a principal bundle $P$ over $N$ we can extend it to a principal bundle over $M$ (to be denoted by the same symbol $P$) by a pull-back defined by the projection $[0,1]\times N\to N$. We choose a fixed connection $A_0$ in the principal bundle $P$ over $N$ and a fixed metric $g_0$ in $N$. If $A$ is an arbitrary connection in $P$ we form a connection $A(t)$ (with $t\in[0,1]$) in the principal bundle $P$ over $M$ by $A(t) = (1-f(t))A_0 + f(t)A$, where $f$ is a fixed smooth real valued function such that $f(0) = 0$, $f(1) = 1$, and all the derivatives of $f$ vanish at the end points $t = 0, 1$. Similarly, any metric $g$ in $N$ defines a metric in $M$ such that along $N$ directions it is given by $g_{ij}(t) = (1-f(t))(g_0)_{ij} + f(t)g_{ij}$ and such that $\partial_t$ is a normalized vector field in the $t$ direction, orthogonal to the $N$ directions. However, for computations below it is more convenient to use directly a homotopy connecting the Levi–Civita connection $\Gamma$ (constructed from the metric $g$) to the connection $\Gamma_0$ (constructed from $g_0$).
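A function $f$ with the stated properties is not specified in the text. One standard choice, given here only as an illustration and not as the authors' choice, is built from $\exp(-1/t)$; the names `bump`, `f` and `interpolate` are hypothetical, and connections are modelled crudely as lists of numbers.

```python
import math


def bump(t):
    """exp(-1/t) for t > 0 and 0 otherwise; smooth, with all derivatives zero at t = 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def f(t):
    """Smooth step on [0, 1]: f(0) = 0, f(1) = 1, flat to all orders at both end points."""
    return bump(t) / (bump(t) + bump(1.0 - t))


def interpolate(A0, A, t):
    """Componentwise A(t) = (1 - f(t)) * A0 + f(t) * A for two lists of components."""
    w = f(t)
    return [(1.0 - w) * a0 + w * a for a0, a in zip(A0, A)]


assert f(0.0) == 0.0 and f(1.0) == 1.0
assert abs(f(0.5) - 0.5) < 1e-12
assert interpolate([0.0, 0.0], [2.0, 4.0], 1.0) == [2.0, 4.0]
```

The same weight `f(t)` would be used for the metric $g_{ij}(t)$ and for the homotopy between $\Gamma_0$ and $\Gamma$; the flatness at the end points is what makes the structures of product type near the boundary of $[0,1]\times N$.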
The formula for this homotopy is the same as for the gauge potentials above. We use the "Russian formula", which is just an expression of the fact that in a principal bundle with a connection the curvature form has no components along fiber directions. The formula tells that when the total curvature on $P\times FM\times\mathcal{A}\times\mathcal{M}$ is evaluated along vertical directions in $\mathcal{A}\times\mathcal{M}\to(\mathcal{A}\times\mathcal{M})/(\mathcal{G}_0\times\mathcal{D}_0)$ and along vector fields on $P\times FM$ the result is
\[ F^\omega = F^{A,\Gamma} = dA + \frac12[A,A] + d\Gamma + \frac12[\Gamma,\Gamma], \]
where $\Gamma$ is the Levi–Civita connection. Next we replace $\omega = A + \Gamma + v$ by the "time" dependent connection
\[ \omega(t) = (1-f(t))(A_0+\Gamma_0) + f(t)(A+\Gamma+v). \]
An evaluation of the curvature of a Dirac determinant bundle over $\mathcal{A}\times\mathcal{M}$ involves an integration of a characteristic class $p_{n+1}(F^{\omega(t)})$ over $M = [0,1]\times N$ and an evaluation of the $\tilde\eta$ form on the boundary. Here $p_i$ denotes a generic homogeneous symmetric invariant polynomial of degree $i$ in the curvature. Actually we have to restrict the construction of the determinant bundle to subsets $U_\lambda$ in the parameter space $B$ on the boundary. Here $U_\lambda$ is again the set of those points in $B$ such that the associated Dirac operator on $\partial M$ does not have the eigenvalue $\lambda$. These


sets form an open cover of $B$. In each of the sets $U_\lambda$ one can define the $\tilde\eta$ form associated to Dirac operators $D_z - \lambda$ as a continuous function of the parameters $z\in B$. We shall restrict to the problem of determining the curvature along gauge and diffeomorphism directions on the boundary. The $\tilde\eta$ form is a spectral invariant and therefore the only term which contributes is the appropriate characteristic class in the bulk $M$. The integration of the index density in the bulk can be performed in two steps. First one integrates over the time variable $t$ and then the resulting expression is integrated over $N$ to produce a 2-form on $\mathcal{A}(N)\times\mathcal{M}(N)$. All the computations involving the ghost $v$ are restricted to vertical directions. Restricting to the case of gauge potentials (calculations involving the Levi–Civita connections are performed in the same way) we have
\[ F^{\omega(t)}_{ij} = \partial_iA_j(t) - \partial_jA_i(t) + [A_i(t),A_j(t)], \qquad F^{\omega(t)}_{0i} = f'(t)(A-A_0)_i, \]
where we use the index 0 for the $t$ component. For intermediate times $0 < t < 1$ the curvature has components also to the vertical directions,
\[ (F^{\omega(t)})_{[0,2]} = \frac12 f(f-1)[v,v], \qquad (F^{\omega(t)})_{[1,1]} = v f'(t)\,dt + f(f-1)[A-A_0,v]. \]
Here we have denoted by $(F)_{[i,j]}$ the component of a form $F$ which is of degree $i$ in the tangential directions in $P$ and of degree $j$ in the ghost $v$. If $p_k$ is any homogeneous symmetric function of degree $k$ of the curvature we set $p_k(A,F) = p_k(A,F,F,\dots,F)$ and then
\[ \int_M p_k(F^{\omega(t)}) = k\int_N\int_0^1 p_k\big(f'(t)(A+v-A_0),F^{\omega(t)}\big)\,dt \equiv \int_N\omega_{2k-1}(A+v,A_0). \tag{4} \]
The form on the right, when expanded in powers of the ghost $v$, gives forms of various degrees on the parameter space $B$. We are interested in the curvature form which is of degree 2 in $v$. The degree zero part gives just the Chern–Simons form $\omega_{2k-1}(A,A_0)$ and if $N$ were an even dimensional manifold the degree 1 term would be the nonabelian gauge anomaly. In low dimensions one gets familiar explicit formulas; as an example, consider the case of a trivial bundle and $A_0 = 0$. When $n = 1$ the relevant characteristic class is $p_2(F) = \frac{1}{2!}\big(\frac{i}{2\pi}\big)^2\operatorname{tr}F^2$ and the curvature $c_1$ along vector fields given by a pair $X, Y$ of infinitesimal gauge transformations on the one-dimensional manifold $N$ is
\[ \frac{i}{2\pi}\,c_1(X,Y) = \frac{1}{8\pi^2}\int_N\operatorname{tr}A[X,Y], \]
and in the case $n = 2$, $\dim N = 3$, $p_3(F) = \frac{1}{3!}\big(\frac{i}{2\pi}\big)^3\operatorname{tr}F^3$, one gets
\[ \frac{i}{2\pi}\,c_3(X,Y) = \frac{1}{48\pi^3}\int_N\operatorname{tr}\big((A\,dA + dA\,A + A^3)[X,Y] + X\,dA\,Y\,A - Y\,dA\,X\,A\big). \]


The case of the Levi–Civita connection needs some extra remarks. The reason is that we have actually two types of Chern–Simons forms (and associated anomaly forms) depending on whether we write the connection with respect to a (local) orthonormal frame in the tangent bundle $TM$ or with respect to the holonomic frame given by coordinate vector fields. Formally, the two Chern–Simons forms (and associated polynomials in $v$) look the same; they are given exactly by the same differential polynomials in $\Gamma^\lambda_{\mu\nu}$ (coordinate basis) or in $\Gamma^b_{\mu a}$ (with respect to an anholonomic basis $e_a$). The difference is (locally) an exterior derivative of a form (in $N$) of lower degree. The difference $\Delta\omega$ of the Chern–Simons forms involves the matrix function $e_a{}^\mu(x)$ on $N$. Since this function takes values in the group $GL(2n-1,\mathbb{R})$, which topologically is equivalent to $SO(2n-1)$, there might be a topological obstruction for writing $\Delta\omega$ globally as $d\theta$ for some form $\theta$. The potential obstruction is the winding number of the map $e: N\to GL(2n-1,\mathbb{R})$, given by the (normalized) integral
\[ w(e) = \left(\frac{i}{2\pi}\right)^{n+1}\frac{1}{(2n-1)!}\int_N\operatorname{tr}(e^{-1}de)^{2n-1}. \]
The choice $\Gamma^a_{\mu b}$ in the anholonomic frame $e_a$ leads to a diffeomorphism invariant integral $\int_N\omega_{2n-1}(A,A_0)$ and there are no anomalies or 2-forms along $\mathrm{Diff}(N)$ orbits in $\mathcal{M}$. On the other hand, there is a frame bundle anomaly related to local frame rotations; this takes exactly the same form as the pure gauge anomaly discussed above, [18]. The choice $\Gamma^\lambda_{\mu\nu}$ in the coordinate frame is insensitive to the frame rotations $e_a\to e'_a$ but it responds to a local change of coordinates. Explicit formulas for the forms $\omega_{2n-1}$ along $\mathrm{Diff}(N)$ orbits are given in Sect. 5.

4. Schwinger Terms from the Local System of Line Bundles

As we saw in the previous section, the APS index theorem gives us a system of local determinant bundles $DET_\lambda$ over certain open sets $U_\lambda \subset B$. The infinite-dimensional group $K = D_0 \times G_0$ acts on the parameter space B, mapping each of the subsets $U_\lambda$ onto itself. We denote k = Lie(K). In general, the determinant bundles are topologically nontrivial and one cannot lift the action of K directly to the total space of $DET_\lambda$. Instead, there is an extension $\hat{K}$ which acts in the determinant bundles. The Lie algebra $\hat{k}$ of $\hat{K}$ is given in a standard way. It consists of pairs (X, α), where X ∈ k and α is a function on B, with commutation relations
\[ [(X, \alpha), (Y, \beta)] = ([X, Y],\; L_X \beta - L_Y \alpha + c(X, Y; \cdot)), \]
where the Schwinger term c is a purely imaginary function on B and an antisymmetric bilinear function on k. It is defined as the value of the curvature of $DET_\lambda$ at the given point in B along the vector fields X, Y on B. Here $L_X$ denotes the Lie derivative on B along the vector field X. The Jacobi identity in $\hat{k}$ is an immediate consequence of the fact that the curvature is a closed 2-form on $U_\lambda \subset B$.

Let $U_{\lambda\lambda'} = U_\lambda \cap U_{\lambda'}$. Over $U_{\lambda\lambda'}$ there is a natural complex line bundle $DET_{\lambda\lambda'}$ such that the fiber at a point z is the top exterior power of the finite-dimensional vector space spanned by all eigenvectors of the Dirac operator $D_z$ on N with eigenvalues in the range $\lambda < \mu < \lambda'$. If $\lambda' < \lambda$, we set $DET_{\lambda\lambda'} = DET^*_{\lambda'\lambda}$. By construction, we have a natural isomorphism
\[ DET_{\lambda\lambda'} \otimes DET_{\lambda'\lambda''} \cong DET_{\lambda\lambda''} \]
for all triples $\lambda, \lambda', \lambda''$.

C. Ekstrand, J. Mickelsson

Theorem 1 ([1]). For any pair $\lambda, \lambda'$ of real numbers one has
\[ DET_{\lambda\lambda'} \cong DET_\lambda \otimes DET^*_{\lambda'} \]
over the set $U_{\lambda\lambda'}$.

Note that even though in [1] the discussion was mainly around the case of gauge potentials and gauge transformations, the proof of the theorem was abstract and very general, not depending on the particular type of parameter space for Dirac operators. In the gerbe terminology, the content of this theorem is that the gerbe defined by the system of local line bundles $DET_{\lambda\lambda'}$ is trivial. The line bundles can be pushed forward to give a family of local line bundles on B/K, since the spectral subspaces transform equivariantly under gauge transformations and changes of coordinates. However, over B/K the gerbe is no longer trivial, i.e., it cannot be given as tensor products of local line bundles over the sets $pr(U_\lambda)$, where pr : B → B/K is the canonical projection. The obstruction to the trivialization is an element of $H^3(B/K, Z)$, the Dixmier–Douady class of the gerbe. In [1] the DD class was computed from the index theory in the case of Yang–Mills theory; the generalization to the case involving metrics and diffeomorphisms is straightforward, and the free part of the cohomology class is given by the integral formula (3), with B replaced by B/K.

The importance of the above theorem comes from the following simple observation. Let $H_z = H_+(z, \lambda) \oplus H_-(z, \lambda)$ be the spectral decomposition of the fermionic “1-particle” Hilbert space with respect to a spectral cut at λ ∈ R, not in the spectrum of $D_z$. This determines a representation of the CAR algebra in a Fock space F(z, λ), with a normalized vacuum vector $|z, \lambda\rangle$. The defining property of this representation is that
\[ a(u)|z, \lambda\rangle = 0 = a^*(u')|z, \lambda\rangle \]
for $u \in H_+(z, \lambda)$ and $u' \in H_-(z, \lambda)$. All creation operators $a^*(u)$ and annihilation operators $a(u)$ anticommute, except that
\[ a^*(u)a(u') + a(u')a^*(u) = \langle u', u \rangle, \]
where $\langle \cdot, \cdot \rangle$ is the inner product in $H_z$.
If we change the vacuum level from λ to $\lambda' > \lambda$, we have an isomorphism $F(z, \lambda) \to F(z, \lambda')$ which is natural up to a multiplicative phase. The phase is fixed by a choice of normalized eigenvectors $u_1, u_2, \dots, u_p$ in the energy range $\lambda < D_z < \lambda'$ and setting $|z, \lambda'\rangle = a^*(u_1) \cdots a^*(u_p)|z, \lambda\rangle$. But this choice is exactly the same as choosing a (normalized) element in $DET_{\lambda\lambda'}$ over the point z ∈ B. Thus, setting $F_z = F(z, \lambda) \otimes DET_\lambda(z)$ for any λ not in the spectrum of $D_z$ we obtain, according to Theorem 1, a family of Fock spaces parametrized by points of B which does not depend on the choice of λ, [19]. This gives us a smooth Fock bundle F over B. The K action on the base lifts to a $\hat{K}$ action in F, the extension part in $\hat{K}$ coming entirely from the action in the determinant bundles $DET_\lambda$.

The Schrödinger wave functions for quantized fermions in background fields (parametrized by points of B) are sections of the Fock bundle. It follows that the Schwinger terms for the infinitesimal generators of $\hat{K}$, acting on Schrödinger wave functions, are given by the formula for c which describes the curvature of the determinant bundle in the K directions. In the case of B = A and K = G, the elements in the Lie algebra are the Gauss law generators. This case was discussed in detail in [1]. More generally, we give explicit formulas for the Schwinger terms in Sect. 5.

5. Explicit Computations

The Schwinger term in (2n − 1)-dimensional space will now be computed. This will be done using the notation for Yang–Mills theory, but it works for diffeomorphisms as well if a different symmetric invariant polynomial is used. Equation (4) gives that
\[ \omega_{2n+1}(A + v, A_0) = (n + 1) \int_0^1 f'\, p_{n+1}\Big( A + v - A_0,\; f\,dA + f^2 A^2 + (1 - f)\,dA_0 + (1 - f)^2 A_0^2 - f(f - 1)[A_0, A] + f(f - 1)[A - A_0, v] + \tfrac{1}{2} f(f - 1)[v, v] \Big)\, dt. \]
Here and below, the final argument of $p_{n+1}$ is understood to be repeated as often as needed to fill all n + 1 slots, as in the explicit cases n = 1, 2, 3 listed further down. The Schwinger term can be calculated from this expression. However, since we are only interested in the Schwinger term up to a coboundary, an alternative is to use the “triangle formula” as in [20]:
\[ \omega_{2n+1}(A + v, A_0) \sim \omega_{2n+1}(A_0 + v, A_0) + \omega_{2n+1}(A + v, A_0 + v), \]
where “∼” means equality up to a coboundary with respect to d + δ, where δ is the BRST operator. This gives a simpler expression for the non-integrated Schwinger term and also for all other ghost degrees of $\omega_{2n+1}(A + v, A_0)$. Straightforward computations give the result
\[ \omega_{2n+1}(A + v, A_0)_{(2)} \sim \frac{(n + 1)n}{2}\, p_{n+1}\big( v,\; dv + [A_0, v],\; dA_0 + A_0^2 \big) + \frac{(n + 1)n(n - 1)}{2} \int_0^1 f'(1 - f)^2\, p_{n+1}\Big( A - A_0,\; dv + [A_0, v],\; dv + [A_0, v],\; f\,dA + f^2 A^2 + (1 - f)\,dA_0 + (1 - f)^2 A_0^2 - f(f - 1)[A_0, A] \Big)\, dt, \]
where the index (2) means the part of the form that is quadratic in the ghost. Inserting n = 1, 2 and 3 gives:
\[ \omega_3(A + v, A_0)_{(2)} \sim p_2\big( v,\; dv + [A_0, v] \big), \]
\[ \omega_5(A + v, A_0)_{(2)} \sim 3 p_3\big( v,\; dv + [A_0, v],\; dA_0 + A_0^2 \big) + p_3\big( A - A_0,\; dv + [A_0, v],\; dv + [A_0, v] \big), \]
\[ \omega_7(A + v, A_0)_{(2)} \sim 6 p_4\big( v,\; dv + [A_0, v],\; dA_0 + A_0^2,\; dA_0 + A_0^2 \big) + p_4\Big( A - A_0,\; dv + [A_0, v],\; dv + [A_0, v],\; dA + \tfrac{2}{5} A^2 + 3\,dA_0 + \tfrac{12}{5} A_0^2 + \tfrac{3}{5}[A_0, A] \Big). \]
This gives expressions for the non-integrated Schwinger term in a pure Yang–Mills potential if $p_{n+1}$ is the symmetrized trace. The appropriate polynomial to use for the Levi–Civita connection is $p_{n+1} = \hat{A}(M)_{n+1}$, according to Eq. (2). Using
\[ \hat{A}(M) = 1 + \left(\frac{1}{4\pi}\right)^2 \frac{1}{12}\, \mathrm{tr}\, R^2 + \left(\frac{1}{4\pi}\right)^4 \left[ \frac{1}{288} \big(\mathrm{tr}\, R^2\big)^2 + \frac{1}{360}\, \mathrm{tr}\, R^4 \right] + \dots \]

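gives for n = 1 and 2:
\[ \omega_3(\Gamma + v, \Gamma_0)_{(2)} \sim \left(\frac{1}{4\pi}\right)^2 \frac{1}{12}\, \mathrm{tr}\big( v\,dv + 2\Gamma_0 v^2 \big), \]
\[ \omega_5(\Gamma + v, \Gamma_0)_{(2)} \sim 0. \]
Since the expression for $\omega_7$ is rather long we omit it. However, for the special case $\Gamma_0 = 0$ it becomes:
\[ \omega_7(\Gamma + v, 0)_{(2)} \sim \left(\frac{1}{4\pi}\right)^4 \left[ \frac{1}{288} \cdot \frac{2}{3}\, \mathrm{tr}(\Gamma\,dv)\, \mathrm{tr}\Big( dv \big( d\Gamma + \tfrac{2}{5}\Gamma^2 \big) \Big) + \frac{1}{288} \cdot \frac{1}{3}\, \mathrm{tr}\big((dv)^2\big)\, \mathrm{tr}\Big( \Gamma \big( d\Gamma + \tfrac{2}{5}\Gamma^2 \big) \Big) + \frac{1}{360} \cdot \frac{1}{3}\, \mathrm{tr}\Big( \big( R - \tfrac{3}{5}\Gamma^2 \big) \big( \Gamma (dv)^2 + (dv)\Gamma\,dv + (dv)^2 \Gamma \big) \Big) \right]. \]
This expression can be simplified by subtracting the coboundary
\[ \left(\frac{1}{4\pi}\right)^4 \frac{1}{288} \cdot \left[ \frac{2}{3}\, \delta\Big( \mathrm{tr}(\Gamma\,dv)\, \mathrm{tr}\big( \Gamma\,d\Gamma + \tfrac{4}{5}\Gamma^3 \big) \Big) + d\Big( \mathrm{tr}(v\,dv)\, \mathrm{tr}\big( \Gamma\,d\Gamma + \tfrac{2}{3}\Gamma^3 \big) \Big) \right]. \]
The result is
\[ \omega_7(\Gamma + v, 0)_{(2)} \sim \left(\frac{1}{4\pi}\right)^4 \left[ \frac{1}{288}\, \mathrm{tr}(v\,dv)\, \mathrm{tr}\, R^2 + \frac{1}{360} \cdot \frac{1}{3}\, \mathrm{tr}\Big( \big( R - \tfrac{3}{5}\Gamma^2 \big) \big( \Gamma (dv)^2 + (dv)\Gamma\,dv + (dv)^2 \Gamma \big) \Big) \right]. \]
The gravitational Schwinger terms are obtained by multiplying with the normalization factor $(i/2\pi)^{-1}$, inserting the integration over N, and evaluating on vector fields X and Y on M generating diffeomorphisms. The Levi–Civita connection and curvature have components $(\Gamma)^i_j = \Gamma^i_{kj}\, dx^k$ and $(R)^i_j = R^i_{jkl}\, dx^k \wedge dx^l$. Recall that $(v(X))^i_j = \partial_j X^i$; see, for instance, [18]. To illustrate how the Schwinger terms can be computed, we give the result for 1 space dimension:
\[ -2\pi i \int_N \omega_3(\Gamma + v, \Gamma_0)_{(2)}(X, Y) \sim -\frac{i}{48\pi} \int_N (\partial_x X)\, \partial_x^2 Y\, dx, \]
where “∼” now means equality up to a coboundary with respect to the BRST operator. When both a Yang–Mills field and gravity are present, the relevant polynomial is a sum of polynomials of type $p_k\big(F^{\omega(t)}\big)\, \tilde{p}_l\big(R^{\omega(t)}\big)$,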

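where the curvatures $F^{\omega(t)}$ and $R^{\omega(t)}$ are with respect to pure Yang–Mills and pure gravity, respectively. This gives
\[ \int_M p_k\big(F^{\omega(t)}\big)\, \tilde{p}_l\big(R^{\omega(t)}\big) = \int_N \int_0^1 \Big[ k\, p_k\big( f'(t)(A + v_A - A_0),\; F^{\omega(f(t))} \big)\, \tilde{p}_l\big( R^{\omega(h(t))} \big) + l\, p_k\big( F^{\omega(f(t))} \big)\, \tilde{p}_l\big( h'(t)(\Gamma + v_\Gamma - \Gamma_0),\; R^{\omega(h(t))} \big) \Big]\, dt. \]
The expression is independent of f and h (see below). With a choice such that $f'(t) = 0$ for t ∈ [1/2, 1] and $h'(t) = 0$ for t ∈ [0, 1/2], this implies that
\[ \int_M p_k\big(F^{\omega(t)}\big)\, \tilde{p}_l\big(R^{\omega(t)}\big) = \int_N \Big( \omega_{2k-1}(A + v_A, A_0)\, \tilde{p}_l(R_0) + p_k(F)\, \tilde{\omega}_{2l-1}(\Gamma + v_\Gamma, \Gamma_0) \Big). \tag{5} \]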

Thus, the Schwinger term in combined Yang–Mills and gravity is, up to a coboundary, equal to the part of the expansion of (5) that is of second ghost degree. In particular, this implies that Schwinger terms which have one Yang–Mills ghost and one diffeomorphism ghost are in cohomology equal to the Schwinger term obtained from the form in (5). Thus, truly mixed Schwinger terms do not exist. Notice that if the background fields are vanishing then the Schwinger term is gravitational (although some parts of the form degrees are taken up by the Yang–Mills polynomial). This can give anomalies of Virasoro type in higher dimensions. Observe that there is nothing special about gravity: a Yang–Mills Schwinger term is obtained by interchanging the roles of f and h. This does not, however, mean that the gravitational Schwinger term differs from the Yang–Mills Schwinger term by a coboundary. The terms with k = 0, respectively l = 0, ruin this argument.

It is easy to see that our method of computing the Schwinger term agrees with one of the most common approaches: the polynomial $p_k(F^n) - p_k(F_0^n)$ is written as (d + δ) applied to a form, the (non-integrated) Chern–Simons form. The Schwinger term is given by the part of the Chern–Simons form that is quadratic in the ghost. For the case when both Yang–Mills and gravity are present, the relevant polynomial is a sum of polynomials
\[ p_k(F)\, \tilde{p}_l(R) - p_k(F_0)\, \tilde{p}_l(R_0). \tag{6} \]
There is an ambiguity in the definition of the Chern–Simons form; it is for instance possible to add forms of type (d + δ)χ to it. However, an ambiguity of this type will only change the Schwinger term by a coboundary. It will now be shown that the ambiguity in the definition of the Chern–Simons form is only of this type. Thus, we must prove that closedness with respect to (d + δ) implies exactness. This can be done by introducing the degree 1 derivation κ defined on the generators by
\[ \kappa (d + \delta)(A + v_A) = A + v_A, \quad \kappa (d + \delta)(\Gamma + v_\Gamma) = \Gamma + v_\Gamma, \quad \kappa\, dA_0 = A_0, \quad \kappa\, d\Gamma_0 = \Gamma_0, \]
and otherwise zero. Then κ(d + δ) + (d + δ)κ is a degree 0 derivation which is equal to 1 on the generators. Therefore, if χ is closed with respect to (d + δ), then χ is proportional to (d + δ)κχ. An example of a (non-integrated) Chern–Simons form for the polynomial in (6) is
\[ \omega_{2k-1}(A + v_A, A_0)\, \tilde{p}_l(R_0) + p_k(F)\, \tilde{\omega}_{2l-1}(\Gamma + v_\Gamma, \Gamma_0). \]
This is in complete agreement with (5).

References

1. Carey, A.L., Mickelsson, J. and Murray, M.K.: Index theory, gerbes and Hamiltonian quantization. Commun. Math. Phys. 183, 707 (1997), hep-th/9511151
2. Adler, S.: Axial vector vertex in spinor electrodynamics. Phys. Rev. 177, 2426 (1969); Bell, J. and Jackiw, R.: A PCAC puzzle: π0 → γγ in the sigma model. Nuovo Cimento 60A, 47 (1969)
3. Jackiw, R. and Rebbi, C.: Conformal properties of a Yang–Mills pseudoparticle. Phys. Rev. D 14, 517 (1976); Nielsen, N.K., Römer, H. and Schroer, B.: Classical anomalies and a local version of the Atiyah–Singer index theorem. Phys. Lett. B 70, 445 (1977); Hawking, S.W.: Gravitational instantons. Phys. Lett. A 60, 81 (1977)
4. Alvarez-Gaumé, L. and Witten, E.: Gravitational anomalies. Nucl. Phys. B 234, 269 (1984)
5. Schwinger, J.: Field theory commutators. Phys. Rev. Lett. 3, 296 (1959)
6. Mickelsson, J.: Chiral anomalies in even and odd dimensions. Commun. Math. Phys. 97, 361 (1985); On a relation between massive Yang–Mills theories and dual string models. Lett. Math. Phys. 7, 45 (1983); Faddeev, L. and Shatashvili, S.: Algebraic and Hamiltonian methods in the theory of nonabelian anomalies. Theoret. Math. Phys. 60, 770 (1985)
7. Mickelsson, J.: Wodzicki residue and anomalies of current algebras. In: Integrable Models and Strings, ed. by A. Alekseev et al., Springer Lecture Notes in Physics 436; hep-th/9404093
8. Carey, A.L., Mickelsson, J. and Murray, M.K.: Bundle gerbes applied to field theory. hep-th/9711133, Rev. Math. Phys. 12, 65 (2000)
9. Murray, M.K.: Bundle gerbes. J. London Math. Soc. (2) 54, 403 (1996), dg-ga/9407015
10. Atiyah, M.F., Patodi, V.K. and Singer, I.M.: Spectral asymmetry and Riemannian geometry I. Math. Proc. Camb. Phil. Soc. 77, 43 (1975)
11. Atiyah, M.F. and Singer, I.M.: The index of elliptic operators (IV). Ann. Math. 93, 119 (1971)
12. Bismut, J.M. and Cheeger, J.: Family index for manifolds with boundary, superconnections, and cones. I. J. Funct. Anal. 89, 313 (1990); Family index for manifolds with boundary, superconnections, and cones. II. J. Funct. Anal. 90, 306 (1990)
13. Bismut, J.M. and Freed, D.S.: The analysis of elliptic families: Metrics and connections on determinant line bundles. Commun. Math. Phys. 106, 159 (1986); The analysis of elliptic families: Dirac operators, eta invariants and the holonomy theorem of Witten. Commun. Math. Phys. 107, 103 (1986)
14. Piazza, P.J.: Determinant bundles, manifolds with boundary and surgery. Commun. Math. Phys. 178, 597 (1996)
15. Quillen, D.: Determinants of Cauchy–Riemann operators on Riemann surfaces. Funct. Anal. Appl. 19, 31 (1985)
16. Atiyah, M.F. and Singer, I.M.: Dirac operators coupled to vector potentials. Proc. Nat. Acad. Sci. 81, 2596 (1984)
17. Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds, Sect. 5.2.3. Clarendon Press, Oxford, 1990
18. Bardeen, W. and Zumino, B.: Consistent and covariant anomalies in gauge and gravitational theories. Nucl. Phys. B 244, 421 (1984)
19. Mickelsson, J.: On the Hamiltonian approach to commutator anomalies in (3+1) dimensions. Phys. Lett. B 241, 70 (1990)
20. Mañes, J., Stora, R. and Zumino, B.: Algebraic study of chiral anomalies. Commun. Math. Phys. 102, 157 (1985)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 212, 625 – 647 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Picard–Fuchs Uniformization and Modularity of the Mirror Map

Charles F. Doran

Department of Mathematics, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]

Received: 14 August 1999 / Accepted: 30 January 2000

Abstract: Arithmetic properties of mirror symmetry (type IIA-IIB string duality) are studied. We give criteria for the mirror map q-series of certain families of Calabi–Yau manifolds to be automorphic functions. For families of elliptic curves and lattice polarized K3 surfaces with surjective period mappings, global Torelli theorems allow one to present these criteria in terms of the ramification behavior of natural algebraic invariants – the functional and generalized functional invariants respectively. In particular, when applied to one parameter families of rank 19 lattice polarized K3 surfaces, our criterion demystifies the Mirror-Moonshine phenomenon of Lian and Yau and highlights its non-monstrous nature. The lack of global Torelli theorems and the presence of instanton corrections make Calabi–Yau threefold families more complicated. Via the constraints of special geometry, the Picard–Fuchs equations for one parameter families of Calabi–Yau threefolds imply a differential equation criterion for automorphicity of the mirror map in terms of the Yukawa coupling. In the absence of instanton corrections, the projective periods map to a twisted cubic space curve. A hierarchy of “algebraic” instanton corrections correlated with the differential Galois group of the Picard–Fuchs equation is proposed.

Present address: Center for Geometry and Mathematical Physics, Department of Mathematics, Pennsylvania State University, University Park, PA 16802, USA

1. Introduction

Numerous remarkable properties of the type IIA-IIB string duality, better known as mirror symmetry, have been revealed since its discovery a decade ago. Mathematically this symmetry entails a correspondence between complex moduli in one family of Calabi–Yau manifolds and Kähler moduli of a mirror family. In the neighborhood of a large complex structure/large radius limit point, mirror symmetry is described by the mirror map q-series. The mirror map is a locally holomorphic function determined by the behavior of fundamental solutions to the Picard–Fuchs equation for periods of a Calabi–Yau family about a point of maximal unipotent monodromy. For a family of Calabi–Yau threefolds the mirror map q-series and the Yukawa couplings determine a generating function for the Gromov–Witten invariants. These invariants (conjecturally) count the number of rational curves on a generic member of the family. In fact, the original predictions of Candelas [2] for the one parameter family of Fermat-type quintic Calabi–Yau hypersurfaces have now been proven mathematically [22].

In a series of papers, Lian and Yau [23–26] investigate arithmetic properties of the mirror maps of several “torically defined” families of elliptic curves, K3 surfaces, and Calabi–Yau threefolds constructed in their work with Hosono, Klemm, Roan and Theisen [14, 15, 19, 16, 18]. Each of the one parameter families of elliptic curves and K3 surfaces they study has a globally defined mirror map, automorphic with respect to the global monodromy group of the family. The mirror maps of these elliptic curve families are classical modular functions for finite index subgroups of PSL(2, Z), while the mirror maps of the K3 surface families are, up to an additive integer correction, always reciprocals of some McKay–Thompson series associated to the monster in the “Monstrous Moonshine” lists of Conway and Norton [4]. In particular, the mirror maps of their examples are always automorphic functions for genus zero subgroups of PSL(2, R), a phenomenon Lian and Yau dub “Mirror-Moonshine”.

When such modularity properties are possessed by a mirror map, other properties of potential physical interest can be derived: e.g., integrality of the mirror map and prepotential, congruences satisfied by the mirror map coefficients, the effect on instanton corrections, etc. Thus a question of mathematical interest and physical relevance is:

Question 1. When is the mirror map an automorphic function?

Unlike other questions regarding the mirror map studied in the literature, this is an inherently global question.
We are asking for which families of Calabi–Yau manifolds the mirror map admits an extension to a map from the whole period domain to the entire base of the family. Our question is related to the classical problem of characterizing modular relations between automorphic functions and the elliptic modular function. In fact, for families of elliptic curves we will see in Sect. 2.1 that this is all that is involved: we recover the classical criterion for just such modular relations from [11, 1, 41].

In [7, 8] we answer our question for families over P¹. For elliptic curve families we use Kodaira’s functional invariant J to pull back the uniformizing differential equation for the elliptic modular function from the coarse moduli space of elliptic curves (the J-line). The existence of the functional invariant J can be interpreted as a consequence of the (trivial) classical analogue of the global Torelli theorem. In the case of lattice polarized K3 surface families, we apply the global Torelli theorem of Nikulin (see the lists of related works in Dolgachev [6]) to define a generalized functional invariant, mapping again from the base of a family to the associated coarse moduli space. We use this generalized functional invariant to explain the Mirror-Moonshine phenomenon for families of K3 surfaces over P¹ with third order Picard–Fuchs differential equations – the setting in which the Mirror-Moonshine Conjecture of Lian and Yau was originally formulated.

The basic idea behind our approach to answering the modularity question for one parameter families is quite simple: the mirror map of a family of elliptic curves (resp. rank 19 lattice polarized K3 surfaces) is classically modular (resp. automorphic) if and only if the Picard–Fuchs differential equation is a classical uniformizing differential equation (resp. the symmetric square of one). We call this Picard–Fuchs uniformization.
In this paper, instead of deriving the modularity criterion “from scratch” from the local behavior of uniformizing differential equations on P¹, we use the theory of branched covers of orbifolds as described by Namba [31]. This approach gives us directly the modularity criterion in the neatest possible form, and applies to both

1. one parameter families of elliptic curves (Sect. 2.1) and rank 19 lattice polarized K3 surfaces (Sect. 3.1) over a base curve of arbitrary genus, and
2. multiparameter families of lattice polarized K3 surfaces with surjective period mapping (Sect. 3.2).

We replace the uniformizing differential equations for the elliptic curve and lattice polarized K3 surface families with holomorphic projective connections and holomorphic conformal connections respectively. Picard–Fuchs uniformization occurs when the Gauss–Manin connection of such a family of elliptic curves (resp. lattice polarized K3 surfaces) is a holomorphic projective (resp. holomorphic conformal) connection.

The lack of a global Torelli theorem for Calabi–Yau threefolds (in particular, no presentation of the coarse moduli space as a locally symmetric space) prevents one from algebraically defining generalized functional invariants or mimicking the previous arguments for elliptic curves and K3 surfaces. Instead of an algebraic criterion for modularity of the mirror map, we must settle for a differential algebraic one in general. The Picard–Fuchs equation for a one parameter family of Calabi–Yau manifolds with $h^{2,1} = 1$ has order four. Moreover the constraints imposed by special geometry imply that about a point of maximal unipotent monodromy there is a set of fundamental solutions of the form
\[ u, \quad u \cdot t, \quad u \cdot \dot{F}, \quad u \cdot (t \dot{F} - 2F), \]
where u(z) is the fundamental solution locally holomorphic at the point of maximal unipotent monodromy, t(z) is the mirror map, and F(z) is the prepotential (the derivative $\dot{F}$ is taken with respect to the mirror map coordinate t). Following Lian and Yau, one can derive a “quantum Schwarzian equation” relating the second order coefficient of the Picard–Fuchs equation, the mirror map, and the Yukawa couplings (Sect. 4.1).
In the absence of instanton corrections, this quantum Schwarzian reduces to a classical one, and the Picard–Fuchs equation takes the special form of a symmetric cube of a second order equation. We first give a criterion for modularity of such mirror maps at the beginning of Sect. 4.2. Suppose on the other hand that there are instanton corrections, so the quantum Schwarzian is not classical. If we assume that the mirror map is an automorphic function, it will satisfy another classical Schwarzian equation. By subtracting the two to eliminate the Schwarzian derivative terms, and applying a reduction of order argument to the original Picard–Fuchs equation, we obtain a nonlinear differential equation in the Yukawa coupling and coefficients of the Picard–Fuchs and classical Schwarzian equations (the “modularity equation” in Theorem 9). The mirror map does not appear in this expression, yet the equation will hold if and only if the mirror map is automorphic. This is our general criterion for modularity of the mirror map for Calabi–Yau threefolds.

The absence of instanton corrections in a one parameter family of Calabi–Yau threefolds corresponds to the existence of a homogeneous third order relation among the four periods, i.e., the image of the period mapping lies on a twisted cubic space curve. It is natural to ask what other homogeneous algebraic relations can occur between periods of one parameter families of Calabi–Yau threefolds. We call these “algebraic” instanton corrections. In Sect. 4.3 we apply a century old theorem of Fano to give a rough classification, paralleling the structure of the differential Galois group of the Picard–Fuchs equation.

Most of the results on the mirror map for Calabi–Yau manifolds which appear in the literature depend on the hypothesis that the families of Calabi–Yau threefolds arise “torically”, i.e., as particular parametrized families of hypersurfaces or complete intersections in Fano toric varieties. By working in the setting of transcendental algebraic geometry, we obtain general results about whole classes of families of Calabi–Yau manifolds. There has been a major effort in the literature to produce examples, first of mirror maps in general [19, 14–16] and then to test the (generalized) Mirror-Moonshine phenomenon in particular [23–26, 42]. Since the point of this paper is to explain general tools and results, we refer the reader interested in examples to the papers cited above.

2. Elliptic Curve Families

In this section we derive the modularity criterion for mirror maps of one parameter families of elliptic curves with section (Theorem 2), and make some comments on the case of multiparameter families of elliptic curves (Sect. 2.2). It does not really make sense to ask our question in this latter case, but we will use it to motivate some aspects of the problem for multiparameter families of K3 surfaces.

2.1. One parameter families of elliptic curves with section. In the early 1960s Kodaira developed a general theory of elliptic surfaces, i.e., compact complex surfaces fibered over curves, with generic fiber an elliptic curve. In particular he showed that every elliptic surface with section is determined by a pair of natural invariants. The first of these, the functional invariant, is a meromorphic function on the base of the family which keeps track of the J-value of each elliptic curve fiber. The second, the homological invariant, is nothing more than the monodromy representation associated with the second order Fuchsian ordinary differential equation satisfied by the periods, i.e., the monodromy of the Picard–Fuchs equation. The elliptic surfaces with a section, the basic elliptic surfaces, play a distinguished role in Kodaira’s theory.
There is a canonical form for such a family of elliptic curves π : X → C with section, exhibiting X as a divisor in a P²-bundle over the base curve C:

Theorem 1 ([29], Theorem (2.1)). Let Σ denote the given section of π, i.e., Σ = s(C), a divisor on X which is taken isomorphically onto C by π. Let $L = \pi_*[O_X(\Sigma)/O_X]$. Suppose that the general fiber of π is smooth. Then L is invertible and X is isomorphic to the closed subscheme of $P = P(L^{\otimes 2} \oplus L^{\otimes 3} \oplus O_C)$ defined by
\[ y^2 z = 4x^3 - g_2 x z^2 - g_3 z^3, \]
where
\[ g_2 \in \Gamma(C, L^{\otimes -4}), \quad g_3 \in \Gamma(C, L^{\otimes -6}), \]
and [x, y, z] is the global coordinate system of P relative to $(L^{\otimes 2}, L^{\otimes 3}, O_C)$. Moreover the pair $(g_2, g_3)$ is unique up to isomorphism, and the discriminant $\Delta = g_2^3 - 27 g_3^2 \in \Gamma(C, L^{\otimes -12})$ vanishes at a point s ∈ C precisely when the fiber $X_s$ is singular.

For a family of elliptic curves in Weierstrass form, the functional invariant takes the form
\[ J = g_2^3 / \Delta : C \to P^1_J. \tag{1} \]

The fact that the functional invariant takes the special form of Eq. (1) is evidence of the coarseness of the “J-line” moduli space of elliptic curves. By contrast, if we were to mark the two torsion on each elliptic curve, i.e., use the Legendre family $y^2 = x(x - 1)(x - \lambda(s))$ “λ-line” moduli space, then any rational function on the base curve C would be a “λ-functional invariant” for a family of elliptic curves with level two structure over C.

Kodaira has classified the singular fiber types which can arise in Weierstrass fibered elliptic surfaces. The singular fibers which appear in a smooth minimal elliptic surface fall into “types”: $I_n$ (n ≥ 0), II, III, IV, $I^*_n$ (n ≥ 0), IV*, III*, and II*. Denote a smooth elliptic fiber by $I_0$. The fiber of type $I_1$ is a rational curve with a single node. More generally, fibers of type $I_n$ consist of an n-cycle of intersecting rational curves for n ≥ 1. A fiber of type II is just a rational curve with a single cusp. Type III fibers consist of two rational curves with a single point of tangency. Fibers of type IV consist of three rational components intersecting at a single point. There are also fibers of types $I^*_n$ (n ≥ 0), IV*, III*, and II*, whose dual intersection graphs, minus in each case a multiplicity one component, correspond to the graphs of Dynkin types $D_{n+4}$, $E_6$, $E_7$, and $E_8$ respectively. We now recall how the Kodaira fiber types correlate with the ramification behavior of the J-map.

Lemma 1 ([30], Lemma IV.4.1). Let $F = X_s$ be the fiber of π over s ∈ C, and let $\nu_s(J)$ be the multiplicity of the functional invariant at s.
1. If F has type II, IV, IV*, or II*, then J(s) = 0. Conversely, suppose that J(s) = 0. Then
 – F has type $I_0$ or $I^*_0$ if and only if $\nu_s(J) \equiv 0 \bmod 3$,
 – F has type II or IV* if and only if $\nu_s(J) \equiv 1 \bmod 3$,
 – F has type IV or II* if and only if $\nu_s(J) \equiv 2 \bmod 3$.
2. If F has type III or III*, then J(s) = 1. Conversely, suppose that J(s) = 1. Then
 – F has type $I_0$ or $I^*_0$ if and only if $\nu_s(J) \equiv 0 \bmod 2$,
 – F has type III or III* if and only if $\nu_s(J) \equiv 1 \bmod 2$.
3. F has type $I_n$ or $I^*_n$ with n ≥ 1 if and only if J has a pole at s of order n.

Following [35, p. 304], one can apply the Griffiths–Dwork approach to computing the Picard–Fuchs equation of a Weierstrass elliptic surface as a Fuchsian system:
\[ \frac{d}{dz} \begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix} = \begin{pmatrix} -\dfrac{1}{12} \dfrac{d \log \Delta}{dz} & \dfrac{3\delta}{2\Delta} \\[2mm] -\dfrac{g_2 \delta}{8\Delta} & \dfrac{1}{12} \dfrac{d \log \Delta}{dz} \end{pmatrix} \begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix}, \]
where
\[ \Delta = g_2^3 - 27 g_3^2, \quad \delta = 3 g_3 \frac{dg_2}{dz} - 2 g_2 \frac{dg_3}{dz} \]
and
\[ \eta_1 = \int_\gamma \frac{dx}{y}, \quad \eta_2 = \int_\gamma \frac{x\,dx}{y}. \]
From this, for a one parameter family of elliptic curves in Weierstrass form it is not difficult to write down the Picard–Fuchs second order ordinary differential equation satisfied by the periods of the holomorphic one form ω = dx/y over the one cycles on the fibers. Picking a basis of one cycles $\gamma_i$, i = 0, 1, we denote by $f_i = \int_{\gamma_i} \omega$ the basis of solutions to the Picard–Fuchs equation. We can now reinterpret Kodaira’s functional invariant $\mathcal{J}$ as the composition of the projective period morphism $\tau := f_1/f_0 : C \to H \subset P^1$ and the morphism $J : H \to P^1$ extending the classical modular function, i.e., $\mathcal{J} = J \circ (\omega_1/\omega_0)$:

\[ C \xrightarrow{\;\tau(z)\;} H \xrightarrow{\;J(\tau)\;} PSL(2, Z) \backslash H^* \cong P^1_J, \qquad \mathcal{J}(z) = J(\tau(z)), \]
lying under the corresponding map of families $E_z \to E$.

Recall that a regular singular point of a Fuchsian ordinary differential equation of order k,
\[ \frac{d^k f}{ds^k} + P_1(s) \frac{d^{k-1} f}{ds^{k-1}} + \dots + P_k(s) f = 0, \quad P_i(s) \in C(s), \tag{2} \]
is called a point of maximal unipotent monodromy if the local monodromy matrix G is such that $G - I_k$ is nilpotent with exact order k. In a neighborhood of a point of maximal unipotent monodromy, Frobenius’ method tells us that there is a basis of solutions such that the first is holomorphic at the point, the second has logarithmic behavior, the next behaves like $\log^2$, . . . , up to $\log^{k-1}$. An easy consequence of Lemma 1 is

Corollary 1. The points of maximal unipotent monodromy in the base curve C of an elliptic surface E are the points s ∈ C over which there is a singular fiber of type $I_n$ or $I^*_n$, n ≥ 1 (i.e., the support of the semistable elliptic fibers).

Moreover, the presence of a point of maximal unipotent monodromy has global effects:

Corollary 2. The Picard–Fuchs differential equation of an elliptic surface has a point of maximal unipotent monodromy if and only if the global monodromy group has infinite order, if and only if the family of elliptic curves is not isotrivial.

Consider more generally a one parameter family of Calabi–Yau manifolds π : X → C whose Picard–Fuchs equation has a point of maximal unipotent monodromy. In a neighborhood of such a point consider the multivalued truncated period vector consisting only of the holomorphic solution and the logarithmic solution,
\[ [p_{hol}(s) : p_{log}(s)] : C \dashrightarrow P^1. \]
If the image lies in the upper half plane H ⊂ P¹, then, possibly after composition with projective linear transformations so that the singular point lies at 0 ∈ P¹ and maps to $i\infty \in H^* \subset P^1$, we can consider the q-series for the local inverse mapping
\[ z(q(\tau)) : H \to C, \quad q(\tau) = e^{2\pi i \tau}. \]
This q-series z(q) is called the mirror map of the family π : X → P¹ about the point of maximal unipotent monodromy; that is, z(q) : H → P¹ is the local inverse of the multivalued period map $\tau(z) : P^1 \dashrightarrow H$.
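The definition of maximal unipotent monodromy recalled above is a finite linear-algebra condition on the local monodromy matrix, so it can be tested mechanically. The following is a minimal sketch, assuming the monodromy is given as a square matrix of integers; the function names are hypothetical and the sample matrix is a generic unipotent example, not taken from a specific family.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def is_maximal_unipotent(g):
    """Return True if G - I_k is nilpotent of exact order k (k = size of G).

    That is, (G - I)^k = 0 while (G - I)^(k-1) != 0.
    """
    k = len(g)
    identity = [[1 if i == j else 0 for j in range(k)] for i in range(k)]
    nil = [[g[i][j] - identity[i][j] for j in range(k)] for i in range(k)]

    # Compute (G - I)^(k-1), then one more power.
    power = identity
    for _ in range(k - 1):
        power = mat_mul(power, nil)
    nonzero_before = any(x != 0 for row in power for x in row)
    power = mat_mul(power, nil)
    zero_at_k = all(x == 0 for row in power for x in row)
    return nonzero_before and zero_at_k


# Illustrative 2x2 unipotent matrix (a single Jordan block).
print(is_maximal_unipotent([[1, 1], [0, 1]]))
```

For order k = 2 the test reduces to asking that G be unipotent but not the identity; for higher order it asks for a single Jordan block of full size.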


Example 1. Consider the family $E_J$ of elliptic curves over $\mathbb{P}^1$ defined by the equation
\[ E_J : y^2 = 4x^3 - \frac{27s}{s-1}\,x - \frac{27s}{s-1}. \]
The periods of the form dx/y may be given in terms of the hypergeometric function ${}_2F_1$ (see [39, pp. 232–233] for explicit expressions). The Picard–Fuchs equation is
\[ \frac{d^2 f}{ds^2} + \frac{1}{s}\frac{df}{ds} + \frac{(31/144)s - 1/36}{s^2(s-1)^2}\, f = 0. \]
There is a basis of solutions with local monodromies $G_0$, $G_1$, $G_\infty$ about the regular singular points {0, 1, ∞} respectively, where
\[ G_0 = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}, \qquad G_1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad G_\infty = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \]
The unique point of maximal unipotent monodromy lies at $s = \infty \in \mathbb{P}^1$. The mirror map about this point is quite familiar. Since the maximal unipotent monodromy point is at ∞, we change variables first to z = 1/s so the mirror map q-series will be locally holomorphic. The single-valued local inverse to the period mapping is then the reciprocal of the q-series for the elliptic modular function J(q),
\[ J(q) = \frac{1}{q} + 744 + 196884\,q + 21493760\,q^2 + O(q^3), \]
\[ z(q) = \frac{1}{J(q)} = q - 744\,q^2 + 356652\,q^3 - 140361152\,q^4 + O(q^5). \]
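The two q-expansions in Example 1 can be checked against each other by inverting a power series term by term. The sketch below assumes only the coefficients 744, 196884 and 21493760 quoted for J(q); the function name `reciprocal_series` is a hypothetical helper introduced here for illustration.

```python
def reciprocal_series(coeffs):
    """Return the coefficients of 1/C(q) to the same order as C(q).

    `coeffs` lists c_0, c_1, ..., c_N of C(q) = c_0 + c_1 q + ... and must
    start with c_0 == 1, so that all arithmetic stays in the integers.
    """
    if coeffs[0] != 1:
        raise ValueError("this sketch assumes constant term 1")
    inv = [1]
    for n in range(1, len(coeffs)):
        # The coefficient of q^n in C(q) * (1/C(q)) must vanish for n >= 1.
        inv.append(-sum(coeffs[k] * inv[n - k] for k in range(1, n + 1)))
    return inv


# q * J(q) = 1 + 744 q + 196884 q^2 + 21493760 q^3 + O(q^4)
q_times_j = [1, 744, 196884, 21493760]

# z(q) = 1/J(q) = q / (q * J(q)), so these are the coefficients of q, q^2, q^3, q^4.
mirror_map_coeffs = reciprocal_series(q_times_j)
print(mirror_map_coeffs)  # [1, -744, 356652, -140361152]
```

The recursion reproduces the coefficients of z(q) stated in the example, for instance 744² − 196884 = 356652 for the q³ term.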

The period mapping is defined as a map to projective space. If one is interested in the mirror map it is often preferable to consider the Picard–Fuchs differential equation only up to “projective equivalence”. The projective normal form of a Fuchsian ordinary differential equation (e.g., that in Eq. (2) above) is the unique Fuchsian ordinary differential equation without a (k − 1)st order derivative
\[ \frac{d^k g}{ds^k} + R_2(s)\frac{d^{k-2} g}{ds^{k-2}} + \dots + R_k(s) g = 0, \qquad R_i(s) \in \mathbb{C}(s), \]
whose fundamental solutions define the same projective period map as that of Eq. (2). It is always possible to pass to the projective normal form differential equation by rescaling each fundamental solution of the original equation by the kth root of the Wronskian.

Example 2. Suppose now that k = 2, i.e., the initial differential equation is
\[ \frac{d^2 f}{ds^2} + P_1(s)\frac{df}{ds} + P_2(s) f = 0. \]
Then the projective normal form of this differential equation takes the particularly simple form
\[ \frac{d^2 g}{ds^2} + \left( P_2(s) - \frac{1}{2}P_1'(s) - \frac{1}{4}P_1(s)^2 \right) g = 0. \]
Let /J denote the projective normal form of the Picard–Fuchs equation of the family $E_J$ from Example 1, that is, the operator
\[ \frac{d^2}{ds^2} + \frac{36s^2 - 41s + 32}{144\,s^2(s-1)^2}. \]
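As a sanity check, the formula of Example 2 can be applied to the Picard–Fuchs equation of Example 1 and compared with the stated projective normal form. The sketch below assumes the reading $P_1(s) = 1/s$ and $P_2(s) = ((31/144)s - 1/36)/(s^2(s-1)^2)$ of that equation; the function names and the sample points are illustrative choices, and the comparison is a spot check in exact rational arithmetic rather than a symbolic proof.

```python
from fractions import Fraction


def p1(s):
    """Coefficient of df/ds (assumed reading of Example 1); s is a Fraction."""
    return 1 / s


def dp1(s):
    """Derivative of p1 with respect to s."""
    return -1 / s**2


def p2(s):
    """Coefficient of f (assumed reading of Example 1)."""
    return (Fraction(31, 144) * s - Fraction(1, 36)) / (s**2 * (s - 1) ** 2)


def q_from_formula(s):
    """Projective normal form coefficient P2 - P1'/2 - P1^2/4 from Example 2."""
    return p2(s) - dp1(s) / 2 - p1(s) ** 2 / 4


def q_stated(s):
    """Coefficient stated in the text for the projective normal form."""
    return (36 * s**2 - 41 * s + 32) / (144 * s**2 * (s - 1) ** 2)


# Sample points away from the singular points s = 0 and s = 1.
SAMPLE_POINTS = [Fraction(2), Fraction(-3), Fraction(5, 7), Fraction(1, 2), Fraction(11, 3)]


def normal_form_matches(points=SAMPLE_POINTS):
    """True when both expressions agree exactly at every sample point."""
    return all(q_from_formula(s) == q_stated(s) for s in points)


if __name__ == "__main__":
    print(normal_form_matches())
```

After clearing the common denominator $144\,s^2(s-1)^2$ both sides are polynomials of degree at most two in s, so agreement at three or more points already forces equality.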


As the process of taking the projective normal form does not alter the position or type of a maximal unipotent monodromy point, and as the projective solution determines the mirror map there, we see that the mirror map z(t) of a family of Calabi–Yau manifolds about a point of maximal unipotent monodromy of the Picard–Fuchs equation is determined by the projective normal form of this differential equation. Since the projective normalized Picard–Fuchs equation determines the mirror map, it is natural to ask if there is a simpler expression for this differential equation. In fact, by direct computation one can check that Proposition 1. The projective normalized Picard–Fuchs equation of a one parameter family of elliptic curves with section equals the projective normal form of the pullback J∗ (/J ) of /J from P1J to C by the functional invariant. Thus the mirror map of a one parameter family of elliptic curves is determined by the functional invariant J. This suggests that the answer to our modularity question should be expressed purely in terms of properties of the functional invariant itself. We now discuss three approaches to characterize modular mirror maps, each yielding the same criterion stated in terms of properties of the functional invariant. The three methods amount to the characterization of modular functions on the upper half plane H in terms of 1. modular relations between modular hauptmoduls and the elliptic modular function J , 2. uniformizing differential equations (genus g = 0) and holomorphic projective connections (g ≥ 1) on modular curves, and 3. branched covers of the J -line elliptic modular orbifold, respectively. The first of these is the most classical, implicit in fact in the early works of Fricke and Klein [11]. They introduce the notion of a single valued local uniformizer, or hauptmodul, H (τ ) for a genus zero modular curve. 
They compute several classical examples of modular relations between hauptmoduls H (τ ) and the elliptic modular function J (τ ), i.e., rational functions R(z) ∈ C(z) with the property that R(H (τ )) = J (τ ). In [1] Atkin and Swinnerton-Dyer state the following characterization of modular relations: Proposition 2. A function f (τ ) is a hauptmodul for a finite index subgroup of the classical elliptic modular group PSL(2, Z) if and only if there is a rational function R(z) ∈ C(z) such that 1. R(f (τ )) = J (τ ), 2. R(z) ramifies only over {0, 1, ∞} ⊂ P1J , and 3. the orders of ramification are = 1 or 3 over 0, and = 1 or 2 over 1. They comment further that this divisibility criterion extends to automorphic functions for subgroups of PSL(2, Z) of arbitrary genus. Their proof was extended by Venkov [41] to genus zero Fuchsian groups of the first kind more general than the classical elliptic modular group. The mirror map of a one parameter family of elliptic curves is modular when the functional invariant satisfies the three conditions of the proposition. The second approach, the one used to characterize modular mirror maps for families of elliptic curves over P1 in [8], focuses on the local properties of Fuchsian second order ordinary differential equations in projective normal form which characterize uniformizing differential equations. The uniformization theory for Riemann surfaces can be reformulated after Gunning [13] in terms of holomorphic projective connections on the


Riemann surface. On a local chart, or over a genus zero Riemann surface, this projective connection takes the form of a second order Fuchsian ordinary differential equation in projective normal form, i.e.,
\[ \frac{d^2 f}{dz^2} + Q(z) f = 0. \tag{3} \]

A fundamental set of solutions $\{f_1, f_2\}$ to a uniformizing differential equation (3) has the property that Q(z) is the Schwarzian derivative of the projective solution $\tau(z) = f_1(z)/f_2(z)$ with respect to z, i.e., $Q(z) = \{\tau(z); z\}$. The local behavior of the Schwarzian at poles then characterizes the class of Q(z) corresponding to uniformizing differential equations. Our criterion for modularity of the mirror map becomes a constraint on the functional invariant $\mathcal{J}$ so that the projective normalization of the pullback of the projective normalized /J (uniformizing differential equation for the J-line) is again a uniformizing differential equation. The “no excess ramification” condition (i.e., no ramification except over $\{0, 1, \infty\} \subset \mathbb{P}^1_J$) means that the projective normal form of the Picard–Fuchs equation must be free of apparent singularities. For a detailed discussion see [8, §3,4].

The third method, characterizing branched covers of orbifolds, is the most easily generalized of these three, and hence is our method of choice. We sketch here the theory of branched covers of orbifolds due to Kato, following Yoshida [44, §5.1]. Let X be a compact Riemann surface of genus g, equipped with m ≥ 1 marked “orbifold points” $a_j \in X$ and associated “orbifold weights” $b_j \in \mathbb{Z}$ ($2 \le b_j \le \infty$). Suppose that g = 0 and m ≥ 3. Fix the following data: $X_0 := X \setminus \{a_1, \dots, a_m\}$; $\tilde{X}_0$ the universal covering of $X_0$; H the fundamental group of $X_0$, which we also view as the transformation group of $\tilde{X}_0$; $\mu_j$ the element of H represented by a simple loop about $a_j$; $H[\mu^b]$ the smallest normal subgroup of H containing $\mu_j^{b_j}$ (j = 1, . . . , m) (determined uniquely, independent of the choice of $\mu_j$ or basepoint for H). Let K be an arbitrary subgroup of H containing $H[\mu^b]$, $X'_0$ the covering of $X_0$ corresponding to K, and $X'$ the completion of $X'_0$, i.e., the space obtained by adding to $X'_0$ all points over the $a_j$ with finite $b_j$. Then we have a sort of “galois correspondence” of branched covers: The space $X'$ is a branched cover of X branching at $a_j$ with a ramification index dividing $b_j$; we say that $X'$ is branched at most over the divisor $D = \sum_{j=1}^{m} b_j \cdot (a_j) \in \mathrm{Pic}(X)$. Conversely, to such a branched covering of X there corresponds a subgroup K, $H[\mu^b] \subset K \subset H$. The covering M corresponding to $K = H[\mu^b]$ is called the universal branched covering of X. In other words we have the following diagram of correspondences:

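\[
\begin{array}{ccccccc}
 & & & & \tilde{X}_0 & \leftrightarrow & 1 \\
 & & & & \downarrow & & | \\
1 & \leftrightarrow & M & \supset & M_0 & \leftrightarrow & H[\mu^b] \\
| & & \downarrow & & \downarrow & & | \\
K/H[\mu^b] & \leftrightarrow & X' & \supset & X'_0 & \leftrightarrow & K \\
| & & \downarrow & & \downarrow & & | \\
H/H[\mu^b] & \leftrightarrow & X & \supset & X_0 & \leftrightarrow & H
\end{array}
\]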

In this language we can most cleanly state our modularity criterion for the mirror map: Theorem 2. The mirror map of a one parameter family of elliptic curves with section π : E → C is an automorphic function for a finite index subgroup of PSL(2, Z) if and only if the functional invariant J(z) is branched at most over 3 · (0) + 2 · (1) ∈ Pic(P1J ).
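The condition "branched at most over 3 · (0) + 2 · (1)" is a divisibility condition on ramification indices, and can be combined with the Riemann–Hurwitz formula to read off the genus of the base curve. The sketch below is an illustration under stated assumptions, not a construction from the paper: the function names, the dictionary encoding of a cover, and the degree 3 ramification data are all hypothetical.

```python
def branched_at_most_over_d(ramification):
    """Check the divisibility criterion for a cover of the J-line.

    `ramification` maps a branch value ("0", "1", "inf" or any other label)
    to the list of ramification indices of the points in its fibre.
    Indices over 0 must divide 3, indices over 1 must divide 2, indices
    over infinity are unconstrained (weight infinity), and every other
    point must be unramified.
    """
    weights = {"0": 3, "1": 2}
    for value, indices in ramification.items():
        if value == "inf":
            continue
        weight = weights.get(value, 1)
        if any(weight % e != 0 for e in indices):
            return False
    return True


def genus(degree, ramification):
    """Genus of the covering curve of P^1 from the Riemann-Hurwitz formula."""
    for indices in ramification.values():
        if sum(indices) != degree:
            raise ValueError("each fibre must have total multiplicity `degree`")
    total = sum(e - 1 for indices in ramification.values() for e in indices)
    two_g_minus_2 = -2 * degree + total
    if two_g_minus_2 % 2 != 0:
        raise ValueError("ramification data violates Riemann-Hurwitz parity")
    return (two_g_minus_2 + 2) // 2


# Hypothetical degree 3 covers of the J-line, for illustration only.
admissible = {"0": [3], "1": [2, 1], "inf": [2, 1]}
inadmissible = {"0": [2, 1], "1": [2, 1], "inf": [3]}

if __name__ == "__main__":
    print(branched_at_most_over_d(admissible), genus(3, admissible))
    print(branched_at_most_over_d(inadmissible), genus(3, inadmissible))
```

Both sample covers have genus zero, but only the first satisfies the criterion of Theorem 2, since in the second a point over 0 has ramification index 2, which does not divide 3.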


Proof. Apply the galois correspondence above to the J-line orbifold. Here the Riemann surface is $X \cong \mathbb{P}^1_J$ (genus g = 0), with m = 3, $a_1 = 0$, $a_2 = 1$, $a_3 = \infty$, $b_1 = 3$, $b_2 = 2$, $b_3 = \infty$, and $D = 3 \cdot (0) + 2 \cdot (1) \in \mathrm{Pic}(\mathbb{P}^1_J)$. There is a correspondence between Riemann surfaces uniformized by subgroups of $H/H[\mu^b] \cong \mathrm{PSL}(2, \mathbb{Z})$ and covers of the J-line branched at most over D = 3 · (0) + 2 · (1). The mirror map is an automorphic function for a subgroup of PSL(2, Z) if and only if the base C of the family is so uniformized. But the branched covering $C \to \mathbb{P}^1_J$ is given by $\mathcal{J}$. Hence the modularity criterion is just that the natural cover of the J-line defined by the functional invariant branch at most over D.

This is not the end of the story in the elliptic curve case. By Lemma 1 we know the correspondence between local ramification behavior of the functional invariant and the type of Kodaira singular fiber to appear in the elliptic surface. In particular, if the mirror map of a basic elliptic surface is modular, then there are no singular fibers of types IV or II∗. Moreover, if one restricts to the case of rational elliptic surfaces where all combinations of singular fiber types are known, one can list all rational elliptic modular surfaces with section. See [8, Theorem 4.11].

2.2. Multi-parameter families of elliptic curves. The definition of Weierstrass fibrations in the one parameter case extends naturally to multiparameter families of elliptic curves with section. It is natural to ask if the modularity characterization extends in any way to families π : E → S of elliptic curves with section where dim(S) ≥ 2. This is not possible, but the obstruction is of interest in itself, and suggests an important hypothesis to make in the case of multiparameter families of K3 surfaces (Sect. 3.2). To begin with, the Gauss–Manin system for a multiparameter family of elliptic curves consists of a rank two system of linear partial differential equations.
With a slight modification, we can construct a family of varieties for which the Gauss–Manin system takes a recognizable projective normal form. Replace an n parameter family of elliptic curves fiberwise with their nth power. The resulting Gauss–Manin system (essentially the nth symmetric power of the original) is a rank n + 1 system of linear partial differential equations in n independent variables. A (projective) normal form exists for such differential equations [28]:
\[ \frac{\partial^2 w}{\partial z^i \partial z^j} = \sum_{k=1}^{n} P^k_{ij}\,\frac{\partial w}{\partial z^k} + P^0_{ij}\, w \qquad (i, j = 1, \dots, n). \]

In the one parameter setting (n = 1) these equations reduce to projective normalized second order Fuchsian ordinary differential equations. Local conditions coming from the Schwarzian derivative define a natural subclass consisting of uniformizing differential equations for Riemann surfaces with respect to subgroups of PSL(2, R) (projective connections if g ≥ 1). In the multivariable case, the analogous subclass consists of the multiparameter holomorphic projective connections (connections modelled after projective space) much studied by Kobayashi [20] in a program established by Cartan. Holomorphic projective connections generalize the Schwarzian derivative, and uniformize quotients of the n-ball
\[ \mathbb{B}^n := \{ [z_0 : \dots : z_n] \in \mathbb{P}^n \mid |z_0|^2 - |z_1|^2 - \dots - |z_n|^2 > 0 \} \]
by a discrete subgroup of the analytic automorphisms.


The difficulty we encounter in generalizing our modularity criterion for the mirror map to multiparameter families of elliptic curves is fundamental. The image of the projective period morphism, even considering the symmetric power family, only lies on a one dimensional submanifold of the period domain $\mathbb{B}^n$! A necessary condition for the Picard–Fuchs equation to uniformize the base S of our family E is for the period mapping $S \dashrightarrow \mathbb{B}^n$ to be surjective. In fact, as the local inverse to the period mapping, the mirror map itself cannot be defined unless the period mapping is surjective and the dimension of S equals that of the period domain. This suggests two ingredients which will be needed for the multiparameter K3 surface generalization in Sect. 3.2:

1. a notion of uniformizing differential equation well adapted for Picard–Fuchs equations of K3 surface families, and
2. consideration only of families with surjective period mappings.

3. K3 Surface Families

The results of Sect. 2 are extended here to families of lattice polarized K3 surfaces with surjective period mappings, first in the one parameter case (Sect. 3.1) and then for multiparameter families (Sect. 3.2). By applying the resulting criterion for automorphic mirror maps to one parameter families of rank 19 lattice polarized K3 surfaces, we explain the Mirror-Moonshine phenomenon of Lian and Yau.

3.1. One parameter families of K3 surfaces. In their first systematic investigations of mirror symmetry for one parameter families of Calabi–Yau manifolds constructed via the “orbifold construction” [24], Lian and Yau discovered that the reciprocal of the mirror maps for the K3 surfaces they were studying agreed, up to an additive constant, with some of the McKay–Thompson normalized q-series in the lists of Conway–Norton [4]. The evidence was sufficiently strong that they formulated

Conjecture 1 (Mirror-Moonshine, [24,23]). If z(q) is the mirror map for a one parameter family of algebraic K3 surfaces from an orbifold construction which has a third order Picard–Fuchs equation, then, for some c ∈ Z, the q-series
\[ \frac{1}{z(q)} + c \]
is a McKay–Thompson series $T_g(q)$ for some element g in the Monster.

In [25, 26], Lian and Yau compute many more toric examples (including over a dozen complete intersection examples), and note that the correspondence to monstrous groups persists. This suggested that the hypothesis regarding the “orbifold construction” should perhaps be weakened to the hypothesis “torically constructed”. As noted in the proof of Theorem 5, for a family of lattice polarized K3 surfaces the condition of having a third order Picard–Fuchs equation is equivalent to the generic member possessing a polarization by a lattice of rank 19. Furthermore, a McKay–Thompson series is in particular a hauptmodul for some “monstrous” genus zero arithmetic group $\Gamma$, and the various equivalent hauptmoduls are well-defined as generators of the function field of the rational curve $\Gamma \backslash \mathbb{H}^*$ only up to


action of . We see that in Conjecture 1 an equivalent conclusion is that the mirror map is itself a hauptmodul (unnormalized!) for some monstrous $\Gamma$. Before Conjecture 1 was even formulated, Beukers, Peters, and Stienstra had computed the Picard–Fuchs equation of a particular family of rank 19 lattice polarized K3 surfaces [33]. The mirror map was determined by Verrill and Yui [42]. Although it is a hauptmodul, this q-series does not satisfy the conclusion of the Mirror-Moonshine Conjecture. Thus it provides a counterexample to a “monstrous” generalization of the Mirror-Moonshine Conjecture for torically constructed families. This suggests that we characterize the families of rank 19 lattice polarized K3 surfaces whose mirror maps are hauptmoduls for genus zero groups, a special case of our question from the introduction.

The condition that a one parameter family of K3 surfaces have a third order Picard–Fuchs equation is actually quite natural. The periods obtained by integration of the holomorphic two form $\omega = \omega^{(2,0)}$ over algebraic two cycles all vanish. For a K3 surface X, the intersection form defines on $H_2(X, \mathbb{Z})$ the structure of a lattice, isomorphic to the even unimodular lattice $L = U \perp U \perp U \perp -E_8 \perp -E_8$, where U is the standard hyperbolic plane. The sublattice of algebraic cycles in $H_2(X, \mathbb{Z})$ is naturally identified with the Picard group Pic(X) of divisor classes of X. Thus the rank ρ of the Picard group determines the order of the Picard–Fuchs equation: order of Picard–Fuchs = 22 − ρ. In particular, the families considered by Lian and Yau all have Picard rank 19.

Let M be a lattice. An M-polarized K3 surface is a pair (X, j) of a K3 surface X and a primitive lattice embedding $j : M \hookrightarrow \mathrm{Pic}(X)$. The examples studied by Lian and Yau relating to Mirror-Moonshine are families of rank 19 lattice polarized K3 surfaces. A moduli space for lattice polarized K3 surfaces is constructed in [6, §3].
Each isomorphism class of (X, j) is represented by a point of this coarse moduli space $K_M$. Moreover the global Torelli theorem for lattice polarized K3 surfaces implies, as with the J-line in the case of elliptic curve moduli, that $K_M$ has the structure of an arithmetic quotient of a symmetric homogeneous space $D_M$ (a bounded symmetric domain of type IV) by an arithmetic group $\Gamma_M$. Here
\[ D_M \cong O(2, 20 - \rho)/(SO(2) \times O(20 - \rho)) \quad \text{and} \quad \Gamma_M = \ker\left( O(N) \to \mathrm{Aut}(N^*/N) \right), \]
where $N := M^{\perp}_{L}$. In particular, if the rank of M is 19 then $D_M \cong \mathbb{H}$. The generalized functional invariant $H_M : S \to K_M$ of a family π : X → S of M-polarized K3 surfaces may now be defined, by analogy with the elliptic curve case, as the composition of the multivalued period morphism $S \dashrightarrow D_M$ and the arithmetic quotient $D_M \to K_M$.

Since we are particularly interested in the case ρ = 19, the Picard–Fuchs equations of such one parameter families must be studied. We begin by examining some preliminary generalities on symmetric powers of second order Fuchsian ordinary differential equations. Assume that we have a second order Fuchsian ordinary differential equation $L_2 f = 0$, where
\[ L_2 = \frac{d^2}{ds^2} + P_1(s)\frac{d}{ds} + P_2(s). \]


The second order equation $L_2 f = 0$ is equivalent to the system of first order differential equations
\[ f' = g, \qquad g' = -P_2 f - P_1 g \]
with {f, g} as fundamental solutions. Observe that $\{f^n, f^{n-1}g, \dots, fg^{n-1}, g^n\}$ forms a set of fundamental solutions for the nth symmetric power $L = L_2^{\circledS n}$. The following result describes a system of first order differential equations for L with these fundamental solutions.

Theorem 3 ([21], Theorem 2). If {f, g} satisfy a first order 2 × 2 differential system
\[ \frac{d}{dt}\begin{pmatrix} f \\ g \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -P_2 & -P_1 \end{pmatrix}\begin{pmatrix} f \\ g \end{pmatrix}, \]
then $\{f^n, f^{n-1}g, \dots, fg^{n-1}, g^n\}$ satisfy the (n + 1) × (n + 1) system
\[ \frac{d}{dt}\begin{pmatrix} f^n \\ f^{n-1}g \\ \vdots \\ fg^{n-1} \\ g^n \end{pmatrix} = A \begin{pmatrix} f^n \\ f^{n-1}g \\ \vdots \\ fg^{n-1} \\ g^n \end{pmatrix}, \]
where $A = (a_{ij})$ is an (n + 1) × (n + 1) matrix such that
\[
\begin{aligned}
a_{k,k} &= (1 - k)P_1, & 1 \le k \le n + 1,\\
a_{k,k+1} &= n + 1 - k, & 1 \le k \le n,\\
a_{k+1,k} &= -kP_2, & 1 \le k \le n,\\
a_{i,j} &= 0, & i > j + 1 \text{ or } j > i + 1.
\end{aligned}
\]
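The entries of A can be checked directly from the product rule, since $\frac{d}{dt}(f^a g^b) = a f^{a-1} g^b f' + b f^a g^{b-1} g'$ with $f' = g$ and $g' = -P_2 f - P_1 g$. The following sketch verifies this pointwise in exact rational arithmetic; the function names and the sample values of f, g, P1, P2 at a point are arbitrary assumptions made for illustration.

```python
from fractions import Fraction


def sym_power_matrix(n, p1, p2):
    """Tridiagonal matrix A of Theorem 3, built with the 1-based index k of the text."""
    size = n + 1
    a = [[Fraction(0)] * size for _ in range(size)]
    for k in range(1, size + 1):
        a[k - 1][k - 1] = (1 - k) * p1          # a_{k,k}
        if k <= n:
            a[k - 1][k] = Fraction(n + 1 - k)   # a_{k,k+1}
            a[k][k - 1] = -k * p2               # a_{k+1,k}
    return a


def monomial_derivatives(n, f, g, p1, p2):
    """d/dt of f^(n+1-k) g^(k-1), k = 1..n+1, using f' = g and g' = -p2 f - p1 g."""
    df, dg = g, -p2 * f - p1 * g
    out = []
    for k in range(1, n + 2):
        a_exp, b_exp = n + 1 - k, k - 1
        term = Fraction(0)
        if a_exp > 0:
            term += a_exp * f ** (a_exp - 1) * g**b_exp * df
        if b_exp > 0:
            term += b_exp * f**a_exp * g ** (b_exp - 1) * dg
        out.append(term)
    return out


def check_sym_power_system(n, f, g, p1, p2):
    """True when A applied to the monomial vector equals its derivative."""
    a = sym_power_matrix(n, p1, p2)
    v = [f ** (n + 1 - k) * g ** (k - 1) for k in range(1, n + 2)]
    av = [sum(a[i][j] * v[j] for j in range(n + 1)) for i in range(n + 1)]
    return av == monomial_derivatives(n, f, g, p1, p2)


if __name__ == "__main__":
    # Arbitrary sample values at a point (assumptions for illustration).
    f, g = Fraction(2, 3), Fraction(-5, 7)
    p1, p2 = Fraction(3, 11), Fraction(-4, 13)
    print(all(check_sym_power_system(n, f, g, p1, p2) for n in range(1, 6)))
```

For n = 1 the matrix reduces to the 2 × 2 system of the hypothesis, which is a quick consistency check on the indexing.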

Example 3. In particular, when n = 2, the case for a symmetric square, one may rewrite the system in terms of a single third order operator
\[ \mathrm{Sym}^2(L_2) = \frac{d^3}{ds^3} + 3P_1\frac{d^2}{ds^2} + \left(2P_1^2 + 4P_2 + P_1'\right)\frac{d}{ds} + \left(4P_1P_2 + 2P_2'\right). \]
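The third order operator of Example 3 can be checked pointwise: given the values of a solution and its first derivative at a point, the equation $y'' + P_1 y' + P_2 y = 0$ determines the higher derivatives, and the Leibniz rule gives the derivatives of a product of two solutions. The sketch below is a verification under stated assumptions (the function names and numerical values are hypothetical), using exact rational arithmetic so that the residual vanishes identically rather than approximately.

```python
from fractions import Fraction


def solution_jet(y, dy, p1, dp1, p2, dp2):
    """(y, y', y'', y''') at a point for a solution of y'' + p1 y' + p2 y = 0."""
    d2 = -p1 * dy - p2 * y
    # Differentiate y'' = -p1 y' - p2 y once more.
    d3 = -dp1 * dy - p1 * d2 - dp2 * y - p2 * dy
    return (y, dy, d2, d3)


def product_jet(u, v):
    """Derivatives up to order three of the product u*v (Leibniz rule)."""
    return (
        u[0] * v[0],
        u[1] * v[0] + u[0] * v[1],
        u[2] * v[0] + 2 * u[1] * v[1] + u[0] * v[2],
        u[3] * v[0] + 3 * u[2] * v[1] + 3 * u[1] * v[2] + u[0] * v[3],
    )


def sym2_residual(w, p1, dp1, p2, dp2):
    """Apply the operator Sym^2(L_2) of Example 3 to the jet w."""
    return (w[3] + 3 * p1 * w[2]
            + (2 * p1**2 + 4 * p2 + dp1) * w[1]
            + (4 * p1 * p2 + 2 * dp2) * w[0])


if __name__ == "__main__":
    # Arbitrary values of P1, P1', P2, P2' and of two solutions at a point.
    p1, dp1, p2, dp2 = Fraction(1, 3), Fraction(-2, 5), Fraction(7, 4), Fraction(3, 8)
    f = solution_jet(Fraction(2), Fraction(-1, 2), p1, dp1, p2, dp2)
    g = solution_jet(Fraction(-3, 7), Fraction(5), p1, dp1, p2, dp2)
    for w in (product_jet(f, f), product_jet(f, g), product_jet(g, g)):
        print(sym2_residual(w, p1, dp1, p2, dp2))
```

Each of the three products $f^2$, $fg$, $g^2$ is annihilated, which is the statement that they span the solution space of the symmetric square.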

Our next task is to show that the Picard–Fuchs equation of a one parameter family of rank 19 lattice polarized K3 surfaces is a symmetric square of a second order equation, and to reduce the modularity question for the mirror map to the second order setting. Theorem 4 ([38], Lemma 3.1.(b)). Let L1 (y) and L2 (y) be homogeneous linear differential polynomials with coefficients in C(t). Then there exists a homogeneous linear differential equation L3 (y) = 0 with coefficients in C(t) and solution space the C-span of {ν1 ν2 | L1 (ν1 ) = 0 and L2 (ν2 ) = 0}.

We call the operator $L_3(y)$ constructed above the symmetric product of $L_1$ and $L_2$, and denote it by $L_1 \circledS L_2$. In fact, the operation is associative, and we may further define $L^{\circledS n}$ for n ≥ 1 by $L^{\circledS 1} = L$ and $L^{\circledS n} = L^{\circledS (n-1)} \circledS L$. We call $\mathrm{Sym}^n(L) = L^{\circledS n}$ the nth symmetric power of L; conversely, L is the nth root of $L^{\circledS n}$.


Lemma 2 ([38], Lemma 4, p. 129). Let L(y) be a homogeneous linear differential polynomial with coefficients in C(t). Then $L(y) = L_2^{\circledS n}(y)$ for some second order homogeneous linear differential polynomial $L_2(y)$ with coefficients in C(t) if and only if there exists a fundamental set of solutions $\{y_1, \dots, y_{n+1}\}$ of L(y) = 0 such that
\[ y_i\, y_{i+2} - y_{i+1}^2 = 0, \qquad i = 1, \dots, n - 1. \]

Corollary 3. Let L(y) = 0 be a third order homogeneous linear equation with coefficients in C(t). If there exists a nondegenerate homogeneous polynomial P of degree 2 with constant coefficients and a fundamental set of solutions $\{y_1, y_2, y_3\}$ of L(y) = 0 such that $P(y_1, y_2, y_3) = 0$, then L(y) is the second symmetric power of a second order homogeneous linear differential equation with coefficients in C(t).

Proof. This follows easily from Lemma 2. By assumption, the fundamental set of solutions satisfies a nondegenerate quadratic relation. Since all such quadrics in $\mathbb{P}^2(\mathbb{C})$ are projectively equivalent to $y_1 y_3 - y_2^2 = 0$, the criterion of the lemma applies and L(y) is a symmetric square.

In this form, using the expression for the projective normal form of a second order Fuchsian differential equation given in Example 2, it is easy to check that:

Proposition 3. Let $L_2$ be as above a second order Fuchsian ordinary differential operator, and let $L = \mathrm{Sym}^2(L_2)$ be its symmetric square. Then the projective normal form of L is the symmetric square of the projective normal form of $L_2$.

In fact, it is possible to provide an explicit description of the relationship between the monodromy matrices of the second order “square root” equation and those of the third order symmetric square equation. This is provided by the faithful representation of SL(2, C) in SL(3, C) via the symmetric square representation [38]. Finally, we see the relevance of all of this for Picard–Fuchs equations of our rank 19 lattice polarized K3 surface families:

Theorem 5. The Picard–Fuchs equation of a family of rank 19 lattice polarized K3 surfaces is the symmetric square of a second order homogeneous linear Fuchsian ordinary differential equation.

Proof. To begin with, the order of the Picard–Fuchs equation is equal to the rank of the transcendental lattice, i.e., 22 − 19 = 3. By Nikulin’s Torelli theorem for lattice polarized K3 surfaces the period domain lies on a nondegenerate quadric in $\mathbb{P}^2$ [6]. Thus, Corollary 3 implies that the third order Picard–Fuchs differential equation is in fact a symmetric square.

There is another approach to proving this result in the special case of K3 surfaces polarized by a lattice of the form $M_n := U \perp -E_8 \perp -E_8 \perp \langle -2n \rangle$ (of rank 2 + 8 + 8 + 1 = 19), which takes advantage of their presentation as Shioda–Inose surfaces coming from a product of two elliptic curves linked by an n-isogeny. See [32] for more details. Such a simple geometric description is lacking in case of a general rank 19 lattice polarization. Nevertheless, our approach via symmetric square(root) Picard–Fuchs equations still


applies! This is what allows our transcendental methods to extend beyond the $M_n$-polarized case to general rank 19 lattice polarized K3 surface families. We have effectively reduced the question of automorphicity of the mirror map to the case of uniformization of orbifold Riemann surfaces by second order Fuchsian equations already addressed in Sect. 2.1. Our result is

Theorem 6. The mirror map of a one parameter family of rank 19 lattice polarized K3 surfaces π : X → C is an automorphic function for a finite index subgroup of $\Gamma_M$ if and only if the generalized functional invariant $H_M(z)$ is branched at most over the orbifold divisor $D \in \mathrm{Pic}(K_M)$.

Proof. By Theorem 5 the Picard–Fuchs equation of such a family of K3 surfaces is a symmetric square. The mirror map of a one parameter family of rank 19 lattice polarized K3 surfaces about a point of maximal unipotent monodromy is identical to that of the projective normalized square root of its Picard–Fuchs equation about the corresponding point: If {f, g} is a fundamental set of solutions to the square root equation, say f the locally holomorphic solution, then $\{f^2, fg, g^2\}$ is a fundamental set of solutions to the symmetric square, with $f^2$ locally holomorphic. The (truncated) projective period mapping for the K3 surface family is given by $fg/f^2 = g/f$, which is exactly the projective period ratio of the square root equation. Thus the mirror map for the K3 surface family is modular if and only if the projective normalized square root of Picard–Fuchs is a uniformizing differential equation for C. We can now apply the same galois correspondence for branched covers of orbifolds we used in Theorem 2. Now $X = K_M$, the $a_j$ and $b_j$ are determined by the positions and orders of the fixed points of the action of $\Gamma_M$ on $D_M \cong \mathbb{H}$, and the total orbifold divisor of $K_M$ is again $D = \sum_{j=1}^{m} b_j \cdot (a_j) \in \mathrm{Pic}(K_M)$.

Using the theorem of Fano reproduced in Sect. 4.3, we can even characterize near modularity properties of one parameter families of rank 18 lattice polarized K3 surfaces. By the nondegenerate quadric structure of the period domain and case 3 of Theorem 9 we know that the fourth order projective normalized Picard–Fuchs equation is a tensor product of two second order Fuchsian equations in projective normal form. If the fundamental solutions, in {hol., log.} pairs, for these factor equations are {a, b} and {c, d}, then the fundamental solutions to the product equation take the form {ac, bc, ad, bd}, so the truncated projective period mapping consists of the pair {b/a, d/c}, i.e., the pair of projective solutions to the factor equations. Although it is not natural to describe the mirror map when the dimension of the family is unequal to that of the associated period domain, there is a good notion of “bimodularity”, i.e., when each factor equation is a uniformizing differential equation (necessarily distinct, else the lattice polarization rank jumps to 19 and the equation is a symmetric square).

3.2. Multi-parameter families of K3 surfaces. For the multiparameter definition of points of maximal unipotent monodromy and the mirror map we refer the reader to the unified presentation in [5] (§5.2.2 and §6.3.1 respectively). The details of the local description of the mirror map are in fact irrelevant for what follows as we address the


global question of modularity. In any case, the existence of a point of maximal unipotent monodromy is again guaranteed by the (related) hypotheses: 1. the family is not isotrivial, and 2. the period mapping is surjective. We define the generalized functional invariant $H_M : S \to K_M$ for a family π : X → S of M-polarized K3 surfaces as in Sect. 3.1 as the composition of the multivalued period morphism $S \dashrightarrow D_M$ and the quotient map to the coarse moduli space $D_M \to K_M$ coming from the global Torelli theorem. Under our hypotheses, the Gauss–Manin system for an n parameter family of rank 20 − n lattice polarized K3 surfaces is a system of linear partial differential equations of rank n + 2 in n independent variables. Any such system has a (projective) normal form [37]
\[ \frac{\partial^2 u}{\partial z^i \partial z^j} = g_{ij}\,\frac{\partial^2 u}{\partial z^1 \partial z^n} + \sum_{k=1}^{n} A^k_{ij}\,\frac{\partial u}{\partial z^k} + A^0_{ij}\, u \qquad (1 \le i, j \le n), \]
where $g_{ij} = g_{ji}$, $A^k_{ij} = A^k_{ji}$, $A^0_{ij} = A^0_{ji}$, $g_{1n} = 1$, $A^k_{1n} = A^0_{1n} = 0$ (for n ≥ 3), or [36]
\[
\begin{aligned}
\frac{\partial^2 u}{\partial x^2} &= l\,\frac{\partial^2 u}{\partial x \partial y} + a\,\frac{\partial u}{\partial x} + b\,\frac{\partial u}{\partial y} + p\,u,\\
\frac{\partial^2 u}{\partial y^2} &= m\,\frac{\partial^2 u}{\partial x \partial y} + c\,\frac{\partial u}{\partial x} + d\,\frac{\partial u}{\partial y} + q\,u
\end{aligned}
\]
(for n = 2). The global Torelli theorem of Nikulin again implies that the periods map to a quadric projective hypersurface. The natural subclass of uniformizing differential equations adapted to the Picard–Fuchs equations of lattice polarized K3 surfaces with surjective period mappings are the holomorphic conformal connections (connections modelled after hyperquadrics) introduced by Kobayashi [20]. Once again the question of automorphicity of the inverse to the projective period mapping reduces to the uniformizability of the base S of our family as a branched cover of the modular orbifold $K_M$.

Fortunately, the galois correspondence for branched covers of orbifold Riemann surfaces has been generalized by Namba [31] to the case of orbifold complex manifolds of higher dimension. We refer to [31, Theorem 1.2.7] for the details, but the only essential difference is that we must add a higher dimensional analogue of the topological condition excluding “g = 0, m = 1 or 2” in the Riemann surface case. This topological condition, [31, Condition 1.2.4], says: if $\mu_j^d \in H[\mu^b]$, then $b_j \mid d$ (for all j, 1 ≤ j ≤ m). By applying the galois correspondence as before to our families we find

Theorem 7. The mirror map of an n parameter family of rank 20 − n lattice polarized K3 surfaces π : X → S is an automorphic function for a finite index subgroup of $\Gamma_M$ (M := the polarizing lattice) if and only if the generalized functional invariant $H_M$ is branched at most over the orbifold divisor of $K_M$.


4. Calabi–Yau Threefold Families

We have seen that the presence of a global Torelli theorem is a great help in establishing modularity criteria for the mirror map, expressed in terms of natural algebraic invariants of our families of Calabi–Yau manifolds. It is known that in general moduli spaces of polarized Calabi–Yau threefolds lack the structure of a locally symmetric space. Nevertheless, it is possible that a differential algebraic criterion for automorphicity of the mirror map may be obtainable for Calabi–Yau threefold families by making use of the “special geometry” of Calabi–Yau threefold moduli. In fact, one can use the constraints imposed by special geometry on the Picard–Fuchs equation of a one parameter family of Calabi–Yau threefolds with $h^{2,1} = 1$ to derive an auxiliary differential equation (involving the Yukawa couplings, the coefficients of Picard–Fuchs, and the rational function defining the putative uniformizing differential equation) which holds if and only if the mirror map is an automorphic function (Theorem 9 in Sect. 4.2).

4.1. Picard–Fuchs equations of Calabi–Yau threefolds and special geometry. Special geometry arises in global N = 2 supersymmetry in four dimensions as a structure on the manifold spanned by the scalars in the vectormultiplets. The moduli space of (2,2) superconformal field theories, and thus the moduli space of Calabi–Yau threefolds, satisfies the same constraint equation for the natural Kähler metric on moduli space. In the case of one parameter families of Calabi–Yau threefolds with $h^{2,1} = 1$ much is known about the implications of special geometry. In particular, the effect of special geometry on the fourth order Picard–Fuchs ordinary differential equations is well known [19, 3]. We review these results in this section, using notation largely compatible with that in [19, 27].

We will always use primes (e.g., $f'(z)$) to denote derivatives with respect to the base parameter z, and dots (e.g., $\dot{F}(t)$) to denote derivatives with respect to the truncated period mapping parameter t. Suppose given the Picard–Fuchs equation for a family of $h^{2,1} = 1$ Calabi–Yau threefolds
\[ Lf(z) = 0 : \quad f''''(z) + b_3(z) f'''(z) + b_2(z) f''(z) + b_1(z) f'(z) + b_0(z) f(z) = 0 \]
with fundamental solutions
\[ \left( \xi_0,\ \xi_1,\ \xi_0 \dot{F}(\xi_1/\xi_0),\ \xi_0\big( (\xi_1/\xi_0)\dot{F}(\xi_1/\xi_0) - 2F(\xi_1/\xi_0) \big) \right). \]
Then $t(z) := \xi_1/\xi_0$ is the truncated period mapping. By rescaling the solutions g(z) := f(z)/A(z), where
\[ A(z) = \exp\left( -\frac{1}{4}\int b_3(z)\, dz \right), \]
we obtain the projective normalized Picard–Fuchs equation
\[ Lg(z) = 0 : \quad g''''(z) + a_2(z) g''(z) + a_1(z) g'(z) + a_0(z) g(z) = 0 \]
with fundamental solutions 1/A(z) times the previous ones. In fact, $a_1(z) = a_2'(z)$ (see [19]). Let $u(z) = \xi_0/A$. The quantum Yukawa coupling is related to the holomorphic solution u(z) about the point of maximal unipotent monodromy:
\[ K = \dddot{F} = \mathrm{const} \cdot A^2/\xi_0^2 = \mathrm{const}/u^2. \]

By reduction of order applied to the $z \leftrightarrow t$ variables exchanged equation $\tilde{L} g = 0$, we derive a third order variant of the Picard–Fuchs equation in $t$, satisfied by $u(t)$,

\[ PF_t^{(3)} u(t) = 0 : \quad \dddot{u}(t) + \tfrac{1}{2} c_2(t)\, \dot{u}(t) + \tfrac{1}{4} \dot{c}_2(t)\, u(t) = 0, \]

i.e.,

\[ PF_t^{(3)} u = \tfrac{1}{4} \bigl( \tilde{L}(u \cdot t) - t \cdot (\tilde{L} u) \bigr). \]

We recognize $PF_t^{(3)}$ as the symmetric square of the second order “square root” equation

\[ PF_t^{(2)} v(t) = 0 : \quad \ddot{v}(t) + \tfrac{1}{8} c_2(t)\, v(t) = 0 \]

satisfied by $v(t) = \sqrt{u(t)}$. By plugging in the two remaining fundamental solutions, one finds that the resulting system of equations reads $c_2(t) = r_2(t)$, $c_0(t) = r_0(t)$, where

\[ c_2(t) = a_2(z)(\dot{z})^2 - \frac{15}{2} \Bigl( \frac{\ddot{z}}{\dot{z}} \Bigr)^2 + 5\, \frac{\dddot{z}}{\dot{z}} = a_2(z)(\dot{z})^2 + 5\{z(t), t\}, \qquad r_2(t) = 2\, \frac{\ddot{K}}{K} - \frac{5}{2} \Bigl( \frac{\dot{K}}{K} \Bigr)^2, \]

and the lengthy expressions for $c_0(t)$ and $r_0(t)$ are found in [19], where they are used to derive nonlinear ordinary differential equations of high order for the mirror map and Yukawa coupling. The $c_0 = r_0$ equation provides no simplification of our approach to modularity in Sect. 4.2, so $c_0$ and $r_0$ may be safely ignored.

By reduction of order applied to $Lg = 0$, we find a third order Picard–Fuchs type equation in $z$ for $T(z) = t'(z)$,

\[ PF_z^{(3)} T(z) = 0 : \quad T'''(z) + 4\, \frac{u'(z)}{u(z)}\, T''(z) + \Bigl( 6\, \frac{u''(z)}{u(z)} + a_2(z) \Bigr) T'(z) + \Bigl( 4\, \frac{u'''(z)}{u(z)} + 2 a_2(z)\, \frac{u'(z)}{u(z)} + a_2'(z) \Bigr) T(z) = 0. \]

It is important that the dependence of the coefficients on $u$ is only through the ratios $u'(z)/u(z)$, $u''(z)/u(z)$, and $u'''(z)/u(z)$ – this is why the constant relating $K$ and $u$ never enters into the equation even if we rewrite it in terms of $K$. With this in mind, let $r := d \log u$, and $Lu(z) = 0$ becomes

\[ (r''' + 4 r r'' + 3 (r')^2 + 6 r^2 r' + r^4) + a_2 (r' + r^2) + a_2' r + a_0 = 0. \tag{4} \]

For Calabi–Yau threefold families (assuming special coordinates) Lian and Yau show that the mirror map satisfies a “quantum corrected” version of Schwarz’s equation (the $c_2 = r_2$ equation above):

\[ 2 Q(z) (\dot{z})^2 + \{z, t\} = \tfrac{2}{5} \ddot{y} - \tfrac{1}{10} (\dot{y})^2, \]

where $y(t) = \log(K(t))$, $Q(z) = a_2(z)/10$. For reference note as well that

\[ c_2(t) = 2 \ddot{y} - \tfrac{1}{2} (\dot{y})^2. \]
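The two forms of $c_2(t)$ in Sect. 4.1 agree because of the standard definition of the Schwarzian derivative, $\{z, t\} = \dddot{z}/\dot{z} - \tfrac{3}{2}(\ddot{z}/\dot{z})^2$. As a small sanity check, the following Python sketch evaluates this definition in exact rational arithmetic on two closed-form test maps: a Möbius transformation, whose Schwarzian vanishes, and a power map $z = t^k$, for which $\{z, t\} = (1 - k^2)/(2t^2)$. The function names and the test maps are illustrative assumptions and are not taken from the text.

```python
from fractions import Fraction


def schwarzian(z1, z2, z3):
    """{z, t} computed from the values of the first three t-derivatives of z."""
    return z3 / z1 - Fraction(3, 2) * (z2 / z1) ** 2


def c2_from_mirror_map(a2, z1, z2, z3):
    """c_2(t) = a_2(z) (dz/dt)^2 + 5 {z, t}, as in Sect. 4.1."""
    return a2 * z1 ** 2 + 5 * schwarzian(z1, z2, z3)


def mobius_derivs(a, b, c, d, t):
    """First three derivatives of z = (a t + b) / (c t + d) at the point t."""
    det = a * d - b * c
    w = c * t + d
    return det / w ** 2, -2 * c * det / w ** 3, 6 * c ** 2 * det / w ** 4


def power_derivs(k, t):
    """First three derivatives of z = t**k at the point t (t nonzero)."""
    return (k * t ** (k - 1),
            k * (k - 1) * t ** (k - 2),
            k * (k - 1) * (k - 2) * t ** (k - 3))


if __name__ == "__main__":
    t = Fraction(1, 2)
    print("Moebius:", schwarzian(*mobius_derivs(2, 1, 1, 3, t)))
    print("t**3   :", schwarzian(*power_derivs(3, Fraction(2))))
```

The helper `c2_from_mirror_map` simply packages the second form of $c_2(t)$; expanding the Schwarzian inside it reproduces the first form term by term.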

4.2. Characterization of modular mirror maps. We will start with the case with no instanton corrections: The quantum corrections vanish if and only if $c_2(t) = 0$, i.e., when $\ddot{x} = (\dot{x})^2$, where $x = y/4$. Letting $X = \dot{x}$, this becomes $\dot{X} = X^2$, with solutions $X(t) = -(c - t)^{-1}$ for constant $c$. So $K(t) = \exp(y(t)) = \exp(4x(t)) = \exp(4(\log(c - t) + d)) = \mathrm{const.}\,(c - t)^4$. Whenever $K(t)$ does not take this particular form, we know that the mirror map $z(t)$ is not an automorphic function for the projective monodromy group of the second order ordinary differential equation in projective normal form with coefficient $Q(z)$. Conversely, if $K(t)$ satisfies Eq. (4.2) and $Q(z)$ satisfies the local conditions coming from the Schwarzian derivative for orbifold uniformization (i.e., characteristic exponent differences are proper unit fractions or zero), then the mirror map $z(t)$ will be an automorphic function. A two parameter family of Calabi–Yau threefolds (a subfamily of the 101 parameter family of Calabi–Yau quintic hypersurfaces in $\mathbb{P}^4$) without instanton corrections is described in [3].

Assume for the remainder of Sect. 4.2 that there are instanton corrections present. Suppose that there is a rational function $R(z)$ (necessarily unequal to $Q(z)$) with respect to which the mirror map $z(t)$ satisfies the classical Schwarz equation

\[ 2 R(z) (\dot{z})^2 + \{z, t\} = 0 \tag{5} \]

i.e., with respect to which the mirror map is an automorphic function. There is only one such candidate rational function $R(z)$. This is the rational function which defines the uniformizing differential equation with regular singular points with compatible characteristic exponent differences exactly at those of the projective normal form Picard–Fuchs equation. The only subtlety that arises is one of computational effectivity: If there are more than three regular singular points, then the coefficients in the numerator of $R(z)$ are difficult to determine in general from the denominator data – this is the famous “accessory parameter problem” in Riemann–Hilbert theory. By subtracting the two expressions (4.1) and (5) we have the equation

\[ 2 (Q(z) - R(z)) (\dot{z})^2 = \tfrac{2}{5} \ddot{y} - \tfrac{1}{10} (\dot{y})^2. \tag{6} \]

Set $P(z) = 5(Q(z) - R(z))$ and $S(z) = (1/4) P(z)$. Then Eq. (6) can be rewritten as $S(z)(\dot{z})^2 = \dot{X} - X^2$, where $X = \dot{x}$ and $x = y/4$ as before. Now apply a Riccati transformation

\[ X(t) = \frac{\dot{w}(t)}{w(t)} = \frac{d}{dt} \log w(t) \]

yielding the linear ordinary differential equation in $t$

\[ \ddot{w}(t) + S(z(t)) (\dot{z})^2 w(t) = 0. \tag{7} \]

Now change the independent variable from $t$ to $z$ and we get a second order linear equation in $z$

\[ w''(z) - \frac{T'(z)}{T(z)}\, w'(z) + S(z) w(z) = 0, \tag{8} \]

where $T(z) = t'(z)$. This implies an equation for $T'(z)/T(z)$ in terms of $\{w, S\}$, or $\{K, P\}$, or $\{u, a_2, P\}$, or $\{r, a_2, P\}$ (recall $r = d \log u$). The equation in terms of $r, a_2, P$ reads

\[ \frac{T'}{T} = \frac{r'}{r} - \frac{1}{2} r - \frac{1}{2} \frac{P}{r}. \]

One can of course substitute $((a_2(z)/2) - 5R(z))$ for $P(z)$ and obtain the expression in terms of $\{r, a_2, R\}$ as well.

Now apply this to reduce $PF_z^{(3)}$ to an expression of the form $T(z)\varphi(z) = 0$. Since $T(z)$ is not identically zero by assumption (the mirror map is locally invertible), $\varphi(z) = 0$. This is the modularity condition. If we use the expression for $d \log T$ in terms of $u$, we can arrange to never have more than a $u'''$ appear (use $PF_z^{(4)} u = 0$). Similarly we can arrange to never have more than a $w'''$ or a $K'''$ or an $r''$ appear. In the $r$ variant in Theorem 9 below we use Eq. (4). Of course this results in an additional term involving $a_0$ (there was only $a_2$ dependence in the higher order equation).

Theorem 8. Here is the equation characterizing modularity of the mirror map in terms of r, a0 , a2 , R: 0 = −a23 − 16a0 r 2 − 6a22 r 2 − 12a2 r 4 + 8r 6 + 30a22 R + 80a2 r 2 R + 200r 4 R − 300a2 R 2 − 200r 2 R 2 + 1000R 3 + 5a2 ra2 − 6r 3 a2 − 50rRa2 + 7a22 r + 8a2 r 2 r − 64r 3 r + 116r 4 r − 140a2 Rr − 80r 2 Rr + 700R 2 r − 12ra2 r − 16a2 (r )2 − 48r 2 (r )2 + 160R(r )2 + 16(r )3 − 50a2 rR + 60r 3 R + 500rRR + 120rr R − 4r 2 a2 − 16a2 rr + 80r 3 r + 160rRr + 32rr r + 40r 2 R . In particular, this modularity equation is a second order nonlinear ordinary differential equation with rational function coefficients which the logarithmic derivative of the holomorphic solution to the Picard–Fuchs equation (4.1) satisfies if and only if the mirror map is an automorphic function.

Special geometry is a phenomenon present in multidimensional families of Calabi–Yau manifolds as well [40]. A multiparameter criterion for automorphicity of the mirror map would of course be desirable.
Perhaps the recent mathematical reformulation of special geometry by Freed [10] is a natural starting point.

4.3. Algebraic instanton corrections. In case of families of lattice polarized K3 surfaces, by global Torelli there is a homogeneous quadratic relation among the periods, and no instanton corrections. For Calabi–Yau threefold families we can also interpret the absence of instanton corrections as imposing a particular homogeneous algebraic relation among the periods. In Sect. 4.2 we saw a condition for vanishing of instanton corrections was that $c_2(t) = 0$. Equivalently, as described in [3, §2.1] for example, this can be interpreted as the vanishing of the fourth W-algebra generator $w_4$ in the presence of the vanishing of the third ($w_3 = 0$ being a consequence of special geometry). This implies in particular that the projective normalized Picard–Fuchs equation has a set of fundamental solutions $\{u_1^3,\ u_1^2 u_2,\ u_1 u_2^2,\ u_2^3\}$, where $u_1$ and $u_2$ are the fundamental solutions to the cube root equation $u''(z) + Q(z) u(z) = 0$. In particular the projective periods map to a twisted cubic space curve.

What about other homogeneous algebraic relations among the periods? We call instanton corrections for which the Picard–Fuchs equation still admits homogeneous algebraic relations among the periods algebraic instanton corrections. A century ago Fano classified fourth order Fuchsian ordinary differential equations whose fundamental solutions satisfy homogeneous algebraic relations [9, pp. 496–497]. To paraphrase in more modern language:

Theorem 9. The projective solution to a fourth order Fuchsian ordinary differential equation falls into one of the following classes:
1. The projective solution lies on an algebraic (twisted cubic) curve in $\mathbb{P}^3$. These equations are symmetric cubes of second order Fuchsian ordinary differential equations.
2. There is a homogeneous quartic relation among the fundamental solutions. Such equations can be transformed by a differential algebraic change of variables $f = \alpha h'' + \beta h' + \gamma h$ to a member of the previous class.
3. A quadratic relation with nonvanishing discriminant exists among the fundamental solutions. These equations are the tensor product of two distinct second order Fuchsian ordinary differential equations $L_2 \otimes L_2$.
4. A quadratic relation with vanishing discriminant exists. These equations are formed by operator composition of a first order and a third order equation $L_1 \cdot L_3$.
5. No homogeneous algebraic relations exist among the fundamental solutions. This is the generic case.

We can of course reinterpret Fano’s result as providing a rough classification of algebraic instanton corrections. In the first and last cases at least, we know Fano’s classification parallels the classification by differential Galois group of the Picard–Fuchs equation. Since the Picard–Fuchs differential equation is a Fuchsian ordinary differential equation, the differential Galois group equals the Zariski closure of the global monodromy group. In the first case this corresponds to the symmetric cube monodromy representation of SL(2, C) in Sp(4). In the last case, the monodromy representation is irreducible and the differential Galois group is all of Sp(4). It should be possible to fill in the other three entries as well.

In fact we can say more about the absence of algebraic relations among the periods in the last case. By special geometry there are no homogeneous algebraic relations among $\{u,\ u \cdot t,\ u \cdot \dot{F},\ u \cdot (t \dot{F} - 2F)\}$

which implies there are no algebraic relations whatsoever among {t, F˙ , (t F˙ − 2F )}. Hence there are no algebraic relations among {t, F, F˙ }, and thus no algebraic relations between {t, F }. Moreover the modularity equation from Theorem 9 takes a particularly simple form in each of the nongeneric cases (e.g., it characterizes “bimodularity” in class 3. above). References 1. Atkin, A.O.L. and Swinnerton-Dyer, H.P.F.: Modular forms on noncongruence subgroups. In: Combinatorics, Proc. Sympos. Pure Math., Vol. XIX, Univ. California, Los Angeles, Calif., 1968, Providence: Am. Math. Soc. 1971, pp. 1–25 2. Candelas, P.: A pair of Calabi–Yau manifolds as an exactly soluble superconformal theory. Reprinted in [43, pp. 31–95] 3. Ceresole, A., D’Auria, R., Ferrara, S., Lerche, W., Louis, J., Regge, T.: Picard–Fuchs equations, special geometry, and target space duality. In: [12, pp. 281–354] 4. Conway, J.H. and Norton, S.P.: Monstrous moonshine. Bull. London Math. Soc. 11, 308–339 (1979) 5. Cox, D.A. and Katz, S.: Mirror Symmetry and Algebraic Geometry. Providence: Am. Math. Soc., 1999 6. Dolgachev, I.V.: Mirror symmetry for lattice polarized K3 surfaces. Algebraic geometry, 4. J. Math. Sci. 81, 2599–2630 (1996) 7. Doran, C.F.: Picard–Fuchs Uniformization and Geometric Isomonodromic Deformations: Modularity and Variation of the Mirror Map. Ph.D. thesis, Harvard University, April 1999 8. Doran, C.F.: Picard–Fuchs uniformization: Modularity of the mirror map and Mirror-Moonshine. In B. Gordon, et al., Eds., The Arithmetic and Geometry of Algebraic Cycles: Proceedings of the CRM Summer School, June 7–19, 1998, Banff, Alberta, Canada. Centre de Recherches Mathématiques, CRM Proceedings and Lecture Notes. 24, 2000, pp. 257–281 9. Fano, G.: Über lineare homogene Differentialgleichungen mit algebraische Relationen zwischen den Fundamentallösungen. Math. Ann. 53, 493–590 (1900) 10. Freed, D.: Special Kähler manifolds. Commun. Math. Phys. 203, 31–52 (1999) 11. Fricke, R. 
and Klein, F.: Vorlesungen über die Theorie der elliptischen Modulfunktionen. I, II. Leipzig: Teubner, 1890, 1892 12. Greene, B. and Yau, S.-T. (eds.): Mirror Symmetry II. In: AMS/IP Studies in Advanced Mathematics. 1. Providence and Cambridge: Amer. Math. Soc. and International Press, 1997 13. Gunning, R.C.: Lectures on Riemann surfaces. Princeton Mathematical Notes. Princeton: Princeton University Press, 1966 14. Hosono, S., Klemm, A., Theisen, S., Yau, S.-T.: Mirror symmetry, mirror map and applications to Calabi– Yau hypersurfaces. Commun. Math. Phys. 167, 301–350 (1995) 15. Hosono, S., Klemm, A., Theisen, S.,Yau, S.-T.: Mirror symmetry, mirror map and applications to complete intersection Calabi–Yau spaces. Nucl. Phys. B. 433, 501–552 (1995). Reprinted in [12, pp. 545–606] 16. Hosono, S., Lian, B.H., Yau, S.-T.: GKZ-generalized hypergeometric systems in mirror symmetry of Calabi–Yau hypersurfaces. Commun. Math. Phys. 182, 535–578 (1996) 17. Hosono, S., Lian, B.H., Yau, S.-T.: Maximal degeneracy points of GKZ systems. J. Am. Math. Soc. 10, 427–443 (1997) 18. Hosono, S., Lian, B.H., Yau, S.-T.: Calabi–Yau varieties and pencils of K3 surfaces. LANL archive preprint, alg-geom/9603020 19. Klemm, A., Lian, B.H., Roan, S.-S., Yau, S.-T.: A note on ODEs from mirror symmetry. In: Functional Analysis on the Eve of the 21st Century, Vol. II (New Brunswick, NJ, 1993). Progr. Math. 132, Boston: Birkhäuser, 1996, pp. 301–323. 20. Kobayashi, S. and Ochiai, T.: Holomorphic structures modeled after hyperquadrics. Tôhoku Math. J. 34, 587–629 (1982) 21. Lee, M.-H.: Picard–Fuchs equations for elliptic modular varieties. Appl. Math. Lett. 4 91–95 (1991) 22. Lian, B.H., Liu, K., Yau, S.-T.: The Candelas-de la Ossa-Green- Parkes formula. String Theory, Gauge Theory and Quantum Gravity (Trieste, 1997). Nuclear Phys. B Proc. Suppl. 67, 106–114 (1998) 23. Lian, B.H. and Yau, S.-T.: Mirror symmetry, rational curves on algebraic manifolds and hypergeometric series. 
In: XIth International Congress of Mathematical Physics (Paris, 1994). Cambridge: International Press, 1995, pp. 163–184 24. Lian, B.H. and Yau, S.-T.: Arithmetic properties of mirror map and quantum coupling. Commun. Math. Phys. 176, 163–191 (1996)

25. Lian, B.H. and Yau, S.-T.: Mirror maps, modular relations and hypergeometric series. I. LANL archive preprint, hep-th/9507151 26. Lian, B.H. and Yau, S.-T.: Mirror maps, modular relations and hypergeometric series. II. S-duality and Mirror Symmetry (Trieste, 1995). Nuclear Phys. B Proc. Suppl. 46, 248–262 (1996) 27. Lian, B.H. and Yau, S.-T.: A note on ODEs from mirror symetry II. In preparation. 28. Matsumoto, K., Sasaki, T., Yoshida, M.: Recent progress of Gauss- Schwarz theory and related geometric structures. Memoirs of the Faculty of Science, Kyushu University. Ser. A. 47, 283–381 (1993) 29. Miranda, R.: The moduli of Weierstrass fibrations over P1 . Math. Ann. 255, 379–394 (1981) 30. Miranda, R.: The Basic Theory of Elliptic Surfaces. Dottorato di Ricerca in Matematica. Pisa: ETS Editrice (1989) 31. Namba, M.: Branched Coverings and Algebraic Functions. Pitman Res. Notes Math. Ser. 161. Harlow: Longman Scientific & Technical, 1987 32. Peters, C.: Monodromy and Picard–Fuchs equations for families of K3-surfaces and elliptic curves. Ann. Sci. École Norm. Sup. (4) 19, 583–607 (1986) 33. Peters, C. and Stienstra, J.: A pencil of K3-surfaces related to Apéry’s recurrence for ζ (3) and fermi surfaces for potential zero. In: Arithmetic of Complex Manifolds (Erlangen, 1988), Lecture Notes in Math. 1399. Berlin: Springer, 1989, pp. 110–127 34. Phong, D.H., Vinet, L.,Yau, S.-T. (eds.): Mirror Symmetry III. AMS/IP Studies in Advanced Mathematics, 10. Providence, Cambridge, and Montreal: Am. Math. Soc., International Press, and Centre de Recherches Mathématiques, 1999 35. Sasai, T.: Monodromy representations of homology of certain elliptic surfaces. J. Math. Soc. Japan 26, 296–305 (1974) 36. Sasaki, T. and Yoshida, M.: Linear differential equations in two variables of rank four. I. Math. Ann. 282, 69–93 (1988) 37. Sasaki, T. and Yoshida, M.: Linear differential equations modeled after hyperquadrics. Tôhoku Math. J. 41, 321–348 (1989) 38. 
Singer, M.: Algebraic relations among solutions of linear differential equations: Fano’s theorem. Am. J. Math. 110, 115–143 (1988) 39. Stiller, P.: On the uniformization of certain curves. Pacific J. Math. 107, 229–244 (1983) 40. Strominger, A.: Special geometry. Commun. Math. Phys. 133, 163–180 (1990) 41. Venkov, A.B.: Examples of the effective solution of the Riemann–Hilbert problem on the reconstruction of a differential equation from a monodromy group in the framework of the theory of automorphic functions. (Russian) Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 162, Avtomorfn. Funkts. i Teor. Chisel. III, 5–42, 189 (1987) 42. Verrill, H. and Yui, N.: Thompson series, and the mirror maps of pencils of K3 surfaces. In B. Gordon, et al., Eds., The Arithmetic and Geometry of Algebraic Cycles: Proceedings of the CRM Summer School, June 7–19, 1998, Banff, Alberta, Canada. Centre de Recherches Mathématiques, CRM Proceedings and Lecture Notes. 24, 2000, pp. 399–432 43. Yau, S.-T. (ed.): Mirror Symmetry I. AMS/IP Studies in Advanced Mathematics, 9. Cambridge: International Press, 1998 44. Yoshida, M.: Fuchsian differential equations. With special emphasis on the Gauss-Schwarz theory.Aspects of Mathematics, E11. Braunschweig: Friedr. Vieweg & Sohn, 1987 Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 212, 649 – 652 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

A Geometric Interpretation of the χy Genus on Hyper-Kähler Manifolds George Thompson ICTP, P.O. Box 586, 34100 Trieste, Italy. E-mail: [email protected] Received: 3 December 1999 / Accepted: 30 January 2000

Abstract: The group SL(2) acts on the space of cohomology groups of any hyper-Kähler manifold $X$. The $\chi_y$ genus of a hyper-Kähler $X$ is shown to have a geometric interpretation as the super trace of an element of SL(2). As a by-product one learns that the generalized Casson invariant for a mapping torus is essentially the $\chi_y$ genus.

1. Introduction

The $\chi_y$ genus of Hirzebruch is a very interesting and rather powerful invariant. There are three significant values for $y$. At $y = -1$ the $\chi_y$ genus is the Euler characteristic, at $y = 0$ it is the Todd genus, while at $y = 1$ it is the signature. There seems to be, however, no geometric understanding of the genus away from these preferred values of $y$. In this short note, I prove that for (compact) hyper-Kähler manifolds, there is, in fact, quite a clear geometric meaning to the genus.

For hyper-Kähler manifolds there is a natural SL(2) action, associated with the holomorphic 2-form, on the cohomology groups $H^q(X, \Omega_X^p)$ which preserves $q$ and shifts $p$ by even integers. This means that $(-1)^{q+p}$ is preserved. One can, therefore, take the graded trace of an SL(2) element, with the grading given by $(-1)^{p+q}$. Denote the graded trace of $U \in SL(2)$ by $\mathrm{STr}\, U$. The geometric meaning of the $\chi_y$ genus for hyper-Kähler $X$ is the content of the following

Theorem 1.1. Let $X$ be an irreducible compact hyper-Kähler manifold of real dimension $4n$. Let $U \in SL(2)$ and $y$ be an eigenvalue of $U$, in the two dimensional representation, then

\[ \mathrm{STr}\, U = \frac{\chi_{-y}}{y^n}. \tag{1.1} \]

Remarks. 1) Note that, since $h^{(p,q)} = h^{(2n-p,q)}$, the right-hand side is invariant under $y \to 1/y$, so that it does not depend on which eigenvalue one picks. 2) Once one expects that a result of this kind is true the proof turns out to be embarrassingly easy.

The motivation for this result comes from the study of 3-manifold invariants. Rozansky and Witten [RW] indicated how, given a hyper-Kähler manifold $X$, one could associate to the Mapping Torus $T_U$, the invariant $\mathrm{STr}\, U$. In [T], I showed that one could perform the associated path integral. The solution found there is, in fact, the Riemann–Roch formula for the $\chi_y$ genus divided by $y^n$. This motivated the above theorem, which can be proven without recourse to physics. However, one can now read the derivation in [T] as a path integral proof of the Riemann–Roch formula for the $\chi_y$ genus. That path integral calculation of $\mathrm{STr}\, U$ gave

\[ \int_X \mathrm{Todd}(TX_{\mathbb{C}})\, \mathrm{Det}^{1/2} \bigl( U \otimes I - I \otimes e^{R} \bigr), \tag{1.2} \]

which can be re-written as

\[ \int_X \mathrm{Todd}(TX_{\mathbb{C}}) \prod_{i=1}^{n} (t - 2 \cosh x_i), \tag{1.3} \]

where $t$ is the character of $U$ in the 2-dimensional representation. The $\chi_y$ genus is given by Riemann–Roch as [NR]

\[ \chi_{-y}(X) = \int_X \mathrm{Todd}(TX_{\mathbb{C}}) \prod_{i=1}^{2n} \bigl( 1 - y e^{-x_i} \bigr), \tag{1.4} \]

but since $X$ is hyper-Kähler one has that $x_{i+n} = -x_i$ for $i \le n$. This means that

\[ \chi_{-y}(X) = \int_X \mathrm{Todd}(TX_{\mathbb{C}}) \prod_{i=1}^{n} \bigl( (1 + y^2) - 2y \cosh(x_i) \bigr), \tag{1.5} \]

so that this suggests (1.1) on setting $ty = 1 + y^2$. Consequently we have, in the notation of [T],

Corollary 1.2. The Rozansky–Witten invariant $Z_X^{RW}[T_U] = \chi_{-y}/y^n$, for $U \in SL(2, \mathbb{Z})$.

Further Remarks. 1) The essential feature used here is the SL(2) action that is made available by the holomorphic 2-form. Hence this is not the same as thinking of $X$ as a Kähler manifold and making use of the usual SL(2) action that comes from the symplectic 2-form (Lefschetz decomposition). 2) There is a rather more general formula that was suggested by the work of [RW]. If one considers a “mapping Riemann surface”, for a Riemann surface, $\Sigma$, of genus $g$, then the Rozansky–Witten invariant $Z_X^{RW}[\Sigma_U] = \mathrm{STr}\, U$, where $U \in Sp(g)$ and this group acts on $H^q(X, (\Omega_X^{*})^{\otimes g})$. In [T] a Riemann–Roch formula for this super trace was given which looks like a Riemann–Roch formula for a generalized $\chi_y$ genus. That suggests that the corresponding generalized $\chi_y$ can be rigorously shown to be the super trace. This has important implications for 3-manifold invariants. 3) Similar, though not identical, path integral formulae are available for general holomorphic symplectic manifolds. 4) Justin Sawon [S] has made use of the weight system in [RW] in an ingenious way to get constraints on the Chern numbers of $X$.

2. The SL(2) Action on X

The SL(2, C) action on the cohomology groups of $X$, that we are interested in, is perhaps best explained at the level of the Lie algebra, Lie SL(2) := sl(2). Let $L_\varepsilon : H^q(X, \Omega_X^p) \to H^q(X, \Omega_X^{p+2})$ be the map given by the cup-product with the holomorphic 2-form $\varepsilon$. Let $\iota_\varepsilon : H^q(X, \Omega_X^p) \to H^q(X, \Omega_X^{p-2})$ be contraction with respect to $\varepsilon$. To fix conventions we note that in local holomorphic coordinates if $\omega \in \Omega^{(p,q)}(X)$, then, suppressing the anti-holomorphic factors, (the Einstein summation convention is in force)

\[ \omega = \omega_{I_1, \ldots, I_p}\, dz^{I_p} \wedge \cdots \wedge dz^{I_1}, \tag{2.1} \]

and

\[ \iota_\varepsilon\, \omega = \frac{p(p-1)}{2}\, \omega_{I_1, I_2, I_3, \ldots, I_p}\, \varepsilon^{I_1 I_2}\, dz^{I_3} \wedge \cdots \wedge dz^{I_p}. \tag{2.2} \]

The algebra satisfied by these operators is, by a straightforward computation,

\[ [\iota_\varepsilon, L_\varepsilon] = (n - p) \tag{2.3} \]

understood as a map $H^q(X, \Omega_X^p) \to H^q(X, \Omega_X^p)$. The generators of sl(2) are then realized as

\[ \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \sim L_\varepsilon, \qquad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \sim \iota_\varepsilon, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \sim (n - p). \tag{2.4} \]

The following is taken from the survey by Huybrechts [H] (but see also the original work by Fujiki [F]). Let

\[ \hat{H}^q(X, \Omega_X^p) := \ker L_\varepsilon^{\,n-p+1}, \tag{2.5} \]

then the Lefschetz decomposition theorem tells us that

\[ H^q(X, \Omega_X^p) = \bigoplus_{(p-l) \ge \max(p-n,\, 0)} L_\varepsilon^{\,p-l}\, \hat{H}^q(X, \Omega_X^{2l-p}). \tag{2.6} \]

One thinks of $L_\varepsilon$ as a raising operator, and the $\hat{H}^q(X, \Omega_X^p)$, for $0 \le p \le n$, are the highest weight vectors of the $n - p + 1$ dimensional irreducible representations of SL(2, C). One also has, by a straightforward count, that

\[ \dim \hat{H}^q(X, \Omega_X^p) := \hat{h}^{(p,q)} = h^{(p,q)} - h^{(p-2,q)}. \tag{2.7} \]
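To see how this representation content feeds into Theorem 1.1, it may help to work through the simplest case, a K3 surface ($n = 1$), whose nonzero Hodge numbers are $h^{(0,0)} = h^{(2,0)} = h^{(0,2)} = h^{(2,2)} = 1$ and $h^{(1,1)} = 20$. The Python sketch below is only an illustration under these assumptions (the function names are hypothetical and the choice of K3 is mine, not the text's). It builds the graded trace from the primitive dimensions $h^{(p,q)} - h^{(p-2,q)}$ of (2.7), weighting each by the character of the $(n - p + 1)$-dimensional irreducible representation, and compares the result with $\chi_{-y}/y^n$ computed directly from the Hodge numbers.

```python
from fractions import Fraction


def character(r, y):
    """Character of diag(y, 1/y) in the r-dimensional irrep of SL(2); 0 for r <= 0."""
    if r <= 0:
        return Fraction(0)
    return sum(y ** (r - 1 - 2 * k) for k in range(r))


def graded_trace(hodge, n, y):
    """STr U assembled from primitive dimensions h^(p,q) - h^(p-2,q), 0 <= p <= n."""
    total = Fraction(0)
    for q in range(2 * n + 1):
        for p in range(n + 1):
            primitive = hodge.get((p, q), 0) - hodge.get((p - 2, q), 0)
            total += (-1) ** (p + q) * primitive * character(n - p + 1, y)
    return total


def chi_minus_y_over_yn(hodge, n, y):
    """chi_{-y} / y^n with chi_{-y} = sum over (p, q) of (-1)^(p+q) h^(p,q) y^p."""
    return sum((-1) ** (p + q) * h * y ** (p - n) for (p, q), h in hodge.items())


# Nonzero Hodge numbers of a K3 surface, keyed by (p, q).
K3_HODGE = {(0, 0): 1, (2, 0): 1, (1, 1): 20, (0, 2): 1, (2, 2): 1}

if __name__ == "__main__":
    y = Fraction(3, 2)
    print(graded_trace(K3_HODGE, 1, y), chi_minus_y_over_yn(K3_HODGE, 1, y))
```

For K3 both sides reduce to $2(y + 1/y) + 20$: the two-dimensional representations sit in $q = 0$ and $q = 2$, and the twenty classes in $H^1(X, \Omega_X^1)$ are SL(2) singlets.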

3. Proof of Theorem 1.1

The proof is by direct computation. Let $t_r$ be the character of $U$ in the $r$ dimensional irreducible representation of SL(2, C) and set $t_2 = t$. Note that $t_1 = 1$, and I use the convention that $t_r = 0$ for $r \le 0$, as well as $h^{(p,q)} = 0$ if $p < 0$. Then

\[ \mathrm{STr}\, U = \sum_{q=0}^{2n} \sum_{p=0}^{n} (-1)^{p+q}\, t_{n-p+1}\, \hat{h}^{(p,q)}, \tag{3.1} \]

where $\hat{h}^{(p,q)} = h^{(p,q)} - h^{(p-2,q)}$ is the dimension appearing in (2.7). One can re-write this expression as

\[ \mathrm{STr}\, U = \sum_{q=0}^{2n} \sum_{p=0}^{n} (-1)^{p+q}\, h^{(p,q)} \bigl( t_{n-p+1} - t_{n-p-1} \bigr). \tag{3.2} \]

Now notice, on making use of Serre duality, which implies that $h^{(p,q)} = h^{(2n-p,q)}$, that the $\chi_y$ genus satisfies

\[ \frac{\chi_{-y}}{y^n} = \sum_{q=0}^{2n} \sum_{p=0}^{n-1} (-1)^{p+q}\, h^{(p,q)} \bigl( y^{n-p} + y^{p-n} \bigr) + \sum_{q=0}^{2n} (-1)^{n+q}\, h^{(n,q)}. \tag{3.3} \]

A comparison of (3.2) and (3.3) shows us that they agree if we can set

\[ t_{r+1} - t_{r-1} = y^r + y^{-r}, \qquad r > 0. \tag{3.4} \]

For $r = 1$ this reads as

\[ t y = y^2 + 1, \tag{3.5} \]

which is simply the characteristic polynomial for the two-dimensional representation of $U$, where $y$ is an eigenvalue and $t$ is the trace. Once we make this identification, (3.4) is a standard relationship between characters and eigenvalues for SL(2).

Acknowledgements. I would like to thank M. Blau, L. Göttsche and A. King for discussions. Special thanks are due to M. S. Narasimhan who made the right observations and the right remarks at the right time.

References

[F] Fujiki, A.: On the de Rham Cohomology Group of a compact Kähler Symplectic Manifold. In: Algebraic Geometry, Sendai, 1985. Advanced Studies in Pure Mathematics 10, T. Oda ed., Amsterdam: North Holland, 1987
[H] Huybrechts, D.: Compact Hyper-Kähler Manifolds: Basic Results. alg-geom/9705025.
[NR] Narasimhan, M.S., Ramanan, S.: Generalized Prym Varieties as Fixed Points. J. Indian Math. Soc. 39, 1–19 (1975)
[RW] Rozansky, L., Witten, E.: Hyper-Kähler Geometry and Invariants of Three Manifolds. hep-th/9612216
[S] Sawon, J.: The Rozansky–Witten Invariants of Hyper-Kähler Manifolds. Preprint of a talk presented at the Brno conference.
[T] Thompson, G.: On the Generalized Casson Invariant. To appear in Adv. in Theor. Math. Physics 3, hep-th/9811199.

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 212, 653 – 686 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Group Invariant Solutions Without Transversality

Ian M. Anderson¹, Mark E. Fels¹, Charles G. Torre²

¹ Department of Mathematics and Statistics, Utah State University, Logan, Utah 84322, USA
² Department of Physics, Utah State University, Logan, Utah 84322, USA

Received: 16 September 1999 / Accepted: 4 February 2000

Abstract: We present a generalization of Lie’s method for finding the group invariant solutions to a system of partial differential equations. Our generalization relaxes the standard transversality assumption and encompasses the common situation where the reduced differential equations for the group invariant solutions involve both fewer dependent and independent variables. The theoretical basis for our method is provided by a general existence theorem for the invariant sections, both local and global, of a bundle on which a finite dimensional Lie group acts. A simple and natural extension of our characterization of invariant sections leads to an intrinsic characterization of the reduced equations for the group invariant solutions for a system of differential equations. The characterization of both the invariant sections and the reduced equations are summarized schematically by the kinematic and dynamic reduction diagrams and are illustrated by a number of examples from fluid mechanics, harmonic maps, and general relativity. This work also provides the theoretical foundations for a further detailed study of the reduced equations for group invariant solutions. 1. Introduction Lie’s method of symmetry reduction for finding the group invariant solutions to partial differential equations is widely recognized as one of the most general and effective methods for obtaining exact solutions of non-linear partial differential equations. In recent years Lie’s method has been described in a number of excellent texts and survey articles (see, for example, Bluman and Kumei [10], Olver [29], Stephani [36], Vorob’ev [40], Winternitz [41]) and has been systematically applied to differential equations arising in a broad spectrum of disciplines (see, for example, Ibragimov [23] or Rogers and Shadwick [34]). 
Research supported by NSF grants DMS–9804833 and PHY–9732636.

It came, therefore, as quite a surprise to the present authors that Lie’s method, as it is conventionally described, does not provide an appropriate theoretical framework

for the derivation of such celebrated invariant solutions as the Schwarzschild solution of the vacuum Einstein equations, the instanton and monopole solutions in Yang–Mills theory or the Veronese map for the harmonic map equations. The primary objectives of this paper are to focus attention on this deficiency in the literature on Lie’s method, to describe the elementary steps needed to correct this problem, and to give a precise formulation of the reduced differential equations for the group invariant solutions which arise from this generalization of Lie’s method. A second impetus for the present article is to provide the foundations for a systematic study of the interplay between the formal geometric properties of a system of differential equations, such as the conservation laws, symmetries, Hamiltonian structures, variational principles, local solvability, formal integrability and so on, and those same properties of the reduced equations for the group invariant solutions. Two problems merit special attention. First, one can interpret the principle of symmetric criticality [32, 33] as the problem of determining those group actions for which the reduced equations of a system of Euler-Lagrange equations are derivable from a canonically defined Lagrangian. Our previous work [2] on this problem, and the closely related problem of reduction of conservation laws, was cast entirely within the context of transverse group actions. Therefore, in order to extend our results to include the reductions that one encounters in field theory and differential geometry, one needs the more general description of Lie symmetry reduction obtained here. 
Secondly, there do not appear to be any general theorems in the literature which insure the local existence of group invariant solutions to differential equations; however, as one step in this direction the results presented here can be used to determine when a system of differential equations of Cauchy–Kovalevskaya type remain of Cauchy–Kovalevskaya type under reduction [4].

We begin by quickly reviewing the salient steps of Lie’s method and then comparing Lie’s method with the standard derivation of the Schwarzschild solution of the vacuum Einstein equations. This will clearly demonstrate the difficulties with the classical Lie approach. In Sect. 3 we describe, in detail, a general method for characterizing the group invariant sections of a given bundle. In Sect. 4 the reduced equations for the group invariant solutions are constructed in the case where reduction in both the number of independent and dependent variables can occur. We define the residual symmetry group of the reduced equations in Sect. 5. In Sect. 6 we illustrate, at some length, these results with a variety of examples. In the appendix we briefly outline some of the technical issues underlying the general theory of Lie symmetry reduction for the group invariant solutions of differential equations.

2. Lie’s Method for Group Invariant Solutions

Consider a system of second-order partial differential equations

\[ \Delta_\beta(x^i, u^\alpha, u^\alpha_i, u^\alpha_{ij}) = 0 \tag{2.1} \]

for the $m$ unknown functions $u^\alpha$, $\alpha = 1, \ldots, m$, as functions of the $n$ independent variables $x^i$, $i = 1, \ldots, n$. As usual, $u^\alpha_i$ and $u^\alpha_{ij}$ denote the first and second order partial derivatives of the functions $u^\alpha$. We have assumed that Eqs. (2.1) are second-order and that the number of equations coincides with the number of unknown functions strictly for the sake of simplicity. A fundamental feature of Lie’s entire approach to symmetry reduction of differential equations, and one that contributes greatly to its broad applicability, is that the Lie algebra of infinitesimal symmetries of a system of differential equations can be

Group Invariant Solutions Without Transversality

655

systematically and readily determined. We are not so much concerned with this aspect of Lie’s work and accordingly assume that the symmetry algebra of (2.1) is given. Now let be a finite dimensional Lie subalgebra of the symmetry algebra of (2.1), generated by vector fields Va = ξai (x j )

∂ ∂ + ηaα (x j , uβ ) α , i ∂x ∂u

(2.2)

where $a = 1, 2, \dots, p$. A map $s : \mathbb{R}^n \to \mathbb{R}^m$ given by $u^\alpha = s^\alpha(x^i)$ is said to be invariant under the Lie algebra $\Gamma$ if its graph is invariant under the local flows of the vector fields (2.2). One finds this to be the case if and only if the functions $s^\alpha(x^i)$ satisfy the infinitesimal invariance equations

\[ \xi_a^i(x^j)\,\frac{\partial s^\alpha}{\partial x^i} = \eta_a^\alpha(x^j, s^\beta(x^j)) \tag{2.3} \]

for all $a = 1, 2, \dots, p$. The method of Lie symmetry reduction consists of explicitly solving the infinitesimal invariance equations (2.3) and substituting the solutions of (2.3) into (2.1) to derive the reduced equations for the invariant solutions.

In order to solve (2.3) it is customarily assumed (see, for example, Olver [29], Ovsiannikov [30], or Winternitz [41]) that the rank of the matrix $\xi_a^i(x^j)$ is constant, say $q$, and that the Lie algebra of vector fields $\Gamma$ satisfies the local transversality condition

\[ \operatorname{rank}\,[\xi_a^i(x^j)] = \operatorname{rank}\,[\xi_a^i(x^j),\ \eta_a^\alpha(x^j, u^\beta)]. \tag{2.4} \]

Granted (2.4), it then follows that there exist local coordinates

\[ \tilde x^r = \tilde x^r(x^j), \quad \hat x^k = \hat x^k(x^j) \quad\text{and}\quad v^\alpha = v^\alpha(x^j, u^\beta), \tag{2.5} \]

on the space of independent and dependent variables, where $r = 1, \dots, n - q$, $k = 1, \dots, q$, and $\alpha = 1, \dots, m$, such that, in these new coordinates, the vector fields $V_a$ take the form

\[ V_a = \sum_{l=1}^{q} \hat\xi_a^l(\tilde x^r, \hat x^k)\,\frac{\partial}{\partial \hat x^l}. \tag{2.6} \]

The coordinate functions $\tilde x^r$ and $v^\alpha$ are the infinitesimal invariants for the Lie algebra of vector fields $\Gamma$. In these coordinates the infinitesimal invariance equations (2.3) for $v^\alpha = v^\alpha(\tilde x^r, \hat x^k)$ can be explicitly integrated to give $v^\alpha = v^\alpha(\tilde x^r)$, where the $v^\alpha(\tilde x^r)$ are arbitrary smooth functions. One now inverts the relations (2.5) to find that the explicit solutions to (2.3) are given by

\[ s^\alpha(\tilde x^r, \hat x^k) = u^\alpha(\tilde x^r, \hat x^k, v^\beta(\tilde x^r)). \tag{2.7} \]

Finally one substitutes (2.7) into the differential equations (2.1) to arrive at the reduced system of differential equations

\[ \tilde\Delta_\beta(\tilde x^r, v^\alpha, v^\alpha_r, v^\alpha_{rs}) = 0. \tag{2.8} \]

Every solution of (2.8) therefore determines, by (2.7), a solution of (2.1) which also satisfies the invariance condition (2.3). In many applications of Lie reduction one picks the Lie algebra of vector fields (2.2) so that $q = n - 1$, in which case there is only one independent invariant $\tilde x$ on $M$ and (2.8) is a system of ordinary differential equations.
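As a minimal illustration of the classical prescription (2.3)–(2.8), the following sketch works through a standard textbook case that is not taken from the paper: Laplace's equation in the plane, reduced by the rotation generator. Here $n = 2$, $m = 1$ and $p = q = 1$, and the transversality condition (2.4) holds trivially because the generator has no $\partial/\partial u$ component. The choice of equation and the polar coordinates $(r, \theta)$ are assumptions made only for this example.

```latex
% Illustrative example (assumed, not from the source): u_{xx} + u_{yy} = 0 on R^2 \ {0}.
\begin{align*}
&\text{generator:} & V &= -y\,\frac{\partial}{\partial x} + x\,\frac{\partial}{\partial y}, \\
&\text{invariance equation (2.3):} & -y\,\frac{\partial s}{\partial x} + x\,\frac{\partial s}{\partial y} &= 0, \\
&\text{adapted coordinates (2.5):} & \tilde x = r = \sqrt{x^2 + y^2}, \quad \hat x &= \theta, \quad v = u, \\
&\text{invariant sections (2.7):} & s(x, y) &= v(r), \\
&\text{reduced equation (2.8):} & v''(r) + \frac{1}{r}\,v'(r) &= 0 .
\end{align*}
```

Since $q = n - 1 = 1$, the reduced equation is an ordinary differential equation, with general solution $v = c_1 \log r + c_2$.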

For the vacuum Einstein equations the independent variables $x^i$, $i = 0, \dots, 3$, are the local coordinates on a 4-dimensional spacetime, the dependent variables are the 10 components $g_{ij}$ of the spacetime metric and the differential equations (2.1) are given by the vanishing of the Einstein tensor $G_{ij} = 0$. In the case of the spherically symmetric, stationary solutions to the vacuum Einstein equations the relevant infinitesimal symmetry generators on spacetime are

\[ V_0 = \frac{\partial}{\partial x^0}, \quad V_1 = x^3\frac{\partial}{\partial x^2} - x^2\frac{\partial}{\partial x^3}, \quad V_2 = -x^3\frac{\partial}{\partial x^1} + x^1\frac{\partial}{\partial x^3}, \quad\text{and}\quad V_3 = x^2\frac{\partial}{\partial x^1} - x^1\frac{\partial}{\partial x^2} \]

and the symmetry conditions, as represented by the Killing equations $\mathcal{L}_{V_a} g_{ij} = 0$, lead to the familiar ansatz (in spherical coordinates)

\[ ds^2 = A(r)\,dt^2 + B(r)\,dt\,dr + C(r)\,dr^2 + D(r)\,(d\phi^2 + \sin^2\phi\,d\theta^2). \tag{2.9} \]

The substitution of (2.9) into the field equations leads to a system of ODEs whose general solution leads to the Schwarzschild solution to the vacuum Einstein field equations.

What happens if we attempt to derive the Schwarzschild solution using the classical Lie ansatz (2.7)? To begin, it is necessary to lift the vector fields $V_a$ to the space of independent and dependent variables in order to account for the induced action of the infinitesimal spacetime transformations on the components of the metric. These lifted vector fields are $\hat V_0 = V_0$ and

\[ \hat V_k = V_k - 2\,\frac{\partial V_k^l}{\partial x^i}\, g_{lj}\,\frac{\partial}{\partial g_{ij}}. \tag{2.10} \]

In terms of these lifted vector fields, the infinitesimal invariance equations (2.3) then coincide exactly with the Killing equations. However, (2.7) cannot possibly coincide with (2.9) since the latter contains only 4 arbitrary functions $A(r)$, $B(r)$, $C(r)$, $D(r)$ whereas (2.7) would imply that the general stationary, rotationally invariant metric depends upon 10 arbitrary functions of $r$. This discrepancy is easily accounted for: in this example

\[ \operatorname{rank}\,[V_0, V_1, V_2, V_3] = 3 \quad\text{while}\quad \operatorname{rank}\,[\hat V_0, \hat V_1, \hat V_2, \hat V_3] = 4, \]

and hence the local transversality condition (2.4) does not hold. Indeed, whenever the local transversality condition fails, the general solution to the infinitesimal invariance equation will depend upon fewer arbitrary functions than the original number of dependent variables. The reduced differential equations will be a system of equations with both fewer independent and dependent variables.

We remark that in many of the exhaustive classifications of invariant solutions using Lie reduction either the number of independent variables is 2 and hence, typically, the number of vector fields $V_a$ is one, or there is just a single dependent variable and (2.1) is a scalar partial differential equation. In either circumstance the local transversality condition is normally satisfied and the ansatz (2.7) gives the correct solution to the infinitesimal invariance equation (2.3). However, once the numbers of independent and dependent variables exceed these minimal thresholds, as is the case in most physical field theories, the local transversality condition is likely to fail.
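A small numerical sketch can make the failure of (2.4) tangible. The code below is an illustration under stated assumptions, not part of the paper: it takes the three rotation generators $V_a = \varepsilon_{aij}\,(x^i\,\partial/\partial x^j + u^i\,\partial/\partial u^j)$ acting on a position $x$ and a vector $u$, evaluates the coefficient matrices at a sample point, and compares the two ranks appearing in (2.4). The helper names `rank` and `rotation_coefficients` and the sample points are hypothetical choices.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix over the rationals, by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def rotation_coefficients(x, u):
    """Return ([xi_a^i], [xi_a^i, eta_a^alpha]) for the assumed rotation generators."""
    x1, x2, x3 = x
    u1, u2, u3 = u
    xi = [[0, -x3, x2], [x3, 0, -x1], [-x2, x1, 0]]
    eta = [[0, -u3, u2], [u3, 0, -u1], [-u2, u1, 0]]
    return xi, [a + b for a, b in zip(xi, eta)]


# Sample point (an arbitrary choice) where u is not parallel to x.
xi, full = rotation_coefficients(x=(1, 2, 3), u=(4, -1, 2))
rank_xi, rank_full = rank(xi), rank(full)
transversal = (rank_xi == rank_full)
```

At a point where $u$ is not parallel to $x$ the two ranks differ, so transversality fails there; the ranks agree only where $u$ is parallel to $x$.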

3. An Existence Theorem for Invariant Sections

Let $M$ be an $n$-dimensional manifold and $\pi : E \to M$ a bundle over $M$. In our applications to Lie symmetry reduction the manifold $M$ serves as the space of independent variables and the bundle $E$ plays the role of the total space of independent and dependent variables. We refer to points of $M$ with local coordinates $(x^i)$ and to points of $E$ with local coordinates $(x^i, u^\alpha)$, for which the projection map $\pi$ is given by $\pi(x^i, u^\alpha) = (x^i)$. In many applications $E$ either is a trivial bundle $E = M \times N$, a vector bundle over $M$, or a fiber bundle over $M$ with finite dimensional structure group. However, for the purposes of this paper one need only suppose that $\pi$ is a smooth submersion. We let $E_x = \pi^{-1}(x)$ denote the fiber of $E$ over the point $x \in M$.

Now let $G$ be a finite dimensional Lie group which acts smoothly on $E$. We assume that $G$ acts projectably on $E$ in the sense that the action of each element of $G$ is a fiber preserving transformation on $E$: if $p$, $q$ lie in a common fiber, then so do $g \cdot p$ and $g \cdot q$. Consequently, there is a smooth induced action of $G$ on $M$. The action of $G$ on the space of sections of $E$ is then given by

\[ (g \cdot s)(x) = g \cdot [\,s(g^{-1} \cdot x)\,] \tag{3.1} \]

for each smooth section $s : M \to E$. A section $s$ is invariant if $g \cdot s = s$ for all $g \in G$. More generally, we have the following definition.

Definition 3.1. Let $G$ be a smooth projectable group action on the bundle $\pi : E \to M$ and let $U \subset M$ be open. Then a smooth section $s : U \to E$ is $G$ invariant if, for all $x \in U$ and $g \in G$ such that $g \cdot x \in U$,

\[ s(g \cdot x) = g \cdot s(x). \tag{3.2} \]

Let $\Gamma$ be the Lie algebra of vector fields on $E$ which are the infinitesimal generators for the action of $G$ on $E$. Since the action of $G$ is assumed projectable, any basis $V_a$, $a = 1, \dots, p$, of $\Gamma$ assumes the local coordinate form (2.2). If $g_t$ is a one-parameter subgroup of $G$ with associated infinitesimal generator $V_a$ on $E$, then by differentiating the invariance condition $s(g_t \cdot x) = g_t \cdot s(x)$ one finds that the component functions $s^\alpha(x^i)$ satisfy the infinitesimal invariance condition (2.3). If $s$ is globally defined on all of $M$ and if $G$ is connected, then the infinitesimal invariance criterion (2.3) implies (3.2). This may not be true if $G$ is not connected or if $s$ is only defined on a proper open subset of $M$.

For the purposes of finding group invariant solutions of differential equations, we shall take the group $G$ to be a symmetry group of the given system of differential equations. The task at hand is to explicitly identify the space of $G$ invariant sections of $E$ with sections of an auxiliary bundle $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \tilde M$ and to construct the differential equations for the $G$ invariant sections as a reduced system of differential equations on the sections of $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \tilde M$.

Our characterization of the $G$ invariant sections of $E$ is based upon the following key observation. Suppose that $p \in E$ and that there is a $G$ invariant section $s : U \to E$ with $s(x) = p$, where $x \in U$. Let $G_x = \{\, g \in G \mid g \cdot x = x \,\}$ be the isotropy subgroup of $G$ at $x$. Then, for every $g \in G_x$, we compute

\[ g \cdot p = g \cdot s(x) = s(g \cdot x) = s(x) = p. \tag{3.3} \]

This equation shows that the isotropy subgroup $G_x$ constrains the admissible values that an invariant section can assume at the point $x$. Accordingly, we define the kinematic bundle $\kappa_G(E)$ for the action of $G$ on $E$ by

\[ \kappa_G(E) = \bigcup_{x \in M} \kappa_{G,x}(E), \quad\text{where}\quad \kappa_{G,x}(E) = \{\, p \in E_x \mid g \cdot p = p \ \text{for all } g \in G_x \,\}. \tag{3.4} \]
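To make definition (3.4) concrete, here is a small example that is not taken from the paper: the rotation group $SO(2)$ acting on the trivial bundle $E = \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^2$ by rotating base and fiber simultaneously. The bundle and the action are assumptions chosen for illustration; the example shows how the isotropy subgroup, and with it the fiber of the kinematic bundle, can change from point to point.

```latex
% Assumed example: G = SO(2), E = R^2 x R^2 -> R^2, with R . (x, u) = (R x, R u).
\begin{align*}
x \neq 0: &\quad G_x = \{\,I\,\}, & \kappa_{G,x}(E) &= E_x \cong \mathbb{R}^2, \\
x = 0:    &\quad G_x = SO(2),     & \kappa_{G,0}(E) &= \{\, u \in \mathbb{R}^2 \mid R\,u = u \ \text{for all } R \in SO(2) \,\} = \{\,0\,\}.
\end{align*}
```

In this example the fiber dimension drops from 2 to 0 at the origin, so the fibers form a bundle of constant rank only over $\mathbb{R}^2 \setminus \{0\}$. This is one reason why hypotheses of the kind imposed in Theorem 3.3, such as $\kappa_G(E)$ being an imbedded subbundle, are needed.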

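It is easy to check that $\kappa_G(E)$ is a $G$ invariant subset of $E$ and therefore the action of $G$ restricts to an action on $\kappa_G(E)$.

Let $\tilde M = M/G$ and $\tilde\kappa_G(E) = \kappa_G(E)/G$ be the quotient spaces for the actions of $G$ on $M$ and $\kappa_G(E)$. We define the kinematic reduction diagram for the action of $G$ on $E$ to be the commutative diagram

\[
\begin{array}{ccccc}
\tilde\kappa_G(E) & \xleftarrow{\ q_{\kappa_G}\ } & \kappa_G(E) & \xrightarrow{\ \iota\ } & E \\
\Big\downarrow{\scriptstyle \pi_{\tilde\kappa_G}} & & \Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi} \\
\tilde M & \xleftarrow{\ q_M\ } & M & \xrightarrow{\ \mathrm{id}\ } & M.
\end{array} \tag{3.5}
\]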

In this diagram $\iota$ is the inclusion map of the kinematic bundle $\kappa_G(E)$ into $E$, $\mathrm{id} : M \to M$ is the identity map, the maps $q_M$ and $q_{\kappa_G}$ are the projection maps to the quotient spaces and $\pi_{\tilde\kappa_G}$ is the surjective map induced by $\pi$. The next lemma summarizes two of the key properties of the kinematic reduction diagram.

Lemma 3.2. Let $G$ act projectably on $E$.
(i) Let $p \in \kappa_G(E)$ and $g \in G$. If $\pi(g \cdot p) = \pi(p)$, then $g \cdot p = p$.
(ii) If $\tilde p \in \tilde\kappa_G(E)$ and $x \in M$ satisfy $\pi_{\tilde\kappa_G}(\tilde p) = q_M(x)$, then there is a unique point $p \in \kappa_G(E)$ such that $q_{\kappa_G}(p) = \tilde p$ and $\pi(p) = x$.

Proof. (i) Let $x = \pi(p)$. If $\pi(g \cdot p) = \pi(p)$, then $g \cdot x = x$ and therefore, since $p \in \kappa_{G,x}(E)$, we conclude that $g \cdot p = p$.
(ii) Since $q_{\kappa_G} : \kappa_G(E) \to \tilde\kappa_G(E)$ is surjective, there is a point $p_0 \in \kappa_G(E)$ which projects to $\tilde p$. Let $x_0 = \pi(p_0)$. Then $q_M(x_0) = q_M(x)$ and hence, by definition of the quotient map $q_M$, there is a $g \in G$ such that $g \cdot x_0 = x$. The point $p = g \cdot p_0$ projects under $q_{\kappa_G}$ to $\tilde p$ and to $x$ under $\pi$ so that the existence of the point $p$ is established. Suppose $p_1$ and $p_2$ are two points in $\kappa_G(E)$ which project to $\tilde p$ and $x$ under $q_{\kappa_G}$ and $\pi$ respectively. Then $p_1$ and $p_2$ belong to the same fiber $\kappa_{G,x}(E)$ and are related by a group element $g \in G$, that is, $g \cdot p_1 = p_2$. Since $\pi(p_1) = \pi(p_2)$, it follows that $\pi(g \cdot p_1) = \pi(p_1)$. Since $p_1 \in \kappa_{G,x}(E)$, we infer from (i) that $g \cdot p_1 = p_1$ and therefore $p_1 = p_2$.

This simple lemma immediately implies that every local section $\tilde s : \tilde U \to \tilde\kappa_G(E)$, where $\tilde U$ is an open subset of $\tilde M$, uniquely determines a $G$-invariant section $s : U \to \kappa_G(E)$, where $U = q_M^{-1}(\tilde U)$, such that

\[ q_{\kappa_G}(s(x)) = \tilde s(q_M(x)). \tag{3.6} \]

To ensure that this correspondence between the $G$ invariant sections of $E$ and the sections of $\tilde\kappa_G(E)$ extends to a correspondence between smooth sections it suffices to ensure that $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \tilde M$ is a smooth bundle.

Theorem 3.3 (Existence Theorem for G Invariant Sections). Suppose that $E$ admits a kinematic reduction diagram (3.5) such that $\kappa_G(E)$ is an imbedded subbundle of $E$, the quotient spaces $\tilde M$ and $\tilde\kappa_G(E)$ are smooth manifolds, and $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \tilde M$ is a bundle. Let $\tilde U$ be any open set in $\tilde M$ and let $U = q_M^{-1}(\tilde U)$. Then (3.6) defines a one-to-one correspondence between the $G$ invariant smooth sections $s : U \to E$ and the smooth sections $\tilde s : \tilde U \to \tilde\kappa_G(E)$.

We can describe the kinematic reduction diagram in local coordinates as follows. Since $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \tilde M$ is a bundle we begin with local coordinates $\pi_{\tilde\kappa_G} : (\tilde x^r, v^a) \to (\tilde x^r)$ for $\tilde\kappa_G(E)$, where $r = 1, \dots, \dim \tilde M$ and $a$ ranges from 1 to the fiber dimension of $\tilde\kappa_G(E)$. Since $q_M : M \to \tilde M$ is a submersion, we can use the coordinates $\tilde x^r$ as part of a local coordinate system $(\tilde x^r, \hat x^k)$ on $M$. Here $k = 1, \dots, \dim M - \dim \tilde M$ and, for fixed values of $\tilde x^r$, the points $(\tilde x^r, \hat x^k)$ all lie on a common $G$ orbit. As a consequence of Lemma 3.2(ii) one can prove that $q_{\kappa_G}$ restricts to a diffeomorphism between the fibers of $\kappa_G(E)$ and $\tilde\kappa_G(E)$ and hence one can use $(\tilde x^r, \hat x^k, v^a)$ as a system of local coordinates on $\kappa_G(E)$. Finally, let $(\tilde x^r, \hat x^k, u^\alpha) \to (\tilde x^r, \hat x^k)$ be a system of local coordinates on $E$. Since $\kappa_G(E)$ is an imbedded sub-bundle of $E$, the inclusion map $\iota : \kappa_G(E) \to E$ assumes the form

\[ \iota(\tilde x^r, \hat x^k, v^a) = (\tilde x^r, \hat x^k, \iota^\alpha(\tilde x^r, \hat x^k, v^a)), \tag{3.7} \]

where the rank of the Jacobian matrix $\left[\dfrac{\partial \iota^\alpha}{\partial v^a}\right]$ is maximal. In these coordinates the kinematic reduction diagram (3.5) becomes

\[
\begin{array}{ccccc}
(\tilde x^r, v^a) & \xleftarrow{\ q_{\kappa_G}\ } & (\tilde x^r, \hat x^k, v^a) & \xrightarrow{\ \iota\ } & (\tilde x^r, \hat x^k, \iota^\alpha(\tilde x^r, \hat x^k, v^a)) \\
\Big\downarrow{\scriptstyle \pi_{\tilde\kappa_G}} & & \Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi} \\
(\tilde x^r) & \xleftarrow{\ q_M\ } & (\tilde x^r, \hat x^k) & \xrightarrow{\ \mathrm{id}\ } & (\tilde x^r, \hat x^k).
\end{array} \tag{3.8}
\]

These coordinates are readily constructed in most applications. If $v^a = \tilde s^a(\tilde x^r)$ is a local section of $\tilde\kappa_G(E)$, then the corresponding $G$ invariant section of $E$ is given by

\[ s^\alpha(\tilde x^r, \hat x^k) = \iota^\alpha(\tilde x^r, \hat x^k, \tilde s^a(\tilde x^r)). \tag{3.9} \]

Notice that when $\iota$ is the identity map, (3.9) reduces to (2.7). The formula (3.9) is the full and proper generalization of the classical Lie prescription (2.7) for infinitesimally invariant sections of transverse actions.

In general the fiber dimension of $\kappa_G(E)$ will be less than that of $E$, while the fiber dimension of $\tilde\kappa_G(E)$ is always the same as that of $\kappa_G(E)$. Thus, in our description of the $G$ invariant sections of $E$, fiber reduction, or reduction in the number of dependent variables, occurs in the right square of the diagram (3.5) while base reduction, or reduction in the number of independent variables, occurs in the left square of (3.5).

We now consider the case of an infinitesimal group action on $E$, defined directly by a $p$-dimensional Lie algebra $\Gamma$ of vector fields (2.2). These vector fields need not be the infinitesimal generators of a global action of a Lie group $G$ on $E$. If the rank of the coefficient matrix $[\xi_a^i(x^j)]$ is $q$, then there are locally defined functions $\phi_\epsilon^a(x^j)$, where $\epsilon = 1, \dots, p - q$, such that

\[ \sum_{a=1}^{p} \phi_\epsilon^a(x^j)\,\xi_a^i(x^j) = 0. \]

Consequently, if we multiply the infinitesimal invariance equation (2.3) by the functions $\phi_\epsilon^a(x^j)$ and sum on $a = 1, \dots, p$, we find that the invariant sections $s^\alpha(x^j)$ are constrained by the algebraic equations

\[ \sum_{a=1}^{p} \phi_\epsilon^a(x^j)\,\eta_a^\alpha(x^j, s^\beta(x^j)) = 0. \tag{3.10} \]

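These conditions are the infinitesimal counterparts to equations (3.3) and accordingly we define the infinitesimal kinematic bundle $\kappa_\Gamma(E) = \bigcup_{x \in M} \kappa_{\Gamma,x}(E)$, where

\[
\begin{aligned}
\kappa_{\Gamma,x}(E) &= \Bigl\{\, (x^j, u^\beta) \in E_x \Bigm| \sum_{a=1}^{p} \phi_\epsilon^a(x^j)\,\eta_a^\alpha(x^j, u^\beta) = 0 \,\Bigr\} \\
&= \{\, p \in E_x \mid Z(p) = 0 \ \text{for all } Z \in \Gamma \text{ such that } \pi_*(Z(p)) = 0 \,\}.
\end{aligned} \tag{3.11}
\]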

In most applications the algebraic conditions defining $\kappa_\Gamma(E)$ are easily solved. The Lie algebra of vector fields $\Gamma$ restricts to a Lie algebra of vector fields on $\kappa_\Gamma(E)$ which now satisfies the infinitesimal transversality condition (2.4). One then arrives at (3.8) as a local coordinate description of the infinitesimal kinematic diagram for $\Gamma$, where the coordinates $(\tilde x^r, v^a)$ are now the infinitesimal invariants for the action of $\Gamma$ on $\kappa_\Gamma(E)$. It is not difficult to show that $\kappa_{G,x}(E) \subset \kappa_{\Gamma,x}(E)$, with equality holding whenever the isotropy group $G_x$ is connected.

In the case where $E$ is a vector bundle, the infinitesimal kinematic bundle appears in Fels and Olver [16]. For applications of the kinematic bundle to the classification of invariant tensors and spinors see [6] and [7].

4. Reduced Differential Equations for Group Invariant Solutions

Let $G$ be a Lie group acting projectably on the bundle $\pi : E \to M$ and let $\Delta = 0$ be a system of $G$ invariant differential equations for the sections of $E$. In order to describe geometrically the reduced equations $\tilde\Delta = 0$ for the $G$ invariant solutions to $\Delta = 0$ we first formalize the definition of a system of differential equations. To this end, let $\pi^k : J^k(E) \to M$ be the $k$th order jet bundle of $\pi : E \to M$. A point $\sigma = j^k(s)(x)$ in $J^k(E)$ represents the values of a local section $s$ and all its derivatives to order $k$ at the point $x \in M$. Since $G$ acts on the space of sections of $E$ by (3.1), the action of $G$ on $E$ can be lifted (or prolonged) to an action on $J^k(E)$ by setting

\[ g \cdot \sigma = j^k(g \cdot s)(g \cdot x), \quad\text{where } \sigma = j^k(s)(x). \]

Now let $\pi : D \to J^k(E)$ be a vector bundle over $J^k(E)$ and suppose that the Lie group $G$ acts projectably on $D$ in a manner which covers the action of $G$ on $J^k(E)$. A differential operator is a section $\Delta : J^k(E) \to D$. The differential operator $\Delta$ is $G$ invariant if it is invariant in the sense of Definition 3.1, that is,

\[ g \cdot \Delta(\sigma) = \Delta(g \cdot \sigma) \]

for all $g \in G$ and all points $\sigma \in J^k(E)$. A section $s$ of $E$ defined on an open set $U \subset M$ is a solution to the differential equations $\Delta = 0$ if $\Delta(j^k(s)(x)) = 0$ for all $x \in U$.

Typically, the bundle $D \to J^k(E)$ is defined as the pullback bundle of a vector bundle $V$ (on which $G$ acts) over $E$ or $M$ by the projections $\pi^k : J^k(E) \to E$ or $\pi^k_M : J^k(E) \to M$ and the action of $G$ on $D$ is the action jointly induced from $J^k(E)$ and $V$.

Our goal now is to construct a bundle $\tilde D \to J^k(\tilde\kappa_G(E))$ and a differential operator $\tilde\Delta : J^k(\tilde\kappa_G(E)) \to \tilde D$ such that the correspondence (3.6) restricts to a 1-1 correspondence between the $G$ invariant solutions of $\Delta = 0$ and the solutions of $\tilde\Delta = 0$. One might anticipate that the required bundle $\tilde D \to J^k(\tilde\kappa_G(E))$ can be constructed by a direct application of kinematic reduction to $D \to J^k(E)$. However, one can readily check that the quotient space of $J^k(E)$ by the prolonged action of $G$ does not in general coincide with the jet space $J^k(\tilde\kappa_G(E))$ so that the kinematic reduction diagram for the action of $G$ on $D$ will not lead to a bundle over $J^k(\tilde\kappa_G(E))$. For example, if $G$ is the group acting on $M \times \mathbb{R} \to M$ by rotations in the base $M = \mathbb{R}^2 - \{(0, 0)\}$, then $J^2(E)/G$ is a 7-dimensional manifold whereas $J^2(\tilde\kappa_G(E))$ is 4-dimensional.

This difficulty is easily circumvented by introducing the bundle of invariant $k$-jets

\[ \mathrm{Inv}^k(E) = \{\, \sigma \in J^k(E) \mid \sigma = j^k(s)(x_0), \text{ where } s \text{ is a } G \text{ invariant section defined in a neighborhood of } x_0 \,\}. \tag{4.1} \]

This bundle is studied in Olver [29] although the importance of these invariant jet spaces to the general theory of symmetry reduction of differential equations is not as widely acknowledged in the literature as it should be. The quotient space $\mathrm{Inv}^k(E)/G$ coincides with the jet space $J^k(\tilde\kappa_G(E))$. We let $D_{\mathrm{Inv}} \to \mathrm{Inv}^k(E)$ be the restriction of $D$ to the bundle of invariant $k$-jets and to this we now apply our reduction procedure to arrive at the dynamic reduction diagram

\[
\begin{array}{ccccccc}
\tilde\kappa_G(D_{\mathrm{Inv}}) & \xleftarrow{\ q\ } & \kappa_G(D_{\mathrm{Inv}}) & \xrightarrow{\ \iota\ } & D_{\mathrm{Inv}} & \xrightarrow{\ \iota_{\mathrm{Inv}}\ } & D \\
\Big\downarrow{\scriptstyle \tilde\pi} & & \Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi} \\
J^k(\tilde\kappa_G(E)) & \xleftarrow{\ q_{\mathrm{Inv}}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \mathrm{id}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \iota^k\ } & J^k(E).
\end{array} \tag{4.2}
\]

Theorem 3.3 ensures that there is a one-to-one correspondence between the $G$ invariant sections of $D_{\mathrm{Inv}} \to \mathrm{Inv}^k(E)$ and the sections of $\tilde\kappa_G(D_{\mathrm{Inv}}) \to J^k(\tilde\kappa_G(E))$. Any $G$ invariant differential operator $\Delta : J^k(E) \to D$ restricts to a $G$ invariant differential operator $\Delta_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to D_{\mathrm{Inv}}$ and thus determines a differential operator $\tilde\Delta : J^k(\tilde\kappa_G(E)) \to \tilde\kappa_G(D_{\mathrm{Inv}})$. This is the reduced differential operator; the solutions to $\tilde\Delta = 0$ describe the $G$ invariant solutions to $\Delta = 0$.

To describe diagram (4.2) in local coordinates, we begin with the coordinate description (3.8) of the kinematic reduction diagram and we let

\[ (\tilde x^r, \hat x^k, u^\alpha, u^\alpha_r, u^\alpha_k, u^\alpha_{rs}, u^\alpha_{rk}, u^\alpha_{kl}, \dots) \]

denote the standard jet coordinates on $J^k(E)$. Since the invariant sections are parameterized by functions $v^a = v^a(\tilde x^r)$, coordinates for $\mathrm{Inv}^k(E)$ are

\[ (\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots). \]
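The dimension count quoted above for the rotation example can be reproduced with a few lines of code. This is a sketch under assumptions: the helper `jet_bundle_dim` is a hypothetical name, and the quotient dimension is obtained simply by subtracting the orbit dimension, which presumes that the prolonged $SO(2)$ action has one-dimensional orbits and a manifold quotient.

```python
from math import comb


def jet_bundle_dim(n, m, k):
    """Dimension of J^k(E) for n independent and m dependent variables.

    There are n base coordinates, and each dependent variable contributes
    one coordinate per partial derivative of order 0..k in n variables,
    of which there are C(n + k, k).
    """
    return n + m * comb(n + k, k)


# Rotation example: E = M x R with M = R^2 minus the origin, so n = 2, m = 1.
dim_J2E = jet_bundle_dim(n=2, m=1, k=2)

# Assumption: the prolonged SO(2) action has one-dimensional orbits.
orbit_dim = 1
dim_quotient = dim_J2E - orbit_dim

# Reduced bundle: one invariant r on the base, one fiber coordinate.
dim_reduced = jet_bundle_dim(n=1, m=1, k=2)
```

The mismatch between `dim_quotient` and `dim_reduced` is the reason the quotient of the full jet space cannot serve as the jet space of the reduced bundle.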

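In accordance with (3.9), the inclusion map $\iota^k : \mathrm{Inv}^k(E) \to J^k(E)$ is given by

\[ \iota^k(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots) = (\tilde x^r, \hat x^k, u^\alpha, u^\alpha_r, u^\alpha_k, u^\alpha_{rs}, u^\alpha_{rk}, u^\alpha_{kl}, \dots), \tag{4.3} \]

where, by a formal application of the chain rule,

\[
\begin{aligned}
u^\alpha &= \iota^\alpha(\tilde x^r, \hat x^k, v^a), \qquad
u^\alpha_r = \frac{\partial \iota^\alpha}{\partial \tilde x^r} + \frac{\partial \iota^\alpha}{\partial v^a}\,v^a_r, \qquad
u^\alpha_k = \frac{\partial \iota^\alpha}{\partial \hat x^k}, \\
u^\alpha_{rs} &= \frac{\partial^2 \iota^\alpha}{\partial \tilde x^r\,\partial \tilde x^s}
 + \frac{\partial^2 \iota^\alpha}{\partial v^a\,\partial \tilde x^s}\,v^a_r
 + \frac{\partial^2 \iota^\alpha}{\partial v^a\,\partial \tilde x^r}\,v^a_s
 + \frac{\partial^2 \iota^\alpha}{\partial v^a\,\partial v^b}\,v^a_r v^b_s
 + \frac{\partial \iota^\alpha}{\partial v^a}\,v^a_{rs},
\end{aligned}
\]

and so on. The quotient map $q_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to J^k(\tilde\kappa_G(E))$ is given simply by

\[ q_{\mathrm{Inv}}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots) = (\tilde x^r, v^a, v^a_r, v^a_{rs}, \dots). \]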
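As a concrete instance of the chain-rule formulas in (4.3), consider the inclusion used later in Example 6.1, $\iota(t, \mathbf{x}, A, B) = (t, \mathbf{x}, \mathbf{u} = A\mathbf{x}, p = B)$ with $A = A(t, r)$, $B = B(t, r)$ and $r = \|\mathbf{x}\|$. The sketch below only rederives the formulas that appear there as (6.6); it works in the Cartesian coordinates $x^i$ rather than in adapted coordinates $(\tilde x^r, \hat x^k)$, and uses $\partial r / \partial x^j = x_j / r$.

```latex
\begin{align*}
u^i &= A(t, r)\,x^i, & p &= B(t, r), \\
u^i_t &= A_t\,x^i, & p_i &= B_r\,\frac{\partial r}{\partial x^i} = B_r\,\frac{x_i}{r}, \\
u^i_j &= A\,\delta^i_j + A_r\,\frac{\partial r}{\partial x^j}\,x^i
       = A\,\delta^i_j + A_r\,\frac{x^i x_j}{r}.
\end{align*}
```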

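Next let $f^A$ be a local frame field for the vector bundle $D$. The differential operator $\Delta : J^k(E) \to D$ can be written in terms of the standard coordinates on $J^k(E)$ and in this local frame as

\[ \Delta = \Delta_A(\tilde x^r, \hat x^k, u^\alpha, u^\alpha_r, u^\alpha_k, u^\alpha_{rs}, u^\alpha_{rk}, u^\alpha_{kl}, \dots)\, f^A. \tag{4.4} \]

The restriction of $\Delta$ to $\mathrm{Inv}^k(E)$ defines the section $\Delta_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to D_{\mathrm{Inv}}$ by

\[ \Delta_{\mathrm{Inv}} = \Delta_{\mathrm{Inv},A}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots)\, f^A, \tag{4.5} \]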

where the component functions $\Delta_{\mathrm{Inv},A}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots)$ are defined as the composition of the maps (4.3) and the component maps $\Delta_A$. Since $\Delta$ is a $G$ invariant differential operator, $\Delta_{\mathrm{Inv}}$ is a $G$ invariant differential operator and hence $\Delta_{\mathrm{Inv}}$ necessarily factors through the kinematic bundle $\kappa_G(D_{\mathrm{Inv}})$,

\[ \Delta_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to \kappa_G(D_{\mathrm{Inv}}). \]

Our general existence theory for invariant sections implies that we can also find a locally defined, $G$ invariant frame $f^Q_{\mathrm{Inv}}$ for $\kappa_G(D_{\mathrm{Inv}})$. The inclusion map $\kappa_G(D_{\mathrm{Inv}}) \to D_{\mathrm{Inv}}$ is expressed by writing each vector $f^Q_{\mathrm{Inv}}$ as a linear combination of the vectors $f^A$,

\[ f^Q_{\mathrm{Inv}} = M^Q_A\, f^A, \]

where the coefficients $M^Q_A$ are functions on $\mathrm{Inv}^k(E)$. The invariant operator $\Delta_{\mathrm{Inv}}$ can be expressed as

\[ \Delta_{\mathrm{Inv}} = \Delta_{\mathrm{Inv},Q}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots)\, f^Q_{\mathrm{Inv}}. \]

Finally, the $G$ invariant frame $f^Q_{\mathrm{Inv}}$ determines a frame $\tilde f^Q$ on $\tilde\kappa_G(D_{\mathrm{Inv}})$, the invariance of $\Delta_{\mathrm{Inv}}$ implies that the component functions $\Delta_{\mathrm{Inv},Q}$ are necessarily independent of the parametric variables $\hat x^k$, that is,

\[ \tilde\Delta_Q(\tilde x^r, v^a, v^a_r, v^a_{rs}, \dots) = \Delta_{\mathrm{Inv},Q}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots) \]

and the reduced differential operator is

\[ \tilde\Delta = \tilde\Delta_Q(\tilde x^r, v^a, v^a_r, v^a_{rs}, \dots)\, \tilde f^Q. \]

At first sight, this general framework may appear to be rather cumbersome and overly complicated. However, as we shall see in examples, every square in the dynamic reduction diagram (4.2) actually corresponds to the individual steps that one performs in practice.

5. The Automorphism Group of the Kinematic Bundle

Let $\mathcal{G}$ be the full group of projectable symmetries on $E$ for a given system of differential equations on $J^k(E)$ and let $G \subset \mathcal{G}$ be a fixed subgroup for which the group invariant solutions are sought. It is commonly noted (again, within the context of reduction with transversality) that $\mathrm{Nor}(G, \mathcal{G})$, the normalizer of $G$ in $\mathcal{G}$, preserves the space of invariant sections and that $\mathrm{Nor}(G, \mathcal{G})/G$ is a symmetry group of the reduced equations. However, because this is a purely algebraic construction which does not take into account the action of $G$ on $E$, this construction may not yield the largest possible residual symmetry group or may result in a residual group which does not act effectively on $\tilde\kappa_G(E)$. These difficulties are easily resolved.

We let $O_p(G)$ denote the orbit of $G$ through a point $p \in E$.

Definition 5.1. Let $\mathcal{G}$ be a group of fiber-preserving transformations acting on $\pi : E \to M$ and let $G$ be a subgroup of $\mathcal{G}$. Assume that $E$ admits a kinematic reduction diagram (3.5) for the action of $G$ on $E$.

(i) The automorphism group $\tilde G$ for the kinematic bundle $\pi : \kappa_G(E) \to M$ is the subgroup of $\mathcal{G}$ which stabilizes the set of all the $G$ orbits in $\kappa_G(E)$, that is,

\[ \tilde G = \{\, a \in \mathcal{G} \mid a \cdot O_p(G) = O_{a \cdot p}(G) \ \text{and}\ a^{-1} \cdot O_p(G) = O_{a^{-1} \cdot p}(G) \ \text{for all } p \in \kappa_G(E) \,\}. \tag{5.1} \]

(ii) The global isotropy subgroup of $\tilde G$, as it acts on the space of $G$ orbits of $\kappa_G(E)$, is

\[ \tilde G^* = \{\, a \in \mathcal{G} \mid a \cdot O_p(G) = O_p(G) \ \text{for all } p \in \kappa_G(E) \,\}. \tag{5.2} \]

(iii) The residual symmetry group is $\tilde G_{\mathrm{eff}} = \tilde G / \tilde G^*$.

The key property of $\tilde G^*$ is that it is the largest subgroup of $\mathcal{G}$ with exactly the same reduction diagram and invariant sections as $G$. This is an important interpretation of the group $\tilde G^*$: from the viewpoint of kinematic reduction, one should generally replace the group $G$ by the group $\tilde G^*$. For computational purposes, it is often advantageous to use the fact that $\tilde G^*$ fixes every $G$ invariant section of $E$. It is not difficult to check that $\mathrm{Nor}(\tilde G^*, \mathcal{G}) = \tilde G$, that the quotient group $\tilde G_{\mathrm{eff}} = \tilde G / \tilde G^*$ acts effectively and projectably on the reduced bundle $\tilde\kappa_G(E) \to \tilde M$ and that, if $\mathcal{G}$ is a symmetry group of a differential operator $\Delta$, then $\tilde G_{\mathrm{eff}}$ is always a symmetry group of the reduced differential operator $\tilde\Delta$.

Similarly, if $\mathfrak{G}$ is a Lie algebra of projectable vector fields on $E$ and $\Gamma \subset \mathfrak{G}$, we define the infinitesimal automorphism algebra of $\kappa_\Gamma(E)$ as the Lie subalgebra of vector fields given by

\[ \tilde{\mathfrak{G}} = \{\, Y \in \mathfrak{G} \mid [Z, Y]_p \in \operatorname{span}(\Gamma)(p) \ \text{for all } p \in \kappa_\Gamma(E) \text{ and all } Z \in \Gamma \,\}, \tag{5.3} \]

and the associated isotropy subalgebra for $\kappa_\Gamma(E)$

\[ \tilde{\mathfrak{G}}^* = \{\, Y \in \mathfrak{G} \mid Y_p \in \operatorname{span}(\Gamma)(p) \ \text{for all } p \in \kappa_\Gamma(E) \,\}. \tag{5.4} \]

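When $\mathcal{G}$ is a finite dimensional Lie group and $\mathfrak{G} = \Gamma(\mathcal{G})$, then it is readily checked that $\tilde{\mathfrak{G}} = \Gamma(\tilde G)$ and $\tilde{\mathfrak{G}}^* = \Gamma(\tilde G^*)$.

Since the automorphism group $\tilde G$ acts on the $k$-jets of invariant sections $\mathrm{Inv}^k(E)$, this group also plays an important role in dynamic reduction. Specifically, let us suppose that $\mathcal{G}$ acts on the vector bundle $D \to J^k(E)$ and that $\Delta : J^k(E) \to D$ is a $\mathcal{G}$ invariant section. Then $\Delta_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to D_{\mathrm{Inv}}$ is always invariant under the action of $\tilde G$ and accordingly the operator $\Delta_{\mathrm{Inv}}$ always factors through the kinematic bundle for the action of $\tilde G$ on $D_{\mathrm{Inv}}$, where for $\sigma \in \mathrm{Inv}^k(E)$,

\[ \kappa_{\tilde G,\sigma}(D_{\mathrm{Inv}}) = \{\, \Delta \in D_{\mathrm{Inv},\sigma} \mid g \cdot \Delta = \Delta \ \text{for all } g \in \tilde G_\sigma \,\}. \]

We note that $\kappa_{\tilde G}(D_{\mathrm{Inv}}) \subset \kappa_G(D_{\mathrm{Inv}})$ and consequently one can refine the dynamic reduction diagram from (4.2) to

\[
\begin{array}{ccccccc}
\tilde\kappa_{\tilde G}(D_{\mathrm{Inv}}) & \xleftarrow{\ q\ } & \kappa_{\tilde G}(D_{\mathrm{Inv}}) & \xrightarrow{\ \iota\ } & D_{\mathrm{Inv}} & \xrightarrow{\ \iota_{\mathrm{Inv}}\ } & D \\
\Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi} \\
J^k(\tilde\kappa_G(E)) & \xleftarrow{\ q_{\mathrm{Inv}}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \mathrm{id}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \iota^k\ } & J^k(E),
\end{array}
\]

where the quotient maps to the left are still by the action of $G$. Given the actions of $\mathcal{G}$ on $\pi : E \to M$ and also $\mathcal{G}$ on $D \to J^k(E)$, it sometimes happens that

\[ \kappa_{\tilde G,\sigma}(D_{\mathrm{Inv}}) = 0. \tag{5.5} \]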

In this case every $G$ invariant section of $E$ is automatically a solution to $\Delta = 0$ for every $\mathcal{G}$ invariant operator $\Delta : J^k(E) \to D$; such sections are called universal solutions. Previous work on this subject (see Bleecker [8], [9], Gaeta and Morando [19]) has emphasized a variational approach which, from the viewpoint of the dynamic reduction diagram and the automorphism group of the kinematic reduction diagram, may not always be necessary.

6. Examples

In this section we find the kinematic and dynamic reduction diagrams for the group invariant solutions for some well-known systems of differential equations in applied mathematics, differential geometry, and mathematical physics.

We begin by deriving the rotationally invariant solutions of the Euler equations for incompressible fluid flow. As noted by Olver [29] (p. 199), these solutions cannot be obtained by the classical Lie ansatz. The general theory of symmetry reduction without transversality leads to some interesting new classification problems for group invariant solutions which we briefly illustrate by presenting a new reduction of the Euler equations.

In our second set of examples we consider reductions of the harmonic map equations. We show that the classic Veronese map from $S^2 \to S^4$ is an example of a universal solution. In Example 5.4 we consider another symmetry reduction of the harmonic map equation which nicely illustrates the construction of the reduced kinematic space for quotient manifolds $\tilde M$ with boundary.

In our third set of examples, the Schwarzschild and plane wave solutions of the vacuum Einstein equations are re-examined in the context of symmetry reduction without transversality. We demonstrate the importance of the automorphism group in understanding the geometric properties of the kinematic bundle and, as well, qualitative features of the reduced equations.

Finally, some elementary examples from mechanics are used to demonstrate the basic differences between symmetry reduction for group invariant solutions and symplectic reduction of Hamiltonian systems.

Although space does not permit us to do so, the kinematic and dynamic reduction diagrams are also nicely illustrated by symmetry reduction of the Yang–Mills equations as found, for example, in [22, 25, 27]. In particular, it is interesting to note that the invariance properties of the classical instanton solution to the Yang–Mills equations (Jackiw and Rebbi [24]) imply that it is a universal solution in the sense of Eq. (5.5).

Euler Equations for Incompressible Fluid Flow. The Euler equations are a system of 4 first order equations in 4 independent and dependent variables. The underlying bundle $E$ for these equations is the trivial bundle $\mathbb{R}^4 \times \mathbb{R}^4 \to \mathbb{R}^4$ with coordinates $(t, \mathbf{x}, \mathbf{u}, p) \to (t, \mathbf{x})$, where $\mathbf{x} = (x^1, x^2, x^3)$ and $\mathbf{u} = (u^1, u^2, u^3)$ and the equations are

\[ \mathbf{u}_t + \mathbf{u} \cdot \nabla \mathbf{u} = -\nabla p \quad\text{and}\quad \nabla \cdot \mathbf{u} = 0. \tag{6.1} \]

The full symmetry group $\mathcal{G}$ of the Euler equations is well-known (see, for example, [23, 29, 34]).

Example 6.1 (Rotationally Invariant Solutions of the Euler Equations). The symmetry group of the Euler equations contains the group $G = SO(3)$ acting on $E$ by

\[ R \cdot (t, \mathbf{x}, \mathbf{u}, p) = (t, R \cdot \mathbf{x}, R \cdot \mathbf{u}, p) = (t, R^i_j x^j, R^i_j u^j, p), \tag{6.2} \]

for $R = (R^i_j) \in SO(3)$. To ensure that the action of $G$ on the base $\mathbb{R}^4$ is regular we restrict to the open set $M \subset \mathbb{R}^4$ where $\|\mathbf{x}\| \neq 0$. The infinitesimal generators for this action are

\[ V_k = \varepsilon_{kij}\, x^i\,\frac{\partial}{\partial x^j} + \varepsilon_{kij}\, u^i\,\frac{\partial}{\partial u^j}. \tag{6.3} \]

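We first construct the kinematic reduction diagram for this action. For a given point $x_0 = (t_0, \mathbf{x}_0) \in M$, the isotropy subgroup $G_{x_0}$ for the action of $G$ on $M$ is the subgroup $SO(2)_{\mathbf{x}_0} \subset SO(3)$ which fixes the vector $\mathbf{x}_0$ in $\mathbb{R}^3$. Since the only vectors invariant under all rotations about a given axis of rotation are vectors along the axis of rotation, we deduce that for $x_0 \in M$,

\[
\begin{aligned}
\kappa_{G,x_0}(E) &= \{\, (t_0, \mathbf{x}_0, \mathbf{u}, p) \mid R \cdot \mathbf{u} = \mathbf{u} \ \text{for all } R \in SO(2)_{\mathbf{x}_0} \,\} \\
&= \{\, (t_0, \mathbf{x}_0, \mathbf{u}, p) \mid \mathbf{u} = A\,\mathbf{x}_0 \ \text{for some } A \in \mathbb{R} \,\}.
\end{aligned}
\]

The same conclusion can be obtained by infinitesimal considerations. Indeed, the infinitesimal isotropy vector field at $x_0$ for the action on $M$ is

\[ Z = x_0^k\, \varepsilon_{kij}\, x^i\,\frac{\partial}{\partial x^j} \]

and therefore, if $(t, \mathbf{x}, \mathbf{u}, p) \in \kappa_{\Gamma,x}(E)$, we must have by (3.11)

\[ x^k\, \varepsilon_{kij}\, u^i\,\frac{\partial}{\partial u^j} = 0. \]

This implies that $\mathbf{x} \times \mathbf{u} = 0$ and so $\mathbf{u}$ is parallel to $\mathbf{x}$. Either way, we conclude that $\kappa_G(E)$ is a two dimensional trivial bundle $(t, \mathbf{x}, A, B) \to (t, \mathbf{x})$, where the inclusion map $\iota : \kappa_G(E) \to E$ is

\[ \iota(t, \mathbf{x}, A, B) = (t, \mathbf{x}, \mathbf{u}, p), \quad\text{where } \mathbf{u} = A\,\mathbf{x} \text{ and } p = B. \]

The invariants for the action of $G$ on $M$ are $t$ and $r = \sqrt{x^2 + y^2 + z^2}$ so that the kinematic reduction diagram for the action of $SO(3)$ on $E$ is

\[
\begin{array}{ccccc}
(t, r, A, B) & \xleftarrow{\ q_{\kappa_G}\ } & (t, \mathbf{x}, A, B) & \xrightarrow{\ \iota\ } & (t, \mathbf{x}, \mathbf{u}, p) \\
\Big\downarrow{\scriptstyle \pi_{\tilde\kappa_G}} & & \Big\downarrow{\scriptstyle \pi} & & \Big\downarrow{\scriptstyle \pi} \\
(t, r) & \xleftarrow{\ q_M\ } & (t, \mathbf{x}) & \xrightarrow{\ \mathrm{id}\ } & (t, \mathbf{x}).
\end{array} \tag{6.4}
\]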
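The description of $\kappa_{G,x_0}(E)$ can be spot-checked numerically. The sketch below is illustrative only: it rotates candidate vectors $\mathbf{u}$ about the axis through $\mathbf{x}_0$ using Rodrigues' rotation formula and tests whether they are left fixed. The function names `rotate_about` and `is_fixed`, the sample angles and the tolerance are assumptions of this example.

```python
import math


def rotate_about(axis, angle, v):
    """Rotate v about the line through `axis` by `angle` (Rodrigues' formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    dot = sum(ki * vi for ki, vi in zip(k, v))
    cross = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    c, s = math.cos(angle), math.sin(angle)
    return [vi * c + ci * s + ki * dot * (1 - c) for vi, ci, ki in zip(v, cross, k)]


def is_fixed(x0, u, angles=(0.3, 1.1, 2.5), tol=1e-12):
    """True if u is unchanged by the sampled rotations about the axis x0."""
    return all(
        max(abs(a - b) for a, b in zip(rotate_about(x0, t, u), u)) < tol
        for t in angles
    )


x0 = (1.0, 2.0, 2.0)                   # sample base point (spatial part)
parallel = [-0.7 * c for c in x0]      # u = A * x0 with A = -0.7
generic = [1.0, 0.0, 0.0]              # not parallel to x0
```

Only the vector parallel to $\mathbf{x}_0$ survives the test, in line with the one-parameter family $\mathbf{u} = A\,\mathbf{x}_0$. Sampling a few angles is of course not a proof; it only illustrates the statement.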

In accordance with Eq. (3.9), each section $A = A(t, r)$ and $B = B(t, r)$ of $\tilde\kappa_G(E)$ determines the rotationally invariant section

\[ \mathbf{u} = A(r, t)\,\mathbf{x} \quad\text{and}\quad p = B(r, t) \tag{6.5} \]

of $E$. The computation of the reduced equations for the rotationally invariant solutions to the Euler equations now proceeds as follows. From (6.5) we compute

\[ u^i_t = A_t\, x^i, \quad u^i_j = A\,\delta^i_j + A_r\,\frac{x^i x_j}{r} \quad\text{and}\quad p_i = B_r\,\frac{x_i}{r} \tag{6.6} \]

so that the Euler equations (6.1) become x i xj xi At x i + Ax j Aδji + Ar = −Br r r

and 3A + rAr = 0

(6.7)


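which simplify to the differential equations
\[
A_t + A(A + rA_r) = -\frac{B_r}{r} \qquad\text{and}\qquad 3A + rA_r = 0 \tag{6.8}
\]
on $J^1(\tilde\kappa_G(E))$. These equations are readily integrated to give
\[
A = \frac{a}{r^3} \qquad\text{and}\qquad B = \frac{\dot a}{r} - \frac{a^2}{2r^4} + b
\]
for arbitrary functions $a(t)$ and $b(t)$ and the rotationally invariant solutions to the Euler equations are
\[
\mathbf{u} = \frac{a}{r^3}\,\mathbf{x} \qquad\text{and}\qquad p = \frac{\dot a}{r} - \frac{a^2}{2r^4} + b. \tag{6.9}
\]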
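The integration of (6.8) can be spelled out in a few lines. The intermediate steps below are filled in for the reader and are not taken from the text; they only use the two equations in (6.8).

```latex
\begin{align*}
\partial_r\bigl(r^3 A\bigr) = r^2\,(3A + rA_r) = 0
  &\quad\Longrightarrow\quad A = \frac{a(t)}{r^3},\\[4pt]
A + rA_r = -\frac{2a}{r^3}
  &\quad\Longrightarrow\quad A_t + A(A + rA_r) = \frac{\dot a}{r^3} - \frac{2a^2}{r^6},\\[4pt]
B_r = -r\Bigl(\frac{\dot a}{r^3} - \frac{2a^2}{r^6}\Bigr) = -\frac{\dot a}{r^2} + \frac{2a^2}{r^5}
  &\quad\Longrightarrow\quad B = \frac{\dot a}{r} - \frac{a^2}{2r^4} + b(t).
\end{align*}
```

The functions $a(t)$ and $b(t)$ arise as the two constants of integration in $r$.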

We note that for the Lie algebra of vector fields (6.3), the matrix on the right side of (2.4), namely
\[
\begin{pmatrix}
0 & -x^3 & x^2 & 0 & -u^3 & u^2\\
x^3 & 0 & -x^1 & u^3 & 0 & -u^1\\
-x^2 & x^1 & 0 & -u^2 & u^1 & 0
\end{pmatrix},
\]
has full rank 3 whereas the matrix on the left side of (2.4), consisting of the first three columns of the above matrix, has rank 2. The local transversality condition (2.4) fails and the solution (6.9) to the Euler equations cannot be obtained using the classical Lie prescription.

To describe the derivation of the reduced equations in the context of invariant differential operators and the dynamic reduction diagram we introduce the bundle $D = J^1(E) \times \mathbf{R}^3 \times \mathbf{R}$ with sections $\frac{\partial}{\partial u^i} \otimes dt$ and $dt$ and define the differential operator $\Delta$ on $D$ by
\[
\Delta = \bigl[ u^i_t + u^k u^i_k + \delta^{ij} p_j \bigr] \frac{\partial}{\partial u^i} \otimes dt + \bigl[ u^i_i \bigr]\, dt. \tag{6.10}
\]
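Returning briefly to the rank statement for the generators (6.3), the two ranks can be checked at a sample point. The sketch below is only an illustration: the helper names `rank` and `generator_matrix` and the sample point are assumptions introduced here, and the elimination is done in exact rational arithmetic so that no rounding is involved.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # find a pivot in column c at or below row r
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def generator_matrix(x, u):
    """Coefficients of V_1, V_2, V_3 from (6.3): d/dx^j columns, then d/du^j columns."""
    x1, x2, x3 = x
    u1, u2, u3 = u
    return [
        [0, -x3, x2, 0, -u3, u2],
        [x3, 0, -x1, u3, 0, -u1],
        [-x2, x1, 0, -u2, u1, 0],
    ]


# Hypothetical sample point with u not parallel to x.
full = generator_matrix((1, 2, 3), (4, -1, 2))
base = [row[:3] for row in full]
print(rank(full), rank(base))
```

At points where $\mathbf{u}$ is parallel to $\mathbf{x}$, that is, on the kinematic bundle, the full matrix drops to rank 2 as well, which is the rank of the base block.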

This operator is invariant under the full symmetry group of the Euler equations. The induced action of $G = SO(3)$ on $J^1(E)$ is given by
\[
R \cdot (t, x^i, u^i, p, u^i_j, p_j) = (t,\; R^i_r x^r,\; R^i_r u^r,\; p,\; R^i_r R_j{}^s u^r_s,\; R_j{}^r p_r), \quad\text{where } R \in SO(3).
\]
Coordinates for the bundle of invariant jets $\mathrm{Inv}^1(E)$ are $(t, x^i, A, A_t, A_r, B, B_t, B_r)$ and (6.6) defines the inclusion map $\iota^1 : \mathrm{Inv}^1(E) \to J^1(E)$. A basis for the $G$ invariant sections of $D_{\mathrm{Inv}} \to \mathrm{Inv}^1(E)$ is given by
\[
f^1 = x^i \frac{\partial}{\partial u^i} \otimes dt \qquad\text{and}\qquad f^2 = dt.
\]
Let $\tilde f^1$ and $\tilde f^2$ be the corresponding sections of $\tilde\kappa_G(D_{\mathrm{Inv}})$. We are now ready to work through the dynamic reduction diagram (4.2), starting with the Euler operator as a section $\Delta : J^1(E) \to D$. Restricted to the invariant jet bundle $\mathrm{Inv}^1(E)$, $\Delta$ becomes
\[
\Delta_{\mathrm{Inv}} = \Bigl[ A_t x^i + A x^j \Bigl( A\delta^i_j + A_r \frac{x^i x_j}{r} \Bigr) + B_r \frac{x^i}{r} \Bigr] \frac{\partial}{\partial u^i} \otimes dt + \Bigl[ \delta^j_i \Bigl( A\delta^i_j + A_r \frac{x^i x_j}{r} \Bigr) \Bigr]\, dt.
\]


Restricting to $\mathrm{Inv}^1(E)$ is precisely the first step one takes in practice in computing the reduced equations and corresponds to the rightmost square in the dynamic reduction diagram. Next, because $\Delta_{\mathrm{Inv}}$ is $G$ invariant it is necessarily a linear combination of the two invariant sections $f^1$ and $f^2$ and therefore factors through the kinematic bundle $\kappa_G(D_{\mathrm{Inv}})$. This means we can write $\Delta_{\mathrm{Inv}}$ as a section of $\kappa_G(D_{\mathrm{Inv}})$, namely,
\[
\Delta_{\mathrm{Inv}} = \Bigl[ A_t + A \Bigl( A + A_r \frac{x^j x_j}{r} \Bigr) + \frac{1}{r} B_r \Bigr] f^1 + \Bigl[ 3A + A_r \Bigl( \delta^j_i \frac{x^i x_j}{r} \Bigr) \Bigr] f^2.
\]
This corresponds to the center commutative square in the dynamic reduction diagram (4.2) and coincides with the fact that Eq. (6.7) contained a common factor $x^i$, a common factor which insures that the time evolution equation for $\mathbf{u}$ reduces to a single time evolution equation for $A$. Finally, as a $G$ invariant section of $\kappa_G(D_{\mathrm{Inv}})$, a bundle on which $G$ always acts transversally, we are assured that $\Delta_{\mathrm{Inv}}$ descends to a differential operator $\tilde\Delta$ on the bundle $J^1(\tilde\kappa_G(E))$. In this example this implies that the independent variables $(t, x^i)$ appear only through the invariants for the action of $G$ on $M$, in this case $t$ and $r$, and so
\[
\tilde\Delta = \Bigl[ A_t + A(A + rA_r) + \frac{B_r}{r} \Bigr] \tilde f^1 + \bigl[ 3A + rA_r \bigr] \tilde f^2.
\]

Example 6.2 (A New Reduction of the Euler Equations). It is possible to give a complete classification of all possible symmetry reductions of the Euler equations (6.1) to a system of ordinary differential equations in three or fewer dependent variables [17]. A number of authors have obtained complete lists of reductions of various differential equations (see, for example, [14, 18, 21, 42]) but this particular classification of reductions of the Euler equations may be the first such classification of group invariant solutions which explicitly requires non-trivial isotropy in the group action on the space of independent variables. There are too many cases to list the results of this classification here, but we do present one more reduction of the Euler equations, one which does not seem to appear elsewhere in the literature.

For this example it will be convenient to write $\mathbf{x} = (x, y, z)$ and $\mathbf{u} = (u, v, w)$. The infinitesimal generators for the group action are
\[
\Gamma = \{\, V_0,\; V_1,\; V_2 = V_{x,\alpha} + V_{y,\beta},\; V_3 = V_{y,\alpha} - V_{x,\beta} \,\},
\]
where
\[
V_0 = x\partial_x + y\partial_y + z\partial_z + u\partial_u + v\partial_v + w\partial_w + 2p\,\partial_p, \qquad
V_1 = y\partial_x - x\partial_y + v\partial_u - u\partial_v,
\]
\[
V_{x,\alpha} = \alpha\,\partial_x + \dot\alpha\,\partial_u - x\ddot\alpha\,\partial_p \qquad\text{and}\qquad
V_{y,\beta} = \beta\,\partial_y + \dot\beta\,\partial_v - y\ddot\beta\,\partial_p.
\]
Here $\alpha = \alpha(t)$ and $\beta = \beta(t)$ are such that
\[
\alpha\ddot\beta - \ddot\alpha\beta = 0, \quad\text{or equivalently,}\quad \alpha\dot\beta - \beta\dot\alpha = c = \text{constant}. \tag{6.11}
\]
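The equivalence of the two conditions in (6.11) is a one-line computation, recorded here as a worked step that is not spelled out in the text:

```latex
\[
\frac{d}{dt}\bigl(\alpha\dot\beta - \beta\dot\alpha\bigr)
  = \dot\alpha\dot\beta + \alpha\ddot\beta - \dot\beta\dot\alpha - \beta\ddot\alpha
  = \alpha\ddot\beta - \ddot\alpha\beta ,
\]
```

so $\alpha\dot\beta - \beta\dot\alpha$ is constant in $t$ exactly when $\alpha\ddot\beta - \ddot\alpha\beta = 0$.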

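This condition insures that $[V_2, V_3] = 0$ so that $\Gamma$ is indeed a finite dimensional Lie algebra of vector fields. In order that $\Gamma$ have constant rank on the base space, we assume that $xy\alpha \neq 0$ or $yz\beta \neq 0$. The horizontal components of $V_2$ and $V_3$ are given by
\[
\begin{pmatrix} V_2^M \\ V_3^M \end{pmatrix}
= \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}
\begin{pmatrix} \partial_x \\ \partial_y \end{pmatrix},
\quad\text{so that}\quad
\begin{pmatrix} \partial_x \\ \partial_y \end{pmatrix}
= \frac{1}{\delta} \begin{pmatrix} \alpha & -\beta \\ \beta & \alpha \end{pmatrix}
\begin{pmatrix} V_2^M \\ V_3^M \end{pmatrix},
\]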


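where $\delta = \alpha^2 + \beta^2$, and therefore at the point $(t_0, \mathbf{x}_0)$, the horizontal components of the vector field
\[
Z = V_1 - y_0\, \frac{\alpha(t_0) V_2 - \beta(t_0) V_3}{\delta(t_0)} + x_0\, \frac{\beta(t_0) V_2 + \alpha(t_0) V_3}{\delta(t_0)} \tag{6.12}
\]
vanish. The isotropy condition (3.11) defining the fiber of the kinematic bundle $\kappa_{\Gamma,x}(E)$ leads, from the coefficients of $\partial_u$, $\partial_v$ and $\partial_p$, to the relations
\[
v = \frac{y\alpha - x\beta}{\delta}\,\dot\alpha + \frac{x\alpha + y\beta}{\delta}\,\dot\beta
\qquad\text{and}\qquad
u = \frac{x\alpha + y\beta}{\delta}\,\dot\alpha + \frac{x\beta - y\alpha}{\delta}\,\dot\beta. \tag{6.13}
\]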

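We therefore conclude that the kinematic bundle has fiber dimension 2 with fiber coordinates $w$ and $p$. However, these coordinates are not invariant under the action of $\Gamma$ on $\kappa_\Gamma(E)$ and cannot be used in the local coordinate description (3.8) of the kinematic reduction diagram. Restricted to $\kappa_\Gamma(E)$, the vector fields $V_i$ become
\[
V_0 = x\partial_x + y\partial_y + z\partial_z + w\partial_w + 2p\,\partial_p, \qquad V_1 = y\partial_x - x\partial_y,
\]
\[
V_2 = \alpha\,\partial_x + \beta\,\partial_y - (\ddot\alpha x + \ddot\beta y)\,\partial_p \qquad\text{and}\qquad
V_3 = -\beta\,\partial_x + \alpha\,\partial_y - (-\ddot\beta x + \ddot\alpha y)\,\partial_p.
\]
Note that these restricted vector fields now satisfy the infinitesimal transversality condition (2.4). Invariants for this action are
\[
t, \qquad A = \frac{w}{z} \qquad\text{and}\qquad B = \Bigl[ 2p + \frac{\alpha\ddot\alpha + \beta\ddot\beta}{\delta}\,(x^2 + y^2) \Bigr] \Big/ z^2. \tag{6.14}
\]
To verify that $B$ satisfies $V_2(B) = V_3(B) = 0$ one must use $\alpha\ddot\beta = \ddot\alpha\beta$. The kinematic reduction diagram for the action of $\Gamma$ on $E$ is therefore
\[
\begin{array}{ccccc}
(t, A, B) & \xleftarrow{\;q_{\kappa_\Gamma}\;} & (t, \mathbf{x}, A, B) & \xrightarrow{\;\iota\;} & (t, \mathbf{x}, \mathbf{u}, p)\\[2pt]
\big\downarrow{\scriptstyle \tilde\pi} & & \big\downarrow{\scriptstyle \pi} & & \big\downarrow{\scriptstyle \pi}\\[2pt]
(t) & \xleftarrow{\;q_M\;} & (t, \mathbf{x}) & \xrightarrow{\;\mathrm{id}\;} & (t, \mathbf{x}),
\end{array}
\]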

where the inclusion map $\iota$ is defined by (6.13) and the solutions to (6.14) for $w$ and $p$. The general invariant section is then, on putting $\sigma = \ln\delta$,
\[
u = x\,\frac{\dot\sigma}{2} - y\,\frac{c}{\delta}, \qquad
v = x\,\frac{c}{\delta} + y\,\frac{\dot\sigma}{2}, \qquad
w = zA(t), \qquad
p = -\frac{\alpha\ddot\alpha + \beta\ddot\beta}{2\delta}\,(x^2 + y^2) + \frac{z^2}{2}\,B(t).
\]
Note that the $u$ and $v$ components are uniquely determined from the isotropy conditions (6.12) and (6.13) and that the arbitrary functions $A(t)$ and $B(t)$ defining these invariant sections appear only in the $w$ and $p$ components.

We now turn to the dynamic reduction diagram. Since we are treating the Euler equations as the section (6.10) of the tensor bundle $D$ we can anticipate the form of $\Delta_{\mathrm{Inv}}$ by computing the invariant tensors of the form
\[
T = P\,\partial_u \otimes dt + Q\,\partial_v \otimes dt + R\,\partial_w \otimes dt + S\,dt. \tag{6.15}
\]


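The isotropy condition $L_Z T = 0$ at $x_0$, where $Z$ is defined by (6.12), shows immediately that $P = Q = 0$ from which it follows that
\[
f^1 = z\,\frac{\partial}{\partial w} \otimes dt \qquad\text{and}\qquad f^2 = dt
\]
are a basis for the invariant fields of the type (6.15). This calculation shows that the $\partial_u \otimes dt$ and $\partial_v \otimes dt$ components of the reduced Euler equations must vanish identically and, consistent with this conclusion, one readily computes
\[
\begin{aligned}
\Delta_{\mathrm{Inv}} &= \Bigl[ \frac{\ddot\sigma}{2} + \Bigl(\frac{\dot\sigma}{2}\Bigr)^2 - \Bigl(\frac{c}{\delta}\Bigr)^2 - \frac{\alpha\ddot\alpha + \beta\ddot\beta}{\delta} \Bigr] \Bigl[ x\,\frac{\partial}{\partial u} \otimes dt + y\,\frac{\partial}{\partial v} \otimes dt \Bigr] + \bigl[ \dot A + A^2 + B \bigr] f^1 + \bigl[ \dot\sigma + A \bigr] f^2\\
&= \bigl[ \dot A + A^2 + B \bigr] f^1 + \bigl[ \dot\sigma + A \bigr] f^2.
\end{aligned}
\]
Thus, the reduced differential equations are
\[
\dot A + A^2 + B = 0 \qquad\text{and}\qquad \dot\sigma + A = 0
\]
which determine $A$ and $B$ algebraically. In conclusion, for each choice of $\alpha$ and $\beta$ there is precisely one invariant solution to the Euler equations given by
\[
u = x\,\frac{\alpha\dot\alpha + \beta\dot\beta}{\alpha^2 + \beta^2} - y\,\frac{\alpha\dot\beta - \dot\alpha\beta}{\alpha^2 + \beta^2}, \qquad
v = x\,\frac{\alpha\dot\beta - \dot\alpha\beta}{\alpha^2 + \beta^2} + y\,\frac{\alpha\dot\alpha + \beta\dot\beta}{\alpha^2 + \beta^2}, \qquad
w = -2z\,\frac{\alpha\dot\alpha + \beta\dot\beta}{\alpha^2 + \beta^2},
\]
\[
p = -\frac{1}{2}\,(x^2 + y^2)\,\frac{\alpha\ddot\alpha + \beta\ddot\beta}{\alpha^2 + \beta^2}
+ z^2 \Bigl[ \frac{\alpha\ddot\alpha + \beta\ddot\beta}{\alpha^2 + \beta^2} + \Bigl( \frac{\alpha\dot\beta - \beta\dot\alpha}{\alpha^2 + \beta^2} \Bigr)^2 - 3 \Bigl( \frac{\alpha\dot\alpha + \beta\dot\beta}{\alpha^2 + \beta^2} \Bigr)^2 \Bigr].
\]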
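The invariant solution of Example 6.2 can be sanity-checked numerically. The sketch below assumes the particular choice $\alpha = \cosh t$, $\beta = \sinh t$ (which satisfies $\alpha\ddot\beta = \ddot\alpha\beta$ with $c = 1$), takes the Euler equations in the form $u^i_t + u^j u^i_j + p_i = 0$, $u^i_i = 0$ read off from (6.10), and approximates derivatives by central differences. The function names and the sample point are hypothetical.

```python
import math


def alpha_beta(t):
    """Assumed sample choice alpha = cosh t, beta = sinh t, with derivatives."""
    a, b = math.cosh(t), math.sinh(t)
    # alpha, beta, alpha', beta', alpha'', beta''
    return a, b, b, a, a, b


def fields(t, x, y, z):
    """Velocity (u, v, w) and pressure p of the invariant solution."""
    al, be, dal, dbe, ddal, ddbe = alpha_beta(t)
    delta = al * al + be * be
    s = (al * dal + be * dbe) / delta    # (alpha alpha' + beta beta') / delta
    c = (al * dbe - be * dal) / delta    # (alpha beta' - beta alpha') / delta
    k = (al * ddal + be * ddbe) / delta  # (alpha alpha'' + beta beta'') / delta
    u = x * s - y * c
    v = x * c + y * s
    w = -2.0 * z * s
    p = -0.5 * (x * x + y * y) * k + z * z * (k + c * c - 3.0 * s * s)
    return u, v, w, p


def partial(args, i, h=1e-5):
    """Central-difference derivative of all four fields in argument i."""
    lo, hi = list(args), list(args)
    lo[i] -= h
    hi[i] += h
    return [(b - a) / (2.0 * h) for a, b in zip(fields(*lo), fields(*hi))]


def euler_residuals(t, x, y, z):
    """Residuals of the three momentum equations and of the divergence equation."""
    args = (t, x, y, z)
    vel = fields(*args)[:3]
    dt = partial(args, 0)
    grads = [partial(args, i) for i in (1, 2, 3)]  # grads[j][k] = d(field k)/dx_j
    momentum = [
        dt[k] + sum(vel[j] * grads[j][k] for j in range(3)) + grads[k][3]
        for k in range(3)
    ]
    divergence = grads[0][0] + grads[1][1] + grads[2][2]
    return momentum + [divergence]


residuals = euler_residuals(0.3, 0.7, -1.1, 0.4)
print(max(abs(r) for r in residuals))
```

The printed value is the largest residual at the sample point; it should be at the level of the finite-difference error, not exactly zero.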

Harmonic Maps. For our next examples we look at two well-known reductions of the harmonic map equation for maps between spheres. For these examples the bundle $E$ is $S^n \times S^m \to S^n$ which we realize as a subset of $\mathbf{R}^{n+1} \times \mathbf{R}^{m+1}$ by
\[
E = \{\, (x, u) \in \mathbf{R}^{n+1} \times \mathbf{R}^{m+1} \mid x \cdot x = u \cdot u = 1 \,\}.
\]
Let $G$ be a Lie subgroup of $SO(n + 1)$, let $\rho : G \to SO(m + 1)$ be a Lie group homomorphism and define the action of $G$ on $E$ by
\[
R \cdot (x, u) = (R \cdot x,\; \rho(R) \cdot u) \quad\text{for } R \in G.
\]
The kinematic bundle for the $G$ invariant sections of $E$ has fiber
\[
\kappa_{G,x}(E) = \{\, (x, u) \in E \mid \rho(R) \cdot u = u \text{ for all } R \in G \text{ such that } R \cdot x = x \,\}.
\]
We identify the jet space $J^2(E)$ with a submanifold of $J^2(\mathbf{R}^{n+1}, \mathbf{R}^{m+1})$ by
\[
J^2(E) = \{\, (x, u, \partial_i u, \partial_{ij} u) \in J^2(\mathbf{R}^{n+1}, \mathbf{R}^{m+1}) \mid x \cdot x = 1,\; u \cdot u = 1,\; u \cdot \partial_i u = 0,\; u \cdot \partial_{ij} u + \partial_i u \cdot \partial_j u = 0 \,\}.
\]


Since the harmonic map operator (or tension field) is a tangent vector to the target sphere $S^m$ at each point $\sigma \in J^2(E)$, we let
\[
D = \{\, (\sigma, \Delta) \in J^2(E) \times \mathbf{R}^{m+1} \mid u \cdot \Delta = 0 \,\}. \tag{6.16}
\]
By combining Proposition I.1.17 (p.19) and Lemma VII.1.2 (p.129) in Eells and Ratto [15], it follows that one can write the harmonic map operator $\Delta : J^2(E) \to D$ as the map
\[
\Delta(\sigma) = \bigl[ \Delta^{\mathbf{R}^{n+1}} u^\alpha + x^i x^j u^\alpha_{ij} + n x^i u^\alpha_i - \lambda u^\alpha \bigr] \frac{\partial}{\partial u^\alpha}, \tag{6.17}
\]
where
\[
\lambda = \delta_{\alpha\beta} \bigl[ \delta^{ij} u^\alpha_i u^\beta_j - x^i x^j u^\alpha_i u^\beta_j \bigr] \qquad\text{and}\qquad \Delta^{\mathbf{R}^{n+1}} u^\alpha = -\delta^{ij} u^\alpha_{ij}.
\]
This operator is invariant under the induced action of $G = SO(n + 1) \times SO(m + 1)$ on $D$.

Example 6.3 (Harmonic Maps from $S^2$ to $S^4$). For our first example we take $E = S^2 \times S^4 \to S^2$ and we look for harmonic maps which are invariant under the standard action of $SO(3)$ acting on $S^2$. It can be proved that, up to conjugation, there are three distinct group homomorphisms $\rho : SO(3) \to SO(5)$, which lead to the following three possibilities for the infinitesimal generators of $SO(3)$ acting on $E$:

Case I:
\[
V_1 = z\partial_y - y\partial_z, \qquad V_2 = x\partial_z - z\partial_x, \qquad V_3 = y\partial_x - x\partial_y.
\]
Case II:
\[
\begin{aligned}
V_1 &= z\partial_y - y\partial_z - u^2\partial_{u^3} + u^3\partial_{u^2},\\
V_2 &= x\partial_z - z\partial_x - u^3\partial_{u^1} + u^1\partial_{u^3},\\
V_3 &= y\partial_x - x\partial_y - u^1\partial_{u^2} + u^2\partial_{u^1}.
\end{aligned}
\]
Case III:
\[
\begin{aligned}
V_1 &= z\partial_y - y\partial_z + u^2\partial_{u^1} - u^1\partial_{u^2} + (u^4 - \sqrt{3}\,u^5)\,\partial_{u^3} - u^3\partial_{u^4} + \sqrt{3}\,u^3\partial_{u^5},\\
V_2 &= x\partial_z - z\partial_x - u^3\partial_{u^1} + (u^4 + \sqrt{3}\,u^5)\,\partial_{u^2} + u^1\partial_{u^3} - u^2\partial_{u^4} - \sqrt{3}\,u^2\partial_{u^5},\\
V_3 &= y\partial_x - x\partial_y - 2u^4\partial_{u^1} + u^3\partial_{u^2} - u^2\partial_{u^3} + 2u^1\partial_{u^4}.
\end{aligned}
\]
In Case I the map $\rho$ is the constant map and, in Case II, $\rho$ is the standard inclusion of $SO(3)$ into $SO(5)$. The origin of the map $\rho$ in Case III will be discussed shortly. Since $SO(3)$ acts transitively on $S^2$, the orbit manifold $\tilde M$ consists of a single point, the space of invariant sections is a finite dimensional manifold, and the reduced differential equations are algebraic equations.

The kinematic bundles $\kappa_G(E)$ are determined in each case from the isotropy constraint $xV_1 + yV_2 + zV_3 = 0$. In Case I the action is transverse, the isotropy constraint is vacuous and the kinematic bundle is $\kappa_G(E) = S^2 \times S^4$. The invariant sections are given by
\[
A_{I}(x, y, z) = (A, B, C, D, E),
\]
where $A, \ldots, E$ are constants and $A^2 + B^2 + C^2 + D^2 + E^2 = 1$. In Case II the kinematic bundle is $S^2 \times S^2$ and the invariant sections are
\[
A_{II}(x, y, z) = (Ax, Ay, Az, B, C),
\]


where $A, B, C$ are constants such that $A^2 + B^2 + C^2 = 1$. We take $A \neq 0$, since otherwise $A_{II}$ becomes a special case of $A_{I}$. In Case III, $\kappa_G(E) = S^2 \times \{\pm 1\}$ and the invariant sections are
\[
A_{III}(x, y, z) = A\sqrt{3}\,\Bigl( xy,\; xz,\; yz,\; \tfrac{1}{2}(x^2 - y^2),\; \tfrac{\sqrt{3}}{6}(x^2 + y^2 - 2z^2) \Bigr),
\]
where $A = \pm 1$. Direct substitution into (6.17) easily shows that the maps $A_{I}$ and $A_{III}$ automatically satisfy the harmonic map equation. The map $A_{II}$ is harmonic if and only if $B = C = 0$ in which case $A_{II}$ is either the identity map or the antipodal map on $S^2$ followed by the standard inclusion into $S^4$.

Despite the simplicity of these conclusions, it is nevertheless instructive to look at the corresponding dynamic reduction diagrams. In Case I, the invariant sections are constant and so
\[
\mathrm{Inv}^2(E) = \{\, (x, A) \in \mathbf{R}^3 \times \mathbf{R}^5 \mid x \cdot x = A \cdot A = 1 \,\}
\quad\text{and}\quad
D_{\mathrm{Inv}} = \{\, (\sigma, \Delta) \in \mathrm{Inv}^2(E) \times \mathbf{R}^5 \mid A \cdot \Delta = 0 \,\}.
\]
The automorphism group for the kinematic bundle in this case is $\tilde G = SO(3) \times SO(5)$ which acts on $D_{\mathrm{Inv}}$ by
\[
(R, S) \cdot (x, A, \Delta) = (R \cdot x,\; S \cdot A,\; S \cdot \Delta) \quad\text{for } R \in SO(3) \text{ and } S \in SO(5).
\]
The isotropy constraint for $\kappa_{\tilde G}(D_{\mathrm{Inv}})$ forces $\Delta$ to be a multiple of $A$. Hence, by the tangency condition $A \cdot \Delta = 0$, we have $\Delta = 0$ and $\kappa_{\tilde G,\sigma}(D_{\mathrm{Inv}}) = 0$. This shows that the map $A_{I}$ is harmonic by symmetry considerations alone and moreover that it is a universal solution for any operator $\Delta : J^k(S^2 \times S^4) \to D$ with $SO(3) \times SO(5)$ symmetry.

In Case II, the harmonic map equations force $B = C = 0$ so that the maps $A_{II}$ are not universal. Interestingly however, the standard and antipodal inclusions $S^2 \to S^4$ have a larger symmetry group, namely $SO(3) \times SO(2) \subset G$ and it is easily seen, using these larger symmetry groups, that the standard and antipodal inclusions are universal. It is a common phenomenon that the group invariant solutions to a system of differential equations possess a larger symmetry group than the original group used in their construction.

In Case III one finds immediately that $\kappa_{G,\sigma}(D_{\mathrm{Inv}}) = 0$ and $A_{III}$ is universal, again for any operator $\Delta : J^k(S^2 \times S^4) \to D$ with $SO(3) \times SO(5)$ symmetry. The map $A_{III}$ is the classic Veronese map. The symmetry group defining it is based on a standard irreducible representation of $SO(3)$ which readily generalizes to give harmonic maps between various spheres of higher dimension. Specifically, starting with the standard action of $SO(n)$ on $V = \mathbf{R}^n$, consider the induced action on $\mathrm{Sym}^k_{\mathrm{tr}}(V)$, the space of rank $k$ symmetric, trace-free tensors or, equivalently, on the space $W = H_k(V)$ of harmonic polynomials of degree $k$ on $V$. The standard metric on $W$ is invariant under this action of $SO(n)$ and in this way one obtains a Lie group monomorphism $\rho : SO(n) \to SO(N)$, where $N = \dim(W) = \binom{n+k-1}{k} - \binom{n+k-3}{k-2}$, which for $k = 2$ is $\binom{n+1}{2} - 1$. For example, the polynomials
\[
u^1 = xy, \quad u^2 = xz, \quad u^3 = yz, \quad u^4 = \tfrac{1}{2}(x^2 - y^2), \quad u^5 = \tfrac{\sqrt{3}}{6}(x^2 + y^2 - 2z^2)
\]
form an orthogonal basis for $H_2(\mathbf{R}^3)$ and the action of $SO(3)$ on this space determines the action of $SO(3)$ on $\mathbf{R}^3 \times \mathbf{R}^5$ in Case III. For further examples see Eells and Ratto [15] and Toth [38].
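As a small consistency check on the Case III section, one can confirm numerically that the Veronese map sends the unit sphere $S^2$ into the unit sphere $S^4$. The sketch below is illustrative only: the function names, the random sampling and the fixed seed are assumptions made here, not part of the text.

```python
import math
import random


def veronese(x, y, z, sign=1.0):
    """The Case III invariant section A_III, with sign = +1 or -1."""
    r3 = math.sqrt(3.0)
    components = (
        x * y,
        x * z,
        y * z,
        0.5 * (x * x - y * y),
        (r3 / 6.0) * (x * x + y * y - 2.0 * z * z),
    )
    return tuple(sign * r3 * c for c in components)


def random_unit_vector(rng):
    """A random point on S^2, obtained by normalising a Gaussian sample."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return tuple(c / norm for c in v)


rng = random.Random(0)
worst = max(
    abs(sum(c * c for c in veronese(*random_unit_vector(rng))) - 1.0)
    for _ in range(1000)
)
print(worst)  # largest deviation of |A_III|^2 from 1 over the sample
```

Because the components are quadratic in $(x, y, z)$, antipodal points have the same image, so the map factors through the real projective plane.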


Example 6.4 (Harmonic Maps from $S^n$ to $S^n$). A basic result of Smith [35] states that each element of $\pi_n(S^n) = \mathbf{Z}$ can be represented by a harmonic map (with respect to the standard metric) provided $n \le 7$ or $n = 9$. This result, which can be established by symmetry reduction of the harmonic map equation (see Eells and Ratto [15] and Urakawa [39]), illustrates a number of interesting features. First, we see that much of the general theory which we have outlined could be extended to the case where $\tilde M$ is a manifold with boundary and where the fibers of $\kappa_G(E)$ change topological type on the boundary. Secondly, we find that the invariant sections for the standard action of $G = SO(n - 1) \times SO(2) \subset SO(n + 1)$ on $S^n$ are slightly more general than those considered in [15] and [39]. However, a simple analysis of the reduced equations, based upon Noether's theorem, shows that the only solutions to the reduced equations are essentially those provided by the ansatz used by Eells and Ratto and Urakawa.

If $(R, S) \in G = SO(n - 1) \times SO(2) \subset SO(n + 1)$ and $(x, y, u, v) \in E \subset (\mathbf{R}^{n-1} \times \mathbf{R}^2) \times (\mathbf{R}^{n-1} \times \mathbf{R}^2)$, where $\|x\|^2 + \|y\|^2 = 1$ and $\|u\|^2 + \|v\|^2 = 1$, then the action of $G$ on $E = S^n \times S^n$ is given by
\[
(R, S)(x, y, u, v) = \left( \begin{pmatrix} R & 0 \\ 0 & S \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix},\; \begin{pmatrix} R & 0 \\ 0 & S \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} \right).
\]
The invariants for the action of $G$ on the base $\mathbf{R}^{n+1}$ are $r = \|x\|$ and $s = \|y\|$ which, for points $(x, y) \in S^n$, are related by $r^2 + s^2 = 1$, where $r \ge 0$ and $s \ge 0$. The quotient manifold $\tilde M = S^n / G$ is therefore diffeomorphic to the closed interval $[0, \pi/2]$.

To describe the kinematic bundle $\kappa_G(E)$ we must consider separately those points in $M$ for which (i) $s = 0$, (ii) $s \neq 0$ and $r \neq 0$ and (iii) $r = 0$, corresponding to the left-hand boundary point, the interior points and the right-hand boundary point of $\tilde M$. For $(x, 0) \in S^n$, the isotropy subgroup is $SO(n - 1)_x \times SO(2)$ and the fiber of the kinematic bundle consists of a pair of points
\[
\kappa_{G,(x,0)}(E) = \{\, (x, 0, u, v) \mid u = \pm x \text{ and } v = 0 \,\}.
\]
For points $(x, y) \in S^n$ with $r \neq 0$ and $s \neq 0$ the isotropy group is $SO(n - 1)_x \times \{ I \}$ and the fiber of the kinematic bundle is the ellipsoid of revolution
\[
\kappa_{G,(x,y)}(E) = \{\, (x, y, u, v) \mid u = Ax, \text{ where } r^2 A^2 + \|v\|^2 = 1 \,\}.
\]
Invariant coordinates on $\kappa_{G,(x,y)}(E)$ are
\[
A = \frac{x \cdot u}{r^2}, \qquad B = \frac{y \cdot v}{s^2} \qquad\text{and}\qquad C = \frac{y^\perp \cdot v}{s^2}, \qquad\text{where } y^\perp = (0, -y^2, y^1),
\]
subject to
\[
r^2 A^2 + s^2 (B^2 + C^2) = 1. \tag{6.18}
\]
The inclusion map from $\kappa_{G,(x,y)}(E)$ to $E_{(x,y)}$ is
\[
u = Ax \qquad\text{and}\qquad v = By + Cy^\perp.
\]
At the points $(0, y)$, the isotropy subgroup is $SO(n - 1) \times \{ I \}$ and the fiber of the kinematic bundle is the circle
\[
\kappa_{G,(0,y)}(E) = \{\, (0, y, u, v) \mid u = 0 \text{ and } \|v\| = 1 \,\}.
\]

Fig. 1. The reduced kinematic bundle for $SO(n - 1) \times SO(2)$ invariant maps $s : S^n \to S^n$ (base interval marked at $0$, $\pi/4$ and $\pi/2$)

The quotient space $\tilde\kappa_G(E)$ is shown in Fig. 1. The $G$ invariant sections are therefore described, as maps $\mathbf{A} : \mathbf{R}^{n+1} \to \mathbf{R}^{n+1}$, by
\[
\mathbf{A}(x, y) = A(t)\,x + B(t)\,y + C(t)\,y^\perp, \tag{6.19}
\]
where $t$ is the smooth function of $(x, y)$ defined by
\[
\cos(t) = \frac{r}{\sqrt{r^2 + s^2}} \qquad\text{and}\qquad \sin(t) = \frac{s}{\sqrt{r^2 + s^2}},
\]
and where $\cos^2(t) A^2(t) + \sin^2(t)\,(B^2(t) + C^2(t)) = 1$. The isotropy conditions at the boundary of $\tilde M$ imply that the functions $A$, $B$ and $C$ are subject to the boundary conditions
\[
A(0) = \pm 1, \quad B(0) = 0, \quad C(0) = 0, \qquad\text{and}\qquad A(\tfrac{\pi}{2}) = 0, \quad B(\tfrac{\pi}{2})^2 + C(\tfrac{\pi}{2})^2 = 1. \tag{6.20}
\]
The invariant sections considered in [15] and [39] correspond to $C(t) = 0$. Note that the space of invariant sections (6.19) is preserved by rotations in the $v$ plane, that is, rotations in the $BC$ plane and therefore $SO(2) \subset \tilde G_{\mathrm{eff}}$. By computing $\kappa_G(D_{\mathrm{Inv}})$ we deduce that the restricted harmonic operator $\Delta_{\mathrm{Inv}}$ is of the form
\[
\Delta_{\mathrm{Inv}} = \Delta_A\, x \cdot \frac{\partial}{\partial u} + \Delta_B\, y \cdot \frac{\partial}{\partial v} + \Delta_C\, y^\perp \cdot \frac{\partial}{\partial v},
\]
where the tangency condition (6.16) reduces to $r^2 A \Delta_A + s^2 B \Delta_B + s^2 C \Delta_C = 0$. A series of straightforward calculations, using (6.17), now shows that the coefficients of the reduced operator $\tilde\Delta$ are
\[
\begin{aligned}
\tilde\Delta_A &= -\ddot A + \Bigl( n\,\frac{\sin(t)}{\cos(t)} - \frac{\cos(t)}{\sin(t)} \Bigr) \dot A + nA - \lambda A,\\
\tilde\Delta_B &= -\ddot B + \Bigl( (n - 2)\,\frac{\sin(t)}{\cos(t)} - 3\,\frac{\cos(t)}{\sin(t)} \Bigr) \dot B + nB - \lambda B,\\
\tilde\Delta_C &= -\ddot C + \Bigl( (n - 2)\,\frac{\sin(t)}{\cos(t)} - 3\,\frac{\cos(t)}{\sin(t)} \Bigr) \dot C + nC - \lambda C,
\end{aligned} \tag{6.21}
\]
where
\[
\lambda = \cos^2(t)\,\dot A^2 + \sin^2(t)\,\bigl( \dot B^2 + \dot C^2 \bigr) + 2\cos(t)\sin(t)\,\bigl( -A\dot A + B\dot B + C\dot C \bigr) + (n - 1) A^2 + 2(B^2 + C^2) - \cos^2(t) A^2 - \sin^2(t)(B^2 + C^2).
\]

To analyze these equations, we first invoke the principle of symmetric criticality and the formulas in [2] for the reduced Lagrangian to conclude that these equations are the Euler–Lagrange equations for the reduced Lagrangian
\[
\tilde L = \frac{1}{2} \int \cos(t)^{n-2} \sin(t)\, \lambda \, dt
\]
subject, of course, to the constraint (6.18). From knowledge of the automorphism group of the kinematic bundle we know that this Lagrangian is invariant under rotations in the $BC$ plane and this leads to the first integral
\[
J = \cos(t)^{n-2} \sin(t)^3\, (B\dot C - C\dot B)
\]
for (6.21). By the boundary conditions (6.20), $J$ must vanish identically. Thus $C(t) = \mu B(t)$, for some constant $\mu$ and therefore a rotation in the $v, v^\perp$ plane will rotate the general invariant section (6.19) into the section with $C(t) = 0$. We then have $r^2 A^2 + s^2 B^2 = 1$ and the change of variables
\[
A(t) = \frac{\cos(\phi(t))}{\cos(t)} \qquad\text{and}\qquad B(t) = \frac{\sin(\phi(t))}{\sin(t)}
\]
converts the reduced operator (6.21) into the form found in [15] or [39].

General Relativity. We now turn to some examples of Lie symmetry reduction in general relativity which we again examine from the viewpoint of the kinematic and dynamic reduction diagrams. To study reductions of the Einstein field equations, we take the bundle $E$ to be the bundle $Q(M)$ of quadratic forms, with Lorentz signature, on a 4-dimensional manifold $M$. A section of $E$ then corresponds to a choice of Lorentz metric on $M$. We view the Einstein tensor
\[
\Delta = G^{ij}(g_{hk}, g_{hk,l}, g_{hk,lm})\, \frac{\partial}{\partial x^i} \otimes \frac{\partial}{\partial x^j}
\]

formally as a section of $D \to J^2(E)$, where $D$ is the pullback of $V = \mathrm{Sym}^2(T(M))$ to the bundle of 2-jets $J^2(E)$. The operator $\Delta$ is invariant under the Lie pseudo-group $G$ of all local diffeomorphisms of $M$. Let $\mathrm{Div}_g$ be the covariant divergence operator (defined by the metric connection for $g$) acting on (1,1) tensors,
\[
\mathrm{Div}_g(S) = \nabla_i S^i_j \, dx^j.
\]
The contracted Bianchi identity is $\mathrm{Div}_g \Delta_G = 0$, where $\Delta_G$ is the operator obtained from $\Delta$ by lowering an index with the metric.

The first point we wish to underscore with the following examples is that the kinematic reduction diagram gives a remarkably efficient means of solving the Killing equations for the determination of the invariant metrics. Secondly, we show that discrete symmetries, which will not change the dimension of the reduced spacetime $\tilde M$, can lead to isotropy constraints which reduce the fiber dimension of the kinematic bundle. Thirdly, for $G$ invariant metrics, the divergence operator $\mathrm{Div}_g$ is a $G$ invariant operator to which the dynamical reduction procedure can be applied to obtain the reduction of the contracted Bianchi identities for the reduced equations. Throughout, we emphasize the importance of the residual symmetry group in analyzing the reductions of the field equations. Finally, we remark that our conclusions in these examples are not restricted to the Einstein equations but in fact hold for any generally covariant metric field theory derivable from a variational principle.

Example 6.5 (Spherically Symmetric and Stationary, Spherically Symmetric Reductions). We begin by looking at spherically symmetric solutions on the four dimensional manifold $M = \mathbf{R} \times (\mathbf{R}^3 - \{ 0 \})$, with coordinates $(x^i) = (t, x, y, z)$ for $i = 0, 1, 2, 3$. Although this is a very well-understood example, it is nevertheless instructive to consider it within the general theory of Lie symmetry reduction of differential equations. The infinitesimal generators for $G = SO(3)$ are given by (2.10) and, just as in Example 6.1, we find that the infinitesimal isotropy constraint defining $\kappa_{G,x}(E) = \kappa_{\Gamma,x}(E)$ is
\[
\varepsilon_{0kij}\, x^k g_{li}\, \frac{\partial}{\partial g_{lj}} = 0,
\]
or, in terms of matrices,
\[
ga + a^t g = 0, \qquad\text{where}\qquad
a = \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & z & -y\\ 0 & -z & 0 & x\\ 0 & y & -x & 0 \end{pmatrix} \tag{6.22}
\]
and $g = [\, g_{ij} \,]$. These linear equations are easily solved to give
\[
g = A \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}
+ B \begin{pmatrix} 0 & x & y & z\\ x & 0 & 0 & 0\\ y & 0 & 0 & 0\\ z & 0 & 0 & 0 \end{pmatrix}
+ C \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & x^2 & xy & xz\\ 0 & xy & y^2 & yz\\ 0 & xz & yz & z^2 \end{pmatrix}
+ D \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{6.23}
\]

The fiber of the kinematic bundle $\kappa_{G,x}(E)$ is therefore parameterized by four variables $A, B, C, D$. Since these variables are invariants for the action of $G$ restricted to $\kappa_G(E)$ and since the invariants for the action of $SO(3)$ on $M$ are $t$ and $r$, the kinematic reduction diagram for the action of $SO(3)$ on the bundle of Lorentz metrics is
\[
\begin{array}{ccccc}
(t, r, A, B, C, D) & \xleftarrow{\;q_{\kappa_G}\;} & (x^i, A, B, C, D) & \xrightarrow{\;\iota\;} & (x^i, g_{ij})\\[2pt]
\big\downarrow & & \big\downarrow & & \big\downarrow\\[2pt]
(t, r) & \xleftarrow{\;q_M\;} & (x^i) & \xrightarrow{\;\mathrm{id}\;} & (x^i),
\end{array}
\]
where the inclusion map $\iota$ is given by (6.23). Consequently, the most general rotationally invariant metric on $M$ is
\[
ds^2 = A(t, r)\,dt^2 + 2B(t, r)\,dt\,(x\,dx + y\,dy + z\,dz) + C(t, r)\,(x\,dx + y\,dy + z\,dz)^2 + D(t, r)\,(dx^2 + dy^2 + dz^2). \tag{6.24}
\]
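The claim that the four matrices in (6.23) solve the isotropy condition (6.22) can be confirmed by direct multiplication. The following sketch does this at one sample point using exact integer arithmetic; the helper names and the sample point are illustrative assumptions, and the matrix $a$ is taken in the form shown in (6.22).

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


def isotropy_matrix(x, y, z):
    """The matrix a of (6.22) at the point (t, x, y, z)."""
    return [
        [0, 0, 0, 0],
        [0, 0, z, -y],
        [0, -z, 0, x],
        [0, y, -x, 0],
    ]


def basis_metrics(x, y, z):
    """The coefficient matrices of A, B, C and D in (6.23)."""
    g_a = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    g_b = [[0, x, y, z], [x, 0, 0, 0], [y, 0, 0, 0], [z, 0, 0, 0]]
    g_c = [
        [0, 0, 0, 0],
        [0, x * x, x * y, x * z],
        [0, x * y, y * y, y * z],
        [0, x * z, y * z, z * z],
    ]
    g_d = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    return [g_a, g_b, g_c, g_d]


def isotropy_defect(g, a):
    """Entries of g a + a^T g; all of them vanish for an invariant g."""
    ga = matmul(g, a)
    atg = matmul(transpose(a), g)
    return [[ga[i][j] + atg[i][j] for j in range(4)] for i in range(4)]


point = (1, 2, 3)  # hypothetical sample point (x, y, z)
a = isotropy_matrix(*point)
defects = [isotropy_defect(g, a) for g in basis_metrics(*point)]
print(all(entry == 0 for d in defects for row in d for entry in row))
```

A quadratic form that is not rotationally invariant about the axis through the point, such as $dx^2$ alone, fails the same test, which is a quick way to see that the condition is not vacuous.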


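In standard spherical coordinates $x = r\cos\theta\sin\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\phi$ this takes the familiar form (on re-defining the coefficients $B$, $C$ and $D$)
\[
ds^2 = A(t, r)\,dt^2 + B(t, r)\,dt\,dr + C(t, r)\,dr^2 + D(t, r)\,d\Omega^2, \tag{6.25}
\]
where $d\Omega^2 = d\phi^2 + \sin^2\phi \, d\theta^2$.

If we enlarge the symmetry group to include time translations $V_0 = \frac{\partial}{\partial t}$, then the kinematic reduction diagram becomes
\[
\begin{array}{ccccc}
(r, A, B, C, D) & \xleftarrow{\;q_{\kappa_G}\;} & (x^i, A, B, C, D) & \xrightarrow{\;\iota\;} & (x^i, g_{ij})\\[2pt]
\big\downarrow & & \big\downarrow & & \big\downarrow\\[2pt]
(r) & \xleftarrow{\;q_M\;} & (x^i) & \xrightarrow{\;\mathrm{id}\;} & (x^i).
\end{array} \tag{6.26}
\]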

At first glance there appears to be little difference between the two diagrams (6.24) and (6.26), but a computation of the automorphism groups reveals a dramatic difference in the geometry of the reduced bundles $\tilde\kappa_G(E)$ in (6.24) and (6.26). This difference is best explained in terms of general results on Kaluza-Klein reductions of metric theories as in, for example, Coquereaux and Jadczyk [13]. From our perspective, these authors show that when the action of $G$ on $M$ is simple in the sense that the isotropy groups $G_x$ can all be conjugated in $G$ to a fixed isotropy group $G_{x_0}$, then the reduced bundle $\tilde\kappa_G(E)$ is a product of three bundles over $\tilde M$,
\[
\tilde\kappa_G(E) = Q(\tilde M) \oplus A(\tilde M) \oplus Q_{\mathrm{Inv}}(K). \tag{6.27}
\]
Here

(i) $Q(\tilde M)$ is the bundle of metrics on $\tilde M$.
(ii) $A(\tilde M) = \Lambda^1(\tilde M) \otimes (P \times_H \mathfrak{h})$, where $P$ is the principal $H$ bundle defined as the set of points in $M$ with isotropy group $G_{x_0}$ and $H = \mathrm{Nor}(G_{x_0}, G)/G_{x_0}$.
(iii) $Q_{\mathrm{Inv}}(K)$ is the trivial bundle whose fiber consists of the $G$ invariant metrics on the homogeneous space $K = G/G_{x_0}$.

For (6.24) one computes the residual symmetry group $\tilde G_{\mathrm{eff}}$ to be the diffeomorphism group of $\tilde M = \mathbf{R} \times \mathbf{R}^+$ and one finds that the coefficients $A, B, C$ transform as the components of a metric on $\tilde M$ and that $D$ is a scalar field (which one identifies as a map from $\tilde M$ into the space of $SO(3)$ invariant metrics on $S^2$). Thus, for (6.24), we find that $\tilde\kappa_G(E)$ is the direct sum of $Q(\tilde M)$ and a trivial line bundle over $\tilde M$. By contrast, for the diagram (6.26) the automorphism group $\tilde G_{\mathrm{eff}}$ acts on $M$ by
\[
r \to f(r) \qquad\text{and}\qquad t \to \epsilon t + g(r), \tag{6.28}
\]
where $f \in \mathrm{Diff}(\mathbf{R}^+)$, $g \in C^\infty(\mathbf{R})$ and $\epsilon \in \mathbf{R}^*$. Without going further into the details of the decomposition (6.27), we simply note that the variable $t$ is now the fiber coordinate on the principal bundle $P$ and that under the transformations (6.28) the coefficients of the metric (6.25), which are now functions of $r$ alone, transform according to
\[
\begin{aligned}
A(r) &\to \epsilon^2 A(f(r)), &\qquad B(r) &\to \epsilon\,[\, f' B(f(r)) + 2 g' A(f(r)) \,],\\
C(r) &\to (f')^2 C(f(r)) + f' g' B(f(r)) + (g')^2 A(f(r)), &\qquad D(r) &\to D(f(r)).
\end{aligned}
\]
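These transformation rules follow from pulling back the metric (6.25) along (6.28). The short derivation below is a sketch added for the reader; the scaling constant is written $\epsilon$ (the symbol is an assumption where the source glyph is illegible), and all coefficient functions are evaluated at $f(r)$.

```latex
\begin{align*}
dt &\longmapsto \epsilon\,dt + g'\,dr, \qquad dr \longmapsto f'\,dr,\\[4pt]
A\,dt^2 + B\,dt\,dr + C\,dr^2
 &\longmapsto A\,(\epsilon\,dt + g'\,dr)^2 + B\,(\epsilon\,dt + g'\,dr)\,f'\,dr + C\,(f')^2\,dr^2\\
 &= \epsilon^2 A\,dt^2 + \epsilon\,\bigl[f' B + 2 g' A\bigr]\,dt\,dr
    + \bigl[(f')^2 C + f' g' B + (g')^2 A\bigr]\,dr^2 .
\end{align*}
```

The angular term $D\,d\Omega^2$ is untouched by (6.28) apart from the substitution $r \to f(r)$, which gives $D(r) \to D(f(r))$.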


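Consequently, the sections of $\tilde\kappa_G(E)$ can be written as $\tilde s(r) = [\, \tilde g(r), \tilde\omega(r), \tilde h(r) \,]$, where
\[
\tilde g(r) = \Bigl[ C(r) - \frac{B(r)^2}{4A(r)} \Bigr]\, dr^2, \qquad
\tilde\omega(r) = \frac{B(r)}{2A(r)}\, dr \otimes \frac{\partial}{\partial t}, \qquad\text{and}\qquad
\tilde h(r) = A(r)\,dt^2 + D(r)\,d\Omega^2.
\]
Here $\tilde g(r)$ is a metric on $\tilde M$, $\tilde\omega(r)$ is a connection on $P$ pulled back to $\tilde M$, and $\tilde h(r)$ is a map from $\tilde M$ into the $G$ invariant metrics on $\mathbf{R} \times S^2$.

The detailed expression for the reduced operator $\tilde\Delta$ for the stationary, rotationally invariant metrics can be found in any introductory text on general relativity. Here we simply point out that by computing the action of $G$ on $\mathrm{Sym}^2(TM)$, we can deduce that the reduced operator will have the form
\[
\tilde\Delta = \tilde\Delta^{tt}\, \frac{\partial}{\partial t} \otimes \frac{\partial}{\partial t}
+ \tilde\Delta^{rt} \Bigl( \frac{\partial}{\partial r} \otimes \frac{\partial}{\partial t} + \frac{\partial}{\partial t} \otimes \frac{\partial}{\partial r} \Bigr)
+ \tilde\Delta^{rr}\, \frac{\partial}{\partial r} \otimes \frac{\partial}{\partial r}
+ \tilde\Delta^{\Omega} \Bigl( \frac{\partial}{\partial \phi} \otimes \frac{\partial}{\partial \phi} + \frac{1}{\sin^2\phi}\, \frac{\partial}{\partial \theta} \otimes \frac{\partial}{\partial \theta} \Bigr),
\]
where $\tilde\Delta^{tt}$, $\tilde\Delta^{rt}$, $\tilde\Delta^{rr}$ and $\tilde\Delta^{\Omega}$ are smooth functions on the 2-jets of the bundle $(r, A, B, C, D) \to (r)$. In other words, of the ten components in the field equations, the dynamic reduction diagram automatically implies that 6 of these components vanish. Moreover, the reduced operator $\tilde\Delta$ is constrained by the reduced Bianchi identities. Since $dt$ and $dr$ provide a basis for the invariant one forms on $M$, we know that the reduction of $\mathrm{Div}_g S$ is a linear combination of $dt$ and $dr$,
\[
\widetilde{\mathrm{Div}}_g S = \tilde S_t\, dt + \tilde S_r\, dr.
\]
By direct computation, one finds that the $dt$ and $dr$ components of the reduced Bianchi identities are
\[
\frac{1}{2\gamma} \frac{d}{dr} \bigl[ \gamma\, (2A \tilde\Delta^{rt} + B \tilde\Delta^{rr}) \bigr] = 0, \qquad\text{and}
\]
\[
\frac{1}{2} \Bigl[ \frac{1}{\gamma} \frac{d}{dr} \bigl[ \gamma\, (2C \tilde\Delta^{rr} + B \tilde\Delta^{rt}) \bigr] - \dot A\, \tilde\Delta^{tt} - \dot C\, \tilde\Delta^{rr} - \dot B\, \tilde\Delta^{rt} - 2 \dot D\, \tilde\Delta^{\Omega} \Bigr] = 0,
\]
where $\gamma = D \sqrt{\tfrac{1}{4} B^2 - AC}$. It follows from the first of these identities and the transformation properties of $A$, $B$, $\tilde\Delta^{rt}$ and $\tilde\Delta^{rr}$ under the residual scaling $t \to \epsilon t$ that
\[
2A \tilde\Delta^{rt} + B \tilde\Delta^{rr} = 0.
\]
This same identity can be derived by first observing that the principle of symmetric criticality holds for the action $G$ and then by applying Noether's second theorem to the reduced Lagrangian with symmetry $\tilde G_{\mathrm{eff}}$.

Consequently of the four ODEs arising in the stationary, spherically symmetric reduction of the field equations one need only solve the two equations
\[
\tilde\Delta^{tt} = 0 \qquad\text{and}\qquad \tilde\Delta^{rr} = 0.
\]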


The remaining two equations
\[
\tilde\Delta^{rt} = 0 \qquad\text{and}\qquad \tilde\Delta^{\Omega} = 0
\]
will automatically be satisfied (assuming $\dot D \neq 0$, $A \neq 0$): the first follows from the identity $2A \tilde\Delta^{rt} + B \tilde\Delta^{rr} = 0$ and the second from the $dr$ component of the reduced Bianchi identities. We stress that these conclusions actually hold true for the stationary, rotationally invariant reductions of any generally covariant metric field equations derivable from a variational principle.

Example 6.6 (Static, Spherically Symmetric Reductions). A metric is static and spherically symmetric if, in addition to being invariant under time translations and rotations, it is invariant under time reflection. The symmetry group $G$ now includes the transformations $t \to t + c$ and $t \to -t$ and therefore the isotropy subgroup $G_{x_0}$ of the point $x_0 = (t_0, \mathbf{x}_0)$ now includes the reflection $t \to 2t_0 - t$. The fibers of the kinematic bundle are now constrained by (6.22) along with
\[
b\,g\,b^t = g, \qquad\text{where}\qquad b = \mathrm{diag}[-1, 1, 1, 1].
\]
This forces $B = 0$ in (6.23) so that the fibers of the kinematic bundle are now 3-dimensional and the general invariant section is
\[
ds^2 = A(r)\,dt^2 + C(r)\,dr^2 + D(r)\,d\Omega^2.
\]
The automorphism group for this bundle is now $r \to f(r)$ and $t \to \epsilon t$ and the $A(\tilde M)$ summand in (6.27) does not appear. This example shows that while discrete symmetries will never result in a reduction of the dimension of the orbit space $\tilde M$, that is, the number of independent variables, discrete symmetries can reduce the fiber dimension of the kinematic bundle, that is, the number of dependent variables.

Example 6.7 (Plane Waves). As our next example from general relativity, we consider a class of plane wave metrics [12]. We take $M = \mathbf{R}^4$ with coordinates $(u, v, x, y)$ and let $P(u)$ and $Q(u)$ be arbitrary smooth functions satisfying $P'(u) > 0$ and $Q'(u) > 0$. The symmetry group on $M$ is the five-parameter transformation group
\[
\begin{aligned}
u' &= u, \qquad v' = v + \varepsilon_1 + \varepsilon_4 x + \varepsilon_5 y + \tfrac{1}{2} \bigl( \varepsilon_2 \varepsilon_4 + \varepsilon_3 \varepsilon_5 + \varepsilon_4^2 P(u) + \varepsilon_5^2 Q(u) \bigr),\\
x' &= x + \varepsilon_2 + \varepsilon_4 P(u), \qquad y' = y + \varepsilon_3 + \varepsilon_5 Q(u),
\end{aligned} \tag{6.29}
\]

with infinitesimal generators
\[
V_1 = \frac{\partial}{\partial v}, \qquad V_2 = \frac{\partial}{\partial x}, \qquad V_3 = \frac{\partial}{\partial y}, \qquad
V_4 = x\,\frac{\partial}{\partial v} + P(u)\,\frac{\partial}{\partial x} \qquad\text{and}\qquad
V_5 = y\,\frac{\partial}{\partial v} + Q(u)\,\frac{\partial}{\partial y}.
\]
The only non-vanishing brackets are
\[
[\, V_2, V_4 \,] = V_1 \qquad\text{and}\qquad [\, V_3, V_5 \,] = V_1
\]
so that, regardless of the choice of functions $P$ and $Q$, the abstract Lie algebras or groups are the same although the actions are generically different for different choices of $P$ and $Q$. The coordinate function $u$ is the only invariant and the orbits of this action are 3-dimensional. Therefore, at each point the isotropy subgroup is two dimensional and it is easily seen that, at $x_0 = (u_0, v_0, x_0, y_0)$, the infinitesimal isotropy $\Gamma_{x_0}$ is generated by
\[
Z_1 = V_4 - x_0 V_1 - P(u_0) V_2 \qquad\text{and}\qquad Z_2 = V_5 - y_0 V_1 - Q(u_0) V_3.
\]


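At $x \in M$, the metric components $g = [\, g_{ij} \,]$ of a $G$ invariant metric satisfy the isotropy conditions
\[
g a_1 + a_1^t g = 0 \qquad\text{and}\qquad g a_2 + a_2^t g = 0, \tag{6.30}
\]
where
\[
a_1 = \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ P'(u) & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}
\qquad\text{and}\qquad
a_2 = \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0\\ Q'(u) & 0 & 0 & 0 \end{pmatrix}.
\]
We find that the solutions to (6.30) are
\[
g_1 = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}
\qquad\text{and}\qquad
g_2 = \begin{pmatrix} 0 & -1 & 0 & 0\\ -1 & 0 & 0 & 0\\ 0 & 0 & \dfrac{1}{P'(u)} & 0\\ 0 & 0 & 0 & \dfrac{1}{Q'(u)} \end{pmatrix}.
\]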

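Thus the kinematic reduction diagram is
\[
\begin{array}{ccccc}
(u, A, B) & \xleftarrow{\;q_{\kappa_G}\;} & (x^i, A, B) & \xrightarrow{\;\iota\;} & (x^i, g_{ij})\\[2pt]
\big\downarrow & & \big\downarrow & & \big\downarrow\\[2pt]
(u) & \xleftarrow{\;q_M\;} & (x^i) & \xrightarrow{\;\mathrm{id}\;} & (x^i),
\end{array}
\]
and the inclusion map $\iota$ sends $(A, B)$ to $ds^2 = A\,du^2 + B\gamma$, where
\[
\gamma = -2\,du\,dv + \frac{dx^2}{P'(u)} + \frac{dy^2}{Q'(u)}.
\]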
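The invariance of $\gamma$ under the generator $V_4 = x\,\partial_v + P(u)\,\partial_x$ can be verified with Lie derivatives. The computation below is a worked check added for illustration, with products of one-forms understood as symmetric products; the case of $V_5$ is identical with $(x, P)$ replaced by $(y, Q)$.

```latex
\begin{align*}
L_{V_4}\,du &= d\bigl(V_4(u)\bigr) = 0, &
L_{V_4}\,dv &= d\bigl(V_4(v)\bigr) = dx,\\
L_{V_4}\,dx &= d\bigl(V_4(x)\bigr) = P'(u)\,du, &
L_{V_4}\,dy &= 0,\\[4pt]
L_{V_4}\,\gamma &= -2\,du\,dx + \frac{2}{P'(u)}\,dx\,\bigl(P'(u)\,du\bigr) = 0.
\end{align*}
```

Since $V_4(u) = 0$, the factor $1/P'(u)$ is constant along $V_4$, and $du^2$ is invariant for the same reason, so both $du^2$ and $\gamma$ are preserved.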

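The most general $G$ invariant metric is
\[
ds^2 = A(u)\,du^2 + B(u)\,\gamma. \tag{6.31}
\]
From the form of the most general $G$ invariant symmetric type $\binom{2}{0}$ tensor, we are assured that the reduced field equations take the form
\[
\tilde\Delta = \tilde\Delta^{vv}\, \frac{\partial}{\partial v} \otimes \frac{\partial}{\partial v}
+ \tilde\Delta^{\gamma} \Bigl( -\frac{\partial}{\partial u} \otimes \frac{\partial}{\partial v} - \frac{\partial}{\partial v} \otimes \frac{\partial}{\partial u} + P'(u)\, \frac{\partial}{\partial x} \otimes \frac{\partial}{\partial x} + Q'(u)\, \frac{\partial}{\partial y} \otimes \frac{\partial}{\partial y} \Bigr).
\]
Every $G$ invariant one-form is a multiple of $du$ so that there is only one non-trivial component to the contracted Bianchi identities and, indeed, by direct computation we find that
\[
\mathrm{Div}_{\tilde g}\, \tilde\Delta_G = \frac{d}{du} \bigl( B \tilde\Delta^{\gamma} \bigr)\, du.
\]
Since this must vanish identically, we conclude that the $\tilde\Delta^{\gamma}$ component of the reduced field equations is of the form
\[
\tilde\Delta^{\gamma} = \frac{c}{B},
\]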


where $c$ is a constant. Either the constant $c$ is non-zero, in which case the reduced equations are inconsistent and there are no $G$ invariant solutions, or else $c = 0$ and the reduced equations consist of just the single equation $\tilde\Delta_{vv} = 0$. For generally covariant metric theories the case c = 0 can only arise when the field equations contain a cosmological term [37]. It is easy to check that while the isotropy algebras at the points $x_0$ are all two-dimensional abelian subalgebras, on disjoint orbits none are conjugate under the adjoint action of $G$. Hence the group action (6.29) is not simple and consequently the kinematic bundle for this action need not decompose according to (6.27). Indeed, the tensor $\gamma$ cannot be identified with any $G$ invariant quadratic form on the orbits $G/G_{x_0}$.

Example 6.8 (Symplectic Reduction and Group Invariant Solutions). It is important to recognize the fundamental differences between symplectic reduction and Lie symmetry reduction for group invariant solutions of a Hamiltonian system with symmetry. Let $M$ be an even dimensional manifold with symplectic form $\omega$ and let $H : M \to \mathbb{R}$ be the Hamiltonian for a dynamical system on $M$. For the purposes of this example, it suffices to consider reduction by a one dimensional group of Hamiltonian symmetries generated by a vector field $V$ with associated momentum map $J$,
\[
V \mathbin{\lrcorner} \omega = dJ. \tag{6.32}
\]

In symplectic reduction the reduced space $\hat M$ is obtained by (i) restricting to the submanifold of $M$ defined by $J = \mu \equiv$ constant, and then (ii) quotienting this submanifold by the action of the transformation group generated by $V$. Both $\omega$ and $H$ descend to $\hat M$ and the reduced equations are the associated Hamiltonian system on $\hat M$. Since $\dim \hat M = \dim M - 2$, the reduction in the number of dependent variables is 2. The solutions to the original Hamiltonian equations are obtained from those of the reduced Hamiltonian equations by quadratures.

To compare with symmetry reduction for group invariant solutions, we transcribe Hamilton’s equations into the operator-theoretic setting used to construct the kinematic and dynamic reduction diagrams. Let $E = M \times \mathbb{R} \to \mathbb{R}$ be extended phase space so that the differential operator characterizing the canonical equations is the one-form valued operator on $J^1(E)$ defined by
\[
\Delta = X \mathbin{\lrcorner} \omega - dH.
\]

Here $X$ is the total derivative operator given, in standard canonical coordinates $(u^i, p_i)$ on $M$, by
\[
X = \frac{d}{dt} = \frac{\partial}{\partial t} + \dot u^i \frac{\partial}{\partial u^i} + \dot p_i \frac{\partial}{\partial p_i}.
\]
It is not difficult to show that if $V$ is any vector field on $M$, then the prolongation of $V$ to $J^1(E)$ satisfies $[X, \mathrm{pr}^1 V] = 0$ and therefore $V$ is a symmetry of the operator $\Delta$ whenever $V$ is a symmetry of $\omega$ and $H$. Since $V$ is a vertical vector field on $E$ it is “all isotropy” and the kinematic bundle is the fixed point set for the flow of $V$,
\[
\kappa(E) = \{(t, u^i, p_i) \mid V(u^i, p_i) = 0\}.
\]

The dimension of $\kappa(E)$ therefore depends upon the choice of $V$ and is generally less than the dimension of $E$ by more than 2 (the decrease in the dimension in the case of symplectic reduction). In short, it is not possible to identify the fibers of the kinematic bundle with the reduced phase space $\hat M$. Moreover, from (6.32), it follows that points in $\kappa(E)$ always correspond to points on the singular level sets of the momentum map and, typically, to points where the level sets fail to be a manifold. Thus the invariant solutions are problematic from the viewpoint of symplectic reduction and are subject to special treatment. See, for example, [5] and [20]. Finally, there is no guarantee that the reduced equations for the group invariant solutions possess any natural inherited Hamiltonian formulation.

We illustrate these general observations with some specific examples. First, if $V$ is a translation symmetry of a mechanical system, then $J$ is a linear function and symplectic reduction yields all the solutions to Hamilton’s equations with a given fixed value for the first integral $J$. Since the vector field $V$ never vanishes, the kinematic bundle is empty and there are no group invariant solutions. Second, for the classical 3-dimensional central force problem
\[
\ddot u = -f(\rho)\,u, \qquad \ddot v = -f(\rho)\,v, \qquad \ddot w = -f(\rho)\,w,
\]
where $\rho = \sqrt{u^2 + v^2 + w^2}$, the extended phase space $E$ is $\mathbb{R}\times\mathbb{R}^6 \to \mathbb{R}$ with coordinates $(t, u, v, w, p_u, p_v, p_w) \to (t)$, the symplectic structure on phase space is $\omega = du \wedge dp_u + dv \wedge dp_v + dw \wedge dp_w$ and the Hamiltonian is $H = \tfrac12(p_u^2 + p_v^2 + p_w^2) + \phi(\rho)$, where $\phi'(\rho) = \rho f(\rho)$. The vector field
\[
V = -u\,\frac{\partial}{\partial v} + v\,\frac{\partial}{\partial u} - p_u\,\frac{\partial}{\partial p_v} + p_v\,\frac{\partial}{\partial p_u}
\]
is a Hamiltonian symmetry. The kinematic bundle for the $V$ invariant sections of $E$ is
\[
\begin{array}{ccccc}
(t, w, p_w) & \xleftarrow{\ \mathrm{id}\ } & (t, w, p_w) & \xrightarrow{\ \iota\ } & (t, u, v, w, p_u, p_v, p_w)\\
\pi\big\downarrow & & \pi\big\downarrow & & \big\downarrow\pi\\
(t) & \xleftarrow{\ \mathrm{id}\ } & (t) & \xrightarrow{\ \mathrm{id}\ } & (t),
\end{array} \tag{6.33}
\]
where $\iota(t, w, p_w) = (t, 0, 0, w, 0, 0, p_w)$, the invariant sections are of the form $t \to (0, 0, w(t), 0, 0, p_w(t))$, and the reduced differential operator for the $V$ invariant solutions is
\[
\tilde\Delta = (\dot w - p_w)\,dp_w - (\dot p_w + w f(|w|))\,dw.
\]
Let us compare this state of affairs with that obtained by symplectic reduction based upon the Hamiltonian vector field $V$. The momentum map associated to this symmetry is the angular momentum $J = -u p_v + v p_u$. The level sets $J = \mu$ are manifolds except for $\mu = 0$. The level set $J = 0$ is the product of a plane and a cone whose vertex is precisely the fiber of the kinematic bundle.

To implement the symplectic reduction, we introduce canonical cylindrical coordinates $(r, \theta, w, p_r, p_\theta, p_w)$, where
\[
u = r\cos\theta, \qquad v = r\sin\theta, \qquad p_u = p_r\cos\theta - \frac{p_\theta}{r}\sin\theta, \qquad p_v = p_r\sin\theta + \frac{p_\theta}{r}\cos\theta.
\]
Note that this change of coordinates fails precisely at points of the kinematic bundle. In terms of these phase space coordinates, the symplectic structure is still in canonical form $\omega = dr \wedge dp_r + d\theta \wedge dp_\theta + dw \wedge dp_w$, the Hamiltonian is
\[
H = \frac12\Bigl(p_r^2 + \frac{1}{r^2}\,p_\theta^2 + p_w^2\Bigr) + \phi\bigl(\sqrt{r^2 + w^2}\bigr),
\]
and the momentum map is $J = -p_\theta$. We can therefore describe the symplectic reduction of $E$ by the diagram
\[
\begin{array}{ccccc}
(t, r, w, p_r, p_w) & \longleftarrow & (t, r, \theta, w, p_r, p_w) & \xrightarrow{\ \iota\ } & (t, r, \theta, w, p_r, p_\theta, p_w)\\
\big\downarrow & & \big\downarrow & & \big\downarrow\\
(t) & \longleftarrow & (t) & \longrightarrow & (t).
\end{array}
\]

The reduced symplectic structure is then $\hat\omega = dr \wedge dp_r + dw \wedge dp_w$, the reduced Hamiltonian is
\[
\hat H = \frac12\Bigl(p_r^2 + \frac{\mu^2}{r^2} + p_w^2\Bigr) + \phi\bigl(\sqrt{r^2 + w^2}\bigr),
\]
and the reduced equations of motion are
\[
\dot r = p_r, \qquad \dot p_r = -r f\bigl(\sqrt{r^2 + w^2}\bigr) + \frac{\mu^2}{r^3}, \qquad \dot w = p_w, \qquad \dot p_w = -w f\bigl(\sqrt{r^2 + w^2}\bigr).
\]
Given a choice of $\mu$ and solutions to these reduced equations, we get a solution to the full equations via θ = −µt + const.

7. Appendix

We summarize a few technical points concerning group actions on fiber bundles and the construction of the kinematic and dynamic reduction diagrams. For details, see [3].

A. Transversality and Regularity. Let $G$ be a finite dimensional Lie group acting projectably on a bundle $\pi : E \to M$. We say that $G$ acts transversally on $E$ if, for each fixed $p \in E$ and each fixed $g \in G$, the equation
\[
\pi(g \cdot p) = \pi(p) \quad\text{implies that}\quad g \cdot p = p. \tag{7.1}
\]

Thus each orbit of $G$ intersects each fiber of $E$ at most once. For transverse group actions the orbits of $G$ in $E$ are diffeomorphic to the orbits of $G$ in $M$ under the projection map $\pi : E \to M$. Projectable, transverse actions always satisfy the infinitesimal transversality condition (2.4) but the converse is easily seen to be false.

Let us say that the action of $G$ on $M$ is regular if the quotient space $\tilde M = M/G$ is a smooth manifold and the quotient map $q_M : M \to \tilde M$ defines $M$ as a bundle over $\tilde M$. The construction of the orbit manifold $\tilde M$ is discussed in various texts, for example, [1], [4], [29], [31]. The assumption that the action of $G$ on $M$ is regular is a standard assumption in Lie symmetry reduction. For simplicity we suppose that $\tilde M$ is a manifold without boundary but, as Example 6.4 shows, this assumption can be relaxed in applications. The fundamental properties of transverse group actions are described in the following theorem which is proved in [3].

Theorem 7.1 (The Regularity Theorem for Transverse Group Actions). Let $G$ be a Lie group which acts projectably and transversally on the bundle $\pi : E \to M$. Suppose that $G$ acts regularly on $M$.
(i) Then $G$ acts regularly on $E$ and $\tilde E = E/G$ is a bundle over $\tilde M$.
(ii) If the orbit manifold $\tilde M$ is Hausdorff, then the orbit manifold $\tilde E$ is also Hausdorff.
(iii) The bundle $E$ can be identified with the pullback of the bundle $\tilde\pi : \tilde E \to \tilde M$ via the quotient map $q_M : M \to \tilde M$.
(iv) Let $\tilde U$ be an open set in $\tilde M$ and let $U = q_M^{-1}(\tilde U)$. There is a one-to-one correspondence between the smooth $G$ invariant sections of $E$ over $U$ and the sections of $\tilde E$ over $\tilde U$.

B. Transversality and the Kinematic Bundle. Lemma 3.2 implies that the action of $G$ on $E$ always restricts to a transverse action on the set $\kappa_G(E)$. In fact, it is not difficult to characterize $\kappa_G(E)$ as the largest subset of $E$ on which $G$ acts transversally or, alternatively, as the smallest set through which all locally defined invariant sections factor. For Lie symmetry reduction without transversality the assumption that $\kappa_G(E)$ is an imbedded subbundle of $E$ now replaces the infinitesimal transversality condition (2.4) as the underlying hypothesis for the action of $G$ on $E$ (together, of course, with the regularity of the action of $G$ on $M$). In particular, the assumption that the dimension of $\kappa_{G,x}(E)$ is constant as $x$ varies over $M$ is clearly a necessary condition if one hopes to parameterize the space of $G$ invariant local sections of $E$ in terms of a fixed number of arbitrary functions. There are a variety of general results which one can apply to check whether $\kappa_G(E)$ is a subbundle of $E$. To begin with, if $x, y \in M$ lie on the same $G$ orbit, that is, if $y = g \cdot x$ for some $g \in G$, then it is not difficult to prove that $\kappa_{G,y}(E) = g \cdot \kappa_{G,x}(E)$. By virtue of this observation it suffices to check that the restrictions of $\kappa_G(E)$ to the cross-sections of the action of $G$ on $M$ are subbundles.
For Lie group actions G which admit slices on M, it is not difficult to establish (see [4]) that the kinematic bundles for the induced actions on tensor bundles of M always exist. For compact groups acting by isometries on hermitian vector bundles the existence of the kinematic bundle is established in [11]. Granted that κG (E) → M is a bundle, Theorem 3.3 now follows from Theorem 7.1. Theorem 7.1 also shows that there is considerable redundancy in the hypothesis of Theorem 3.3. We emphasize that the action of G on E itself need not be regular in order to construct a smooth kinematic reduction diagram. This is well-illustrated by Example 19 in Lawson [26, p. 23]. C. The Bundle of Invariant Jets. The following theorem summarizes the key properties of the bundle Invk (E) → M. Theorem 7.2. Let G be a projectable group action on π : E → M and suppose that E admits a smooth kinematic reduction diagram (3.5). (i) Then Invk (E) is a G invariant embedded submanifold of J k (E). (ii) The action of G on Invk (E) is transverse and regular.

(iii) The quotient manifold $\mathrm{Inv}^k(E)/G$ is diffeomorphic to $J^k(\tilde\kappa_G(E))$ and the diagram
\[
\begin{array}{ccccc}
J^k(\tilde\kappa_G(E)) & \xleftarrow{\ q_{\mathrm{Inv}}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \iota^k\ } & J^k(E)\\
\big\downarrow & & \big\downarrow & & \big\downarrow\\
\tilde M & \xleftarrow{\ q_M\ } & M & \xrightarrow{\ \mathrm{id}\ } & M
\end{array}
\]
commutes.

This theorem implies that the same hypothesis on the action of $G$ on the bundle $\pi : E \to M$ needed to ensure that the kinematic reduction diagram is a diagram of smooth manifolds and maps also ensures that the bottom row of the dynamical reduction diagram (4.2) exists. Therefore to guarantee the smoothness of the entire dynamic reduction diagram one need only assume, in addition, that $D_{\mathrm{Inv}}$ is a subbundle of $D$.

D. The Automorphism Group of the Kinematic Bundle. For computations of the automorphism group of the kinematic bundle it is often advantageous to use the fact that $\tilde G^{*}$ fixes every $G$ invariant section of $E$, that $\tilde G$ preserves the space of $G$ invariant sections and that, conversely, under very mild assumptions, these properties characterize these groups.

Theorem 7.3. Assume that there is a $G$ invariant section through each point of $\kappa_G(E)$. Then the group $\tilde G^{*}$ coincides with the subgroup of $G$ which fixes every invariant section of $E$,
\[
\tilde G^{*} = \{\, a \in G \mid a \cdot s = s \ \text{for all } G \text{ invariant sections } s : M \to E \,\},
\]
and the group $\tilde G$ coincides with the subgroup of $G$ which preserves the set of $G$ invariant sections of $E$,
\[
\tilde G = \{\, a \in G \mid a \cdot s \ \text{is } G \text{ invariant for all } G \text{ invariant sections } s : M \to E \,\}.
\]

References
1. Abraham, R. and Marsden, J.: Foundations of Mechanics. 2nd ed., Reading, MA: Benjamin-Cummings, 1978
2. Anderson, I.M. and Fels, M.E.: Symmetry Reduction of variational bicomplexes and the principle of symmetry criticality. Am. J. Math. 112, 609–670 (1997)
3. Anderson, I.M. and Fels, M.E.: Transverse group actions on bundles. In preparation
4. Anderson, I.M., Fels, M.E., Torre, C.G.: Symmetry Reduction of Differential Equations. In preparation
5. Arms, J.A., Gotay, M.J., Jennings, G.: Geometric and algebraic reduction for singular momentum maps. Adv. in Math. 79, 43–103 (1990)
6. Beckers, J., Harnad, J., Perrod, M. and Winternitz, P.: Tensor fields invariant under subgroups of the conformal group of space-time. J. Math. Phys. 19 (10), 2126–2153 (1978)
7. Beckers, J., Harnad, J. and Jasselette, P.: Spinor fields invariant under space-time transformations. J. Math. Phys. 21 (10), 2491–2499 (1979)
8. Bleecker, D.D.: Critical mappings of Riemannian manifolds. Trans. Am. Math. Soc. 254, 319–338 (1979)
9. Bleecker, D.D.: Critical Riemannian manifolds. J. Diff. Geom. 14, 599–608 (1979)
10. Bluman, G.W. and Kumei, S.: Symmetries and Differential Equations. Applied Mathematical Sciences, 81, New York–Berlin: Springer-Verlag, 1989
11. Brüning, J. and Heintze, E.: Representations of compact Lie groups and elliptic operators. Invent. Math. 50, 169–203 (1979)
12. Bondi, H., Pirani, F., Robinson, I.: Gravitational waves in general relativity III. Exact plane waves. Proc. Roy. Soc. London A 251, 519–533 (1959)
13. Coquereaux, R. and Jadczyk, A.: Riemannian Geometry, Fiber Bundles, Kaluza–Klein Theories and all that. Lecture Notes in Physics 16, Singapore: World Scientific, 1988

14. David, D., Kamran, N., Levi, D. and Winternitz, P.: Symmetry reduction for the Kadomtsev–Petviashvili equation using a loop algebra. J. Math. Phys. 27, 1225–1237 (1986)
15. Eells, J. and Ratto, A.: Harmonic Maps and Minimal Immersions with Symmetries. Annals of Mathematical Studies 130, Princeton: Princeton Univ. Press, 1993
16. Fels, M.E. and Olver, P.J.: On relative invariants. Math. Ann. 308, 609–670 (1997)
17. Fels, M.E.: Symmetry reductions of the Euler equations. In preparation
18. Fushchich, W.I., Shtelen, W.M., Slavutsky, S.L.: Reduction and exact solutions of the Navier–Stokes equations. Topology 15, 165–188 (1976)
19. Gaeta, G. and Morando, P.: Michel theory of symmetry breaking and gauge theories. Ann. Phys. 260, 149–170 (1997)
20. Gotay, M.J. and Bos, L.: Singular angular momentum mappings. J. Diff. Geom. 24, 181–203 (1986)
21. Grundland, A.M., Winternitz, P., Zakrzewski, W.J.: On the solutions of the CP1 model in (2+1) dimensions. J. Math. Phys. 37 (3), 1501–1520 (1996)
22. Harnad, J., Schnider, S. and Vinet, L.: Solution to Yang–Mills equations on M4 under subgroups of O(4, 2). In: Complex Manifold Techniques in the Theoretical Physics. (Proc. Workshop, Lawrence, Kan. 1978). Research Notes in Math. 32, Boston: Pitman, 1979, pp. 219–230
23. Ibragimov, N.H.: CRC Handbook of Lie Group Analysis of Differential Equations. Volume 1, Symmetries, Exact Solutions and Conservation Laws. Boca Raton, FL: CRC Press, 1995
24. Jackiw, R. and Rebbi, C.: Conformal properties of a Yang–Mills pseudoparticle. Phys. Rev. D 14, 517–523 (1976)
25. Kovalyov, M., Légaré, M. and Gagnon, L.: Reductions by isometries of the self-dual Yang–Mills equations in four-dimensional Euclidean space. J. Math. Phys. 34 (7), 3245–3267 (1993)
26. Lawson, H.B.: Lectures on Minimal Submanifolds. Mathematics Lecture Series 9, Berkeley: Publish or Perish, 1980
27. Légaré, M. and Harnad, J.: SO(4) reduction of the Yang–Mills equations for the classical gauge group. J. Math. Phys. 25 (5), 1542–1547 (1984)
28. Légaré, M.: Invariant spinors and reduced Dirac equations under subgroups of the Euclidean group in four-dimensional Euclidean space. J. Math. Phys. 36 (6), 2777–1791 (1995)
29. Olver, P.J.: Applications of Lie Groups to Differential Equations. (Second Ed.) New York: Springer, 1986
30. Ovsiannikov, L.V.: Group Analysis of Differential Equations. New York: Academic Press, 1982
31. Palais, R.S.: A Global Formulation of the Lie theory of Transformation Groups. Memoirs of the Am. Math. Soc. 22, Providence, RI: Am. Math. Soc., 1957
32. Palais, R.S.: The principle of symmetric criticality. Commun. Math. Phys. 69, 19–30 (1979)
33. Palais, R.S.: Applications of the symmetric criticality principle in mathematical physics and differential geometry. In: Proc. U.S.–China Symp. on Differential Geometry and Differential Equations II, 1985
34. Rogers, C. and Shadwick, W.: Nonlinear boundary value problems in science and engineering. Mathematics in Science and Engineering 183, Boston: Academic Press, 1989
35. Smith, R.T.: Harmonic mappings of spheres. Am. J. of Math. 97, 364–385 (1975)
36. Stephani, H.: Differential Equations and their Solutions using Symmetries. Cambridge: Cambridge University Press, 1989
37. Torre, C.G.: Gravitational waves: Just plane symmetry. Preprint gr-qc/9907089
38. Tóth, G.: Harmonic and Minimal Immersions through Representation Theory. Perspectives in Math. Boston: Academic Press, 1990
39. Urakawa, H.: Equivariant harmonic maps between compact Riemannian manifolds of cohomogeneity 1. Michigan Math. J. 40, 27–50 (1993)
40. Vorob’ev, E.M.: Reduction of quotient equations for differential equations with symmetries. Acta Appl. Math. 23, 1999 (1991)
41. Winternitz, P.: Group theory and exact solutions of partially integrable equations. In: Partially Integrable Evolution Equations, R. Conte and N. Boccara, eds. Dordrecht: Kluwer Academic Publishers, 1990, pp. 515–567
42. Winternitz, P., Grundland, A.M., Tuszyński, J.A.: Exact solutions of the multidimensional classical φ^6-field equations obtained by symmetry reduction. J. Math. Phys. 28, 2194–2212 (1987)

Communicated by H. Araki

Commun. Math. Phys. 212, 687 – 701 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

On Baxter’s Q-Operator for the XXX Spin Chain

G. P. Pronko

Institute for High Energy Physics, Protvino, Moscow reg. 142284, Russia and International Solvay Institute, Brussels, Belgium

Received: 5 September 1999 / Accepted: 10 February 2000

Abstract: We discuss the construction of Baxter’s Q-operator. The suggested approach leads to a one-parametric family of Q-operators satisfying wronskian-type relations. We have also found a generalization of the Baxter operators which defines the nondiagonal part of the monodromy.

1. Introduction

Long ago, while considering the XYZ spin chain, Baxter [1] introduced the so-called Q-operator, which permitted him to find the eigenvalues of the transfer matrix $t(x)$ despite the absence of a Bethe Ansatz for this spin chain. This object may be defined by the following operator equation:
\[
t(x)Q(x) = a(x)Q(x + i) + b(x)Q(x - i), \tag{1}
\]
together with the requirements
\[
[t(x), Q(y)] = [Q(x), Q(y)] = 0. \tag{2}
\]
Recently, in a series of papers, Bazhanov, Lukyanov and Zamolodchikov [2] have given an explicit construction of such operators for the case of a certain integrable field model. Moreover, their construction gives a pair of operators $Q_\pm(x)$ satisfying, apart from (1) and (2), also
\[
[Q_+(x), Q_-(y)] = 0 \tag{3}
\]

and a certain “wronskian” relation, which becomes the origin of the various fusion relations. However, the extension of their results for the six-vertex spin chain requires an external magnetic field which cannot be eliminated by the limiting procedure. Therefore in the simplest case of the XXX spin chain we do not know Q± (x) -operators, though from

the point of view of the quantum inverse scattering method (QISM) [3] their construction should be universal for any integrable system. In [4] we investigated Eq. (1) for the eigenvalues of the transfer matrix in the cases of XXX and XXZ and proved that there exists a pair of solutions (we called them $Q(x)$ and $P(x)$) which are polynomials (or trigonometric polynomials for the XXZ case) in the spectral parameter and which also satisfy “wronskian” relations. In the present paper we give the explicit operator construction of the one-parametric solution of Eq. (1) and also the solutions of the generalized equation, where instead of the transfer matrix, the whole monodromy matrix enters. To simplify the discussion we shall consider the case of quantum spin 1/2. The generalizations to arbitrary quantum spin and to the inhomogeneous chain are more or less straightforward.

In the framework of QISM [3], the monodromy matrix $T^l(x)$ is defined as the ordered product of L-operators, acting in the $2 \times (2l + 1)$-dimensional space:
\[
L^l_n(x) = x + 2i\,s^a_n L^a, \tag{4}
\]
where $s^a_n$ are the operators of quantum spin 1/2, while the $L^a$ operators act in the auxiliary $(2l + 1)$-dimensional space. The monodromy matrix is then given by
\[
T^l(x) = \prod_{n=1}^{N} L^l_n(x), \tag{5}
\]
where $N$ is the length of the chain and the transfer matrix $t^l(x)$ is the trace of $T^l(x)$ over the auxiliary space:
\[
t^l(x) = \operatorname{Tr} T^l(x). \tag{6}
\]
Note that for the case of the isotropic XXX spin chain, the monodromy matrix $T^l(x)$ is a scalar with respect to simultaneous rotation in the quantum and auxiliary spaces,
\[
[S + L, T^l(x)] = 0, \tag{7}
\]
where
\[
S = \sum_{n=1}^{N} s_n. \tag{8}
\]
Therefore the transfer matrix $t^l(x)$ is a scalar with respect to quantum spin:
\[
[S, t^l(x)] = 0. \tag{9}
\]
The full set of commutation relations between matrix elements of the monodromy matrix with different spectral parameters is contained in the following equation [3]:
\[
R^{l,l'}(x - y)\,T^l(x)\,T^{l'}(y) = T^{l'}(y)\,T^l(x)\,R^{l,l'}(x - y), \tag{10}
\]
where the monodromies $T^l(x)$ and $T^{l'}(y)$ have a common quantum space and different auxiliary spaces. The R-matrix, which acts in the tensor product of auxiliary spaces with dimension $(2l + 1) \times (2l' + 1)$, is a function of the total auxiliary spin $\mathbf{J} = L + L'$ [3]:
\[
R^{l,l'}(x) = e^{i\pi \mathbb{J}}\,\frac{\Gamma(\mathbb{J} + 1 - ix)}{\Gamma(\mathbb{J} + 1 + ix)}, \tag{11}
\]

where the operator $\mathbb{J}$ is given by:
\[
\mathbb{J} = \bigl(\mathbf{J}^2 + 1/4\bigr)^{1/2} - 1/2. \tag{12}
\]
The same commutation relations as (10) are valid also for L-operators:
\[
R^{l,l'}(x - y)\,L^l(x)\,L^{l'}(y) = L^{l'}(y)\,L^l(x)\,R^{l,l'}(x - y). \tag{13}
\]

Equation (10) has many important corollaries, among which are the so-called fusion relations. The latter play the key role in what follows. One of these fusion relations for the transfer matrix has the following form:
\[
t^{1/2}(x)\,t^{l}\bigl(x + i(l + 1/2)\bigr) = \Bigl(x + \frac{i}{2}\Bigr)^{N} t^{l+1/2}(x + il) + \Bigl(x - \frac{i}{2}\Bigr)^{N} t^{l-1/2}\bigl(x + i(l + 1)\bigr). \tag{14}
\]
If we denote by $A(x, l)$ the transfer matrix with shifted spectral parameter,
\[
A(x, l) = t^{l}\bigl(x + i(l + 1/2)\bigr), \tag{15}
\]
relation (14) takes the form
\[
t^{1/2}(x)\,A(x, l) = \Bigl(x + \frac{i}{2}\Bigr)^{N} A(x - i, l + 1/2) + \Bigl(x - \frac{i}{2}\Bigr)^{N} A(x + i, l - 1/2), \tag{16}
\]
which is very similar to the defining relation for the Baxter Q-operator in the case of the XXX spin chain [4]:
\[
t^{1/2}(x)\,Q(x) = \Bigl(x + \frac{i}{2}\Bigr)^{N} Q(x - i) + \Bigl(x - \frac{i}{2}\Bigr)^{N} Q(x + i). \tag{17}
\]
The difference between (16) and (17) is due to the shift of the second argument of $A(x, l)$ in the r.h.s. of (16). To eliminate this difference we use the following trick. Let us forget for a moment that $l$ denotes the representation of the auxiliary spin and takes only integer or half-integer values, and consider $A(x, l)$ as a function of two complex arguments. Then the new function, which is defined as
\[
Q(x, l) = A(x, l + ix/2), \tag{18}
\]
apparently satisfies the relation
\[
t^{1/2}(x)\,Q(x, l) = \Bigl(x + \frac{i}{2}\Bigr)^{N} Q(x - i, l) + \Bigl(x - \frac{i}{2}\Bigr)^{N} Q(x + i, l). \tag{19}
\]
In this way we obtain a one-parametric family of operators satisfying the Baxter equation (17). Of course, this construction cannot be considered rigorous, because the analytic continuation from a countable set of points into the complex plane is not unique, and we have used this trick only to illustrate the idea. In what follows we shall give a more careful construction of the Q-operator, based on this idea. However, if we impose the condition that after the continuation into the complex $l$-plane the operator $A(x, l)$ remains a polynomial in $l$, then this continuation becomes unique and this trick gives an effective way of calculating the eigenvalues of the Q-operators via the eigenvalues of the transfer matrices $t^l(x)$.
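The substitution behind this trick can be written out explicitly. The short check below is a sketch added for illustration; it only uses the definition $Q(x, l) = A(x, l + ix/2)$ and relation (16) evaluated at the (formally complex) spin $l' = l + ix/2$.

```latex
\begin{align*}
Q(x - i, l) &= A\Bigl(x - i,\; l + \tfrac{i(x - i)}{2}\Bigr)
             = A\Bigl(x - i,\; \bigl(l + \tfrac{ix}{2}\bigr) + \tfrac12\Bigr), \\
Q(x + i, l) &= A\Bigl(x + i,\; l + \tfrac{i(x + i)}{2}\Bigr)
             = A\Bigl(x + i,\; \bigl(l + \tfrac{ix}{2}\bigr) - \tfrac12\Bigr), \\
t^{1/2}(x)\,A(x, l') &= \Bigl(x + \tfrac{i}{2}\Bigr)^{N} A\bigl(x - i, l' + \tfrac12\bigr)
             + \Bigl(x - \tfrac{i}{2}\Bigr)^{N} A\bigl(x + i, l' - \tfrac12\bigr),
             \qquad l' = l + \tfrac{ix}{2}.
\end{align*}
```

Inserting the first two lines into the third reproduces the Baxter-type relation for $Q(x, l)$: the shift of the spectral parameter by $\mp i$ is exactly compensated by the shift of the spin argument by $\pm 1/2$.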

2. The L-Operators

The discussion of the previous section made it clear that for the construction of the Q-operator we need a complex spin in the auxiliary space. Also, we shall look for the Q-operator in the form of the trace of the appropriate monodromy:
\[
Q(x) = \operatorname{Tr} \hat Q(x), \tag{20}
\]
where the operator $\hat Q(x)$ acts in the tensor product of the quantum space of $s = 1/2$ and an infinite dimensional auxiliary space. As we shall see, for our purpose it is sufficient that this space is a representation of the algebra:
\[
[\rho_i, \rho^+_j] = \delta_{ij}, \qquad i, j = 1, 2. \tag{21}
\]
The operator $\hat Q(x)$ will be given by the ordered product:
\[
\hat Q(x) = \prod_{n=1}^{N} L_n(x), \tag{22}
\]

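where $L_n(x)$ are the operators, depending on $\rho$ and $\rho^+$ and acting in the space of the $n$th quantum spin. Further we shall need to consider the product $\bigl(L^{1/2}_n(x)\bigr)_{ij} L_n(x)$, which acts in the product of the two auxiliary spaces (the infinite dimensional one for $L_n(x)$ and the two-dimensional one for $L^{1/2}_n(x)$). In this space it is convenient to consider a pair of projectors $\Pi^{\pm}_{ij}$:
\[
\begin{aligned}
\Pi^{+}_{ij} &= (\rho^+\rho + 1)^{-1}\rho_i\rho^+_j = \rho_i\rho^+_j(\rho^+\rho + 1)^{-1},\\
\Pi^{-}_{ij} &= (\rho^+\rho + 1)^{-1}\epsilon_{il}\rho^+_l\,\epsilon_{jm}\rho_m = \epsilon_{il}\rho^+_l\,\epsilon_{jm}\rho_m(\rho^+\rho + 1)^{-1},
\end{aligned} \tag{23}
\]
where
\[
\rho^+\rho = \rho^+_i\rho_i, \qquad \epsilon_{ij} = -\epsilon_{ji}, \qquad \epsilon_{12} = 1. \tag{24}
\]
These projectors formally satisfy the following relations:
\[
\Pi^{\pm}_{ik}\Pi^{\pm}_{kj} = \Pi^{\pm}_{ij}, \qquad \Pi^{+}_{ik}\Pi^{-}_{kj} = 0, \qquad \Pi^{+}_{ij} + \Pi^{-}_{ij} = \delta_{ij}. \tag{25}
\]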
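The completeness relation among (25) follows in one line from the commutation relations (21). The computation below is a sketch added for illustration; it writes $\Pi^{+}_{ij} = \rho_i\rho^+_j(\rho^+\rho + 1)^{-1}$ and $\Pi^{-}_{ij} = \epsilon_{il}\rho^+_l\epsilon_{jm}\rho_m(\rho^+\rho + 1)^{-1}$ for the two projectors and uses the two-dimensional identity $\epsilon_{il}\epsilon_{jm} = \delta_{ij}\delta_{lm} - \delta_{im}\delta_{lj}$.

```latex
\begin{align*}
\rho_i\rho^+_j + \epsilon_{il}\epsilon_{jm}\,\rho^+_l\rho_m
  &= \bigl(\delta_{ij} + \rho^+_j\rho_i\bigr)
   + \bigl(\delta_{ij}\delta_{lm} - \delta_{im}\delta_{lj}\bigr)\rho^+_l\rho_m \\
  &= \delta_{ij} + \rho^+_j\rho_i + \delta_{ij}\,\rho^+\rho - \rho^+_j\rho_i \\
  &= \delta_{ij}\,(\rho^+\rho + 1).
\end{align*}
```

Both $\rho_i\rho^+_j$ and $\epsilon_{il}\rho^+_l\epsilon_{jm}\rho_m$ preserve the number operator $\rho^+\rho$, so they commute with $(\rho^+\rho + 1)^{-1}$; multiplying the identity above by this factor gives $\Pi^{+}_{ij} + \Pi^{-}_{ij} = \delta_{ij}$. The orthogonality $\Pi^{+}_{ik}\Pi^{-}_{kj} = 0$ follows similarly from $\rho^+_k\epsilon_{kl}\rho^+_l = 0$.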

Rigorously speaking, the r.h.s. of the first equation (25) in the Fock representation has an extra term, proportional to the projector on the vacuum state, but, as we shall see below, this term is irrelevant in the present discussion. In order to define the Q-operator which satisfies the Baxter equation (17) we shall exploit Baxter’s idea [1], which we reformulate as follows: the $L_n(x)$-operator should satisfy the relation:
\[
\Pi^{-}_{ij}\bigl(L^{1/2}_n(x)\bigr)_{jl} L_n(x)\,\Pi^{+}_{lk} = 0. \tag{26}
\]

If this condition is fulfilled, then
\[
\bigl(L^{1/2}_n(x)\bigr)_{ij} L_n(x) = \Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{+}_{lj} + \Pi^{-}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{-}_{lj} + \Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{-}_{lj}. \tag{27}
\]
In other words, the condition (26) guarantees that the r.h.s. of (27) has a triangular form in the sense of the projectors $\Pi^{\pm}$, and this form will be conserved for products over $n$ due to the orthogonality of the projectors. From (26) we obtain
\[
\epsilon_{jm}\rho_m\bigl(x\delta_{jk} + i s^a_n\sigma^a_{jk}\bigr) L_n(x)\rho_k = 0, \tag{28}
\]
or
\[
L_n(x)\rho_k = \bigl(i s^a_n\sigma^a_{kl} - (x - i)\delta_{kl}\bigr)\rho_l A_n(x) \tag{29}
\]
and
\[
\epsilon_{jm}\rho_m\bigl(x\delta_{jk} + i s^a_n\sigma^a_{jk}\bigr) L_n(x) = B_n(x)\,\epsilon_{kl}\rho_l, \tag{30}
\]
where $A_n(x)$ and $B_n(x)$ are some operators which we shall find now. Making use of (29) let us rewrite the first term in the r.h.s. of (27) in the following form:
\[
\Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{+}_{lj} = -(x + i/2)(x - 3i/2)\,\rho_i A_n(x)\rho^+_j(\rho^+\rho + 1)^{-1}. \tag{31}
\]
This equation may also be written as
\[
\Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{+}_{lj} = (x + i/2)\,\rho_i L_n(x - i)\rho^+_j(\rho^+\rho + 1)^{-1}, \tag{32}
\]
provided the operator $A_n(x)$ is given by
\[
A_n(x) = -(x - 3i/2)^{-1} L_n(x - i). \tag{33}
\]
Substituting (33) into (29) and shifting $x \to x + i$, we obtain the desired equation for the L operator:
\[
\bigl(s^a_n\sigma^a_{ij} + ix\delta_{ij}\bigr)\rho_j L_n(x) = (1/2 + ix)\,L_n(x + i)\rho_i. \tag{34}
\]
If this equation is satisfied, we immediately find the operator $B_n(x)$ in (30):
\[
B_n(x) = (x - i/2)\,L_n(x + i). \tag{35}
\]
Having (35) we can also rewrite the second term in the r.h.s. of (27), as we did in Eq. (32), and finally arrive at
\[
\begin{aligned}
\bigl(L^{1/2}_n(x)\bigr)_{ij} L_n(x) ={}& (x + i/2)\,\rho_i L_n(x - i)\rho^+_j(\rho^+\rho + 1)^{-1}\\
&+ (x - i/2)(\rho^+\rho + 1)^{-1}\epsilon_{il}\rho^+_l L_n(x + i)\,\epsilon_{jm}\rho_m + \Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{-}_{lj}.
\end{aligned} \tag{36}
\]
We do not rewrite the last term in the r.h.s. of (36) because this term does not contribute to the final expression for the Q-operator.
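The scalar factor $-(x + i/2)(x - 3i/2)$ in (31) comes from a purely algebraic identity for the spin-1/2 matrices. The following is a minimal numerical sketch, not part of the original derivation: it represents $s^a\sigma^a$ as the $4 \times 4$ matrix $\sum_a (\sigma^a/2) \otimes \sigma^a$ and checks that $(x + i s^a\sigma^a)\,(i s^a\sigma^a - (x - i))$ is proportional to the identity. The second factor, with the shift $x - i$, is the form of the matrix in (29) assumed here so that the product is a scalar; all function names are hypothetical.

```python
def kron(a, b):
    # Kronecker product of two square matrices given as nested lists.
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    # Plain matrix product of nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def s_dot_sigma():
    # sum_a s^a (x) sigma^a with s^a = sigma^a / 2 acting on the quantum spin.
    total = [[0j] * 4 for _ in range(4)]
    for sigma in PAULI:
        spin = [[0.5 * e for e in row] for row in sigma]
        term = kron(spin, sigma)
        total = [[total[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return total

def residual(x):
    # Largest deviation of (x + i s.sigma)(i s.sigma - (x - i)) from the scalar
    # -(x + i/2)(x - 3i/2) times the identity.
    s = s_dot_sigma()
    one = identity(4)
    left = [[x * one[i][j] + 1j * s[i][j] for j in range(4)] for i in range(4)]
    right = [[1j * s[i][j] - (x - 1j) * one[i][j] for j in range(4)] for i in range(4)]
    product = matmul(left, right)
    scalar = -(x + 0.5j) * (x - 1.5j)
    return max(abs(product[i][j] - scalar * one[i][j])
               for i in range(4) for j in range(4))
```

Evaluating `residual` at a few real or complex values of `x` returns numbers at the level of rounding error, which reflects the identity $(s^a\sigma^a)^2 + s^a\sigma^a = 3/4$ for spin 1/2.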

Until now our discussion has been quite formal because we did not specify the representation of the algebra (21). A detailed investigation of Eq. (34) shows that the usual Fock representation for (21) does not fit our purpose, therefore we shall use a less restrictive holomorphic representation. Let the operator $\rho^+_i$ be the operator of multiplication by $\alpha_i$, and the operator $\rho_i$ the operator of differentiation with respect to $\alpha_i$:
\[
\rho^+_i\psi(\alpha) = \alpha_i\psi(\alpha), \qquad \rho_i\psi(\alpha) = \frac{\partial}{\partial\alpha_i}\psi(\alpha). \tag{37}
\]
The operators $\rho^+_i$, $\rho_i$ are canonically conjugated for the scalar product:
\[
(\psi, \phi) = \int \frac{\prod_{i=1,2} d\alpha_i\,d\bar\alpha_i}{(2\pi i)^2}\; e^{-\alpha\bar\alpha}\,\bar\psi(\alpha)\phi(\alpha). \tag{38}
\]
The action of an operator in the holomorphic representation is defined by its kernel:
\[
(K\psi)(\alpha) = \int d\mu(\beta, \bar\beta)\,K(\alpha, \bar\beta)\psi(\beta), \tag{39}
\]
where we have denoted
\[
d\mu(\beta, \bar\beta) = e^{-\beta\bar\beta}\,\frac{\prod_{i=1,2} d\beta_i\,d\bar\beta_i}{(2\pi i)^2}. \tag{40}
\]
In this framework we can formulate the following:

Theorem. The kernel of the operator $L_n(x)$ satisfying Eq. (34) in the holomorphic representation has the following form:
\[
L_n(x, l, \alpha, \bar\beta) = \Bigl[x + \frac{i}{2}(\rho^+\rho + 1) + i s^a_n\,\rho^+\sigma^a\rho\Bigr]\frac{(\alpha\bar\beta)^{2l + ix}}{\Gamma(2l + ix + 1)}, \tag{41}
\]
where $l$ is an arbitrary number.

The proof is by direct substitution of (41) into (34), using Definition (39). In this way, the operator $L_n(x)$ given by (41) solves Eq. (36) for left multiplication by $L^{1/2}(x)$. Changing the order of the multiplication in (36), we can prove that
\[
\begin{aligned}
L_n(x)\bigl(L^{1/2}_n(x)\bigr)_{ij} ={}& (x + i/2)(\rho^+\rho + 1)^{-1}\rho_i L_n(x - i)\rho^+_j\\
&+ (x - i/2)\,\epsilon_{il}\rho^+_l L_n(x + i)\,\epsilon_{jm}\rho_m(\rho^+\rho + 1)^{-1} + \Pi^{-}_{ik} L_n(x)\bigl(L^{1/2}_n(x)\bigr)_{kl}\Pi^{+}_{lj},
\end{aligned} \tag{42}
\]
provided $L_n(x)$ satisfies the equation:
\[
L_n(x)\rho^+_i\bigl(s^a_n\sigma^a_{ij} + ix\delta_{ij}\bigr) = (1/2 + ix)\,\rho^+_j L_n(x + i). \tag{43}
\]
The direct substitution of (41) into (43) shows that (41) is also a solution of this equation. The solution (41) is invariant with respect to simultaneous rotation in the quantum and auxiliary spaces, as is the L-operator of the XXX chain:
\[
\Bigl[s_n + \rho^+\frac{\sigma}{2}\rho,\; L_n(x)\Bigr] = 0. \tag{44}
\]

3. The Q-Operators

To proceed further we need to recall the definition of the trace of an operator in the holomorphic representation. If the operator is given by its kernel $F(\alpha, \bar\beta)$ then (see e.g. [5])
\[
\operatorname{Tr} F = \int d\mu(\alpha, \bar\alpha)\,F(\alpha, \bar\alpha), \tag{45}
\]
where the measure was defined in (40). Let us now consider the ordered product of the $L_n(x)$-operators introduced in the previous section,
\[
\hat Q(x, l, \alpha, \bar\beta) = \int \prod_{i=1}^{N-1} d\mu(\gamma_i, \bar\gamma_i)\, L_N(x, l, \alpha, \bar\gamma_{N-1})\,L_{N-1}(x, l, \gamma_{N-1}, \bar\gamma_{N-2}) \times \cdots \times L_2(x, l, \gamma_2, \bar\gamma_1)\,L_1(x, l, \gamma_1, \bar\beta). \tag{46}
\]
Due to the triangular (in the sense of the projectors $\Pi^{\pm}$) structure of the r.h.s. of (36) we obtain the following rule for multiplication of the operator $\hat Q$ by the monodromy matrix $T^{1/2}(x)$:
\[
\begin{aligned}
\bigl(T^{1/2}(x)\bigr)_{ij}\hat Q(x, l, \alpha, \bar\beta) ={}& \Bigl(x + \frac{i}{2}\Bigr)^{N}\rho_i\,\hat Q(x - i, l, \alpha, \bar\beta)\,\rho^+_j(\rho^+\rho + 1)^{-1}\\
&+ \Bigl(x - \frac{i}{2}\Bigr)^{N}(\rho^+\rho + 1)^{-1}\epsilon_{im}\rho^+_m\,\hat Q(x + i, l, \alpha, \bar\beta)\,\epsilon_{jk}\rho_k + \Pi^{+}_{im}\bigl(\cdots\bigr)_{mk}\Pi^{-}_{kj},
\end{aligned} \tag{47}
\]
where we omitted the explicit expression of the last term for obvious reasons. In the derivation of (47) we have used the remnants of the projectors $\Pi^{\pm}$ which govern the proper multiplication of each term in (36) separately. Now we can perform the trace operation over the holomorphic variables and over the indices $i, j$ corresponding to the auxiliary 2-dimensional space of $T^{1/2}(x)$. The result is the desired Baxter equation:
\[
t^{1/2}(x)\,Q(x, l) = \Bigl(x + \frac{i}{2}\Bigr)^{N} Q(x - i, l) + \Bigl(x - \frac{i}{2}\Bigr)^{N} Q(x + i, l), \tag{48}
\]
where, according to (45) and (46),
\[
Q(x, l) = \int d\mu(\alpha, \bar\alpha)\,\hat Q(x, l, \alpha, \bar\alpha). \tag{49}
\]
Note that the trace of $\hat Q$ exists due to the exponential factor in the holomorphic measure (40) and has the cyclic property, therefore $Q(x, l)$ is invariant under cyclic permutation of the quantum spins. Further, due to property (44) we easily obtain that $Q(x, l)$ is invariant with respect to rotations of the total quantum spin:
\[
[S, Q(x, l)] = 0, \tag{50}
\]
where $S$ is given in (8). Recall that the $L_n(x)$-operators also satisfy relation (42) for right multiplication by $L^{1/2}(x)$, therefore
\[
[t(x), Q(x, l)] = 0. \tag{51}
\]

Expression (46) for the $\hat Q$-operator can be simplified considerably. For that let us rewrite Eq. (41) in the following form:
\[
L_n(x, l, \alpha, \bar\beta) = K_n(x)\,\frac{(\alpha\bar\beta)^{2l + ix}}{\Gamma(2l + ix + 1)}, \tag{52}
\]
where we have denoted by $K_n(x)$ the following operator:
\[
K_n(x) = x + \frac{i}{2}(\rho^+\rho + 1) + i s^a_n\,\rho^+\sigma^a\rho. \tag{53}
\]
Here $\rho$, $\rho^+$ are the operators acting in (52) on the variable $\alpha$. The action of the operator $L_n(x)$ in the form (52) on the function $\psi$, according to (39), has the following form:
\[
(L_n(x)\psi)(\alpha) = \int d\mu(\beta, \bar\beta)\,\Bigl[K_n(x)\,\frac{(\alpha\bar\beta)^{2l + ix}}{\Gamma(2l + ix + 1)}\Bigr]\psi(\beta). \tag{54}
\]
Using a property of the measure $d\mu(\beta, \bar\beta)$ we can transfer the action of the operator $K_n(x)$ from the variable $\alpha_i$ onto the variable $\beta_i$ and rewrite (54) in the form:
\[
(L_n(x)\psi)(\alpha) = \int d\mu(\beta, \bar\beta)\,\frac{(\alpha\bar\beta)^{2l + ix}}{\Gamma(2l + ix + 1)}\,K_n(x)\psi(\beta). \tag{55}
\]
Proceeding this way in the representation (46) we can collect all operators $K_n(x)$ in one place:
\[
\hat Q(x, l, \alpha, \bar\beta) = \int \prod_{i=1}^{N-1} d\mu(\gamma_i, \bar\gamma_i)\,\frac{(\alpha\bar\gamma_{N-1})^{2l + ix}(\gamma_{N-1}\bar\gamma_{N-2})^{2l + ix}\cdots(\gamma_2\bar\gamma_1)^{2l + ix}}{[\Gamma(2l + ix + 1)]^{N-1}} \times \prod_{m=1}^{N} K_m(x)\,\frac{(\gamma_1\bar\beta)^{2l + ix}}{\Gamma(2l + ix + 1)}. \tag{56}
\]
Now all operators $K_m(x)$ act on the variable $\gamma_1$, and we can perform the integration over $\gamma_k, \bar\gamma_k$ with $k = 2, \cdots, N - 1$. This integration can be done with the help of the following formula:
\[
\int d\mu(\gamma, \bar\gamma)\,\frac{(\alpha\bar\gamma)^{2l + ix}(\gamma\bar\beta)^{2l + ix}}{[\Gamma(2l + ix + 1)]^{2}} = \frac{(\alpha\bar\beta)^{2l + ix}}{\Gamma(2l + ix + 1)}. \tag{57}
\]
From (57) it follows that the factor $(\alpha\bar\beta)^{u}/\Gamma(u + 1)$ is actually the kernel of some projector. Using this property, we arrive at the following representation for $\hat Q$:
\[
\hat Q(x, l, \alpha, \bar\beta) = \int d\mu(\gamma, \bar\gamma)\,\frac{(\alpha\bar\gamma)^{2l + ix}}{\Gamma(2l + ix + 1)}\prod_{m=1}^{N} K_m(x)\,\frac{(\gamma\bar\beta)^{2l + ix}}{\Gamma(2l + ix + 1)}. \tag{58}
\]
Due to the fact that in (58) the ordered product of $K_n(x)$ acts only on the variable $\gamma$, we can derive the expression for the trace of $\hat Q(x, l, \alpha, \bar\beta)$ by performing one more integration:
\[
Q(x, l) = \int d\mu(\gamma, \bar\gamma)\prod_{m=1}^{N} K_m(x)\,\frac{(\gamma\bar\gamma)^{2l + ix}}{\Gamma(2l + ix + 1)}. \tag{59}
\]
Needless to say, for an integer or half-integer $l$ and $x = 0$ the expression $(\alpha\bar\beta)^{2l}/\Gamma(2l + 1)$ coincides with the kernel of the projector on the representation $l$ of su(2), so that Eq. (59) is actually the desired prescription for analytic continuation into complex momentum, naively suggested in the Introduction.


4. The Intertwining Relations

In this section we shall derive several intertwining relations for the operators $L_n(x)$ and $L^{1/2}(x)$ which permit us to prove the commutativity of $Q(x, l)$ and some other important corollaries. First of all let us consider the representation (52) for the $L_n(x)$-operator. The formal operator $K_n(x)$ which enters into this representation is nothing else but the usual $L_n(x)$-operator of the XXX spin chain with infinite dimensional auxiliary space and shifted spectral parameter. The shift commutes with $K_n(x)$, so we can prove purely algebraically the R-matrix form of the commutation relation for $K_n(x)$:

$$R^{\rho,\tau}\Big(x - y + \frac{i}{2}(\rho^+\rho - \tau^+\tau)\Big)\,K_n^{\rho}(x)\,K_n^{\tau}(y) = K_n^{\tau}(y)\,K_n^{\rho}(x)\,R^{\rho,\tau}\Big(x - y + \frac{i}{2}(\rho^+\rho - \tau^+\tau)\Big), \qquad (60)$$

where $R^{\rho,\tau}(x)$ is given by Eq. (11) with

$$J = \rho^+\frac{\sigma}{2}\rho + \tau^+\frac{\sigma}{2}\tau. \qquad (61)$$

The indices $\rho$ and $\tau$ on the operators $K_n$ and $R$ indicate different operators acting in their own auxiliary spaces. For the products of L-operators Eq. (60) implies

$$R^{\rho,\tau}\Big(x - y + \frac{i}{2}(\rho^+\rho - \tau^+\tau)\Big)\,K_n^{\rho}(x)\,K_n^{\tau}(y)\,\frac{(\alpha\bar\beta)^{2l+ix}}{\Gamma(2l+ix+1)}\,\frac{(\gamma\bar\delta)^{2m+iy}}{\Gamma(2m+iy+1)}$$
$$= K_n^{\tau}(y)\,K_n^{\rho}(x)\,R^{\rho,\tau}\Big(x - y + \frac{i}{2}(\rho^+\rho - \tau^+\tau)\Big)\,\frac{(\alpha\bar\beta)^{2l+ix}}{\Gamma(2l+ix+1)}\,\frac{(\gamma\bar\delta)^{2m+iy}}{\Gamma(2m+iy+1)}$$
$$= K_n^{\tau}(y)\,K_n^{\rho}(x)\,\frac{(\alpha\bar\beta)^{2l+ix}}{\Gamma(2l+ix+1)}\,\frac{(\gamma\bar\delta)^{2m+iy}}{\Gamma(2m+iy+1)}\,R^{\rho,\tau}\Big(x - y + \frac{i}{2}(\rho^+\rho - \tau^+\tau)\Big). \qquad (62)$$

A few comments need to be made about these equations. The holomorphic variables corresponding to the pair of operators $(\rho^+, \rho)$ are $(\alpha, \bar\beta)$, and those corresponding to the pair $(\tau^+, \tau)$ are $(\gamma, \bar\delta)$. Equations (62) should be understood as the short version of the long story with integrals over the holomorphic variables with functions depending upon $\beta$, $\delta$. The last step of the chain of equations is due to the same property of the measure which permitted the transition from (54) to (55). Coming back to the L-operators we can write the following intertwining relation:

$$R^{\rho,\tau}\Big(x - y + \frac{i}{2}(\rho^+\rho - \tau^+\tau)\Big)\,L_n(x, l, \alpha, \bar\beta)\,L_n(y, m, \gamma, \bar\delta) = L_n(y, m, \gamma, \bar\delta)\,L_n(x, l, \alpha, \bar\beta)\,R^{\rho,\tau}\Big(x - y + \frac{i}{2}(\rho^+\rho - \tau^+\tau)\Big). \qquad (63)$$

Apparently, the same relation holds true also for the $\hat Q$-operators. From that we immediately obtain the commutativity of their traces:

$$[Q(x, l), Q(y, m)] = 0. \qquad (64)$$


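Further, let us consider another commutation relation [6]:

$$\Big(x + \frac{i}{2}\sigma_1\sigma_2\Big)(y + i\sigma_1 M)(y - x + i\sigma_2 M) = (y - x + i\sigma_2 M)(y + i\sigma_1 M)\Big(x + \frac{i}{2}\sigma_1\sigma_2\Big), \qquad (65)$$

where the Pauli matrices $\sigma_1^a$, $\sigma_2^a$ act in their own spaces and $M^a$ are some operators satisfying

$$[M^a, M^b] = i\varepsilon^{abc}M^c. \qquad (66)$$

In particular, we can set

$$M^a = \rho^+\frac{\sigma^a}{2}\rho, \qquad (67)$$

where $\rho^+$, $\rho$ are the Heisenberg variables (21). Now let us shift the spectral parameter $y$ in (65) by $\frac{i}{2}(\rho^+\rho + 1)$ and rewrite (65) in the following form:

$$(x + i\sigma s_n)\Big(y + \frac{i}{2}(\rho^+\rho + 1) + i s_n\,\rho^+\sigma\rho\Big)\Big(y - x + \frac{i}{2}(\rho^+\rho + 1) + i\sigma\,\rho^+\frac{\sigma}{2}\rho\Big)$$
$$= \Big(y - x + \frac{i}{2}(\rho^+\rho + 1) + i\sigma\,\rho^+\frac{\sigma}{2}\rho\Big)\Big(y + \frac{i}{2}(\rho^+\rho + 1) + i s_n\,\rho^+\sigma\rho\Big)(x + i\sigma s_n), \qquad (68)$$

where we interpret $\sigma_1$ as the quantum spin $s_n$, while $\sigma_2$ serves as the auxiliary spin. Equation (68) can also be written as

$$\big(L_n^{1/2}(x)\big)_{ik}\,K_n(y)\,K_{kj}\Big(y - x + \frac{i}{2}\Big) = K_{ik}\Big(y - x + \frac{i}{2}\Big)\,K_n(y)\,\big(L_n^{1/2}(x)\big)_{kj}, \qquad (69)$$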

where we have written explicitly the indices corresponding to the auxiliary space; the index $n$ indicates the corresponding quantum space, and the operator $K(x)$ was introduced in (53). Equation (69) can be used to derive the intertwining relation for the $L^{1/2}$ and $L$ operators. Indeed, let us consider the following product of operators acting on a function in the holomorphic representation:

$$\Big[\big(L_n^{1/2}(x)\big)_{ik}\,L_n(y, l)\,K_{kj}\Big(y - x + \frac{i}{2}\Big)\psi\Big](\alpha) = \big(L_n^{1/2}(x)\big)_{ik}\int d\mu(\beta,\bar\beta)\,K_n(y)\,\frac{(\alpha\bar\beta)^{2l+iy}}{\Gamma(2l+iy+1)}\,K_{kj}\Big(y - x + \frac{i}{2}\Big)\psi(\beta). \qquad (70)$$

Moving the operator $K_{kj}(y - x + i/2)$ to the left through the projector $(\alpha\bar\beta)^{2l+iy}/\Gamma(2l+iy+1)$ and using (69) we obtain the following relation:

$$\big(L_n^{1/2}(x)\big)_{ik}\,L_n(y, l)\,K_{kj}\Big(y - x + \frac{i}{2}\Big) = K_{ik}\Big(y - x + \frac{i}{2}\Big)\,L_n(y, l)\,\big(L_n^{1/2}(x)\big)_{kj}, \qquad (71)$$


where the operator

$$K_{ij}(x) = \Big(x + \frac{i}{2}(\rho^+\rho + 1)\Big)\delta_{ij} + i\sigma_{ij}\,\rho^+\frac{\sigma}{2}\rho \qquad (72)$$

plays the role of the R-matrix. The relation (71) gives rise to the analogous relation for the monodromies:

$$\big(T^{1/2}(x)\big)_{ik}\,\hat Q(y, l, \alpha, \bar\beta)\,K_{kj}\Big(y - x + \frac{i}{2}\Big) = K_{ik}\Big(y - x + \frac{i}{2}\Big)\,\hat Q(y, l, \alpha, \bar\beta)\,\big(T^{1/2}(x)\big)_{kj}, \qquad (73)$$

from which we obtain the commutativity of the transfer matrix and our Q-operator:

$$[t(x), Q(y, l)] = 0. \qquad (74)$$

5. Again Q-Operators

Now we are ready to discuss some important properties of Baxter's Q-operator and its generalizations. Let us start from the Baxter equation (48) for the Q-operator. Due to the mutual commutativity of $Q(x, l)$ with different spectral parameters and second arguments, we easily derive that

$$\Big[\Big(x - \frac{i}{2}\Big)^N Q(x - i, l) - \Big(x + \frac{i}{2}\Big)^N Q(x + i, l)\Big]Q(x, m) = \Big[\Big(x - \frac{i}{2}\Big)^N Q(x - i, m) - \Big(x + \frac{i}{2}\Big)^N Q(x + i, m)\Big]Q(x, l), \qquad (75)$$

or

$$Q\Big(x + \frac{i}{2}, l\Big)Q\Big(x - \frac{i}{2}, m\Big) - Q\Big(x - \frac{i}{2}, l\Big)Q\Big(x + \frac{i}{2}, m\Big) = C(l, m)\,x^N, \qquad (76)$$

where $C(l, m)$ is some unknown operator commuting with $Q$. To find $C(l, m)$ we must calculate the l.h.s. of (76) for some convenient values of the arguments. From (59) it follows that the Q-operator is proportional to the trace of the projector whose kernel in the holomorphic representation is $(\gamma\bar\gamma)^{2l+ix}/\Gamma(2l+ix+1)$. This trace is given by

$$\int d\mu(\gamma,\bar\gamma)\,\frac{(\gamma\bar\gamma)^{2l+ix}}{\Gamma(2l+ix+1)} = 2l + ix + 1. \qquad (77)$$

Note that for $x = 0$ the trace is the dimension of the representation of spin $l$. From (77) we conclude that

$$Q(x, l)\big|_{ix = -(2l+1)} = 0. \qquad (78)$$
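As a sanity check on (77), the following sketch evaluates the trace of the projector kernel for a two-component holomorphic variable. It assumes that the measure (40) is the standard Gaussian $d\mu(\gamma,\bar\gamma) = \pi^{-2}e^{-\gamma\bar\gamma}\,d^2\gamma_1\,d^2\gamma_2$, which is an assumption here because (40) is not reproduced in this part of the text; the function names are illustrative only.

```python
import math

def trace_integer(n):
    """Trace of (gamma gamma-bar)^n / n! for integer n = 2l.

    Expands (|g1|^2 + |g2|^2)^n binomially and uses the Gaussian moment
    of |g|^(2k) over one complex plane, which equals k! (assumed measure).
    """
    total = sum(math.comb(n, k) * math.factorial(k) * math.factorial(n - k)
                for k in range(n + 1))
    return total // math.factorial(n)

def trace_real(u):
    """Same trace for real u > -1.

    With s = |g1|^2 + |g2|^2 the assumed measure reduces to s*exp(-s) ds,
    so the trace is Gamma(u + 2) / Gamma(u + 1).
    """
    return math.gamma(u + 2.0) / math.gamma(u + 1.0)
```

Under this assumption the integer case returns $2l + 1$, the dimension of the spin-$l$ representation, and the real-exponent case reproduces the linear dependence on the exponent used to conclude (78).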

Now let us set $x = i(2l + 1/2)$ in Eq. (76) (for $m$, $l$ integer or half-integer and $m \ge l + 1/2$). Then, due to (78), the first term on the l.h.s. of (76) vanishes and we obtain:

$$-Q(2il, l)\,Q(i(2l+1), m) = C(l, m)\,[i(2l+1)]^N. \qquad (79)$$


Further, from (59) we derive that

$$Q(2il, l) = t^{0}(i(2l+1)) = [i(2l+1)]^N \qquad (80)$$

and

$$Q(i(2l+1), m) = t^{m-l-1/2}(i(l+m)). \qquad (81)$$

Hence, the unknown coefficient $C(l, m)$ is given by

$$C(l, m) = t^{m-l-1/2}(i(l+m)). \qquad (82)$$

For the case $l \ge m + 1/2$ we should put $x = i(2m + 1/2)$, and $l$ and $m$ exchange places in the final answer. So, finally, we obtain the quantum Wronskian in the following form:

$$Q\Big(x + \frac{i}{2}, l\Big)Q\Big(x - \frac{i}{2}, m\Big) - Q\Big(x - \frac{i}{2}, l\Big)Q\Big(x + \frac{i}{2}, m\Big) = -x^N\,t^{m-l-1/2}(i(l+m)). \qquad (83)$$

Note that for $l = m$ the r.h.s. of (83) vanishes, as it should for the Wronskian of linearly dependent solutions. Proceeding in this way we can obtain the general relation involving the transfer matrix with arbitrary spin in the auxiliary space on the r.h.s. of (83) (the factor $x^N$ is just $t^{0}(x)$; see [2, 4]). We postpone the discussion of these relations to a future publication, where we intend to give another derivation.

Further we want to consider the generalization of the Baxter equation which follows from the fundamental relation (47). To obtain these new relations, let us multiply both sides of (47) by the total spin in the auxiliary space:

$$\Big(\frac{1}{2}\sigma^a_{ik} + J^a\delta_{ik}\Big)\big(T^{1/2}(x)\big)_{kj}\,\hat Q(x, l, \alpha, \bar\beta) = \Big(\frac{1}{2}\sigma^a_{ik} + J^a\delta_{ik}\Big)\Big[\Big(x + \frac{i}{2}\Big)^N\rho_k\,\hat Q(x - i, l, \alpha, \bar\beta)\,\rho_j^+(\rho^+\rho + 1)^{-1}$$
$$+ \Big(x - \frac{i}{2}\Big)^N(\rho^+\rho + 1)^{-1}\varepsilon_{km}\rho_m^+\,\hat Q(x + i, l, \alpha, \bar\beta)\,\varepsilon_{jn}\rho_n + \Pi^+_{km}\cdots_{mn}\Pi^-_{nj}\Big], \qquad (84)$$

where

$$J^a = \rho^+\frac{\sigma^a}{2}\rho. \qquad (85)$$

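Due to the obvious relations

$$\frac{1}{2}\sigma^a_{ik}\rho_k + J^a\rho_i = \rho_i J^a,$$
$$\frac{1}{2}\sigma^a_{ik}\varepsilon_{km}\rho_m^+ + J^a\varepsilon_{ik}\rho_k^+ = \varepsilon_{ik}\rho_k^+ J^a,$$
$$\Big(\frac{1}{2}\sigma^a_{ik} + J^a\delta_{ik}\Big)\Pi^{\pm}_{kj} = \Pi^{\pm}_{ik}\Big(\frac{1}{2}\sigma^a_{kj} + J^a\delta_{kj}\Big), \qquad (86)$$

Eq. (84) can be rewritten in the following form:

$$\Big(\frac{1}{2}\sigma^a_{ik} + J^a\delta_{ik}\Big)\big(T^{1/2}(x)\big)_{kj}\,\hat Q(x, l, \alpha, \bar\beta) = \Big(x + \frac{i}{2}\Big)^N\rho_i J^a\,\hat Q(x - i, l, \alpha, \bar\beta)\,\rho_j^+(\rho^+\rho + 1)^{-1}$$
$$+ \Big(x - \frac{i}{2}\Big)^N(\rho^+\rho + 1)^{-1}\varepsilon_{im}\rho_m^+ J^a\,\hat Q(x + i, l, \alpha, \bar\beta)\,\varepsilon_{jk}\rho_k + \Pi^+_{ik}\Big(\frac{1}{2}\sigma^a_{km} + J^a\delta_{km}\Big)\cdots_{ml}\Pi^-_{lj}. \qquad (87)$$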

If we now calculate the trace over the whole auxiliary space, the last term on the r.h.s. of (87) again will not contribute, as in the case of the usual Baxter equation, and we obtain the following relation:

$$\mathbf{t}^{1/2}(x)\,Q(x, l) + t^{1/2}(x)\,\mathbf{Q}(x, l) = \Big(x + \frac{i}{2}\Big)^N\mathbf{Q}(x - i, l) + \Big(x - \frac{i}{2}\Big)^N\mathbf{Q}(x + i, l), \qquad (88)$$

where we have introduced the notations:

$$\mathbf{t}^{1/2}(x) = \mathrm{Tr}\,\frac{\sigma}{2}\,T^{1/2}(x), \qquad \mathbf{Q}(x, l) = \int d\mu(\alpha,\bar\alpha)\,\mathbf{J}\,\hat Q(x, l, \alpha, \bar\alpha). \qquad (89)$$

This equation may be considered as an inhomogeneous Baxter equation, where the first term on the l.h.s. plays the role of the inhomogeneity. Remarkably, the r.h.s. of (88) will not change if we simultaneously change the order of multiplication in both terms on the l.h.s.:

$$Q(x, l)\,\mathbf{t}^{1/2}(x) + \mathbf{Q}(x, l)\,t^{1/2}(x) = \Big(x + \frac{i}{2}\Big)^N\mathbf{Q}(x - i, l) + \Big(x - \frac{i}{2}\Big)^N\mathbf{Q}(x + i, l). \qquad (90)$$

This property can be derived either from Eq. (47), repeating all the steps which lead to (88), or directly from the intertwining relation (73). These new vector Q-operators inherit many properties of the original Baxter operator. In particular, they also satisfy Wronskian-type relations similar to (83). We intend to present a detailed discussion of these operators in a separate publication. It is worth mentioning that we can go further, multiplying Eq. (47) by products of the generators of the total auxiliary spin. The relations (86) and the triangular structure of the r.h.s. of (47) guarantee that the last term will not contribute, and we shall obtain relations similar to (90) for new tensorial generalizations of the Q-operator. Also, multiplying both sides of (47) by the operator

$$U(\mathbf{H}) = \exp\Big[i\mathbf{H}\Big(\frac{\sigma}{2} + \mathbf{J}\Big)\Big], \qquad (91)$$

where $\mathbf{H}$ is a c-number vector, we shall obtain Baxter's operator for the XXX spin chain in an external magnetic field.


6. Concluding Remarks

In our previous publication [4] we considered Baxter's equation for the eigenvalues of the Q-operator and proved that the existence of one operator implies the existence of a second. As Baxter's equation is linear, this apparently means that its solutions are linear combinations of these two basic Q-operators, or in other words that the general solution of this equation forms a one-parameter family. In [4] we denoted these basic operators as $Q(x)$ and $P(x)$. We have shown in particular that the eigenvalues of $Q(x)$ and $P(x)$ are polynomials and that the transfer matrix $t^{l}(x)$ can be expressed in terms of the eigenvalues of $Q(x)$ and $P(x)$ as follows:

$$t^{l}(x) = P(x + i(l + 1/2))\,Q(x - i(l + 1/2)) - P(x - i(l + 1/2))\,Q(x + i(l + 1/2)). \qquad (92)$$

Making an analytic continuation as in (18) (remember that the condition of polynomiality makes this continuation unique) we can now obtain the expression for the eigenvalue of our operator $Q(x, l)$ in terms of the eigenvalues of $Q(x)$ and $P(x)$:

$$Q(x, l) = P(i(2l+1))\,Q(x) - P(x)\,Q(i(2l+1)). \qquad (93)$$

From this equation it is evident that the operators $Q(x, l)$ constructed in the present paper are linear combinations of the two basic operators with operator coefficients which do not depend on the spectral parameter $x$ but depend on the parameter $l$. In the simpler case of, e.g., the Toda chain we can give a separate construction of the basic operators (G. Pronko, “On Baxter Q-operator for Toda Chain”, e-preprint nlin/0003002), but for the case of the XXX spin chain we still do not know these operators separately. The construction of Baxter's Q-operator considered in the present paper for the case of the XXX spin chain seems to be rather universal and could be extended to the case of the anisotropic XXZ spin chain. The key to this generalization is again the “naive” analytic continuation suggested in the Introduction. Indeed, in Baxter's parametrization [1], the fusion relation for the XXZ spin chain has the following form:

$$t^{1/2}(\phi)\,t^{l}(\phi + (2l+1)\eta) = \sin^N(\phi + \eta)\,t^{l+1/2}(\phi + 2l\eta) + \sin^N(\phi - \eta)\,t^{l-1/2}(\phi + 2(l+1)\eta), \qquad (94)$$

where $\eta$ is the crossing parameter. From (94) it is clear that the function $Q(\phi, l)$ defined by

$$Q(\phi, l) = t^{\,l - \phi/4\eta}(\phi/2 + (2l+1)\eta) \qquad (95)$$

satisfies the relation:

$$t^{1/2}(\phi)\,Q(\phi, l) = \sin^N(\phi + \eta)\,Q(\phi - 2\eta, l) + \sin^N(\phi - \eta)\,Q(\phi + 2\eta, l). \qquad (96)$$
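The step from the fusion relation (94) to (96) can be made explicit by a short substitution. The following worked block is a sketch that assumes, as the "naive" analytic continuation in the text does, that (94) may be used with the non-half-integer spin label $l' = l - \phi/4\eta$; the symbol $l'$ is introduced here only for bookkeeping.

```latex
% Set l' = l - \phi/(4\eta) in the fusion relation (94). The three arguments become
\begin{align*}
\phi + (2l'+1)\eta &= \tfrac{\phi}{2} + (2l+1)\eta, \\
\phi + 2l'\eta     &= \tfrac{\phi}{2} + 2l\eta, \\
\phi + 2(l'+1)\eta &= \tfrac{\phi}{2} + 2(l+1)\eta .
\end{align*}
% Comparing with the definition (95), evaluated at \phi and at \phi \mp 2\eta:
\begin{align*}
t^{\,l'}\bigl(\phi + (2l'+1)\eta\bigr)       &= Q(\phi, l), \\
t^{\,l'+1/2}\bigl(\phi + 2l'\eta\bigr)       &= Q(\phi - 2\eta, l), \\
t^{\,l'-1/2}\bigl(\phi + 2(l'+1)\eta\bigr)   &= Q(\phi + 2\eta, l),
\end{align*}
% so that (94) turns term by term into (96).
```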

Again, this trick cannot be considered as a construction of the Q-operator, but it gives strong evidence that the procedure described in the present paper may be extended to the 6-vertex model. A very interesting question is the further generalization of this approach to the case of the 8-vertex spin chain, for which there also exists a Baxter construction [1], and to the case of the field model considered in [2].

Acknowledgements. The author is grateful to V. Bazhanov, L. Faddeev, E. Skyanin, S. Sergeev, Yu. Stroganov, A. Volkov for their interest, discussions, criticism and encouragement. This work was supported in part by ESPIRIT project NTCONGS and RFFI Grant 98-01-0070.


References

1. Baxter, R.J.: Stud. Appl. Math. L 51–69 (1971); Ann. Phys. N.Y. 70, 193–228 (1972); Ann. Phys. N.Y. 76, 1–71 (1973)
2. Bazhanov, V.V., Lukyanov, S.L., Zamolodchikov, A.B.: Commun. Math. Phys. 190, 247–78 (1997); 200, 297–324 (1998)
3. Faddeev, L.D., Takhtajan, L.A.: Zap. Nauch. Semin. LOMI, 109, Leningrad: Nauka, 1981, pp. 134–178
4. Pronko, G.P., Stroganov, Yu.G.: J. Phys. A: Math. Gen. 32, 2333–2340 (1999)
5. Berezin, F.A.: The Method of Second Quantization. Academic Press, New York, 1966
6. Faddeev, L.D.: UMANA 40, 214 (1995) (hep-th/9605187)

Communicated by T. Miwa

Commun. Math. Phys. 212, 703 – 724 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Atoms in Strong Magnetic Fields: The High Field Limit at Fixed Nuclear Charge

Bernhard Baumgartner (1), Jan Philip Solovej (2), Jakob Yngvason (1)

(1) Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, 1090 Vienna, Austria
(2) Department of Mathematics, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen Ø, Denmark

Received: 9 December 1999 / Accepted: 15 February 2000

Abstract: Let $E(B, Z, N)$ denote the ground state energy of an atom with $N$ electrons and nuclear charge $Z$ in a homogeneous magnetic field $B$. We study the asymptotics of $E(B, Z, N)$ as $B \to \infty$ with $N$ and $Z$ fixed but arbitrary. It is shown that the leading term has the form $(\ln B)^2 e(Z, N)$, where $e(Z, N)$ is the ground state energy of a system of $N$ bosons with delta interactions in one dimension. This extends and refines previously known results for $N = 1$ on the one hand, and $N, Z \to \infty$ with $B/Z^3 \to \infty$ on the other hand.

1. Introduction

The effects of extremely strong magnetic fields (of order $10^9$ Gauss and higher) on atoms and molecules are of considerable astrophysical as well as mathematical interest and are far from being completely understood in spite of many theoretical studies since the early seventies. We refer to [LSYa] and [RWHG] for a general discussion of this subject and extensive lists of references. An atom (ion) with $N$ electrons and nuclear charge $Z$ in a homogeneous magnetic field $\mathbf{B} = (0, 0, B)$ is (in appropriate units) usually modeled by the nonrelativistic many-body Hamiltonian

$$H_{B,Z,N} = \sum_{i=1}^{N}\Big(H_A^{(i)} - \frac{Z}{|x_i|}\Big) + \sum_{i<j}\frac{1}{|x_i - x_j|}. \qquad (1)$$

Here $x_i \in \mathbb{R}^3$ are the positions of the electrons, $i = 1, \dots, N$, $A(x) = \frac{1}{2}\mathbf{B}\times x$ is the vector potential, and

$$H_A = [(i\nabla + A(x))\cdot\sigma]^2 \qquad (2)$$

with $\sigma$ the vector of Pauli spin matrices. The Hamiltonian $H_{B,Z,N}$ operates on the Hilbert space $\mathcal{H}_N = \bigwedge^N L^2(\mathbb{R}^3; \mathbb{C}^2)$ appropriate for Fermions of spin 1/2. In this paper we are


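concerned with the ground state energy

$$E(B, Z, N) = \inf \mathrm{spec}\, H_{B,Z,N} = \inf\{\langle\Psi, H_{B,Z,N}\Psi\rangle : \Psi \in C_0^\infty(\mathbb{R}^{3N}; \mathbb{C}^{2^N}) \cap \mathcal{H}_N,\ \|\Psi\|_2 = 1\}, \qquad (3)$$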

more precisely the $B \to \infty$ asymptotics of this quantity. Such an asymptotic study is relevant at the field strengths prevailing on white dwarfs and neutron stars. Previous investigations of the asymptotics of $E(B, Z, N)$ have either dealt with the case $N = 1$, i.e., hydrogen-like atoms [AHS, FWa], or the case when $Z$ and $N$ tend to $\infty$ together with $B$ [LSYa, LSYb, I]. The most complete rigorous treatment of the ground state in the $N = 1$ case so far is [AHS], where the following $B \to \infty$ asymptotics was derived:

$$E(B, Z, 1)/Z^2 = -\tfrac{1}{4}[\ln(B/2)]^2 + [\ln(B/2)\ln\ln(B/2)] - [(C + \ln 2)\ln(B/2)] - [\ln\ln(B/2)]^2 + 2(C - 1 + \ln 2)\ln\ln(B/2) + O(1), \qquad (4)$$

with a constant $C$ (Euler's constant/2). Asymptotics for other eigenvalues and resonances are obtained in [FWa] and [FWb]. The basic results on the $N, Z \to \infty$ case were obtained in [LSYa] and [LSYb]. In particular, in [LSYa] it was shown that if $N, Z \to \infty$ with $\lambda = N/Z$ fixed, and $B/Z^3 \to \infty$, then

$$E(B, Z, N)/(Z^3[\ln(B/Z^3)]^2) \to \begin{cases} -\tfrac{1}{4}\lambda + \tfrac{1}{8}\lambda^2 - \tfrac{1}{48}\lambda^3 & \text{if } \lambda < 2, \\ -\tfrac{1}{6} & \text{if } \lambda \ge 2. \end{cases} \qquad (5)$$

The fact that the right side of (5) decreases with increasing $N/Z$ as long as $N/Z < 2$ shows that in the limit $Z \to \infty$, $B/Z^3 \to \infty$ an atom can bind at least $2Z$ electrons. In [I] some higher order corrections to the leading asymptotics for the energy are discussed. The main result of the present paper is a derivation of the leading term in the $B \to \infty$ asymptotics of $E(B, Z, N)$, where $Z$ and $N$ are fixed but arbitrary. The precise statement is as follows:

Theorem 1.1 (High field limit of the energy). For each fixed $Z$ and $N$

$$\lim_{B\to\infty}\frac{E(B, Z, N)}{(\ln B)^2} = e(Z, N), \qquad (6)$$

where $e(Z, N)$ is the ground state energy of the Hamiltonian

$$h_{Z,N} = \sum_{i=1}^{N}\big(-\partial^2/\partial z_i^2 - Z\delta(z_i)\big) + \sum_{i<j}\delta(z_i - z_j) \qquad (7)$$

of $N$ bosons with $\delta$-interaction in one dimension, defined in the sense of quadratic forms as

$$e(Z, N) = \inf\{\langle\Psi, h_{Z,N}\Psi\rangle : \Psi \in C_0^\infty(\mathbb{R}^N),\ \|\Psi\|_2 = 1\}. \qquad (8)$$

It is trivial to compute $e(Z, 1) = -Z^2/4$. Thus (6) generalizes the first term in the expansion (4) to the case $N > 1$. The relevance of the $\delta$-function model for the ground state of hydrogen in strong magnetic fields was noted already in [S]. We also verify that the mean field limit of $e(Z, N)$ agrees with (5):


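Theorem 1.2 (Mean field limit). If $Z, N \to \infty$ with $\lambda = N/Z$ fixed, then

$$e(Z, N)/Z^3 \to \begin{cases} -\tfrac{1}{4}\lambda + \tfrac{1}{8}\lambda^2 - \tfrac{1}{48}\lambda^3 & \text{if } \lambda < 2, \\ -\tfrac{1}{6} & \text{if } \lambda \ge 2. \end{cases} \qquad (9)$$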
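The mean field formula in (9) is easy to check by exact arithmetic: the two branches match at $\lambda = 2$, and the cubic branch decreases for $\lambda < 2$ because its derivative is a negative perfect square. The following minimal sketch uses illustrative function names that are not part of the paper.

```python
from fractions import Fraction

def mean_field_energy(lam):
    """Right side of (9): the limit of e(Z, N) / Z**3 at fixed lam = N / Z."""
    lam = Fraction(lam)
    if lam < 2:
        return -lam / 4 + lam**2 / 8 - lam**3 / 48
    return Fraction(-1, 6)

def mean_field_slope(lam):
    """Derivative of the cubic branch, written as a perfect square."""
    lam = Fraction(lam)
    return -(lam - 2) ** 2 / 16
```

The slope vanishes only at $\lambda = 2$, which is where the energy stops decreasing; this is the statement that, in the mean field limit, electrons are bound up to $N = 2Z$.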

Taken together, Theorems 1.1 and 1.2 lead to the same high $B$, high $Z$ limit as Theorem 1.4 in [LSYa], where $Z \to \infty$ and $B/Z^3 \to \infty$ simultaneously (the “hyperstrong” limit).

We now describe briefly the strategy for the proof of these results and introduce some notation that will be used throughout. The first step in the proof of Theorem 1.1 is a reduction to the subspace $\mathcal{H}_N^0 \subset \mathcal{H}_N$ generated by wave functions in the lowest Landau band. Let $\Pi_N^0$ denote the projector on $\mathcal{H}_N^0$. (Its integral kernel is given by Eqs. (52)–(53) in Sect. 5. Note that $\Pi_N^0$ and $\mathcal{H}_N^0$ depend on $B$.) Let $E^{\mathrm{conf}}(B, Z, N)$ denote the ground state energy of $\Pi_N^0 H_{B,Z,N}\Pi_N^0$. It is clear that

$$E(B, Z, N) \le E^{\mathrm{conf}}(B, Z, N), \qquad (10)$$

and by Theorem 1.2 in [LSYa],

$$E^{\mathrm{conf}}(B, Z, N) \le E(B, Z, N)(1 - \delta(B, Z, N)), \qquad (11)$$

where $\delta(B, Z, N) \to 0$ for $B \to \infty$ with $Z$, $N$ fixed. Hence it suffices to prove (6) with $E(B, Z, N)$ replaced by $E^{\mathrm{conf}}(B, Z, N)$. We note in passing that (11) also holds for bosons. In fact, it will become evident in the sequel that Theorem 1.1 is independent of the statistics of the particles.

To study $E^{\mathrm{conf}}(B, Z, N)$ the next step is to introduce a Hamiltonian for the motion parallel to the magnetic field with the coordinates perpendicular to the magnetic field as parameters. We write the variables $x_i \in \mathbb{R}^3$ as $x_i = (x_i^\perp, z_i)$, where $x_i^\perp \in \mathbb{R}^2$ and $z_i \in \mathbb{R}$ are respectively the components perpendicular and parallel to the field. Moreover, we write $(x_1, \dots, x_N) = (x^\perp, z)$ with $x^\perp = (x_1^\perp, \dots, x_N^\perp) \in \mathbb{R}^{2N}$ and $z = (z_1, \dots, z_N) \in \mathbb{R}^N$. In the lowest Landau band the part of (2) associated with the motion perpendicular to the field is exactly canceled by the spin contribution and only the part corresponding to the motion along the field remains. Hence

$$\Pi_N^0 H_{B,Z,N}\Pi_N^0 = \Pi_N^0 H_{Z,N}\Pi_N^0 \qquad (12)$$

with

$$H_{Z,N} = \sum_{i=1}^{N}\Big(-\partial^2/\partial z_i^2 - \frac{Z}{|x_i|}\Big) + \sum_{i<j}\frac{1}{|x_i - x_j|}. \qquad (13)$$

The operator (13) contains no derivatives perpendicular to the field and hence the variables $x^\perp$ can be regarded as parameters for a differential operator in the variables parallel to the field. For each $x^\perp$ such that $x_1^\perp, \dots, x_N^\perp$ are all different from zero, we consider the one-dimensional Hamiltonian

$$H_{Z,N}(x^\perp) = \sum_{i=1}^{N}\Big(-\partial_{z_i}^2 - \frac{Z}{\sqrt{z_i^2 + (x_i^\perp)^2}}\Big) + \sum_{i<j}\frac{1}{\sqrt{(z_i - z_j)^2 + (x_i^\perp - x_j^\perp)^2}} \qquad (14)$$

acting on $\bigotimes^N L^2(\mathbb{R}) = L^2(\mathbb{R}^N)$. The expectation values of $H_{Z,N}$ can be written as

$$\langle\Psi, H_{Z,N}\Psi\rangle = \int\langle\Psi(x^\perp, \cdot), H_{Z,N}(x^\perp)\Psi(x^\perp, \cdot)\rangle_{L^2(\mathbb{R}^N)}\,dx^\perp. \qquad (15)$$

The next step is a scaling of the variables. In the lowest Landau level the characteristic length in the directions perpendicular to the field is $B^{-1/2}$. One can therefore expect that for the computation of $E^{\mathrm{conf}}(B, Z, N)$, i.e., the infimum of (15) over (normalized) $\Psi \in \mathcal{H}_N^0$, the properties of $H_{Z,N}(x^\perp)$ for $|x_i^\perp| \sim B^{-1/2}$ are decisive. Anticipating this, it is natural to make a transformation of variables, $(x^\perp, z) \to (B^{1/2}x^\perp, L(B)z)$, where the scale factor $L(B)$ in the direction of the field has still to be specified. The corresponding unitary operator on $L^2(\mathbb{R}^N)$ is

$$(U\Psi)(z) = L(B)^{N/2}\,\Psi(L(B)z), \qquad (16)$$

and the Hamiltonian transforms in the following way:

$$U^{-1}H_{Z,N}(x^\perp)U = L(B)^2\,h^B_{Z,N}(B^{1/2}x^\perp), \qquad (17)$$

where

$$h^B_{Z,N}(y^\perp) = \sum_{i=1}^{N}\big(-\partial_{z_i}^2 - Z\,V_{B,|y_i^\perp|}(z_i)\big) + \sum_{i<j}V_{B,|y_i^\perp - y_j^\perp|}(z_i - z_j) \qquad (18)$$

and the potential $V_{B,r}(z)$ is (for $r > 0$) defined as

$$V_{B,r}(z) = L(B)^{-1}\big(B^{-1}L(B)^2 r^2 + z^2\big)^{-1/2}. \qquad (19)$$
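The form of the scaled potential can be checked by carrying out the unitary transformation on one attractive term. The following worked block is a sketch for a single particle ($N = 1$), writing $L$ for $L(B)$ and $y^\perp = B^{1/2}x^\perp$; it only unpacks the definitions of $U$ and $V_{B,r}$ given in the text.

```latex
% With (U\Psi)(z) = L^{1/2}\Psi(Lz), a multiplication operator W(z) and the
% second derivative transform as
\begin{align*}
\bigl(U^{-1} W U \Psi\bigr)(z) &= W(z/L)\,\Psi(z), &
U^{-1}\bigl(-\partial_z^2\bigr)U &= L^2\bigl(-\partial_z^2\bigr).
\end{align*}
% Applied to the Coulomb term of H_{Z,N}(x^\perp):
\begin{align*}
\frac{Z}{\sqrt{(z/L)^2 + |x^\perp|^2}}
  &= \frac{Z\,L}{\sqrt{z^2 + L^2 |x^\perp|^2}}
   = L^2 \cdot Z\, L^{-1}\bigl(B^{-1}L^2 |y^\perp|^2 + z^2\bigr)^{-1/2}
   = L^2\, Z\, V_{B,|y^\perp|}(z).
\end{align*}
% Both the kinetic and the potential term thus carry the common factor L^2.
```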

Let $E_{Z,N}(x^\perp)$ and $e^B_{Z,N}(y^\perp)$ denote the ground state energies of $H_{Z,N}(x^\perp)$ and $h^B_{Z,N}(y^\perp)$ respectively. In order to avoid discussions about the domains of the Hamiltonians, which in fact depend on whether some of the parameters $x_i^\perp$ (resp. $y_i^\perp$) coincide, we define the ground state energies in terms of quadratic forms in the same way as (8):

$$E_{Z,N}(x^\perp) = \inf\{\langle\Psi, H_{Z,N}(x^\perp)\Psi\rangle : \Psi \in C_0^\infty(\mathbb{R}^N),\ \|\Psi\|_2 = 1\}, \qquad (20)$$
$$e^B_{Z,N}(y^\perp) = \inf\{\langle\Psi, h^B_{Z,N}(y^\perp)\Psi\rangle : \Psi \in C_0^\infty(\mathbb{R}^N),\ \|\Psi\|_2 = 1\}. \qquad (21)$$

These energies are connected by the scaling relation

$$E_{Z,N}(B^{-1/2}y^\perp)/L(B)^2 = e^B_{Z,N}(y^\perp). \qquad (22)$$

In the next section we show that with the choice $L(B) \sim \ln B$ the potential $V_{B,r}(z)$ converges for each $r > 0$ in the sense of distributions to the delta function as $B \to \infty$. This is the heuristic basis of Theorem 1.1. Since the convergence is not uniform in $r$, however, more is needed for a rigorous proof. In particular, one needs estimates on the $r$-dependence of the convergence $V_{B,r}(z) \to \delta(z)$. These estimates, stated in Lemmas 2.1 and 2.2 in the next section, can be regarded as variants of Propositions 3.3 and 3.4 in [LSYa] and the Appendix in [JY], adapted to the problem at hand. They are included here for completeness.

The upper bound on the energy, given in Sect. 3, is a straightforward variational calculation. The lower bound is more subtle. An important ingredient needed is the superharmonicity of the energy $E_{Z,N}(x^\perp)$ in the variables $x_i^\perp$. This result, established in Theorem 4.3, generalizes a corresponding result (Proposition 2.3) in [LSYa]. Superharmonicity implies that the lowest value of $E_{Z,N}(B^{-1/2}y^\perp)$ for $|y_i^\perp| \ge \varepsilon$ with $\varepsilon > 0$ is obtained at the boundary of the variable range, i.e., when either $|y_i^\perp| = \varepsilon$ or $|y_i^\perp| \to \infty$. Variables tending to infinity can be ignored, since $V_{B,r}(z) \to 0$ for $r \to \infty$, so by this result one may in (15) restrict the attention to wave functions localized where $|x_i^\perp| \le (\mathrm{const.})B^{-1/2}$. On the other hand, the requirement that only wave functions in the lowest Landau band are taken into account in (15) plays the role of a “hard core condition” that prevents collapse, since such wave functions cannot be concentrated on shorter scales than $O(B^{-1/2})$. This statement is made precise in Lemma 5.3. The lower bound is obtained in Sect. 5 by combining Theorem 4.3, Lemma 5.3 and the convergence of the potentials $V_{B,r}$ discussed in Sect. 2. It is noteworthy that this lower bound holds also for bosonic statistics while the upper bound holds for fermionic statistics, so that altogether the convergence of $E(B, Z, N)/(\ln B)^2$ to $e(Z, N)$ is independent of the statistics.

In Sect. 6 we discuss the delta-function model (7) and in particular prove Theorem 1.2. In the course of the proof we compare (7) with another model, whose ground state energy can be explicitly calculated. This model provides an upper bound for the ground state energy of (7) and has the same mean field limit. The Hamiltonian for this model is

$$h_{Z,N} = \sum_{i=1}^{N}\big(p_i^2 - \delta(z_i)\big) + \frac{1}{2Z}\sum_{i<j}\delta(|z_i| - |z_j|). \qquad (23)$$

An interesting feature of this model is the fact that the maximal number $N_c$ of electrons that a nucleus of charge $Z$ can bind is exactly the largest integer satisfying

$$N_c < 2Z + 1. \qquad (24)$$

(This fact is unrelated to Lieb's upper bound [L] for the maximal negative ionization of atoms, which does not apply to the Pauli Hamiltonian with a homogeneous magnetic field.) A corresponding statement for the Hamiltonian (7) is not known, except in the mean field limit, cf. Theorem 1.2. In this connection it should be mentioned that an estimate of the form $N_c < 2Z + 1 + (\mathrm{const.})\,B^{1/2}$ has been derived in [BR] for a Hamiltonian of a similar type as (18).

2. The High B Limit of the Coulomb Interaction

We define the scaling factor $L(B)$ in the potential (19) as the solution of the equation

$$B^{1/2} = L(B)\sinh[L(B)/2]. \qquad (25)$$

Since $\int_0^1 (a^2 + z^2)^{-1/2}\,dz = \sinh^{-1}(1/a)$, we have with this choice

$$\int_{|z| \le r} V_{B,r}(z)\,dz = 1 \qquad (26)$$

for all $B$. Note also that

$$L(B) = \ln B + O(\ln\ln B) \qquad (27)$$

as $B \to \infty$. Let $\psi \in H^1(\mathbb{R}) = \{\psi : \int(|\psi|^2 + |d\psi/dz|^2) < \infty\}$. Every such $\psi$ is a continuous function on $\mathbb{R}$.
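The defining equation (25) for $L(B)$ has no closed-form solution, but it is easy to solve numerically. The following minimal sketch (function names are illustrative, and bisection is just one convenient choice) solves (25) and evaluates the integral in (26) through the closed form $2L^{-1}\sinh^{-1}(B^{1/2}/L)$, which follows from the elementary integral quoted above.

```python
import math

def scaling_length(B, tol=1e-12):
    """Solve B**0.5 = L * sinh(L / 2) for L > 0 by bisection (Eq. (25))."""
    target = math.sqrt(B)

    def residual(L):
        return L * math.sinh(L / 2.0) - target

    lo, hi = 0.0, 1.0
    while residual(hi) < 0.0:      # residual is increasing, so doubling brackets the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mass_inside(B, L):
    """Integral of V_{B,r} over |z| <= r; the result does not depend on r."""
    return 2.0 / L * math.asinh(math.sqrt(B) / L)
```

Comparing `scaling_length(B)` with `math.log(B)` for increasing `B` illustrates how slowly the ratio approaches 1, in line with the $O(\ln\ln B)$ correction in (27).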


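Lemma 2.1 (Delta approximation, part 1).

$$\Big|\,|\psi(0)|^2 - \int V_{B,r}(z)|\psi(z)|^2\,dz\,\Big| \le L(B)^{-1}\big[\lambda r^{-1} + 8\lambda^{1/4}T^{3/4}r^{1/2}\big] \qquad (28)$$

with $\lambda = \int|\psi|^2$, $T = \int|d\psi/dz|^2$.

Proof. It suffices to take $r = 1$, for the general case follows by scaling $z \to rz$. Write the difference on the left side of (28) as $A_1 + A_2$ with

$$A_1 = -\int_{|z| \ge 1} V_{B,1}(z)|\psi(z)|^2\,dz, \qquad (29)$$
$$A_2 = \int_{|z| \le 1} V_{B,1}(z)\big(|\psi(0)|^2 - |\psi(z)|^2\big)\,dz. \qquad (30)$$

The missing term

$$A_3 = \Big(1 - \int_{|z| \le 1} V_{B,1}(z)\,dz\Big)|\psi(0)|^2 \qquad (31)$$

is zero because of (26). Since $|V_{B,1}(z)| \le L(B)^{-1}$ for $|z| \ge 1$, we have

$$|A_1| \le \lambda L(B)^{-1}. \qquad (32)$$

For $|z| \le 1$ we have in any case

$$|V_{B,1}(z)| \le L(B)^{-1}|z|^{-1}. \qquad (33)$$

Moreover,

$$\big|\,|\psi(z)|^2 - |\psi(0)|^2\,\big| \le |\psi(z) - \psi(0)|\,\big[|\psi(z)| + |\psi(0)|\big] \le \int_0^z\Big|\frac{d\psi}{dz'}\Big|\,dz' \cdot 2\Big(\int_{-\infty}^{\infty}\Big|\frac{d|\psi(z')|^2}{dz'}\Big|\,dz'\Big)^{1/2} \le |z|^{1/2}T^{1/2}\,2\lambda^{1/4}T^{1/4} = 2\lambda^{1/4}T^{3/4}|z|^{1/2}. \qquad (34)$$

Hence

$$|A_2| \le 2L(B)^{-1}\int_{|z| \le 1}|z|^{-1/2}\,dz\;\lambda^{1/4}T^{3/4} = 8L(B)^{-1}\lambda^{1/4}T^{3/4}. \qquad (35)$$

Combining the estimates for $A_1$ and $A_2$ gives (28).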
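The estimate (28) can be illustrated numerically for one concrete test function. The following sketch assumes the Gaussian $\psi(z) = \pi^{-1/4}e^{-z^2/2}$, for which $\lambda = 1$ and $T = 1/2$; this choice, the quadrature parameters and all function names are illustrative and not taken from the paper.

```python
import math

def scaling_length(B):
    """Bisection solution of B**0.5 = L * sinh(L / 2), Eq. (25)."""
    target, lo, hi = math.sqrt(B), 0.0, 1.0
    while hi * math.sinh(hi / 2.0) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.sinh(mid / 2.0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_defect(B, r=1.0, half_width=40.0, steps=8000):
    """|psi(0)|^2 - int V_{B,r} |psi|^2 dz for psi(z) = pi^(-1/4) exp(-z^2/2).

    The substitution z = a*r*sinh(u), a = L / sqrt(B), turns V_{B,r}(z) dz
    into du / L, so a midpoint rule in u resolves the narrow peak at z = 0.
    """
    L = scaling_length(B)
    scale = L * r / math.sqrt(B)
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        u = -half_width + (k + 0.5) * h
        z = scale * math.sinh(u)
        total += math.exp(-z * z)
    smeared = total * h / (L * math.sqrt(math.pi))
    return 1.0 / math.sqrt(math.pi) - smeared

def lemma_bound(B, r=1.0):
    """Right side of (28) for the Gaussian above: lambda = 1, T = 1/2."""
    return (1.0 / r + 8.0 * 0.5 ** 0.75 * math.sqrt(r)) / scaling_length(B)
```

For this test function the defect stays within the bound of (28) and shrinks as $B$ grows, which is the qualitative content of the delta approximation; the decay is only logarithmic in $B$ because the error scale is $L(B)^{-1}$.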

Lemma 2.2 (Delta approximation, part 2). Let $\Psi \in H^1(\mathbb{R}^2)$ and put

$$\lambda = \int|\Psi(z, z')|^2\,dz\,dz', \qquad T = \int|\partial_z\Psi(z, z')|^2\,dz\,dz'. \qquad (36)$$

Then

$$\Big|\int|\Psi(z, z)|^2\,dz - \int V_{B,r}(z - z')|\Psi(z, z')|^2\,dz\,dz'\Big| \le L(B)^{-1}\big[\lambda r^{-1} + 8\lambda^{1/4}T^{3/4}r^{1/2}\big]. \qquad (37)$$

Proof. Put $\lambda(z') = \int|\Psi(z, z')|^2\,dz$, $T(z') = \int|\partial_z\Psi(z, z')|^2\,dz$. By (28) we have

$$\Big|\,|\Psi(z', z')|^2 - \int V_{B,r}(z - z')|\Psi(z, z')|^2\,dz\,\Big| \le L(B)^{-1}\big[\lambda(z')r^{-1} + 8\lambda(z')^{1/4}T(z')^{3/4}r^{1/2}\big]. \qquad (38)$$

Integration over $z'$, using the Hölder inequality to estimate $\int\lambda(z')^{1/4}T(z')^{3/4}\,dz'$, gives (37).

3. Upper Bound

Let $\psi \in \mathcal{S}(\mathbb{R}^{2N})$ be a smooth and rapidly decreasing wave function in the lowest Landau level at field strength 1, and let $\phi \in C_0^\infty(\mathbb{R}^N)$. If $\psi$ and $\phi$ are normalized, i.e., $\int_{\mathbb{R}^{2N}}|\psi|^2 = \int_{\mathbb{R}^N}|\phi|^2 = 1$, then

$$\Psi_B(x^\perp, z) = (BL(B))^{N/2}\,\psi(B^{1/2}x^\perp)\,\phi(L(B)z) \qquad (39)$$

is a normalized wave function in the lowest Landau band at field strength $B$. Moreover, using (15) and (17) we have

$$E(B, Z, N) \le \langle\Psi_B, H_{Z,N}\Psi_B\rangle = L(B)^2\int|\psi(y^\perp)|^2\,\langle\phi, h^B_{Z,N}(y^\perp)\phi\rangle\,d^{2N}y^\perp,$$

where $h^B_{Z,N}(y^\perp)$ is given by (18). Since $L(B)^2/(\ln B)^2 \to 1$ as $B \to \infty$ and $\psi$ is normalized, one has for the upper bound in Theorem 1.1 only to check that

$$\int|\psi(y^\perp)|^2\,V_{B,|y_i^\perp|}(z_i)|\phi(z)|^2\,d^{2N}y^\perp\,d^Nz \to \int\delta(z_i)|\phi(z)|^2\,d^Nz$$

and

$$\int|\psi(y^\perp)|^2\,V_{B,|y_i^\perp - y_j^\perp|}(z_i - z_j)|\phi(z)|^2\,d^{2N}y^\perp\,d^Nz \to \int\delta(z_i - z_j)|\phi(z)|^2\,d^Nz$$

as $B \to \infty$. But this is taken care of by Lemmas 2.1 and 2.2. (That $V_{B,r}(z)$ is not defined for $r = 0$ is of no consequence here, because the error terms in (28) and (37) are integrable all the way to $r = 0$.) We therefore have

Proposition 3.1 (Upper bound).

$$\limsup_{B\to\infty}\frac{E(B, Z, N)}{(\ln B)^2} \le e(Z, N). \qquad (40)$$

Remark. It is clear that our upper bound holds for fermions, although $e(Z, N)$ is the bosonic ground state energy of (7). In fact, in the ansatz (39) above we may choose $\psi$ to be antisymmetric and $\phi$ to be symmetric; then $\Psi_B$ is antisymmetric. Note also that for the Hamiltonian (7) the bosonic ground state energy is the same as its ground state energy without symmetry restriction.


4. Superharmonicity

In this section we take a closer look at the dependence of the ground state energy $E_{Z,N}(x^\perp)$ of the Hamiltonian (14) on the parameter $x^\perp$. We start with a simple estimate:

Lemma 4.1 (Simple bounds). The function $x^\perp \to E_{Z,N}(x^\perp)$ satisfies the bounds

$$-\sum_{i=1}^{N} Z^2\Big\{1 + \big[\sinh^{-1}\big((Z|x_i^\perp|)^{-1}\big)\big]^2\Big\} \le E_{Z,N}(x^\perp) \le 0 \qquad (41)$$

on the set

$$A = \{x^\perp \in \mathbb{R}^{2N} : x_i^\perp \ne 0 \text{ for all } i = 1, \dots, N\}. \qquad (42)$$

Proof. The non-positivity of $E$ is straightforward from the definition by an appropriate choice of $\Psi$. Note that this also holds in the case where some of the $x_i^\perp$ variables coincide. The lower bound on $E_{Z,N}(x^\perp)$ follows from Lemma 2.1 in [LSYa] together with the operator inequality

$$H_{Z,N}(x^\perp) \ge \sum_{i=1}^{N}\Big(-\partial_{z_i}^2 - \frac{Z}{\sqrt{z_i^2 + (x_i^\perp)^2}}\Big),$$

which is obtained by ignoring the positive two-body interactions.

Next we turn to the superharmonicity properties of $E_{Z,N}(x^\perp)$. We shall need the following general result.

Lemma 4.2 (Inherited superharmonicity). Let $U$ be an open set in $\mathbb{R}^d$ and assume that $f : U \times \mathbb{R} \to (-\infty, \infty]$ is a superharmonic function with the property that

$$b = \min\{\liminf_{t\to\infty} f(x, t),\ \liminf_{t\to-\infty} f(x, t)\}$$

is independent of $x$ for all $x \in U$. Then $g(x) = \inf_t f(x, t)$ is a superharmonic function on $U$.

Proof. We shall prove this by showing that $\Delta g \le 0$ as a distribution. We shall use that $f$ is a lower semicontinuous function satisfying the mean value inequality

$$\int_{|(x,t)-(y,s)| \le r} f(y, s)\,dy\,ds \le f(x, t)\,c_{d+1}r^{d+1}$$

for all $(x, t) \in U \times \mathbb{R}$ if $r > 0$ is small enough, where $c_{d+1}$ is the volume of the unit ball in $\mathbb{R}^{d+1}$. For $x \in U$ it follows from the lower semicontinuity of $f$ that we have either $g(x) = b$ or there exists $t \in \mathbb{R}$ such that $g(x) = f(x, t)$. In the first case we obviously have

$$c_{d+1}r^{d+1}g(x) \ge 2\int_{|x-y| \le r} g(y)\sqrt{r^2 - (x - y)^2}\,dy \qquad (43)$$

since $g(y) \le b$ for all $y$. If $g(x) < b$ we also conclude the above inequality since

$$g(x)\,c_{d+1}r^{d+1} = f(x, t)\,c_{d+1}r^{d+1} \ge \int_{|(x,t)-(y,s)| \le r} f(y, s)\,dy\,ds \ge \int_{|(x,t)-(y,s)| \le r} g(y)\,dy\,ds = 2\int_{|x-y| \le r} g(y)\sqrt{r^2 - (x - y)^2}\,dy.$$

Note now that for any $\phi \in C_0^\infty(U)$ we have for any $x \in U$ that

$$\lim_{r\to 0} r^{-(d+3)}\int_{|x-y| \le r}[\phi(y) - \phi(x)]\sqrt{r^2 - (x - y)^2}\,dy = C\,\Delta\phi(x)$$

for some constant $C > 0$, and in fact this limit holds in the topology of $C_0^\infty(U)$. Thus if $\phi \ge 0$ we have

$$\int_U g(x)\,\Delta\phi(x)\,dx = C^{-1}\lim_{r\to 0} r^{-(d+3)}\int\!\!\int_{|x-y| \le r} g(x)\,(\phi(y) - \phi(x))\sqrt{r^2 - (x - y)^2}\,dy\,dx \le 0$$

by the inequality (43). Hence $\Delta g \le 0$.

Theorem 4.3 (Superharmonicity of the energy). On the set A defined in (42) the function $x^\perp \mapsto E_{Z,N}(x^\perp)$ is superharmonic in each of the variables $x_i^\perp$, i = 1, ..., N independently.

Proof. We follow closely the proof of Prop. 2.3 in [LSYa], which stated the superharmonicity of the ground state energy of a one-body operator which can be considered as a mean field approximation of $H_{Z,N}(x^\perp)$. It is clearly enough to prove that $E_{Z,N}(x^\perp)$ is superharmonic in $x_1^\perp$ (on the region $x_1^\perp \ne 0$) for $x_2^\perp, \dots, x_N^\perp$ fixed. We shall prove this by showing that $x_1^\perp \mapsto E_{Z,N}(x^\perp)$ satisfies the mean value inequality around any given point $x_{1,0}^\perp$. Let $x_0^\perp = (x_{1,0}^\perp, x_2^\perp, \dots, x_N^\perp)$. Choose a sequence of $L^2$ normalized functions $\Psi_n \in C_0^\infty(\mathbb{R}^N)$ such that $\langle \Psi_n, H_{Z,N}(x_0^\perp)\Psi_n\rangle \to E_{Z,N}(x_0^\perp)$ as n → ∞. For $w \in \mathbb{R}$ denote by $\Psi_n^{(w)}$ the function $\Psi_n^{(w)}(z_1,\dots,z_N) = \Psi_n(z_1 - w, z_2, \dots, z_N)$. We clearly have
\[ \inf_{w\in\mathbb{R}} \langle \Psi_n^{(w)}, H_{Z,N}(x_0^\perp)\Psi_n^{(w)}\rangle \to E_{Z,N}(x_0^\perp) \quad\text{as } n\to\infty. \]
If $x_1^\perp$ is close to $x_{1,0}^\perp$ we shall use $\Psi_n^{(w)}$ as a trial function for $H_{Z,N}(x^\perp)$. We then obtain
\[ E_{Z,N}(x^\perp) \le \liminf_{n} \inf_{w\in\mathbb{R}} \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp)\Psi_n^{(w)}\rangle. \]
Hence
\[ E_{Z,N}(x^\perp) - E_{Z,N}(x_0^\perp) \le \liminf_{n} \Big[ \inf_{w\in\mathbb{R}} \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp)\Psi_n^{(w)}\rangle - \inf_{v\in\mathbb{R}} \langle \Psi_n^{(v)}, H_{Z,N}(x_0^\perp)\Psi_n^{(v)}\rangle \Big]. \tag{44} \]


The potential appearing in $H_{Z,N}(x^\perp)$, i.e.,
\[ W_{Z,N,x^\perp}(z_1,\dots,z_N) = -\sum_{i=1}^N \frac{Z}{\sqrt{z_i^2 + (x_i^\perp)^2}} + \sum_{i<j} \frac{1}{\sqrt{(z_i - z_j)^2 + (x_i^\perp - x_j^\perp)^2}} \]
is a superharmonic function of $(z_1, x_1^\perp) \in \mathbb{R}^3 \setminus \{0\}$. Writing
\[ \langle \Psi_n^{(w)}, W_{Z,N,x^\perp}\Psi_n^{(w)}\rangle = \int W_{Z,N,x^\perp}(z_1 + w, z_2, \dots, z_N)\,|\Psi_n(z_1,\dots,z_N)|^2\,dz_1\cdots dz_N \]
we see that $\langle \Psi_n^{(w)}, W_{Z,N,x^\perp}\Psi_n^{(w)}\rangle$ is superharmonic in $(w, x_1^\perp)$ away from the line $x_1^\perp = 0$. Since $\langle \Psi_n^{(w)}, \partial_{z_i}^2 \Psi_n^{(w)}\rangle$ is independent of w and $x_1^\perp$ for all i = 1, ..., N we have that $\langle \Psi_n^{(w)}, H_{Z,N}(x^\perp)\Psi_n^{(w)}\rangle$ is superharmonic in $(w, x_1^\perp)$ away from the line $x_1^\perp = 0$. Moreover, we also have that the two limits
\[ \liminf_{w\to\pm\infty} \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp)\Psi_n^{(w)}\rangle \]
are independent of $x_1^\perp$. This is true simply because the contributions from the terms in the Hamiltonian depending on $x_1^\perp$ tend to zero as w → ±∞. We may therefore apply the above lemma to the function $f(w, x_1^\perp) = \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp)\Psi_n^{(w)}\rangle$. We conclude that the function
\[ x_1^\perp \mapsto \inf_{w\in\mathbb{R}} \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp)\Psi_n^{(w)}\rangle \]
is superharmonic for $x_1^\perp \ne 0$. Moreover by the inequality (41) this function is bounded below if $|x_1^\perp|$ is bounded away from 0. Now using Fatou's Lemma we see from (44) that the average of $E_{Z,N}(x^\perp) - E_{Z,N}(x_0^\perp)$ over the set $\{x_1^\perp : |x_1^\perp - x_{1,0}^\perp| < r\}$ is non-positive for all r > 0 small enough.

5. Lower Bound

The first lemma in this section concerns the ground state energy $e^B_{Z,N}(y^\perp)$ of $h^B_{Z,N}(y^\perp)$ and does not use superharmonicity.

Lemma 5.1 (Lower bound on $e^B_{Z,N}(y^\perp)$). Let K be a compact subset of the set A given in (42). Then
\[ \liminf_{B\to\infty}\ \inf_{y^\perp\in K} e^B_{Z,N}(y^\perp) \ge e(Z,N). \tag{45} \]

Proof. To avoid problems at points $y^\perp$ with $y_i^\perp - y_j^\perp = 0$ for some i, j, we replace the repulsive potential $V_{B,|y_i^\perp - y_j^\perp|}(z_i - z_j)$ by the smaller potential $V_{B,|y_i^\perp - y_j^\perp|+1}(z_i - z_j)$. We denote the corresponding Hamiltonian by $\tilde h^B_{Z,N}(y^\perp)$ and its ground state energy by $\tilde e^B_{Z,N}(y^\perp)$. It is obvious that $e^B_{Z,N}(y^\perp) \ge \tilde e^B_{Z,N}(y^\perp)$, so a lower bound on $\tilde e^B_{Z,N}(y^\perp)$ gives a lower bound on $e^B_{Z,N}(y^\perp)$.


Let Ψ be a normalized, symmetric wavefunction in $C_0^\infty(\mathbb{R}^N)$. Since $\langle \Psi, h_{Z,N}\Psi\rangle \ge e(Z,N)$ we have to estimate the matrix elements of the difference $\tilde h^B_{Z,N}(y^\perp) - h_{Z,N}$. Using Lemmas 2.1 and 2.2, together with the Hölder inequality for the integration over $z_2, \dots, z_N$ and $z_3, \dots, z_N$ respectively, we obtain
\[ \big| \langle \Psi, \tilde h^B_{Z,N}\Psi\rangle - \langle \Psi, h_{Z,N}\Psi\rangle \big| \le L(B)^{-1}\,(ZN + N(N-1)) \times \Big( r_{\min}^{-1} + 8\,T_\Psi^{3/4}\,(2 r_{\max} + 1)^{1/2} \Big), \tag{46} \]
where $r_{\min}$ and $r_{\max}$ are respectively the minimum and the maximum value of $|y_i^\perp|$, i = 1, ..., N, with $y^\perp \in K$, and
\[ T_\Psi = N \int |\partial_z \Psi(z, z_2, \dots, z_N)|^2\,dz\,dz_2\cdots dz_N \tag{47} \]
is the kinetic energy of Ψ. Now if $\Psi^B_{y^\perp,n}$, n = 1, 2, ... is a minimizing sequence of normalized wave functions for $\tilde h^B_{Z,N}(y^\perp)$, then we may assume that the corresponding kinetic energy is uniformly bounded in n, B and $y^\perp \in K$. In fact, we may assume that $\langle \Psi^B_{y^\perp,n}, \tilde h^B_{Z,N}(y^\perp)\Psi^B_{y^\perp,n}\rangle$ is a bounded sequence. If we use the bound from Lemma 2.1 in [LSYa], we obtain
\[ \langle \Psi^B_{y^\perp,n}, \tilde h^B_{Z,N}(y^\perp)\Psi^B_{y^\perp,n}\rangle \ge \frac{T_n}{2} - \frac12\left(\frac{2Z}{L(B)}\right)^{2}\left(1 + \Big[\sinh^{-1}\{(2Z)^{-1}B^{1/2}\}\Big]^{2}\right), \tag{48} \]
where we have saved half of the kinetic energy $T_n$ of $\Psi^B_{y^\perp,n}$. For large B, $L(B)^{-1}\sinh^{-1}\{(2Z)^{-1}B^{1/2}\}$ is bounded and hence we see that $T_n$ is bounded. The error term (46) with $\Psi = \Psi^B_{y^\perp,n}$ thus tends to zero as B → ∞, uniformly in n, and the lemma is established.

Lemma 5.2 (Uniform bounds on $E_{Z,N}(x^\perp)$). Let ε > 0. Consider the set
\[ C^{B,\varepsilon} = \{x^\perp : \varepsilon B^{-1/2} \le |x_i^\perp| \text{ for all } i = 1, \dots, N\}. \tag{49} \]
Then
\[ \liminf_{B\to\infty}\ (\ln B)^{-2} \inf\{E_{Z,N}(x^\perp) : x^\perp \in C^{B,\varepsilon}\} \ge e(Z,N), \tag{50} \]
where e(Z,N) as before denotes the 1-dimensional delta function atom energy.

Proof. Define the sets $C_n^{B,\varepsilon} = \{x^\perp : \varepsilon B^{-1/2} \le |x_i^\perp| \le n \text{ for all } i = 1, \dots, N\}$. Since $C_n^{B,\varepsilon}$ is compact and $E_{Z,N}$ is lower semicontinuous (being superharmonic, in fact, superharmonic in each variable) we may find $x_n^\perp \in C_n^{B,\varepsilon}$ such that
\[ E_{Z,N}(x_n^\perp) = \min\{E_{Z,N}(x^\perp) : x^\perp \in C_n^{B,\varepsilon}\}. \]


Clearly,
\[ \lim_{n\to\infty} E_{Z,N}(x_n^\perp) = \inf\{E_{Z,N}(x^\perp) : x^\perp \in C^{B,\varepsilon}\}. \]
By the superharmonicity of $E_{Z,N}(x^\perp)$ in each variable $x_i^\perp$ we know that each coordinate $x_{i,n}^\perp$ of the point $x_n^\perp$ satisfies either $|x_{i,n}^\perp| = \varepsilon B^{-1/2}$ or $|x_{i,n}^\perp| = n$. Moreover, since $E_{Z,N}(x^\perp)$ is invariant under permutations of the coordinates of $x^\perp$ we may assume that $|x_{1,n}^\perp| \le |x_{2,n}^\perp| \le \cdots \le |x_{N,n}^\perp|$ for all n. By possibly going to a subsequence we may assume that there exists an integer 0 ≤ K ≤ N such that for n large enough
\[ |x_{i,n}^\perp| = \begin{cases} \varepsilon B^{-1/2}, & \text{for } i = 1, \dots, K, \\ n, & \text{for } i > K. \end{cases} \]
Moreover, we may assume that $x_{i,n}^\perp$ converges as n → ∞ for i = 1, ..., K. Since we may ignore the variables $x_{i,n}^\perp$, i = K + 1, ..., N, which tend to infinity we have
\[ \lim_{n\to\infty} E_{Z,N}(x_n^\perp)\,/\,E_{Z,K}(x_{1,n}^\perp, \dots, x_{K,n}^\perp) = 1. \]
Since $E_{Z,K}(x^\perp)$ is lower semicontinuous we conclude that there exists a point $(x_{1,\infty}^\perp, \dots, x_{K,\infty}^\perp) \in \mathbb{R}^{2K}$ with $|x_{i,\infty}^\perp| = \varepsilon B^{-1/2}$ for all i = 1, ..., K such that
\[ \inf\{E_{Z,N}(x^\perp) : x^\perp \in C^{B,\varepsilon}\} = E_{Z,K}(x_{1,\infty}^\perp, \dots, x_{K,\infty}^\perp). \]
By Lemma 5.1 we have that
\[ \liminf_{B\to\infty}\ \inf\{L(B)^{-2} E_{Z,K}(B^{-1/2}y_1^\perp, \dots, B^{-1/2}y_K^\perp) : |y_i^\perp| = \varepsilon \text{ for all } i\} \ge e(Z,K). \]
Since K ≤ N and hence e(Z,K) ≥ e(Z,N) we have proved the lemma.

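Lemma 5.3 (Wave functions in the lowest Landau band). If $\Psi \in \mathcal{H}_N^0$ belongs to the lowest Landau band at field strength B, then $\int |\Psi(x^\perp, z)|^2\,dz$ is a bounded function of $x^\perp$ (possibly after a modification on a null set) and for all 1 ≤ n ≤ N,
\[ \sup_{x_1^\perp, \dots, x_n^\perp} \int |\Psi(x^\perp, z)|^2\,dz\,dx_{n+1}^\perp \cdots dx_N^\perp \le \frac{B^n}{(2\pi)^n}\,\|\Psi\|^2. \tag{51} \]

Proof. The projector $\Pi_N^0$ on the lowest Landau band is the Nth tensorial power of the projector $\Pi^0$ that operates on $L^2(\mathbb{R}^3; \mathbb{C}^2)$ and is given by the integral kernel
\[ \Pi^0(x, x') = \Pi_\perp^0(x^\perp, x'^\perp)\,\delta(z - z')\,P^{\downarrow}, \tag{52} \]
where
\[ \Pi_\perp^0(x^\perp, x'^\perp) = \frac{B}{2\pi}\exp\Big\{ \tfrac{i}{2}(x^\perp \times x'^\perp)\cdot\mathbf{B} - \tfrac14 (x^\perp - x'^\perp)^2 B \Big\} \tag{53} \]
and $P^{\downarrow}$ is the projector on vectors in $\mathbb{C}^2$ with spin component −1/2. The kernel $\Pi_\perp^0(x^\perp, x'^\perp)$ is a continuous function with
\[ \int \Pi_\perp^0(x^\perp, u^\perp)\,\Pi_\perp^0(u^\perp, y^\perp)\,du^\perp = \Pi_\perp^0(x^\perp, y^\perp) \tag{54} \]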


and
\[ \Pi_\perp^0(x^\perp, x^\perp) = \frac{B}{2\pi} \tag{55} \]
for all $x^\perp$. A wave function Ψ in the lowest Landau band has the representation $\Psi = \Pi_N^0 \Psi$. After writing $\Pi_N^0$ as an integral operator (51) follows from the Cauchy–Schwarz inequality, using (54) and (55).

Proposition 5.4 (Lower bound).
\[ \liminf_{B\to\infty} \frac{E(B,Z,N)}{(\ln B)^2} \ge e(Z,N). \tag{56} \]

Proof. For fixed B let Ψ be a normalized wave function in the lowest Landau band. By (15) we have
\[ \langle \Psi, H_{Z,N}\Psi\rangle \ge \int E_{Z,N}(x^\perp)\Big(\int |\Psi(x^\perp, z)|^2\,dz\Big)dx^\perp. \tag{57} \]
We split the integral over $x^\perp$ into an integral over $C^{B,\varepsilon}$ (defined in (49)) and its complement in $\mathbb{R}^{2N}$. By Lemma 5.2 we have only to consider the latter. Using the estimate (41) the task is to bound terms of the form
\[ \int_{|x_i^\perp|\le\varepsilon B^{-1/2}} \Big(1 + \big[\sinh^{-1}(Z|x_j^\perp|^{-1})\big]^2\Big) \int |\Psi(x^\perp, z)|^2\,dz\,dx^\perp \tag{58} \]
from above. If i = j we carry out the integration over all $x_k^\perp$ with k ≠ i and use Lemma 5.3 for the remaining variable $x_i^\perp$. For small r, $|\sinh^{-1} r^{-1}| \le (\text{const.})|\ln r|$ and the term can be estimated by
\[ (\text{const.}) \int_{|x^\perp|\le\varepsilon B^{-1/2}} (\ln|x^\perp|)^2\,B\,dx^\perp \le (\text{const.})\,\varepsilon^2 (\ln B)^2. \tag{59} \]
For i ≠ j we split the integration over $x_j^\perp$ into two parts, namely $|x_j^\perp| \le B^{-1/2}$ and $|x_j^\perp| \ge B^{-1/2}$. For the first part we obtain the following bound, after transforming variables and using Lemma 5.3, this time for n = 2,
\[ (\text{const.})\,\varepsilon^2 \int_{|y_i^\perp|\le 1,\ |y_j^\perp|\le 1} \big(\ln(B^{-1/2}|y_j^\perp|)\big)^2\,dy_i^\perp\,dy_j^\perp \le (\text{const.})\,\varepsilon^2 (\ln B)^2. \tag{60} \]
For the integral over $|x_j^\perp| \ge B^{-1/2}$ we estimate $|\sinh^{-1}(Z|x_j^\perp|^{-1})|^2$ by its maximum value, which is at most $(\text{const.})(\ln B)^2$, and obtain for this part of the integral the upper bound
\[ \int_{|x_i^\perp|<\varepsilon B^{-1/2}} \big(1 + (\text{const.})(\ln B)^2\big) \int |\Psi(x^\perp, z)|^2\,dz\,dx^\perp \le \int_{|x_i^\perp|<\varepsilon B^{-1/2}} \big(1 + (\text{const.})(\ln B)^2\big)\,B\,dx_i^\perp \le \varepsilon^2\big(1 + c(\ln B)^2\big), \tag{61} \]
where we have used Lemma 5.3 again. We see that (58) is bounded above by $(\text{const.})(\varepsilon \ln B)^2$, for B large enough. Since ε > 0 is arbitrary this completes the proof.
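The logarithmic integral in (59) can be evaluated in closed form. The following worked computation is only a sketch, assuming $x^\perp \in \mathbb{R}^2$ and using the shorthand $R = \varepsilon B^{-1/2}$, which is introduced here for illustration and does not appear in the text:

```latex
\begin{align*}
\int_{|x^\perp|\le R} (\ln|x^\perp|)^2\, B\, dx^\perp
  &= 2\pi B \int_0^R r\,(\ln r)^2\,dr
   = \pi B R^2 \Big[(\ln R)^2 - \ln R + \tfrac12\Big] \\
  &= \pi \varepsilon^2 \Big[(\ln R)^2 - \ln R + \tfrac12\Big],
  \qquad \ln R = \ln\varepsilon - \tfrac12 \ln B .
\end{align*}
```

For fixed ε the bracket grows like $\tfrac14(\ln B)^2$ as B → ∞, which is consistent with the bound $(\text{const.})\,\varepsilon^2(\ln B)^2$ stated in (59).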


6. The One-Dimensional Delta-Function Model

We now want to study the delta-function Hamiltonian (7), in particular its mean field limit, N → ∞, Z → ∞, with λ = N/Z fixed. For this it is convenient to make a scale transformation z → z/Z, which implies a unitary equivalence
\[ h_{Z,N} \cong Z^2\,\hat h_{Z,N} \tag{62} \]
with
\[ \hat h_{Z,N} = \sum_{i=1}^N \big(p_i^2 - \delta(z_i)\big) + \frac{1}{Z}\sum_{i<j} \delta(z_i - z_j). \tag{63} \]
We denote its ground state energy (again in the sense of quadratic forms) by $\hat e(Z,N)$. The formal mean field theory of this system is identical to the so called hyper-strong theory discussed in [LSYa, Sect. 3]. The energy of a (one dimensional) electron density Zρ in this theory is $Z E^{HS}[\rho]$ with
\[ E^{HS}[\rho] = \int_{\mathbb{R}} \Big(\frac{d}{dz}\sqrt{\rho(z)}\Big)^2 dz - \rho(0) + \frac12 \int \rho^2\,dz. \tag{64} \]
The infimum over densities with fixed normalization $\int \rho = \lambda$ leads to the hyper-strong energy $E^{HS}(\lambda)$ given by the right side of (5). We shall now establish this mean field limit rigorously and prove Theorem 1.2.

6.1. A comparison model. An upper bound to the Hamiltonian (7) can be obtained from another model whose ground state can be computed explicitly. The corresponding Hamiltonian is completely symmetric with regard to each single reflection $z_i \to -z_i$, and the electronic repulsions are equally distributed between the sites $z_i$ and $-z_i$:
\[ \bar h_{Z,N} = \sum_{i=1}^N \big(p_i^2 - \delta(z_i)\big) + \frac{1}{2Z}\sum_{i<j} \big[\delta(z_i - z_j) + \delta(z_i + z_j)\big]. \tag{65} \]
Its ground state energy is denoted by $\bar e(Z,N)$. The replacement of 1/Z by 1/(2Z) is important, because it compensates to a certain extent the doubling of the interaction sites. In particular it leads to the same formal mean field theory as (63) for symmetric electron densities ρ. This observation will be substantiated by the mathematical treatment in the sequel.

The model (65) was used in [WS] for N = 2 as a starting point for a perturbational calculation. It was also considered in [Ro] (for N = 2) as an upper bound to the model (7), but with the coupling 1/Z instead of 1/(2Z). The present considerations and extensions to N ≥ 3 appear to be new.

The ground state wave function $\bar\psi$, if it exists, is completely symmetric under permutations of $\{z_1, \dots, z_N\}$ and reflections $z_i \to -z_i$. Such a highly symmetric function $\bar\psi$ is determined by its restriction to the cone
\[ M = \{z : 0 \le z_1 \le z_2 \le \dots \le z_N\}. \tag{66} \]


In M we make the ansatz
\[ \bar\psi(z_1, \dots, z_N) = c \prod_{i=1}^N e^{-\kappa_i z_i}, \tag{67} \]
with c a normalization constant and let $\bar h_{Z,N}$ act on the symmetrically extended $\bar\psi$. The delta-function interactions dictate the jumps in the partial logarithmic derivatives of $\bar\psi$ at the boundary of M and we find
\[ \kappa_1 = \frac12, \qquad \kappa_i - \kappa_{i-1} = -\frac{1}{4Z}, \]
which implies
\[ \kappa_n = \frac12 - \frac{n-1}{4Z}. \tag{68} \]
The function (67) is square integrable if and only if all $\kappa_n$ are strictly positive, which is equivalent to
\[ N < 2Z + 1. \tag{69} \]
The corresponding eigenfunction $\bar\psi$ is everywhere positive, and it is easy to see that it is, indeed, a ground state for (65): Define the operators $A_n$ on $L^2(\mathbb{R}^N)$ by
\[ A_n = \partial_{z_n} - \partial_{z_n}(\ln\bar\psi), \tag{70} \]
with obvious domains of definition. Denoting by $e(\bar\psi)$ the eigenvalue of $\bar h_{Z,N}$ corresponding to $\bar\psi$ we can write the quadratic form $\bar h_{Z,N}$ as
\[ \bar h_{Z,N} = \sum_{n=1}^N A_n^* A_n + e(\bar\psi). \tag{71} \]
The equation
\[ \langle \psi, \bar h_{Z,N}\psi\rangle = \sum_{n=1}^N \|A_n\psi\|^2 + e(\bar\psi)\|\psi\|^2 \ge e(\bar\psi)\|\psi\|^2, \tag{72} \]
which holds for each ψ in the form domain of $\bar h_{Z,N}$, shows that $\bar e(Z,N) = e(\bar\psi)$.

If N ≥ 2Z + 1, the simple inequality $\bar e(Z,N) \le \bar e(Z,N-1)$ is sufficient for our purposes. To prove it, one may use trial wave functions of the form $\psi(z_1, \dots, z_{N-1})\,\varepsilon\,\varphi(\varepsilon^2 z_N)$ with a smooth φ, and take ε to zero. This inequality for the energies can be iterated to
\[ \bar e(Z,N) \le \bar e(Z,N_o), \tag{73} \]
where $N_o$ is the largest integer satisfying (69). For N < 2Z + 1 the ground state energy is the eigenvalue corresponding to (67):
\[ \bar e(Z,N) = e(\bar\psi) = -\sum_{n=1}^N \kappa_n^2 = -\frac14\left[ N\Big(1 - \frac{\lambda}{2} + \frac{\lambda^2}{12}\Big) + \frac{\lambda}{2} - \frac{\lambda^2}{8} + \frac{\lambda^2}{24N} \right]. \tag{74} \]
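The closed form on the right side of (74) can be checked against the direct sum over $\kappa_n^2$ from (68). The sketch below uses exact rational arithmetic and assumes an integer nuclear charge Z; the function names are hypothetical helpers introduced only for this check.

```python
from fractions import Fraction


def kappa(n, Z):
    """Decay rate kappa_n = 1/2 - (n - 1)/(4Z) from Eq. (68); Z is an integer here."""
    return Fraction(1, 2) - Fraction(n - 1, 4 * Z)


def energy_by_sum(Z, N):
    """Ground state energy of the comparison model as -sum of kappa_n^2."""
    return -sum(kappa(n, Z) ** 2 for n in range(1, N + 1))


def energy_closed_form(Z, N):
    """Right side of Eq. (74), written in terms of lambda = N/Z."""
    lam = Fraction(N, Z)
    return -Fraction(1, 4) * (
        N * (1 - lam / 2 + lam ** 2 / 12)
        + lam / 2
        - lam ** 2 / 8
        + lam ** 2 / (24 * N)
    )


def largest_bound_number(Z):
    """Largest integer N with N < 2Z + 1 (condition (69)); equals 2Z for integer Z."""
    return 2 * Z
```

For example, `energy_by_sum(1, 1)` and `energy_closed_form(1, 1)` both give `Fraction(-1, 4)`, the energy of a single electron in the potential −δ(z). Dividing (74) by Z at fixed λ leaves the leading term $-\tfrac{\lambda}{4}\big(1 - \tfrac{\lambda}{2} + \tfrac{\lambda^2}{12}\big)$.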


If $N \ge N_o$ we may use (73), i.e., $\bar e(Z,N)$ is bounded from above by (74) with λ replaced by $\lambda_o = N_o/Z$. Dividing by Z and keeping λ fixed, the leading term for N → ∞ is identical to $E^{HS}(\lambda)$. By the next proposition this is sufficient for the upper bound in Theorem 1.2. But one can in fact show that $\bar e(Z,N_o)$ is equal to $\bar e(Z,N)$ for N ≥ 2Z + 1 and not only an upper bound to it. Hence $N_o$ is equal to $N_c$, the maximal number of electrons that can be bound in the model (65). We give the proof of this result in the Appendix.

Proposition 6.1 (The comparison model gives upper bounds). The ground state energy of the symmetrized model $\bar h_{Z,N}$ is an upper bound to the ground state energy of $\hat h_{Z,N}$:
\[ \hat e(Z,N) \le \bar e(Z,N). \tag{75} \]
This inequality is strict if N ≥ 2 and N < 2Z + 1.

Proof. The Hamiltonian $\bar h_{Z,N}$ is the symmetrization of $\hat h_{Z,N}$ with respect to the group $\mathcal{R}$ with $2^N$ elements, generated by the reflections $z_i \to -z_i$, i = 1, ..., N. For $R \in \mathcal{R}$ let $U_R$ denote the corresponding unitary operators on $L^2(\mathbb{R}^N)$. Then
\[ \frac{1}{2^N}\sum_{R\in\mathcal{R}} \langle U_R\psi, \hat h_{Z,N} U_R\psi\rangle = \langle \psi, \bar h_{Z,N}\psi\rangle \]
for any ψ, so $\hat e(Z,N) \le \bar e(Z,N)$. If N < 2Z + 1 we may take the square integrable ground state wave function $\bar\psi$ of $\bar h_{Z,N}$, given by (67), as a test state for $\hat h_{Z,N}$. It satisfies $U_R\bar\psi = \bar\psi$ for all R, so
\[ \langle \bar\psi, \hat h_{Z,N}\bar\psi\rangle = \bar e(Z,N). \]
But $\bar\psi$ is not an eigenfunction of $\hat h_{Z,N}$ if N ≥ 2, so $\hat e(Z,N)$ is strictly below $\bar e(Z,N)$.

Combining the last proposition with Eq. (74), recalling that $e(Z,N) = Z^2\,\hat e(Z,N)$, we obtain

Proposition 6.2 (Upper bound in the mean field limit). If N, Z → ∞ with λ = N/Z fixed, then
\[ \limsup\ e(Z,N)/Z^3 \le E^{HS}(\lambda), \tag{76} \]
where $E^{HS}(\lambda)$ is given by (5).


6.2. Lower bounds to the delta-function Hamiltonian. An elegant way to obtain lower bounds for Hamiltonians with repulsive pair interactions is the use of positive definite functions. This was probably done for the first time in [HLT]. In this method, the positive definite functions have to be finite at the origin, however, and hence it is impossible to bound the δ-function interaction in this way without additional help. Our way out is to borrow a bit of kinetic energy (this was also done in Theorem 7.1 in [LSYa]). So we search for operator inequalities
\[ a\,p^2 + \frac{1}{Z}\delta(z) \ge w_{Z,a,b}(z) \tag{77} \]
with appropriate functions $w_{Z,a,b}(z)$, depending on a parameter b in addition to a and Z to allow convergence to a delta function.

Lemma 6.3 (An operator inequality). The inequality (77) holds for
\[ w_{Z,a,b}(z) = \frac{1}{Z^2 a}\,\frac{b^2}{2b+1}\,e^{-b|z|/(Za)}. \tag{78} \]

Proof. With the simple reformulation to
\[ a\,p^2 + \frac{1}{Z}\delta(z) - w_{Z,a,b}(z) \ge 0 \tag{79} \]
we are on well known territory: The Hamiltonian on the left side shall have no negative eigenvalue. By the scale transformation z → Zaz, this inequality is transformed to
\[ p^2 + \delta(z) - W_b(z) \ge 0, \qquad W_b(z) = Z^2 a\,w_{Z,a,b}(Zaz). \tag{80} \]
This inequality will hold for
\[ W_b(z) = \frac{b^2}{2b+1}\,e^{-b|z|} \tag{81} \]
if it is true for the larger potential
\[ \widetilde W_b(z) = \frac{b^2 e^{-b|z|}}{2b + 1 - e^{-b|z|}}, \]
because the Hamiltonian in (80) is bounded from below by
\[ p^2 + \delta(z) - \widetilde W_b(z). \tag{82} \]
This Hamiltonian has
\[ f(z) = 1 - \frac{1}{2b+1}\,e^{-b|z|} \tag{83} \]
as a positive symmetric solution to the Schrödinger equation (as a differential equation) with zero energy. Now, if (82) would have a square integrable ground state wave function g(z), this wave function would also be symmetric under reflection z → −z, and the delta-function would dictate the same value for $g'(z)/g(z)$ as it does for $f'(z)/f(z)$ at $z = 0^+$. So the question of the existence of g(z) can be dealt with by the methods which are used for proving Sturm's comparison theorem: We assume that g(z) exists, with negative


energy E. The Wronskian $W(z) := f'(z)g(z) - g'(z)f(z)$ is zero at $z = 0^+$. Its derivative is determined as $W'(z) = E f(z)g(z)$. If g(z) is chosen positive, then $W'(z)$ is negative, which implies that W(z) is negative for z > 0, and $g'(z)/g(z) > f'(z)/f(z)$. This inequality can be integrated to give $g(z)/g(0) > f(z)/f(0)$, a contradiction to the assumption of the square-integrability of g(z). Therefore we know that the Hamiltonian (82) has no negative eigenvalue. And so the operator inequality holds.

The $\{W_b(z)\}$ and hence $\{Z w_{Z,a,b}(z)\}$ are δ-sequences in the limit b → ∞. All these functions are positive definite, and finite at the origin:
\[ w_{Z,a,b}(0) < \frac{b}{2Z^2 a}. \tag{84} \]
With this tool we can now deduce the lower bound for the many body Hamiltonian.
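The two properties of f used in the proof of Lemma 6.3 can be checked numerically: away from the origin f solves $-f'' - \widetilde W_b f = 0$, and at the origin its derivative jumps by $f(0)$, as the δ-function in (82) requires. The sketch below uses a central finite difference for $f''$; the function names and the step size are illustrative choices, not part of the original argument.

```python
import math


def f(z, b):
    """Zero-energy solution from Eq. (83)."""
    return 1.0 - math.exp(-b * abs(z)) / (2.0 * b + 1.0)


def w_tilde(z, b):
    """The larger comparison potential appearing in Eq. (82)."""
    e = math.exp(-b * abs(z))
    return b * b * e / (2.0 * b + 1.0 - e)


def residual(z, b, h=1e-4):
    """Finite-difference value of -f'' - W~_b f at a point with |z| > h."""
    second = (f(z + h, b) - 2.0 * f(z, b) + f(z - h, b)) / (h * h)
    return -second - w_tilde(z, b) * f(z, b)


def jump_defect(b):
    """f'(0+) - f'(0-) - f(0); vanishes when the delta condition holds."""
    fprime_plus = b / (2.0 * b + 1.0)  # exact one-sided derivative of f at 0+
    return 2.0 * fprime_plus - f(0.0, b)  # f is even, so f'(0-) = -f'(0+)
```

Both `residual(0.7, 2.0)` and `jump_defect(2.0)` are zero up to discretization and rounding error.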

Proposition 6.4 (Lower bound in the mean field limit). If N, Z → ∞ with λ = N/Z fixed, then
\[ \liminf\ e(Z,N)/Z^3 \ge E^{HS}(\lambda). \tag{85} \]

Proof. We use the operator inequality (77) with $w_Z(z) := w_{Z,a,b}(z)$ (a and b will finally be chosen as appropriate powers of N) to bound $\hat h_{Z,N}$ from below. For each $\delta(z_i - z_j)$ we use it twice; one time with $a p_i^2$, and a second time with $a p_j^2$. Then we add these inequalities and divide by two:
\begin{align*}
\hat h_{Z,N} &= \sum_{i=1}^N \Big[ \Big(1 - a\,\frac{N-1}{2}\Big) p_i^2 - \delta(z_i) \Big] + \sum_{i<j} \Big[ a\,\frac{p_i^2 + p_j^2}{2} + \frac{1}{Z}\delta(z_i - z_j) \Big] \\
&\ge \sum_{i=1}^N \big[ \dots \big] + \sum_{i<j} w_Z(z_i - z_j). \tag{86}
\end{align*}

At this point the positive definiteness of $w_Z(z)$ becomes essential. It implies that for any real valued integrable function σ(z):
\[ \frac12 \int\!\!\int dz\,dy\,\Big[ N\sigma(z) - \sum_{i=1}^N \delta(z - z_i) \Big]\, w_Z(z - y)\, \Big[ N\sigma(y) - \sum_{j=1}^N \delta(y - z_j) \Big] \ge 0. \tag{87} \]
Expanding this expression and integrating the delta-functions we get
\[ \sum_{i<j} w_Z(z_i - z_j) \ge N \sum_i \int w_Z(z_i - z)\sigma(z)\,dz - \frac{N}{2} w_Z(0) - \frac{N^2}{2} \int\!\!\int \sigma(z) w_Z(z - y)\sigma(y)\,dz\,dy. \tag{88} \]
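The step from (87) to (88) can be written out as follows. This is a sketch of the expansion, using only that $w_Z$ is even:

```latex
\begin{align*}
0 &\le \frac12 \int\!\!\int dz\,dy\,
      \Big[N\sigma(z) - \sum_i \delta(z - z_i)\Big]\, w_Z(z - y)\,
      \Big[N\sigma(y) - \sum_j \delta(y - z_j)\Big] \\
  &= \frac{N^2}{2} \int\!\!\int \sigma(z)\, w_Z(z - y)\, \sigma(y)\,dz\,dy
     \;-\; N \sum_i \int w_Z(z_i - z)\,\sigma(z)\,dz
     \;+\; \frac12 \sum_{i,j} w_Z(z_i - z_j), \\
\frac12 \sum_{i,j} w_Z(z_i - z_j)
  &= \sum_{i<j} w_Z(z_i - z_j) + \frac{N}{2}\, w_Z(0).
\end{align*}
```

Solving for $\sum_{i<j} w_Z(z_i - z_j)$ gives (88); the diagonal terms i = j are the source of the $-\tfrac{N}{2} w_Z(0)$ contribution.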

Combining this with (86) gives
\[ \hat h_{Z,N} \ge \sum_{i=1}^N h_i(Z,N,\sigma) - \frac{N^2}{2} \int\!\!\int \sigma(z) w_Z(z - y)\sigma(y)\,dz\,dy \tag{89} \]


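with the one-particle operators
\[ h_i(Z,N,\sigma) = \Big(1 - a\,\frac{N-1}{2}\Big) p_i^2 - \delta(z_i) + N(\sigma * w_Z)(z_i) - \frac12 w_Z(0). \tag{90} \]
The parameters are now chosen as
\[ a = N^{-1-\varepsilon}, \qquad b = N^{\varepsilon}, \qquad \text{with } 0 < \varepsilon < 1/2. \]
The fraction of kinetic energy per particle that we borrowed in (86) then decreases as $N^{-\varepsilon}$, and the functions $w_Z(z)$ become
\[ w_Z(z) = \frac{N^{1+\varepsilon}}{Z^2}\,\frac{N^{2\varepsilon}}{2N^{\varepsilon} + 1}\,e^{-|z|\,N^{1+2\varepsilon}/Z}. \tag{91} \]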
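The substitution leading from (78) to (91) can be checked numerically, together with the total mass of $Z w_Z$, which a direct integration of (78) gives as $2b/(2b+1)$ and which tends to 1 as b → ∞. The sketch below is illustrative only; the function names and sample values are assumptions made for the check.

```python
import math


def w_general(z, Z, a, b):
    """w_{Z,a,b}(z) as in Eq. (78)."""
    return b * b / (Z * Z * a * (2.0 * b + 1.0)) * math.exp(-b * abs(z) / (Z * a))


def w_mean_field(z, Z, N, eps):
    """w_Z(z) as in Eq. (91), i.e. Eq. (78) with a = N**(-1-eps), b = N**eps."""
    prefactor = (N ** (1 + eps) / Z ** 2) * (N ** (2 * eps) / (2 * N ** eps + 1))
    return prefactor * math.exp(-abs(z) * N ** (1 + 2 * eps) / Z)


def delta_mass(b):
    """Integral of Z * w_{Z,a,b} over the real line, evaluated in closed form."""
    return 2.0 * b / (2.0 * b + 1.0)
```

With, say, Z = 50, N = 100 and ε = 1/4, `w_general(z, Z, N**(-1 - eps), N**eps)` and `w_mean_field(z, Z, N, eps)` agree to rounding error, and `delta_mass(N**eps)` approaches 1 as N grows.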

In the mean field limit N, Z → ∞ with N/Z = λ > 0 fixed the sequence $Z w_Z(z)$ is a δ-sequence, and $w_Z(0) \sim \lambda N^{2\varepsilon}/Z \to 0$. If σ(z) is smooth and bounded, with |σ(z)| ≤ γ and |σ'(z)| ≤ γ, then $|N(\sigma * w_Z)(z) - \lambda\sigma(z)| \le (1 + \lambda)\gamma N^{-\varepsilon}$. The one particle Hamiltonians h(Z,N,σ), with smooth σ(z), converge as quadratic forms pointwise (i.e., for each test function) to
\[ h_{\lambda\sigma} = p^2 - \delta(z) + \lambda\sigma(z). \tag{92} \]
Moreover
\[ h(Z,N,\sigma) \ge h_{\lambda\sigma} - (N^{-\varepsilon}/2)\,p^2 - (1 + \lambda)\gamma N^{-\varepsilon} - (\lambda^2/2) N^{2\varepsilon - 1}. \tag{93} \]
Since the ground state energies of operators of the type $\alpha p^2 + V$ are concave functions of α and hence continuous in α, the ground state energies of the right side of (93) converge in the limit N → ∞. The ground state energy of $h_{\lambda\sigma}$ is a concave functional e[λσ], and the lower bound (89), when divided by the number of electrons N, gives
\[ \liminf_{\substack{N,Z\to\infty \\ N/Z=\lambda}} \frac{1}{N}\,\hat e(Z,N) \ge e[\lambda\sigma] - \frac{\lambda}{2}\int \sigma^2(z)\,dz =: I_\lambda[\sigma]. \tag{94} \]
Inserting the mean field density ρ for λσ (i.e., the minimizer of (64) which satisfies Eq. (3.8) of [LSYa]) gives the mean field energy, divided by λ, as a lower bound to the limit of the energy per electron.

We remark that searching for the supremum of $I_\lambda[\sigma]$ in (94) also leads to the mean field equation of [LSYa]: Assuming $e[\lambda\sigma] = \langle \psi, h_{\lambda\sigma}\psi\rangle$ with a normalized ψ the variational condition on σ(z) for maximizing $I_\lambda[\sigma]$ is
\[ \sigma(z) = \psi^2(z). \tag{95} \]
Inserting this into the Schrödinger equation $h_{\lambda\sigma}\psi = \mu_\lambda\psi$ for ψ gives
\[ -\psi''(z) - \delta(z)\psi(0) + \lambda\psi^3(z) = -\mu_\lambda\psi(z), \tag{96} \]
i.e., Eq. (3.8) in [LSYa].

Finally we remark that the energy per electron, $\hat e(Z,N)/N$, approaches the mean field limit monotonically. There is also a subadditivity property, which in the limit becomes concavity of $E^{HS}(\lambda)/\lambda$. These properties of the approach to a mean field hold in some other cases too, as will be shown elsewhere [B99].


7. Conclusions We have shown that the energy of an atom in a strong magnetic field B approaches, after division by (ln B)2 , the energy of a many body Hamiltonian with delta interactions in one dimension as B → ∞. This delta function model is not explicitly solvable, but an upper bound to the energy can be given in terms of another model with the same mean field limit and where we can explicitly calculate the ground state energy. In the latter model an atom with nuclear charge Z can bind up to 2Z electrons. Whether this represents the true state of affairs for the atomic Hamiltonian in the B → ∞ limit is an open problem. Acknowledgements. J.P. Solovej and J. Yngvason were supported in part by the EU TMR-grant FMRX-CT 96-0001. J.P.S. was also supported in part by MaPhySto – Centre for Mathematical Physics and Stochastics, funded by a grant from The Danish National Research Foundation, and by a grant from the Danish Natural Science Research Council.

Appendix

We prove here that the ground state energy $\bar e(Z,N)$ of the Hamiltonian (65) is independent of N if N ≥ 2Z + 1.

Proposition (Maximal negative ionization for the comparison model). If N ≥ 2Z + 1, then
\[ \bar e(Z,N) = \bar e(Z,N_o), \tag{97} \]
where $N_o$ is the largest integer strictly smaller than 2Z + 1. Moreover, there is then no $L^2$-function with $\bar e(Z,N)$ as an eigenvalue.

Proof. In the cone M defined by (66) we consider the wave function
\[ \check\psi(z_1, \dots, z_N) = \prod_{i=1}^{N_o} e^{-\kappa_i z_i} \prod_{j=N_o+1}^{N} (1 - \kappa_j z_j), \tag{98} \]
with $\kappa_n$ defined by (68). Since $\kappa_j \le 0$ for $j \ge N_o + 1$, the function $\check\psi$ is strictly positive. We extend $\check\psi$ symmetrically from M to all of $\mathbb{R}^N$ as a continuous function. The jumps in the logarithmic derivatives of $\check\psi$ at the boundary of M are not of the right size required for an eigenfunction of $\bar h_{Z,N}$. But $\check\psi$ is an eigenfunction of a slightly different operator:
\[ \check h_{Z,N}\check\psi = \check e(Z,N)\check\psi \tag{99} \]
with
\[ \check e(Z,N) = -\sum_{i=1}^{N_o} \kappa_i^2 = \bar e(Z,N_o) \tag{100} \]


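and
\[ \check h_{Z,N} = \sum_{i=1}^N \big(p_i^2 - \delta(z_i)\big) + \frac{1}{2Z}\sum_{i<j} \gamma_{i,j}(z_1, \dots, z_N)\,\big[\delta(z_i - z_j) + \delta(z_i + z_j)\big] \tag{101} \]
with certain functions $\gamma_{i,j}$. It is sufficient to specify $\gamma_{i,i+1}(z_1, \dots, z_N)$ on the boundary of M (other cases follow by permutation and/or reflection of the variables) and one finds for $0 \le z_1 \le z_2 \le \dots \le z_N$:
\[ \gamma_{i,i+1} = \begin{cases} 1 & \text{if } 1 \le i \le N_o - 1, \\ 4Z\big[\kappa_{N_o} + |\kappa_{N_o+1}|\,(1 + |\kappa_{N_o+1}|\,z_{N_o})^{-1}\big] & \text{if } i = N_o, \\ (1 + |\kappa_i| z_i)^{-1}(1 + |\kappa_{i+1}| z_{i+1})^{-1} & \text{if } N_o + 1 \le i \le N. \end{cases} \tag{102} \]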

Since $\gamma_{i,i+1} \le 1$ for all i one has
\[ \check h_{Z,N} \le \bar h_{Z,N}. \tag{103} \]
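The bound $\gamma_{i,i+1} \le 1$ can be read off from (102) case by case. The following is a sketch of that check, using only $z_i \ge 0$ on M, $\kappa_{N_o+1} \le 0$, and the recursion $\kappa_i - \kappa_{i+1} = 1/(4Z)$ from (68):

```latex
\begin{align*}
1 \le i \le N_o - 1:&\quad \gamma_{i,i+1} = 1, \\
i = N_o:&\quad \gamma_{N_o,N_o+1}
   \le 4Z\big[\kappa_{N_o} + |\kappa_{N_o+1}|\big]
   = 4Z\big[\kappa_{N_o} - \kappa_{N_o+1}\big] = 1, \\
i \ge N_o + 1:&\quad \gamma_{i,i+1}
   = (1 + |\kappa_i| z_i)^{-1}(1 + |\kappa_{i+1}| z_{i+1})^{-1} \le 1 .
\end{align*}
```

Since the pair interactions in (101) are repulsive, replacing each $\gamma_{i,j}$ by 1 can only raise the quadratic form, which is the content of (103).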

Since $\check\psi$ is strictly positive we can in the same way as in (71) write
\[ \check h_{Z,N} = \sum_{n=1}^N \check A_n^* \check A_n + \check e(Z,N) \tag{104} \]
with
\[ \check A_n = \partial_{z_n} - \partial_{z_n}(\ln\check\psi), \tag{105} \]
and conclude that $\check e(Z,N) = \bar e(Z,N_o)$ is, indeed, the ground state energy of $\check h_{Z,N}$. Hence, $\bar e(Z,N) = \bar e(Z,N_o)$ for N ≥ 2Z + 1. To see that there are no bound states at the bottom of the spectrum of $\bar h_{Z,N}$ assume ψ is an eigenfunction to eigenvalue $\bar e(Z,N)$, so that
\[ \langle \psi, \bar h_{Z,N}\psi\rangle = \bar e(Z,N)\,\|\psi\|^2. \tag{106} \]
By (103) and the equality of the ground state energies this implies
\[ \langle \psi, \check h_{Z,N}\psi\rangle = \check e(Z,N)\,\|\psi\|^2, \tag{107} \]
which, because of (104), is equivalent to the set of differential equations
\[ \check A_n\psi = 0. \tag{108} \]
These equations have no other solutions than $c\check\psi$, and $\check\psi$ is not an $L^2$ function.


References

[LSYa] Lieb, E.H., Solovej, J.P. and Yngvason, J.: Asymptotics of Heavy Atoms in High Magnetic Fields: I. Lowest Landau Band Regions. Commun. Pure Appl. Math. 52, 513–591 (1994)
[RWHG] Ruder, H., Wunner, G., Herold, H. and Geyer, D.: Atoms in Strong Magnetic Fields. Berlin–Heidelberg–New York: Springer-Verlag, 1994
[AHS] Avron, J.E., Herbst, I.W. and Simon, B.: Schrödinger Operators with Magnetic Fields III. Atoms in Homogeneous Magnetic Field. Commun. Math. Phys. 79, 529–572 (1981)
[FWa] Froese, R. and Waxler, R.: The spectrum of a hydrogen atom in an intense magnetic field. Rev. in Math. Phys. 6, 699–832 (1994)
[FWb] Froese, R. and Waxler, R.: Ground state resonances of a hydrogen atom in an intense magnetic field. Rev. in Math. Phys. 7, 311–361 (1994)
[LSYb] Lieb, E.H., Solovej, J.P. and Yngvason, J.: Asymptotics of Heavy Atoms in High Magnetic Fields: II. Semiclassical Regions. Commun. Math. Phys. 161, 77–124 (1994)
[I] Ivrii, V.: Asymptotics of the ground state energy of heavy molecules in the strong magnetic field. I. Russ. J. Math. Phys. 4, 29–74 (1996); II, Russ. J. Math. Phys. 5, 321–354 (1997)
[S] Spruch, L.: A Report on Some Few-Body Problems in Atomic Physics. In: Few Body Dynamics, A.N. Mitra et al., eds., Amsterdam: North-Holland, 1976, pp. 715–725
[JY] Johnsen, K., Yngvason, J.: Density-matrix calculations for matter in strong magnetic fields: Ground states of heavy atoms. Phys. Rev. A 54, 1936–1946 (1996)
[L] Lieb, E.H.: Bound on the maximum negative ionization of atoms and molecules. Phys. Rev. A 29, 3018–3028 (1984)
[BR] Brummelhuis, R., Ruskai, M.B.: A One-Dimensional Model for Many-Electron Atoms in Extremely Strong Magnetic Fields: Maximum Negative Ionization. math-ph/99025, J. Phys. A, in press (1999)
[WS] White, R.J., Stillinger, F.H. Jr.: Analytic Approach to Electron Correlation in Atoms. J. Chem. Phys. 52, 5800–5814 (1970)
[Ro] Rosenthal, C.M.: Solution of the Delta Function Model for Heliumlike Ions. J. Chem. Phys. 55, 2474–2483 (1971)
[HLT] Hertel, P., Lieb, E.H., Thirring, W.: Lower bound to the energy of complex atoms. J. Chem. Phys. 62, 3355–3356 (1975)
[B99] Baumgartner, B.: Monotonicity in the approach to mean field limits. University Vienna preprint UWThPh-1999-59

Communicated by B. Simon

Commun. Math. Phys. 212, 725 – 744 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

On Brillouin Zones

J. J. P. Veerman¹, Mauricio M. Peixoto², André C. Rocha³, Scott Sutherland⁴

1 Mathematics Department, Pennsylvania State University, University Park, PE, USA. E-mail: [email protected]
2 Instituto de Matemática Pura e Aplicada, Rio de Janeiro, Brazil. E-mail: [email protected]
3 Departamento de Matemática, Universidade Federal de Pernambuco, Recife, Brazil.
4 Institute for Mathematical Sciences and Mathematics Department, State University of New York, Stony Brook, NY, USA. E-mail: [email protected]

Received: 7 August 1998 / Accepted: 22 March 2000

Abstract: Brillouin zones were introduced by Brillouin [Br] in the thirties to describe quantum mechanical properties of crystals, that is, in a lattice in $\mathbb{R}^n$. They play an important role in solid-state physics. It was shown by Bieberbach [Bi] that Brillouin zones tile the underlying space and that each zone has the same area. We generalize the notion of Brillouin zones to apply to an arbitrary discrete set in a proper metric space, and show that analogs of Bieberbach's results hold in this context. We then use these ideas to discuss focusing of geodesics in spaces of constant curvature. In the particular case of the Riemann surfaces $\mathbb{H}^2/\Gamma(k)$ (k = 2, 3, or 5), we explicitly count the number of geodesics of length t that connect the point i to itself.

1. Introduction

In solid-state physics, the notion of Brillouin zones is used to describe the behavior of an electron in a perfect crystal. In a crystal, the atoms are often arranged in a lattice; for example, in NaCl, the sodium and chlorine atoms are arranged along the points of the simple cubic lattice $\mathbb{Z}^3$. If we pick a specific atom and call it the origin, its first Brillouin zone consists of the points in $\mathbb{R}^3$ which are closer to the origin than to any other element of the lattice. This same zone can be constructed as follows: for each element a in the lattice, let $L_{0a}$ be the perpendicular bisecting plane of the line between 0 and a (this plane is called a Bragg plane). The volume about the origin enclosed by these intersecting planes is the first Brillouin zone, $b_1(0)$. This construction also allows us to define the higher Brillouin zones as well: a point x is in $b_n$ if the line connecting it to the origin crosses exactly n − 1 planes $L_{0a}$, counted with multiplicity. This notion was introduced by Brillouin in the 1930s ([Br]), and plays an important role in solid-state theory (see, for example, [AM, Jo2, Ti]).
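The counting rule for higher zones can be sketched in code for the square lattice $\mathbb{Z}^2$. The segment from the origin to x crosses the Bragg line $L_{0a}$ exactly when x is strictly closer to a than to the origin, which is equivalent to $|a|^2 < 2\,a\cdot x$. The function below is a minimal illustration under the assumption that x is generic, i.e. lies on none of the bisectors; its name is a hypothetical helper, not notation from the paper.

```python
import math


def brillouin_zone_index(x, y):
    """Zone index n of a generic point (x, y) for the lattice Z^2, base point 0.

    n - 1 is the number of nonzero lattice points a that are strictly closer
    to (x, y) than the origin is, i.e. that satisfy |a|^2 < 2 a.(x, y).
    """
    r = math.hypot(x, y)
    # Any such lattice point lies within distance r of (x, y), hence |a| <= 2r.
    bound = int(math.ceil(2.0 * r)) + 1
    crossed = 0
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if a == 0 and b == 0:
                continue
            if a * a + b * b < 2.0 * (a * x + b * y):
                crossed += 1
    return crossed + 1
```

For instance, the point (0.1, 0.2) lies in the first zone, (0.6, 0.1) in the second (only the bisector of 0 and (1, 0) is crossed), and (0.6, 0.7) in the fourth (the bisectors for (1, 0), (0, 1) and (1, 1) are crossed).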
The construction which gives rise to Brillouin zones is not limited to consideration of crystals, however. For example, in computational geometry, the notion of the Voronoi cell corresponds exactly to the first Brillouin zone described above (see [PS]). We shall also see below how, after suitable


Fig. 1.1. On the left are the Brillouin zones for the lattice Z2 in R2 . On the right is the outer boundary of the third Brillouin zone for the lattice Z3 in R3

generalization, this construction coincides with the Dirichlet domain of Riemannian geometry, and in many cases, with the focal decomposition introduced in [Pe1] (see also [Pe3]). With some slight hypotheses (see Sect. 2), we generalize the construction of Brillouin zones to any discrete set S in a path-connected, proper metric space X. We generalize the Bragg planes above as mediatrices, defined here.

Definition 1.1. For a and b distinct points in S, define the mediatrix (also called the equidistant set or bisector) $L_{ab}$ of a and b as:
\[ L_{ab} = \{x \in X \mid d(x, a) = d(x, b)\}. \]
Now choose a preferred point $x_0$ in S, and consider the collection of mediatrices $\{L_{x_0,s}\}_{s\in S}$. These partition X into Brillouin zones as above: roughly, the nth Brillouin zone $B_n(x_0)$ consists of those points in X which are accessible from $x_0$ by crossing exactly n − 1 mediatrices. (There is some difficulty accounting for multiple crossings; see Def. 2.6 for a precise statement.) One basic property of the zones $B_n$ is that they tile the space X:
\[ \bigcup_{x_i\in S} B_n(x_i) = X \quad\text{and}\quad B_n(x_0) \cap B_n(x_1) \text{ is small}. \]

Here, with some extra hypotheses, “small” means of measure zero. Furthermore, again with some extra hypothesis, each zone Bn has the same area. (This property was “obvious” to Brillouin.) Both results were proved by Bieberbach in [Bi] in the case of a lattice in R2 . Indeed, he proves (as we do) that each zone forms a fundamental set for the group action of the lattice. His arguments rely heavily on planar Euclidean geometry, although he remarks that his considerations work equally well in Rd and can be extended to “groups of motions in non-Euclidean spaces”. In [Jo1], Jones proves these results for lattices in Rd , as well as giving asymptotics for both the distance from Bn to the basepoint, and for the number of connected components of the interior of Bn . In Sect. 2, we show that the tiling result holds for arbitrary discrete sets in a metric space.


If the discrete set is generated by a group of isometries, we show that each $B_n$ forms a fundamental set, and consequently all have the same area (see Prop. 2.10).

We now discuss the relationship of Brillouin zones and focal decomposition of Riemannian manifolds. If $x_1(t)$ and $x_2(t)$ are two solutions of a second order differential equation with $x_1(0) = x_2(0)$ and there is some T ≠ 0 so that $x_1(T) = x_2(T)$, then the trajectories $x_1$ and $x_2$ are said to focus at time T. One can ask how the number of trajectories which focus varies with the endpoint x(T); this gives rise to the concept of a focal decomposition (originally called a sigma decomposition). This concept was introduced in [Pe1] and has important applications in physics, for example when computing the semiclassical quantization using the Feynman path integral method (see [Pe3]). There is also a connection with the arithmetic of positive definite quadratic forms (see [Pe2, KP, Pe3]). Brillouin zones have a similar connection with arithmetic, as can be seen in Sect. 4 as well as [Pe3]. More specifically, consider the two-point boundary problem
\[ \ddot x = f(t, x, \dot x), \qquad x(t_0) = x_0, \qquad x(t_1) = x_1, \qquad x, t, \dot x, \ddot x \in \mathbb{R}. \]
Associated with this equation, there is a partition of $\mathbb{R}^4$ into sets $\Sigma_k$, where a point $(x_0, x_1, t_0, t_1)$ is in $\Sigma_k$ if there are exactly k solutions which connect $(x_0, t_0)$ to $(x_1, t_1)$. This partition is the focal decomposition with respect to the boundary value problem. In [PT], several explicit examples are worked out, in particular the fundamental example of the pendulum $\ddot x = -\sin x$. Also, using results of Hironaka ([Hi]) and Hardt ([Ha]), the possibility of a general, analytic theory was pointed out. In particular, under very general hypotheses, the focal decomposition yields an analytic Whitney stratification.

Later, in [KP], the idea of focal decomposition was approached in the context of geodesics of a Riemannian manifold M (in addition to a reformulation of the main theorem of [PT]). Here, one takes a basepoint $x_0$ in the manifold M: two geodesics $\gamma_1$ and $\gamma_2$ focus at some point y ∈ M if $\gamma_1(T) = y = \gamma_2(T)$. This gives rise to a decomposition of the tangent space of M at x into regions where the same number of geodesics focus.

In order to study focusing of geodesics on a manifold (M, g) with metric g via Brillouin zones, we do the following. Choose a base-point $p_0$ in M and construct the universal cover X, lifting $p_0$ to a point $x_0$ in X. Let γ be a smooth curve in M with initial point $p_0$ and endpoint p. Lift γ to $\tilde\gamma$ in X with initial point $x_0$. Its endpoint will be some $x \in \pi^{-1}(p)$. The metric g on M is lifted to a metric $\tilde g$ on X by setting $\tilde g = \pi^* g$. Under the above conditions, the group G of deck transformations is discontinuous and so $\pi^{-1}(p_0) \subset X$ is a discrete set. One can ask how many geodesics of length t there are which start at $p_0$ and end in p, or translated to $(X, \tilde g)$, this becomes: How many mediatrices $L_{x_0,s}$ intersect at x, as s ranges over $\pi^{-1}(p_0)$?

Notice that if the universal cover of M coincides with the tangent space $TM_x$, the focal decomposition of [KP] and that given by Brillouin zones will be the same.
If the universal cover and the tangent space are homeomorphic (as is the case for a manifold of constant negative curvature), the two decompositions are not identical, but there is a clear correspondence. However, if the universal cover of the manifold is not homeomorphic to the tangent space at the base point, the focal decomposition and that given by constructing Brillouin zones in the universal cover are completely different. For example, let $M$ be $S^n$, and let $x$ be any point in it. The focal decomposition with respect to $x$ gives a collection of nested $(n-1)$-spheres centered at $x$; on each of these infinitely many geodesics focus (each sphere is mapped by the exponential to either $x$ or its antipodal point). Between


J. J. P. Veerman, M. M. Peixoto, A. C. Rocha, S. Sutherland

the spheres are bands in which no focusing occurs (see [Pe3]). However, using the construction outlined in the previous paragraph gives a very different result. Since $S^n$ is simply connected, it is its own universal cover. There is only one point in our discrete set, and so the entire sphere $S^n$ is in the first zone $B_1$.

The organization of this paper is as follows. In Sect. 2, we set up the general machinery we need, and prove the main theorems in the context of a discrete set $S$ in a proper metric space. Section 3 explores this in the context of manifolds of constant curvature. The universal cover is $\mathbb{R}^n$, $S^n$, or $\mathbb{H}^n$, and the group $G$ of deck transformations is a discrete group of isometries (see, for example, [doC]). The discrete set $S$ is the orbit of a point not fixed by any element of $G$ under this discontinuous group. It is easy to see that the mediatrices in this case are totally geodesic spaces. From the basic property explained above, one can deduce that the $n$th Brillouin zone is a fundamental region for the group $G$ in $X$. In Sect. 4, we calculate exactly the number of geodesics of length $t$ that connect the origin to itself in two cases: the flat torus $\mathbb{R}^2/\mathbb{Z}^2$ and the Riemann surfaces $\mathbb{H}^2/\Gamma(p)$, for $p \in \{2, 3, 5\}$. While these calculations could, of course, be done independently of our construction, we find that the Brillouin zones help visualize the process. In the final section, we give a nontrivial example in the case of a non-Riemannian metric, and mention a connection to the question of how many integer solutions there are to the equation $a^k + b^k = n$, for fixed $k$.

2. Definitions and Main Results

In this section, we prove that under very general conditions, Brillouin zones tile (as defined below) the space in which they are defined, generalizing an old result of Bieberbach [Bi]. With stronger assumptions, we prove that these tiles are fairly well-behaved sets (see Prop. 2.13).

Notation.
Throughout this paper, we shall assume $X$ is a path connected, proper (see below) metric space (with metric $d(\cdot, \cdot)$). We will make use of the following notation:
– Write an open $r$-neighborhood of a point $x_0$ as $N_r(x_0) = \{x \in X \mid d(x_0, x) < r\}$.
– Define the circumference as $C_r(x_0) = \{x \in X \mid d(x_0, x) = r\}$.
– The closed disk of radius $r$, denoted by $D_r(x_0) = \{x \in X \mid d(x_0, x) \le r\}$, is their union.

Definition 2.1. A metric space $X$ is proper if the distance function $d(x, \cdot)$ is a proper map for every fixed $x \in X$. In particular, for every $x \in X$ and $r > 0$, the closed ball $D_r(x)$ is compact. Such a metric space is also sometimes called a geometry (see [Ca]).

Note that if $X$ is a proper, path-connected metric space, it is locally compact and complete. By the Hopf-Rinow Theorem, the converse also holds if $X$ is a geodesic metric space, also called a “length space” (see [Gr]). A metric space is a length space if the distance between any two points coincides with the infimum of the lengths of curves joining them. Although the notions do not quite coincide, metrically consistent spaces (defined below) are closely related to length spaces.

Definition 2.2. The space $X$ is called metrically consistent if, for all $x$ in $X$, all $R > r > 0$ in $\mathbb{R}$ with $r$ sufficiently small, and for each $a \in C_R(x)$, there is a $z \in C_r(x)$ satisfying $N_{d(z,a)}(z) \subseteq N_R(x)$ and $C_{d(z,a)}(z) \cap C_R(x) = \{a\}$.

On Brillouin Zones


Metric consistency ensures some regularity properties, which we need to use only in Proposition 2.13. We note that every Riemannian metric space is metrically consistent.

Any mediatrix $L_{ab}$ separates $X$, that is, $X \setminus L_{ab}$ contains at least two components (one containing the point $a$ and the other $b$). Another regularity condition that we will sometimes want is for the complement of $L$ to have exactly two components, and for $L$ to be minimal:

Definition 2.3. We say that the mediatrix $L_{ab}$ is minimally separating if for any subset $\hat{L} \subset L_{ab}$ with $\hat{L} \neq L_{ab}$, the set $X \setminus \hat{L}$ has one component.

We will use the notation

$$L^-_{0a} = \{x \in X \mid d(0, x) - d(a, x) < 0\} \quad \text{and} \quad L^+_{0a} = \{x \in X \mid d(0, x) - d(a, x) > 0\}$$

for the two components of $X \setminus L_{0a}$; we will sometimes omit the subscripts and just use $L^+$ and $L^-$. Note that a minimally separating set $L$ is contained both in the closure of $L^-$ and in the closure of $L^+$. To see this, let $V$ be an open set contained in $L$. Then $L^- \cup V$ and $L^+$ are disjoint open sets. Consequently, $L \setminus (L \cap V)$ separates $X$, which contradicts the minimality of $L$.

Usually, there will be a discrete set of points $S = \{x_i\}_{i \in I}$ in $X$ which will be of interest. By discrete we mean that any compact subset of $X$ contains finitely many points of $S$. Note that if $\liminf_{a,b \in S} d(a, b) > 0$, then $S$ is discrete.

Definition 2.4. We say a proper, path connected metric space $X$ is Brillouin if it satisfies the following conditions:
1. $X$ is metrically consistent.
2. For all $a, b$ in $X$, the mediatrices $L_{ab}$ are minimally separating sets.
The second condition in the above definition may be weakened to apply only to those mediatrices $L_{ab}$ where $a$ and $b$ are in $S$. In this case, we will say that $X$ is Brillouin over $S$, if it is not obvious from the context.

Fig. 2.1. The set L(0,0),(a,a) contains two quarter-planes

Fig. 2.2. L(0,0),(4,6) (thin solid line) and L(0,0),(2,4) (thick grey line) have open segments in common

Example 2.5. Equip $\mathbb{R}^2$ with the “Manhattan metric”, that is, $d(p, q) = |p_1 - q_1| + |p_2 - q_2|$. The Manhattan metric is not metrically consistent: a circle $C_r(p)$ is a diamond of side length $r\sqrt{2}$ centered at $p$, and the definition fails because $C_{d(z,a)}(z) \cap C_R(x)$ is a segment rather than a point. Neither are the mediatrices minimally separating: if the coordinates of a point $a$ are equal, then $L_{0a}$ consists of a line segment and two quarter-planes (see Fig. 2.1). Even if the discrete set $S$ contains no such points, we can still run into


Fig. 2.3. The mediatrices $L_{0a}$ for $\mathbb{R}^2$ with the Manhattan metric and $a$ in the lattice $(m, n\sqrt{2})$

strange situations. For example, the mediatrices $L_{(0,0),(2,4)}$ and $L_{(0,0),(4,6)}$ both contain the ray $\{(t, 1) \mid t \ge 4\}$ (see Fig. 2.2). But, if we are careful, we can avoid this. If $(0, 0)$ is the basepoint, we must have that for all pairs $(a_1, a_2)$ and $(b_1, b_2)$ in $S$, $a_1 - a_2 \neq b_1 - b_2$. For example, take $S$ to be an irrational lattice such as $\{(m, n\sqrt{2}) \mid m, n \in \mathbb{Z}\}$. (From this example, we see that to do well in Manhattan, one should be carefully irrational.) It is interesting to note that while this example is not metrically consistent and hence not Brillouin, all the conclusions of this section (in particular, Prop. 2.13) still hold.

As mentioned in the introduction, for each $x_0 \in S$, the mediatrices $L_{x_0 a}$ give a partition of $X$. Informally, those elements of the partition which are reached by crossing $n - 1$ mediatrices from $x_0$ form the $n$th Brillouin zone, $B_n(x_0)$. This definition is impractical, in part because a path may cross several mediatrices simultaneously, or the same mediatrix more than once. Instead, we will use a definition given in terms of the number of elements of $S$ which are nearest to $x$. In many cases, this definition is equivalent to the informal one. See the remarks at the end of this section for more details. We use the notation $\#(S)$ to denote the cardinality of the set $S$.

Definition 2.6. Let $x \in X$, let $n$ be a positive integer, $n \le \#(S)$, and let $r = d(x, x_0)$. Then define the sets $b_n(x_0)$ and $B_n(x_0)$ as follows:
– $x \in b_n(x_0) \iff \#(N_r(x) \cap S) = n - 1$ and $C_r(x) \cap S = \{x_0\}$.
– $x \in B_n(x_0) \iff \#(N_r(x) \cap S) = m$ and $\#(C_r(x) \cap S) = \ell \ge 1$, where $\ell, m \in \mathbb{Z}^+$ with $m + 1 \le n \le m + \ell$.
Here the point $x_0$ is called the base point, and the set $B_n(x_0)$ is the $n$th Brillouin zone with base point $x_0$.

Note that in the second part, if $m = n - 1$ and $\ell = 1$, then $x \in b_n(x_0)$. So $b_n(x_0) \subseteq B_n(x_0)$. Note also that the complement of $b_n(x_0)$ in $B_n(x_0)$ consists of subsets of mediatrices (see Def. 1.1). Note also that $b_n(x_0)$ is open and that $B_n(x_0)$ is closed. Finally, observe that for fixed $x_0$ the sets $b_n(x_0)$ are disjoint, but the sets $B_n(x_0)$ are not.


Fig. 2.4. Here we illustrate the definition of the sets $b_n(x_0)$ and $B_n(x_0)$ for the lattice $\mathbb{Z}^2$ in $\mathbb{R}^2$. In both pictures, the circle $C_{d(x,x_0)}(x)$ is drawn, and the basepoint $x_0$ lies in the center of the square at the lower left. On the left side, the point $x$ (marked by a small cross) lies in $b_5$, and $\#(N_r(x) \cap S) = 4$, while $x_0$ is the only point of $S$ on the circle. On the right, we have $m = 4$ and $\ell = 8$, so $x$ lies in all of the sets $B_5, B_6, \ldots, B_{12}$
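Definition 2.6 can be evaluated mechanically on a finite piece of a discrete set. The following is a minimal Python sketch, not part of the original paper. It assumes the Euclidean plane, exact rational coordinates, and a finite window of the lattice $\mathbb{Z}^2$ that is large enough to contain the closed disk $D_r(x)$. The names `LATTICE`, `zone_counts`, `zones_containing` and `in_open_zone` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def sq_dist(p, q):
    """Exact squared Euclidean distance between two points with rational coordinates."""
    return sum((Fraction(a) - Fraction(b)) ** 2 for a, b in zip(p, q))

def zone_counts(x, x0, S):
    """Return (m, l) with m = #(N_r(x) ∩ S) and l = #(C_r(x) ∩ S), where r = d(x, x0)."""
    r2 = sq_dist(x, x0)
    m = sum(1 for s in S if sq_dist(x, s) < r2)
    l = sum(1 for s in S if sq_dist(x, s) == r2)
    return m, l

def zones_containing(x, x0, S):
    """All n with x in B_n(x0), i.e. m + 1 <= n <= m + l (x0 is assumed to lie in S)."""
    m, l = zone_counts(x, x0, S)
    return list(range(m + 1, m + l + 1))

def in_open_zone(x, x0, S, n):
    """True if x lies in b_n(x0): n - 1 points strictly closer, and only x0 on the circle."""
    return zone_counts(x, x0, S) == (n - 1, 1)

# Assumption: a finite window of Z^2; it must contain the disk D_r(x) for the points tested.
LATTICE = [(i, j) for i in range(-5, 6) for j in range(-5, 6)]

# A cell center whose circle through the base point carries 8 lattice points
# and encloses 4, as in the right-hand picture of Fig. 2.4.
x = (Fraction(-1, 2), Fraction(3, 2))
print(zones_containing(x, (0, 0), LATTICE))
```

For the point chosen here, $m = 4$ and $\ell = 8$, so the sketch reports the zones $B_5, \ldots, B_{12}$, in agreement with the caption of Fig. 2.4. Squared distances are compared exactly so that points on the circle $C_r(x)$ are not lost to rounding.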

The following lemma, which follows immediately from Def. 2.6, explains a basic feature of the zones, namely that they are concentric in a weak sense. This property is also apparent from the figures.

Lemma 2.7. Any continuous path from $x_0$ to $B_n(x_0)$ intersects $B_{n-1}(x_0)$.

The Brillouin zones actually form a covering of $X$ by non-overlapping closed sets in various ways. This is proved in parts. The next two results assert that the zones $B$ cover $X$, but the zones $b$ do not. The first of these is an immediate consequence of the definitions. The second is more surprising and ultimately leads to Corollary 3.5, the generalization of Bieberbach’s “equal area” result.

Lemma 2.8. For fixed $n$ the Brillouin zones tile $X$ in the following sense:

$$\bigcup_i B_i(x_n) = X \quad \text{and} \quad b_i(x_n) \cap b_j(x_n) = \emptyset \ \text{ if } i \neq j.$$

In addition, $B_i(x_n) \cap b_j(x_n) = \emptyset$ if $i \neq j$.

Theorem 2.9. Let $X$ be a proper, path-connected metric space and let $S = \{x_i\}_{i \in I}$ be a discrete set. Then, for fixed $n \le \#(S)$, the sets $\{B_n(x_i)\}_{i \in I}$ tile $X$ in the following sense:

$$\bigcup_i B_n(x_i) = X \quad \text{and} \quad b_n(x_i) \cap b_n(x_j) = \emptyset \ \text{ if } i \neq j.$$
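The tiling statement of Theorem 2.9 can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not the authors' method: it uses the Euclidean plane, exact rational arithmetic, and a finite truncation of the set $S = \{(m, 0)\} \cup \{(0, n)\}$ from Fig. 2.5 (a finite set is itself discrete, so the theorem applies to it directly). The helper names `zone_data`, `closed_bases`, `open_bases`, `nth_nearest` and `check_tiling` are hypothetical.

```python
from fractions import Fraction
import random

def sq_dist(p, q):
    """Exact squared Euclidean distance between two points with rational coordinates."""
    return sum((Fraction(a) - Fraction(b)) ** 2 for a, b in zip(p, q))

def zone_data(x, S):
    """For every base point xi in S return (xi, m, l) as in Definition 2.6 with r = d(x, xi)."""
    dists = [sq_dist(x, s) for s in S]
    data = []
    for xi, r2 in zip(S, dists):
        m = sum(1 for t in dists if t < r2)   # points of S strictly inside the circle
        l = sum(1 for t in dists if t == r2)  # points of S on the circle
        data.append((xi, m, l))
    return data

def closed_bases(x, S, n):
    """All base points xi with x in B_n(xi)."""
    return [xi for xi, m, l in zone_data(x, S) if m + 1 <= n <= m + l]

def open_bases(x, S, n):
    """All base points xi with x in b_n(xi)."""
    return [xi for xi, m, l in zone_data(x, S) if (m, l) == (n - 1, 1)]

def nth_nearest(x, S, n):
    """The n-th nearest point of S to x (ties broken arbitrarily), as in the proof."""
    return sorted(S, key=lambda s: sq_dist(x, s))[n - 1]

# Assumption: a finite truncation of the cross-shaped set of Fig. 2.5.
S_CROSS = sorted({(m, 0) for m in range(-6, 7)} | {(0, k) for k in range(-6, 7)})

def check_tiling(S, n, trials=200, seed=0):
    """Check on random rational points that the B_n(xi) cover and the b_n(xi) are disjoint."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = (Fraction(rng.randint(-20, 20), 10), Fraction(rng.randint(-20, 20), 10))
        closed = closed_bases(x, S, n)
        if not closed:                          # x must lie in some B_n(xi)
            return False
        if len(open_bases(x, S, n)) > 1:        # x lies in at most one b_n(xi)
            return False
        if nth_nearest(x, S, n) not in closed:  # the witness used in the proof
            return False
    return True

print(all(check_tiling(S_CROSS, n) for n in (1, 2, 3)))
```

The third check mirrors the argument of the proof: after sorting $S$ by distance from $x$, the $n$th nearest point is always a base point $x_i$ with $x \in B_n(x_i)$.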


Fig. 2.5. This example illustrates Lemma 2.8 and Theorem 2.9. Let S be the discrete set {(m, 0)} ∪ {(0, n)} , m, n ∈ Z in the Euclidean plane. On the left is the tiling given by Bi (0, 0) and in the middle is the tiling by Bi (2, 0). In both cases, b2 is shaded. On the right is the tiling given by B2 (xi ) as in Thm. 2.9. The sets b2 (0, 0), b2 (1, 0), and b2 (2, 0) have been shaded. Note that this S does not correspond to a group, nor does it satisfy the hypotheses of Prop. 2.10, because there are no isometries which permute S and do not fix the origin

Proof. First, we show that for any fixed $n > 0$ and each $x \in X$, there is an $x_i \in S$ with $x \in B_n(x_i)$. Re-index $S$ so that if $S = \{x_1, x_2, x_3, \ldots\}$ and $i < j$, then $d(x, x_i) \le d(x, x_j)$. This can be done: since $S$ is a discrete subset and closed balls $D_c(x_i)$ are compact, the subsets of $S$ with $d(x, x_i) \le c$ are all finite. Let $r_i = d(x, x_i)$. We will show that $x \in B_n(x_n)$. Note that $r_n \ge r_{n-1}$. Suppose first that $r_n > r_{n-1}$. Then $N_{r_n}(x) \cap S$ contains exactly $n - 1$ points, and $x_n \in C_{r_n}(x) \cap S$. Thus $x \in B_n(x_n)$. Note that if $r_{n+1} > r_n$, then we would have $x \in b_n(x_n) \subset B_n(x_n)$. If, on the other hand, $r_n = r_{n-1}$, then there is a largest $k > 0$ so that $r_n = r_{n-1} = \ldots = r_{n-k}$, and so $\#(N_{r_n}(x) \cap S) = n - k - 1 \le n - 1$. But then $\#(C_{r_n}(x) \cap S) \ge k + 1$, and hence $x \in B_n(x_n)$ as desired.

For the second part, we show that $b_n(x_i) \cap b_n(x_j) = \emptyset$. If not, then there is a point $x$ in their intersection. If $r_i = r_j$, then $x_i = x_j$, because by the definition of $b_n(x_k)$, $\{x_k\} = C_{r_k}(x) \cap S$. If not, then we may assume $r_i < r_j$. In this case, $x_i \in D_{r_i}(x) \subset N_{r_j}(x)$. Thus, since $\#(N_{r_i}(x) \cap S) = n - 1$, $N_{r_j}(x)$ must contain at least $n$ points of $S$, a contradiction.

The next result indicates how this notion of tiling is related to the notion of a fundamental set.

Proposition 2.10. Let $S$ be a discrete set in a metric space $X$ as in Thm. 2.9. Suppose that for each $x_i$ in $S$ there is an isometry $g_i$ of $X$ such that $g_i(x_0) = x_i$, $g_i$ permutes $S$ and the only $g_i$ which leaves $x_0$ fixed is the identity. Then there is a set $F$ (the fundamental set), satisfying:

$$b_n(x_0) \subseteq F \subseteq B_n(x_0) \quad \text{with} \quad \bigcup_i g_i(F) = X \ \text{ and } \ g_i(F) \cap g_j(F) = \emptyset \ (i \neq j).$$

Proof. Suppose that $x \in b_n(x_0)$. From Def. 2.6 and the fact that the $g_i$ are isometries, we see that this is equivalent to $g_i(x) \in b_n(x_i)$. Thus $g_i(b_n(x_0)) = b_n(x_i)$. Now apply Theorem 2.9. A similar reasoning proves the statement for $B_n(x_0)$.

Remark 2.11. The fundamental set $F$ is not necessarily connected. Also, note that it follows from this proposition that $B_i(x_0)$ is scissors congruent to $B_j(x_0)$ (see [Sah] for


a discussion of scissors congruence). In particular, this implies immediately that the $B_i$ all have the same area. Note that this result does not hold if $S$ is not generated by a group of isometries. See, for example, Fig. 2.5.

In many examples, $B_n$ is the closure of $b_n$. However, this need not always be the case, even if we assume the space is Brillouin, as the example below shows. We will give additional, more involved examples in a forthcoming work.

Example 2.12. Let $X$ be the flat cylinder obtained by identifying opposite sides of the strip $\{z \mid -1 \le \mathrm{Re}(z) \le 1\}$ in the usual way. We will denote points in the cylinder by a corresponding complex number. Let $x_0 = 1$, $x_1 = i$, and $x_2 = -i$. Each mediatrix $L_i$ is a topological circle consisting of a pair of segments meeting at right angles. The first zone $b_1(x_0)$ is the part of the cylinder where $|\mathrm{Im}(z)| < |\mathrm{Re}(z)|$, and $B_1(x_0)$ is the closure of $b_1$. The second zone is the complement of $b_1$ in the cylinder, and $b_2$ is its interior. However, $B_3 = \{0\}$ and $b_3$ is empty. Note that in this example, $B_3$ is contained in the closures of $b_1$ and $b_2$.

Fig. 2.6. Brillouin zones for a three point discrete set as discussed in Example 2.12

Despite the fact that the zones $B_i$ are not always the closure of their interiors, if $X$ is a Brillouin space, the $B_i$ are still fairly well behaved sets, as the next proposition shows.

Proposition 2.13. If $X$ is Brillouin over $S$, then
(i) Interior points of $B_n(x_0)$ are in $b_n(x_0)$.
(ii) $B_n(x_0)$ is contained in the closure of $b_1(x_0) \cup \cdots \cup b_n(x_0)$.

Proof. Without loss of generality, we can restrict our attention to $B_n(x_0)$, which we will denote $B_n$ throughout the proof. Since $b_n \subset B_n$, with $b_n$ open and $B_n$ closed, it is obvious that $b_n \subseteq B_n$. Let $x$ be a point in $B_n \setminus b_n$. By Definition 2.6, $x \in B_{m+1} \cap B_{m+2} \cap \ldots \cap B_{m+\ell}$, with $\ell \ge 2$ and $m + 1 \le n \le m + \ell$. The point $x$ lies on the intersection of $\ell - 1$ mediatrices, that is, $C_{d(x,x_0)}(x) \cap S$ consists of $\ell$ points. Suppose $x$ is an interior point of $B_{m+d}$ for some $d \in \{1, \ldots, \ell\}$. Let $V$ be an arbitrary, small neighborhood of $x$, so that $V \subset B_{m+d}$. Continuity of the metric allows us to choose $y \in V$ such that $N_{d(y,x_0)}(y) \cap S$ contains $m$ points, and using metric consistency we can ensure that $C_{d(y,x_0)}(y) \cap S$ contains exactly one point, namely $x_0$. Thus, we have $y \in b_{m+1}$.


Suppose $x_s \neq x_0$ is a point in $C_{d(x,x_0)}(x) \cap S$. By the same reasoning as above, $V$ must contain a point $z$ such that $N_{d(z,x_s)}(z) \cap S$ contains $m$ points, and $C_{d(z,x_s)}(z) \cap S = \{x_s\}$. Thus $d(z, x_0) > d(z, x_s)$ and so $N_{d(z,x_0)}(z) \cap S$ contains at least $m + 1$ points. This implies that $z \in B_{m+2} \cup \cdots \cup B_{m+\ell}$. Since $y$ and $z$ are in $V \subset B_{m+d}$, we have that for some $d \ge 1$, $B_{m+d} \cap b_{m+1}$ and $B_{m+d} \cap (B_{m+2} \cup \cdots \cup B_{m+\ell})$ are both non-empty. In view of Lemma 2.8 this is a contradiction.

To prove the second statement, we start again by observing that if $x$ is a point in $B_n \setminus b_n$, then $x \in B_{m+1} \cap B_{m+2} \cap \ldots \cap B_{m+\ell}$. Exactly as above, we note that any neighborhood of $x$ contains points of $b_{m+1}$.

Remark 2.14. In practice, using Definition 2.6 directly can be unwieldy. It is typically easier to identify the various $b_n$ using the informal definition, counting the number of mediatrices crossed by a path which starts at $x_0$. Suppose $X$ is such that between $x_0$ and any point of $b_n$, one can find a path $\gamma$ so that if $L_i$ and $L_j$ are distinct mediatrices, then $\gamma \cap L_i \neq \gamma \cap L_j$. In this case, it follows immediately that a point is in $b_n$ if and only if such a path crosses exactly $n - 1$ mediatrices. If the path $\gamma$ crosses the same mediatrix more than once, we must use a signed notion of crossing. This allows us to account only for those crossings which are essential. However, such a process is not always possible; we cannot always push a path off a point where several mediatrices intersect.

One way around this is to adjust the definition of “cross”. As in [Pe1], we assign to each point $x$ its Brillouin index:

$$\beta(x) \equiv \max \{n \mid x \in B_n(x_0)\}.$$

From Lemma 2.8, we see that this is a well defined function which is constant on $b_n(x_0)$. If $L = L_{x_0,x_s}$ is a mediatrix, we say that $\gamma$ crosses $L$ if $\gamma(1) \in L^+_{x_0,x_s}$, the component containing $x_s$. (Recall that $\gamma(0) \in L^-_{x_0,x_s}$ by definition.) Notice that this definition only makes sense if $X - L$ has two components, which is always the case if $X$ is Brillouin.
With this definition of “cross”, there is always a path $\gamma$ from $x$ to $x_0$ which crosses exactly $n - 1$ mediatrices if and only if $x \in b_n(x_0)$.

3. Brillouin Zones in Spaces of Constant Curvature

In this section $X$ will be assumed to be one of $\mathbb{R}^n$, $S^n$, or $\mathbb{H}^n$, all equipped with the standard metric, and $G$ will denote a discontinuous group of isometries of $X$. Denote the quotient $X/G$ with the induced metric by $(M, g)$. Then the construction of lifting to the universal cover, as outlined in the introduction, applies naturally to $(M, g)$. In this section we describe focusing of geodesics in $(M, g)$ by Brillouin zones in $X$. The discrete set $S$ is given by the orbit of a chosen point in $X$ (which we will call the origin) under the group of deck-transformations $G$. The fact that the Brillouin zones are fundamental sets is now a direct corollary of Prop. 2.10.

The regularity conditions of Def. 2.4 are easily verified in the present context. We do this first.

Lemma 3.1. If $X$ is either $\mathbb{R}^n$, $S^n$, or $\mathbb{H}^n$, then a mediatrix $L_{ab}$ in $X$ is an $(n - 1)$-dimensional, totally geodesic subspace consisting of one component, and $X - L_{ab}$ has two components.


Proof. This is easy to see if we change coordinates by an isometry of $X$, putting $a$ and $b$ in a convenient position, say as $x$ and $-x$. The mediatrix $L_{x,-x}$ is easily seen to satisfy the conditions (in the case of $S^n$, it is the equator, and for the others, it is a hyperplane). The conclusion follows.

Proposition 3.2. All such spaces $X$ are Brillouin (see Definition 2.4).

Proof. As remarked before, the first condition is satisfied for any Riemannian metric. The second condition is also easy. It suffices to observe that the subspaces of Lemma 3.1 are minimally separating.

Remark 3.3. Note that in the Riemannian case, mediatrices always cross transversally. If $L_{0a}$ and $L_{0b}$ coincide in an open set, then their tangent spaces also coincide at some point. Uniqueness of solutions of second order differential equations then implies $L_{0a} = L_{0b}$.

Recall that a metric space $X$ is called rigid if the only isometry which fixes each point of a nonempty open subset of $X$ is the identity. It is not hard to see that $S^n$, $\mathbb{H}^n$, and $\mathbb{R}^n$ are rigid spaces. See [Ra] for more details of rigid metric spaces and for the proof of the following result. Recall that the stabilizer in $G$ of a point $x \in X$ consists of those elements of $G$ that fix $x$.

Proposition 3.4. Let $G$ be a discontinuous group of isometries of a rigid metric space $X$. Then there exists a point $y$ of $X$ whose stabilizer $G_y$ consists of the identity.

We now return to Brillouin zones as defined in the last section. Recall that $G$ is a group of isometries of $X$ that acts discontinuously on points in $X$. Let $x_0$ be a point in $X$ whose stabilizer under the action of $G$ is trivial. For any $x \in X$, let $[x_0, x]$ be a geodesic segment of minimal length whose endpoints are $x_0$ and $x$. Then $B_n(x_0)$, the $n$th Brillouin zone relative to $x_0$, is the set of points $x$ in $X$ such that the geodesic segment $[x_0, x]$ intercepts exactly $n - 1$ mediatrices $L_{x_0,y}$, where $y$ is in the orbit of $x_0$ under the group $G$.
Proposition 2.10 immediately implies the most important fact about Brillouin zones in this setting.

Corollary 3.5. Let $X$ be $\mathbb{R}^n$, $S^n$, or $\mathbb{H}^n$, and let $G$ be a discontinuous group of isometries of $X$. Let $x_0 \in X$ be such that its stabilizer $G_{x_0}$ under $G$ is trivial. Then for every positive integer $n$, the $n$th Brillouin zone $B_n(x_0)$ is a fundamental set for the action of $G$ on points in $X$. Its boundary is the union of pieces of totally geodesic subspaces and equals the boundary of its interior.

Remark 3.6. The above corollary is the generalization of Bieberbach’s main result on Brillouin zones [Bi]. The first Brillouin zone $B_1(0)$ is the usual Dirichlet fundamental domain for the action of $G$. Furthermore, even when $G_{x_0}$ is not trivial, $B_n(x_0)$ is a $k$-fold cover of a fundamental region.

As pointed out in the introduction, the number of geodesics that focus in a certain point is counted in the lift. So if a given point $x \in X$ is intersected by $n$ mediatrices, it is reached by $n + 1$ geodesics of length $d(0, x)$ emanating from the reference point (the origin). In the next section, we give more specific examples of this. Finally, we state a conjecture.

Conjecture 3.7. Let $(X, \tilde{g})$ be the universal cover of a $d$-dimensional smooth Riemannian manifold $(M, g)$ as described in the construction. For a generic metric $g$ on $M$, no more than $d$ mediatrices intersect in any given point $y$ of $X$.


Fig. 3.1. Brillouin zones for $PSL(2, \mathbb{Z})$ in the hyperbolic disk. We have transported the “usual” upper half-plane representation using the map $z \mapsto \frac{iz+1}{z+i}$. On the left are the sets $B_n(\frac{i}{4})$, which give fundamental sets as in Cor. 3.5. On the right, 0 is taken as a basepoint. Since the origin has a non-trivial stabilizer, the corresponding Brillouin zones give a double cover of the fundamental sets

This conjecture acquires perhaps even more interest (and certainly more structure), when one restricts the collection of metrics on $M$ to conformal ones ([Mas]). A result in this direction for $M = \mathbb{R}^2/\mathbb{Z}^2$ can be found in [Jo1].

4. Focusing in Two Riemannian Examples

In this section, we give two examples (one of them new as far as we know) of focusing. Suppose that at $t = 0$ geodesics start emanating in all possible directions from a point. At certain times $t_1, t_2, \ldots$, we may see geodesics returning to that point. We derive expressions for the number of geodesics returning at $t_n$ in two cases. First, as an introductory example we will discuss this for the case of the flat, square torus $M = \mathbb{R}^2/\mathbb{Z}^2$ (a more complete discussion of this example can be found in [Pe3]). Second, we will deal with a much more unusual example, namely $M = \mathbb{H}^2/\Gamma(k)$, where $\Gamma(k)$ is a subgroup of $PSL(2, \mathbb{Z})$ called the principal congruence subgroup of level $k$ (defined in more detail below). We note that it seems to be considerably harder to count geodesics that focus in points other than our basepoint.

Before continuing, consider the classical problem of counting $R_g(n)$, the number of solutions in $\mathbb{Z}^2$ of $p^2 + q^2 = n$. Let

$$n = 2^{\alpha} \prod_{i=1}^{k} p_i^{\beta_i} \prod_{j=1}^{\ell} q_j^{\gamma_j}$$

be the prime decomposition of the number $n$, where $p_i \equiv 1 \pmod 4$ and $q_j \equiv 3 \pmod 4$. The following classical result of Gauss (see, for example, [NZM]) will be very useful.


Lemma 4.1. $R_g(n)$ is zero whenever $n$ is not an integer, or any of the exponents $\gamma_j$ is odd. Otherwise,

$$R_g(n) = 4 \prod_{i=1}^{k} (1 + \beta_i).$$
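As a sanity check on Lemma 4.1, the formula can be compared with a direct enumeration of lattice points. The following Python sketch is an illustration only; the function names `r2_bruteforce` and `r2_gauss` are hypothetical, and both functions assume a positive integer $n$ (the case $n = 0$ is not covered by the formula).

```python
from math import isqrt

def r2_bruteforce(n):
    """Count integer pairs (p, q) with p^2 + q^2 = n by direct enumeration (n >= 1)."""
    count = 0
    for p in range(-isqrt(n), isqrt(n) + 1):
        q2 = n - p * p
        q = isqrt(q2)
        if q * q == q2:
            count += 1 if q == 0 else 2  # q and -q give distinct solutions unless q = 0
    return count

def r2_gauss(n):
    """Evaluate R_g(n) from the prime decomposition of n, as in Lemma 4.1 (n >= 1)."""
    result = 4
    d = 2
    while d * d <= n:
        if n % d == 0:
            exponent = 0
            while n % d == 0:
                n //= d
                exponent += 1
            if d % 4 == 1:
                result *= exponent + 1   # factor (1 + beta_i) for primes p_i = 1 mod 4
            elif d % 4 == 3 and exponent % 2 == 1:
                return 0                 # an odd exponent gamma_j forces R_g(n) = 0
        d += 1
    if n > 1:                            # one remaining prime factor with exponent 1
        if n % 4 == 1:
            result *= 2
        elif n % 4 == 3:
            return 0
    return result

print(r2_gauss(25), r2_bruteforce(25))
```

For example, $25 = 5^2$ has the twelve representations $(\pm 5, 0)$, $(0, \pm 5)$, $(\pm 3, \pm 4)$ and $(\pm 4, \pm 3)$, matching $4(1 + 2) = 12$.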

Example 4.2. Choose an origin in $M = \mathbb{R}^2/\mathbb{Z}^2$ and lift it to the origin in $\mathbb{R}^2$. Our discrete set $S$ is then $\mathbb{Z}^2$. Let $\rho_x(t)$ be the number of geodesics of length $t$ that connect the origin to the point $x \in M$.

Proposition 4.3. In the flat torus $\mathbb{R}^2/\mathbb{Z}^2$, the number of geodesics of length $t$ that connect any point to itself is $\rho_0(t) = R_g(t^2)$.

Proof. Notice that by definition geodesics of length $t$ leaving from the origin in $\mathbb{R}^2$ reach the points contained in $C_t(0)$. Only if $t^2$ is an integer does this circle intersect points of $\mathbb{Z}^2$. Because of the homogeneity of the flat, square torus, it does not matter where we choose the origin.

Example 4.4. We now turn to the next example. Recall that $PSL(2, \mathbb{Z})$ can be identified with the group of two by two matrices with integer entries and determinant one, and with multiplication by $-1$ as equivalence. For each $k$, the group $\Gamma(k)$ is the subgroup of $PSL(2, \mathbb{Z})$ given by

$$\Gamma(k) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in PSL(2, \mathbb{Z}) \;\middle|\; a \equiv d \equiv 1 \pmod k,\ b \equiv c \equiv 0 \pmod k \right\}.$$

This group has important applications in number theory. The action of $\Gamma(k)$ on $\mathbb{H}^2$ is given by the Möbius transformations

$$g(z) = \frac{az + b}{cz + d}, \quad \text{where } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma(k).$$

We point out that for $k = 2$, 3, or 5, the surface $\mathbb{H}^2/\Gamma(k)$ is a sphere with 3, 4, or 12 punctures (see [FK]). We will find it more convenient to work in the hyperbolic disk $\mathbb{D}^2$, which is the universal cover of $\mathbb{H}^2/\Gamma(k)$. We shall choose a representation of $\Gamma(k)$ in the disk so that $i \in \mathbb{H}^2$ corresponds to the origin. This will allow us to determine the focusing of the geodesics which emanate from $i$. Note that the surface $\mathbb{H}^2/\Gamma(k)$ has special symmetries with respect to $i$: for example, $i$ is the unique point fixed by the order 2 element of $PSL(2, \mathbb{Z})$.

Lemma 4.5. The action of the fundamental group of the surface $\mathbb{H}^2/\Gamma(k)$ can be represented as

$$\left\{ \begin{pmatrix} r - is & p + iq \\ p - iq & r + is \end{pmatrix} \;\middle|\; \begin{array}{l} p^2 + q^2 + 1 = r^2 + s^2, \\ r + p \equiv 1 \pmod k,\ r - p \equiv 1 \pmod k, \\ s + q \equiv 0 \pmod k,\ s - q \equiv 0 \pmod k \end{array} \right\}$$

acting on $\mathbb{D}^2$. We shall denote this particular representation as the group $\Gamma(k)$.


Fig. 4.1. The orbit of $i$ under $\Gamma(2)$ transported to the hyperbolic disk, and the corresponding Brillouin zones. Each zone $B_n$ forms a fundamental domain for a 3-punctured sphere

Proof. Following the conventions in [Be], define

$$\varphi : \mathbb{D}^2 \to \mathbb{H}^2, \qquad \varphi(z) = i\,\frac{z + 1}{-z + 1}.$$

Push back the transformation $g \in \Gamma(k)$ from $\mathbb{H}^2$ to $\mathbb{D}^2$ by $g \mapsto \varphi^{-1} g \varphi$ to obtain a representation of $g \in \Gamma(k)$ as a transformation acting on $\mathbb{D}^2$. The matrix representation of this transformation is given by:

$$A_g = \begin{pmatrix} \frac{a+d}{2} + i\,\frac{b-c}{2} & \frac{a-d}{2} - i\,\frac{b+c}{2} \\ \frac{a-d}{2} + i\,\frac{b+c}{2} & \frac{a+d}{2} - i\,\frac{b-c}{2} \end{pmatrix},$$

where $\det A_g = 1$, since this matrix is conjugate to $g$, whose determinant is equal to 1. Let

$$p = (a - d)/2, \quad q = -(b + c)/2, \quad r = (a + d)/2, \quad s = -(b - c)/2,$$

so that $A_g$ is now written as

$$A_g = \begin{pmatrix} r - is & p + iq \\ p - iq & r + is \end{pmatrix}.$$

Here the numbers $p, q, r, s$ are in $\mathbb{Z}$ and must satisfy the following congruence conditions: $r + p \equiv 1 \pmod k$, $r - p \equiv 1 \pmod k$, $s + q \equiv 0 \pmod k$, $s - q \equiv 0 \pmod k$. Since the determinant of $A_g$ is equal to 1, we must also have $p^2 + q^2 + 1 = r^2 + s^2$.
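The conjugation in this proof is easy to verify numerically for a sample element. The sketch below is a check under stated assumptions rather than part of the original argument: it represents Möbius maps by $2 \times 2$ complex matrices, uses the adjugate of $\varphi$ divided by $\det \varphi = 2i$ so that the result has determinant 1, and tests elements of $\Gamma(2)$ chosen for illustration. The function names `matmul`, `disk_matrix`, `pqrs` and `expected_matrix` are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def disk_matrix(a, b, c, d):
    """Matrix of phi^{-1} g phi for g = [[a, b], [c, d]], normalised to determinant 1."""
    phi = [[1j, 1j], [-1, 1]]        # phi(z) = i (z + 1) / (-z + 1)
    phi_adj = [[1, -1j], [1, 1j]]    # adjugate of phi; det(phi) = 2i
    M = matmul(matmul(phi_adj, [[a, b], [c, d]]), phi)
    return [[entry * (-0.5j) for entry in row] for row in M]  # multiply by 1/(2i)

def pqrs(a, b, c, d):
    """The quantities p, q, r, s defined in the proof of Lemma 4.5."""
    return (a - d) / 2, -(b + c) / 2, (a + d) / 2, -(b - c) / 2

def expected_matrix(p, q, r, s):
    """The matrix [[r - is, p + iq], [p - iq, r + is]] from Lemma 4.5."""
    return [[complex(r, -s), complex(p, q)], [complex(p, -q), complex(r, s)]]

# Assumption: a sample element of Gamma(2), namely g = [[1, 2], [0, 1]].
g = (1, 2, 0, 1)
p, q, r, s = pqrs(*g)
A = disk_matrix(*g)
E = expected_matrix(p, q, r, s)
print(all(abs(A[i][j] - E[i][j]) < 1e-12 for i in range(2) for j in range(2)))
print(p * p + q * q + 1 == r * r + s * s)
```

For this sample element one finds $(p, q, r, s) = (0, -1, 1, -1)$, and the image of the origin under $A_g$ is $(p + iq)/(r + is)$, which is the form of the orbit points used later in the counting argument.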


We need another auxiliary result before we state the main result of this section.

Lemma 4.6. Let $(p, q)$ and $(r, s)$ be two points in $\mathbb{Z}^2$ such that the integers $A = p^2 + q^2$ and $B = r^2 + s^2$ are relatively prime, and let $\varphi$ be a rotation fixing the origin. Now $\varphi(p, q) = (p', q')$ and $\varphi(r, s) = (r', s')$ are in $\mathbb{Z}^2$ if and only if $\varphi$ is a rotation by an integer multiple of $\pi/2$.

Proof. Let $c$ be the cosine of the angle of rotation. We have

$$c = \frac{p'p + q'q}{A} = \frac{r'r + s's}{B}.$$

Thus if $p'p + q'q$ and $r'r + s's$ are not both equal to zero,

$$\frac{p'p + q'q}{r'r + s's} = \frac{A}{B}.$$

Because $A$ and $B$ are relatively prime and surely $|p'p + q'q|$ is less than or equal to $A$, and similarly for $B$, we have that $p'p + q'q = \pm A$ and $r'r + s's = \pm B$. This implies the result.

Now we define a counter just as before. Choose a lift of $M = \mathbb{D}^2/\Gamma(k)$ so that $0 \in M$ lifts to $0 \in \mathbb{D}^2$. Let $\gamma_x(t)$ be the number of geodesics of length $t$ that connect the origin to the point $x \in M$.

Fig. 4.2. The non-zero values of $\rho_0(t)$ for $t \le 25$ (left) and $\gamma_0(t)$ for $t \le 8$ (right), which count how many geodesics of length $t$ connect the origin to itself in $\mathbb{R}^2/\mathbb{Z}^2$ and $\mathbb{D}^2/\Gamma(2)$, respectively


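Theorem 4.7. In the surface $\mathbb{H}^2/\Gamma(k)$, the number of geodesics of length $t$ which connect the point $i \in \mathbb{H}^2$ to itself is given by

$$\begin{cases} \frac{1}{4}\, R_g(\cosh^2 t - 1)\, R_g(\cosh^2 t) & \text{for } k = 2, \\[4pt] \frac{1}{4}\, R_g\!\left(\frac{\cosh^2 t - 1}{9}\right) R_g(\cosh^2 t) & \text{for } k = 3, \\[4pt] \frac{1}{4}\, R_g\!\left(\frac{\cosh^2 t - 1}{25}\right) R_g(\cosh^2 t) & \text{for } k = 5. \end{cases}$$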

Note that in all cases, the number is nonzero only if $\cosh^2 t \in \mathbb{N}$.

Proof. We shall work in the disk, rather than in $\mathbb{H}^2$. Let $S$ be the orbit of $0 \in \mathbb{D}^2$ under $\Gamma(k)$. Then the number of such geodesics is exactly the number of distinct points of $S$ which lie on the circle $C_t(0)$ of radius $t$ and centered at the origin. If $x \in S$, then by Lemma 4.5, it is of the form $\frac{p+iq}{r+is}$ with $p, q, r, s$ integers satisfying $p^2 + q^2 = r^2 + s^2 - 1$. Let $n$ be their common value, that is, $n = p^2 + q^2 = r^2 + s^2 - 1$. We will first count the number of 4-tuples $(p, q, r, s)$ that solve this equation, momentarily ignoring the congruence conditions. Note that the point $x$ has Euclidean distance to the origin given by

$$|x|_e^2 = \frac{p^2 + q^2}{r^2 + s^2} = \frac{n}{n+1}.$$

The hyperbolic length of the geodesic which connects $x$ to the origin is $\operatorname{arctanh}(|x|_e)$. Consequently, $\gamma_0(t)$ is only non-zero when $t = \operatorname{arctanh}\sqrt{n/(n+1)}$, or, equivalently, when $n = \cosh^2 t - 1$ (since $\tanh^2 t = n/(n+1)$ gives $\cosh^2 t = 1/(1 - \tanh^2 t) = n + 1$).

To count the number of intersections of $C_t(0)$ with $S$ for these values of $t$, observe that we can use Gauss’ result to count the number of pairs $(p, q)$ such that $p^2 + q^2 = n$. This number is given by $R_g(n)$. For each such pair $(p, q)$, we have a number of choices to form

$$x = \frac{p + iq}{r + is}.$$

By the above, this number is equal to $R_g(n + 1)$. Thus, $\gamma_0(t)$ is at most $R_g(n) R_g(n + 1)$. However, we have over-counted: some of our choices for $p, q, r, s$ represent the same point $x \in S$, and some of them may not satisfy the congruence conditions, which we have so far ignored. We will first account for the multiple representations, and then account for the congruence relations.

Let $p, q, r, s \in \mathbb{Z}$ be as above, giving a point $x = \frac{p+iq}{r+is}$ which is at distance $t = \operatorname{arctanh}\sqrt{n/(n+1)}$ from the origin. If we multiply the numerator and denominator of $x$ by $e^{i\theta}$, then $x$ will remain unchanged. Because of the requirement that $p^2 + q^2 = r^2 + s^2 - 1 = n$, this is the only invariant, and by Lemma 4.6, $\theta$ must be a multiple of $\frac{\pi}{2}$ for the numerator and denominator to remain Gaussian integers. We see that in our counting, we have represented our point $x$ in 4 different ways:

$$x = \frac{p + iq}{r + is} = \frac{-p - iq}{-r - is} = \frac{q - ip}{s - ir} = \frac{-q + ip}{-s + ir},$$

meaning we have over-counted by a factor of at least 4.


Now we account for the congruence conditions. First, consider the case $k = 2$. Note that $q + s \equiv 0 \pmod 2$ if and only if $p + r \equiv 1 \pmod 2$, because $p^2 + q^2 + 1 = r^2 + s^2$, so we need only check this one condition. If the representation $\frac{p+iq}{r+is}$ fails to satisfy our parity condition, then $q + s \equiv 1 \pmod 2$ and consequently $p + r \equiv 0 \pmod 2$. This means that the representation $\frac{-q+ip}{-s+ir}$ of this same point does satisfy the parity conditions, giving exactly

\[
\tfrac{1}{4}\, R_g(\cosh^2 t - 1)\, R_g(\cosh^2 t)
\]

distinct points of $S$ at distance $t$ from the origin.

If $k = 3$, then since $k$ is odd, the congruence conditions on $p$, $q$, $r$, and $s$ imply that

\[
r \equiv 1 \pmod 3 \quad\text{and}\quad p \equiv q \equiv s \equiv 0 \pmod 3.
\]

Note that the equation

\[
p^2 + q^2 = n \quad\text{and}\quad p \equiv q \equiv 0 \pmod 3
\]

will be satisfied exactly $R_g(n/3^2)$ times. (Recall that if $n$ is not divisible by 9, then $R_g(n/9)$ is 0.) For fixed $n$, let $(p, q)$ be any one of the solutions. We need to decide how many solutions the equation

\[
r^2 + s^2 = n + 1 \quad\text{with}\quad r \equiv 1 \pmod 3 \quad\text{and}\quad s \equiv 0 \pmod 3
\]

admits. The solution of the first equation implies that 3 divides $n$. Thus $r^2 + s^2 \equiv 1 \pmod 3$. Consequently, we have 4 choices mod 3 for the pair $(r, s)$, namely $(0, 1)$, $(1, 0)$, $(0, 2)$, and $(2, 0)$. Let $(p, q, r, s) \in \mathbb{Z}^2 \times \mathbb{Z}^2$ be any solution to $n = p^2 + q^2 = r^2 + s^2 - 1$ with $p \equiv q \equiv 0 \pmod 3$. For each choice of $(p, q)$, we have exactly $R_g(n+1)$ choices of $(r, s)$. Now let $R$ denote the product of the rotations by $\pi/2$ on each of the components of $\mathbb{Z}^2 \times \mathbb{Z}^2$. Using Lemma 4.6, we see that all such solutions can be obtained from just one by applying $R$ repeatedly. It is easy to check that each quadruple of solutions thus constructed runs exactly once through the above list. Since precisely one out of the four associated solutions is compatible with the conditions, the total number of solutions is exactly

\[
\tfrac{1}{4}\, R_g\!\left(\frac{n}{9}\right) R_g(n+1).
\]

Using the relationship between the Euclidean distance and the Poincaré length as before gives the result.

If $k = 5$, the proof for $k = 3$ can be literally transcribed to obtain the result.

Remark 4.8. Note that the above results do not hold if $k$ is not one of the cases mentioned. The primary difficulty is that for prime $k \ge 7$, there are solutions which are not related by applying the rotation $R$. However, the argument does give an upper bound of $\tfrac{1}{4}\, R_g\!\left((\cosh^2 t - 1)/k^2\right) R_g(\cosh^2 t)$ for $\mathbb{H}^2/\Gamma(k)$ when $k$ is an odd prime. Note that the surface $\mathbb{H}^2/\Gamma(k)$ is of genus 0 if and only if $k \le 5$ (see [FK]).
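The counts obtained in the proof above for $k = 2$ and $k = 3$ can be compared with a brute-force enumeration for small $n$. The Python sketch below is only an illustration; all function names are hypothetical, and the congruence conditions encoded are the ones stated in the text ($q + s$ even for $k = 2$; $p \equiv q \equiv s \equiv 0$ and $r \equiv 1 \pmod 3$ for $k = 3$). It counts distinct points for $k = 2$ and compatible quadruples for $k = 3$.

```python
from math import isqrt


def reps(n):
    """All integer pairs (a, b) with a*a + b*b == n."""
    m = isqrt(n)
    return [(a, b)
            for a in range(-m, m + 1)
            for b in range(-m, m + 1)
            if a * a + b * b == n]


def r_g(n):
    """Number of representations of n as a sum of two integer squares."""
    return len(reps(n))


def r_g_div(n, d):
    """R_g(n/d), taken to be 0 when d does not divide n."""
    return r_g(n // d) if n % d == 0 else 0


def points_k2(n):
    """Distinct points x having a representation with q + s even."""
    pts = set()
    for p, q in reps(n):
        for r, s in reps(n + 1):
            if (q + s) % 2 == 0:
                # x = ((pr + qs) + i(qr - ps)) / (n + 1); the denominator
                # is the same for every quadruple, so the numerator
                # identifies the point.
                pts.add((p * r + q * s, q * r - p * s))
    return len(pts)


def quadruples_k3(n):
    """Quadruples with p = q = s = 0 and r = 1 modulo 3."""
    return sum(1
               for p, q in reps(n)
               for r, s in reps(n + 1)
               if p % 3 == 0 and q % 3 == 0 and s % 3 == 0 and r % 3 == 1)
```

Comparing `4 * points_k2(n)` with `r_g(n) * r_g(n + 1)`, and `4 * quadruples_k3(n)` with `r_g_div(n, 9) * r_g(n + 1)`, over a range of small $n$ gives a quick sanity check of the two formulas.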


5. Non-Riemannian Examples

The present context is certainly not restricted to Riemannian metrics. As an indicator of this we now discuss a different set of examples. Let $k$ be a positive number greater than one. Equip $\mathbb{R}^2$ with the distance function

\[
\| x - y \| = \left( |x_1 - y_1|^k + |x_2 - y_2|^k \right)^{1/k}
\]

and let the discrete set $S$ be given by $\mathbb{Z}^2$. For $k$ not equal to 2, this is not a Riemannian metric, yet all conclusions of Sect. 2 hold. In particular, each Brillouin zone forms a fundamental domain. Note that determining the zones by inspecting the picture requires close attention!

[Figure 5.1: two panels with axes $x$ and $y$, one over the range $-2$ to $2$ and one over the range $-10$ to $10$.]

Fig. 5.1. Brillouin zones for the lattice $\mathbb{Z}^2$ in $\mathbb{R}^2$ with the metric $\left( |x_1 - y_1|^4 + |x_2 - y_2|^4 \right)^{1/4}$. See also Fig. 2.3 and Example 2.5, which deal with the case $k = 1$, the “Manhattan metric”
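To make the zones of Fig. 5.1 concrete, here is a minimal Python sketch that assigns a zone index to a point. It assumes the usual definition (a point lies in the $n$-th Brillouin zone of the origin when the origin is its $n$-th nearest lattice point); the function names `dist_k` and `zone_index` are introduced here for illustration only, and points lying exactly on a mediatrix are not treated specially.

```python
from math import ceil, floor


def dist_k(x, y, k=4.0):
    """Distance (|x1 - y1|^k + |x2 - y2|^k)^(1/k) between two points."""
    return (abs(x[0] - y[0]) ** k + abs(x[1] - y[1]) ** k) ** (1.0 / k)


def zone_index(x, k=4.0):
    """1 + number of points of Z^2 strictly closer to x than the origin."""
    d0 = dist_k(x, (0, 0), k)
    closer = 0
    # A lattice point a closer to x than the origin must satisfy
    # |x_i - a_i| < d0 in each coordinate, so a bounded search suffices.
    for a1 in range(floor(x[0] - d0), ceil(x[0] + d0) + 1):
        for a2 in range(floor(x[1] - d0), ceil(x[1] + d0) + 1):
            if dist_k(x, (a1, a2), k) < d0:
                closer += 1
    return closer + 1
```

Evaluating `zone_index` on a fine grid and colouring by the result is one simple, if slow, way to produce a picture of this kind.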

Now the problem of determining $C_t(0) \cap S$ for any given $t$ is unsolved for general $k$. In fact, even for certain integer values of $k$ greater than 2, it is not known whether $C_t(0) \cap S$ ever contains at least two points that are not related by the symmetries of the problem. For $k = 4$, the smallest $t$ for which $C_t(0) \cap S$ has at least two (unrelated) solutions is given by $t^4 = 133^4 + 134^4 = 158^4 + 59^4$. However, for $k \ge 5$, it is unknown whether this can happen at all (see [SW]).

There are some things that can be said, however. In the situation where $k \ge 3$, the mediatrices intersect the coordinate axes only in irrational points or in multiples of $1/2$. For if $x = (p/q, 0)$ is a point of a mediatrix $L_{(0,0),(a_1,a_2)}$, we have

\[
|p|^k = |p - q a_1|^k + |a_2 q|^k \qquad (p \neq 0,\ q \neq 0).
\]

By Fermat’s Last Theorem, this has no solution unless either $p = q a_1$ or $a_2 = 0$. In the first case, $p/q = \pm a_2$, which can only occur if the lattice point is of the form $(a_2, \pm a_2)$. If $a_2 = 0$, then $p/q = a_1/2$. In particular, there is no nontrivial focusing along the axes.
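The coincidence quoted above for $k = 4$ can be reproduced by a direct search. The Python sketch below is illustrative only; the function name `smallest_two_way_sum` and the search bound are assumptions made here. It looks for the smallest integer that can be written as $a^k + b^k$ with $0 \le a \le b$ in two different ways.

```python
def smallest_two_way_sum(k, limit):
    """Smallest N = a^k + b^k (0 <= a <= b <= limit) with two such pairs.

    Returns (N, pairs), or None if no certified answer is found.  The
    answer is only certified when N <= limit^k, since then any smaller
    coincidence would involve pairs within the search range.
    """
    pairs_by_sum = {}
    for a in range(0, limit + 1):
        for b in range(a, limit + 1):
            pairs_by_sum.setdefault(a ** k + b ** k, []).append((a, b))
    repeated = [total for total, pairs in pairs_by_sum.items()
                if len(pairs) >= 2]
    if not repeated:
        return None
    best = min(repeated)
    if best > limit ** k:
        return None
    return best, pairs_by_sum[best]
```

With `k = 4` and `limit = 160`, the search returns the value $635318657$ together with the pairs $(59, 158)$ and $(133, 134)$, matching the identity in the text.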


To compute Fig. 5.1, we took advantage of the smoothness of the metric. Not all metrics are sufficiently smooth for this procedure to work. Even for Riemannian metrics, in general the distance function is only Lipschitz, which will not be sufficiently smooth. For each $a = (a_1, a_2) \in \mathbb{Z}^2$, define a Hamiltonian:

\[
H_a(x) = \| x - a \| - \| x \| .
\]

The mediatrix $L_{0,a}$ corresponds to the level set $H_a(x) = 0$. Because $H_a(x)$ is smooth, we have uniqueness of solutions to Hamilton’s equations. In the current situation, where the dimension is two, the level set consists of one orbit. Thus, one can produce the mediatrix by numerically tracing the zero energy orbits of the above Hamiltonian. As mentioned above, for a general Riemannian metric, the distance function is only Lipschitz. This means we have no guarantee that the solutions of the above differential equation are unique. Indeed, there are examples of multiply connected Riemannian manifolds with self-intersecting mediatrices, as will be shown in a forthcoming work.

Acknowledgement. It is a pleasure to acknowledge useful conversations with Federico Bonetto, Johann Dupont, Irwin Kra, Bernie Maskit, John Milnor, Chi-Han Sah, and Duncan Sands. Part of this work was carried out while Peter Veerman was visiting the Center for Physics and Biology at Rockefeller University and the Mathematics Department at SUNY Stony Brook; the authors are grateful for the hospitality of these institutions.
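The orbit-tracing procedure described in Sect. 5 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' actual code: it takes $k = 4$, starts from the midpoint $a/2$ (which lies on the mediatrix), and integrates Hamilton's equations $\dot x_1 = \partial H_a/\partial x_2$, $\dot x_2 = -\partial H_a/\partial x_1$ with a classical fourth-order Runge-Kutta step. All function names and the step size are choices made here.

```python
import math


def norm_k(v, k=4):
    """The norm (|v1|^k + |v2|^k)^(1/k)."""
    return (abs(v[0]) ** k + abs(v[1]) ** k) ** (1.0 / k)


def grad_norm_k(v, k=4):
    """Gradient of norm_k at v (v must not be the origin)."""
    n = norm_k(v, k)
    return tuple(math.copysign(abs(c) ** (k - 1), c) / n ** (k - 1) for c in v)


def hamiltonian(x, a, k=4):
    """H_a(x) = ||x - a|| - ||x||."""
    return norm_k((x[0] - a[0], x[1] - a[1]), k) - norm_k(x, k)


def field(x, a, k=4):
    """Hamiltonian vector field (dH/dx2, -dH/dx1)."""
    g_a = grad_norm_k((x[0] - a[0], x[1] - a[1]), k)
    g_0 = grad_norm_k(x, k)
    dh1, dh2 = g_a[0] - g_0[0], g_a[1] - g_0[1]
    return (dh2, -dh1)


def rk4_step(x, a, h, k=4):
    """One classical Runge-Kutta step of size h along the field."""
    def shifted(v, s):
        return (x[0] + s * v[0], x[1] + s * v[1])
    k1 = field(x, a, k)
    k2 = field(shifted(k1, h / 2), a, k)
    k3 = field(shifted(k2, h / 2), a, k)
    k4 = field(shifted(k3, h), a, k)
    return (x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))


def trace_mediatrix(a, k=4, steps=400, h=0.01):
    """Points along the mediatrix of 0 and a, starting from a/2."""
    x = (a[0] / 2.0, a[1] / 2.0)  # H_a(a/2) = 0 by symmetry
    path = [x]
    for _ in range(steps):
        x = rk4_step(x, a, h, k)
        path.append(x)
    return path
```

Calling `trace_mediatrix` with a negative step `h` follows the same level set in the opposite direction; monitoring `hamiltonian` along the returned path gives a rough check on the numerical drift.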

References

[AM] Ashcroft, N. W., and Mermin, N. D.: Solid State Physics. New York: Holt, Rinehart, and Winston, 1976
[Be] Beardon, A. F.: The Geometry of Discrete Groups. Berlin–Heidelberg–New York: Springer-Verlag, 1983
[Bi] Bieberbach, L.: Über die Inhaltsgleichheit der Brillouinschen Zonen. Monatshefte für Math. und Phys. 48, 509–515 (1939)
[Br] Brillouin, L.: Wave Propagation in Periodic Structures. New York: Dover, 1953
[Ca] Cannon, J.: The Theory of Negatively Curved Spaces and Groups. In: Ergodic Theory, Symbolic Dynamics, and Hyperbolic Spaces. (Ed: Bedford, Keane, & Series). Oxford: Oxford University Press, 1991
[doC] do Carmo, M.: Riemannian Geometry. Basel–Boston: Birkhäuser, 1992
[FK] Farkas, H., and Kra, I.: Automorphic Forms for Subgroups of the Modular Group. II: Groups Containing Congruence Subgroups. J. d’Analyse Math. 70, 91–156 (1996)
[Gr] Gromov, M.: Metric Structures for Riemannian and Non-Riemannian Spaces. Basel–Boston: Birkhäuser, 1999
[Ha] Hardt, R. M.: Stratifications of real analytic mappings and images. Inv. Math. 28, 193–208 (1975)
[Hi] Hironaka, H.: Subanalytic sets. In: Number theory, algebraic geometry and commutative algebra in honor of Y. Akizuki, Tokyo: Kinokuniya Publications, 1973, pp. 453–493
[Jo1] Jones, G. A.: Geometric and Asymptotic Properties of Brillouin Zones in Lattices. Bull. Lond. Math. Soc. 16, 241–263 (1984)
[Jo2] Jones, H.: The Theory of Brillouin Zones and Electronic States in Crystals. Amsterdam: North-Holland, 1975
[KP] Kupka, I. A. K., and Peixoto, M. M.: On the Enumerative Geometry of Geodesics. In: From Topology to Computation, (Ed: Marsden & Shub), Berlin–Heidelberg–New York: Springer, 1993, pp. 243–253
[Mas] Maskit, B.: Personal communication
[NZM] Niven, I., Zuckerman, H. S., and Montgomery, H. L.: An Introduction to the Theory of Numbers. Fifth edition, New York: John Wiley and Sons, 1991
[Pe1] Peixoto, M. M.: On end point boundary value problems. J. Differ. Eqs. 44, 273–280 (1982)
[Pe2] Peixoto, M. M.: Sigma décomposition et arithmétique de quelques formes quadratiques définies positives. In: R. Thom Festschrift volume: Passion des Formes (Ed: M. Porte), Paris: ENS Editions, 1994, pp. 455–479
[Pe3] Peixoto, M. M.: Focal Decomposition in Geometry, Arithmetic, and Physics. In: Geometry, Topology and Physics, (Ed: Apanasov, Bradlow, Rodrigues, & Uhlenbeck), Berlin–New York: de Gruyter & Co., 1997, pp. 213–231
[PT] Peixoto, M. M., and Thom, R.: Le point de vue énumératif dans les problèmes aux limites pour les équations différentielles ordinaires. I: Quelques exemples, C. R. Acad. Sc. Paris I 303, 629–632 (1986); Erratum. C. R. Acad. Sc. Paris I 307, 197–198 (1988); II: Le théorème. C. R. Acad. Sc. Paris I 303, 693–698 (1986)
[PS] Preparata, F. P., and Shamos, M. I.: Computational Geometry. Berlin–Heidelberg–New York: Springer, 1985
[Ra] Ratcliffe, J. G.: Foundations of Hyperbolic Manifolds. Berlin–Heidelberg–New York: Springer, 1994
[Sah] Sah, C. H.: Hilbert’s Third Problem: Scissors Congruence. Research Notes in Mathematics 33, London: Pitman, 1979
[SW] Skinner, C. M., and Wooley, T. D.: Sums of k-th Powers. J. Reine Angew. Math. 462, 57–68 (1995)
[Ti] Tinkham, M.: Group Theory and Quantum Mechanics. New York: McGraw-Hill, 1964

Communicated by A. Connes
