Communications in Mathematical Physics - Volume 188

Author: A. Jaffe (Chief Editor)

Commun. Math. Phys. 188, 1 – 27 (1997)

Communications in Mathematical Physics
© Springer-Verlag 1997

Weak Homogenization of Anisotropic Diffusion on Pre-Sierpiński Carpets

Martin T. Barlow¹, Kumiko Hattori², Tetsuya Hattori³, Hiroshi Watanabe⁴

¹ Department of Mathematics, University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada. E-mail: [email protected]
² Department of Mathematical Sciences, University of Tokyo, Komaba, Tokyo 113, Japan. E-mail: [email protected]
³ Department of Mathematics, Rikkyo University, Nishi-Ikebukuro, Tokyo 171, Japan. E-mail: [email protected]
⁴ Department of Mathematics, Nippon Medical School, Kosugi, Nakahara Kawasaki 211, Japan. E-mail: [email protected]

Received: 26 June 1996 / Accepted: 25 November 1996

Abstract: We study a kind of "restoration of isotropy" on the pre-Sierpiński carpet. Let $R_n^x(r)$ and $R_n^y(r)$ be the effective resistances in the x and y directions, respectively, of the Sierpiński carpet at the nth stage of its construction, if it is made of anisotropic material whose anisotropy is parametrized by the ratio of resistances for a unit square: $r = R_0^y / R_0^x$. We prove that isotropy is weakly restored asymptotically in the sense that for all sufficiently large n the ratio $R_n^y(r) / R_n^x(r)$ is bounded by positive constants independent of r. The ratio decays exponentially fast when $r \gg 1$. Furthermore, it is proved that the effective resistances asymptotically grow exponentially with an exponent equal to that found by Barlow and Bass for the isotropic case $r = 1$.

1. Introduction

In this article we study a kind of homogenization, or restoration of isotropy of anisotropic diffusion, on the pre-Sierpiński carpet [5]. The present work develops ideas arising in two series of recent studies on the diffusion on fractals. One is a study of asymptotically one-dimensional diffusions on Sierpiński gaskets in [9, 10, 8], which contains the discovery of the mechanism on finitely ramified fractals. The other is a detailed study of isotropic diffusion on Sierpiński carpets in [1, 2, 4, 3].

The most interesting aspects of asymptotic behaviors of diffusion (e.g. the spectral dimensions) are embodied in the asymptotic behaviors of effective resistances. A physicist may find it easy to interpret the results on resistances in terms of diffusions. Note (as we will summarize below) that electrical resistance is the rate of heat dissipation caused by electric power. As we will actually use in the proofs, the resistance can be defined as an $H^1$ norm of electric potential (see (1.1) below), and the potential is a solution to the Laplace equation (a harmonic function) with corresponding Neumann-Dirichlet boundary conditions. Thus it is natural that resistances and diffusions are strongly related. In this paper, rather than going into the relation of the two phenomena in general, we will focus on the behavior of electrical resistances. See [1, 2, 4, 3] on how resistances play an essential part in the construction of diffusions on the Sierpiński carpet, and derivation of their properties.

The Sierpiński carpet is an example of an infinitely ramified fractal [14, 13]. For $n \in \mathbb{Z}_+$ the pre-Sierpiński carpet $F_n$ is the open subset of the unit open square $F_0 = (0, 1) \times (0, 1)$ obtained by iterating the operation for constructing the Sierpiński carpet, until squares of side length $3^{-n}$ are reached, where we stop, so that smaller scale structures are absent. The operation is a generalization of that in the construction of the Cantor ternary set: given a square of side length $3^{-m}$, we divide it into 9 squares of side length $3^{-m-1}$, remove the middle square (with its boundary) and keep the other 8 squares. Thus $F_n$ is an open set in $\mathbb{R}^2$, composed of $8^n$ squares of side $3^{-n}$, and has square shaped holes of side length varying from $3^{-n}$ to $3^{-1}$. It will be convenient later to write $F_n = F_0$ for $n < 0$.

Fig. 1. The pre-Sierpiński carpet $F_3$
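As an informal illustration of the construction described above (it is not part of the original paper), the following Python sketch marks which cells of the $3^n \times 3^n$ grid of squares of side $3^{-n}$ belong to $F_n$. The function names `in_pre_carpet`, `pre_carpet_cells` and `render` are hypothetical. The base-3 digit test is one common way to encode "remove the middle square at every scale", assumed here to match the construction in the text.

```python
def in_pre_carpet(i: int, j: int, n: int) -> bool:
    """Return True if cell (i, j) of the 3^n x 3^n grid belongs to F_n.

    A cell is removed exactly when, at some scale, both of its base-3
    digits equal 1, i.e. it sits in the middle square at that scale.
    """
    for _ in range(n):
        if i % 3 == 1 and j % 3 == 1:
            return False
        i //= 3
        j //= 3
    return True


def pre_carpet_cells(n: int) -> list:
    """List all cells (i, j) of side 3^-n that make up F_n."""
    side = 3 ** n
    return [(i, j) for i in range(side) for j in range(side)
            if in_pre_carpet(i, j, n)]


def render(n: int) -> str:
    """Text picture of F_n: '#' for kept cells, '.' for holes."""
    side = 3 ** n
    rows = []
    for j in reversed(range(side)):  # largest y first, so the picture is upright
        rows.append("".join("#" if in_pre_carpet(i, j, n) else "."
                            for i in range(side)))
    return "\n".join(rows)


if __name__ == "__main__":
    for level in range(4):
        # The number of cells should equal 8^n, as stated in the text.
        print(level, len(pre_carpet_cells(level)))
    print(render(2))
```

Counting the cells returned by `pre_carpet_cells(n)` gives a quick check of the statement that $F_n$ is composed of $8^n$ squares.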

Let $r \in (0, \infty)$, and consider a function $v \in C(\bar{F}_n) \cap H^1(F_n)$, where $C(\bar{F}_n)$ denotes the set of continuous functions on $\bar{F}_n$, and $H^1(F_n)$ the set of square integrable functions whose partial derivatives $\partial v/\partial x$ and $\partial v/\partial y$ (in the sense of distribution) are square integrable. Put

\[ E_{F_n}(v, v) = \int_{F_n} \left( \left( \frac{\partial v}{\partial x}(x, y) \right)^2 + \frac{1}{r} \left( \frac{\partial v}{\partial y}(x, y) \right)^2 \right) dx\, dy . \tag{1.1} \]

In physical terms $E_{F_n}(v, v)$ is the rate of energy dissipation for the potential (voltage) distribution $v$ if $F_n$ is made of a material with a uniform but anisotropic electrical resistivity, with anisotropy parameter $r$. For a unit square made of this material, the total resistance is 1 in the x-direction and $r$ in the y-direction, and the principal axes of the resistivity tensor are parallel to the x and y axes. Define $R_n^x(r)$, the effective resistance of $F_n$ in the x direction, by the following (principle of minimum heat production):

\[ \frac{1}{R_n^x(r)} = \inf \{ E_{F_n}(v, v) \} , \tag{1.2} \]

where the infimum is taken over all the functions $v \in C(\bar{F}_n) \cap H^1(F_n)$ satisfying the boundary conditions

\[ v(0, y) = 0, \quad v(1, y) = 1, \quad 0 \le y \le 1. \tag{1.3} \]

The effective resistance in the y direction $R_n^y(r)$ is defined in a similar manner, with boundary conditions

\[ v(x, 0) = 0, \quad v(x, 1) = 1, \quad 0 \le x \le 1. \tag{1.4} \]

Obviously,

\[ R_0^x(r) = 1 \quad \text{and} \quad R_0^y(r) = r . \tag{1.5} \]

Set

\[ H_n(r) = \frac{R_n^y(r)}{R_n^x(r)} ; \tag{1.6} \]

thus $H_n(r)$ measures the effective anisotropy of $F_n$ if it is composed of material with anisotropy parameter $r$. It is easy (see Lemma 3.1) to verify that

\[ R_n^x(r) = r R_n^y(1/r), \qquad H_n(r) = H_n(1/r)^{-1} . \]

We have the following conjecture:

Conjecture ("Strong Homogenization").

\[ \lim_{n \to \infty} H_n(r) = 1, \quad \text{for each } r \in (0, \infty). \tag{1.7} \]

In this paper, we prove the following weak homogenization property:

Theorem 1.1. There exists a constant $1 \le K < \infty$ such that

\[ K^{-1} \le \liminf_{n \to \infty} H_n(r) \le \limsup_{n \to \infty} H_n(r) \le K \quad \text{for each } r \in (0, \infty) . \]

Our proof gives explicit bounds: we can take $K = 6333$, which may be compared with the conjectured value $K = 1$ in (1.7). (Our bounds, and proof, have improved since we announced them in [5].) Theorem 1.1 does not give information on the asymptotic behavior in $n$ of $R_n^x(r)$ and $R_n^y(r)$. However, we have the following result:

Theorem 1.2. For each $r > 0$,

\[ 0 < \inf_n \rho^{-n} R_n^z(r) \le \sup_n \rho^{-n} R_n^z(r) < \infty , \quad z = x, y, \]

where $\rho$ is the growth exponent for the isotropic case $r = 1$ given in [2, 4].

Thus the effective resistances $R_n^x(r)$ and $R_n^y(r)$ both grow asymptotically like $\rho^n$, and so the growth exponent $\rho$ found in [2] is universal in the sense that it is independent of the anisotropy $r$. We see from (1.5) and (1.6) that $H_0(r) = r$. Thus Theorem 1.1 implies that if $r \gg 1$, $H_n(r)$ should be relatively small when $n$ is large. In fact, we have the following estimate for the decrease of $H_n$ in $n$.

Theorem 1.3. There exist constants $c \in (0, \infty)$, $s_1 \in (0, 1)$ such that

\[ 1 \le s^{-1} H_n((9/7)^n s) \le \exp(c s^{-\xi}), \quad n \ge 1, \ s \ge s_1, \tag{1.8} \]

where $\xi = \log 2 / \log 7$. In particular

\[ \lim_{s \to \infty} \liminf_{n \to \infty} s^{-1} H_n((9/7)^n s) = \lim_{s \to \infty} \limsup_{n \to \infty} s^{-1} H_n((9/7)^n s) = 1 . \]

Thus when $s = (7/9)^n r$ is large, $H_n(r) \approx (7/9)^n r$. A similar result holds for small $s$:

\[ \lim_{s \to 0} \limsup_{n \to \infty} s^{-1} H_n((7/9)^n s) = \lim_{s \to 0} \liminf_{n \to \infty} s^{-1} H_n((7/9)^n s) = 1 . \]

We can also obtain scaling relations of this kind for the effective resistances $R_n^x(r)$ and $R_n^y(r)$; see the proof of the theorem. Our proof also implies that $\lim_{r \to \infty} R_n^x(r) = (3/2)^n$ and $\lim_{r \to \infty} r^{-1} R_n^y(r) = (7/6)^n$. (See (3.22) and (3.20).) Therefore

\[ \lim_{r \to \infty} r^{-1} H_n(r) = \left( \frac{7}{9} \right)^n , \quad n \ge 0. \tag{1.9} \]
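To give a feeling for the scale set by (1.9), the short Python sketch below evaluates the leading-order expression $(7/9)^n r$ and the first level $n$ at which it drops to 1 or below. This is only a heuristic reading of (1.9), which is a statement about $r \to \infty$ at fixed $n$; it is not a computation of $H_n(r)$ itself. The function names are hypothetical.

```python
import math


def anisotropy_heuristic(r: float, n: int) -> float:
    """Leading-order value (7/9)^n * r suggested by (1.9) for large r."""
    return (7.0 / 9.0) ** n * r


def crossover_level(r: float) -> int:
    """Smallest n >= 0 with (7/9)^n * r <= 1, i.e. n >= log r / log(9/7)."""
    if r <= 1.0:
        return 0
    return math.ceil(math.log(r) / math.log(9.0 / 7.0))


if __name__ == "__main__":
    for r in (10.0, 100.0, 1.0e6):
        n_star = crossover_level(r)
        print(r, n_star, anisotropy_heuristic(r, n_star))
```

Because the crossover level grows only like $\log r$, even a strongly anisotropic material reaches the regime where $(7/9)^n r$ is of order 1 after a moderate number of construction steps.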

We have no proof of the existence of the scaling limit

\[ h(s) = \lim_{n \to \infty} s^{-1} H_n((9/7)^n s) , \]

but Theorem 1.3 implies that if $h$ does exist then $\lim_{s \to \infty} h(s) = 1$. For further comments and conjectures on the form of $h$ see [5].

Proofs of Theorem 1.1 and Theorem 1.3 are given in Sect. 3. The basic tools to prove Theorem 1.1 are Propositions 3.2 and 3.3, which are recursive inequalities for the effective resistances, which give good bounds in the anisotropic regime, that is when $H_n(r)$ is very different from 1. If $H_n(r) \gg 1$, then, roughly speaking, these inequalities state that the smaller effective resistance $R_n^x(r)$ grows as $(3/2)^n$, while the larger effective resistance $R_n^y(r)$ grows as $(7/6)^n r$. So as long as $H_n(r) \gg 1$, we have $H_n(r) \approx (7/9)^n r$, and thus $H_n(r)$ approaches 1 exponentially fast. Theorem 1.3 shows that we can make precise this argument on the exponential decay of $H_n(r)$. In fact, the estimates in Propositions 3.2 and 3.3 are precise enough to allow us to prove that $H_n(r)$ is bounded for all large $n$, so proving Theorem 1.1. We prove Theorem 1.2 in Sect. 4, by giving another recursive inequality (4.1), analogous to those given in [2] for $r = 1$. Sect. 2 is devoted to basic estimates used both in Sect. 3 and Sect. 4.

A strong homogenization result similar to (1.7) is proved in [9, 10, 5] for the pre-Sierpiński gasket, using explicit renormalization group recursion relations for quantities analogous to $R_n^x(r)$ and $R_n^y(r)$. As the Sierpiński gasket is finitely ramified, these recursion relations are finite dimensional, and so exact calculations are possible. We expect that this kind of restoration of isotropy will occur on a wide class of fractals; see [5]. That this is difficult to prove for the Sierpiński carpet reflects the fact that it is an infinitely ramified fractal, and so the renormalization group recursion acts on an infinite dimensional space. The rigorous inequalities in Propositions 3.2, 3.3, and 4.1 provide a version of the renormalization group relations.

We conclude this section with some remarks.

1. With the change of the coordinate $y' = \sqrt{r}\, y$, the defining equation (1.2) has an isotropic expression, so our results also apply to rectangular boards made of isotropic material.

2. $F_n$ is contained in the unit square and the unit structure is of order $3^{-n}$. But the scale invariance of resistance in two dimensions implies that the effective resistances are the same if we defined $F_n$ as a figure with unit structures of order 1 and of total size $3^n \times 3^n$; i.e., constructing the figure outward instead of inward. The results in this paper hold as they are, with only minor notational changes.

3. Analogous results can also be obtained for the cross-wire networks $G_n$ introduced in [2]. The network $G_n$ is obtained from $F_n$ by replacing each of the $8^n$ squares of side $3^{-n}$ in $F_n$ by a horizontal and vertical crosswire of four linear resistors (joined at the center of the square), where each horizontal resistor has resistance $1/2$ and each vertical resistor has resistance $r/2$. (See [7] for basic facts about resistor networks.) The results in this paper hold as they are, with similar proofs.

4. Our proofs should also be effective for the class of "generalized Sierpiński carpets" considered in [2, Eq. (3.1)]. In particular, with only minor changes, they apply to $(k, \ell)$-Sierpiński carpets. Here the sets $F_n$ are constructed recursively by dividing each square of side $k^{-(n-1)}$ in $F_{n-1}$ into $k^2$ squares, and throwing out a block of $\ell^2$ squares at the center. (We take $k \ge 3$ and $k > \ell$.) The numbers appearing in the results, such as the exponents $7/9$, $7/6$, $3/2$, and $\rho$, will of course in general be different for different figures.

5. The proof of the conjecture (1.7) seems to us to be quite hard. We suspect that it is similar in difficulty to the problem of improving the inequalities

\[ \frac{1}{4} \rho^n \le R_n^x(1) \le 4 \rho^n , \quad n \ge 0, \]

given in [2], to proving the existence of the conjectured limit

\[ \lim_{n \to \infty} \rho^{-n} R_n^x(1) . \]
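The cross-wire networks $G_n$ of Remark 3 are finite resistor networks, so their effective resistances can be explored numerically. The following Python sketch builds $G_n$ with horizontal resistors of resistance $1/2$ and vertical resistors of resistance $r/2$, then solves the Dirichlet problem by Gauss-Seidel relaxation. The relaxation scheme, the doubled integer coordinates for nodes, and all function names are choices made for this illustration, not taken from the paper or from [2]. For $n = 0$ it should return resistance 1 in the x direction and $r$ in the y direction, matching (1.5).

```python
def in_carpet(i, j, n):
    """True if cell (i, j) of the 3^n x 3^n grid is kept in F_n."""
    for _ in range(n):
        if i % 3 == 1 and j % 3 == 1:
            return False
        i, j = i // 3, j // 3
    return True


def build_network(n, r):
    """Edges (node_a, node_b, conductance) of the cross-wire network G_n.

    Nodes use doubled integer coordinates: the centre of cell (i, j) is
    (2i+1, 2j+1) and edge midpoints have exactly one even coordinate, so
    neighbouring cells automatically share their common midpoint.
    """
    side = 3 ** n
    edges = []
    for i in range(side):
        for j in range(side):
            if not in_carpet(i, j, n):
                continue
            cx, cy = 2 * i + 1, 2 * j + 1
            for dx in (-1, 1):  # horizontal resistors, resistance 1/2
                edges.append(((cx, cy), (cx + dx, cy), 2.0))
            for dy in (-1, 1):  # vertical resistors, resistance r/2
                edges.append(((cx, cy), (cx, cy + dy), 2.0 / r))
    return edges


def effective_resistance(n, r, direction="x", tol=1e-13, max_sweeps=200000):
    """Effective resistance of G_n between two opposite sides of the square."""
    axis = 0 if direction == "x" else 1
    far = 2 * 3 ** n
    edges = build_network(n, r)
    nbrs = {}
    for a, b, c in edges:
        nbrs.setdefault(a, []).append((b, c))
        nbrs.setdefault(b, []).append((a, c))
    # Dirichlet data: potential 0 on the near side and 1 on the far side.
    v = {p: (1.0 if p[axis] == far else 0.0 if p[axis] == 0 else 0.5)
         for p in nbrs}
    free = [p for p in nbrs if 0 < p[axis] < far]
    for _ in range(max_sweeps):
        change = 0.0
        for p in free:
            # Kirchhoff's law: each free node takes the conductance-weighted
            # average of its neighbours' potentials.
            new = (sum(c * v[q] for q, c in nbrs[p])
                   / sum(c for _, c in nbrs[p]))
            change = max(change, abs(new - v[p]))
            v[p] = new
        if change < tol:
            break
    # Dissipated energy at unit potential difference equals 1 / resistance.
    energy = sum(c * (v[a] - v[b]) ** 2 for a, b, c in edges)
    return 1.0 / energy


if __name__ == "__main__":
    r = 4.0
    for level in range(3):
        rx = effective_resistance(level, r, "x")
        ry = effective_resistance(level, r, "y")
        print(level, rx, ry, ry / rx)
```

For the network, cutting the middle row and shorting the vertical lines $x = 1/3$, $x = 2/3$ give the elementary bounds $7/6 \le R_1^x \le 3/2$ by Rayleigh monotonicity, which makes a convenient sanity check on the solver. Pure-Python relaxation becomes slow for larger $n$; a sparse linear solver would be the natural replacement there.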

2. Basic Estimates on Energy of Harmonic Functions

Throughout this section, we fix $r > 0$ and $n \in \mathbb{Z}$. The first two propositions deal with the principle of minimum heat production in terms of potentials and currents, respectively. They are straightforward extensions of the isotropic case $r = 1$ in [2], to which we refer for a proof.

Proposition 2.1. There exists a unique function $v = V_n^x(r)$ (or $V_n^y(r)$) in $C(\bar{F}_n) \cap H^1(F_n)$ with $\nabla v \in L^2(\partial F)$ which attains the infimum of (1.2) with the boundary condition (1.3) (or (1.4), respectively);

\[ R_n^x(r)^{-1} = E_{F_n}(V_n^x(r), V_n^x(r)), \qquad R_n^y(r)^{-1} = E_{F_n}(V_n^y(r), V_n^y(r)). \tag{2.1} \]

The functions satisfy the following Laplace equation on $F_n$

\[ \frac{\partial^2 v}{\partial x^2}(x, y) + \frac{1}{r} \frac{\partial^2 v}{\partial y^2}(x, y) = 0 , \quad (x, y) \in F_n , \tag{2.2} \]

with boundary conditions (1.3) (or (1.4), respectively), and Neumann boundary conditions $\partial v / \partial n = 0$ on the rest of $\partial F_n$, except at the corners of the squares in $\partial F_n$. In particular, for $z = x, y$,

\[ 0 \le V_n^z(r)(x, y) \le 1, \quad (x, y) \in \bar{F}_n . \tag{2.3} \]

Note also that the symmetry of $F_n$ implies

\[ \begin{aligned} V_n^x(r)(x, y) &= V_n^x(r)(x, 1 - y), & 0 \le x \le 1, \ 0 \le y \le 1, \\ V_n^x(r)(x, y) + V_n^x(r)(1 - x, y) &= 1, & 0 \le x \le 1, \ 0 \le y \le 1, \end{aligned} \tag{2.4} \]

with similar relations for $V_n^y(r)$.

There is a dual formulation of resistance in terms of currents. Denote by $C(F_n)$ the set of $\mathbb{R}^2$ valued square integrable functions $j \in BV(F_n)$ (integrable functions whose derivatives in the sense of distribution are measures with finite total variations [15]), satisfying current conservation $\operatorname{div} j = 0$ (in the sense of distribution). We call an element $j = (j_x, j_y)$ of $C(F_n)$ a current on $F_n$.

Remark. Note that as $j$ is defined on the open set $F_n$, the values of $j$ on $\partial F_n$ are not defined. However, we will need to express the resistance $R_n^x(r)$ in terms of the minimum energy of a current $j$ with total flux 1 across $F_n$, and to define the class of feasible currents for this optimization problem we need to consider boundary values for currents $j \in C(F_n)$. If $j \in BV(F_n)$ then by [12, p. 325] the rough trace $j^*$ exists on $\partial F_n$. For the precise definition of $j^*$ see [12], but note from [12] that if $(x_0, y_0) \in \partial F_n$ then

\[ j^*(x_0, y_0) = \lim_{(x, y) \to (x_0, y_0)} j(x, y) \]

whenever this limit exists. Thus, essentially, for a well behaved function the rough trace is simply a continuous extension to the boundary. A general version of the Gauss–Green formula [12, p. 340] expresses an integration of $j$ over the domain $F_n$ by a contour integration of $j^*$ along $\partial F_n$. The currents we will consider in this paper have analytic continuations to $\partial F_n$, except at a finite number of points. (See the proof of Lemma 2.8 in Appendix A.) Thus we can consistently extend $j$ to the boundary $\partial F_n$, and from now on we will do so whenever necessary without further comment.

For a vector field $j = (j_x, j_y) \in L^2(F_n)$, and $B \subset F_n$, define

\[ E_B(j, j) = \int_B \left( j_x^2(x, y) + r\, j_y^2(x, y) \right) dx\, dy . \]

Proposition 2.2.

\[ R_n^x(r) = \inf \{ E_{F_n}(j, j) \} , \tag{2.5} \]

where the infimum is taken over all $j = (j_x, j_y) \in C(F_n)$ which satisfy $j \cdot n = 0$, a.e., on the boundary of $F_n$, except at two edges $x = 0$ and $x = 1$, where we impose

\[ \int_0^1 j_x(0, y)\, dy = \int_0^1 j_x(1, y)\, dy = -1 . \tag{2.6} \]

Here $n$ is the unit normal vector at the boundary of $F_n$, and $j \cdot n$ denotes inner product of vectors. The function $j = J_n^x(r)$ which attains the infimum of (2.5) exists and is unique, and is given by

\[ J_n^x(r) = \left( J_{nx}^x(r), J_{ny}^x(r) \right) = - \left( R_n^x(r) \frac{\partial V_n^x(r)}{\partial x} , \ \frac{1}{r} R_n^x(r) \frac{\partial V_n^x(r)}{\partial y} \right) . \tag{2.7} \]

Similarly, there exists a unique function $J_n^y(r) \in C(F_n)$ which satisfies

\[ R_n^y(r) = \inf \{ E_{F_n}(j, j) \} = E_{F_n}(J_n^y(r), J_n^y(r)) , \tag{2.8} \]

where $j$ satisfies similar conditions as before, with

\[ \int_0^1 j_y(x, 0)\, dx = \int_0^1 j_y(x, 1)\, dx = -1 \tag{2.9} \]

in place of (2.6).

Remark. The minus sign in (2.7) comes from the sign conventions in the boundary conditions (1.3) and (2.6), which are the traditions in the study of electricity. It is a well-known historical misfortune that not only do we need minus signs here, but the electrons in reality move in the opposite direction to the currents when they are defined in this way.

Remark. We can regard Propositions 2.1 and 2.2 as giving $R_n^x(r)$ in terms of an optimization problem and its dual. In view of this, we will use the language of optimization theory and refer, for example, to a flow which satisfies the conditions of Proposition 2.2 as a feasible flow.

Remark. If $n < 0$, then since $F_n = F_0$, we have $V_n^x = V_0^x$, $J_n^x = J_0^x$, etc.

Note that (2.3) and (2.7) imply that

\[ J_{nx}^x(r)(0, y) \ge 0, \quad 0 \le y \le 1, \]

while the symmetry of $F_n$ implies

\[ J_{nx}^x(0, y) = J_{nx}^x(1, y) = J_{nx}^x(0, 1 - y) = J_{nx}^x(1, 1 - y), \quad 0 \le y \le 1, \tag{2.10} \]

with similar relations for $J_n^y(r)$.

Next we turn to a couple of basic estimates of the energy in terms of potentials and currents.

Definition 2.3. For $G \subset F_n$, define the bilinear form

\[ E_G(f, g) = \int_G \left( \frac{\partial f}{\partial x} \frac{\partial g}{\partial x} + \frac{1}{r} \frac{\partial f}{\partial y} \frac{\partial g}{\partial y} \right) dx\, dy , \quad f, g \in C(\bar{G}) \cap H^1(G). \]

Thus $E_G$ is the Dirichlet form associated with the self-adjoint operator

\[ L = \frac{\partial^2}{\partial x^2} + \frac{1}{r} \frac{\partial^2}{\partial y^2} \]

on the space $L^2(G, \mu)$. (Here $\mu$ is Lebesgue measure.) The following lemma is an application of Cauchy-Schwarz. We write $1_G$ for the indicator function of $G$, and $\| \cdot \|_\infty$ for the $L^\infty$ norm.

Lemma 2.4. Let $f, g \in C(\bar{F}_n) \cap H^1(F_n)$. Then

\[ E_G(fg, fg) \le 2 \| g 1_G \|_\infty^2 E_G(f, f) + 2 \| f 1_G \|_\infty^2 E_G(g, g). \]

Proof. Write (just for now) $f_x = \partial f / \partial x$. Note that $((fg)_x)^2 = (f g_x + f_x g)^2 \le 2 f^2 g_x^2 + 2 g^2 f_x^2$. So,

\[ \begin{aligned} E_G(fg, fg) &\le 2 \int_G f^2 \left( g_x^2 + r^{-1} g_y^2 \right) dx\, dy + 2 \int_G g^2 \left( f_x^2 + r^{-1} f_y^2 \right) dx\, dy \\ &\le 2 \| f 1_G \|_\infty^2 E_G(g, g) + 2 \| g 1_G \|_\infty^2 E_G(f, f). \end{aligned} \]

Definition 2.5. Let $n \ge 0$, $m \ge 0$. Set

\[ \begin{aligned} B_{m,i}^x &= [0, 1] \times [i\, 3^{-m}, (i + 1)\, 3^{-m}], & 0 \le i \le 3^m - 1, \\ B_{m,i}^y &= [i\, 3^{-m}, (i + 1)\, 3^{-m}] \times [0, 1], & 0 \le i \le 3^m - 1. \end{aligned} \]

We now estimate the energy associated with the potential $V_n^x(r)$ in the thin strip $B_{m,0}^x$, which lies adjacent to the x-axis. To avoid too many subscripts we will sometimes write $E[G](f, f)$ for $E_G(f, f)$ in what follows.

Lemma 2.6. For $m, n \ge 0$,

\[ E[B_{m,0}^x \cap F_n](V_n^x(r), V_n^x(r)) \le 2^{-m} R_n^x(r)^{-1} , \tag{2.11} \]

\[ E[B_{m,0}^y \cap F_n](V_n^y(r), V_n^y(r)) \le 2^{-m} R_n^y(r)^{-1} . \tag{2.12} \]

Proof. Write

\[ E_{m,i} = E[B_{m,i}^x \cap F_n](V_n^x, V_n^x) , \]

set

\[ \tilde{B}_{m,i}^x = [0, 1] \times [i\, 2^{-1} 3^{-m+1}, (i + 1)\, 2^{-1} 3^{-m+1}], \quad i = 0, 1, \]

and let

\[ \tilde{E}_{m,i} = E[\tilde{B}_{m,i}^x \cap F_n](V_n^x, V_n^x), \quad i = 0, 1. \]

Thus we have

\[ E_{m,0} = \sum_{j=0}^{2} E_{m+1,j} = \sum_{j=0}^{1} \tilde{E}_{m+1,j} . \]

For $m \in \mathbb{Z}_+$, define a potential $v \in C(\bar{F}_n) \cap H^1(F_n)$ by

\[ v(x, y) = \begin{cases} V_n^x(r)(x, 3^{-m} - y) , & (x, y) \in \tilde{B}_{m+1,0}^x , \\ V_n^x(r)(x, y) , & (x, y) \in F_n \setminus \tilde{B}_{m+1,0}^x . \end{cases} \]

As $v$ satisfies the boundary condition (1.3), it follows from (2.1), (1.2), and the definition of $v$ that

\[ \sum_{i=0}^{3^{m+1}-1} E_{m+1,i} = R_n^x(r)^{-1} \le E_{F_n}(v, v) = 2 \tilde{E}_{m+1,1} + \sum_{i=3}^{3^{m+1}-1} E_{m+1,i} . \tag{2.13} \]

Therefore $\tilde{E}_{m+1,0} \le \tilde{E}_{m+1,1}$, and so $2 \tilde{E}_{m+1,0} \le \tilde{E}_{m+1,0} + \tilde{E}_{m+1,1} = E_{m,0}$. As $B_{m+1,0}^x \subset \tilde{B}_{m+1,0}^x$ this implies that

\[ E_{m+1,0} \le \tilde{E}_{m+1,0} \le \tfrac{1}{2} E_{m,0} . \]

Iterating, and using the fact that $E_{0,0} = R_n^x(r)^{-1}$, we obtain (2.11). Equation (2.12) follows by interchanging x and y axes.

We have a corresponding result for currents.

Lemma 2.7. For $n, m \ge 0$,

\[ E[B_{m,0}^y \cap F_n](J_n^x(r), J_n^x(r)) \le 2^{-m} R_n^x(r), \tag{2.14} \]

\[ E[B_{m,0}^x \cap F_n](J_n^y(r), J_n^y(r)) \le 2^{-m} R_n^y(r). \tag{2.15} \]

Proof. Let

\[ \begin{aligned} E'_{m,i} &= E[B_{m,i}^y \cap F_n](J_n^x(r), J_n^x(r)) , \\ \tilde{B}_{m,i}^y &= [i\, 2^{-1} 3^{-m+1}, (i + 1)\, 2^{-1} 3^{-m+1}] \times [0, 1], \quad i = 0, 1, \\ \tilde{E}'_{m,i} &= E[\tilde{B}_{m,i}^y \cap F_n](J_n^x(r), J_n^x(r)). \end{aligned} \]

So, as before, we have

\[ E'_{m,0} = \sum_{j=0}^{2} E'_{m+1,j} = \sum_{j=0}^{1} \tilde{E}'_{m+1,j} . \]

For $m \in \mathbb{Z}_+$, define a current $j \in C(F_n)$ by

\[ j(x, y) = \begin{cases} \left( J_{nx}^x(r), -J_{ny}^x(r) \right)(3^{-m} - x, y) , & (x, y) \in \tilde{B}_{m+1,0}^y , \\ J_n^x(r)(x, y) , & (x, y) \in F_n \setminus \tilde{B}_{m+1,0}^y . \end{cases} \]

It is straightforward to check that $j \in C(F_n)$ and satisfies (2.6). Therefore

\[ \sum_{i=0}^{3^{m+1}-1} E'_{m+1,i} = R_n^x(r) \le E(j, j) = 2 \tilde{E}'_{m+1,1} + \sum_{i=3}^{3^{m+1}-1} E'_{m+1,i} , \tag{2.16} \]

and the remainder of the proof proceeds as in Lemma 2.6.

The next lemma will play a crucial role when we obtain an upper bound on quantities like $R_n^x(r)$ by constructing a "feasible flow" $j \in C(F_n)$ and using the energy-minimizing principle (2.5). Except in the simplest cases, this construction requires estimates on the energy of a current which can "turn corners".

Fix (for now) $n, m \ge 0$, $r > 0$, and let $G = B_{m,0}^y \cap F_n$. Let $R_G$ be the resistance of $G$ between the lines $y = 0$ and $y = 1$. We define (and calculate) $R_G$ by the methods of Propositions 2.1 and 2.2. Thus

\[ R_G = \inf \{ E_G(j, j) \} , \tag{2.17} \]

where the infimum is over currents $j$ on $G$ satisfying the boundary conditions

\[ \int_0^{3^{-m}} j_y(x, 0)\, dx = \int_0^{3^{-m}} j_y(x, 1)\, dx = -1, \]

and $j \cdot n = 0$ a.e. on the remainder of the boundary of $G$. As $G$ consists of $3^m$ scaled copies of $F_{n-m}$, it is easy to see that the infimum in (2.17) is attained by the current $\tilde{J}$ obtained by piecing together $3^m$ scaled copies of $J_{n-m}^y(r)$:

\[ \tilde{J}(x, y) = 3^m J_{n-m}^y(r)(3^m x, 3^m y - [3^m y]), \quad (x, y) \in G. \]

Here $[3^m y]$ is the largest integer less than or equal to $3^m y$. Therefore

\[ R_G = E_G(\tilde{J}, \tilde{J}) = 3^m R_{n-m}^y(r). \]

The following result is proved in the Appendix.

Lemma 2.8. There exists $L = L^{(n,m)} \in BV(F_n) \cap L^2(F_n)$ satisfying

\[ \operatorname{div}(L) = 0 \ \text{(as a distribution) on } G, \tag{2.18} \]
\[ L = 0 \ \text{on } F_n - \bar{G}, \tag{2.19} \]
\[ L = J_n^x(r) \ \text{in a neighborhood of } \{ x = 0, \ 0 < y < 1 \}, \tag{2.20} \]
\[ L = -\tilde{J} \ \text{in a neighborhood of } \{ 0 < x < 3^{-m}, \ y = 0 \}, \tag{2.21} \]
\[ \frac{\partial L}{\partial n} = 0 \ \text{a.e. on the remainder of the boundary of } G, \tag{2.22} \]

such that

\[ E_G(L, L) \le E_G(J_n^x(r), J_n^x(r)) + E_G(\tilde{J}, \tilde{J}) \le 2^{-m} R_n^x(r) + 3^m R_{n-m}^y(r). \]

The current $L$ constructed in Lemma 2.8 provides a current which has total flux 1 across $G$ coming in from the left edge $x = 0$ and going out at the bottom edge $y = 0$. $L$ will be considered as a part of a current in the larger domain in such a way that the boundary condition (current conservation at the boundary of $G$) specified by (2.20) and (2.21) must be satisfied.

3. Recursion Relations Effective in the Anisotropic Regime

3.1. Basic tools. We begin with some elementary observations.

Lemma 3.1. For $r \in (0, \infty)$ and $n \in \mathbb{Z}_+$,

\[ R_n^x(r) = r R_n^y(1/r), \qquad H_n(r) = H_n(1/r)^{-1} . \]

Proof. Fix $n$, and write $S^x(a, b)$ for the resistance in the x direction of $F_n$, if it is composed of anisotropic material with resistivity $a$ in the x direction, and $b$ in the y direction, and define $S^y(a, b)$ analogously. Then $S^x(a, b) = S^y(b, a)$, $R_n^x(r) = S^x(1, r)$, $R_n^y(r) = S^y(1, r)$, $S^x(\lambda a, \lambda b) = \lambda S^x(a, b)$, and so $R_n^x(r) = S^x(1, r) = S^y(r, 1) = r S^y(1, r^{-1}) = r R_n^y(r^{-1})$. Also,

\[ H_n(r) = \frac{S^y(1, r)}{S^x(1, r)} = \frac{S^x(r, 1)}{S^y(r, 1)} = \frac{1}{H_n(r^{-1})} . \]

The following two propositions give the recursion relations which are the essential tools for this section.

Proposition 3.2. Let $r > 0$, $n \ge 1$, and $m \ge 2$. Then

\[ R_{n-1}^x(r)^{-1} \le \frac{3}{2} R_n^x(r)^{-1} \le \left( 1 + \frac{a_1}{2^m} \right) R_{n-1}^x(r)^{-1} + A_1 3^m R_{n-m}^y(r)^{-1} , \tag{3.1} \]

\[ R_{n-1}^y(r)^{-1} \le \frac{3}{2} R_n^y(r)^{-1} \le \left( 1 + \frac{a_1}{2^m} \right) R_{n-1}^y(r)^{-1} + A_1 3^m R_{n-m}^x(r)^{-1} , \tag{3.2} \]

where $a_1 = 8/3$, $A_1 = 4/9$.

Proposition 3.3. Let $r > 0$, $n \ge 1$, and $m \ge 2$. Then

\[ R_{n-1}^x(r) \le \frac{6}{7} R_n^x(r) \le \left( 1 + \frac{a_2}{2^m} \right) R_{n-1}^x(r) + A_2 3^m R_{n-m}^y(r) , \tag{3.3} \]

\[ R_{n-1}^y(r) \le \frac{6}{7} R_n^y(r) \le \left( 1 + \frac{a_2}{2^m} \right) R_{n-1}^y(r) + A_2 3^m R_{n-m}^x(r) , \tag{3.4} \]

where $a_2 = 16/7$, $A_2 = 4/21$.

Remark. Equations (3.1) and (3.4) are good bounds when $H_n(r) \gg 1$, while (3.2) and (3.3) are good when $H_n(r) \ll 1$: in each case the extra term on the right-hand side is then small compared with the leading one. While we have, for clarity, given four separate inequalities, (3.2) and (3.4) are immediate consequences of (3.1), (3.3) and Lemma 3.1. So we need only prove (3.1) and (3.3).

Definition 3.4. Denote the eight scaled copies of $\bar{F}_{n-1}$ which compose $\bar{F}_n$ by

\[ A_{ij} = \left( [i/3, (i + 1)/3] \times [j/3, (j + 1)/3] \right) \cap \bar{F}_n , \quad (i, j) \in \{0, 1, 2\}^2 \setminus \{(1, 1)\} . \]

The left-hand side inequalities in Propositions 3.2 and 3.3 are easy; this is essentially just a standard argument involving shorts and cuts. See [7], [6].

Proof of the left-hand side of (3.1). Define a potential $v \in C(\bar{F}_n) \cap H^1(F_n)$ by

\[ v(x, y) = \begin{cases} \frac{2}{7} V_{n-1}^x(r)(3x, 3y - j) , & (x, y) \in A_{0j}, \ j = 0, 1, 2, \\ \frac{2}{7} + \frac{3}{7} V_{n-1}^x(r)(3x - 1, 3y - j) , & (x, y) \in A_{1j}, \ j = 0, 2, \\ \frac{5}{7} + \frac{2}{7} V_{n-1}^x(r)(3x - 2, 3y - j) , & (x, y) \in A_{2j}, \ j = 0, 1, 2. \end{cases} \]

Then $v$ is continuous, and using (1.2) we have

\[ \begin{aligned} R_n^x(r)^{-1} \le E(v, v) &\le 6 E_{F_{n-1}}\!\left( \tfrac{2}{7} V_{n-1}^x(r), \tfrac{2}{7} V_{n-1}^x(r) \right) + 2 E_{F_{n-1}}\!\left( \tfrac{3}{7} V_{n-1}^x(r), \tfrac{3}{7} V_{n-1}^x(r) \right) \\ &= \frac{6}{7} R_{n-1}^x(r)^{-1} . \end{aligned} \]

Proof of the left-hand side of (3.3). Define a current $j \in C(F_n)$ by

\[ j(x, y) = \begin{cases} \frac{3}{2} J_{n-1}^x(r)(3x - i, 3y - j) , & (x, y) \in A_{ij}, \ i = 0, 1, 2, \ j = 0, 2, \\ 0, & (x, y) \in A_{01} \cup A_{21} . \end{cases} \]

Then it is easy to check that $j$ satisfies the current conservation and the boundary conditions given in Proposition 2.2, so that by (2.5), and the fact that $A_{ij} \cong 3^{-1} F_{n-1}$, we have

\[ R_n^x(r) \le E(j, j) = 6 E_{F_{n-1}}\!\left( \tfrac{1}{2} J_{n-1}^x(r), \tfrac{1}{2} J_{n-1}^x(r) \right) = \frac{3}{2} R_{n-1}^x(r) . \]

The proofs of the right-hand side inequalities in Propositions 3.2 and 3.3 are more involved.

Proof of the right-hand side of (3.1). As $r$ will be fixed throughout this proof, we will simplify notation by writing $V_n^x = V_n^x(r)$, $J_n^x = J_n^x(r)$, etc. Fix $n \ge 0$, $m \ge 2$, set $k = n - m$ and recall our convention that $V_k^x = V_0^x$ if $k < 0$. Set

\[ \varphi(x, y) = \frac{1}{3} V_{n-1}^x(3x - i, 3y - j) + \frac{i}{3} , \quad \text{if } (x, y) \in A_{ij}, \ (i, j) \ne (1, 1). \]

Note that $\varphi \in C(\bar{F}_n) \cap H^1(F_n)$ and

\[ E[A_{ij} \cap F_n](\varphi, \varphi) = \frac{1}{9} (R_{n-1}^x)^{-1} \quad \text{for } (i, j) \ne (1, 1). \]

Now let

\[ \psi(x, y) = \begin{cases} 1, & (x, y) \in \bar{F}_n \setminus (A_{01} \cup A_{21}) , \\ 0, & (x, y) \in (A_{01} \cup A_{21}) \setminus (B_{m,3^{m-1}}^x \cup B_{m,2 \cdot 3^{m-1}-1}^x) , \\ V_{n-m}^y(3^m x - [3^m x], 3^m (y - \tfrac{1}{3})), & (x, y) \in (A_{01} \cup A_{21}) \cap B_{m,3^{m-1}}^x , \\ V_{n-m}^y(3^m x - [3^m x], 3^m (\tfrac{2}{3} - y)), & (x, y) \in (A_{01} \cup A_{21}) \cap B_{m,2 \cdot 3^{m-1}-1}^x . \end{cases} \]

We can check that $\psi$ is continuous, and so $\psi \in C(\bar{F}_n) \cap H^1(F_n)$. Note that $\varphi$, $\psi$ are symmetric about the line $y = \tfrac{1}{2}$, and that

\[ \varphi(x, y) + \varphi(1 - x, y) = 1, \qquad \psi(x, y) = \psi(1 - x, y). \tag{3.5} \]

Set

\[ v(x, y) = \begin{cases} \varphi(x, y)\, \psi(x, y) , & 0 < x \le \tfrac{1}{2}, \ 0 \le y \le 1, \\ 1 - (1 - \varphi(x, y))\, \psi(x, y) , & \tfrac{1}{2} \le x \le 1, \ 0 \le y \le 1. \end{cases} \]

Continuity of $v$ follows from that of $\varphi$ and $\psi$, and (3.5); thus $v \in C(\bar{F}_n) \cap H^1(F_n)$. It is also easy to see that $v$ satisfies the boundary conditions (1.3). Noting that $A_{ij} \cong 3^{-1} F_{n-1}$ and using the symmetry of $v$, we have

\[ (R_n^x)^{-1} \le E_{F_n}(v, v) = 4 E[A_{00} \cap F_n](v, v) + 2 E[A_{01} \cap F_n](v, v) + 2 E[A_{10} \cap F_n](v, v). \tag{3.6} \]

As $\psi = 1$ on $A_{00} \cup A_{10}$ we have for $j = 0, 1$,

\[ E[A_{j0} \cap F_n](v, v) = E[A_{00} \cap F_n](\varphi, \varphi) = \frac{1}{9} R_{n-1}^x(r)^{-1} . \tag{3.7} \]

Now set $G = [0, \tfrac{1}{3}] \times [\tfrac{1}{3}, \tfrac{1}{3} + 3^{-m}]$. As $\psi = 0$ on $(A_{01} - G) \cap \{ y < \tfrac{1}{2} \}$, by the symmetry of $G$ and Lemma 2.4,

\[ \begin{aligned} E[A_{01}](v, v) = 2 E_G(\varphi \psi, \varphi \psi) &\le 4 \| \psi 1_G \|_\infty^2 E_G(\varphi, \varphi) + 4 \| \varphi 1_G \|_\infty^2 E_G(\psi, \psi) \\ &= 4 E_G(\varphi, \varphi) + \frac{4}{9} E_G(\psi, \psi). \end{aligned} \tag{3.8} \]

Using scaling and Lemma 2.6,

\[ E_G(\varphi, \varphi) = \frac{1}{9} E[B_{m-1,0}^x \cap F_{n-1}](V_{n-1}^x, V_{n-1}^x) \le \frac{2^{1-m}}{9} (R_{n-1}^x)^{-1} , \tag{3.9} \]

while as $G$ consists of $3^{m-1}$ segments, each congruent to $3^{-m} F_{n-m}$,

\[ E_G(\psi, \psi) = 3^{m-1} E(V_{n-m}^y, V_{n-m}^y) = 3^{m-1} (R_{n-m}^y)^{-1} . \tag{3.10} \]

Combining (3.6), (3.7), (3.8), (3.9), and (3.10) we deduce that

\[ \begin{aligned} (R_n^x)^{-1} &\le \frac{2}{3} (R_{n-1}^x)^{-1} + 2 \left( \frac{4}{9} 2^{-m+1} (R_{n-1}^x)^{-1} + \frac{4}{9} 3^{m-1} (R_{n-m}^y)^{-1} \right) \\ &= \frac{2}{3} \left[ (R_{n-1}^x)^{-1} \left( 1 + \frac{8}{3} 2^{-m} \right) + \frac{4}{9} 3^m (R_{n-m}^y)^{-1} \right] . \end{aligned} \]

Proof of the right-hand side of (3.3). This proof uses similar ideas to the one given above, but as we have to work with currents rather than potentials, it is a bit more complicated. Define a vector field $K^1$ on $F_n$ by

\[ K^1(x, y) = \begin{cases} J_{n-1}^x(3x - i, 3y - j), & (x, y) \in A_{ij}, \ i = 0, 2, \ 0 \le j \le 2, \\ \frac{3}{2} J_{n-1}^x(3x - i, 3y - j), & (x, y) \in A_{1j}, \ j = 0, 2. \end{cases} \]

Then $K^1$ is piecewise continuous, and $\operatorname{div}(K^1) = 0$ on $\operatorname{int}(A_{ij})$, for $(i, j) \ne (1, 1)$, but $K^1$ has a jump discontinuity on the lines $x = \tfrac{1}{3}$, $x = \tfrac{2}{3}$. Thus, writing $J_{n-1,x}^x$ for the x component of $J_{n-1}^x$, we have

\[ K_x^1(\tfrac{1}{3}-, y) = J_{n-1,x}^x(1, 3y), \qquad K_x^1(\tfrac{1}{3}+, y) = \frac{3}{2} J_{n-1,x}^x(1, 3y), \qquad y \in [0, \tfrac{1}{3}]. \]

We now modify $K^1$ to obtain a current satisfying the conditions of Proposition 2.2. Essentially, we use the current $L$, defined in Lemma 2.8, to move the excess current arriving at the left-hand edges of the squares $A_{10}$, $A_{12}$ to the right-hand edge of $A_{01}$. Let $L \in BV(F_{n-1}) \cap L^2(F_{n-1})$ be $L^{(n-1,m-1)}$ defined in Lemma 2.8. Recall that $L = 0$ except on $B_{m-1,0}^y \cap F_{n-1}$. Put

\[ L^0(x, y) = L(1 - x, 1 - y), \qquad L^2(x, y) = \left( L_x(1 - x, y), -L_y(1 - x, y) \right), \qquad L^1(x, y) = -L^0 - L^2 . \]
y Since div(L) = 0 on Bn−1,0 ∩ Fn−1 , we have div(Li ) = 0 for 0 ≤ i ≤ 2. Define a vector 2 field K by

1 j (x, y) ∈ A0j , 0 ≤ j ≤ 2,   2 L (3x, 3y − j),    0, (x, y) ∈ A1j , j = 0, 2, K 2 (x, y) =  1   (Lj (1 − 3x, 3y − j)   2 jx −Ly (1 − 3x, 3y − j)), (x, y) ∈ A2j , 0 ≤ j ≤ 2. Now let K = K 1 + K 2 ; then K ∈ C(Fn ). To see this, note that for 0 ≤ y ≤ 13 , Kx2 ( 13 −, y) = 21 L0x (1−, 3y) = 21 Lx (0+, 1 − 3y) x x = 21 Jnx (0+, 1 − 3y) = 21 Jnx (1, 3y),

so that 3 x J (1, 3y) = Kx1 ( 13 +, y). 2 nx With a number of similar calculations, this shows that div(K) = 0. Therefore, using the symmetry of Fn and K, Kx1 ( 13 −, y) + Kx2 ( 13 −, y) =

Rnx ≤ E(K, K) = 4EA00 (K, K) + 2EA01 (K, K) + 2EA10 (K, K),

(3.11)

and it remains to estimate the terms in (3.11). Note first that

\[ E_{A_{10}}(K, K) = E_{A_{10}}(K^1, K^1) = \left( \frac{3}{2} \right)^2 \frac{1}{9} R_{n-1}^x = \frac{1}{4} R_{n-1}^x , \]

and

\[ E_{A_{01}}(K^1, K^1) = E_{A_{00}}(K^1, K^1) = \frac{1}{9} R_{n-1}^x . \]

Let $H = [0, \tfrac{1}{3} - 3^{-m}] \times [0, 1]$, and $G = [\tfrac{1}{3} - 3^{-m}, \tfrac{1}{3}] \times [0, 1]$. As $K^2 = 0$ on $H$ we have for $j = 0, 1$,

\[ \begin{aligned} E_{A_{0j}}(K, K) &= E_{A_{0j} \cap H}(K^1, K^1) + E_{A_{0j} \cap G}(K^1 + K^2, K^1 + K^2) \\ &\le E_{A_{0j}}(K^1, K^1) + E_{A_{0j} \cap G}(K^1, K^1) + 2 E_{A_{0j}}(K^2, K^2). \end{aligned} \]

Using symmetry, and Lemma 2.7, for $j = 0, 1$,

\[ E_{A_{0j} \cap G}(K^1, K^1) = \frac{1}{9} E[F_{n-1} \cap B_{m-1,0}^y](J_{n-1}^x, J_{n-1}^x) \le \frac{1}{9} 2^{-(m-1)} R_{n-1}^x . \]

From the definition of $K^2$,

\[ E_{A_{00}}(K^2, K^2) = \frac{1}{36} E_{F_{n-1}}(L, L), \]

and

\[ \begin{aligned} E_{A_{01}}(K^2, K^2) &= \frac{1}{36} E_{F_{n-1}}(L^1, L^1) \\ &\le \frac{1}{36} \left( 2 E_{F_{n-1}}(L^0, L^0) + 2 E_{F_{n-1}}(L^2, L^2) \right) \\ &= \frac{1}{9} E_{F_{n-1}}(L, L). \end{aligned} \]

Finally, by Lemma 2.8,

\[ E_{F_{n-1}}(L, L) \le 2^{-(m-1)} R_{n-1}^x + 3^{m-1} R_{n-m}^y . \]

Therefore, substituting in (3.11), 2 Rnx ≤ 6EA00 (K 1 , K 1 ) + 6EA00 ∩G (K 1 , K 1 ) + EFn−1 (L, L) + 2EA10 (K 1 , K 1 ) 3 8 −m x 2 −(m−1) x 7 x y Rn−1 + 3m−1 Rn−m ) ≤ Rn−1 + 2 Rn−1 + (2 6 3 3 16 4 7 y x (1 + 2−m )Rn−1 , + 3m Rn−m = 6 7 21 which completes the proof of 3.3.

3.2. Proof of Theorem 1.1. Fix $r > 0$. The left-hand inequalities of Propositions 3.2 and 3.3 imply, for $n \ge k \ge 0$,

\[ R_k^x(r)^{-1} \le \left( \frac{3}{2} \right)^{n-k} R_n^x(r)^{-1} , \quad \text{and} \quad R_k^y(r) \le \left( \frac{6}{7} \right)^{n-k} R_n^y(r) , \tag{3.12} \]

hence

\[ H_k(r) \le \frac{9}{7} H_{k+1}(r) , \quad k \ge 0. \tag{3.13} \]

Since, by (3.12), we have $R_{n-m}^y(r) \le (6/7)^{m-1} R_{n-1}^y(r)$, for $n \ge m \ge 1$, it follows from (3.3) that for $n \ge m \ge 2$,

\[ \begin{aligned} R_n^x(r) &\le \frac{7}{6} \left( 1 + a_2 2^{-m} + A_2 3^m \frac{R_{n-m}^y(r)}{R_{n-1}^x(r)} \right) R_{n-1}^x(r) \\ &\le \frac{7}{6} \left( 1 + a_2 2^{-m} + \frac{7}{6} A_2 \theta_2^m H_{n-1}(r) \right) R_{n-1}^x(r) , \end{aligned} \]

where $\theta_2 = 18/7$. Similarly, we have

\[ R_n^y(r)^{-1} \le \frac{2}{3} \left( 1 + a_1 2^{-m} + \frac{2}{3} A_1 \theta_1^m H_{n-1}(r) \right) R_{n-1}^y(r)^{-1} , \quad n \ge m \ge 2, \]

where $\theta_1 = 9/2$. Combining these inequalities, we obtain

\[ H_n(r) \ge \frac{H_{n-1}(r)}{G_m(H_{n-1}(r))} , \quad n \ge m \ge 2, \tag{3.14} \]

where

\[ G_m(x) = \frac{7}{9} \left( 1 + a_1 2^{-m} + \frac{2}{3} A_1 \theta_1^m x \right) \left( 1 + a_2 2^{-m} + \frac{7}{6} A_2 \theta_2^m x \right) . \tag{3.15} \]

Now let $m$ be large enough so that $G_m(0) < 1$, and let $\delta_m > 0$ be such that $G_m(\delta_m) = 1$. Let $\eta$ be an arbitrary number satisfying $0 \le \eta < \delta_m$, and put $\alpha = 1/G_m(\eta)$. We have $G_m(x)^{-1} \ge \alpha > 1$ for $0 \le x \le \eta$. Hence, by (3.14),

\[ H_{n+1}(r) \ge \alpha H_n(r), \quad \text{whenever } H_n(r) \le \eta. \tag{3.16} \]

It follows immediately that there exists an integer $n_0 \ge m$ such that $H_{n_0}(r) > \eta$. Now suppose $k \ge n_0$ and $H_k(r) \ge \tfrac{7}{9}\eta$. If $H_k(r) \le \eta$, then by (3.16) $H_{k+1}(r) > H_k(r) \ge \tfrac{7}{9}\eta$. On the other hand, if $H_k(r) > \eta$, then by (3.13) $H_{k+1}(r) \ge \tfrac{7}{9} H_k(r) > \tfrac{7}{9}\eta$. Thus in either case $H_{k+1}(r) > \tfrac{7}{9}\eta$, and so, by induction, we deduce that

$$H_n(r) \ge \tfrac{7}{9}\,\eta, \qquad \text{for } n \ge n_0.$$

This holds for any $\eta < \delta_m$, hence

$$\liminf_{n\to\infty} H_n(r) \ge \tfrac{7}{9}\,\delta_m.$$

Since this holds for any r > 0, and $H_n(r) = H_n(1/r)^{-1}$, we also deduce that

$$\limsup_{n\to\infty} H_n(r) \le \tfrac{9}{7}\,(\delta_m)^{-1},$$

proving the theorem.
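As a numerical aside, the threshold $\delta_m$ defined by $G_m(\delta_m) = 1$ in (3.15) can be located by bisection, since $G_m$ is a product of two increasing affine functions of x whenever the constants are positive. The following is only a minimal sketch: the function names are ours, and the values passed for $a_1, A_1, a_2, A_2$ are placeholders chosen for illustration, not the constants of the propositions, so the printed number is not a bound from the paper.

```python
THETA1 = 9 / 2    # theta_1 from the proof
THETA2 = 18 / 7   # theta_2 from the proof


def G(m, x, a1, A1, a2, A2):
    """G_m(x) as in (3.15); a1, A1, a2, A2 are supplied by the caller."""
    first = 1 + a1 * 2.0 ** (-m) + (2 / 3) * A1 * THETA1 ** m * x
    second = 1 + a2 * 2.0 ** (-m) + (7 / 6) * A2 * THETA2 ** m * x
    return (7 / 9) * first * second


def delta(m, a1, A1, a2, A2, steps=200):
    """Bisection for the positive root of G_m(x) = 1; needs G_m(0) < 1."""
    if G(m, 0.0, a1, A1, a2, A2) >= 1:
        raise ValueError("G_m(0) >= 1: take m larger")
    lo, hi = 0.0, 1.0
    while G(m, hi, a1, A1, a2, A2) < 1:   # enlarge the bracket if necessary
        hi *= 2
    for _ in range(steps):
        mid = (lo + hi) / 2
        if G(m, mid, a1, A1, a2, A2) < 1:
            lo = mid
        else:
            hi = mid
    return hi


# Placeholder constants (NOT the paper's values), with m = 5.
print(delta(5, 1.0, 1.0, 1.0, 1.0))
```

With the actual constants in place of the placeholders, `delta(m, ...)` would give the quantity entering the bounds $\tfrac{7}{9}\delta_m$ and $\tfrac{9}{7}\delta_m^{-1}$ above.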

Remark. A numerical bound for the asymptotic values of $H_n(r)$ is obtained by computing $\delta_m$. If we use the explicit values for the constants in $G_m$, we find $G_m(0) < 1$ for m ≥ 5, and that $\delta_5 \ge 2.03039 \times 10^{-4}$, which leads to the numbers given in Sect. 1.

3.3. Proof of Theorem 1.3. We begin with a lemma.

Lemma 3.5. Let $f_n(r)$, r ∈ [0, ∞), n ≥ 0, be a sequence of functions satisfying, for constants α > 1, β > 0, θ > 1, $c_i \in (0, \infty)$,

$$\beta f_{n-1}(r) \le f_n(r) \le \beta f_{n-1}(r)\,(1 + c_1 2^{-m} + r c_2 \alpha^m \theta^n), \tag{3.17}$$

for all n ≥ 1 and m ≥ 2. Then if $\xi = \log 2/\log(2\alpha)$ there exist constants $s_0$, $c_5$, depending only on α, θ, $c_i$, such that

$$1 \le \frac{\beta^{-n} f_n(\theta^{-n}s)}{f_0(\theta^{-n}s)} \le \exp(c_5 s^{\xi}), \qquad 0 < s \le s_0,\ n \ge 1.$$

Proof. Let n ≥ 1 be fixed, and choose $m_i \ge 2$ for 1 ≤ i ≤ n. Then iterating (3.17) we obtain for r > 0

$$\beta^n \le f_n(r)/f_0(r) \le \beta^n \prod_{i=1}^{n} (1 + c_1 2^{-m_i} + r c_2 \alpha^{m_i} \theta^{i}).$$

So, setting $r = \theta^{-n}s$, $k_i = m_{n-i}$, $j = n - i$ we have

$$0 \le \log\bigl(\beta^{-n} f_n(\theta^{-n}s)/f_0(\theta^{-n}s)\bigr) \le c_1 \sum_{j=0}^{n-1} 2^{-k_j} + c_2 s \sum_{j=0}^{n-1} \alpha^{k_j} \theta^{-j}.$$

Choose b > 0 such that $2^{-b} < 1$ and $\alpha^b < \theta$ (so b depends only on α, θ), let

$$a = \frac{\log(1/s)}{\log(2\alpha)},$$

and let $k_j$ satisfy

$$a + bj \le k_j < 1 + a + bj, \qquad 0 \le j \le n - 1.$$

Then $k_j \ge 2$, provided $s \le s_0 = (2\alpha)^{-2}$. Thus

$$0 \le \log\bigl(\beta^{-n} f_n(\theta^{-n}s)/f_0(\theta^{-n}s)\bigr) \le c_1 \sum_{j=0}^{\infty} 2^{-a-bj} + c_2 s\, \alpha^{a+1} \sum_{j=0}^{\infty} (\alpha^b \theta^{-1})^j = c_3(\alpha, \theta)\, 2^{-a} + c_4(\alpha, \theta)\, \alpha^a s = c_5(\alpha, \theta)\, s^{\xi}.$$

Proof of Theorem 1.3. The left-hand side inequalities of Propositions 3.2 and 4.3 imply that, for z = x, y,

$$R_n^z(r)^{-1} \le (6/7)^n R_0^z(r)^{-1}, \qquad R_n^z(r) \le (3/2)^n R_0^z(r). \tag{3.18}$$

It follows that (treating the cases m ≤ n, m > n separately)

$$\frac{R^y_{n-m}(r)}{R^x_{n-1}(r)} \le (7/6)(9/7)^n r, \qquad n \ge 1,\ m \ge 1.$$

Therefore (3.3) implies that for n ≥ 1, m ≥ 2,

$$\frac{7}{6} R^x_{n-1}(r) \le R^x_n(r) \le \frac{7}{6} R^x_{n-1}(r)\bigl(1 + a_2 2^{-m} + r(7/6) A_2 (9/7)^n 3^m\bigr).$$

So, by Lemma 3.5, taking $f_n(r) = R^x_n(r)$, β = 7/6, θ = 9/7, α = 3, $\xi_1 = \log 2/\log 6$, we obtain

$$1 \le (6/7)^n R^x_n((7/9)^n s) \le \exp(c s^{\xi_1}), \qquad n \ge 1,\ s \le s_0. \tag{3.19}$$

Here $s_0 = 1/36$ and c ∈ (0, ∞). Using Lemma 3.1 and (3.19), we obtain, replacing s by $s^{-1}$,

$$1 \le (2/3)^n R^y_n((9/7)^n s)\, s^{-1} \le \exp(c s^{-\xi_1}), \qquad n \ge 1,\ s \ge s_0^{-1}. \tag{3.20}$$

In a similar fashion we have, if n ≥ 1, m ≥ 1, k = max(n − m, 0),

$$\frac{R^x_{n-1}(r)}{R^y_{n-m}(r)} \le (3/2)^{n-1} (6/7)^k r^{-1} \le (2/(3r))(9/7)^n (7/6)^m.$$

So, using (3.1), and replacing r by $r^{-1}$, for n ≥ 1, m ≥ 2,

$$\frac{2}{3} R^x_{n-1}(1/r)^{-1} \le R^x_n(1/r)^{-1} \le \frac{2}{3} R^x_{n-1}(1/r)^{-1}\bigl(1 + a_1 2^{-m} + r(2A_1/3)(9/7)^n (7/2)^m\bigr). \tag{3.21}$$


Taking $f_n(r) = R^x_n(1/r)^{-1}$, β = 2/3, θ = 9/7, α = 7/2, $\xi_2 = \log 2/\log 7$, we obtain by Lemma 3.5,

$$1 \le (3/2)^n R^x_n((9/7)^n s)^{-1} \le \exp(c s^{-\xi_2}), \qquad n \ge 1,\ s \ge s_0^{-1}. \tag{3.22}$$

Using Lemma 3.1 this implies that

$$1 \le (7/6)^n R^y_n((7/9)^n s)^{-1} s \le \exp(c s^{\xi_2}), \qquad n \ge 1,\ s \le s_0. \tag{3.23}$$

Multiplying together (3.22) and (3.20) gives the theorem.
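The choice $a = \log(1/s)/\log(2\alpha)$ in the proof of Lemma 3.5 is what balances the two error terms: both $2^{-a}$ and $s\alpha^{a}$ equal $s^{\xi}$ with $\xi = \log 2/\log(2\alpha)$. The sketch below checks this identity numerically and evaluates ξ and $s_0 = (2\alpha)^{-2}$ for the two parameter choices used above (α = 3 and α = 7/2). The function names are ours, introduced only for illustration.

```python
import math


def xi(alpha):
    """Exponent xi = log 2 / log(2*alpha) from Lemma 3.5."""
    return math.log(2) / math.log(2 * alpha)


def balance(alpha, s):
    """Return (2**-a, s*alpha**a, s**xi) with a = log(1/s)/log(2*alpha)."""
    a = math.log(1 / s) / math.log(2 * alpha)
    return 2.0 ** (-a), s * alpha ** a, s ** xi(alpha)


for alpha in (3.0, 3.5):
    s0 = (2 * alpha) ** -2           # threshold s_0 = (2*alpha)^(-2) from the proof
    terms = balance(alpha, s0 / 10)  # any 0 < s <= s0 can be used here
    print(alpha, xi(alpha), s0, terms)
```

The three entries of `terms` agree up to rounding, which is the cancellation behind the final equality $c_3 2^{-a} + c_4 \alpha^a s = c_5 s^{\xi}$ in the proof of the lemma.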

4. Asymptotic Behavior of Effective Resistances

4.1. Statement of the results. For the isotropic case r = 1, it is proved in [2] that there exists a constant ρ > 1 such that

$$4^{-1} \rho^n \le R_n \le 4 \rho^n, \qquad n \ge 0, \tag{4.1}$$

where $R_n \stackrel{\mathrm{def}}{=} R^x_n(1) = R^y_n(1)$. (It is also proved there that 7/6 ≤ ρ ≤ 1.27656; calculations of $R_n$, 1 ≤ n ≤ 7, suggest that ρ ≈ 1.25149. See [2] and [4].) The proof uses the inequalities

$$4^{-1} R_m R_n \le R_{n+m} \le 4 R_m R_n, \qquad n \ge 0,\ m \ge 0.$$

The following proposition extends this result to the anisotropic case r ≠ 1. Theorem 1.2 follows at once if we put m = 0 in Proposition 4.1.

Proposition 4.1. For z = x, y, r > 0, n, m ∈ $\mathbb{Z}_+$,

$$R^z_{n+m}(r)^{-1} \le 16\, \rho^{-n} \bigl(R^x_m(r)^{-1} + R^y_m(r)^{-1}\bigr), \tag{4.2}$$

$$R^z_{n+m}(r) \le 8\, \rho^{n} \bigl(R^x_m(r) + R^y_m(r)\bigr). \tag{4.3}$$

Remark. The proof below also implies the bounds with $R_n$ in place of $\rho^n$ in both (4.2) and (4.3).

To prove the proposition, we first recall results proved in [2], which relate $R_n$ to the effective resistances for crosswire resistance networks. For $i = 0, 1, \dots, 3^n - 1$, and $j = 0, 1, \dots, 3^n - 1$, let $\Lambda_{ij}$ be the closure in $\mathbb{R}^2$ of $([i3^{-n}, (i+1)3^{-n}] \times [j3^{-n}, (j+1)3^{-n}]) \cap F_n$, and let

$$S_n = \{(i, j) \in \{0, 1, \dots, 3^n - 1\}^2 : \Lambda_{ij} \ne \emptyset\}.$$

Given $a = \{a_{i,j} : i = 0, 1, \dots, 3^n,\ j = 0, 1, \dots, 3^n\}$, set

$$\bar a_{ij} = 4^{-1} \sum_{\alpha=0}^{1} \sum_{\beta=0}^{1} a_{i+\alpha, j+\beta}, \tag{4.4}$$

and define

$$K^D(a) = \sum_{(i,j) \in S} \sum_{\alpha=0}^{1} \sum_{\beta=0}^{1} (a_{i+\alpha, j+\beta} - \bar a_{ij})^2.$$

Define $R^D_n$ by

$$(R^D_n)^{-1} = \inf_a \{K^D(a) \mid a_{0,j} = 0,\ a_{3^n, j} = 1,\ j = 0, 1, \dots, 3^n\}. \tag{4.5}$$
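To make the definitions (4.4) and (4.5) concrete, the sketch below builds the index set $S_n$ and evaluates $K^D(a)$ for the linear trial configuration $a_{i,j} = i\,3^{-n}$, which satisfies the boundary conditions in (4.5). It assumes that $F_n$ is the standard pre-carpet, so that a cell (i, j) is kept unless some pair of base-3 digits of (i, j) equals (1, 1); the helper names are ours. Each kept cell contributes $3^{-2n}$, so the trial value is $(8/9)^n$ and (4.5) yields only the crude bound $R^D_n \ge (9/8)^n$. The linear configuration is not claimed to be the minimizer.

```python
from fractions import Fraction


def in_carpet(i, j, n):
    """True if cell (i, j) of side 3**-n survives n removal steps (assumed standard carpet)."""
    for _ in range(n):
        if i % 3 == 1 and j % 3 == 1:   # middle cell at this scale is removed
            return False
        i, j = i // 3, j // 3
    return True


def index_set(n):
    """The set S_n of indices of non-empty cells."""
    size = 3 ** n
    return [(i, j) for i in range(size) for j in range(size) if in_carpet(i, j, n)]


def K_D(a, cells):
    """K^D(a): squared deviations of the four corner values from their cell mean (4.4)."""
    total = Fraction(0)
    for i, j in cells:
        corners = [a[i + al][j + be] for al in (0, 1) for be in (0, 1)]
        mean = sum(corners) / 4
        total += sum((c - mean) ** 2 for c in corners)
    return total


n = 2
size = 3 ** n
# Linear trial configuration a[i][j] = i / 3**n (0 on the left edge, 1 on the right edge).
a = [[Fraction(i, size) for _ in range(size + 1)] for i in range(size + 1)]
print(len(index_set(n)), K_D(a, index_set(n)))   # 64 cells, K^D = (8/9)**2 = 64/81
```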

The notation $R^D_n$ is consistent with that of [2], and denotes the effective resistance of the wire network obtained by replacing each board of side $3^{-n}$ in $F_n$ by a diagonal crosswire of 4 unit resistors.

Next let $\Lambda_{ij}$ and $S_n$ be as above. Assume that a set of numbers $J = \{J_{ij\eta} \mid i = 0, 1, \dots, 3^n - 1,\ j = 0, 1, \dots, 3^n - 1,\ \eta = 1, 2, 3, 4\}$ satisfies the following conditions:

$$\begin{cases} J_{ij\eta} = 0, & (i, j) \in \{0, 1, \dots, 3^n - 1\}^2 \setminus S,\ \eta = 1, 2, 3, 4, \\ \sum_{\eta=1}^{4} J_{ij\eta} = 0, & (i, j) \in S, \\ J_{i+1,j,1} + J_{ij3} = 0, & (i, j) \in \{0, 1, \dots, 3^n - 1\}^2, \\ J_{i,j+1,2} + J_{ij4} = 0, & (i, j) \in \{0, 1, \dots, 3^n - 1\}^2. \end{cases} \tag{4.6}$$

We regard $J_{ij\eta}$ as being the current flowing in the wire network $G_n$ obtained by replacing each board of side $3^{-n}$ in $F_n$ by a horizontal and vertical crosswire of 4 wires, each of resistance 1/2. With this interpretation (4.6) are the equations of current conservation. We impose the following "boundary conditions":

$$\begin{cases} J_{i02} = J_{i,3^n-1,4} = 0, & i = 0, 1, \dots, 3^n - 1, \\ \sum_{j=0}^{3^n-1} J_{0j1} = -\sum_{j=0}^{3^n-1} J_{3^n-1,j,3} = 1. \end{cases} \tag{4.7}$$

Put

$$K^G(J) = \frac{1}{2} \sum_{(i,j) \in S} \sum_{\eta=1}^{4} J_{ij\eta}^2,$$

and

$$R^G_n = \inf_J \{K^G(J) \mid J \text{ satisfies (4.6) and (4.7)}\}.$$

The notation $R^G_n$ is consistent with that of [2], and denotes the effective resistance of the network $G_n$. From [2] (see Theorem 3.3, Proposition 4.1, Theorem 4.3 and (5.4)) we have

Lemma 4.2. For n ≥ 0,

$$R^G_n \le 4 \min(\rho^n, R_n) \le 4 \max(\rho^n, R_n) \le 8 R^D_n. \tag{4.8}$$


4.2. Proof of Proposition 4.1. It is sufficient to consider the case z = x, as the case z = y then follows immediately by Lemma 3.1. We first prove (4.2). For $i = 0, 1, \dots, 3^n - 1$, and $j = 0, 1, \dots, 3^n - 1$, let $B_{ij}$ be the closure in $\mathbb{R}^2$ of $([i3^{-n}, (i+1)3^{-n}] \times [j3^{-n}, (j+1)3^{-n}]) \cap F_{n+m}$. Then $\bar F_{n+m} = \bigcup_{i,j} B_{ij}$, and each non-empty $B_{ij}$ is congruent to $3^{-n} \bar F_m$.

Define four functions $\varphi_{\alpha\beta}$, α = 0, 1, β = 0, 1, on $\bar F_m$ by

$$\begin{aligned} \varphi_{11}(x, y) &= V^x_m(r)(x, y)\, V^y_m(r)(x, y), \\ \varphi_{01}(x, y) &= (1 - V^x_m(r)(x, y))\, V^y_m(r)(x, y), \\ \varphi_{10}(x, y) &= V^x_m(r)(x, y)\, (1 - V^y_m(r)(x, y)), \\ \varphi_{00}(x, y) &= (1 - V^x_m(r)(x, y))\, (1 - V^y_m(r)(x, y)). \end{aligned}$$

Note that Lemma 2.4, (2.3), and (2.1) imply

$$E_{F_m}(\varphi_{\alpha\beta}, \varphi_{\alpha\beta}) \le 2\, (R^x_m(r)^{-1} + R^y_m(r)^{-1}). \tag{4.9}$$

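Given a set of real numbers $\{a_{i,j} \mid 0 \le i, j \le 3^n\}$, with $\bar a_{ij}$ defined by (4.4), define $v \in C(\bar F_{n+m}) \cap H^1(F_{n+m})$ by:

$$v(x, y) \stackrel{\mathrm{def}}{=} \bar a_{ij} + \sum_{\alpha=0}^{1} \sum_{\beta=0}^{1} (a_{i+\alpha, j+\beta} - \bar a_{ij})\, \varphi_{\alpha\beta}(3^n x - i, 3^n y - j), \qquad (x, y) \in B_{ij},\ (i, j) \in S_n.$$

Note that if $(a_{ij})$ satisfy the "boundary conditions" in (4.5) then v satisfies (1.3). Continuity of v at the boundaries of the $B_{ij}$ follows from (2.4). Recalling that $B_{ij}$ is congruent to $3^{-n} \bar F_m$ for (i, j) ∈ S, we have

$$\begin{aligned} E_{F_{n+m}}(v, v) &= \sum_{(i,j) \in S} E_{B_{ij}}(v, v) \\ &= \sum_{(i,j) \in S} \sum_{\alpha, \beta, \alpha', \beta'} (a_{i+\alpha, j+\beta} - \bar a_{ij})(a_{i+\alpha', j+\beta'} - \bar a_{ij})\, E_{F_m}(\varphi_{\alpha\beta}, \varphi_{\alpha'\beta'}) \\ &\le \sum_{(i,j) \in S} \sum_{\alpha, \beta, \alpha', \beta'} \frac{1}{2} \Bigl( (a_{i+\alpha, j+\beta} - \bar a_{ij})^2 E_{F_m}(\varphi_{\alpha\beta}, \varphi_{\alpha\beta}) + (a_{i+\alpha', j+\beta'} - \bar a_{ij})^2 E_{F_m}(\varphi_{\alpha'\beta'}, \varphi_{\alpha'\beta'}) \Bigr) \\ &= 4 \sum_{(i,j) \in S} \sum_{\alpha, \beta} (a_{i+\alpha, j+\beta} - \bar a_{ij})^2 E_{F_m}(\varphi_{\alpha\beta}, \varphi_{\alpha\beta}) \\ &\le 8 \sum_{(i,j) \in S} \sum_{\alpha, \beta} (a_{i+\alpha, j+\beta} - \bar a_{ij})^2 \bigl(R^x_m(r)^{-1} + R^y_m(r)^{-1}\bigr) \\ &\le 8 K^D(a) \bigl(R^x_m(r)^{-1} + R^y_m(r)^{-1}\bigr), \end{aligned}$$

where we used (4.9) in the last line. Hence, taking infimum over $\{a_{ij}\}$ and using (4.5) we have

$$R^x_{n+m}(r)^{-1} \le 8 (R^D_n)^{-1} \bigl(R^x_m(r)^{-1} + R^y_m(r)^{-1}\bigr),$$

and (4.2) now follows immediately using (4.8).

We now turn to a proof of (4.3). Let $B_{ij}$ and $S_n$ be as above. Define currents $I_{\eta\eta'}$, 1 ≤ η, η′ ≤ 4, on $\bar F_m$ as follows. First, let

$$I_{13} = -I_{31} = J^x_m(r), \qquad I_{24} = -I_{42} = J^y_m(r).$$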

Let $I_{12} = (I_{12x}, I_{12y})$ be the current $L^{(m,0)}$ defined in Lemma 2.8, and let

$$\begin{aligned} I_{14}(x, y) = -I_{41}(x, y) &= (I_{12x}(x, 1 - y), -I_{12y}(x, 1 - y)), \\ I_{32}(x, y) = -I_{23}(x, y) &= (-I_{12x}(1 - x, y), I_{12y}(1 - x, y)), \\ I_{43}(x, y) = -I_{34}(x, y) &= (I_{12x}(1 - x, 1 - y), I_{12y}(1 - x, 1 - y)), \end{aligned} \qquad (x, y) \in \bar F_m.$$

Finally we put $I_{\eta\eta} = 0$, η = 1, 2, 3, 4. From Lemma 2.8 we have,

$$E_{F_m}(I_{\eta\eta'}, I_{\eta\eta'}) \le R^x_m(r) + R^y_m(r), \qquad \eta, \eta' \in \{1, 2, 3, 4\}. \tag{4.10}$$

Note also that from (2.10) we have the boundary conditions

$$\begin{aligned} I_{1\eta,x}(0, y) = -I_{\eta 1,x}(0, y) &= J^x_{mx}(r)(0, y), & \eta &= 2, 3, 4, \\ I_{2\eta,y}(x, 0) = -I_{\eta 2,y}(x, 0) &= J^y_{my}(r)(x, 0), & \eta &= 1, 3, 4, \\ I_{3\eta,x}(1, y) = -I_{\eta 3,x}(1, y) &= -J^x_{mx}(r)(1, y), & \eta &= 1, 2, 4, \\ I_{4\eta,y}(x, 1) = -I_{\eta 4,y}(x, 1) &= -J^y_{my}(r)(x, 1), & \eta &= 1, 2, 3, \end{aligned}$$

while for the remaining combinations of the suffixes, the corresponding quantities vanish.

Given $\{J_{ij\eta}\}$ satisfying (4.6), write $J^{\pm}_{ij\eta} = 2^{-1}(|J_{ij\eta}| \pm J_{ij\eta})$,

$$h_{ij} = \sum_{\eta=1}^{4} J^{+}_{ij\eta} = \sum_{\eta=1}^{4} J^{-}_{ij\eta},$$

and define a current I on $F_{n+m}$, by

$$I(x, y) \stackrel{\mathrm{def}}{=} \frac{1}{h_{ij}} \sum_{\eta=1}^{4} \sum_{\eta'=1}^{4} J^{+}_{ij\eta} J^{-}_{ij\eta'}\, I_{\eta\eta'}(3^n x - i, 3^n y - j), \qquad (x, y) \in B_{ij},\ (i, j) \in S.$$

Then $I \in C(F_{n+m})$, so if $\{J_{ij\eta}\}$ satisfy (4.7), then by (2.6) we have

$$R^x_{n+m}(r) \le E_{F_{n+m}}(I, I). \tag{4.11}$$

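Recalling that $B_{ij} \cong 3^{-n} \bar F_m$, (i, j) ∈ S, we have

$$\begin{aligned} E_{F_{n+m}}(I, I) &= \sum_{(i,j) \in S} E_{B_{ij}}(I, I) \\ &= \sum_{(i,j) \in S} h_{ij}^{-2} \sum_{\eta\eta'} \sum_{\xi\xi'} J^{+}_{ij\eta} J^{-}_{ij\eta'} J^{+}_{ij\xi} J^{-}_{ij\xi'}\, E_{F_m}(I_{\eta\eta'}, I_{\xi\xi'}) \\ &\le 2^{-1} \sum_{(i,j) \in S} h_{ij}^{-2} \sum_{\eta\eta'} \sum_{\xi\xi'} J^{+}_{ij\eta} J^{-}_{ij\eta'} J^{+}_{ij\xi} J^{-}_{ij\xi'} \bigl(E_{F_m}(I_{\eta\eta'}, I_{\eta\eta'}) + E_{F_m}(I_{\xi\xi'}, I_{\xi\xi'})\bigr) \\ &\le (R^x_m(r) + R^y_m(r)) \sum_{(i,j) \in S} h_{ij}^2, \end{aligned}$$

where we used (4.10) in the last line. Now by the Cauchy-Schwarz inequality,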


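$$h_{ij} = \frac{1}{2} \sum_{\eta} |J_{ij\eta}| \le \Bigl(\sum_{\eta} J_{ij\eta}^2\Bigr)^{1/2}.$$

Hence

$$R^x_{n+m}(r) \le E_{F_{n+m}}(I, I) \le (R^x_m(r) + R^y_m(r)) \sum_{ij} \sum_{\eta} J_{ij\eta}^2 = 2 (R^x_m(r) + R^y_m(r))\, K^G(J).$$

Taking the infimum over all J satisfying (4.6) and (4.7), we obtain $R^x_{n+m}(r) \le 2 (R^x_m(r) + R^y_m(r))\, R^G_n$, and using (4.8) gives (4.3).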
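The Cauchy-Schwarz step relies on current conservation in each cell: since the four edge currents sum to zero by (4.6), the total inflow $\sum_\eta J^+_{ij\eta}$ equals the total outflow $\sum_\eta J^-_{ij\eta}$, and both equal half the sum of the absolute values. A minimal numerical sketch of this bookkeeping for a single cell follows; the helper name and the random test currents are ours, chosen only for illustration.

```python
import math
import random


def split_current(J):
    """Positive and negative parts J^+ and J^- of the four edge currents of one cell."""
    plus = [(abs(x) + x) / 2 for x in J]
    minus = [(abs(x) - x) / 2 for x in J]
    return plus, minus


random.seed(0)
for _ in range(1000):
    J = [random.uniform(-1, 1) for _ in range(3)]
    J.append(-sum(J))                 # enforce current conservation, as in (4.6)
    plus, minus = split_current(J)
    h = sum(plus)
    assert math.isclose(h, sum(minus), abs_tol=1e-12)             # inflow equals outflow
    assert math.isclose(h, sum(abs(x) for x in J) / 2, abs_tol=1e-12)
    assert h <= math.sqrt(sum(x * x for x in J)) + 1e-12          # Cauchy-Schwarz bound
```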

A. Proof of Lemma 2.8

In this Appendix, we will give a proof of Lemma 2.8. In fact we prove a more general result:

Lemma A.1. Let $0 < k_x \le 1$ and $0 \le k_y < 1$, and let $B = ((0, k_x) \times (k_y, 1)) \cap F_n$ and $\tilde B = ((0, k_x) \times (0, 1)) \cap F_n$. Let $v_0 < v_1$, $v_0' < v_1'$ be constants, and let $v^x$ be the harmonic function on B, with Neumann boundary conditions $\partial v^x/\partial n = 0$ at the boundaries (in $\mathbb{R}^2$) of B, except at x = 0 and x = $k_x$, where the Dirichlet boundary conditions, $v^x(0, y) = v_1$ and $v^x(k_x, y) = v_0$ are imposed. Define a current $j^x = (j^x_x, j^x_y)$ on B by $j^x = -R^x \nabla v^x$, where the constant $R^x$ is defined by the normalization condition $\int_{k_y}^{1} j^x_x(0, y)\, dy = 1$. Similarly, let $v^y$ be the harmonic function on $\tilde B$, with Neumann boundary conditions, except at y = 1 and y = 0, where Dirichlet boundary conditions $v^y(x, 1) = v_0'$ and $v^y(x, 0) = v_1'$ are imposed. Define $j^y = (j^y_x, j^y_y) = -R^y \nabla v^y$, where $R^y$ is defined by $\int_0^{k_x} j^y_y(x, 1)\, dx = 1$. Then, there exist two disjoint open subsets of B, $B^x$ and $B^y$, satisfying the following:

1. the boundary of $B^x$ contains $(\{0\} \times [k_y, 1]) \cup (([0, k_x] \times \{k_y\}) \cap \partial B)$, and has no common points with $((0, k_x] \times \{1\}) \cup ((\{k_x\} \times (k_y, 1]) \cap \partial B)$,
2. the boundary of $B^y$ contains $([0, k_x] \times \{1\}) \cup ((\{k_x\} \times [k_y, 1]) \cap \partial B)$, and has no common points with $(\{0\} \times [k_y, 1)) \cup (([0, k_x) \times \{k_y\}) \cap \partial B)$,
3. the vector field J defined by

$$J(x, y) = \begin{cases} j^x(x, y), & (x, y) \in B^x, \\ j^y(x, y), & (x, y) \in B^y, \\ 0, & \text{otherwise}, \end{cases} \tag{A.1}$$

is in C(B). It follows, in particular, that

$$E_B(J, J) \le E_B(j^x, j^x) + E_B(j^y, j^y). \tag{A.2}$$


Proof. Note that with a linear transformation of the coordinate $y' = y\sqrt{r}$, (2.2) becomes the Laplace equation in the standard sense. Hence, the potential functions $V^x_n$ and $V^y_n$ are harmonic functions in the usual sense, with this change of coordinate. We assume this change of coordinate in the following. With the change of coordinate, the domain $F_n$ may no longer be a square, but it is still a rectangle-shaped object, with rectangular "holes" inside. To avoid clumsy notation, we will keep the notation $F_n$ and assume that it is a square $[0, 1]^2$ with square holes inside. We will not use any symmetries specific to squares in the proofs, and the results are directly applicable to the original problem.

Put $v = R^x v^x - R^y v^y$. Since v is harmonic, there locally exists, around each point in B, an analytic function $u(x + y\sqrt{-1}) \stackrel{\mathrm{def}}{=} v(x, y) + \sqrt{-1}\, w(x, y)$, where w is the conjugate harmonic function of v. Note that for any closed path C in B, we have

$$\oint_C \operatorname{grad} w \cdot dx = \oint_C \frac{\partial v}{\partial n}\, ds = -\oint_{\partial' B} \frac{\partial v}{\partial n}\, ds,$$

where n is the unit normal vector and $\partial' B$ is the boundary of B in the interior of C. In the last equality, we used the fact that v is harmonic. Because of the boundary conditions on v we see that this quantity is zero, hence w is single valued.

Denote the boundary of B by ∂B. (By a boundary of a set, we always mean, in the following, that as a set in $\mathbb{R}^2$.) Decompose ∂B into the "external" boundary of B defined by

$$\partial_{\mathrm{ext}} B = \partial B \cap (\{x = 0\} \cup \{x = k_x\} \cup \{y = k_y\} \cup \{y = 1\}),$$

and the "internal" boundary defined by $\partial_{\mathrm{int}} B = \partial B \setminus \partial_{\mathrm{ext}} B$. Write $\partial_{\mathrm{int}} B \setminus \partial_{\mathrm{corner}} B = \partial^{o}_{\mathrm{int}} B$, where $\partial_{\mathrm{corner}} B$ is the (finite) set of corner points of square "holes" in $B \subset F_n$.

By the reflection principle of analytic functions (see the arguments in the Appendix of [2] for $F_n$ with boundary conditions dealt with here), we see that u(z) can be analytically continued to a neighborhood of each point in $\partial B \setminus \partial_{\mathrm{corner}} B = \partial_{\mathrm{ext}} B \cup \partial^{o}_{\mathrm{int}} B$, and that at each $z_0 \in \partial_{\mathrm{corner}} B$, there exists an analytic function U in a neighborhood of 0, such that $u(z) = U((z - z_0)^{2/3})$. We regard, in the following, u (and also v, w) as a continuous function on the closed set $\bar B$, analytic on $\bar B \setminus \partial_{\mathrm{corner}} B$.

Note also that similar considerations hold for $v^x$ on B and $v^y$ on $\tilde B \supset B$ in place of v. We define $w^x$ on B and $w^y$ on $\tilde B$ which are conjugate harmonic functions of $v^x$ and $v^y$, respectively, and put

$$u^x(x + \sqrt{-1}\, y) = v^x(x, y) + \sqrt{-1}\, w^x(x, y) \quad\text{and}\quad u^y(x + \sqrt{-1}\, y) = v^y(x, y) + \sqrt{-1}\, w^y(x, y).$$

Obviously, we can fix constant ambiguities of conjugate harmonic functions to satisfy $w = R^x w^x - R^y w^y$ and $u = R^x u^x - R^y u^y$.

Decompose $\partial_{\mathrm{ext}} B$ into 4 parts and put $e_1 = \partial_{\mathrm{ext}} B \cap \{x = 0\}$, $e_2 = \partial_{\mathrm{ext}} B \cap \{y = k_y\}$, $e_3 = \partial_{\mathrm{ext}} B \cap \{x = k_x\}$, $e_4 = \partial_{\mathrm{ext}} B \cap \{y = 1\}$.

Define 2 disjoint open subsets $B^x$, $B^y$ of B by $B^x \stackrel{\mathrm{def}}{=} \{(x, y) \in B \mid w(x, y) > w(0, 1)\}$ and $B^y \stackrel{\mathrm{def}}{=} \{(x, y) \in B \mid w(x, y) < w(0, 1)\}$. We will prove that $B^x$ and $B^y$ satisfy the statements of the lemma.

A point $(x, y) \in B \setminus \partial_{\mathrm{corner}} B$ is said to be a critical point of $v^x$ if $\nabla v^x(x, y) = 0$, or equivalently, $(u^x)'(z) = 0$. By the uniqueness theorem on analytic continuation, we see


that there are at most a finite number of critical points in $B \setminus \partial_{\mathrm{corner}} B$. (The possibility of accumulation of critical points to a point in $\partial_{\mathrm{corner}} B$ is ruled out by considering the uniqueness theorem on $U^x$, the function corresponding to U defined above.) Denote the (finite) set of critical points by $A^x_{\mathrm{crit}}$.

Note first that, by assumption, we have $v^x(x, y) \le v_1$, from which follows $\frac{\partial v^x}{\partial x}(0, y) \le 0$. The boundary condition $v^x(0, y) = v_1$ implies $\frac{\partial v^x}{\partial y}(0, y) = 0$. These results and the fact that the number of critical points is finite imply, with the Cauchy–Riemann relation, that $\frac{\partial w^x}{\partial y}(0, y) < 0$ except for at most a finite number of y's. Note also that we have, for $v^y$, the boundary condition $\frac{\partial v^y}{\partial x}(0, y) = 0$. With the Cauchy–Riemann relation we have $\frac{\partial w^y}{\partial y}(0, y) = 0$. Therefore we have

$$\frac{\partial w}{\partial y}(0, y) = R^x \frac{\partial w^x}{\partial y}(0, y) - R^y \frac{\partial w^y}{\partial y}(0, y) < 0,$$

except for at most a finite number of points on $e_1$; consequently, $w(0, y) > w(0, 1)$ for $k_y \le y < 1$. Since w is a continuous function on $\bar B$, we see that $e_1$ is contained in the boundary of $B^x$ and has no common points with that of $B^y$ except for the point (0, 1). The positivity of $R^x$ and $R^y$ is a consequence of the fact that the assumptions imply that $R^x$ and $j^x$ (or $R^y$ and $j^y$) satisfy a relation analogous to (2.6) and (2.7). Similarly we deduce that $e_4$ is contained in the boundary of $B^y$ and has no common points with that of $B^x$ except for the point (0, 1).

To prove that $e_3$ is contained in the boundary of $B^y$, first note that, by the Cauchy–Riemann relations and the boundary conditions on $v^x$ and $v^y$ and the normalization condition on $j^y$, we have

$$w(k_x, 1) - w(0, 1) = \int_0^{k_x} \frac{\partial w}{\partial x}(x, 1)\, dx = \int_0^{k_x} \Bigl(-R^x \frac{\partial v^x}{\partial y}(x, 1) + R^y \frac{\partial v^y}{\partial y}(x, 1)\Bigr) dx = -\int_0^{k_x} j^y_y(x, 1)\, dx = -1.$$

Then we have, using Cauchy–Riemann relations and the assumption that $\frac{\partial v^y}{\partial x} = 0$ on $e_3$,

$$w(k_x, y) - w(0, 1) = w(k_x, y) - w(k_x, 1) - 1 = -\int_y^1 \frac{\partial v}{\partial x}(k_x, y)\, dy - 1 = \int_y^1 j^x_x(k_x, y)\, dy - 1.$$

Using div $j^x$ = 0, the Gauss–Green formula, and the boundary conditions on $v^x$, we see that $\int_{k_y}^1 j^x_x(k_x, y)\, dy = \int_{k_y}^1 j^x_x(0, y)\, dy = 1$. Therefore,

$$w(k_x, y) - w(0, 1) = -\int_{k_y}^{y} j^x_x(k_x, y)\, dy. \tag{A.3}$$

By assumption, $v^x(x, y) \ge v_0$, from which follows $\frac{\partial v^x}{\partial x}(k_x, y) \le 0$. The boundary condition $v^x(k_x, y) = v_0$ implies $\frac{\partial v^x}{\partial y}(k_x, y) = 0$. These results and the fact that the number of critical points is finite imply $j^x_x(k_x, y) = -R^x \frac{\partial v^x}{\partial x}(k_x, y) > 0$, except for at most a finite number of points. With (A.3) we see that $w(k_x, y) < w(0, 1)$ on $e_3 \setminus (k_x, k_y)$, implying that $e_3$ is contained in the boundary of $B^y$, and has no common points with that of $B^x$ except for $(k_x, k_y)$.

To prove that $e_2$ is contained in the boundary of $B^x$ (and has no common points with that of $B^y$ except for $(k_x, k_y)$), it suffices to prove $w(x, k_y) > w(0, 1)$ on $e_2 \setminus (k_x, k_y)$. By an analogous argument to those above we obtain

$$w(0, k_y) - w(0, 1) = -\int_{k_y}^1 \frac{\partial w}{\partial y}(0, y)\, dy = \int_{k_y}^1 j^x_x(0, y)\, dy = 1.$$

Noting the boundary condition $\frac{\partial v^x}{\partial y}(x, k_y) = 0$ on $e_2$, we therefore see, with the Cauchy–Riemann relations, that

$$w(x, k_y) - w(0, 1) = \int_0^x \frac{\partial w}{\partial x}(x, k_y)\, dx + 1 = -\int_0^x j^y_y(x, k_y)\, dx + 1.$$

Hence $w(x, k_y) > w(0, 1)$ on $e_2 \setminus (k_x, k_y)$ holds if we can show

$$\int_0^x j^y_y(x, k_y)\, dx < 1, \qquad 0 \le x < k_x. \tag{A.4}$$

Suppose $\int_0^{x_0} j^y_y(x, k_y)\, dx \ge 1$ for some $(x_0, k_y) \in e_2 \setminus (k_x, k_y)$, and put

$$\ell^y = \{(x, y) \in \tilde B \mid w^y(x, y) = w^y(x_0, k_y)\}.$$

Since $w^y$ is a harmonic function on $\tilde B$, $\ell^y$ is a smooth curve (or a set of smooth curves) in $\tilde B$, whose tangent is proportional to $\nabla v^y$, which implies that $v^y$ is strictly monotone on $\ell^y$; hence it is not a closed orbit, and separates $\tilde B$ into domains with $w^y(x, y) > w^y(x_0, k_y)$ and with $w^y(x, y) < w^y(x_0, k_y)$. The boundary conditions $\frac{\partial v^y}{\partial x} = 0$ on the edges x = 0 and x = $k_x$ imply that $w^y$ is constant on these edges, so that $\ell^y$ cannot have endpoints on them. Therefore there is an endpoint of $\ell^y$ on the edges $\{(x, 0) \mid 0 \le x < k_x\} \cup \{(x, 1) \mid 0 \le x < k_x\}$. Let $(x_1, 1)$ be an endpoint of $\ell^y$, satisfying $0 \le x_1 < k_x$. (The case that the endpoints are only on the edge y = 0 can be handled similarly.) There is a connected piecewise smooth curve $\ell^{y\prime} \subset \ell^y \cup \partial^{o}_{\mathrm{int}} B$ which connects $(x_0, k_y)$ to $(x_1, 1)$. Consider the subset of B bounded by $\ell^{y\prime}$, $e_2$, $e_1$, and $e_4$. Applying the Gauss–Green formula and the current conservation div $j^y$ = 0, and noting that $j^y \cdot n = 0$ on $\ell^{y\prime}$ and $e_1$, where n is a normal vector, we see that

$$\int_0^{x_1} j^y_y(x, 1)\, dx = \int_0^{x_0} j^y_y(x, k_y)\, dx \ge 1. \tag{A.5}$$

On the other hand, $v^y(x, 1) = v_0' \le v^y(x, y)$ implies $\frac{\partial v^y}{\partial y}(x, 1) \le 0$, and $\frac{\partial v^y}{\partial x}(x, 1) = 0$. This implies (with an argument similar to the one which led to $w(0, y) > w(0, 1)$ for $k_y \le y < 1$) that $j^y_y(x, 1) > 0$, $0 \le x \le k_x$, except for at most a finite number of points. Hence

$$\int_0^{x_1} j^y_y(x, 1)\, dx = 1 - \int_{x_1}^{k_x} j^y_y(x, 1)\, dx < 1.$$

This contradicts (A.5). Hence (A.4) is proved.

We are left with the statements on J defined in (A.1). Since $j^x$ and $j^y$ are in C(B), it follows at once that J is square integrable and of bounded variation. To prove that div J = 0, let f be an infinitely differentiable function on B with compact support. Using (A.1), div $j^x$ = div $j^y$ = 0, and the Gauss–Green formula [12, p.340] in turn, we have

$$\begin{aligned} \int_B f\, \operatorname{div} J\, dx\, dy &= -\int_B (\nabla f) \cdot J\, dx\, dy \\ &= -\int_{B^x} (\nabla f) \cdot j^x\, dx\, dy - \int_{B^y} (\nabla f) \cdot j^y\, dx\, dy \\ &= -\int_{B^x} \operatorname{div}(f j^x)\, dx\, dy - \int_{B^y} \operatorname{div}(f j^y)\, dx\, dy \\ &= -\int_{\partial B^x} f j^x \cdot n\, ds - \int_{\partial B^y} f j^y \cdot n\, ds, \end{aligned}$$

where n is the unit normal vector to the curves $\partial B^x$ or $\partial B^y$, in the outward directions of the domain $B^x$ or $B^y$. Since f has compact support on B, the contribution to the line integration from ∂B is zero. On the other hand, the function w is analytic in B, hence,

$$\ell \stackrel{\mathrm{def}}{=} (\partial B^x) \setminus (\partial B) = (\partial B^y) \setminus (\partial B),$$

and $w(x, y) = w(0, 1)$ on the curve ℓ. Note that $\nabla v = -R^y \nabla v^y + R^x \nabla v^x = j^y - j^x$. By Cauchy–Riemann relations we know that $\nabla v \cdot \nabla w = 0$. Hence we have $(j^y - j^x) \cdot n = 0$ on ℓ, where n is the unit normal vector to ℓ, with the same sign as n for $\partial B^x$. The normal vector n has opposite signs on $\partial B^x$ and $\partial B^y$. Therefore,

$$\int_B f\, \operatorname{div} J\, dx\, dy = -\int_{\ell} f (j^x - j^y) \cdot n\, ds = 0,$$

which proves div J = 0. The estimate (A.2) now follows since

$$E_{F_n}(J, J) = E_{B^x \cap F_n}(J, J) + E_{B^y \cap F_n}(J, J) \le E_{F_n}(j^x, j^x) + E_{F_n}(j^y, j^y).$$

Acknowledgement. The research of M. T. Barlow is supported by an NSERC (Canada) grant. The research of T. Hattori is supported in part by a Grant-in-Aid for Scientific Research (C) from the Ministry of Education, Science, Sports and Culture.

References

1. Barlow, M.T., Bass, R.F.: Construction of Brownian motion on the Sierpiński carpet. Ann. Inst. Henri Poincaré 25, 225–257 (1989)
2. Barlow, M.T., Bass, R.F.: On the resistance of the Sierpiński carpet. Proc. Roy. Soc. London A 431, 345–360 (1990)
3. Barlow, M.T., Bass, R.F.: Coupling and Harnack inequalities for Sierpiński carpets. Bull. Amer. Math. Soc. 29, 208–212 (1993)
4. Barlow, M.T., Bass, R.F., Sherwood, J.D.: Resistance and spectral dimension of Sierpiński carpets. J. Phys. A 23, L253–L258 (1990)
5. Barlow, M.T., Hattori, K., Hattori, T., Watanabe, H.: Restoration of isotropy on fractals. Phys. Rev. Lett. 75, 3042–3045 (1995)
6. Ben-Avraham, D., Havlin, S.: Exact fractals with adjustable fractal and fracton dimensionalities. J. Phys. A 16, L559–L563 (1983)
7. Doyle, P.G., Snell, J.L.: Random walks and electrical networks. Math. Assoc. of America, Washington, 1984
8. Hattori, T.: Asymptotically one-dimensional diffusions on scale-irregular gaskets. Preprint
9. Hattori, K., Hattori, T., Watanabe, H.: Gaussian field theories on general networks and the spectral dimensions. Progr. Theor. Phys. Supplement 92, 108–143 (1987)
10. Hattori, K., Hattori, T., Watanabe, H.: Asymptotically one-dimensional diffusions on the Sierpiński gasket and the abc-gaskets. Probab. Theory Relat. Fields 100, 85–116 (1994)
11. Kozlov, S.M.: Harmonization and homogenization on fractals. Commun. Math. Phys. 153, 339–357 (1993)
12. Maz'ja, V.G.: Sobolev Spaces. Berlin: Springer, 1985
13. Mandelbrot, B.B.: The Fractal Geometry of Nature. Freeman, San Francisco, 1982
14. Sierpiński, W.: Sur une courbe cantorienne qui contient une image biunivoque et continue de toute courbe donnée. C. r. hebd. Séanc. Acad. Sci., Paris 162, 629–632 (1916)
15. Ziemer, W.P.: Weakly differentiable functions. Springer, Berlin, 1989

Communicated by D.C. Brydges

Commun. Math. Phys. 188, 29 – 67 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Generating Functional in CFT and Effective Action for Two-Dimensional Quantum Gravity on Higher Genus Riemann Surfaces

Ettore Aldrovandi, Leon A. Takhtajan

Department of Mathematics, SUNY at Stony Brook, Stony Brook, NY 11794-3651, USA. E-mail: [email protected], [email protected]

Received: 12 September 1996 / Accepted: 6 January 1997

Abstract: We formulate and solve the analog of the universal Conformal Ward Identity for the stress-energy tensor on a compact Riemann surface of genus g > 1, and present a rigorous invariant formulation of the chiral sector in the induced two-dimensional gravity on higher genus Riemann surfaces. Our construction of the action functional uses various double complexes naturally associated with a Riemann surface, with computations that are quite similar to descent calculations in BRST cohomology theory. We also provide an interpretation of the action functional in terms of the geometry of different fiber spaces over the Teichmüller space of compact Riemann surfaces of genus g > 1.

1. Introduction

Conformal symmetry in two dimensions, according to Belavin, Polyakov, and Zamolodchikov [8], is generated by the holomorphic and anti-holomorphic components $\mathbf{T}(z)$ and $\bar{\mathbf{T}}(\bar z)$ of the stress-energy tensor of a Conformal Field Theory. These components satisfy the Operator Product Expansions [8, 15]

$$\begin{aligned} \mathbf{T}(z)\, \mathbf{T}(w) &\sim \frac{c/2}{(z - w)^4} + \Bigl(\frac{2}{(z - w)^2} + \frac{1}{z - w} \frac{\partial}{\partial w}\Bigr) \mathbf{T}(w), \\ \bar{\mathbf{T}}(\bar z)\, \bar{\mathbf{T}}(\bar w) &\sim \frac{c/2}{(\bar z - \bar w)^4} + \Bigl(\frac{2}{(\bar z - \bar w)^2} + \frac{1}{\bar z - \bar w} \frac{\partial}{\partial \bar w}\Bigr) \bar{\mathbf{T}}(\bar w), \\ \mathbf{T}(z)\, \bar{\mathbf{T}}(\bar w) &\sim 0, \end{aligned}$$

where c is the central charge of the CFT and ∼ means "up to the terms that are regular as z → w". These OPE, together with the regularity condition $\mathbf{T}(z) \sim 1/z^4$ as |z| → ∞, are used to construct Verma modules for the Virasoro algebra that correspond to the holomorphic and anti-holomorphic sectors of a CFT. The operator content of the CFT


is specified by the highest weight vectors of the Virasoro algebra that correspond to the primary fields $O_l(z, \bar z)$ with conformal weights $(h_l, \bar h_l)$, satisfying

$$\mathbf{T}(z)\, O_l(w, \bar w) \sim \Bigl(\frac{h_l}{(z - w)^2} + \frac{1}{z - w} \frac{\partial}{\partial w}\Bigr) O_l(w, \bar w),$$

and similar OPE with $\bar{\mathbf{T}}(\bar z)$. A CFT is determined by the complete set of correlation functions among the primary fields, which are built up of conformal blocks: the correlation functions for the holomorphic sector. The conformal blocks are defined by the Conformal Ward Identities of BPZ [8], which follow from the OPE for the primary fields. Introducing the generating functional for the n-point correlation functions

$$\exp\{-W[\mu](z_1, \dots, z_n)\} = \Bigl\langle O_1(z_1) \dots O_n(z_n) \exp\Bigl(-\frac{1}{\pi} \int_{\mathbb{C}} \mu(z, \bar z)\, \mathbf{T}(z)\, d^2 z\Bigr) \Bigr\rangle \stackrel{\mathrm{def}}{=} \langle O_1(z_1) \cdots O_n(z_n) \rangle_\mu,$$

where the integration goes over the complex plane $\mathbb{C}$ and $d^2 z = \frac{i}{2}\, dz \wedge d\bar z = dx \wedge dy$, $z = x + iy$, $\bar z = x - iy$, the CWI can be written in the following "universal form" (cf. [31, 30])

$$(\bar\partial - \mu\, \partial - 2 \mu_z) \frac{\delta W}{\delta \mu(z)} = \frac{c}{12\pi} \mu_{zzz} + \sum_{l=1}^{n} \Bigl\{ h_l\, \delta_z(z - z_l) + \delta(z - z_l) \frac{\partial W}{\partial z_l} \Bigr\},$$

where $\partial = \partial/\partial z$, $\bar\partial = \partial/\partial \bar z$. Describing the complete solution of this equation, as well as of its generalization for higher genus Riemann surfaces, is one of the major problems of CFT. This problem remains non-trivial even in the simplest case of conformal blocks without primary fields, when the generating functional W[µ] takes the form

$$\exp\{-W[\mu]\} = \Bigl\langle \exp\Bigl(-\frac{1}{\pi} \int_{\mathbb{C}} \mu(z, \bar z)\, \mathbf{T}(z)\, d^2 z\Bigr) \Bigr\rangle \stackrel{\mathrm{def}}{=} \langle \mathbf{I} \rangle_\mu. \tag{1.1}$$

It gives the expectation value of the unit operator $\mathbf{I}$ in the presence of Schwinger's source term µ, which is a characteristic feature of all CFT with the same central charge c. The corresponding universal CWI reduces to the equation

$$(\bar\partial - \mu\, \partial - 2 \mu_z) \frac{\delta W}{\delta \mu(z)} = \frac{c}{12\pi} \mu_{zzz} \tag{1.2}$$

for the expectation value of the stress-energy tensor

$$\langle \mathbf{T}(z) \rangle_\mu \stackrel{\mathrm{def}}{=} \frac{\delta W}{\delta \mu(z)}.$$

It is remarkable that the functional W[µ], for |µ| < 1, can be determined in closed form and that it turns out to be the Euclidean version of Polyakov's action functional for two-dimensional induced quantum gravity [26]. To see this, let µ be a Beltrami coefficient on $\mathbb{C}$ (a bounded function µ with the property |µ| < 1), to which one can associate a self-mapping $f : \mathbb{C} \to \mathbb{C}$ as a unique normalized (fixing 0, 1 and ∞) solution of the Beltrami equation

$$f_{\bar z} = \mu f_z.$$

Denote by

$$T(z) = \{f, z\} = \frac{f_{zzz}}{f_z} - \frac{3}{2} \Bigl(\frac{f_{zz}}{f_z}\Bigr)^2$$

the Schwarzian derivative of f, "the stress-energy tensor associated with f". Then (see, e.g. [22, 31]), Eq. (1.2) is equivalent to the following Cauchy-Riemann equation

$$(\bar\partial - \mu \partial) \Bigl[ \Bigl( \frac{\delta W}{\delta \mu(z)} - \frac{c}{12\pi} T(z) \Bigr) \Big/ (f_z)^2 \Bigr] = 0$$

with respect to the complex structure on $\mathbb{C}$ defined by the coordinates $\zeta = f(z, \bar z)$, $\bar\zeta = \overline{f(z, \bar z)}$. Using the regularity of the stress-energy tensor at ∞ one gets that

$$\frac{\delta W}{\delta \mu(z)} = \langle \mathbf{T}(z) \rangle_\mu = \frac{c}{12\pi} T(z). \tag{1.3}$$

This variational equation for determining W was explicitly solved by Haba [18]. Specifically, let $f^{t\mu}$ be the family of self-mappings of $\mathbb{C}$ associated to the Beltrami coefficients tµ, 0 ≤ t ≤ 1. Then

$$W[\mu] = \frac{c}{12\pi} \int_0^1 dt \int_{\mathbb{C}} T^{t\mu} \mu\, d^2 z$$

solves (1.3). The functional W can be considered as a WZW type functional since its definition requires an additional integration over a path in the field space.

Next, consider Polyakov's action functional for two-dimensional induced quantum gravity in the light-cone gauge [26], applied to the quasi-conformal map f:

$$S[f] = -\int_{\mathbb{C}} \frac{f_{zz}}{f_z} \Bigl(\frac{f_{\bar z}}{f_z}\Bigr)_z d^2 z. \tag{1.4}$$

It has the property

$$\frac{\delta S}{\delta \mu(z)} = 2\, T(z) = 2 \{f, z\},$$

so that c S[f]/24π, considered as a functional of $\mu = f_{\bar z}/f_z$, also solves Eq. (1.2). Therefore, one has the fundamental relation

$$W[\mu] = \frac{c}{24\pi} S[f], \tag{1.5}$$

which expresses W as a local functional of f and which can be verified directly. This relation provides the interpretation (cf. [31, 7, 27]) of two-dimensional induced gravity in the conformal gauge in terms of a gravitational WZNW model (and hence in terms of a Chern-Simons functional as well).

In the present paper we formulate and solve the analog of Eq. (1.2) for the stress-energy tensor on a compact Riemann surface of genus g > 1. As in the genus zero case, it provides an invariant formulation of the chiral sector in two-dimensional induced gravity on higher genus Riemann surfaces, a solution to the problem discussed in [30]. From a different point of view, this problem was also considered in [34, 35].

First, it should be noted that it is trivial to generalize the genus zero treatment to the case of elliptic curves, that is, compact Riemann surfaces of genus 1. Namely, let X be


an elliptic curve realized as the quotient $L \backslash \mathbb{C}$ of the complex plane $\mathbb{C}$ by the action of a rank 2 lattice L generated by 1 and τ, with Im τ > 0. The analog of Eq. (1.2) has the same form, where µ is now a doubly-periodic function on $\mathbb{C}$, while the corresponding normalized solution f of the Beltrami equation has the property

$$f(z + 1) = f(z) + 1, \qquad f(z + \tau) = f(z) + \tilde\tau,$$

where $\tilde\tau = f(\tau)$, Im $\tilde\tau$ ≠ 0. It follows that

$$f \circ \gamma = \tilde\gamma \circ f \quad \text{for all } \gamma \in L,$$

where $\tilde\gamma \in \tilde L$, the rank 2 lattice in $\mathbb{C}$ generated by 1 and $\tilde\tau$. As a result, the functional S[f] has the same form as in (1.4), where now the integration goes over the fundamental parallelogram Π of the lattice L.

Having thus addressed the genus 1 case, we start by formulating Eq. (1.2) (the same applies to the universal CWI as well) on a compact Riemann surface X of genus g > 1. In order to do it one needs to use projective connections on X (see, e.g., [17] for details). Namely, recall [14] that the stress-energy tensor $\mathbf{T}$ of a CFT on a Riemann surface is c/12 times a projective connection. Therefore the expectation value

$$\langle \mathbf{T}(z) \rangle = \frac{c}{12} Q(z),$$

is a holomorphic projective connection on X which depends on the particular CFT. The difference between two projective connections on X is a quadratic differential, so that in order to define the generating functional for the stress-energy tensor on X, one can choose a "background" holomorphic projective connection R and set

$$\exp\{-W[\mu]\} = \Bigl\langle \exp\Bigl(-\frac{1}{\pi} \int_X \mu(z, \bar z) \Bigl(\mathbf{T}(z) - \frac{c}{12} R(z)\Bigr) d^2 z\Bigr) \Bigr\rangle,$$

where µ is a Beltrami differential on X. The analog of Eq. (1.2) takes the form [6, 22]

$$(\bar\partial - \mu \partial - 2 \mu_z) \frac{\delta W}{\delta \mu(z)} = \frac{c}{12\pi} (\mu_{zzz} + 2 R \mu_z + R_z \mu),$$

where z is a local complex coordinate on X, and was used in [34, 35]. As it follows from the definition of W,

$$\frac{\delta W}{\delta \mu(z)} \Big|_{\mu = 0} = \Bigl\langle \mathbf{T}(z) - \frac{c}{12} R(z) \Bigr\rangle = \frac{c}{12} (Q(z) - R(z)),$$

and this expectation value can be set to zero if one chooses Q = R. However, when working with all conformal field theories on X having the same central charge c, it is preferable to have a canonical choice of the holomorphic projective connection R. One possibility, which is the choice we will adopt in this paper, is to use a Fuchsian projective connection. It is defined by the Fuchsian uniformization of the Riemann surface X, i.e. by its realization as a quotient $\Gamma \backslash \mathbb{H}$ of the upper half-plane $\mathbb{H}$ by the action of a strictly hyperbolic Fuchsian group Γ with 2g generators. The upper half plane is isomorphic to the universal cover of X, while Γ, as an abstract group, is isomorphic to $\pi_1(X)$, the fundamental group of the surface X. Note that the Fuchsian uniformization of Riemann surfaces plays a fundamental role in the geometric approach to the two-dimensional quantum gravity through quantum Liouville theory (see [29] and references therein).


The covering $\mathbb{H} \to X$ allows one to pull back geometric objects from $X$ to $\mathbb{H}$. Since the Fuchsian projective connection tautologically vanishes on $\mathbb{H}$, the stress-energy tensor $T(z)$ becomes a quadratic differential for the Fuchsian group $\Gamma$,

$$T\circ\gamma\,(\gamma')^2 = T \quad\text{for all } \gamma\in\Gamma,$$

whereas the source term $\mu$ becomes a Beltrami differential for $\Gamma$,

$$\mu\circ\gamma\;\frac{\overline{\gamma'}}{\gamma'} = \mu \quad\text{for all } \gamma\in\Gamma.$$

The product $T\mu$ is a (1, 1)-tensor for $\Gamma$, so that the integral

$$\int_F T\mu\; dz\wedge d\bar z,$$

the natural pairing between quadratic and Beltrami differentials, is well-defined, i.e. it does not depend on the choice of the fundamental domain $F \subset \mathbb{H}$ of the Fuchsian group $\Gamma$. As a result, the functional $W[\mu]$ retains the same form as in formula (1.1), where now the integration goes over the domain $F$, and satisfies the same Eq. (1.2), with $z \in \mathbb{H}$. It should be noted that the expectation value $\langle T(z)\rangle_\mu$ is no longer zero when $\mu = 0$, but rather is $c/12$ times a holomorphic quadratic differential $q$, which is the pull-back to $\mathbb{H}$ of the quadratic differential $Q - R$ on $X$ and characterizes a particular CFT. Thus, as was observed in [34, 35], the generating functional for the stress-energy tensor on a higher genus Riemann surface is no longer a universal feature of all conformal field theories with the same value of $c$. However, as we shall show in the paper, one can still find the general solution of Eq. (1.2).

Next, in order to solve the universal CWI and to define an action functional for the chiral sector in two-dimensional induced gravity on $X$, one could first try to extend Polyakov's functional (1.4) from $\mathbb{C}$ to $X$ by considering the following integral

$$\frac{1}{2i}\int_F \omega[f], \tag{1.6}$$

where

$$\omega[f] = \frac{f_{zz}}{f_z}\Big(\frac{f_{\bar z}}{f_z}\Big)_z\, dz\wedge d\bar z,$$

which was the correct choice for the genus 1 case. In this expression $\mu = f_{\bar z}/f_z$ should be a Beltrami differential for $\Gamma$, which is necessary for an invariant definition of the generating functional $W[\mu]$. This imposes strong conditions on the possible choices of the mapping $f$. It should be noted in the first place that, contrary to the genus zero case, the correspondence $f \mapsto \mu(f) = f_{\bar z}/f_z$ is no longer one-to-one. Indeed, the solution of the Beltrami equation $f_{\bar z} = \mu f_z$ on $\mathbb{H}$ depends on the extension of the Beltrami coefficient $\mu$ to the lower half-plane $\bar{\mathbb{H}}$ of the complex plane $\mathbb{C}$. There are two canonical choices compatible with the action of $\Gamma$. In the first case

$$\mu(\bar z, z) \overset{\text{def}}{=} \overline{\mu(z,\bar z)}, \quad z\in\mathbb{H},$$

whereas in the second case

$$\mu(z,\bar z) \overset{\text{def}}{=} 0, \quad z\in\bar{\mathbb{H}}.$$

In both cases, the property of $\mu$ being a Beltrami differential for $\Gamma$ is equivalent to the following equivariance property of $f$ (the solution of the Beltrami equation in $\mathbb{C}$). There should exist an isomorphism $\Gamma \ni \gamma \mapsto \tilde\gamma \in \tilde\Gamma \subset \mathrm{PSL}(2,\mathbb{C})$, such that

$$f\circ\gamma = \tilde\gamma\circ f \quad\text{for all } \gamma\in\Gamma. \tag{1.7}$$

In the first case, the restriction of $f$ to $\mathbb{H}$ yields a self-mapping of $\mathbb{H}$ with $\tilde\Gamma$ a Fuchsian group (thus defining a Fuchsian deformation of $\Gamma$), whereas in the second case $f$ maps $\mathbb{H}$ onto the interior of a simple Jordan curve in $\mathbb{C}$ with $\tilde\Gamma$ a quasi-Fuchsian group (thus defining a quasi-Fuchsian deformation of $\Gamma$).

However, using the equivariance property of $f$ it is easy to see that the "naive" expression (1.6) cannot be considered as a correct choice for the action functional in higher genus. Indeed, it follows from (1.7) that:

1. The density $\omega[f]$ is not a (1, 1)-tensor for $\Gamma$, so that the integral (1.6) depends on the particular choice of the fundamental domain $F$.
2. The formal variation of (1.6) depends on the values of $\delta f$ on the boundary $\partial F$ of $F$.

One may try to overcome these difficulties and resolve the second problem by adding suitable "correction terms" to the functional (1.6); these can be determined by performing the formal variation of (1.6). Specifically, all local computations will be the same as in the genus zero case (see Lemma 2.6), except that now (1.7) does not allow one to get rid of the boundary terms in the Stokes formula by setting the variations $\delta\mu$ or $\delta f$ to zero on $\partial F$. Therefore, besides the local "bulk" term, the variation of (1.6) will contain "total derivative" terms localized at $\partial F$. This suggests the addition of "counterterms", which depend only on the edges of $F$, such that their variation cancels the boundary terms coming from the variation of (1.6). Such counterterms can be determined; it should be noted that a similar, though much simpler procedure was used in [33], where the Liouville action functional on the fundamental domain of a Schottky group was defined. In our case, however, the actual construction goes one step further: the variation of the edge terms produces additional quantities localized at the vertices of $\partial F$. In turn, their cancellation requires counterterms that depend on the vertices of $\partial F$, which can be determined as well.
It turns out that this rather complicated procedure, which solves problem 2, can be carried out in a canonical way using standard tools from homological algebra, namely various double complexes naturally associated with the Riemann surface $X$. It is remarkable that at the same time it solves problem 1 as well! By using the action of the group $\Gamma$ on $\mathbb{H}$, we extend the singular chain boundary differential and the de Rham differential on $\mathbb{H}$ to act on chains and cochains for the group homology and cohomology of $\Gamma$. The corresponding group boundary and coboundary differentials give rise to two double complexes such that the fundamental domain $F$ and the density $\omega[f]$ can be extended to representatives of suitable homology and cohomology classes $[\Sigma]$ and $[f]$ and the pairing between them becomes $\Gamma$-invariant. Subsequently, we define the action functional $S[f]$ as the result of such pairing, i.e. as the evaluation of $[f]$ on $[\Sigma]$.

Quite naturally, the actual computation of these representatives goes exactly like descent calculations, familiar from BRST cohomology (see, e.g. [20]). This is more than a simple analogy in the following sense. The appropriate tool for linearizing the action of a discrete group is the group ring, which leads to the group (co)homology that we are using for the action of the Fuchsian group $\Gamma$ on $\mathbb{H}$. The corresponding concept in the case of a continuous (Lie) group is the Lie algebra and its (co)homology, which is used in BRST theory.

The action functional $S[f]$ resulting from this construction looks as follows. Let $F$ be a canonical fundamental domain for $\Gamma$ in the form of a closed non-Euclidean polygon in $\mathbb{H}$ with $4g$ edges. For any $\gamma \in \Gamma$ and any pair $(\gamma_1,\gamma_2) \in \Gamma\times\Gamma$, let $\theta_\gamma[f]$ and $\Theta_{\gamma_1,\gamma_2}[f]$ be a 1-form and a function on $\mathbb{H}$ given by the following explicit expressions:

$$\theta_{\gamma^{-1}}[f] = \log(\tilde\gamma'\circ f)\, d\log f_z - \log(f_z\circ\gamma)\, d\log\gamma' - 2\,\frac{\gamma''}{\gamma'}\,\mu\, d\bar z,$$

$$d\Theta_{\gamma_2^{-1},\gamma_1^{-1}}[f] = f^*\big(\log\tilde\gamma_1'\circ\tilde\gamma_2\; d\log\tilde\gamma_2'\big) + \log\gamma_2'\; d\log\gamma_1'\circ\gamma_2 - \frac12\, f^*\big(d\log^2\tilde\gamma_2'\big) - \frac12\, d\log^2\gamma_2',$$

where $f^*$ denotes the pull-back of differential forms on $\mathbb{H}$ by the mapping $f$. Then

$$\begin{aligned}
2i\,S[f] = {} & \int_F \omega[f] - \sum_{i=1}^{g}\int_{b_i}\theta_{\beta_i}[f] + \sum_{i=1}^{g}\int_{a_i}\theta_{\alpha_i}[f] \\
& + \sum_{i=1}^{g}\Big(\Theta_{\alpha_i,\beta_i}[f](a_i(0)) - \Theta_{\beta_i,\alpha_i}[f](b_i(0)) + \Theta_{\gamma_i^{-1},\alpha_i\beta_i}[f](b_i(0))\Big) \\
& - \sum_{i=1}^{g-1}\Theta_{\gamma_g^{-1}\cdots\gamma_{i+1}^{-1},\,\gamma_i^{-1}}[f](b_g(0)).
\end{aligned} \tag{1.8}$$

Here $a_i$ and $b_i$ are the standard cycles on $X$ viewed as edges of $F$ with initial points $a_i(0)$ and $b_i(0)$, $\alpha_i$ and $\beta_i$ are the corresponding generators of the group $\Gamma$, and $\gamma_i$ stands for the commutator $[\alpha_i,\beta_i] \overset{\text{def}}{=} \alpha_i\beta_i\alpha_i^{-1}\beta_i^{-1}$.

Observe that one can formally set $g = 1$ in the representation (1.8), replacing the non-abelian groups $\Gamma$ and $\tilde\Gamma$ by the lattices $L$ and $\tilde L$, respectively. Since in this case $\gamma' = \tilde\gamma' = 1$ identically, the differential forms $\theta$ and $d\Theta$ vanish and the action functional $S[f]$ is given by the bulk term only.

It is also instructive to compare our construction with that presented in [34, 35]. Namely, in [34, 35] a solution of (1.2) was written directly on a higher genus Riemann surface equipped with additional algebro-geometric and/or dissection data. Formally, this solution also features a bulk term derived from the genus zero Polyakov action plus contributions of lower degree, but a rather complicated series of prescriptions is involved in its definition. In our construction, the functional $S[f]$ is written down on the universal cover $\mathbb{H}$ and it only depends on the choice of the normalized solution $f$ of the Beltrami equation on $\mathbb{H}$. As a result, it enjoys the same nice variational properties as in the genus zero case. Specifically, we summarize our main results as follows.

Theorem A. The functional $S[f]$ does not depend on either the choice of the fundamental domain $F$, or the choice of standard generators for the Fuchsian group $\Gamma$. It has a geometrical interpretation as a result of the evaluation map given by the canonical pairing

$$H^2(X,\mathbb{C}) \times H_2(X,\mathbb{Z}) \longrightarrow \mathbb{C},$$

where $\omega[f] - \theta[f] - \Theta[f]$ represents an element in $H^2(X,\mathbb{C})$ depending on $f$ and $F$ is canonically extended to a representative of the fundamental class of $X$ in $H_2(X,\mathbb{Z})$.


Since the action functional $S[f]$ is independent of all the choices made, the corresponding variational problem is well-defined. We shall consider two versions of it, depending on whether we choose $\mu$ or $f$, related through the Beltrami equation, to be the independent functional variable. In the first case, the independent variable belongs to the linear space of Beltrami differentials for $\Gamma$ and the "source" Fuchsian group $\Gamma$ uniquely determines the "target" Fuchsian (or quasi-Fuchsian) group $\tilde\Gamma = f\circ\Gamma\circ f^{-1}$ through the solution of the Beltrami equation ("variation with free endpoint"). In the second case, the "target" group $\tilde\Gamma$ and the homomorphism $\Gamma\to\tilde\Gamma$ are fixed a priori ("variation with fixed endpoints") and the independent variable $f$ is a self-mapping of $\mathbb{H}$ (or a mapping of $\mathbb{H}$ onto the interior of a simple Jordan curve) satisfying the equivariance property (1.7). In both cases it is guaranteed that the boundary terms arising from (1.6) are taken care of by the counterterms in (1.8), so that we have

Theorem B. The variation of the action $S[f]$ with respect to $\mu$ or $f$ is given by the formulas

$$\delta S[f] = 2\int_F T(z)\,\delta\mu(z)\, d^2z$$

and

$$\delta S[f] = -2\int_F \mu_{zzz}\,\frac{\delta f}{f_z}\, d^2z,$$

respectively.

Needless to say, the variational derivatives of $S[f]$, the quantities $T(z)$ and $\mu_{zzz}$, are, respectively, (2, 0) and (2, 1)-tensors for $\Gamma$ (see Lemma 4.2) and can therefore be pushed down to the Riemann surface $X \simeq \Gamma\backslash\mathbb{H}$.

Note that the critical points of the functional $S[f]$, considered for the mappings $f$ that intertwine a given Fuchsian group $\Gamma$ and a Fuchsian (or quasi-Fuchsian) group $\tilde\Gamma$, consist of those maps $f$ such that the corresponding $\mu = f_{\bar z}/f_z$ satisfies the "equation of motion"

$$\mu_{zzz} = 0. \tag{1.9}$$

For a given pair $\Gamma$, $\tilde\Gamma$, determining the critical set of $S[f]$ seems to be a very difficult problem. However, it is rather easy to find the dimension of the solution space of Eq. (1.9) without imposing any conditions on the target group $\tilde\Gamma = f\circ\Gamma\circ f^{-1}$. We shall show in Sect. 4, using the Riemann-Roch theorem, that this dimension is actually $4g - 3$.

Critical points of the functional $S[f]$ with respect to the variation with free endpoint satisfy the equation of motion $T(z) = 0$. They are a subset of the previous "fixed-end" critical set (cf. Lemma 2.3 and Proposition 5.2). Again, determining this set seems to be a nontrivial task.

As in the genus zero case, it follows from Theorem B that $c\,S[f]/24\pi$, considered as a functional of $\mu = f_{\bar z}/f_z$, solves Eq. (1.2), and is a solution local in the map $f$. However, in the higher genus case the correspondence $\mu \mapsto f$ is no longer one-to-one and there are at least two canonical choices for $f$, producing a Fuchsian or a quasi-Fuchsian deformation of the Fuchsian group $\Gamma$. Both functionals $c\,S[f]/24\pi$ corresponding to these mappings solve Eq. (1.2). We shall show in Sect. 4.2.2 that the difference of the corresponding stress-energy tensors is a quadratic differential for $\Gamma$, which is holomorphic with respect to the complex structure on $X$ determined by the Fuchsian and the quasi-Fuchsian deformations of $\Gamma$.
As we already mentioned, in genus zero it is possible to express the solution of (1.3) by integrating along a linear path in the space of Beltrami coefficients. Actually, as we show in 2.2, any path $\mu(t)$ that connects $\mu$ to 0 leads to the same functional. In the higher genus case, we denote by $f^{\mu(t)}$ the corresponding solutions of the Beltrami equation on $\mathbb{H}$ producing either a Fuchsian or a quasi-Fuchsian deformation of $\Gamma$, depending on the given terminal mapping $f$, and set $T^t(z) = \{f^{\mu(t)}, z\}$. According to Lemma 4.2, the definition

$$W[\mu] \overset{\text{def}}{=} \frac{c}{12\pi}\int_0^1\!\!\int_X T^t\,\dot\mu(t)\; d^2z\, dt, \tag{1.10}$$

where $\dot\mu(t) = d\mu(t)/dt$, makes perfect sense since the integrand in (1.10), being a product of a Beltrami and a quadratic differential for $\Gamma$, is a (1, 1)-tensor for $\Gamma$. We have

Theorem C. (i) Let $f$ be either a Fuchsian or a quasi-Fuchsian solution of the Beltrami equation on $\mathbb{H}$. Then

$$W[\mu] = \frac{c}{24\pi}\,S[f],$$

so that the functional $W[\mu]$ does not depend on the choice of the homotopy $\mu(t)$ and

$$\delta W = \frac{c}{12\pi}\int_X T(z)\,\delta\mu(z)\; d^2z.$$

(ii) The functional $W[\mu]$ is a holomorphic functional of $\mu$ in the quasi-Fuchsian case, while in the Fuchsian case

$$\frac{\partial^2 W[\varepsilon\mu]}{\partial\varepsilon\,\partial\bar\varepsilon}\bigg|_{\varepsilon=0} = -\frac{c}{48\pi}\int_F |\mu|^2\, y^{-2}\; d^2z,$$

for Bers harmonic Beltrami differentials $\mu$.

It is worth stressing again that $W$, as defined in (1.10), is but one possible solution to the universal CWI on $X$: we have already noted that the solution corresponding to a given CFT with central charge $c$ may differ from (1.10) by a term involving a quadratic differential for $\Gamma$, which is the expectation value of the stress-energy tensor of that CFT. (Similar observations about the lack of uniqueness in the solution to the CWI due to holomorphic quadratic differentials appear in [34, 35].) Moreover, the fact that in higher genus the correspondence $\mu \mapsto f$ ceases to be one-to-one clearly affects the value of (1.10), which will depend on the prescription used to solve the Beltrami equation. These observations lead to the question of what features of conformal field theories at central charge $c$ are actually conveyed by (1.10). Since, according to Theorem C, the solution (1.10) featuring a quasi-Fuchsian deformation depends holomorphically on $\mu$, it is natural to conjecture that the corresponding functional $W[\mu]$ (or $(c/24\pi)S[f]$, through Theorem C) represents a universal feature of all conformal field theories with central charge $c$.

We also observe that (1.10) can be considered as a WZW type functional, since it is obtained by integrating over a path in the field space. Theorem C says that this term also has a local representation in two dimensions. This parallels the genus zero situation, where Polyakov's action in the light-cone gauge can actually be derived from a WZNW model [2]. (See also [31, 32] for the analogous situation in the conformal gauge.) In that case, one obtains a local functional in two dimensions as a consequence of the topological triviality of the WZW term for the group $SL_2(\mathbb{R})$.


1.1. The organization of this paper is as follows. In Sect. 2 we present a consistent formulation of two-dimensional induced gravity in the conformal gauge using quasi-conformal (even smooth) mappings of $\mathbb{C}$ and without using any analytic continuation from the light-cone gauge or treating $z$ and $\bar z$ as independent variables. There we gather all results, based on local computations, that will be used in the subsequent sections. Essentially all these results are known (see [18, 26, 31, 32]) and we present them mainly for the convenience of the reader and in order to make the paper self-contained. We also discuss in detail the formulation based on the functional $W[\mu]$ from [18], prove that it coincides with Polyakov's action functional (which was implicitly contained in [31]) and compute the Hessians of the action functionals $S[f]$ and $W[\mu]$.

We start Sect. 3 by briefly discussing the genus 1 case. Next, we recall the standard concepts from homological algebra and differential topology that are needed to treat the case of higher genus Riemann surfaces, relegating the proofs of some rather technical results to the appendix. We then present the explicit construction of the representatives of the fundamental class $[\Sigma]$ and the cohomology class $[f]$ corresponding to the fundamental domain $F$ and the density $\omega[f]$, respectively.

In Sect. 4 we finally define an analog of Polyakov's action functional for the Riemann surface $X$ of genus $g > 1$ and prove Theorems A, B and C. We also prove that the solution space of the equation $\mu_{zzz} = 0$ is $(4g - 3)$-dimensional and compute the Hessians of the action functionals $S[f]$ and $W[\mu]$.

The relation of the constructions presented in Sects. 3 and 4 with the geometry of various fiber spaces over the Teichmüller space is analyzed in Sect. 5. There we describe $\exp(-W[\mu])$ as a section of a line bundle over Teichmüller space, making contact with previous work on the subject. In the last subsection we draw our conclusions and set some directions for future work.

2. Generating Functional and Polyakov’s Action in Genus Zero

2.1. Let $f$ be a normalized self-mapping of the complex plane $\mathbb{C}$, i.e. an orientation preserving diffeomorphism of the Riemann sphere $\mathbb{P}^1 = \mathbb{C}\cup\{\infty\}$ fixing $0, 1, \infty$. Define a map $f \mapsto \mu = \mu(f) = f_{\bar z}/f_z$, where $\mu$ is a smooth Beltrami coefficient on $\mathbb{C}$: a smooth bounded function such that $|\mu| < 1$. The following basic result of the theory of quasi-conformal mappings guarantees that the correspondence $f \mapsto \mu$ is one-to-one and onto.

Proposition 2.1. Let $\mu \in L^\infty(\mathbb{C})$ (the Banach space of measurable functions with finite sup norm) be such that $\|\mu\|_\infty < 1$. Then the Beltrami equation

$$f_{\bar z} = \mu f_z \tag{2.1}$$

has a unique solution $f$ fixing $0, 1, \infty$ which is an orientation preserving quasi-conformal homeomorphism of $\mathbb{C}$. The solution is smooth (real-analytic) whenever $\mu$ is smooth (real-analytic).

Proof. See [1].
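The simplest instance of Proposition 2.1 is a constant Beltrami coefficient, for which the normalized solution is an affine map. The sketch below is an illustration only and rests on stated assumptions: the constant `a`, the sample point `z0`, and the helper names `f` and `wirtinger` are hypothetical, and the Wirtinger derivatives are approximated by central differences rather than computed exactly. It checks that $f(z) = (z + a\bar z)/(1 + a)$ fixes 0 and 1, has Beltrami coefficient $a$, and preserves orientation when $|a| < 1$.

```python
def f(z, a):
    """Affine map fixing 0, 1 and infinity with constant Beltrami coefficient a."""
    return (z + a * z.conjugate()) / (1 + a)

def wirtinger(func, z, h=1e-6):
    """Central-difference approximation of (f_z, f_zbar) at the point z."""
    fx = (func(z + h) - func(z - h)) / (2 * h)            # derivative in x
    fy = (func(z + 1j * h) - func(z - 1j * h)) / (2 * h)  # derivative in y
    f_z = 0.5 * (fx - 1j * fy)
    f_zbar = 0.5 * (fx + 1j * fy)
    return f_z, f_zbar

a = 0.3 + 0.2j           # hypothetical constant coefficient with |a| < 1
z0 = 0.4 - 1.1j          # arbitrary sample point

f_z, f_zbar = wirtinger(lambda z: f(z, a), z0)
mu = f_zbar / f_z                               # Beltrami coefficient of f
jacobian = abs(f_z) ** 2 - abs(f_zbar) ** 2     # positive iff orientation preserving
```

For this affine map $f_z = 1/(1+a)$ and $f_{\bar z} = a/(1+a)$, so the density $\omega[f]$ vanishes identically; nontrivial examples require a non-constant $\mu$.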

Let $\omega[f]$ be the following (1, 1)-form

$$\omega[f] = \frac{f_{zz}}{f_z}\,\mu_z\; dz\wedge d\bar z, \tag{2.2}$$

which (see the introduction) we identify as the density of Polyakov's action functional. Here and elsewhere it is understood that $\mu = \mu(f)$. From now on we also assume that $f(z,\bar z) - z \to 0$ as $|z| \to \infty$ in such a way that the (1, 1)-form $\omega[f]$ is integrable on $\mathbb{C}$. (One can simply consider $\mu$ with finite support; other less restrictive conditions for the difference $f(z,\bar z) - z$ can be formulated in terms of Sobolev spaces.) Define the functional

$$S[f] = \frac{1}{2i}\int_{\mathbb{C}}\omega[f] = -\int_{\mathbb{C}}\frac{f_{zz}}{f_z}\,\mu_z\; d^2z. \tag{2.3}$$

Remark 2.2. The functional $S[f]$ is the Euclidean version of Polyakov's action functional for two-dimensional quantum gravity in the light-cone gauge [26]. Let us recall that it can also be formally obtained (cf. [30]) as a "chiral" version of the Liouville action

$$A[\phi] = \frac12\int_{\mathbb{C}}\sqrt{h}\,\big(h^{ab}\,\partial_a\phi\,\partial_b\phi + \phi\,R_h\big)$$

(where $x_1 = x$, $x_2 = y$ and $R_h$ is the curvature of the background metric $h$), in the following way. Consider the "metric" $h = (dz + \mu\, d\bar z)\otimes d\bar z$, $\mu = \mu(f)$, and set $\phi = \log f_z$. Since $R_h = 2\mu_{zz}$, the integrand in $A[\phi]$ is equal to

$$\phi_z\phi_{\bar z} + 2\mu\Big(-\frac12\,\phi_z^2 + \phi_{zz}\Big) = -\frac{f_{zz}}{f_z}\,\mu_z + 2\Big(\frac{f_{zz}}{f_z}\,\mu\Big)_z.$$

Let $T = \{f, z\}$ be the Schwarzian derivative of the mapping $f$. We have the following identity, which could also be looked at as an "equation for the trace anomaly" [26, 32].

Lemma 2.3.

$$(\bar\partial - \mu\partial - 2\mu_z)\,T = \mu_{zzz}.$$

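Proof. A direct computation using the definitions of $\mu$ and of the Schwarzian derivative.

Lemma 2.4. The functional $S[f]$ is smooth in the sense that its variational derivative $\delta S/\delta\mu(z)$, defined as

$$\frac{d}{dt}\bigg|_{t=0} S(\mu + t\,\delta\mu) = \int_{\mathbb{C}}\frac{\delta S}{\delta\mu}\,\delta\mu\; d^2z,$$

exists and is given by

$$\frac{\delta S}{\delta\mu(z)} = 2\,T(z).$$

Proof. Starting with the formula

$$\delta\mu = \frac{\delta f_{\bar z}}{f_z} - \mu\,\frac{\delta f_z}{f_z}, \tag{2.4}$$

that relates the variations of $\mu$ and $f$, we get by a straightforward computation

$$\delta\omega = \Big(\Big(\frac{\delta f_z}{f_z}\Big)_z\,\mu_z + \frac{f_{zz}}{f_z}\,\delta\mu_z\Big)\, dz\wedge d\bar z = -2\,T\,\delta\mu\; dz\wedge d\bar z - d\eta, \tag{2.5}$$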


where

η[f ; δf ] =

fzz δfz¯ µz δfz + − fz2 fz

fzz fz

fzz δfz µ dz + d z¯ . fz2 z

Proposition 2.5. The functional $c\,S[f]/24\pi$ is the unique solution of the universal CWI for the stress-energy tensor.

Proof. It follows immediately from Lemmas 2.3 and 2.4 that $c\,S[f]/24\pi$, considered as a functional of $\mu$, satisfies Eq. (1.2)

$$(\bar\partial - \mu\partial - 2\mu_z)\,\frac{\delta W}{\delta\mu(z)} = \frac{c}{12\pi}\,\mu_{zzz}.$$

To prove uniqueness, consider the difference

$$Q[\mu](z) = (f_z)^{-2}\Big(\frac{\delta W}{\delta\mu(z)} - \frac{c}{24\pi}\,\frac{\delta S}{\delta\mu(z)}\Big)$$

and observe (cf. [22, 31]) that it satisfies the following equation

$$(\bar\partial - \mu\partial)\,Q[\mu](z) = 0,$$

which shows that $Q[\mu](z,\bar z)$ is holomorphic with respect to the new complex structure on $\mathbb{C}$ defined by the Cauchy-Riemann operator $\bar\partial - \mu\partial$, with complex coordinates $\zeta = f(z,\bar z)$, $\bar\zeta = \overline{f(z,\bar z)}$. Recalling that $\delta W/\delta\mu(z)$, as well as $T(z)$, vanish as $|z| \to \infty$ (regularity of the stress-energy tensor at $\infty$) we conclude that $Q[\mu]$ is an entire function of $\zeta$ vanishing at $\infty$, so that $Q[\mu] = 0$. Therefore, the functional

$$\frac{c}{24\pi}\,S[f] = -\frac{c}{24\pi}\int_{\mathbb{C}}\frac{f_{zz}}{f_z}\,\mu_z\; d^2z$$

solves the universal CWI (1.2) on $\mathbb{P}^1$.

Next, we compute the variation of $S$ with respect to $f$ and determine the classical equations of motion: the critical points $\delta S[f] = 0$ of the functional $S$.

Lemma 2.6.

$$\delta S[f] = -2\int_{\mathbb{C}}(T_{\bar z} - \mu T_z - 2\mu_z T)\,\frac{\delta f}{f_z}\; d^2z = -2\int_{\mathbb{C}}\mu_{zzz}\,\frac{\delta f}{f_z}\; d^2z,$$

so that the classical equation of motion is $\mu_{zzz} = 0$.

Proof. It follows from the identity

$$T\,\delta\mu\; dz\wedge d\bar z = (-T_{\bar z} + \mu T_z + 2\mu_z T)\,\frac{\delta f}{f_z}\; dz\wedge d\bar z - d\eta_0,$$

where

$$\eta_0 = T\,\frac{\delta f}{f_z}\, dz + \mu\,T\,\frac{\delta f}{f_z}\, d\bar z,$$

and from Lemma 2.3.


2.2. Let $\mu(t)$, $0 \le t \le 1$, be a path in the space of Beltrami coefficients connecting 0 with the given Beltrami coefficient $\mu$. It gives rise to a homotopy $f^t = f^{\mu(t)}$, $f^0 = \mathrm{id}$, $f^1 = f$, that consists of normalized quasi-conformal mappings satisfying the Beltrami equation $f^t_{\bar z} = \mu(t) f^t_z$. Denoting the corresponding Schwarzians as $T^t(z) = \{f^t, z\}$, so that $T^0 = 0$ and $T^1 = T$, we have the following useful variational formulas.

Lemma 2.7.

$$\mu(t)_{zzz} = (\bar\partial - \mu(t)\,\partial - 2\mu(t)_z)(T^t), \tag{i}$$

$$\delta T^t = (\partial^3 + 2T^t\partial + T^t_z)(u^t), \tag{ii}$$

$$\delta\mu(t) = (\bar\partial - \mu(t)\,\partial + \mu(t)_z)(u^t), \tag{iii}$$

where $u^t = \delta f^t/f^t_z$.

Proof. Equation (i) is just a restatement of Lemma 2.3, applied to the map $f^t$. The variational formula (ii) is verified by a straightforward (though lengthy) computation using $T = \{f, z\}$ and the definition of the Schwarzian derivative. Finally, Eq. (iii) follows from the variational formula (2.4), written as

$$\delta\mu = (\bar\partial - \mu\partial + \mu_z)\Big(\frac{\delta f}{f_z}\Big)$$

and specialized to the map $f^t$.

As follows from Lemma 2.7, the differential operators $\mathcal{T} = \partial^3 + 2T\partial + T_z$ and $\mathcal{M} = \bar\partial - \mu\partial + \mu_z$ play a fundamental role in the variational theory. In particular, the third-order differential operator $\mathcal{T}$ appears in many other areas as well. It serves as a Jacobi operator for the second Poisson structure for the KdV equation [24] that is given by the Virasoro algebra and it plays an important role in Eichler cohomology on Riemann surfaces [17]. The operator $\mathcal{T}$ is skew-symmetric, $\mathcal{T}^\tau = -\mathcal{T}$, with respect to the inner product given by

$$(u, v) = \int_{\mathbb{C}} u\,v\; d^2z, \tag{2.6}$$

whereas $\mathcal{M}^\tau = -\mathcal{D}$, where $\mathcal{D} \overset{\text{def}}{=} \bar\partial - \mu\partial - 2\mu_z$. However, we have the following result.

Lemma 2.8. The operator $\mathcal{T}\mathcal{M}$ is symmetric.

Proof. Since $(\mathcal{T}\mathcal{M})^\tau = \mathcal{M}^\tau\mathcal{T}^\tau = \mathcal{D}\mathcal{T}$, it reduces to the verification of the identity $\mathcal{T}\mathcal{M} = \mathcal{D}\mathcal{T}$, or

$$(\partial^3 + 2T\partial + T_z)(\bar\partial - \mu\partial + \mu_z) = (\bar\partial - \mu\partial - 2\mu_z)(\partial^3 + 2T\partial + T_z),$$

which immediately follows from Lemma 2.3 and $T = \{f, z\}$.


Now, let us introduce the functional

$$W[\mu] = \frac{c}{12\pi}\int_0^1\!\!\int_{\mathbb{C}} T^t\,\dot\mu(t)\; d^2z\, dt, \tag{2.7}$$

where the dot stands for $d/dt$. A priori it may depend on the choice of the homotopy $\mu(t)$. The following result shows that the variational derivative of $W$ with respect to $\mu = \mu(1)$ does not depend on $\mu(t)$.

Lemma 2.9.

$$\frac{\delta W}{\delta\mu(z)} = \frac{c}{12\pi}\,T(z).$$

Proof. Writing $\delta(T^t\dot\mu(t)) = \delta T^t\,\dot\mu(t) + T^t\,\delta\dot\mu(t)$ and using (ii) in Lemma 2.7, together with the relation

$$\dot\mu(t) = \mathcal{M}^t(v^t) \tag{2.8}$$

(where $v^t = \dot f^t/f^t_z$), which follows from formula (iii) of Lemma 2.7 applied to $\delta = d/dt$, we get

$$\delta T^t\,\dot\mu(t) = \mathcal{T}^t(u^t)\,\mathcal{M}^t(v^t).$$

Using Lemma 2.8, Eqs. (2.8), (iii) and the equation $\dot T^t = \mathcal{T}^t(v^t)$, which follows from formula (ii) of Lemma 2.7 applied to $\delta = d/dt$, we obtain

$$\begin{aligned}
\int_{\mathbb{C}}\delta T^t\,\dot\mu(t)\; d^2z &= (\mathcal{T}^t(u^t), \mathcal{M}^t(v^t)) = -(u^t, \mathcal{T}^t\mathcal{M}^t(v^t)) \\
&= (u^t, (\mathcal{M}^t)^\tau\mathcal{T}^t(v^t)) = (\mathcal{M}^t(u^t), \mathcal{T}^t(v^t)) \\
&= \int_{\mathbb{C}}\delta\mu(t)\,\dot T^t\; d^2z.
\end{aligned}$$

Substituting this into the expression for $\delta W$, we get

$$\int_0^1\big(\dot T^t\,\delta\mu(t) + T^t\,\delta\dot\mu(t)\big)\, dt = T^t\,\delta\mu(t)\Big|_{t=0}^{t=1} = T\,\delta\mu,$$

which completes the proof.

Moreover, as the next result shows, the functional $W$ is actually independent of the choice of the path $\mu(t)$ connecting the points 0 and $\mu$ in the space of Beltrami coefficients.

Proposition 2.10.

$$W[\mu] = \frac{c}{24\pi}\,S[f],$$

where $f$ and $\mu$ are related through $\mu = f_{\bar z}/f_z$.

Proof. It is essentially the computation in Lemma 2.4, done in the reverse order. Namely, considering the families $\mu(t)$ and $f^{\mu(t)}$ and using the formula (2.5) for the case $\delta = d/dt$, we get

$$-2\,T^t\,\dot\mu(t)\; dz\wedge d\bar z = \frac{d}{dt}\Big(\frac{f^t_{zz}}{f^t_z}\,\mu(t)_z\Big)\, dz\wedge d\bar z + d\eta[f^t; \dot f^t],$$

which after integrating over $\mathbb{C}\times[0, 1]$ yields the result.


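2.3. Here we compute the Hessian of the functional $S[f]$, i.e. its second variation with respect to $f$, evaluated at the critical point. Let $\delta_1 f$ and $\delta_2 f$ be two variations of $f$, defined through the two-parameter family $f_{s,t}$ with $f_{0,0} = f$ as

$$\delta_1 f = \frac{\partial f_{s,t}}{\partial s}\bigg|_{s=t=0}, \qquad \delta_2 f = \frac{\partial f_{s,t}}{\partial t}\bigg|_{s=t=0}.$$

The second variation of $S[f]$ is

$$\delta^2 S[f] = \frac{d^2}{ds\, dt}\bigg|_{s=t=0} S[f_{s,t}],$$

and it can be computed using the first variation of $S[f]$ from Lemma 2.6,

$$\delta_1 S[f] = -2\int_{\mathbb{C}}\mu_{zzz}\,\frac{\delta_1 f}{f_z}\; d^2z,$$

by evaluating $\delta_2(\mu_{zzz}[f])$. As follows from Lemma 2.7,

$$\delta_2\,\mu_{zzz}[f] = \partial^3\circ\mathcal{M}\Big(\frac{\delta_2 f}{f_z}\Big), \tag{2.9}$$

so that

$$\delta^2 S[f](\delta_1 f, \delta_2 f) = -2\int_{\mathbb{C}}\frac{\delta_1 f}{f_z}\;\partial^3\circ\mathcal{M}\Big(\frac{\delta_2 f}{f_z}\Big)\, d^2z. \tag{2.10}$$

The Hessian is symmetric, so that the right hand side of (2.10) should be a symmetric bilinear form in $\delta_1 f$, $\delta_2 f$ whenever $\mu_{zzz} = 0$. This can be verified directly, as we have

Lemma 2.11. The operator $\partial^3\circ\mathcal{M}$ for $\mu_{zzz} = 0$ is symmetric with respect to the bilinear form (2.6).

Proof. Using $(\partial^3)^\tau = -\partial^3$ we have

$$(\partial^3\circ\mathcal{M})^\tau = \mathcal{D}\circ\partial^3,$$

where $\mathcal{D} = \bar\partial - \mu\partial - 2\mu_z$, and it is straightforward to verify the following identity when $\mu_{zzz} = 0$:

$$\partial^3\circ\mathcal{M} = \mathcal{D}\circ\partial^3.$$

Similarly, one can compute the Hessian of the functional $W[\mu]$. We have

Lemma 2.12.

$$\delta^2 W[\mu](\delta_1\mu, \delta_2\mu) = \frac{c}{12\pi}\int_{\mathbb{C}}\delta_1\mu\;\partial^3\circ\mathcal{M}^{-1}(\delta_2\mu)\; d^2z.$$

Remark 2.13. Since

$$\mathcal{M}\Big(\frac{u\circ f}{f_z}\Big) = \frac{\overline{f_z}}{f_z}\,(1 - |\mu|^2)\,(\bar\partial u)\circ f, \tag{2.11}$$

the operator $\mathcal{M}$ is invertible on the subspace of smooth functions on $\mathbb{C}$ vanishing at $\infty$.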


3. Algebraic and Topological Constructions

3.1. Here we consider the genus 1 case. Let $X$ be an elliptic curve, i.e. a compact Riemann surface of genus 1, realized as the quotient $X \cong L\backslash\mathbb{C}$, where $L$ is a rank 2 lattice in $\mathbb{C}$, generated by the translations $\alpha(z) = z + 1$ and $\beta(z) = z + \tau$, where $\operatorname{Im}\tau > 0$. Let $\mu$ be a Beltrami coefficient for $L$, i.e. a function on $\mathbb{C}$ with $\|\mu\|_\infty < 1$ satisfying

$$\mu\circ\gamma = \mu \quad\text{for all } \gamma\in L,$$

and let $f = f^\mu$ be the normalized (fixing $0, 1, \infty$) solution of the Beltrami equation on $\mathbb{C}$,

$$f_{\bar z} = \mu f_z.$$

It is easy to see that $f\circ L = \tilde L\circ f$, where $\tilde L$ is the rank 2 lattice in $\mathbb{C}$ generated by 1 and $\tilde\tau = f(\tau)$. Indeed, $\tilde\gamma = f\circ\gamma\circ f^{-1}$ is a parabolic element in $\mathrm{PSL}(2,\mathbb{C})$ fixing $\infty$, i.e. a translation $z \mapsto z + h$, and it follows from the normalization that $f(z + 1) = f(z) + 1$. Therefore the (1, 1)-form $\omega[f]$ on $\mathbb{C}$ is well-defined on $X$ so that the action functional takes the form

$$S[f] = \frac{1}{2i}\int_\Pi \omega[f],$$

where $\Pi$ is the fundamental parallelogram for the lattice $L$.

3.2. Here we consider the higher genus case and construct double complexes that extend the singular chain and the de Rham complexes on $\mathbb{H}$. We extend the fundamental domain $F$ for $\Gamma$ and the (1, 1)-form $\omega[f]$ on $\mathbb{H}$ to representatives of the homology and cohomology classes $[\Sigma]$ and $[f]$ for these double complexes.

3.2.1. Let $X \cong \Gamma\backslash\mathbb{H}$ be a compact Riemann surface of genus $g > 1$, realized as the quotient of the upper half-plane $\mathbb{H}$ by the action of a strictly hyperbolic Fuchsian group $\Gamma$. Recall that the group $\Gamma$ is called marked if there is a chosen system, up to inner automorphism, of $2g$ free generators $\alpha_1, \dots, \alpha_g, \beta_1, \dots, \beta_g$ satisfying the single relation

$$[\alpha_1, \beta_1]\cdots[\alpha_g, \beta_g] = 1, \tag{3.1}$$

where $[\alpha_i, \beta_i] \overset{\text{def}}{=} \alpha_i\beta_i\alpha_i^{-1}\beta_i^{-1}$ and 1 is the unit element in $\Gamma$. For every choice of the marking there is a standard choice of a fundamental domain $F \subset \mathbb{H}$ for $\Gamma$ as a closed non-Euclidean polygon with $4g$ edges, pairwise identified by suitable group elements. We will use the following normalization (see, e.g., [19] and Fig. 1). The edges of $F$ are labelled by $a_i, a_i', b_i, b_i'$ and $\alpha_i(a_i') = a_i$, $\beta_i(b_i') = b_i$ for all $i = 1, \dots, g$; the orientation of the edges is chosen so that $\partial F = \sum_{i=1}^{g}(a_i + b_i' - a_i' - b_i)$. Also we set $\partial a_i = a_i(1) - a_i(0)$ and $\partial b_i = b_i(1) - b_i(0)$, where the label "1" represents the end point and the label "0" the initial point with respect to the edge's orientation. One has the following relations between the vertices of $F$ and the generators: $a_i(0) = b_{i+1}(0)$, $\alpha_i^{-1}(a_i(0)) = b_i(1)$, $\beta_i^{-1}(b_i(0)) = a_i(1)$ and $[\alpha_i, \beta_i](b_i(0)) = b_{i-1}(0)$, where, in accordance with (3.1), $b_0(0) = b_g(0)$.

3.2.2. Let $\mu$ be a Beltrami differential for the Fuchsian group $\Gamma$, i.e. a bounded ($L^\infty(\mathbb{H})$) function on $\mathbb{H}$ satisfying

$$\mu\circ\gamma\;\frac{\overline{\gamma'}}{\gamma'} = \mu \quad\text{for all } \gamma\in\Gamma.$$

Fig. 1. Conventions for the fundamental domain F (figure not reproduced; the edges carry the labels a1, a'1, b1, b'1, a2, a'2, b2, b'2)

In addition, it is called a Beltrami coefficient for 0 when ||µ||∞ < 1. Denote by f = f µ the normalized (fixing 0, 1 and ∞) solution of the Beltrami equation on H fz¯ = µfz . As it was already explained in the introduction, we consider f to be either a self-mapping of H, or a mapping of H onto the interior of a simple Jordan curve in C, uniquely determined by µ. These two choices can be realized by considering the Beltrami equation on the whole complex plane C: in the former case the Beltrami coefficient µ is extended to the lower half-plane H by reflecting it through the real line R, while in the latter µ is ˜ ⊂ PSL(2, C), isomorphic to 0 as an extended by zero in H. In both cases there exists 0 ˜ abstract group and such that f intertwines between 0 and 0 f ◦ γ = γ˜ ◦ f

for all γ ∈ 0,

˜ ⊂ which actually defines the isomorphism γ 7→ γ. ˜ In the first case we have that 0 PSL(2, R) and it is in fact a Fuchsian group, a Fuchsian deformation of 0. In the second ˜ is a so-called quasi-Fuchsian group, a special case of a Kleinian group. Its domain case 0 of discontinuity has two invariant components, the interior and the exterior of a simple Jordan curve in C, which is the image of the real line R under the mapping f and is ˜ These mappings, introduced and studied by Ahlfors and Bers, play a a limit set for 0. fundamental role in Teichm¨uller theory (see, e.g. [16]). 3.2.3. Let S• ≡ S• (X0 ) be the standard singular chain complex of H with the differential ∂ 0 . (From now on, we will denote the singular chain differential by ∂ 0 , as the symbol ∂ will be reserved for the total differential in a double complex, to be introduced below.) The group 0 acts on H and induces a left action on S• by translating the chains, hence S• becomes a complex of 0-modules. Since the action of 0 on H is proper, S• is a complex of left free Z0-modules [23], where Z0 is the integral group ring of 0: the set P of finite combinations γ∈0 nγ γ with coefficients nγ ∈ Z. Let B• ≡ B• (Z0) be the canonical “bar” resolution complex for 0, with differential ∂ 00 . Each Bn (Z0) is a free left 0-module on generators [γ1 | . . . |γn ], with the differential ∂ 00 : Bn → Bn−1 given by ∂ 00 [γ1 | . . . |γn ] = γ1 [γ2 | . . . |γn ] +

Pn−1

(−1)i [γ1 | . . . |γi γi+1 | . . . |γn ] +(−1)n [γ1 | . . . |γn−1 ] i=1

46

E. Aldrovandi, L.A. Takhtajan

for n > 1 and by

∂ 00 [γ] = γ[ ] − [ ]

for n = 1. Here [γ1 | . . . |γn ] is defined to be zero if any of the group elements inside [. . .] equals the unit element 1 in 0. B0 (Z0) is a Z0-module on one generator [ ], and can be identified with Z0 under the isomorphism that sends [ ] to 1. Next, consider the double complex K•,• = S• ⊗Z0 B• . The associated total simple complex Tot K is equipped with the total differential ∂ = ∂ 0 + (−1)p ∂ 00 on Kp,q . For the sake of future reference, we observe that S• is identified with S• ⊗Z0 B0 under the correspondence c 7→ c ⊗ [ ]. Remark 3.1. Since S• and B• are both complexes of left 0-modules, in order to define their tensor product over Z0 we need to endow each Sn with a right 0-module structure. def

This is done in the standard fashion by setting $c\cdot\gamma \overset{\mathrm{def}}{=} \gamma^{-1}(c)$. As a result $S \otimes_{\mathbb{Z}\Gamma} B = (S \otimes_{\mathbb{Z}} B)_\Gamma$, so that the tensor product over the integral group ring of $\Gamma$ can be obtained as the set of $\Gamma$-invariants in the usual tensor product (over $\mathbb{Z}$) as abelian groups [9].
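As a sanity check on the normalized bar differential of 3.2.3, the following minimal Python sketch verifies that the differential squares to zero on generators of low degree. It is an illustration under stated assumptions, not part of the construction: the finite symmetric group $S_3$ stands in for the Fuchsian group of the text, and the names `mul`, `add_term`, `d_generator`, `d` and `squares_to_zero` are hypothetical helpers introduced here.

```python
from itertools import permutations, product

# Assumption: the finite group S_3 stands in for the Fuchsian group of the text.
# Elements are permutation tuples; mul(p, q) is the composition p∘q.
E = (0, 1, 2)
G = list(permutations(range(3)))


def mul(p, q):
    return tuple(p[i] for i in q)


def add_term(chain, g, bar, coeff):
    """Add coeff * g[bar] to chain; symbols containing the unit are zero."""
    if E in bar:
        return
    key = (g, bar)
    value = chain.get(key, 0) + coeff
    if value:
        chain[key] = value
    else:
        chain.pop(key, None)


def d_generator(bar):
    """Bar differential of the generator [g1|...|gn], n >= 1."""
    n = len(bar)
    out = {}
    add_term(out, bar[0], bar[1:], 1)            # g1 [g2|...|gn]
    for i in range(1, n):                        # (-1)^i [..|g_i g_{i+1}|..]
        merged = bar[:i - 1] + (mul(bar[i - 1], bar[i]),) + bar[i + 1:]
        add_term(out, E, merged, (-1) ** i)
    add_term(out, E, bar[:-1], (-1) ** n)        # (-1)^n [g1|...|g_{n-1}]
    return out


def d(chain):
    """Extend d_generator linearly over the group ring (left action)."""
    out = {}
    for (g, bar), c in chain.items():
        if not bar:          # degree zero: no differential (augmentation omitted)
            continue
        for (h, b2), c2 in d_generator(bar).items():
            add_term(out, mul(g, h), b2, c * c2)
    return out


def squares_to_zero(max_degree=3):
    """Check d(d(x)) == 0 on every non-degenerate generator up to max_degree."""
    non_unit = [g for g in G if g != E]
    return all(
        d(d({(E, bar): 1})) == {}
        for n in range(1, max_degree + 1)
        for bar in product(non_unit, repeat=n)
    )
```

Chains are stored as dictionaries mapping a pair (group element, bar symbol) to an integer coefficient, so that the left module structure and the vanishing of degenerate symbols are both explicit. Calling `squares_to_zero()` runs the check on all non-degenerate generators of degree at most three.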

The application of standard spectral sequence machinery, together with the trivial fact that $\mathbb{H}$ is acyclic, leads to the following lemma, whose formal proof immediately follows, for example, from [23], Theorem XI.7.1 and Corollary XI.7.2.

Lemma 3.2. There are isomorphisms
\[
H_\bullet(X,\mathbb{Z}) \cong H_\bullet(\Gamma,\mathbb{Z}) \cong H_\bullet(\mathrm{Tot}\,K_{\bullet,\bullet}),
\]
where the three homologies are the singular homology of $X$, the group homology of $\Gamma$ and the homology of the complex $\mathrm{Tot}\,K_{\bullet,\bullet}$ with respect to the total differential $\partial$.

We will use this lemma in the construction of the explicit cycle $\Sigma$ in $\mathrm{Tot}\,K$ that extends the fundamental domain $F$. For the convenience of the reader we present a simple-minded proof of Lemma 3.2 in Appendix A.

3.2.4. We now turn to constructions dual to those in 3. Denote by $A^\bullet \equiv A^\bullet_{\mathbb{C}}(X_0)$ the complexified de Rham complex on $\mathbb{H}$. Each $A^n$ is a left $\Gamma$-module with the pull-back action of $\Gamma$, i.e. $\gamma\cdot\phi \overset{\mathrm{def}}{=} (\gamma^{-1})^*\phi$ for $\phi \in A^\bullet$ and for all $\gamma \in \Gamma$. Consider the double complex $C^{p,q} = \mathrm{Hom}(B_q, A^p)$ with differentials $d$, the usual de Rham differential, and $\delta = (\partial'')^*$, the group coboundary. Specifically, for $\phi \in C^{p,q}$,
\[
(\delta\phi)_{\gamma_1,\dots,\gamma_{q+1}} = \gamma_1\cdot\phi_{\gamma_2,\dots,\gamma_{q+1}} + \sum_{i=1}^{q}(-1)^i\,\phi_{\gamma_1,\dots,\gamma_i\gamma_{i+1},\dots,\gamma_{q+1}} + (-1)^{q+1}\phi_{\gamma_1,\dots,\gamma_q}.
\]
As usual, the total differential on $C^{p,q}$ is $D = d + (-1)^p\delta$. Either by dualizing Lemma 3.2 or working out the spectral sequences resulting from $C$, we obtain the following lemma.

Lemma 3.3. There are isomorphisms
\[
H^\bullet(X,\mathbb{C}) \cong H^\bullet(\Gamma,\mathbb{C}) \cong H^\bullet(\mathrm{Tot}\,C^{\bullet,\bullet}),
\]
where the three cohomologies are the de Rham cohomology of $X$, the group cohomology of $\Gamma$ and the cohomology of the complex $\mathrm{Tot}\,C^{\bullet,\bullet}$ with respect to the total differential $D$.

As for Lemma 3.2, a simpler proof can also be found in Appendix A. Finally, there exists a natural pairing between $C^{p,q}$ and $K_{p,q}$ which assigns to the pair $(\phi, c\otimes[\gamma_1|\dots|\gamma_q])$ the evaluation of the form $\phi_{\gamma_1,\dots,\gamma_q}$ over the chain $c$,
\[
\langle \phi, c\otimes[\gamma_1|\dots|\gamma_q]\rangle = \int_c \phi_{\gamma_1,\dots,\gamma_q}. \tag{3.2}
\]
By the very construction of the double complexes $C^{\bullet,\bullet}$ and $K_{\bullet,\bullet}$, the total differentials $D$ and $\partial$ are transpose to each other,
\[
\langle D\Phi, C\rangle = \langle \Phi, \partial C\rangle \tag{3.3}
\]
for all $\Phi \in C^{\bullet,\bullet}$, $C \in K_{\bullet,\bullet}$. Therefore the pairing (3.2) descends to the corresponding homology and cohomology groups and is non-degenerate. It defines a pairing between $H^\bullet(\mathrm{Tot}\,C^{\bullet,\bullet})$ and $H_\bullet(\mathrm{Tot}\,K_{\bullet,\bullet})$ which we continue to denote by $\langle\ ,\ \rangle$.

3.3. Here we compute explicit representatives $\Sigma$ and $\Omega_f$, for the fundamental class of the surface $X$ and a degree two cohomology class on $X$ that extend the fundamental domain $F$ and the 2-form $\omega[f]$, respectively.

3.3.1. Homology computations. Fix the marking of $\Gamma$ and choose a fundamental domain $F$ as in 3. We start by the observation that $F \cong F\otimes[\ ] \in K_{2,0}$. Furthermore, obviously $\partial''F = 0$, and
\[
\partial' F = \sum_{i=1}^{g}\bigl(b_i' - b_i - a_i' + a_i\bigr) = \sum_{i=1}^{g}\bigl(\beta_i^{-1}(b_i) - b_i - \alpha_i^{-1}(a_i) + a_i\bigr),
\]
which we can rewrite as $\partial'F = \partial''L$, where $L \in K_{1,1}$ is given by
\[
L = \sum_{i=1}^{g}\bigl(b_i\otimes[\beta_i] - a_i\otimes[\alpha_i]\bigr). \tag{3.4}
\]
This follows from $\gamma^{-1}(c) - c = c\cdot\gamma - c = c\otimes\gamma[\ ] - c\otimes[\ ] = c\otimes\partial''[\gamma]$ for any singular chain $c$ and any $\gamma \in \Gamma$. Let us now compute $\partial'L$. There exists $V \in K_{0,2}$ such that $\partial'L = \partial''V$; its explicit expression is given by
\[
V = \sum_{i=1}^{g}\Bigl(a_i(0)\otimes[\alpha_i|\beta_i] - b_i(0)\otimes[\beta_i|\alpha_i] + b_i(0)\otimes[\gamma_i^{-1}|\alpha_i\beta_i]\Bigr) - \sum_{i=1}^{g-1} b_g(0)\otimes[\gamma_g^{-1}\cdots\gamma_{i+1}^{-1}|\gamma_i^{-1}], \tag{3.5}
\]
where $[\alpha_i,\beta_i] = \gamma_i$. Indeed, a straightforward computation, using the relations between generators and vertices, yields
\[
\partial'L = \partial''V - b_g(0)\otimes[\gamma_g^{-1}\cdots\gamma_1^{-1}],
\]

and the second term in the RHS vanishes by virtue of (3.1), since $[1] = 0$. From the relations $\partial'F = \partial''L$ and $\partial'L = \partial''V$ it follows immediately that the element $\Sigma = F + L - V$ of total degree two is a cycle in $\mathrm{Tot}\,K$, that is
\[
\partial(F + L - V) = 0.
\]
Thus we have the following proposition.

Proposition 3.4. The cycle $\Sigma \in (\mathrm{Tot}\,K)_2$ represents the fundamental class of the surface in $H_2(X,\mathbb{Z})$.

Proof. This follows immediately from Lemma 3.2, provided the class $[\Sigma]$ is not zero. It is indeed nonzero, since the cycle $\Sigma$ is a "ladder" starting from the fundamental domain $F$. It follows from the arguments in Appendix A that the latter in fact maps under $S_2 \ni F \mapsto F\otimes 1 \in S_2\otimes_{\mathbb{Z}\Gamma}\mathbb{Z} \cong S_2(X)$ to a representative of the fundamental class.

Remark 3.5. The existence of the elements $L$ and $V$ can be guaranteed a priori by the methods of Appendix A, using the fact that $\Gamma$ has no cohomology except in degree zero.

As it follows from Proposition 3.4, the homology class $[\Sigma]$ is independent of the marking of the Fuchsian group $\Gamma$ and of the choice of the fundamental domain $F$, whereas its representative $\Sigma$ is not. Since this independence is a key issue in defining the action functional for the higher genus case, we will show explicitly that different choices lead to homologous $\Sigma$. Essentially, these choices are the following.
– Within the same marking choose another set of canonical generators $\alpha_i', \beta_i'$ by conjugating $\alpha_i, \beta_i$ with $\gamma \in \Gamma$ so that $F' = \gamma F$ for the corresponding fundamental domains.
– Within the same marking make a different choice of the fundamental domain $F'$ (which is always assumed to be closed in $\mathbb{H}$), not necessarily equal to the canonical $4g$-gon $F$.
– Consider a different marking $\alpha_i', \beta_i'$ and a fundamental domain $F'$ for it.
Clearly, all the previous cases amount to an arbitrary choice of the fundamental domain for $\Gamma$. However, if $F$ and $F'$ are two such choices, then there exist a suitable set of indices $\{\nu\}$, elements $\gamma_\nu \in \Gamma$ and singular two-chains $c_\nu$ such that
\[
F' - F = \sum_\nu\bigl(\gamma_\nu^{-1}(c_\nu) - c_\nu\bigr). \tag{3.6}
\]

It follows, for instance, from the fact that the chain complex for $\mathbb{H}$ is a free $\Gamma$-module [23]. Then we have the following lemma.

Lemma 3.6. If $F$ and $F'$ are two choices of the fundamental domain for $\Gamma$ in $\mathbb{H}$, then $[\Sigma] = [\Sigma']$ for the corresponding classes in $H_\bullet(\mathrm{Tot}\,K_{\bullet,\bullet})$.

Proof. Let $\Sigma = F + L - V$ and $\Sigma' = F' + L' - V'$ be the cycles in $\mathrm{Tot}\,K$ constructed according to the method of 3.3.1. It follows from (3.6) that
\[
F' - F = \partial''\sum_\nu c_\nu\otimes[\gamma_\nu],
\]
and therefore
\[
F' + L' - F - L = \partial\Bigl(\sum_\nu c_\nu\otimes[\gamma_\nu]\Bigr) + \Bigl(L' - L - \sum_\nu \partial'(c_\nu)\otimes[\gamma_\nu]\Bigr).
\]
The second term in this expression is an element of $K_{1,1}$ and its second differential is
\[
\partial''\Bigl(L' - L - \sum_\nu \partial'(c_\nu)\otimes[\gamma_\nu]\Bigr) = \partial'(F' - F) - \sum_\nu\bigl(\gamma_\nu^{-1}(\partial'(c_\nu)) - \partial'(c_\nu)\bigr) = 0.
\]
Since the higher homology of $\Gamma$ with values in $S_\bullet$ is zero (cf. Appendix A), there exists an element $C \in K_{1,2}$ such that
\[
L' - L - \sum_\nu \partial'(c_\nu)\otimes[\gamma_\nu] = \partial''C,
\]
so that
\[
\Sigma' - \Sigma = \partial\Bigl(\sum_\nu c_\nu\otimes[\gamma_\nu] - C\Bigr) - V' + V + \partial'C.
\]
Similarly, $\partial''(V' - V - \partial'C) = 0$, and therefore there exists $K \in K_{0,3}$ such that $V' - V - \partial'C = \partial''K$. Finally,
\[
\Sigma' - \Sigma = \partial\Bigl(\sum_\nu c_\nu\otimes[\gamma_\nu] - C - K\Bigr),
\]
since, obviously, $\partial'K = 0$.
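The sign bookkeeping behind the cycle condition $\partial\Sigma = 0$ of 3.3.1 can be spelled out as a short worked computation. It is only a sketch, using nothing beyond the convention $\partial = \partial' + (-1)^p\partial''$ on $K_{p,q}$ and the relations $\partial''F = 0$, $\partial'F = \partial''L$, $\partial'L = \partial''V$ stated in 3.3.1; the observation that $\partial'V = 0$ because $V$ has singular degree zero is a standard remark added here.

```latex
\begin{align*}
\partial F &= \partial' F + \partial'' F = \partial' F, && F \in K_{2,0},\\
\partial L &= \partial' L - \partial'' L, && L \in K_{1,1},\\
\partial V &= \partial' V + \partial'' V = \partial'' V, && V \in K_{0,2},\\
\partial (F + L - V) &= (\partial' F - \partial'' L) + (\partial' L - \partial'' V) = 0 .
\end{align*}
```

The same pattern, one rung of the "ladder" cancelling against the next, is what the proof of Lemma 3.6 repeats with the elements $C$ and $K$.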

3.3.2. Cohomology computations. Here we pass to the dual computations in cohomology. Let
\[
\omega[f] = \frac{f_{zz}}{f_z}\,\mu_z\, dz\wedge d\bar z
\]
be the density of Polyakov's action functional in the genus zero case, where $\mu = f_{\bar z}/f_z$. Obviously, $\omega[f]$ can be considered as an element in $C^{2,0}$, that is a two-form valued zero cochain on $\Gamma$. Then there exist elements $\theta[f] \in C^{1,1}$ and $\Theta[f] \in C^{0,2}$ such that
\[
\delta\omega[f] = d\theta[f] \quad\text{and}\quad \delta\theta[f] = d\Theta[f],
\]
so that the $f$-dependent cochain $\Omega_f \overset{\mathrm{def}}{=} \omega[f] - \theta[f] - \Theta[f]$ of total degree two is a cocycle in $\mathrm{Tot}\,C$, that is
\[
D(\omega[f] - \theta[f] - \Theta[f]) = 0.
\]
Indeed, $d\,\delta\omega[f] = \delta\, d\omega[f] = 0$ because $\omega[f]$ is a top form on $\mathbb{H}$, and since $\mathbb{H}$ is contractible, it follows that there exists $\theta[f]$ such that $\delta\omega[f] = d\theta[f]$. Similarly, $d\,\delta\theta[f] = \delta\, d\theta[f] = \delta\delta\omega[f] = 0$ and again, since $\mathbb{H}$ is acyclic, there exists $\Theta[f]$ such that $\delta\theta[f] = d\Theta[f]$. Continuing along this way, we get $d\,\delta\Theta[f] = 0$, so that $\delta\Theta[f]$ is a 3-cocycle on $\Gamma$ with constant values. As it follows from Lemma 3.3, $H^3(\Gamma,\mathbb{C}) = \{0\}$, so that, shifting $\Theta[f]$ by a $\mathbb{C}$-valued group cochain, if necessary, one can choose the "integration constants" in the equation $d\Theta[f] = \delta\theta[f]$ in such a way that $\delta\Theta[f] = 0$.

It is quite remarkable that explicit expressions for $\theta[f]$ and $\Theta[f]$ can be obtained by performing a straightforward calculation. Indeed, using $f\circ\gamma = \tilde\gamma\circ f$ we get
\[
\mu\circ\gamma\,\frac{\overline{\gamma'}}{\gamma'} = \mu,
\]
and
\[
\delta\omega_\gamma[f] = \omega[f]\circ\gamma^{-1}\,|(\gamma^{-1})'|^2 - \omega[f] = d\theta_\gamma[f]. \tag{3.7}
\]
A direct computation, using the property that $\{\gamma, z\} = 0$ for all fractional linear transformations, verifies that
\[
\theta_{\gamma^{-1}}[f] = \log(\tilde\gamma'\circ f)\, d\log f_z - \log(f_z\circ\gamma)\, d\log\gamma' - 2\,\frac{\gamma''}{\gamma'}\,\mu\, d\bar z. \tag{3.8}
\]

Proceeding along the same lines one can work out an expression for $\Theta[f]$; in order to get a manageable formula, it is more convenient to write down its differential
\[
d\Theta_{\gamma_2^{-1},\gamma_1^{-1}}[f] = f^*\bigl(\log\tilde\gamma_1'\circ\tilde\gamma_2\; d\log\tilde\gamma_2'\bigr) + \log\gamma_2'\; d\log\gamma_1'\circ\gamma_2 - \frac12\, f^*\bigl(d\log^2\tilde\gamma_2'\bigr) - \frac12\, d\log^2\gamma_2'. \tag{3.9}
\]
It is easy to verify that the right hand side of this expression is indeed a closed one-form on $\mathbb{H}$ and, therefore, is exact.

Remark 3.7. One can obtain a formula for $\Theta[f]$ by integrating (3.9). The resulting expression will involve combinations of logarithms and dilogarithms, resulting from the typical integral
\[
\int \log\gamma'\; d\log\sigma',
\]
where $\gamma$ and $\sigma$ are fractional linear transformations. The customary choice in defining this integral is to put branch-cuts from $-\infty$ to $\gamma^{-1}(\infty)$ and from $\sigma^{-1}(\infty)$ to $\infty$. When these elements belong to the Fuchsian group $\Gamma$, the branch-cuts should go along the real axis $\mathbb{R}$ which is the limit set of $\Gamma$. The same applies to the target group $\tilde\Gamma$ when the mapping $f$ defines a Fuchsian deformation. If the target group $\tilde\Gamma$ is quasi-Fuchsian, the branch-cuts should go along the limit set of $\tilde\Gamma$, the simple Jordan curve that is the image of $\mathbb{R}$ under the mapping $f$. With this normalization, $\Theta_{\gamma_2^{-1},\gamma_1^{-1}}[f]$ is defined up to the "integration constants" $c_{\gamma_2^{-1},\gamma_1^{-1}}$ which are determined from the condition that $\delta\Theta[f] = 0$.

Therefore we proved, in complete analogy with the homological computation, that the cochain $\Omega_f = \omega[f] - \theta[f] - \Theta[f] \in (\mathrm{Tot}\,C)^2$ is in fact a cocycle,
\[
D\Omega_f = 0.
\]
Hence, from Lemma 3.3, we have the following proposition.

Proposition 3.8. The cocycle $\Omega_f \in (\mathrm{Tot}\,C)^2$ represents a cohomology class in $H^2(X,\mathbb{C}) \cong \mathbb{C}$, which depends on the mapping $f$.

Remark 3.9. It might happen that the cohomology class $[\Omega_f] = 0$ for some specific mapping(s) $f$.
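The cocycle condition for the cochain $\omega[f] - \theta[f] - \Theta[f]$ can be checked term by term. The following worked computation is a sketch that assumes only the convention $D = d + (-1)^p\delta$ on $C^{p,q}$ and the relations $\delta\omega = d\theta$, $\delta\theta = d\Theta$, $\delta\Theta = 0$ established in 3.3.2 (the dependence on $f$ is dropped from the notation).

```latex
\begin{align*}
D\omega &= d\omega + \delta\omega = \delta\omega, && \omega \in C^{2,0}\ (\text{top form on } \mathbb{H}),\\
D\theta &= d\theta - \delta\theta, && \theta \in C^{1,1},\\
D\Theta &= d\Theta + \delta\Theta = d\Theta, && \Theta \in C^{0,2},\ \delta\Theta = 0,\\
D(\omega - \theta - \Theta) &= (\delta\omega - d\theta) + (\delta\theta - d\Theta) = 0 .
\end{align*}
```

This is the exact dual of the homological identity for $F + L - V$: the alternating sign $(-1)^p$ is what makes consecutive rungs cancel.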


4. Polyakov’s Action in Higher Genus

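4.1. After the algebraic and topological preparations of Sect. 3, here we finally define the Polyakov action functional and prove Theorems A, B, C. Let $X \simeq \Gamma\backslash\mathbb{H}$ be a Riemann surface of genus $g > 1$ and $f$ be a quasi-conformal mapping such that $\tilde\Gamma = f\circ\Gamma\circ f^{-1}$ is a Fuchsian or quasi-Fuchsian group isomorphic to $\Gamma$ (see the introduction and 3.2.2 for details). Using the pairing between $C^{\bullet,\bullet}$ and $K_{\bullet,\bullet}$, we set
\begin{align*}
2i\,S[f] &= \langle\Omega_f, \Sigma\rangle = \langle\omega[f], F\rangle - \langle\theta[f], L\rangle + \langle\Theta[f], V\rangle\\
&= \int_F \omega[f] - \sum_{i=1}^{g}\int_{b_i}\theta_{\beta_i}[f] + \sum_{i=1}^{g}\int_{a_i}\theta_{\alpha_i}[f]\\
&\quad + \sum_{i=1}^{g}\Bigl(\Theta_{\alpha_i,\beta_i}[f](a_i(0)) - \Theta_{\beta_i,\alpha_i}[f](b_i(0)) + \Theta_{\gamma_i^{-1},\alpha_i\beta_i}[f](b_i(0))\Bigr)\\
&\quad - \sum_{i=1}^{g}\Theta_{\gamma_g^{-1}\cdots\gamma_{i+1}^{-1},\,\gamma_i^{-1}}[f](b_g(0)). \tag{4.1}
\end{align*}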

Proof of Theorem A. It follows at once from the constructions in Sect. 3. First, the value of $S[f]$, for any given $f$, depends only on the classes defined by $\Omega_f$ and $\Sigma$ and not on the explicit cocycles representing them. Indeed, because of the property (3.3) of the pairing $\langle\ ,\ \rangle$, shifting either $\Omega_f$ or $\Sigma$ by (co)boundaries does not alter the value given in (4.1). Furthermore, by virtue of Lemma 3.6 and the above invariance, the action $S[f]$ does not depend on either the choice of the marking of $\Gamma$, or on the choice of the fundamental domain $F$. Finally, it follows from Propositions 3.4 and 3.8, which identify the (total) homology of the complexes $K_{\bullet,\bullet}$ and $C^{\bullet,\bullet}$ with that of the surface $X$, that the action $S[f]$ comes from the pairing
\[
H^2(X,\mathbb{C})\times H_2(X,\mathbb{Z}) \longrightarrow \mathbb{C}.
\]

Remark 4.1. Since the action results from a pairing in homology, we write it as
\[
S[f] = \frac{1}{2i}\,\langle[\Omega_f],[\Sigma]\rangle, \tag{4.2}
\]
stressing its dependence on the (co)homology classes only.

4.2. Here we discuss the variational properties of the action functional (4.1) and prove Theorem B. As it was mentioned in the introduction, there are two versions of the variational problem for $S[f]$. In the first one, the free-end variation, we consider $\mu$ to be the independent variable, so that the target Fuchsian (or quasi-Fuchsian) group $\tilde\Gamma$ is determined by $\mu$ through the solution of the Beltrami equation. In the second case, the fixed-end variation, we fix the target Fuchsian (or quasi-Fuchsian) group $\tilde\Gamma$, together with the isomorphism $\Gamma \to \tilde\Gamma$, and consider the set $QC(\Gamma,\tilde\Gamma)$ of all smooth quasi-conformal mappings $f$ that intertwine $\Gamma$ and $\tilde\Gamma$.

In the first case, since the set of Beltrami coefficients for $\Gamma$ is the interior of a ball of radius 1 (with respect to the $\|\cdot\|_\infty$ norm) in the linear space $B(\Gamma)$ of all Beltrami differentials for $\Gamma$, the variation $\delta\mu$ belongs to $B(\Gamma)$. In the second case, since the target Fuchsian (or quasi-Fuchsian) group $\tilde\Gamma$ is fixed, it follows from the equivariance property (1.7) that $\delta f/f_z$ is a $(-1,0)$-tensor for $\Gamma$, that is
\[
\frac{\delta f}{f_z}\circ\gamma = \gamma'\,\frac{\delta f}{f_z} \quad\text{for all } \gamma \in \Gamma.
\]
One can express $\delta f/f_z$ in terms of a vector field on $X$ as follows. Let $G_0$ be the group of all orientation preserving diffeomorphisms of $\mathbb{H}$ fixing $\Gamma$ and homotopic to the identity. Any path $g^t$ in $G_0$ connected to the identity defines a path $f^t = f\circ g^t$ in $QC(\Gamma,\tilde\Gamma)$ connected to $f \in QC(\Gamma,\tilde\Gamma)$, a deformation of the mapping $f$. Setting
\[
\delta f = \frac{d}{dt}\Big|_{t=0} f^t
\]
and defining $v = v^z\partial_z + v^{\bar z}\partial_{\bar z}$ as the vector field generating the flow $t \mapsto g^t$, we get
\[
\frac{\delta f}{f_z} = v^z + \mu\, v^{\bar z},
\]
where $\mu = f_{\bar z}/f_z$ is the Beltrami coefficient for $\Gamma$ corresponding to $f$.

Note that in the first case the corresponding variation $\delta f/f_z$ is not necessarily a $(-1,0)$-tensor for $\Gamma$, since the target group $\tilde\Gamma$ "floats" under a generic variation of $\mu$ (variation with free end). Specifically,
\[
\frac{\delta f}{f_z}\circ\gamma\,\frac{1}{\gamma'} = \frac{\delta f}{f_z} + \frac{1}{f_z}\,\frac{\delta\tilde\gamma}{\tilde\gamma'}\circ f, \tag{4.3}
\]
for all $\gamma \in \Gamma$. Objects on $\mathbb{H}$ with such a transformation property are pull-backs under the map $f$ of non-holomorphic Eichler integrals of order $-1$ for the group $\tilde\Gamma$. By definition [21], the space $\mathcal{E}^{-1}_{\tilde\Gamma}$ of these Eichler integrals consists of smooth functions $E$ on $\mathbb{H}$ such that
\[
E\circ\tilde\gamma\,\frac{1}{\tilde\gamma'} = E + p_{\tilde\gamma}, \tag{4.4}
\]
for all $\tilde\gamma \in \tilde\Gamma$, where $p_{\tilde\gamma}$ is a 1-cocycle of $\tilde\Gamma$ with coefficients in the linear space of polynomials $P$ of order $\le 2$ with the action $P \mapsto ((\tilde\gamma^{-1})')^2\, P\circ\tilde\gamma^{-1}$. Clearly the pull-back $(E\circ f)/f_z$ of the Eichler integral $E$ has the transformation property (4.3).

In both cases the variations of $f$ and $\mu$ are related by the same equation
\[
\mathcal{M}\,\frac{\delta f}{f_z} = \delta\mu,
\]
where $\mathcal{M} = \bar\partial - \mu\,\partial + \mu_z$ is the differential operator introduced in Sect. 2. It has the remarkable property of mapping $(-1,0)$-tensors for $\Gamma$, and even objects of more complicated type such as pull-backs of Eichler integrals, into $(-1,1)$-tensors for $\Gamma$. There are other differential operators with similar properties, collected in the following lemma.


Lemma 4.2.
(i) The operators $\mathcal{T} = \partial^3 + 2T\partial + T_z$ and $\mathcal{M} = \bar\partial - \mu\partial + \mu_z$, where $T$ is a quadratic differential for $\Gamma$ and $\mu$ is a Beltrami differential for $\Gamma$, map $(-1,0)$-tensors for $\Gamma$ into quadratic and Beltrami differentials for $\Gamma$, respectively.
(ii) The operators $\mathcal{T}$ and $\mathcal{M}$ from part (i) map pull-backs by the mapping $f$ of Eichler integrals of order $-1$ for $\tilde\Gamma$ into quadratic and Beltrami differentials for $\Gamma$, respectively.
(iii) If $f$ is a mapping of $\mathbb{H}$ intertwining $\Gamma$ and $\tilde\Gamma$, then $T = \{f, z\}$ is a quadratic differential for $\Gamma$.

Proof. Part (i) is well-known (see, e.g. [17]) and the statements can be easily verified. In particular, setting $T = 0$ we get that $\mu_{zzz}$ is a $(2,1)$-tensor for $\Gamma$, which is also a known result (see, e.g. [21]). In order to prove part (ii), note that for a holomorphic function $p$ on $\mathbb{H}$ we have
\[
\mathcal{T}\,\frac{p\circ f}{f_z} = f_z^2\,(\partial^3 p)\circ f,
\]
which shows that the additional terms in the transformation law (4.3) belong to the kernel of $\mathcal{T}$. Similarly, (2.11) shows that these terms belong to the kernel of $\mathcal{M}$ as well. Part (iii) is another classical result, which can be easily verified as well.

4.2.1. Proof of Theorem B. For concreteness, we first consider variations with respect to $\mu$, though, as we shall see, the actual argument works for both kinds of variations. The proof requires climbing the "ladder" in the double complex $C^{\bullet,\bullet}$, together with the computation of the variation of $\omega[f]$. In this proof we write $\boldsymbol{\delta}$ for the variation, to distinguish it from the group coboundary $\delta$. Since $\omega[f]$ is a local functional of $f$, we can just use the computation already done in genus zero so that, according to formula (2.5),
\[
\boldsymbol{\delta}\omega = a - d\eta, \tag{4.5}
\]
where $a = -2\,T\,\boldsymbol{\delta}\mu\; dz\wedge d\bar z$ and the explicit expression for the 1-form $\eta$ is not needed. (In order to simplify notations, we temporarily drop the dependence on $f$ from the notation.) As it follows from Lemma 4.2, the 2-form $a$ on $\mathbb{H}$ is a $(1,1)$-tensor for $\Gamma$, therefore it is closed with respect to the total differential, i.e. $Da = 0$. Next observe that $D\boldsymbol{\delta}\Omega = \boldsymbol{\delta}D\Omega = 0$, therefore $D(\boldsymbol{\delta}\Omega - a) = 0$. We want to show that $\boldsymbol{\delta}\Omega - a$ is in fact $D$-exact up to a term whose contribution vanishes after pairing with $\Sigma$. To this end, let us write
\[
\boldsymbol{\delta}\Theta = \delta\chi,
\]
where $\chi$ has degree $(0,1)$ in the total complex. This is possible, since, as it is shown in the appendix, the higher cohomology of $\Gamma$ with coefficients in the de Rham complex vanishes. The equation $D\boldsymbol{\delta}\Omega = 0$ gives us the two relations
\[
d\,\boldsymbol{\delta}\Theta = \delta\,\boldsymbol{\delta}\theta, \qquad d\,\boldsymbol{\delta}\theta = \delta\,\boldsymbol{\delta}\omega, \tag{4.6}
\]
of which the first one implies that
\[
\boldsymbol{\delta}\theta = d\chi + \delta\lambda,
\]
where, again, the vanishing of $H^q(\Gamma, A^p)$ for $q > 0$ has been used. Plugging this relation into the second one in (4.6) yields
\[
\delta\,\boldsymbol{\delta}\omega = \delta\, d\lambda.
\]
Notice that this time we can at most conclude that $\boldsymbol{\delta}\omega - d\lambda$ is a $\Gamma$-invariant form, since $H^0(\Gamma, A^p)$ precisely gives the invariant $p$-forms (cf. the appendix). We write this invariant form as $a + b$, for some $(2,0)$ invariant element $b$, so that
\[
\boldsymbol{\delta}\omega = d\lambda + a + b
\]
and, using (4.5),
\[
b = -d(\eta + \lambda),
\]
i.e. $b$ is $\Gamma$-invariant and exact. Putting all together, we obtain
\[
\boldsymbol{\delta}\Omega = \boldsymbol{\delta}\omega - \boldsymbol{\delta}\theta - \boldsymbol{\delta}\Theta = a - d\eta - d\chi - \delta\lambda - \delta\chi = a + b + D(\lambda - \chi),
\]
which, after evaluation against $\Sigma$, reduces to
\[
\langle\boldsymbol{\delta}\Omega, \Sigma\rangle = \int_F a,
\]

as wanted (the integral of $b$ over $F$ is obviously zero). In order to complete the proof, notice that the variation of $\omega[f]$ always has the form (4.5), independently of whether the variable $\mu$ or $f$ is varied. In the latter case, the variation $\delta f/f_z$ is a $(-1,0)$-tensor for $\Gamma$, so that we can use (4.5) and the relation $\delta\mu = \mathcal{M}(\delta f/f_z)$ together with Lemma 2.3.

Remark 4.3. Note that the argument presented in the proof of Theorem B is quite general. It applies to any functional defined by an evaluation of a cocycle in $(\mathrm{Tot}\,C)^2$ over a cycle $\Sigma$, provided that the cocycle is the extension of a 2-form on $\mathbb{H}$ with the property that its variation is a sum of $D$- and $d$-exact terms.

4.2.2. As it was mentioned in the introduction, it follows from Theorem B that $c\,S[f]/24\pi$, considered as a functional of $\mu = f_{\bar z}/f_z$, solves Eq. (1.2), no matter what kind of deformation we are considering, be it Fuchsian or quasi-Fuchsian. Thus there are at least two possible solutions of (1.2) on a Riemann surface of genus higher than one. In order to clearly distinguish the two cases, let us adopt for a moment the customary notation in the theory of quasi-conformal mappings [1], so that $f^\mu$ and $\Gamma^\mu$ (respectively $f_\mu$ and $\Gamma_\mu$) stand for the Fuchsian (respectively, quasi-Fuchsian) deformation of $\Gamma$. There is a simple relationship between the variations of $S[f_\mu]$ and $S[f^\mu]$. First of all, observe that the mapping $g := f_\mu\circ(f^\mu)^{-1} : \mathbb{H} \to f_\mu(\mathbb{H})$ is conformal (note that $f^\mu(\mathbb{H}) = \mathbb{H}$). Indeed, it follows from the Beltrami equation that
\[
\frac{\partial g}{\partial\bar\zeta} = \frac{\partial f_\mu}{\partial z}\Bigl(\frac{\partial (f^\mu)^{-1}}{\partial\bar\zeta} + \mu\,\frac{\partial\,\overline{(f^\mu)^{-1}}}{\partial\bar\zeta}\Bigr) = 0,
\]
where $\zeta = f^\mu(z,\bar z)$ is the new complex coordinate on $\mathbb{H}$. Moreover, the map $g$ intertwines $\Gamma^\mu$ and $\Gamma_\mu$, thus it descends to a biholomorphic map
\[
g : X^\mu = \Gamma^\mu\backslash\mathbb{H} \longrightarrow \Gamma_\mu\backslash f_\mu(\mathbb{H}) = X_\mu,
\]
showing that the Riemann surfaces $X^\mu$ and $X_\mu$ are conformally equivalent. Furthermore, we have
\[
T_\mu(z) = \{f_\mu, z\} = \{g, \zeta\}\circ f^\mu\,(f^\mu_z)^2 + T^\mu(z),
\]
where $T^\mu(z) = \{f^\mu, z\}$. Thus the difference
\[
Q = \frac{\delta S[f_\mu]}{\delta\mu} - \frac{\delta S[f^\mu]}{\delta\mu}
\]
is just the pull-back under $f^\mu$ of the holomorphic quadratic differential obtained by taking the Schwarzian derivative of $g$ with respect to the new complex coordinate $\zeta$. Of course, the situation is completely symmetric under the exchange of $f_\mu$ and $f^\mu$. One can reach the same conclusion proceeding along a different line (cf. [32]). Namely, since both $S[f^\mu]$ and $S[f_\mu]$ satisfy (1.2), $Q$ satisfies the equation
\[
(\bar\partial - \mu\,\partial - 2\mu_z)\,Q = 0
\]
which, using the Cauchy-Riemann operator
\[
\frac{\partial}{\partial\bar\zeta} = \frac{\partial\bar z}{\partial\bar\zeta}\Bigl(\frac{\partial}{\partial\bar z} - \mu\,\frac{\partial}{\partial z}\Bigr),
\]
can be written as
\[
\partial_{\bar\zeta}\Bigl(\frac{Q}{f_z^2}\Bigr) = 0,
\]
showing that $Q$ is indeed the pull-back of a holomorphic quadratic differential with respect to the complex coordinate $\zeta$.

Remark 4.4. The above argument actually shows that homogeneous solutions to the equation (1.2) on $X$ are pull-backs under the mapping $f^\mu$ (or $f_\mu$) of the holomorphic quadratic differentials on the "target" Riemann surface $X^\mu$. According to the Riemann-Roch theorem, this space is $(3g-3)$-dimensional; therefore, the universal CWI (1.2) does not completely determine the generating functional for the stress-energy tensor in the higher genus case. As we mentioned in the introduction, additional information should be provided by the particular CFT.

4.2.3. According to Theorem B, the variation of the action with respect to the map $f$ yields the classical equation of motion
\[
\mu_{zzz} = 0. \tag{4.7}
\]

Here we compute the dimension of the space of solutions of (4.7). It was observed in the introduction that determining the critical set of $S[f]$ in $QC(\Gamma,\tilde\Gamma)$ out of (4.7) seems to be a very difficult problem. However, the space of solutions to (4.7) is quite interesting since, as we show below, it contains the subspace of harmonic Beltrami differentials. First, recall the definition of the so-called Maass operators (see, e.g. [13]). For $k, l \in \mathbb{Z}$, denote by $A^{k,l}_\Gamma \equiv A^{k,l}_{\mathbb{C}}(\mathbb{H})^\Gamma \cong A^{k,l}_{\mathbb{C}}(X)$ the space of $\Gamma$-invariant $(k,l)$-forms on $\mathbb{H}$; by convention, $(dz)^k$, for $k$ negative, means $(\partial/\partial z)^{-k}$. Define $D_{k,l} : A^{k,l} \to A^{k+1,l}$ by
\[
D_{k,l} = y^{-2k}\circ\partial\circ y^{2k},
\]
where $\partial = \partial/\partial z$. It is easy to verify that
\[
\partial_z^3 = D_{1,1}\circ D_{0,1}\circ D_{-1,1}, \tag{4.8}
\]
which once again shows that the operator $\partial_z^3$ maps Beltrami differentials into the $(2,1)$-tensors for $\Gamma$. Furthermore, a Beltrami differential $\nu \in A^{-1,1}_\Gamma$ is called Bers harmonic if it is harmonic with respect to the $\bar\partial$-Laplacian of the Poincaré metric on $\Gamma\backslash\mathbb{H}$, acting on $(-1,1)$-forms. It can be shown that
\[
\nu = y^2\,\bar q,
\]
where $q \in A^{2,0}_\Gamma$ is a holomorphic quadratic differential. It follows from the Riemann-Roch theorem that Bers harmonic Beltrami differentials form a $(3g-3)$-dimensional complex vector space and play an important role in Teichmüller theory [1, 16].

Proposition 4.5. The space of solutions of Eq. (4.7) has complex dimension $4g-3$:
\[
\dim_{\mathbb{C}} \mathrm{Ker}_{A^{-1,1}_\Gamma}(\partial_z^3) = 4g-3,
\]
and contains the $(3g-3)$-dimensional vector space of Bers harmonic Beltrami differentials.

Proof. Using (4.8), we start by observing that the kernel of $D_{-1,1}$ coincides with the space of harmonic Beltrami differentials. Indeed, $\nu \in \mathrm{Ker}(D_{-1,1})$ if and only if $\partial(y^{-2}\nu) = 0$, which implies $\nu = y^2\bar q$, for $q$ a holomorphic quadratic differential, since $y^{-2}\nu$ is a $(0,2)$-form. Furthermore, $\mathrm{Ker}(D_{1,1})\cap\mathrm{Im}(D_{0,1}) = \{0\}$. Indeed, an element in $\mathrm{Ker}(D_{1,1})$ is necessarily a multiple of the $(1,1)$-form $y^{-2}$. If it is non-zero, then it cannot belong to $\mathrm{Im}(D_{0,1}) = \mathrm{Im}\,\partial$, since $y^{-2}$ represents a nontrivial cohomology class on $\Gamma\backslash\mathbb{H}$. Next, it is clear that $\mathrm{Ker}(D_{0,1})$ is complex anti-isomorphic to the linear space of Abelian differentials for $X$. Finally, the map $D_{-1,1}$ is onto: its image is the entire space of $(0,1)$-differentials. Namely, the operator adjoint to $D_{-1,1}$ with respect to the Hermitian scalar product on $A^{k,l}_\Gamma$ induced by the Poincaré metric $y^{-2}$ is $D^*_{-1,1} = -\bar\partial\circ y^2$, which has zero kernel since $g > 1$. Thus any element in $\mathrm{Ker}(D_{0,1})$ is the $D_{-1,1}$-image of an element in $A^{-1,1}_\Gamma$, orthogonal to the subspace of harmonic Beltrami differentials, and it also belongs to the kernel of $\partial_z^3$. Counting $4g-3 = (3g-3) + g$ proves the claim.

Remark 4.6. As in the genus zero case, the equation of motion (4.7) is equivalent to the holomorphicity property of $T = \{f, z\}$ with respect to the new complex structure induced by $f$. Namely, when $\mu$ satisfies (4.7), the corresponding (1.2) becomes homogeneous so that, according to 4.2.2, we have
\[
\partial_{\bar\zeta}\Bigl(\frac{T}{(\partial_z\zeta)^2}\Bigr) = 0. \tag{4.9}
\]
This condition is well defined for the stress-energy tensor in the new coordinates $\zeta, \bar\zeta$ on the surface $X$ as well as on the deformed Riemann surface $\tilde\Gamma\backslash f(\mathbb{H})$.

4.2.4. Here we briefly comment on the computation of the second variation. It follows from Lemma 4.2 that the differential operators used in the genus zero computation are tensorial; therefore, using Theorem B and the fact that the problem is local, we can just repeat the computations in 2.3 in order to get the following proposition.

Proposition 4.7. The Hessian of the Polyakov action (4.1) is given by the genus zero formula
\[
\delta^2 S[f](\delta_1 f, \delta_2 f) = -2\int_F \frac{\delta_1 f}{f_z}\;\partial^3\circ\mathcal{M}\Bigl(\frac{\delta_2 f}{f_z}\Bigr)\, d^2z.
\]
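The factorization (4.8) of $\partial_z^3$ into Maass operators, used in the proof of Proposition 4.5, can be verified by a direct computation. The following is a sketch under the assumption $y = \operatorname{Im} z = (z - \bar z)/2i$, so that $y_z = \partial_z y = 1/2i$ is constant; the auxiliary function $u = y^{-2}\nu$ is notation introduced here for convenience.

```latex
\begin{align*}
D_{-1,1}\,\nu &= y^{2}\,\partial u, \qquad u = y^{-2}\nu,\\
D_{0,1}D_{-1,1}\,\nu &= \partial\bigl(y^{2}\partial u\bigr) = 2y\,y_z\,\partial u + y^{2}\,\partial^{2}u,\\
D_{1,1}D_{0,1}D_{-1,1}\,\nu &= y^{-2}\,\partial\bigl(2y^{3}y_z\,\partial u + y^{4}\,\partial^{2}u\bigr)
   = 6y_z^{2}\,\partial u + 6y\,y_z\,\partial^{2}u + y^{2}\,\partial^{3}u,\\
\partial^{3}\nu = \partial^{3}\bigl(y^{2}u\bigr) &= 6y_z^{2}\,\partial u + 6y\,y_z\,\partial^{2}u + y^{2}\,\partial^{3}u .
\end{align*}
```

The last line is the Leibniz rule with $(y^2)' = 2y\,y_z$, $(y^2)'' = 2y_z^2$ and $(y^2)''' = 0$, so the two expressions agree term by term.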

4.3. We now analyze how $S[f]$ relates to the functional $W[\mu]$ defined by (1.10), and prove Theorem C. For $t \in [0,1]$, let $\mu^t$ be a homotopy in the space of Beltrami differentials connecting 0 to $\mu$, and let $f^t$ be the solution of the Beltrami equation corresponding to $\mu^t$. For the sake of convenience, let us rewrite (1.10) here:
\[
W[\mu] = \frac{c}{12\pi}\int_0^1\!\!\int_F T^t\,\dot\mu(t)\; d^2z\, dt. \tag{4.10}
\]
The integration in (4.10) is extended to $F$, but, according to Lemma 4.2, the integrand is a $(1,1)$-tensor for $\Gamma$, hence the integral descends to $X$.

Proof of Theorem C. We want to proceed in a fashion similar to the proof of Theorem B. Our construction of $S[f]$ applied to $f^t$ produces $\omega^t$, $\Omega^t$ and $S[f^t]$ for any $t \in [0,1]$. We can make use of formula (2.5) applied to $\delta = d/dt$:
\[
\dot\omega^t = -2\,T^t\,\dot\mu^t\; dz\wedge d\bar z - d\eta(f^t; \dot f^t) \equiv a^t - d\eta^t,
\]
where, as before, $Da^t = 0$. On the other hand, $D\dot\Omega^t = 0$, since $D\Omega^t = 0$ for any $t$, and therefore the same arguments as in the proof of Theorem B lead us to conclude that
\[
\langle\dot\Omega^t, \Sigma\rangle = \int_F a^t.
\]

Integrating in $t$ from 0 to 1 we get that $W[\mu] = (c/24\pi)\,S[f]$, which together with Theorem C proves part (i). The first statement of part (ii) follows from the well-known fact [1] that the quasi-Fuchsian deformation $f = f_\mu$ depends holomorphically on $\mu$. Finally, if $f = f^\mu$ is a Fuchsian deformation with harmonic Beltrami differential $\mu = y^2\bar q$, then the Ahlfors lemma (see, e.g., [33]) states
\[
\frac{\partial T^{\varepsilon\mu}}{\partial\bar\varepsilon}\Big|_{\varepsilon=0} = -\frac12\, q,
\]
where $T^{\varepsilon\mu} = \{f^{\varepsilon\mu}, z\}$. Therefore, choosing a linear homotopy $\mu(t) = t\mu$, we have the following simple computation:
\begin{align*}
\frac{\partial^2 W[\varepsilon\mu]}{\partial\varepsilon\,\partial\bar\varepsilon}\Big|_{\varepsilon=0}
&= \frac{c}{12\pi}\int_0^1\!\!\int_F \frac{\partial T^{t\varepsilon\mu}}{\partial\bar\varepsilon}\Big|_{\varepsilon=0}\,\mu\; d^2z\, dt\\
&= -\frac{c}{24\pi}\int_0^1 t\, dt\int_F q\,\mu\; d^2z\\
&= -\frac{c}{48\pi}\int_F |\mu|^2\, y^{-2}\, d^2z.
\end{align*}

Remark 4.8. Theorem C specifies the $\mu$-dependence for two natural solutions for $W[\mu]$, defined by quasi-Fuchsian and Fuchsian deformations. In the former case the corresponding functional is holomorphic in $\mu$, as a generating functional should be, while in the latter case it is not. Introducing the Weil-Petersson inner product in the space of Bers harmonic Beltrami differentials by
\[
\langle\mu_1, \mu_2\rangle_{WP} = \int_F \mu_1\,\bar\mu_2\; y^{-2}\, d^2z,
\]

the latter statement takes a quantitative form
\[
\frac{\partial^2 W[\varepsilon\mu]}{\partial\varepsilon\,\partial\bar\varepsilon}\Big|_{\varepsilon=0} = -\frac{c}{48\pi}\,\|\mu\|^2_{WP},
\]
that once again characterizes the Weil-Petersson metric as a "holomorphic anomaly". Finally, for an arbitrary Beltrami differential one should replace $\mu$ by $P\mu$ in the above formula, where $P$ stands for the orthogonal projection (with respect to the Weil-Petersson metric) onto the space of harmonic Beltrami differentials.

4.4. Here we compute the Hessian of the action functional $W$ as a functional of $\mu$. To this end we need to extend the linear mapping $\mathcal{M} : A^{-1,0}_\Gamma \to A^{-1,1}_\Gamma$ to the space of pull-backs by the mapping $f$ of Eichler integrals of order $-1$ for $\tilde\Gamma$. This mapping has no kernel on the subspace of normalized Eichler integrals (i.e. vanishing at 0, 1, $\infty$) and, according to Bers, it is onto (see [21]). We denote, slightly abusing notation, the inverse of the thus extended mapping $\mathcal{M}$ by $\mathcal{M}^{-1}$.

Proposition 4.9. The second variation of the functional $W[\mu]$ is given by
\[
\delta^2 W[\mu](\delta_1\mu, \delta_2\mu) = \frac{c}{12\pi}\int_F \delta_1\mu\;\mathcal{T}\circ\mathcal{M}^{-1}(\delta_2\mu)\; d^2z,
\]
where, according to Lemma 4.2, the operator $\mathcal{T}\circ\mathcal{M}^{-1}$ maps Beltrami differentials for $\Gamma$ into quadratic differentials. The Hessian of $W[\mu]$ at the point $\mu$ is given by the operator $\partial^3\circ\mathcal{M}^{-1}$.

Proof. It is the same as the genus zero computation, using Lemma 4.2. Note that at the critical point $T(z) = 0$, so that $\mathcal{T} = \partial^3$.

5. Fiber Spaces over Teichmüller Space. Discussion and Conclusions

In the preceding sections we have defined Polyakov's action for the chiral sector in the induced gravity on a Riemann surface $X$ of genus $g > 1$ and explored some of its properties. We have also pointed out the possible interpretation of $W[\mu] = (c/24\pi)\,S[f]$ as the universal part of the generating functional for the correlation functions of the stress-energy tensor for a CFT on $X$. However, the most compelling interest in $W[\mu]$ (or $S[f]$) stems from its relation with the geometry of the various fiber spaces over Teichmüller space. We want to elaborate more on this point.

5.1. Recall that the Teichmüller space $T(X)$ of the Riemann surface $X$ of genus $g > 1$ is naturally realized as the quotient of the open unit ball $B(X)$ (with respect to the $L^\infty$ norm) in the Banach space of Beltrami differentials on $X = \Gamma\backslash\mathbb{H}$ by the group of quasi-conformal self-mappings of $\mathbb{H}$ pointwise fixing the group $\Gamma$. If one replaces $B(X)$ by its subset $P(X)$ consisting of smooth Beltrami differentials and considers the identity component $G_0(X)$ of the group $G(X)$ of orientation preserving diffeomorphisms of $X$ (elements in $G_0(X)$ pointwise fix $\Gamma$ while acting on $\mathbb{H}$), then one gets the Earle and Eells [11] fiber space $\pi : P(X) \to T(X)$ over the Teichmüller space. It is a smooth (in the Fréchet topology) principal $G_0(X)$-bundle over $T(X)$. The group action on $P(X)$ can be written as $\mu = \mu(f) \mapsto \mu^g = \mu(f\circ g)$, for $g \in G_0(X)$ [11], where $f = f^\mu$ is a Fuchsian deformation associated with $\mu$. Explicitly, the above action is [1]:
\[
\mu^g = \frac{\overline{g_z}}{g_z}\,\Bigl(\frac{\mu - \mu(g^{-1})}{1 - \mu\,\overline{\mu(g^{-1})}}\Bigr)\circ g.
\]
Consider now the tangent bundle exact sequence
\[
0 \longrightarrow T_V P(X) \xrightarrow{\ i\ } T P(X) \xrightarrow{\ d\pi\ } \pi^*(T\,T(X)) \longrightarrow 0
\]
determined by the Earle-Eells fibration. (Observe that since $P(X)$ is a ball in the vector space $A^{-1,1}_\Gamma$ of all smooth Beltrami differentials, the tangent space to it at any given point $\mu$ is canonically identified with $A^{-1,1}_\Gamma$.) According to the description of the fixed-end variation given in 4.2, the deformation $f^t = f\circ g^t$, for $t \mapsto g^t \in G_0(X)$, results in a vertical curve $t \mapsto \mu^t$ above the point $\pi(\mu) \in T(X)$. Thus the corresponding variation $\delta\mu = \dot\mu$ lies in the vertical tangent space $T_V P(X)$ at the point $\mu$, which is isomorphic to $\mathrm{Im}(\mathcal{M})$, where $\mathcal{M} = \bar\partial - \mu\partial + \mu_z : A^{-1,0}_\Gamma \to A^{-1,1}_\Gamma$. Next, the tangent space $T_\mu P(X)$ can also be identified with the space of smooth $\tilde\Gamma$-Beltrami differentials; an easy computation proves the following (well-known) lemma.

Lemma 5.1. For any $\nu \in A^{-1,1}_\Gamma$ the correspondence
\[
\nu \mapsto \Bigl(\frac{\nu}{1 - |\mu|^2}\,\frac{f_z}{\overline{f_z}}\Bigr)\circ f^{-1}
\]
maps $A^{-1,1}_\Gamma$ isomorphically onto $A^{-1,1}_{\tilde\Gamma}$. Under this map $\mathcal{M}$ becomes $\bar\partial_{\pi(\mu)}$, the $\bar\partial$-operator relative to the new complex structure on the Riemann surface $X$ defined by $\mu$.

This implies at once that the kernel of $\mathcal{M}$ is trivial, and therefore the correspondence
\[
v = v^z\partial_z + v^{\bar z}\partial_{\bar z} \mapsto \mathcal{M}(v^z + \mu\, v^{\bar z})
\]
explicitly gives the injection in the tangent bundle sequence above. Furthermore, it realizes $T_V P(X)$ (and its quotient by $G_0(X)$) as a bundle of Lie algebras, as usual in a principal fibration [4]. Here the Lie algebra in question is the Lie algebra $\mathrm{Vect}(X)$ of smooth vector fields on $X$, which can be identified, as a real vector space, with $A^{-1,0}_\Gamma$. With these definitions at hand, the following reinterpretation of the formulas in the statement of Theorem B becomes obvious.

Proposition 5.2. For any smooth functional $F : P(X) \to \mathbb{C}$,
1. the open-end variation $\delta F$ computes its total differential on $P(X)$;
2. the fixed-end variation computes its vertical differential.
In particular, for the action functional $W$,
\[
dW|_\mu = \frac{c}{12\pi}\,T \in T^*_\mu P(X).
\]


Remark 5.3. The second point in the proposition can be verified by the following explicit computation, which uses Theorems B, C and Lemma 2.3:

$$\delta W = -\frac{c}{12\pi}\int_F \mu_{zzz}\,\frac{\delta f}{f_z}\,d^2z = -\frac{c}{12\pi}\int_F DT(z)\,\frac{\delta f}{f_z}\,d^2z = \frac{c}{12\pi}\int_F T(z)\,M\!\left(\frac{\delta f}{f_z}\right)d^2z .$$

Remark 5.4. The description of the vertical bundle as the image of $M$ immediately implies that

$$T_{\pi(\mu)} T(X) \cong \mathcal{A}_\Gamma^{-1,1} / \mathrm{Im}(M),$$

so that we get the well-known result [11]

$$T_{\pi(\mu)} T(X) \cong H^{0,1}_{\bar\partial}(X^\mu, T_{X^\mu}) \cong H^1(X^\mu, \Theta_{X^\mu}),$$

where the last group gives the Kodaira-Spencer infinitesimal deformations. ($\Theta_{X^\mu}$ is the holomorphic tangent sheaf to the Riemann surface $X^\mu$.)

5.2. It is fundamental to investigate how the function $W : P(X) \to \mathbb{C}$ relates to the geometry of the bundle $\pi : P(X) \to T(X)$. A long but straightforward computation using the definition (1.10) of $W$ proves

Lemma 5.5. There exists $A : P(X) \times G_0(X) \to \mathbb{C}$ such that

$$W[\mu^g] = W[\mu] + A[\mu, g]. \tag{5.1}$$

The functional $A$ depends only on the point $(\mu, g)$ and is local in $\mu$ and $\mu^g$; in particular, it is independent of any possible choice of the solution of the Beltrami equation involved in the definition of $W$. It trivially follows from (5.1) that the functional $A$ satisfies the cocycle identity:

$$A[\mu, gh] = A[\mu^g, h] + A[\mu, g].$$

Next, according to [30], the functional $\Psi[\mu] = \exp(-W[\mu])$ is to be interpreted as a conformal block for a CFT defined on $X$. Thus it is more convenient to work with the exponential version of (5.1). Namely, defining $C[\mu, g] = \exp(-A[\mu, g])$, we get

$$\Psi[\mu^g] = C[\mu, g]\,\Psi[\mu]. \tag{5.2}$$
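The cocycle identity for $A$ can be checked in one line from (5.1). The sketch below assumes only that the action composes as $(\mu^g)^h = \mu^{gh}$, which is what one expects from $\mu^g = \mu(f \circ g)$:

```latex
\begin{align*}
W[\mu^{gh}] &= W[(\mu^{g})^{h}] = W[\mu^{g}] + A[\mu^{g},h] \\
            &= W[\mu] + A[\mu,g] + A[\mu^{g},h] .
\end{align*}
% comparing with W[\mu^{gh}] = W[\mu] + A[\mu,gh] gives
\[
A[\mu,gh] = A[\mu^{g},h] + A[\mu,g] .
\]
```

Exponentiating with $C = \exp(-A)$ turns this sum into a product of the corresponding factors.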

The cocycle condition takes the form

$$C[\mu, gh] = C[\mu^g, h]\,C[\mu, g],$$

which defines a 1-cocycle on $G_0(X)$ with values in the group of non-vanishing complex-valued functions on $P(X)$. We denote by $[C]$ the class of $C$ in the cohomology group $H^1(G_0(X), \mathbb{C}^*(P(X)))$.

Proposition 5.6. There is an injective map of the group $H^1(G_0(X), \mathbb{C}^*(P(X)))$ into the group of isomorphism classes of line bundles over $T(X)$. The line bundle $L_{[C]}$ over $T(X)$ defined by $[C]$ is, in particular, holomorphic.


Proof. The existence of a map $0 \to H^1(G_0(X), \mathbb{C}^*(P(X))) \to H^2(T(X), \mathbb{Z})$ is an application of the well-known concept of G-vector bundle as presented in [5, 28]. We define an action of $G_0(X)$ on the trivial line bundle $\tilde L = P(X) \times \mathbb{C}$ by

$$(\mu, z) \mapsto (\mu^g, C[\mu, g]\,z). \tag{5.3}$$

The action is free since it is so on the first factor, hence $L = \tilde L / G_0(X)$ is a line bundle over $T(X)$. As is easily checked, cohomologous cocycles yield isomorphic bundles, and so $L_{[C]}$ is trivial if and only if $[C]$ is trivial. Next, observe that $C[\mu, g]$ can be defined using the quasi-Fuchsian prescription, which, according to Theorem C, yields a holomorphic $W$. Moreover, $\mu^g$ is holomorphic in $\mu$, as follows from the explicit expression. Thus $C[\,\cdot\,, g]$ is holomorphic and so is the action (5.3).

Remark 5.7. The construction of the line bundle $L$ is well known from works on anomalies [3, 10, 12]. An explicit construction of the map $H^1(G_0(X), \mathbb{C}^*(P(X))) \to H^2(T(X), \mathbb{Z})$ using Čech cohomology appears in [12].

It follows from general arguments (cf. [28]) that sections of $L_{[C]}$ can be identified with the $G_0(X)$-invariant sections of $\tilde L$, namely with those functions $\Phi : P(X) \to \mathbb{C}$ satisfying

$$\Phi[\mu^g] = C[\mu, g]\,\Phi[\mu].$$

Since the conformal block $\Psi = \exp(-W)$ does not vanish, the foregoing proves the following

Proposition 5.8. The conformal block $\Psi$ descends to a non-vanishing section of $L_{[C]}$, thereby providing a trivializing isomorphism $L_{[C]} \to T(X) \times \mathbb{C}$.

Observe (cf. [35]) that the line bundle $L_{[C]}$ is holomorphically trivial due to a general property of the Teichmüller space being a contractible domain of holomorphy [25]. Our construction provides an instance of this general fact, as well as an explicit trivializing map. Also note that, due to the universal nature of the cocycle $C$, the ratio of two different conformal blocks, in accordance with [30], is $G_0(X)$-invariant and, therefore, descends to a non-vanishing function on the Teichmüller space $T(X)$.

5.3. The preceding observations raise several additional questions concerning the geometrical significance of $\exp(-W[\mu])$. For instance, we can define the trivial connection on the trivial line bundle $\tilde L$ on $P(X)$:

$$\nabla\Phi = \Psi\, d(\Psi^{-1}\Phi) = d\Phi - (\Psi^{-1} d\Psi)\,\Phi .$$
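Since the conformal block is $\Psi = \exp(-W)$, the connection just defined can be rewritten directly in terms of $W$. The short computation below is only a restatement of the definition, with $\Phi$ denoting an arbitrary smooth function on $P(X)$:

```latex
\[
\Psi^{-1}\,d\Psi = -\,dW
\quad\Longrightarrow\quad
\nabla\Phi = d\Phi - (\Psi^{-1}d\Psi)\,\Phi = d\Phi + (dW)\,\Phi ,
\qquad
\nabla\Psi = 0 .
\]
```

In this form the connection form is visibly $dW$, and $\Psi$ itself is covariantly constant.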
This connection is easily verified to be G0 (X)-invariant, hence it descends onto L[C] . It follows from Proposition 5.2 and Theorem B that the connection form coincides with d W = c T /12π. This is very reminiscent of Friedan and Shenker’s modular geometry program for CFT [14], where the vacuum expectation value of the stress-energy tensor is interpreted as a connection on a line bundle over the moduli space. As a further development, this suggests studying the action of the full group G(X) on the presented construction. As


it is well known [11], the quotient $P(X)/G(X)$ (the action being the same as in the previous case) is precisely the moduli space of compact Riemann surfaces of genus $g > 1$. All the local formulas will stay the same, while the action of the modular group $G(X)/G_0(X)$ on $T(X)$ will introduce the topological “twisting”. All of this should be fundamental for the differential-geometrical realization of Friedan and Shenker’s program. In this respect it is important, as we proved in the paper, that the functional $W[\mu]$ is independent of the marking of a Riemann surface $X$.

Another direction, more directly related to the Earle-Eells fibration, consists in finding the geometric interpretation of the critical points $T = 0$ and “vertical critical” points $\mu_{zzz} = 0$ of the functional $W[\mu]$.

Finally, the question of the relation of $W[\mu]$ with the full induced gravity action on $X$ is also very important. Recall the genus zero factorization [30]

$$\int R\,\Delta^{-1} R = W[\mu] + \overline{W[\mu]} + K[\phi, \mu, \bar\mu],$$

where the term $K[\phi, \mu, \bar\mu]$ is further decomposed as a sum

$$K[\phi, \mu, \bar\mu] = S_L[\phi, \mu, \bar\mu] + K_{BK}[\mu, \bar\mu]$$

of the Belavin-Knizhnik-like anomaly term plus the Liouville action in the background $|dz + \mu\, d\bar z|^2$. After having properly defined $W[\mu]$ on $X$, it is natural to ask whether such a decomposition holds in higher genus as well. We observe that the general (co)homological techniques applied in this paper can also be used to give a mathematically rigorous construction of the Liouville action (in various backgrounds) in the form of a “bulk” term plus boundary and vertex corrections, in the spirit of [29, 33]. A construction of this kind should provide a meaning also to the full action $\int R\,\Delta^{-1} R$ in terms of a Liouville action in the “target” complex structure, provided one can actually define $K_{BK}$ in higher genus as well. A full understanding of the geometrical properties of $W[\mu]$ and $K_{BK}$ and their exponentials would be relevant in order to put the Geometric Quantization approach of ref. [30] and, more generally, the three-dimensional approach to two-dimensional gravity on a more conventional mathematical basis.

Finally, a similar construction can be carried out for defining the WZW functional on higher genus Riemann surfaces. We are planning to address these questions in forthcoming publications.

Appendix A. Some Facts from Homological Algebra

We give a brief account of the use of double complexes as applied to our situation. We shall mainly focus on homology and just indicate the required modifications to discuss the cohomological counterpart of the various statements. For a full account cf. any book on homological algebra, for instance [23].

A.1. The framework we put ourselves in is sufficiently simple that one can in fact avoid the use of spectral sequences altogether in the proof of Lemmas 3.2 and 3.3, provided one takes into account a few simple facts from homological algebra. The key point is that the various double complexes we are interested in have trivial (co)homology in higher degrees with respect to either the first or second differentials, so the arguments can be given in general, without referring to specific examples. Let $K_{\bullet,\bullet}$ be a double complex with differentials $\partial' : K_{p,q} \to K_{p-1,q}$ and $\partial'' : K_{p,q} \to K_{p,q-1}$, and total differential $\partial|_{K_{p,q}} = \partial' + (-1)^p \partial''$. According to our discussion, let us make the assumption that

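$$H_q^{\partial''}(K_{p,\bullet}) = \begin{cases} C_p & q = 0 \\ 0 & q > 0 . \end{cases}$$

Then $C_\bullet \stackrel{\mathrm{def}}{=} \oplus\, C_p$ inherits a differential$^1$ $\partial : C_p \to C_{p-1}$ from the first differential $\partial'$ in the double complex, and since

$$\cdots \xleftarrow{\;\partial''\;} K_{p,q-1} \xleftarrow{\;\partial''\;} K_{p,q} \xleftarrow{\;\partial''\;} K_{p,q+1} \xleftarrow{\;\partial''\;} \cdots$$

is exact except in degree zero, we can “augment” $K_{\bullet,\bullet}$ by inserting the projection $\varepsilon : K_{p,0} \to C_p$ to obtain the exact sequence $0 \longleftarrow C_\bullet \longleftarrow K_{\bullet,\bullet}$.

Proposition A.1. $H_\bullet(\mathrm{Tot}\,K) \cong H_\bullet(C)$.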

Proof. This is a routine check of the definitions. Suppose $c \in C_p$ is closed, i.e. $\partial c = 0$. This means that a chain $c_0 \in K_{p,0}$ with $\varepsilon(c_0) = c$ exists such that $\varepsilon(\partial' c_0) = 0$, but $\varepsilon(\partial' c_0)$ is the class represented by $\partial' c_0$, since we clearly have $\partial'' \partial' c_0 = 0$. So, this class is zero, and therefore we have $\partial' c_0 = \partial'' c_1$ for some $c_1 \in K_{p-1,1}$. Now, $\partial''(\partial' c_1) = \partial'(\partial'' c_1) = \partial' \partial' c_0 = 0$, and since the $\partial''$-homology of $K_{\bullet,\bullet}$ is concentrated only in dimension zero, a $c_2 \in K_{p-2,2}$ must exist such that $\partial' c_1 = \partial'' c_2$, and so on. The procedure stops at the $p$th step. Thus the chain

$$C = c_0 + \sum_{i=1}^{p} (-1)^{\sum_{k=0}^{i-1}(p-k)}\, c_i$$

is a cycle in $\mathrm{Tot}\,K$, that is, $\partial C = 0$.
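The sign attached to each $c_i$ can be checked directly. In the sketch below, $s_i$ is a label introduced here for the sign of $c_i$ (it does not appear in the text); the computation uses $\partial = \partial' + (-1)^p \partial''$ on $K_{p,q}$, the fact that $c_i \in K_{p-i,i}$, and the relations $\partial' c_i = \partial'' c_{i+1}$ from the construction:

```latex
% component of \partial C in K_{p-i-1,\,i}, for 0 \le i \le p-1:
\[
s_i\,\partial' c_i + s_{i+1}\,(-1)^{p-i-1}\,\partial'' c_{i+1}
  = \bigl(s_i - (-1)^{p-i}\,s_{i+1}\bigr)\,\partial' c_i .
\]
% this vanishes for every i precisely when
\[
s_0 = 1, \qquad s_{i+1} = (-1)^{p-i}\,s_i
\quad\Longrightarrow\quad
s_i = (-1)^{\sum_{k=0}^{i-1}(p-k)} .
\]
```

The two remaining components, $\partial'' c_0$ and $\partial' c_p$, lie outside the double complex and vanish automatically.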

Conversely, suppose $C = c_0 + \sum_{i=1}^{p} (-1)^{\sum_{k=0}^{i-1}(p-k)}\, c_i \in \mathrm{Tot}\,K$ is $\partial$-closed. Then $c \equiv \varepsilon(c_0)$ is a degree $p$ cycle in $C_p$. Indeed, in degree $(p-1, 0)$ we have $\partial' c_0 = \partial'' c_1$ and

$$\varepsilon(\partial' c_0) = \varepsilon(\partial'' c_1) = 0,$$

since the augmentation is exact. That the cycle $c \in C_p$ is a boundary if and only if $C \in \mathrm{Tot}\,K$ is a boundary can be proven along the same lines. This completes the argument.

A.2. Recall from Sect. 3 the various double complexes we used. In particular, $K_{\bullet,\bullet} = S_\bullet \otimes_{\mathbb{Z}\Gamma} B_\bullet$ is the double complex obtained by tensoring the singular chain complex on $X_0 \cong \mathbb{H}$ with the “bar” complex

$$0 \longleftarrow B_0 \xleftarrow{\;\partial''\;} B_1 \xleftarrow{\;\partial''\;} \cdots \xleftarrow{\;\partial''\;} B_n \xleftarrow{\;\partial''\;} \cdots , \tag{A.1}$$

which is exact except in degree zero. Its definition has been given in the main text. Since $B_0$ is a $\Gamma$-module on the generator $[\ ]$, introducing the augmentation $\varepsilon : B_0 \to \mathbb{Z}$, $\varepsilon([\ ]) = 1$, we can rewrite it as the exact sequence

$$0 \longleftarrow \mathbb{Z} \xleftarrow{\;\varepsilon\;} B_0 \xleftarrow{\;\partial''\;} B_1 \xleftarrow{\;\partial''\;} \cdots \xleftarrow{\;\partial''\;} B_n \xleftarrow{\;\partial''\;} \cdots . \tag{A.2}$$

$^1$ The use of the same symbol to denote the differentials in $C$ and $\mathrm{Tot}\,K$ should not generate any confusion.

The above exact sequence is usually referred to as a “resolution” of the integers. Since every $B_q$ is a free $\Gamma$-module, the sequence is a free resolution. The singular chain complex $S_\bullet \equiv S_\bullet(X_0)$ needs little description. Since $\Gamma$ acts on the space, $S_\bullet$ acquires a $\Gamma$-module structure simply by translating around the chains. That this actually is a complex of free $\Gamma$-modules is proven in [23] or [9]. A choice of free generators is to take those chains whose first vertex lies in a suitably chosen fundamental domain in $X_0$. The differential, which we called $\partial'$ in the main text, is just the usual boundary homomorphism.

The homology of $\Gamma$ with coefficients in any $\Gamma$-module $M$ is by definition the homology of the complex $M \otimes_{\mathbb{Z}\Gamma} B_\bullet$. (Any other resolution of $\mathbb{Z}$ would be adequate.) In fact, tensor product does not preserve exactness in general. As a matter of terminology, a module $M$ such that any exact sequence remains exact after tensoring with it is called flat. Therefore, all the higher homology groups of $\Gamma$ with coefficients in a flat module will be zero. A free $\Gamma$-module is in particular flat, as is very easy to see. So, in our case, we have

$$H_q(\Gamma, S_p) = \begin{cases} S_p \otimes_{\mathbb{Z}\Gamma} \mathbb{Z} & q = 0 \\ 0 & q > 0 , \end{cases}$$

where $\mathbb{Z}$ is considered as a trivial $\Gamma$-module. Moreover, note that $S_p \otimes_{\mathbb{Z}\Gamma} \mathbb{Z} \equiv S_p(X_0) \otimes_{\mathbb{Z}\Gamma} \mathbb{Z} \cong S_p(X)$, the space of singular chains on the surface. Indeed, if $c$ is any chain on $X_0$ and $\gamma$ is any group element, we have $c \cdot \gamma \otimes 1 = c \otimes \gamma \cdot 1 = c \otimes 1$, and therefore $c \otimes 1$ can be identified with a singular chain on the surface, as claimed.

After these preparations, we can exploit the exact complex (A.2) to build the augmented double complex

$$0 \longleftarrow S_\bullet \otimes_{\mathbb{Z}\Gamma} \mathbb{Z} \xleftarrow{\;\mathrm{id} \otimes \varepsilon\;} S_\bullet \otimes_{\mathbb{Z}\Gamma} B_\bullet \tag{A.3}$$

with exact rows. According to the foregoing, the leftmost column in (A.3) is to be identified with the singular chain complex on the surface. (Or, more generally, of the quotient space.) The complex (A.3) satisfies the hypotheses of Proposition A.1, and since the group homology is the $\partial''$-homology of the double complex, we conclude that

$$H_\bullet(\mathrm{Tot}\,K) \cong H_\bullet(X, \mathbb{Z}),$$

thereby proving one half of Lemma 3.2. In order to prove the other half, let us observe that actually all the columns in (A.3), except the first one, are exact, $X_0 \cong \mathbb{H}$ being a contractible space. Indeed, the complex $S_\bullet$ carries no homology except in degree zero, and we can “augment” it as well to obtain another resolution of the integers:

$$0 \longleftarrow \mathbb{Z} \xleftarrow{\;\varepsilon\;} S_0 \xleftarrow{\;\partial'\;} S_1 \xleftarrow{\;\partial'\;} \cdots \xleftarrow{\;\partial'\;} S_n \xleftarrow{\;\partial'\;} \cdots .$$

Now the situation is completely symmetric and we can just “transpose” the above constructions to build the augmented complex


$$S_\bullet \otimes_{\mathbb{Z}\Gamma} B_\bullet \xrightarrow{\;\varepsilon \otimes \mathrm{id}\;} \mathbb{Z} \otimes_{\mathbb{Z}\Gamma} B_\bullet \longrightarrow 0$$

and apply Proposition A.1 to it to show that $H_\bullet(\mathrm{Tot}\,K) \cong H_\bullet(\Gamma, \mathbb{Z})$.

A.3. The cohomological picture has a very similar structure. The cohomology of $\Gamma$ with coefficients in $M$ is by definition the homology of the complex $\mathrm{Hom}_{\mathbb{Z}\Gamma}(B_\bullet, M)$. (Notice that Hom is contravariant in the first variable, thus it reverses the arrows.) We will be in a position to apply the analogue of Proposition A.1 with the arrows reversed to the complex $C^{\bullet,\bullet} = \mathrm{Hom}(B_\bullet, A^\bullet)$ provided we show that $H^q(\Gamma, A^p) = 0$ for $q > 0$, that is, $\mathrm{Hom}(\,\cdot\,, A^p)$ must preserve exactness, so that the higher cohomology groups are zero. An injective module $M$ is by definition a $\Gamma$-module such that $\mathrm{Hom}(\,\cdot\,, M)$ preserves exactness, hence the higher cohomology groups of $\Gamma$ with coefficients in an injective module are zero. Thus we have to show that $A^p$ is injective as a $\Gamma$-module. In fact, more can be done, namely it can be shown that $A^p \cong \mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}\Gamma, A^p_{\mathbb{C}}(X))$, where $A^p_{\mathbb{C}}(X)$ is the vector space of (complex valued) differential forms on the Riemann surface $X$. The (easy) proof of this assertion requires the construction of an equivariant partition of unity on $\mathbb{H}$, see [21]. Then $A^p$ has no higher cohomology since

$$\mathrm{Hom}_{\mathbb{Z}\Gamma}(B_\bullet, A^p) \cong \mathrm{Hom}_{\mathbb{Z}\Gamma}(B_\bullet, \mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}\Gamma, A^p_{\mathbb{C}}(X))) \cong \mathrm{Hom}_{\mathbb{Z}}(B_\bullet, A^p_{\mathbb{C}}(X)),$$

and the last complex has no cohomology, except in degree zero. Thus we have

$$H^q(\Gamma, A^p) = \begin{cases} A^p_{\mathbb{C}}(X) & q = 0 \\ 0 & q > 0 , \end{cases}$$

and applying Proposition A.1 to the double complex $C^{\bullet,\bullet}$ we can prove that

$$H^\bullet(\mathrm{Tot}\,C) \cong H^\bullet(X, \mathbb{C}).$$

To prove the rest of Lemma 3.3 we need only use the contractibility of $X_0 \cong \mathbb{H}$, so that $A^\bullet$ has no cohomology, and apply Proposition A.1 to the transposed double complex.

Acknowledgement. We would like to thank J. L. Dupont, C.-H. Sah and S. Shatashvili for very helpful discussions. We also thank R. Zucchini and G. Falqui for kindly pointing out several references to previous works on the subject. The work of E.A. was supported by the National Research Council (CNR), Italy; the work of L.T. was partially supported by the NSF grant DMS-95-00557.

Note added in proof. After the work described in this paper had been completed, the articles [36] and [37], where similar double complexes for group cohomology are also used, were brought to our attention.


References

1. Ahlfors, L.: Lectures on Quasiconformal Mappings. Van Nostrand, 1966
2. Alekseev, A. and Shatashvili, S.: Path integral quantization of the coadjoint orbits of the Virasoro group and 2-d gravity. Nucl. Phys. B323, 719–733 (1989)
3. Alvarez-Gaumé, L. and Ginsparg, P.: The topological meaning of non abelian anomalies. Nucl. Phys. B243, 449 (1984)
4. Atiyah, M. F.: Complex analytic connections in fibre bundles. Trans. Am. Math. Soc. 85, 185–207 (1957)
5. Atiyah, M. F.: K-Theory. New York: Benjamin, 1967
6. Becchi, C.: On the covariant quantization of the free string: the conformal structure. Nucl. Phys. B304, 513 (1988)
7. Belavin, A. A.: unpublished, 1985–1986
8. Belavin, A. A., Polyakov, A. M. and Zamolodchikov, A. B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B241, 333–380 (1984)
9. Brown, K. S.: Cohomology of groups. Springer-Verlag, 1982
10. Catenacci, R. and Pirola, G. P.: A geometrical description of local and global anomalies. Lett. Math. Phys. 19, 45–51 (1990)
11. Earle, C. J. and Eells, J.: A fibre bundle description of Teichmüller theory. J. Diff. Geom. 3, 19–43 (1969)
12. Falqui, G. and Reina, C.: BRS cohomology and topological anomalies. Commun. Math. Phys. 102, 503–515 (1985)
13. Fay, J.: Fourier coefficients of the resolvent for a Fuchsian group. J. Reine Angew. Math. 293/294, 143–203 (1977)
14. Friedan, D. and Shenker, S.: The analytic geometry of two-dimensional conformal field theory. Nucl. Phys. B281, 509–545 (1987)
15. Friedan, D., Qiu, Z. and Shenker, S.: Conformal invariance, unitarity and two dimensional critical exponents. In: Vertex operators in Mathematics and Physics, J. Lepowsky et al., editors, Publ. MSRI, no. 3, Berlin–Heidelberg–New York: Springer-Verlag, 1984
16. Gardiner, F. P.: Teichmüller Theory and Quadratic Differentials. Wiley-Interscience, 1987
17. Gunning, R. C.: Lectures on Riemann surfaces. Princeton: Princeton Univ. Press, 1966
18. Haba, Z.: Generating functional for the energy-momentum tensor in two-dimensional conformal field theory. Phys. Rev. D41, 724–726 (1990)
19. Katok, S.: Fuchsian Groups. University of Chicago Press, 1992
20. Kostant, B. and Sternberg, S.: Symplectic reduction, BRS cohomology and infinite-dimensional Clifford algebras. Ann. of Phys. 176, 49–113 (1987)
21. Kra, I.: Automorphic forms and Kleinian groups. Benjamin, 1972
22. Lazzarini, S.: Doctoral Thesis, LAPP Annecy-le-Vieux (1990) and references therein
23. Mac Lane, S.: Homology. Berlin–Heidelberg–New York: Springer-Verlag, 1975
24. Magri, F.: A simple model for the integrable Hamiltonian equation. J. Math. Phys. 19, 1156–1162 (1978)
25. Nag, S.: The complex analytic theory of Teichmüller spaces. Wiley-Interscience, 1988
26. Polyakov, A. M.: Quantum gravity in two dimensions. Mod. Phys. Lett. A 2, 893–898 (1987)
27. Polyakov, A. M.: unpublished, 1985–1986
28. Segal, G.: Equivariant K-Theory. Publ. Math. IHES 34, 129–151 (1968)
29. Takhtajan, L. A.: Topics in the quantum geometry of Riemann surfaces: Two-dimensional quantum gravity. In: International School of Physics “Enrico Fermi” Course CXXVII “Quantum Groups and their Applications in Physics”, L. Castellani et al., editors. Amsterdam: IOS Press, 1996
30. Verlinde, H.: Conformal field theory, two-dimensional quantum gravity and quantization of Teichmüller space. Nucl. Phys. B337, 652–680 (1990)
31. Yoshida, K.: Effective action for quantum gravity in two dimensions. Mod. Phys. Lett. A 4, 71–81 (1989)
32. Yoshida, K.: On the origin of SL(n, C) current algebra in generalized 2-dimensional gravity. Int. Jour. Mod. Phys. A 7, 4353–4375 (1992)
33. Zograf, P. and Takhtajan, L.: On uniformization of Riemann surfaces and the Weil-Petersson metric on Teichmüller and Schottky spaces. Math. USSR Sbornik 60, 297–313 (1988)
34. Zucchini, R.: A Polyakov action on Riemann surfaces. Phys. Lett. B 260, 296–302 (1991)
35. Zucchini, R.: A Polyakov action on Riemann surfaces. II. Commun. Math. Phys. 152, 269–298 (1993)
36. Jeffrey, L. C.: Group cohomology construction of the cohomology of moduli spaces of flat connections on 2-manifolds. Duke Math. Jour. 77, 407–429 (1995)
37. Weinstein, A.: The symplectic structure of moduli space. In: A. Floer memorial volume, Birkhäuser

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 188, 69 – 88 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

The Thermal Equilibrium Solution of a Generic Bipolar Quantum Hydrodynamic Model

Andreas Unterreiter

Fachbereich Mathematik, Universität Kaiserslautern, Erwin-Schrödinger-Straße, D-67653 Kaiserslautern, Germany

Received: 26 March 1996 / Accepted: 13 January 1997

Dedicated to Helmut Neunzert on his 60th birthday

Abstract: The thermal equilibrium state of a bipolar, isothermic quantum fluid confined to a bounded domain $\Omega \subset \mathbb{R}^d$, $d = 1, 2$ or $d = 3$, is entirely described by the particle densities $n, p$ minimizing the energy

$$\varepsilon^2 \int_\Omega |\nabla\sqrt{n}|^2 + \varepsilon^2 \int_\Omega |\nabla\sqrt{p}|^2 + \int_\Omega G_1(n) + \int_\Omega G_2(p) + \frac{\lambda^2}{2} \int_\Omega |\nabla V[n - p - C]|^2 ,$$

where $G_{1,2}$ are strictly convex real valued functions, $-\lambda^2 \Delta V = n - p - C$, with $\int_\Omega (n - p - C) = \int_\Omega V = 0$. It is shown that this variational problem has a unique minimizer in

$$\left\{ (n, p) \in L^1(\Omega) \times L^1(\Omega) : n, p \ge 0,\ \sqrt{n}, \sqrt{p} \in H^1(\Omega),\ \int_\Omega n = N,\ \int_\Omega p = P \right\}$$

and some regularity results are proven. The semi-classical limit $\varepsilon \to 0$ is carried out recovering the minimizer of the limiting functional. The subsequent zero space charge limit $\lambda \to 0$ leads to extensions of the classical boundary conditions. Due to the lack of regularity the asymptotics $\lambda \to 0$ cannot be settled by Sobolev embedding arguments. The limit is carried out by means of a compactness-by-convexity principle.

1. Introduction

Quantum hydrodynamic models (QHDs) give a fairly accurate account of the macroscopic behavior of ultra small semiconductor devices in terms of only macroscopic quantities such as particle densities, current densities and electric fields. Within semiconductor device modeling QHDs are located between microscopic quantum models (Schrödinger-Poisson systems [16, 15], Bloch’s equation [3, 13] or kinetic-type quantum transport equations [14]) and macroscopic semi-classical hydrodynamic models [14]. Presently the interplay between these different approaches is a


field of intensive research. Current research deals with the derivation of QHDs from microscopic quantum models (essentially based on Madelung’s transformation, see [6] for a review) and investigations of the semi-classical limit $\hbar \to 0$. All quantum models of semiconductor devices investigated so far are unipolar, i.e. these models involve only one particle type, namely electrons. Hence a consistency problem arises. Whenever quantum effects are negligible, solutions of QHDs should recover the qualitative behavior of solutions of semi-classical models. However, most of the established semi-classical approaches involve in a crucial way two particle types, namely electrons and holes. Therefore the analysis of unipolar QHDs has to be extended to bipolar QHDs. Unipolar QHDs reduce in thermal equilibrium to generic unipolar constitutive laws [2]. The (scaled) bipolar extension of the constitutive laws reads

$$\begin{cases} n\nabla V + T_1 \nabla R_1(n) - \varepsilon^2\, n \nabla \dfrac{\Delta\sqrt{n}}{\sqrt{n}} = 0, \\[2mm] -p\nabla V + T_2 \nabla R_2(p) - \xi\varepsilon^2\, p \nabla \dfrac{\Delta\sqrt{p}}{\sqrt{p}} = 0, \\[2mm] -\lambda^2 \Delta V = n - p - C, \\[1mm] \displaystyle\int_\Omega n = N, \quad \int_\Omega p = P, \quad \int_\Omega V = 0. \end{cases} \tag{1}$$

In (1) the functions $n, p, V$ are unknown, where $n = n(x) \ge 0$ is the particle density of electrons (negatively charged) in the conduction band, $p = p(x) \ge 0$ is the particle density of holes (positively charged) in the valence band, $V = V(x)$ is the (negative) electrostatic potential and $x$ ranges over $\Omega$, a bounded domain in $\mathbb{R}^d$, where $d = 1, 2$ or $d = 3$. $\varepsilon$ is the scaled Planck constant and $\xi$ is the ratio of the effective masses of electrons and holes. The device dependent parameters $T_1, T_2$ (electron and hole reference temperature, respectively) and the minimal Debye length $\lambda$ are assumed to be constant. $R_{1,2} : [0, \infty) \to [0, \infty)$ are the respective pressure functions. (Typically, the pressure function is continuously differentiable and increasing.) $C$ is the doping profile. It is assumed that the impurity atoms are fully ionized, i.e. $C = N_D - N_A$, where $N_D = N_D(x)$, $N_A = N_A(x) \ge 0$ are the space densities of donor and acceptor atoms, respectively. $N$ is the total number of electrons in the conduction band and $P$ is the total number of holes in the valence band. $N, P$ are related to the densities of donor and acceptor atoms via

$$N = n_i + \int_\Omega N_D, \qquad P = n_i + \int_\Omega N_A,$$

where $n_i > 0$ is an intrinsic constant taking into account that the number of electrons in the conduction band (as well as the number of holes in the valence band) is not only determined by the doping but also by intrinsic thermal excitation processes. The relation between $N, P$ and $C$ implies total charge neutrality. Hence Poisson’s equation has (at least for $n - p - C \in L^2(\Omega)$) exactly one solution $V$ satisfying $\int_\Omega V = 0$. Since our main conclusions will not depend on the particular values of the positive parameters $T_1, T_2, \xi$ we simply set $T_1 = T_2 = \xi = 1$.
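The charge-neutrality statement can be spelled out as a short computation. The sketch below writes $\Omega$ for the domain and $\eta$ for the outward unit normal on $\partial\Omega$ (a symbol introduced here), and assumes the homogeneous Neumann condition for $V$ that the paper states further on:

```latex
% the intrinsic contribution n_i cancels in the difference:
\[
N - P = \int_\Omega (N_D - N_A) = \int_\Omega C
\quad\Longrightarrow\quad
\int_\Omega (n - p - C) = N - P - \int_\Omega C = 0 .
\]
% this is the compatibility condition of the Neumann problem for V:
\[
\int_\Omega (n - p - C)
 = -\lambda^2 \int_\Omega \Delta V
 = -\lambda^2 \int_{\partial\Omega} \frac{\partial V}{\partial \eta}\, dS
 = 0 .
\]
```

The normalization $\int_\Omega V = 0$ then removes the remaining freedom of adding a constant to $V$.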


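Equations (1) provide only a necessary condition for the thermal equilibrium state. Equations (1) do not take into account that the thermal equilibrium solution minimizes the system’s total energy $E_{\varepsilon\lambda}$. If (1) has more than one solution (this happens in some semi-classical settings [18]) the physically relevant solution of (1) is distinguished as a minimizer of $E_{\varepsilon\lambda}$. One is therefore compelled to minimize

$$E_{\varepsilon\lambda}(\nu, \pi) = \varepsilon^2 \int_\Omega |\nabla\sqrt{\nu}|^2 + \varepsilon^2 \int_\Omega |\nabla\sqrt{\pi}|^2 + \int_\Omega G_1(\nu) + \int_\Omega G_2(\pi) + \frac{\lambda^2}{2} \int_\Omega |\nabla V[\nu - \pi - C]|^2$$

in

$$\Gamma_\varepsilon \equiv \left\{ (\nu, \pi) \in L^1(\Omega) \times L^1(\Omega) : \nu, \pi \ge 0,\ \sqrt{\nu}, \sqrt{\pi} \in H^1(\Omega),\ \int_\Omega \nu = N,\ \int_\Omega \pi = P \right\},$$

where $G_{1,2}$ is a primitive of $g_{1,2}(t) \equiv \dfrac{1}{t}\dfrac{d}{dt} R_{1,2}(t)$ and

$$-\lambda^2 \Delta V[\nu - \pi - C] = \nu - \pi - C, \qquad \int_\Omega V[\nu - \pi - C] = 0.$$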

A straightforward formal computation shows that the Euler-Lagrange equations of the functional $E_{\varepsilon\lambda}$ are

$$\begin{cases} \varepsilon^2 \Delta\sqrt{n} = \sqrt{n}\,(V + g_1(n) - \alpha_1) \\[1mm] \varepsilon^2 \Delta\sqrt{p} = \sqrt{p}\,(-V + g_2(p) - \alpha_2) \\[1mm] -\lambda^2 \Delta V = n - p - C \\[1mm] \displaystyle\int_\Omega n = N, \quad \int_\Omega p = P, \quad \int_\Omega V = 0, \end{cases} \tag{2}$$

where $\alpha_1, \alpha_2 \in \mathbb{R}$ are the Lagrange multipliers associated with the constraints $\int_\Omega \nu = N$, $\int_\Omega \pi = P$. If the minimizer $(n, p)$ of $E_{\varepsilon\lambda}$ in $\Gamma_\varepsilon$ satisfies (2) and $n, p > 0$ one gets (1) from (2) by simple algebraic manipulations and taking gradients. The formulation of (1) as a variational problem provides a natural justification of the normalizing condition $\int_\Omega V = 0$. For fixed $(\nu, \pi) \in \Gamma_\varepsilon$ the potential $V[\nu - \pi - C]$ minimizes the electric field energy [9]

$$F_{el}[W] = \frac{\lambda^2}{2} \int_\Omega |\nabla W|^2 - \int_\Omega (\nu - \pi - C)\,W,$$

where $W$ ranges in a set $\Gamma_W$ such that $\inf_{W \in \Gamma_W} F_{el}[W] = F_{el}[V[\nu - \pi - C]]$. To make $\inf_{W \in \Gamma_W} F_{el}[W]$ as small as possible one has to choose $\Gamma_W$ as large as possible:

$$\Gamma_W \equiv \left\{ W \in H^1(\Omega) : \int_\Omega W = 0 \right\}.$$

Due to the assumed total charge neutrality $F_{el}$ is well defined on $\Gamma_W$ and attains its unique minimizer in $\Gamma_W$. The normalizing condition $\int_\Omega W = 0$ eliminates physically irrelevant additive constants. It is readily seen that $V[\nu - \pi - C]$ satisfies homogeneous Neumann conditions.
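The passage from (2) to (1) mentioned above can be written out for the electron equation; the hole equation is analogous with $V$ replaced by $-V$. This is a sketch under two assumptions made here: $n > 0$ is smooth enough to differentiate, and $g_1$ is related to the pressure function by $t\,g_1'(t) = R_1'(t)$, which is the relation needed for the pressure term of (1) (with $T_1 = 1$) to appear:

```latex
% divide the first equation of (2) by \sqrt{n} > 0:
\[
\varepsilon^2\,\frac{\Delta\sqrt{n}}{\sqrt{n}} = V + g_1(n) - \alpha_1 .
\]
% take the gradient (\alpha_1 is a constant) and multiply by n:
\[
n\nabla V + n\,g_1'(n)\,\nabla n
  - \varepsilon^2\, n\,\nabla\frac{\Delta\sqrt{n}}{\sqrt{n}} = 0 .
\]
% with t\,g_1'(t) = R_1'(t) the middle term equals \nabla R_1(n):
\[
n\nabla V + \nabla R_1(n)
  - \varepsilon^2\, n\,\nabla\frac{\Delta\sqrt{n}}{\sqrt{n}} = 0 .
\]
```

The Lagrange multiplier $\alpha_1$ disappears when the gradient is taken, which is why (1) alone does not determine it.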


Remark 1. a) Replacing formally the terms $\sqrt{n}$ by $\Psi_1$, $\sqrt{p}$ by $\Psi_2$, $n$ by $|\Psi_1|^2$ and $p$ by $|\Psi_2|^2$, Eqs. (2) can be written as a scaled, stationary, nonlinear Schrödinger-Poisson system:

$$\begin{cases} -\varepsilon^2 \Delta\Psi_1 + V\Psi_1 + \Psi_1\, g_1(|\Psi_1|^2) = \alpha_1 \Psi_1 \\[1mm] -\varepsilon^2 \Delta\Psi_2 - V\Psi_2 + \Psi_2\, g_2(|\Psi_2|^2) = \alpha_2 \Psi_2 \\[1mm] -\lambda^2 \Delta V = |\Psi_1|^2 - |\Psi_2|^2 - C \\[1mm] \displaystyle\int_\Omega |\Psi_1|^2 = N, \quad \int_\Omega |\Psi_2|^2 = P, \quad \int_\Omega V = 0. \end{cases}$$

In this formulation $\alpha_1, \alpha_2$ are energy eigenvalues. The corresponding variational problem is to minimize the functional

$$E^*_{\varepsilon\lambda}(\Psi_1, \Psi_2) = \varepsilon^2 \int_\Omega |\nabla\Psi_1|^2 + \varepsilon^2 \int_\Omega |\nabla\Psi_2|^2 + \int_\Omega G_1(|\Psi_1|^2) + \int_\Omega G_2(|\Psi_2|^2) + \frac{\lambda^2}{2} \int_\Omega |\nabla V[|\Psi_1|^2 - |\Psi_2|^2 - C]|^2$$

in the set

$$\Gamma^* = \left\{ (\Psi_1, \Psi_2) \in H^1(\Omega; \mathbb{C}) \times H^1(\Omega; \mathbb{C}) : \int_\Omega |\Psi_1|^2 = N,\ \int_\Omega |\Psi_2|^2 = P \right\}.$$

It is not very difficult to check that the minimizer of $E^*_{\varepsilon\lambda}$ equals, up to a physically irrelevant constant phase factor, $(\sqrt{n}, \sqrt{p})$.

b) The normalizing condition $\int_\Omega V = 0$ implies that $V$ satisfies homogeneous Neumann boundary conditions. This means that no external voltage is present. In voltage-driven applications, however, the thermal equilibrium state is influenced by external electric potentials. In this case Dirichlet (or mixed Dirichlet-Neumann) boundary data for $V$ are prescribed. In [17] the analysis of a unipolar QHD with these boundary data is carried out. The extension to bipolar models of the investigations in [17] as well as the modifications of the results of Subsects. 2.2 and 2.3 are rather straightforward and can be left to the reader. Essential for the treatment of the electric energy are the estimates

$$\|V[f]\|_{L^\infty},\ \|V[f]\|_{H^1} \le K \|f\|_{L^2},$$

where $\Delta V[f] = f$. Such estimates hold for reasonable Dirichlet (or mixed Dirichlet-Neumann) boundary data for $V$.

Equations (1) involve the dimensionless parameters $\varepsilon, \lambda$. Due to the presence of quantum effects $\varepsilon$ is of non-negligible order of magnitude for ultra small semiconductor devices. For "standard" devices, however, quantum effects play no major role. In these settings one has

$$\varepsilon^2 \ll \lambda^2 \ll 1,$$

and one is therefore compelled to study the consecutive limits $\varepsilon \to 0$ and $\lambda \to 0$. The smallness of $\varepsilon^2$ is a high temperature effect as well as due to the smallness of Planck’s constant. The terms involving $\varepsilon^2$ represent corrections to an otherwise classical model. Carrying out the limit $\varepsilon \to 0$ means going back from quantum mechanics to classical physics.


It is the aim of this paper to analyze the variational problem of minimizing $E_{\varepsilon\lambda}$ in $\Gamma_\varepsilon$, to give a rigorous derivation of the associated Euler-Lagrange equations (2), to prove that the minimizer of $E_{\varepsilon\lambda}$ in $\Gamma_\varepsilon$ solves (1), to carry out the semi-classical limit $\varepsilon \to 0$ and to justify the employment of semi-classical boundary conditions whenever quantum effects are negligible and the scaled minimal Debye length is small. All subsequent investigations are based on (mild) assumptions given at the beginning of Sect. 2. Subsections 2.1, 2.2, 2.3 are concerned with the statements of the results. The proofs are given in Subsects. 3.1, 3.2, 3.3.

The core of the analysis of the semi-classical limit $\varepsilon \to 0$ (Subsect. 3.2) are properties of the functional $E_{\circ\lambda}$ obtained from $E_{\varepsilon\lambda}$ by setting formally $\varepsilon = 0$. This functional $E_{\circ\lambda}$ possesses a unique minimizer in a set $\Gamma_\circ$ with $\Gamma_\circ \supset \Gamma_\varepsilon$, $\Gamma_\circ \ne \Gamma_\varepsilon$. Although the comparison functions of $\Gamma_\circ$ are less regular than those of $\Gamma_\varepsilon$, the minimizer of $E_{\circ\lambda}$ in $\Gamma_\circ$ is actually an element of $\Gamma_\varepsilon$. This regularity result, in connection with $\varepsilon$-independent estimates, makes it possible to pass to the limit $\varepsilon \to 0$ strongly in $H^1(\Omega)$.

Subsection 3.3 is concerned with the justification of semi-classical boundary conditions for QHDs. The minimizer of $E_{\circ\lambda}$ in $\Gamma_\circ$ does not recover the usual semi-classical boundary conditions [12, 14]. This is not to be expected because the semi-classical boundary conditions are derived from the zero space charge assumption $\lambda = 0$. Setting $\lambda = 0$ in $E_{\circ\lambda}$ gives a functional $E_{\circ\circ}$ to be minimized in a set $\Gamma_{\circ\circ} \subset \Gamma_\circ$, $\Gamma_{\circ\circ} \ne \Gamma_\circ$. $E_{\circ\circ}$ possesses a unique minimizer $(n_c, p_c)$ in $\Gamma_{\circ\circ}$ satisfying the semi-classical boundary conditions. However, the investigation of $\lambda \to 0$ requires some effort. The main difficulty in passing to the limit $\lambda \to 0$ is the lack of regularity of $(n_c, p_c)$. In fact the limiting densities $n_c, p_c$ are in general not continuous while for all $\lambda > 0$ the minimizers of $E_{\circ\lambda}$ belong to $C(\Omega)$. Hence compactness arguments based on embeddings of $H^1(\Omega)$ in some $L^p$-space (as used to perform the semi-classical limit) are not applicable. However, a compactness-by-convexity principle (Lemma 3) makes it possible to carry out the limit $\lambda \to 0$.

2. Statement of the Results

The subsequent investigations are based on the following assumptions:

(A)
a) $\Omega \subset \mathbb{R}^d$, $d = 1, 2$ or $d = 3$, is a bounded domain with $\partial\Omega \in C^{0,1}$.
b) There exists a $K > 0$ only depending on $\Omega$ such that $\|V[f]\|_{L^\infty} \le K \|f\|_{L^2}$.
c) $C \in L^\infty(\Omega)$.
d) $N - P = \int_\Omega C$, $N > \int_\Omega C^+$, $P > \int_\Omega C^-$.
e) $g_{1,2} \in C(0, \infty) \cap L^1_{loc}([0, \infty))$ is strictly increasing, $\lim_{t \to \infty} g_{1,2}(t) = \infty$ and $\underline{g}_{1,2} \equiv \lim_{t \to 0+} g_{1,2}(t) \in [-\infty, \infty)$.

Remark 2. a) Assumption (A)b) is essentially a requirement on the smoothness of $\partial\Omega$. For instance it is well known, see e.g. [5], that for $\partial\Omega \in C^\infty$ the estimate

$$\|V[f]\|_{H^2} \le K \|f\|_{L^2}$$


holds. This estimate implies assumption (A)b) in dimensions $d \le 3$, because due to $\partial\Omega \in C^{0,1}$ the embedding $H^2(\Omega) \to C_B(\Omega)$ is continuous [1].
b) The assumptions (A)e) are satisfied for functions $g_{1,2}$ deduced from the most frequently employed pressure functions of the form $R_{1,2}(t) = t^a$, $a \in [1, \infty)$.

2.1. Existence and uniqueness of a minimizer. The main result of this subsection is

Theorem 1. Assume (A). Then for all $\varepsilon, \lambda > 0$ the functional $E_\varepsilon^\lambda$ has a unique minimizer $(n, p)$ in $\Gamma$ which solves the associated Euler-Lagrange equations (2) as well as (1). Furthermore,
– $n, p, V$ satisfy homogeneous Neumann boundary conditions,
– for all $t \in (0, 1)$, the functions $n, p, \sqrt{n}, \sqrt{p}, V$ belong to $C^{1,t}_{loc}(\Omega) \cap C_B(\Omega) \cap H^1(\Omega)$,
– $n, p$ are strictly positive in $\Omega$, i.e. $n(x), p(x) > 0$ for all $x \in \Omega$,
– if $\underline{g}_1 = -\infty$, then there exists a constant $K > 1$ such that $1/K \le n \le K$,
– if $\underline{g}_2 = -\infty$, then there exists a constant $K > 1$ such that $1/K \le p \le K$.

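2.2. The semi-classical limit $\varepsilon \to 0$. Keeping $\lambda > 0$ fixed and given $\varepsilon \in (0, \infty)$, let $(n_\varepsilon, p_\varepsilon)$ be the unique minimizer of $E_\varepsilon^\lambda$ in $\Gamma$ and let $V_\varepsilon = V[n_\varepsilon - p_\varepsilon - C]$. By setting $\varepsilon = 0$ and formal manipulations Eqs. (1) become
\[
\begin{cases}
n_\circ \nabla V_\circ + \nabla R_1(n_\circ) = 0,\\
-p_\circ \nabla V_\circ + \nabla R_2(p_\circ) = 0,\\
-\lambda^2 \Delta V_\circ = n_\circ - p_\circ - C,\\
\int_\Omega n_\circ = N, \quad \int_\Omega p_\circ = P, \quad \int_\Omega V_\circ = 0,
\end{cases} \tag{3}
\]
the energy functional $E_\varepsilon^\lambda$ becomes
\[
E_\circ^\lambda(\nu, \pi) = \int_\Omega G_1(\nu) + \int_\Omega G_2(\pi) + \frac{\lambda^2}{2} \int_\Omega |\nabla V[\nu - \pi - C]|^2,
\]
i.e. $\sqrt{\nu}, \sqrt{\pi} \in H^1(\Omega)$ is not required anymore and $E_\circ^\lambda$ should be minimized in
\[
\Gamma_\circ = \Bigl\{ (\nu, \pi) \in L^1(\Omega) \times L^1(\Omega) : \nu, \pi \ge 0, \ \int_\Omega \nu = N, \ \int_\Omega \pi = P \Bigr\}.
\]
The limit $\varepsilon = 0$ of the Euler-Lagrange equations (2) is less straightforward. In contrast to the quantum case the appearance of "vacuum-sets" (subsets of $\Omega$ where $n_\circ$ or $p_\circ$ vanishes) is possible. Hence simply canceling the differential operators in (2) loses some information on vacuum-sets. A rigorous analysis shows that in the limit $\varepsilon = 0$ the Euler-Lagrange equations become the variational inequalities
\[
\begin{cases}
0 = V_\circ + g_1(n_\circ) - \alpha_{1\circ} & \text{if } n_\circ > 0,\\
0 \le V_\circ + g_1(n_\circ) - \alpha_{1\circ} & \text{if } n_\circ = 0,\\
0 = -V_\circ + g_2(p_\circ) - \alpha_{2\circ} & \text{if } p_\circ > 0,\\
0 \le -V_\circ + g_2(p_\circ) - \alpha_{2\circ} & \text{if } p_\circ = 0,\\
-\lambda^2 \Delta V_\circ = n_\circ - p_\circ - C,\\
\int_\Omega V_\circ = 0, \quad \int_\Omega n_\circ = N, \quad \int_\Omega p_\circ = P,
\end{cases} \tag{4}
\]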

Thermal Equilibrium Solution of Generic Bipolar Quantum Hydrodynamic Model


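where $\alpha_{1\circ}, \alpha_{2\circ} \in \mathbb{R}$. Some more information about $n_\circ, p_\circ$ is available by introducing the generalized inverse $h_{1,2}$ of $g_{1,2}$:
\[
h_{1,2} : \mathbb{R} \to [0, \infty), \qquad t \mapsto
\begin{cases}
0 & \text{if } t \le \underline{g}_{1,2},\\
g_{1,2}^{-1}(t) & \text{if } t > \underline{g}_{1,2}.
\end{cases}
\]

Lemma 1. Assume (A) and let $\lambda > 0$. Then the functional $E_\circ^\lambda$ has a unique minimizer $(n_\circ, p_\circ)$ in $\Gamma_\circ$ solving the associated variational inequalities (4). Furthermore,
– for all $t \in (0, 1)$, the electric potential $V_\circ$ belongs to $C^{1,t}_{loc}(\Omega) \cap C_B(\Omega) \cap H^1(\Omega)$,
– $n_\circ, p_\circ \in C_B(\Omega)$, $n_\circ \le \sup C + P/\mathrm{meas}(\Omega)$, $p_\circ \le -\inf C + N/\mathrm{meas}(\Omega)$,
– for all $t \in (0, 1)$, $g_1(n_\circ) \in C^{1,t}_{loc}(\{n_\circ > 0\}) \cap H^1(\{n_\circ > 0\})$,
– for all $t \in (0, 1)$, $g_2(p_\circ) \in C^{1,t}_{loc}(\{p_\circ > 0\}) \cap H^1(\{p_\circ > 0\})$,
– if $\underline{g}_1 = -\infty$, then there exists a $K > 1$ such that $1/K \le n_\circ \le K$,
– if $\underline{g}_2 = -\infty$, then there exists a $K > 1$ such that $1/K \le p_\circ \le K$,
– $n_\circ = h_1(\alpha_{1\circ} - V_\circ)$, $p_\circ = h_2(\alpha_{2\circ} + V_\circ)$ and the electric potential $V_\circ$ solves the semi-linear elliptic equation
\[
-\lambda^2 \Delta V_\circ = h_1(\alpha_{1\circ} - V_\circ) - h_2(\alpha_{2\circ} + V_\circ) - C, \qquad \int_\Omega V_\circ = 0.
\]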
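The generalized inverse used in Lemma 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `make_generalized_inverse` is hypothetical, and the two sample functions are chosen for convenience. The first is $g(t) = \log t$ (lower limit $-\infty$, so no vacuum can occur). The second is $g(t) = 2t$, which corresponds to the pressure $R(t) = t^2$ if one assumes the relation $R'(t) = t\,g'(t)$ suggested by comparing (3) with (4); its lower limit is $0$, so $h$ vanishes on a half-line and vacuum sets become possible.

```python
import math

def make_generalized_inverse(g_inverse, g_lower):
    """Build h: R -> [0, inf) with h(t) = 0 for t <= g_lower, else g^{-1}(t).

    g_inverse : inverse of the strictly increasing function g on (g_lower, inf)
    g_lower   : the limit of g(t) as t -> 0+ (may be -inf)
    """
    def h(t):
        if t <= g_lower:
            return 0.0          # vacuum branch
        return g_inverse(t)     # classical inverse branch
    return h

# Assumed example 1: g(t) = log(t), lower limit -inf, so h = exp everywhere.
h_log = make_generalized_inverse(math.exp, -math.inf)

# Assumed example 2: g(t) = 2t, lower limit 0, so h(t) = max(t, 0) / 2.
h_quad = make_generalized_inverse(lambda s: s / 2.0, 0.0)

print(h_log(0.0), h_quad(-1.0), h_quad(3.0))
```

In the logarithmic case the density $h_1(\alpha_{1\circ} - V_\circ)$ is strictly positive for every bounded potential, whereas in the second case it vanishes wherever $\alpha_{1\circ} - V_\circ \le 0$.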

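The following convergence result of $(n_\varepsilon, p_\varepsilon, V_\varepsilon)$ to $(n_\circ, p_\circ, V_\circ)$ as $\varepsilon \to 0$ requires $\sqrt{n_\circ}, \sqrt{p_\circ} \in H^1(\Omega)$. Sufficient conditions for $\sqrt{n_\circ}, \sqrt{p_\circ} \in H^1(\Omega)$ can be most easily formulated in terms of $h_{1,2}$ and $g_{1,2}$ [19]:

Corollary 1. Assume (A) and let $\lambda > 0$. Then $\sqrt{n_\circ}, \sqrt{p_\circ}$ belong to $H^1(\Omega)$ if $g_j, h_j$, $j = 1, 2$, satisfy one of the following conditions:
a) $\sqrt{h_j} \in C^{0,1}_{loc}(\mathbb{R})$.
b) $\underline{g}_j = -\infty$ and $h_j \in C^{0,1}_{loc}(\mathbb{R})$.
c) $g_j \in C^1_{loc}(0, \infty)$, $\underline{g}_j = -\infty$ and $\frac{d}{dt} g_j(t) > 0$ for $t \in (0, \infty)$.

Remark 3. In applications $g_{1,2}(t)$ usually equals $\log(t)$ for small $t$, so b) applies.

Theorem 2. Assume (A) and $\sqrt{n_\circ}, \sqrt{p_\circ} \in H^1(\Omega)$. Then
– $V_\varepsilon \to V_\circ$ strongly in $H^1(\Omega)$ and strongly in $L^\infty(\Omega)$ as $\varepsilon \to 0$,
– $n_\varepsilon \to n_\circ$ and $p_\varepsilon \to p_\circ$ strongly in $L^r(\Omega)$, $r \in [1, \infty)$, and weak* in $L^\infty(\Omega)$ as $\varepsilon \to 0$,
– $\sqrt{n_\varepsilon} \to \sqrt{n_\circ}$ and $\sqrt{p_\varepsilon} \to \sqrt{p_\circ}$ strongly in $H^1(\Omega)$ as $\varepsilon \to 0$,
– if $\underline{g}_1 = -\infty$ then there exists an $\varepsilon^* > 0$ and a $K > 1$ which is independent of $\varepsilon \in (0, \varepsilon^*)$ such that $1/K \le n_\varepsilon, n_\circ \le K$ and $n_\varepsilon \to n_\circ$ strongly in $H^1(\Omega)$ as $\varepsilon \to 0$,
– if $\underline{g}_2 = -\infty$ then there exists an $\varepsilon^* > 0$ and a $K > 1$ which is independent of $\varepsilon \in (0, \varepsilon^*)$ such that $1/K \le p_\varepsilon, p_\circ \le K$ and $p_\varepsilon \to p_\circ$ strongly in $H^1(\Omega)$ as $\varepsilon \to 0$.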

2.3. The limit $\lambda \to 0$. Throughout this section let $(n_\lambda, p_\lambda)$ be the unique minimizer of $E_\circ^\lambda$ in $\Gamma_\circ$ and let $V_\lambda = V[n_\lambda - p_\lambda - C]$. Equations (3) are known as the semi-classical hydrodynamic semiconductor device model in thermal equilibrium. For this model the definition of the built-in potential is based on the zero space charge assumption, which means that $\lambda$ is set to zero in Poisson's equation [12]. To analyze the limit $\lambda \to 0$ set formally $\lambda = 0$ in (3):
\[
\begin{cases}
n_c \nabla V_c + \nabla R_1(n_c) = 0,\\
-p_c \nabla V_c + \nabla R_2(p_c) = 0,\\
0 = n_c - p_c - C,\\
\int_\Omega n_c = N, \quad \int_\Omega p_c = P, \quad \int_\Omega V_c = 0.
\end{cases} \tag{5}
\]
The functional $E_\circ^\lambda$ becomes formally
\[
E_{\circ\circ}(\nu, \pi) = \int_\Omega G_1(\nu) + \int_\Omega G_2(\pi)
\]
to be minimized in
\[
\Gamma_{\circ\circ} = \Bigl\{ (\nu, \pi) \in L^1(\Omega) \times L^1(\Omega) : \nu, \pi \ge 0, \ \int_\Omega \nu = N, \ \int_\Omega \pi = P, \ \nu - \pi - C = 0 \Bigr\}.
\]
The associated Euler-Lagrange equations are
\[
\begin{cases}
\gamma = g_1(n_c) + g_2(p_c) & \text{if } n_c p_c > 0,\\
\gamma \le g_1(n_c) + g_2(p_c) & \text{if } n_c p_c = 0,
\end{cases} \tag{6}
\]
where $\gamma \in \mathbb{R}$. The solvability of this minimization problem is the content of

Lemma 2. Assume (A). Then $E_{\circ\circ}$ has a unique minimizer $(n_c, p_c)$ in $\Gamma_{\circ\circ}$ and
– $n_c, p_c \in L^\infty(\Omega)$,
– $n_c, p_c$ satisfy (6),
– $\mathrm{meas}(\{n_c = 0\} \cap \{p_c = 0\}) = 0$,
– $n_c p_c$ does not vanish identically on $\Omega$, i.e. $\int_\Omega n_c p_c > 0$,
– $\{n_c = 0\} = \{p_c = C^-\}$ and $\{p_c = 0\} = \{n_c = C^+\}$,
– if $\underline{g}_1 = -\infty$ then there exists a $K > 1$ such that $1/K \le n_c \le K$,
– if $\underline{g}_2 = -\infty$ then there exists a $K > 1$ such that $1/K \le p_c \le K$,
– $g_1(n_c), g_2(p_c) \in L^\infty(\Omega)$,
– defining
\[
\beta_1 \equiv \frac{1}{\mathrm{meas}(\Omega)} \Bigl( \gamma \, \mathrm{meas}(\{n_c = 0\}) + \int_{\{n_c > 0\}} g_1(n_c) - \int_{\{n_c = 0\}} g_2(p_c) \Bigr), \qquad \beta_2 \equiv \gamma - \beta_1,
\]
and setting
\[
V_c \equiv
\begin{cases}
\beta_1 - g_1(n_c) & \text{if } n_c > 0,\\
g_2(p_c) - \beta_2 & \text{if } n_c = 0,
\end{cases}
\]
the quintuple $(\beta_1, \beta_2, n_c, p_c, V_c)$ is a solution of (5).
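For the logarithmic case $g_1 = g_2 = \log$, condition (6) reads $n_c p_c = e^\gamma$ and, together with $n_c - p_c = C$, gives $n_c = C/2 + \sqrt{(C/2)^2 + \delta^2}$ with $\delta^2 = e^\gamma$. The following Python sketch determines $\delta^2$ from the constraint $\int n_c = N$ by bisection. It is a minimal illustration, not the author's method: the function name `zero_space_charge`, the representation of $\Omega$ by sample points with quadrature weights, and the two-cell example are all assumptions made here.

```python
import math

def zero_space_charge(doping, weights, total_n, iterations=200):
    """Solve for delta^2 such that sum(w * n_c) = total_n, where
    n_c = C/2 + sqrt((C/2)^2 + delta^2) and p_c = n_c - C (case g = log).

    doping  : values of C at the sample points (hypothetical discretization)
    weights : quadrature weights of the sample points
    total_n : prescribed total number N, assumed larger than the integral of C^+
    """
    def densities(delta2):
        return [c / 2.0 + math.sqrt((c / 2.0) ** 2 + delta2) for c in doping]

    def total(delta2):
        return sum(w * n for w, n in zip(weights, densities(delta2)))

    # total(0) equals the integral of C^+, which is below N by assumption,
    # and total is strictly increasing, so the root can be bracketed.
    lo, hi = 0.0, 1.0
    while total(hi) < total_n:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if total(mid) < total_n:
            lo = mid
        else:
            hi = mid
    delta2 = 0.5 * (lo + hi)
    n_c = densities(delta2)
    p_c = [n - c for n, c in zip(n_c, doping)]
    return delta2, n_c, p_c

# Assumed toy device: doping +1 on one half of the domain, -1 on the other.
delta2, n_c, p_c = zero_space_charge([1.0, -1.0], [0.5, 0.5], 1.0)
print(delta2, n_c, p_c)
```

In this toy example the product $n_c p_c$ is the same constant on both halves of the domain, as (6) requires, while $n_c$ and $p_c$ themselves jump with $C$.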


The main problem when passing to the limit $\lambda \to 0$ is that the limit solution $(n_c, p_c)$ is less regular than the minimizers $n_\lambda, p_\lambda \in H^1(\Omega)$. Hence there are no uniform $H^1(\Omega)$-estimates on $\sqrt{n_\lambda}, \sqrt{p_\lambda}$. Available estimates concern $\int_\Omega G_1(n_\lambda)$ and $\int_\Omega G_2(p_\lambda)$, so the subsequent Lemma and its Corollary are fundamental.

Lemma 3. (Compactness-by-Convexity) Let $\Omega \subset \mathbb{R}^d$, $d \in \mathbb{N}$, be a bounded domain and let $G : [0, \infty) \to \mathbb{R}$ be strictly convex and continuous. For $n \in \mathbb{N}$ let $f_n, f \in L^1(\Omega)$ with $f_n, f \ge 0$ a.e. on $\Omega$. Assume that $\|f_n\|_{L^1} \to \|f\|_{L^1}$ as $n \to \infty$ and suppose that there exists a $\vartheta \in (0, 1)$ such that
\[
\int_\Omega G(f) = \lim_{n\to\infty} \int_\Omega G(f_n) = \lim_{n\to\infty} \int_\Omega G(\vartheta f + (1 - \vartheta) f_n) \equiv L \in \mathbb{R}.
\]
Then $f_n \to f$ strongly in $L^1(\Omega)$ as $n \to \infty$.

Corollary 2. Let $\Omega$ and $G$ be as in Lemma 3. For $n \in \mathbb{N}$ let $f_n, f \in L^1(\Omega)$ with $f_n, f \ge 0$ a.e. on $\Omega$ and assume that $f_n \to f$ weakly in $L^1(\Omega)$ as $n \to \infty$ as well as
\[
\int_\Omega G(f) = \lim_{n\to\infty} \int_\Omega G(f_n) \equiv L < \infty.
\]
Then $f_n \to f$ strongly in $L^1(\Omega)$ as $n \to \infty$.

Remark 4. a) In Lemma 3 it is assumed that $\vartheta$ is constant. By obvious modifications this assumption can be weakened slightly to require that there exists a sequence $(\vartheta_n)_{n \in \mathbb{N}}$ with $\vartheta_n \in (0, 1)$ and $\lim_{n\to\infty} \vartheta_n = \vartheta \in (0, 1)$ such that
\[
\int_\Omega G(f) = \lim_{n\to\infty} \int_\Omega G(f_n) = \lim_{n\to\infty} \int_\Omega G(\vartheta_n f + (1 - \vartheta_n) f_n) \equiv L \in \mathbb{R}.
\]
Setting $\underline{\Theta} = \inf\{\vartheta_n : n \in \mathbb{N}\}$, $\overline{\Theta} = \sup\{\vartheta_n : n \in \mathbb{N}\}$, both in Lemma 3 and Corollary 2 the assumption $G \in C([0, \infty))$, $G$ strictly convex, can be replaced by
\[
\forall k > 1, \ \forall \vartheta \in [\underline{\Theta}, \overline{\Theta}] : \exists c > 0 : \forall u, v \in [\tfrac{1}{k}, k], \ u \le v : \quad \vartheta G(v) + (1 - \vartheta) G(v - u) - G(v - (1 - \vartheta) u) \ge c\, u.
\]
b) Many sufficient conditions are known which allow one to pass from weak $L^1$-convergence (or convergence in the sense of distributions) to strong $L^1$-convergence, see e.g. Brézis [4] and the references given there. In Lemma 3 however no convergence of the sequence $(f_n)$ is assumed.

The main result of this subsection is

Theorem 3. Assume (A). Then
– $n_\lambda \to n_c$, $p_\lambda \to p_c$, $V_\lambda \to V_c$ strongly in $L^r(\Omega)$, $r \in [1, \infty)$, and weak* in $L^\infty(\Omega)$ as $\lambda \to 0$,
– $\|V_\lambda\|_{H^1} = o(1/\lambda)$ as $\lambda \to 0$,
– if $\underline{g}_1 = -\infty$ then there exists a $\lambda^* > 0$ and a constant $K > 1$ which is independent of $\lambda \in (0, \lambda^*)$ such that $1/K \le n_\lambda, n_c \le K$,
– if $\underline{g}_2 = -\infty$ then there exists a $\lambda^* > 0$ and a constant $K > 1$ which is independent of $\lambda \in (0, \lambda^*)$ such that $1/K \le p_\lambda, p_c \le K$.


Remark 5. a) Convergence in the $L^\infty(\Omega)$-norm can in general not be expected because $n_\lambda, p_\lambda, V_\lambda \in C(\Omega)$ for all $\lambda > 0$, while for discontinuous $C$ one has $n_c, p_c, V_c \notin C(\Omega)$.
b) If $g_{1,2}(t) = \log(t)$, see [12], then the functions $n_c, p_c, V_c$ are given by
\[
\begin{cases}
n_c = (C/2) + \sqrt{(C/2)^2 + \delta^2},\\
p_c = -(C/2) + \sqrt{(C/2)^2 + \delta^2},\\
V_c = \beta_1 - \log\bigl( (C/2) + \sqrt{(C/2)^2 + \delta^2} \bigr) = \log\bigl( -(C/2) + \sqrt{(C/2)^2 + \delta^2} \bigr) - \beta_2,
\end{cases} \tag{7}
\]
where $\delta^2 = e^{\beta_1 + \beta_2}$ is uniquely determined by $\int_\Omega \bigl( (C/2) + \sqrt{(C/2)^2 + \delta^2} \bigr) = N$, or equivalently by $\int_\Omega \bigl( -(C/2) + \sqrt{(C/2)^2 + \delta^2} \bigr) = P$. Equations (7) recover the classical expressions for the thermal equilibrium distributions of $n_c, p_c, V_c$, see [12]. The parameter $\delta^2$ (as well as $\beta_1, \beta_2$) is uniquely determined by $N$ and $P$.

3. Proofs

3.1. Proofs of Subsection 2.1.

Proof of Theorem 1. The proof extends a similar argument of [17] to bipolar models. Some modifications are however necessary to handle the operator $V[f]$, whose corresponding operator in [17] is positive. For the sake of simplicity assume that $g_1 = g_2 = g$.

Step 1. For $i \in (1, \infty]$, $t \in [0, \infty)$ let $g_i(t) \equiv \min\{it, \max\{-i, g(t)\}\}$ and $G_i(t) = \int_1^t g_i(\sigma)\, d\sigma$. We shall minimize
\[
E_i^+(r, s) = \varepsilon^2 \int_\Omega |\nabla r|^2 + \varepsilon^2 \int_\Omega |\nabla s|^2 + \int_\Omega G_i\bigl((r^+)^2\bigr) + \int_\Omega G_i\bigl((s^+)^2\bigr) + \frac{\lambda^2}{2} \int_\Omega \bigl| \nabla V[(r^+)^2 - (s^+)^2 - C] \bigr|^2
\]
in
\[
\Gamma^+ \equiv \Bigl\{ (r, s) \in H^1(\Omega) \times H^1(\Omega) : \int_\Omega (r^+)^2 = N, \ \int_\Omega (s^+)^2 = P \Bigr\},
\]
where $r^+, s^+$ are the positive parts of $r, s$. The aim of the subsequent analysis is to carry out the limit $i \to \infty$. Various $i$-independent positive constants are denoted by $K$.

Lemma 4. Assume (A). Then, for all $i \in (1, \infty]$, the functional $E_i^+$ possesses a unique minimizer $(R_i, S_i)$ in $\Gamma^+$ and $R_i, S_i \ge 0$.

Proof of Lemma 4. The existence of a minimizer $(R_i, S_i) \in \Gamma^+$ follows from standard theory, see e.g. [7, 11]. One easily checks that $(R_i^+, S_i^+) \in \Gamma^+$ (cutting maps $H^1(\Omega)$ into $H^1(\Omega)$, see e.g. [10]) and $E_i^+(R_i^+, S_i^+) \le E_i^+(R_i, S_i)$, where equality holds iff $R_i^- = S_i^- = 0$. Therefore $R_i, S_i \ge 0$. Assume that $(R_i, S_i)$ and $(R^1, S^1)$ are distinct


non-negative minimizers of $E_i^+$ in $\Gamma^+$. Then a straightforward calculation shows that for all $\vartheta \in (0, 1)$ the pair $(R_\vartheta, S_\vartheta)$,
\[
R_\vartheta \equiv \sqrt{\vartheta (R_i)^2 + (1 - \vartheta)(R^1)^2} \ge 0, \qquad S_\vartheta \equiv \sqrt{\vartheta (S_i)^2 + (1 - \vartheta)(S^1)^2} \ge 0,
\]
belongs to $\Gamma^+$ with $E_i^+(R_\vartheta, S_\vartheta) < \vartheta E_i^+(R_i, S_i) + (1 - \vartheta) E_i^+(R^1, S^1)$, which contradicts the assumed minimality of $E_i^+(R_i, S_i)$ and $E_i^+(R^1, S^1)$ in $\Gamma^+$.

Step 2. Similar to [17] it can be easily seen that for all $i \in (1, \infty)$ (the case $i = \infty$ has to be excluded here because of the possible lack of differentiability of $G_i(t)$ at $t = 0$) the pair $(R_i, S_i)$ satisfies the respective Euler-Lagrange equations
\[
\begin{cases}
\varepsilon^2 \Delta R_i = R_i \bigl( V_i + g_i(R_i^2) - \alpha_{i1} \bigr),\\
\varepsilon^2 \Delta S_i = S_i \bigl( -V_i + g_i(S_i^2) - \alpha_{i2} \bigr),\\
-\lambda^2 \Delta V_i = R_i^2 - S_i^2 - C,\\
\int_\Omega R_i^2 = N, \quad \int_\Omega S_i^2 = P, \quad \int_\Omega V_i = 0,
\end{cases} \tag{8}
\]
where it is taken into account that $R_i, S_i \ge 0$. The space of test functions of (8) is $H^1(\Omega)$. Hence $R_i, S_i$ satisfy homogeneous Neumann boundary conditions.

Step 3. The limit $i \to \infty$ is prepared by deriving $i$-independent estimates on $R_i, S_i$. Here some modifications of the proof of [17] are necessary. Due to the fact that $E_i^+$ is uniformly (with respect to $i$) bounded from below and $\|R_i\|_{L^2}, \|S_i\|_{L^2} \le K$, one gets $\|R_i\|_{H^1}, \|S_i\|_{H^1} \le K$, which gives $\|R_i\|_{L^6}, \|S_i\|_{L^6} \le K$. Due to assumption (A)b),c) it follows that $\|V_i\|_{L^\infty} \le K$. Combining these estimates we get $\int_\Omega R_i^2 |V_i|, \int_\Omega S_i^2 |V_i| \le K$ and we can proceed along the lines of Sect. 3.3 of [17] to establish the estimates $\int_\Omega R_i^2 g_i(R_i^2), \int_\Omega S_i^2 g_i(S_i^2) \le K$ and $|\alpha_{i1}|, |\alpha_{i2}| \le K$.

Lemma 5. Assume (A). Then $0 \le R_i, S_i \le K$.

Proof of Lemma 5. Given $a > 1$ we use $[R_i - a]^+ / R_i$ as test function in the first equation of (8). This gives
\[
\varepsilon^2 a \int_\Omega \frac{|\nabla [R_i - a]^+|^2}{R_i^2} + \int_\Omega [R_i - a]^+ \bigl( V_i + g_i(R_i^2) - \alpha_{i1} \bigr) = 0
\]
such that previous estimates imply
\[
\bigl( K - g_i(a^2) \bigr) \int_\Omega [R_i - a]^+ \ge \varepsilon^2 a \int_\Omega \frac{|\nabla [R_i - a]^+|^2}{R_i^2} \ge 0,
\]
and due to $\lim_{t\to\infty} g_i(t) = \infty$ we have $R_i \le K$. $S_i \le K$ follows in analogy.

Step 4. The estimates derived so far allow us to choose a sequence $(R_i, S_i)_{i \in \mathbb{N}}$ such that $R_i \to R$, $S_i \to S$ weakly in $H^1(\Omega)$ and weak* in $L^\infty(\Omega)$ as $i \to \infty$. It remains to show that the pair $(R, S)$ is actually the minimizer of $E_\infty^+$ in $\Gamma^+$ solving the corresponding Euler-Lagrange equations. It can be seen as in [17] that $(R, S)$ is the minimizer of $E_\infty^+$ in $\Gamma^+$. To pass to the limit in the Euler-Lagrange equations (8) we distinguish between two cases. If $\underline{g} = -\infty$ it follows from the maximum principle and previous estimates that $K \le R_i, S_i$, see [17] for the details. If $\underline{g} \in \mathbb{R}$, then the map $t \mapsto \sqrt{t}\, g(t)$ is continuous on $[0, \infty)$. In both cases we can pass to the limit $i \to \infty$ in the weak formulation of (8) with arbitrary test functions in $H^1(\Omega)$. This settles the boundary conditions and the limiting equations. The regularity of $R, S$ follows from the fact that $\Delta R, \Delta S$ are both in $L^\infty(\Omega)$. If $\underline{g} = -\infty$ then the lower estimate for $R, S$ follows from $R_i, S_i \ge K$; if $\underline{g} \in \mathbb{R}$, the strict positivity of $R, S$ follows from Harnack's inequality. Identifying $n$ with $R^2$ and $p$ with $S^2$ settles the proof of Theorem 1.

3.2. Proofs of Subsection 2.2.

Proof of Lemma 1. Lemma 1 modifies a result in [18] where mixed Dirichlet-Neumann boundary conditions are concerned. For the sake of a smoother presentation assume $g = g_1 = g_2$.

Step 1. For $i \in (1, \infty]$, $t \in [0, \infty)$ let
\[
g_i(t) =
\begin{cases}
t - (1/i) + g(1/i), & 0 \le t \le (1/i),\\
g(t), & (1/i) < t < i,\\
t - i + g(i), & t \ge i,
\end{cases}
\]
and set $G_i(t) \equiv \int_1^t g_i(\sigma)\, d\sigma$. $g_i$ is strictly monotone increasing. Let
\[
h_i : \mathbb{R} \to [0, \infty), \qquad t \mapsto
\begin{cases}
0 & \text{if } t \le g(1/i) - (1/i),\\
g_i^{-1}(t) & \text{if } t > g(1/i) - (1/i).
\end{cases}
\]
It is readily seen that for $i \in (1, \infty)$ the function $G_i$ is strictly convex and belongs to $C^1[0, \infty)$. Furthermore $G_i(t) = O(t^2)$ as $t \to \infty$. We shall minimize the functional
\[
E_{\circ\lambda}^i(\nu, \pi) \equiv \int_\Omega G_i(\nu) + \int_\Omega G_i(\pi) + \frac{\lambda^2}{2} \int_\Omega |\nabla V[\nu - \pi - C]|^2
\]
in the set
\[
\Gamma_\circ = \Bigl\{ (\nu, \pi) \in L^1(\Omega) \times L^1(\Omega) : \nu, \pi \ge 0, \ \int_\Omega \nu = N, \ \int_\Omega \pi = P \Bigr\},
\]
where the last term of $E_{\circ\lambda}^i$ is set to $+\infty$ whenever the problem $-\lambda^2 \Delta V = \nu - \pi - C$, $\int_\Omega V = 0$ admits no solution in $H^1(\Omega)$. ($\nu, \pi$ belong only to $L^1(\Omega)$.) It follows from standard theory that $E_{\circ\lambda}^i$ possesses for all $i \in (1, \infty)$ a unique minimizer $(n_i, p_i) \in \Gamma_\circ$. The case $i = \infty$ has to be excluded here because of the possible lack of coercivity of the functional $E_\circ^\lambda$ in $L^1(\Omega)$ (or any other $L^r(\Omega)$ space as well). Furthermore the standard theory also provides that $(n_i, p_i)$ solves the corresponding variational inequalities

\[
\begin{cases}
0 = V_i + g_i(n_i) - \alpha_{i1} & \text{if } n_i > 0,\\
0 \le V_i + g_i(n_i) - \alpha_{i1} & \text{if } n_i = 0,\\
0 = -V_i + g_i(p_i) - \alpha_{i2} & \text{if } p_i > 0,\\
0 \le -V_i + g_i(p_i) - \alpha_{i2} & \text{if } p_i = 0,\\
-\lambda^2 \Delta V_i = n_i - p_i - C,\\
\int_\Omega V_i = 0, \quad \int_\Omega n_i = N, \quad \int_\Omega p_i = P.
\end{cases} \tag{9}
\]

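This system can be written as a single semi-linear equation in terms of the electrostatic potential $V_i$:
\[
-\lambda^2 \Delta V_i = h_i(\alpha_{i1} - V_i) - h_i(\alpha_{i2} + V_i) - C, \qquad \int_\Omega V_i = 0,
\]
where
\[
n_i = h_i(\alpha_{i1} - V_i), \qquad p_i = h_i(\alpha_{i2} + V_i). \tag{10}
\]
It follows by the strict monotonicity of $h_i$ via the maximum principle that $V_i \in L^\infty(\Omega)$ with $\underline{V}_i \le V_i \le \overline{V}_i$, where $\underline{V}_i, \overline{V}_i$ satisfy the inequalities
\[
h_i(\alpha_{i1} - \underline{V}_i) - h_i(\alpha_{i2} + \underline{V}_i) \le \sup C, \qquad h_i(\alpha_{i1} - \overline{V}_i) - h_i(\alpha_{i2} + \overline{V}_i) \ge \inf C. \tag{11}
\]
Furthermore, the normalizing conditions $\int_\Omega h_i(\alpha_{i1} - V_i) = N$ and $\int_\Omega h_i(\alpha_{i2} + V_i) = P$ imply
\[
\begin{cases}
h_i(\alpha_{i1} - \overline{V}_i) \le N' \equiv N/\mathrm{meas}(\Omega) \le h_i(\alpha_{i1} - \underline{V}_i),\\
h_i(\alpha_{i2} + \underline{V}_i) \le P' \equiv P/\mathrm{meas}(\Omega) \le h_i(\alpha_{i2} + \overline{V}_i).
\end{cases} \tag{12}
\]

Step 2. We carry out the limit $i \to \infty$ by deriving $i$-independent estimates. Various $i$-independent positive constants are denoted by $K$. It follows from (10), (11), (12) and the non-negativity of $n_i, p_i$ that $n_i \le \sup C + P'$, $p_i \le -\inf C + N'$. Hence $\|\Delta V_i\|_{L^\infty} \le K$, which gives by (A)b) the estimate $\|V_i\|_{L^\infty} \le K$. It follows from (9) that $\alpha_{i1}, \alpha_{i2} \le K$. To establish lower estimates for $\alpha_{i1}$ assume that $\liminf_{i\to\infty} \alpha_{i1} = -\infty$. Passing if necessary to a subsequence we have due to (12), $N' \le h_i(\alpha_{i1} + K)$. Choose $i$ large enough such that $\alpha_{i1} + K < g_i(N') = g(N')$. Then, if $\alpha_{i1} + K > g(1/i) - (1/i)$, the contradiction $N' \le h_i(\alpha_{i1} + K) = g_i^{-1}(\alpha_{i1} + K) = g^{-1}(\alpha_{i1} + K)$, i.e. $g(N') \le \alpha_{i1} + K$, follows. If however $\alpha_{i1} + K \le g(1/i) - (1/i)$, then $N' \le h_i(\alpha_{i1} + K) = 0$, which is a contradiction. This proves that $\liminf_{i\to\infty} \alpha_{i1} \in \mathbb{R}$, and a similar argument for $\alpha_{i2}$ settles $|\alpha_{i1}|, |\alpha_{i2}| \le K$.

Step 3. The estimates of Step 2 ensure that, possibly after passing to a subsequence, $\lim_{i\to\infty} \alpha_{i1} = \alpha_{1\circ}$, $\lim_{i\to\infty} \alpha_{i2} = \alpha_{2\circ}$ as well as
\[
n_i \to n_\circ, \quad p_i \to p_\circ \quad \text{weak* in } L^\infty(\Omega), \text{ as } i \to \infty.
\]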


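Hence $V_i \to V_\circ$ weak* in $L^\infty(\Omega)$ and strongly in $H^1(\Omega)$, as $i \to \infty$, where $V_\circ = V[n_\circ - p_\circ - C]$. Passing if necessary to a subsequence gives
\[
V_i \to V_\circ \quad \text{almost everywhere in } \Omega, \text{ as } i \to \infty.
\]
We proceed by a case distinction.
a) If $\underline{g} = -\infty$ then by means of $g_i(n_i) \ge \alpha_{i1} - V_i \ge -K$, the estimate $n_i \ge K$ follows. Hence $g_i(n_i) = g(n_i)$ as well as $n_i = h(\alpha_{i1} - V_i)$ for all sufficiently large $i$, and by continuity of $h$ we have $n_i \to n_\circ = h(\alpha_{1\circ} - V_\circ)$ almost everywhere in $\Omega$ as $i \to \infty$, which gives via $\|n_i\|_{L^\infty} \le K$,
\[
n_i \to n_\circ = h(\alpha_{1\circ} - V_\circ) \quad \text{strongly in } L^r(\Omega), \ r \in [1, \infty), \text{ as } i \to \infty.
\]
b) If $\underline{g} \in \mathbb{R}$ then $h_i \to h$ uniformly on compact subsets of $\mathbb{R}$ as $i \to \infty$, which gives via $\|\alpha_{i1} - V_i\|_{L^\infty} \le K$ and convergence almost everywhere of $\alpha_{i1} - V_i$,
\[
n_i \to n_\circ = h(\alpha_{1\circ} - V_\circ) \quad \text{strongly in } L^r(\Omega), \ r \in [1, \infty), \text{ as } i \to \infty.
\]
In analogy we get in both cases
\[
p_i \to p_\circ = h(\alpha_{2\circ} + V_\circ) \quad \text{strongly in } L^r(\Omega), \ r \in [1, \infty), \text{ as } i \to \infty.
\]

Step 4. It remains to prove that $(n_\circ, p_\circ)$ is the minimizer of $E_\circ^\lambda$ in $\Gamma_\circ$. (By strict convexity of $E_\circ^\lambda$ there is at most one minimizer.) As shown in Step 3 the triple $(n_\circ, p_\circ, V_\circ)$ satisfies the variational inequalities (4). Now it is an easy exercise to verify for all $(\nu, \pi) \in \Gamma_\circ$,
\[
\liminf_{\vartheta \to 0} \frac{E_\circ^\lambda\bigl(n_\circ + \vartheta(\nu - n_\circ), p_\circ + \vartheta(\pi - p_\circ)\bigr) - E_\circ^\lambda(n_\circ, p_\circ)}{\vartheta} \ge 0.
\]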

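The convexity of $E_\circ^\lambda$ implies that $(n_\circ, p_\circ)$ is a minimizer of $E_\circ^\lambda$ in $\Gamma_\circ$. The regularity results stated in Lemma 1 follow from standard theory [8].

Proof of Theorem 2. The proof is divided into two steps. In the first step strong convergence of $n_\varepsilon, p_\varepsilon$ in $H^1(\Omega)$ as $\varepsilon \to 0$ is proven. Then uniform $L^\infty$-estimates are established.

Step 1. Various $\varepsilon$-independent positive constants are denoted by $K$. We note that
\[
E_\varepsilon^\lambda(n_\varepsilon, p_\varepsilon) - E_\circ^\lambda(n_\varepsilon, p_\varepsilon) = \varepsilon^2 \int_\Omega |\nabla \sqrt{n_\varepsilon}|^2 + \varepsilon^2 \int_\Omega |\nabla \sqrt{p_\varepsilon}|^2 \ge 0
\]
for all $\varepsilon > 0$. Due to $\sqrt{n_\circ}, \sqrt{p_\circ} \in H^1(\Omega)$, for all $\varepsilon > 0$,
\[
E_\varepsilon^\lambda(n_\varepsilon, p_\varepsilon) \le E_\varepsilon^\lambda(n_\circ, p_\circ) = \varepsilon^2 \int_\Omega |\nabla \sqrt{n_\circ}|^2 + \varepsilon^2 \int_\Omega |\nabla \sqrt{p_\circ}|^2 + E_\circ^\lambda(n_\circ, p_\circ),
\]
as well as $E_\circ^\lambda(n_\circ, p_\circ) \le E_\circ^\lambda(n_\varepsilon, p_\varepsilon)$. Combining these estimates we get for all $\varepsilon > 0$,
\[
\int_\Omega |\nabla \sqrt{n_\varepsilon}|^2 + \int_\Omega |\nabla \sqrt{p_\varepsilon}|^2 \le \int_\Omega |\nabla \sqrt{n_\circ}|^2 + \int_\Omega |\nabla \sqrt{p_\circ}|^2,
\]
and due to $\|\sqrt{n_\varepsilon}\|_{L^2}^2 = N$, $\|\sqrt{p_\varepsilon}\|_{L^2}^2 = P$ this implies $\|\sqrt{n_\varepsilon}\|_{H^1}, \|\sqrt{p_\varepsilon}\|_{H^1} \le K$. Passing to a subnet one has
\[
\sqrt{n_\varepsilon} \to \sqrt{n_*}, \quad \sqrt{p_\varepsilon} \to \sqrt{p_*} \quad \text{weakly in } H^1(\Omega), \text{ as } \varepsilon \to 0.
\]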


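The compactness of the embedding $H^1(\Omega) \to L^6(\Omega)$ gives
\[
n_\varepsilon \to n_*, \quad p_\varepsilon \to p_* \quad \text{strongly in } L^3(\Omega), \text{ as } \varepsilon \to 0.
\]
This convergence implies by (A)b) that
\[
V_\varepsilon \to V_* \quad \text{strongly in } L^\infty(\Omega), H^1(\Omega), \text{ as } \varepsilon \to 0,
\]
where $V_* = V[n_* - p_* - C]$. To prove $n_* = n_\circ$, $p_* = p_\circ$ note that
\[
E_\circ^\lambda(n_\circ, p_\circ) \le \liminf_{\varepsilon\to0} E_\circ^\lambda(n_\varepsilon, p_\varepsilon) \le \liminf_{\varepsilon\to0} E_\varepsilon^\lambda(n_\varepsilon, p_\varepsilon) \le \limsup_{\varepsilon\to0} E_\varepsilon^\lambda(n_\varepsilon, p_\varepsilon) \le \liminf_{\varepsilon\to0} E_\varepsilon^\lambda(n_\circ, p_\circ) = E_\circ^\lambda(n_\circ, p_\circ).
\]
Hence $E_\circ^\lambda(n_\circ, p_\circ) = \lim_{\varepsilon\to0} E_\varepsilon^\lambda(n_\varepsilon, p_\varepsilon)$. On the other hand by the weakly sequential $L^2(\Omega)$-continuity of the functional $E_\circ^\lambda$,
\[
E_\circ^\lambda(n_*, p_*) \le \liminf_{\varepsilon\to0} E_\circ^\lambda(n_\varepsilon, p_\varepsilon) \le \lim_{\varepsilon\to0} E_\varepsilon^\lambda(n_\varepsilon, p_\varepsilon) = E_\circ^\lambda(n_\circ, p_\circ),
\]
so $(n_*, p_*)$ is a minimizer of $E_\circ^\lambda$ in $\Gamma_\circ$. (Obviously, $(n_*, p_*) \in \Gamma_\circ$.) By uniqueness of the minimizer of $E_\circ^\lambda$ in $\Gamma_\circ$ one has $n_* = n_\circ$, $p_* = p_\circ$.

Step 2. As shown in Step 1 we have $\|V_\varepsilon\|_{L^\infty} \le K$. We observe by strong convergence of $\sqrt{n_\varepsilon}$ to $\sqrt{n_\circ}$ in $L^1(\Omega)$ and $\int_\Omega \sqrt{n_\circ} > 0$ that there exists an $\varepsilon^* > 0$ such that for all $\varepsilon \in (0, \varepsilon^*)$ the estimate $\int_\Omega \sqrt{n_\varepsilon} \ge K$ holds. For $\varepsilon < \varepsilon^*$ set $m_\varepsilon \equiv N / \int_\Omega \sqrt{n_\varepsilon}$. We observe that $m_\varepsilon \le K$ for all $\varepsilon < \varepsilon^*$. This allows us to proceed as in the proof of Lemma 4 in [17] to get the estimate $\int_\Omega n_\varepsilon g_1(n_\varepsilon) \le K$ for all $\varepsilon < \varepsilon^*$. Using $\sqrt{n_\varepsilon}$ as test function in the first equation of (2) we get
\[
\alpha_{1\varepsilon} N = \varepsilon^2 \int_\Omega |\nabla \sqrt{n_\varepsilon}|^2 + \int_\Omega n_\varepsilon V_\varepsilon + \int_\Omega n_\varepsilon g_1(n_\varepsilon),
\]
and therefore by previous estimates $|\alpha_{1\varepsilon}| \le K$ for all $\varepsilon < \varepsilon^*$. Using the maximum principle and the monotonicity of $g_1$ in the first equation of (2) it follows that $\overline{n}_\varepsilon \equiv \sup n_\varepsilon > 0$ satisfies the inequality $g_1(\overline{n}_\varepsilon) \le \alpha_{1\varepsilon} - \underline{V}_\varepsilon$, where $\underline{V}_\varepsilon \equiv \inf V_\varepsilon$. Hence $\overline{n}_\varepsilon \le h_1(\alpha_{1\varepsilon} - \underline{V}_\varepsilon) \le K$ for all $\varepsilon < \varepsilon^*$, because $\|V_\varepsilon\|_{L^\infty} \le K$. This settles by non-negativity $\|n_\varepsilon\|_{L^\infty} \le K$. If $\underline{g}_1 = -\infty$ we can again apply the maximum principle in the first equation of (2) to get for $\underline{n}_\varepsilon \equiv \inf n_\varepsilon > 0$ in analogy for all $\varepsilon < \varepsilon^*$ the estimate $\underline{n}_\varepsilon \ge h_1(\alpha_{1\varepsilon} - \overline{V}_\varepsilon) \ge K$, where $\overline{V}_\varepsilon \equiv \sup V_\varepsilon$. The $L^\infty$-estimates concerning $p_\varepsilon, p_\circ$ follow in analogy. Finally the regularity results are consequences of standard theory [8].

3.3. Proofs of Subsection 2.3.

Proof of Lemma 2. We rewrite the minimization problem as follows. The functional
\[
E(\rho) \equiv \int_\Omega G_1(C^+ + \rho) + \int_\Omega G_2(C^- + \rho)
\]
is to be minimized in
\[
\Gamma \equiv \Bigl\{ \rho \in L^1(\Omega) : \rho \ge 0, \ \int_\Omega \rho = N - \int_\Omega C^+ \Bigr\}.
\]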


Due to (A)d) we have $\int_\Omega C^+ < N$ and therefore $\Gamma \neq \{0\}$. As a strictly convex functional $E$ possesses at most one minimizer. We introduce the function
\[
g : \Omega \times [0, \infty) \to [-\infty, +\infty), \qquad (x, s) \mapsto g_1(C^+(x) + s) + g_2(C^-(x) + s).
\]
It is readily seen that for fixed $x \in \Omega$ the function $g(x, \cdot)$ is strictly monotone increasing and continuous. Furthermore, for fixed $x \in \Omega$ we have $\lim_{s\to\infty} g(x, s) = \infty$. This allows us to define for fixed $x \in \Omega$ the function
\[
r(x, \cdot) : \mathbb{R} \to [0, +\infty), \qquad \gamma \mapsto
\begin{cases}
0 & \text{if } \gamma \le g(x, 0),\\
[g(x, \cdot)]^{-1}(\gamma) & \text{if } \gamma > g(x, 0).
\end{cases}
\]
For fixed $x \in \Omega$ the function $r(x, \cdot)$ is continuous and monotone increasing. Given $\gamma \in \mathbb{R}$ we note that $r(\cdot, \gamma) \in L^\infty(\Omega)$ as well as
\[
\lim_{\gamma\to-\infty} \sup_{x\in\Omega} r(x, \gamma) = 0, \qquad \lim_{\gamma\to\infty} \inf_{x\in\Omega} r(x, \gamma) = \infty,
\]
which gives
\[
\lim_{\gamma\to-\infty} \int_\Omega r(x, \gamma) = 0, \qquad \lim_{\gamma\to\infty} \int_\Omega r(x, \gamma) = \infty.
\]
Furthermore the map $\gamma \mapsto \int_\Omega r(x, \gamma)$ is continuous. Hence there exists a $\gamma^* \in \mathbb{R}$ such that $\int_\Omega r(x, \gamma^*) = N - \int_\Omega C^+$. Set $r^*(x) = r(x, \gamma^*)$ and $n_c = C^+ + r^*$, $p_c = C^- + r^*$. Then $g_1(C^+ + r^*) + g_2(C^- + r^*) \ge \gamma^*$, where equality holds whenever $r^* > 0$. Since $r^*$ does not vanish identically we have by strict monotonicity of $g$ the estimate $\gamma^* > g_1(0) + g_2(0)$, which proves $\mathrm{meas}(\{n_c = 0\} \cap \{p_c = 0\}) = 0$. If the function $n_c p_c$ vanishes identically on $\Omega$ then by $n_c = C^+ + r^*$ and $p_c = C^- + r^*$ the identity $(C^+ + r^*)(C^- + r^*) = 0$ follows, which gives due to $C^+ C^- = 0$ the contradiction $r^*(|C| + r^*) = 0$, i.e. $r^* = 0$. We have
\[
\liminf_{\vartheta\to0} \frac{E(r^* + \vartheta(\rho - r^*)) - E(r^*)}{\vartheta} \ge 0
\]
for all $\rho \in \Gamma$. Hence $r^*$ is a minimizer of $E$ in $\Gamma$. The remaining assertions of Lemma 2 follow by straightforward verifications.

Proof of Lemma 3. If $\|f\|_{L^1} = 0 = \lim_{n\to\infty} \|f_n\|_{L^1}$, then $f_n \to 0 = f$ strongly in $L^1(\Omega)$ and there is nothing to do. If $\|f\|_{L^1} \equiv K > 0$, suppose by contradiction that there exists an $\varepsilon \in (0, 8K)$ such that $\|f_n - f\|_{L^1} > \varepsilon$ for a subsequence $n$. Set $g_n \equiv f_n - f$. Then $f_n - f = g_n^+ - g_n^-$ and $f_n + g_n^- = f + g_n^+$. By non-negativity of $f_n, f$ and $\int_\Omega f_n \to \int_\Omega f$ as $n \to \infty$ one gets $\lim_{n\to\infty} \int_\Omega g_n^+ = \lim_{n\to\infty} \int_\Omega g_n^-$. On the other hand $\varepsilon < \int_\Omega |f_n - f| = \int_\Omega g_n^+ + \int_\Omega g_n^-$ for all $n \in \mathbb{N}$. Hence $\lim_{n\to\infty} \int_\Omega g_n^- \ge \frac{\varepsilon}{2}$ and therefore $\int_\Omega g_n^- \ge \frac{\varepsilon}{4}$ for a subsequence $n$. Choose $M_\varepsilon > 0$ such that $\int_{\{f > M_\varepsilon\}} f < \frac{1}{8}\varepsilon$ and put $\Omega_\varepsilon \equiv \{f \le M_\varepsilon\}$, which has nonzero measure:

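\[
0 < \int_\Omega f - \int_{\{f > M_\varepsilon\}} f = \int_{\Omega_\varepsilon} f.
\]
Furthermore $\int_{\{f > M_\varepsilon\}} g_n^- \le \int_{\{f > M_\varepsilon\}} f < \frac{1}{8}\varepsilon$, because of $0 \le f_n = f + g_n^+ - g_n^-$, and either $g_n^+ = 0$, which implies $f \ge g_n^-$, or $g_n^+ > 0$, which gives $0 = g_n^- \le f$. But then
\[
\frac{1}{8}\varepsilon < \int_\Omega g_n^- - \int_{\{f > M_\varepsilon\}} g_n^- = \int_{\Omega_\varepsilon} g_n^-.
\]
Now set $\delta_\varepsilon \equiv \varepsilon / (16\, \mathrm{meas}(\Omega_\varepsilon))$ and define $C_n \equiv \{g_n^- \ge \delta_\varepsilon\} \cap \Omega_\varepsilon$, which has nonzero measure:
\[
\frac{1}{8}\varepsilon < \int_{\Omega_\varepsilon} g_n^- = \int_{\Omega_\varepsilon \setminus C_n} g_n^- + \int_{C_n} g_n^- \le \frac{\varepsilon}{16} + \int_{C_n} g_n^-.
\]
Since $0 \le f_n = f + g_n^+ - g_n^-$ and $g_n^+ \equiv 0$ on $C_n$, one has $0 < \delta_\varepsilon \le g_n^- \le f \le M_\varepsilon$ almost everywhere on $C_n$. Set $R_\varepsilon \equiv \{(u, v) \in \mathbb{R}^2 : \delta_\varepsilon \le u \le v \le M_\varepsilon\}$ and define
\[
F : R_\varepsilon \to \mathbb{R}, \qquad (u, v) \mapsto \bigl( \vartheta G(v) + (1 - \vartheta) G(v - u) - G(v - (1 - \vartheta) u) \bigr) / u.
\]
Since $G$ is strictly convex and $0 < \delta_\varepsilon \le u \le v \le M_\varepsilon$, it follows that $F > 0$ on $R_\varepsilon$. Furthermore $G$ is continuous and so is $F$ on the compact set $R_\varepsilon$. Hence there exists a $C_\varepsilon > 0$ such that $F \ge C_\varepsilon$ on $R_\varepsilon$. But then
\[
\begin{aligned}
\int_\Omega \bigl[ \vartheta G(f) + (1 - \vartheta) G(f_n) - G(\vartheta f + (1 - \vartheta) f_n) \bigr]
&\ge \int_{C_n} \bigl[ \vartheta G(f) + (1 - \vartheta) G(f + g_n^+ - g_n^-) - G\bigl(\vartheta f + (1 - \vartheta) f + (1 - \vartheta)(g_n^+ - g_n^-)\bigr) \bigr]\\
&= \int_{C_n} F(g_n^-, f)\, g_n^- \ge C_\varepsilon \int_{C_n} g_n^- \ge \varepsilon C_\varepsilon / 16 > 0,
\end{aligned}
\]
because $g_n^+ \equiv 0$ on $C_n$. Hence we get the contradiction
\[
L = \lim_{n\to\infty} \int_\Omega G(\vartheta f + (1 - \vartheta) f_n) \le -\varepsilon C_\varepsilon / 16 + \lim_{n\to\infty} \vartheta \int_\Omega G(f) + \lim_{n\to\infty} (1 - \vartheta) \int_\Omega G(f_n) = -\varepsilon C_\varepsilon / 16 + L.
\]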

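Proof of Corollary 2. Set $\vartheta_n \equiv \frac{1}{2}$. Then by convexity
\[
\limsup_{n\to\infty} \int_\Omega G\bigl(\tfrac{1}{2} f + \tfrac{1}{2} f_n\bigr) \le \frac{1}{2} \int_\Omega G(f) + \frac{1}{2} \lim_{n\to\infty} \int_\Omega G(f_n) = L,
\]
while by weak lower semi-continuity $L = \int_\Omega G(f) \le \liminf_{n\to\infty} \int_\Omega G\bigl(\tfrac{1}{2} f + \tfrac{1}{2} f_n\bigr)$. Hence $L = \lim_{n\to\infty} \int_\Omega G\bigl(\tfrac{1}{2} f + \tfrac{1}{2} f_n\bigr)$. On the other hand the weak $L^1$-convergence implies that $\|f_n\|_{L^1} = \int_\Omega f_n \to \int_\Omega f = \|f\|_{L^1}$ as $n \to \infty$. The result follows from Lemma 3.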
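The role of the energy condition in Lemma 3 and Corollary 2 can be checked numerically on a standard oscillating sequence. The sketch below is an illustration only; the choices $\Omega = (0, 1)$, $f \equiv 1$, $f_n(x) = 1 + \sin(2\pi n x)$, $G(t) = t^2$ and the helper names are assumptions made here, not taken from the paper. The sequence is non-negative, has the same $L^1$-norm as $f$ and converges weakly to $f$, but $\int G(f_n)$ stays at distance $1/2$ from $\int G(f)$, and accordingly $\|f_n - f\|_{L^1} = 2/\pi$ does not tend to zero.

```python
import math

def midpoint_integral(func, cells=20000):
    """Midpoint rule on (0, 1) with the given number of cells."""
    width = 1.0 / cells
    return width * sum(func((k + 0.5) * width) for k in range(cells))

def oscillation_demo(n):
    """Return (energy gap, L1 distance) for f = 1 and f_n = 1 + sin(2 pi n x)."""
    G = lambda t: t * t                                  # strictly convex energy density
    f = lambda x: 1.0                                    # weak limit
    f_n = lambda x: 1.0 + math.sin(2.0 * math.pi * n * x)
    energy_gap = midpoint_integral(lambda x: G(f_n(x))) - midpoint_integral(lambda x: G(f(x)))
    l1_distance = midpoint_integral(lambda x: abs(f_n(x) - f(x)))
    return energy_gap, l1_distance

for n in (1, 4, 16):
    print(n, oscillation_demo(n))
```

Both quantities are independent of $n$ here, which is consistent with the statements above: strong convergence can only be concluded once the energies converge, and this sequence violates exactly that hypothesis.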


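Proof of Theorem 3. Various $\lambda$-independent positive constants are denoted by $K$. We note that
\[
E_{\circ\circ}(n_c, p_c) \le E_{\circ\circ}(n_\lambda, p_\lambda) \le E_\circ^\lambda(n_\lambda, p_\lambda) \le E_\circ^\lambda(n_c, p_c),
\]
which gives
\[
E_{\circ\circ}(n_c, p_c) \le \limsup_{\lambda\to0} E_{\circ\circ}(n_\lambda, p_\lambda) \le \liminf_{\lambda\to0} E_\circ^\lambda(n_\lambda, p_\lambda) \le \limsup_{\lambda\to0} E_\circ^\lambda(n_\lambda, p_\lambda) \le \limsup_{\lambda\to0} E_\circ^\lambda(n_c, p_c) = E_{\circ\circ}(n_c, p_c),
\]
and therefore
\[
E_{\circ\circ}(n_c, p_c) = \lim_{\lambda\to0} E_\circ^\lambda(n_\lambda, p_\lambda),
\]
as well as $\frac{\lambda^2}{2} \int_\Omega |\nabla V_\lambda|^2 \le K$. As $\|n_\lambda\|_{L^\infty}, \|p_\lambda\|_{L^\infty} \le K$, see Lemma 1, one has by passing to a subnet $n_\lambda \to n_*$, $p_\lambda \to p_*$ weak* in $L^\infty(\Omega)$ as well as $\lambda V_\lambda \to W_\circ$ weakly in $H^1(\Omega)$. It follows for all test functions $\varphi \in H^1(\Omega)$,
\[
0 = \lim_{\lambda\to0} \lambda^2 \int_\Omega \nabla V_\lambda \nabla \varphi = \lim_{\lambda\to0} \int_\Omega (n_\lambda - p_\lambda - C) \varphi,
\]
which implies $n_\lambda - p_\lambda - C \to 0$ weakly in $H^1(\Omega)$ as $\lambda \to 0$ and therefore $(n_*, p_*) \in \Gamma_{\circ\circ}$. Thanks to weak sequential lower semi-continuity in $L^2(\Omega)$ one has $\int_\Omega G_1(n_*) \le \liminf_{\lambda\to0} \int_\Omega G_1(n_\lambda)$, $\int_\Omega G_2(p_*) \le \liminf_{\lambda\to0} \int_\Omega G_2(p_\lambda)$, and therefore
\[
E_{\circ\circ}(n_*, p_*) \le \limsup_{\lambda\to0} E_{\circ\circ}(n_\lambda, p_\lambda) \le E_{\circ\circ}(n_c, p_c).
\]
But $(n_c, p_c)$ is the unique minimizer of $E_{\circ\circ}$ in $\Gamma_{\circ\circ}$. Hence $n_* = n_c$, $p_* = p_c$, and as a consequence of $E_{\circ\circ}(n_*, p_*) = E_{\circ\circ}(n_c, p_c) = \lim_{\lambda\to0} E_\circ^\lambda(n_\lambda, p_\lambda)$ one gets
\[
\lim_{\lambda\to0} \frac{\lambda^2}{2} \int_\Omega |\nabla V_\lambda|^2 = 0,
\]
as well as $\int_\Omega G_1(n_\lambda) \to \int_\Omega G_1(n_c)$, $\int_\Omega G_2(p_\lambda) \to \int_\Omega G_2(p_c)$, as $\lambda \to 0$. Now it follows from Corollary 2 that
\[
n_\lambda \to n_c, \quad p_\lambda \to p_c \quad \text{strongly in } L^1(\Omega), \text{ as } \lambda \to 0,
\]
and therefore $n_\lambda \to n_c$, $p_\lambda \to p_c$ a.e. on $\Omega$ for a subnet $\lambda$. Due to convergence almost everywhere and convergence weak* in $L^\infty(\Omega)$ we have
\[
n_\lambda \to n_c, \quad p_\lambda \to p_c \quad \text{strongly in } L^r(\Omega), \ r \in [1, \infty), \text{ as } \lambda \to 0.
\]
The uniform $L^\infty$-estimates on $n_\lambda, p_\lambda$ imply $g_1(n_\lambda), g_2(p_\lambda) \le K$. Hence by integration of (4) and $\int_\Omega V_\lambda = 0$ we get upper estimates for the Lagrange multipliers: $\alpha_{1\lambda}, \alpha_{2\lambda} \le K$. Due to convergence almost everywhere and due to the continuity of $g_{1,2}$ we have
\[
g_1(n_\lambda) + g_2(p_\lambda) \to g_1(n_c) + g_2(p_c) \ge \gamma \quad \text{a.e. on } \Omega, \text{ as } \lambda \to 0.
\]
Hence there exists a $\lambda^* \in (0, \infty)$ such that for all $\lambda \in (0, \lambda^*)$ the estimate $g_1(n_\lambda) + g_2(p_\lambda) \ge -K$ holds a.e. on $\Omega$. Hence, if $\underline{g}_1 = -\infty$, then there exists a $K > 1$ such that $1/K \le n_\lambda \le K$ for all $\lambda < \lambda^*$, and an equivalent estimate follows for $p_\lambda$ whenever $\lim_{u\to0} g_2(u) = -\infty$. To establish lower estimates for $\alpha_{1\lambda}, \alpha_{2\lambda}$, assume by contradiction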


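that for a subnet $\lim_{\lambda\to0} \alpha_{1\lambda} = -\infty$. Then on the set $\{n_\lambda > 0\}$, whose measure is at least $N/(P' + \sup C)$, the equality $V_\lambda = \alpha_{1\lambda} - g_1(n_\lambda)$ holds, which gives $V_\lambda \to -\infty$ uniformly on $\{n_\lambda > 0\}$. Hence due to $\int_\Omega V_\lambda = 0$ we have $\int_{\{n_\lambda = 0\}} V_\lambda \to \infty$, leading to $\lim_{\lambda\to0} \sup V_\lambda = \infty$. We have due to (4) the inequality $\alpha_{2\lambda} \le -V_\lambda + g_2(p_\lambda)$, and therefore $\lim_{\lambda\to0} \alpha_{2\lambda} = -\infty$, which settles in analogy $V_\lambda \to +\infty$ uniformly on $\{p_\lambda > 0\}$. Hence by continuity of $n_\lambda, p_\lambda, V_\lambda$ we have $\{n_\lambda > 0\} \cap \{p_\lambda > 0\} = \emptyset$, and therefore $n_\lambda p_\lambda = 0$ for all sufficiently small $\lambda$. Due to convergence almost everywhere it follows that $n_c p_c = 0$, which contradicts Lemma 2. This and an equivalent investigation of $\alpha_{2\lambda}$ settles $|\alpha_{1\lambda}|, |\alpha_{2\lambda}| \le K$, and we conclude from (4) that $\alpha_{1\lambda} - g_1(n_\lambda) \le V_\lambda \le g_2(p_\lambda) - \alpha_{2\lambda}$, which gives $\|V_\lambda\|_{L^\infty} \le K$ for all $\lambda \le \lambda^*$, which settles by passing to a subnet
\[
V_\lambda \to V_* \quad \text{weak* in } L^\infty(\Omega), \text{ as } \lambda \to 0,
\]
as well as $\int_\Omega V_* = 0$. Passing to another subnet we have, due to the uniform estimates on $\alpha_{1\lambda}, \alpha_{2\lambda}$, the existence of $\beta_1^*, \beta_2^* \in \mathbb{R}$ such that $\alpha_{1\lambda} \to \beta_1^*$ and $\alpha_{2\lambda} \to \beta_2^*$ as $\lambda \to 0$. Due to strong convergence in $L^1(\Omega)$ and due to Egorov's Theorem there exists for each $\delta > 0$ an $\Omega_\delta \subset \Omega$ with $\mathrm{meas}(\Omega \setminus \Omega_\delta) \le \delta$ such that
\[
g_1(n_\lambda) - \alpha_{1\lambda} \to g_1(n_c) - \beta_1^* \quad \text{uniformly on } \Omega_\delta, \text{ as } \lambda \to 0.
\]
Hence
\[
V_\lambda = \alpha_{1\lambda} - g_1(n_\lambda) \to V_* \quad \text{uniformly on } \Omega_\delta \cap \{n_c > 0\}, \text{ as } \lambda \to 0,
\]
which settles $V_* = V_c + \beta_1^* - \beta_1$ almost everywhere on $\{n_c > 0\}$. A similar argument gives $V_* = V_c - \beta_2^* + \beta_2$ almost everywhere on $\{p_c > 0\}$. As shown in Lemma 2 the function $n_c p_c$ does not vanish identically on $\Omega$, which settles $\beta_1^* - \beta_1 = -\beta_2^* + \beta_2$, and $\beta_2^* = \gamma - \beta_1^*$. As $\{n_c = 0\} \subset \{p_c > 0\}$, see Lemma 2, we conclude via $\int_\Omega V_* = \int_\Omega V_c = 0$ that $0 = (\beta_1^* - \beta_1)\, \mathrm{meas}(\{n_c > 0\}) + (\beta_2 - \beta_2^*)\, \mathrm{meas}(\{n_c = 0\})$, and therefore $\beta_1^* = \beta_1$ and $\beta_2^* = \beta_2$, and therefore $V_* = V_c$ on $\{n_c > 0\} \cup \{p_c > 0\} = \Omega$. Furthermore, as seen above, we have
\[
V_\lambda \to V_c \quad \text{almost everywhere on } \{n_c > 0\} \cup \{p_c > 0\}, \text{ as } \lambda \to 0.
\]
This settles in connection with weak* convergence in $L^\infty(\Omega)$,
\[
V_\lambda \to V_c \quad \text{strongly in } L^r(\Omega), \ r \in [1, \infty), \text{ as } \lambda \to 0,
\]
and finishes the proof of Theorem 3.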

Acknowledgement. The author acknowledges support from EC-network, contract # ERBCHRXCT 930413 and support from the Deutsche Forschungsgemeinschaft, project MA 1662/2-1 entitled “Mathematische Analysis und Numerik von Quantenhydrodynamischen Modellen der Halbleiterphysik (QHD)”. The author is indebted to the ENS, Departement de Mathématique, Cachan, and the Université Paul Sabatier, Toulouse, where parts of this research were carried out.


References

1. Adams, R.: Sobolev Spaces. New York: Academic Press, 1975
2. Ancona, M.G. and Iafrate, G.J.: Quantum Correction to the Equation of State of an Electron Gas in a Semiconductor. Phys. Rev. B 39 (13), 9536–9540 (1989)
3. Arnold, A., Markowich, P.A. and Mauser, P.A.: The one-dimensional periodic Bloch-Poisson equation. M3AS 1 (1), 83–112 (1991)
4. Brézis, H.: Convergence in D' and in L1 under Strict Convexity. Technical Report R 93011, Laboratoire d'Analyse Numérique, Université Pierre et Marie Curie, 4, place Jussieu, 75252 Paris Cedex 05, France, 1993
5. Brezzi, F. and Gilardi, G.: Fundamentals of P.D.E for Numerical Analysis. Technical report, Consiglio Nazionale delle Richerche, Corso C. Alberto 5, 27100 Pavia, Italy, 1984
6. Ghosh, S.K. and Deb, B.M.: Density, Density-Functionals and Electron Fluids. Physics Reports (Review Section of Physics Letters) 92 (1), 1–44 (1982)
7. Giaquinta, M.: Multiple Integrals in the Calculus of Variations and Nonlinear Elliptic Systems. Annals of Mathematical Studies. Princeton, NJ: University Press, 1983
8. Gilbarg, D. and Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Berlin–Heidelberg–New York: Springer, 2nd edition, 1983
9. Jackson, J.D.: Klassische Elektrodynamik. Amsterdam: de Gruyter, 2nd edition, 1983
10. Kinderlehrer, D. and Stampacchia, G.: An Introduction to Variational Inequalities and Their Applications. London–New York: Academic Press, 1980
11. Lions, P.L.: On the Existence of Positive Solutions of Semilinear Elliptic Equations. SIAM Review 24, 441–467 (1982)
12. Markowich, P.A.: The Stationary Semiconductor Device Equations. Berlin–Heidelberg–New York: Springer, 1986
13. Markowich, P.A.: Boltzmann Distributed Quantum Steady States and Their Classical Limit. Forum Math. 6, 1–33 (1994)
14. Markowich, P.A., Ringhofer, C.A. and Schmeiser, C.: Semiconductor Equations. Berlin–Heidelberg–New York: Springer, 1990
15. Nier, F.: A Stationary Schrödinger-Poisson System Arising from the Modelling of Electric Devices. Forum Mathematicum 2 (5), 489–510 (1990)
16. Nier, F.: A Variational Formulation of Schrödinger-Poisson Systems in Dimension d ≤ 3. Comm PDE 18 (7–8), 1125–1147 (1993)
17. Pacard, F. and Unterreiter, A.: A Variational Analysis of the Thermal Equilibrium State of Charged Quantum Fluids. Comm PDE 20, 885–900 (1995)
18. Unterreiter, A.: The Thermal Equilibrium State of Semiconductor Devices. Appl. Math. Lett. 7 (6), 39–43 (1994)
19. Ziemer, W.P.: Weakly Differentiable Functions. Berlin–Heidelberg–New York: Springer, 1989

Communicated by J.L. Lebowitz

Commun. Math. Phys. 188, 89 – 119 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Metropolis Dynamics Relaxation via Nucleation and Growth

Pouria Dehghanpour, Roberto H. Schonmann*

Department of Mathematics, UCLA, Los Angeles, CA 90095, USA

Received: 11 June 1996 / Accepted: 14 January 1997

* The work of both authors was supported by the N.S.F. through grant DMS 94-00644.

Abstract: We consider the Ising model with Metropolis dynamics on $\mathbb{Z}^2$ under a small positive external field $h$. We show that the relaxation time, i.e., the time it takes for the system to reach the (+)-phase starting from all spins $-1$, scales as $e^{\beta\kappa_c}$ as the temperature $1/\beta \to 0$, where $\kappa_c = \frac{\Gamma + (2-h)}{3}$ and $\Gamma$ is the energy of a "critical" droplet. The factor $\frac{1}{3}$ originates from droplet growth and is related to the dimension of the lattice, while the term $(2-h)$ is related to the rate of growth of highly supercritical droplets.

1. Introduction

This paper is motivated by the real-world phenomenon of metastability and its decay by means of nucleation and growth of the stable phase in the midst of the metastable one. We refer the reader to [PL] and [GD] for an introduction to this phenomenon and its theoretical investigation. In recent years substantial progress has been made on the understanding of metastability at a mathematically rigorous level, in the realm of spin flip interacting particle systems. Consulting [Sch1] and [Sch2] the reader will find some of these recent results and references to various other papers on the subject. Of special relevance for the motivation behind the current paper are Sect. 4 of [Sch1] and Sect. 8 of [Sch2].

In contrast to a great deal of the recent work on metastability, here we will consider an infinite system, so that not only nucleation, but also droplet growth becomes relevant. We will consider the behavior of the two dimensional Ising model on $\mathbb{Z}^2$ evolving with Metropolis dynamics, under a fixed positive external field $h$ satisfying $0 < h < 2$, as the temperature $1/\beta$ is scaled to 0. The following is a brief recap of the standard setup of this model; for more details, the reader is referred to [Sch1] or [Sch2] and [NS]; for a systematic overview of interacting particle systems in general, see [Lig] and [Dur].

At each site in $\mathbb{Z}^2$ there is a spin which can take values $-1$ and $+1$. The configurations will therefore be elements of the set $\Omega = \{-1,+1\}^{\mathbb{Z}^2}$. Given $\sigma \in \Omega$, we write $\sigma(x)$ for the spin at the site $x \in \mathbb{Z}^2$. Two configurations are specially relevant: $-1$ and $+1$, which are, respectively, the ones with all spins $-1$ and $+1$. When these configurations appear as a subscript or superscript, we will usually abbreviate them by, respectively, $-$ and $+$. The single spin space, $\{-1,+1\}$, is endowed with the discrete topology and $\Omega$ is endowed with the corresponding product topology. The following definition will be important when we introduce finite systems with boundary conditions later on; given $\Phi \subset \mathbb{Z}^2$ and a configuration $\eta \in \Omega$, we introduce
$$\Omega_{\Phi,\eta} = \{\sigma \in \Omega : \sigma(x) = \eta(x) \text{ for all } x \notin \Phi\}.$$
To each configuration $\sigma$ we associate a formal Hamiltonian
$$H(\sigma) = -\frac{1}{2}\sum_{x,y \text{ n.n.}} \sigma(x)\sigma(y) - \frac{h}{2}\sum_{x} \sigma(x),$$

where the first sum is taken over all unordered pairs of nearest neighbors $x, y \in \mathbb{Z}^2$. The time evolution is introduced as a spin flip Markov process which is reversible with respect to the corresponding (unique) Gibbs measure at temperature $1/\beta$ (remember $0 < h < 2$ is fixed); specifically, we consider Metropolis dynamics, where the rate at which the spin at a site $x$ in the configuration $\sigma$ flips is given by
$$c_\beta(x,\sigma) = \exp\bigl(-\beta\,(\Delta_x H(\sigma))^+\bigr),$$
where
$$\Delta_x H(\sigma) = \sigma(x)\Bigl(\sum_{y \text{ n.n. of } x} \sigma(y) + h\Bigr),$$

and $(a)^+ = \max\{a, 0\}$ is the positive part of $a$. Note that formally, $\Delta_x H(\sigma) = H(\sigma^x) - H(\sigma)$, where $\sigma^x$ is the configuration $\sigma$ with the spin at $x$ flipped. We have different dynamics for different temperatures $1/\beta$, but in a standard way, we can define all dynamics on the same probability space of Poisson processes and uniform random variables (see, e.g., [Sch1] or [Sch2] for this construction). Denote by $\sigma^\eta_t$ the process at time $t$ starting from the configuration $\eta$; here there is an implicit dependence on the temperature $1/\beta$, which we will omit in the notation. We let $\sigma^\eta_{\Phi,\zeta;t}$ denote the process starting from $\eta$ restricted to the box $\Phi$ with boundary conditions $\zeta$. The flip rates for this process are denoted by $c_{\Phi,\zeta}$. When we omit the boundary conditions or the starting configuration in our notation, it is assumed to be $-1$.

For completeness, we briefly mention the Gibbs states here. In order to give precise definitions, we define, for each set $\Phi \subset \mathbb{Z}^2$ and each boundary condition $\eta \in \Omega$,
$$H_{\Phi,\eta}(\sigma) = -\frac{1}{2}\sum_{\substack{x,y \in \Phi \\ x,y \text{ n.n.}}} \sigma(x)\sigma(y) - \frac{1}{2}\sum_{\substack{x \in \Phi,\, y \notin \Phi \\ x,y \text{ n.n.}}} \sigma(x)\eta(y) - \frac{h}{2}\sum_{x \in \Phi} \sigma(x),$$
where $\sigma \in \Omega$ is a generic configuration. Given $\Phi \subset \mathbb{Z}^2$ and $\eta \in \Omega$, we write
$$Z_{\Phi,\eta} = \sum_{\sigma \in \Omega_{\Phi,\eta}} \exp(-\beta H_{\Phi,\eta}(\sigma))$$
for the partition function. The Gibbs (probability) measure in $\Phi$ with boundary condition $\eta$ under external field $h$ and at temperature $1/\beta$ is now defined on $\Omega$ as
$$\mu_{\Phi,\eta}(\sigma) = \begin{cases} \dfrac{\exp(-\beta H_{\Phi,\eta}(\sigma))}{Z_{\Phi,\eta}}, & \text{if } \sigma \in \Omega_{\Phi,\eta}, \\ 0, & \text{otherwise.} \end{cases}$$
Note that there is an implicit dependence on $\beta$. The Gibbs states for the infinite lattice can be defined in a standard fashion by taking limits of the Gibbs states defined on finite boxes as the size of the box grows.

The states are naturally partially ordered as follows: $\eta \le \eta'$ if $\eta(x) \le \eta'(x)$ for all $x \in \mathbb{Z}^2$. If $\eta \le \eta'$, $\zeta \le \zeta'$, and $\Phi \subset \Phi' \subset \mathbb{Z}^2$, then the following basic-coupling inequalities follow by attractiveness: for all $t \ge 0$,
$$\sigma^{\eta}_{\Phi,\zeta;t} \le \sigma^{\eta'}_{\Phi,\zeta';t}, \tag{1.1}$$
$$\sigma^{\eta}_{\Phi,-;t} \le \sigma^{\eta'}_{\Phi',-;t}, \text{ and} \tag{1.2}$$
$$\sigma^{\eta}_{\Phi,-;t} \le \sigma^{\eta'}_{t}. \tag{1.3}$$
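The Metropolis flip rate defined above can be illustrated with a minimal sketch. The representation of a configuration as the set of its +1 sites (all other sites being −1), the function names, and the sample values of h and β are assumptions made for this illustration; they do not come from the paper.

```python
import math

# Nearest-neighbor offsets on Z^2.
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def delta_H(plus_sites, x, h):
    """Energy change Delta_x H(sigma) = sigma(x) * (sum of neighbor spins + h).

    plus_sites is the set of sites with spin +1; every other site has spin -1
    (a hypothetical encoding chosen for this sketch).
    """
    spin = 1 if x in plus_sites else -1
    neighbor_sum = sum(
        1 if (x[0] + dx, x[1] + dy) in plus_sites else -1
        for dx, dy in NEIGHBORS
    )
    return spin * (neighbor_sum + h)


def flip_rate(plus_sites, x, h, beta):
    """Metropolis rate exp(-beta * (Delta_x H)^+) for flipping the spin at x."""
    return math.exp(-beta * max(delta_H(plus_sites, x, h), 0.0))


if __name__ == "__main__":
    h, beta = 0.5, 3.0  # sample values with 0 < h < 2
    origin = (0, 0)
    # A -1 spin with no, one, or two +1 neighbors.
    print(flip_rate(set(), origin, h, beta))
    print(flip_rate({(1, 0)}, origin, h, beta))
    print(flip_rate({(1, 0), (0, 1)}, origin, h, beta))
```

With this encoding, a −1 spin with exactly one +1 neighbor flips at rate exp(−β(2 − h)), and a −1 spin with two or more +1 neighbors flips at rate 1, since the energy change is then negative.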

It is well known (see [NS] for instance) that the energy barrier for passage from the $-1$ state to the $+1$ state in a finite two dimensional lattice with periodic or (−) boundary conditions under a fixed positive external field $h$ with $0 < h < 2$ is
$$\Gamma = 4L - h(L^2 - L + 1), \quad \text{where} \quad L = \lceil 2/h \rceil.$$
The value of $L$ above is the correct value when $2/h$ is not an integer; if $2/h$ is an integer, then $L = 2/h + 1$. Throughout the proofs, we will assume for simplicity that $2/h$ is not an integer, but note that by monotonicity considerations and the observation that $\Gamma$ is continuous in $h$, the theorem we prove holds for all $h$. The quantity $\Gamma$ is simply the energy of a "critical" droplet of $+1$ spins, which is in the shape of an $L \times (L-1)$ rectangle with an additional $+1$ spin protruding from one of the longer sides. Once a droplet of $+1$ spins in a sea of $-1$ spins becomes larger than this, it is more likely to grow than to shrink. Heuristically, such droplets appear at a rate of $e^{-\beta\Gamma}$. In [NS] it is proved that for a finite system under fixed external field $0 < h < 2$, the relaxation time is asymptotically on the order of $e^{\beta\Gamma}$.

For the infinite system, the relaxation time, i.e., the time it takes for a particular site, say the origin, to be likely to have a $+1$ spin when the system is started from the $-1$ state, is actually much shorter than $e^{\beta\Gamma}$. The reason for this is that a large droplet of $+1$ spins may form far away from the origin and subsequently carry the (+)-phase to the origin. In Metropolis dynamics, the rate at which a $-1$ spin flips to $+1$ when it has one $+1$ neighbor is
$$\varepsilon = e^{-\beta(2-h)},$$
while if there are two or more $+1$ neighbors, this happens at rate 1. Thus the movement of the (+)-phase can be compared to a growth model in which sites become occupied at a small rate $\varepsilon$ when they have one occupied neighbor, and at rate 1 when they have two or more occupied neighbors. This growth model is studied in [KS], and there it is proved that the asymptotic speed of growth of such a model in two dimensions scales as $\varepsilon^{1/2}$ as $\varepsilon \to 0$. Of course, the model studied in [KS] is not a reversible model, as occupied sites remain occupied forever, whereas in the Ising model, $+1$ spins can and will flip back to $-1$ occasionally.

As an intermediate step between the work in [KS] and in the current paper, a simplified nucleation-and-growth model was introduced in [DS] in which this difficulty was not present. In that model, sites become occupied at rate $e^{-\beta\Gamma}$ if they have no occupied neighbors, and at rates $\varepsilon = e^{-\beta\gamma}$ and 1 if they have one or more than one occupied neighbors, respectively. Occupied sites remain occupied forever. Clearly this model is a simplification of the Ising model we are considering. Using results from [KS] and [AL], we proved in [DS] that in dimension 2, the relaxation time for this model scales as
$$t_{\mathrm{rel}} = e^{\beta(\Gamma+\gamma)/3} \tag{1.4}$$
as $\beta \to \infty$, under the assumption that $\Gamma \ge 2\gamma$. To see why this should be the case, one can proceed heuristically as explained in [Sch1] and [Sch2] by computing the volume of the space-time cone of height $t$ with vertex at the origin and base consisting of all points with time coordinate 0 and within a distance $\varepsilon^{1/2} t$ of the origin, and then multiplying this by the nucleation rate $e^{-\beta\Gamma}$. (See [RTMS] for a different view on this heuristics and references on its early history.) The order of magnitude of the relaxation time is given by solving
$$e^{-\beta\Gamma}\, t_{\mathrm{rel}} \left(\varepsilon^{1/2} t_{\mathrm{rel}}\right)^2 = 1,$$
which yields (1.4) above.
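For the reader's convenience, the algebra behind this heuristic can be spelled out. The block below is only a worked version of the equation just stated, writing $\Gamma$ for the energy of the critical droplet and using $\varepsilon = e^{-\beta\gamma}$; no new assumption enters.

```latex
\begin{align*}
e^{-\beta\Gamma}\, t_{\mathrm{rel}} \bigl(\varepsilon^{1/2} t_{\mathrm{rel}}\bigr)^{2} = 1
  &\iff t_{\mathrm{rel}}^{3} = e^{\beta\Gamma}\,\varepsilon^{-1}
   = e^{\beta\Gamma}\, e^{\beta\gamma} = e^{\beta(\Gamma+\gamma)} \\
  &\iff t_{\mathrm{rel}} = e^{\beta(\Gamma+\gamma)/3}.
\end{align*}
```

The exponent 3 counts one time dimension plus two space dimensions of the cone, which is where the factor 1/3 in the relaxation exponent comes from.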
The condition $\Gamma \ge 2\gamma$ is needed to ensure that a supercritical droplet will reach its asymptotic speed of growth in a time which is short compared with $t_{\mathrm{rel}}$. If we set $\gamma = (2-h)$ above, this condition is satisfied since a simple calculation shows that $\Gamma \ge 5(2-h)$ (see Appendix). Using the same heuristic reasoning, therefore, the relaxation time for the Ising model should be $e^{\beta(\Gamma+(2-h))/3}$.

Proving this, on the other hand, is rather involved. First, as was the case in the model studied in [DS], different large clusters of $+1$ spins will interact so that the speed at which the (+)-phase spreads is effectively on the order of 1 at times. To handle this problem, it is necessary to properly rescale the lattice and use an argument of [AL] to control the interaction of the different large clusters of $+1$ spins. Moreover, there are further difficulties in dealing with the Metropolis-Ising model which were absent in [DS]. As a large cluster of $+1$ spins grows, it will run into small (sub-critical) clusters of $+1$ spins, so there is the danger that this phenomenon also causes the (+)-phase to grow faster than expected. Finally, the notion that critical droplets of $+1$ spins are formed at rate $e^{-\beta\Gamma}$ needs to be made rigorous, and this turned out to be less straightforward than one could expect. It is true that from [NS] we know that in a large finite box with periodic or (−) boundary conditions, if we start with all spins $-1$, the configuration with all spins $+1$ is reached in a time of order $e^{\beta\Gamma}$ with overwhelming probability. But it is not the case that starting from the same configuration the probability of creating a critical droplet in a fixed time $t > 0$ is of order $t e^{-\beta\Gamma}$! One can prove that for any fixed $t > 0$ this probability is actually of much lower order. Also if one waits for a time of order $e^{\beta a}$ with a small $a > 0$, it turns out that the probability of creating a critical droplet is of smaller order than $e^{-\beta(\Gamma-a)}$. On the other hand, in Lemma 11, we will show that if $a > h(L-2)$, then this probability of nucleation in time $e^{\beta a}$ is at least as large as expected from an "effective rate of nucleation" $e^{-\beta\Gamma}$. Lemma 3 contains a complementary bound that shows that this "effective rate of nucleation" cannot be of larger order (here $a$ can be arbitrarily small).

We state the theorem in terms of local observables, i.e., real-valued functions $f$ defined on the state space $\Omega$ that depend only on the values of finitely many spins.

Theorem. There is a critical value
$$\kappa_c = \frac{\Gamma + (2-h)}{3}$$
such that for any local observable $f$, if $\tau = e^{\beta\kappa}$, then

(1) $\lim_{\beta\to\infty} E(f(\sigma_\tau)) = f(-1)$ if $\kappa < \kappa_c$,

(2) $\lim_{\beta\to\infty} E(f(\sigma_\tau)) = f(+1)$ if $\kappa > \kappa_c$.

Similarly to Theorem 4 in [Sch1] one may also want to consider a system with periodic or (−) boundary conditions inside a box of sidelength $e^{\beta D}$. From our results and techniques it is a standard matter to show that there is a critical value of $D$ which separates two regimes. When $D > D_c = \Gamma/3 - (2-h)/6$ the relaxation time behaves as $t_{\mathrm{rel}} \sim e^{\beta\kappa_c}$, with $\kappa_c$ as above. On the other hand, if $0 < D < D_c$, then $t_{\mathrm{rel}} \sim e^{\beta\kappa_c(D)}$, with $\kappa_c(D) = \Gamma - 2D$. In this latter case, when a first supercritical droplet is formed, it is likely to grow and invade the whole system before any other critical droplet is formed.

In [SS] nucleation and growth in two-dimensional kinetic Ising models is also analyzed, but at fixed subcritical temperatures. The relaxation from the metastable to the stable state is then studied in the regime in which the external field vanishes, i.e., in the vicinity of the phase coexistence region. The relaxation time is then related to the nucleation and growth of critical droplets which have the Wulff shape. The energy barrier $\Gamma$ is replaced with a free energy barrier related to surface tension and a sharp result similar to our Theorem above is obtained, in which a factor 1/3 is also present, due to droplet growth. Needless to say, most of the technical difficulties in [SS] are greater than the ones in the current paper, since there one is dealing with a fixed temperature setting, and the size of the critical droplet blows up when $h \searrow 0$. We want nevertheless to stress that there is one major source of technical difficulty in our paper which is absent in [SS]. In the regime that we are studying here, the speed of growth of highly supercritical droplets is so slow that the interaction among supercritical droplets and also among supercritical droplets and subcritical ones becomes relevant, and in principle spoils the simple heuristics reviewed above for the computation of the relaxation time.
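The constants appearing in the Theorem and in the finite-box remark can be tabulated with a short sketch. It writes Γ for the energy of the critical droplet and uses exact rational arithmetic so that the case of integer 2/h is handled without rounding issues. The function names and the sample values of h are assumptions made for illustration only.

```python
from fractions import Fraction


def critical_length(h):
    """L = ceil(2/h) when 2/h is not an integer, and 2/h + 1 when it is.

    Both cases coincide with floor(2/h) + 1, which is what is computed here.
    """
    q = Fraction(2) / Fraction(h)
    return q.numerator // q.denominator + 1


def droplet_energy(h):
    """Gamma = 4L - h(L^2 - L + 1), the energy of the critical droplet."""
    h = Fraction(h)
    L = critical_length(h)
    return 4 * L - h * (L * L - L + 1)


def kappa_c(h):
    """Critical exponent (Gamma + (2 - h)) / 3 for the infinite system."""
    return (droplet_energy(h) + (2 - Fraction(h))) / 3


def D_c(h):
    """Critical box exponent Gamma/3 - (2 - h)/6."""
    return droplet_energy(h) / 3 - (2 - Fraction(h)) / 6


def kappa_c_box(h, D):
    """Exponent Gamma - 2D in the small-box regime 0 < D < D_c."""
    return droplet_energy(h) - 2 * Fraction(D)


if __name__ == "__main__":
    for h in (Fraction(3, 2), Fraction(1), Fraction(1, 2)):
        print(h, critical_length(h), droplet_energy(h), kappa_c(h), D_c(h))
```

One consistency check that follows directly from the formulas: at $D = D_c$ the two regimes agree, since $\Gamma - 2D_c = \Gamma/3 + (2-h)/3 = \kappa_c$.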
In the situation studied in [SS], supercritical droplets are shown to grow fast enough so that there is no need there to control the interaction among droplets.

We will need a bit more notation before we begin the proofs. For integer $i$ and $x \in \mathbb{Z}^2$, we let
$$K_i(x) = \{y \in \mathbb{Z}^2 : \|x - y\|_\infty \le i\}$$
be the box of sidelength $2i + 1$ centered at $x$, and for convenience we define
$$\Lambda(\ell) = \text{largest } K_i(0) \text{ which has sidelength not larger than } \ell.$$
Also, we use $e_i$ to denote the unit vector in the positive $i$th coordinate direction.

2. Metastable Regime (Proof of Part 1 of the Theorem)

Throughout this section, we have a fixed
$$\kappa < \kappa_c = \frac{\Gamma + (2-h)}{3},$$
and we let
$$\tau = e^{\beta\kappa}, \quad \Lambda = \Lambda(e^{\beta\kappa_c}), \quad \text{and} \quad W = \text{sidelength of } \Lambda.$$
Our first lemma states that it is sufficient to look at the system restricted to the box $\Lambda$:

Lemma 1. For any local observable $f$,
$$\lim_{\beta\to\infty} |E(f(\sigma_\tau)) - E(f(\sigma_{\Lambda;\tau}))| = 0.$$

Proof. The proof is standard; see, e.g., Lemmas 1 and 2 in [Sch1].

In order to motivate the technical work that follows, we give a brief description of what needs to be done in relatively vague terms. We want to show that in a box that has sidelength scaling with an exponential of $\beta$, it is very unlikely that nucleation will happen by time $\tau$. In order to do this, we will show that if nucleation does happen, then it must happen "locally", in the sense that it would also happen in a smaller box with sidelength not scaling with $\beta$. To be able to justify this last localization statement, one must show that nucleation is unlikely to be caused by the influence of $+1$ spins far away from the place where the nucleation takes place. So we will need to formalize this notion of "influence", which we will do by clumping all $+1$ spins that influence one another into equivalence classes, which we will call space-time clusters. Localization, therefore, will be justified once we show that it is unlikely that certain space-time clusters are very wide in the spatial dimension.

We now make these notions precise. Given a configuration $\eta \in \mathcal{R}_\Phi$, we say that two sites $x$ and $y$ are in the same cluster if there exist sites $x = x_0, x_1, \ldots, x_n = y$, such that $\|x_i - x_{i-1}\|_1 = 1$ for $i = 1, \ldots, n$ and $\eta(x_i) = +1$ for $i = 0, \ldots, n$. A cluster has width $D$ if $D = \sup \|x - y\|_\infty$ over all $x$ and $y$ in the cluster. We can extend these notions in a natural way to define space-time clusters as follows. Consider the process $\sigma^-_{\Phi,-}$. For each $t \ge 0$, we let
$$C_{\Phi,t} = \{(x,s) : x \in \Phi,\ s \le t, \text{ and } \sigma^-_{\Phi,-;s}(x) = +1\}$$
be the set of space-time points with spin $+1$. We define a relation $R_{\Phi,t}$ on $C_{\Phi,t}$ by

(1) $(x,s_1)\, R_{\Phi,t}\, (x,s_2)$ if $(x,s) \in C_{\Phi,t}$ for all $s \in [s_1, s_2]$, and

(2) $(x,s)\, R_{\Phi,t}\, (y,s)$ if $x$ and $y$ are in the same cluster of $\sigma^-_{\Phi,-;s}$.
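The purely spatial part of these definitions (clusters of +1 spins and their width) can be sketched as follows. The encoding of a configuration as the set of its +1 sites and the function names are assumptions of this illustration; the space-time relation itself is not implemented here.

```python
from collections import deque


def clusters(plus_sites):
    """Split the +1 sites into nearest-neighbor (l1-distance 1) connected clusters."""
    remaining = set(plus_sites)
    found = []
    while remaining:
        start = remaining.pop()
        component = {start}
        queue = deque([start])
        # Breadth-first search through +1 sites that are nearest neighbors.
        while queue:
            x, y = queue.popleft()
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    component.add(nb)
                    queue.append(nb)
        found.append(component)
    return found


def width(cluster):
    """Largest l-infinity distance between two sites of the cluster."""
    xs = [x for x, _ in cluster]
    ys = [y for _, y in cluster]
    return max(max(xs) - min(xs), max(ys) - min(ys))


if __name__ == "__main__":
    plus = {(0, 0), (1, 0), (2, 0), (5, 5)}
    for c in clusters(plus):
        print(sorted(c), width(c))
```

The width reduces to the larger of the two coordinate ranges because the supremum of the $\ell_\infty$ distance over pairs is attained coordinate by coordinate.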

Let $\equiv_{\Phi,t}$ denote the smallest equivalence relation on $C_{\Phi,t}$ containing the relation $R_{\Phi,t}$; then a space-time cluster, or STC, is simply a class under $\equiv_{\Phi,t}$. Note that $\equiv_{\Phi,t}$ is monotone in $t$ in the sense that if $t \le s$, $\equiv_{\Phi,t}$ is contained in $\equiv_{\Phi,s}$ (of course the latter is an equivalence relation defined on $C_{\Phi,s}$, which contains $C_{\Phi,t}$). Note that two space-time points that are not in the same STC by time $t$ may be in the same STC by some time $s > t$. If there is a time $t$ such that no $(x,t)$ is in the class of a given STC that has been formed before time $t$, then that STC has been "terminated" by time $t$, in the sense that no other point $(x,s)$ for $s \ge t$ will ever be in the equivalence class of that STC. The width of an STC is the maximum $\ell_\infty$ distance between the spatial coordinates of any two space-time points in that STC. For terminated STCs, this is well-defined; otherwise we must specify the time $t$ at which we are interested in the width of the STC (the width will be a non-decreasing function of $t$, as STCs can only get bigger as $t$ gets bigger). We now introduce a key lemma that uses the concept of STCs.

Lemma 2. Let $\Phi_1 \subset \Phi_2 \subset \mathbb{Z}^2$. For any time $s \ge 0$ and $y \in \Phi_1$, if $\sigma^-_{\Phi_1,-;s}(y) < \sigma^-_{\Phi_2,-;s}(y)$, then $(y,s)$ is in an STC of $\sigma^-_{\Phi_2,-}$ that reaches outside $\Phi_1$; i.e., there is a space-time point $(x,t)$ with $x \notin \Phi_1$ and $t \le s$ that is in the same STC of $\sigma^-_{\Phi_2,-}$ as $(y,s)$.

Proof. By the basic-coupling inequality (1.2), $\sigma^-_{\Phi_1,-;s}(y) \le \sigma^-_{\Phi_2,-;s}(y)$ for all $(y,s)$. Consider the set of all space-time points that satisfy the hypotheses of the lemma but not the conclusion. Assuming this set is non-empty, it is easy to see that it contains a unique space-time point $(y,s)$ with $s$ minimal and $s > 0$ (here we are implicitly using the fact that in any finite box, the marks of the Poisson processes form a discrete set and so can be ordered). It is also clear that at time $s$, either the spin at the site $y$ changed from $-1$ to $+1$ in the $\sigma^-_{\Phi_2,-}$ process but not in the $\sigma^-_{\Phi_1,-}$ process, or else the spin at the site $y$ changed from $+1$ to $-1$ in the $\sigma^-_{\Phi_1,-}$ process but not in the $\sigma^-_{\Phi_2,-}$ process. Because of the basic-coupling, in both cases it is necessarily the case that at time $s$, the site $y$ had more $+1$ nearest neighbors in the $\sigma^-_{\Phi_2,-}$ process than in the $\sigma^-_{\Phi_1,-}$ process. Any such nearest neighbor, say $y'$, must be in $\Phi_1$ since we are assuming $(y,s)$ is not in an STC that reaches outside $\Phi_1$. Since $(y',s)$ is in the same STC as $(y,s)$, it too must violate the conclusion of the lemma, thus contradicting the uniqueness of $(y,s)$.

For technical reasons, we now fix an integer $D$ that satisfies
$$D > \max\left\{\frac{12\kappa_c}{2 - h - (L-2)h},\ 2L^2,\ \operatorname{diam}(\operatorname{supp} f)\right\}. \tag{2.1}$$
This no doubt mysterious integer will be used (among other things) as a bound on the width of "typical" STCs and as a fixed parameter for defining nucleation. As a reminder, note that when we omit the boundary condition, it is assumed to be $-1$. For instance we will write $\Omega_\Phi$ instead of $\Omega_{\Phi,-}$.

For any configuration $\eta$, we define $T_+\eta$ to be the configuration obtained from $\eta$ by flipping all the $-1$ spins with at least two $+1$ neighbors. For any $\eta \in \Omega_\Phi$, with $\Phi$ a finite rectangle, one can apply $T_+$ iteratively and obtain a final configuration $\bar\eta$. This procedure is also known as bootstrapping (see [AL]). In the context of our Ising model, the operation $T_+$ only lowers the energy of a configuration (when there is a positive external field), and so corresponds to rate 1 flips in the dynamics. By considering $\bar\eta$ instead of $\eta$, we are intuitively allowing for the possibility that rate 1 flips will happen very quickly in times that do not scale with $\beta$. However, bootstrapping over a large area is being too generous, so we need a way to be generous "locally". Given a configuration $\eta \in \Omega_{\Phi_1}$ and a rectangle $\Phi_2$, we let $\eta_{\Phi_2}$ be the configuration in $\Omega_{\Phi_1}$ that is equal to $\eta$ on $\Phi_2 \cap \Phi_1$ and is equal to $-1$ everywhere else. Define the box $Q = \Lambda(8D)$. We say that a configuration $\eta \in \Omega_\Phi$ locally spans a critical square if there is some translate $Q'$ of the box $Q$ such that there is a square of sidelength $L$ of $+1$ spins in the (bootstrapped) configuration $\bar\eta_{Q'}$ (this square may be part of a larger rectangle of $+1$ spins of course). In the dynamics, the first time a configuration is reached that locally spans a critical square, we say that nucleation has occurred. We will use the term nucleation loosely, however, as a guide in understanding rather than in the more technical manner of locally spanning a critical square.
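The bootstrapping procedure and the search for a square of +1 spins can be sketched as below. This is an illustration under stated assumptions, not the authors' construction: configurations are encoded as sets of +1 sites, the box is an inclusive coordinate rectangle, and sites are updated one at a time rather than in parallel (the final configuration is the same, because adding +1 sites can only create further sites with two +1 neighbors). The names `bootstrap` and `contains_plus_square` are hypothetical.

```python
def bootstrap(plus_sites, box):
    """Iterate T_+ inside box = (x0, y0, x1, y1) (inclusive) until nothing changes.

    Only the +1 sites lying in the box are kept; everything else is treated as -1.
    Returns the set of +1 sites of the final (bootstrapped) configuration.
    """
    x0, y0, x1, y1 = box
    occupied = {(x, y) for (x, y) in plus_sites if x0 <= x <= x1 and y0 <= y <= y1}
    changed = True
    while changed:
        changed = False
        for x in range(x0, x1 + 1):
            for y in range(y0, y1 + 1):
                if (x, y) in occupied:
                    continue
                n_plus = sum(
                    nb in occupied
                    for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
                )
                # A -1 spin with at least two +1 neighbors flips at rate 1.
                if n_plus >= 2:
                    occupied.add((x, y))
                    changed = True
    return occupied


def contains_plus_square(occupied, side):
    """True if some side x side square consists entirely of +1 sites."""
    return any(
        all((x + i, y + j) in occupied for i in range(side) for j in range(side))
        for (x, y) in occupied
    )


if __name__ == "__main__":
    diagonal = {(0, 0), (1, 1), (2, 2)}
    final = bootstrap(diagonal, (0, 0, 2, 2))
    print(sorted(final), contains_plus_square(final, 3))
```

In this sketch a configuration would "locally span a critical square" if `contains_plus_square(bootstrap(plus_sites, box), L)` holds for some translate of the box $Q$; scanning over translates is left out for brevity.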

In order to prove certain things about the $\sigma_\Lambda$ process, we will need a coupled process in which nucleation is not allowed. For each rectangle $\Phi$, we define the restricted set of configurations $\mathcal{R}_\Phi$ by
$$\mathcal{R}_\Phi = \{\eta \in \Omega_\Phi : \eta \text{ does not locally span a critical square}\}.$$
We now introduce for each rectangle $\Phi$ a modified dynamics evolving in $\mathcal{R}_\Phi$, in which large droplets cannot, by definition, be formed and then we couple the unrestricted dynamics to this modified one, in a natural way. The modified dynamics is simply defined as the Markov process on $\Omega_\Phi$ which evolves as the original stochastic Ising model in $\Phi$, with (−) boundary conditions, but for which all jumps out of $\mathcal{R}_\Phi$ are suppressed. In other words, the rates, $\tilde c_{\Phi,-,\beta}(x,\sigma)$, of the new process are identical to $c_{\Phi,-,\beta}(x,\sigma)$ in case $\sigma^x \in \mathcal{R}_\Phi$ and are 0 otherwise. We will denote this modified process, restricted to the state space $\mathcal{R}_\Phi$, by $\tilde\sigma^\eta_{\Phi,-;t}$, where $\eta \in \mathcal{R}_\Phi$ is the initial configuration. It is easy to see that such a modified process is also reversible. Note that in the modified dynamics, only flips from $-1$ to $+1$ are suppressed; this will be important later because certain proofs for the original dynamics will carry over to the modified dynamics.

Based on the heuristics explained in the introduction, it is easy to see that nucleation will occur in the box $\Lambda$ by time $\tau$. We will see that nucleation is unlikely in the smaller box $\Lambda' = \Lambda(\varepsilon^{1/2} e^{\beta\kappa_c})$. Let
$$W' = \text{sidelength of } \Lambda',$$
and denote the translates of $\Lambda'$ by
$$\Lambda'_j = \Lambda' + W' j, \quad j \in \mathbb{Z}^2.$$
We say that we have tiled the lattice $\mathbb{Z}^2$ with the boxes $\Lambda'_j$, and we shall refer to the $\Lambda'_j$ as tiles. Now define the rescaled lattice
$$\Lambda_{\mathrm{Res}} = \{j : \Lambda'_j \cap \Lambda \ne \emptyset\},$$
and let
$$W_{\mathrm{Res}} = \text{sidelength of } \Lambda_{\mathrm{Res}}.$$
We want to define a simple random state $\mu$ on the rescaled lattice to mark where nucleation has occurred in the original lattice. For technical reasons we not only record nucleation, but also the formation of very wide STCs; namely STCs that become wider than $D$. In order to not miss nucleation and STCs on the edges of the boxes $\Lambda'_j$, we define the larger boxes
$$\Lambda^*_j = \bigcup_{\|i-j\|_\infty \le 1} \Lambda'_i$$
that have width $3W'$. For $j \in \mathbb{Z}^2$, we let $N_j$ be the event that either

(1) at some time $s \le \tau$, the configuration $\sigma_{\Lambda^*_j;s}$ (remember that we mean with $-1$ boundary conditions and initial configuration) locally spans a critical square, or

(2) at some time $s \le \tau$, an STC of the process $\sigma_{\Lambda^*_j}$ has width greater than or equal to $D$.

Note that the events $N_j$ are identically distributed and have a finite range of dependence; namely $N_j$ is independent from $N_i$ if $\|i - j\|_\infty > 2$. In order to compute the probability of the event $N_0$, we first need a few local results which are essentially consequences of the work done in [NS]. We state and prove these results here. In the following lemma, we obtain an exponential bound on the probability that nucleation will happen in a finite box by a time of smaller order than $e^{\beta\Gamma}$.

Lemma 3. For fixed $a < \Gamma$ and $N > L^2 + 1$, define
$$S = \inf\{t : \sigma_{\Lambda(N);t} \text{ locally spans a critical square}\}.$$
Then for any $\delta > 0$,
$$P(S \le e^{\beta a}) \le e^{-\beta(\Gamma - a - \delta)}$$
for all large $\beta$ (depending on $a$, $N$, and $\delta$).

Proof. Fix $N$ and $\delta > 0$ (small). Define the time
$$T = \inf\{t : \sigma_{\Lambda(N);t} \text{ has all spins in } \Lambda(N) \text{ equal to } +1\}.$$
By Theorem 3 of [NS],
$$\lim_{\beta\to\infty} P(T \le e^{\beta(\Gamma-\delta)}) = 0.$$
Because the system is in a finite box, once it reaches a configuration whose bootstrap contains a large droplet of $+1$ spins, the system can go to the $+1$ state with non-vanishing probability in a time of order $e^{\beta(2-h)} < e^{\beta(\Gamma-\delta)}$ (for small $\delta$), so it follows that (see [NS] for details)
$$\lim_{\beta\to\infty} P(S \le e^{\beta(\Gamma-\delta)}) = 0. \tag{2.2}$$
For simplicity, let $P(S \le e^{\beta a}) = c(\beta)$. We break the time interval $e^{\beta(\Gamma-\delta)}$ into smaller intervals of length $e^{\beta a}$, and by independence, the Markov property, and attractiveness, we have
$$P(S > e^{\beta(\Gamma-\delta)}) \le P(S > e^{\beta a})^{e^{\beta(\Gamma-a-\delta)}} = (1 - c(\beta))^{e^{\beta(\Gamma-a-\delta)}} \le e^{-c(\beta)\, e^{\beta(\Gamma-a-\delta)}}. \tag{2.3}$$
Since $c(\beta) \ge 0$, the right hand side of (2.3) is bounded above by 1, while the limit of the left hand side is 1 by (2.2). Hence
$$\lim_{\beta\to\infty} c(\beta)\, e^{\beta(\Gamma-a-\delta)} = 0,$$
which completes the proof.
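The last two steps of this proof can be made explicit. The block below is a short worked version, writing $\Gamma$ for the critical droplet energy and introducing the shorthand $n(\beta) = e^{\beta(\Gamma - a - \delta)}$ (a symbol used only here), together with the elementary inequality $1 - c \le e^{-c}$.

```latex
\begin{align*}
&\text{From (2.3):}\quad
  0 \;\le\; c(\beta)\, n(\beta) \;\le\; -\log P\bigl(S > e^{\beta(\Gamma-\delta)}\bigr), \\
&\text{and by (2.2):}\quad
  P\bigl(S > e^{\beta(\Gamma-\delta)}\bigr) \to 1
  \;\Longrightarrow\; c(\beta)\, n(\beta) \to 0 . \\
&\text{Hence, for all large } \beta:\quad
  c(\beta)\, n(\beta) \le 1
  \;\Longleftrightarrow\;
  P(S \le e^{\beta a}) = c(\beta) \le e^{-\beta(\Gamma - a - \delta)} .
\end{align*}
```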

The following lemma states that before nucleation has happened in a finite box, it is unlikely that there are any wide STCs in the σ process.

Lemma 4. Let
$$S = \inf\{t : \sigma_{Q;t} \text{ locally spans a critical square}\}$$
and
$$T = \inf\{t : \sigma_Q \text{ has an STC with width} \ge D \text{ by time } t\}.$$
Then for large $\beta$,
$$P(T < \min\{S, \tau\}) \le e^{-2\beta\kappa_c}.$$

Proof. By definition of the modified dynamics $\tilde\sigma$,
$$\sigma_{Q;t} = \tilde\sigma_{Q;t} \quad \text{for } t < S,$$
so that
$$P(T < \min\{S, \tau\}) \le P(\tilde T \le \tau), \tag{2.4}$$
where
$$\tilde T = \inf\{t : \tilde\sigma_Q \text{ has an STC with width} \ge D \text{ by time } t\}.$$
To bound the right-hand side of (2.4) we will show that if a wide STC is to be formed in $\tilde\sigma_Q$, it must be formed relatively quickly; we will then show that it is also unlikely that a wide STC can be formed quickly. Let $d = (L-2)h$. The intuition behind the quantity $d$ is that it takes at most a time of order $e^{\beta d}$ for a sub-critical rectangle of $+1$ spins to be eaten; for detailed explanations, see [NS]. Note that $d < 2 - h$ and let $\delta = \frac{1}{3}(2 - h - d)$. The proof of Proposition 2 in [NS] shows that for any $\eta \in \mathcal{R}_Q$,
$$\lim_{\beta\to\infty} P(\tilde\sigma^\eta_{Q,-;t} = -1 \text{ for some } t \le e^{\beta(d+\delta)}) = 1. \tag{2.5}$$
Let $G_1$ be the event that there is an integer time $s < \tau$ such that on the time interval $[s, s + e^{\beta(d+2\delta)}]$, $\tilde\sigma_Q$ is never in state $-1$. By breaking such an interval into smaller intervals of length $e^{\beta(d+\delta)}$ and using (2.5) along with the Markov property, and then adding over all integer times $s < \tau$, we obtain
$$P(G_1) \le \tau e^{-e^{\beta\delta}} \tag{2.6}$$
for large $\beta$, which is a super-exponential bound on the probability of the event $G_1$.

Observe that it is necessarily the case that any two adjacent lines (parallel to some coordinate direction) that both intersect an STC must have a site in one of them whose spin flipped to $+1$ at a moment when fewer than 2 of its neighbors had $+1$ spins, so that the flip happened at a rate no greater than $e^{-\beta(2-h)}$. It is clear that if an STC has width $2N$, there are $N$ obvious pairs of adjacent lines with the aforementioned property. Let $G_2$ be the event that there are $\lfloor D/2 \rfloor$ pairs of adjacent lines in $Q$ such that each pair contains a site whose spin flipped to $+1$ at a rate no greater than $e^{-\beta(2-h)}$, and that all of these flips happened in a time interval of length less than $e^{\beta(d+2\delta)}$ contained in $[0, \tau]$. By adding over all such intervals (starting at integer times) and all such pairs of lines, we obtain
$$P(G_2) \le C(D)\,\tau \left(16D\, e^{-\beta(2-h)} e^{\beta(d+2\delta)}\right)^{\lfloor D/2 \rfloor} \le C(D)\,\tau\, e^{-\beta\delta D/3}, \tag{2.7}$$
where $C(D)$ is a constant depending on $D$ that corresponds to counting the number of ways in which one can choose the pairs of lines mentioned above. It is clear from the discussion above that the event that a wide STC is formed by time $\tau$ is contained in the union of the events $G_1$ and $G_2$, so for large $\beta$ we have
$$P(\tilde T \le \tau) \le P(G_1) + P(G_2) \le \tau e^{-\beta\delta D/4} \le e^{-2\beta\kappa_c}, \tag{2.8}$$

where in the last step we used (2.1). This completes the proof, but note that we are simply obtaining an exponential bound that will be sufficient for our purposes; in fact, the bound that can be obtained is super-exponential in $\beta$ in the sense that by choosing $D$ large enough, the bound obtained can kill any given exponential in $\beta$.

Now that we have local results about the likelihood of nucleation and wide STCs, we can obtain an exponential bound on the probability of the event $N_0$.

Lemma 5. There exists a $\delta > 0$ such that for large $\beta$, $P(N_0) \le e^{-\beta\delta}$.

Proof. Define the times
$$V_1 = \inf\{s : \sigma_{\Lambda^*_0;s} \text{ locally spans a critical square}\},$$
$$V_2 = \inf\{s : \text{an STC of } \sigma_{\Lambda^*_0} \text{ has width} \ge D\},$$
and let
$$V = \min\{V_1, V_2\},$$
so that the event $N_0$ is the same as the event that $V \le \tau$. We want to know if nucleation happens first or if a wide STC is formed before there is nucleation, so we let $F_1$ be the event that $V_1 \le V_2$ and $V_1 \le \tau$, and we let $F_2$ be the event that $V_2 < V_1$ and $V_2 \le \tau$.

On the event $F_1$, there is some $x \in \Lambda^*_0$ such that the bootstrapped configuration $\overline{(\sigma_{\Lambda^*_0;V_1})_{Q+x}}$ contains a square of sidelength $L$ of $+1$ spins. Let $Q^* = \Lambda(11D)$. Since up to the time $V_1 \le V_2$, no STC has width larger than $2D + 1$, all the clusters of $\sigma_{\Lambda^*_0;V_1}$ that intersect the box $Q + x$ would have also appeared in $\sigma_{Q^*+x;V_1}$ by virtue of Lemma 2. In particular, $\sigma_{Q^*+x;V_1}$ locally spans a critical square also. Using the fact that $\kappa < \kappa_c < \Gamma$, we apply Lemma 3 to $\sigma_{Q^*+x}$ (with $a = \kappa$ and $\delta = \delta_1$), and add over all such sites $x \in \Lambda^*_0$ to obtain (for large $\beta$)
$$P(F_1) \le (3\varepsilon^{1/2} e^{\beta\kappa_c})^2\, e^{-\beta(\Gamma - \kappa - \delta_1)} \le e^{-\beta\delta_2}, \tag{2.9}$$
where $\delta_2$ is chosen sufficiently small and smaller than $\frac{1}{2}(\kappa_c - \kappa)$.

Using the same sort of reasoning as above, it is clear that on the event $F_2$, if the flip at the site $x \in \Lambda^*_0$ at time $V_2$ created the first STC with width $\ge D$, then at that time, the said STC can be no wider than $2D + 1$, so that again by virtue of Lemma 2, the same wide STC would have been formed at time $V_2$ in the process $\sigma_{Q+x}$. Since by assumption, nucleation has not yet happened by time $V_2$, we can apply Lemma 4, and adding over all sites $x \in \Lambda^*_0$, we obtain (for large $\beta$)
$$P(F_2) \le (3\varepsilon^{1/2} e^{\beta\kappa_c})^2\, e^{-2\beta\kappa_c} \le e^{-\beta\delta_3}, \tag{2.10}$$
where $\delta_3$ is chosen smaller than $2 - h$. This completes the proof of the lemma.
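The exponent bookkeeping behind (2.9) and (2.10) can be checked directly. The block below is a worked computation, writing $\Gamma$ for the critical droplet energy and using only $\varepsilon = e^{-\beta(2-h)}$ and $3\kappa_c = \Gamma + (2-h)$; the constant 9 is absorbed for large $\beta$.

```latex
\begin{align*}
(3\varepsilon^{1/2} e^{\beta\kappa_c})^2\, e^{-\beta(\Gamma-\kappa-\delta_1)}
  &= 9\, e^{-\beta\left[(2-h) + \Gamma - 2\kappa_c - \kappa - \delta_1\right]}
   = 9\, e^{-\beta\left[3\kappa_c - 2\kappa_c - \kappa - \delta_1\right]}
   = 9\, e^{-\beta(\kappa_c - \kappa - \delta_1)}, \\
(3\varepsilon^{1/2} e^{\beta\kappa_c})^2\, e^{-2\beta\kappa_c}
  &= 9\,\varepsilon = 9\, e^{-\beta(2-h)} .
\end{align*}
```

The first line is at most $e^{-\beta\delta_2}$ for large $\beta$ once $\delta_1$ is small enough that $\kappa_c - \kappa - \delta_1 > \delta_2$, which is possible when $\delta_2 < \frac{1}{2}(\kappa_c - \kappa)$; the second is at most $e^{-\beta\delta_3}$ for any $\delta_3 < 2 - h$.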

Turning our attention to the rescaled lattice 3Res , we define the state µ on this lattice by µ(j) = 1Nj . The bootstrapped configuration µ is defined exactly as before, with 1’s instead of +1’s and 0’s instead of −1’s. A site x for which µ(x) = 1 is also said to be occupied (versus vacant). We want to say that with high probability, if we bootstrap the configuration µ, then the origin (of the rescaled lattice) will not be occupied in the final configuration, i.e., P(µ(0) = 1) → 0 By construction, the collection

as

β → ∞.

(2.11)

{µ(j)}j∈3Res

is an identically distributed set of random variables with a finite range of dependence. We have (for large β) WRes ≤

2eβκc = 4ε−1/2 = 4eβ(2−h)/2 . (1/2)ε1/2 eβκc

(2.12)

Comparing the occupation density for µ as given by Lemma 5 with (2.12), we see that for any constant C,

$$W_{\mathrm{Res}} < e^{C/p} \tag{2.13}$$

for all large β, and hence (2.11) would follow from Theorems 1 & 2 of [AL] were it not for the fact that there is dependence among the occupation events of the sites of $\Lambda_{\mathrm{Res}}$. Because this dependence is of finite range, the following lemma will show that (2.11) holds nevertheless. Readers who are not concerned by the lack of independence can skip the proof, which only uses the techniques of [AL].

In what follows, a configuration $\eta \in \{0,1\}^{\mathbb{Z}^2}$ is chosen randomly such that $P(\eta(x) = 1) = p$ for all x, and for all k, $\eta(x_1), \dots, \eta(x_k)$ are independent if $\|x_i - x_j\|_\infty > 2$ for $1 \le i < j \le k$. These are the constraints of our problem, but in fact the range of dependence does not have to be 2 in what follows. We are interested in the probability that the origin is occupied in the final bootstrapped configuration of η restricted to the box $\Lambda(N)$, so we define

$$M(N, p) = P\left(\bar\eta_{\Lambda(N)}(0) = 1\right).$$

Lemma 6. There exists a constant C > 0 such that

$$\lim_{\substack{p \to 0,\ N \to \infty \\ N < e^{C/p}}} M(N, p) = 0.$$
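To make the quantity M(N, p) concrete, here is a minimal Python sketch. It assumes the standard two-neighbour bootstrap rule of [AL] (a vacant site becomes occupied once at least two of its nearest neighbours are occupied, and sites outside the box are held at 0), and it samples sites independently rather than with the finite-range dependence allowed above. The names `bootstrap` and `estimate_M` are illustrative only and do not come from the paper.

```python
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def bootstrap(config):
    """Return the final configuration of two-neighbour bootstrap percolation.

    `config` is a square list of lists with entries 0 (vacant) or 1 (occupied).
    Sites outside the box count as vacant. Because the rule is monotone, the
    final configuration does not depend on the order of the updates.
    """
    n = len(config)
    grid = [row[:] for row in config]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(n):
                if grid[i][j] == 1:
                    continue
                occupied = 0
                for di, dj in NEIGHBOURS:
                    a, b = i + di, j + dj
                    if 0 <= a < n and 0 <= b < n and grid[a][b] == 1:
                        occupied += 1
                if occupied >= 2:
                    grid[i][j] = 1
                    changed = True
    return grid

def estimate_M(N, p, trials=200, seed=0):
    """Monte Carlo estimate of M(N, p) for an N x N box with i.i.d. density p.

    Assumption: independent sites; the centre of the box plays the role of
    the origin.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        cfg = [[1 if rng.random() < p else 0 for _ in range(N)] for _ in range(N)]
        if bootstrap(cfg)[N // 2][N // 2] == 1:
            hits += 1
    return hits / trials
```

Calling `estimate_M(N, p)` for small p and moderate N gives a rough feel for the regime N < e^{C/p} in which Lemma 6 asserts that the origin stays vacant with high probability.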


Proof. For the moment, let's fix p and N. The key idea of the proof is to use Lemma 1 of [AL]. Using the terminology introduced in [AL], we say that a region $\Phi \subset \mathbb{Z}^2$ is internally spanned if Φ is entirely covered with 1's in the final configuration for the bootstrap dynamics restricted to Φ (i.e., the states of the sites outside Φ are held fixed, equal to 0, at each step). On the event that the origin is occupied in the final configuration of the bootstrap percolation in $\Lambda(N)$ (in our terminology, this means $\bar\eta_{\Lambda(N)}(0) = 1$), there is some rectangle contained in $\Lambda(N)$ that is internally spanned in the configuration η, and this rectangle contains the origin. Let $\ell$ be the maximum sidelength of such a rectangle. There are three cases to consider: (1) $\ell > a/p$, (2) $a/p \ge \ell \ge a/p^{1/8}$, or (3) $\ell < a/p^{1/8}$, where a is a constant less than 1/8.

In the first case, we apply Lemma 1 of [AL] to find that there is a rectangle inside $\Lambda(N)$ (but not necessarily containing the origin) that is internally spanned and has maximum sidelength m in the interval $[a/p,\ 2a/p + 2]$. Let the shorter side have length n ≤ m. Then in each pair of adjacent lines parallel to the shorter side of the rectangle, there must be one occupied site (this is a necessary condition for the rectangle to be internally spanned). Partitioning the rectangle into $\lfloor m/2 \rfloor$ such pairs of lines and considering every other pair, we see that the existence of occupied sites in them are independent events. Thus the probability that such a rectangle is internally spanned is bounded by $(2np)^{\lfloor m/4 \rfloor}$, and adding over all such rectangles (with sidelengths n ≤ m satisfying the said conditions) in the box $\Lambda(N)$, we have

$$\begin{aligned}
P(\text{Case 1}) &\le N^2 \left(\frac{2a}{p} + 2\right)^2 \left(2\left(\frac{2a}{p} + 2\right)p\right)^{\lfloor a/(4p) \rfloor} \\
&\le N^2 \left(\frac{4a}{p}\right)^2 \left(\frac{8a}{p}\,p\right)^{a/(8p)} \\
&\le \frac{16a^2}{p^2} \left(e^{2C}(8a)^{a/8}\right)^{1/p},
\end{aligned} \tag{2.14}$$

which goes to 0 as p goes to 0 provided C is chosen small enough.

For the second case, we again use the trick of finding occupied sites in pairs of adjacent lines. In this case, however, we do not use the [AL] Lemma; instead we simply add over all possible rectangles containing the origin that have maximal sidelength $m = \ell$ in the interval $[a/p^{1/8},\ a/p]$, to obtain

$$P(\text{Case 2}) \le (2a/p)^4 \left(2\,\frac{a}{p}\,p\right)^{\lfloor a/(4p^{1/8}) \rfloor} \le (2a/p)^4 \left((2a)^{a/8}\right)^{1/p^{1/8}}, \tag{2.15}$$

which goes to 0 as p goes to 0.

Finally, in the third case, observe that it is certainly necessary that there is some occupied site inside the rectangle $\Lambda(a/p^{1/8})$. We immediately get the bound

$$P(\text{Case 3}) \le p \left(\frac{2a}{p^{1/8}}\right)^2,$$

which goes to 0 as p goes to 0.
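As a quick check of the third case, the bound can be simplified explicitly. This is only a worked restatement of the union bound over the sites of the small box, with nothing assumed beyond 0 < p < 1:

```latex
\[
P(\text{Case 3}) \;\le\; p\left(\frac{2a}{p^{1/8}}\right)^{2}
 \;=\; 4a^{2}\,p^{\,1-1/4}
 \;=\; 4a^{2}\,p^{3/4}
 \;\xrightarrow[\;p\to 0\;]{}\; 0 .
\]
```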

Given a configuration η on the rescaled lattice $\Lambda_{\mathrm{Res}}$, we define a corresponding configuration $\nu_\eta$ on Λ in the obvious way:

$$\nu_\eta\big|_{\Lambda'_j \cap \Lambda} = \begin{cases} +1 & \text{if } \eta(j) = 1, \\ -1 & \text{otherwise.} \end{cases}$$

For convenience, we let $\nu = \nu_{\bar\mu}$ denote the random state on Λ corresponding to the bootstrapped random state $\bar\mu$. Note that thanks to Lemma 6, (2.11) is justified, and we have

$$P(\nu(0) = +1) \to 0 \quad \text{as } \beta \to \infty. \tag{2.16}$$

The next step is to start the σ process at time t = 0 from the initial configuration ν. This process, which we will denote by $\xi_{\Lambda;t}$, is clearly not Markov, since it uses information from the future of the $\sigma_\Lambda$ process (up to time τ) to determine its initial configuration. The $\xi_\Lambda$ process is by definition coupled to the $\sigma_\Lambda$ process, and it simply uses the Poisson processes and uniform random variables to determine its time evolution in the same way as the $\sigma_\Lambda$ process does. It is easy to see that

$$\sigma_{\Lambda;t} \le \xi_{\Lambda;t} \quad \text{for all } t. \tag{2.17}$$
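The ordering (2.17) is a consequence of driving both processes with the same randomness. The following Python sketch illustrates the mechanism on a small box. It is a discrete-time analogue, not the paper's continuous-time graphical construction: each step proposes a site, a target spin and a uniform variable, shared by both copies. The Metropolis acceptance probability min(1, e^{-βΔH}) is computed from an assumed local field (sum of the four neighbouring spins plus h, with −1 boundary conditions), chosen so that it reproduces the rates quoted in the text, such as e^{-β(4-h)} and e^{-β(2-h)}. All function names are hypothetical.

```python
import math
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def local_field(cfg, i, j, h):
    """Sum of the four neighbouring spins plus h; spins outside the box are -1."""
    n = len(cfg)
    total = h
    for di, dj in NEIGHBOURS:
        a, b = i + di, j + dj
        total += cfg[a][b] if 0 <= a < n and 0 <= b < n else -1
    return total

def update(cfg, i, j, target, u, beta, h):
    """Try to set the spin at (i, j) to `target` using the shared uniform `u`."""
    if cfg[i][j] == target:
        return
    # Assumed energy change: -field for a flip to +1, +field for a flip to -1.
    delta = -target * local_field(cfg, i, j, h)
    if u < math.exp(-beta * max(delta, 0.0)):
        cfg[i][j] = target

def run_coupled(n=6, beta=1.0, h=0.5, steps=2000, seed=1):
    """Run two coupled copies from ordered initial states; report whether order held."""
    rng = random.Random(seed)
    lower = [[-1] * n for _ in range(n)]
    upper = [[1 if rng.random() < 0.3 else -1 for _ in range(n)] for _ in range(n)]
    ordered = True
    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        target = rng.choice((-1, 1))
        u = rng.random()
        update(lower, i, j, target, u, beta, h)
        update(upper, i, j, target, u, beta, h)
        ordered = ordered and all(
            lower[a][b] <= upper[a][b] for a in range(n) for b in range(n)
        )
    return ordered
```

In this sketch the order is preserved because the acceptance probability for a flip to +1 is non-decreasing in the neighbouring spins, while that for a flip to −1 is non-increasing, which is the attractiveness property the text relies on.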

It makes sense to talk about STCs of $\xi_\Lambda$; we denote the STC equivalence relation by $\equiv^{\nu}_{\Lambda,t}$. Since the random state ν has information about nucleation and the formation of wide STCs in the $\sigma_\Lambda$ process up to time τ, intuitively, a STC of $\xi_\Lambda$ has been "helped" by nucleation only if the STC is connected in space-time to some cluster in the region {ν = +1} of space; i.e., if the projection of the STC onto the spatial dimension intersects the region {ν = +1}. We make this precise through the following definition.

Definition. Let Blue(t) be the (set-valued) process defined as follows: for x ∈ Λ and t ≥ 0 such that $\xi_{\Lambda;t}(x) = +1$, we say that x ∈ Blue(t) if and only if $(x, t) \equiv^{\nu}_{\Lambda,t} (y, s)$ for some s ≤ t and y ∈ Λ with ν(y) = +1. If x ∈ Blue(t) we say x is blue at time t.

Again, roughly speaking, the maximal influence of nucleation in the σ process is to sites that are blue. Note that if one site of a STC is blue, then all other sites of that STC are also blue. To become more familiar with the definitions, the reader can prove the following lemma, whose proof is the same as that of Lemma 2.

Lemma 7. If $\xi_{\Lambda;t}(x) = +1$ and x is not blue at time t, then $\sigma_{\Lambda;t}(x) = +1$.

Lemma 8. If a STC of $\xi_\Lambda$ has width ≥ D at time t, then that STC must be blue at time t.

Proof. Consider the first time s ≤ t at which the width of the said STC became ≥ D. If s = 0, we are done, since at time 0 the set $\{\xi_{\Lambda;0}(x) = +1\}$ is precisely Blue(0). Suppose, therefore, that s > 0 and the STC is not blue at time t; then it is also not blue at time s. It is clear that at time s, there was a flip from −1 to +1 at some site x ∈ Λ that made the STC (of x) have width ≥ D for the first time. In particular, the width of the STC at time s is no more than 2D + 1. By Lemma 7, the same STC is formed in $\sigma_\Lambda$, and by Lemma 2, it is also formed in $\sigma_{\Lambda^*_j}$, where j is such that $x \in \Lambda'_j$. But this means that the event $N_j$ has occurred, and so ν(x) = +1, contradicting our assumption that the STC was not blue.


The following lemma extends the result above to further localize the appearance of non-blue +1 spins. We use the following notation: $\sigma_{\Lambda;t-}(x) = +1$, for example, if there is some t' < t such that $\sigma_{\Lambda;s}(x) = +1$ for all s ∈ [t', t). We say that x became blue at time t if x ∈ Blue(t) and there is some t' < t such that x is not blue at any time s ∈ [t', t).

Lemma 9. If $\xi_{\Lambda;t-}(x) = +1$ and x became blue at time t, then $\sigma_{Q+x;t}(x) = +1$ and $\tilde\sigma_{Q+x;t}(x) = +1$.

Proof. Consider the STC of x in $\xi_\Lambda$ at time t− (i.e., immediately before time t). By hypothesis, this STC is not blue before time t, so by Lemma 7 it also appears in $\sigma_\Lambda$. By Lemma 8, it has width less than D, so by Lemma 2, it also appears in $\sigma_{Q+x}$. Suppose $\sigma_{Q+x;s}$ locally spans a critical square at some time s ≤ t ≤ τ. Then by attractiveness, $\sigma_{\Lambda^*_j;s}$ would also locally span a critical square; here j is such that $x \in \Lambda'_j$. Hence, as in the proof of Lemma 8, the event $N_j$ has occurred, and so ν(x) = +1, contradicting our assumption that x became blue at time t (any site y with ν(y) = +1 is always blue when $\xi_\Lambda(y) = +1$). We conclude, therefore, that up to time t, no spin flips have been suppressed in $\tilde\sigma_{Q+x}$, and the result follows.

The following lemma is a technical result that will be used in the sequel, but to avoid a break in continuity later on, we state and prove it here.

Lemma 10. For any time t ≥ 0,

$$P\left(\tilde\sigma_{Q;t}(x) = +1 \text{ for some } x \in Q\right) \le e^{-2\beta}$$

for all large β (uniformly in t).

Proof. We define the quantities d and δ as in the proof of Lemma 4, so d = h(L − 2), and d + 3δ = 2 − h. First we consider the case where $t \le e^{\beta(d+2\delta)}$; in this case, the probability we need to bound is certainly bounded by

$$P\left(\tilde\sigma_{Q;s}(x) = +1 \text{ for some } s \le e^{\beta(d+2\delta)}\right) \le |Q|\, e^{-\beta(4-h)} e^{\beta(d+2\delta)} = |Q|\, e^{-\beta(2+\delta)} \le e^{-2\beta} \tag{2.18}$$

for large β; this is simply because some spin must flip to +1 starting from the −1 configuration. In the second case, where $t > e^{\beta(d+2\delta)}$, we use the technique used in the proof of Lemma 4 to write

$$\begin{aligned}
P\left(\tilde\sigma_{Q;t}(x) = +1\right) &\le P\left(\tilde\sigma_{Q;t}(x) = +1,\ \tilde\sigma_{Q;s} = -1 \text{ for some } s \in [t - e^{\beta(d+2\delta)}, t]\right) \\
&\quad + P\left(\tilde\sigma_{Q;s} \ne -1 \text{ for any } s \in [t - e^{\beta(d+2\delta)}, t]\right) \\
&\le e^{-\beta(4-h)} e^{\beta(d+2\delta)} + e^{-e^{\beta\delta}}.
\end{aligned} \tag{2.19}$$

Since the last term is super-exponentially small in β, we can add over all x ∈ Q as in (2.18) and obtain the desired result.
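The exponent arithmetic behind (2.18) uses only the relation d + 3δ = 2 − h stated at the start of the proof. Written out step by step:

```latex
\begin{align*}
-(4-h) + (d + 2\delta) &= -(4-h) + (2 - h - \delta)
   && \text{since } d + 3\delta = 2 - h \\
 &= -(2 + \delta), \\
|Q|\,e^{-\beta(2+\delta)} &\le e^{-2\beta}
   && \text{as soon as } \beta \ge \delta^{-1}\log|Q| .
\end{align*}
```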


The next step is to actually show, using the setup developed above, that the Blue influence is unlikely to reach the support of our local observable f. To do this, we will use the technique of chronological paths (see [KS]). Define the random set

$$B = \{x \in \Lambda : \|x - y\|_\infty < W'/3 \text{ for some } y \text{ such that } \nu(y) = +1\}.$$

The set B is simply the set {ν = +1} with a shell of width W'/3 around it. For each x ∈ Λ, let $G_x$ denote the event that x ∈ Λ \ B is one of the first sites outside of B to become blue, and that x becomes blue before time τ. Sites in the same STC become blue at the same time, of course, so when we say x is a first such site, we mean no site outside of B became blue at a time strictly before the time at which x became blue. By the same reasoning, the events $G_x$ are not disjoint. Now if the origin is not blue at time 0, then since the set {ν = +1} fits the tiles $\Lambda'_j$, it is the case that the box Λ(W/3) is disjoint from B, so that in particular, for large β, the box Q is disjoint from B (remember that the sidelength of Q is fixed). Define the event

$$G = \{\text{there is some } x \in Q \text{ and some } t \le \tau \text{ such that } x \in \mathrm{Blue}(t)\}.$$

By the aforementioned observations, if the event G occurs but ν(0) ≠ +1, then $G_x$ must occur for some x. Thus, we have

$$P(G) \le P(\nu(0) = +1) + \sum_{x \in \Lambda} P(G_x). \tag{2.20}$$

The first term on the right hand side is already controlled in (2.16). We must show that the terms $P(G_x)$ vanish fast enough as β → ∞.

Let's begin analyzing the Blue(t) process. First, observe that this process changes values only at times t when there is a flip in the $\xi_\Lambda$ process. We let Blue(t−) denote the set of blue sites immediately before time t. If the spin at site x flips from +1 to −1 at time t, then clearly Blue(t) = Blue(t−) \ {x} (note that x need not have been blue at time t−). More interestingly, if the spin at a site x flips from −1 to +1 at time t, then one of the following mutually disjoint events happens: (1) if x has no +1 neighbors at time t, then it becomes blue at time t if and only if ν(x) = +1, (2) if all the +1 neighbors of x at time t were already blue at time t, then Blue(t) = Blue(t−) ∪ {x}, or (3) if some of the +1 neighbors of x at time t were not blue at time t, then x and the STCs of the said +1 neighbors all become blue at time t provided that either x had some other +1 neighbor that was already blue at time t or else ν(x) = +1. The key observation here is that in all cases, by Lemma 8, the width of the blue set increases by at most 2D + 1, since any non-blue STCs that became blue at time t must have had width less than D. In particular, a necessary condition for the event $G_x$ to occur is that x be within $\|\cdot\|_\infty$-distance D + 1 of B.

A chronological path¹ from a site x ∈ Blue(t) to a site y ∈ Blue(s) ∩ {ν = +1} is a sequence $(x_i, t_i)$, i = 0, …, n, such that $x_0 = x$, $x_n = y$, $\|x_i - x_{i-1}\|_\infty \le D + 1$ for i = 1, …, n, $t = t_0 > t_1 > \dots > t_n = s$, and $x_i$ became blue at time $t_i$ for i = 0, …, n.

¹ We use this term differently than the way it is used in [KS]. Here, the times are in decreasing order, whereas in [KS] they are in increasing order.


It should be clear from the definitions that on the event $G_x$, there is a chronological path from x to some site y ∈ {ν = +1}; in any case, we will find a particular such chronological path shortly.

We say that a space-time point (x, t) is a slow site if x became blue at time t as a result of the spin at x flipping from −1 to +1 in the process $\xi_\Lambda$ at time t and it had only one +1 neighbor at time t−. In other words, slow sites are protuberances off of previous blue sites; an exponential clock of rate $e^{-\beta(2-h)}$ must have rung at the time that a slow site became blue. We say that a space-time point (x, t) is a cluster site if $\xi_{\Lambda;t-}(x) = +1$ and x became blue at time t. In other words, x was part of a non-blue STC at time t−, and due to some nearby flip at a site y from −1 to +1 at time t, the STC of x became blue at time t as a result of coming into contact with blue sites. Lemma 8 tells us that in fact $\|x - y\|_\infty \le D$. We say that a space-time point (x, t) is a special site if it is either a slow site or a cluster site.

We want to construct a chronological path that has "many" special sites. In order to do this, we pick out a special direction $\alpha \in \{\pm e_1, \pm e_2\}$ along which we try not to move in the recursive construction of the chronological path, which is as follows. Given (x, t) such that x became blue at time t and ν(x) ≠ +1 (otherwise we are done), there are two cases (note that t > 0 necessarily):

(1) If the spin at x flipped from −1 to +1 at time t in the $\xi_\Lambda$ process, then some nearest neighbor y of x was blue at time t−. We choose any such y with y − x ≠ α unless the only such y is x + α. The next point in the chronological path, therefore, is (y, s), where s is the most recent time before t at which y became blue. The key point is that we have moved in the special direction only in case (x, s) is a slow site.

(2) If the spin at x was +1 at time t− in the $\xi_\Lambda$ process, then there is some y such that $\|x - y\|_\infty \le D$ and the spin at y flipped from −1 to +1 at time t. As a result of this flip, the cluster containing x at time t came into contact with blue through some neighbor z of y that was blue at time t (or else ν(y) = +1 and we can take z = y). The next point in the chronological path, therefore, is simply (z, s), where s is the most recent time before t at which z became blue. In this case, we do not care how z is chosen, since in any case, (x, t) is a cluster site. Note that $\|x - z\|_\infty \le D + 1$.

The construction of the chronological path given above works in the sense that it produces a chronological path for each special direction, but more importantly, one of the four paths produced actually contains many special sites; we will make this precise now. On the event $G_x$, partitioning according to what the set B is, the special direction that will produce many special sites in the chronological path from x to the region {ν = +1} is the unique choice of $\alpha \in \{\pm e_1, \pm e_2\}$ such that x + (D + 1)α ∈ B. If x is too near the corners of B, then it is possible that no such α works; but in this case, there is a unique choice of two perpendicular directions $\alpha_1, \alpha_2 \in \{\pm e_1, \pm e_2\}$ such that $x + (D + 1)(\alpha_1 + \alpha_2) \in B$, and either $\alpha_1$ or $\alpha_2$ can be taken as the special direction. The point is that given B, we have a special direction along which the chronological path must travel in order to get from x to the region {ν = +1}. The reason is that the site x is the first site to become blue outside of B, so that as soon as the path steps into the region B, which happens at the first step of the chronological path construction, the path can no longer leave B. In other words, the path must connect x to the rectangular component of {ν = +1} that is closest to x, and so it must move at least a distance W'/3 along the special direction.

Let α denote the special direction. Let $H^0_{x,\alpha}$ be the slab

$$H^0_{x,\alpha} = \{y \in \mathbb{Z}^2 : 0 \le (x - y) \cdot \alpha < 10D\},$$

and for each integer i, denote the translates by 10D of this slab by

$$H^i_{x,\alpha} = H^0_{x,\alpha} + i(10D)\alpha.$$

At each step of the chronological path, the maximum $\|\cdot\|_\infty$-distance traversed is D + 1 units, so it is certainly the case that there is a special site in each slab $H^i_{x,\alpha}$, $i = 1, \dots, \lfloor W'/(30D) \rfloor$. We can pick roughly W'/(30D) special sites, but instead we choose special sites from every other slab, $H^{2i}_{x,\alpha}$, $i = 1, \dots, \lfloor W'/(60D) \rfloor$, so that we obtain

$$n = \lfloor W'/(60D) \rfloor \sim \frac{1}{60D}\, e^{-\beta(2-h)/2} e^{\beta\kappa_c} \tag{2.21}$$

special sites, each two of which are at least a distance 10D apart. We now point out that since one of the four special directions produces a chronological path with many special sites in the sense described above, we do not need to ever condition on what B is. In other words, if we let $G_{x,\alpha}$ denote the event that there is a chronological path from x to {ν = +1} that has a special site in each slab $H^{2i}_{x,\alpha}$, i = 1, …, n, we have

$$P(G_x) \le \sum_{\alpha \in \{\pm e_1, \pm e_2\}} P\left(G_{x,\alpha}\right). \tag{2.22}$$

To bound the probabilities of the events $G_{x,\alpha}$, we simply count the total number of chronological paths that satisfy the condition of having many special sites, estimate the probability of each, and add. First we need to obtain a bound on the likely "length" of a chronological path. If we define the length of a chronological path to be the sum of the $\|\cdot\|_\infty$-distances between successive points in the path, then since each step of the path corresponds to a flip of maximal rate 1 in the process and the step taken is bounded in $\|\cdot\|_\infty$-distance by D + 1 (any constant will do), a standard Peierls-type argument tells us that there exist positive constants $C_1$, $C_2$, and $C_3$ (uniformly in β, of course!) such that the probability that there is a chronological path of length greater than $C_1\tau$ from (x, t) to (y, s) with s < t ≤ τ is bounded above by $C_2 e^{-C_3\tau}$. Notice that this is a superexponential bound in β, so that even after summing on all x ∈ Λ, the probability of having paths longer than $C_1\tau$ vanishes as β → ∞. Let $G^{\ell}_{x,\alpha}$ denote the event $G_{x,\alpha}$ but only when no chronological path has length more than $\ell$. We can now summarize these results along with (2.16), (2.20), and (2.22) to write

$$P(G) \le o(1) + \sum_{x \in \Lambda} \sum_{\alpha \in \{\pm e_1, \pm e_2\}} P\left(G^{C_1\tau}_{x,\alpha}\right), \tag{2.23}$$

where o(1) → 0 as β → ∞.

For simplicity, let $\alpha = -e_1$, and let us now analyze the event $G^{C_1\tau}_{x,-e_1}$. By definition, for each outcome in this event, there must exist a sequence (x(i), t(i)), i = 1, …, n, of special sites, where n is as given in (2.21), and the times t(i) satisfy t(1) > t(2) > … > t(n). Furthermore, for i ≠ j, $\|x(i) - x(j)\|_\infty \ge 10D$, so that in particular,

$$Q + x(i) \text{ and } Q + x(j) \text{ are disjoint for } i \ne j. \tag{2.24}$$

For simplicity, in what follows and throughout the rest of the paper, we assume that all relevant quantities are integral. Of the n special sites, either at least n/2 are slow sites, or else at least n/2 are cluster sites. Let (y(i), s(i)), i = 1, …, n/2, be a subsequence (consisting entirely of the same type of special site). Let y(0) = x and define

$$z(j) = y_2(j) - y_2(j-1), \qquad j = 1, \dots, n/2,$$

where we write $y \in \mathbb{Z}^2$ as $y = (y_1, y_2)$. By the bound on the chronological path length and the triangle inequality,

$$K = \sum_{j=1}^{n/2} |z(j)| \le C_1\tau.$$

Note that the z(j)'s determine the $e_2$-coordinate of the y(j)'s in their respective slabs $H_{x,\alpha}$, and the $e_1$-coordinate is one of 10D choices. By dividing up the total variation K among the z(j)'s, we have the following simple combinatorial bound on the total number of possible ways the subsequence y(j) can be chosen:

$$\binom{n}{n/2} \sum_{K=1}^{C_1\tau} 2^{n/2} \binom{K + (n/2) - 1}{(n/2) - 1} (10D)^{n/2}.$$
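The binomial coefficient inside the sum is the usual "stars and bars" count: the number of ways to split a total variation K into n/2 ordered non-negative parts |z(j)|. The short Python sketch below checks this by brute force for small values and evaluates the whole counting expression. The helper names are hypothetical, and the hockey-stick identity verified in the test is a standard fact used here only for illustration of why the sum over K is dominated by a single binomial coefficient.

```python
from itertools import product
from math import comb

def count_compositions(total, parts):
    """Brute-force count of tuples of `parts` non-negative integers summing to `total`."""
    return sum(
        1 for t in product(range(total + 1), repeat=parts) if sum(t) == total
    )

def count_bound(n, max_total, D):
    """Evaluate C(n, n/2) * sum_{K=1}^{max_total} 2^{n/2} C(K + n/2 - 1, n/2 - 1) (10 D)^{n/2}.

    `n` is assumed even; `max_total` plays the role of C_1 * tau.
    """
    half = n // 2
    splits = sum(comb(K + half - 1, half - 1) for K in range(1, max_total + 1))
    return comb(n, half) * 2**half * splits * (10 * D) ** half
```

By the hockey-stick identity, the sum of C(K + m − 1, m − 1) over K = 0, …, M equals C(M + m, m), so the sum over K is at most one binomial coefficient of the kind that Stirling's formula then controls.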

The factor $2^{n/2}$ comes from the choices of signs of the z(j)'s, and the binomial coefficient in front of the summation is the number of ways n/2 of the n slabs are chosen to pick the y(j)'s from. Using Stirling's formula (see (A.3) of the appendix in [KS], for instance) and the fact that $n \le C_1\tau$ (since we can assume, without loss of generality, that κ is sufficiently close to $\kappa_c$), we see that the expression above is bounded by

$$\begin{aligned}
C_1\tau\,(80D)^{n/2} \binom{C_1\tau + (n/2)}{n/2} &\le C_4\,\tau\,(80D)^{n/2} \left(\frac{4eC_1\tau}{n}\right)^{n/2} \\
&\le (C_5)^n\, \tau \left(\varepsilon^{-1/2}\right)^{n/2},
\end{aligned} \tag{2.25}$$

for some constants $C_4$, $C_5$; the second inequality follows from (2.21).

Now that we have counted the total number of lattice arrangements of the sites y(j) (all of the same type of special site), we must actually bound the probability that such a sequence of sites consists of special sites. The probability that n/2 (fixed) sites are slow sites and become occupied in order by time τ is bounded by

$$P\left(Z_{e^{-\beta(2-h)}\tau} \ge \tfrac{n}{2}\right), \tag{2.26}$$

where $Z_\lambda$ is a Poisson random variable with mean λ. Using a standard large deviation estimate for Poisson random variables (see, e.g., the appendix in [KS]) and (2.21), we have

$$\begin{aligned}
P\left(Z_{e^{-\beta(2-h)}\tau} \ge \tfrac{n}{2}\right) &\le \left(\frac{2e^{-\beta(2-h)}\tau}{n}\right)^{n/2} e^{n/2} \\
&\le (C_6)^n \left(\varepsilon^{1/2} e^{-\beta(\kappa_c - \kappa)}\right)^{n/2},
\end{aligned} \tag{2.27}$$

since $n/2 \ge e^{-\beta(2-h)}\tau$. We will come back to these estimates.

The case of the n/2 cluster sites is slightly more complicated. First observe that by Lemma 9, if (y, s) is a cluster site, then $\tilde\sigma_{Q+y;s}(y) = +1$. Denote the subset of cluster sites of (x(i), t(i)) by (y(i), s(i)), i = 1, …, n/2, with the s(i)'s in increasing order (for simplicity). Let $T_0 = 0$, and define the following stopping times for i = 1, …, n/2:

$$T_i = \inf\{t \ge T_{i-1} : \tilde\sigma_{Q+y(i);t} \ne -1\},$$

and let

$$\phi_i = T_i - T_{i-1}$$

be the waiting time after $T_{i-1}$ until a + spin appears in the process $\tilde\sigma_{Q+y(i)}$. One can easily convince oneself that if (y(i), s(i)) are cluster sites as above, then certainly

$$T_{n/2} \le \tau. \tag{2.28}$$

This simply corresponds to waiting for a + at a site and then jumping to the next site as soon as a + appears, and waiting for a + at the new site, and so on. The waiting times may be 0; in fact, since $T_{i-1}$ is independent of the process $\tilde\sigma_{Q+y(i)}$ (by (2.24)), conditioning on $T_{i-1}$ and using Lemma 10 gives us

$$P(\phi_i = 0) = P\left(\tilde\sigma_{Q+y(i);T_{i-1}} \ne -1\right) \le e^{-2\beta},$$

and in particular (by independence again), for any subsequence $m_j$ with $m_k > m_{k-1} > \dots > m_1 > 1$, we have

$$P\left(\phi_{m_k} = 0 \,\middle|\, \phi_{m_{k-1}} = 0, \dots, \phi_{m_1} = 0\right) \le e^{-2\beta}. \tag{2.29}$$

Of course we require $m_1 > 1$ since $P(\phi_1 = 0) = 0$. Using induction and (2.29), we have that for any subsequence $m_j$, j = 1, …, k,

$$P\left(\phi_{m_k} = 0, \phi_{m_{k-1}} = 0, \dots, \phi_{m_1} = 0\right) \le e^{-2\beta k}. \tag{2.30}$$

By the strong Markov property, if $\phi_k \ne 0$, then the waiting time is exponential with rate $|Q| e^{-\beta(4-h)}$ (this is the rate at which a + spin will appear starting from the −1 state) and is independent of the $T_i$'s for i < k, so that in particular, for any subsequence $m_j$ with $m_k > m_{k-1} > \dots > m_1$, and for any positive numbers $a_j$,

$$P\left(0 < \phi_{m_k} \le a_k \,\middle|\, 0 < \phi_{m_{k-1}} \le a_{k-1}, \dots, 0 < \phi_{m_1} \le a_1\right) = 1 - e^{-|Q| a_k e^{-\beta(4-h)}}.$$

Again, by induction, we have

$$P\left(0 < \phi_{m_k} \le a_k, \dots, 0 < \phi_{m_1} \le a_1\right) = \prod_{j=1}^{k} \left(1 - e^{-|Q| a_j e^{-\beta(4-h)}}\right). \tag{2.31}$$

But this is just the joint distribution of independent exponential random variables, so we have

$$\begin{aligned}
P\left(\phi_{m_k} + \dots + \phi_{m_1} \le \tau,\ \phi_{m_k} > 0, \dots, \phi_{m_1} > 0\right) &\le P\left(Z_{|Q|\tau e^{-\beta(4-h)}} \ge k\right) \\
&\le \left(\frac{2|Q|\tau e^{-\beta(4-h)}}{k}\right)^{k} e^{k} \\
&\le (C_7)^k \left(\frac{\tau e^{-\beta(4-h)}}{k}\right)^{k},
\end{aligned} \tag{2.32}$$

where again $Z_\lambda$ is a Poisson random variable with mean λ and we have used a standard large deviation estimate.

Having done the above calculations, we argue as follows. Of the n/2 cluster sites, either at least n/4 have waiting times $\phi_i = 0$, or else at least n/4 have non-zero waiting times. Thus, using (2.30) and (2.32) with k = n/4, we have the following bound on the probability that n/2 (fixed) sites are cluster sites:

$$\begin{aligned}
\binom{n/2}{n/4} \left( e^{-2\beta n/4} + (C_7)^{n/4} \left(\frac{\tau e^{-\beta(4-h)}}{n/4}\right)^{n/4} \right) &\le (C_8)^n \left( e^{-\beta n/2} + \left(e^{-\beta(\kappa_c - \kappa)} \varepsilon^{-1/2} \varepsilon^{3/2}\right)^{n/4} \right) \\
&\le (C_8)^n \left( e^{-\beta n/2} + \left(e^{-\beta(\kappa_c - \kappa)/2} \varepsilon^{1/2}\right)^{n/2} \right),
\end{aligned} \tag{2.33}$$

where we have used the simple fact that $e^{-\beta(4-h)} \le \varepsilon^{3/2}$. Putting (2.25), (2.27), and (2.33) together, we have

$$\begin{aligned}
P\left(G^{C_1\tau}_{x,\alpha}\right) &\le (C_9)^n\, \tau \left(\varepsilon^{-1/2}\right)^{n/2} \left[ \left(\varepsilon^{1/2} e^{-\beta(\kappa_c - \kappa)}\right)^{n/2} + e^{-\beta n/2} + \left(e^{-\beta(\kappa_c - \kappa)/2} \varepsilon^{1/2}\right)^{n/2} \right] \\
&\le 2 (C_9)^n\, \tau \left[ \left(e^{-\beta h/2}\right)^{n/2} + \left(e^{-\beta(\kappa_c - \kappa)/2}\right)^{n/2} \right] \\
&\le \tau e^{-n/2},
\end{aligned}$$

for large β. Since n is an exponential in β, the bound above is a super-exponential bound in β, so looking back at (2.23), we see that adding over all x ∈ Λ does no damage, and we have shown that

$$P(G) \to 0 \quad \text{as } \beta \to \infty. \tag{2.34}$$

To conclude the proof of the first half of the theorem, note that on the complement of the event G, the processes $\xi_\Lambda$ and $\sigma_\Lambda$ agree on the box Q for all times t ≤ τ by virtue of Lemma 7. By Lemmas 2 and 8, the processes $\sigma_\Lambda$ and $\sigma_Q$ agree on the support of the local observable f, thanks to the way D was chosen. Using again the fact that we are on the complement of the event G, the processes $\sigma_Q$ and $\tilde\sigma_Q$ agree up to time t ≤ τ. It remains to show

$$\lim_{\beta \to \infty} E f\left(\tilde\sigma_{Q;\tau}\right) = f(-1),$$

but this is a trivial consequence of Lemma 10.
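The Poisson large deviation estimate invoked in (2.27) and (2.32) has the form P(Z_λ ≥ k) ≤ (eλ/k)^k; with k = n/2 this is exactly the expression (2λ/n)^{n/2} e^{n/2} appearing in (2.27). The following Python sketch compares the exact tail with this bound numerically. It is an illustration only: the function names are hypothetical, and the tail is truncated after a fixed number of terms, so the computed value slightly underestimates the true tail.

```python
from math import e, exp

def poisson_tail(lam, k, extra_terms=200):
    """Approximate P(Z >= k) for Z ~ Poisson(lam) by summing the pmf from k upward."""
    term = exp(-lam)
    for j in range(1, k + 1):
        term *= lam / j            # after the loop, term = P(Z = k)
    total = 0.0
    for j in range(k, k + extra_terms):
        total += term
        term *= lam / (j + 1)      # advance from P(Z = j) to P(Z = j + 1)
    return total

def chernoff_bound(lam, k):
    """Large deviation bound (e * lam / k)^k for P(Z >= k), k >= 1."""
    return (e * lam / k) ** k
```

For instance, `poisson_tail(0.5, 5)` is roughly an order of magnitude below `chernoff_bound(0.5, 5)`, and the gap widens as k/λ grows, which is the regime n/2 ≥ e^{-β(2-h)}τ used in the text.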

3. Relaxation Regime (Proof of Part 2 of the Theorem)

Throughout this section, we have a fixed

$$\kappa > \kappa_c = \frac{\Gamma + (2 - h)}{3},$$

and we let $\tau = e^{\beta\kappa}$ and $\Lambda = \Lambda(\varepsilon^{1/2} e^{\beta\kappa_c})$. Since we are now trying to show that the origin will become occupied by time τ, freezing the sites outside the box Λ will provide a useful comparison, thanks to the basic-coupling inequality (1.3). In fact, several times throughout the proof when we find lower bounds for the probability that certain sites will have +1 spins, we will implicitly assume the worst case scenario, namely that all unmentioned sites have −1 spins. Intuitively, any other scenario only helps the origin become occupied more quickly.


The outline of the proof is as follows. First we will show that it is likely that a "critical droplet" will form somewhere in the box Λ by time τ/3. Then we will see that the critical droplet has enough time to grow large enough to attain its asymptotic speed of growth, $\varepsilon^{1/2}$, so that it can reach the origin by time τ. In reality, many droplets meet and grow toward the origin; we are simply obtaining an upper bound for how long it takes for the (+)-phase to get to the origin.

As in the previous section, we need to prove a local result first. Recall that given a configuration η, the bootstrapped configuration $\bar\eta$ is obtained from η by applying the operation $T_+$ iteratively. Given a finite rectangle Φ, define the restricted set of configurations $\hat R_\Phi$ by

$$\hat R_\Phi = \{\eta \in \Omega_\Phi : \bar\eta \text{ has no rectangle of + spins with shortest side} \ge L\}.$$

Let $\hat\sigma_\Phi$ denote the (coupled) dynamics obtained from $\sigma_\Phi$ by restricting to the states in $\hat R_\Phi$. Using the standard notation $\eta^x$ to denote the configuration obtained from η by flipping the spin at x, it is clear that $\hat\sigma_\Phi$ is equal to $\sigma_\Phi$ at least until the moment when the former enters the boundary of $\hat R_\Phi$, namely

$$\mathcal{P} = \{\eta \in \hat R_\Phi : \eta^x \notin \hat R_\Phi \text{ for some } x \in \Phi\}.$$

We can now state a lemma which gives a lower bound on the probability that nucleation happens in a finite box by a time of smaller order than $e^{\beta\Gamma}$. Note that this lemma is, in a way, the complement of Lemma 3.

Lemma 11. For fixed a satisfying h(L − 2) < a < Γ and $N > L^2 + 1$, define $S = \inf\{t : \sigma_{\Lambda(N);t} \in \mathcal{P}\}$. Then for any δ > 0,

$$P(S \le e^{\beta a}) \ge e^{-\beta(\Gamma - a + \delta)}$$

for all large β (depending on a, N, and δ).

Proof. Fix N and δ > 0 (small). We will describe a mechanism of growth of the critical droplet to obtain a lower bound for the desired probability. For each positive integer k, let $U_k^1$ denote the event that $\sigma_{\Lambda(N)}$ does not enter the set of configurations $\mathcal{P}$ up to time k and $\sigma_{\Lambda(N);k} = -1$, and let $U_k^2$ denote the event that $\sigma_{\Lambda(N);k} = -1$ and $\sigma_{\Lambda(N)}$ leaves the −1 state before time k + 1 and reaches $\mathcal{P}$ by time $e^{\beta a}$ before possibly returning to the −1 state. Intuitively, k is the last visit to −1 before the trip to $\mathcal{P}$. Letting $U_k$ be the intersection of $U_k^1$ and $U_k^2$, it is easy to see that the events $U_k$ are disjoint, and certainly

$$P(S \le e^{\beta a}) \ge \sum_{k=1}^{e^{\beta a}/2} P(U_k) = \sum_{k=1}^{e^{\beta a}/2} P(U_k^1)\, P(U_k^2), \tag{3.1}$$

where we have used the (weak) Markov property in the equality. Recall that we assume, for simplicity, that all relevant quantities are integral.

In the previous section, we defined the notion of "locally spanning a critical square," which was necessary to localize the nucleation phenomenon in large boxes that scaled with β. When dealing with the dynamics in a finite box, there is no need for this notion, however, as it is equivalent in flavor to simply bootstrapping the configuration and searching for a large rectangle (meaning one with shortest sidelength ≥ L). In particular, the proofs of Lemmas 4 and 5 easily imply

$$\lim_{\beta \to \infty} P\left(\hat\sigma_{\Lambda(N);t} \ne \sigma_{\Lambda(N);t} \text{ for some } t \le e^{\beta a}\right) = 0, \tag{3.2}$$

and the proof of Lemma 10 implies that for all t,

$$P\left(\hat\sigma_{\Lambda(N);t} = -1\right) \ge 1 - e^{-2\beta}. \tag{3.3}$$

Combining (3.2) and (3.3), it follows that for $k = 1, \dots, e^{\beta a}/2$,

$$P(U_k^1) \ge \frac{1}{2} \tag{3.4}$$

for all large β, uniformly in k.

The argument above is really nothing new; the heart of the matter is to prove the intuitively obvious fact that the probability of the event $U_k^2$ is on the order of $e^{-\beta\Gamma}$ (remember that Γ is the energy barrier for going from −1 to +1). The well known energy profile for a single droplet is as follows (see [Nev]). The relative minima of the graph of the energy versus the droplet size (assuming the droplet is in a shape that minimizes its energy) occur when the droplet is a square or near square (i.e., sidelengths a and a − 1). The relative maxima occur upon the addition of one +1 spin to a square or near square droplet, creating a new layer with only one +1 spin. The energy then decreases as the layer is filled with +1 spins, finally reaching the next square or near square shape at a local minimum. The energies of the local minima increase until the (square) droplet has sidelength L, at which point the energies of the local minima begin to decrease.

Using this energy profile as a guide, we define the following events. For each $2 \le \ell < L$, let $A_{\ell,\ell}$ be the event that starting from a single $\ell \times \ell$ droplet of +1 spins, a droplet of size $\ell \times (\ell + 1)$ is formed² before the original droplet loses $(\ell - 1)$ of its +1 spins, and that this happens before time $e^{\beta(h(\ell-1) - \delta/(2N))}$. Similarly, for $2 \le \ell < L$ define $A_{\ell,(\ell+1)}$ to be the event that starting from a single $\ell \times (\ell + 1)$ droplet of +1 spins, a droplet of size $(\ell + 1) \times (\ell + 1)$ is formed before the original droplet loses $(\ell - 1)$ of its +1 spins, and that this happens before time $e^{\beta(h(\ell-1) - \delta/(2N))}$. Finally, let $A_{1,1}$ be the event that a 2 × 2 droplet is formed by time 1.

² Here and below, when we say that a certain droplet has been formed, we mean a configuration has been reached which contains such a droplet (or a larger droplet).

The key point is to observe that since the formation of a critical droplet through the mechanism described above is a subset of the event $U_k^2$, we have

$$P(U_k^2) \ge P(A_{1,1})\, P(A_{2,2})\, P(A_{2,3})\, P(A_{3,3}) \cdots P(A_{(L-1),L}), \tag{3.5}$$

where we have used the strong Markov property to restart each time the next larger square or near-square droplet has been formed.

We now need to estimate the probabilities in the right hand side of (3.5). Suppose we start the $\sigma_{\Lambda(N)}$ process from a single square droplet of sidelength $\ell < L$. In terms of the energy profile, the droplet is at a relative minimum. Assume for a moment that +1 spins can flip to −1 only at the corners (i.e., when they have at least two −1 neighbors). There are two competing phenomena affecting the droplet. If it loses $(\ell - 1)$ of the +1 spins (each at rate $e^{-\beta h}$) on a side, then the last +1 spin is lost at rate 1, and the resulting droplet is the next lower relative minimum in the energy profile. If, on the other hand, the spin at one of the sites neighboring the original droplet flips to +1 before the loss of $\ell - 1$ spins in the original droplet, then the droplet can recover the lost +1 spins and fill in the new edge to reach the next higher relative minimum (a rectangle of sidelengths $\ell$ and $\ell + 1$) with non-vanishing probability (as β → ∞), since all the spin flips needed to do this have rate 1. Using Lemma 4 in [NS], one can see that the probability that the latter of the above events happens (before the former) and that this happens by time $e^{\beta(h(\ell-1) - \delta/(2N))}$ is at least as large as (for β large, of course)

$$\frac{1}{2} \left( 1 - e^{-e^{-\beta(2-h)}\, e^{\beta(h(\ell-1) - \delta/(2N))}} \right), \tag{3.6}$$

where the factor of 1/2 comes from the fact that it is very unlikely that $\ell - 1$ of the +1 spins can be lost by time $e^{\beta(h(\ell-1) - \delta/(2N))}$, and the second factor corresponds to a rate $e^{-\beta(2-h)}$ flip happening in the time allotted. Note that at any time before $\ell - 1$ spins are missing from the original droplet, there is at least one site that is outside the original droplet and touching a site that is +1 in the droplet, and the Poisson processes associated with the outside sites are independent of those for the inside sites, justifying the above calculation. Using the fact that $1 - e^{-x} \ge x/2$ for 0 ≤ x ≤ 1, (3.6) is larger than

$$\frac{1}{4}\, e^{-\beta(2-h)}\, e^{\beta(h(\ell-1) - \delta/(2N))} \tag{3.7}$$

for large β. Notice that the exponent here is the difference in energy levels of the two maxima surrounding the local minimum corresponding to the $\ell \times \ell$ droplet. As for the assumption that +1 spins are only eaten at the corners, simply note that the rate at which a +1 spin that has 3 or more +1 neighbors flips to −1 is at most $e^{-\beta(2+h)}$, and the times we are considering are much shorter than the inverse of this rate, so putting in another factor of 1/2 into (3.7) takes care of this problem in a standard fashion (see [NS] for the details of these arguments), and we have shown

$$P(A_{\ell,\ell}) \ge \frac{1}{8}\, e^{-\beta(2-h)}\, e^{\beta(h(\ell-1) - \delta/(2N))} \tag{3.8}$$

for large β. Of course a similar bound can be obtained for the probability of going from each relative minimum to the next higher relative minimum. In each case, the time allotted should be $e^{\beta(h(\ell-1)-\delta/(2N))}$, where $\ell$ is the length of the shorter side (if the droplet is a near square). For the last step, i.e., going from an $L \times (L-1)$ droplet to an $L \times L$ droplet, the time allotted is $e^{\beta(h(L-2)-\delta/(2N))}$; note that this step takes the longest time, and the condition $a > h(L-2)$ is needed to guarantee that there is enough time for the process to move from each relative minimum to the next relative minimum through the relative maximum between them. Last but not least, we must consider the base case, i.e., going from the −1 state to having one +1 spin. This is easy, and in fact with elementary reasoning, one can see that the probability that a 2 × 2 square of +1 spins is created in time 1 is at least

$$P(A_{2,2}) \ge e^{-N^2}\, \frac{e^{-\beta(4-h)}}{8} \left(\frac{e^{-\beta(2-h)}}{8}\right)^{2} \frac{1}{8}. \qquad (3.9)$$

Notice that the exponent here is simply the energy of a 2 × 2 droplet. Using the bounds given by (3.7) and (3.9) in (3.5), and noting that the exponents of the estimates add up to $\beta(\Gamma + \delta)$ by the observations made after (3.7) and (3.9), we have

$$P(U_k^2) \ge C e^{-\beta(\Gamma+\delta)}, \qquad (3.10)$$


for large β, uniformly in k, with the constant C only depending on N. Using (3.4) and (3.10) in (3.1) we have

$$P(S \le e^{\beta a}) \ge \frac{C}{2}\, e^{\beta a}\, e^{-\beta(\Gamma+\delta)} \qquad (3.11)$$

for large β. Of course, we may have taken δ a bit smaller than required, thereby getting rid of the constant. This completes the proof of Lemma 11.

In the next lemma, we will show that going from an $L \times L$ droplet to a larger (finite) droplet is no problem. For technical reasons, we now fix an integer D satisfying

$$D > 2L^2,\ \frac{15\kappa}{h},\ \operatorname{diam}(\operatorname{supp}(f)). \qquad (3.12)$$

Lemma 12. Let $T = \inf\{t : \sigma_{\Lambda(D);t}(x) = +1 \text{ for all } x \in \Lambda(D)\}$. Then for any δ > 0,

$$P(T \le \tau/3) \ge e^{-\beta(\Gamma-\kappa+\delta)}$$

for all large β (depending on τ and δ).

Proof. It suffices to prove the lemma for δ > 0 sufficiently small. Let $\delta < \kappa - \kappa_c$, and set $a = \kappa - \delta/3$. Using the fact that $\Gamma \ge 5(2-h)$ (see Appendix), one can check that $a > \kappa_c > h(L-2)$. Without loss of generality, we can assume that $\kappa < \Gamma$, so that a satisfies the conditions of Lemma 11. Define the time

$$S' = \inf\{t : \sigma_{\Lambda(D);t} \text{ has a droplet of + spins larger than an } L \times L \text{ square}\},$$

and note that once $\sigma_{\Lambda(D)} \in \mathcal{P}$, an $L \times L$ droplet of + spins can form in a time of order 1 due to the fact that bootstrapping happens at a rate of order 1 in a finite box. Using this fact together with Lemma 11, we have

$$P(S' \le \tau/6) \ge C_1 e^{-\beta(\Gamma - a + \delta/3)} = C_1 e^{-\beta(\Gamma-\kappa+2\delta/3)},$$

for large β; the constant is due to bootstrapping and depends on D. We now invoke Theorem 1 (part b) of [NS] to obtain

$$P(T \le \tau/3) \ge C_1 C_2\, e^{-\beta(\Gamma-\kappa+2\delta/3)} \ge e^{-\beta(\Gamma-\kappa+\delta)}$$

for large β; the constant $C_2$ can be taken arbitrarily close to 1.

In order to keep track of where the (+)-phase is, we define a renormalized process µ on Λ as follows. For any x in Λ, $\mu_t(x) = 0$ for $t \ge 0$ until $\Lambda(D) + x$ has all spins +1 in the $\sigma_\Lambda$ process; at that moment, µ(x) becomes 1 and we say that the site x is infected. Once a site x is infected, it remains infected until the number of −1 spins in the box $\Lambda(D) + x$ becomes D/3 in the $\sigma_\Lambda$ process; at that moment, µ(x) becomes 0. The next lemma shows that some sites will become infected by time τ/3.

Lemma 13.

$$\lim_{\beta\to\infty} P\left(\exists x \in \Lambda \text{ and } t \le \tau/3 \text{ such that } \mu_t(x) = 1\right) = 1.$$


Proof. The proof is easy. For each site $x \in \Lambda$, define $T_x$ as in Lemma 12: $T_x = \inf\{t : \sigma_{x+\Lambda(D);t}(y) = +1 \text{ for all } y \in x + \Lambda(D)\}$. By the basic coupling inequalities

$$P(x \text{ becomes infected by time } \tau/3) \ge P(T_x \le \tau/3). \qquad (3.13)$$

If we tile Λ with copies of Λ(D), then the $T_x$'s corresponding to the centers of the tiles are independent of each other, so that we have

$$P(\text{no } x \in \Lambda \text{ becomes infected by time } \tau/3) \le P(T > \tau/3)^{\left(\varepsilon^{1/2} e^{\beta\kappa_c}/D\right)^2}, \qquad (3.14)$$

where we have used the fact that the $T_x$'s are identically distributed and have the same distribution as T, which was defined in Lemma 12. Invoking Lemma 12 with some small δ, we see that the quantity in (3.14) is bounded by

$$\left(1 - e^{-\beta(\Gamma-\kappa+\delta)}\right)^{\varepsilon e^{2\beta\kappa_c}/D^2} \le \exp\left(-e^{-\beta(\Gamma-\kappa+\delta)}\, \varepsilon e^{2\beta\kappa_c}/D^2\right) = \exp\left(-e^{\beta(\kappa-\kappa_c-\delta)}/D^2\right),$$

which goes to 0 as $\beta \to \infty$ for any sufficiently small δ.
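The middle step of the estimate above rests on the elementary inequality $(1-p)^n \le e^{-pn}$ for $0 \le p < 1$ and $n \ge 0$. The following is a minimal numerical sketch of that inequality only; the function name `tiling_bound` and the sample values of `p` and `n` are illustrative assumptions and are not taken from the paper.

```python
import math

def tiling_bound(p, n):
    """Return ((1 - p)**n, exp(-p * n)) for 0 <= p < 1 and n >= 0.

    Hypothetical helper: p plays the role of the single-tile success
    probability and n the number of independent tiles.
    """
    exact = math.exp(n * math.log1p(-p))  # (1 - p)^n computed via logarithms
    upper = math.exp(-p * n)              # the exponential upper bound
    return exact, upper

# Illustrative values only: small p with n growing faster than 1/p.
for p, n in [(1e-3, 1e2), (1e-6, 1e7), (1e-9, 1e12)]:
    exact, upper = tiling_bound(p, n)
    assert exact <= upper
    print(f"p={p:g}, n={n:g}: (1-p)^n={exact:.3e}, exp(-pn)={upper:.3e}")
```

When the product pn grows without bound, as it does in the proof when κ exceeds κ_c, both quantities tend to 0.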

Now that we know infection will show up, we need the following lemma, which states that sites that become infected will remain infected until time τ with very high probability (this is what we expect; once the (+)-phase reaches a region, it is unlikely to leave in a short time).

Lemma 14. Let $B_0$ denote the event that there is some site $x \in \Lambda$ that becomes infected before time τ, but does not remain infected up till time τ. Then

$$\lim_{\beta\to\infty} P(B_0) = 0.$$

Proof. Define the restricted set of configurations

$$\check{R} = \{\eta \in \Lambda(D) : \text{at most } D/3 \text{ sites } x \in \Lambda(D) \text{ have } \eta(x) = -1\},$$

and let the corresponding restricted dynamics be denoted by $\check\sigma_{\Lambda(D)}$. Note that because D > 2L, the ground state for this restricted dynamics has all +1 spins in the box Λ(D). Let ρ denote the Gibbs measure for this restricted dynamics. In the usual way, we define the boundary set

$$\partial\check{R} = \{\eta \in \check{R} : \eta^x \notin \check{R} \text{ for some } x \in \Lambda(D)\},$$

which simply consists of configurations that have exactly D/3 sites in Λ(D) with −1 spins. For any such configuration, each row and column of Λ(D) must have at least one +1 spin, so it is clear that the total length of the contours separating the +1 and −1 spins is at least as large as the perimeter of Λ(D). Thus the energy of any configuration in $\partial\check{R}$ relative to the ground state for the restricted dynamics is at least $hD/3$, and adding over all such configurations, we have

$$\rho(\partial\check{R}) \le D^{2D/3}\, e^{-\beta h D/3}. \qquad (3.15)$$


We can couple the processes $\sigma^{+}_{\Lambda(D)}$ and $\sigma^{\rho}_{\Lambda(D)}$ by simply enlarging the probability space to independently select the starting configuration of the latter process with distribution ρ. The time evolution of both processes is governed by the Poisson processes and uniform random variables, and we clearly have

$$\sigma^{+}_{\Lambda(D);t} \ge \sigma^{\rho}_{\Lambda(D);t}$$

for all $t \ge 0$ (here the superscript + refers only to the sites in the box Λ(D); all other sites have spins frozen at −1 as usual). If we define

$$M = \inf\{t \ge 0 : \check\sigma^{\rho}_{\Lambda(D);t} \in \partial\check{R}\},$$

and notice that $\sigma^{\rho}_{\Lambda(D);t} = \check\sigma^{\rho}_{\Lambda(D);t}$ up to time M, it is clear that

$$\sigma^{+}_{\Lambda(D);t} \ge \check\sigma^{\rho}_{\Lambda(D);t} \quad \text{for all } t < M. \qquad (3.16)$$

We now compute the probability that M < τ. Considering only times that are multiples of $\Delta = e^{-2\beta\kappa}$, we see that stationarity and (3.15) imply that the probability that $\check\sigma^{\rho}_{\Lambda(D);t} \in \partial\check{R}$ for some time t which is an integer multiple of Δ is bounded by

$$e^{2\beta\kappa}\, e^{\beta\kappa}\, D^{2D/3}\, e^{-\beta h D/3} \le C e^{-2\beta\kappa}, \qquad (3.17)$$

using the definition of D. If M < τ, but the aforementioned event did not happen, then there must have been some flip in the interval [M, M + Δ], so by the strong Markov property and using the fact that the flip rates are bounded by 1, the probability of this event is bounded by

$$1 - e^{-e^{-2\beta\kappa} D^2} \le D^2 e^{-2\beta\kappa}. \qquad (3.18)$$

Putting (3.17) and (3.18) together, we have

$$P(M < \tau) \le C e^{-2\beta\kappa}, \qquad (3.19)$$

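where as usual, C is a constant (depending on D) that does not scale with β. Now, using the strong Markov property, (3.16), and (3.19), we can simply add over all $x \in \Lambda$ to obtain:

$$\begin{aligned}
P(\exists x \in \Lambda \text{ and } t < s \le \tau \text{ such that } \mu_t(x) > \mu_s(x))
&\le \sum_{x\in\Lambda} P(\exists t < s \le \tau \text{ such that } \mu_t(x) > \mu_s(x)) \\
&\le \sum_{x\in\Lambda} P\left(\sigma^{+}_{x+\Lambda(D);u} \in \partial\check{R} \text{ for some } u \le \tau\right) \\
&\le \sum_{x\in\Lambda} P(M \le \tau) \\
&\le \left(\varepsilon^{1/2} e^{\beta\kappa_c}\right)^2 C e^{-2\beta\kappa} \to 0
\end{aligned} \qquad (3.20)$$

as $\beta \to \infty$, concluding the proof of the lemma.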


The main work is essentially done; all that remains is to obtain some large deviation estimates to show that infection “spreads” fast enough to get to the origin by time τ. Let δ > 0 be small. Let $B_1$ denote the event that there is some site $x \in \Lambda$ that remains uninfected up to a time $e^{\beta(2-h+\delta)}$ after one of its neighbors becomes infected. Let $B_2$ denote the event that there is some site $x \in \Lambda$ that remains uninfected up to a time $e^{\beta\delta}$ after two of its neighbors have become infected.

Lemma 15. If δ > 0 is sufficiently small, then

$$\lim_{\beta\to\infty} P\left((B_1 \cup B_2) \cap B_0^c\right) = 0.$$

Proof. Consider a fixed site $x \in \Lambda$, and suppose two of its neighbors are infected. Then by definition of infection, the box $x + \Lambda(D)$ has at most $2(D/3) + 1 < D$ sites that have −1 spins. In particular, bootstrapping the +1 spins in the box will fill the box, so that with non–vanishing probability bounded below by some constant α > 0, the box becomes filled with +1 spins in one unit of time. So the probability that this does not happen by time $e^{\beta\delta/2}$ is bounded above by

$$(1 - \alpha)^{e^{\beta\delta/2}},$$

which is a super–exponential bound in β. Adding over all such scenarios, we obtain

$$\lim_{\beta\to\infty} P(B_2 \cap B_0^c) = 0.$$

The argument for $B_1$ is similar. Suppose the site x is infected. Then after the moment of infection and up to time τ, the box $x + \Lambda(D)$ has at most D/3 sites with spins −1. If y is a nearest neighbor of x and x is infected, then the box $y + \Lambda(D)$ is already almost full of +1 spins, except for one row of sites, say E. At each time after x has been infected, there is some site in E that has a neighbor in $x + \Lambda(D)$ that has a +1 spin, so a +1 spin will appear in E at rate $e^{-\beta(2-h)}$. Once such a protuberance appears, the entire new edge E can be filled in a time of order 1, so that using the same sort of reasoning as above, we obtain a super–exponential bound on the probability that a particular infection takes too long, from which it follows, using also Lemma 14, that

$$\lim_{\beta\to\infty} P(B_1 \cap B_0^c) = 0.$$

Using Lemmas 13, 14, and 15, we can now show that a large number of sites, namely a box of sidelength $\varepsilon^{-1/2}$, will become infected by time 2τ/3. This is important because it is when the droplet has sidelengths of this order that it grows with its asymptotic speed of growth (see [KS] and [DS]).

Lemma 16. Let $T^* = \inf\{t : \text{there is a square of sidelength } \varepsilon^{-1/2} \text{ of infected sites in } \Lambda\}$. Then

$$\lim_{\beta\to\infty} P(T^* \le 2\tau/3) = 1.$$


Proof. Thanks to Lemmas 14 and 15, the proof reduces to a simple deterministic calculation. Using Lemma 13 and conditioning on the location of the first infected site, all that needs to be shown is that the time it takes for the infection to spread to all the sites in a box of sidelength $\varepsilon^{-1/2}$ containing the first infected site is bounded by τ/3. Such a square can become infected layer by layer, starting from the first infected site, as follows. We wait for an infection in each new layer (waiting no longer than $e^{\beta(2-h+\delta)}$), and then fill up that layer at a fast rate (each infection taking no longer than $e^{\beta\delta}$). Summing these times, we see that on the event $B_0^c \cap B_1^c \cap B_2^c$, the time it takes for a square of sidelength $\varepsilon^{-1/2}$ containing the first infected site to become infected is bounded by

$$\sum_{i=1}^{\varepsilon^{-1/2}} \left(4 e^{\beta(2-h+\delta)} + 8 i\, e^{\beta\delta}\right) \le C e^{\beta(3(2-h)/2+\delta)} \le \tau/3, \qquad (3.21)$$

for large β and δ > 0 chosen sufficiently small; note that the last inequality above is easily satisfied, i.e., with plenty of time to spare (see Appendix).

The final lemma simply assures that the large droplet that has been formed by time 2τ/3 has enough time to carry the (+)-phase to the origin by time τ.

Lemma 17. $\lim_{\beta\to\infty} P(\text{the origin becomes infected by time } \tau) = 1.$

Proof. By Lemma 16, there is some box of sidelength $\varepsilon^{-1/2}$ of infected sites by time 2τ/3. By conditioning on where this box is in Λ, we can lay out a sequence of tracks (each track being a rectangle of dimensions $1 \times \varepsilon^{-1/2}$) connecting this box to the origin. The sites in each track are all at distance 1 from the sites either in the infected box of sidelength $\varepsilon^{-1/2}$ or in previous tracks in the sequence. Note that we need not use more tracks than the sidelength of Λ, namely $\varepsilon^{1/2} e^{\beta\kappa_c}$, to connect the box of sidelength $\varepsilon^{-1/2}$ to the origin. We claim that each track will become totally infected within time $\varepsilon^{-1/2} e^{\beta\delta} = e^{\beta((2-h)/2+\delta)}$ after the previous tracks in the sequence have become totally infected. To see that the claim is true, simply observe that by dividing up the sites of the new track into groups of D sites, each group is independently being infected by some site in the previous tracks or in the initial droplet at a rate of order $e^{-\beta(2-h+\delta)}$, so that the total rate at which the first site in the new track becomes infected is on the order of

$$\frac{1}{D}\, \varepsilon^{-1/2}\, e^{-\beta(2-h+\delta)} = \frac{1}{D}\, e^{-\beta((2-h)/2+\delta)}.$$

Thus, as in the proof of Lemma 15, the time to wait before the first infection in the new track is on the order of $e^{\beta((2-h)/2+\delta)}$. Once the first infection appears in the new track, the rest of the sites become infected (in order) at rate 1, so the time it takes for the entire track to become infected after the first infection appears is on the order of $\varepsilon^{-1/2} = e^{\beta(2-h)/2} \ll e^{\beta((2-h)/2+\delta)}$. To infect the tracks consecutively, therefore, takes a time of order

$$\varepsilon^{1/2} e^{\beta\kappa_c}\, e^{\beta((2-h)/2+\delta)} = e^{\beta(\kappa_c+\delta)} \le \tau/3, \qquad (3.22)$$

for large β and δ > 0 chosen sufficiently small.
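The first inequality in (3.21) can be illustrated numerically. The sketch below evaluates the layer-by-layer sum directly, taking $\varepsilon^{-1/2} = e^{\beta(2-h)/2}$ as in the proof above and rounding it down to an integer number of layers. The function name `layer_by_layer_time`, the parameter values, and the explicit constant 12 are assumptions made for illustration; the paper only asserts the existence of some constant C.

```python
import math

def layer_by_layer_time(beta, h, delta):
    """Return (total, scale) for the sum in (3.21).

    total: sum over layers i of 4*exp(beta*(2-h+delta)) + 8*i*exp(beta*delta)
    scale: exp(beta*(3*(2-h)/2 + delta)), the claimed order of magnitude
    """
    n_layers = math.floor(math.exp(beta * (2 - h) / 2))  # eps^{-1/2}, rounded down
    wait = math.exp(beta * (2 - h + delta))   # bound on the wait for each new layer
    fill = math.exp(beta * delta)             # bound on each fast infection
    total = sum(4 * wait + 8 * i * fill for i in range(1, n_layers + 1))
    scale = math.exp(beta * (3 * (2 - h) / 2 + delta))
    return total, scale

# Illustrative parameters only (hypothetical choices of beta, h, delta).
for beta in (2.0, 4.0, 6.0):
    total, scale = layer_by_layer_time(beta, h=0.5, delta=0.05)
    assert total <= 12 * scale
    print(f"beta={beta}: total/scale = {total / scale:.3f}")
```

In closed form the sum is $4n\,e^{\beta(2-h+\delta)} + 4n(n+1)\,e^{\beta\delta}$ with $n$ the number of layers, so the first term dominates and the ratio approaches 4 as β grows.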

To complete the proof, note that once the origin is infected, the sites in the box Λ(D) are likely to all have +1 spins “most of the time” from then on. More precisely, using the strong Markov property to restart the process once the origin has become infected, we see that because the ground state for the dynamics restricted to the box Λ(D) is the state where all spins inside Λ(D) are +1, it follows (using also the basic coupling inequalities as usual) that

$$\lim_{\beta\to\infty} P\left(\sigma_{\Lambda;\tau}(x) = +1 \text{ for all } x \in \Lambda(D)\right) = 1. \qquad (3.23)$$

Since Λ(D) contains the support of f, (3.23) completes the proof of the second part of the theorem.

4. Appendix

Here we derive some elementary inequalities relating Γ, h, and 2 − h. Recall that 0 < h < 2, and

$$L = \left\lceil \frac{2}{h} \right\rceil, \qquad \Gamma = 4L - h(L^2 - L + 1).$$

By definition, $1 < \frac{2}{h} \le L < \frac{2}{h} + 1$. Note that the values $h = \frac{2}{L-\delta}$ correspond to the same fixed integer L for any $0 \le \delta < 1$. So $L = \frac{2}{h} + \delta$, and writing $h(L-1) = 2 - h(1-\delta)$ gives

$$2 - h \le h(L-1) < 2.$$

Using the same notation,

$$\begin{aligned}
\Gamma &= 4L - h(L^2 - L + 1) \\
&= 4\left(\frac{2}{h} + \delta\right) - h\left(\frac{4}{h^2} + \frac{4\delta}{h} + \delta^2 - \frac{2}{h} - \delta + 1\right) \\
&= \frac{8}{h} + 4\delta - \frac{4}{h} - 4\delta - \delta^2 h + 2 + \delta h - h \\
&= \frac{4}{h} + 2 - h(\delta^2 - \delta + 1),
\end{aligned}$$

and for $0 \le \delta < 1$, we have $\frac34 \le \delta^2 - \delta + 1 \le 1$, which gives

$$\frac{4}{h} + 2 - h \le \Gamma \le \frac{4}{h} + 2 - \frac{3}{4} h.$$

Using this last inequality and observing that $\frac{4}{h} \ge 4(2-h)$ for 0 < h < 2, we have

$$\Gamma \ge 5(2-h).$$

Acknowledgement. R.H.S. thanks Eduardo Jordão Neves for his collaboration in an early stage of this project.
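The inequalities derived in the Appendix can be checked in exact rational arithmetic. The following is a small sketch, assuming the definitions $L = \lceil 2/h \rceil$ and $\Gamma = 4L - h(L^2 - L + 1)$ recalled there; the helper name `droplet_constants` and the grid of test values for h are illustrative choices, not part of the paper.

```python
from fractions import Fraction
import math

def droplet_constants(h):
    """Return (L, Gamma) for a rational field strength 0 < h < 2.

    Hypothetical helper: L = ceil(2/h), Gamma = 4L - h(L^2 - L + 1).
    """
    L = math.ceil(2 / h)
    gamma = 4 * L - h * (L * L - L + 1)
    return L, gamma

# Exact check on the grid h = k/1000, k = 1, ..., 1999.
for k in range(1, 2000):
    h = Fraction(k, 1000)
    L, gamma = droplet_constants(h)
    assert 2 - h <= h * (L - 1) < 2
    assert 4 / h + 2 - h <= gamma <= 4 / h + 2 - Fraction(3, 4) * h
    assert gamma >= 5 * (2 - h)
print("all Appendix inequalities hold on the test grid")
```

Using `Fraction` avoids floating-point rounding at the values of h where 2/h is an integer, which is exactly where the lower bound on Γ holds with equality.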

References

[AL] Aizenman, M. and Lebowitz, J.L.: Metastability effects in bootstrap percolation. J. Phys. A 21, 3801–3813 (1988)
[DS] Dehghanpour, P. and Schonmann, R.H.: A nucleation–and–growth model. Probab. Theory Relat. Fields 107, 123–135 (1997)
[Dur] Durrett, R.: Lecture notes on interacting particle systems and percolation. Wadsworth & Brooks/Cole Publ. Co., 1988
[GD] Gunton, J.D. and Droz, M.: Introduction to the theory of metastable and unstable states. In: Lecture Notes in Physics 183, Berlin, Heidelberg, New York: Springer, 1983
[KS] Kesten, H. and Schonmann, R.H.: On some growth models with a small parameter. Probab. Theory Relat. Fields 101, 435–468 (1995)
[Lig] Liggett, T.M.: Interacting Particle Systems. Berlin, Heidelberg, New York: Springer, 1985
[Nev] Neves, E.J.: A discrete variational problem related to Ising droplets at low temperatures. J. Stat. Phys. 80, 103–123 (1995)
[NS] Neves, E.J. and Schonmann, R.H.: Critical droplets and metastability for a Glauber dynamics at very low temperatures. Commun. Math. Phys. 137, 209–230 (1991)
[PL] Penrose, O. and Lebowitz, J.L.: Towards a rigorous molecular theory of metastability. In: Fluctuation Phenomena (second edition), E.W. Montroll, J.L. Lebowitz, editors, Amsterdam: North–Holland Physics Publishing, 1987
[RTMS] Rikvold, P.A., Tomita, H., Miyashita, S. and Sides, S.W.: Metastable lifetimes in a kinetic Ising model: dependence on field and system size. Phys. Rev. E 49, 5080–5090 (1994)
[Sch1] Schonmann, R.H.: Slow droplet–driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region. Commun. Math. Phys. 161, 1–49 (1994)
[Sch2] Schonmann, R.H.: Theorems and conjectures on the droplet driven relaxation of stochastic Ising models. In: Probability theory of spatial disorder and phase transition, G. Grimmett, ed., Amsterdam: Kluwer Publ. Co, 1994, pp. 265–301
[SS] Schonmann, R.H. and Shlosman, S.B.: Wulff droplets and the metastable relaxation of kinetic Ising models. Preprint (1997)

Communicated by J.L. Lebowitz

Commun. Math. Phys. 188, 121 – 133 (1997)

Communications in

Mathematical Physics © Springer-Verlag 1997

A Penrose-like Inequality for the Mass of Riemannian Asymptotically Flat Manifolds

Marc Herzlich*

Centre de Mathématiques de l’Ecole polytechnique, CNRS URA 169, 91128 Palaiseau Cedex, France and Département de Mathématiques, Université de Cergy-Pontoise, Site de Saint Martin, 95302 Cergy-Pontoise Cedex, France. E-mail: [email protected]

Received: 17 September 1996 / Accepted: 21 January 1997

Abstract: We prove an optimal Penrose-like inequality for the mass of any asymptotically flat Riemannian 3-manifold having an inner minimal 2-sphere and nonnegative scalar curvature. Our result shows that the mass is bounded from below by an expression involving the area of the minimal sphere (as in the original Penrose conjecture) and some normalized Sobolev ratio. As expected, the equality case is achieved if and only if the metric is that of a standard spacelike slice in the Schwarzschild space.

* Supported in part by the GADGET II program of the European Union.

Introduction

Mass is the most important global invariant of Riemannian asymptotically flat manifolds. It was defined by physicists in General Relativity in the early 60’s [1] and has been a subject of intense study during the last twenty years, leading in particular to the proof of the celebrated “Positive Mass Conjecture”. Although it appeared in the context of Lorentzian Geometry and 3+1-dimensional spacetimes, the conjecture has a Riemannian counterpart, which states that a complete asymptotically flat Riemannian manifold of nonnegative scalar curvature must have a nonnegative mass. Proofs were given first by R. Schoen and S. T. Yau [23], then by E. Witten [25] who used spinors in an alternative proof completed later by T. Parker and C. Taubes [21] and independently by Y. Choquet-Bruhat [8]. In 1973, R. Penrose conjectured an analog of this statement for asymptotically flat manifolds with boundary [22]. This is now known as the Penrose Conjecture.

Let (M, g) be a 3-dimensional asymptotically flat Riemannian manifold with a compact, connected, minimal and stable (inner) boundary ∂M which is a topological 2-sphere. Suppose also that the scalar curvature of (M, g) is nonnegative. Then its mass m, if defined, satisfies

$$m \ge \frac{1}{4} \sqrt{\frac{\operatorname{Area}(\partial M)}{\pi}}$$

and equality is achieved if and only if (M, g) is a spacelike Schwarzschild metric.

A complete proof of the Penrose inequality is not yet available. Steps in this direction (i.e. proofs of the conjecture for special classes of manifolds) were taken by R. Bartnik [4], J. Jezierski [13, 14], M. Ludvigsen and J. Vickers [18] and E. Malec and N. O’Murchadha [19]. The goal of this short article is to establish another (Penrose-like) inequality for the mass. Its formulation is a bit more awkward since it involves an (M, g)-related dimensionless quantity but its statement is still general (i.e. valid for any manifold satisfying the assumptions of the conjecture) and optimal (i.e. it includes a rigidity theorem for the case of equality).

Main Theorem. Let (M, g) be a 3-dimensional asymptotically flat Riemannian manifold with a compact, connected, (inner) boundary ∂M that is a minimal (topological) 2-sphere. Suppose also that the scalar curvature of (M, g) is nonnegative. Then its mass m, if defined, satisfies

$$m \ge \frac{1}{2}\, \frac{\sigma}{1+\sigma} \sqrt{\frac{\operatorname{Area}(\partial M)}{\pi}},$$

where σ is a dimensionless quantity defined as

$$\sigma = \sqrt{\frac{\operatorname{Area}(\partial M)}{\pi}}\ \inf_{f \in C_c^\infty,\ f \not\equiv 0} \frac{\|df\|^2_{L^2(M)}}{\|f\|^2_{L^2(\partial M)}}\,\cdot$$

Moreover, equality is achieved if and only if (M, g) is a spacelike Schwarzschild metric of mass $\frac14 \sqrt{\operatorname{Area}(\partial M)/\pi}$.

The constant σ is positive on any asymptotically flat manifold with compact (inner) boundary. As defined in the statement of the theorem, it is scale-invariant and hence independent of the precise value of the area of the boundary; for example, its value is 1 in the model case of any Schwarzschild metric and 2 in the case of the exterior of any round sphere in flat euclidean space. Unfortunately, no further control (depending on geometrical bounds) of σ has been proved at the present time.

1.
Geometrical Tools of the Proof

Let (M, g) be a smooth Riemannian 3-manifold, $C^{2,\alpha}_\tau$-asymptotically flat with order τ strictly bigger than 1/2, by which we mean that there exists a compact subset K of M and a constant $r_0$ such that $M \setminus K$ is diffeomorphic to $\mathbb{R}^3 \setminus B_0(r_0)$ and the coefficients of the metric tensor in this chart satisfy $g_{kl} - \delta_{kl} \in C^{2,\alpha}_\tau$, where

$$C^{k,\alpha}_\beta = \{u \in C^{k,\alpha}_{loc},\ \|r^\beta u\|_{C^0} < \infty,\ \ldots,\ \|r^{\beta+k} D^k u\|_{C^0} < \infty,\ r^{k+\beta+\alpha} [D^k u]_\alpha < \infty\},$$

where

$$[D^k u]_\alpha = \sup_{|z-z'|\le 1} |z - z'|^{-\alpha}\, |D^k u(z) - D^k u(z')|,$$

and the $\{z_i\}$ are the coordinates of the chart at infinity (with r = |z|). These conditions can be replaced [3] by integral ones with the help of weighted Sobolev spaces defined below. We could then also speak of $W^{2,2}_{\tau-3/2}$-asymptotically flat spaces. In this paper, M has a (smooth) inner boundary ∂M whose inner unit normal and second fundamental form in the metric g will be denoted respectively by ν and θ. Notice that a compact minimal and stable hypersurface of a 3-dimensional Riemannian manifold with nonnegative scalar curvature must be a topological 2-sphere or a torus [24]. Here we need to exclude the toroidal topology. As any 3-dimensional Riemannian manifold, (M, g) is endowed with a complex rank 2 bundle of spinors, denoted ΣM. Both Levi-Civita connections on the tensor and spinor bundles will be denoted by ∇ and the Dirac operator on ΣM by D. The mass m of an asymptotically flat manifold is defined if τ > 1/2 (in case of integral definition of asymptotically flat spaces, this can be relaxed to τ ≥ 1/2) and its scalar curvature belongs to the Lebesgue space $L^1$ [3]. In any chart at infinity, its expression is

$$m = \frac{1}{16\pi} \lim_{r\to\infty} \int_{S_r} (\partial_i g_{ij} - \partial_j g_{ii})\, \nu_r^j \, d\,\mathrm{vol}_{S_r},$$

where $S_r$ is a (large) sphere of radius r in the chart and $\nu_r$ its outer unit normal. We will denote by $M_r$ the compact part of M delimited by $S_r$. The spacelike Schwarzschild metric of mass µ is defined on $\mathbb{R}^3 \setminus B_0(\mu/2)$ by

$$g_\mu = \left(1 + \frac{\mu}{2r}\right)^4 \left(dr^2 + r^2 g_{S^2}\right).$$

It is the only scalar curvature flat and asymptotically flat (of order not smaller than 1/2) metric in the conformal class of the flat space such that the sphere S(0, µ/2) is minimal: this is an easy consequence of the maximum principle as the conformal factor relating two such metrics should be harmonic with Neumann boundary condition, hence constant. The coefficient in the definition of the mass is chosen such that the mass of this spacelike Schwarzschild metric is precisely µ.

The proof of the theorem is divided into two steps: in the first one we prove a positive mass theorem for Riemannian asymptotically flat 3-manifolds with boundary whose mean curvature satisfies some inequality. We use here the usual spinor technique introduced by E. Witten [25] for the mass. As a consequence, we prove an analog of a known result of G. Gibbons, S. Hawking, G. Horowitz and M. Perry on the mass of black holes (but our choice of boundary conditions is different and more precise for our later application). The second one consists in finding a nice conformal change of the metric, such that the mean curvature of the boundary satisfies the condition singled out in the previous step. The main point is then to estimate the behaviour of the conformal factor on the boundary.
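As a sanity check on the Schwarzschild model case, the area of the coordinate sphere of radius r in the metric $g_\mu$ is $4\pi r^2 (1 + \mu/(2r))^4$. The sketch below verifies numerically that this area is smallest at r = µ/2, that the area there is $16\pi\mu^2$, and that both the Penrose bound and the Main Theorem bound with σ = 1 return exactly µ. The function names and the sample value of µ are illustrative assumptions, not notation from the paper.

```python
import math

def sphere_area(r, mu):
    """Area of the coordinate sphere of radius r in the Schwarzschild metric g_mu."""
    return 4 * math.pi * r**2 * (1 + mu / (2 * r))**4

def penrose_bound(area):
    """Right-hand side of the Penrose inequality: (1/4) * sqrt(Area / pi)."""
    return 0.25 * math.sqrt(area / math.pi)

def main_theorem_bound(area, sigma):
    """Right-hand side of the Main Theorem: (1/2) * sigma/(1+sigma) * sqrt(Area / pi)."""
    return 0.5 * sigma / (1 + sigma) * math.sqrt(area / math.pi)

mu = 1.5                      # hypothetical mass value
r0 = mu / 2                   # radius of the minimal sphere S(0, mu/2)
area0 = sphere_area(r0, mu)

assert math.isclose(area0, 16 * math.pi * mu**2)
assert sphere_area(0.99 * r0, mu) > area0 < sphere_area(1.01 * r0, mu)
assert math.isclose(penrose_bound(area0), mu)
assert math.isclose(main_theorem_bound(area0, sigma=1.0), mu)
```

With σ = 1, the factor σ/(1+σ) equals 1/2, so the Main Theorem bound coincides with the Penrose bound, consistent with the equality case stated in the theorem.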


2. A Positive Mass Theorem for Manifolds with Boundary

Our main goal is here to prove the following

Proposition 2.1. Let (M, g) be a $C^{2,\alpha}_\tau$-asymptotically flat Riemannian 3-dimensional manifold of order τ > 1/2 and scalar curvature in $L^1$. Suppose M has an (inner) boundary ∂M, homeomorphic to a 2-sphere, whose mean curvature tr θ satisfies

$$\operatorname{tr}\theta \le 4 \sqrt{\frac{\pi}{\operatorname{Area}(\partial M)}}\,\cdot$$

Then, if the scalar curvature of (M, g) is nonnegative, its mass is nonnegative. Moreover, if its mass is zero, then the manifold is flat.

As usual, the proof proceeds in two steps. The first one establishes a Bochner-Lichnerowicz-Weitzenböck formula for the Dirac operator on spinors. The second one proves the existence of some asymptotically constant and Dirac-harmonic spinor field with well-chosen boundary conditions. Before entering the proof, we define the weighted Sobolev spaces which will be our main tools in the analysis and we also recall the fundamental formula.

Definition 2.2. If r is the radius in any chart at infinity, we define

$$W^{k,p}_\delta = \{u \in W^{k,p}_{loc},\ r^{\delta+l} \nabla^l u \in L^p\ \ \forall\, 0 \le l \le k\}.$$

We shall use here weighted Sobolev spaces of functions as well as fields of spinors. General properties of these spaces were studied in [3, 16, 17, 20].

Lemma 2.3 (Bochner-Lichnerowicz-Weitzenböck formula).

$$D^* D = D^2 = \nabla^* \nabla + \frac{1}{4} \mathrm{Scal},$$

where Scal is the scalar curvature of (M, g). After integration, we get for any smooth spinor field ψ,

$$\int_{M_r} \langle D\psi, D\psi \rangle = \int_{M_r} \langle \nabla\psi, \nabla\psi \rangle + \frac{1}{4} \int_{M_r} \mathrm{Scal}\, \langle \psi, \psi \rangle + \int_{\partial M} \langle \nabla_\nu \psi + \nu \cdot D\psi, \psi \rangle - \int_{S_r} \langle \nabla_{\nu_r} \psi + \nu_r \cdot D\psi, \psi \rangle$$

(recall ν is the inner unit normal of the boundary whereas $\nu_r$ is the outer unit normal of $S_r$). Let us now denote the 2-dimensional Dirac operator of the boundary (with the metric induced from g) by $D\!\!\!\!/$. It acts on the complex rank 2 spinor bundle Σ∂M of the boundary. In our dimensions, there is an isomorphism (along ∂M) denoted by γ between Σ∂M and ΣM, and, since the Clifford algebra in any dimension is nothing else but the even part of the Clifford algebra in one dimension more [15, chapter I], the Clifford actions of vectors upon spinors are related by the following formula

$$\gamma(X \cdot_{\partial M} \psi) = X \cdot_M \nu \cdot_M \gamma(\psi)$$


(for any spinor ψ and vector X tangent to the boundary). From now on, we will no longer speak of the 2-dimensional Clifford action $\cdot_{\partial M}$ and the notation · will always refer to the structure of the whole (M, g). For any spinor field ψ in ΣM over ∂M ($\{e_i\}$ is a basis of the tangent plane at any point of the boundary), we then define the operator

$$A\psi := \sum_{i=1}^{2} e_i \cdot \nabla_{e_i} \psi = \nu \cdot D\!\!\!\!/\,\psi + \frac{1}{2} \operatorname{tr}\theta\ \nu \cdot \psi,$$

where $D\!\!\!\!/\,\psi$ is computed on Σ∂M and then transferred on ΣM through the already described isomorphism. From this formula, it is easy to see that $D\!\!\!\!/$ anticommutes with the action of the normal. The operator A is the boundary term appearing in the integration by parts formula of Lemma 2.3.

We are now going to find a non trivial solution to the PDE system Dψ = 0, with boundary condition $P_+\psi = 0$, where ψ is asymptotically constant and $P_\pm$ are the $L^2$-orthogonal projections on the spaces of eigenvectors of positive (resp. negative) eigenvalues of $D\!\!\!\!/$ on ∂M (notice that since the boundary is a topological 2-sphere, there is no $D\!\!\!\!/$-harmonic spinor field on it). With a slight abuse of notation, let us denote by $L^2_\pm(\partial M, \Sigma)$ the spaces of positive (resp. negative) eigenvectors of $D\!\!\!\!/$ on ∂M. We define the space

$$H = \{\psi \in W^{1,2}_{-1},\ P_+\psi = 0\}.$$

It is straightforward that this defines a Hilbert space with respect to the $W^{1,2}_{-1}$-norm. Its dual will be denoted by H′.

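Lemma 2.4. Suppose Scal is nonnegative and the mean curvature of the boundary satisfies tr θ ≤ 2λ, where λ is the smallest absolute value of eigenvalues of the Dirac operator $D\!\!\!\!/$. Then for any Φ in H′, there exists a unique ψ in H such that

$$\int_M \langle D\psi, D\xi \rangle = \langle \Phi, \xi \rangle_{H',H} \quad \forall\, \xi \in H.$$

Proof. This is the standard Lax-Milgram lemma. We only have to check that the given bilinear form is coercive on H. From the Weitzenböck formula (which is valid for any spinor in $W^{1,2}_{-1}$ since $C_c^\infty$ is dense in it), we get

$$\int_M \langle D\psi, D\psi \rangle = \int_M \langle \nabla\psi, \nabla\psi \rangle + \frac{1}{4} \int_M \mathrm{Scal}\, \langle \psi, \psi \rangle + \int_{\partial M} \langle \psi, \nu \cdot A\psi \rangle.$$

We now compute the boundary term:

$$\int_{\partial M} \langle \psi, \nu \cdot A\psi \rangle = - \int_{\partial M} \langle \psi, D\!\!\!\!/\,\psi \rangle - \frac{1}{2} \int_{\partial M} \operatorname{tr}\theta\, \langle \psi, \psi \rangle.$$

Decomposition along the eigenspaces, with corresponding eigenvectors and eigenvalues $\psi_n$, $\lambda_n$ ($n \in \mathbb{Z}^*$) of $D\!\!\!\!/$ (with the convention that $\lambda_n > 0$ iff n > 0) gives

$$\int_{\partial M} \langle \psi, \nu \cdot A\psi \rangle \ge - \sum_{\lambda_n < 0} \left( \lambda_n \int_{\partial M} \langle \psi_n, \psi_n \rangle + \frac{1}{2} \sup(\operatorname{tr}\theta) \int_{\partial M} \langle \psi_n, \psi_n \rangle \right).$$

If n < 0, the desired term is nonnegative if $-2\lambda_n \ge \operatorname{tr}\theta$ so that

$$\int_M \langle D\psi, D\psi \rangle \ge \int_M \langle \nabla\psi, \nabla\psi \rangle.$$

This ends the proof since the right-hand side is a Hilbert norm on $W^{1,2}_{-1}$.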

Consider now a smooth spinor field $\psi_0$ which is constant in some chart around infinity and such that $P_+\psi_0 = 0$.

Proposition 2.5. Suppose Scal is nonnegative and the mean curvature of the boundary satisfies tr θ ≤ 2λ, where λ is the smallest absolute value of eigenvalues of the Dirac operator $D\!\!\!\!/$. Then there exists a unique ψ in H such that $D(\psi_0 + \psi) = 0$, $P_+(\psi_0 + \psi) = 0$.

Proof. From Lemma 2.4, we get a unique ψ in H such that

$$\int_M \langle D(\psi_0 + \psi), D\xi \rangle = 0 \quad \forall\, \xi \in H.$$

If the spinor field was smooth enough, we would then get by integration by parts,

$$0 = \int_M \langle D(\psi_0 + \psi), D\xi \rangle = \int_M \langle D^2(\psi_0 + \psi), \xi \rangle + \int_{\partial M} \langle D(\psi_0 + \psi), \nu \cdot \xi \rangle.$$

We then have, in a weak sense, $D^2(\psi_0 + \psi) = 0$, $P_+(\psi_0 + \psi) = 0$, $P_+ D(\psi_0 + \psi) = 0$. But the boundary conditions given here satisfy the Lopatinski-Shapiro condition of ellipticity (see [6], but take care that the situation considered there is slightly more complicated than ours; see also [5]), and our operators have smooth coefficients, so that we can apply the results of classical pseudo-differential calculus [12, chapter XX] and we conclude that ψ has local regularity $W^{2,2}$ (including around the boundary) and the last PDE system is valid in the strong sense. From ellipticity of the Dirac operator, we also get that for any spinor field ϕ living in some $W^{2,2}_\beta$,

$$\|\varphi\|_{W^{2,2}_\beta(M \setminus K_1)} \le C \left( \|D^2\varphi\|_{L^2_{\beta+2}(M \setminus K_2)} + \|\varphi\|_{L^2_\beta(M \setminus K_2)} \right),$$

where $K_2 \subset K_1$ are compact subsets of M containing the boundary. This inequality is indeed obtained by patching together the classical local inequalities [10] and using Bartnik’s scaling argument (see [3, Proposition 1.15]).


Applying to $\beta_R \psi$ (where $\beta_R$ is a cut-off function which is zero outside a ball of radius R and satisfies $|d\beta_R| \le c R^{-1}$, $|Dd\beta_R| \le c R^{-2}$) and letting R tend to infinity shows that ψ belongs to $W^{2,2}_{-1}$, $\Psi = D(\psi_0 + \psi)$ belongs to $W^{1,2}_0$, and

$$D\Psi = 0, \qquad P_+\Psi = 0.$$

Since $W^{1,2}_0$ is included in $W^{1,2}_{-1}$, the Weitzenböck formula

$$\int_M \langle D\Psi, D\Psi \rangle \ge \int_M \langle \nabla\Psi, \nabla\Psi \rangle$$

implies that Ψ ≡ 0.

From the lower estimate for eigenvalues of the Dirac operator on 2-spheres, due to C. Bär [2] and O. Hijazi [11],

$$\lambda \ge 2 \sqrt{\frac{\pi}{\operatorname{Area}(\partial M)}}$$

(with equality iff the surface is a standard sphere), we deduce

Corollary. Suppose Scal is nonnegative and the mean curvature of the boundary satisfies

$$\operatorname{tr}\theta \le 4 \sqrt{\frac{\pi}{\operatorname{Area}(\partial M)}}\,.$$

Then there exists a unique ψ in H such that $D(\psi_0 + \psi) = 0$, $P_+(\psi_0 + \psi) = 0$.

The end of the proof is done as usual: if $\psi_0$ is a constant spinor at infinity, Lemma 2.3 together with the fact that

$$4\pi\, |\psi_0|^2\, m = \lim_{r\to\infty} \int_{S_r} \langle \nabla_{\nu_r}(\psi + \psi_0) + \nu_r \cdot D(\psi + \psi_0), (\psi + \psi_0) \rangle$$

gives the positive mass theorem. If the mass is zero, the Weitzenböck formula becomes

$$0 = \int_M \langle \nabla(\psi + \psi_0), \nabla(\psi + \psi_0) \rangle + \frac{1}{4} \int_M \mathrm{Scal}\, \langle (\psi + \psi_0), (\psi + \psi_0) \rangle + \int_{\partial M} \langle (\psi + \psi_0), \nu \cdot A(\psi + \psi_0) \rangle.$$

This shows that the manifold admits a parallel (hence never zero) spinor, so that it is Ricci-flat, then flat.

Remark. Included in Proposition 2.1 is a rigorous proof (in the time-symmetric case) of a result of G. Gibbons, S. Hawking, G. Horowitz and M. Perry [9] who showed a positive mass theorem for Lorentz manifolds whose mean curvature of the boundary is nonpositive. Note that our boundary conditions enable us to get a stronger statement (namely: mass is nonnegative even if there is some positivity of the mean curvature; for example, it includes the case of the exterior of a round sphere in $\mathbb{R}^3$). This is due to the fact that the solutions of Dψ = 0 with our boundary condition are indeed the constant spinors in the euclidean flat space, a situation which wasn’t the case with the


M. Herzlich

boundary conditions used in [9]. Compared to theirs, our positive mass theorem gives us a rigidity statement in the case of zero mass. In [9], such a statement would have been meaningless since there is no compact submanifold (homeomorphic to a sphere) of the flat Euclidean space with nonpositive mean curvature. This remark will become crucial when considering the optimality in the next section.

3. A Conformal Change of Metric

We shall now start with our asymptotically flat Riemannian 3-manifold $(M, g)$ with a compact minimal inner boundary $\partial M$. Our goal is then to find a conformal change of the metric which respects its asymptotically flat character but makes the scalar curvature zero and the boundary a constant mean curvature surface whose mean curvature has exactly the limit value of Proposition 2.1. We will then show that the mass of the original metric exceeds the mass of the new metric (which is nonnegative by the results of the previous section) by the number announced in the main theorem. More precisely, we seek a function

$$\Phi = 1 + u, \qquad u \in W^{1,2}_{-1},$$

such that the metric $\bar g = \Phi^4 g$ satisfies

$$\mathrm{Scal}^{\bar g} \equiv 0, \qquad \mathrm{tr}_{\bar g}\,\bar\theta \equiv +\,4\sqrt{\frac{\pi}{\mathrm{Area}_{\bar g}(\partial M)}}.$$

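Our first step is to compute the changes in scalar curvature, mean curvature and mass resulting from conformal changes.

Lemma 3.1. If $\bar g = \Phi^4 g = (1+u)^4 g$,

$$\mathrm{Scal}^{\bar g} = \Phi^{-4}\bigl(8\,\Phi^{-1}\Delta_g\Phi + \mathrm{Scal}^{g}\bigr),$$

$$\mathrm{tr}_{\bar g}\,\bar\theta = \Phi^{-2}\,\mathrm{tr}_g\,\theta + 4\,\Phi^{-3}\,d\Phi(\nu_g),$$

$$m_{\Phi^4 g} = m_g - \frac{1}{2\pi}\lim_{r\to\infty}\int_{S_r} du(\nu_r)\, d\,\mathrm{vol}_{S_r},$$

where $\nu_g$ (resp. $\nu_r$) is the inner (resp. outer) $g$-unit normal of $\partial M$ (resp. $S_r$).

We shall find our conformal factor by a calculus of variations procedure. Consider the Lagrangian on $W^{1,2}_{-1}$ defined by

$$Q(f) = \frac12\int_M |df|^2 + \frac{1}{16}\int_M \mathrm{Scal}^{g}\,(1+f)^2 + \frac{\sqrt{\pi}}{2}\left(\int_{\partial M}(1+f)^4\right)^{1/2}.$$

Any $u$ in $W^{1,2}_{-1}$, non identically $-1$ on the boundary, which extremizes $Q$, is smooth [7] and satisfies the Euler-Lagrange equations:

$$\Delta_g u + \frac18\,\mathrm{Scal}^{g}\,(1+u) = 0 \quad (\text{on } M),$$

$$du(\nu_g) = \sqrt{\pi}\left(\int_{\partial M}(1+u)^4\right)^{-1/2}(1+u)^3 \quad (\text{on } \partial M).$$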


Proposition 3.2. There exists $u \in W^{1,2}_{-1}$ such that $Q(u) = \min Q$. Moreover, $1+u$ never vanishes, so that $\bar g = (1+u)^4 g$ is a Riemannian metric.

Proof. Since $Q$ is nonnegative, there exist minimizing sequences $(u_i)$ in $W^{1,2}_{-1}$. It is standard that the norms

$$\int_M \Bigl(|df|^2 + \frac{f^2}{r^2}\Bigr) \quad\text{and}\quad \int_M |df|^2$$

are equivalent on $W^{1,2}_{-1}$. Any minimizing sequence then belongs to a bounded (weakly compact) subset of the reflexive Hilbert space $W^{1,2}_{-1}$ (which is included in $L^4(\partial M)$). We can find a subsequence, still denoted by $(u_i)$, and $u$ in $W^{1,2}_{-1}$ such that $u_i$ converges to $u$ strongly in $L^2_{-\eta}$ ($\eta > 1$), and weakly in $W^{1,2}_{-1}$ and $L^4(\partial M)$. Moreover, since $Q$ is continuous and convex,

$$Q(u) \le \lim Q(u_i) = \min Q.$$

Notice that fixing the value of $u$ at infinity breaks the conformal invariance on the whole manifold. This is the reason why we do not enter the same kind of complications as in the Yamabe problem. Although this is not necessary for the rest of the proof, it is easily seen from now on that the zero function cannot be the solution (just use the same idea that is used below).

We now have to show that the solution is not identically $-1$ on the boundary. Suppose this is the case. Then, consider the solution $h$ in $W^{2,p}_{\delta}$ (with $1/2 \le \delta + 3/p < 3/2$) of the linear problem

$$\Delta h + \frac18\,\mathrm{Scal}^{g}\,h = 0 \ \text{ on } M, \qquad dh(\nu) = -1 \ \text{ on } \partial M.$$

Using the method of proof of isomorphism of the previous section, it is easily seen that the Laplacian with Neumann boundary condition is invertible in the desired weighted Sobolev space and, from the maximum principle, $h$ is positive (never vanishes). From standard arguments for weighted spaces [3], we infer that $h$ has the following asymptotic expansion in the neighbourhood of infinity

$$h = \frac{c}{r} + h_1,$$

where $h_1$ belongs to some weighted space whose weight is strictly bigger than 1 (this means precisely that $h_1 = o(r^{-1})$ at infinity). Since $h$ is positive and non-identically zero, the constant $c$ must be (strictly) positive. We then compute

$$\begin{aligned}
Q(u+\varepsilon h) - Q(u) &= \frac12\int_M \bigl(|d(u+\varepsilon h)|^2 - |du|^2\bigr) + \frac{1}{16}\int_M \mathrm{Scal}\,(1+u+\varepsilon h)^2 \\
&\quad - \frac{1}{16}\int_M \mathrm{Scal}\,(1+u)^2 + \frac{\sqrt{\pi}}{2}\left(\int_{\partial M}(1+u+\varepsilon h)^4\right)^{1/2} \\
&= \varepsilon\int_M (1+u)\Bigl(\Delta h + \frac18\,\mathrm{Scal}\,h\Bigr) - \varepsilon\int_{\partial M}(1+u)\,dh(\nu) \\
&\quad + \varepsilon\lim_{r\to\infty}\int_{S_r}(1+u)\,dh(\nu_r) + O(\varepsilon^2) \\
&= \varepsilon\lim_{r\to\infty}\int_{S_r}(1+u)\,dh(\nu_r) + O(\varepsilon^2).
\end{aligned}$$


Furthermore, the strict positivity of $c$ implies that the boundary term in the last formula is strictly negative, so that $Q(u+\varepsilon h) - Q(u) < 0$ for $\varepsilon$ positive and small enough. This shows that $u$ identically $-1$ on the boundary cannot achieve the minimum of $Q$.

Suppose now that there exists $x \in M$ such that $u(x) \le -1$. The solution being superharmonic on a neighbourhood of its minimum, it attains its minimum value on the boundary or at infinity. Then $y \in \partial M$ exists such that $u(y) \le -1$. From the boundary condition, we get $du(\nu) \le 0$ at $y$, and the maximum principle gives a contradiction.

Let us denote by $\mu = \min Q = Q(u)$ and $A = \mathrm{Area}_g(\partial M)$. We can then prove the following:

Lemma 3.3. We have

$$\mu \ge \frac{\sigma}{2+2\sigma}\sqrt{\pi A}, \qquad\text{where}\qquad \sigma = \sqrt{\frac{A}{\pi}}\,\inf_{C_c^\infty}\frac{\|df\|^2_{L^2(M)}}{\|f\|^2_{L^2(\partial M)}} = \sqrt{\frac{A}{\pi}}\,\inf_{W^{1,2}_{-1}}\frac{\|df\|^2_{L^2(M)}}{\|f\|^2_{L^2(\partial M)}}.$$

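Remark. The constant $\sigma$ is positive on any asymptotically flat manifold; up to the normalization, its value is nothing else but the inverse of the norm of the injection of $W^{1,2}_{-1}$ (with the "gradient norm" $\int_M |df|^2$) into the Lebesgue space $L^2$ of the boundary.

Proof. Suppose $\mu \le \eta\sqrt{\pi A}$, where $\eta$ is a small positive constant. We shall now prove that $\eta$ cannot be smaller than an expression involving the Sobolev ratio $\sigma$. From Hölder's inequality, we can write

$$\frac12\int_M |du|^2 + \frac12\sqrt{\frac{\pi}{A}}\int_{\partial M}(1+u)^2 \le \eta\sqrt{\pi A}.$$

From Young's inequality,

$$(1+u)^2 \ge 1 - \frac{1}{\varepsilon} + (1-\varepsilon)\,u^2, \qquad \forall\,\varepsilon > 0,$$

whence

$$\frac12\int_M |du|^2 + \frac12\sqrt{\pi A}\Bigl(1 - \frac{1}{\varepsilon}\Bigr) + \frac{1-\varepsilon}{2}\sqrt{\frac{\pi}{A}}\int_{\partial M} u^2 \le \eta\sqrt{\pi A},$$

so that

$$\frac12\int_M |du|^2 + \frac{1-\varepsilon}{2}\sqrt{\frac{\pi}{A}}\int_{\partial M} u^2 \le \Bigl(\eta - \frac12\Bigl(1 - \frac{1}{\varepsilon}\Bigr)\Bigr)\sqrt{\pi A}.$$

The left-hand side is nonnegative if $\varepsilon - 1 \le \sigma$, so that $\eta \ge \varepsilon^{-1}(\varepsilon - 1)/2$ for all these $\varepsilon$ and the maximum of this expression is achieved for $\varepsilon = 1 + \sigma$.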
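The two elementary facts used in this proof can be sanity-checked numerically. The following is only a sketch, not part of the original argument: the value chosen for `sigma`, the grids, and the function names are illustrative assumptions. It checks the Young-type inequality on a grid and confirms that the lower bound on η over ε ≤ 1 + σ is largest at ε = 1 + σ, where it equals σ/(2 + 2σ).

```python
import math

def young_lower_bound(u: float, eps: float) -> float:
    """Right-hand side of (1 + u)^2 >= 1 - 1/eps + (1 - eps) * u^2."""
    return 1.0 - 1.0 / eps + (1.0 - eps) * u * u

def check_young(eps_values, u_values, tol: float = 1e-9) -> bool:
    """True if the inequality holds (up to rounding) on the whole grid."""
    return all((1.0 + u) ** 2 >= young_lower_bound(u, eps) - tol
               for eps in eps_values for u in u_values)

def eta_bound(eps: float) -> float:
    """Lower bound eta >= (eps - 1) / (2 * eps) from the end of the proof."""
    return (eps - 1.0) / (2.0 * eps)

# Hypothetical value of the Sobolev ratio, chosen only for illustration.
sigma = 0.7

# Admissible eps: the proof needs eps - 1 <= sigma; we scan (1, 1 + sigma].
eps_grid = [1.0 + sigma * k / 1000 for k in range(1, 1001)]
best_eps = max(eps_grid, key=eta_bound)

young_ok = check_young([0.1, 0.5, 1.0, 2.0, 5.0],
                       [x / 10 for x in range(-100, 101)])
```

The difference of the two sides of the inequality is $(\sqrt{\varepsilon}\,u + 1/\sqrt{\varepsilon})^2$, which is why the grid check cannot fail except through rounding.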


We now show that our solution has a slightly better decay than $W^{1,2}_{-1}$: first we obtain (in a very similar manner as in Sect. 2, with the help of well-chosen cut-off functions) that our conformal factor lives in $W^{2,2}_{-1}$. It then lives in $C^{0,\alpha}_{1/2}$ for some $\alpha > 0$. Elliptic regularity (in the Hölder classes) shows then that it belongs to $C^{2,\alpha}_{1/2}$ and since there are no critical weights of the Laplace operator between 0 and 1, we can apply [20, Theorem 6.4] to obtain that it eventually lives in $C^{2,\alpha}_{\tau}$ for some $\tau > 1/2$.

We can now conclude the proof of the main theorem. Collecting all the intermediate results of this section and the Positive Mass Theorem of the previous one, we get

Proposition 3.4. The metric $\bar g = \Phi^4 g = (1+u)^4 g$ is a $C^{2,\alpha}_{\tau}$-asymptotically flat metric (of order $\tau > 1/2$) which is scalar flat. The boundary $\partial M$ has constant mean curvature

$$\mathrm{tr}_{\bar g}\,\bar\theta \equiv 4\sqrt{\frac{\pi}{\mathrm{Area}_{\bar g}(\partial M)}}$$

in it, so that its mass is nonnegative. Moreover,

$$m_g - m_{\bar g} \ge \frac{\sigma}{2+2\sigma}\sqrt{\frac{\mathrm{Area}_g(\partial M)}{\pi}}.$$

Proof. Everything has been done except the last computation relating the masses. From Lemma 3.1 we know

$$m_{(1+u)^4 g} = m_g - \frac{1}{2\pi}\lim_{r\to\infty}\int_{S_r} du(\nu_r)\,d\,\mathrm{vol}_{S_r}.$$

Notice that $u$ decays faster than $r^{-1/2}$ at infinity, so that

$$\lim_{r\to\infty}\int_{S_r} du(\nu_r)\,d\,\mathrm{vol}_{S_r} = \frac12\lim_{r\to\infty}\int_{S_r} d\bigl((1+u)^2\bigr)(\nu_r)\,d\,\mathrm{vol}_{S_r}.$$

Moreover, from Stokes' theorem,

$$\int_{S_r} d\bigl((1+u)^2\bigr)(\nu_r) = \int_{\partial M} d\bigl((1+u)^2\bigr)(\nu) + 2\int_{M_r}|du|^2 + \frac14\int_{M_r}\mathrm{Scal}^{g}\,(1+u)^2,$$

and, injecting the boundary term of the Euler-Lagrange equations,

$$\int_{S_r} d\bigl((1+u)^2\bigr)(\nu_r) = 2\sqrt{\pi}\left(\int_{\partial M}(1+u)^4\right)^{1/2} + 2\int_{M_r}|du|^2 + \frac14\int_{M_r}\mathrm{Scal}^{g}\,(1+u)^2.$$

Taking the limit as $r$ tends to infinity gives

$$\lim_{r\to\infty}\int_{S_r} d\bigl((1+u)^2\bigr)(\nu_r)\,d\,\mathrm{vol}_{S_r} = 2\sqrt{\pi}\left(\int_{\partial M}(1+u)^4\right)^{1/2} + 2\int_{M}|du|^2 + \frac14\int_{M}\mathrm{Scal}^{g}\,(1+u)^2 = 4\,Q(u),$$

which ends the proof.


4. A Quick Look at the Equality Case

This is easily done: it implies that the new metric $\bar g = (1+u)^4 g$ has vanishing mass. It is then flat. Moreover, from the Weitzenböck formula giving the mass, we get that the boundary has (constant) mean curvature whose value is exactly

$$\mathrm{tr}\,\theta = 4\sqrt{\frac{\pi}{\mathrm{Area}(\partial M)}},$$

and it must be a (metric) round sphere, since equality is achieved in the Bär/Hijazi estimate. Let $k_i$ ($i = 1, 2$) be the eigenvalues of $\theta$ with respect to the metric. Integrating on the boundary and using the classical Gauss-Bonnet theorem (since the 3-dimensional manifold is flat, extrinsic and Gaussian curvatures of the boundary coincide), we get

$$\int_{\partial M}(k_1 - k_2)^2 = \int_{\partial M}(k_1 + k_2)^2 - 4\int_{\partial M} k_1 k_2 = 16\pi - 16\pi = 0.$$

The second fundamental form $\theta$ is then exactly the same as that of a round sphere of the same area in the flat space. We can then glue in the interior of a round ball in the flat space. This eventually gives a complete flat manifold (without boundary) with vanishing mass: the previous one is then the complement of a round ball (since the spheres are the only embedded surfaces with constant mean curvature) in the Euclidean space.

From Proposition 3.4 and the identity $m_g - m_{\bar g} = Q(u)/\pi$ obtained in its proof, we see furthermore that, for the solution $u$,

$$Q(u) = \frac{\sigma}{2+2\sigma}\sqrt{\pi A}.$$

Running again the argument of Lemma 3.3 shows that

$$\sqrt{\frac{A}{\pi}}\int_M |du|^2 < \sigma\int_{\partial M} u^2$$

(which contradicts the definition of $\sigma$) unless $\mathrm{Scal}^{g}$ identically vanishes. The original metric is then scalar-flat and rigidity is obtained because the spacelike Schwarzschild space is the unique (up to trivial rescaling) scalar-flat and asymptotically flat metric in the conformal class of the flat space having such a minimal 2-sphere.

Acknowledgement. It's a pleasure for me to thank Olivier Biquard, Jean-Michel Bony and Piotr T. Chruściel for useful discussions, and Emmanuel Hebey for his profitable comments.

Note added in proof The author recently learned that the Penrose inequality has finally been proved by G. Huisken and T. Ilmanen (announcement, June 1997).

References

1. Arnowitt, R., Deser, R. and Misner, C.W.: Coordinate invariance and energy expressions in General Relativity. Phys. Rev. 122, 997–1006 (1961)
2. Bär, C.: Lower eigenvalues estimates for Dirac operators. Math. Ann. 293, 39–46 (1992)
3. Bartnik, R.: The mass of an asymptotically flat manifold. Commun. Pure. Appl. Math. 39, 661–693 (1986)
4. Bartnik, R.: Quasi-spherical metrics and prescribed scalar curvature. J. Diff. Geom. 37, 31–71 (1993)
5. Booss, B. and Wojciechowski, B.: Elliptic boundary problems for the Dirac operator. Basel: Birkhäuser, 1993
6. Bunke, U.: Comparison of Dirac operators for manifolds with boundary. Suppl. Rend. Circ. Mat. Palermo 30, 133–141 (1993)
7. Cherrier, P.: Problèmes de Neumann non-linéaires sur les variétés riemanniennes. J. Funct. Anal. 57, 154–206 (1984)
8. Choquet-Bruhat, Y.: Positive energy theorems. Relativity, groups and topology II. Les Houches XL. 1983 (B. De Witt and R. Stora, eds.), Amsterdam: Elsevier, 1984, pp. 740–785
9. Gibbons, G.W., Hawking, S.W., Horowitz, G.T. and Perry, M.J.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
10. Gilbarg, D. and Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Grundlehr. Math. Wiss., Vol. 224, Berlin: Springer, 1977
11. Hijazi, O.: Première valeur propre de l'opérateur de Dirac et nombre de Yamabe. C. R. Acad. Sci. Paris 313, 865–868 (1991)
12. Hörmander, L.: The analysis of linear partial differential operators III. Grundlehr. Math. Wiss., Vol. 274, Berlin: Springer, 1988
13. Jezierski, J.: Positivity of mass for certain spacetimes with horizons. Class. Quantum Grav. 6, 1535–1539 (1989)
14. Jezierski, J.: Perturbation of initial data set for spherically symmetric charged black hole and the Penrose inequality. Acta Phys. Pol. B 25, 1413–1417 (1994)
15. Lawson, H.B. and Michelsohn, M.L.: Spin geometry. Princeton Math. Series, Vol. 38, Princeton, NJ: Princeton Univ. Press, 1989
16. Lockhart, R.B.: Fredholm properties of a class of elliptic operators on non-compact manifolds. Duke Math. J. 48, 289–312 (1983)
17. Lockhart, R.B. and McOwen, R.B.: Elliptic differential operators on non-compact manifolds. Ann. Scuola. Norm. Sup. Pisa 12, 409–447 (1985)
18. Ludvigsen, M. and Vickers, J.A.: An inequality relating the total mass and the area of a trapped surface in general relativity. J. Phys. A 16, 3349–3353 (1983)
19. Malec, E. and O'Murchadha, N.: Trapped surfaces and the Penrose inequality in spherically symmetric geometries. Phys. Rev. D 49, 6931–6934 (1994)
20. Maz'ya V.G. and Plameneevski, B.A.: Estimates in Lp and in Hölder classes and the Miranda-Agmon Maximum principle for solutions of elliptic boundary value problems in domains with singular points on the boundary. (in Russian) Math. Nachr. 81, 25–82 (1978); English transl.: Amer. Math. Soc. Transl. 123, 1–56 (1984)
21. Parker, T.H. and Taubes, C.H.: On Witten's proof of the positive energy theorem. Commun. Math. Phys. 84, 223–238 (1982)
22. Penrose, R.: Naked singularities. Ann. N. Y. Acad. Sci. 224, 125–134 (1973)
23. Schoen, R. and Yau, S.-T.: On the proof of the positive mass conjecture in General Relativity. Commun. Math. Phys. 65, 45–76 (1979)
24. Schoen, R. and Yau, S.-T.: On the structure of manifolds with positive scalar curvature. Manuscripta math. 28, 159–183 (1979)
25. Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)

Communicated by H. Nicolai

Commun. Math. Phys. 188, 135 – 173 (1997)

Communications in Mathematical Physics
© Springer-Verlag 1997

Relaxation of Disordered Magnets in the Griffiths' Regime*

F. Cesi¹, C. Maes², F. Martinelli³

1 Dipartimento di Fisica, Università "La Sapienza", P.le A. Moro 2, 00185 Roma, Italy. E-mail: [email protected]
2 Instituut voor Theoretische Fysika, K.U. Leuven, Celestijnenlaan 200D B-3001 Leuven, Belgium and Onderzoeksleider N.F.W.O, Belgium. E-mail: [email protected]
3 Dipartimento di Energetica, Università dell' Aquila, Italy. E-mail: [email protected]

Received: 12 June 1996 / Accepted: 23 January 1997

Abstract: We study the relaxation to equilibrium of discrete spin systems with random finite range (not necessarily ferromagnetic) interactions in the Griffiths' regime. We prove that the speed of convergence to the unique reversible Gibbs measure is almost surely faster than any stretched exponential, at least if the probability distribution of the interaction decays faster than exponential (e.g. Gaussian). Furthermore, if the interaction is uniformly bounded, the average over the disorder of the time-autocorrelation function goes to equilibrium as $\exp[-k(\log t)^{d/(d-1)}]$ (in $d > 1$), in agreement with previous results obtained for the dilute Ising model.

* Work partially supported by grant CHRX-CT93-0411 of the Commission of European Communities

1. Introduction

In the present paper we study the speed of convergence to equilibrium of a single spin-flip stochastic dynamics with a reversible Gibbs measure with random interactions in the so called Griffiths' phase.

For simplicity (our models are introduced in the next section) consider here a nearest-neighbor Ising model on the $d$-dimensional lattice with coupling coefficients $J = \{J_{xy}\}$. The $\{J_{xy}\}$ are independent and identically distributed real-valued random variables. If the $\{J_{xy}\}$ are uniformly bounded, then at all sufficiently high temperatures Dobrushin's uniqueness theory applies and detailed information about the unique Gibbs measure and the relaxation to equilibrium of an associated Glauber dynamics are available using the concept of complete analyticity [DS, SZ, LY, MO1 and MO2]. This regime is usually referred to as the paramagnetic phase.

There is then a range of temperatures, below the paramagnetic phase, where, even if the Gibbs state is unique, certain characteristics of the paramagnetic phase like the analyticity of the free energy as a function of the external field disappear. This is the so called Griffiths' regime [G] (see also [F] for additional discussion on this and many other related topics). This "anomalous behavior" is caused by the presence of arbitrarily large clusters of bonds associated with "strong" couplings $J_{xy}$, which can produce a long-range order inside the cluster. Even above the percolation threshold, i.e. when one of such clusters is infinite, there may be a Griffiths phase, at temperatures between a certain critical temperature $T_c(\text{disordered})$ and the critical temperature for the "pure system" (i.e. the system with "strong" couplings everywhere on $\mathbb{Z}^d$). What happens is that for almost all realizations of the disorder $J$ and for all sites $x$ there is a finite length scale $l(J, x)$, such that correlations between $\sigma(x)$ and $\sigma(y)$ start decaying exponentially at distances greater than $l(J, x)$.

In [BD] an "elementary" approach was given to the problem of uniqueness of the equilibrium state of disordered systems in the Griffiths regime (see also [FI]). In another paper [D] Dobrushin prepared the mathematical background for the study of (arbitrary order) truncated correlation functions for spin glasses. Bounds on $T_c(\text{disordered})$ have been obtained in [ACCN and OPG]. More recent references where, at least for the statics, the situation has been considerably cleared up are [DKP, GM2, GM3 and Be]. In particular, under suitable conditions on the couplings distribution, one proves that the infinite volume Gibbs state is unique with probability one and the static correlation functions decay exponentially fast uniformly in the size of the system and its boundary conditions. Though the system is not completely analytic there is still sufficient local analyticity to ensure many high temperature properties.

The effect of the Griffiths' singularities on the dynamical properties is much more serious since, as we will see, the long time behaviour of any associated Glauber dynamics is dominated by the islands of strongly coupled spins produced by large statistical fluctuations in the disorder (see e.g. [B1 and B2]).
Consider the usual stochastic (or kinetic) Ising model associated to the model discussed above. It is a stochastic spin flip dynamics for which the (almost surely unique) Gibbs state is a reversible measure. Let us denote by $q(J, t)$ the absolute difference between the expectation at time $t$ of e.g. the spin at the origin starting in some initial state, and its equilibrium value.

Up to now only little rigorous information was available about the long time behaviour in the Griffiths' regime of $q(J, t)$ or about its more physically relevant disorder-average $q(t)$. The first rigorous result was obtained in [Z] where the absence of a gap in the spectrum of the Markov generator was proven. This result, in particular, rules out the possibility that $q(J, t)$ decays exponentially fast in $t$ with probability one with a rate independent of $J$. In [GM1], for quite general models, an almost sure upper bound of the form

$$q(J, t) \le c(J)\exp[-\lambda(\log t)^{\nu}], \qquad \nu > 1$$

(faster than any polynomial) was derived. This was further improved in [GZ1] where almost sure upper bounds of the form

$$q(J, t) \le c(J)\exp[-\lambda t^{\delta}], \qquad \delta < 1$$

have been proven for both discrete and continuous spin systems, in dimension d ≤ 2, and in [GZ2] where it has been shown that, for continuous spins, d ≤ 3 and ferromagnetic interactions, one has 2 q(J, t) ≤ c(J) exp − exp[ θ(log t)1/(d−1) ] . We refer to the original papers for the precise statement of the results and the models. We only observe that in [GZ1] the probability distribution has only exponential tails and


that, in this case, the stretched exponential bound cannot be improved in general (see the remark after Theorem 3.1).

One of the main results of the present paper is the proof that the almost sure decay is in fact faster than stretched exponential at least for random interaction whose probability distribution decays faster than exponential. In particular, for bounded interactions, we show that

$$q(J, t) \ge c_1(J)\exp\Bigl[-t\exp\bigl[-k_1(\log t)^{1-\frac1d}\bigr]\Bigr],$$

$$q(J, t) \le c_2(J)\exp\Bigl[-t\exp\bigl[-k_2(\log t)^{1-\frac1d}(\log\log t)^{d-1}\bigr]\Bigr].$$

Our general assumption, called (H1) (see Sect. 3 for a precise statement), is that in a cube of side length $L$, the correlation between $\sigma(x)$ and $\sigma(y)$ starts decaying exponentially fast if $|x - y|$ is greater than, say, $L/2$, with probability (w.r.t. the disorder) at least $1 - \exp(-cL)$. We show that this assumption is indeed implied by the assumptions used in [GM2 and DKP]. In a related paper [CMM] we prove that, at least for the two dimensional diluted Ising model, (H1) holds in a wider region of the Griffiths' phase which extends above the percolation threshold.

We also analyze the average (over the disorder or spatially) $q(t)$ of $q(J, t)$ and prove both upper and lower bounds of the form

$$\exp[-\lambda_1(\log t)^{d/(d-1)}] \le q(t) \le \exp[-\lambda_2(\log t)^{d/(d-1)}(\log\log t)^{-d}] \qquad (1.1)$$

for suitable constants $\lambda_1$, $\lambda_2$. This result agrees with the predictions of [B1, B2, DRS and RSP] for dilute Ising magnets. Recent computer simulations [J] suggest that the behavior in (1.1) is attained only when $q(t)$ is extremely small, while at intermediate times $q(t)$ is better fitted with a stretched exponential. We hope to come back to this problem in the near future.

Although we consider only discrete spin systems we don't see any serious obstacle to extending our upper bounds to compact continuous spins. The lower bound, on the other hand, could be very different if one considers, for instance, the Heisenberg model (see our comment at the beginning of Sect. 4.3).

The basic questions treated in this paper can of course also be put in the general context of interacting particle systems (with random interactions). But there no reversibility is guaranteed and the methods of this paper or of [GZ1, GZ2] fail in that case. The only general results so far are contained in [GM1] but they are believed to be far from optimal.

The paper is organized as follows. Sect. 2 contains the definition of our models. Sect. 3 contains the statements of the main results. Sect. 4 contains several technical tools that are essential for Sect. 5. In particular we prove that, if the interactions are bounded, then the relaxation time in an arbitrary set $V$ does not grow faster than an exponential of the "surface" $|V|^{\frac{d-1}{d}}$. Sect. 5 represents the core of the paper. We prove a rather sharp deterministic upper bound on the logarithmic Sobolev constant for the finite volume Gibbs measure. Sect. 6 is devoted to the proof of the main results. Appendix 1 contains the proof of a key geometric bound.
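To get a feeling for how the decay in (1.1) sits between a power law and a stretched exponential, one can compare the logarithms of the three candidate decay laws at a large time. The snippet below is only an illustrative sketch: the constants `lam`, `delta` and `p`, the time `t = 1e12`, and the function names are arbitrary assumptions and are not taken from the paper.

```python
import math

def log_stretched(t: float, lam: float = 1.0, delta: float = 0.5) -> float:
    """log of exp[-lam * t**delta], a stretched exponential with delta < 1."""
    return -lam * t ** delta

def log_griffiths(t: float, lam: float = 1.0, d: int = 2) -> float:
    """log of exp[-lam * (log t)**(d/(d-1))], the leading form in (1.1)."""
    return -lam * math.log(t) ** (d / (d - 1))

def log_power(t: float, p: float = 3.0) -> float:
    """log of t**(-p), a polynomial decay."""
    return -p * math.log(t)

t = 1e12
comparison = {
    d: (log_stretched(t), log_griffiths(t, d=d), log_power(t))
    for d in (2, 3)
}
```

For these illustrative constants the middle entry lies strictly between the other two in both dimensions: the averaged decay is slower than the stretched exponential but faster than the power law.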


2. The Model

The lattice. We consider the $d$ dimensional lattice $\mathbb{Z}^d$ with sites $x = \{x_1, \ldots, x_d\}$ and norm

$$|x| = \max_{i\in\{1,\ldots,d\}} |x_i|.$$

The associated distance function is denoted by $d(\cdot,\cdot)$. By $Q_L$ we denote the cube of all $x = (x_1, \ldots, x_d) \in \mathbb{Z}^d$ such that $x_i \in \{0, \ldots, L-1\}$. If $x \in \mathbb{Z}^d$, $Q_L(x)$ stands for $Q_L + x$. We also let $B_L$ be the ball of radius $L$ centered at the origin, i.e. $B_L = Q_{2L+1}((-L, \ldots, -L))$. A finite subset $\Lambda$ of $\mathbb{Z}^d$ is said to be a multiple of $Q_L$ if $\Lambda$ is the union of a finite number of cubes $Q_L(x_i)$, where $x_i \in L\mathbb{Z}^d$. If $\Lambda$ is a finite subset of $\mathbb{Z}^d$ we write $\Lambda \subset\subset \mathbb{Z}^d$. The cardinality of $\Lambda$ is denoted by $|\Lambda|$. $\mathbb{F}$ is the set of all nonempty finite subsets of $\mathbb{Z}^d$. We define the exterior $n$-boundary as $\partial_n^+\Lambda = \{x \in \Lambda^c : d(x, \Lambda) \le n\}$. Given $r \in \mathbb{Z}_+$ we say that a subset $V$ of $\mathbb{Z}^d$ is $r$-connected if, for all $y, z \in V$ there exist $\{x_1, \ldots, x_n\} \subset V$ such that $x_1 = y$, $x_n = z$ and $|x_{i+1} - x_i| \le r$ for $i = 1, \ldots, n-1$.

The configuration space. Our configuration space is $\Omega = S^{\mathbb{Z}^d}$, where $S = \{-1, 1\}$, or $\Omega_V = S^V$ for some $V \subset \mathbb{Z}^d$. The single spin space $S$ is endowed with the discrete topology and $\Omega$ with the corresponding product topology. Given $\sigma \in \Omega$ and $\Lambda \subset \mathbb{Z}^d$ we denote by $\sigma_\Lambda$ the natural projection over $\Omega_\Lambda$. If $U$, $V$ are disjoint, $\sigma_U\eta_V$ is the configuration on $U \cup V$ which is equal to $\sigma$ on $U$ and $\eta$ on $V$. If $f$ is a function on $\Omega$, $\Lambda_f$ denotes the smallest subset of $\mathbb{Z}^d$ such that $f(\sigma)$ depends only on $\sigma_{\Lambda_f}$. $f$ is called local if $\Lambda_f$ is finite. $\mathcal{F}_\Lambda$ stands for the $\sigma$-algebra generated by the set of projections $\{\pi_x\}$, $x \in \Lambda$, from $\Omega$ to $\{-1, 1\}$, where $\pi_x : \sigma \mapsto \sigma(x)$. When $\Lambda = \mathbb{Z}^d$ we set $\mathcal{F} \equiv \mathcal{F}_{\mathbb{Z}^d}$ and $\mathcal{F}$ coincides with the Borel $\sigma$-algebra on $\Omega$ with respect to the topology introduced above. By $\|f\|_\infty$ we mean the supremum norm of $f$. The gradient of a function $f$ is defined as

$$(\nabla_x f)(\sigma) = f(\sigma^x) - f(\sigma),$$

where $\sigma^x \in \Omega$ is the configuration obtained from $\sigma$ by flipping the spin at the site $x$. If $\Lambda \in \mathbb{F}$ we let

$$|\nabla_\Lambda f|^2 = \sum_{x\in\Lambda}(\nabla_x f)^2.$$

We also define

$$|||f||| = \sum_{x\in\mathbb{Z}^d}\|\nabla_x f\|_\infty.$$
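The spin-flip gradient and the two norms built from it are easy to experiment with on a finite set of sites. The following is a minimal sketch, assuming a configuration is stored as a Python dict from sites to ±1 and that the function depends only on the listed sites; the helper names `flip`, `grad`, `grad_sq` and `triple_norm` are hypothetical and not from the paper.

```python
from itertools import product

def flip(sigma: dict, x) -> dict:
    """Return sigma^x: the configuration sigma with the spin at x flipped."""
    out = dict(sigma)
    out[x] = -out[x]
    return out

def grad(f, sigma: dict, x) -> float:
    """(nabla_x f)(sigma) = f(sigma^x) - f(sigma)."""
    return f(flip(sigma, x)) - f(sigma)

def grad_sq(f, sigma: dict, sites) -> float:
    """|nabla_Lambda f|^2 (sigma) = sum over x in Lambda of (nabla_x f)^2."""
    return sum(grad(f, sigma, x) ** 2 for x in sites)

def triple_norm(f, sites) -> float:
    """|||f||| restricted to 'sites': sum over x of sup over sigma of |nabla_x f|.

    Assumes f depends only on the spins in 'sites', so the supremum can be
    taken by enumerating the 2^|sites| configurations.
    """
    sites = list(sites)
    total = 0.0
    for x in sites:
        total += max(abs(grad(f, dict(zip(sites, spins)), x))
                     for spins in product((-1, 1), repeat=len(sites)))
    return total

# Example: a local function with Lambda_f = {0, 1} on the sites {0, 1, 2}.
sites = [0, 1, 2]
f = lambda s: s[0] * s[1]
sigma = {0: 1, 1: -1, 2: 1}
norm_f = triple_norm(f, sites)
grad_sq_f = grad_sq(f, sigma, sites)
```

Here flipping site 0 or 1 changes `f` by ±2 while flipping site 2 changes nothing, so only the two sites of $\Lambda_f$ contribute to either quantity.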

The interaction and the Gibbs measures. We consider an abstract probability space $(\Theta, \mathcal{B}, \mathbb{P})$ and a set of real valued random variables $J = \{J_A\}$ with $A \in \mathbb{F}$, with the properties

(1) $J_A$ and $J_B$ are independent if $A \ne B$
(2) $J_A$ and $J_{A+x}$ are identically distributed for all $A \in \mathbb{F}$ and all $x \in \mathbb{Z}^d$
(3) There exists $r > 0$ such that with $\mathbb{P}$-probability 1, $J_A = 0$ if $\mathrm{diam}\,A > r$. $r$ is called the range of the interaction.


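The expectation with respect to $\mathbb{P}$ is denoted by $\mathbb{E}(\cdot)$. For $x \in \mathbb{Z}^d$, we let

$$\|J\|_x \equiv \sum_{A \ni x} |J_A|.$$

We write $\|J\|_V = \sup\{\|J\|_x : x \in V\}$. Given a potential or interaction $J$, and $V \in \mathbb{F}$ we define the Hamiltonian $H_V^J : \Omega \to \mathbb{R}$ by

$$H_V^J(\sigma) = -\sum_{A:\, A\cap V \ne \emptyset} J_A \prod_{x\in A}\sigma(x).$$

For $\sigma, \tau \in \Omega$ we also let $H_V^{J,\tau}(\sigma) = H_V^J(\sigma_V\tau_{V^c})$ and $\tau$ is called the boundary condition. For each $V \in \mathbb{F}$, $\tau \in \Omega$ the (finite volume) conditional Gibbs measures on $(\Omega, \mathcal{F})$ are given by

$$\mu_V^{J,\tau}(\sigma) = \begin{cases} \bigl(Z_V^{J,\tau}\bigr)^{-1}\exp[-H_V^{J,\tau}(\sigma)] & \text{if } \sigma(x) = \tau(x) \text{ for all } x \in V^c \\ 0 & \text{otherwise,} \end{cases} \qquad (2.1)$$

where $Z_V^{J,\tau}$ is the proper normalization factor called partition function. We will sometimes drop the superscript $J$ if that does not generate confusion. Given a measurable bounded function $f$ on $\Omega$, $\mu_V f$ denotes the function $\sigma \mapsto \mu_V^\sigma(f)$. Analogously, if $X \in \mathcal{F}$, $\mu_V(X) \equiv \mu_V\,\mathbb{1}_X$, where $\mathbb{1}_X$ is the characteristic function on $X$. $\mu(f, g)$ stands for the covariance (with respect to $\mu$) of $f$ and $g$. The set of measures (2.1) satisfies the DLR compatibility conditions

$$\mu_\Lambda(\mu_V(X)) = \mu_\Lambda(X) \qquad \forall X \in \mathcal{F} \quad \forall\, V \subset \Lambda \subset\subset \mathbb{Z}^d. \qquad (2.2)$$

A probability measure $\mu$ on $(\Omega, \mathcal{F})$ is called a Gibbs measure for $J$ if

$$\mu(\mu_V(X)) = \mu(X) \qquad \forall X \in \mathcal{F} \quad \forall\, V \in \mathbb{F}. \qquad (2.3)$$

Given any two measures $\mu$ and $\nu$ on $(\Omega, \mathcal{F})$, and given $V \in \mathbb{F}$ such that for each $X \in \mathcal{F}_V$, $\nu(X) = 0$ implies $\mu(X) = 0$, we define the $\mathcal{F}_V$-measurable function

$$\frac{d\mu}{d\nu}\Big|_V : \eta \mapsto \frac{\mu\{\sigma \in \Omega : \sigma_V = \eta_V\}}{\nu\{\sigma \in \Omega : \sigma_V = \eta_V\}}, \qquad \eta \in \Omega, \qquad (2.4)$$

where $0/0$ means 0. We have, of course,

$$\mu(f) = \nu\Bigl(f\,\frac{d\mu}{d\nu}\Big|_V\Bigr) \qquad \forall f \in \mathcal{F}_V. \qquad (2.5)$$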
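The finite volume Gibbs measures (2.1) and the DLR compatibility condition (2.2) can be checked by brute force on a very small system. The sketch below assumes a one-dimensional chain of five sites with nearest-neighbour pair couplings only; the coupling values, the boundary condition `tau`, the test function `f` and all function names are illustrative assumptions, not data from the paper.

```python
import math
from itertools import product

# Hypothetical nearest-neighbour couplings J_{x,x+1} on the chain {0,...,4}.
J = {(0, 1): 0.8, (1, 2): -0.3, (2, 3): 1.1, (3, 4): 0.5}

def hamiltonian(V, sigma) -> float:
    """H_V(sigma) = - sum over bonds A meeting V of J_A * prod_{x in A} sigma(x)."""
    return -sum(Jb * sigma[a] * sigma[b]
                for (a, b), Jb in J.items() if a in V or b in V)

def gibbs(V, tau) -> dict:
    """Finite-volume Gibbs measure on V with boundary condition tau, as in (2.1).

    Returns a dict mapping full configurations (tuples) to probabilities;
    only configurations agreeing with tau outside V appear.
    """
    V = sorted(V)
    weights = {}
    for spins in product((-1, 1), repeat=len(V)):
        sigma = list(tau)
        for x, s in zip(V, spins):
            sigma[x] = s
        weights[tuple(sigma)] = math.exp(-hamiltonian(V, sigma))
    Z = sum(weights.values())  # partition function Z_V^{J,tau}
    return {sigma: w / Z for sigma, w in weights.items()}

def expect(mu: dict, f) -> float:
    """Expectation of f under the measure mu."""
    return sum(p * f(sigma) for sigma, p in mu.items())

def dlr_defect(Lam, V, tau, f) -> float:
    """|mu_Lam(mu_V f) - mu_Lam(f)|, which (2.2) says should vanish for V inside Lam."""
    mu_Lam = gibbs(Lam, tau)
    lhs = expect(mu_Lam, lambda s: expect(gibbs(V, s), f))
    rhs = expect(mu_Lam, f)
    return abs(lhs - rhs)

tau = (1, -1, 1, 1, -1)
f = lambda s: s[1] * s[2] + 0.5 * s[3]
defect_single = dlr_defect({1, 2, 3}, {2}, tau, f)
defect_pair = dlr_defect({1, 2, 3}, {1, 2}, tau, f)
```

Both defects should vanish up to rounding, since conditioning $\mu_\Lambda^\tau$ on the spins outside $V$ reproduces $\mu_V$ with those spins as boundary condition.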

The dynamics. The stochastic dynamics we want to study is determined by the Markov generators $L_V^J$, $V \subset \mathbb{Z}^d$, defined by

$$(L_V^J f)(\sigma) = \sum_{x\in V} c_J(x, \sigma)\,(\nabla_x f)(\sigma), \qquad \sigma \in \Omega. \qquad (2.6)$$

The nonnegative real quantities $c_J(x, \sigma)$ are the transition rates for the process. The general assumptions on the transition rates are


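(1) Finite range interactions. If $\sigma(y) = \sigma'(y)$ for all $y$ such that $d(x, y) \le r$, then $c_J(x, \sigma) = c_J(x, \sigma')$.
(2) Detailed balance. For all $\sigma \in \Omega$ and $x \in \mathbb{Z}^d$,

$$\exp\bigl[-H^J_{\{x\}}(\sigma)\bigr]\,c_J(x, \sigma) = \exp\bigl[-H^J_{\{x\}}(\sigma^x)\bigr]\,c_J(x, \sigma^x). \qquad (2.7)$$

(3) Positivity and boundedness. There exist non-negative real numbers $c_m$, $\kappa_1$, $c_M$ and $\kappa_2$ such that

$$c_m\,e^{-\kappa_1\|J\|_x} \le \inf_{x,\sigma} c_J(x, \sigma) \qquad\text{and}\qquad \sup_{x,\sigma} c_J(x, \sigma) \le c_M\,e^{\kappa_2\|J\|_x}. \qquad (2.8)$$

Three cases one may want to keep in mind are

$$c_J(x, \sigma) = \min\bigl\{e^{-(\nabla_x H_{\{x\}})(\sigma)},\, 1\bigr\}, \qquad (2.9)$$

$$c_J(x, \sigma) = \mu^{J,\sigma}_{\{x\}}(\sigma^x) = \bigl[1 + e^{(\nabla_x H_{\{x\}})(\sigma)}\bigr]^{-1}, \qquad (2.10)$$

$$c_J(x, \sigma) = \frac12\bigl[1 + e^{-(\nabla_x H_{\{x\}})(\sigma)}\bigr]. \qquad (2.11)$$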
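The three example rates (2.9)–(2.11) can be tested against the detailed balance condition (2.7) by enumerating a tiny system. This is a hedged sketch only: it assumes a three-site chain with nearest-neighbour couplings, and the coupling values and names (in particular the label `"symmetric"` for (2.11)) are hypothetical.

```python
import math
from itertools import product

# Hypothetical couplings on the three-site chain {0, 1, 2}.
J = {(0, 1): 0.8, (1, 2): -0.3}
N = 3

def local_energy(x: int, sigma: tuple) -> float:
    """H_{x}(sigma): minus the sum over bonds containing x of J * sigma_a * sigma_b."""
    return -sum(Jb * sigma[a] * sigma[b]
                for (a, b), Jb in J.items() if x in (a, b))

def flip(sigma: tuple, x: int) -> tuple:
    """sigma^x: flip the spin at site x."""
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]

def delta_H(x: int, sigma: tuple) -> float:
    """(nabla_x H_{x})(sigma) = H_{x}(sigma^x) - H_{x}(sigma)."""
    return local_energy(x, flip(sigma, x)) - local_energy(x, sigma)

# Rates as functions of the energy difference dH = (nabla_x H_{x})(sigma).
RATES = {
    "metropolis": lambda dH: min(math.exp(-dH), 1.0),      # (2.9)
    "heat_bath": lambda dH: 1.0 / (1.0 + math.exp(dH)),    # (2.10)
    "symmetric": lambda dH: 0.5 * (1.0 + math.exp(-dH)),   # (2.11)
}

def detailed_balance_defect(rate) -> float:
    """Largest violation of (2.7) over all sites and configurations."""
    worst = 0.0
    for sigma in product((-1, 1), repeat=N):
        for x in range(N):
            sx = flip(sigma, x)
            lhs = math.exp(-local_energy(x, sigma)) * rate(delta_H(x, sigma))
            rhs = math.exp(-local_energy(x, sx)) * rate(delta_H(x, sx))
            worst = max(worst, abs(lhs - rhs))
    return worst

defects = {name: detailed_balance_defect(rate) for name, rate in RATES.items()}
```

All three defects should be zero up to rounding. The Metropolis and heat-bath rates never exceed 1, whereas the rate (2.11) grows with the couplings.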

Notice that the first two examples, corresponding to the Metropolis and heat-bath dynamics respectively, satisfy (2.8) with $\kappa_2 = 0$. When considering the infinite volume dynamics (see below) we will always assume uniformly bounded transition rates ($\kappa_2 = 0$), even if this assumption can be relaxed (see Theorem 2.3 in [GZ1]).

We denote by $L_V^{J,\tau}$ the operator $L_V^J$ acting on $L^2(\Omega, d\mu_V^{J,\tau})$ (this amounts to choosing $\tau$ as the boundary condition). Assumptions (1), (2) and (3) guarantee that there exists a unique Markov process whose generator is $L_V^{J,\tau}$, and whose semigroup we denote by $\{T_V^{J,\tau}(t)\}_{t\ge0}$. $L_V^{J,\tau}$ is a bounded operator on $L^2(\Omega, d\mu_V^{J,\tau})$. The process has a unique invariant measure given by $\mu_V^{J,\tau}$. Moreover $\mu_V^{J,\tau}$ is reversible with respect to the process, i.e. $L_V^{J,\tau}$ is self-adjoint on $L^2(\Omega, d\mu_V^{J,\tau})$.

A fundamental quantity associated with the dynamics of a reversible system is the gap of the generator, i.e.

$$\mathrm{gap}(L_V^{J,\tau}) = \inf\,\mathrm{spec}\,\bigl(-L_V^{J,\tau}\restriction \mathbb{1}^{\perp}\bigr),$$

where $\mathbb{1}^{\perp}$ is the subspace of $L^2(\Omega, d\mu_V^{J,\tau})$ orthogonal to the constant functions. The gap can also be characterized as

$$\mathrm{gap}(L_V^{J,\tau}) = \inf_{f\in L^2(\Omega, d\mu_V^{J,\tau}),\ \mathrm{Var}_V^{J,\tau}(f)\ne0}\ \frac{\mathcal{E}_V^{J,\tau}(f, f)}{\mathrm{Var}_V^{J,\tau}(f)}, \qquad (2.12)$$

where $\mathcal{E}$ is the Dirichlet form associated with the generator $L$,

$$\mathcal{E}_V^{J,\tau}(f, f) = \frac12\sum_{\sigma\in\Omega_V}\sum_{x\in V}\mu_V^{J,\tau}(\sigma)\,c(x, \sigma)\,[(\nabla_x f)(\sigma)]^2 \qquad (2.13)$$

and $\mathrm{Var}_V^{J,\tau}$ is the variance relative to the probability measure $\mu_V^{J,\tau}$. We define the logarithmic Sobolev constant $c_s(L_V^{J,\tau})$ associated with the generator $L_V^{J,\tau}$ as the infimum over all $c$ such that, for all positive functions $f$,

$$\mu_V^{J,\tau}(f^2\log f) \le c\,\mathcal{E}_V^{J,\tau}(f, f) + \mu_V^{J,\tau}(f^2)\log\sqrt{\mu_V^{J,\tau}(f^2)}. \qquad (2.14)$$
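For a single spin the variational formula (2.12) can be evaluated by hand, which gives a convenient check of the definitions. In the sketch below the "volume" is one site with Hamiltonian $H(\sigma) = -h\sigma$ (a hypothetical one-body term $h$ standing in for the boundary field); in that two-state case every non-constant $f$ gives the same quotient, namely the sum of the two flip rates. The function names and numerical values are illustrative assumptions.

```python
import math

def heat_bath(dH: float) -> float:
    """Rate (2.10) as a function of the energy difference."""
    return 1.0 / (1.0 + math.exp(dH))

def metropolis(dH: float) -> float:
    """Rate (2.9) as a function of the energy difference."""
    return min(math.exp(-dH), 1.0)

def single_spin_quotient(h: float, rate, f_plus: float, f_minus: float):
    """Return (Dirichlet form / variance, sum of the two flip rates).

    One spin s in {+1, -1} with energy H(s) = -h * s; the Dirichlet form
    follows (2.13) with V reduced to a single site.
    """
    def energy(s: int) -> float:
        return -h * s

    Z = math.exp(-energy(+1)) + math.exp(-energy(-1))
    mu = {s: math.exp(-energy(s)) / Z for s in (+1, -1)}
    f = {+1: f_plus, -1: f_minus}
    # c[s] is the rate of flipping s -> -s, with dH = H(-s) - H(s).
    c = {s: rate(energy(-s) - energy(s)) for s in (+1, -1)}

    dirichlet = 0.5 * sum(mu[s] * c[s] * (f[-s] - f[s]) ** 2 for s in (+1, -1))
    mean = sum(mu[s] * f[s] for s in (+1, -1))
    variance = sum(mu[s] * (f[s] - mean) ** 2 for s in (+1, -1))
    return dirichlet / variance, c[+1] + c[-1]

q_hb, total_hb = single_spin_quotient(0.5, heat_bath, 2.0, -1.0)
q_me, total_me = single_spin_quotient(0.5, metropolis, 0.3, 1.7)
```

For the heat-bath rates the two flip rates sum to 1 whatever the value of `h`, so the single-site gap does not depend on the field; for the Metropolis rates with $h = 0.5$ the sum is $1 + e^{-1}$.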

We define $c_s(L_V^J) = \sup_\tau c_s(L_V^{J,\tau})$. When the transition rates are chosen as in (2.11), it is easy to verify that the Dirichlet form takes a particularly simple form

$$\mathcal{E}_V^{J,\tau}(f, f) = \frac12\,\mu_V^{J,\tau}\bigl(|\nabla_V f|^2\bigr). \qquad (2.15)$$

We denote by $c_s(\mu_V^{J,\tau})$ the logarithmic Sobolev constant associated to this particular choice of the generator $L_V^{J,\tau}$. Notice the following simple estimates relating $c_s(L_V^{J,\tau})$ to $c_s(\mu_V^{J,\tau})$:

$$c_m\,e^{-\kappa_1\|J\|_V}\,c_s(L_V^{J,\tau}) \le c_s(\mu_V^{J,\tau}) \le c_M\,e^{\kappa_2\|J\|_V}\,c_s(L_V^{J,\tau}). \qquad (2.16)$$

The infinite volume dynamics. Let $\mu$ be a Gibbs measure for $J$. If the transition rates are bounded, i.e. when $\kappa_2 = 0$, then the infinite volume generator $L^J$ obtained by choosing $V = \mathbb{Z}^d$ in (2.6) is well defined on the set of functions $f$ such that $|||f|||$ is finite. The closure of $L^J$ in $L^2(\Omega, d\mu)$ (or in $C(\Omega)$, the metric space of all continuous functions on $\Omega$ with the sup-distance) is a Markov generator (see, for instance, Theorems 3.9 in Chapter I and 4.1 in Chapter IV of [L]), which defines a Markov semigroup denoted by $T(t)$. $L^J$ is self-adjoint on $L^2(\Omega, d\mu)$.

The block dynamics. We will also consider a more general version of heat-bath dynamics in which more than one spin can flip at once. Let $D = \{V_1, \ldots, V_n\}$ be an arbitrary collection of finite sets $V_i \in \mathbb{F}$ and let $V = \cup_i V_i$. The generator of the Markov process corresponding to $D$ is defined as

$$L_D^J f = \sum_{i=1}^{n}\bigl(\mu^J_{V_i} f - f\bigr).$$

From the DLR condition (2.2) it follows that $L_D^J$ is self-adjoint on $L^2(\Omega, d\mu_V^{J,\tau})$.

3. Main results

In this section we state our hypotheses and our main results on

(i) the growth of the logarithmic Sobolev constant in a cube of side $L$ as a function of $L$,
(ii) the speed of relaxation to equilibrium for the infinite volume dynamics for a set of potentials $J$ of measure one,
(iii) the speed of relaxation to equilibrium for the averaged infinite volume dynamics.

In order to state our hypotheses we need first the following definition. Given $V \subset\subset \mathbb{Z}^d$, $n, \alpha > 0$, we say that the condition $SMT(V, n, \alpha)$ holds if for all local functions $f$ and $g$ on $\Omega$ such that $d(\Lambda_f, \Lambda_g) \ge n$ we have

$$\sup_{\tau\in\Omega}\,|\mu_V^{J,\tau}(f, g)| \le |\Lambda_f|\,|\Lambda_g|\,\|f\|_\infty\,\|g\|_\infty\,\exp[-\alpha\,d(\Lambda_f, \Lambda_g)].$$

Then our hypotheses on the random interactions $J = \{J_A\}_{A\in\mathbb{F}}$ are:


(H1) There exist $L_0 \in \mathbb{Z}_+$, $\alpha > 0$, $\vartheta > 0$ such that for all $L \ge L_0$,
$$\mathbb{P}\{\, SMT(Q_L, L/2, \alpha) \,\} \ge 1 - e^{-\vartheta L}.$$

(H2) There exists $\delta > 0$ such that $\mathbb{E}\big(\exp(\|J\|_x^{1+\delta})\big) \equiv G_\delta < \infty$.

Some of our results are given in the special case in which the $J_A$'s are bounded, so we let

(H3) There exists $J_0 > 0$ such that, with probability 1, we have $|J_A| \le J_0$ for all $A \in \mathbb{F}$.

Remark 1. The key hypothesis (H1) is different from the assumption that appears in the basic references on disordered systems in the Griffiths phase (see e.g. [GM3, DKP, D, FI, GZ1, GZ2]). In these references, in fact, one sets a constraint on either the inverse temperature $\beta$ or the external fields (if present). Here we adopt a more general hypothesis. In Sect. 3.2 we show that (H1) holds under the assumptions of [GM3] or [DKP], while in [CMM] we study the two dimensional diluted Ising model above the percolation threshold. As far as the second hypothesis is concerned, we observe that it is definitely stronger than the assumption one needs in order to control the equilibrium (see e.g. [DKP] or [GM3]). This fact is, as we hope will be clear from the proofs of the various results, almost unavoidable when dealing with dynamical problems if one wants to get sufficiently precise results. Relaxation times are in fact much more sensitive than correlation functions to the occurrence of small regions of very large couplings (see also the remark after Theorem 3.1). In any case we have focused more on the bounded case (H3) since it appears to be the most interesting one from the physical point of view.

3.1. General theorems.

Theorem 3.1.

(i) Assume (H1) and (H2) and let $A_\delta = (\log G_\delta)^{\frac{1}{1+\delta}} \vee 1$. Then there exist $C(d, r, \alpha, \vartheta)$ and $L_1(d, r, \alpha, \vartheta, L_0, A_\delta)$ such that for all $L \ge L_1$,
$$\mathbb{P}\Big\{ c_s(L^J_{Q_L}) > \exp\Big[ C A_\delta\, (\log\log L)^{d-\frac{\delta}{1+\delta}}\, (\log L)^{1-\frac{\delta}{d(1+\delta)}} \Big] \Big\} < L^{-1.5}. \tag{3.1}$$
If, in addition, (H3) holds (bounded interactions) then there exist $C_1$ and $C_2$ depending on $d$, $r$, $\alpha$ and $\vartheta$ such that for all $L \ge L_1$,
$$\mathbb{P}\Big\{ c_s(L^J_{Q_L}) > C_1 \exp\Big[ C_2 J_0\, (\log\log L)^{d-1} (\log L)^{\frac{d-1}{d}} \Big] \Big\} < L^{-1.5}. \tag{3.2}$$

(ii) Assume (H1) and (H3) and $d \ge 2$. Then for any $\varepsilon \in (0, 1]$ there exist positive constants $C_3$ and $L_2$ depending on $d$, $r$, $\alpha$, $\vartheta$, $J_0$ and $\varepsilon$ such that for all $L \ge L_2$,
$$\mathbb{P}\big\{ c_s(L^J_{Q_L}) > L^{\varepsilon} \big\} < \exp\Big[ -C_3\, (\log\log L)^{-d} (\log L)^{\frac{d}{d-1}} \Big]. \tag{3.3}$$

Remark 1. Using (i) together with the Borel–Cantelli lemma, it follows that, with probability one, $c_s(L^J_{Q_L})$ does not grow faster than the exponential appearing in (3.1). It is quite easy to see that, in this respect, hypothesis (H2) is almost optimal. Let us in fact consider a ferromagnetic model with nearest neighbor couplings $J_{xy} = J_{\{x,y\}}$, with only exponential moments, e.g. such that $\mathbb{P}\{J_{xy} \ge n\} \le C \exp[-n]$, $n = 0, 1, \dots$. That would correspond to $\delta = 0$ in (H2). Then, with large probability, one can find, in the cube


$Q_L$, a pair of nearest neighbor sites $\{x, y\}$ such that, for some small $\varepsilon$, $J_{xy} = \varepsilon \log L$ and $J_{z,z'} = 0$ for all $z \in \{x, y\}$, $z' \in \{x, y\}^c$. In this case the Gibbs measure $\mu^{\tau}_{Q_L(0)}$ will factorize into the product of the Gibbs measure on the pair $\{x, y\}$ with free boundary conditions and the Gibbs measure $\mu^{\tau}_{Q_L(0)\setminus\{x,y\}}$. In turn the logarithmic Sobolev constant $c_s(L^{J,\tau}_{Q_L})$ will be at least as large as the logarithmic Sobolev constant for the pair $\{x, y\}$ with free boundary conditions and coupling $J_{xy} = \varepsilon \log L$. A simple computation shows that the latter, e.g. for the heat-bath dynamics, is of the order of $\exp[J_{xy}] = L^{\varepsilon}$, i.e. much larger than our bound in (3.1).

Remark 2. It is easy to check that, in the bounded case (H3), the almost sure bound on the growth of $c_s(L^{J,\tau}_{Q_L})$ that follows from (3.2) is, apart from the $\log\log L$ factor, optimal. To this purpose let us consider the simplest model, namely the diluted Ising model without external field, with nearest neighbor interactions $J_{xy}$ taking only two values, $0$ and $\bar J \gg J_c$, with probability $1 - p$ and $p \ll p_c$ respectively. Here $J_c$ denotes the critical inverse temperature for the Ising model while $p_c$ is the percolation threshold for bond percolation in $d$ dimensions. Since we are below $p_c$, it is not difficult to check that hypothesis (H1) holds. It is easy to see now that for almost all realizations $J$ there exists $L_0(J)$ such that for all $L \ge L_0(J)$ there exists $x \equiv x(L, J)$, with $|x| \le L/2$, such that all couplings inside the cube $Q_l(x)$, $l = (\varepsilon \log L)^{1/d}$, are equal to $\bar J$ and all couplings connecting a point inside $Q_l(x)$ with one of its nearest neighbors outside it are zero. Let now $\Lambda$ be the cube of side $2L$ centered at the origin. By construction $Q_l(x) \subset \Lambda$. Since the couplings across the boundary of $Q_l(x)$ are zero one has
$$c_s(L^{J}_{\Lambda}) \ge c_s(L^{\bar J}_{Q_l(x)}) \ge \mathrm{gap}(L^{\bar J}_{Q_l(x)})^{-1}.$$
In turn, since $\bar J \gg J_c$, one has (see e.g. [M]) that
$$\mathrm{gap}(L^{\bar J}_{Q_l(x)})^{-1} \ge \exp\big[k\,(\varepsilon \log L)^{\frac{d-1}{d}}\big]$$
for a suitable constant $k$. Actually one can prove a similar lower bound on the logarithmic Sobolev constant even in the more general case discussed in Theorem 3.3 below.

Theorem 3.2. Assume (H1) and (H2) and uniformly bounded transition rates, i.e. $\kappa_2 = 0$ in (2.8). Then

(a) If $d \ge 1$ there exists a set $\bar\Theta \subset \Theta$ of full measure such that for each $J \in \bar\Theta$ there exists a unique infinite volume Gibbs measure $\mu^J$. Moreover there exists a constant $k$ and, for each $J \in \bar\Theta$ and for any local function $f$, there exists $0 < t_0(J, f) < \infty$ such that for all $t \ge t_0$,
$$\|T^J(t)f - \mu^J(f)\|_\infty \le \exp\Big[ -t \exp\Big[ -k\, (\log t)^{1-\frac{\delta'}{d}} (\log\log t)^{d-\delta'} \Big] \Big], \tag{3.4}$$
where $\delta' \in (0, 1)$ is given by $\delta' = \delta(1 + \delta)^{-1}$. If in addition (H3) holds (bounded interactions), then for all $t \ge t_0(J, f)$,
$$\|T^J(t)f - \mu^J(f)\|_\infty \le \exp\Big[ -t \exp\Big[ -k\, (\log t)^{1-\frac{1}{d}} (\log\log t)^{d-1} \Big] \Big]. \tag{3.5}$$

(b) Assume (H3) and $d \ge 2$. Then there exists a constant $k$ and for any local function $f$ there exists $0 < t_0(f) < \infty$ such that, if $t \ge t_0(f)$, then
$$\mathbb{E}\, \|T^J(t)f - \mu^J(f)\|_\infty \le \exp\Big[ -k\, (\log t)^{\frac{d}{d-1}} (\log\log t)^{-d} \Big]. \tag{3.6}$$
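To get a feeling for how fast the almost sure bound (3.5) decays, it helps to work with logarithms. The sketch below is only an illustration: the function names and the sample values of $k$, $\beta$ and $\log t$ are assumptions made here, not quantities from the paper. It compares the logarithm of the decay exponent in (3.5), namely $\log t - k(\log t)^{1-1/d}(\log\log t)^{d-1}$, with the logarithm $\beta \log t$ of a stretched exponential exponent $t^\beta$.

```python
import math


def log_decay_exponent(log_t: float, d: int, k: float) -> float:
    """Logarithm of t * exp(-k (log t)^(1-1/d) (log log t)^(d-1)).

    This is the log of the quantity multiplying -1 in the outer
    exponential of (3.5); `k` is an illustrative constant.
    """
    return log_t - k * log_t ** (1.0 - 1.0 / d) * math.log(log_t) ** (d - 1)


def beats_stretched_exponential(log_t: float, d: int, k: float, beta: float) -> bool:
    """True if the bound (3.5) is smaller than exp(-t^beta) at this time."""
    return log_decay_exponent(log_t, d, k) > beta * log_t


# Illustrative values only: for large enough log t the correction term is
# negligible compared with (1 - beta) * log t, for any fixed beta < 1.
print(beats_stretched_exponential(1e6, d=2, k=1.0, beta=0.9))
print(beats_stretched_exponential(10.0, d=2, k=1.0, beta=0.9))
```

The crossover time depends strongly on $k$ and $d$, which is why the statement is only asymptotic in $t$.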


Remark 1. The "constant" $k$ as well as $t_0$ may depend on the geometrical parameters $d$, $r$ and on the various parameters appearing in our hypotheses, like $\alpha$, $\vartheta$, $L_0$, $\delta$, $J_0$.

Remark 2. The almost sure speed of relaxation to equilibrium is faster than any stretched exponential, at least under our assumptions (H2) or (H3), and, as the next theorem shows, it cannot be improved in general. It is possible to show, at least for ferromagnetic systems, that if we assume an exponential tail for the distribution of the couplings ($\delta = 0$), then the almost sure bound cannot be better than a stretched exponential (see also Remark 1 after Theorem 3.1).

Remark 3. The bound (b) on the relaxation for the averaged dynamics in the bounded case is, apart from the technical factor $(\log\log t)^{d}$, optimal, as Theorem 3.3 below shows. We do not give the analogous result in the unbounded case, i.e. when only (H2) holds, since the computation of the new exponent of $\log t$ is quite involved and, in our opinion, not particularly interesting from the physical point of view.

An interaction $J$ is said to be nearest neighbor (n.n.) if $J_A = 0$ unless $A = \{x, y\}$ and the Euclidean distance between $x$ and $y$ equals 1. We also remind the reader that $\pi_x$, for $x \in \mathbb{Z}^d$, denotes the projection from $\Omega$ onto $\{-1, 1\}$ given by $\pi_x : \sigma \mapsto \sigma(x)$.

Theorem 3.3. For each $d \ge 2$ there is $\tilde J_1(d) > 0$ such that the following holds: assume uniformly bounded transition rates, i.e. $\kappa_2 = 0$ in (2.8), nearest neighbor interactions $J$, and suppose that for almost all $J \in \Theta$ there exists a unique Gibbs measure $\mu^J$. Assume also $p_1 \equiv \mathbb{P}\{J_{xy} = J_1\} > 0$ for some $J_1 > \tilde J_1(d)$ and $p_2 \equiv \mathbb{P}\{|J_{xy}| \le 1/4\} > 0$. Then we have

(a) for all large enough $t$,
$$\mathbb{E}\, \|T^J(t)\pi_0 - \mu^J(\pi_0)\|_{L^2(\mu^J)} \ge \exp\Big[ -k\, (\log t)^{\frac{d}{d-1}} \Big] \tag{3.7}$$
for some $k$ which depends on $d$, $p_1$ and $p_2$.

(b) assume in addition (H3) and $\mathbb{P}\{J_{xy} \le \delta\} = 0$ for some $0 < \delta < 1/4$ (uniformly ferromagnetic interaction). Choose the transition rates of the heat-bath dynamics given in (2.10). Then there exists $k > 0$ such that for almost all $J \in \Theta$ there exists $0 < t_0(J) < \infty$ such that for all $t \ge t_0$ we have
$$\|T^J(t)\pi_0 - \mu^J(\pi_0)\|_\infty \ge \exp\Big[ -t \exp\Big[ -k\, (\log t)^{\frac{d-1}{d}} \Big] \Big]. \tag{3.8}$$

Remark 1. If one assumes (H1) and (H2) then the uniqueness of the Gibbs measure with $\mathbb{P}$-probability one follows.

Remark 2. $\mu^J(\pi_0)$ is clearly equal to 0, by uniqueness of the Gibbs measure and symmetry.

Remark 3. (3.7) is obviously also a lower bound for $\mathbb{E}\, \|T(t)\pi_0 - \mu(\pi_0)\|_\infty$. This lower bound is of the same order, apart from the technical factor $(\log\log t)^{d}$, as the upper bound given in (b) of Theorem 3.2. Although a similar result was argued in [DRS] for the diluted Ising model, to our knowledge this is the first rigorous lower bound in a truly interacting case.

Remark 4. The quantity $\tilde J_1(d)$ comes from Theorem 6.4.
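The averaged bounds (3.6) and (3.7) can be compared in the same logarithmic way. The following sketch is a hedged illustration (function names and all numerical values are assumptions chosen here): it evaluates minus the logarithm of the right-hand sides of (3.6) and (3.7), and the "effective power law exponent" obtained by dividing by $\log t$. Since $d/(d-1) > 1$, this effective exponent grows with $t$, so the averaged decay is faster than any power law, while remaining far slower than the almost sure decay.

```python
import math


def neg_log_upper_bound(log_t: float, d: int, k: float) -> float:
    """-log of the right-hand side of (3.6); k is an illustrative constant."""
    return k * log_t ** (d / (d - 1)) * math.log(log_t) ** (-d)


def neg_log_lower_bound(log_t: float, d: int, k: float) -> float:
    """-log of the right-hand side of (3.7); k is an illustrative constant."""
    return k * log_t ** (d / (d - 1))


def effective_power(log_t: float, d: int, k: float) -> float:
    """Exponent p such that the upper bound (3.6) equals t^(-p) at this time."""
    return neg_log_upper_bound(log_t, d, k) / log_t


for log_t in (1e2, 1e4, 1e6):
    print(log_t, effective_power(log_t, d=2, k=1.0))
```

With equal constants, the two exponents differ exactly by the factor $(\log\log t)^{-d}$ mentioned in Remark 3.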


3.2. Applications. In this section we discuss the hypotheses in our theorems from the point of view of standard examples. Clearly the hypotheses (H2) and (H3) refer to the nature of the disorder (the distribution of the interaction potential) while the first hypothesis (H1) needs to be checked in a given disordered equilibrium model. General methods to verify (H1) can be found in [DKP] and [GM3] or the references therein. Here we follow [GM3] to discuss (H1) for the important example of a random-field short-range spin glass with formal Hamiltonian
$$H = -\sum_{\langle xy\rangle} J_{xy}\,\sigma(x)\sigma(y) - b \sum_{x} h_x \sigma(x) - h \sum_{x} \sigma(x) \tag{3.9}$$
determined by a realization of one-body ($h_x$) and two-body ($J_{xy}$) interactions. To have in mind a specific example satisfying (H2) we could e.g. take the $J_{xy}$ to be independent identically distributed Gaussian random variables and let the $h_x$ be equal to $\pm 1$ or $0$, each with probability $1/3$. In the notation of Sect. 2, $J_A = h_x$ if $A = \{x\}$ and $J_A = J_{xy}$ if the set $A = \{x, y\}$ is a nearest neighbor pair $\langle xy\rangle$ on the lattice. In (3.9) $b$ and $h$ are just (constant) parameters.

To check (H1) we must consider the finite volume measure $\mu^{\tau}_V$ corresponding to (3.9) with $V = Q_L$, and estimate truncated correlation functions. It is an immediate consequence of Corollary 2 in [BM], as applied in the main Theorem of [GM3], that for all local functions $f$ and $g$,
$$|\mu^{\tau}_V(f, g)| \le 2\,|\Lambda_f|\,|\Lambda_g|\, \|f\|_\infty \|g\|_\infty \max_{x\in\Lambda_f,\ y\in\Lambda_g} G(x, y), \tag{3.10}$$
where $G(x, y)$ is the two-point connectivity function for independent site percolation on $\mathbb{Z}^d$ with (random) densities $\{p_z,\ z \in \mathbb{Z}^d\}$ specified below. More precisely, $G(x, y)$ is the probability in the independent site percolation process to find an open path from site $x$ to site $y$; independently, a site $z$ is open with probability $p_z$ and is closed with probability $1 - p_z$. The densities are an explicit function of the interaction potential. In the spin-glass example ($b = 0$ in (3.9)) the random densities are given by
$$p^{SG}_z = \frac12 \Big[ \tanh\Big( \sum_{y} |J_{yz}| + h \Big) + \tanh\Big( \sum_{y} |J_{yz}| - h \Big) \Big],$$
while in the random-field case ($J_{xy} = J > 0$, $h = 0$ in (3.9)) we get
$$p^{RF}_z = \frac12 \Big[ \tanh(2dJ + b h_z) + \tanh(2dJ - b h_z) \Big].$$
It follows easily that $\{p^{SG}_z,\ z \in \mathbb{Z}^d\}$ is a one-dependent stationary random field while the $\{p^{RF}_z\}$ are independent and identically distributed. All the above features are quite general and do not depend very much on the model under investigation. As long as the interaction is short range, we will find some independent percolation process with random but almost independent densities which allows a domination like (3.10). It is not difficult to see that if $\mathbb{E}p_z$ is sufficiently small (typically, below some percolation threshold), then for all sites $x, y \in \mathbb{Z}^d$,
$$\mathbb{E}\, G(x, y) \le e^{-\alpha d(x,y)} \tag{3.11}$$
for some $\alpha > 0$, with $\alpha \to \infty$ as $\mathbb{E}p_z \to 0$. For example, in order to have $\alpha > 0$ for the random-field case it suffices that $\mathbb{E}p^{RF}_z < p_c(d)$, where $p_c(d)$ is the threshold (or critical) density for Bernoulli site


percolation on $\mathbb{Z}^d$; for the general model (3.9) it suffices that $\mathbb{E}p_z < 1/(2d - 1)^2$, see [GM3]. The combination of the upper bounds in (3.10) and (3.11) with the Chebyshev inequality yields (H1). In fact the probability that $SMT(Q_L, L/2, \alpha/2)$ does not hold is, thanks to (3.10), bounded by the probability that there exist $x$ and $y$ such that $d(x, y) \ge L/2$ and $2G(x, y) > \exp(-\alpha d(x, y)/2)$. This latter probability is, in turn, not greater than
$$L^2 \sup_{x,y\in Q_L:\ d(x,y)\ge L/2} \mathbb{P}\{\, 2G(x, y) > \exp(-\alpha d(x, y)/2) \,\} \le 2L^2\, \mathbb{E}\,G(x, y) \exp(\alpha d(x, y)/2) \le 2L^2 e^{-\alpha d(x,y)/2} \le 2L^2 e^{-\alpha L/4} \le e^{-\alpha L/6}.$$
Hence, (H1) is verified. Another (but very similar) approach to check (H1) can be found in [DKP]. In particular their estimate (2.19) is almost the same as (3.10) above except that they dominate via a bond percolation process. Notice that checking (H1) as we have illustrated above requires a rather strong "high temperature" or "strong external field" condition. In certain cases however (e.g. the dilute Ising ferromagnet) one can substantially improve on this. We refer to [CMM] for the details.

4. Preliminaries

In this section we collect several technical results to be used in the next key section. Most of the results presented here, with the notable exception of Theorem 4.12, which seems to us completely new and of independent interest, are rather simple and some of them can actually be found in the literature. We thought it useful, however, also for future purposes, to put them together in a sort of primitive tool-box for the subject.

4.1. Mixing properties and bounds on relative densities for Gibbs measures. In this first part we give three equilibrium results on finite volume Gibbs measures. The first one is what was called in [MO1] "effectiveness" of property $SMT$. The second and the third one provide two simple bounds on the relative density between the projections over certain sets of two different Gibbs measures, once one assumes exponential decay of correlations.

Proposition 4.1. Given $\alpha > 0$ there exist positive numbers $\bar l(d, r, \alpha)$, $\gamma_1(d, r)$, $\gamma_2(d, r)$, $\hat m(d, r, \alpha)$ such that the following holds for all $l > \bar l$: let $V \in \mathbb{F}$ be such that $V$ can be written as a union of (possibly overlapping) cubes of side $l$ and assume that

(i) $SMT(Q_l(x), l/2, \alpha)$ holds if $Q_l(x) \subset V$,

(ii) $\gamma_2 \|J\|_V \le \alpha l$,

then $SMT(V, \gamma_1 l, \hat m(d, r, \alpha))$ holds.

Proof. It follows from Lemma A2.1 of [MO1] and Proposition 3.1, Eqs. (3.9), (3.11) of [O] that any cube $C_i$ for which (i) and (ii) hold also satisfies condition $C_{\bar l}$ of [OP] for some $\bar l$ large enough depending only on $\alpha$, $d$, $r$, provided that the constants $\gamma_1$, $\gamma_2$ are chosen respectively large and small enough depending only on the dimension $d$ and the range $r$. Then the result follows from Propositions 2.5.1, 2.5.2, 2.5.3, 2.5.4 of [OP]. We


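also refer the reader to Appendix A.1 of [MO1] for a simple proof in the attractive case.

Remark. In the sequel, for any given $\alpha > 0$ we will denote by $\hat m(\alpha)$ the constant $\hat m(d, r, \alpha)$ given in the above proposition.

Proposition 4.2. Let $V \subset \Lambda \subset\subset \mathbb{Z}^d$, and let $x \in \Lambda^c$ be such that $d(x, V) > r$. If $U \equiv \Lambda \setminus V$, we have
$$\sup_{\tau\in\Omega} \Big\| 1 - \frac{d\mu^{J,\tau^x}_{\Lambda}}{d\mu^{J,\tau}_{\Lambda}}\Big|_{V} \Big\|_\infty \le e^{14\|J\|_x} \sum_{y\in V} e^{2\|J\|_y} \sup_{\tau\in\Omega} \big| \mu^{\tau}_U\big( e^{-\nabla_x H_U},\, e^{-\nabla_y H_U} \big) \big|.$$

Proof. Let $\mathbf{1} \in \Omega$ be the configuration with all spins equal to $+1$, and let
$$W_{\Lambda,V}(\sigma) = \log Z_U^{\sigma} - \log Z_U^{\sigma_{V^c} \mathbf{1}_V}.$$
It is easy to show that
$$\Big\| 1 - \frac{d\mu^{J,\tau^x}_{\Lambda}}{d\mu^{J,\tau}_{\Lambda}}\Big|_{V} \Big\|_\infty \le e^{2\|\nabla_x W_{\Lambda,V}\|_\infty} \quad \text{for all } \tau \in \Omega,$$
which, using the trivial bound $\|\nabla_x W_{\Lambda,V}\|_\infty \le 4\|J\|_x$, gives
$$\Big\| 1 - \frac{d\mu^{J,\tau^x}_{\Lambda}}{d\mu^{J,\tau}_{\Lambda}}\Big|_{V} \Big\|_\infty \le e^{8\|J\|_x}\, \|\nabla_x W_{\Lambda,V}\|_\infty \quad \text{for all } \tau \in \Omega.$$
By proceeding as in Lemma 3.1 of [MO2] one can show that
$$\|\nabla_x W_{\Lambda,V}\|_\infty \le e^{6\|J\|_x} \sum_{y\in V} e^{2\|J\|_y} \sup_{\tau\in\Omega} \big| \mu^{\tau}_U\big( e^{-\nabla_x H_U},\, e^{-\nabla_y H_U} \big) \big|,$$
which completes the proof.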

Proposition 4.3. For each $m > 0$ there exists $C(d, r, m)$ such that the following holds. Let $A \subset\subset \mathbb{Z}^d$, $A_0 \subset A$ and $B_0 \subset \partial_r^+ A$. Let $\bar A = A \cup \partial_r^+ A$ and assume that

(i) $m d_0 \equiv m\, d(A_0, B_0) \ge \max\{\, C,\ 100\|J\|_{\bar A},\ 10(\log|B_0| + 1) \,\}$,

(ii) $SMT(A \setminus A_0, d_0 - 2r, m)$ holds.

Then for each pair of configurations $\sigma, \tau \in \Omega$ which agree on $\partial_r^+ A \setminus B_0$, we have
$$\Big\| 1 - \frac{d\mu^{\tau}_A}{d\mu^{\sigma}_A}\Big|_{A_0} \Big\|_\infty \le e^{-(m/4) d_0}. \tag{4.1}$$

Proof. For each $\eta \in \Omega_{A_0}$, consider the event $F_\eta = \{\sigma \in \Omega : \sigma_{A_0} = \eta\}$. Choose a pair of configurations $\sigma, \tau$ which agree on $\partial_r^+ A \setminus B_0$. Then there exists a sequence of interpolating configurations $\gamma_i \in \Omega$ for $i = 1, \dots, n$ such that $n \le |B_0|$, $\gamma_{i+1}$ differs from $\gamma_i$ at exactly one site, $\gamma_1 = \sigma$ and $\gamma_n$ agrees with $\tau$ on $\partial_r^+ A$. Thus, for each $\eta \in \Omega_{A_0}$, we can write
$$1 - \frac{\mu^{\tau}_A(F_\eta)}{\mu^{\sigma}_A(F_\eta)} = 1 - \prod_{i=2}^{n} \frac{\mu^{\gamma_i}_A(F_\eta)}{\mu^{\gamma_{i-1}}_A(F_\eta)}. \tag{4.2}$$
If we define
$$a = \sup_{\zeta\in\Omega,\ x\in B_0,\ \eta\in\Omega_{A_0}} \Big| 1 - \frac{\mu^{\zeta}_A(F_\eta)}{\mu^{\zeta^x}_A(F_\eta)} \Big|,$$
then it is easy to check that, if $a \le \frac{1}{10}$ and $a|B_0| \le 1$, then the RHS of (4.2) cannot exceed $e\, a\, |B_0|$, so if we show that, for instance, $a \le e^{-(m/2)d_0}$, the proposition follows. Let then, for $z \in \mathbb{Z}^d$, $g_z = \exp(-\nabla_z H_{A\setminus A_0})$. By Proposition 4.2 and the SMT property given in the hypotheses, we find
$$a \le \sup_{x\in B_0} e^{14\|J\|_{B_0}} \sum_{y\in A_0} e^{2\|J\|_y}\, |\Lambda_{g_x}|\,|\Lambda_{g_y}|\, \big( \|g_x\|_\infty \|g_y\|_\infty \big)\, e^{-m\, d(\Lambda_{g_x}, \Lambda_{g_y})} \le (2r + 1)^{2d}\, e^{20\|J\|_{\bar A}} \sup_{x\in B_0} \sum_{y\in A_0} e^{-m(|x-y| - 2r)}.$$
In the second inequality we have used the fact that $\Lambda_{g_x}$ is contained in a ball of center $x$ and radius $r$, and the fact that $\|g_x\|_\infty \le \exp(2\|J\|_x)$. Finally, using the hypothesis on $d_0$, we easily get $a \le e^{-(m/2)d_0}$.

4.2. Some results on the spectral gap of the block dynamics. Here we provide three lower bounds on the spectral gap of the block dynamics with just two blocks.

Proposition 4.4. Let $V \subset\subset \mathbb{Z}^d$, and let $A$, $B$ be two (possibly intersecting) subsets of $V$ such that $V = A \cup B$. Let $\mathcal D = \{A, B\}$. Assume that
$$\sup_{\tau\in\Omega} \Big\| 1 - \frac{d\mu^{\tau}_A}{d\mu^{\tau}_V}\Big|_{\partial_r^+ B} \Big\|_\infty \le \varepsilon < 1. \tag{4.3}$$
Then the gap for the block dynamics on $\mathcal D$ satisfies
$$\inf_{\tau\in\Omega} \mathrm{gap}(L^{\tau}_{\mathcal D}) \ge 1 - \sqrt{\varepsilon}.$$

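Proof. The action of the semigroup $T_{\mathcal D}(t)$ associated to the block dynamics is given by
$$T_{\mathcal D}(t) f = \sum_{n=0}^{\infty} \frac{t^n}{n!} (L_{\mathcal D})^n f.$$
Using the explicit expression for $L_{\mathcal D}$ and some elementary combinatorics, it is not difficult to show that
$$T_{\mathcal D}(t) f = \sum_{n=0}^{\infty} \frac{(2t)^n}{n!}\, e^{-2t}\, \frac{1}{2^n} \sum_{X\in\{A,B\}^n} \mu_{X_1} \cdots \mu_{X_n}(f). \tag{4.4}$$
Since $(\mu_A)^2 = \mu_A$ (and similarly for $B$) the last summation (over $X$) in (4.4) can be written as
$$\sum_{k=0}^{n-1} \binom{n-1}{k} \big( \hat A_{k+1} + \hat B_{k+1} \big) f, \tag{4.5}$$
where
$$\hat A_k = (\mu_A \circ \mu_B)^{\lfloor k/2 \rfloor} \circ \mu_A^{\,k - 2\lfloor k/2 \rfloor}, \qquad \hat B_k = (\mu_B \circ \mu_A)^{\lfloor k/2 \rfloor} \circ \mu_B^{\,k - 2\lfloor k/2 \rfloor}.$$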


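If now $g$ is an arbitrary bounded measurable function on $\Omega$ such that $\mu_V(g) = 0$, we get
$$\|\mu_A \mu_B \mu_A g\|_\infty \le \|\mu_V \mu_B \mu_A g\|_\infty + \|\mu_V \mu_B \mu_A g - \mu_A \mu_B \mu_A g\|_\infty. \tag{4.6}$$
By the DLR property (2.2) the first term on the RHS of (4.6) is equal to $\mu_V(g) = 0$. Furthermore, since the interaction has range $r$, the function $h \equiv \mu_B \mu_A g$ is $\mathcal F_{V^c \cup \partial_r^+ B}$-measurable. This fact, together with hypothesis (4.3) and the trivial observation that $\mu_A$ and $\mu_V$ agree on $\mathcal F_{V^c}$, implies
$$\|\mu_A \mu_B \mu_A g\|_\infty \le \varepsilon \|\mu_B \mu_A g\|_\infty \le \varepsilon \|\mu_A g\|_\infty. \tag{4.7}$$
Iterating this inequality we get, for each bounded measurable $f$ with $\mu_V(f) = 0$,
$$\|\hat A_k f\|_\infty \le (\sqrt{\varepsilon})^{k-3} \|f\|_\infty, \qquad \|\hat B_k f\|_\infty \le (\sqrt{\varepsilon})^{k-3} \|f\|_\infty. \tag{4.8}$$
Thus, we get that the sup norm of (4.5) is not greater than
$$\|f\|_\infty\, \frac{2}{\varepsilon^{3/2}}\, (1 + \sqrt{\varepsilon})^{n-1},$$
which, inserted back into (4.4), yields
$$\|T_{\mathcal D}(t) f\|_\infty \le \|f\|_\infty\, 4\varepsilon^{-3/2}\, e^{-2t} \sum_{n=0}^{\infty} \frac{t^n}{n!} (1 + \sqrt{\varepsilon})^n = \|f\|_\infty\, 4\varepsilon^{-3/2}\, e^{-(1-\sqrt{\varepsilon})t}.$$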
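The last equality in the proof is just the exponential series, $e^{-2t}\sum_n \frac{t^n}{n!}(1+\sqrt{\varepsilon})^n = e^{-(1-\sqrt{\varepsilon})t}$, which is what produces the decay rate $1 - \sqrt{\varepsilon}$. A minimal numerical sanity check is sketched below; the function names, the truncation order and the sample values of $t$ and $\varepsilon$ are illustrative assumptions.

```python
import math


def truncated_series(t: float, eps: float, n_terms: int = 200) -> float:
    """e^{-2t} * sum_{n < n_terms} t^n / n! * (1 + sqrt(eps))^n."""
    s = math.sqrt(eps)
    total = 0.0
    term = 1.0  # value of t^n (1 + s)^n / n! at n = 0
    for n in range(n_terms):
        total += term
        term *= t * (1.0 + s) / (n + 1)
    return math.exp(-2.0 * t) * total


def closed_form(t: float, eps: float) -> float:
    """e^{-(1 - sqrt(eps)) t}, the decay factor appearing in the proof."""
    return math.exp(-(1.0 - math.sqrt(eps)) * t)


print(truncated_series(3.0, 0.25), closed_form(3.0, 0.25))
```

For moderate $t$ the truncation at 200 terms is far more than enough; for very large $t$ one would need more terms or a logarithmic summation.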

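Proposition 4.5. Let $V$, $A$ and $B$ be as in Proposition 4.4. Let also $A_0 = A \cap \partial_s^+ B$, with $s \ge r$, $B_0 = B \cap \partial_r^+ A$ and $\bar A = A \cup \partial_r^+ A$. For each $m > 0$ there exists $C(d, r, m)$ such that if

(i) $m d_0 \equiv m\, d(A_0, B_0) \ge \max\{\, C,\ 100\|J\|_{\bar A},\ 10(\log|B_0| + 1) \,\}$,

(ii) $SMT(A \setminus A_0, d_0 - 2r, m)$ holds,

then
$$\inf_{\tau\in\Omega} \mathrm{gap}(L^{\tau}_{\{A,B\}}) \ge \frac12.$$

Proof. Thanks to Proposition 4.4 it is sufficient to show that
$$\sup_{\tau\in\Omega} \Big\| 1 - \frac{d\mu^{\tau}_A}{d\mu^{\tau}_V}\Big|_{A_0} \Big\|_\infty \le \frac14. \tag{4.9}$$
By the DLR property (2.3) we have
$$\text{LHS of (4.9)} \le \sup_{\tau,\sigma\in\Omega:\ \tau_{V^c} = \sigma_{V^c}} \Big\| 1 - \frac{d\mu^{\tau}_A}{d\mu^{\sigma}_A}\Big|_{A_0} \Big\|_\infty.$$
At this point we can use Proposition 4.3 and obtain the result.

Proposition 4.6. Let $V$, $A$, $B$ be as in Proposition 4.4. Let $N = |\partial_r^+ A \cap B| \wedge |\partial_r^+ B \cap A|$. Then there exists $k = k(d, r)$ such that
$$\inf_{\tau\in\Omega} \mathrm{gap}(L^{J,\tau}_{\{A,B\}}) \ge \exp[\, -k \|J\|_V N \,]. \tag{4.10}$$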


Proof. We can assume $N = |\partial_r^+ B \cap A|$. Consider a new interaction $J'$ such that $B$ and $V \setminus B$ are decoupled, i.e.
$$J'_X = \begin{cases} J_X & \text{if } X \cap \partial_r^+ B \cap A = \emptyset, \\ 0 & \text{otherwise.} \end{cases}$$
We have clearly $\|J - J'\|_x = 0$ unless $x$ is in a neighborhood of radius $r$ of $\partial_r^+ B \cap A$, hence
$$\sum_{x\in V} \|J - J'\|_x \le k_1 N \|J\|_V$$
for some $k_1$ which depends on $d$ and $r$. This implies that for all functions $f$ on $\Omega$,
$$\mathcal E^{J,\tau}_{\{A,B\}}(f, f) \ge \exp(-4k_1 \|J\|_V N)\, \mathcal E^{J',\tau}_{\{A,B\}}(f, f), \tag{4.11}$$
$$\mathrm{Var}^{J,\tau}_V(f) \le \exp(4k_1 \|J\|_V N)\, \mathrm{Var}^{J',\tau}_V(f). \tag{4.12}$$
From (4.11), (4.12) and the variational characterization of the gap (2.12), it follows that
$$\mathrm{gap}(L^{J,\tau}_{\{A,B\}}) \ge \exp[-8k_1 \|J\|_V N]\ \mathrm{gap}(L^{J',\tau}_{\{A,B\}}). \tag{4.13}$$
In order to estimate the gap for the block dynamics with couplings $J'$, we just notice that the hypotheses of Proposition 4.4 are satisfied with $\varepsilon = 0$, and thus $\mathrm{gap}(L^{J',\tau}_{\{A,B\}}) \ge 1$.

4.3. Some general results on the spectral gap and the log-Sobolev constant. This is actually the most important part of this section since it contains two key results. The first one, Proposition 4.9, has been essentially proved in [MO2] and, at least in the case of bounded interaction, it roughly says the following. If in a given cube $Q_L$ of side $L$ truncated correlations decay exponentially fast on all length scales larger than $l_1$, with $l_1 \ll L$, then the logarithmic Sobolev constant in that cube is not larger than the largest among the logarithmic Sobolev constants of all cubes of side $l_1$ inside $Q_L$. In order to appreciate this result one should consider that, if hypotheses (H1) and (H3) hold, then with probability one, truncated correlations in a cube of side $L$ centered at the origin decay exponentially fast on all length scales larger than $l_1 \approx \log L$. Thus in this case we would have a logarithmic contraction of the starting length scale, namely from $L$ to $\log L$. This result, together with a very rough estimate of the logarithmic Sobolev constant for a cube (see Proposition 4.10 and Theorem 4.12 below), allows us to conclude immediately that, with probability one, $c_s(L^{\tau}_{B_L})$ cannot grow faster than $\exp[C(\log L)^{d-1}]$. Notice that in two dimensions this bound is just a power law in the side $L$.

The second important result is a very general lower bound on the spectral gap of Glauber dynamics (or upper bound on the logarithmic Sobolev constant) in an arbitrary set $V \subset\subset \mathbb{Z}^d$. It says that the spectral gap is always larger than a negative exponential of $|V|^{\frac{d-1}{d}}$. Notice that if $V$ is a cube then $|V|^{\frac{d-1}{d}}$ is simply its surface. In this case the bound is certainly optimal, at least in our general setting, since it is known that for several models of lattice discrete spins in the phase coexistence region, the activation energy between different stable phases is proportional to the surface of the region in consideration (see [M] and [CGMS] for more precise statements for the Ising model). Apparently the situation for continuous spin systems can be very different. For Heisenberg models, in fact, it is believed on the basis of spin-wave theory (see [B1, B2]) that, at least for cubic regions, the spectral gap does not go to zero faster than the inverse of the volume. It is a challenging problem to actually prove it!
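Two elementary facts used in this discussion are easy to check numerically: for a cube of side $l$ the quantity $|V|^{(d-1)/d}$ equals $l^{d-1}$ (the order of its surface), and in $d = 2$ the bound $\exp[C(\log L)^{d-1}]$ reduces to the power law $L^{C}$. The following sketch is purely illustrative; the function names and the value of $C$ are assumptions made for the example.

```python
import math


def surface_term(volume: float, d: int) -> float:
    """|V|^((d-1)/d), the quantity entering the spectral gap lower bound."""
    return volume ** ((d - 1) / d)


def log_sobolev_growth(L: float, d: int, C: float) -> float:
    """exp[C (log L)^(d-1)], the rough almost sure growth bound quoted above."""
    return math.exp(C * math.log(L) ** (d - 1))


# Cube of side 10 in d = 3: volume 1000, surface term 10^2.
print(surface_term(10 ** 3, d=3))
# In d = 2 the growth bound is exactly the power law L^C.
print(log_sobolev_growth(50.0, d=2, C=3.0), 50.0 ** 3)
```

In $d \ge 3$ the same expression grows faster than any power of $L$, which is one motivation for the finer two-scale analysis of Sect. 5.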


Definition 4.7. A cube $C = Q_l(x)$ is said to be $\alpha$-regular if, letting $n = \lfloor l/(2\gamma_1) \rfloor$,

(i) $SMT(Q_n(y), n/2, \alpha)$ holds for all $y \in Q_l(x)$,

(ii) $(\gamma_2 \vee 100)\, \|J\|_{\bar C} \le \hat m(\alpha)\, l$,

where $\bar C = C \cup \partial_r^+ C$ and the constants $\gamma_1$, $\gamma_2$ and $\hat m(\alpha)$ are those appearing in Proposition 4.1.

We immediately observe the following

Proposition 4.8. Assume (H1) and (H2). Then there exist $L_0' \in \mathbb{Z}_+$, $\vartheta' > 0$ (depending on $\alpha$, $\vartheta$, $\gamma_1$, $\gamma_2$ and $G_0$) such that for all $L \ge L_0'$,
$$\mathbb{P}\{\, Q_L \text{ is } \alpha\text{-regular} \,\} \ge 1 - e^{-\vartheta' L}.$$
Furthermore, if $V$ is a union of $\alpha$-regular cubes of side length $l$, then $SMT(V, l/2, \hat m(\alpha))$ holds.

Proof. The probability that $Q_L$ is not $\alpha$-regular is bounded by (we use the exponential Chebyshev inequality)
$$\mathbb{P}\{\text{(i) does not hold}\} + \mathbb{P}\{\text{(ii) does not hold}\} \le L^d e^{-\vartheta L/(3\gamma_1)} + (L + 2r)^d\, G_0\, e^{-\hat m L/(\gamma_2 \vee 100)} \le e^{-\vartheta' L}$$
if $L$ is greater than some $L_0'$. The second statement follows from Proposition 4.1.

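Proposition 4.9. Let $l_1 \in \mathbb{Z}_+$ and let $\Lambda \subset\subset \mathbb{Z}^d$ be a multiple of $Q_{l_1}$, i.e. $\Lambda = \cup_{i=1}^{n} B_i$, where $B_i = Q_{l_1}(x_i)$ for some $x_i \in l_1 \mathbb{Z}^d$. Let, for any $I \subset \{1, \dots, n\}$, $\Lambda_I = \cup_{i\in I} B_i$. Let also $\mathcal A$ be the set of all $I \subset \{1, \dots, n\}$ such that $\mathrm{diam}(\Lambda_I) \le 3 l_1$. Assume that each $B_i$ is $\alpha$-regular for $\alpha > 0$. Then there exist two positive constants $\bar l_1$ and $k$ depending on $\alpha$, $d$, $r$ such that if $m \equiv \hat m(\alpha)$, and

(i) $l_1 \ge \bar l_1$,

(ii) $\inf_{I\in\mathcal A} \inf_{\tau\in\Omega} \mathrm{gap}(L^{\tau}_{\Lambda_I}) \ge \exp[-m l_1/2]$,

then
$$\sup_{\tau\in\Omega} c_s(\mu^{\tau}_{\Lambda}) \le k \sup_{I\in\mathcal A}\ \sup_{\tau\in\Omega} c_s(\mu^{\tau}_{\Lambda_I}).$$

Proof. Since each $B_i$ is $\alpha$-regular, using Proposition 4.1, we get that for any $I \subset \{1, \dots, n\}$, $SMT(\Lambda_I, l_1/2, m)$ holds. This fact, together with hypothesis (ii), allows us to apply Theorem 2.1 of [MO2] and to conclude that there exists $k(d, r, \alpha)$ such that
$$\sup_{\tau\in\Omega} c_s(\mu^{\tau}_{\Lambda}) \le \frac{k}{4} \sup_{i\in\{1,\dots,n\}}\ \sup_{I\subset\{1,\dots,n\},\ I\ni i}\ \sup_{\tau\in\Omega} \gamma_i(\mu^{\tau}_{\Lambda_I}), \tag{4.14}$$
provided that $l_1$ was taken large enough. Here, for any $I \subset \{1, \dots, n\}$ containing $i$, $\gamma_i(\mu^{\tau}_{\Lambda_I})$ is the smallest constant $\gamma$ such that the logarithmic Sobolev inequality
$$\mu^{\tau}_{\Lambda_I}(f^2 \log f) \le \gamma\, \mu^{\tau}_{\Lambda_I}(|\nabla_{\Lambda_I} f|^2) + \mu^{\tau}_{\Lambda_I}(f^2) \log \sqrt{\mu^{\tau}_{\Lambda_I}(f^2)}$$
holds for all positive functions $f$ that depend only on the spins in $B_i$.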


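It is clear from the above definition that $\gamma_i(\mu^{\tau}_{\Lambda_I}) \le c_s(\mu^{\tau}_{\Lambda_I})$. Assume that the supremum in the RHS of (4.14) is attained over a set $I \notin \mathcal A$ (otherwise the proof of the proposition would be finished). Given $i$, $I$ such that $i \in I \notin \mathcal A$, let $I_0$ be the largest subset of $I$ such that $d(B_i, B_j) < l_1$ for all $j \in I_0$. By construction $i \in I_0 \in \mathcal A$. We claim that
$$\gamma_i(\mu^{\tau}_{\Lambda_I}) \le 4 \sup_{\tau\in\Omega} c_s(\mu^{\tau}_{\Lambda_{I_0}}), \tag{4.15}$$
provided that $l_1$ is large enough depending on $\alpha$, $d$, $r$. Such a bound clearly completes the proof. In order to prove (4.15) it is enough to estimate the relative density between the projections over $\mathcal F_{B_i}$ of the two Gibbs measures $\mu^{\tau}_{\Lambda_I}$ and $\mu^{\tau}_{\Lambda_{I_0}}$ uniformly in the boundary condition $\tau$. More precisely, let
$$g^{\tau}_{\max} \equiv \Big\| \frac{d\mu^{\tau}_{\Lambda_I}}{d\mu^{\tau}_{\Lambda_{I_0}}}\Big|_{B_i} \Big\|_\infty; \qquad g^{\tau}_{\min} \equiv \min_{\sigma\in\Omega_{B_i}} \frac{d\mu^{\tau}_{\Lambda_I}}{d\mu^{\tau}_{\Lambda_{I_0}}}\Big|_{B_i}(\sigma).$$
Then, using Exercise 6.1.27 of [DeSt], (2.2) and the bound $\gamma_i(\mu^{\tau}_{\Lambda_{I_0}}) \le c_s(\mu^{\tau}_{\Lambda_{I_0}})$, we get
$$\gamma_i(\mu^{\tau}_{\Lambda_I}) \le \sup_{\tau\in\Omega} \frac{g^{\tau}_{\max}}{g^{\tau}_{\min}}\, \gamma_i(\mu^{\tau}_{\Lambda_{I_0}}) \le \sup_{\tau\in\Omega} \frac{g^{\tau}_{\max}}{g^{\tau}_{\min}}\, c_s(\mu^{\tau}_{\Lambda_{I_0}}). \tag{4.16}$$
We then use the DLR equations and write
$$g^{\tau}_{\max} \le \sup_{\tau,\tau'\in\Omega:\ \tau_{\Lambda_I^c} = \tau'_{\Lambda_I^c}} \Big\| \frac{d\mu^{\tau}_{\Lambda_{I_0}}}{d\mu^{\tau'}_{\Lambda_{I_0}}}\Big|_{B_i} \Big\|_\infty, \tag{4.17}$$
$$g^{\tau}_{\min} \ge \inf_{\tau,\tau'\in\Omega:\ \tau_{\Lambda_I^c} = \tau'_{\Lambda_I^c}}\ \min_{\sigma\in\Omega_{B_i}} \frac{d\mu^{\tau}_{\Lambda_{I_0}}}{d\mu^{\tau'}_{\Lambda_{I_0}}}\Big|_{B_i}(\sigma). \tag{4.18}$$
Thanks to Proposition 4.3 applied to the sets $A \equiv \Lambda_{I_0}$, $A_0 \equiv B_i$ and $B_0 \equiv \partial_r^+ \Lambda_{I_0} \cap \Lambda_I$, we know that the RHS of (4.17) is less than 2 while the RHS of (4.18) is greater than $1/2$, provided that $l_1$ is taken large enough depending only on $\alpha$, $d$, $r$. In this way we have proven (4.15) and, as a consequence, the proposition.

Proposition 4.10. For each $\Lambda \subset\subset \mathbb{Z}^d$ we have
$$c_s(L^{J,\tau}_{\Lambda}) \le \Big[ 4 + 4 \sum_{x\in\Lambda} \|J\|_x + 2|\Lambda| \log 2 \Big] \big( \mathrm{gap}(L^{J,\tau}_{\Lambda}) \big)^{-1}.$$
The proposition follows from (2.12), Proposition 4.11 below, and from a trivial estimate on $\inf_\sigma \mu^{J,\tau}_{\Lambda}(\sigma)$.

Proposition 4.11. Let $\Omega$ be a finite set, let $\mu$ be a probability measure on $(\Omega, 2^{\Omega})$ and assume
$$\mu_0 \equiv \inf_{x\in\Omega} \mu(x) > 0.$$
Then, for each positive function $f$ on $\Omega$, we have
$$\mu(f^2 \log f) \le (4 + 2 \log \mu_0^{-1})\, \mathrm{Var}(f) + \mu(f^2) \log \sqrt{\mu(f^2)}.$$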


Proof. We can assume $\mu(f^2) = 1$. If we let $f = \mu(f)(1 + g)$, we find $\mu(g) = 0$ and $\mu(g^2) = \mathrm{Var}(f)/\mu(f)^2$. Let $A$ be the set of all $x \in \Omega$ such that $|g(x)| < 1$. We can then write
$$\mu(f^2 \log f) = \mu(f^2 \log f\, \mathbb{1}_A) + \mu(f^2 \log f\, \mathbb{1}_{A^c}). \tag{4.19}$$
Let us denote by $X_1$ and $X_2$ respectively the first and the second term in the RHS of (4.19). Using the inequalities $\log(1 + g) \le g$ and $\log \mu(f) \le \log \mu(f^2) \le 0$, we get
$$X_1 \le \mu(f)^2\, \mu[(g + 2g^2 + g^3)\mathbb{1}_A] \le \mu(f)^2 \big[ 3\mu(g^2) + \mu(g \mathbb{1}_A) \big] = 3\, \mathrm{Var}(f) + \mu(f)^2 \mu(g \mathbb{1}_A). \tag{4.20}$$
To take care of the last term we remember that $\mu(g) = 0$, so $\mu(g \mathbb{1}_A) = -\mu(g \mathbb{1}_{A^c})$, which implies, using the Schwarz and then the Chebyshev inequalities,
$$|\mu(g \mathbb{1}_A)| \le \mu(|g| \mathbb{1}_{A^c}) \le \big( \mu(g^2)\, \mu(\mathbb{1}_{A^c}) \big)^{1/2} \le \mu(g^2).$$
Thus we get $X_1 \le 4\, \mathrm{Var}(f)$. As for $X_2$, we write
$$X_2 \le \Big( \sup_{x\in\Omega} \log f(x) \Big)\, \mu(f^2 \mathbb{1}_{A^c}) \le \log \|f\|_\infty\ \mu(f^2 \mathbb{1}_{A^c}). \tag{4.21}$$
Finally we observe that $\|f\|_\infty$ is bounded by $(\mu(f^2)/\mu_0)^{1/2} = \mu_0^{-1/2}$, while
$$\mu(f^2 \mathbb{1}_{A^c}) = \mu(f)^2\, \mu((1 + 2g + g^2)\mathbb{1}_{A^c}) \le 4\mu(f)^2 \mu(g^2) = 4\, \mathrm{Var}(f).$$
This concludes the proof.
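The inequality of Proposition 4.11 is easy to test numerically on small finite probability spaces. The sketch below is an illustration only: the helper names, the sampling ranges and the use of a seeded random generator are assumptions made here. It draws random measures $\mu$ and positive functions $f$ and returns both sides of the inequality, so that one can verify that the left-hand side never exceeds the right-hand side.

```python
import math
import random


def prop_4_11_sides(mu, f):
    """Return (lhs, rhs) of mu(f^2 log f) <= (4 + 2 log(1/mu0)) Var(f) + mu(f^2) log sqrt(mu(f^2)).

    `mu` is a list of strictly positive probabilities summing to 1,
    `f` a list of strictly positive values on the same finite set.
    """
    def mean(values):
        return sum(m * v for m, v in zip(mu, values))

    f2 = [v * v for v in f]
    second_moment = mean(f2)
    variance = second_moment - mean(f) ** 2
    mu0 = min(mu)
    lhs = mean([v * v * math.log(v) for v in f])
    rhs = (4.0 + 2.0 * math.log(1.0 / mu0)) * variance \
        + second_moment * math.log(math.sqrt(second_moment))
    return lhs, rhs


def random_instance(n, rng):
    """Random probability vector and positive function on n points (illustrative ranges)."""
    weights = [rng.uniform(0.05, 1.0) for _ in range(n)]
    total = sum(weights)
    mu = [w / total for w in weights]
    f = [rng.uniform(0.1, 5.0) for _ in range(n)]
    return mu, f


rng = random.Random(0)
worst_gap = min(rhs - lhs for lhs, rhs in
                (prop_4_11_sides(*random_instance(6, rng)) for _ in range(1000)))
print("smallest observed rhs - lhs:", worst_gap)
```

For a constant function the variance vanishes and the two sides coincide, which shows that the term $\mu(f^2)\log\sqrt{\mu(f^2)}$ cannot be dropped.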

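Theorem 4.12. There exists $k(d, r, \kappa_1)$ such that, for each $\Lambda \subset\subset \mathbb{Z}^d$ and for each $\tau \in \Omega$, we have ($c_m$ was defined in (2.8))
$$\mathrm{gap}(L^{\tau}_{\Lambda}) \ge c_m \exp\Big[ -k\, \|J\|_{\Lambda}\, |\Lambda|^{\frac{d-1}{d}} \Big]. \tag{4.22}$$

Proof. For each non-negative integer $n$, let
$$(K_n) = \{\text{the inequality (4.22) holds for all } \Lambda \in \mathbb{F} \text{ such that } |\Lambda| \le (3/2)^n\}.$$
We want to show that $(K_n)$ holds for all $n \in \mathbb{Z}_+$, by proving that there exists $n_0(d, r) \in \mathbb{Z}_+$ such that $(K_{n_0})$ holds, and such that, for all $n \ge n_0$, $(K_n)$ implies $(K_{n+1})$. Assume then that $(K_{n-1})$ holds, and take any $\Lambda$ such that $(3/2)^{n-1} < |\Lambda| \le (3/2)^n$. Let $v = |\Lambda|$. By Proposition A1.1, it is possible to write $\Lambda$ as the disjoint union of two subsets $X$ and $Y$, such that

(a) $v/2 - k_1 v^{\frac{d-1}{d}} \le |X| \le v/2$,

(b) $\delta_r(X, Y) \le k_1 v^{\frac{d-1}{d}}$,

where $k_1$ depends only on $d$ and $r$. There exists then $n_0(d, r)$ such that if $n > n_0$ (and thus $v > (3/2)^{n_0 - 1}$), then $|Y| \le (2/3)|\Lambda|$. So we can apply the inductive hypothesis to both $X$ and $Y$. Furthermore a simple calculation (see Proposition A1.1 in [CM]) shows that
$$\inf_{\tau\in\Omega} \mathrm{gap}(L^{J,\tau}_{\Lambda}) \ge \inf_{\tau\in\Omega}\ \inf_{W\in\{X,Y\}} \mathrm{gap}(L^{J,\tau}_{W})\ \inf_{\tau\in\Omega} \mathrm{gap}(L^{J,\tau}_{\{X,Y\}}), \tag{4.23}$$
where, as usual, the last term refers to the block dynamics. By Proposition 4.6 we know that
$$\inf_{\tau\in\Omega} \mathrm{gap}(L^{J,\tau}_{\{X,Y\}}) \ge e^{-k_2 \|J\|_{\Lambda}\, v^{\frac{d-1}{d}}}$$
for some $k_2(d, r)$. Together with the inductive hypothesis on $X$ and $Y$, this gives
$$\inf_{\tau\in\Omega} \mathrm{gap}(L^{J,\tau}_{\Lambda}) \ge c_m \exp\Big[ -k \|J\|_{\Lambda} |Y|^{\frac{d-1}{d}} - k_2 \|J\|_{\Lambda}\, v^{\frac{d-1}{d}} \Big]. \tag{4.24}$$
Since $|Y| \le (2/3)v$, we have
$$\inf_{\tau\in\Omega} \mathrm{gap}(L^{J,\tau}_{\Lambda}) \ge c_m \exp\Big[ -k \|J\|_{\Lambda}\, v^{\frac{d-1}{d}} \Big] \qquad \text{if } k \ge \frac{k_2}{1 - (2/3)^{\frac{d-1}{d}}}.$$
In this way we have shown that $(K_n)$ implies $(K_{n+1})$ for all $n \ge n_0(d, r)$. All that is left is to prove $(K_{n_0})$. For this purpose we observe that
$$e^{-2\|J - J'\|_{\Lambda} |\Lambda|} \le \frac{\mu^{J,\tau}_{\Lambda}(\sigma)}{\mu^{J',\tau}_{\Lambda}(\sigma)} \le e^{2\|J - J'\|_{\Lambda} |\Lambda|} \qquad \text{for all } \tau, \sigma, \Lambda, J, J'. \tag{4.25}$$
Choose now any $\Lambda$ with volume not exceeding $(3/2)^{n_0}$ and let $\tilde L_{\Lambda}$ be the generator of the heat-bath dynamics with $J = 0$, i.e.
$$\tilde L_{\Lambda} = \sum_{x\in\Lambda} \tilde L_{\{x\}} = \sum_{x\in\Lambda} \big( \mu^{J=0}_{\{x\}} - \mathbb{1} \big).$$
Since all $\tilde L_{\{x\}}$ commute, it follows that $\mathrm{gap}(\tilde L_{\Lambda}) = \mathrm{gap}(\tilde L_{\{x\}}) = 1$ (the last equality can be checked via an explicit calculation). From (2.8), (2.12) and (4.25), it now follows that
$$\mathrm{gap}(L^{J,\tau}_{\Lambda}) \ge \mathrm{gap}(\tilde L_{\Lambda})\, e^{-6\|J\|_{\Lambda} |\Lambda|}\, c_m e^{-\kappa_1 \|J\|_{\Lambda}} \ge c_m \exp\big[ -(6 + \kappa_1) \|J\|_{\Lambda} (3/2)^{n_0} \big],$$
which implies $(K_{n_0})$ (and then (4.22)), if we take $k \ge (6 + \kappa_1)(3/2)^{n_0}$.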
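The induction step closes because the larger piece $Y$ has volume at most $(2/3)v$, so the requirement is $k\,(2/3)^{(d-1)/d} + k_2 \le k$. The sketch below checks this arithmetic for a few dimensions; the function names and the value $k_2 = 1$ are illustrative assumptions, and the check is restricted to $d \ge 2$, since for $d = 1$ the contraction factor equals 1 and the threshold formula degenerates.

```python
def minimal_induction_constant(k2: float, d: int) -> float:
    """Smallest k with k (2/3)^((d-1)/d) + k2 <= k, i.e. k2 / (1 - (2/3)^((d-1)/d))."""
    if d < 2:
        raise ValueError("the contraction factor is 1 for d = 1")
    contraction = (2.0 / 3.0) ** ((d - 1) / d)
    return k2 / (1.0 - contraction)


def induction_step_closes(k: float, k2: float, d: int) -> bool:
    """Check k (2/3)^((d-1)/d) + k2 <= k, up to a tiny relative tolerance."""
    contraction = (2.0 / 3.0) ** ((d - 1) / d)
    return k * contraction + k2 <= k * (1.0 + 1e-12)


for d in (2, 3, 4):
    k_min = minimal_induction_constant(1.0, d)
    print(d, k_min, induction_step_closes(k_min, 1.0, d),
          induction_step_closes(0.9 * k_min, 1.0, d))
```

The threshold grows with $d$ because $(2/3)^{(d-1)/d}$ decreases towards $2/3$ only slowly from its $d = 2$ value; in the proof the final $k$ must also dominate $(6 + \kappa_1)(3/2)^{n_0}$ from the base case.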

5. The Deterministic Problem

This section is the core of the paper. We give a deterministic upper bound on the logarithmic Sobolev constant $c_s(\mu^{J,\tau}_{\Lambda})$ in the cube $\Lambda \equiv Q_L$. In order not to obscure the discussion of our ideas with less relevant details due to unbounded interactions, we present the main steps of our strategy in the bounded case. For this purpose, consider first the so-called two dimensional diluted Ising model with nearest neighbor interactions $J_{xy}$ which are either zero or $\bar J$, and call "regular" any site $x \in \Lambda$ such that $J_{xy} = 0$ for all neighboring sites $y$. Let us also consider the set $W$ of all non-regular sites and its connected components (in the obvious sense) $\{W_i\}_{i=1}^{n}$ inside $\Lambda$. Fix a volume scale $v$ and assume that $\sup_i |W_i| \le v$. Then we claim that in this case
$$c_s(\mu^{J,\tau}_{\Lambda}) \le C_1 L^d \exp\big[ C_2\, v^{\frac{d-1}{d}} \big] \tag{5.1}$$
for suitable constants $C_1$ and $C_2$ independent of $L$ and $v$. The proof follows immediately from Proposition 4.10 if we can prove the key inequality
$$\mathrm{gap}(L^{J,\tau}_{\Lambda}) \ge C_1' \exp\big[ -C_2'\, v^{\frac{d-1}{d}} \big] \tag{5.2}$$
for another pair of constants $C_1'$, $C_2'$. The above inequality follows from Theorem 4.12 once we observe that, and this is the key feature of the diluted model, the connected


components of $W$ are non-interacting since they are separated from each other by a "safety belt" of completely decoupled sites. Therefore the spectral gap of $L^{J,\tau}_{\Lambda}$ is not smaller than the smallest among the spectral gaps of the $L^{J,\tau}_{W_i}$. Using now Theorem 4.12 and the assumption $\sup_i |W_i| \le v$ we get the required bound (5.1). It is very important to observe that, thanks to some of the results of Sect. 4.2, the above conclusion remains true, modulo some irrelevant constant factors, even if the value $J_{xy} = 0$ is replaced by a very small number $J_{\min}$, provided that $|J_{\min}|\,|W| \ll 1$.

This remark suggests how to transpose the previous ideas to a truly interacting model. In a certain sense our original model behaves, after a suitable "coarse-graining", quite closely to this diluted model. Let us in fact make a coarse-grained description of the model on a new scale $l_0 \ll L$, by replacing sites with disjoint cubes $C_i$ of side $l_0$, and declare "regular" those cubes $C_i$ in which truncated correlations decay exponentially fast with rate $\alpha > 0$. In this way, if $B$ is a collection of "non-regular" cubes $C_i$ surrounded by a safety-belt of regular cubes, then the effective interaction of $B$ with any other region outside the safety-belt will be not larger than $|B| \exp(-\alpha l_0)$. Thus, if $l_0$ is chosen so large that the effective interaction among the connected components of the set $W_{l_0}$ of non-regular cubes $C_i$ is much smaller than one, e.g. if $|W_{l_0}| \exp(-\alpha l_0) \ll 1$, then our system, on scale $l_0$, will behave like a diluted Ising model. In particular we will be able to apply the results of Sect. 4.2 and, as a consequence, we will get the bound (5.2) on the spectral gap, with $v$ equal to the volume of the largest connected component of the set $W_{l_0}$. We refer the reader to Proposition 5.2 below for a precise formulation of this result in the more general case of unbounded $J$. Once we have (5.2) then we also get (5.1) simply by applying Proposition 4.10.
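The smallness condition $|W_{l_0}| \exp(-\alpha l_0) \ll 1$ can be turned into a rough estimate of the coarse-graining scale. The sketch below is a heuristic illustration under explicit assumptions that are not statements from the text at this point: the volume of non-regular cubes is taken to be about $L^d e^{-\vartheta l_0}$, in the spirit of (H1), and "much smaller than one" is replaced by an arbitrary threshold `eta`. The function name and all numerical values are hypothetical.

```python
import math


def minimal_coarse_graining_scale(L: float, d: int, alpha: float,
                                  theta: float, eta: float = 0.01) -> float:
    """Smallest l0 with L^d * exp(-(alpha + theta) * l0) <= eta.

    Assumes |W_{l0}| ~ L^d * exp(-theta * l0); `eta` is an arbitrary
    stand-in for "much smaller than one".
    """
    return (d * math.log(L) + math.log(1.0 / eta)) / (alpha + theta)


for L in (1e3, 1e6, 1e9):
    l0 = minimal_coarse_graining_scale(L, d=2, alpha=1.0, theta=1.0)
    print(L, l0, l0 / math.log(L))
```

Under these assumptions the resulting scale is proportional to $\log L$ up to an additive constant, whatever the precise threshold.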
Although the above reasoning looks quite appealing from a physical point of view, it is still unsatisfactory for the following reason. In a typical configuration of $J$, the volume of the set $W_{l_0}$ is roughly $p(l_0) L^d$, where $p(l_0)$ is the probability that a cube $C_i$ is not regular. Using our basic assumption (H1), $p(l_0) \approx \exp(-\vartheta l_0)$, so that the minimal scale $l_0$ satisfying $|W_{l_0}| \exp(-\alpha l_0) \ll 1$ becomes of order $\log L$. This unfortunately is too large a scale: since $v$ is at least $l_0^d$, the corresponding bounds (5.2) or (5.1) on the spectral gap or on the logarithmic Sobolev constant become at least of the order of a power of $L$.

In order to overcome this difficulty, we appeal to Proposition 4.9. More precisely we introduce an intermediate length scale $l_1 \ll L$ and we assume that the $J$ in $\Lambda$ are such that the hypotheses of Proposition 4.9 apply for $l_1$. If this is the case, then Proposition 4.9 basically allows us to replace the initial cube $\Lambda = Q_L$ with a smaller cube $Q_{l_1}(x)$, for a suitable $x \in \Lambda$. Once we have reduced the initial scale $L$ to the new scale $l_1$, we make the coarse-grained analysis on scale $l_0 \ll l_1$ on the new cube $Q_{l_1}(x)$ and proceed as explained before. The advantage of the above two-scale analysis is twofold. First of all, the shortest scale $l_0$ is now at most of the order of $\log l_1$ instead of $\log L$. Secondly, the prefactor $L^d$ in (5.1) is replaced by $l_1^d$. If one considers that in a typical configuration the intermediate scale $l_1$ can be taken already of the order of $\log L$ (see the comments before Proposition 4.9), we see that the smallest scale becomes $l_0 \approx \log\log L$, with an enormous gain in precision. We conclude this short heuristic discussion by observing that it is precisely the coarse-grained analysis on scale $\log\log L$ that is responsible for the various $\log\log L$ factors in Theorem 3.1.

We are now ready for a precise formulation of our results.

Definition 5.1. Let $l \in \mathbb{Z}_+$, $\alpha > 0$ and let $\Lambda$ be a multiple of $Q_l$ and write $\Lambda = \cup_{i=1}^{n} Q_l(x_i)$ for some $n \in \mathbb{Z}_+$ and $x_i \in l\mathbb{Z}^d$. Let $K$ be the set of all $i \in \{1, \dots, n\}$ such that $Q_l(x_i)$

156

F. Cesi, C. Maes, F. Martinelli

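is not $\alpha$-regular. Then we let
\[ W(\Lambda, l, \alpha) = \{\, x \in \Lambda : d\bigl(x, \cup_{i \in K} Q_l(x_i)\bigr) \le 2l \,\}, \]
\[ v(\Lambda, l, \alpha) = \text{the cardinality of the largest $r$-connected component of } W(\Lambda, l, \alpha). \]
Given $\lambda \ge 0$, we also define a cutoff interaction $J^{(\lambda)}$ as
\[ J^{(\lambda)}_A = (\operatorname{sgn} J_A)\,(|J_A| \wedge \lambda). \tag{5.3} \]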
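The objects of Definition 5.1 and the cutoff (5.3) can be illustrated by a minimal computational sketch. It is not part of the argument, and it rests on assumptions that the text above does not fix: the cube $Q_l(x)$ is taken to be $x + \{0, \dots, l-1\}^d$, the distance $d(\cdot,\cdot)$ is taken to be the sup-norm distance, and a set is called $r$-connected when any two of its sites are linked by a chain of sites with consecutive distance at most $r$. All function names are hypothetical.

```python
import math
from itertools import product


def cube_sites(corner, l):
    """Sites of Q_l(corner); ASSUMPTION: Q_l(x) = x + {0,...,l-1}^d."""
    return {tuple(c + o for c, o in zip(corner, off))
            for off in product(range(l), repeat=len(corner))}


def sup_dist(x, y):
    """Sup-norm distance between two sites (ASSUMED metric)."""
    return max(abs(a - b) for a, b in zip(x, y))


def bad_region(corners, bad_indices, l):
    """W(Lambda, l, alpha): sites of Lambda within distance 2l of a non-regular cube.

    `corners` lists the x_i; `bad_indices` plays the role of the set K.
    """
    region = set()
    for c in corners:
        region |= cube_sites(c, l)
    bad = set()
    for i in bad_indices:
        bad |= cube_sites(corners[i], l)
    return {x for x in region if any(sup_dist(x, y) <= 2 * l for y in bad)}


def largest_component(sites, r):
    """v(Lambda, l, alpha): size of the largest r-connected component."""
    remaining = set(sites)
    best = 0
    while remaining:
        stack = [remaining.pop()]
        size = 1
        while stack:
            x = stack.pop()
            near = [y for y in remaining if sup_dist(x, y) <= r]
            for y in near:
                remaining.discard(y)
            stack.extend(near)
            size += len(near)
        best = max(best, size)
    return best


def cutoff(J_A, lam):
    """Cutoff coupling (5.3): sign(J_A) * min(|J_A|, lam)."""
    return math.copysign(min(abs(J_A), lam), J_A)
```

For instance, with a $4 \times 4$ array of cubes of side $l = 2$ and a single non-regular cube in one corner, `bad_region` returns the sites within sup-distance $4$ of that cube, and `largest_component` gives its cardinality.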

Proposition 5.2. Choose the transition rates $c_J$ as in (2.11). Then, for each $\alpha > 0$ there exists $\bar{l}(d, r, \alpha)$ such that the following holds for all $l_0 \ge \bar{l}$: let $V$ be a multiple of $Q_{l_0}$, $v = v(V, l_0, \alpha)$ (see Definition 5.1), and let $m = \hat{m}(d, r, \alpha)$. Let also $\lambda, \gamma \ge 0$, and assume that

(i) $m l_0 \ge 10\,(1 + \log |W(V, l_0, \alpha)|)$.
(ii) For each $r$-connected subset $X$ of $V$ with $|X| \le v$, we have $\sum_{x \in X} \|J - J^{(\lambda)}\|_x \le \gamma$.

Then,
\[ \inf_{\tau \in \Omega} \mathrm{gap}(L^{\tau}_V) \ge |V|^{-\omega} \exp\Bigl[ -\bigl( 8\gamma + k \lambda v^{\frac{d-1}{d}} + k' m l_0^d \bigr) \Bigr], \tag{5.4} \]
where $\omega$ can be taken equal to $d \log 4 / \log(3/2)$, $k = k(d, r)$ is the quantity defined in Theorem 4.12 and $k' = 9^{d-1} k$.

Remark. The reader who does not want to bother with the extra complications due to the unboundedness of the interaction may just consider the bounded case and take $\lambda$ equal to $\sup_x \|J\|_x$ and $\gamma = 0$.

Proof. Write $V = \cup_{i=1}^{s} C_i$, where $C_i = Q_{l_0}(y_i)$ for some $y_i \in l_0 \mathbb{Z}^d$. Let $B = W(V, l_0, \alpha)$ and let $A$ be the union of all those ($\alpha$-regular) cubes $C_i$ such that $d(C_i, C_j) > l_0$ for all $C_j$ which are not $\alpha$-regular. Let also $A_0 = A \cap \partial^+_{l_0} B$ and $B_0 = B \cap \partial^+_r A$. By Proposition A1.1 in [CM], we have
\[ \inf_{\tau \in \Omega} \mathrm{gap}(L^{J,\tau}_V) \ge \frac{1}{2} \Bigl[ \inf_{\tau \in \Omega} \inf_{D \in \{A, B\}} \mathrm{gap}(L^{J,\tau}_D) \Bigr] \inf_{\tau \in \Omega} \mathrm{gap}(L^{J,\tau}_{\{A,B\}}). \tag{5.5} \]
The proof of the proposition can be organized in the following steps:

(a) We can use Proposition 4.5 to show that the gap of the block dynamics generator $L^{J,\tau}_{\{A,B\}}$ is at least $1/2$. In order to show that Proposition 4.5 does indeed apply to our case, we first notice that $d(A_0, B_0) \ge l_0$, which, together with the fact that all cubes in $A$ are $\alpha$-regular and the trivial inequality $|B_0| \le |W(V, l_0, \alpha)|$, implies the hypothesis (i) of Proposition 4.5. Then we observe that $A \setminus A_0$ can be expressed as a union of $\alpha$-regular cubes $C_i$. So, by Proposition 4.8, the property $SMT(A \setminus A_0, l_0/2, m)$ holds.

(b) Since the set $A$ is a union of $\alpha$-regular cubes, using the ideas in [MO1] one can prove that $\mathrm{gap}(L^{J,\tau}_A)$ is bounded from below by a quantity which does not depend on the size of $A$. In Appendix 2, we give a simple proof of the much weaker result
\[ \mathrm{gap}(L^{J,\tau}_A) \ge 8 |A|^{-\omega} \exp(-k' m l_0^d). \]

Such an inequality, even if far from optimal, is sufficient anyway for our purposes.

(c) For what concerns the gap of $L^{J,\tau}_B$, we write $B$ as the disjoint union of its $r$-connected components $\tilde{B}_1, \dots, \tilde{B}_n$. Since $L^{J,\tau}_{\tilde{B}_i}$ commutes with $L^{J,\tau}_{\tilde{B}_j}$ for all $i \ne j$, it follows that
\[ \mathrm{gap}(L^{J,\tau}_B) = \inf_{i \in \{1, \dots, n\}} \mathrm{gap}(L^{J,\tau}_{\tilde{B}_i}). \]

(d) Now we get rid of those couplings which are too strong, by introducing, on each $\tilde{B}_i$, a cutoff interaction $\{J^{(\lambda)}_X\}_{X \in F}$ (see (5.3)). By (2.12) and hypothesis (ii), we obtain
\[ \mathrm{gap}(L^{J,\tau}_{\tilde{B}_i}) \ge e^{-8\gamma}\, \mathrm{gap}(L^{J^{(\lambda)},\tau}_{\tilde{B}_i}). \]

From (5.5), (a), (b), (c) and (d), together with Theorem 4.12 (for the dynamics (2.11) we can take $c_m = 1/2$ and $\kappa_1 = 0$ in (2.8)) and the fact that trivially $\|J^{(\lambda)}\|_{\tilde{B}_i} \le \lambda$, we get
\[ \inf_{\tau \in \Omega} \mathrm{gap}(L^{J,\tau}_V) \ge \frac{1}{4} \min\Bigl\{ \frac{1}{2} \inf_i e^{-\bigl( 8\gamma + k \lambda |\tilde{B}_i|^{\frac{d-1}{d}} \bigr)},\; 8 |A|^{-\omega} \exp(-k' m l_0^d) \Bigr\}. \]
In order to obtain (5.4) we now observe that, by definition of $v$, we have $|\tilde{B}_i| \le v$, and that the minimum of the two quantities in braces is greater than their product if $l_0$ is such that both terms are less than 1.

Theorem 5.3. If the transition rates are given by (2.11), then for each $\alpha > 0$ there exist $\bar{l}$, $C_1$ and $C_2$ depending on $d$, $r$ and $\alpha$ such that the following holds for all positive integers $l_0 \ge \bar{l}$: let $l_1$ be a multiple of $l_0$ and let $\Lambda$ be a multiple of $Q_{l_1}$ so that we can write
\[ \Lambda = \cup_{i=1}^{n} B_i = \cup_{i=1}^{s} C_i, \tag{5.6} \]
where $B_i = Q_{l_1}(x_i)$ and $C_i = Q_{l_0}(y_i)$ for some $x_i \in l_1 \mathbb{Z}^d$ and $y_i \in l_0 \mathbb{Z}^d$. Let $v = v(\Lambda, l_0, \alpha)$ (see Definition 5.1), and let $m = \hat{m}(d, r, \alpha)$. Let also $\lambda, \gamma \ge 0$, and assume that:

(i) For each $i \in \{1, \dots, n\}$ the cube $B_i$ is $\alpha$-regular.
(ii) $8\gamma + k \lambda v^{\frac{d-1}{d}} \le m l_1 / 4$, where $k(d, r)$ is the quantity defined in Theorem 4.12.
(iii) $30 d \log l_1 \le m l_0 \le (l_1)^{1/(2d)}$.
(iv) For each $r$-connected $V \subset \Lambda$ with $|V| \le v$, we have $\sum_{x \in V} \|J - J^{(\lambda)}\|_x \le \gamma$.

Then we have
\[ \sup_{\tau \in \Omega} c_s(\mu^{J,\tau}_{\Lambda}) \le C_1 \exp\Bigl[ 8\gamma + C_2 \bigl( \lambda v^{\frac{d-1}{d}} + l_0^d \bigr) \Bigr]. \]

Proof. Let $V \subset \Lambda$ be a union of cubes $C_i$ such that $\mathrm{diam}(V) \le 3 l_1$. The hypotheses (iii) and (iv) tell us that Proposition 5.2 can be applied to $V$. Therefore we have
\[ \mathrm{gap}(L^{J,\tau}_V) \ge |V|^{-\omega} \exp\Bigl[ -\bigl( 8\gamma + k \lambda v^{\frac{d-1}{d}} + k' m l_0^d \bigr) \Bigr] \ge e^{-m l_1 / 2}, \tag{5.7} \]
where $\omega = d \log 4 / \log(3/2)$, and in the second inequality we have used hypotheses (ii) and (iii). Thanks to (5.7) we can now apply Proposition 4.9, which, combined with Proposition 4.10 and again with (5.7), implies that
\[ \sup_{\tau \in \Omega} c_s(L^{J,\tau}_{\Lambda}) \le C' \bigl[ 4 + 4 (3 l_1)^d (\|J\|_{\Lambda} + 2 \log 2) \bigr] (3 l_1)^{\omega d} \exp\Bigl[ 8\gamma + k \lambda v^{\frac{d-1}{d}} + k' m l_0^d \Bigr] \le C_1 \exp\Bigl[ 8\gamma + C_2 \bigl( \lambda v^{\frac{d-1}{d}} + l_0^d \bigr) \Bigr] \]
for some $C'$, $C_1$, $C_2$ depending on $d$, $r$ and $\alpha$.

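6. Proof of the Main Results

6.1. The upper bounds. In this section we finally prove our main results. Before doing that we need a simple probabilistic estimate on independent random variables.

Lemma 6.1. Let $\{X_i\}_{i=1}^{n}$ be real independent random variables such that $E\bigl(\exp(X_i^{1+\delta})\bigr) \le G_\delta < \infty$ for some $\delta > 0$, for all $i$. Then, for all $\lambda, \gamma > 0$,
\[ P\Bigl\{ \sum_{i=1}^{n} (X_i - \lambda)_+ \ge \gamma \Bigr\} \le \exp\Bigl[ -\lambda^{\delta} \gamma + n G_\delta e^{-\lambda^{1+\delta}} \Bigr]. \]

Proof. By the Chebyshev inequality, and using $\log(1 + x) \le x$, we obtain, for any $\alpha > 0$,
\[ P\Bigl\{ \sum_{i=1}^{n} (X_i - \lambda)_+ \ge \gamma \Bigr\} \le e^{-\alpha\gamma} \bigl( E\, e^{\alpha (X_i - \lambda)_+} \bigr)^n \le e^{-\alpha\gamma} \bigl( 1 + e^{-\alpha\lambda} E\, e^{\alpha X_i} 1\!\mathrm{I}\{X_i \ge \lambda\} \bigr)^n \le \exp\Bigl[ -\alpha\gamma + n e^{-\alpha\lambda} E\, e^{\alpha X_i} 1\!\mathrm{I}\{X_i \ge \lambda\} \Bigr]. \]
Now take $\alpha = \lambda^{\delta}$ and notice that
\[ E\, e^{\lambda^{\delta} X_i} 1\!\mathrm{I}\{X_i \ge \lambda\} \le E\, e^{(X_i)^{1+\delta}} 1\!\mathrm{I}\{X_i \ge \lambda\} \le E\, e^{(X_i)^{1+\delta}} \le G_\delta. \]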
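As a numerical sanity check of Lemma 6.1 (an illustration only, not part of the proof), the following sketch compares the bound with a Monte Carlo estimate for one specific choice that the text does not make: the $X_i$ are taken to be absolute values of independent standard Gaussians and $\delta = 1/2$, so that $G_\delta$ is finite and can be estimated by a simple trapezoid rule. The function names, the sample sizes and the integration cutoff are all hypothetical choices.

```python
import math
import random


def moment_constant(delta, upper=12.0, steps=24000):
    """Trapezoid estimate of G = E exp(|Z|^(1+delta)) for Z ~ N(0,1).

    ASSUMPTION: delta < 1, so that the integrand decays and the cutoff
    `upper` is harmless.
    """
    h = upper / steps

    def integrand(z):
        # density of |Z| times exp(z^(1+delta))
        return math.sqrt(2.0 / math.pi) * math.exp(z ** (1.0 + delta) - 0.5 * z * z)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h


def lemma_bound(lam, gamma, n, G, delta):
    """Right-hand side of Lemma 6.1."""
    return math.exp(-(lam ** delta) * gamma + n * G * math.exp(-(lam ** (1.0 + delta))))


def empirical_tail(lam, gamma, n, trials, seed=0):
    """Monte Carlo estimate of P{ sum_i (X_i - lam)_+ >= gamma } with X_i = |Z_i|."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        excess = sum(max(abs(rng.gauss(0.0, 1.0)) - lam, 0.0) for _ in range(n))
        if excess >= gamma:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    G = moment_constant(0.5)
    print("G_delta estimate:", G)
    print("bound:", lemma_bound(3.0, 1.0, 10, G, 0.5))
    print("empirical:", empirical_tail(3.0, 1.0, 10, 20000, seed=1))
```

In this regime the bound is far from sharp, which is consistent with its role in the text: only the exponential dependence on $\lambda^{\delta}\gamma$ is used later.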

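Proposition 6.2. Assume (H2). Then there exists $k = k(d, r) > 0$ such that if $v > \log L$, and if we let
\[ A_\delta = (\log G_\delta)^{\frac{1}{1+\delta}} \vee 1, \qquad \lambda = v^{\frac{1}{d(1+\delta)}} A_\delta, \qquad \gamma = k\, v^{1 - \frac{\delta}{d(1+\delta)}}, \]
then, for all $L \in \mathbb{Z}_+$ (see (5.3)),
\[ P\Bigl\{ \exists\, V \subset Q_L : V \text{ is $r$-connected},\ |V| \le v,\ \sum_{x \in V} \|J - J^{(\lambda)}\|_x \ge \gamma \Bigr\} \le L^{-3d}. \tag{6.1} \]

Proof. For each $V \in F$ we have
\[ P\Bigl\{ \sum_{x \in V} \|J - J^{(\lambda)}\|_x \ge \gamma \Bigr\} \le P\Bigl\{ \sum_{A :\, A \cap V \ne \emptyset} (J_A - \lambda)_+ \ge \gamma / k_1 \Bigr\}, \tag{6.2} \]
where $k_1$ can be taken equal to $\sup\{|A| : \mathrm{diam}\, A \le r\}$. Using Lemma 6.1 and the fact that the number of sets $A \in F$ with a diameter not greater than $r$ which intersect $V$ can be bounded by $|V|\, k_2(d, r)$, we obtain
\[ P\Bigl\{ \sum_{x \in V} \|J - J^{(\lambda)}\|_x \ge \gamma \Bigr\} \le \exp\Bigl[ -\lambda^{\delta} \frac{\gamma}{k_1} + k_2 |V| G_\delta e^{-\lambda^{1+\delta}} \Bigr]. \tag{6.3} \]
Furthermore, since the number of $r$-connected $V \subset Q_L$ such that $|V| \le v$ is not greater than $L^d \exp(k_3 v)$ for some $k_3(d, r)$, if $v \ge \log L$ and $\lambda$ is chosen as in the hypothesis, we get
\[ \text{LHS of (6.1)} \le \exp\Bigl[ v (d + k_3 + k_2) - \lambda^{\delta} \frac{\gamma}{k_1} \Bigr]. \]
If now $k \ge k_1 (4d + k_2 + k_3)$, we find
\[ \text{LHS of (6.1)} \le e^{-3dv} \le L^{-3d}. \]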

Proof of Theorem 3.1. We give the proof in the special case of $L$ which is a power of 2, which is enough to prove Theorem 3.2. A proof which works for all $L$ requires a modification of Theorem 5.3 where one considers more general coverings of $\Lambda$ with cubes and cuboids with slightly different sidelengths. This generalization is straightforward.

Part (i). By combining hypothesis (H2) with the exponential Chebyshev inequality, one gets
\[ P\bigl\{ \|J\|_{Q_L} \ge 3 (\log L)^{1 - \frac{\delta}{d(1+\delta)}} \bigr\} \le L^d G_\delta \exp\bigl[ -3 (\log L)^{1 + \delta - \frac{\delta}{d}} \bigr] \ll L^{-2} \tag{6.4} \]
for all $L$ large enough. Therefore, using (2.16), it is enough to prove (3.2) with $c_s(L^{J,\tau}_{Q_L})$ replaced by $c_s(\mu^{J,\tau}_{Q_L})$ and $L^{-1.5}$ replaced by $3 L^{-2}$.

For this purpose we are going to use the key deterministic estimate of $c_s(\mu^{J,\tau}_{Q_L})$ given in Theorem 5.3. The idea is to prove that, with probability greater than $1 - 3L^{-2}$, it is possible to choose the four parameters in Theorem 5.3, $l_0$, $l_1$, $\lambda$ and $\gamma$, in such a way that the deterministic upper bound on $c_s(\mu^{J,\tau}_{Q_L})$ given in that theorem is not greater than
\[ \exp\Bigl[ C A_\delta (\log\log L)^{d - \frac{\delta}{1+\delta}} (\log L)^{1 - \frac{\delta}{d(1+\delta)}} \Bigr]. \]
More precisely we define $l_0$ and $l_1$ as those powers of 2 (they are uniquely defined) such that
\[ \frac{60 d}{m} \log\log L \le l_0 < \frac{120 d}{m} \log\log L, \qquad \frac{3d}{\vartheta'} \log L \le l_1 < \frac{6d}{\vartheta'} \log L, \tag{6.5} \]
where $m \equiv \hat{m}(\alpha)$ (see Proposition 4.1) and $\vartheta'$ is given in Proposition 4.8. We then take
\[ v_* = l_0^d \log L, \qquad \lambda = v_*^{\frac{1}{d(1+\delta)}} A_\delta \qquad \text{and} \qquad \gamma = k\, v_*^{1 - \frac{\delta}{d(1+\delta)}}, \tag{6.6} \]
where $k(d, r)$ is given in Proposition 6.2. Since $l_0$ divides $l_1$ and $l_1$ divides $L$, we can write $Q_L$ as in (5.6). We now observe that, if $L$ is large enough, the hypotheses (i) – (iv) of Theorem 5.3 are satisfied for all $J \in \tilde{\Theta} \equiv \cap_{i=1}^{3} \Theta_i$, where
\[ \Theta_1 = \{ J : \text{each } B_i \text{ is $\alpha$-regular} \}, \]
\[ \Theta_2 = \{ J : v(Q_L, l_0, \alpha) \le v_* \}, \]
\[ \Theta_3 = \Bigl\{ J : \text{for each $r$-connected } V \subset \Lambda \text{ with } |V| \le v_*,\ \sum_{x \in V} \|J - J^{(\lambda)}\|_x \le \gamma \Bigr\}, \]
and $v(Q_L, l_0, \alpha)$ has been defined in Definition 5.1. Notice that for all $J \in \Theta_1$ the bound $\|J\|_{Q_L} \le k' \log L$ holds for some constant $k'$, because of the definition of $\alpha$-regular cubes and of our choice of $l_1$. By Theorem 5.3, (6.5) and (6.6), for any $J \in \tilde{\Theta}$, we have
\[ c_s(\mu^{J,\tau}_{Q_L}) \le \exp\Bigl[ C A_\delta (\log\log L)^{d - \frac{\delta}{1+\delta}} (\log L)^{1 - \frac{\delta}{d(1+\delta)}} \Bigr] \]
for a suitable constant $C$ independent of $J$. In order to prove the theorem it is therefore sufficient to estimate from above $P(\tilde{\Theta}^c)$. From Proposition 4.8, it follows
\[ P(\Theta_1^c) \le L^d e^{-\vartheta' l_1} \le L^{-2d} \tag{6.7} \]
for all $L$ large enough. Let $p(l)$ be the probability that a cube $Q_l$ is not $\alpha$-regular. Then $p(l)$ goes to zero as $l \to \infty$, and a standard estimate for 2-dependent site percolation (sites at distance greater than 2 are independent) implies
\[ P(\Theta_2^c) \le L^d \bigl( k_1 p(l_0) \bigr)^{k_2 v_* l_0^{-d}} \le L^{-3d} \tag{6.8} \]

for $L$ large enough, where $k_1$ and $k_2$ are two suitable geometrical constants. Finally, by Proposition 6.2, we have $P(\Theta_3^c) \le L^{-3d}$, and, by consequence, $P(\tilde{\Theta}^c) \le 3 L^{-2}$. This completes the proof in the general unbounded case. The bounded case can be treated in the same way, by choosing $\lambda = J_0$ and $\gamma = 0$.

Proof of part (ii). The proof is the same as in part (i), with a different choice of the three basic parameters $l_0$, $l_1$ and $v_*$. More precisely we define $l_0$ and $l_1$ as those powers of 2 (they are uniquely defined) such that (let again $m = \hat{m}(\alpha)$)
\[ \frac{60 d^2}{(d-1) m} \log\log L \le l_0 < \frac{120 d^2}{(d-1) m} \log\log L, \qquad (\log L)^{\frac{d}{d-1}} \le l_1 < 2 (\log L)^{\frac{d}{d-1}}. \]
Given $\varepsilon \in (0, 1)$ we then let
\[ \lambda = J_0, \qquad \gamma = 0 \qquad \text{and} \qquad v_* = \Bigl( \frac{\varepsilon \log L}{2 J_0 C_2} \Bigr)^{\frac{d}{d-1}}, \tag{6.9} \]
where $C_2$ appears in Theorem 5.3. Write $Q_L$ as in (5.6) and define the events $\tilde{\Theta}$, $\Theta_i$ as in the proof of part (i). Thanks to Theorem 5.3, we get
\[ c_s(\mu^{J,\tau}_{Q_L}) \le C_1 L^{\frac{\varepsilon}{2}} \exp(C_2 l_0^d) \le L^{\varepsilon} \qquad \forall J \in \tilde{\Theta} \tag{6.10} \]
for all $L$ sufficiently large. In order to prove (3.3) it is therefore sufficient to bound from above $P(\tilde{\Theta}^c)$. As before, we find
\[ P(\Theta_1^c) \le L^d e^{-\vartheta' l_1} = L^d \exp\bigl[ -\vartheta' (\log L)^{\frac{d}{d-1}} \bigr] \tag{6.11} \]
and
\[ P(\Theta_2^c) \le L^d \bigl( k_1 p(l_0) \bigr)^{k_2 v_* l_0^{-d}} \le \exp\bigl[ -C_3 (\log\log L)^{-d} (\log L)^{\frac{d}{d-1}} \bigr] \tag{6.12} \]


for a suitable constant $C_3$ and all $L$ large enough. Clearly (6.11) and (6.12) complete the proof of (ii).

Proof of Theorem 3.2. The proof of the almost sure bounds (part (a)) is a simple consequence of Theorem 3.1. We prove only (3.4) since the case of bounded interactions (3.5) is identical. Let $\bar{\Theta}$ be the set of interactions $J$ for which there exists $L_1(J)$ such that for all $L \ge L_1(J)$ ($C$ is given in Proposition 3.1)

(i) $c_s(L^{J}_{B_L}) < \exp\bigl[ C A_\delta (\log L)^{1 - \frac{\delta'}{d}} (\log\log L)^{d - \delta'} \bigr]$.
(ii) $SMT(B_L, \gamma_1 (2L + 1), \alpha)$ holds.
(iii) $\|J\|_{\bar{B}_L} \le \log L$, where $\bar{B}_L = B_L \cup \partial^+_r B_L$.

Using Theorem 3.1, (H1), (H2) and the Borel–Cantelli lemma, one can check that $P(\bar{\Theta}) = 1$. Moreover, thanks to (ii) and (iii), for all $J \in \bar{\Theta}$ there exists a unique infinite volume Gibbs measure that in the sequel will be denoted by $\mu^J$. Let, in fact, $f$ be any local function on $\Omega$, and take $L$ large enough such that $B_L \supset \Lambda_f$. Then, given two arbitrary boundary conditions $\tau$ and $\eta$, and using a telescopic interpolation between them, we get
\[ \sup_{\tau, \eta \in \Omega} \bigl| \mu^{J,\tau}_{B_L}(f) - \mu^{J,\eta}_{B_L}(f) \bigr| \le |\partial^+_r B_L| \sup_{x \in \partial^+_r B_L} \bigl\| \nabla_x [\mu^{J}_{B_L}(f)] \bigr\|_\infty = |\partial^+_r B_L| \sup_{x \in \partial^+_r B_L} \sup_{\tau \in \Omega} \Bigl| \frac{\mu^{J,\tau}_{B_L}(h_x, f)}{\mu^{J,\tau}_{B_L}(h_x)} \Bigr|, \tag{6.13} \]
where $h_x \equiv \exp[-\nabla_x H_{B_L}]$. Notice that, because of (iii) in the definition of $\bar{\Theta}$, we have $\|h_x\|_\infty \le \exp(2 \log L) = L^2$. Therefore, if $L$ is larger than $L_1(J)$ and if $d(\Lambda_f, (B_L)^c) > \gamma_1 (2L + 1) + r$, we can use $SMT(B_L, \gamma_1 (2L + 1), \alpha)$, and write
\[ \sup_{\tau, \eta \in \Omega} \bigl| \mu^{J,\tau}_{B_L}(f) - \mu^{J,\eta}_{B_L}(f) \bigr| \le k L^{d+4} |\Lambda_f| \|f\|_\infty e^{-\alpha\, d(\Lambda_{h_x}, \Lambda_f)} \tag{6.14} \]
for a suitable constant $k$, and the uniqueness follows.

In order to prove inequalities (3.4) and (3.5) we first need to recall a standard result on the "finite speed of information propagation" for Glauber dynamics with bounded rates.

Lemma 6.3. Assume the transition rates uniformly bounded, i.e. $\kappa_2 = 0$ in (2.8). Then there exists a constant $k_0$ depending on $d$, $r$ and $c_M$ and, for any local function $f$, there is $A(f)$ such that for all $V \subset\subset \mathbb{Z}^d$, $t \ge 0$ with $d(V^c, \Lambda_f) \ge k_0 t$, we have
\[ \sup_{\tau \in \Omega} \| T(t) f - T^{\tau}_V(t) f \|_\infty \le A(f)\, e^{-2t}. \]

Proof. One can see for instance Lemma 1.7 in [HS], or Lemma 1 in [S], which makes use of the explicit "graphical construction" of the process.

Let now $L_t = \lfloor k_1 t \rfloor$ for some $k_1 > k_0$ ($k_0$ is given in Lemma 6.3) and, for simplicity, let $\Lambda_t = B_{L_t}$. Choose an arbitrary boundary condition $\tau$. Then we have
\[ \| T^J(t) f - \mu^J(f) \|_\infty \le \| T^{J,\tau}_{\Lambda_t}(t) f - \mu^{J,\tau}_{\Lambda_t}(f) \|_\infty + \| T^J(t) f - T^{J,\tau}_{\Lambda_t}(t) f \|_\infty + | \mu^{J,\tau}_{\Lambda_t}(f) - \mu^J(f) |. \tag{6.15} \]


Let us examine separately the three terms appearing in the RHS of (6.15). The first one, using (i) and (iii) above, together with hypercontractivity (see the proof of Theorem 4.1 in [GZ1]), can be bounded from above by
\[ \| T^{J,\tau}_{\Lambda_t}(t) f - \mu^{J,\tau}_{\Lambda_t}(f) \|_\infty \le 2\, |||f|||\, \exp\Bigl[ -\frac{t}{2}\, c_s(L^{J,\tau}_{\Lambda_t})^{-1} \Bigr] \le 2\, |||f|||\, \exp\Bigl[ -\frac{t}{2} \exp\bigl( -C A_\delta (\log L_t)^{1 - \frac{\delta'}{d}} (\log\log L_t)^{d - \delta'} \bigr) \Bigr] \tag{6.16} \]
for any sufficiently large $t$. The second term in (6.15), thanks to Lemma 6.3, is not greater than $A(f) e^{-2t}$. The last term is bounded by the RHS of (6.14) which, if $k_1 > 3\alpha^{-1}$, is bounded by $A'(f) e^{-2t}$ for large $t$. This concludes the proof of (3.4).

Proof of part (b). Define $L_t$ as in part (a) and, for any $\varepsilon \in (0, 1)$, let $\Theta(t, \varepsilon)$ be the set of interactions $J$ such that

(i) $c_s(L^{J}_{\Lambda_t}) \le L_t^{\varepsilon}$.
(ii) $SMT(\Lambda_t, (\log L_t)^{\frac{d}{d-1}}, \hat{m}(\alpha))$.

We can write, for any $\tau \in \Omega$,
\[ E\, \| T^J(t) f - \mu^J(f) \|_\infty \le |||f|||\, P(\Theta(t, \varepsilon)^c) + \sup_{J \in \Theta(t, \varepsilon)} \| T^{J,\tau}_{\Lambda_t}(t) f - \mu^{J,\tau}_{\Lambda_t}(f) \|_\infty + \sup_{J \in \Theta(t, \varepsilon)} \| T^J(t) f - T^{J,\tau}_{\Lambda_t}(t) f \|_\infty + \sup_{J \in \Theta(t, \varepsilon)} | \mu^{J,\tau}_{\Lambda_t}(f) - \mu^J(f) |. \tag{6.17} \]
We denote by $X_1$, $X_2$, $X_3$ and $X_4$ the four terms on the RHS of (6.17). For the last two terms we can proceed as in part (a) and we get
\[ X_3 + X_4 \le (A(f) + A'(f))\, e^{-2t}. \tag{6.18} \]
Furthermore, we have
\[ P(\Theta(t, \varepsilon)^c) \le P\{ c_s(L^{J}_{\Lambda_t}) \ge L_t^{\varepsilon} \} + P\{ SMT(\Lambda_t, (\log L_t)^{\frac{d}{d-1}}, \hat{m}(\alpha)) \text{ does not hold} \}. \tag{6.19} \]
Of the above two terms the first one is estimated via (ii) of Theorem 3.1, which implies
\[ P\{ c_s(L^{J}_{\Lambda_t}) \ge L_t^{\varepsilon} \} \le \exp\bigl[ -C_3 (\log\log L_t)^{-d} (\log L_t)^{\frac{d}{d-1}} \bigr], \tag{6.20} \]
provided that $t$ is large enough. The second term in the RHS of (6.19) can be bounded from above, using Proposition 4.1, by the probability that there exists a cube $Q_l(x)$ in $\Lambda_t$, with $l = \lceil (\log L_t)^{\frac{d}{d-1}} \rceil$, which is not $\alpha$-regular. Using Proposition 4.8 such a probability is bounded from above by
\[ L_t^d \exp\bigl[ -\vartheta' (\log L_t)^{\frac{d}{d-1}} \bigr], \tag{6.21} \]
provided that $t$ is so large that $L_t \ge L_0$. In this way we have obtained
\[ X_1 \le |||f|||\, \Bigl[ \exp\bigl[ -C_3 (\log\log L_t)^{-d} (\log L_t)^{\frac{d}{d-1}} \bigr] + L_t^d \exp\bigl[ -\vartheta' (\log L_t)^{\frac{d}{d-1}} \bigr] \Bigr]. \tag{6.22} \]


As for $X_2$, we use hypercontractivity (see again the proof of Theorem 4.1 in [GZ1]) and the fact that now $c_s(L^{J}_{\Lambda_t}) \le L_t^{\varepsilon}$, and we get
\[ X_2 \le 2\, |||f|||\, \exp\bigl[ -k' t^{1 - \varepsilon} \bigr] \tag{6.23} \]
for any $t$ sufficiently large. From (6.18), (6.22) and (6.23) we get that for large $t$ the dominant term in (6.17) is the first one and, by consequence, (3.6) follows.

6.2. Proof of the lower bound, Theorem 3.3.

Proof of part (a). The main idea behind the proof of the lower bound for the averaged dynamics is not new (see [DRS]) and it can be summarized as follows. If all the couplings $J_{xy}$ in the cube $B_L$ are above the critical value for the standard Ising model, then the spin at the origin reaches equilibrium after a time $t$ which is at least the relaxation time of the cube $B_L$. Since the relaxation time for the stochastic Ising model in a cube $B_L$, at low temperature and zero external field, grows like the exponential of the surface $L^{d-1}$, it follows that if $L \approx (\log t)^{\frac{1}{d-1}}$, then at time $t$ the spin at the origin has not yet equilibrated. To complete the argument one has to observe that, under our assumptions, the probability that the $J_{xy}$'s in $B_L$ are all equal and large is not smaller than an exponential of the volume $L^d$.

Let us now provide the details. Given $J_1 > 0$ and a positive integer $L$, let $\Lambda = B_L$ and let $\bar{\Theta}$ be the set of all interactions $J \in \Theta$ such that there is a unique Gibbs measure $\mu^J$ and

(a) $J_{xy} = J_1$ for all $\{x, y\}$ such that $\{x, y\} \subset \Lambda$,
(b) $|J_{xy}| \le \frac{1}{4}$ for all $\{x, y\}$ which intersect both $\Lambda$ and $\Lambda^c$ (the boundary edges).

If we denote by $m_\Lambda = |\Lambda|^{-1} \sum_{x \in \Lambda} \sigma(x)$ the normalized magnetization in $\Lambda$, we can write (remember that $\mu^J(\pi_0) = 0$)
\[ E\, \| T^J(t) \pi_0 \|_{L^2(\mu^J)} \ge E\, \| T^J(t) m_\Lambda \|_{L^2(\mu^J)} \ge P(\bar{\Theta}) \inf_{J \in \bar{\Theta}} \| T^J(t) m_\Lambda \|_{L^2(\mu^J)}. \tag{6.24} \]
Choose $J \in \bar{\Theta}$ and let
\[ F_\Lambda = \{ \sigma \in \Omega : m_\Lambda(\sigma) > \tfrac{1}{2} \}. \]
Then we have
\[ \| T^J(t) m_\Lambda \|_{L^2(\mu^J)} \ge \sqrt{\mu(F_\Lambda)}\; \| T^J(t) m_\Lambda \|_{L^2(\mu^J(\cdot \mid F_\Lambda))} \]
and
\[ \| T^J(t) m_\Lambda \|_{L^2(\mu^J(\cdot \mid F_\Lambda))} \ge \| T^J(t) m_\Lambda \|_{L^1(\mu^J(\cdot \mid F_\Lambda))} \ge \mu(T^J(t) m_\Lambda \mid F_\Lambda). \tag{6.25} \]
For $\sigma \in \Omega$, let $\{\eta^\sigma_t\}_{t \ge 0}$ be the process associated with $T^J(t)$ with initial condition $\eta^\sigma_0 = \sigma$, and let $\{\eta^\mu_t\}_{t \ge 0}$ be the stationary process (the one with initial distribution $\mu^J$). Consider the events
\[ G^\sigma_{\Lambda, t} \equiv \{ \exists\, s \in [0, t] : |m_\Lambda(\eta^\sigma_s) - 1/2| \le 1/100 \}, \qquad \sigma \in \Omega \cup \{\mu\}. \]
For each $\sigma \in F_\Lambda$, if $|\Lambda| > 100$, we have
\[ m_\Lambda(\eta^\sigma_t) \ge \frac{1}{2}\, 1\!\mathrm{I}_{(G^\sigma_{\Lambda, t})^c} - 1\!\mathrm{I}_{G^\sigma_{\Lambda, t}} = \frac{1}{2} - \frac{3}{2}\, 1\!\mathrm{I}_{G^\sigma_{\Lambda, t}}, \]


which implies
\[ \mu(T^J(t) m_\Lambda \mid F_\Lambda) \ge \frac{1}{2} - \frac{3}{2} \int \mu(d\sigma \mid F_\Lambda)\, \mathrm{Prob}(G^\sigma_{\Lambda, t}) \ge \frac{1}{2} - \frac{3}{2}\, \mu(F_\Lambda)^{-1}\, \mathrm{Prob}(G^\mu_{\Lambda, t}). \tag{6.26} \]
If $t_1, t_2, \dots$ are the (random) times at which the stationary process $\eta^\mu_t$ is updated inside $\Lambda$ and $n_t$ is the number of updates up to time $t$, we have, for all $j \in \mathbb{Z}_+$,
\[ \mathrm{Prob}(G^\mu_{\Lambda, t}) \le j\, \mu^J\{ |m_\Lambda(\sigma) - 1/2| \le 1/100 \} + \mathrm{Prob}\{ n_t > j \}, \tag{6.27} \]
which, taking $j = k |\Lambda| t$ with $k = 2 c_M$, can be bounded by (remember that we have $\kappa_2 = 0$)
\[ k |\Lambda| t\, \mu^J\{ |m_\Lambda(\sigma) - 1/2| \le 1/100 \} + e^{-k' |\Lambda| t} \tag{6.28} \]
for a suitable positive constant $k'$. The idea is now to

(1) Replace $\mu^J$ with $\mu^{J_1, \emptyset}_\Lambda$, i.e. with the Ising Gibbs measure in $\Lambda$ with coupling $J_1$ and free boundary conditions. Thanks to the properties (a) and (b) of $\bar{\Theta}$ and the DLR condition, the price to pay can be estimated as
\[ e^{-|\partial^+ \Lambda| / 2}\, \mu^{J_1, \emptyset}_\Lambda(X) \le \mu^J(X) \le e^{|\partial^+ \Lambda| / 2}\, \mu^{J_1, \emptyset}_\Lambda(X) \qquad \forall X \in \mathcal{F}_\Lambda. \tag{6.29} \]

(2) Use the following key result for the large deviations of the magnetization for the $d$-dimensional Ising model in $\Lambda$ without external field and with free boundary conditions.

Theorem 6.4. For each $d \ge 2$ there exists $\tilde{J}_1(d) > 0$ such that if $J_1 \ge \tilde{J}_1(d)$,
\[ \mu^{J_1, \emptyset}_\Lambda\{ m_\Lambda(\sigma) \ge 1/2 \} \ge \frac{1}{3}, \qquad \mu^{J_1, \emptyset}_\Lambda\{ |m_\Lambda(\sigma) - 1/2| \le 1/100 \} \le e^{-2 |\partial^+ \Lambda|}. \]

Proof. The $d = 2$ case has been proved in [Sh] (see also [Pf]) and extended up to the critical temperature in [CGMS]. For $d > 2$ see [P].

Remark. The results of [P] are stated for the standard Ising model, namely when the couplings $J_{xy}$ are all equal and large enough. We expect the same result to hold also when the $J_{xy}$'s are not all equal, but just large enough.

Choose now $J_1$ as in Theorem 6.4, and take $L = L_t$ as the smallest integer for which $|\partial^+ \Lambda| \ge 2 \log t$. In this way we find
\[ \mu(T^J(t) m_\Lambda \mid F_\Lambda) \ge \frac{1}{2} - \frac{9}{2}\, k |\Lambda| t\, e^{-|\partial^+ \Lambda|} - \frac{9}{2}\, e^{-k' |\Lambda| t + |\partial^+ \Lambda| / 2} \ge \frac{1}{3} \tag{6.30} \]
for all $t$ large enough. From (6.24) – (6.30) it follows
\[ E\, \| T^J(t) \pi_0 \|_{L^2(\mu^J)} \ge \frac{1}{3}\, e^{-|\partial^+ \Lambda| / 2}\, P(\bar{\Theta}) \ge \exp\bigl[ -k'' (\log t)^{\frac{d}{d-1}} \bigr] \]
for a suitable positive constant $k''$.
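The choice of the scale $L_t$ in the proof of part (a) can be made concrete with a small sketch. It is only an illustration, under assumptions that the text does not state here: $B_L$ is taken to be the cube $\{-L, \dots, L\}^d$ and $\partial^+ \Lambda$ is taken to be its nearest-neighbour exterior boundary, which has $2d\,(2L+1)^{d-1}$ sites. The function names are hypothetical. With these conventions the volume $|\Lambda|$ at scale $L_t$ is of order $(\log t)^{d/(d-1)}$, which is the exponent appearing in the final bound.

```python
def boundary_size(L, d):
    """|d+ B_L| ASSUMING B_L = {-L,...,L}^d and a nearest-neighbour exterior boundary."""
    return 2 * d * (2 * L + 1) ** (d - 1)


def smallest_box(log_t, d):
    """Smallest integer L with |d+ B_L| >= 2 log t (requires d >= 2)."""
    if d < 2:
        raise ValueError("the boundary does not grow with L when d < 2")
    L = 0
    while boundary_size(L, d) < 2.0 * log_t:
        L += 1
    return L


def volume(L, d):
    """|B_L| under the same assumption on B_L."""
    return (2 * L + 1) ** d


if __name__ == "__main__":
    for d in (2, 3):
        for log_t in (10.0, 100.0, 1000.0):
            L = smallest_box(log_t, d)
            # compare the volume with (log t)^(d/(d-1))
            print(d, log_t, L, volume(L, d), log_t ** (d / (d - 1)))
```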


Proof of part (b). The main idea for the lower bound on the a.s. relaxation of the spin at the origin seems to be new and it can be divided into two distinct parts. The first part consists in showing that, with probability one, for any $L$ large enough, there exists a local function $f_L$, with $\Lambda_{f_L} \subset B_L$, whose relaxational behaviour is not faster than
\[ \exp\Bigl[ -t \exp\bigl[ -k (\log L)^{\frac{d-1}{d}} \bigr] \Bigr]. \]
The second part amounts to proving that the influence of the slow relaxation of $f_L$ on the spin at the origin is not smaller than a negative exponential of $L$. This implies a lower bound on $\| T^J(t) \pi_0 \|_\infty$ of the order of $\exp\bigl[ -mL - t \exp[ -k (\log L)^{\frac{d-1}{d}} ] \bigr]$, and the result (remember that $\mu^J(\pi_0) = 0$ by symmetry) follows by optimizing over $L \le t$.

Let us now implement these sketchy ideas. Given a local function $f$ and a finite set $\Lambda$ we set
\[ e^J(f) = \frac{\mathcal{E}^J(f, f)}{\mathrm{Var}^J(f)}; \qquad d(\Lambda) = \sup_{x \in \Lambda} |x|; \qquad e^J(\Lambda) = \inf_{\substack{f \in L^2(\Omega, d\mu^J) \\ \Lambda_f \subset \Lambda}} e^J(f) \tag{6.31} \]
(both the Dirichlet form and the variance are with respect to the unique infinite volume Gibbs measure). With the above definition we have the following two key results.

Lemma 6.5. Under the same assumptions of part (a) of Theorem 3.3 there exists a set $\bar{\Theta} \subset \Theta$ of full measure and a positive constant $k$ such that for each $J \in \bar{\Theta}$ there exists $L(J) < \infty$ such that
\[ e^J(B_L) \le \exp\bigl[ -k (\log L)^{\frac{d-1}{d}} \bigr] \qquad \forall\, L \ge L(J). \]

Lemma 6.6. Under the same assumptions of part (b) of Theorem 3.3 there exists $m > 0$ such that for any $t \ge 1$ and any finite set $\Lambda$,
\[ \| T^J(t) \pi_0 \|_\infty \ge (8 |\Lambda|)^{-1} \exp\bigl[ -m\, d(\Lambda) - 2 e^J(\Lambda) t \bigr]. \]

Before proving the two lemmas we complete the proof of part (b) of the theorem. For this purpose choose $J$ in the set of full measure given by Lemma 6.5, define $L(t) = t \exp\bigl[ -(\log t)^{\frac{d-1}{d}} \bigr]$ and assume that $t$ is so large that $L(t) \ge L(J)$. If we apply Lemma 6.6 to the box $B_{L(t)}$ and use the upper bound on $e^J(B_{L(t)})$ given in Lemma 6.5 we immediately get the sought lower bound on $\| T^J(t) \pi_0 \|_\infty$.

Proof of Lemma 6.5. It is simple to check that there exists $\varepsilon > 0$ such that for almost all $J$ there exists $L(J)$ such that for all $L \ge L(J)$ there exists $x \equiv x(L, J)$, with $|x| \le L/2$, such that all couplings inside the cube $Q_l(x)$, $l = (\varepsilon \log L)^{1/d}$, are equal to $J_1$, with $J_1$ as in Theorem 6.4, and all couplings connecting a point inside $Q_l(x)$ with one of its nearest neighbors outside it are smaller than $1/4$ (see also Remark 2 after Theorem 3.1). By construction $Q_l(x) \subset B_L$ if $L$ is large enough. Let now $f(\sigma) \equiv 1\!\mathrm{I}\{ m_{Q_l(x)}(\sigma) \ge 0 \}$, where $m_{Q_l(x)}(\sigma)$ denotes the (normalized) magnetization in $Q_l(x)$. Notice that $\Lambda_f \subset Q_l(x) \subset B_L$. If we compute $e^J(f)$ and use (6.29), Theorem 6.4 and the symmetry under global spin flip, we get
\[ e^J(f) \le \frac{c_M\, |Q_l(x)|\, \mu^J\{ |m_{Q_l(x)}| \le 1/100 \}}{\mathrm{Var}^J(f)} \le \exp\bigl[ -k (\log L)^{\frac{d-1}{d}} \bigr] \]
for a suitable constant $k$ depending on $\varepsilon$ and any $L$ large enough.


Proof of Lemma 6.6. For any given bounded $J$ with $J_{xy} \ge \delta > 0$ for all nearest neighbor pairs $\{x, y\}$, we set $F^J(x, t) \equiv T^J(t) \pi_x(1)$. Notice that, since the nearest neighbor couplings $J_{xy}$ are uniformly bounded and positive, the heat-bath dynamics is attractive (see [L]), so that $F^J(x, t)$ is a non-increasing function of $t$ and $\| T^J(t) \pi_0 \|_\infty = F^J(0, t)$. Next we define $m$ by
\[ \inf_{|x - y| = 1}\ \inf_{\substack{\sigma, \eta \in \Omega \\ \sigma(y) = 1,\ \eta(y) = -1 \\ \sigma(z) \ge \eta(z)\ \forall z \ne y}} \bigl[ T^J(1) \pi_x(\sigma) - T^J(1) \pi_x(\eta) \bigr] \equiv 2 e^{-m}. \tag{6.32} \]
Thanks to attractivity the quantity in (6.32) is non-negative, and, in particular, it is strictly positive with our choice of the transition rates. Fix now a finite set $\Lambda$. The result of the lemma is a direct consequence of the following three inequalities, valid for any local function $f$ with $\Lambda_f \subset \Lambda$ and any $t \ge 1$:
\[ F^J(0, t) \ge e^{-m |x|} F^J(x, t), \tag{6.33} \]
\[ \| T^J(t) f - \mu^J(f) \|^2_{L^2(\mu^J)} \le 4\, \mathrm{Var}^J(f) \sum_{x \in \Lambda_f} F^J(x, t), \tag{6.34} \]
\[ \| T^J(t) f - \mu^J(f) \|^2_{L^2(\mu^J)} \ge \frac{\mathrm{Var}^J(f)}{2} \exp[ -2 e^J(f) t ]. \tag{6.35} \]
In fact, by summing (6.33) over $x \in \Lambda_f$ and using (6.34) and (6.35), we get for any local function $f$ such that $\Lambda_f \subset \Lambda$,
\[ F^J(0, t) \ge \frac{1}{8 |\Lambda_f|} \exp[ -m\, d(\Lambda_f) - 2 e^J(f) t ] \ge \frac{1}{8 |\Lambda|} \exp[ -m\, d(\Lambda) - 2 e^J(f) t ], \tag{6.36} \]
which proves the lemma if we take the supremum over $f$ in the RHS of (6.36).

Let us prove (6.33). Using induction over $x$ and the fact that $F^J(x, t)$ is non-increasing in $t$, it is sufficient to prove that
\[ F^J(0, t) \ge e^{-m} F^J(x, t - 1) \tag{6.37} \]
for any $x$ with $|x| = 1$. To prove (6.37) we observe that, because of attractivity, it is possible to define all processes $\{\eta^\sigma_t\}_{t \ge 0}$ starting from $\sigma$ on the same probability space in such a way that $\sigma(x) \ge \sigma'(x)\ \forall x \in \mathbb{Z}^d$ implies $\eta^\sigma_t(x) \ge \eta^{\sigma'}_t(x)\ \forall x \in \mathbb{Z}^d$. Let $\hat{E}$ denote the expectation over this global coupling. Then, using the Markov property, we can write
\[ F^J(0, t) = \frac{1}{2} \bigl[ T^J(t) \pi_0(1) - T^J(t) \pi_0(-1) \bigr] = \frac{1}{2}\, \hat{E} \bigl[ T^J(1) \pi_0(\eta^{1}_{t-1}) - T^J(1) \pi_0(\eta^{-1}_{t-1}) \bigr]. \tag{6.38} \]
Using the definition of $m$ given in (6.32), we can write
\[ \text{RHS of (6.38)} \ge e^{-m}\, \hat{E}\, 1\!\mathrm{I}\{ \eta^{1}_{t-1}(x) = +1,\ \eta^{-1}_{t-1}(x) = -1 \} \equiv e^{-m} F(x, t - 1) \tag{6.39} \]
for any $x$ with $|x| = 1$, and (6.37) follows.

Let us prove (6.34). Let $f$ be a local function with $\mu^J(f) = 0$. Then, using reversibility, we write

\[ \| T^J(t) f \|^2_{L^2(\mu^J)} = \int d\mu^J(\sigma) \Bigl[ \int d\mu^J(\sigma')\, \hat{E} \bigl[ f(\eta^\sigma_t) - f(\eta^{\sigma'}_t) \bigr] \Bigr]^2. \tag{6.40} \]
Notice that, by monotonicity, the event $\{ \eta^{1}_t(x) = \eta^{-1}_t(x) \}$ implies the event $\{ \eta^\sigma_t(x) = \eta^{\sigma'}_t(x) \}$ for any pair of initial conditions $\sigma, \sigma'$. Thus, using the Schwarz inequality, we can bound from above the RHS of (6.40) by
\[ 2 \int d\mu^J(\sigma) \int d\mu^J(\sigma')\, \hat{E} \bigl[ f(\eta^\sigma_t)^2 + f(\eta^{\sigma'}_t)^2 \bigr]\, \hat{E}\, 1\!\mathrm{I}\{ \exists\, x \in \Lambda_f : \eta^{1}_t(x) \ne \eta^{-1}_t(x) \} \le 4\, \mathrm{Var}^J(f) \sum_{x \in \Lambda_f} \hat{E} \bigl[ 1\!\mathrm{I}\{ \eta^{1}_t(x) \ne \eta^{-1}_t(x) \} \bigr] = 4\, \mathrm{Var}^J(f) \sum_{x \in \Lambda_f} F^J(x, t), \]
and (6.34) follows.

Let us finally prove (6.35). Given a local function $f$ with $\mu^J(f) = 0$, let $P^J_f$ be the spectral projection of $-L^J$ associated to the set $[0, 2 e^J(f)]$. An elementary $L^2$ computation shows that $\| P^J_f f \|^2_{L^2(\mu^J)} \ge \frac{1}{2} \mathrm{Var}^J(f)$. Thus, thanks to the spectral theorem, we get
\[ \| T^J(t) f \|^2_{L^2(\mu^J)} \ge e^{-2 e^J(f) t}\, \| P^J_f f \|^2_{L^2(\mu^J)} \ge \frac{1}{2}\, e^{-2 e^J(f) t}\, \mathrm{Var}^J(f), \]
and (6.35) follows.

A1. Appendix 1

Given two subsets $A$, $B$ of $\mathbb{Z}^d$ we let $\delta_r(A, B) = (\partial^+_r A \cap B) \cup (\partial^+_r B \cap A)$.

Proposition A1.1. For each $d, r \in \mathbb{Z}_+$, there exists $k(d, r)$ such that for each $V \subset\subset \mathbb{Z}^d$, and for each $v \in [0, |V|]$, there exists $X_v \subset V$ such that, if we let $S \equiv \lceil 2 |V|^{\frac{d-1}{d}} \rceil$ and $Y_v \equiv V \setminus X_v$, we have

(a) $v - kS \le |X_v| \le v$,
(b) $|\delta_r(X_v, Y_v)| \le kS$,
(c) $X_0 = \emptyset$, $X_{|V|} = V$ and $X_v \subset X_w$ if $v < w$.

Proof. Given $V \subset\subset \mathbb{Z}^d$, we define the $i$-width of $V$ as the smallest $k \in \mathbb{Z}_+$ such that there exists $n \in \mathbb{Z}$ with the property that, for all $x \in V$, we have $x_i \in \{n, \dots, n + k - 1\}$, where $x_i$ is the $i$th coordinate of $x$. We start with the following result.

Lemma A1.2. Let $V$ be a finite subset of $\mathbb{Z}^d$, let $i \in \{1, \dots, d\}$ and let $a \in [0, |V|]$. Let $L$, $S$ be two positive numbers such that $LS > |V|$. Then there exist $k = k(d) > 0$ and two disjoint subsets of $V$, $W_1$ and $W_2$, such that

(a) $|W_1| \le a$.
(b) The $i$-width of $W_2$ is less than or equal to $L$.
(c) $|\delta_1(W_1, V \setminus W_1)| \le kS$ and $|\delta_1(W_2, V \setminus W_2)| \le kS$.
(d) If $W_2 = \emptyset$ then $|W_1| \ge a - S$, while, if $W_2 \ne \emptyset$, then $|W_1 \cup W_2| > a$.


Proof. If $a = |V|$, we take $W_1 = V$ and $W_2 = \emptyset$ and the lemma follows. Assume now that $a < |V|$. For $j \in \mathbb{Z}$, let
\[ V^{(i)}_j = \{ x \in V : x_i = j \} \qquad \text{and} \qquad m = \inf\Bigl\{ j \in \mathbb{Z} : \Bigl| \bigcup_{k \le j} V^{(i)}_k \Bigr| > a \Bigr\}. \tag{A1.1} \]
Since $a < |V|$, $m$ is always finite. If $|V^{(i)}_m| \le S$ we set
\[ W_1 = \bigcup_{j < m} V^{(i)}_j, \qquad W_2 = \emptyset. \tag{A1.2} \]
Then properties (a), (b) and (d) are trivial, while, in order to prove (c), we observe that
\[ \delta_1(W_1, V \setminus W_1) \subset V^{(i)}_m \cup \{ x \in \mathbb{Z}^d : d(x, V^{(i)}_m) \le 1 \}, \]
which implies
\[ |\delta_1(W_1, V \setminus W_1)| \le |V^{(i)}_m| (1 + 3^d) \le (3^d + 1) S. \]
Consider now the case when $|V^{(i)}_m| > S$. Define
\[ m_1 = \sup\{ j < m : |V^{(i)}_j| \le S \}, \qquad m_2 = \inf\{ j > m : |V^{(i)}_j| \le S \}. \]
Let then
\[ W_1 = \bigcup_{j \le m_1} V^{(i)}_j, \qquad W_2 = \bigcup_{j = m_1 + 1}^{m_2 - 1} V^{(i)}_j. \]
The statements (a) – (d) are easily verified.
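The slicing construction used in the proof of Lemma A1.2 is algorithmic, and a minimal sketch of it may help in checking properties (a), (b) and (d) on small examples. The function name and the representation of $V$ as a set of integer tuples are hypothetical choices, and the coordinate index is 0-based here, unlike in the text. Property (c) is not checked by the sketch.

```python
from collections import defaultdict


def split_by_slices(V, i, a, S):
    """Return (W1, W2) following the proof of Lemma A1.2.

    V: finite set of integer tuples; i: 0-based coordinate index;
    a: number in [0, |V|]; S: positive slice-size threshold.
    """
    V = set(V)
    if a >= len(V):
        return V, set()

    # slices[j] plays the role of V_j^{(i)} = {x in V : x_i = j}
    slices = defaultdict(set)
    for x in V:
        slices[x[i]].add(x)

    def size(j):
        return len(slices[j]) if j in slices else 0

    # m = first level at which the cumulative count exceeds a, as in (A1.1)
    total = 0
    m = None
    for j in sorted(slices):
        total += len(slices[j])
        if total > a:
            m = j
            break

    if size(m) <= S:
        # thin slice at level m: cut just below it, as in (A1.2)
        return {x for x in V if x[i] < m}, set()

    # thick slice at level m: look for the nearest thin slices on both sides;
    # empty levels count as thin, so both searches terminate
    m1 = m - 1
    while size(m1) > S:
        m1 -= 1
    m2 = m + 1
    while size(m2) > S:
        m2 += 1
    W1 = {x for x in V if x[i] <= m1}
    W2 = {x for x in V if m1 < x[i] < m2}
    return W1, W2
```

For a $6 \times 4$ rectangle sliced along the first coordinate with $a = 10$, a threshold $S = 10$ leads to the thin-slice case with $W_2 = \emptyset$, while $S = 3$ makes every slice thick and puts the whole rectangle into $W_2$.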

We now prove Proposition A1.1 when $r = 1$ and then we will show that this is enough to treat the case of $r$ arbitrary. We let
\[ f^{(i)}_a(V) = W_1, \qquad g^{(i)}_a(V) = W_2, \]
where $W_1$ and $W_2$ are the subsets of $V$ found in the previous lemma for given values of $a$ and $i$, with $L$ and $S$ chosen as
\[ S \equiv \lceil 2 |V|^{\frac{d-1}{d}} \rceil, \qquad L \equiv \lceil |V|^{\frac{1}{d}} \rceil. \]
Then we define
\[ A_1 = f^{(1)}_v(V), \qquad D_1 = g^{(1)}_v(V), \]
and recursively
\[ D_i = g^{(i)}_{v - |A_{i-1}|}(D_{i-1}), \qquad E_i = f^{(i)}_{v - |A_{i-1}|}(D_{i-1}), \qquad A_i = A_{i-1} \cup E_i. \tag{A1.3} \]
We let then $k = \min\{ i : D_i = \emptyset \}$ and we claim that Proposition A1.1 holds with $X_v = A_k$. We observe that necessarily $k \le d$. In fact, the previous lemma, together with the fact that $D_i \subset D_{i-1}$, implies that the $i$-width of $D_{d-1}$ is no greater than $L$ for all $i \in \{1, \dots, d - 1\}$. By consequence, all the slices of $D_{d-1}$ in the direction perpendicular to the $d$ direction (see (A1.1)) have a cardinality not greater than $L^{d-1}$, which is less than $S$. Thus, by the proof of the lemma (see (A1.2)), it is clear that $D_d = \emptyset$.


Then we check that, for all i ∈ {1, . . . , k − 1} we have 0 ≤ v − |Ai | ≤ |Di |, so that definitions (A1.3) make sense for i ≤ k. This can be done by induction. From statement (a) of the previous lemma, we get |Ai | ≤ |Ai−1 | + |Ei | ≤ |Ai−1 | + v − |Ai−1 | = v. On the other hand, since Di 6= ∅, statement (d) of Lemma A1.2 implies |Ai | + |Di | = |Ai−1 | + |Ei | + |Di | > |Ai−1 | + v − |Ai−1 | = v. Since Dk = ∅, Lemma A1.2 implies |Ek | > v − |Ak−1 | − S, thus, by consequence we have |Ak | > v − S. Together with the statement |Ak | ≤ v , this gives part (a) of the proposition. In order to prove part (b), we notice that, thanks to Lemma A1.2, we have |δ1 (Ei , Di−1 \Ei )| ≤ kS

|δ1 (Di , Di−1 \Di )| ≤ kS.

(A1.4)

We then claim that δ1 (Ai , V \Ai ) ⊂ δ1 (Ai−1 , V \Ai−1 ) ∪ ∪δ1 (Ei , Di−1 \Ei ) ∪ δ1 (Di−1 , V \Di−1 ), δ1 (Di , V \Di ) ⊂ δ1 (Di , Di−1 \Di ) ∪ δ1 (Di−1 , V \Di−1 )

(A1.5) (A1.6)

Iterating (A1.6) and using (A1.4) we find $|\delta_1(D_i, V\setminus D_i)| \le kiS$, which, inserted into (A1.5), together with (A1.4), gives $|\delta_1(A_i, V\setminus A_i)| \le k d^2 S$, which completes the proof of the proposition. To obtain (A1.5), we write
\[ \delta_1(A_i, V\setminus A_i) = \delta_1(A_{i-1}, V\setminus A_i) \cup \delta_1(E_i, V\setminus A_i) \subset \delta_1(A_{i-1}, V\setminus A_{i-1}) \cup \delta_1(E_i, V\setminus A_i). \]
The last term can be written as
\[ \delta_1(E_i, V\setminus A_i) = \delta_1(E_i, D_{i-1}\setminus A_i) \cup \delta_1(E_i, V\setminus(A_i\cup D_{i-1})) \subset \delta_1(E_i, D_{i-1}\setminus E_i) \cup \delta_1(D_{i-1}, V\setminus D_{i-1}), \]
which proves (A1.5). Furthermore, to get (A1.6), we observe that
\[ \delta_1(D_i, V\setminus D_i) = \delta_1(D_i, D_{i-1}\setminus D_i) \cup \delta_1(D_i, V\setminus D_{i-1}) \subset \delta_1(D_i, D_{i-1}\setminus D_i) \cup \delta_1(D_{i-1}, V\setminus D_{i-1}). \]
This proves (b). Property (c) follows from the construction.

Finally we want to show that Proposition A1.1 with $\delta_1$ (i.e. for $r = 1$) implies that the same result holds for $\delta_r$ but with a different constant $k$. Choose then $v \in [0, |V|]$, let $s = 2r$ and consider the mapping
\[ \mathbb{Z}^d \ni x = (x_1,\dots,x_d) \mapsto \pi(x) \equiv (\lceil x_1/s\rceil, \dots, \lceil x_d/s\rceil) \in \mathbb{Z}^d. \]
Applying Proposition A1.1 (with $r = 1$) to the set $\pi V$, we get that, for each $u \in [0, |\pi V|]$, $\pi V$ is the disjoint union of two subsets $\pi V = X'_u \cup Y'_u$ such that properties (a), (b) (with $r = 1$) and (c) hold. Let then $w = \sup\{\, u \in [0, |\pi V|] : |(\pi^{-1}X'_u)\cap V| \le v \,\}$. We claim that, if we define
\[ X_v = (\pi^{-1}X'_w)\cap V, \]


then (a), (b) and (c) are satisfied. By definition of $w$ we have $|X_v| \le v$. Now, let $\bar w = (w+1)\wedge|\pi V|$ and let $\Delta = X'_{\bar w}\setminus X'_w$. Using (a) we get
\[ |\Delta| = |X'_{\bar w}| - |X'_w| \le \bar w - (w - kS) \le (k+1)S. \tag{A1.7} \]
Moreover, by considering both cases $\bar w = w + 1$ and $\bar w = |\pi V|$, it is easy to verify that $|(\pi^{-1}X'_{\bar w})\cap V| \ge v$. Thus we obtain
\[ |X_v| = |(\pi^{-1}X'_{\bar w})\cap V| - |(\pi^{-1}\Delta)\cap V| \ge v - s^d|\Delta| \ge v - (k+1)s^d S, \]
which proves (a). To prove (b) all we need is to observe that
\[ \delta_r(X_v, Y_v) \subset \delta_r(\pi^{-1}X'_w, \pi^{-1}Y'_w) \subset \pi^{-1}\delta_1(X'_w, Y'_w), \]
which implies
\[ |\delta_r(X_v, Y_v)| \le s^d kS. \]
The proof of (c) is straightforward.

A2. Appendix 2

Proposition 2.1. Take the transition rates as in (2.11). Let $l \in \mathbb{Z}_+$ and let $V$ be a multiple of $Q_l$, i.e. $V = \bigcup_{i=1}^{s} Q_l(x_i)$ with $x_i \in l\mathbb{Z}^d$. Assume that each $Q_l(x_i)$, $i = 1,\dots,s$, is $\alpha$-regular for some $\alpha > 0$. Then, if $l$ is larger than some $\bar l(d, r, \alpha)$, we have
\[ \mathrm{gap}(L_V^{J,\tau}) \ge 8\,|V|^{-\omega}\exp(-k' m l^d), \]
where $\omega = d\log 4/\log(3/2)$, $k' = 9^{d-1}k$ and $k$ is the constant given in Theorem 4.12.

Proof. We can assume that $V$ is $r$-connected, since otherwise one could just consider the $r$-connected components of $V$. Since $V$ is a multiple of $Q_l$, this implies, if $l > r$, that $V$ is actually connected (i.e. 1-connected). Choose $l \in \mathbb{Z}_+$ and let, for $n = 0, 1, 2, \dots$,
\[ R_n = \big([0, a_{n+1}) \times [0, a_{n+2}) \times \cdots \times [0, a_{n+d})\big) \cap \mathbb{Z}^d, \]
where $a_n = 6 l b^n$ and $b = (3/2)^{1/d}$. Let $C_n^*$ be the set of all volumes $V \subset \mathbb{Z}^d$ such that (1) $V$ is a multiple of $Q_l$, (2) $V \subset R_n$ modulo translations and permutations of the coordinates. Let also $\Theta_r(V, l, \alpha)$ be the set of all interactions $J$ such that each $Q_l(x_i) \subset V$ with $x_i \in l\mathbb{Z}^d$ is $\alpha$-regular. Define
\[ g_n = \inf_{V \in C_n^*}\ \inf_{J \in \Theta_r(V,l,\alpha)}\ \inf_{\tau \in \Omega}\ \mathrm{gap}(L_V^{J,\tau}). \]

Thanks to the $\alpha$-regularity we know that
\[ \|J\|_x \le 100^{-1} m l \equiv J_0 \qquad \forall x \in V, \tag{A2.1} \]
where, as usual, we have set $m = \hat m(\alpha)$ (see Proposition 4.1). We will show that for all $n \ge 1$ we have
\[ g_n \ge \tfrac{1}{4}\, g_{n-1}, \tag{A2.2} \]
which implies $g_n \ge 4^{-n} g_0$. Using Theorem 4.12 to estimate $g_0$ we get
\[ g_n \ge 4^{-n}\,\tfrac{1}{2}\exp\!\big(-k J_0 |R_0|^{\frac{d-1}{d}}\big). \tag{A2.3} \]
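The exponent $\omega = d\log 4/\log(3/2)$ in the statement can be read off from the recursion: each step in $n$ costs a factor 4 in the gap, while the box side $a_n = 6lb^n$ grows by $b = (3/2)^{1/d}$. The following sketch (function names and the sample values of $d$ and $l$ are illustrative assumptions, not taken from the paper) checks numerically that $(6l)^{\omega} a_n^{-\omega} = 4^{-n}$, so that a volume with $|V| \ge a_n$ satisfies $(6l)^{\omega}|V|^{-\omega} \le 4^{-n}$.

```python
import math

def omega(d):
    """Exponent omega = d * log 4 / log(3/2) from the statement of the proposition."""
    return d * math.log(4.0) / math.log(1.5)

def side(n, l, d):
    """Box side a_n = 6 * l * b**n with b = (3/2)**(1/d)."""
    return 6.0 * l * 1.5 ** (n / d)

def volume_factor(volume, l, d):
    """The volume-dependent factor (6l)**omega * |V|**(-omega)."""
    return (6.0 * l / volume) ** omega(d)

if __name__ == "__main__":
    d, l = 3, 5  # illustrative dimension and block size
    for n in range(1, 8):
        # At |V| = a_n the two columns agree up to rounding.
        print(n, 4.0 ** (-n), volume_factor(side(n, l, d), l, d))
```

Since `volume_factor` is decreasing in the volume, any $V$ with $|V| \ge a_n$ gives a value no larger than $4^{-n}$, which is the direction needed to turn (A2.3) into a bound in terms of $|V|$.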

Once we have (A2.3) the proposition easily follows from (A2.1) and from the following observations:
(1) $|R_0| \le (6 l b^d)^d = (9l)^d$.
(2) Since $V$ is connected, if $V \in C_n^* \setminus C_{n-1}^*$, then $|V| \ge a_n = 6l(3/2)^{n/d}$, which implies
\[ 4^{-n} \ge (6l)^{\omega} |V|^{-\omega}. \]
So we are left with the proof of (A2.2). For this purpose we want to use (5.5) and Proposition 4.5. So we let $p_1 = \sup\{s \in l\mathbb{Z} : s \le a_n\}$, $p_2 = \inf\{s \in l\mathbb{Z} : s \ge a_{n+d} - a_n\}$, and
\[ A = \{x = (x_1,\dots,x_d) \in V : x_d < p_1\}, \qquad B = \{x = (x_1,\dots,x_d) \in V : x_d \ge p_2\}. \]
Both $A$ and $B$ are in $C_{n-1}^*$, so $\mathrm{gap}(L_A^{J,\tau}) \ge g_{n-1}$ and the same holds for $B$. Moreover, if we let $A_0 = A \cap \partial_r^+ B$, $B_0 = B \cap \partial_r^+ A$, we get
\[ d(A_0, B_0) = p_1 - p_2 + 1 \ge 2a_n - a_{n+d} - 2l \ge l\,(3/2)^{n/d}. \]
One can then check that, thanks also to (A2.1) and Proposition 4.1, the hypotheses of Proposition 4.5 are satisfied, so that the gap for the block dynamics on $\{A, B\}$ is at least 1/2. Combining this fact with formula (5.5) we get (A2.2).

Acknowledgement. The authors are grateful to the Schrödinger Institute in Wien for the kind hospitality and the opportunity to start this work. Particular thanks go to M. Zahradník for suggesting the main idea in the proof of the geometric proposition contained in Appendix 1.


Communicated by J.L. Lebowitz

Commun. Math. Phys. 188, 175 – 216 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

The Calogero-Sutherland Model and Generalized Classical Polynomials

T.H. Baker1, P.J. Forrester2,*

1 Department of Mathematics, University of Melbourne, Parkville, Victoria 3052, Australia. E-mail: [email protected]
2 Research Institute for Mathematical Sciences, Kyoto University, Kyoto 606, Japan

Received: 16 August 1996 / Accepted: 21 January 1997

Abstract: Multivariable generalizations of the classical Hermite, Laguerre and Jacobi polynomials occur as the polynomial part of the eigenfunctions of certain Schrödinger operators for Calogero-Sutherland-type quantum systems. For the generalized Hermite and Laguerre polynomials the multidimensional analogues of many classical results regarding generating functions, differentiation and integration formulas, recurrence relations and summation theorems are obtained. We use this and related theory to evaluate the global limit of the ground state density, obtaining in the Hermite case the Wigner semi-circle law, and to give an explicit solution for an initial value problem in the Hermite and Laguerre case.

1. Introduction

The Calogero-Sutherland model refers to exactly solvable quantum many body systems in one dimension with pair potentials proportional to $1/r^2$ (in some asymptotic limit at least). A subclass of these models also have exact BDJ-type ground states:
\[ \psi_0 = \prod_{j=1}^{N} f_1(x_j) \prod_{1\le j<k\le N} f_2(x_j, x_k). \tag{1.1} \]
Three particular quantum many body systems of the Calogero-Sutherland type possessing property (1.1) are specified by the Schrödinger operators
\[ H^{(H)} = -\sum_{j=1}^{N}\frac{\partial^2}{\partial x_j^2} + \frac{\beta^2}{4}\sum_{j=1}^{N} x_j^2 + \beta(\beta/2-1)\sum_{1\le j<k\le N}\frac{1}{(x_j-x_k)^2}, \tag{1.2a} \]

* Department of Mathematics, University of Melbourne, Parkville, Victoria 3052, Australia. E-mail: [email protected]

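\[ H^{(L)} = -\sum_{j=1}^{N}\frac{\partial^2}{\partial x_j^2} + \sum_{j=1}^{N}\left(\frac{\beta a'}{2}\Big(\frac{a'\beta}{2}-1\Big)\frac{1}{x_j^2} + \frac{\beta^2}{4}x_j^2\right) + 2\beta(\beta/2-1)\sum_{\substack{j,k=1\\ j\ne k}}^{N}\frac{x_j^2}{(x_k^2-x_j^2)^2}, \tag{1.2b} \]
\[ H^{(J)} = -\sum_{j=1}^{N}\frac{\partial^2}{\partial \phi_j^2} + \sum_{j=1}^{N}\left(\frac{a'\beta}{2}\Big(\frac{a'\beta}{2}-1\Big)\frac{1}{\sin^2\phi_j} + \frac{b'\beta}{2}\Big(\frac{b'\beta}{2}-1\Big)\frac{1}{\cos^2\phi_j}\right) + 2\beta(\beta/2-1)\sum_{\substack{j,k=1\\ j\ne k}}^{N}\frac{\sin^2\phi_j\cos^2\phi_j}{(\sin^2\phi_j-\sin^2\phi_k)^2}. \tag{1.2c} \]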

The superscripts (H), (L), (J) stand for Hermite, Laguerre and Jacobi respectively, and are chosen because of the relationship of these Schrödinger operators to generalizations of the corresponding classical polynomials. A direct calculation shows that there are eigenfunctions of the form $e^{-\beta W/2}$, where
\[ W^{(H)} = \frac{1}{2}\sum_{j=1}^{N} x_j^2 - \sum_{1\le j<k\le N}\log|x_k - x_j|, \tag{1.3a} \]
\[ W^{(L)} = \frac{1}{2}\sum_{j=1}^{N} x_j^2 - \frac{a'}{2}\sum_{j=1}^{N}\log x_j^2 - \sum_{1\le j<k\le N}\log|x_k^2 - x_j^2|, \tag{1.3b} \]
\[ W^{(J)} = -\frac{a'}{2}\sum_{j=1}^{N}\log\sin^2\phi_j - \frac{b'}{2}\sum_{j=1}^{N}\log\cos^2\phi_j - \sum_{1\le j<k\le N}\log|\sin^2\phi_j - \sin^2\phi_k|. \tag{1.3c} \]
Since these eigenfunctions are non-negative they correspond to the ground state wavefunction $\psi_0$ (i.e. they are the eigenfunctions with the lowest eigenvalue $E_0$). Notice that $\psi_0$ is indeed of the type (1.1). Conjugation of the Schrödinger operators by the reciprocal of the ground state $e^{\beta W/2}$ gives the Fokker-Planck operators
\[ L := -\frac{1}{\beta}\, e^{-\beta W/2}(H - E_0)\, e^{\beta W/2} = \sum_{j=1}^{N}\frac{\partial}{\partial x_j}\left(\frac{\partial W}{\partial x_j} + \frac{1}{\beta}\frac{\partial}{\partial x_j}\right) \tag{1.4} \]
(for $H^{(J)}$ the coordinates $x_j$ are to be replaced by $\phi_j$). Thus the Schrödinger equation
\[ i\frac{\partial}{\partial t}\psi(\{x_j\}; t) = H\psi(\{x_j\}; t), \tag{1.5} \]
with $\psi = e^{-iE_0 t}e^{\beta W/2}P$ and $t = \tau/i\beta$, transforms to the Fokker-Planck equation
\[ \frac{\partial}{\partial\tau}P = LP. \tag{1.6} \]
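As a sanity check of the conjugation (1.4) in the simplest setting, one can take $N = 1$ in the Hermite case, where $H = -d^2/dx^2 + \beta^2x^2/4$, $W = x^2/2$ and $E_0 = \beta/2$. The sketch below compares both sides of (1.4) by finite differences on a trial function; the value of $\beta$, the trial function and all function names are illustrative assumptions, and agreement is only expected up to discretization error.

```python
import math

BETA = 3.0  # illustrative coupling; any beta > 0 should behave the same way

def W(x):
    """N = 1 Hermite potential W = x^2 / 2, cf. (1.3a)."""
    return 0.5 * x * x

def d1(f, x, h=1e-4):
    """Central first difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d2(f, x, h=1e-3):
    """Central second difference."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def conjugated_schrodinger(P, x, beta=BETA):
    """-(1/beta) e^{-beta W/2} (H - E0) [e^{beta W/2} P] evaluated at x."""
    psi = lambda t: math.exp(0.5 * beta * W(t)) * P(t)
    h_psi = -d2(psi, x) + 0.25 * beta ** 2 * x * x * psi(x)
    e0 = 0.5 * beta
    return -math.exp(-0.5 * beta * W(x)) * (h_psi - e0 * psi(x)) / beta

def fokker_planck(P, x, beta=BETA):
    """d/dx ( W'(x) P + (1/beta) dP/dx ) evaluated at x, with W'(x) = x."""
    flux = lambda t: t * P(t) + d1(P, t) / beta
    return d1(flux, x)

def trial_density(x):
    """Arbitrary smooth test function (not a solution of anything)."""
    return (1.0 + x) * math.exp(-x * x)

if __name__ == "__main__":
    for x in (-1.0, -0.3, 0.4, 1.2):
        print(x, conjugated_schrodinger(trial_density, x), fokker_planck(trial_density, x))
```

Both columns should agree to several digits; the residual difference is the finite-difference error, not a defect of (1.4).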


The Fokker-Planck equation (1.6) describes the evolution of a classical gas in one dimension with potential energy $W$ undergoing Brownian motion. Two classes of problems associated with the Schrödinger operators (1.2), or equivalently the Fokker-Planck operator (1.4) with $W$ given by (1.3), are the topic of this paper. The first is the discussion of some mathematical properties relating to the eigenfunctions, while the second is the evaluation of the density in the ground state and the exact solution of (1.6) for certain initial conditions. These problems are in fact inter-related; we find that the density for each system can be written in terms of a certain eigenstate and that a summation theorem for the eigenstates gives an exact solution of (1.6).

A feature of the Schrödinger operators (1.2) is that after conjugation with the ground state:
\[ -e^{\beta W/2}(H - E_0)e^{-\beta W/2} = \sum_{j=1}^{N}\left(\frac{\partial^2}{\partial x_j^2} - \beta\frac{\partial W}{\partial x_j}\frac{\partial}{\partial x_j}\right) \tag{1.7} \]
the resulting differential operator has a complete set of polynomial eigenfunctions. In Sect. 2 we consider the form of the expansion of these polynomials in terms of some different bases of symmetric functions. We note that in the $N = 1$ case, after a suitable change of variables, the operator (1.7) with $W$ given by (1.3) is the eigenoperator for the classical Hermite, Laguerre and Jacobi polynomials. Previous studies of the operator for general $N$ in the Jacobi case [1] have established an orthogonality relation. Since the polynomials in the Hermite and Laguerre cases are limiting cases of these generalized Jacobi polynomials, we can obtain the corresponding orthogonality relations via the limiting procedure.

The generalized Hermite polynomials, which are the polynomial eigenfunctions of (1.7) with $W = W^{(H)}$ as given by (1.3a), are studied in Sect. 3. Many higher-dimensional analogues of properties of the classical Hermite polynomials are obtained, including a generating function formula, differentiation and integration formulas, a summation theorem and recurrence relations. An analogous study of the generalized Laguerre polynomials is performed in Sect. 4.

In Sect. 5 we relate the problem of computing the ground state density for the Schrödinger operators (1.2) to the computation of particular eigenstates. By using integral formulas for these eigenstates we are able to compute the global density limit for even values of the coupling $\beta$. In the case of the Schrödinger operator (1.2a), the limiting global density is the well known Wigner semi-circle law. Also in Sect. 5, we give an interpretation of results obtained in Sections 3 and 4 for a summation formula. The interpretation is in terms of the solution of an initial value problem associated with the Schrödinger equation (1.5). We conclude in Sect. 6 by identifying the formulas contained herein which are to be found in previous works, and give reference to these works (two of the most important references in this regard are unpublished, handwritten manuscripts). In the Appendix we present some results relating to generalized hypergeometric functions depending on two sets of variables which are of relevance to the working in Sections 3 and 4.

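2. Inter-relationships

Let us begin by explicitly calculating the operator (1.7) for $W$ given by (1.3). In all cases it is convenient to first change variables: for $W = W^{(H)}$ set $y_j := \sqrt{\beta/2}\,x_j$, for $W = W^{(L)}$ set $y_j = \beta x_j^2/2$, while for $W = W^{(J)}$ set $y_j = \sin^2\phi_j$. We then obtain
\[ \tilde H^{(H)} := -\frac{2}{\beta}\, e^{\beta W^{(H)}/2}(H^{(H)} - E_0)\, e^{-\beta W^{(H)}/2} = \sum_{j=1}^{N}\left(\frac{\partial^2}{\partial y_j^2} - 2y_j\frac{\partial}{\partial y_j} + \frac{2}{\alpha}\sum_{\substack{k=1\\ k\ne j}}^{N}\frac{1}{y_j - y_k}\frac{\partial}{\partial y_j}\right), \tag{2.1a} \]
\[ \tilde H^{(L)} := -\frac{1}{2\beta}\, e^{\beta W^{(L)}/2}(H^{(L)} - E_0)\, e^{-\beta W^{(L)}/2} = \sum_{j=1}^{N}\left(y_j\frac{\partial^2}{\partial y_j^2} + (a - y_j + 1)\frac{\partial}{\partial y_j} + \frac{2}{\alpha}\sum_{\substack{k=1\\ k\ne j}}^{N}\frac{y_j}{y_j - y_k}\frac{\partial}{\partial y_j}\right), \tag{2.1b} \]
\[ \tilde H^{(J)} := -\frac{1}{4}\, e^{\beta W^{(J)}/2}(H^{(J)} - E_0)\, e^{-\beta W^{(J)}/2} = \sum_{j=1}^{N}\left(y_j(1-y_j)\frac{\partial^2}{\partial y_j^2} + [a + 1 - y_j(a+b+2)]\frac{\partial}{\partial y_j} + \frac{2}{\alpha}\sum_{\substack{k=1\\ k\ne j}}^{N}\frac{y_j(1-y_j)}{y_j - y_k}\frac{\partial}{\partial y_j}\right), \tag{2.1c} \]
where
\[ a := (\beta a' - 1)/2, \qquad b := (\beta b' - 1)/2, \qquad \alpha := 2/\beta. \]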
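For $N = 1$ the sums over $k \ne j$ in (2.1) are empty, so (2.1b) and (2.1c) reduce to the classical Laguerre and Jacobi differential operators. The following sketch checks, in exact rational arithmetic, that the explicit sums (2.2b) and (2.2c) are eigenfunctions of these one-variable operators. The helper names are illustrative assumptions; the Laguerre eigenvalue $-n$ agrees with (2.4), while the Jacobi eigenvalue $-n(n+a+b+1)$ is the standard hypergeometric value and is not quoted from the text.

```python
from fractions import Fraction
from math import comb, factorial

def rising(u, n):
    """Pochhammer symbol (u)_n = u (u+1) ... (u+n-1)."""
    result = Fraction(1)
    for i in range(n):
        result *= u + i
    return result

def laguerre(n, a):
    """Coefficients [c_0, ..., c_n] in y of L_n^a(y), from the explicit sum (2.2b)."""
    a1 = Fraction(a) + 1
    pref = rising(a1, n) / factorial(n)
    return [pref * comb(n, j) * (-1) ** j / rising(a1, j) for j in range(n + 1)]

def jacobi(n, a, b):
    """Coefficients in y of P_n^{(b,a)}(2y - 1), from the explicit sum (2.2c)."""
    a1 = Fraction(a) + 1
    top = n + Fraction(a) + Fraction(b) + 1
    pref = (-1) ** n * rising(a1, n) / factorial(n)  # (-1)^n * binom(n + a, n)
    return [pref * comb(n, j) * rising(top, j) * (-1) ** j / rising(a1, j)
            for j in range(n + 1)]

def derivative(p):
    """Coefficient list of p'(y)."""
    return [k * p[k] for k in range(1, len(p))]

def multiply(p, q):
    """Coefficient list of the product p(y) q(y); empty lists stand for zero."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def apply_operator(c2, c1, p):
    """Coefficients of c2(y) p'' + c1(y) p', padded to the length of p."""
    dp = derivative(p)
    d2p = derivative(dp)
    out = [Fraction(0)] * len(p)
    for term in (multiply(c2, d2p), multiply(c1, dp)):
        for k, c in enumerate(term):
            out[k] += c
    return out

if __name__ == "__main__":
    a, b = Fraction(1, 2), Fraction(3, 2)  # illustrative parameters, both > -1
    lag = laguerre(3, a)
    # N = 1 form of (2.1b): y p'' + (a + 1 - y) p'
    print(apply_operator([0, 1], [a + 1, -1], lag) == [-3 * c for c in lag])
```

The Jacobi case uses `apply_operator([0, 1, -1], [a + 1, -(a + b + 2)], jacobi(n, a, b))`, which encodes the $N = 1$ form of (2.1c).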

In the one-variable case ($N = 1$), these operators have a complete set of polynomial eigenfunctions given by the classical Hermite, Laguerre and Jacobi polynomials
\[ H_n(y) := n!\sum_{j=0}^{[n/2]}\frac{(-1)^j(2y)^{n-2j}}{j!\,(n-2j)!}, \tag{2.2a} \]
\[ L_n^a(y) := \frac{(a+1)_n}{n!}\sum_{j=0}^{n}\binom{n}{j}\frac{(-y)^j}{(a+1)_j}, \tag{2.2b} \]
\[ P_n^{(b,a)}(2y-1) := (-1)^n\binom{n+a}{n}\sum_{j=0}^{n}\binom{n}{j}\frac{(n+a+b+1)_j}{(a+1)_j}(-y)^j \tag{2.2c} \]
respectively, where $(u)_n := u(u+1)\cdots(u+n-1)$. It is also true for general $N$ that there is a complete set of polynomial eigenfunctions for each of the operators (2.1). This can be seen by computing their action on the monomial symmetric polynomial $m_\kappa$, where $\kappa$ denotes a partition consisting of $N$ parts $\kappa_j$. We obtain series of the form
\[ e^{(H)}(\kappa,\alpha)\,m_\kappa + \sum_{|\mu|<|\kappa|} b^{(H)}_{\mu\kappa}m_\mu, \tag{2.3a} \]
\[ e^{(L)}(\kappa,\alpha)\,m_\kappa + \sum_{|\mu|<|\kappa|} b^{(L)}_{\mu\kappa}m_\mu, \tag{2.3b} \]
\[ e^{(J)}(\kappa,\alpha)\,m_\kappa + \sum_{\mu<\kappa} a^{(J)}_{\mu\kappa}m_\mu + \sum_{|\mu|<|\kappa|} b^{(J)}_{\mu\kappa}m_\mu \tag{2.3c} \]
respectively, where the notation $|\mu| < |\kappa|$ means $\sum_{j=1}^{N}\mu_j < \sum_{j=1}^{N}\kappa_j$, while the notation $\mu < \kappa$ means $\mu \ne \kappa$ but
\[ \sum_{j=1}^{N}\mu_j = \sum_{j=1}^{N}\kappa_j \quad\text{and}\quad \sum_{j=1}^{p}\mu_j \le \sum_{j=1}^{p}\kappa_j \quad\text{for each } p = 1,\dots,N. \]
Also, $a_{\mu\kappa}$, $b_{\mu\kappa}$ are coefficients independent of $y_j$ and
\[ e^{(H)}(\kappa,\alpha) = -2|\kappa|, \qquad e^{(L)}(\kappa,\alpha) = -|\kappa| \tag{2.4} \]
(the explicit value of $e^{(J)}(\kappa,\alpha)$ can also be computed; however, it is not needed in our subsequent discussion). This means there are eigenfunctions of the form
\[ \tilde a^{(H)}_{\kappa\kappa}m_\kappa + \sum_{|\mu|<|\kappa|}\tilde b^{(H)}_{\mu\kappa}m_\mu, \tag{2.5a} \]
\[ \tilde a^{(L)}_{\kappa\kappa}m_\kappa + \sum_{|\mu|<|\kappa|}\tilde b^{(L)}_{\mu\kappa}m_\mu, \tag{2.5b} \]
\[ \tilde a^{(J)}_{\kappa\kappa}m_\kappa + \sum_{\mu<\kappa}\tilde a^{(J)}_{\mu\kappa}m_\mu + \sum_{|\mu|<|\kappa|}\tilde b^{(J)}_{\mu\kappa}m_\mu, \tag{2.5c} \]

with eigenvalues $e^{(H)}(\kappa,\alpha)$, $e^{(L)}(\kappa,\alpha)$ and $e^{(J)}(\kappa,\alpha)$ respectively. Rather than study the eigenfunctions in the form (2.5), previous studies [24, 19] have shown that it is advantageous to change basis from the monomial symmetric polynomials to the Jack polynomials [30, 24]. We recall that the Jack polynomial $J_\kappa^{(\alpha)}(z_1,\dots,z_N)$ is the unique (up to normalization) symmetric eigenfunction of the operator
\[ D_2 := \sum_{j=1}^{N} z_j^2\frac{\partial^2}{\partial z_j^2} + \frac{2}{\alpha}\sum_{\substack{j,k=1\\ j\ne k}}^{N}\frac{z_j^2}{z_j - z_k}\frac{\partial}{\partial z_j}, \tag{2.6} \]
which has an expansion of the form
\[ a_{\kappa\kappa}m_\kappa + \sum_{\mu<\kappa} a_{\mu\kappa}m_\mu. \tag{2.7} \]
The notation $J_\kappa^{(\alpha)}$ is usually used for the particular normalization $a_{(1^{|\kappa|})\kappa} = |\kappa|!$ in (2.7). However, for our purposes it is more convenient to choose a different normalization, and to denote the corresponding Jack polynomial by $C_\kappa^{(\alpha)}$ as in e.g. [17]. This normalization is specified by requiring
\[ (x_1 + \dots + x_N)^n = \sum_{|\kappa|=n} C_\kappa^{(\alpha)}(x_1,\dots,x_N). \tag{2.8} \]
It is known (see e.g. [17]) that $J_\kappa^{(\alpha)}$ and $C_\kappa^{(\alpha)}$ are related by
\[ C_\kappa^{(\alpha)}(x_1,\dots,x_N) = \alpha^{|\kappa|}\,|\kappa|!\;j_\kappa^{-1}\,J_\kappa^{(\alpha)}(x_1,\dots,x_N), \tag{2.9} \]
where
\[ j_\kappa := \prod_{s\in\kappa} h_\kappa^*(s)\,h_*^\kappa(s), \tag{2.10a} \]
with
\[ h_\kappa^*(s) := l_\kappa(s) + \alpha(a_\kappa(s) + 1), \qquad h_*^\kappa(s) := l_\kappa(s) + 1 + \alpha a_\kappa(s). \tag{2.10b} \]

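In (2.10a) and (2.10b), $\kappa$ is regarded as a diagram, $s$ denotes a node in the diagram and $a_\kappa(s)$ ($l_\kappa(s)$) denotes the arm length (leg length) of the node (see e.g. [25]). In terms of the Jack polynomials, it is known [24, 21, 23, 22] that for each partition $\kappa$ there is an eigenfunction of the form
\[ H_\kappa(y_1,\dots,y_N;\alpha) := \sum_{\mu\subseteq\kappa} c^{(H)}_{\mu\kappa}C_\mu^{(\alpha)}(y_1,\dots,y_N), \tag{2.11a} \]
\[ L^a_\kappa(y_1,\dots,y_N;\alpha) := \sum_{\mu\subseteq\kappa} c^{(L)}_{\mu\kappa}C_\mu^{(\alpha)}(y_1,\dots,y_N), \tag{2.11b} \]
\[ G^{(a,b)}_\kappa(y_1,\dots,y_N;\alpha) := \sum_{\mu\subseteq\kappa} c^{(J)}_{\mu\kappa}C_\mu^{(\alpha)}(y_1,\dots,y_N), \tag{2.11c} \]
where the notation $\mu \subseteq \kappa$ denotes $\mu_j \le \kappa_j$ for each $j = 1,\dots,N$ and $c_{\kappa\kappa} \ne 0$. These results can be established by using known formulas for the action of the operators
\[ E_k := \sum_{i=1}^{N} x_i^k\frac{\partial}{\partial x_i}, \tag{2.12a} \]
\[ D_k := \sum_{i=1}^{N} x_i^k\frac{\partial^2}{\partial x_i^2} + \frac{2}{\alpha}\sum_{i\ne j}\frac{x_i^k}{x_i - x_j}\frac{\partial}{\partial x_i} \tag{2.12b} \]
for $k = 0, 1, 2$, which for future reference we list here:
\[ E_0\,\frac{C_\kappa^{(\alpha)}(x)}{C_\kappa^{(\alpha)}(1^N)} = \sum_{i=1}^{N}\binom{\kappa}{\kappa_{(i)}}\frac{C_{\kappa_{(i)}}^{(\alpha)}(x)}{C_{\kappa_{(i)}}^{(\alpha)}(1^N)}, \tag{2.13a} \]
\[ E_1\,C_\kappa^{(\alpha)}(x) = |\kappa|\,C_\kappa^{(\alpha)}(x), \tag{2.13b} \]
\[ E_2\,C_\kappa^{(\alpha)}(x) = \frac{1}{1+|\kappa|}\sum_{i=1}^{N}\binom{\kappa^{(i)}}{\kappa}\Big(\kappa_i - \frac{i-1}{\alpha}\Big)C_{\kappa^{(i)}}^{(\alpha)}(x), \tag{2.13c} \]
\[ D_1\,\frac{C_\kappa^{(\alpha)}(x)}{C_\kappa^{(\alpha)}(1^N)} = \sum_{i=1}^{N}\binom{\kappa}{\kappa_{(i)}}\Big(\kappa_i - 1 + \frac{N-i}{\alpha}\Big)\frac{C_{\kappa_{(i)}}^{(\alpha)}(x)}{C_{\kappa_{(i)}}^{(\alpha)}(1^N)}, \tag{2.13d} \]
\[ D_2\,C_\kappa^{(\alpha)}(x) = d_\kappa\,C_\kappa^{(\alpha)}(x), \qquad d_\kappa := \frac{2}{\alpha}|\kappa|(N-1) + \sum_{i=1}^{N}\kappa_i\Big(\kappa_i - 1 - \frac{2}{\alpha}(i-1)\Big) \tag{2.13e} \]
(the action of $D_0$ can be computed from the commutator formula $D_0 = [E_0, D_1]$). Here the generalized binomial coefficients $\binom{\kappa}{\sigma}$ are defined by the expansion
\[ \frac{C_\kappa^{(\alpha)}(1+t_1,\dots,1+t_N)}{C_\kappa^{(\alpha)}(1^N)} = \sum_{s=0}^{|\kappa|}\sum_{|\sigma|=s}\binom{\kappa}{\sigma}\frac{C_\sigma^{(\alpha)}(t_1,\dots,t_N)}{C_\sigma^{(\alpha)}(1^N)}, \tag{2.14} \]
where $C_\kappa^{(\alpha)}(1^N)$ has the explicit form
\[ C_\kappa^{(\alpha)}(1^N) = \frac{\alpha^{|\kappa|}\,|\kappa|!}{j_\kappa}\prod_{(i,j)\in\kappa}\big(N - (i-1) + \alpha(j-1)\big). \tag{2.15} \]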
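Formulas (2.10) and (2.15) are easy to evaluate mechanically, and the normalization (2.8) provides a consistency check: setting $x_1 = \dots = x_N = 1$ in (2.8) gives $\sum_{|\kappa|=n} C_\kappa^{(\alpha)}(1^N) = N^n$. The sketch below assumes the usual conventions for a node $s = (i,j)$, namely arm length $a_\kappa(s) = \kappa_i - j$ and leg length $l_\kappa(s) = \kappa'_j - i$ with $\kappa'$ the conjugate partition; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples of positive integers."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def conjugate(kappa):
    """Conjugate partition: column lengths of the diagram of kappa."""
    if not kappa:
        return ()
    return tuple(sum(1 for part in kappa if part > j) for j in range(kappa[0]))

def j_kappa(kappa, alpha):
    """j_kappa of (2.10a), the product over nodes of the two hook lengths (2.10b)."""
    conj = conjugate(kappa)
    result = Fraction(1)
    for i, row in enumerate(kappa, start=1):
        for j in range(1, row + 1):
            arm = row - j
            leg = conj[j - 1] - i
            result *= (leg + alpha * (arm + 1)) * (leg + 1 + alpha * arm)
    return result

def jack_c_at_ones(kappa, alpha, n_vars):
    """C_kappa^(alpha)(1^N) evaluated from (2.15); zero if kappa has more than N parts."""
    size = sum(kappa)
    product = Fraction(1)
    for i, row in enumerate(kappa, start=1):
        for j in range(1, row + 1):
            product *= n_vars - (i - 1) + alpha * (j - 1)
    return alpha ** size * factorial(size) * product / j_kappa(kappa, alpha)

if __name__ == "__main__":
    alpha, n_vars, degree = Fraction(2), 3, 4  # illustrative values
    total = sum(jack_c_at_ones(k, alpha, n_vars) for k in partitions(degree))
    print(total, n_vars ** degree)
```

Passing `alpha` as a `Fraction` keeps the arithmetic exact, so the sum can be compared with $N^n$ without rounding issues.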

The generalized binomial coefficients are non-zero if and only if $\sigma \subseteq \kappa$ [20, 17]. We have also used the notation
\[ \kappa_{(i)} := (\kappa_1,\dots,\kappa_{i-1},\kappa_i - 1,\kappa_{i+1},\dots,\kappa_N), \qquad \kappa^{(i)} := (\kappa_1,\dots,\kappa_{i-1},\kappa_i + 1,\kappa_{i+1},\dots,\kappa_N) \]
(note that this is the opposite of what is used in [27, 17] but rather is that used by [20]).

The polynomials in (2.11) are referred to as generalized Hermite, Laguerre and Jacobi polynomials respectively [14]; they are uniquely specified up to normalization as the eigenfunctions of the operators (2.1) with an expansion in terms of Jack polynomials with leading term $c_{\kappa\kappa}C_\kappa^{(\alpha)}$ whose label is maximal with respect to the inclusion order $\mu \subseteq \kappa$. For the normalization we choose
\[ c^{(H)}_{\kappa\kappa} = 2^{|\kappa|}/C_\kappa^{(\alpha)}(1^N), \qquad c^{(L)}_{\kappa\kappa} = (-1)^{|\kappa|}/\big(|\kappa|!\,C_\kappa^{(\alpha)}(1^N)\big) \qquad\text{and}\qquad c^{(J)}_{\kappa\kappa} = 1. \tag{2.16} \]
With this choice, for $N = 1$ the generalized Hermite and Laguerre polynomials exactly coincide with the classical Hermite and Laguerre polynomials (2.2a) and (2.2b) respectively, while in the $N = 1$ case $G^{(a,b)}_{(k)}$ corresponds to the Jacobi polynomial $P_k^{(b,a)}(2y-1)$, normalized so that the coefficient of $y^k$ is unity.

There have been a number of studies of the generalized Jacobi polynomials $G^{(a,b)}_\kappa$ [22, 4, 24]. In particular, it is known that these polynomials are orthogonal with respect to the inner product
\[ \langle f|g\rangle^{(J)} := \prod_{l=1}^{N}\int_0^1 dy_l\; y_l^a(1-y_l)^b \prod_{1\le j<k\le N}|y_k - y_j|^{2/\alpha}\, f(y_1,\dots,y_N)\,g(y_1,\dots,y_N). \tag{2.17} \]
This is significant to the study of the generalized Hermite and Laguerre polynomials, as both are limiting cases of the Jacobi polynomials. Thus by comparing the operators $\tilde H^{(H)}$ and $\tilde H^{(L)}$ with $\tilde H^{(J)}$, and using the facts that the Jack polynomial $C_\kappa^{(\alpha)}$ is homogeneous of order $|\kappa|$ and that in the expansion (2.14) the binomial coefficient is non-zero if and only if $\mu \subseteq \kappa$ [17], we see that
\[ \lim_{b\to\infty}\frac{2^{2|\kappa|}(-b)^{|\kappa|}}{C_\kappa^{(\alpha)}(1^N)}\,G^{(b^2,b^2)}_\kappa\Big(\tfrac12\big(1 - \tfrac{y_1}{b}\big),\dots,\tfrac12\big(1 - \tfrac{y_N}{b}\big);\alpha\Big) = H_\kappa(y_1,\dots,y_N;\alpha) \tag{2.18} \]
and
\[ \lim_{b\to\infty}\frac{(-1)^{|\kappa|}\,b^{|\kappa|}}{|\kappa|!\,C_\kappa^{(\alpha)}(1^N)}\,G^{(a,b)}_\kappa(y_1/b,\dots,y_N/b;\alpha) = L^a_\kappa(y_1,\dots,y_N;\alpha). \tag{2.19} \]
It thus follows that by performing the same change of variables and limiting procedure in (2.17), we will obtain inner products with respect to which the generalized Hermite and Laguerre polynomials are orthogonal. We find that for the generalized Hermite polynomials this inner product is
\[ \langle f|g\rangle^{(H)} := \prod_{l=1}^{N}\int_{-\infty}^{\infty} dy_l\; e^{-y_l^2}\prod_{1\le j<k\le N}|y_k - y_j|^{2/\alpha}\, f(y_1,\dots,y_N)\,g(y_1,\dots,y_N), \tag{2.20} \]
while for the generalized Laguerre polynomials it is
\[ \langle f|g\rangle^{(L)} := \prod_{l=1}^{N}\int_0^{\infty} dy_l\; y_l^a e^{-y_l}\prod_{1\le j<k\le N}|y_k - y_j|^{2/\alpha}\, f(y_1,\dots,y_N)\,g(y_1,\dots,y_N) \tag{2.21} \]
(these inner products have previously been identified by Lassalle [21, 23]).

3. The generalized Hermite polynomials

3.1. The generating function. The starting point and key source of inspiration in our studies of the generalized Hermite and Laguerre polynomials is a private correspondence with M. Lassalle [19], in which we received unpublished notes containing, amongst other results, a multi-variable generalization of the classical generating function formula
\[ \sum_{k=0}^{\infty}\frac{H_k(y)\,z^k}{k!} = e^{2yz}e^{-z^2}, \tag{3.1} \]

which is given by the following result.

Proposition 3.1. Let $y := (y_1,\dots,y_N)$ and $z := (z_1,\dots,z_N)$. The generalized Hermite polynomials $H_\kappa(y;\alpha)$, defined in the previous section as polynomial eigenfunctions of the operator (2.1a), which have leading term as in (2.11a) with the normalization specified by (2.16), are given by the generating function
\[ \sum_\kappa\frac{1}{|\kappa|!}H_\kappa(y;\alpha)\,C_\kappa^{(\alpha)}(z) = {}_0F_0^{(\alpha)}(2y;z)\,e^{-p_2(z)}, \tag{3.2a} \]
where
\[ {}_0F_0^{(\alpha)}(2y;z) := \sum_\kappa\frac{1}{|\kappa|!}\frac{C_\kappa^{(\alpha)}(2y)\,C_\kappa^{(\alpha)}(z)}{C_\kappa^{(\alpha)}(1^N)} \qquad\text{and}\qquad p_2(z) := \sum_{j=1}^{N} z_j^2. \tag{3.2b} \]
A fundamental result in Lassalle's researches is an explicit formula for the action of the operator $E_0^{(y)}$ (recall (2.13a); the superscript $(y)$ indicates operation with respect to the variables $y$) on ${}_0F_0^{(\alpha)}$:
\[ E_0^{(y)}\,{}_0F_0^{(\alpha)}(2y;z) = 2p_1(z)\,{}_0F_0^{(\alpha)}(2y;z), \qquad\text{where}\qquad p_1(z) := \sum_{j=1}^{N} z_j. \tag{3.3} \]
This formula follows from (2.13a) and the result [17, 20]
\[ p_1(x)\,C_\kappa^{(\alpha)}(x) = \frac{1}{1+|\kappa|}\sum_{i=1}^{N}\binom{\kappa^{(i)}}{\kappa}C_{\kappa^{(i)}}^{(\alpha)}(x). \tag{3.4} \]

Now in the notation of (2.12), the operator $\tilde H^{(H)}$ (2.1a) is given by
\[ \tilde H^{(H)} = D_0 - 2E_1. \tag{3.5} \]
Knowledge of the action of $D_0^{(y)}$ on ${}_0F_0^{(\alpha)}(2y;z)$ is required to prove Proposition 3.1. Lassalle uses the formulas (2.13) and (3.3) to establish this action. We have observed that in fact the required formula can be derived from (3.3). In our derivation we make use of the general fact that if $A^{(y)}F = \hat A^{(z)}F$ and $B^{(y)}F = \hat B^{(z)}F$, then
\[ A^{(y)}B^{(y)}F = A^{(y)}\hat B^{(z)}F = \hat B^{(z)}A^{(y)}F = \hat B^{(z)}\hat A^{(z)}F, \tag{3.6} \]
where the second equality follows because operators acting on different sets of variables always commute.

Lemma 3.1. We have
\[ D_1^{(y)}\,{}_0F_0^{(\alpha)}(2y;z) = \Big(\frac{2}{\alpha}(N-1)p_1(z) + 2E_2^{(z)}\Big)\,{}_0F_0^{(\alpha)}(2y;z), \tag{3.7a} \]
\[ D_0^{(y)}\,{}_0F_0^{(\alpha)}(2y;z) = 4p_2(z)\,{}_0F_0^{(\alpha)}(2y;z). \tag{3.7b} \]
Proof. Since $D_1^{(y)} = \frac12[E_0^{(y)}, D_2^{(y)}]$, using (3.3), the fact that $D_2^{(y)}$ is an eigenoperator for the Jack polynomials, and (3.6) gives
\[ D_1^{(y)}\,{}_0F_0^{(\alpha)}(2y;z) = [D_2^{(z)}, p_1(z)]\,{}_0F_0^{(\alpha)}(2y;z) = \Big(\frac{2}{\alpha}(N-1)p_1(z) + 2E_2^{(z)}\Big)\,{}_0F_0^{(\alpha)}(2y;z), \]
where the second equality follows by computing the commutator. To derive the second result, note that $D_0^{(y)} = [E_0^{(y)}, D_1^{(y)}]$, so from (3.3), (3.7a) and (3.6)
\[ D_0^{(y)}\,{}_0F_0^{(\alpha)}(2y;z) = \Big[\frac{2}{\alpha}(N-1)p_1(z) + 2E_2^{(z)},\, 2p_1(z)\Big]\,{}_0F_0^{(\alpha)}(2y;z) = 4p_2(z)\,{}_0F_0^{(\alpha)}(2y;z). \]
Let us now show how (3.7b) is used in Lassalle's derivation of (3.2a).

Proof of Proposition 3.1. We first want to show that $H_\kappa(y;\alpha)$ as defined by the generating function (3.2a) is an eigenfunction of (3.5) with eigenvalue $-2|\kappa|$. To do this, consider the action of $E_1^{(z)}$ on both sides of (3.2a). On the r.h.s. we have
\[ E_1^{(z)}\,{}_0F_0^{(\alpha)}(2y;z)\,e^{-p_2(z)} = e^{-p_2(z)}E_1^{(z)}\,{}_0F_0^{(\alpha)}(2y;z) - 2p_2(z)\,{}_0F_0^{(\alpha)}(2y;z)\,e^{-p_2(z)} \]
\[ = e^{-p_2(z)}E_1^{(y)}\,{}_0F_0^{(\alpha)}(2y;z) - \frac12 D_0^{(y)}\,{}_0F_0^{(\alpha)}(2y;z)\,e^{-p_2(z)} = -\frac12\sum_\kappa\frac{1}{|\kappa|!}\big(D_0^{(y)} - 2E_1^{(y)}\big)H_\kappa(y;\alpha)\,C_\kappa^{(\alpha)}(z), \tag{3.8} \]
where the second equality follows by using (3.7b) and noting that since $E_1$ is an eigenoperator of the Jack polynomials, the definition (3.2b) gives
\[ E_1^{(y)}\,{}_0F_0^{(\alpha)}(2y;z) = E_1^{(z)}\,{}_0F_0^{(\alpha)}(2y;z), \]
and the final equality follows by substituting the generating function. On the l.h.s., since $E_1^{(z)}$ is an eigenoperator of $C_\kappa^{(\alpha)}(z)$ with eigenvalue $|\kappa|$,
\[ E_1^{(z)}\sum_\kappa\frac{1}{|\kappa|!}H_\kappa(y;\alpha)\,C_\kappa^{(\alpha)}(z) = \sum_\kappa\frac{1}{|\kappa|!}H_\kappa(y;\alpha)\,|\kappa|\,C_\kappa^{(\alpha)}(z). \tag{3.9} \]
Equating coefficients of $C_\kappa^{(\alpha)}(z)$ in (3.8) and (3.9) shows that $H_\kappa(y;\alpha)$ is an eigenfunction of the operator (3.5) with eigenvalue $-2|\kappa|$ as required. It remains to check that $H_\kappa(y;\alpha)$ as given by (3.2a) has an expansion in terms of Jack polynomials with leading term $2^{|\kappa|}C_\kappa^{(\alpha)}(y)/C_\kappa^{(\alpha)}(1^N)$. This follows from the fact that to compute the coefficient of $C_\kappa^{(\alpha)}(z)$ in ${}_0F_0^{(\alpha)}(2y;z)\,e^{-p_2(z)}$, the sum in (3.2b) can be restricted to partitions with modulus less than or equal to $|\kappa|$.

The strategy of the above proof leads us to generalizations of the eigenvalue equation. In the theory of Jack polynomials, Macdonald [25] (see also [29, 4]) has given a family of differential operators $\{D_N^j\}_{j=1,\dots,N}$ which have the Jack polynomials as eigenfunctions, and for which the corresponding eigenvalues are known explicitly. These operators are given by $D_N^p :=$

p X l=0

×

1 ∂ ∂ . . . z il z i1 1+ 1+ ∂zi1 ∂zil 1≤i1
(α)p−l

X

1≤il+1 6=i1 ...6=il

Q where 1+ := 1≤j
then the eigenvalues are given by
\[ e(\kappa,\alpha;X) := \prod_{j=1}^{N}(X + N - j + \alpha\kappa_j). \tag{3.10} \]
The operator $E_1$ is related to the Macdonald operator $D_N^1$ by $D_N^1 = \alpha E_1 + N(N-1)/2$. By considering the analogue of (3.8) with $E_1^{(z)}$ replaced by $D_N^{j\,(z)}$ ($j = 1, 2, \dots, N$), a family of differential operators which have the generalized Hermite polynomials as eigenfunctions can be given. The r.h.s. of (3.8) is then computed according to the Baker-Campbell-Hausdorff formula
\[ D_N^{j\,(z)} f\,e^{-p_2(z)} = e^{-p_2(z)}\Big(D_N^{j\,(z)} - \big[D_N^{j\,(z)}, p_2(z)\big] + \frac{1}{2!}\big[\big[D_N^{j\,(z)}, p_2(z)\big], p_2(z)\big] + \cdots + \frac{(-1)^j}{j!}\big[\cdots\big[D_N^{j\,(z)}, p_2(z)\big],\cdots, p_2(z)\big]\Big) f. \tag{3.11} \]
Note that the sum on the r.h.s. terminates after the $j$th nested commutator since the highest derivative in $D_N^{j\,(z)}$ has degree $j$.

Following the derivation of the eigenvalue equation given in the proof of Proposition 3.1, and thus using (3.3), (3.6) and the fact that since $D_N^j$ is an eigenoperator of the Jacks we have
\[ D_N^{j\,(z)}\,{}_0F_0^{(\alpha)}(2y;z) = D_N^{j\,(y)}\,{}_0F_0^{(\alpha)}(2y;z), \]
we can immediately deduce a family of $N$ independent eigenoperators of the polynomials $H_\kappa(y;\alpha)$, together with the corresponding eigenvalues.

Proposition 3.2. Let
\[ \tilde H_j^{(H)} := D_N^{j\,(y)} - \frac{1}{4}\big[D_0^{(y)}, D_N^{j\,(y)}\big] + \frac{1}{4^2\,2!}\big[D_0^{(y)}, \big[D_0^{(y)}, D_N^{j\,(y)}\big]\big] - \cdots + \frac{(-1)^j}{4^j\,j!}\big[D_0^{(y)}, \big[D_0^{(y)}, \cdots, \big[D_0^{(y)}, D_N^{j\,(y)}\big]\cdots\big]\big]. \]
We have that $H_\kappa(y;\alpha)$ is an eigenfunction of $\tilde H_j^{(H)}$ for each $j = 1,\dots,N$, with eigenvalue $e_j(\kappa;\alpha)$ given by the coefficient of $X^{N-j}$ in (3.10).

3.2. Consequences of the generating function. The generating function formula (3.2a) can be used to deduce higher-dimensional analogues of the classical properties of the Hermite polynomials
\[ H_n(-y) = (-1)^n H_n(y), \tag{3.12a} \]
\[ \frac{d}{dy}H_n(y) = 2nH_{n-1}(y), \tag{3.12b} \]
\[ 2yH_n(y) = H_{n+1}(y) + 2nH_{n-1}(y). \tag{3.12c} \]
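The three classical properties (3.12) can be confirmed directly from the explicit sum (2.2a). The following sketch works with integer coefficient lists; the function names are illustrative and the check covers only the one-variable polynomials, not the multivariable analogues proved below.

```python
from math import factorial

def hermite(n):
    """Integer coefficients [c_0, ..., c_n] of the classical H_n(y), from the sum (2.2a)."""
    coeffs = [0] * (n + 1)
    for j in range(n // 2 + 1):
        k = n - 2 * j
        coeffs[k] = (-1) ** j * factorial(n) * 2 ** k // (factorial(j) * factorial(k))
    return coeffs

def reflect(p):
    """Coefficients of p(-y)."""
    return [(-1) ** k * c for k, c in enumerate(p)]

def derivative(p):
    """Coefficients of p'(y)."""
    return [k * p[k] for k in range(1, len(p))]

def times_2y(p):
    """Coefficients of 2 y p(y)."""
    return [0] + [2 * c for c in p]

def pad(p, size):
    """Extend a coefficient list with zeros up to the given length."""
    return p + [0] * (size - len(p))

if __name__ == "__main__":
    n = 4
    h_prev, h, h_next = hermite(n - 1), hermite(n), hermite(n + 1)
    print(reflect(h) == [(-1) ** n * c for c in h])             # (3.12a)
    print(derivative(h) == [2 * n * c for c in h_prev])         # (3.12b)
    print(times_2y(h) == [u + 2 * n * v
                          for u, v in zip(h_next, pad(h_prev, n + 2))])  # (3.12c)
```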

Proposition 3.3. We have Hκ (−y; α) = (−1)|κ| Hκ (y; α). Proof. Replacing y by −y in (3.2a) and using the fact that 0 F0(α) (2µy; z) = 0 F0(α) (2y; µz), where µ is any scalar, and that Cκ(α) is homogeneous of order |κ|, gives X 1 Hκ (−y; α)Cκ(α) (z) = 0 F0(α) (−2y; z)e−p2 (z) = 0 F0(α) (2y; −z)e−p2 (−z) |κ|! κ =

X 1 (−1)|κ| Hκ (y; α)Cκ(α) (z). |κ|! κ

The result follows by equating coefficients of Cκ(α) (z). Proposition 3.4. We have E0(y)

N X κ Hκ (y; α) = 2 Hκ(i) (y; α). κ(i) i=1

186

T.H. Baker, P.J. Forrester

Proof. Applying E0(y) to the generating function (3.2a) and using (3.3) gives X 1 (y) E0 Hκ (y; α)Cκ(α) (z) = 2p1 (z)0 F0(α) (2y; z)e−p2 (z) |κ|! κ X 1 Hκ (y; α)Cκ(α) (z). = 2p1 (z) |κ|! κ

(3.13)

Using the formula (3.4) the final formula on the r.h.s. of (3.13) can be rewritten as X κ(i) 1 Hκ (y; α)Cκ(α) (i) (z) (1 + |κ|)! κ κ i=1 N X 1 X κ =2 Hκ(i) (y; α)Cκ(α) (z). |κ|! κ (i) κ

2

N

X

(3.14)

i=1

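Equating coefficients of $C_\kappa^{(\alpha)}(z)$ on the l.h.s. of (3.13) and on the r.h.s. of (3.14) gives the stated result.

Proposition 3.5. We have
\[ 2p_1(y)H_\kappa(y;\alpha) = \alpha\sum_i \binom{\kappa^{(i)}}{\kappa}\frac{j_\kappa}{j_{\kappa^{(i)}}}(N-i+1+\alpha\kappa_i)H_{\kappa^{(i)}}(y;\alpha) + 2\sum_i \binom{\kappa}{\kappa_{(i)}}H_{\kappa_{(i)}}(y;\alpha). \]

Proof. From the generating function (3.2a) we have
\begin{align*}
2\sum_\kappa \frac{1}{|\kappa|!}p_1(y)H_\kappa(y;\alpha)C_\kappa^{(\alpha)}(z)
&= 2p_1(y)\,{}_0F_0^{(\alpha)}(2y;z)e^{-p_2(z)} = \bigl(E_0^{(z)}\,{}_0F_0^{(\alpha)}(2y;z)\bigr)e^{-p_2(z)} \\
&= E_0^{(z)}\bigl({}_0F_0^{(\alpha)}(2y;z)e^{-p_2(z)}\bigr) + 2p_1(z)\,{}_0F_0^{(\alpha)}(2y;z)e^{-p_2(z)} \\
&= \bigl(E_0^{(z)} + 2p_1(z)\bigr)\sum_\kappa \frac{1}{|\kappa|!}H_\kappa(y;\alpha)C_\kappa^{(\alpha)}(z). \tag{3.15}
\end{align*}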

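Using the formula (2.13a) and writing
\[ 2p_1(z)\sum_\kappa \frac{1}{|\kappa|!}H_\kappa(y;\alpha)C_\kappa^{(\alpha)}(z) \]
as in the proof of Proposition 3.4 shows that we can rewrite the last expression on the r.h.s. of (3.15) as
\[ \sum_\kappa \frac{1}{|\kappa|!}H_\kappa(y;\alpha)\sum_{i=1}^N \binom{\kappa}{\kappa_{(i)}}\frac{C_\kappa^{(\alpha)}(1^N)}{C_{\kappa_{(i)}}^{(\alpha)}(1^N)}C_{\kappa_{(i)}}^{(\alpha)}(z) + 2\sum_\kappa \frac{1}{|\kappa|!}\sum_{i=1}^N \binom{\kappa}{\kappa_{(i)}}H_{\kappa_{(i)}}(y;\alpha)C_\kappa^{(\alpha)}(z) \]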

Calogero-Sutherland Model and Generalized Classical Polynomials


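\[ = \sum_\kappa \frac{1}{(1+|\kappa|)!}\sum_{i=1}^N \binom{\kappa^{(i)}}{\kappa}\frac{C_{\kappa^{(i)}}^{(\alpha)}(1^N)}{C_\kappa^{(\alpha)}(1^N)}H_{\kappa^{(i)}}(y;\alpha)C_\kappa^{(\alpha)}(z) + 2\sum_\kappa \frac{1}{|\kappa|!}\sum_{i=1}^N \binom{\kappa}{\kappa_{(i)}}H_{\kappa_{(i)}}(y;\alpha)C_\kappa^{(\alpha)}(z). \]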

The stated formula now follows by equating coefficients of $C_\kappa^{(\alpha)}(z)$ on the l.h.s. of (3.15) and the r.h.s. of the above equation, and using (2.15) to rewrite $C_{\kappa^{(i)}}^{(\alpha)}(1^N)/C_\kappa^{(\alpha)}(1^N)$.

Another consequence of the generating function (3.2a) relates to an analogue of the formula (2.8).

Proposition 3.6. We have
\[ H_k\Bigl(\frac{1}{\sqrt N}p_1(y)\Bigr) = N^{-k/2}\sum_{|\kappa|=k}H_\kappa(y;\alpha)C_\kappa^{(\alpha)}(1^N). \]

Proof. Set $z_1 = \cdots = z_N = c$ in (3.2a) and note that
\[ {}_0F_0^{(\alpha)}(2y;c,\dots,c) = \sum_\kappa \frac{c^{|\kappa|}}{|\kappa|!}C_\kappa^{(\alpha)}(2y) = e^{2cp_1(y)} \tag{3.16} \]
(the last equality follows from (2.8)) to conclude
\[ \sum_\kappa \frac{c^{|\kappa|}}{|\kappa|!}H_\kappa(y;\alpha)C_\kappa^{(\alpha)}(1^N) = e^{2cp_1(y)}e^{-Nc^2} = \sum_{k=0}^\infty \frac{c^kN^{k/2}}{k!}H_k\Bigl(\frac{1}{\sqrt N}p_1(y)\Bigr). \]
The result now follows by equating coefficients of $c^k$.

Notice that each term on the r.h.s. of the above formula is an eigenfunction of the operator (3.5) with eigenvalue $-2|\kappa|$. Thus $H_k(\frac{1}{\sqrt N}p_1(y))$ is also an eigenfunction of (3.5) with eigenvalue $-2k$. This latter fact can be checked directly, and has been observed previously [11].

3.3. Integration formulas. Using the generating function (3.2a) and the orthogonality of the generalized Hermite polynomials with respect to the inner product (2.20), a number of integration formulas can be obtained. In particular, we can obtain the multidimensional analogues of the classical formulas
\[ \int_{-\infty}^\infty dy\, e^{-y^2}\bigl(H_k(y)\bigr)^2 = \sqrt{\pi}\,2^kk!, \tag{3.17a} \]
\[ \frac{2^{-k}}{\sqrt\pi}\int_{-\infty}^\infty dy\, e^{-y^2}H_k(y+x) = x^k, \tag{3.17b} \]
\[ \frac{2^k}{\sqrt\pi}\int_{-\infty}^\infty dy\, e^{-y^2}(x+iy)^k = H_k(x). \tag{3.17c} \]
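The three classical integrals (3.17a)-(3.17c) have polynomial integrands against the weight $e^{-y^2}$, so Gauss-Hermite quadrature evaluates them essentially exactly. The sketch below is an illustrative check only; it assumes NumPy's physicists' Hermite convention, and the function names are made up for this example.

```python
import math
import numpy as np
from numpy.polynomial import hermite as H

# 40-point Gauss-Hermite rule: exact for p(y) * exp(-y^2) with deg p <= 79.
NODES, WEIGHTS = H.hermgauss(40)


def herm(k, y):
    """Physicists' Hermite polynomial H_k evaluated at y."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return H.hermval(y, c)


def norm_317a(k):
    """Left-hand side of (3.17a)."""
    return float(np.sum(WEIGHTS * herm(k, NODES) ** 2))


def shift_317b(k, x):
    """Left-hand side of (3.17b)."""
    return float(2.0 ** (-k) / math.sqrt(math.pi) * np.sum(WEIGHTS * herm(k, NODES + x)))


def transform_317c(k, x):
    """Left-hand side of (3.17c); the imaginary part cancels by symmetry."""
    return complex(2.0 ** k / math.sqrt(math.pi) * np.sum(WEIGHTS * (x + 1j * NODES) ** k))
```

For instance, `shift_317b(5, 0.7)` should agree with `0.7 ** 5`, and `transform_317c(5, 0.7)` with `herm(5, 0.7)`, up to rounding.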

To present these analogues let us introduce the notation
\[ d\mu^{(H)}(y) := \prod_{j=1}^N e^{-y_j^2}\prod_{1\le j<k\le N}|y_j-y_k|^{2/\alpha}\,dy_1\dots dy_N. \tag{3.18} \]


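Proposition 3.7. We have
\[ N_\kappa^{(H)} := \int_{(-\infty,\infty)^N}\bigl(H_\kappa(y;\alpha)\bigr)^2\,d\mu^{(H)}(y) = \frac{2^{|\kappa|}|\kappa|!\,N_0^{(H)}}{C_\kappa^{(\alpha)}(1^N)}, \]
where
\[ N_0^{(H)} := \int_{(-\infty,\infty)^N}d\mu^{(H)}(y) = 2^{-N(N-1)/2\alpha}\pi^{N/2}\prod_{j=0}^{N-1}\frac{\Gamma(1+(j+1)/\alpha)}{\Gamma(1+1/\alpha)}. \]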
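The closed form for $N_0^{(H)}$ can be spot-checked for small $N$ whenever $2/\alpha$ is an even integer, since the integrand of (3.18) is then a polynomial times a Gaussian and a tensor-product Gauss-Hermite rule is exact. This is a minimal sketch under that assumption; `n0_formula` and `n0_quadrature` are hypothetical helper names.

```python
import itertools
import math
from numpy.polynomial import hermite as H


def n0_formula(N, alpha):
    """Closed form for N_0^(H) as stated in Proposition 3.7."""
    g = 1.0 / alpha
    prod = 1.0
    for j in range(N):
        prod *= math.gamma(1 + (j + 1) * g) / math.gamma(1 + g)
    return 2.0 ** (-N * (N - 1) * g / 2) * math.pi ** (N / 2) * prod


def n0_quadrature(N, alpha, nodes=12):
    """Tensor-product Gauss-Hermite evaluation of the integral of (3.18)."""
    y, w = H.hermgauss(nodes)
    total = 0.0
    for idx in itertools.product(range(nodes), repeat=N):
        vandermonde = 1.0
        for j in range(N):
            for k in range(j + 1, N):
                vandermonde *= abs(y[idx[k]] - y[idx[j]]) ** (2.0 / alpha)
        total += vandermonde * math.prod(w[i] for i in idx)
    return total
```

With $N = 2$, $\alpha = 1$ both functions should return $\pi$; the cost grows as `nodes ** N`, so this is only practical for very small $N$.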

Proof. Multiplying both sides of the generating function (3.2a) by $H_\kappa(y;\alpha)$, integrating with respect to the measure (3.18), and using the orthogonality property of $\{H_\kappa(y;\alpha)\}_\kappa$ with respect to the inner product (2.20) gives
\[ \frac{N_\kappa^{(H)}}{|\kappa|!}C_\kappa^{(\alpha)}(z) = e^{-p_2(z)}\int_{(-\infty,\infty)^N}{}_0F_0^{(\alpha)}(2y;z)H_\kappa(y;\alpha)\,d\mu^{(H)}(y). \tag{3.19} \]
Set $z_1 = \dots = z_N = c$, substitute (3.16) in the r.h.s. of (3.19) and complete the square to show
\[ \frac{N_\kappa^{(H)}}{|\kappa|!}C_\kappa^{(\alpha)}(c) = \int_{(-\infty,\infty)^N}e^{-p_2(y)}\prod_{1\le j<k\le N}|y_k-y_j|^{2/\alpha}H_\kappa(y+c;\alpha)\,dy_1\dots dy_N. \]
Now take the limit $c\to\infty$. Since from (2.14) and (2.11a),
\[ \lim_{c\to\infty}\frac{H_\kappa(y+c;\alpha)}{C_\kappa^{(\alpha)}(c)} = \frac{2^{|\kappa|}}{C_\kappa^{(\alpha)}(1^N)}, \]

the stated formula for $N_\kappa^{(H)}$ follows. The formula for $N_0^{(H)}$ is a well known limiting case of Selberg's integral.

The analogues of (3.17b) and (3.17c) can be derived from the following integration formula.

Proposition 3.8. We have
\[ \int_{(-\infty,\infty)^N}{}_0F_0^{(\alpha)}(2y;z)\,{}_0F_0^{(\alpha)}(2y;w)\,d\mu^{(H)}(y) = e^{p_2(w)+p_2(z)}N_0^{(H)}\,{}_0F_0^{(\alpha)}(2z;w). \]

Proof. Substitute the generating function (3.2a) for ${}_0F_0^{(\alpha)}$ and integrate term-by-term using the orthogonality property of the $\{H_\kappa(y;\alpha)\}_\kappa$ with respect to the inner product (2.20) and the normalization integral of Proposition 3.7. The resulting series is identified as ${}_0F_0^{(\alpha)}(2z;w)$ according to the definition (3.2b).

Corollary 3.1. We have
\[ \int_{(-\infty,\infty)^N}{}_0F_0^{(\alpha)}(2y;z)H_\kappa(y;\alpha)\,d\mu^{(H)}(y) = e^{p_2(z)}N_0^{(H)}\frac{2^{|\kappa|}C_\kappa^{(\alpha)}(z)}{C_\kappa^{(\alpha)}(1^N)}. \]

Proof. Multiply both sides of the integration formula of Proposition 3.8 by $\exp(-p_2(w))$ and substitute for $\exp(-p_2(w))\,{}_0F_0^{(\alpha)}(2y;w)$ using (3.2a). The result follows by equating coefficients of $C_\kappa^{(\alpha)}(w)$ on both sides.


Corollary 3.2. We have
\[ e^{-p_2(z)}H_\kappa(z;\alpha) = \frac{2^{|\kappa|}}{N_0^{(H)}C_\kappa^{(\alpha)}(1^N)}\int_{(-\infty,\infty)^N}{}_0F_0^{(\alpha)}(2y;-iz)\,C_\kappa^{(\alpha)}(iy)\,d\mu^{(H)}(y). \]

Proof. By writing $iw$ for $w$ in Proposition 3.8, we can replace ${}_0F_0^{(\alpha)}(2y;w)$ by ${}_0F_0^{(\alpha)}(2iy;w)$, ${}_0F_0^{(\alpha)}(2z;w)$ by ${}_0F_0^{(\alpha)}(2iz;w)$ and $\exp(-p_2(w))$ by $\exp(p_2(w))$. The result follows by using the generating function (3.2a) to substitute for $\exp(-p_2(w))\,{}_0F_0^{(\alpha)}(2iz;w)$ and equating coefficients of $C_\kappa^{(\alpha)}(w)$.

The integration formula of Corollary 3.2 can be used to derive the analogue of the classical summation formula [8]
\[ \sum_{k=0}^\infty \frac{H_k(w)H_k(z)}{k!\,2^k\sqrt\pi}t^k = \frac{1}{\sqrt\pi}(1-t^2)^{-1/2}e^{-t^2(z^2+w^2)/(1-t^2)}e^{2wzt/(1-t^2)}, \qquad |t|<1. \]
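The classical summation formula above (Mehler's formula) is easy to test numerically by truncating the series. The sketch below does this for moderate $|t|$; the truncation length `terms` and the function names are assumptions of this example, not quantities from the paper.

```python
import math
import numpy as np
from numpy.polynomial import hermite as H


def mehler_lhs(w, z, t, terms=80):
    """Truncated series sum_k H_k(w) H_k(z) t^k / (k! 2^k sqrt(pi))."""
    total = 0.0
    for k in range(terms):
        c = np.zeros(k + 1)
        c[k] = 1.0
        total += H.hermval(w, c) * H.hermval(z, c) * t ** k / (math.factorial(k) * 2.0 ** k)
    return total / math.sqrt(math.pi)


def mehler_rhs(w, z, t):
    """Closed form on the right-hand side of the classical summation formula."""
    s = 1.0 - t * t
    return s ** -0.5 * math.exp(-t * t * (z * z + w * w) / s + 2.0 * w * z * t / s) / math.sqrt(math.pi)
```

Comparing `mehler_lhs(0.5, -0.8, 0.3)` with `mehler_rhs(0.5, -0.8, 0.3)` gives agreement to roughly machine precision; convergence slows as $|t| \to 1$.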

Proposition 3.9. For $|t|<1$ we have
\[ G^{(H)}(w,z;t) := \sum_\kappa \frac{H_\kappa(w;\alpha)H_\kappa(z;\alpha)}{N_\kappa^{(H)}}t^{|\kappa|} = \frac{1}{N_0^{(H)}}(1-t^2)^{-Nq/2}\exp\Bigl(-\frac{t^2}{1-t^2}\bigl(p_2(z)+p_2(w)\bigr)\Bigr)\,{}_0F_0^{(\alpha)}\Bigl(\frac{2wt}{(1-t^2)^{1/2}};\frac{z}{(1-t^2)^{1/2}}\Bigr), \]
where $q = 1+(N-1)/\alpha$.

Proof. Substituting the integral representation of Corollary 3.2 for $H_\kappa(z;\alpha)$ and $H_\kappa(w;\alpha)$ in the definition of $G^{(H)}(w,z;t)$, we see that the sum over $\kappa$ can be recognized in terms of ${}_0F_0^{(\alpha)}$ and thus
\[ G^{(H)}(w,z;t) = e^{p_2(z)+p_2(w)}\frac{1}{(N_0^{(H)})^3}\int_{(-\infty,\infty)^N}d\mu^{(H)}(y_a)\int_{(-\infty,\infty)^N}d\mu^{(H)}(y_b)\,{}_0F_0^{(\alpha)}(2y_a;-iw)\,{}_0F_0^{(\alpha)}(2y_b;-iz)\,{}_0F_0^{(\alpha)}(2y_a;-ty_b). \]
We now use Proposition 3.8 to integrate over $y_a$. This gives
\begin{align*}
G^{(H)}(w,z;t) &= e^{p_2(z)}\frac{1}{(N_0^{(H)})^2}\int_{(-\infty,\infty)^N}d\mu^{(H)}(y_b)\,{}_0F_0^{(\alpha)}(2y_b;-iz)\,{}_0F_0^{(\alpha)}(2iw;ty_b)\,e^{t^2p_2(y_b)} \\
&= e^{p_2(z)}\frac{1}{(N_0^{(H)})^2}(1-t^2)^{-(N/2+N(N-1)/2\alpha)}\int_{(-\infty,\infty)^N}d\mu^{(H)}(y_b)\,{}_0F_0^{(\alpha)}\bigl(2y_b;-iz(1-t^2)^{-1/2}\bigr)\,{}_0F_0^{(\alpha)}\bigl(2y_b;iwt(1-t^2)^{-1/2}\bigr),
\end{align*}
where the second equality follows by combining $d\mu^{(H)}(y_b)$ and $\exp(t^2p_2(y_b))$ (recall (3.18)) and changing variables. The integration over $d\mu^{(H)}(y_b)$ can now be performed using Proposition 3.8, and the summation formula for $G^{(H)}(w,z;t)$ results.


Notice from (3.16) that in the special case that $w_1 = \cdots = w_N = c$, the summation formula of Proposition 3.9 is entirely in terms of elementary functions:
\[ G^{(H)}(w,z;t) = \frac{1}{N_0^{(H)}}(1-t^2)^{-Nq/2}\exp\Bigl(-\frac{1}{1-t^2}\sum_{j=1}^N\bigl(t^2z_j^2-2tcz_j+t^2c^2\bigr)\Bigr). \tag{3.20} \]
Interpretation of this result in terms of an explicit solution of the Fokker-Planck equation (1.6) with $W$ given by (1.3a) will be discussed in Sect. 5.

In Corollary 3.2 a certain integral transform is applied to the Jack polynomial to obtain the generalized Hermite polynomial. It has been observed by Lassalle that the generalized Hermite polynomials can be obtained from the Jack polynomials by the action of a certain exponential differential operator. Thus from the formula (3.7b), we see that
\[ \Bigl(\frac14D_0^{(y)}\Bigr)^k\,{}_0F_0^{(\alpha)}(2y;z) = \bigl(p_2(z)\bigr)^k\,{}_0F_0^{(\alpha)}(2y;z), \]
which after multiplication by $(-1)^k/k!$ and summing over $k$ gives
\[ \exp\Bigl(-\frac14D_0^{(y)}\Bigr)\,{}_0F_0^{(\alpha)}(2y;z) = e^{-p_2(z)}\,{}_0F_0^{(\alpha)}(2y;z). \]
Use of the generating function (3.2a) on the r.h.s. and equating coefficients of $C_\kappa^{(\alpha)}(z)$ gives Lassalle's formula
\[ \frac{2^{|\kappa|}}{C_\kappa^{(\alpha)}(1^N)}\exp\Bigl(-\frac14D_0^{(y)}\Bigr)C_\kappa^{(\alpha)}(y) = H_\kappa(y;\alpha). \tag{3.21} \]
Comparison with the formula of Corollary 3.2, and use of the fact that $\{C_\kappa^{(\alpha)}(y)\}_\kappa$ forms a basis for symmetric analytic functions shows that for any symmetric analytic function $f(y)$,
\[ \frac{e^{p_2(z)}}{N_0^{(H)}}\int_{(-\infty,\infty)^N}{}_0F_0^{(\alpha)}(2y;-iz)f(iy)\,d\mu^{(H)}(y) = \exp\Bigl(-\frac14D_0^{(z)}\Bigr)f(z). \tag{3.22} \]
From (3.22) we see that if
\[ F(z) = \frac{e^{p_2(z)}}{N_0^{(H)}}\int_{(-\infty,\infty)^N}{}_0F_0^{(\alpha)}(2y;-iz)f(iy)\,d\mu^{(H)}(y), \tag{3.23a} \]
then
\[ f(z) = \exp\Bigl(\frac14D_0^{(z)}\Bigr)F(z). \tag{3.23b} \]
On the other hand, by replacing $z$ by $iz$ and $f(ix)$ by $F(x)$ we have
\[ \frac{e^{-p_2(z)}}{N_0^{(H)}}\int_{(-\infty,\infty)^N}{}_0F_0^{(\alpha)}(2y;z)F(y)\,d\mu^{(H)}(y) = \exp\Bigl(\frac14D_0^{(z)}\Bigr)F(z). \tag{3.24} \]
Comparison of (3.23b) and (3.24) gives
\[ f(z) = \frac{e^{-p_2(z)}}{N_0^{(H)}}\int_{(-\infty,\infty)^N}{}_0F_0^{(\alpha)}(2y;z)F(y)\,d\mu^{(H)}(y), \tag{3.25} \]
which is the inversion formula for the transform (3.23a) (in the case $\alpha = \infty$ (3.23a) corresponds to the Fourier transform).
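In one variable Lassalle's formula (3.21) reduces to the classical statement $H_k(y) = 2^k\exp(-\tfrac14 d^2/dy^2)\,y^k$. The sketch below checks this with polynomial arithmetic, assuming that for $N = 1$ the operator $D_0^{(y)}$ is simply $d^2/dy^2$ and $C_k(y) = y^k$, $C_k(1) = 1$ (these reductions are assumptions here, since the definitions sit outside this section). The exponential series terminates because each term lowers the degree by two.

```python
import numpy as np
from numpy.polynomial import hermite as H
from numpy.polynomial import polynomial as P


def lassalle_hermite(k):
    """Power-basis coefficients of 2^k exp(-D^2/4) y^k with D = d/dy."""
    term = np.zeros(k + 1)
    term[k] = 2.0 ** k              # start from 2^k y^k
    result = term.copy()
    for j in range(1, k // 2 + 1):
        # term_j = (-1/4)^j D^(2j) / j! applied to 2^k y^k, built recursively
        term = P.polyder(term, 2) * (-0.25) / j
        result[: term.size] += term
    return result


def hermite_power_coeffs(k):
    """Power-basis coefficients of the physicists' Hermite polynomial H_k."""
    return H.herm2poly([0.0] * k + [1.0])
```

For example `lassalle_hermite(2)` yields the coefficients of $4y^2 - 2$, matching `hermite_power_coeffs(2)`.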


4. The Generalized Laguerre Polynomials

The generalized Laguerre polynomials, defined as the polynomial eigenfunctions of the operator (2.1b) of the form (2.11b) with normalization (2.16), also satisfy higher-dimensional analogues of their classical counterparts. A number of these formulas have been proved in the case $\alpha = 2$ by Muirhead [27] and for general $\alpha$ but $N = 2$ by Yan [34]. Below we will develop the theory of generalized Laguerre polynomials by presenting the analogues of the classical generating functions, the series expansion (2.2b), recurrence and differentiation formulas, integration formulas and a summation formula. In Sect. 6 we will identify the formulas known to Muirhead and Yan, as well as those which can be found in the work of Lassalle and Macdonald.

4.1. Generating functions. The classical Laguerre polynomials can be defined by either of the generating functions
\[ e^z\frac{J_a(2\sqrt{yz})}{(yz)^{a/2}} = \sum_{n=0}^\infty \frac{1}{\Gamma(n+a+1)}L_n^a(y)z^n \tag{4.1a} \]
or
\[ (1-z)^{-(a+1)}e^{yz/(z-1)} = \sum_{n=0}^\infty L_n^a(y)z^n, \tag{4.1b} \]
where in (4.1a) $J_a$ denotes the Bessel function. These generating functions have the following higher-dimensional analogues.

Proposition 4.1. We have
\[ e^{p_1(z)}\,{}_0F_1^{(\alpha)}(a+q;x;-z) = \sum_\kappa \frac{L_\kappa^a(x;\alpha)C_\kappa^{(\alpha)}(z)}{[a+q]_\kappa^{(\alpha)}}, \tag{4.2} \]
where
\[ q := 1+(N-1)/\alpha, \tag{4.3a} \]
\[ {}_pF_r^{(\alpha)}(a_1,\dots,a_p;b_1,\dots,b_r;x;z) := \sum_\kappa \frac{1}{|\kappa|!}\frac{[a_1]_\kappa^{(\alpha)}\cdots[a_p]_\kappa^{(\alpha)}}{[b_1]_\kappa^{(\alpha)}\cdots[b_r]_\kappa^{(\alpha)}}\frac{C_\kappa^{(\alpha)}(x)C_\kappa^{(\alpha)}(z)}{C_\kappa^{(\alpha)}(1^N)}, \tag{4.3b} \]
with $[c]_\kappa^{(\alpha)} := \prod_{j=1}^N\bigl(c-\frac{1}{\alpha}(j-1)\bigr)_{\kappa_j}$. We also have
\[ \prod(1-z)^{-(a+q)}\,{}_0F_0^{(\alpha)}\Bigl(-x;\frac{z}{1-z}\Bigr) = \sum_\kappa L_\kappa^a(x;\alpha)C_\kappa^{(\alpha)}(z), \tag{4.4} \]
where $\prod(1-z) := \prod_{j=1}^N(1-z_j)$.

Proof. In each case we need to establish that $L_\kappa^a(x;\alpha)$ as defined by the generating function is an eigenfunction of the operator (2.1b), which in terms of the notation (2.12) reads
\[ \tilde H^{(L)} = D_1+(a+1)E_0-E_1, \tag{4.5} \]
with eigenvalue $-|\kappa|$ and has an expansion in terms of Jack polynomials with leading term $(-1)^{|\kappa|}C_\kappa^{(\alpha)}(x)/|\kappa|!\,C_\kappa^{(\alpha)}(1^N)$. The proof of the first requirement relies on the identities
\[ \bigl(D_1^{(x)}+(a+1)E_0^{(x)}\bigr)\,{}_0F_1^{(\alpha)}(a+q;x;z) = p_1(z)\,{}_0F_1^{(\alpha)}(a+q;x;z), \tag{4.6a} \]
\[ \bigl(D_1^{(x)}-E_2^{(y)}\bigr)\,{}_0F_0^{(\alpha)}(x;y) = (q-1)p_1(y)\,{}_0F_0^{(\alpha)}(x;y), \tag{4.6b} \]
with (4.6a) being established in the Appendix, and (4.6b) simply a rewrite of (3.7a) with $y\to y/2$.

First consider (4.2). We have
\[ E_1^{(z)}\bigl({}_0F_1^{(\alpha)}(a+q;x;-z)e^{p_1(z)}\bigr) = e^{p_1(z)}E_1^{(z)}\,{}_0F_1^{(\alpha)}(a+q;x;-z) + p_1(z)e^{p_1(z)}\,{}_0F_1^{(\alpha)}(a+q;x;-z). \tag{4.7} \]
Using (4.6a) and the fact that $E_1^{(z)}$ is an eigenoperator of the Jack polynomials so that its action on ${}_0F_1^{(\alpha)}(x;-z)$ is the same as the action of $E_1^{(x)}$, the r.h.s. of (4.7) can be rewritten as
\[ \bigl(E_1^{(x)}-D_1^{(x)}-(a+1)E_0^{(x)}\bigr)\,{}_0F_1^{(\alpha)}(a+q;x;-z)e^{p_1(z)}. \tag{4.8} \]
Substituting the generating function (4.2) in the l.h.s. of (4.7) and computing the action of $E_1^{(z)}$, and comparing coefficients of $C_\kappa^{(\alpha)}(z)$ with (4.8) after also substituting the generating function (4.2) establishes the eigenvalue equation. The expansion of $L_\kappa^a(x;\alpha)$ in terms of Jack polynomials as deduced from (4.2) is given by Proposition 4.3. Its leading term is $(-1)^{|\kappa|}C_\kappa^{(\alpha)}(x)/|\kappa|!\,C_\kappa^{(\alpha)}(1^N)$ as required.

Now consider (4.4). Setting $y_j := z_j/(1-z_j)$ this is equivalent to
\[ \prod(1+y)^{a+q}\,{}_0F_0^{(\alpha)}(-x;y) = \sum_\mu L_\mu^a(x;\alpha)C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr). \tag{4.9} \]
To verify the eigenvalue equation, note that $E_1^{(y)}+E_2^{(y)}$ is an eigenoperator of $C_\mu^{(\alpha)}(y/(1+y))$ with eigenvalue $|\mu|$, and
\[ \bigl(E_1^{(y)}+E_2^{(y)}\bigr)\prod(1+y)^{a+q} = (a+q)p_1(y)\prod(1+y)^{a+q}. \tag{4.10} \]
Thus, applying $E_1^{(y)}+E_2^{(y)}$ to the generating function (4.9) we have
\begin{align*}
\sum_\mu|\mu|L_\mu^a(x;\alpha)C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr)
&= \bigl(E_1^{(y)}+E_2^{(y)}\bigr)\prod(1+y)^{a+q}\,{}_0F_0^{(\alpha)}(-x;y) \\
&= \prod(1+y)^{a+q}\bigl(E_1^{(y)}+E_2^{(y)}+(a+q)p_1(y)\bigr)\,{}_0F_0^{(\alpha)}(-x;y) \\
&= \prod(1+y)^{a+q}\bigl(E_1^{(x)}-D_1^{(x)}+(a+1)p_1(y)\bigr)\,{}_0F_0^{(\alpha)}(-x;y) \\
&= \prod(1+y)^{a+q}\bigl(E_1^{(x)}-D_1^{(x)}-(a+1)E_0^{(x)}\bigr)\,{}_0F_0^{(\alpha)}(-x;y), \tag{4.11}
\end{align*}


where to obtain the third equality we have used (4.6b) while the final equality follows from (3.3). The factor involving $\prod(1+y)$ in the final equality can be commuted in front of the operators, since they act only on the $x$-variables. Use of the generating function (4.9) and comparison of the coefficient of $C_\kappa^{(\alpha)}(y/(1+y))$ in (4.11) establishes the eigenvalue equation.

The leading term in the Jack polynomial expansion of $L_\kappa^a(x;\alpha)$ as defined by (4.4) is the same as the leading coefficient of $C_\kappa^{(\alpha)}(z)$ in the expansion of
\[ \prod(1-z)^{-(a+q)}\sum_{\mu,\ |\mu|\le|\kappa|}\frac{(-1)^{|\mu|}}{|\mu|!}C_\mu^{(\alpha)}(x)\frac{C_\mu^{(\alpha)}(z/(1-z))}{C_\mu^{(\alpha)}(1^N)}. \]
Consideration of the form of the expansion of $C_\mu^{(\alpha)}(z/(1-z))$ and $\prod(1-z)^{-(a+q)}$ in terms of $\{C_\sigma^{(\alpha)}(z)\}_\sigma$ shows that the leading coefficient of $C_\kappa^{(\alpha)}(z)$ in the above expression is $(-1)^{|\kappa|}C_\kappa^{(\alpha)}(x)/|\kappa|!\,C_\kappa^{(\alpha)}(1^N)$. Comparison with the coefficient of $C_\kappa^{(\alpha)}(z)$ on the r.h.s. of (4.4) shows that $L_\kappa^a(x;\alpha)$ has the required leading term for its expansion in terms of Jack polynomials.

In Proposition 4.1 the Laguerre polynomial is given by generating functions involving ${}_0F_0^{(\alpha)}$ and ${}_0F_1^{(\alpha)}$. It is also possible to give a generating function involving ${}_1F_1^{(\alpha)}$ (recall (4.3b)) which includes both these generating functions as limiting cases.

Proposition 4.2. We have
\[ \prod(1-z)^{-c-q}\,{}_1F_1^{(\alpha)}\Bigl(c+q;a+q;-x;\frac{z}{1-z}\Bigr) = \sum_\lambda \frac{[c+q]_\lambda^{(\alpha)}}{[a+q]_\lambda^{(\alpha)}}L_\lambda^a(x;\alpha)C_\lambda^{(\alpha)}(z). \tag{4.12} \]

Proof. The derivation closely follows that of (4.4) above, with (4.6b) being replaced by the formula
\[ \Bigl(D_1^{(x)}+\Bigl(a+\frac{1-N}{\alpha}\Bigr)E_0^{(x)}\Bigr)\,{}_1F_1^{(\alpha)}(c;a;x;y) = \bigl(E_2^{(y)}+cp_1(y)\bigr)\,{}_1F_1^{(\alpha)}(c;a;x;y), \tag{4.13} \]
which is established in the Appendix.

To derive the generating function (4.2) from (4.12) replace $z$ by $z/c$ and take the limit $c\to\infty$ using the facts that
\[ \lim_{c\to\infty}\prod(1-z/c)^{-c-q} = e^{p_1(z)}, \qquad \lim_{c\to\infty}{}_1F_1^{(\alpha)}\Bigl(c+q;a+q;-x;\frac{z/c}{1-z/c}\Bigr) = {}_0F_1^{(\alpha)}(a+q;-x;z), \qquad \lim_{c\to\infty}[c+q]_\kappa^{(\alpha)}C_\kappa^{(\alpha)}(z/c) = C_\kappa^{(\alpha)}(z). \]
The generating function (4.4) follows from (4.12) by setting $c = a$.

From the generating function (4.2) it is possible to deduce the higher-dimensional analogue of the series expansion (2.2b).

Proposition 4.3. We have
\[ L_\kappa^a(x;\alpha) = \frac{[a+q]_\kappa^{(\alpha)}}{|\kappa|!}\sum_{\sigma\subseteq\kappa}\binom{\kappa}{\sigma}\frac{(-1)^{|\sigma|}C_\sigma^{(\alpha)}(x)}{[a+q]_\sigma^{(\alpha)}C_\sigma^{(\alpha)}(1^N)}, \tag{4.14a} \]
\[ C_\kappa^{(\alpha)}(x) = [a+q]_\kappa^{(\alpha)}C_\kappa^{(\alpha)}(1^N)\sum_{\sigma\subseteq\kappa}(-1)^{|\sigma|}\binom{\kappa}{\sigma}\frac{|\sigma|!\,L_\sigma^a(x;\alpha)}{[a+q]_\sigma^{(\alpha)}}. \tag{4.14b} \]


Proof. The formula (4.14a) follows from the generating function formula (4.2) by applying the identity [20, 17]
\[ e^{p_1(z)}C_\lambda^{(\alpha)}(z) = \sum_\mu \frac{|\lambda|!}{|\mu|!}\binom{\mu}{\lambda}C_\mu^{(\alpha)}(z) \tag{4.15} \]
on the l.h.s. and equating coefficients of $C_\kappa^{(\alpha)}(z)$. The formula (4.14b) follows from (4.2) by multiplying both sides by $e^{-p_1(z)}$, using the identity (4.15) (with $z$ replaced by $-z$) on the r.h.s., and equating coefficients of $C_\kappa^{(\alpha)}(z)$.

A simple consequence of the generating function (4.4) is the analogue of the formula of Proposition 3.6, which is derived by following the steps of the proof of that formula.

Proposition 4.4. We have
\[ L_k^{N(a+q)-1}(p_1(y)) = \sum_{|\kappa|=k}L_\kappa^a(y;\alpha)C_\kappa^{(\alpha)}(1^N). \]
Since each $L_\kappa^a(y;\alpha)$ is an eigenfunction of (4.5) with eigenvalue $-|\kappa|$, it follows that $L_k^{N(a+q)-1}(p_1(y))$ is also an eigenfunction of (4.5) with eigenvalue $-k$ (this feature can be checked directly).

As our final result of this section, we note that the proof of (4.2) can be generalized to give a family of eigenoperators for $L_\kappa^a(y;\alpha)$, together with the corresponding eigenvalues. These operators are analogues of the operators of Proposition 3.2 for the generalized Hermite polynomials, and are derived in the same way.

Proposition 4.5. Let
\[ \tilde H_p^{(L)}(y) := D_N^{p\,(y)} - \bigl[D_1^{(y)}+(a+1)E_0^{(y)}, D_N^{p\,(y)}\bigr] + \cdots + \frac{(-1)^p}{p!}\bigl[D_1^{(y)}+(a+1)E_0^{(y)},\cdots\bigl[D_1^{(y)}+(a+1)E_0^{(y)}, D_N^{p\,(y)}\bigr]\cdots\bigr], \]
where $D_N^p$ is the operator introduced in Sect. 3.1. We have that $L_\kappa^a(y;\alpha)$ is an eigenfunction of $\tilde H_p^{(L)}(y)$ for each $p = 1,\dots,N$, with eigenvalue $e_p(\kappa;\alpha)$ given by the coefficient of $X^{N-p}$ in (3.10).

4.2. Recurrence and differentiation formulas. The classical Laguerre polynomials satisfy the recurrence relations
\[ xL_n^a(x) = (2n+a+1)L_n^a(x)-(n+1)L_{n+1}^a(x)-(n+a)L_{n-1}^a(x), \tag{4.16a} \]
\[ L_n^{a+1}(x) = \sum_{m=0}^nL_m^a(x), \tag{4.16b} \]
\[ L_n^a(x) = L_n^{a+1}(x)-L_{n-1}^{a+1}(x), \tag{4.16c} \]
and the differentiation formulas
\[ \frac{d}{dx}L_n^a(x) = -L_{n-1}^{a+1}(x), \tag{4.17a} \]
\[ x\frac{d}{dx}L_n^a(x) = nL_n^a(x)-(n+a)L_{n-1}^a(x). \tag{4.17b} \]
The generalized Laguerre polynomials satisfy higher-dimensional analogues of these formulas. Let us first consider (4.17b).
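The classical relations (4.16a)-(4.16c) and (4.17a)-(4.17b) can be verified numerically from the explicit series for $L_n^a(x)$. The sketch below returns the residual (left side minus right side) of each relation; the helper names are illustrative, and the convention $L_{-1}^a = 0$ is assumed so that the formulas also make sense at $n = 0$.

```python
import math


def _coeff(n, a, s):
    """Coefficient of x^s in the classical series for L_n^a(x) (requires a > -1)."""
    return (-1) ** s * math.gamma(n + a + 1) / (
        math.gamma(a + s + 1) * math.factorial(n - s) * math.factorial(s)
    )


def laguerre(n, a, x):
    """Classical L_n^a(x); returns 0 for n < 0."""
    if n < 0:
        return 0.0
    return sum(_coeff(n, a, s) * x ** s for s in range(n + 1))


def laguerre_dx(n, a, x):
    """d/dx L_n^a(x), differentiating the series term by term."""
    return sum(_coeff(n, a, s) * s * x ** (s - 1) for s in range(1, n + 1))


def residuals(n, a, x):
    """Residuals of (4.16a), (4.16b), (4.16c), (4.17a), (4.17b), in that order."""
    L = laguerre
    r_416a = x * L(n, a, x) - (
        (2 * n + a + 1) * L(n, a, x) - (n + 1) * L(n + 1, a, x) - (n + a) * L(n - 1, a, x)
    )
    r_416b = L(n, a + 1, x) - sum(L(m, a, x) for m in range(n + 1))
    r_416c = L(n, a, x) - (L(n, a + 1, x) - L(n - 1, a + 1, x))
    r_417a = laguerre_dx(n, a, x) + L(n - 1, a + 1, x)
    r_417b = x * laguerre_dx(n, a, x) - (n * L(n, a, x) - (n + a) * L(n - 1, a, x))
    return r_416a, r_416b, r_416c, r_417a, r_417b
```

All five residuals returned by `residuals(4, 0.5, 1.3)` should vanish up to rounding error.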


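Proposition 4.6. We have
\[ E_1^{(x)}L_\kappa^a(x;\alpha) = |\kappa|L_\kappa^a(x;\alpha) - \frac{1}{|\kappa|}\sum_i\binom{\kappa}{\kappa_{(i)}}\Bigl(\kappa_i+a+\frac{N-i}{\alpha}\Bigr)L_{\kappa_{(i)}}^a(x;\alpha). \]

Proof. From the generating function (4.9) and the fact that $E_1^{(x)}$ is an eigenoperator of $C_\kappa^{(\alpha)}(x)$ we have
\begin{align*}
E_1^{(x)}\sum_\mu L_\mu^a(x;\alpha)C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr)
&= \prod(1+y)^{a+q}E_1^{(y)}\,{}_0F_0^{(\alpha)}(-x;y) \\
&= \bigl(E_1^{(y)}-(a+q)p_1(y/(1+y))\bigr)\prod(1+y)^{a+q}\,{}_0F_0^{(\alpha)}(-x;y) \\
&= \bigl(E_1^{(y)}-(a+q)p_1(y/(1+y))\bigr)\sum_\mu L_\mu^a(x;\alpha)C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr). \tag{4.18}
\end{align*}
But with $z_j := y_j/(1+y_j)$,
\[ E_1^{(y)}C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) = \bigl(E_1^{(z)}-E_2^{(z)}\bigr)C_\mu^{(\alpha)}(z) = |\mu|C_\mu^{(\alpha)}(z) - \frac{1}{1+|\mu|}\sum_{i=1}^N\binom{\mu^{(i)}}{\mu}\Bigl(\mu_i-\frac{i-1}{\alpha}\Bigr)C_{\mu^{(i)}}^{(\alpha)}(z), \]
where the second equality uses (2.13c). Substituting this expression in the r.h.s. of (4.18), and using (3.4) to simplify the remaining term on the r.h.s. of (4.18) gives
\[ E_1^{(x)}\sum_\mu L_\mu^a(x;\alpha)C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) = \sum_\mu L_\mu^a(x;\alpha)\Bigl\{|\mu|C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) - \frac{1}{1+|\mu|}\sum_{i=1}^N\binom{\mu^{(i)}}{\mu}\Bigl(\mu_i+1+a+\frac{N-i}{\alpha}\Bigr)C_{\mu^{(i)}}^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr)\Bigr\}. \]
The result follows by equating coefficients of $C_\mu^{(\alpha)}(\frac{y}{1+y})$.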

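Next we will derive the analogue of (4.16a).

Proposition 4.7. We have
\[ p_1(x)L_\kappa^a(x;\alpha) = \bigl(2|\kappa|+N(a+q)\bigr)L_\kappa^a(x;\alpha) - \frac{1}{|\kappa|}\sum_{i=1}^N\binom{\kappa}{\kappa_{(i)}}\Bigl(\kappa_i+a+\frac{N-i}{\alpha}\Bigr)L_{\kappa_{(i)}}^a(x;\alpha) - (|\kappa|+1)\alpha\sum_i\binom{\kappa^{(i)}}{\kappa}\frac{j_\kappa}{j_{\kappa^{(i)}}}(N-i+1+\alpha\kappa_i)L_{\kappa^{(i)}}^a(x;\alpha). \]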


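Proof. From the generating function (4.9)
\begin{align*}
\sum_\mu p_1(x)L_\mu^a(x;\alpha)C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr)
&= p_1(x)\prod(1+y)^{a+q}\,{}_0F_0^{(\alpha)}(-x;y) = -\prod(1+y)^{a+q}E_0^{(y)}\,{}_0F_0^{(\alpha)}(-x;y) \\
&= -E_0^{(y)}\Bigl(\prod(1+y)^{a+q}\,{}_0F_0^{(\alpha)}(-x;y)\Bigr) + {}_0F_0^{(\alpha)}(-x;y)E_0^{(y)}\prod(1+y)^{a+q} \\
&= \bigl(-E_0^{(y)}+(a+q)p_1(1/(1+y))\bigr)\prod(1+y)^{a+q}\,{}_0F_0^{(\alpha)}(-x;y) \\
&= \bigl(-E_0^{(y)}+(a+q)p_1(1/(1+y))\bigr)\sum_\mu L_\mu^a(x;\alpha)C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr). \tag{4.19}
\end{align*}
The task is now to write $E_0^{(y)}C_\mu^{(\alpha)}(\frac{y}{1+y})$ and $p_1(1/(1+y))C_\mu^{(\alpha)}(\frac{y}{1+y})$ as a series in $\{C_\kappa^{(\alpha)}(\frac{y}{1+y})\}_\kappa$. To do this let $z_j = y_j/(1+y_j)$ so that
\[ E_0^{(y)} = \sum_{j=1}^N(1-z_j)^2\frac{\partial}{\partial z_j} = E_0^{(z)}-2E_1^{(z)}+E_2^{(z)} \]
and $p_1(1/(1+y)) = p_1(1-z) = N-p_1(z)$. We then have
\begin{align*}
E_0^{(y)}C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) &= \bigl(E_0^{(z)}-2E_1^{(z)}+E_2^{(z)}\bigr)C_\mu^{(\alpha)}(z) \\
&= \sum_{i=1}^N\binom{\mu}{\mu_{(i)}}\frac{C_\mu^{(\alpha)}(1^N)}{C_{\mu_{(i)}}^{(\alpha)}(1^N)}C_{\mu_{(i)}}^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) - 2|\mu|C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) + \frac{1}{1+|\mu|}\sum_{i=1}^N\binom{\mu^{(i)}}{\mu}\Bigl(\mu_i-\frac{i-1}{\alpha}\Bigr)C_{\mu^{(i)}}^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) \tag{4.20}
\end{align*}
and
\[ p_1(1/(1+y))C_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) = \bigl(N-p_1(z)\bigr)C_\mu^{(\alpha)}(z) = NC_\mu^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr) - \frac{1}{1+|\mu|}\sum_{i=1}^N\binom{\mu^{(i)}}{\mu}C_{\mu^{(i)}}^{(\alpha)}\Bigl(\frac{y}{1+y}\Bigr), \tag{4.21} \]
where to obtain (4.20) we have used (2.13a), (2.13b) and (2.13c), while to obtain (4.21) we have used (3.4). Substituting (4.20) and (4.21) in the r.h.s. of (4.19), equating coefficients of $C_\kappa^{(\alpha)}(\frac{y}{1+y})$ with the l.h.s. of (4.19), and use of (2.15) to rewrite $C_{\kappa^{(i)}}^{(\alpha)}(1^N)/C_\kappa^{(\alpha)}(1^N)$ gives the stated result.

The generalizations of (4.16b) and (4.16c) are given by the following result.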


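Proposition 4.8. We have
\[ L_\kappa^{a-1}(x;\alpha) = \sum_{r=0}^{\min(N,|\kappa|)}(-\alpha)^{-r}\sum_{\sigma:\ \kappa/\sigma\ \text{a vertical}\ r\text{-strip}}\frac{|\sigma|!}{|\kappa|!}\psi_{\kappa/\sigma}(\alpha)L_\sigma^a(x;\alpha), \tag{4.22} \]
\[ L_\kappa^{a+1/\alpha}(x;\alpha) = \sum_{r=0}^{|\kappa|}\alpha^{-r}\sum_{\sigma:\ \kappa/\sigma\ \text{a horizontal}\ r\text{-strip}}\frac{|\sigma|!}{|\kappa|!}\phi_{\kappa/\sigma}(\alpha)L_\sigma^a(x;\alpha), \tag{4.23} \]
where
\[ \psi_{\kappa/\sigma}(\alpha) = \prod_{s\in R_{\kappa/\sigma}\cap\kappa}\frac{h^*_\kappa(s)}{h_*^\kappa(s)}\prod_{s\in R_{\kappa/\sigma}\cap\sigma}\frac{h_*^\sigma(s)}{h^*_\sigma(s)}\prod_{s\in\kappa}h_*^\kappa(s)\prod_{s\in\sigma}\bigl(h_*^\sigma(s)\bigr)^{-1}, \tag{4.24a} \]
\[ \phi_{\kappa/\sigma}(\alpha) = \prod_{s\in C_{\kappa/\sigma}\cap\kappa}\frac{h_*^\kappa(s)}{h^*_\kappa(s)}\prod_{s\in C_{\kappa/\sigma}\cap\sigma}\frac{h^*_\sigma(s)}{h_*^\sigma(s)}\prod_{s\in\kappa}h^*_\kappa(s)\prod_{s\in\sigma}\bigl(h^*_\sigma(s)\bigr)^{-1}. \tag{4.24b} \]
Here $R_{\kappa/\sigma}$ denotes the union of all rows which intersect $\kappa-\sigma$, $C_{\kappa/\sigma}$ denotes the union of all columns which intersect $\kappa-\sigma$, and $h^*_\kappa(s)$, $h_*^\kappa(s)$ etc. are given by (2.10b). (The factor $(-\alpha)^{-r}$ in (4.22) is the one obtained from (4.27) below, since $\alpha^{|\sigma|-|\kappa|} = \alpha^{-r}$; for $N = 1$ it reproduces (4.16c).)

Proof. First consider (4.22). From the generating function (4.4) we see that
\[ \sum_\mu L_\mu^{a-1}(x;\alpha)C_\mu^{(\alpha)}(z) = \prod_{j=1}^N(1-z_j)\sum_\sigma L_\sigma^a(x;\alpha)C_\sigma^{(\alpha)}(z). \tag{4.25} \]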

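Using [30]
\[ \prod_{j=1}^N(1-z_j) = \sum_{r=0}^N\frac{(-1)^r}{r!}J_{(1^r)}^{(\alpha)}(z), \]
the Pieri formula [30]
\[ J_{(1^r)}^{(\alpha)}J_\sigma^{(\alpha)} = r!\sum_{\kappa:\ \kappa/\sigma\ \text{a vertical}\ r\text{-strip}}\frac{j_\sigma}{j_\kappa}\psi_{\kappa/\sigma}(\alpha)J_\kappa^{(\alpha)}, \tag{4.26} \]
and the relationship (2.9), the r.h.s. of (4.25) can be rewritten as
\[ \sum_{r=0}^N\ \sum_{\kappa,\sigma:\ \kappa/\sigma\ \text{a vertical}\ r\text{-strip}}(-1)^r\alpha^{|\sigma|}|\sigma|!\,L_\sigma^a(x;\alpha)\psi_{\kappa/\sigma}(\alpha)\frac{\alpha^{-|\kappa|}}{|\kappa|!}C_\kappa^{(\alpha)}(z). \tag{4.27} \]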

The result now follows by comparing the coefficients of $C_\kappa^{(\alpha)}(z)$ on the l.h.s. of (4.25) with that in (4.27).

Now consider (4.23). Using the formula [30]
\[ \prod_{j=1}^N(1-z_j)^{-1/\alpha} = \sum_{r=0}^\infty\frac{J_{(r)}^{(\alpha)}(z)}{\alpha^rr!} \]
and the Pieri formula (4.26) in conjugate form


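\[ J_{(r)}^{(\alpha)}(z)J_\sigma^{(\alpha)}(z) = r!\,\alpha^r\sum_{\kappa:\ \kappa/\sigma\ \text{a horizontal}\ r\text{-strip}}\frac{j_\sigma}{j_\kappa}\phi_{\kappa/\sigma}(\alpha)J_\kappa^{(\alpha)}(z), \]
together with the formula (2.9), we have
\[ \sum_\mu L_\mu^{a+1/\alpha}(x;\alpha)C_\mu^{(\alpha)}(z) = \prod_{j=1}^N(1-z_j)^{-1/\alpha}\sum_\sigma L_\sigma^a(x;\alpha)C_\sigma^{(\alpha)}(z) = \sum_{r=0}^\infty\ \sum_{\kappa,\sigma:\ \kappa/\sigma\ \text{a horizontal}\ r\text{-strip}}L_\sigma^a(x;\alpha)\alpha^{|\sigma|-|\kappa|}\frac{|\sigma|!}{|\kappa|!}\phi_{\kappa/\sigma}(\alpha)C_\kappa^{(\alpha)}(z). \]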

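The identity (4.23) follows by equating coefficients of $C_\kappa^{(\alpha)}(z)$.

The analogue of (4.17a) takes the form

Proposition 4.9.
\[ E_0^{(x)}L_\kappa^a(x;\alpha) = \sum_{r=1}^{\min(N,|\kappa|)}r(-\alpha)^{-r}\sum_{\sigma:\ \kappa/\sigma\ \text{a vertical}\ r\text{-strip}}\frac{|\sigma|!}{|\kappa|!}\psi_{\kappa/\sigma}(\alpha)L_\sigma^{a+1}(x;\alpha), \]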

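where $\psi_{\kappa/\sigma}(\alpha)$ is given by (4.24a). (The sign factor $(-1)^r$ is the one carried by (4.29) in the proof; for $N = 1$ it reproduces the minus sign in (4.17a).)

Proof. Applying $E_0^{(x)}$ to the generating function (4.4), we have
\begin{align*}
E_0^{(x)}\sum_\kappa L_\kappa^a(x;\alpha)C_\kappa^{(\alpha)}(z)
&= \prod_j(1-z_j)^{-a-q}E_0^{(x)}\,{}_0F_0^{(\alpha)}\Bigl(-x;\frac{z}{1-z}\Bigr) \\
&= -p_1\Bigl(\frac{z}{1-z}\Bigr)\prod_j(1-z_j)^{-a-q}\,{}_0F_0^{(\alpha)}\Bigl(-x;\frac{z}{1-z}\Bigr) \\
&= -p_1\Bigl(\frac{z}{1-z}\Bigr)\prod_j(1-z_j)\sum_\sigma L_\sigma^{a+1}(x;\alpha)C_\sigma^{(\alpha)}(z). \tag{4.28}
\end{align*}
Certainly
\[ p_1\Bigl(\frac{z}{1-z}\Bigr)\prod_j(1-z_j) = \sum_kz_k\prod_{p\ne k}(1-z_p). \]
If we differentiate w.r.t. $t$ the identity
\[ \prod_{i=1}^N(1-z_it) = \sum_{r=0}^N\frac{(-1)^r}{r!}J_{(1^r)}^{(\alpha)}(z)t^r, \]
giving
\[ \sum_{r=1}^N\frac{(-1)^r}{(r-1)!}J_{(1^r)}^{(\alpha)}(z)t^{r-1} = -\sum_kz_k\prod_{p\ne k}(1-z_pt), \]
and set $t = 1$, we obtain
\[ -p_1\Bigl(\frac{z}{1-z}\Bigr)\prod_j(1-z_j) = \sum_{r=1}^N\frac{(-1)^r}{(r-1)!}J_{(1^r)}^{(\alpha)}(z). \tag{4.29} \]
Inserting (4.29) back into (4.28) gives, after some manipulation,
\[ E_0^{(x)}\sum_\kappa L_\kappa^a(x;\alpha)C_\kappa^{(\alpha)}(z) = \sum_{r=1}^N\ \sum_{\kappa,\sigma:\ \kappa/\sigma\ \text{a vertical}\ r\text{-strip}}r(-\alpha)^{-r}\frac{|\sigma|!}{|\kappa|!}\psi_{\kappa/\sigma}(\alpha)L_\sigma^{a+1}(x;\alpha)C_\kappa^{(\alpha)}(z), \]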

which yields the result upon comparison of the coefficients of $C_\kappa^{(\alpha)}(z)$.

4.3. Integration formulas. The classical Laguerre polynomial obeys integration formulas analogous to the integration formulas (3.17) for the classical Hermite polynomial:
\[ \int_0^\infty y^ae^{-y}\bigl(L_k^a(y)\bigr)^2\,dy = \frac{\Gamma(a+1+k)}{k!}, \tag{4.30a} \]
\[ \frac{k!\,e^x}{\Gamma(a+1)}\int_0^\infty y^ae^{-y}\,{}_0F_1(a+1;-xy)L_k^a(y)\,dy = x^k, \tag{4.30b} \]
\[ \frac{e^x}{k!\,\Gamma(a+1)}\int_0^\infty y^ae^{-y}\,{}_0F_1(a+1;-xy)\,y^k\,dy = L_k^a(x). \tag{4.30c} \]
The higher-dimensional analogues of these formulas can be established using the generating functions (4.2), (4.4) in much the same way as the higher-dimensional analogues of (3.17) were established using the generating function (3.2a). To present these results, we will make use of the notation
\[ d\mu^{(L)}(y) := \prod_{j=1}^Ny_j^ae^{-y_j}\prod_{1\le j<k\le N}|y_k-y_j|^{2/\alpha}\,dy_1\dots dy_N. \tag{4.31} \]
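Proposition 4.10. We have
\[ N_\kappa^{(L)} := \int_{[0,\infty)^N}\bigl(L_\kappa^a(x;\alpha)\bigr)^2\,d\mu^{(L)}(x) = N_0^{(L)}\frac{[a+q]_\kappa^{(\alpha)}}{C_\kappa^{(\alpha)}(1^N)|\kappa|!}, \tag{4.32} \]
where
\[ N_0^{(L)} := \int_{[0,\infty)^N}d\mu^{(L)}(x) = \alpha^{(1-N-(N-1)^2/\alpha)}\prod_{j=0}^{N-1}\frac{\Gamma(1+(j+1)/\alpha)\,\Gamma(a+1+j/\alpha)}{\Gamma(1+1/\alpha)}. \tag{4.33} \]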

Proof. Multiplication of both sides of the generating function (4.4) by $L_\kappa^a(x;\alpha)$ and integration with respect to $d\mu^{(L)}(x)$ gives, upon using the orthogonality of $\{L_\kappa^a\}_\kappa$ with respect to the inner product (2.21),
\[ \int_{[0,\infty)^N}\prod(1-z)^{-(a+q)}\,{}_0F_0^{(\alpha)}\Bigl(-x;\frac{z}{1-z}\Bigr)L_\kappa^a(x;\alpha)\,d\mu^{(L)}(x) = N_\kappa^{(L)}C_\kappa^{(\alpha)}(z). \]
Setting $z_1 = \dots = z_N = c$, using (3.16) and changing variables $cx_j/(1-c) =: y_j$, this reads
\[ c^{-N(a+q)}\int_{[0,\infty)^N}e^{-p_1(y)/c}\prod_{j=1}^Ny_j^a\,L_\kappa^a\Bigl(\frac{(1-c)}{c}y;\alpha\Bigr)\prod_{1\le j<k\le N}|y_k-y_j|^{2/\alpha}\prod_{j=1}^Ndy_j = N_\kappa^{(L)}C_\kappa^{(\alpha)}(c^N). \]
The stated result follows by choosing $c = 1$ and noting from (4.14a) that
\[ L_\kappa^a(0;\alpha) = [a+q]_\kappa^{(\alpha)}/|\kappa|!. \tag{4.34} \]

The analogues of the formulas (4.30b) and (4.30c) are deduced from the following integration formula.

Proposition 4.11. We have
\[ \int_{[0,\infty)^N}{}_0F_1^{(\alpha)}(a+q;x;-z_a)\,{}_0F_1^{(\alpha)}(a+q;x;-z_b)\,d\mu^{(L)}(x) = N_0^{(L)}e^{-p_1(z_a)}e^{-p_1(z_b)}\,{}_0F_1^{(\alpha)}(a+q;z_a;z_b). \tag{4.35} \]

Proof. Substitute for $e^{p_1(z_s)}\,{}_0F_1^{(\alpha)}(a+q;x;-z_s)$ ($s = a,b$) using the generating function (4.2) and integrate with respect to $d\mu^{(L)}(x)$ term-by-term. From the orthogonality property of $\{L_\kappa^a\}_\kappa$ with respect to (2.21), only the diagonal terms in the double sum are non-zero, with the integral then being evaluated according to Proposition 4.10. The resulting sum is identified with ${}_0F_1^{(\alpha)}$ according to the definition (4.3b).

Corollary 4.1. We have
\[ \int_{[0,\infty)^N}{}_0F_1^{(\alpha)}(a+q;x;-z_a)L_\kappa^a(x;\alpha)\,d\mu^{(L)}(x) = \frac{N_0^{(L)}}{C_\kappa^{(\alpha)}(1^N)|\kappa|!}e^{-p_1(z_a)}C_\kappa^{(\alpha)}(z_a), \tag{4.36a} \]
\[ \int_{[0,\infty)^N}{}_0F_1^{(\alpha)}(a+q;x;-z_a)C_\kappa^{(\alpha)}(x)\,d\mu^{(L)}(x) = N_0^{(L)}C_\kappa^{(\alpha)}(1^N)|\kappa|!\,e^{-p_1(z_a)}L_\kappa^a(z_a;\alpha). \tag{4.36b} \]

Proof. The integration formula (4.36a) follows from (4.35) after multiplying both sides by $e^{p_1(z_b)}$, using the generating function (4.2) to substitute for $e^{p_1(z_b)}\,{}_0F_1^{(\alpha)}(a+q;x;-z_b)$ and equating coefficients of $C_\kappa^{(\alpha)}(z_b)$ on both sides. The integration formula (4.36b) follows from (4.35) after replacing $z_b$ by $-z_b$, substituting for $e^{p_1(z_b)}\,{}_0F_1^{(\alpha)}(a+q;z_a;-z_b)$ using (4.2) and equating coefficients of $C_\kappa^{(\alpha)}(z_b)$ on both sides.

Analogous to the Hermite case, the integration formula (4.35) can be used to derive the analogue of the classical summation formula valid for $|t|<1$,
\[ \sum_{n=0}^\infty\frac{n!}{(a+1)_n}L_n^a(x)L_n^a(y)t^n = (1-t)^{-a-1}\exp\Bigl(-\frac{t}{1-t}(x+y)\Bigr)\,{}_0F_1\Bigl(a+1;\frac{xyt}{(1-t)^2}\Bigr). \tag{4.37} \]
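The classical summation formula (4.37) (the Hille-Hardy formula) can be tested by truncating both the sum over $n$ and the ${}_0F_1$ series. The sketch below generates $L_n^a$ from the three-term recurrence (4.16a); the truncation lengths and function names are assumptions made for this illustration.

```python
import math


def laguerre_list(nmax, a, x):
    """[L_0^a(x), ..., L_nmax^a(x)] from the recurrence (4.16a) solved for L_{n+1}^a."""
    vals = [1.0, a + 1.0 - x]
    for n in range(1, nmax):
        vals.append(((2 * n + a + 1 - x) * vals[n] - (n + a) * vals[n - 1]) / (n + 1))
    return vals[: nmax + 1]


def hyp0f1(b, u, terms=60):
    """Truncated series for 0F1(b; u) = sum_k u^k / ((b)_k k!)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= u / ((b + k) * (k + 1))
    return total


def hille_hardy_lhs(x, y, t, a, terms=80):
    """Truncated left-hand side of (4.37)."""
    Lx, Ly = laguerre_list(terms, a, x), laguerre_list(terms, a, y)
    total, coeff = 0.0, 1.0          # coeff tracks n!/(a+1)_n * t^n
    for n in range(terms):
        total += coeff * Lx[n] * Ly[n]
        coeff *= (n + 1) / (a + 1 + n) * t
    return total


def hille_hardy_rhs(x, y, t, a):
    """Right-hand side of (4.37)."""
    return (1 - t) ** (-a - 1) * math.exp(-t * (x + y) / (1 - t)) * hyp0f1(a + 1, x * y * t / (1 - t) ** 2)
```

For example, `hille_hardy_lhs(0.7, 1.1, 0.2, 0.5)` and `hille_hardy_rhs(0.7, 1.1, 0.2, 0.5)` should agree closely; more terms are needed as $|t|$ approaches 1.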


Proposition 4.12. For $|t|<1$ we have
\[ G^{(L)}(x,y;t) := \sum_\kappa\frac{L_\kappa^a(x;\alpha)L_\kappa^a(y;\alpha)}{N_\kappa^{(L)}}t^{|\kappa|} = \frac{1}{N_0^{(L)}}(1-t)^{-N(a+q)}\exp\Bigl(-\frac{t}{1-t}\bigl(p_1(x)+p_1(y)\bigr)\Bigr)\,{}_0F_1^{(\alpha)}\Bigl(a+q;\frac{tx}{1-t};\frac{y}{1-t}\Bigr). \]

Proof. This follows by following the procedure used in the Hermite case, Proposition 3.9.

Notice that in the special case $x = 0$ the above summation reduces to elementary functions, giving
\[ G^{(L)}(0,y;t) = \frac{1}{N_0^{(L)}}(1-t)^{-N(a+q)}\exp\Bigl(-\frac{t}{1-t}p_1(y)\Bigr). \tag{4.38} \]
Interpretation of this result in terms of an explicit solution of the Fokker-Planck equation (1.6) with $W$ given by (1.3b) will be discussed in the next section.

Finally, we note (prompted by M. Lassalle) that (4.6a) implies the identity
\[ \frac{(-1)^{|\kappa|}}{|\kappa|!\,C_\kappa^{(\alpha)}(1^N)}\exp\Bigl(-D_1^{(x)}-(a+1)E_0^{(x)}\Bigr)C_\kappa^{(\alpha)}(x) = L_\kappa^a(x;\alpha) \tag{4.39} \]
(the derivation parallels that of (3.21)). Analogous to (3.22), comparison with (4.36b) gives that for any symmetric analytic function $f(x)$,
\[ \frac{e^{p_1(z)}}{N_0^{(L)}}\int_{[0,\infty)^N}{}_0F_1^{(\alpha)}(a+q;x;-z)f(-x)\,d\mu^{(L)}(x) = \exp\Bigl(-D_1^{(z)}-(a+1)E_0^{(z)}\Bigr)f(z), \tag{4.40} \]
and hence by a similar argument as before, if
\[ F(z) = \frac{e^{p_1(z)}}{N_0^{(L)}}\int_{[0,\infty)^N}{}_0F_1^{(\alpha)}(a+q;x;-z)f(-x)\,d\mu^{(L)}(x), \tag{4.41} \]
then
\[ f(z) = \frac{e^{-p_1(z)}}{N_0^{(L)}}\int_{[0,\infty)^N}{}_0F_1^{(\alpha)}(a+q;x;z)F(x)\,d\mu^{(L)}(x) \tag{4.42} \]

(in the case $\alpha = \infty$ (4.41) corresponds to the Hankel transform).

5. Applications

5.1. The ground state global density. As noted in the Introduction, the ground states of the Schrödinger operators (1.2) are, up to normalization, of the form $e^{-\beta W/2}$, where $W$ is given by (1.3). The ground state density in a system of $N+1$ particles, $\rho_{N+1}(x)$ say, is then given by
\[ \rho_{N+1}(x) = \frac{N+1}{Z_{N+1}}\prod_{l=1}^N\int_Idx_l\,e^{-\beta W(x,x_1,\dots,x_N)}, \tag{5.1} \]
where
\[ Z_{N+1} := \prod_{l=1}^{N+1}\int_Idx_l\,e^{-\beta W(x_1,x_2,\dots,x_{N+1})}. \tag{5.2} \]

An alternative interpretation of (5.1) is as the density at the point $x$ in the statistical mechanical system of $N+1$ particles with potential energy $W$ confined to the interval $I$, in equilibrium at inverse temperature $\beta$. In the case $W = W^{(H)}$ as given by (1.3a), a physical argument based on the interpretation of the harmonic term as an electrostatic potential (see e.g. [2]) predicts that for all $\beta$,
\[ \lim_{N\to\infty}\sqrt{\frac{2}{N}}\,\rho_{N+1}(\sqrt{2N}\,x) = \begin{cases}\dfrac{2}{\pi}\sqrt{1-x^2}, & |x|<1,\\[1ex] 0, & |x|\ge1.\end{cases} \tag{5.3} \]
This limit gives the so-called global density, and the result is known as the Wigner semi-circle law.

For $W = W^{(L)}$ the change of variables $y_j = x_j^2$ gives
\[ \rho_{N+1}(y) = \frac{N+1}{Z_{N+1}}e^{-\beta y/2}y^{\beta\mu/2}\prod_{l=1}^N\int_0^\infty dy_l\,|y-y_l|^\beta e^{-\beta y_l/2}y_l^{\beta\mu/2}\prod_{1\le j<k\le N}|y_k-y_j|^\beta, \tag{5.4} \]
where $\mu := a'-1/\beta$ and
\[ Z_{N+1} := \prod_{l=1}^{N+1}\int_0^\infty dy_l\,e^{-\beta y_l/2}y_l^{\beta\mu/2}\prod_{1\le j<k\le N+1}|y_k-y_j|^\beta. \tag{5.5} \]
The same type of electrostatics calculation used to obtain (5.3) predicts that for all $\beta$ and $\mu$,
\[ \lim_{N\to\infty}\rho_{N+1}(4Ny) = \begin{cases}\dfrac{1}{2\pi y^{1/2}}\sqrt{1-y}, & 0<y<1,\\[1ex] 0, & y<0,\ y\ge1.\end{cases} \tag{5.6} \]
Finally, for $W = W^{(J)}$, the change of variable $\sin^2\phi_j = y_j$ gives
\[ \rho_{N+1}(y) = \frac{N+1}{Z_{N+1}}y^{\beta\mu_1/2}(1-y)^{\beta\mu_2/2}\prod_{l=1}^N\int_0^1dy_l\,|y-y_l|^\beta y_l^{\beta\mu_1/2}(1-y_l)^{\beta\mu_2/2}\prod_{1\le j<k\le N}|y_k-y_j|^\beta, \tag{5.7} \]
where $\mu_1 := a'-1/\beta$, $\mu_2 := b'-1/\beta$ and
\[ Z_{N+1} = \prod_{l=1}^{N+1}\int_0^1dy_l\,y_l^{\beta\mu_1/2}(1-y_l)^{\beta\mu_2/2}\prod_{1\le j<k\le N+1}|y_k-y_j|^\beta. \tag{5.8} \]
In this case the electrostatics calculation gives
\[ \lim_{N\to\infty}\frac{1}{N}\rho_{N+1}(y) = \begin{cases}\dfrac{1}{\pi}\dfrac{1}{\sqrt{y(1-y)}}, & 0<y<1,\\[1ex] 0, & \text{otherwise}.\end{cases} \tag{5.9} \]

In this section we will show how for $\beta$ even, when the multidimensional integral factors in the definition of $\rho_{N+1}(x)$ are polynomials in $x$, the density (5.1) in the Hermite, Laguerre and Jacobi cases is related to eigenstates of the operator (1.7) and thus the generalized Hermite, Laguerre and Jacobi polynomials respectively (this result is already implicit in earlier publications [9, 10, 17]). Furthermore, we will show how the global density can be evaluated by using integral representations.

5.2. Relationship between the density and the generalized polynomials. Instead of considering the density directly, we proceed as in [9] and introduce a function $f$ depending on the auxiliary variables $t_1,\dots,t_m$:
\[ f(t_1,\dots,t_m) := \frac{1}{Q_N}\prod_{l=1}^N\int_Idy_l\,e^{-\beta V(y_l)}\prod_{s=1}^m(y_l-t_s)\prod_{1\le j<k\le N}|y_k-y_j|^\beta, \tag{5.10} \]
where the normalization $Q_N$ is chosen so that $f$ equals unity at the origin. For an appropriate choice of $I$ and $V$, (5.10) gives each of the densities in the Hermite, Laguerre and Jacobi cases for $\beta$ even according to the formula
\[ \rho_{N+1}(y) = (N+1)\frac{Q_N}{Z_{N+1}}e^{-\beta V(y)}f(t_1,\dots,t_\beta)\Big|_{t_1=\dots=t_\beta=y}. \tag{5.11} \]
Let us consider each case in turn, starting with the Jacobi case.

Jacobi case. Kaneko [17] has shown that with
\[ I = [0,1], \qquad e^{-\beta V(y)} = y^{\lambda_1}(1-y)^{\lambda_2}, \qquad t := (t_1,\dots,t_m), \qquad \lambda_i = \beta\mu_i/2 \quad (i = 1,2), \tag{5.12} \]
$f := f^{(J)}(\lambda_1,\lambda_2,\beta;t)$ as given by (5.10) is the unique solution of each of the p.d.e.'s
\[ t_p(1-t_p)\frac{\partial^2F}{\partial t_p^2} + \Bigl(c'-\frac{2}{\beta}(m-1)-\Bigl(a'+b'+1-\frac{2}{\beta}(m-1)\Bigr)t_p\Bigr)\frac{\partial F}{\partial t_p} - a'b'F + \frac{2}{\beta}\sum_{\substack{j=1\\ j\ne p}}^m\frac{1}{t_p-t_j}\Bigl(t_p(1-t_p)\frac{\partial F}{\partial t_p}-t_j(1-t_j)\frac{\partial F}{\partial t_j}\Bigr) = 0 \tag{5.13} \]
($p = 1,\dots,m$) with
\[ a' = -N, \qquad m = \beta, \qquad b' = \frac{2}{\beta}(\lambda_1+\lambda_2+m+1)+N-1, \qquad c' = \frac{2}{\beta}(\lambda_1+m). \tag{5.14} \]

Furthermore, Kaneko (see also Yan [33]) has shown that the solution of (5.13), normalized to unity at the origin, is also given by the generalized hypergeometric function ${}_2F_1^{(\beta/2)}(a',b';c';t)$, where
\[ {}_pF_r^{(\beta/2)}(a_1,\dots,a_p;b_1,\dots,b_r;t) := \sum_\kappa\frac{1}{|\kappa|!}\frac{[a_1]_\kappa^{(\beta/2)}\cdots[a_p]_\kappa^{(\beta/2)}}{[b_1]_\kappa^{(\beta/2)}\cdots[b_r]_\kappa^{(\beta/2)}}C_\kappa^{(\beta/2)}(t) \tag{5.15} \]
(cf. (4.3b)), so that
\[ f^{(J)}(\lambda_1,\lambda_2,\beta;t) = {}_2F_1^{(\beta/2)}\Bigl(-N,\frac{2}{\beta}(\lambda_1+\lambda_2+m+1)+N-1;\frac{2}{\beta}(\lambda_1+m);t\Bigr). \tag{5.16} \]
Notice that with $a' = -N$, ${}_2F_1^{(\beta/2)}(a',b';c';t)$ as defined by (5.15) indeed terminates and gives a polynomial.

To see the connection with the generalized Jacobi polynomials, we note that by summing the p.d.e.'s (5.13) an eigenvalue equation results. The eigenoperator is precisely the operator (2.1c) with
\[ N = m, \qquad a = \frac{2}{\beta}(\lambda_1+1)-1, \qquad b = \frac{2}{\beta}(\lambda_2+1)-1. \]
Furthermore, from (5.15) and (5.16) $f^{(J)}$ has a Jack polynomial expansion of the form (2.11c) and from the definition (5.10) of $f$, the leading monomial in the power series expansion of $f$ is $m_{(N^m)}$. We thus have
\[ f^{(J)}(\lambda_1,\lambda_2,\beta;t) = \tilde G_{(N^m)}^{(2(\lambda_1+1)/\beta-1,\,2(\lambda_2+1)/\beta-1)}(t;\beta/2). \tag{5.17} \]

The tilde here denotes that the normalization in the generalized Jacobi polynomial is such that it equals unity at the origin. Comparison of (5.16) and (5.17) gives an equality between G˜ and 2 F1 therein. The formula (5.11) for ρN +1 (y) also requires the value of QN and ZN +1 . Both quantities are examples of the Selberg integral, ÿN Z ! Y Y 1 λ1 λ2 |tk − tj |2λ dtl tl (1 − tl ) SN (λ1 , λ2 , λ) := l=1

=

N −1 Y j=0

0

1≤j
0(λ1 + 1 + jλ)0(λ2 + 1 + jλ)0(1 + (j + 1)λ) . 0(λ1 + λ2 + 2 + (N + j − 1)λ)0(1 + λ)

(5.18)
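The gamma-function product in (5.18) is easy to sanity-check in small cases. The following is a minimal sketch, not part of the original derivation: the function names `selberg` and `beta_integral` are hypothetical, and only two elementary cases are tested. For $N = 1$ the Selberg integral reduces to the Euler beta integral, and for $N = 2$, $\lambda_1 = \lambda_2 = 0$, $\lambda = 1$ it reduces to the double integral of $(t_2 - t_1)^2$ over the unit square, which equals $1/6$.

```python
from math import gamma, isclose

def selberg(n, lam1, lam2, lam):
    """Gamma-function product on the right-hand side of (5.18) (hypothetical helper)."""
    value = 1.0
    for j in range(n):
        num = (gamma(lam1 + 1 + j * lam)
               * gamma(lam2 + 1 + j * lam)
               * gamma(1 + (j + 1) * lam))
        den = gamma(lam1 + lam2 + 2 + (n + j - 1) * lam) * gamma(1 + lam)
        value *= num / den
    return value

def beta_integral(p, q):
    """Euler beta integral B(p, q) = int_0^1 t^(p-1) (1-t)^(q-1) dt."""
    return gamma(p) * gamma(q) / gamma(p + q)

# N = 1: the product must agree with B(lam1 + 1, lam2 + 1) for any lam.
s1 = selberg(1, 0.5, 1.5, 2.0)
# N = 2 with lam1 = lam2 = 0, lam = 1: int int (t2 - t1)^2 dt1 dt2 over [0,1]^2.
s2 = selberg(2, 0.0, 0.0, 1.0)
print(isclose(s1, beta_integral(1.5, 2.5)), isclose(s2, 1.0 / 6.0))
```

Both comparisons use closed forms that can be computed by hand, so the check is independent of the generalized classical polynomials.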

We have

$$Z_{N+1} = S_{N+1}(\beta\mu_1/2, \beta\mu_2/2, \beta/2), \qquad Q_N = S_N(\beta\mu_1/2+\beta N, \beta\mu_2/2, \beta/2). \tag{5.19}$$

Substituting (5.17) and (5.19) in (5.11) gives

$$\rho_{N+1}(y) = (N+1)\,\frac{S_N(\beta\mu_1/2+\beta N,\beta\mu_2/2,\beta/2)}{S_{N+1}(\beta\mu_1/2,\beta\mu_2/2,\beta/2)}\,y^{\beta\mu_1/2}(1-y)^{\beta\mu_2/2}\;\tilde G_{(N^\beta)}^{(2/\beta+\mu_1-1,\ 2/\beta+\mu_2-1)}(t_1,\dots,t_\beta;\beta/2)\Big|_{t_1=\dots=t_\beta=y}. \tag{5.20}$$

Our ability to compute the global limit relies on an integral representation of ${}_2F_1^{(1/\lambda)}$ (different of course from (5.10)) and thus, after equating (5.16) and (5.17), of $\tilde G_{(N^m)}$. This integral representation can be derived from the integral representation of the generalized hypergeometric function [33]

$${}_2F_1^{(1/\lambda)}\big(a,\ \lambda(m-1)+\nu_1+1;\ 2\lambda(m-1)+\nu_1+\nu_2+2;\ t\big) = \frac{1}{S_m(\nu_1,\nu_2,\lambda)}\int_{[0,1]^m}dx_1\dots dx_m\ {}_1F_0^{(1/\lambda)}(a;t;x)\,D_{\nu_1,\nu_2,\lambda}(x) \tag{5.21}$$

with ${}_1F_0^{(1/\lambda)}$ given by (4.3b), and

$$D_{\nu_1,\nu_2,\lambda}(x) := \prod_{j=1}^{m}x_j^{\nu_1}(1-x_j)^{\nu_2}\prod_{1\le j<k\le m}|x_k-x_j|^{2\lambda}. \tag{5.22}$$

Since $\tilde G$ in (5.20) is equal to ${}_2F_1^{(\beta/2)}$ in (5.16) with $m = \beta$, we must set

$$a = -N, \quad \lambda = 2/\beta, \quad \nu_1 = \frac{4}{\beta}+\mu_1+\mu_2+N-2, \quad \nu_2 = -2-\mu_2-N \tag{5.23}$$

in (5.21). We note that $\nu_2$ is negative so that (5.21) is not defined as written. However we can readily analytically continue the integral (5.21) so that it is valid for $\nu_2$ negative by following the procedure detailed in [10]. Thus we deform the contours $[0,1]^m$ to the contours $C^m$, where $C$ is any simple closed contour which starts at the origin and encircles the point $x = 1$ (this is first done under the assumption that $\nu_2$ is not an integer, and $\lambda$ is an integer; it is extended to all $\nu_2$ by analytic continuation and to all $\lambda$ by noting that the r.h.s. is analytic in $\lambda$ when it is defined, while the l.h.s. is a rational function of $\lambda$ in the case of interest ($a = -N$)). Furthermore, we have the formula [17]

$${}_1F_0^{(1/\lambda)}(a;t;x)\Big|_{t_1=\dots=t_\beta=c} = \prod_{l=1}^{m}(1-cx_l)^{-a}. \tag{5.24}$$

Thus we have

$$\tilde G_{(N^\beta)}^{(2/\beta+\mu_1-1,\ 2/\beta+\mu_2-1)}(t_1,\dots,t_\beta;\beta/2)\Big|_{t_1=\dots=t_\beta=y} = \frac{1}{C}\int_{C^\beta}dx_1\dots dx_\beta\prod_{l=1}^{\beta}(1-yx_l)^{N}\,x_l^{4/\beta+\mu_1+\mu_2+N-2}(1-x_l)^{-2-\mu_2-N}\prod_{1\le j<k\le\beta}|x_k-x_j|^{4/\beta}, \tag{5.25}$$

where the constant $C$ is chosen so that at $t = 0$, $\tilde G$ is unity. This is our desired integral representation.

Laguerre case. Let (5.10) with Laguerre weight $e^{-\beta V(y)} = e^{-\beta y/2}y^{\beta\mu/2}$ and integration interval $I = [0,\infty)$ be denoted $f = f^{(L)}(\mu,\beta;t)$. Comparison with the definition of $f^{(J)}$ shows

$$\lim_{L\to\infty}\Big(\frac{1}{L}\Big)^{mN}f^{(J)}(\beta\mu/2,\beta L/2,\beta;t/L) = f^{(L)}(\beta\mu/2,\beta;t). \tag{5.26}$$

Substituting (5.16) and (5.17) for $f^{(J)}$ and using (2.19) and the fact that $\lim_{b\to\infty}{}_2F_1^{(\alpha)}(a,b;c;x/b) = {}_1F_1^{(\alpha)}(a;c;x)$, we thus have

$$f^{(L)}(\beta\mu/2;t) = \tilde L_{(N^m)}^{\mu-1+2/\beta}(t;\beta/2) = {}_1F_1^{(\beta/2)}(-N;\mu+2;t), \tag{5.27}$$

where $\tilde L$ denotes the generalized Laguerre polynomial normalized to unity at the origin. (The equality between $f^{(L)}$ and ${}_1F_1^{(\beta/2)}$ has previously been given in [10] and the equality between $\tilde L_{(N^m)}^{\mu-1+2/\beta}$ and ${}_1F_1^{(\beta/2)}$ has been noted in [23].) Furthermore from the working in [10] we have

$${}_1F_1^{(\beta/2)}(-N;\mu+2;t_1,\dots,t_\beta)\Big|_{t_1=\dots=t_\beta=y} = \frac{1}{C'}\int_{(C')^\beta}dx_1\cdots dx_\beta\prod_{j=1}^{\beta}e^{yx_j}\,x_j^{-N-3+2/\beta}(1-x_j)^{\mu+N+2/\beta-1}\prod_{1\le j<k\le\beta}|x_k-x_j|^{4/\beta}. \tag{5.28}$$

Hermite case. Starting with $I$ and $e^{-\beta V(y)}$ given by (5.12), the Hermite case $I = (-\infty,\infty)$ and $e^{-\beta V(y)} = e^{-\beta y^2/2}$ can be obtained by the change of variables and limiting procedure

$$y_j \mapsto \frac12\Big(1-\frac{y_j}{L}\Big), \quad t_j \mapsto \frac12\Big(1-\frac{t_j}{L}\Big), \quad \beta\mu_1/2 = \beta\mu_2/2 = \beta L^2/2, \quad L\to\infty. \tag{5.29}$$

Hence from (2.18) and (5.17) with $\lambda_1 = \lambda_2 = \beta L^2/2$ we see that in the Hermite case, $f^{(H)}$ is proportional to $H_{(N^m)}(t_1,\dots,t_m;\beta/2)$ (the fact that $f^{(H)}$ is an eigenfunction of (2.1a) with eigenvalue $-2N$ was shown in [9]). Thus, from (5.1), if we denote by $\bar H_\kappa$ the generalized Hermite polynomial normalized so that the coefficient of the leading monomial $m_\kappa$ is unity, we have

$$\rho_{N+1}(x) = (N+1)\frac{Z_N}{Z_{N+1}}\,e^{-\beta x^2/2}\,\bar H_{(N^\beta)}(t_1,\dots,t_\beta;\beta/2)\Big|_{t_1=\dots=t_\beta=x}, \tag{5.30a}$$

where

$$Z_N = \prod_{l=1}^{N}\int_{-\infty}^{\infty}d\lambda_l\,\exp\Big(-\frac{\beta}{2}\sum_{j=1}^{N}\lambda_j^2\Big)\prod_{1\le j<k\le N}|\lambda_k-\lambda_j|^{\beta} = \beta^{-N/2-N\beta(N-1)/4}(2\pi)^{N/2}\prod_{j=0}^{N-1}\frac{\Gamma(1+\beta(j+1)/2)}{\Gamma(1+\beta/2)} \tag{5.30b}$$

(compare (5.30b) with $N_0^{(H)}$ in Proposition 3.7). To obtain a form of $\bar H_{(N^\beta)}$ suitable for asymptotic analysis, we make use of the integral representation Corollary 3.2 of $H_\kappa$. In the case of interest ($\kappa = (N^\beta)$, $t_1 = \dots = t_\beta = x$) we have

$${}_0F_0^{(2/\beta)}(2y_1,\dots,2y_\beta;-iz_1,\dots,-iz_\beta)\Big|_{z_1=\dots=z_\beta=x} = \prod_{j=1}^{\beta}e^{-2ixy_j}$$

and

$$C_{(N^\beta)}^{(2/\beta)}(iy_1,\dots,iy_\beta) = \prod_{j=1}^{\beta}(iy_j)^N,$$

so we can complete the square in the integrand of the formula of Corollary 3.2 and change variables to obtain

$$\bar H_{(N^\beta)}(t_1,\dots,t_\beta;\beta/2)\Big|_{t_1=\dots=t_\beta=x} = \frac{1}{V_\beta}\int_{\mathbb{R}^\beta}du_1\dots du_\beta\prod_{j=1}^{\beta}(iu_j+x)^N e^{-u_j^2}\prod_{1\le j<k\le\beta}|u_k-u_j|^{4/\beta}, \tag{5.31}$$

where

$$V_m := N_0^{(H)}\ (N = m,\ \alpha = \beta/2). \tag{5.32}$$
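As a quick plausibility check of the closed form (5.30b) for $Z_N$, the sketch below evaluates it and compares against two cases that can be integrated by hand: $N = 1$, where $Z_1 = \int e^{-\beta\lambda^2/2}\,d\lambda = \sqrt{2\pi/\beta}$, and $N = 2$, $\beta = 2$, where the double Gaussian integral of $(\lambda_2-\lambda_1)^2$ equals $\pi$. It also confirms the exact ratio $Z_N/Z_{N+1}$ quoted in the first equality of (5.39). The function name `z_n` is hypothetical.

```python
from math import gamma, pi, sqrt, isclose

def z_n(n, beta):
    """Closed form (5.30b) for the Hermite-case normalization Z_N (hypothetical helper)."""
    prefactor = beta ** (-n / 2 - n * beta * (n - 1) / 4) * (2 * pi) ** (n / 2)
    product = 1.0
    for j in range(n):
        product *= gamma(1 + beta * (j + 1) / 2) / gamma(1 + beta / 2)
    return prefactor * product

# N = 1: a single Gaussian integral.
check_n1 = isclose(z_n(1, 3.0), sqrt(2 * pi / 3.0))
# N = 2, beta = 2: int int exp(-x^2 - y^2) (x - y)^2 dx dy = pi.
check_n2 = isclose(z_n(2, 2.0), pi)
# Exact ratio Z_N / Z_{N+1}, first equality in (5.39), for N = 3 and beta = 4.
n, beta = 3, 4.0
ratio_closed = (beta ** ((1 + beta * n) / 2) * (2 * pi) ** -0.5
                * gamma(1 + beta / 2) / gamma(1 + (n + 1) * beta / 2))
check_ratio = isclose(z_n(n, beta) / z_n(n + 1, beta), ratio_closed)
print(check_n1, check_n2, check_ratio)
```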

(It is also possible to derive (5.31) by performing the limiting procedure (5.29) in the integral representation (5.25).) Analogous to the situation in (5.25) and (5.28), we note that each integration path along the real line can be deformed to the path $C''$, where $C''$ is a simple contour which starts at $-\infty$ and ends at $\infty$ (this is true for $2/\beta \in \mathbb{Z}_{\ge 0}$ by Cauchy's theorem; it then follows for all values of $2/\beta$ that the r.h.s. is defined by noting that the r.h.s. is then analytic in $2/\beta$ while the l.h.s. is a rational function in this variable).

5.3. The global density limit. Using the integral representations (5.25), (5.28) and (5.31), the global density limits in (5.3), (5.6) and (5.9) can be computed for all $\beta$ even. The method used in each case is to deform the contours so that they pass through the saddle points (for each integration variable there are two saddle points), and to expand the integrand in the neighbourhood of these points. Due to the similarities of the three calculations, we will give the details in the Hermite case only.

Changing variables $u_l \mapsto \sqrt{2N}\,u_l$ in the integral representation (5.31), and substituting the result in (5.30a) gives

$$\rho_{N+1}(\sqrt{2N}x) = (N+1)\frac{Z_N}{Z_{N+1}V_\beta}(2N)^{(\beta N+3\beta-2)/2}e^{-\beta Nx^2}\int_{\mathbb{R}^\beta}du_1\dots du_\beta\prod_{l=1}^{\beta}e^{-2Nu_l^2}(iu_l+x)^N\prod_{1\le j<k\le\beta}|u_k-u_j|^{4/\beta}. \tag{5.33}$$

For each integration variable $u_l$ the $N$-dependent terms in the integrand are

$$e^{-2Nu_l^2}(iu_l+x)^N = e^{-2Nu_l^2+N\log(iu_l+x)}.$$

The exponent has a stationary point when

$$u_l = u_\pm := \frac{ix}{2}\pm\frac12(1-x^2)^{1/2}, \tag{5.34}$$

so according to the saddle point method of asymptotic analysis we should deform each of the contours of integration in (5.33) to pass through $u_+$ and $u_-$. With the contours of integration so deformed, we must expand the integrand in the neighbourhood of the saddle points. Due to the factor $\prod_{1\le j<k\le\beta}|u_k-u_j|^{4/\beta}$ ...

$$e^{-2Nu_l^2+N\log(iu_l+x)} \sim \exp\Big[-2Nu_\pm^2+N\log(iu_\pm+x)-\frac12(u_l-u_\pm)^2\Big(4N-\frac{N}{(iu_\pm+x)^2}\Big)\Big], \tag{5.35}$$

where on the r.h.s. $u_+$ is to be taken for $j = 1,\dots,\beta/2$, while $u_-$ is to be taken for $j = \beta/2+1,\dots,\beta$. Also

$$\prod_{1\le j<k\le\beta}|u_k-u_j|^{4/\beta} \sim |u_+-u_-|^{\beta}\prod_{1\le j<k\le\beta/2}|u_k-u_j|^{4/\beta}\prod_{\beta/2+1\le j<k\le\beta}|u_k-u_j|^{4/\beta}. \tag{5.36}$$

Thus after substituting (5.35) and (5.36) in (5.33) we obtain

$$\rho_{N+1}(\sqrt{2N}x) \sim N\binom{\beta}{\beta/2}\frac{Z_N}{Z_{N+1}V_\beta}(2N)^{(\beta N+3\beta-2)/2}e^{-\beta Nx^2}e^{-N\beta(u_+^2+u_-^2)+(N\beta/2)\log|iu_++x|^2}|u_+-u_-|^{\beta}$$
$$\times\ \Big|\int_{\mathbb{R}^{\beta/2}}du_1\cdots du_{\beta/2}\prod_{l=1}^{\beta/2}\exp\Big[-u_l^2\Big(2N-\frac{N}{2(iu_++x)^2}\Big)\Big]\prod_{1\le j<k\le\beta/2}|u_k-u_j|^{4/\beta}\Big|^2. \tag{5.37}$$

To simplify (5.37) note that a simple change of variables gives that the last line is equal to

$$\Big(\frac{1}{\big|2N-\frac{N}{2(iu_++x)^2}\big|}\Big)^{\beta-1}(V_{\beta/2})^2,$$

where $V_{\beta/2}$ is defined by (5.32). Now suppose $|x| < 1$ so that $u_-^* = -u_+$. Using (5.34) we then have

$$|u_+-u_-| = (1-x^2)^{1/2}, \quad \Big|2N-\frac{N}{2(iu_++x)^2}\Big| = 4N(1-x^2)^{1/2}, \quad u_+^2+u_-^2 = \frac12-x^2, \quad |iu_++x| = \frac12.$$

Making these substitutions in (5.37) shows that

$$\rho_{N+1}(\sqrt{2N}x) \sim (1-x^2)^{1/2}\,N\binom{\beta}{\beta/2}\frac{Z_N(V_{\beta/2})^2}{Z_{N+1}V_\beta}\,e^{-N\beta/2}\,2^{-N\beta/2-\beta/2+1}\,N^{\beta(N+1)/2}. \tag{5.38}$$

To simplify the $x$-independent terms in (5.38) we note from the specific formula (5.30b) and Stirling's formula that

$$\frac{Z_N}{Z_{N+1}} = \beta^{(1+\beta N)/2}(2\pi)^{-1/2}\frac{\Gamma(1+\beta/2)}{\Gamma(1+(N+1)\beta/2)} \sim \frac{\Gamma(\beta/2+1)}{\pi}\,2^{\beta N/2-1/2}\,N^{-(\beta/2)(N+1)-1/2}\,(\beta/2)^{-\beta/2}\,e^{N\beta/2}. \tag{5.39}$$

Also, from (5.32) and straightforward manipulation of the explicit formula in Proposition 3.7 we have

$$\frac{(V_{\beta/2})^2}{V_\beta} = 2^{\beta/2}(\beta/2)^{\beta/2}\frac{\Gamma(1+\beta/2)}{\Gamma(1+\beta)}. \tag{5.40}$$

Substituting (5.39) and (5.40) in (5.38) gives

$$\rho_{N+1}(\sqrt{2N}x) \sim \frac{\sqrt{2N}}{\pi}(1-x^2)^{1/2}, \qquad |x|\le 1, \tag{5.41}$$

which is precisely the formula (5.3) for $|x| \le 1$.
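The elementary identities behind the step from (5.37) to (5.38), and the total mass of the limiting density (5.41), can be confirmed numerically. The sketch below is illustrative only: the helper names are hypothetical, the saddle-point checks are done with the overall factor $N$ divided out of the exponent, and the mass is computed with a simple midpoint rule rather than any method used in the paper.

```python
import cmath
import math

def saddle_points(x):
    """Stationary points u_+ and u_- of -2u^2 + log(iu + x), as in (5.34)."""
    s = cmath.sqrt(1 - x * x)
    return 0.5j * x + 0.5 * s, 0.5j * x - 0.5 * s

def exponent_derivative(u, x):
    """Derivative in u of -2u^2 + log(iu + x), i.e. the exponent divided by N."""
    return -4 * u + 1j / (1j * u + x)

def semicircle_mass(n, steps=20_000):
    """Midpoint-rule integral of (5.41) over |x| <= 1 in the variable y = sqrt(2N) x."""
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        x = -1.0 + (k + 0.5) * h
        total += math.sqrt(2 * n) / math.pi * math.sqrt(1 - x * x)
    return total * h * math.sqrt(2 * n)  # dy = sqrt(2N) dx

x = 0.3
u_plus, u_minus = saddle_points(x)
# Both derivatives vanish to rounding error at the saddle points.
print(abs(exponent_derivative(u_plus, x)), abs(exponent_derivative(u_minus, x)))
# |u_+ - u_-| = (1 - x^2)^(1/2) and |i u_+ + x| = 1/2.
print(abs(u_plus - u_minus), abs(1j * u_plus + x))
# Total mass of the limiting density is close to N.
print(semicircle_mass(10))
```

The mass check mirrors the normalization argument used for $|x| > 1$: since (5.41) already accounts for all $N$ particles, nothing is left over outside $[-1, 1]$ at leading order.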


For the intervals $|x| > 1$, instead of repeating the working of the expansion about the saddle points (which are both pure imaginary in this case), we note from the result (5.41) that

$$\int_{-1}^{1}\rho(\sqrt{2N}x)\,d(\sqrt{2N}x) \sim \frac{2N}{\pi}\int_{-1}^{1}(1-x^2)^{1/2}\,dx = N.$$

But from the definition of the density it is non-negative and satisfies the normalization

$$\int_{-\infty}^{\infty}\rho(\sqrt{2N}x)\,d(\sqrt{2N}x) = N.$$

Thus we must have

$$\frac{\rho(\sqrt{2N}x)}{\sqrt{2N}}\to 0 \quad\text{for}\quad |x| > 1,$$

as predicted by (5.3).

5.4. Initial value problems. The summations $G^{(H)}(w,z;t)$ in Proposition 3.9 and $G^{(L)}(x,y;t)$ in Proposition 4.12 are essentially the Green functions for the solution of the Fokker-Planck equation (1.6) with $W$ given by (1.3a) and (1.3b) respectively. To see this, we first recall that $P = \tilde G(x^{(0)}|x;\tau)$ is the Green function solution of (1.6) if it is the solution which satisfies the initial condition

$$P(x;\tau)\Big|_{\tau=0} = \prod_{l=1}^{N}\delta(x_l-x_l^{(0)}), \qquad x_1^{(0)}<\dots<x_N^{(0)}.$$

By applying the transformation (1.4) the Fokker-Planck equation can be written as the Schrödinger equation (1.5), where $t = \tau/i\beta$. In general the Green function solution of the Schrödinger equation, $G(x^{(0)}|x;t)$ say, may be written in terms of the eigenvalues and eigenfunctions of $H$. Thus suppose $\{\psi_\kappa\}_\kappa$ is a complete set of orthogonal eigenfunctions of $H$ with corresponding eigenvalues $\{E_\kappa\}_\kappa$. Then the method of separation of variables gives

$$G(x^{(0)}|x;t) = \sum_\kappa\frac{\psi_\kappa(x^{(0)})\psi_\kappa(x)}{N_\kappa}\,e^{-itE_\kappa}.$$

Thus for the Fokker-Planck equation

$$\tilde G(x^{(0)}|x;\tau) = e^{\tau E_0/\beta}\,\frac{\psi_0(x)}{\psi_0(x^{(0)})}\,G(x^{(0)}|x;\tau/i\beta). \tag{5.42}$$

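Now, for the Schrödinger operators (1.2a), (1.2b)

$$\psi_\kappa^{(H)}(x) = \psi_0^{(H)}(x)\,H_\kappa(x/\sqrt{\alpha};\alpha), \qquad E_\kappa^{(H)} = E_0^{(H)}+\frac{2}{\alpha}|\kappa|, \tag{5.43a}$$

$$\psi_\kappa^{(L)}(x) = \psi_0^{(L)}(x)\,L_\kappa^{(a'/\alpha-1/2)}(x^2/\alpha;\alpha), \qquad E_\kappa^{(L)} = E_0^{(L)}+\frac{4}{\alpha}|\kappa|, \tag{5.43b}$$

where

$$\psi_0^{(H)}(x) := \prod_{j=1}^{N}e^{-x_j^2/2\alpha}\prod_{1\le j<k\le N}|x_k-x_j|^{1/\alpha}, \qquad \psi_0^{(L)}(x) := \prod_{j=1}^{N}x_j^{a'/\alpha}e^{-x_j^2/2\alpha}\prod_{1\le j<k\le N}|x_k^2-x_j^2|^{1/\alpha}.$$

Substituting (5.43a) and (5.43b) into (5.42) and comparing with the definitions of $G^{(H)}(w,z;t)$ and $G^{(L)}(x,y;t)$ shows that

$$\tilde G^{(H)}(x^{(0)}|x;\tau) = \alpha^{-Nq/2}\big(\psi_0^{(H)}(x)\big)^2\,G^{(H)}(x^{(0)}/\sqrt{\alpha},\,x/\sqrt{\alpha};\,e^{-\tau}), \tag{5.44a}$$

$$\tilde G^{(L)}(x^{(0)}|x;\tau) = \alpha^{-N(a'/\alpha-1/2+q)}\big(\psi_0^{(L)}(x)\big)^2\,G^{(L)}((x^{(0)})^2/\alpha,\,x^2/\alpha;\,e^{-2\tau})\Big|_{a=(a'/\alpha-1/2)}. \tag{5.44b}$$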

From (3.20) and (4.38) we see from (5.44) that for some initial conditions it is possible to express $\tilde G^{(H)}$ and $\tilde G^{(L)}$ in terms of elementary functions. Thus for $x^{(0)} = c$ (i.e. $x_1^{(0)} = \dots = x_N^{(0)} = c$) in the Hermite case, from (3.20) and (5.44a) we have

$$\tilde G^{(H)}(x^{(0)}|x;\tau)\Big|_{x^{(0)}=c} = \frac{1}{N_0^{(H)}}\big(\alpha(1-e^{-2\tau})\big)^{-Nq/2}\exp\Big(-\frac{1}{\alpha(1-e^{-2\tau})}\sum_{j=1}^{N}(x_j-e^{-\tau}c)^2\Big)\prod_{1\le j<k\le N}|x_k-x_j|^{2/\alpha}, \tag{5.45a}$$

while for $x^{(0)} = 0$ in the Laguerre case, (4.38) and (5.44b) give

$$\tilde G^{(L)}(x^{(0)}|x;\tau)\Big|_{x^{(0)}=0} = \frac{1}{N_0^{(L)}\big|_{a=a'/\alpha-1/2}}\big(\alpha(1-e^{-2\tau})\big)^{-N(a'/\alpha-1/2+q)}\prod_{j=1}^{N}x_j^{2a'/\alpha}\exp\Big(-\frac{1}{\alpha(1-e^{-2\tau})}\sum_{j=1}^{N}x_j^2\Big)\prod_{1\le j<k\le N}|x_k^2-x_j^2|^{2/\alpha}. \tag{5.45b}$$
Now that these explicit solutions have been revealed, they can be verified independent of the theory of generalized classical polynomials, by direct substitution into (1.6) with the appropriate $W$. Another consequence of (5.44) is that it implies the asymptotic small-$\tau$ behaviour of

$${}_0F_0^{(\alpha)}\Big(\frac{x}{\tau^{1/2}};\frac{x^{(0)}}{\tau^{1/2}}\Big) \quad\text{and}\quad {}_0F_1^{(\alpha)}\Big(a+q;\frac{x^2}{2\tau};\frac{(x^{(0)})^2}{2\tau}\Big).$$

Thus in general, as $\tau\to 0$ the asymptotic solution of the Schrödinger equation (1.5) (with $t = \tau/i\beta$) is given by

$$G(x^{(0)}|x;\tau/i\beta) \sim \Big(\frac{\beta}{4\pi\tau}\Big)^{N/2}\prod_{j=1}^{N}e^{-\beta(x_j-x_j^{(0)})^2/4\tau}.$$


Substituting this in (5.42), substituting the result in (5.44) and using Propositions 3.9 and 4.12 gives (0) N Y (0) π −N/2 2N (N −1)/2α N0(H) x x (α) F ; exj xj /τ ∼ 0 0 1/2 1/2 1/α Q τ τ (0) (0) j=1 1≤j
N Y

−(a+1/2) (xj x(0) j /τ )

j=1

N Y

(0)

exj xj

/τ

(5.47)

j=1

where it is assumed $x_1 < \dots < x_N$ and $x_1^{(0)} < \dots < x_N^{(0)}$. In the case $\alpha = 2$ these asymptotic formulas are known in the mathematical statistics literature (see e.g. [26]).

6. A Brief Literature Survey To our knowledge, the generalized classical polynomials were first introduced by Herz [12] in the case α = 2 via integral formulas over measures associated with spaces of orthogonal matrices (however it should be noted that what Herz calls generalized Hermite polynomials do not correspond to the generalized Hermite polynomial we have considered). Constantine and Muirhead extended the work of Herz on the generalized Laguerre polynomials in the case α = 2, and derived the formulas (4.2) [27, ex. 7.19], (4.4) [27, Thm. 7.6.3], (4.12) [27, ex. 7.20], (4.14a) [27, eq. 7.6(4)], (4.32) [27, Thm. 7.6.5], (4.36) [27, Thm. 7.6.4] and Proposition 4.12 [27, ex. 7.21] in that case (in comparing formulas it should be noted that Muirhead adopts the normalization cκκ = (−1)|κ| /Cκ(α) (1N ) which is |κ|! times the normalization we have used). For general α and N = 2 Yan [34] derived (4.4) [34, eq. (5.6)] (4.14a) [34, last eq. p. 251], (4.32) [34, eq. (5.11)] and (4.36b) [34, eq. (5.13)] (Yan uses the same normalization as Muirhead). For general α Lassalle [19] has reported the results (4.14) and (4.32), and simultaneous to our investigations has obtained the results (4.2) and (4.39) - (4.42) (Lassalle uses the normalization Laκ (0; α) = 1). For values of α corresponding to Jordan algebras, generalized Laguerre polynomials have been studied by Dib [5]. In an unpublished handwritten manuscript Macdonald [24] has derived some properties of the generalized classical polynomials. His results for the generalized Laguerre polynomials, which overlap with the same equations of ours as does the work of Yan, are typically proved for α = 1/2, 1 and 2, and are conjectured to remain valid for general α. The validity of a number of the results in [24] for general α rely on a conjecture for the so called generalized Laplace transform of the Jack polynomial: Z N Y Y (α) (α) F (−x; y)C (x) xaj |xk − xj |2/α dx1 . . . dxN κ 0 0 [0,∞)N

j=1

1≤j
(α) N = [a + q](α) κ Cκ (1 )

N Y j=1

1 yl−(a+q) Cκ(α) ( ). y

(6.1)


This conjecture can be proved using results contained herein. First we calculate the generalized Laplace transform of the generalized Laguerre polynomial: Z [0,∞)N

(α) a 0 F0 (−x; y)Lσ (x; α)

N Y

Y

xaj

j=1

|xk − xj |2/α dx1 . . . dxN

1≤j
=

Nσ(L)

N Y j=1

1 yl−(a+q) Cσ(α) (1 − ), y

(6.2)

which follows from the first equation of the proof of Proposition 4.10 after noting from (4.15) that N Y z 1 ) = 0 F0(α) (−x; ), e−xj 0 F0(α) (−x; 1−z 1−z j=1

and writing 1/(1 − z) =: y. We now use (4.14b) and multiply (6.2) by a suitable σ– dependent factor so that after summing over σ we can replace Laσ (x; α) on the l.h.s. by Cκ(α) (x). On the r.h.s. we then have N Y j=1

(α) N yl−(a+q) [a + q](α) κ Cκ (1 )

X κ (−1)|σ| Nσ(L) Cσ (1 − y1 ) σ

σ⊆κ

[a + q](α) σ

.

(6.3)

Substituting the value of $N_\sigma^{(L)}$ from (4.32) and using (2.14) to compute the sum gives (6.1) as required.

The generalized Hermite polynomials of the type considered in this paper appear to have been first considered by James [14] in the case $\alpha = 2$. Subsequently, for general $\alpha$ Lassalle [21] noted the orthogonality with respect to the measure (2.20), the normalization of Proposition 3.7 and the property of Proposition 3.3. Furthermore, in handwritten notes Lassalle [19] has established Proposition 3.1 and has stated Corollaries 3.1 and 3.2, (3.21), (3.22) and (3.25). Also given in the notes is an explicit formula for the coefficients $c^{(H)}_{\mu\kappa}$ in (2.11a). Macdonald [24] has also considered properties of the generalized Hermite polynomials in the form of conjectures based on derivations in the cases $\alpha = 1/2$, 1 and 2. He has obtained the normalization of Proposition 3.7, the property of Proposition 3.3, the integration formula of Proposition 3.8 and the generating function of Proposition 3.1.

M. Lassalle has pointed out to us that the exponential operator formulas (3.21) and (4.39) imply an intimate connection between the theory of generalized Hermite and Laguerre polynomials and the theory developed by Dunkl [6, 7]. Inspection of these works shows that this is indeed so. The Hermite case is the most straightforward, which in the language of [6, 7] corresponds to the root system $A_N$. Dunkl introduces the operators

$$T_i := \frac{\partial}{\partial x_i}+\frac{1}{\alpha}\sum_{\substack{j=1\\ j\neq i}}^{N}\frac{1-M_{ij}}{x_i-x_j},$$

where $M_{ij}$ is the operator which exchanges coordinates $x_i$ and $x_j$. When acting on functions symmetric in $x_1,\dots,x_N$ these operators are related to $D_0$ (recall (2.12)) by

$$D_0 = \sum_{i=1}^{N}T_i^2.$$


Also introduced is the pairing $[p,q]_H$. For polynomials $p$ and $q$ homogeneous of the same degree

$$[p,q]_H := p(T^x)\,q(x), \tag{6.4}$$

where $p(T^x)$ means that each variable $x_i$ in $p$ is replaced by $T_i$ (the ordering within the monomials does not matter since the operators $\{T_i\}$ commute), while if the degrees differ $[p,q]_H := 0$. This pairing is intimately related to the exponential operator in (3.21). Thus, as noted in [19], it follows from [6, Thm. 3.10] that for homogeneous symmetric polynomials $p$ and $q$ of degree $|\kappa|$,

$$[p,q]_H = \frac{2^{|\kappa|}}{N_0^{(H)}}\int_{(-\infty,\infty)^N}\big(e^{-D_0/4}p\big)\big(e^{-D_0/4}q\big)\,d\mu^{(H)}(x).$$

From (3.21) and the orthogonality of $\{H_\kappa\}$ with respect to the inner product (2.20) we immediately have the result that the Jack polynomials are orthogonal with respect to the pairing (6.4):

$$[J_\kappa^{(\alpha)},J_\mu^{(\alpha)}]_H = \alpha^{-|\kappa|}\,j_\kappa\,J_\kappa^{(\alpha)}(1^N)\,\delta_{\kappa,\mu},$$

where we have used Proposition 3.7 and (2.9).

Dunkl also introduces a kernel $K(x,y)$, which for the root system $A_N$ and $p$ a symmetric homogeneous polynomial has the property [7, Prop. 2.1]

$$p(y) = \frac{e^{-p_2(y)}}{N_0^{(H)}}\int_{(-\infty,\infty)^N}\big(e^{-D_0/4}p\big)\,K(x,y)\,d\mu^{(H)}(x)$$

(we have changed variables $x\mapsto\sqrt2x$, $y\mapsto\sqrt2y$ and replaced $K(\sqrt2x,\sqrt2y)$ by $K(x,y)$). Comparison with Corollary 3.1 (after substituting for $H_\kappa$ using (3.21)) gives the explicit formula $K(x,y) = {}_0F_0^{(\alpha)}(2x;y)$.

In the Laguerre case there are analogous connections with the work of Dunkl, with the underlying root system now being $B_N$. The operators $T_i$ are now given by (see e.g. [13])

$$T_i = \frac{\partial}{\partial x_i}+\frac{1}{\alpha}\sum_{\substack{j=1\\ j\neq i}}^{N}\Big(\frac{1-M_{ij}}{x_i-x_j}+\frac{1-S_iS_jM_{ij}}{x_i+x_j}\Big)+\frac{a+1/2}{x_i}(1-S_i), \tag{6.5}$$

where the action of the operator $S_i$ is to replace the variable $x_i$ by $-x_i$. When acting on a function $f$ symmetric and even in $x_1,\dots,x_N$ these operators are such that

$$\sum_{i=1}^{N}T_i^2\Big|_{x_i^2=u_i} = 4\big(D_1^{(u)}+(a+1)E_0^{(u)}\big).$$

Using (6.5) and setting $P(x) := p(x^2)$ and $Q(x) = q(x^2)$, [6, Thm. 3.10] gives

$$[P,Q]_L = \frac{2^{2|\kappa|}}{N_0^{(L)}}\int_{[0,\infty)^N}\big(e^{-(D_1+(a+1)E_0)}p\big)\big(e^{-(D_1+(a+1)E_0)}q\big)\,d\mu^{(L)}(x),$$


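where $[\ ,\ ]_L$ is defined by (6.4) with $T_i$ specified by (6.5). Orthogonality of $\{L_\kappa\}$ with respect to (2.21) and use of (4.39) then gives

$$[J_\kappa^{(\alpha)}(x^2),J_\mu^{(\alpha)}(x^2)]_L = 2^{2|\kappa|}\alpha^{-|\kappa|}\,j_\kappa\,[a+q]_\kappa^{(\alpha)}\,J_\kappa^{(\alpha)}(1^N)\,\delta_{\kappa,\mu}.$$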

Also, (4.36a) with the substitution of (4.39) and the change of variables x, za 7→ x2 /2, za2 /2 gives the kernel K(x, y) in the BN case as 0 F1(α) (a + q; x2 /2; −y 2 /2). In the context of the Calogero-Sutherland model the expansion of the generalized Hermite polynomials in terms of monomial symmetric functions has been considered by Ujino and Wadati [31], and a Rodrigues–type formula has been obtained [32], analogous to that recently given by Lapointe and Vinet [18] for Jack polynomials. Also Polychronakos [28] has recently considered the monomial expansion of H(1k ) (x; α) and given its normalization. To our knowledge there have been no previous discussions of the generalized Laguerre polynomials in the context of the Calogero-Sutherland model. Regarding the global limit of the density computed in Sect. 5, we know of no other works which consider this limit directly. However, using techniques from potential theory Johansson [15] has recently proved that, in the Hermite case, for all β ≥ 0, r lim

N →∞

2 N

Z

Z p √ √ 2 1 f ( 2N x)ρ( 2N x) dx = f (x) 1 − x2 dx. π −1 −∞ ∞

(for β = 2 this result was first given in [3] using a mean-field approach) for any continuous, bounded f . The analogous result in the Jacobi case has also been obtained by Johansson [16]. These results establish that the smoothed density is that predicted by electrostatics, while our result establishes pointwise convergence to the electrostatic prediction.

Appendix

In this appendix, Eqs. (4.6a), (4.6b) and (4.13) are derived as special cases of a p.d.e. satisfied by ${}_2F_1^{(\alpha)}(a,b;c;x;y)$ (recall (4.3b)).

Proposition A.1. Let ${}_2F_1^{(\alpha)}(a,b;c;x;y)$ be defined by (4.3b). This function satisfies the p.d.e.

$$D_1^{(x)}F+\Big(c-\frac{N-1}{\alpha}\Big)E_0^{(x)}F-\Big(a+b-\frac{N-1}{\alpha}\Big)E_2^{(y)}F-\eta_2^{(y)}F = ab\,p_1(y)\,F, \tag{A.1}$$

where $D_k$, $E_k$ are defined by (2.12) and $\eta_2 := \frac12[D_2,E_2]$, and is in fact the unique solution of the equation of the form

$$F(x,y) = \sum_\kappa A_\kappa\frac{C_\kappa^{(\alpha)}(x)\,C_\kappa^{(\alpha)}(y)}{C_\kappa^{(\alpha)}(1^N)}, \qquad A_0 = 1. \tag{A.2}$$

Proof. We follow the method of Constantine and Muirhead [27] in the case α = 2. With F given by (A.2), from (2.13) and (3.4) we have


D1(x)

F =

N (i) XX κ κ

i=1

κ

N −i κi + α


Cκ(α) (x) Cκ(α) (1N )

Cκ(α) (i) (y) Aκ(i) ,

(A.3a)

N (i) XX Cκ(α) (x) (α) κ C (i) (y) Aκ(i) , κ Cκ(α) (1N ) κ κ i=1 N 1 X X κ(i) i − 1 Cκ(α) (x) (α) C (i) (y) Aκ , κi − E2(y) F = 1 + |κ| κ κ α Cκ(α) (1N ) κ i=1 N 1 X X κ(i) i−1 (y) κi − η2 F = 1 + |κ| κ κ α i=1 (α) i−N Cκ (x) (α) C (i) (y) Aκ , × κi − α Cκ(α) (1N ) κ (α) N 1 X X κ(i) Cκ (x) (α) C (i) (y) Aκ . p1 (y) F = 1 + |κ| κ κ Cκ(α) (1N ) κ i=1

E0(x) F =

(A.3b)

(A.3c)

(A.3d) (A.3e)

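Substituting (A.3) in (A.1), equating coefficients of $C_\kappa^{(\alpha)}(x)/C_\kappa^{(\alpha)}(1^N)$ and then equating coefficients of $\binom{\kappa^{(i)}}{\kappa}C_{\kappa^{(i)}}^{(\alpha)}(y)$ gives

$$\Big(c+\kappa_i-\frac{i-1}{\alpha}\Big)A_{\kappa^{(i)}} = \frac{1}{1+|\kappa|}\Big(a+\kappa_i-\frac{i-1}{\alpha}\Big)\Big(b+\kappa_i-\frac{i-1}{\alpha}\Big)A_\kappa.$$

This is a first order difference equation and so has a unique solution once the initial condition ($A_0 = 1$) is specified. It is straightforward to verify that the solution is

$$A_\kappa = \frac{1}{|\kappa|!}\,\frac{[a]_\kappa^{(\alpha)}[b]_\kappa^{(\alpha)}}{[c]_\kappa^{(\alpha)}}.$$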
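The simplest instance of this difference equation is $N = 1$, where partitions have a single row $\kappa = (k)$, the only admissible index is $i = 1$, and the recurrence reads $(c+k)A_{k+1} = (a+k)(b+k)A_k/(1+k)$. The sketch below iterates it in exact rational arithmetic and compares with the closed form, under the assumption (standard, but not restated in this section) that for a single row the generalized Pochhammer symbol $[a]_\kappa^{(\alpha)}$ reduces to the ordinary rising factorial $(a)_k$. The helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def coefficients_from_recurrence(a, b, c, kmax):
    """Iterate (c + k) A_{k+1} = (a + k)(b + k) A_k / (1 + k) starting from A_0 = 1."""
    coeffs = [Fraction(1)]
    for k in range(kmax):
        coeffs.append(coeffs[-1] * (a + k) * (b + k) / ((1 + k) * (c + k)))
    return coeffs

def rising_factorial(u, k):
    """Ordinary Pochhammer symbol (u)_k; assumed single-row case of [u]_kappa."""
    result = Fraction(1)
    for j in range(k):
        result *= u + j
    return result

def closed_form(a, b, c, k):
    """A_k = (a)_k (b)_k / (k! (c)_k), the N = 1 case of the stated solution."""
    return rising_factorial(a, k) * rising_factorial(b, k) / (factorial(k) * rising_factorial(c, k))

a, b, c = Fraction(1, 2), Fraction(3), Fraction(5, 3)
recurrence = coefficients_from_recurrence(a, b, c, 8)
closed = [closed_form(a, b, c, k) for k in range(9)]
print(recurrence == closed)
```

These are the coefficients of the Gauss series ${}_2F_1(a,b;c;xy)$, which is what the two-argument function reduces to when there is a single variable in each set.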

Equation (4.13) for ${}_1F_1^{(\alpha)}(a;c;x;y)$ follows from (A.1) by changing variables $y\mapsto y/b$ and then taking $b\to\infty$. Equation (4.6b) for ${}_0F_0^{(\alpha)}(x;y)$ follows from that for ${}_1F_1^{(\alpha)}(a;c;x;y)$ by setting $a = c = (N-1)/\alpha$, while the equation (4.6a) follows from (4.13) (with $a$ and $c$ interchanged) for ${}_1F_1^{(\alpha)}(a;c;x;y)$ by changing variables $y\mapsto y/a$ and taking $a\to\infty$.

Acknowledgement. We are particularly thankful to M. Lassalle for providing us with [19], encouraging our research and for advice. We also thank N. Obata for a useful remark. THB would like to thank Prof. T. Miwa for hospitality at RIMS where part of this work was carried out. The financial support of the ARC is acknowledged.

References

1. Beerends, R.J., Opdam, E.M.: Certain hypergeometric series related to the root system BC. Trans. Amer. Math. Soc. 339, 581–609 (1993)
2. Brézin, E., Itzykson, C., Parisi, G., Zuber, J.B.: Planar diagrams. Commun. Math. Phys. 59, 35–51 (1978)
3. Boutet de Monvel, A., Pastur, L., Shcherbina, M.: On the statistical mechanics approach in the random matrix theory: Integrated density of states. J. Stat. Phys. 79, 585–611 (1995)


4. Debiard, A.: Système différentiel hypergéométrique et parties radiales des opérateurs invariants des espaces symétriques de type BC_p. In: Lecture Notes in Mathematics, Volume 1296, 1987, pp. 42–124
5. Dib, H.: Fonctions de Bessel sur une algèbre de Jordan. J. de Math. Pures et Appl. 69, 403–448 (1990)
6. Dunkl, C.F.: Integral kernels with reflection group invariance. Canad. J. Math. 43, 1213–1227 (1991)
7. Dunkl, C.F.: Hankel transforms associated to finite reflection groups. In: Contemp. Math. 138, (ed. D. St. Richards) 1992, pp. 123–138
8. Erdélyi, et al.: Higher Transcendental Functions. Vol. 2. New York: McGraw-Hill, 1953
9. Forrester, P.J.: Selberg correlation integrals and the 1/r^2 quantum many body system. Nucl. Phys. B 388, 671–699 (1992)
10. Forrester, P.J.: Exact results and universal asymptotics in the Laguerre random matrix ensemble. J. Math. Phys. 35, 2539–2551 (1994)
11. Gurappa, N., Panigrahi, P.K.: On an explicit set of complete eigenfunctions for the Calogero-Sutherland model. Mod. Phys. Lett. A 11, 891–898 (1996)
12. Herz, C.S.: Bessel functions of matrix argument. Ann. Math. 61, 474–523 (1955)
13. Hikami, K.: Dunkl operator formalism for quantum many body problems associated with classical root systems. J. Phys. Soc. Japan 65, 394–401 (1996)
14. James, A.T.: Special functions of matrix and single argument in statistics. In: R.A. Askey, editor, Theory and Application of Special Functions. New York: Academic Press, 1975, pp. 497–520
15. Johansson, K.: On fluctuations of random Hermitian matrices. Roy. Inst. Tech. Stockholm. Preprint
16. Johansson, K.: On random matrices from the compact classical groups. Roy. Inst. Tech. Stockholm. Preprint
17. Kaneko, J.: Selberg integrals and hypergeometric functions associated with Jack polynomials. SIAM J. Math. Anal. 24, 1086–1110 (1993)
18. Lapointe, L., Vinet, L.: Exact operator solution of the Calogero-Sutherland model. Commun. Math. Phys. 178, 425–452 (1996)
19. Lassalle, M.: Generalized Hermite polynomials: a short survey. Unpublished manuscript
20. Lassalle, M.: Une formule du binôme généralisée pour les polynômes de Jack. C. R. Acad. Sci. Paris, t. Séries I 310, 253–256 (1990)
21. Lassalle, M.: Polynômes de Hermite généralisés. C. R. Acad. Sci. Paris, t. Séries I 313, 579–582 (1991)
22. Lassalle, M.: Polynômes de Jacobi généralisés. C. R. Acad. Sci. Paris, t. Séries I 312, 425–428 (1991)
23. Lassalle, M.: Polynômes de Laguerre généralisés. C. R. Acad. Sci. Paris, t. Séries I 312, 725–728 (1991)
24. Macdonald, I.G.: Hypergeometric functions. Unpublished manuscript
25. Macdonald, I.G.: Symmetric functions and Hall polynomials. Oxford: Oxford University Press, 2nd edition, 1995
26. Muirhead, R.J.: Latent roots and matrix variates: a review of some asymptotic results. Ann. Stat. 6, 5–33 (1978)
27. Muirhead, R.J.: Aspects of multivariate statistical theory. New York: Wiley, 1st edition, 1982
28. Polychronakos, A.P.: Quasihole wavefunctions for the Calogero model. cond-mat/9603132
29. Sekiguchi, J.: Zonal spherical functions of some symmetric spaces. Publ. RIMS Kyoto Univ. 12, 455–459 (1977)
30. Stanley, R.P.: Some combinatorial properties of Jack symmetric functions. Adv. in Math. 77, 76–115 (1989)
31. Ujino, H., Wadati, M.: Orthogonal symmetric polynomials associated with the quantum Calogero model. J. Phys. Soc. Japan 64, 2703–2706 (1995)
32. Ujino, H., Wadati, M.: Algebraic construction of the eigenstates for the second conserved operator of the quantum Calogero model. J. Phys. Soc. Japan 65, 653–656 (1996)
33. Yan, Z.: A class of generalized hypergeometric functions in several variables. Canad. J. Math. 44, 1317–1338 (1992)
34. Yan, Z.: Generalized hypergeometric functions and Laguerre polynomials in two variables. In: Contemp. Math., 138, (ed. D. St. Richards), 239–259 (1992)

Communicated by T. Miwa

Commun. Math. Phys. 188, 217 – 232 (1997)

Communications in

Mathematical Physics c Springer-Verlag 1997

Deformation Quantization for Hilbert Space Actions

Nik Weaver*

Mathematics Department, Washington University, St. Louis, MO 63130, USA. E-mail: [email protected]

Received: 26 August 1996 / Accepted: 28 January 1997

Abstract: Rieffel’s theory of deformations of C*-algebras for R^d-actions can be extended to actions of infinite-dimensional Hilbert spaces. The CCR algebra over a Hilbert space H can be exhibited in this manner as a deformation of a commutative C*-algebra of almost periodic functions on H.

Introduction

In his monograph [16] Rieffel gave a method for “deforming” a C*-algebra A which is equipped with an action of R^{2n}, that is, a group homomorphism α : R^{2n} → Aut(A) which is continuous for the point-norm topology on Aut(A). For any ħ ∈ R and nondegenerate symplectic form σ on R^{2n} and any two elements a, b ∈ A which are smooth for α one is able to make sense of the deformed product

$$a\times_\hbar b = \int\!\!\int\alpha_{\hbar u}(a)\,\alpha_v(b)\,e^{2\pi i\sigma(u,v)}\,du\,dv$$

defined as an oscillatory integral, so that a ×_ħ b ∈ A is also smooth. One then completes the algebra of smooth elements to get a deformed C*-algebra A_ħ. A version of this procedure for von Neumann algebras was presented in [22]. (The above formulation has been slightly altered to make it consistent with what we do here.)

The present paper arose from the observation that the CCR algebra over an n-dimensional complex Hilbert space is precisely a Rieffel deformation of the C*-algebra of almost periodic continuous functions on R^{2n}. This was observed in ([17], Example 4); see [6] for a related result. Here R^{2n} acts on the almost periodic functions by translation and is equipped with the standard symplectic form. This raises the question of whether CCR algebras over general Hilbert spaces can be interpreted in a similar way. It is

This research was supported by an NSF postdoctoral research fellowship.


not hard to see that this is the case, and in fact this particular class of examples can be understood in a fairly simplistic manner via finite-dimensional approximations.

It therefore seemed worthwhile to try to extend the work in [16] and [22] to cover actions of infinite-dimensional Hilbert spaces. We found that indeed this extension is possible. The main techniques are finite-dimensional approximation along the lines of ([17], discussion following Proposition 1.5) and the use of cylinder measures on Hilbert space in place of Lebesgue measure on R^{2n}. As applications we treat both the usual CCR algebras and a related enlarged class of algebras defined in ([17], Definition 1.6).

Our set-up consists of a complex Hilbert space H, a real constant ħ, a positive real constant ω, a von Neumann algebra M, and a group homomorphism α : H → Aut(M) which is continuous from the weak topology on H to the point-ultraweak topology on Aut(M). Under these conditions we define a deformed von Neumann algebra M_ħ and show that it has many of the nice properties of deformed C*-algebras proved in [16].

The only function of the complex structure on H is to provide us with the symplectic form σ(x, y) = Im⟨x, y⟩ on H considered as a real Hilbert space. Somewhat greater generality in the choice of σ is probably possible but only at the expense of some complication. In practice the symplectic structures of greatest interest are those which arise in the above manner ([5], § 5.2.2.2; [19]).

The positive constant ω parametrizes a family of cylinder measures on H; the case ω = 1 corresponds to the normal measure under which L^2(H) is canonically equivalent to the symmetric Fock space over H. However other choices are equally possible and lead to different families of deformations. In particular the “undeformed” algebra M_0 generally is not isomorphic to M and indeed varies according to the value of ω. The situation is closely related to the celebrated non-uniqueness of representations of the canonical commutation relations. There are many cylinder measures on H besides the Gaussian measures we use, but this special form seems to be needed at a crucial step (Lemma 3).

Our use of the weak topology on H perhaps also merits comment; it means that the action is in a sense close to being finite-dimensional, and this is what allows us to apply previous work on actions of finite-dimensional spaces.

We will also consider deformed C*-algebras realized as subalgebras of M for which α is norm continuous. The deformed C*-algebras do not depend on the value of ω. In contrast to the finite-dimensional case it seems difficult to develop a satisfying C*-algebra theory ab initio; one is virtually forced to detour through von Neumann algebras. The main reason for this is that a von Neumann algebra with a Hilbert space action must be rich in elements for which the action is trivial on finite-codimensional subspaces of H (Proposition 2), whereas this is not so for C*-algebras. Of course one could assume the existence of such elements but it is unlikely that they must remain sufficiently rich in the desired sense after deformation.

Sections 1 to 3 expose the general theory, with Sects. 1 and 2 devoted mainly to basic definitions and Sect. 3 to the main results. In Sect. 4 we consider CCR and related algebras as examples. Because of our heavy reliance on [16] and [22], the reader probably needs to be familiar with those references.

Deformation Quantization for Hilbert Space Actions

1. Hilbert Spaces and Hilbert Modules

Let H be a complex Hilbert space, ħ a real number, ω a positive real number, M a von Neumann algebra, and α an action of H on M which is weak to ultraweakly continuous. Also let A be the C*-algebra of elements of M for which the action is weak to norm continuous. This notation will remain fixed through Sect. 3.

The goal of Sects. 1 and 2 is to define a deformed von Neumann algebra $M_\hbar$ and a deformed C*-algebra $A_\hbar$. Just to get to the definition requires a fair amount of preparation. We will discuss first cylinder measures and Hilbert modules in this section, then finite-dimensionally based subalgebras and Hilbert module operators in the next.

We begin with some basic facts about cylinder measures on Hilbert spaces. On each finite-dimensional subspace V of H define the Gaussian measure $\mu_V$ by

$$d\mu_V = \omega^{2n} e^{-\pi\omega^2\|x\|^2}\,dx,$$

where n is the complex dimension of V and dx denotes Lebesgue measure on V. Then $\mu_V$ is a probability measure on V. Also if V and V′ are orthogonal finite-dimensional subspaces and W = V ⊕ V′ then $\mu_W = \mu_V \times \mu_{V'}$. Thus the map $\pi_{V,W} : f \mapsto f \otimes 1_{V'}$, where $1_{V'}$ is the function on V′ which is constantly 1, isometrically embeds $L^2(V, \mu_V)$ in $L^2(W, \mu_W)$, and these embeddings are clearly compatible. We may therefore define $L^2(H)$ to be the direct limit

$$L^2(H) = \varinjlim_V L^2(V, \mu_V)$$

in the category of Hilbert spaces, taken over all finite-dimensional subspaces V of H with connecting maps $\pi_{V,W}$.

There are three other ways to view $L^2(H)$. First, if $H = \bigoplus V_i$ then $L^2(H)$ may be identified with $\bigotimes L^2(V_i)$, defined in the case of an infinite family as the completed span of the elementary tensors $\bigotimes x_i$, where each $x_i \in L^2(V_i)$ and $x_i = 1_{V_i}$ for all but finitely many i. Taking all of the subspaces $V_i$ to be finite-dimensional thus provides a different definition of $L^2(H)$; at the same time, we also obtain $L^2(H) \cong L^2(V) \otimes L^2(V^\perp)$ for every subspace V of H, a fact that will come in handy later.
Second, essentially as observed in ([1], § 10), the inverse limit $\tilde H = \varprojlim V$ of all finite-dimensional subspaces of H, with orthogonal projections as connecting maps and taken in the category of algebraic vector spaces, may be viewed as a product space, and so one can then use the measures $\mu_V$ to define a genuine product measure µ on $\tilde H$, such that $L^2(\tilde H, \mu)$ is canonically isomorphic to $L^2(H)$. Namely, for each finite-dimensional subspace V ⊂ H and each Borel subset A of V, let $P_V : \tilde H \to V$ be the natural map and define $\mu(P_V^{-1}(A)) = \mu_V(A)$. Sets of the form $P_V^{-1}(A)$ are called cylinder sets (hence the term “cylinder measure”).

It is worth noting that for different values of ω the resulting measures on $\tilde H$ are not mutually absolutely continuous. This can be seen by taking $H = l^2(\mathbb{N})$ and finding a sequence $t_n \to \infty$ such that $\mu_V([0, t_n]^n) = 1/2$ for $V = l^2_n$ and a given choice of ω, while $\mu_V([0, t_n]^n) \to 0$ for any smaller value of ω; by a standard fact ([9], Theorem 3.5) this shows that the measures on $\tilde H$ resulting from the two values of ω are not equivalent. We will use this observation in Sect. 4 to show that the examples given there yield nonisomorphic families of deformations for different values of ω. However, for the most part we suppress the dependence on ω in our notation.

The last version of $L^2(H)$, which we will not use, involves a canonical isomorphism with the symmetric (boson) Fock space over H, i.e. the infinite direct sum of symmetrized tensor powers $\mathbb{C} \oplus H_c \oplus (H_c \otimes_s H_c) \oplus \cdots$, where $H_c$ is the complexification of H considered as a real Hilbert space. See ([10], p. 48 and [18], § II) for details.
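The normalization and product property of the Gaussian measures $\mu_V$ defined at the start of this section can be checked directly. The following is a short verification sketch, assuming only the standard one-dimensional Gaussian integral and an identification of V with $\mathbb{R}^{2n}$ through an orthonormal real basis; the coordinates $t_1, \dots, t_{2n}$ are introduced here for illustration and do not appear in the original text.

```latex
% Normalization: identify V with R^{2n}, writing x = (t_1, ..., t_{2n}).
% Uses \int_R e^{-\pi\omega^2 t^2} dt = 1/\omega for \omega > 0.
\[
\int_V \omega^{2n} e^{-\pi\omega^2\|x\|^2}\,dx
  = \prod_{j=1}^{2n} \Bigl(\omega \int_{\mathbb{R}} e^{-\pi\omega^2 t_j^2}\,dt_j\Bigr)
  = \prod_{j=1}^{2n} \Bigl(\omega \cdot \frac{1}{\omega}\Bigr) = 1 .
\]
% Product property: W = V \oplus V' orthogonal, complex dimensions n and n'.
% Orthogonality gives \|x + x'\|^2 = \|x\|^2 + \|x'\|^2 for x in V, x' in V'.
\[
\omega^{2(n+n')} e^{-\pi\omega^2\|x+x'\|^2}
  = \bigl(\omega^{2n} e^{-\pi\omega^2\|x\|^2}\bigr)
    \bigl(\omega^{2n'} e^{-\pi\omega^2\|x'\|^2}\bigr),
\qquad\text{hence}\qquad \mu_W = \mu_V \times \mu_{V'} .
\]
```

The same factorization explains why $\pi_{V,W}$ is isometric: $\|f \otimes 1_{V'}\|^2 = \|f\|^2 \, \mu_{V'}(V') = \|f\|^2$.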

N. Weaver

Following [16], we will construct deformed algebras via their actions on Hilbert modules rather than Hilbert spaces; as in [22], working at the von Neumann algebra level demands the use of self-dual modules. The basic construction is as follows.

Let K be any Hilbert space. The algebraic tensor product K ⊗ M has a natural action of M by right multiplication and a natural M-valued inner product given by

$$\langle x \otimes a, y \otimes b\rangle_M = \langle x, y\rangle a^* b$$

for x, y ∈ K and a, b ∈ M. (All of our inner products are conjugate linear in the first variable.) Then the set $(K \otimes M)^* = \mathrm{Hom}(K \otimes M, M)$ of all bounded M-module homomorphisms from K ⊗ M to M is the space we need. It has the following properties.

Theorem 1. Let E be a Hilbert M-module and let $E^*$ be the set of all bounded module homomorphisms from E into M.
(a) $E^*$ is a self-dual Hilbert M-module and a dual space.
(b) E canonically embeds as a weak* dense submodule of $E^*$.
(c) The set $B(E^*)$ of all bounded module endomorphisms is a von Neumann algebra. It contains a canonical embedding of B(E).
(d) A bounded net $(T_\kappa) \subset B((K \otimes M)^*)$ converges ultraweakly to T if and only if $\langle T_\kappa(x \otimes 1), y \otimes 1\rangle_M \to \langle T(x \otimes 1), y \otimes 1\rangle_M$, converging ultraweakly in M for all x, y ∈ K.
(e) If $K = K_1 \otimes K_2$ then there is a canonical isomorphism $(K \otimes M)^* \cong ((K_1 \otimes M)^* \otimes K_2)^*$, and $B((K_1 \otimes M)^*)$ canonically embeds in $B((K \otimes M)^*)$.

Proof. Parts (a) and (c) and the embedding in part (b) are all contained in ([14], § 3). The weak* density in part (b) follows from ([11], Theorem 3.2 (iv) ⇒ (i)), and part (d) is proved just as in ([22], Lemma 1). The isomorphism in part (e) follows from the identity $K \otimes M \cong K_1 \otimes M \otimes K_2$ and weak* density of $K_1 \otimes M$ in $(K_1 \otimes M)^*$ (by part (b)); and $B((K_1 \otimes M)^*)$ embeds in $B((K_1 \otimes M)^* \otimes K_2)$ by tensoring with the identity on $K_2$, hence in $B((K \otimes M)^*)$ by part (c).

A simpler (but seemingly coordinate-dependent) description of $(K \otimes M)^*$ goes as follows. Let $(e_n)$ be an orthonormal basis of K (separability is not essential).
Then $(K \otimes M)^*$ can be identified with $l^2(M)$, the module of all sequences $(a_n)$ of elements of M such that $\sum a_n^* a_n$ is bounded (hence ultraweakly summable), in such a way that $e_n \otimes a \in K \otimes M$ corresponds to the sequence whose nth entry is a and whose other entries are zero.

We define $L^2(H; M) = (L^2(H) \otimes M)^*$. In von Neumann algebra language, the Hilbert module $S(\mathbb{R}^{2n}; M)$ used in [16] consists of the M-valued Schwartz functions on $\mathbb{R}^{2n}$; this can be viewed as an uncompleted version of $L^2(\mathbb{R}^{2n}; M)$, giving $\mathbb{R}^{2n}$ Lebesgue measure, since $S(\mathbb{R}^{2n}; M)$ can be embedded in $(S(\mathbb{R}^{2n}) \otimes M)^* \cong (L^2(\mathbb{R}^{2n}) \otimes M)^* = L^2(\mathbb{R}^{2n}; M)$. Here $L^2(H; M)$ will play the role of an infinite-dimensional analog of the latter module.
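To make the identification with $l^2(M)$ concrete, here is a short computation for finitely supported sequences. It uses only the inner product formula $\langle x \otimes a, y \otimes b\rangle_M = \langle x, y\rangle a^* b$ and orthonormality of $(e_n)$; the names φ, ψ, $a_n$, $b_m$ are chosen here for illustration.

```latex
% phi, psi: finite sums of elementary tensors over the orthonormal basis (e_n) of K.
\[
\phi = \sum_{n} e_n \otimes a_n, \qquad \psi = \sum_{m} e_m \otimes b_m
\quad\Longrightarrow\quad
\langle \phi, \psi\rangle_M
  = \sum_{n,m} \langle e_n, e_m\rangle\, a_n^* b_m
  = \sum_{n} a_n^* b_n .
\]
% In particular <phi, phi>_M = \sum_n a_n^* a_n, which is exactly the quantity
% required to be bounded in the definition of l^2(M).
```

Under this identification the right action of M is entrywise: $(a_n) \cdot c = (a_n c)$.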

2. Deformed Algebras

For any finite-dimensional subspace V of H, let

$$M_V = \{a \in M : \alpha_x(a) = a \text{ for all } x \in V^\perp\}$$

be the set of elements of M for which the action is trivial on $V^\perp$. We say that elements of $M_V$ are based on V. Clearly $M_V$ is a von Neumann subalgebra of M. Let $M_V^\infty$ be the set of elements of $M_V$ which are smooth for α and let $M^\infty = \bigcup_V M_V^\infty$.

Proposition 2. $M^\infty$ is ultraweakly dense in M. A is the norm closure of $M^\infty$.

Proof. Let a ∈ M, let ε > 0, and let $\eta_1, \dots, \eta_n \in M_*$ be normal linear functionals on M. To establish ultraweak density it will suffice to find an element c of $M^\infty$ such that $|\eta_j(c - a)| \le 2\varepsilon$ for 1 ≤ j ≤ n.

First observe that $\{b \in M : |\eta_j(b - a)| < \varepsilon \text{ for } 1 \le j \le n\}$ is an ultraweakly open neighborhood of a, hence by continuity of α the set $\{x \in H : |\eta_j(\alpha_x(a) - a)| < \varepsilon \text{ for } 1 \le j \le n\}$ is a weakly open neighborhood of the origin in H. Therefore it contains a finite codimension subspace of H; let V be the orthocomplement of this subspace.

Now $V^\perp$ is an abelian group under addition, hence it has an invariant mean m ([15], Comment 7.3.5). For $\eta \in M_*$ define $f_\eta \in l^\infty(V^\perp)$ by $f_\eta(x) = \eta(\alpha_x(a))$; then define $b \in M \cong (M_*)^*$ by setting $b(\eta) = m(f_\eta)$. Note that $|m(f_\eta)| \le \|f_\eta\|_\infty \le \|\eta\|\|a\|$, so that b is indeed a bounded functional on $M_*$. Now $|\eta_j(b - a)| = |m(g)|$ where $g \in l^\infty(V^\perp)$ is the function $g(x) = \eta_j(\alpha_x(a) - a)$. But $\|g\|_\infty \le \varepsilon$, so $|\eta_j(b - a)| \le \varepsilon$ for 1 ≤ j ≤ n. Also for $\eta \in M_*$ and $y \in V^\perp$, $\eta(\alpha_y(b)) = b(\eta \circ \alpha_y) = m(f_{\eta \circ \alpha_y})$, and since $f_{\eta \circ \alpha_y}(x) = f_\eta(x + y)$ it follows from the invariant mean property that $m(f_{\eta \circ \alpha_y}) = m(f_\eta)$, hence $\eta(\alpha_y(b)) = \eta(b)$. As this holds for all $\eta \in M_*$ we conclude that $\alpha_y(b) = b$ for all $y \in V^\perp$, i.e. $b \in M_V$.

Finally, a standard convolution argument (e.g. see [15], Lemma 7.5.1) shows that $M_V^\infty$ is ultraweakly dense in $M_V$, so we can find an element $c \in M_V^\infty$ such that $|\eta_j(c - b)| \le \varepsilon$ for 1 ≤ j ≤ n. This completes the proof of the first assertion.

It is clear that every element of $M^\infty$ is norm continuous for α.
Conversely, let a ∈ M and suppose the map $x \mapsto \alpha_x(a)$ is weak to norm continuous, i.e. a ∈ A. Then as above, for any ε > 0 we can find a finite codimension subspace $V^\perp$ of H such that $\|\alpha_x(a) - a\| < \varepsilon$ for all $x \in V^\perp$. Again as above define $b \in M_V$ by $b(\eta) = m(f_\eta)$. Then for any $\eta \in M_*$ we have $|\eta(b - a)| = |m(g)| \le \varepsilon\|\eta\|$, where $g(x) = \eta(\alpha_x(a) - a)$ satisfies $|g(x)| \le \varepsilon\|\eta\|$ for all $x \in V^\perp$. This shows that $\|b - a\| \le \varepsilon$. Also, for any y ∈ V and $\eta \in M_*$, we have $|\eta(\alpha_y(b) - b)| = |m(f_{\eta \circ \alpha_y} - f_\eta)|$; but for all $x \in V^\perp$

$$|(f_{\eta \circ \alpha_y} - f_\eta)(x)| = |\eta(\alpha_{x+y}(a) - \alpha_x(a))| \le \|\eta\|\|\alpha_y(a) - a\|.$$

So $\|\alpha_y(b) - b\| \le \|\alpha_y(a) - a\|$, hence b is norm continuous for α. Finally, convolution with a $C^\infty$ approximate unit on V yields an element $c \in M_V^\infty$ within ε of b in norm, and this completes the proof that $M^\infty$ is dense in A.

Now we discuss Hilbert module operators. Let V ⊂ H be a subspace of H of finite complex dimension n and let $a \in M_V^\infty$. Equip $V \cong \mathbb{R}^{2n}$ with the symplectic form $\sigma(x, y) = \mathrm{Im}\langle x, y\rangle$. Following ([16], comment preceding Definition 4.8), define the operator $L^{2n}_{\tilde a}$ on $L^2(\mathbb{R}^{2n}; M)$ by setting

$$L^{2n}_{\tilde a} f(x) = \iint \alpha_{x+\hbar u}(a)\, f(x + v)\, e^{2\pi i\sigma(u,v)}\,du\,dv$$

for $f \in S(\mathbb{R}^{2n}; M)$, and extending to $L^2(\mathbb{R}^{2n}; M)$ by boundedness ([16], Theorem 4.6) and Theorem 1 (c). (In the notation of [16], J = iħ.)

Note that here $\mathbb{R}^{2n}$ is equipped with Lebesgue, not Gaussian, measure. Thus let $U_V : L^2(V; M) \to L^2(\mathbb{R}^{2n}; M)$ be the isometric isomorphism arising from multiplication by the function

$$\rho_V(x) = \bigl(\omega^{2n} e^{-\pi\omega^2\|x\|^2}\bigr)^{1/2},$$

and define $L_{\tilde a,V} = U_V^{-1} L^{2n}_{\tilde a} U_V \in B(L^2(V; M))$.

Recall the embedding $\pi_{V,W} : L^2(V) \to L^2(W)$ defined in Sect. 1 and let $Q_{V,W} : L^2(V; M) \to L^2(W; M)$ be the corresponding map obtained by tensoring with the identity on M.
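Unwinding the definition $L_{\tilde a,V} = U_V^{-1} L^{2n}_{\tilde a} U_V$ gives an explicit integral formula in which the Gaussian weight appears only as a ratio. The following is a formal sketch, assuming g is chosen so that $\rho_V g$ is a Schwartz function and the oscillatory integral formula for $L^{2n}_{\tilde a}$ applies; the symbol g is introduced here for illustration.

```latex
% U_V is multiplication by the positive scalar function \rho_V, so
% (U_V g)(x) = \rho_V(x) g(x) and (U_V^{-1} h)(x) = \rho_V(x)^{-1} h(x).
\[
(L_{\tilde a,V}\, g)(x)
  = \rho_V(x)^{-1}\bigl(L^{2n}_{\tilde a}(\rho_V g)\bigr)(x)
  = \iint \alpha_{x+\hbar u}(a)\,\frac{\rho_V(x+v)}{\rho_V(x)}\, g(x+v)\,
      e^{2\pi i\sigma(u,v)}\,du\,dv ,
\]
% The ratio of weights follows directly from the definition of \rho_V:
\[
\frac{\rho_V(x+v)}{\rho_V(x)} = e^{-\pi\omega^2\left(\|x+v\|^2 - \|x\|^2\right)/2}.
\]
```

The normalizing constant $\omega^{n}$ cancels in the ratio, so only the exponential factor depends on ω.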

Lemma 3. Let V ⊂ W be finite-dimensional subspaces of H and let a be an element of $M_V^\infty$. Then $Q_{V,W} L_{\tilde a,V} = L_{\tilde a,W} Q_{V,W}$.

Proof. Note first that $M_V^\infty \subset M_W^\infty$, so both $L_{\tilde a,V}$ and $L_{\tilde a,W}$ make sense. The point of this lemma is to show that these operators match up, so that they can be amalgamated into a single operator on $L^2(H; M)$.

Write W = V ⊕ V′ and let V and V′ have complex dimensions n and n′ respectively. For any $f \in S(\mathbb{R}^{2n}; M)$ we have $U_V^{-1} f \in L^2(V; M)$, and making use of the isomorphism

$$L^2(W; M) \cong (L^2(V; M) \otimes L^2(V'))^*$$

of Theorem 1 (e) we have

$$Q_{V,W} L_{\tilde a,V}(U_V^{-1} f) = (L_{\tilde a,V} U_V^{-1} f) \otimes 1_{V'} = (U_V^{-1} L^{2n}_{\tilde a} f) \otimes 1_{V'} = (\rho_V^{-1} L^{2n}_{\tilde a} f) \otimes 1_{V'},$$

while

$$L_{\tilde a,W} Q_{V,W}(U_V^{-1} f) = L_{\tilde a,W} Q_{V,W}(\rho_V^{-1} f) = L_{\tilde a,W}(\rho_V^{-1} f \otimes 1_{V'}) = U_W^{-1} L^{2(n+n')}_{\tilde a} U_W(\rho_V^{-1} f \otimes 1_{V'}) = \rho_W^{-1} L^{2(n+n')}_{\tilde a}(f \otimes \rho_{V'}).$$

Since a is based on V, ([16], Proposition 1.11) implies that in the second computation

$$L^{2(n+n')}_{\tilde a}(f \otimes \rho_{V'}) = (L^{2n}_{\tilde a} f) \otimes \rho_{V'},$$

so that finally

$$L_{\tilde a,W} Q_{V,W}(U_V^{-1} f) = \rho_W^{-1}\bigl((L^{2n}_{\tilde a} f) \otimes \rho_{V'}\bigr) = (\rho_V^{-1} L^{2n}_{\tilde a} f) \otimes 1_{V'}.$$

Thus the two expressions agree, so by density and continuity we conclude that $Q_{V,W} L_{\tilde a,V} = L_{\tilde a,W} Q_{V,W}$.

The preceding lemma is the only place where it is really crucial that we use Gaussian cylinder measures. Specifically we require the property that $\rho_W = \rho_V \otimes \rho_{V'}$. Perhaps deformations can be defined using more general cylinder measures, but our techniques here demand this special form.

For $a \in M^\infty$ we now define a bounded module operator $L_{\tilde a,H} \in B(L^2(H; M))$ by setting

$$L_{\tilde a,H}(\phi) = Q_V L_{\tilde a,V} Q_V^*(\phi)$$

for any V on which a is based and any $\phi \in Q_V(L^2(V; M))$, where $Q_V : L^2(V; M) \to L^2(H; M)$ is the natural embedding. In other words $L_{\tilde a,H} = \lim_V L_{\tilde a,V}$. By the preceding result this definition is consistent, and by ([16], Propositions 1.11 and 5.4) the norms of $L^{2n}_{\tilde a}$ and $L^{2(n+n')}_{\tilde a}$ agree, hence $\|L_{\tilde a,V}\| = \|L_{\tilde a,W}\|$ for V ⊂ W; so $L_{\tilde a,H}$ is bounded on $\bigcup_V Q_V(L^2(V; M))$ and thus extends to $L^2(H; M)$ by Theorem 1 (c).

Another equivalent way to define $L_{\tilde a,H}$ is as the image of $L_{\tilde a,V}$ under the canonical embedding of $B(L^2(V; M))$ into $B(L^2(H; M))$ promised in Theorem 1 (e). This still requires Lemma 3 to show consistency for different choices of V.

We are finally able to define the deformed algebras $M_\hbar$ and $A_\hbar$.

Definition 4. We define $M_\hbar$, the von Neumann algebra deformed by the action of H, to be the ultraweak closure of $\{L_{\tilde a,H} : a \in M^\infty\}$ in $B(L^2(H; M))$. We define $A_\hbar$, the C*-algebra deformed by the action of H, to be the norm closure of $\{L_{\tilde a,H} : a \in M^\infty\}$ in $M_\hbar$.

3. Main Results

In this section H, ħ, ω, M, α, and A remain fixed.

Let $M_{V,\hbar}$ be the deformation of $M_V$, defined as in [22]; equivalently by ([22], Lemma 15), $M_{V,\hbar}$ is the ultraweak closure of the operators $L^{2n}_{\tilde a}$ in $B(L^2(\mathbb{R}^{2n}; M))$, or equivalently the operators $L_{\tilde a,V}$ in $B(L^2(V; M))$, for $a \in M_V^\infty$. We will find it most convenient to take $M_{V,\hbar}$ as acting on $L^2(V; M)$.

Proposition 5. There is a natural ∗-isomorphic embedding of $M_{V,\hbar}$ into $M_\hbar$.

Proof.
This follows from the second definition of $L_{\tilde a,H}$ and Theorem 1 (e). The embedding is given by $T \mapsto T \otimes I_{V^\perp}$, where $I_{V^\perp}$ is the identity operator on $L^2(V^\perp)$.

The interest of Proposition 5 lies in the fact that $M_{V,\hbar}$ is independent of ω. Thus, $M_\hbar$ is full of deformed von Neumann algebras which are based on finite-dimensional subspaces, and for which there is only one possible deformation. This shows that the choice of ω has a somewhat limited effect; loosely speaking, it only influences the

deformations of elements which are not finitely based. Along similar lines we also note that $A_\hbar$ is independent of ω, since the deformed norm of an element of $M^\infty$ is a finite-dimensional, hence unique, property.

We have two actions of H on $L^2(H; M)$, defined as follows. For y ∈ H and $x \otimes a \in L^2(H) \otimes M$ define $\sigma_y(x \otimes a) = x \otimes \alpha_{-y}(a)$. This action extends to $L^2(H; M)$ by setting $(\sigma_y \phi)(x \otimes a) = \alpha_{-y}(\phi(x \otimes \alpha_y(a)))$ for $\phi \in L^2(H; M) = (L^2(H) \otimes M)^*$. Note that $\sigma_y$ is not a module operator on $L^2(H; M)$ since it does not commute with the action of M, although this will not cause us any problems. Also let V be any finite-dimensional subspace containing y and define $\tau_y$ on $L^2(V; M)$ to be translation on $L^2(\mathbb{R}^{2n}; M)$ conjugated by $U_V$, i.e.

$$\tau_y(f)(x) = e^{-\pi\omega^2(\|x-y\|^2 - \|x\|^2)/2} f(x - y);$$

then extend τ to $L^2(H; M)$ by taking a limit over V or applying Theorem 1 (e).

Proposition 6. For any y ∈ H and $a \in M^\infty$ we have $L_{\alpha_y(a)^\sim,H} = \tau_y^{-1} L_{\tilde a,H} \tau_y = \sigma_y^{-1} L_{\tilde a,H} \sigma_y$. Conjugation with τ defines an action of H on $M_\hbar$ which agrees with α on $M^\infty$, is weak to ultraweakly continuous, and restricts to a weak to norm continuous action on $A_\hbar$.

Proof. We can find a finite-dimensional subspace V of H such that a is based on V and y ∈ V. Then σ and τ are as defined in [22] on $L^2(\mathbb{R}^{2n}; M)$, conjugated by $U_V$ and tensored with $I_{V^\perp}$, so the first assertion follows from the corresponding finite-dimensional result ([22], Lemma 6). Agreement with α on $M^\infty$ as identified with $\{L_{\tilde a,H} : a \in M^\infty\}$ immediately follows. Ultraweak continuity on $M_\hbar$ follows from Theorem 1 (d) and the calculation

$$\langle \sigma_{y_\kappa}^{-1} L_{\tilde a,H} \sigma_{y_\kappa}(x \otimes 1), x' \otimes 1\rangle_M = \alpha_{y_\kappa}(\langle L_{\tilde a,H}(x \otimes 1), x' \otimes 1\rangle_M) \to \alpha_y(\langle L_{\tilde a,H}(x \otimes 1), x' \otimes 1\rangle_M) = \langle \sigma_y^{-1} L_{\tilde a,H} \sigma_y(x \otimes 1), x' \otimes 1\rangle_M,$$

where $y_\kappa \to y$ weakly in H. Norm continuity on $A_\hbar$ follows from norm continuity on $M^\infty$ ([16], Proposition 5.11). Finally, $M_\hbar$ and $A_\hbar$ are invariant for the action by continuity, since $M^\infty$ is invariant.

Because of Proposition 6, the deformed algebra $M_\hbar$ carries its own action of H, which we may consistently also refer to as α since it agrees with the original action α on $M^\infty$. Thus we can form the algebras $(M_\hbar)_V$ and $(M_\hbar)^\infty$.

As in [16] and [22], the fact that $(M_\hbar)^\infty = M^\infty$ as sets is very important. That is, there are no new smooth elements after deformation. To show this we first need an alternative characterization of $M_\hbar$.

For $f \in S(V; M) \cong S(\mathbb{R}^{2n}; M)$ and $g \in S(V) \cong S(\mathbb{R}^{2n})$, respectively the set of M-valued and scalar-valued Schwartz functions on a finite-dimensional subspace V of H, we have

$$f \times_\hbar g(x) = \iint f(x + \hbar u)\, g(x + v)\, e^{2\pi i\sigma(u,v)}\,du\,dv = \iint g(x - \hbar v)\, f(x + u)\, e^{2\pi i\sigma(v,u)}\,dv\,du = g \times_{-\hbar} f(x).$$

This justifies the notation $R^{2n}_g$ for the operator $f \mapsto f \times_\hbar g$ on $L^2(\mathbb{R}^{2n}; M)$ and also shows that it is a bounded module map by ([16], Theorem 4.6). We may then carry $R^{2n}_g$

over to an operator $R_{g,V}$ on $L^2(V; M)$ by conjugating with $U_V$ and extend to an operator $R_{g,H}$ in $B(L^2(H; M))$ just as in the definition of $L_{\tilde a,H}$, either by taking the limit of the operators $R_{g \otimes 1_{V'},W}$ or by invoking Theorem 1 (e).

Let $\pi_V : L^2(V) \to L^2(H)$ be the canonical embedding and let $P_V \in B(L^2(H; M))$ be the operator obtained by tensoring the orthogonal projection from $L^2(H)$ onto $\pi_V(L^2(V))$ with the identity on M (and then applying Theorem 1 (c) as usual). That is, $P_V = Q_V Q_V^*$, where $Q_V : L^2(V; M) \to L^2(H; M)$ is the natural embedding.

Lemma 7. An operator $T \in B(L^2(H; M))$ belongs to $M_\hbar$ if and only if it commutes with the operators $\sigma_y \tau_y^{-1}$ and $R_{g,H}$ for all y ∈ H and g ∈ S(V), V ranging over all finite-dimensional subspaces of H.

Proof. Suppose $T \in M_\hbar$. Then commutation with $\sigma_y \tau_y^{-1}$ follows from Proposition 6 since T is an ultraweak limit of operators of the form $T_\kappa = L_{\tilde a_\kappa,H}$. The fact that $\sigma_y$ is not a module operator is easy to handle; indeed

$$\langle T_\kappa(x \otimes 1), x' \otimes 1\rangle_M = \langle \sigma_y \tau_y^{-1} T_\kappa \tau_y \sigma_y^{-1}(x \otimes 1), x' \otimes 1\rangle_M = \alpha_{-y}(\langle \tau_y^{-1} T_\kappa \tau_y(x \otimes 1), x' \otimes 1\rangle_M) \to \alpha_{-y}(\langle \tau_y^{-1} T \tau_y(x \otimes 1), x' \otimes 1\rangle_M) = \langle \sigma_y \tau_y^{-1} T \tau_y \sigma_y^{-1}(x \otimes 1), x' \otimes 1\rangle_M,$$

and since also $T_\kappa \to T$ we conclude that $T = \sigma_y \tau_y^{-1} T \tau_y \sigma_y^{-1}$, using Theorem 1 (d). For commutation with $R_{g,H}$ it suffices to consider the case $T = L_{\tilde a,H}$ for $a \in M^\infty$, which works by associativity of $\times_\hbar$ ([16], Theorem 2.14) plus the interpretation of $R_g$ as right twisted multiplication by g.

Now suppose $T \in B(L^2(H; M))$ commutes with all of the operators $\sigma_y \tau_y^{-1}$ and $R_{g,H}$. For any finite-dimensional subspace V, $P_V$ also commutes with $R_{g,H}$ for all g ∈ S(V) and with $\sigma_y \tau_y^{-1}$ for all y ∈ V; hence $P_V T P_V$ commutes with them as well. It will suffice to show that this implies $P_V T P_V \in M_{V,\hbar} \otimes I_{V^\perp}$, since $M_{V,\hbar} \otimes I_{V^\perp} \subset M_\hbar$ by Proposition 5 and $P_V T P_V \to T$ ultraweakly. But this is essentially the content of Lemma 7 through Theorem 11 in [22], so we are done.
(The argument in [22] also requires that $P_V T P_V$ be smooth for $\alpha|_V$, but this can be arranged by convolving with a $C^\infty$ approximate unit as usual.)

Recall that $Q_V : L^2(V; M) \to L^2(H; M)$ is the natural embedding.

Lemma 8. For any finite-dimensional subspace V of H we have $Q_V^* M_\hbar Q_V = M_{V,\hbar}$.

Proof. For every $T \in M_{V,\hbar}$ we have $Q_V^*(T \otimes I_{V^\perp})Q_V = T$, so the containment ⊃ is easy. Conversely, every element of $Q_V^* M_\hbar Q_V$ satisfies the conditions of Lemma 7 with V in place of H, and hence belongs to $M_{V,\hbar}$.

Define $\Phi_V : B(L^2(H; M)) \to B(L^2(H; M))$ by $\Phi_V(T) = Q_V^* T Q_V \otimes I_{V^\perp}$. Notice that $\Phi_V$ takes $M_\hbar$ into $M_{V,\hbar} \otimes I_{V^\perp} \subset M_\hbar$. We are now ready for the key result on smooth vectors of the deformed algebra.

Theorem 9. $(M_\hbar)^\infty = \{L_{\tilde a,H} : a \in M^\infty\}$.

Proof. The containment ⊃ follows from ([16], Theorem 7.1). Conversely, let $T \in (M_\hbar)^\infty$ and suppose T is based on V. Then for any W containing V we have $\Phi_W(T) \in (M_{W,\hbar})^\infty \otimes I_{W^\perp}$ by Lemma 8, hence $\Phi_W(T) \in \{L_{\tilde a,H} : a \in M_W^\infty\}$ by ([22], Theorem 11). But since T is based on V so is $\Phi_W(T)$, and so $\Phi_W(T) \in \{L_{\tilde a,H} : a \in M_V^\infty\}$ and we must have $\Phi_W(T) = \Phi_V(T)$ by Lemma 3, since $Q_V = Q_W Q_{V,W}$. We conclude that the net $(\Phi_W(T))$ is eventually constant and belongs to $\{L_{\tilde a,H} : a \in M_V^\infty\}$, hence its ultraweak limit T also belongs to this set.

Corollary 10. $A_\hbar$ is the norm continuous part of $M_\hbar$ for α.

Proof. This is immediate from Theorem 9 plus Proposition 2 applied to $M_\hbar$.

We omit the proof of the next result because it is virtually identical to the proof given for finite-dimensional actions in ([22], Lemma 13 to Theorem 16).

Theorem 11. Let N be another von Neumann algebra with a weak to ultraweakly continuous action β of H, and let θ : M → N be an equivariant normal homomorphism. Then there is a unique equivariant normal homomorphism $\theta_\hbar : M_\hbar \to N_\hbar$ which agrees with θ on $M^\infty$.

Since the deformed von Neumann algebra $M_\hbar$ itself carries an action of H, it is possible to apply the deformation procedure again to get another algebra $(M_\hbar)_{\hbar'}$. The next result asserts that this algebra is isomorphic to the single-step deformation $M_{\hbar+\hbar'}$. In particular, taking ħ′ = 0 we see that deforming by zero does not affect an algebra that has already been deformed. This is worth noting since, as we have already mentioned, M need not be isomorphic to $M_0$. Of course this result depends on the same value of ω being used for both deformations.

Theorem 12. Let ħ, ħ′ ∈ ℝ. Then $(M_\hbar)_{\hbar'} \cong M_{\hbar+\hbar'}$.

Proof. $M_{\hbar+\hbar'}$ is obtained by taking the ultraweak closure of $M^\infty$ in a representation $a \mapsto L_{\tilde a,H}$ on the Hilbert M-module $L^2(H; M)$, while $(M_\hbar)_{\hbar'}$ is obtained by closing $M^\infty$ ($\approx M_\hbar^\infty$ by Theorem 9) in a representation $a \mapsto L'_{\tilde a,H}$ on the Hilbert $M_\hbar$-module $L^2(H; M_\hbar)$. According to ([16], Theorem 6.5) the norm of any element of $M^\infty$ is the same in both cases. (Note that this immediately implies that $(A_\hbar)_{\hbar'} \cong A_{\hbar+\hbar'}$.) Thus to show that the ultraweak closures are isomorphic, it is sufficient to show that any net $(a_\kappa) \subset M^\infty$ which is bounded in this norm verifies $L_{\tilde a_\kappa,H} \to 0$ ultraweakly if and only if $L'_{\tilde a_\kappa,H} \to 0$ ultraweakly.

Recall the maps $\Phi_V : B(L^2(H; M)) \to B(L^2(H; M))$ defined preceding Theorem 9. Also define similar maps $\Phi'_V : B(L^2(H; M_\hbar)) \to B(L^2(H; M_\hbar))$ and observe that $L_{\tilde a_\kappa,H} \to 0$ if and only if $\Phi_V(L_{\tilde a_\kappa,H}) \to 0$ for all finite-dimensional V, and similarly for $L'_{\tilde a_\kappa,H}$ and $\Phi'_V(L'_{\tilde a_\kappa,H})$.
But we know that $M_{V,\hbar+\hbar'} \otimes I_{V^\perp}$, in which the elements $\Phi_V(L_{\tilde a_\kappa,H})$ live, is canonically isomorphic to $(M_{V,\hbar})_{\hbar'} \otimes I_{V^\perp}$, in which the elements $\Phi'_V(L'_{\tilde a_\kappa,H})$ live ([22], Corollary 18), so ultraweak convergence coincides as desired.

The last theorem in this section is an immediate consequence of the corresponding results in [16].

Theorem 13. The family $\{A_\hbar\}_{\hbar \in \mathbb{R}}$ is a continuous field of C*-algebras and a strict deformation quantization of $M^\infty$.

Proof. It suffices to prove this for the field $\{A_{V,\hbar}\}$, where $A_{V,\hbar}$ denotes the norm closure of $M_{V,\hbar}^\infty$. This is done in ([16], Theorems 8.13 and 9.3).

4. CCR Algebras

CCR algebras (see e.g. [5] for background) can be interpreted as examples of the deformation theory described in the previous sections. Let H be a complex Hilbert space. (Discard all other notation fixed up to now.) We will realize CCR(H), the CCR algebra over H, as a deformation of the commutative C*-algebra AP(H) of almost periodic continuous functions on H. Quantizations of almost periodic algebras are also discussed in [6] and [13].

A bounded function f : H → ℂ, continuous for the norm topology on H, is called almost periodic if the set of its translations is precompact in sup norm. (See e.g. [7] for background on almost periodic functions on groups.) Denote the set of bounded continuous almost periodic functions on H by AP(H). We can use standard machinery of almost periodic functions to obtain a simple alternate characterization of this class.

Lemma 14. Consider H as a topological abelian group with the norm topology. The continuous characters on H are precisely the functions $\chi_x : y \mapsto e^{2\pi i\,\mathrm{Im}\langle x, y\rangle}$ for x ∈ H.

Proof. The map $\chi_x$ is clearly a continuous homomorphism from H into the circle group 𝕋. Conversely, let χ : H → 𝕋 be a continuous group homomorphism. Since H is simply connected χ can be factored through the universal cover ℝ of 𝕋, i.e. we have $\chi(y) = e^{2\pi i\phi(y)}$ for some continuous map φ : H → ℝ. Shifting by an integer if necessary, we may assume that φ(0) = 0. Then the function $(y, y') \mapsto \phi(y + y') - \phi(y) - \phi(y')$ is continuous from H × H to ℝ, is integer valued since its composition with $e^{2\pi i\,\cdot}$ is 1, and is zero at y = y′ = 0; hence it is constantly zero. This shows that φ is a group homomorphism, and continuity easily implies that φ is linear. Thus, φ is a bounded real linear functional and hence it must be of the form $y \mapsto \mathrm{Im}\langle x, y\rangle$ for some x ∈ H.

Corollary 15. AP(H) is the C*-algebra generated by the set of functions $\{\chi_x : x \in H\}$.

Proof. ([7], Theorem 7.14 and Corollary 7.16).
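The algebraic relations among the characters $\chi_x$ of Lemma 14 are worth recording explicitly, since they show that the linear span of $\{\chi_x\}$ is already a unital ∗-algebra whose closure is the generated C*-algebra of Corollary 15. The following short check uses only real-bilinearity of $\mathrm{Im}\langle\cdot,\cdot\rangle$; it is an illustrative computation and is not taken from the original text.

```latex
% Im<x, y> is additive in each variable, so for x, x', y, y' in H:
\[
\chi_x(y + y') = e^{2\pi i\,\mathrm{Im}\langle x,\, y + y'\rangle}
              = \chi_x(y)\,\chi_x(y'),
\qquad
\chi_x\,\chi_{x'} = \chi_{x + x'},
\]
% Complex conjugation negates the phase, and x = 0 gives the constant function 1:
\[
\overline{\chi_x} = \chi_{-x},
\qquad
\chi_0 = 1 .
\]
```

The first identity is the homomorphism property used in the proof of Lemma 14; the remaining ones give closure of the span under products and adjoints.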

The corollary shows that in fact, every function in AP(H) is weakly continuous on H.

Every almost periodic function f on H has a “mean value” τ(f); in operator algebra language τ is a faithful state on AP(H), and it satisfies

$$\tau(\chi_x) = \begin{cases} 1 & \text{if } x = 0 \\ 0 & \text{if } x \ne 0 \end{cases}$$

([7], Theorem 7.7). It is easy to see that the Hilbert space for the resulting GNS representation $\pi_\tau$ is (isomorphic to) $l^2(H)$ and $\pi_\tau(\chi_x)$ is translation by x, $\pi_\tau(\chi_x) = L_x$. Also define $M_x \in B(l^2(H))$ to be the operator of multiplication by $\chi_x$, so $M_x f(y) = e^{2\pi i\,\mathrm{Im}\langle x, y\rangle} f(y)$ for $f \in l^2(H)$.

Theorem 16.
(a) $\pi_\tau(AP(H)) = C^*(H)$ is the group C*-algebra of H considered as a discrete abelian group and the von Neumann algebra $M = \pi_\tau(AP(H))''$ is maximal abelian.

(b) The map $x \mapsto M_x$ is weak to strong operator continuous, and conjugation with $M_x$ defines a weak to ultraweakly continuous action α of H on M.
(c) The C*-algebra A of norm continuous elements of M for α is precisely $\pi_\tau(AP(H)) \cong AP(H)$.
(d) $A_\hbar \cong AP(H)_\hbar$ is isomorphic to CCR(H) for any ħ ≠ 0.

Proof. (a) The first part is immediate from the definition of the group C*-algebra and the fact that $\pi_\tau(\chi_x) = L_x$, and the second part follows from ([21], Proposition 7.14).

(b) Suppose $f \in l^2(H)$ has finite support and $x_\kappa \to x$ weakly in H. Then $\chi_{x_\kappa}(y) \to \chi_x(y)$ for all y ∈ H and it follows that $M_{x_\kappa} f \to M_x f$ in $l^2$ norm. Since the operators $M_{x_\kappa}$ are unitary, this shows strong continuity. Ultraweak continuity of conjugation with $M_x$ follows immediately, and this conjugation takes $L_y$ to $e^{-2\pi i\,\mathrm{Im}\langle x, y\rangle} L_y$, so M is invariant for this action.

(c) The observation $\alpha_x(L_y) = e^{-2\pi i\,\mathrm{Im}\langle x, y\rangle} L_y$ shows that the operators $L_y$ are norm continuous for α, and hence so is the C*-algebra $\pi_\tau(AP(H))$ they generate. Conversely, by Proposition 2 it suffices to show that $M^\infty \subset \pi_\tau(AP(H))$. But if $T \in M^\infty$ is based on V ⊂ H then for any $x \in V^\perp$ we have

$$\langle T f_0, f_y\rangle = \langle T M_x f_0, M_x f_y\rangle = e^{2\pi i\,\mathrm{Im}\langle x, y\rangle}\langle T f_0, f_y\rangle,$$

where $f_y \in l^2(H)$ is the characteristic function of y ∈ H; this shows that the “Fourier coefficients” $\langle T f_0, f_y\rangle$ are nonzero only for y ∈ V. Since T is norm continuous for $\alpha|_V$, a standard argument then shows that it is approximated in norm by Bochner-Fejer polynomials ([2], § I.9, norm continuity being used in Theorem 9.3), hence it belongs to $\pi_\tau(AP(H))$ as desired. (The Bochner-Fejer polynomials are finite linear combinations of operators $L_y$ and in the notation of [2] are defined by $\sigma = m_t\{\alpha_t(T)K(t)\}$. The almost periodic function $t \mapsto \alpha_t(T)K(t)$ takes values in M and its mean exists by ([3], § I).)

(d) Let ħ ≠ 0.
Then for x, y ∈ H and u, v ranging over any finite-dimensional subspace V which contains x and y,

$$L_x \times_\hbar L_y = \iint \alpha_{\hbar u}(L_x)\,\alpha_v(L_y)\, e^{2\pi i\sigma(u,v)}\,du\,dv = \iint e^{-2\pi i\,\mathrm{Im}\langle \hbar u, x\rangle} L_x\, e^{-2\pi i\,\mathrm{Im}\langle v, y\rangle} L_y\, e^{2\pi i\,\mathrm{Im}\langle u, v\rangle}\,du\,dv = e^{-2\pi i\hbar\,\mathrm{Im}\langle x, y\rangle} L_{x+y}.$$

The constant 2πħ can be removed from the formula by rescaling H by $\sqrt{2\pi|\hbar|}$. This shows that the elements $L_x$ of the deformed algebra satisfy the Weyl form of the canonical commutation relations, hence the C*-algebra $A_\hbar$ they generate is isomorphic to the CCR algebra over H.

Physically, we can regard H as the phase space of a classical field, say $H = H_r \oplus iH_r$, where $H_r$ is a real Hilbert space the elements of which describe field strength and $iH_r$ describes the conjugate field momentum. Then AP(H) is a classical algebra of observables of a particularly simple type.
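The product formula from the proof of Theorem 16 (d) yields the commutation relation between $L_x$ and $L_y$ in one step. The sketch below spells this out, together with the rescaling mentioned in the proof; the restriction to ħ > 0 and the symbol $W_x$ for the rescaled generators are assumptions made here for illustration, and the phase convention is simply the one that follows from the formula above.

```latex
% Swap x and y in L_x \times_\hbar L_y = e^{-2\pi i\hbar Im<x,y>} L_{x+y}
% and use Im<y, x> = -Im<x, y>:
\[
L_y \times_\hbar L_x
  = e^{-2\pi i\hbar\,\mathrm{Im}\langle y, x\rangle}\, L_{x+y}
  = e^{2\pi i\hbar\,\mathrm{Im}\langle x, y\rangle}\, L_{x+y},
\]
% Comparing the two products gives the commutation relation:
\[
L_x \times_\hbar L_y
  = e^{-4\pi i\hbar\,\mathrm{Im}\langle x, y\rangle}\; L_y \times_\hbar L_x .
\]
% Rescaling, assuming \hbar > 0: set W_x := L_{x/\sqrt{2\pi\hbar}}. Then
\[
W_x \times_\hbar W_y
  = e^{-2\pi i\hbar\,\mathrm{Im}\langle x, y\rangle/(2\pi\hbar)}\,
      L_{(x+y)/\sqrt{2\pi\hbar}}
  = e^{-i\,\mathrm{Im}\langle x, y\rangle}\, W_{x+y}.
\]
```

For ħ < 0 the same rescaling by $\sqrt{2\pi|\hbar|}$ gives the opposite sign in the exponent.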

This picture raises the question of what other natural C*-algebras of observables exist. (Several possibilities are discussed in [12].) In the finite-dimensional case the algebra of continuous functions on $\mathbb{R}^{2n}$ which vanish at infinity seems more tractable than the algebra of almost periodic functions, but it appears to have no good infinite-dimensional analog since infinite-dimensional Hilbert spaces are not locally compact.

Another natural C*-algebra, $C_{bu}(\mathbb{R}^{2n})$, consists of the bounded uniformly continuous functions on $\mathbb{R}^{2n}$; it is of interest because it constitutes precisely that part of $L^\infty(\mathbb{R}^{2n})$ for which the action of $\mathbb{R}^{2n}$ by translations is norm continuous. It may be regarded as the largest reasonable algebra of continuous observables.

There is an infinite-dimensional analogue of this algebra, whose deformation may be viewed as an enlarged version of CCR(H). This enlarged CCR algebra was defined in ([17], Definition 1.6), where it was called the Weyl C*-algebra for H.

To define the undeformed algebra, recall the vector space $\tilde H = \varprojlim V$ described in Sect. 1 as the algebraic inverse limit of all finite-dimensional subspaces of H. For each such subspace V let $\phi_V : \tilde H \to V$ be the natural map and for any bounded uniformly continuous function $f \in C_{bu}(V)$ let $f \circ \phi_V$ be the corresponding function on $\tilde H$. Then we define $C_{bu}(H)$ to be the uniform closure of the set

$$\{f \circ \phi_V : V \subset H \text{ is finite-dimensional and } f \in C_{bu}(V)\}.$$

In other words $C_{bu}(H)$ is the direct limit $C_{bu}(H) = \varinjlim_V C_{bu}(V)$ in the category of C*-algebras, where the connecting maps are given by tensoring with the identity. It acts on $L^2(\tilde H, \mu)$ by multiplication, where µ is the cylindrical Gaussian measure on $\tilde H$ described in Sect. 1 (for some value of ω), and is contained in $L^\infty(\tilde H, \mu)$. For x ∈ H let $\beta_x$ be the automorphism of $L^\infty(\tilde H, \mu)$ defined by translation by x.

Theorem 17.
(a) $C_{bu}(H)'' = L^\infty(\tilde H, \mu)$.
(b) β is weak to ultraweakly continuous.
(c) The C*-algebra B of norm continuous elements of $L^\infty(\tilde H, \mu)$ for β is precisely $C_{bu}(H)$.

Proof. (a) Clearly $C_{bu}(H) \subset L^\infty(\tilde H, \mu)$, so also $C_{bu}(H)'' \subset L^\infty(\tilde H, \mu)$. Conversely, choose a Hamel basis B of H and identify $\tilde H$ with the infinite product space $\mathbb{C}^B$ by matching $(z_x) \in \mathbb{C}^B$ with the element of $\tilde H = \varprojlim V$ whose value z in $V = \mathrm{span}(x_1, \dots, x_n)$ for $x_1, \dots, x_n \in B$ satisfies $\langle z, x_i\rangle = z_{x_i}$. Then for any finite subset $B_0$ of B, $L^\infty(\mathbb{C}^{B_0})$ embeds in $L^\infty(\mathbb{C}^B)$ as functions which only depend on the variables in $B_0$, and $C_{bu}(H) \cap L^\infty(\mathbb{C}^{B_0}) = C_{bu}(\mathbb{C}^{B_0})$ is ultraweakly dense in $L^\infty(\mathbb{C}^{B_0})$. An easy measure-theoretic argument then implies that $C_{bu}(H)$ is ultraweakly dense in $L^\infty(\mathbb{C}^B) = L^\infty(\tilde H, \mu)$. (The subsets of $\mathbb{C}^B$ whose characteristic functions belong to $C_{bu}(H)''$ constitute a σ-algebra that includes all Borel cylinder sets, hence every characteristic function belongs to $C_{bu}(H)''$.)

(b) It is enough to show that β is ultraweakly continuous on characteristic functions; then taking linear combinations implies continuity on simple functions, and norm density of simple functions in $L^\infty(\tilde H, \mu)$ implies the result.

First suppose $S \subset \mathbb{C}^B$ is a Borel cylinder set, i.e. S is based on some finite subset $B_0 \subset B$, and let $f_S$ be its characteristic function. Let $x_\kappa \to 0$ be any weakly convergent net in H, and let $y_\kappa$ be the orthogonal projection of $x_\kappa$ onto $\mathrm{span}\, B_0 \subset H$. Then $y_\kappa \to 0$

and the translation of S by $y_\kappa$ equals its translation by $x_\kappa$. Since Gaussian measure on $\mathbb{C}^{B_0}$ is equivalent to Lebesgue measure, it follows that the translations of S by $y_\kappa$, hence by $x_\kappa$, converge to S in measure.

We now extend the above to any measurable set S by showing that the sets for which we have convergence in measure under translations constitute a σ-algebra. Clearly, if this is true of S then it is also true of the complement of S. Now let $(S_n)$ be a disjoint sequence of measurable sets and suppose that the translations of $S_n$ by $x_\kappa$ converge to $S_n$ in measure for each n. Given any ε > 0, choose N large enough so that $\mu(S') \le \varepsilon$, where $S' = \bigcup_{N+1}^\infty S_n$, and choose $\kappa_0$ large enough so that

$$\int |\beta_{x_\kappa}(f_{S_n}) - f_{S_n}| \le \varepsilon/N$$

for $\kappa > \kappa_0$ and 1 ≤ n ≤ N. It follows that

$$\int |\beta_{x_\kappa}(f_S) - f_S| \le \sum_{n=1}^{N} \int |\beta_{x_\kappa}(f_{S_n}) - f_{S_n}| + \int |\beta_{x_\kappa}(f_{S'}) - f_{S'}| \le N(\varepsilon/N) + 2\varepsilon = 3\varepsilon$$

S for κ > κ0 , where S = Sn . We conclude that the translations of S converge to S in measure for every measurable set S. e µ) To show weak to ultraweak continuity, let S be any measurable set. Let g ∈ L1 (H, and observe that by absolute continuity, for every > 0 there exists a δ > 0 such that the integral of |g| over any set of measure less than δ is less than . It then follows from convergence in measure that Z | (βxκ (fS ) − fS )g| → 0. Thus characteristic functions translate ultraweakly continuously, which is what we needed to show. e µ) is not weak to norm (It may be worth noting that the adjoint action β ∗ on L1 (H, continuous. Indeed −πω 2 (kx−yk2 −kxk2 ) , βx∗ (1H e )(y) = e where 1H e denotes the function which is constantly 1; so if xn is a sequence of unit vectors in H which converge weakly to zero, the values kβx∗n (1) − 1k1 are constant over n and nonzero. This shows that the infinite-dimensional analog of ([4], Corollary 2.5.23) is false.) (c). The containment Cbu (H) ⊂ B is easy, and conversely by Proposition 2 it is enough to show that any element of B which is based on some finite-dimensional subspace V ⊂ H is in Cbu (H). We now invoke the fact mentioned earlier, that Cbu (V ) is the subalgebra of L∞ (V ) of norm translation continuous elements. Recall that deformations at the C*-algebra level do not depend on ω. However, e µ) is an easy example of a von Neumann algebra whose deformations do depend L∞ (H, e ω , µω ) for some value of ω and let M0 be the deformation on ω. Indeed, let M ω = L∞ (H ω of M for ~ = 0, relative to a different value ω 0 . Then M0 acts on the Hilbert module e ω0 ; M ω ). Letting τ be the faithful normal state on M ω given by integration, the L2 (H resulting GNS representation is just the original representation by multiplication on e ω ); composing the M ω -valued inner product with τ maps L2 (H e ω0 ; M ω ) into L 2 (H

Deformation Quantization for Hilbert Space Actions

231

e ω 0 , L2 (H e ω )) ∼ e ω 0 ) ⊗ L2 ( H eω ) ∼ e ω0 × H e ω ), L2 ( H = L2 (H = L2 (H e ω0 , M ω )) in B(L2 (H e ω0 ⊗ H e ω )) ([14], Proposition 2.6). In and thus embeds B(L2 (H ∞ e 0 e ω ) given by particular Lf˜,H ∈ M0 maps to the function φ(f ) ∈ L (Hω × H φ(f )(x, y) = f (x − y). eω ) If M ω were isomorphic to M0 then φ would define an isomorphism from L∞ (H ∞ e 0 ω e ω ). But if f ∈ M is the characteristic function of a set which has into L (Hω × H positive µω -measure but is µω0 -null, then by Fubini’s theorem φ(f ) is zero. So M ω is not isomorphic to M0 . In closing we mention that the machinery of almost periodic functions can be used to give a quick proof of the uniqueness of the CCR algebra over H ([20], Theorem 3.7). The proof is modelled on a corresponding argument for noncommutative tori ([23], Theorem 12.3.2). Let {Wx : x ∈ H} be any family of unitaries acting on some Hilbert space K and satisfying the canonical commutation relations. Let A = span{Wx : x ∈ H} be the C*-algebra generated by these unitaries and observe that H acts on A by αy (Wx ) = Wy−1 Wx Wy = eiImhx,yi Wx . Thus Wx is periodic for the action, hence all of A is almost-periodic for the action. It follows ([3], § I) that for each T ∈ A the function y 7→ αy (T ) possesses a mean value m(T ) ∈ A. Then km(T )k ≤ kT k so m is continuous, and n 0 if x 6= 0 ; m(Wx ) = I = W0 if x = 0 so by continuity m takes values in C · I. Also T ≥ 0 automatically implies m(T ) ≥ 0, and if T > 0 then η(T ) > 0 for some state η on A, hence η(m(T )) = m(η(T )) > 0 ([3], Theorem 16); so in fact T > 0 implies m(T ) > 0. Thus, m is a faithful trace on A (identifying C · I with C). Finally, by its definition m(T ) belongs to the norm closed convex hull of the set {αy (T ) : y ∈ H} = {Wy−1 T Wy : y ∈ H}. If J is any closed ideal of A and T ∈ J is nonzero, then the last fact above shows that m(T ∗ T ) ∈ J, but m(T ∗ T ) is a nonzero scalar multiple of the unit since m is a faithful trace. 
So $J$ contains a nonzero scalar multiple of the unit of $A$, hence $J = A$. This shows that $A$ is simple. Uniqueness follows by a standard trick [8]: given two representations $W_x$ and $W'_x$ with corresponding C*-algebras $A$ and $A'$, form the direct sum representation. Then the resulting C*-algebra maps onto both $A$ and $A'$, hence by simplicity must equal both of them. So $A = A'$.

References

1. Araki, H.: Hamiltonian formalism and the canonical commutation relations in quantum field theory. J. Math. Phys. 1, 492–504 (1960)
2. Besicovitch, A.S.: Almost Periodic Functions. London: Dover, 1954
3. Bochner, S. and von Neumann, J.: Almost periodic functions in groups. II. Trans. Am. Math. Soc. 37, 21–50 (1935)
4. Bratteli, O. and Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics I (second edition). Berlin–Heidelberg–New York: Springer-Verlag, 1987


5. Bratteli, O. and Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics II. Berlin–Heidelberg–New York: Springer-Verlag, 1981
6. Coburn, L.A.: Deformation estimates for the Berezin-Toeplitz quantization. Commun. Math. Phys. 149, 415–424 (1992)
7. Corduneanu, C.: Almost Periodic Functions. New York: Wiley Interscience, 1968
8. Douglas, R.: On the C*-algebra of a one-parameter semigroup of isometries. Acta Math. 128, 143–151 (1972)
9. Folland, G.B.: Real Analysis. New York: John Wiley & Sons, 1984
10. Folland, G.B.: Harmonic Analysis in Phase Space. Annals of Mathematics Studies 122 (1989)
11. Frank, M.: Self-duality and C*-reflexivity of Hilbert C*-moduli. Zeit. für Anal. 9, 165–176 (1990)
12. Guichardet, A.: Algèbres d'Observables Associées aux Relations de Commutation. Paris: Armand Colin, 1968
13. Klimek, A. and Leśniewski, A.: Quantized Kronecker flows and almost periodic quantum field theory. Preprint
14. Paschke, W.L.: Inner product modules over B*-algebras. Trans. Am. Math. Soc. 182, 443–468 (1973)
15. Pedersen, G.K.: C*-algebras and their Automorphism Groups. New York: Academic Press, 1979
16. Rieffel, M.A.: Deformation Quantization for Actions of $\mathbb{R}^d$. AMS Memoir, Vol. 106 (1993)
17. Rieffel, M.A.: Quantization and C*-algebras. Contemp. Math. 167, 67–97 (1994)
18. Segal, I.E.: Tensor algebras over Hilbert spaces I. Trans. Am. Math. Soc. 81, 106–134 (1956)
19. Shale, D.: Linear symmetries of free boson fields. Trans. Am. Math. Soc. 103, 149–167 (1962)
20. Slawny, J.: On factor representations and the C*-algebra of canonical commutation relations. Commun. Math. Phys. 24, 151–170 (1972)
21. Takesaki, M.: Theory of Operator Algebras I. Berlin–Heidelberg–New York: Springer-Verlag, 1979
22. Weaver, N.: Deformations of von Neumann algebras. To appear in J. Op. Thy.
23. Wegge-Olsen, N.E.: K-Theory and C*-Algebras. Oxford: Oxford University Press, 1993

Communicated by H. Araki

Commun. Math. Phys. 188, 233 – 249 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

On-Diagonal Estimates on Schrödinger Semigroup Kernels and Reduced Heat Kernels

Adam Sikora

Centre for Mathematics and its Applications, School of Mathematical Sciences, Australian National University, Canberra, ACT 0020, Australia

Received: 25 May 1996 / Accepted: 29 January 1997

Abstract: We prove various estimates for the kernels of semigroups generated by Schrödinger operators with magnetic field and potential of polynomial growth. We also investigate the reduced heat kernels.

1. Introduction

Let $M$ be a connected and complete Riemannian manifold with Riemannian metric $\langle\cdot,\cdot\rangle$. By $d$ we denote the Riemannian distance on $M$ and by $H$ we denote the operator
\[ (H\psi, \psi) = \int_M dx\, \bigl( |\operatorname{grad}\psi(x) + i\psi(x)Y|^2 + V(x)|\psi(x)|^2 \bigr), \tag{1} \]

where $dx$ is a Riemannian measure on $M$, $\psi \in C_c^\infty(M)$, $Y$ is a real vector field such that $\langle Y, Y\rangle \in L^1_{loc}(M)$, $V : M \to \mathbb{R}$, $V \in L^1_{loc}(M)$ and $V \ge 0$. With some abuse of notation, we will also denote by $H$ the Friedrichs extension of this operator. For any bounded Borel function $F : [0, \infty) \to \mathbb{C}$ we define the operator $F(H)$ by the spectral decomposition and we denote its kernel by $K_{F(H)}$, i.e.
\[ F(H)(\psi)(x) = \int_M dy\, K_{F(H)}(x, y)\psi(y). \]

The operator $H$ is called a Schrödinger operator with magnetic field. Various properties of such operators were studied in many papers, see e.g. [2, 9, 12, 14]. In the sequel we will always assume that the following Nash inequality holds:
\[ \|\psi\|_{L^2} \le \varepsilon \bigl( \|\operatorname{grad}\psi\|_{L^2}^2 + \gamma^2 \|\psi\|_{L^2}^2 \bigr)^{1/2} + c\,(\gamma\varepsilon)^{-k/2} \|\psi\|_{L^1} \tag{2} \]


for some $c > 0$, all $\varepsilon > 0$, all $\gamma \in (0, 1]$ and all $\psi \in C_c^\infty(M)$. Note that (2) is equivalent to the formula
\[ \|\psi\|_{L^2}^{2+4/k} \le c' \bigl( \|\operatorname{grad}\psi\|_{L^2}^2 + \|\psi\|_{L^2}^2 \bigr) \|\psi\|_{L^1}^{4/k} \tag{3} \]
for some $c' > 0$ and all $\psi \in C_c^\infty(M)$. The form (2) of the Nash inequality comes from [7] Corollary 3.7 and [16] (2.22) IV.2. However, it is a variation of inequalities originally discovered by Nash [13]. The theory of Nash inequalities and their connections with Sobolev inequalities and $L^\infty$ estimates for heat kernels corresponding to Laplace–Beltrami operators constitute a very broad subject; we refer the readers who are looking for a rationale for the assumption (2) (or (3)) to [19, 4 and 16]. However, we want to point out that (2) holds in many interesting cases, for example for group invariant operators on Lie groups (see [16]) or for uniformly elliptic operators on $\mathbb{R}^n$.

The main aim of this paper is the study of on-diagonal bounds for heat kernels corresponding to the operator $H$ under the following assumption of polynomial growth of the potential $V$:
\[ V(x) \ge \sigma\, d(0, x)^\alpha, \tag{4} \]
where $\alpha > 0$ and $0 \in M$ is an arbitrary fixed point of $M$ (see also [6, 4, §4.5]). We also obtain some off-diagonal estimates as a direct consequence of on-diagonal estimates. However, if the points $x$ and $y$ are far apart we do not gain additional information, since in this case our estimates are weaker than known Gaussian bounds (see Theorem 4.1 [7] and Theorem 2 below).

In this paper, as in [18], we use the connection between the heat and wave equations, which has a long history, see [11].

In the last section, to investigate the sharpness of our estimates, we compare them with the lower bounds for the kernel of the semigroup generated by the operator
\[ -\Delta + |x|^\alpha = -\sum_{j=1}^{k} \partial^2/\partial x_j^2 + |x|^\alpha. \tag{5} \]

The estimates for the heat kernel corresponding to the operator (5) were studied by Davies and Simon [6, 4, §4.5]. They obtained sharp estimates for large time and $\alpha > 2$. For large time our results give the same kind of behaviour of the heat kernels as Davies and Simon's estimates. However, their approach also gives the precise values of the constants in the estimates (see Theorem 7 below). On the other hand, our estimates are stronger for small times and also work for $\alpha < 2$, whereas Davies and Simon's approach in this case gives only negative results.

We will apply our result to the kernels which, following ter Elst and Robinson [7], we call reduced heat kernels, and which can be described as follows. Let $U$ be an irreducible unitary representation of a nilpotent Lie group $G$ on $L^2(\mathbb{R}^k)$ and let $U = dU$ denote the representation of the Lie algebra $\mathfrak{g}$ obtained by differentiation. If $b_1, \dots, b_n$ is a basis of $\mathfrak{g}$ and we define the operator $H$ by
\[ H = -\sum_{j=1}^{n} U(b_j)^2, \tag{6} \]
then $H$ generates a continuous semigroup $S_t$, holomorphic in the open right half-plane, with a kernel $\kappa_t$:
\[ (S_t\psi)(x) = \int_{\mathbb{R}^k} dy\, \kappa_t(x, y)\psi(y). \]
We will call $\kappa_t$ a reduced heat kernel. In [7] ter Elst and Robinson proved that


\[ |\kappa_t(x, y)| \le C e^{-\lambda_1 t} \exp(-c(|x|^\alpha + |y|^\alpha)) \tag{7} \]
for $t > 1$ and some $\alpha > 0$, where $\lambda_1$ is the smallest eigenvalue of the operator $H$ and $|x|$ denotes the Euclidean norm of $x$. Our result applied to the operator (6) yields that
\[ |\kappa_t(x, y)| \le C t^{-k/2} \exp(-ct(|x|^\alpha + |y|^\alpha)) \tag{8} \]
for $t < 1$ and some $\alpha > 0$. Note that our estimates are sufficiently strong to verify the well known fact that $S_t$ is of trace class. We also give an alternative proof of (7). We do it by showing that the large time estimate (7) is a direct consequence of the small time estimate (8).
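The trace class remark can be made concrete. The following is only a sketch under the bound (8), for $0 < t < 1$; the shorthand $I_\alpha$ is introduced here for illustration and is not notation from the paper. It bounds the Hilbert–Schmidt norm of $S_{t/2}$ and then uses that a product of two Hilbert–Schmidt operators is trace class.

```latex
% Sketch: Hilbert--Schmidt bound for S_{t/2} from the small-time estimate (8).
% I_alpha is an illustrative shorthand, not the paper's notation.
\begin{align*}
\|S_{t/2}\|_{HS}^{2}
  &= \int_{\mathbb{R}^k}\!\int_{\mathbb{R}^k} |\kappa_{t/2}(x,y)|^{2}\,dx\,dy
   \le C^{2}\,(t/2)^{-k}\Bigl(\int_{\mathbb{R}^k} e^{-ct|x|^{\alpha}}\,dx\Bigr)^{2} \\
  &= C^{2}\,(t/2)^{-k}\,(ct)^{-2k/\alpha}\,I_\alpha^{2},
  \qquad I_\alpha = \int_{\mathbb{R}^k} e^{-|u|^{\alpha}}\,du < \infty, \\
\|S_t\|_{1} &= \|S_{t/2}S_{t/2}\|_{1} \le \|S_{t/2}\|_{HS}^{2} < \infty .
\end{align*}
```

For $t \ge 1$ the same conclusion follows from the semigroup property, since $S_t$ is then a product of a trace class operator and a bounded one.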

2. Preliminaries

In what follows we will use the following version of the finite speed propagation property of solutions of the wave equation for the operator $H$.

Theorem 1. Suppose that the operator $H$ is defined by (1) and that $V \ge 0$. Then for $C_t(\lambda) = \cos t\sqrt{\lambda}$ the following holds:
\[ \operatorname{supp} K_{C_t(H)} \subset \{(x, y) \in M^2 : d(x, y) \le t\}. \]

Proof. By virtue of Theorem 4.1 [17], for any vector field $Y$ such that $\langle Y, Y\rangle \in L^1_{loc}(M)$ and any function $V : M \to \mathbb{R}$, $V \in L^1_{loc}(M)$, there exist sequences of smooth vector fields $Y_n$ and smooth positive functions $V_n$ such that $H_n = H(Y_n, V_n)$ converges to $H = H(Y, V)$ in the strong resolvent sense. Hence by [15, Theorem VIII.20] $C_t(H_n)$ converges to $C_t(H)$ in the strong operator topology. Therefore it is enough to prove Theorem 1 for a smooth vector field $Y$ and a smooth function $V$. Now the proof of Theorem 1 relies on the following lemma.

Lemma 1. Suppose that a function $\Phi \in C^\infty(M \times \mathbb{R})$ solves the wave equation, i.e.
\[ \partial_t^2 \Phi(x, t) = -H\Phi(x, t). \tag{9} \]
Then for every $y \in M$ there exists a constant $c_y > 0$ such that the function
\[ P(t) = \int_{B(c_y - t,\, y)} \bigl( \langle \operatorname{grad}\Phi + i\Phi Y, \operatorname{grad}\Phi + i\Phi Y\rangle + V|\Phi|^2 + |\partial_t\Phi|^2 \bigr)\, dx \]
is nonincreasing for $0 < t < c_y$, where $B(r, y) = \{x : d(x, y) \le r\}$.

Proof. To prove Lemma 1 it is enough to show that
\[ \partial_t P(t) = \partial_t \int_{B(c_y - t,\, y)} \bigl( \langle \operatorname{grad}\Phi + i\Phi Y, \operatorname{grad}\Phi + i\Phi Y\rangle + V|\Phi|^2 + |\partial_t\Phi|^2 \bigr)\, dx \le 0. \]

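We choose $c_y$ so small that the geodesic exponential map is a diffeomorphism for $x \in B(c_y, y)$. Next we note that in that domain any vector tangent to a geodesic is a normal vector to the sphere, so
\[ \partial_t \int_{B(t, y)} \varphi\, dx = \int_{\partial B(t, y)} \varphi\, d\sigma, \]
where $d\sigma$ is surface measure on $\partial B$. Hence
\begin{align*}
\partial_t P(t) ={}& 2\,\mathrm{Re} \int_{B(c_y - t,\, y)} \bigl( \langle \operatorname{grad}\Phi + i\Phi Y, \operatorname{grad}\partial_t\Phi + i\partial_t\Phi\, Y\rangle + V\Phi\,\partial_t\bar\Phi + \partial_t^2\Phi\,\partial_t\bar\Phi \bigr)\, dx \\
& - \int_{\partial B(c_y - t,\, y)} \bigl( \langle \operatorname{grad}\Phi + i\Phi Y, \operatorname{grad}\Phi + i\Phi Y\rangle + V|\Phi|^2 + |\partial_t\Phi|^2 \bigr)\, d\sigma. \tag{10}
\end{align*}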

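Now put $X_t = \operatorname{grad}\Phi + i\Phi Y$. By the definition of gradient
\[ \langle X_t, \operatorname{grad}\partial_t\Phi\rangle = X_t\,\partial_t\bar\Phi. \tag{11} \]
On the other hand, for any $\varphi \in C^\infty(M)$,
\[ \operatorname{div}\varphi X = \varphi \operatorname{div} X + X\varphi, \tag{12} \]
so
\begin{align*}
& \langle \operatorname{grad}\Phi + i\Phi Y, \operatorname{grad}\partial_t\Phi + i\partial_t\Phi\, Y\rangle + V\Phi\,\partial_t\bar\Phi + \partial_t^2\Phi\,\partial_t\bar\Phi \\
&\quad = X_t\,\partial_t\bar\Phi + \langle \operatorname{grad}\Phi + i\Phi Y, iY\rangle\,\partial_t\bar\Phi + V\Phi\,\partial_t\bar\Phi + \partial_t^2\Phi\,\partial_t\bar\Phi \\
&\quad = \operatorname{div}(\partial_t\bar\Phi\, X_t) - \partial_t\bar\Phi \operatorname{div} X_t + \langle \operatorname{grad}\Phi + i\Phi Y, iY\rangle\,\partial_t\bar\Phi + V\Phi\,\partial_t\bar\Phi + \partial_t^2\Phi\,\partial_t\bar\Phi \tag{13} \\
&\quad = \operatorname{div}(\partial_t\bar\Phi\, X_t) + (H + \partial_t^2)\Phi\,\partial_t\bar\Phi \\
&\quad = \operatorname{div}(\partial_t\bar\Phi\, X_t),
\end{align*}
where the last step uses the wave equation (9). In virtue of (10) and (13),
\[ \partial_t P(t) = 2\,\mathrm{Re} \int_{B(c_y - t,\, y)} \operatorname{div}(\partial_t\bar\Phi\, X_t)\, dx - \int_{\partial B(c_y - t,\, y)} \bigl( \langle \operatorname{grad}\Phi + i\Phi Y, \operatorname{grad}\Phi + i\Phi Y\rangle + V|\Phi|^2 + |\partial_t\Phi|^2 \bigr)\, d\sigma. \]
We denote by $n$ a normal vector to the surface $\partial B(c_y - t, y)$. Then
\[ 2\,\mathrm{Re} \int_{B(c_y - t,\, y)} \operatorname{div}(\partial_t\bar\Phi\, X_t)\, dx = 2\,\mathrm{Re} \int_{\partial B(c_y - t,\, y)} \langle \partial_t\bar\Phi\, X_t, n\rangle\, d\sigma \le \int_{\partial B(c_y - t,\, y)} \bigl( \langle X_t, X_t\rangle + |\partial_t\Phi|^2 + V|\Phi|^2 \bigr)\, d\sigma. \]
This proves Lemma 1. (See also [8, §5, pp. 209–215]).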


Using Lemma 1 we can easily obtain Theorem 1. First we note that if $\varphi, \psi \in C_c^\infty(M)$, and $\Psi : M \times \mathbb{R} \to \mathbb{C}$ is defined by
\[ \Psi(x, t) = C_t(\sqrt{H})(\varphi)(x) + S_t(\sqrt{H})(\psi)(x), \tag{14} \]
where $C_t(\lambda) = \cos t\lambda$ and $S_t(\lambda) = \frac{\sin t\lambda}{\lambda}$, then
\[ \partial_t^2 \Psi(x, t) = -H\Psi(x, t) \quad\text{and}\quad \Psi(x, 0) = \varphi(x),\ \partial_t\Psi(x, 0) = \psi(x). \]
In virtue of Lemma 1, if $c_x > t$,
\[ \operatorname{supp} K_{C_t(\sqrt{H})}(x, \cdot) = \operatorname{supp} K_{C_t(\sqrt{H})}(\cdot, x) \subset B(t, x), \tag{15} \]
\[ \operatorname{supp} K_{S_t(\sqrt{H})}(x, \cdot) = \operatorname{supp} K_{S_t(\sqrt{H})}(\cdot, x) \subset B(t, x). \tag{16} \]
However, the operators $S_t(\sqrt{H})$ and $C_t(\sqrt{H})$ as functions of $t$ are continuous in the strong operator topology. Therefore for a given point $x \in M$ the set of all $t$ such that (15) and (16) hold is closed. Hence either (15) and (16) are true for all $t$ or we can choose the biggest number $t_1$ such that for all $0 \le t \le t_1$, (15) and (16) hold. By the completeness of the Riemannian metric the ball $B(t_1 + 1, x)$ is compact, so there exists $c > 0$ such that for any $y \in B(t_1 + 1, x)$ we have $c_y > c$, where $c_y$ is the constant from Lemma 1. By virtue of Lemma 1, if functions $\varphi, \psi \in C_c^\infty(M)$ satisfy
\[ \operatorname{dist}\{x, \operatorname{supp}\varphi \cup \operatorname{supp}\psi\} > t_1 + t_2 \tag{17} \]
for $t_2 \le c$, then
\[ (\operatorname{supp}\Psi(\cdot, t_2) \cup \operatorname{supp}\partial_t\Psi(\cdot, t_2)) \cap B(t_1, x) = \emptyset. \]
However
\[ \Psi(x, t_1 + t_2) = C_{t_1}(\sqrt{H})\Psi(\cdot, t_2)(x) + S_{t_1}(\sqrt{H})\,\partial_t\Psi(\cdot, t_2)(x). \]

Hence (15) and (16) hold for all $t \le t_1 + c$, which contradicts the definition of $t_1$. This proves Theorem 1.

In the sequel we will also need the following Gaussian estimates for the heat kernel corresponding to $H$.

Theorem 2. Suppose that (2) holds and $k_t(x, y)$ is the heat kernel corresponding to the operator $H$, i.e., $k_t(x, y) = K_{\exp(-tH)}(x, y)$. Then
\[ |k_t(x, y)| \le C(1 \wedge t)^{-k/2} \exp\Bigl( -\frac{d^2(x, y)}{4(1 + \varepsilon)t} \Bigr) \]
with a constant $C$ independent of the function $V \ge 0$.


Proof. Assume that $Y = 0$ and $V = 0$. Then by Corollary 2.4.7 of [4] (see also [7, §4.2a]),
\[ |K_{\exp(-tH(0,0))}(x, y)| \le C t^{-k/2} \]
for $t \le 1$. However, $H(0, 0)$ is positive definite, so $\|\exp(-tH(0, 0))\|_{L^2 \to L^2} \le 1$ and for $t \ge 1$,
\begin{align*}
K_{\exp(-tH(0,0))}(x, y) &\le \|\exp(-tH(0, 0))\|_{L^1 \to L^\infty} \le \|\exp(-tH(0, 0)/2)\|_{L^2 \to L^\infty}^2 \\
&\le \|\exp(-H(0, 0)/2)\|_{L^2 \to L^\infty}^2 = \sup_{x \in M} \|K_{\exp(-H(0,0)/2)}(x, \cdot)\|_{L^2}^2 \\
&= \sup_{x, y \in M} K_{\exp(-H(0,0))}(x, y) \le C.
\end{align*}
Thus
\[ K_{\exp(-tH(0,0))}(x, y) \le C(1 \wedge t)^{-k/2} \]
and Theorem 2 follows by Theorem 1 of [18] or [5]. We obtain Theorem 2 for any $V \ge 0$ and $Y$ in virtue of the following theorem (we put $A = H(0, 0)$ and $B = H(Y, V)$, see also Theorem 2.3 [17]).

Theorem 3. (Theorem 4.2, p. 270 [1]) Let $(T(t))_{t \ge 0}$ be a positive semigroup with generator $A$ and $(S(t))_{t \ge 0}$ a semigroup with generator $B$. The following assertions are equivalent:
(i) $|S(t)\psi| \le T(t)|\psi|$.
(ii) $\mathrm{Re}\,(\operatorname{sign}\bar\psi\, B\psi, \varphi) \le (|\psi|, A'\varphi)$ for all $\psi \in D(B)$ and $\varphi \in D(A')$ with $\varphi \ge 0$.

Remark 1. It is also possible to derive the finite speed propagation property of the wave equation, i.e. Theorem 1, from Gaussian estimates given by Theorem 2 (see Theorem 3 of [18]).

3. Abstract Theorem

The main result of this paper is the following theorem.

Theorem 4. Let $k_t(x, y)$ be the heat kernel corresponding to the operator $H$. Suppose that the Nash inequality (2) is satisfied and that (4) holds. Then
\[ |k_t(x, x)| \le C(1 \wedge t)^{-k/2} \bigl( \exp(-c_1 t\, d(x, 0)^\alpha) + \exp(-c_2 t^{-1} d(x, 0)^2) \bigr). \]

Proof. For $s > 0$ we define a function $v_s$ by the formula
\[ v_s(x) = \sigma \max\{s^\alpha - d(x, 0)^\alpha, 0\} \]
and an operator $H_s$ by
\[ H_s\psi(x) = H\psi(x) + v_s(x)\psi(x), \]
i.e. $H_s = H(Y, V_s)$, where $V_s = V + v_s$. Next, if we put $s = d(x, 0)/2$ in Lemma 2 and Lemma 3 below we obtain Theorem 4 just by the triangle inequality.


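Lemma 2. For any $x, y \in \mathbb{R}^k$ and $t > 0$,
\[ |K_{\exp(-tH_s)}(x, y)| \le C(1 \wedge t)^{-k/2} e^{-\sigma t s^\alpha}. \]

Lemma 3. For any $s, t > 0$ and $x \in M$,
\[ |K_{\exp(-tH_s)}(x, x) - k_t(x, x)| \le C_\varepsilon (1 \wedge t)^{-k/2} \exp\Bigl( -\frac{d(x, B(s, 0))^2}{t(4 + \varepsilon)} \Bigr) \]
for all $\varepsilon > 0$, where $B(s, 0) = \{y \in \mathbb{R}^k ;\ d(y, 0) \le s\}$.

Proof (of Lemma 2). Let $s > 0$. Then by (4), $V_s - \sigma s^\alpha \ge 0$. Hence by Theorem 2 we have
\[ |K_{\exp(-tH(Y, V_s - \sigma s^\alpha))}(x, y)| = |K_{\exp(-t(H_s - \sigma s^\alpha))}(x, y)| \le C(1 \wedge t)^{-k/2} \exp\Bigl( -\frac{d^2(x, y)}{4(1 + \varepsilon)t} \Bigr) \le C(1 \wedge t)^{-k/2}. \tag{18} \]
However,
\[ K_{\exp(-t(H_s - \sigma s^\alpha))}(x, y) = K_{\exp(-tH_s)}(x, y)\, e^{\sigma t s^\alpha} \]
and Lemma 2 follows from (18).

Proof (of Lemma 3). Let the function $(\,\cdot\,)_+ : \mathbb{R} \to \mathbb{R}$ be defined by
\[ (x)_+ = \begin{cases} x & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases} \]
Then, for $\beta > -1$, the substitution $u = r^2 - x^2$ gives
\[ \exp(-x^2) = \frac{2}{\Gamma(\beta + 1)} \int_0^\infty dr\, (r^2 - x^2)_+^\beta\, r\, e^{-r^2}. \]
Hence
\[ \frac{1}{\sqrt{4\pi t}} \exp\Bigl( -\frac{x^2}{4t} \Bigr) = \frac{2}{\Gamma(\beta + 1)} \frac{1}{\sqrt{\pi}} \Bigl( \frac{1}{4t} \Bigr)^{\beta + 3/2} \int_0^\infty dr\, (r^2 - x^2)_+^\beta\, r\, e^{-\frac{r^2}{4t}}. \]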
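The passage from the first identity to the second is a rescaling. As a sketch, write $c_\beta$ for the constant in front of the integral (a shorthand used only here; its exact value plays no role in the estimates that follow), replace $x$ by $x/\sqrt{4t}$ and substitute $r = \rho/\sqrt{4t}$:

```latex
% Rescaling step; c_beta is an illustrative shorthand for the normalising constant.
\begin{align*}
e^{-x^{2}/(4t)}
  &= c_\beta \int_0^\infty \Bigl(\frac{\rho^{2}-x^{2}}{4t}\Bigr)_+^{\beta}\,
     \frac{\rho}{\sqrt{4t}}\; e^{-\rho^{2}/(4t)}\,\frac{d\rho}{\sqrt{4t}} \\
  &= c_\beta\,(4t)^{-\beta-1}\int_0^\infty (\rho^{2}-x^{2})_+^{\beta}\,\rho\, e^{-\rho^{2}/(4t)}\,d\rho, \\
\frac{1}{\sqrt{4\pi t}}\,e^{-x^{2}/(4t)}
  &= \frac{c_\beta}{\sqrt{\pi}}\,(4t)^{-\beta-3/2}\int_0^\infty (\rho^{2}-x^{2})_+^{\beta}\,\rho\, e^{-\rho^{2}/(4t)}\,d\rho .
\end{align*}
```

The left-hand side is the Gaussian whose Fourier transform in $x$ is $e^{-t\lambda^2}$, which is why the factor $1/\sqrt{4\pi t}$ is inserted before transforming.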

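Taking the Fourier transform on both sides yields
\[ \exp(-t\lambda^2) = \frac{2}{\Gamma(\beta + 1)} \Bigl( \frac{1}{4t} \Bigr)^{\beta + 3/2} \int_0^\infty dr\, F_r^\beta(\lambda)\, r\, e^{-\frac{r^2}{4t}}, \tag{19} \]
where $F_r^\beta$ is the Fourier transform of $x \in \mathbb{R} \mapsto \pi^{-1/2}(r^2 - x^2)_+^\beta$. By (19),
\[ \exp(-tL) = \frac{2}{\Gamma(\beta + 1)} \Bigl( \frac{1}{4t} \Bigr)^{\beta + 3/2} \int_0^\infty dr\, F_r^\beta(\sqrt{L})\, r\, e^{-\frac{r^2}{4t}}, \tag{20} \]
where $L$ is any positive self-adjoint operator. Next, for $L = H$ or $L = H_s$ we have
\[ \operatorname{supp} K_{F_r^\beta(\sqrt{L})}(x, \cdot) \subset B(r, x). \tag{21} \]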

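Indeed, if $f$ is an even function then by the Fourier inversion formula
\[ f(\lambda) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} dt\, \hat f(t) \cos(t\lambda) \tag{22} \]
and in virtue of (22),
\[ f(\sqrt{L}) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} dt\, \hat f(t)\, C_t(\sqrt{L}). \tag{23} \]
Thus by (23)
\[ F_r^\beta(\sqrt{L}) = \frac{2}{\sqrt{\pi}} \int_0^\infty dt\, (r^2 - t^2)_+^\beta\, C_t(\sqrt{L}), \tag{24} \]
and (21) follows by Theorem 1 and (24). Note that $H_s = H$ on $\mathbb{R}^k - B(s, 0)$, so by (21)
\[ K_{F_r^\beta(\sqrt{H_s})}(x, \cdot) = K_{F_r^\beta(\sqrt{H})}(x, \cdot) \tag{25} \]
for $r \le d(x, B(s, 0))$. Now assume that for $L = H$, or $L = H_s$, and $\beta > k - 1$,
\[ |K_{F_r^\beta(\sqrt{L})}(x, y)| \le C(r^{2\beta - k + 1} + r^{2\beta + 1}). \tag{26} \]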

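Then by (20) and (25),
\begin{align*}
|K_{\exp(-tH_s)}(x, x) - K_{\exp(-tH)}(x, x)|
&\le \frac{2}{\Gamma(\beta + 1)} \Bigl( \frac{1}{4t} \Bigr)^{\beta + 3/2} \int_0^\infty dr\, |K_{F_r^\beta(\sqrt{H_s})}(x, x) - K_{F_r^\beta(\sqrt{H})}(x, x)|\, r\, e^{-\frac{r^2}{4t}} \\
&= \frac{2}{\Gamma(\beta + 1)} \Bigl( \frac{1}{4t} \Bigr)^{\beta + 3/2} \int_{d(x, B(s,0))}^\infty dr\, |K_{F_r^\beta(\sqrt{H_s})}(x, x) - K_{F_r^\beta(\sqrt{H})}(x, x)|\, r\, e^{-\frac{r^2}{4t}}. \tag{27}
\end{align*}
Finally by (26), for $L = H$ or $L = H_s$,
\begin{align*}
\Bigl( \frac{1}{4t} \Bigr)^{\beta + 3/2} \int_{d(x, B(s,0))}^\infty dr\, |K_{F_r^\beta(\sqrt{L})}(x, y)|\, r\, e^{-\frac{r^2}{4t}}
&\le C \Bigl( \frac{1}{4t} \Bigr)^{\beta + 3/2} \int_{d(x, B(s,0))}^\infty dr\, (r^{2\beta - k + 1} + r^{2\beta + 1})\, r\, e^{-\frac{r^2}{4t}} \\
&= C(4t)^{-k/2} \int_{\frac{d(x, B(s,0))}{\sqrt{4t}}}^\infty dr\, r^{2\beta - k + 1}\, r\, e^{-r^2} + C \int_{\frac{d(x, B(s,0))}{\sqrt{4t}}}^\infty dr\, r^{2\beta + 1}\, r\, e^{-r^2}
\end{align*}
and we obtain Lemma 3 from the elementary inequality
\[ \int_a^\infty dr\, r^b\, r\, e^{-r^2} \le C_b (1 + a)^b \exp(-a^2) \le C_{b, \varepsilon} \exp(-a^2/(1 + \varepsilon)). \tag{28} \]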
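To see how (28) produces the exact form of Lemma 3, the following sketch spells out the bookkeeping. The symbols $\delta$ and $\varepsilon'$ are shorthands introduced only for this illustration.

```latex
% Sketch: inserting a = delta / sqrt(4t) into (28), with delta = d(x, B(s,0)).
% delta and epsilon' are illustrative shorthands, not the paper's notation.
\begin{align*}
a = \frac{\delta}{\sqrt{4t}}:\qquad
\exp\Bigl(-\frac{a^{2}}{1+\varepsilon'}\Bigr)
  &= \exp\Bigl(-\frac{\delta^{2}}{4(1+\varepsilon')\,t}\Bigr)
   = \exp\Bigl(-\frac{\delta^{2}}{t\,(4+\varepsilon)}\Bigr),
   \qquad \varepsilon = 4\varepsilon', \\
(4t)^{-k/2} + 1 &\le 2\,(1\wedge t)^{-k/2} \qquad \text{for all } t > 0 .
\end{align*}
```

The second line holds because both terms are at most $t^{-k/2}$ when $t \le 1$ and at most $1$ when $t \ge 1$; it accounts for the prefactor $(1 \wedge t)^{-k/2}$ in Lemma 3.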


Proof (of (26)). As above we put $L = H$ or $L = H_s$. By $p_t$ we denote $p_t(x, y) = K_{\exp(-tL)}(x, y)$. It follows easily from spectral theory that if $L$ is a self-adjoint, positive-definite operator, $p_1(x, \cdot) \in L^2$ and we define a measure $\mu_x$ by the formula
\[ \int_0^\infty F(\lambda)\, d\mu_x(\lambda) = \int_0^\infty (e^{-\lambda^2})^{-2} F(\lambda)\, 2\lambda\, d(E(\lambda^2) p_1(x, \cdot), p_1(x, \cdot)), \]
then
\[ \|K_{F(\sqrt{L})}(x, \cdot)\|_{L^2(dx)}^2 = \int_0^\infty |F(\lambda)|^2\, d\mu_x(\lambda). \tag{29} \]
On the other hand, by Theorem 2 for $L = H$ or $L = H_s$,
\[ \|p_t(x, \cdot)\|_{L^2(dx)}^2 = p_{2t}(x, x) \le C(1 \wedge t)^{-k/2}. \]
Hence
\[ \mu_x([0, r)) \le e \int_0^r d\mu_x(\lambda)\, e^{-\lambda^2 r^{-2}} \le e \int_0^\infty d\mu_x(\lambda)\, e^{-\lambda^2 r^{-2}} = e\, \|p_{\frac{1}{2r^2}}(x, \cdot)\|_{L^2}^2 \le C'(1 + r^k). \tag{30} \]

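Now, for a function $F$, we put $G_1 = \sqrt{|F|}\, \operatorname{sign} F$ and $G_2 = \sqrt{|F|}$, so that $F = G_1 G_2$. Then
\begin{align*}
|K_{F(\sqrt{L})}(x_1, x_2)| &= \Bigl| \int_M dy\, K_{G_1(\sqrt{L})}(x_1, y)\, K_{G_2(\sqrt{L})}(y, x_2) \Bigr| \\
&\le \|K_{G_1(\sqrt{L})}(x_1, \cdot)\|_{L^2}\, \|K_{G_2(\sqrt{L})}(x_2, \cdot)\|_{L^2} \\
&= \Bigl( \int_0^\infty d\mu_{x_1}(\lambda)\, |G_1(\lambda)|^2 \Bigr)^{1/2} \Bigl( \int_0^\infty d\mu_{x_2}(\lambda)\, |G_2(\lambda)|^2 \Bigr)^{1/2} \\
&= \Bigl( \int_0^\infty d\mu_{x_1}(\lambda)\, |F(\lambda)| \Bigr)^{1/2} \Bigl( \int_0^\infty d\mu_{x_2}(\lambda)\, |F(\lambda)| \Bigr)^{1/2},
\end{align*}
i.e.,
\[ |K_{F(\sqrt{L})}(x_1, x_2)| \le \Bigl( \int_0^\infty d\mu_{x_1}(\lambda)\, |F(\lambda)| \Bigr)^{1/2} \Bigl( \int_0^\infty d\mu_{x_2}(\lambda)\, |F(\lambda)| \Bigr)^{1/2}. \tag{31} \]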

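It is not difficult, however, to verify that for some constant $C_\beta$ independent of $\lambda$ and $r$,
\[ |F_r^\beta(\lambda)| \le C_\beta \frac{r^{2\beta + 1}}{1 + |r\lambda|^{\beta + 1}}, \tag{32} \]
so by (30)
\begin{align*}
\int_0^\infty d\mu_x(\lambda)\, |F_r^\beta(\lambda)|
&\le C_\beta \int_0^\infty d\mu_x(\lambda)\, \frac{r^{2\beta + 1}}{1 + |r\lambda|^{\beta + 1}} \\
&= -C_\beta \int_0^\infty d\mu_x(\lambda) \int_\lambda^\infty ds\, \frac{d}{ds} \frac{r^{2\beta + 1}}{1 + (rs)^{\beta + 1}} \\
&= -C_\beta \int_0^\infty ds\, \mu_x([0, s))\, \frac{d}{ds} \frac{r^{2\beta + 1}}{1 + (rs)^{\beta + 1}} \\
&\le -C_\beta \int_0^\infty ds\, C'(1 + s^k)\, \frac{d}{ds} \frac{r^{2\beta + 1}}{1 + (rs)^{\beta + 1}} \\
&= r^{2\beta - k + 1}\, C' C_\beta (\beta + 1) \int_0^\infty ds\, \frac{s^\beta s^k}{(1 + s^{\beta + 1})^2} + r^{2\beta + 1}\, C' C_\beta (\beta + 1) \int_0^\infty ds\, \frac{s^\beta}{(1 + s^{\beta + 1})^2},
\end{align*}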

which in virtue of (31) proves (26).

Remark 2.
1. The particular form of the operator $H$ seems to play very little role in the proof of Theorem 4, and one could ask whether Theorem 4 could be stated in a more general way, for example for a generator of a semigroup with the finite speed propagation property of the solution of the wave equation. However, if we consider differential operators with constant coefficients on $\mathbb{R}^n$ then in virtue of the Paley-Wiener theorem, Theorem 1 holds only for operators of the form (1). We do not know how to precisely state and prove a generalisation of the above observation to the case of differential operators with variable coefficients, but we conjecture that any extension of Theorem 1 is contained in Remark 2.2 below.
2. It is possible to prove a version of Theorem 4 for operators of the following form:
\[ (H\psi, \psi) = -\int_M \sum_{j=1}^{n} |(X_j + iY_j(x))\psi(x)|^2 + |\psi(x)|^2 V(x), \tag{33} \]
where we assume that the vector fields $X_j$ satisfy Hörmander's condition. We can prove Theorem 1 for the operator in (33) using Theorem 3 of [18]. However, we have to replace the Riemannian distance with the sub-Riemannian distance corresponding to the vector fields $X_j$ (see §III.4 of [19] or §IV.4b of [16]). The proof of Theorem 4 in this case is the same.
3. It follows from the proof that for any $0 < s < 1$ and $\varepsilon > 0$ we can put $c_1 = \sigma s^\alpha$ and $c_2 = (1 - s)^2/(4 + \varepsilon)$ as constants in Theorem 4.

4. Large Time Estimates

In virtue of Theorem 4 we can easily obtain large time estimates on the semigroup kernels.

Theorem 5. If $H$ satisfies the hypotheses of Theorem 4 and $k_t$ is the corresponding heat kernel, then
\[ |k_t(x, x)| \le \begin{cases} C t^{-k/2} \exp(-ct\, d(0, x)^\alpha) & \text{if } t \le (1 + d(0, x))^{1 - \alpha/2}, \\ C e^{-t\lambda_1} \exp(-c\, d(0, x)^{1 + \alpha/2}) & \text{if } t > (1 + d(0, x))^{1 - \alpha/2}, \end{cases} \tag{34} \]
where $\lambda_1$ is the smallest eigenvalue of $H$.
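The threshold $(1 + d(0, x))^{1 - \alpha/2}$ in (34) is the time at which the two exponents of Theorem 4 balance, and also the time at which the two regimes of (34) agree. A short check, writing $\delta = d(0, x) \ge 1$ and $t_* = (1 + \delta)^{1 - \alpha/2}$ (both shorthands introduced here only for illustration), with $\asymp$ meaning equality up to constants depending on $\alpha$:

```latex
% Sketch: balance of exponents at the crossover time t_* = (1+delta)^{1-alpha/2}.
% delta and t_* are illustrative shorthands; delta >= 1 so that delta <= 1+delta <= 2 delta.
\begin{align*}
t_*\,\delta^{\alpha} &\asymp \delta^{\,1-\alpha/2}\,\delta^{\alpha} = \delta^{\,1+\alpha/2}, \\
\frac{\delta^{2}}{t_*} &\asymp \delta^{\,2-(1-\alpha/2)} = \delta^{\,1+\alpha/2}.
\end{align*}
```

So at $t = t_*$ both $\exp(-c_1 t\, \delta^\alpha)$ and $\exp(-c_2 t^{-1} \delta^2)$ are of the form $\exp(-c\, \delta^{1 + \alpha/2})$, which is the spatial decay appearing in the large time regime.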


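Proof. For $t \le (1 + d(0, x))^{1 - \alpha/2}$ and some constants $C, c > 0$,
\[ C \exp(-c\, t\, d(x, 0)^\alpha) \ge \exp(-c_2\, d(x, 0)^2 t^{-1}), \]
where $c_2$ is a constant from Theorem 4. Hence estimates (34) follow from Theorem 4. To prove Theorem 5 for $t > (1 + d(0, x))^{1 - \alpha/2}$ we note first that
\[ \|e^{-tH}\|_{L^2 \to L^2} = e^{-t\lambda_1} \quad\text{and}\quad e^{-tH} k_s(x, \cdot) = k_{s + t}(x, \cdot). \]
Next
\begin{align*}
k_t(x, x) &= \|k_{t/2}(x, \cdot)\|_{L^2}^2 \\
&= \bigl\| e^{(-t/2 + 2^{-1}(1 + d(x,0))^{1 - \alpha/2})H}\, k_{2^{-1}(1 + d(x,0))^{1 - \alpha/2}}(x, \cdot) \bigr\|_{L^2}^2 \\
&\le e^{(-t + (1 + d(x,0))^{1 - \alpha/2})\lambda_1}\, \|k_{2^{-1}(1 + d(x,0))^{1 - \alpha/2}}(x, \cdot)\|_{L^2}^2 \\
&= e^{(-t + (1 + d(x,0))^{1 - \alpha/2})\lambda_1}\, k_{(1 + d(x,0))^{1 - \alpha/2}}(x, x) \\
&\le C e^{-t\lambda_1} \bigl( 1 \wedge (1 + d(x, 0))^{1 - \alpha/2} \bigr)^{-k/2} \\
&\quad \times \Bigl[ \exp\Bigl( -c_1 \frac{d(0, x)^\alpha}{(1 + d(0, x))^{\alpha/2 - 1}} \Bigr) + \exp\Bigl( -c_2 \frac{d(0, x)^2}{(1 + d(0, x))^{1 - \alpha/2}} \Bigr) \Bigr] \\
&\quad \times \exp\bigl( (1 + d(x, 0))^{1 - \alpha/2} \lambda_1 \bigr) \\
&\le C' e^{-t\lambda_1} \exp\bigl( -c'\, d(0, x)^{1 + \alpha/2} + (1 + d(x, 0))^{1 - \alpha/2} \lambda_1 \bigr) \\
&\le C'' e^{-t\lambda_1} \exp(-c''\, d(0, x)^{1 + \alpha/2}).
\end{align*}

Corollary 1. Under the assumptions of Theorem 4,
\[ |k_t(x, y)| \le \begin{cases} C t^{-k/2} \exp\bigl( -ct(d(0, x)^{\alpha'} + d(0, y)^{\alpha'}) \bigr) & \text{if } t \le 1, \\ C e^{-t\lambda_1} \exp\bigl( -c(d(0, x)^{\alpha''} + d(0, y)^{\alpha''}) \bigr) & \text{if } t > 1, \end{cases} \tag{35} \]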

where $\lambda_1$ is the smallest eigenvalue of $H$ and $\alpha' = 2 \wedge \alpha$ and $\alpha'' = \alpha \wedge (1 + \alpha/2)$.

Proof. In virtue of Theorem 5 and Theorem 4 there exist constants $C', c'$ such that
\[ |k_t(x, x)| \le \begin{cases} C' t^{-k/2} \exp(-c' t\, d(0, x)^{\alpha'}) & \text{if } t \le 1, \\ C' e^{-t\lambda_1} \exp(-c'\, d(0, x)^{\alpha''}) & \text{if } t > 1. \end{cases} \tag{36} \]
However
\[ |k_t(x, y)| = \Bigl| \int k_{t/2}(x, z)\, k_{t/2}(z, y)\, dz \Bigr| \le \|k_{t/2}(x, \cdot)\|_{L^2}\, \|k_{t/2}(y, \cdot)\|_{L^2} = k_t(x, x)^{1/2}\, k_t(y, y)^{1/2} \]
and for $c = c'/2$ (35) follows from (36).
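The choice $c = c'/2$ comes from taking square roots of the on-diagonal bounds. Spelled out for the small time case $t \le 1$ (the case $t > 1$ is analogous, with $\alpha''$ in place of $\alpha'$ and $e^{-t\lambda_1}$ in place of $t^{-k/2}$):

```latex
% Sketch: geometric mean of the two diagonal bounds (36), case t <= 1.
\begin{align*}
k_t(x,x)^{1/2}\,k_t(y,y)^{1/2}
  &\le \bigl(C' t^{-k/2} e^{-c' t\, d(0,x)^{\alpha'}}\bigr)^{1/2}
       \bigl(C' t^{-k/2} e^{-c' t\, d(0,y)^{\alpha'}}\bigr)^{1/2} \\
  &= C'\, t^{-k/2} \exp\Bigl(-\tfrac{c'}{2}\, t\,\bigl(d(0,x)^{\alpha'} + d(0,y)^{\alpha'}\bigr)\Bigr).
\end{align*}
```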


5. Reduced Heat Kernel

We will apply Theorem 4 to the reduced heat kernel on a nilpotent Lie group $G$, i.e. to the semigroup kernel corresponding to the operator $H$ defined by (6). In the sequel we will assume that our unitary irreducible representation $U$ is constructed in the same way as in Theorem 1.1, Theorem 1.8 and Lemma 1.10 of [14]. Any unitary irreducible representation is equivalent to some representation constructed in such a way. Alternatively we can assume that the representation $U$ is the one considered in case 1 in the proof of Theorem 4.1.1 of [3]. Such a representation $U$ acts on $\mathbb{R}^k$ and there exists a subalgebra $\mathfrak{g}'$ of codimension one in $\mathfrak{g}$ and a vector $a_k \in \mathfrak{g}$ such that
\[ U(a')(x_1, \dots, x_k) = U'(\mathrm{Ad}_{\exp x_k a_k} a')(x_1, \dots, x_{k-1}) = U'\Bigl( \sum_{j=0}^{r} \frac{x_k^j}{j!} (\mathrm{ad}\, a_k)^j a' \Bigr)(x_1, \dots, x_{k-1}) \tag{37} \]
for all $a' \in \mathfrak{g}'$, where $U'$ is an irreducible representation of the subalgebra $\mathfrak{g}'$ acting on $\mathbb{R}^{k-1}$. In addition by Theorem 1.12 and (1.29) of [14] (or see the proof of Theorem 4.1.1, case 1 of [3]) there exists $b \in \mathfrak{g}$ satisfying
\[ U(b) = i x_k, \tag{38} \]
and by (1.29) of [14] there is $b' \in \mathfrak{g}$ such that
\[ U(b') = i. \tag{39} \]

We can state a version of the condition (4) for the operator $H$ defined by (6) in the following way.

Lemma 4. If $U$ is a representation described above and $b_1, \dots, b_n$ is a linear basis of $\mathfrak{g}$, then there exist $\alpha > 0$ and $C > 0$ such that
\[ \sum_{j=1}^{n} |U(b_j)\psi(x)|^2 \ge C(1 + |x|)^\alpha |\psi(x)|^2. \]

Proof. We will prove Lemma 4 by induction on the dimension of $\mathfrak{g}$. For $\dim \mathfrak{g} = 1$, Lemma 4 is obvious. Next let $b$ satisfy (38) and $b'$ satisfy (39). We assume that the set $\{b_j\}$ is a basis for $\mathfrak{g}$, so there exist numbers $\xi_j$ and $\eta_j$ such that
\[ b = \sum_{j=1}^{n} \xi_j b_j \quad\text{and}\quad b' = \sum_{j=1}^{n} \eta_j b_j. \]
Hence by Hölder's inequality
\begin{align*}
(1 + x_k^2)|\psi(x)|^2 &= |U(b)\psi(x)|^2 + |U(b')\psi(x)|^2 \\
&= \Bigl| \sum_{j=1}^{n} \xi_j U(b_j)\psi(x) \Bigr|^2 + \Bigl| \sum_{j=1}^{n} \eta_j U(b_j)\psi(x) \Bigr|^2 \le (\|b\|^2 + \|b'\|^2) \sum_{j=1}^{n} |U(b_j)\psi(x)|^2, \tag{40}
\end{align*}


where $\|\cdot\|$ is the Euclidean norm on $\mathfrak{g}$ for which the vectors $b_i$ form an orthonormal basis. By the induction hypothesis, if $b'_j$ is a basis of $\mathfrak{g}'$, then
\[ \sum_{j=1}^{n-1} |U'(b'_j)\psi(x)|^2 \ge C(1 + |(x_1, \dots, x_{k-1})|)^{\alpha'} |\psi(x)|^2. \tag{41} \]
Next by (37),
\[ \sum_{j=1}^{n-1} |U(b'_j)\psi(x)|^2 = \sum_{j=1}^{n-1} |U'(\mathrm{Ad}_{\exp x_k a_k} b'_j)\psi(\cdot, x_k)|^2. \tag{42} \]
However, $\mathfrak{g}$ is nilpotent, so there exist polynomials $A_{mj}(x_k)$ such that
\[ b'_m = \sum_{j=1}^{n-1} A_{mj}(x_k)\, \mathrm{Ad}_{\exp x_k a_k} b'_j. \]
Hence
\[ |U'(b'_m)\psi(x)|^2 \le \Bigl( \sum_{j=1}^{n-1} A_{mj}(x_k)^2 \Bigr) \Bigl( \sum_{j=1}^{n-1} |U'(\mathrm{Ad}_{\exp x_k a_k} b'_j)\psi(x)|^2 \Bigr) \le C(1 + |x_k|)^r \Bigl( \sum_{j=1}^{n-1} |U(b'_j)\psi(x)|^2 \Bigr). \tag{43} \]
Thus by (41), (42) and (43)
\[ \sum_{j=1}^{n} |U(b_j)\psi(x)|^2 \ge C \sum_{j=1}^{n-1} |U(b'_j)\psi(x)|^2 \ge C_1 (1 + |x_k|)^{-r} \sum_{j=1}^{n-1} |U'(b'_j)\psi(x)|^2 \ge C \frac{(1 + |(x_1, \dots, x_{k-1})|)^{\alpha'}}{(1 + |x_k|)^r} |\psi(x)|^2. \tag{44} \]
Finally, if $p, q > 1$ and $1/p + 1/q = 1$, in virtue of (44), (40) we have
\[ \sum_{j=1}^{n} |U(b_j)\psi(x)|^2 \ge C \Bigl( \frac{(1 + |(x_1, \dots, x_{k-1})|)^{\alpha'} |\psi(x)|^2}{(1 + |x_k|)^r} \Bigr)^{1/p} \times \bigl( (1 + |x_k|)^2 |\psi(x)|^2 \bigr)^{1/q} \ge C(1 + |x|)^\alpha |\psi(x)|^2 \]
for $\frac{\alpha'}{p} = \frac{2}{q} - \frac{r}{p} = \alpha$. This proves Lemma 4.
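The exponents $p$ and $\alpha$ in the last step are determined by the two conditions $\alpha'/p = 2/q - r/p$ and $1/p + 1/q = 1$. A short computation, given as a sketch (with $x' = (x_1, \dots, x_{k-1})$ as a shorthand introduced here):

```latex
% Sketch: solving for p and alpha in the final step of Lemma 4.
% x' = (x_1, ..., x_{k-1}) is an illustrative shorthand.
\begin{align*}
\frac{\alpha'}{p} = 2\Bigl(1 - \frac{1}{p}\Bigr) - \frac{r}{p}
  \;&\Longrightarrow\; p = \frac{\alpha' + r + 2}{2} > 1,
  \qquad \alpha = \frac{\alpha'}{p} = \frac{2\alpha'}{\alpha' + r + 2}, \\
(1+|x'|)^{\alpha}(1+|x_k|)^{\alpha}
  &\ge (1 + |x'| + |x_k|)^{\alpha} \ge (1+|x|)^{\alpha}.
\end{align*}
```

With this choice the powers of $(1 + |x'|)$ and $(1 + |x_k|)$ in the product of the two factors are both equal to $\alpha$, and the total power of $|\psi(x)|^2$ is $1/p + 1/q = 1$.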

For any $a \in \mathfrak{g}$, $U(a)$ is a differential operator acting on $\mathbb{R}^k$ of the form
\[ U(a) = \sum_{j=1}^{k} X_j^\circ(x, a) \frac{\partial}{\partial x_j} + iY(x, a), \]
where $X_j^\circ(x, a) = X_j^\circ(x_1, \dots, x_{j-1}, a)$ and $Y(x, a)$ are polynomials in $\mathbb{R}^k$ (see [7, 14 or 3]). We define $U^\circ(a)$ by
\[ U^\circ(a) = \sum_{j=1}^{k} X_j^\circ(x, a) \frac{\partial}{\partial x_j}. \]
Next we define the Riemannian metric $\langle\cdot,\cdot\rangle$ in such a way that
\[ \langle \operatorname{grad}\psi(x), \operatorname{grad}\psi(x)\rangle = \sum_{j=1}^{n} |U^\circ(b_j)(x)\psi(x)|^2 \]
for any $\psi \in C^\infty(\mathbb{R}^k)$. Finally we define a potential $V$ and a vector field $A$ by the formula
\[ \langle \operatorname{grad}\psi(x) + i\psi(x)A, \operatorname{grad}\psi(x) + i\psi(x)A\rangle + V|\psi(x)|^2 = \sum_{j=1}^{n} |U(b_j)(x)\psi(x)|^2 \]
for any $\psi \in C^\infty(\mathbb{R}^k)$. In virtue of Lemma 4,
\[ V(x) \ge C(1 + |x|)^\alpha. \]
By $d(\cdot, \cdot)$ we denote the Riemannian distance corresponding to the metric $\langle\cdot,\cdot\rangle$. Using standard techniques we can easily prove the following.

Lemma 5. There exist $\beta > 0$ and constants $C, C'$ such that
\[ C|x|^\beta \le d(x, 0) \le C'|x|. \]

Finally, to apply Theorem 4 we need the following lemma.

Lemma 6. The system $U^\circ(b_1)(x), \dots, U^\circ(b_n)(x)$ satisfies the Nash inequality (condition (2)).

Proof. For a weak Malcev basis $a_1, \dots, a_k$ satisfying an additional condition (3) [7], Lemma 6 is stated in Corollary 3.10 of [7]. But in virtue of Lemma 2.3 of [7] if $\tilde a_1, \dots, \tilde a_k$ is any other weak Malcev basis, then the representation $\tilde U$ corresponding to this basis is given by the formula $\tilde U = J U J^*$, where $J$ is an isomorphism of $L^p(\mathbb{R}^k)$ for any $1 \le p \le \infty$. $J$ maps the system $U^\circ(b_1)(x), \dots, U^\circ(b_n)(x)$ onto the system $\tilde U^\circ(b_1)(x), \dots, \tilde U^\circ(b_n)(x)$. Hence Corollary 3.10 of [7] is true for any weak Malcev basis. This proves Lemma 6.

In virtue of Corollary 1, Lemma 4, Lemma 5 and Lemma 6 we obtain the following theorem.

Theorem 6. If $\kappa_t$ is the reduced heat kernel then there exist constants $C, c, \alpha > 0$ such that
\[ |\kappa_t(x, y)| \le \begin{cases} C t^{-k/2} \exp(-ct(|x|^\alpha + |y|^\alpha)) & \text{for } t < 1, \\ C e^{-t\lambda_1} \exp(-c(|x|^\alpha + |y|^\alpha)) & \text{for } t \ge 1, \end{cases} \tag{45} \]
where $\lambda_1$ is the smallest eigenvalue of $H$.


6. Lower Bounds

In order to verify to what extent the estimates of Theorem 4 and Theorem 5 are sharp we prove some lower bounds for the heat kernel corresponding to the Schrödinger operator
\[ -\Delta + |x|^\alpha = -\sum_{j=1}^{k} \partial^2/\partial x_j^2 + |x|^\alpha. \]

Proposition 1. If $k_t$ is a kernel of the semigroup generated by the operator $-\Delta + |x|^\alpha$, then there exist constants $c, C > 0$ such that for $t \le (1 + |x|)^{1 - \alpha/2}$
\[ C^{-1} t^{-k/2} \exp(-ct|x|^\alpha) \ge k_t(x, x) \ge C t^{-k/2} \exp(-c^{-1} t|x|^\alpha) \tag{46} \]
and for $t \ge (1 + |x|)^{1 - \alpha/2}$
\[ C^{-1} e^{-t\lambda_1} \exp(-c|x|^{1 + \alpha/2}) \ge k_t(x, x) \ge C e^{-t\lambda_1} \exp(-c^{-1}|x|^{1 + \alpha/2}). \tag{47} \]

Proof. If we put
\[ V_s(y) = \min\{s^\alpha, |y|^\alpha\}, \]
then in virtue of the Feynman–Kac formula,
\[ K_{\exp(t(\Delta - V_s))}(y, y') \ge K_{\exp(t(\Delta - s^\alpha))}(y, y') \]
and
\[ K_{\exp(t(\Delta - V_s))}(y, y) \ge C t^{-k/2} \exp(-t s^\alpha). \tag{48} \]
In the same way as in Lemma 3 we can show that
\[ |K_{\exp(t(\Delta - V_{|2x|}))}(x, x) - k_t(x, x)| \le C_\varepsilon t^{-k/2} \exp\Bigl( -\frac{|x|^2}{t(4 + \varepsilon)} \Bigr) \]
and by (48)
\[ k_t(x, x) \ge C t^{-k/2} \exp(-t|2x|^\alpha) - C_\varepsilon t^{-k/2} \exp\Bigl( -\frac{|x|^2}{t(4 + \varepsilon)} \Bigr). \tag{49} \]
On the other hand, let $\lambda_1 < \lambda_2 \le \dots$ denote the eigenvalues of the operator $H = -\Delta + |x|^\alpha$, repeated according to multiplicity, and let $\varphi_1, \varphi_2, \dots$ be the corresponding orthonormal basis of eigenfunctions. By Proposition 1.4.3 [4] $\lambda_1$ has multiplicity one. Note that
\[ k_t(x, x) = \sum_{i=1}^{\infty} e^{-t\lambda_i} |\varphi_i(x)|^2 \ge e^{-t\lambda_1} |\varphi_1(x)|^2. \tag{50} \]
Next, by Corollary 4.5.7 of [4], for some constants $C, c$,
\[ \varphi_1(x) \ge C \exp(-c|x|^{1 + \alpha/2}), \tag{51} \]
and in virtue of Theorem 4, (49), (50), (51) we obtain Proposition 1.

For $\alpha < 2$ and $t < 1$ we can obtain a more precise result. Namely, we can control the constant $c$ in the estimate (46).


Proposition 2. If $k_t$ is the heat kernel corresponding to the operator $-\Delta + |x|^\alpha$ and $\alpha < 2$ then for any $\varepsilon > 0$ there exist constants $C_\varepsilon$ and $C'_\varepsilon$ such that for $t \le 1$,
\[ C_\varepsilon t^{-k/2} \exp(-(1 - \varepsilon)t|x|^\alpha) \ge k_t(x, x) \ge C'_\varepsilon t^{-k/2} \exp(-(1 + \varepsilon)t|x|^\alpha). \]

Proof. It is not difficult to show (see Lemma 4.5.9 of [4]) the following.

Lemma 7. If $k_t$ is the heat kernel corresponding to the operator $H = -\Delta + |x|^\alpha$, then for some constants $C$ and $T$ and for all $t \le T$,
\[ k_t(x, x) \ge C t^{-k/2} \exp(-t(|x| + 1)^\alpha). \]

However, a careful examination of the constants in Theorem 4 (see Remark 2.3 at the end of §3) shows that
\[ k_t(x, x) \le C t^{-k/2} \bigl( \exp(-(1 - \varepsilon)t|x|^\alpha) + \exp(-c_\varepsilon |x|^2) \bigr) \le C_\varepsilon t^{-k/2} \exp(-(1 - \varepsilon)t|x|^\alpha). \]

Although according to Proposition 1 and Proposition 2 it seems that Theorems 4 and 5 are quite sharp, for $\alpha \ge 2$ and large $t$ Davies and Simon have obtained a more precise result which gives exactly the value of the constant $c$ in Proposition 1. In [6] it is proved that the Schrödinger semigroup generated by the operator $-\Delta + |x|^\alpha$ for $\alpha > 2$ is so-called intrinsically ultracontractive (Theorem 6.3 of [6]), and by Theorem 4.2.5 and Corollary 4.5.8 of [4] we have the following.

Theorem 7. For any $\varepsilon > 0$ there exists $T$ such that for $t > T$,
\[ (1 - \varepsilon) e^{-t\lambda_1} \varphi_1(x)\varphi_1(y) \le k_t(x, y) \le (1 + \varepsilon) e^{-t\lambda_1} \varphi_1(x)\varphi_1(y), \]
where $\varphi_1$ is a ground state (eigenfunction corresponding to the smallest eigenvalue) of the operator $-\Delta + |x|^\alpha$ and
\[ C'(1 + |x|)^{(1-k)/2} \exp\Bigl( -\frac{2}{2 + \alpha} |x|^{1 + \alpha/2} \Bigr) \le \varphi_1(x) \le C(1 + |x|)^{(1-k)/2} \exp\Bigl( -\frac{2}{2 + \alpha} |x|^{1 + \alpha/2} \Bigr). \]

For $\alpha < 2$ the Schrödinger semigroup is not intrinsically ultracontractive and this approach does not give any upper bound for the semigroup kernel.

Acknowledgement. I would like to thank D.W. Robinson for several stimulating discussions. In fact, some ideas in this paper are due to him.

References

1. Arendt, W., Grabosch, A., Greiner, G., Groh, U., Lotz, H.P., Moustakas, U., Nagel, R., Neubrander, F., Schlotterbeck, U.: One-parameter semigroups of positive operators. Lecture Notes in Mathematics, 1184. Berlin–Heidelberg–New York: Springer-Verlag, 1986
2. Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic fields, I General interactions. Duke Math. J. 45, 847–84 (1978)
3. Corwin, L.J., Greenleaf, F.P.: Representation of nilpotent Lie groups and their applications, Part 1: Basic theory and examples. Cambridge: Cambridge Univ. Press, 1990
4. Davies, E.B.: Heat kernels and spectral theory. Cambridge: Cambridge Univ. Press, 1989
5. Davies, E.B., Pang, M.M.H.: Sharp heat kernel bounds for some Laplace operators. Quart. J. Math. Oxford (2) 40, 281–290 (1989)
6. Davies, E.B., Simon, B.: Ultracontractivity and the heat kernel for Schrödinger operators and Dirichlet Laplacians. J. Funct. Anal. 59, 335–95 (1984)
7. ter Elst, A.F.M., Robinson, D.W.: Reduced heat kernels on nilpotent Lie groups. Commun. Math. Phys. 173, 475–511 (1995)
8. Folland, G.B.: Introduction to partial differential equations. Math. Notes, Princeton, NJ: Princeton Univ. Press, 1976
9. Helffer, B., Nourrigat, J.: Décroissance à l'infini des fonctions propres de l'opérateur de Schrödinger avec champ électromagnétique polynomial. J. Analyse Math. 58, 263–75 (1992)
10. Hörmander, L.: The analysis of linear partial differential operators 1. Berlin–Heidelberg–New York: Springer-Verlag, 1983
11. Melrose, R.: Propagation for the wave group of a positive subelliptic second order differential operator. In: Hyperbolic Equations and related topics. Boston, Mass.: Academic Press, 1989
12. Mohamed, A., Lévy-Bruhl, P., Nourrigat, J.: Etude spectrale d'opérateurs liés à des représentations de groupes nilpotents. J. Funct. Anal. 113, 65–93 (1993)
13. Nash, J.: Continuity of solutions of parabolic and elliptic equations. Amer. J. Math. 81, 931–954 (1958)
14. Nourrigat, J.: L2 inequalities and representations of nilpotent group. Lectures notes C.I.M.P.A. School of Harmonic Analysis, Wuhan (China), 1991
15. Reed, M., Simon, B.: Methods of modern mathematical physics, I, Functional Analysis, self-adjointness. New York: Academic Press, 1972
16. Robinson, D.W.: Elliptic operators and Lie groups. Oxford: Oxford Univ. Press, 1991
17. Simon, B.: Maximal and Minimal Schrödinger forms. J. Operator Theory 1, 37–47 (1979)
18. Sikora, A.: Sharp pointwise estimates on heat kernels. Quart. J. Math. Oxford (2) 47, 371–382 (1996)
19. Varopoulos, N.Th., Saloff-Coste, L., Coulhon, T.: Analysis and geometry on Groups. Cambridge: Cambridge Univ. Press, 1992

Communicated by H. Araki

This article was processed by the author using the LaTEX style file pljour1 from Springer-Verlag.

Commun. Math. Phys. 188, 251 – 266 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Affine Toda Field Theory as a 3-Dimensional Integrable System

R.M. Kashaev1,*, N. Reshetikhin2,**

1 Laboratoire de Physique Théorique ENSLAPP (URA 14-36 du CNRS, associée à l'E.N.S. de Lyon, et à l'Université de Savoie), ENSLyon, 46 Allée d'Italie, 69007 Lyon, France
2 Department of Mathematics, University of California, Berkeley, CA 94720, USA

Received: 23 May 1996 / Accepted: 22 August 1996

Abstract: The affine Toda field theory is studied as a 2+1-dimensional system. The third dimension appears as the discrete space dimension, corresponding to the simple roots in the $A_N$ affine root system, enumerated according to the cyclic order on the $A_N$ affine Dynkin diagram. We show that there exists a natural discretization of the affine Toda theory. The quantum analog of the τ-variables is found. The thermodynamic Bethe ansatz of the affine Toda system is studied in the limit $L, N \to \infty$. It is shown that the free energy of the system grows proportionally to the volume.

1. Introduction

The Toda field theories were extensively studied as classical and quantum integrable field theories. The latest developments in the study of these models are described in [1]. In the quantum case Toda field theories provide examples of integrable models of quantum field theories with a scalar factorizable S-matrix. The affine Toda field theory of $A_N$ type describes $N$ scalar fields interacting nonlinearly with the Lagrangian

$$L_{AT} = \int \sum_{i=1}^{N} \left( \frac{1}{2}\left(\frac{\partial\phi_i}{\partial t}\right)^2 - \frac{1}{2}\left(\frac{\partial\phi_i}{\partial x}\right)^2 - \frac{M^2}{\beta^2}\exp(\beta(\phi_i - \phi_{i+1})) \right) dx. \tag{1.1}$$

Here it is assumed that $\phi_{N+1} = \phi_1$. The mass spectrum and the scattering amplitudes of this model of quantum field theory were suggested in [3] on the basis of the bootstrap principle and perturbation theory.

* On leave of absence from St. Petersburg Branch of the Steklov Mathematical Institute, Fontanka 27, St. Petersburg 191011, Russia
** The work of N.R. was partially supported by NSF Grant DMS-9692-120


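The spectrum of the model consists of $N - 1$ massive particles with masses $M_l = M \sin(\pi l/N)$, where $M$ is the renormalized mass and $l = 1, \dots, N-1$. The scattering amplitude of the $l$th particle on the $k$th particle is given by the following product:

$$S_{k,l}(\theta) = \prod_{\substack{1\le n\le k \\ 1\le m\le l}} S_{1,1}\left(\theta + i\,\frac{\pi(l-2m)}{N} - i\,\frac{\pi(k-2n)}{N}\right), \tag{1.2}$$

where

$$S_{1,1}(\theta) = \frac{\sinh\left(\frac{\theta}{2} + \frac{i\pi}{N}\right)\sinh\left(\frac{\theta}{2} - \frac{i\pi}{N} + \frac{ib}{2}\right)\sinh\left(\frac{\theta}{2} - \frac{ib}{2}\right)}{\sinh\left(\frac{\theta}{2} - \frac{i\pi}{N}\right)\sinh\left(\frac{\theta}{2} + \frac{i\pi}{N} - \frac{ib}{2}\right)\sinh\left(\frac{\theta}{2} + \frac{ib}{2}\right)}.$$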
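The mass formula and the amplitude $S_{1,1}$ are easy to check numerically. The following is a minimal sketch, not part of the paper: it uses the value $b = 2\pi/(N(1 + 4\pi/\beta^2))$ quoted in the text, the function names `masses`, `b_param` and `s11` are hypothetical, and the parameter values used in any check are arbitrary. For real rapidity the numerator of $S_{1,1}$ is the complex conjugate of the denominator, so $|S_{1,1}(\theta)| = 1$, and $S_{1,1}(\theta)S_{1,1}(-\theta) = 1$.

```python
import cmath
import math


def masses(M, N):
    """Masses M_l = M sin(pi l / N) for l = 1, ..., N-1."""
    return [M * math.sin(math.pi * l / N) for l in range(1, N)]


def b_param(N, beta):
    """Parameter b = 2 pi / (N (1 + 4 pi / beta^2)) as quoted in the text."""
    return 2 * math.pi / (N * (1 + 4 * math.pi / beta**2))


def s11(theta, N, beta):
    """Amplitude S_{1,1}(theta) as a ratio of products of three sinh factors."""
    h = theta / 2                 # theta / 2
    a = 1j * math.pi / N          # i pi / N
    c = 1j * b_param(N, beta) / 2  # i b / 2
    num = cmath.sinh(h + a) * cmath.sinh(h - a + c) * cmath.sinh(h - c)
    den = cmath.sinh(h - a) * cmath.sinh(h + a - c) * cmath.sinh(h + c)
    return num / den
```

For example, `masses(1.0, 5)` returns four values that are symmetric under $l \to N - l$, reflecting $\sin(\pi l/N) = \sin(\pi(N-l)/N)$.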

Here $b = 2\pi/(N(1 + 4\pi/\beta^2))$.

It is interesting to compare the Toda model (1.1) with the so-called open Toda field theory, described by the same Lagrangian (1.1) but without the periodicity condition $\phi_{N+1} = \phi_1$. The Liouville theory [2], which describes 2D gravity, is the simplest example of such a model, with $N = 1$. The periodic and open Toda field theories have completely different structures of dynamics for finite $N$ [15, 16]. In particular, the open Toda chain has a massless spectrum. Moreover, the open Toda model (with $N$ fields) can be obtained from the model (1.1) in the limit $Z \to 0$, with $M = mZ$, $\phi_j = \tilde\phi_j + (2j/\beta)\ln Z$ for $j = 1, \dots, N$, and $m$ fixed, where the Lagrangian (1.1) becomes the Lagrangian of the open Toda field theory with the mass parameter $m$ and with the fields $\tilde\phi_j$, $j = 1, \dots, N$.

In this paper we investigate the possibility of interpreting the affine Toda field theory as a model in 2+1 dimensions with the discrete second space coordinate $i = 1, \dots, N$. In this interpretation the model (1.1) corresponds to periodic boundary conditions in the second space dimension, while the open Toda model corresponds to open boundary conditions. Notice that if we did not have the first space coordinate $x$, then the interpretation of the components of the field $\{\phi_i\}$ as an extra space dimension would correspond to the original physical model suggested by Toda for the description of long molecules [17].

The following results came out of the study of such an interpretation of the Toda field theory:

– The discrete local integrable version of the affine Toda field theory is proposed. The equations of motion for certain variables in this discrete model are invariant with respect to any permutation of coordinates. In the classical limit these variables are the τ-functions [10, 14]. When the space-time lattice is finite the model provides a finite dimensional approximation to the Toda field theory.
– It is shown that in the limit $N \to \infty$ the system has a thermodynamical limit. The spectrum of the resulting theory consists of massless scalar particles. The density of the free energy is computed using the thermodynamical Bethe ansatz.

The classical Toda field theories in continuous space-time are infinite dimensional integrable Hamiltonian systems. The Lax representation for these systems is well known [15]. In this case the Lax representation corresponds to a certain choice of the symplectic leaf in the appropriate Lie bialgebra [18, 19]. This is equivalent to the fact that the Poisson brackets between matrix elements of the Lax operators are given by the so-called r-matrix Poisson brackets [20]. If one wants to find an integrable discrete space-time analog of the Toda field theory, the natural way to proceed is to find the appropriate discrete Lax pair [4]. A discrete Hamiltonian system with the Lax representation is usually integrable if the Lax operator


has the r-matrix Poisson brackets. In more geometrical language this means that the Lax operator describes a certain symplectic leaf in the Poisson Lie group $\widehat{SL}(N)$. The factorization map constructed in [22, 23, 24] provides a Poisson map from the Poisson Lie group to itself which, being restricted to the corresponding symplectic leaf, gives the evolution with equations of motion for the coordinate functions given by the discrete Lax equation.

The discretization on the classical level reduces the infinite dimensional integrable Hamiltonian system to a finite dimensional one. The next step is the quantization. The quantization replaces the Poisson evolution map by an automorphism of the quantum algebra of observables. Since the classical system is integrable we certainly want to construct an integrable quantization (with extra integrals of motion). If the quantum system can be described in terms of differential operators on an $n$-dimensional manifold, there should be $n$ such integrals. It is known that if the system admits the quantum Lax representation, and if the quantum Lax operators have the so-called R-matrix commutation relations, then one can construct the integrals of such an evolution using the traces of appropriate products of quantum Lax operators. In the algebraic language, the quantum Lax operator is the universal R-matrix of a certain factorizable Hopf algebra evaluated on the tensor product of two representations. For the details see [24].

The classical equations of motion as well as the Lax representation for the discrete Toda field theory first appeared in [10] as bilinear difference equations for the corresponding τ-function. The Lax representation for these equations is also described in [11] without the Hamiltonian interpretation. A completely different approach to classical discrete Toda field theory was suggested very recently in [12].
The Lax representation together with the Hamiltonian interpretation for the Toda chain with discrete time was found in [13].

The paper is organized as follows. In Sect. 2 we describe the discrete integrable version of the Toda field theory and show that in the classical and continuous limit it is reduced to the classical Toda field theory (1.1). In the next section, using the discrete Lax operators [4], we describe the integrals of motion. The construction is inspired by the one used for the discrete Sine-Gordon model [5, 6, 7], see also [8, 9]. In Sect. 4 the quantum τ-variables are defined and the discrete 3-dimensional symmetry is revealed. Section 5 contains the analysis of the thermodynamical limit of the affine Toda system, regarded as a 3-dimensional field theory. It is shown that there exists a 3-dimensional thermodynamical limit in which the excitations are massless scalar particles. In the last section, Sect. 6, we make further remarks and formulate open problems.

2. The Discrete Space-Time Toda Field Theory

In this section we will describe the discrete space-time quantum Toda field theory. The origin of this model from the point of view of quantum Lax pairs and corresponding quantum groups will be described in Sect. 3.

2.1. Discrete affine Toda field theory. The quantum algebra of observables for the complexified discrete Toda system is an associative algebra $C_q(N, L)$, generated by invertible elements

$$\chi_i(n), \qquad i = 1, \dots, N; \quad n = 1, \dots, 2L,$$

with the following determining relations

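$$\begin{aligned}
\chi_i(2n)\chi_i(2n+1) &= q^{2}\,\chi_i(2n+1)\chi_i(2n), \\
\chi_i(2n)\chi_i(2n-1) &= q^{2}\,\chi_i(2n-1)\chi_i(2n), \\
\chi_{i+1}(2n)\chi_i(2n+1) &= q^{-2}\,\chi_i(2n+1)\chi_{i+1}(2n), \\
\chi_{i-1}(2n)\chi_i(2n-1) &= q^{-2}\,\chi_i(2n-1)\chi_{i-1}(2n),
\end{aligned} \tag{2.1}$$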

and all other $\chi_i(n)$ pairwise commute. Here in the relations we assume the following "periodic boundary conditions":

$$\chi_{i+N}(n) = \chi_i(n + 2L) = \chi_i(n). \tag{2.2}$$

The algebra of observables of the discrete Toda system corresponds to the values $|q| = 1$. In this case it is a ∗-algebra with $\chi_i(n)^* = \chi_i(n)$. Consider two maps $\kappa_\pm : C_q(N, L) \to C_q(N, L)$ defined on the generators $\chi_i(n)$ as follows:

$$\kappa_\pm : \chi_i(n \mp 1) \mapsto \chi_i(n)$$

for even $n$, and

$$\kappa_\pm : \chi_i(n \mp 1) \mapsto q^{-2}\,\frac{1 - \chi_{i+1}(n-1)}{1 - q^{-2}\chi_i^{-1}(n+1)}\,\frac{1 - \chi_{i-1}(n+1)}{1 - q^{-2}\chi_i^{-1}(n-1)}\,\chi_i^{-1}(n) \tag{2.3}$$

for odd $n$. Here and below we assume that the subindex $i$ and the argument $n$ are taken modulo $N$ and $2L$, respectively, as in (2.2).

Proposition 1. The mappings $\kappa_\pm$ in (2.3), extended by linearity to the whole algebra$^1$ $C_q(N, L)$, determine automorphisms of this algebra.

The mapping $\kappa = \kappa_+\kappa_-$ is the evolution operator for the affine discrete Toda system. For each $i = 1, \dots, N$, $n = 1, \dots, 2L$, define the 'trajectory of generators $\chi_i(n)$' with respect to the evolution in the discrete affine Toda system as the following sequence of elements in $C_q(N, L)$:

$$\chi_i(n \pm 1, t+1) = \kappa_\pm(\chi_i(n, t)); \qquad \chi_i(n) = \begin{cases} \chi_i(n, 1), & \text{if } n \equiv 0 \pmod 2, \\ \chi_i(n, 0), & \text{if } n \equiv 1 \pmod 2. \end{cases} \tag{2.4}$$

Notice that the fields $\chi_i(n, t)$ are defined only for $n + t \equiv 1 \pmod 2$. They satisfy the following equations of motion:

$$q^{-2}\chi_i(n, t+1)\,\chi_i(n, t-1) = \chi_i(n+1, t)\,\chi_i(n-1, t)\,\frac{1 - \chi_{i-1}(n+1, t)}{1 - q^{2}\chi_i(n+1, t)}\,\frac{1 - \chi_{i+1}(n-1, t)}{1 - q^{2}\chi_i(n-1, t)}. \tag{2.5}$$

$^1$ Strictly speaking, the maps $\kappa_\pm$ act from the algebra to its completion, which can be defined in many ways. For example, one can consider Laurent power series in $\chi$, or $\chi - 1$, or one can realize $\chi$ as operators in a Hilbert space and use the spectral theorem for the definition of rational functions of an operator. We will ignore these problems when possible.

These are the quantum discrete space-time Heisenberg equations of motion for the discrete affine Toda field theory. Since they originate from automorphisms of the algebra, they preserve the following commutation relations of operators with relative time shifts $\Delta t = 0$ and $\Delta t = \pm 1$:


$$\begin{aligned}
\chi_i(2n, 2t+1)\chi_i(2n+1, 2t) &= q^{2}\,\chi_i(2n+1, 2t)\chi_i(2n, 2t+1), \\
\chi_i(2n, 2t+1)\chi_i(2n-1, 2t) &= q^{2}\,\chi_i(2n-1, 2t)\chi_i(2n, 2t+1), \\
\chi_{i+1}(2n, 2t+1)\chi_i(2n+1, 2t) &= q^{-2}\,\chi_i(2n+1, 2t)\chi_{i+1}(2n, 2t+1), \\
\chi_{i-1}(2n, 2t+1)\chi_i(2n-1, 2t) &= q^{-2}\,\chi_i(2n-1, 2t)\chi_{i-1}(2n, 2t+1),
\end{aligned} \tag{2.6}$$

the others being trivially commutative. Note that for the affine Toda system we have to take subindices modulo $N$.

Now let us show how to obtain the classical continuous Toda field theory (1.1) from this system. First, consider the classical discrete Toda model, which corresponds to the limit $q \to 1$ in the quantum discrete Toda model described above. Let $\varepsilon$ be the lattice spacing in both the time and space directions. Define the new field

$$\varphi_i(n\varepsilon, t\varepsilon) = \frac{1}{\beta}\ln\left(-\frac{\chi_i(n, t)}{(\varepsilon M)^2}\right). \tag{2.7}$$

Substituting this into (2.5), we obtain (for $q = 1$):

$$\exp\big(\beta(\varphi_i(x, t+\varepsilon) + \varphi_i(x, t-\varepsilon) - \varphi_i(x+\varepsilon, t) - \varphi_i(x-\varepsilon, t))\big) = \frac{1 + (\varepsilon M)^2\exp(\beta\varphi_{i-1}(x+\varepsilon, t))}{1 + (\varepsilon M)^2\exp(\beta\varphi_i(x+\varepsilon, t))}\cdot\frac{1 + (\varepsilon M)^2\exp(\beta\varphi_{i+1}(x-\varepsilon, t))}{1 + (\varepsilon M)^2\exp(\beta\varphi_i(x-\varepsilon, t))}.$$
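The continuum limit of this relation can be made explicit by a Taylor expansion of the logarithm of each side. The following is a sketch under the assumptions that the fields $\varphi_i$ are smooth and that $\varepsilon M$ is small; fields without explicit arguments are evaluated at $(x, t)$.

```latex
% Sketch: expansion in the lattice spacing, assuming smooth fields and small \varepsilon M.
\begin{align*}
\varphi_i(x,t+\varepsilon) + \varphi_i(x,t-\varepsilon)
  - \varphi_i(x+\varepsilon,t) - \varphi_i(x-\varepsilon,t)
  &= \varepsilon^2\left(\frac{\partial^2\varphi_i}{\partial t^2}
     - \frac{\partial^2\varphi_i}{\partial x^2}\right) + O(\varepsilon^4), \\
\ln\frac{\bigl(1 + (\varepsilon M)^2 e^{\beta\varphi_{i-1}(x+\varepsilon,t)}\bigr)
         \bigl(1 + (\varepsilon M)^2 e^{\beta\varphi_{i+1}(x-\varepsilon,t)}\bigr)}
        {\bigl(1 + (\varepsilon M)^2 e^{\beta\varphi_i(x+\varepsilon,t)}\bigr)
         \bigl(1 + (\varepsilon M)^2 e^{\beta\varphi_i(x-\varepsilon,t)}\bigr)}
  &= (\varepsilon M)^2\left(e^{\beta\varphi_{i-1}} + e^{\beta\varphi_{i+1}}
     - 2e^{\beta\varphi_i}\right) + O(\varepsilon^3).
\end{align*}
```

The logarithm of the left-hand side of the discrete equation is $\beta$ times the first line, so equating it with the second line and dividing by $\beta\varepsilon^2$ leaves only terms that survive as $\varepsilon \to 0$.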

In the limit $\varepsilon \to 0$ these equations reduce to

$$\frac{\partial^2\varphi_i}{\partial t^2} - \frac{\partial^2\varphi_i}{\partial x^2} = \frac{M^2}{\beta}\big(\exp(\beta\varphi_{i+1}) + \exp(\beta\varphi_{i-1}) - 2\exp(\beta\varphi_i)\big),$$

which coincide with the Euler–Lagrange equations for the system (1.1), written for the fields

$$\varphi_i = \phi_i - \phi_{i+1}. \tag{2.8}$$

So far we have just defined what we call the quantum affine discrete Toda system.

2.2. Discrete open Toda field theory. The quantum algebra for the open Toda field theory, $\tilde C_q(N, L)$, differs slightly from $C_q(N, L)$. It is generated by $\chi_i(n)$, with $1 \le i \le N$, $1 \le n \le 2L$, with the determining relations (2.1) and with boundary relations which differ from (2.2) by dropping the "periodicity condition" $\chi_{N+1}(n) = \chi_1(n)$. This means that now $\chi_N(n)$ and $\chi_1(n+1)$ commute, instead of "q-commuting" as in $C_q(N, L)$. Define the mappings $\tilde\kappa_\pm$ by the same formulas as $\kappa_\pm$, but now we assume that the subindex $i$ is not taken modulo $N$. Instead we assume that

$$\chi_0(n) = \chi_{N+1}(n) = 0. \tag{2.9}$$

Proposition 2. The mappings $\tilde\kappa_\pm$ in (2.3), extended by linearity to the whole algebra $\tilde C_q(N, L)$, determine automorphisms of this algebra.

The mapping $\tilde\kappa = \tilde\kappa_+\tilde\kappa_-$ is considered as the evolution operator for the open discrete Toda system. Again, strictly speaking, everything is defined in the appropriate completion of $\tilde C_q(N, L)$. The Heisenberg equations for the discrete open Toda system differ from (2.5) only in the "boundary condition": instead of taking subindices $i$ modulo $N$ one has to use the conventions (2.9).


3. Integrals of Motion

Here we describe quantum Lax operators for the discrete Toda field theories. The open discrete Toda field theory is related to representations of $U_q(sl_N)$, while the affine Toda field theory is related to those of $U_q(\widehat{sl}_N)$.

3.1. Quantum group "background". Here we describe some algebra which will be used in the description of quantum integrals for the evolution (2.5). Denote by $A_q(N)$ the algebra generated by invertible elements $a_i$, $b_i$, $i = 1, \dots, N$, with the following determining relations:

$$a_i b_i = q^{-1} b_i a_i, \qquad a_i b_{i-1} = q\, b_{i-1} a_i. \tag{3.10}$$

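Here we assume that subindices $i$ are taken modulo $N$. All other generators $a$ and $b$ commute. Consider the following elements in $\mathrm{End}(\mathbb{C}^N) \otimes A_q(N)$:

$$L^+(z) = \sum_{1\le i\le N} e_{i,i}\otimes a_i + \sum_{1\le i<N} e_{i,i+1}\otimes b_i + z^{-2}\, e_{N,1}\otimes b_N, \tag{3.11}$$

$$L^-(z) = \sum_{1\le i\le N} e_{i,i}\otimes a_i^{-1} + \sum_{1\le i<N} e_{i+1,i}\otimes b_i + z^{2}\, e_{1,N}\otimes b_N. \tag{3.12}$$

It is not difficult to check that these elements satisfy the following identities in $\mathrm{End}(\mathbb{C}^N) \otimes \mathrm{End}(\mathbb{C}^N) \otimes A_q(N)$:

$$\begin{aligned}
R(x/y)\,L^+(x)\otimes L^+(y) &= (1\otimes L^+(x))(L^+(y)\otimes 1)\,R(x/y), \\
L^-(x)\otimes L^-(y)\,R(x/y) &= R(x/y)\,(1\otimes L^-(y))(L^-(x)\otimes 1).
\end{aligned} \tag{3.13}$$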
Here the tensor product is taken over the algebra $A_q(N)$ (tensor product of matrices where the elements are multiplied in $A_q(N)$). The $N^2 \times N^2$ matrix $R(z)$ acts trivially in $A_q(N)$ and is the 'fundamental' $U_q(\widehat{sl}(N))$ R-matrix: R(z) = (qz − q −1 z −1 )

X

ei,i ⊗ ei,i + (z − z −1 )

1≤i≤N

+

(q − q −1 )

X

X

ei,i ⊗ ej,j

1≤i6=j≤N

(zei,j ⊗ ej,i + z −1 zej,i ⊗ ei,j ).

(3.14)

1≤i6=j≤N

Remarks

– In terms of representations of quantized universal enveloping algebras, the matrix $R(x/y)$ can be regarded as the evaluation of the universal R-matrix $\mathcal{R}$ for $U_q(\widehat{sl}_N)$ on the tensor product of two evaluation modules corresponding to the vector representation of $U_q(sl_N)$:

$$R(x/y) = \pi_x \otimes \pi_y(\mathcal{R}). \tag{3.15}$$

The matrices $L^\pm$ describe the "minimal" representations of the quantized algebra of functions on $\widehat{SL}_N$. They also describe "minimal" representations [25, 36] of quantized Borel subalgebras $U_q(\hat b_\pm)$ of the quantized universal enveloping algebra $U_q(\widehat{sl}_N)$. If $\hat\phi_\pm : U_q(\hat b_\pm) \to A_q(N)$ are the minimal representations, then

$$L^+(x) = \pi_x \otimes \hat\phi_+(\mathcal{R}), \qquad L^-(x) = \hat\phi_- \otimes \pi_x(\mathcal{R}). \tag{3.16}$$


– If one wants to study the discrete open Toda field theory then one has to assume that $b_N = 0$ (that is, to take the corresponding quotient algebra $A_q(N)_0$ of $A_q(N)$). In this case the L-matrices do not depend on the spectral parameter $z$, and instead of relations (3.13) we have

$$R\,L^+\otimes L^+ = (1\otimes L^+)(L^+\otimes 1)\,R, \tag{3.17}$$

$$L^-\otimes L^-\,R = R\,(1\otimes L^-)(L^-\otimes 1). \tag{3.18}$$

These L and R matrices can be obtained from minimal representations of the quantized algebra of functions $C_q(SL_N)$ and of the Borel subalgebras $U_q(b_\pm)$ of the quantized universal enveloping algebra $U_q(sl_N)$. Denote such representations by $\phi_\pm : U_q(b_\pm) \to A_q(N)_0$ (see for example [25, 36]), and denote by $\pi : U_q(sl_N) \to \mathrm{End}(\mathbb{C}^N)$ the $N$-dimensional vector representation of this algebra. Then for $L^\pm$ we have:

$$L^- = \phi_-\otimes\pi(\mathcal{R}), \qquad L^+ = \pi\otimes\phi_+(\mathcal{R}), \tag{3.19}$$

where $\mathcal{R}$ is the universal R-matrix for $U_q(sl_N)$.

3.2. Quantum Lax pair and integrals of motion. The relation between the algebra of observables for quantum discrete Toda models and the algebra $A_q(N)$ is the following. Consider the algebra $A_q(N, L) = A_q(N)^{\otimes 2L}$ and the elements

$$a_i(n) = 1\otimes\dots\otimes a_i\otimes\dots\otimes 1, \qquad b_i(n) = 1\otimes\dots\otimes b_i\otimes\dots\otimes 1, \tag{3.20}$$

where $n = 1, \dots, 2L$. Let $\mathbb{C}^*$ be the group of all nonzero complex numbers with respect to multiplication. Consider the following action of $(\mathbb{C}^*)^{\times 2LN}$ on the algebra $A_q(N, L)$:

$$\begin{aligned}
a_i(n) &\mapsto \alpha_i(n)\,a_i(n)\,\alpha_i^{-1}(n-1), \\
b_i(n) &\mapsto \alpha_i(n)\,b_i(n)\,\alpha_{i+1}^{-1}(n-1), && n \equiv 1 \pmod 2, \\
b_i(n) &\mapsto \alpha_{i+1}(n-1)\,b_i(n)\,\alpha_i^{-1}(n), && n \equiv 0 \pmod 2.
\end{aligned}$$

It is easy to see that the group acts by automorphisms of the algebra and therefore the invariant subspace $A_q^{inv}(N, L)$ is also a subalgebra in $A_q(N, L)$. We will call the group $(\mathbb{C}^*)^{\times 2LN}$ the "gauge group".

Proposition 3. The mapping

$$\begin{aligned}
\chi_i(n) &\mapsto a_{i+1}(n)\,b_i(n)\,b_i(n+1)\,a_i^{-1}(n+1), && n \equiv 0 \pmod 2; \\
\chi_i(n) &\mapsto a_i(n+1)\,b_i(n+1)\,b_i(n)\,a_{i+1}^{-1}(n), && n \equiv 1 \pmod 2,
\end{aligned} \tag{3.21}$$

extended by linearity to the algebra $C_q(N, L)$, gives a homomorphism of algebras with the image in the subalgebra $A_q^{inv}(N, L) \subset A_q(N, L)$ of elements invariant with respect to the action of the gauge group $(\mathbb{C}^*)^{\times 2LN}$.

Thus, the algebra of observables of the discrete Toda system maps injectively into the subalgebra $A_q^{inv}(N, L)$. Now the idea of the construction of integrals of motion for the discrete Toda system consists in finding the discrete time Lax evolution in the algebra $A_q(N, L)$, the corresponding integrals being constructed in the usual way, such that on the subalgebra $A_q^{inv}(N, L)$ it


would coincide with the Toda evolution. Then the gauge invariant integrals for this Lax evolution will also be the integrals for the Toda evolution.

Let us start by constructing the discrete time Lax pair. We will follow the general principles outlined in [24]. Define the following elements $L^\pm_n(z)$ in $\mathrm{End}(\mathbb{C}^N) \otimes A_q(N, L)$:

$$L^+_n(z) = \sum_{1\le i\le N} e_{i,i}\otimes a_i(n) + \sum_{1\le i<N} e_{i,i+1}\otimes b_i(n) + z^{-2}\, e_{N,1}\otimes b_N(n) \tag{3.22}$$

for odd $n$, and

$$L^-_n(z) = \sum_{1\le i\le N} e_{i,i}\otimes a_i^{-1}(n) + \sum_{1\le i<N} e_{i+1,i}\otimes b_i(n) + z^{2}\, e_{1,N}\otimes b_N(n) \tag{3.23}$$

for even $n$. Notice that the action of the gauge group $(\mathbb{C}^*)^{\times 2LN}$ can be represented by the following "gauge transformation" of the matrices $L^\pm_n$:

$$\begin{aligned}
L^+_n(z) &\mapsto D_n\,L^+_n(z)\,D_{n-1}^{-1}, && n \equiv 1 \pmod 2, \\
L^-_n(z) &\mapsto D_{n-1}\,L^-_n(z)\,D_n^{-1}, && n \equiv 0 \pmod 2,
\end{aligned} \tag{3.24}$$

where $D_n \in \mathrm{End}(\mathbb{C}^N) \otimes A_q(N, L)$ act trivially in $A_q(N, L)$ and diagonally in $\mathbb{C}^N$:

$$D_n = \sum_{i=1}^{N} \alpha_i(n)\, e_{ii}\otimes 1.$$

Consider a sequence of elements $a_i(n, t)$, $b_i(n, t)$, where the 'time' index $t$ takes integer values. We assume that for each fixed $t$ these elements satisfy the defining relations (3.10), (3.20) of the algebra $A_q(N, L)$. Define the corresponding elements of $\mathrm{End}(\mathbb{C}^N) \otimes A_q(N, L)$ according to (3.22) and (3.23):

$$L^-_{n,t}(z), \quad n + t \equiv 0 \pmod 2; \qquad L^+_{n,t}(z), \quad n + t \equiv 1 \pmod 2, \tag{3.25}$$

and postulate the zero curvature equations:

$$L^-_{n,t+1}(z)\,L^+_{n,t}(z) = L^+_{n-1,t+1}(z)\,L^-_{n-1,t}(z). \tag{3.26}$$

These equations are invariant with respect to the "time dependent" gauge transformations:

$$\begin{aligned}
L^+_{n,t}(z) &\mapsto D_{n,t}\,L^+_{n,t}(z)\,D_{n-1,t-1}^{-1}, && n + t \equiv 1 \pmod 2, \\
L^-_{n,t}(z) &\mapsto D_{n-1,t}\,L^-_{n,t}(z)\,D_{n,t-1}^{-1}, && n + t \equiv 0 \pmod 2,
\end{aligned} \tag{3.27}$$

where

$$D_{n,t} = \sum_{i=1}^{N} \alpha_i(n, t)\, e_{ii}\otimes 1, \qquad \alpha_i(n, t) \in \mathbb{C}^*.$$

As in usual gauge theories, Eqs. (3.26) do not specify a unique time evolution in the algebra$^2$ $A_q(N, L)$. However, they define a unique evolution on the gauge invariant subalgebra of $A_q(N, L)$, and this evolution coincides with the one for the discrete quantum Toda systems.

$^2$ To specify the evolution in gauge theories, a "gauge fixing procedure" is needed, which is not unique, see e.g. [26].
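The homomorphism property claimed in Proposition 3 can be checked mechanically, because all elements involved are monomials in the generators $a_i(n)$, $b_i(n)$ and only the powers of $q$ produced by reordering matter. The sketch below is not from the paper: it assumes small illustrative sizes $N = 4$, $L = 2$, represents a monomial by a dictionary of exponents, and uses the hypothetical helper names `comm_gen`, `comm`, `chi_image` and `expected`. It compares the $q$-exponents of the images (3.21) under the relations (3.10) with those prescribed by (2.1).

```python
from itertools import product

N, L = 4, 2        # illustrative sizes (an assumption); the check needs N >= 3, L >= 2
SITES = 2 * L      # the site index n is taken modulo 2L


def comm_gen(x, y):
    """Exponent w with x*y = q**w * y*x for two generators ('a' or 'b', i, n)."""
    (kx, ix, nx), (ky, iy, ny) = x, y
    if nx != ny or kx == ky:
        return 0                   # different sites, or two a's / two b's, commute
    if kx == "b":
        return -comm_gen(y, x)
    if iy == ix:
        return -1                  # a_i b_i = q^{-1} b_i a_i
    if iy == (ix - 1) % N:
        return 1                   # a_i b_{i-1} = q b_{i-1} a_i
    return 0


def comm(X, Y):
    """Exponent w with X*Y = q**w * Y*X for monomials given as {generator: power}."""
    return sum(ex * ey * comm_gen(x, y) for x, ex in X.items() for y, ey in Y.items())


def chi_image(i, n):
    """Image of chi_i(n) under the mapping (3.21), as a monomial."""
    i, n = i % N, n % SITES
    a = lambda j, m: ("a", j % N, m % SITES)
    b = lambda j, m: ("b", j % N, m % SITES)
    if n % 2 == 0:
        return {a(i + 1, n): 1, b(i, n): 1, b(i, n + 1): 1, a(i, n + 1): -1}
    return {a(i, n + 1): 1, b(i, n + 1): 1, b(i, n): 1, a(i + 1, n): -1}


def expected(i, n, j, m):
    """Exponent w with chi_i(n) chi_j(m) = q**w chi_j(m) chi_i(n) according to (2.1)."""
    i, j, n, m = i % N, j % N, n % SITES, m % SITES
    if n % 2 == 1 and m % 2 == 0:
        return -expected(j, m, i, n)
    if n % 2 == 0 and m % 2 == 1:
        if j == i and m in ((n + 1) % SITES, (n - 1) % SITES):
            return 2
        if m == (n + 1) % SITES and j == (i - 1) % N:
            return -2
        if m == (n - 1) % SITES and j == (i + 1) % N:
            return -2
    return 0


# Collect every pair of generators whose images do not reproduce (2.1).
mismatches = [
    (i, n, j, m)
    for i, j in product(range(N), repeat=2)
    for n, m in product(range(SITES), repeat=2)
    if comm(chi_image(i, n), chi_image(j, m)) != expected(i, n, j, m)
]
```

An empty `mismatches` list means that, for these sizes, the images of the $\chi_i(n)$ satisfy exactly the relations (2.1). This only checks the commutation relations; it says nothing about injectivity or gauge invariance.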


Proposition 4. Let $L^\pm_{n,t}(z)$, defined in (3.25), (3.22), (3.23), satisfy Eqs. (3.26). Then the gauge invariant (with respect to the gauge transformations (3.27)) operators

$$\chi_i(n, t) = a_{i+1}(n, t)\,b_i(n, t)\,b_i(n+1, t)\,a_i^{-1}(n+1, t)$$

for odd $t$ and even $n$, and

$$\chi_i(n, t) = a_i(n+1, t)\,b_i(n+1, t)\,b_i(n, t)\,a_{i+1}^{-1}(n, t) \tag{3.28}$$

for even $t$ and odd $n$, satisfy the discrete Toda field equations (2.5).

In particular, this means that the operators (3.25), (3.22), (3.23) can be regarded as Lax operators for the quantum Toda system. Now we can use the Lax operators for the construction of integrals of motion. Consider the following "transfer matrix":

$$t_1(z) = \mathrm{tr}_{\mathbb{C}^N}\left((L^-_{2L}(z))^{-1} L^+_{2L-1}(z)\cdots(L^-_2(z))^{-1} L^+_1(z)\right). \tag{3.29}$$

It is an element of $A_q(N, L)$. In the same way introduce the set of transfer matrices $t_l(z)$, $l = 2, \dots, N-1$, which correspond to elements $L^{\omega_l,\pm}(z) \in \mathrm{End}(V(\omega_l)) \otimes A_q(N)$, where the $V(\omega_l)$ are the irreducible finite dimensional $U_q(sl(N))$ modules associated with the fundamental weights $\omega_l$. These elements can be constructed through the fusion procedure [27] from $L^\pm(z)$.

Proposition 5. The elements $t_l(z)$, $l = 1, \dots, N-1$, belong to $A_q^{inv}(N, L)$; they form a commutative family, $[t_l(z), t_{l'}(z')] = 0$; and they are invariant under the action of the automorphisms (2.3), extended to $A_q^{inv}(N, L)$.

These are direct consequences of the gauge transformation law (3.24), the Yang-Baxter equations (3.13), Proposition 4, and the fusion procedure [27]. Coefficients of the polynomials $t_l(z)$ give the integrals for the discrete quantum affine Toda system. Similarly one can obtain the integrals for the quantum discrete open Toda chain.

4. Discrete Toda Field Theory as a 3-Dimensional System

In this section we introduce the "τ-variables" for the quantum discrete Toda system. In the classical case they form a coordinate system on the phase space, in which the evolution is given in terms of τ-functions. For simplicity we consider the case of an infinite lattice and ignore the boundary conditions. Define the algebra $T_q$, generated by invertible elements $\tau_x(y)$, $\bar\tau_x(y)$, $x, y \in \mathbb{Z}$, with the following commutation relations:

$$\begin{aligned}
\tau_x(y)\,\tau_{x'}(y') &= \tau_{x'}(y')\,\tau_x(y), \\
\bar\tau_x(y)\,\bar\tau_{x'}(y') &= \bar\tau_{x'}(y')\,\bar\tau_x(y), \\
\bar\tau_x(y)\,\tau_{x'}(y') &= \tau_{x'}(y')\,\bar\tau_x(y)\,q^{G(x-x',\,y-y')},
\end{aligned} \tag{4.1}$$

where

$$G(x, y) = -\frac{1}{2}\,\epsilon(x)(x + y), \tag{4.2}$$

$$\epsilon(x) = \begin{cases} 1 & \text{if } x \ge 0, \\ -1 & \text{otherwise}. \end{cases} \tag{4.3}$$

Strictly speaking we have to take an appropriate completion of this algebra, but we will not discuss these details here.


Proposition 6. The mappings

$$\begin{aligned}
K_+(\tau_i(n)) &= \big(\tau_i(n)\,\tau_i(n+1) + \tau_{i-1}(n+1)\,\tau_{i+1}(n)\big)\,\bar\tau_i(n)^{-1}, \\
K_+(\bar\tau_i(n)) &= \tau_i(n+1), \\
K_-(\tau_i(n)) &= \big(\tau_i(n-1)\,\tau_i(n) + \tau_{i-1}(n)\,\tau_{i+1}(n-1)\big)\,\bar\tau_i(n-1)^{-1}, \\
K_-(\bar\tau_i(n)) &= \tau_i(n),
\end{aligned} \tag{4.4}$$

can be extended by linearity to automorphisms of the algebra $T_q$.

Thus we have an evolution given by the composition $K = K_+K_-$ of these automorphisms. The following proposition explains why it can be regarded as the discrete Toda evolution. Let $C_q$ be the algebra generated by invertible elements $\chi_i(n)$, $i, n \in \mathbb{Z}$, with relations (2.1).

Proposition 7.
1. The mapping

$$\begin{aligned}
\phi(\chi_i(2n)) &= \tau_{i-1}(n)\,\tau_{i+1}(n-1)\,\tau_i(n)^{-1}\,\tau_i(n-1)^{-1}, \\
\phi(\chi_i(2n+1)) &= \bar\tau_{i-1}(n)\,\bar\tau_{i+1}(n-1)\,\bar\tau_i(n)^{-1}\,\bar\tau_i(n-1)^{-1}
\end{aligned} \tag{4.5}$$

determines an algebra homomorphism $\phi : C_q \to T_q$.
2. The following diagram is commutative:

$$\begin{array}{ccc}
C_q & \xrightarrow{\ \kappa_\pm\ } & C_q \\
\downarrow{\scriptstyle\phi} & & \downarrow{\scriptstyle\phi} \\
T_q & \xrightarrow{\ K_\pm\ } & T_q
\end{array} \tag{4.6}$$

Introduce elements $T_i(2n+1, 0)$ and $T_i(2n, 1)$ as follows:

$$T_i(2n, 1) = \tau_i(n), \qquad T_i(2n+1, 0) = \bar\tau_i(n). \tag{4.7}$$

The time evolution of these elements is defined similarly to the one for $\chi$:

$$K_\pm(T_i(n, t)) = T_i(n \pm 1, t+1), \qquad n + t \equiv 1 \pmod 2. \tag{4.8}$$

The equation of motion for $T_i(n, t)$ reads:

$$T_i(n, t+1)\,T_i(n, t-1) = T_i(n+1, t)\,T_i(n-1, t) + T_{i-1}(n+1, t)\,T_{i+1}(n-1, t). \tag{4.9}$$

If we introduce a new variable

$$\tau(x, y, t) = T_y(x - y, t), \tag{4.10}$$

then the equation coincides with the one discovered by Hirota in the classical case [10]. It is a 3-dimensional system of difference equations which admits soliton solutions and interpolates most of the known soliton PDE, including the classical Toda field theory. With the aid of an appropriate "gauge transformation" $\tau(x, y, t) \mapsto g(x, y)\,\tau(x, y, t)$ the equation of motion (4.9) for $\tau(x, y, t)$ can be made invariant with respect to permutations of all three coordinates:

$$\tau(x, y, t+1)\,\tau(x, y, t-1) + \tau(x, y+1, t)\,\tau(x, y-1, t) + \tau(x+1, y, t)\,\tau(x-1, y, t) = 0. \tag{4.11}$$

Thus, we have obtained a quantum integrable three dimensional system whose equations of motion are invariant with respect to permutations of "space-time" coordinates (a lattice analog of the Lorentz invariance).
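To see how the equation of motion (4.9) propagates data layer by layer, it can be iterated in the classical case, where all variables commute. The following is a minimal sketch under assumptions that are not in the paper: commuting rational values, a small periodic lattice ($i$ modulo `N`, $n$ modulo an even period `P`), and the hypothetical function names `hirota_step` and `evolve`. The two initial layers $T_i(n, 0)$ (odd $n$) and $T_i(n, 1)$ (even $n$) are set to 1.

```python
from fractions import Fraction


def hirota_step(prev, curr, N, P):
    """Solve (4.9) for the layer t+1, given the layers t-1 (prev) and t (curr).

    Layers are dicts {(i, n): value}; the layer t+1 lives on the same sites
    as the layer t-1, because T_i(n, t) is defined only for n + t odd.
    """
    nxt = {}
    for (i, n), old in prev.items():
        right, left = (n + 1) % P, (n - 1) % P
        numerator = (curr[(i, right)] * curr[(i, left)]
                     + curr[((i - 1) % N, right)] * curr[((i + 1) % N, left)])
        nxt[(i, n)] = numerator / old
    return nxt


def evolve(N=3, P=4, steps=4):
    """Evolve all-ones initial data; returns the list of layers t = 0, 1, ..."""
    layer0 = {(i, n): Fraction(1) for i in range(N) for n in range(P) if n % 2 == 1}
    layer1 = {(i, n): Fraction(1) for i in range(N) for n in range(P) if n % 2 == 0}
    layers = [layer0, layer1]
    for _ in range(steps):
        layers.append(hirota_step(layers[-2], layers[-1], N, P))
    return layers
```

With this uniform initial data every site of a layer carries the same value, and the recursion reduces to $T(t+1)\,T(t-1) = 2\,T(t)^2$, which gives the sequence 1, 1, 2, 8, 64, 1024, that is $2^{t(t-1)/2}$. Non-uniform initial data can be supplied by replacing the two initial dictionaries.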


5. The N → ∞ Limit as a 3-Dimensional Thermodynamical Limit

Let us return to the 1+1-dimensional affine Toda field theory in continuous space-time and consider it as a 2+1-dimensional field theory in partly discrete space-time: one space coordinate is discrete with the values $1, \dots, N$ (the "Lie algebra direction"), the other two are, as usual, continuous. We will refer to excitations in the 1+1-dimensional theory as one-dimensional particles and to excitations in the 2+1-dimensional theory as two-dimensional particles. The spectrum of the theory can be interpreted in two ways, as the spectrum of the 1+1-dimensional model and as the spectrum of the 2+1-dimensional model:

– the 1+1-dimensional interpretation: $N - 1$ scalar massive particles with masses $M_l = M \sin(\pi l/N)$;
– the 2+1-dimensional interpretation: one scalar massless particle with the momentum in the second space direction given by the above formula.

These two interpretations are very reminiscent of the mechanism of mass generation via the compactification of extra dimensions (see for example [28]).

As $N \to \infty$ the 2-dimensional momentum of the 2+1-dimensional scalar particle becomes continuous. This corresponds to the limit where the ratio $\pi l/N$ in the formula for the masses is kept finite as $N \to \infty$. For finite $N$ the scattering of massive one-dimensional particles is purely elastic, with the scattering amplitudes given by (1.2). The notion of scattering becomes more subtle in the limit $N \to \infty$, since we are now dealing with massless 2+1-dimensional particles. However, we will see that the free energy of the system is proportional to the 2-dimensional volume ($NL$) in the 2-dimensional thermodynamical limit. This is a good indication that we have the correct thermodynamical limit in this case. By the 2-dimensional thermodynamical limit we understand the limit in which the 2-dimensional space volume of the system increases proportionally to the number of particles.
We use the thermodynamical Bethe ansatz to investigate this limit.

5.1. The thermodynamics of the affine Toda field theory for finite N. The idea of using scattering amplitudes and dispersions of physical excitations for the description of states of thermodynamical equilibrium goes back to [29]. It is based on the spectrum of the system, which is described in the introduction [3], and on the assumption that imposing spatial periodic boundary conditions will "quantize" the momenta of particles according to the "physical Bethe equations" [30]. For the Toda system this gives the following answer for the energy levels of the model in a box of length $L$ with periodic boundary conditions (we assume that $N$ is still finite):

$$E = \sum_{l=1}^{N}\sum_{\alpha=1}^{n_l} M \sin\left(\frac{\pi l}{N}\right)\cosh\theta_\alpha^{(l)}, \tag{5.1}$$

where $n_l$ is the number of particles of type $l$ in the state and the rapidities $\theta_\alpha^{(l)}$ are "quantized" by the periodic boundary conditions as follows:

$$L M \sin\left(\frac{\pi l}{N}\right)\sinh\theta_\alpha^{(l)} = 2\pi I_\alpha^{(l)} + \sum_{(k,\beta)\ne(l,\alpha)} \phi_{l,k}\big(\theta_\alpha^{(l)} - \theta_\beta^{(k)}\big). \tag{5.2}$$


Here the numbers $I_\alpha^{(l)}$ are integers and $\phi_{l,k}(\theta) = -i\ln(S_{l,k}(\theta))$, assuming that the branch of the logarithm is chosen in such a way that $\phi$ vanishes when $b = 0$. Similar equations have been studied in detail in [30, 31] for the chiral Gross-Neveu type models. Since the numbers $I_\alpha^{(l)}$ form only a subset of all integers, one can introduce the rapidities of "holes" (in the distribution of $I_\alpha^{(l)}$ among integers) as solutions to the system

$$L M \sin\left(\frac{\pi l}{N}\right)\sinh\tilde\theta_\alpha^{(l)} = 2\pi\tilde I_\alpha^{(l)} + \sum_{(k,\beta)} \phi_{l,k}\big(\tilde\theta_\alpha^{(l)} - \theta_\beta^{(k)}\big). \tag{5.3}$$

Here $\tilde I_\alpha^{(l)}$ are all integers which do not belong to $\{I_\alpha^{(l)}\}$. Following [29] we will refer to such a state as a state with particles with rapidities $\theta$ and with holes with rapidities $\tilde\theta$.

The 1-dimensional thermodynamical (macroscopic) states correspond to the limit $L \to \infty$ where $n_l = L\rho_l$ with finite densities $\rho_l$. These states are parametrized by the asymptotic densities of the distributions of rapidities of particles and holes along the real line ($\rho_l(\theta)$ and $\rho^h_l(\theta)$, respectively). The assumption about finite densities of distributions of rapidities is based on the "Pauli principle", which asserts that there should be no identical particles with the same rapidities. This principle follows from the property $S_{k,k}(0) = -1$ of the physical S-matrices.

These densities are certainly not independent. Equations (5.2), (5.3) provide the following integral equation which relates $\rho_l(\theta)$ and $\rho^h_l(\theta)$:

$$M_l\cosh\theta = 2\pi\rho_l(\theta) + 2\pi\rho^h_l(\theta) + \sum_{1\le k\le N}\int_{-\infty}^{\infty}\phi'_{l,k}(\theta - \alpha)\,\rho_k(\alpha)\,d\alpha. \tag{5.4}$$

The energy of such a state grows proportionally to the length of the system:

$$E = L\sum_{1\le l\le N}\int_{-\infty}^{\infty} M_l\cosh\theta\,\rho_l(\theta)\,d\theta. \tag{5.5}$$

The state of the thermodynamical equilibrium minimizes the free energy of the system which is the linear combination of the energy and the entropy

\[ F = E - TS. \tag{5.6} \]

Here T is the temperature and S is the combinatorial entropy of the gas of particles and holes. It has the following asymptotics as L → ∞ on macroscopic states:

\[ S = L \sum_{1\le l\le N} \int_{-\infty}^{\infty} \Big\{ (\rho_l(\theta) + \rho^h_l(\theta)) \ln(\rho_l(\theta) + \rho^h_l(\theta)) - \rho_l(\theta)\ln\rho_l(\theta) - \rho^h_l(\theta)\ln\rho^h_l(\theta) \Big\}\,d\theta. \tag{5.7} \]

Minimization of the functional (5.6) with the condition (5.4) gives the following formula for the free energy of the state of the thermodynamical equilibrium:

\[ F = L \sum_{1\le k\le N} \int_{-\infty}^{\infty} M_k \cosh\theta\, \ln\big(1 + \exp(-\epsilon_k(\theta)/T)\big)\,d\theta, \tag{5.8} \]

where the functions $\epsilon_k(\theta)$ satisfy the following system of nonlinear integral equations:

\[ M_l \cosh\theta = \epsilon_l(\theta) + \frac{1}{2\pi} \sum_{1\le k\le N} \int_{-\infty}^{\infty} \phi'_{l,k}(\theta - \alpha)\, \ln\big(1 + \exp(-\epsilon_k(\alpha)/T)\big)\,d\alpha. \tag{5.9} \]
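Equations of the type (5.9) are commonly solved by fixed-point iteration on a rapidity grid. The sketch below does this for a single particle species; the single-species restriction, the kernel `hypothetical_kernel` (standing in for $\phi'$), the grid parameters and all function names are assumptions chosen for illustration, not the Toda kernel itself.

```python
import math

def hypothetical_kernel(t):
    """Illustrative smooth, decaying stand-in for phi'(theta); NOT the Toda kernel."""
    return 0.5 / math.cosh(t)

def convolution(grid, h, logs, kernel, t):
    """Rectangle-rule approximation of the integral in (5.9) at rapidity t."""
    return h * sum(kernel(t - a) * w for a, w in zip(grid, logs))

def solve_tba(mass, T, kernel, theta_max=6.0, n=81, iters=200, tol=1e-12):
    """Iterate eps(theta) = mass*cosh(theta) - (1/2pi) * (kernel * ln(1+e^{-eps/T}))."""
    h = 2.0 * theta_max / (n - 1)
    grid = [-theta_max + i * h for i in range(n)]
    eps = [mass * math.cosh(t) for t in grid]  # start from the free solution
    for _ in range(iters):
        logs = [math.log1p(math.exp(-e / T)) for e in eps]
        new = [mass * math.cosh(t) - convolution(grid, h, logs, kernel, t) / (2.0 * math.pi)
               for t in grid]
        delta = max(abs(x - y) for x, y in zip(new, eps))
        eps = new
        if delta < tol:
            break
    return grid, eps

def tba_residual(grid, eps, mass, T, kernel):
    """Largest violation of the discretized equation (5.9) on the grid."""
    h = grid[1] - grid[0]
    logs = [math.log1p(math.exp(-e / T)) for e in eps]
    return max(
        abs(mass * math.cosh(t) - e - convolution(grid, h, logs, kernel, t) / (2.0 * math.pi))
        for t, e in zip(grid, eps)
    )

grid, eps = solve_tba(mass=1.0, T=1.0, kernel=hypothetical_kernel)
```

With a vanishing kernel the iteration returns $\epsilon(\theta) = M\cosh\theta$ immediately. For the multi-species system (5.9) the same loop would run over all pairs (l, k), and convergence is not guaranteed for arbitrary kernels.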

5.2. Now let us consider the thermodynamical limit and thermodynamical states of the Toda field theory, regarded as a 2 + 1-dimensional model. It is not difficult to verify that the limit N, L → ∞ does not depend on the order in which it is taken. Let us consider the case where we first take the limit L → ∞ and then N → ∞. The limit L → ∞ for fixed N has been already described above. When N → ∞ the 2-dimensional macroscopic states correspond to the macroscopic number of 2 + 1-dimensional excitations. This means that we have to consider the states with $n = \sigma N$, where $n = \sum_{1\le l\le N} n_l$ and σ is fixed when N → ∞. The densities of holes and particles in such states will be functions of 2 variables (of 2-momentum): $\rho^h_l(\theta) \to \rho^h(\theta, \pi l/N)$, $\rho_l(\theta) \to \rho(\theta, \pi l/N)$. Let x = πl/N, y = πk/N, and β in (1.2) be fixed, and N → ∞. The function $\phi_{l,k}(\theta)$ has the following asymptotics in this limit:

\[ \phi_{l,k}(\theta) = 2\pi\,\frac{B^2 - B}{N}\,K(\theta|x, y) + O\Big(\frac{1}{N^3}\Big), \tag{5.10} \]

where $B = (1 + 4\pi/\beta^2)^{-1}$ and the function $K(\theta|x, y)$ has the following form:

\[ K(\theta|x, y) = \sinh(\theta)\Big\{ \frac{1}{\cosh(\theta) - \cos(x - y)} - \frac{1}{\cosh(\theta) - \cos(x + y)} \Big\}. \tag{5.11} \]

Using asymptotics (5.10) in Eqs. (5.4), (5.5), (5.7), we obtain the following description of macroscopic states in the affine Toda field theory regarded as a 2 + 1-dimensional field theory. The energy and the entropy of such states are:

\[ E = \frac{LN}{\pi} \int_0^{\pi}\!\!\int_{-\infty}^{\infty} M \sin(x) \cosh(\theta)\,\rho(\theta, x)\,d\theta\,dx, \tag{5.12} \]

\[ S = \frac{LN}{\pi} \int_0^{\pi}\!\!\int_{-\infty}^{\infty} \Big\{ (\rho(\theta, x) + \rho^h(\theta, x)) \ln(\rho(\theta, x) + \rho^h(\theta, x)) - \rho(\theta, x)\ln\rho(\theta, x) - \rho^h(\theta, x)\ln\rho^h(\theta, x) \Big\}\,d\theta\,dx. \tag{5.13} \]

The densities of holes and particles are related by the equation

\[ M \sin x \cosh\theta = 2\pi\rho(\theta, x) + 2\pi\rho^h(\theta, x) + 2(B^2 - B) \int_0^{\pi}\!\!\int_{-\infty}^{\infty} K'(\theta - \alpha|x, y)\,\rho(\alpha, y)\,d\alpha\,dy. \tag{5.14} \]

Minimizing the free energy (5.6), we obtain the following expression for the free energy of the Toda model for large N:

\[ F(T) = \frac{LN}{\pi} \int_0^{\pi}\!\!\int_{-\infty}^{\infty} M \sin(x) \cosh\theta\, \ln\big(1 + \exp(-\epsilon(\theta, x)/T)\big)\,d\theta\,dx, \tag{5.15} \]

where the function $\epsilon(\theta, x)$ is the solution to the following nonlinear integral equation:

\[ M \sin x \cosh\theta = \epsilon(\theta, x) + 2(B^2 - B) \int_0^{\pi}\!\!\int_{-\infty}^{\infty} K'(\theta - \alpha|x, y)\, \ln\big(1 + \exp(-\epsilon(\alpha, y)/T)\big)\,d\alpha\,dy. \tag{5.16} \]

These equations describe the equilibrium thermodynamics of the Toda system at N → ∞.


6. Conclusion

In this paper we studied the Toda field theory along two lines. We investigated the two dimensional thermodynamical limit of this model, and constructed the discrete spacetime approximation which partly has the discrete Lorentz invariance.

6.1. Continuum 3-dimensional limit. Let us analyze the continuum limit in the Toda field theory, where the third dimension becomes continuous as well. It is not difficult to see from the Lagrangian (1.1) that such a limit corresponds to β → ∞ and M = mβ with m being fixed. As a result we have the theory in 2 + 1-dimensional space-time with the Lagrangian

\[ L_{AT} = \int\!\!\int \Big\{ \frac{1}{2}\Big(\frac{\partial\phi}{\partial t}\Big)^2 - \Big(\frac{\partial\phi}{\partial x}\Big)^2 - 2m^2 \exp\frac{\partial\phi}{\partial y} \Big\}\,dx\,dy. \tag{6.1} \]

Equations of motion of this system have been considered by many authors, see for example [32, 33, 34]. Equations (5.15) and (5.16) imply the following asymptotics of the free energy in this limit:

\[ F(T) = L L_1 \int_0^{\infty}\!\!\int_{-\infty}^{\infty} m t \cosh(\theta)\, \ln\big(1 + \exp(-m t \cosh(\theta)/T)\big)\,d\theta\,dt. \]

This is the free energy of massless free particles. Such a behaviour of the free energy suggests that in the continuum limit the Toda system describes noninteracting particles. From the structure of the asymptotics of the free energy one can assume that the particles are fermions. This agrees with the property of the physical S-matrix for the affine Toda theory $S_{k,k}(0) = -1$. The fact that in the continuum limit the Toda field theory describes noninteracting free particles also can be seen from the corresponding limit in the Bethe equations (5.2). Recall that these equations describe possible values of rapidities of physical particles in the box of length L with periodic boundary conditions. In the continuum limit β → ∞, M = mβ, $N = L_1\beta$ with fixed m and $L_1$ the Eqs. (5.2) degenerate into the equations

\[ L p = 2\pi I, \qquad L_1 p_1 = \pi l m, \]

where $p = p_1 \sinh(\theta)$. The energy of such an excitation, according to (5.1), is $E^2 = p^2 + p_1^2$. It is clear that this is the spectrum of free relativistic massless two dimensional particles. Thus, the Toda theory has some self-interaction for finite β and N → ∞ but it becomes free in the continuum limit.

6.2. The large N limit in the principal chiral field theory, based on the group SU(N), has been studied in [35]. The difference between this model and the Toda theory is obvious: the large N limit of the principal chiral field theory describes some string-type objects, while the similar limit in the Toda field theory describes the 2 + 1-field theory. Now let us conclude with some open problems and conjectures.

– One has to understand the relation between the 3-dimensional model constructed in [37], its two dimensional counterpart [36], and the quantum discrete system constructed in Sects. 2–5. We conjecture that the model constructed in [37] corresponds to the discrete quantum Toda model at roots of 1. The relation should be similar to the one between the discrete sine-Gordon and the chiral Potts model [38].


– When q is a root of 1, the discrete Toda system has properties similar to the discrete sine-Gordon system at roots of 1: it describes the quantum integrable system, interacting with the classical integrable system.
– It would be interesting to compute the spectrum of the discrete quantum Toda field theory.
– The difference between the large N limits in the Toda field theories, related to other classical Lie algebras and the SL(N), can be interpreted as the other (nonperiodic) boundary conditions in the extra dimension. It would be interesting to compute the corresponding bulk terms in the free energy.
– We have constructed the quantum analog of the τ-functions, introduced and studied in [10, 14]. It would be interesting to obtain the formulas for these quantum τ-functions which would generalize the determinant formulas or similar constructions, known in the classical case.
– As L/m → 0 the spectrum of the two dimensional Toda field theory degenerates into the spectrum of the Toda chain (periodic for discrete affine Toda field theory and open for open Toda field theory). The physically interesting ground state for the Toda chain corresponds to the “lattice” boundary conditions when

\[ \langle q_n \rangle = an + O(1) \tag{6.2} \]

as n → ∞. Here a is a constant characterizing the spacing of the lattice. The energy of this ground state depends on a. For the Toda field theory in the limit N → ∞ we can impose a similar condition. Then the ground state will be a function of L and a. It would be interesting to compute the corresponding ground state energy and the spectrum of the excitations in this case. In the limit L/m → 0 it should degenerate to the spectrum of the infinite Toda chain with the boundary condition (6.2).

We are planning to return to these problems in the extended version of this publication.

Acknowledgement. We would like to thank L.D. Faddeev, I.G. Korepanov, J.M. Maillet, P. Sorba, A.M. Vershik and A. Yu. Volkov for interesting discussions. This work was completed when both of the authors visited the Laboratoire de Physique Theorique de ENS-Lyon. We are grateful to the members of the laboratory and especially to Jean-Michel Maillet and Paul Sorba for the hospitality. The work of R.K. is supported by CNRS.

References

1. Corrigan, E.: Recent developments in affine Toda field theory. Preprint DTP-94/55, hep-th/9412213
2. Polyakov, A.: Phys. Lett. B103, 207 (1981)
3. Arinstein, A., Fateev, V., Zamolodchikov, A.: Phys. Lett. B87, 389 (1979)
4. Ablowitz, M., Ladik, J.: A nonlinear difference scheme and inverse scattering. Stud. Appl. Math. 55, 213–229 (1976)
5. Faddeev, L., Volkov, A.: Quantum inverse scattering method on a space-time lattice. Theor. and Math. Phys. 92, 207–214 (1992)
6. Faddeev, L., Volkov, A.: Hirota equation as an example of integrable symplectic map. Lett. Math. Phys. 32, 125 (1994), hep-th/9405087
7. Faddeev, L.D.: Current-like variables in massive and massless integrable models. Lectures delivered at the International School of Physics “Enrico Fermi”, held in Villa Monastero, Varenna, Italy, 1994, hep-th/9408041
8. Bobenko, A., Bazhanov, V., Reshetikhin, N.: Quantum discrete sine-Gordon model at roots of 1: integrable system on the integrable classical background. Commun. Math. Phys.
9. Bobenko, A., Kutz, N., Pinkall, U.: The discrete quantum pendulum. Phys. Lett. A177, 399–404 (1993)
10. Hirota, R.: J. Phys. Soc. Japan 50, 3785 (1981)
11. Ward, R.S.: Discrete Toda field equations. Preprint DTP/95/3; solv-int/9502002
12. Korepanov, I.G.: Algebraic integrable dynamical systems, 2+1-dimensional models in wholly-discrete space-time, and inhomogeneous models in 2-dimensional statistical physics. solv-int/9506003
13. Suris, Yu.: Phys. Lett. A 156, 467 (1991)
14. Jimbo, M., Miwa, T.: Solitons and Infinite Dimensional Lie Algebras. Publ. RIMS, Kyoto University 19, 943–1001 (1983)
15. Mikhailov, A., Olshanetsky, M., Perelomov, A.: Commun. Math. Phys. 79, 473 (1981)
16. Leznov, A.N., Saveliev, M.V.: Group-theoretical methods for the integration of nonlinear dynamical systems. Basel–Berlin: Birkhäuser Verlag, 1992
17. Toda, M.: Theory of nonlinear lattices. Berlin–Heidelberg–New York: Springer, 1988
18. Kostant, B.: The solution to a generalized Toda lattice and representation theory. Adv. Math. 34, 195–338 (1979)
19. Reyman, A., Semenov-Tian-Shanski, M.: Integrable Systems. Modern Problems in Mathematics 16, Dynamical Systems-7, Publications of VINITY, 1987 (in Russian)
20. Faddeev, L., Takhtajan, L.: Hamiltonian methods in the theory of solitons. Berlin–Heidelberg–New York: Springer-Verlag, 1987
21. Korepin, V., Bogolubov, N., Izergin, A.: Quantum Inverse Scattering Method and Correlation Functions. Cambridge: Cambridge University Press, 1993
22. Moser, J., Veselov, A.: Discrete versions of some classical integrable systems and factorization of matrix polynomials. Preprint, ETH, Zurich, 1989
23. Deift, P., Li, L.C., Tomei, C.: Loop groups, integrable systems, and rank 2 extensions. Memoirs of the AMS 479 (1991)
24. Reshetikhin, N.: Integrable discrete systems. Lectures given at “Enrico Fermi school of Physics”, Varenna, July 1994
25. Date, E., Jimbo, M., Miki, K., Miwa, T.: Generalized Chiral Potts Model and Minimal Cyclic Representations of $U_q(\widehat{sl}(n))$. Commun. Math. Phys. 137, 133–148 (1991)
26. Faddeev, L.D., Slavnov, A.A.: Gauge fields: Introduction to quantum theory. Reading: Benjamin/Cummings, 1988
27. Kulish, P., Sklyanin, E., Reshetikhin, N.: Yang-Baxter equation and representation theory, I. Lett. Math. Phys. 5, 393–403 (1981)
28. Green, M., Schwarz, J., Witten, E.: Superstring Theory. Cambridge: Cambridge University Press, 1987
29. Yang, C.N., Yang, C.P.: Thermodynamics of a one-dimensional system of bosons with repulsive delta-function interactions. J. Math. Phys. 10, 1115–1122 (1967)
30. Andrei, N., Lowenstein, J.: Phys. Rev. Lett. 46, 356 (1981)
31. Japaridze, G., Nersesyan, A., Wiegmann, P.: Nucl. Phys. B230 [FS10], 511 (1984)
32. Boyer, O., Finley, J.: J. Math. Phys. 23, 1126 (1982)
33. Saveliev, M.V.: Commun. Math. Phys. 121, 283 (1989)
34. Saveliev, M.V., Vershik, A.M.: Commun. Math. Phys. 126, 367 (1989)
35. Fateev, V., Kazakov, V., Wiegmann, P.: Principal chiral field at large N. Nucl. Phys. B424 [FS], 505–520 (1994)
36. Bazhanov, V., Kashaev, R., Mangazeev, V., Stroganov, Yu.: $Z_N^{\times(n-1)}$ Generalization of the Chiral Potts Model. Commun. Math. Phys. 138, 393–408 (1991)
37. Bazhanov, V., Baxter, R.J.: J. Stat. Phys. 69, 453 (1992); J. Stat. Phys. 71, 839 (1993)
38. Bazhanov, V., Reshetikhin, N.: Chiral Potts model and the discrete sine-Gordon model at roots of 1. Preprint 1995
39. Kashaev, R.M., Reshetikhin, N.Yu.: Preprint ENSLAPP-L-548/95, hep-th/9507065

Communicated by T. Miwa

Commun. Math. Phys. 188, 267 – 304 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Quantum Integrable Models and Discrete Classical Hirota Equations

I. Krichever¹, O. Lipan², P. Wiegmann³, A. Zabrodin⁴

¹ Department of Mathematics, Columbia University, New York, NY 10027, USA and Landau Institute for Theoretical Physics, Kosygina str. 2, 117940 Moscow, Russia
² James Franck Institute of the University of Chicago, 5640 S. Ellis Avenue, Chicago, IL 60637, USA
³ James Franck Institute and Enrico Fermi Institute of the University of Chicago, 5640 S. Ellis Avenue, Chicago, IL 60637, USA and Landau Institute for Theoretical Physics
⁴ Joint Institute of Chemical Physics, Kosygina str. 4, 117334, Moscow, Russia and ITEP, 117259, Moscow, Russia

Received: 15 May 1996 / Accepted: 25 November 1996

Abstract: The standard objects of quantum integrable systems are identified with elements of classical nonlinear integrable difference equations. The functional relation for commuting quantum transfer matrices of quantum integrable models is shown to coincide with classical Hirota’s bilinear difference equation. This equation is equivalent to the completely discretized classical 2D Toda lattice with open boundaries. Elliptic solutions of Hirota’s equation give a complete set of eigenvalues of the quantum transfer matrices. Eigenvalues of Baxter’s Q-operator are solutions to the auxiliary linear problems for classical Hirota’s equation. The elliptic solutions relevant to the Bethe ansatz are studied. The nested Bethe ansatz equations for $A_{k-1}$-type models appear as discrete time equations of motion for zeros of classical τ-functions and Baker-Akhiezer functions. Determinant representations of the general solution to bilinear discrete Hirota’s equation are analysed and a new determinant formula for eigenvalues of the quantum transfer matrices is obtained. Difference equations for eigenvalues of the Q-operators which generalize Baxter’s three-term T-Q-relation are derived.

1. Introduction

In spite of the diversity of solvable models of quantum field theory and the vast variety of methods, the final results display dramatic unification: the spectrum of an integrable theory with a local interaction is given by a sum of elementary energies

\[ E = \sum_i \varepsilon(u_i), \tag{1.1} \]

where $u_i$ obey a system of algebraic or transcendental equations known as Bethe equations [4, 16]. The major ingredients of Bethe equations are determined by the algebraic structure of the problem. A typical example of a system of Bethe equations (related to $A_1$-type models with an elliptic R-matrix) is

\[ e^{-4\eta\nu}\,\frac{\phi(u_j)}{\phi(u_j - 2)} = -\prod_k \frac{\sigma(\eta(u_j - u_k + 2))}{\sigma(\eta(u_j - u_k - 2))}, \tag{1.2} \]

where σ(x) is the Weierstrass σ-function and

\[ \phi(u) = \prod_{k=1}^{N} \sigma(\eta(u - y_k)). \tag{1.3} \]

Entries of these equations which encode information of the model are the function ε(u) (entering through φ(u)), quasiperiods $\omega_1$, $\omega_2$ of the σ-function, parameters η, ν, $y_k$ and size of the system N. Different solutions of the Bethe equations correspond to different quantum states of the model. In this paper we show that these equations, which are usually considered as a tool inherent to the quantum integrability, arise naturally as a result of the solution of entirely classical non-linear discrete time integrable equations. This suggests an intriguing interrelation (if not equivalence) between integrable quantum field theories and classical soliton equations in discrete time. In forthcoming papers we will show that the Bethe equations themselves may be considered as a discrete integrable dynamical system.

R. Hirota proposed [20] a difference equation which unifies the majority of known continuous soliton equations, including their hierarchies [42, 12]. A particular case of the Hirota equation is a bilinear difference equation for a function τ(n, l, m) of three discrete variables:

\[ \alpha\,\tau(n, l+1, m)\,\tau(n, l, m+1) + \beta\,\tau(n, l, m)\,\tau(n, l+1, m+1) + \gamma\,\tau(n+1, l+1, m)\,\tau(n-1, l, m+1) = 0, \tag{1.4} \]

where it is assumed that α + β + γ = 0. Different continuum limits at different boundary conditions then reproduce continuous soliton equations (KP, Toda lattice, etc). On the other hand, τ(n, l, m) can be identified [42] with the τ-function of a continuous hierarchy expressed through special independent variables. The same equation (with a particular boundary condition) has quite unexpectedly appeared in the theory of quantum integrable systems as a fusion relation for the transfer matrix (trace of the quantum monodromy matrix).

The transfer matrix is one of the key objects in the theory of quantum integrable systems [13]. Transfer matrices form a commutative family of operators acting in the Hilbert space of a quantum problem. Let $R_{i,A}(u)$ be the R-matrix acting in the tensor product of Hilbert spaces $V_i \otimes V_A$. Then the transfer matrix is a trace over the auxiliary space $V_A$ of the monodromy matrix, the latter being the matrix product of N R-matrices with a common auxiliary space:

\[ \hat T_A(u|y_i) = R_{N,A}(u - y_N) \cdots R_{2,A}(u - y_2)\,R_{1,A}(u - y_1), \qquad T_A(u) = \mathrm{tr}_A\,\hat T_A(u|y_i). \tag{1.5} \]

The transfer matrices commute for all values of the spectral parameter u and different auxiliary spaces:

\[ [T_A(u), T_{A'}(u')] = 0. \tag{1.6} \]

They can be diagonalized simultaneously. The family of eigenvalues of the transfer matrix is an object of primary interest in an integrable system, since the spectrum of the quantum problem can be expressed in terms of eigenvalues of the transfer matrix.


The transfer matrix corresponding to a given representation in the auxiliary space can be constructed out of transfer matrices for some elementary space by means of the fusion procedure [35, 36, 26]. The fusion procedure is based on the fact that at certain values of the spectral parameter u the R-matrix becomes essentially a projector onto an irreducible representation space. The fusion rules are especially simple in the $A_1$-case. For example, the $R_{1,1}(u)$-matrix for two spin-1/2 representations in a certain normalization of the spectral parameter is proportional to the projector onto the singlet (spin-0 state) at u = +2 and onto the triplet (spin-1 subspace) at u = −2, in accordance with the decomposition [1/2] + [1/2] = [0] + [1]. Then the transfer matrix $T^1_2(u)$ with spin-1 auxiliary space is obtained from the product of two spin-1/2 monodromy matrices $\hat T^1_1(u)$ with arguments shifted by 2:

\[ T^1_2(u) = \mathrm{tr}_{[1]}\Big( R_{1,1}(-2)\,\hat T^1_1(u+1)\,\hat T^1_1(u-1)\,R_{1,1}(-2) \Big). \]

A combination of the fusion procedure and the Yang-Baxter equation results in numerous functional relations (fusion rules) for the transfer matrix [35, 47]. They were recently combined into a universal bilinear form [30, 37]. The bilinear functional relations have the most simple closed form for the models of the $A_{k-1}$-series and representations corresponding to rectangular Young diagrams. Let $T^a_s(u)$ be the transfer matrix for the rectangular Young diagram of length a and height s. If η cannot be represented in the form $\eta = r_1\omega_1 + r_2\omega_2$ with rational $r_1$, $r_2$ (below we always assume that this is the case; for models with trigonometric R-matrices this means that the quantum deformation parameter q would not be a root of unity), they obey the following bilinear functional relation:

\[ T^a_s(u+1)\,T^a_s(u-1) - T^a_{s+1}(u)\,T^a_{s-1}(u) = T^{a+1}_s(u)\,T^{a-1}_s(u). \tag{1.7} \]

Since $T^a_s(u)$ commute at different u, a, s, the same equation holds for eigenvalues of the transfer matrices, so we can (and will) treat $T^a_s(u)$ in Eq. (1.7) as number-valued functions. The bilinear fusion relations for models related to other Dynkin graphs were suggested in ref. [37]. Remarkably, the bilinear fusion relations (1.7) appear to be identical to the Hirota equation (1.4). Indeed, one can eliminate the constants α, β, γ by the transformation

\[ \tau(n, l, m) = \frac{(-\alpha/\gamma)^{n^2/2}}{(1 + \gamma/\alpha)^{lm}}\,\tau_n(l, m), \]

so that

\[ \tau_n(l+1, m)\,\tau_n(l, m+1) - \tau_n(l, m)\,\tau_n(l+1, m+1) = \tau_{n+1}(l+1, m)\,\tau_{n-1}(l, m+1), \tag{1.8} \]

and then change variables from light-cone coordinates n, l, m to the “direct” variables

\[ a = n, \qquad s = l + m, \qquad u = l - m - n, \qquad \tau_n(l, m) \equiv T^a_{l+m}(l - m - n). \tag{1.9} \]
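The bookkeeping behind the passage from (1.4) to (1.8) and then to (1.7) can be checked mechanically. The sketch below uses exact rational arithmetic; the function names and the sample values of α, γ, n, l, m are illustrative assumptions. It verifies two things: that the gauge factor turns the coefficients (α, β, γ) of (1.4) into α·(1, −1, −1) when α + β + γ = 0, and that the change of variables (1.9) maps the six τ-factors of (1.8) onto the six T-factors of (1.7) taken at the point (a, s + 1, u).

```python
from fractions import Fraction

def to_direct(n, l, m):
    """Change of variables (1.9): (n, l, m) -> (a, s, u)."""
    return (n, l + m, l - m - n)

# Shifts of (n, l, m) for the six tau-factors of (1.8), in order of appearance.
TAU_SHIFTS = [(0, 1, 0), (0, 0, 1), (0, 0, 0), (0, 1, 1), (1, 1, 0), (-1, 0, 1)]
# Shifts of (a, s, u) for the matching T-factors of (1.7):
# T(u+1), T(u-1), T_{s-1}, T_{s+1}, T^{a+1}, T^{a-1}.
T_SHIFTS = [(0, 0, 1), (0, 0, -1), (0, -1, 0), (0, 1, 0), (1, 0, 0), (-1, 0, 0)]

def variables_match(n, l, m):
    """True if (1.9) maps the arguments of (1.8) onto those of (1.7)."""
    a, s, u = to_direct(n, l, m)
    base = (a, s + 1, u)  # (1.7) is evaluated at (a, s + 1, u)
    for (dn, dl, dm), (da, ds, du) in zip(TAU_SHIFTS, T_SHIFTS):
        image = to_direct(n + dn, l + dl, m + dm)
        if image != (base[0] + da, base[1] + ds, base[2] + du):
            return False
    return True

def reduced_coefficients(alpha, gamma, n, l, m):
    """Coefficients of the three terms of (1.4) after the gauge transformation,
    divided by the gauge factor of the first term. Pass Fractions for exactness."""
    beta = -alpha - gamma          # the constraint alpha + beta + gamma = 0
    c = -alpha / gamma             # base of the n^2/2 power
    d = 1 + gamma / alpha          # base of the (inverse) l*m power
    terms = [
        (alpha, (n, l + 1, m), (n, l, m + 1)),
        (beta, (n, l, m), (n, l + 1, m + 1)),
        (gamma, (n + 1, l + 1, m), (n - 1, l, m + 1)),
    ]

    def exponents(p, q):
        twice_c = p[0] ** 2 + q[0] ** 2          # twice the exponent of c
        d_exp = -(p[1] * p[2] + q[1] * q[2])     # exponent of d
        return twice_c, d_exp

    ref_c, ref_d = exponents(terms[0][1], terms[0][2])
    out = []
    for coef, p, q in terms:
        twice_c, d_exp = exponents(p, q)
        out.append(coef * c ** ((twice_c - ref_c) // 2) * d ** (d_exp - ref_d))
    return tuple(out)

coeffs = reduced_coefficients(Fraction(2), Fraction(3), 4, -1, 2)
```

Working with exponent differences avoids fractional powers of −α/γ for odd n; the relative exponent between the third and first terms is always exactly 1.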

At least at a formal level, this transformation provides the equivalence between Eqs. (1.7), (1.4) and (1.8). In what follows we call Eq. (1.8) (or (1.7)) Hirota’s bilinear difference equation (HBDE). Leaving aside more fundamental aspects of this “coincidence," we exploit, as a first step, some technical advantages it offers. Specifically, we treat the functional relation (1.7) not as an identity but as a fundamental equation which (together with particular


boundary and analytical conditions) completely determines all the eigenvalues of the transfer matrix. The solution to HBDE then appears in the form of the Bethe equations. We anticipate that this approach makes it possible to use some specific tools of classical integrability and, in particular, the finite gap integration technique.

The origin of $T^a_s(u)$ as an eigenvalue of the transfer matrix (1.5) imposes specific boundary conditions and, what is perhaps even more important, requires certain analytic properties of the solutions. As a general consequence of the Yang-Baxter equation, the transfer matrices may always be normalized to be elliptic polynomials in the spectral parameter, i.e. finite products of Weierstrass σ-functions (as in (1.3)). The problem therefore is stated as finding elliptic solutions of HBDE. A similar problem appeared in the theory of continuous soliton equations since the works [1, 11], wherein a remarkable connection between the motion of poles of the elliptic solutions to the KdV equation and the Calogero-Moser dynamical system was revealed. Elliptic solutions to Kadomtsev-Petviashvili (KP), matrix KP equations and the matrix 2D Toda lattice (2DTL) were analyzed in Refs. [31, 32, 33], respectively. It was shown, in particular, that poles of elliptic solutions to the abelian 2DTL (i.e. zeros of corresponding τ-functions and Baker-Akhiezer functions) move according to the equations of motion for the Ruijsenaars-Schneider (RS) system of particles [48]. Analytic properties of solutions to HBDE relevant to the Bethe ansatz suggest a similar interpretation of Bethe ansatz equations.

We will show that the nested Bethe ansatz for $A_{k-1}$-type models is equivalent to a chain of Bäcklund transformations of HBDE. The nested Bethe ansatz equations arise as equations of motion for zeros of the Baker-Akhiezer functions in discrete time (discrete time RS system¹). The discrete time variable is identified with the level of the nested Bethe ansatz.

The paper is organized as follows. In Sect. 2 we review general properties and boundary conditions of solutions to HBDE that yield eigenvalues of quantum transfer matrices. In Sect. 3 the zero curvature representation of HBDE and the auxiliary linear problems are presented. We also discuss the duality relation between “wave functions” and “potentials” and define Bäcklund flows on the set of wave functions. These flows are important ingredients of the nested Bethe ansatz scheme. For illustrative purposes, in Sect. 4, we give a self-contained treatment of the $A_1$-case, where the major part of the construction contains familiar objects from the usual Bethe ansatz. Section 5 is devoted to the general $A_{k-1}$-case. We give a general solution to HBDE with the required boundary conditions. This leads to a new type of determinant formulas for eigenvalues of quantum transfer matrices. A sketch of proof of this result is presented in the appendix to Sect. 5. Generalized Baxter’s relations (difference equations for $Q_t(u)$) are written in the explicit form. They are used for examining the equivalence to the standard Bethe ansatz results. In Sect. 6 a part of the general theory of elliptic solutions to HBDE is given. Section 7 contains a discussion of the results.

2. General Properties of Solutions to Hirota’s Equation Relevant to Bethe Ansatz

2.1. Boundary conditions and analytic properties. HBDE has many different solutions. Not all of them give eigenvalues of the transfer matrix (1.5). There are certain boundary and analytic conditions imposed on the transfer matrix (1.5).

¹ It should be noted that equations of motion for the discrete time RS system were already written down in the paper [43]. However, the relation to elliptic solutions of discrete soliton equations and their nested Bethe ansatz interpretation were not discussed there.


(i) It is known that $T^k_s(u)$, the transfer matrix in the most antisymmetrical representation in the auxiliary space, is a scalar, i.e. it has only one eigenvalue (sometimes called the quantum determinant $\det_q \hat T_s(u)$ of the monodromy matrix). It depends on the representation in the quantum space of the model and is known explicitly. In the simplest case of the vector representation (one-box Young diagram) in the quantum space it is [34]:

\[ T^k_s(u) = \phi(u - s - k) \prod_{l=0}^{k-1} \prod_{p=1}^{s-1} \phi(u + s + k - 2l - 2p - 2) \prod_{l=1}^{k-1} \phi(u + s + k - 2l), \tag{2.1} \]

\[ T^0_s(u) = 1. \tag{2.2} \]

These values of $T^0_s(u)$ and $T^k_s(u)$ should be considered as boundary conditions. Let us note that they obey the discrete Laplace equation:

\[ T^k_s(u+1)\,T^k_s(u-1) = T^k_{s+1}(u)\,T^k_{s-1}(u). \tag{2.3} \]

This leads to the boundary condition (b.c.)

\[ T^a_s(u) = 0 \quad \text{as } a < 0 \text{ and } a > k \tag{2.4} \]

(with this b.c. Eq. (1.8) is known as the discrete two-dimensional Toda molecule equation [22], an integrable discretization of the conformal Toda field theory [8]).

(ii) The second important condition (which follows, eventually, from the Yang-Baxter equation) is that $T^a_s(u)$ has to be an elliptic polynomial in the spectral parameter u. By elliptic polynomial we mean essentially a finite product of Weierstrass σ-functions. For models with a rational R-matrix it degenerates to a usual polynomial in u. To give a more precise formulation of this property, let us note that Eq. (1.7) has the gauge invariance under a transformation parametrized by four arbitrary functions $\chi_i$ of one variable:

\[ T^a_s(u) \to \chi_1(a + u + s)\,\chi_2(a - u + s)\,\chi_3(a + u - s)\,\chi_4(a - u - s)\,T^a_s(u). \tag{2.5} \]

These transformations can remove all zeros from the characteristics a ± s ± u = const. We require that the remaining part of all $T^a_s(u)$ should be an elliptic (trigonometric, rational) polynomial of one and the same degree N, where N is the number of sites on the lattice (see (1.3)). One can formulate this condition in a gauge invariant form by introducing the gauge invariant combination

\[ Y^a_s(u) = \frac{T^a_{s+1}(u)\,T^a_{s-1}(u)}{T^{a+1}_s(u)\,T^{a-1}_s(u)}. \tag{2.6} \]

We require $Y^a_s(u)$ to be an elliptic function having 2N zeros and 2N poles in the fundamental domain. This implies that $T^a_s(u)$ has the general form²

\[ T^a_s(u) = A^a_s\,e^{\mu(a,s)u} \prod_{j=1}^{N} \sigma\big(\eta(u - z_j^{(a,s)})\big), \tag{2.7} \]

where $z_j^{(a,s)}$, $A^a_s$, $\mu(a, s)$ do not depend on u and the following constraints hold:

\[ \sum_{j=1}^{N} \big(z_j^{(a,s+1)} + z_j^{(a,s-1)}\big) = \sum_{j=1}^{N} \big(z_j^{(a+1,s)} + z_j^{(a-1,s)}\big), \tag{2.8} \]

\[ \mu(a, s+1) + \mu(a, s-1) = \mu(a+1, s) + \mu(a-1, s). \tag{2.9} \]

² This differs from a more traditional expression in terms of Jacobi θ-functions by a simple normalization factor.

Another gauge invariant combination,

\[ X^a_s(u) = -\frac{T^a_s(u+1)\,T^a_s(u-1)}{T^{a+1}_s(u)\,T^{a-1}_s(u)} = -1 - Y^a_s(u), \tag{2.10} \]

is also convenient. As a reference, we point out gauge invariant forms of HBDE [37]:

\[ Y^a_s(u+1)\,Y^a_s(u-1) = \frac{(1 + Y^a_{s+1}(u))(1 + Y^a_{s-1}(u))}{(1 + (Y^{a+1}_s(u))^{-1})(1 + (Y^{a-1}_s(u))^{-1})}, \tag{2.11} \]

\[ X^a_{s+1}(u)\,X^a_{s-1}(u) = \frac{(1 + X^a_s(u+1))(1 + X^a_s(u-1))}{(1 + (X^{a+1}_s(u))^{-1})(1 + (X^{a-1}_s(u))^{-1})}. \tag{2.12} \]
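For orientation, here is one way to see how the form (2.11) can be recovered from (1.7) and the definition (2.6). This is a short worked sketch, not necessarily the route taken in [37]; it assumes that none of the $T^a_s(u)$ involved vanishes.

```latex
% Divide (1.7) by T^{a+1}_s(u) T^{a-1}_s(u), or by T^a_{s+1}(u) T^a_{s-1}(u),
% and use the definition (2.6):
1 + Y^a_s(u) = \frac{T^a_s(u+1)\,T^a_s(u-1)}{T^{a+1}_s(u)\,T^{a-1}_s(u)},
\qquad
1 + \big(Y^a_s(u)\big)^{-1} = \frac{T^a_s(u+1)\,T^a_s(u-1)}{T^a_{s+1}(u)\,T^a_{s-1}(u)} .

% Insert these into the right-hand side of (2.11); the factors
% T^{a\pm1}_{s\pm1}(u) cancel between numerator and denominator:
\frac{(1+Y^a_{s+1}(u))\,(1+Y^a_{s-1}(u))}
     {(1+(Y^{a+1}_s(u))^{-1})\,(1+(Y^{a-1}_s(u))^{-1})}
= \frac{T^a_{s+1}(u+1)\,T^a_{s+1}(u-1)\,T^a_{s-1}(u+1)\,T^a_{s-1}(u-1)}
       {T^{a+1}_s(u+1)\,T^{a+1}_s(u-1)\,T^{a-1}_s(u+1)\,T^{a-1}_s(u-1)}
= Y^a_s(u+1)\,Y^a_s(u-1).
```

The first relation is also what gives $X^a_s(u) = -1 - Y^a_s(u)$ in (2.10).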

It can be shown that the minimal polynomial appears in the gauge

\[ T^a_s(u) \to T^a_s(u) \left[ \prod_{l=0}^{a-1} \prod_{p=1}^{s-1} \phi(u + s + a - 2l - 2p - 2) \prod_{l=1}^{a-1} \phi(u + s + a - 2l) \right]^{-1}, \tag{2.13} \]

where all the “trivial” zeros (common for all the eigenvalues) of the transfer matrix are removed (see e.g. [54]). The boundary values at a = 0, k then become:

\[ T^0_s(u) = \phi(u + s), \qquad T^k_s(u) = \phi(u - s - k). \tag{2.14} \]

From now on we adopt this normalization.

(iii) The analyticity conditions and b.c. (2.14) lead to a particular “initial condition” in s. It is convenient, however, to take advantage of it before the actual derivation. The condition reads

\[ T^a_s(u) = 0 \quad \text{for any } -k < s < 0,\ 0 < a < k. \tag{2.15} \]

This is consistent with (1.7), (2.14) and implies

\[ T^a_0(u) = \phi(u - a) \tag{2.16} \]

for 0 ≤ a ≤ k. Under the analyticity conditions (i) and the b.c. (2.14) (and their consequences (2.15), (2.16)) each solution to HBDE (1.7) corresponds to an eigenstate of the $A_{k-1}$-transfer matrix. The same conditions are valid for higher representations of the quantum space. However, in that case there are certain constraints on zeros of φ(u) (they should form “strings”), whence $T^a_s(u)$ acquires extra “trivial” zeros. Here we do not address this question.

2.2. Plücker relations and determinant representations of solutions. Classical integrable equations in Hirota’s bilinear form are known to be naturally connected [50, 25, 51],


with geometry of Grassmann’s manifolds (grassmannians) (see [24, 23, 19]), in general of an infinite dimension. The type of the grassmannian is specified by boundary conditions. Remarkably, the b.c. (2.4) required for Bethe ansatz solutions corresponds to finite dimensional grassmannians. This connection suggests a simple way to write down a general solution in terms of determinants and to transmit the problem to the boundary conditions. Numerous determinant formulas may be obtained in this way.

The grassmannian $G^{r+1}_{n+1}$ is a collection of all (r + 1)-dimensional linear subspaces of the complex (n + 1)-dimensional vector space $\mathbb{C}^{n+1}$. In particular, $G^1_{n+1}$ is the complex projective space $\mathbb{P}^n$. Let $X \in G^{r+1}_{n+1}$ be such a (r + 1)-dimensional subspace spanned by vectors $x^{(j)} = \sum_{i=0}^{n} x^{(j)}_i e^i$, j = 1, . . . , r + 1, where $e^i$ are basis vectors in $\mathbb{C}^{n+1}$. The collection of their coordinates forms a rectangular (n + 1) × (r + 1)-matrix $x^{(j)}_i$. Let us consider its (r + 1) × (r + 1) minors

\[ \det_{pq}\big(x^{(q)}_{i_p}\big) \equiv (i_0, i_1, \ldots, i_r), \qquad p, q = 0, 1, \ldots, r, \tag{2.17} \]

obtained by choosing r + 1 lines $i_0, i_1, \ldots, i_r$. These $C^{r+1}_{n+1}$ minors are called Plücker coordinates of X. They are defined up to a common scalar factor and provide the Plücker embedding of the grassmannian $G^{r+1}_{n+1}$ into the projective space $\mathbb{P}^d$, where $d = C^{r+1}_{n+1} - 1$ ($C^{r+1}_{n+1}$ is the binomial coefficient).

The image of $G^{r+1}_{n+1}$ in $\mathbb{P}^d$ is realized as an intersection of quadrics. This means that the coordinates $(i_0, i_1, \ldots, i_r)$ are not independent but obey the Plücker relations [23, 19]:

(i0 , i1 , ..., ir )(j0 , j1 , ..., jr ) =

r X

(jp , i1 , ..., ir )(j0 , ...jp−1 , i0 , jp+1 ..., jr )

(2.18)

p=0

for all $i_p, j_p$, $p = 0, 1, \ldots, r$. Here it is implied that the symbol $(i_0, i_1, \ldots, i_r)$ is antisymmetric in all the indices, i.e., $(i_0, \ldots, i_{p-1}, i_p, \ldots, i_r) = -(i_0, \ldots, i_p, i_{p-1}, \ldots, i_r)$, and it equals zero if any two indices coincide. If one treats these relations as equations rather than identities, then determinants (2.17) would give a solution to Hirota’s equations.

The Plücker relations in their general form (2.18) describe fusion rules for transfer matrices corresponding to arbitrary Young diagrams. At the same time these general fusion rules can be recast [40] into the form of higher equations of the discrete KP hierarchy. These are $n$-term bilinear equations for functions of $n$ variables [12, 44]. In this paper we restrict ourselves to the three-term Hirota equation.

In order to reduce general Plücker relations to the 3-term HBDE, one should take $i_p = j_p$ for $p \neq 0, 1$. Then all terms but the first two in the r.h.s. of (2.18) vanish and one is left with the 3-term relation

\[ (i_0, i_1, \ldots, i_r)(j_0, j_1, i_2, \ldots, i_r) = (j_0, i_1, i_2, \ldots, i_r)(i_0, j_1, i_2, \ldots, i_r) + (j_1, i_1, i_2, \ldots, i_r)(j_0, i_0, i_2, \ldots, i_r). \tag{2.19} \]
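The 3-term relation (2.19) can be checked numerically on a small example. The sketch below is only an illustration: the helper names `det`, `bracket` and `three_term_defect`, and the random 6 × 3 integer matrix, are assumptions made here and are not part of the paper. The symbol $(i_0, \ldots, i_r)$ is computed as the minor of the coordinate matrix built from the listed rows, in the listed order, using exact rational arithmetic.

```python
from fractions import Fraction
import random

def det(m):
    """Exact determinant by Laplace expansion along the first row (fine for small sizes)."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def bracket(x, rows):
    """Symbol (i_0, ..., i_r) of (2.17): the minor of x built from the listed rows, in that order."""
    return det([[Fraction(v) for v in x[i]] for i in rows])

def three_term_defect(x, i0, i1, j0, j1, rest):
    """Left side minus right side of (2.19); `rest` holds the shared indices i_2, ..., i_r."""
    lhs = bracket(x, [i0, i1] + rest) * bracket(x, [j0, j1] + rest)
    rhs = (bracket(x, [j0, i1] + rest) * bracket(x, [i0, j1] + rest)
           + bracket(x, [j1, i1] + rest) * bracket(x, [j0, i0] + rest))
    return lhs - rhs

# Hypothetical sample: n + 1 = 6 rows, r + 1 = 3 columns, random integer entries.
random.seed(0)
sample = [[random.randint(-9, 9) for _ in range(3)] for _ in range(6)]
defect = three_term_defect(sample, 0, 1, 2, 3, [4])  # vanishes for any matrix
```

Because the relation is an identity in the matrix entries, the defect vanishes for any choice of the matrix and of the five row indices.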

After substitution of (2.17) these elementary Plücker relations turn into certain determinant identities. For example, choosing $x^{(j)}_{i_0} = \delta_{pj}$, $x^{(j)}_{j_0} = \delta_{qj}$, $q \neq p$, one can recast Eq. (2.19) into the form of the Jacobi identity:

\[ D[p|p] \cdot D[q|q] - D[p|q] \cdot D[q|p] = D[p, q|p, q] \cdot D, \tag{2.20} \]

where D is the determinant of a (r + 1) × (r + 1)-matrix and D[p1 , p2 |q1 , q2 ] denotes the determinant of the same matrix with p1,2 -th rows and q1,2 -th columns removed. Another


I. Krichever, O. Lipan, P. Wiegmann, A. Zabrodin

useful identity contained in Eq. (2.19) connects minors $D[l_1, l_2]$ of an $(r+3)\times(r+1)$ rectangular matrix, where the two rows $l_1, l_2$ are removed:

\[ D[l_1, l_3] \cdot D[l_2, l_4] - D[l_1, l_2] \cdot D[l_3, l_4] = D[l_1, l_4] \cdot D[l_2, l_3], \qquad l_1 < l_2 < l_3 < l_4. \tag{2.21} \]

Identifying terms in Eq. (2.19) with terms in HBDE (1.8), one obtains various determinant representations of solutions to HBDE. Two of them follow from the Jacobi identity (2.20):

\[ \tau_a(l, m) = \det\bigl(\tau_1(l + i - a, m - j + a)\bigr), \qquad i, j = 1, \ldots, a, \qquad \tau_0(l, m) = 1, \tag{2.22} \]

or, in “direct” variables,

\[ T^a_s(u) = \det\bigl(T^1_{s+i-j}(u + i + j - a - 1)\bigr), \qquad i, j = 1, \ldots, a, \qquad T^0_s(u) = 1. \tag{2.23} \]
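A quick numerical sanity check of (2.23) is sketched below. It assumes that HBDE (1.7) has the bilinear form displayed in (3.17), namely $T^a_s(u+1)T^a_s(u-1) - T^a_{s+1}(u)T^a_{s-1}(u) = T^{a+1}_s(u)T^{a-1}_s(u)$. The polynomial chosen for $T^1_s(u)$ is an arbitrary, hypothetical choice of initial data, and the helper names are introduced here only for illustration.

```python
from fractions import Fraction

def det(m):
    """Exact determinant by Laplace expansion; the empty determinant equals 1."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def T1(s, u):
    """Arbitrary initial data T^1_s(u); this polynomial is a hypothetical choice."""
    return Fraction(s ** 3 + 2 * u * u - s * u + 5)

def T(a, s, u):
    """T^a_s(u) built from T^1 by the determinant formula (2.23); T^0_s(u) = 1."""
    return det([[T1(s + i - j, u + i + j - a - 1) for j in range(1, a + 1)]
                for i in range(1, a + 1)])

def hirota_defect(a, s, u):
    """Left side minus right side of the bilinear equation in the form (3.17)."""
    return (T(a, s, u + 1) * T(a, s, u - 1)
            - T(a, s + 1, u) * T(a, s - 1, u)
            - T(a + 1, s, u) * T(a - 1, s, u))
```

For each $a$ the vanishing of the defect is the Jacobi identity (2.20) applied to the $(a+1)\times(a+1)$ matrix that defines $T^{a+1}_s(u)$, so the check passes for any initial data.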

This representation determines an evolution in $a$ from the initial values at $a = 1$. The size of the determinant grows with $a$. A similar formula exists for the evolution in $s$:

\[ T^a_s(u) = \det\bigl(T^{a+i-j}_1(u + i + j - s - 1)\bigr), \qquad i, j = 1, \ldots, s, \qquad T^a_0(u) = 1. \tag{2.24} \]

The size of this determinant grows with $s$. Determinant formulas of this type have been known in the literature on quantum integrable models (see e.g. [6]). They allow one to express $T^a_s(u)$ through $T^a_1(u)$ or $T^1_s(u)$.

A different kind of determinant representation follows from (2.21):

\[ T^a_s(u) = \det M_{ij}, \qquad M_{ji} = \begin{cases} h_i(u + s + a + 2j) & \text{if } j = 1, \ldots, k - a;\ i = 1, \ldots, k, \\ \bar h_i(u - s + a + 2j) & \text{if } j = k - a + 1, \ldots, k;\ i = 1, \ldots, k, \end{cases} \tag{2.25} \]

where $h_i(x)$ and $\bar h_i(x)$ are $2k$ arbitrary functions of one variable. The size of this determinant is equal to $k$ for all $0 \le a \le k$. This determinant formula plays an essential role in what follows.

The determinant representations give a solution to discrete nonlinear equations and expose the essence of the integrability. Let us note that they are simpler and more convenient than their continuous counterparts.

2.3. Examples of difference and continuous $A_1$-type equations. For illustrative purposes we specialize the Hirota equation to the $A_1$-case and later study it separately. At $k = 2$ Eq. (1.7) is

\[ T_s(u + 1)T_s(u - 1) - T_{s+1}(u)T_{s-1}(u) = \phi(u + s)\phi(u - s - 2) \tag{2.26} \]

with the condition $T_{-1}(u) = 0$ (here we set $T_s(u) \equiv T^1_s(u)$). This equation is known as a discrete version of the Liouville equation [22] written in terms of the τ-function. It can be recast to a somewhat more universal form in terms of the discrete Liouville field

\[ Y^1_s(u) \equiv Y_s(u) = \frac{T_{s+1}(u)T_{s-1}(u)}{\phi(u + s)\phi(u - s - 2)} \tag{2.27} \]

(see (2.6)), which hides the function $\phi(u)$ in the r.h.s. of (2.26). The equation becomes

\[ Y_s(u - 1)Y_s(u + 1) = (Y_{s+1}(u) + 1)(Y_{s-1}(u) + 1). \tag{2.28} \]
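The passage from (2.26) to (2.28) can be illustrated numerically. In the sketch below the functions $\phi(u) = u^2 + 1$ and $T_1(u) = u^2 + 5$ are hypothetical choices made only for this example. $T_0(u) = \phi(u-1)$ is the value compatible with (2.26) at $s = 0$ when $T_{-1}(u) = 0$, and the higher $T_s$ are generated by solving (2.26) for $T_{s+1}$. All arithmetic is exact.

```python
from fractions import Fraction
from functools import lru_cache

def phi(u):
    """Hypothetical choice of phi; any function without zeros on the lattice would do."""
    return Fraction(u * u + 1)

@lru_cache(maxsize=None)
def T(s, u):
    """T_s(u) generated from (2.26) with T_{-1} = 0, T_0(u) = phi(u - 1) and a free T_1."""
    if s == -1:
        return Fraction(0)
    if s == 0:
        return phi(u - 1)
    if s == 1:
        return Fraction(u * u + 5)          # free initial data (hypothetical choice)
    # (2.26) written at level s - 1 and solved for T_s(u)
    return (T(s - 1, u + 1) * T(s - 1, u - 1)
            - phi(u + s - 1) * phi(u - s - 1)) / T(s - 2, u)

def Y(s, u):
    """Discrete Liouville field (2.27)."""
    return T(s + 1, u) * T(s - 1, u) / (phi(u + s) * phi(u - s - 2))

def liouville_defect(s, u):
    """Left side minus right side of (2.28)."""
    return Y(s, u - 1) * Y(s, u + 1) - (Y(s + 1, u) + 1) * (Y(s - 1, u) + 1)
```

The defect vanishes because (2.26) gives $Y_s(u) + 1 = T_s(u+1)T_s(u-1)/(\phi(u+s)\phi(u-s-2))$, after which both sides of (2.28) reduce to the same product of four $T$'s over four $\phi$'s.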

Quantum Integrable Models and Discrete Classical Hirota Equations


(Let us note that the same functional equation but with different analytic properties of the solutions appears in the thermodynamic Bethe ansatz [53, 46].) In the continuum limit one should put $Y_s(u) = \delta^{-2}\exp(-\varphi(x, t))$, $u = \delta^{-1}x$, $s = \delta^{-1}t$. An expansion in $\delta \to 0$ then gives the continuous Liouville equation

\[ \partial_s^2\varphi - \partial_u^2\varphi = 2\exp(\varphi). \tag{2.29} \]

To stress the specifics of the b.c. (2.15) and for further reference let us compare it with the quasiperiodic b.c. Then the $A_1$-case corresponds to the discrete sine-Gordon (SG) equation [21]:

\[ T^{a+1}_s(u) = e^{\alpha}\lambda^{2a}\,T^{a-1}_s(u - 2), \tag{2.30} \]

where $\alpha$ and $\lambda$ are parameters. Substituting this condition into (1.7), we get:

\[ T^1_s(u + 1)T^1_s(u - 1) - T^1_{s+1}(u)T^1_{s-1}(u) = e^{\alpha}\lambda^2\,T^0_s(u)T^0_s(u - 2), \tag{2.31} \]

\[ T^0_s(u + 1)T^0_s(u - 1) - T^0_{s+1}(u)T^0_{s-1}(u) = e^{-\alpha}\,T^1_s(u)T^1_s(u + 2). \tag{2.32} \]

Let us introduce two fields $\rho^{s,u}$ and $\varphi^{s,u}$ on the square $(s, u)$ lattice,

\[ T^0_s(u) = \exp(\rho^{s,u} + \varphi^{s,u}), \tag{2.33} \]

\[ T^1_s(u + 1) = \lambda^{1/2}\exp(\rho^{s,u} - \varphi^{s,u}), \tag{2.34} \]

and substitute them into (2.31), (2.32). Finally, eliminating $\rho^{s,u}$, one gets the discrete SG equation:

\[ \sinh(\varphi^{s+1,u} + \varphi^{s-1,u} - \varphi^{s,u+1} - \varphi^{s,u-1}) = \lambda\sinh(\varphi^{s+1,u} + \varphi^{s-1,u} + \varphi^{s,u+1} + \varphi^{s,u-1} + \alpha). \tag{2.35} \]

The constant $\alpha$ can be removed by the redefinition $\varphi^{s,u} \to \varphi^{s,u} - \frac{1}{4}\alpha$.

Another useful form of the discrete SG equation appears in variables $X^a_s(u)$ (2.10). Under condition (2.30) one has

\[ X^{a+1}_s(u) = X^{a-1}_s(u - 2), \qquad \lambda^2 X^{a+1}_s(u + 1)X^a_s(u) = 1, \tag{2.36} \]

so there is only one independent function

\[ X^1_s(u) \equiv x_s(u) = -e^{-\alpha}\lambda^{-1}\exp\bigl(-2\varphi^{s,u} - 2\varphi^{s,u-2}\bigr). \tag{2.37} \]

The discrete SG equation becomes [21, 14, 9]:

\[ x_{s+1}(u)\,x_{s-1}(u) = \frac{(\lambda + x_s(u + 1))(\lambda + x_s(u - 1))}{(1 + \lambda x_s(u + 1))(1 + \lambda x_s(u - 1))}. \tag{2.38} \]

In the limit $\lambda \to 0$ Eq. (2.38) turns into the discrete Liouville equation (2.28) for $Y_s(u) = -1 - \lambda^{-1}x_s(u)$.


3. Linear Problems and Bäcklund Transformations

3.1. Zero curvature condition. Consider the square lattice in two light cone variables $l$ and $m$ and a vector function $\psi_a(l, m)$ on this lattice. Let $L_{a,a'}(l, m)$ and $M_{a,a'}(l, m)$ be two shift operators in directions $l$ and $m$:

\[ \sum_{a'} L_{a,a'}(l, m)\,\psi_{a'}(l + 1, m) = \psi_a(l, m), \qquad \sum_{a'} M_{a,a'}(l, m)\,\psi_{a'}(l, m + 1) = \psi_a(l, m). \tag{3.1} \]

The zero curvature condition states that the result of subsequent shifts from an initial point to a fixed final point does not depend on the path: L(l, m) · M (l + 1, m) = M (l, m) · L(l, m + 1).

(3.2)

HBDE (1.7) possesses [20, 49] a zero-curvature representation by means of the following two-diagonal infinite matrices:

\[ L_{a,a'} = \delta_{a,a'-1} + \delta_{a,a'}V^a_l, \qquad M_{a,a'} = \delta_{a,a'} + \delta_{a,a'+1}W^a_m, \tag{3.3} \]

where

\[ V^a_l = \frac{\tau_a(l + 1, m)\,\tau_{a+1}(l, m)}{\tau_a(l, m)\,\tau_{a+1}(l + 1, m)}, \qquad W^a_m = \frac{\tau_{a-1}(l, m + 1)\,\tau_{a+1}(l, m)}{\tau_a(l, m)\,\tau_a(l, m + 1)}. \tag{3.4} \]

More precisely, the compatibility condition of the two linear problems

\[ \begin{aligned} \psi_a(l, m) - \psi_{a+1}(l + 1, m) &= V^a_l\,\psi_a(l + 1, m), \\ \psi_a(l, m) - \psi_a(l, m + 1) &= W^a_m\,\psi_{a-1}(l, m + 1), \end{aligned} \tag{3.5} \]

combined with the b.c. (2.14) yields HBDE (1.8). Introducing an unnormalized “wave function”

\[ f_a(l, m) = \psi_a(l, m)\,\tau_a(l, m), \tag{3.6} \]

we can write the linear problems in the form

\[ \begin{aligned} \tau_{a+1}(l + 1, m)f_a(l, m) - \tau_{a+1}(l, m)f_a(l + 1, m) &= \tau_a(l, m)f_{a+1}(l + 1, m), \\ \tau_a(l, m + 1)f_a(l, m) - \tau_a(l, m)f_a(l, m + 1) &= \tau_{a+1}(l, m)f_{a-1}(l, m + 1), \end{aligned} \tag{3.7} \]

or in “direct” variables

\[ \begin{aligned} T^{a+1}_{s+1}(u)F^a(s, u) - T^{a+1}_s(u - 1)F^a(s + 1, u + 1) &= T^a_s(u)F^{a+1}(s + 1, u), \\ T^a_{s+1}(u - 1)F^a(s, u) - T^a_s(u)F^a(s + 1, u - 1) &= T^{a+1}_s(u - 1)F^{a-1}(s + 1, u), \end{aligned} \tag{3.8} \]

where F a (l + m, l − m − a) ≡ fa (l, m). An advantage of the light cone coordinates is that they are separated in the linear problems (there are shifts only of l (m) in the first (second) Eq. (3.7)).


The wave function and potential possess a redundant gauge freedom:

\[ V^a_l \to \frac{\chi(a - l + 1)}{\chi(a - l)}\,V^a_l, \qquad W^a_m \to \frac{\chi(a - l)}{\chi(a - l - 1)}\,W^a_m, \qquad \psi_a(l, m) \to \chi(a - l + 1)\,\psi_a \tag{3.9} \]

with an arbitrary function $\chi$. The b.c. (2.4) implies a similar condition for the object of the linear problems

\[ F^a(s, u) = 0 \quad \text{as } a < 0 \text{ and } a > k - 1, \tag{3.10} \]

so that the number of functions $F$ is one less than the number of $T$’s. Then from the second equation of the pair (3.8) at $a = 0$ and from the first one at $a = k - 1$ it follows that $F^0(s, u)$ ($F^{k-1}(s, u)$) depends on one cone variable $u + s$ (resp., $u - s$). We introduce a special notation for them:

\[ F^0(s, u) = Q_{k-1}(u + s), \qquad F^{k-1}(s, u) = \bar Q_{k-1}(u - s). \tag{3.11} \]

Furthermore, it can be shown that the important condition (2.15) relates the functions $Q$ and $\bar Q$:

\[ \bar Q_{k-1}(u) = Q_{k-1}(u - k + 1). \tag{3.12} \]

The special form of the functions $F^a$ at the ends of the Dynkin graph ($a = 0, k - 1$) reflects the specifics of the “Liouville-type” boundary conditions. This is to be compared with nonlinear equations with the quasiperiodic boundary condition (2.30): in this case all the functions $F$ depend on two variables and obey the quasiperiodic b.c.

3.2. Continuum limit. In the continuum limit $l = -\delta t_+$, $m = -\delta t_-$, $\tau_a \to \delta^{a^2}\tau_a$, $f_a \to \delta^{a^2+a}f_a$, $\delta \to 0$, we recover the auxiliary linear problems for the 2D Toda lattice [52] ($\partial_\pm \equiv \partial/\partial t_\pm$):

\[ \begin{aligned} \partial_+\psi_a &= \psi_{a+1} + \partial_+\Bigl(\log\frac{\tau_{a+1}}{\tau_a}\Bigr)\psi_a, \\ \partial_-\psi_a &= \frac{\tau_{a+1}\tau_{a-1}}{\tau_a^2}\,\psi_{a-1}, \end{aligned} \tag{3.13} \]

or, in terms of $f_a$,

\[ \begin{aligned} \tau_{a+1}\partial_+f_a - (\partial_+\tau_{a+1})f_a &= \tau_a f_{a+1}, \\ \tau_a\partial_-f_a - (\partial_-\tau_a)f_a &= \tau_{a+1}f_{a-1}. \end{aligned} \tag{3.14} \]

The compatibility condition of these equations yields the first non-trivial equation of the 2D Toda lattice hierarchy:

\[ \partial_+\tau_a\,\partial_-\tau_a - \tau_a\,\partial_+\partial_-\tau_a = \tau_{a+1}\tau_{a-1}. \tag{3.15} \]

In terms of

\[ \varphi_a(t_+, t_-) = \log\frac{\tau_{a+1}(t_+, t_-)}{\tau_a(t_+, t_-)} \]

it has the familiar form

\[ \partial_+\partial_-\varphi_a = e^{\varphi_a - \varphi_{a-1}} - e^{\varphi_{a+1} - \varphi_a}. \tag{3.16} \]

3.3. Bäcklund flow. The discrete nonlinear equation has a remarkable duality between “potentials” $T^a$ and “wave functions” $F^a$ first noticed in [49]. In the continuum version


it is not so transparent. Equations (3.8) are symmetric under the interchange of $F$ and $T$. Then one may treat (3.8) as linear problems for a nonlinear equation on $F$’s. It is not surprising that one again obtains HBDE (1.7):

\[ F^a(s, u + 1)F^a(s, u - 1) - F^a(s + 1, u)F^a(s - 1, u) = F^{a+1}(s, u)F^{a-1}(s, u). \tag{3.17} \]

Moreover, conditions (3.10)-(3.12) mean that even the b.c. for $F^a(s, u)$ are the same as for $T^a_s(u)$ under a substitution of $\phi(u)$ by $Q_{k-1}(u)$. The only change is a reduction of the Dynkin graph: $k \to k - 1$. Using this property, one can successively reduce the $A_{k-1}$-problem up to $A_1$. Below we use this trick to derive $A_{k-1}$ (“nested”) Bethe ansatz equations.

To elaborate the chain of these transformations, let us introduce a new variable $t = 0, 1, \ldots, k$ to mark a level of the flow $A_{k-1} \to A_1$ and let $F^a_{t+1}(s, u)$ be a solution to the linear problem at the $(k - t)$th level. In this notation, $F^a_k(s, u) = T^a_s(u)$ and $F^a_{k-1}(s, u) = F^a(s, u)$ is the corresponding wave function. The wave function itself obeys the nonlinear equation (3.17), so $F^a_{k-2}(s, u)$ denotes its wave function and so on. For each level $t$ the function $F^a_t(s, u)$ obeys HBDE of the form (3.17) with the b.c.

\[ F^a_t(s, u) = 0 \quad \text{as } a < 0 \text{ and } a > t. \tag{3.18} \]

As a consequence of (3.18), the first and the last components of the vector $F^a_t(s, u)$ obey the discrete Laplace equation (2.3) and under the condition (3.11) are functions of only one of the light-cone variables ($u + s$ and $u - s$ respectively). We denote them as follows:

\[ F^0_t(s, u) \equiv Q_t(u + s), \qquad F^t_t(s, u) \equiv \bar Q_t(u - s), \tag{3.19} \]

where it is implied that $Q_k(u) = \phi(u)$. It can be shown that ellipticity requirement (ii) and condition (2.14) impose the relation $\bar Q_t(u) = Q_t(u - t)$.

In this notation the linear problems (3.8) at level $t$,

\[ F^{a+1}_{t+1}(s + 1, u)F^a_t(s, u) - F^{a+1}_{t+1}(s, u - 1)F^a_t(s + 1, u + 1) = F^a_{t+1}(s, u)F^{a+1}_t(s + 1, u), \tag{3.20} \]

\[ F^a_{t+1}(s + 1, u - 1)F^a_t(s, u) - F^a_{t+1}(s, u)F^a_t(s + 1, u - 1) = F^{a+1}_{t+1}(s, u - 1)F^{a-1}_t(s + 1, u), \tag{3.21} \]

look like bilinear equations for a function of 4 variables. However, Eq. (3.20) (resp., Eq. (3.21)) leaves the hyperplane $u - s + a = \mathrm{const}$ (resp., $u + s + a = \mathrm{const}$) invariant, and actually depends on three variables. Restricting the variables in Eq. (3.20) to the hyperplane $u - s + a = v$ (where $v$ is a constant), by setting

\[ \tau_u(t, a) \equiv F^a_{k-t}(u + a - v, u), \tag{3.22} \]

we reduce Eq. (3.20) to the form of the same HBDE (1.8) in cone coordinates $t$ and $a$. The b.c. is

\[ \tau_u(t, 0) = Q_{k-t}(2u - v), \qquad \tau_u(t, k - t) = \bar Q_{k-t}(v + t - k) = \mathrm{const}. \tag{3.23} \]

Similar equations can be obtained from the second linear problem (3.21) by setting

\[ \bar\tau_u(b, t) = F^{k-t-b}_{k-t}(\bar v + b - u, u + t - k) \tag{3.24} \]

($\bar v$ is a constant). This function obeys Eq. (1.8),

\[ \bar\tau_u(b + 1, t)\bar\tau_u(b, t + 1) - \bar\tau_u(b, t)\bar\tau_u(b + 1, t + 1) = \bar\tau_{u+1}(b + 1, t)\bar\tau_{u-1}(b, t + 1), \tag{3.25} \]

where $t$ now plays the role of the light cone coordinate $m$. The b.c. is

\[ \bar\tau_u(0, t) = \bar Q_{k-t}(2u + t - k - \bar v), \qquad \bar\tau_u(k - t, t) = Q_{k-t}(\bar v) = \mathrm{const}. \tag{3.26} \]

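It is convenient to visualize this array of τ-functions on a diagram; here is an example for the $A_3$-case ($k = 4$):

\[
\begin{array}{ccccccccccccc}
 & & & & 0 & & 1 & & 0 & & & & \\
 & & & 0 & & Q_1(u+s) & & \bar Q_1(u-s) & & 0 & & & \\
 & & 0 & & Q_2(u+s) & & F^1_2(s,u) & & \bar Q_2(u-s) & & 0 & & \\
 & 0 & & Q_3(u+s) & & F^1_3(s,u) & & F^2_3(s,u) & & \bar Q_3(u-s) & & 0 & \\
0 & & \phi(u+s) & & T^1_s(u) & & T^2_s(u) & & T^3_s(u) & & \bar\phi(u-s) & & 0
\end{array}
\tag{3.27}
\]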

Functions in each horizontal (constant $t$) slice satisfy HBDE (3.17), whereas functions on the $u - s + a = \mathrm{const}$ slice satisfy HBDE (1.8) with $t$, $a$ being light cone variables $l$, $m$ respectively.

A general solution of the bilinear discrete equation (1.7) with the b.c. (2.14) is determined by $2k$ arbitrary functions of one variable $Q_t(u)$ and $\bar Q_t(u)$, $t = 1, \ldots, k$. The additional requirement (ii) of ellipticity determines these functions through the Bethe ansatz.

3.4. Nested Bethe ansatz scheme. Here we elaborate the nested scheme of solving HBDE based on the chain of successive Bäcklund transformations (Sect. 3.3). This is an alternative (and actually the shortest) way to obtain nested Bethe ansatz equations (3.31). Recall that the function $\tau_u(t, a) = F^a_{k-t}(u + a, u)$ (3.22) (where we put $v = 0$ for simplicity) obeys HBDE in light cone variables:

\[ \tau_u(t + 1, a)\tau_u(t, a + 1) - \tau_u(t, a)\tau_u(t + 1, a + 1) = \tau_{u+1}(t + 1, a)\tau_{u-1}(t, a + 1). \tag{3.28} \]

Since $\tau_u(t, 0) = Q_{k-t}(2u)$, nested Bethe ansatz equations can be understood as “equations of motion” for zeros of $Q_t(u)$ in discrete time $t$ (level of the Bethe ansatz). The simplest way to derive them is to consider the auxiliary linear problems for Eq. (3.28). Here we present an example of this derivation in the simplest possible form.

Let us assume that $Q_t(u)$ has the form

\[ Q_t(u) = e^{\nu_t\eta u}\prod_{j=1}^{M_t}\sigma\bigl(\eta(u - u^t_j)\bigr) \tag{3.29} \]

(note that we allow the number of roots Mt to depend on t). Since we are interested in dynamics in t at a fixed a, it is sufficient to consider only the first linear equation of the pair (3.7): τu+1 (t + 1, a)fu (t, a) − τu+1 (t, a)fu (t + 1, a) = τu (t, a)fu+1 (t + 1, a) .

(3.30)

An elementary way to derive equations of motion for roots of τu (t, 0) is to put u equal to the roots of fu (t + 1, 0), fu (t, 0) and fu+1 (t + 1, 0), so that only two terms in (3.30) would survive. Combining relations obtained in this way, one can eliminate f ’s and obtain the system of equations


\[ \frac{Q_{t-1}(u^t_j + 2)\,Q_t(u^t_j - 2)\,Q_{t+1}(u^t_j)}{Q_{t-1}(u^t_j)\,Q_t(u^t_j + 2)\,Q_{t+1}(u^t_j - 2)} = -1 \tag{3.31} \]

as the necessary conditions for solutions of the form (3.29) to exist. In the more detailed notation they look as follows:

\[ \prod_{k=1}^{M_{t-1}}\frac{\sigma(\eta(u^t_j - u^{t-1}_k + 2))}{\sigma(\eta(u^t_j - u^{t-1}_k))}\;\prod_{k=1}^{M_t}\frac{\sigma(\eta(u^t_j - u^t_k - 2))}{\sigma(\eta(u^t_j - u^t_k + 2))}\;\prod_{k=1}^{M_{t+1}}\frac{\sigma(\eta(u^t_j - u^{t+1}_k))}{\sigma(\eta(u^t_j - u^{t+1}_k - 2))} = -e^{2\eta(2\nu_t - \nu_{t+1} - \nu_{t-1})}. \tag{3.32} \]

With the “boundary conditions”

\[ Q_0(u) = 1, \qquad Q_k(u) = \phi(u), \tag{3.33} \]

this system of M1 + M2 + . . . + Mk−1 equations is equivalent to the nested Bethe ansatz equations for Ak−1 -type quantum integrable models with Belavin’s elliptic R-matrix. The same equations can be obtained for the right edge of the diagram (3.27) from the second linear equation in (3.7). In Sect. 5 we explicitly identify our Q’s with similar objects known from the Bethe ansatz solution. Let us remark that the origin of Eq. (3.32) suggests to consider them as equations of motion for the elliptic Ruijsenaars-Schneider model in discrete time. Taking the continuum limit in t (provided Mt = M does not depend on t), one can check that Eqs. (3.32) do yield the equations of motion for the elliptic RS model [48] with M particles. The additional limiting procedure η → 0 with finite ηuj = xj yields the well known equations of motion for the elliptic Calogero-Moser system of particles. However, integrable systems of particles in discrete time seem to have a richer structure than their continuous time counterparts. In particular, the total number of particles in the system may depend on (discrete) time. Such a phenomenon is possible in continuous time models only for singular solutions, when particles can move to infinity or merge to another within a finite period of time. Remarkably, this appears to be the case for the solutions to Eqs. (3.32) corresponding to eigenstates of the quantum model. It is known that the number of excitations Mt at the tth level of the Bethe ansatz solution does depend on t. In other words, the number of “particles" in the corresponding discrete time RS model is not conserved, though the numbers Mt may not be arbitrary. In the elliptic case degrees of the elliptic polynomials Qt (u) are equal to Mt = (N/k)t (provided η is incommensurable with the lattice spanned by ω1 , ω2 and N is divisible by k). This fact follows directly from Bethe equations (3.31). 
Indeed, the elliptic polynomial form (3.29) implies that if $u^t_j$ is a zero of $Q_t(u)$, i.e., $Q_t(u^t_j) = 0$, then $u^t_j + 2n_1\omega_1 + 2n_2\omega_2$ for all integers $n_1, n_2$ are its zeros too. Taking into account the well known monodromy properties of the σ-function, one concludes that this is possible if and only if

\[ M_{t+1} + M_{t-1} = 2M_t, \tag{3.34} \]

which has a unique solution

\[ M_t = \frac{N}{k}\,t \tag{3.35} \]

satisfying b.c. (3.33). This means that the nested scheme for elliptic $A_{k-1}$-type models is consistent only if $N$ is divisible by $k$.
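The counting argument behind (3.34), (3.35) is simple enough to spell out in a few lines. The sketch below is only an illustration, and the function names are introduced here. It reads the b.c. (3.33) as $M_0 = 0$ (since $Q_0 = 1$) and $M_k = N$ (since $Q_k = \phi$ has degree $N$), solves the recurrence, and tests whether all $M_t$ are integers.

```python
from fractions import Fraction

def root_counts(N, k):
    """Solve M_{t+1} + M_{t-1} = 2 M_t with M_0 = 0 (Q_0 = 1) and M_k = N (Q_k = phi)."""
    # The recurrence forces constant increments, so M_t = t * M_1, and M_k = N fixes M_1 = N / k.
    step = Fraction(N, k)
    return [t * step for t in range(k + 1)]

def is_consistent(N, k):
    """The numbers of roots M_t must be integers, which happens exactly when k divides N."""
    return all(m.denominator == 1 for m in root_counts(N, k))
```

For instance, $N = 6$, $k = 3$ gives the admissible sequence $0, 2, 4, 6$, while $N = 5$, $k = 3$ would require $M_1 = 5/3$.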


In trigonometric and rational cases the conditions on degrees of $Q_t$’s become less restrictive since some of the roots can be located at infinity. The equality in (3.35) becomes an inequality: $M_t \le (N/k)t$. A more detailed analysis [28] shows that the following inequalities also hold: $2M_1 \le M_2$, $2M_2 \le M_1 + M_3$, $\ldots$, $2M_t \le M_{t-1} + M_{t+1}$, $\ldots$, $N = M_k \ge 2M_{k-1} - M_{k-2}$.

4. The $A_1$-Case: Discrete Liouville Equation

In this section we consider the $A_1$-case separately. Although in this case the general nested scheme is missing, the construction is more explicit and contains familiar objects from the Bethe ansatz literature.

4.1. General solution. Let us consider a more general functional relation:

\[ T_s(u + 1)T_s(u - 1) - T_{s+1}(u)T_{s-1}(u) = \phi(u + s)\bar\phi(u - s), \tag{4.1} \]

where the functions $\phi$, $\bar\phi$ are independent and $T_s(u) \equiv T^1_s(u)$. The auxiliary linear problems (3.8) acquire the form

\[ T_{s+1}(u)Q(u + s) - T_s(u - 1)Q(u + s + 2) = \phi(u + s)\bar Q(u - s - 1), \tag{4.2} \]

\[ T_{s+1}(u)\bar Q(u - s + 1) - T_s(u + 1)\bar Q(u - s - 1) = \bar\phi(u - s)Q(u + s + 2). \tag{4.3} \]

Here we set $Q(u) \equiv Q_1(u)$ and $\phi(u) = Q_2(u)$. Rearranging these equations, we obtain

\[ \phi(u - 2)Q(u + 2) + \phi(u)Q(u - 2) = A(u)Q(u), \tag{4.4} \]

\[ \bar\phi(u)\bar Q(u + 3) + \bar\phi(u + 2)\bar Q(u - 1) = \bar A(u)\bar Q(u + 1) \tag{4.5} \]

with the constraint

\[ T_1(u)Q(u) - T_0(u - 1)Q(u + 2) = \phi(u)\bar Q(u - 1), \tag{4.6} \]

which follows from Eq. (4.2) at $s = 0$. In these equations,

\[ A(u) = \frac{\phi(u - 2)T_{s+1}(u - s) + \phi(u)T_{s-1}(u - s - 2)}{T_s(u - s - 1)}, \tag{4.7} \]

\[ \bar A(u) = \frac{\bar\phi(u + 2)T_{s+1}(u + s) + \bar\phi(u)T_{s-1}(u + s + 2)}{T_s(u + s + 1)}. \tag{4.8} \]

Due to consistency condition (4.1) $A(u)$ and $\bar A(u)$ are functions of one variable and do not depend on $s$. The symmetry between $u$ and $s$ allows one to construct similar objects which in turn do not depend on $u$. Functions $A(u)$ and $\bar A(u)$ in the r.h.s. of (4.4), (4.5) are the conservation laws of the $s$-dynamics.

Let us note that the connection between $\phi$ and $\bar\phi$, $\bar\phi(u) = \phi(u - 2)$, and its consequence $T_{-1}(u) = 0$ (see (2.15)), simplifies Eqs. (4.4)-(4.8). Putting $s = 0$ and using the b.c. $T_{-1}(u) = 0$, we find

\[ A(u) = \bar A(u) = T_1(u). \tag{4.9} \]

Therefore, the following holds:

\[ T_s(u - 1)T_1(u + s) = \phi(u + s - 2)T_{s+1}(u) + \phi(u + s)T_{s-1}(u - 2), \tag{4.10} \]


\[ T_s(u + 1)T_1(u - s) = \phi(u - s)T_{s+1}(u) + \phi(u - s - 2)T_{s-1}(u + 2), \tag{4.11} \]

\[ \phi(u - 2)Q(u + 2) + \phi(u)Q(u - 2) = T_1(u)Q(u). \tag{4.12} \]

The first two equalities are known as fusion relations [35, 29, 5] while Eq. (4.12) is Baxter’s $T$-$Q$-relation [4, 3]. So Baxter’s $Q$ function and the $T$-$Q$-relation naturally appear in the context of the auxiliary linear problems for HBDE.

A general solution of the discrete Liouville equation (for arbitrary $\phi$ and $\bar\phi$) may be expressed through two independent functions $Q(u)$ and $\bar Q(u)$. One may follow the same lines developed for solving the continuous classical Liouville equation (see e.g. [17, 27] and references therein). Let us consider Eq. (4.4) (resp., (4.5)) as a second order linear difference equation, where the function $A(u)$ ($\bar A(u)$) is determined from the initial data. Let $R(u)$ (resp., $\bar R(u)$) be a second (linearly independent) solution of Eq. (4.4) (resp., (4.5)) normalized so that the wronskians are

\[ W(u) = \begin{vmatrix} R(u) & Q(u) \\ R(u + 2) & Q(u + 2) \end{vmatrix} = \phi(u), \tag{4.13} \]

\[ \bar W(u) = \begin{vmatrix} \bar R(u) & \bar Q(u) \\ \bar R(u + 2) & \bar Q(u + 2) \end{vmatrix} = \bar\phi(u + 1), \tag{4.14} \]

and the constraint similar to (4.6) is imposed:

\[ T_1(u)R(u) - T_0(u - 1)R(u + 2) = \phi(u)\bar R(u - 1). \tag{4.15} \]

Then the general solution of Eq. (4.1) is given in terms of $Q$ and $R$:

\[ T_s(u) = \begin{vmatrix} Q(u + s + 1) & R(u + s + 1) \\ \bar Q(u - s) & \bar R(u - s) \end{vmatrix}. \tag{4.16} \]

This formula is a particular case of the general determinant representation (2.25). Like in the continuous case, this expression is invariant with respect to changing the basis of linearly independent solutions with the given wronskians. The transformation of the basis vectors is described by an element of $SL(2)$. Due to relations (4.6), (4.15) $\bar Q$, $\bar R$ transform in the same way as $Q$, $R$ and the invariance of Eq. (4.16) is evident.

For any given $Q(u)$ and $\bar Q(u)$ the second solution $R(u)$ and $\bar R(u)$ (defined modulo a linear transformation $R(u) \to R(u) + \alpha Q(u)$) can be explicitly found from the first order recurrence relations (4.13), (4.14), if necessary. Let $Q(u_0)$ and $R(u_0)$ be initial values at $u = u_0$. Then, say, for even $r \ge 0$,

\[ R(u_0 + r) = Q(u_0 + r)\Biggl(-\sum_{j=1}^{r/2}\frac{\phi(u_0 + 2j - 2)}{Q(u_0 + 2j)\,Q(u_0 + 2j - 2)} + \frac{R(u_0)}{Q(u_0)}\Biggr), \tag{4.17} \]

and so on for other $r$’s and $\bar R(u)$.

Finally, one can express the solution to Eq. (4.1) through two independent functions $Q(u)$ and $\bar Q(u)$:

\[ T_s(u + s - 1) = Q(u + 2s)\,\bar Q(u - 1)\Biggl(\frac{T_0(u - 1)}{Q(u)\,\bar Q(u - 1)} + \sum_{j=1}^{s}\frac{\phi(u + 2j - 2)}{Q(u + 2j)\,Q(u + 2j - 2)}\Biggr), \tag{4.18} \]

where $T_0(u)$ can be found from (4.18) by putting $s = 0$:

\[ \frac{T_0(u - 1)}{Q(u)\,\bar Q(u - 1)} - \frac{T_0(u + 1)}{Q(u + 2)\,\bar Q(u + 1)} + \frac{\phi(u)}{Q(u)\,Q(u + 2)} = \frac{\bar\phi(u)}{\bar Q(u - 1)\,\bar Q(u + 1)}. \tag{4.19} \]

Note also the following useful representations:

\[ A(u) = Q(u + 2)R(u - 2) - R(u + 2)Q(u - 2), \tag{4.20} \]

\[ \bar A(u) = \bar Q(u + 3)\bar R(u - 1) - \bar R(u + 3)\bar Q(u - 1), \tag{4.21} \]

which are direct corollaries of (4.4), (4.5).

4.2. Equivalent forms of Baxter’s equation. The key ingredient of the construction is Baxter’s relation (4.12) and its “chiral” versions (4.4), (4.5). For completeness, we gather some other useful forms of them.

Consider first “chiral” linear equations (4.4), (4.5) (thus not implying any specific b.c. in $s$). Assuming that $T_s(u)$ obeys HBDE (4.1), one can represent Eqs. (4.4), (4.5) in the form

\[ \begin{vmatrix} T_s(u) & T_{s+1}(u - 1) & Q(u + s + 1) \\ T_{s+1}(u + 1) & T_{s+2}(u) & Q(u + s + 3) \\ T_{s+2}(u + 2) & T_{s+3}(u + 1) & Q(u + s + 5) \end{vmatrix} = 0, \tag{4.22} \]

\[ \begin{vmatrix} T_s(u) & T_{s+1}(u + 1) & \bar Q(u - s) \\ T_{s+1}(u - 1) & T_{s+2}(u) & \bar Q(u - s - 2) \\ T_{s+2}(u - 2) & T_{s+3}(u - 1) & \bar Q(u - s - 4) \end{vmatrix} = 0, \tag{4.23} \]

respectively. This representation can be straightforwardly extended to the $A_{k-1}$-case (see Eqs. (5.37), (5.38)). A factorized form of these difference equations is

\[ \Bigl(e^{2\partial_u} - \frac{\phi(u)\,Q(u - 2)}{\phi(u - 2)\,Q(u)}\Bigr)\Bigl(e^{2\partial_u} - \frac{Q(u)}{Q(u - 2)}\Bigr)X(u - 2) = 0, \tag{4.24} \]

\[ \Bigl(e^{2\partial_u} - \frac{\bar\phi(u + 2)\,\bar Q(u - 1)}{\bar\phi(u)\,\bar Q(u + 1)}\Bigr)\Bigl(e^{2\partial_u} - \frac{\bar Q(u + 1)}{\bar Q(u - 1)}\Bigr)\bar X(u - 1) = 0. \tag{4.25} \]

Here $e^{\partial_u}$ acts as the shift operator, $e^{\partial_u}f(u) = f(u + 1)$, and $X(u)$ ($\bar X(u)$) stands for any linear combination of $Q(u)$, $R(u)$ ($\bar Q(u)$, $\bar R(u)$).

Specifying Eqs. (4.22), (4.23) to the b.c. $T_{-1}(u) = 0$ (see (4.9)), we see that both of them turn into the equation

\[ \sum_{a=0}^{2}(-1)^a\,T^a_1(u + a - 1)\,X(u + 2a - 2) = 0, \tag{4.26} \]

that is Baxter’s relation (4.12). Furthermore, the difference operator in (4.26) admits a factorization of the form (4.24):

\[ \sum_{a=0}^{2}(-1)^a\,\frac{T^a_1(u + a - 1)}{\phi(u - 2)}\,e^{2a\partial_u} = \Bigl(e^{2\partial_u} - \frac{\phi(u)\,Q(u - 2)}{\phi(u - 2)\,Q(u)}\Bigr)\Bigl(e^{2\partial_u} - \frac{Q(u)}{Q(u - 2)}\Bigr), \tag{4.27} \]

which is equivalent to the well known formula for $T_1(u)$ in terms of $Q(u)$.

4.3. Double-Bloch solutions to Baxter’s equation. In this section we formulate the analytic properties of solutions to Baxter’s functional relation (4.4) that are relevant to models on finite lattices.

First let us transform Baxter’s relation to a difference equation with elliptic (i.e. double-periodic with periods $2\omega_1/\eta$, $2\omega_2/\eta$) coefficients. The formal substitution

\[ \tilde\Psi(u) = \frac{Q(u)P(u)}{\phi(u - 2)} \tag{4.28} \]

with an (as yet unspecified) function $P(u)$ yields

\[ \tilde\Psi(u + 2) + \frac{P(u + 2)\,\phi(u - 4)}{P(u - 2)\,\phi(u - 2)}\,\tilde\Psi(u - 2) = \frac{A(u)\,P(u + 2)}{\phi(u)\,P(u)}\,\tilde\Psi(u). \tag{4.29} \]

Below we restrict ourselves to the case when the degree $N$ of the elliptic polynomial $\phi(u)$ (1.3) is even. Then for any $P(u)$ of the form

\[ P(u) = \prod_{j=1}^{N/2}\sigma\bigl(\eta(u - p_j)\bigr) \tag{4.30} \]

with arbitrary $p_j$ the coefficients in (4.29) are elliptic functions. Indeed, for the coefficient in front of $\tilde\Psi(u - 2)$ this is obvious. As for the coefficient in the r.h.s. of (4.29), its double-periodicity follows from the “sum rule” (2.8). Let us represent $\phi(u)$ in the form

\[ \phi(u) = \phi_0(u)\,\phi_1(u), \tag{4.31} \]

where $\phi_0(u)$, $\phi_1(u)$ are elliptic polynomials of degree $N/2$ (of course for $N > 2$ there are many ways to do that). Specifying $P(u)$ as

\[ P(u) = \phi_1(u - 2), \tag{4.32} \]

we rewrite (4.29) in the form

\[ \Psi(u + 2) + \frac{\phi_0(u - 4)\,\phi_1(u)}{\phi_0(u - 2)\,\phi_1(u - 2)}\,\Psi(u - 2) = \frac{A(u)}{\phi_0(u)\,\phi_1(u - 2)}\,\Psi(u), \tag{4.33} \]

where

\[ \Psi(u) = \frac{Q(u)}{\phi_0(u - 2)}. \tag{4.34} \]

Now, the coefficients in Eq. (4.33) being double-periodic, it is natural to consider its double-Bloch solutions. A meromorphic function $f(x)$ is said to be double-Bloch if it obeys the following monodromy properties:

\[ f(x + 2\omega_\alpha) = B_\alpha f(x), \qquad \alpha = 1, 2. \tag{4.35} \]


The complex numbers $B_\alpha$ are called Bloch multipliers. It is easy to see that any double-Bloch function can be represented as a linear combination of elementary ones:

\[ f(x) = \sum_{i=1}^{M} c_i\,\Phi(x - x_i, z)\,\kappa^{x/\eta}, \tag{4.36} \]

where [33]

\[ \Phi(x, z) = \frac{\sigma(z + x + \eta)}{\sigma(z + \eta)\,\sigma(x)}\Bigl[\frac{\sigma(z - \eta)}{\sigma(z + \eta)}\Bigr]^{x/(2\eta)}, \tag{4.37} \]

and complex parameters $z$ and $\kappa$ are related by

\[ B_\alpha = \kappa^{2\omega_\alpha/\eta}\exp\bigl(2\zeta(\omega_\alpha)(z + \eta)\bigr)\Bigl(\frac{\sigma(z - \eta)}{\sigma(z + \eta)}\Bigr)^{\omega_\alpha/\eta} \tag{4.38} \]

($\zeta(x) = \sigma'(x)/\sigma(x)$ is the Weierstrass ζ-function). Considered as a function of $z$, $\Phi(x, z)$ is double-periodic: $\Phi(x, z + 2\omega_\alpha) = \Phi(x, z)$. For general values of $x$ one can define a single-valued branch of $\Phi(x, z)$ by cutting the elliptic curve between the points $z = \pm\eta$. In the fundamental domain of the lattice defined by $2\omega_\alpha$ the function $\Phi(x, z)$ has a unique pole at the point $x = 0$:

\[ \Phi(x, z) = \frac{1}{x} + O(1). \]

Coming back to the variable $u = x/\eta$, one can formulate the double-Bloch property of the function $\Psi(u)$ (4.34) in terms of its numerator $Q(u)$. It follows from (4.36) that the general form of $Q(u)$ is

\[ Q(u) = Q(u; \nu) = e^{\nu\eta u}\prod_{j=1}^{M}\sigma\bigl(\eta(u - u_j)\bigr), \tag{4.39} \]

where $M = N/2$ and $\nu$ determines Bloch multipliers.

For the trigonometric and rational degeneration of Eqs. (4.4), (4.33), (4.39) the meaning of $\nu$ is quite clear: it plays the role of the “boundary phase” for twisted b.c. in the horizontal (auxiliary) direction. For each $\nu$ Eq. (4.12) has a solution of the form (4.39). The corresponding value of $T_1(u) = A(u)$ depends on $\nu$ as a parameter: $T_1(u) = T_1(u; \nu)$. If there exist $\nu \neq \nu'$ such that $T_1(u; \nu) = T_1(u; \nu')$, one may put $Q(u) = Q(u; \nu)$, $R(u) = Q(u; \nu')$. In the elliptic case the boundary phase in general is not compatible with integrability and so $\nu$ should have a different physical sense which is still unclear.

4.4. Bethe equations. It can be shown that for double-Bloch solutions the relation between $\phi$ and $\bar\phi$, $\bar\phi(u) = \phi(u - 2)$, implies

\[ \bar Q(u) = Q(u - 1), \qquad \bar R(u) = R(u - 1), \tag{4.40} \]

so that (see (4.16))

\[ T_s(u) = \begin{vmatrix} Q(u + s + 1) & R(u + s + 1) \\ Q(u - s - 1) & R(u - s - 1) \end{vmatrix}. \tag{4.41} \]


It is clear that if $Q(u)$ and $R(u)$ are elliptic polynomials of degree $N/2$ multiplied by an exponential function (as in (4.39)), $T_s(u)$ has the desired general form (2.7). Under condition (4.40) Eq. (4.18) yields the familiar result:

\[ T_s(u) = Q(u + s + 1)\,Q(u - s - 1)\sum_{j=0}^{s}\frac{\phi(u - s + 2j - 1)}{Q(u - s + 2j + 1)\,Q(u - s + 2j - 1)}. \tag{4.42} \]
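Formula (4.42) can be tested against the relations it is meant to solve. The sketch below uses hypothetical polynomial choices of $\phi$ and $Q$ (rational case, exact arithmetic). Since this $Q$ is not a solution of the Bethe equations, the resulting $T_s(u)$ is only a rational function, but it still satisfies the Hirota equation (2.26), Baxter's relation (4.12) and the fusion relation (4.10) identically.

```python
from fractions import Fraction

def phi(u):
    """Hypothetical polynomial phi(u); any choice works for this check."""
    u = Fraction(u)
    return u ** 3 + 2 * u + 7

def Q(u):
    """Hypothetical Q(u) with non-integer roots, so it never vanishes at integer u."""
    u = Fraction(u)
    return (u - Fraction(1, 3)) * (u + Fraction(2, 5))

def T(s, u):
    """T_s(u) from (4.42); the empty sum at s = -1 gives T_{-1}(u) = 0."""
    total = sum(phi(u - s + 2 * j - 1) / (Q(u - s + 2 * j + 1) * Q(u - s + 2 * j - 1))
                for j in range(s + 1))
    return Q(u + s + 1) * Q(u - s - 1) * total

def hirota_defect(s, u):
    """Left side minus right side of (2.26)."""
    return T(s, u + 1) * T(s, u - 1) - T(s + 1, u) * T(s - 1, u) - phi(u + s) * phi(u - s - 2)

def baxter_defect(u):
    """Left side minus right side of Baxter's relation (4.12)."""
    return phi(u - 2) * Q(u + 2) + phi(u) * Q(u - 2) - T(1, u) * Q(u)

def fusion_defect(s, u):
    """Left side minus right side of the fusion relation (4.10)."""
    return (T(s, u - 1) * T(1, u + s)
            - phi(u + s - 2) * T(s + 1, u) - phi(u + s) * T(s - 1, u - 2))
```

The sum in (4.42) telescopes, since by (4.13) each term equals $R(x)/Q(x) - R(x+2)/Q(x+2)$ at $x = u - s + 2j - 1$, which is how the determinant form (4.41) is recovered.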

This formula was obtained in [29, 5] by direct solution of the fusion recurrence relations (4.10), (4.11).

Let $u_j$ and $v_j$, $j = 1, \ldots, M$, be zeros of $Q(u)$ and $R(u)$, respectively. Then, evaluating (4.13) at $u = u_j$, $u = u_j - 2$ and $u = v_j$, $u = v_j - 2$ we obtain the relations

\[ \phi(u_j) = Q(u_j + 2)R(u_j), \qquad \phi(u_j - 2) = -Q(u_j - 2)R(u_j), \tag{4.43} \]

whence it holds

\[ \frac{\phi(u_j)}{\phi(u_j - 2)} = -\frac{Q(u_j + 2)}{Q(u_j - 2)}, \tag{4.44} \]

\[ \frac{\phi(v_j)}{\phi(v_j - 2)} = -\frac{R(v_j + 2)}{R(v_j - 2)}. \tag{4.45} \]
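A minimal worked example may help to see how (4.44) removes the pole of $T_1(u)$ at a zero of $Q$. It is a sketch under stated assumptions: the rational case with $N = 2$, $M = 1$, $\nu = 0$, and the hypothetical choice $\phi(u) = (u - c)^2$ with $c = 1/2$. For $Q(u) = u - u_1$ the Bethe equation (4.44) reads $\phi(u_1) = \phi(u_1 - 2)$, with the single solution $u_1 = c + 1$, and Baxter's relation (4.12) then gives a polynomial $T_1(u)$.

```python
from fractions import Fraction

C = Fraction(1, 2)   # hypothetical inhomogeneity parameter: phi(u) = (u - C)^2, so N = 2

def phi(u):
    return (Fraction(u) - C) ** 2

def bethe_defect(u1):
    """Cross-multiplied form of (4.44) for Q(u) = u - u1: phi(u1) Q(u1 - 2) + phi(u1 - 2) Q(u1 + 2)."""
    q_minus, q_plus = Fraction(-2), Fraction(2)      # Q(u1 - 2) and Q(u1 + 2)
    return phi(u1) * q_minus + phi(u1 - 2) * q_plus

def T1(u, u1):
    """T_1(u) from Baxter's relation (4.12) with Q(u) = u - u1."""
    def Q(x):
        return Fraction(x) - u1
    return (phi(u - 2) * Q(u + 2) + phi(u) * Q(u - 2)) / Q(u)

BETHE_ROOT = C + 1   # the unique solution of the Bethe equation in this example

def T1_polynomial(u):
    """Closed form obtained by hand for u1 = C + 1: T_1 = 2 (w^2 - 2 w - 2), w = u - C."""
    w = Fraction(u) - C
    return 2 * (w * w - 2 * w - 2)
```

The Bethe defect is exactly the numerator of $T_1(u)$ evaluated at $u = u_1$, so its vanishing is the statement that the apparent pole cancels. The leading coefficient 2 of the resulting polynomial agrees with the $\nu \to 0$ limit of $A_1$ in (4.48).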

Equations (4.44) are exactly the standard Bethe equations (1.2). We refer to Eqs. (4.45) as complementary Bethe equations. It is easy to check that Eqs. (4.44) ensure cancellation of poles in (4.42). A more standard way to derive Bethe equations (4.44), (4.45) is to substitute zeros of $Q(u)$ (or $R(u)$) directly into Baxter’s relation (4.12). However, the wronskian relation (4.13) is somewhat more informative: in addition to Bethe equations for $u_j$, $v_j$ it provides the connection (4.43) between them. In the next section we derive the system of nested Bethe ansatz equations starting from a proper generalization of Eq. (4.13).

In the elliptic case degrees of the elliptic polynomials $Q(u)$, $R(u)$ (for even $N$) are equal to $M = N/2$ (provided $\eta$ is incommensurable with the lattice spanned by $\omega_1$, $\omega_2$). This fact follows directly from Bethe equations (4.44), (4.45) by the same argument as in Sect. 3.4.

In trigonometric and rational cases there are no such strong restrictions on degrees $M$ and $\tilde M$ of $Q$ and $R$ respectively. This is because a part of their zeros may tend to infinity thus reducing the degree. Whence $M$ and $\tilde M$ can be arbitrary integers not exceeding $N$. However, they must be complementary to each other: $M + \tilde M = N$. The traditional choice is $M \le N/2$. In particular, the solution $Q(u) = 1$ ($M = 0$) corresponds to the simplest reference state (“bare vacuum”) of the model.

We already pointed out that the function $Q(u)$ originally introduced by Baxter (see e.g. [4] and references therein) emerged naturally in the context of the auxiliary linear problems. Let us mention that for models with the rational $R$-matrix this function can be treated as a limiting value of $T_s(u)$ as $s \to \infty$ [35]. Rational degeneration of Eqs. (2.7), (4.39) gives

\[ T_s(u) = A_s\prod_{j=1}^{N}\bigl(u - z^{(s)}_j\bigr), \tag{4.46} \]

\[ Q(u) = e^{\nu\eta u}\prod_{j=1}^{M}(u - u_j), \tag{4.47} \]

where

\[ A_s = \frac{\sinh(2\nu\eta(s + 1))}{\sinh(2\nu\eta)}. \tag{4.48} \]

(The last expression follows from (4.42) by extracting the leading term as $u \to \infty$.) If the “boundary phase” $-i\nu\eta$ is real and $\nu \neq 0$, one has from (4.41):

\[ Q(u) = \pm 2\sinh(2\nu\eta)\,e^{\nu\eta u}\lim_{s\to\mp\infty}e^{2\nu\eta s}\,\frac{T_{\mp s-1}(u + s)}{(2s)^{N-M}}. \tag{4.49} \]

For each finite $s \ge 0$ $T_s(u)$ has $N$ zeros but in the limit some of them tend to infinity. The degenerate case $\nu = 0$ needs special analysis since the limits $\nu \to 0$ and $s \to \infty$ do not commute.

Another remark on the rational case is in order. Fusion relations (4.10), (4.11) give “Bethe ansatz like” equations for zeros of $T_s(u)$ (4.46). Substituting zeros of $T_s(u \pm 1)$ into (4.10), (4.11) and using (4.48) one finds:

\[ \frac{\sinh(2\nu\eta(s + 2))}{\sinh(2\nu\eta s)}\,\frac{\phi(z^{(s)}_j + s - 1)}{\phi(z^{(s)}_j + s + 1)} = -\prod_{k=1}^{N}\frac{z^{(s)}_j - z^{(s-1)}_k - 1}{z^{(s)}_j - z^{(s+1)}_k + 1}, \tag{4.50} \]

\[ \frac{\sinh(2\nu\eta(s + 2))}{\sinh(2\nu\eta s)}\,\frac{\phi(z^{(s)}_j - s - 1)}{\phi(z^{(s)}_j - s - 3)} = -\prod_{k=1}^{N}\frac{z^{(s)}_j - z^{(s-1)}_k + 1}{z^{(s)}_j - z^{(s+1)}_k - 1}. \tag{4.51} \]

These equations give the discrete dynamics of zeros in s. They are to be compared with dynamics of zeros of rational solutions of classical nonlinear equations [1, 32]. It is an interesting open problem to find elliptic analogues of Eqs. (4.49)-(4.51).

5. The $A_{k-1}$-Case: Discrete Time 2D Toda Lattice

5.1. General solution. The family of bilinear equations arising as a result of the Bäcklund flow (Sect. 3.3),

\[ F^a_t(s, u + 1)F^a_t(s, u - 1) - F^a_t(s + 1, u)F^a_t(s - 1, u) = F^{a+1}_t(s, u)F^{a-1}_t(s, u), \tag{5.1} \]

and the corresponding linear problems,

\[ F^{a+1}_{t+1}(s + 1, u)F^a_t(s, u) - F^{a+1}_{t+1}(s, u - 1)F^a_t(s + 1, u + 1) = F^a_{t+1}(s, u)F^{a+1}_t(s + 1, u), \tag{5.2} \]

\[ F^a_{t+1}(s + 1, u - 1)F^a_t(s, u) - F^a_{t+1}(s, u)F^a_t(s + 1, u - 1) = F^{a+1}_{t+1}(s, u - 1)F^{a-1}_t(s + 1, u), \tag{5.3} \]

subject to the b.c.

\[ F^a_t(s, u) = 0 \quad \text{as } a < 0 \text{ and } a > t, \tag{5.4} \]

may be solved simultaneously by using the determinant representation (2.25). The set of functions $F^a_t(s, u)$ entering these equations is illustrated by the following diagram:

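\[
\begin{array}{ccccccc}
0 & 1 & 0 & & & & \\
0 & F_1^0 & F_1^1 & 0 & & & \\
0 & F_2^0 & F_2^1 & F_2^2 & 0 & & \\
\cdots & \cdots & \cdots & \cdots & \cdots & & \\
0 & F_t^0 & F_t^1 & F_t^2 & \cdots & F_t^t & 0
\end{array}
\tag{5.5}
\]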

(cf. (3.27)). Functions in each horizontal slice satisfy HBDE (5.1). By level of Eq. (5.1) we understand the number $t$. Level 0 is introduced for later convenience. At the moment we do not assume any relations between solutions at different levels. Determinant formula (2.25) gives the solution to these equations for each level $t$ in terms of $t$ arbitrary holomorphic$^3$ functions $h_t^{(j)}(u+s)$ and $t$ arbitrary antiholomorphic functions $\bar h_t^{(j)}(u-s)$. This is illustrated by the diagrams:
\[
\begin{array}{cccc}
1 & & & \\
h_1^{(1)} & h_1^{(2)} & & \\
h_2^{(1)} & h_2^{(2)} & h_2^{(3)} & \\
\cdots & \cdots & \cdots & \\
h_t^{(1)} & h_t^{(2)} & \cdots & h_t^{(t+1)}
\end{array}
\qquad\qquad
\begin{array}{cccc}
 & & & 1 \\
 & & \bar h_1^{(2)} & \bar h_1^{(1)} \\
 & \bar h_2^{(3)} & \bar h_2^{(2)} & \bar h_2^{(1)} \\
 & \cdots & \cdots & \cdots \\
\bar h_t^{(t+1)} & \bar h_t^{(t)} & \cdots & \bar h_t^{(1)}
\end{array}
\tag{5.6}
\]

$^3$ Here we call holomorphic (antiholomorphic) a function of $u+s$ (resp., $u-s$).

Then, according to (2.25), the general solution to Eq. (5.1) is
\[
F_{t+1}^{a}(s,u) = \chi_t^a(u+s)\,\bar\chi_t^a(u-s)
\begin{vmatrix}
h_t^{(t+1)}(u+s-a+2) & \cdots & h_t^{(1)}(u+s-a+2) \\
h_t^{(t+1)}(u+s-a+4) & \cdots & h_t^{(1)}(u+s-a+4) \\
\cdots & \cdots & \cdots \\
h_t^{(t+1)}(u+s+a) & \cdots & h_t^{(1)}(u+s+a) \\
\bar h_t^{(t+1)}(u-s+a-t) & \cdots & \bar h_t^{(1)}(u-s+a-t) \\
\bar h_t^{(t+1)}(u-s+a-t+2) & \cdots & \bar h_t^{(1)}(u-s+a-t+2) \\
\cdots & \cdots & \cdots \\
\bar h_t^{(t+1)}(u-s-a+t) & \cdots & \bar h_t^{(1)}(u-s-a+t)
\end{vmatrix} ,
\tag{5.7}
\]

where $0 \le a \le t+1$ and the gauge functions $\chi_t^a(u)$, $\bar\chi_t^a(u)$ (introduced for normalization) satisfy the following equations:
\[ \chi_t^a(u+1)\chi_t^a(u-1) = \chi_t^{a+1}(u)\chi_t^{a-1}(u) , \qquad \bar\chi_t^a(u+1)\bar\chi_t^a(u-1) = \bar\chi_t^{a+1}(u)\bar\chi_t^{a-1}(u) \tag{5.8} \]
(cf. (2.5)). The size of the determinant is $t+1$. The first $a$ rows contain functions $h_i^{(j)}$, the remaining $t-a+1$ rows contain $\bar h_i^{(j)}$. The arguments of $h_i^{(j)}$, $\bar h_i^{(j)}$ increase by 2 going down a column. Note that the determinant in (5.7) (without the prefactors) is a solution itself. At $a = 0$ ($a = t+1$) it is an antiholomorphic (holomorphic) function. The required b.c. (3.19) can be satisfied by choosing appropriate gauge functions $\chi_t^a$, $\bar\chi_t^a$.

5.2. Canonical solution. The general solution (5.7) gives the function $T_s^a(u) \equiv F_k^a(s,u)$ in terms of $2k$ functions of one variable $h_{k-1}^{(i)}$ and $\bar h_{k-1}^{(i)}$. However, we need to represent the solution in terms of another set of $2k$ functions $Q_t(u)$ and $\bar Q_t(u)$ by virtue of conditions (5.4) in such a way that Eqs. (5.2), (5.3) connecting two adjacent levels are fulfilled. We refer to this specification as the canonical solution.

To find it let us notice that at $a = 0$ Eq. (5.2) consists of the holomorphic function $Q_t(u+s)$ and a function $F^1$. According to Eq. (5.7), $F^1$ is given by the determinant of the matrix with the holomorphic entries $h_t^{(i)}(u+s+1)$ in the first row. Other rows contain antiholomorphic functions only, so $F_t^1(u,s) = \sum_i h_t^{(i)}(u+s+1)\,\eta_i(u-s)$, where $\eta_i(u-s)$ are corresponding minors of the matrix (5.7) at $a = 1$. Substituting this into Eq. (5.2) at $a = 0$ and separating holomorphic and antiholomorphic functions one gets relations connecting $h_t^{(i)}$, $h_{t-1}^{(i)}$ and $Q_t(u)$, $Q_{t+1}(u)$. Similar arguments can be applied to Eq. (5.3) at another boundary $a = t+1$. The general proof is outlined in the appendix to this section. Here we present the result:
\[ h_t^{(1)}(u+s) = Q_t(u+s) , \qquad \bar h_t^{(1)}(u-s) = \bar Q_t(u-s) \tag{5.9} \]
and
\[ Q_{t+1}(u-2)\,h_{t-1}^{(i)}(u) = \begin{vmatrix} h_t^{(i+1)}(u-2) & Q_t(u-2) \\ h_t^{(i+1)}(u) & Q_t(u) \end{vmatrix} , \tag{5.10} \]
\[ \bar Q_{t+1}(u+1)\,\bar h_{t-1}^{(i)}(u+1) = \begin{vmatrix} \bar h_t^{(i+1)}(u) & \bar Q_t(u) \\ \bar h_t^{(i+1)}(u+2) & \bar Q_t(u+2) \end{vmatrix} , \tag{5.11} \]
where $1 \le i \le t$. Functions $\chi$, $\bar\chi$ in front of the determinant (5.7) are then fixed as follows:
\[ \chi_t^a(u) = (-1)^{at} \Bigl( \prod_{j=1}^{a-1} Q_{t+1}(u-a+2j) \Bigr)^{-1} , \quad a \ge 2 , \qquad \chi_t^0(u) = Q_{t+1}(u) , \qquad \chi_t^1(u) = (-1)^t , \tag{5.12} \]
\[ \bar\chi_t^a(u) = \Bigl( \prod_{j=1}^{t-a} \bar Q_{t+1}(u+a-t+2j-1) \Bigr)^{-1} , \quad a \le t-1 , \qquad \bar\chi_t^t(u) = 1 , \qquad \bar\chi_t^{t+1}(u) = \bar Q_{t+1}(u) . \tag{5.13} \]

It is easy to check that they do satisfy Eq. (5.8). The recursive relations (5.10), (5.11) allow one to determine functions $h_t^{(i)}$ and $\bar h_t^{(i)}$ starting from a given set of $Q_t(u)$. These formulas generalize wronskian relations (4.13), (4.14) to the $A_{k-1}$-case. Let us also note that this construction resembles the Leznov-Saveliev solution [39] to the continuous 2DTL with open boundaries.

5.3. The Bethe ansatz and canonical solution. The canonical solution of the previous section immediately leads to the nested Bethe ansatz for elliptic solutions. In this case all functions $h_t^{(i)}$, $\bar h_t^{(i)}$ are elliptic polynomials multiplied by an exponential function:
\[ h_t^{(i)}(u) = a_t^{(i)}\, e^{\nu_t^{(i)}\eta u} \prod_{j=0}^{M_t^{(i)}} \sigma\bigl(\eta(u - u_j^{t,i})\bigr) , \tag{5.14} \]
\[ \bar h_t^{(i)}(u) = \bar a_t^{(i)}\, e^{\bar\nu_t^{(i)}\eta u} \prod_{j=0}^{\bar M_t^{(i)}} \sigma\bigl(\eta(u - \bar u_j^{t,i})\bigr) . \tag{5.15} \]
This implies a number of constraints on their zeros. The determinant in (5.10) should be divisible by $Q_{t+1}(u-2)$ and $h_{t-1}^{(i)}(u)$, whence
\[ \frac{h_t^{(i+1)}(u_j^{t+1})}{h_t^{(i+1)}(u_j^{t+1}+2)} = \frac{Q_t(u_j^{t+1})}{Q_t(u_j^{t+1}+2)} , \tag{5.16} \]
\[ \frac{h_t^{(i+1)}(u_j^{t-1,i})}{h_t^{(i+1)}(u_j^{t-1,i}-2)} = \frac{Q_t(u_j^{t-1,i})}{Q_t(u_j^{t-1,i}-2)} , \tag{5.17} \]
where $u_j^t \equiv u_j^{t,1}$. Furthermore, it is possible to get a closed system of constraints for the roots of $Q_t(u)$ only. Indeed, choosing $u = u_j^t$, $u = u_j^t + 2$ in (5.10), we get
\[ Q_{t+1}(u_j^t - 2)\,Q_{t-1}(u_j^t) = -Q_t(u_j^t - 2)\,h_t^{(2)}(u_j^t) , \tag{5.18} \]
\[ Q_{t+1}(u_j^t)\,Q_{t-1}(u_j^t + 2) = Q_t(u_j^t + 2)\,h_t^{(2)}(u_j^t) . \tag{5.19} \]
Dividing Eq. (5.18) by Eq. (5.19) we obtain the system of nested Bethe equations:
\[ \frac{Q_{t-1}(u_j^t + 2)\,Q_t(u_j^t - 2)\,Q_{t+1}(u_j^t)}{Q_{t-1}(u_j^t)\,Q_t(u_j^t + 2)\,Q_{t+1}(u_j^t - 2)} = -1 , \tag{5.20} \]
which coincides with (3.31) from Sect. 3.5.

Similar relations hold true for the $\bar h$-diagram:
\[ \frac{\bar h_t^{(i+1)}(\bar u_j^{t+1} + 1)}{\bar h_t^{(i+1)}(\bar u_j^{t+1} - 1)} = \frac{\bar Q_t(\bar u_j^{t+1} + 1)}{\bar Q_t(\bar u_j^{t+1} - 1)} , \tag{5.21} \]
\[ \frac{\bar h_t^{(i+1)}(\bar u_j^{t-1,i} + 1)}{\bar h_t^{(i+1)}(\bar u_j^{t-1,i} - 1)} = \frac{\bar Q_t(\bar u_j^{t-1,i} + 1)}{\bar Q_t(\bar u_j^{t-1,i} - 1)} , \tag{5.22} \]
\[ \bar Q_{t+1}(\bar u_j^t + 1)\,\bar Q_{t-1}(\bar u_j^t + 1) = \bar Q_t(\bar u_j^t + 2)\,\bar h_t^{(2)}(\bar u_j^t) , \tag{5.23} \]

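\[ \bar Q_{t+1}(\bar u_j^t - 1)\,\bar Q_{t-1}(\bar u_j^t - 1) = -\bar Q_t(\bar u_j^t - 2)\,\bar h_t^{(2)}(\bar u_j^t) , \tag{5.24} \]
\[ \frac{\bar Q_{t-1}(\bar u_j^t + 1)\,\bar Q_t(\bar u_j^t - 2)\,\bar Q_{t+1}(\bar u_j^t + 1)}{\bar Q_{t-1}(\bar u_j^t - 1)\,\bar Q_t(\bar u_j^t + 2)\,\bar Q_{t+1}(\bar u_j^t - 1)} = -1 . \tag{5.25} \]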
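To make the structure of the nested Bethe equations (5.20) concrete, the following sketch evaluates their left-hand side in a toy rational setting. It assumes the degeneration $\sigma(\eta u) \to u$ and drops all exponential prefactors, so each $Q_t$ is a plain monic polynomial; the helper names `poly_from_roots` and `bethe_ratio` are hypothetical and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def poly_from_roots(roots):
    """Return Q(u) = prod_j (u - root_j); the empty product gives Q = 1."""
    def Q(u):
        value = Fraction(1)
        for r in roots:
            value *= (u - r)
        return value
    return Q

def bethe_ratio(Q_prev, Q_cur, Q_next, u):
    """Left-hand side of (5.20) with Q_{t-1}, Q_t, Q_{t+1} evaluated around the point u."""
    num = Q_prev(u + 2) * Q_cur(u - 2) * Q_next(u)
    den = Q_prev(u) * Q_cur(u + 2) * Q_next(u - 2)
    return num / den

one = poly_from_roots([])

# A single root with trivial neighbours: the ratio is Q_t(r - 2) / Q_t(r + 2) = -1 for any r.
single = poly_from_roots([Fraction(3)])
print(bethe_ratio(one, single, one, Fraction(3)))

# Two arbitrary roots with trivial neighbours generally do not solve the equations.
pair = poly_from_roots([Fraction(0), Fraction(5)])
print(bethe_ratio(one, pair, one, Fraction(0)))
```

The deviation of the returned value from $-1$ at each root can serve as a residual when searching for actual Bethe roots numerically.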

These conditions are sufficient to ensure that the canonical solution for $T_s^a(u)$ (i.e., for $F_k^a(s,u)$) has the required general form (2.7). To see this, take a generic $Q$-factor from the product (5.12), $(Q_{t+1}(u-a+2j))^{-1}$. It follows from (5.16) that at its poles the $j$th and $(j+1)$th rows of the determinant (5.7) become proportional. The same argument repeated for $\bar Q$-factors shows that $F_{t+1}^a(s,u)$ has no poles.

Finally, it is straightforward to see from (5.7) that the constraint $\bar Q_t(u) = Q_t(u-t)$ leads to condition (2.15) (for $-t \le s \le -1$ two rows of the determinant become equal).

To summarize, the solution goes as follows. First, one should find a solution to Bethe equations (3.31) thus getting a set of elliptic polynomials $Q_t(u)$, $t = 1, \ldots, k-1$, $Q_0(u) = 1$, $Q_k(u) = \phi(u)$ being a given function. To make the chain of equations finite, it is convenient to use the formal convention $Q_{-1}(u) = Q_{k+1}(u) = 0$. Second, one should solve step by step relations (5.10), (5.11) and find the functions $h_t^{(i)}(u)$, $\bar h_t^{(i)}(u)$. All these relations are of the same type as the wronskian relation (4.13) in the $A_1$-case: each of them is a linear inhomogeneous first order difference equation.

5.4. Conservation laws. The solution described in Sects. 5.2 and 5.3 provides compact determinant formulas for eigenvalues of quantum transfer matrices. It also provides determinant representations for conservation laws of the $s$-dynamics which generalize Eqs. (4.7), (4.8) to the $A_{k-1}$-case. The generalization comes up in the form of Eqs. (4.22), (4.23) and (4.26).

The conservation laws (i.e., integrals of the $s$-dynamics) follow from the determinant representation (5.7) of the general solution to HBDE. Let us consider $(C_k^a+1) \times (C_k^a+1)$-matrices
\[ T^a_{B,B'}(s,u) \equiv T^a_{s+B+B'}(u-s+B-B') , \qquad B, B' = 1, \ldots, C_k^a+1 , \tag{5.26} \]
\[ \bar T^a_{B,B'}(s,u) = T^a_{s-B-B'}(u+s+B-B') , \qquad B, B' = 1, \ldots, C_k^a+1 , \tag{5.27} \]
where $C_k^a$ is the binomial coefficient. Let $T^a[P|R](s,u)$ be minors of the matrix (5.26) with row $P$ and column $R$ removed (similarly for (5.27)).

Theorem 5.1. Let $T_s^a(u)$ be the general solution to HBDE given by Eq. (5.7). Then any ratio of the form
\[ A^{a,R}_{P,P'}(s,u) \equiv \frac{T^a[P|R](s,u)}{T^a[P'|R](s,u)} \tag{5.28} \]
does not depend on $s$. These quantities are integrals of the $s$-dynamics: $A^{a,R}_{P,P'}(s,u) = A^{a,R}_{P,P'}(u)$. Similarly, minors of the matrix (5.27) give in the same way a complementary set of conservation laws$^4$.

A sketch of proof is as follows. Consider the Laplace expansion of the determinant solution (5.7) with respect to the first $a$ (holomorphic) rows:
\[ T_s^a(u) = \sum_{P=1}^{C_k^a} \psi_P^a(u+s)\,\bar\psi_P^a(u-s) . \tag{5.29} \]

$^4$ Compare with (4.7), (4.8).

Here $P$ numbers (in an arbitrary order) sets of indices $(p_1, p_2, \ldots, p_a)$ such that $k \ge p_1 > p_2 > \ldots > p_a \ge 1$, $\psi_P^a(u+s)$ is the minor of the matrix in Eq. (5.7) constructed from the first $a$ rows and columns $p_1, \ldots, p_a$ (multiplied by $\chi_{k-1}^a(u+s)$), and $\bar\psi_P^a(u-s)$ is the complementary minor (multiplied by $\bar\chi_{k-1}^a(u-s)$).

Substitute the $R$th column of the matrix (5.26) by the column vector with components $\psi_P^a(u+2B)$, $B = 1, \ldots, C_k^a+1$. The matrix obtained this way (let us call it $(T^{a;R,P})_{B,B'}$) depends on $R = 1, \ldots, C_k^a+1$, $P = 1, \ldots, C_k^a$ and $a = 1, \ldots, k-1$. The "complementary" matrix $(\bar T^{a;R,P})_{B,B'}$ is defined by the similar substitution of the column vector $\bar\psi_P^a(u+2B)$, $B = 1, \ldots, C_k^a+1$, into the matrix (5.27).

Lemma 5.1. Determinants of all the four matrices introduced above vanish:
\[ \det(T^a) = \det(\bar T^a) = \det(T^{a;R,P}) = \det(\bar T^{a;R,P}) = 0 . \tag{5.30} \]
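Before turning to the proof, the rank argument behind Lemma 5.1 and Theorem 5.1 can be illustrated on a toy example. The sketch below assumes a sum of the form (5.29) with only $C = 2$ terms and with arbitrarily chosen polynomial factors (the lists `PSI`, `PSI_BAR` and all function names are hypothetical, not taken from the paper). It builds the $3 \times 3$ matrix (5.26), whose determinant should vanish, and forms a ratio of minors as in (5.28), which should not depend on $s$.

```python
from fractions import Fraction

# Hypothetical holomorphic / antiholomorphic factors; any choice works for the rank argument.
PSI = [lambda x: x * x + 1, lambda x: x ** 3 - 2 * x]
PSI_BAR = [lambda y: 2 * y + 3, lambda y: y * y - y + 5]
C = len(PSI)  # plays the role of the binomial coefficient C_k^a

def T(s, u):
    """Toy version of (5.29): sum over P of psi_P(u + s) * psi_bar_P(u - s)."""
    return sum(p(u + s) * pb(u - s) for p, pb in zip(PSI, PSI_BAR))

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(sub)
    return total

def T_matrix(s, u):
    """The (C + 1) x (C + 1) matrix (5.26): entries T_{s+B+B'}(u - s + B - B')."""
    idx = range(1, C + 2)
    return [[T(s + B + Bp, u - s + B - Bp) for Bp in idx] for B in idx]

def minor(s, u, P, R):
    """Minor of (5.26) with row P and column R removed (1-based labels)."""
    m = T_matrix(s, u)
    return det([[m[b][c] for c in range(C + 1) if c != R - 1]
                for b in range(C + 1) if b != P - 1])

def A_ratio(s, u, P, Pp, R):
    """The candidate integral (5.28) as an exact rational number."""
    return Fraction(minor(s, u, P, R), minor(s, u, Pp, R))

print(det(T_matrix(1, 0)))
print([A_ratio(s, 0, 1, 2, 3) for s in range(-2, 3)])
```

The mechanism is that the entry in row $B$, column $B'$ factorizes as $\sum_P \psi_P(u+2B)\,\bar\psi_P(u-2s-2B')$, so the matrix has rank at most $C$ and all $s$-dependence sits in the column factor, which cancels in the ratio of minors sharing the same removed column.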

The proof follows from the Laplace expansion (5.29). From this representation it is obvious that $C_k^a+1$ columns of the matrices in (5.30) are linearly dependent. This identity is valid for arbitrary functions $h_t^{(i)}(u+s)$, $\bar h_t^{(i)}(u-s)$ in Eq. (5.7). The conservation laws immediately follow from these identities. Indeed, let us rewrite the determinant of the matrix $T^{a;R,P}$ as a linear combination of entries of the $R$th column:
\[ \det(T^{a;R,P}) = \sum_{B'=1}^{C_k^a+1} (-1)^{B'+R}\, \psi_P^a(u+2B')\, T^a[B'|R](s,u) = 0 . \tag{5.31} \]
Dividing by $T^a[P'|R](s,u)$, we get, using the notation (5.28):
\[ \sum_{B'=1,\, B'\neq P'}^{C_k^a+1} (-1)^{B'}\, \psi_P^a(u+2B')\, A^{a,R}_{B',P'}(s,u) = (-1)^{P'+1}\, \psi_P^a(u+2P') . \tag{5.32} \]

The latter identity is a system of $C_k^a$ linear equations for $C_k^a$ quantities $A^{a,R}_{1,P'}(s,u)$, $A^{a,R}_{2,P'}(s,u)$, $\ldots$, $A^{a,R}_{P'-1,P'}(s,u)$, $A^{a,R}_{P'+1,P'}(s,u)$, $\ldots$, $A^{a,R}_{C_k^a+1,P'}(s,u)$. In the case of general position the wronskian of the functions $\psi_P^a(u)$ is nonzero, whence system (5.32) has a unique solution for $A^{a,R}_{P,P'}(s,u)$. The coefficients of the system do not depend on $s$. Therefore, $A^{a,R}_{P,P'}(s,u)$ are $s$-independent too. Similar arguments are applied to minors of the matrix (5.27).

Another form of Eq. (5.31) may be obtained by multiplying its l.h.s. by $\bar\psi_P^a(u-2s)$ and summing over $P$. This yields
\[ \sum_{B=1}^{C_k^a+1} (-1)^B\, T^a_{s+B}(u-s+B)\, T^a[B|R](s,u) = 0 , \tag{5.33} \]
which is a difference equation for $T_s^a(u)$ as a function of the "holomorphic" variable $u+s$ with fixed $u-s$.

5.5. Generalized Baxter's relations. Equation (5.31) can be considered as a linear difference equation for a function $\psi^a(u)$ having $C_k^a$ linearly independent solutions $\psi_P^a(u)$. It provides the $A_{k-1}$-generalization of Baxter's relations (4.4), (4.5). This generalization comes up in the form of Eqs. (4.22), (4.23) and (4.26).

The simplest cases are $a = 1$ and $a = k-1$. Then there are $k+1$ terms in the sum (5.31). Furthermore, it is obvious that
\[ \psi_i^1(u) = h_{k-1}^{(i)}(u+1) , \qquad \bar\psi_i^{k-1}(u) = \bar h_{k-1}^{(i)}(u) . \tag{5.34} \]
Then Eq. (5.31) and a similar equation for antiholomorphic parts read:
\[ \sum_{j=1}^{k+1} (-1)^j\, h_{k-1}^{(i)}(u+2j+1)\, T^1[j|k+1](s,u) = 0 , \tag{5.35} \]
\[ \sum_{j=1}^{k+1} (-1)^j\, \bar h_{k-1}^{(i)}(u+2j)\, \bar T^{k-1}[j|k+1](s,u) = 0 , \tag{5.36} \]
where we put $R = k+1$ for simplicity. These formulas may be understood as linear difference equations of order $k$. Indeed, Eq. (5.35) can be rewritten as the following equation for a function $X(u)$:
\[
\begin{vmatrix}
T^1_s(u) & T^1_{s+1}(u-1) & \ldots & T^1_{s+k-1}(u-k+1) & X(u+s+1) \\
T^1_{s+1}(u+1) & T^1_{s+2}(u) & \ldots & T^1_{s+k}(u-k+2) & X(u+s+3) \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
T^1_{s+k}(u+k) & T^1_{s+k+1}(u+k-1) & \ldots & T^1_{s+2k-1}(u+1) & X(u+s+2k+1)
\end{vmatrix} = 0 .
\tag{5.37}
\]
This equation has $k$ solutions $h_{k-1}^{(i)}(u)$, $i = 1, \ldots, k$. One of them is $Q_{k-1}(u) \equiv h_{k-1}^{(1)}(u)$ (see Eq. (5.9)). Similarly Eq. (5.36) for the antiholomorphic parts,
\[
\begin{vmatrix}
T^{k-1}_s(u) & T^{k-1}_{s-1}(u-1) & \ldots & T^{k-1}_{s-k+1}(u-k+1) & \bar X(u-s) \\
T^{k-1}_{s-1}(u+1) & T^{k-1}_{s-2}(u) & \ldots & T^{k-1}_{s-k}(u-k+2) & \bar X(u-s+2) \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
T^{k-1}_{s-k}(u+k) & T^{k-1}_{s-k-1}(u+k-1) & \ldots & T^{k-1}_{s-2k+1}(u+1) & \bar X(u-s+2k)
\end{vmatrix} = 0 ,
\tag{5.38}
\]

has $k$ solutions $\bar h_{k-1}^{(i)}(u)$, $i = 1, \ldots, k$. One of them is $\bar Q_{k-1}(u) \equiv \bar h_{k-1}^{(1)}(u)$.

Difference equations (5.37), (5.38) can be rewritten in the factorized form. This fact follows from a more general statement. Fix an arbitrary level $k$ and set $T_s^a(u) = F_k^a(s,u)$, $F^a(s,u) = F_{k-1}^a(s,u)$ (as in Sect. 3).

Proposition 5.1. For each $j = 0, 1, \ldots, k-1$ it holds:
\[ \bigl( e^{\partial_s+\partial_u} - R^{(j)}_{j+1}(s,u) \bigr) \bigl( e^{\partial_s+\partial_u} - R^{(j)}_{j}(s,u) \bigr) \ldots \bigl( e^{\partial_s+\partial_u} - R^{(j)}_{1}(s,u) \bigr) F^{k-1-j}(s,u) = 0 , \tag{5.39} \]
\[ \bigl( e^{\partial_s-\partial_u} - \bar R^{(j)}_{j+1}(s,u) \bigr) \bigl( e^{\partial_s-\partial_u} - \bar R^{(j)}_{j}(s,u) \bigr) \ldots \bigl( e^{\partial_s-\partial_u} - \bar R^{(j)}_{1}(s,u) \bigr) F^{j}(s,u) = 0 , \tag{5.40} \]

where
\[ R_i^{(k-1-j)}(s,u) = \frac{T^{j}_{s+i-1}(u+i-1)\, T^{j+i-1}_{s+i-2}(u-1)\, T^{j+i}_{s+i}(u)}{T^{j}_{s+i-2}(u+i-2)\, T^{j+i-1}_{s+i-1}(u)\, T^{j+i}_{s+i-1}(u-1)} , \tag{5.41} \]
\[ \bar R_i^{(j)}(s,u) = \frac{T^{j+1}_{s+i-1}(u-i)\, T^{j-i+1}_{s+i}(u-1)\, T^{j-i+2}_{s+i-2}(u)}{T^{j+1}_{s+i-2}(u-i+1)\, T^{j-i+1}_{s+i-1}(u)\, T^{j-i+2}_{s+i-1}(u-1)} . \tag{5.42} \]
Proof. The proof is by induction. At $j = 0$ Eq. (5.39) turns into
\[ \Bigl( e^{\partial_s+\partial_u} - \frac{T^k_{s+1}(u)}{T^k_s(u-1)} \Bigr) F^{k-1}(s,u) = 0 . \]
This means that $F^{k-1}(s,u)$ does not depend on $u+s$. Further,
\[ F^a(s+1,u) = - \frac{T^a_s(u-1)}{T^{a-1}_s(u)} \Bigl( e^{\partial_s+\partial_u} - \frac{T^a_{s+1}(u)}{T^a_s(u-1)} \Bigr) F^{a-1}(s,u) \tag{5.43} \]
(see (3.8)). The inductive step is then straightforward. The proof of (5.40) is absolutely identical.

Now, putting $j = k-1$ we get the following difference equations in one variable:
\[ \bigl( e^{2\partial_u+\partial_s} - R_k^{(k-1)}(s,u-s) \bigr) \bigl( e^{2\partial_u+\partial_s} - R_{k-1}^{(k-1)}(s,u-s) \bigr) \ldots \bigl( e^{2\partial_u+\partial_s} - R_1^{(k-1)}(s,u-s) \bigr) Q_{k-1}(u) = 0 , \tag{5.44} \]
\[ \bigl( e^{-2\partial_u+\partial_s} - \bar R_k^{(k-1)}(s,u+s) \bigr) \bigl( e^{-2\partial_u+\partial_s} - \bar R_{k-1}^{(k-1)}(s,u+s) \bigr) \ldots \bigl( e^{-2\partial_u+\partial_s} - \bar R_1^{(k-1)}(s,u+s) \bigr) \bar Q_{k-1}(u) = 0 . \tag{5.45} \]

Note that operators $e^{\pm\partial_s}$ act only on the coefficient functions in (5.44), (5.45). These equations provide a version of the discrete Miura transformation of generalized Baxter's operators, which is different from the one discussed in Ref. [15] (see also below).

Coming back to Eq. (5.31) and using relations (5.10), (5.11), one finds:
\[ \psi_k^{k-1}(u) = h_1^{(1)}(u+k-1) = Q_1(u+k-1) , \tag{5.46} \]
\[ \bar\psi_k^{1}(u) = \bar h_1^{(1)}(u) = \bar Q_1(u) \tag{5.47} \]
(for the proof see Lemma 5.2 in the appendix to this section). Then, in complete analogy with Eqs. (5.37), (5.38), one obtains from (5.31) the following difference equations:
\[
\begin{vmatrix}
T^{k-1}_s(u) & T^{k-1}_{s+1}(u-1) & \ldots & T^{k-1}_{s+k-1}(u-k+1) & X(u+s+k-1) \\
T^{k-1}_{s+1}(u+1) & T^{k-1}_{s+2}(u) & \ldots & T^{k-1}_{s+k}(u-k+2) & X(u+s+k+1) \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
T^{k-1}_{s+k}(u+k) & T^{k-1}_{s+k+1}(u+k-1) & \ldots & T^{k-1}_{s+2k-1}(u+1) & X(u+s+3k-1)
\end{vmatrix} = 0 ,
\tag{5.48}
\]
\[
\begin{vmatrix}
T^{1}_s(u) & T^{1}_{s-1}(u-1) & \ldots & T^{1}_{s-k+1}(u-k+1) & \bar X(u-s) \\
T^{1}_{s-1}(u+1) & T^{1}_{s-2}(u) & \ldots & T^{1}_{s-k}(u-k+2) & \bar X(u-s+2) \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
T^{1}_{s-k}(u+k) & T^{1}_{s-k-1}(u+k-1) & \ldots & T^{1}_{s-2k+1}(u+1) & \bar X(u-s+2k)
\end{vmatrix} = 0
\tag{5.49}
\]

to which $Q_1(u)$ (resp., $\bar Q_1(u)$) is a solution. The other $k-1$ linearly independent solutions to Eq. (5.48) (resp., (5.49)) are other algebraic complements of the last (first) line of the matrix in Eq. (5.7) at $a = k-1$ ($a = 1$) multiplied by $\chi_{k-1}^{k-1}$ ($\bar\chi_{k-1}^{1}$).

Further specification follows from imposing constraints (3.12) which ensure conditions (2.4) forced by the usual Bethe ansatz. One can see that under these conditions Eqs. (5.48) and (5.49) become the same. Further, substituting a particular value of $s$, $s = -k$, into, say, Eqs. (5.48), (5.37), one gets the following difference equations:
\[ \sum_{a=0}^{k} (-1)^a\, T_1^a(u+a-1)\, Q_1(u+2a-2) = 0 , \tag{5.50} \]
\[ \sum_{a=0}^{k} (-1)^a\, \frac{T_1^a(u-a-1)}{\phi(u-2a-2)}\, \frac{Q_{k-1}(u-2a)}{\phi(u-2a)} = 0 \tag{5.51} \]
(we remind the reader that $\phi(u) \equiv Q_k(u)$). The latter equation can be obtained directly from the determinant formula (5.7): notice that under conditions (2.4) the determinants in Eq. (5.7) become minors of the matrix $h_{k-1}^{(i)}(u-2k+2j)$, where $i$ numbers columns running from 1 to $k$, $j$ numbers lines and runs from 0 to $k$ skipping the value $k-a$. Taking care of the prefactors in Eq. (5.7) and recalling that $h_{k-1}^{(1)}(u) = Q_{k-1}(u)$, one gets Eq. (5.51). These formulas give a generalization of the Baxter equations (4.4), (4.5), (4.12).

At last, we are to identify our $Q_t$'s with $Q_t$'s from the usual nested Bethe ansatz solution. This is achieved by factorization of the difference operators in (5.50) and (5.51) in terms of $Q_t(u)$. Using the technique developed in the appendix to this section, one can prove the following factorization formulas:
\[
\sum_{a=0}^{k} (-1)^{a-k}\, \frac{T_1^a(u+a-1)}{\phi(u-2)}\, e^{2a\partial_u}
= \Bigl( e^{2\partial_u} - \frac{Q_k(u)Q_{k-1}(u-2)}{Q_k(u-2)Q_{k-1}(u)} \Bigr) \ldots \Bigl( e^{2\partial_u} - \frac{Q_2(u)Q_1(u-2)}{Q_2(u-2)Q_1(u)} \Bigr) \Bigl( e^{2\partial_u} - \frac{Q_1(u)}{Q_1(u-2)} \Bigr) ,
\tag{5.52}
\]
\[
\sum_{a=0}^{k} (-1)^{a-k}\, \frac{T_1^a(u-a-1)}{\phi(u-2a-2)}\, e^{-2a\partial_u}
= \Bigl( e^{-2\partial_u} - \frac{Q_1(u)}{Q_1(u-2)} \Bigr) \Bigl( e^{-2\partial_u} - \frac{Q_2(u)Q_1(u-2)}{Q_2(u-2)Q_1(u)} \Bigr) \ldots \Bigl( e^{-2\partial_u} - \frac{Q_k(u)Q_{k-1}(u-2)}{Q_k(u-2)Q_{k-1}(u)} \Bigr) .
\tag{5.53}
\]

Note that these operators are adjoint to each other. The l.h.s. of Eq. (5.52) or (5.53) is known as the generating function for $T_1^a(u)$; $T_s^a(u)$ for $s > 1$ can be found with the help of determinant formula (2.24). These formulas for the generating function coincide with the ones known in the literature (see e.g. [5, 38]). They yield $T_1^a(u)$ in terms of elliptic polynomials $Q_t$ with roots constrained by the nested Bethe ansatz equations which ensure cancellation of poles in $T_1^a(u)$.

5.6. Appendix to Section 5. Here we outline the proof of the result of Sect. 5. It is enough to prove that the canonical solution does satisfy Eqs. (5.2), (5.3) connecting adjacent levels. The idea is to show that they are equivalent to the elementary Plücker relation (2.21). We proceed in steps.

First step: Preliminaries. We need the determinant identity
\[
\det_{1\le m,n\le k} \begin{vmatrix} a_{m,n} & a_{m,k+1} \\ a_{m+1,n} & a_{m+1,k+1} \end{vmatrix}
= \Bigl( \prod_{j=2}^{k} a_{j,k+1} \Bigr) \det_{1\le m,n\le k+1} (a_{m,n})
\tag{5.54}
\]
valid for an arbitrary $(k+1) \times (k+1)$-matrix $a_{m,n}$, $1 \le m,n \le k+1$. It can be easily proved by induction.

Let us consider minors of the matrices $h_t^{(j)}(u+2i)$, $\bar h_t^{(j)}(u+2i)$, $1 \le i,j \le t+1$, of size $a \times a$:
\[ H_t^{(i_1,i_2,\ldots,i_a)}(u) = \det_{1\le\alpha,\beta\le a} \bigl( h_t^{(i_\beta)}(u+2\alpha-2) \bigr) \tag{5.55} \]
and the same expression for $\bar H_t$'s through $\bar h_t$'s. The following technical lemma follows directly from Eq. (5.54):

Lemma 5.2. If relations (5.9)-(5.11) hold, then
\[ \frac{H_t^{(i_1+1,i_2+1,\ldots,i_a+1,1)}(u-1)}{\prod_{j=1}^{a} Q_{t+1}(u+2j-3)} = \frac{H_{t-1}^{(i_1,i_2,\ldots,i_a)}(u+1)}{\prod_{j=1}^{a-1} Q_t(u+2j-1)} , \tag{5.56} \]
\[ \frac{\bar H_t^{(i_1+1,i_2+1,\ldots,i_a+1,1)}(u)}{\prod_{j=1}^{a} \bar Q_{t+1}(u+2j-1)} = \frac{\bar H_{t-1}^{(i_1,i_2,\ldots,i_a)}(u+1)}{\prod_{j=1}^{a-1} \bar Q_t(u+2j)} . \tag{5.57} \]
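The determinant identity (5.54) used in the first step is easy to test numerically. The sketch below is only a check on small random integer matrices, not a substitute for the inductive proof; the function names are hypothetical and indices are shifted to 0-based, so the last column $a_{m,k+1}$ becomes `a[m][k]`.

```python
import random

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(sub)
    return total

def condensed(a):
    """k x k matrix of the 2 x 2 minors on the left-hand side of (5.54)."""
    k = len(a) - 1
    return [[a[m][n] * a[m + 1][k] - a[m][k] * a[m + 1][n]
             for n in range(k)] for m in range(k)]

def identity_sides(a):
    """Return (lhs, rhs) of (5.54) for a square matrix a of size k + 1."""
    k = len(a) - 1
    prefactor = 1
    for j in range(1, k):  # rows 2..k of the last column in the 1-based notation
        prefactor *= a[j][k]
    return det(condensed(a)), prefactor * det(a)

rng = random.Random(0)
for k in range(1, 5):
    a = [[rng.randint(-5, 5) for _ in range(k + 1)] for _ in range(k + 1)]
    print(k, identity_sides(a))
```

For $k = 1$ the product is empty and both sides reduce to the same $2 \times 2$ determinant.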

Relations (5.46), (5.47) are direct corollaries of the lemma.

Second step: From $h_t^{(i)}$'s to $q_i$'s. Let us fix a level $k$ and define the quantities
\[ q_i(u) = \frac{H_{k-1}^{(k,k-1,\ldots,\widehat{k-i+1},\ldots,1)}(u-2k+4)}{\prod_{j=1}^{k-2} Q_k(u-2k+2j+2)} , \tag{5.58} \]
\[ \bar q_i(u) = \frac{\bar H_{k-1}^{(k,k-1,\ldots,\widehat{k-i+1},\ldots,1)}(u-k+2)}{\prod_{j=1}^{k-2} \bar Q_k(u-k+2j+1)} \tag{5.59} \]
for $1 \le i \le k$. The hat means that the corresponding index is skipped. Due to Lemma 5.2 these quantities actually do not depend on the particular value of $k$ used in the definition. More precisely, define $q_i(u)$, $\bar q_i(u)$ with respect to any level $k' > k$, then they coincide with those previously defined for $1 \le i \le k$. With this definition, one can prove

Lemma 5.3. Fix an arbitrary level $k > 1$. Let $m_\alpha$, $\alpha = 1, 2, \ldots, r$, be a set of integers such that $k \ge m_1 > m_2 > \ldots > m_r \ge 1$ and let $\tilde m_\alpha$, $\alpha = 1, 2, \ldots, k-r$, be its complement to the set $1, 2, \ldots, k$ ordered in the same way: $k \ge \tilde m_1 > \tilde m_2 > \ldots > \tilde m_{k-r} \ge 1$. Then the following identities hold:
\[ \det_{1\le\alpha,\beta\le r} \bigl( q_{m_\beta}(u+2\alpha-2) \bigr) = \frac{\det_{1\le\alpha,\beta\le k-r} \bigl( h_{k-1}^{(\tilde m_\beta)}(u+2r-2k+2\alpha) \bigr)}{\prod_{j=1}^{k-r-1} Q_k(u+2r-2k+2j)} , \tag{5.60} \]
\[ \det_{1\le\alpha,\beta\le r} \bigl( \bar q_{m_\beta}(u+2\alpha-2) \bigr) = \frac{\det_{1\le\alpha,\beta\le k-r} \bigl( \bar h_{k-1}^{(\tilde m_\beta)}(u+2r-k+2\alpha-2) \bigr)}{\prod_{j=1}^{k-r-1} \bar Q_k(u+2r-k+2j-1)} . \tag{5.61} \]
Let us outline the proof. At $r = 1$, these identities coincide with the definitions of $q_i$, $\bar q_i$. At $r = 2$, they follow from the Jacobi identity (2.20). The inductive step consists in expanding the determinant in the left hand side in the first row and then making use of determinant identities equivalent to the $(r+1)$-term Plücker relation.

The identities from Lemma 5.3 allow one to express the canonical solution in terms of $q_i$, $\bar q_i$. The Laplace expansion of the determinant in Eq. (5.7) combined with Eqs. (5.60), (5.61) yields:
\[
F_t^a(s,u) = (-1)^{a(t-1)}
\begin{vmatrix}
q_t(u+s+a) & \cdots & q_1(u+s+a) \\
q_t(u+s+a+2) & \cdots & q_1(u+s+a+2) \\
\cdots & \cdots & \cdots \\
q_t(u+s+2t-a-2) & \cdots & q_1(u+s+2t-a-2) \\
\bar q_t(u-s-a+1) & \cdots & \bar q_1(u-s-a+1) \\
\bar q_t(u-s-a+3) & \cdots & \bar q_1(u-s-a+3) \\
\cdots & \cdots & \cdots \\
\bar q_t(u-s+a-1) & \cdots & \bar q_1(u-s+a-1)
\end{vmatrix} .
\tag{5.62}
\]
In particular, we have:
\[ F_t^0(s,u) = Q_t(u+s) = \det_{1\le i,j\le t} q_{t+1-j}(u+s+2i-2) , \tag{5.63} \]
\[ F_t^t(s,u) = \bar Q_t(u-s) = \det_{1\le i,j\le t} \bar q_{t+1-j}(u-s-t+2i-1) . \tag{5.64} \]
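As a sanity check of the determinant form (5.62), one can test the bilinear equation (5.1) in the smallest nontrivial case $t = 2$, where every $F_2^a$ is a $2 \times 2$ determinant and (5.1) reduces to the three-term Plücker relation for four two-component vectors. The sketch below uses arbitrarily chosen integer-valued polynomials for $q_1, q_2, \bar q_1, \bar q_2$ (these particular functions and all Python names are hypothetical) and exact integer arithmetic.

```python
def q1(x): return x * x + 1
def q2(x): return x ** 3 - 2 * x
def qb1(y): return 2 * y + 3
def qb2(y): return y * y - y + 5

def F(a, s, u):
    """F_2^a(s, u) read off from (5.62) with t = 2; zero outside 0 <= a <= 2 as in (5.4)."""
    x, y = u + s, u - s
    if a == 0:
        # two holomorphic rows: arguments x and x + 2
        return q2(x) * q1(x + 2) - q1(x) * q2(x + 2)
    if a == 1:
        # one holomorphic row (argument x + 1), one antiholomorphic row (argument y), sign (-1)^{a(t-1)}
        return -(q2(x + 1) * qb1(y) - q1(x + 1) * qb2(y))
    if a == 2:
        # two antiholomorphic rows: arguments y - 1 and y + 1
        return qb2(y - 1) * qb1(y + 1) - qb1(y - 1) * qb2(y + 1)
    return 0

def hbde_defect(a, s, u):
    """Left-hand side minus right-hand side of (5.1) at level t = 2."""
    lhs = F(a, s, u + 1) * F(a, s, u - 1) - F(a, s + 1, u) * F(a, s - 1, u)
    rhs = F(a + 1, s, u) * F(a - 1, s, u)
    return lhs - rhs

# Scan a small grid of integer points; every defect should vanish.
print(all(hbde_defect(a, s, u) == 0
          for a in range(3) for s in range(-3, 4) for u in range(-3, 4)))
```

For $a = 0$ and $a = 2$ both sides vanish trivially because $F_2^0$ depends only on $u+s$ and $F_2^2$ only on $u-s$; the case $a = 1$ is where the Plücker relation is actually used.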

Third step: The Plücker relation. Consider the rectangular $(t+3) \times (t+1)$-matrix $S_{ij}$, $i = 1, 2, \ldots, t+3$, $j = 1, 2, \ldots, t+1$, given explicitly by
\[
\begin{array}{ll}
S_{1j} = \delta_{1j} , & \\
S_{ij} = q_{t+2-j}(u+s+a+2i-4) , & 2 \le i \le t-a+2 , \\
S_{ij} = \bar q_{t+2-j}(u-s+a+2i-2t-7) , & t-a+3 \le i \le t+3 .
\end{array}
\tag{5.65}
\]
Applying the determinant identity (2.21) (the elementary Plücker relation) to minors of this matrix, one gets Eq. (5.2) for $l_1 = 1$, $l_2 = 2$, $l_3 = t-a+2$, $l_4 = t-a+3$ and Eq. (5.3) for $l_1 = 1$, $l_2 = t-a+2$, $l_3 = t-a+3$, $l_4 = t+1$. This completes the proof.

Remark. Functions $q_i(u)$, $\bar q_i(u)$, $i = 1, 2, \ldots, k$, are linearly independent solutions to generalized Baxter's equations (5.48), (5.49) respectively. To construct an elliptic polynomial solution for $T_s^a(u)$, it is sufficient to take them to be arbitrary elliptic polynomials of one and the same degree $d$,
\[ q_i(u) = e^{\zeta_i\eta u} \prod_{l=1}^{d} \sigma\bigl(\eta(u - v_l^{(i)})\bigr) , \qquad \bar q_i(u) = e^{\bar\zeta_i\eta u} \prod_{l=1}^{d} \sigma\bigl(\eta(u - \bar v_l^{(i)})\bigr) , \]
with the only conditions that $\zeta_i - \bar\zeta_i$, $\sum_{l=1}^{d} (v_l^{(i)} - \bar v_l^{(i)})$ do not depend on $i = 1, 2, \ldots, k$. It is easy to check that in this case general conditions (2.8), (2.9) are fulfilled.

6. Regular Elliptic Solutions of the HBDE and RS System in Discrete Time

In this section we study the class of elliptic solutions to HBDE for which the number of zeros $M_t$ of the $\tau$-function does not depend on $t$. We call them elliptic solutions of the regular type (or simply regular elliptic solutions) since they have a smooth continuum limit. Although it has been argued in the previous section that the situation of interest for the Bethe ansatz is quite opposite, we find it useful to briefly discuss this class of solutions.

It is convenient to slightly change the notation: $\tau^{l,m}(x) \equiv \tau_u(-m,-l)$, $x \equiv u\eta$. HBDE (1.8) acquires the form
\[ \tau^{l+1,m}(x)\tau^{l,m+1}(x) - \tau^{l+1,m+1}(x)\tau^{l,m}(x) = \tau^{l+1,m}(x+\eta)\tau^{l,m+1}(x-\eta) . \tag{6.1} \]
We are interested in solutions that are elliptic polynomials in $x$,
\[ \tau^{l,m}(x) = \prod_{j=1}^{M} \sigma(x - x_j^{l,m}) . \tag{6.2} \]
The main goal of this section is to describe this class of solutions in a systematic way and, in particular, to prove that all the elliptic solutions of regular type are finite-gap.

The auxiliary linear problems (3.5) look as follows:
\[ \Psi^{l,m+1}(x) = \Psi^{l,m}(x+\eta) + \frac{\tau^{l,m}(x)\,\tau^{l,m+1}(x+\eta)}{\tau^{l,m+1}(x)\,\tau^{l,m}(x+\eta)}\, \Psi^{l,m}(x) , \tag{6.3} \]
\[ \Psi^{l+1,m}(x) = \Psi^{l,m}(x) + \frac{\tau^{l,m}(x-\eta)\,\tau^{l+1,m}(x+\eta)}{\tau^{l+1,m}(x)\,\tau^{l,m}(x)}\, \Psi^{l,m}(x-\eta) . \tag{6.4} \]
(The notation is correspondingly changed: $\Psi^{l,m}(u\eta) \equiv \psi_u(-m,-l)$.) The coefficients are elliptic functions of $x$. Similarly to the case of the Calogero-Moser model and its spin generalizations [31, 32] the dynamics of their poles is determined by the fact that Eqs. (6.3), (6.4) have an infinite number of double-Bloch solutions (Sect. 4).

The "gauge transformation" $f(x) \to \tilde f(x) = f(x)e^{ax}$ ($a$ is an arbitrary constant) does not change poles of any function and transforms a double-Bloch function into another double-Bloch function. If $B_\alpha$ are Bloch multipliers for $f$, then the Bloch multipliers for $\tilde f$ are $\tilde B_1 = B_1 e^{2a\omega_1}$, $\tilde B_2 = B_2 e^{2a\omega_2}$, where $\omega_1$, $\omega_2$ are quasiperiods of the $\sigma$-function. Two pairs of Bloch multipliers are said to be equivalent if they are connected by this relation with some $a$ (or by the equivalent condition that the product $B_1^{\omega_2} B_2^{-\omega_1}$ is the same for both pairs).

Consider first Eq. (6.3). Since $l$ enters as a parameter, not a variable, we omit it for simplicity of the notation (e.g. $x_j^{l,m} \to x_j^m$).

Theorem 6.1. Equation (6.3) has an infinite number of linearly independent double-Bloch solutions with simple poles at the points $x_i^m$ and equivalent Bloch multipliers if and only if $x_i^m$ satisfy the system of equations
\[ \prod_{j=1}^{M} \frac{\sigma(x_i^m - x_j^{m+1})\,\sigma(x_i^m - x_j^m - \eta)\,\sigma(x_i^m - x_j^{m-1} + \eta)}{\sigma(x_i^m - x_j^{m+1} - \eta)\,\sigma(x_i^m - x_j^m + \eta)\,\sigma(x_i^m - x_j^{m-1})} = -1 . \tag{6.5} \]
All these solutions can be represented in the form
\[ \Psi^m(x) = \sum_{i=1}^{M} c_i(m,z,\kappa)\, \Phi(x - x_i^m, z)\, \kappa^{x/\eta} \tag{6.6} \]
($\Phi(x,z)$ is defined in (4.37)). The set of corresponding pairs $(z,\kappa)$ are parametrized by points of an algebraic curve defined by the equation of the form
\[ R(\kappa,z) = \kappa^M + \sum_{i=1}^{M} r_i(z)\, \kappa^{M-i} = 0 . \tag{6.7} \]
Sketch of proof. We omit the detailed proof since it is almost identical to the proof of the corresponding theorem in [33] and only present the part of it which provides the Lax representation for Eq. (6.5).

Let us substitute the function $\Psi^m(x)$ of the form (6.6) into Eq. (6.3). The cancellation of poles at $x = x_i^m - \eta$ and $x = x_i^{m+1}$ gives the conditions
\[ \kappa\, c_i(m,z,\kappa) + \lambda_i(m) \sum_{j=1}^{M} c_j(m,z,\kappa)\, \Phi(x_i^m - x_j^m - \eta, z) = 0 , \tag{6.8} \]
\[ c_i(m+1,z,\kappa) = \mu_i(m) \sum_{j=1}^{M} c_j(m,z,\kappa)\, \Phi(x_i^{m+1} - x_j^m, z) , \tag{6.9} \]
where
\[ \lambda_i(m) = \frac{\prod_{s=1}^{M} \sigma(x_i^m - x_s^m - \eta)\,\sigma(x_i^m - x_s^{m+1})}{\prod_{s=1,\neq i}^{M} \sigma(x_i^m - x_s^m)\, \prod_{s=1}^{M} \sigma(x_i^m - x_s^{m+1} - \eta)} , \tag{6.10} \]
\[ \mu_i(m) = \frac{\prod_{s=1}^{M} \sigma(x_i^{m+1} - x_s^{m+1} + \eta)\,\sigma(x_i^{m+1} - x_s^m)}{\prod_{s=1,\neq i}^{M} \sigma(x_i^{m+1} - x_s^{m+1})\, \prod_{s=1}^{M} \sigma(x_i^{m+1} - x_s^m + \eta)} . \tag{6.11} \]
Introducing a vector $C(m)$ with components $c_i(m,z,\kappa)$ we can rewrite these conditions in the form
\[ (L(m) + \kappa I)\, C(m) = 0 , \tag{6.12} \]
\[ C(m+1) = M(m)\, C(m) , \tag{6.13} \]
where $I$ is the unit matrix. Entries of the matrices $L(m)$ and $M(m)$ are:
\[ L_{ij}(m) = \lambda_i(m)\, \Phi(x_i^m - x_j^m - \eta, z) , \tag{6.14} \]
\[ M_{ij}(m) = \mu_i(m)\, \Phi(x_i^{m+1} - x_j^m, z) . \tag{6.15} \]
The compatibility condition of (6.12) and (6.13),

\[ L(m+1)\, M(m) = M(m)\, L(m) \tag{6.16} \]
is the discrete Lax equation. By the direct commutation of the matrices $L$, $M$ (making use of some non-trivial identities for the function $\Phi(x,z)$ which are omitted) it can be shown that for the matrices $L$ and $M$ defined by Eqs. (6.14), (6.10) and (6.15), (6.11) respectively, the discrete Lax equation (6.16) holds if and only if the $x_i^m$ satisfy Eqs. (6.5). It is worthwhile to remark that in terms of $\lambda_i(m)$, $\mu_i(m)$ Eqs. (6.5) take the form
\[ \lambda_i(m+1) = -\mu_i(m) , \qquad i = 1, \ldots, M . \tag{6.17} \]
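The remark that (6.5) is equivalent to (6.17) amounts to the observation that the ratio $\lambda_i(m)/\mu_i(m-1)$ reproduces the left-hand side of (6.5) factor by factor. The sketch below checks this bookkeeping in exact arithmetic under a simplifying assumption: the rational degeneration $\sigma(x) \to x$. The sample positions and all Python names are hypothetical.

```python
from fractions import Fraction

def sigma(x):
    """Rational degeneration sigma(x) -> x (an assumption of this sketch)."""
    return x

def prod(factors):
    out = Fraction(1)
    for f in factors:
        out *= f
    return out

def lam(i, x_cur, x_next, eta):
    """lambda_i(m) of (6.10), with x_cur = x^m and x_next = x^{m+1}."""
    M = len(x_cur)
    num = prod(sigma(x_cur[i] - x_cur[s] - eta) * sigma(x_cur[i] - x_next[s]) for s in range(M))
    den = (prod(sigma(x_cur[i] - x_cur[s]) for s in range(M) if s != i)
           * prod(sigma(x_cur[i] - x_next[s] - eta) for s in range(M)))
    return num / den

def mu(i, x_cur, x_next, eta):
    """mu_i(m) of (6.11), with x_cur = x^m and x_next = x^{m+1}."""
    M = len(x_cur)
    num = prod(sigma(x_next[i] - x_next[s] + eta) * sigma(x_next[i] - x_cur[s]) for s in range(M))
    den = (prod(sigma(x_next[i] - x_next[s]) for s in range(M) if s != i)
           * prod(sigma(x_next[i] - x_cur[s] + eta) for s in range(M)))
    return num / den

def rs_lhs(i, x_prev, x_cur, x_next, eta):
    """Left-hand side of (6.5) for particle i, with x_prev = x^{m-1}, x_cur = x^m, x_next = x^{m+1}."""
    M = len(x_cur)
    num = prod(sigma(x_cur[i] - x_next[j]) * sigma(x_cur[i] - x_cur[j] - eta)
               * sigma(x_cur[i] - x_prev[j] + eta) for j in range(M))
    den = prod(sigma(x_cur[i] - x_next[j] - eta) * sigma(x_cur[i] - x_cur[j] + eta)
               * sigma(x_cur[i] - x_prev[j]) for j in range(M))
    return num / den

# Arbitrary sample positions at three consecutive times (not a solution of the equations of motion).
eta = Fraction(1, 2)
x_prev = [Fraction(0), Fraction(7)]
x_cur = [Fraction(1, 3), Fraction(22, 3)]
x_next = [Fraction(3, 4), Fraction(31, 4)]

for i in range(2):
    print(lam(i, x_cur, x_next, eta) / mu(i, x_prev, x_cur, eta),
          rs_lhs(i, x_prev, x_cur, x_next, eta))
```

Since the two printed numbers agree for arbitrary positions, requiring the ratio to equal $-1$ is the same as imposing (6.5) at time $m$, which is (6.17) with $m$ shifted by one.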

Equation (6.12) implies that
\[ R(\kappa,z) \equiv \det(L(m) + \kappa I) = 0 . \tag{6.18} \]
The coefficients of $R(\kappa,z)$ do not depend on $m$ due to (6.16). This equation defines an algebraic curve (6.7) realized as a ramified covering of the elliptic curve.

Solutions to Eq. (6.5) are implicitly given by the equation
\[ \Theta(\vec U x_i^{l,m} + \vec U_+ l + \vec U_- m + \vec Z) = 0 , \tag{6.19} \]
where the Riemann theta-function $\Theta(\vec X)$ corresponds to the spectral curve (6.7), (6.18), components of the vectors $\vec U$, $\vec U_+$, $\vec U_-$ are periods of certain dipole differentials on the curve, and $\vec Z$ is an arbitrary vector. Elliptic solutions are characterized by the following property: $2\omega_i \vec U$, $i = 1, 2$, belongs to the lattice of periods of holomorphic differentials on the curve. The matrix $L(m) = L(l,m)$ is defined by fixing $x_j^{l_0,m_0}$, $x_j^{l_0,m_0+1}$, $j = 1, \ldots, M$. These Cauchy data uniquely define the curve and the vectors $\vec U$, $\vec U_+$, $\vec U_-$ and $\vec Z$ in Eq. (6.19). The curve and vectors $\vec U$, $\vec U_+$, $\vec U_-$ do not depend on the choice of $l_0$, $m_0$. According to Eq. (6.19), the vector $\vec Z$ depends linearly on this choice and its components are thus angle-type variables.

The same analysis can be repeated for the second linear problem (6.4). Now $m$ enters as a parameter and we set $x_i^{l,m} \to \hat x_i^l$ for simplicity. The theorem is literally the same, the equations of motion for the poles being
\[ \prod_{j=1}^{M} \frac{\sigma(\hat x_i^l - \hat x_j^{l+1} + \eta)\,\sigma(\hat x_i^l - \hat x_j^l - \eta)\,\sigma(\hat x_i^l - \hat x_j^{l-1})}{\sigma(\hat x_i^l - \hat x_j^{l+1})\,\sigma(\hat x_i^l - \hat x_j^l + \eta)\,\sigma(\hat x_i^l - \hat x_j^{l-1} - \eta)} = -1 . \tag{6.20} \]

The corresponding discrete Lax equation is
\[ \hat L(l+1)\, \hat M(l) = \hat M(l)\, \hat L(l) , \tag{6.21} \]
where$^5$
\[ \hat L_{ij}(l) = \hat\lambda_i(l)\, \Phi(\hat x_i^l - \hat x_j^l - \eta, z) , \tag{6.22} \]
\[ \hat M_{ij}(l) = \hat\mu_i(l)\, \Phi(\hat x_i^{l+1} - \hat x_j^l - \eta, z) , \tag{6.23} \]
and
\[ \hat\lambda_i(l) = \frac{\prod_{s=1}^{M} \sigma(\hat x_i^l - \hat x_s^l - \eta)\,\sigma(\hat x_i^l - \hat x_s^{l+1} + \eta)}{\prod_{s=1,\neq i}^{M} \sigma(\hat x_i^l - \hat x_s^l)\, \prod_{s=1}^{M} \sigma(\hat x_i^l - \hat x_s^{l+1})} , \tag{6.24} \]
\[ \hat\mu_i(l) = \frac{\prod_{s=1}^{M} \sigma(\hat x_i^{l+1} - \hat x_s^{l+1} + \eta)\,\sigma(\hat x_i^{l+1} - \hat x_s^l - \eta)}{\prod_{s=1,\neq i}^{M} \sigma(\hat x_i^{l+1} - \hat x_s^{l+1})\, \prod_{s=1}^{M} \sigma(\hat x_i^{l+1} - \hat x_s^l)} . \tag{6.25} \]

$^5$ A very close version of the discrete L-M pair appeared first in Ref. [43] as an a priori ansatz.

All these formulas can be obtained from (6.5), (6.10)-(6.15) by the formal substitutions $x_i^m \to \hat x_i^l$, $x_i^{m\pm1} \to \hat x_i^{l\pm1} \mp \eta$. According to the comment after Eq. (6.19), the Cauchy data for the $l$-flow $x_j^{l_0,m_0}$, $x_j^{l_0+1,m_0}$ are uniquely determined by fixing the Cauchy data $x_j^{l_0,m_0}$, $x_j^{l_0,m_0+1}$ for the $m$-flow and vice versa.

7. Conclusion and Outlook

It turned out that classical and quantum integrable models have a deeper connection than the common assertion that the former are obtained as a "classical limit" of the latter. In this paper we have tried to elaborate perhaps the simplest example of this phenomenon: the fusion rules for quantum transfer matrices coincide with Hirota's bilinear difference equation (HBDE).

We have identified the bilinear fusion relations in Hirota's classical difference equation with particular boundary conditions and elliptic solutions of the Hirota equation, with eigenvalues of the quantum transfer matrix. Eigenvalues of the quantum transfer matrix play the role of the $\tau$-function. Positions of zeros of the solution are determined by the Bethe ansatz equations. The latter have been derived from an entirely classical set-up.

We have shown that nested Bethe ansatz equations can be considered as a natural discrete time analogue of the Ruijsenaars-Schneider system of particles. The discrete time $t$ runs over vertices of the Dynkin graph of $A_{k-1}$-type and numbers levels of the nested Bethe ansatz. The continuum limit in $t$ gives the continuous time RS system [48]. This is our motivation to search for classical integrability properties of the nested Bethe ansatz equations.

In addition we constructed the general solution of the Hirota equation with certain boundary conditions and obtained new determinant representations for eigenvalues of the quantum transfer matrix. The approach suggested in Sect. 5 resembles the Leznov-Saveliev solution [39] to the 2D Toda lattice with open boundaries.
It can be considered as an integrable discretization of the classical W -geometry [18]. We hope that this work gives enough evidence to support the assertion that all spectral characteristics of quantum integrable systems on finite 1D lattices can be obtained starting from classical discrete soliton equations, not implying a quantization. The Bethe ansatz technique, which has been thought of as a specific tool of quantum integrability is shown to exist in classical discrete nonlinear integrable equations. The main new lesson is that solving classical discrete soliton equations one recovers a lot of information about a quantum integrable system. Soliton equations usually have a huge number of solutions with very different properties. To extract the information about a quantum model, one should restrict the class of solutions by imposing certain boundary and analytic conditions. In particular, elliptic solutions to HBDE give spectral properties of quantum models with elliptic R-matrices. The difference bilinear equation of the same form, though with different analytical requirements, has appeared in quantum integrable systems in another context. Spin-spin correlation functions of the Ising model obey a bilinear difference equation that can be recast into the form of HBDE [41, 45, 2]. More recently, nonlinear equations for correlation functions have been derived for a more general class of quantum integrable models, by virtue of the new approach of Ref. [10].


Thermodynamic Bethe ansatz equations written in the form of functional relations [53, 46] (see e.g., [7]) appeared to be identical to HBDE with different analytic properties. All these suggest that HBDE may play the role of a master equation for both classical and quantum integrable systems simultaneously, such that the “equivalence” between quantum systems and discrete classical dynamics might be extended beyond the spectral properties discussed in this paper. In particular, it will be very interesting to identify the quantum group structures and matrix elements of quantum L-operators and R-matrices with objects of classical hierarchies. We do not doubt that such a relation exists.

Acknowledgement. We are grateful to O. Babelon, B. Enriquez, L. Faddeev, J.L. Gervais, A. Klümper, V. Korepin, P. Kulish, A. Kuniba, D. Lebedev, A. Mironov, P. Pearce, N. Reshetikhin, R. Seiler, E. Sklyanin, N. Slavnov, T. Takebe and Al. Zamolodchikov for discussions and A. Abanov and J. Talstra for discussions and help. P.W. and O.L. were supported by grant NSF DMR 9509533 and the MRSEC Program of NSF DMR 9400379. The work of A.Z. was supported in part by RFFR grant 95-01-01106, by ISTC grant 015, by INTAS 94 2317 and the grant of the Dutch NWO organization. A.Z. and P.W. thank the Erwin Schrödinger Institute of Mathematical Physics for hospitality during the semester “Condensed Matter Physics – Dynamics, Geometry and Spectral Theory,” where this work was started. P.W. also thanks École Normale Supérieure, where a part of this work was done, for hospitality. I.K. and A.Z. are grateful to the University of Chicago for the hospitality in November 1995. During these visits they were supported by NSF grant DMR 9509533. This work has been reported at the NATO Summer School at Cargèse 1995.

References

1. Airault, H., McKean, H., Moser, J.: Rational and elliptic solutions of the KdV equation and related many-body problem. Comm. Pure and Appl. Math. 30, 95–125 (1977)
2. Au-Yang, H., Perk, J.H.H.: Toda lattice equation and wronskians in the 2d Ising model. Physica 18D, 365–366 (1986)
3. Baxter, R.: Partition function of the eight-vertex lattice model. Ann. Phys. 70, 193–228 (1972)
4. Baxter, R.: Exactly solved models in statistical mechanics. New York–London: Academic Press, 1982
5. Bazhanov, V., Reshetikhin, N.: Critical RSOS models and conformal field theory. Int. J. Mod. Phys. A4, 115–142 (1989)
6. Bazhanov, V., Reshetikhin, N.: Restricted solid on solid models connected with simply laced algebras and conformal field theory. J. Phys. A23, 1477–1492 (1990)
7. Bazhanov, V., Lukyanov, S., Zamolodchikov, A.: Integrable structure of conformal field theory, quantum KdV theory and thermodynamic Bethe ansatz. Preprint CLNS 94/1316, RU-94-98, hep-th/9412229
8. Bilal, A., Gervais, J.-L.: Extended $c = \infty$ conformal systems from classical Toda field theories. Nucl. Phys. B314, 646–686 (1989)
9. Bobenko, A., Kutz, N., Pinkall, U.: The discrete quantum pendulum. Phys. Lett. A177, 399–404 (1993)
10. Bogoliubov, N.M., Izergin, A.G., Korepin, V.E.: Quantum inverse scattering method and correlation functions. Cambridge: Cambridge University Press, 1993
11. Chudnovsky, D.V., Chudnovsky, G.V.: Pole expansions of non-linear partial differential equations. Nuovo Cimento 40B, 339–350 (1977)
12. Date, E., Jimbo, M., Miwa, T.: Method for generating discrete soliton equations I, II. J. Phys. Soc. Japan 4116–4131 (1982)
13. Faddeev, L.D., Takhtadjan, L.A.: Quantum inverse scattering method and the XYZ Heisenberg model. Uspekhi Mat. Nauk 34:5, 13–63 (1979)
14. Faddeev, L.D., Volkov, A.Yu.: Quantum inverse scattering method on a spacetime lattice. Teor. Mat. Fiz. 92, 207–214 (1992) (in Russian); Faddeev, L.D.: Current-like variables in massive and massless integrable models. Lectures at E. Fermi Summer School, Varenna 1994, hep-th/9406196
15. Frenkel, E., Reshetikhin, N.: Quantum affine algebras and deformations of the Virasoro and W-algebras. Preprint q-alg/9505025 (1995)
16. Gaudin, M.: La fonction d’onde de Bethe. Paris: Masson, 1983
17. Gervais, J.-L., Neveu, A.: Novel triangle relation and absence of tachyons in Liouville string field theory. Nucl. Phys. B238, 125 (1984)
18. Gervais, J.-L., Matsuo, Y.: W-geometries. Phys. Lett. B274, 309–316 (1992); Classical $A_n$-W geometry. Commun. Math. Phys. 152, 317–368 (1993)
19. Griffiths, P., Harris, J.: Principles of algebraic geometry. A Wiley-Interscience Publication, New York: John Wiley & Sons, 1978
20. Hirota, R.: Nonlinear partial difference equations II; Discrete time Toda equations. J. Phys. Soc. Japan 43, 2074–2078 (1977); Discrete analogue of a generalized Toda equation. J. Phys. Soc. Japan 50, 3785–3791 (1981)
21. Hirota, R.: Nonlinear partial difference equations III; Discrete sine-Gordon equation. J. Phys. Soc. Japan 43, 2079–2086 (1977)
22. Hirota, R.: Discrete two-dimensional Toda molecule equation. J. Phys. Soc. Japan 56, 4285–4288 (1987)
23. Hirschfeld, J.W.P., Thas, J.A.: General Galois geometries. Oxford: Clarendon Press, 1991
24. Hodge, W.V.D., Pedoe, D.: Methods of algebraic geometry. Volume I, Cambridge: Cambridge University Press, 1947
25. Jimbo, M., Miwa, T.: Solitons and infinite dimensional Lie algebras. Publ. RIMS, Kyoto Univ. 19, 943–1001 (1983)
26. Jimbo, M., Miwa, T., Okado, M.: An $A^{(1)}_{n-1}$ family of solvable lattice models. Mod. Phys. Lett. B1, 73–79 (1987)
27. Jorjadze, G., Pogrebkov, A., Polivanov, M., Talalov, S.: Liouville field theory: IST and Poisson bracket structure. J. Phys. A19, 121–139 (1986)
28. Kirillov, A.N.: Completeness of states in the generalized Heisenberg magnet. Zap. Nauchn. Sem. LOMI 134, 169–189 (1984) (in Russian)
29. Kirillov, A., Reshetikhin, N.: Exact solution of the integrable XXZ Heisenberg model with arbitrary spin, I, II. J. Phys. A20, 1565–1597 (1987)
30. Klümper, A., Pearce, P.: Conformal weights of RSOS lattice models and their fusion hierarchies. Physica A183, 304–350 (1992)
31. Krichever, I.M.: Elliptic solutions of Kadomtsev–Petviashvilii equation and integrable systems of particles. Funct. Anal. App. 14, n 4, 282–290 (1980)
32. Krichever, I., Babelon, O., Billey, E., Talon, M.: Spin generalization of the Calogero-Moser system and the Matrix KP equation. Preprint LPTHE 94/42
33. Krichever, I.M., Zabrodin, A.V.: Spin generalization of the Ruijsenaars-Schneider model, non-abelian 2D Toda chain and representations of Sklyanin algebra. Uspekhi Mat. Nauk 50:6, 3–56 (1995), hep-th/9505039
34. Kulish, P.P., Reshetikhin, N.Yu., Sklyanin, E.K.: Yang-Baxter equation and representation theory: I. Lett. Math. Phys. 5, 393–403 (1981)
35. Kulish, P.P., Reshetikhin, N.Yu.: On $GL_3$-invariant solutions of the Yang-Baxter equation and associated quantum systems. Zap. Nauchn. Sem. LOMI 120, 92–121 (1982) (in Russian), Engl. transl.: J. Soviet Math. 34, 1948–1971 (1986)
36. Kulish, P.P., Sklyanin, E.K.: Quantum spectral transform method. Recent developments. Lecture Notes in Physics 151, Berlin-Heidelberg-New York: Springer, 1982, pp. 61–119
37. Kuniba, A., Nakanishi, T., Suzuki, J.: Functional relations in solvable lattice models, I: Functional relations and representation theory, II: Applications. Int. J. Mod. Phys. A9, 5215–5312 (1994)
38. Kuniba, A., Suzuki, J.: Analytic Bethe ansatz for fundamental representations of Yangians. hep-th/9406180
39. Leznov, A., Saveliev, M.: Theory of group representations and integration of nonlinear systems $x_{a,z\bar z} = \exp(kx)_a$. Physica 3D, 62–72 (1981); Group-theoretical methods for integration of nonlinear dynamical systems. Progress in Physics series 15, Basel: Birkhäuser-Verlag, 1992
40. Lipan, O., Wiegmann, P., Zabrodin, A.: Fusion Rules for Quantum Transfer Matrices as a Dynamical System on Grassmann Manifolds. Mod. Phys. Lett. A 12 (19), 1369–1378 (1997)
41. McCoy, B., Wu, T.T.: Nonlinear partial difference equations for the two-dimensional Ising model. Phys. Rev. Lett. 45, 675–678 (1980); Nonlinear partial difference equations for the two-spin correlation function of the two-dimensional Ising model. Nucl. Phys. B180, 89–115 (1981); McCoy, B., Perk, J.H.H., Wu, T.: Ising field theory: quadratic difference equations for the n-point Green’s functions on the lattice. Phys. Rev. Lett. 46, 757–760 (1981)
42. Miwa, T.: On Hirota’s difference equations. Proc. Japan Acad. 58, 9–12 (1982)
43. Nijhoff, F., Ragnisco, O., Kuznetsov, V.: Integrable time-discretization of the Ruijsenaars-Schneider model. Commun. Math. Phys. 176, 681–700 (1996)
44. Ohta, Y., Hirota, R., Tsujimoto, S., Imai, T.: Casorati and discrete Gram type determinant representations of solutions to the discrete KP hierarchy. J. Phys. Soc. Japan 62, 1872–1886 (1993)
45. Perk, J.H.H.: Quadratic identities for Ising model correlations. Phys. Lett. A79, 3–5 (1980)
46. Ravanini, F., Valleriani, A., Tateo, R.: Dynkin TBA’s. Int. J. Mod. Phys. A8, 1707–1727 (1993)
47. Reshetikhin, N.Yu.: The functional equation method in the theory of exactly soluble quantum systems. Sov. Phys. JETP 57, 691–696 (1983)
48. Ruijsenaars, S.N.M., Schneider, H.: A new class of integrable systems and its relation to solitons. Ann. Phys. (NY) 170, 370–405 (1986)
49. Saito, S., Saitoh, N.: Linearization of bilinear difference equations. Phys. Lett. A120, 322–326 (1987); Gauge and dual symmetries and linearization of Hirota’s bilinear equations. J. Math. Phys. 28, 1052–1055 (1987)
50. Sato, M.: Soliton equations as dynamical systems on infinite dimensional Grassmann manifolds. RIMS Kokyuroku 439, 30–46 (1981)
51. Segal, G., Wilson, G.: Loop groups and equations of KdV type. Publ. IHES 61, 5–65 (1985)
52. Ueno, K., Takasaki, K.: Toda lattice hierarchy. Adv. Studies in Pure Math. 4, 1–95 (1984)
53. Zamolodchikov, A.B.: On the thermodynamic Bethe ansatz equations for reflectionless ADE scattering theories. Phys. Lett. B253, 391–394 (1991)
54. Zhou, Y., Pearce, P.: Solution of functional equations of restricted $A^{(1)}_{n-1}$ fused lattice models. Preprint hep-th/9502067 (1995)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 188, 305 – 325 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

On the Geometry of Darboux Transformations for the KP Hierarchy and its Connection with the Discrete KP Hierarchy

Franco Magri$^1$, Marco Pedroni$^2$, Jorge P. Zubelli$^3$

1 Dip. di Matematica, Università di Milano, Via C. Saldini 50, Milano MI 20154, Italy
2 Dip. di Matematica, Università di Genova, Via Dodecaneso 35, Genova GE 16146, Italy
3 IMPA – CNPq, Est. D. Castorina 110, Rio de Janeiro RJ 22460, Brazil

Received: 23 July 1996 / Accepted: 6 January 1997

Abstract: We tackle the problem of interpreting the Darboux transformation for the KP hierarchy and its relations with the modified KP hierarchy from a geometric point of view. This is achieved by introducing the concept of a Darboux covering. We construct a Darboux covering of the KP equations and obtain a new hierarchy of equations, which we call the Darboux-KP hierarchy (DKP). We employ the DKP equations to discuss the relationships among the KP equations, the modified KP equations, and the discrete KP equations. Our approach also handles the various reductions of the KP hierarchy. We show that the KP hierarchy is a projection of the DKP, the mKP hierarchy is a DKP restriction to a suitable invariant submanifold, and that the discrete KP equations are obtained as iterations of the DKP ones.

1. Introduction

The theory of Darboux transformations has a long and curious history. These transformations were introduced more than a century ago by G. Darboux in [8] and, after passing through a period of oblivion, they were rediscovered as a technique for constructing solutions to important partial differential equations of Mathematical Physics. They have been used in a number of interesting situations, as can be seen in [1, 9, 10, 17, 18, 23, 24] and references therein.

The “philosophical” purpose of the present work is to show that the Darboux technique has a deeper geometric significance than the usually accepted explanation that the underlying equations are “covariant” under a certain set of formal manipulations [17]. Furthermore, we argue below that an appropriate geometric setting of the Darboux method allows one to clarify the links among important elements of Soliton Theory, to wit the Kadomtsev-Petviashvili (KP) hierarchy, the modified KP hierarchy, the (generalized) Miura map, and the (generalized) Toda lattice. We do so by approaching the Darboux method as a theory of intertwining of vector fields, as follows:


Let us consider three dynamical systems described by three vector fields X, Y, and Z on three manifolds M, N, and P, respectively. We shall say that Y intertwines X and Z if there exists a pair of maps µ : N → M and σ : N → P relating Y to X and Z, respectively. In the form of a diagram we have

$$ M \xleftarrow{\;\mu\;} N \xrightarrow{\;\sigma\;} P, \qquad X \xleftarrow{\;\mu_*\;} Y \xrightarrow{\;\sigma_*\;} Z. $$

In particular, if Z coincides with X and N is a fiber bundle over M = P with canonical projection µ, we shall say that Y is a Darboux covering of X. So, a Darboux covering is a vector field Y on a fiber bundle over M that intertwines X with itself by means of the canonical projection µ : N → M and the Darboux map σ : N → M.

The concept of Darboux covering described above may be used to construct solutions as well as invariant submanifolds of the vector field X. The construction of solutions is easily explained in a system of local coordinates on N adapted to the canonical projection µ. If U is an open set on the base M with local coordinates x, we denote by (x, a) the fibered coordinates on $\mu^{-1}(U)$ for which the projection µ reduces to the projection on the first coordinates x. In these coordinates, the equations of the vector field Y become

$$ \dot x = X(x), \tag{1.1} $$
$$ \dot a = Y(x, a), \tag{1.2} $$

where Eq. (1.1) is the equation of the vector field X on U. These equations show that we can lift any integral curve x(t) of X into an integral curve (x(t), a(t)) of Y by solving the auxiliary system (1.2), controlled by x(t). Once these equations have been solved, the Darboux map σ : N → M gives us a second solution

$$ \tilde x(t) = \sigma(x(t), a(t)) \tag{1.3} $$

of the dynamical system described by the vector field X. This solution depends on as many arbitrary parameters as there are arbitrary constants entering into the solution of the auxiliary system (1.2). One of the first main points of the present work is to explain how one can interpret the Darboux transformation of Soliton Theory within the framework described above. Another way of thinking about Eq. (1.3) is as a symmetry transformation of the dynamical system described by X, depending on a solution of an auxiliary system controlled by X itself.

The use of Darboux coverings for the reduction of the dynamical system X rests on the elementary remark that any invariant submanifold S ⊂ N of the vector field Y projects into two submanifolds $S' = \mu(S)$, $S'' = \sigma(S)$, which are invariant under X on the base space M. If $S'$ is the whole base space M, we obtain a Darboux subcovering of X. If σ(S) ⊂ µ(S) ⊂ M, the restriction of Y to S provides a Darboux covering of the restriction of X to $S'$.

The main purpose of the present work is to display the usefulness of the concept of Darboux covering in unifying many concepts and results of Soliton Theory. Among these, we have in mind the theory of Miura maps, Darboux transformations, Krichever rational invariant submanifolds, modified KP equations, and their relations with the KP and the discrete KP equations. All these ideas can be handled as different aspects of a single construction, leading to a specific Darboux covering of the KP equations, which we shall call the Darboux-KP (DKP) hierarchy. To define this hierarchy, we consider pairs of monic Laurent series a(z) and h(z) of the form

$$ a(z) = z + \sum_{j\ge 0} a_j z^{-j}, \tag{1.4} $$
$$ h(z) = z + \sum_{j\ge 1} h_j z^{-j}, \tag{1.5} $$

whose coefficients are functions of a space variable x. We associate to each of the Laurent series above the corresponding Faà di Bruno iterates $a^{(j)}$ and $h^{(j)}$ defined by

$$ a^{(0)} = 1, \qquad a^{(j+1)} = (\partial_x + a)\,a^{(j)}, \quad \forall j \ge 0, \tag{1.6} $$

and by

$$ h^{(0)} = 1, \qquad h^{(j+1)} = (\partial_x + h)\,h^{(j)}, \quad \forall j \ge 0. \tag{1.7} $$

We combine the Faà di Bruno iterates in a suitable way to define the currents $A^{(j)}$ and $H^{(j)}$ associated with a and h, respectively. More precisely, the currents $H^{(j)}$ are defined as the unique linear combinations of the iterates of h

$$ H^{(j)} = h^{(j)} + \sum_{l=0}^{j-2} p^j_l[h]\, h^{(l)} $$

with the asymptotic behavior $H^{(j)} = z^j + O(z^{-1})$ as z → ∞. Similarly, the currents $A^{(j)}$ associated with a are defined as the unique linear combinations of the iterates of a

$$ A^{(j)} = a^{(j)} + \sum_{l=1}^{j-1} q^j_l[a]\, a^{(l)} \tag{1.8} $$

with the asymptotic behavior

$$ A^{(j)} = z^j + O(z^0) \tag{1.9} $$

as z → ∞. We shall show in Sect. 2 how to identify the KP and the mKP hierarchy with the local conservation laws

$$ \partial_{t_j} h = \partial_x H^{(j)} \tag{1.10} $$

and

$$ \partial_{t_j} a = \partial_x A^{(j)}, \tag{1.11} $$

respectively. The Darboux covering of the KP hierarchy is defined by the space N of pairs of Laurent series (h, a), by the maps

$$ \mu(h, a) = h, \tag{1.12} $$
$$ \sigma(h, a) = h + \frac{a_x}{a}, \tag{1.13} $$

and by the DKP equations

$$ \partial_{t_j} h = \partial_x H^{(j)}, \qquad \partial_{t_j} a = a\,(\tilde H^{(j)} - H^{(j)}), \tag{1.14} $$

where $\tilde H^{(j)}$ is the current $H^{(j)}$ evaluated at the point $\tilde h = \sigma(h, a)$.

This paper is devoted to the study of Eq. (1.14). We shall divide our study into two parts, the first one dealing with Darboux subcoverings and reductions, and the second one dealing with Darboux iteration. Our first remark, in Sect. 3, is that the submanifolds $S_l \subset N$ defined by the constraints

$$ z^l a = H^{(l+1)} + \sum_{m=0}^{l} a_m H^{(l-m)} $$

are invariant submanifolds of the DKP equations. As a corollary, we get

Theorem 1. The submanifolds

$$ S'_{l,n} = \mu(S_l \cap S_{l+n}) $$

are invariant submanifolds for the KP equations.

The manifolds $S'_{l,n}$ have been called by Krichever “rational reductions” of the KP equations. In our formalism they appear as simple intersections of invariant submanifolds of the DKP equations. Then we focus our attention on the simplest invariant submanifold $S_0$. In Sect. 4 we prove that the mKP Eqs. (1.11) are the restriction of the DKP equations to $S_0$. The next two theorems are immediate consequences, as we shall see, of the preceding remark.

Theorem 2. Let $a = a(t_j; z)$ be any solution of the mKP equations. Then

$$ h = a - a_0 \tag{1.15} $$

and

$$ \tilde h = (a^{(2)} - a_0 a^{(1)})/a \tag{1.16} $$

are two solutions of the KP equations.
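
As a quick consistency check (a sketch added for illustration, not part of the original argument), formula (1.16) can be compared with the Darboux map (1.13). It uses only the recursion (1.6), which gives $a^{(1)} = a$ and $a^{(2)} = a_x + a^2$, together with $h = a - a_0$ from (1.15):

```latex
\begin{aligned}
\tilde h &= \frac{a^{(2)} - a_0\, a^{(1)}}{a}
          = \frac{a_x + a^2 - a_0\, a}{a} \\
         &= (a - a_0) + \frac{a_x}{a}
          = h + \frac{a_x}{a}
          = \sigma(h, a).
\end{aligned}
```

So (1.15) and (1.16) are the maps µ and σ of (1.12) and (1.13) evaluated on pairs with $a = h + a_0$. This is the constraint defining $S_0$, i.e. the case l = 0 of the constraints stated before Theorem 1, since $H^{(0)} = 1$ and $H^{(1)} = h$.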

Let a be a solution of the mKP hierarchy. We shall say that $\tilde h$, related to h by means of Eqs. (1.15) and (1.16), is an elementary Darboux transformation of h.

Theorem 3. Let h = h(t; z) be any solution of the KP equations, and let $a_0 = a_0(t_j)$ be a solution of the auxiliary linear system

$$ \partial_{t_j} a_0 + \partial_x \sum_{l=0}^{j} p^j_l[h]\,(-a_0)^{(l)} = 0, \tag{1.17} $$

controlled by h(t; z). Then

$$ \tilde h = (h^{(2)} + a_0 h^{(1)} + a_{0,x})/(h + a_0) \tag{1.18} $$

is a new solution of the KP hierarchy, which is related to the solution h by an elementary Darboux transformation.
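
The link between (1.18) and the Darboux map can be made explicit. The following sketch assumes, as in Theorem 2, that the pair (h, a) lies on $S_0$, i.e. $a = h + a_0$. It also reads $(-a_0)^{(l)}$ in (1.17) as the Faà di Bruno iterate built from the function $-a_0$ in place of h, and takes $p^j_j = 1$, $p^j_{j-1} = 0$; both are assumptions about the notation, supported by the j = 2 case worked out below.

```latex
% (1.18) from the Darboux map sigma, with a = h + a_0 and h^{(2)} = h_x + h^2
\[
\tilde h = h + \frac{a_x}{a}
         = \frac{h^2 + a_0 h + h_x + a_{0,x}}{h + a_0}
         = \frac{h^{(2)} + a_0 h^{(1)} + a_{0,x}}{h + a_0}.
\]
% (1.17) for j = 2: p^2_2 = 1, p^2_1 = 0, p^2_0 = -2 h_1, and
% (-a_0)^{(2)} = (\partial_x - a_0)(-a_0) = -a_{0,x} + a_0^2
\[
\partial_{t_2} a_0 = \partial_x \bigl( a_{0,x} - a_0^2 + 2 h_1 \bigr).
\]
```

The second line agrees with the $z^0$ coefficient of the mKP equation $\partial_{t_2} a = \partial_x A^{(2)}$, where $A^{(2)} = a^{(2)} - 2 a_0 a$ follows from (1.8) and (1.9) and $a_1 = h_1$ on $S_0$. This is what one expects if (1.17) is the mKP flow of $a_0$ rewritten in terms of h.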

Theorem 2 explains in what sense the mKP equations are the simplest Darboux covering of the KP equations. Theorem 3 uses this remark to construct what we call elementary Darboux transformations of the KP equations along the lines discussed above for general Darboux coverings. The practical value of this result is enhanced by the remark that, whenever one is allowed to sum the formal Laurent series h(z) at a point $z = z_0 \in \mathbb{C}$, we have that $a_0(t) = -h(t, z_0)$ is a solution of the auxiliary system. Therefore, under that assumption, the construction of the new solution $\tilde h(t)$ becomes a purely algebraic problem.

Our final remark concerns the reduction to the Gelfand–Dickey equations. As a simple consequence of the natural behavior of the Miura and Darboux maps under restriction, in Sect. 5 we shall obtain a very simple form of the Miura map for the GD equations.

Theorem 4. The Miura map, the Darboux map, and the elementary Darboux transformation for the nth GD equations are the restriction of the maps (1.15), (1.16), (1.18) to the submanifold $A^{(n)} = z^n$.

The second part of the theory developed herein deals with the iteration of elementary Darboux transformations. The problem is now to determine a sequence {a(n)} of Laurent series that solve, for each n, the mKP equations

$$ \partial_{t_j} a(n) = \partial_x A^{(j)}(n), \tag{1.19} $$

and that satisfy the Darboux recursion relation

$$ \hat\mu(a(n+1)) = \hat\sigma(a(n)), \tag{1.20} $$

where $\hat\mu$ and $\hat\sigma$ are the restrictions (1.15) and (1.16) of the maps µ and σ onto $S_0$. These relations, written in the explicit form

$$ a(n)_x + a(n)^2 - a_0(n)\,a(n) = [a(n+1) - a_0(n+1)]\,a(n), $$

allow one to compute all the higher-order derivatives of a(z) with respect to x, and to express them as polynomials in a(n), a(n + 1), and so on. By substituting these polynomials into the mKP Eq. (1.19), one gets a discrete system we shall now describe. Following a procedure similar to the one above in the construction of the KP and the mKP hierarchies, we associate to any sequence $\{a(n)\}_{n\ge 0}$ the discrete Faà di Bruno iterates $a^{(j)}(n)$ defined by

$$ a^{(0)}(n) = 1, \qquad a^{(j+1)}(n) = a(n)\,a^{(j)}(n+1). $$

Then we combine these iterates in such a way as to force the Laurent series

$$ K^{(j)}(n) = a^{(j)}(n) + \sum_{l=0}^{j-1} r^j_l[a]\, a^{(l)}(n) $$

to have the asymptotic behavior

$$ K^{(j)}(n) = z^j + O(z^{-1}), $$

as z → ∞. Associated to the currents $K^{(j)}$ we define a system of discrete conservation laws

$$ \partial_{t_j} a(n) = a(n)\,\bigl(K^{(j)}(n+1) - K^{(j)}(n)\bigr), \tag{1.21} $$

which are equivalent to the discrete KP equations (dKP for short). In Sect. 6 we show the following

Theorem 5. The generating function a(n) of a sequence of iterated Darboux transformations of the KP hierarchy verifies the discrete KP hierarchy of Eq. (1.21).

Furthermore, we show that the previous result also admits a converse statement.

Theorem 6. Let a(n) be a solution of the discrete KP equations, and let $a(n_0)$ be its restriction to the fixed site $n = n_0$. Then $a(n_0)$ is a solution of the mKP equations. Furthermore, the series

$$ h = a(n_0) - a_0(n_0), \tag{1.22} $$
$$ \tilde h = a(n_0+1) - a_0(n_0+1) \tag{1.23} $$

are two solutions of the KP hierarchy related by an elementary Darboux transformation.

These two theorems prove that the Darboux recursion relation (1.20) allows one to exchange locality in n and nonlocality in x with nonlocality in n and locality in x. In this way, we pass from the mKP hierarchy to the discrete KP hierarchy, and vice versa. By a slight change of perspective, this connection can be described in a way that emphasizes the central role played by the DKP equations. To this end we consider the space of sequences (h(x, n), a(x, n)) of Laurent series, and in this space we define the equations

$$ \partial_{t_j} h(n) = \partial_x H^{(j)}(n), \qquad \partial_{t_j} a(n) = a(n)\,\bigl(H^{(j)}(n+1) - H^{(j)}(n)\bigr). \tag{1.24} $$

Furthermore, we consider the submanifold $\tilde S_l$ defined by the constraints

$$ z^l a(n) = H^{(l+1)}(n) + \sum_{m=0}^{l} a_m(n)\,H^{(l-m)}(n), \qquad \mu(h(n+1), a(n+1)) = \sigma(h(n), a(n)). \tag{1.25} $$

We claim (but we do not prove in this paper) that the submanifolds $\tilde S_l$ are invariant for Eqs. (1.24), and that the discrete KP equations are the restriction of Eqs. (1.24) to $\tilde S_0$. So, we get a picture where the KP, mKP, and discrete KP equations are all obtained from the DKP equations: the KP equations by projection, and the mKP and dKP equations by suitable restrictions.

Of course many results related to the Miura maps, Darboux transformations, mKP equations, discrete KP equations, and other aspects of the KP theory touched upon in this paper have been extensively investigated by several authors. The Miura map for the KP equations was introduced by Konopelchenko in [12] and by Jimbo and Miwa in [11]. The Darboux transformations have been studied both from the point of view of the intertwining of differential operators and of Sato’s Grassmannian [22]. The relation between KP and the Toda lattice (discrete KP) has been pointed out in [21, 19, 20] (see also [18] for the AKNS case). More information on the discrete KP hierarchy can be found in [15]. The possibility of splitting the mKP equations into the KP equations and an auxiliary equation has been recently noticed by Kupershmidt [16]. However, from those contributions one gets, in our opinion, only a fragmented picture of the whole theory. We hope that the point of view of Darboux coverings may help to unify these results.
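
To make the definitions (1.5), (1.7) and (1.10) concrete, the lowest currents can be computed by hand. The following is a sketch obtained directly from those definitions by expanding in powers of z; it is an illustration and is not taken from the paper.

```latex
% Iterates of h = z + h_1 z^{-1} + h_2 z^{-2} + ...
\[
h^{(1)} = h, \qquad
h^{(2)} = h_x + h^2 = z^2 + 2h_1 + (h_{1,x} + 2h_2)\,z^{-1} + O(z^{-2}).
\]
% Currents: remove the z^0 term so that H^{(j)} = z^j + O(z^{-1})
\[
H^{(1)} = h, \qquad
H^{(2)} = h^{(2)} - 2h_1 = z^2 + (h_{1,x} + 2h_2)\,z^{-1} + O(z^{-2}).
\]
% Coefficient of z^{-1} in the conservation laws (1.10)
\[
\partial_{t_1} h_1 = \partial_x h_1, \qquad
\partial_{t_2} h_1 = \partial_x \bigl( h_{1,x} + 2h_2 \bigr).
\]
```

The first law identifies $t_1$ with x. The second is the first nontrivial equation of the hierarchy and couples $h_1$ to $h_2$; in the notation of (1.17) it also shows that $p^2_0[h] = -2h_1$.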

2. KP, mKP, and Discrete KP Hierarchy

In this paper the KP equations, the modified KP equations, and the discrete KP equations are regarded as infinite systems of conservation laws. The KP and mKP equations are written as partial differential equations

$$ \partial_{t_j} h = \partial_x H^{(j)}, \qquad \partial_{t_j} a = \partial_x A^{(j)} $$

on Laurent series

$$ h(z) = z + \sum_{j\ge 1} h_j z^{-j}, \qquad a(z) = z + \sum_{k\ge 0} a_k z^{-k}, $$

whose coefficients are functions of a space variable x. The discrete KP equations are written as difference equations

$$ \frac{\partial a(n)}{\partial t_j} = a(n)\,\bigl(K^{(j)}(n+1) - K^{(j)}(n)\bigr) $$

on a Laurent series

$$ a(n) = z + \sum_{k\ge 0} a_k(n)\, z^{-k} $$

whose coefficients are functions of the discrete variable n. Here, the currents $H^{(j)}$, $A^{(j)}$, and $K^{(j)}(n)$ have been defined in the previous section. Although non-standard, these definitions are natural from the point of view of the bihamiltonian approach to soliton equations [3, 4, 5, 6]. In this section we shall quickly review the main properties of these equations.

We begin with the KP equations. The first remark concerns the possibility of writing the KP equations as a system of ordinary differential equations of Riccati type. We explain this point in detail, since it allows us to display a circle of ideas which will be repeatedly used afterwards. In the space of Laurent series we associate with each point h the subspace

$$ H_+ = \langle h^{(0)}, h^{(1)}, \ldots \rangle $$

spanned by the nonnegative Faà di Bruno iterates of h. We remark that this space is generated by the powers $(\partial + h)^k$ acting on the zeroth order iterate $h^{(0)} = 1$. Furthermore, we notice that the KP equations are the commutativity conditions of the differential operators $(\partial + h)$ and $(\partial_{t_j} + H^{(j)})$. Since the last operator transforms $h^{(0)}$ into $H^{(j)}$, which belongs to $H_+$, we conclude that $H_+$ is an invariant subspace for this operator, i.e.,

$$ \bigl(\partial_{t_j} + H^{(j)}\bigr)(H_+) \subset H_+ . $$

Let us notice now that the set of currents $\{H^{(i)}\}_{i\ge 0}$ forms a basis of $H_+$. Therefore, we can expand the vector $(\partial_{t_j} + H^{(j)})H^{(i)}$ on such a basis. One then gets

$$ \frac{\partial H^{(i)}}{\partial t_j} + H^{(j)} H^{(i)} = H^{(i+j)} + \sum_{l=1}^{j} H^{i,l}\, H^{(j-l)} + \sum_{l=1}^{i} H^{j,l}\, H^{(i-l)}, \tag{2.1} $$

where $H^{i,l}$ is the coefficient of $z^{-l}$ of $H^{(i)}$. These equations are a system of ordinary differential equations of Riccati type on the currents $H^{(j)}$. They can be regarded as an infinite jet-bundle extension of the KP equations, obtained by replacing the unknown function h and its spatial derivatives $h_x, h_{xx}, \ldots$, of all orders, with the currents $H^{(j)}$. This extension is completely equivalent to the given KP equations. To recover them, we have to project back the Riccati system (2.1) onto $h = H^{(1)}$, by using the equation relative to the time $t_1 = x$, namely,

$$ \frac{\partial H^{(i)}}{\partial x} + h\,H^{(i)} = H^{(i+1)} + \sum_{l=1}^{i} h_l\, H^{(i-l)} + H^{i,1}, $$

to compute all the currents $(H^{(2)}, H^{(3)}, \ldots)$ as differential polynomials in h. It can be easily shown that they coincide with the currents introduced in the previous section. Then we notice that the symmetry in i and j of the Riccati Equations (2.1) entails

$$ \frac{\partial H^{(i)}}{\partial t_j} = \frac{\partial H^{(j)}}{\partial t_i}. \tag{2.2} $$

Thus, in particular, we get

$$ \partial_{t_j} h = \partial_x H^{(j)}, $$

proving our claim that the KP equations are a necessary consequence of the Riccati Equation (2.1). In closing, we remark that other “projections” of the Riccati equations are possible, leading to equations which are of interest in the so-called “fractional KdV hierarchies” [7].

The second remark concerns the Lax representation of the KP equations. It is constructed by introducing the negative Faà di Bruno iterates of h. For j = −1, −2, . . . we denote by $h^{(j)}$ the Laurent series having the asymptotic expansion $h^{(j)} = z^j + O(z^{j-1})$ when z → ∞, which solve backwards the Faà di Bruno recursion relations (1.7). We define

$$ H_- = \langle h^{(-1)}, h^{(-2)}, \ldots \rangle, $$

which is the vector space spanned by the negative iterates of h. We notice that $H_+$ and $H_-$ define a splitting of the space of Laurent series, and we denote by $\pi_+$ and by $\pi_-$ the associated projections. By means of these projections the definition of the currents $H^{(j)}$ reads as $H^{(j)} = \pi_+(z^j)$. Using the negative Faà di Bruno iterates of h, we expand the Laurent series z in the form

$$ z = h^{(1)} + \sum_{j\ge 1} u_j\, h^{(-j)}. $$
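
As an illustration of this change of variables, the first coefficients $u_j$ can be found by hand. The sketch below uses only the recursion (1.7) solved backwards and the normalization $h^{(j)} = z^j + O(z^{j-1})$; the ansatz coefficients $b_2$, $b_3$ are auxiliary symbols introduced here and do not appear in the paper.

```latex
% Solve (\partial_x + h) h^{(-1)} = h^{(0)} = 1
% with the ansatz h^{(-1)} = z^{-1} + b_2 z^{-2} + b_3 z^{-3} + ...
\[
(\partial_x + h)\,h^{(-1)}
  = 1 + b_2\, z^{-1} + (b_3 + h_1 + b_{2,x})\, z^{-2} + O(z^{-3})
\;\Longrightarrow\;
b_2 = 0, \quad b_3 = -h_1 .
\]
% Compare coefficients in z = h^{(1)} + u_1 h^{(-1)} + u_2 h^{(-2)} + ...,
% using h^{(-1)} = z^{-1} + O(z^{-3}) and h^{(-2)} = z^{-2} + O(z^{-3})
\[
z^{-1}:\; h_1 + u_1 = 0, \qquad
z^{-2}:\; h_2 + u_2 = 0 .
\]
```

Hence $u_1 = -h_1$ and $u_2 = -h_2$. The higher $u_j$ follow in the same way, one order in $z^{-1}$ at a time, each $u_j$ appearing with coefficient one at order $z^{-j}$.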

In this way we introduce a new set of variables $\{u_j\}_{j\ge 1}$ which are related to the coefficients $\{h_j\}_{j\ge 1}$ by an invertible transformation. They can be used to define the pseudodifferential operator

$$ L_{KP} = \partial + \sum_{j\ge 1} u_j\, \partial^{-j}. $$

It can be shown [6] that the KP equations admit the Lax representation

$$ \frac{\partial L_{KP}}{\partial t_j} = [L_{KP}, (L^j_{KP})_+], $$

where $(L^j_{KP})_+$ denotes, as usual, the differential part of the pseudodifferential operator $L^j_{KP}$.

Almost the same remarks hold for the modified and the discrete KP equations. We limit ourselves to giving the results, without repeating the arguments. The Riccati equations associated with the mKP equations are

$$ \frac{\partial A^{(i)}}{\partial t_j} + A^{(j)} A^{(i)} = A^{(i+j)} + \sum_{l=0}^{j-1} A^{i,l}\, A^{(j-l)} + \sum_{l=0}^{i-1} A^{j,l}\, A^{(i-l)}. \tag{2.3} $$

Those associated with the discrete KP equations are

$$ \frac{\partial K^{(i)}(n)}{\partial t_j} + K^{(j)}(n) K^{(i)}(n) = K^{(i+j)}(n) + \sum_{l=1}^{j} K^{i,l}(n)\, K^{(j-l)}(n) + \sum_{l=1}^{i} K^{j,l}(n)\, K^{(i-l)}(n). \tag{2.4} $$

We remark that these equations are local in n, since all the currents are evaluated at the same site n. Thus, by passing from the discrete KP equations to their associated Riccati form, we have localized the equations. Furthermore, we remark that Eq. (2.4), for each n, coincides with the Riccati form of KP given in (2.1). This clearly points out the close connection between the two equations, a subject which will be thoroughly discussed in Sect. 6.

We end this section by explaining the Lax representation of the mKP and discrete KP equations. As in the case of the KP equations, one first introduces the negative Faà di Bruno iterates of a(z) and a(n). Then one expands z on the Faà di Bruno bases attached to the points a(z) and a(n) as follows:

$$ z = a^{(1)} + \sum_{k\ge 0} v_k\, a^{(-k)}, $$
$$ z = a(n) + \sum_{k\ge 0} q_k(n)\, a^{(-k)}(n). $$

Finally one introduces the Lax operators

$$ L_{mKP} = \partial + \sum_{k\ge 0} v_k\, \partial^{-k}, $$
$$ L_{dKP} = \xi + \sum_{k\ge 0} q_k\, \xi^{-k}, $$

where ξ is the shift operator [15]. They allow one to put the mKP and discrete KP equations in the classical Lax form

$$ \frac{\partial L_{mKP}}{\partial t_j} = [L_{mKP}, (L^j_{mKP})_{\ge 1}], \qquad \frac{\partial L_{dKP}}{\partial t_j} = [L_{dKP}, (L^j_{dKP})_+], $$

where the subscript ≥ 1 means that we are discarding the pseudo-differential operator terms of order (strictly) less than 1.

3. The Darboux–KP Hierarchy

The DKP equations have been introduced in Sect. 1 as a system of equations on pairs of Laurent series $h(z)$ and $a(z)$. They are defined by
\[ \partial_{t_j} h = \partial_x H^{(j)}, \qquad \partial_{t_j} a = a(\tilde H^{(j)} - H^{(j)}), \]
where $\tilde H^{(j)}$ is the current $H^{(j)}$ evaluated at the point
\[ \tilde h = \sigma(h, a) = h + \partial_x \log a. \tag{3.1} \]

It is clear from this definition that, if $(h(t), a(t))$ is any solution of the DKP equations, then $\tilde h(t)$ is a new solution of the KP equations since
\[ \partial_{t_j} \tilde h = \partial_{t_j} h + \partial_x(\tilde H^{(j)} - H^{(j)}) = \partial_x \tilde H^{(j)}. \]
Thus the conditions defining a Darboux covering are trivially verified in this case, by choosing $\sigma$ as in Eq. (3.1) and setting $\mu(h, a) = h$. According to the general view illustrated in the introduction, our strategy will be now to replace the study of the KP hierarchy by that of its covering.

First of all, we observe that the DKP flows commute, as is easily shown using (2.2). Then we study the reductions of the DKP equations. In the space of pairs of Laurent series $(h, a)$, we consider the family of submanifolds $S_l$, indexed by a nonnegative integer $l$, whose points obey the constraint $\pi_-(z^{l} a) = 0$. This equation means that the Laurent series $z^{l} a$ belongs to the subspace $H_+$ spanned by the nonnegative Faà di Bruno iterates of $h$. Since the currents $H^{(j)}$ form a basis in $H_+$, the series $z^{l} a$ may be expanded in the form
\[ z^{l} a = \sum_{m=0}^{l+1} c^{l}_{m}[a] H^{(m)}, \]
and the coefficients $c^{l}_{m}$ can be easily computed by comparing the coefficients of $z^{k}$ on both sides of this equation. We get
\[ z^{l} a = H^{(l+1)} + \sum_{m=0}^{l} a_m H^{(l-m)}, \tag{3.2} \]
or
\[ a_{l+k} = H^{l+1,k} + \sum_{m=0}^{l} a_m H^{l-m,k}, \tag{3.3} \]
showing that the submanifold $S_l$ may be parametrized by the components of the Laurent series $h(z)$ and by the first $(l+1)$ components $(a_0, a_1, \dots, a_l)$ of $a(z)$. Eqs. (3.2) are the parametric equations of the submanifold $S_l$. We show that $S_l$ is an invariant submanifold for the DKP equations.

Theorem 7. The DKP vector fields are tangent to the submanifolds $S_l$.

Proof. We have to prove that
\[ \partial_{t_j}\Bigl( z^{l} a - H^{(l+1)} - \sum_{m=0}^{l} a_m H^{(l-m)} \Bigr) = 0 \tag{3.4} \]
on $S_l$. As a first step, we show that
\[ (\partial_{t_j} + H^{(j)})(z^{l} a) \in H_+ \quad \text{on } S_l. \tag{3.5} \]
This does not follow from the fact that $(\partial_{t_j} + H^{(j)})(H_+) \subset H_+$, since $z^{l} a \in H_+$ only at the points of $S_l$. In order to prove (3.5), we remark that the definition (3.1) of $\tilde h$ can be read as an intertwining relation
\[ a(\partial + \tilde h) = (\partial + h) a \]
between the differential operators $\partial + h$ and $\partial + \tilde h$ associated with the points $h$ and $\tilde h$ respectively. By iteration we get $a(\partial + \tilde h)^{k} = (\partial + h)^{k} a$, and, therefore,
\[ (z^{l} a)(\partial + \tilde h)^{k} = (\partial + h)^{k}(z^{l} a). \tag{3.6} \]
Now $z^{l} a \in H_+$ on $S_l$, thus $(\partial + h)^{k}(z^{l} a) \in H_+$ on $S_l$ for all $k \ge 0$. Therefore (3.6) shows that the operator of multiplication by $z^{l} a$ maps the subspace $\tilde H_+$ associated with the point $\tilde h$ into $H_+$:
\[ (z^{l} a)(\tilde H_+) \subset H_+. \]
Now (3.5) follows from
\[ (\partial_{t_j} + H^{(j)})(z^{l} a) = z^{l}(\partial_{t_j} + H^{(j)}) a = z^{l}(\tilde H^{(j)} a) = (z^{l} a)\tilde H^{(j)} \in H_+. \]
Therefore on $S_l$ we have
\[ z^{l} \partial_{t_j} a + H^{(j)}(z^{l} a) \in H_+, \]
or
\[ z^{l} \partial_{t_j} a + H^{(j)}\Bigl( H^{(l+1)} + \sum_{m=0}^{l} a_m H^{(l-m)} \Bigr) \in H_+. \]
Recalling that $H^{(m)} H^{(n)} + \partial_{t_m} H^{(n)} \in H_+$ for all $m, n$, we can write
\[ z^{l} \partial_{t_j} a - \partial_{t_j} H^{(l+1)} - \sum_{m=0}^{l} a_m \partial_{t_j} H^{(l-m)} \in H_+. \]
After a look at the positive powers of $z$, we get
\[ z^{l} \partial_{t_j} a - \partial_{t_j} H^{(l+1)} - \sum_{m=0}^{l} a_m \partial_{t_j} H^{(l-m)} = \sum_{m=0}^{l} (\partial_{t_j} a_m) H^{(l-m)} \quad \text{on } S_l, \]
so that (3.4) is proved.
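The intertwining relation used in the proof of Theorem 7 can be checked directly. The following is a short worked derivation, assuming only the definition $\tilde h = h + \partial_x \log a$ from (3.1) and an arbitrary test series $g$ (the symbol $g$ is introduced here for illustration and is not part of the original text):

```latex
% Apply (\partial + h) a to a test series g, using the Leibniz rule:
\begin{align*}
(\partial + h)(a g)
  &= a_x g + a g_x + h a g \\
  &= a\Bigl( g_x + \bigl(h + \tfrac{a_x}{a}\bigr) g \Bigr) \\
  &= a(\partial + \tilde h) g ,
  \qquad \tilde h = h + \partial_x \log a .
\end{align*}
% Iterating, each factor (\partial + h) is moved past a in turn:
\begin{align*}
(\partial + h)^{k} a
  = (\partial + h)^{k-1} a (\partial + \tilde h)
  = \dots
  = a (\partial + \tilde h)^{k} .
\end{align*}
```

Multiplying both sides by $z^{l}$, which commutes with $\partial = \partial_x$, gives a relation of the form (3.6).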


A simple (but non trivial) consequence of this result is that the projections $\mu(S_l \cap S_{l+n})$ of the intersection of any pair of these submanifolds are invariant submanifolds of the KP equations. Therefore they define possible reductions of these equations. It can be shown [2] that these reductions are those which have been called “rational reductions” of the KP equations by Krichever [14].

A second consequence of the previous result is that it provides an explicit algorithm to construct solutions of the KP equations by means of a Darboux transformation. The algorithm can be divided into two parts. The first concerns the “auxiliary system” described in the introduction. Let us focus our attention on a specific equation, corresponding to the time $t_n$. Let $h$ be a solution of this equation. Then let us choose the invariant submanifold $S_l$. By means of the parametric equation of this submanifold and of the definition of $\tilde h$, we compute:

1. the components $(a_{l+1}, a_{l+2}, \dots, a_{l+n-1})$ of $a(z)$,
2. the components $(\tilde h_1, \tilde h_2, \dots, \tilde h_{l+n})$ of $\tilde h(z)$,
3. the components $(\tilde H^{n,1}, \dots, \tilde H^{n,l+1})$ of $\tilde H^{(n)}$.

Then we write the reduced DKP equations
\[ \frac{\partial a_k}{\partial t_n} = (\tilde H^{n,k+1} - H^{n,k+1}) + \sum_{m=0}^{k-1} a_m (\tilde H^{n,k-m} - H^{n,k-m}) \tag{3.7} \]

for the first $(l+1)$ components $(a_0, a_1, \dots, a_l)$. This is the auxiliary system we were looking for. Because of Theorem 7 the remaining equations can be disregarded since they are differential consequences of the previous ones. Suppose now that we are able to find a solution $(a_0(t), a_1(t), \dots, a_l(t))$ of the system (3.7). Then, we can use again the parametric Eqs. (3.3) to recover the full Laurent series $a(t)$, and the definition of $\tilde h$ to obtain the second solution we were looking for. Thus, the construction of a new solution of the $n$th KP equation has been “reduced” to the problem of solving the system of $(l+1)$ nonlinear partial differential Eqs. (3.7) on the free parameters $(a_0, a_1, \dots, a_l)$ on $S_l$. In the next section we shall discuss in detail this problem, in the simplest case $l = 0$.

4. Elementary Darboux and the Modified KP Hierarchy

In this section we give a detailed picture of the DKP hierarchy’s restriction to the simplest invariant submanifold $S_0$. The equations of this submanifold are $a = h + a_0$. Therefore, $S_0$ can be parametrized either by $a(z)$ or by $(h(z), a_0)$. We choose the former parametrization, and we show that the reduced DKP equations are the modified KP equations on $a(z)$. To prove this result, we evaluate the restrictions $\hat\mu$ and $\hat\sigma$ of the maps $\mu$ and $\sigma$ to $S_0$. We get
\[ h = \hat\mu(a) = a - a_0 \tag{4.1} \]
and
\[ \tilde h = \hat\sigma(a) = (a^{(2)} - a_0 a^{(1)})/a. \tag{4.2} \]
Let us now compute the currents
\[ H^{(j)} = h^{(j)} + \sum_{l=0}^{j-2} p^{j}_{l}[h] h^{(l)} \]
and
\[ \tilde H^{(j)} = \tilde h^{(j)} + \sum_{l=0}^{j-2} p^{j}_{l}[\tilde h] \tilde h^{(l)}, \]
associated with $h$ and $\tilde h$, respectively. By using Eqs. (4.1) and (4.2) we can express these currents as functions of the Laurent series $a(z)$ and of their currents
\[ A^{(j)} = a^{(j)} + \sum_{l=1}^{j-1} q^{j}_{l}[a] a^{(l)}. \]

Lemma 8. The currents $A^{(j)}$, $H^{(j)}$ and $\tilde H^{(j)}$ associated with the series $a$, $h$, and $\tilde h$, respectively, are connected by the relations
\[ H^{(j)} = A^{(j)} - A^{j,0}, \tag{4.3} \]
\[ a \tilde H^{(j)} = A^{(j+1)} + \sum_{l=0}^{j-1} a_l A^{(j-l)}, \tag{4.4} \]
where $A^{j,0}$ denotes the zeroth-order term in the Laurent series of $A^{(j)}$. Furthermore,
\[ A^{j,0} = -\sum_{k=0}^{j} p^{j}_{k}[h] (-a_0)^{(k)}. \]

Proof. We set $v = -a_0$ and we denote by $v^{(k)}$ the Faà di Bruno iterates of $v$ defined as usual:
\[ v^{(0)} = 1, \qquad v^{(k+1)} = (\partial_x + v) v^{(k)}, \quad \forall k \ge 0. \]
By a straightforward induction one first proves that the relation $h = \mu(a)$ entails that
\[ h^{(j)} = \sum_{l=0}^{j} \binom{j}{l} v^{(l)} a^{(j-l)}. \]
In the same way, one proves that the relation $\tilde h = \sigma(a)$ entails
\[ a \tilde h^{(j)} = \sum_{l=0}^{j} \binom{j}{l} v^{(l)} a^{(j+1-l)}. \tag{4.5} \]
We then get

\[ H^{(j)} = \sum_{l=0}^{j} p_l h^{(l)} = \sum_{l=0}^{j} \sum_{k=0}^{l} p_l \binom{l}{k} v^{(k)} a^{(l-k)} = \sum_{m=0}^{j} \Bigl( \sum_{k=0}^{j-m} \binom{m+k}{k} p_{m+k} v^{(k)} \Bigr) a^{(m)}. \]
We now set
\[ q_m[a] \overset{\mathrm{def}}{=} \sum_{k=0}^{j-m} \binom{m+k}{k} p_{m+k} v^{(k)}. \]
Hence,
\[ H^{(j)} - q_0[a] z^{0} = \sum_{m=1}^{j} q_m[a] a^{(m)}. \tag{4.6} \]

Notice, however, that by the definition of the coefficients $p_l[h]$, the expansion of $H^{(j)}$ is $z^{j} + O(z^{-1})$. On the other hand the right hand side of Eq. (4.6) is exactly the choice of coefficients $q_l$ such that
\[ \sum_{m=1}^{j} q_m a^{(m)} = z^{j} + O(z^{0}), \]
when $z \to \infty$. By the uniqueness of such coefficients it follows that the right-hand side of Eq. (4.6) is exactly $A^{(j)}$. In conclusion, we have just shown that if $a = h + a_0 = h - v$, then the currents $A^{(j)}$ and $H^{(j)}$ are related by $A^{(j)} = H^{(j)} + A^{j,0}$, where $A^{j,0}$ is given by
\[ A^{j,0} = -q_0[a] = -\sum_{k=0}^{j} p_k[h] v^{(k)}. \]
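The “straightforward induction” behind the binomial expansion of $h^{(j)}$ in the proof of Lemma 8 can be sketched as follows. This is a worked step under the stated definitions only ($h = a + v$ with $v = -a_0$, $a^{(m+1)} = (\partial_x + a)a^{(m)}$, $v^{(l+1)} = (\partial_x + v)v^{(l)}$), not a reproduction of the authors' own computation:

```latex
% Induction step: assume the formula for h^{(j)} and apply (\partial_x + h) = (\partial_x + a + v).
\begin{align*}
h^{(j+1)}
  &= (\partial_x + a + v) \sum_{l=0}^{j} \binom{j}{l} v^{(l)} a^{(j-l)} \\
  &= \sum_{l=0}^{j} \binom{j}{l}
     \Bigl[ \bigl((\partial_x + v) v^{(l)}\bigr) a^{(j-l)}
          + v^{(l)} \bigl((\partial_x + a) a^{(j-l)}\bigr) \Bigr] \\
  &= \sum_{l=0}^{j} \binom{j}{l}
     \Bigl[ v^{(l+1)} a^{(j-l)} + v^{(l)} a^{(j+1-l)} \Bigr] \\
  &= \sum_{l=0}^{j+1} \Bigl[ \binom{j}{l-1} + \binom{j}{l} \Bigr] v^{(l)} a^{(j+1-l)}
   = \sum_{l=0}^{j+1} \binom{j+1}{l} v^{(l)} a^{(j+1-l)} .
\end{align*}
% Base case: h^{(1)} = h = a + v = v^{(0)} a^{(1)} + v^{(1)} a^{(0)}.
```

The second line uses the Leibniz rule to split $\partial_x$ between the two factors, and the last line is Pascal's rule with the convention $\binom{j}{-1} = \binom{j}{j+1} = 0$.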

Concerning the currents $\tilde H^{(j)}$, we notice that the relations (4.5) mean that $a \tilde H^{(j)}$ belongs to the vector subspace spanned by the Faà di Bruno iterates of $a$, i.e., $\langle a^{(1)}, a^{(2)}, \dots \rangle$. Therefore, the Laurent series $a \tilde H^{(j)}$ can be expanded as a linear combination of the currents $A^{(j)}$. The form of this expansion given by Eq. (4.4) immediately follows from the expansion of the coefficients of $z^{n}$ for $n = 1, 2, \dots, j+1$.

We are now ready to prove

Theorem 9. The mKP hierarchy is the restriction of the DKP hierarchy to the submanifold $S_0$.

Proof. Because of Lemma 8 we have
\[ a(\tilde H^{(j)} - H^{(j)}) = A^{(j+1)} + \sum_{l=0}^{j-1} a_l A^{(j-l)} - a(A^{(j)} - A^{j,0}). \]
Furthermore, by the definition of the currents $A^{(j)}$, we have
\[ \partial_x A^{(j)} + a A^{(j)} = A^{(j+1)} + \sum_{l=0}^{j-1} a_l A^{(j-l)} + A^{j,0} a. \]
Consequently, we have
\[ a(\tilde H^{(j)} - H^{(j)}) = \partial_x A^{(j)}, \]
concluding the proof of the theorem.

This result gives a new interpretation of the link between the KP and the mKP hierarchies. The KP equations are a projection of the DKP ones, while the mKP equations are a restriction of the same equations to the invariant submanifold $S_0$. Furthermore, it allows us to give fairly simple proofs of Theorems 2 and 3. Indeed, Theorem 2 is proved by noticing that $S_0$ is a Darboux subcovering of the KP equations. To prove Theorem 3 it is enough to remark that, according to Eq. (4.3), each equation $\partial_{t_j} a = \partial_x A^{(j)}$ of the mKP hierarchy splits into two equations
\[ \partial_{t_j} h = \partial_x H^{(j)} \quad \text{and} \quad \partial_{t_j} a_0 = \partial_x A^{j,0}. \tag{4.7} \]
This remark shows that Eq. (4.7) is the “auxiliary equation” relative to the restriction to $S_0$. By solving this equation we can generate new solutions of the KP hierarchy by means of “elementary” Darboux transformations, as stated in Theorem 2.

5. The Reduction to the Gelfand–Dickey Hierarchy

As previously noticed in the introduction, the geometric constructions based on the concept of Darboux covering have a natural behavior with respect to the process of restriction to an invariant submanifold. It is enough to restrict the Miura map $\mu : N \to M$ and the Darboux map $\sigma : N \to M$ to any invariant submanifold $S \subset N$ of $Y$ to get, at the same time, the invariant submanifold $S' = \mu(S) \subset M$ of $X$, and the associated Miura and Darboux maps. In this section we give an example of this procedure, by considering the Gelfand–Dickey equations. Inside the invariant submanifold $S_0$, whose equation is
\[ h = a - a_0, \tag{5.1} \]
we consider the submanifold $T_n$ defined by the additional constraint
\[ A^{(n)} = z^{n}. \tag{5.2} \]
We notice that this submanifold is an invariant submanifold of the DKP equations since $\partial_{t_j}(A^{(n)} - z^{n}) = 0$ on $T_n$. Indeed, on $S_0$ the DKP equations coincide with the mKP equations, and therefore

\[ \partial_{t_j}(A^{(n)} - z^{n}) = \partial_{t_n} A^{(j)} = 0, \]
since we have $\partial_{t_n} a = 0$ on $T_n$ by the constraint (5.2). The restrictions to $T_n$ of the mKP equations are the modified GD equations. Consider now the projection $T_n' = \mu(T_n)$ of $T_n$ onto the phase space of the KP equations. Since $h$ and $a$ are related by Eq. (5.1), from Lemma 8 we immediately get the equation $H^{(n)} = z^{n}$ for $T_n'$. Therefore $\partial_{t_n} h = 0$ on $T_n'$, and $T_n'$ is the submanifold considered in the Gelfand–Dickey theory. The restrictions to $T_n'$ of the KP equations are the GD equations. Furthermore, since on $T_n$
\[ \partial_{t_n} a = a(\tilde H^{(n)} - H^{(n)}) = 0, \]
we also see that
\[ \tilde H^{(n)} = z^{n}, \]
and, therefore, $T_n'' = \sigma(T_n)$ coincides with $T_n'$. Then we conclude that the restrictions to $T_n$ of the maps
\[ h = \hat\mu(a) = a - a_0, \qquad \tilde h = \hat\sigma(a) = h + \partial_x \log a \]
give the Miura map and the Darboux map of the GD equations.

Example 5.1 (KdV). The simplest constraint is $A^{(2)} = z^{2}$. According to the definition of the mKP currents, this amounts to setting
\[ a_x + a^{2} - 2 a_0 a = z^{2} \]
on the Laurent series $a(z)$. This constraint allows one to compute the coefficients $a_j$ of $a(z)$, for $j \ge 1$, as differential polynomials of the first coefficient $a_0$. In particular we have
\[ a_1 = \tfrac{1}{2}(-a_{0x} + a_0^{2}). \]
In the same way the constraint $H^{(2)} = z^{2}$ allows one to compute all the coefficients $h_j$, for $j \ge 2$, as differential polynomials of the first coefficient $h_1$. The restrictions to $T_2$ of the mKP equations are called mKdV equations; those of the KP equations to $T_2'$ are called the KdV equations. The restrictions of the maps $\hat\mu$ and $\hat\sigma$ to $T_2$ are given by
\[ h_1 = a_1 = \tfrac{1}{2}(-a_{0x} + a_0^{2}), \qquad \tilde h_1 = h_1 + a_{0x}. \]
If we use the variables $v = -a_0$ and $u = 2h_1$, we get the usual Miura map and Darboux map
\[ u = v_x + v^{2}, \qquad \tilde u = u - 2 v_x \]
of the KdV theory.
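As a quick numerical illustration of the Miura and Darboux maps of Example 5.1, the sketch below evaluates $u = v_x + v^{2}$ and $\tilde u = u - 2v_x$ for one sample choice of $v$. The choice $v(x) = \tanh x$ and the function names are assumptions made here for illustration; they do not come from the paper. For this $v$ the Miura image is the constant potential $u = 1$, and the Darboux-transformed potential is $\tilde u = 1 - 2\,\mathrm{sech}^{2} x$.

```python
import math

# Hypothetical sample field (not from the paper): v(x) = tanh(x).
def v(x):
    return math.tanh(x)

def v_x(x):
    # Exact derivative of tanh: sech^2(x).
    return 1.0 / math.cosh(x) ** 2

def miura_u(x):
    # Miura map of Example 5.1: u = v_x + v^2.
    return v_x(x) + v(x) ** 2

def darboux_u(x):
    # Darboux map of Example 5.1: u~ = u - 2 v_x.
    return miura_u(x) - 2.0 * v_x(x)

if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.5):
        print(x, miura_u(x), darboux_u(x))
```

With this sample, `miura_u` returns 1 up to rounding at every point, since $\tanh^{2} x + \mathrm{sech}^{2} x = 1$, while `darboux_u` reproduces a one-soliton-shaped profile.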

Example 5.2 (Boussinesq). The next constraint is $A^{(3)} = z^{3}$, or, explicitly,
\[ (a_{xx} + 3 a a_x + a^{3}) - 3 a_0 (a_x + a^{2}) - 3(a_{0x} - a_0^{2} + a_1) a = z^{3}. \]
It allows one to compute the coefficients $a_j$, for $j \ge 2$, as differential polynomials of the first two coefficients $(a_0, a_1)$. In particular we have
\[ a_2 = -\tfrac{1}{3} a_{0xx} - a_{1x} + a_0 a_{0x} + a_0 a_1 - \tfrac{1}{3} a_0^{3}. \]
In the same way the constraint $H^{(3)} = z^{3}$ allows one to compute all the coefficients $h_j$, for $j \ge 3$, as differential polynomials of the first two coefficients $(h_1, h_2)$. The restrictions to $T_3$ of the mKP equations are called modified Boussinesq equations; those of the KP equations to $T_3'$ are called Boussinesq equations. The restriction of the map $\hat\mu$ is
\[ h_1 = a_1, \qquad h_2 = a_2 = -\tfrac{1}{3} a_{0xx} - a_{1x} + a_0 a_{0x} + a_0 a_1 - \tfrac{1}{3} a_0^{3}. \]
The restriction of the map $\hat\sigma$ is
\[ \tilde h_1 = h_1 + a_{0x}, \qquad \tilde h_2 = h_2 + (a_1 - \tfrac{1}{2} a_0^{2})_x. \]
They are the Miura and Darboux maps of the Boussinesq theory in unconventional variables. If we pass to the usual variables
\[ u_0 = 3(h_2 + h_{1x}), \quad u_1 = 3 h_1, \quad v_0 = 3(a_{0x} - a_0^{2} + a_1), \quad v_1 = 3 a_0, \]
we get the Miura map given in [13].

6. Iterated Elementary Darboux Transformations

In this section we prove that the generators $\{a(n)\}_{n \ge 0}$ of a sequence of iterated elementary Darboux transformations solve the discrete KP equations. Furthermore, we show that also the converse statement is true: If $a(n)$ is any solution of dKP, then the local value $a(n_0)$ of $a(n)$ at the site $n_0$ is a solution of mKP. Therefore, $h(n_0) = a(n_0) - a_0(n_0)$ is a solution of KP. Moreover, $h(n_0 + 1)$ and $h(n_0)$ are connected by the elementary Darboux transformation generated by $a(n_0)$.

Let $\{a(n)\}$ be a sequence of Laurent series of the form (1.4). We say that this sequence satisfies the Darboux recursion relations if
\[ \hat\mu(a(n+1)) = \hat\sigma(a(n)), \tag{6.1} \]
where the maps $\hat\mu$ and $\hat\sigma$ have been defined in Sect. 4. Let us set $h(n) = \hat\mu(a(n)) = a(n) - a_0(n)$. Then we can state

Theorem 10. The differential KP currents $H^{(j)}(n)$ associated to $h(n)$ coincide with the discrete KP currents $K^{(j)}(n)$ attached to $a(n)$.

Proof. We recall that $H^{(j)}$ is the unique linear combination
\[ H^{(j)} = h^{(j)} + \sum_{l=0}^{j-2} p^{j}_{l}[h] h^{(l)} \]
of the differential Faà di Bruno iterates of $h$ defined by
\[ h^{(0)} = 1, \qquad h^{(j+1)} = (\partial_x + h) h^{(j)}, \quad \forall j \ge 0, \]
having the asymptotic behavior
\[ H^{(j)} = z^{j} + O(z^{-1}) \tag{6.2} \]
when $z \to \infty$. Similarly, the currents $K^{(j)}(n)$ of the discrete KP equations are the unique linear combinations
\[ K^{(j)}(n) = a^{(j)}(n) + \sum_{l=0}^{j-1} r^{j}_{l}[a] a^{(l)}(n) \]
of the discrete Faà di Bruno iterates of $a(n)$ defined by
\[ a^{(0)}(n) = 1, \qquad a^{(j+1)}(n) = a(n) a^{(j)}(n+1), \quad \forall j \ge 0, \]
having the asymptotic behavior (6.2). To prove that $H^{(j)}(n) = K^{(j)}(n)$, we have to show that $a^{(j)}(n) \in H_+(n) = \langle h^{(i)}(n) \rangle_{i \ge 0}$ for all $j \ge 0$. Since $a^{(0)}(n) = 1 \in H_+(n)$ and $a^{(j+1)}(n) = a(n) a^{(j)}(n+1)$, we can prove our thesis by induction once we know that $a(n) H_+(n+1) \subset H_+(n)$. But this follows from the Darboux recursion relations (6.1), which entail
\[ a(n)(\partial_x + h(n+1)) = (\partial_x + h(n)) a(n). \]
Indeed, by recalling that $(\partial_x + h(n))(H_+(n)) \subset H_+(n)$ and that $a(n) = h(n) + a_0(n) \in H_+(n)$, we prove our claim.

Suppose now that $\{a(n)\}_{n \ge 0}$ is a sequence of solutions of the mKP equations that satisfy the Darboux recursion relations (6.1). To show that the sequence $\{a(n)\}_{n \ge 0}$ is a solution of the discrete KP equations, we recall that the pair $(h(n), a(n))$ is a solution of DKP for each solution $a(n)$ of mKP. Therefore, the sequence $\{a(n)\}$ verifies the equations
\[ \frac{\partial a(n)}{\partial t_j} = a(n)\bigl( H^{(j)}(n+1) - H^{(j)}(n) \bigr), \]
where $H^{(j)}(n)$ and $H^{(j)}(n+1)$ are the KP currents associated to the points $h(n)$ and $h(n+1)$, respectively. By Theorem 10, this sequence also verifies the discrete KP equations
\[ \frac{\partial a(n)}{\partial t_j} = a(n)\bigl( K^{(j)}(n+1) - K^{(j)}(n) \bigr). \]
We have thus proved

Theorem 11. Any sequence $\{a(n)\}$ of solutions of the mKP equations which satisfies the Darboux recursion relations is a solution of the discrete KP equations.

To prove the converse statement relating the solutions of the discrete KP to the solutions of mKP, we associate a new kind of modified currents $A^{(j)}(n)$ with any solution $a(n)$ of the dKP equations. They are defined as the unique linear combination
\[ A^{(j)}(n) = a^{(j)}(n) + \sum_{l=1}^{j-1} s^{j}_{l}[a] a^{(l)}(n) \]
of the discrete Faà di Bruno iterates $a^{(j)}(n)$ of $a(n)$, having the asymptotic behavior
\[ A^{(j)}(n) = z^{j} + O(z^{0}) \]
as $z \to \infty$. Clearly, $A^{(j)}(n) = K^{(j)}(n) + A^{j,0}(n)$ and in particular $A^{(1)}(n) = a(n)$. The important point is that $\{A^{(j)}(n)\}$, for $n$ fixed, satisfies the Riccati Eqs. (2.3) associated with the mKP equations.

Proposition 12. The modified currents $A^{(j)}(n)$ associated with any solution $a(n)$ of the discrete KP equations satisfy the system of Riccati equations
\[ \frac{\partial A^{(j)}(n)}{\partial t_k} + A^{(j)}(n) A^{(k)}(n) = A^{(j+k)}(n) + \sum_{l=0}^{k-1} A^{j,l}(n) A^{(k-l)}(n) + \sum_{l=0}^{j-1} A^{k,l}(n) A^{(j-l)}(n), \tag{6.3} \]
which characterize the currents associated with the mKP equations.

Proof. Let us denote by $H_+^{(1)}(n) = \langle a^{(j)}(n) \rangle_{j \ge 1}$ the linear space spanned by the (strictly) positive discrete Faà di Bruno iterates of $a(n)$. We notice that the dKP hierarchy can be written in the operator form
\[ \bigl( \partial_{t_j} + K^{(j)}(n) \bigr) a(n) = a(n) \bigl( \partial_{t_j} + K^{(j)}(n+1) \bigr). \]
This implies
\[ \bigl( \partial_{t_j} + A^{(j)}(n) \bigr) a(n) = a(n) \bigl( \partial_{t_j} + A^{(j)}(n+1) \bigr) + \bigl( A^{j,0}(n) - A^{j,0}(n+1) \bigr) a(n). \]
Using this formula we can prove by induction that
\[ \Bigl( \frac{\partial}{\partial t_k} + A^{(k)}(n) \Bigr) H_+^{(1)}(n) \subset H_+^{(1)}(n). \]
Indeed $\bigl( \partial_{t_j} + A^{(j)}(n) \bigr) 1 = A^{(j)} \in H_+^{(1)}(n)$ and
\[ \bigl( \partial_{t_j} + A^{(j)}(n) \bigr) a^{(k+1)}(n) = \bigl( \partial_{t_j} + A^{(j)}(n) \bigr) a(n) a^{(k)}(n+1) \]
\[ = a(n) \bigl( \partial_{t_j} + A^{(j)}(n+1) \bigr) a^{(k)}(n+1) + \bigl( A^{j,0}(n) - A^{j,0}(n+1) \bigr) a(n) a^{(k)}(n+1) \in a(n) H_+^{(1)}(n+1) \subset H_+^{(1)}(n). \]
Hence, $\bigl( \partial_{t_j} + A^{(j)}(n) \bigr) A^{(k)}(n) \in H_+^{(1)}(n)$, which implies Eq. (6.3).

From the Riccati Equation (6.3) we can immediately obtain the mKP hierarchy by a process of “spatialization”. Let us set $x = t_1$ and $n = n_0$, and let us denote $a(n_0)$ simply by $a$. We use Eqs. (6.3) with $k = 1$ to express all the currents $A^{(j)}$ as $x$-differential polynomials in $a$:
\[ A^{(j+1)} = \frac{\partial A^{(j)}}{\partial x} + a A^{(j)} - A^{j,0} a - \sum_{l=0}^{j-1} a_l A^{(j-l)}. \tag{6.4} \]
These Laurent series are the mKP currents associated with $a$, since they are in the span of the differential Faà di Bruno iterates $\{a^{(j)}\}_{j \ge 1}$ of $a$ defined by $a^{(0)} = 1$, $a^{(j+1)} = (\partial_x + a) a^{(j)}$. The proof is a simple induction. First, we notice that $A^{(1)} = a$ is a current for mKP. Then, we assume that the same statement is true for the currents $A^{(l)}$, with $l \le j$. This means that the Laurent series $A^{(l)}$ belong to the linear span $H_+^{(1)} = \langle a^{(i)} \rangle_{i \ge 1}$ for $l = 1, \dots, j$. Finally, we notice that the same is true for $A^{(j+1)}$ according to Eq. (6.4), since $(\partial_x + a)(H_+^{(1)}) \subset H_+^{(1)}$. This remark allows us to prove

Theorem 13. The localization $a = a(n_0)$, at any site $n_0$, of a solution $a(n)$ of the discrete KP hierarchy is a solution of the mKP hierarchy. Furthermore, $h(n_0) = a(n_0) - a_0(n_0)$ is a solution of the KP equations. Solutions $h(n_0)$ and $h(n_0 + 1)$ related to adjacent sites are connected by an elementary Darboux transformation
\[ h(n_0 + 1) = h(n_0) + \partial_x \log a(n_0). \]

Proof. It is enough to remark that the Riccati system (6.3) entails
\[ \frac{\partial A^{(j)}}{\partial t_k} = \frac{\partial A^{(k)}}{\partial t_j}. \]
Therefore:
\[ \frac{\partial a}{\partial t_j} = \frac{\partial A^{(1)}}{\partial t_j} = \frac{\partial A^{(j)}}{\partial t_1} = \frac{\partial A^{(j)}}{\partial x}. \]
We can conclude that any solution $a(n)$ of dKP induces a sequence of solutions $a(n_0)$ of mKP, and therefore a sequence of solutions $h(n_0) := \hat\mu(a(n_0))$ of KP. The last statement is proved by noticing that the first equation of the dKP hierarchy is
\[ \frac{\partial a(n)}{\partial x} = a(n)\bigl( h(n+1) - h(n) \bigr). \]

This result concludes the discussion of the link between the hierarchies of iterated elementary Darboux transformations of the KP equations and the discrete KP equations.

Acknowledgement. During the course of this work, FM and MP were sponsored by the Italian M. U. R. S. T. and by the Italian National Research Council (CNR) through the GNFM. JPZ was sponsored by the Brazilian National Research Council (CNPq) through grant MA 521329/94-9.

References

1. Adler, M., Moser, J.: On a class of polynomials connected with the KdV equations. Commun. Math. Phys. 61, 1–30 (1978)
2. Casati, P., Falqui, G., Magri, F., Pedroni, M.: Darboux Coverings and Rational Reductions of the KP Hierarchy. To appear in Letters in Mathematical Physics, 1997
3. Casati, P., Falqui, G., Magri, F., Pedroni, M.: The KP theory revisited. I. Technical Report SISSA/2/96/FM, SISSA/ISAS, Via Beirut 2–4, 34014 Trieste, Italy, 1996
4. Casati, P., Falqui, G., Magri, F., Pedroni, M.: The KP theory revisited. II. Technical Report SISSA/3/96/FM, SISSA/ISAS, Via Beirut 2–4, 34014 Trieste, Italy, 1996
5. Casati, P., Falqui, G., Magri, F., Pedroni, M.: The KP theory revisited. III. Technical Report SISSA/4/96/FM, SISSA/ISAS, Via Beirut 2–4, 34014 Trieste, Italy, 1996
6. Casati, P., Falqui, G., Magri, F., Pedroni, M.: The KP theory revisited. IV. Technical Report SISSA/5/96/FM, SISSA/ISAS, Via Beirut 2–4, 34014 Trieste, Italy, 1996
7. Casati, P., Falqui, G., Magri, F., Pedroni, M.: A note on fractional KdV hierarchies. Technical Report SISSA/94/96/FM, SISSA/ISAS, Via Beirut 2–4, 34014 Trieste, Italy, 1996. To appear in J. Math. Phys.
8. Darboux, G.: “Leçons sur la Théorie Générale de Surfaces et les Applications Géométriques du Calcul Infinitésimal, Deuxième Partie”. Paris: Gauthier-Villars, 1889
9. Deift, P., Trubowitz, E.: Inverse scattering on the line. CPAM 32, 121–251 (1977)
10. Duistermaat, J. J., Grünbaum, F. A.: Differential equations in the spectral parameter. Commun. Math. Phys. 103, 177–240 (1986)
11. Jimbo, M., Miwa, T.: Solitons and Infinite Dimensional Lie Algebras. Publ. Res. Inst. Math. Sci. 19(3), 943–1001 (1983)
12. Konopelchenko, B. G.: On the gauge-invariant description of the evolution equations integrable by Gelfand-Dikij spectral problems. Phys. Letters 92A(7), 323–327 (1982)
13. Konopelchenko, B. G., Oevel, W.: On the R-matrix approach to nonstandard classes of integrable equations. RIMS, Kyoto 29, 581 (1993)
14. Krichever, I. M.: General rational reductions of the KP hierarchy and their symmetries. Funct. Anal. Appl. 29(2), 75–80 (1995)
15. Kupershmidt, B. A.: Discrete Lax Equations and Differential–Difference Calculus. Paris: Astérisque, 1985
16. Kupershmidt, B. A.: Canonical property of the Miura maps between the mKP and KP hierarchies, continuous and discrete. Commun. Math. Phys. 167(2), 351–371 (1995)
17. Matveev, V. B., Salle, M. A.: Darboux transformations and solitons. New York: Springer Verlag, 1991
18. Newell, A. C.: Solitons in Mathematics and Physics. Philadelphia, PA: SIAM, 1985
19. Ueno, K., Takasaki, K.: Toda lattice hierarchy. I. Proc. Japan Acad. Ser. A Math. Sci. 59(5), 167–170 (1983)
20. Ueno, K., Takasaki, K.: Toda lattice hierarchy. II. Proc. Japan Acad. Ser. A Math. Sci. 59(6), 215–218 (1983)
21. Ueno, K., Takasaki, K.: Toda Lattice Hierarchy. In: K. Okamoto, editor, Group Representations and Systems of Differential Equations (Tokyo, 1982), Amsterdam: North–Holland, 1984, pp. 1–95
22. van Moerbeke, P.: Integrable Foundations of String Theory. CIMPA Summer School at Sophia-Antipolis. In: O. Babelon et al., editors, Lectures on Integrable Systems, Singapore: World Scientific, 1994, pp. 163–267
23. Zubelli, J. P.: Differential equations in the spectral parameter for matrix differential operators. Physica D 43(2-3), 269–287 (1990)
24. Zubelli, J. P., Magri, F.: Differential equations in the spectral parameter, Darboux transformations, and a hierarchy of master symmetries for KdV. Commun. Math. Phys. 141(2), 329–351 (1991)

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 188, 327 – 350 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Distribution Functions for Random Variables for Ensembles of Positive Hermitian Matrices

Estelle L. Basor*

Department of Mathematics, California Polytechnic State University, San Luis Obispo, CA 93407, USA. E-mail: [email protected]

Received: 5 November 1996 / Accepted: 8 January 1997

Abstract: Distribution functions for random variables that depend on a parameter are computed asymptotically for ensembles of positive Hermitian matrices. The inverse Fourier transform of the distribution is shown to be a Fredholm determinant of a certain operator that is an analogue of a Wiener-Hopf operator. The asymptotic formula shows that, up to the terms of order $o(1)$, the distributions are Gaussian.

1. Introduction

In the theory of random matrices one is led naturally to consider the probability distribution on the set of eigenvalues of the matrices. For $N \times N$ random Hermitian matrices one can show that under reasonable assumptions, the probability density that the eigenvalues $\lambda_1, \dots, \lambda_N$ lie in the intervals $(x_1, x_1 + dx_1), \dots, (x_N, x_N + dx_N)$ is given by the formula
\[ P_N(x_1, \dots, x_N) = \frac{1}{N!} \det\bigl( K_N(x_i, x_j) \bigr)\big|_{i,j=1}^{N}, \tag{1} \]
where
\[ K_N(x, y) = \sum_{i=0}^{N-1} \phi_i(x) \phi_i(y), \tag{2} \]
and $\phi_i$ is obtained by orthonormalizing the sequence $\{ x^{i} e^{-x^{2}/2} \}$ over $\mathbb{R}$.

* Supported in part by NSF Grant DMS-9623278.

For $N \times N$ positive Hermitian matrices the probability density has the same form except that $\phi_i$ is replaced by the functions obtained by orthonormalizing the sequence $x^{\nu/2} e^{-x/2} x^{i}$ over $\mathbb{R}^{+}$. We will not describe here exactly how these particular densities arise but instead refer the reader to [8].

We can define a random variable on the space of eigenvalues by considering $f(x_1, \dots, x_N)$ where $f$ is any symmetric function of the $x_i$’s. A particular case of interest is a random variable of the form $\sum_{i=1}^{N} f(x_i)$, where $f$ is a function of a real variable. Such a random variable is generally called a linear statistic. In previous work [8, 3, 1, 6], the variance of the random variable was computed in the large $N$ limit. More precisely, the function $f$ and the kernel $K_N(x, y)$ were suitably rescaled so that the limit as $N \to \infty$ of the variance could be computed. The precise details of this are in the next section.

Our goal in this paper is to compute the distribution function for a class of the linear statistics that depend on a parameter $\alpha$. We now describe the sections of the paper and main results. In the next section we outline the random matrix theory and show how the distribution functions can be computed using Fredholm determinants. In Sect. 3 we replace the function $f(x)$ in the linear statistic by $f_\alpha(x) = f(x/\alpha)$. For random variables of this type we show that the inverse Fourier transform of the distribution function $\check\phi(k)$ has an asymptotic expansion of the form
\[ \check\phi(k) \sim e^{a k^{2} + b k} \tag{3} \]
as $\alpha \to \infty$. This of course implies that the actual distribution is asymptotically Gaussian. Here $a$ and $b$ depend on $f$ and $\alpha$. This is proved for both the Hermitian matrices and positive Hermitian matrices. In the latter case with $\nu = -1/2$, a very simple proof is given in Sect. 3. For $\nu > -1/2$, a completely different proof is obtained in Sect. 4.

Most of the results are obtained by using simple operator theory identities in the theory of Wiener-Hopf operators. The central idea is that the various quantities which yield information about random variables can all be computed in terms of traces or determinants of integral operators. Some of the computations lead directly to a familiar problem in the theory of Wiener-Hopf operators, while others require modifications and generalizations of these results.

2. Preliminaries

In this section we show how to compute the mean, variance, and inverse Fourier transform of the distribution of the random variable. Computations for the mean and variance have been given before in many places. However, we reproduce all of these here for the sake of completeness and also to highlight the use of operator theory ideas.

We begin by considering $P_N$ for $N \times N$ random Hermitian matrices. We want to consider large matrices and thus we let $N \to \infty$, but this leads to a trivial result unless we rescale $K_N$ in a particular way. We replace $K_N(x, y)$ with
\[ \frac{1}{\sqrt{2N}} K_N\Bigl( \frac{x}{\sqrt{2N}}, \frac{y}{\sqrt{2N}} \Bigr). \tag{4} \]
Rescaling $K_N$ is equivalent to rescaling the mean spacing of the eigenvalues. (See [12] for details.) From the theory of Hermite polynomials it is easy to see that as $N \to \infty$,
\[ \frac{1}{\sqrt{2N}} K_N\Bigl( \frac{x}{\sqrt{2N}}, \frac{y}{\sqrt{2N}} \Bigr) \to \frac{\sin(x - y)}{\pi (x - y)}. \tag{5} \]
This last function is known as the sine kernel. Now consider a random variable of the form
\[ \sum_{i=1}^{N} f(x_i \sqrt{2N}), \]

where in all that follows $f$ is a continuous real-valued function belonging to $L^{1}(\mathbb{R})$ and which vanishes at $\pm\infty$. The appearance of the $\sqrt{2N}$ should not be surprising here since the above rescaling spreads out the eigenvalues and hence should be reflected in the random variable. The mean $\mu_N$ is
\[ \int \cdots \int \sum_{i=1}^{N} f(x_i \sqrt{2N}) P_N(x_1, \dots, x_N) \, dx_1 \cdots dx_N. \tag{6} \]
Now the function $P_N$ has the important property [8]
\[ \frac{N!}{(N-n)!} \int \cdots \int P_N(x_1, \dots, x_n, x_{n+1}, \dots, x_N) \, dx_{n+1} \cdots dx_N = \det\bigl( K_N(x_i, x_j) \bigr)\big|_{i,j=1}^{n}. \tag{7} \]
Thus, (6) is easily seen to be
\[ \int_{-\infty}^{\infty} f(x \sqrt{2N}) K_N(x, x) \, dx, \tag{8} \]
which, after changing $x$ to $x/\sqrt{2N}$, becomes
\[ \int_{-\infty}^{\infty} f(x) \frac{1}{\sqrt{2N}} K_N\Bigl( \frac{x}{\sqrt{2N}}, \frac{x}{\sqrt{2N}} \Bigr) \, dx. \]
Thus, as $N \to \infty$,
\[ \mu_N \to \mu = \int_{-\infty}^{\infty} f(x) K(x, x) \, dx, \tag{9} \]
where $K(x, y)$ is the sine kernel. A very similar computation for the variance $\mathrm{var}_N f$, again using (7), yields
\[ \mathrm{var}\, f := \lim_{N \to \infty} \mathrm{var}_N f = -\int\!\!\int f(x) f(y) K^{2}(x, y) \, dx \, dy + \int f^{2}(x) K(x, x) \, dx. \tag{10} \]
Both the mean and the variance can be interpreted as traces of certain Wiener-Hopf operators. To see this, consider the operator $A(f)$ on $L^{2}(-1, 1)$ with kernel
\[ \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it(x-y)} \, dt. \tag{11} \]
This operator can easily be seen to be the product $F M_f F^{-1} P$, where $P g = \chi_{(-1,1)} g$, $M_f g = f g$ and $F$ is the Fourier transform. A moment’s thought shows that $\mu = \mathrm{tr}\,\{A(f)\}$ and $\mathrm{var}\, f = \mathrm{tr}\,\{A(f^{2}) - (A(f))^{2}\}$.
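The identity between the sine-kernel formula (10) and the trace formula $\mathrm{var}\, f = \mathrm{tr}\,\{A(f^{2}) - (A(f))^{2}\}$ can be checked numerically for a concrete test function. The sketch below is an illustration only: the choice $f(t) = e^{-t^{2}}$, the midpoint-rule quadrature, the cutoff and grid sizes, and all function names are assumptions made here, not part of the paper. For this $f$ the kernel (11) has the closed form $e^{-(x-y)^{2}/4}/(2\sqrt{\pi})$.

```python
import math

def f(t):
    # Hypothetical test function (not from the paper): a Gaussian.
    return math.exp(-t * t)

def sine_kernel(x, y):
    # K(x, y) = sin(x - y) / (pi (x - y)), with K(x, x) = 1 / pi.
    u = x - y
    if abs(u) < 1e-12:
        return 1.0 / math.pi
    return math.sin(u) / (math.pi * u)

def midpoint_nodes(a, b, n):
    # Midpoint-rule nodes on [a, b] and the step size.
    h = (b - a) / n
    return [a + (i + 0.5) * h for i in range(n)], h

def variance_from_sine_kernel(cutoff=6.0, n=240):
    # Formula (10): -∫∫ f(x) f(y) K(x,y)^2 dx dy + ∫ f(x)^2 K(x,x) dx,
    # with the real line truncated to [-cutoff, cutoff].
    xs, h = midpoint_nodes(-cutoff, cutoff, n)
    fx = [f(x) for x in xs]
    single = sum(val * val for val in fx) * h / math.pi
    double = 0.0
    for i, x in enumerate(xs):
        for j, y in enumerate(xs):
            double += fx[i] * fx[j] * sine_kernel(x, y) ** 2
    return single - double * h * h

def wiener_hopf_kernel(x, y):
    # Kernel (11) for f(t) = exp(-t^2): (1/2π) ∫ f(t) e^{-it(x-y)} dt.
    return math.exp(-((x - y) ** 2) / 4.0) / (2.0 * math.sqrt(math.pi))

def variance_from_trace(cutoff=6.0, n=240, m=200):
    # tr A(f^2): the diagonal of the kernel of A(f^2) is (1/2π) ∫ f^2,
    # integrated over the interval (-1, 1) of length 2.
    ts, ht = midpoint_nodes(-cutoff, cutoff, n)
    tr_a_f2 = 2.0 * sum(f(t) ** 2 for t in ts) * ht / (2.0 * math.pi)
    # tr A(f)^2 = ∫∫ over (-1,1)^2 of k(x,y) k(y,x) dx dy.
    xs, hx = midpoint_nodes(-1.0, 1.0, m)
    tr_a_sq = sum(wiener_hopf_kernel(x, y) * wiener_hopf_kernel(y, x)
                  for x in xs for y in xs) * hx * hx
    return tr_a_f2 - tr_a_sq

if __name__ == "__main__":
    print(variance_from_sine_kernel(), variance_from_trace())
```

The two printed numbers should agree up to quadrature error; the agreement reflects Parseval's identity applied to $\sin(u)/(\pi u) = \frac{1}{2\pi}\int_{-1}^{1} e^{isu}\,ds$.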

330

E. L. Basor

A more difficult, yet also straightforward problem, is to find an expression for the distribution function of a random variable of this type. A fundamental formula from probability theory shows that if we call the probability distribution function φN , then Z ∞ Z ∞ PN √ ik f (xj 2N ) ˇ j=1 ··· e PN (x1 , . . . , xN )dx1 · · · dxN . (12) φN (k) = −∞

−∞

Thus, Z φˇ N (k) =

Z

∞ −∞

Z

Z

∞

= −∞

Z

··· Z

∞

= −∞

+

···

···

N X

N ∞ Y

eikf (xj

√

2N )

−∞ j=1 N ∞ Y

((eikf (xj

√

PN (x1 , . . . , xN ) dx1 · · · dxN

2N )

−∞ j=1 ∞ −∞

{1 +

(eikf (xj

N X

(eikf (xj

− 1) + 1)PN (x1 , . . . , xN ) dx1 · · · dxN

√

2N )

− 1)

j=1 √

2N )

− 1)(eikf (xl

√

2N )

− 1) + . . .}

j
×PN (x1 , . . . , xN ) dx1 · · · dxN Z 1 ∞ ikf (x√2N ) =1+ (e − 1)KN (x, x) dx 1! −∞ Z Z √ 1 ∞ ∞ ikf (x1 √2N ) + (e − 1)(eikf (x2 2N ) − 1) 2! −∞ −∞ det(KN (xj , xl )) |1≤j,l≤2 dx1 dx2 Z ∞ Z ∞Y N √ 1 + ··· + ··· (eikf (xj 2N ) − 1)PN (x1 , . . . , xN ) dx1 · · · dxN . N ! −∞ −∞ j=1

In each integral we rescale to obtain Z Z Z 1 ∞ 0 1 ∞ ∞ 0 ˇ K (x1 , x1 ) dx1 + K (x1 , x2 ) dx1 dx2 φN (k) = 1 + 1! −∞ 2! −∞ −∞ Z ∞ Z ∞ 1 +··· + ··· K 0 (x1 , . . . , xN ) dx1 · · · dxN , N ! −∞ −∞ where 0

K (x1 , . . . , xn ) = det (e

ikf (xj )

xl 1 xj ,√ )√ − 1)KN ( √ 2N 2N 2N

(13)

.

(14)

1≤j,l≤n

Letting N → ∞ we see this is the formula for the Fredholm determinant det(I + K), where K has kernel sin(x − y) . (15) K(x, y) = (eikf (x) − 1) π(x − y) As before we can express this last quantity in terms of the operator A(σ)

Distribution Functions for Random Variables of Hermitian Matrices

\[ \check\phi(k) = \lim_{N\to\infty}\check\phi_N(k) = \det(I + A(\sigma)), \tag{16} \]

where $\sigma(x) = e^{ikf(x)}-1$. The preceding computations can all be carried out in the case of positive Hermitian matrices. In this case we replace $K_N(x,y)$ with

\[ \frac{1}{4N}\,K_N\!\left(\frac{x}{4N},\frac{y}{4N}\right) \]

and from the theory of Laguerre polynomials we see that as $N\to\infty$,

\[ \frac{1}{4N}\,K_N\!\left(\frac{x}{4N},\frac{y}{4N}\right) \to \frac{J_\nu(\sqrt{x})\,\sqrt{y}\,J_\nu'(\sqrt{y}) - \sqrt{x}\,J_\nu'(\sqrt{x})\,J_\nu(\sqrt{y})}{2(x-y)}, \tag{17} \]

where $J_\nu$ is the Bessel function of order $\nu$. The details of this are found in [13]. The rescaling here forces the eigenvalue density to be bounded near zero and is called "scaling at the hard edge." The kernel (17) is known as the Bessel kernel. We can again write the mean, the variance, and the Fourier transform of the distribution in terms of operators. This time the relevant operator $B(f)$ is defined on $L^2(0,1)$ with kernel given by

\[ K(x,y) = \int_0^\infty t\sqrt{xy}\,f(t)\,J_\nu(tx)\,J_\nu(ty)\,dt. \tag{18} \]

If we begin with the linear statistic (the $\sqrt{x}$ is merely for convenience, and we again assume that $f$ is continuous, in $L^1(\mathbb{R}^+)$ and vanishes at $+\infty$)

\[ \sum_{i=1}^{N} f\bigl(\sqrt{x_i\,4N}\bigr), \tag{19} \]

then nearly identical computations show that

\[ \mu = \operatorname{tr} B(f), \qquad \operatorname{var} f = \operatorname{tr}\{B(f^2) - (B(f))^2\}, \qquad \check\phi(k) = \det(I + B(\sigma)), \]

where $\sigma = e^{ikf(x)}-1$. We summarize these results in the following:

Theorem 1. (a) Given a random variable of the form $\sum_{i=1}^{N} f(x_i\sqrt{2N})$ defined on the space of eigenvalues of $N\times N$ Hermitian matrices with probability distribution given in (1), we have

\[ \mu := \lim_{N\to\infty}\mu_N = \operatorname{tr}(A(f)), \quad \operatorname{var} f := \lim_{N\to\infty}\operatorname{var}_N f = \operatorname{tr}\{A(f^2) - (A(f))^2\}, \quad \check\phi(k) := \lim_{N\to\infty}\check\phi_N(k) = \det(I + A(\sigma)), \]

where $\sigma(x) = e^{ikf(x)}-1$.

(b) Given a random variable of the form $\sum_{i=1}^{N} f(\sqrt{x_i\,4N})$ defined on the space of eigenvalues of positive $N\times N$ Hermitian matrices, we have

\[ \mu := \lim_{N\to\infty}\mu_N = \operatorname{tr}(B(f)), \quad \operatorname{var} f := \lim_{N\to\infty}\operatorname{var}_N f = \operatorname{tr}\{B(f^2) - (B(f))^2\}, \quad \check\phi(k) := \lim_{N\to\infty}\check\phi_N(k) = \det(I + B(\sigma)), \]

where $\sigma(x) = e^{ikf(x)}-1$.


When linear statistics are considered [3, 10], one is often concerned with a statistic of the form $\sum_{i=1}^{N} f(x_i/\alpha)$, where $\alpha$ is a real parameter approaching infinity. This is the case, for example, in the study of disordered conductors where large $\alpha$ corresponds to a high density metallic regime. The above formulas still hold, of course, but now they depend on the parameter. We will denote the operators that depend on the parameter $\alpha$ by $A_\alpha(f)$ and $B_\alpha(f)$, respectively. In the next sections we will compute the mean, variance, and distribution function asymptotically as $\alpha\to\infty$.

3. The Mean, Variance, and Distribution Function as α → ∞

For random Hermitian matrices, computing the various limits is an application of the continuous analogues of the Strong Szegő Limit Theorem. For then, $A_\alpha(f)$ is just the classical Wiener-Hopf operator defined on the interval $(-\alpha,\alpha)$, and all of the quantities are known asymptotically as $\alpha\to\infty$. We provide the answers here for completeness.

Theorem 2. Assume that $f\in L^1(\mathbb{R})$ is continuous, and vanishes at $\pm\infty$ and that in addition its Fourier transform $\hat f$ satisfies

\[ \int_{-\infty}^{\infty} |x|\,|\hat f(x)|^2\,dx < \infty. \]

Then

\[ \mu = \frac{\alpha}{2\pi}\int_{-\infty}^{\infty} f(x)\,dx, \]

\[ \operatorname{var} f = 2\int_0^\infty x\,\hat f(x)\,\hat f(-x)\,dx + o(1) \]

and

\[ \check\phi(k) \sim \exp\left\{ \frac{\alpha}{2\pi}\int_{-\infty}^{\infty} ikf(x)\,dx - k^2\int_0^\infty x\,\hat f(x)\,\hat f(-x)\,dx \right\}. \]

The Bessel case is significantly more complicated. There is no corresponding Szegő type theorem. We begin by computing the mean. The operator $B_\alpha(\sigma)$ has kernel

\[ \int_0^\infty \sqrt{xy}\,t\,f(t/\alpha)\,J_\nu(tx)\,J_\nu(ty)\,dt. \]

Thus the mean $\mu$ is given by

\[
\begin{aligned}
\mu &= \int_0^1\int_0^\infty x\,t\,f(t/\alpha)\,J_\nu^2(tx)\,dt\,dx \\
&= \alpha^2\int_0^\infty\int_0^1 x\,t\,f(t)\,J_\nu^2(\alpha t x)\,dx\,dt \\
&= \alpha^2\int_0^\infty f(t)\int_0^1 x\,t\,J_\nu^2(\alpha t x)\,dx\,dt.
\end{aligned} \tag{20}
\]

Now

\[ \int_0^1 x\,J_\nu^2(\alpha t x)\,dx = \frac12\left( J_\nu^2(\alpha t) - J_{\nu+1}(\alpha t)\,J_{\nu-1}(\alpha t) \right) \]

and

\[ J_{\nu-1}(\alpha t) = -J_{\nu+1}(\alpha t) + \frac{2\nu}{\alpha t}\,J_\nu(\alpha t). \]
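The two Bessel identities used here can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a plain power-series evaluation of $J_\nu$ (adequate only for moderate arguments) and a composite Simpson rule; the helper names `bessel_j`, `simpson`, `lhs_integral`, `rhs_closed_form` and `recurrence_gap`, and the sample values of $\nu$ and of $a = \alpha t$, are illustrative choices.

```python
import math


def bessel_j(nu, x, terms=60):
    # Power series J_nu(x) = sum_m (-1)^m / (m! Gamma(m+nu+1)) (x/2)^(2m+nu).
    # Assumption: x is moderate (say x <= 10) and m+nu+1 is never a pole of Gamma.
    return sum(
        (-1) ** m / (math.factorial(m) * math.gamma(m + nu + 1)) * (x / 2) ** (2 * m + nu)
        for m in range(terms)
    )


def simpson(func, lo, hi, n=2000):
    # Composite Simpson rule with an even number n of subintervals.
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def lhs_integral(nu, a):
    # Left side: integral over (0, 1) of x * J_nu(a x)^2.
    return simpson(lambda x: x * bessel_j(nu, a * x) ** 2, 0.0, 1.0)


def rhs_closed_form(nu, a):
    # Right side: (1/2) * (J_nu(a)^2 - J_{nu+1}(a) * J_{nu-1}(a)).
    return 0.5 * (bessel_j(nu, a) ** 2 - bessel_j(nu + 1, a) * bessel_j(nu - 1, a))


def recurrence_gap(nu, z):
    # J_{nu-1}(z) + J_{nu+1}(z) - (2 nu / z) J_nu(z), which should vanish.
    return bessel_j(nu - 1, z) + bessel_j(nu + 1, z) - 2 * nu / z * bessel_j(nu, z)


if __name__ == "__main__":
    for nu, a in [(0.5, 3.0), (2.0, 5.0)]:
        print(nu, a, lhs_integral(nu, a), rhs_closed_form(nu, a), recurrence_gap(nu, a))
```

For each sample pair the two sides of the integral identity should agree to within the quadrature error, and the recurrence gap should be at rounding level.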

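Therefore the integral (20) becomes

\[ \alpha\int_0^\infty f(t)\,\frac{\alpha t}{2}\left( J_\nu^2(\alpha t) + J_{\nu+1}^2(\alpha t) - \frac{2\nu}{\alpha t}\,J_{\nu+1}(\alpha t)\,J_\nu(\alpha t) \right) dt \]

or

\[ \alpha\int_0^\infty f(t)\,\frac{\alpha t}{2}\left( J_\nu^2(\alpha t) + J_{\nu+1}^2(\alpha t) \right) dt - \alpha\nu\int_0^\infty f(t)\,J_{\nu+1}(\alpha t)\,J_\nu(\alpha t)\,dt. \tag{21} \]

The first integral equals

\[ \frac{\alpha}{\pi}\int_0^\infty f(t)\,dt + o(1), \tag{22} \]

which can be easily seen by using the asymptotic properties of Bessel functions. The second integral is asymptotically

\[ \frac{\nu}{2}\,f(0) + o(1). \]

This uses the identity $\int_0^\infty J_{\nu+1}(x)\,J_\nu(x)\,dx = \frac12$. Thus we have

\[ \mu = \frac{\alpha}{\pi}\int_0^\infty f(t)\,dt - \frac{\nu}{2}\,f(0) + o(1). \tag{23} \]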

For the variance we refer to [1] where the calculation was already done. There it was shown that

\[ \operatorname{var} f \sim \frac{1}{\pi^2}\int_{-\infty}^{\infty} |M(f)(2iy)|^2\,y\tanh(\pi y)\,dy. \tag{24} \]

We note however, that this can also be written as

\[ \operatorname{var} f \sim \frac{1}{\pi^2}\int_0^\infty x\,(C(f)(x))^2\,dx, \tag{25} \]

where $C(f)(x) = \int_0^\infty f(y)\cos(xy)\,dy$ denotes the cosine transform of $f$. This is an exercise involving the properties of the Mellin transform, and we leave it to the reader.

To compute the distribution function, we first turn our attention to the case where $\nu = -1/2$. Our operator $B_\alpha(\sigma)$ has kernel

\[
\begin{aligned}
\frac{2}{\pi}\int_0^\infty \sigma(t/\alpha)\cos xt\,\cos yt\,dt
&= \frac{1}{\pi}\int_0^\infty \sigma(t/\alpha)\bigl(\cos((x-y)t) + \cos((x+y)t)\bigr)\,dt \\
&= \frac{\alpha}{\pi}\bigl(C(\sigma)((x-y)\alpha) + C(\sigma)((x+y)\alpha)\bigr).
\end{aligned}
\]

This is unitarily equivalent to the operator on $L^2(0,\alpha)$ with kernel

\[ \frac{1}{\pi}\bigl(C(\sigma)(x-y) + C(\sigma)(x+y)\bigr). \tag{26} \]
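The reduction to cosines for $\nu = -1/2$ rests on the closed form $J_{-1/2}(z) = \sqrt{2/(\pi z)}\cos z$, which turns the Bessel kernel integrand $\sqrt{xy}\,t\,J_\nu(tx)J_\nu(ty)$ into $\frac{2}{\pi}\cos xt\cos yt$. The sketch below checks this pointwise. It is an illustration only: the series evaluation of $J_\nu$ and the names `bessel_j`, `bessel_kernel_integrand` and `cosine_form` are assumptions made here, not notation from the paper.

```python
import math


def bessel_j(nu, x, terms=60):
    # Power series for J_nu(x); assumes x > 0 and moderate, nu > -1.
    return sum(
        (-1) ** m / (math.factorial(m) * math.gamma(m + nu + 1)) * (x / 2) ** (2 * m + nu)
        for m in range(terms)
    )


def bessel_kernel_integrand(x, y, t, nu=-0.5):
    # Integrand of the Bessel kernel without the symbol: sqrt(xy) * t * J_nu(tx) * J_nu(ty).
    return math.sqrt(x * y) * t * bessel_j(nu, t * x) * bessel_j(nu, t * y)


def cosine_form(x, y, t):
    # (1/pi) * (cos((x - y) t) + cos((x + y) t)), the Wiener-Hopf plus Hankel form.
    return (math.cos((x - y) * t) + math.cos((x + y) * t)) / math.pi


if __name__ == "__main__":
    for x, y, t in [(0.3, 0.8, 4.0), (0.5, 0.5, 2.5), (0.9, 0.1, 6.0)]:
        print(bessel_kernel_integrand(x, y, t), cosine_form(x, y, t))
```

The first cosine term depends only on $x - y$ (the Wiener-Hopf part) and the second only on $x + y$ (the Hankel part), which is the splitting used in the text that follows.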

The operator with kernel $\frac{1}{\pi}(C(\sigma)(x-y))$ is the finite Wiener-Hopf operator, usually denoted as $W_\alpha(\sigma)$, and the operator with kernel $\frac{1}{\pi}(C(\sigma)(x+y))$ is the Hankel operator $H_\alpha(\sigma)$. (The only difference between this definition of a finite Wiener-Hopf operator and the one given earlier for $A_\alpha$ is the difference in the domain. The two are unitarily equivalent.) If we consider the operators on $L^2(0,\infty)$ in what follows, we will denote them by $W(\sigma)$ and $H(\sigma)$ respectively. Also, whenever it is necessary to consider the extension of $\sigma$ to the entire real axis, it will always be the even extension. Thus the problem of finding the distribution function asymptotically becomes the same as computing the Fredholm determinant

\[ \det(I + B_\alpha(\sigma)) = \det(I + W_\alpha(\sigma) + H_\alpha(\sigma)) \]

asymptotically. To do this we need some basic facts about Wiener-Hopf operators and we collect them in the following theorem. These are well-known and can all be found in [4].

Theorem 3. a) Suppose $\phi$ and $\psi$ are even bounded functions in $L^1(\mathbb{R})$. Then

\[ W(\phi)H(\psi) + H(\phi)W(\psi) = H(\phi\psi) \quad\text{and}\quad W(\phi)W(\psi) = W(\phi\psi) - H(\phi)H(\psi). \]

b) Suppose $\phi$ and $\psi$ are bounded functions in $L^1(\mathbb{R})$. If the Fourier transform $\hat\phi(x)$ vanishes for $x$ negative, then $W(\psi)W(\phi) = W(\phi\psi)$ and if $\hat\phi(x)$ vanishes for $x$ positive, then $W(\phi)W(\psi) = W(\phi\psi)$.

We define $W(\sigma)$ and $H(\sigma)$ with $\sigma = 1+f$ and $f$ in $L^1$ by $W(\sigma) = I + W(f)$ and $H(\sigma) = H(f)$. Both of these definitions are natural when thought of in a distributional setting, and the above theorem holds with these definitions as well. The next theorem is of primary importance in the computations that follow.

Theorem 4. Suppose $\phi = 1+f$, $\phi^{-1} = 1+g$, where $f$ and $g$ are bounded even functions. Then the inverse of $W(\phi) + H(\phi)$ is $W(\phi^{-1}) + H(\phi^{-1})$.

Proof. Using Theorem 3 parts a) and b) we have

\[
\begin{aligned}
(W(\phi)+H(\phi))(W(\phi^{-1})+H(\phi^{-1}))
&= W(\phi)W(\phi^{-1}) + H(\phi)W(\phi^{-1}) + W(\phi)H(\phi^{-1}) + H(\phi)H(\phi^{-1}) \\
&= I - H(\phi)H(\phi^{-1}) + H(\phi\phi^{-1}) + H(\phi)H(\phi^{-1}) \\
&= I + H(1) = I.
\end{aligned}
\]

The same computation holds for $(W(\phi^{-1})+H(\phi^{-1}))(W(\phi)+H(\phi))$, and so we have shown that these operators are inverses of each other.

It is well known from the theory of Wiener-Hopf operators that under appropriate conditions $\det(I + W_\alpha(\sigma))$ has the asymptotic expansion $G(\sigma)^\alpha E(\sigma)$, where

\[ G(\sigma) = \exp\left\{ \frac{1}{2\pi}\int_{-\infty}^{\infty}\log(1+\sigma(\xi))\,d\xi \right\} \]


and $E(\sigma) = \det(W(\phi)W(\phi^{-1}))$ with $\phi = 1+\sigma$. This is simply another version of Theorem 2. With additional assumptions on $\phi$, it is very easy to adapt this proof to the Bessel case $\nu = -1/2$ to show that

\[ \det(I + W_\alpha(\sigma) + H_\alpha(\sigma)) \sim G(\sigma)^\alpha E'(\sigma) \tag{27} \]

and $E'(\sigma) = \det((W(\phi)+H(\phi))W(\phi^{-1}))$. Thus to compute the distribution, we need to know the form of the above determinant. This is contained in the next theorem.

Theorem 5. Suppose $\sigma = e^{ikf}-1$, where $f$ is even, continuous, piecewise $C^2$ and vanishes at infinity. Suppose also that $f\in L^1$ and the function $\xi\to(1+\xi^2)(|f''(\xi)| + |f'(\xi)|^2)\in L^2$. Then as $\alpha\to\infty$, we have

\[ \det(I + W_\alpha(\sigma) + H_\alpha(\sigma)) \sim \exp\left\{ \frac{\alpha}{\pi}\int_0^\infty ikf(x)\,dx + \frac{ik}{4}\,f(0) - \frac{k^2}{2\pi^2}\int_0^\infty x\,|C(f)(x)|^2\,dx \right\}. \tag{28} \]

Proof. The conditions on $\sigma$ ensure that the above integrals converge, and that the operators $H(\phi)$ and $H(f)$ are trace class. The reader is referred to [2] for details. These assumptions also guarantee that (27) holds. It is also easy to see that $G(\sigma)^\alpha = \exp\{\frac{\alpha}{\pi}\int_0^\infty ikf(x)\,dx\}$. To complete the proof we need a concrete representation for $\det((W(\phi)+H(\phi))W(\phi^{-1}))$. Define

\[ h(k) = \log\det((W(\phi)+H(\phi))W(\phi^{-1})), \]

where $\phi = e^{ikf}$. We need to show the second derivative of $h$ is constant in $k$. A standard formula [5] yields

\[
\begin{aligned}
h'(k) &= \operatorname{tr}\Bigl( (W(\phi^{-1}))^{-1}(W(\phi)+H(\phi))^{-1}\times\frac{d\,[(W(\phi)+H(\phi))W(\phi^{-1})]}{dk} \Bigr) \\
&= \operatorname{tr}\Bigl( (W(\phi^{-1}))^{-1}(W(\phi)+H(\phi))^{-1} \\
&\qquad\times\{(W(\phi)+H(\phi))W(\phi^{-1}(-if)) + W(\phi\, if)W(\phi^{-1}) + H(\phi\, if)W(\phi^{-1})\} \Bigr) \\
&= \operatorname{tr}\{ (W(\phi^{-1}))^{-1}W(\phi^{-1}(-if)) + (W(\phi^{-1}))^{-1}W(\phi^{-1})W(\phi\, if)W(\phi^{-1}) \\
&\qquad + (W(\phi^{-1}))^{-1}W(\phi^{-1})H(\phi\, if)W(\phi^{-1}) + (W(\phi^{-1}))^{-1}H(\phi^{-1})W(\phi\, if)W(\phi^{-1}) \\
&\qquad + (W(\phi^{-1}))^{-1}H(\phi^{-1})H(\phi\, if)W(\phi^{-1}) \}.
\end{aligned}
\]

This uses Theorem 4. Simplifying further and using the fact that $H(\phi^{-1})$ is trace class we have

\[
\begin{aligned}
h'(k) = \operatorname{tr}\{ &(W(\phi^{-1}))^{-1}W(\phi^{-1}(-if)) + W(\phi\, if)W(\phi^{-1}) \\
&+ H(\phi\, if)W(\phi^{-1}) + H(\phi^{-1})W(\phi\, if) + H(\phi^{-1})H(\phi\, if) \}.
\end{aligned}
\]

Now apply Theorem 3, part a) and the fact that $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ to find

\[
\begin{aligned}
h''(k) = \operatorname{tr}\{ &(W(\phi^{-1}))^{-1}W((\phi^{-1})(if)^2) \\
&- (W(\phi^{-1}))^{-1}W(\phi^{-1}(-if))\,(W(\phi^{-1}))^{-1}W(\phi^{-1}(-if)) \}.
\end{aligned}
\]


The conditions on $\phi$ guarantee that the function $\phi$ has a factorization $\phi = (g_- + 1)(g_+ + 1)$ such that the Fourier transforms of $g_+$ and $g_-$ vanish for positive and negative real values respectively. Then using Theorem 3, part b), it is easy to see that we can write

\[ W(\phi) = W(g_- + 1)\,W(g_+ + 1), \qquad W(\phi^{-1})^{-1} = W(g_+ + 1)\,W(g_- + 1). \]

A repeated application of these identities allows us to write

\[ h''(k) = \operatorname{tr} H(if)H(if), \]

and $h''(k)$ is independent of $k$. Thus at this point we have $h(k) = ak^2 + bk + c$, where $2a = -\operatorname{tr}((H(f))^2)$. A direct computation shows that $a = -\frac{1}{2\pi^2}\int_0^\infty x\,|C(f)(x)|^2\,dx$. To compute $b$, notice that $h'(0)$ is $\operatorname{tr} H(if) = \frac{i}{2\pi}\int_0^\infty C(f)(x)\,dx$. Also $h(0) = \operatorname{tr}\log(I) = 0$. Thus the last theorem holds.

4. The General Case

In this section we show that under certain conditions, the distribution function for general $\nu$ has the same form as in the case of $\nu = -1/2$. The only difference is in the mean which was computed in the last section. The attack on the problem is entirely different here. Instead of computing determinants asymptotically, we compute the traces of the operators $(B_\alpha(\sigma))^n$ and then piece together the answers to get an answer for the trace of $\log(I + B_\alpha(\sigma))$ and from that to the desired determinant.

To begin we need to show that $\operatorname{tr} f(B_\alpha(\sigma))$ makes sense for a class of analytic functions $f$. Just as we can associate the Wiener-Hopf operator with the Fourier transform and a multiplication operator, we can also write $B_\alpha(\sigma) = P H M_\sigma H$, where $H$ is the Hankel transform and $P$ is the projection on $L^2(0,1)$. Since the Hankel transform is unitary on $L^2(0,\infty)$ ([11]), the operator norm $\|B_\alpha(\sigma)\|$ is less than the infinity norm $\|\sigma\|_\infty$ of $\sigma$. Thus $f(B_\alpha(\sigma))$ is defined for $f$ analytic on a disk centered at the origin with radius $\|\sigma\|_\infty + \delta$, $\delta > 0$. The operator $B_\alpha(\sigma)$ is also trace class for $\sigma$ in $L^1$ by Mercer's Theorem ([5] Ch. III) as is $f(B_\alpha(\sigma))$ for $f$ satisfying the above and $f(1) = 0$. We need some lemmas that will prove to be useful.
These may be known already, but we include them for completeness.

Lemma 6. Suppose $-1 < p < 1$, $0 < \lambda, \delta < 1$, $\mu < 0$, $p + \mu + \delta < 0$ and $0 < t < 1$. Then

\[ \int_0^\infty s^p(1+s)^\mu|1-s|^{-1+\lambda}|1-ts|^{-1+\delta}\,ds \le A\max(|1-t|^{-1+\lambda}, |1-t|^{-1+\delta})\max(t^{-\lambda}, t^{-p-\lambda}), \]

where $A$ is some constant independent of $t$.

Proof. We have

\[
\begin{aligned}
\int_0^\infty s^p(1+s)^\mu|1-s|^{-1+\lambda}|1-ts|^{-1+\delta}\,ds
&= t^{-1+\delta}\int_0^1 s^p(1+s)^\mu|1-s|^{-1+\lambda}|1/t-s|^{-1+\delta}\,ds \\
&\quad + t^{-1+\delta}\int_1^{1/t} s^p(1+s)^\mu|1-s|^{-1+\lambda}|1/t-s|^{-1+\delta}\,ds \\
&\quad + t^{-1+\delta}\int_{1/t}^\infty s^p(1+s)^\mu|1-s|^{-1+\lambda}|1/t-s|^{-1+\delta}\,ds.
\end{aligned} \tag{29}
\]

We consider each of the above integrals. In each, $A$ is a possibly different constant independent of $t$ but can depend on the other parameters. First,

\[
\begin{aligned}
\int_0^1 s^p(1+s)^\mu|1-s|^{-1+\lambda}|1/t-s|^{-1+\delta}\,ds
&\le |1/t-1|^{-1+\delta}\int_0^1 s^p(1+s)^\mu|1-s|^{-1+\lambda}\,ds \\
&\le A\,|t-1|^{-1+\delta}\,t^{1-\delta}.
\end{aligned}
\]

Next,

\[
\begin{aligned}
\int_1^{1/t} s^p(1+s)^\mu|1-s|^{-1+\lambda}|1/t-s|^{-1+\delta}\,ds
&\le A\max(1,t^{-p})\int_1^{1/t}|1-s|^{-1+\lambda}|1/t-s|^{-1+\delta}\,ds \\
&= A\max(1,t^{-p})\,|1/t-1|^{-1+\lambda+\delta} \\
&= A\max(1,t^{-p})\,t^{-\lambda-\delta+1}\,|1-t|^{-1+\lambda+\delta} \\
&\le A\max(1,t^{-p})\,t^{-\lambda-\delta+1}\,|1-t|^{-1+\lambda}.
\end{aligned}
\]

Finally,

\[
\begin{aligned}
\int_{1/t}^\infty s^p(1+s)^\mu|1-s|^{-1+\lambda}|1/t-s|^{-1+\delta}\,ds
&\le |1-1/t|^{-1+\lambda}\int_{1/t}^\infty s^{p+\mu}|1/t-s|^{-1+\delta}\,ds \\
&\le |1-1/t|^{-1+\lambda}\,t^{-p-\mu-\delta}\,A \\
&= A\,|t-1|^{-1+\lambda}\,t^{-p-\mu-\delta-\lambda+1}.
\end{aligned}
\]

Putting this together we have that the original integral is bounded by

\[ A\max(|1-t|^{-1+\lambda}, |1-t|^{-1+\delta})\max(1, t^{-\lambda}, t^{-p-\lambda}). \]

Lemma 7. Suppose $-1 < p < 1$, $0 < \lambda, \delta < 1$, $\mu < 0$, $p + \mu + \lambda < 0$ and $t > 1$. Then

\[ \int_0^\infty s^p(1+s)^\mu|1-s|^{-1+\lambda}|1-ts|^{-1+\delta}\,ds \le A\max(|1-t|^{-1+\lambda}, |1-t|^{-1+\delta})\max(1, t^{-p}), \tag{30} \]

where $A$ is some constant independent of $t$.

Proof. The proof of this is almost identical to the previous lemma, and we leave the details to the reader.


Lemma 8. Suppose $|x| < 1$, $\operatorname{Re} c > 0$, $\operatorname{Re}(c-b) > 0$, and $\operatorname{Re}(c-a-b) < 0$. Then the hypergeometric function $F(a,b,c,x)$ satisfies the estimate

\[ |F(a,b,c,x)| \le A\,|1-x|^{\operatorname{Re}(c-a-b)} \]

with $A$ independent of $x$.

Proof. The hypergeometric function satisfies the identity

\[ F(a,b,c,x) = (1-x)^{c-a-b}\,F(c-a,c-b,c,x). \]

Using Euler's integral formula for $F$, we have

\[ F(c-a,c-b,c,x) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)}\int_0^1 t^{c-b-1}(1-t)^{b-1}(1-tx)^{-c+a}\,dt. \tag{31} \]

The last integral is bounded by $\int_0^1 t^{\operatorname{Re}(c-b-1)}(1-t)^{\operatorname{Re}(a+b-c-1)}\,dt$ or $\dfrac{\Gamma(\operatorname{Re}(c-b))\,\Gamma(\operatorname{Re}(a+b-c))}{\Gamma(\operatorname{Re} a)}$.

We next find an integral expression for the trace of $(B_\alpha(\sigma))^n$. We proceed informally at first and later state things rigorously. Using (18) we can write this trace as

\[ \int_0^\infty\!\cdots\!\int_0^\infty\int_0^1\!\cdots\!\int_0^1 \prod_{i=1}^{n} s_i\,x_i\,\sigma(x_i/\alpha)\,J_\nu(x_i s_i)\,J_\nu(x_i s_{i+1})\,ds_1\cdots ds_n\,dx_1\cdots dx_n, \]

where $s_{1+n} = s_1$. Let $\hat\sigma$ be the Mellin transform of $\sigma$, where $c > 0$. Then the above becomes

\[
\begin{aligned}
\frac{1}{(2\pi i)^n}\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty}\int_0^\infty\!\cdots\!\int_0^\infty\int_0^1\!\cdots\!\int_0^1
&\prod_{i=1}^{n}\{ s_i\,x_i^{1-z_i}\,J_\nu(x_i s_i)\,J_\nu(x_i s_{i+1})\,\hat\sigma(z_i) \} \\
&\times\alpha^{z_1+\dots+z_n}\,ds_1\cdots ds_n\,dx_1\cdots dx_n\,dz_1\cdots dz_n.
\end{aligned}
\]
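The transformation $F(a,b,c,x) = (1-x)^{c-a-b}F(c-a,c-b,c,x)$ quoted in the proof of Lemma 8 can be spot-checked from the Gauss series. The sketch below is illustrative only: it assumes real parameters and $|x| < 1$ so that the truncated series converges quickly, and the function names and sample parameters are choices made here, not taken from the paper.

```python
def hyp2f1(a, b, c, x, terms=400):
    # Gauss series sum_n (a)_n (b)_n / ((c)_n n!) x^n, valid for |x| < 1.
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
    return total


def euler_transformed(a, b, c, x):
    # Right-hand side of the transformation: (1 - x)^(c-a-b) * F(c-a, c-b; c; x).
    return (1 - x) ** (c - a - b) * hyp2f1(c - a, c - b, c, x)


if __name__ == "__main__":
    # Sample parameters satisfying c > 0, c - b > 0 and c - a - b < 0, as in Lemma 8.
    a, b, c = 1.2, 0.9, 1.5
    for x in (0.5, -0.7, 0.9):
        print(x, hyp2f1(a, b, c, x), euler_transformed(a, b, c, x))
```

With these parameters the factor $F(c-a,c-b,c,x)$ stays bounded as $x \to 1$, so all of the growth of $F(a,b,c,x)$ is carried by the power $(1-x)^{c-a-b}$, which is the content of the lemma.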

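Now use the formula

\[ \int_0^\infty x^{-\lambda}J_\nu(ax)J_\nu(bx)\,dx = \frac{(ab)^\nu\,\Gamma\!\left(\nu+\frac{1-\lambda}{2}\right)}{2^\lambda(a+b)^{2\nu-\lambda+1}\,\Gamma(1+\nu)\,\Gamma\!\left(\frac12+\frac{\lambda}{2}\right)}\,F\!\left(\nu+\frac{1-\lambda}{2},\,\nu+\frac12;\,2\nu+1;\,\frac{4ab}{(a+b)^2}\right), \]

where $F(a,b;c;z)$ is the hypergeometric function ${}_2F_1$, $n$ times in the integral to get the expression

\[
\begin{aligned}
\frac{1}{(2\pi i)^n}&\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty}\int_0^1\!\cdots\!\int_0^1 \alpha^{z_1+\dots+z_n} \\
&\times\prod_{i=1}^{n}\frac{\hat\sigma(z_i)\,s_i^{2\nu+1}\,\Gamma(\nu+1-z_i/2)\,F\!\left(\nu+1-z_i/2,\,\nu+\frac12;\,2\nu+1;\,\frac{4s_is_{i+1}}{(s_i+s_{i+1})^2}\right)}{2^{z_i-1}\,\Gamma(1+\nu)\,\Gamma(z_i/2)\,(s_i+s_{i+1})^{2\nu-z_i+2}} \\
&\times ds_1\cdots ds_n\,dz_1\cdots dz_n.
\end{aligned}
\]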

Next we make the change of variables

\[ s_1 = s_1', \qquad s_2 = s_2's_1', \qquad \dots, \qquad s_n = s_n'\cdots s_1', \]

and the integral becomes

\[
\begin{aligned}
\frac{1}{(2\pi i)^n}&\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty}\int_0^1\int_0^{1/s_1}\!\cdots\!\int_0^{1/(s_1\cdots s_{n-1})} (\alpha/2)^{z_1+\dots+z_n}\,2^n\,\hat\sigma(z_n)\,\frac{\Gamma(\nu+1-z_n/2)}{\Gamma(1+\nu)\,\Gamma(z_n/2)} \\
&\times s_1^{z_1+\dots+z_n-1}(1+s_n\cdots s_2)^{-2\nu+z_n-2}\,F\!\left(\nu+1-z_n/2,\,\nu+\frac12;\,2\nu+1;\,\frac{4s_n\cdots s_2}{(1+s_n\cdots s_2)^2}\right) \\
&\times\prod_{i=1}^{n-1}\Bigl\{\frac{\hat\sigma(z_i)\,\Gamma(\nu+1-z_i/2)}{\Gamma(1+\nu)\,\Gamma(z_i/2)}\,s_{i+1}^{2\nu+1+z_{i+1}+\dots+z_{n-1}}(1+s_{i+1})^{-2\nu+z_i-2} \\
&\qquad\times F\!\left(\nu+1-z_i/2,\,\nu+\frac12;\,2\nu+1;\,\frac{4s_{i+1}}{(1+s_{i+1})^2}\right)\Bigr\}\,ds_n\cdots ds_1\,dz_1\cdots dz_n.
\end{aligned}
\]

Write the inside integral as

\[
\int_0^1\int_0^{1/s_1}\!\cdots\!\int_0^{1/(s_1\cdots s_{n-1})}\dots\,ds_n\cdots ds_1
- \int_0^1\int_0^\infty\!\cdots\!\int_0^\infty\dots\,ds_n\cdots ds_1
+ \int_0^1\int_0^\infty\!\cdots\!\int_0^\infty\dots\,ds_n\cdots ds_1.
\]

The last integral in the above sum, inserted in the main integral, is the same as $\operatorname{tr}(B_\alpha(\sigma^n))$. After reversing the order of integration to $ds_1\cdots ds_n$, the first two terms combine to yield limits of integration

\[ -\int_0^\infty\!\cdots\!\int_0^\infty\int_{\min\left(1,\frac{1}{s_2},\dots,\frac{1}{s_2\cdots s_n}\right)}^{1}, \]

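and then the first integration can be done. The result is that

\[ \operatorname{tr}(B_\alpha(\sigma))^n = \operatorname{tr} B_\alpha(\sigma^n) + C(\sigma), \]

where $C(\sigma)$ is given by the expression

\[
\begin{aligned}
\frac{-1}{(2\pi i)^n}&\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty}\int_0^\infty\!\cdots\!\int_0^\infty (\alpha/2)^{z_1+\dots+z_n}\prod_{i=1}^{n}\frac{\hat\sigma(z_i)\,2\,\Gamma(\nu+1-z_i/2)}{\Gamma(1+\nu)\,\Gamma(z_i/2)} \\
&\times\frac{1-\left(\min\left(1,\frac{1}{s_2},\dots,\frac{1}{s_2\cdots s_n}\right)\right)^{z_1+\dots+z_n}}{z_1+z_2+\dots+z_n} \\
&\times(1+s_n\cdots s_2)^{-2\nu+z_n-2}\,F\!\left(\nu+1-z_n/2,\,\nu+\frac12;\,2\nu+1;\,\frac{4s_n\cdots s_2}{(1+s_n\cdots s_2)^2}\right) \\
&\times\Bigl\{\prod_{i=2}^{n} s_i^{2\nu+1+z_i+\dots+z_{n-1}}(1+s_i)^{-2\nu+z_{i-1}-2}\,F\!\left(\nu+1-z_{i-1}/2,\,\nu+\frac12;\,2\nu+1;\,\frac{4s_i}{(1+s_i)^2}\right)\Bigr\} \\
&\times ds_n\cdots ds_2\,dz_1\cdots dz_n.
\end{aligned}
\]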

We next write this integral as

\[ \frac{-1}{(2\pi i)^n}\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty} G(z_i)\int_0^\infty\!\cdots\!\int_0^\infty H(z_i;s_i)\,ds_n\cdots ds_2\,dz_1\cdots dz_n. \]

The idea from here on out is to evaluate this integral asymptotically using complex analysis. This will be done in several stages and by breaking the integral into several parts. To begin we first consider the interior integration

\[ \int_0^\infty\!\cdots\!\int_0^\infty H(z_i;s_i)\,ds_2\cdots ds_n. \]

Consider this as an integral over $R_1\cup R_2$, where $R_1$ is a union of disjoint sets, $R_1 = \cup_{i=2}^{n} U_i$ such that on $U_i$, $s_i$ is bounded away from 1 and where $R_2$ is the complement of $R_1$.

Lemma 9. Suppose that $-2\nu-1 < 0$. The integral of $H(z_i;s_i)$ over $U_i$ is bounded and the $z_i$ variables can be changed in such a way so that the integrated function is analytic in a particular $z$ variable to the left of the imaginary axis.

Proof. For convenience let $i = 2$ (although the proof is the same for any $i$) and let $z_1 + \dots + z_n = z_1'$ with the other variables remaining the same. Suppose that $\operatorname{Re} z_i = c$ for $i = 3,\dots,n$ and that $c > 0$. Suppose also that $\operatorname{Re} z_1' = b$ with $|b| < c$. We now refer to $z_1'$ as $z$. Our goal is to show that this integral is bounded and that as a function of $z$ is analytic to the left of the imaginary axis. By repeated application of Lemma 8, we can say that the integral is bounded by a constant times

\[
\begin{aligned}
\int_{|s_2-1|\ge B}\int_0^\infty\!\cdots\!\int_0^\infty
&\left|\frac{1-\left(\min\left(1,\frac{1}{s_2},\dots,\frac{1}{s_2\cdots s_n}\right)\right)^{z_1+\dots+z_n}}{z_1+z_2+\dots+z_n}\right| \\
&\times\prod_{i=3}^{n}(1+s_i)^{-2\nu-1}\,s_i^{(n-i)c}\,|1-s_i|^{c-1} \\
&\times s_2^{b-2c}(1+s_2)^{-2\nu-1}|1-s_2|^{b-1}|1-s_2\cdots s_n|^{c-1}\,ds_2\cdots ds_n.
\end{aligned}
\]

This is valid as long as $2\nu+1 > 0$ and $\operatorname{Re} z_i - 1/2 < 0$, which is the case here if we assume that $c$ is small enough. Next, we estimate

\[ \frac{1-\left(\min\left(1,\frac{1}{s_2},\dots,\frac{1}{s_2\cdots s_n}\right)\right)^{z_1+\dots+z_n}}{z_1+z_2+\dots+z_n} \]

by using the fact that $|1-x^z| \le \max|z|\,x^{\operatorname{Re} z'}|\ln x|$, where the max is taken over the $z'$ values on a line connecting 0 and $z$, $x$ between 0 and 1. Thus, $|1-x^z| \le K|z|\,x^{\operatorname{Re} z}x^{-\epsilon}$ for some positive $\epsilon$ chosen shortly. Inserting this in the integral we have that the integral is bounded by a constant times

\[
\begin{aligned}
\int_{|s_2-1|\ge B}\int_0^\infty\!\cdots\!\int_0^\infty
&\sum_{j=2}^{n}\{\max(1,\,(s_2\cdots s_j)^{\epsilon},\,(s_2\cdots s_j)^{-b+\epsilon})\} \\
&\times\prod_{i=3}^{n}(1+s_i)^{-2\nu-1}\,s_i^{(n-i)c}\,|1-s_i|^{c-1} \\
&\times s_2^{b-2c}(1+s_2)^{-2\nu-1}|1-s_2|^{b-1}|1-s_2\cdots s_n|^{c-1}\,ds_2\cdots ds_n.
\end{aligned}
\]

The reason for both terms in the "max" part of the integral is that $b$ could be either positive or negative. Now let's begin with the $s_n$ integration. Then the first interior integral has the form

\[ \int_0^\infty s_n^p(1+s_n)^{-2\nu-1}|1-s_n|^{c-1}|1-s_2\cdots s_n|^{-1+c}\,ds_n. \]

The value for $p$ is either $\pm a$, where $a = |b-\epsilon| < c$. The next step is to apply Lemmas 6 and 7. We use $\lambda = \delta = c$ and $p$ as above. The result is that this integral is bounded by a constant times

\[ |1-s_2\cdots s_{n-1}|^{c-1}\times\max(1,\,(s_2\cdots s_{n-1})^{-c},\,(s_2\cdots s_{n-1})^{-c-p},\,(s_2\cdots s_{n-1})^{-p}). \]

We collect powers and use the lemmas twice with respect to the $s_{n-1}$ integration and powers of $p = \pm(2c)$. At the next integration step the powers of $p = \pm 3c$ and so on until we arrive at the $s_2$ integration. Here we will have

\[ \int_{|s_2-1|\ge B} s_2^p\,|1-s_2|^q\,|1+s_2|^{-2\nu-1}\,ds_2, \]

where $p$ and $q$ are appropriate powers. These integrals satisfy all the conditions necessary for the lemmas as long as $c$ and $b$ are small enough. We will have at most $2^n$ integrals in this process. Hence the integral of $H$ over $U_i$ is analytic in the $z$ variable in a strip $|\operatorname{Re} z| < c$ by the application of Morera's Theorem and Fubini's Theorem.

We remark here that this proof also is easily modified to show that the interchange of integrals done at the beginning of the section is valid and the expression $C(\sigma)$ is the one of interest.

Lemma 10. Suppose that $\sigma$ has $[\nu]+2$ derivatives all in $L^1$ and that $-2\nu-1 < 0$. Then the integral

\[ \int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty} G(z_i)\int\!\cdots\!\int_{R_1} H(z_i;s_i)\,ds_2\cdots ds_n\,dz_1\cdots dz_n \]

is $O(\alpha^{-\delta})$, where $\delta > 0$.

Proof. Note that the condition in the hypothesis implies that

\[ \int_{c-i\infty}^{c+i\infty}|\hat\sigma(z)|\,|z|^{\nu+1/2} < \infty. \tag{32} \]

We first replace the inside integral with a sum of integrals over $U_i$. For each of these we change variables as in the last lemma. We can then perform the integration over the $z$ variable by moving it to a line to the left of the imaginary axis. Thus we have that each of these integrals is bounded by a constant times

\[
\begin{aligned}
\frac{\alpha^b}{\pi^n 2^b}\int_{b-i\infty}^{b+i\infty}\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty}
&\prod_{i=2}^{n}\frac{\hat\sigma(z_i)\,\Gamma(\nu+1-z_i/2)}{\Gamma(1+\nu)\,\Gamma(z_i/2)} \\
&\times\frac{\hat\sigma\bigl(z-\sum_{z_j\ne z}z_j\bigr)\,\Gamma\bigl(\nu+1-(z-\sum_{z_j\ne z}z_j)/2\bigr)}{\Gamma(1+\nu)\,\Gamma\bigl((z-\sum_{z_j\ne z}z_j)/2\bigr)}\,dz\,dz_2\cdots dz_n.
\end{aligned}
\]

This last integral is bounded by a product of integrals all of the form

\[ \int_{c-i\infty}^{c+i\infty}\left|\hat\sigma(z)\,\frac{\Gamma(\nu+1-z/2)}{\Gamma(z/2)}\right| dz, \]

and these in turn are bounded by (32) using the basic asymptotic properties of the Gamma function.

We now turn our attention to the region $R_2$. To begin we make another change of variables,

\[
\begin{aligned}
\frac{1}{s_2} &= 1 - s_2', &\text{(33)}\\
\frac{1}{s_2s_3} &= 1 - s_2' - s_3', &\text{(34)}\\
&\;\;\vdots &\text{(35)}\\
\frac{1}{s_2\cdots s_n} &= 1 - s_2' - s_3' - \dots - s_n'. &\text{(36)}
\end{aligned}
\]

Under the change of variables, the region $R_2$ is transformed to a region $R_3$ which can be assumed to be a symmetric region containing the origin, and where the sum $|s_2+\dots+s_j| \le a < 1$ (we drop the "primes" again) for some $a$. Notice that the exact form of $R_1$ was unnecessary in the previous computation. Thus the integral over $R_2$ is transformed to

\[ \int\!\cdots\!\int_{R_3} I(z_i;s_i)\,ds_2\cdots ds_n, \]

where

\[
\begin{aligned}
I(z_i;s_i) = {}&\frac{1-(1-\max(0,s_2,\dots,s_2+\dots+s_n))^{z_1+\dots+z_n}}{z_1+\dots+z_n} \\
&\times|s_2|^{z_1-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1} \\
&\times f(s_2,\dots,s_n,z_1,\dots,z_n),
\end{aligned}
\]

where the function $f$ is smooth in the $s$ variables.


The following lemmas will help keep track of the contribution of the $R_3$ integral.

Lemma 11. Suppose $\operatorname{Re} z_i = c$, $0 < c < 1$, for $i \ge 3$. Then the integral

\[ \int\!\cdots\!\int_{R_3}|s_2|^{z_1+1}|s_3|^{z_2-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1}\,ds_2\cdots ds_n \]

can be thought of as an analytic function in the $z_1$ variable that can be extended to a strip containing the imaginary axis.

Proof. First note that the following integral with $z$ and $w$ real and between zero and one satisfies

\[ \int_a^b |x|^{z-1}|x+y|^{w-1}\,dx \le A\,|y|^{z+w-1}, \]

where the constant only depends on the $z$ and $w$ variable. A repeated application of this estimate in the above integral yields a final integration of

\[ \int_a^b |s_2|^{\operatorname{Re} z_1+(n-2)c}\,ds_2. \]

Thus, once again the analytic continuation argument holds.

Lemma 12. Suppose $\operatorname{Re} z_i = c$, $0 < c < 1$, for $i \ge 2$ and $\operatorname{Re} z_1 = d$. Then the integral

\[ \int\!\cdots\!\int_{R_3}|s_2|^{z_1}|s_3|^{z_2}|s_4|^{z_3-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1}\,ds_2\cdots ds_n \]

can be thought of as an analytic function in the $z_1$ variable that can be extended to a strip containing the imaginary axis.

Proof. We begin the integration just as in the previous integral. After $n-3$ integrations we arrive at an integral with an estimate of the form

\[ \int_a^b\int_a^b |s_2|^d\,|s_3|^c\,|s_3+s_2|^{(n-3)c-1}\,ds_2\,ds_3. \]

We can estimate this by looking at three integrals

\[ \int_a^b\int_{-1}^{1}|s_2|^{d+(n-2)c}\,|s_3|^c\,|s_3+1|^{(n-3)c-1}\,ds_3\,ds_2, \]

\[ \int_a^b\int_{1}^{b/s_2}|s_2|^{d+(n-2)c}\,|s_3|^c\,|s_3+1|^{(n-3)c-1}\,ds_3\,ds_2, \]

and

\[ \int_a^b\int_{a/s_2}^{-1}|s_2|^{d+(n-2)c}\,|s_3|^c\,|s_3+1|^{(n-3)c-1}\,ds_3\,ds_2. \]

We can say, for example, that the last integral is less than a constant times

\[ \int_a^b |s_2|^{d-(n-4)c}\,ds_2, \]

and thus is finite for $\operatorname{Re} z_1$ in a strip about the imaginary axis. The other two integrals are handled in the same manner. So by our standard argument the analytic extension is defined.


Now let us return to our function $I(z_i;s_i)$. We can write the expression

\[ \frac{1-(1-\max(0,s_2,\dots,s_2+\dots+s_n))^{z_1+\dots+z_n}}{z_1+\dots+z_n} \]

as

\[ \max(0,s_2,\dots,s_2+\dots+s_n) + (\max(0,s_2,\dots,s_2+\dots+s_n))^2\times g(z_1+\dots+z_n,s_2,\dots,s_n), \]

where the last function is a bounded continuous function in the variables.

Lemma 13. The contribution of

\[
\begin{aligned}
\frac{-1}{(2\pi i)^n}\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty} G(z_i)\int\!\cdots\!\int_{R_2}
&(\max(0,s_2,\dots,s_2+\dots+s_n))^2 \\
&\times|s_2|^{z_1-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1} \\
&\times f(s_2,\dots,s_n,z_1,\dots,z_n)\,g(z_1+\dots+z_n,s_2,\dots,s_n) \\
&\times ds_n\cdots ds_2\,dz_1\cdots dz_n
\end{aligned}
\]

is $O(\alpha^{-\delta})$.

Proof. We simply consider the set where say $s_1+s_2+\dots+s_j$ is the maximum of the terms. We then expand the square so that we have a term of the form $s_is_k$. We then apply the above lemmas after an appropriate re-ordering of the variables and the lemma holds.

The next step is to replace the function $f$ in the expression for $I(z_i;s_i)$ with the first term of its Taylor expansion. This expansion gives an "extra" $s_i$ term (combined with the ones from $\max(0,s_2,\dots,s_2+\dots+s_n)$) in the estimates which, as the above lemmas show, is all we need to show that this part of the integral does not contribute in the asymptotic expansion. So we are finally at the one critical term that gives a contribution in the expansion. This term is

\[
\begin{aligned}
\frac{-1}{(2\pi i)^n}\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty} G(z_i)\int\!\cdots\!\int_{R_3}
&\max(0,s_2,\dots,s_2+\dots+s_n) \\
&\times|s_2|^{z_1-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1} \\
&\times f(0,0,\dots,0,z_1,\dots,z_n)\,ds_n\cdots ds_2\,dz_1\cdots dz_n.
\end{aligned}
\]

We can easily compute $f(0,0,\dots,0,z_1,\dots,z_n)$ to see that it equals

\[ \prod_{i=1}^{n}\frac{2^{-2\nu-1}\,\Gamma(2\nu+1)\,\Gamma(-z_i/2+1/2)}{\Gamma(\nu+1/2)\,\Gamma(\nu+1-z_i/2)}. \]

We can simplify further using the formula for $G(z_i)$ and the duplication formula for the Gamma function to arrive at

\[
\begin{aligned}
C(\sigma) = \frac{-1}{(2\pi i)^n}&\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty}(\alpha/2)^{z_1+\dots+z_n}\,\pi^{-n/2}\prod_{i=1}^{n}\frac{\hat\sigma(z_i)\,\Gamma(-z_i/2+1/2)}{\Gamma(z_i/2)} \\
&\times\int\!\cdots\!\int_{R_3}\max(0,s_2,\dots,s_2+\dots+s_n) \\
&\times|s_2|^{z_1-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1}\,ds_2\cdots ds_n\,dz_1\cdots dz_n + O(\alpha^{-\delta}).
\end{aligned} \tag{37}
\]

Notice that this expression is now independent of $\nu$. Our final steps are to compute the contribution from the above integral and we, by the way, finally have an integral which will yield a contribution. We begin with a well-known identity due to Mark Kac, which was used originally to prove the continuous analogue of the Strong Szegő Limit Theorem. It reads

\[ \sum_\sigma \max(0,\,a_{\sigma 1},\,a_{\sigma 1}+a_{\sigma 2},\,\dots,\,a_{\sigma 1}+\dots+a_{\sigma n}) = \sum_\sigma\sum_{k=1}^{n} a_{\sigma 1}\,\theta(a_{\sigma 1}+\dots+a_{\sigma k}), \]

where $\theta(x) = 1$ if $x > 0$ and $\theta(x) = 0$ otherwise and the sums are taken over all permutations in $n$ variables. Because of this identity we can rewrite the integral in (37) as

\[
\begin{aligned}
\frac{-1}{(2\pi i)^n}&\int_{c-i\infty}^{c+i\infty}\!\cdots\!\int_{c-i\infty}^{c+i\infty}(\alpha/2)^{z_1+\dots+z_n}\,\pi^{-n/2}\sum_{j=2}^{n}\prod_{i=1}^{n}\frac{\hat\sigma(z_i)\,\Gamma(-z_i/2+1/2)}{\Gamma(z_i/2)} \\
&\times\int\!\cdots\!\int_{R_3\cap\{s_2+\dots+s_j>0\}} s_2\,|s_2|^{z_1-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1}\,ds_2\cdots ds_n\,dz_1\cdots dz_n.
\end{aligned} \tag{38}
\]

It is straightforward to see how this identity can be used if the integrand is symmetric in the variables. In our case, the integrand is not obviously symmetric in the variables, but can always be made so by changing the $z$ variables. Thus we can apply the identity. We once again consider the inner integral and call $z = z_1+\dots+z_n$ leaving the other variables as is, and show how this inner integral can be thought of as analytic in $z$ in a strip containing the imaginary axis. The difference is that in this case there will be a pole at $z = 0$. Now we suppose that $j > 2$. For $j = 2$ the following computation is almost identical and the conclusion is the same. Let us rewrite the inner integral in (38) as two integrals

\[
\begin{aligned}
&\int_0^b\int\!\cdots\!\int_B s_2\,|s_2|^{z-z_2-z_3-\dots-z_n-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1}\,ds_n\cdots ds_2 \\
&\quad+\int_{-b}^0\int\!\cdots\!\int_B s_2\,|s_2|^{z-z_2-z_3-\dots-z_n-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1}\,ds_n\cdots ds_2,
\end{aligned}
\]

where $B$ is some $n-2$ dimensional set. In the first (the computations for the second integral being almost identical) of these we make yet another change of variables:

\[ s_3 = s_3's_2', \qquad \dots, \qquad s_n = s_n's_2' \]

to arrive at

\[ \int_0^b s_2^{z-1}\int\!\cdots\!\int_{B/s_2}|s_3|^{z_2-1}\cdots|s_n|^{z_{n-1}-1}\,|1+s_3+\dots+s_n|^{z_n-1}\,ds_n\cdots ds_3\,ds_2. \]
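The combinatorial identity of Kac quoted above is easy to test by brute force for small $n$. The following sketch is an illustration only: it sums both sides over all permutations of a short list of integers (so the comparison is exact), and the function names `kac_lhs` and `kac_rhs` are hypothetical labels chosen here.

```python
from itertools import accumulate, permutations


def kac_lhs(a):
    # Sum over permutations of max(0, first partial sum, ..., last partial sum).
    return sum(max(0, *accumulate(p)) for p in permutations(a))


def kac_rhs(a):
    # Sum over permutations of a_{sigma 1} * #{k : a_{sigma 1} + ... + a_{sigma k} > 0}.
    total = 0
    for p in permutations(a):
        positive_partial_sums = sum(1 for s in accumulate(p) if s > 0)
        total += p[0] * positive_partial_sums
    return total


if __name__ == "__main__":
    for a in [(3, -1), (1, -2, 3), (2, -5, 1, 4), (-1, -1, 3, 0, 2)]:
        print(a, kac_lhs(a), kac_rhs(a))
```

The identity holds for the sum over all orderings, not permutation by permutation, which is why the text needs the integrand to be made symmetric before it can be applied.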


The original set $R_3$ was chosen to be symmetric and contain the origin. So here we choose it to be something convenient, say a cube $C$ with side length $l$. With this choice we can write $B/s_2$ as $C/s_2\cap\{s_3+\dots+s_n+1 > 0\}$. Next integrate by parts with respect to the $s_2$ variable. The result is that the above integral becomes:

\[ s_2^z\,k(s_2) - \int_0^b s_2^z\,\frac{d}{ds_2}(k(s_2))\,ds_2, \]

where

\[ k(s_2) = \int\!\cdots\!\int_{B/s_2}|s_3|^{z_2-1}\cdots|s_n|^{z_{n-1}-1}\,|1+s_3+\dots+s_n|^{z_n-1}\,ds_n\cdots ds_3. \]

The function $k(s_2)$ has a derivative given by the formula

\[ k'(s_2) = -s_2^{-1}\int_D f(s_3,\dots,s_n)\,\bigl(n\cdot s_2^{-1}(s_3,\dots,s_n)\bigr)\,dS, \]

where $D$ is the boundary of the set $C/s_2$ which lies in the half-space defined by $\{s_3+\dots+s_n+1 > 0\}$, the vector $n$ is the outward normal to the surface, the function $f$ is simply the one given in the above integral restricted to the surface, and $dS$ is surface measure. We can estimate the derivative of $k(s_2)$ on any boundary edge to be at most a constant times $s_2^{(n-2)c}$ for $\operatorname{Re} z_i = c$. Thus we have proved the following:

Lemma 14. The function of $z$ defined by

\[
\begin{aligned}
&\int_0^b\int\!\cdots\!\int_B s_2\,|s_2|^{z-z_2-z_3-\dots-z_n-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1}\,ds_n\cdots ds_2 \\
&\quad+\int_{-b}^0\int\!\cdots\!\int_B s_2\,|s_2|^{z-z_2-z_3-\dots-z_n-1}\cdots|s_n|^{z_{n-1}-1}\,|s_2+\dots+s_n|^{z_n-1}\,ds_n\cdots ds_2
\end{aligned}
\]

is analytic in a strip containing the imaginary axis except at the point $z = 0$. Further, the contribution of this integral with the $z$ integration moved to a line to the left of the axis is given by the residue at $z = 0$ plus $O(\alpha^{-\delta})$.

We note here that there are no other poles given our conditions on $\sigma$, (32) and the formula for $G(z_i)$. For $j > 2$, the above computation also shows exactly what the residue is, namely:

\[
\begin{aligned}
&\int\!\cdots\!\int_{\mathbb{R}^{n-2}\cap\{s_3+\dots+s_j>-1\}}|s_3|^{z_2-1}\cdots|s_n|^{z_{n-1}-1}\,|1+s_3+\dots+s_n|^{z_n-1}\,ds_n\cdots ds_3 \\
&\quad-\int\!\cdots\!\int_{\mathbb{R}^{n-2}\cap\{s_3+\dots+s_j>-1\}}|s_3|^{z_2-1}\cdots|s_n|^{z_{n-1}-1}\,|-1+s_3+\dots+s_n|^{z_n-1}\,ds_n\cdots ds_3.
\end{aligned}
\]

To find an explicit formula for this integral we start with the following formula that can be easily proved using formulas for the Beta function. For $0 < \operatorname{Re} p$, $\operatorname{Re} q < 1$, $\operatorname{Re}(p+q) < 1$,

\[ \int_{-\infty}^{\infty}|x|^{p-1}|x+y|^{q-1}\,dx = |y|^{p+q-1}\,\frac{2\,\Gamma(p)\,\Gamma(q)\cos(\pi p/2)\cos(\pi q/2)}{\Gamma(p+q)\cos((p+q)\pi/2)}. \tag{39} \]

Define $t(p,q)$ to be

\[ \frac{2\,\Gamma(p)\,\Gamma(q)\cos(\pi p/2)\cos(\pi q/2)}{\Gamma(p+q)\cos((p+q)\pi/2)}. \]

The residue is then ($B$ is the Beta function)

\[ B(z_2+\dots+z_{j-1},\,z_j+\dots+z_n)\prod_{k=j}^{n-1}t(z_k,\,z_n+\dots+z_{k+1})\prod_{k=2}^{j-2}t(z_k,\,z_{k+1}+\dots+z_{j-1}). \]

We leave this as an exercise to the reader. For $j = 2$ the residue can also be easily computed using the definition of $t(p,q)$ and it is seen to be

\[ \prod_{k=2}^{n-1}t(z_k,\,z_n+\dots+z_{k+1}). \]

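Combining all of the above results we are left with the following theorem.

Theorem 15. Suppose $\sigma$ has $[\nu]+2$ continuous derivatives in $L^1$. Then

\[ \operatorname{tr}(B_\alpha(\sigma))^n = \operatorname{tr} B_\alpha(\sigma^n) + C(\sigma), \]

where

\[ C(\sigma) = \frac{-1}{\pi^2}\sum_{j=1}^{n-1}\frac{1}{j}\int_0^\infty x\,C(\sigma^j)(x)\,C(\sigma^{n-j})(x)\,dx + o(1). \]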

Proof. Recall we were computing the integral Z c+i∞ Z c+i∞ n n X Y −1 σ(z ˆ i )0(zi /2 + 1/2) z1 +...+zn −n/2 . . . (α/2) π (2πi)n c−i∞ 0(zi /2) c−i∞ j=2

Z Z × ...

1

s2 |s2 |z1 −1 . . . |sn |zn−1 −1 |s2 +. . .+sn |zn −1 ds2 . . . dsn dz1 . . . . dzn .

R3 ∩{s2 +...+sj >0}

(40)

For each $j$ we rename the variables and compute the residue as above. For $j > 2$ the residue is

\[
\frac{-1}{(2\pi i)^{n-1}}\int_{c-i\infty}^{c+i\infty}\!\!\cdots\int_{c-i\infty}^{c+i\infty} \pi^{-n/2}\prod_{i=2}^{n}\frac{\hat\sigma(z_i)\,\Gamma(-z_i/2+1/2)}{\Gamma(z_i/2)}
\times \frac{\hat\sigma(-z_2-\dots-z_n)\,\Gamma((z_2+\dots+z_n)/2+1/2)}{\Gamma((-z_2-\dots-z_n)/2)}\,B(z_2+\dots+z_{j-1},\, z_j+\dots+z_n)
\]
\[
\times \prod_{k=j}^{n-1} t(z_k,\, z_n+\dots+z_{k+1})\,\prod_{k=2}^{j-2} t(z_k,\, z_{k+1}+\dots+z_{j-1})\,dz_2\cdots dz_n . \tag{41}
\]

Notice that, directly from the definition of $t$,

\[
t(p,q)\,t(p+q,r) = 2^2\,\frac{\Gamma(p)\Gamma(q)\Gamma(r)\cos(\pi p/2)\cos(\pi q/2)\cos(\pi r/2)}{\Gamma(p+q+r)\cos((p+q+r)\pi/2)} .
\]
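The product identity for $t$ is a one-line consequence of the definition (the factor $\Gamma(p+q)\cos((p+q)\pi/2)$ cancels), and it can be sanity-checked numerically. The following is only an illustrative sketch: the function names and the sample values of $p, q, r$ are assumptions chosen for the check, not part of the paper.

```python
import math

def t(p, q):
    """t(p, q) as defined after formula (39), for real arguments."""
    return (2 * math.gamma(p) * math.gamma(q)
            * math.cos(math.pi * p / 2) * math.cos(math.pi * q / 2)
            / (math.gamma(p + q) * math.cos(math.pi * (p + q) / 2)))

def product_rhs(p, q, r):
    """Right-hand side of the identity for t(p, q) * t(p + q, r)."""
    s = p + q + r
    return (4 * math.gamma(p) * math.gamma(q) * math.gamma(r)
            * math.cos(math.pi * p / 2)
            * math.cos(math.pi * q / 2)
            * math.cos(math.pi * r / 2)
            / (math.gamma(s) * math.cos(math.pi * s / 2)))

# Sample real values with p + q + r < 1 (hypothetical test point).
p, q, r = 0.2, 0.3, 0.25
lhs = t(p, q) * t(p + q, r)
rhs = product_rhs(p, q, r)
print(lhs, rhs)
```

Iterating the same cancellation is what collapses the products of $t$ factors in (41) into a single ratio of Gamma functions and cosines.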

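Using this identity in (41) we have that the above integral is

\[
\frac{-1}{(2\pi i)^{n-1}}\int_{c-i\infty}^{c+i\infty}\!\!\cdots\int_{c-i\infty}^{c+i\infty} 2^{n-3}\pi^{-n/2}\prod_{i=2}^{n}\frac{\hat\sigma(z_i)\,\Gamma(z_i)\cos(z_i\pi/2)\,\Gamma(-z_i/2+1/2)}{\Gamma(z_i/2)}
\]
\[
\times \frac{\hat\sigma(-z_2-\dots-z_n)\,\Gamma((z_2+\dots+z_n)/2+1/2)}{\Gamma\!\left(\frac{-z_2-\dots-z_n}{2}\right)\Gamma(z_2+\dots+z_n)\cos((z_2+\dots+z_{j-1})\pi/2)\cos((z_j+\dots+z_n)\pi/2)}\,dz_2\cdots dz_n . \tag{42}
\]

From the duplication formula for the Gamma function, this can be simplified to

\[
\frac{-1}{(2\pi i)^{n-1}}\int_{c-i\infty}^{c+i\infty}\!\!\cdots\int_{c-i\infty}^{c+i\infty} 2^{-2}\pi^{-1}\prod_{i=2}^{n}\hat\sigma(z_i)
\times \hat\sigma(-z_2-\dots-z_n)\,(z_2+\dots+z_n)\,\frac{\sin((z_2+\dots+z_n)\pi/2)}{\cos((z_2+\dots+z_{j-1})\pi/2)\cos((z_j+\dots+z_n)\pi/2)}\,dz_2\cdots dz_n . \tag{43}
\]

Now we change variables with $z_{j-1} = z_2+\dots+z_{j-1}$, $z_n = z_j+\dots+z_n$, and the above integral becomes

\[
\frac{-1}{(2\pi i)^{n-1}}\int_{c-i\infty}^{c+i\infty}\!\!\cdots\int_{c-i\infty}^{c+i\infty} 2^{-2}\pi^{-1}\Big(\prod_{i=2}^{j-2}\hat\sigma(z_i)\Big)\hat\sigma(z_{j-1}-\dots-z_2)\Big(\prod_{i=j}^{n-1}\hat\sigma(z_i)\Big)\hat\sigma(z_n-\dots-z_j)
\]
\[
\times \hat\sigma(-z_{j-1}-z_n)\,(z_{j-1}+z_n)\,\frac{\sin((z_{j-1}+z_n)\pi/2)}{\cos(z_{j-1}\pi/2)\cos(z_n\pi/2)}\,dz_2\cdots dz_n . \tag{44}
\]

The convolution theorem for the Mellin transform shows that this can be reduced to the integral

\[
\frac{-1}{(2\pi i)^2}\int_{c-i\infty}^{c+i\infty}\!\int_{c-i\infty}^{c+i\infty} 2^{-2}\pi^{-1}\,\widehat{\sigma^{j-2}}(z_{j-1})\,\widehat{\sigma^{n-j+1}}(z_n)\,\hat\sigma(-z_{j-1}-z_n)\,(z_{j-1}+z_n)\,\frac{\sin((z_{j-1}+z_n)\pi/2)}{\cos(z_{j-1}\pi/2)\cos(z_n\pi/2)}\,dz_{j-1}\,dz_n . \tag{45}
\]

Since $\sin(a+b)/(\cos a\cos b) = \tan a + \tan b$, this can also be written as

\[
\frac{-1}{(2\pi i)^2}\int_{c-i\infty}^{c+i\infty}\!\int_{c-i\infty}^{c+i\infty} 2^{-2}\pi^{-1}\,\widehat{\sigma^{j-2}}(z_{j-1})\,\widehat{\sigma^{n-j+1}}(z_n)\,\hat\sigma(-z_{j-1}-z_n)\,(z_{j-1}+z_n)\,\frac{\sin(z_{j-1}\pi/2)}{\cos(z_{j-1}\pi/2)}\,dz_{j-1}\,dz_n \tag{46}
\]
\[
{}-\frac{1}{(2\pi i)^2}\int_{c-i\infty}^{c+i\infty}\!\int_{c-i\infty}^{c+i\infty} 2^{-2}\pi^{-1}\,\widehat{\sigma^{j-2}}(z_{j-1})\,\widehat{\sigma^{n-j+1}}(z_n)\,\hat\sigma(-z_{j-1}-z_n)\,(z_{j-1}+z_n)\,\frac{\sin(z_n\pi/2)}{\cos(z_n\pi/2)}\,dz_{j-1}\,dz_n . \tag{47}
\]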


Before we proceed further we need three formulas from the theory of Mellin transforms. These are

\[
\text{the Mellin transform of } \int_x^\infty \phi(t)\,dt = z^{-1}\Phi(z+1),
\]

where $\Phi$ is the transform of $\phi$,

\[
\text{the Mellin transform of } x\phi'(x) = -z\,\Phi(z),
\]

where $\Phi$ is the transform of $\phi$, and finally

\[
\frac{2}{\pi}\int_0^\infty x\,C(\phi)(x)\,C(\psi)(x)\,dx = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\Phi(z)\,\Psi(-z)\,z\tan(z\pi/2)\,dz .
\]

These can be found in any standard table of transforms, although the third requires a straightforward computation combined with the convolution theorem. So now we apply the second formula along with convolution with respect to the $z_n$ variable and we have for each $2 < j < n$,

\[
\frac{1}{8\pi^2 i}\int_{c-i\infty}^{c+i\infty}\widehat{\sigma^{j-2}}(z_{j-1})\,\widehat{x\sigma^{n-j+1}\sigma'}(-z_{j-1})\,\frac{\sin(z_{j-1}\pi/2)}{\cos(z_{j-1}\pi/2)}\,dz_{j-1} \tag{48}
\]
\[
{}+\frac{1}{8\pi^2 i}\int_{c-i\infty}^{c+i\infty}\widehat{\sigma^{n-j+1}}(z_{j-1})\,\widehat{x\sigma^{j-2}\sigma'}(-z_{j-1})\,\frac{\sin(z_{j-1}\pi/2)}{\cos(z_{j-1}\pi/2)}\,dz_{j-1} . \tag{49}
\]

Next apply the first formula after inserting a factor of $z_{j-1}/z_{j-1}$ to write the above as

\[
\frac{1}{2\pi^2}\int_0^\infty x\,C(\sigma^{j-2})(x)\,C\Big(\int_x^\infty \sigma^{n-j+1}\sigma'\Big)(x)\,dx \tag{50}
\]
\[
{}+\frac{1}{2\pi^2}\int_0^\infty x\,C(\sigma^{n-j+1})(x)\,C\Big(\int_x^\infty \sigma^{j-2}\sigma'\Big)(x)\,dx \tag{51}
\]

or, since $\int_x^\infty \sigma^{m}\sigma' = -\sigma^{m+1}(x)/(m+1)$,

\[
\frac{-1}{2\pi^2}\,\frac{1}{n-j+2}\int_0^\infty x\,C(\sigma^{j-2})(x)\,C(\sigma^{n-j+2})(x)\,dx \tag{52}
\]
\[
{}+\frac{-1}{2\pi^2}\,\frac{1}{j-1}\int_0^\infty x\,C(\sigma^{j-1})(x)\,C(\sigma^{n-j+1})(x)\,dx . \tag{53}
\]

We can do the $j = 2$, $j = n$ cases separately just as easily (the above formulas are not even all required in that case) and putting the two cases together and reindexing when necessary we arrive at the conclusion of the theorem.

Our final step is to extend this to functions other than powers. The standard uniformity arguments used in the Wiener-Hopf theory apply here if we can show that

\[
\|\operatorname{tr} f(B_\alpha(\sigma)) - \operatorname{tr} B_\alpha(f(\sigma))\|_1 = O(1)
\]

uniformly for $\sigma$ replaced by $1 - \lambda + \lambda\sigma$ and $\lambda$ in some complex neighborhood of $[0, 1]$. The details of this are found in [14]. The norm above is the trace norm. Given sufficient analyticity conditions on $f$, it is only necessary to prove

\[
\|B_\alpha(\sigma_1)B_\alpha(\sigma_2) - B_\alpha(\sigma_1\sigma_2)\|_1 = O(1),
\]

where the $O(1)$ here depends on properties of $\sigma_i$. A trace norm of a product can always be estimated by the product of two Hilbert-Schmidt norms, and in this case we need to estimate the Hilbert-Schmidt norm of the operator with kernel


\[
\chi_{(1,\infty)}(z)\int_0^\infty \sigma_i(t/\alpha)\,\sqrt{xz}\;t\,J_\nu(xt)\,J_\nu(tz)\,dt .
\]

Using integration by parts and integration formulas for Bessel functions, this is easily estimated to be bounded. For analogous details see [14]. Thus for suitably defined $f$ we can extend our previous theorem to the more general case. The $f$ of interest is $\log(1 + z)$. This will satisfy the necessary analyticity conditions if we consider small enough $k$. The necessary conditions are collected in the following:

Theorem 16. Suppose $f$ is a real-valued function with $[\nu] + 2$ derivatives all contained in $L^1$. Then for sufficiently small $k$ (say $k < \|\sigma\|_\infty^{-1}$)

\[
\check\phi(k) \sim \exp\Big(\frac{\alpha}{\pi}\int_0^\infty ik f(x)\,dx - \frac{ik\nu}{2}\,f(0) - \frac{k^2}{2\pi^2}\int_0^\infty x\,C(f)^2(x)\,dx\Big).
\]

Proof. The form of the answer follows from the computation of the mean given earlier and from the fact that the constant term in the previous theorem is exactly half of the answer in Szegő's Theorem. Thus the above answer for the log function must be half as well.

Acknowledgement. The author would like to thank both Craig Tracy and Harold Widom for many useful and helpful conversations.

References

1. Basor, E. L., Tracy, C. A.: Variance calculations and the Bessel kernel. J. Stat. Phys. 73 (1993)
2. Basor, E. L., Widom, H.: Toeplitz and Wiener-Hopf determinants with piecewise continuous symbols. J. Funct. Anal. 50, 387–413 (1983)
3. Beenakker, C. W. J.: Universality in the random-matrix theory of quantum transport. Phys. Rev. Lett. 70, 1155–1158 (1993)
4. Böttcher, A., Silbermann, B.: Analysis of Toeplitz Operators. Berlin: Springer, 1990
5. Gohberg, I. C., Krein, M. G.: Introduction to the Theory of Linear Nonselfadjoint Operators. Vol. 18, Translations of Mathematical Monographs, Providence, RI: Amer. Math. Soc., 1969
6. Johansson, K.: On Fluctuations of Eigenvalues of Random Hermitian Matrices. Preprint
7. Kac, M.: Toeplitz matrices, translation kernels, and a related problem in probability theory. Duke Math. J. 21, 501–509 (1954)
8. Mehta, M. L.: Random Matrices. San Diego: Academic Press, 1991
9. Sarnak, P.: Arithmetic quantum chaos. Preprint
10. Stone, A. D., Mello, P. A., Muttalib, K. A., Pichard, J.-L.: Random theory and maximum entropy models for disordered conductors. In: Mesoscopic Phenomena in Solids, eds. B. L. Altshuler, P. A. Lee, and R. A. Webb, Amsterdam: North-Holland, 1991, Ch. 9, pp. 369–448
11. Unterberger, A., Unterberger, J.: La série discrète de SL(2, R) et les opérateurs pseudo-différentiels sur une demi-droite. Ann. Scient. Ec. Norm. Sup. 4 série, 17, 83–116 (1984)
12. Tracy, C. A., Widom, H.: Introduction to random matrices. In: Proc. 8th Scheveningen Conf., Springer Lecture Notes in Physics, 1993
13. Tracy, C. A., Widom, H.: Level spacing distributions and the Bessel kernel. Commun. Math. Phys. 161, 289–309 (1994)
14. Widom, H.: Szegő's limit theorem: The higher-dimensional matrix case. J. Funct. Anal. 39, 182–198 (1980)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 188, 351 – 365 (1997)

Communications in

Mathematical Physics c Springer-Verlag 1997

Coadjoint Orbits of Central Extensions of Gauge Groups

Jean-Luc Brylinski*

Department of Mathematics, Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]

Received: 3 October 1996 / Accepted: 17 January 1997

* This research was supported in part by NSF grants DMS-9203517 and DMS-9504522.

Abstract: We study geometrically the coadjoint orbits of the central extensions of gauge groups over arbitrary manifolds. We show that these orbits are classified by a dimension one foliation with a transverse measure, together with a leafwise connection. For the case of a two-dimensional torus with standard trivial foliation, we show that the holonomies along the leaves give a complete invariant for the regular coadjoint orbits. We investigate in detail the Kronecker foliation of a torus using a new construction which we call asymptotic holonomy. We give a description of a large class of integral orbits and construct polarizations for many orbits. Finally, we use continuous tensor products to investigate the problem of quantizing the orbits. We argue that the representation obtained by geometric quantization could only be unitary with respect to an indefinite hermitian form.

Introduction

The theory of loop groups and their central extensions is now quite well understood, due to the work of Kac and Peterson [K-P, Ka] and of Pressley and Segal [P-S]. The coadjoint orbits for the central extension have a nice interpretation in terms of gauge equivalence classes of connections. I. Frenkel has established a natural correspondence between unitary representations and the integral coadjoint orbits [Fr], in conformity with the orbit method and with geometric quantization.

For $G$ a simple simply-connected compact Lie group and $M$ a smooth manifold, there are few known representations of the gauge group $\mathrm{Map}(M, G)$. Pressley and Segal constructed a central extension of $\mathrm{Map}(M, G)$ by an abelian group whose Lie algebra is the quotient $A^1(M)/dA^0(M)$ of the 1-forms on $M$ by the exact 1-forms. The recent work of Etingof and I. Frenkel [E-F] for 2-dimensional $M$ has established a beautiful connection between certain coadjoint orbits of the complexified central extension and holomorphic bundles over Riemann surfaces.

In the present paper, we work over the (non-complexified) central extension. One main result is that coadjoint orbits correspond to a (possibly singular) one-dimensional foliation on $M$ equipped with a connection along the leaves. The gauge group acts by fiberwise gauge transformations. To find invariants of coadjoint orbits, we are then led to study holonomy along the leaves. When there are closed leaves, one can simply take holonomy along them. In some cases (cf. Theorem 1) this leafwise holonomy is a fine enough invariant to distinguish coadjoint orbits. For the Kronecker foliation of the two-torus, some interesting invariants can be constructed, measuring a sort of asymptotic leafwise holonomy.

The results of Sect. 4 and Sect. 5 deal with $M = S^1 \times X$ equipped with the foliation for which the circles $S^1 \times \{x\}$ are the leaves. In Sect. 4, we use a Borel subgroup of $G_{\mathbb{C}}$ to construct polarizations of some coadjoint orbits associated to a weight of a Cartan subalgebra, and in Sect. 5 we give a criterion for the integrality of these orbits (Theorem 5). In Sect. 6 we study the problem of geometric quantization for these orbits, based on the notion of continuous tensor product of Hilbert spaces in the sense of Araki and Woods. Although we have a proposal for a Lie algebra representation, we find some obstruction to the unitarity of the representations.

One motivation for this work was to investigate the possible extension of non-abelian holonomy from circles to higher-dimensional manifolds. It was a surprise to find that the notion of holonomy which arises out of the study of central extensions of gauge groups is in fact again one-dimensional, the passage from dimension $n$ to dimension 1 being accomplished by the foliation.

1. The Smooth Dual of the Central Extension

Let $M$ be a smooth closed oriented manifold of dimension $n$ and let $G$ be a simple compact Lie group with Lie algebra $\mathfrak{g}$. Pick an invariant bilinear form $(\ |\ )$ on $\mathfrak{g}$.
Let $\mathrm{Map}(M, G)$ be the Fréchet Lie group consisting of smooth maps $M \to G$; the Lie algebra of $\mathrm{Map}(M, G)$ is the Lie algebra $\mathrm{Map}(M, \mathfrak{g})$ comprised of the smooth maps $M \to \mathfrak{g}$. There exists a universal central extension $\widetilde{\mathrm{Map}}(M, \mathfrak{g})$ of $\mathrm{Map}(M, \mathfrak{g})$:

\[
0 \to A^1(M)/dA^0(M) \to \widetilde{\mathrm{Map}}(M, \mathfrak{g}) \to \mathrm{Map}(M, \mathfrak{g}) \to 0, \tag{1}
\]

where $A^j(M)$ denotes the vector space of real $j$-forms on $M$. This central extension was already described by S. Bloch [Bl] in the context of algebraic varieties. It corresponds to the Lie algebra 2-cocycle on $\mathrm{Map}(M, \mathfrak{g})$ with values in $A^1(M)/dA^0(M)$:

\[
\omega(\xi, \eta) = (\xi\,|\,d\eta) \in A^1(M)/dA^0(M).
\]

We will identify $\widetilde{\mathrm{Map}}(M, \mathfrak{g})$ with $\mathrm{Map}(M, \mathfrak{g}) \oplus A^1(M)/dA^0(M)$, with the bracket

\[
[(\xi, \alpha), (\eta, \beta)] = ([\xi, \eta],\, \omega(\xi, \eta)). \tag{2}
\]

Note that $\mathfrak{g}$ may be viewed as a Lie subalgebra of $\mathrm{Map}(M, \mathfrak{g})$, comprised of the constant maps $M \to \mathfrak{g}$. Since the cocycle $\omega$ restricts to the trivial cocycle on $\mathfrak{g}$, there is a natural splitting $\mathfrak{g} \to \widetilde{\mathrm{Map}}(M, \mathfrak{g})$ of the restriction to $\mathfrak{g}$ of the central extension. There is an adjoint action of $\mathrm{Map}(M, G)$ on the central extension $\widetilde{\mathrm{Map}}(M, \mathfrak{g})$. It is given by

\[
\mathrm{Ad}(g)\cdot(\xi, \alpha) = (\mathrm{Ad}(g)\xi,\, -(g^{-1}dg\,|\,\xi)). \tag{3}
\]

The continuous dual $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*$ can be identified as a vector space with the direct sum $[C^n(M) \otimes \mathfrak{g}^*] \oplus C^{n-1}_{cl}(M)$, where $C^j(M)$ denotes the space of degree $j$ currents on $M$, which is the dual of the space $A^{n-j}(M)$, and $C^j_{cl}(M)$ denotes the space of closed degree $j$ currents. We identify $\mathfrak{g}^*$ with $\mathfrak{g}$ using the bilinear form $(\ |\ )$. The coadjoint action of $\mathrm{Map}(M, G)$ on $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*$ is then given by

\[
\mathrm{Ad}^*(g)\cdot(\nu, \beta) = (\mathrm{Ad}(g)\cdot\nu - [dg\cdot g^{-1}]\wedge\beta,\ \beta). \tag{4}
\]

We will be interested in the smooth part $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$ of the dual $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*$, which is the direct sum

\[
\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm} = [A^n(M) \otimes \mathfrak{g}] \oplus A^{n-1}(M)_{cl}. \tag{5}
\]

Clearly $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$ is stable under the coadjoint action of $\mathrm{Map}(M, G)$.

2. Coadjoint Orbits, Measured Foliations and Leafwise Holonomy

We will investigate the geometry of the orbits of $\mathrm{Map}(M, G)$ in $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$; we will call these orbits the smooth coadjoint orbits. Recall that in the case of $M = S^1$, the function $\beta$ has zero differential, so is just a constant $\lambda$. In the "generic case" $\lambda \neq 0$ we can rescale the orbit so as to achieve $\lambda = 1$. Then $\nu$ is a $\mathfrak{g}$-valued 1-form, and the coadjoint action of $g \in \mathrm{Map}(S^1, G) = LG$ transforms $(\nu, 1)$ into $(\mathrm{Ad}(g)\nu - dg\cdot g^{-1}, 1)$. The affine action of $g$ on $\nu$ thus gives the action of a gauge transformation on the potential $\nu$. So the classification of smooth coadjoint orbits amounts to the classification of smooth connections on $S^1$. It is a well-known and elementary fact that the orbits are then classified by the conjugacy class of the holonomy $H \in G$ of the connection.

We will see that the classification of the smooth coadjoint orbits for general $M$ leads to studying 1-dimensional singular foliations with a transverse measure. First we note that the closed $(n-1)$-form $\beta$ is an invariant of the coadjoint orbit. So we should study $\beta$. First we focus on the case where $\beta$ is nowhere vanishing.

Lemma 1. Let $\beta$ be a closed $(n-1)$-form on the smooth manifold $M$ which is nowhere vanishing. Then there exists a smooth 1-dimensional foliation $F$ of $M$ characterized by the fact that a vector field $v$ over some open set belongs to $F$ if and only if $i(v)\beta = 0$. Furthermore $\beta$ induces a transverse measure for $F$ which is invariant under the transverse holonomy groupoid. Then any orientation of $M$ induces an orientation of $F$.

The existence of this 1-dimensional foliation then directs us to the natural notion of holonomy adapted to the situation. Assume that $C \subset M$ is a compact leaf. Then we can perform the following construction of a connection on $C$. Recall the following linear algebra lemma:

Lemma 2. With the notations of Lemma 1, for any $n$-form on $M$ there is a unique 1-form $\alpha$ defined along $F$ whose wedge product $\alpha \wedge \beta$ equals that $n$-form.

Recall that the differential graded algebra $A^\bullet(M)_F$ of differential forms along the leaves of $F$ is the quotient of $A^\bullet(M)$ by the differential graded ideal generated by the differential forms $\alpha$ such that $i(v)\cdot\alpha = 0$ for a vector field $v$ tangent to $F$. Of course $A^p(M)_F = 0$ for $p \neq 0, 1$. The differential of $A^\bullet(M)_F$ is denoted by $d_F$.


It then follows formally that for $\nu$ any $\mathfrak{g}$-valued $n$-form on $M$, there exists a unique $\mathfrak{g}$-valued 1-form $A$ defined along $F$ such that $A \wedge \beta = \nu$. We think of $A$ as a connection along the leaves of $F$. Then $A$ induces a connection on each leaf of $F$, in particular on $C$. We can then define $H_{(\nu,\beta)}(C)$ to be the holonomy of the connection $A$ around $C$. We then have:

Lemma 3. Let $(\nu, \beta) \in \widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$, and let $g \in \mathrm{Map}(M, G)$. Then the connection along the leaves of $F$ associated with $\mathrm{Ad}^*(g)(\nu, \beta)$ is equal to $\mathrm{Ad}(g)A - d_F g\cdot g^{-1}$.

It follows that the holonomy of the connection associated to $\mathrm{Ad}^*(g)(\nu, \beta)$ is equal (up to conjugacy) to the holonomy of the connection $A$. Therefore the conjugacy class of the holonomy $H_{(\nu,\beta)}(C)$ is an invariant of the coadjoint orbit of $(\nu, \beta)$.

Theorem 1. Let $M = S^1 \times S^1$ with coordinates $t$ (mod 1) and $u$ (mod 1). Let $\beta = du$, and let $A = f(t, u)\,dt$, $B = h(t, u)\,dt$, where $f$, $h$ are doubly periodic $\mathfrak{g}$-valued functions. Assume that for each $a \in [0, 1]$ the holonomies of the connections $A$ and $B$ around the circle $u = a$ are regular elements of $G$ which are conjugate to each other. Then $(A \wedge du, du)$ and $(B \wedge du, du)$ belong to the same coadjoint orbit in $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*$.

Proof. Denote by $H_A(y)$ resp. $H_B(y)$ the holonomies of the connections $A$ resp. $B$ around the loop $t \in [0, 1] \mapsto (t, y)$. By assumption, for each $y \in [0, 1]$, $H_A(y)$ and $H_B(y)$ are regular elements of $G$ which are conjugate. There is a smooth bundle of abelian groups $Z \to S^1$, whose fiber $Z_y$ at $y$ is the centralizer of $H_A(y)$. Then there is a smooth bundle $W \to S^1$, whose fiber $W_y$ is the set of $g \in G$ such that $gH_A(y)g^{-1} = H_B(y)$. This bundle is a principal homogeneous space under the bundle of groups $Z$, in the sense that there is a right action $W \times_{S^1} Z \to W$ of $Z$ on $W$, and that $Z_y$ acts simply transitively on $W_y$ for each $y$. I claim that the bundle $W \to S^1$ admits a smooth section. The obstruction to finding a smooth section belongs to the sheaf cohomology group $H^1(S^1, \underline{Z})$, where $\underline{Z}$ is the sheaf of smooth sections of the bundle $Z \to S^1$. We have the exponential exact sequence of sheaves

\[
0 \to E_Z \to \underline{\mathrm{Lie}\,Z} \xrightarrow{\ \exp\ } \underline{Z} \to 0,
\]

where $\underline{\mathrm{Lie}\,Z}$ denotes the sheaf of smooth sections of the bundle over $S^1$ whose fiber is the Lie algebra of $Z_y$, and $E_Z$ is a local system over $S^1$ whose fiber at $y$ is $\pi_1(Z_y)$. The cohomology group $H^1(S^1, \underline{\mathrm{Lie}\,Z})$ is 0 because $\underline{\mathrm{Lie}\,Z}$ is a fine sheaf. The cohomology group $H^2(S^1, E_Z)$ is 0 for dimension reasons. Therefore $H^1(S^1, \underline{Z}) = 0$ and $W \to S^1$ admits a global section $u \mapsto g(u)$. We can view $g$ as an element of $\mathrm{Map}(M, G)$, and the conjugate $[\mathrm{Ad}(g)A - (d_F g\cdot g^{-1})] \wedge du$ has holonomy around $u = y$ equal to $g(y)H_A(y)g(y)^{-1} = H_B(y)$. Thus we may as well assume to start with that the holonomies of $A$ and $B$ coincide.

There exists a unique function $g : \mathbb{R} \times \mathbb{R} \to G$ which is a solution of the partial differential equation

\[
\mathrm{Ad}(g)f - \frac{\partial g}{\partial t}\,g^{-1} = h \tag{6}
\]

with the boundary condition $g(0, u) = 1$. Since both $f$ and $h$ are periodic in the $u$-direction and the boundary condition is periodic, we have $g(t, u + 1) = g(t, u)$ by uniqueness of the solution to the Cauchy problem. Equation (6) means that over $\mathbb{R} \times \{y\}$, the gauge transformation $g$ transforms the connection $A$ into $B$. The holonomies $H_A(y)$ and $H_B(y)$ are then related by $H_B(y) = g(1, y)H_A(y)g(0, y)^{-1}$. Since $g(0, y) = 1$ and $H_B(y) = H_A(y)$, we see that $g(1, y)$ commutes with $H_A(y)$, and the argument requires $g(1, u) = 1$. Then uniqueness for the Cauchy problem


implies that $g(t + 1, u) = g(t, u)$. Therefore $g$ is an element of $\mathrm{Map}(M, G)$, which transforms $(A \wedge du, du)$ into $(B \wedge du, du)$.

We note that the classification of coadjoint orbits is likely to be considerably more difficult in the case where the dimension of the centralizer of the leafwise holonomy varies with the parameter $u$.

We can give at least a general invariant attached to a coadjoint orbit, using the notion of the holonomy groupoid $\mathcal{G} = (G_0, G_1; s, t)$ of a foliation $F$ [Co, Wi]. Recall that the base $G_0$ of the holonomy groupoid is the manifold $M$, and the manifold $G_1$ of arrows is the set of leafwise homotopy classes of piecewise smooth paths $\gamma : [0, 1] \to M$ which are tangent to $F$. More precisely, we consider homotopies $F : [0, 1] \times [0, 1] \to M$ which are tangent to the foliation and satisfy the usual condition that $F(t, 0)$ and $F(t, 1)$ are constant. The source map $s : G_1 \to G_0$ and the target map $t : G_1 \to G_0$ associate to a path $\gamma$ its origin and end. If $G_1$ is Hausdorff, there exists a unique smooth manifold structure on it such that both $s$ and $t$ are smooth maps. A representation of the holonomy groupoid $\mathcal{G}$ is defined as a vector bundle $E$ over $G_0$, equipped with a vector bundle isomorphism $\phi : s^*E \,\tilde{\to}\, t^*E$ which satisfies the usual cocycle condition. There is a natural notion of isomorphism of representations of $\mathcal{G}$. We then have:

Theorem 2. To each element $(\nu, \beta)$ of $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$, there is an associated representation of the holonomy groupoid $\mathcal{G}$. Two conjugate elements of $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$ give rise to isomorphic representations of $\mathcal{G}$.

We now make some remarks on the coadjoint orbits of the complexified Lie algebra $\widetilde{\mathrm{Map}}(M, \mathfrak{g}) \otimes \mathbb{C}$, and we relate them to the work of Etingof and I. Frenkel [E-F]. Lemma 1 and Lemma 2 extend to this situation, but now, instead of a (singular) one-dimensional foliation $F$, we have a distribution of complex subspaces of the complexified tangent space $TM \otimes \mathbb{C}$. Then geometrically one is again considering connections along the leaves. The geometry becomes particularly nice when $\dim(M) = 2$ and $F$ is transverse to its complex-conjugate. Then such a distribution $F$ amounts to a complex structure on $M$, such that a germ of complex-valued function $f$ on $M$ is holomorphic if and only if it is killed by $F$. Then a $G_{\mathbb{C}}$-bundle with connection along the leaves of $F$ is the same thing as a holomorphic $G_{\mathbb{C}}$-bundle over $M$. If $M$ is a Riemann surface of genus 1, and $\beta$ is a holomorphic 1-form, Etingof and Frenkel show that the coadjoint orbits of type $(\bullet, \beta)$ are in bijection with the isomorphism classes of holomorphic $G_{\mathbb{C}}$-bundles over $M$. They use Atiyah's results about the classification of holomorphic bundles on elliptic curves to classify the coadjoint orbits [E-F]. The paper [E-F] contains many other beautiful theorems for elliptic curves, connecting them with the classical theory of linear $q$-difference equations.

3. The Case of the Kronecker Foliation Let M = S 1 × S 1 , and let β = du − κdt, where κ is irrational. The corresponding foliation is the Kronecker foliation by the image in S 1 × S 1 of the lines of slope κ. There is no closed leaf so there is no way to define holonomy along a closed leaf. However, there are some “asymptotic” substitutes to holonomy along a closed leaf. To explain this, we use the fact that the holonomy groupoid of the Kronecker foliation is Morita equivalent to the groupoid K associated to the action of Z on S 1 = R/Z in which the generator of Z acts by translation by κ. Therefore any connection A along the leaves of


$F$ gives rise to a representation of the groupoid $K$. We can make this concrete as follows. Recall that $K_0 = S^1$, $K_1 = S^1 \times \mathbb{Z}$, and the source and target maps are $s(x, n) = x$, $t(x, n) = x + n\kappa \bmod \mathbb{Z}$. Then let $E$ be the trivial $G$-bundle over $\mathbb{R}/\mathbb{Z}$. Let us describe the isomorphism $\phi : s^*E \,\tilde{\to}\, t^*E$ of $G$-bundles at $(x, n) \in (\mathbb{R}/\mathbb{Z}) \times \mathbb{Z}$. This amounts to giving an element $\phi(x, n) \in G$. This will be defined geometrically, embedding $S^1$ into $S^1 \times S^1$ by $x \mapsto (0, x)$ as a transverse submanifold for the foliation. Then $\phi(x, n) \in G$ is the parallel transport of the connection $A$ along the portion of leaf which starts at $x \in S^1$ and ends at $x + n\kappa$. The cocycle condition is

\[
\phi(x + n\kappa, m)\,\phi(x, n) = \phi(x, m + n). \tag{7}
\]
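The cocycle condition (7) can be illustrated numerically. The sketch below assumes that the cocycle is generated by a single function $\psi(x) = \phi(x, 1)$, so that $\phi(x, n) = \psi(x + (n-1)\kappa)\cdots\psi(x)$ for $n \ge 0$; the group $G = SU(2)$ (realized as unit quaternions), the particular $\psi$, and the value of $\kappa$ are all hypothetical choices made only for the illustration.

```python
import math

KAPPA = math.sqrt(2) - 1  # an irrational rotation number (illustrative choice)

def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def psi(x):
    """A hypothetical smooth 1-periodic map S^1 -> SU(2) (unit quaternions)."""
    angle = 0.7 + 0.3 * math.cos(2 * math.pi * x)
    # The rotation axis also varies with x, so values at different x
    # do not commute in general.
    axis = (math.cos(2 * math.pi * x), math.sin(2 * math.pi * x), 0.5)
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle) / norm
    return (math.cos(angle), axis[0] * s, axis[1] * s, axis[2] * s)

def phi(x, n):
    """phi(x, n) = psi(x + (n-1)*kappa) ... psi(x + kappa) psi(x), for n >= 0."""
    result = (1.0, 0.0, 0.0, 0.0)  # identity element
    for k in range(n):
        result = qmul(psi(x + k * KAPPA), result)
    return result

def cocycle_defect(x, n, m):
    """Largest componentwise gap between the two sides of condition (7)."""
    lhs = qmul(phi(x + n * KAPPA, m), phi(x, n))
    rhs = phi(x, m + n)
    return max(abs(a - b) for a, b in zip(lhs, rhs))

print(cocycle_defect(0.3, 5, 7))
```

The defect is at the level of floating-point rounding, as expected, since with this definition (7) holds by construction: the product for $m + n$ steps splits into the first $n$ steps followed by $m$ further steps starting at $x + n\kappa$.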

Two cocycles $\phi_1$, $\phi_2$ are called gauge-equivalent if we have

\[
\phi_2(x, n) = h(x + n\kappa)\,\phi_1(x, n)\,h(x)^{-1} \tag{8}
\]

for some smooth function $h : S^1 \to G$. So we are led to classify the smooth cocycles $\phi : S^1 \times \mathbb{Z} \to G$, modulo the gauge equivalence (8). Such a cocycle $\phi$ is of course uniquely described by $\psi(x) = \phi(x, 1)$, which is arbitrary. In terms of such $G$-valued smooth functions $\psi$, the gauge equivalence relation becomes

\[
\psi(x) \cong h(x + \kappa)\,\psi(x)\,h(x)^{-1}. \tag{9}
\]

We then introduce the approximation of $\kappa$ by continued fractions $p_n/q_n$. Recall that $|q_n\kappa - p_n| < \frac{1}{q_n}$ and $|p_n q_{n+1} - p_{n+1} q_n| = 1$ for all $n$. Let $G/\mathrm{conj}$ be the quotient space of $G$ by the conjugation action. $G/\mathrm{conj}$ is a compact metric space, and we have the projection map $f : G \to G/\mathrm{conj}$. Choose a left and right invariant distance $d$ on $G$, and let $\bar d$ be the induced distance on the quotient space $G/\mathrm{conj}$. For $x \in S^1 = \mathbb{R}/\mathbb{Z}$, denote by $\langle x\rangle$ the minimum value of $|y|$ over all representatives $y$ of $x$ in $\mathbb{R}$. Then $(u, v) \mapsto \langle u - v\rangle$ is a distance on $S^1$. We then have

Lemma 4. Assume that $\phi_1, \phi_2 : S^1 \times \mathbb{Z} \to G$ are gauge equivalent cocycles. Then we have $\lim_{n\to\infty} \bar d\big(f(\phi_1(x, q_n)), f(\phi_2(x, q_n))\big) = 0$ for any $x \in S^1$. This limit is uniform in $x$.

Proof. We have

\[
\phi_2(x, m) = h(x + m\kappa)\,\phi_1(x, m)\,h(x)^{-1}
\]

for some smooth function $h : S^1 \to G$. For any $\epsilon > 0$, there exists $\delta > 0$ such that $x, y \in S^1$ and $\langle x - y\rangle < \delta$ implies $d(h(x), h(y)) < \epsilon$. Let $N$ be such that $q_N > \frac{1}{\delta}$. Then for $n \ge N$, we have $\langle q_n\kappa\rangle < \frac{1}{q_n} < \delta$. We then have $d(h(x + q_n\kappa), h(x)) < \epsilon$, and since $d$ is bi-invariant we obtain

\[
d\big(\phi_2(x, q_n),\ h(x)\phi_1(x, q_n)h(x)^{-1}\big) \le d(h(x + q_n\kappa), h(x)) < \epsilon .
\]

Since $f$ is distance decreasing and constant on conjugacy classes, we see that $\bar d\big(f(\phi_1(x, q_n)), f(\phi_2(x, q_n))\big) < \epsilon$.
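The two continued-fraction facts used in the proof, $|q_n\kappa - p_n| < 1/q_n$ and $|p_n q_{n+1} - p_{n+1} q_n| = 1$, can be checked on a concrete example. The sketch below is an illustration only: it assumes $\kappa = \sqrt{2} - 1 = [0; 2, 2, 2, \dots]$, a choice made here because its partial quotients are known exactly, and the helper name `convergents` is hypothetical.

```python
import math

def convergents(partial_quotients):
    """Convergents (p_n, q_n) of [a_0; a_1, a_2, ...] via the standard recurrence
    p_n = a_n p_{n-1} + p_{n-2},  q_n = a_n q_{n-1} + q_{n-2}."""
    p_prev, q_prev = 1, 0            # p_{-1}, q_{-1}
    p, q = partial_quotients[0], 1   # p_0, q_0
    out = [(p, q)]
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

KAPPA = math.sqrt(2) - 1             # = [0; 2, 2, 2, ...] (illustrative choice)
conv = convergents([0] + [2] * 12)

for p, q in conv:
    # <q_n * kappa> <= |q_n * kappa - p_n| < 1 / q_n
    print(q, abs(q * KAPPA - p), 1 / q)
```

Since $q_n \to \infty$, the displacement $\langle q_n\kappa\rangle$ tends to 0, which is exactly what makes $h(x + q_n\kappa)$ close to $h(x)$ in the proof of Lemma 4.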


The meaning of Lemma 4 is best expressed by introducing the following terminology. Two sequences $(x_n)$ and $(y_n)$ in a metric space are said to be asymptotically equivalent if $\lim_{n\to\infty} d(x_n, y_n) = 0$. Then we have:

Proposition 1. (1) To each leafwise connection $A$ with respect to the Kronecker foliation, and for each $x = (0, x) \in S^1$, there is an associated sequence $b_n(x) \in G/\mathrm{conj}$, represented by the parallel transport for $A$ along the segment of leaf from $x$ to $x + q_n\kappa$. Two gauge equivalent leafwise connections lead to asymptotically equivalent sequences in $G/\mathrm{conj}$.
(2) The asymptotic class of the sequence $(b_n(x))$ is an invariant of the coadjoint orbit of $(du - \kappa\,dt, A)$.

In this construction, we do not know how to control the dependence of the asymptotic class $b_n(x)$ on $x \in S^1$. This is to be contrasted with the case $G = \mathbb{R}$, which is well-understood, and which we discuss now. Let $A = f(t, u)\,dt$ be a leafwise 1-form. Then we have the following ergodic theorem, essentially due to H. Weyl:

Proposition 2. Assume that $\kappa$ is an algebraic irrational number. Then for each $x \in S^1$ the limit

\[
\lim_{p\to\infty}\frac{1}{p}\int_0^p f(x + t, \kappa t)\,dt
\]

exists and is equal to $\int_{S^1\times S^1} A \wedge \beta$, hence it is a constant independent of $x$.

Proof. Let $f(t, u) = \sum_{(m,n)\in\mathbb{Z}^2} a_{mn}\,e^{2\pi i(nt+mu)}$ be the Fourier series of $f$. We have

\[
\int_0^p e^{2\pi i(nx+nt+m\kappa t)}\,dt = e^{2\pi i nx}\,\frac{e^{2\pi i(n+\kappa m)p} - 1}{2\pi i(n + m\kappa)} \quad\text{if } (n, m) \neq (0, 0).
\]

Since $\kappa$ is an algebraic irrational number, there exist an integer $q$ and a constant $K$ such that $|\kappa + \frac{m}{n}| \ge K\cdot n^{-q}$. Then the series $\sum_{(m,n)\neq(0,0)} |a_{mn}| \times \big|\frac{1}{2\pi i(n+m\kappa)}\big|$ is dominated by $\sum_{(m,n)\neq(0,0)} |a_{mn}|\,n^{q-1}$, which converges. Therefore

\[
\lim_{p\to\infty}\frac{1}{p}\int_0^p [f(x + t, \kappa t) - a_{00}]\,dt = 0,
\]

and we have

\[
\lim_{p\to\infty}\frac{1}{p}\int_0^p f(x + t, \kappa t)\,dt = a_{00} = \int_{S^1\times S^1} A \wedge \beta .
\]
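The averaging statement of Proposition 2 is easy to probe numerically. The following sketch is only an illustration under stated assumptions: the test function $f$ (a trigonometric polynomial with $a_{00} = 1$), the slope $\kappa = \sqrt{2}$, and the midpoint-rule quadrature are all choices made here, not taken from the paper.

```python
import math

KAPPA = math.sqrt(2)  # an algebraic irrational slope (illustrative choice)

def f(t, u):
    """Hypothetical doubly periodic test function with mean value a_00 = 1."""
    return 1.0 + math.cos(2 * math.pi * t) + 0.5 * math.sin(2 * math.pi * u)

def leaf_average(x, p, steps_per_unit=1000):
    """Midpoint-rule approximation of (1/p) * integral_0^p f(x + t, KAPPA * t) dt."""
    n = int(p * steps_per_unit)
    h = p / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += f(x + t, KAPPA * t)
    return total * h / p

for x in (0.0, 0.37):
    # Both averages should be close to a_00 = 1, independently of x.
    print(x, leaf_average(x, 200))
```

For this $f$ the nonconstant modes contribute at most a bounded amount to the integral, so the deviation of the average from $a_{00}$ decays like $1/p$, in line with the estimate in the proof.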

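4. Polarizations

Recall the notion of polarization of a coadjoint orbit of a real Lie group $H$ [Di, B-CD, Pu]. The coadjoint orbit $H\cdot\lambda \subset \mathfrak{h}^*$ is equipped with the Kirillov-Kostant-Souriau symplectic structure $\omega$ [Ki, Ko, So] such that

\[
\omega_\lambda(\mathrm{ad}^*(X)\lambda,\ \mathrm{ad}^*(Y)\lambda) = \lambda([X, Y]). \tag{10}
\]

Let $H_\lambda \subseteq H$ be the stabilizer of $\lambda$. The Lie algebra $\mathfrak{h}_\lambda$ of $H_\lambda$ consists of those $X \in \mathfrak{h}$ such that $\mathrm{ad}^*(X)\lambda = 0$. A polarization of the orbit $H\cdot\lambda \subset \mathfrak{h}^*$ is a Lie subalgebra $\mathfrak{q}$ of $\mathfrak{h}_{\mathbb{C}}$ which contains $\mathfrak{h}_\lambda \otimes \mathbb{C}$ and is such that $\mathfrak{q}/[\mathfrak{h}_\lambda \otimes \mathbb{C}] \subset \mathfrak{h}_{\mathbb{C}}/[\mathfrak{h}_\lambda \otimes \mathbb{C}] \simeq T_\lambda(H\cdot\lambda)$ is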


a maximal isotropic subspace. If $\mathfrak{q}$ is the complexification of a real Lie subalgebra, it is called a real polarization. Otherwise, it is called a complex polarization. Given the polarization $\mathfrak{q}$, one obtains for every $\mathrm{Ad}^*(h)\cdot\lambda \in H\cdot\lambda$ a Lie subalgebra $\mathrm{Ad}(h)\cdot\mathfrak{q}$, which contains the centralizer of $\mathrm{Ad}^*(h)\cdot\lambda$ and gives a maximal isotropic subspace of the tangent space to the orbit at this point. In this manner, a polarization defines a distribution of (real or complex) tangent spaces on the coadjoint orbit, which is easily seen to be integrable.

For example, assume that $H$ is a compact Lie group. Let $H_{\mathbb{C}}$ be the complexification of $H$. For $\mu \in \mathfrak{h}^*$, the stabilizer $L = H_\mu$ is such that its complexification $L_{\mathbb{C}}$ is the Levi subgroup of a parabolic subgroup $Q$ of $H_{\mathbb{C}}$. Then $\mathfrak{q} = \mathrm{Lie}(Q)$ is a complex polarization of the coadjoint orbit of $\mu$. If $\mu$ is a regular element of $\mathfrak{h}^*$, its centralizer is a maximal torus, and $Q$ is a Borel subgroup corresponding to some choice of positive roots.

Now let $LG = \mathrm{Map}(S^1, G)$ be the smooth loop group of the simple simply-connected compact Lie group $G$, and let $L\mathfrak{g}$ be its Lie algebra. Let $\widetilde{LG}$ be the central extension of $LG$ by the circle group $\mathbb{T}$ constructed by Pressley and Segal [P-S], and let $\widetilde{L\mathfrak{g}}$ be its Lie algebra. As a special case of Sect. 1, $\widetilde{L\mathfrak{g}}$ identifies with $L\mathfrak{g} \oplus \mathbb{R}\cdot c$ as a vector space. Let $\gamma$ denote the linear form on $\widetilde{L\mathfrak{g}}$ which vanishes on $L\mathfrak{g}$ and takes the value 1 on $c$. The action of $\widetilde{LG}$ on $\widetilde{L\mathfrak{g}}^*$ factors through an action of $LG$. Recall that $G$ identifies with a Lie subgroup of $\widetilde{LG}$. Let $T \subset G$ be a maximal torus. Let $\lambda \in \mathfrak{t}^* \subset \mathfrak{g}^*$ be a regular element, let $n \in \mathbb{Z}$, and let $y = \lambda + n\cdot\gamma \in \widetilde{L\mathfrak{g}}^*$. The centralizer of $y$ in $LG$ is equal to $T$.

We will think now of a polarization of the coadjoint orbit of $y \in \widetilde{L\mathfrak{g}}^*$ as a Lie subalgebra of $L\mathfrak{g}$, rather than a subalgebra of $\widetilde{L\mathfrak{g}}$ containing the center $\mathbb{R}\cdot c$. To construct a (complex) polarization of $LG\cdot y$, we need a Borel subalgebra $\mathfrak{b}$ of $\mathfrak{g}_{\mathbb{C}}$ containing $\mathfrak{t}$.

Recall that if $D = \{z \in \mathbb{C},\ |z| \le 1\}$ is the closed unit disc and $D^0$ is the open unit disc, then a smooth function $f : S^1 \to \mathbb{C}$ is said to be the boundary value of a holomorphic function on $D^0$ if $f$ can be extended to a function $f : D \to \mathbb{C}$ which is continuous on $D$ and holomorphic on $D^0$. The same concepts make sense for vector-valued functions. Then let $\tilde{\mathfrak{b}} \subset L\mathfrak{g}_{\mathbb{C}}$ be the set of $f : S^1 \to \mathfrak{g}_{\mathbb{C}}$ which are the boundary value of a holomorphic map $f : D^0 \to \mathfrak{g}_{\mathbb{C}}$ such that $f(0) \in \mathfrak{b}$. We then have the well-known

Proposition 3. $\tilde{\mathfrak{b}}$ is a polarization of the coadjoint orbit of $y = \mu + n\cdot\gamma \in \widetilde{L\mathfrak{g}}^*$.

The Lie group corresponding to $\tilde{\mathfrak{b}}$ is a smooth analog of the Iwahori subgroup of $G(\mathbb{C}((t)))$. Proposition 3 was proved by Frenkel [Fr].

We will now prove a generalization of Proposition 3 to a manifold $M = S^1 \times X$, where $X$ is a closed manifold of dimension $n - 1$ equipped with a volume form $\beta$. Then $\beta$ gives by pull-back a closed nowhere vanishing $(n-1)$-form on $M$, which we also denote by $\beta$. The foliation corresponding to $\beta$ has leaves given by $S^1 \times \{x\}$ for $x \in X$ (so all its leaves are closed). Now pick a volume form on $M$ and a regular element $\mu \in \mathfrak{t}^*$ as above. Set $\nu$ equal to the tensor product of this volume form with $\mu$. Then we have:

Lemma 5. The centralizer $\mathfrak{z}$ in $\mathrm{Map}(M, \mathfrak{g})$ of the element $(\nu, \beta)$ is equal to $\mathrm{Map}(X, \mathfrak{t})$, where $\mathrm{Map}(X, \mathfrak{t})$ is viewed as a subalgebra of $\mathrm{Map}(M, \mathfrak{t})$ in the obvious manner.

Given a Borel subalgebra $\mathfrak{b}$ of $\mathfrak{g}_{\mathbb{C}}$, we define the Lie subalgebra $\mathfrak{q}$ of the complexified Lie algebra $\mathrm{Map}(M, \mathfrak{g})_{\mathbb{C}}$ as the set of all $f : S^1 \times X \to \mathfrak{g}_{\mathbb{C}}$ such that:

1) the function $f$ is the boundary value of a function $f : D \times X \to \mathfrak{g}_{\mathbb{C}}$ which is holomorphic in $z \in D^0$.


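2) for any $x \in X$, we have $f(0, x) \in \mathfrak{b}$.

Then we can state:

Theorem 3. The Lie subalgebra $\mathfrak{q}$ of $\mathrm{Map}(M, \mathfrak{g})_{\mathbb{C}}$ is a polarization of the coadjoint orbit of $y = (dt \wedge \beta \otimes \lambda, \beta)$.

To prove this we will need the following lemma, in which a skew-symmetric bilinear form $\omega$ on a vector space $E$ is said to be weakly symplectic if the kernel of $\omega$ is trivial. It then follows formally that given linearly independent vectors $(v_1, \cdots, v_n)$ in $E$, there exists a vector $u$ such that $\omega(u, v_i) = \delta_{1i}$.

Lemma 6. Let $(E, \omega)$ be a weakly symplectic vector space, and let $F_1, F_2$ be isotropic subspaces of $E$ such that $E = F_1 \oplus F_2$. Then $F_1$ and $F_2$ are each maximal isotropic.

Proof. To prove Theorem 3, we first observe that the skew-symmetric form $\omega$ on $\mathrm{Map}(M, \mathfrak{g})/\mathfrak{z}$ is weakly symplectic. We then show that $\mathfrak{q}/\mathfrak{z}_{\mathbb{C}}$ is an isotropic subspace. We have

$$\omega_y(\xi, \eta) = \int_{S^1 \times M} \langle H, [\xi, \eta] \rangle \, dt \wedge \beta + \int_M \int_0^1 \Big( \xi \,\Big|\, \frac{\partial \eta}{\partial t} \Big) \, \beta \, dt. \tag{11}$$

Decompose $\xi, \eta \in \mathfrak{q}$ as Fourier series in the variable $t$:

$$\xi = \sum_{n \geq 0} f_n(x) e^{2\pi i n t}, \qquad \eta = \sum_{n \geq 0} g_n(x) e^{2\pi i n t},$$

where $f_n, g_n$ are smooth functions from $X$ to $\mathfrak{g}_{\mathbb{C}}$, and $f_0, g_0$ take values in $\mathfrak{b}$. The first term in the right hand side of (11) reduces to $\int_M \beta \otimes \langle H, [f_0(x), g_0(x)] \rangle$; but since $f_0(x), g_0(x) \in \mathfrak{b}$, their bracket belongs to the nilpotent radical of $\mathfrak{b}$, and the integrand is identically zero. The second term vanishes because $\frac{\partial g_0(x)}{\partial t} = 0$.

Now we produce a second Lie subalgebra $\mathfrak{q}_\infty$ of $\mathrm{Map}(M, \mathfrak{g})_{\mathbb{C}}$. First let $D_\infty = \{ z \in \mathbb{CP}^1 ; |z| \geq 1 \}$ be the complement of D0 inside $\mathbb{CP}^1$, and let $D_\infty^0$ be the interior of $D_\infty$. Let $\mathfrak{q}_\infty$ be the set of $f : S^1 \times X \to \mathfrak{g}_{\mathbb{C}}$ such that
1) the function $f$ is the boundary value of a function $f : D_\infty \times X \to \mathfrak{g}_{\mathbb{C}}$ which is holomorphic in $z \in D_\infty^0$;
2) for any $x \in X$, we have $f(\infty, x) \in \mathfrak{b}_-$, where $\mathfrak{b}_- \supset \mathfrak{t}$ is the Borel subalgebra opposed to $\mathfrak{b}$.

The same argument as for $\mathfrak{q}$ shows that $\mathfrak{q}_\infty/\mathfrak{z}_{\mathbb{C}}$ is an isotropic subspace of $\mathrm{Map}(M, \mathfrak{g})_{\mathbb{C}}/\mathfrak{z}_{\mathbb{C}}$. It is clear that the intersection of $\mathfrak{q}/\mathfrak{z}_{\mathbb{C}}$ and $\mathfrak{q}_\infty/\mathfrak{z}_{\mathbb{C}}$ is reduced to 0. We now prove that these two subspaces span $\mathrm{Map}(M, \mathfrak{g})_{\mathbb{C}}/\mathfrak{z}_{\mathbb{C}}$. This amounts to showing that any element of $\mathrm{Map}(M, \mathfrak{g})$ is the sum of an element of $\mathfrak{q}$ and an element of $\mathfrak{q}_\infty$. Let then $f(t, x) \in \mathrm{Map}(M, \mathfrak{g})$. Write $f$ as a Fourier series with respect to $t$:

$$f = \sum_{n \in \mathbb{Z}} f_n(x) e^{2\pi i n t}.$$

Then $f_0$ is a smooth function $X \to \mathfrak{g}_{\mathbb{C}}$, and we may write it as $f_0 = u + v$, where $u$ takes values in $\mathfrak{b}$, and $v$ takes values in $\mathfrak{b}_-$. Then let $g = \sum_{n>0} f_n(x) e^{2\pi i n t} + u(x)$, $h = \sum_{n<0} f_n(x) e^{2\pi i n t} + v(x)$. It is clear that $g \in \mathfrak{q}$, $h \in \mathfrak{q}_\infty$, and $f = g + h$. This finishes the proof of Theorem 3.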
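The splitting $f = g + h$ used at the end of the proof can be illustrated numerically. The following is only a toy sketch under simplifying assumptions: the function is scalar-valued and sampled on the circle, so the Lie-algebra decomposition $f_0 = u + v$ with $u \in \mathfrak{b}$, $v \in \mathfrak{b}_-$ is replaced by an arbitrary splitting of the constant mode. The names `fourier_coefficients`, `split_by_frequency` and `u0` are illustrative and do not come from the paper.

```python
import cmath

def fourier_coefficients(samples):
    """Discrete Fourier coefficients of equally spaced samples on the circle."""
    n_pts = len(samples)
    return [
        sum(s * cmath.exp(-2j * cmath.pi * n * k / n_pts) for k, s in enumerate(samples)) / n_pts
        for n in range(n_pts)
    ]

def split_by_frequency(samples, u0):
    """Return (g, h) with g = positive modes + u0 and h = negative modes + (f_0 - u0).

    The number of samples is assumed odd, so that there is no Nyquist mode.
    """
    n_pts = len(samples)
    coeffs = fourier_coefficients(samples)
    half = n_pts // 2
    g, h = [], []
    for k in range(n_pts):
        g_k, h_k = u0, coeffs[0] - u0
        for n in range(1, n_pts):
            term = coeffs[n] * cmath.exp(2j * cmath.pi * n * k / n_pts)
            if n <= half:      # DFT index n stands for the frequency +n
                g_k += term
            else:              # DFT index n stands for the frequency n - n_pts < 0
                h_k += term
        g.append(g_k)
        h.append(h_k)
    return g, h

# Sample function with modes +2, -1 and a constant term (toy data).
N_PTS = 9
f = [cmath.exp(2j * cmath.pi * 2 * k / N_PTS) + 3 * cmath.exp(-2j * cmath.pi * k / N_PTS) + 0.5
     for k in range(N_PTS)]
g, h = split_by_frequency(f, u0=0.2)
residual = max(abs(fk - gk - hk) for fk, gk, hk in zip(f, g, h))
```

Here `g` carries only the modes $n \geq 0$ and `h` only the modes $n \leq 0$, mirroring the roles of $\mathfrak{q}$ and $\mathfrak{q}_\infty$; the freedom in choosing `u0` corresponds to the fact that the two subspaces intersect trivially only modulo $\mathfrak{z}_{\mathbb{C}}$.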


J.-L. Brylinski

It is tempting to conjecture that Theorem 3 will hold true for a wide class of $(n - 1)$-forms defining singular foliations. As an example, we let $M = S^2$, equipped with its standard $SO(3)$-invariant volume form of total volume 1. Let $v$ be the vector field generating the action of $S^1 \simeq SO(2)$, for which the orbits are the horizontal circles. Let $\beta$ be the 1-form corresponding to $v$ by duality. In spherical coordinates ($\phi \in [-\frac{\pi}{2}, \frac{\pi}{2}]$, $\theta \in [0, 2\pi]$) (so that $\phi = \mathrm{Const}$ defines the horizontal circles) we have:

$$v = \frac{1}{2\pi} \frac{\partial}{\partial \theta}, \qquad \beta = -\sin(\phi)\, d\phi = d(\cos \phi).$$

Again we let $\lambda$ be a regular element of $\mathfrak{t}^*$, and we consider the coadjoint orbit of $y = (\lambda, \beta) \in \widetilde{\mathrm{Map}}(M, \mathfrak{g})^*$. The centralizer $\mathfrak{z}$ of $y$ in $\mathrm{Map}(M, \mathfrak{g})$ is the set of $f : S^2 \to \mathfrak{t}$ which are invariant under the $S^1$-action. We now introduce the Lie subalgebra $\mathfrak{q}$ of $\mathrm{Map}(M, \mathfrak{g})$ which consists of all $f : S^2 \to \mathfrak{g}_{\mathbb{C}}$ such that:
1) for each $\phi_0 \neq \pm \frac{\pi}{2}$, the restriction of $f$ to the horizontal circle $\phi = \phi_0$ is the boundary value of a holomorphic function on the inside of the circle;
2) for $-1 \leq u \leq 1$, we have $f(0, 0, u) \in \mathfrak{b}$.

Then we have:

Theorem 4. The Lie subalgebra $\mathfrak{q}$ of $\mathrm{Map}(M, \mathfrak{g})$ is a polarization of the coadjoint orbit of $(\lambda, \beta)$.

Proof. Again we have the notion of a Fourier series for $f \in C^\infty(S^2)$:

$$f(\theta, \phi) = \sum_{n \in \mathbb{Z}} f_n(\cos(\phi))\, e^{in\theta},$$

where $f_n$ is a smooth function on the interval $[-1, 1]$. The fact that $f_n$ is smooth follows from

$$f_n(\cos(\phi)) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta, \phi)\, e^{-in\theta}\, d\theta.$$

Then $f \in \mathfrak{q}$ if and only if $f_n = 0$ for $n < 0$ and $f_0$ takes values in $\mathfrak{b}$. It is easy to see that $\mathfrak{q}/\mathfrak{z}_{\mathbb{C}}$ is an isotropic subspace of $\mathrm{Map}(M, \mathfrak{g})_{\mathbb{C}}/\mathfrak{z}_{\mathbb{C}}$. Now there is another Lie subalgebra $\mathfrak{q}'$, which is the set of $f$ such that $f_n = 0$ for $n > 0$ and $f_0$ takes values in $\mathfrak{b}_-$. This gives another isotropic subspace of $\mathrm{Map}(M, \mathfrak{g})_{\mathbb{C}}/\mathfrak{z}_{\mathbb{C}}$ which is complementary to $\mathfrak{q}/\mathfrak{z}_{\mathbb{C}}$. The theorem now follows from Lemma 6.

It is easy to see that if the weight $\lambda$ is antidominant and regular with respect to the Borel subalgebra $\mathfrak{b}$, then $\mathfrak{q}$ is a Kähler polarization.

5. Integral Orbits

We assume that $G$ is simply-connected. We need to recall from [P-S] the Lie group central extension

$$1 \to A^1(M)/A^1(M)_{cl,\mathbb{Z}} \to \widetilde{\mathrm{Map}}(M, G) \xrightarrow{\ \pi\ } \mathrm{Map}(M, G) \to 1, \tag{12}$$

where $A^1(M)_{cl,\mathbb{Z}}$ denotes the set of closed 1-forms on $M$ whose cohomology class is integral. We give a different description of $\widetilde{\mathrm{Map}}(M, G)$ which is a generalization of Mickelsson's construction [M] of the central extension of a loop group. The main ingredient is the action functional $S$, which associates to a map $\phi : M \times \Sigma \to G$ a class $S(\phi) \in A^1(M)$, called the action of $\phi$. Here $\Sigma$ is a smooth compact oriented surface with corners, and $\phi$ is required to be smooth as a function on $M$, but only


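piecewise smooth as a function on $\Sigma$. Let $\nu$ be the normalized Lie algebra 3-cocycle on $\mathfrak{g}$ as in [P-S]. Let $(x_1, x_2)$ be local coordinates on $\Sigma$. Then $\phi^{-1} \frac{\partial \phi}{\partial x_i}$ is a $\mathfrak{g}$-valued function, for $i = 1, 2$. Let $v$ be any tangent vector to $M$. Then

$$\nu\Big(\phi^{-1} v \phi,\ \phi^{-1} \frac{\partial \phi}{\partial x_1},\ \phi^{-1} \frac{\partial \phi}{\partial x_2}\Big)$$

is a local function on $M$ such that the 2-form

$$\eta_v = \nu\Big(\phi^{-1} v \cdot \phi,\ \phi^{-1} \frac{\partial \phi}{\partial x_1},\ \phi^{-1} \frac{\partial \phi}{\partial x_2}\Big) \otimes dx_1 \wedge dx_2 \tag{13}$$

is independent of the coordinates $(x_1, x_2)$, hence is globally defined on $\Sigma$. Thus we can compute the integral $\int_\Sigma \eta_v$, and we have the 1-form $S(\phi)$ on $M$ defined by

$$S(\phi)(v) = \int_\Sigma \eta_v. \tag{14}$$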

Lemma 7. If $\Sigma$ is a closed oriented surface, then $S(\phi) \in A^1(M)_{cl,\mathbb{Z}}$.

Proof. To show that a 1-form on $M$ belongs to $A^1(M)_{cl,\mathbb{Z}}$ amounts to proving that its integral over any smooth loop in $M$ is an integer. By functoriality we can then assume $M = S^1$, and then we have to prove that for any map $\phi : S^1 \times \Sigma \to G$, the integral $\int_{S^1 \times \Sigma} \phi^* \nu$ is an integer, where $\nu$ now denotes the normalized bi-invariant 3-form on $G$. This is true because of the chosen normalization.

Let $I = [0, 1]$. Then the group $\widetilde{\mathrm{Map}}(M, G)$ is defined as the set of equivalence classes of pairs $(F, \theta)$, where $F : M \times I \to G$ is a map which satisfies (1) $F$ is smooth as a function on $M$, and piecewise smooth as a function on $I$; (2) $F(x, 0) = 1$; and $\theta \in A^1(M)/A^1(M)_{cl,\mathbb{Z}}$. The equivalence relation is as follows. Let $I^2$ be the unit square in $\mathbb{R}^2$, and let $\gamma_0, \gamma_1 : I \to I^2$ be parameterizations of the lower and upper horizontal segments in the boundary of $I^2$. Let $\phi : M \times I^2 \to G$ be a map which is smooth as a function on $M$ and piecewise smooth as a function on $I^2$, and which satisfies $\phi(x, 0, t) = 1 \in G$, $\phi(x, 1, t)$ is independent of $t$. For such $\phi$, let $F_i : M \times I \to G$ be the map defined as $F_i = \phi \circ (\mathrm{Id} \times \gamma_i)$ for $i = 0, 1$. Then we identify $(F_0, \theta)$ with $(F_1, \theta + S(\phi))$. We note that given $F_0$ and $F_1$, the extension $\phi$ to a mapping $M \times I^2 \to G$ is not unique, but it follows from Lemma 7 that the action $S(\phi)$ depends on $\phi$ only up to an element of $A^1(M)_{cl,\mathbb{Z}}$.

The product map on $\widetilde{\mathrm{Map}}(M, G)$ is induced by the composition law

$$(F_1, \theta_1) \cdot (F_2, \theta_2) = (F_1 \cdot F_2,\ \theta_1 + \theta_2 + \alpha(g_1, g_2)), \tag{15}$$

where $F_1 \cdot F_2 : M \times I \to G$ is the pointwise product of $F_1$ and $F_2$, $g_j(x) = F_j(x, 1)$, and the 1-form $\alpha(g_1, g_2)$ is defined by

$$\alpha(g_1, g_2)(v) = \int_I [(g_1, g_2)^* \beta]\Big(v, \frac{d}{dt}\Big)\, dt, \tag{16}$$

for the 2-form $\beta$ on $G \times G$ defined in [M]. That 2-form $\beta$ satisfies

$$p_1^* \nu + p_2^* \nu - m^* \nu = d\beta,$$


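where $p_1, p_2 : G \times G \to G$ are the projections, and $m : G \times G \to G$ is the product map. It is proved in [P-S, Sect. 4.10] that $\widetilde{\mathrm{Map}}(M, G)$ is a Lie group with Lie algebra $\widetilde{\mathrm{Map}}(M, \mathfrak{g})$. If $f : M \to N$ is a smooth map between compact manifolds, there result pull-back maps $f^* : \mathrm{Map}(N, G) \to \mathrm{Map}(M, G)$ and $f^* : A^1(N)/A^1(N)_{cl,\mathbb{Z}} \to A^1(M)/A^1(M)_{cl,\mathbb{Z}}$, and we have a Lie group homomorphism $\widetilde{\mathrm{Map}}(N, G) \to \widetilde{\mathrm{Map}}(M, G)$ which fits into a commutative diagram

$$\begin{array}{ccccccccc}
0 & \to & A^1(N)/A^1(N)_{cl,\mathbb{Z}} & \to & \widetilde{\mathrm{Map}}(N, G) & \to & \mathrm{Map}(N, G) & \to & 1 \\
 & & \downarrow f^* & & \downarrow f^* & & \downarrow f^* & & \\
0 & \to & A^1(M)/A^1(M)_{cl,\mathbb{Z}} & \to & \widetilde{\mathrm{Map}}(M, G) & \to & \mathrm{Map}(M, G) & \to & 1
\end{array}$$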

If we apply this to $N = pt$, we obtain:

Lemma 8. Let $G \subset \mathrm{Map}(M, G)$ be the subgroup of constant maps. Then the restriction to $G$ of the central extension $\widetilde{\mathrm{Map}}(M, G) \to \mathrm{Map}(M, G)$ has a canonical splitting. The corresponding Lie algebra splitting is the one defined in Sect. 1.

Concretely, this splitting associates to $g \in G$ the class $(F, 0)$, where $F$ is any smooth map $F : M \times I \to G$ which is independent of the variable in $M$, and satisfies $F(x, 0) = 1$, $F(x, 1) = g$. The main point here is that for a map $\phi : M \times I^2 \to G$ which does not depend on the variable in $M$, the action $S(\phi)$ is 0.

We return to the situation $M = S^1 \times X$, where $X$ is an $(n - 1)$-manifold equipped with a volume form $\beta$. Let $\lambda \in \mathfrak{t}^*$ be regular. We consider the coadjoint orbit of $y = (dt \wedge \beta \otimes \lambda, \beta)$. According to Lemma 5, the stabilizer of $y$ in $\mathrm{Map}(M, G)$ is the group $\mathrm{Map}(X, T) \subset \mathrm{Map}(M, G)$. The centralizer of $y$ in the central extension $\widetilde{\mathrm{Map}}(M, G)$ is then $\pi^{-1} \mathrm{Map}(X, T) \subset \widetilde{\mathrm{Map}}(M, G)$. Recall the integrality condition for a coadjoint orbit of a Lie group $H$ which comes from the orbit method [Ki] and geometric quantization [Ko]. One says that $f \in \mathfrak{h}^*$ is integral if it satisfies the following integrality condition:

(I) there exists a Lie group character $\chi : H_f \to \mathbb{C}^*$ whose differential is $2\pi i \cdot f$.

We note the following

Lemma 9. The smooth element $y = (dt \wedge \beta \otimes \lambda, \beta)$ of $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$ is integral if and only if the restriction of $2\pi i \cdot y$ to the Lie subalgebra $\pi^{-1} \mathfrak{t} \subset \mathrm{Map}(X, \mathfrak{t})$ integrates to a character of the subgroup $\pi^{-1} T$ of $\pi^{-1} \mathrm{Map}(X, T)$.

Proof. We have an exact sequence of abelian Lie groups

$$0 \to \pi^{-1} T \to \pi^{-1} \mathrm{Map}(X, T) \to A^1(X)_{cl,\mathbb{Z}} \otimes X(T) \to 0,$$

where $X(T)$ is the group of 1-parameter subgroups of $T$. We also have the exact sequence

$$0 \to A^0(X)/\mathbb{R} \to A^1(X)_{cl,\mathbb{Z}} \to \Gamma \to 0,$$

where $\Gamma$ is a free abelian group. We have a similar exact sequence after tensoring with $X(T)$. Because $A^0(X)/\mathbb{R}$ is a topological vector space, and $\pi^{-1}(T)$ is a finite-dimensional abelian Lie group, any character $\chi : \pi^{-1}(T) \to \mathbb{C}^*$ can be extended to a character of the inverse image of $A^0(X)/\mathbb{R} \otimes X(T)$ in $\pi^{-1} \mathrm{Map}(X, T)$. Then, since $\Gamma \otimes X(T)$ is a free abelian group, there is no difficulty in extending this character to $\pi^{-1} \mathrm{Map}(X, T)$.


Now because of Lemma 8 the subgroup $\pi^{-1} T$ of $\widetilde{\mathrm{Map}}(M, G)$ is isomorphic to the direct product of $A^1(M)/A^1(M)_{cl,\mathbb{Z}}$ and of $T$, and therefore $y = (dt \wedge \beta \otimes \lambda, \beta)$ integrates to a character of the product group if and only if
(1) $2\pi i \cdot \beta$ integrates to a character of $A^1(M)/A^1(M)_{cl,\mathbb{Z}}$; and
(2) $2\pi i \cdot dt \wedge \beta \otimes \lambda$ integrates to a character of $T$.

The first condition holds if and only if, for any $\alpha \in A^1(M)_{cl,\mathbb{Z}}$, we have $\int_M \beta \wedge \alpha \in \mathbb{Z}$. Since $M = X \times S^1$, and $\beta$ is the pull-back of a volume form on $X$, this condition holds true if and only if we have $\int_M \beta \wedge dt \in \mathbb{Z}$. Since this integral reduces to $\int_X \beta$, the condition is $\int_X \beta \in \mathbb{Z}$. The second condition is then verified if and only if $\lambda$ is an integral weight, i.e., if $2\pi i \cdot \lambda$ is the differential of a character of $T$. So we obtain:

Theorem 5. The element $y = (dt \wedge \beta \otimes \lambda, \beta)$ of $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$ is integral if and only if
(1) $\int_X \beta \in \mathbb{Z}$;
(2) $\lambda$ is an integral weight of $T$.

6. Comments on Geometric Quantization

We work again in the situation $M = S^1 \times X$, where $X$ is an $(n - 1)$-manifold equipped with a volume form $\beta$ such that $n := \int_X \beta \in \mathbb{Z}$. Let $\lambda \in \mathfrak{t}^*$ be an integral weight. Consider the coadjoint orbit $O_y$ of the integral element $y = (dt \wedge \beta \otimes \lambda, \beta)$ of $\widetilde{\mathrm{Map}}(M, \mathfrak{g})^*_{sm}$. According to the orbit method [Ki] and to geometric quantization à la Kostant [Ko], there should correspond to the orbit $O_y$ an irreducible unitary representation $E$ of $\widetilde{\mathrm{Map}}(M, G)$. We will discuss the construction of a representation of the Lie algebra $\widetilde{\mathrm{Map}}(M, \mathfrak{g})$, but we will see that there are obstacles to the representation being unitary.

First we recall the results of geometric quantization for the central extension $\widetilde{LG}$ of a loop group $LG$, following I. Frenkel [Fr]. The element $y$ of the smooth dual of $\widetilde{L\mathfrak{g}}$ is then simply a pair $(n, \lambda)$. The coadjoint orbit of $(n, \lambda)$ depends only on its orbit under the action of the affine Weyl group $W_{aff}$. Using an element of $W_{aff}$, one can transform the pair $(n, \lambda)$ into a normal form, characterized by the following properties:
(1) $\lambda$ is a dominant weight;
(2) $\langle \lambda, \hat{\theta} \rangle \leq n$, where $\hat{\theta}$ is the coroot corresponding to the highest root $\theta$.

Under these assumptions, the unitary representation $H = H_{(n,\lambda)}$ of $\widetilde{LG}$ is then constructed by Pressley and Segal [P-S]. It is the space of holomorphic sections of a holomorphic line bundle $L_{(n,\lambda)}$ over the flag manifold of the loop group. Therefore the construction in [P-S] is a (very difficult) generalization of the Borel-Weil-Bott theorem to loop groups. For our purposes, the exact construction of this representation is not important. We note that $\mathrm{Map}(M, G) = \mathrm{Map}(X, LG)$; if $X$ were finite, a projective representation $H$ of $LG$ would induce the representation $\otimes_{x \in X} H$ of $\mathrm{Map}(X, LG)$. If $X$ were discrete and countable, we could similarly try to use the notion of an infinite tensor product of Hilbert spaces.

In our situation, it is therefore quite natural to look for a continuous tensor product of Hilbert spaces. Fortunately, there is such a construction, due to Araki and Woods [A-W], which we now recall. The context is the same as for the construction of a direct integral $H := \int_X H(x)\, d\mu(x)$, namely we have a measure space $(X, \mu)$ and a measurable family $(H_x)_{x \in X}$ of Hilbert spaces. However, one needs to make a choice of a measurable family of unit vectors, one in each $H_x$; we refer to these as the vacuum vectors. One also needs to assume that $\mu$ is non-atomic, i.e., every point of $X$ has measure 0. The family of vacuum vectors then defines a vector in $H$. Let $K \subset H$ be the hyperplane orthogonal to this vector. Then Araki and Woods define $\otimes_{X,\mu} H_x$ to be the exponential Hilbert space $e^K$. Recall that for a Hilbert space $K$, $e^K$ is the Hilbert space direct sum of the Hilbert spaces $(\otimes^n K)^{S_n}$. Here $\otimes^n K$ is the $n$-fold Hilbert tensor product, and $(\otimes^n K)^{S_n}$ is the subspace of invariants under the action of the symmetric group $S_n$:

$$e^K = \oplus_{n \geq 0} (\otimes^n K)^{S_n}.$$

This is called the continuous tensor product of the $H_x$, with respect to the measure $\mu$ on $X$. It does depend, however, on the choice of the family of vacuum vectors. For any vector $\phi$ in $K$, we have the vector $e^\phi = \sum_{n \geq 0} \frac{\phi^{\otimes n}}{\sqrt{n!}}$ in $e^K$. The vectors $e^\phi$ generate $e^K$ topologically, and we have: $(e^\phi, e^\psi) = e^{(\phi, \psi)}$.

In our situation, we take the probability space $(X, \frac{\mu}{n})$. The Hilbert space $H_x$ is independent of $x$, and equal to the representation space $H = H_{(n,\lambda)}$ for $\widetilde{LG}$. We can take the vacuum vector to be a unit vector on which the Borel subgroup $\tilde{B}$ of $\widetilde{LG}$ acts by a character. Then we are led to the following question: does a family of bounded (resp. unitary) operators $T_x$ on $H_x = H$ lead to a bounded (resp. unitary) operator on the continuous tensor product $e^K$? For unitary operators, the answer is clearly yes if each $T_x$ preserves the vacuum vector of $H_x$, since in that case we have an induced unitary operator on $K$, which yields a unitary operator on the exponential Hilbert space $e^K$. More generally, if $X$ is connected and simply-connected, and if the vacuum vector of $H_x$ is an eigenvector of $T_x$ with eigenvalue $\lambda_x \in \mathbb{C}^*$, $|\lambda_x| = 1$, then one can define a unitary operator on $e^K$. One writes $T_x = \lambda_x U_x$, where $U_x$ fixes the vacuum vector. The family of operators $(\lambda_x \mathrm{Id})$ operates on $e^K$ by the scalar $\exp(\int_X \log(\lambda_x))$, which is independent of the choice of the branch $\log(\lambda_x)$ of the logarithm.
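The identity $(e^\phi, e^\psi) = e^{(\phi,\psi)}$ recalled above can be checked numerically in a finite-dimensional toy case. The sketch below assumes $K = \mathbb{C}^d$ with an inner product that is antilinear in the second slot (a convention chosen here, not fixed by the text) and uses the standard fact that $(\phi^{\otimes n}, \psi^{\otimes n}) = (\phi, \psi)^n$; the function names are illustrative only.

```python
import cmath
import math

def inner(phi, psi):
    """Hermitian inner product on C^d, antilinear in the second argument (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(phi, psi))

def exp_inner_truncated(phi, psi, n_max):
    """Partial sum over n <= n_max of (phi^{(x)n}/sqrt(n!), psi^{(x)n}/sqrt(n!)) = (phi, psi)^n / n!."""
    z = inner(phi, psi)
    return sum(z ** n / math.factorial(n) for n in range(n_max + 1))

# Two sample vectors in K = C^2 (toy data).
PHI = [1 + 1j, 0.5]
PSI = [0.3, -2j]
gap = abs(exp_inner_truncated(PHI, PSI, 40) - cmath.exp(inner(PHI, PSI)))
```

The quantity `gap` measures how far the truncated sum over the symmetric tensor powers is from $e^{(\phi,\psi)}$; for vectors of moderate norm it is at the level of rounding error.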
However, it is in general impossible to associate a unitary operator on $e^K$ to a family $(T_x)$ of unitary operators which don't preserve the vacuum vectors. More precisely, let $H$ be any Hilbert space and let $K \subset H$ be a hyperplane. Then it is impossible to make the unitary group of $H$ act on $e^K$. We will explain briefly the obstruction in the finite-dimensional case. So let $K = \mathbb{C}^n \subset H = \mathbb{C}^{n+1}$. Let $(x_0, \cdots, x_n)$ be the coordinates on $\mathbb{C}^{n+1}$; say that the vacuum vector is $(1, 0, \cdots, 0)$, so that $K$ is the hyperplane $x_0 = 0$. We will discuss an infinitesimal action of the Lie algebra $\mathfrak{gl}(n + 1)$ on the symmetric algebra $\mathbb{C}[x_1, \cdots, x_n]$ (which is a dense subspace of $e^K$). This action was first introduced by Goncharov [Go]. Let $E_{i,j}$ denote an elementary matrix in $\mathfrak{gl}(n + 1)$, and let $Eu = \sum_{i=1}^n x_i \frac{\partial}{\partial x_i}$ be the Euler vector field. Define $\rho : \mathfrak{gl}(n + 1) \to \mathrm{End}(\mathbb{C}[x_1, \cdots, x_n])$ as follows:

$$\rho(E_{i,j}) = \begin{cases}
x_i \dfrac{\partial}{\partial x_j} & \text{if } i, j \geq 1, \\[2mm]
\dfrac{\partial}{\partial x_j} & \text{if } i = 0,\ j \geq 1, \\[2mm]
-x_i \cdot (Eu + 2) & \text{if } i \geq 1,\ j = 0, \\[2mm]
Eu + 2 & \text{if } i = j = 0.
\end{cases}$$

Then $\rho$ is a Lie algebra homomorphism. However, this does not in any way integrate to a unitary representation of $SU(n + 1)$. First of all, it is necessary to rescale the scalar product on $\mathrm{Sym}^k \mathbb{C}^n$ by the factor $\frac{1}{k+1}$. Even after this is done, because of the sign in front of $x_i \cdot Eu$ above, one does not get a unitary representation of the compact group $SU(n + 1)$, but rather of the subgroup $U(1, n)$ of $GL(n + 1, \mathbb{C})$ which preserves the indefinite hermitian form $-|x_0|^2 + |x_1|^2 + \cdots + |x_n|^2$. This was described in more detail by Sahi in [Sa]. Indeed, this yields the minimal representation of $SU(1, n)$. In infinite dimensions, we expect

Conjecture 1. Let $K$ be a separable Hilbert space, and let $H = \mathbb{C} \oplus K$ with the hermitian form $((a, v), (b, w)) = -a\bar{b} + (v, w)$. Then the unitary group $U(1, \infty)$ of this non-definite hermitian form acts unitarily on the exponential Hilbert space $e^K$.

Therefore we have found a strong obstruction to unitarity of a projective representation of $\mathrm{Map}(M, G)$ on the continuous tensor product $\otimes_{X, \frac{\mu}{n}} H$. It may be possible, however, that one could integrate a Lie algebra representation to a Lie group representation which would not be unitary.

Acknowledgement. I thank Boris Khesin for a useful discussion on the work of Etingof and Frenkel, and Ranee Brylinski for valuable information about minimal representations. I thank Dmitri Burago for useful remarks on a first draft of this paper. I am grateful to Princeton University and Harvard University for providing me with a stimulating intellectual atmosphere in the Spring of 95 and in the Summer of 95, respectively.

References

[A-W] Araki, H. and Woods, E-J.: Complete Boolean algebras of type I factors. Publ. RIMS 2, 157–242 (1965)
[B-C-D] Bernat, P., Conze-Berline, N., Duflo, M., Lévy-Nahas, M., Raïs, M., Renouard, P. and Vergne, M.: Représentations des Groupes de Lie Résolubles. Monographies Soc. Math. Fr. 4 (1972)
[Bl] Bloch, S.: The dilogarithm and extensions of Lie algebras. Algebraic K-Theory, Evanston 1980. Lecture Notes in Math. Vol. 854. Berlin–Heidelberg–New York: Springer-Verlag, 1981, pp. 1–23
[Co] Connes, A.: A survey of foliations and operator algebras. Proc. Symp. Pure Math. 38, Part I, 521–628 (1982)
[Di] Dixmier, J.: Algèbres Enveloppantes. Paris: Gauthier-Villars, 1974
[E-F] Etingof, P. and Frenkel, I.: Central extensions of gauge groups in two dimensions. Commun. Math. Phys. 165, 429–444 (1994)
[Fr] Frenkel, I.: Orbital theory for affine Lie algebras. Invent. Math. 77, 301–352 (1984)
[Go] Goncharov, A.: Constructions of Weyl representations of some simple Lie algebras. Funct. Anal. Appl. 16, 133–134 (1982)
[Ka] Kac, V.: Infinite-Dimensional Lie algebras. Progress in Math. 44, Basel, Boston: Birkhäuser, 1985
[K-P] Kac, V. and Peterson, D.H.: Infinite-dimensional Lie algebras, theta functions and modular forms. Adv. Math. 53, 125–164 (1984)
[Ki] Kirillov, A.A.: Unitary representations of nilpotent Lie groups. Usp. Mat. Nauk 17, 57–110 (1962)
[Ko] Kostant, B.: Quantization and unitary representations. Lecture Notes in Math. Vol. 170, Berlin–Heidelberg–New York: Springer-Verlag, 1970
[M] Mickelsson, J.: Kac-Moody groups, topology of the Dirac determinant bundle, and fermionization. Commun. Math. Phys. 110, 173–183 (1987)
[P-S] Pressley, A. and Segal, G.: Loop Groups. Oxford: Clarendon Press, 1986
[Pu] Pukanszky, L.: Leçons sur les Représentations des Groupes. Paris: Dunod, 1966
[Sa] Sahi, S.: Unitary representations on the Shilov boundary of a symmetric tube domain. Repr. Theory of Groups and Algebras, Contemp. Math. 145, Providence, RI: Am. Math. Soc., 1993
[So] Souriau, J-M.: Structure des Systèmes Dynamiques. Paris: Dunod, 1970

Communicated by H. Araki

Commun. Math. Phys. 188, 367 – 378 (1997)

Communications in Mathematical Physics
© Springer-Verlag 1997

A Level-One Representation of the Quantum Affine Superalgebra $U_q(\widehat{sl}(M+1|N+1))$

Kazuhiro Kimura$^1$, Jun'ichi Shiraishi$^2$, Jun Uchiyama$^3$

1 Department of Physics, Osaka Institute of Technology, Omiya, Osaka 535, Japan
2 Institute for Solid State Physics, University of Tokyo, Roppongi, Tokyo 106, Japan
3 Department of Physics, Rikkyo University, Nishi-Ikebukuro, Tokyo 171, Japan

Received: 30 September 1996 / Accepted: 4 February 1997

Abstract: A level-one representation of the quantum affine superalgebra $U_q(\widehat{sl}(M+1|N+1))$ and vertex operators associated with the fundamental representations are constructed in terms of free bosonic fields. Character formulas of level-one irreducible highest weight modules of $U_q(\widehat{sl}(2|1))$ are conjectured.

1. Introduction

In recent studies of mathematical physics and solvable systems [DFJMN] (see also references in [JM]), affine quantum algebras [D1, D2, J] played an important role. In these works, techniques of representation theory, such as infinite-dimensional highest weight modules and the corresponding vertex operators, have proved to be very powerful for studying low-dimensional systems. Therefore, it is natural to expect that the representation theories of the affine Lie superalgebras or their quantum analogues will greatly help our future studies (for applications to number theory see [KWk1]).

The defining relations of quantum affine superalgebras were obtained by Yamane [Y]. In that paper, the Drinfeld realization [D2] is also studied. The representation theories of Lie superalgebras are much more complicated than the non-super cases and have rich structures [K1, K2, K3, FSS, KWn, KWk1]. Hence, it is desirable to obtain concrete representation spaces. The aim of this article is to construct a level-one representation of $U_q(\widehat{sl}(M+1|N+1))$ by bosonizing the Drinfeld generators, based on the free boson representation of $\widehat{sl}(M+1|N+1)$ obtained in [BCMN], and to study character formulas for highest weight modules.

This paper is organized as follows. In this section we review the definition of the quantum affine superalgebra $U_q(\widehat{sl}(M+1|N+1))$ and construct level-zero representations. In Sect. 2, we study the bosonization of $U_q(\widehat{sl}(M+1|N+1))$ and, using it, conjecture character formulas for level-one irreducible highest weight modules of $U_q(\widehat{sl}(M+1|N+1))$. Sect. 3 is devoted to the study of the bosonization of the vertex operators.


K. Kimura, J. Shiraishi, J.Uchiyama

After finishing this article, the paper [BT], which deals with the representation theory of $\widehat{sl}(2|1)$ at fractional level, appeared on the electronic bulletin board. A systematic classification of singular vectors in Verma modules is given in [BT]. This might enable us to check the validity of the character formulas for level-one irreducible highest weight modules conjectured in this article.

1.1. Quantum affine superalgebra $U_q(\widehat{sl}(M+1|N+1))$. We will study the quantum affine superalgebra $U_q(\widehat{sl}(M+1|N+1))$ for $M, N = 0, 1, \cdots$ and we will restrict our analysis to $M \neq N$. The Cartan matrix of the affine Lie superalgebra $\widehat{sl}(M+1|N+1)$ is

$$(a_{ij}) = \begin{pmatrix}
0 & -1 & & & & & & & & 1 \\
-1 & 2 & -1 & & & & & & & \\
 & -1 & 2 & \ddots & & & & & & \\
 & & \ddots & \ddots & -1 & & & & & \\
 & & & -1 & 2 & -1 & & & & \\
 & & & & -1 & 0 & 1 & & & \\
 & & & & & 1 & -2 & \ddots & & \\
 & & & & & & \ddots & \ddots & 1 & \\
 & & & & & & & 1 & -2 & 1 \\
1 & & & & & & & & 1 & -2
\end{pmatrix} \qquad (0 \leq i, j \leq M + N + 1), \tag{1.1}$$

where the diagonal part is $(a_{ii}) = (0, \overbrace{2, \cdots, 2}^{M}, 0, \overbrace{-2, \cdots, -2}^{N})$. Let us introduce the orthonormal basis $\{\varepsilon'_i \,|\, i = 1, \cdots, M + N + 2\}$ with the bilinear form $(\varepsilon'_i | \varepsilon'_j) = \nu_i \delta_{i,j}$, where $\nu_i = 1$ for $i = 1, \cdots, M + 1$ and $\nu_j = -1$ for $j = M + 2, \cdots, M + N + 2$. Define $\varepsilon_i = \varepsilon'_i - \nu_i \sum_{j=1}^{M+N+2} \varepsilon'_j / (M - N)$. The classical simple roots are defined by $\bar{\alpha}_i = \nu_i \varepsilon'_i - \nu_{i+1} \varepsilon'_{i+1}$ and the classical weights are $\bar{\Lambda}_i = \sum_{j=1}^{i} \varepsilon_j$ for $i = 1, \cdots, M + N + 1$. Introduce the affine weight $\Lambda_0$ and the null root $\delta$ having $(\Lambda_0 | \varepsilon'_i) = (\delta | \varepsilon'_i) = 0$ for $i = 1, \cdots, M + N + 2$ and $(\Lambda_0 | \Lambda_0) = (\delta | \delta) = 0$, $(\Lambda_0 | \delta) = 1$. The other affine weights and affine roots are given by $\Lambda_i = \bar{\Lambda}_i + \Lambda_0$ and $\alpha_i = \bar{\alpha}_i$ for $i = 1, \cdots, M + N + 1$ and $\alpha_0 = \delta - \sum_{i=1}^{M+N+1} \alpha_i$. Let $P = \oplus_{i=0}^{M+N+1} \mathbb{Z} \Lambda_i \oplus \mathbb{Z} \delta$ and $P^* = \oplus_{i=0}^{M+N+1} \mathbb{Z} h_i \oplus \mathbb{Z} d$ be the affine weight lattice and its dual lattice, respectively.

The quantum affine algebra $U_q(\widehat{sl}(M+1|N+1))$ [Y] is a $q$-analogue of the universal enveloping algebra of $\widehat{sl}(M+1|N+1)$ generated by the Chevalley generators $e_i, f_i, t_i^{\pm 1}, d$ over the base field $\mathbb{Q}(q)$. The $\mathbb{Z}_2$-grading $|\cdot| : U_q(\widehat{sl}(M+1|N+1)) \to \mathbb{Z}_2$ of the generators is: $|e_0| = |f_0| = |e_{M+1}| = |f_{M+1}| = 1$ and zero otherwise. The relations among these generators are

$$t_i t_j = t_j t_i, \qquad t_i d = d t_i, \tag{1.2}$$
$$t_i e_j t_i^{-1} = q^{a_{ij}} e_j, \qquad [d, e_i] = \delta_{i,0} e_i, \tag{1.3}$$
$$t_i f_j t_i^{-1} = q^{-a_{ij}} f_j, \qquad [d, f_i] = -\delta_{i,0} f_i, \tag{1.4}$$
$$[e_i, f_j] = \delta_{ij} \frac{t_i - t_i^{-1}}{q - q^{-1}}, \tag{1.5}$$

Level-One Representation of Quantum Affine Superalgebra

369

$$[e_j, [e_j, e_i]_{q^{-1}}]_q = 0, \qquad [f_j, [f_j, f_i]_{q^{-1}}]_q = 0 \qquad \text{for } |a_{ij}| = 1,\ i \neq 0, M + 1, \tag{1.6}$$
$$[e_l, [e_k, [e_l, e_m]_{q^{-1}}]_q] = 0, \qquad [f_l, [f_k, [f_l, f_m]_{q^{-1}}]_q] = 0 \qquad \text{for } (k, l, m) = (M + 2, M + 1, M),\ (1, 0, M + N + 1), \tag{1.7}$$

where we have used the notation $[X, Y]_\xi = XY - (-1)^{|X||Y|} \xi Y X$. Here and hereafter, we write $[X, Y]_1$ as $[X, Y]$ for simplicity. If $M = 0$ or $N = 0$, we have extra fifth order Serre relations. For the explicit forms of the extra Serre relations, we refer the reader to ref. [Y].

The quantum affine superalgebra $U_q(\widehat{sl}(M+1|N+1))$ can be endowed with a graded Hopf algebra structure. We take the following coproduct

$$\Delta(e_i) = e_i \otimes 1 + t_i \otimes e_i, \qquad \Delta(f_i) = f_i \otimes t_i^{-1} + 1 \otimes f_i, \qquad \Delta(t_i^{\pm 1}) = t_i^{\pm 1} \otimes t_i^{\pm 1}, \qquad \Delta(d) = d \otimes 1 + 1 \otimes d, \tag{1.8}$$

and the antipode

$$a(e_i) = -t_i^{-1} e_i, \qquad a(f_i) = -f_i t_i, \qquad a(t_i^{\pm 1}) = t_i^{\mp 1}, \qquad a(d) = -d. \tag{1.9}$$

The coproduct is an algebra homomorphism, $\Delta(xy) = \Delta(x)\Delta(y)$, and the antipode is a graded algebra anti-automorphism, $a(xy) = (-1)^{|x||y|} a(y) a(x)$, for $x, y \in U_q(\widehat{sl}(M+1|N+1))$. Let $V$ and $W$ be graded representations of $U_q(\widehat{sl}(M+1|N+1))$. Hereafter, the $\mathbb{Z}_2$-grading on a representation space will also be denoted by $|\cdot|$. The graded action of $U_q(\widehat{sl}(M+1|N+1)) \otimes U_q(\widehat{sl}(M+1|N+1))$ on the tensor representation $V \otimes W$ is defined by $x \otimes y \cdot v \otimes w = (-1)^{|y||v|} xv \otimes yw$ for $x, y \in U_q(\widehat{sl}(M+1|N+1))$ and $v \in V$, $w \in W$.

In order to construct the bosonic representation of $U_q(\widehat{sl}(M+1|N+1))$, we give another realization of $U_q(\widehat{sl}(M+1|N+1))$ using the Drinfeld basis [Y]: $\{X_m^{\pm,i}, h_n^i, (K^i)^{\pm 1}, \gamma^{\pm 1/2} \,|\, i = 1, \cdots, M + N + 1,\ m \in \mathbb{Z},\ n \in \mathbb{Z}_{\neq 0}\}$ with the grading operator $d$. The $\mathbb{Z}_2$-grading of the Drinfeld generators is: $|X_m^{\pm,M+1}| = 1$ for $m \in \mathbb{Z}$ and zero otherwise. The relations are

$$\gamma \text{ is central}, \qquad [K^i, h_m^j] = 0, \qquad [d, K^i] = 0, \qquad [d, h_m^j] = m h_m^j, \tag{1.10}$$
$$[h_m^i, h_n^j] = \delta_{m+n,0} \frac{[a_{ij} m](\gamma^m - \gamma^{-m})}{m(q - q^{-1})}, \tag{1.11}$$
$$K^i X_m^{\pm,j} = q^{\pm a_{ij}} X_m^{\pm,j} K^i, \tag{1.12}$$
$$[d, X_m^{\pm,j}] = m X_m^{\pm,j}, \tag{1.13}$$
$$[h_m^i, X_n^{\pm,j}] = \pm \frac{[a_{ij} m]}{m} \gamma^{\mp |m|/2} X_{n+m}^{\pm,j}, \tag{1.14}$$
$$[X_m^{+,i}, X_n^{-,j}] = \frac{\delta_{i,j}}{q - q^{-1}} \big( \gamma^{(m-n)/2} \psi_{m+n}^{+,j} - \gamma^{-(m-n)/2} \psi_{m+n}^{-,j} \big), \tag{1.15}$$
$$[X_m^{\pm,i}, X_n^{\pm,j}] = 0 \quad \text{for } a_{ij} = 0, \qquad [X_{m+1}^{\pm,i}, X_n^{\pm,j}]_{q^{\pm a_{ij}}} + [X_{n+1}^{\pm,j}, X_m^{\pm,i}]_{q^{\pm a_{ij}}} = 0 \quad \text{for } a_{ij} \neq 0, \tag{1.16}$$
$$\mathrm{Sym}_{l,m} [X_l^{\pm,i}, [X_m^{\pm,i}, X_n^{\pm,j}]_{q^{-1}}]_q = 0 \quad \text{for } a_{ij} = 0,\ i \neq M + 1, \tag{1.17}$$
$$\mathrm{Sym}_{k,m} [X_k^{\pm,M+1}, [X_l^{\pm,M+2}, [X_m^{\pm,M+1}, X_n^{\pm,M}]_{q^{-1}}]_q] = 0, \tag{1.18}$$

where

$$\sum_{m \in \mathbb{Z}} \psi_m^{\pm,i} z^{-m} = (K^i)^{\pm 1} \exp\Big( \pm (q - q^{-1}) \sum_{n > 0} h_{\pm n}^i z^{\mp n} \Big), \tag{1.19}$$

and the symbol $\mathrm{Sym}_{k,l}$ means symmetrization with respect to $k$ and $l$. We used the standard notation $[x] = \frac{q^x - q^{-x}}{q - q^{-1}}$. If $M = 0$ or $N = 0$, we have extra fifth order Serre relations for the Drinfeld generators. For the explicit forms, we refer the reader to ref. [Y]. The Chevalley generators are obtained by the formulas:

$$t_i = K^i, \qquad e_i = X_0^{+,i}, \qquad f_i = X_0^{-,i} \qquad \text{for } i = 1, \cdots, M + N + 1, \tag{1.20}$$
$$t_0 = \gamma (K^1)^{-1} \cdots (K^{M+N+1})^{-1}, \tag{1.21}$$
$$e_0 = (-1)^{N+1} [X_0^{-,M+N+1}, \cdots, [X_0^{-,M+2}, [X_0^{-,M+1}, \cdots, [X_0^{-,2}, X_1^{-,1}]_{q^{-1}} \cdots ]_{q^{-1}}]_q \cdots ]_q \times (K^1 \cdots K^{M+N+1})^{-1}, \tag{1.22}$$
$$f_0 = K^1 \cdots K^{M+N+1} \times [\cdots [[\cdots [X_{-1}^{+,1}, X_0^{+,2}]_q, \cdots X_0^{+,M+1}]_q, X_0^{+,M+2}]_{q^{-1}}, \cdots X_0^{+,M+N+1}]_{q^{-1}}. \tag{1.23}$$
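The $q$-number $[x]$ entering the relations above is easy to experiment with numerically. The following sketch treats $q$ as a real number different from $0$ and $\pm 1$ (an assumption made only for this illustration; in the text $q$ is an indeterminate) and uses a helper name, `q_number`, that is not from the paper.

```python
def q_number(x, q):
    """The q-number [x] = (q^x - q^(-x)) / (q - q^(-1)) for a numerical value of q."""
    return (q ** x - q ** (-x)) / (q - 1 / q)

# [1] = 1 and [2] = q + q^(-1) for any admissible q.
one = q_number(1, 1.7)
two = q_number(2, 1.7)

# As q tends to 1 the q-number [x] tends to the ordinary number x.
almost_three = q_number(3, 1.000001)
```

In particular $[-x] = -[x]$, which is why a coefficient such as $[a_{ij} m]/m$ in (1.11) is unchanged under $m \to -m$.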

1.2. Level-zero representations of $U_q(\widehat{sl}(M+1|N+1))$. Let $E_{i,j}$ be the $(M + N + 2) \times (M + N + 2)$ matrix whose $(i, j)$-element is unity and zero elsewhere, and set $v_i = {}^t(0, \cdots, 0, \overset{i}{1}, 0, \cdots, 0)$ for $i = 1, \cdots, M + N + 2$. We will adopt the $\mathbb{Z}_2$-grading on the basis given by $|v_i| = (\nu_i + 1)/2$. For the sake of simplicity, we will not study the other possibility $|v_i| = (-\nu_i + 1)/2$ in this article. The $M + N + 2$ dimensional level-zero representation $V_z$ of $U_q(\widehat{sl}(M+1|N+1))$ with basis $\{v_i \otimes z^n \,|\, i = 1, \cdots, M + N + 2,\ n \in \mathbb{Z}\}$ is defined by

$$e_i = E_{i,i+1}, \qquad f_i = \nu_i E_{i+1,i}, \qquad t_i = q^{\nu_i E_{i,i} - \nu_{i+1} E_{i+1,i+1}},$$
$$e_0 = -z E_{M+N+2,1}, \qquad f_0 = z^{-1} E_{1,M+N+2}, \qquad t_0 = q^{-E_{1,1} - E_{M+N+2,M+N+2}}, \qquad d = z \frac{d}{dz}, \tag{1.24}$$

for $i = 1, \cdots, M + N + 1$. Let $V_z^*$ be the dual space of $V_z$ with basis $\{v_i^* \otimes z^n \,|\, i = 1, \cdots, M + N + 2,\ n \in \mathbb{Z}\}$ such that $\langle v_i^* \otimes z^m, v_j \otimes z^n \rangle = \delta_{i,j} \delta_{m+n,0}$. We also regard $v_i^*$ as the vector $v_i = {}^t(0, \cdots, 0, \overset{i}{1}, 0, \cdots, 0)$. The $U_q(\widehat{sl}(M+1|N+1))$-module structure is given by $\langle xv, w \rangle = \langle v, (-1)^{|x||v|} a(x) w \rangle$ for $v \in V_z^*$, $w \in V_z$ and we call the module $V_z^{*a}$. The representation is:

$$e_i = -\nu_i \nu_{i+1} q^{-\nu_i} E_{i+1,i}, \qquad f_i = -\nu_i q^{\nu_i} E_{i,i+1}, \qquad t_i = q^{-\nu_i E_{i,i} + \nu_{i+1} E_{i+1,i+1}},$$
$$e_0 = q z E_{1,M+N+2}, \qquad f_0 = q^{-1} z^{-1} E_{M+N+2,1}, \qquad t_0 = q^{E_{1,1} + E_{M+N+2,M+N+2}}. \tag{1.25}$$
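As a sanity check of the level-zero action (1.24) on $V_z$, one can verify relation (1.5) with explicit matrices. The sketch below is an illustration only: it fixes the smallest case $M = 1$, $N = 0$ (that is, $\widehat{sl}(2|1)$), picks arbitrary numerical values for $q$ and $z$, ignores the grading operator $d$, and takes the parity of $E_{i,j}$ to be $|v_i| + |v_j|$ mod 2. All function and variable names are hypothetical.

```python
Q, Z = 1.3, 0.7                      # sample numerical values for q and z (assumptions)
M, N = 1, 0                          # the case sl(2|1)
DIM = M + N + 2
NU = [1] * (M + 1) + [-1] * (N + 1)  # NU[i - 1] = nu_i

def unit(i, j, coeff=1.0):
    """coeff * E_{i,j}, with 1-based indices as in the text."""
    return [[coeff if (r, c) == (i - 1, j - 1) else 0.0 for c in range(DIM)]
            for r in range(DIM)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(DIM)) for c in range(DIM)]
            for r in range(DIM)]

def cartan_side(exponents):
    """Diagonal matrix (t - t^{-1}) / (q - q^{-1}) for t = q^{diag(exponents)}."""
    return [[(Q ** exponents[r] - Q ** (-exponents[r])) / (Q - 1 / Q) if r == c else 0.0
             for c in range(DIM)] for r in range(DIM)]

def parity(i, j):
    """E_{i,j} is odd exactly when v_i and v_j have different gradings."""
    return 0 if NU[i - 1] == NU[j - 1] else 1

# gens[i] = (e_i, f_i, exponents of t_i, parity of e_i and f_i), following (1.24).
gens = {}
for i in range(1, DIM):
    expo = [0] * DIM
    expo[i - 1] = NU[i - 1]          # + nu_i E_{i,i}
    expo[i] = -NU[i]                 # - nu_{i+1} E_{i+1,i+1}
    gens[i] = (unit(i, i + 1), unit(i + 1, i, NU[i - 1]), expo, parity(i, i + 1))
expo0 = [0] * DIM
expo0[0] = expo0[DIM - 1] = -1
gens[0] = (unit(DIM, 1, -Z), unit(1, DIM, 1 / Z), expo0, parity(DIM, 1))

def max_deviation():
    """Largest entry of [e_i, f_j] - delta_ij (t_i - t_i^{-1})/(q - q^{-1}) over all i, j."""
    worst = 0.0
    for i, (e_i, _, expo_i, p_i) in gens.items():
        for j, (_, f_j, _, p_j) in gens.items():
            sign = -1 if (p_i and p_j) else 1     # (-1)^{|e_i||f_j|}
            ef, fe = matmul(e_i, f_j), matmul(f_j, e_i)
            target = cartan_side(expo_i) if i == j else [[0.0] * DIM for _ in range(DIM)]
            for r in range(DIM):
                for c in range(DIM):
                    worst = max(worst, abs(ef[r][c] - sign * fe[r][c] - target[r][c]))
    return worst
```

For the odd generators ($i = 0$ and $i = M + 1$) the bracket is an anticommutator, which is why the sign factor matters; calling `max_deviation()` should return a value at the level of rounding error.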

The Drinfeld generators on $V_z$ are represented by

$$h_m^i = \frac{[m]}{m} (q^{\mu_i} z)^m \big( \nu_i q^{-\nu_i m} E_{i,i} - \nu_{i+1} q^{\nu_{i+1} m} E_{i+1,i+1} \big), \qquad K^i = q^{\nu_i E_{i,i} - \nu_{i+1} E_{i+1,i+1}},$$
$$X_m^{+,i} = (q^{\mu_i} z)^m E_{i,i+1}, \qquad X_m^{-,i} = \nu_i (q^{\mu_i} z)^m E_{i+1,i}, \tag{1.26}$$

and on $V_z^{*a}$,

$$h_m^i = -\frac{[m]}{m} (q^{-\mu_i} z)^m \big( \nu_i q^{\nu_i m} E_{i,i} - \nu_{i+1} q^{-\nu_{i+1} m} E_{i+1,i+1} \big), \qquad K^i = q^{-\nu_i E_{i,i} + \nu_{i+1} E_{i+1,i+1}},$$
$$X_m^{+,i} = -\nu_i \nu_{i+1} q^{-\nu_i} (q^{-\mu_i} z)^m E_{i+1,i}, \qquad X_m^{-,i} = -\nu_i q^{\nu_i} (q^{-\mu_i} z)^m E_{i,i+1}, \tag{1.27}$$

where $\mu_i = \sum_{k=1}^{i} \nu_k$.

2. A Level-One Representation of $U_q(\widehat{sl}(M+1|N+1))$

2.1. Free boson realization. Now, we will study the free boson realization of $U_q(\widehat{sl}(M+1|N+1))$ which gives us a level-one representation. It is well known that the representations of the non-super affine algebras are constructed in terms of bosonic fields at level-one [FK]. Based on this realization, Frenkel and Jing constructed the free boson representation of the quantum affine algebras at level-one [FJ]. We will show that this kind of bosonization can be extended to the affine superalgebras of A-type. Our representation can be regarded as a $q$-deformation of the free field realization of $\widehat{sl}(M+1|N+1)$ studied by Bouwknegt et al. [BCMN]. The structure of the deformation is essentially the same as that of Frenkel and Jing's except for the deformation of the $\beta$-$\gamma$ ghost-system. To this ghost-system, however, the technique for bosonizing the $\beta$-$\gamma$ ghost which was discussed in the papers on the deformed Wakimoto realization of $U_q(\widehat{sl}(N))$ (see [AOS] and references therein) is applicable.

Let us introduce the bosonic oscillators $\{a_n^i, b_n^j, c_n^j, Q_{a^i}, Q_{b^j}, Q_{c^j} \,|\, n \in \mathbb{Z},\ i = 1, \cdots, M + 1,\ j = 1, \cdots, N + 1\}$ satisfying the commutation relations

$$[a_m^i, a_n^j] = \delta_{i,j} \delta_{m+n,0} \frac{[m]^2}{m}, \qquad [a_0^i, Q_{a^j}] = \delta_{i,j}, \tag{2.1}$$
$$[b_m^i, b_n^j] = -\delta_{i,j} \delta_{m+n,0} \frac{[m]^2}{m}, \qquad [b_0^i, Q_{b^j}] = -\delta_{i,j}, \tag{2.2}$$
$$[c_m^i, c_n^j] = \delta_{i,j} \delta_{m+n,0} \frac{[m]^2}{m}, \qquad [c_0^i, Q_{c^j}] = \delta_{i,j}. \tag{2.3}$$

The remaining commutators vanish. Define the generating functions for the Drinfeld basis by $X^{\pm,i}(z) = \sum_{m \in \mathbb{Z}} X_m^{\pm,i} z^{-m-1}$, and introduce $h_0^i$ by setting $K^i = q^{h_0^i}$. Define $Q_{h^i} = Q_{a^i} - Q_{a^{i+1}}$ for $i = 1, \cdots, M$, $Q_{h^{M+1}} = Q_{a^{M+1}} + Q_{b^1}$ and $Q_{h^{M+1+j}} = -Q_{b^j} + Q_{b^{j+1}}$ for $j = 1, \cdots, N$. Let us introduce the notation

$$h^i(z; \kappa) = -\sum_{n \neq 0} \frac{h_n^i}{[n]} q^{-\kappa |n|} z^{-n} + Q_{h^i} + h_0^i \ln z, \tag{2.4}$$

for the Drinfeld generators $h_m^i$, $Q_{h^i}$ and $\kappa \in \mathbb{R}$. In this article, we will adopt this notation for other bosonic fields; for example, the boson field $c^j(z; \kappa)$ should be defined in the same way. We introduce the $q$-differential operator defined by ${}_1\partial_z f(z) = \frac{f(qz) - f(q^{-1}z)}{(q - q^{-1})z}$. Now we state the result of the bosonization.

Proposition 2.1. The Drinfeld generators at level-one are realized by the free boson fields as

$$\gamma = q, \tag{2.5}$$
$$h^i_m = a^i_m q^{-|m|/2} - a^{i+1}_m q^{|m|/2}, \tag{2.6}$$
$$h^{M+1}_m = a^{M+1}_m q^{-|m|/2} + b^1_m q^{-|m|/2}, \tag{2.7}$$
$$h^{M+1+j}_m = -b^j_m q^{|m|/2} + b^{j+1}_m q^{-|m|/2}, \tag{2.8}$$
$$X^{+,i}(z) = {:}e^{h^i(z;1/2)}{:}\; e^{i\pi a^i_0}, \tag{2.9}$$
$$X^{+,M+1}(z) = {:}e^{h^{M+1}(z;1/2)}\, e^{c^1(z;0)}{:}\; \prod_{i=1}^{M} e^{-i\pi a^i_0}, \tag{2.10}$$
$$X^{+,M+1+j}(z) = {:}e^{h^{M+1+j}(z;1/2)}\, \bigl[{}_1\partial_z e^{-c^j(z;0)}\bigr]\, e^{c^{j+1}(z;0)}{:}, \tag{2.11}$$
$$X^{-,i}(z) = -{:}e^{-h^i(z;-1/2)}{:}\; e^{-i\pi a^i_0}, \tag{2.12}$$
$$X^{-,M+1}(z) = {:}e^{-h^{M+1}(z;-1/2)}\, \bigl[{}_1\partial_z e^{-c^1(z;0)}\bigr]{:}\; \prod_{i=1}^{M} e^{i\pi a^i_0}, \tag{2.13}$$
$$X^{-,M+1+j}(z) = -{:}e^{-h^{M+1+j}(z;-1/2)}\, e^{c^j(z;0)}\, \bigl[{}_1\partial_z e^{-c^{j+1}(z;0)}\bigr]{:}, \tag{2.14}$$

for $m \in \mathbb{Z}_{\neq 0}$, $i = 1, \dots, M$ and $j = 1, \dots, N$. The usual normal ordering is denoted by ${:}\cdots{:}$.

Proof. We can check the commutation relations by straightforward calculations of the operator product expansions among the bosonized generators.

This realization may help us to study the quantum super-geometrical meaning of the Drinfeld basis [D1, Y], in other words, the structure of quantum flag super-manifolds, and also applications to the calculation of correlation functions of one-dimensional quantum systems. To this end, we will study irreducible highest weight modules, their characters and vertex operators in the following sections.

2.2. Highest weight $U_q(\widehat{sl}(2|1))$-modules. To exploit special features of the quantum affine superalgebras, we study the simplest example, $U_q(\widehat{sl}(2|1))$, of the level-one representation obtained in the last subsection. We begin by defining the Fock module. The vacuum vector $|0\rangle$ is defined by $a^i_n|0\rangle = b_n|0\rangle = c_n|0\rangle = 0$ for $n \geq 0$, and the vector carrying the weight $(\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c) \in \mathbb{C}^4$ by

$$|\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c\rangle = e^{\lambda_{a^1} Q_{a^1} + \lambda_{a^2} Q_{a^2} + \lambda_b Q_b + \lambda_c Q_c}\,|0\rangle. \tag{2.15}$$

The Fock module $F_{\lambda_{a^1},\lambda_{a^2},\lambda_b,\lambda_c}$ is generated by acting with the creation operators $h^1_n = a^1_n q^{n/2} - a^2_n q^{-n/2}$, $h^2_n = a^2_n q^{n/2} + b_n q^{n/2}$ and $c_n$ ($n < 0$) on $|\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c\rangle$. To obtain highest weight vectors of $U_q(\widehat{sl}(2|1))$, we impose the conditions

$$e_i\,|\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c\rangle = 0, \qquad h_i\,|\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c\rangle = \lambda_i\,|\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c\rangle, \qquad \text{for } i = 0, 1, 2. \tag{2.16}$$

Solving these equations, we obtain the following classification:

(1) $(\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c) = (\beta, \beta, \beta - \alpha, -\alpha)$, where α and β are arbitrary. The weight of this vector is $(\lambda_0, \lambda_1, \lambda_2) = (1 - \alpha, 0, \alpha)$. Thus we have the identification $|(1-\alpha)\Lambda_0 + \alpha\Lambda_2\rangle = |\beta, \beta, \beta - \alpha, -\alpha\rangle$.

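(2) $(\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c) = (\beta + 1, \beta, \beta, 0)$, where β is arbitrary. The weight is $(\lambda_0, \lambda_1, \lambda_2) = (0, 1, 0)$. We have $|\Lambda_1\rangle = |\beta + 1, \beta, \beta, 0\rangle$.

(3) $(\lambda_{a^1}, \lambda_{a^2}, \lambda_b, \lambda_c) = (\beta + 1, \beta + 1, \beta, 0)$, where β is arbitrary. The weight is $(\lambda_0, \lambda_1, \lambda_2) = (0, 0, 1)$ and we have $|\Lambda_2\rangle = |\beta + 1, \beta + 1, \beta, 0\rangle$.

According to this classification, let us introduce the following spaces:

$$F_{(\alpha;\beta)} = \bigoplus_{i,j \in \mathbb{Z}} F_{\beta+i,\ \beta-i+j,\ \beta-\alpha+j,\ -\alpha+j},$$
$$F_{((1,0);\beta)} = \bigoplus_{i,j \in \mathbb{Z}} F_{\beta+1+i,\ \beta-i+j,\ \beta+j,\ j}, \tag{2.17}$$
$$F_{((0,1);\beta)} = \bigoplus_{i,j \in \mathbb{Z}} F_{\beta+1+i,\ \beta+1-i+j,\ \beta+j,\ j}.$$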

It is not difficult to see that the bosonized actions of $U_q(\widehat{sl}(2|1))$ on these spaces are closed, i.e. $U_q(\widehat{sl}(2|1))\,F_{(*;\beta)} = F_{(*;\beta)}$, where $* = \alpha, (1,0), (0,1)$. These spaces are not irreducible in general. To obtain the irreducible subspaces of these $U_q(\widehat{sl}(2|1))$-modules, it is convenient to introduce a pair of fermionic fields $\eta(z) = \sum_{n \in \mathbb{Z}} \eta_n z^{-n-1} = {:}e^{c(z;0)}{:}$ and $\xi(z) = \sum_{n \in \mathbb{Z}} \xi_n z^{-n} = {:}e^{-c(z;0)}{:}$. The mode expansion of $\eta(z)$, $\xi(z)$ is well defined on $F_{(\alpha;\beta)}$ for $\alpha \in \mathbb{Z}$ and on $F_{((1,0);\beta)}$, $F_{((0,1);\beta)}$, and the relations are $\{\xi_r, \xi_s\} = \{\eta_r, \eta_s\} = 0$, $\{\xi_r, \eta_s\} = \delta_{r+s,0}$. In these cases, we have the direct sum decompositions $F_{(*;\beta)} = \eta_0\xi_0 F_{(*;\beta)} \oplus \xi_0\eta_0 F_{(*;\beta)}$. As usual, we call $\eta_0\xi_0 F_{(*;\beta)}$ the space $\mathrm{Ker}\,\eta_0\,{}_{(*;\beta)}$ and $F_{(*;\beta)}/\eta_0\xi_0 F_{(*;\beta)}$ the space $\mathrm{Coker}\,\eta_0\,{}_{(*;\beta)}$. Since $\eta_0$ commutes (or anti-commutes) with every element of $U_q(\widehat{sl}(2|1))$, we can regard $\mathrm{Ker}\,\eta_0\,{}_{(*;\beta)}$ and $\mathrm{Coker}\,\eta_0\,{}_{(*;\beta)}$ as $U_q(\widehat{sl}(2|1))$-modules.

From now on, we study the character formulas of the $U_q(\widehat{sl}(2|1))$-modules we have constructed in the bosonic Fock space. The character of a space $F$ is defined by

$$\mathrm{ch}_F(q, x, y) \equiv \mathrm{tr}_F\, q^{-d}\, x^{h^1_0}\, y^{h^2_0}, \tag{2.18}$$

where the grading operator $d$ is [BCMN]

$$d = -\sum_{n \geq 1} \Bigl(\frac{n}{[n]}\Bigr)^2 \bigl\{ a^1_{-n}a^1_n + a^2_{-n}a^2_n - b_{-n}b_n + c_{-n}c_n - (a^1_{-n} + a^2_{-n} + b_{-n})(a^1_n + a^2_n + b_n) \bigr\} - \frac{1}{2}\bigl\{ (a^1_0)^2 + (a^2_0)^2 - (b_0)^2 + c_0(c_0 + 1) - (a^1_0 + a^2_0 + b_0)^2 \bigr\}, \tag{2.19}$$

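and $h^1_0 = a^1_0 - a^2_0$, $h^2_0 = a^2_0 + b_0$.

(I) Character of $F_{(\alpha;\beta)}$ for $\alpha \notin \mathbb{Z}$. Since $\eta_0$ is not defined on this module, we simply study the character of the bosonic Fock space without any BRST resolution.

Proposition 2.2. We obtain the character of $F_{(\alpha;\beta)}$ as

$$\mathrm{ch}_{F_{(\alpha;\beta)}}(q, x, y) = \frac{q^{-\frac{1}{2}\alpha(\alpha+1)}}{\prod_{n=1}^{\infty}(1 - q^n)^3} \sum_{s,t \in \mathbb{Z}} q^{\frac{1}{2}(2s^2 - 2st + t^2 + t)}\, x^{2s-t}\, y^{\alpha - s}. \tag{2.21}$$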
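The zero-mode part of (2.21) can be cross-checked against (2.19) by direct arithmetic. The following is a minimal sketch, not part of the original derivation: the function names are illustrative, and it assumes that $b_0$ acts on $e^{\lambda_b Q_b}|0\rangle$ with eigenvalue $-\lambda_b$, as follows from $[b_0, Q_b] = -1$ in (2.2). It evaluates the zero-mode term of $-d$ and the Cartan weights on the vacuum of $F_{\beta+i,\beta-i+j,\beta-\alpha+j,-\alpha+j}$ and compares them with the exponents appearing in (2.21) for $s = i$, $t = j$.

```python
from fractions import Fraction


def zero_mode_eigenvalues(alpha, beta, i, j):
    """Eigenvalues of a^1_0, a^2_0, b_0, c_0 on the vacuum of
    F_{beta+i, beta-i+j, beta-alpha+j, -alpha+j}."""
    a1 = beta + i
    a2 = beta - i + j
    b = -(beta - alpha + j)  # assumption: [b_0, Q_b] = -1 flips the sign
    c = -alpha + j
    return a1, a2, b, c


def zero_mode_exponent(alpha, beta, i, j):
    """Power of q contributed by the zero-mode part of -d in (2.19)."""
    a1, a2, b, c = zero_mode_eigenvalues(alpha, beta, i, j)
    return Fraction(1, 2) * (a1**2 + a2**2 - b**2 + c * (c + 1) - (a1 + a2 + b) ** 2)


def cartan_weights(alpha, beta, i, j):
    """Eigenvalues of h^1_0 = a^1_0 - a^2_0 and h^2_0 = a^2_0 + b_0."""
    a1, a2, b, _ = zero_mode_eigenvalues(alpha, beta, i, j)
    return a1 - a2, a2 + b


def expected_exponent(alpha, s, t):
    """Exponent of q in the (s, t) term of (2.21), prefactor included."""
    return Fraction(2 * s * s - 2 * s * t + t * t + t, 2) - alpha * (alpha + 1) / 2


if __name__ == "__main__":
    alpha, beta = Fraction(1, 3), Fraction(5, 7)
    for i in range(-2, 3):
        for j in range(-2, 3):
            assert zero_mode_exponent(alpha, beta, i, j) == expected_exponent(alpha, i, j)
            assert cartan_weights(alpha, beta, i, j) == (2 * i - j, alpha - i)
    print("zero-mode exponents agree with (2.21)")
```

Under the stated assumption, the β-dependence cancels between $(a^1_0)^2 + (a^2_0)^2 - (b_0)^2$ and $(a^1_0 + a^2_0 + b_0)^2$, which is the arithmetic behind the β-independence of the character.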

Note that the R.H.S. of (2.21) is independent of β. By some explicit calculations for lower lying vectors in $V((1-\alpha)\Lambda_0 + \alpha\Lambda_2)$, we have found that this character $\mathrm{ch}_{F_{(\alpha;\beta)}}(q, x, y)$ coincides with that of $V((1-\alpha)\Lambda_0 + \alpha\Lambda_2)$. Therefore, it is expected that $F_{(\alpha;\beta)}$ is the irreducible highest weight $U_q(\widehat{sl}(2|1))$-module. We conjecture the following.

Conjecture 2.1. We have the identification of the highest weight $U_q(\widehat{sl}(2|1))$-modules:

$$F_{(\alpha;\beta)} \cong V((1-\alpha)\Lambda_0 + \alpha\Lambda_2) \qquad \text{for } \alpha \notin \mathbb{Z} \text{ and arbitrary } \beta, \tag{2.20}$$

where $V(\lambda)$ denotes the irreducible highest weight $U_q(\widehat{sl}(2|1))$-module with highest weight λ.

(II) Character of $\mathrm{Ker}\,\eta_0$ and $\mathrm{Coker}\,\eta_0$. Next, let us consider the character of the modules on which $\eta_0$ is well defined.

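Proposition 2.3. The character of $\mathrm{Ker}\,\eta_0\,{}_{(\alpha;\beta)}$ for $\alpha \in \mathbb{Z}$ is obtained as

$$\mathrm{ch}_{\mathrm{Ker}\,\eta_0\,(\alpha;\beta)}(q, x, y) = \frac{q^{-\frac{1}{2}\alpha(\alpha+1)}}{\prod_{n=1}^{\infty}(1 - q^n)^3} \Biggl[ \sum_{\substack{s,t,l \in \mathbb{Z} \\ t < \alpha,\ l \geq 1}} (-1)^{l+1} q^{\frac{l(l-1)}{2} + l(\alpha - t) + \frac{1}{2}(2s^2 - 2st + t^2 + t)}\, x^{2s-t} y^{\alpha-s} + \sum_{\substack{s,t,l \in \mathbb{Z} \\ t \geq \alpha,\ l \geq 0}} (-1)^{l} q^{\frac{l(l+1)}{2} - l(\alpha - t) + \frac{1}{2}(2s^2 - 2st + t^2 + t)}\, x^{2s-t} y^{\alpha-s} \Biggr]. \tag{2.22}$$

The character of $\mathrm{Coker}\,\eta_0\,{}_{(\alpha;\beta)}$ ($\alpha \in \mathbb{Z}$) is

$$\mathrm{ch}_{\mathrm{Coker}\,\eta_0\,(\alpha;\beta)}(q, x, y) = \frac{q^{-\frac{1}{2}\alpha(\alpha+1)}}{\prod_{n=1}^{\infty}(1 - q^n)^3} \Biggl[ \sum_{\substack{s,t,l \in \mathbb{Z} \\ t < \alpha,\ l \geq 0}} (-1)^{l} q^{\frac{l(l-1)}{2} + l(\alpha - t) + \frac{1}{2}(2s^2 - 2st + t^2 + t)}\, x^{2s-t} y^{\alpha-s} + \sum_{\substack{s,t,l \in \mathbb{Z} \\ t \geq \alpha,\ l \geq 1}} (-1)^{l+1} q^{\frac{l(l+1)}{2} - l(\alpha - t) + \frac{1}{2}(2s^2 - 2st + t^2 + t)}\, x^{2s-t} y^{\alpha-s} \Biggr]. \tag{2.23}$$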
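A simple consistency check on (2.22) and (2.23): since $F_{(\alpha;\beta)} = \eta_0\xi_0 F_{(\alpha;\beta)} \oplus \xi_0\eta_0 F_{(\alpha;\beta)}$, the two characters should add up to the Fock space character (2.21). The sketch below (illustrative function names, common prefactor dropped) collects the bracketed sums as dictionaries of monomials over a finite window of $(s, t, l)$ and verifies that all terms with $l \geq 1$ cancel pairwise, leaving the $l = 0$ terms of (2.21). The cancellation is exact in any window, so the truncation bounds `R` and `L` are arbitrary choices.

```python
def quad(s, t):
    """(2s^2 - 2st + t^2 + t)/2, which is always an integer."""
    return (2 * s * s - 2 * s * t + t * t + t) // 2


def bracket(alpha, kind, R=4, L=6):
    """Bracketed sum of (2.22) (kind='ker') or (2.23) (kind='coker'),
    restricted to |s|, |t| <= R and 0 <= l <= L, returned as a dict
    {(q-power, x-power, y-power): coefficient}."""
    terms = {}
    for s in range(-R, R + 1):
        for t in range(-R, R + 1):
            for l in range(L + 1):
                if t < alpha:
                    power = l * (l - 1) // 2 + l * (alpha - t) + quad(s, t)
                    sign = (-1) ** (l + 1) if kind == "ker" else (-1) ** l
                    allowed = (l >= 1) if kind == "ker" else True
                else:
                    power = l * (l + 1) // 2 - l * (alpha - t) + quad(s, t)
                    sign = (-1) ** l if kind == "ker" else (-1) ** (l + 1)
                    allowed = True if kind == "ker" else (l >= 1)
                if allowed:
                    key = (power, 2 * s - t, alpha - s)
                    terms[key] = terms.get(key, 0) + sign
    return terms


def fock(alpha, R=4):
    """Sum over s, t in (2.21) on the same window, prefactor dropped."""
    terms = {}
    for s in range(-R, R + 1):
        for t in range(-R, R + 1):
            key = (quad(s, t), 2 * s - t, alpha - s)
            terms[key] = terms.get(key, 0) + 1
    return terms


def add(p, q):
    """Add two monomial dictionaries and drop vanishing coefficients."""
    total = {}
    for key in set(p) | set(q):
        value = p.get(key, 0) + q.get(key, 0)
        if value != 0:
            total[key] = value
    return total


if __name__ == "__main__":
    for alpha in (-2, 0, 1, 3):
        assert add(bracket(alpha, "ker"), bracket(alpha, "coker")) == fock(alpha)
    print("ch Ker + ch Coker = ch F on the truncated window")
```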

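These formulas are obtained by inserting the projectors $\eta_0\xi_0$ and $\xi_0\eta_0$ into the trace over the Fock space. For the details of this technique, see [BCMN]. We also have the following formulas.

Proposition 2.4. We have the equalities $\mathrm{ch}_{\mathrm{Coker}\,\eta_0\,((0,1);\beta)}(q, x, y) = \mathrm{ch}_{\mathrm{Coker}\,\eta_0\,(1;\beta)}(q, x, y)$, and

$$\mathrm{ch}_{\mathrm{Coker}\,\eta_0\,((1,0);\beta)}(q, x, y) = \frac{1}{\prod_{n=1}^{\infty}(1 - q^n)^3} \Biggl[ \sum_{\substack{s,t,l \in \mathbb{Z} \\ t < 0,\ l \geq 0}} (-1)^{l} q^{\frac{l(l-1)}{2} - lt + \frac{1}{2}(2s^2 - 2st + t^2 + t)}\, x^{1+2s-t} y^{-s} + \sum_{\substack{s,t,l \in \mathbb{Z} \\ t \geq 0,\ l \geq 1}} (-1)^{l+1} q^{\frac{l(l+1)}{2} + lt + \frac{1}{2}(2s^2 - 2st + t^2 + t)}\, x^{1+2s-t} y^{-s} \Biggr]. \tag{2.24}$$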

Since we have the conditions $\eta_0|\beta, \beta, \beta - \alpha, -\alpha\rangle \neq 0$ for $\alpha = 0, 1, \dots$ and $\eta_0|\beta, \beta, \beta - \alpha, -\alpha\rangle = 0$ for $\alpha = -1, -2, \dots$, the modules $\mathrm{Coker}\,\eta_0\,{}_{(\alpha;\beta)}$ ($\alpha = 0, 1, \dots$), $\mathrm{Coker}\,\eta_0\,{}_{((0,1);\beta)}$, $\mathrm{Coker}\,\eta_0\,{}_{((1,0);\beta)}$ and $\mathrm{Ker}\,\eta_0\,{}_{(\alpha;\beta)}$ ($\alpha = -1, -2, \dots$) are highest weight $U_q(\widehat{sl}(2|1))$-modules. It is expected that these modules are also irreducible with respect to the action of $U_q(\widehat{sl}(2|1))$.

Conjecture 2.2. We have the following identifications of the highest weight $U_q(\widehat{sl}(2|1))$-modules:

$$V((1-\alpha)\Lambda_0 + \alpha\Lambda_2) \cong \begin{cases} \mathrm{Coker}\,\eta_0\,{}_{(\alpha;\beta)} & \text{for } \alpha = 0, 1, \dots, \\ \mathrm{Ker}\,\eta_0\,{}_{(\alpha;\beta)} & \text{for } \alpha = -1, -2, \dots, \end{cases} \tag{2.25}$$

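and $V(\Lambda_1) \cong \mathrm{Coker}\,\eta_0\,{}_{((1,0);\beta)}$, $V(\Lambda_2) \cong \mathrm{Coker}\,\eta_0\,{}_{((0,1);\beta)}$ for arbitrary β.

We have checked the validity of Conjecture 2.1 and Conjecture 2.2 for $V(\Lambda_0)$, $V(\Lambda_1)$, $V(\Lambda_2)$ up to certain degrees by comparing our results with those of Kac and Wakimoto [KWk2].

3. Vertex Operators

3.1. Vertex operators for $U_q(\widehat{sl}(M+1|N+1))$. In this section, we study the free boson realization of vertex operators for $U_q(\widehat{sl}(M+1|N+1))$. Let $V(\lambda)$ be the highest weight $U_q(\widehat{sl}(M+1|N+1))$-module with highest weight λ. The $\mathbb{Z}_2$-gradation of $V(\lambda)$ is also denoted by $|\cdot|$. The vertex operators $\Phi^{\mu V}_{\lambda}(z)$, $\Phi^{\mu V^*}_{\lambda}(z)$, $\Psi^{V\mu}_{\lambda}(z)$, $\Psi^{V^*\mu}_{\lambda}(z)$ are defined as the following intertwiners of $U_q(\widehat{sl}(M+1|N+1))$-modules, if they exist:

$$\Phi^{\mu V}_{\lambda}(z) : V(\lambda) \to V(\mu) \otimes V_z, \qquad \Phi^{\mu V^*}_{\lambda}(z) : V(\lambda) \to V(\mu) \otimes V_z^{*a}, \tag{3.1}$$
$$\Psi^{V\mu}_{\lambda}(z) : V(\lambda) \to V_z \otimes V(\mu), \qquad \Psi^{V^*\mu}_{\lambda}(z) : V(\lambda) \to V_z^{*a} \otimes V(\mu), \tag{3.2}$$
$$\Phi^{\mu V}_{\lambda}(z) \cdot x = \Delta(x) \cdot \Phi^{\mu V}_{\lambda}(z), \qquad \Phi^{\mu V^*}_{\lambda}(z) \cdot x = \Delta(x) \cdot \Phi^{\mu V^*}_{\lambda}(z), \tag{3.3}$$
$$\Psi^{V\mu}_{\lambda}(z) \cdot x = \Delta(x) \cdot \Psi^{V\mu}_{\lambda}(z), \qquad \Psi^{V^*\mu}_{\lambda}(z) \cdot x = \Delta(x) \cdot \Psi^{V^*\mu}_{\lambda}(z), \tag{3.4}$$

for all $x \in U_q(\widehat{sl}(M+1|N+1))$, together with the gradation $|\Phi^{\mu V}_{\lambda}(z)| = |\Phi^{\mu V^*}_{\lambda}(z)| = |\Psi^{V\mu}_{\lambda}(z)| = |\Psi^{V^*\mu}_{\lambda}(z)| = 0$. We expand the vertex operators as

$$\Phi^{\mu V}_{\lambda}(z) = \sum_{l=1}^{M+N+2} \Phi^{\mu V}_{\lambda l}(z) \otimes v_l, \qquad \Phi^{\mu V^*}_{\lambda}(z) = \sum_{l=1}^{M+N+2} \Phi^{\mu V^*}_{\lambda l}(z) \otimes v_l^*,$$
$$\Psi^{V\mu}_{\lambda}(z) = \sum_{l=1}^{M+N+2} v_l \otimes \Psi^{V\mu}_{\lambda l}(z), \qquad \Psi^{V^*\mu}_{\lambda}(z) = \sum_{l=1}^{M+N+2} v_l^* \otimes \Psi^{V^*\mu}_{\lambda l}(z).$$

We define the graded action of these expanded operators by

$$\Phi^{\mu V}_{\lambda}(z)|u\rangle = \sum_{l=1}^{M+N+2} \Phi^{\mu V}_{\lambda l}(z)|u\rangle \otimes v_l\, (-1)^{|v_l|\,||u\rangle|}, \qquad \Phi^{\mu V^*}_{\lambda}(z)|u\rangle = \sum_{l=1}^{M+N+2} \Phi^{\mu V^*}_{\lambda l}(z)|u\rangle \otimes v_l^*\, (-1)^{|v_l^*|\,||u\rangle|},$$
$$\Psi^{V\mu}_{\lambda}(z)|u\rangle = \sum_{l=1}^{M+N+2} v_l \otimes \Psi^{V\mu}_{\lambda l}(z)|u\rangle, \qquad \Psi^{V^*\mu}_{\lambda}(z)|u\rangle = \sum_{l=1}^{M+N+2} v_l^* \otimes \Psi^{V^*\mu}_{\lambda l}(z)|u\rangle,$$

for $|u\rangle \in V(\lambda)$.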

Let us introduce the following combinations of the Drinfeld operators:

$$h^{*i}_m = \sum_{j=1}^{M+N+1} \frac{[\alpha_{ij} m][\beta_{ij} m]}{[(M-N)m][m]}\, h^j_m, \qquad Q^*_{h^i} = \sum_{j=1}^{M+N+1} \frac{\alpha_{ij}\beta_{ij}}{M-N}\, Q_{h^j}, \qquad h^{*i}_0 = \sum_{j=1}^{M+N+1} \frac{\alpha_{ij}\beta_{ij}}{M-N}\, h^j_0, \tag{3.5}$$

where $h^j_0$ is defined by $K^j = q^{h^j_0}$ and

$$\alpha_{ij} = \begin{cases} \min(i,j) & \text{if } \min(i,j) \leq M+1, \\ 2(M+1) - \min(i,j) & \text{if } \min(i,j) > M+1, \end{cases} \tag{3.6}$$
$$\beta_{ij} = \begin{cases} M - N - \max(i,j) & \text{if } \max(i,j) \leq M+1, \\ -M - N - 2 + \max(i,j) & \text{if } \max(i,j) > M+1. \end{cases} \tag{3.7}$$
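The role of $\alpha_{ij}$ and $\beta_{ij}$ is that $\alpha_{ij}\beta_{ij}/(M-N)$ gives the entries of the inverse Cartan matrix. This can be checked numerically for small $M \neq N$. The sketch below is an illustration only: the Cartan matrix is not written out in this section, so the code assumes the matrix read off from the commutators of the $h^i_m$ in (2.6)-(2.8) with (2.1)-(2.2), namely diagonal entries $2$ for $i \leq M$, $0$ for $i = M+1$, $-2$ for $i > M+1$, and off-diagonal entries $-1$ for $i \leq M$ and $+1$ otherwise. The function names are hypothetical.

```python
from fractions import Fraction


def cartan_matrix(M, N):
    """Assumed Cartan matrix of sl(M+1|N+1), of size M+N+1, read off
    from the boson realization (an assumption, see the lead-in)."""
    n = M + N + 1
    A = [[0] * n for _ in range(n)]
    for i in range(1, n + 1):
        if i <= M:
            diag, off = 2, -1
        elif i == M + 1:
            diag, off = 0, 1
        else:
            diag, off = -2, 1
        A[i - 1][i - 1] = diag
        if i < n:
            A[i - 1][i] = off
            A[i][i - 1] = off
    return A


def alpha_coeff(i, j, M):
    """alpha_ij of (3.6)."""
    m = min(i, j)
    return m if m <= M + 1 else 2 * (M + 1) - m


def beta_coeff(i, j, M, N):
    """beta_ij of (3.7)."""
    m = max(i, j)
    return M - N - m if m <= M + 1 else -M - N - 2 + m


def inverse_from_formula(M, N):
    """Matrix with entries alpha_ij * beta_ij / (M - N); requires M != N."""
    n = M + N + 1
    return [[Fraction(alpha_coeff(i, j, M) * beta_coeff(i, j, M, N), M - N)
             for j in range(1, n + 1)] for i in range(1, n + 1)]


def is_inverse(A, B):
    """Check that the matrix product A B is the identity."""
    n = len(A)
    return all(sum(A[i][k] * B[k][j] for k in range(n)) == (1 if i == j else 0)
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    for M, N in [(1, 0), (0, 1), (2, 0), (1, 2)]:
        assert is_inverse(cartan_matrix(M, N), inverse_from_formula(M, N))
    print("alpha_ij * beta_ij / (M - N) inverts the assumed Cartan matrix")
```

For $M = 1$, $N = 0$ this gives the matrix with rows $(0, -1)$ and $(-1, -2)$, the inverse of the $sl(2|1)$ matrix with rows $(2, -1)$ and $(-1, 0)$.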

Note that using these notations, we have the inverse of the Cartan matrix as $(a_{ij})^{-1} = \alpha_{ij}\beta_{ij}/(M-N)$. We obtain the relations

$$[h^{*i}_m, h^j_n] = \delta_{i,j}\,\delta_{m+n,0}\,\frac{[m]^2}{m}, \qquad [h^{*i}_m, h^{*j}_n] = \delta_{m+n,0}\,\frac{[a^{-1}_{ij} m][m]}{m},$$

and $[h^{*i}_0, Q_{h^j}] = \delta_{i,j}$, $[h^{*i}_0, Q^*_{h^j}] = a^{-1}_{ij}$.

Define the operators $\phi_l(z)$, $\phi^*_l(z)$, $\psi_l(z)$ and $\psi^*_l(z)$ ($l = 1, \dots, M+N+2$) iteratively by

$$\phi_{M+N+2}(z) = {:}e^{-h^{*\,M+N+1}(q^{M-N+1}z;\,-1/2) + c^{N+1}(q^{M-N+1}z;\,0)}{:}\; \prod_{k=1}^{M+1} e^{i\pi \frac{1-k}{M-N} a^k_0}, \tag{3.8}$$
$$\nu_l\, \phi_l(z) = [\phi_{l+1}(z), f_l]_{q^{\nu_{l+1}}}, \tag{3.9}$$
$$\phi^*_1(z) = {:}e^{h^{*\,1}(qz;\,-1/2)}{:}\; \prod_{k=1}^{M+1} e^{i\pi \frac{k-1}{M-N} a^k_0}, \tag{3.10}$$
$$-\nu_l\, q^{\nu_l}\, \phi^*_{l+1}(z) = [\phi^*_l(z), f_l]_{q^{\nu_l}}, \tag{3.11}$$
$$\psi_1(z) = {:}e^{-h^{*\,1}(qz;\,1/2)}{:}\; \prod_{k=1}^{M+1} e^{i\pi \frac{1-k}{M-N} a^k_0}, \tag{3.12}$$
$$\psi_{l+1}(z) = [\psi_l(z), e_l]_{q^{\nu_l}}, \tag{3.13}$$
$$\psi^*_{M+N+2}(z) = {:}e^{h^{*\,M+N+1}(q^{-M+N+1}z;\,1/2)}\, \bigl[{}_1\partial_z e^{-c^{N+1}(q^{-M+N+1}z;\,0)}\bigr]{:}\; \prod_{k=1}^{M+1} e^{i\pi \frac{k-1}{M-N} a^k_0}, \tag{3.14}$$
$$-\nu_l\,\nu_{l+1}\, q^{-\nu_l}\, \psi^*_l(z) = [\psi^*_{l+1}(z), e_l]_{q^{\nu_{l+1}}}, \tag{3.15}$$

where $h^{*\,i}(z; \kappa)$ is defined in the same manner as $h^i(z; \kappa)$. The gradations are given by $|\phi_l(z)| = |\phi^*_l(z)| = |\psi_l(z)| = |\psi^*_l(z)| = \frac{\nu_l + 1}{2}$. Define the operators $\phi(z)$, $\phi^*(z)$, $\psi(z)$ and $\psi^*(z)$ by

$$\phi(z) = \sum_{l=1}^{M+N+2} \phi_l(z) \otimes v_l, \quad \phi^*(z) = \sum_{l=1}^{M+N+2} \phi^*_l(z) \otimes v_l^*, \quad \psi(z) = \sum_{l=1}^{M+N+2} v_l \otimes \psi_l(z), \quad \psi^*(z) = \sum_{l=1}^{M+N+2} v_l^* \otimes \psi^*_l(z),$$

respectively. Then we have the following result.

Proposition 3.1. The operators $\phi(z)$, $\phi^*(z)$, $\psi(z)$ and $\psi^*(z)$ satisfy the same commutation relations as $\Phi^{\mu V}_{\lambda}(z)$, $\Phi^{\mu V^*}_{\lambda}(z)$, $\Psi^{V\mu}_{\lambda}(z)$ and $\Psi^{V^*\mu}_{\lambda}(z)$, respectively.

To prove the proposition, the equations $[[\psi_1(z), e_1]_q, e_1]_{q^{-1}} = 0$, $[\psi_1(z), e_i] = 0$ ($i \neq 1$), $(qz - q^{-1}x)\,\psi_1(z) X^{+,1}(x) = (z - x)\, X^{+,1}(x)\,\psi_1(z)$, and similar formulas for $\psi^*_{M+N+2}$, $\phi_{M+N+2}$ and $\phi^*_1$ are helpful.

Remark 1. These operators can almost be determined by the method used for the level-one bosonization of the vertex operators of $U_q(\widehat{sl}(N))$ [JMMN, Ko]. Namely, we obtained those by studying the commutation relations between the vertex operators and some of

the Drinfeld basis. Relevant explicit coproduct formulas for the Drinfeld basis can be obtained in the same way as in Chari and Pressley [CP]. The contribution of the bosonic fields $c$ to the vertex operators cannot be determined by studying the commutation relations with $h^i_m$, because the latter do not contain the $c$'s. However, the following two pieces of information enable us to find the unique solutions as above: i) the $q \to 1$ limit of the vertex operators, ii) the commutation relations with $X^{+,i}(z)$ (or $X^{-,i}(z)$) for type I (or type II).

3.2. $U_q(\widehat{sl}(2|1))$ case. We study the action of the bosonized vertex operators of $U_q(\widehat{sl}(2|1))$ on the Fock space defined in Subsect. 2.2. Using the bosonic representations of the vertex operators, we have the homomorphisms of $U_q(\widehat{sl}(2|1))$-modules:

$$\phi(z) : \begin{cases} F_{(\alpha;\beta)} \to F_{(\alpha-1;\beta+1)} \otimes V_z, \\ F_{((1,0);\beta)} \to F_{(0;\beta+1)} \otimes V_z, \\ F_{((0,1);\beta)} \to F_{(1;\beta+1)} \otimes V_z, \\ F_{((0,1);\beta)} \to F_{((1,0);\beta+1)} \otimes V_z, \\ F_{(3;\beta)} \to F_{((0,1);\beta+1)} \otimes V_z, \\ F_{(2;\beta)} \to F_{((1,0);\beta+1)} \otimes V_z, \end{cases} \qquad \psi(z) : \begin{cases} F_{(\alpha;\beta)} \to V_z \otimes F_{(\alpha-1;\beta+1)}, \\ F_{((1,0);\beta)} \to V_z \otimes F_{(0;\beta+1)}, \\ F_{((0,1);\beta)} \to V_z \otimes F_{(1;\beta+1)}, \\ F_{((0,1);\beta)} \to V_z \otimes F_{((1,0);\beta+1)}, \\ F_{(3;\beta)} \to V_z \otimes F_{((0,1);\beta+1)}, \\ F_{(2;\beta)} \to V_z \otimes F_{((1,0);\beta+1)}, \end{cases} \tag{3.16}$$

$$\phi^*(z) : \begin{cases} F_{(\alpha;\beta)} \to F_{(\alpha+1;\beta-1)} \otimes V_z^{*a}, \\ F_{((1,0);\beta)} \to F_{(2;\beta-1)} \otimes V_z^{*a}, \\ F_{((0,1);\beta)} \to F_{(3;\beta-1)} \otimes V_z^{*a}, \\ F_{((1,0);\beta)} \to F_{((0,1);\beta-1)} \otimes V_z^{*a}, \\ F_{(1;\beta)} \to F_{((0,1);\beta-1)} \otimes V_z^{*a}, \\ F_{(0;\beta)} \to F_{((1,0);\beta-1)} \otimes V_z^{*a}, \end{cases} \qquad \psi^*(z) : \begin{cases} F_{(\alpha;\beta)} \to V_z^{*a} \otimes F_{(\alpha+1;\beta-1)}, \\ F_{((1,0);\beta)} \to V_z^{*a} \otimes F_{(2;\beta-1)}, \\ F_{((0,1);\beta)} \to V_z^{*a} \otimes F_{(3;\beta-1)}, \\ F_{((1,0);\beta)} \to V_z^{*a} \otimes F_{((0,1);\beta-1)}, \\ F_{(1;\beta)} \to V_z^{*a} \otimes F_{((0,1);\beta-1)}, \\ F_{(0;\beta)} \to V_z^{*a} \otimes F_{((1,0);\beta-1)}. \end{cases} \tag{3.17}$$

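Next, let us consider the vertex operators which intertwine highest weight $U_q(\widehat{sl}(2|1))$-modules by using the above results. It is easy to see that the vertex operators also commute (or anti-commute) with $\eta_0$. Noting this property, the above homomorphisms and Conjectures 2.1 and 2.2, we can study the conditions of existence for the vertex operators which intertwine the irreducible highest weight modules.

Conjecture 3.1. The following vertex operators associated with the level-one irreducible highest weight modules exist:

$$\Phi^{\lambda_{\alpha-1} V}_{\lambda_\alpha}(z) : V(\lambda_\alpha) \to V(\lambda_{\alpha-1}) \otimes V_z, \qquad \Psi^{V \lambda_{\alpha-1}}_{\lambda_\alpha}(z) : V(\lambda_\alpha) \to V_z \otimes V(\lambda_{\alpha-1}),$$
$$\Phi^{\Lambda_0 V}_{\Lambda_1}(z) : V(\Lambda_1) \to V(\Lambda_0) \otimes V_z, \qquad \Psi^{V \Lambda_0}_{\Lambda_1}(z) : V(\Lambda_1) \to V_z \otimes V(\Lambda_0),$$
$$\Phi^{\Lambda_1 V}_{\Lambda_2}(z) : V(\Lambda_2) \to V(\Lambda_1) \otimes V_z, \qquad \Psi^{V \Lambda_1}_{\Lambda_2}(z) : V(\Lambda_2) \to V_z \otimes V(\Lambda_1), \tag{3.18}$$

$$\Phi^{\lambda_{\alpha+1} V^*}_{\lambda_\alpha}(z) : V(\lambda_\alpha) \to V(\lambda_{\alpha+1}) \otimes V_z^{*a}, \qquad \Psi^{V^* \lambda_{\alpha+1}}_{\lambda_\alpha}(z) : V(\lambda_\alpha) \to V_z^{*a} \otimes V(\lambda_{\alpha+1}),$$
$$\Phi^{\Lambda_1 V^*}_{\Lambda_0}(z) : V(\Lambda_0) \to V(\Lambda_1) \otimes V_z^{*a}, \qquad \Psi^{V^* \Lambda_1}_{\Lambda_0}(z) : V(\Lambda_0) \to V_z^{*a} \otimes V(\Lambda_1),$$
$$\Phi^{\Lambda_2 V^*}_{\Lambda_1}(z) : V(\Lambda_1) \to V(\Lambda_2) \otimes V_z^{*a}, \qquad \Psi^{V^* \Lambda_2}_{\Lambda_1}(z) : V(\Lambda_1) \to V_z^{*a} \otimes V(\Lambda_2), \tag{3.19}$$

where $\lambda_\alpha = (1 - \alpha)\Lambda_0 + \alpha\Lambda_2$ for $\alpha \in \mathbb{R}$.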

Acknowledgement. The authors would like to thank H. Awata, M. Jimbo, A. Kuniba, T. Miwa, J. Suzuki, T. Takagi, A. Tsuchiya and Y. Yamada for stimulating discussions. J. S. is very grateful to M. Wakimoto for valuable discussions and kind hospitality during his visit to Mie University in 1994.

References

[D1] Drinfeld, V.G.: Quantum groups. Proc. Int. Congr. Math. Berkeley, 1986
[D2] Drinfeld, V.G.: A new realization of Yangians and quantum affine algebras. Soviet Math. Doklady 36, 212–216 (1988)
[J] Jimbo, M.: A q-difference analogue of U(g) and the Yang-Baxter equation. Lett. Math. Phys. 10, 798–820 (1985)
[DFJMN] Davies, B., Foda, O., Jimbo, M., Miwa, T. and Nakayashiki, A.: Diagonalization of the XXZ Hamiltonian by vertex operators. Commun. Math. Phys. 151, 89–153 (1993)
[JMMN] Jimbo, M., Miwa, T., Miki, K. and Nakayashiki, A.: Correlation functions of the XXZ model for Δ < −1. Phys. Lett. A 168, 256–263 (1992)
[Ko] Koyama, Y.: Staggered polarization of vertex models with $U_q(\widehat{sl}(n))$-symmetry. Commun. Math. Phys. 164, 277–291 (1994)
[CP] Chari, V. and Pressley, A.: Quantum Affine Algebras. Commun. Math. Phys. 142, 261–283 (1991)
[JM] Jimbo, M. and Miwa, T.: Algebraic Analysis of Solvable Lattice Models. CBMS Regional Conference Series in Mathematics Vol. 85, Providence, RI: AMS 1994
[K1] Kac, V.: Representations of Classical Lie Superalgebras. Lecture Notes in Mathematics Vol. 676, Berlin: Springer-Verlag, 1978
[K2] Kac, V.: A sketch of Lie superalgebra theory. Commun. Math. Phys. 53, 31 (1977)
[K3] Kac, V.: Lie superalgebras. Adv. Math. 26, 8 (1977)
[FSS] Frappat, L., Sciarrino, A. and Sorba, P.: Structure of basic Lie superalgebras and of their affine extensions. Commun. Math. Phys. 121, 457–500 (1989)
[KWn] Kac, V.G., Wang, W.: Vertex operator superalgebras and their representations. Preprint hep-th/9312065 (1993)
[KWk1] Kac, V.G., Wakimoto, M.: Integrable highest weight modules over affine superalgebras and number theory. Preprint hep-th/9407057 (1994)
[KWk2] Kac, V.G., Wakimoto, M.: private communication
[Y] Yamane, H.: On defining relations of the affine Lie superalgebras and their quantized universal enveloping superalgebras. Preprint q-alg/9603015 (1996)
[FK] Frenkel, I.B. and Kac, V.G.: Basic representations of affine Lie algebras and dual resonance models. Inv. Math. 62, 23 (1980)
[BCMN] Bouwknegt, P., Ceresole, A., McCarthy, J.G. and van Nieuwenhuizen, P.: Extended Sugawara construction for the superalgebras SU(M+1|N+1). I. Free-field representation and bosonization of super Kac-Moody currents. Physical Review D 39, 2971–2987 (1989)
[BT] Bowcock, P. and Taormina, A.: Representation theory of the affine Lie superalgebra $\widehat{sl}(2/1; \mathbb{C})$ at fractional level. hep-th/9605220
[FJ] Frenkel, I.B. and Jing, N.H.: Vertex representations of quantum affine algebras. Proc. Nat'l. Acad. Sci. USA 85, 9373–9377 (1988)
[AOS] Awata, H., Odake, S. and Shiraishi, J.: Free Boson Realization of $U_q(\widehat{sl}_N)$. Commun. Math. Phys. 162, 61–83 (1994)

Communicated by T. Miwa

Commun. Math. Phys. 188, 379–405 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Semiinfinite Cohomology of Quantum Groups

S. M. Arkhipov*

Independent University of Moscow, Pervomajskaya st. 16–18, Moscow 105037, Russia. E-mail: [email protected]

Received: 14 October 1996 / Accepted: 25 February 1997

* Partially supported by the grants INTAS-94-4720 and CRDF RM1-265.

Abstract: We define a new cohomology theory of associative algebras called semiinfinite cohomology in the derived categories' setting. We investigate the case of a small quantum group u, calculate semiinfinite cohomology spaces of the trivial u-module and express them in terms of local cohomology of the nilpotent cone for the corresponding semisimple Lie algebra. We discuss the connection between the semiinfinite homology of u and the conformal blocks' spaces.

1. Introduction

Semiinfinite cohomology of Lie algebras appeared in mathematics more than 10 years ago (see [F]), and yet it belongs to the area of homological algebra existing partly in the form of folklore. A remarkable breakthrough was achieved by A. Voronov (see [V]), who managed to define the semiinfinite cohomology in the derived category setting. Yet the general definition of semiinfinite cohomology of associative algebras has been unknown.

1.1. One of the aims of this paper is to give a rigorous construction for the functor of semiinfinite cohomology for arbitrary (graded) associative algebras defined in the corresponding derived categories of graded modules. The basic setup here includes a graded associative algebra A with two graded subalgebras B and N such that A = B ⊗ N as a graded vector space. These conditions are satisfied in particular in the case of the universal enveloping algebra of a graded Lie algebra, but the general case is much wider. We will show that the semiinfinite cohomology of the universal enveloping algebra coincides with the corresponding Lie algebra semiinfinite cohomology (see [V, F]). Our construction of the standard complex for the calculation of the semiinfinite cohomology suggests that semiinfinite cohomology could be realized by some combination of the standard Tor and Ext functors. Our considerations do not give such a realization, yet it exists in suitably chosen triangulated categories. This will be explained elsewhere.

1.2. Let us describe the structure of the paper. In the second section we recall several basic facts about quantum groups at roots of unity. In the third section we give the definition of the semiinfinite cohomology of associative algebras and prove some basic results about it. In the fourth and fifth sections we consider an example of the algebra A equal to the finite dimensional Hopf algebra u (finite quantum group) introduced by G. Lusztig in [L1]. In this case the semiinfinite homology $\mathrm{Tor}^{\mathcal{C}}_{\frac{\infty}{2}+\bullet}(k, Y)$ appeared in [FiS], where the authors calculated the cohomology of the space of configurations of points on the projective line $\mathbb{P}^1$ with coefficients in the sheaf $\mathcal{Y}$ equal to the localization of the module Y to the point $0 \in \mathbb{P}^1$. It is known that the semisimple Lie algebra g acts on the cohomology of the trivial module over the finite quantum group with the same Cartan matrix (see [GK]). In the fourth section we show that this fact holds for semiinfinite cohomology of the trivial module and calculate the character of this g-module. Unfortunately, even in the simplest case the representation itself remains unknown. B. Feigin has proposed a conjecture describing the g-module of semiinfinite cohomology of the trivial module over the finite quantum group in terms of distributions on the nilpotent cone of g. In Appendix A we prove some facts confirming the conjecture on the level of characters. The main result of the fifth section is Theorem 5, stating that conformal blocks are naturally embedded into the semiinfinite Tor spaces (see the exact statement in Section 5). This theorem, along with the results of the papers [Fi], [FiS], sheds light on the conjecture of Feigin, Schechtman and Varchenko about the integral representation of conformal blocks (see [FSV]).
Namely, the above results imply the following statement: the local system of conformal blocks on the space of configurations of points on P1 is a direct summand in the direct image of some perverse sheaf on a larger configuration space. The perverse sheaf itself is the Goresky-MacPherson extension of a one-dimensional local system. The example 5 shows that the local system of conformal blocks is in general a proper direct summand in the direct image of the above Goresky-MacPherson sheaf. In Appendix B we present several results due to Voronov on semiinfinite homological algebra (see [V]) and make an attempt to realize semiinfinite cohomology of associative algebras as a two-sided derived functor (in the spirit of [V], 3.9). This paper grew out of attempts to understand the natural general setting for semiinfinite cohomology. I was introduced to the subject by B.Feigin back in 1993. He also formulated the conjectural answer for the semiinfinite cohomology of finite quantum groups. Thus the present paper owes its very existence to B.Feigin. I am also greatly indebted to M.Finkelberg and L.Positselsky for many helpful discussions. I would like to thank D.Timashev for bringing the paper [H] to my attention.

Notation

Throughout the paper we use the following notation.

$(a_{ij})_{i,j=1}^{r}$ is a Cartan matrix of finite type
$d_1, \dots, d_r \in \{1, 2, 3\}$ are such that $(d_i a_{ij})$ is symmetric
$R$ is the root system corresponding to $(a_{ij})$
$R^{\pm}$ is the set of positive (resp. negative) roots
$\rho = \frac{1}{2}\sum_{\alpha \in R^+} \alpha$
$\Sigma = \{\alpha_1, \dots, \alpha_r\}$ is the set of simple roots
$\mathrm{ht}\,\beta = \sum_{i=1}^{r} b_i$, where $\beta = \sum_{i=1}^{r} b_i\alpha_i \in R^+$, is the height function on the set of positive roots
$X$ is the weight lattice of $R$
$Y$ is the root lattice of $R$, i.e. $Y = \mathbb{Z}\alpha_1 \oplus \dots \oplus \mathbb{Z}\alpha_r \hookrightarrow X$
$Y^{\pm}$ is the subsemigroup in $Y$ generated by the set $\Sigma$ (resp. by $-\Sigma$)
$(\cdot|\cdot)$ is a scalar product defined on $\Sigma \subset X$ by the formula $(\alpha_i|\alpha_j) = d_i a_{ij}$ and extended to $X$ by bilinearity
$\alpha_1^{\vee}, \dots, \alpha_r^{\vee}$ are the simple coroots
$W$ is the Weyl group corresponding to $R$
$\mathfrak{g}$ is the semisimple Lie algebra with the Cartan matrix $(a_{ij})$
$G$ is the simply connected Lie group with Lie algebra $\mathfrak{g}$
$\mathbb{Q}(v)$ is the field of rational functions in the indeterminate $v$
$[m]_d! = \prod_{j=1}^{m} \dfrac{v^{dj} - v^{-dj}}{v^{d} - v^{-d}} \in \mathbb{Q}(v)$, where $m, d \in \mathbb{N}$
$\begin{bmatrix} m \\ t \end{bmatrix}_d = \prod_{j=1}^{t} \dfrac{v^{d(m-j+1)} - v^{-d(m-j+1)}}{v^{dj} - v^{-dj}} \in \mathbb{Q}(v)$, where $m \in \mathbb{Z}$, $t, d \in \mathbb{N}$
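The quantum factorial and quantum binomial coefficient in the notation list can be evaluated numerically at a chosen value of $v$. The sketch below is only an illustration with hypothetical function names: it evaluates the two products at a real or complex number and checks that the quantum integer $[\ell]_d$ vanishes when $v$ is specialized to a primitive $\ell$th root of unity with $\ell$ odd, which is the phenomenon behind the special behaviour of $E^{\ell}_\beta$, $F^{\ell}_\beta$ at roots of unity. Floating-point arithmetic is used, so equalities hold only up to rounding.

```python
import cmath


def q_int(m, d, v):
    """Quantum integer (v^{dm} - v^{-dm}) / (v^d - v^{-d})."""
    return (v ** (d * m) - v ** (-d * m)) / (v ** d - v ** (-d))


def q_factorial(m, d, v):
    """[m]_d! = prod_{j=1}^m (v^{dj} - v^{-dj}) / (v^d - v^{-d})."""
    result = 1
    for j in range(1, m + 1):
        result *= q_int(j, d, v)
    return result


def q_binomial(m, t, d, v):
    """[m over t]_d = prod_{j=1}^t (v^{d(m-j+1)} - v^{-d(m-j+1)}) / (v^{dj} - v^{-dj})."""
    result = 1
    for j in range(1, t + 1):
        numerator = v ** (d * (m - j + 1)) - v ** (-d * (m - j + 1))
        denominator = v ** (d * j) - v ** (-d * j)
        result *= numerator / denominator
    return result


if __name__ == "__main__":
    v = 2.0
    # For 0 <= t <= m the binomial is a ratio of factorials.
    ratio = q_factorial(4, 1, v) / (q_factorial(2, 1, v) * q_factorial(2, 1, v))
    assert abs(q_binomial(4, 2, 1, v) - ratio) < 1e-9

    # At a primitive 5th root of unity the quantum integer [5]_1 vanishes.
    zeta = cmath.exp(2j * cmath.pi / 5)
    assert abs(q_int(5, 1, zeta)) < 1e-12
    print(q_factorial(3, 1, v))
```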

2. Quantum Groups at Roots of Unity

In this section we collect several well-known facts about finite quantum groups that we will need later.

2.1. For every symmetrizable Cartan matrix $(a_{ij})_{i,j=1}^{r}$ Drinfeld and Jimbo constructed a Hopf algebra $U_1$ over the field $\mathbb{Q}(v)$ of rational functions with the generators $E_i, F_i, K_i, K_i^{-1}$, $i = 1, \dots, r$, and the following relations:

$$K_i K_j = K_j K_i, \qquad K_i K_i^{-1} = K_i^{-1} K_i = 1,$$
$$K_i E_j = v^{d_i a_{ij}} E_j K_i, \qquad K_i F_j = v^{-d_i a_{ij}} F_j K_i,$$
$$E_i F_j - F_j E_i = \delta_{ij}\, \frac{K_i - K_i^{-1}}{v^{d_i} - v^{-d_i}},$$
$$\sum_{r+s=1-a_{ij}} (-1)^s \begin{bmatrix} 1 - a_{ij} \\ s \end{bmatrix}_{d_i} E_i^r E_j E_i^s = 0 \quad \text{if } i \neq j$$

(see also [L1]). This algebra is called the quantum universal enveloping algebra of the corresponding Kac-Moody Lie algebra, or the quantum group.

2.2. Given an integer $\ell > 1$ prime to the nonzero elements of the fixed Cartan matrix, we choose a primitive $\ell$th root of unity ζ and set $k = \mathbb{Q}(\zeta)$. De Concini and Kac introduced the k-algebra $U_2$ with generators $E_i, F_i, K_i, K_i^{-1}$ and relations similar to the ones for $U_1$ but with $v$ replaced by ζ (see [DCK], 1.5). Denote the subalgebra in $U_2$ generated by all $E_i$ (resp. by all $F_i$, resp. by all $K_i$ and $K_i^{-1}$) by $U_2^+$ (resp. by $U_2^-$, resp. by $U_2^0$). In particular, $U_2^0$ coincides with the algebra of Laurent polynomials in the commuting variables $K_i$. We define the following X-grading on $U_2$: $\deg E_i = \alpha_i$, $\deg F_i = -\alpha_i$, $\deg K_i = 0$.

As in classical universal enveloping algebras, for every $\beta \in R^+$ one can define the root elements $E_\beta \in (U_2)_\beta$ and $F_\beta \in (U_2)_{-\beta}$; in particular $E_{\alpha_i} = E_i$, $F_{\alpha_i} = F_i$ (see [L1]).

2.2.1. Lemma. (see [DCK], 1.7). The elements

$$\prod_{\beta \in R^+} E_\beta^{n(\beta)} \prod_{i=1}^{r} K_i^{s(i)} \prod_{\beta \in R^+} F_\beta^{m(\beta)}$$

are linearly independent and generate $U_2$ as a vector space. Here $n(\beta)$ and $m(\beta)$ run through nonnegative integers and $s(i)$ run through arbitrary integers.

2.2.2. Corollary. The multiplication in $U_2$ defines an isomorphism of the vector spaces $U_2^+ \otimes U_2^0 \otimes U_2^- \to U_2$.

All $E_\beta^{\ell}$ and $F_\beta^{\ell}$, $\beta \in R^+$, and $K_i^{\ell}$, $i = 1, \dots, r$, belong to the center of the algebra $U_2$ (see [DCK], 3.1). Denote the ideals generated by the elements $\{E_\beta^{\ell}, F_\beta^{\ell} \mid \beta \in R^+\}$ (resp. $\{E_\beta^{\ell} \mid \beta \in R^+\}$, resp. $\{F_\beta^{\ell} \mid \beta \in R^+\}$) in the algebras $U_2$, $U_2^+$ and $U_2^-$ respectively by $I$, $I^+$ and $I^-$. Obviously $I^{\pm} = I \cap U_2^{\pm}$. Set $U = U_2/I$. Let $U^{\pm}$ and $U^0$ be the images of the corresponding subalgebras under the projection. Note that $U^+$ (resp. $U^-$) is $Y^+$-graded (resp. $Y^-$-graded). We obtain a basis in $U$ that consists of monomials similar to those from the previous lemma but with $0 \leq m(\beta), n(\beta) < \ell$ for every $\beta \in R^+$ and arbitrary $s(i) \in \mathbb{Z}$.

2.3. There exists an alternative version of the quantum group defined by Lusztig (see [L1], 8.1). Namely, Lusztig considers the $\mathbb{Q}[v, v^{-1}]$-subalgebra $U_{\mathbb{Z}} \subset U_1$ generated by the elements $E_i^{(n)} := E_i^n/[n]_{d_i}!$, $F_i^{(n)} := F_i^n/[n]_{d_i}!$, $K_i^{\pm 1}$, $i = 1, \dots, r$, $n \geq 0$. It is analogous to the classical integral form of the universal enveloping algebras due to Kostant. By definition set $U_3 = k \otimes_{\mathbb{Q}[v, v^{-1}]} U_{\mathbb{Z}}$. Then $U_3$ contains elements $E_i, F_i, K_i$ and $K_i^{-1}$, $i = 1, \dots, r$, that satisfy the basic relations for the generators of $U_2$. Thus there exists a homomorphism $f : U_2 \to U_3$ mapping the generators of $U_2$ to the corresponding elements of $U_3$. Its image is a finite dimensional subalgebra $u \subset U_3$ (see [L1], 8.2). The kernel of $f$ contains $I$, and we obtain the surjective map $U \to u$. Both algebras are X-graded, and the map preserves the grading.

2.3.1. Lemma. (see [AJS], 1.3). The following statements hold for $U$ and $u$:
(i) $U = U^+ \otimes U^0 \otimes U^-$ as a vector space;
(ii) $u = u^+ \otimes u^0 \otimes u^-$ as a vector space;
(iii) both maps $U^{\pm} \to u^{\pm}$ are isomorphisms of algebras;

(iv) $u^0$ is equal to the quotient algebra of $U^0$ by the ideal generated by all $K_i^{2\ell} - 1$.

Thus we obtain PBW-type bases in $u$. We call both $U$ and $u$ the finite quantum groups. The algebra $u^- \otimes u^0$ (resp. $u^0 \otimes u^+$) is called the negative (resp. the positive) Borel subalgebra in $u$ and is denoted by $b^-$ (resp. by $b^+$).

2.4. Using PBW-type bases, Kac, De Concini and Procesi defined some remarkable filtrations on $U_2$, $U$ and $u$ that generalize the usual PBW-filtrations on universal enveloping algebras (see [DCKP]). Consider the lexicographically ordered set $S = \mathbb{Z}_+^{2N+1}$, where $N$ is the number of positive roots. A filtration of a vector space by subspaces numbered by the set $S$ is called an S-filtration.

2.4.1. We fix a convex order on the set of positive roots $R^+$. Roughly speaking, the convexity property means that the q-commutator of two root vectors $E_\alpha$ and $E_\beta$ in $U_1$ consists of monomials formed only by root vectors for the roots lying between α and β in the order (see e.g. [DCKP] for the exact definition). We denote the monomial

$$K_1^{b_1} \cdots K_r^{b_r}\, F_{\beta_1}^{t_1} \cdots F_{\beta_N}^{t_N}\, E_{\beta_1}^{s_1} \cdots E_{\beta_N}^{s_N}$$

by $M_{(s,t,\alpha)}$, where $\alpha = \sum_{j=1}^{r} b_j\alpha_j$. We define the total height by

$$d_0(M_{(s,t,\alpha)}) = \sum_{i=1}^{N} (s_i + t_i)\, \mathrm{ht}\,\beta_i,$$

where $\mathrm{ht}\,\beta$ is the height of the root β, and the total degree by

$$d(M_{(s,t,\alpha)}) = (d_0(M_{(s,t,\alpha)}), s_N, \dots, s_1, t_1, \dots, t_N) \in \mathbb{Z}_+^{2N+1}.$$
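To make the total degree concrete, here is a small sketch computing it for PBW monomials and comparing monomials in the lexicographic order on $\mathbb{Z}_+^{2N+1}$. The example data are an assumption chosen for illustration: type $A_2$ with positive roots listed in the convex order $\alpha_1, \alpha_1 + \alpha_2, \alpha_2$, so the heights are $(1, 2, 1)$. The function name is hypothetical; Python tuples compare lexicographically, which matches the order on $S$.

```python
def total_degree(s, t, heights):
    """Total degree (d_0, s_N, ..., s_1, t_1, ..., t_N) of the monomial
    with E-exponents s, F-exponents t and root heights ht(beta_i)."""
    d0 = sum((si + ti) * h for si, ti, h in zip(s, t, heights))
    return (d0, *reversed(s), *t)


if __name__ == "__main__":
    heights = (1, 2, 1)  # assumed example: A_2, roots a1, a1+a2, a2

    # Monomial F_{b2} E_{b1} E_{b3}^2: s = (1, 0, 2), t = (0, 1, 0).
    print(total_degree((1, 0, 2), (0, 1, 0), heights))  # (5, 2, 0, 1, 0, 1, 0)

    # E_{b1} and E_{b3} have the same total height 1; E_{b3} is larger
    # because s_N is compared first.
    assert total_degree((1, 0, 0), (0, 0, 0), heights) < total_degree((0, 0, 1), (0, 0, 0), heights)
```

Note that the exponents $b_j$ of the $K_j$ do not enter the total degree.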

We introduce the S-filtration $F$ on $U_2$ by the total degree.

2.4.2. Lemma. (see e.g. [DCKP], 4.2). The associated polygraded algebra $\mathrm{gr}\,U_2$ for the S-filtration on $U_2$ is generated by the elements $E_\alpha$, $F_\alpha$ ($\alpha \in R^+$) and $K_i^{\pm 1}$ ($i = 1, \dots, r$) satisfying the following relations:

$$K_i E_\beta = \zeta^{(\alpha_i|\beta)} E_\beta K_i, \qquad K_i F_\beta = \zeta^{-(\alpha_i|\beta)} F_\beta K_i, \qquad \beta \in R^+;$$
$$K_i K_i^{-1} = K_i^{-1} K_i = 1, \qquad K_i K_j = K_j K_i;$$
$$E_\alpha F_\beta = F_\beta E_\alpha, \qquad \alpha, \beta \in R^+;$$
$$E_\alpha E_\beta = \zeta^{-(\alpha|\beta)} E_\beta E_\alpha, \qquad F_\alpha F_\beta = \zeta^{(\alpha|\beta)} F_\beta F_\alpha$$

if $\alpha, \beta \in R^+$ and $\alpha > \beta$ in the convex order on $R^+$.

The filtration $F$ defines $\mathbb{Z}_+^{2N+1}$-filtrations on the algebras $U$ and $u$ and $\mathbb{Z}_+^{N+1}$-filtrations on the algebras $u^+$, $u^-$, $b^+$, $b^-$ etc. We denote them by $F$ as well.

2.4.3. Corollary. The graded algebra $\mathrm{gr}^F U$ is generated by the elements $E_\alpha$, $F_\alpha$, $\alpha \in R^+$, and $K_i^{\pm 1}$, $i = 1, \dots, r$, subject to the relations from Lemma 2.4.2 and the following relation: $E_\alpha^{\ell} = F_\alpha^{\ell} = 0$ for every $\alpha \in R^+$.

Recall that a finite dimensional associative algebra $A$ is called Frobenius if the dual module to the right regular module over $A$ is isomorphic to the left regular $A$-module: $A_L \cong \mathrm{Hom}_k(A_R, k)$. The nondegenerate bilinear pairing $A \times A \to k$ induced by this identification is called the trace on $A$.

2.4.4. Lemma. Let $A$ be a finite dimensional filtered algebra such that the top component of the corresponding graded algebra is one dimensional, and $\mathrm{gr}\,A$ is Frobenius with the trace defined as follows: $(x, y) \mapsto (x \cdot y)_{\mathrm{top}} \in (\mathrm{gr}\,A)_{\mathrm{top}} \cong k$. Here $\cdot$ denotes the multiplication in $\mathrm{gr}\,A$. Then $A$ is also Frobenius. The trace on $A$ is defined as follows: $(x, y) \mapsto xy \mapsto (xy)_{\mathrm{top}} \in (\mathrm{gr}\,A)_{\mathrm{top}} \cong k$. The product here is the product in $A$.

2.4.5. Lemma. The algebras $u$, $u^+$ and $u^-$ are Frobenius.

Proof. By 2.4.3 the algebras $\mathrm{gr}\,u^{\pm}$ satisfy the conditions of Lemma 2.4.4. Thus $u^{\pm}$ are Frobenius. The trace on $u$ is defined in [Xi], 2.9.

2.4.6. We will need a filtration on $u$ that provides a weaker degeneration of the algebra $u$ than $F$ does. We introduce the partial degree

$$\tilde{d}(M_{(s,t,\alpha)}) = (d_0(M_{(s,t,\alpha)}), s_N, \dots, s_1).$$

Let $F'$ be the $\mathbb{Z}_+^{N+1}$-filtration of $u$ by the partial degree. Note that $F'$ coincides with $F$ on $u^+$, and it coincides on $u^-$ with the natural $\mathbb{Z}$-grading obtained from the $Y^-$-grading. Thus $\mathrm{gr}^{F'}(u^-) = u^-$, $\mathrm{gr}^{F'}(u^+) = \mathrm{gr}^F(u^+)$.

Recall that an augmented subalgebra $B \subset A$ is called normal if the left and the right ideals in $A$ generated by the augmentation ideal $\bar{B}$ in $B$ coincide. Then $A/\!/B$ denotes the quotient algebra of $A$ by this two-sided ideal. The algebras $u$, $u^{\pm}$, $\mathrm{gr}^F(u)$ and $\mathrm{gr}^F(u^{\pm})$ are naturally augmented: the augmentation is provided by the map $E_\alpha \mapsto 0$, $F_\alpha \mapsto 0$, $K_i \mapsto 1$, $\alpha \in R^+$, $i = 1, \dots, r$.

2.4.7. Lemma. $F'$ defines a filtration on $u$ compatible with the multiplication in $u$. We have a triangular decomposition $\mathrm{gr}^{F'}(u) = u^- \otimes u^0 \otimes \mathrm{gr}^F(u^+)$ as a vector space; $u^-$, $u^0$ and $\mathrm{gr}^F(u^+)$ are subalgebras in $\mathrm{gr}^{F'}(u)$. Finally, $\mathrm{gr}^F(u^+)$ is normal in $\mathrm{gr}^{F'}(u)$ and elements of $u^-$ commute with elements of $\mathrm{gr}^F(u^+)$.

Proof. The first statement is already checked while defining the filtration $F$ on $u$. Since $F$ and $F'$ coincide on $u^+$, $\mathrm{gr}^F(u^+)$ is a subalgebra in $\mathrm{gr}^{F'}(u)$. The decomposition of $\mathrm{gr}^{F'}(u)$ into a tensor product follows from the similar statement for $u$ (see Lemma 2.3.1). To prove that elements of $u^-$ commute with elements of $\mathrm{gr}^F(u^+)$ in $\mathrm{gr}^{F'}(u)$, note that the statement is true already in the associated graded algebra of $u$ corresponding to the filtration by $d_0$.

2.4.8. Lemma. The algebra $\mathrm{gr}^{F'}(u)$ is Frobenius.

Proof. As a vector space grF (u) = u− ⊗ u0 ⊗ gr F (u+ ). Set the linear form, inducing 0 the trace on gr F (u), equal to the tensor product of the linear forms inducing the traces on the components. 2.5. Now we are going to describe the categories of u-modules we will L work with. Let C be the category of X-graded left u-modules M = λ ∈ X β∈X,v(β)=n λ M , dim λ M < ∨ ∞, such that Ki act on λ M by multiplication by ζ (αi |λ) , where αi∨ denote the simple coroots, Ei : λ M −→ λ+αi M, Fi : λ M −→ λ−αi M. Morphisms in C are the morphisms of X-graded u-modules M that preserve gradings. Let (r) C be the category of right X-graded u-modules N = λ N , dim λ N < ∞, such λ∈X

∨

that Ki act on λ N by multiplication by ζ −(αi |λ) , Ei :

λN

−→

λ+αi N,

Fi :

λN

−→

λ−αi N, 0

with morphisms that preserve X-gradings. One can define the categories C(b), C(grF (u)) 0 and C (r) (gr F (u)) in a similar way. We define the twisting functors by elements of the weight lattice λ ∈ `X on the category C: M 7→ M hλi, where µ M hλi := λ+µ M, L HomC (L, M h`βi) by Homu (L, M ). with the same action of u. We denote the space β∈X

We will also need the categories of finite dimensional left $u^\pm$-modules denoted by $u^\pm$-Mod. We call the $u$-module $M^+(\lambda) := u \otimes_{b^-} k(\lambda)$ (resp. the $u$-module $M^-(\lambda) := u \otimes_{b^+} k(\lambda)$) the positive (resp. negative) left Verma module with the highest (resp. lowest) weight $\lambda$. Here $k(\lambda)$ denotes the one-dimensional $u^0$-module placed in the $X$-grading $\lambda$; the trivial action of $u^\pm$ equips it with the structure of a $b^\pm$-module. Contragredient Verma modules $M^{-*}(\lambda)$ are defined as follows: $M^{-*}(\lambda) = \operatorname{Hom}_{b^-}(u, k(\lambda))$ with the natural left action of $u$. We will need the following statement.

2.5.1. Lemma. (see [AJS], 4.10, 4.12)
(i) $M^+(\lambda) = M^{-*}(\lambda + (\ell - 1)2\rho)$;
(ii) $\operatorname{Hom}_C(M^-(\lambda), M^+(\mu)) = 0$ if $\lambda \neq \mu + (\ell - 1)2\rho$;
(iii) $\operatorname{Hom}_C(M^-(\lambda), M^+(\lambda + (\ell - 1)2\rho)) = k$.

2.5.2. Every Verma module $M^-(\lambda)$ has a unique simple quotient module $L(\lambda)$ with the highest weight $\lambda \in X$, and this way one obtains the full list of simple modules in the category $C$ (see [AJS], 4.1).

3. Semiinfinite Cohomology of Modules over Associative Algebras

Consider a free abelian group $X$ of finite rank $r$ and its subgroup $Y$ generated by a set $\Sigma \subset X$ consisting of $r$ elements, such that the elements of $\Sigma$ form a base of the vector space $X \otimes \mathbb{Q}$. Denote the subsemigroup in $Y$ generated by the set $\Sigma$ (resp. by the set $-\Sigma$) by $Y^+$ (resp. by $Y^-$). In this section we do not suppose that $X$ and $Y$ are the weight and the root lattices corresponding to some root system, but our notation is adapted to that case. Let $v : X \to \mathbb{Q}$ be a linear function defined as follows: for every $\alpha \in \Sigma$ set $v(\alpha) = 1$. We extend $v$ onto $X$ by linearity.

3.1. Suppose we have a $Y$-graded associative algebra $A$ with $Y$-graded subalgebras $B$ and $N$ satisfying the following conditions:

(i) $N$ is graded by $Y^+$;
(ii) $N_0 = k$;
(iii) $\dim N_\beta < \infty$ for any $\beta \in Y^+$;
(iv) $B$ is graded by $Y^-$;
(v) the multiplication in $A$ defines the isomorphisms of $Y$-graded vector spaces $B \otimes N \to A$ and $N \otimes B \to A$.

In particular $N$ is naturally augmented. Note that our conditions are satisfied both for finite quantum groups and for universal enveloping algebras of graded Lie algebras. In the latter case the decomposition into the tensor product of negative and positive parts is given by the natural decomposition of the Lie algebra into the direct sum of its negatively and positively graded parts.

3.2. Consider the category $A$-mod of $X$-graded left $A$-modules with morphisms that preserve $X$-gradings. The corresponding category of right $A$-modules is denoted by mod-$A$. We will need the following subcategories in the category of complexes $\operatorname{Kom}(A\text{-mod})$. For an $X$-graded module $M$ denote by $\operatorname{supp} M$ the set $\{\alpha \in X \mid {}_\alpha M \neq 0\}$. Denote by $X_{\mathbb{Q}}^+(\beta)$ (resp. $X_{\mathbb{Q}}^-(\beta)$) the convex cone in $X \otimes \mathbb{Q}$, generated by $\Sigma$ (resp. $-\Sigma$) with the vertex in $\beta \in X \otimes \mathbb{Q}$. For $s_1, s_2, t_1, t_2 \in \mathbb{Z}$, $s_1, s_2 > 0$, the set $\{(p, q) \in \mathbb{Q}^{\oplus 2} \mid s_1 p + q \geq t_1,\ s_2 p - q \geq t_2\}$ (resp. the set $\{(p, q) \in \mathbb{Q}^{\oplus 2} \mid s_1 p + q \leq t_1,\ s_2 p - q \leq t_2\}$) is denoted by $Q^\uparrow(s_1, s_2, t_1, t_2)$ (resp. by $Q^\downarrow(s_1, s_2, t_1, t_2)$).

Consider the category $C^\uparrow(A)$ (resp. $C^\downarrow(A)$) in $\operatorname{Kom}(A\text{-mod})$, consisting of complexes of $X$-graded $A$-modules $M^\bullet = \bigoplus_{q \in \mathbb{Z}, \lambda \in X} {}_\lambda M^q$ satisfying the following conditions:

(U) there exist $s_1, s_2, t_1, t_2 \in \mathbb{Z}$, $s_1, s_2 > 0$, such that $v(\operatorname{supp} M^\bullet) \subset Q^\uparrow(s_1, s_2, t_1, t_2)$ and for any $(p, q) \in v(\operatorname{supp} M^\bullet)$ the set $v^{-1}(p, q)$ is finite;

resp.

(D) there exist $s_1, s_2, t_1, t_2 \in \mathbb{Z}$, $s_1, s_2 > 0$, such that $v(\operatorname{supp} M^\bullet) \subset Q^\downarrow(s_1, s_2, t_1, t_2)$ and for any $(p, q) \in v(\operatorname{supp} M^\bullet)$ the set $v^{-1}(p, q)$ is finite.

Here the set $v(\operatorname{supp} M^\bullet)$ is considered as a subset of a $(p, q)$-plane: $v(\lambda, q) := (v(\lambda), q)$.

3.3. Next we define the standard complex for the computation of semiinfinite cohomology of $A$-modules. Consider the right $N$-module
$$N^* = \operatorname{Hom}_k(N, k) = \bigoplus_{\beta \in X^+} \operatorname{Hom}_k({}_\beta N, k).$$
The right action of $N$ on $N^*$ is defined as follows: $n \cdot f(m) = f(nm)$, $n, m \in N$, $f \in N^*$. Voronov calls the right $A$-module $S_A = N^* \otimes_N A$ the right semiregular representation (see [V]). Obviously $S_A$ is isomorphic to $N^* \otimes B$ as a right $B$-module. For two $X$-graded $B$-modules $L$ and $M$ denote the space $\bigoplus_{\lambda \in X} \operatorname{Hom}_{B\text{-mod}}(L, M\langle\lambda\rangle)$

by $\operatorname{Hom}_B(L, M)$ similarly to 2.5. Here $\langle\lambda\rangle$ denotes the grading shift.

3.3.1. Lemma. There exists a canonical morphism of the right $A$-modules $S_A \to \operatorname{Hom}_B(A, B)$.

Proof. Define the pairing $\phi : S_A \times A \to B$ as follows: $\phi(f \otimes a_1, a_2) = f_1(a_1 a_2)$, where $f_1$ denotes $f$ applied to the first argument in $A \otimes B$. The required morphism is provided by $\phi$. One checks directly that it is well defined with respect to the $A$-actions.

3.3.2. Below we suppose that the algebra $A$ satisfies the following condition.

(vi) The map constructed in the previous lemma is an isomorphism of the right $A$-modules.

Set $A^\star = \operatorname{End}_A(S_A)$. The functors of induction and coinduction provide the natural inclusions of algebras $N \hookrightarrow A^\star$ and $B \hookrightarrow A^\star$. Clearly up to a certain completion $A^\star = B \otimes N$ as a vector space.

3.3.3. Denote the subspace $\bigoplus_{\beta \in Y, v(\beta) = n} {}_\beta A$ in the algebra $A$ (resp. $\bigoplus_{\beta \in X, v(\beta) = n} {}_\beta M$ in an $X$-graded $A$-module $M$) by $A_n$ (resp. by $M_n$). We introduce a topology on $A$ (resp. on a graded $A$-module $M$) defined by the filtration $F^m A := \bigoplus_{n < m} A_n$ (resp. by the filtration $F^m M := \bigoplus_{n < m} M_n$). In particular the multiplication in $A$ and the action of $A$ on $M$ are given by continuous maps. Denote the space of continuous linear maps between


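two graded $A$-modules $M$ and $M'$ equipped with this topology by $\operatorname{Hom}^{\mathrm{cont}}(M, M')$. Thus we have
$$\operatorname{Hom}^{\mathrm{cont}}(M, M') = \bigoplus_{n \geq 0} \operatorname{Hom}(M_n, M') \oplus \prod_{n < 0} \operatorname{Hom}(M_n, M').$$
For right graded $A$-modules $M$ and $M'$ consider the space of continuous morphisms
$$\operatorname{Hom}_A^{\mathrm{cont}}(M, M') := \{f \in \operatorname{Hom}^{\mathrm{cont}}(M, M') \mid f(am) = af(m) \text{ for } a \in A,\ m \in M\}.$$
In particular we have
$$\operatorname{Hom}_A^{\mathrm{cont}}(S_A, S_A) = \operatorname{Hom}_B^{\mathrm{cont}}(S_A, B) = \operatorname{Hom}^{\mathrm{cont}}(N^*, B) = \bigoplus_{n \leq 0} \operatorname{Hom}((N^*)_n, B) = \bigoplus_{n \geq 0} \operatorname{Hom}((N_n)^*, B) = N \otimes B.$$
It is easily checked that the images of the inclusions $B^{\mathrm{opp}} \subset \operatorname{End}_A(S_A)$ and $N^{\mathrm{opp}} \subset \operatorname{End}_A(S_A)$ belong to the space of continuous endomorphisms. Thus we obtain the following statement.

3.3.4. Lemma. The subspace $A^\sharp = B \otimes N \subset A^\star$ is a $Y$-graded subalgebra.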

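Thus $S_A$ becomes an $A^\sharp$-$A$ bimodule. Note that for a finite dimensional algebra $A$ satisfying the conditions (i)-(vi) the endomorphism algebra of $S_A$ equals exactly $\operatorname{Hom}_k(N^*, B) = B \otimes N$ as a graded vector space. Thus in particular we have $A^\sharp = A^\star$.

3.3.5. Recall the relative bar construction for the algebra $A$ with respect to the subalgebra $B \subset A$. The standard bar resolution $\widetilde{\operatorname{Bar}}^\bullet(A, B, M) \in \operatorname{Kom}(A\text{-mod})$ of an $A$-module $M$ is defined as follows:
$$\widetilde{\operatorname{Bar}}^{-n}(A, B, M) = A \otimes_B \dots \otimes_B A \otimes_B M \quad (n + 1 \text{ times}),$$
$$d(a_0 \otimes \dots \otimes a_n \otimes v) = \sum_{s=0}^{n-1} (-1)^s a_0 \otimes \dots \otimes a_s a_{s+1} \otimes \dots \otimes v + (-1)^n a_0 \otimes \dots \otimes a_{n-1} \otimes a_n v.$$
Here $a_0, \dots, a_n \in A$, $v \in M$.

3.3.6. Lemma. The subspace $\overline{\operatorname{Bar}}^\bullet(A, B, M)$:
$$\overline{\operatorname{Bar}}^{-n}(A, B, M) = \{a_0 \otimes \dots \otimes a_n \otimes v \in \widetilde{\operatorname{Bar}}^{-n}(A, B, M) \mid \exists\, s \in \{1, \dots, n\} : a_s \in B\}$$
is a subcomplex in $\widetilde{\operatorname{Bar}}^\bullet(A, B, M)$.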
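The differential of 3.3.5 can be tried out on a toy case. The following is only a sketch under simplifying assumptions that are not in the text: it takes $B = k$ (the absolute bar construction), takes $A$ to be the group algebra of $\mathbb{Z}/3$ with $M = A$, and represents a chain as a dictionary from pure tensors of basis elements to integer coefficients. The names `bar_differential`, `mul_z3` and `squares_to_zero` are hypothetical. It checks numerically that $d \circ d = 0$, which is what makes the bar construction a complex.

```python
from itertools import product


def bar_differential(chain, mul):
    """Apply the bar differential to a chain {(a_0, ..., a_n, v): coefficient}.

    Every key stands for a pure tensor a_0 (x) ... (x) a_n (x) v of basis
    elements. `mul` multiplies two basis elements; it is also used for the
    action a_n * v, because this sketch assumes M = A.
    """
    result = {}
    for tensor, coeff in chain.items():
        n = len(tensor) - 2  # slots a_0..a_n followed by v
        for s in range(n + 1):  # s = n gives the last term, a_n acting on v
            merged = tensor[:s] + (mul(tensor[s], tensor[s + 1]),) + tensor[s + 2:]
            result[merged] = result.get(merged, 0) + (-1) ** s * coeff
    return {t: c for t, c in result.items() if c != 0}


def mul_z3(a, b):
    """Product of basis elements in the group algebra of Z/3 (assumed toy algebra)."""
    return (a + b) % 3


def squares_to_zero(length):
    """Check d(d(x)) = 0 on every pure tensor with `length` slots."""
    return all(
        bar_differential(bar_differential({t: 1}, mul_z3), mul_z3) == {}
        for t in product(range(3), repeat=length)
    )
```

The cancellation relies only on associativity of `mul`, so any other associative multiplication on a finite basis could be substituted for `mul_z3`.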


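The quotient $\operatorname{Bar}^\bullet(A, B, M) = \widetilde{\operatorname{Bar}}^\bullet(A, B, M) / \overline{\operatorname{Bar}}^\bullet(A, B, M)$ is called the restricted bar resolution of the $A$-module $M$ with respect to the subalgebra $B$. The ideal of augmentation in $N$ is denoted by $\overline{N}$.

3.3.7. Lemma.
(i) $\operatorname{Bar}^{-n}(A, B, M) = N \otimes \overline{N}^{\otimes n} \otimes M$ as a left $N$-module;
(ii) $\operatorname{Bar}^\bullet(A, B, M) = \operatorname{Bar}^\bullet(N, k, M)$ as a complex of $N$-modules. In particular $\operatorname{Bar}^\bullet(A, B, M)$ is an $N$-free resolution of the $A$-module $M$.

For any $M^\bullet \in C^\uparrow(A)$ consider the total complex of its bar resolution $\operatorname{Bar}^\bullet(A, B, M^\bullet)$ and the complex of $A^\sharp$-modules $S_A \otimes_A \operatorname{Bar}^\bullet(A, B, M^\bullet)$.

Remark. Since $S_A$ is $B$-free and $\operatorname{Bar}^\bullet(A, B, M^\bullet)$ is $N$-free we have
$$H^\bullet(S_A \otimes_A \operatorname{Bar}^\bullet(A, B, M^\bullet)) = \operatorname{Tor}^A_\bullet(S_A, M^\bullet).$$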

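For a complex of left $A^\sharp$-modules $L^\bullet \in C^\downarrow(A^\sharp)$ we denote the total complex of its restricted bar resolution with respect to the subalgebra $N \subset A^\sharp$ by $\operatorname{Bar}^\bullet(A^\sharp, N, L^\bullet)$.

3.3.8. Definition. Let $L^\bullet \in C^\downarrow(A^\sharp)$, $M^\bullet \in C^\uparrow(A)$. The standard complex for the computation of the semiinfinite Ext functor $C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet)$ is defined as follows:
$$C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) := \operatorname{Hom}^\bullet_{A^\sharp}(\operatorname{Bar}^\bullet(A^\sharp, N, L^\bullet),\ S_A \otimes_A \operatorname{Bar}^\bullet(A, B, M^\bullet)).$$
By definition we set
$$\operatorname{Ext}_A^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) := H^\bullet(C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet)).$$
Note that unlike the usual Ext and Tor functors semiinfinite cohomology exists both in negative and positive homological degrees even for $L$ and $M$ being complexes-objects (see [GeM]).

As a vector space
$$C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) = \bigoplus_{n,m} \operatorname{Hom}_k(\overline{B}^{\otimes n} \otimes L^\bullet,\ \overline{N}^{\otimes m} \otimes M^\bullet).$$
Here $\overline{B}$ denotes the space $B/k$. Since both arguments of the semiinfinite Ext functor are $X$-graded, both the standard complex $C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet)$ and its cohomology are also $X$-graded:
$${}_\gamma C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) = \bigoplus_{\alpha - \beta = \gamma;\ n,m} \operatorname{Hom}_k({}_\beta(\overline{B}^{\otimes n} \otimes L^\bullet),\ {}_\alpha(\overline{N}^{\otimes m} \otimes M^\bullet)),$$
$${}_\gamma \operatorname{Ext}_A^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) = H^\bullet({}_\gamma C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet)).$$
From now on the zeroth $X$-grading component of $\operatorname{Ext}_A^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet)$ is denoted by $\operatorname{Ext}_{A\text{-mod}}^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet)$.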

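3.4. Consider the filtrations ${}^{(I)}F$ (resp. ${}^{(II)}F$) on $C^{\frac{\infty}{2}+\bullet}(L, M)$, $L \in A^\sharp\text{-mod}$, $M \in A\text{-mod}$, by the number $n$ (resp. $m$). The $E_0$-terms of the corresponding spectral sequences are as follows:
$$\begin{aligned}
{}^{(I)}E_0^{p,q} &= \operatorname{Hom}_{A^\sharp}\bigl(A^\sharp \otimes_N \underbrace{(A^\sharp/N) \otimes_N \dots \otimes_N (A^\sharp/N)}_{p} \otimes_N L,\ S_A \otimes_A \operatorname{Bar}^{q}(A, B, M)\bigr) \\
&= \operatorname{Hom}_N\bigl(\underbrace{(A^\sharp/N) \otimes_N \dots \otimes_N (A^\sharp/N)}_{p} \otimes_N L,\ S_A \otimes_B \underbrace{(A/B) \otimes_B \dots \otimes_B (A/B)}_{-q} \otimes_B M\bigr) \\
&= \operatorname{Hom}_k\bigl(\overline{B}^{\otimes p} \otimes L,\ \underbrace{(A/B) \otimes_B \dots \otimes_B (A/B)}_{-q} \otimes_B M\bigr) \\
&= \operatorname{Hom}_B\bigl(\operatorname{Bar}^{-p}(B, k, L),\ \underbrace{(A/B) \otimes_B \dots \otimes_B (A/B)}_{-q} \otimes_B M\bigr)
\end{aligned}$$
with the differential being that in $\operatorname{Bar}^\bullet(B, k, L)$, and
$${}^{(II)}E_0^{p,q} = \operatorname{Hom}_k\bigl(\underbrace{(A^\sharp/N) \otimes_N \dots \otimes_N (A^\sharp/N)}_{p} \otimes_N L,\ k\bigr) \otimes_N \operatorname{Bar}^{-q}(N, k, M)$$
with the differential being that in $\operatorname{Bar}^\bullet(N, k, M)$. Thus the $E_1$ terms are as follows:
$${}^{(I)}E_1^{p,q} = \operatorname{Ext}^p_B\bigl(L,\ \underbrace{(A/B) \otimes_B \dots \otimes_B (A/B)}_{-q} \otimes_B M\bigr),$$
$${}^{(II)}E_1^{p,q} = \operatorname{Tor}^N_{-q}\bigl(\operatorname{Hom}_k(\underbrace{(A^\sharp/N) \otimes_N \dots \otimes_N (A^\sharp/N)}_{p} \otimes_N L,\ k),\ M\bigr).$$
Both spectral sequences are $X$-graded.

3.4.1. Lemma. Let $L \in A^\sharp\text{-mod}$, $\operatorname{supp} L \subset X_{\mathbb{Q}}^-(\lambda)$, $M \in A\text{-mod}$, $\operatorname{supp} M \subset X_{\mathbb{Q}}^+(\mu)$. Then for a fixed $\beta \in X$ both ${}_\beta({}^{(I)}E^{p,q})$ and ${}_\beta({}^{(II)}E^{p,q})$ converge.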

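Proof. Since $B$ is $Y^-$-graded, there exists $\beta_0 \in X$ such that for every $p \geq 0$ the set $\operatorname{supp}(\overline{B}^{\otimes p} \otimes L)$ belongs to $X_{\mathbb{Q}}^-(\beta_0)$. The $X$-graded space $\bigoplus_{q \geq 0} \overline{N}^{\otimes q} \otimes M$ satisfies the condition (U). Thus in a fixed $X$-grading component $\beta$ both spectral sequences of the complex ${}_\beta C^{\frac{\infty}{2}+\bullet}(L, M)$ are situated in the part of the $(p, q)$-plane which is bounded in $p$ from the left and in $q$ both from the left and from the right.

Recall that an object $M \in A\text{-mod}$ is injective (resp. projective) relative to the subalgebra $N$ if for every complex of $A$-modules $C^\bullet$ such that $C^\bullet$ is homotopic to zero as a complex of $N$-modules we have $H^\bullet(\operatorname{Hom}^\bullet_A(C^\bullet, M)) = 0$ (resp. $H^\bullet(\operatorname{Hom}^\bullet_A(M, C^\bullet)) = 0$).

3.4.2. Lemma. The following facts hold for $L^\bullet \in C^\downarrow(A^\sharp)$, $M^\bullet \in C^\uparrow(A)$:

(i) if $M^\bullet$ is $N$-projective, then we have
$$\operatorname{Ext}_A^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) = H^\bullet(\operatorname{Hom}^\bullet_{A^\sharp}(\operatorname{Bar}^\bullet(A^\sharp, N, L^\bullet),\ S_A \otimes_A M^\bullet));$$
(ii) if $L^\bullet$ is $B$-projective, then we have
$$\operatorname{Ext}_A^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) = H^\bullet(\operatorname{Hom}^\bullet_{A^\sharp}(L^\bullet,\ S_A \otimes_A \operatorname{Bar}^\bullet(A, B, M^\bullet)));$$
(iii) if $M^\bullet$ is both $N$-projective and $A$-injective relative to $N$, then we have
$$\operatorname{Ext}_A^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) = H^\bullet(\operatorname{Hom}^\bullet_{A^\sharp}(L^\bullet,\ S_A \otimes_A M^\bullet)).$$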

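Proof. (i) Consider the canonical mapping $\varphi : \operatorname{Bar}^\bullet(A, B, M^\bullet) \to M^\bullet$. Then $\operatorname{Cone}^\bullet \varphi$ is an exact complex of $N$-projective $A$-modules satisfying (U). In particular it is homotopic to zero as a complex of $N$-modules. Thus we obtain an isomorphism of complexes of $N$-modules $S_A \otimes_A \operatorname{Cone}^\bullet \varphi = N^* \otimes_N \operatorname{Cone}^\bullet \varphi$, where $N^*$ is considered as an $N$-bimodule. It follows that $S_A \otimes_A \operatorname{Cone}^\bullet \varphi$ is also homotopic to zero over $N$. Next, by the Shapiro Lemma $A^\sharp$-modules induced from $N$-modules are relatively projective, so $\operatorname{Bar}^\bullet(A^\sharp, N, L^\bullet)$ consists of relatively projective modules. Let
$$\tilde\varphi : C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) \to \operatorname{Hom}^\bullet_{A^\sharp}(\operatorname{Bar}^\bullet(A^\sharp, N, L^\bullet),\ S_A \otimes_A M^\bullet)$$
be the morphism of complexes corresponding to $\varphi$. Then $\operatorname{Cone}^\bullet \tilde\varphi = \operatorname{Hom}^\bullet_{A^\sharp}(\operatorname{Bar}^\bullet(A^\sharp, N, L^\bullet),\ S_A \otimes_A \operatorname{Cone}^\bullet \varphi)$. We prove that the latter complex is exact. Consider the bigrading on $\operatorname{Cone}^\bullet \tilde\varphi$:
$$\operatorname{Cone}^{m,n} \tilde\varphi = \bigoplus_{p+s=m} \operatorname{Hom}_{A^\sharp}(\operatorname{Bar}^p(A^\sharp, N, L^s),\ \operatorname{Cone}^n \varphi).$$
Our grading conditions provide that the spectral sequence of the bigraded complex converges. On the other hand we have
$$E_1^{m,\bullet} = \bigoplus_{p+s=m} H^\bullet(\operatorname{Hom}^\bullet_{A^\sharp}(\operatorname{Bar}^p(A^\sharp, N, L^s),\ \operatorname{Cone}^\bullet \varphi)) = 0.$$

(ii) The proof is similar to the previous one.

(iii) It is sufficient to prove that the mapping
$$\operatorname{Hom}^\bullet_{A^\sharp}(L^\bullet,\ S_A \otimes_A M^\bullet) \to \operatorname{Hom}^\bullet_{A^\sharp}(\operatorname{Bar}^\bullet(A^\sharp, N, L^\bullet),\ S_A \otimes_A M^\bullet)$$
is a quasiisomorphism. We are going to show that if $M$ is both $N$-projective and injective relative to $N$ then $S_A \otimes_A M$ is injective over $A^\sharp$. First note that the functor $S_A \otimes_A *$ takes $N$-free modules to $N$-cofree ones, thus it takes $N$-projectives to $N$-injectives. The functor $\operatorname{Hom}_{A^\sharp}(S_A, *)$ is the right adjoint functor for $S_A \otimes_A *$. It is left exact and is well defined on $N$-modules since it can be written as follows: $M \mapsto \operatorname{Hom}_N(N^*, M)$. Thus $S_A \otimes_A *$ preserves relative injectiveness. Finally note that an $A^\sharp$-module that is both $N$-injective and relatively injective is also $A^\sharp$-injective. Thus for a finite exact complex $P^\bullet$ the complex $\operatorname{Hom}^\bullet_{A^\sharp}(P^\bullet, S_A \otimes_A M^\bullet)$ is exact. It remains to check the convergence of a spectral sequence similar to the one from the first statement.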


3.5. Theorem. The semiinfinite Ext functor is well defined on the corresponding derived categories:
$$\operatorname{Ext}_A^{\frac{\infty}{2}+\bullet} : D^\downarrow(A^\sharp) \times D^\uparrow(A) \to D(\mathrm{Vect}).$$
Here $D^\downarrow(A^\sharp)$ (resp. $D^\uparrow(A)$) denotes the localization of the category $C^\downarrow(A^\sharp)$ (resp. $C^\uparrow(A)$) by the class of quasiisomorphisms.

Proof. We are to prove that $\operatorname{Ext}_A^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) = 0$ for $L^\bullet \in C^\downarrow(A^\sharp)$, $M^\bullet \in C^\uparrow(A)$ if either of the arguments is exact. Suppose $M^\bullet$ is exact; the proof in the other case is quite similar. We fix $\lambda \in X$. Consider the following bigrading on $C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet)$:
$$C^{\frac{\infty}{2}\,p,q}(L^\bullet, M^\bullet) = \bigoplus_{m-n+s=p} \operatorname{Hom}_k(\overline{B}^{\otimes m} \otimes L^{-s},\ \overline{N}^{\otimes n} \otimes M^q).$$
The spectral sequence of the bicomplex ${}_\lambda C^{\frac{\infty}{2}\,\bullet\bullet}(L^\bullet, M^\bullet)$ converges since it is situated in the part of the $(p, q)$-plane bounded in $p$ from the left and in $q$ both from the left and from the right. Thus the total complex $C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet)$ is exact.

3.5.1. Remark. The last two statements show that one can use an arbitrary resolution of $M^\bullet$ that is both $N$-projective and injective relative to $N$ for the computation of $\operatorname{Ext}^\bullet_A(L^\bullet, M^\bullet)$.

4. Calculation of the Character of $\operatorname{Ext}_C^{\frac{\infty}{2}+\bullet}(k, k)$

In this section we define a $g$-module structure on the semiinfinite cohomology of the trivial $u$-module. We also calculate the character of this $g$-module. From now on $X$ and $Y$ denote the weight and the root lattice respectively, the linear function $v$ coincides with the height function on $Y$ (see 2.4.1). We will need the subcategories in the category of complexes $\operatorname{Kom}(C)$, satisfying the condition (U) (resp. (D)) from the previous section (the category $C$ is defined in 2.4). These categories are denoted by $C^\uparrow$ (resp. by $C^\downarrow$).

4.1. We fix the triangular decomposition of the finite quantum group: $u = b^- \otimes u^+$, i.e. $A = u$, $B = b^-$, $N = u^+$ in the notations of 3.1.

4.1.1. Lemma. We have $u^\star = u^\sharp = u$ and $(\operatorname{gr}^{F'} u)^\star = (\operatorname{gr}^{F'} u)^\sharp = \operatorname{gr}^{F'} u$.

Proof. The algebra $u^+$ (resp. $\operatorname{gr}^{F'} u$) is Frobenius (see 2.4.4), thus the right $u$-module $S_u$ (resp. the right $\operatorname{gr}^{F'} u$-module $S_{\operatorname{gr}^{F'} u}$) is isomorphic to the right regular $u$-module (resp. to the right regular $\operatorname{gr}^{F'} u$-module). But for any finite dimensional algebra the algebra of endomorphisms of the right regular module is isomorphic to the algebra itself.


Note that the category $C$ differs from the category $u$-mod. Thus to define semiinfinite cohomology of $u$-modules one has either to introduce an $X$-graded algebra $A$ such that $C \cong A\text{-mod}$ (the algebra $A$ is constructed in particular in [AJS], Remark 1.4) and to consider $\operatorname{Ext}_{A\text{-mod}}^{\frac{\infty}{2}+\bullet}(*, *)$, or to define semiinfinite cohomology in the category $C$ explicitly. Note that for $M \in C$ the complexes $\operatorname{Bar}^\bullet(u, b^+, M)$ and $\operatorname{Bar}^\bullet(u, b^-, M)$ belong to $\operatorname{Kom}(C)$.

4.1.2. Definition. For $L^\bullet \in C^\downarrow$, $M^\bullet \in C^\uparrow$ the semiinfinite Ext functor is defined as follows:
$$\operatorname{Ext}_C^{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) := H^\bullet\bigl(\operatorname{Hom}^\bullet_C(\operatorname{Bar}^\bullet(u, b^+, L^\bullet),\ \operatorname{Bar}^\bullet(u, b^-, M^\bullet))\bigr).$$
The semiinfinite Ext functor in the category $C(\operatorname{gr}^{F'} u)$ is defined in a similar way. Note that as stated in Lemma 4.1.1, the module $S_u$ (resp. $S_{\operatorname{gr}^{F'} u}$) is isomorphic to the right regular $u$-module (resp. to the right regular $\operatorname{gr}^{F'} u$-module), hence the definition is a direct analogue of 3.3.8. In particular the statements of Lemma 3.4.2 remain true.

4.1.3. The subalgebra $u^+ \subset b^+$ (resp. $u^- \subset b^-$) is normal. Clearly $b^+ /\!/ u^+ = u^0$ is semisimple being the group algebra of the group $\mathbb{Z}/2\ell\mathbb{Z}$. In particular a $u$-module $L$ is $b^+$-projective (resp. $b^-$-projective) if and only if it is $u^+$-projective (resp. $u^-$-projective).

4.2. Consider the following $u^+$-free $u^-$-cofree resolution of a U3-module $M^\bullet \in C^\uparrow(U3)$:
$$R^\bullet(M^\bullet) := \operatorname{Bar}^\bullet(U3, B3^+, k)^* \otimes \operatorname{Bar}^\bullet(U3, B3^-, M^\bullet).$$
Here $B3^\pm \subset U3$ denotes the positive and the negative Borel subalgebras in the "big" quantum group U3. The definition of the tensor product over the base field uses the standard Hopf algebra structure on U3. The left U3-module structure on
$$\operatorname{Bar}^\bullet(U3, B3^+, k)^* := \bigoplus_{\lambda \in X} \operatorname{Hom}_k({}_\lambda \operatorname{Bar}^\bullet(U3, B3^+, k),\ k)$$
is defined using the antipode in U3. Evidently $R^\bullet(M^\bullet)$ satisfies the condition of Lemma 3.4.2 (iii), in particular it satisfies the condition (U). Thus it can be used for the computation of semiinfinite cohomology of $u$-modules.

4.2.1. Lemma. Let $M^\bullet \in C^\uparrow(U3)$. Then $\bigoplus_{\lambda \in X} \operatorname{Ext}_C^{\frac{\infty}{2}+\bullet}(k, M^\bullet\langle \ell\lambda\rangle)$ admits a structure of $g$-module.

Proof. In [L1], Theorem 8.10, it is proved that the algebra $u$ is normal in U3, and the quotient algebra $U3 /\!/ u$ is isomorphic to the universal enveloping algebra of the semisimple Lie algebra $g$. Thus by the Shapiro Lemma we have
$$\bigoplus_{\lambda \in X} \operatorname{Ext}_C^{\frac{\infty}{2}+\bullet}(k, M^\bullet\langle \ell\lambda\rangle) = H^\bullet(\operatorname{Hom}^\bullet_u(k, R^\bullet(M^\bullet))) = H^\bullet(\operatorname{Hom}^\bullet_{U3}(U3 /\!/ u,\ R^\bullet(M^\bullet))).$$
The left U3-module $U3 /\!/ u$ is naturally equipped with a right action of the quotient algebra $U3 /\!/ u = U(g)$ commuting with the left action of U3. Thus the semiinfinite Ext spaces carry the natural structure of $g$-modules.


4.3. The rest of this section is devoted to the computation of the character of $\operatorname{Ext}_C^{\frac{\infty}{2}+\bullet}(k, k\langle \ell\lambda\rangle)$. The problem is that we do not know "the minimal" $u^-$-free resolution of the trivial $u$-module $k$.

4.3.1. Conjecture. There exists a resolution $R^{-\bullet}_{\min}(k)$ of the trivial $u$-module $k$, filtered by Verma modules $M^-(\lambda)$, such that the character of the space spanned by the highest weight vectors $v_\lambda \in M^-(\lambda)$ is given by the formal series
$$\operatorname{ch}(t) := \frac{\sum_{w \in W} e^{w(\rho) - \rho}\, t^{l(w)}}{\prod_{\alpha \in R^+} (1 - e^{-\ell\alpha} t^2)}.$$
Here $\{e^\alpha\}$ is the standard notation for the $X$-grading, and $t$ is the variable denoting the homological degree. The word "minimal" is explained by the following result of Ginzburg and Kumar.

4.3.2. Lemma. (see [GK], Theorem 2.5.)
$$\operatorname{ch}(\operatorname{Ext}^\bullet_{u^-\text{-Mod}}(k, k), t) = \frac{\sum_{w \in W} e^{-w(\rho) + \rho}\, t^{l(w)}}{\prod_{\alpha \in R^+} (1 - e^{\ell\alpha} t^2)}.$$
One can construct $R^{-\bullet}_{\min}(k)$ for $u(sl_2)$ explicitly, and one easily obtains the following statement.

4.3.3. Lemma.
$$\operatorname{ch}\Bigl(\bigoplus_{\lambda \in \mathbb{Z}} \operatorname{Ext}_{C(u(sl_2))}^{\frac{\infty}{2}+\bullet}(k, k\langle \ell\lambda\rangle),\ t\Bigr) = \frac{e^{\ell\alpha}(t + t^{-1})}{(1 - e^{\ell\alpha} t^2)(1 - e^{\ell\alpha} t^{-2})}.$$
Here $\alpha$ is the only positive root of $sl_2$.

4.4. To obtain a character formula for semiinfinite cohomology over other finite quantum groups we use the filtrations $F'$ on quantum groups defined in 2.4.6. We begin with the calculation of $\bigoplus_{\lambda \in X} \operatorname{Ext}_{C(\operatorname{gr}^{F'}(u))}^{\frac{\infty}{2}+\bullet}(k, k\langle \ell\lambda\rangle)$, using the following statement.

4.4.1. Lemma. There exists a $\operatorname{gr}(u^+)$-free resolution of the trivial $\operatorname{gr}^{F'}(u)$-module $k$ with the space of $\operatorname{gr}(u^+)$-generators equal to $\bigotimes_{\alpha \in R^+} \Lambda(\xi^\alpha) \otimes \bigotimes_{\alpha \in R^+} S(\eta^{\ell\alpha})$. Here $\{\xi^\alpha\}$ denote the exterior algebra generators of the homological degree $-1$ and the $X$-grading $\alpha$, $\{\eta^{\ell\alpha}\}$ denote the symmetric algebra generators of the homological degree $-2$ and the $X$-grading $\ell\alpha$.


Proof. Consider the subalgebras $k(x_\alpha)$, $\alpha \in R^+$, in $\operatorname{gr}(u^+)$ generated by the images of the root elements from $u^+$. Each algebra $k(x_\alpha)$ is isomorphic to $k[x_\alpha]/(x_\alpha^\ell)$, and $\operatorname{gr}^F(u^+)$ is the twisted tensor product of the algebras $k(x_\alpha)$. That is, $\operatorname{gr}^F(u^+) \cong \bigotimes_{\alpha \in R^+} k(x_\alpha)$ as a vector space, and the following relations are satisfied:
$$x_\alpha x_\beta = \zeta^{(\alpha|\beta)} x_\beta x_\alpha \qquad (*)$$
when $\alpha < \beta$ in the chosen convex ordering on the set of positive roots. Moreover we know from Lemma 2.4.7 that elements from $u^-$ commute with elements of $\operatorname{gr}^F(u^+)$ in $\operatorname{gr}^{F'}(u)$. Thus it is sufficient to construct $R^{+\bullet}_{\min}(k)$ for every algebra $k(x_\alpha)$ and to take the tensor product of these resolutions over the set of positive roots. For the latter algebra the required resolution looks as follows:

0 −→ k(xα )x`α −→ k(xα )xα −→ k(xα ) −→ k −→ 0. N • The algebra gr F (u+ ) acts on Rα by the commutation rule (*), the action of u0 on α∈R+

the complex comes from the X-grading on it and u− acts on the complex by zero.

+• We denote this resolution by Rmin,gr F 0 (u) (k).
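For a single truncated polynomial algebra $k[x]/(x^\ell)$ the free resolution of $k$ that is usually written down is 2-periodic, with differentials alternating between multiplication by $x$ and by $x^{\ell-1}$; the display in the proof above is garbled, so treating it as this periodic complex is an assumption. Its generators sit in homological degrees $0, -1, -2, \dots$ with $X$-gradings $0, \alpha, \ell\alpha, (\ell+1)\alpha, \dots$, which matches the generator space $\Lambda(\xi^\alpha) \otimes S(\eta^{\ell\alpha})$ of Lemma 4.4.1. The sketch below checks exactness of that periodic complex by linear algebra over the rationals; the helper names are hypothetical.

```python
from fractions import Fraction


def mult_matrix(ell, k):
    """Matrix of multiplication by x^k on k[x]/(x^ell) in the basis 1, x, ..., x^(ell-1)."""
    return [[1 if i == j + k else 0 for j in range(ell)] for i in range(ell)]


def matmul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(size)) for j in range(size)]
            for i in range(size)]


def rank(mat):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in mat]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_periodic_resolution_exact(ell):
    """Check that x and x^(ell-1) compose to zero and that kernel equals image at each spot.

    Since the two maps commute, exactness at both spots amounts to
    rank(x) + rank(x^(ell-1)) == ell together with the product vanishing.
    """
    d_odd = mult_matrix(ell, 1)
    d_even = mult_matrix(ell, ell - 1)
    product_is_zero = all(v == 0 for row in matmul(d_odd, d_even) for v in row)
    return product_is_zero and rank(d_odd) + rank(d_even) == ell
```

The image of multiplication by $x$ has dimension $\ell - 1$, which is also the dimension of the kernel of the augmentation onto $k$, so the complex resolves $k$.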

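4.4.2. Proposition.
$$\operatorname{ch}\Bigl(\bigoplus_{\lambda \in X} \operatorname{Ext}_{C(\operatorname{gr}^{F'}(u))}^{\frac{\infty}{2}+\bullet}(k, k\langle \ell\lambda\rangle),\ t\Bigr) = \frac{t^{-\dim n^-}\, e^{2\ell\rho} \sum_{w \in W} t^{2l(w)}}{\prod_{\alpha \in R^+} (1 - e^{\ell\alpha} t^2)(1 - e^{\ell\alpha} t^{-2})}.$$
Here as before $t$ denotes the homological grading, $\{e^\alpha\}$ is the standard notation for the $X$-grading. The equality is understood as an equality of power series in variables numbered by the set of the generators of $X$ with coefficients in $k[t, t^{-1}]$.

4.1. Proof of Proposition 4.4.2. We know that $\operatorname{gr}^{F'}(u^+)$ is a Frobenius algebra, hence $\operatorname{gr}^{F'}(u)^\sharp = \operatorname{gr}^{F'}(u)$, and
$$\bigoplus_{\lambda \in X} \operatorname{Ext}_{C(\operatorname{gr}^{F'}(u))}^{\frac{\infty}{2}+\bullet}(k, k\langle \ell\lambda\rangle) = H^\bullet\bigl(\operatorname{Hom}^\bullet_{\operatorname{gr}^{F'}(u)}(P^{-\bullet}(k),\ R^{+\bullet}_{\min, \operatorname{gr}^{F'}(u)}(k))\bigr).$$
Here $P^{-\bullet}(k)$ denotes an arbitrary $u^-$-free resolution of $k$ belonging to $C^\downarrow$. The fact that $\operatorname{gr}^{F'}(u^+)$ is Frobenius also implies that $R^{+\bullet}_{\min, \operatorname{gr}^{F'}(u)}(k)$ consists of $\operatorname{gr}^{F'}(u^+)$-cofree modules with the space of cogenerators equal to
$$\bigotimes_{\alpha \in R^+} \Lambda(\xi^\alpha) \otimes \bigotimes_{\alpha \in R^+} S(\eta^{\ell\alpha}) \otimes k((\ell - 1)2\rho).$$
Consider the spectral sequence of the bicomplex $\operatorname{Hom}^\bullet_{\operatorname{gr}^{F'}(u)}(P^{-\bullet}(k),\ R^{+\bullet}_{\min, \operatorname{gr}^{F'}(u)}(k))$. The term $E_1$ looks as follows:
$$\bigoplus_{\lambda \in X} \operatorname{Ext}^\bullet_{C(b^-)}\Bigl(k,\ \bigotimes_{\alpha \in R^+} \Lambda(\xi^\alpha) \otimes k((\ell - 1)2\rho)\langle \ell\lambda\rangle\Bigr) \otimes \bigotimes_{\alpha \in R^+} S(\eta^{\ell\alpha}),$$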


where the space $\bigotimes_{\alpha \in R^+} \Lambda(\xi^\alpha) \otimes k((\ell - 1)2\rho)$ is decomposed into the direct sum of one dimensional $b^-$-modules. Let us reformulate the result of Ginzburg and Kumar (see Lemma 4.3.2) in terms of the Ext functor in the category of $b$-modules.

4.4.3. Lemma. (see [GK], Theorem 2.5.)
(i) $\bigoplus_{\lambda \in X} \operatorname{Ext}^\bullet_{C(b^-)}(k, k(\mu)\langle \ell\lambda\rangle) = 0$ when $\mu \neq w(\rho) - \rho$;
(ii) for $\mu = w(\rho) - \rho$ we have
$$\operatorname{ch}\Bigl(\bigoplus_{\lambda \in X} \operatorname{Ext}^\bullet_{C(b^-)}(k, k(\mu)\langle \ell\lambda\rangle),\ t\Bigr) = \frac{t^{l(w)}}{\prod_{\alpha \in R^+} (1 - e^{\ell\alpha} t^2)}.$$

4.4.4. Lemma. (see [J], part II, Lemma 12.10) Let $\alpha_1, \dots, \alpha_k$ and $\beta_1, \dots, \beta_m$ be two sets of pairwise distinct positive roots such that $\alpha_1 + \dots + \alpha_k \equiv \beta_1 + \dots + \beta_m \mod (\ell X)$. Then for $\ell > 2(h - 1)$ we have $\alpha_1 + \dots + \alpha_k = \beta_1 + \dots + \beta_m$. Here $h$ denotes the Coxeter number of the root system (see [J], p.262).

For any element of the Weyl group $w \in W$ there exists a unique element $w' \in W$ such that $w(\rho) + w'(\rho) = 0$. Then $w(\rho) - \rho = -2\rho - (w'(\rho) - \rho)$. In [GK], 2.5 it is proved that for every $w' \in W$ we have $\dim {}_{\rho - w'(\rho)}\bigl(\bigotimes_{\alpha \in R^+} \Lambda(\xi^\alpha)\bigr) = 1$. Thus for every $w \in W$ we obtain $\dim {}_{w(\rho) - \rho}\bigl(\bigotimes_{\alpha \in R^+} \Lambda(\xi^\alpha) \otimes k(-2\rho)\bigr) = 1$. Noting that $l(w') = \dim n^- - l(w)$ if $w(\rho) + w'(\rho) = 0$ we see that all the nonzero entries of the term $E_1$ of the spectral sequence are of the same parity, whence it degenerates, and the proposition follows from Lemmas 4.4.3 and 4.4.4.

4.5. Theorem.
$$\operatorname{ch}\Bigl(\bigoplus_{\lambda \in X} \operatorname{Ext}_C^{\frac{\infty}{2}+\bullet}(k, k\langle \ell\lambda\rangle),\ t\Bigr) = \frac{t^{-\dim n^-}\, e^{2\ell\rho} \sum_{w \in W} t^{2l(w)}}{\prod_{\alpha \in R^+} (1 - e^{\ell\alpha} t^2)(1 - e^{\ell\alpha} t^{-2})}.$$

Proof. The spectral sequence arising from the filtration of the complex $C^{\frac{\infty}{2}+\bullet}(k, k)$, which is induced by the filtration $F'$ on $u$, degenerates since all the nonzero semiinfinite cohomology spaces of the trivial module over $\operatorname{gr}^{F'}(u)$ are of the same parity.
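As a consistency check, the formula of Theorem 4.5 can be specialized to $sl_2$ and compared with Lemma 4.3.3. The sketch below assumes the standard $sl_2$ data, which the text does not spell out at this point: $\dim n^- = 1$, $2\rho = \alpha$ (so $e^{2\ell\rho} = e^{\ell\alpha}$), and $W = \{1, s\}$ with lengths $0$ and $1$. The variable `e` stands for $e^{\ell\alpha}$, and both sides are evaluated as rational functions at sample points; the function names are hypothetical.

```python
from fractions import Fraction


def theorem_4_5_for_sl2(e, t):
    """Right-hand side of Theorem 4.5 with the assumed sl2 data plugged in."""
    weyl_sum = t ** 0 + t ** 2          # sum over W of t^(2 l(w)), lengths 0 and 1
    numerator = t ** (-1) * e * weyl_sum  # t^(-dim n^-) * e^(2 ell rho) * sum
    return numerator / ((1 - e * t ** 2) * (1 - e * t ** (-2)))


def lemma_4_3_3(e, t):
    """Right-hand side of Lemma 4.3.3."""
    return e * (t + t ** (-1)) / ((1 - e * t ** 2) * (1 - e * t ** (-2)))


def formulas_agree(points):
    """Compare both expressions at the given (e, t) sample points."""
    return all(theorem_4_5_for_sl2(e, t) == lemma_4_3_3(e, t) for e, t in points)


# Sample points chosen so that neither denominator factor vanishes.
SAMPLE_POINTS = [
    (Fraction(1, 2), Fraction(2, 3)),
    (Fraction(3, 5), Fraction(7, 4)),
    (Fraction(-2, 7), Fraction(5, 3)),
]
```

The agreement is just the identity $t^{-1}(1 + t^2) = t + t^{-1}$ in the numerator; the denominators coincide term by term.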


5. $\operatorname{Tor}^C_{\frac{\infty}{2}+\bullet}(L, M)$ and Conformal Blocks

In this section we are going to compare our semiinfinite cohomology of quantum groups with the functor defined by Finkelberg and Schechtman.

5.1. First we define the semiinfinite homology of quantum groups.

5.1.1. Definition. For $M \in C$, $L \in C^{(r)}$ we set
$$\operatorname{Tor}^C_{\frac{\infty}{2}+\bullet}(L, M) = {}_0(H^\bullet(P^\bullet_{u^-}(L) \otimes_u P^\bullet_{u^+}(M))).$$
Here $P^\bullet_{u^-}(L) \in C^{(r)\downarrow}$ is a $u^-$-free resolution of $L$, $P^\bullet_{u^+}(M) \in C^\downarrow$ is a $u^+$-free resolution of $M$, and ${}_0(H^\bullet(\dots))$ denotes the zeroth $X$-grading component. As in the third section, one can easily check that the definition does not depend on the choice of resolutions. Alternatively, this is a corollary of the following comparison Lemma.

5.1.2. Lemma. $\operatorname{Ext}_C^{\frac{\infty}{2}+\bullet}(L, M) = \operatorname{Tor}^C_{\frac{\infty}{2}-\bullet}(M^*, L)^*$.

Proof. The statement follows immediately from the definition of semiinfinite homology and the standard relation between $\operatorname{Hom}_u$ and $\otimes_u$.

In particular the semiinfinite Tor functor is well defined on the corresponding derived categories of $u$-modules (see 3.5). We will need the following reformulation of the statement of Lemma 3.4.2 (iii) in the language of the semiinfinite Tor functor.

5.1.3. Lemma. Let $L^\bullet \in C^{(r)\downarrow}$, $M^\bullet \in C^\downarrow$. Suppose that $M^\bullet$ is both $u^+$-projective and $u$-injective relative to $u^-$. Then $\operatorname{Tor}^C_{\frac{\infty}{2}+\bullet}(L^\bullet, M^\bullet) = {}_0(H^\bullet(L^\bullet \otimes_u M^\bullet))$.

Note that since both algebras $u$ and $u^+$ are Frobenius (see 2.4.4), the condition of the previous lemma means simply that $M^\bullet \in C^\downarrow$ consists of $u$-projective modules.

5.1.4. Remark. M. Finkelberg and V. Schechtman gave a geometric definition of the semiinfinite Tor functor in the category $C$ (see [FiS], part IV). In [Ar] and [FiS], part IV, it is proved that the geometric definition coincides with the one presented here.

5.2. Semiinfinite homology plays an important part in the calculation of conformal blocks. Recall the definition of conformal blocks (see e.g. [A]). Suppose for simplicity that our Cartan matrix is symmetric. Let $\gamma \in R^\vee$ be the highest root, and denote by $\Delta \subset X$ the first alcove, $\Delta = \{\lambda \in X \mid \langle\gamma, \lambda + \rho\rangle < \ell,\ \langle\alpha_i^\vee, \lambda + \rho\rangle > 0,\ i = 1, \dots, r\}$. For $\lambda_1, \dots, \lambda_n \in \Delta$ the conformal blocks $\langle L(\lambda_1), \dots, L(\lambda_n)\rangle$ are defined as the maximal trivial direct summand in $L(\lambda_1) \otimes \dots \otimes L(\lambda_n)$.


5.3. Theorem. There exists a natural inclusion
$$\varphi : \langle L(\lambda_1), \dots, L(\lambda_n)\rangle \hookrightarrow \operatorname{Tor}^C_{\frac{\infty}{2}+0}(k,\ L(\lambda_1) \otimes \dots \otimes L(\lambda_n) \otimes L((\ell - 1)2\rho)).$$
The proof follows immediately from the definition of the conformal blocks and the following statement.

5.3.1. Proposition. $\operatorname{Tor}^C_{\frac{\infty}{2}+0}(k, L((\ell - 1)2\rho)) = k$.

Proof. Using 5.1.2 we are going to prove the corresponding statement for semiinfinite cohomology. Choose left resolutions of $k$ and $L((\ell - 1)2\rho)$ beginning with $M^+(0)$ and $M^-(-(\ell - 1)2\rho)$ respectively and satisfying the conditions (U) and (D) respectively. Such resolutions exist, for example one can take the standard resolution $\operatorname{Bar}^\bullet(u, b^-, M^+(0))$ of the kernel of the canonical projection $M^+(0) \to k$ (resp. the standard resolution $\operatorname{Bar}^\bullet(u, b^+, M^-((\ell - 1)2\rho))$ of the kernel of the canonical projection $M^-((\ell - 1)2\rho) \to L((\ell - 1)2\rho)$). Denote these resolutions by $P^{(-)\bullet}(L((\ell - 1)2\rho))$ and $P^{(+)\bullet}(k)$ respectively. Using 2.5.1 (iii) we see that $\operatorname{Hom}_C(P^{(-)n}(L((\ell - 1)2\rho)), P^{(+)m}(k))$ is nonzero only when $n = m = 0$, and $\operatorname{Hom}_C(P^{(-)0}(L((\ell - 1)2\rho)), P^{(+)0}(k)) = k$.

5.3.2. The inclusion $\varphi$ in general is not bijective. Consider the following example: $g = sl_2$, $\ell = 5$, $n = 4$, $\lambda_1 = \lambda_2 = 2$, $\lambda_3 = \lambda_4 = 3$ (we have identified $X$ with $\mathbb{Z}$, so $\rho = 1$, and $(\ell - 1)2\rho = 8$).

5.3.3. Lemma. $\operatorname{Tor}^C_{\frac{\infty}{2}+0}(k, P(0)) = k$, where $P(0)$ denotes the projective covering of the trivial module.

Proof. The statement follows easily from 5.3.1.

Since the maximal trivial direct summand of $P(0)$ is zero, it is enough to find this module among the direct summands of $L(2) \otimes L(2) \otimes L(3) \otimes L(3) \otimes L(8)$. If we find a projective direct summand $P$ in $V := L(2) \otimes L(2) \otimes L(3) \otimes L(3)$ with the highest weight $0$, then $V \otimes L(8)$ will contain a projective direct summand with the highest weight $8$, i.e. $P(0)$. It is well known (see [L2], Proposition 7.1) that all the modules $L(\lambda)$, $\lambda \in \Delta$, lift to simple U3-modules. It follows from the results of [A] that for $g = sl_2$, $\lambda_1, \dots, \lambda_n \in \Delta$, the module $L(\lambda_1) \otimes \dots \otimes L(\lambda_n)$ is a direct sum of projective U3-modules and simple U3-modules with highest weights in $\Delta$. Thus the U3-module $L(2) \otimes L(2) \otimes L(3) \otimes L(3)$ contains the indecomposable projective U3-module $\tilde P(8)$ (which is the projective covering of the simple U3-module $\tilde L(8)$) with the highest weight $10$ as a U3-direct summand. One can check easily that when restricted to $u$ the module $\tilde P(8)$ contains as a direct summand $P(-2)$, the projective covering of the $u$-module $L(-2)$. But the highest weight of $P(-2)$ is $0$. We conclude that $L(2) \otimes L(2) \otimes L(3) \otimes L(3) \otimes L(8)$ contains a direct summand $P(0)$. Hence the semiinfinite Tor space is strictly bigger than the conformal blocks space.
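The numerical data of the example in 5.3.2 can be checked mechanically. The sketch below assumes the identification $X = \mathbb{Z}$ with $\rho = 1$ used in the text, and additionally assumes that under this identification the pairing with the single coroot sends $\lambda$ to $\lambda$ itself, so that the first alcove of 5.2 becomes $\{\lambda \mid 0 < \lambda + \rho < \ell\}$. The function name and the search window are hypothetical.

```python
def first_alcove_sl2(ell, rho=1):
    """First alcove for sl2 with X = Z: all lambda with 0 < lambda + rho < ell.

    Assumes the coroot pairing <alpha^vee, lambda> equals lambda, which is
    the normalization in which rho = 1.
    """
    return [lam for lam in range(-ell, ell) if 0 < lam + rho < ell]


ELL = 5
RHO = 1
WEIGHTS = [2, 2, 3, 3]                 # lambda_1, ..., lambda_4 from 5.3.2
STEINBERG_SHIFT = (ELL - 1) * 2 * RHO  # the weight (ell - 1) 2 rho

ALCOVE = first_alcove_sl2(ELL, RHO)
ALL_IN_ALCOVE = all(w in ALCOVE for w in WEIGHTS)
TOP_WEIGHT = sum(WEIGHTS)              # highest weight of L(2) x L(2) x L(3) x L(3)
```

With these values the chosen weights lie in the alcove, the shift equals 8, and the top weight of the fourfold tensor product is 10, the highest weight quoted for the projective summand in the argument above.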


5.4. Now we construct a certain duality on semiinfinite Tor spaces that corresponds to Poincaré duality in the Finkelberg-Schechtman interpretation.

5.4.1. We denote by $\tilde u$ the finite quantum group defined in the same way as $u$, but with $\zeta$ replaced by $\zeta^{-1}$ in the defining relations.

5.4.2. Lemma. The map $\phi : E_i \mapsto F_i$, $F_i \mapsto E_i$, $K_i \mapsto K_i^{-1}$ defines an antiisomorphism of algebras $\phi : u \to \tilde u$.

We denote by $\tilde C$ the category of left $\tilde u$-modules satisfying the conditions of the type 3. Define the functor $D : C \to \tilde C^{\mathrm{opp}}$, $D(M) = \operatorname{Hom}_k(M, k)$ with the natural left action of $\tilde u$ constructed as follows: $u \cdot f(m) := f(\phi^{-1}(u)m)$, where $u \in \tilde u$, $f \in \operatorname{Hom}_k(M, k)$. One can check directly that the supports of the modules $M \in C$ and $D(M) \in \tilde C$ coincide.

5.4.3. Proposition. There exists a nondegenerate pairing
$$\langle\ ,\ \rangle : \operatorname{Tor}^C_{\frac{\infty}{2}+\bullet}(k, M) \times \operatorname{Tor}^{\tilde C}_{\frac{\infty}{2}-\bullet}(k, D(M)) \to k.$$

Proof. By 5.1.2 we have
$$\operatorname{Tor}^C_{\frac{\infty}{2}+\bullet}(k, M) = (\operatorname{Ext}_C^{\frac{\infty}{2}-\bullet}(M, k))^*, \qquad \operatorname{Tor}^{\tilde C}_{\frac{\infty}{2}+\bullet}(k, D(M)) = (\operatorname{Ext}_{\tilde C}^{\frac{\infty}{2}-\bullet}(D(M), k))^*,$$
thus it is sufficient to construct a nondegenerate pairing
$$\operatorname{Ext}_C^{\frac{\infty}{2}+\bullet}(M, k) \times \operatorname{Ext}_{\tilde C}^{\frac{\infty}{2}-\bullet}(D(M), k) \to k.$$

Choose a resolution R• (k) ∈ C ↓ consisting of projective u-modules. Then by 5.3.1 ∞ +• Ext C2 (M, k) = 0 (H • (Hom•u (M, R• (k)))), and by definition of D we obtain ∞ +• Ext 2 (D(M ), k) = 0 H • Hom•e (D(M ), D(R• (k)) = 0 Hom•u (R• (k), M ) . u e C • We may assume that R (k) consists of modules of the form L = Coindu u0 (V ). For such a module consider the canonical isomorphism ∗ g → 0 Hom•u0 (M, V ) − g → 0 Hom•u0 (V, M ) µ : 0 (Homu (M, L)) − ∗ ∗ − g → 0 Hom•u (Indu g → 0 Hom•u (L, M ) u0 (V ), M ) − u + The last equality uses the fact that Indu u0 (V ) = Coindu0 (V ) since all the algebras u , − u and u are Frobenius. We are to check that µ commutes with morphisms of u-modules L1 −→ L2 . But the general statement follows easily from the statement for L1 and L2 being u-free. So g → Homu (M, u) is an isomorphism of right we are to check that µ : Homu (u, M )∗ − u-modules. This can be verified directly. So µ induces a nondegenerate pairing of the complexes: h , i : Hom•u (R• (k), M ) × Hom•u (M, R• (k)) −→ k. that becomes the required pairing on the semiinfinite Tor spaces.


S. M. Arkhipov

Appendix A. Distributions on the Nilpotent Cone

In this section we give a geometric interpretation of the character of the semiinfinite cohomology of the trivial module over the finite quantum group.

A.1. Consider the nilpotent cone $\mathcal{N} \subset \mathfrak{g}$. $\mathcal{N}$ is a singular affine algebraic variety containing the positive nilpotent subalgebra $\mathfrak{n} \subset \mathcal{N} \subset \mathfrak{g}$. Denote by $F(\mathcal{N})$ the space of complex algebraic functions on $\mathcal{N}$. The adjoint action of $\mathfrak{g}$ preserves $\mathcal{N}$ and induces a representation of $\mathfrak{g}$ in $F(\mathcal{N})$. Our considerations are parallel to the following result due to Ginzburg and Kumar [GK].

A.1.1. Proposition. (see [GK], Theorem 5).
(i) $\bigoplus_{\lambda \in X} \mathrm{Ext}^{2n+1}_{C}(k, k(\ell\lambda)) = 0$ for any $n \ge 0$;
(ii) $\bigoplus_{\lambda \in X} \mathrm{Ext}^{2n}_{C}(k, k(\ell\lambda)) = F^{n}(\mathcal{N})$ as $\mathfrak{g}$-modules. Here the grading on the right-hand side is the grading by homogeneous degree of functions on $\mathcal{N}$.
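As an illustration of part (ii), the smallest case can be worked out by hand. The sketch below assumes $\mathfrak{g} = \mathfrak{sl}_2$; this choice, the coordinates $x, y, z$ and the notation $V(2n)$ for the irreducible module of highest weight $2n$ are introduced only for this example and do not appear in the text.

```latex
% Assumption: g = sl_2, with coordinates x, y, z on g (illustrative names).
\[
\mathcal{N} = \left\{ \begin{pmatrix} x & y \\ z & -x \end{pmatrix} : x^2 + yz = 0 \right\},
\qquad
F(\mathcal{N}) = k[x,y,z]/(x^2 + yz).
\]
% Degree-n part: all degree-n polynomials modulo multiples of the quadric.
\[
\dim F^{n}(\mathcal{N}) = \binom{n+2}{2} - \binom{n}{2} = 2n + 1,
\qquad
F^{n}(\mathcal{N}) \cong V(2n) \ \text{as } \mathfrak{sl}_2\text{-modules}.
\]
% So, under the hypotheses of A.1.1, part (ii) would read in this case:
\[
\bigoplus_{\lambda \in X} \mathrm{Ext}^{2n}_{C}(k, k(\ell\lambda)) \cong V(2n).
\]
```

In this case the even Ext groups reproduce the harmonic polynomials of each degree on $\mathfrak{sl}_2$, one irreducible module per degree.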

B. Feigin has proposed the following conjecture.

A.1.2. Conjecture. The $\mathfrak{g}$-module $\bigoplus_{\lambda \in X} \mathrm{Ext}_{C}^{\frac{\infty}{2}+\bullet}(k, k\langle \ell\lambda \rangle)$ is isomorphic to the $\mathfrak{g}$-module of distributions on $\mathcal{N}$ with support in $\mathfrak{n} \subset \mathcal{N}$.

To formulate the exact statement we will need several well known facts about the geometry of the nilpotent cone. We will need the Grothendieck-Springer resolution of $\mathcal{N}$. Choose a maximal torus $T \subset G$ and a Borel subgroup $T \subset B \subset G$. Consider the flag variety $\mathcal{B} = G/B$. Let $\mathfrak{b} \subset \mathfrak{g}$ be the Lie algebra of $B$, and let $\mathfrak{n}$ be its nilpotent radical.

A.1.3. Lemma. (see [CG], 3.1.36)
(i) The natural map $\sigma : T^{*}(\mathcal{B}) \to \mathcal{N}$, where $T^{*}(\mathcal{B})$ is the cotangent bundle to $\mathcal{B}$, provides a resolution of singularities of $\mathcal{N}$.
(ii) The preimage of the standard nilpotent subalgebra under $\sigma$ is
$$\sigma^{-1}(\mathfrak{n}) = \coprod_{w \in W} T^{*}_{C_w}\mathcal{B} \subset T^{*}(\mathcal{B}),$$
where $T^{*}_{C_w}\mathcal{B}$ denotes the conormal bundle to a $B$-orbit $C_w$.

Denote the union of the conormal bundles to the B-orbits Cw by S. A.2. We use some results and methods due to Kempf [K]. The notion of cohomology with support in a locally closed subvariety was investigated in that paper. Kempf also introduced the action of a Lie algebra on local cohomology in the case when the corresponding Lie group acts on the ambient space.

Semiinfinite Cohomology of Quantum Groups


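We denote the local cohomology of $X$ with support in $Y \subset X$ and with coefficients in a sheaf $F$ by $H^{\bullet}_{Y}(X, F)$.

A.2.1. Lemma. (i) $\mathfrak{g}$ acts naturally on $H^{\bullet}_{\mathfrak{n}}(\mathcal{N}, \mathcal{O}_{\mathcal{N}})$. Here $\mathcal{O}_{\mathcal{N}}$ denotes the structure sheaf of $\mathcal{N}$.
(ii) There is a natural isomorphism of $\mathfrak{g}$-modules $H^{\bullet}_{\mathfrak{n}}(\mathcal{N}, \mathcal{O}_{\mathcal{N}}) \xrightarrow{\ \sim\ } H^{\bullet}_{S}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})})$.

Proof. (i) Follows from [K], Lemma 11.1. (ii) The existence of the isomorphism of vector spaces in question follows from the fact that $R^{0}\sigma_{*}\mathcal{O}_{T^{*}\mathcal{B}} = \mathcal{O}_{\mathcal{N}}$, and $R^{i}\sigma_{*}\mathcal{O}_{T^{*}\mathcal{B}} = 0$ for $i > 0$. The isomorphism commutes with the $\mathfrak{g}$-action since $\sigma$ is $G$-equivariant.

We will calculate the character of the $\mathfrak{g}$-module $H^{\bullet}_{S}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})})$. The variable $t$ corresponds to the grading by the homogeneous degree.

A.2.2. Theorem. The local cohomology spaces $H^{\ne \dim \mathfrak{n}^{-}}_{S}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})})$ vanish and we have
$$\mathrm{ch}\left(H^{\dim \mathfrak{n}^{-}}_{S}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}), t\right) = e^{2\rho}\, \frac{\sum_{w \in W} t^{l(w)}}{\prod_{\alpha \in R^{+}} (1 - e^{\alpha} t)(1 - e^{\alpha} t^{-1})}.$$
The equality here is the equality of power series in the variables numbered by the generators of $X$ with coefficients in $k[t, t^{-1}]$.

Proof. The statement follows immediately from Lemmas A.2.4, A.2.5 and A.2.6.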
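To see the shape of the character formula in Theorem A.2.2, one can specialize it to rank one. The following is only a substitution into the stated formula, assuming $\mathfrak{g} = \mathfrak{sl}_2$ (an assumption made for this example), so that $R^{+} = \{\alpha\}$, $2\rho = \alpha$, $W = \{1, s\}$ with $l(1) = 0$, $l(s) = 1$, $\mathcal{B} = \mathbb{P}^{1}$ and $\dim \mathfrak{n}^{-} = 1$.

```latex
% Assumption: g = sl_2, so R^+ = {alpha}, 2 rho = alpha, W = {1, s}.
\[
\sum_{w \in W} t^{l(w)} = 1 + t,
\qquad
\prod_{\alpha \in R^{+}} (1 - e^{\alpha} t)(1 - e^{\alpha} t^{-1})
  = (1 - e^{\alpha} t)(1 - e^{\alpha} t^{-1}),
\]
\[
\mathrm{ch}\left(H^{1}_{S}(T^{*}(\mathbb{P}^{1}), \mathcal{O}_{T^{*}(\mathbb{P}^{1})}), t\right)
  = \frac{e^{\alpha}\,(1 + t)}{(1 - e^{\alpha} t)(1 - e^{\alpha} t^{-1})}.
\]
```

The two summands $1$ and $t$ in the numerator correspond to the two $B$-orbits on $\mathbb{P}^{1}$, the point and the open cell.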

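Comparing the answer with Theorem 4.6 we obtain the following fact.

A.2.3. Corollary. Up to a shift of grading in $t$ the character of the semiinfinite cohomology of the trivial module over the finite quantum group coincides with the character of $H^{\dim \mathfrak{n}^{-}}_{\mathfrak{n}}(\mathcal{N}, \mathcal{O}_{\mathcal{N}})$.

A.2.4. Lemma.
$$\mathrm{ch}\left(H^{\bullet}_{S}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}), t\right) = \sum_{w \in W} \mathrm{ch}\left(H^{\bullet}_{T^{*}_{C_w}(\mathcal{B})}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}), t\right).$$

Proof. Fix a total linear order on the set $\{C_w \mid w \in W\}$ compatible with the natural partial order by inclusion. We introduce the filtration on $S$ by subspaces $S_w := \bigcup_{w' \le w} T^{*}_{C_{w'}}\mathcal{B}$. We prove by induction that
$$\mathrm{ch}\left(H^{\bullet}_{S_w}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}), t\right) = \sum_{w' \le w} \mathrm{ch}\left(H^{\bullet}_{T^{*}(C_{w'})}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}), t\right).$$
For $w = 1$, $C_w = \mathrm{pt}$, the statement is evident.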


If $w$ follows $w'$ directly in the total linear order then $S_{w'}$ is closed in $S_w$ and $S_w \setminus S_{w'} = T^{*}(C_w)$. Then by definition of local cohomology there exists a long exact sequence
$$\dots \to H^{i}_{S_{w'}}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}) \to H^{i}_{S_{w}}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}) \to H^{i}_{C_{w}}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}) \to H^{i+1}_{S_{w'}}(T^{*}(\mathcal{B}), \mathcal{O}_{T^{*}(\mathcal{B})}) \to \dots.$$
The cohomology of a nonsingular variety with support in a nonsingular (locally closed) subvariety is nonzero only in one degree equal to the codimension of the subvariety, thus the statement is proved.

A.2.5. Lemma. (see [K], 6.4). There exists a $T$-equivariant neighbourhood $P(C_w)$ of a $B$-orbit $C_w$ such that
(i) $P(C_w) \cong \prod_{\alpha \in R^{+},\, w(\alpha) \in R^{-}} V(\rho - w(\rho)) \times \prod_{\alpha \in R^{-},\, w(\alpha) \in R^{-}} V(\rho - w(\rho))$ as topological spaces with the action of $T$. Here $V(\lambda)$ denotes the one dimensional representation of $T$ of weight $\lambda$;
(ii) under this identification $C_w$ corresponds to the first factor.

A.2.6. Lemma. (see [H]) X



Y

tl(w) 

(1 − eα t)

ÿ =

Y α∈R+

(1 − eα t−1 )

α∈R+ ,w(α)∈R−

α∈R+ ,w(α)∈R+

w∈W



Y

!ÿ (1 − e t) α

X

! t

l(w)

.

w∈W

Appendix B. Semiinfinite cohomology as a derived functor Here we discuss the notion of a K-semijective complex introduced by Voronov (see [V]) and relations between our approach to semiinfinite cohomology of associative algebras and Voronov’s investigation of Lie algebras’ semiinfinite cohomology. It turns out that Voronov’s results remain true not only in the case of graded Lie algebras but also in a more general setting of an associative algebra A equipped with a triangular decomposition A = B ⊗ N . So we remain in the situation of 3.1. We begin with recalling several basic results of semiinfinite homological algebra. B.1. Let O(A) be the category of X-graded A-modules M =

L

λ M satisfying the Ss + (βi ), with following condition: there exist β1 , . . . , βs ∈ X such that supp M ⊂ i=1 XQ morphisms being morphisms of A-modules that preserve X-gradings. The corresponding category of X-graded N -modules is denoted by O(N ). We denote the homotopy category of unbonded complexes over O(A) (resp., over O(N )) with morphisms being morphisms of complexes modulo nillhomotopies, by K(A) (resp., by K(N )). λ∈X


B.1.1. Definition. (see [V], 3.3). An object $S^{\bullet} \in K(A)$ is called K-semijective if
(i) it is K-projective over $N$, i.e. $\mathrm{Hom}_{K(N)}(S^{\bullet}, V^{\bullet}) = 0$ for any (possibly unbounded) complex $V^{\bullet} \in \mathrm{Kom}(N)$ equal to zero in $K(N)$;
(ii) it is K-injective over $A$ relative to $N$, i.e. $\mathrm{Hom}_{K(A)}(V^{\bullet}, S^{\bullet}) = 0$ for any complex $V^{\bullet} \in K(A)$ such that $\mathrm{Hom}_{K(N)}(V^{\bullet}, V^{\bullet}) = 0$ (in particular $V^{\bullet}$ is acyclic).

Note that our definition is “turned upside down” with respect to Voronov's one (i.e. our semijective complexes should be called co-K-semijective in Voronov's terms). Note also that $A$-modules that are both $N$-projective and $A$-injective relative to $N$ (see 2.4.2) evidently are K-semijective. We call such $A$-modules semijective (without K).

B.1.2. Lemma. (i) Any bounded from above complex of $N$-projective $A$-modules is K-projective over $N$; (ii) any bounded from below complex of $A$-injective relative to $N$ modules is K-injective relative to $N$; (iii) in particular any bounded complex of semijective modules is K-semijective.

The following statement is the main achievement of semiinfinite homological algebra.

B.2. Theorem. (see [V], Theorem 3.3). Let $K(SJ(A))$ be the homotopical category of K-semijective complexes over $O(A)$. $D(A)$ denotes the unbounded derived category of complexes over $O(A)$. Then the functor of localization by the class of quasiisomorphisms provides a natural equivalence of triangulated categories $K(SJ(A)) \xrightarrow{\ \sim\ } D(A)$.

B.2.1. Consider the following resolution $R^{\bullet}(M)$ of an $A$-module $M \in O(A)$:
$$R^{\bullet}(M) := \mathrm{Hom}^{\bullet}_{A}(\mathrm{Bar}^{\bullet}(A, N, A), \mathrm{Bar}^{\bullet}(A, B, M)),$$
where $A$ in $\mathrm{Bar}^{\bullet}(A, N, A)$ is considered as an $A$-$A$ bimodule. Evidently $R^{\bullet}(M) \in C^{\uparrow}(A)$.

B.2.2. Lemma. $R^{\bullet}(M) \in C^{\uparrow}(A)$ is K-semijective.

Proof. First note that $\mathrm{Bar}^{\bullet}(A, N, A)$ is homotopically equivalent to $A$ as an $A$-$N$ bimodule; the homotopy is provided by the map
$$a_0 \otimes \dots \otimes a_n \otimes a \mapsto a_0 \otimes \dots \otimes a_n \otimes a \otimes 1.$$
Thus $R^{\bullet}(M)$ is homotopically equivalent to $\mathrm{Hom}^{\bullet}_{A}(A, \mathrm{Bar}^{\bullet}(A, B, M)) = \mathrm{Bar}^{\bullet}(A, B, M)$ as an $N$-module. In particular for any exact complex $V^{\bullet} \in O(A)$ we have


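$$\mathrm{Hom}_{K(N)}(R^{\bullet}(M), V^{\bullet}) = \mathrm{Hom}_{K(N)}(\mathrm{Bar}^{\bullet}(A, B, M), V^{\bullet}) = 0.$$
Thus $R^{\bullet}(M)$ is K-projective over $N$. To prove that $R^{\bullet}(M)$ is $A$-injective relative to $N$ note that for a complex of $A$-modules $V^{\bullet}$ homotopically equivalent to zero over $N$,
$$\mathrm{Hom}^{\bullet}_{A}(V^{\bullet}, \mathrm{Hom}^{\bullet}_{A}(\mathrm{Bar}^{\bullet}(A, N, A), \mathrm{Bar}^{\bullet}(A, B, M))) = \mathrm{Hom}^{\bullet}_{A}(\mathrm{Bar}^{\bullet}(A, N, A) \otimes_{A} V^{\bullet}, \mathrm{Bar}^{\bullet}(A, B, M)) = \mathrm{Hom}^{\bullet}_{A}(\mathrm{Bar}^{\bullet}(A, N, V^{\bullet}), \mathrm{Bar}^{\bullet}(A, B, M)).$$
Since $V^{\bullet} \cong 0$ in $K(N)$, each line $\mathrm{Bar}^{p}(A, N, V^{\bullet})$ of the bicomplex $\mathrm{Bar}^{\bullet}(A, N, V^{\bullet})$ is homotopically equivalent to zero over $A$. Since $\mathrm{Bar}^{p}(A, N, V^{\bullet}) \ne 0$ only for $p \le 0$, the total complex of $\mathrm{Bar}^{\bullet}(A, N, V^{\bullet})$ is also homotopically equivalent to zero over $A$. Thus
$$\mathrm{Hom}_{K(A)}(V^{\bullet}, \mathrm{Hom}^{\bullet}_{A}(\mathrm{Bar}^{\bullet}(A, N, A), \mathrm{Bar}^{\bullet}(A, B, M))) = 0,$$
and we are done.

It follows from the previous lemma and Lemma 3.4.2(iii) that one can treat $\mathrm{Ext}_{A}^{\frac{\infty}{2}+\bullet}(L, M)$ as an exotic derived functor of the functor $M \mapsto \mathrm{Hom}_{A^{\sharp}}(L, S_A \otimes_A M)$ (cf. [V], 3.9). In particular in [V], 3.2.1, it is proved that in the case of $A = U(\mathfrak{a})$ for some graded Lie algebra $\mathfrak{a}$ the algebra $A^{\sharp}$ differs from $A$ by a 2-cocycle of the Lie algebra $\mathfrak{a}$. One can check directly that the functor $\mathrm{Hom}_{A^{\sharp}}(k, S_A \otimes_A *)$ coincides with the functor of semiinvariants defined in [V], 3.6, on the class of semijective objects.

B.2.3. Recall the construction of the standard complex for the computation of Lie algebra semiinfinite cohomology. For a graded module $M$ over a graded Lie algebra $\mathfrak{a} = \bigoplus_{n \in \mathbb{Z}} \mathfrak{a}_n$ the standard resolution with respect to the graded Lie subalgebra $\mathfrak{b} \subset \mathfrak{a}$ looks as follows:
$$\mathrm{St}^{\bullet}(\mathfrak{a}, \mathfrak{b}, M) := U(\mathfrak{a}) \otimes_{U(\mathfrak{b})} \Lambda^{\bullet}(\mathfrak{a}/\mathfrak{b}) \otimes M.$$
Here the $\mathfrak{b}$-module $\Lambda(\mathfrak{a}/\mathfrak{b})$ is just the direct sum of the exterior powers of the $\mathfrak{b}$-representation in $\mathfrak{a}/\mathfrak{b}$, the tensor product of $\mathfrak{a}$-modules over the base field is defined using the Hopf algebra structure on $U(\mathfrak{a})$, and the differential is written as follows:
$$d\big((u \otimes a_1 \wedge \dots \wedge a_n) \otimes m\big) = \sum_{i=1}^{n} (-1)^{i} (u a_i \otimes a_1 \wedge \dots \wedge a_{i-1} \wedge a_{i+1} \wedge \dots \wedge a_n) \otimes m + \sum_{i<j} (-1)^{i+j} (u \otimes [a_i, a_j] \wedge a_1 \wedge \dots \wedge a_{i-1} \wedge a_{i+1} \wedge \dots \wedge a_{j-1} \wedge a_{j+1} \wedge \dots \wedge a_n) \otimes m.$$
Here $a_i \in \mathfrak{a}/\mathfrak{b}$. One can check that the differential is correctly defined. Consider the triangular decomposition of the Lie algebra $\mathfrak{a}$:
$$\mathfrak{a}_{\le 0} := \bigoplus_{n \le 0} \mathfrak{a}_n, \qquad \mathfrak{a}_{>0} := \bigoplus_{n > 0} \mathfrak{a}_n, \qquad \mathfrak{a} = \mathfrak{a}_{\le 0} \oplus \mathfrak{a}_{>0}$$
as a vector space. Clearly $\mathrm{St}^{\bullet}(\mathfrak{a}, \mathfrak{a}_{\le 0}, M)$ belongs to $C^{\uparrow}(U(\mathfrak{a}))$.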


Let $\mathfrak{a}^{\sharp}$ be the central extension of $\mathfrak{a}$ such that $U(\mathfrak{a})^{\sharp} = U(\mathfrak{a}^{\sharp})$. Then, as before,
$$K^{\bullet}_{U(\mathfrak{a})}(k, M) = \mathrm{Hom}^{\bullet}_{U(\mathfrak{a}^{\sharp})}\big(\mathrm{St}^{\bullet}(\mathfrak{a}^{\sharp}, \mathfrak{a}^{\sharp}_{>0}, k),\ S_{U(\mathfrak{a})} \otimes_{U(\mathfrak{a})} \mathrm{St}^{\bullet}(\mathfrak{a}, \mathfrak{a}_{\le 0}, M)\big),$$
$$\mathrm{Ext}_{U(\mathfrak{a})}^{\frac{\infty}{2}+\bullet}(k, M) = H^{\bullet}(K^{\bullet}_{U(\mathfrak{a})}(k, M)).$$
The complex $K^{\bullet}_{U(\mathfrak{a})}(k, M)$ is exactly the standard complex for the computation of Lie algebra semiinfinite cohomology consisting of semiinfinite exterior powers (see e.g. [V], 2.5). That gives another proof of the coincidence of the semiinfinite Ext functor and Lie algebra semiinfinite cohomology.

References

[A] Andersen, H.H.: Representations of quantum groups, invariants of 3-manifolds and semisimple tensor categories. Israel Math. Conf. Proc. 7, 1–12 (1993)
[AJS] Andersen, H.H., Jantzen, J.C., Soergel, W.: Representations of quantum groups at p-th roots of unity and of semisimple groups in characteristic p: independence of p. Astérisque 220 (1994)
[Ar] Arkhipov, S.M.: Semiinfinite cohomology of quantum groups at roots of unity. Preprint (1994)
[CG] Chriss, N., Ginzburg, V.: Representation theory and complex geometry. Boston: Birkhäuser, 1995
[DCK] De Concini, C., Kac, V.: Representations of quantum groups at roots of unity. In: A. Connes et al. (eds.), Operator algebras, unitary representations, enveloping algebras and invariant theory. (Colloque Dixmier), Proc. Paris 1989, (Progr. in Math. 92), Boston etc.: Birkhäuser, pp. 471–506
[DCKP] De Concini, C., Kac, V., Procesi, C.: Some remarkable degenerations of quantum groups. Commun. Math. Phys. 157, 405–427 (1993)
[F] Feigin, B.: Semi-infinite cohomology of Kac-Moody and Virasoro Lie algebras. Usp. Mat. Nauk 33, no. 2, 195–196 (1984) (in Russian)
[FSV] Feigin, B., Schechtman, V., Varchenko, A.: On algebraic equations satisfied by hypergeometric correlators in WZW models. II, Commun. Math. Phys. 170, 219–247 (1995)
[Fi] Finkelberg, M.: An equivalence of fusion categories. Harvard Ph.D. thesis, (1993)
[FiS] Finkelberg, M., Schechtman, V.: Localization of u-modules. I. Intersection cohomology of real arrangements. Preprint hep-th/9411050 (1994) 1–23; II. Configuration spaces and quantum groups. Preprint q-alg/9412017 (1994) 1–59; III. Tensor categories arising from configuration spaces. Preprint q-alg/9503013 (1995) 1–59; IV. Localization on P^1. Preprint q-alg/9506011 (1995), 1–31; V. Localization of modules over small quantum groups. Yu.I. Manin Festschrift, Part 2, J. Math. Sc. 82, no. 1, 3127–3164 (1996)
[GeM] Gelfand, S.I., Manin, Yu.I.: Methods of homological algebra. Moscow: Nauka, (1988) (in Russian)
[GK] Ginzburg, V., Kumar, S.: Cohomology of quantum groups at roots of unity. Duke Math. J. 69, 179–198 (1993)
[H] Hesselink, W.H.: On the character of the nullcone. Math. Ann. 252, 179–182 (1980)
[J] Jantzen, J.C.: Representations of algebraic groups. Pure and Appl. Math. 131, Orlando, Fl: Academic Press, 1987
[K] Kempf, G.: The Grothendieck-Cousin complex of an induced representation. Adv. in Math. 29, 310–396 (1978)
[KL] Kazhdan, D. and Lusztig, G.: Tensor structures arising from affine Lie algebras. I, J. Am. Math. Soc. 6, 905–947 (1993); II, J. Am. Math. Soc. 6, 949–1011 (1993); III, J. Am. Math. Soc. 7, 335–381 (1994); IV, J. Am. Math. Soc. 7, 383–453 (1994)
[L1] Lusztig, G.: Quantum groups at roots of unity. Geom. Dedicata 35, 89–114 (1990)
[L2] Lusztig, G.: Modular representations and quantum groups. Contemp. Math. 82, 59–77 (1989)
[V] Voronov, A.: Semi-infinite homological algebra. Invent. Math. 113, 103–146 (1993)
[Xi] Nanhua, Xi: Representations of finite dimensional Hopf algebras arising from quantum groups. Preprint (1989)

Communicated by G. Felder

Commun. Math. Phys. 188, 407 – 437 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Generic Metrics and Connections on Spin- and Spin$^c$-Manifolds

Stephan Maier
Mathematisches Institut, Universität Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland. E-mail: [email protected]

Received: 3 July 1996 / Accepted: 27 February 1997

Abstract: We study the dependence of the dimension $h^0(g, A)$ of the kernel of the Atiyah–Singer Dirac operator $D_{g,A}$ on a spin$^c$-manifold $M$ on the metric $g$ and the connection $A$. The main result is that in the case of spin-structures the value of $h^0(g)$ for the generic metric is given by the absolute value of the index provided $\dim M \in \{3, 4\}$. In dimension 2 the mod-2 index theorems have to be taken into account and we obtain an extension of a classical result in the theory of Riemann surfaces. In the spin$^c$-case we also discuss upper bounds on $h^0(g, A)$ for generic metrics, and we obtain a complete result in dimension 2. The much simpler dependence on the connection $A$ and applications to Seiberg–Witten theory are also discussed.

Contents
1 Introduction
2 The Dependence of $D_{g,A}$ on the Metric and Connection
3 Generic Metrics and Connections
4 The Obstruction Formula
5 Partial Proofs
6 Dimension 2
7 Applications to Riemann Surfaces
8 Dimensions 3 and 4
9 Critical Eigenvalues $\ne 0$
10 A Remark on Seiberg–Witten Moduli Spaces
11 Appendix: Analytic Families of Differential Operators

1. Introduction Given a spin-(spinc -)manifold M much effort has been invested in the study of the Dirac operator Dg on spinors for particular metrics. Alternatively, metrics have been used as


an auxiliary tool to derive topological information about the underlying spin-manifold. Via the index formula, knowledge of $\dim \mathrm{Ker} D_g$ may lead to topological obstructions. Thus if for instance $\mathrm{Ker} D_g = \{0\}$ on a closed $4k$-manifold $M$ then the $\hat{A}$-genus of $M$ vanishes. In particular, this is true if the metric $g$ has positive scalar curvature [L]. Thus the dependence of $\dim \mathrm{Ker} D_g$ on the metric has been studied from the very beginning of the subject [L, Hi]. In general, one expects a spin-manifold to have arbitrarily many harmonic spinors for a suitable metric. The existence of harmonic spinors for suitable metrics in dimensions $0, \pm 1$ mod 8 is proved in [Hi, Th. 4.5]. The same holds in dimensions 3 mod 4 [Bär1, Bär2], and in fact the proof of loc. cit. can probably be extended to show that in dimensions 3 mod 4 there are indeed metrics with $h^0(g)$ arbitrarily large [BärP]. However, it has been conjectured that for the generic metric (a term to be made precise) the dimension of the space of harmonic spinors is equal to the absolute value of the index [BG, K2, Bär1]. More precisely,

Problem. Is it true that for a given spin$^c$-structure on a closed $m$-dimensional manifold $M$ with fixed connection $A$ on the canonical bundle we have
$$h^0(g, A) := \dim \mathrm{Ker} D_{g,A} = \begin{cases} |\mathrm{Index}\, D^{+}_{g,A}|, & m \text{ even} \\ 0, & m \text{ odd} \end{cases}$$
for the generic metric $g$ on $M$?

It is the purpose of this article to study this problem. Note that the index of the Dirac operator does not depend on the choice of metric and connection. It is known that in dimensions 1 and 2 mod 8 the statement is not true for spin-manifolds in the form stated because by the mod 2-Index theorems $h^0(g)$ and $h^{+}(g) := \dim \mathrm{Ker} D^{+}_g$ respectively are constant modulo 2 [AtSi] (see Remark 3.3 below). Thus in these dimensions the problem must be rephrased as follows:

Problem. Is it true that for a given spin$^c$-structure on a closed manifold $M$ of dimension 1, 2 mod 8 with fixed connection $A$ on the canonical bundle $L$ the functions $h^0(g, A)$ and $h^{+}(g, A)$ respectively are constant on a generic set of metrics and are either 0 or 1 on this set?

The corresponding problem for variations of complex Kählerian structures has also been formulated [Hi, p. 24] and has been conclusively answered in [K2] by exhibiting a counterexample.

Some of the motivation for studying the dependence of the Dirac operator on the metric comes from Seiberg–Witten theory. Here, it would be desirable to have a priori knowledge about $\dim \mathrm{Ker} D^{+}_{g,A}$ for a suitable metric. However, this is not possible because the connection $A$ is part of a solution of the Seiberg–Witten equations. We shall discuss this issue below in Sect. 9.

The point of view adopted in this paper is the following: The Dirac operator will be viewed as a map $D_{\cdot,A} : \mathcal{M} \to B(H_1(\Sigma), H_0(\Sigma))$, i.e. as a map from the space of metrics to the space of bounded linear operators between suitable Sobolev spaces. The Dirac operator $D_{g,A}$ is a Fredholm operator. Note that the space of Fredholm operators $F := F(H_1(\Sigma), H_0(\Sigma))$ is stratified by the sets $F_{n,k} := \{f \in F,\ \dim \mathrm{Ker}(f) = n,\ \dim \mathrm{Coker}(f) = k\}$. Each such set is a locally closed analytic submanifold of the Banach space $B(H_1(\Sigma), H_0(\Sigma))$ of bounded linear maps [Kos]. As the Dirac operator is a formally self-adjoint operator, $D_{g,A} \in F_{n,n}$ for $n = h^0(g, A)$. We shall show that unless $g$ is subject to certain restrictions, the Fréchet derivative $DD_g$ at $g$ has image not


tangential to $F_{n,n}$. Thus we may slightly perturb $g$ to get a metric $g'$ with $D_{g'} \in F_{n',n'}$, where $h^0(g', A) = n' < n$. If however $\mathrm{Im}\, DD_g$ is tangential to $F_{n,n}$ (in which case we call $g$ critical) this argument fails. The property of $g$ being critical is a geometric condition which can be expressed in terms of a simple formula, the analysis of which yields severe restrictions on the geometry of the Riemannian manifold $(M, g)$. As there are significant differences between the situation in dimension 2 and the situation in dimensions 3 or 4, we state the results separately for each dimension:

Theorem 1.1. Let $M$ be a closed oriented 2-dimensional manifold. For a fixed spin$^c$-structure and a fixed connection $A$ on the canonical bundle $P_{U_1}$ with $c_1(P_{U_1}) \ne 0$ the generic metric satisfies $\dim \mathrm{Ker} D_{g,A} = |\tfrac{1}{2} c_1(P_{U_1})|$. If a given spin-structure is twisted by a connection $B$ on the trivial bundle, thought of as a 1-form $B \in i\Omega^1(M)$, such that $B$ is closed and defines an element in $H^1(M, 2\pi i \mathbb{Z})$, the generic metric satisfies $\dim \mathrm{Ker} D_g = 0$ or $2$ depending only on the spin-structure. For other $B$ the generic metric has no nontrivial harmonic spinors.

This theorem provides a complete answer to the problem. The theorem can be reformulated in the language of the theory of Riemann surfaces, compare Theorem 7.1 below.

Theorem 1.2. Let $M$ be a closed oriented 3-manifold.
(i) For a fixed spin-structure the generic metric has no nontrivial harmonic spinors.
(ii) For a fixed spin$^c$-structure and a fixed connection $A$ on the canonical bundle the dimension of the space of harmonic spinors is at most 2 for a generic metric.

Theorem 1.3. Let $M$ be an oriented closed 4-manifold.
(i) For a fixed spin-structure there are no nontrivial harmonic spinors of negative (positive) chirality for the generic metric if $\mathrm{Index}\, D^{+}_g \ge 0$ ($\le 0$).
(ii) For spin$^c$-structures, the same conclusion as in (i) holds if $\mathrm{Index}\, D^{+}_{g,A} \notin \{0, \pm 1\}$.
(iii) If $\mathrm{Index}\, D^{+}_{g,A} = \pm 1$ then $h^{+} + h^{-} \le 3$ for the generic metric.
(iv) If $\mathrm{Index}\, D^{+}_{g,A} = 0$ then $h^{+} = h^{-} \le 2$ for the generic metric.

In the spin$^c$-case we may not only vary the metric but also the connection on the canonical bundle. One obtains the following:

Theorem 1.4. Let $(M, g)$ be a Riemannian spin$^c$-manifold with fixed spin$^c$-structure.
(i) If $\dim M = 2$ or $4$ then for the generic connection on the canonical bundle there are no nontrivial negative (positive) harmonic spinors provided $\mathrm{Index}\, D^{+}_{g,A} \ge 0$ ($\le 0$).
(ii) If $\dim M = 1$ or $3$ there are no nontrivial harmonic spinors for the generic connection.
(iii) The same conclusions hold if both metric and connection are varied.

This result has been proved independently by N. Anghel [Ang, Th. 1.5], and the four-dimensional case is contained in [Mor, Lem. 6.9.3].

It is natural to consider not only variations of the 0-eigenvalue but of other eigenvalues, too. In fact, we shall formulate the more general results for arbitrary eigenvalues. The main difference in the discussion of zero- and nonzero eigenvalues stems from the fact that only the dimension of the 0-eigenspace is a conformal invariant whereas the dimension of the other eigenspaces varies with the metric in a conformal class. Thus in


the discussion of the 0-eigenvalue (in dimensions > 2) the main difficulty will be to fix a suitable metric in the given conformal class. As the results for nonzero eigenvalues seem to be of lesser importance we refer the reader to Sect. 8 for a statement of results. This paper is organized as follows: We shall first discuss the dependence of the Dirac operator on both the metric on the base-manifold and the connection on the canonical bundle. Our discussion is essentially an extension of the corresponding discussion in [BG], but we prefer to alter their definitions in order to better take into account conformal rescaling. We shall then define and discuss the term “generic" before describing formulas which describe a first-order obstruction to the existence of deformations of the metric and/or the connection on the canonical bundle which reduce the dimension of the space of harmonic spinors. In fact, we shall prove the obstruction formula for all eigenvalues, not only for the 0-eigenvalue. Restricting the discussion to harmonic spinors, the aim is then to show that in dimensions 2 to 4 this obstruction is indeed only a first-order obstruction, i.e. that unless the metric and/or connection is minimal there are deformations which do indeed reduce the dimension of the space of harmonic spinors. As an immediate application we first prove the rather simple Theorem 1.4 and we make preliminary remarks on dimensions 3 and 4. Then we discuss the case dimM = 2 where the main feature is Serre-duality whereas conformal invariance plays no role. As indicated above, Theorem 1.1 has a translation into the language of the theory of Riemann surfaces. This translation is carried out in Sect. 7. In dimensions 3 and 4 conformal invariance is the key-feature and most effort has to be put into the conformal fixing of the metric. It might be tempting to choose the metric within the conformal class such that the scalar-curvature is constant, but that approach seems to lead nowhere. 
Instead, we will locally rescale the metric such that harmonic spinors will have constant length. We shall then consider nonzero eigenvalues. Here, the main feature is the appearance of Killing spinors which allows us to prove that for the generic metric in dimension 2 or 3 there are no $\lambda$-eigenspinors for a fixed number $\lambda \ne 0$. Finally, we shall briefly discuss the Seiberg–Witten moduli spaces. The upshot of the discussion is the observation that for any connection $A$ on the canonical bundle which comes from a solution to the Seiberg–Witten equations with parameter a metric $g$, the pair $(g, A)$ in general is non-generic in our sense if $\mathrm{Index}\, D^{+}_{g,A} \le 0$. In an appendix we prove a result for analytic families of elliptic operators which is implicit in the literature but for which no general statement and proof seems to be known. We make use of the theorem in our discussion of generic metrics and connections.

2. The Dependence of $D_{g,A}$ on the Metric and Connection

This section contains an exposition of the results of Bourguignon and Gauduchon [BG] with the aim of extending their discussion to variations of the Dirac operator with respect to variations of connections on twisting bundles. In addition, we shall redefine the identification of spinor bundles for different metrics so as to take into account the $L^2$-Hilbert space structure induced on the spinor bundles by the corresponding volume forms.

2.1. Preliminaries. First, let us briefly review the terminology which we shall employ. For a thorough exposition see for example [LM]. Given an $m$-dimensional Riemannian manifold $(M, g)$ we shall by $P_{SO}(M)$ denote the bundle of orthonormal frames.


The manifold $M$ is spin if and only if there is a 2-fold connected cover of $P_{SO}(M)$ such that on each fibre the covering map reduces to the standard two-fold cover $\rho : \mathrm{Spin}_m \to SO_m$. Such a covering is a principal $\mathrm{Spin}_m$-bundle and we denote this bundle by $P_{Spin}(M, g)$. Similarly, $M$ is spin$^c$ if and only if there is an $S^1$-bundle $P$ and a connected double cover of the fibre product $P_{SO}(M) \times_M P$ which on each fibre is the two-fold covering map $\tilde{\rho} : \mathrm{Spin}^c_m \to SO_m \times S^1$. Such a cover is a principal $\mathrm{Spin}^c_m$-bundle which we shall denote by $P_{Spin^c}(M, g, P)$. We shall refer to $P$ as the canonical bundle of the spin$^c$-structure.

If $m$ is even let $\Sigma_m$ be the irreducible module for the Clifford algebra $Cl_m$, and if $m$ is odd let $\Sigma_m$ be the irreducible module for $Cl_m$ on which the volume element $i^{[(m+1)/2]} e_1 \cdots e_m$ acts as $+\mathrm{Id}$. Given a spin- or spin$^c$-structure, we form the spinor bundles $\Sigma_g := P_{Spin}(M, g) \times_{rep} \Sigma_m$ and $\Sigma_g := P_{Spin^c}(M, g, P) \times_{rep} \Sigma_m$ respectively (where $rep$ denotes the representation of $\mathrm{Spin}_m$ and $\mathrm{Spin}^c_m$ respectively which come from the standard embedding $\mathrm{Spin}_m \subset \mathrm{Spin}^c_m \subset Cl_m$). Note that in even dimensions $\Sigma$ splits into the $\pm$-eigenbundles for the (fibrewise) action of the volume element.

In the spin-case the Atiyah–Singer Dirac operator $D_g$ acting on sections of $\Sigma$ is defined by
$$D_g : C^{\infty}(\Sigma) \xrightarrow{\ \tilde{\nabla}^{g}\ } \Omega^1(M) \otimes C^{\infty}(\Sigma) \xrightarrow{\ \cong\ } C^{\infty}(TM) \otimes C^{\infty}(\Sigma) \longrightarrow C^{\infty}(\Sigma),$$
where the last arrow is Clifford multiplication, and where $\tilde{\nabla}^{g}$ denotes the connection on $\Sigma$ induced by the Levi-Civita connection on $P_{SO}(M, g)$. In the spin$^c$-case, given a connection on the canonical bundle $P$, we get an induced connection $\tilde{\nabla}^{g,A}$ on $\Sigma$ and thus the Atiyah–Singer Dirac operator $D_{g,A}$ acting on sections of $\Sigma$.

2.2. The Identification. In order to compare the Dirac operator on a fixed manifold $M$ with fixed spin-(spin$^c$-)structure for different metrics (and connections on the canonical bundle) we need a canonical way of identifying the spinor bundles $\Sigma_g$ and $\Sigma_h$ for different metrics $g$ and $h$. We shall briefly review how this is done [BG].

Consider for the moment a real $m$-dimensional vector space $V$. Given two metrics $g, h \in \mathrm{Sym}(V^{*} \otimes V^{*})$ there is a unique positive endomorphism $H$ of $V$ such that $h(., .) = g(H., .)$. Let $b := H^{-1/2}$. If $E$ is a $g$-orthonormal frame then $b(E)$ is an $h$-orthonormal frame. Thus $b$ defines a smooth $SO_m$-equivariant map of the manifold of $g$-orthonormal frames $P(g)$ to the manifold of $h$-orthonormal frames $P(h)$. Let $g_t := (1 - t)g + th$, and let $b_t : P(g) \to P(g_t)$ be the associated map. Let $\pi : \tilde{P}(g_t) \to P(g_t)$ be the connected 2-fold covering which (after a choice of basepoint) we may identify with the connected 2-fold covering $\rho : \mathrm{Spin}_m \to SO_m$. Given $E \in P(g)$ choose $\tilde{E} \in \tilde{P}(g)$ such that $\pi(\tilde{E}) = E$. Then the path $(t, b_t(E)) \subset \bigcup_{t \in [0,1]} P(g_t)$ lifts uniquely to a path $\beta_t(\tilde{E})$ in $\bigcup_{t \in [0,1]} \tilde{P}(g_t)$ such that $\beta_0(\tilde{E}) = \tilde{E}$. Clearly, we have $\beta_t(\tilde{E}.q) = \beta_t(\tilde{E}).q$ for $q \in \mathrm{Spin}_m$. We thus get a $\mathrm{Spin}_m$-equivariant map $\beta_{h,g} = \beta_1 : \tilde{P}(g) \to \tilde{P}(h)$. Of course, in the preceding discussion we may replace the path $g_t$ of metrics by any smooth path of metrics connecting $g$ and $h$. The resulting map $\beta_1$ is independent of the path chosen because the space of metrics is contractible.

Note that because of the invariant description we may extend $b_{h,g}$ and $\beta_{h,g}$ to bundles to obtain $SO_m$- resp. $\mathrm{Spin}_m$-equivariant smooth bundle maps $b_{h,g} : P_{SO}(M, g) \to P_{SO}(M, h)$ and $\beta_{h,g} : P_{Spin}(M, g) \to P_{Spin}(M, h)$ (provided $M$ is spin), such that $\beta_{h,g}$ covers $b_{h,g}$. Of course, we have $\beta_{h,g}^{-1} = \beta_{g,h}$.


Similarly, if $M$ is spin$^c$, fix a spin$^c$-structure with canonical bundle $P$. The $SO_m \times S^1$-equivariant bundle map $b_{h,g} \times \mathrm{Id}$ lifts to a $\mathrm{Spin}^c_m$-equivariant bundle map $\beta_{h,g} : P_{Spin^c}(M, g, P) \to P_{Spin^c}(M, h, P)$.

The map $\beta_{h,g}$ extends to an isometry $\beta_{h,g} : \Sigma_g \to \Sigma_h$ of Hermitian bundles. For any pair $(g, A)$ and $(h, B)$ of metrics and connections on $P$ denote by $\tilde{\nabla}^{g,A}$ and $\tilde{\nabla}^{h,B}$ respectively the connections induced on $\Sigma_g$ resp. $\Sigma_h$ by the Levi-Civita connections on $TM$ and $A$ resp. $B$ on $P$. Then $\beta_{h,g}^{-1} \circ \tilde{\nabla}^{h,B} \circ \beta_{h,g}$ is a connection on $\Sigma_g$, and in fact it is the connection induced by the pair $(b_{h,g}^{-1} \circ \tilde{\nabla}^{h,B} \circ b_{h,g}, B)$. Note that $g$ is $b_{h,g}^{-1} \circ \nabla^{h,B} \circ b_{h,g}$-parallel but that $b_{h,g}^{-1} \circ \nabla^{h,B} \circ b_{h,g}$ is usually not torsion-free. Also note that we have the following identity:
$$\beta_{h,g}(X.s) = b_{h,g}(X).\beta_{h,g}(s).$$

It may now be tempting to use $\beta_{h,g}$ to pull back the Dirac operator on sections of $\Sigma_h$ to a differential operator on sections of $\Sigma_g$ [BG]. However, even though $\beta_{h,g}$ induces an isometry of Hermitian bundles it does not induce an isometry of the Hilbert spaces $L^2(\Sigma_g, dvol_g)$ and $L^2(\Sigma_h, dvol_h)$, where $dvol_g$ and $dvol_h$ denote the volume forms. Instead, let a positive function $f_{h,g}$ be defined by $dvol_h = f_{h,g}^{2}\, dvol_g$ and set
$$\hat{\beta}_{h,g} := \frac{1}{f_{h,g}} \beta_{h,g}.$$
This $\hat{\beta}_{h,g}$ induces an isometry of the Hilbert spaces $L^2(\Sigma_g, dvol_g)$ and $L^2(\Sigma_h, dvol_h)$. The pull-back
$$\bar{D}_{h,B} := \hat{\beta}_{h,g}^{-1} \circ D_{h,B} \circ \hat{\beta}_{h,g}$$
then has the same properties (symmetry, self-adjoint closure etc.) as $D_{h,B}$. We have
$$\bar{D}_{h,B} = f_{h,g}\, \beta_{h,g}^{-1} \circ D_{h,B} \circ f_{h,g}^{-1} \beta_{h,g} = \beta_{h,g}^{-1} \circ D_{h,B} \circ \beta_{h,g} - f_{h,g}^{-1}\, b_{g,h}(\mathrm{grad}_h f_{h,g}),$$
where $b_{g,h}(\mathrm{grad}_h f_{h,g})$ operates via Clifford multiplication. For any smooth function $f$ we have $g(b_{g,h}(\mathrm{grad}_h f), .) = g(b_{h,g}(\mathrm{grad}_g f), .)$. We thus obtain
$$\bar{D}_{h,B} = \beta_{h,g}^{-1} \circ D_{h,B} \circ \beta_{h,g} - f_{h,g}^{-1}\, b_{h,g}(\mathrm{grad}_g f_{h,g}).$$
2.3. Computing the derivative of the Dirac Operator. We shall have to compute the derivative of $\bar D_{h,B}$ with respect to $h$ and $B$. First, note that the second summand does not depend on $B$. We shall compute this term first: Pick $k \in C^\infty\mathrm{Sym}(T^*M \otimes T^*M)$ and let $g_t := g + tk$ for small $t$. Then
\[ b_{g_t,g} = (\mathrm{Id} + tK)^{-\frac12} , \]
where $K \in C^\infty\mathrm{Sym}_g(TM)$ is defined by $g(K., .) = k(., .)$. Thus $\frac{d}{dt}\big|_{t=0} b_{g_t,g} = -\frac12 K$. Now $dvol_{g_t} = \sqrt{\det(I + tK)}\, dvol_g$. Hence $f_{g_t,g} = (\det(I + tK))^{1/4}$ and $\frac{d}{dt}\big|_{t=0} f_{g_t,g} = \frac14 \mathrm{Tr}_g k$. Note that $f_{g,g} \equiv 1$ and thus
\[ \frac{d}{dt}\Big|_{t=0} \frac{1}{f_{g_t,g}}\, b_{g_t,g}(\mathrm{grad}_g f_{g_t,g}) = \frac14\, \mathrm{grad}_g(\mathrm{Tr}_g k) . \]
To deal with the first summand we shall write it in terms of a local frame: If $\{e_1, \dots, e_m\}$ is a local $g$-orthonormal frame on some open contractible set $U \subset M$ one may compute

\[ \beta_{h,g}^{-1}\circ D_{h,B}\circ\beta_{h,g} = \sum_{i=1}^m e_i.\tilde\nabla^{g,A}_{b_{h,g}(e_i)} + \sum_{i=1}^m e_i.\Bigl(\beta_{h,g}^{-1}\circ\tilde\nabla^{h,B}_{b_{h,g}(e_i)}\circ\beta_{h,g} - \tilde\nabla^{g,A}_{b_{h,g}(e_i)}\Bigr) , \]

see [BG]. We may think of $\Sigma_g$ over $U$ as coming from a spin-structure tensor product $P_{Spin}(U,g)$. Given a Hermitian connection $A$ on $U \times \mathbb{C}$ write $A$ as $A = d + \phi_A$, $\phi_A \in i\Omega^1(U)$. Let $\tilde\nabla^g$ be the connection on $P_{Spin}(U,g) \times_\rho \Sigma_m$ induced by the Levi-Civita connection. Then $\tilde\nabla^{g,A} = \tilde\nabla^g + \frac12\phi_A$ over $U$. It is then immediate that
\[ \frac{d}{dt}\Big|_{t=0} \bar D_{g,A+ta} = \frac12\, a, \quad a \in i\Omega^1(M) , \]
where $a$ acts via Clifford multiplication.

Finally, we are left with computing $\frac{d}{dt}\big|_{t=0}\, \beta_{g_t,g}^{-1}\circ D_{g_t,A}\circ\beta_{g_t,g}$ for $g_t := g + tk$. This has been done in [BG], where the following formula is obtained:
\[ \frac{d}{dt}\Big|_{t=0} \beta_{g_t,g}^{-1}\circ D_{g_t,A}\circ\beta_{g_t,g} = -\frac12 \sum_i e_i.\tilde\nabla^{g,A}_{K(e_i)} + \frac14\bigl(d(\mathrm{Tr}_g k) - \mathrm{div}_g k\bigr) . \]
Note that in comparison to [BG] we prefer to use the opposite sign convention for the divergence operator. We obtain the following formula which is an immediate consequence of the preceding discussion:

Proposition 2.4. The derivative of $\bar D_{g,A}$ at $(g,A)$ in the direction $(k,a)$, $k \in C^\infty\mathrm{Sym}(T^*M \otimes T^*M)$ and $a \in i\Omega^1(M)$, is given by
\[ (D\bar D)_{(g,A)}(k,a) = -\frac12 \sum_i e_i.\tilde\nabla^{g,A}_{K(e_i)} - \frac14\, \mathrm{div}_g k + \frac12\, a , \]
where in the last two terms the 1-forms act via Clifford multiplication.

Remark 2.5. More generally, if $E$ is a complex vector bundle with connection $\nabla^E$ we may compute the Fréchet derivative of $\bar D_{g,A,\nabla^E}$ on the twisted spinor bundle $\Sigma_g \otimes E$. The same computation as above then yields:
\[ (D\bar D)_{(g,A,\nabla^E)}(k,a,\Phi) = -\frac12 \sum_i e_i.\tilde\nabla^{g,A,\nabla^E}_{K(e_i)} - \frac14\, \mathrm{div}_g k + \frac12\, a + \sum_i e_i. \otimes \Phi(e_i) , \]
where $\Phi \in \Omega^1(M) \otimes \mathrm{End}(E)$.

Remark 2.6. It should be remarked that the conformal invariance of the dimension of the space of harmonic spinors is not only a feature of the Atiyah-Singer operator but is a quite general phenomenon. More precisely, let $M$ be a spin$^c$-manifold with fixed spin$^c$-structure, a metric $g$ and a connection $A$ on the canonical bundle. Let $\rho : Cl_m \to \mathrm{End}(W)$ be any hermitian representation and form the bundle $\Sigma = P_{Spin^c}(M,g,P_{U_1}) \times_\rho W$, and let $E$ be any complex vector bundle with connection. Then the dimension of the space of harmonic spinors of the twisted Dirac operator on $\Sigma \otimes E$ is a conformal invariant. The proof (which involves computations similar to the ones above) proceeds precisely as in [Hi;BFGK,Th.13;Hij1,Prop.4.3.1]. In fact, if $h = e^{2f}g$ set $\bar\beta_{h,g} := e^{-\frac{m-1}{2}f}\beta_{h,g}$. Then $D_{h,A,E} = e^{-f}\,\bar\beta_{h,g}\circ D_{g,A,E}\circ\bar\beta_{h,g}^{-1}$.
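The conformal invariance claimed in Remark 2.6 can be read off directly from the intertwining formula $D_{h,A,E} = e^{-f}\,\bar\beta_{h,g}\circ D_{g,A,E}\circ\bar\beta_{h,g}^{-1}$. A minimal sketch, where $\Psi$ denotes an arbitrary section of $\Sigma_g \otimes E$ (a symbol introduced here for illustration):

```latex
% Sketch: \bar\beta_{h,g} maps g-harmonic spinors to h-harmonic spinors.
\begin{align*}
D_{h,A,E}\bigl(\bar\beta_{h,g}\Psi\bigr)
  &= e^{-f}\,\bar\beta_{h,g}\, D_{g,A,E}\,\bar\beta_{h,g}^{-1}\bar\beta_{h,g}\Psi
   = e^{-f}\,\bar\beta_{h,g}\bigl(D_{g,A,E}\Psi\bigr) .
\end{align*}
% Since e^{-f} > 0 and \bar\beta_{h,g} is a fibrewise isomorphism,
%   D_{g,A,E}\Psi = 0  <=>  D_{h,A,E}(\bar\beta_{h,g}\Psi) = 0,
% so \bar\beta_{h,g} restricts to an isomorphism Ker D_{g,A,E} -> Ker D_{h,A,E}.
```

Note that this argument applies only to the eigenvalue $0$: for $\lambda \neq 0$ the factor $e^{-f}$ prevents $\bar\beta_{h,g}$ from mapping $\lambda$-eigenspinors to eigenspinors unless $f$ is constant.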


3. Generic Metrics and Connections

Definition. Let $E \to M$ be a smooth (real or complex) vector bundle over the closed manifold $M$, and let $\mathcal{E} \subset C^\infty(E)$ be a $C^0$-open subset of smooth sections of $E$. We shall call a subset $\mathcal{E}' \subset \mathcal{E}$ $C^k$-generic in $\mathcal{E}$ if $\mathcal{E}'$ is $C^\infty$-dense and $C^k$-open in $\mathcal{E}$.

Note that if $\mathcal{E}'$ is $C^k$-generic in $\mathcal{E}$ then it is also $C^l$-generic for any $l > k$. In our applications, $\mathcal{E} = \mathcal{M} \subset C^\infty\mathrm{Sym}(T^*M \otimes T^*M)$, $\mathcal{E} = \mathcal{M} \times \mathcal{A}$ and $\mathcal{E} = \mathcal{A}$ according to context, where $\mathcal{M}$ denotes the set of smooth metrics on $M$ and $\mathcal{A} = i\Omega^1(M)$.

In the sequel consider the Dirac operator defined on a bundle $\Sigma$ obtained from $P_{Spin}(M,g)$ and $P_{Spin^c}(M,g,P_{U_1})$ respectively by a hermitian representation $\rho : Cl_m \to \mathrm{End}(W)$. Let $\mathcal{M}^\lambda_{\min} \subset \mathcal{M}$ (alternatively $(\mathcal{M} \times \mathcal{A})^\lambda_{\min} \subset \mathcal{M} \times \mathcal{A}$, or $\mathcal{M}^\lambda_{\min}(A) \subset \mathcal{M} \times \{A\}$, or indeed $\mathcal{A}^\lambda_{\min}(g) \subset \{g\} \times \mathcal{A}$) denote the set of metrics (of metrics and connections on the canonical bundle, of metrics, of connections on the canonical bundle) for which $\dim\mathrm{Ker}(D_{g,A} - \lambda)$ is minimal among all possible choices (in the third case we assume the connection to be fixed, in the fourth case we assume a metric $g$ to be fixed).

Proposition 3.1. The sets $\mathcal{M}^\lambda_{\min} \subset \mathcal{M}$, $(\mathcal{M} \times \mathcal{A})^\lambda_{\min} \subset \mathcal{M} \times \mathcal{A}$, and $\mathcal{M}^\lambda_{\min}(A) \subset \mathcal{M} \times \{A\}$ are $C^1$-generic. The set $\mathcal{A}^\lambda_{\min}(g) \subset \mathcal{A}$ is $C^0$-generic.

Proof. Suppose $M$ is spin. We shall argue the first case: Fix a connection $\nabla$ on $\Sigma$. Then $D_g = S_1 \circ \nabla + S_2$, where $S_1 \in C^\infty\mathrm{Hom}(\Omega^1(M) \otimes \Sigma, \Sigma)$ and $S_2 \in C^\infty\mathrm{End}(\Sigma)$. Then
\[ \|D_g s\|_{L^2} \le \max|S_1| \cdot \|\nabla s\|_{L^2} + \max|S_2| \cdot \|s\|_{L^2} \le \mathrm{const} \cdot (\max|S_1| + \max|S_2|)\,\|s\|_{H_1} . \]
$S_1$ and $S_2$ depend only on $g$ and its first derivatives. Thus $g \mapsto D_g \in B(H_1(\Sigma), H_0(\Sigma))$ is continuous in the $C^1$-topology on $\mathcal{M}$. If $\dim\mathrm{Ker}\,D_g$ is minimal then so is $\dim\mathrm{Ker}\,D_{g'}$ for $D_{g'}$ in a neighbourhood of $D_g$ in the norm topology on $B(H_1(\Sigma), H_0(\Sigma))$. This shows that $\mathcal{M}^\lambda_{\min}$ is $C^1$-open. Let $g \in \mathcal{M}^\lambda_{\min}$ and $h \in \mathcal{M}$. Set $g_t := (1-t)g + th$. The family of operators $\bar D_{g_t}$ is self-adjoint and analytic in $t$ in the sense of the appendix. Proposition 11.4 of this appendix shows that for all but finitely many $t \in [0,1]$ we have $g_t \in \mathcal{M}^\lambda_{\min}$. It follows that $\mathcal{M}^\lambda_{\min}$ is $C^\infty$-dense in $\mathcal{M}$. The case $(\mathcal{M} \times \mathcal{A})^\lambda_{\min} \subset \mathcal{M} \times \mathcal{A}$ is argued similarly. In the case $\mathcal{A}^\lambda_{\min}(g) \subset \mathcal{A}$ note that with $D_{g,A} = S_1 \circ \nabla + S_2$ the sections $S_1$ and $S_2$ depend continuously on $A$. The argument now proceeds as before.

Example 3.2. Suppose $M$ is spin. If $M$ has a metric $g$ of positive scalar curvature, then by the preceding proposition we know that for each metric $h$ in the $C^1$-generic set $\mathcal{M}^\lambda_{\min}$ of metrics on $M$ there are no harmonic spinors, because the Dirac operator for the metric $g$ has none by [L;LM,Cor.8.9]. Thus, because for simply connected closed manifolds of dimension $m \ge 5$ the existence of positive scalar curvature metrics is equivalent to the vanishing of certain topological obstructions [GL, Sto], we find a rich class of spin-manifolds for which the answer to the problem in the introduction is affirmative.

Remark 3.3. As stated in the introduction, the problem in its original form does not hold in dimensions 1, 2 mod 8. To see this let $\overline{K3}$ be a K3-surface with the opposite orientation and define $M := \overline{K3}\#(S^1 \times S^3)$. Then $M$ has signature $\sigma(M) = 16$. Let $Y_1 := M \times M \times S^1$ and $Y_2 := M \times M \times F$, where $F$ is a closed 2-manifold of genus $\ge 2$. Choose any spin-structure on $M$ and take the spin-structure on $S^1$ which does not extend

to the disc, and then furnish $Y_1$ with the product spin-structure. By multiplicativity of the spin-number we see that $h_0(g) \equiv 1 \bmod 2$ [AtSi,Th.3.1]. Similarly, by Remark 3 of [At,p.60] we see that $Y_2$ has a spin-structure with $h_0(g) \equiv 1 \bmod 2$.

Convention. We shall refer to metrics in $\mathcal{M}^\lambda_{\min}$ as either minimal or generic. Similarly, we shall call metrics in $\mathcal{M}^\lambda_{\min}(A)$ respectively connections in $\mathcal{A}^\lambda_{\min}(g)$ minimal or generic, and pairs in $(\mathcal{M} \times \mathcal{A})^\lambda_{\min}$ are referred to as either minimal or generic, too.

4. The Obstruction Formula

Let $M$ be a closed spin$^c$-manifold and fix a metric $g$ and a connection $A$ on the canonical bundle. The formula of the first section shows that $\bar D : \mathcal{M} \to DO_1$ as map from the Fréchet space of smooth metrics to the Fréchet space of differential operators of order 1 is at least $C^1$. Thus so is $\bar D : \mathcal{M} \to B(H_1(\Sigma), H_0(\Sigma))$, where $H_1(\Sigma)$ is the Sobolev space of order 1 and $H_0(\Sigma) = L^2(\Sigma)$. Let $F$ denote the set of Fredholm operators in $B(H_1(\Sigma), H_0(\Sigma))$, and let $F_{n,k}$ denote the stratum $F_{n,k} := \{f \in F,\ \dim\mathrm{Ker}(f) = n,\ \dim\mathrm{Coker}(f) = k\}$. By [Kos] each $F_{n,k}$ is a locally closed analytic submanifold of $B$ and the fibre of the analytic normal bundle of $F_{n,k}$ at $f$ is given by $\mathrm{Hom}(\mathrm{Ker}(f), \mathrm{Coker}(f))$.

Suppose $\bar D_{g,A} - \lambda \in F_{n,n}$ for a fixed $\lambda \in \mathbb{R}$ (recall that because $\bar D_{g,A} - \lambda$ is formally self-adjoint we have $\mathrm{Ker}(\bar D_{g,A} - \lambda) = \mathrm{Coker}(\bar D_{g,A} - \lambda) \subset C^\infty(\Sigma)$). If there is $(k,a) \in \mathrm{Sym}(T^*M \otimes T^*M) \times i\Omega^1(M)$ such that $(D\bar D)_{g,A}(k,a)$ is not tangential to $F_{n,n}$ then for small $t$ the operator $\bar D_{g_t,A_t} - \lambda$ will not be in $F_{n,n}$. Here, as before, $g_t = g + tk$ and $A_t = A + ta$. By upper semicontinuity of the dimension of the kernel of $\bar D_{g_t,A_t}$ for some sufficiently small $t$ we have $\bar D_{g_t,A_t} \in F_{n',n'}$ with $n' < n$.

Note that if we rescale the metric by a constant factor $\mu^2$, $\mu > 0$, we have $\bar D_{\mu^2 g,A} = \frac{1}{\mu}\bar D_{g,A}$. Thus for no eigenvalue $\lambda \neq 0$ can the image of the differential $D\bar D_{g,A}$ at $D_{g,A} - \lambda$ be tangential to $F_{n,n}$ for variations of the metric unless we restrict to such variations which preserve the total volume. Hence

Convention. For brevity's sake we shall call a pair $(g,A)$ critical at the eigenvalue $\lambda$ if the image of $D(\bar D - \lambda)_{g,A}$ restricted to elements $(k,a) \in \mathrm{Sym}(T^*M \otimes T^*M) \times i\Omega^1(M)$ with $\int \mathrm{Tr}_g k\, dvol_g = 0$ is tangential to $F_{n,n}$. Similarly, we call a metric (connection) critical at the eigenvalue $\lambda$ if for a fixed connection (metric) the image of $D(\bar D - \lambda)_{g,A}$ is tangential to $F_{n,n}$, where the derivative is computed with respect to variations in the metric (connection) only.

A good criterion with which to decide whether $\mathrm{Im}\,D\bar D_{g,A}$ is tangential to $F_{n,n}$ is the following:

Proposition 4.1. The pair $(g,A)$ is critical at the eigenvalue $\lambda$ if and only if
(i) $\langle X.\Psi_1, \Psi_2\rangle = 0$,
(ii) $\langle X.\tilde\nabla^{g,A}_X\Psi_1, \Psi_2\rangle + \langle \Psi_1, X.\tilde\nabla^{g,A}_X\Psi_2\rangle = \frac{2\lambda}{m}\langle\Psi_1, \Psi_2\rangle\, g(X,X)$,
(iii) $\langle\Psi_1, \Psi_2\rangle = \mathrm{const}$ if $\lambda \neq 0$,
for all $X \in C^\infty(TM)$ and $\Psi_i \in \mathrm{Ker}(D_{g,A} - \lambda)$. In case we vary the connection only, the condition for $A$ being critical is equivalent to (i), and if we vary the metric only, (ii) and (iii) are equivalent to the metric being critical.
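As an illustration of conditions (ii) and (iii), consider the special case in which $\Psi_1$ and $\Psi_2$ are Killing spinors, i.e. satisfy $\tilde\nabla_X\Psi = -\frac{\lambda}{m}X.\Psi$ with real $\lambda$. The sketch below assumes the usual conventions $X.X. = -g(X,X)$ and $\langle X.\Psi_1, \Psi_2\rangle = -\langle\Psi_1, X.\Psi_2\rangle$ for real vectors $X$; it is a consistency check under these assumptions, not a statement about general eigenspinors.

```latex
% Sketch: Killing spinors are eigenspinors and satisfy (ii) and (iii).
\begin{align*}
D\Psi &= \sum_i e_i.\tilde\nabla_{e_i}\Psi
       = -\frac{\lambda}{m}\sum_i e_i.e_i.\Psi
       = \lambda\Psi , \\
\langle X.\tilde\nabla_X\Psi_1,\Psi_2\rangle
  + \langle\Psi_1, X.\tilde\nabla_X\Psi_2\rangle
  &= -\frac{\lambda}{m}\bigl(\langle X.X.\Psi_1,\Psi_2\rangle
     + \langle\Psi_1, X.X.\Psi_2\rangle\bigr)
   = \frac{2\lambda}{m}\,\langle\Psi_1,\Psi_2\rangle\, g(X,X) , \\
X\langle\Psi_1,\Psi_2\rangle
  &= -\frac{\lambda}{m}\bigl(\langle X.\Psi_1,\Psi_2\rangle
     + \langle\Psi_1, X.\Psi_2\rangle\bigr) = 0 .
\end{align*}
```

The second line reproduces the right-hand side of (ii), and the third shows that $\langle\Psi_1, \Psi_2\rangle$ is constant, as required by (iii).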


Proof. The image of $D\bar D_{g,A}$ is tangential to $F_{n,n}$ at $D_{g,A}$ if and only if
\[ \langle D\bar D_{g,A}(k,a)\Psi_1, \Psi_2\rangle_{L^2} = 0 \]
for all $\Psi_i \in \mathrm{Ker}(D_{g,A} - \lambda)$ and $(k,a) \in \mathrm{Sym}(T^*M \otimes T^*M) \times i\Omega^1(M)$ with $\int \mathrm{Tr}_g k\, dvol_g = 0$. Define $Q^{g,A}_{\Psi_1,\Psi_2}(X,Y) := \frac12\langle X.\tilde\nabla^{g,A}_Y\Psi_1, \Psi_2\rangle + \frac12\langle Y.\tilde\nabla^{g,A}_X\Psi_1, \Psi_2\rangle$. Then
\[ \Bigl\langle \sum_i e_i.\tilde\nabla^{g,A}_{Ke_i}\Psi_1, \Psi_2\Bigr\rangle = \sum_{i,j} k(e_i,e_j)\,Q^{g,A}_{\Psi_1,\Psi_2}(e_i,e_j) = \langle k, Q^{g,A}_{\Psi_1,\Psi_2}\rangle , \]
where the term on the right-hand side means the usual pointwise $\mathbb{C}$-bilinear product of $\mathbb{C}$-valued symmetric bilinear forms. With this notation the condition that the image of $D\bar D_{g,A}$ be tangential to $F_{n,n}$ is equivalent to:
\[ 0 = \int_M \Bigl( -\frac12\langle k, Q^{g,A}_{\Psi_1,\Psi_2}\rangle - \frac14\langle(\mathrm{div}_g k).\Psi_1, \Psi_2\rangle + \frac12\langle a.\Psi_1, \Psi_2\rangle \Bigr)\, dvol_g \]
for all $\lambda$-eigenspinors $\Psi_1$ and $\Psi_2$. If we set $k = 0$ then we immediately obtain the first condition of the proposition. This also implies that the integral over the third term vanishes identically. We may repeat the above argument with $\Psi_1$ and $\Psi_2$ interchanged. Denote by $\bar Q^{g,A}_{\Psi_2,\Psi_1}$ the complex conjugate of $Q^{g,A}_{\Psi_2,\Psi_1}$. Then adding the corresponding equations we get
\[ 0 = \int_M \langle k, Q^{g,A}_{\Psi_1,\Psi_2} + \bar Q^{g,A}_{\Psi_2,\Psi_1}\rangle\, dvol_g \qquad (4.1.1) \]
for all $k \in C^\infty\mathrm{Sym}(T^*M \otimes T^*M)$ with $\int \mathrm{Tr}_g k\, dvol_g = 0$. This implies that the section $Q^{g,A}_{\Psi_1,\Psi_2} + \bar Q^{g,A}_{\Psi_2,\Psi_1}$ in the bundle of symmetric bilinear forms is equal to its trace part and that its trace is constant. For $\lambda \neq 0$ the latter condition is equivalent to (iii) of the proposition, whereas the former is just (ii).

Now let $Z := \sum_i \langle e_i.\Psi_1, \Psi_2\rangle e_i$ with respect to a local $g$-orthonormal frame. $Z$ is globally defined, and computing at a point $x \in M$, where we may assume the local $g$-orthonormal frame to satisfy $\nabla^g e_i|_x = 0$, we find:
\begin{align*}
L_Z g(X,X) &= 2g(\nabla^g_X Z, X)|_x = 2X\langle e_i.\Psi_1, \Psi_2\rangle|_x\, g(e_i, X)|_x \\
&= 2\langle X.\tilde\nabla^{g,A}_X\Psi_1, \Psi_2\rangle|_x - 2\langle\Psi_1, X.\tilde\nabla^{g,A}_X\Psi_2\rangle|_x \\
&= 2Q^{g,A}_{\Psi_1,\Psi_2}(X,X)|_x - 2\bar Q^{g,A}_{\Psi_2,\Psi_1}(X,X)|_x .
\end{align*}
Adding (ii) (multiplied by a factor of 2) to the last equation yields
\[ \text{(ii)'} \qquad \langle X.\tilde\nabla^{g,A}_X\Psi_1, \Psi_2\rangle = \frac14 L_Z g(X,X) + \frac{\lambda}{m}\langle\Psi_1, \Psi_2\rangle\, g(X,X) , \]
which is of course equivalent to (ii). To prove that (i), (ii) and (iii) imply that $\mathrm{Im}\,D\bar D_{g,A}$ is tangential to $F_{n,n}$, observe that $(L_{.}g)^*(k) = -2\sum_i (\mathrm{div}_g k)(e_i)\,e_i$ [Be,1.60]. Thus (ii)' implies
\begin{align*}
-\frac12\int_M \langle k, Q^{g,A}_{\Psi_1,\Psi_2}\rangle\, dvol_g &= -\frac18\int_M \langle k, L_Z g\rangle\, dvol_g - \frac{\lambda}{2m}\int_M \langle\Psi_1, \Psi_2\rangle\, \mathrm{Tr}_g k\, dvol_g \\
&= \frac14\int_M \langle \mathrm{div}_g k.\Psi_1, \Psi_2\rangle\, dvol_g .
\end{align*}

The last equality is clear for $\lambda = 0$. In case $\lambda \neq 0$ recall that $\langle\Psi_1, \Psi_2\rangle$ is constant and $\int \mathrm{Tr}_g k\, dvol_g = 0$. But this equation precisely states that $(g,A)$ is critical. Inspection of the proof shows that if we restrict to variations of the metric, (ii) and (iii) are equivalent to the metric being critical. And in case we vary only the connection, the property of $A$ being critical is equivalent to (i) only.

The following is an immediate corollary of the definitions and the preceding proposition:

Corollary 4.2. For generic metrics conditions (ii) and (iii) of the proposition are satisfied. For generic connections (i) is satisfied. For generic pairs of metrics and connections (i), (ii) and (iii) are satisfied.

Remark 4.3. Note that an eigenvalue $\lambda$ which admits a Killing spinor, i.e. a spinor $\Psi$ which satisfies $\tilde\nabla_X\Psi = -\frac{\lambda}{m}X.\Psi$, is a critical eigenvalue for variations of the metric which preserve the total volume [BG,Prop.28]. In Proposition 9.1 below we shall prove a partial converse to this.

Remark 4.4. Consider only the eigenvalue 0: It is clear from Remark 2.6 above that (i) is conformally invariant. Some straightforward but tedious computation shows that the vanishing of $Q^{g,A}_{\Psi_1,\Psi_2} + \bar Q^{g,A}_{\Psi_2,\Psi_1}$ is a conformally invariant statement, too. More precisely, if $h = e^{2f}g$ we have $D_{h,A,E} = e^{-f}\,\bar\beta_{h,g}\circ D_{g,A,E}\circ\bar\beta_{h,g}^{-1}$ by Remark 2.6 above. Furthermore, $\tilde\nabla^{h,A}_X = \beta_{h,g}\{\tilde\nabla^{g,A}_X + \frac14(X.\nabla f. + \nabla f.X.)\}\beta_{h,g}^{-1}$ [LM,p.134]. We may compute $Q^{h,A}_{\bar\beta\Psi_1,\bar\beta\Psi_2} + \bar Q^{h,A}_{\bar\beta\Psi_2,\bar\beta\Psi_1}$, where we write $\beta := \beta_{h,g}$ and $\bar\beta := \bar\beta_{h,g}$ to simplify notation:
\begin{align*}
Q^{h,A}_{\bar\beta\Psi_1,\bar\beta\Psi_2} &= \langle X.\tilde\nabla^{h,A}_X\bar\beta\Psi_1, \bar\beta\Psi_2\rangle \\
&= e^{-\frac{m-3}{2}f}\langle X.(\bar\beta^{-1}\tilde\nabla^{h,A}_X)\bar\beta\Psi_1, \Psi_2\rangle \\
&= e^{(2-m)f}\Bigl\{ Q^{g,A}_{\Psi_1,\Psi_2} + \frac14\langle(X.X.\nabla f - X.\nabla f.X).\Psi_1, \Psi_2\rangle - \frac{m-1}{2}(Xf)\langle X.\Psi_1, \Psi_2\rangle \Bigr\} \\
&= e^{(2-m)f}\Bigl\{ Q^{g,A}_{\Psi_1,\Psi_2} - \frac{m}{2}(Xf)\langle X.\Psi_1, \Psi_2\rangle \Bigr\} .
\end{align*}
Adding this to the corresponding result for $\bar Q^{h,A}_{\bar\beta\Psi_2,\bar\beta\Psi_1}$ yields:
\[ Q^{h,A}_{\bar\beta\Psi_1,\bar\beta\Psi_2} + \bar Q^{h,A}_{\bar\beta\Psi_2,\bar\beta\Psi_1} = e^{(2-m)f}\bigl( Q^{g,A}_{\Psi_1,\Psi_2} + \bar Q^{g,A}_{\Psi_2,\Psi_1} \bigr) . \]
This shows that (ii) is a conformally invariant equation.

Note that if $m = 2$ the form $Q := Q^{g,A}_{\Psi_1,\Psi_2} + \bar Q^{g,A}_{\Psi_2,\Psi_1}$ is independent of the choice of metric in a given conformal class. It depends only on the connection $A$ (and thus on the holomorphic structure on the line bundle $\Sigma^+$) and the choice of harmonic spinors. To better understand the meaning of this consider the case of flat connections $A$ only. Let $q(X,Y) := \mathrm{Re}\langle X.\tilde\nabla^g_Y\Psi_+, \Psi_-\rangle + \mathrm{Re}\langle\Psi_+, X.\tilde\nabla^g_Y\Psi_-\rangle$. Note that $\mathrm{Tr}_g q = 0$ and thus $q$ is anti-$J$-invariant, i.e. $q(J., J.) = -q(., .)$, where $J$ denotes the complex structure induced by the metric $g$. As $\Lambda^{2,0}(M,J)$ is trivial, $q$ is in fact a symmetric form. Note that we recover $Q$ from $q$ by the identity $Q = q - iq^J$ with $q^J(., .) := q(J., .)$.


Fix $p \in M$ and choose an ON-frame $\{e_1, e_2\}$ around $p$ and a vector field $X$ with $\nabla e_i|_p = \nabla X|_p = 0$. Compute at the point $p$:
\begin{align*}
\mathrm{div}_g q &= \sum_i e_i\, q(e_i, X) \\
&= \sum_i e_i\bigl( \mathrm{Re}\langle X.\tilde\nabla^g_{e_i}\Psi_+, \Psi_-\rangle + \mathrm{Re}\langle\Psi_+, X.\tilde\nabla^g_{e_i}\Psi_-\rangle \bigr) \\
&= \sum_i \bigl( \mathrm{Re}\langle X.\tilde\nabla_{e_i}\tilde\nabla_{e_i}\Psi_+, \Psi_-\rangle + \mathrm{Re}\langle\Psi_+, X.\tilde\nabla_{e_i}\tilde\nabla_{e_i}\Psi_-\rangle \bigr) \\
&= -\mathrm{Re}\langle X.\tilde\nabla^*\tilde\nabla\Psi_+, \Psi_-\rangle - \mathrm{Re}\langle\Psi_+, X.\tilde\nabla^*\tilde\nabla\Psi_-\rangle .
\end{align*}
Using the Weitzenböck formula $D_g^2 = \tilde\nabla^*\tilde\nabla + s/4$ we see that $\mathrm{div}_g q = 0$. The condition that a symmetric bilinear form be trace-free is invariant under conformal changes. In dimension 2 the property of a symmetric bilinear form being divergence-free is a conformally invariant property, too. Thus we see that $q$ defines an element in the tangent space $T_{[g]}\mathcal{T}$ to Teichmüller space $\mathcal{T}$ at the point defined by the conformal class $[g]$ of $g$, see for example [Tr]. The image of the map which assigns to each pair $(\Psi_+, \Psi_-)$ of harmonic spinors the form $q(X,Y) := \mathrm{Re}\langle X.\tilde\nabla^g_Y\Psi_+, \Psi_-\rangle + \mathrm{Re}\langle\Psi_+, X.\tilde\nabla^g_Y\Psi_-\rangle$ is thus the subspace of the tangent space $T_{[g]}\mathcal{T}$ which contains those infinitesimal deformations which reduce the dimension of the space of harmonic spinors. Conformal invariance of $q$ thus reflects the fact that spin-geometry on 2-manifolds is essentially equivalent to the study of holomorphic square roots of the canonical bundle $K = \Omega^{1,0}(M)$ on Riemann surfaces. For this point of view see Sect. 7 below.

Remark 4.5. Define a gauge-transformation to be a smooth map $u : M \to U_1$. Such a $u$ acts on $(g,A)$ by the rule $u.(g,A) := (g, u(A) = A + 2u\,du^{-1})$. It is immediate that $\tilde\nabla^{g,u(A)} = u \circ \tilde\nabla^{g,A} \circ u^{-1}$. It follows that $D_{g,u(A)} = u \circ D_{g,A} \circ u^{-1}$, which in particular implies that $D_{g,u(A)}$ and $D_{g,A}$ have the same spectrum, and it is also immediate that if (i), (ii) and (iii) of the proposition hold for $(g,A)$ and some $\lambda$ in the spectrum of $D_{g,A}$, then they also hold for $(g, u(A))$. Thus the condition that $\lambda$ be critical is invariant under gauge-transformations.
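The identity $\tilde\nabla^{g,u(A)} = u \circ \tilde\nabla^{g,A} \circ u^{-1}$ of Remark 4.5 can be checked in one line. The sketch below assumes, as in Sect. 2.3, that changing the connection on the canonical bundle by $\phi \in i\Omega^1(M)$ changes the spinor connection by $\frac12\phi$; the section $s$ is a symbol introduced for illustration.

```latex
% Sketch: gauge covariance of the spinor connection and the Dirac operator.
\begin{align*}
\bigl(u\circ\tilde\nabla^{g,A}\circ u^{-1}\bigr)s
  &= u\bigl(d(u^{-1})\,s + u^{-1}\tilde\nabla^{g,A}s\bigr)
   = \tilde\nabla^{g,A}s + (u\,du^{-1})\,s , \\
\tilde\nabla^{g,u(A)}s
  &= \tilde\nabla^{g,A}s + \tfrac12\bigl(2u\,du^{-1}\bigr)s
   = \tilde\nabla^{g,A}s + (u\,du^{-1})\,s .
\end{align*}
% Clifford multiplication commutes with the scalar function u, hence
%   D_{g,u(A)} = \sum_i e_i.\tilde\nabla^{g,u(A)}_{e_i} = u\circ D_{g,A}\circ u^{-1}.
```

In particular $s \mapsto us$ maps $\lambda$-eigenspinors of $D_{g,A}$ to $\lambda$-eigenspinors of $D_{g,u(A)}$, and since $|u| = 1$ the pointwise inner products appearing in (i), (ii) and (iii) are unchanged.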

Remark 4.6. Let $M$ be even dimensional. The complex volume element $i^{\frac{m}{2}}e_1 \cdots e_m \in Cl_m(TM)$ acts on $\Sigma$ and splits it into the $\pm$-eigenbundles $\Sigma_g^+$ and $\Sigma_g^-$. $D_{g,A}$ intertwines $\Sigma_g^+$ and $\Sigma_g^-$. It is clear that $\beta_{h,g}$ respects this splitting, i.e. $\beta_{h,g} : \Sigma_g^\pm \to \Sigma_h^\pm$. Thus we may consider $D_{h,B} = D^+_{h,B} + D^-_{h,B}$ as operator on $\Sigma_g^\pm$. We may thus ask under what conditions on $(g,A)$ is $\mathrm{Im}\,DD^+_{g,A}$ tangential to $F_{n,k}(H_1(\Sigma_g^+), H_0(\Sigma_g^-))$. Because $\mathrm{Ker}\,D_{g,A} = \mathrm{Ker}\,D^+_{g,A} \oplus \mathrm{Coker}\,D^+_{g,A}$ we do not get any new information. In fact, what one would get if one proceeded as in the above proof are equations (i) and (ii) with $\Psi_1$ replaced by $\Psi_+$ and $\Psi_2$ replaced by $\Psi_-$. But these equations are contained in the above proposition, and conversely if these equations are known for $\Psi_+$ and $\Psi_-$ we retrieve (i) and (ii) above because these equations are symmetric in $\Psi_+$ and $\Psi_-$.

Remark 4.7. Equations (ii) and (iii) are essentially contained in [BG]: If the analytic functions $\lambda_1(t), \dots, \lambda_n(t)$ (pairwise different) with $\lambda_i(0) = \lambda$ are eigenvalues for $D_{g_t} - \lambda$, where the dimension of the $\lambda$-eigenspace is $n$, then the equations $\frac{d}{dt}\big|_{t=0}\lambda_i = 0$ are implied by (ii) replacing harmonic spinors by eigenspinors with eigenvalue $\lambda$ for $D_g$ [BG,Th.24]. Conversely, the proof of [BG,Th.24] may easily be modified to prove that if $\frac{d}{dt}\big|_{t=0}\lambda_i = 0$ for all $i$ then the metric $g$ is critical at the eigenvalue $\lambda$. Thus the bifurcation-theoretic approach of [BG] is equivalent to our approach.
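A simple example shows why the eigenvalue-derivative criterion of Remark 4.7 must be restricted to volume-preserving variations when $\lambda \neq 0$. The sketch below takes the pure scaling direction $k = g$, $a = 0$ in Proposition 2.4 (so $K = \mathrm{Id}$ and, since $g$ is parallel, $\mathrm{div}_g g = 0$) and compares with the scaling rule $\bar D_{\mu^2 g,A} = \frac{1}{\mu}\bar D_{g,A}$ stated in Sect. 4.

```latex
% Sketch: the derivative of \bar D in the scaling direction k = g.
\begin{align*}
(D\bar D)_{(g,A)}(g,0)
  &= -\frac12\sum_i e_i.\tilde\nabla^{g,A}_{e_i} - \frac14\,\mathrm{div}_g g
   = -\frac12\, D_{g,A} , \\
\frac{d}{dt}\Big|_{t=0}\bar D_{(1+t)g,A}
  &= \frac{d}{dt}\Big|_{t=0}(1+t)^{-1/2}\,\bar D_{g,A}
   = -\frac12\,\bar D_{g,A} .
\end{align*}
% Both computations agree. On a \lambda-eigenspinor the eigenvalue moves as
%   \lambda(t) = (1+t)^{-1/2}\lambda,  so  \lambda'(0) = -\lambda/2,
% which is nonzero for \lambda \neq 0.
```

Here $\int \mathrm{Tr}_g g\, dvol_g = m\,\mathrm{vol}(M,g) \neq 0$, so this direction is exactly the one excluded by the volume constraint in the Convention of Sect. 4.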


5. Partial Proofs

In this section we will prove Theorem 1.4 and the statements of Theorems 1.2 and 1.3 concerning spin$^c$-manifolds. The following proof is essentially the proof of [Hij2] which however is applied rather differently in this reference.

Proof of Theorem 1.4. Consider dimensions 2 and 4 first. By Proposition 4.1 we know that for a fixed metric the image of $D^+_{g,A}$ is tangential to some $F_{n,k}$ with $n, k > 0$ if and only if $\langle X.\Psi_+, \Psi_-\rangle = 0$ for all harmonic spinors $\Psi_+$ and $\Psi_-$. Suppose that neither spinor vanishes. Then by the unique continuation principle [BW] there is an open dense subset of $M$ on which neither vanishes. In dimension 2 the complex fibre dimension of $\Sigma^\pm$ is 1 and in dimension 4 it is 2. Thus there is always a vector field $X$ such that $\langle X.\Psi_+, \Psi_-\rangle \neq 0$. This shows that unless there are no nontrivial harmonic spinors of either positive or negative type we may deform the connection so as to reduce the dimension of the space of harmonic spinors.

In dimensions 1 and 3 one may argue similarly: In these dimensions the complex fibre dimensions of $\Sigma$ are 1 and 2 respectively. Thus given a nontrivial harmonic spinor $\Psi$ we may always find a vector field $X$ with $\langle X.\Psi, \Psi\rangle \neq 0$. Thus unless there are no nontrivial harmonic spinors we may deform the connection so as to reduce the dimension of the space of harmonic spinors.

In dimensions $\ge 5$ it might happen that for a given metric and connection $T_pM.H_p \cap H_p = \{0\}$ for every $p \in M$, where $H \neq \{0\}$ is the space of harmonic spinors and $H_p$ is the subspace in the fibre $\Sigma_p$ spanned by harmonic spinors. In this case (i) of Proposition 4.1 is satisfied but we have no means of deforming the connection so as to reduce the dimension of $H$. Also note that we are not able to extend our arguments to dimensions 7 and 8 as in [Hij2] because in dimensions 7 and 8 there is in general no parallel real structure on the spinor bundle $\Sigma$ for a given spin$^c$-structure.

The following lemma contains parts of the statements of Theorems 1.2 and 1.3 concerning spin$^c$-manifolds:

Lemma 5.1. Let $M$ be a closed oriented 3- or 4-manifold with fixed spin$^c$-structure and fixed metric $g$ and connection $A$ on the canonical bundle. Suppose there are nontrivial harmonic spinors (of both chiralities in dimension 4) and that condition (ii) of Proposition 4.1 is satisfied. Then
(i) All nontrivial harmonic spinors vanish on the same set $N$ and on any connected set in the complement of $N$ we have $|\Psi_1|/|\Psi_2| = \mathrm{const}$ for nontrivial harmonic spinors $\Psi_i$.
(ii) If $\dim M = 3$ the dimension of the space of harmonic spinors is at most 2.
(iii) If $\dim M = 4$ and $\mathrm{Index}\,D_{g,A} \neq 0$, a generic metric has $h^+ + h^- \le 3$ and $\mathrm{Index}\,D_{g,A} \in \{\pm 1\}$ unless either of $h^\pm$ is zero.
(iv) If $\dim M = 4$ and $\mathrm{Index}\,D_{g,A} = 0$ then $h^+ = h^- \le 2$ for the generic metric.

Proof. Consider first the 4-dimensional case: Suppose $\mathrm{Im}\,D^+_{g,A}$ is tangential to $F_{n,k}$, $n, k \ge 1$ such that there are linearly independent harmonic spinors $\Psi_1^+$, $\Psi_2^+$, and let $\Psi^-$ be a nontrivial negative harmonic spinor. Then by Proposition 4.1,
\[ \langle X.\tilde\nabla^g_X\Psi_i^+, \Psi^-\rangle + \langle\Psi_i^+, X.\tilde\nabla^g_X\Psi^-\rangle = 0 \]


for $i \in \{1, 2\}$. Suppose for the moment that there is an open connected set $U \subset M$ on which $\Psi_1^+$ does not vanish and where $\Psi_2^+ = f\Psi_1^+$ for a smooth function $f$. Plugging into the equation yields $(Xf)\langle X.\Psi_1^+, \Psi^-\rangle = 0$. At a fixed point $p \in U$ we may choose a basis $\{X_1, \dots, X_4\}$ for $T_pM$ such that $\langle X_k.\Psi_1^+, \Psi^-\rangle \neq 0$. Thus $df|_p = 0$, and hence $f$ is constant on $U$. By the unique continuation principle $\Psi_2^+$ is a constant multiple of $\Psi_1^+$ in contradiction to the assumption. Thus the set of points $p$ at which $\Psi_1^+|_p$ and $\Psi_2^+|_p$ are linearly independent is open and dense. Fix a connected open subset $U$ such that $\Psi_1^+|_p$ and $\Psi_2^+|_p$ neither vanish nor are linearly dependent at any point $p$ in $U$ and such that $\Psi^-$ vanishes nowhere on $U$.

Given another harmonic spinor $\Psi_0^+$ we may write $\Psi_0^+ = f_1\Psi_1^+ + f_2\Psi_2^+$ (where $f_i \in C^\infty(M, \mathbb{C})$) over $U$. Replacing $\Psi_i^+$ in the equation by $\Psi_0^+$ we obtain
\[ (Xf_1)\langle X.\Psi_1^+, \Psi^-\rangle + (Xf_2)\langle X.\Psi_2^+, \Psi^-\rangle = 0 . \]
Let $F_i \subset TU$ be the subbundle $\mathrm{Ker}(X \in TU|_p \to \langle X.\Psi_i^+, \Psi^-\rangle|_p)$. Both $F_i$ have 2-dimensional real fibres and $F_1 \cap F_2 = \{0\}$. By the previous equation, a section $X_1 \in C^\infty(F_2)$ satisfies $X_1 f_2 = 0$, and a section $X_2 \in C^\infty(F_1)$ satisfies $X_2 f_1 = 0$. Fix a point $p \in U$ and $0 \neq X_2 \in F_2|_p$ with $X_2 f_2|_p = 0$, and choose $X_1 \in F_1|_p$. Set $X = X_1 + X_2$ and plug into the above equation. Then $0 = (X_1 f_1)\langle X_2.\Psi_1^+, \Psi^-\rangle$. By fibrewise linear independence of $\Psi_1^+$ and $\Psi_2^+$ on $U$ we find $X_1 f_1|_p = 0$. Hence $f_1$ is constant on each component of $U$, and similarly $f_2$ is constant on each component, too. By the unique continuation principle $\Psi_0^+$ is a linear combination of $\Psi_1^+$ and $\Psi_2^+$. Thus if $h^+, h^- \ge 2$ we find (by applying the above argument to positive and negative harmonic spinors) $h^+ = h^- = 2$. Thus if $\mathrm{Index}\,D_{g,A} \neq 0$ and both $h^+$ and $h^-$ are positive we find that either $h^+$ or $h^-$ is $\le 1$, and $h^+$ and $h^-$ differ by one.

In dimension 3 it suffices to note that if $h \ge 2$ then two linearly independent harmonic spinors $\Psi_1$ and $\Psi_2$ have $\Psi_1|_p$ and $\Psi_2|_p$ linearly independent for $p$ in some open dense set. This is proved as the corresponding statement in dimension 4. Then arguing as before we see that $h \le 2$.

If the spin$^c$-structure is in fact a spin-structure we have a quaternion-structure on the spinor-bundle. Thus a critical metric on $M$ which has both positive and negative harmonic spinors satisfies $h^+ = h^- = 2$.

6. Dimension 2

In order to prove Theorem 1.1 we find it convenient to view the Picard-torus of a smooth line bundle $L$ on a Riemann surface $(M, g, J)$ in terms of connections on $L$. We shall always assume that $M$ carries a metric which induces the given complex structure. Given a line bundle $L$ over a Riemann surface $(M, J)$ and a partial connection $\nabla^{0,1}$ on $L$, this partial connection induces a holomorphic structure on $L$. This follows from the usual integrability theorems [Do,Th.2.1.53] because $\Omega^{2,0} \oplus \Omega^{0,2} = \{0\}$. When we want to emphasize that $L$ is considered as a holomorphic bundle with the structure induced by $\nabla^{0,1}$ we write $(L, \nabla^{0,1})$. Given an isomorphism $f$ of $L$ (which we think of as a smooth map $f : M \to \mathbb{C}^*$) we may pull back a given partial connection $\nabla^{0,1}$ along $f$ to obtain the partial connection $f \circ \nabla^{0,1} \circ f^{-1} = \nabla^{0,1} + f(\bar\partial f^{-1})$. Then $(L, \nabla^{0,1})$ and $(L, \nabla^{0,1} + f(\bar\partial f^{-1}))$ are holomorphically equivalent. And if $(L, \nabla^{0,1})$ and $(L, \nabla^{0,1} + \phi)$, $\phi \in \Omega^{0,1}(M)$, are holomorphically equivalent then there is a smooth function $f : M \to \mathbb{C}^*$ such that $f \circ \nabla^{0,1} \circ f^{-1} = \nabla^{0,1} + f(\bar\partial f^{-1}) = \nabla^{0,1} + \phi$. Thus $\phi = f(\bar\partial f^{-1})$.
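The formula for the pulled-back partial connection is again just the Leibniz rule. A short sketch, where $s$ is an arbitrary smooth section of $L$ and $h$ a smooth complex-valued function (both symbols introduced here for illustration):

```latex
% Sketch: pulling back a partial connection along f : M -> C^*.
\begin{align*}
\bigl(f\circ\nabla^{0,1}\circ f^{-1}\bigr)s
  &= f\bigl(\bar\partial(f^{-1})\,s + f^{-1}\nabla^{0,1}s\bigr)
   = \nabla^{0,1}s + f\,(\bar\partial f^{-1})\,s .
\end{align*}
% Special case f = e^{h}:
\begin{align*}
f\,(\bar\partial f^{-1}) = e^{h}\,\bar\partial\bigl(e^{-h}\bigr) = -\bar\partial h ,
\end{align*}
% so exponentials contribute exactly the \bar\partial-exact (0,1)-forms.
```

The special case accounts for the summand $\bar\partial(C^\infty(M, \mathbb{C}))$ in the group of forms $f(\bar\partial f^{-1})$; the remaining contribution comes from maps $u : M \to S^1$ that are not of the form $e^{h}$.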


Note that the additive group $\{f(\bar\partial f^{-1}) : f \in C^\infty(M, \mathbb{C}^*)\}$ splits as $\bar\partial(C^\infty(M, \mathbb{C})) \oplus H^{0,1}(M, 2\pi i\mathbb{Z})$, by writing $f = ue^h$ with $u : M \to S^1$ a harmonic map and $h : M \to \mathbb{C}$. Here $H^{0,1}(M, 2\pi i\mathbb{Z})$ is the projection of $H^1(M, 2\pi i\mathbb{Z}) \subset H^1(M, \mathbb{C})$ into the $(0,1)$-component. Note that any holomorphic structure on $L$ is defined by $\nabla^{0,1}$ for a suitable connection. Thus the moduli-space of holomorphic structures on $L$ is the quotient
\[ \Omega^{0,1}/\bigl(\bar\partial(C^\infty(M, \mathbb{C})) \oplus H^{0,1}(M, 2\pi i\mathbb{Z})\bigr) = H^{0,1}(M, \mathbb{C})/H^{0,1}(M, 2\pi i\mathbb{Z}) . \]
This quotient is a complex torus, called the Picard torus of $L$.

Now fix a Hermitian metric $h$ on $L$. For any holomorphic structure there is a connection which induces the given holomorphic structure and preserves $h$, i.e. $\nabla h = 0$. Given two $h$-preserving connections $\nabla_1$ and $\nabla_2$ on $(L, h)$ which induce the same holomorphic structure on $L$, we have $\nabla_2^{0,1} - \nabla_1^{0,1} = \phi \in \bar\partial(C^\infty(M, \mathbb{C})) \oplus H^{0,1}(M, 2\pi i\mathbb{Z})$. Because $\nabla_2 - \nabla_1 \in i\Omega^1(M, \mathbb{R})$, we find that $\nabla_2 - \nabla_1 = \phi - \bar\phi$.

We now return to spin-structures on $M$: Let $K = \Omega^{1,0}(M)$ be the canonical bundle of $(M, J)$. Spin-structures on $M$ correspond to holomorphic square-roots of $K$ by [Hi,Th.2.2]. Fix some such square-root $L$. Given a metric on $M$ which induces the given complex structure, $K$ and $L$ inherit hermitian metrics. Let $\tilde\nabla$ be the hermitian connection on $L$, and $\nabla$ the hermitian connection on $K$. Note that $\tilde\nabla \otimes \tilde\nabla = \nabla$. As all square-roots of $K$ are isomorphic as unitary bundles we may think of them as being of the form $(L, \tilde\nabla + \omega)$ with $\omega \in i\Omega^1(M, \mathbb{R})$. Taking the square we get a connection $\nabla' := \tilde\nabla \otimes \tilde\nabla + 2\omega$ on $K = L \otimes L$. Observe now:

Lemma 6.1. $\nabla'$ induces the same holomorphic structure on $K$ as does $\nabla$ if and only if $2\omega^{0,1} \in \bar\partial(C^\infty(M, \mathbb{C})) \oplus H^{0,1}(M, 2\pi i\mathbb{Z})$, that is if and only if the cohomology class $[\omega^{0,1}]$ is contained in the lattice obtained by projecting $\frac12 H^1(M, 2\pi i\mathbb{Z})$ into $H^{0,1}(M, \mathbb{C})$.

Observe that $\dim H^1(M, \mathbb{R}) = \mathrm{rank}\,H^1(M, \mathbb{Z}) = 2\,\mathrm{genus}(M)$. We thus retrieve the well known fact that there are $2^{2\,\mathrm{genus}(M)}$ spin-structures on $M$.
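The count of spin-structures follows from Lemma 6.1 by a lattice index computation. In the sketch below, $\Lambda$ is a name introduced here for the lattice $H^{0,1}(M, 2\pi i\mathbb{Z})$ and $\gamma$ abbreviates $\mathrm{genus}(M)$; it is assumed that the projection to the $(0,1)$-component is injective on $H^1(M, 2\pi i\mathbb{Z})$, so that $\Lambda$ has rank $2\gamma$.

```latex
% Sketch: square roots of K correspond to classes [\omega^{0,1}] in
% (1/2)\Lambda modulo \Lambda, where \Lambda \cong \mathbb{Z}^{2\gamma}.
\begin{align*}
\tfrac12\Lambda\,/\,\Lambda
  \;\cong\; \bigl(\tfrac12\mathbb{Z}/\mathbb{Z}\bigr)^{2\gamma}
  \;\cong\; (\mathbb{Z}/2)^{2\gamma},
\qquad
\bigl|\tfrac12\Lambda/\Lambda\bigr| = 2^{2\gamma} .
\end{align*}
% Examples: \gamma = 1 gives 4 spin-structures, \gamma = 2 gives 16.
```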
Armed with these preliminary remarks we can now embark upon a proof of Theorem 1.1. The following lemma is the analogue of Lemma 5.1 for dimensions 3 and 4 above.

Lemma 6.2. Let $M$ be a closed 2-dimensional manifold and fix a spin$^c$-structure, a metric $g$ on $M$ and a connection $A$ on the auxiliary bundle $P_{U_1}$, and let $D_{g,A} : \Sigma^+ \to \Sigma^-$ be the Atiyah-Singer Dirac operator. Suppose that at $g$, $\mathrm{Im}\,DD^+_{g,A}$ is tangential to $F_{n,k}$, $n, k > 0$. Then either $c_1(P_{U_1}) = 0$ and $\dim\mathrm{Ker}\,D^+_{g,A} = \dim\mathrm{Ker}\,D^-_{g,A} \le 1$, or $c_1(P_{U_1}) \neq 0$ and there are no harmonic spinors of either positive or negative chirality. In the first case, given two nontrivial harmonic spinors $\Psi_+$ and $\Psi_-$ of positive and negative chirality respectively, we have $|\Psi_+| = \lambda|\Psi_-|$ for some $\lambda > 0$. In this case $\Psi_+$ and $\Psi_-$ vanish on the same finite set of points.

Proof. Pick two harmonic spinors $\Psi_+$ and $\Psi_-$. Then by Proposition 4.1,
\[ \langle X.\tilde\nabla^g_X\Psi_+, \Psi_-\rangle + \langle\Psi_+, X.\tilde\nabla^g_X\Psi_-\rangle = 0 . \]
Suppose that neither spinor vanishes identically. By the unique continuation principle [BW] we may choose an open connected subset $U \subset M$, where neither $\Psi_+$ nor $\Psi_-$ vanish. Note that the fibre dimension of each $\Sigma^\pm$ is 1. Let $\Psi_+'$ be another harmonic spinor, and over $U$ write $\Psi_+' = f\Psi_+$ for some $f \in C^\infty(U, \mathbb{C})$. Replace $\Psi_+$ in the previous equation by $\Psi_+'$ to obtain $(Xf)\langle X.\Psi_+, \Psi_-\rangle = 0$. As this holds for every vector


field over $U$ we see that $f$ is constant on $U$. By the unique continuation principle $\Psi_+'$ is a constant multiple of $\Psi_+$, and hence $\dim\mathrm{Ker}\,D^+_{g,A} = 1$. Repeating the argument with $\Psi_-$ shows that $\dim\mathrm{Ker}\,D^-_{g,A} = 1$, too, and thus $\mathrm{Index}\,D^+_{g,A} = \frac12 c_1(P_{U_1}) = 0$.

Let $X$ be a vector field on $U$ with $|X| = 1$. Then there is the identity $|\Psi_-|^2\Psi_+ = X.\Psi_-\langle\Psi_+, X.\Psi_-\rangle$. Then
\begin{align*}
|\Psi_-|^2 X|\Psi_+|^2 &= |\Psi_-|^2\langle\tilde\nabla_X\Psi_+, \Psi_+\rangle + |\Psi_-|^2\langle\Psi_+, \tilde\nabla_X\Psi_+\rangle \\
&= \langle\tilde\nabla_X\Psi_+, X.\Psi_-\rangle\langle X.\Psi_-, \Psi_+\rangle + \langle X.\Psi_-, \tilde\nabla_X\Psi_+\rangle\langle\Psi_+, X.\Psi_-\rangle \\
&= \langle\Psi_+, X.\tilde\nabla_X\Psi_-\rangle\langle X.\Psi_-, \Psi_+\rangle + \langle X.\tilde\nabla_X\Psi_-, \Psi_+\rangle\langle\Psi_+, X.\Psi_-\rangle \\
&= 2\mathrm{Re}\,\langle\Psi_+, X.\tilde\nabla_X\Psi_-\rangle\langle\Psi_+, X.\Psi_-\rangle .
\end{align*}
Note that the last expression is symmetric in $\Psi_+$ and $\Psi_-$. Thus we obtain the equation $|\Psi_-|^2 X|\Psi_+|^2 = |\Psi_+|^2 X|\Psi_-|^2$. Given a point $p$ in $M$ with $\Psi_+|_p \neq 0$ we conclude that $|\Psi_+| = \lambda|\Psi_-|$ in a neighbourhood of $p$. Because $\Psi_-$ does not vanish on a dense open set by the unique continuation principle [BW] it follows that $\lambda > 0$. Thus if $\Psi_+|_p = 0$ at some $p \in M$ then also $\Psi_-|_p = 0$. By symmetry, $\Psi_+$ and $\Psi_-$ vanish on the same set, and because $\Psi_+$ is a holomorphic section of $\Sigma^+$ with respect to the holomorphic structure induced by $\tilde\nabla^{0,1}$ [Hi], we have $|\Psi_+| = \lambda|\Psi_-|$ for some $\lambda > 0$ on all of $M$.

Lemma 6.3. Let $M$ be a closed 2-manifold. Let $g$ be a metric on $M$ and $a \in i\Omega^1(M)$, and fix a spin-structure on $M$. Denote the positive spinor bundle by $L$ and let $\tilde\nabla$ be the connection on $L$ induced by $g$. Let $D_{g,A}$ be the Dirac operator obtained from the connection $\tilde\nabla + a$. Suppose that $\mathrm{Im}\,DD_{g,A}$ is tangential to $F_{n,n}$ for $n > 0$. Then $(L, \tilde\nabla + a)$ is a holomorphic square-root of $K$ and thus a spin-structure, possibly different from $L$. There is a smooth function $f : M \to S^1$ with $a = df/2f$. The form $a$ is closed and defines an element $[a] \in H^1(M, 2\pi i\mathbb{Z})$.

Proof. Let $K$ denote the canonical bundle, and let $L$ be the square root of $K$ defined by the spin-structure. Given a metric on $M$ there is an antilinear isomorphism $h : \bar K \otimes L \to L$ given on smooth sections $v$ and $w$ of either bundle by $\langle v, h(w)\rangle = \int vw$ [At]. Let $\phi$ be a local section of $K$ of unit length over some open set $U \subset M$, and let $\bar\phi$ be the corresponding section of $\bar K$. Let $\sigma$ be a section of $L$ with $\sigma \otimes \sigma = \sigma^2 = \phi$. Then necessarily $|\sigma| = 1$. For $f \in C^\infty(U, \mathbb{C})$:
\[ \langle\sigma, h(f\bar\phi \wedge \sigma)\rangle_{L^2} = \int f\,\bar\phi \wedge \phi = i\int f\, dvol_g . \]
Using a Dirac-sequence for $f$ we find that $h(\bar\phi \wedge \sigma) = i\sigma$. Now pick any point $p \in U$ and choose a $g$-orthonormal frame $\{e_1, e_2\}$ in a neighbourhood of $p$ such that $\nabla e_i|_p = 0$. Thus $\nabla\sigma|_p = 0$ and $\nabla\bar\phi|_p = 0$. It is immediate that $\nabla h|_p = 0$. As $p$ was arbitrary it follows that $h$ is parallel, and $h$ is unitary with respect to the hermitian metrics on both bundles. Pull back the connection $\hat\nabla + a$ on $\bar K \otimes L$ to $L$ along $h$. We compute $h \circ (\hat\nabla + a) \circ h^{-1} = \tilde\nabla - a$, because $h$ is parallel and antilinear and $a \in i\Omega^1(M)$. Let $X$ be a smooth vector field and denote by $J$ the complex structure on $M$ induced by the metric. Compute:
\begin{align*}
\tilde\nabla^{0,1}_X - a^{0,1}(X) &= \frac12\bigl(\hat\nabla_X - a(X) + i\hat\nabla_{JX} - ia(JX)\bigr) \\
&= h \circ \frac12\bigl(\hat\nabla_X + a(X) - i\hat\nabla_{JX} - ia(JX)\bigr) \circ h^{-1} \\
&= h \circ \bigl(\hat\nabla^{1,0} + a^{1,0}\bigr) \circ h^{-1} .
\end{align*}
Let $\Delta^{0,1}_a$ be the Laplace-operator on $L \oplus (\Omega^{0,1}(M) \otimes L)$ associated with the operator $\hat\nabla^{0,1} + a^{0,1}$, and let similarly $\Delta^{1,0}_a$ be the Laplace-operator associated with the operator $\hat\nabla^{1,0} + a^{1,0}$. Let $\Psi_-$ be a section of $\bar K \otimes L$. Then by [Hi] $\Psi_-$ is harmonic if and only if $\Delta^{0,1}_a\Psi_- = 0$. This is equivalent to demanding $\Delta^{1,0}_a\Psi_- = 0$, because $\Delta^{0,1}_a = \Delta^{1,0}_a$. The latter condition translates into $(\hat\nabla^{1,0} + a^{1,0})\Psi_- = 0$. By the previous computation we see that $\Psi_-$ is harmonic if and only if $h(\Psi_-)$ is a holomorphic section of $L$ with respect to the holomorphic structure induced by $\tilde\nabla - a$.

Now suppose the metric on $M$ is critical and there are harmonic spinors $\Psi_+$ and $\Psi_-$. By the previous lemma we may assume that $|\Psi_+| = |h(\Psi_-)|$. Let $P$ be the finite set of points where both sections vanish, and choose a function $f \in C^\infty(M \setminus P)$ such that $h(\Psi_-) = f\Psi_+$. Harmonicity of $\Psi_+$ is equivalent to $(\tilde\nabla^{0,1} + a^{0,1})\Psi_+ = 0$, which together with $(\tilde\nabla^{0,1} - a^{0,1})h(\Psi_-) = 0$ implies $\bar\partial f - 2fa^{0,1} = 0$ by substitution. Thus $a^{0,1} = \bar\partial f/2f$. Because $|f| = 1$ we have $b := df/2f \in i\Omega^1(M)$. As $a \in i\Omega^1(M)$, too, both $a$ and $b$ satisfy $a^{1,0} = -\overline{a^{0,1}}$ and $b^{1,0} = -\overline{b^{0,1}}$ which implies $a = b = df/2f$. By continuity we see that $a$ is a closed form. Pick any $p \in P$ and a neighbourhood $D$ of $p$, which we may assume diffeomorphic to a disc. There $a = i\,dh$ for some smooth real valued function $h$ on $D$. Cutting out a radial line of $D$ yields a contractible set $D'$, where we may assume $f = e^{ig}$ for some smooth real valued function $g$. Thus on $D'$ we have $dh = \frac12 dg$. Thus $g = 2h + \mathrm{const}$ on $D'$. It follows that $f$ may be smoothly continued into $p$.

In total we have found a function $f \in C^\infty(M, \mathbb{C})$ with $|f| = 1$ such that $a^{0,1} = \bar\partial f/2f$. But by Lemma 6.1 of this section and the discussion preceding it this implies that $(L, \tilde\nabla^{0,1} + a^{0,1})$ is a holomorphic square-root of $K$.
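The substitution step in the proof of Lemma 6.3, which produces $\bar\partial f - 2fa^{0,1} = 0$, can be spelled out as follows. This is a sketch on $M \setminus P$, using only the two holomorphicity equations quoted in the proof and the Leibniz rule.

```latex
% Sketch: insert h(\Psi_-) = f\Psi_+ into (\tilde\nabla^{0,1} - a^{0,1})h(\Psi_-) = 0
% and use \tilde\nabla^{0,1}\Psi_+ = -a^{0,1}\Psi_+.
\begin{align*}
0 = \bigl(\tilde\nabla^{0,1} - a^{0,1}\bigr)(f\Psi_+)
  &= (\bar\partial f)\,\Psi_+ + f\,\tilde\nabla^{0,1}\Psi_+ - f\,a^{0,1}\Psi_+ \\
  &= \bigl(\bar\partial f - 2f\,a^{0,1}\bigr)\Psi_+ .
\end{align*}
% \Psi_+ has no zeros on M \setminus P, hence a^{0,1} = \bar\partial f / 2f there.
% Moreover |f| = 1 gives f\bar f = 1, so
%   df/f = -\,d\bar f/\bar f = -\,\overline{df/f},
% i.e. b = df/2f is purely imaginary.
```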

We may now proceed to prove the main theorem of this section.

Proof of Theorem 1.1. By Lemma 6.2 of this section, all that remains is to study the case of twisted spin-structures. Let $g$ be a metric on $M$ and $a \in i\Omega^1(M)$, and fix a spin-structure on $M$. Denote the positive spinor bundle by $L$ and let $\tilde\nabla$ be the connection on $L$ induced by $g$. Let $D_{g,A}$ be the Dirac operator obtained from the connection $\tilde\nabla + a$. If $a$ is closed and represents a class in $H^1(M, 2\pi i\mathbb{Z})$ then $(L, \tilde\nabla + a)$ is again a square root of $K$, for an arbitrary metric $g$.

Suppose there is a metric $g$ with nontrivial harmonic spinors of both chiralities which is critical. By Lemma 6.2 we have $h_+ = h_- = 1$, and Lemma 6.3 implies that $(L, \tilde\nabla + a)$ is in fact another spin-structure and $a = df/2f$ for some smooth function $f : M \to S^1$. Hence $a$ defines an element in $H^1(M, 2\pi i\mathbb{Z})$. Thus unless the twisted spin-structure is a spin structure itself, no metric can be critical and hence any metric with nontrivial harmonic spinors may be deformed to one without.

In the spin-case the value $h_+ = \dim\mathrm{Ker}\,D_g^+ \bmod 2$ is independent of the choice of metric and depends only on the spin-structure [At, Mum, ACGH]. As a critical metric has $h_+ \in \{0, 1\}$, we see that metrics with $h_+ > 1$ cannot be critical and can thus be perturbed to a new metric with fewer harmonic spinors. Thus if $\dim\mathrm{Ker}\,D_g^+ = 0 \bmod 2$ the generic metric will have $\dim\mathrm{Ker}\,D_g^+ = 0$. In the other case $\dim\mathrm{Ker}\,D_g^+ = 1$ for the generic metric.

Remark 6.4. The initiated will have noticed that $h : \bar K \otimes L \to L$ is essentially the Serre-duality map, possibly up to sign. In fact, if $\bar{*}_L$ denotes the complex conjugate of the Hodge-$*$-operator with coefficients in $L$ [W, p. 166] we have a map

\[ \bar K \otimes L \overset{\bar{*}_L}{\longrightarrow} K \otimes L^{-1} \cong L . \]

We find $\bar{*}(\bar\phi \otimes \sigma) = -i\phi$, and thus the above map is identified as the following map:
\[ \bar\phi \otimes \sigma \overset{\bar{*}_L}{\longmapsto} -i\phi \otimes \langle\,\cdot\,, \sigma\rangle = -i\sigma \otimes \sigma \otimes \langle\,\cdot\,, \sigma\rangle \overset{\cong}{\longmapsto} -i\sigma . \]
Hence $h$ is the negative of the Serre-duality map.

Remark 6.5. We should mention that there are spin-structures on $M$ for which there are always harmonic spinors. In fact, their number can be computed and it turns out to be $2^{g-1}(2^g - 1)$ [At,Th.3], where $g = \mathrm{genus}(M)$.

7. Applications to Riemann Surfaces

Theorem 1.1 is really a theorem in the theory of Riemann surfaces, their moduli and the moduli of holomorphic line bundles. Let $L_c$ denote the positive spinor bundle for a fixed spin-structure on a Riemann surface. Here, the index $c$ is the parameter of the complex structures on the underlying closed 2-manifold $M$. Let $F$ be a Hermitian line bundle with connection $A$ with respect to which the Hermitian metric is parallel. Then given a complex structure $c$ on $M$, the connection induces a holomorphic structure on $F$. Denote this holomorphic bundle by $F_{A,c}$. We may now restate Theorem 1.1 as follows: If $c_1(F) \neq 0$ then for a dense open subset of Teichmüller space $h^0(L_c \otimes F_{A,c}) = 0$ in case $c_1(F) < 0$, and $h^0(L_c \otimes F_{A,c}) = c_1(F)$ in case $c_1(F) > 0$. If $c_1(F) = 0$, write $A = d + a$, $a \in i\Omega^1(M)$, with respect to some trivialization of $F$. Then unless $a$ is closed and represents a class in $H^1(M, 2\pi i\mathbb{Z})$, $h^0(L_c \otimes F_{A,c}) = 0$ for generic $c$. Otherwise, $L_c \otimes F_{A,c}$ is a holomorphic square root of $K_c$ and for the generic complex structure $h^0(L_c \otimes F_{A,c}) = 0$ or $1$ depending only on the spin-structure. Thus in particular we have the following extension of the classical results of [At, Mum]:

Theorem 7.1. The function $h^0 : c \in T(M) \mapsto h^0(L_c)$ is constant on a generic (i.e. dense and open) subset $C$ of Teichmüller space $T(M)$. On $C$ the image of $h^0$ is contained in $\{0, 1\}$, and the actual value depends only on which spin-structure is chosen.

In the theory of Riemann surfaces spin-structures are often called Theta-characteristics.
Note that if $(M, c)$ is hyperelliptic then $h^0(L_c) = [(g + 1)/2]$ for at least one square root of $K$ [BaS,Th.3,Th.4]. Thus for $\mathrm{genus}(M) \geq 3$ the generic set $C$ of Theorem 7.1 is not all of Teichmüller space for at least one square-root of $K$.

Remark 7.2. One may ask what kind of subset the set $D(L) := \{\, c \in T(M) \mid h^0(L_c) > 1 \,\}$ is. First, by [Gro,Th.3.1 & Rem.3.2.2] there is a smooth analytic space $V$ and an analytic submersion $\pi : V \to T(M)$ such that $\pi^{-1}(c) = (M, c)$, i.e. $M$ furnished with the complex structure $c$. Using Grauert's upper-semicontinuity theorem [Gra,Satz 3;GR,5.10.4] we find that $D(L)$ is an analytic subset of $T(M)$. That is, $D(L)$ is a locally finite union of irreducible analytic subsets of $T(M)$. Compare also [Far].

Remark 7.3. An obvious question is whether the sets $D(L) := \{\, c \in T(M) \mid h^0(L_c) > 1 \,\}$ depend on the square root $L$ of $K$. First, note that for $\mathrm{genus}(M) < 3$ the value of $h^0(L_c)$ is independent of the complex structure $c$ and $h^0(L_c) \in \{0, 1\}$ [Hi,Prop.2.3]. Thus $D(L) = \emptyset$ for $\mathrm{genus}(M) < 3$. Now consider the case $\mathrm{genus}(M) \geq 3$: By [BaS,Th.3,Th.4], on any hyperelliptic surface $(M, c)$ there is always a square-root $L$ of $K$ for which $h^0(L_c) = 0$ and a square root $L'_c$ for which $h^0(L'_c) = [(g + 1)/2]$. This shows that the sets $D(L)$ do indeed depend upon the spin-structure chosen. Needless to say, we may take the union $D := \bigcup_{L^2 = K} D(L)$ to find an analytic subset such that on the complement the function $c \mapsto h^0(L_c) \in \{0, 1\}$ is constant for every square root of $K$.
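The count $2^{g-1}(2^g - 1)$ quoted in Remark 6.5 can be checked by brute force for small genus. The sketch below assumes the standard identification (due to Atiyah and Johnson, and not spelled out in this paper) of spin-structures on a closed surface of genus $g$ with quadratic forms on $H_1(M; \mathbb{Z}_2)$ refining the intersection form, under which $h^0 \bmod 2$ equals the Arf invariant. The function names are illustrative only.

```python
from itertools import product


def arf_invariant(a, b):
    """Arf invariant of q(x, y) = sum_i (x_i*y_i + a_i*x_i + b_i*y_i) over F_2.

    Computed by majority vote: q takes the value 1 on more than half of the
    points of F_2^(2g) exactly when its Arf invariant is 1.
    """
    g = len(a)
    ones = 0
    for v in product((0, 1), repeat=2 * g):
        x, y = v[:g], v[g:]
        ones += sum(x[i] * y[i] + a[i] * x[i] + b[i] * y[i] for i in range(g)) % 2
    return 1 if 2 * ones > 4 ** g else 0


def count_odd_spin_structures(g):
    """Number of quadratic refinements with Arf invariant 1 in genus g."""
    return sum(
        arf_invariant(ab[:g], ab[g:]) for ab in product((0, 1), repeat=2 * g)
    )


if __name__ == "__main__":
    for genus in (1, 2, 3):
        print(genus, count_odd_spin_structures(genus), 2 ** (genus - 1) * (2 ** genus - 1))
```

Under the stated identification, the forms with Arf invariant 1 correspond to the spin-structures with $h^0$ odd, which therefore carry harmonic spinors for every complex structure.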


Remark 7.4. Theorem 1.4 may be read as follows: For generic holomorphic structures $h$ on a line bundle $F$ over a fixed Riemann surface, $h^0(F_h) = c_1(F) + 1 - \mathrm{genus}(M)$ if $c_1(F) \geq \mathrm{genus}(M) - 1$, and $h^0(F_h) = 0$ if $c_1(F) < \mathrm{genus}(M) - 1$. Here, a generic set is a dense open subset of the Picard torus for $F$. This is of course a basic result of Brill–Noether theory [Gu,p.51]. The fact that $h^0(F_h) \geq c_1(F) + 1 - \mathrm{genus}(M)$ for every $h$ is a trivial consequence of the Riemann–Roch theorem. Brill–Noether theory also shows that the set of holomorphic structures for which $h^0(F_h)$ is greater than the minimal value is a union of analytic subsets.

8. Dimensions 3 and 4

In this section we will prove the statement on spin-manifolds in Theorem 1.3 first and then indicate the necessary changes in dimension 3. Thus assume for the time being that $M$ is a closed spin-4-manifold with a fixed spin-structure and a metric which is critical and has both nontrivial positive and negative harmonic spinors (if $g$ were not critical we could deform the metric so as to reduce the dimension of the space of harmonic spinors). The aim is to show that $(M, g)$ is conformally flat. We shall even show that in this situation $(M, g)$ is conformally equivalent to a flat torus, see Proposition 8.12 below. This shows that only in this particular situation a metric may be critical without being minimal. Otherwise critical metrics are precisely the minimal metrics. Fix two nontrivial harmonic spinors $\Psi_+$ and $\Psi_-$. The proof of Theorem 1.3 will extend over a rather long list of lemmas.

Lemma 8.1. On each connected set on which $\Psi_+$ does not vanish there is $\lambda > 0$ with $|\Psi_+| = \lambda|\Psi_-|$. In particular, $\Psi_+$ and $\Psi_-$ vanish on the same set. Moreover, linear combinations of $\Psi_+$ and $J\Psi_+$ (respectively $\Psi_-$ and $J\Psi_-$) are the only positive (negative) harmonic spinors on $M$, i.e. $h_+ = h_- = 2$.

Proof. The last statement is proved above in Lemma 5.1. Let $X$ be a vector field on some open connected set with $|X| = 1$. Then
\[ |\Psi_-|^2\,\Psi_+ = \langle \Psi_+, X.\Psi_-\rangle\, X.\Psi_- + \langle \Psi_+, X.J\Psi_-\rangle\, X.J\Psi_- . \]
With this we may compute
\[ |\Psi_-|^2\, X|\Psi_+|^2 = 2\,\mathrm{Re}\,\langle \Psi_+, X\tilde\nabla_X \Psi_-\rangle\langle X.\Psi_-, \Psi_+\rangle + 2\,\mathrm{Re}\,\langle \Psi_+, X\tilde\nabla_X J\Psi_-\rangle\langle X.J\Psi_-, \Psi_+\rangle . \]
By (ii) of Proposition 4.1 this is symmetric in $\Psi_+$ and $\Psi_-$, and arguing as before in the proof of Lemma 6.2 we may deduce the lemma.

Fix a connected open set $U$ on which neither $\Psi_+$ nor $\Psi_-$ vanishes. We may assume that $|\Psi_+| = |\Psi_-|$ by the preceding lemma. We may fix an ON-frame by the rule:
\[ e_1.\Psi_+ = \Psi_- , \qquad e_2.\Psi_+ = i\Psi_- , \qquad e_3.\Psi_+ = J\Psi_- , \qquad e_4.\Psi_+ = -iJ\Psi_- \]
(to see that this is oriented compute $e_1 e_2 e_3 e_4 \Psi_+ = -\Psi_+$). In the sequel we shall always let $X$ and $Y$ be vector fields on $U$ with $|X| = |Y| = 1$ and $X \perp Y$ such that they map $\Psi_+$ to a harmonic spinor under Clifford multiplication.

Lemma 8.2. Let $\omega_X(e_i, e_j) := \langle \nabla_{e_i} X, e_j\rangle - \langle \nabla_{e_j} X, e_i\rangle$. Then the following holds:
\[ \mathrm{div}(X)\Psi_+ + 2\tilde\nabla_X \Psi_+ - \omega_X \Psi_+ = 0 , \]
where $\omega_X \Psi_+ := \sum_{i<j} \omega_X(e_i, e_j)\, e_i e_j \Psi_+$.


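Proof. By definition of $X$, $X.\Psi_+$ is harmonic. Thus:
\[
\begin{aligned}
0 &= \sum_i e_i \tilde\nabla_{e_i}(X.\Psi_+) \\
&= \sum_i \bigl( e_i(\nabla_{e_i} X)\Psi_+ + e_i X \tilde\nabla_{e_i}\Psi_+ \bigr) \\
&= \sum_{i,k} \langle \nabla_{e_i} X, e_k\rangle\, e_i e_k \Psi_+ - 2\tilde\nabla_X \Psi_+ \\
&= -\sum_i \langle \nabla_{e_i} X, e_i\rangle \Psi_+ - 2\tilde\nabla_X \Psi_+ + \sum_{i<k} \omega_X(e_i, e_k)\, e_i e_k \Psi_+ \\
&= -\mathrm{div}(X)\Psi_+ - 2\tilde\nabla_X \Psi_+ + \omega_X \Psi_+ .
\end{aligned}
\]
Of course, the particular choice of ON-frame plays no role here.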

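Lemma 8.3. Let $X$ be as above and $Z$ any smooth vector field which is everywhere orthogonal to $X$. Then
\[ 2\,\mathrm{Re}\,\langle \tilde\nabla_Z \Psi_+, ZX\Psi_+\rangle + \langle \nabla_Z X, Z\rangle |\Psi_+|^2 = 0 . \]

Proof. We may assume that $|Z| = 1$. Equation (ii) of Proposition 4.1 yields
\[
\begin{aligned}
0 &= \langle Z\tilde\nabla_Z \Psi_+, X\Psi_-\rangle + \langle \Psi_+, Z\tilde\nabla_Z(X\Psi_+)\rangle \\
&= -\langle \tilde\nabla_Z \Psi_+, ZX\Psi_-\rangle + \langle \Psi_+, Z(\nabla_Z X)\Psi_+\rangle + \langle \Psi_+, ZX\tilde\nabla_Z \Psi_+\rangle \\
&= -2\,\mathrm{Re}\,\langle \tilde\nabla_Z \Psi_+, ZX\Psi_-\rangle - \langle \nabla_Z X, Z\rangle |\Psi_+|^2 + \langle \Psi_+, ZW\Psi_+\rangle ,
\end{aligned}
\]
where $W := \nabla_Z X - \langle \nabla_Z X, Z\rangle Z$ is orthogonal to both $Z$ and $X$. Thus
\[ \mathrm{Re}\,\langle \Psi_+, ZW\Psi_+\rangle = -\mathrm{Re}\,\langle \Psi_+, ZW\Psi_+\rangle , \]
and hence $\langle \Psi_+, ZW\Psi_+\rangle$ is imaginary-valued. The lemma now follows.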

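It is useful to observe that $\langle \Psi_+, e_i e_j \Psi_+\rangle$ is always imaginary-valued if $i \neq j$. This follows as in the preceding proof.

Lemma 8.4. $\nabla_{e_i} e_j$ is a multiple of $e_i$ in each fibre provided $i \neq j$.

Proof. Let $i < j$. By Lemma 8.2:
\[ \mathrm{div}(e_i)\langle \Psi_+, e_i e_j \Psi_+\rangle + 2\langle \tilde\nabla_{e_i}\Psi_+, e_i e_j \Psi_+\rangle - \langle \omega_{e_i}\Psi_+, e_i e_j \Psi_+\rangle = 0 . \]
Taking real parts we obtain:
\[ 2\,\mathrm{Re}\,\langle \tilde\nabla_{e_i}\Psi_+, e_i e_j \Psi_+\rangle - \mathrm{Re}\,\langle \omega_{e_i}\Psi_+, e_i e_j \Psi_+\rangle = 0 . \]
Now
\[ \mathrm{Re}\,\langle \omega_{e_i}\Psi_+, e_i e_j \Psi_+\rangle = \omega_{e_i}(e_i, e_j)|\Psi_+|^2 \pm \omega_{e_i}(e_k, e_l)|\Psi_+|^2 , \]
where $k < l$ are different from $i, j$, and the sign is taken to be $+$ if $e_k, e_l, e_i, e_j$ is oriented and $-$ otherwise. Plugging in the definition of $\omega_{e_i}$ we obtain
\[ 2\,\mathrm{Re}\,\langle \tilde\nabla_{e_i}\Psi_+, e_i e_j \Psi_+\rangle - \langle \nabla_{e_i} e_i, e_j\rangle |\Psi_+|^2 \pm \bigl( \langle \nabla_{e_k} e_i, e_l\rangle - \langle \nabla_{e_l} e_i, e_k\rangle \bigr)|\Psi_+|^2 = 0 . \]
By the preceding lemma the sum of the first two terms vanishes. Thus
\[ \langle \nabla_{e_k} e_i, e_l\rangle - \langle \nabla_{e_l} e_i, e_k\rangle = -\langle e_i, [e_k, e_l]\rangle = 0 \]
for arbitrary choices of $k, l$ and $i \neq k$ and $i \neq l$. By the Koszul formula [O'N,3.11]
\[ 2\langle \nabla_{e_i} e_j, e_k\rangle = -\langle e_i, [e_j, e_k]\rangle + \langle e_j, [e_k, e_i]\rangle + \langle e_k, [e_i, e_j]\rangle \]
we see that $\nabla_{e_i} e_j \perp e_k$ for $i \neq j$ and $k$ different from both $i, j$. The lemma follows.

Lemma 8.5. The value of $\langle \nabla_{e_k} e_i, e_k\rangle$ is independent of the choice of $k \neq i$.

Proof. To this end let $j \neq k$ and $j \neq i$ and compute
\[
\begin{aligned}
0 &= \langle e_j \tilde\nabla_{e_k}\Psi_+, e_i \Psi_+\rangle + \langle \Psi_+, e_j \tilde\nabla_{e_k}(e_i \Psi_+)\rangle + \langle e_k \tilde\nabla_{e_j}\Psi_+, e_i \Psi_+\rangle + \langle \Psi_+, e_k \tilde\nabla_{e_j}(e_i \Psi_+)\rangle \\
&= 2\,\mathrm{Re}\,\langle \tilde\nabla_{e_k}\Psi_+, e_i e_j \Psi_+\rangle + 2\,\mathrm{Re}\,\langle \tilde\nabla_{e_j}\Psi_+, e_i e_k \Psi_+\rangle + \bigl( \langle \nabla_{e_k} e_i, e_k\rangle - \langle \nabla_{e_j} e_i, e_j\rangle \bigr)\langle \Psi_+, e_j e_k \Psi_+\rangle ,
\end{aligned}
\]
where we have used the preceding lemma. The first two terms are real-valued and the last term is imaginary-valued. Thus
\[ \bigl( \langle \nabla_{e_k} e_i, e_k\rangle - \langle \nabla_{e_j} e_i, e_j\rangle \bigr)\langle \Psi_+, e_j e_k \Psi_+\rangle = 0 . \]
If $e_j e_k \Psi_+ \in \{\pm i\Psi_+\}$ then $\langle \nabla_{e_k} e_i, e_k\rangle = \langle \nabla_{e_j} e_i, e_j\rangle$. Otherwise replace the first $\Psi_+$ in each bracket by $J\Psi_+$ and compute:
\[
\begin{aligned}
0 &= \langle e_j \tilde\nabla_{e_k} J\Psi_+, e_i \Psi_+\rangle + \langle J\Psi_+, e_j \tilde\nabla_{e_k}(e_i \Psi_+)\rangle + \langle e_k \tilde\nabla_{e_j} J\Psi_+, e_i \Psi_+\rangle + \langle J\Psi_+, e_k \tilde\nabla_{e_j}(e_i \Psi_+)\rangle \\
&= \bigl( \langle \nabla_{e_k} e_i, e_k\rangle - \langle \nabla_{e_j} e_i, e_j\rangle \bigr)\langle J\Psi_+, e_j e_k \Psi_+\rangle ,
\end{aligned}
\]
and if now $e_j e_k \Psi_+ \in \{\pm J\Psi_+, \pm iJ\Psi_+\}$, then again $\langle \nabla_{e_k} e_i, e_k\rangle = \langle \nabla_{e_j} e_i, e_j\rangle$.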

Lemma 8.6. If $|\Psi_+|$ is constant on $U$ then the $e_i$ and all harmonic spinors are parallel over $U$.

Proof. If $|\Psi_+|$ is constant on $U$, $\langle \tilde\nabla \Psi_+, \Psi_+\rangle$ is imaginary-valued. Thus $0 = \mathrm{div}(e_i)|\Psi_+|^2$ by Lemma 8.2. Hence $\mathrm{div}(e_i) = 0$. But
\[ \mathrm{div}(e_i) = \sum_{k \neq i} \langle \nabla_{e_k} e_i, e_k\rangle , \]
and by the preceding lemma we see that $\langle \nabla_{e_k} e_i, e_k\rangle = 0$ for all $i, k$, and thus by Lemma 8.4 each $e_i$ is parallel on $U$. Lemma 8.2 then implies that $\Psi_\pm$ are parallel.

Lemma 8.7. $(M, g)$ is conformally flat.

Proof. Given $p \in M$ with $\Psi_+|_p \neq 0$ choose a smooth function $f$ with $f = \tfrac13 \ln|\Psi_+|^2$ in a neighbourhood of $p$. Let $g' := e^{2f} g$. By Remark 2.6 the spinor $\bar\beta_{g',g}\Psi_+$ has constant norm in a neighbourhood of $p$. Thus $g'$ is flat in a neighbourhood of $p$ by the last lemma. Thus the Weyl-tensor for $g$ vanishes on the set of points where $\Psi_+$ does not vanish. But this set is dense in $M$, and hence the Weyl-tensor for $g$ vanishes identically, which means that $(M, g)$ is conformally flat.
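The choice $f = \tfrac13 \ln|\Psi_+|^2$ in the proof of Lemma 8.7 can be motivated by a short computation. The following is a sketch which assumes that Remark 2.6 (not reproduced in this section) is the standard conformal covariance of the Dirac operator in dimension $n$, with $\beta$ denoting the fibrewise isometric identification of the spinor bundles of $g$ and $g' = e^{2f}g$. The symbol $\beta$ and the explicit weight are assumptions of this sketch, not notation taken from the text.

```latex
\begin{align*}
D_{g'}\bigl(e^{-\frac{n-1}{2}f}\,\beta\Psi\bigr) &= e^{-\frac{n+1}{2}f}\,\beta\,D_g\Psi ,\\
\bigl|e^{-\frac{n-1}{2}f}\,\beta\Psi\bigr|_{g'} &= e^{-\frac{n-1}{2}f}\,|\Psi|_g ,\\
n = 4,\quad f = \tfrac13\ln|\Psi_+|^2
 \;&\Longrightarrow\; e^{-\frac32 f}\,|\Psi_+| = |\Psi_+|^{-1}\,|\Psi_+| = 1 .
\end{align*}
```

Under this assumption the rescaled spinor is harmonic for $g'$ and has norm 1 near $p$, which is the hypothesis of Lemma 8.6.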


Remark 8.8. In dimension 5 the situation is considerably more complicated: There are 5-dimensional non-conformally flat Riemannian spin-manifolds the nontrivial spinors of which are all parallel [BFGK, p.150]. Thus this metric is critical but not minimal.

Proof of Theorem 1.3. Note that we may perturb $g$ by a $C^1$-small deformation to get a metric with $h_+(g) \leq 2$ and nonvanishing Weyl-tensor $W(g)$. Hence this metric cannot be critical by the preceding discussion, and we may thus find a small perturbation of $g$ to get a new metric without harmonic spinors.

Proof of Theorem 1.2. We now indicate the changes in the above proof necessary to prove the corresponding result in dimension 3. Thus let $(M, g)$ be an oriented Riemannian 3-manifold with fixed spin-structure such that there is a nontrivial harmonic spinor $\Psi$ for the metric $g$ which is critical in the sense of the remark following Proposition 4.1. Define a local ON-frame $\{e_1, e_2, e_3\}$ such that
\[ e_1 \Psi = i\Psi , \qquad e_2 \Psi = iJ\Psi , \qquad e_3 \Psi = J\Psi . \]
Lemma 8.2 holds, and because $e_1 e_2 e_3 \Psi = -\Psi$ we obtain
\[ \mathrm{div}(X)\Psi + 2\tilde\nabla_X \Psi - \omega_X(e_1, e_2)\, e_3 \Psi + \omega_X(e_1, e_3)\, e_2 \Psi - \omega_X(e_2, e_3)\, e_1 \Psi = 0 . \]
This equation immediately implies
\[ 2\,\mathrm{Re}\,\langle \tilde\nabla_{e_k}\Psi, e_k \Psi\rangle \pm \omega_{e_k}(e_i, e_j)|\Psi|^2 = 0 , \]
where $i < j$ are both different from $k$, and the sign depends on $k$. Note that by (ii) of Proposition 4.1,
\[ \mathrm{Re}\,\langle \tilde\nabla_{e_k}\Psi, e_k \Psi\rangle = -\mathrm{Re}\,\langle e_k \tilde\nabla_{e_k}\Psi, \Psi\rangle = \mathrm{Re}\,\langle \Psi, e_k \tilde\nabla_{e_k}\Psi\rangle = -\mathrm{Re}\,\langle e_k \Psi, \tilde\nabla_{e_k}\Psi\rangle , \]
and hence $\mathrm{Re}\,\langle \tilde\nabla_{e_k}\Psi, e_k \Psi\rangle = 0$. Thus we find $\omega_{e_k}(e_i, e_j) = 0$. Hence $\langle e_k, [e_i, e_j]\rangle = 0$, and as before we conclude that $\nabla_{e_i} e_j$ is a multiple of $e_i$. Lemmata 8.5 to 8.7 remain valid (where we have to replace $\Psi_+$ by $\Psi$, of course). In the proof of Lemma 8.7 we have to replace the Weyl-tensor by the anti-symmetrisation of the Schouten-tensor. We can now argue as before to conclude the proof of Theorem 1.2.

Remark 8.9. In dimension 5 the situation is more complicated: There are 5-dimensional non-conformally flat Riemannian spin-manifolds the nontrivial harmonic spinors of which are all parallel [BFGK,p.150]. Thus these metrics are critical but not minimal and we cannot reproduce the above arguments. Of course, one might try to prove that 5-dimensional closed spin-manifolds with critical but not minimal metric must be isometric to the examples of [BFGK,p.150].

A little more work will yield all the conformal closed oriented 3-manifolds and spin-4-manifolds which admit a critical metric. We first need the following lemma:

Lemma 8.10. Let $U' \subset U \subset \mathbb{R}^n$, $n \geq 3$, be open and connected, and denote by $g$ the Euclidean metric on $\mathbb{R}^n$. Let $f : U' \to \mathbb{R}$ be a function such that $e^{2f} g$ is flat. Then $f$ can be uniquely continued to a function $\phi$ on $U$ with the possible exception of one point such that $e^{2\phi} g$ is a flat metric on $U$.


Proof. By possibly shrinking $U'$ we may find an open set $V' \subset \mathbb{R}^n$ and an isometry $\sigma' : (V', g) \to (U', e^{2f} g)$. $\sigma'$ extends uniquely to a global conformal diffeomorphism $\sigma : S^n \to S^n$ [KuPi2,p.12]. Let $V := \sigma^{-1} U - \{\infty\}$, where we identify $\mathbb{R}^n$ with the sphere minus the North pole $\infty$ via stereographic projection. Thus $\sigma : (V, g) \to (U - \{\sigma(\infty)\}, g)$ is a conformal diffeomorphism, and thus there is a function $\phi : U - \{\sigma(\infty)\} \to \mathbb{R}$ such that $(\sigma^{-1})^* g = e^{2\phi} g$ and $\phi = f$ on $U'$. This was under the assumption of a possibly shrunk $U'$; in order to prove the lemma it thus suffices to show that $\phi$ is unique. Given now two extensions $\phi_1$ and $\phi_2$ of $f$ (defined on all of $U$ with the possible exception of a point for each $\phi_i$) such that $e^{2\phi_i} g$ is flat, choose a connected open set $V \subset U$ on which both $\phi_i$ are defined such that there are open sets $V_i \subset \mathbb{R}^n$ and isometries $\sigma_i : (V_i, g) \to (V, e^{2\phi_i} g)$. If $\phi_1 = \phi_2$ on a connected open subset $V_0$ of $V$, the map $\sigma_2^{-1} \circ \sigma_1 : (\sigma_1^{-1} V_0, g) \to (\sigma_2^{-1} V_0, g)$ is an isometry. Then $\sigma_2^{-1} \circ \sigma_1$ extends uniquely to an isometry of $\mathbb{R}^n$, in particular $\sigma_2^{-1} \circ \sigma_1|_{\sigma_1^{-1} V}$ is an isometry. It follows that $\phi_1 = \phi_2$ on all of $V$. By connectedness of $U - \{\sigma(\infty)\}$ the result follows.

Lemma 8.11. The set of points on which $\Psi_+$ (respectively $\Psi$ in dimension 3) vanishes is discrete.

Proof. Let $U$ be a connected open subset of $M$ which (after possibly conformally rescaling the metric $g$ first) is isometric to some open subset of Euclidean space. Let $U' \subset U$ be an open connected subset on which $\Psi_+$ (respectively $\Psi$) does not vanish. Let $f := \tfrac13 \ln|\Psi_+|^2 : U' \to \mathbb{R}$. Then $e^{2f} g$ is a flat metric on $U'$. By the preceding lemma, $f$ may be continued to a function on all of $U$ with the possible exception of a single point. Thus $\Psi_+$ cannot vanish on $U$ minus that point.

We can now prove the following proposition:

Proposition 8.12. Let $(M, g)$ be a closed Riemannian spin-manifold of either dimension 3 or 4 with fixed spin-structure with harmonic spinors (of both chiralities in dimension 4) such that the metric $g$ is critical for the eigenvalue 0. Then $M$ is a torus and $g$ is conformally equivalent to a flat metric.

Proof. Let $\tilde M$ be the universal cover of $M$ and $F$ the discrete set of points on which $\Psi_+$ ($\Psi$ in the case of dimension 3) vanishes, and let $\tilde F$ be its preimage in $\tilde M$. A standard monodromy argument shows that there is a local isometry $\delta : \tilde M \setminus \tilde F \to \mathbb{R}^n$, $n = \dim M$, where $\tilde M \setminus \tilde F$ is furnished with the metric obtained by pulling back the flat metric $e^{2f} g$ from $M \setminus F$ with $f$ being defined as in the previous lemma. This map uniquely extends to a conformal map $\delta : \tilde M \to S^n$ [KuPi2]. It follows that the holonomy $\Gamma \subset \mathrm{Conf}(S^n)$ of $M$ fixes $\infty$. By Theorem C of [Kam], $(M, g)$ is conformally covered by either $S^n$, $S^{n-1} \times S^1$ or a torus $T^n$ with the natural conformal structures, and the conformal class of $g$ contains a metric of positive scalar curvature in the first two cases and a flat metric in the third. Thus by the standard Weitzenböck formula [L], harmonic spinors can occur only when $M$ is covered by the torus. Replacing $g$ by a conformally equivalent flat metric shows that $\Psi_+$ and $\Psi_-$ (respectively $\Psi$) are parallel. Thus $\Sigma^\pm$ (respectively $\Sigma$) is trivialized by parallel sections, and thus so is $TM$, and hence $(M, g)$ has trivial holonomy. By [Wo,Cor.3.4.6] we conclude that $(M, g)$ is a flat torus.

Remark 8.13. Observe the following fact: Given a metric which is not critical (for the eigenvalue 0), the quadratic form $Q = Q^{g,A}_{\Psi_1,\Psi_2} + \bar Q^{g,A}_{\Psi_2,\Psi_1}$ (in dimension 4) does not vanish identically for any choice of harmonic spinors $\Psi_\pm$. Otherwise, the above arguments go through to show that $(M, g)$ is conformally flat and is in fact a torus with the flat conformal structure as we shall see in a moment. In particular the metric would be critical. We may thus assume that $Q$ does not vanish identically. Fix some open subset $U \subset M$. If $Q|_U$ did vanish identically we could again conclude that $(U, g|_U)$ is conformally flat. By a slight perturbation of $g$ within $U$ we may assume that this is not the case and that $Q$ does not vanish identically on $U$. Let $\phi \geq 0$ be a smooth function supported in $U$ such that $\phi Q$ does not vanish identically. By Equation 4.1.1 the quadratic form $\mathrm{Re}(\phi Q)$ defines a deformation direction along which $\dim\mathrm{Ker}\,D_g$ decreases. The same arguments go through in dimensions 2 and 3, and inspection of the proof of Theorem 1.4 shows that we may argue similarly for deformations of connections on the canonical bundle. Thus in total we have:

Proposition 8.14. Let $\dim M = 2$, $3$ or $4$ and $U$ an open set of $M$. Given a metric $g$ which is not minimal we may find a minimal metric $g'$ which is $C^1$-close to $g$ and is equal to $g$ outside $U$. Let $\dim M = 1, \ldots, 4$. Given a connection $A$ on the canonical bundle which is not minimal we may find a minimal connection $A'$ which is $C^0$-close to $A$ and is equal to $A$ outside $U$.

This extends an observation of [Hi,p.45].

9. Critical Eigenvalues $\neq 0$

In this section we shall prove a partial converse to [BG,Prop.29], which asserts that eigenvalues which admit a Killing spinor are critical for all variations of the metric which preserve the total volume.

Proposition 9.1. Let $M$ be a closed oriented 2- or 3-manifold with a fixed spin-structure. If for some metric $g$ on $M$ some eigenvalue $\lambda \neq 0$ is critical for variations of the metric which preserve the total volume, then $(M, g)$ is covered by the round sphere (up to rescaling by a constant factor). In dimension 2, $(M, g)$ is isometric to the round sphere.

Corollary 9.2. Let $M$ be a closed oriented 2- or 3-manifold with fixed spin-structure. Fix $\lambda \neq 0$. The set of metrics with given total volume for which $\lambda$ is not an eigenvalue of $D_g$ is $C^1$-generic.
Proof of the Proposition. Let $\Psi$ be a nontrivial eigenspinor for the eigenvalue $\lambda$. By Proposition 4.1 the norm of $\Psi$ is constant and we may assume it to be $= 1$. Let
\[ \omega(X, Y) := \mathrm{Re}\,\langle X.\tilde\nabla^g_Y \Psi, \Psi\rangle - \frac{\lambda}{m}\, g(X, Y) \]
for arbitrary vector fields $X$ and $Y$. By (ii) of Proposition 4.1 $\omega$ is a 2-form. By $T \subset \Sigma$ denote the image of $TM$ under Clifford multiplication with $\Psi$, i.e. $T = TM.\Psi$. Let $H$ be the orthogonal complement of $\mathbb{R}\Psi \oplus T$ with respect to the metric $\mathrm{Re}\,\langle\,\cdot\,,\cdot\,\rangle$. Set $\tilde\nabla^H \Psi := \mathrm{pr}_H \tilde\nabla\Psi$, where $\mathrm{pr}_H$ denotes orthogonal projection onto $H$. For a local ON-frame $\{e_1, \ldots, e_m\}$
\[ \tilde\nabla^g_{e_i}\Psi = \sum_{j \neq i} \omega(e_i, e_j)\, e_j.\Psi - \frac{\lambda}{m}\, e_i.\Psi + \tilde\nabla^H_{e_i}\Psi . \]
Multiply this with $e_i$ and sum over $i$ to obtain
\[ 0 = 2\sum_{i<j} \omega(e_i, e_j)\, e_i.e_j.\Psi + \sum_i e_i.\tilde\nabla^H_{e_i}\Psi . \]
Consider the case $m = 2$: Then $e_1 e_2.\Psi$ is a local section of $H$. Now the previous formula reads
\[ 0 = 2\omega(e_1, e_2)\, e_1.e_2.\Psi + \mathrm{Re}\,\langle \tilde\nabla^H_{e_2}\Psi, e_1 e_2.\Psi\rangle\, e_1.\Psi - \mathrm{Re}\,\langle \tilde\nabla^H_{e_1}\Psi, e_1 e_2.\Psi\rangle\, e_2.\Psi . \]
Thus all coefficients vanish and hence $\tilde\nabla^g_X \Psi = -\frac{\lambda}{2} X.\Psi$.

Now consider $m = 3$: Here $H = \{0\}$ because the real fibre dimension of $\Sigma$ is 4. Using $e_1 e_2 e_3.\Psi = -\Psi$ we obtain
\[ 0 = \omega(e_1, e_2)\, e_3.\Psi + \omega(e_2, e_3)\, e_1.\Psi + \omega(e_3, e_1)\, e_2.\Psi \]
and thus $\omega = 0$, whence $\tilde\nabla^g_X \Psi = -\frac{\lambda}{3} X.\Psi$.

Thus if $m = 2$ or $3$ then $\Psi$ is a Killing-spinor. The proposition now follows from [BFGK,Th.8,p.31] because in dimensions 2 and 3 Einstein metrics of constant scalar curvature are in fact constant curvature metrics.

Remark 9.3. In dimension 3, one might be tempted into believing that $M$ must in fact be the sphere. This, however, is not the case: Identify $S^3 \cong SU_2$ and let $E = (e_1, e_2, e_3)$ be a left-invariant ON-frame on $SU_2$, where the $e_i$ satisfy the relations
\[ [e_i, e_j] = 2\mu e_k , \qquad \mu \in \mathbb{R}^* \]
for cyclic permutations $(i, j, k)$ of $(1, 2, 3)$. Then $\nabla_{e_i} e_j = \mu e_k$. Let $\Gamma$ be a discrete subgroup of $SU_2$ and $M := \Gamma \backslash SU_2$ the quotient. The metric on $SU_2$ and $E$ descend to $M$. View $E$ as a section $E : M \to P_{SO} M$. Lift $E$ to a section $\tilde E$ of the $\mathrm{Spin}_3$-bundle $P_{Spin} M$ associated with the trivial spin-structure. Fix $v \in \mathbb{C}^2$ and let $\Psi$ be the section given by
\[ \Psi(m) = [\tilde E(m), v] \in (P_{Spin} M \times \mathbb{C}^2)/SU_2 = P_{Spin}(M) \times_{\mathrm{rep}} \mathbb{C}^2 , \]
where $SU_2 = \mathrm{Spin}_3$ acts via $u.(g, v) := (gu, u^{-1} v)$. Then
\[ \tilde\nabla \Psi = \frac12 \sum_{i<j} \omega_{ji}\, e_i e_j \Psi , \qquad \omega_{ji} = \langle \nabla e_i, e_j\rangle = \mu e_k^* . \]
Recalling that $e_1 e_2 e_3.\Psi = -\Psi$ it is now immediate that $\Psi$ is a Killing spinor.

10. A Remark on Seiberg–Witten Moduli Spaces

One motivation for studying generic metrics and connections on spin$^c$-manifolds comes from Seiberg–Witten theory. Let $M$ be a closed oriented 4-manifold with $b_+ \geq 1$, and fix a spin$^c$-structure on $M$ with canonical bundle $L$. For a given metric $g$ on $M$ and a self-dual 2-form $\eta$ the Seiberg–Witten equations are equations for a connection $A$ on $L$ and a section $\Psi$ of $\Sigma^+$:
\[ D_{g,A}\Psi = 0 , \qquad \rho(F_A^+) = \sigma(\Psi, \Psi) + \rho(i\eta) , \]


where $F_A^+$ is the self-dual part of the curvature. The map $\rho : i\Omega^+(M) \to \mathrm{End}_0(\Sigma^+)$ is given by Clifford multiplication and the bilinear map $\sigma : \Sigma^+ \otimes \Sigma^+ \to \mathrm{End}_0(\Sigma^+)$ is defined as follows:
\[ \sigma(\Psi_1, \Psi_2) := \Psi_1 \otimes \Psi_2^* - \frac12 \mathrm{Tr}(\Psi_1 \otimes \Psi_2^*)\,\mathrm{Id} . \]
For $(g, \eta)$ in a dense open subset $D \subset \mathcal{M} \times \Omega^+(M)$ the space of solutions is a smooth manifold which contains only irreducible solutions, i.e. solutions $(\Psi, A)$ with $\Psi \not\equiv 0$.

One may ask whether the set consisting of pairs $(g, A)$, where $A$ comes from a solution of the Seiberg–Witten equations for a fixed pair $(g, \eta)$, is disjoint from the set of pairs $(g, B) \in \mathcal{M} \times \mathcal{A}$ of metrics and connections for which the space of harmonic spinors is larger than required by the index of the Dirac operator. Here $(g, \eta)$ are parameters which we will choose in $D$, i.e. the corresponding Seiberg–Witten moduli space contains irreducible solutions only and is smooth. By Theorem 1.4 we know that for a generic pair $(g, B)$ the dimension of the space of harmonic spinors is indeed equal to the absolute value of the index. In particular, if the index is negative there are no nontrivial positive harmonic spinors for the generic pair $(g, B)$.

Now let the index be nonpositive, i.e. $c_1(L)^2 - \sigma(M) \leq 0$. Suppose there is a pair $(g, \eta)$ of parameters with at least one nontrivial solution $(\Psi, A)$ to the Seiberg–Witten equations such that $(g, A)$ is generic. Because the index is nonpositive Theorem 1.4 implies that $\Psi \equiv 0$, and hence $(g, \eta) \notin D$.

Suppose we are given a spin$^c$-structure $c$ with $c_1(L)^2 - \sigma(M) \leq 0$. If the Seiberg–Witten invariant $SW_{g,\eta}(c)$ is nontrivial (for parameters $(g, \eta) \in D$) then for any nontrivial solution $(\Psi, A)$ of the Seiberg–Witten equations the pair $(g, A)$ is not generic by the preceding argument. Thus if we could show that whenever $SW_{g,\eta}(c)$ is nontrivial we can find at least one solution $(\Psi, A)$ for which $(g, A)$ is generic then $c_1(L)^2 - \sigma(M) > 0$.

Problem. Is it true that if for a given spin$^c$-structure $c$ the Seiberg–Witten invariant $SW_{g,\eta}(c)$ is nontrivial, the index of the Dirac operator is positive, i.e. $c_1(L)^2 - \sigma(M) > 0$?

That this is not true in general will be shown in the following proposition. This proposition may also be interpreted as saying that if the answer to the problem should be affirmative for $b_+ > 1$ then there is no proof which relies on infinitesimal arguments, i.e. there is no proof which tries to argue that a $C^\infty$-small deformation of both $g$ and $\eta$ may be found such that $(g, A)$ is generic for some solution $A$ of the equations. For such an argument would also apply to the case $b_+ = 1$.

Proposition 10.1. Let $M$ be a geometrically ruled surface over the curve $C$. There is a connected open set $U \subset \mathcal{M} \times \Omega^+(M)$ such that for $(g, \eta) \in U$ the moduli space of solutions for the anti-canonical spin$^c$-structure $c_{can}$ contains no reducible solution and $SW_{g,\eta}(c_{can}) \neq 0$. Furthermore $b_+(M) = 1$ and the index of the Dirac operator is negative provided $\mathrm{genus}(C) > 1$. In particular, if the connection $A$ comes from a solution to the Seiberg–Witten equations for the parameters $(g, \eta)$ then $(g, A)$ is not contained in the generic set.

Note that the anti-canonical spin$^c$-structure has as canonical bundle $L = K := \Lambda^{2,0}$ [LM,App.D].

Proof. Because $M$ is Kähler and $p_g(M) = 0$, $b_+(M) = 1$. The set of pairs $(g, \eta) \in \hat U \subset \mathcal{M} \times \Omega^+(M)$ for which the Seiberg–Witten moduli space contains no reducible solution is open. The dimension of the moduli space is 0. Let $F$ be a fibre. Then because $F$ has trivial normal bundle, $c_1(M)|_F = c_1(F)$ and $c_1(K)|_F = c_1(K_F) = -c_1(F)$. Now $F$ is $\mathbb{CP}^1$, thus $c_1(F) = 2$. It follows that
\[ \left( \frac{c_1(M) + 2c_1(K)}{2}\,[F] \right)^{\mathrm{genus}(C)} = \pm 1 . \]
Corollary 1.4 of [LL] now implies that there is one component $U \subset \hat U$ such that for $(g, \eta) \in U$ the Seiberg–Witten invariants satisfy $SW_{g,\eta}(c_{can}) \neq 0$. Note that the canonical bundle of the spin$^c$-structure $c$ is $L = K = \Lambda^{2,0}(M)$. Furthermore, by [Beau,Prop.III.21] $c_1(K)^2 = 8(1 - \mathrm{genus}(C))$ and the signature of $M$ is $\sigma = 0$. Hence the index of the Dirac operator is $(c_1(K)^2 - \sigma)/8 < 0$ for $\mathrm{genus}(C) > 1$. For a solution $(\Psi, A)$ of the Seiberg–Witten equations for parameters $(g, \eta) \in U$ the pair $(g, A)$ cannot be in the generic set because $\Psi \not\equiv 0$.

11. Appendix: Analytic Families of Differential Operators

This section is technical in nature and serves to prove a very simple analyticity theorem for differential operators. This theorem is a formalization of the proof of [Ber,Lemme 3.15]. Equivalent statements have also been proven independently in [Ang,Th.1.1,Th.1.2] with similar applications as in this paper. In the sequel let $E$ and $F$ always denote smooth $\mathbb{C}$-vectorbundles over a closed manifold $M$ with $\dim M = m$. On $M$ a smooth measure shall be fixed once and for all. Let $\alpha$ always denote a multiindex in $\mathbb{N}^k$. By $\Gamma(E)$ we denote the space of (possibly discontinuous) sections of $E$.

Definition. We say that $s_y \in \Gamma(E) = \Gamma(M, E)$ depends analytically on $y \in Y \subset \mathbb{R}^n$ if for fixed $p \in M$ the map $y \mapsto s_y(p) \in E_p$ is analytic in a uniform manner, i.e. for every $y_0 \in Y$ there are $s_\alpha \in \Gamma(E)$ and $R > 0$ with $B_R(y_0) \subset Y$ and
\[ s_y = \sum_\alpha \frac{1}{\alpha!}\, s_\alpha (y - y_0)^\alpha , \qquad |y - y_0| < R . \]

If $\phi \in \Gamma(\mathrm{Hom}(E, E))$ then the analyticity of $s_y$ implies the analyticity of $\phi(s_y)$. The definition is local in nature, i.e. if $V_1 \cup V_2 = M$ and $s_y$ is analytic over both $V_i$ then $s_y$ is also analytic over $M$. If $M$ is compact the definition of analyticity of $s_y$ is equivalent to demanding that the coordinate functions in any local trivialization be analytic in a uniform manner. Thus for most arguments it suffices to consider functions which depend analytically on a parameter. Given $s_y \in \Gamma(M \times \mathbb{C})$, where $y \in D_R := \{(x_1, \ldots, x_n) \in \mathbb{R}^n,\ |x_i| < R\}$, and $s_\alpha \in \Gamma(M \times \mathbb{C})$ with $s_y = \sum_\alpha \frac{1}{\alpha!} s_\alpha y^\alpha$ we have for fixed $p \in M$ and $0 < R' < R$ the Cauchy integral formula:
\[
\begin{aligned}
s_y(p) = s(p, y) &= \frac{1}{(2\pi i)^n} \int_C \frac{s(p, \zeta_1, \ldots, \zeta_n)}{(\zeta_1 - y_1) \cdots (\zeta_n - y_n)}\, d\zeta_1 \ldots d\zeta_n , \\
s_\alpha(p) &= \frac{\alpha!}{(2\pi i)^n} \int_C \frac{s(p, \zeta_1, \ldots, \zeta_n)}{\zeta_1^{\alpha_1 + 1} \cdots \zeta_n^{\alpha_n + 1}}\, d\zeta_1 \ldots d\zeta_n ,
\end{aligned}
\]
where $C := C_1 \times \ldots \times C_n$ with $C_j(t) = R' e^{2\pi i t_j}$.
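The coefficient formula can be illustrated numerically in the one-variable case $n = 1$. The sketch below is not part of the paper: it approximates the contour integral over $|\zeta| = R'$ by an equally spaced sum and multiplies by $k!$ to match the normalisation $s_y = \sum_k \frac{1}{k!} s_k y^k$. The function names, the radius and the sample count are illustrative assumptions.

```python
import cmath
import math


def series_coefficient(s, k, radius=0.5, samples=256):
    """Approximate (1/(2*pi*i)) * integral of s(zeta)/zeta**(k+1) over |zeta| = radius."""
    total = 0j
    for j in range(samples):
        zeta = radius * cmath.exp(2j * math.pi * j / samples)
        # d(zeta) = i * zeta * d(theta), so the integrand reduces to s(zeta) / zeta**k
        total += s(zeta) / zeta ** k
    return total / samples


def s_alpha(s, k, **kwargs):
    """Coefficient s_k in the normalisation s(y) = sum_k s_k * y**k / k!."""
    return math.factorial(k) * series_coefficient(s, k, **kwargs)


if __name__ == "__main__":
    # All derivatives of exp at 0 are equal to 1.
    print(abs(s_alpha(cmath.exp, 3) - 1))
    # For 1/(1 - y) the k-th derivative at 0 is k!.
    print(abs(s_alpha(lambda z: 1 / (1 - z), 4) - 24))
```

For a function analytic on a disc of radius larger than $R'$ the equally spaced sum converges geometrically in the number of samples, which is why a few hundred points suffice here.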


For the moment we shall work in a fixed coordinate system on some open subset $U$ of $M$. Suppose $s_y(p) = s(p, y)$ is differentiable in the $p$-coordinate for each fixed $y$ and assume furthermore that $D_1 s : U \times Y \to \mathrm{Hom}_{\mathbb{R}}(TU, \mathbb{C})$ is continuous jointly in both variables. Suppose inductively that this holds for all $D_1^\alpha s(p, y)$ with $|\alpha| < j$ and $1 \leq j \leq k$. Then the Cauchy integral formula shows that for $|\gamma| \leq k$, $D_1^\gamma s(p, y)$ is analytic in $y$, the $s_\alpha$ are in $C^k$ and
\[ D_1^\gamma s(p, y) = \sum_\alpha \frac{1}{\alpha!}\, D^\gamma s_\alpha\, y^\alpha . \]
Thus for fixed $\varepsilon > 0$ we may find $q \in \mathbb{N}$ such that for $|y_i| < R' < R$,
\[ \Bigl\| s_y(p) - \sum_{|\alpha| \leq q} \frac{1}{\alpha!}\, s_\alpha y^\alpha \Bigr\|_{C^k} < \varepsilon . \]
If $M$ is compact we thus have the following

Lemma 11.1. Let $s_y \in \Gamma(E)$ depend analytically on $y \in Y \subset \mathbb{R}^n$ such that $s$ is in $C^k(M \times Y, E)$. Given $\varepsilon > 0$ and $y_0 \in Y$ we may choose $R > 0$ and $q \in \mathbb{N}$ such that $B_R(y_0) \subset Y$ and
\[ \Bigl\| s_y(p) - \sum_{|\alpha| \leq q} \frac{1}{\alpha!}\, s_\alpha (y - y_0)^\alpha \Bigr\|_{C^k} < \varepsilon , \qquad |y - y_0| < R . \]

Lemma 11.2. Let $E$ and $F$ be smooth complex vector bundles over $M$, and let $\phi_y \in C^\infty(M \times Y, \mathrm{Hom}(E, F))$ be a section which depends analytically on $y$. Then $\phi_y$ defines a continuous linear map $H_s(E) \to H_s(F)$ which depends analytically on $y$, where $H_s$ denotes the Sobolev space of order $s \in \mathbb{R}$.

Proof. Choose connections $\nabla_E$ and $\nabla_F$ in $E$ and $F$ respectively and (hermitian) metrics on both vector bundles. Then for $S \in C^\infty(E)$ and $k \in \mathbb{N}$:
\[ \|\nabla_E^k(\phi_y S)\|_{L^2(F)} \leq \mathrm{const.}\,\|\phi_y\|_{C^k} \Bigl( \sum_{l=0}^k \|\nabla_E^l S\|_{L^2(E)} \Bigr) . \]
Hence $\|\phi_y S\| \leq \mathrm{const.}\,\|\phi_y\|_{C^k} \|S\|_{H_k}$. Thus $\phi_y : H_k(E) \to H_k(F)$ is continuous and its operator norm is bounded by a constant multiple of $\|\phi_y\|_{C^k}$. For $k \in -\mathbb{N}$ this follows by duality, and for $s \in \mathbb{R}$, $\phi_y$ is continuous with operator norm bounded by a constant multiple of $\|\phi_y\|_{C^k}$ for $k \in \mathbb{N}$ with $k \geq |s|$, by the interpolation argument of [Fo,3.21]. By the previous lemma, given $k \in \mathbb{N}$ and $y_0 \in Y$ we may find $R > 0$ and $q \in \mathbb{N}$ such that
\[ \Bigl\| \phi_y(p) - \sum_{|\alpha| \leq q} \frac{1}{\alpha!}\, \phi_\alpha (y - y_0)^\alpha \Bigr\|_{C^k} < \varepsilon , \qquad |y - y_0| < R \]
for given $\varepsilon > 0$. Thus $\sum_{|\alpha| \leq q} \frac{1}{\alpha!} \phi_\alpha (y - y_0)^\alpha$ converges uniformly to $\phi_y$ in the operator norm. This proves the lemma.


Definition. Let $D_y : C^\infty(E) \to C^\infty(F)$ be a differential operator of order $q$ on a compact manifold $M$ which depends upon a variable $y \in U$, where $U \subset \mathbb{R}^n$ is open. We say that $D_y$ depends analytically on $y$ if in every local trivialization of $E$ and $F$ over $V \subset M$
\[ D_y = \sum_{|\alpha| \leq q} A_{\alpha,y}\, \frac{\partial^\alpha}{\partial x^\alpha} , \]
where the $A_{\alpha,y}$ are analytic uniformly in $y$ and are smooth jointly in both variables. This definition clearly is independent of the particular trivializations chosen.

Proposition 11.3. An analytic differential operator $D_y : C^\infty(E) \to C^\infty(F)$ of order $q$ extends to a bounded linear operator $D_y : H_{s+q}(E) \to H_s(F)$ which is analytic in $y$.

Proof. We may think of $D_y$ as a section $d_y$ of $\mathrm{Hom}(J^q E, F)$, where $J^q E$ denotes the $q$th jet-bundle of $E$. This section clearly is analytic in $y$. If $j_q : C^\infty(E) \hookrightarrow C^\infty(J^q E)$ denotes the standard inclusion then $D_y = d_y \circ j_q$. The map $d_y$ extends to an analytic bounded linear map $H_s(J^q E) \to H_s(F)$ and $j_q$ extends to a bounded linear map $H_{s+q}(E) \to H_s(J^q E)$. Thus their composition is an analytic bounded linear map $H_{s+q}(E) \to H_s(F)$.

The following proposition is the upshot of the preceding discussion. This proposition has also been proved in [Ang,Th.1.1] for perturbations of order smaller than the order of $D$.

Proposition 11.4. Let $D_t$, $t \in (a, b)$, be an analytic family of differential operators of order $q$ acting on the smooth sections of a complex vectorbundle $E$ over $M$ such that $D_t$ is elliptic for each $t$. Let $\mu := \min\{\dim\mathrm{Ker}(D_t),\ t \in (a, b)\}$. Then the set $T := \{t \in (a, b),\ \dim\mathrm{Ker}(D_t) > \mu\}$ is discrete.

Proof. If $s \notin T$ then for all $t$ in a neighbourhood of $s$, we have $t \notin T$, by upper semicontinuity. This in particular implies that the set $T$ is closed. Fix any $s \in (a, b)$ in the boundary of $T$. Split $H_q(E)$ orthogonally as $K \oplus H$, where $K := \mathrm{Ker}(D_s)$, and split $H_0(E)$ orthogonally as $C \oplus D$, where $C := \mathrm{Coker}(D_s)$. We may decompose $D_t$ as
\[ D_t = \begin{pmatrix} a_t & b_t \\ c_t & d_t \end{pmatrix} \]
with respect to this splitting, where $d_t : H \to D$ is invertible for $t$ near $s$. Let $k := \dim\mathrm{Ker}\,D_s - \mu > 0$. Set $R(t) := b_t \circ d_t^{-1} \circ c_t - a_t$ and note that $\dim\mathrm{Ker}(D_t) = \dim\mathrm{Ker}(D_s)$ if and only if $R(t) = 0$ [Kos]. Because $s$ is in the boundary of $T$, there is a $(k \times k)$-minor of $R(t)$ with nonvanishing determinant for a set of points with $s$ as an accumulation point. But $R(t)$ depends analytically on $t$, and thus this minor has nonvanishing determinant at $t \neq s$ in a neighbourhood of $s$. Thus there is an open interval $(t_1, t_2)$ with $a < t_1 < s < t_2 < b$ such that for $t \in (t_1, t_2) \setminus \{s\}$ we have $t \notin T$. Thus $T$ is discrete.

Acknowledgement. I wish to thank D. Kotschick for introducing me to the problems discussed here. I also appreciate the many useful discussions with C. Bär. Thanks are also due to M. Slupinski for an invitation to the IRMA, Strasbourg. I am grateful to Th. Friedrich for useful comments and for pointing out an error in an earlier version of this paper.


References

[ACGH] Arbarello, E., Cornalba, M., Griffiths, P.A., Harris, J.: Geometry of Algebraic Curves. Volume I, Berlin-Heidelberg-New York: Springer Grundlehren Band 267, 1985
[Ang] Anghel, N.: Generic vanishing for harmonic spinors of twisted Dirac operators. To appear in Proc. Am. Math. Soc.
[At] Atiyah, M.: Riemann surfaces and spin structures. Ann. Scient. Éc. Norm. Sup., 4e série, tome 4, 47–62 (1971)
[AtSi] Atiyah, M., Singer, I.M.: The index of elliptic operators: V. Ann. of Math. 93, 139–149 (1971)
[Bär1] Bär, C.: Metrics with harmonic spinors. Preprint, University of Freiburg, 1995, to appear in GAFA
[Bär2] Bär, C.: Harmonic Spinors for Twisted Dirac Operators. Sfb 288 Preprint No. 180, Berlin
[Bär3] Bär, C.: Real Killing Spinors and Holonomy. Commun. Math. Phys. 154, 509–521 (1993)
[BärP] Bär, C.: Private communication
[BaS] Bär, C., Schmutz, P.: Harmonic Spinors on Riemann Surfaces. Ann. Glob. Anal. and Geom. 10, 263–273 (1992)
[Be] Besse, A.L.: Einstein Manifolds. Springer Ergebnisse 3. Folge Band 10, Berlin-Heidelberg-New York: Springer, 1987
[Ber] Berger, M.: Sur les premières valeurs propres des variétés Riemanniennes. Comp. Math., Vol. 26, Fasc. 2, 129–149 (1973)
[Beau] Beauville, A.: Complex Algebraic Surfaces. LMS Lecture Notes Series: 68, Cambridge: Cambridge University Press, 1983
[BFGK] Baum, H., Friedrich, T., Grunewald, R., Kath, I.: Twistors and Killing Spinors on Riemannian manifolds. Texte zur Mathematik Band 124, Stuttgart-Leipzig: B.G. Teubner, 1991
[BG] Bourguignon, J.-P., Gauduchon, P.: Spineurs, Opérateurs de Dirac et Variations de Métriques. Commun. Math. Phys. 144, 581–599 (1992)
[BW] Booss-Bavnbek, B., Wojciechowski, K.P.: Elliptic Boundary Problems for Dirac Operators. Basel-Boston: Birkhäuser, 1993
[Do] Donaldson, S.K., Kronheimer, P.B.: The Geometry of 4-Manifolds. Oxford: Oxford University Press, 1990
[Far] Farkas, H.: Special divisors and analytic subloci of Teichmüller space. Am. J. Math. 88, 881–901 (1966)
[Fo] Folland, G.B.: Lectures on Partial Differential Equations. Tata Institute, 1983
[GL] Gromov, M., Lawson, H.B.: The classification of simply connected manifolds of positive scalar curvature. Ann. of Math. 111, 423–434 (1980)
[GR] Grauert, H., Remmert, R.: Coherent Analytic Sheaves. Grundlehren Band 265, Berlin-Heidelberg-New York: Springer, 1984
[Gra] Grauert, H.: Ein Theorem der analytischen Garbentheorie und die Modulräume komplexer Strukturen. Publ. Math. IHES, No 5, 1960
[Gro] Grothendieck, A.: Techniques de construction en géométrie analytique I. Exposée 7, Séminaire Henri Cartan, 13e année, 1960/61
[Gu] Gunning, R.C.: Lectures on Riemann Surfaces - Jacobi Varieties. Princeton, 1972
[Hi] Hitchin, N.: Harmonic Spinors. Adv. in Math. 14, 1–55 (1974)
[Hij1] Hijazi, O.: A conformal lower bound for the smallest eigenvalue of the Dirac operator and Killing spinors. Commun. Math. Phys. 104, 151–162 (1986)
[Hij2] Hijazi, O.: Charactérisation de la sphère standard par les premières valeurs propres de l’operateur de Dirac en dimension 3,4,7 et 8. C.R. Acad. Sc. Paris, t. 303, Série I, no 9, 417–419 (1986)
[K2] Kotschick, D.: Non-trivial harmonic spinors on generic algebraic surfaces. To appear in Proc. Am. Math. Soc.
[K3] Kotschick, D.: The Seiberg–Witten Invariants of Symplectic Four-Manifolds. Séminaire Bourbaki, 48ième année, 1995-96, n° 812
[Kam] Kamishima, Y.: Conformally Flat Manifolds. Trans. Am. Math. Soc., Vol. 294, Nr. 2, 607–623 (1986)
[Kos] Koschorke, U.: Infinite dimensional K-theory and characteristic classes of Fredholm bundle maps. In: Global Analysis, Proceedings of Symposia in Pure Math. Vol. 15, 1970
[KrMr] Kronheimer, P., Mrowka, T.: The genus of embedded surfaces in the projective plane. Math. Res. Letters 1, 797–808 (1994)
[KuPi2] Kulkarni, R., Pinkall, U. (Editors): Conformal Geometry. Aspects of Mathematics, Vieweg, 1988
[L] Lichnerowicz, A.: Spineurs harmonique. C. R. Acad. Sci. Paris, Ser. A-B 257, 7–9 (1963)
[LL] Li, T.J., Liu, A.: General Wall Crossing Formula. Math. Res. Letters 2, 797–810 (1995)

[LM] Lawson, H.B., Michelsohn, M.-L.: Spin Geometry. Princeton, NY: Princeton University Press, 1989
[Mor] Morgan, J.W.: The Seiberg–Witten equations and applications to the topology of four-manifolds. Princeton, NY: Princeton University Press, 1995
[Mum] Mumford, D.: Theta-characteristics of an algebraic curve. Ann. Scient. Éc. Norm. Sup., 4e série, fasc. 2, 181–191 (1971)
[O’N] O’Neill, B.: Semi-Riemannian Geometry with Applications to Relativity. New York: Academic Press, 1983
[Sto] Stolz, S.: Simply connected manifolds of positive scalar curvature. Ann. of Math. 136, 511–540 (1992)
[Tr] Tromba, A.J.: Teichmüller Theory in Riemannian Geometry. Basel-Boston: Birkhäuser Verlag, 1992
[W] Wells, R.O.: Differential Analysis on Complex Manifolds. Springer Graduate Text 65, Berlin-Heidelberg-New York: Springer, 1973
[Wit] Witten, E.: Monopoles and 4-Manifolds. Math. Res. Letters 1, 769–796 (1994)
[Wo] Wolf, J.A.: Spaces of Constant Curvature. New York: McGraw-Hill, 1967

Communicated by S.-T. Yau

Commun. Math. Phys. 188, 439 – 448 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Yang-Mills and Dirac Fields with Inhomogeneous Boundary Conditions

Günter Schwarz1,*, Jędrzej Śniatycki2,**, Jacek Tafel3,***

1 Department of Mathematics and Computer Science, University of Mannheim, 68131 Mannheim, Germany
2 Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
3 Institute of Theoretical Physics, University of Warsaw, Hoża 69, 00-681 Warsaw, Poland

Received: 18 April 1996/Accepted: 3 March 1997

Abstract: Finite time existence and uniqueness of solutions of the evolution equations of minimally coupled Yang-Mills and Dirac systems are proved for inhomogeneous boundary conditions. A characterization of the space of solutions of minimally coupled Yang-Mills and Dirac equations is obtained in terms of the boundary data and the Cauchy data satisfying the constraint equation. The proof is based on a special gauge fixing and a singular perturbation result for the existence of continuous semigroups.

1. Introduction

Minimally interacting Yang-Mills and Dirac fields with homogeneous bag boundary conditions have been studied in [1] and [2]. In this paper we extend the results on the finite time existence and uniqueness of solutions for inhomogeneous boundary data. In this way we get a complete description of solutions of the evolution Eqs. (1.3) through (1.5) for minimally coupled Yang-Mills and Dirac fields in bounded domains. We consider the Yang-Mills system with a compact structure group $G$ on a spatially bounded region $\mathbb{R} \times M$ of the space-time, where $M$ is a contractible bounded domain in $\mathbb{R}^3$. The 3+1 splitting of the space-time yields a splitting of the Yang-Mills potential $A_\mu$ into its spatial component $A$ (treated as a time dependent vector field on $M$ with values in the structure algebra $\sigma$) and the time component $A_0$. This leads to a representation of the field strength $F_{\mu\nu}$ in terms of the "electric" component $E = E_i\,dx^i$ and the "magnetic" field $B = B_i\,dx^i$, where
$$E_i = F_{0i} \quad\text{and}\quad B_i = \tfrac{1}{2}\,\epsilon_{ijk} F_{jk}. \tag{1.1}$$

* Research partially supported by DFG Grant Schw. 485/3-1.
** Research partially supported by NSERC Research Grant SAP0008091.
*** Research partially supported by KBN Grant 2 P302 112 07.


The Dirac field is described by time dependent 4-spinors $\Psi$ with values in the vector space $V_G$ of the fundamental representation of $G$. Let
$$D = -\gamma^0(\gamma^k \partial_k + im) \tag{1.2}$$
be the free Dirac operator in $M \subset \mathbb{R}^3$. The minimally coupled Yang-Mills and Dirac equations split into the evolution equations
$$\partial_t A = E + \operatorname{grad} A_0 - [A_0, A], \tag{1.3}$$
$$\partial_t E = -\operatorname{curl} B - [A \times, B] - [A_0, E] + J, \tag{1.4}$$
$$\partial_t \Psi = D\Psi - (\gamma^0\gamma^k A_k + A_0)\Psi, \tag{1.5}$$
and the constraint equation
$$\operatorname{div} E + [A; E] = J^0. \tag{1.6}$$
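As a consistency check on (1.3) through (1.6), consider the abelian case. This special case is an assumption made here for illustration only; it is not treated separately in the text. If the structure algebra is commutative, all brackets vanish and the magnetic field of (1.7) reduces to $B = \operatorname{curl} A$, so the Yang-Mills part of the system becomes a Maxwell-type system driven by the current:

```latex
\begin{align*}
\partial_t A &= E + \operatorname{grad} A_0, \\
\partial_t E &= -\operatorname{curl}\operatorname{curl} A + J, \\
\operatorname{div} E &= J^0 .
\end{align*}
% Take the divergence of the second equation and use div curl = 0:
\[
\partial_t (\operatorname{div} E) = \operatorname{div} J .
\]
% Hence the constraint is propagated in time exactly when
\[
\partial_t J^0 = \operatorname{div} J .
\]
```

In this simplified setting the constraint (1.6) is therefore preserved by the evolution precisely when the current satisfies this continuity equation, written in the sign convention of (1.4).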

With $\{T^a\}$ denoting a basis of $\sigma$,
$$B = \operatorname{curl} A + [A \times, A], \qquad J^0 = \Psi^\dagger\, I \otimes T^a\, \Psi\, T_a, \qquad J^k = \Psi^\dagger \gamma^0\gamma^k \otimes T^a\, \Psi\, T_a. \tag{1.7}$$
The strategy for investigating this system is as follows. The time dependence of the scalar potential $A_0$ is not determined by the evolution equations and it can be fixed by an appropriate gauge condition. Here we choose $A_0$ to be the unique solution of the Neumann problem
$$\Delta A_0 = -\operatorname{div} E, \qquad n(\operatorname{grad} A_0) = -nE \qquad\text{with}\qquad \int_M A_0\, d^3x = 0, \tag{1.8}$$

where $n$ denotes the normal component on the boundary $\partial M$. Then we study the evolution equations in the space of the Cauchy data
$$(A, E, \Psi) \in P = H^2(M,\sigma) \times H^1(M,\sigma) \times H^2(M, V_G \otimes \mathbb{C}^4), \tag{1.9}$$
where $H^k(M,V)$ is the Sobolev space of fields (either $\sigma$-valued vector fields or $V_G$-valued spinors) which are square integrable together with their derivatives up to order $k$, [3]. Since $E \in H^1(M,\sigma)$, it follows that $A_0 \in H^2(M,\sigma)$.

In order to describe the boundary conditions we introduce the following notation. For a vector field $X$ on $M$, we denote by $nX$ and $tX$ the normal and the tangential component of $X$ on $\partial M$, respectively. If $X$ is a vector field along $\partial M$ we understand $X^\flat$ as the induced one-form given by $X^\flat = g_\partial(X, \cdot)$, where $g_\partial$ is the metric on $\partial M$. For the spinor field we set
$$\Gamma(\Psi|_{\partial M}) := \tfrac{1}{2}(\mathrm{Id} - i\gamma^k n_k)(\Psi|_{\partial M}) \tag{1.10}$$

where $n_k$ is the $k$-component of the outward pointing unit normal vector of $\partial M$ in $\mathbb{R}^3$. The boundary conditions considered here consist of specifying the boundary components
$$t(\operatorname{curl} A(t)) = \lambda(t) \quad\text{and}\quad \Gamma(\Psi(t)|_{\partial M}) = \mu(t), \quad \Gamma((D\Psi(t))|_{\partial M}) = \nu(t) \qquad \forall\, t \in [0, T), \tag{1.11}$$
where $D$ is the Dirac operator. The existence and uniqueness result of [2] extends to the case of these inhomogeneous boundary conditions as follows:


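Main Theorem. Let $(\lambda(\cdot), \mu(\cdot), \nu(\cdot)) : [0, T_0) \to H^{1/2}(\partial M, \sigma) \times H^{3/2}(\partial M, V_G \otimes \mathbb{C}^4) \times H^{1/2}(\partial M, V_G \otimes \mathbb{C}^4)$ be a differentiable curve of boundary data, where $\lambda$ splits into $\lambda = \lambda_1 + \lambda_2$ such that
$$\lambda_1 \in H^{3/2}(\partial M, \sigma) \quad\text{and}\quad \int_{\partial M} \lambda_2^\flat \wedge (\operatorname{grad}\phi)^\flat = 0 \quad \forall\, \phi \in H^1(\partial M, \sigma), \tag{1.12}$$
and $(\mu, \nu)$ satisfy the integrability conditions
$$(i\gamma^k n_k + \mathrm{Id})\,\mu = 0 \quad\text{and}\quad (i\gamma^k n_k + \mathrm{Id})\,\nu = 0. \tag{1.13}$$
For every $(A, E, \Psi) \in H^2(M,\sigma) \times H^1(M,\sigma) \times H^2(M, V_G \otimes \mathbb{C}^4)$ satisfying
$$t(\operatorname{curl} A) = \lambda(0), \quad \Gamma(\Psi|_{\partial M}) = \mu(0) \quad\text{and}\quad \Gamma((D\Psi)|_{\partial M}) = \nu(0), \tag{1.14}$$
there exists a maximal $T \in (0, T_0]$ and a unique classical solution
$$(A(t), E(t), \Psi(t)) \in H^2(M,\sigma) \times H^1(M,\sigma) \times H^2(M, V_G \otimes \mathbb{C}^4) \tag{1.15}$$
of the evolution equations (1.3) through (1.5) with $A_0$ satisfying the gauge condition (1.8), defined for $t \in [0, T)$, which satisfies the initial conditions $A(0) = A$, $E(0) = E$, $\Psi(0) = \Psi$, and the boundary conditions given by (1.11).

With the trace theorem we conclude from $A \in H^2(M,\sigma)$ and $\Psi \in H^2(M, V_G \otimes \mathbb{C}^4)$ that $t(\operatorname{curl} A) \in H^{1/2}(\partial M, \sigma)$, $\Gamma(\Psi|_{\partial M}) \in H^{3/2}(\partial M, V_G \otimes \mathbb{C}^4)$ and $\Gamma(D\Psi|_{\partial M}) \in H^{1/2}(\partial M, V_G \otimes \mathbb{C}^4)$. This implies that the choice of boundary conditions (1.11) is consistent. It can be shown that the second integrability condition of (1.12) is equivalent to demanding that there exists a curve $\chi(t)$ in $H^{3/2}(\partial M, \sigma)$ such that
$$\lambda_2(t) = \operatorname{grad}\chi(t) \qquad \forall\, t \in [0, T). \tag{1.16}$$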

In [2] we used the theory of Lipschitz perturbations of strongly continuous semigroups to obtain the existence and uniqueness theorems for the evolution equations under the homogeneous boundary conditions
$$nE = 0, \quad nA = 0, \quad t\operatorname{curl} A = 0, \quad \Gamma(\Psi|_{\partial M}) = 0, \quad \Gamma(D\Psi|_{\partial M}) = 0. \tag{1.17}$$

In the present paper we observe that our gauge condition enables us to drop the restrictions on nA and nE. Moreover, using the result of [5] allows for less regular perturbations and makes it possible to consider also inhomogeneous boundary conditions. It should be noted that the invariant subspace determined by the boundary conditions (1.17) is contained in a bigger invariant subspace characterized by the conditions (1.11) with λ = 0, µ = 0, ν = 0. In Sect. 2 we study the gauge fixing for A0 and construct background fields to cover the effect of the inhomogeneous boundary conditions. The resulting linearised system obeying homogeneous boundary conditions is solved by using the results of [1,2]. In Sect. 3 we state a generalization of Segal’s theorem on non-linear semigroups in the singular case, cf. [5]. This implies the proof of Main Theorem.


2. Gauge Fixing, Boundary Conditions and Linearisation

The main results of this paper rely on a special version of the Helmholtz decomposition of vector fields. The following has been shown in [6]:

Lemma 1. On a simply connected bounded domain $M \subset \mathbb{R}^3$ each vector field $V \in H^k(M,\sigma)$ decomposes into
$$V = V^L + V^T, \quad\text{where}\quad \begin{aligned} V^L &= \operatorname{grad}\Theta_V &&\text{with } \Theta_V \in H^{k+1}(M,\sigma), \\ V^T &= \operatorname{curl} W_V &&\text{with } W_V \in H^{k+1}(M,\sigma),\ tW_V = 0. \end{aligned} \tag{2.1}$$
The scalar function $\Theta_V$ is unique up to a constant, which can be chosen so that $\int_M \Theta_V\, d^3x = 0$. The vector field $W_V$ is not unique, but $V^T$ is unique and satisfies $nV^T = 0$. Moreover, the following projections are continuous under the splitting (2.1):
$$\pi^T : H^k(M,\sigma) \to H^k(M,\sigma), \quad \pi^T(V) = V^T, \qquad \pi^L : H^k(M,\sigma) \to H^k(M,\sigma), \quad \pi^L(V) = V^L. \tag{2.2}$$

Since the scalar potential $A_0$ does not appear as an independent degree of freedom in the system (1.3) through (1.6), it can be fixed by a choice of an appropriate gauge transformation. The most convenient gauge fixing for our approach is based on the Helmholtz decomposition of the “electric” field $E = E^L + E^T \in H^1(M,\sigma)$. Equation (2.1) determines a unique scalar field $\Theta_E \in H^2(M,\sigma)$ with vanishing mean value on $M$ such that $E^L = \operatorname{grad}\Theta_E$. Choosing $A_0 := -\Theta_E$ we have
$$\operatorname{grad} A_0 = -E^L \quad\text{and}\quad \int_M A_0\, d^3x = 0, \tag{2.3}$$
which is equivalent to the Neumann problem (1.8). An elliptic estimate implies that
$$\|A_0\|_{H^2} \le C\,\|E\|_{H^1}. \tag{2.4}$$
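The equivalence between (2.3) and the Neumann problem (1.8) can be traced through the properties stated in Lemma 1. The following sketch uses only that the transversal part $E^T$ is a curl with $nE^T = 0$; it is a formal computation that ignores regularity questions:

```latex
\begin{align*}
E &= \operatorname{grad}\Theta_E + E^T, \qquad
\operatorname{div} E^T = 0, \qquad nE^T = 0, \\
\Delta\Theta_E &= \operatorname{div}\operatorname{grad}\Theta_E = \operatorname{div} E, \qquad
n(\operatorname{grad}\Theta_E) = nE .
\end{align*}
% With A_0 := -\Theta_E this is the Neumann problem (1.8):
\[
\Delta A_0 = -\operatorname{div} E, \qquad
n(\operatorname{grad} A_0) = -nE, \qquad
\int_M A_0 \, d^3x = 0 .
\]
% The Neumann data are compatible by the divergence theorem:
\[
\int_M \operatorname{div} E \; d^3x = \int_{\partial M} nE \; dS .
\]
```

The normalisation of the mean value removes the additive constant that the Neumann problem would otherwise leave undetermined.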

For an arbitrary differentiable curve $(E(t), \tilde A_0(t)) \in H^1(M,\sigma) \times H^2(M,\sigma)$ the gauge fixing (2.3) can be achieved by a gauge transformation $\Phi(t) \in H^2(M, G)$ via
$$\tilde A_0 \mapsto A_0 = \Phi^{-1}\tilde A_0\,\Phi + \Phi^{-1}\partial_t\Phi \tag{2.5}$$
for sufficiently small $t$. Moreover, if $(E(t), \tilde A_0(t)) \in H^2(M,\sigma) \times H^3(M,\sigma)$ then $\Phi(t)$ is of Sobolev class $H^3(M, G)$ and the gauge transformation preserves the space $P$. This result has been established in [4] for $M = \mathbb{R}^3$. Under the regularity assumptions above, the proof generalizes to the case of a bounded domain, considered here.

Employing the Helmholtz decomposition (2.1) for the evolution Eqs. (1.3) and (1.4) and using the gauge fixing (2.3), the evolution components of the Yang-Mills and Dirac equations turn into
$$\partial_t A^L = -\pi^L([A_0, A]), \tag{2.6}$$
$$\partial_t A^T = E^T - \pi^T([A_0, A]), \tag{2.7}$$
$$\partial_t E^L = -\pi^L(\operatorname{curl}\operatorname{curl} A) - \pi^L\big(\operatorname{curl}[A \times, A] + [A \times, B] + [A_0, E] - J\big), \tag{2.8}$$
$$\partial_t E^T = -\pi^T(\operatorname{curl}\operatorname{curl} A) - \pi^T\big(\operatorname{curl}[A \times, A] + [A \times, B] + [A_0, E] - J\big), \tag{2.9}$$
$$\partial_t \Psi = D\Psi - (\gamma^0\gamma^k A_k + A_0)\Psi. \tag{2.10}$$
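The first two equations of this system follow from (1.3) in a single step. The computation below is a sketch that uses the gauge condition $\operatorname{grad} A_0 = -E^L$ from (2.3) and the fact that the projections $\pi^L$, $\pi^T$ are time-independent linear maps, so that they commute with $\partial_t$:

```latex
\begin{align*}
\partial_t A
  &= E + \operatorname{grad} A_0 - [A_0, A]
   = (E^L + E^T) - E^L - [A_0, A]
   = E^T - [A_0, A], \\
\partial_t A^L
  &= \pi^L(\partial_t A) = -\pi^L([A_0, A]),
  && \text{since } \pi^L(E^T) = 0, \\
\partial_t A^T
  &= \pi^T(\partial_t A) = E^T - \pi^T([A_0, A]),
  && \text{since } \pi^T(E^T) = E^T .
\end{align*}
```

The equations for $E^L$ and $E^T$ arise in the same way from (1.4) after inserting $\operatorname{curl} B = \operatorname{curl}\operatorname{curl} A + \operatorname{curl}[A \times, A]$.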


The usual treatment of inhomogeneous boundary conditions is to extend the boundary data to sufficiently regular background fields on $M$ by means of the trace theorem [8]. The fields under consideration are presented as perturbations of the background fields. The perturbations satisfy appropriately modified differential equations and the homogeneous boundary conditions. As far as the boundary data for the Dirac field are concerned we observe that a spinor field $\rho$ defined on the boundary $\partial M$ can be in the range of the boundary operator $\Gamma = \tfrac{1}{2}(\mathrm{Id} - i\gamma^k n_k)$ only if
$$\tilde\Gamma(\rho) = \tfrac{1}{2}(i\gamma^k n_k + \mathrm{Id})\,\rho = 0. \tag{2.11}$$
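The necessity of (2.11) rests on two algebraic identities for the boundary operator $\Gamma = \tfrac12(\mathrm{Id} - i\gamma^k n_k)$ of (1.10) and its complement $\tilde\Gamma = \tfrac12(\mathrm{Id} + i\gamma^k n_k)$. The sketch below assumes the Clifford relation $\gamma^j\gamma^k + \gamma^k\gamma^j = -2\delta^{jk}$ for the spatial gamma matrices and $n_k n_k = 1$; this sign convention is an assumption, since the text does not state it explicitly:

```latex
% Set N := i \gamma^k n_k. Under the assumed Clifford relation,
\[
N^2 = -(\gamma^j n_j)(\gamma^k n_k)
    = -\tfrac12 \{\gamma^j, \gamma^k\}\, n_j n_k
    = n_k n_k \, \mathrm{Id} = \mathrm{Id} .
\]
% Hence \Gamma = (Id - N)/2 and \tilde\Gamma = (Id + N)/2 are complementary projectors:
\begin{align*}
\Gamma^2 &= \tfrac14 (\mathrm{Id} - 2N + N^2) = \tfrac12 (\mathrm{Id} - N) = \Gamma, \\
\tilde\Gamma\,\Gamma &= \Gamma\,\tilde\Gamma = \tfrac14 (\mathrm{Id} - N^2) = 0, \qquad
\Gamma + \tilde\Gamma = \mathrm{Id} .
\end{align*}
% If \rho = \Gamma(\chi) for some \chi, then
\[
\tilde\Gamma(\rho) = \tilde\Gamma\,\Gamma(\chi) = 0 ,
\]
% which is condition (2.11).
```

Because $\Gamma + \tilde\Gamma = \mathrm{Id}$, the condition is also sufficient at the algebraic level: any $\rho$ with $\tilde\Gamma(\rho) = 0$ satisfies $\rho = \Gamma(\rho)$.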

This follows from $\Gamma \circ \Gamma = \Gamma$ and $\Gamma \circ \tilde\Gamma = 0$. Therefore (1.13) give necessary conditions on the existence of an extension for the boundary data. In fact, it is an easy algebraic construction to find, for given differentiable curves $\mu(t) \in H^{3/2}(\partial M, V_G \otimes \mathbb{C}^4)$ and $\nu(t) \in H^{1/2}(\partial M, V_G \otimes \mathbb{C}^4)$, which satisfy the integrability condition (2.11), a field $\psi(t)$ such that
$$\Gamma(\psi(t)|_{\partial M}) = \mu(t) \quad\text{and}\quad \Gamma((D\psi(t))|_{\partial M}) = \nu(t) \qquad \forall\, t \in [0, T_0), \tag{2.12}$$

where $\psi(t)$ is a differentiable curve in $H^2(M, V_G \otimes \mathbb{C}^4)$. In a similar way one would extend a $C^1$ curve $\lambda(t) \in H^{1/2}(\partial M, \sigma)$ to a $C^1$ curve $a(t)$ in $H^2(M,\sigma)$ which satisfies the boundary condition $t(\operatorname{curl} a(t)) = \lambda(t)$. It turns out, however, that in this case the inhomogeneity introduced by $\operatorname{curl}(\operatorname{curl}(a(t)))$ into Eq. (2.8) would be too singular for our method. Therefore, we choose a different approach.

Lemma 2. Let $\lambda(t)$ be a $C^1$ curve in $H^{1/2}(\partial M, \sigma)$ which splits into $\lambda(t) = \lambda_1(t) + \lambda_2(t)$, where $\lambda_1(t)$ is a $C^1$ curve in $H^{3/2}(\partial M, \sigma)$ and $\lambda_2(t)$ satisfies the integrability condition
$$\int_{\partial M} \lambda_2^\flat \wedge (\operatorname{grad}\phi)^\flat = 0 \qquad \forall\, \phi \in H^1(\partial M, \sigma). \tag{2.13}$$
Then there exists a $C^1$ curve $a(t) \in H^2(M,\sigma)$ such that
$$\operatorname{curl}\operatorname{curl} a(t) \in H^1(M,\sigma) \quad\text{and}\quad t(\operatorname{curl} a(t)) = \lambda(t), \qquad \forall\, t \in [0, T_0). \tag{2.14}$$

Proof. Given a $C^1$ curve $\lambda_1(t) \in H^{3/2}(\partial M, \sigma)$, it extends to a $C^1$ curve $a_1(t)$ in $H^3(M,\sigma)$ which satisfies the boundary condition $t(\operatorname{curl} a_1(t)) = \lambda_1(t)$ in a similar way to that discussed above. In [6] it is shown that on a contractible domain $M \subset \mathbb{R}^3$ the problem
$$\operatorname{curl} b = 0, \quad \operatorname{div} b = 0 \ \text{ on } M, \qquad tb = \lambda_2 \ \text{ on } \partial M \tag{2.15}$$
has a unique solution $b \in H^1(M,\sigma)$, provided that $\lambda_2 \in H^{1/2}(\partial M, \sigma)$ satisfies the integrability condition (2.13). Moreover, there exists a unique solution $a_2 \in H^2(M,\sigma)$ of the boundary value problem
$$\operatorname{curl} a_2 = b, \quad \operatorname{div} a_2 = 0 \ \text{ on } M, \qquad na_2 = 0 \ \text{ on } \partial M \tag{2.16}$$
for each $b \in H^1(M,\sigma)$. Then $a(t) = a_1(t) + a_2(t)$ is the required extension.
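That $a = a_1 + a_2$ has the two properties claimed in (2.14) can be checked term by term. The following computation is a sketch that uses only the defining properties of $a_1$, $a_2$ and $b$ from the proof:

```latex
\begin{align*}
\operatorname{curl}\operatorname{curl} a
  &= \operatorname{curl}\operatorname{curl} a_1 + \operatorname{curl} b
   = \operatorname{curl}\operatorname{curl} a_1 \in H^1(M,\sigma),
  && \text{since } \operatorname{curl} b = 0,\ a_1 \in H^3(M,\sigma), \\
t(\operatorname{curl} a)
  &= t(\operatorname{curl} a_1) + tb
   = \lambda_1 + \lambda_2 = \lambda,
  && \text{since } \operatorname{curl} a_2 = b,\ tb = \lambda_2 .
\end{align*}
```

This also indicates why the less regular part $\lambda_2$ is harmless: it enters only through the curl-free field $b$, so it never appears in $\operatorname{curl}\operatorname{curl} a$.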


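With the fields $a$ and $\psi$ constructed above as a background we consider the perturbations
$$\hat A := (A - a) \in H^2(M,\sigma) \quad\text{satisfying}\quad t(\operatorname{curl}\hat A) = 0, \tag{2.17}$$
$$\hat\Psi := (\Psi - \psi) \in H^2(M, V_G \otimes \mathbb{C}^4) \quad\text{satisfying}\quad \Gamma(\hat\Psi|_{\partial M}) = 0, \quad \Gamma(D\hat\Psi|_{\partial M}) = 0 \tag{2.18}$$
as the dynamical degrees of freedom. From the homogeneous boundary condition on $\hat A$ we infer that $\operatorname{curl}\operatorname{curl}\hat A = \Delta\hat A^T$ is a purely transversal field in the sense of the decomposition (2.1). The homogeneous linear part of the dynamical system (2.6) through (2.10) yields three uncoupled linear systems:
$$\partial_t \begin{pmatrix} \hat A^L \\ E^L \end{pmatrix} = 0, \qquad \partial_t \begin{pmatrix} \hat A^T \\ E^T \end{pmatrix} = T \begin{pmatrix} \hat A^T \\ E^T \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -\Delta & 0 \end{pmatrix} \begin{pmatrix} \hat A^T \\ E^T \end{pmatrix}, \qquad \partial_t \hat\Psi = D\hat\Psi. \tag{2.19}$$
In [1] and [2] we have shown the following: The operator $T$ with domain
$$D(T) = \{(\hat A^T, E^T) \in H^2(M,\sigma) \times H^1(M,\sigma) \mid t(\operatorname{curl}\hat A^T) = 0\} \tag{2.20}$$
is the infinitesimal generator of a one-parameter group of continuous transformations in the Hilbert space
$$H_T = \{(\hat A^T, E^T) \in H^1(M,\sigma) \times L^2(M,\sigma)\}. \tag{2.21}$$
The (free) Dirac operator $D$, considered as an operator with the domain
$$D(D) = \{\hat\Psi \in H^2(M, V_G \otimes \mathbb{C}^4) \mid \Gamma(\hat\Psi|_{\partial M}) = 0 \text{ and } \Gamma(D\hat\Psi|_{\partial M}) = 0\} \tag{2.22}$$
is the infinitesimal generator of a one-parameter group of continuous transformations in the Hilbert space
$$H_D = \{\hat\Psi \in H^1(M, V_G \otimes \mathbb{C}^4) \mid \Gamma(\hat\Psi|_{\partial M}) = 0\}. \tag{2.23}$$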

3. Proof of the Main Theorem

Using the results on the linearised dynamics given above, the coupled non-linear system can be tackled by using the following generalisation, [5], of Segal’s result on non-linear semigroups in the singular case:

Lemma 3. Let $B_1$ and $B_2$ be Banach spaces and $\exp(tS) : B_2 \to B_2$ be a continuous one-parameter semigroup of bounded linear operators generated by an operator $S$ with domain $D(S) \subset B_2$. Assume that:

i) $F_1 : B_1 \times D(S) \to B_1$ is a map, which is continuous and locally Lipschitz with respect to the norm
$$|\|(V_1, V_2)\||_1 = \|V_1\|_{B_1} + \|V_2\|_{B_2} + \|SV_2\|_{B_2}, \quad\text{where } (V_1, V_2) \in B_1 \times D(S). \tag{3.1}$$

ii) $F_2 : B_1 \times D(S) \to B_2$ is a map, which is continuous and differentiable with respect to the norm (3.1).


iii) The following derivative $K : B_1 \times D(S) \times B_2 \to B_2$ of $F_2$ given by $K(V_1, V_2, v_2) := K_1(V_1, V_2) + K_2(V_1, V_2, v_2)$ with
$$K_1(V_1, V_2) = DF_2(V_1, V_2)\big(F_1(V_1, V_2), 0\big), \qquad K_2(V_1, V_2, v_2) = DF_2(V_1, V_2)\big(0, v_2\big) \tag{3.2}$$
is locally Lipschitz with respect to the norm
$$|\|(V_1, V_2, v_2)\||_2 = \|V_1\|_{B_1} + \|V_2\|_{B_2} + \|SV_2\|_{B_2} + \|v_2\|_{B_2}. \tag{3.3}$$

Then, for every initial condition $(V_1(0), V_2(0)) \in B_1 \times D(S)$ there exists a maximal $T > 0$ such that the differential equation
$$\partial_t \begin{pmatrix} V_1(t) \\ V_2(t) \end{pmatrix} = \begin{pmatrix} F_1(V_1(t), V_2(t)) \\ F_2(V_1(t), V_2(t)) \end{pmatrix} + \begin{pmatrix} 0 \\ SV_2(t) \end{pmatrix} \tag{3.4}$$
has a unique classical solution $(V_1(t), V_2(t)) \in B_1 \times D(S)$ in the interval $[0, T)$, satisfying the initial condition.

To apply this theorem to the case under consideration we set
$$\begin{aligned} B_1 &= H_L = \{(\hat A^L, E^L) \in H^2(M,\sigma) \times H^1(M,\sigma)\} \quad\text{and} \\ B_2 &= H_T \times H_D = \{(\hat A^T, E^T, \hat\Psi) \in H^1(M,\sigma) \times L^2(M,\sigma) \times H^1(V_G \otimes \mathbb{C})\}. \end{aligned} \tag{3.5}$$
The generator of the linear semigroup we choose to be $S := T + D$ with
$$D(S) = \{(\hat A^T, E^T, \hat\Psi) \in H^2(M,\sigma) \times H^1(M,\sigma) \times H^2(V_G \otimes \mathbb{C}) \mid t(\operatorname{curl}\hat A^T) = 0,\ \Gamma(\hat\Psi|_{\partial M}) = \Gamma(D\hat\Psi|_{\partial M}) = 0\}. \tag{3.6}$$
In view of Eqs. (2.6) through (2.10) the components $F_1 = ((F_1)_A, (F_1)_E)$ and $F_2 = ((F_2)_A, (F_2)_E, (F_2)_\Psi)$, each evaluated at $((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi))$, read
$$\begin{aligned} (F_1)_A &= -\partial_t a^L - \pi^L[A_0, (\hat A + a)], \\ (F_1)_E &= -\pi^L \operatorname{curl}\operatorname{curl} a - \pi^L\big(\operatorname{curl}[(\hat A + a) \times, (\hat A + a)] + [(\hat A + a) \times, B] + [A_0, E] - J\big), \\ (F_2)_A &= -\partial_t a^T - \pi^T[A_0, (\hat A + a)], \\ (F_2)_E &= -\pi^T \operatorname{curl}\operatorname{curl} a - \pi^T\big(\operatorname{curl}[(\hat A + a) \times, (\hat A + a)] + [(\hat A + a) \times, B] + [A_0, E] - J\big), \\ (F_2)_\Psi &= -\partial_t \psi + D\psi - \gamma^0\gamma^k(\hat A_k + a_k)(\hat\Psi + \psi) - A_0(\hat\Psi + \psi). \end{aligned} \tag{3.7}$$
By construction, the background fields $a(t)$ and $\psi(t)$ as well as their time derivatives are of Sobolev class $H^2$. Moreover, $D\psi \in H^1(V_G \otimes \mathbb{C})$. By Lemma 1, the projections $\pi^L$ and $\pi^T$ to the components of the Helmholtz decomposition are continuous with respect to the Sobolev topology. The results of [1] and [2] then imply the following:


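(A) The map $F_1 : B_1 \times D(S) \to B_1$ is continuous and Lipschitz with respect to the norm given by (3.1).

(B) The map $F_2 : B_1 \times D(S) \to B_2$ is continuous and differentiable with respect to the norm (3.1).

(C) The norms (3.1) and (3.3) are in the case under consideration equivalent to the respective norms
$$|\|((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi))\||_1 = \|\hat A\|_{H^2} + \|E\|_{H^1} + \|\hat\Psi\|_{H^2}, \tag{3.8}$$
$$|\|((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi), (\alpha^T, \epsilon^T, \varphi))\||_2 = |\|((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi))\||_1 + \|\alpha^T\|_{H^1} + \|\epsilon^T\|_{L^2} + \|\varphi\|_{H^1}, \tag{3.9}$$
where $(\alpha^T, \epsilon^T, \varphi) \in H_T \times H_D$ is an arbitrary infinitesimal variation of $(\hat A^T, E^T, \hat\Psi)$.

In view of later differentiation we observe that $A_0$ is a linear functional of $E^L$, and set
$$\delta_1 A_0 := DA_0((F_1)_E). \tag{3.10}$$
The respective differentials of $B = \operatorname{curl}(\hat A^T + a^T) + [(\hat A + a) \times, (\hat A + a)]$ and of the components $J^k_b(\hat\Psi) = (\hat\Psi + \psi)^\dagger \gamma^0\gamma^k \otimes T_b\, (\hat\Psi + \psi)$ of the matter current read
$$\begin{aligned} \delta_1 B &:= DB(A^L, A^T)((F_1)_A, 0) = 2\,[(F_1)_A \times, (\hat A + a)], \\ \delta_2 B &:= DB(A^L, A^T)(0, \alpha^T) = \operatorname{curl}\alpha^T + 2\,[\alpha^T \times, (\hat A + a)], \\ (\delta_2 J)^k_b &:= D\pi^T(J^k_b(\hat\Psi))(\varphi) = \pi^T\big((\hat\Psi + \psi)^\dagger \gamma^0\gamma^k \otimes T_b\, \varphi + \varphi^\dagger \gamma^0\gamma^k \otimes T_b\, (\hat\Psi + \psi)\big). \end{aligned} \tag{3.11}$$
Using these notations, the differential $K_1((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi))$ of $F_2$ is the sum of the following terms:
$$\begin{aligned} (K_1)_A &= -\pi^T\big([\delta_1 A_0, (\hat A + a)] + [A_0, (F_1)_A]\big), \\ (K_1)_E &= -\pi^T\big(2\operatorname{curl}[(F_1)_A \times, (\hat A + a)] + [(F_1)_A \times, B] + [(\hat A + a) \times, \delta_1 B]\big) - \pi^T\big([\delta_1 A_0, E] + [A_0, (F_1)_E]\big), \\ (K_1)_\Psi &= -\big(\gamma^0\gamma^k (F_1)^k_A + \delta_1 A_0\big)(\hat\Psi + \psi). \end{aligned} \tag{3.12}$$
Similarly we have for $K_2((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi), (\alpha^T, \epsilon^T, \varphi))$:
$$\begin{aligned} (K_2)_A &= -\pi^T[A_0, \alpha^T], \\ (K_2)_E &= -\pi^T\big(2\operatorname{curl}[\alpha^T \times, (\hat A + a)] + [\alpha^T \times, B] + [(\hat A + a) \times, \delta_2 B]\big) - \pi^T\big([A_0, \epsilon^T] - \delta_2 J\big), \\ (K_2)_\Psi &= -\gamma^0\gamma^k \alpha^T_k (\hat\Psi + \psi) - \gamma^0\gamma^k(\hat A_k + a_k)\varphi - A_0\varphi. \end{aligned} \tag{3.13}$$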

Lemma 4. If $W_i$ are finite dimensional vector spaces and $* : W_1 \times W_2 \to W_3$ is an algebraic product, then $* : H^1(M, W_1) \times H^2(M, W_2) \to H^1(M, W_3)$ satisfies
$$\|V_1 * V_2\|_{H^1} \le C\,\|V_1\|_{H^1}\|V_2\|_{H^2} \tag{3.14}$$
and
$$\|V_1 * V_2 - U_1 * U_2\|_{H^1} \le C\big(\|V_2\|_{H^2} + \|U_1\|_{H^1}\big)\big(\|V_1 - U_1\|_{H^1} + \|V_2 - U_2\|_{H^2}\big). \tag{3.15}$$


Proof. By definition of the $H^1$-norm
$$\|V_1 * V_2\|_{H^1} \le \|V_1 * V_2\|_{L^2} + \|(\operatorname{grad} V_1) * V_2\|_{L^2} + \|V_1 * (\operatorname{grad} V_2)\|_{L^2}. \tag{3.16}$$
The first two terms on the right hand side can be estimated by $\|V_1\|_{H^1}\|V_2\|_{H^2}$, since $H^2(M, W_2) \subset C^0(M, W_2)$ by the Sobolev embedding theorem. Moreover, since $H^1(M, W_i) \subset L^4(M, W_i)$, the third term can be estimated by $\|V_1\|_{H^1}\|(\operatorname{grad} V_2)\|_{H^1}$. This proves the inequality (3.14). Since $V_1 * V_2 - U_1 * U_2 = (V_1 - U_1) * V_2 - U_1 * (U_2 - V_2)$, the estimate (3.15) follows from the triangle inequality.

To derive the Lipschitz estimate for $K_1$ we let $(\hat A, E, \hat\Psi), (\tilde A, \tilde E, \tilde\Psi) \in B_1 \times D(S)$ and understand
$$F_1 = F_1(\hat A, E, \hat\Psi), \qquad \tilde F_1 = F_1(\tilde A, \tilde E, \tilde\Psi) \qquad\text{and}\qquad \delta_1 \tilde A_0 = DA_0((\tilde F_1)_E). \tag{3.17}$$
Using this we apply Lemma 4 to estimate $K_1$ in the norm of $B_2$. For the $A$-component we get
$$\begin{aligned} &\|(K_1)_A((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi)) - (K_1)_A((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi))\|_{H^1} \\ &\qquad \le C\big(\|\delta_1 A_0 - \delta_1 \tilde A_0\|_{H^1} + \|\hat A - \tilde A\|_{H^2} + \|A_0 - \tilde A_0\|_{H^2} + \|(F_1)_A - (\tilde F_1)_A\|_{H^1}\big), \end{aligned} \tag{3.18}$$
where the constant $C$ depends on the norm in $B_1 \times D(S)$ of all the fields involved. With $\Delta A_0 = -\operatorname{div} E^L$ and $\Delta\,\delta_1 A_0 = -\operatorname{div}(F_1)_E$, the estimate (2.4) implies that
$$\|A_0 - \tilde A_0\|_{H^2} \le C\,\|E^L - \tilde E^L\|_{H^1} \quad\text{and}\quad \|\delta_1 A_0 - \delta_1 \tilde A_0\|_{H^1} \le C\,\|(F_1)_E - (\tilde F_1)_E\|_{L^2}. \tag{3.19}$$

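Moreover, Property (A) above states that the nonlinearity $F_1 : B_1 \times D(S) \to B_1$ is locally Lipschitz. Therefore (3.18) implies that
$$\begin{aligned} &\|(K_1)_A((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi)) - (K_1)_A((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi))\|_{H^1} \\ &\qquad \le C\,|\|((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi)) - ((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi))\||_1. \end{aligned} \tag{3.20}$$
As far as the estimate for $(K_1)_E$ is concerned we get from (3.15),
$$\big\|\operatorname{curl}\big([(F_1)_A \times, (\hat A + a)] - [(\tilde F_1)_A \times, (\tilde A + a)]\big)\big\|_{L^2} \le C\big(\|(F_1)_A - (\tilde F_1)_A\|_{H^1} + \|\hat A - \tilde A\|_{H^2}\big). \tag{3.21}$$
Similarly,
$$\|[(\hat A + a) \times, \delta_1 B] - [(\tilde A + a) \times, \delta_1 \tilde B]\|_{L^2} \le C\big(\|\delta_1 B - \delta_1 \tilde B\|_{H^1} + \|\hat A - \tilde A\|_{H^2}\big), \tag{3.22}$$
where $\delta_1 \tilde B = 2\,[(\tilde F_1)_A \times, (\tilde A + a)]$. With Lemma 4 we can estimate
$$\|\delta_1 B - \delta_1 \tilde B\|_{H^1} \le C\big(\|(F_1)_A - (\tilde F_1)_A\|_{H^1} + \|\hat A - \tilde A\|_{H^2}\big). \tag{3.23}$$
Taking (3.19) into account, similar arguments as above apply to the remaining terms of $(K_1)_E$. With the Lipschitz property (A) of $F_1$ we then obtain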


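$$\begin{aligned} &\|(K_1)_E((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi)) - (K_1)_E((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi))\|_{L^2} \\ &\qquad \le C\big(\|(F_1)_A - (\tilde F_1)_A\|_{H^1} + \|\hat A - \tilde A\|_{H^2} + \|(F_1)_E - (\tilde F_1)_E\|_{L^2} + \|E - \tilde E\|_{H^1}\big) \\ &\qquad \le C\,|\|((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi)) - ((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi))\||_1. \end{aligned} \tag{3.24}$$
The estimates for $(K_1)_\Psi$ can be performed in the same way so that we end up with the local Lipschitz estimate
$$\begin{aligned} &\|K_1((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi)) - K_1((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi))\|_{B_2} \\ &\qquad \le C\,|\|((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi)) - ((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi))\||_1. \end{aligned} \tag{3.25}$$
The terms of $K_2$ can be tackled in literally the same way as we have done for $K_1$. The corresponding estimates are even more direct, since one need not use the Lipschitz argument for the nonlinearity $F_1$. That is, we can replace $\|F_1 - \tilde F_1\|_{B_2}$ by
$$\|\alpha^T - \tilde\alpha^T\|_{H^1} + \|\epsilon^T - \tilde\epsilon^T\|_{L^2} + \|\varphi - \tilde\varphi\|_{H^1}, \tag{3.26}$$
wherever it appears. Only for $(K_2)_E$ we get an extra contribution, which computes with (3.11) as
$$\|(\delta_2 J)^k_b(\hat\Psi)(\varphi) - (\delta_2 J)^k_b(\tilde\Psi)(\tilde\varphi)\|_{L^2} \le C\big(\|\hat\Psi - \tilde\Psi\|_{H^1} + \|\varphi - \tilde\varphi\|_{H^1}\big). \tag{3.27}$$
Therefore we finally end up with the estimate
$$\begin{aligned} &\|K_2((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi), (\alpha, \epsilon, \varphi)) - K_2((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi), (\tilde\alpha, \tilde\epsilon, \tilde\varphi))\|_{B_2} \\ &\qquad \le C\,|\|((\hat A^L, E^L), (\hat A^T, E^T, \hat\Psi), (\alpha, \epsilon, \varphi)) - ((\tilde A^L, \tilde E^L), (\tilde A^T, \tilde E^T, \tilde\Psi), (\tilde\alpha, \tilde\epsilon, \tilde\varphi))\||_1. \end{aligned} \tag{3.28}$$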

Together with the properties (A)-(C), stated above, this proves that the nonlinearity of the theory given by (3.7) satisfies all the prerequisites of Lemma 3. This proves the Main Theorem.

References

1. Śniatycki, J. and Schwarz, G.: The existence and uniqueness of solutions of the Yang-Mills equations with bag boundary conditions. Commun. Math. Phys. 159, 593–604 (1994)
2. Schwarz, G. and Śniatycki, J.: Yang-Mills and Dirac fields in a bag, existence and uniqueness theorems. Commun. Math. Phys. 168, 441–453 (1995)
3. Adams, R.A.: Sobolev Spaces. Orlando, Florida: Academic Press, 1975
4. Schwarz, G. and Śniatycki, J.: Gauge symmetries of an extended phase space for Yang-Mills and Dirac fields. Ann. Inst. H. Poincaré A, 66, 109–136 (1997)
5. Tafel, J. and Śniatycki, J.: Nonlinear semigroups and the Yang-Mills equations with the metallic boundary conditions. Commun. Partial Diff. Eqns. 22, 46–69 (1997)
6. Schwarz, G.: Hodge Decomposition – A Method for Solving Boundary Value Problems. Lecture Notes in Mathematics 1607, Heidelberg: Springer-Verlag, 1995
7. Wloka, J.: Partial Differential Equations. Cambridge: Cambridge University Press, 1987

Communicated by S.-T. Yau

Commun. Math. Phys. 188, 449 – 466 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Double Coset Construction of Moduli Space of Holomorphic Bundles and Hitchin Systems

A. Levin1,*, M. Olshanetsky1,**

1 Max-Planck-Institut für Mathematik, Bonn, Germany. E-mail: [email protected], [email protected]

Received: 7 July 1996 / Accepted: 5 March 1997

Abstract: We present a description of the moduli space of holomorphic vector bundles over Riemann curves as a double coset space which differs from the standard loop group construction. Our approach is based on equivalent definitions of holomorphic bundles, based on the transition maps or on the first order differential operators. Using this approach we present two independent derivations of the Hitchin integrable systems. We define a “superfree” upstairs system from which Hitchin systems are obtained by three step Hamiltonian reductions. Special attention is given to the Schottky parameterization of curves.

1. Introduction

Moduli spaces of holomorphic vector bundles over Riemann surfaces are a popular subject in algebraic geometry and number theory. In mathematical physics they were investigated due to relations with Yang-Mills theory [1] and Wess-Zumino-Witten theory [2, 3]. The conformal blocks in WZW theory satisfy the Ward identities which take the form of differential equations on the moduli space [4, 5]. In this approach the moduli space is described as a double coset space of a loop group defined on a small circle on a Riemann surface [6].

The main goal of the paper is an alternative description of the moduli space and the Hitchin integrable systems [7] based on this construction. We start with a special group valued field on a Riemann surface which is defined as a map from a holomorphic basis in a vector bundle to a $C^\infty$ basis. This field is an analogue of the tetrad field in General Relativity and we call it the Generalized Tetrad Field (GTF). The holomorphic structures can be extracted from GTF. They are described via the holomorphic transition maps, or by means of the operators $d''$. The former are invariant under the action of the global $C^\infty$ transformations, while the latter are invariant under the action of the local holomorphic transformations. It allows us to define the moduli space as a double coset space of GTF with respect to the actions of the local holomorphic transformations and the global $C^\infty$ transformations. We introduce a cotangent bundle to GTF and the invariant symplectic structure on it. The cotangent bundle to the moduli of holomorphic bundles can be obtained by symplectic factorizations over the action of two types of commuting gauge transformations. This cotangent bundle is a phase space of Hitchin integrable systems [7]. The tetrad fields in their turn are sections of the principal bundle over the Riemann surface, which satisfy some constraint equations. We interpret them as moment constraints in a big “superfree system” with a special gauge symmetry. This space is a cotangent bundle to the principal bundle. Thus the Hitchin systems are obtained by the three step symplectic reductions from this space. We investigate in particular our reductions in terms of the Schottky parameterization, which is a particular case of the general construction. This parameterization was used to derive the Knizhnik-Zamolodchikov-Bernard equations on the higher genus curves [3, 8, 9]. On the other hand the quantum second order Hitchin Hamiltonians coincide with them on the critical level.

* On leave from International Institute for Nonlinear Studies at Landau Inst., Vorob’iovskoe sch. 2, Moscow, 117940, Russia
** On leave from Institute of Theoretical and Experimental Physics, Bol. Cheremushkinskaya, 25, Moscow, 117 259, Russia

2. Moduli of Holomorphic Vector Bundles

Let $\Sigma = \Sigma_g$ be a nondegenerate Riemann curve of genus $g$ with $g > 1$. We will consider in this section a set of stable holomorphic structures on complex vector bundles over $\Sigma$ [1]. To define them we proceed in two ways based on the Čech and the Dolbeault cohomologies. Eventually, we come to the moduli space $L$ of stable holomorphic bundles over $\Sigma_g$ and represent them as a double coset space (Proposition 2.3).

2.1.
Consider a vector bundle V over Σg . To be more concrete we assume that the structure group of V is GL(N, C). Let Ua , a = 1, . . . be a covering of Σg by open subsets. We consider two bases in V , the holomorphic {ehol } basis and the smooth C ∞ ∞ {eC } one. In local coordinates (za ∈ Ua ) hol ehol (za ), eC a =e a

∞

∞

= eC (za , z¯a ).

Let h be the transition map between them, ha = h(za , z¯a ). Then locally in Ua we have ha eC a

∞

= ehol a .

(2.1)

We can consider $h_a$ as sections in $\Omega^0_{C^\infty}(U_a, P)$ of the adjoint bundle $P = \mathrm{Aut}\, V$. We call the field $h$ a generalized tetrad field (GTF). It follows from the definitions of the bases that there exists a global section for $e^{C^\infty}$,
$$e^{C^\infty}_a(z_a, \bar z_a) = e^{C^\infty}_b(z_b(z_a), \bar z_b(\bar z_a)), \qquad z_a \in U_{ab} = U_a \cap U_b \neq \emptyset, \tag{2.2}$$
where $z_b = z_b(z_a)$ are holomorphic functions defining a complex structure on $\Sigma_g$. On the other hand the transformations of $e^{\mathrm{hol}}$ are holomorphic maps,
$$e^{\mathrm{hol}}_a(z_a) = g_{ab}(z_a)\, e^{\mathrm{hol}}_b(z_b(z_a)), \qquad g_{ba}(z_b) = g_{ab}^{-1}(z_a(z_b)), \tag{2.3}$$


$g_{ab} \in \Omega^0_{\mathrm{hol}}(U_{ab}, \mathrm{Aut}\, V)$ ($\bar\partial g_{ab} = 0$, $\bar\partial = \partial_{\bar z_a}$). These matrix functions define the holomorphic structure in the vector bundle $V$. We can describe the same holomorphic structure working with the smooth basis $e^{C^\infty}$ in $V$. Let
$$\bar A_a = h_a^{-1} \bar\partial h_a. \tag{2.4}$$
Then the basis $e^{C^\infty}$ is annihilated by the operator $d''_A|_{U_a} = \bar\partial + \bar A_a$,
$$(\bar\partial + \bar A_a)\, e^{C^\infty}_a = 0.$$
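The step from (2.1) and (2.4) to this annihilation property can be written out explicitly. The following is only a sketch, assuming (as the text does implicitly) that the holomorphic basis satisfies $\bar\partial e^{\mathrm{hol}}_a = 0$ and that $\bar\partial$ obeys the Leibniz rule on the product $h_a e^{C^\infty}_a$.

```latex
\begin{align*}
0 = \bar\partial e^{\mathrm{hol}}_a
  &= \bar\partial\bigl(h_a\, e^{C^\infty}_a\bigr)
     && \text{by (2.1)} \\
  &= (\bar\partial h_a)\, e^{C^\infty}_a + h_a\, \bar\partial e^{C^\infty}_a
     && \text{Leibniz rule} \\
  &= h_a \bigl(\bar\partial + h_a^{-1}\bar\partial h_a\bigr)\, e^{C^\infty}_a
   = h_a \bigl(\bar\partial + \bar A_a\bigr)\, e^{C^\infty}_a
     && \text{by (2.4)} .
\end{align*}
% h_a is invertible, hence (\bar\partial + \bar A_a) e^{C^\infty}_a = 0.
```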

The GTF transformations $h$ in (2.1) are by no means free. Let $R_\Sigma$ be the subset of sections in $P$ which satisfy the following conditions:
$$R_\Sigma = \{h \in \Omega^0_{C^\infty}(U_a, P) \mid h_a^{-1}\bar\partial h_a|_{U_{ab}} = h_b^{-1}\bar\partial h_b|_{U_{ab}}, \ \forall\, U_{ab} \neq \emptyset, \ a, b = 1, \ldots\}, \tag{2.5}$$
that is, $\bar A_a(z_a) = \bar A_b(z_b(z_a))$ for $z_a \in U_{ab}$.

Proposition 2.1. Conditions (2.1) and (2.5) are equivalent.

Proof. Since $e^{C^\infty}_b = e^{C^\infty}_a$ in $U_{ab}$, (2.1) implies
$$g_{ab} = h_a h_b^{-1}. \tag{2.6}$$
Then the holomorphicity of $g_{ab}$ implies (2.5). Conversely, if $h \in R_\Sigma$, then (2.6) defines the transition map for some holomorphic basis $e^{\mathrm{hol}}$. The basis $h^{-1}e^{\mathrm{hol}}$ satisfies (2.2) and therefore can be taken as $e^{C^\infty}$.

Consider the group
$$G_\Sigma = \{\Omega^0_{C^\infty}(U_a, P), \ a = 1, \ldots\}. \tag{2.7}$$

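It transforms local bases of $V$ over $U_a$. The group acts on itself by left and right multiplications. There are two subgroups of $G_\Sigma$. Let $x_a \in \Omega^0_{C^\infty}(U_a, P)$. Then
$$G^{\mathrm{hol}}_\Sigma = \{x_a \to f_a x_a \mid f \in \Omega^0_{\mathrm{hol}}(\Sigma, P)\}, \tag{2.8}$$
$$G^{C^\infty}_\Sigma = \{x_a \to x_a \varphi_a \mid \varphi \in \Omega^0_{C^\infty}(\Sigma, P), \ \varphi(z_b(z_a), \bar z_b(\bar z_a)) = \varphi(z_a, \bar z_a), \ z_a \in U_{ab}\}. \tag{2.9}$$
We can consider the GTF (2.5) as a subset in $G_\Sigma$. We have the following evident statement.

Proposition 2.2. The left and right actions of $G^{\mathrm{hol}}_\Sigma$ and $G^{C^\infty}_\Sigma$ leave $R_\Sigma$ invariant. In other words,
$$
G^{\mathrm{hol}}_\Sigma:\quad
\begin{array}{ccc}
G_\Sigma & \xrightarrow{\ \text{left mult.}\ } & G_\Sigma \\
\cup & & \cup \\
R_\Sigma & \xrightarrow{\ \text{left mult.}\ } & R_\Sigma
\end{array}
\qquad \text{and} \qquad
G^{C^\infty}_\Sigma:\quad
\begin{array}{ccc}
G_\Sigma & \xrightarrow{\ \text{right mult.}\ } & G_\Sigma \\
\cup & & \cup \\
R_\Sigma & \xrightarrow{\ \text{right mult.}\ } & R_\Sigma
\end{array}
$$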
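Although the statement is called evident, the check behind Proposition 2.2 is short enough to spell out. The sketch below assumes only the definitions (2.4), (2.5), (2.8) and (2.9): $f_a$ is holomorphic, and $\varphi$ is a global section, so $\varphi_a = \varphi_b$ on $U_{ab}$.

```latex
% Left action by a holomorphic f (\bar\partial f_a = 0): \bar A_a does not change.
\begin{align*}
(f_a h_a)^{-1}\,\bar\partial (f_a h_a)
  &= h_a^{-1} f_a^{-1} (\bar\partial f_a)\, h_a + h_a^{-1}\bar\partial h_a
   = h_a^{-1}\bar\partial h_a = \bar A_a .
\end{align*}
% Right action by a global \varphi: \bar A_a undergoes a gauge transformation.
\begin{align*}
(h_a \varphi_a)^{-1}\,\bar\partial (h_a \varphi_a)
  &= \varphi_a^{-1} \bar A_a\, \varphi_a + \varphi_a^{-1}\bar\partial \varphi_a .
\end{align*}
% On U_{ab} we have \varphi_a = \varphi_b, so \bar A_a = \bar A_b implies that
% the transformed fields also coincide there. In both cases (2.5) is preserved.
```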


2.2. Consider the space of holomorphic structures on the bundles $V$ and $P$. Since $g > 1$ there is an open subset of stable holomorphic structures. The holomorphic structures can be defined in two ways. In the first type of construction, which we call the D-type, the holomorphic structures are defined by covariant operators. For $V$ they are $d''_A : \Omega^0_{C^\infty}(\Sigma, V) \to \Omega^{(0,1)}_{C^\infty}(\Sigma, V)$. It means that $\bar A$ satisfies (2.5). The holomorphic structure is consistent with the complex structure on $\Sigma_g$, so that for any section $s \in \Omega^0_{C^\infty}(\Sigma, V)$ and $f \in C^\infty(\Sigma)$ we have $d''_A(fs) = (\bar\partial f)s + f d''_A s$. The space of holomorphic structures $L^D_\Sigma$ on $P$ is defined in a similar way,
$$L^D_\Sigma = \{d''_A = \bar\partial + \bar A : \Omega^0_{C^\infty}(\Sigma, P) \to \Omega^{(0,1)}_{C^\infty}(\Sigma, P)\}, \tag{2.10}$$
with the action in the adjoint representation. The stable holomorphic structures $L^{D,st}_\Sigma$ are an open subset in (2.10). The automorphisms of the holomorphic structures are given by the action of the gauge group $G^{C^\infty}_\Sigma$ (2.9),
$$d''_A \to \varphi^{-1} d''_A \varphi, \qquad \varphi \in G^{C^\infty}_\Sigma. \tag{2.11}$$
They preserve the subset $L^{D,st}_\Sigma$. The moduli space $L$ of stable holomorphic structures on $P$ is the quotient space
$$L = L^{D,st}_\Sigma / G^{C^\infty}_\Sigma. \tag{2.12}$$
It is a smooth complex manifold with tangent space at $\bar A$ isomorphic to $H^{(0,1)}(\Sigma, \mathrm{Lie}(GL(N,\mathbb{C})))$. Its dimension is given by the Riemann-Roch theorem,
$$\dim L = N^2(g-1) + 1. \tag{2.13}$$
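The count behind (2.13) can be sketched as follows. The computation assumes two standard facts not stated in the text: $\mathrm{End}\,V$ has rank $N^2$ and degree $0$, and a stable bundle has only scalar holomorphic endomorphisms, so that $h^0(\Sigma, \mathrm{End}\,V) = 1$.

```latex
% Riemann-Roch for a bundle E of rank r and degree d on a curve of genus g:
%   h^0(E) - h^1(E) = d + r(1 - g).
\begin{align*}
h^0(\Sigma,\mathrm{End}\,V) - h^1(\Sigma,\mathrm{End}\,V)
  &= 0 + N^2 (1 - g), \\
h^1(\Sigma,\mathrm{End}\,V)
  &= h^0(\Sigma,\mathrm{End}\,V) + N^2 (g - 1)
   = N^2 (g - 1) + 1 .
\end{align*}
% The tangent space H^{(0,1)}(\Sigma, Lie(GL(N,C))) therefore has the
% dimension quoted in (2.13).
```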

The left action of the gauge transformations $G^{\mathrm{hol}}_\Sigma$ (2.8) does not change $\bar A_a = h_a^{-1}\bar\partial h_a$, $a = 1, \ldots$. Therefore the space $L^D_\Sigma$ (2.10) can be represented as the quotient space $L^D_\Sigma = G^{\mathrm{hol}}_\Sigma \backslash R_\Sigma$. There is an open subset $R^{st}_\Sigma$ in $R_\Sigma$ such that the subset of the stable holomorphic structures is the quotient space
$$L^{D,st}_\Sigma = G^{\mathrm{hol}}_\Sigma \backslash R^{st}_\Sigma.$$
The main statement of this section follows immediately from (2.12).

Proposition 2.3. The moduli space $L$ of stable holomorphic structures on $P$ can be represented as the double coset space
$$L = G^{\mathrm{hol}}_\Sigma \backslash R^{st}_\Sigma / G^{C^\infty}_\Sigma. \tag{2.14}$$

2.3. An alternative description of the holomorphic structures in terms of the Čech cohomologies, which we call the C-type construction, is based on the transition maps (2.3), (2.6). The collection of transition maps
$$L^{Ch}_\Sigma = \{g_{ab}(z_a) = h_a(z_a)\, h_b^{-1}(z_b(z_a)), \ z_a \in U_{ab}, \ a, b = 1, \ldots\} \tag{2.15}$$
defines the holomorphic structures on $V$ or $P$ depending on the choice of the representations. Again we choose the open subset of stable holomorphic structures $L^{Ch,st}_\Sigma$ in $L^{Ch}_\Sigma$. The gauge group $G^{\mathrm{hol}}_\Sigma$ acts as the automorphisms of $L^{Ch,st}_\Sigma$,
$$g_{ab} \to f_a g_{ab} f_b^{-1}, \qquad f_a = f(z_a), \quad f_b = f_b(z_b(z_a)), \quad f \in G^{\mathrm{hol}}_\Sigma. \tag{2.16}$$
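How the two group actions on the tetrad field $h$ descend to the transition maps can be checked directly from (2.6). The sketch below uses only $g_{ab} = h_a h_b^{-1}$ and the periodicity condition $\varphi_a = \varphi_b$ on $U_{ab}$ from (2.9).

```latex
% Left action h_a -> f_a h_a reproduces the gauge action (2.16):
\begin{align*}
g_{ab} = h_a h_b^{-1}
  \;\longmapsto\; (f_a h_a)(f_b h_b)^{-1}
  = f_a\, h_a h_b^{-1} f_b^{-1}
  = f_a\, g_{ab}\, f_b^{-1} .
\end{align*}
% Right action h_a -> h_a \varphi_a leaves the transition maps untouched:
\begin{align*}
g_{ab}
  \;\longmapsto\; (h_a \varphi_a)(h_b \varphi_b)^{-1}
  = h_a\, \varphi_a \varphi_b^{-1}\, h_b^{-1}
  = h_a h_b^{-1} = g_{ab}
  \qquad (\varphi_a = \varphi_b \text{ on } U_{ab}) .
\end{align*}
```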


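The space $L^{Ch}_\Sigma$ has a transparent description in terms of graphs. Consider the skeleton of the covering $\{U_a, a = 1, \ldots\}$. It is an oriented graph, whose vertices $V_a$ are some fixed inner points in $U_a$ and whose edges $L_{ab}$ connect those $V_a$ and $V_b$ for which $U_{ab} \neq \emptyset$. We choose an orientation of the graph, saying that $a > b$ on the edge $L_{ab}$, and put on it the holomorphic function $z_b(z_a)$ which defines the holomorphic map from $U_a$ to $U_b$. Then the space $L^{Ch}_\Sigma$ can be defined by the following data. To each edge $L_{ab}$, $a > b$, we attach a matrix valued function $g_{ab} \in GL(N,\mathbb{C})$ along with $z_b(z_a)$. The gauge fields $f_a$ live on the vertices $V_a$ and the gauge transformation is (2.16). The moduli space of stable holomorphic bundles is defined as the factor space under this action,
$$L = G^{\mathrm{hol}}_\Sigma \backslash L^{Ch,st}_\Sigma. \tag{2.17}$$
The tangent space to the moduli space in this approach is $H^1(\Sigma, \mathrm{Lie}(GL(N,\mathbb{C})))$ extracted from the Čech complex. Though $L^{Ch,st}_\Sigma$ differs from $L^{D,st}_\Sigma$, we obtain the same moduli space $L$ of stable holomorphic structures on $P$ due to the equivalence of the Dolbeault and the Čech cohomologies.

In this construction the right action of $G^{C^\infty}_\Sigma$ (2.9) leaves the transition maps $g_{ab}$ invariant. Therefore
$$L^{Ch,st}_\Sigma = R^{st}_\Sigma / G^{C^\infty}_\Sigma. \tag{2.18}$$
Taking into account (2.17) we come to the same construction of the moduli space as the double coset space (2.14).

2.4. We fit the components of our construction in the exact bicomplex $M$:
$$
\begin{array}{cccccccc}
 & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & \\
0 & \to & \Omega^0_{\mathrm{hol}}(U_{ab}, P) & \xrightarrow{\ i\ } & \Omega^0_{C^\infty}(U_{ab}, P) & \xrightarrow{\ \bar\partial\ } & \Omega^{(0,1)}_{C^\infty}(U_{ab}, \mathrm{End}V) & \to \\
 & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & \\
0 & \to & \Omega^0_{\mathrm{hol}}(U_{a}, P) & \xrightarrow{\ i\ } & \Omega^0_{C^\infty}(U_{a}, P) & \xrightarrow{\ \bar\partial\ } & \Omega^{(0,1)}_{C^\infty}(U_{a}, \mathrm{End}V) & \to \\
 & & \uparrow{\scriptstyle i} & & \uparrow{\scriptstyle i} & & \uparrow{\scriptstyle i} & \\
0 & \to & \Omega^0_{\mathrm{hol}}(\Sigma, P) & \xrightarrow{\ i\ } & \Omega^0_{C^\infty}(\Sigma, P) & \xrightarrow{\ \bar\partial\ } & \Omega^{(0,1)}_{C^\infty}(\Sigma, \mathrm{End}V) & \to \\
 & & \uparrow & & \uparrow & & \uparrow & \\
 & & 0 & & 0 & & 0 &
\end{array}
$$
Here $\partial^C$ are the Čech differentials and $i$ are the augmentations. The right arrows from $\Omega^0_{C^\infty}(*, P)$ to $\Omega^{(0,1)}_{C^\infty}(*, \mathrm{End}V)$ are of the type $h \to h^{-1}\bar\partial h$. We have
$$g_{ab} \in \Omega^0_{C^\infty}(U_{ab}, P), \qquad h_a \in \Omega^0_{C^\infty}(U_a, P), \qquad \delta\bar A \in \Omega^{(0,1)}_{C^\infty}(U_a, \mathrm{End}V).$$
If these fields satisfy the tetrad conditions (2.1), (2.3) or (2.5) then they lie in the images of $i$. The Dolbeault cohomologies $H^{(0,1)}(\Sigma, \mathrm{End}V)$ that define the tangent space to the moduli space live in $\Omega^{(0,1)}_{C^\infty}(\Sigma, \mathrm{End}V)$ and the Čech cohomologies $H^1(\Sigma, \mathrm{End}V)$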


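in $\Omega^0_{\mathrm{hol}}(U_{ab}, \mathrm{End}V)$. Their equivalence can be derived from the properties of the double spectral sequence. The gauge transformations can also be incorporated in the exact bicomplex $G$:
$$
\begin{array}{cccccccc}
 & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & \\
0 & \to & \Omega^0_{\mathrm{hol}}(U_{ab}, \mathrm{End}P) & \xrightarrow{\ i\ } & \Omega^0_{C^\infty}(U_{ab}, \mathrm{End}P) & \xrightarrow{\ \bar\partial\ } & \Omega^{(0,1)}_{C^\infty}(U_{ab}, \mathrm{End}P) & \xrightarrow{\ \bar\partial\ } \\
 & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & \\
0 & \to & \Omega^0_{\mathrm{hol}}(U_{a}, \mathrm{End}P) & \xrightarrow{\ i\ } & \Omega^0_{C^\infty}(U_{a}, \mathrm{End}P) & \xrightarrow{\ \bar\partial\ } & \Omega^{(0,1)}_{C^\infty}(U_{a}, \mathrm{End}P) & \xrightarrow{\ \bar\partial\ } \\
 & & \uparrow{\scriptstyle i} & & \uparrow{\scriptstyle i} & & \uparrow{\scriptstyle i} & \\
0 & \to & \Omega^0_{\mathrm{hol}}(\Sigma, \mathrm{End}P) & \xrightarrow{\ i\ } & \Omega^0_{C^\infty}(\Sigma, \mathrm{End}P) & \xrightarrow{\ \bar\partial\ } & \Omega^{(0,1)}_{C^\infty}(\Sigma, \mathrm{End}P) & \xrightarrow{\ \bar\partial\ } \\
 & & \uparrow & & \uparrow & & \uparrow & \\
 & & 0 & & 0 & & 0 &
\end{array}
$$
Let $\epsilon^{\mathrm{hol}} \in \mathrm{Lie}(G^{\mathrm{hol}}_\Sigma)$, $\epsilon^{C^\infty} \in \mathrm{Lie}(G^{C^\infty}_\Sigma)$. Then
$$\epsilon^{\mathrm{hol}}_a \in \text{Image of } \Omega^0_{\mathrm{hol}}(U_a, \mathrm{End}P) \text{ in } \Omega^0_{C^\infty}(U_a, \mathrm{End}P),$$
$$\epsilon^{C^\infty}_a \in \text{Image of } \Omega^0_{C^\infty}(\Sigma, \mathrm{End}P) \text{ in } \Omega^0_{C^\infty}(U_a, \mathrm{End}P).$$
The infinitesimal actions of the gauge groups (see (2.11) and (2.16)) are
$$\delta_{\epsilon^{C^\infty}} \bar A_a = \bar\partial \epsilon^{C^\infty}_a + [\bar A_a, \epsilon^{C^\infty}_a], \tag{2.19}$$
$$\delta_{\epsilon^{\mathrm{hol}}} g_{ab} = \epsilon^{\mathrm{hol}}_a g_{ab} - g_{ab} \epsilon^{\mathrm{hol}}_b. \tag{2.20}$$
More generally, $M$ is a bigraded $G$-module. The action of $G$ is consistent with both differentials $\partial^C$ and $\bar\partial$. The differentiations take into account the bigradings of $M$ and $G$. The actions (2.19), (2.20) are particular cases of these actions.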
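The infinitesimal formulas (2.19) and (2.20) follow from the finite actions (2.11) and (2.16) by a first-order expansion. The sketch below sets $\varphi = 1 + \epsilon^{C^\infty}$ and $f = 1 + \epsilon^{\mathrm{hol}}$ and drops terms of second order in $\epsilon$; nothing else is assumed.

```latex
% (2.11) -> (2.19): conjugate d''_A = \bar\partial + \bar A by \varphi = 1 + \epsilon.
\begin{align*}
\varphi^{-1}(\bar\partial + \bar A_a)\varphi
  &= \bar\partial + \varphi^{-1}(\bar\partial\varphi) + \varphi^{-1}\bar A_a \varphi \\
  &= \bar\partial + \bar A_a + \bar\partial\epsilon_a + [\bar A_a, \epsilon_a]
     + O(\epsilon^2), \\
\delta_{\epsilon}\bar A_a &= \bar\partial\epsilon_a + [\bar A_a, \epsilon_a] .
\end{align*}
% (2.16) -> (2.20): g_{ab} -> f_a g_{ab} f_b^{-1} with f = 1 + \epsilon^{hol}.
\begin{align*}
(1 + \epsilon^{\mathrm{hol}}_a)\, g_{ab}\, (1 - \epsilon^{\mathrm{hol}}_b)
  &= g_{ab} + \epsilon^{\mathrm{hol}}_a g_{ab} - g_{ab}\,\epsilon^{\mathrm{hol}}_b
     + O(\epsilon^2) .
\end{align*}
```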

3. The Schottky Specialization

We apply the general scheme to the particular covering of $\Sigma_g$ based on the Schottky parameterization. Consider the Riemann sphere with $2g$ circles $A_a, A'_a$, $a = 1, \ldots, g$. Each circle lies in the exterior of the others. Let $\gamma_a$ be $g$ projective maps, $A'_a = \gamma_a A_a$, $\gamma_a \in PSL(2)$. The Schottky group $\Gamma$ is a free group generated by $\gamma_a$, $a = 1, \ldots, g$. The exterior part of all the circles, $\tilde\Sigma = P^1 \setminus \cup_{b=1}^{2g} D_b$, is the fundamental domain of $\Gamma$. The surface $\Sigma$ is obtained from $\tilde\Sigma$ by the pairwise gluing of the circles $A'_a = \gamma_a A_a$ and $A_a$. We have only one non-simply-connected 2d cell $U_a \sim \tilde\Sigma$ with self-intersections $U_{aa'}$ = vicinity of $A_a$ = vicinity of $A'_a$. We choose $g$ local coordinates $z_a$, $a = 1, \ldots, g$, which define the parameterizations of the internal disks of the $A_a$ circles. In this case the holomorphic maps can be written as $z_{a'}(z_a) = \gamma_a(z_a)$. The GTF $R_\Sigma$ (2.5) is a twisted field $h$ on $\tilde\Sigma$,
$$h^{-1}\bar\partial h(z_a, \bar z_a) = h^{-1}\bar\partial h(\gamma_a(z_a), \bar\gamma_a(\bar z_a)), \qquad a = 1, \ldots, g.$$
In the definition of $G^{C^\infty}_\Sigma$ (2.9) "the periodicity conditions" take the form
$$\varphi(\gamma_a(z_a), \bar\gamma_a(\bar z_a)) = \varphi(z_a, \bar z_a), \qquad z_a \in \text{vicinity of } A_a.$$
The transition maps (2.3), (2.6) defining $L^{Ch}_\Sigma$ are
$$g_a = g_{aa'}(z_a) = h(z_a, \bar z_a)\, h^{-1}(\gamma_a(z_a), \bar\gamma_a(\bar z_a)), \qquad a = 1, \ldots, g. \tag{3.1}$$
The gauge group $G^{\mathrm{hol}}_\Sigma$ acts as a global holomorphic transformation on $\tilde\Sigma$. In the local coordinates we have
$$g_a \to f_a g_a f_{\gamma_a}^{-1}, \qquad f_a = f(z_a), \quad g_a = g_a(z_a), \quad f_{\gamma_a} = f(\gamma_a(z_a)). \tag{3.2}$$

In local coordinates $g_a$ have the form of Laurent polynomials, $g_a(z_a) = \sum_k g_a^{(k)} z_a^k$. Thus in this parameterization the set of holomorphic structures on the vector bundles $L^{Ch}_\Sigma$ can be identified with the collection of the loop groups $L(GL_a(N,\mathbb{C}))$. But in fact, taking into account the adjoint action of the gauge group (3.2), one concludes that the precise form of the components is the semidirect product $L(GL(N,\mathbb{C})) \rtimes PSL(2) = \{g(z) \rtimes \gamma(z)\}$. Thus
$$L^{Ch}_\Sigma = \oplus_{a=1}^{g} L_a(GL(N,\mathbb{C})) \rtimes PSL(2)_a, \tag{3.3}$$
where the subgroups $\{PSL(2)_a\}$, $a = 1, \ldots, g$ are responsible for the complex structure on $\Sigma$. To define the stable bundles one should choose an open subset in $L_a(GL(N,\mathbb{C}))$.

Consider the bundles over genus $g = 1$ curves. Though the bundles are unstable, this case can be completely described in well-known terms. The Schottky parameterization means the realization of the elliptic curve as an annulus. Let $\gamma(z) = qz$, $q = \exp(2\pi i\tau)$. The holomorphic bundles are defined by the loop group extended by the shift operator,
$$L^{Ch}_\Sigma = L(GL(N,\mathbb{C})) \rtimes \exp(2\pi i\tau z\partial). \tag{3.4}$$
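That $\exp(2\pi i\tau z\partial)$ really implements the Schottky map $\gamma(z) = qz$ can be seen on Laurent monomials. The following sketch assumes only that the exponential is applied termwise to a Laurent expansion $g(z) = \sum_k g^{(k)} z^k$.

```latex
\begin{align*}
z\partial_z\, z^k &= k\, z^k , \\
\exp(2\pi i \tau\, z\partial_z)\, z^k &= e^{2\pi i \tau k}\, z^k = q^k z^k = (qz)^k , \\
\exp(2\pi i \tau\, z\partial_z)\, g(z) &= \sum_k g^{(k)} (qz)^k = g(qz) .
\end{align*}
% The operator in (3.4) is thus the shift z -> qz acting on the loop variable,
% which is the twist appearing in the gauge action g(z) -> f(z) g(z) f^{-1}(qz).
```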

The gauge action (3.2),
$$g(z) \to f(z)\, g(z)\, f^{-1}(qz),$$
transforms $g(z)$ to a $z$-independent diagonal form, up to the action of the complex affine Weyl group $\hat W$. Let $W$ be the $A_{N-1}$ Weyl group (the permutations of the Cartan elements). Then $\hat W = (\mathbb{Z}R^\vee\tau + \mathbb{Z}R^\vee) \rtimes W$ ($R^\vee$ is the dual root system). The moduli space $L$ in this case is the Weyl alcove. The comparison of the two descriptions of holomorphic structures on elliptic curves, (3.4) and (2.10), was carried out in [10, 11] in terms of two-loop current algebras and invariants of their coadjoint actions.

In the general case ($g > 1$) the gauge transform (3.2) allows one to choose $g_a$ as constant matrices. They are defined up to the common conjugation by a $GL(N,\mathbb{C})$ matrix. Thus the moduli space of holomorphic bundles in the description (3.3) is defined as the quotient
$$L \sim \underbrace{(GL(N,\mathbb{C}) \oplus \ldots \oplus GL(N,\mathbb{C}))}_{g} / GL(N,\mathbb{C}).$$
Since the center of $GL(N,\mathbb{C})$ acts trivially we obtain $\dim L = N^2(g-1) + 1$ (see (2.13)).
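The dimension count in the last sentence can be made explicit. The sketch below assumes that, on a generic point, simultaneous conjugation acts effectively through $PGL(N,\mathbb{C}) = GL(N,\mathbb{C})/\text{center}$, which has dimension $N^2 - 1$.

```latex
\begin{align*}
\dim L
  &= \underbrace{g \cdot N^2}_{g \text{ constant matrices } g_a}
     \;-\; \underbrace{(N^2 - 1)}_{\text{effective part of the conjugation}} \\
  &= N^2 (g - 1) + 1 ,
\end{align*}
% in agreement with the Riemann-Roch value (2.13).
```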


4. Symplectic Geometry in the Double Coset Picture

Here we consider the Hitchin integrable systems which are defined on the cotangent bundle $T^*L$ to the moduli of stable holomorphic bundles $L$. As was done in the original work [7], this space is derived as a symplectic quotient of $T^*L^D_\Sigma$ under the gauge action of $G^{C^\infty}_\Sigma$. We will come to the same systems by three-step symplectic reductions from some big upstairs space. The main object of this section is the commutative diagram (4.10), which describes these reductions and intermediate spaces.

4.1. First, as an intermediate step, consider the Hitchin description of $T^*L$. The upstairs phase space is the cotangent bundle $T^*L^D_\Sigma$ to the space $L^D_\Sigma$ (2.10) of holomorphic structures on the bundle $P$ in the Dolbeault picture. It is the space of pairs
$$T^*L^D_\Sigma = \{\phi, d''_A, \ \phi \in \Omega^{(1,0)}_{C^\infty}(\Sigma, (\mathrm{End}V)^*)\}. \tag{4.1}$$
The field $\phi$ is called the Higgs field and the bundle $T^*L^D_\Sigma$ is the Higgs bundle. We can consider the Higgs field as a form $\phi \in \Omega^0_{C^\infty}(\Sigma, (\mathrm{End}V)^* \otimes K)$, where $K$ is the canonical bundle on $\Sigma$. Locally on $U_a$,
$$d''_a = \bar\partial + \bar A_a, \qquad \bar A_a = h_a^{-1}\bar\partial h_a, \qquad h_a \in \Omega^0_{C^\infty}(U_a, R_\Sigma).$$
The symplectic form on it is
$$\omega^D = \int_\Sigma \mathrm{tr}(D\phi, Dd''_A). \tag{4.2}$$
The action of the gauge group $G^{C^\infty}_\Sigma$ (2.9) on $d''_A$ (2.11), together with
$$\phi \to \varphi^{-1}\phi\varphi,$$
is a symmetry of $T^*L^D_\Sigma$. It defines the moment map
$$\mu_{G^{C^\infty}_\Sigma}(\phi, \bar A) : T^*L^D_\Sigma \to \mathrm{Lie}^*(G^{C^\infty}_\Sigma), \qquad \mu_{G^{C^\infty}_\Sigma}(\phi, \bar A) = [d''_A, \phi].$$
On the zero level of the moment map, $[d''_A, \phi] = 0$, the Higgs field becomes holomorphic,
$$\phi \in H^0(\Sigma, (\mathrm{End}V)^* \otimes K).$$
The symplectic quotient $\mu^{-1}(0)/G^{C^\infty}_\Sigma = T^*L^D_\Sigma // G^{C^\infty}_\Sigma$ is identified with the cotangent bundle to the moduli space $T^*L$. The Hitchin commuting integrals are constructed by means of $(1-j, 1)$ holomorphic differentials $\nu^D_{j,k}$, $k = 1, \ldots$:
$$I^D_{j,k} = \int_\Sigma \nu^D_{j,k}\, \mathrm{tr}\,\phi^j. \tag{4.3}$$

Since the space of these differentials $H^0(\Sigma, K \otimes T^j)$ ($K$ is the canonical class, $T^j$ is the space of $(-j, 0)$ forms) has dimension $(2j-1)(g-1)$ for $j > 1$ and $g$ for $j = 1$, we have $N^2(g-1) + 1$ independent commuting integrals, providing the complete integrability of the Hamiltonian systems (4.2), (4.3). The integrals (4.3) define the Hitchin map
$$H^0(\Sigma, (\mathrm{End}V)^* \otimes K) \to H^0(\Sigma, K^j).$$
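The number of integrals quoted above is the sum of the dimensions over $j = 1, \ldots, N$. As a quick arithmetic check, using only the dimensions stated in the text and the identity $\sum_{j=1}^{N}(2j-1) = N^2$:

```latex
\begin{align*}
\sum_{j=1}^{N} \dim
  &= g + \sum_{j=2}^{N} (2j - 1)(g - 1) \\
  &= g + (g - 1)\Bigl(\sum_{j=1}^{N} (2j - 1) - 1\Bigr) \\
  &= g + (g - 1)(N^2 - 1) \\
  &= N^2 (g - 1) + 1 .
\end{align*}
% This equals dim L from (2.13), i.e. half the dimension of T^*L,
% which is what complete integrability requires.
```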

4.2. The same system can be derived starting from the cotangent bundle $T^*L^{Ch}_\Sigma$ to the holomorphic structures on $P$ defined in the C-type description (2.15). Now
$$T^*L^{Ch}_\Sigma = \{\eta_{ab}, g_{ab} \mid \eta_{ab} \in \Omega^{(1,0)}_{\mathrm{hol}}(U_{ab}, (\mathrm{End}V)^*), \ g_{ab} \in \Omega^0_{\mathrm{hol}}(U_{ab}, P)\}. \tag{4.4}$$
This bundle can be endowed with the symplectic structure by means of the Cartan-Maurer one-forms on $\Omega^0_{\mathrm{hol}}(U_{ab}, P)$. Let $\Gamma^b_a(C, D)$ be a path in $U_{ab}$ with the end points in the triple intersections $C \in U_{abc} = U_a \cap U_b \cap U_c$, $D \in U_{abd}$. We can put the data (4.4) on the fat graph corresponding to the covering $\{U_a\}$. The edges of the graph are $\{\Gamma^b_a(CD)\}$ and $\{\Gamma^a_b(DC)\}$ with opposite orientation. We assume that the covering is such that the orientation of edges defines the oriented contours around the faces $U_a$. The fields $\eta_{ab}, g_{ab}$ are attributed to the edge $\Gamma^b_a(CD)$, while $\eta_{ba}, g_{ba}$ are attributed to $\Gamma^a_b(DC)$. The last pair is not independent: $g_{ab}^{-1} = g_{ba}$ (see (2.3)). Its counterpart in the dual space is
$$\eta_{ab}(z_a) = g_{ab}(z_a)\, \eta_{ba}(z_b(z_a))\, g_{ab}^{-1}(z_a). \tag{4.5}$$
The symplectic structure is defined by the form
$$\omega^{Ch} = \sum_{\text{edges}} \int_{\Gamma^b_a(CD)} D\, \mathrm{tr}\bigl(\eta_{ab}(z_a)(Dg_{ab}\, g_{ab}^{-1})(z_a)\bigr). \tag{4.6}$$
Here the sum is taken over the edges of the oriented graph obtained from the fat graph after the identifications of fields (4.5). In other words, we consider only the edge $\Gamma^b_a$ with the fields $g_{ab}, \eta_{ab}$ and forget about the edge $\Gamma^a_b$. Since $\eta_{ab}$ and $g_{ab}$ are holomorphic in $U_{ab}$, the definition is independent of the choice of the path $\Gamma^b_a$ within $U_{ab}$. The symplectic form is invariant under the gauge transformations (2.16) supplemented by
$$\eta_{ab} \to f_a \eta_{ab} f_a^{-1}. \tag{4.7}$$
The set of invariant commuting integrals on $T^*L^{Ch}_\Sigma$ is
$$I^{Ch}_{j,k} = \sum_{\text{edges}} \int_{\Gamma^b_a(CD)} \nu^{Ch}_{(j,k)}(z_a)\, \mathrm{tr}(\eta^j_{ab}(z_a)), \tag{4.8}$$
where $\nu^{Ch}_{j,k}$ are $(1-j, 0)$ differentials, which are related locally to the $(1-j, 1)$ differentials $\nu^D_{j,k}$ as $\nu^D_{j,k} = \bar\partial \nu^{Ch}_{j,k}$. We can consider the system on the above defined graph $\{L_{ab}\}$ which is dual to $\{\Gamma^b_a(CD)\}$. The fields $g_{ab}, \eta_{ab}$, $a, b = 1, \ldots$ are defined on the edges, while the gauge transformations $f_a$ live on the vertices. The moment map is
$$\mu_{G^{\mathrm{hol}}_\Sigma}(\eta_{ab}, g_{ab}) : T^*L^{Ch}_\Sigma \to \mathrm{Lie}^*(G^{\mathrm{hol}}_\Sigma).$$

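According to (2.20) the Hamiltonian generating the gauge transformations is
$$
\begin{aligned}
F_{\epsilon^{\mathrm{hol}}}
 &= \sum_{\text{edges}} \int_{\Gamma^b_a(CD)} \Bigl[\mathrm{tr}\bigl(\eta_{ab}(z_a)\,\epsilon^{\mathrm{hol}}_a(z_a)\bigr) - \mathrm{tr}\bigl(\eta_{ab}(z_a)\, g_{ab}(z_a)\, \epsilon^{\mathrm{hol}}_b(z_b(z_a))\, g_{ab}(z_a)^{-1}\bigr)\Bigr] \\
 &= \sum_{\text{edges}} \int_{\Gamma^b_a(CD)} \Bigl[\mathrm{tr}\bigl(\eta_{ab}(z_a)\,\epsilon^{\mathrm{hol}}_a(z_a)\bigr) - \mathrm{tr}\bigl(\eta_{ba}(z_b(z_a))\, \epsilon^{\mathrm{hol}}_b(z_b(z_a))\bigr)\Bigr] \\
 &= \sum_a \int_{\Gamma_a} \sum_b \mathrm{tr}\bigl(\eta_{ab}(z_a)\,\epsilon^{\mathrm{hol}}_a(z_a)\bigr),
\end{aligned}
$$
where $\Gamma_a$ is an oriented contour around $U_a$. The moment equation $\mu_{G^{\mathrm{hol}}_\Sigma} = 0$ can be read off from $F_{\epsilon^{\mathrm{hol}}}$. It means that $\eta_{ab}$ is a boundary value of some holomorphic form defined on $U_a$,
$$\eta_{ab}(z_a) = H_a(z_a) \ \text{ for } z_a \in U_{ab}, \qquad H_a \in \Omega^{(1,0)}_{\mathrm{hol}}(U_a, (\mathrm{End}V)^*). \tag{4.9}$$
The reduced system is again the cotangent bundle to the moduli space of holomorphic bundles
$$T^*L = G^{\mathrm{hol}}_\Sigma \backslash\backslash T^*L^{Ch}_\Sigma = G^{\mathrm{hol}}_\Sigma \backslash \mu^{-1}_{G^{\mathrm{hol}}_\Sigma}(0),$$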

which has dimension $2N^2(g-1) + 2$.

4.3. To get the cotangent bundle $T^*L$ via the symplectic reduction we can start from $T^*R_\Sigma$ using the double coset representation (2.14). Then $T^*L^D_\Sigma$ or $T^*L^{Ch}_\Sigma$ are obtained on the intermediate stages of the two-step reduction under the actions of $G^{\mathrm{hol}}_\Sigma$ or $G^{C^\infty}_\Sigma$. Since these groups act from different sides on $R_\Sigma$, their actions commute and the result of the reduction procedure is independent of their order. But the space $R_\Sigma$, as we already have remarked, is not free: its elements satisfy (2.5). We will represent the constraints (2.5) as moment constraints and consider the "superfree" space, the cotangent bundle to the group $G_\Sigma$ (2.7). More exactly, we will consider (Theorem 4.1) the three-step symplectic reductions which result in the following commutative "tadpole" diagram:
$$
\begin{array}{ccccc}
 & & T^*G_\Sigma & & \\
 & & \downarrow{\scriptstyle G^A_\Sigma} & & \\
 & & T^*R_\Sigma & & \\
 & {\scriptstyle G^{\mathrm{hol}}_\Sigma}\swarrow & & \searrow{\scriptstyle G^{C^\infty}_\Sigma} & \\
T^*L^D_\Sigma & & & & T^*L^{Ch}_\Sigma \\
 & {\scriptstyle G^{C^\infty}_\Sigma}\searrow & & \swarrow{\scriptstyle G^{\mathrm{hol}}_\Sigma} & \\
 & & T^*L & &
\end{array}
\tag{4.10}
$$
To begin with we define the initial data on $T^*G_\Sigma$ and the gauge group $G^A_\Sigma$. To construct $T^*G_\Sigma$ we consider three dual elements,
$$\Psi_a \in \Omega^{(1,1)}_{C^\infty}(U_a, (\mathrm{End}V)^*), \qquad \eta_{ab}, \eta_{ba} \in \Omega^{(1,0)}_{C^\infty}(U_{ab}, (\mathrm{End}V)^*), \qquad \xi_{ab}, \xi_{ba} \in \Omega^{(0,1)}_{C^\infty}(U_{ab}, (\mathrm{End}V)^*).$$

The cotangent bundle $T^*G_\Sigma$ is the set of fields $(\Psi, \eta, \xi, h)$. We endow it with the symplectic structure. Consider the same fat graph with edges $\Gamma^b_a(CD)$ and $\Gamma^a_b(DC)$ as in Sect. 4.2. Then
$$\omega_\Sigma = D\Bigl\{\sum_a \Bigl[\int_{U_a} \mathrm{tr}(\Psi_a\, Dh_a h_a^{-1}) + \sum_b \Bigl(\int_{\Gamma^b_a} \mathrm{tr}(\eta_{ab}\, Dh_a h_a^{-1}) + \int_{\Gamma^b_a} \mathrm{tr}(\xi_{ab}\, h_a^{-1} Dh_a)\Bigr)\Bigr]\Bigr\}. \tag{4.11}$$
We assume as before that the paths $\Gamma^b_a, \Gamma^c_a, \ldots$ can be unified in a closed oriented contour $\Gamma_a \subset U_a$. The integral over $U_a$ means in fact the integral over the part of $U_a$ bounded by the contour $\Gamma_a$. Thus the first sum can be replaced by the integration over $\Sigma$. To maintain the independence of $\omega_\Sigma$ of the choice of the contours $\Gamma_a$ we introduce the following "gauge" symmetry. Its action defines variations of fields along with variations of contours. Let $\Gamma'_a$ be another contour and $\delta U_a$ be the corresponding variation of the integration domain. There is an integral relation between the fields coming from the Stokes theorem, providing the independence of $\omega_\Sigma$,
$$
\begin{aligned}
\int_{\delta U_a} \mathrm{tr}(\Psi_a\, Dh_a h_a^{-1})
 = {} & \Bigl[\int_{\Gamma_a} \mathrm{tr}(\eta_{ab}\, Dh_a h_a^{-1}) + \int_{\Gamma_a} \mathrm{tr}(\xi_{ab}\, h_a^{-1} Dh_a)\Bigr] \\
 & - \Bigl[\int_{\Gamma'_a} \mathrm{tr}(\eta_{ab}\, Dh_a h_a^{-1}) + \int_{\Gamma'_a} \mathrm{tr}(\xi_{ab}\, h_a^{-1} Dh_a)\Bigr].
\end{aligned}
\tag{4.12}
$$
In other words, the variation of contours is compensated by the variation of the field $\Psi$.

The form $\omega_\Sigma$ (4.11) is invariant under the actions of $G^{\mathrm{hol}}_\Sigma$ and $G^{C^\infty}_\Sigma$:
$$h_a \to f_a h_a, \quad \Psi_a \to f_a \Psi_a f_a^{-1}, \quad \eta_{ab} \to f_a \eta_{ab} f_a^{-1}, \quad \xi_{ab} \to \xi_{ab}, \tag{4.13}$$
$$h_a \to h_a \varphi_a, \quad \Psi_a \to \Psi_a, \quad \eta_{ab} \to \eta_{ab}, \quad \xi_{ab} \to \varphi_a^{-1} \xi_{ab} \varphi_a. \tag{4.14}$$
We extend these group transformations by the following affine action of the group
$$G^A_\Sigma = \{s_{ab} \in \Omega^0_{C^\infty}(U_{ab}, P) \mid s_{ab} = s_{ba}\} \tag{4.15}$$
on $\xi_{ab}$,
$$\xi_{ab} \to \xi_{ab} - s_{ab}^{-1}(\bar\partial + h_a^{-1}\bar\partial h_a)s_{ab} + h_a^{-1}\bar\partial h_a,$$
leaving the other fields untouched. This action commutes with $G^{\mathrm{hol}}_\Sigma$, but does not commute with $G^{C^\infty}_\Sigma$. $G^A_\Sigma$ can be embedded in the bicomplex $G$ (see (4.15)). On the Lie algebra level we have
$$\xi_{ab} \to \xi_{ab} - (\bar\partial \epsilon^A_{ab} + [h_a^{-1}\bar\partial h_a, \epsilon^A_{ab}]), \tag{4.16}$$
where $\epsilon^A_{ab} \in \mathrm{Lie}(G^A_\Sigma) = \{\Omega^0_{C^\infty}(U_{ab}, \mathrm{End}V) \mid \epsilon^A_{ab} = \epsilon^A_{ba}\}$.

Proposition 4.1. The form $\omega_\Sigma$ (4.11) is invariant under the action of $G^A_\Sigma$.

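Proof. From (4.16)
$$\delta_{\epsilon^A} \xi_{ab} = -(\bar\partial \epsilon^A_{ab} + [h_a^{-1}\bar\partial h_a, \epsilon^A_{ab}]),$$
where $\epsilon^A \in \mathrm{Lie}(G^A_\Sigma)$. Then
$$
\begin{aligned}
-\delta_{\epsilon^A} \omega_\Sigma
 &= \sum_a \sum_b \int_{\Gamma^b_a} \mathrm{tr}\bigl\{D([h_a^{-1}\bar\partial h_a, \epsilon^A_{ab}])\, h_a^{-1}Dh_a + \bar\partial \epsilon^A_{ab}\, D(h_a^{-1}Dh_a) + [h_a^{-1}\bar\partial h_a, \epsilon^A_{ab}]\, D(h_a^{-1}Dh_a)\bigr\} \\
 &= -\sum_a \sum_b \int_{\Gamma^b_a} \mathrm{tr}\bigl\{[D(h_a^{-1}\bar\partial h_a), h_a^{-1}Dh_a] + \bar\partial D(h_a^{-1}Dh_a) + [h_a^{-1}\bar\partial h_a, D(h_a^{-1}Dh_a)], \ \epsilon^A_{ab}\bigr\}.
\end{aligned}
$$
Then direct calculations, based on the identities $D(h_a^{-1}Dh_a) = -(h_a^{-1}Dh_a)^2$ and $D(h_a^{-1}\bar\partial h_a) = \bar\partial(h_a^{-1}Dh_a) + [h_a^{-1}\bar\partial h_a, h_a^{-1}Dh_a]$, show that the sum under the integral in front of $\epsilon^A_{ab}$ vanishes. Therefore $\omega_\Sigma$ is invariant under these transformations.

More generally, the dual fields $(\Psi_a, \eta_{ab}, \eta_{ba}, \xi_{ab}, \xi_{ba})$ can be incorporated in a general pattern of two exact $G$ bimoduli, $M'^*$ and $M''^*$:
$$
M'^*:\quad
\begin{array}{cccccccc}
 & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & \\
0 & \to & \Omega^{(1,0)}_{\mathrm{hol}}(U_{ab}, (\mathrm{End}V)^*) & \xrightarrow{\ i\ } & \Omega^{(1,0)}_{C^\infty}(U_{ab}, (\mathrm{End}V)^*) & \xrightarrow{\ \bar\partial\ } & \Omega^{(1,1)}_{C^\infty}(U_{ab}, (\mathrm{End}V)^*) & \xrightarrow{\ \bar\partial\ } \\
 & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & \\
0 & \to & \Omega^{(1,0)}_{\mathrm{hol}}(U_{a}, (\mathrm{End}V)^*) & \xrightarrow{\ i\ } & \Omega^{(1,0)}_{C^\infty}(U_{a}, (\mathrm{End}V)^*) & \xrightarrow{\ \bar\partial\ } & \Omega^{(1,1)}_{C^\infty}(U_{a}, (\mathrm{End}V)^*) & \xrightarrow{\ \bar\partial\ } \\
 & & \uparrow{\scriptstyle i} & & \uparrow{\scriptstyle i} & & \uparrow{\scriptstyle i} & \\
0 & \to & \Omega^{(1,0)}_{\mathrm{hol}}(\Sigma, (\mathrm{End}V)^*) & \xrightarrow{\ i\ } & \Omega^{(1,0)}_{C^\infty}(\Sigma, (\mathrm{End}V)^*) & \xrightarrow{\ \bar\partial\ } & \Omega^{(1,1)}_{C^\infty}(\Sigma, (\mathrm{End}V)^*) & \xrightarrow{\ \bar\partial\ } \\
 & & \uparrow & & \uparrow & & \uparrow & \\
 & & 0 & & 0 & & 0 &
\end{array}
$$
$$
M''^*:\quad
\begin{array}{cccccccc}
 & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & \\
0 & \to & \Omega^{(0,1)}_{\mathrm{antihol}}(U_{ab}, (\mathrm{End}V)^*) & \xrightarrow{\ i\ } & \Omega^{(0,1)}_{C^\infty}(U_{ab}, (\mathrm{End}V)^*) & \xrightarrow{\ \partial\ } & \Omega^{(1,1)}_{C^\infty}(U_{ab}, (\mathrm{End}V)^*) & \xrightarrow{\ \partial\ } \\
 & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & & \uparrow{\scriptstyle \partial^C} & \\
0 & \to & \Omega^{(0,1)}_{\mathrm{antihol}}(U_{a}, (\mathrm{End}V)^*) & \xrightarrow{\ i\ } & \Omega^{(0,1)}_{C^\infty}(U_{a}, (\mathrm{End}V)^*) & \xrightarrow{\ \partial\ } & \Omega^{(1,1)}_{C^\infty}(U_{a}, (\mathrm{End}V)^*) & \xrightarrow{\ \partial\ } \\
 & & \uparrow{\scriptstyle i} & & \uparrow{\scriptstyle i} & & \uparrow{\scriptstyle i} & \\
0 & \to & \Omega^{(0,1)}_{\mathrm{antihol}}(\Sigma, (\mathrm{End}V)^*) & \xrightarrow{\ i\ } & \Omega^{(0,1)}_{C^\infty}(\Sigma, (\mathrm{End}V)^*) & \xrightarrow{\ \partial\ } & \Omega^{(1,1)}_{C^\infty}(\Sigma, (\mathrm{End}V)^*) & \xrightarrow{\ \partial\ } \\
 & & \uparrow & & \uparrow & & \uparrow & \\
 & & 0 & & 0 & & 0 &
\end{array}
$$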

We recall that
$$\Psi_a \in \Omega^{(1,1)}_{C^\infty}(U_a, (\mathrm{End}V)^*), \qquad \eta_{ab}, \eta_{ba} \in \Omega^{(1,0)}_{C^\infty}(U_{ab}, (\mathrm{End}V)^*), \qquad \xi_{ab}, \xi_{ba} \in \Omega^{(0,1)}_{C^\infty}(U_{ab}, (\mathrm{End}V)^*).$$
We will see that after the symplectic reductions these fields will obey some special constraints. Now we have all the initial data to start from the top of the diagram (4.10): the fields, the symplectic form $\omega_\Sigma$ (4.11) and the gauge group actions (4.13), (4.14), (4.15).

Theorem 4.1. There exist two ways of symplectic reduction, represented by the commutative diagram (4.10), which lead from $T^*G_\Sigma$ to the cotangent bundle to the moduli space $T^*L$.


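To prove the Theorem we shall go down along the diagram.

4.4. Consider first the action of $G^A_\Sigma$ (4.16). Let $T^*R_\Sigma = \{\Psi, \eta, h\}$, where $h$ is a GTF, with the symplectic form
$$\omega_\Sigma = D\Bigl\{\sum_a \Bigl[\int_{U_a} \mathrm{tr}(\Psi_a\, Dh_a h_a^{-1}) + \sum_b \int_{\Gamma^b_a(CD)} \mathrm{tr}(\eta_{ab}\, Dh_a h_a^{-1})\Bigr]\Bigr\}. \tag{4.17}$$

Lemma 4.1.
$$T^*R_\Sigma = T^*G_\Sigma // G^A_\Sigma = \mu^{-1}_{G^A_\Sigma}(0)/G^A_\Sigma.$$

Proof. It follows from (4.11), (4.15) that the Hamiltonian of the $G^A_\Sigma$ action is
$$F_{\epsilon^A} = \sum_{a>b} F_{ab}, \qquad F_{ab} = \int_{\Gamma^b_a(CD)} \mathrm{tr}(\epsilon^A_{ab}\, h_a^{-1}\bar\partial h_a) + \int_{\Gamma^a_b(DC)} \mathrm{tr}(\epsilon^A_{ba}\, h_b^{-1}\bar\partial h_b).$$
In fact the one-form
$$DF_{ab} = \int_{\Gamma^b_a(CD)} \bigl\{\mathrm{tr}(\bar\partial \epsilon^A_{ab}\, h_a^{-1}Dh_a) + \mathrm{tr}(\epsilon^A_{ab}[h_a^{-1}\bar\partial h_a, h_a^{-1}Dh_a])\bigr\}$$
can be obtained from $\omega_\Sigma$ (4.11) by the action of the vector field generated by $G^A_\Sigma$ (4.15). But $\epsilon^A_{ab} = \epsilon^A_{ba}$ (4.15), and the two paths $\Gamma^b_a(CD)$ and $\Gamma^a_b(DC)$ have opposite orientations, so the two terms of $F_{ab}$ combine into an integral of $\mathrm{tr}(\epsilon^A_{ab}(h_a^{-1}\bar\partial h_a - h_b^{-1}\bar\partial h_b))$. Putting the moment equal to zero, $\mu_{G^A_\Sigma} = 0$, we come to the constraints $h_a^{-1}\bar\partial h_a = h_b^{-1}\bar\partial h_b$, which are exactly (2.5). Note that the gauge transform (4.16) allows one to fix $\xi_{ab} = 0$. Thus the symplectic quotient $T^*R_\Sigma = T^*G_\Sigma // G^A_\Sigma$ has the field content $(\Psi, \eta, h \in \Omega^0_{C^\infty}(\Sigma, P))$ with $\omega_\Sigma$ (4.17).

4.5. Consider the action of $G^{\mathrm{hol}}_\Sigma$ (2.20), (4.13) on $T^*R_\Sigma$, which corresponds to the left arrow in the diagram (4.10). We will prove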

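Lemma 4.2.
$$T^*L^D_\Sigma = G^{\mathrm{hol}}_\Sigma \backslash\backslash T^*R_\Sigma = G^{\mathrm{hol}}_\Sigma \backslash \mu^{-1}_{G^{\mathrm{hol}}_\Sigma}(0),$$
where $T^*L^D_\Sigma$ is defined by (4.1) with the symplectic structure (4.2).

Proof. From (4.13) and (4.17) we read off the Hamiltonian of the gauge fields
$$F_{\epsilon^{\mathrm{hol}}} = \sum_a \Bigl[\int_{U_a} \mathrm{tr}(\Psi_a \epsilon^{\mathrm{hol}}_a) + \sum_b \int_{\Gamma^b_a} \mathrm{tr}(\eta_{ab}\, \epsilon^{\mathrm{hol}}_a)\Bigr].$$
On $U_a$ we can put $\Psi_a = \bar\partial(\tilde\Phi_a + H_a)$, where $\tilde\Phi_a \in \Omega^{(1,0)}_{C^\infty}(U_a, (\mathrm{End}V)^*)$ and $H_a$ is an arbitrary element from $\Omega^{(1,0)}_{\mathrm{hol}}(U_a, (\mathrm{End}V)^*)$ (see $M'^*$). Then
$$F_{\epsilon^{\mathrm{hol}}} = \sum_a \sum_b \int_{\Gamma^b_a} \mathrm{tr}\bigl((\tilde\Phi_a + H_a + \eta_{ab})\, \epsilon^{\mathrm{hol}}_a\bigr).$$
Resolving the moment constraint $\mu_{G^{\mathrm{hol}}_\Sigma} = 0$ gives
$$\eta_{ab}(z_a, \bar z_a) = -\tilde\Phi_a(z_a, \bar z_a) - H_a(z_a), \qquad z_a \in U_{ab}. \tag{4.18}$$
By means of the Stokes theorem $\omega_\Sigma$ (4.17) can be transformed to the form
$$
\begin{aligned}
\omega_\Sigma
 &= D\Bigl\{\sum_a \Bigl[\int_{U_a} \mathrm{tr}\bigl(\bar\partial(\tilde\Phi_a + H_a)\, Dh_a h_a^{-1}\bigr) + \sum_b \int_{\Gamma^b_a} \mathrm{tr}(\eta_{ab}\, Dh_a h_a^{-1})\Bigr]\Bigr\} \\
 &= -D \sum_a \int_{U_a} \mathrm{tr}\bigl((\tilde\Phi_a + H_a)\, \bar\partial(Dh_a h_a^{-1})\bigr).
\end{aligned}
$$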

Let
$$\phi_a = -h_a^{-1}(\tilde\Phi_a + H_a)h_a. \tag{4.19}$$
Recall that $H_a = H_a(z_a)$ is an arbitrary holomorphic function on $U_a$. We will choose it in such a way that $\phi_a$ becomes a global section in $\Omega^{(1,0)}_{C^\infty}(\Sigma, (\mathrm{End}V)^*)$. In other words,
$$(\phi_a - \phi_b)|_{\Gamma^b_a} = 0. \tag{4.20}$$
In fact, since $g_{ab} = h_a h_b^{-1}$,
$$h_a(\phi_b - \phi_a)h_a^{-1} = (\tilde\Phi_a - g_{ab}\tilde\Phi_b g_{ab}^{-1}) + (H_a - g_{ab}H_b g_{ab}^{-1}), \tag{4.21}$$
where the second term is holomorphic. Consider the integral $I_a$ over the contour $\Gamma_a = \cup_b \Gamma^b_a$ around $U_a$,
$$I_a = -\int_{\Gamma_a} \sum_b \frac{(\tilde\Phi_a - g_{ab}\tilde\Phi_b g_{ab}^{-1})(y)}{z - y}\, dy.$$
Due to the Sokhotski-Plemelj theorem [12], $I_a$ is holomorphic inside and outside $\Gamma_a$. It has the jump $\tilde\Phi_a - g_{ab}\tilde\Phi_b g_{ab}^{-1}$ on the contour. Let
$$H_a = I_a \ \text{ in } U_a, \qquad H_b = g_{ab}^{-1} I_a\, g_{ab} \ \text{ outside } U_a.$$
Therefore the functions $H_a$ and $g_{ab}H_b g_{ab}^{-1}$ defined by this integral provide the vanishing of the left-hand side of (4.21). The symplectic form $\omega_\Sigma$ in terms of $\phi$ and $\bar A$ can be rewritten as
$$\omega_\Sigma = \sum_a D\int_{U_a} \mathrm{tr}(\phi_a\, D\bar A_a) = D\int_\Sigma \mathrm{tr}(\phi\, D\bar A).$$
This form coincides with $\omega^D$ (4.2) for $T^*L^D_\Sigma$. The field $\phi$ is invariant under the $G^{\mathrm{hol}}_\Sigma$ action (4.13). Therefore the symplectic reduction by the gauging of $G^{\mathrm{hol}}_\Sigma$ leaves us with the fields $\phi$ and $h$ and the symplectic structure (4.2). In other words, $G^{\mathrm{hol}}_\Sigma \backslash\backslash T^*R_\Sigma = T^*L^D_\Sigma$.
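The passage from the bulk integral in terms of $\tilde\Phi_a + H_a$ to the form $\mathrm{tr}(\phi\, D\bar A)$ rests on one algebraic identity. The following sketch uses only $\bar A_a = h_a^{-1}\bar\partial h_a$ from (2.4), the definition (4.19), and cyclicity of the trace; $D$ is the exterior differential on the space of fields, assumed to commute with $\bar\partial$.

```latex
\begin{align*}
D\bar A_a = D\bigl(h_a^{-1}\bar\partial h_a\bigr)
  &= -h_a^{-1} Dh_a\, h_a^{-1}\bar\partial h_a + h_a^{-1}\bar\partial Dh_a , \\
h_a^{-1}\,\bar\partial\bigl(Dh_a h_a^{-1}\bigr)\, h_a
  &= h_a^{-1}\bar\partial Dh_a - h_a^{-1} Dh_a\, h_a^{-1}\bar\partial h_a
   = D\bar A_a , \\
\mathrm{tr}\bigl(\phi_a\, D\bar A_a\bigr)
  &= -\mathrm{tr}\bigl(h_a^{-1}(\tilde\Phi_a + H_a) h_a \cdot
       h_a^{-1}\bar\partial(Dh_a h_a^{-1}) h_a\bigr) \\
  &= -\mathrm{tr}\bigl((\tilde\Phi_a + H_a)\,\bar\partial(Dh_a h_a^{-1})\bigr) .
\end{align*}
% The last line is the integrand obtained after the Stokes theorem, so
% \omega_\Sigma = D \sum_a \int_{U_a} tr(\phi_a D\bar A_a).
```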

We can now move down along the left side of diagram (4.10) as described in Subsect. 4.1 and obtain eventually $T^*L$. It is instructive to look at the relations between the two types of dual fields $\eta$ (4.18) and $\phi$ (4.19) that arise after these two consecutive reductions. On the first step we found that $\eta$ are boundary values of forms; by (4.18) and (4.19),
$$\eta_{ab}(z_a, \bar z_a) = h_a \phi\, h_a^{-1}|_{U_{ab}}. \tag{4.22}$$
Moreover, it follows from (4.20) that
$$\eta_{ab}(z_a, \bar z_a) = g_{ab}(z_a, \bar z_a)\, \eta_{ba}(z_b(z_a), \bar z_b(\bar z_a))\, g_{ab}(z_a, \bar z_a)^{-1}. \tag{4.23}$$
The second reduction gives $\bar\partial\phi + [\bar A, \phi] = 0$ (see Subsect. 4.1). It is equivalent to $\bar\partial\eta = 0$ due to (4.22), since $\bar\partial(h_a \phi h_a^{-1}) = h_a(\bar\partial\phi + [\bar A_a, \phi])h_a^{-1}$.

4.6. Now look at the right side of the diagram.

Lemma 4.3.
$$T^*L^{Ch}_\Sigma = T^*R_\Sigma // G^{C^\infty}_\Sigma = \mu^{-1}_{G^{C^\infty}_\Sigma}(0)/G^{C^\infty}_\Sigma,$$
where $T^*L^{Ch}_\Sigma$ is the cotangent bundle (4.4) with $\omega^{Ch}$ (4.6).

Proof. The gauge action of $G^{C^\infty}_\Sigma$ (4.14) on $T^*R_\Sigma$ defines the Hamiltonian (see (4.17))
$$F_{\epsilon^{C^\infty}} = \sum_a \Bigl\{\int_{U_a} \mathrm{tr}(h_a^{-1}\Psi_a h_a\, \epsilon^{C^\infty}_a) + \sum_b \int_{\Gamma^b_a} \mathrm{tr}(h_a^{-1}\eta_{ab} h_a\, \epsilon^{C^\infty}_a)\Bigr\}. \tag{4.24}$$
Consider the zero level of the moment map
$$\mu_{G^{C^\infty}_\Sigma} : T^*R_\Sigma \to \mathrm{Lie}^*(G^{C^\infty}_\Sigma).$$
From the first terms in (4.24) we obtain
$$\Psi_a = 0, \qquad a = 1, \ldots.$$
This choice of $\Psi$ breaks the invariance with respect to replacements of contours. But if $\bar\partial\eta_{ab} = 0$ then the exact form of the path $\Gamma^b_a(C, D)$ is nonessential. Note that this choice is consistent with the definition of $\eta$ (4.20) ($\eta_{ab} = H_a$ in the $G^{\mathrm{hol}}_\Sigma$ reduction). Picking up in the second sum in (4.24) the integrals over two neighboring edges we come to the condition
$$\int_{\Gamma^b_a} \mathrm{tr}(h_a^{-1}\eta_{ab} h_a\, \epsilon^{C^\infty}_a) + \int_{\Gamma^a_b} \mathrm{tr}(h_b^{-1}\eta_{ba} h_b\, \epsilon^{C^\infty}_b) = 0.$$
Since $\epsilon^{C^\infty} \in \mathrm{Lie}(G^{C^\infty}_\Sigma)$, it is "periodic": $\epsilon^{C^\infty}_a(z_a) = \epsilon^{C^\infty}_b(z_b(z_a))$. It gives the following form of the constraints:
$$(h_a^{-1}\eta_{ab} h_a)(z_a) = (h_b^{-1}\eta_{ba} h_b)(z_b(z_a)), \qquad z_a \in U_{ab},$$
or
$$\eta_{ab}(z_a) = g_{ab}(z_a)\, \eta_{ba}(z_b(z_a))\, g_{ab}^{-1}(z_a), \qquad \bigl(g_{ab}(z_a) = h_a(z_a, \bar z_a)\, h_b^{-1}(z_b(z_a), \bar z_b(\bar z_a))\bigr), \tag{4.25}$$
which is just the twisting property of $\eta$ (4.5). Furthermore, the symplectic form $\omega_\Sigma$ (4.17), due to the vanishing of the field $\Psi$, now is
$$\omega_\Sigma = D \sum_{a>b} \Bigl[\int_{\Gamma^b_a(CD)} \mathrm{tr}(\eta_{ab}\, Dh_a h_a^{-1}) + \int_{\Gamma^a_b(DC)} \mathrm{tr}(\eta_{ba}\, Dh_b h_b^{-1})\Bigr].$$
Taking into account that
$$(Dg_{ab}\, g_{ab}^{-1})(z_a) = Dh_a(z_a) h_a^{-1}(z_a) - h_a(z_a)(h_b^{-1}Dh_b)(z_b(z_a))\, h_a(z_a)^{-1},$$
and the moment constraint (4.25), we can rewrite $\omega_\Sigma$ as
$$\omega_\Sigma = \sum_{\text{edges}} \int_{\Gamma^b_a(CD)} D\, \mathrm{tr}\bigl(\eta_{ab}(z_a)(Dg_{ab}\, g_{ab}^{-1})(z_a)\bigr).$$
It is just $\omega^{Ch}$ (4.6) in the C-type description of holomorphic bundles. We have the same field content and the same symplectic structure as in $T^*L^{Ch}_\Sigma$. Therefore $T^*R_\Sigma // G^{C^\infty}_\Sigma = T^*L^{Ch}_\Sigma$.

The last step on the right side of the diagram was described in Subsect. 4.2. This completes the proof of the theorem.
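The two algebraic facts used at the end of the proof of Lemma 4.3 can be verified in a few lines. The sketch relies only on $g_{ab} = h_a h_b^{-1}$ from (2.6), the constraint $h_a^{-1}\eta_{ab}h_a = h_b^{-1}\eta_{ba}h_b$, and cyclicity of the trace; arguments of the fields are suppressed.

```latex
% Right-invariant form of g_{ab} = h_a h_b^{-1}:
\begin{align*}
Dg_{ab}\, g_{ab}^{-1}
  &= \bigl(Dh_a\, h_b^{-1} - h_a h_b^{-1} Dh_b\, h_b^{-1}\bigr)\, h_b h_a^{-1}
   = Dh_a\, h_a^{-1} - h_a \bigl(h_b^{-1} Dh_b\bigr) h_a^{-1} .
\end{align*}
% Pairing with \eta_{ab} and using the moment constraint:
\begin{align*}
\mathrm{tr}\bigl(\eta_{ab}\, Dg_{ab}\, g_{ab}^{-1}\bigr)
  &= \mathrm{tr}\bigl(\eta_{ab}\, Dh_a h_a^{-1}\bigr)
     - \mathrm{tr}\bigl(h_a^{-1}\eta_{ab} h_a \cdot h_b^{-1} Dh_b\bigr) \\
  &= \mathrm{tr}\bigl(\eta_{ab}\, Dh_a h_a^{-1}\bigr)
     - \mathrm{tr}\bigl(\eta_{ba}\, Dh_b h_b^{-1}\bigr) .
\end{align*}
% The edges \Gamma^b_a(CD) and \Gamma^a_b(DC) carry opposite orientations, so
% the minus sign is absorbed when the second term is written as an integral
% over \Gamma^a_b(DC); the two-edge expression then collapses to (4.6).
```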


5. Schottky Description of Hitchin Systems

5.1. Now consider the last step in the diagram (4.10) in the Schottky parameterization. Since in this case we have only one topologically nontrivial cell $\tilde\Sigma$, the symplectic reduction is different from the one described in Subsect. 4.2 for the standard covering. In this case the holomorphic fields $\eta_a$, $g_a = g_a(z_a)$, $a = 1, \ldots, g$ live in vicinities $V_a$ of the $A_a$-cycles, and $z_a$ are local parameters in the internal disks (see (3.1)). The phase space is
$$T^*L^{Ch}_\Sigma = \{\eta_a, g_a \mid \eta_a \in \Omega^{(1,0)}_{\mathrm{hol}}(V_a, (\mathrm{End}V)^*), \ g_a \in \Omega^0_{\mathrm{hol}}(V_a, P)\}.$$
In other words, in accordance with (3.3),
$$T^*L^{Ch}_\Sigma = \oplus_{a=1}^{g} T^*L_a(GL(N,\mathbb{C})),$$
and the loop groups $L_a(GL(N,\mathbb{C}))$ are extended by the projective transformations of $z_a$ as in (3.3). The symplectic form on this object is (see (4.6))
$$\omega^{Ch} = D \sum_{a=1}^{g} \int_{A_a} \mathrm{tr}(\eta_a(z_a), Dg_a g_a^{-1}(z_a)). \tag{5.1}$$
The gauge transformations (2.16), (4.7) act as common conjugations by matrix functions that are globally holomorphic on $\tilde\Sigma$,
$$\eta_a(z_a) \to f(z_a)\eta_a f^{-1}(z_a), \qquad g_a \to f(z_a)g_a(z_a)f^{-1}(\gamma_a(z_a)). \tag{5.2}$$
The invariant commuting Hamiltonians (4.8) in this parameterization are
$$I^{C}_{j,k} = \sum_a \int_{A_a} \nu^{C}_{(j,k)}(z_a)\, \mathrm{tr}(\eta_a^j(z_a)). \tag{5.3}$$
The gauge transform (5.2) produces the moment map $\mu_{G^{\mathrm{hol}}_\Sigma}$, which takes the form
$$\mu_{G^{\mathrm{hol}}_\Sigma} = \eta_a(\gamma_a(z_a)) - (g_a^{-1}\eta_a g_a)(z_a), \qquad a = 1, \ldots, g.$$
Assume as above that $\mu_{G^{\mathrm{hol}}_\Sigma} = 0$:
$$\eta_a(\gamma_a(z_a)) - (g_a^{-1}\eta_a g_a)(z_a) = 0, \qquad a = 1, \ldots, g, \tag{5.4}$$
which is the twisting property (4.5) in the Schottky picture.

5.2. The solutions of the moment equations are known in a few degenerate cases [13]. We will consider here, as an example of the above construction, holomorphic bundles over elliptic curves with a marked point. Define the elliptic curve as the quotient
$$\Sigma_\tau = \mathbb{C}^*/q^{\mathbb{Z}}, \qquad q = \exp 2\pi i\tau.$$
In this case
$$T^*L^{Ch}_\Sigma \sim (\eta(z), g(z); p, s),$$
where $s \in GL(N,\mathbb{C})$ is a group element in the marked point $z = 1$ and $p \in \mathrm{Lie}^*(GL(N,\mathbb{C}))$. In addition to (5.2),
$$p \to f(z)\,p\,f^{-1}(1), \qquad s \to f(1)s.$$


The one-form $\eta(z)$ has a simple pole at the singular point $z = 1$. The symplectic form (5.1) on these objects is
$$\omega^{Ch} = D\int_A \mathrm{tr}(\eta(z)Dgg^{-1}(z)) + \mathrm{tr}(D(s^{-1}p)Ds). \tag{5.5}$$
The transition map $g(z)$ can be diagonalized by (5.2):
$$g(z) = \exp 2\pi i u = \exp\{\mathrm{diag}\, 2\pi i(u_1, \ldots, u_N)\},$$
where $u_j$ are $z$-independent. We keep the same notation for the transformed $\eta(z) = \sum_{n \in \mathbb{Z}} \eta^{(n)}_{j,k} z^n$. The moment equation (5.4) now takes the form
$$\eta(qz) - (g^{-1}\eta g)(z) = p\,\delta(z).$$
Rewrite it as
$$q^n \eta^{(n)}_{j,k} - e^{2\pi i(u_k - u_j)}\eta^{(n)}_{j,k} = p^{(n)}_{j,k}.$$
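The component form of the moment equation can be solved mode by mode. The sketch below writes $u_j$ for the diagonal entries of the exponent of $g$ and makes two assumptions that are not spelled out in the text: $\delta(z) = \sum_{n\in\mathbb{Z}} z^n$, so that every Fourier mode of the right-hand side equals $p_{j,k}$, and $q^n \neq e^{2\pi i(u_k - u_j)}$ for $j \neq k$.

```latex
% Off-diagonal entries (j \neq k): each mode is determined by p_{j,k}.
\begin{align*}
\bigl(q^n - e^{2\pi i (u_k - u_j)}\bigr)\, \eta^{(n)}_{j,k} = p_{j,k}
  \quad\Longrightarrow\quad
  \eta^{(n)}_{j,k} = \frac{p_{j,k}}{q^n - e^{2\pi i (u_k - u_j)}} .
\end{align*}
% Diagonal entries (j = k): the equation reads (q^n - 1) \eta^{(n)}_{j,j} = p_{j,j}.
\begin{align*}
n = 0 &: \quad 0 = p_{j,j}, \qquad \eta^{(0)}_{j,j} =: w_j \ \text{remains free}, \\
n \neq 0 &: \quad \eta^{(n)}_{j,j} = \frac{p_{j,j}}{q^n - 1} = 0 .
\end{align*}
% Thus \eta_{j,j}(z) = w_j and p_{j,j} = 0, while the off-diagonal series is
% linear in p_{j,k} and sums to a function with a simple pole at z = 1.
```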

After resolving the moment constraints we find
$$\eta_{j,j}(z) = w_j, \quad p_{jj} = 0, \qquad \eta_{j,k} = -\frac{1}{2\pi i}\, \frac{\theta(u_j - u_k - \zeta)\,\theta'(0)}{\theta(u_j - u_k)\,\theta(\zeta)}, \qquad z = \exp 2\pi i\zeta,$$
where $w_j$ are new free parameters and $\theta(\zeta) = \sum_{n \in \mathbb{Z}} e^{\pi i((n+1/2)^2\tau + (2n+1)\zeta)}$. The symplectic form (5.5) on the reduced space takes the form
$$\omega^{red} = Dw \cdot Du + \mathrm{tr}\, D(Js^{-1}Ds),$$
and $J$ defines the coadjoint orbit $p = s^{-1}Js$. Consider the quadratic Hamiltonian (5.3). After the reduction $H$ takes the form of the $N$-body elliptic Calogero Hamiltonian with spins [14]:
$$H = \frac{1}{2}\Bigl(w \cdot w + \frac{1}{4\pi^2} \sum_{j>k}^{N} [p_{j,k}\,p_{k,j}\,\wp(u_j - u_k|\tau) + E_2(\tau)]\Bigr).$$
Here $E_2(\tau)$ is the normalized Eisenstein series.

Acknowledgement. We would like to thank V. Fock, B. Khesin, N. Nekrasov and A. Rosly for illuminating discussions. We are grateful to the Max-Planck-Institut für Mathematik in Bonn, where this work was prepared, for its hospitality. The work of A.L. is supported in part by the grant INTAS 944720 and the grant ISF NSR-300. The work of M.O. is supported in part by the grant CEE-INTAS 932494 and the grant RFFI-9602-18046.


References

1. Atiyah, M. and Bott, R.: The Yang-Mills equations over Riemann surfaces. Phil. Trans. R. Soc. Lond. A308, 523–615 (1982)
2. Knizhnik, V. and Zamolodchikov, A.: Nucl. Phys. B247, 83 (1984)
3. Bernard, D.: Nucl. Phys. B303, 77 (1988); Nucl. Phys. 309, 145 (1988)
4. Tsuchiya, A., Ueno, K. and Yamada, Y.: Adv. Stud. in Pure Math. 19, 459–565 (1989)
5. Beilinson, A. and Schechtman, V.: Comm. Math. Phys. 119, 651–701 (1988)
6. Pressley, A. and Segal, G.: Loop Groups. Oxford: Clarendon Press, 1986
7. Hitchin, N.: Stable bundles and integrable systems. Duke Math. J. 54, 91–114 (1987)
8. Losev, A.: Coset construction and Bernard equation. Preprint CERN-TH.6215/91
9. Ivanov, D.: KZB equations on Riemann surfaces. hep-th/9410091
10. Etingof, P. and Frenkel, I.: Central extension of current groups in two dimensions. Commun. Math. Phys. 165, 429–444 (1994)
11. Etingof, P. and Khesin, B.: Affine Gelfand-Dickey brackets and holomorphic vector bundles. Geom. Funct. Anal. 4, 399–423 (1994)
12. Muskhelishvili, N.I.: Singuläre Integralgleichungen. Berlin: Akademie-Verlag, 1965
13. Nekrasov, N.: Holomorphic bundles and many-body systems. PUPT-1534, hep-th/9503157; Commun. Math. Phys. 180, 587 (1996)
14. Krichever, I., Babelon, O., Billey, E. and Talon, M.: Spin generalization of the Calogero-Moser system and the matrix KP equation. Preprint LPTHE 94/42

Communicated by G. Felder

Commun. Math. Phys. 188, 467 – 497 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Confluent Hypergeometric Orthogonal Polynomials Related to the Rational Quantum Calogero System with Harmonic Confinement*

J.F. van Diejen

Centre de Recherches Mathématiques, Université de Montréal, C.P. 6128, succursale Centre-ville, Montréal (Québec), H3C 3J7 Canada

Received: 20 October 1996 / Accepted: 3 March 1997

Abstract: Two families (type A and type B) of confluent hypergeometric polynomials in several variables are studied. We describe the orthogonality properties, differential equations, and Pieri-type recurrence formulas for these families. In the one-variable case, the polynomials in question reduce to the Hermite polynomials (type A) and the Laguerre polynomials (type B), respectively. The multivariable confluent hypergeometric families considered here may be used to diagonalize the rational quantum Calogero models with harmonic confinement (for the classical root systems) and are closely connected to the (symmetric) generalized spherical harmonics investigated by Dunkl.

1. Introduction

In this paper multivariable orthogonal polynomials are studied associated to the weight functions

Type A (Hermite)

$$\Delta^A(x) = \prod_{1\le j<k\le n}|x_j-x_k|^{2g_0}\prod_{1\le j\le n}e^{-\omega x_j^2}, \tag{1.1a}$$

Type B (Laguerre)

$$\Delta^B(x) = \prod_{1\le j<k\le n}|(x_j-x_k)(x_j+x_k)|^{2g_0}\prod_{1\le j\le n}|x_j|^{2g_1}e^{-\omega x_j^2}. \tag{1.1b}$$

In the one-variable case ($n = 1$), the type A polynomials become Hermite polynomials ($\Delta^A(x) = \exp(-\omega x^2)$) and the type B polynomials reduce to Laguerre polynomials of a quadratic argument ($\Delta^B(x) = |x|^{2g_1}\exp(-\omega x^2)$).

* Work supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada.


For both families we will exhibit differential equations and Pieri-type recurrence relations as well as the normalization constants that convert the polynomials into an orthonormal system. Multiplication of the polynomials by the square root of the weight function yields an orthogonal basis of eigenfunctions for the rational quantum Calogero model with harmonic confinement and its generalizations associated to (the classical) root systems [Ca, OP]. The connection between this orthogonal basis and the conventional (non-orthogonal) basis of eigenfunctions for the confined rational Calogero model (found by separating the quantum eigenvalue problem in a “radial” and a “spherical” part) is described using the theory of Dunkl’s generalized spherical harmonics with reflection group symmetry [Du1, Du2].

The multivariable Hermite and Laguerre families associated to the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b) were introduced by Macdonald [M2] and Lassalle [La4, La5] as a generalization (more accurately a deformation) of the previously known special case in which the parameter $g_0$ is fixed at the value 1/2 [He, Co, J, Mu]. Recently, further insight regarding the properties of the polynomials considered by Macdonald and Lassalle was obtained in the context of a renewed study of the eigenvalue problem for the rational quantum Calogero model with harmonic confinement [D3, UW, BF]. As it turns out, some of the results reported in the present work may also be obtained by combining results from previous literature. For example, our evaluation formulas for the (squared) norms of the polynomials (cf. Proposition 2.2) can also be gleaned from [M2, La4, La5, BF], where expressions for these norms in a modified guise were obtained by different methods. (Specifically, if we make the norm formulas in [M2, La4, La5, BF] explicit with the aid of known evaluation formulas for the Jack symmetric functions at the identity due to Stanley [St, M3], then they are seen to be in correspondence with the expressions derived below in a completely different manner.) In all instances where overlap of this kind occurs (see the notes in Sect. 3.5), our approach provides an alternative, independent method of proof for the statements of interest.

The paper is organized as follows. First, the confluent hypergeometric families associated to the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b) are defined in Sect. 2 and their main properties (orthogonality relations, orthonormalization constants, differential equations, and Pieri-type recurrence relations) are formulated. In Sect. 3, we comment in some detail on the precise relation between our results and those obtained in previous literature. We will, in particular, take the opportunity to detail the connection between the multivariable Hermite/Laguerre families and the Calogero eigenfunctions as well as the relation to Dunkl’s generalized spherical harmonics. In Sect. 4 we provide the proofs for the statements in Sect. 2 by viewing the multivariable confluent hypergeometric Hermite and Laguerre families of the present work as a degeneration (viz. a limiting case) of certain families of multivariable hypergeometric orthogonal polynomials that were introduced in [D3] and investigated in more detail in [D4]. The multivariable hypergeometric polynomials relevant to us here are generalizations of the one-variable continuous Hahn polynomials [AW, AtSu, A2, KS] (this corresponds to type A) and of the one-variable Wilson polynomials [W, KS] (this corresponds to type B) to the case of several variables.
From a physical viewpoint, the multivariable hypergeometric polynomials in question are connected to the eigenfunctions of a difference (or “relativistic”) counterpart of the rational Calogero models with harmonic confinement [D2, D3, R]. The transition from the hypergeometric to the confluent hypergeometric level corresponds to sending the difference step-size to zero. In this (“nonrelativistic”) limit the difference Calogero model reduces to the ordinary Calogero model. Some technicalities needed to perform this transition at the level of the polynomials (which is established by controlling the


convergence of the respective weight functions) are relegated to an appendix at the end of the paper.

Note. Below we will always assume (unless explicitly stated otherwise) that the parameters $g_0$ and $g_1$ entering through the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b) are nonnegative and, similarly, that the scale factor $\omega$ is positive. In principle it is possible to rescale the variables $x_1,\dots,x_n$ so as to reduce to the case that $\omega$ is fixed at the value 1 (say). However, we have found it useful to keep the dependence on $\omega$ explicit in order to have a check on the scaling properties of our expressions and so as to suppress the emergence of numerical constants.

2. Multivariable Hermite and Laguerre Polynomials

In this section the multivariable confluent hypergeometric families associated to the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b) are defined and the main properties of the polynomials are stated. The proof of these properties can be found in Sect. 4.

2.1. Definition and orthogonality properties. Let $m_\lambda$, $\lambda\in\Lambda$, denote the basis of symmetric monomials

$$m_\lambda(x) = \sum_{\mu\in S_n(\lambda)} x_1^{\mu_1}\cdots x_n^{\mu_n},\qquad \lambda\in\Lambda, \tag{2.1}$$

with

$$\Lambda = \{\lambda\in\mathbb{Z}^n \mid \lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n\ge 0\}. \tag{2.2}$$

Here the summation in (2.1) is over the orbit of $\lambda$ with respect to the action of the permutation group $S_n$ (which permutes the vector components $\lambda_1,\dots,\lambda_n$). We will also use the notation

$$m_{2\lambda}(x) = \sum_{\mu\in S_n(\lambda)} x_1^{2\mu_1}\cdots x_n^{2\mu_n},\qquad \lambda\in\Lambda, \tag{2.3}$$

to indicate the basis of the symmetric monomials that are even in the variables $x_1,\dots,x_n$. The monomial bases (2.1) and (2.3) inherit a partial ordering from the dominance-type partial ordering of the cone $\Lambda$ (2.2) that is defined for $\lambda,\mu\in\Lambda$ by

$$\lambda\le\mu \quad\text{iff}\quad \sum_{1\le j\le k}\lambda_j \le \sum_{1\le j\le k}\mu_j \quad\text{for } k = 1,\dots,n \tag{2.4}$$

($\lambda<\mu$ iff $\lambda\le\mu$ and $\lambda\ne\mu$). Let $\langle\cdot,\cdot\rangle_A$ and $\langle\cdot,\cdot\rangle_B$ denote the $L^2$ inner products over $\mathbb{R}^n$ with weight function $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b), respectively. So, explicitly we have

$$\langle f,g\rangle_C \equiv \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(x)\,g(x)\,\Delta^C(x)\,dx_1\cdots dx_n \tag{2.5}$$

for $f, g$ in $L^2(\mathbb{R}^n,\Delta^C dx_1\cdots dx_n)$, where $C$ stands for $A$ or $B$. After these notational preliminaries, we are now in the position to define the multivariable confluent hypergeometric families associated to the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b).
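The combinatorial ingredients of this subsection, the orbit sum (2.1) and the partial order (2.4), are easy to experiment with. The following is a minimal sketch in Python; the function names `dominates` and `monomial_symmetric` are illustrative choices and not notation from the paper.

```python
from itertools import permutations

def dominates(mu, lam):
    """Return True if lam <= mu in the partial order (2.4),
    i.e. every partial sum of lam is bounded by that of mu."""
    s_lam = s_mu = 0
    for a, b in zip(lam, mu):
        s_lam += a
        s_mu += b
        if s_lam > s_mu:
            return False
    return True

def monomial_symmetric(lam, x):
    """Evaluate m_lambda(x) of (2.1): sum over the distinct
    permutations mu of lam of x_1^mu_1 ... x_n^mu_n."""
    total = 0
    for mu in set(permutations(lam)):
        term = 1
        for xi, e in zip(x, mu):
            term *= xi ** e
        total += term
    return total
```

For instance, `dominates((2, 0), (1, 1))` holds, whereas the vectors (3, 0, 0) and (2, 2, 0) are incomparable with respect to (2.4) as stated: neither `dominates` call succeeds for that pair.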

Definition. The type A (or multivariable Hermite) polynomials $p^A_\lambda(x)$, $\lambda\in\Lambda$, are the polynomials determined (uniquely) by the conditions

A.1 $p^A_\lambda(x) = m_\lambda(x) + \sum_{\mu\in\Lambda,\,\mu<\lambda} c^A_{\lambda,\mu}\,m_\mu(x)$, $\quad c^A_{\lambda,\mu}\in\mathbb{C}$.

A.2 $\langle p^A_\lambda, m_\mu\rangle_A = 0$ if $\mu<\lambda$.

Similarly, the type B (or multivariable Laguerre) polynomials $p^B_\lambda(x)$, $\lambda\in\Lambda$, are the polynomials determined (uniquely) by the conditions

B.1 $p^B_\lambda(x) = m_{2\lambda}(x) + \sum_{\mu\in\Lambda,\,\mu<\lambda} c^B_{\lambda,\mu}\,m_{2\mu}(x)$, $\quad c^B_{\lambda,\mu}\in\mathbb{C}$.

B.2 $\langle p^B_\lambda, m_{2\mu}\rangle_B = 0$ if $\mu<\lambda$.

Since the weight functions $\Delta^C(x)$ (1.1a), (1.1b) (and the monomials $m_\lambda(x)$ (2.1)) are real, the same is true for the polynomials $p^C_\lambda(x)$, i.e., the coefficients $c^A_{\lambda,\mu}$ and $c^B_{\lambda,\mu}$ in the above definition are in fact real. Observe also that it follows immediately from the definition and the fact that $\Delta^A(-x) = \Delta^A(x)$ that

$$p^A_\lambda(-x) = (-1)^{|\lambda|}\,p^A_\lambda(x)\qquad (|\lambda|\equiv\lambda_1+\cdots+\lambda_n). \tag{2.6}$$

The type A polynomials $p^A_\lambda(x)$, $\lambda\in\Lambda$, constitute a basis for the space of permutation-invariant polynomials in the variables $x_1,\dots,x_n$ and the type B polynomials $p^B_\lambda(x)$, $\lambda\in\Lambda$, form a basis for the even subsector of this space (i.e., the subspace of symmetric polynomials in $x_1^2,\dots,x_n^2$). The following proposition says that the bases in question are orthogonal with respect to the inner products $\langle\cdot,\cdot\rangle_A$ and $\langle\cdot,\cdot\rangle_B$ (2.5), respectively.

Proposition 2.1 (Orthogonality). Let $\lambda,\mu\in\Lambda$ (2.2). We have

$$\langle p^C_\lambda, p^C_\mu\rangle_C = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} p^C_\lambda(x)\,p^C_\mu(x)\,\Delta^C(x)\,dx_1\cdots dx_n = 0 \quad\text{if } \lambda\ne\mu,$$

where $C$ stands for $A$ or $B$.

For weight vectors $\lambda$ and $\mu$ that are comparable with respect to the partial order (2.4), the orthogonality of $p^C_\lambda$ and $p^C_\mu$ follows immediately from the definition of the polynomials. Proposition 2.1 states that the orthogonality relations in fact hold for general weight vectors $\lambda,\mu\in\Lambda$ (2.2) (not necessarily comparable with respect to the partial order (2.4)). In order to orthonormalize the bases $\{p^A_\lambda\}_{\lambda\in\Lambda}$ and $\{p^B_\lambda\}_{\lambda\in\Lambda}$, we need to evaluate the integrals for the (squared) norms of the polynomials.

Proposition 2.2 (Norm formulas). Let $\lambda\in\Lambda$ (2.2). We have

$$\langle p^A_\lambda, p^A_\lambda\rangle_A = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}|p^A_\lambda(x)|^2\,\Delta^A(x)\,dx_1\cdots dx_n$$
$$= \frac{(2\pi)^{n/2}\,n!}{(2\omega)^{|\lambda|+g_0n(n-1)/2+n/2}} \prod_{1\le j<k\le n}\frac{\Gamma((k-j+1)g_0+\lambda_j-\lambda_k)\,\Gamma(1+(k-j-1)g_0+\lambda_j-\lambda_k)}{\Gamma((k-j)g_0+\lambda_j-\lambda_k)\,\Gamma(1+(k-j)g_0+\lambda_j-\lambda_k)} \prod_{1\le j\le n}\Gamma(1+(n-j)g_0+\lambda_j)$$

and

$$\langle p^B_\lambda, p^B_\lambda\rangle_B = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}|p^B_\lambda(x)|^2\,\Delta^B(x)\,dx_1\cdots dx_n$$
$$= \frac{n!}{\omega^{2|\lambda|+g_0n(n-1)+(g_1+1/2)n}} \prod_{1\le j<k\le n}\frac{\Gamma((k-j+1)g_0+\lambda_j-\lambda_k)\,\Gamma(1+(k-j-1)g_0+\lambda_j-\lambda_k)}{\Gamma((k-j)g_0+\lambda_j-\lambda_k)\,\Gamma(1+(k-j)g_0+\lambda_j-\lambda_k)} \prod_{1\le j\le n}\Gamma(1+(n-j)g_0+\lambda_j)\,\Gamma((n-j)g_0+g_1+1/2+\lambda_j)$$

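(where $|\lambda|\equiv\lambda_1+\cdots+\lambda_n$ and $\Gamma(\cdot)$ denotes the gamma function).

For $\lambda = 0$ the polynomials $p^A_\lambda$ and $p^B_\lambda$ reduce to the unit polynomial ($p^A_0(x) = p^B_0(x) = 1$). The formulas in Proposition 2.2 are then seen to simplify to

$$\langle 1,1\rangle_A = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\Delta^A(x)\,dx_1\cdots dx_n = \frac{(2\pi)^{n/2}}{(2\omega)^{g_0n(n-1)/2+n/2}}\prod_{1\le j\le n}\frac{\Gamma(1+jg_0)}{\Gamma(1+g_0)} \tag{2.7a}$$

and

$$\langle 1,1\rangle_B = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\Delta^B(x)\,dx_1\cdots dx_n = \frac{1}{\omega^{g_0n(n-1)+(g_1+1/2)n}}\prod_{1\le j\le n}\frac{\Gamma((j-1)g_0+g_1+1/2)\,\Gamma(1+jg_0)}{\Gamma(1+g_0)}. \tag{2.7b}$$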
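The reduction of the type A norm formula to (2.7a) at $\lambda = 0$ can be checked numerically with gamma-function arithmetic alone. The sketch below is illustrative only: the function names are hypothetical, and the single-index factor $\prod_{1\le j\le n}\Gamma(1+(n-j)g_0+\lambda_j)$ is an assumption about the form of Proposition 2.2, chosen because it reproduces (2.7a) at $\lambda = 0$ and the classical Hermite norm $\lambda!\sqrt{\pi}/(2^\lambda\omega^{\lambda+1/2})$ at $n = 1$. The parameter $g_0$ must be positive here so that no gamma function is evaluated at zero.

```python
from math import gamma, pi, factorial, isclose

def norm_A(lam, g0, omega):
    """Squared norm of p^A_lambda in the form of Proposition 2.2.
    The single-index Gamma factor below is an assumed reconstruction."""
    n = len(lam)
    value = (2 * pi) ** (n / 2) * factorial(n)
    value /= (2 * omega) ** (sum(lam) + g0 * n * (n - 1) / 2 + n / 2)
    for j in range(1, n + 1):
        for k in range(j + 1, n + 1):
            d = lam[j - 1] - lam[k - 1]
            value *= gamma((k - j + 1) * g0 + d) * gamma(1 + (k - j - 1) * g0 + d)
            value /= gamma((k - j) * g0 + d) * gamma(1 + (k - j) * g0 + d)
        value *= gamma(1 + (n - j) * g0 + lam[j - 1])  # assumed factor
    return value

def mehta_A(n, g0, omega):
    """Right-hand side of (2.7a)."""
    value = (2 * pi) ** (n / 2) / (2 * omega) ** (g0 * n * (n - 1) / 2 + n / 2)
    for j in range(1, n + 1):
        value *= gamma(1 + j * g0) / gamma(1 + g0)
    return value
```

Comparing `norm_A((0, 0, 0), g0, omega)` with `mehta_A(3, g0, omega)` for a few positive values of `g0` and `omega` gives a quick consistency check of the two formulas.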

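The integrals in (2.7a) and (2.7b) amount to integrals evaluated by Mehta and Macdonald [Me, M1]. More precisely, Mehta conjectured the closed expression for the value of the integral $\langle 1,1\rangle_A$ and Macdonald generalized this conjecture (in terms of root systems) therewith also including the integral $\langle 1,1\rangle_B$. Both the integration formulas for $\langle 1,1\rangle_A$ and $\langle 1,1\rangle_B$ in (2.7a) and (2.7b) were then proven in [M1] by viewing them as limiting cases of an integration formula due to Selberg [Se] (cf. also the introduction of [A1]).

2.2. Differential equations. The polynomials $p^A_\lambda(x)$ and $p^B_\lambda(x)$ defined in the previous subsection are eigenfunctions for the second-order differential operators

$$D^A = \sum_{1\le j\le n}\Big(-\frac{\partial^2}{\partial x_j^2} + 2\omega x_j\frac{\partial}{\partial x_j}\Big) - 2g_0\sum_{1\le j<k\le n}\frac{1}{x_j-x_k}\Big(\frac{\partial}{\partial x_j}-\frac{\partial}{\partial x_k}\Big) \tag{2.8a}$$

and

$$D^B = \sum_{1\le j\le n}\Big(-\frac{\partial^2}{\partial x_j^2} - 2g_1\frac{1}{x_j}\frac{\partial}{\partial x_j} + 2\omega x_j\frac{\partial}{\partial x_j}\Big) - 2g_0\sum_{1\le j<k\le n}\Big[\frac{1}{x_j-x_k}\Big(\frac{\partial}{\partial x_j}-\frac{\partial}{\partial x_k}\Big) + \frac{1}{x_j+x_k}\Big(\frac{\partial}{\partial x_j}+\frac{\partial}{\partial x_k}\Big)\Big], \tag{2.8b}$$

respectively.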


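Proposition 2.3 (Differential equations). We have that

$$D^C p^C_\lambda = E^C(\lambda)\,p^C_\lambda,\qquad \lambda\in\Lambda$$

($C = A, B$), with

$$E^A(\lambda) = 2\omega(\lambda_1+\cdots+\lambda_n)\qquad\text{and}\qquad E^B(\lambda) = 4\omega(\lambda_1+\cdots+\lambda_n).$$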

It follows from the orthogonality of the basis $\{p^C_\lambda\}_{\lambda\in\Lambda}$ and the real-valuedness of the eigenvalues $E^C(\lambda)$ that the differential operator $D^C$ (2.8a), (2.8b) is (essentially) self-adjoint in the Hilbert space of functions in $L^2(\mathbb{R}^n,\Delta^C dx_1\cdots dx_n)$ that are permutation-invariant (type A) or permutation-invariant and even (type B).

2.3. Pieri-type recurrence formulas. To describe the recurrence relations for the multivariable Hermite and Laguerre polynomials it is convenient to pass from monic polynomials to a different normalization. Let

$$P^A_\lambda(x)\equiv c^A_\lambda\,p^A_\lambda(x),\qquad P^B_\lambda(x)\equiv c^B_\lambda\,p^B_\lambda(x) \tag{2.9}$$

with

$$c^A_\lambda = \prod_{1\le j<k\le n}\frac{[(k-j)g_0]_{\lambda_j-\lambda_k}}{[(1+k-j)g_0]_{\lambda_j-\lambda_k}}, \tag{2.10a}$$

$$c^B_\lambda = (-\omega)^{|\lambda|}\prod_{1\le j<k\le n}\frac{[(k-j)g_0]_{\lambda_j-\lambda_k}}{[(1+k-j)g_0]_{\lambda_j-\lambda_k}}\times\prod_{1\le j\le n}\frac{1}{[(n-j)g_0+g_1+1/2]_{\lambda_j}}, \tag{2.10b}$$

where we have employed the Pochhammer symbol defined by $[a]_0\equiv 1$ and $[a]_l\equiv a(a+1)\cdots(a+l-1)$ for $l = 1, 2, 3, \dots$

Proposition 2.4 (Normalization). The normalization of the polynomial $P^C_\lambda(x)$ (2.9) is such that

$$\lim_{\alpha\to\infty}\alpha^{-|\lambda|}P^A_\lambda(\alpha\mathbf{1}) = 1\qquad(\text{with } \mathbf{1}\equiv(1,\dots,1))$$

and

$$P^B_\lambda(\mathbf{0}) = 1\qquad(\text{with } \mathbf{0}\equiv(0,\dots,0)).$$
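The normalization constants in (2.10a) and (2.10b) are finite products of Pochhammer symbols and can be evaluated exactly with rational arithmetic. The following sketch is only an illustration; the names `pochhammer`, `c_A` and `c_B` are hypothetical helpers, and the parameters are assumed to be passed as `Fraction` (or integer) values with $g_0 > 0$.

```python
from fractions import Fraction

def pochhammer(a, l):
    """[a]_l = a (a+1) ... (a+l-1), with [a]_0 = 1."""
    value = Fraction(1)
    for i in range(l):
        value *= a + i
    return value

def c_A(lam, g0):
    """Normalization constant c^A_lambda of (2.10a)."""
    n = len(lam)
    value = Fraction(1)
    for j in range(1, n + 1):
        for k in range(j + 1, n + 1):
            d = lam[j - 1] - lam[k - 1]
            value *= pochhammer((k - j) * g0, d) / pochhammer((1 + k - j) * g0, d)
    return value

def c_B(lam, g0, g1, omega):
    """Normalization constant c^B_lambda of (2.10b)."""
    n = len(lam)
    value = (-omega) ** sum(lam) * c_A(lam, g0)
    for j in range(1, n + 1):
        value /= pochhammer((n - j) * g0 + g1 + Fraction(1, 2), lam[j - 1])
    return value
```

For $n = 2$ and $\lambda = (1, 0)$ this gives $c^A_\lambda = [g_0]_1/[2g_0]_1 = 1/2$ for every $g_0 > 0$, and for $n = 1$ the type A constant is always 1, so that $P^A_\lambda = p^A_\lambda$ in the one-variable case.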

The next proposition describes an expansion formula for the product of PλC (x) (2.9) and the first elementary symmetric function in x1 , . . . , xn (type A) or in x21 , . . . , x2n (type B). Formulas of this type are often referred to as Pieri formulas [M3, St]. (More generally, Pieri formulas are relations in a commutative algebra describing the expansion (in terms of a basis) of products between the basis elements and a set of generators for the algebra.) For n = 1 the formulas in the proposition reduce to classical three-term recurrence relations for the one-variable Hermite and Laguerre polynomials (cf. Sect. 3).


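Proposition 2.5 (Pieri formulas: Simplest case). The (renormalized) multivariable Hermite and Laguerre polynomials $P^C_\lambda(x)$ (2.9) satisfy the recurrence relations ($e_j$ denotes the $j$th unit vector in the standard basis of $\mathbb{R}^n$)

$$\sum_{1\le j\le n}x_j\,P^A_\lambda(x) = \sum_{1\le j\le n}\Big(\hat V^A_j\,P^A_{\lambda+e_j}(x) + \hat V^A_{-j}\,P^A_{\lambda-e_j}(x)\Big),$$

$$-\omega\sum_{1\le j\le n}x_j^2\,P^B_\lambda(x) = \sum_{1\le j\le n}\Big(\hat V^B_j\,P^B_{\lambda+e_j}(x) - (\hat V^B_j+\hat V^B_{-j})\,P^B_\lambda(x) + \hat V^B_{-j}\,P^B_{\lambda-e_j}(x)\Big)$$

with

$$\hat V^A_j = \prod_{1\le k\le n,\,k\ne j}\Big(1+\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big),$$

$$\hat V^A_{-j} = \frac{(n-j)g_0+\lambda_j}{2\omega}\prod_{1\le k\le n,\,k\ne j}\Big(1-\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big)$$

and

$$\hat V^B_j = \big((n-j)g_0+g_1+1/2+\lambda_j\big)\prod_{1\le k\le n,\,k\ne j}\Big(1+\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big),$$

$$\hat V^B_{-j} = \big((n-j)g_0+\lambda_j\big)\prod_{1\le k\le n,\,k\ne j}\Big(1-\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big).$$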

One word of caution is in order here. It may of course happen that for certain $\lambda\in\Lambda$ and $j\in\{1,\dots,n\}$ the vector $\lambda+e_j$ (or $\lambda-e_j$) does not lie in the cone $\Lambda$ (2.2). For such boundary situations the polynomial $P^C_{\lambda+e_j}$ (or $P^C_{\lambda-e_j}$) is not defined and it might a priori seem that the r.h.s. of the recurrence relation does not make sense in this case. It is not difficult to verify, however, that in these situations the coefficient $\hat V^C_j$ (or $\hat V^C_{-j}$) in front of $P^C_{\lambda+e_j}$ (or $P^C_{\lambda-e_j}$) vanishes. (Indeed, for $\lambda\in\Lambda$ we have that $\lambda+e_j\notin\Lambda$ if $\lambda_{j-1} = \lambda_j$ and that $\lambda-e_j\notin\Lambda$ if $\lambda_j = \lambda_{j+1}$ or if $j = n$ and $\lambda_n = 0$. In the former case we pick up a zero in $\hat V^C_j$ from the factor $1+g_0((k-j)g_0+\lambda_j-\lambda_k)^{-1}$ with $k = j-1$ and in the latter case we have a zero in $\hat V^C_{-j}$ from the factor $1-g_0((k-j)g_0+\lambda_j-\lambda_k)^{-1}$ with $k = j+1$ or from the factor $(n-j)g_0+\lambda_j$ with $j = n$, respectively.)

The next step is to generalize the expansion formulas of Proposition 2.5 to (multiplication by) arbitrary elementary symmetric functions. To this end we introduce (for $r = 1,\dots,n$)

$$\hat E^A_r(x) = \sum_{J\subset\{1,\dots,n\},\,|J|=r}\ \prod_{j\in J}x_j, \tag{2.11a}$$

$$\hat E^B_r(x) = (-\omega)^r\sum_{J\subset\{1,\dots,n\},\,|J|=r}\ \prod_{j\in J}x_j^2. \tag{2.11b}$$
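The vanishing of the type A coefficients of Proposition 2.5 at the boundary of the cone can be observed directly. The sketch below is an illustration under the assumption $g_0 > 0$ (so that no denominator vanishes); the function names are hypothetical and the index $j$ is 1-based as in the text.

```python
def V_plus_A(lam, j, g0):
    """Coefficient of P^A_{lambda+e_j} in Proposition 2.5 (type A)."""
    n = len(lam)
    value = 1.0
    for k in range(1, n + 1):
        if k != j:
            value *= 1 + g0 / ((k - j) * g0 + lam[j - 1] - lam[k - 1])
    return value

def V_minus_A(lam, j, g0, omega):
    """Coefficient of P^A_{lambda-e_j} in Proposition 2.5 (type A)."""
    n = len(lam)
    value = ((n - j) * g0 + lam[j - 1]) / (2 * omega)
    for k in range(1, n + 1):
        if k != j:
            value *= 1 - g0 / ((k - j) * g0 + lam[j - 1] - lam[k - 1])
    return value
```

For $\lambda = (2, 2, 1)$ the vectors $\lambda + e_2$ and $\lambda - e_1$ leave the cone, and the corresponding values `V_plus_A(lam, 2, g0)` and `V_minus_A(lam, 1, g0, omega)` are zero; for $\lambda = (2, 1, 0)$ the same happens for `V_minus_A(lam, 3, g0, omega)` because of the prefactor $(n-j)g_0+\lambda_j$.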


It is clear that the products $\hat E^C_r(x)P^C_\lambda(x)$ can be written as a linear combination of $P^C_\mu(x)$ with $\mu\le\lambda+e_1+\cdots+e_r$ (this is immediate from the structure of the monomial expansion of $P^C_\lambda(x)$ and the fact that such expansion formulas for these products evidently hold if we replace the polynomials $P^C_\lambda(x)$ by their leading monomials $\sim m_\lambda(x)$ ($C = A$) and $\sim m_{2\lambda}(x)$ ($C = B$)). It turns out that many of the coefficients $c_\mu$ in the expansion $\hat E^C_r P^C_\lambda = \sum_{\mu\le\lambda+e_1+\cdots+e_r}c_\mu P^C_\mu$ are in fact zero (for $r = 1$ this is of course apparent from Proposition 2.5). The following proposition provides detailed information on the structure of the terms entering the Pieri-type expansion of the product between the basis element $P^C_\lambda(x)$ and an arbitrary elementary symmetric function $\hat E^C_r(x)$ (2.11a), (2.11b).

Proposition 2.6 (Pieri formulas: Structure and leading coefficients). The (renormalized) multivariable Hermite and Laguerre polynomials $P^C_\lambda(x)$ (2.9) satisfy a system of recurrence relations of the form ($e_J\equiv\sum_{j\in J}e_j$)

$$\hat E^C_r(x)\,P^C_\lambda(x) = \sum_{\substack{J_+,J_-\subset\{1,\dots,n\},\ J_+\cap J_-=\emptyset,\\ |J_+|+|J_-|\le r,\ \lambda+e_{J_+}-e_{J_-}\in\Lambda}} \hat W^C_{J_+,J_-;r}(\lambda)\,P^C_{\lambda+e_{J_+}-e_{J_-}}(x),\qquad r = 1,\dots,n,$$

where in the Hermite case ($C = A$) the sum on the r.h.s. runs only over those index sets $J_+, J_-$ for which $r-|J_+|-|J_-|$ is even ($\hat W^A_{J_+,J_-;r}(\lambda)$ is zero otherwise). The coefficients $\hat W^C_{J_+,J_-;r}(\lambda)$ that correspond to index sets $J_+, J_-$ with the sum of the cardinalities $|J_+|+|J_-|$ being equal to $r$ are explicitly given by

$$\hat W^C_{J_+,J_-;r}(\lambda) = \hat V^C_{J_+,J_-;(J_+\cup J_-)^c},$$

where

$$\hat V^A_{J_+,J_-;K} = \prod_{j\in J_-}\frac{(n-j)g_0+\lambda_j}{2\omega} \times\prod_{j\in J_+,\,j'\in J_-}\Big(1+\frac{g_0}{(j'-j)g_0+\lambda_j-\lambda_{j'}}\Big)\Big(1+\frac{g_0}{1+(j'-j)g_0+\lambda_j-\lambda_{j'}}\Big)$$
$$\times\prod_{j\in J_+,\,k\in K}\Big(1+\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big)\prod_{j\in J_-,\,k\in K}\Big(1-\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big)$$

and

$$\hat V^B_{J_+,J_-;K} = \prod_{j\in J_+}\big((n-j)g_0+g_1+1/2+\lambda_j\big)\prod_{j\in J_-}\big((n-j)g_0+\lambda_j\big) \times\prod_{j\in J_+,\,j'\in J_-}\Big(1+\frac{g_0}{(j'-j)g_0+\lambda_j-\lambda_{j'}}\Big)\Big(1+\frac{g_0}{1+(j'-j)g_0+\lambda_j-\lambda_{j'}}\Big)$$
$$\times\prod_{j\in J_+,\,k\in K}\Big(1+\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big)\prod_{j\in J_-,\,k\in K}\Big(1-\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big)$$

(with the convention that empty products are equal to one).


Notice that the fact that for type A we may restrict the sum on the r.h.s. of the Pieri formula in Proposition 2.6 to those pairs of index sets $J_+, J_-$ for which $r-|J_+|-|J_-|$ is even, is an immediate consequence of the reflection properties $\hat E^A_r(-x) = (-1)^r\hat E^A_r(x)$ and (recall (2.6)) $P^A_\lambda(-x) = (-1)^{|\lambda|}P^A_\lambda(x)$.

Proposition 2.6 constitutes a partial generalization of Proposition 2.5. For $r = 1$ the structure described in Proposition 2.6 is compatible with that of the formula in Proposition 2.5 and we furthermore recover the coefficient on the r.h.s. of $P^C_{\lambda+e_j}$ (viz. $\hat V^C_{\{j\},\emptyset;\{1,\dots,n\}\setminus\{j\}}$) and $P^C_{\lambda-e_j}$ (viz. $\hat V^C_{\emptyset,\{j\};\{1,\dots,n\}\setminus\{j\}}$). For the (Hermite) type A this is already enough to completely recover the recurrence relation given in Proposition 2.5; for the (Laguerre) type B, however, it means that we are still lacking the coefficient of $P^B_\lambda$, which (according to Proposition 2.5) happens to be equal to $-\sum_j\big(\hat V^B_{\{j\},\emptyset;\{1,\dots,n\}\setminus\{j\}} + \hat V^B_{\emptyset,\{j\};\{1,\dots,n\}\setminus\{j\}}\big)$.

Still, even though Proposition 2.6 is not completely explicit (as it does not tell us the expansion coefficients $\hat W^C_{J_+,J_-;r}(\lambda)$ for $|J_+|+|J_-|<r$), it is nevertheless useful in its present form. For instance, the proposition implies (together with the orthogonality) that

$$\langle\hat E^C_r P^C_\lambda, P^C_\mu\rangle_C = \begin{cases} 0 & \text{if } \mu\ne\lambda+e_{J_+}-e_{J_-} \text{ with } |J_+|+|J_-|\le r,\\[2pt] \hat V^C_{J_+,J_-;(J_+\cup J_-)^c} & \text{if } \mu = \lambda+e_{J_+}-e_{J_-} \text{ with } |J_+|+|J_-| = r \end{cases} \tag{2.12}$$

(where $J_+, J_-\subset\{1,\dots,n\}$ such that $J_+\cap J_- = \emptyset$). When applying (2.12) to the identity $\langle\hat E^C_r P^C_\lambda, P^C_{\lambda+e_{\{1,\dots,r\}}}\rangle_C = \langle P^C_\lambda, \hat E^C_r P^C_{\lambda+e_{\{1,\dots,r\}}}\rangle_C$ one arrives at a system of recurrence relations for the squared norm of $P^C_\lambda$:

$$\hat V^C_{\{1,\dots,r\},\emptyset;\{r,\dots,n\}}(\lambda)\,\langle P^C_{\lambda+e_{\{1,\dots,r\}}}, P^C_{\lambda+e_{\{1,\dots,r\}}}\rangle_C = \hat V^C_{\emptyset,\{1,\dots,r\};\{r,\dots,n\}}(\lambda+e_{\{1,\dots,r\}})\,\langle P^C_\lambda, P^C_\lambda\rangle_C \tag{2.13}$$

($r = 1,\dots,n$). The recurrence relations in (2.13) determine $\langle P^C_\lambda, P^C_\lambda\rangle_C$ uniquely in terms of $\langle 1,1\rangle_C$ (because the (fundamental weight) vectors $e_{\{1,\dots,r\}}$, $r = 1,\dots,n$, positively generate the dominant cone $\Lambda$ (2.2) and the coefficient $\hat V^C_{\{1,\dots,r\},\emptyset;\{r,\dots,n\}}(\lambda)\ne 0$ for $\lambda\in\Lambda$). This observation gives rise to an alternative (constructive) proof of the norm formulas in Proposition 2.2 different from the proof presented in Sect. 4. Indeed, by using the property $c^C_\lambda = c^C_{\lambda+e_{\{1,\dots,r\}}}\hat V^C_{\{1,\dots,r\},\emptyset;\{r,\dots,n\}}(\lambda)$ one rewrites (2.13) in the monic form

$$\langle p^C_{\lambda+e_{\{1,\dots,r\}}}, p^C_{\lambda+e_{\{1,\dots,r\}}}\rangle_C = \hat V^C_{\{1,\dots,r\},\emptyset;\{r,\dots,n\}}(\lambda)\,\hat V^C_{\emptyset,\{1,\dots,r\};\{r,\dots,n\}}(\lambda+e_{\{1,\dots,r\}})\,\langle p^C_\lambda, p^C_\lambda\rangle_C, \tag{2.14}$$

which upon iteration and matching of the initial conditions so as to reduce for $\lambda = 0$ to the Mehta-Macdonald formulas (2.7a), (2.7b) (cf. Sect. 4) leads to the norm formulas of Proposition 2.2. (For another important application/corollary of Proposition 2.6, see Comment 3.2 of the next section.)

The last proposition (below) provides a complete (explicit) description of the expansion coefficients $\hat W^C_{J_+,J_-;r}(\lambda)$ for the type B case (thus including the coefficients corresponding to index sets with $|J_+|+|J_-|<r$). This renders the system of recurrence


relations of the form given by Proposition 2.6 in a fully explicit form for the multivariable Laguerre family and thus completes (for the B type) the generalization of the Pieri formula from Proposition 2.5 to multiplication by an arbitrary elementary symmetric function.

Proposition 2.7 (Pieri formulas: Explicit coefficients Laguerre case). The coefficients in the recurrence relations of the type described by Proposition 2.6 are for the renormalized multivariable Laguerre polynomials $P^B_\lambda(x)$ (2.9) given by

$$\hat W^B_{J_+,J_-;r}(\lambda) = \hat V^B_{J_+,J_-;(J_+\cup J_-)^c}\ \hat U^B_{(J_+\cup J_-)^c,\ r-|J_+|-|J_-|}$$

with $\hat V^B_{J_+,J_-;(J_+\cup J_-)^c}$ taken from Proposition 2.6 and

$$\hat U^B_{K,p} = (-1)^p\sum_{\substack{L_+,L_-\subset K,\ L_+\cap L_-=\emptyset\\ |L_+|+|L_-|=p}}\ \prod_{l\in L_+}\big((n-l)g_0+g_1+1/2+\lambda_l\big)\prod_{l\in L_-}\big((n-l)g_0+\lambda_l\big)$$
$$\times\prod_{l\in L_+,\,l'\in L_-}\Big(1+\frac{g_0}{(l'-l)g_0+\lambda_l-\lambda_{l'}}\Big)\Big(1-\frac{g_0}{1+(l'-l)g_0+\lambda_l-\lambda_{l'}}\Big)$$
$$\times\prod_{l\in L_+,\,k\in K}\Big(1+\frac{g_0}{(k-l)g_0+\lambda_l-\lambda_k}\Big)\prod_{l\in L_-,\,k\in K}\Big(1-\frac{g_0}{(k-l)g_0+\lambda_l-\lambda_k}\Big)$$

(with the convention that $\hat U^B_{K,p}\equiv 1$ for $p = 0$).

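3. Comments

3.1. The special case n = 1. In the case of one single variable ($n = 1$), the weight functions reduce to

$$\Delta^A(x) = e^{-\omega x^2},\qquad \Delta^B(x) = |x|^{2g_1}e^{-\omega x^2}. \tag{3.1}$$

The polynomials then become monic Hermite polynomials (type A) and monic Laguerre polynomials of a quadratic argument (type B), which can be written explicitly in terms of a terminating confluent hypergeometric series [AbSt]

$$p^A_\lambda(x) = \begin{cases}\dfrac{[1/2]_{\lambda/2}}{(-\omega)^{\lambda/2}}\ {}_1F_1\Big(\begin{matrix}-\lambda/2\\ 1/2\end{matrix};\ \omega x^2\Big) & \text{for } \lambda \text{ even},\\[10pt] \dfrac{[3/2]_{(\lambda-1)/2}}{(-\omega)^{(\lambda-1)/2}}\ x\ {}_1F_1\Big(\begin{matrix}-(\lambda-1)/2\\ 3/2\end{matrix};\ \omega x^2\Big) & \text{for } \lambda \text{ odd},\end{cases} \tag{3.2a}$$

$$p^B_\lambda(x) = \frac{[g_1+1/2]_\lambda}{(-\omega)^\lambda}\ {}_1F_1\Big(\begin{matrix}-\lambda\\ g_1+1/2\end{matrix};\ \omega x^2\Big) \tag{3.2b}$$

(with ${}_1F_1\Big(\begin{matrix}a\\ b\end{matrix};\ z\Big)\equiv\sum_{m=0}^{\infty}\dfrac{[a]_m}{[b]_m\,m!}z^m$ and $\lambda = 0, 1, 2, \dots$). The norm formulas, differential equations and recurrence relations reduce in this special situation to classical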


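formulas for the one-variable Hermite and the Laguerre polynomials (notice however that the scale parameter $\omega$ is usually taken to be equal to 1 and that our normalization differs from the standard one):

Norm formulas (cf. Proposition 2.2)

$$\int_{-\infty}^{\infty}|p^A_\lambda(x)|^2\,\Delta^A(x)\,dx = \frac{\lambda!\,\sqrt{\pi}}{2^\lambda\,\omega^{\lambda+1/2}}, \tag{3.3a}$$

$$\int_{-\infty}^{\infty}|p^B_\lambda(x)|^2\,\Delta^B(x)\,dx = \frac{\lambda!\,\Gamma(g_1+1/2+\lambda)}{\omega^{2\lambda+g_1+1/2}}. \tag{3.3b}$$

Differential equations (cf. Proposition 2.3)

$$-\frac{d^2}{dx^2}p^A_\lambda + 2\omega x\frac{d}{dx}p^A_\lambda = 2\omega\lambda\,p^A_\lambda, \tag{3.4a}$$

$$-\frac{d^2}{dx^2}p^B_\lambda - \frac{2g_1}{x}\frac{d}{dx}p^B_\lambda + 2\omega x\frac{d}{dx}p^B_\lambda = 4\omega\lambda\,p^B_\lambda. \tag{3.4b}$$

Recurrence relations (cf. Proposition 2.5)

$$x\,P^A_\lambda = P^A_{\lambda+1} + \frac{\lambda}{2\omega}P^A_{\lambda-1}, \tag{3.5a}$$

$$-\omega x^2\,P^B_\lambda = (g_1+1/2+\lambda)\,P^B_{\lambda+1} - (g_1+1/2+2\lambda)\,P^B_\lambda + \lambda\,P^B_{\lambda-1}, \tag{3.5b}$$

with $P^A_\lambda(x) = p^A_\lambda(x)$ and $P^B_\lambda(x) = \dfrac{(-\omega)^\lambda}{[g_1+1/2]_\lambda}\,p^B_\lambda(x)$. The normalization properties for $P^A_\lambda(x)$ and $P^B_\lambda(x)$ in Proposition 2.4 state that $\lim_{\alpha\to\infty}\alpha^{-\lambda}P^A_\lambda(\alpha) = 1$ and that $P^B_\lambda(0) = 1$. In the present situation these properties are immediate from the explicit confluent hypergeometric representations in (3.2a) and (3.2b).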
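The one-variable type A formulas can be cross-checked against each other with exact rational arithmetic: the monic Hermite polynomials are generated from the three-term recurrence (3.5a), and the result is substituted into the differential equation (3.4a). The sketch below is illustrative only; polynomials are stored as coefficient lists (lowest degree first), and the helper names are hypothetical.

```python
from fractions import Fraction

def hermite_monic(max_degree, omega):
    """Return [p_0, ..., p_max_degree], built from (3.5a):
    p_{l+1} = x p_l - (l / (2 omega)) p_{l-1}."""
    polys = [[Fraction(1)]]
    for l in range(max_degree):
        nxt = [Fraction(0)] + polys[l]          # multiply p_l by x
        if l > 0:
            for i, c in enumerate(polys[l - 1]):
                nxt[i] -= Fraction(l) / (2 * omega) * c
        polys.append(nxt)
    return polys

def derivative(p):
    """Coefficient list of the derivative of p."""
    return [i * c for i, c in enumerate(p)][1:]

def ode_residual(p, lam, omega):
    """Coefficients of -p'' + 2 omega x p' - 2 omega lam p,
    which vanish identically if p satisfies (3.4a)."""
    d1 = derivative(p)
    d2 = derivative(d1)
    res = [Fraction(0)] * len(p)
    for i, c in enumerate(d2):
        res[i] -= c
    for i, c in enumerate(d1):
        res[i + 1] += 2 * omega * c
    for i, c in enumerate(p):
        res[i] -= 2 * omega * lam * c
    return res
```

With `omega = Fraction(3, 2)`, for example, `hermite_monic(2, omega)[2]` is the coefficient list of $x^2 - 1/(2\omega) = x^2 - 1/3$, and `ode_residual` returns only zeros for each of the generated polynomials.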

3.2. Highest-degree homogeneous components. If we collect the terms with highest degree on both sides of the Pieri formulas in Proposition 2.6, then we arrive at a system of recurrence relations for the leading homogeneous component $P^{C,\mathrm{lead}}_\lambda(x)$ of $P^C_\lambda(x)$. These recurrence relations are of the form

$$\hat E^C_r(x)\,P^{C,\mathrm{lead}}_\lambda(x) = \sum_{J\subset\{1,\dots,n\},\,|J|=r} V^C_{J;J^c}\,P^{C,\mathrm{lead}}_{\lambda+e_J}(x),\qquad r = 1,\dots,n \tag{3.6}$$

with $\hat E^C_r(x)$ given by (2.11a), (2.11b) and

$$V^A_{J;K} = \prod_{j\in J,\,k\in K}\Big(1+\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big), \tag{3.7a}$$

$$V^B_{J;K} = \prod_{j\in J}\big((n-j)g_0+g_1+1/2+\lambda_j\big)\prod_{j\in J,\,k\in K}\Big(1+\frac{g_0}{(k-j)g_0+\lambda_j-\lambda_k}\Big). \tag{3.7b}$$

For C = A, the recurrence relations in Eq. (3.6) are recognized as the Pieri formulas for the Jack symmetric functions (with parameter α = 1/g0 ) in the variables x1 , . . . , xn , where the normalization is such that for x = 1 the Jack symmetric function has the value 1 [St, La1, Kad]. (The Pieri formulas in (3.6) correspond to the form used by Lassalle and Kadell [La1, Kad]; Stanley [St] rather presented a dual form of the Pieri


formulas, which is related to (3.6) by the Macdonald-Stanley duality map $\omega_\alpha$ [St, M3].) Similarly, for $C = B$ the recurrence relations in (3.6) amount to Pieri formulas for Jack symmetric functions in the variables $x_1^2,\dots,x_n^2$ (also with parameter $\alpha = 1/g_0$) but now normalized such that for $x = \mathbf{1}$ they yield the value

$$(-\omega)^{|\lambda|}\prod_{1\le j\le n}[(n-j)g_0+g_1+1/2]^{-1}_{\lambda_j}$$

(where $\lambda$ of course corresponds to the partition labeling the Jack symmetric function). Since the Pieri formulas can in principle be used to construct the Jack symmetric functions inductively starting from the unit polynomial for $\lambda = 0$, it thus follows that the leading homogeneous components of the monic multivariable Hermite and Laguerre polynomials $p^A_\lambda(x)$ and $p^B_\lambda(x)$ are given by the monic Jack symmetric functions $J^M_\lambda(x;1/g_0)$ and $J^M_\lambda(x^2;1/g_0)$, respectively. (The superscript “M” is used to indicate that we have normalized the Jack symmetric functions such that they are monic.)

Corollary (Leading homogeneous component). The highest-degree homogeneous components of the Hermite and Laguerre polynomials are given by monic Jack symmetric functions with parameter $\alpha = 1/g_0$ in the variables $x_1,\dots,x_n$ and $x_1^2,\dots,x_n^2$, respectively:

$$p^{A,\mathrm{lead}}_\lambda(x) = J^M_\lambda(x;1/g_0),\qquad p^{B,\mathrm{lead}}_\lambda(x) = J^M_\lambda(x^2;1/g_0).$$

Notice that combination of the first formula of the above Corollary and Proposition 2.4 implies, using the definition (2.9), (2.10a), Stanley’s evaluation formula for the Jack symmetric functions at the identity [St, M3]

$$J^M_\lambda(\mathbf{1};1/g_0) = \prod_{1\le j<k\le n}\frac{[(1+k-j)g_0]_{\lambda_j-\lambda_k}}{[(k-j)g_0]_{\lambda_j-\lambda_k}}.$$

3.3. Eigenfunctions for the rational Calogero model in an harmonic well. Most of the results stated in Sect. 2 admit an interpretation in terms of certain exactly solvable quantum mechanical n-particle models on the line. Specifically, conjugation with the square root of the weight function $\Delta^C(x)$ (1.1a), (1.1b) transforms the second-order differential operators $D^C$ (2.8a), (2.8b) into Hamiltonians for the rational quantum Calogero models with harmonic confinement associated to the classical root systems [Ca, OP]

$$H^A = (\Delta^A)^{1/2}D^A(\Delta^A)^{-1/2} = \sum_{1\le j\le n}\Big(-\frac{\partial^2}{\partial x_j^2}+\omega^2x_j^2\Big) + 2g_0(g_0-1)\sum_{1\le j<k\le n}(x_j-x_k)^{-2} - E^A_0, \tag{3.8a}$$

$$H^B = (\Delta^B)^{1/2}D^B(\Delta^B)^{-1/2} = \sum_{1\le j\le n}\Big(-\frac{\partial^2}{\partial x_j^2}+\omega^2x_j^2+g_1(g_1-1)x_j^{-2}\Big) + 2g_0(g_0-1)\sum_{1\le j<k\le n}\Big((x_j-x_k)^{-2}+(x_j+x_k)^{-2}\Big) - E^B_0 \tag{3.8b}$$

with $E^A_0 = \omega n(1+g_0(n-1))$ and $E^B_0 = \omega n(1+2g_0(n-1)+2g_1)$. It is clear from Proposition 2.3 that the functions

$$\psi^C_\lambda(x) = \big(\Delta^C(x)\big)^{1/2}\,p^C_\lambda(x),\qquad \lambda\in\Lambda\qquad (C = A, B) \tag{3.9}$$

constitute a basis of eigenfunctions for $H^C$ (3.8a), (3.8b) with the corresponding eigenvalues given by $E^C(\lambda)$ in Proposition 2.3. In their present form these eigenfunctions for the rational Calogero model with harmonic term were introduced in [D3, BF] and also (for type A) in [UW]. The orthogonality (in $L^2(\mathbb{R}^n, dx_1\cdots dx_n)$) of the basis $\psi^C_\lambda(x)$ (3.9) follows from Proposition 2.1 and the orthonormalization constants can be read off from Proposition 2.2.

Historically, the study of the eigenvalue problem for the type A Hamiltonian (3.8a) was initiated by Calogero, who computed the spectrum and determined the structure of the corresponding eigenfunctions to be a product of the ground-state wave function $\psi^A_0(x) = (\Delta^A(x))^{1/2}$ and certain symmetric polynomials in $x_1,\dots,x_n$ [Ca]. To be precise, Calogero considered a translationally symmetric n-particle system with a potential of the form

$$V(x) = \sum_{1\le j<k\le n}\Big(\frac{G}{(x_j-x_k)^2}+\varpi^2(x_j-x_k)^2\Big),$$

which is seen to be equivalent to the type A system above up to a simple harmonic center-of-mass motion [...]

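subsection). Specifically, if one substitutes an Ansatz function of the form $R^C(r)\,Y^C_l(x)$, where $R^C(r)$ is a function of $r = \sqrt{x_1^2+\cdots+x_n^2}$ and $Y^C_l(x)$ is a permutation-symmetric homogeneous polynomial of degree $l$ in $x_1,\dots,x_n$ ($C = A$) or $x_1^2,\dots,x_n^2$ ($C = B$), then it is seen that this yields an eigenfunction of $D^C$ (2.8a), (2.8b) with eigenvalue $E^C$ if $R^C(r)$ and $Y^C_l(x)$ satisfy

$$-\frac{d^2R^A}{dr^2} + \Big(2\omega r - \frac{2l+g_0n(n-1)+n-1}{r}\Big)\frac{dR^A}{dr} = (E^A-2\omega l)\,R^A, \tag{3.10a}$$

$$-\frac{d^2R^B}{dr^2} + \Big(2\omega r - \frac{4l+2g_0n(n-1)+2ng_1+n-1}{r}\Big)\frac{dR^B}{dr} = (E^B-4\omega l)\,R^B, \tag{3.10b}$$

and

$$L^C\,Y^C_l = 0\qquad (C = A, B) \tag{3.11}$$

with

$$L^A = \sum_{1\le j\le n}\frac{\partial^2}{\partial x_j^2} + 2g_0\sum_{1\le j<k\le n}\frac{1}{x_j-x_k}\Big(\frac{\partial}{\partial x_j}-\frac{\partial}{\partial x_k}\Big), \tag{3.12a}$$

$$L^B = \sum_{1\le j\le n}\Big(\frac{\partial^2}{\partial x_j^2}+\frac{2g_1}{x_j}\frac{\partial}{\partial x_j}\Big) + 2g_0\sum_{1\le j<k\le n}\Big[\frac{1}{x_j-x_k}\Big(\frac{\partial}{\partial x_j}-\frac{\partial}{\partial x_k}\Big)+\frac{1}{x_j+x_k}\Big(\frac{\partial}{\partial x_j}+\frac{\partial}{\partial x_k}\Big)\Big]. \tag{3.12b}$$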
The “radial” equations (3.10a) and (3.10b) are confluent hypergeometric type equations that admit polynomial solutions for $E^A = 2\omega(l+2m)$ and $E^B = 4\omega(l+m)$ (with $m\in\mathbb{N}$) given by Laguerre polynomials in $r^2$ (cf. Eqs. (3.2b) and (3.4b))

$$R^A_m(r) = \frac{[l+n(1+(n-1)g_0)/2]_m}{(-\omega)^m}\times{}_1F_1\Big(\begin{matrix}-m\\ l+n(1+(n-1)g_0)/2\end{matrix};\ \omega r^2\Big), \tag{3.13a}$$

$$R^B_m(r) = \frac{[2l+n(1/2+g_1+(n-1)g_0)]_m}{(-\omega)^m}\times{}_1F_1\Big(\begin{matrix}-m\\ 2l+n(1/2+g_1+(n-1)g_0)\end{matrix};\ \omega r^2\Big). \tag{3.13b}$$

(Here we have chosen the normalization such that $R^C_m(r)$ is monic.) The “spherical” Eq. (3.11) was studied (for type A) by Calogero [Ca] and in further detail and more generality (therewith also including the type B) by Dunkl [Du1]. Let $P^C_l$ be the space of homogeneous symmetric polynomials of degree $l$ in $x_1,\dots,x_n$ ($C = A$) or $x_1^2,\dots,x_n^2$ ($C = B$) and let $H^C_l\subset P^C_l$ be the subspace of polynomials satisfying (3.11). The polynomials in $H^C_l$ are referred to as (symmetric) generalized spherical harmonics. For $g_0, g_1 = 0$ these generalized spherical harmonics reduce to ordinary harmonic polynomials in $\mathbb{R}^n$. It follows from Dunkl’s theory in [Du1] (see also [Du2] for the extension to the nonsymmetric case) that $\dim(H^A_l) = \dim(P^A_l)-\dim(P^A_{l-2})$ and

Orthogonal Polynomials Related to the Calogero System

481

B that dim(HlB ) = dim(PlB ) − dim(Pl−1 ). The upshot is that each polynomial pCλ may be written uniquely in the form

X

[|λ|/2]

pA λ (x)

=

A A Rm (r)Y|λ|−2m (x),

A A Y|λ|−2m ∈ H|λ|−2m ,

(3.14a)

m=0

pBλ (x)

=

|λ| X

B B Rm (r)Y|λ|−m (x),

B B Y|λ|−m ∈ H|λ|−m

(3.14b)

m=0

(where $[\,\cdot\,]$ represents the function that extracts the integer part). Indeed, the functions of the form on the r.h.s. of (3.14a) and (3.14b) are eigenfunctions of $D^C$ (2.8a), (2.8b) corresponding to the eigenvalue $2\omega|\lambda|$ (type A) and $4\omega|\lambda|$ (type B), respectively. Furthermore, the functions of this form span a space of dimension $\sum_{m=0}^{[|\lambda|/2]} \dim(\mathcal{H}^A_{|\lambda|-2m}) = \dim(\mathcal{P}^A_{|\lambda|})$ and $\sum_{m=0}^{|\lambda|} \dim(\mathcal{H}^B_{|\lambda|-m}) = \dim(\mathcal{P}^B_{|\lambda|})$, which is precisely the multiplicity of the eigenvalues $E^C(\lambda)$ in Proposition 2.3.

The formulas (3.14a), (3.14b) describe the relation between the Calogero type eigenfunctions of the form $R^C(r)\,Y_l^C(x)$ and the Hermite/Laguerre basis $p_\lambda^C(x)$. To determine precisely which functions $Y_l^C(x)$ appear in the decompositions (3.14a) and (3.14b), we pick the leading homogeneous parts on both sides of the equation and use that the highest-order homogeneous part of the multivariable Hermite (type A) and Laguerre (type B) polynomials are (with our normalization monic) Jack polynomials with parameter $\alpha = 1/g_0$ in $x_1,\ldots,x_n$ and $x_1^2,\ldots,x_n^2$, respectively (cf. Comment 3.2). Thus, we get upon taking the leading homogeneous part:

$$J_\lambda^M(x; 1/g_0) = \sum_{m=0}^{[|\lambda|/2]} r^{2m}\,Y^A_{|\lambda|-2m}(x), \qquad Y^A_{|\lambda|-2m} \in \mathcal{H}^A_{|\lambda|-2m}, \tag{3.15a}$$

$$J_\lambda^M(x^2; 1/g_0) = \sum_{m=0}^{|\lambda|} r^{2m}\,Y^B_{|\lambda|-m}(x), \qquad Y^B_{|\lambda|-m} \in \mathcal{H}^B_{|\lambda|-m}, \tag{3.15b}$$

where $J_\lambda^M(x; 1/g_0)$ and $J_\lambda^M(x^2; 1/g_0)$ again represent the monic Jack polynomial in $x_1,\ldots,x_n$ and $x_1^2,\ldots,x_n^2$ with parameter $\alpha = 1/g_0$ [St, M3]. In [Du1], Dunkl provides inversion formulas for decompositions of the form (3.15a), (3.15b) with which one can express the functions $Y^A_{|\lambda|-2m}$ and $Y^B_{|\lambda|-m}$ on the r.h.s. in terms of the homogeneous symmetric polynomial on the l.h.s. In our case these inversion formulas become

$$Y^A_{|\lambda|-2m}(x) = \Pi^A_{|\lambda|-2m}\,J_\lambda^M(x; 1/g_0), \tag{3.16a}$$

$$Y^B_{|\lambda|-m}(x) = \Pi^B_{|\lambda|-m}\,J_\lambda^M(x^2; 1/g_0) \tag{3.16b}$$

with

$$\Pi^A_{|\lambda|-2m} = \frac{1}{4^m\,m!\,[n/2 + d_A + |\lambda| - 2m]_m}\,T^A_{|\lambda|-2m}\,(L^A)^m, \qquad T_k^A = \sum_{j=0}^{[k/2]} \frac{r^{2j}\,(L^A)^j}{4^j\,j!\,[-n/2 - d_A - k + 2]_j},$$

$d_A = g_0 n(n-1)/2$ and

$$\Pi^B_{|\lambda|-m} = \frac{1}{4^m\,m!\,[n/2 + d_B + 2|\lambda| - 2m]_m}\,T^B_{|\lambda|-m}\,(L^B)^m, \qquad T_k^B = \sum_{j=0}^{k} \frac{r^{2j}\,(L^B)^j}{4^j\,j!\,[-n/2 - d_B - 2k + 2]_j},$$

$d_B = g_0 n(n-1) + n g_1$. Formulas (3.14a), (3.14b) combined with (3.16a), (3.16b) render the decomposition of the multivariable Hermite and Laguerre polynomials in terms of Dunkl’s generalized spherical harmonics in a closed form.

3.5. Notes. i. Definitions, differential equations, and highest-degree homogeneous parts. In [M2, La4, La5] it is demonstrated that the differential operators $D^A$ (2.8a) and $D^B$ (2.8b) have a basis of eigenfunctions of the form

$$\tilde p_\lambda^A(x) = \sum_{\mu\in\Lambda,\ \mu\subset\lambda} u^A_{\lambda,\mu}\,J_\mu(x; 1/g_0) \qquad (\lambda \in \Lambda), \tag{3.17a}$$

$$\tilde p_\lambda^B(x) = \sum_{\mu\in\Lambda,\ \mu\subset\lambda} u^B_{\lambda,\mu}\,J_\mu(x^2; 1/g_0) \qquad (\lambda \in \Lambda) \tag{3.17b}$$

(where $J_\mu(\,\cdot\,; 1/g_0)$ represents the Jack symmetric function and $u^A_{\lambda,\mu}, u^B_{\lambda,\mu} \in \mathbb{C}$ with $u^A_{\lambda,\lambda}, u^B_{\lambda,\lambda} \neq 0$). By the (standard) notation $\mu \subset \lambda$ for $\mu, \lambda \in \Lambda$ (2.2) it is meant that $\mu_j \le \lambda_j$ for $j = 1,\ldots,n$. The papers [M2, La4, La5, BF] then introduce the multivariable Hermite and Laguerre polynomials as the eigenfunctions for $D^C$ (2.8a), (2.8b) of the form (3.17a), (3.17b) with a prescribed normalization. (Notice, however, that in those works one uses Laguerre variables that are related to our variables $x_1,\ldots,x_n$ by a square.) The differential equation (cf. Proposition 2.3) and the formula for the highest-degree homogeneous component (cf. Comment 3.2) are therefore in their approach immediate from the definition of the multivariable Hermite and Laguerre polynomials. That the polynomials thus defined coincide with the multivariable Hermite and Laguerre polynomials of the present paper follows from their orthogonality with respect to the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b), respectively (cf. Note ii, below). Alternatively, the differential equations and the formulas for the highest-degree homogeneous components also follow from the creation-operator formulas in the Dunkl-operator approach of [UW, Ka1, Ka2].

ii. Orthogonality. The orthogonality of the multivariable Hermite and Laguerre polynomials was stated in [M2, La4, La5]. It was proved by Macdonald for the cases $g_0 = 1/2$, $1$ and $2$. Recently, Baker and Forrester [BF] proposed a proof valid for general parameters that exploits the fact that the polynomials may be seen as limiting cases of certain multivariable Jacobi polynomials [V, De, H, M2, La3, BO].
In essence, this approach should boil down to an extension of the transition in [M1] from the Selberg integral (whose integrand is the weight function for the Jacobi polynomials) to the Mehta-Macdonald integrals (2.7a), (2.7b) (whose integrands are the weight functions for the multivariable Hermite and Laguerre polynomials) so as to include also the polynomials of higher degree. For a different orthogonality proof without using limit transitions see [Ka1] (type A) and [Ka2] (type B). In the language of the Calogero-Sutherland n-particle systems, the transition employed by [BF] from the (multivariable) Jacobi polynomials (∼eigenfunctions for the


trigonometric BC type Sutherland system) to the Hermite and Laguerre polynomials (∼eigenfunctions for the rational type A and type B Calogero system with harmonic term, respectively) may be viewed as a transition from trigonometric to rational Calogero-Sutherland potentials (by sending the trigonometric period to infinity). (To arrive at the correct limits it is necessary to rescale the coupling constants appropriately and to recover the rational type A system one furthermore has to shift the center of mass over a half-period before taking the limit, cf. also the diagram in the introduction of [D2].) The orthogonality proof of the present paper is, on the other hand, based on the transition from the (multivariable) continuous Hahn to Hermite polynomials (type A) and from the (multivariable) Wilson to Laguerre polynomials (type B). This amounts to a transition from the difference (or “relativistic”) Ruijsenaars type Calogero system with external field to its conventional (i.e. “nonrelativistic”) counterpart by sending the difference step-size (∼ the inverse speed of light) to zero [D2, D3, R]. iii. Norm formulas. Norm formulas for the multivariable Hermite and Laguerre polynomials can be found in [M2, La4, La5], with again (cf. Note ii, above) proofs given by Macdonald for g0 = 1/2, 1 and 2. Baker and Forrester provide a proof of these norm formulas for general parameters that hinges on a generating function approach (and uses also the orthogonality) [BF]. The expressions for the norms given in [M2, La4, La5, BF] are written in terms of the Mehta-Macdonald integrals (2.7a), (2.7b) and evaluations of Jack symmetric functions at the identity. It can be verified with the aid of known evaluation formulas for the Jack symmetric functions due to Stanley [St, M3] (cf. also Comment 3.2) that our norm formulas in Proposition 2.2 are in agreement with those of [M2, La4, La5, BF]. iv. Pieri-type recurrence formulas. 
Recurrence relations of the type given by Proposition 2.5 (i.e. the simplest ones, corresponding to the first elementary symmetric function) were recently derived independently by Baker and Forrester using a generating function for the polynomials [BF]. As it stands, their recurrence formulas are a little less explicit than those obtained here since the coefficients are written in terms of certain implicitly defined generalized binomial coefficients and furthermore contain evaluations of the Jack symmetric functions at the identity. In order to make their formulas fully explicit (so as to compare with Proposition 2.5) one again needs Stanley’s expressions for the Jack symmetric functions at the identity [St, M3] in combination with an explicit representation for the specific binomial coefficients at hand that can be found in [La2].

v. Rodrigues formulas. Recently, Ujino and Wadati derived Rodrigues type formulas for the multivariable Hermite polynomials [UW] following an approach due to Lapointe and Vinet, who obtained similar Rodrigues formulas for the Jack symmetric functions [LV1, LV2]. Such Rodrigues formulas are particularly useful when trying to answer questions regarding the structure of the coefficients $c_{\lambda,\mu}$ that appear in the expansion of the polynomials in terms of monomial symmetric functions. For instance, the Rodrigues formulas allowed Lapointe and Vinet to prove a weak form of the Macdonald-Stanley integrality conjecture saying that (in an appropriate normalization) the expansion coefficients for the Jack symmetric functions in terms of monomial symmetric functions are polynomials in the parameters with integer coefficients [LV1]. (See [St, M3] for the Macdonald-Stanley integrality conjecture and various related conjectures.) A similar statement also holds true for the multivariable Hermite polynomials [UW].


4. Proofs

In this section the properties of the multivariable confluent hypergeometric families stated in Sect. 2 are proven by viewing the polynomials as degenerate (limiting) cases of the multivariable hypergeometric continuous Hahn families (type A) and Wilson families (type B) that were investigated in [D3, D4].

4.1. Orthogonality properties. In [D3, D4] multivariable continuous Hahn and Wilson type polynomials were considered that are associated to the weight functions

$$\Delta^{cH}(x) = \prod_{1\le j<k\le n}\left|\frac{\Gamma(g_0 + i(x_j - x_k))}{\Gamma(i(x_j - x_k))}\right|^2 \prod_{1\le j\le n}\bigl|\Gamma(a + ix_j)\,\Gamma(b + ix_j)\bigr|^2 \tag{4.1a}$$

and 0(g0 + i(xj − xk )) 0(g0 + i(xj + xk )) 2 (4.1b) 1 (x) = 0(i(xj − xk )) 0(i(xj + xk )) 1≤j
W

1≤j≤n

(with $g_0 \ge 0$ and $\mathrm{Re}(a, b, c, d) > 0$). Specifically, the multivariable continuous Hahn polynomials are defined by the conditions A.1, A.2 in Sect. 2 with $\Delta^{cH}(x)$ (4.1a) replacing the weight function $\Delta^A(x)$ (1.1a). Similarly, the multivariable Wilson polynomials are defined by the conditions B.1, B.2 in Sect. 2 with $\Delta^B(x)$ (1.1b) being replaced by $\Delta^W(x)$ (4.1b). If we rescale the variables by substituting

$$x_j \longrightarrow x_j/\beta, \qquad j = 1,\ldots,n \tag{4.2}$$

and simultaneously perform a reparametrization of the form

$$a = (\beta^2\tilde\omega)^{-1}, \qquad b = (\beta^2\tilde\omega')^{-1}, \qquad c = \tilde g_1, \qquad d = \tilde g_1' + 1/2 \tag{4.3}$$

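(with $\tilde\omega, \tilde\omega' > 0$, $\tilde g_1, \tilde g_1' \ge 0$ and $\beta$ real), then the weight functions $\Delta^{cH}(x)$ (4.1a) and $\Delta^W(x)$ (4.1b) pass (upon multiplication by the overall normalization constants $D^A(\beta)$ and $D^B(\beta)$) over into

$$\Delta^A_\beta(x) = D^A(\beta)\prod_{1\le j<k\le n}\left|\frac{\Gamma(g_0 + i\beta^{-1}(x_j - x_k))}{\Gamma(i\beta^{-1}(x_j - x_k))}\right|^2 \times \prod_{1\le j\le n}\left|\Gamma\Bigl(\frac{1}{\tilde\omega\beta^2} + i\beta^{-1}x_j\Bigr)\,\Gamma\Bigl(\frac{1}{\tilde\omega'\beta^2} + i\beta^{-1}x_j\Bigr)\right|^2 \tag{4.4a}$$

and

$$\begin{aligned}\Delta^B_\beta(x) = D^B(\beta)&\prod_{1\le j<k\le n}\left|\frac{\Gamma(g_0 + i\beta^{-1}(x_j - x_k))\,\Gamma(g_0 + i\beta^{-1}(x_j + x_k))}{\Gamma(i\beta^{-1}(x_j - x_k))\,\Gamma(i\beta^{-1}(x_j + x_k))}\right|^2 \\ &\times \prod_{1\le j\le n}\left|\frac{\Gamma(\tilde g_1 + i\beta^{-1}x_j)\,\Gamma(\tilde g_1' + 1/2 + i\beta^{-1}x_j)}{\Gamma(i\beta^{-1}x_j)\,\Gamma(1/2 + i\beta^{-1}x_j)}\right|^2 \\ &\times \prod_{1\le j\le n}\left|\Gamma\Bigl(\frac{1}{\tilde\omega\beta^2} + i\beta^{-1}x_j\Bigr)\,\Gamma\Bigl(\frac{1}{\tilde\omega'\beta^2} + i\beta^{-1}x_j\Bigr)\right|^2,\end{aligned} \tag{4.4b}$$

respectively. The normalization constants $D^A(\beta)$ and $D^B(\beta)$ are introduced so as to ensure finite limiting behavior of the weight functions for $\beta \to 0$. It is not so difficult to check (see appendix), using Stirling’s formula for the asymptotics of $\Gamma(z)$ for $|z| \to \infty$ (see e.g. [AbSt]), that if one takes

$$D^A(\beta) = |\beta|^{g_0 n(n-1)}\,\delta(\tilde\omega, \beta)^{2n}\,\delta(\tilde\omega', \beta)^{2n},$$

$$D^B(\beta) = |\beta|^{2g_0 n(n-1) + 2n(\tilde g_1 + \tilde g_1')}\,\delta(\tilde\omega, \beta)^{2n}\,\delta(\tilde\omega', \beta)^{2n},$$

where

$$\delta(\alpha, \beta) \equiv \sqrt{\frac{e}{2\pi}}\;e^{(1 + \log(\beta^2\alpha))(1/(\beta^2\alpha) - 1/2)}, \tag{4.5}$$

then the weight functions $\Delta^A_\beta(x)$ (4.4a) and $\Delta^B_\beta(x)$ (4.4b) converge pointwise to $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b) for $\beta \to 0$ provided the parameters of $\Delta^C(x)$ are related to those of $\Delta^C_\beta(x)$ by

$$\omega \equiv \tilde\omega + \tilde\omega' \qquad\text{and}\qquad g_1 \equiv \tilde g_1 + \tilde g_1'. \tag{4.6}$$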

For our purposes, however, pointwise convergence is not sufficient and we need a somewhat stronger convergence result stating that the corresponding measures pass over into each other:

$$\int_{-\infty}^{\infty}\!\!\cdots\int_{-\infty}^{\infty} p(x)\,\Delta^C(x)\,dx_1\cdots dx_n = \lim_{\beta\to 0}\int_{-\infty}^{\infty}\!\!\cdots\int_{-\infty}^{\infty} p(x)\,\Delta^C_\beta(x)\,dx_1\cdots dx_n \qquad (C = A \text{ or } B), \tag{4.7}$$

where $p(x)$ denotes an arbitrary polynomial in the variables $x_1,\ldots,x_n$. A proof of the limit formula (4.7) can be found in the appendix at the end of the paper.

Now, let $\{p^A_{\lambda,\beta}\}_{\lambda\in\Lambda}$ and $\{p^B_{\lambda,\beta}\}_{\lambda\in\Lambda}$ be the bases determined by the conditions A.1, A.2 and B.1, B.2 in Sect. 2 with the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b) being replaced by $\Delta^A_\beta(x)$ (4.4a) and $\Delta^B_\beta(x)$ (4.4b), respectively. So, up to scaling and reparametrization, the polynomials $p^A_{\lambda,\beta}(x)$ amount to the multivariable continuous Hahn polynomials (multiplied by $\beta^{|\lambda|}$) and the polynomials $p^B_{\lambda,\beta}(x)$ amount to the multivariable Wilson polynomials (multiplied by $\beta^{2|\lambda|}$) from [D3, D4]. It then follows from the defining properties for the polynomials (of the form A.1, A.2 and B.1, B.2) and the limit formula (4.7) that

$$p^C_\lambda(x) = \lim_{\beta\to 0} p^C_{\lambda,\beta}(x) \tag{4.8}$$

and that

$$\langle p^C_\lambda, p^C_\mu\rangle_C = \lim_{\beta\to 0}\,\langle p^C_{\lambda,\beta}, p^C_{\mu,\beta}\rangle_{C,\beta} = \lim_{\beta\to 0}\int_{-\infty}^{\infty}\!\!\cdots\int_{-\infty}^{\infty} p^C_{\lambda,\beta}(x)\,p^C_{\mu,\beta}(x)\,\Delta^C_\beta(x)\,dx_1\cdots dx_n \tag{4.9}$$

(of course again assuming a correspondence between the parameters in accordance with (4.6)). Proposition 2.1 is now immediate from limit formula (4.9) and the orthogonality of the multivariable continuous Hahn and Wilson families [D4] (which translates into B the orthogonality of the bases {pA λ,β }λ∈3 and {pλ,β }λ∈3 with respect to the weight B functions 1A β (x) (4.4a) and 1β (x) (4.4b)). The proof of Proposition 2.2 is based on explicit expressions for the quotients cH W W hpcH λ , pλ icH /h1, 1icH and hpλ , pλ iW /h1, 1iW —i.e. the ratios of the squared norm of the multivariable continuous Hahn resp. Wilson polynomials and the unit polynomial (with respect to the L2 inner product over Rn with weight function 1cH (x) resp. 1W (x))—that were computed in [D4]. We have ÿ cH cH cH cH Y [g0 + ρcH hpcH j + ρk , 1 − g0 + ρj + ρk ]λj +λk −4|λ| λ , pλ icH =2 cH cH cH h1, 1icH [ρcH j + ρk , 1 + ρj + ρk ]λj +λk 1≤j
ˆ cH + ρcH , 1 − aˆ cH + ρcH , 1 − bˆ cH + ρcH ]λj Y [ˆacH + ρcH j ,b j j j cH cH [ρ , 1 + ρ ] λ j j j 1≤j≤n

with ρcH ˆ cH = a + b − 1/2 and bˆ cH = a − b + 1/2; and we j = (n − j)g0 + a + b − 1/2, a have ÿ W W W W Y [g0 + ρW hpW j + ρk , 1 − g0 + ρj + ρk ]λj +λk λ , pλ i W = W W W h1, 1iW [ρW j + ρk , 1 + ρj + ρk ]λj +λk 1≤j

A Y [(k − j + 1)g0 , 1 + (k − j − 1)g0 ]λj −λk hpA λ , pλ i A = (2ω)−|λ| h1, 1iA [(k − j)g0 , 1 + (k − j)g0 ]λj −λk 1≤j
(4.11a)

1≤j≤n

and Y [(k − j + 1)g0 , 1 + (k − j − 1)g0 ]λj −λk hpBλ , pBλ iB = (2ω)−2|λ| h1, 1iB [(k − j)g0 , 1 + (k − j)g0 ]λj −λk 1≤j
(4.11b)

1≤j≤n

Combination of the expressions for the ratios in (4.11a) and (4.11b) with the Mehta-Macdonald formulas for $\langle 1, 1\rangle_A$ and $\langle 1, 1\rangle_B$ (cf. (2.7a), (2.7b)) produces evaluation formulas for $\langle p^A_\lambda, p^A_\lambda\rangle_A$ and $\langle p^B_\lambda, p^B_\lambda\rangle_B$ that can be cast in the form given by Proposition 2.2. Indeed, one easily checks that the norm formulas in Proposition 2.2 are in agreement with the ratio formulas (4.11a), (4.11b) (using the relation $\Gamma(a + l)/\Gamma(a) = [a]_l$) and, furthermore, that they boil down to the Mehta-Macdonald evaluation formulas (2.7a), (2.7b) for $\lambda = 0$. In order to verify the reduction to (2.7a) and (2.7b) for $\lambda = 0$, one uses the identity

$$n!\prod_{1\le j<k\le n}\frac{\Gamma((k-j+1)g_0)\,\Gamma(1 + (k-j-1)g_0)}{\Gamma((k-j)g_0)\,\Gamma(1 + (k-j)g_0)} = \frac{\Gamma(1 + ng_0)}{(\Gamma(1 + g_0))^n}$$

(which is derived by canceling common factors in the numerator and denominator on the l.h.s. and some further manipulations involving the standard shift property for the gamma function $z\,\Gamma(z) = \Gamma(z + 1)$).

4.2. Differential equations. The (multivariable continuous Hahn) polynomials $p^A_{\lambda,\beta}$ and (multivariable Wilson) polynomials $p^B_{\lambda,\beta}$ associated to the weight functions $\Delta^C_\beta(x)$ (4.4a) and (4.4b), respectively, satisfy the second-order difference equation [D3, D4]

$$D^C_\beta\,p^C_{\lambda,\beta} = E^C_\beta(\lambda)\,p^C_{\lambda,\beta} \qquad (C = A, B), \tag{4.12}$$

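where

$$D^A_\beta = \sum_{1\le j\le n}\Bigl(w^A(x_j)\prod_{1\le k\le n,\,k\ne j} v^A(x_j - x_k)\,\bigl(e^{\frac{\beta}{i}\frac{\partial}{\partial x_j}} - 1\bigr) + w^A(-x_j)\prod_{1\le k\le n,\,k\ne j} v^A(-x_j + x_k)\,\bigl(e^{-\frac{\beta}{i}\frac{\partial}{\partial x_j}} - 1\bigr)\Bigr), \tag{4.13a}$$

$$D^B_\beta = \sum_{1\le j\le n}\Bigl(w^B(x_j)\prod_{1\le k\le n,\,k\ne j} v^B(x_j - x_k)\,v^B(x_j + x_k)\,\bigl(e^{\frac{\beta}{i}\frac{\partial}{\partial x_j}} - 1\bigr) + w^B(-x_j)\prod_{1\le k\le n,\,k\ne j} v^B(-x_j + x_k)\,v^B(-x_j - x_k)\,\bigl(e^{-\frac{\beta}{i}\frac{\partial}{\partial x_j}} - 1\bigr)\Bigr) \tag{4.13b}$$

with

$$v^A(z) = 1 + \frac{\beta g_0}{iz}, \qquad w^A(z) = (1 + i\beta\tilde\omega z)(1 + i\beta\tilde\omega' z),$$

$$v^B(z) = 1 + \frac{\beta g_0}{iz}, \qquad w^B(z) = \Bigl(1 + \frac{\beta\tilde g_1}{iz}\Bigr)\Bigl(1 + \frac{\beta\tilde g_1'}{iz + \beta/2}\Bigr)(1 + i\beta\tilde\omega z)(1 + i\beta\tilde\omega' z),$$

and the eigenvalues on the r.h.s. are given by

$$E^A_\beta(\lambda) = \beta^4\tilde\omega\tilde\omega'\sum_{1\le j\le n}\bigl((\rho^A_j + \lambda_j)^2 - (\rho^A_j)^2\bigr), \tag{4.14a}$$

$$E^B_\beta(\lambda) = 4\beta^4\tilde\omega\tilde\omega'\sum_{1\le j\le n}\bigl((\rho^B_j + \lambda_j)^2 - (\rho^B_j)^2\bigr) \tag{4.14b}$$

with

$$\rho^A_j = (n - j)g_0 + \beta^{-2}\Bigl(\frac{1}{\tilde\omega} + \frac{1}{\tilde\omega'}\Bigr) - 1/2, \tag{4.15a}$$

$$\rho^B_j = (n - j)g_0 + (\tilde g_1 + \tilde g_1')/2 + \beta^{-2}\Bigl(\frac{1}{2\tilde\omega} + \frac{1}{2\tilde\omega'}\Bigr) - 1/4. \tag{4.15b}$$

A Taylor expansion in $\beta$ around zero reveals, using the limit (4.8), that

$$D^C_\beta\,p^C_{\lambda,\beta} = \beta^2 D^C p^C_\lambda + o(\beta^2)$$

and

$$E^C_\beta(\lambda)\,p^C_{\lambda,\beta} = \beta^2 E^C(\lambda)\,p^C_\lambda + o(\beta^2)$$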

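(with $\omega = \tilde\omega + \tilde\omega'$ and $g_1 = \tilde g_1 + \tilde g_1'$). Hence, after division by $\beta^2$ the difference equation in (4.12) passes for $\beta \to 0$ over into the differential equation of Proposition 2.3.

4.3. Pieri-type recurrence formulas. In [D4] Pieri formulas for the multivariable continuous Hahn and Wilson polynomials associated to the weight functions $\Delta^{cH}(x)$ (4.1a) and $\Delta^W(x)$ (4.1b) were introduced. After rescaling and reparametrizing in accordance with (4.2), (4.3), one arrives at the corresponding Pieri formulas for the polynomials $p^A_{\lambda,\beta}$ and $p^B_{\lambda,\beta}$ (which, recall, are determined by the conditions A.1, A.2 and B.1, B.2 of Sect. 2 with the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b) replaced by $\Delta^A_\beta(x)$ (4.4a) and $\Delta^B_\beta(x)$ (4.4b)). In the simplest case the resulting Pieri formula takes the form

$$\hat E^C_{1,\beta}(x)\,P^C_{\lambda,\beta}(x) = \sum_{\substack{1\le j\le n\\ \lambda + e_j\in\Lambda}} \hat V^C_{j,\beta}(\rho^C + \lambda)\bigl(P^C_{\lambda + e_j,\beta}(x) - P^C_{\lambda,\beta}(x)\bigr) + \sum_{\substack{1\le j\le n\\ \lambda - e_j\in\Lambda}} \hat V^C_{-j,\beta}(\rho^C + \lambda)\bigl(P^C_{\lambda - e_j,\beta}(x) - P^C_{\lambda,\beta}(x)\bigr) \tag{4.16}$$

with

$$\hat V^C_{\pm j,\beta}(\zeta) = \hat w^C(\pm\zeta_j)\prod_{1\le k\le n,\,k\ne j}\hat v^C(\pm\zeta_j + \zeta_k)\,\hat v^C(\pm\zeta_j - \zeta_k).$$

The functions $\hat v^C$, $\hat w^C$ are given by

$$\hat v^A(z) = 1 + \frac{g_0}{z}, \qquad \hat w^A(z) = \frac{(\hat a^A + z)(\hat b^A + z)}{4z},$$

$$\hat v^B(z) = 1 + \frac{g_0}{z}, \qquad \hat w^B(z) = \frac{(\hat a^B + z)(\hat b^B + z)(\hat c^B + z)(\hat d^B + z)}{2z(1 + 2z)}$$

with

$$\hat a^A = \beta^{-2}\Bigl(\frac{1}{\tilde\omega} + \frac{1}{\tilde\omega'}\Bigr) - 1/2, \qquad \hat b^A = \beta^{-2}\Bigl(\frac{1}{\tilde\omega} - \frac{1}{\tilde\omega'}\Bigr) + 1/2,$$

and

$$\begin{aligned}\hat a^B &= (\tilde g_1 + \tilde g_1')/2 + \beta^{-2}\Bigl(\frac{1}{\tilde\omega} + \frac{1}{\tilde\omega'}\Bigr)/2 - 1/4, \\ \hat b^B &= (\tilde g_1 + \tilde g_1')/2 - \beta^{-2}\Bigl(\frac{1}{\tilde\omega} + \frac{1}{\tilde\omega'}\Bigr)/2 + 3/4, \\ \hat c^B &= (\tilde g_1 - \tilde g_1')/2 + \beta^{-2}\Bigl(\frac{1}{\tilde\omega} - \frac{1}{\tilde\omega'}\Bigr)/2 + 1/4, \\ \hat d^B &= (\tilde g_1 - \tilde g_1')/2 - \beta^{-2}\Bigl(\frac{1}{\tilde\omega} - \frac{1}{\tilde\omega'}\Bigr)/2 + 1/4.\end{aligned}$$

The vector $\rho^C$ has components given by (4.15a), (4.15b) and the multiplier on the l.h.s. of (4.16) is given by

$$\hat E^A_{1,\beta}(x) = -\sum_{1\le j\le n}\Bigl(\frac{ix_j}{\beta} + \hat\rho^A_j\Bigr), \tag{4.17a}$$

$$\hat E^B_{1,\beta}(x) = -\sum_{1\le j\le n}\Bigl(\frac{x_j^2}{\beta^2} + (\hat\rho^B_j)^2\Bigr) \tag{4.17b}$$

with

$$\hat\rho^A_j = (n - j)g_0 + 1/(\beta^2\tilde\omega) \qquad\text{and}\qquad \hat\rho^B_j = (n - j)g_0 + \tilde g_1. \tag{4.18}$$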

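In the Pieri formula we have furthermore employed the normalization

$$P^C_{\lambda,\beta}(x) = c^C_{\lambda,\beta}\,p^C_{\lambda,\beta}(x) \tag{4.19}$$

with

$$c^A_{\lambda,\beta} = (-4i/\beta)^{|\lambda|}\prod_{1\le j\le n}\frac{[\rho^A_j]_{\lambda_j}}{[\hat a^A + \rho^A_j,\ \hat b^A + \rho^A_j]_{\lambda_j}} \times \prod_{1\le j<k\le n}\frac{[\rho^A_j + \rho^A_k]_{\lambda_j + \lambda_k}}{[g_0 + \rho^A_j + \rho^A_k]_{\lambda_j + \lambda_k}}\,\frac{[\rho^A_j - \rho^A_k]_{\lambda_j - \lambda_k}}{[g_0 + \rho^A_j - \rho^A_k]_{\lambda_j - \lambda_k}},$$

$$c^B_{\lambda,\beta} = (-1/\beta^2)^{|\lambda|}\prod_{1\le j\le n}\frac{[2\rho^B_j]_{2\lambda_j}}{[\hat a^B + \rho^B_j,\ \hat b^B + \rho^B_j,\ \hat c^B + \rho^B_j,\ \hat d^B + \rho^B_j]_{\lambda_j}} \times \prod_{1\le j<k\le n}\frac{[\rho^B_j + \rho^B_k]_{\lambda_j + \lambda_k}}{[g_0 + \rho^B_j + \rho^B_k]_{\lambda_j + \lambda_k}}\,\frac{[\rho^B_j - \rho^B_k]_{\lambda_j - \lambda_k}}{[g_0 + \rho^B_j - \rho^B_k]_{\lambda_j - \lambda_k}}.$$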

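The recurrence relations of Proposition 2.5 can be recovered from (4.16) for $\beta \to 0$. To see this, one first observes that

$$\lim_{\beta\to 0}\Bigl(\frac{i}{\beta\tilde\omega}\Bigr)^{|\lambda|} P^A_{\lambda,\beta}(x) = P^A_\lambda(x), \qquad \lim_{\beta\to 0} P^B_{\lambda,\beta}(x) = P^B_\lambda(x) \tag{4.20}$$

with $P^C_\lambda(x)$ given by (2.9). These limits follow from (4.8) and the fact that

$$\lim_{\beta\to 0}\Bigl(\frac{i}{\beta\tilde\omega}\Bigr)^{|\lambda|} c^A_{\lambda,\beta} = c^A_\lambda, \qquad \lim_{\beta\to 0} c^B_{\lambda,\beta} = c^B_\lambda \tag{4.21}$$

with $c^A_\lambda$ and $c^B_\lambda$ given by (2.10a) and (2.10b). (As usual, we assume an identification of the parameters of the form $\omega = \tilde\omega + \tilde\omega'$ and $g_1 = \tilde g_1 + \tilde g_1'$.) For type A, multiplication of (4.16) by $i\beta\,\bigl(\frac{i}{\beta\tilde\omega}\bigr)^{|\lambda|}$ leads for $\beta \to 0$ to the first recurrence relation of Proposition 2.5. The second (i.e. type B) recurrence relation of Proposition 2.5 is obtained similarly by sending $\beta$ to zero in the type B version of (4.16) after having multiplied both sides by the factor $(\tilde\omega + \tilde\omega')\beta^2$. To derive these limit transitions for the recurrence relations we have used that for $\beta \to 0$,

$$\hat v^C(\rho^C_j + \rho^C_k + \lambda_j + \lambda_k),\ \ \hat v^C(-\rho^C_j - \rho^C_k - \lambda_j - \lambda_k) = 1 + O(\beta^2),$$

$$\hat v^C(\rho^C_j - \rho^C_k + \lambda_j - \lambda_k) = 1 + \frac{g_0}{(k - j)g_0 + \lambda_j - \lambda_k},$$

and

$$\begin{aligned}\hat w^A(\rho^A_j + \lambda_j) &= \frac{1}{\beta^2\tilde\omega}\,(1 + O(\beta^2)), \\ \hat w^A(-\rho^A_j - \lambda_j) &= -\tilde\omega\,\frac{(n - j)g_0 + \lambda_j}{2(\tilde\omega + \tilde\omega')}\,(1 + O(\beta^2)), \\ \hat w^B(\rho^B_j + \lambda_j) &= \frac{(n - j)g_0 + \tilde g_1 + \tilde g_1' + 1/2 + \lambda_j}{\beta^2(\tilde\omega + \tilde\omega')}\,(1 + O(\beta^2)), \\ \hat w^B(-\rho^B_j - \lambda_j) &= \frac{(n - j)g_0 + \lambda_j}{\beta^2(\tilde\omega + \tilde\omega')}\,(1 + O(\beta^2)).\end{aligned}$$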
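The four asymptotic formulas for $\hat w^A$ and $\hat w^B$ can be probed numerically. The following sketch is an illustration only: it assumes the definitions of $\hat w^C$, $\hat a^C,\ldots,\hat d^C$ and $\rho^C_j$ (4.15a), (4.15b) as read from the text, and the function name and the sample parameter values are hypothetical. It returns the ratio of each exact value to its claimed leading term, which should approach one as $\beta \to 0$.

```python
def pieri_asymptotic_ratios(beta, n=3, j=1, lam_j=2,
                            g0=0.5, g1t=0.3, g1tp=0.4, om=1.2, omp=0.7):
    """Ratios (exact value) / (claimed leading term) for w^A and w^B.

    om, omp stand for omega-tilde and omega-tilde-prime; g1t, g1tp for
    g1-tilde and g1-tilde-prime (all sample values are hypothetical).
    """
    s = (1 / om + 1 / omp) / beta ** 2   # beta^-2 (1/om + 1/omp)
    d = (1 / om - 1 / omp) / beta ** 2   # beta^-2 (1/om - 1/omp)
    c = (n - j) * g0 + lam_j             # (n - j) g0 + lambda_j

    # Type A: rho_j^A from (4.15a) and the parameters a^A, b^A.
    rho_a = (n - j) * g0 + s - 0.5
    a_a, b_a = s - 0.5, d + 0.5

    def w_a(z):
        return (a_a + z) * (b_a + z) / (4 * z)

    # Type B: rho_j^B from (4.15b) and the parameters a^B, ..., d^B.
    g_sum, g_diff = g1t + g1tp, g1t - g1tp
    rho_b = (n - j) * g0 + g_sum / 2 + s / 2 - 0.25
    a_b = g_sum / 2 + s / 2 - 0.25
    b_b = g_sum / 2 - s / 2 + 0.75
    c_b = g_diff / 2 + d / 2 + 0.25
    d_b = g_diff / 2 - d / 2 + 0.25

    def w_b(z):
        numerator = (a_b + z) * (b_b + z) * (c_b + z) * (d_b + z)
        return numerator / (2 * z * (1 + 2 * z))

    z_a, z_b = rho_a + lam_j, rho_b + lam_j
    return {
        "wA(+)": w_a(z_a) * beta ** 2 * om,
        "wA(-)": w_a(-z_a) / (-om * c / (2 * (om + omp))),
        "wB(+)": w_b(z_b) * beta ** 2 * (om + omp) / (c + g_sum + 0.5),
        "wB(-)": w_b(-z_b) * beta ** 2 * (om + omp) / c,
    }


for beta in (1e-1, 1e-2, 1e-3):
    print(beta, pieri_asymptotic_ratios(beta))
```

The deviations from one shrink roughly like $\beta^2$ as $\beta$ decreases, in line with the stated $1 + O(\beta^2)$ corrections.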

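Notice to this end also that in the case of type A, a divergent term on the l.h.s. of the recurrence relation (4.16) originating from the factor $-\sum_j \hat\rho^A_j$ (cf. (4.17a)) cancels against a corresponding divergent term on the r.h.s. originating from the factor in front of $P^A_{\lambda,\beta}$ of the form $-\sum_j \hat V^A_{j,\beta}(\rho^A + \lambda)$. (That the divergent terms on both sides indeed cancel is seen using the identity $\sum_{j=1}^{n}\prod_{k\ne j}(1 + g_0/(\zeta_j - \zeta_k)) = n$.)

In general the recurrence relations for the polynomials $P^C_{\lambda,\beta}$ (4.19) induced by [D4] become

$$\hat E^C_{r,\beta}(x)\,P^C_{\lambda,\beta}(x) = \sum_{\substack{J\subset\{1,\ldots,n\},\ 0\le|J|\le r\\ \varepsilon_j = \pm 1,\ j\in J;\ \lambda + e_{\varepsilon J}\in\Lambda}} \hat U^C_{J^c,\,r-|J|}(\rho^C + \lambda)\,\hat V^C_{\varepsilon J,\,J^c}(\rho^C + \lambda)\,P^C_{\lambda + e_{\varepsilon J},\beta}(x), \tag{4.22}$$

$r = 1,\ldots,n$, with

$$e_{\varepsilon J} = \sum_{j\in J}\varepsilon_j e_j \qquad (\varepsilon_j \in \{+1, -1\}),$$

$$\hat V^C_{\varepsilon J,\,K}(\zeta) = \prod_{j\in J}\hat w^C(\varepsilon_j\zeta_j)\prod_{\substack{j,j'\in J\\ j<j'}}\hat v^C(\varepsilon_j\zeta_j + \varepsilon_{j'}\zeta_{j'})\,\hat v^C(\varepsilon_j\zeta_j + \varepsilon_{j'}\zeta_{j'} + 1) \times \prod_{\substack{j\in J\\ k\in K}}\hat v^C(\varepsilon_j\zeta_j + \zeta_k)\,\hat v^C(\varepsilon_j\zeta_j - \zeta_k),$$

$$\hat U^C_{K,p}(\zeta) = (-1)^p\sum_{\substack{L\subset K,\ |L| = p\\ \varepsilon_l = \pm 1,\ l\in L}}\Bigl(\prod_{l\in L}\hat w^C(\varepsilon_l\zeta_l)\prod_{\substack{l,l'\in L\\ l<l'}}\hat v^C(\varepsilon_l\zeta_l + \varepsilon_{l'}\zeta_{l'})\,\hat v^C(-\varepsilon_l\zeta_l - \varepsilon_{l'}\zeta_{l'} - 1) \times \prod_{\substack{l\in L\\ k\in K\setminus L}}\hat v^C(\varepsilon_l\zeta_l + \zeta_k)\,\hat v^C(\varepsilon_l\zeta_l - \zeta_k)\Bigr)$$

and

$$\hat E^A_{r,\beta}(x) = (-1)^r\sum_{\substack{J\subset\{1,\ldots,n\}\\ 0\le|J|\le r}}\ \prod_{j\in J}\frac{ix_j}{\beta}\sum_{r\le l_1\le\cdots\le l_{r-|J|}\le n}\hat\rho^A_{l_1}\cdots\hat\rho^A_{l_{r-|J|}}, \tag{4.23a}$$

$$\hat E^B_{r,\beta}(x) = (-1)^r\sum_{\substack{J\subset\{1,\ldots,n\}\\ 0\le|J|\le r}}\ \prod_{j\in J}\frac{x_j^2}{\beta^2}\sum_{r\le l_1\le\cdots\le l_{r-|J|}\le n}\bigl(\hat\rho^B_{l_1}\cdots\hat\rho^B_{l_{r-|J|}}\bigr)^2. \tag{4.23b}$$
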
For $r = 1$ the recurrence formula in (4.22) specializes to that of (4.16). It is not difficult to see that the recurrence relations for the multivariable Laguerre polynomials characterized by Proposition 2.6 and Proposition 2.7 follow from the type B version of (4.22) for $\beta \to 0$. Indeed, multiplication of (4.22) by the factor $\beta^{2r}(\tilde\omega + \tilde\omega')^r$ and sending $\beta$ to zero readily leads to the Laguerre type recurrence relations. The verification of this assertion hinges on the second limit formula of (4.20), the asymptotics for $\hat v^B$, $\hat w^B$ displayed above, and the fact that

$$\lim_{\beta\to 0}\beta^{2r}\hat E^B_{r,\beta}(x) = (-1)^r\sum_{\substack{J\subset\{1,\ldots,n\}\\ |J| = r}}\ \prod_{j\in J} x_j^2.$$

For type A the transition $\beta \to 0$ is substantially more complicated due to the singular nature of the terms in (4.22). Specifically, the multiplier $\hat E^A_{r,\beta}(x)$ (4.23a) consists of a linear combination of the elementary symmetric functions in $x_1,\ldots,x_n$ up to degree $r$. The coefficients in this linear combination have a pole at $\beta = 0$, the order of which decreases as the degree of the elementary symmetric function in question increases (notice that $\hat\rho^A_j = O(\beta^{-2})$). Hence, for $\beta \to 0$ the contributions of the lower-degree elementary symmetric functions to $\hat E^A_{r,\beta}(x)$ become predominant. To get rid of these lower-degree divergent terms, we take an appropriate linear combination of the recurrence relations (4.22) that casts them into a system of the form

$$\sum_{\substack{J\subset\{1,\ldots,n\}\\ |J| = r}}\ \prod_{j\in J} x_j\ \tilde P^A_{\lambda,\beta}(x) = \sum_{\substack{J\subset\{1,\ldots,n\},\ 0\le|J|\le r\\ \varepsilon_j = \pm 1,\ j\in J;\ e_{\varepsilon J} + \lambda\in\Lambda}} \hat W^A_{\varepsilon J,\,J^c;\,r,\beta}\ \tilde P^A_{\lambda + e_{\varepsilon J},\beta}(x) \tag{4.24}$$

with

$$\tilde P^A_{\lambda,\beta}(x) = \Bigl(\frac{i}{\beta\tilde\omega}\Bigr)^{|\lambda|} P^A_{\lambda,\beta}(x).$$
Basically, this boils down to passing from Pieri formulas corresponding to the symmetric functions $\hat E^A_{r,\beta}(x)$ (4.23a) to Pieri formulas corresponding to the elementary symmetric functions $\sum_{|J| = r}\prod_{j\in J} x_j$ by subtracting from the $r$th Pieri formula in (4.22) a suitable linear combination of the Pieri formulas corresponding to $\hat E^A_{s,\beta}(x)$ with $s < r$ (and multiplication by an overall factor). The coefficients of the terms on the r.h.s. of (4.22) labeled by index sets $J$ with $|J| = r$ are invariant with respect to such changes on the l.h.s. (up to an overall factor $(i\beta)^r$ and a factor caused by the change of the normalization $P^A_{\lambda,\beta} \to \tilde P^A_{\lambda,\beta}$). More precisely, we obtain that for $|J| = r$ the coefficient $\hat W^A_{\varepsilon J,\,J^c;\,r,\beta}$ on the r.h.s. of (4.24) is given by

$$\hat W^A_{\varepsilon J,\,J^c;\,r,\beta} = (i\beta)^r\,(-i\beta\tilde\omega)^{\sum_{j\in J}\varepsilon_j}\,\hat V^A_{\varepsilon J,\,J^c}(\rho^A + \lambda).$$

The type A version of Proposition 2.6 then follows for $\beta \to 0$ (using the limit formula (4.20) and the above asymptotics for $\hat v^A$ and $\hat w^A$ (and finally (2.6))). (Recall also that we may restrict the summation on the r.h.s. of the resulting Pieri formula to index sets $J_+$, $J_-$ for which $r - |J_+| - |J_-|$ is even because of the reflection property (2.6).)

It remains to verify the normalization properties of $P^C_\lambda(x)$ stated in Proposition 2.4. These properties are a consequence of the fact that $P^C_{\lambda,\beta}(i\beta\hat\rho^C) = 1$ (see the remarks in [D4, Sect. 6]). Specifically, by sending $\beta$ to zero in the relation $P^C_{\lambda,\beta}(i\beta\hat\rho^C) = 1$ we arrive at the normalization properties of Proposition 2.4. For type B this is immediate from the limit formula (4.20); for type A this is seen by noticing that $\lim_{\beta\to 0} P^A_{\lambda,\beta}(i\beta\hat\rho^A)$ picks up the highest-degree homogeneous part of $P^A_\lambda(x)$ evaluated in $x = \mathbf{1}$ (here we use that $\hat\rho^A_j = \frac{1}{\tilde\omega\beta^2} + O(1)$ together with the limit formula (4.20)), which is equal to $\lim_{\alpha\to\infty}\alpha^{-|\lambda|} P^A_\lambda(\alpha\mathbf{1})$.
Appendix: Convergence of the Weight Functions

In this appendix it will be shown that the weight functions $\Delta^A_\beta(x)$ (4.4a) and $\Delta^B_\beta(x)$ (4.4b) converge for $\beta \to 0$ to the weight functions $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b), respectively. More precisely, we will prove the somewhat stronger result that

$$\lim_{\beta\to 0}\int_{-\infty}^{\infty}\!\!\cdots\int_{-\infty}^{\infty} p(x)\,\Delta^C_\beta(x)\,dx_1\cdots dx_n = \int_{-\infty}^{\infty}\!\!\cdots\int_{-\infty}^{\infty} p(x)\,\Delta^C(x)\,dx_1\cdots dx_n \qquad (C = A \text{ or } B), \tag{A.1}$$

where $p(x)$ denotes an arbitrary polynomial in the variables $x_1,\ldots,x_n$. In (A.1) it is understood that the parameters of $\Delta^C(x)$ (1.1a), (1.1b) and $\Delta^C_\beta(x)$ (4.4a), (4.4b) are related via the identification $\omega \equiv \tilde\omega + \tilde\omega'$ and $g_1 \equiv \tilde g_1 + \tilde g_1'$.

Let us start by inferring the pointwise convergence of the weight functions. For this purpose we use the limit formula

$$\lim_{\beta\to 0}\,\delta(\alpha, \beta)\,\Bigl|\Gamma\Bigl(\frac{1}{\alpha\beta^2} + i\beta^{-1}y\Bigr)\Bigr| = \exp(-\alpha y^2/2) \qquad (\alpha > 0) \tag{A.2}$$

with

$$\delta(\alpha, \beta) \equiv \sqrt{\frac{e}{2\pi}}\;e^{(1 + \log(\alpha\beta^2))(\alpha^{-1}\beta^{-2} - 1/2)}$$

and the limit formula

$$\lim_{\beta\to 0}\left|\beta^a\,\frac{\Gamma(a + b + i\beta^{-1}y)}{\Gamma(b + i\beta^{-1}y)}\right| = |y|^a \qquad (a, b \ge 0), \tag{A.3}$$

where in both formulas it is assumed that $y$ and $\beta$ are real. By applying (A.2) and (A.3) to the factors of $\Delta^A_\beta(x)$ (4.4a) and $\Delta^B_\beta(x)$ (4.4b), one readily sees that for $\beta \to 0$ these weight functions converge pointwise to $\Delta^A(x)$ (1.1a) and $\Delta^B(x)$ (1.1b) as indicated. The normalization factors of the form $\delta(\alpha, \beta)$ and $|\beta|^a$ in (A.2) and (A.3) ensure a finite and nontrivial limit; the factors in question have been collected in the weight functions $\Delta^A_\beta(x)$ (4.4a) and $\Delta^B_\beta(x)$ (4.4b) into the overall normalization constants $D^A(\beta) = |\beta|^{n(n-1)g_0}\,\delta(\tilde\omega, \beta)^{2n}\,\delta(\tilde\omega', \beta)^{2n}$ and $D^B(\beta) = |\beta|^{2n(n-1)g_0 + 2n(\tilde g_1 + \tilde g_1')}\,\delta(\tilde\omega, \beta)^{2n}\,\delta(\tilde\omega', \beta)^{2n}$.

The limit formulas (A.2) and (A.3) may be verified with the aid of Stirling’s formula for the asymptotics of the gamma function for large values of the argument, which reads (see e.g. [AbSt, Ol])

$$\Gamma(z) = (2\pi)^{1/2}\,e^{-z}\,z^{z - 1/2}\cdot\exp(R(z)) \tag{A.4}$$

with $R(z) = O(1/|z|)$ for $|z| \to \infty$ in the sector $|\arg(z)| \le \pi - \epsilon$ ($\epsilon > 0$). Substitution of $z = \alpha^{-1}\beta^{-2} + i\beta^{-1}y$ in (A.4) entails that for $\beta \to 0$,

$$\Bigl|\Gamma\Bigl(\frac{1}{\alpha\beta^2} + i\beta^{-1}y\Bigr)\Bigr| = \frac{1}{\delta(\alpha, \beta)}\,(1 + \alpha^2\beta^2y^2)^{\frac{1}{2\alpha\beta^2}}\,e^{-\frac{y}{\beta}\arctan(\alpha\beta y)}\,(1 + O(\beta^2)), \tag{A.5}$$

which implies (A.2). In a similar way one concludes from (A.4) that for $\beta \to 0$,

$$\left|\frac{\Gamma(a + b + i\beta^{-1}y)}{\Gamma(b + i\beta^{-1}y)}\right| = |y/\beta|^a\,(1 + O(\beta)), \tag{A.6}$$

which implies (A.3).

Let us next demonstrate that the pointwise convergence of the integrands carries over to the convergence of the integrals by invoking Lebesgue’s dominated convergence theorem. For this purpose it is needed to dominate (the absolute value of) the integrand on the l.h.s. of (A.1) uniformly in $\beta$ by an integrable function. We will do so by bounding individually the factors comprising the weight function. Specifically, it turns out that factors of the form $\delta(\tilde\omega, \beta)\,|\Gamma(\frac{1}{\tilde\omega\beta^2} + i\beta^{-1}x_j)|$ (and $\delta(\tilde\omega', \beta)\,|\Gamma(\frac{1}{\tilde\omega'\beta^2} + i\beta^{-1}x_j)|$) may be dominated by an exponentially decaying function and that the remaining factors, which consist of ratios of the form $|\beta^{g_0}\,\Gamma(g_0 + i\beta^{-1}(x_j \pm x_k))/\Gamma(i\beta^{-1}(x_j \pm x_k))|$, $|\beta^{\tilde g_1}\,\Gamma(\tilde g_1 + i\beta^{-1}x_j)/\Gamma(i\beta^{-1}x_j)|$ and $|\beta^{\tilde g_1'}\,\Gamma(\tilde g_1' + 1/2 + i\beta^{-1}x_j)/\Gamma(1/2 + i\beta^{-1}x_j)|$, grow at most polynomially in the variables $x_1,\ldots,x_n$. Hence, the integrand on the l.h.s. may be dominated by an exponentially decaying function and (A.1) follows from the pointwise convergence of the integrands by the dominated convergence theorem as advertised.
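The pointwise limits (A.2) and (A.3) can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it works with logarithms to avoid overflow, evaluates $\log|\Gamma(z)|$ by a truncated Stirling series after shifting the argument to the right, and uses hypothetical function names and sample values for $\alpha$, $a$, $b$, $y$.

```python
import cmath
import math


def log_abs_gamma(z, shift=8):
    """Approximate log|Gamma(z)| for Re(z) >= 0, z != 0.

    Uses Gamma(z) = Gamma(z + N) / prod_{k < N} (z + k) and a truncated
    Stirling series for log Gamma(z + N).
    """
    correction = sum(math.log(abs(z + k)) for k in range(shift))
    w = z + shift
    series = (w - 0.5) * cmath.log(w) - w + 0.5 * math.log(2 * math.pi)
    series += 1 / (12 * w) - 1 / (360 * w ** 3) + 1 / (1260 * w ** 5)
    return series.real - correction


def log_delta(alpha, beta):
    """Logarithm of the normalization factor delta(alpha, beta)."""
    t = alpha * beta ** 2
    return 0.5 * (1 - math.log(2 * math.pi)) + (1 + math.log(t)) * (1 / t - 0.5)


def log_lhs_A2(alpha, beta, y):
    """log of delta(alpha, beta) |Gamma(1/(alpha beta^2) + i y / beta)|."""
    z = complex(1 / (alpha * beta ** 2), y / beta)
    return log_delta(alpha, beta) + log_abs_gamma(z)


def log_lhs_A3(a, b, beta, y):
    """log of |beta^a Gamma(a + b + i y / beta) / Gamma(b + i y / beta)|."""
    z = complex(b, y / beta)
    return a * math.log(abs(beta)) + log_abs_gamma(z + a) - log_abs_gamma(z)


alpha, a, b, y = 1.5, 0.7, 0.3, 0.8   # hypothetical sample values
for beta in (1e-1, 1e-2, 1e-3):
    print(beta,
          log_lhs_A2(alpha, beta, y) - (-alpha * y ** 2 / 2),
          log_lhs_A3(a, b, beta, y) - a * math.log(abs(y)))
```

Both printed differences should tend to zero as $\beta$ decreases, consistent with the limits (A.2) and (A.3) for the chosen values.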

To validate the above claims regarding the bounds on the growth of the factors constituting the weight functions $\Delta^A_\beta(x)$ (4.4a) and $\Delta^B_\beta(x)$ (4.4b), we need a precise estimate for the error term $R(z)$ appearing in the Stirling formula [Ol],

$$|R(z)| \le \frac{1}{12\,|z|\cos^2(\theta/2)} \qquad (\theta = \arg(z)). \tag{A.7}$$

(This estimate is valid in the whole sector $|\arg(z)| \le \pi - \epsilon$, $\epsilon > 0$.) By substituting $z = \alpha^{-1}\beta^{-2} + i\beta^{-1}y$ in the Stirling formula (A.4) (where $\alpha$ is assumed to be positive) and combining with the error estimate (A.7), we find (cf. (A.5))

$$\delta(\alpha, \beta)\,\Bigl|\Gamma\Bigl(\frac{1}{\alpha\beta^2} + i\beta^{-1}y\Bigr)\Bigr| \le e^{-F_\beta(y)}\,G_\beta, \tag{A.8}$$

with

$$F_\beta(y) = \frac{y}{\beta}\arctan(\alpha\beta y) - \frac{1}{2\alpha\beta^2}\log(1 + \alpha^2\beta^2y^2) \tag{A.9}$$

and $G_\beta = e^{\alpha\beta^2/6}$. (In our situation $\cos^2(\theta/2) \ge 1/2$ in view of the fact that the real part of $z = \frac{1}{\alpha\beta^2} + i\beta^{-1}y$ is positive.) Differentiation of $F_\beta(y)$ with respect to $y$ yields

$$\partial_y F_\beta(y) = \beta^{-1}\arctan(\beta\alpha y), \tag{A.10}$$

which shows that $F_\beta(y)$ is nonnegative as an increasing/decreasing function for $y$ positive/negative with $F_\beta(0) = 0$. From the asymptotics for $|y| \to \infty$ one furthermore sees that the factor $\exp(-F_\beta(y))$ decays exponentially. A little more precise analysis reveals that for $0 < \beta < 1$,

$$e^{-F_\beta(y)} \le \begin{cases} 1 & \text{for } |y| < 1/\alpha, \\ e^{-|y|/3} & \text{for } |y| \ge 1/\alpha. \end{cases} \tag{A.11}$$

To obtain the exponential bound on the tail we have used: (i) that for $0 < \beta < 1$,

$$F_\beta(1/\alpha) = \frac{1}{\alpha\beta}\arctan(\beta) - \frac{1}{2\alpha\beta^2}\log(1 + \beta^2) > \bigl(\pi/4 - \log(\sqrt{2})\bigr)/\alpha > 1/(3\alpha),$$

(ii) that for $y \ge 1/\alpha$ and $0 < \beta < 1$,

$$\partial_y F_\beta(y) \ge \partial_y F_\beta(1/\alpha) = \beta^{-1}\arctan(\beta) > \pi/4 > 1/3,$$

and (iii) that $F_\beta(y)$ is even in $y$. We conclude from (A.8) and (A.11) that for $0 < \beta < 1$ the factors in the weight function $\Delta^C_\beta(x)$ ($C = A, B$) of the form $\delta(\tilde\omega, \beta)\,|\Gamma(\frac{1}{\tilde\omega\beta^2} + i\beta^{-1}x_j)|$ (and $\delta(\tilde\omega', \beta)\,|\Gamma(\frac{1}{\tilde\omega'\beta^2} + i\beta^{-1}x_j)|$) may indeed be dominated uniformly in $\beta$ by an exponentially decaying function of $x_j$.

It remains to check that the rest of the factors, which consist of ratios of gamma functions, can be dominated by a function that grows at most polynomially in the variables. To this end we should find bounds on the gamma function ratios of the type appearing on the l.h.s. of (A.3). Notice that for $a \in \mathbb{N}$ we have

$$\left|\beta^a\,\frac{\Gamma(a + b + i\beta^{-1}y)}{\Gamma(b + i\beta^{-1}y)}\right| = \prod_{m=0}^{a-1}\bigl|(m + b)\beta + iy\bigr|, \tag{A.12}$$

which is easily dominated uniformly in $0 < \beta < 1$ by a function with polynomial growth in $y$ (take e.g. the function $((a + b)^2 + y^2)^{a/2}$). The case of general positive (not necessarily integer valued) $a$ is a little less straightforward; it will be addressed here with the aid of the integral representation

$$\frac{\Gamma(a + z)}{\Gamma(z)} = z^a\exp\Bigl(\int_0^\infty e^{-zt}\Bigl(a - \frac{1 - e^{-at}}{1 - e^{-t}}\Bigr)\frac{dt}{t}\Bigr), \qquad \mathrm{Re}(z) > 0, \tag{A.13}$$

which can be obtained by integrating Gauss’ integral formula for the psi function $\psi(z) = \Gamma'(z)/\Gamma(z)$ [AbSt, Ol],

$$\psi(z) = \int_0^\infty\Bigl(\frac{e^{-t}}{t} - \frac{e^{-zt}}{1 - e^{-t}}\Bigr)dt, \qquad \mathrm{Re}(z) > 0 \tag{A.14}$$

(using also that $\log(z) = \int_0^\infty t^{-1}(e^{-t} - e^{-tz})\,dt$ for $\mathrm{Re}(z) > 0$). Let us for the moment assume that $b$ is positive. Then substitution of $z = b + i\beta^{-1}y$ in (A.13) entails

$$\left|\beta^a\,\frac{\Gamma(a + b + i\beta^{-1}y)}{\Gamma(b + i\beta^{-1}y)}\right| = (\beta^2b^2 + y^2)^{a/2}\exp\Bigl(\int_0^\infty e^{-bt}\cos(yt/\beta)\Bigl(a - \frac{1 - e^{-at}}{1 - e^{-t}}\Bigr)\frac{dt}{t}\Bigr). \tag{A.15}$$

The integral within the exponent is bounded by a constant with value

$$\int_0^\infty e^{-bt}\,\Bigl|a - \frac{1 - e^{-at}}{1 - e^{-t}}\Bigr|\,\frac{dt}{t}$$

(which equals $\log\bigl(b^a\,\Gamma(b)/\Gamma(a + b)\bigr)$ if $0 < a \le 1$ and $\log\bigl(b^{-a}\,\Gamma(a + b)/\Gamma(b)\bigr)$ if $a > 1$, cf. (A.13)) and the factor in front is smaller than $(b^2 + y^2)^{a/2}$ for $0 < \beta < 1$. The case that $b$ becomes zero can be reduced to the previous situation with positive $b$ by means of the identity

$$\frac{\Gamma(a + i\beta^{-1}y)}{\Gamma(i\beta^{-1}y)} = \frac{\Gamma(a + 1 + i\beta^{-1}y)}{\Gamma(1 + i\beta^{-1}y)}\;\frac{iy}{a\beta + iy}. \tag{A.16}$$

Thus, the upshot is that for $0 < \beta < 1$ the factors $|\beta^{g_0}\,\Gamma(g_0 + i\beta^{-1}(x_j \pm x_k))/\Gamma(i\beta^{-1}(x_j \pm x_k))|$, $|\beta^{\tilde g_1}\,\Gamma(\tilde g_1 + i\beta^{-1}x_j)/\Gamma(i\beta^{-1}x_j)|$ and $|\beta^{\tilde g_1'}\,\Gamma(\tilde g_1' + 1/2 + i\beta^{-1}x_j)/\Gamma(1/2 + i\beta^{-1}x_j)|$ can be uniformly dominated in $\beta$ by a function that grows at most polynomially in the variables $x_1,\ldots,x_n$, which completes the proof of (A.1).

Acknowledgement. The author would like to thank S.N.M. Ruijsenaars for some helpful remarks in connection with the material in the appendix (in particular with respect to the usefulness of the integral representation (A.13)).

References

[AbSt] Abramowitz, M., Stegun, I.A. (eds.): Handbook of mathematical functions. New York: Dover Publications, 1972 (9th printing)
[A1] Askey, R.: Some basic hypergeometric extensions of integrals of Selberg and Andrews. SIAM J. Math. Anal. 11, 938–951 (1980)
[A2] Askey, R.: Continuous Hahn polynomials. J. Phys. A: Math. Gen. 18, L1017–L1019 (1985)
[AW] Askey, R., Wilson, J.: A set of hypergeometric orthogonal polynomials. SIAM J. Math. Anal. 13, 651–655 (1982)
[AtSu] Atakishiyev, N.M., Suslov, S.K.: The Hahn and Meixner polynomials of an imaginary argument and some of their applications. J. Phys. A: Math. Gen. 18, 1583–1596 (1985)
[BF] Baker, T.H., Forrester, P.J.: The Calogero-Sutherland model and generalized classical polynomials. Preprint Research Institute for Mathematical Sciences, Kyoto, RIMS-1094, 1996
[BO] Beerends, R.J., Opdam, E.M.: Certain hypergeometric series related to the root system BC. Trans. Am. Math. Soc. 339, 581–609 (1993)
[BHKV] Brink, L., Hansson, T.H., Konstein, S., Vasiliev, M.A.: The Calogero model - Anyonic representation, fermionic extension and supersymmetry. Nucl. Phys. B 401, 591–612 (1993)
[BHV] Brink, L., Hansson, T.H., Vasiliev, M.A.: Explicit solution of the n-body Calogero model. Phys. Lett. B 286, 109–111 (1992)
[Ca] Calogero, F.: Solution of the one-dimensional n-body problems with quadratic and/or inversely quadratic pair potentials. J. Math. Phys. 12, 419–436 (1971)
[Co] Constantine, A.G.: The distribution of Hotelling's generalized $T_0^2$. Ann. Math. Statist. 37, 215–225 (1966)
[De] Debiard, A.: Système différentiel hypergéométrique et parties radiales des opérateurs invariants des espaces symétriques de type $BC_p$. In: Malliavin, M.-P. (ed.) Séminaire d'Algèbre Paul Dubreil et Marie-Paule Malliavin, Lecture Notes in Math., vol. 1296, Berlin: Springer-Verlag, 1988, pp. 42–124
[D1] van Diejen, J.F.: Commuting difference operators with polynomial eigenfunctions. Compositio Math. 95, 183–233 (1995)
[D2] van Diejen, J.F.: Difference Calogero-Moser systems and finite Toda chains. J. Math. Phys. 36, 1299–1323 (1995)
[D3] van Diejen, J.F.: Multivariable continuous Hahn and Wilson polynomials related to integrable difference systems. J. Phys. A: Math. Gen. 28, L369–L374 (1995)
[D4] van Diejen, J.F.: Properties of some families of hypergeometric orthogonal polynomials in several variables. Trans. Am. Math. Soc. (to appear)
[Du1] Dunkl, C.F.: Orthogonal polynomials on the sphere with octahedral symmetry. Trans. Am. Math. Soc. 282, 555–575 (1984)
[Du2] Dunkl, C.F.: Reflection groups and orthogonal polynomials on the sphere. Math. Z. 197, 33–60 (1988)
[Du3] Dunkl, C.F.: Differential-difference operators associated to reflection groups. Trans. Am. Math. Soc. 311, 167–183 (1989)
[G] Gambardella, P.J.: Exact results in quantum many-body systems of interacting particles in many dimensions with SU(1,1) as the dynamical group. J. Math. Phys. 16, 1172–1187 (1975)
[H] Heckman, G.J.: An elementary approach to the hypergeometric shift operator of Opdam. Invent. Math. 103, 341–350 (1991)
[He] Herz, C.S.: Bessel functions of matrix argument. Ann. Math. 61, 474–523 (1955)
[J] James, A.T.: Special functions of matrix and single argument in statistics. In: Askey, R. (ed.) Theory and applications of special functions, New York: Academic Press, 1975, pp. 497–520
[Kad] Kadell, K.W.J.: The Selberg-Jack symmetric functions. Adv. Math. (to appear)
[Ka1] Kakei, S.: Common algebraic structure for the Calogero-Sutherland models. J. Phys. A: Math. Gen. 29, L619–L624 (1996)
[Ka2] Kakei, S.: An orthogonal basis for the $B_N$-type Calogero model. J. Phys. A: Math. Gen. 30, 535–554 (1997)
[KS] Koekoek, R., Swarttouw, R.F.: The Askey-scheme of hypergeometric orthogonal polynomials and its q-analogue. Math. report Delft University of Technology 94-05, 1994
[LV1] Lapointe, L., Vinet, L.: A Rodrigues formula for the Jack polynomials and the Macdonald-Stanley conjecture. Internat. Math. Res. Notices 1995, 419–424
[LV2] Lapointe, L., Vinet, L.: Exact operator solution of the Calogero-Sutherland model. Commun. Math. Phys. 178, 425–455 (1996)
[La1] Lassalle, M.: Une formule de Pieri pour les polynômes de Jack. C. R. Acad. Sci. Paris Sér. I Math. 309, 941–944 (1989)
[La2] Lassalle, M.: Une formule du binôme généralisée pour les polynômes de Jack. C. R. Acad. Sci. Paris Sér. I Math. 310, 253–256 (1990)
[La3] Lassalle, M.: Polynômes de Jacobi généralisés. C. R. Acad. Sci. Paris Sér. I Math. 312, 425–428 (1991)
[La4] Lassalle, M.: Polynômes de Laguerre généralisés. C. R. Acad. Sci. Paris Sér. I Math. 312, 725–728 (1991)
[La5] Lassalle, M.: Polynômes de Hermite généralisés. C. R. Acad. Sci. Paris Sér. I Math. 313, 579–582 (1991)
[M1] Macdonald, I.G.: Some conjectures for root systems. SIAM J. Math. Anal. 13, 988–1007 (1982)
[M2] Macdonald, I.G.: Hypergeometric functions. Unpublished manuscript
[M3] Macdonald, I.G.: Symmetric functions and Hall polynomials (2nd edition). Oxford: Clarendon Press, 1995
[Me] Mehta, M.L.: Random matrices (2nd edition). Boston: Academic Press, 1991
[Mu] Muirhead, R.J.: Aspects of multivariate statistical theory. New York: Wiley, 1982
[OP] Olshanetsky, M.A., Perelomov, A.M.: Quantum integrable systems related to Lie algebras. Phys. Rep. 94, 313–404 (1983)
[Ol] Olver, F.W.J.: Asymptotics and special functions. New York: Academic Press, 1974
[Pe] Perelomov, A.M.: Algebraic approach to the solution of a one-dimensional model of n interacting particles. Theoret. and Math. Phys. 6, 263–282 (1971)
[R] Ruijsenaars, S.N.M.: Complete integrability of relativistic Calogero-Moser systems and elliptic function identities. Commun. Math. Phys. 110, 191–213 (1987)
[Se] Selberg, A.: Bemerkninger om et multipelt integral. Norsk Mat. Tidsskr. 26, 71–78 (1944). Collected papers, vol. 1, Berlin: Springer-Verlag, 1989, pp. 204–213
[St] Stanley, R.P.: Some combinatorial properties of Jack symmetric functions. Adv. Math. 77, 76–115 (1989)
[UW] Ujino, H., Wadati, M.: Rodrigues formula for hi-Jack symmetric polynomials associated with the quantum Calogero model. J. Phys. Soc. Japan 65, 2423–2439 (1996)
[V] Vretare, L.: Formulas for elementary spherical functions and generalized Jacobi polynomials. SIAM J. Math. Anal. 15, 805–833 (1984)
[W] Wilson, J.A.: Some hypergeometric orthogonal polynomials. SIAM J. Math. Anal. 11, 690–701 (1980)
[Y] Yamamoto, T.: Multicomponent Calogero model of $B_N$-type confined in harmonic potential. Phys. Lett. A 208, 293–302 (1995)
[YT] Yamamoto, T., Tsuchiya, O.: Integrable $1/r^2$ spin chain with reflecting end. J. Phys. A: Math. Gen. 29, 3977–3984 (1996)

Communicated by T. Miwa

This article was processed by the author using the LaTeX style file pljour1 from Springer-Verlag.

Commun. Math. Phys. 188, 499 – 500 (1997)

Communications in Mathematical Physics

Erratum

Exact Ground State Energy of the Strong-Coupling Polaron*

Elliott H. Lieb¹, Lawrence E. Thomas²

¹ Departments of Physics and Mathematics, Jadwin Hall, Princeton University, P. O. Box 708, Princeton, New Jersey 08544, USA
² Department of Mathematics, University of Virginia, Charlottesville, Virginia 22903, USA

Received: 20 April 1997 / Accepted: 20 April 1997

Commun. Math. Phys. 183, 511–519 (1997)

We are grateful to Professor Andrey V. Soldatov of the Moscow Steklov Mathematical Institute for calling our attention to an error in our paper. The commutator inequality (8) in our step I, namely $|k_j|\,|\langle a_k e^{ik\cdot x}\rangle| \le 2\langle p_j^2\rangle^{1/2}\langle a_k^* a_k\rangle^{1/2}$, is not correct. Rather, the right side of this inequality should be $\langle p_j^2\rangle^{1/2}\bigl(\langle a_k^* a_k\rangle^{1/2} + \langle a_k a_k^*\rangle^{1/2}\bigr)$ or a related expression. The extra factor $\langle a_k a_k^*\rangle^{1/2}$, with the $a_k$ and $a_k^*$ not in normal order, generates uncontrolled mischief with, for example, the right side of the ultraviolet bound (10) containing an additional term $\sum_{|k|\ge K} 1/2 = \infty$.

The situation is remedied with the help of the method introduced by Lieb and Yamazaki (ref. [14] in our previous paper) to obtain the previous rigorous lower bound on the polaron energy. Our main result, (31), is still valid. Indeed, it is improved slightly. Define the (vector) operator $Z = (Z_1, Z_2, Z_3)$ with components

$$Z_j = \left(\frac{4\pi\alpha}{V}\right)^{1/2} \sum_{|k|\ge K} k_j\, \frac{a_k}{|k|^3}\, e^{ik\cdot x}, \qquad j = 1, 2, 3. \qquad (1)$$

Then the commutator estimate (8) is replaced by

$$-\left(\frac{4\pi\alpha}{V}\right)^{1/2} \sum_{|k|\ge K} \Bigl\langle \frac{a_k}{|k|}\, e^{ik\cdot x} \Bigr\rangle + \text{c.c.} \;\equiv\; -\sum_j \bigl\langle [p_j, Z_j - Z_j^*] \bigr\rangle \;\le\; 2\langle p^2\rangle^{1/2} \bigl\langle -(Z - Z^*)^2 \bigr\rangle^{1/2} \;\le\; 2\langle p^2\rangle^{1/2} \bigl\langle 2(Z^*Z + ZZ^*) \bigr\rangle^{1/2} \;\le\; \varepsilon\langle p^2\rangle + \frac{2}{\varepsilon}\langle Z^*Z + ZZ^*\rangle. \qquad (2)$$

Now, each component $Z_j$ can be thought of as a single (unnormalized) oscillator mode having commutator with its adjoint $[Z_j, Z_j^*] = (4\pi\alpha/V) \sum_{|k|\ge K} k_j^2 |k|^{-6} \to 2\alpha/3\pi K$; moreover, $Z_i$ and $Z_j^*$ commute for $i \ne j$ (i.e., these modes are orthogonal). Using these facts, we have that

$$\frac{2}{\varepsilon}\langle Z^*Z + ZZ^*\rangle = \frac{4}{\varepsilon}\langle Z^*Z\rangle + \frac{2}{\varepsilon}\,\frac{2\alpha}{\pi K} \;\le\; \sum_{|k|\ge K} \langle a_k^* a_k\rangle + 3/2 \qquad (3)$$

if we choose $\varepsilon = 8\alpha/3\pi K$, which is smaller and better by a factor 1/3 from the $\varepsilon$ in the article. Here we have employed an orthogonal rotation of coordinates bringing $\sum_{|k|\ge K} a_k^* a_k$ into a form $(4/\varepsilon) Z^*Z$ + non-negative operators. (Compare Eqs. (21, 22) of the article.) Combining these inequalities, we obtain

$$-\left(\frac{4\pi\alpha}{V}\right)^{1/2} \sum_{|k|\ge K} \Bigl\langle \frac{a_k}{|k|}\, e^{ik\cdot x} \Bigr\rangle + \text{c.c.} \;\le\; \varepsilon\langle p^2\rangle + \sum_{|k|\ge K} \langle a_k^* a_k\rangle + 3/2. \qquad (4)$$

* © 1996 retained by the authors. Faithful reproduction of this article, by any means, is permitted for non-commercial purposes.
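The constants entering (3) and (4) can be checked numerically. The following is a minimal sketch, not part of the erratum: it assumes the usual infinite-volume replacement $(1/V)\sum_k \to \int d^3k/(2\pi)^3$, and the function names and the sample values of $\alpha$ and $K$ are illustrative choices. It evaluates the limit of $[Z_j, Z_j^*]$ and confirms that $\varepsilon = 8\alpha/3\pi K$ gives the additive constant 3/2 and makes $(4/\varepsilon) Z_j^* Z_j$ the number operator of a normalized mode.

```python
import math

def radial_tail(K, steps=1000):
    """Midpoint-rule value of Int_K^inf k^-2 dk, using the substitution k = K/u."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        k = K / u
        total += k ** -2 * (K / u ** 2) * h   # f(k) * |dk/du| * du
    return total

def mode_commutator(alpha, K):
    """Infinite-volume limit of [Z_j, Z_j*] (assumes (1/V) sum_k -> Int d^3k/(2 pi)^3)."""
    # The angular integral of k_j^2 / |k|^2 over the unit sphere is 4*pi/3,
    # which leaves the radial integral Int_K^inf k^2 * k^-4 dk.
    prefactor = 4 * math.pi * alpha / (2 * math.pi) ** 3
    return prefactor * (4 * math.pi / 3) * radial_tail(K)

alpha, K = 1.7, 2.3                        # arbitrary illustrative values
c = mode_commutator(alpha, K)              # expected: 2*alpha / (3*pi*K)
eps = 8 * alpha / (3 * math.pi * K)        # the choice of epsilon made in the erratum
zero_point = (2 / eps) * 3 * c             # (2/eps) * sum over j of [Z_j, Z_j*]
mode_norm = (4 / eps) * c                  # commutator of the rescaled mode

print(c, zero_point, mode_norm)
```

With these inputs the three printed quantities should agree with $2\alpha/3\pi K$, $3/2$ and $1$ up to rounding, independently of the values chosen for $\alpha$ and $K$.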

This last inequality is our replacement for the ultraviolet bound (10). It follows that $H \ge H_K - 3/2$, where $H_K$ is as in Eq. (11), but with the coefficient of $p^2$ given by $(1 - 8\alpha/3\pi K)$ rather than $(1 - 8\alpha/\pi K)$. With the choice $K = 8\alpha/3\pi$, inequality (13) becomes $H \ge -(16\alpha^2/3\pi^2) - 3/2$, a bound at least consistent with a known upper bound for the ground state energy linear in $\alpha$.

The remainder of the article is an analysis of $H_K$ and needs only minor modification. The coefficient of $p^2$ in Eqs. (19, 23, 27, 28, 30) should be $(1 - c_1\alpha^{-1/5}/3)$ and, at the end of the article, $c_5 = (c_1/3 + 2c_4)c_P$. Due to the smaller value of $\varepsilon$ defined above, our estimate on the coefficient of $\alpha^{9/5}$ in (31) is slightly improved to 2.337, rather than 3.822 as reported. Of course, our lower bound for the ground state energy is decreased merely by the constant $-3/2$, which is unimportant on a scale of $\alpha^{9/5}$.

Communicated by D. Brydges

Commun. Math. Phys. 188, 501 – 520 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Topological Quantum Field Theory for the Universal Quantum Invariant*

J. Murakami¹, T. Ohtsuki²

¹ Department of Mathematics, Osaka University, Osaka, Japan. E-mail: [email protected]
² Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Tokyo, Japan. E-mail: [email protected]

Received: 22 August 1996 / Accepted: 17 January 1997

Abstract: We extend the universal quantum invariant defined in [15] to an invariant of 3-manifolds with boundaries, and show that the invariant satisfies modified axioms of TQFT.

Introduction

For a Lie algebra $\mathfrak{g}$, Witten proposed a topological invariant of 3-manifolds, which we call the quantum $\mathfrak{g}$ invariant. He defined the invariant by using the path integral of quantum field theory. Based on his definition, rigorous constructions of the quantum invariants were obtained, first by Reshetikhin and Turaev [16], and by others. Further, Kontsevich introduced the universal Vassiliev-Kontsevich invariant of links in [8], see also [5, 11]. The invariant of a link is expressed as a linear sum of chord diagrams consisting of solid and dashed lines. We can recover the quantum $\mathfrak{g}$ invariants of links by substituting the Lie algebra $\mathfrak{g}$ and its representation for the dashed and solid lines, respectively. In [14], Thang Le and the authors constructed the universal quantum invariant $\Omega(M) \in A(\phi)$ of a 3-manifold $M$ from the universal Vassiliev-Kontsevich invariant, where $A(\phi)$ denotes the space consisting of chord diagrams without solid lines.

The notion of topological quantum field theory (TQFT) was introduced by Atiyah et al. in [3, 2], see also [6]. The theory consists of the axioms below, and the quantum invariants of 3-manifolds satisfy the axioms. Hence, it is expected that the universal quantum invariant should also satisfy them.

* This research is supported in part by Grant-in-Aid for Scientific Research, The Ministry of Education, Science, Sports and Culture.

Axioms for topological quantum field theory. The following axioms are for functors $\mathcal{F}$ from a category of oriented 3-cobordisms to a category of modules over a commutative


ring $k$; we denote $\mathcal{F}(\Sigma)$ by $V(\Sigma)$ for a closed surface $\Sigma$ and denote $\mathcal{F}(M)$ by $Z(M) \in V(\partial M)$ for a 3-manifold $M$ with boundary $\partial M$.

(A1) For two surfaces $\Sigma_1$ and $\Sigma_2$, $V(\Sigma_1 \sqcup \Sigma_2) = V(\Sigma_1) \otimes V(\Sigma_2)$ holds.
(A2) For a surface $\Sigma$, $V(-\Sigma) = V(\Sigma)^*$ holds, where $-\Sigma$ denotes $\Sigma$ with the opposite orientation.
(A3) For the empty surface $\phi$, $V(\phi) = k$ holds.
(A4) For two 3-manifolds $M_1$ and $M_2$ with boundaries $\partial M_1 = \Sigma \sqcup \Sigma_1$ and $\partial M_2 = (-\Sigma) \sqcup \Sigma_2$, the formula $Z(M_1 \cup_\Sigma M_2) = \langle Z(M_1), Z(M_2) \rangle_\Sigma$ holds, where $\langle\ ,\ \rangle_\Sigma : V(\Sigma) \otimes V(\Sigma_1) \otimes V(\Sigma)^* \otimes V(\Sigma_2) \to V(\Sigma_1) \otimes V(\Sigma_2)$ denotes the contraction.

In this paper, we extend the universal invariant $\Omega(M) \in A(\phi)$ to an invariant $\Omega(M, G)$ of a 3-manifold $M$ with an embedded trivalent graph $G$ in $M$. By regarding the invariant $\Omega(M)$ as an invariant of $M - N(G)$, we extend the universal quantum invariant to an invariant of 3-manifolds with boundaries. We also aim to show that the extended invariant satisfies modified axioms of topological quantum field theory with respect to a commutative ring $A(\phi)$.

1. Universal Quantum Invariant for Embedded Graphs in $S^3$

1.1. Chord diagrams on a graph. Throughout this paper, a graph means a framed trivalent oriented graph, admitting loops and trivial circles, unless otherwise stated. We also assume that the edges are labeled. In pictures, we express the framing of a graph by the blackboard framing.

To define an invariant of framed graph embeddings in $S^3$, we introduce a space of chord diagrams on a graph $\Gamma$. A uni-trivalent graph is a graph every vertex of which is either univalent or trivalent. A uni-trivalent graph is vertex-oriented if, at each trivalent vertex, a cyclic order of edges is fixed.

Definition 1.1. For a graph $\Gamma$, a chord diagram $D$ with support $\Gamma$ is $\Gamma$ together with a vertex-oriented uni-trivalent graph whose univalent vertices are on $\Gamma$; and the graph does not have any connected component homeomorphic to a circle. We call the uni-trivalent graph the chord graph of the diagram.

A chord diagram $D$ is illustrated by $\Gamma$ together with dashed lines indicating the chords. Note that our definition of a chord diagram is more general than that of [4, 11].

Let $A(\Gamma)$ be the $\mathbb{C}$-linear space spanned by chord diagrams with support $\Gamma$, subject to the AS, IHX and STU relations in [14] and the branching relation given in Fig. 1.

Suppose $E$ is an edge of $\Gamma$. Reversing the orientation of $E$, we get a new graph $\Gamma'$ from $\Gamma$. Let $S_{(E)} : A(\Gamma) \to A(\Gamma')$ be the linear mapping which sends every chord diagram $D$ in $A(\Gamma)$ to the diagram $S_{(E)}(D)$ obtained from $D$ by reversing the orientation of $E$ and multiplying by $(-1)^m$, where $m$ is the number of vertices of the chord graph on the edge $E$. We call $S_{(E)}$ the antipode corresponding to $E$.

Let $E$ be an edge of a graph $\Gamma$. Removing $E$ from $\Gamma$, we get a new graph $\Gamma'$. We define a map $\varepsilon_{(E)} : A(\Gamma) \to A(\Gamma')$ as follows. If a chord diagram $D$ has no vertices of the chord graph on $E$, we put $\varepsilon_{(E)}(D)$ to be the chord diagram obtained by removing the solid edge $E$ from $D$, and we put $\varepsilon_{(E)}(D) = 0$ otherwise. The maps $S_{(E)}$ and $\varepsilon_{(E)}$ are well-defined and extend linearly to maps on $A(\Gamma)$.

1.2. Universal Vassiliev-Kontsevich invariant for a graph embedded in $S^3$. Let $G$ be an embedding of a framed trivalent oriented graph $\Gamma$ in $S^3$. For the framed graph $G$, we define its universal Vassiliev-Kontsevich invariant $\hat Z(G)$ in $A(\Gamma)$. In order

[Figure: eight diagrammatic identities between chord diagrams near a trivalent vertex, one for each choice of edge orientations.]

Fig. 1. Branching relation around a vertex

to define the invariant of $G$, we fix a diagram of $G$ in the plane such that the framing of $G$ is given by the blackboard framing. We use the tangle decomposition of $G$ as in [11]. Here, before decomposing $G$, we deform $G$ by isotopy of the plane, if necessary, so that there are at least one upward edge and at least one downward edge at each trivalent vertex. The definition of the invariant for vertices is given in Fig. 2. Here, $S_1$, $S_2$ and $S_3$ denote the action of the antipode $S$ on the left, middle and right strings respectively. The elements $a$ and $b$ are given in [13] by using the KZ-associator $\Phi$ as follows.

[Pictorial definitions of $a$ and $b$: two chord-diagram pictures whose boxes are labeled $\Delta_1(S_2 S_3(\Phi))$, $S_1 S_2(\Phi)$, $\Delta_1(S_1(\Phi^{-1}))$ and $S_3(\Phi^{-1})$.]

Here, $\Delta_1$ means the comultiplication at the left-most string, $S_1$ denotes the action of the antipode $S$ on the left-most string, $S_2$ denotes that on the middle string and $S_3$ denotes that on the right-most string. Then, $\hat Z(G)$ is obtained from the definitions for primitive q-tangles as in the case of knots and links.

Let $G$ be an embedding of an oriented trivalent framed graph $\Gamma$ in $S^3$. When defining $\hat Z(G) \in A(\Gamma)$ as above, we fixed a diagram of $G$ in the plane.

Lemma 1.2. $\hat Z(G)$ does not change under isotopy of the plane.

Proof. If $G$ is a link, we know that $\hat Z(G)$ is invariant under isotopy of the plane by results in [11, 12]; in those papers the invariant $\hat Z$ is constructed for links in a combinatorial way. Though the invariance is proved there by using invariance of the Kontsevich integral, we know that the invariance also holds for moves between elementary tangles, see [11, 12]. Hence, in the present case, by the results of those papers, $\hat Z(G)$ is invariant under an isotopy of the plane which fixes a neighborhood of each trivalent vertex of $G$. Therefore, it is sufficient to show that $\hat Z(G)$ is invariant under moves of a neighborhood of a trivalent vertex of $G$.

First, under a horizontal move of a vertex, $\hat Z(G)$ is invariant by definition.

Second, under a vertical move of a vertex, we show the invariance of $\hat Z(G)$ as follows. Let $G$ be an embedding of $\Gamma$ and $G'$ another embedding identical to $G$ except within a neighborhood of a vertex, where they are as in one of the pictures in Fig. 3. Though there might possibly exist associators beside the vertex, the change of $\hat Z(G)$ is as shown in

[Figure: values of $\hat Z$ on the elementary pictures of a trivalent vertex, given by the vertex diagram with one of $a^{-1/2}$, $b^{-1/2}$, $S_1(a^{-1/2})$, $S_2(a^{-1/2})$, $S_1(b^{-1/2})$, $S_2(b^{-1/2})$ inserted, depending on the orientations of the edges.]

Fig. 2. Definition of $\hat Z$ at vertices

[Figure: two pairs of pictures $G \longleftrightarrow G'$.]

Fig. 3. Vertical moves of a trivalent vertex

Fig. 4. By the branching relations in Fig. 1, $\hat Z(G)$ does not change under the moves in Fig. 4. Hence, we have $\hat Z(G) = \hat Z(G')$ in this case.

Lastly, under a rotation of a neighbourhood of a vertex, $\hat Z(G)$ is invariant by Lemma 1.3 below.

Lemma 1.3. The two formulas $\hat Z(\,\cdot\,) = \hat Z(\,\cdot\,)$, each equating the values of $\hat Z$ on a pictured trivalent vertex and on its rotated picture, hold. [The four vertex pictures of the statement are not reproduced here.]

Proof. Since the two formulas are proved in the same way, we show the proof only for the first formula. We fix orientations of the edges as in the pictured instance of the formula and show the formula for it; note that we obtain the formulas for the other orientations by applying the antipode $S$ to the orientation-reversed edges.

[Figure: two moves, each relating a picture with a box labeled $\Delta(C)$ to a picture with a box labeled $C$.]

Fig. 4. The change of $\hat Z(G)$ under vertical moves of a vertex. By the branching relations, both sides of each move are equal in $A(\Gamma)$

In [10], we showed pictorial formulas expressing the values of $\hat Z$ on the two elementary vertex pictures in terms of $a$, $b$ and $\nu^{1/2}$. Hence, we obtain a chain of pictorial equalities involving $a$, $\nu^{1/2}$ and $\Delta_1(S_2(b))$, where $\Delta_1(S_2(b))$ means the comultiplication of $S_2(b)$ for the left string of $b$. On the other hand, we know that the values of $\hat Z$ on the two corresponding pictures coincide. Therefore the diagram carrying $a$ and $\nu$ equals the diagram carrying $\Delta_1(S_2(b))$, and so the vertex diagram with $a^{-1/2}$ (together with the factor $\nu^{1/2}$) inserted equals the vertex diagram with $S_2(b^{-1/2})$ inserted. Hence, the value of $\hat Z$ on the first picture, given by the diagram with $a^{-1/2}$, equals the diagram with $S_2(b^{-1/2})$, which is the value of $\hat Z$ on the second picture. [The diagrams of this computation are not reproduced here.]

Theorem 1.4. $\hat Z(G)$ is an isotopy invariant of an embedding $G$ of an oriented trivalent framed graph $\Gamma$ in $S^3$.

Proof. Since we have Lemma 1.2, it is sufficient to show that $\hat Z(G)$ is invariant under the extended Reidemeister moves for spatial graphs defined in [17]. The moves consist of the ordinary Reidemeister moves for links and additional moves for a vertex of a graph; we show the additional moves for framed trivalent graphs in Fig. 5. Note that we do not need the move in [17] which changes the order of edges around a vertex, since we consider framed graphs with the blackboard framing. By using results in [11], we know that $\hat Z(G)$ is invariant under the ordinary Reidemeister moves. Hence, it suffices to show the invariance under the additional moves in Fig. 5.

We show the invariance under the two moves in Fig. 5 as follows. In the definition of $\hat Z(G)$ we put $e^{H/2}$ at a crossing, and put $\Delta_1(e^{H/2})$ at a crossing of two parallel strings

Fig. 5. The additional moves among the extended Reidemeister moves

and one string. Here, $H$ denotes the chord diagram consisting of two solid vertical lines and one horizontal dashed chord, and $\Delta_1$ is the action of $\Delta$ on the string, replacing it by two parallel strings. Hence, we obtain the invariance by putting $C = H/2$ in the formulas in Fig. 4.

Proposition 1.5. Let $G$ be an embedding of an oriented trivalent framed graph $\Gamma$ in $S^3$, and $E$ an edge of $\Gamma$.
(1) Let $\Gamma'$ be the oriented trivalent graph obtained from $\Gamma$ by reversing the orientation of the edge $E$, and $G'$ the corresponding embedding of $\Gamma'$. Then we have $\hat Z(G') = S_{(E)}(\hat Z(G))$.
(2) Let $\Gamma''$ be the trivalent graph obtained from $\Gamma$ by removing the edge $E$, and $G''$ the corresponding embedding of $\Gamma''$. We suppose that the remaining two edges around each vertex of $E$ have inward and outward orientation. Then we have $\hat Z(G'') = \varepsilon_{(E)}(\hat Z(G))$.

Proof. If $G$ is a link, we can check the formulas for each elementary tangle in the combinatorial definition of $\hat Z$ in [11]. Composing the formulas for elementary tangles, we obtain the required formulas. In the present case, in addition to the above argument, it is sufficient to show the formulas for the definition of $\hat Z$ at vertices given in Fig. 2; we can check them by elementary calculations.

2. Universal Quantum Invariant for Embedded Graphs in 3-Manifolds

2.1. Kirby moves for a graph embedded in a 3-manifold. Let $\Gamma$ be a framed oriented trivalent graph and $M$ a 3-manifold. We consider an embedding of $\Gamma$ in $M$ in this subsection. Let $L \cup G$ be a union of a framed link and an embedding of $\Gamma$ in $S^3$ such that $L$ and $G$ do not intersect and such that $M$ is obtained by surgery along $L$. We use the same notation $G$ again for the embedding of $\Gamma$ in $M$ obtained from the embedding in $S^3$ after the Dehn surgery along $L$.

We construct an invariant of $(M, G)$ from $\hat Z(L \cup G)$. In order to show its topological invariance, we consider the following Kirby moves for $L \cup G$. Let $L$ be a link and $G$ an embedding of $\Gamma$ as before. Let the KI move be the move adding or deleting a trivial circle component with $\pm 1$ framing. Let the KII move be the handle slide move over a component of $L$; there are two kinds of this move. One is the KII move involving two components of $L$, as in Fig. 6, KIIa. The other is the KII move where a part of $G$ passes over a component of $L$, as in Fig. 6, KIIb. Then we have


Proposition 2.1. Let $L \cup G$ be a union of a link and an embedding of a graph $\Gamma$ in $S^3$, and $L' \cup G'$ another such union. By Dehn surgery along $L$ and $L'$ respectively, we obtain the pairs $(M, G)$ and $(M', G')$ as above. Then, $(M, G)$ and $(M', G')$ are homeomorphic to each other (that is, there exists a homeomorphism between $M$ and $M'$ taking $G$ to $G'$) if and only if there is a sequence of the above KI and KII moves between $L \cup G$ and $L' \cup G'$.

[Figure: the three Kirby moves.
(KI) $G \cup L \sqcup U_+ \sim G \cup L \sim G \cup L \sqcup U_-$ (split unions).
(KIIa) $G \cup L_1 \cup L_2 \cup \cdots \cup L_k \sim G \cup L_1' \cup L_2 \cup \cdots \cup L_k$.
(KIIb) $G \cup L_1 \cup \cdots \cup L_k \sim G' \cup L_1 \cup \cdots \cup L_k$.]

Fig. 6. Kirby moves for a graph embedding in a 3-manifold

2.2. Invariants for graphs embedded in a 3-manifold. Let $D$ be a chord diagram in $A(\sqcup^\ell S^1 \cup \Gamma)$. We define the degree of $D$ to be half the total number of trivalent vertices of $D$ and univalent vertices of $D$ (end points of chords on solid lines). For a graph $\Gamma$, we call the rank of $H_1(\Gamma, \mathbb{Z})$ the genus of $\Gamma$. We often use the same notation $A(X)$ for the completion of the space of chord diagrams with support $X$ with respect to the degree of chord diagrams.

Let $L \cup G$ be a union of a link and an embedding of $\Gamma$ presenting a 3-manifold $M$ and an embedding $G$ of a graph $\Gamma$ in $M$. Let $\ell$ be the number of components of $L$. In Section 1, we defined $\hat Z(L \cup G) \in A(\sqcup^\ell S^1 \cup \Gamma)$. We construct an invariant of the pair $(M, G)$ from $\hat Z(L \cup G)$ as follows.

We introduce the relation $P_n$ defined in [14]. Let $P_1$ be the equivalence relation such that any chord diagram with non-empty dashed lines is equivalent to zero. Let $P_2$ be the relation stating that the sum of the three diagrams obtained by pairing 4 points in the three possible ways is equal to 0; the left-hand side is the sum over all pairings of 4 points. Similarly we define the equivalence relation $P_n$ such that the sum over all pairings of $2n$ points is equivalent


to zero. Let $L_{<2n}$ be the relation such that any chord diagram having a component of $\sqcup^\ell S^1$ with less than $2n$ end points of dashed lines equals zero, $D_{>n}$ the relation such that any chord diagram whose degree is more than $n$ equals zero, and $O_n$ the relation such that a trivial circle of a dashed line is equal to $-2n$; in other words, adding a trivial circle of a dashed line is equivalent to multiplication by $-2n$. Let $\mathring{A}(\sqcup^\ell S^1 \cup \Gamma)$ be the space of chord diagrams including dashed trivial circles. We consider the quotient space of $\mathring{A}(\sqcup^\ell S^1 \cup \Gamma)$ by the relations $P_{n+1}$, $L_{<2n}$, $D_{>n}$ and $O_n$ below. We define the notation $\check Z$ by

$$\check Z(L \cup G) = \hat Z(L \cup G) \,\#\, (\underbrace{\nu \otimes \cdots \otimes \nu}_{\ell}) \in A(\sqcup^\ell S^1 \cup \Gamma).$$

This formula means that $\check Z(L \cup G)$ is obtained from $\hat Z(L \cup G)$ by successively taking a connected sum with $\nu$ along every component of $L$. The reason why we introduce $\check Z(L \cup G)$ is that $\check Z(L \cup G)$ behaves better than $\hat Z(L \cup G)$ under the Kirby move KII, see [13]. In the same way as in the proof of Proposition 3.3 of [14], we have

Proposition 2.2. The equivalence class $[\check Z(L \cup G)]$ in $\mathring{A}(\sqcup^\ell S^1 \cup \Gamma)/P_{n+1}, L_{<2n}, D_{>n}, O_n$ does not depend on the orientation of $L$. Further, it is invariant under the Kirby move KII.

In order to make $[\check Z(L \cup G)]$ invariant under the Kirby move KI, we define a linear map $\iota_n : A(\sqcup^\ell S^1 \cup \Gamma) \to A(\Gamma)$ as in [14]. The mapping $\iota_n$ replaces every circle component with $m$ end points of dashed lines by the diagram $T^n_m$ defined in [14]. We do not repeat the definition of $T^n_m$ here. However, by using the relation $P_n$, we can reduce any element of $\mathring{A}(\sqcup^\ell S^1 \cup \Gamma)/P_{n+1}, O_n$ to a linear sum of chord diagrams having at most $2n$ end points on every circle component, as in the proof of Lemma 3.1 in [14]. For such diagrams, $\iota_n$ replaces the circle components by $T^n_{2m}$, which is equal to the sum of diagrams in the relation $P_n$. The map $\iota_n$ induces a map from the quotient space $\mathring{A}(\sqcup^\ell S^1 \cup \Gamma)/P_{n+1}, L_{<2n}, O_n, D_{>n}$ to the quotient space $\mathring{A}(\Gamma)/P_{n+1}, O_n, D_{>n}$. In the same way as in the proof of Lemma 3.4 of [14], we have

Lemma 2.3. The identity map induces an isomorphism from the quotient space $A(\Gamma)/D_{>n}$ to the quotient space $\mathring{A}(\Gamma)/P_{n+1}, O_n, D_{>n}$.

Hence, by the above proposition and the lemma, we have

Proposition 2.4. The equivalence class $[\iota_n \check Z(L \cup G)]$ is invariant under a change of orientation of $L$ and under the Kirby move KII; recall that by the Kirby move KII we mean the handle slide move over any component of $L$.

Let $\sigma_+(L)$ and $\sigma_-(L)$ be the numbers of positive and negative eigenvalues of the linking matrix of $L$ respectively. The disjoint union of chord diagrams defines an algebra structure on $A(\phi)$ and an action of $A(\phi)$ on $A(\Gamma)$, which makes $A(\Gamma)$ an $A(\phi)$-module. These structures induce the same structures on the quotient spaces by the relation $D_{>n}$. Let $U_+$ and $U_-$ be trivial knots


with $\pm 1$ framing respectively. Then, as shown in [14], the equivalence class $[\iota_n \check Z(U_\pm)]$ in $A(\phi)/D_{>n}$ is invertible. We put

$$\Omega_n(M, G) = [\iota_n \check Z(U_+)]^{-\sigma_+(L)}\, [\iota_n \check Z(U_-)]^{-\sigma_-(L)}\, [\iota_n \check Z(L \cup G)] \in A(\Gamma)/D_{>n}.$$

Then $\Omega_n(M, G)$ does not depend on the choice of the link $L$ presenting $M$, and so we get the following.

Theorem 2.5. Let $M$ be a 3-manifold and $G$ an embedding of an oriented trivalent framed graph $\Gamma$ in $M$. Then $\Omega_n(M, G)$ in $A(\Gamma)/D_{>n}$ is a topological invariant of the pair $(M, G)$.

2.3. Universal quantum invariant for graphs embedded in a 3-manifold. We define a coalgebra structure on $A(\sqcup^\ell S^1 \cup \Gamma)$ by the coproduct

$$\hat\Delta : A(\sqcup^\ell S^1 \cup \Gamma) \to A(\sqcup^\ell S^1 \cup \Gamma) \otimes A(\sqcup^\ell S^1 \cup \Gamma)$$

defined as follows. Let $D$ be a chord diagram on $\sqcup^\ell S^1 \cup \Gamma$, and $C$ the set of connected graphs of chords in $D$, say $C = \{c_1, c_2, \ldots, c_k\}$. We put

$$\hat\Delta(D) = \sum_{I \subset \{1, \ldots, k\}} D_I \otimes D_{\{1, \ldots, k\} \setminus I}.$$

Here $D_I$ denotes the chord diagram on $\sqcup^\ell S^1 \cup \Gamma$ with chords $c_i$ ($i \in I$). As in the case $\Gamma = \phi$ treated in [14], $\hat\Delta$ induces a linear mapping

$$\hat\Delta_{n_1, n_2} : A(\Gamma)/D_{>n_1+n_2} \to A(\Gamma)/D_{>n_1} \otimes A(\Gamma)/D_{>n_2}.$$

We also know that

Proposition 2.6. Let $L$ be a link and $G$ an embedding of a graph $\Gamma$ in $S^3$ disjoint from $L$. Then the element $\check Z(L \cup G)$ in $A(\sqcup^\ell S^1 \cup \Gamma)$ is group-like, i.e. $\hat\Delta(\check Z(L \cup G)) = \check Z(L \cup G) \otimes \check Z(L \cup G)$.

From this proposition, we have

$$\hat\Delta_{n_1, n_2}(\Omega_{n_1+n_2}(M, G)) = \Omega_{n_1}(M, G) \otimes \Omega_{n_2}(M, G). \qquad (2.1)$$
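The subset sum defining the coproduct can be made concrete with a small combinatorial sketch. This is only an illustration under simplifying assumptions: a chord diagram is modelled as nothing more than the tuple of labels of its connected chord graphs $c_1, \ldots, c_k$, so the support and all relations in $A(\sqcup^\ell S^1 \cup \Gamma)$ are ignored, and the function name `coproduct_terms` is hypothetical.

```python
from itertools import combinations

def coproduct_terms(components):
    """List the pairs (D_I, D_complement) over all subsets I of the chord components.

    `components` is a sequence of labels c_1, ..., c_k standing for the connected
    chord graphs of a diagram D (a toy model; the support graph is not represented).
    """
    k = len(components)
    terms = []
    for size in range(k + 1):
        for index_set in combinations(range(k), size):
            chosen = set(index_set)
            d_i = tuple(components[i] for i in range(k) if i in chosen)
            d_rest = tuple(components[i] for i in range(k) if i not in chosen)
            terms.append((d_i, d_rest))
    return terms

terms = coproduct_terms(["c1", "c2", "c3"])
for left, right in terms:
    print(left, "(x)", right)
```

For $k$ connected chord graphs the sum has $2^k$ terms, and every chord graph appears in exactly one tensor factor of each term, which is why the degrees of the two factors add up to the degree of $D$.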

Let $D_0$ be the chord diagram consisting of $\Gamma$ and no dashed graphs. Note that the degree 0 part of $A(\Gamma)$ is spanned by $D_0$. Hence we can write the degree 0 part of $\Omega_1(M, G)$ as $m D_0$ with a scalar $m$. For $x \in A(\Gamma)$, let $x^{(d)}$ denote the degree $d$ part of $x$. Then, as in Lemma 6.6 of [14], we have the following.

Lemma 2.7. (1) The above $m$ is equal to the order $|H_1(M, \mathbb{Z})|$ if $M$ is a rational homology 3-sphere, and to 0 otherwise.
(2) For $n < n'$, $p_{n', n}(\Omega_{n'}(M, G)^{(d)}) = m^{n' - n}\, \Omega_n(M, G)^{(d)}$, where $p_{n', n}$ is the natural projection from $A(\Gamma)/D_{>n'}$ to $A(\Gamma)/D_{>n}$.

Since the above lemma implies that the series $\Omega_n(M, G)$ is determined by the series $\Omega_n(M, G)^{(n)}$, we make the following definition.

Definition 2.8. We define a topological invariant $\Omega(M, G)$ and its modification $\hat\Omega(M, G)$ of a pair of a 3-manifold $M$ and an embedding $G$ of an oriented trivalent framed graph $\Gamma$ in $M$, as follows. We call them universal quantum invariants of the pair $(M, G)$.
(1) We put $\Omega(M, G) = D_0 + \sum_{n=1}^{\infty} \Omega_n(M, G)^{(n)} \in A(\Gamma)$, where $D_0$ is given above.


(2) When $M$ is a rational homology 3-sphere, we put

$$\hat\Omega(M, G) = D_0 + \sum_{n=1}^{\infty} |H_1(M, \mathbb{Z})|^{-n}\, \Omega_n(M, G)^{(n)} \in A(\Gamma),$$

where $|H_1(M, \mathbb{Z})|$ denotes the order of the first homology group of $M$.

2.4. Properties of $\hat\Omega(M, G)$. Let $M$ be a rational homology sphere and $G$ an embedding of a graph $\Gamma$ as before. The invariant $\hat\Omega(M, G)$ defined in the previous subsection is a generalization of the universal quantum invariant in [14] for a 3-manifold and of the universal Vassiliev-Kontsevich invariant for framed links in [11], as follows. Let $\Omega(M)$ be the invariant of $M$ defined in [14] and $\hat\Omega(M) = \sum_d |H_1(M, \mathbb{Z})|^{-d}\, \Omega(M)^{(d)}$. Let $\hat\varepsilon$ be the map $A(\Gamma) \to A(\phi)$ defined as follows. If $D$ is a disjoint union of $D_0$ (given above) and an element of $A(\phi)$, we put $\hat\varepsilon(D)$ to be that element of $A(\phi)$, and we put $\hat\varepsilon(D) = 0$ otherwise.

Proposition 2.9. (1) For an embedded framed graph $G$ in $S^3$, $\hat\Omega(S^3, G) = \hat Z(G)$.
(2) For an empty graph, $\hat\Omega(M, \phi) = \hat\Omega(M)$.
(3) By applying $\hat\varepsilon$, we get $\hat\varepsilon(\hat\Omega(M, G)) = \hat\Omega(M)$.

Further, by (2.1) we have

Proposition 2.10. The formula $\hat\Delta(\hat\Omega(M, G)) = \hat\Omega(M, G) \otimes \hat\Omega(M, G)$ holds.

Furthermore, by Proposition 1.5 we have

Proposition 2.11. Let $G$ be an embedding in a 3-manifold $M$ of an oriented trivalent framed graph $\Gamma$, and $E$ an edge of $\Gamma$.
(1) Let $\Gamma'$ be the oriented trivalent graph obtained from $\Gamma$ by reversing the orientation of the edge $E$, and $G'$ the corresponding embedding of $\Gamma'$. Then we have $\hat\Omega(M, G') = S_{(E)}(\hat\Omega(M, G))$.
(2) Let $\Gamma''$ be the trivalent graph obtained from $\Gamma$ by removing the edge $E$, and $G''$ the corresponding embedding of $\Gamma''$. We suppose that the remaining two edges around each vertex of $E$ have inward and outward orientation. Then we have $\hat\Omega(M, G'') = \varepsilon_{(E)}(\hat\Omega(M, G))$.

3. Correction for the Framing Anomaly

3.1. Linking matrices of links in rational homology 3-spheres. In this section we define a linking matrix of a link in a rational homology 3-sphere, and we show that it is a symmetric matrix; note that we need this property when considering the signature of the matrix in the following sections.

Let $K_1 \cup K_2$ be a link with two components in a rational homology 3-sphere $M$. We define a linking number of $K_1$ and $K_2$ as follows. Since $H_1(M, \mathbb{Z})$ is a finite group, the homology class represented by $K_1$ is of finite order, say, of order $m$. We can find a surface $F$ in $M$ such that its boundary covers $K_1$ $m$ times and it intersects $K_2$ transversely. We define the linking number of $K_1$ and $K_2$ to be $1/m$ times the intersection number of $F$ and $K_2$; we denote the number by the usual notation $\mathrm{lk}(K_1, K_2)$.

Lemma 3.1. The formula $\mathrm{lk}(K_1, K_2) = \mathrm{lk}(K_2, K_1)$ holds.


Proof. We also choose a surface $F'$ in $M$ such that its boundary covers $K_2$ $m'$ times and it intersects $K_1$ and $F$ transversely. The intersection of $F$ and $F'$ is a compact oriented 1-manifold whose boundary covers $K_1 \cap F'$ $m$ times and $F \cap K_2$ $-m'$ times; this implies that the 0-dimensional cycle $m[K_1 \cap F'] - m'[F \cap K_2]$ is a boundary, i.e. vanishes in $H_0(M, \mathbb{Z})$. Hence we obtain the required equality.

We consider the linking matrix of a framed link in a rational homology 3-sphere with the linking number defined as above; the $(i, j)$ entry of the matrix is the linking number of the $i$th and $j$th components of the link, and the $(i, i)$ entry is the linking number of the two boundaries of the $i$th component, where we express the framing by a ribbon in $M$. Note that the linking matrix is symmetric by the above lemma.

When the rational homology 3-sphere $M$ is given by Dehn surgery along a framed link in $S^3$, we can compute the linking matrix as follows. Let $L \cup L'$ be a framed link in $S^3$. Suppose that $M$ is obtained by Dehn surgery on $S^3$ along $L'$. We put the linking matrix of $L \cup L'$ to be $\begin{pmatrix} A & B \\ {}^tB & A' \end{pmatrix}$, where $A$ and $A'$ are the linking matrices of $L$ and $L'$ respectively. Note that the determinant of $A'$ is not equal to zero, since $M$ is a rational homology 3-sphere. We also denote by $L$ the image of $L$ in $M$.

Lemma 3.2. For the framed link $L$ in $M$ given as above, the linking matrix of $L$ in $M$ is obtained from the matrix $A$ by elementary transformations on the linking matrix of $L \cup L'$ which make the blocks $B$ and ${}^tB$ zero while keeping the block $A'$ unchanged.

Proof. This lemma can be proved by elementary calculation. For simplicity we show the calculation assuming that each of $L$ and $L'$ has one component. In this case the linking matrix of $L \cup L'$ is $\begin{pmatrix} a & b \\ b & a' \end{pmatrix}$. Let $K_1$ and $K_2$ be the two boundaries of a ribbon expressing the framing of $L$ in $M$. We will show that the linking number of $K_1$ and $K_2$ in $M$ is equal to $a - b^2/a'$.
Following the definition of the linking number, we can find a surface $F$ in $M$ whose boundary covers $K_1$ $a'$ times, since the order of $H_1(M; \mathbb{Z})$ is $a'$. We regard $F$ as lying in $S^3 - N(L')$, where $N(L')$ denotes a tubular neighborhood of $L'$ in $S^3$. The boundaries of $F$ in $\partial N(L')$ are $b$ parallel copies of the framing of $L'$, because they are homologous to the set of the other boundaries, that is, $a'$ copies of $K_1$. The intersection number of $F$ and $K_2$ is equal to the ordinary linking number of $\partial F$ and $K_2$. In this case $\partial F$ is the union of $a'$ copies of $K_1$ and $b$ copies of $L'$. Noting the signs from the orientations, we have $\mathrm{lk}(\partial F, K_2) = a'\,\mathrm{lk}(K_1, K_2) - b\,\mathrm{lk}(L', K_2) = aa' - b^2$. (We omit the calculation of the signs, but we can easily check the signs when $L'$ is a trivial knot with framing $a' = 1$ as follows. In this case we obtain $L$ in $M$ by a $-1$ Dehn twist along $L'$, and the above linking number must be $a - b^2$.) By the definition of the linking number, we obtain the required linking number of $K_1$ and $K_2$ in $M$.

3.2. Correction for the framing anomaly in TQFT. Let $M$ be a compact 3-manifold with boundaries; in each boundary a set of meridians and framings is fixed, i.e. a set of disjoint pairs of simple closed curves with geometric intersection 1 in the boundary; we call the two curves of a pair a meridian and a framing. For $M$ with a set of meridians and framings in its boundaries, we denote by $\hat M$ the closed 3-manifold, with a graph $G$ embedded in it, obtained by filling the boundaries of $M$ with handle bodies; in each handle body there is a standard graph embedded as shown in Fig. 7, and we denote the disjoint union of those graphs by $G$. We call a graph as in Fig. 7 a chain graph. (To be precise, a chain graph is framed and oriented as shown in Fig. 8.) We naturally choose meridians and framings in the boundary of the handle body; they are obtained as meridians and framings of loops in $G$; we regard a chain as consisting of loops and segments. In the above filling, we glue each handle body attaching meridians to meridians, and framings to framings. Note that we can recover $M$ from $\hat M$ by removing a tubular neighborhood of $G$.
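Lemma 3.2 of Sect. 3.1 can be read as a Schur complement: clearing the blocks $B$ and ${}^tB$ while keeping $A'$ fixed replaces $A$ by $A - B A'^{-1}\,{}^tB$, which reduces to $a - b^2/a'$ in the one-component case treated in the proof. The following Python sketch computes this with exact rational arithmetic. The function names are illustrative and do not come from the paper, and the reading of the lemma as a Schur complement is our interpretation of the elementary transformations described there.

```python
from fractions import Fraction


def inverse(m):
    """Invert a square matrix over the rationals by Gauss-Jordan elimination."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            # det A' = 0 would mean M is not a rational homology sphere.
            raise ValueError("A' is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def linking_matrix_after_surgery(a, b, a_prime):
    """Return A - B A'^{-1} B^T, read here as the linking matrix of L in M."""
    inv = inverse(a_prime)
    rows, inner = len(a), len(a_prime)
    result = []
    for i in range(rows):
        row = []
        for j in range(rows):
            correction = sum(b[i][p] * inv[p][q] * b[j][q]
                             for p in range(inner) for q in range(inner))
            row.append(Fraction(a[i][j]) - correction)
        result.append(row)
    return result


# One-component case of the proof: a = 3, b = 2, a' = 5 gives a - b^2/a' = 11/5.
example = linking_matrix_after_surgery([[3]], [[2]], [[5]])
```

The result is symmetric whenever $A$ and $A'$ are, in agreement with Lemma 3.1.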

Fig. 7. A chain graph embedded in a handle body

Consider two 3-manifolds $M_1$ and $M_2$ with fixed meridians and framings in their boundaries. We assume that $M_1$ and $M_2$ are rational homology 3-spheres, and that each of them has a boundary of the same genus; let $f$ be a homeomorphism of the boundary of $M_1$ to that of $M_2$. Let $\hat M_1$ and $\hat M_2$ be the closed 3-manifolds obtained by filling the boundaries of $M_1$ and $M_2$ as above. In each of $\hat M_1$ and $\hat M_2$ there is a disjoint union of chain graphs; we denote them by $G_1$ and $G_2$. Let $A_i$ be the linking matrix of the loops in $G_i$ embedded in $\hat M_i$ in the sense of the previous section. We define the matrix $A$ by attaching $A_1$ and $A_2$ as

$$A = \begin{pmatrix} A_1 & \\ & A_2 \end{pmatrix}, \tag{3.1}$$

where we overlap $A_1$ and $A_2$ in the corresponding part attached by the map $f$; in the formula we take the sum in the overlapping part and we put 0 in the right upper and left lower parts.

Definition 3.3. For 3-manifolds $M_1$ and $M_2$ and an attaching map $f$ given as above, we define $\sigma_\pm(M_1, M_2; f)$ to be $\sigma_\pm(A) - \sigma_\pm(A_1) - \sigma_\pm(A_2)$, where $\sigma_+$ (resp. $\sigma_-$) in the latter formula means the number of positive (resp. negative) eigenvalues of the matrix.

4. Topological Field Theory for the Universal Quantum Invariant

4.1. Gluing formula for the universal quantum invariant. We consider a 3-manifold obtained by gluing two 3-manifolds $M_1$ and $M_2$ along a homeomorphism $f$ between two boundary surfaces of $M_1$ and $M_2$ of genus $g$. Let $\hat M_1$ and $\hat M_2$ be as in Sect. 3. There is an embedding of a chain graph $\Gamma_i$ of genus $g$ into $\hat M_i$ corresponding to the boundary. We deform $\Gamma_1$ and $\Gamma_2$ into the shapes shown in Fig. 8. In the figure, the meridians and longitudes corresponding to each other via the map $f$ are located in symmetric positions. We consider the linear map $f_\star$ of $\mathcal{A}(\Gamma_1) \otimes \mathcal{A}(\Gamma_2)$ to $\mathcal{A}(\sqcup^g S^1)$ defined as follows. We remove a straight horizontal line from each of $\Gamma_1$ and $\Gamma_2$ in Fig. 8, after moving all dashed ends on the line to the other part of $\Gamma_i$ using the relation in Fig. 1. We glue the remaining parts of $\Gamma_1$ and $\Gamma_2$; then we obtain $g$ loops and chord diagrams on them.
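As an illustration of Definition 3.3, the following Python sketch counts positive and negative eigenvalues of a rational symmetric matrix exactly and then forms $\sigma_\pm(A) - \sigma_\pm(A_1) - \sigma_\pm(A_2)$. Two points are assumptions of the sketch rather than statements of the paper: the helper `overlap_sum` supposes that the loops identified by $f$ are the last `k` loops of $G_1$ and the first `k` loops of $G_2$, and the input matrices in the example are made up. The eigenvalue count uses the characteristic polynomial (Faddeev-LeVerrier) together with Descartes' rule of signs, which is exact here because a real symmetric matrix has only real eigenvalues.

```python
from fractions import Fraction


def char_poly(m):
    """Coefficients of det(xI - m), highest degree first (Faddeev-LeVerrier)."""
    n = len(m)
    a = [[Fraction(x) for x in row] for row in m]
    aux = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        prod = [[sum(a[i][p] * aux[p][j] for p in range(n)) for j in range(n)]
                for i in range(n)]
        aux = [[prod[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
               for i in range(n)]
        trace = sum(a[i][p] * aux[p][i] for i in range(n) for p in range(n))
        coeffs.append(-trace / k)
    return coeffs


def sign_changes(seq):
    """Number of sign changes in a sequence, ignoring zero entries."""
    signs = [x > 0 for x in seq if x != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def inertia(m):
    """(number of positive, number of negative) eigenvalues of a symmetric matrix."""
    coeffs = char_poly(m)
    # Coefficients of p(-x) up to an overall sign.
    flipped = [c if i % 2 == 0 else -c for i, c in enumerate(coeffs)]
    return sign_changes(coeffs), sign_changes(flipped)


def overlap_sum(a1, a2, k):
    """Assumed layout: the last k loops of G1 are glued to the first k loops of G2."""
    n1, n2 = len(a1), len(a2)
    size = n1 + n2 - k
    total = [[Fraction(0)] * size for _ in range(size)]
    for i in range(n1):
        for j in range(n1):
            total[i][j] += a1[i][j]
    offset = n1 - k
    for i in range(n2):
        for j in range(n2):
            total[offset + i][offset + j] += a2[i][j]
    return total


def sigma_pm(a1, a2, k):
    """sigma_pm(A) - sigma_pm(A1) - sigma_pm(A2) for the overlapped matrix A."""
    pos, neg = inertia(overlap_sum(a1, a2, k))
    pos1, neg1 = inertia(a1)
    pos2, neg2 = inertia(a2)
    return pos - pos1 - pos2, neg - neg1 - neg2


# Made-up input: two 2x2 linking matrices sharing one loop.
sigma = sigma_pm([[1, 1], [1, 0]], [[0, 1], [1, -1]], 1)
```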

Fig. 8. Two chain graphs corresponding to two boundary surfaces. These graphs are framed by the blackboard framing in this picture

Lemma 4.1. The above map $f_\star : \mathcal{A}(\Gamma_1) \otimes \mathcal{A}(\Gamma_2) \to \mathcal{A}(\sqcup^g S^1)$ is well defined.

Proof. It is sufficient to show that the image of $f_\star$ does not depend on the way of moving dashed ends from the horizontal line in the above construction of $f_\star$. The ambiguity of the moving is expressed as the left hand side of the formula in Fig. 9. Hence we can reduce the proof to showing the formula. More generally we can show the formula in Fig. 10; for the proof of this formula, see also [13]. This formula is shown by moving the dashed horizontal lines upward. When the lines are located at any level, the sum of the terms obtained by attaching the dashed end to each section of the chord diagram “something” at that level is unchanged while the level moves; this is shown by using the defining relations of chord diagrams. After the level has moved sufficiently far, the section at the level becomes the empty set. This implies that the sum in the formula is equal to zero.

Fig. 9. The ambiguity of the moving of a dashed end

Theorem 4.2. As given in the beginning of this subsection, let $M_1$, $M_2$ be 3-manifolds and $f$ a homeomorphism between genus $g$ boundaries of $M_1$ and $M_2$. Then the invariant of the 3-manifold obtained by gluing $M_1$ and $M_2$ along the map $f$ is given by the following formula:

$$\Omega(M_1 \cup_f M_2) = \sum_{n=0}^{\infty} |H_1(\hat M_1; \mathbb{Z})|^n\, |H_1(\hat M_2; \mathbb{Z})|^n \times \Bigl( [\iota_n \check Z(U_+)]^{-\sigma_+(M_1, M_2; f)}\, [\iota_n \check Z(U_-)]^{-\sigma_-(M_1, M_2; f)}\, [\iota_n f_\star(\hat\Omega(\hat M_1, G_1) \otimes \hat\Omega(\hat M_2, G_2))] \Bigr)^{(n)},$$

Fig. 10. The proof that the ambiguity vanishes

recall that $U_\pm$ denotes the trivial knot with $\pm 1$ framing.

Proof. We express the 3-manifolds $M_1$ and $M_2$ by Dehn surgery along framed links as follows. Let $L_i \cup G_i$ be a union of a framed link $L_i$ and an embedding of a chain graph $\Gamma_i$ of genus $g$ into $S^3$ ($i = 1, 2$). Let $\hat M_i$ be the 3-manifold obtained by Dehn surgery along $L_i$. We obtain the 3-manifold $M_i$ again, as the 3-manifold obtained by removing a tubular neighborhood of $G_i$ from $\hat M_i$ for some $L_i \cup G_i$.

Let $f$ be a homeomorphism between boundaries of $M_1$ and $M_2$ taken as in the beginning of this subsection. Then, we can express the 3-manifold $M_1 \cup_f M_2$ by Dehn surgery along the framed link in $S^3$ given in Lemma 4.3 below. We denote the framed link by $L'_1 \cup L'_2 \cup L'$, where $L'$ is the framed link obtained from $G_1$ and $G_2$ and $L'_i$ denotes the framed link $L_i$ after attaching $G_1$ and $G_2$. Since $L'_1$ and $L'_2$ do not wind together, we obtain $\hat M_1 \# \hat M_2$ by Dehn surgery along $L'_1 \cup L'_2$. We also denote by $L'$ the framed link $L'$ after the surgery in $\hat M_1 \# \hat M_2$. Note that we obtain $M_1 \cup_f M_2$ from $\hat M_1 \# \hat M_2$ by Dehn surgery along $L'$.

Note that the linking matrix of the framed link $L'_1 \cup L'_2 \cup L'$ is equal to the matrix given in (3.1). By Lemma 3.2, the matrix is equivalent to the block sum of the linking matrices of $L_1$ and $L_2$ and the linking matrix of $L'$ in $\hat M_1 \# \hat M_2$. Hence $\sigma_\pm(M_1, M_2; f)$ is equal to the number of positive and negative eigenvalues of the linking matrix of $L'$. By using Lemma 4.4 below, we have

$$\hat\Omega(\hat M_1 \# \hat M_2, L') = f_\star(\hat\Omega(\hat M_1) \otimes \hat\Omega(\hat M_2)).$$

Since we obtain $M_1 \cup_f M_2$ from $\hat M_1 \# \hat M_2$ by Dehn surgery along $L'$, we obtain the required formula by the Dehn surgery formula in Theorem 6.8 of [15].

Lemma 4.3. Under the notation given in the proof of Theorem 4.2, the 3-manifold $M_1 \cup_f M_2$ is homeomorphic to the 3-manifold obtained by Dehn surgery along the framed link in $S^3$ obtained from $L_1 \cup G_1$ and $L_2 \cup G_2$ by removing the horizontal lines of $G_1$ and $G_2$ and gluing together, as shown in Fig. 11.

Proof. Deform the tubular neighborhood of each $G_i$ embedded in $S^3$ to a union of a big neighborhood of the horizontal line of $G_i$ and $g$ tunnels along the other part of $G_i$. We identify the complement of the big neighborhood with a 3-ball $B_i$. Instead of attaching all the boundaries along $f$, we first attach $B_1$ and $B_2$ along $f$. Then we obtain a 3-sphere with $g$ tunnels in it, whose picture is as shown in Fig. 11. Second, we attach the boundaries of tunnels along $f$; alternatively we attach solid tori along the tunnels, since an alternative way of attaching two arcs is to attach a disk along the circle consisting of the two arcs. This implies that we obtain $M_1 \cup_f M_2$ by Dehn surgery along the $g$ tunnels in $S^3$ obtained as above.

Fig. 11. The framed link obtained from $L_1 \cup G_1$ and $L_2 \cup G_2$

Lemma 4.4. Under the notation in the proof of Theorem 4.2, we have

$$f_\star(\check Z(L_1 \cup G_1) \otimes \check Z(L_2 \cup G_2)) = \check Z(L'_1 \cup L'_2 \cup L');$$

recall that we obtain the link $L'_1 \cup L'_2 \cup L'$ from $L_1 \cup G_1$ and $L_2 \cup G_2$ as shown in Fig. 11.

Proof. By definition of the invariant $\check Z$, we have the invariant of an embedding $G_1$ of a chain graph as partially shown in the left picture of Fig. 12. By relations around a trivalent vertex defined in Fig. 1, we can push up the set of $a$ as shown in Fig. 12. In the formula, we put $\Delta^{(2)} = (\mathrm{id} \otimes \Delta) \circ \Delta$, $\Delta^{(3)} = (\mathrm{id} \otimes \mathrm{id} \otimes \Delta) \circ \Delta^{(2)}$, and $(\Delta^{(2)})_2$ and $(\Delta^{(3)})_2$ denote $\Delta^{(2)}$ and $\Delta^{(3)}$ acting on the right string of $a^{-1/2}$. Further, we have a similar formula for an embedding $G_2$; the formula is obtained by reflecting the picture in Fig. 12 with respect to a horizontal line, by replacing $a$ with $b$, and by reversing all orientations. Following the definition of $f_\star$, we attach the pictures for $G_1$ and $G_2$. By using Lemma 4.5 below, we can change the order of terms for $a$ and $b$, and obtain the left picture in Fig. 13. Since we have the formula in Fig. 14 proved in [13], the terms $a$ and $b$ cancel together as shown in Fig. 13. Recall that we added $\nu$ to each component when defining the invariant $\check Z$ from $\hat Z$. Hence the right formula in Fig. 13 is equal to $\check Z(L'_1 \cup L'_2 \cup L')$; the vertical lines in the formula mean parts of $L'$.

For simplicity, we omitted the argument for the orientation. To be precise, we should have added the antipode $S$ at the places of upward orientation in the formulas used in this proof. However, note that, even if we add any antipode in the formulas, we obtain the same required formula, since $S(\nu) = \nu$ holds.

Let $X_n$ be the set of $n$ ordered vertical lines. Note that $\mathcal{A}(X_n)$ has an algebra structure by gluing two chord diagrams vertically. We define the map $\Delta^{(k)} : \mathcal{A}(X_1) \to \mathcal{A}(X_{k+1})$ by $\Delta^{(k)} = (\mathrm{id}^{\otimes(k-1)} \otimes \Delta) \circ \Delta^{(k-1)}$. We denote by $(\Delta^{(k)})_2 : \mathcal{A}(X_2) \to \mathcal{A}(X_{k+2})$ the action of $\Delta^{(k)}$ on the second line.

Lemma 4.5. With the above notation, for any elements $D \in \mathcal{A}(X_2)$ and $D' \in \mathcal{A}(X_{k+1})$, the element $(\Delta^{(k)})_2(D)$ commutes with $1 \otimes D'$ in $\mathcal{A}(X_{k+2})$.

Fig. 12. Pushing up the set of $a$

Fig. 13. The set of $a$ and $b$ cancel together

Fig. 14. A relation of $a$ and $b$

Proof. We obtain the lemma by elementary calculation in the same way as the argument used in the proof of the formula in Fig. 10.

Remark. When we attach two boundaries of one 3-manifold, we have a formula similar to that of Theorem 4.2, as follows. We consider a 3-manifold $M_1$ with two boundary surfaces of the same genus, and the 3-manifold obtained from $M_1$ by gluing the two boundaries along a homeomorphism $f$. Let $\hat M_1$ be as in Sect. 3. There is an embedding of two chain graphs $\Gamma_1$ and $\Gamma_2$ into $\hat M_1$ corresponding to the two boundaries. We express $\hat M_1$ by Dehn surgery along a framed link $L_1$ in $S^3$. Then we also have embeddings $G_1$ and $G_2$ of $\Gamma_1$ and $\Gamma_2$ in $S^3$ in the expression. We have the following gluing formula with a suitably defined $f_\star$ and $\sigma_\pm(M_1; f)$, by using the change of framed links shown in Fig. 15 instead of the change shown in Fig. 11:

$$\Omega(\cup_f M_1) = \sum_{n=0}^{\infty} |H_1(\hat M_1; \mathbb{Z})|^n \Bigl( [\iota_n \check Z(U_+)]^{-\sigma_+(M_1; f)}\, [\iota_n \check Z(U_-)]^{-\sigma_-(M_1; f)}\, [\iota_n f_\star(\Omega(M_1))] \Bigr)^{(n)}.$$

The detailed arguments are left to the reader.

Fig. 15. The framed link obtained from $L_1 \cup G_1 \cup G_2$

4.2. Topological quantum field theory for the universal quantum invariant. We denote by $\mathcal{M}$ the set of compact 3-manifolds $M$ such that the rank of $H_1(\partial M; \mathbb{Q})$ is equal to twice the rank of $H_1(M; \mathbb{Q})$. We consider the sub-category of the category of 3-cobordisms such that the set of morphisms of the sub-category is $\mathcal{M}$. Our aim is to construct a functor $F$ of the sub-category to the category of modules over the commutative ring $\mathcal{A}(\phi)$ and show that the functor satisfies modified axioms of TQFT. As in the introduction, we denote $F(\Sigma)$ by $V(\Sigma)$ for a closed surface $\Sigma$ and denote $F(M)$ by $Z(M) \in V(\partial M)$ for a 3-manifold $M$ with its boundaries $\partial M$.

We put $V(\Sigma)$ as follows. Suppose that $\Sigma$ is connected. Then, when we fix a set of meridians and framings of $\Sigma$, we can regard $\Sigma$ as the boundary of the handle body with the chain graph embedded in it as shown in Fig. 7. We define $V(\Sigma)$ to be $\mathcal{A}(\Gamma)$. More precisely, we define $V(\Sigma)$ to be a vector space such that we have an isomorphism of $\mathcal{A}(\Gamma)$ to $V(\Sigma)$ for each fixed set of meridians and framings of $\Sigma$. If $\Sigma$ consists of $m$ connected components $\Sigma_1, \Sigma_2, \dots, \Sigma_m$, then we define $V(\Sigma)$ to be $\mathcal{A}(\Gamma_1 \sqcup \Gamma_2 \sqcup \cdots \sqcup \Gamma_m)$ by considering $\Gamma_i$ for each $\Sigma_i$ in the same way as above. Instead of the axiom (A1), we have a natural map $V(\Sigma_1) \otimes V(\Sigma_2) \to V(\Sigma_1 \sqcup \Sigma_2)$, which takes $D_1 \otimes D_2$ to the disjoint union of $D_1$ and $D_2$ for chord diagrams $D_1$ and $D_2$. Note that we take the tensor product over the ring $\mathcal{A}(\phi)$ here, and we put $V(\phi) = \mathcal{A}(\phi)$ for the empty surface $\phi$.

We put $Z(M)$ as follows. Let $M$ be a 3-manifold in $\mathcal{M}$. Then, we obtain a rational homology 3-sphere $\hat M$ by filling the (possibly disconnected) boundary of $M$ with handle bodies as in Sect. 3 for a suitable choice of the set of meridians and framings. We have an embedding $G$ of a set of chain graphs $\Gamma$ in $\hat M$. We have $\hat\Omega(\hat M, G) \in \mathcal{A}(\Gamma)$ defined in Sect. 2. Through the above identification of $\mathcal{A}(\Gamma)$ with $V(\Sigma)$, we define $Z(M) \in V(\Sigma)$ to be $\hat\Omega(\hat M, G)$.

We define the map $\iota : \mathcal{A}(\sqcup^l S^1) \to \mathcal{A}(\phi)$ by

$$\iota(D) = \begin{cases} \iota_{\deg(D)/(l+1)}(D) & \text{if } \deg(D) \text{ is divisible by } l + 1, \\ 0 & \text{otherwise,} \end{cases}$$

where $\deg(D)$ denotes the degree of $D$. We define the map $D^{(k)} : \mathcal{A}(X) \to \mathcal{A}(X)$ by $D^{(k)}(D) = k^{\deg(D)} D$. By Theorem 4.2, we have

Lemma 4.6. The following formula holds:

$$\hat\Omega(M_1 \cup_f M_2) = D^{(\alpha)}\, c_+^{-\sigma_+(M_1, M_2; f)}\, c_-^{-\sigma_-(M_1, M_2; f)}\, D^{(1/\beta)}\bigl( \iota \circ f_\star(\hat\Omega(\hat M_1, G_1) \otimes \hat\Omega(\hat M_2, G_2)) \bigr),$$

where we put

$$\alpha = |H_1(\hat M_1; \mathbb{Z})| \cdot |H_1(\hat M_2; \mathbb{Z})| \,/\, |H_1((M_1 \cup_f M_2)^{\wedge}; \mathbb{Z})|,$$
$$\beta = \bigl( \iota_1 \circ f_\star(\hat\Omega(\hat M_1, G_1) \otimes \hat\Omega(\hat M_2, G_2)) \bigr)^{(0)},$$
$$c_\pm = D^{(\mp 1)}\, \iota(\check Z(U_\pm)).$$

Here, we denote by $D^{(0)}$ the degree 0 part (or its coefficient of the empty diagram) of $D$.

Proof. Since

$$\hat\Omega(M_1 \cup_f M_2) = \sum_{n=0}^{\infty} |H_1((M_1 \cup_f M_2)^{\wedge}; \mathbb{Z})|^{-n}\, \Omega(M_1 \cup_f M_2)^{(n)},$$

it is sufficient to show the following formula,

$$[\iota_n \check Z(U_+)]^{-\sigma_+(M_1, M_2; f)}\, [\iota_n \check Z(U_-)]^{-\sigma_-(M_1, M_2; f)}\, [\iota_n f_\star(\hat\Omega(\hat M_1, G_1) \otimes \hat\Omega(\hat M_2, G_2))] = c_+^{-\sigma_+(M_1, M_2; f)}\, c_-^{-\sigma_-(M_1, M_2; f)}\, D^{(\beta)}\bigl( \iota \circ f_\star(\hat\Omega(\hat M_1, G_1) \otimes \hat\Omega(\hat M_2, G_2)) \bigr).$$

Noting that $(\iota_1(\check Z(U_\pm)))^{(0)} = \mp 1$, the above formula is derived from the following formula:

$$a^{-n} \cdot \iota_n(\omega) = D^{(1/a)} \circ \iota(\omega)$$

for a group-like element $\omega \in \mathcal{A}(\sqcup^l S^1)$ and the scalar $a = (\iota_1(\omega))^{(0)}$; for the definition of a group-like element, see [15]. We prove the formula in each degree $m \le n$ as follows:

$$\bigl( D^{(1/a)} \circ \iota(\omega) \bigr)^{(m)} = a^{-m}\, \iota(\omega)^{(m)} = a^{-m}\, \iota_m(\omega)^{(m)} = a^{-n}\, \iota_n(\omega)^{(m)},$$

where the last equality is derived from Lemma 4.6 in [15].

Instead of axiom (A2), we have a map $\iota \circ f_\star : V(\Sigma) \otimes V(-\Sigma) \to \mathcal{A}(\phi)$. We denote the map by $\langle\ ,\ \rangle_\Sigma$. As for the orientation of $\Sigma$, when $\Sigma$ is the boundary of a tubular neighborhood of $\Gamma_1$ in Fig. 8 embedded in a 3-manifold, we regard the same boundary for $\Gamma_2$ as $-\Sigma$; note that the two boundaries have opposite orientations derived from the sets of meridians and framings. Then, instead of axiom (A4), we have

$$Z(M_1 \cup_f M_2) = D^{(\alpha)}\, c_+^{-\sigma_+(M_1, M_2; f)}\, c_-^{-\sigma_-(M_1, M_2; f)}\, D^{(1/\beta)} \langle Z(M_1), Z(M_2) \rangle_\Sigma,$$

where the notation for the correction terms is given in Lemma 4.6.

As for the correction maps $D^{(\alpha)}$ and $D^{(1/\beta)}$ in the formula, $\alpha$ and $\beta$ are values derived from homology. In fact, $\alpha$ is given using orders of homology groups. Further, $\beta$ is equal to a scalar multiple of the determinant of the linking matrix of a link obtained from $G_1$ and $G_2$ as in the proof of Theorem 4.2. Unlike the case of TQFT for quantum invariants at a root of unity, our ring $\mathcal{A}(\phi)$ has a grading, and formulas for the universal quantum invariant often include correction terms with respect to the grading. That is the technical reason why we need the correction maps $D^{(\alpha)}$ and $D^{(1/\beta)}$.

The correction terms $c_\pm^{-\sigma_\pm(M_1, M_2; f)}$ are due to the framing anomaly in TQFT. We expect that the terms should be removed by defining a suitable framing of a 3-manifold; see [1, 6] for attempts to define the framing.

References

1. Atiyah, M.F.: On framings of 3-manifolds. Topology 29, 1–7 (1990)
2. Atiyah, M.F.: Topological quantum field theories. Publ. Math. IHES 68, 175–186 (1989)
3. Atiyah, M.F., Hitchin, N., Lawrence, R. and Segal, G.: Oxford seminar on Jones-Witten theory (1988)
4. Bar-Natan, D.: On the Vassiliev knot invariants. Topology 34, 423–472 (1995)
5. Bar-Natan, D.: Non-associative tangles. Harvard University, preprint
6. Blanchet, C., Habegger, N., Masbaum, G. and Vogel, P.: Topological quantum field theories derived from the Kauffman bracket. Topology 34, 883–927 (1995)
7. Kirby, R.C.: A calculus for framed links in S^3. Invent. Math. 65, 35–56 (1978)
8. Kontsevich, M.: Vassiliev’s knot invariants. Advances in Soviet Mathematics 16, 137–150 (1993)
9. Le, T.T.Q., Murakami, H., Murakami, J. and Ohtsuki, T.: A three-manifold invariant derived from the universal Vassiliev-Kontsevich invariant. Proc. Japan Acad. 171, 125–127 (1995)
10. Le, T.T.Q., Murakami, H., Murakami, J. and Ohtsuki, T.: A three-manifold invariant via the Kontsevich integral. Preprint. MPI/95-62, 1995
11. Le, T.T.Q. and Murakami, J.: Representations of the category of tangles by Kontsevich’s iterated integral. Commun. Math. Phys. 168, 535–562 (1995)
12. Le, T.T.Q. and Murakami, J.: The universal Vassiliev-Kontsevich invariant for framed oriented links. Preprint, Max-Planck-Institut für Mathematik, Bonn
13. Le, T.T.Q. and Murakami, J.: Parallel version of the universal Vassiliev-Kontsevich invariant. J. Pure Appl. Algebra, to appear
14. Le, T.T.Q., Murakami, J. and Ohtsuki, T.: An invariant of 3-manifolds derived from the universal Vassiliev-Kontsevich invariant. Preprint, 1995
15. Le, T.T.Q., Murakami, J. and Ohtsuki, T.: On a universal quantum invariant of 3-manifolds. Preprint, 1996
16. Reshetikhin, N. and Turaev, V. G.: Invariants of 3-manifolds via link polynomials and quantum groups. Invent. Math. 103, 547–597 (1991)
17. Yamada, S.: An invariant of spatial graphs. J. Graph Theory 13, 531–551 (1989)

Communicated by H. Araki

Commun. Math. Phys. 188, 521 – 534 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Intersection Numbers on Moduli Spaces and Symmetries of a Verlinde Formula

R. Herrera*, S. Salamon

Mathematical Institute, 24–29 St Giles, Oxford OX1 3LB, United Kingdom

Received: 5 April 1996 / Accepted: 6 February 1997

Abstract: We investigate the geometry and topology of a standard moduli space of stable bundles on a Riemann surface, and use a generalization of the Verlinde formula to derive results on intersection pairings.

1. Introduction

Let $\mathcal{M}_g$ denote the smooth moduli space of stable holomorphic rank 2 vector bundles with fixed determinant of odd degree over a Riemann surface $\Sigma_g$ of genus $g$. The space $\mathcal{M}_g$ has the structure of a complex $(3g-3)$-dimensional Kähler manifold whose anticanonical bundle is the square of an ample line bundle $L$ [25, 22]. The dimensions $h^0(\mathcal{M}_g, \mathcal{O}(L^j))$ are independent of the complex structure on $\Sigma_g$ and were predicted in [30]. In this paper, we highlight additional calculations that arise from the Desale-Ramanan description [7] of $\mathcal{M}_g$ for the case in which $\Sigma_g$ is hyperelliptic. Our approach is based on the proof of the Verlinde formula by Szenes [27], and grew out of an attempt to generalize the related twistor geometry studied by the second author in [24].

Universal cohomology classes $\alpha, \beta, \gamma$ were defined on $\mathcal{M}_g$ by Newstead [21] and used to compute the Chern character of the holomorphic tangent bundle $T = T^{1,0}\mathcal{M}_g$. The latter can be expressed in terms of a tautological rank $g - 1$ bundle $Q$ with the help of the Adams operator $\psi^2$, and we show in Sect. 2 that the appearance of this algebraic gadget leads to quick proofs of both the equation $\beta^g = 0$ and the recurrence relation for the Chern classes of $Q$. Underlying this theory is the fact that $\beta$ coincides with the pullback of the canonical quaternion-Kähler class on a real Grassmannian, providing a link with the index theory in [17]. This approach has a number of simplifying features, though in other ways Sect. 2 is a supplement to the papers of Baranovsky [2] and Siebert-Tian [26]. These authors, together with Zagier [33] and King-Newstead [15], have provided a variety of methods for determining the cohomology ring of $\mathcal{M}_g$.

In Sect. 3 we further exploit Adams operators and the fundamental role played by $Q$ by computing the holomorphic Euler characteristics

$$V_{g-1}(p, q) = \chi(\mathcal{M}_g, \mathcal{O}(\psi^{p-q} Q \otimes L^{q-1}))$$

for all $p, q \in \mathbb{Z}$, thereby extending the Verlinde formula (corresponding to $p = q$) into a 2-dimensional array. The symmetries of the title are those enjoyed by the integers $V_{g-1}(p, q)$ in the $(p, q)$-plane. The main purpose of Sect. 4 is to show that these symmetries can be used to recover the intersection numbers $\langle \alpha^m \beta^n \gamma^p, [\mathcal{M}_g] \rangle$, using calculations similar to those of Thaddeus [28] and Donaldson [8]. This formulation in turn provides an alternative route to the Bertram-Szenes proof [4] of the ‘untwisted’ Verlinde formula for the moduli space of semistable rank 2 bundles with fixed determinant of even degree.

We end up encoding the topology of $\mathcal{M}_g$ into equations involving Chern characters that are particularly easy to write down and remember. For example, it is known that $\chi(\mathcal{M}_g, \mathcal{O}(T^*)) = g - 1$, and it follows from the Riemann-Roch theorem that $\langle \mathrm{ch}(\tilde T)\,\mathrm{td}(T), [\mathcal{M}_g] \rangle = 0$, where $\tilde T = T^* - g + 1$. We prove a stronger vanishing theorem, namely that $\mathrm{ch}(\tilde T)e^{\alpha}$ evaluates to zero when paired with any power of $\beta$. The virtual bundle $\tilde T$ is a natural one to consider since it has virtual rank $2g - 2$ and $c_i(\tilde T) = 0$ for $i > 2g - 2$ by [11]. Although our methods are special to the rank 2 case, the first author has found analogous results on certain moduli spaces of orthogonal bundles of arbitrary rank, including closed expressions for Verlinde-type formulae [13].

* The first author is supported by a scholarship from DGAPA, National University of Mexico

2. Tangent Relations

Let $\Sigma_g$ be a hyperelliptic curve of genus $g$, admitting a double-covering $\Sigma_g \to \mathbb{CP}^1$ with distinct branching points $\omega_1, \dots, \omega_{2g+2}$. Desale and Ramanan proved in [7] that the manifold $\mathcal{M}_g$ defined in Sect. 1 can then be realized as the variety of $(g-1)$-dimensional subspaces of $\mathbb{C}^{2g+2}$ isotropic with respect to the two quadratic forms

$$\sum_{i=1}^{2g+2} y_i^2, \qquad \sum_{i=1}^{2g+2} \omega_i y_i^2. \tag{1}$$

One therefore obtains a holomorphic embedding of $\mathcal{M}_g$ into the complex homogeneous space

$$\mathcal{F}_g = \frac{SO(2g+2)}{U(g-1) \times SO(4)}$$

parametrizing $(g-1)$-dimensional subspaces which are isotropic with respect to the first quadratic form [27]. Let $Q$, $W$ denote the duals of the tautological complex vector bundles over $\mathcal{F}_g$ with fibres $\mathbb{C}^{g-1}$, $\mathbb{C}^4$ and structure groups $U(g-1)$, $SO(4)$ respectively. (The notation $Q$ is consistent with [26], and from now on we shall often drop the subscript $g$ since the genus will generally be fixed in our discussion.) The decomposition of the standard representation of $SO(2g+2)$ on $\mathbb{C}^{2g+2}$ under $U(g-1) \times SO(4)$ provides the equation

$$Q^* \oplus Q \oplus W = 2g + 2, \tag{2}$$


where the right-hand side denotes a trivial vector bundle of rank $2g + 2$. The second form in (1) now determines a non-degenerate section $s$ of the symmetrized tensor product $S^2Q$, and the zero set of $s$ coincides with $\mathcal{M}$. The holomorphic tangent bundle $T^{1,0}\mathcal{F}$ of $\mathcal{F}$ is determined by the summand $\mathfrak{m}$ in the Lie algebra splitting

$$\mathfrak{so}(2g+2)^c \cong (\mathfrak{u}(g-1) \oplus \mathfrak{so}(4))^c \oplus \mathfrak{m} \oplus \overline{\mathfrak{m}},$$

where $c$ denotes complexification. Given that $\mathfrak{so}(2g+2)^c \cong \Lambda^2 \mathbb{C}^{2g+2}$, it follows from (2) that we may choose the orientation so that

$$T^{1,0}\mathcal{F} \cong \Lambda^2 Q \oplus (Q \otimes W).$$

This equation is well known in the context of twistor spaces, since $\mathcal{F}$ is a 3-symmetric twistor space that fibres over the oriented real Grassmannian

$$\mathcal{G}_g = \mathrm{Gr}_4(\mathbb{R}^{2g+2}) = \frac{SO(2g+2)}{SO(2g-2) \times SO(4)} \tag{3}$$

for $g \ge 3$ [5, 23]. The term $\Lambda^2 Q$ is simply the holomorphic tangent bundle to the Hermitian symmetric fibres $SO(2g-2)/U(g-1)$, and its complement $Q \otimes W$ corresponds to the holomorphic horizontal bundle that plays an important role in the theory of harmonic maps [6]. With the above choices, the normal bundle of $\mathcal{M}$ in $\mathcal{F}$ is isomorphic to $S^2Q$, and

$$T^{1,0}\mathcal{F}|_{\mathcal{M}} \cong T^{1,0}\mathcal{M} \oplus S^2Q|_{\mathcal{M}}.$$

In the notation of K-theory, we may write

$$T = T^{1,0}\mathcal{M} = \Lambda^2 Q + QW - S^2Q,$$

where from now on we are using the same symbols to denote bundles pulled back to $\mathcal{M}$. Writing $\psi^2 = S^2 - \Lambda^2$, we have

Lemma 2.1. $T = QW - \psi^2 Q$.

The operator $\psi^2$ is one of the series of Adams operators, defined by the formula

$$\sum_{p \ge 0} (\psi^p E)\, t^p = r - t \frac{d}{dt} \log \Lambda_{-t} E,$$

where $E \in K(\mathcal{M})$ has virtual rank $r$ and $\Lambda_t E = \sum_{i \ge 0} (\Lambda^i E)\, t^i$ [10]. Each $\psi^p$ is a ring homomorphism in K-theory, and is characterized by the property that

$$\mathrm{ch}_k(\psi^p E) = p^k\, \mathrm{ch}_k(E), \tag{4}$$

where $\mathrm{ch}_k(E)$ denotes the term of dimension $2k$ in the Chern character. We shall use the operators $\psi^p$ with $p \ge 3$ in Sect. 3.


Cohomology classes

$$\alpha \in H^2(\mathcal{M}, \mathbb{Z}), \qquad \beta \in H^4(\mathcal{M}, \mathbb{Z}), \qquad \gamma \in H^6(\mathcal{M}, \mathbb{Z}) \tag{5}$$

were introduced by Newstead [20, 1]. They are obtained from the Künneth components of the characteristic class $c_2(\mathcal{V})$, where $\mathcal{V}$ is a universal $SO(3)$ bundle over $\mathcal{M}$, and generate the ring $H_I^*(\mathcal{M})$ of cohomology classes of $\mathcal{M}$ invariant by the action of the mapping class group on $H^3(\mathcal{M})$. By expressing $T$ in terms of a push-forward of $\mathcal{V}$, Newstead obtained the following result, which we take as given and is effectively the definition of (5) for our purposes:

Theorem 2.1. [21, Theorem 2]

$$\mathrm{ch}(T) = 3g - 3 + 2\alpha + \sum_{k \ge 2} \frac{\mathrm{ch}_k}{k!}, \qquad \text{where} \quad \begin{cases} \mathrm{ch}_{2k-1} = 2\alpha\beta^{k-1} - 8(k-1)\gamma\beta^{k-2}, \\ \mathrm{ch}_{2k} = 2(g-1)\beta^k. \end{cases}$$

As an application of Lemma 2.1 and (2), we see that the complexification of the real tangent bundle of $\mathcal{M}$ is isomorphic to

$$\begin{aligned} T + T^* &= (Q^* + Q)W - \psi^2(Q^* + Q) \\ &= (2g + 2 - W)W - (2g + 2 - \psi^2 W) \\ &= (2g + 2)(W - 1) - W^2 + \psi^2 W. \end{aligned} \tag{6}$$

The Chern character of this may be read off and then compared with Theorem 2.1 (for this purpose it helps not to replace $-W^2 + \psi^2 W$ by the equivalent $-2\Lambda^2 W$). An easy calculation gives

$$\mathrm{ch}(T + T^*) = 6g - 6 + 2(g-1)\,p_1(W) + \tfrac{1}{6}\bigl[ (g-1)\,p_1(W)^2 - 2(g+5)\,p_2(W) \bigr] + \cdots \tag{7}$$

where the Pontrjagin classes are defined by regarding $W$ as an $SO(4)$-bundle. Now (7) must equal twice the sum of the even terms of $\mathrm{ch}(T)$, so $p_1(W) = \beta$, $p_2(W) = 0$ and

$$\mathrm{ch}(W) = 2 + e^{\sqrt{\beta}} + e^{-\sqrt{\beta}} \tag{8}$$

on the moduli space $\mathcal{M}$. Since $Q^* + Q = 2g + 2 - W$ is a genuine complex vector bundle of rank $2g - 2$ with total Chern class

$$c(W)^{-1} = \sum_{k=0}^{\infty} \beta^k$$

on $\mathcal{M}$, we get Conjecture (a) of [21]:

Theorem 2.2. $\beta^g = 0$.

This was first proved by Kirwan [16], who established the completeness of the Mumford relations on $H_I^*(\mathcal{M})$. It also follows from results in [14, 32], and a different proof was given by Weitsman [31] in the more general setting of moduli spaces of flat connections over a Riemann surface with marked points.
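The step from (8) to Theorem 2.2 can be spelled out as follows. This is only a reading of the argument above, working with rational coefficients and using two standard facts that are assumptions of this note rather than statements of the paper: the total Chern class of a complexified real bundle is $1 - p_1 + p_2 - \cdots$, and the Chern classes of a complex vector bundle vanish above its rank.

```latex
% c(W) = 1 - p_1(W) + p_2(W) for the complexified SO(4)-bundle W,
% with p_1(W) = \beta and p_2(W) = 0 on \mathcal{M}.
\begin{align*}
c(W) &= 1 - \beta, \\
c(Q^* \oplus Q) &= c(W)^{-1} = \sum_{k \ge 0} \beta^k
  && \text{by (2)}, \\
c_{2k}(Q^* \oplus Q) &= \beta^k = 0 \quad \text{for } 2k > 2g - 2
  && \text{since } \operatorname{rank}(Q^* \oplus Q) = 2g - 2 .
\end{align*}
% The first vanishing case, k = g, is the statement \beta^g = 0.
```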


From Lemma 2.1 and Theorem 2.1 one may readily compute the Chern character of $Q$ in terms of the classes (5). From this point of view the definition of $\mathcal{M}$ as a submanifold of $\mathcal{F}$ could not be simpler, as the terms $\Lambda^2 Q$ and $S^2 Q$ ‘miraculously’ combine into a form specifically adapted for computing characters. The result is

Theorem 2.3.

$$\mathrm{ch}(Q) = g - 1 + \alpha + \sum_{k \ge 2} \frac{s_k}{k!}, \qquad \text{where} \quad \begin{cases} s_{2k-1} = \alpha\beta^{k-1} + 2\gamma\beta^{k-2}, \\ s_{2k} = -\beta^k. \end{cases}$$

Proof. Let $\mathrm{ch}_k$, $s_k$ denote the components of $\mathrm{ch}(T)$, $\mathrm{ch}(Q)$, respectively, in dimension $2k$. Using Lemma 2.1, (4) and (8),

$$3g - 3 + \sum_{k \ge 1} \frac{\mathrm{ch}_k}{k!} = 2\Bigl( g - 1 + \sum_{k \ge 1} \frac{s_k}{k!} \Bigr)\Bigl( 2 + \sum_{k \ge 1} \frac{\beta^k}{(2k)!} \Bigr) - \Bigl( g - 1 + \sum_{k \ge 1} \frac{2^k s_k}{k!} \Bigr).$$

The result now follows from Theorem 2.1 by induction on $k$.
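The identity used in the proof of Theorem 2.3 can be checked mechanically in low degrees. The sketch below treats $\alpha, \beta, \gamma$ as free commuting variables of weights 1, 2, 3 (so it ignores every relation in the cohomology ring, which is an assumption of the sketch), builds $\mathrm{ch}(T)$, $\mathrm{ch}(Q)$ and $\mathrm{ch}(W)$ from Theorem 2.1, Theorem 2.3 and (8), and verifies $\mathrm{ch}(T) = \mathrm{ch}(Q)\,\mathrm{ch}(W) - \psi^2 \mathrm{ch}(Q)$ up to a chosen truncation degree. All function names are illustrative.

```python
from fractions import Fraction
from math import factorial

N = 10                # compare all components ch_k with k <= N
WEIGHTS = (1, 2, 3)   # alpha, beta, gamma contribute 1, 2, 3 to the index k


def degree(mono):
    return sum(e * w for e, w in zip(mono, WEIGHTS))


def add(p, q, sign=1):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}


def mul(p, q):
    """Product of two polynomials, truncated above degree N."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(x + y for x, y in zip(m1, m2))
            if degree(mono) <= N:
                out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def term(a, b, c, coeff):
    """coeff * alpha^a beta^b gamma^c."""
    return {(a, b, c): Fraction(coeff)}


def graded_series(rank, odd_gamma_coeff, even_coeff):
    """rank + sum_k x_k / k! with x_{2j-1}, x_{2j} of the shape in Theorems 2.1, 2.3."""
    total = term(0, 0, 0, rank)
    for k in range(1, N + 1):
        j = (k + 1) // 2
        if k % 2 == 0:
            x = term(0, j, 0, even_coeff)
        else:
            x = term(1, j - 1, 0, odd_gamma_coeff[0])
            if j >= 2:
                x = add(x, term(0, j - 2, 1, odd_gamma_coeff[1](j)))
        total = add(total, {m: c / factorial(k) for m, c in x.items()})
    return total


def ch_T(g):
    # ch_{2j-1} = 2 alpha beta^{j-1} - 8 (j-1) gamma beta^{j-2}, ch_{2j} = 2 (g-1) beta^j
    return graded_series(3 * g - 3, (2, lambda j: -8 * (j - 1)), 2 * (g - 1))


def ch_Q(g):
    # s_{2j-1} = alpha beta^{j-1} + 2 gamma beta^{j-2}, s_{2j} = -beta^j
    return graded_series(g - 1, (1, lambda j: 2), -1)


def ch_W():
    # 2 + exp(sqrt(beta)) + exp(-sqrt(beta)) = 4 + 2 sum_j beta^j / (2j)!
    total = term(0, 0, 0, 4)
    for j in range(1, N // 2 + 1):
        total = add(total, term(0, j, 0, Fraction(2, factorial(2 * j))))
    return total


def adams(p, n):
    """Property (4): psi^n multiplies the degree-k component by n^k."""
    return {m: c * n ** degree(m) for m, c in p.items()}


def lemma_2_1_holds(g):
    q = ch_Q(g)
    return ch_T(g) == add(mul(q, ch_W()), adams(q, 2), sign=-1)
```

Calling `lemma_2_1_holds(g)` for a few values of `g` compares both sides coefficient by coefficient; the components of degree 2 carry no information about $s_2$ because $4 - 2^2 = 0$, so in such a check the even classes $s_{2k}$ are taken from Theorem 2.3 rather than solved for.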

An analogue of the last equation can be found in [2], though the authors were led to it by the paper of Siebert and Tian [26], who give an equivalent expression for $\mathrm{ch}(Q)$. Theorem 2.3 leads very quickly to their recurrence relation for the Chern classes of $Q$. Using a standard trick [33], the generating function for the Chern classes of $Q$ is recaptured by the formula
\[ c(t) = \exp\Bigl[\sum_{k\ge 1}\frac{(-1)^{k-1}s_k\,t^k}{k}\Bigr] = \exp\Bigl[\alpha t + \sum_{n\ge 2}(\alpha\beta^{n-1}+2\gamma\beta^{n-2})\frac{t^{2n-1}}{2n-1} + \sum_{n\ge 1}\beta^{n}\frac{t^{2n}}{2n}\Bigr]. \tag{9} \]

The relation [26, Proposition 25], namely
\[ (1-\beta t^2)\,c'(t) = (\alpha + \beta t + 2\gamma t^2)\,c(t), \]
follows immediately, whence

Corollary 2.1. The Chern classes of the rank $g-1$ bundle $Q$ on $\mathcal{M}_g$ satisfy
\[ (k+1)\,c_{k+1} = \alpha\,c_k + k\beta\,c_{k-1} + 2\gamma\,c_{k-2}. \]

Zagier has shown that the $c_k$ coincide with the Künneth components of the Chern classes in $H^*(\mathcal{M}_g)\otimes H^*(J_g)$ of a higher rank bundle used to define the Mumford relations ($J_g$ is the Jacobian of $\Sigma_g$). The identities in $\alpha,\beta,\gamma$ arising from the equations $c_k = 0$ for $k = g, g+1, g+2$ then provide a minimal set of relations to completely determine the cohomology ring $H_I^*(\mathcal{M})$ [33, 2, 15, 26]. We have set out to show that these equations follow quite directly from Newstead's own computations, and it is worth pointing out that Corollary 2.1 is analogous to, but simpler than, the recurrence relation for the Chern classes of $T$ given at the end of [21].
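As a sanity check on the passage from the generating function (9) to the recurrence of Corollary 2.1, the following minimal sketch treats $\alpha,\beta,\gamma$ as formal commuting variables and substitutes arbitrary rational numbers for them (an assumption made purely for illustration; the function names are hypothetical). It expands $\exp$ of the series in (9) and compares the coefficients with those produced by the recurrence.

```python
from fractions import Fraction

def chern_coeffs_from_log(alpha, beta, gamma, order):
    """Coefficients c_0..c_order of c(t) = exp(L(t)), with L(t) the series in (9)."""
    # l[k] is the coefficient of t^k in log c(t).
    l = [Fraction(0)] * (order + 1)
    for k in range(1, order + 1):
        if k == 1:
            l[k] = Fraction(alpha)
        elif k % 2 == 1:                      # k = 2n - 1 with n >= 2
            n = (k + 1) // 2
            l[k] = (alpha * beta ** (n - 1) + 2 * gamma * beta ** (n - 2)) / Fraction(k)
        else:                                 # k = 2n with n >= 1
            l[k] = Fraction(beta) ** (k // 2) / k
    # c' = L' c  gives  k c_k = sum_{j=1..k} j l_j c_{k-j}.
    c = [Fraction(1)] + [Fraction(0)] * order
    for k in range(1, order + 1):
        c[k] = sum(j * l[j] * c[k - j] for j in range(1, k + 1)) / k
    return c

def chern_coeffs_from_recurrence(alpha, beta, gamma, order):
    """Coefficients from (k+1) c_{k+1} = alpha c_k + k beta c_{k-1} + 2 gamma c_{k-2}."""
    c = [Fraction(1)]
    for k in range(order):
        prev1 = c[k - 1] if k >= 1 else 0
        prev2 = c[k - 2] if k >= 2 else 0
        c.append((alpha * c[k] + k * beta * prev1 + 2 * gamma * prev2) / (k + 1))
    return c

a, b, g = Fraction(2, 3), Fraction(-5, 7), Fraction(3, 11)   # arbitrary test values
print(chern_coeffs_from_log(a, b, g, 8) == chern_coeffs_from_recurrence(a, b, g, 8))
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.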


The fact that the Pontrjagin ring of $\mathcal{M} = \mathcal{M}_g$ is generated by $p_1(\mathcal{M}) = 2(g-1)\beta$ can also be related to the geometry of the real Grassmannian (3). For $Q + Q^*$ and $W$ are (complexifications of) the pullbacks of real vector bundles $\hat U$, $\hat W$ over $G = G_g$, and the real tangent bundle of $G$ is isomorphic to $\hat U \otimes \hat W$. The choice of an orientation of $\hat W$ gives the manifold $G$ a quaternion-Kähler structure. The latter is characterized by a certain non-degenerate closed 4-form that arises from the curvature of a locally defined quaternionic line bundle $H$, and represents the integral cohomology class $4u \in H^4(G,\mathbb{Z})$, where $u = -c_2(H)$ [17].

Proposition 2.1. $\beta$ is the pullback of the class $4u$ by means of the embedding cum projection $\mathcal{M} \hookrightarrow F \to G$.

Proof. From above, $\beta$ is the pullback to $\mathcal{M}$ of $\hat\beta = p_1(\hat W)$. A calculation from [24] shows that
\[ p_2(\hat W) = (\hat\beta - 4u)^2 \in H^8(G,\mathbb{Z}). \]
Assuming that $g \ge 3$, $b_4(\mathcal{M}) = 2$ and so $\hat\beta - 4u$ must pull back to $a\alpha^2 + b\beta$ on $\mathcal{M}$ for some $a, b \in \mathbb{Z}$; from (8), $(a\alpha^2 + b\beta)^2 = 0$. However, the remarks after Corollary 2.1 imply that there are no non-trivial relations involving $\alpha^4$, $\alpha^2\beta$, $\beta^2$ in $H^8(\mathcal{M})$ except that
\[ 0 = -8\,\bigl(c_4(Q) - \alpha\,c_3(Q)\bigr) = \alpha^4 + 2\alpha^2\beta - 3\beta^2 \]
in genus 3. (There are actually four distinct quaternion-Kähler structures on $G_3 = SO(8)/(SO(4)\times SO(4))$, and Proposition 2.1 holds for only two of them.) It follows that in all cases $\beta = 4u$ in $H^4(\mathcal{M})$.

Any quaternion-Kähler manifold $M$ of dimension $4m$ is the base space of a complex manifold $Z$ (the more usual 'twistor space') fibred by rational curves. The positive integer
\[ \upsilon(M) = \langle (4u)^m, [M]\rangle = \frac{1}{2(m+1)^{2m+1}}\,c_1(Z)^{2m+1} \tag{10} \]
determines the 'quaternionic volume' of $M$, and can be expressed in terms of dimensions of representations of the isometry group, using techniques from [17]. For $M = G_g$ we have $m = 2g-2$, $Z \cong SO(2g+2)/(SO(2g-2)\times U(2))$, and one can prove that
\[ \upsilon(G_g) = \frac{2}{g}\binom{4g-3}{2g-1}; \]
by comparison $\upsilon(\mathbb{HP}^{2g-2}) = 4^{2g-2}$. Theorem 2.2 is in contrast to the non-degenerate nature of the 4-form over $G$, and reflects the failure of $\mathcal{M}$ to map onto a quaternionic subvariety of $G$.

3. Character Calculations

In this section, we set $h = g - 1$ and consider the holomorphic Euler characteristics
\[ V^h(p,q) = \chi\bigl(\mathcal{M}_{h+1}, \mathcal{O}(\psi^{p-q}Q \otimes L^{q-1})\bigr). \tag{11} \]
Given that $c_1(\mathcal{M}) = 2\alpha$ and the canonical bundle of $\mathcal{M}$ is isomorphic to $L^{-2}$, Serre duality implies that
\[ V^h(p,q) = (-1)^h\,V^h(-p,-q), \tag{12} \]
with the convention that $\psi^{-p}Q = \psi^{p}Q^*$. Following [27], we set $w = x + x^{-1} - 2$ and
\[ F(w,p) = \frac{(x^p - x^{-p})(x - x^{-1})}{x + x^{-1} - 2} = \sum_{h\ge 0}\left[\binom{p+h-1}{2h-1} + 4\binom{p+h}{2h+1}\right] w^h, \tag{13} \]
in order to define
\[ \frac{F(w,p)}{F(w,q)^2} = \sum_{k\ge 0} G_k(p,q)\,w^k. \]
The next theorem expresses (11) in terms of this generating function.

Theorem 3.1. Let $h \ge 2$. Then $V^h(p,0) = 0$ for all $p$, and
\[ V^h(p,q) = 4(-4q)^h\bigl(p(h+1)\,G_h(q,q) - q\,G_h(p,q)\bigr), \qquad p > 0. \]
In particular, $V^h(p,q) + V^h(-p,q) = 0$ for all $p, q \in \mathbb{Z}$.

The resulting symmetries of $V^h(p,q)$ are illustrated schematically in Fig. 1.

Corollary 3.1. Let $c$ be an integer such that $0 \le c \le 2 + (g-2)/q$. Then

\[ V^h(cq,q) = c\sum_{j=1}^{2q-1}(-1)^{j+1}\bigl(h+1-(-1)^{j(c+1)}\bigr)\left(\frac{q}{\sin^2(j\pi/2q)}\right)^{h}. \tag{14} \]
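A numerical spot check of (14) is possible in the lowest case $h = 1$ ($g = 2$), where the text notes that $Q \cong L$ and $\mathcal{M}_2$ is an intersection of two quadrics in $\mathbb{CP}^5$. The sketch below is not part of the original argument. It assumes that the sum in (14) runs over $j = 1,\dots,2q-1$ and that $L$ restricts from the hyperplane bundle, so that $V^1(cq,q) = \chi(\mathcal{M}_2,\mathcal{O}(cq-1))$ can be computed from the Koszul resolution. The function names are hypothetical.

```python
from math import pi, sin

def rhs_14(c, q, h):
    """Right-hand side of (14), read with the sum running over j = 1, ..., 2q - 1."""
    total = 0.0
    for j in range(1, 2 * q):
        sign = (-1) ** (j + 1)
        weight = h + 1 - (-1) ** (j * (c + 1))
        total += sign * weight * (q / sin(j * pi / (2 * q)) ** 2) ** h
    return c * total

def chi_p5(m):
    """Euler characteristic of O(m) on CP^5; the polynomial is valid for every integer m."""
    return (m + 1) * (m + 2) * (m + 3) * (m + 4) * (m + 5) // 120

def chi_two_quadrics(n):
    """chi(O(n)) on a smooth intersection of two quadrics in CP^5 (Koszul resolution)."""
    return chi_p5(n) - 2 * chi_p5(n - 2) + chi_p5(n - 4)

# For g = 2 the hypothesis of Corollary 3.1 allows c = 0, 1, 2.
for q in range(1, 6):
    for c in (0, 1, 2):
        print(q, c, round(rhs_14(c, q, 1)), chi_two_quadrics(c * q - 1))
```

The two printed columns should agree in every row; for $c = 1$ the values are the dimensions given by the Verlinde formula for sections of $L^{q-1}$.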

Setting $p = q$ in (11) gives $V^h(q,q) = h\,\dim H^0(\mathcal{M}, \mathcal{O}(L^{q-1}))$, since the higher cohomology spaces are zero by Kodaira vanishing and Serre duality. When $c = 1$, the right-hand side of (14) does indeed reduce to the Verlinde formula for the dimension of the space of sections of $L^{q-1}$. This was first deduced from fusion rules [30]; a direct proof was given by Szenes [27], though many other generalizations now exist [19, 14, 29, 3, 9]. Moreover, the right-hand side of (13) is essentially the generating function for the reciprocal of the Verlinde series. Note also that
\[ \frac{1}{h}\,V^h(1,1) = \chi(\mathcal{M},\mathcal{O}) = \langle \mathrm{td}(\mathcal{M}), [\mathcal{M}]\rangle = 1 \tag{15} \]
is the Todd genus of $\mathcal{M}$. Finally, when $h = 1$ we have $Q \cong L$ and it follows that $V^1(p,q) = V^1(p,p)$ for all $p, q$.

[Figure 1: schematic of the symmetries of $V^h(p,q)$ in the $(p,q)$-plane, with the points $(p,q)$, $(-p,q)$, $(p,-q)$, the Verlinde values and the zeros marked.]

Proof of Theorem 3.1. We follow closely Szenes' proof of the Verlinde formula in [27], and his use of

Lemma 3.1. [10, Proposition 4.3] Let $E$ be a vector bundle of rank $n$ over a smooth projective variety $X$, and let $i\colon M \to X$ be the zero locus of a non-degenerate section of $E$. Then
\[ \chi(M, \mathcal{O}(i^*U)) = \chi(X, \mathcal{O}(U \otimes \Lambda_{-1}E^*)) \]
for any vector bundle $U$ on $X$.

On a homogeneous space, holomorphic Euler characteristics can be computed by means of the Atiyah-Bott fixed point formula. Let $G$ be a reductive Lie group, $P$ a parabolic subgroup of $G$, and $F = G/P$ the corresponding flag manifold. A representation $R$ of $P$ determines both a holomorphic vector bundle $\mathcal{R} = G \times_P R$ over $F$ and a virtual $G$-module
\[ I(R) = \sum_i (-1)^i H^i(F, \mathcal{O}(\mathcal{R})). \]
Let $T$ be a common maximal torus of $P$ and $G$, let $W_G$, $W_P$ be the Weyl groups, and $W_r$ the relative Weyl group. The character of the $G$-module $I(R)$ is given by
\[ \mathrm{tr}(I(R)) = \sum_{w\in W_r} w\cdot\frac{\mathrm{tr}(R)}{\mathrm{tr}(\Lambda_{-1}A^*)}, \tag{16} \]
where $A$ is the $P$-module associated to the holomorphic tangent bundle $T^{1,0}F = \mathcal{A}$. The right-hand side of (16) is a function on $T$, and evaluation at the identity element yields $\mathrm{tr}(I(R))|_e = \chi(F, \mathcal{O}(\mathcal{R}))$.


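Returning to the problem at hand, let $H$ denote the subgroup $U(h)\times SO(4)$ of $G = SO(2g+2)$, where $h = g-1$. If $B$ denotes the fundamental representation of $U(h)$, then the vector bundles over $F = SO(2g+2)/H$ associated to $B^*$ and $\det B^*$ pull back to $Q$ and $L$ respectively over $\mathcal{M}$. Lemma 3.1 therefore implies that
\[ \chi(\mathcal{M}, \mathcal{O}(\psi^{p-q}Q\otimes L^{q-1})) = \chi(F, \mathcal{O}(\mathcal{R})), \]
where $\mathcal{R}$ is the bundle associated to the representation
\[ R = R_{p,q} = \psi^{p-q}B^* \otimes (\det B^*)^{q-1} \otimes \Lambda_{-1}(S^2B). \]
Now we proceed to calculate (16). Let $x_1,\dots,x_{g+1}$ be the characters of the maximal torus of $SO(2g+2)$ corresponding to the polarisation $\{y_{2j-1} + i\,y_{2j} : 1 \le j \le g+1\}$ of $\mathbb{C}^{2g+2}$. The character of the fundamental $SO(2g+2)$-module is $\sum_{j=1}^{h+2}(x_j + x_j^{-1})$, and that of the fundamental $U(h)$-module $\sum_{j=1}^{h} x_j^{-1}$. Thus,
\[ \mathrm{tr}(R_{p,q}) = \prod_{i\le h} x_i^{q-1} \prod_{1\le j\le k\le h}\Bigl(1 - \frac{1}{x_j x_k}\Bigr) \sum_{\ell=1}^{h} x_\ell^{p-q}, \]
and from [27],
\[ \mathrm{tr}(\Lambda_{-1}A^*) = \prod_{1\le i<j\le h}\Bigl(1 - \frac{1}{x_i x_j}\Bigr) \prod_{\substack{1\le k\le h\\ \varepsilon = 1,2}}\Bigl(1 - \frac{1}{x_{h+\varepsilon}x_k}\Bigr)\Bigl(1 - \frac{x_{h+\varepsilon}}{x_k}\Bigr), \]
so it suffices to prove that
\[ V^h(p,q) = \lim_{\{x_i\to 1\}} \sum_{w\in W_r} w\cdot\frac{\mathrm{tr}(R_{p,q})}{\mathrm{tr}(\Lambda_{-1}A^*)}. \tag{17} \]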

We have that
\[ \frac{\mathrm{tr}(R_{p,q})}{\mathrm{tr}(\Lambda_{-1}A^*)} = \left(\prod_{i\le h}\frac{x_i^{q}\,(x_i - x_i^{-1})}{(x_i + x_i^{-1} - x_{h+1} - x_{h+1}^{-1})(x_i + x_i^{-1} - x_{h+2} - x_{h+2}^{-1})}\right)\sum_{j\le h} x_j^{p-q}. \]
To perform the summation in (17) as in [27] we first recall the form of the relative Weyl group $W_r$ of $W_{SO(2h+4)}$ with respect to $W_{U(h)}$ and $W_{SO(4)}$. It has $2^h\binom{h+2}{2}$ elements, and
\[ W_r = W^{\mathrm{signs}} \rtimes W^{\mathrm{perms}}, \]
where $W^{\mathrm{signs}}$ consists of all the substitutions $x_i \mapsto x_i^{-1}$ of an even number of variables modulo $\{x_{h+1}\mapsto x_{h+1}^{-1},\ x_{h+2}\mapsto x_{h+2}^{-1}\}$, and $W^{\mathrm{perms}}$ consists of all the permutations of two variables. Adding up first with respect to $W^{\mathrm{signs}}$ we get
\[ \left(\prod_{i\le h}\frac{x_i - x_i^{-1}}{(x_i + x_i^{-1} - x_{h+1} - x_{h+1}^{-1})(x_i + x_i^{-1} - x_{h+2} - x_{h+2}^{-1})}\right)\sum_{j\le h}(x_j^{p} - x_j^{-p})\prod_{l\ne j}(x_l^{q} - x_l^{-q}), \]
setting $x_{h+2}\mapsto 1$ and then adding up with respect to $W^{\mathrm{perms}}$ gives a contribution
\[ \sum_{j=1}^{h+1}\left(\prod_{i\ne j}\frac{x_i - x_i^{-1}}{(x_i + x_i^{-1} - 2)(x_i + x_i^{-1} - x_j - x_j^{-1})}\right)\left(\sum_{k\ne j}(x_k^{p} - x_k^{-p})\prod_{\substack{\ell\ne j\\ \ell\ne k}}(x_\ell^{q} - x_\ell^{-q})\right). \]
Substituting $w_i = x_i + x_i^{-1} - 2$, and rearranging the terms converts the above summation into
\[ \Bigl(\sum_{j=1}^{h+1} F(w_j,p)\prod_{i\ne j}F(w_i,q)\Bigr)\sum_{j=1}^{h+1}\frac{(-1)^{j-1}\,\mathrm{Vm}(w_1,\dots,\widehat{w_j},\dots,w_{h+1})}{F(w_j,q)\,\mathrm{Vm}(w_1,\dots,w_{h+1})} \;-\; \Bigl(\prod_{i=1}^{h+1}F(w_i,q)\Bigr)\sum_{j=1}^{h+1}\frac{(-1)^{j-1}\,F(w_j,p)\,\mathrm{Vm}(w_1,\dots,\widehat{w_j},\dots,w_{h+1})}{F(w_j,q)^2\,\mathrm{Vm}(w_1,\dots,w_{h+1})}, \tag{18} \]
where $\mathrm{Vm}$ denotes the Vandermonde determinant. Since $\lim_{x\to 1}F(w,q) = 4q$, the first factors in both summands converge as the $x_i$ tend to 1. By [27, Lemma 5.3] the first summand tends to $(-4q)^h\,4p(h+1)\,G_h(q,q)$, and the second to $(-4q)^{h+1}G_h(p,q)$.

Proof of Corollary 3.1. The hypothesis on $c$ implies that the meromorphic form
\[ \frac{F(w,cq)\,dw}{F(w,q)^2\,w^{h+1}} \]
over $\mathbb{CP}^1$ has no poles at $0$ and $\infty$. The result is then a consequence of the residue theorem and [27, Lemma 5.3].

4. Further Relations

The identity
\[ V(p,0) = 0, \qquad p\in\mathbb{Z}, \tag{19} \]
of Theorem 3.1 is also an easy consequence of Theorems 2.2, 2.3 and the Hirzebruch-Riemann-Roch theorem. For the latter implies that
\[ V(p,0) = \langle \mathrm{ch}(\psi^pQ\otimes L^*)\,\mathrm{td}(\mathcal{M}), [\mathcal{M}]\rangle = \langle \mathrm{ch}(\psi^pQ)\,\hat A(\mathcal{M}), [\mathcal{M}]\rangle, \]
and the $\hat A$ class
\[ \hat A(\mathcal{M}) = \left(\frac{\tfrac12\sqrt{\beta}}{\sinh\tfrac12\sqrt{\beta}}\right)^{2g-2}, \tag{20} \]
readily computed from (6) and (8), is a polynomial in $\beta$.

The identity (20) was used by Thaddeus to show that the Verlinde formula (Corollary 3.1 with $c = 1$) determines the intersection form on $\mathcal{M}$. In [28, Eq. (30)], he gives the intersection numbers
\[ \langle \alpha^m\beta^n\gamma^p, [\mathcal{M}_g]\rangle = (-1)^{p-g}\,\frac{g!\,m!}{(g-p)!\,q!}\,2^{2g-2-p}\,(2^q - 2)\,B_q, \tag{21} \]
where $m + 2n + 3p = 3g - 3$, $q = m + p - g + 1$, and $B_q$ is the $q$th Bernoulli number (equal to $q!$ times the coefficient of $x^q$ in $x/(e^x - 1)$). Another key point in the argument is [28, Proposition (26)], namely that $\gamma$ is Poincaré dual to $2g$ copies of $\mathcal{M}_{g-1}$, so that
\[ \langle \alpha^m\beta^n\gamma^p, [\mathcal{M}_g]\rangle = 2g\,\langle \alpha^m\beta^n\gamma^{p-1}, [\mathcal{M}_{g-1}]\rangle, \qquad m + 2n + 3p = 3g - 3. \tag{22} \]


In this section, we shall show that the intersection numbers (21) are in fact determined by (19) and the identities
\[ V^h(0,p) = 0, \tag{23} \]
\[ V^h(p,p) + V^h(-p,p) = 0, \qquad p\in\mathbb{Z}, \tag{24} \]
that follow from Theorem 3.1, or rather its proof. Following closely the notation of Donaldson [8, Sect. 5], we set
\[ I_k^{(g)} = \frac{1}{(g-1+2k)!}\,\langle \alpha^{g-1+2k}\beta^{g-1-k}, [\mathcal{M}]\rangle, \]
and also
\[ I^{(g)}(t) = \sum_{k=0}^{g-1} I_k^{(g)}\,t^{2k}. \]

Theorem 4.1.
\[ I^{(g)}(t) = (-4)^{g-1}\,\frac{t}{\sinh t}. \]
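Theorem 4.1 can be compared directly with Thaddeus' formula (21) in the case $p = 0$. The sketch below is an illustration rather than part of the proof. It assumes that (21) contains the factor $(2^q - 2)B_q$ and that the identity of Theorem 4.1 is read modulo $t^{2g}$; the helper names are hypothetical. It expands $t/\sinh t$ with exact rational arithmetic and checks that both descriptions give the same pairings $\langle \alpha^{g-1+2k}\beta^{g-1-k}, [\mathcal{M}_g]\rangle$.

```python
from fractions import Fraction
from math import comb, factorial

def t_over_sinh_coeffs(n):
    """c_0..c_n with t/sinh(t) = sum_k c_k t^(2k), obtained by inverting sinh(t)/t."""
    s = [Fraction(1, factorial(2 * i + 1)) for i in range(n + 1)]
    c = [Fraction(1)]
    for k in range(1, n + 1):
        c.append(-sum(c[k - i] * s[i] for i in range(1, k + 1)))
    return c

def bernoulli_numbers(n):
    """B_0..B_n for the generating function x/(e^x - 1), so B_1 = -1/2."""
    b = [Fraction(1)]
    for m in range(1, n + 1):
        b.append(-sum(comb(m + 1, j) * b[j] for j in range(m)) / (m + 1))
    return b

def pairing_from_theorem(g, k):
    """<alpha^(g-1+2k) beta^(g-1-k), [M_g]> as predicted by Theorem 4.1."""
    m = g - 1 + 2 * k
    return factorial(m) * (-4) ** (g - 1) * t_over_sinh_coeffs(k)[k]

def pairing_from_21(g, k):
    """The same pairing from formula (21) with p = 0, so that q = 2k."""
    m, q, p = g - 1 + 2 * k, 2 * k, 0
    sign = -1 if (p - g) % 2 else 1
    ratio = Fraction(factorial(g) * factorial(m), factorial(g - p) * factorial(q))
    return sign * ratio * 2 ** (2 * g - 2 - p) * (2 ** q - 2) * bernoulli_numbers(q)[q]

for g in range(2, 6):
    print(g, [pairing_from_theorem(g, k) == pairing_from_21(g, k) for k in range(g)])
```

For $g = 2$ the two pairings are $\langle\alpha\beta\rangle = -4$ and $\langle\alpha^3\rangle = 4$, the latter being the degree of the intersection of two quadrics in $\mathbb{CP}^5$.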

Proof. Interpreting (23) as a polynomial identity in $p$, using the Hirzebruch-Riemann-Roch theorem and (22), yields the equation
\[ t^2\,\frac{d}{dt}\left(\frac{I^{(g)}(t)\,\sinh t}{t}\right) = g\,(t - \sinh t)\bigl(I^{(g)}(t) + 4I^{(g-1)}(t)\bigr) \tag{25} \]
modulo $t^{2g}$. Similarly, (24) gives
\[ t^2\,\frac{d}{dt}\left(\frac{I^{(g)}(t)\,\sinh 2t}{t}\right) + t\,I^{(g)}(t)(1 - \cosh 2t) = g\,(2t - \sinh 2t)\bigl(I^{(g)}(t) + 4I^{(g-1)}(t)\bigr), \]
which can be simplified into
\[ 2t^2\,\frac{d}{dt}\left(\frac{I^{(g)}(t)\,\sinh t}{t}\right) = g\,(2t - \sinh(2t))\bigl(I^{(g)}(t) + 4I^{(g-1)}(t)\bigr). \tag{26} \]

Both sides of (25) and (26) must now be identically zero, and ignoring the modulo $t^{2g}$, $I^{(g)}(t) = C(g)\,t/\sinh t$ where $C(g)$ is a function of $g$ such that $C(g) + 4C(g-1) = 0$. But, using the description (1) of $\mathcal{M}_2$ as the intersection of two quadrics in $\mathbb{CP}^5$, we find that $C(2) = -\langle\alpha^3, [\mathcal{M}_2]\rangle = -4$.

It also follows now that $I^{(g)}(t) = (-4)^{g-1}K(t^2)$, where $K$ is the generating function of the intersection pairings described in [8]. Let $\mathcal{N}_g$ denote the moduli space of semistable rank 2 vector bundles with fixed even degree determinant over a compact Riemann surface $\Sigma_g$ of genus $g$, and let $L_0$ be the generator of $\mathrm{Pic}(\mathcal{N}_g)$. An easy application of the above knowledge of $I^{(g)}(t)$ now yields

Theorem 4.2. [4, Theorem 1.1]
\[ \dim H^0(\mathcal{N}_g, L_0^{k-2}) = \sum_{j=1}^{k-1}\left(\frac{k}{1 - \cos(2j\pi/k)}\right)^{g-1}. \]
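The formula of Theorem 4.2 is easy to evaluate numerically. The following sketch is illustrative only, with a hypothetical function name. It checks two standard facts that are not taken from the text above: for $g = 2$ the space $\mathcal{N}_2$ is $\mathbb{CP}^3$, so the dimension should be $\binom{k+1}{3} = \dim H^0(\mathbb{CP}^3, \mathcal{O}(k-2))$, and for $k = 3$ the formula should give $2^g$.

```python
from math import comb, cos, pi

def verlinde_even(g, k):
    """Right-hand side of Theorem 4.2: dim H^0(N_g, L_0^(k-2)) for k >= 2."""
    return sum((k / (1 - cos(2 * j * pi / k))) ** (g - 1) for j in range(1, k))

# Genus 2: compare with sections of O(k-2) on CP^3.
for k in range(2, 9):
    print(k, round(verlinde_even(2, k)), comb(k + 1, 3))

# k = 3 (sections of L_0 itself): compare with 2^g.
for g in range(1, 7):
    print(g, round(verlinde_even(g, 3)), 2 ** g)
```

The sum is computed in floating point, so the values are rounded before comparison; for large $g$ or $k$ an exact method would be preferable.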


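Proof. Let $U$ be a universal rank 2 bundle over $\mathcal{M}_g\times\Sigma_g$, so that if $m\in\mathcal{M}$ represents the bundle $E\to\Sigma$, then $U|_{\{m\}\times\Sigma}\cong E$; this may be chosen such that if $U_x = U|_{\mathcal{M}\times\{x\}}$ then $\det(U_x)\cong L$ over $\mathcal{M}$. Bertram and Szenes [4] use a Hecke correspondence to prove that
\[ \dim H^0(\mathcal{N}_g, L_0^{k}) = \chi(\mathcal{M}_g, S^kU_x). \]
Note that $c(U_x) = 1 + \alpha + \tfrac14(\alpha^2 - \beta)$, so that $c(U_x\otimes L^{-1/2}) = 1 - \beta/4$, and
\[ \mathrm{ch}\bigl(S^k(U_x\otimes L^{-1/2})\bigr) = \frac{\sinh((k+1)\sqrt{\beta}/2)}{\sinh(\sqrt{\beta}/2)} \]
(see [14]). Then we need to evaluate
\[ \left\langle \frac{\sinh((k+1)\sqrt{\beta}/2)}{\sqrt{\beta}/2}\; e^{(k+2)\alpha/2}\left(\frac{\sqrt{\beta}/2}{\sinh(\sqrt{\beta}/2)}\right)^{2g-1},\ [\mathcal{M}]\right\rangle. \]
Using Theorem 4.1, this is readily seen to equal
\[ \mathop{\mathrm{Res}}_{t=0}\left[\frac{k+2}{k+1}\left(\frac{k+2}{-2\sinh^2(t/(k+1))}\right)^{g-1}\frac{\sinh t\;dt}{\sinh((k+2)t/(k+1))\,\sinh(t/(k+1))}\right] \]
\[ = -\sum_{j=1}^{k+1}\mathop{\mathrm{Res}}_{u=\pi j}\left[\left(\frac{k+2}{1-\cos(2u/(k+2))}\right)^{g-1}\frac{\sin((k+1)u/(k+2))\;du}{\sin u\,\sin(u/(k+2))}\right] \]
\[ = \sum_{j=1}^{k+1}\left(\frac{k+2}{1-\cos(2\pi j/(k+2))}\right)^{g-1}. \]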

In the last part of this paper, we shall combine separate uses of the Adams operators, namely Lemma 2.1 and the identities (23), (24), to express the intersection form of the smooth moduli space $\mathcal{M} = \mathcal{M}_g$ in an alternative form. The bundle $Q$ was defined geometrically only in the hyperelliptic case, and to de-emphasise its role at this stage we consider in addition
\[ \tilde T = T^* - g + 1 \in K(\mathcal{M}). \]
Note that this has virtual rank $2g-2$ and vanishing higher Chern classes $c_i(\tilde T) = 0$ for $i > 2g-2$ by Gieseker's theorem [11, 33]. Let us say that a cohomology class $\delta\in H^*(\mathcal{M})$ is saturated if $\langle\delta\,\beta^j, [\mathcal{M}]\rangle = 0$ for all $j\ge 0$. Thus, any polynomial in $\beta$ is itself saturated. With this terminology,

Proposition 4.1. $\mathrm{ch}(Q^*\otimes L)$ and $\mathrm{ch}(\tilde T\otimes L)$ are both saturated, i.e.
(a) $\langle \mathrm{ch}(Q^*)\,e^{\alpha}\beta^j, [\mathcal{M}]\rangle = 0$, $j\ge 0$;
(b) $\langle \mathrm{ch}(\tilde T)\,e^{\alpha}\beta^j, [\mathcal{M}]\rangle = 0$, $j\ge 0$.

Proof. Since $\psi^p$ is a ring homomorphism in K-theory, (23) implies
\[ 0 = \langle \mathrm{ch}(\psi^pQ^*\otimes L^{p-1})\,\mathrm{td}(\mathcal{M}), [\mathcal{M}]\rangle = \langle \mathrm{ch}(\psi^p(Q^*\otimes L))\,\hat A(\mathcal{M}), [\mathcal{M}]\rangle. \]
Equation (a) follows from the fact that the identity above is true for all $p\in\mathbb{Z}$. Now consider the decomposition
\[ (T^* - g + 1)\otimes L = Q^*\otimes L\otimes W - (\psi^2Q^* + g - 1)\otimes L. \]
Equation (8) and part (a) imply that $\mathrm{ch}(Q^*\otimes L\otimes W)$ is saturated. It therefore suffices to show that $\mathrm{ch}((\psi^2Q^* + g - 1)\otimes L)$ is saturated, but given that
\[ \psi^p\bigl((\psi^2Q + g - 1)\otimes L^*\bigr) = \psi^{2p}Q\otimes L^{-p} + (g-1)L^{-p}, \]
this follows from (24).

The equations of Proposition 4.1 for $j\ge g-1$ follow immediately from Theorems 2.2 and 2.3, though taking $j = 0,\dots,g-2$ gives independent relations. For example, expanding (a) shows that
\[ 2k\,I_k^{(g)} + (g+2k)\sum_{i=1}^{k}\frac{I_{k-i}^{(g)}}{(2i+1)!} = -4g\sum_{i=1}^{k}\frac{I_{k-i}^{(g-1)}}{(2i+1)!} \tag{27} \]
for each $k$ with $1\le k\le g-1$, where we have used (22) to eliminate the $\gamma$'s. (The sign on the right-hand side is the one obtained by comparing coefficients of $t^{2k+1}$ in (25), and it is the one compatible with Theorem 4.1.) Together with (15), (27) forms a linearly independent system of equations which can be put into matrix form
\[ \bar I^{(g)}M = (\bar I^{(g-1)}, 1)\,N, \tag{28} \]
where $\bar I^{(g)} = (I_0^{(g)}, I_1^{(g)},\dots,I_{g-1}^{(g)})$ and $M$, $N$ are $g\times g$ matrices. The invertibility of $N$ is evident, and the invertibility of $M$ reduces to that of its bottom-right component
\[ \begin{pmatrix} \dfrac{3g-2}{3!} & \dfrac{2g-2}{4!} \\[1ex] 2g-2 & 1 \end{pmatrix}. \]
It follows that the intersection numbers (21) can be determined using (28) and induction on the genus $g$.

In view of the Riemann-Roch equation
\[ \chi(\mathcal{M}, T^*) - g + 1 = \langle \mathrm{ch}(\tilde T)\,e^{\alpha}\hat A(\mathcal{M}), [\mathcal{M}]\rangle \]
that follows from (15), Proposition 4.1(b) correctly predicts the coefficient $\chi(\mathcal{M}, T^*) = g - 1$ of $t$ in the polynomial $\chi_t = (1-t)^{g-1}(1+t)^{2g-2}$ computed in [18]. This observation suggests that there should exist more direct proofs of Proposition 4.1.

References

1. Atiyah, M.F., Bott, R.: Yang-Mills equations over Riemann surfaces. Phil. Trans. R. Soc. London 308, 523–615 (1982)
2. Baranovsky, V.Y.: The cohomology ring of the moduli space of stable bundles with odd determinant. Izv. Russ. Akad. Nauk 58, 204–210 (1994)
3. Beauville, A., Laszlo, Y.: Conformal blocks and generalized theta functions. Commun. Math. Phys. 164, 385–419 (1994)
4. Bertram, A., Szenes, A.: Hilbert polynomials of moduli spaces of rank 2 vector bundles II. Topology 32, 599–609 (1993)
5. Bryant, R.L.: Lie groups and twistor spaces. Duke Math J. 52, 223–261 (1985)
6. Burstall, F.E., Rawnsley, J.R.: Twistor theory for Riemannian symmetric spaces. Lect. Notes Math. 1424. Berlin–Heidelberg–New York: Springer, 1990
7. Desale, U.V., Ramanan, S.: Classification of vector bundles of rank 2 over hyperelliptic curves. Invent. Math. 38, 161–185 (1976)


8. Donaldson, S.K.: Gluing techniques in the cohomology of moduli spaces. In: Goldberg, L.R., Philips, A.V. (eds.) Topological Methods in Modern Mathematics. Houston: Publish or Perish, 1993, pp. 137–170
9. Faltings, G.: A proof for the Verlinde formula. J. Algebraic Geom. 3, 347–374 (1994)
10. Fulton, W., Lang, S.: Riemann-Roch Algebra. Berlin–Heidelberg–New York: Springer, 1985
11. Gieseker, D.: A degeneration of the moduli space of stable bundles. J. Differ. Geom. 19, 173–206 (1984)
12. Harder, G., Narasimhan, M.S.: On the cohomology groups of moduli spaces of vector bundles on curves. Math. Ann. 212, 215–248 (1978)
13. Herrera, R.: Intersection numbers on moduli spaces and symmetries of a Verlinde formula II. Preprint
14. Jeffrey, L.C., Weitsman, J.: Toric structures on the moduli space of flat connections on a Riemann surface. Adv. Math. 106, 151–168 (1994)
15. King, A.D., Newstead, P.E.: On the cohomology ring of the moduli space of rank 2 vector bundles on a curve. Preprint
16. Kirwan, F.C.: The cohomology rings of moduli spaces of bundles over Riemann surfaces. J. Am. Math. Soc. 5, 853–906 (1992)
17. LeBrun, C.R., Salamon, S.M.: Strong rigidity of positive quaternion-Kähler manifolds. Invent. Math. 118, 109–132 (1994)
18. Narasimhan, M.S., Ramanan, S.: Generalized Prym varieties as fixed points. J. Indian Math. Soc. 39, 1–19 (1975)
19. Narasimhan, M.S., Ramadas, T.R.: Factorization of generalized theta functions. Invent. Math. 114, 565–623 (1993)
20. Newstead, P.E.: Topological properties of some spaces of stable bundles. Topology 6, 241–262 (1967)
21. Newstead, P.E.: Characteristic classes of stable bundles over an algebraic curve. Trans. Am. Math. Soc. 169, 337–345 (1972)
22. Ramanan, S.: The moduli space of vector bundles over an algebraic curve. Math. Ann. 200, 69–84 (1973)
23. Salamon, S.: Harmonic and holomorphic maps. In: Seminar Luigi Bianchi II, Lect. Notes Math. 1164. Berlin–Heidelberg–New York: Springer, 1990, pp. 161–224
24. Salamon, S.M.: The twistor transform of a Verlinde formula. Riv. Mat. Univ. Parma 3, 143–157 (1994), dg-ga/9506003
25. Seshadri, C.S.: Space of unitary vector bundles on a compact Riemann surface. Ann. of Math. 85, 303–336 (1967)
26. Siebert, W., Tian, G.: Recursive relations for the cohomology ring of moduli spaces of stable bundles. Turkish J. Math. 19, No. 2, 131–144 (1995)
27. Szenes, A.: Hilbert polynomials of moduli spaces of rank 2 vector bundles I. Topology 32, 587–597 (1993)
28. Thaddeus, M.: Conformal field theory and the moduli space of stable bundles. J. Differ. Geom. 35, 131–149 (1992)
29. Thaddeus, M.: Stable pairs, linear systems and the Verlinde formula. Invent. Math. 117, 131–149 (1994)
30. Verlinde, E.: Fusion rules and modular transformations in 2d conformal field theory. Nucl. Phys. B 300, 360–376 (1988)
31. Weitsman, J.: Geometry of the intersection ring of the moduli space of flat connections and the conjectures of Newstead and Witten. Preprint, 1993
32. Witten, E.: On quantum gauge theories in 2 dimensions. Commun. Math. Phys. 140, 153 (1991)
33. Zagier, D.: On the cohomology of moduli spaces of rank two vector bundles over curves. In: R. Dijkgraaf, C. Faber, G. van der Geer (eds.) The Moduli Spaces of Curves. Progress in Math. 129, Boston–Basel–Berlin: Birkhäuser, 1995, pp. 533–563

Communicated by S.T. Yau

Commun. Math. Phys. 188, 535 – 564 (1997)

Communications in

Mathematical Physics © Springer-Verlag 1997

Algebra of Observables and Charge Superselection Sectors for QED on the Lattice

J. Kijowski 1, G. Rudolph 2, A. Thielmann 3

1 Center for Theoretical Physics, Polish Academy of Sciences, al. Lotników 32/46, 02-668 Warsaw, Poland
2 Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11, 04109 Leipzig, Germany
3 Center for Theoretical Physics, Polish Academy of Sciences, al. Lotników 32/46, 02-668 Warsaw, Poland

Received: 21 October 1996 / Accepted: 10 February 1997

Abstract: Quantum Electrodynamics on a finite lattice is investigated within the hamiltonian approach. First, the structure of the algebra of lattice observables is analyzed and it is shown that the charge superselection rule holds. Next, for every eigenvalue of the total charge operator a canonical irreducible representation is constructed and it is proved that all irreducible representations corresponding to a fixed value of the total charge are unique up to unitary equivalence. The physical Hilbert space is by definition the direct sum of these superselection sectors. Finally, lattice quantum dynamics in the Heisenberg picture is formulated and the relation of our approach to gauge fixing procedures is discussed.

Contents
1 Introduction
2 Second Quantization on the Lattice
3 Gauge Invariance, Constraints and Boundary Data
4 Algebra of Gauge Invariant Operators
5 Algebra of Observables and Tree Decomposition
6 Charge Superselection Rule
7 Uniqueness of Irreducible Representations and Charge Superselection Sectors
8 Wave Function Description of Physical States and Relation to the Gauge Fixing Approach
9 Local Observables and Lattice Quantum Hamiltonian
10 Towards Continuum Theory
A Appendix: Proof of Theorem 2

1. Introduction

This paper is a continuation of [4] and [5]. In [4] we have proved that the classical Dirac-Maxwell system can be reformulated in a spin-rotation covariant way in terms of gauge

invariant quantities; for earlier attempts to solve this problem see [6–9]. In [5] we have shown that it is possible to perform similar constructions on the level of the (formal) functional integral of Quantum Electrodynamics. For an application of these ideas to nonabelian Higgs models see [10] and for a similar approach to QCD see [16]. The result is a functional integral completely reformulated in terms of local gauge invariant quantities, which differs essentially from the effective functional integral obtained via the Faddeev-Popov procedure [11]. In this paper we will show that there exists a natural lattice version of QED in terms of gauge invariant quantities and that this formulation provides a natural framework for discussing basic structures like the observable algebra and charge superselection sectors. We also refer to [13–15], where nonabelian Higgs models and scalar QED have been formulated in terms of lattice gauge invariants.

There are relations to a series of papers, see [20–22], by Strocchi and Wightman, where these authors have analyzed the structure of gauge theories within the context of algebraic field theory in the sense of Haag and Kastler [18]. In particular, in [20] Quantum Electrodynamics was considered. It was shown that if one insists on locality and Lorentz covariance, one is rather naturally led to a theory with indefinite metric. Within this scheme, the charge superselection rule for QED was proven, but a decomposition of the physical Hilbert space into a direct sum of subspaces carrying definite total charge was not obtained. For a discussion of this decomposition within the context of theories which do not contain massless particles, see [19]. Some progress towards an implementation of these ideas for theories with massless particles has been made, see [23] and [24] and further references therein, but a complete solution of the above-mentioned problems for theories of this type has, to our knowledge, not been obtained until now. In [25] a concrete lattice model, namely a $\mathbb{Z}_2$-gauge theory with $\mathbb{Z}_2$-matter fields, was studied. In particular, the ground state and charged states were found. Using methods of Euclidean quantum field theory, the authors were able to show that, for some regions in the space of coupling constants, the thermodynamic limit for charged states could be controlled. This is a very interesting result, but it is doubtful whether these methods can be generalized to more realistic models.

In this paper we discuss some of the above-mentioned problems in the simplified setting of QED on a finite lattice. For basic notions concerning lattice gauge theories (including fermions) we refer to [26] and references therein. In the context of lattice approximation, complicated operator theoretic problems arising in (continuum) quantum field theory naturally disappear, whereas problems typical for gauge theories remain and can, therefore, be discussed separately. We consider QED in the hamiltonian approach on a finite cubic (3-dimensional) lattice and work in the so-called non-compact formulation, where the gauge potential remains Lie-algebra-valued on the lattice level. We stress, however, that a completely analogous analysis can be carried out within the more familiar, compact formulation, leading essentially to the results presented in this paper, with some slight modifications only. Our starting point is the naive Schrödinger representation of the classical commutation and anti-commutation relations for canonically conjugate pairs of lattice quantities. In this representation the generator of local gauge transformations is given by the "Gauss-law-operator".
Thus, a necessary and sufficient condition for gauge invariance of the wave function is that the Gauss law constraint should be fulfilled. However, wave functions fulfilling this constraint are not square integrable and, in principle, different (possibly physically inequivalent) Hilbert space structures, see [27], may be chosen. In the present paper we follow a different strategy: In a first step we construct the algebra of observables as the algebra of gauge invariant operators fulfilling the Gauss law and show that the charge superselection rule holds. Basically, the observable algebra is generated by electric and magnetic flux operators, together with gauge invariant operators bilinear in fermion fields. These gauge invariant operators fulfill a number of algebraic identities, which, however, become tractable by using the technical tool of a lattice tree. This tool enables us to classify all irreducible representations of the observable algebra and to obtain the physical Hilbert space as a direct sum of representation spaces labeled by the total charge. This is the main result of the present paper. We underline that these representations are explicitly constructed in purely algebraic terms. A posteriori, an interpretation in terms of gauge invariant wave functions turns out to be possible.

Our paper is organized as follows: In Sects. 2 and 3 we discuss the standard second quantization procedure on the lattice and, in some detail, gauge invariance and constraints. In Sect. 4 we analyze the structure of the algebra of gauge invariant operators and in Sect. 5 we show, using the technical tool of a lattice tree, how to implement the Gauss law. The algebra of observables turns out to be the tensor product of the electromagnetic part (finitely generated Heisenberg algebra) and a finite dimensional part generated by invariants built from (on-tree) bilinear combinations of the fermion fields (pair creation and annihilation operators). The latter invariants have to fulfill a number of relations and it is the main technical point of this paper to analyze this subalgebra. In Sect. 6 we show that the total charge defines a superselection rule in the observable algebra. In Sect. 7 we find all its irreducible representations (up to unitary equivalence) and prove that they are labeled by the eigenvalues of the total charge operator. In a next step, see Sect. 8, we show that the above-mentioned representations can be given an ordinary wave function interpretation and discuss relations with the gauge fixing approach. Finally, local observables and the lattice quantum Hamiltonian are briefly discussed and our philosophy of passing to the continuum limit is outlined.

2. Second Quantization on the Lattice

A classical continuum field configuration consists of a $U(1)$-gauge potential $A_\mu$ and a four-component spinor field $(\psi^a)$, where $a, b, \dots = 1, 2, 3, 4$ denote bispinor indices and $\mu, \nu, \dots = 0, 1, 2, 3$ spacetime indices. The classical Lagrangian of QED is given by
\[ L = -\frac14 F_{\mu\nu}F^{\mu\nu} - m\,\psi^{a*}\beta_{ab}\psi^{b} - \hbar\,\mathrm{Im}\bigl\{\psi^{a*}\beta_{ab}(\gamma^\mu)^b{}_c\,D_\mu\psi^c\bigr\}, \tag{2.1} \]
where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and $D_\mu\psi^a = \partial_\mu\psi^a + igA_\mu\psi^a$. The star denotes complex conjugation, $\beta_{ab}$ denotes the canonical Hermitian structure in bispinor space and $(\gamma^\mu)$ are the Dirac matrices. For a given Cauchy hyperplane $\Sigma = \{t = \mathrm{const}\}$ in Minkowski space, the above Lagrangian gives rise to an infinite-dimensional Hamiltonian system in variables $(A_k, E^k, \psi^a, \psi^{a*})$ with the Hamiltonian given by
\[ H = \frac12\bigl(E_kE^k + B_kB^k\bigr) + m\,\psi^{a*}\beta_{ab}\psi^{b} + \hbar\,\mathrm{Im}\bigl\{\psi^{a*}\beta_{ab}(\gamma^k)^b{}_c\,D_k\psi^c\bigr\}, \tag{2.2} \]
where $B = \mathrm{curl}\,A$. Let us take a finite regular cubic lattice $\Lambda$ contained in $\Sigma$, with lattice spacing $a$, and let us denote the set of $n$-dimensional lattice elements by $\Lambda^n$, $n = 0, 1, 2, 3$. Such elements are (in increasing order of $n$) called sites, links, plaquettes and cubes. We approximate every continuous configuration $(A_k, E^k, \psi^a, \psi^{a*})$ in the following way:

\[ \Lambda^0 \ni x \longrightarrow \psi_x^a := a^{3/2}\,\psi^a(x) \in \mathbb{C}, \tag{2.3} \]
\[ \Lambda^1 \ni (x, x+\hat k) \longrightarrow A_{x,x+\hat k} := \int_{(x,x+\hat k)} A_k\,dl \in \mathbb{R}, \tag{2.4} \]
\[ \Lambda^1 \ni (x, x+\hat k) \longrightarrow E_{x,x+\hat k} := \int_{\sigma(x,x+\hat k)} E^k\,d\sigma_k \in \mathbb{R}. \tag{2.5} \]
Here $\sigma(x, x+\hat k)$ denotes a plaquette of the dual lattice, dual to the link $(x, x+\hat k)\in\Lambda^1$. Note that we have chosen the non-compact lattice approximation, where the potential and the field strength remain Lie-algebra-valued on the lattice level. We define the second quantization for the lattice theory by postulating the following canonical (anti-)commutation relations for the lattice quantum field operators:
\[ \bigl[\hat\psi_x^a, \hat\psi_y^{b*}\bigr]_+ = \delta^{ab}\delta_{xy}, \tag{2.6} \]
\[ \bigl[\hat A_{x,x+\hat k}, \hat E_{y,y+\hat l}\bigr] = i\hbar\,\delta_{xy}\delta_{\hat k\hat l}. \tag{2.7} \]
The remaining (anti-)commutators have to vanish. All irreducible representations in the strong (Weyl) sense of the above algebra are equivalent (see [1] and [2]). In particular, the bosonic quantities $(A, E)$ may be described by the Schrödinger representation, in the Hilbert space of wave functions $\Psi$ depending on parameters $A$. Operators $\hat A$ are thus multiplication operators and canonically conjugate momenta are represented by derivatives
\[ \hat E_{x,x+\hat k} := \frac{\hbar}{i}\,\frac{\partial}{\partial A_{x,x+\hat k}}. \tag{2.8} \]
For the fermion fields we use the following decomposition into Weyl spinors:
\[ \psi^a = \begin{pmatrix} \phi^K \\ \varphi^*_L \end{pmatrix}, \qquad K, L = 1, 2. \tag{2.9} \]

We take the anti-holomorphic representation for the upper part and the holomorphic representation for the lower part of $\psi$. Thus, we represent the "classical" Grassmann algebra valued quantities $(\phi^{K*}, \varphi^*_L)$ as multiplication operators $(\hat\phi^{K*}, \hat\varphi^*_L)$ in the space of all functions (polynomials) of these variables. It follows that the adjoint operators $(\hat\phi^K, \hat\varphi_L)$ satisfy relations (2.6). They may be represented as derivatives (see [17]):
\[ \hat\phi^K := \frac{\partial}{\partial\phi^{K*}}, \qquad \hat\varphi_L := \frac{\partial}{\partial\varphi^*_L}. \tag{2.10} \]
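The claim that multiplication by a Grassmann variable and the corresponding derivative satisfy the anticommutation relations (2.6) can be illustrated with a small toy model. The sketch below is an assumption-laden illustration, not the authors' construction: polynomials in anticommuting generators $\theta_1,\dots,\theta_n$ are stored as dictionaries from sorted index tuples to coefficients, `multiply` is left multiplication by $\theta_i$, and `differentiate` is the left derivative $\partial/\partial\theta_i$ (all names hypothetical).

```python
from itertools import combinations

def multiply(i, poly):
    """Left multiplication by the Grassmann generator theta_i."""
    out = {}
    for mono, coeff in poly.items():
        if i in mono:                      # theta_i * theta_i = 0
            continue
        pos = sum(1 for j in mono if j < i)  # generators theta_i must be moved past
        new = mono[:pos] + (i,) + mono[pos:]
        out[new] = out.get(new, 0) + (-1) ** pos * coeff
    return out

def differentiate(i, poly):
    """Left derivative with respect to theta_i."""
    out = {}
    for mono, coeff in poly.items():
        if i not in mono:
            continue
        pos = mono.index(i)                # move theta_i to the front, then remove it
        new = mono[:pos] + mono[pos + 1:]
        out[new] = out.get(new, 0) + (-1) ** pos * coeff
    return out

def anticommutator(i, j, poly):
    """(d/dtheta_i theta_j + theta_j d/dtheta_i) applied to poly, zero terms dropped."""
    a = differentiate(i, multiply(j, poly))
    b = multiply(j, differentiate(i, poly))
    total = {k: a.get(k, 0) + b.get(k, 0) for k in set(a) | set(b)}
    return {k: v for k, v in total.items() if v != 0}

n = 3
basis = [m for r in range(n + 1) for m in combinations(range(1, n + 1), r)]
ok = all(anticommutator(i, j, {m: 1}) == ({m: 1} if i == j else {})
         for m in basis for i in range(1, n + 1) for j in range(1, n + 1))
print(ok)
```

The check runs over all basis monomials, which suffices by linearity; the sign conventions (left derivative, sorted monomials) are a choice made for this sketch.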

The tensor product of all these representations is, therefore, defined in the space of wave functions
\[ \Psi = \Psi\bigl(\{A_{x,x+\hat k}\}, \{\phi_x^{K*}\}, \{\varphi^*_{x;L}\}\bigr) \tag{2.11} \]
which are polynomials in the anticommuting variables $(\phi^*, \varphi^*)$ with coefficients being functions of variables $A$. The Hilbert space structure is defined by the $L^2$-norm. Integration over the Grassmann variables is understood in the sense of Berezin, which means that the set of all different monomials in these variables forms an orthonormal basis. Obviously, the algebra generated by (2.6) and (2.7) contains a lot of unphysical (gauge-dependent) elements. Moreover, the above electric field operators do not satisfy the Gauss law. In what follows we will present an explicit construction of the algebra of observables (gauge invariant operators satisfying the Gauss law), together with a complete classification of its irreducible representations.


3. Gauge Invariance, Constraints and Boundary Data

A local gauge transformation of a lattice configuration is given by:

\[ \tilde\psi_x = \exp(-ig\lambda_x)\,\psi_x, \tag{3.1} \]

\[ \tilde A_{x,x+\hat k} = A_{x,x+\hat k} + \lambda_{x+\hat k} - \lambda_x, \tag{3.2} \]

where $\Lambda^0 \ni x \mapsto \lambda_x \in \mathbb R$. We stress at this point that we could have also used the more familiar compact description, in which the lattice gauge potentials are group-valued quantities $\exp(igA_{x,x+\hat k})$ (parallel transporters). On the level of physical observables, this would lead to replacing the magnetic field $B$, see forthcoming formula (4.1), by $\exp(igB)$ (the Wilson loop). This change leads only to an obvious and straightforward modification of our results: the so-called $\theta$-representations for the electromagnetic field operators $\exp(igB)$ and $E$ would occur in the classification of irreducible representations of the observable algebra. This means that the uniqueness theorem proved in the present paper would no longer be true. To reestablish uniqueness, we would have to exclude these $\theta$-representations a priori. That is why we believe that the description we are using is, in the case of electrodynamics, simpler than the Wilson description. It does not introduce any artificial compactification of the physical degrees of freedom and is, therefore, closer to the topological structure of the continuum theory.

Local gauge transformations act on wave functions in the following way:

\[ \left(U(\{\lambda_x\})\Psi\right)\left(\{A_{x,x+\hat k}\}, \{\phi_x^{K*}\}, \{\varphi^*_{x;L}\}\right) = \Psi\left(\{\tilde A_{x,x+\hat k}\}, \{\tilde\phi_x^{K*}\}, \{\tilde\varphi^*_{x;L}\}\right). \tag{3.3} \]

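This induces the transformation law for the field operators $\hat A$ and $\hat\psi$, formally identical with (3.1) and (3.2). For a deeper discussion of possible gauge fixings, see Sect. 8. To calculate the generator $\hat G_x$ of infinitesimal local gauge transformations at $x$ we take the derivative of $\Psi(\{\tilde A_{x,x+\hat k}\}, \{\tilde\phi_x^{K*}\}, \{\tilde\varphi^*_{x;L}\})$ with respect to $\lambda_x$, at $\lambda_x = 0$:

\[ \hat G_x\Psi := -\sum_{\hat k}\frac{\partial\Psi}{\partial A_{x,x+\hat k}} + ig\sum_K \phi_x^{K*}\frac{\partial\Psi}{\partial\phi_x^{K*}} - ig\sum_L \varphi^*_{x;L}\frac{\partial\Psi}{\partial\varphi^*_{x;L}} = -\frac{i}{\hbar}\left\{\sum_{\hat k}\hat E_{x,x+\hat k} - g\hbar\sum_K\left(\hat\phi_x^{K*}\hat\phi_x^{K} - \hat\varphi^*_{x;K}\hat\varphi_{x;K}\right)\right\}\Psi. \tag{3.4} \]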

We conclude that the generator $\hat G_x$ of local gauge transformations is given by

\[ \hat G_x := -\frac{i}{\hbar}\left(\sum_{\hat k}\hat E_{x,x+\hat k} - \hat j_x^0\right). \tag{3.5} \]

Here, the operator $\hat j_x^0$ of electric charge at $x$ is automatically obtained in the “normally ordered” form:

\[ \hat j_x^0 = {:}\,e\sum_a \hat\psi_x^{a*}\hat\psi_x^{a}\,{:} = e\sum_K\left(\hat\phi_x^{K*}\hat\phi_x^{K} - \hat\varphi^*_{x;K}\hat\varphi_{x;K}\right), \tag{3.6} \]

where $e := g\hbar$ is the elementary charge. We define the operator $\hat Q$ of total electric charge putting

\[ \hat Q := \sum_{x\in\Lambda^0}\hat j_x^0. \tag{3.7} \]

Due to (3.6), we can interpret $\hat\phi_x^{K*}$ as the creation operator of a positron at $x$, carrying the charge $+e$, and $\hat\varphi^*_{x;K}$ as the creation operator of an electron at $x$, carrying the charge $-e$. The index $K = 1, 2$ describes the two possible helicity states. The necessary and sufficient condition for gauge invariance of the wave function is provided by the following “Gauss law constraint”:

\[ \hat G_x\Psi = 0. \tag{3.8} \]

Unfortunately, wave functions fulfilling (3.8) are not square integrable with respect to the standard measure on the configuration space, because they are constant on non-compact gauge orbits. As already discussed in the Introduction, one possible strategy consists in looking for an appropriate Hilbert space structure in the space of gauge invariant wave functions; another one, followed in this paper, consists in explicitly constructing the algebra of observables together with its irreducible representations. In the latter approach there is no ambiguity in the definition of the scalar product in the space of physical, gauge-invariant functions. It is uniquely implied by the structure of the algebra of observables. If we sum up Eqs. (3.8) over all $x \in \Lambda^0$ we see that, heuristically, the total charge $\hat Q$ should vanish when acting on gauge invariant wave functions $\Psi$:

\[ \hat Q\Psi = 0. \tag{3.9} \]

Thus, nontrivial values of the total charge $\hat Q$ can only arise from nontrivial boundary data, which we are now going to introduce. For this purpose we consider also external links of our finite lattice $\Lambda$, connecting lattice sites belonging to the boundary $\partial\Lambda$ with “the rest of the world”. This way we can treat $\Lambda$ as part of a bigger (maybe infinite) lattice. We denote these external links by $(x, \infty)$ and allow the wave functions $\Psi$ from the beginning to depend on the corresponding potentials $A_{x,\infty}$. Moreover, we put $\partial\Lambda^n := \partial\Lambda \cap \Lambda^n$. Now gauge invariance no longer implies vanishing of the total charge, because the electric fields on external links remain when we sum up Eqs. (3.8) over all sites of $\Lambda$:

\[ \hat Q\Psi = \left(\sum_{x\in\Lambda^0}\hat j_x^0\right)\Psi = \left(\sum_{x\in\partial\Lambda^0}\hat E_{x,\infty}\right)\Psi. \tag{3.10} \]

We stress that the external fluxes $\hat E_{x,\infty}$ are not dynamical quantities; they play the role of prescribed boundary conditions. It will be shown in the sequel that the charge operator $\hat Q$ defines a superselection rule. Thus, we have $\hat Q = Q\mathbb 1$ on every superselection sector. Consequently, the only consistent choice for the external fluxes is $\hat E_{x,\infty} = E_{x,\infty}\mathbb 1$ on every superselection sector, where $E_{x,\infty}$ are c-numbers fulfilling

\[ Q = \sum_{x\in\partial\Lambda^0} E_{x,\infty}. \tag{3.11} \]

In principle, we could distinguish between representations characterized by the same value $Q$, but corresponding to different external flux distributions fulfilling (3.11). This would lead to additional superselection rules. In the present paper we have chosen another option, which is motivated by the fact that, as will be shown in Sect. 8, different boundary conditions corresponding to the same value $Q$ give equivalent representations. Thus, for any value of $Q$ we have an equivalence class of boundary data and we choose a representative, e.g. the “most symmetric” distribution of the values $\hat E_{x,\infty}$ on $\partial\Lambda$.


4. Algebra of Gauge Invariant Operators

In this section we investigate the algebra $\tilde{\mathcal O}$ of gauge invariant operators. In particular, we find a set of algebraic relations between generators of this algebra. This set, supplemented by the Gauss law, will be taken as a set of axioms for the generators of the algebra of observables, which we are going to construct in the next section. We start with the auxiliary Hilbert space $\mathcal H_0$ of square integrable wave functions (2.11), admitting (possibly) nontrivial boundary data $\hat E_{x,\infty}$.

Definition 1. The algebra $\tilde{\mathcal O}$ of gauge invariant operators is defined as the commutant of $\{U(\{\lambda_x\})\}$ in $B(\mathcal H_0)$.

Since the set generated by $\{U(\{\lambda_x\})\}$ is $*$-invariant, $\tilde{\mathcal O}$ is a von Neumann algebra. We see that arbitrary bounded functions of (self-adjoint) electric flux operators $\hat E$ are gauge invariant. To each oriented plaquette $(x; \hat k, \hat l)$ we may assign the (self-adjoint) magnetic flux operator:

\[ \hat B_{x;\hat k,\hat l} := \hat A_{x,x+\hat k} + \hat A_{x+\hat k,x+\hat k+\hat l} + \hat A_{x+\hat k+\hat l,x+\hat l} + \hat A_{x+\hat l,x}. \tag{4.1} \]

Again, arbitrary bounded operator functions of $\hat B$ are gauge invariant, due to (3.2). Observe that magnetic flux operators are subject to the following constraint: the total magnetic flux through the boundary of a lattice cube vanishes as a consequence of (4.1),

\[ \left(\operatorname{div}\hat B\right)_{x;\hat k,\hat l,\hat n} := \hat B_{x;\hat k,\hat l} + \hat B_{x;\hat n,\hat k} + \hat B_{x;\hat l,\hat n} + \hat B_{x+\hat n;\hat l,\hat k} + \hat B_{x+\hat l;\hat k,\hat n} + \hat B_{x+\hat k;\hat n,\hat l} = 0. \tag{4.2} \]
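The two statements above (gauge invariance of the plaquette flux and the vanishing of the cube divergence) can be checked numerically for classical link variables. The following is a minimal sketch, not part of the original construction: it assumes c-number potentials with $A_{y,x} = -A_{x,y}$, and it uses a small periodic lattice purely for convenience, whereas the paper works on a finite lattice with boundary. All function names are illustrative.

```python
import itertools
import random

SIZE = 3  # periodic cubic lattice with SIZE**3 sites (an illustrative assumption)
UNIT = {0: (1, 0, 0), 1: (0, 1, 0), 2: (0, 0, 1)}

def shift(x, k):
    """Site x + k-hat on the periodic lattice."""
    return tuple((xi + di) % SIZE for xi, di in zip(x, UNIT[k]))

def plaquette_flux(A, x, k, l):
    """B_{x;k,l} as in (4.1); A[(x, k)] is the potential on the link (x, x + k-hat)."""
    # Links traversed against their orientation enter with a minus sign.
    return A[(x, k)] + A[(shift(x, k), l)] - A[(shift(x, l), k)] - A[(x, l)]

def cube_divergence(A, x, k, l, n):
    """Left-hand side of (4.2): total flux through the six faces of the cube at x."""
    def B(y, a, b):
        return plaquette_flux(A, y, a, b)
    return (B(x, k, l) + B(x, n, k) + B(x, l, n)
            + B(shift(x, n), l, k) + B(shift(x, l), k, n) + B(shift(x, k), n, l))

def gauge_transform(A, lam):
    """A_{x,x+k} -> A_{x,x+k} + lambda_{x+k} - lambda_x, cf. (3.2)."""
    return {(x, k): a + lam[shift(x, k)] - lam[x] for (x, k), a in A.items()}

sites = list(itertools.product(range(SIZE), repeat=3))
rng = random.Random(0)
A = {(x, k): rng.randint(-5, 5) for x in sites for k in UNIT}
lam = {x: rng.randint(-5, 5) for x in sites}
A_gauged = gauge_transform(A, lam)

# Every plaquette flux is unchanged by the gauge transformation.
fluxes_invariant = all(
    plaquette_flux(A, x, k, l) == plaquette_flux(A_gauged, x, k, l)
    for x in sites for k in UNIT for l in UNIT if k != l)
# The flux through the boundary of every elementary cube vanishes.
divergence_free = all(cube_divergence(A, x, 0, 1, 2) == 0 for x in sites)
```

Integer link values are used so that both checks are exact rather than subject to rounding.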

To any oriented lattice path $\gamma$ connecting two lattice points $x$ and $y$ we may assign the following gauge-invariant, bilinear combination of the fermionic operators $\hat\psi$:

\[ \hat W_\gamma^{ba} := \hat\psi_y^{b*}\exp\left(-ig\int_\gamma\hat A\right)\hat\psi_x^{a}, \tag{4.3} \]

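where the integral denotes the oriented sum over all links belonging to $\gamma$. In particular, $\gamma$ may be trivial; in that case the integral over $\gamma$ is equal to zero. Obviously, any $\hat W$ is bounded in $\mathcal H_0$ and hence it belongs to $\tilde{\mathcal O}$.

Proposition 1. If $\gamma$ is a path connecting $x$ with $y$, $\beta$ is a path connecting $y$ with $z$ and if the degrees of freedom $(x, a)$, $(y, b)$ and $(z, c)$ differ from each other, then we have the following identity:

\[ \left[\hat W_\beta^{cb}, \hat W_\gamma^{ba}\right] = \hat W_{\beta\gamma}^{ca}, \tag{4.4} \]

where $\beta\gamma$ is the composition of $\gamma$ and $\beta$, a path connecting $x$ with $z$.

Proof.

\[ \left[\hat W_\beta^{cb}, \hat W_\gamma^{ba}\right] = \left(\hat\psi_z^{c*}\hat\psi_y^{b}\hat\psi_y^{b*}\hat\psi_x^{a} - \hat\psi_y^{b*}\hat\psi_x^{a}\hat\psi_z^{c*}\hat\psi_y^{b}\right)\exp\left(-ig\int_{\beta\gamma}\hat A\right) = \left(\hat\psi_y^{b}\hat\psi_y^{b*} + \hat\psi_y^{b*}\hat\psi_y^{b}\right)\hat W_{\beta\gamma}^{ca} = \hat W_{\beta\gamma}^{ca}. \]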


Observe that, due to (3.1) and (3.2), gauge invariant operators constructed from $\hat A$ and $\hat\psi$ may be expressed in terms of $\{\hat E, \hat B, \hat W, \hat W^*\}$ only. There are three types of operators $\hat W$:

\[ \hat L_\gamma^{LK} := \hat\phi_y^{L*}\exp\left(-ig\int_\gamma\hat A\right)\hat\phi_x^{K}, \tag{4.5} \]

\[ \hat M_{\gamma;L}^{K} := \hat\varphi_{y;L}\exp\left(-ig\int_\gamma\hat A\right)\hat\phi_x^{K}, \tag{4.6} \]

\[ \hat R_{\gamma;LK} := \hat\varphi_{y;L}\exp\left(-ig\int_\gamma\hat A\right)\hat\varphi^*_{x;K} \tag{4.7} \]

(the fourth type coincides with the adjoint operators $\hat M_{\gamma;L}^{K*}$). Observe that $\hat M_{\gamma;L}^{K}$ annihilates a positron at $x$ and an electron at $y$, whereas $\hat M_{\gamma;L}^{K*}$ creates such a pair. We will call them “pair-annihilation operators” (respectively, “pair-creation operators”). A non-diagonal operator $\hat L_\gamma^{LK}$ (i.e. such that $(x, K) \neq (y, L)$, where $x$ is the beginning and $y$ is the end of $\gamma$) annihilates a positron at $x$ and creates another one at $y$, whereas $\hat R_{\gamma;LK}$ does the same with electrons. Finally, we have diagonal operators corresponding to trivial paths. We denote them by $\hat L_x^{KK} = \hat\phi_x^{K*}\hat\phi_x^{K}$ (the projector representing the number of positrons at $x$ with helicity $K$) and by $\hat R_{x;LL}$ (the projector $\mathbb 1 - \hat R_{x;LL} = \hat\varphi^*_{x;L}\hat\varphi_{x;L}$ describes the number of electrons at $x$ with helicity $L$).

Proposition 2. The operators $\hat L$ and $\hat R$ can be expressed in terms of the operators $\hat M$ and $\hat M^*$.

Proof. For nondiagonal elements we have immediately from Proposition 1:

\[ \hat L_\alpha^{PK} = \left[\hat M_{\beta;L}^{P*}, \hat M_{\gamma;L}^{K}\right], \qquad \alpha = \beta^{-1}\gamma, \tag{4.8} \]

\[ \hat R_{\alpha;LQ} = \left[\hat M_{\gamma;L}^{K}, \hat M_{\beta;Q}^{K*}\right], \qquad \alpha = \gamma\beta^{-1}, \tag{4.9} \]

where $\beta$ and $\gamma$ are arbitrary paths chosen in such a way that $\alpha = \beta^{-1}\gamma$ or $\alpha = \gamma\beta^{-1}$, respectively. Obviously, the left-hand sides of (4.8) and (4.9) do not depend upon this arbitrary splitting of $\alpha$. It remains to consider the case of diagonal elements $\hat L_x^{KK}$ and $\hat R_{x;LL}$. For that purpose we define the following projectors:

\[ \hat P^{K}_{x,y;L} = \hat M_{\gamma;L}^{K}\hat M_{\gamma;L}^{K*} = \hat\phi_x^{K}\hat\phi_x^{K*}\hat\varphi_{y;L}\hat\varphi^*_{y;L}, \tag{4.10} \]

\[ \hat Q^{K}_{x,y;L} = \hat M_{\gamma;L}^{K*}\hat M_{\gamma;L}^{K} = \hat\phi_x^{K*}\hat\phi_x^{K}\hat\varphi^*_{y;L}\hat\varphi_{y;L}, \tag{4.11} \]

where $\gamma$ is an arbitrary path from $x$ to $y$. Obviously, these operators depend only upon the endpoints $x$ and $y$, and not on the chosen path. Moreover, they all commute with each other. It is also useful to introduce the following operators:

\[ \hat K^{K}_{x,y;L} := \hat Q^{K}_{x,y;L} - \hat P^{K}_{x,y;L} = \left[\hat M_{\gamma;L}^{K*}, \hat M_{\gamma;L}^{K}\right]. \tag{4.12} \]

One may easily check the following identity:

\[ \hat K^{K}_{x,y;L} = \hat L_x^{KK} - \hat R_{y;LL}. \tag{4.13} \]


These equations may be easily solved with respect to the operators $\hat L$ and $\hat R$, provided the total charge $\hat Q$ is given. For this purpose observe that the total number of positrons $\hat N^p$ and the total number of electrons $\hat N^e$ are equal to

\[ \hat N^p = \sum_{K,x}\hat L_x^{KK}, \tag{4.14} \]

\[ \hat N^e = N\mathbb 1 - \sum_{L,y}\hat R_{y;LL}, \tag{4.15} \]

where $N$ is the number of positron (and also electron) degrees of freedom (twice the number of lattice sites). Summing up Eqs. (4.13) over all indices we obtain

\[ \hat N^p + \hat N^e = \frac{1}{N}\sum_{K,x}\sum_{L,y}\hat K^{K}_{x,y;L} + N\mathbb 1. \tag{4.16} \]

On the other hand, we have

\[ \hat N^p - \hat N^e = \frac{1}{e}\hat Q. \tag{4.17} \]

This enables us to calculate both $\hat N^p$ and $\hat N^e$ and, finally, to obtain $\hat L_x^{KK}$ by summing up Eqs. (4.13) with respect to $(L, y)$ and $\hat R_{y;LL}$ by summing it up with respect to $(K, x)$. The final result is:

\[ \hat L_x^{KK} = \frac{1}{N}\sum_{L,y}\hat K^{K}_{x,y;L} - \frac{1}{2N^2}\sum_{R,u}\sum_{S,v}\hat K^{R}_{u,v;S} + \frac{1}{2}\mathbb 1 + \frac{1}{2Ne}\hat Q, \tag{4.18} \]

\[ \hat R_{y;LL} = -\frac{1}{N}\sum_{K,x}\hat K^{K}_{x,y;L} + \frac{1}{2N^2}\sum_{R,u}\sum_{S,v}\hat K^{R}_{u,v;S} + \frac{1}{2}\mathbb 1 + \frac{1}{2Ne}\hat Q. \tag{4.19} \]
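Formulae (4.18) and (4.19) can be sanity-checked on simultaneous eigenvectors of the occupation numbers, where every operator involved reduces to a number. The sketch below is an illustration under that assumption: positron and electron degrees of freedom are labelled by plain integers, the eigenvalue of $\hat K$ on a pair is taken to be $n^p + n^e - 1$ (which is what (4.13) gives), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def reconstruct_occupations(n_p, n_e):
    """Evaluate the right-hand sides of (4.18) and (4.19) on one configuration.

    n_p[i], n_e[j] in {0, 1} are positron / electron occupation numbers.
    Returns (L, R): L[i] is the value of L_x^{KK}, R[j] the value of R_{y;LL}.
    """
    N = len(n_p)
    assert len(n_e) == N
    # Eigenvalue of K = Q - P on the pair (i, j), cf. (4.12) and (4.13).
    K = [[n_p[i] + n_e[j] - 1 for j in range(N)] for i in range(N)]
    total_K = sum(map(sum, K))
    charge_over_e = sum(n_p) - sum(n_e)                      # (4.17)
    common = Fraction(1, 2) + Fraction(charge_over_e, 2 * N)
    L = [Fraction(sum(K[i]), N) - Fraction(total_K, 2 * N * N) + common
         for i in range(N)]
    R = [-Fraction(sum(K[i][j] for i in range(N)), N)
         + Fraction(total_K, 2 * N * N) + common
         for j in range(N)]
    return L, R

def formulas_hold(N):
    """Check (4.18), (4.19) against n_p and 1 - n_e on all 4**N configurations."""
    for n_p in product((0, 1), repeat=N):
        for n_e in product((0, 1), repeat=N):
            L, R = reconstruct_occupations(n_p, n_e)
            if L != list(n_p) or R != [1 - n for n in n_e]:
                return False
    return True
```

Exact rational arithmetic (`Fraction`) is used so that the comparison with the integer occupation numbers is exact.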

In the last step, we will express also the total charge operator $\hat Q$ in terms of the projectors (4.11) and (4.10). For that purpose denote by $\mathcal N$ the set of all positron (or electron) degrees of freedom, defined as the product of the set of all lattice sites by the set $\{1, 2\}$ of all possible helicity values. For each pair $(V, W)$ of subsets of $\mathcal N$ we are going to construct projectors $\hat Q_{(V,W)}$ (or $\hat P_{(V,W)}$, respectively), which correspond to the following physical questions: “Are all positron degrees of freedom in $V$ and all electron degrees of freedom in $W$ fully occupied (or fully unoccupied, respectively)?” If neither $V$ nor $W$ is empty, we define

\[ \hat Q_{(V,W)} := \prod_{(K,x)\in V}\ \prod_{(L,y)\in W}\hat Q^{K}_{x,y;L}, \tag{4.20} \]

\[ \hat P_{(V,W)} := \prod_{(K,x)\in V}\ \prod_{(L,y)\in W}\hat P^{K}_{x,y;L}. \tag{4.21} \]

(The order of factors is irrelevant, due to the commutativity of the operators $\hat P$ and $\hat Q$.) If both $V$ and $W$ are empty, we put $\hat Q_{(\emptyset,\emptyset)} = \hat P_{(\emptyset,\emptyset)} = \mathbb 1$. Finally, if only one of them is empty, we proceed as follows: For $V$ not empty (resp. empty) and $W$ empty (resp. not empty), we choose any pair $((K_0, x_0), (L_0, y_0))$ such that $(K_0, x_0) \notin V$ (resp. is arbitrary) and $(L_0, y_0)$ is arbitrary (resp. $(L_0, y_0) \notin W$). Next, we choose an arbitrary path $\gamma$ from $x_0$ to $y_0$ and put

\[ \hat Q_{(V,\emptyset)} := \hat M_{\gamma;L_0}^{K_0}\,\hat Q_{(V,\{(L_0,y_0)\})}\,\hat M_{\gamma;L_0}^{K_0*}, \tag{4.22} \]

\[ \hat P_{(V,\emptyset)} := \hat M_{\gamma;L_0}^{K_0*}\,\hat P_{(V,\{(L_0,y_0)\})}\,\hat M_{\gamma;L_0}^{K_0}, \tag{4.23} \]

\[ \hat Q_{(\emptyset,W)} := \hat M_{\gamma;L_0}^{K_0}\,\hat Q_{(\{(K_0,x_0)\},W)}\,\hat M_{\gamma;L_0}^{K_0*}, \tag{4.24} \]

\[ \hat P_{(\emptyset,W)} := \hat M_{\gamma;L_0}^{K_0*}\,\hat P_{(\{(K_0,x_0)\},W)}\,\hat M_{\gamma;L_0}^{K_0}. \tag{4.25} \]

For any integer $Z \in [-N+1, N-1]$ we define the following projector:

\[ \hat P_Z := \sum_{\#V - \#W = Z}\hat Q_{(V,W)}\,\hat P_{(\mathcal N\setminus V,\,\mathcal N\setminus W)}, \tag{4.26} \]

where by $\#V$ and $\#W$ we denote the number of elements of the sets $V$ and $W$, respectively. It is easy to check, using the wave function representation of $\mathcal H_0$, that the above projectors give the spectral decomposition of the total charge, i.e. that the following is true:

\[ \hat Q = e\sum_Z Z\,\hat P_Z. \tag{4.27} \]
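The spectral decomposition (4.27) can be illustrated on occupation-number configurations. The sketch below assumes, following the “physical questions” quoted above, that $\hat Q_{(V,W)}$ acts on such a configuration as the indicator of “everything in $V$ and $W$ is occupied” and $\hat P_{(V,W)}$ as the indicator of “everything in $V$ and $W$ is empty”, including the cases with an empty set. Degrees of freedom are labelled by integers and the function names are illustrative.

```python
from itertools import combinations, product

def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]

def projector_value(Z, occupied_p, occupied_e, N):
    """Value of P_Z from (4.26) on a configuration with the given occupied sets."""
    full = frozenset(range(N))
    value = 0
    for V, W in product(subsets(N), repeat=2):
        if len(V) - len(W) != Z:
            continue
        # Q_(V,W): every degree of freedom in V and in W is occupied.
        all_occupied = V <= occupied_p and W <= occupied_e
        # P_(N\V, N\W): every degree of freedom outside V and W is empty.
        rest_empty = (not ((full - V) & occupied_p)) and (not ((full - W) & occupied_e))
        value += int(all_occupied and rest_empty)
    return value

def charge_from_projectors(occupied_p, occupied_e, N):
    """Right-hand side of (4.27) in units of e, for Z in [-N+1, N-1]."""
    return sum(Z * projector_value(Z, occupied_p, occupied_e, N)
               for Z in range(-N + 1, N))
```

For a configuration whose charge lies in the admitted range, exactly one term of (4.26) survives, namely the one with $V$ and $W$ equal to the occupied sets, so the sum returns the number of positrons minus the number of electrons.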

It follows from this proposition that also every local charge operator (3.6) can be expressed in terms of $\hat M$ and $\hat M^*$:

\[ \hat j_x^0 = \frac{e}{N}\left(\sum_{K,L,y}\hat K^{K}_{x,y;L} - \sum_{K,L,y}\hat K^{K}_{y,x;L}\right) + \frac{2}{N}\hat Q, \tag{4.28} \]

where $\hat Q$ is given by (4.27).

We see, therefore, that the algebra $\tilde{\mathcal O}$ is generated by the family $\{\hat E, \hat B, \hat M, \hat M^*\}$. These generators are, however, not independent. There is a number of operator identities between them, which we list below.

Proposition 3. We have

1.
\[ \left(\operatorname{div}\hat B\right)_{x;\hat k,\hat l,\hat n} = 0. \tag{4.29} \]

2. Any pair of operators $\hat M$ commutes:
\[ \left[\hat M_{\alpha;L}^{K}, \hat M_{\beta;N}^{M}\right] = 0. \tag{4.30} \]

3. Pair-annihilation operators along two different paths $\gamma$ and $\beta$, having common ends $x$ and $y$, are related by:
\[ \hat M_{\gamma;L}^{K} = \exp\left(-ig\hat B(\gamma\beta^{-1})\right)\hat M_{\beta;L}^{K}, \tag{4.31} \]
where $\hat B(\gamma\beta^{-1})$ denotes the magnetic flux through the closed path $\gamma\beta^{-1}$.

4.
\[ \left(\hat M_{\alpha;L}^{K}\right)^2 = 0. \tag{4.32} \]


5. Let $\alpha$ connect $x$ with $y$, $\beta$ connect $z$ with $u$ and $\gamma$ connect $t$ with $w$. Denote by $\delta_{(x,K)(z,M)}$ the Kronecker symbol, which vanishes if $(x, K) \neq (z, M)$ and takes the value one if both points and indices coincide. Then:
\[ \left[\hat M_{\alpha;L}^{K}, \hat M_{\beta;N}^{M*}, \hat M_{\gamma;R}^{P}\right] = \delta_{(x,K)(z,M)}\,\delta_{(u,N)(w,R)}\,\hat M_{\alpha\beta^{-1}\gamma;L}^{P} + \delta_{(y,L)(u,N)}\,\delta_{(t,P)(z,M)}\,\hat M_{\gamma\beta^{-1}\alpha;R}^{K}, \tag{4.33} \]
where we have denoted $[\cdot,\cdot,\cdot] := [[\cdot,\cdot],\cdot] \equiv [\cdot,[\cdot,\cdot]]$ (both definitions are equivalent, because the first and the last variable commute).

6.
\[ \left[\hat M_{\alpha;L}^{K}, \hat B_{x;\hat k,\hat l}\right] = 0, \tag{4.34} \]
\[ \left[\hat M_{\alpha;L}^{K}, \hat E_{x,x+\hat k}\right] = e\hbar\,\delta_{\alpha,(x,x+\hat k)}\,\hat M_{\alpha;L}^{K}, \tag{4.35} \]
\[ \left[\hat B_{x;\hat k,\hat l}, \hat E_{y,y+\hat n}\right] = i\hbar\,\delta_{\partial(x;\hat k,\hat l),(y,y+\hat n)}. \tag{4.36} \]
Here $\delta_{\alpha,(x,x+\hat k)} = 0$ if $(x, x+\hat k) \notin \alpha$, $\delta_{\alpha,(x,x+\hat k)} = 1$ if $(x, x+\hat k) \in \alpha$ and has the same orientation as $\alpha$, and $\delta_{\alpha,(x,x+\hat k)} = -1$ otherwise. In the last formula we have denoted by $\partial(x;\hat k,\hat l)$ a closed path, namely the boundary of the oriented plaquette $(x;\hat k,\hat l)$.

Proof. Identity (4.29) was already shown above, see (4.2). Identities (4.30)–(4.35) follow easily from the canonical (anti-)commutation relations (2.6) and (2.7), and the commutation relations (4.36) are an immediate consequence of the commutation relations (2.7).

Remark 1. The commutators containing unbounded operators $\hat E$ and $\hat B$ have to be understood in their “bounded version”. In particular, the canonical commutation relations (4.36) have to be understood in the sense of the Weyl commutation relations for the one-parameter groups generated by $\hat E$ and $\hat B$.

5. Algebra of Observables and Tree Decomposition

The algebra $\tilde{\mathcal O}$ is, of course, unphysical, because it does not respect the Gauss law. Its representation in $\mathcal H_0$ is highly reducible. To construct the physical observable algebra $\mathcal O$ we additionally impose the Gauss law

\[ \operatorname{div}_x\hat E = \hat j_x^0, \tag{5.1} \]

where $\hat j_x^0$ is defined by $\hat M$ according to formulae (4.28) and (4.27). We take this equation as an additional relation between generators.

Thus, let us start with the $*$-algebra generated by the family of abstract elements $\{\hat E, \hat B, \hat M, \hat M^*\}$ (with $\hat E$ and $\hat B$ being self-adjoint), satisfying axioms (4.29)–(4.36) together with (5.1), where the charges $\hat j_x^0$ are given by (4.28) and (4.27). We will prove in this section that these axioms (with canonical commutation relations understood in the sense of Weyl) do, indeed, uniquely define a von Neumann algebra, which we shall call the Algebra of Observables of our model and denote by $\mathcal O$.

Our main tool will be the notion of a tree. By a tree we mean a pair $(x_0, T)$, where $x_0 \in \Lambda^0$ is a lattice site, which we call the root of the tree, and $T \subset \Lambda^1$ is a subset of links having the following property: for every site $x \in \Lambda^0$ there is one and only one path composed of links belonging to $T$ which connects the root $x_0$ with $x$. We denote this path by $(x_0, x)_T$. The simplest example of a tree is obtained as follows. Choose any root and take as $T$ the following collection of links:

1. all the links $(x, x+\hat 3)$ belonging to the $x_3$-axis passing through the root,
2. all the links $(x, x+\hat 2)$ belonging to the two-dimensional plane $(x_2, x_3)$ passing through the root,
3. all the links $(x, x+\hat 1)$.

With every off-tree link $(x, x+\hat k) \notin T$ we associate the unique closed path composed of the tree-path $(x_0, x)_T$, the link $(x, x+\hat k)$ itself and the inverse tree-path $(x+\hat k, x_0)_T$. For any surface (finite number of plaquettes) such that the above closed path is its boundary, we denote the operator of total magnetic flux through it, i.e. the sum of all operators $\hat B_{x;\hat k,\hat l}$ corresponding to this surface, by $\hat{\mathcal B}_{x,x+\hat k}(T)$. Of course, this quantity does not depend upon the choice of the surface, because the divergence of $\hat B$ vanishes. We call these quantities along-tree magnetic fluxes. From (4.36) we have

\[ \left[\hat{\mathcal B}_{x,x+\hat k}(T), \hat E_{y,y+\hat l}\right] = i\hbar\,\delta_{xy}\,\delta_{\hat k\hat l}, \tag{5.2} \]

which means that the operators canonically conjugate to the along-tree magnetic fluxes are equal to the off-tree electric fluxes. From now on we denote them by $\hat{\mathcal E}_{y,y+\hat l}(T)$.

Finally, let us describe all electron and positron degrees of freedom as follows: for the operators $\hat\phi_x^{K}$ and $\hat\varphi_{y;L}$ we write $\hat\phi_i$ and $\hat\varphi_j$, where $i = (x, K)$, $j = (y, L)$ label all possible values of the indices $K$ and $L$, and all lattice sites $x$ and $y$. We denote by $\hat m_{ij}$ those generators $\hat M$ which correspond to on-tree paths:

\[ \hat m_{ij} := \hat M_{\gamma;L}^{K}, \tag{5.3} \]

where $i = (x, K)$, $j = (y, L)$ and $\gamma$ denotes the unique “on-tree” path connecting the lattice site $x$ with the lattice site $y$.

Definition 2. For a given tree $T$ and a given family $\{\hat E, \hat B, \hat M, \hat M^*\}$ of generators fulfilling our axioms, we call the family $\{\hat{\mathcal E}_{y,y+\hat l}(T), \hat{\mathcal B}_{x,x+\hat k}(T), \hat m_{ij}, \hat m^*_{ij}\}$ the tree data of $\{\hat E, \hat B, \hat M, \hat M^*\}$.

Observe that the tree data inherit the following properties from axioms (4.30)–(4.36):

Proposition 4.

\[ \left[\hat m_{ij}, \hat m_{kl}\right] = 0, \tag{5.4} \]
\[ \hat m_{ij}^{\,2} = 0, \tag{5.5} \]
\[ \left[\hat m_{ij}, \hat m^*_{kl}, \hat m_{rs}\right] = \delta_{ki}\,\delta_{ls}\,\hat m_{rj} + \delta_{kr}\,\delta_{lj}\,\hat m_{is}, \tag{5.6} \]
\[ \left[\hat m_{ij}, \hat{\mathcal B}_{x,x+\hat k}(T)\right] = 0, \tag{5.7} \]
\[ \left[\hat m_{ij}, \hat{\mathcal E}_{x,x+\hat k}(T)\right] = 0, \tag{5.8} \]
\[ \left[\hat{\mathcal B}_{x,x+\hat k}(T), \hat{\mathcal E}_{y,y+\hat l}(T)\right] = i\hbar\,\delta_{xy}\,\delta_{\hat k\hat l}. \tag{5.9} \]
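The “simplest example of a tree” described earlier in this section (all $\hat 1$-links, the $\hat 2$-links in the $(x_2, x_3)$-plane through the root, and the $\hat 3$-links on the $x_3$-axis through the root) can be generated and checked mechanically. The sketch below assumes a finite cubic lattice with `size` sites per direction, which is an illustrative choice; the helper names are hypothetical.

```python
from itertools import product

def comb_tree(size, root):
    """Links of the example tree on a size**3 cubic lattice, as (site, direction) pairs.

    Directions 0, 1, 2 stand for the unit vectors 1-hat, 2-hat, 3-hat.
    """
    links = set()
    for x in product(range(size), repeat=3):
        if x[0] + 1 < size:                        # item 3: every link along 1-hat
            links.add((x, 0))
        if x[1] + 1 < size and x[0] == root[0]:    # item 2: 2-hat links in the (x2, x3)-plane
            links.add((x, 1))
        if x[2] + 1 < size and x[:2] == root[:2]:  # item 1: 3-hat links on the x3-axis
            links.add((x, 2))
    return links

def is_spanning_tree(size, links):
    """True if the links connect all sites and number exactly (#sites - 1)."""
    neighbours = {x: [] for x in product(range(size), repeat=3)}
    for x, k in links:
        y = tuple(c + (1 if a == k else 0) for a, c in enumerate(x))
        neighbours[x].append(y)
        neighbours[y].append(x)
    start = next(iter(neighbours))
    seen, stack = {start}, [start]
    while stack:                                   # depth-first search from an arbitrary site
        for y in neighbours[stack.pop()]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(neighbours) and len(links) == len(neighbours) - 1
```

A connected graph on $n$ sites with $n - 1$ links has no cycles, which is equivalent to the uniqueness of the path $(x_0, x)_T$ required in the definition of a tree.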


It will be seen in the next section that the $*$-algebra $\mathcal O_m$ generated by the operators $\hat m$ and $\hat m^*$, fulfilling axioms (5.4)–(5.6), is finite dimensional.

Remark 2. In the present paper we conclude this fact from the classification of all irreducible representations of $\mathcal O_m$. Alternatively, one can deduce it from an additional structure carried by this algebra, which will be discussed in a separate paper. More precisely, $\mathcal O_m$ can be constructed in terms of the enveloping algebra of a Lie algebra generated by $\{\hat m, \hat m^*, \hat l, \hat r\}$, where $\hat l$ and $\hat r$ are constructed from $\hat m$ and $\hat m^*$ in analogy to formulae (4.8), (4.9), (4.18) and (4.19).

Theorem 1. Let there be given a tree $T$. Every family $\{\hat E, \hat B, \hat M, \hat M^*\}$ of generators fulfilling axioms (4.29)–(4.36) together with (5.1) and (4.28) is in one-to-one correspondence with its tree data, fulfilling identities (5.4)–(5.9).

Proof. It is enough to show that every family fulfilling our axioms can be uniquely reconstructed from the tree data, fulfilling (5.4)–(5.9). First, it is obvious that, solving the linear system of equations implied by the magnetic Gauss law (4.29), we may calculate the magnetic flux through every plaquette of $\Lambda$ provided we know all fluxes $\hat{\mathcal B}_{x,x+\hat k}(T)$. Next, using formula (4.31), all operators $\hat M$ may be expressed in terms of the operators $\hat m$ and $\hat B$. This way identity (4.31) is automatically fulfilled by the reconstructed quantities. Also the remaining identities for the operators $\hat M$ are trivially implied by the corresponding identities of the operators $\hat m$. Now, we get from Eq. (4.28) all charges $\hat j_x^0$. But knowing all off-tree quantities $\hat{\mathcal E}_{x,x+\hat k}(T)$ and all charges $\hat j_x^0$ we are able to solve the equation $\operatorname{div}\hat E = \hat j^0$ and to calculate all the remaining (i.e. on-tree) electric operators $\hat E$. For this purpose we have to solve the equation at each site $x$, starting from the tree end-points and moving towards the tree root (an end-point $x$ of the tree is characterized by the condition that only one among the lattice links starting from $x$ belongs to the tree). Hence, knowing $\hat E$ on the remaining links and knowing $\hat j_x^0$ we may calculate $\hat E$ on this particular link. Obviously, this procedure may be continued when we move towards the root. At points belonging to the boundary $\partial\Lambda$ we must use also the values of the external fluxes $\hat E_{x,\infty} = E_{x,\infty}\mathbb 1$, which are supposed to be fixed a priori. The compatibility condition at the root is automatically fulfilled for consistent boundary data in the sense of formula (3.11). It is also obvious that (4.34) is fulfilled by the reconstructed data. Thus, it remains to show that identities (4.35) and (4.36) hold. We leave this part of the proof to the reader.

Due to this theorem, the $*$-algebra generated by $\{\hat E, \hat B, \hat M, \hat M^*\}$ fulfilling axioms (4.29)–(4.36) together with (5.1) and (4.28) is isomorphic to the $*$-algebra generated by the tree data, fulfilling identities (5.4)–(5.9). Due to (5.2), the first two components, $\hat{\mathcal E}(T)$ and $\hat{\mathcal B}(T)$, fulfill the commutation relations of a finitely generated Heisenberg algebra. We take the corresponding Weyl algebra generated by them and denote its strong closure by $\mathcal O_{\text{e-m}}(T)$. This algebra may be represented as $B(L^2(\mathcal B(T)))$, the algebra of bounded operators acting on the Hilbert space of $L^2$-integrable functions depending on the classical variables $\mathcal B(T)$, with $\hat{\mathcal B}_{x,x+\hat k}(T)$ defined as multiplication and $\hat{\mathcal E}_{x,x+\hat k}(T)$ as differentiation operators (Schrödinger representation). It follows from (5.7) and (5.8) that $\mathcal O_m$ and $\mathcal O_{\text{e-m}}(T)$ commute. Thus, the $C^*$-algebra generated by $(\hat{\mathcal E}(T), \hat{\mathcal B}(T), \hat m, \hat m^*)$ is the tensor product of $\mathcal O_{\text{e-m}}(T)$ and $\mathcal O_m$. Consequently, we have the following

Definition 3. The observable algebra $\mathcal O$ is defined as

\[ \mathcal O := \mathcal O_{\text{e-m}}(T) \otimes \mathcal O_m. \tag{5.10} \]
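The reconstruction step in the proof of Theorem 1 (solving the Gauss law site by site, from the tree end-points towards the root) is a simple elimination procedure. The following sketch illustrates it for c-number fluxes on an arbitrary finite graph; in the paper the same steps are carried out for operators. The data layout (a `parent` map describing the tree, dictionaries of fluxes) and the function name are assumptions made for this example, and the divergence at a site is taken to be the sum of all fluxes leaving it, with $E_{y,x} = -E_{x,y}$.

```python
def solve_gauss_law(parent, charge, off_tree_flux, external_flux):
    """Return the on-tree fluxes E[(x, parent[x])] solving div E = j at every non-root site.

    parent:        dict site -> parent site in the tree (the root maps to None)
    charge:        dict site -> j_x
    off_tree_flux: dict (x, y) -> E_{x,y} for off-tree links, one orientation each
    external_flux: dict site -> E_{x,infinity} (sites not listed count as 0)
    Also returns the Gauss-law residual at the root.
    """
    outflow = {x: external_flux.get(x, 0) for x in parent}   # known flux leaving x
    for (x, y), e in off_tree_flux.items():
        outflow[x] += e
        outflow[y] -= e                                      # E_{y,x} = -E_{x,y}
    children = {x: [] for x in parent}
    for x, p in parent.items():
        if p is not None:
            children[p].append(x)
    root = next(x for x, p in parent.items() if p is None)
    order, stack = [], [root]                                # parents before children
    while stack:
        x = stack.pop()
        order.append(x)
        stack.extend(children[x])
    on_tree = {}
    for x in reversed(order):                                # end-points first, root last
        if parent[x] is None:
            continue
        e = charge[x] - outflow[x]        # the Gauss law at x fixes the one unknown link
        on_tree[(x, parent[x])] = e
        outflow[x] += e
        outflow[parent[x]] -= e
    residual = charge[root] - outflow[root]   # zero iff total charge = total external flux
    return on_tree, residual
```

The returned residual is the compatibility condition at the root: it vanishes exactly when the external fluxes add up to the total charge, as in (3.11).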

Remark 3. Due to Theorem 1 we may identify elements of $\mathcal O$ reconstructed from tree data corresponding to different trees. It is easy to check that this identification is an isomorphism of algebras. Hence, the algebras $\mathcal O$ generated from data corresponding to different trees coincide. Consequently, the definition of $\mathcal O$ does not depend upon the tree.

Remark 4. The generators $\hat m$ are non-local. But due to (5.6) they can be expressed in terms of local quantities, namely annihilation operators of pairs located at the same lattice site or pairs separated by one link. This will be discussed in more detail in Sect. 9.

There is an interesting relation to a theory of $C^*$-algebras generated by unbounded elements recently developed by Woronowicz, see [30]. The following fact is a simple consequence of a general theorem contained in his paper. Let us take the $C^*$-algebra $A$ of all compact operators acting on the Hilbert space $L^2(\mathcal B(T))$. Due to [30], this algebra is generated (in a quite complicated sense) by $(\hat{\mathcal E}(T), \hat{\mathcal B}(T))$ and these generators are affiliated with $A$ in the $C^*$-sense, see [31]. This way the canonical commutation relations (5.9) can be considered as relations among affiliated elements and we can extend the notion of observables to generators treated as affiliated elements in the sense just mentioned. A natural question arises, namely, whether or not one can omit our construction based on the above tree decomposition and directly define the algebra generated by our axioms, considered as relations between unbounded generators. This is an interesting question, which we are going to consider in the future.

6. Charge Superselection Rule

In this section we prove that the total charge operator defined by (4.27) is a central element of the observable algebra. For this purpose we first list some properties of $\mathcal O$.

Let us observe that the composition law (5.6) contains, as special cases, the following identities, which will be used in what follows:

\[ \hat m_{ij} = \hat m_{ij}\,\hat m^*_{ij}\,\hat m_{ij}, \tag{6.1} \]
\[ \hat m_{ij} = \left[\hat m_{il}, \hat m^*_{kl}, \hat m_{kj}\right] \quad\text{if } k \neq i \text{ or } l \neq j. \tag{6.2} \]

Lemma 1. The operators $\hat m$ satisfy the following identities:

\[ \hat m_{ij}\,\hat m_{kl} = -\hat m_{il}\,\hat m_{kj}, \tag{6.3} \]
\[ \left[\hat m^*_{kl}, \hat m_{ij}\right] = 0 \quad\text{if } k \neq i \text{ and } l \neq j. \tag{6.4} \]

Proof. To prove (6.3) we first show that

\[ \hat m_{ij}\,\hat m_{il} = 0 = \hat m_{ij}\,\hat m_{kj}. \tag{6.5} \]

Indeed, if $j \neq l$ we have, according to (6.1), (5.4) and (5.5):

\[ \hat m_{ij}\,\hat m_{il} = \left[\hat m_{il}, \hat m^*_{il}, \hat m_{ij}\right]\hat m_{il} = \hat m_{il}\,\hat m^*_{il}\,\hat m_{ij}\,\hat m_{il} - \hat m_{ij}\,\hat m_{il}\,\hat m^*_{il}\,\hat m_{il} = 0. \tag{6.6} \]

Similarly, one can prove the second identity of (6.5). Now it is sufficient to prove (6.3) for $i \neq k$ and $j \neq l$. We have

\[ \hat m_{ij}\,\hat m_{kl} = \left[\hat m_{il}, \hat m^*_{kl}, \hat m_{kj}\right]\hat m_{kl} = -\hat m_{il}\,\hat m_{kj}\,\hat m^*_{kl}\,\hat m_{kl} = -\hat m_{il}\left[\hat m_{kl}, \hat m^*_{kl}, \hat m_{kj}\right] = -\hat m_{il}\,\hat m_{kj}. \]

To prove (6.4) observe that (6.2) implies

\[ \hat m^*_{kl}\,\hat m_{ij} = \hat m^*_{kl}\left[\hat m_{il}, \hat m^*_{kl}, \hat m_{kj}\right] = -\hat m^*_{kl}\,\hat m_{il}\,\hat m_{kj}\,\hat m^*_{kl}, \tag{6.7} \]

because the three other terms from the double commutator vanish by virtue of point 1 together with identities (6.9) and (6.10). Similarly,

\[ \hat m_{ij}\,\hat m^*_{kl} = \left[\hat m_{il}, \hat m^*_{kl}, \hat m_{kj}\right]\hat m^*_{kl} = -\hat m^*_{kl}\,\hat m_{il}\,\hat m_{kj}\,\hat m^*_{kl}, \tag{6.8} \]

which ends the proof.

The above properties, together with (5.6), imply the following identities:

\[ \hat m_{ij}\,\hat m^*_{kj}\,\hat m_{ij} = 0 \quad\text{if } k \neq i, \tag{6.9} \]
\[ \hat m_{ij}\,\hat m^*_{il}\,\hat m_{ij} = 0 \quad\text{if } l \neq j. \tag{6.10} \]

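In the notation used now, the projectors (4.10) and (4.11) take the form:

\[ \hat P_{kl} := \hat m_{kl}\,\hat m^*_{kl}, \tag{6.11} \]
\[ \hat Q_{kl} := \hat m^*_{kl}\,\hat m_{kl}. \tag{6.12} \]

Physically, $\hat Q_{ij}$ corresponds to the question “Are the degrees of freedom $(i, j)$ fully occupied?” and $\hat P_{ij}$ corresponds to the question “Are the degrees of freedom $(i, j)$ fully unoccupied, i.e. is there vacuum in $(i, j)$?” It follows from (6.1) and (5.5) that they obey the following, obvious identities:

\[ \hat Q_{kl}\,\hat m_{kj} = \hat m_{kl}\,\hat P_{kj} = 0, \tag{6.13} \]
\[ \hat Q_{kj}\,\hat m_{ij} = \hat m_{kj}\,\hat P_{ij} = 0, \tag{6.14} \]
\[ \hat P_{kj}\,\hat Q_{kl} = \hat Q_{kl}\,\hat P_{kj} = 0, \tag{6.15} \]
\[ \hat P_{ij}\,\hat Q_{kj} = \hat Q_{kj}\,\hat P_{ij} = 0, \tag{6.16} \]
\[ \hat m_{kl}\,\hat Q_{kl} = \hat P_{kl}\,\hat m_{kl} = \hat m_{kl}, \tag{6.17} \]
\[ \hat m^*_{kl}\,\hat P_{kl} = \hat Q_{kl}\,\hat m^*_{kl} = \hat m^*_{kl}. \tag{6.18} \]

As a consequence we obtain

Lemma 2. The following identities hold:

\[ 0 = \left[\hat P_{ij}, \hat Q_{kl}\right] = \left[\hat P_{ij}, \hat P_{kl}\right] = \left[\hat Q_{ij}, \hat Q_{kl}\right], \tag{6.19} \]
\[ \hat P_{ij}\,\hat P_{kl} = \hat P_{il}\,\hat P_{kj}, \tag{6.20} \]
\[ \hat Q_{ij}\,\hat Q_{kl} = \hat Q_{il}\,\hat Q_{kj}. \tag{6.21} \]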

We leave the simple proof to the reader.

Theorem 2. The total charge operator defines a superselection rule in $\mathcal O$. We have

\[ \mathcal O = \bigoplus_Q\left\{\mathcal O_{\text{e-m}}(T) \otimes \mathcal O_m(Q)\right\}, \tag{6.22} \]

where $\mathcal O_m(Q)$ are the central decomposition components of $\mathcal O_m$.


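Proof. It is sufficient to show that all generators $\hat m$ commute with $\hat Q$ or, equivalently, with every projector $\hat P_Z$ given by formula (4.26). For this purpose let us observe that for $k \in V$ and $l \notin W$ and, similarly, for $k \notin V$ and $l \in W$, formulae (6.13) and (6.14) imply:

\[ \hat m_{kl}\,\hat Q_{(V,W)}\,\hat P_{(\mathcal N\setminus V,\mathcal N\setminus W)} = \hat Q_{(V,W)}\,\hat P_{(\mathcal N\setminus V,\mathcal N\setminus W)}\,\hat m_{kl} = 0. \tag{6.23} \]

The only nontrivial contribution to the commutator of $\hat m_{kl}$ with $\hat P_Z$ may come from those terms of the right-hand side of (4.26) which correspond to $k \in V$ and $l \in W$ or those for which $k \notin V$ and $l \notin W$. Let us begin with the first possibility. Due to (6.21) we have

\[ \hat Q_{(V,W)} = \hat Q_{kl}\,\hat Q_{(V\setminus\{k\},W\setminus\{l\})}. \tag{6.24} \]

Hence, we have

\[ \left[\hat m_{kl}, \hat Q_{(V,W)}\,\hat P_{(\mathcal N\setminus V,\mathcal N\setminus W)}\right] = \hat m_{kl}\,\hat Q_{kl}\,\hat Q_{(V\setminus\{k\},W\setminus\{l\})}\,\hat P_{(\mathcal N\setminus V,\mathcal N\setminus W)} = \hat P_{kl}\,\hat m_{kl}\,\hat Q_{(V\setminus\{k\},W\setminus\{l\})}\,\hat P_{(\mathcal N\setminus V,\mathcal N\setminus W)}. \tag{6.25} \]

Let us denote $V' := V\setminus\{k\}$ and $W' := W\setminus\{l\}$. Due to (6.20) we have

\[ \hat P_{(\mathcal N\setminus V',\mathcal N\setminus W')} = \hat P_{kl}\,\hat P_{(\mathcal N\setminus V,\mathcal N\setminus W)}. \tag{6.26} \]

Hence, we have

\[ \left[\hat m_{kl}, \hat Q_{(V,W)}\,\hat P_{(\mathcal N\setminus V,\mathcal N\setminus W)}\right] = \hat Q_{(V',W')}\,\hat P_{(\mathcal N\setminus V',\mathcal N\setminus W')}\,\hat m_{kl}. \tag{6.27} \]

Now we consider the second possibility. If $k \notin V'$ and $l \notin W'$, then Eqs. (6.13) and (6.14) imply:

\[ \left[\hat m_{kl}, \hat Q_{(V',W')}\,\hat P_{(\mathcal N\setminus V',\mathcal N\setminus W')}\right] = -\hat Q_{(V',W')}\,\hat P_{(\mathcal N\setminus V',\mathcal N\setminus W')}\,\hat m_{kl}. \tag{6.28} \]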

We see that these contributions cancel each other in the commutator of $\hat m_{kl}$ with $\hat P_Z$. Consequently, $\hat Q$ is a central element of $\mathcal O$. Thus, the algebra $\mathcal O_m$ splits, due to the central decomposition theorem [32], into a direct sum of components $\mathcal O_m(Q)$, corresponding to fixed eigenvalues $Q$ of the total charge. We prove in the next section that the total charge $\hat Q$ generates the whole center of $\mathcal O$.

7. Uniqueness of Irreducible Representations and Charge Superselection Sectors

Concerning representations of the algebra $\mathcal O_{\text{e-m}}(T)$ we restrict ourselves to strongly continuous representations of the Weyl relations. Due to [1] all strongly continuous representations of the Weyl relations are unitarily equivalent to at most a countable sum of copies of the Schrödinger representation. Thus, all representations of the observable algebra $\mathcal O$ may be constructed from the Schrödinger representation and from representations of $\mathcal O_m$. In this section we will construct all irreducible representations of the latter algebra.

In a first step, we are going to define a family of canonical representations of $\mathcal O_m$. We will show in the sequel that every irreducible representation is isomorphic to one of them. For this purpose we denote $\mathcal N = \{1, 2, \ldots, N\}$, where $N$ is, as in Sect. 4, the number of all positron degrees of freedom $\hat\phi_i$ (and also the number of all electron degrees of freedom $\hat\varphi_j$). For any integer $Z \in [-N+1, N-1]$ we define a finite-dimensional representation of $\mathcal O_m$ in the following way. We denote by $S_N$ the set of all subsets of $\mathcal N$ and take the free vector space

\[ \mathcal H_m(T) := \bigoplus_{(I,J)\in S_N\times S_N}\mathcal H_{(I,J)}, \qquad \mathcal H_{(I,J)} \cong \mathbb C, \]

over $S_N \times S_N$ (see [29]). Next we restrict ourselves to the subspace $\mathcal H_Z$ of such vectors that the number $\#I$ of elements of $I$ differs from the number $\#J$ of elements of $J$ exactly by $Z$:

\[ \mathcal H_Z := \bigoplus_{\#I - \#J = Z}\mathcal H_{(I,J)}. \tag{7.1} \]

We endow $\mathcal H_Z$ with a Hilbert space structure by choosing in each subspace $\mathcal H_{(I,J)} \cong \mathbb C$ the unit number $\xi_{(I,J)} := 1_{(I,J)} \in \mathcal H_{(I,J)}$ and treating $\{\xi_{(I,J)}\}$ as an orthonormal basis. We have, of course,

\[ \mathcal H_m(T) = \bigoplus_Z\mathcal H_Z. \tag{7.2} \]

We define the irreducible representation of the operators $\hat m$ and $\hat m^*$ on $H_Z$ as follows:

\[ \hat m_{ij}\,\Omega_{(I,J)} = \begin{cases} \mathrm{sgn}\big((i,j),(I\setminus\{i\},J\setminus\{j\})\big)\,\Omega_{(I\setminus\{i\},J\setminus\{j\})} & \text{if } i \in I \text{ and } j \in J, \\ 0 & \text{otherwise,} \end{cases} \tag{7.3} \]

\[ \hat m^*_{kl}\,\Omega_{(I,J)} = \begin{cases} \mathrm{sgn}\big((k,l),(I,J)\big)\,\Omega_{(I\cup\{k\},J\cup\{l\})} & \text{if } k \notin I \text{ and } l \notin J, \\ 0 & \text{otherwise.} \end{cases} \tag{7.4} \]

By $\mathrm{sgn}((k,l),(I,J))$ (where $k \notin I$ and $l \notin J$) we denote the parity ($\pm 1$) of the permutation which is necessary to reestablish the canonical order of the sequence $(k, l, i_1 \cdots i_k, j_1 \cdots j_l)$. It is easy to check that the operators $\hat m$ represented this way fulfil the defining relations (5.6) and (6.3) of the algebra $O_m$. The above constructed representations will be called Z-representations.

Remark 5. The Hilbert space $H_Z$ constructed above may also be treated as a subspace of the fermionic Fock space defined by $N$ (abstract) positron degrees of freedom $\phi_i$ and $N$ electron degrees of freedom $\varphi_j$:

\[ H_Z \subset F(\mathbb{C}^N) \otimes F(\mathbb{C}^N), \tag{7.5} \]

where by $F(\mathbb{C}^N)$ we denote the fermionic Fock space with $N$ generators, satisfying the canonical anticommutation relations. In this space, the vectors $\Omega_{(I,J)}$ may be represented as canonically ordered monomials $\phi^*_{i_1} \cdots \phi^*_{i_k} \varphi^*_{j_1} \cdots \varphi^*_{j_l}$ (where $i_1 < \cdots < i_k$ and $j_1 < \cdots < j_l$ are all elements of $I$ and $J$ respectively) of Grassmannian (anticommuting) variables $\phi^*$ and $\varphi^*$. They are obtained from the Fock vacuum by the action of creation operators:

\[ \phi^*_{i_1} \cdots \phi^*_{i_k} \varphi^*_{j_1} \cdots \varphi^*_{j_l} := \hat\phi^*_{i_1} \cdots \hat\phi^*_{i_k}\,|\omega_p\rangle \otimes \hat\varphi^*_{j_1} \cdots \hat\varphi^*_{j_l}\,|\omega_e\rangle \tag{7.6} \]

and $|\omega_p\rangle$ (respectively $|\omega_e\rangle$) denotes the (fermionic) Fock vacuum for positrons (resp. electrons). The subspace $H_Z$ corresponds to the value $Q = eZ$ of the total charge, as will be seen from Lemma 4. In the polynomial representation of $H_Z$, the operators $\hat m^*$ may be identified with multiplication operators by the Grassmannian 2nd order quantity $m^*_{ij} := \phi^*_i \varphi^*_j$. The coefficient arising in (7.4) is chosen in such a way that it reestablishes the canonical order in the product $\phi^*_k \varphi^*_l \phi^*_{i_1} \cdots \phi^*_{i_k} \varphi^*_{j_1} \cdots \varphi^*_{j_l}$.
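The Z-representation of (7.3) and (7.4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: indices run over 0, ..., N-1 instead of 1, ..., N, vectors are stored as dictionaries from labels (I, J) to coefficients, and the sign is computed by counting the transpositions needed to bring $\phi^*_k \varphi^*_l$ into canonical position, as described in Remark 5. The function names are hypothetical.

```python
from itertools import combinations
from math import comb

def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]

def basis(n, z):
    """Labels (I, J) of the basis vectors of H_Z, i.e. those with #I - #J = z."""
    return [(I, J) for I in subsets(n) for J in subsets(n) if len(I) - len(J) == z]

def sgn(k, l, I, J):
    """Parity of the reordering of (k, l, i_1.., j_1..); assumes k not in I, l not in J."""
    # The electron index l passes all of I and the smaller elements of J;
    # the positron index k passes the smaller elements of I.
    swaps = len(I) + sum(1 for j in J if j < l) + sum(1 for i in I if i < k)
    return -1 if swaps % 2 else 1

def m_star(k, l, vec):
    """Pair creation, Eq. (7.4), acting on a vector {(I, J): coefficient}."""
    out = {}
    for (I, J), c in vec.items():
        if k not in I and l not in J:
            key = (I | {k}, J | {l})
            out[key] = out.get(key, 0) + sgn(k, l, I, J) * c
    return out

def m(i, j, vec):
    """Pair annihilation, Eq. (7.3): the adjoint of m_star(i, j, .)."""
    out = {}
    for (I, J), c in vec.items():
        if i in I and j in J:
            key = (I - {i}, J - {j})
            out[key] = out.get(key, 0) + sgn(i, j, key[0], key[1]) * c
    return out
```

With these definitions, `m_star(k, l, m(k, l, v))` acts on a basis vector as the identity when k is in I and l is in J, and as zero otherwise, which is the behaviour stated for $\hat Q_{kl}$ in Lemma 3.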


J. Kijowski, G. Rudolph, A. Thielmann

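The following three lemmas characterize Z-representations.

Lemma 3. A vector $\Omega \in H_Z$ belongs to $H_{(I,J)}$ if and only if

\begin{align}
\hat Q_{ij}\,\Omega &= \Omega && \text{for } i \in I \text{ and } j \in J, \tag{7.7} \\
\hat P_{kl}\,\Omega &= \Omega && \text{for } k \notin I \text{ and } l \notin J, \tag{7.8} \\
\hat P_{ij}\,\Omega &= 0 && \text{for } i \in I \text{ or } j \in J, \tag{7.9} \\
\hat Q_{kl}\,\Omega &= 0 && \text{for } k \notin I \text{ or } l \notin J. \tag{7.10}
\end{align}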

Proof. The proof follows immediately by acting with $m_{ij}$ (or respectively $m^*_{ij}$) onto the left-hand sides of Eqs. (7.3) and (7.4).

Lemma 4. The projector $\hat P_Z$ given by formula (4.26) reduces to unity on the representation space $H_Z$ and vanishes on all the representation spaces $H_{\tilde Z}$, for $\tilde Z \neq Z$.

Proof. The value of the operator (4.26) on $H_Z$ may be easily checked by inspection.

This lemma shows that the canonical representations constructed above are labeled by the values of the total charge only: the space $H_Z$ is an eigenspace of $\hat Q$ given by (4.27), with eigenvalue $Q = eZ$.

Lemma 5. We have

\[ \hat m_{ij} H_{(I,J)} = \begin{cases} H_{(I\setminus\{i\},J\setminus\{j\})} & \text{if } i \in I \text{ and } j \in J, \\ 0 & \text{otherwise,} \end{cases} \tag{7.11} \]

and the operator $m_{ij}$, treated as a mapping from $H_{(I,J)}$ to $H_{(I\setminus\{i\},J\setminus\{j\})}$, is a unitary (intertwining) isomorphism. Equivalently, we have

\[ \hat m^*_{kl} H_{(I,J)} = \begin{cases} H_{(I\cup\{k\},J\cup\{l\})} & \text{if } k \notin I \text{ and } l \notin J, \\ 0 & \text{otherwise,} \end{cases} \tag{7.12} \]

and the operator $\hat m^*_{kl}$, treated as a mapping from $H_{(I,J)}$ to $H_{(I\cup\{k\},J\cup\{l\})}$, is a unitary (intertwining) isomorphism.

Proof. The proof follows immediately from (7.3) and (7.4).

Now we start to analyze arbitrary irreducible representations of $O_m$.

Definition 4. Let there be given an irreducible representation of $O_m$ with representation space $H$. For every pair $(I, J) \in S_N \times S_N$ we define the subspace $H_{(I,J)} \subset H$ as the space of all vectors $\Omega \in H$ fulfilling conditions (7.7)–(7.10).

Remark 6. Among the four equations (7.7)–(7.10) only two are independent:

a) (7.7) and (7.9) in the case $I = N = J$. Equations (7.8) and (7.10) are trivially redundant.

b) (7.8) and (7.10) in the case $I = \emptyset = J$. Equations (7.7) and (7.9) are trivially redundant.

c) (7.7) and (7.8) in the case $\emptyset \neq I \neq N$ and $\emptyset \neq J \neq N$. They imply Eqs. (7.9) and (7.10). To show, for instance, that (7.7) implies (7.9), suppose that $i \in I$. Then, due to (7.7), we can represent $\Omega = \hat Q_{il}\Omega$ for some $l \in J$. Acting with $\hat P_{ij}$ onto this equation and using (6.15) gives (7.9). A similar argument may be used if $j \in J$.

Algebra of Observables and Charge Superselection Sectors for QED


d) (7.7) and (7.10) in the case $I = N$ and $\emptyset \neq J \neq N$. Equation (7.8) is redundant and (7.7) implies (7.9). To prove it we represent $\Omega = \hat Q_{il}\Omega$ for some $l \in J$. Acting with $\hat P_{ij}$ onto (7.7) and using (6.15) we obtain (7.9). The same is true for the case $J = N$ and $\emptyset \neq I \neq N$.

e) (7.8) and (7.9) in the case $I = \emptyset$ and $\emptyset \neq J \neq N$. Equation (7.7) is redundant and, using the same argument as above, (7.8) implies (7.10). Again, the same is true for the case $J = \emptyset$ and $\emptyset \neq I \neq N$.

Theorem 3. For every nontrivial irreducible representation of $O_m$ there is a nontrivial subspace $H_0 \subset H$ satisfying conditions (7.7)–(7.10) for some $(I_0, J_0) \in S_N \times S_N$.

In this section we give the proof for the simplest case $Z = 0$ (the complete proof is contained in the Appendix). It is based on the analysis of “long projectors”, i.e. products of projectors $\hat P_{ij}$ and $\hat Q_{kl}$ having different indices. Let $K_-$, $K_+$, $L_-$ and $L_+$ be four subsets of $N$ such that $\#K_- = \#L_-$, $\#K_+ = \#L_+$, $K_- \cap K_+ = \emptyset$ and $L_- \cap L_+ = \emptyset$. We put:

\[ \hat P_{(K_-,L_-)(K_+,L_+)} = \prod_{i \in K_-} \prod_{j \in L_-} \hat P_{ij} \prod_{k \in K_+} \prod_{l \in L_+} \hat Q_{kl}. \tag{7.13} \]

Due to properties (6.20) and (6.21) we are able to reorganize the product on the right-hand side in such a way that each index $i$, $j$, $k$ and $l$ appears only once. It may happen that some of these projectors vanish identically on the representation space $H$. Consider a non-vanishing projector of maximal length $\#K = \#L$, where $K := (K_- \cup K_+)$ and $L := (L_- \cup L_+)$, and define $|Z| := N - \#K = N - \#L$. This means that every product of projectors $\hat P$ and $\hat Q$ having different indices, which is longer than $N - |Z|$, vanishes identically on $H$. In particular, $Z = 0$ means that there is at least one non-vanishing long projector of length $N$.

Proof. Let (7.13) be a non-vanishing projector of length $N$. We take

\[ H_0 := \hat P_{(K_-,L_-)(K_+,L_+)} H \tag{7.14} \]

and put $I_0 := K_+$, $J_0 := L_+$. Properties (7.7)–(7.10) follow immediately from (6.15), (6.16) and (6.19).

Of course, we have $|Z| < N$ in the general case, because projectors of “length 1” do not vanish. Indeed, vanishing of $\hat P_{ij}$ would imply vanishing of $\hat m^*_{ij}$ and vanishing of $\hat Q_{ij}$ would imply vanishing of $\hat m_{ij}$, according to the identities:

\begin{align}
\|\hat m^*_{ij}\Omega\|^2 &= (\Omega\,|\,\hat P_{ij}\Omega), \tag{7.15} \\
\|\hat m_{ij}\Omega\|^2 &= (\Omega\,|\,\hat Q_{ij}\Omega). \tag{7.16}
\end{align}

This is excluded by the fact that the representation is nontrivial. Indeed, due to identity (6.2), the equation $\hat m_{i_0 j_0} = 0$ implies $\hat m_{ij} = 0$ for any $i$ and $j$.

Every nonvanishing long projector of maximal length defines a non-trivial subspace

\[ H_{(K_-,L_-)(K_+,L_+)} := \hat P_{(K_-,L_-)(K_+,L_+)} H. \tag{7.17} \]

In the general case this space cannot be taken directly as H0 of Theorem 3. From the physical point of view, we know only that the degrees of freedom corresponding to K+ and L+ are occupied and those corresponding to K− and L− are unoccupied. Hence,


we have to check what happens with the remaining degrees of freedom. We prove in the Appendix that there are only two possible situations: either all positron degrees of freedom in $N \setminus K$ are occupied and all electron degrees of freedom in $N \setminus L$ are unoccupied, or vice versa. In the first case we take $(I_0, J_0) := (N \setminus K_-, L_+)$ and put $Z := |Z|$, whereas in the second case we take $(I_0, J_0) := (K_+, N \setminus L_-)$ and put $Z := -|Z|$. Finally, we prove that the space (7.17) fulfills conditions (7.7)–(7.10).

Lemma 6. Let there be given a subspace $H_0 \subset H$ fulfilling conditions (7.7)–(7.10) for some $I = I_0$ and $J = J_0$. For $(i, j) \in (I_0, J_0)$ we define $H_1 := \hat m_{ij} H_0$ and for $(k, l) \notin (I_0, J_0)$ we define $H_2 := \hat m^*_{kl} H_0$. These spaces have the following properties:

1. $H_1$ fulfills conditions (7.7)–(7.10), for $I = I_0 \setminus \{i\}$ and $J = J_0 \setminus \{j\}$.
2. $H_2$ fulfills conditions (7.7)–(7.10), for $I = I_0 \cup \{k\}$ and $J = J_0 \cup \{l\}$.
3. $H_0$, $H_1$ and $H_2$ are mutually orthogonal.
4. We have

\[ \hat m^*_{ij} H_1 = H_0 = \hat m_{kl} H_2. \tag{7.18} \]

The operators $\hat m$ and $\hat m^*$ are (mutually inverse) isometries between these spaces.

Proof. 1. We consider the generic case $\emptyset \neq I_0 \setminus \{i\} \neq N$ and $\emptyset \neq J_0 \setminus \{j\} \neq N$ (see Remark 6) and leave the remaining cases to the reader. Take any $\omega = \hat m_{ij}\Omega \in H_1$, where $\Omega \in H_0$, $k \in I_0 \setminus \{i\}$ and $l \in J_0 \setminus \{j\}$. Then (7.7) follows from:

\[ \hat Q_{kl}\,\omega = \hat Q_{kl}\hat m_{ij}\Omega = \hat m_{ij}\hat Q_{kl}\Omega = \hat m_{ij}\Omega = \omega. \]

Observe that, for $k \notin I_0 \setminus \{i\}$ and $l \notin J_0 \setminus \{j\}$, we have three possibilities: 1) $k \neq i$ and $l \neq j$, 2) $(k, l) = (i, j)$, 3) only one pair of indices coincides. In the first case the proof of (7.8) is similar to the proof of (7.7) above. In the second case (7.8) is implied by identity (6.17). In the last case, say $k = i$ whereas $l \neq j$, (6.2) implies:

\[ \hat P_{il}\,\omega = \hat m_{il}\hat m^*_{il}\hat m_{ij}\Omega = (\hat m_{ij} - \hat m_{ij}\hat Q_{il})\Omega = \omega. \]

2. The proof for $H_2$ is analogous to the proof under point 1.

3. For $\Omega, \Omega' \in H_0$ we have $(\hat m_{ij}\Omega\,|\,\Omega') = (\Omega\,|\,\hat m^*_{ij}\Omega')$. But $\hat m^*_{ij}\Omega' = \hat m^*_{ij}\hat P_{ij}\Omega' = 0$ for $\Omega' \in H_0$, which proves orthogonality of $H_1$ and $H_0$. Replacing $(i, j)$ with $(k, l)$ we have $\hat m_{kl}\Omega' = \hat m_{kl}\hat Q_{kl}\Omega' = 0$ for $\Omega' \in H_0$, which proves orthogonality of $H_0$ and $H_2$.

4. For any $\omega = \hat m_{ij}\Omega \in H_1$, where $\Omega \in H_0$, we have $\hat m^*_{ij}\omega = \hat Q_{ij}\Omega = \Omega$. Moreover, $(\hat m_{ij}\Omega\,|\,\hat m_{ij}\Omega) = (\Omega\,|\,\hat Q_{ij}\Omega) = (\Omega\,|\,\Omega)$. Similarly, we prove that $\hat m^*_{kl}$ is an isometry from $H_0$ to $H_2$.

Lemma 7. For every irreducible representation of $O_m$ there is an integer $Z$ ($|Z| < N$) such that the representation space $H$ has the following structure:

1.
\[ H = \bigoplus_{\#I - \#J = Z} H_{(I,J)} \tag{7.19} \]
with $H_{(I,J)}$ given by Definition 4.

2. We have
\[ \hat m_{ij} H_{(I,J)} = \begin{cases} H_{(I\setminus\{i\},J\setminus\{j\})} & \text{if } i \in I \text{ and } j \in J, \\ 0 & \text{otherwise,} \end{cases} \tag{7.20} \]
and the operator $m_{ij}$ is a unitary isomorphism between these spaces.


Proof. Given $H_0$ fulfilling (7.7)–(7.10), for some $I = I_0$ and $J = J_0$, we define $\tilde H_{(I_0,J_0)} := H_0$ and use formulae (7.20) and (7.21) as a recursive definition of the remaining spaces $\tilde H_{(I,J)}$ such that $\#I - \#J = Z$. Properties (5.6) and (6.3) imply that adding elements to $I$ and $J$ by the action of $\hat m^*_{kl}$ or removing elements by the action of $\hat m_{ij}$ does not depend upon the order of these operations. Due to Lemma 6, the subspaces $\tilde H_{(I,J)}$ defined this way fulfil (7.7)–(7.10) and are mutually isomorphic via the intertwiners $\hat m$ and $\hat m^*$. They are orthogonal to each other because of an argument similar to that used in the proof of point 3 of Lemma 6. It follows that the direct sum $\tilde H$ of these spaces carries a nontrivial representation of our algebra. Irreducibility of the representation implies that $\tilde H$ coincides with the whole representation space $H$. Consequently, we have $\tilde H_{(I,J)} = H_{(I,J)}$, i.e. formula (7.19) holds.

Remark 7. Obviously, point 2 of this lemma may be equivalently formulated in terms of the intertwiner $\hat m^*_{ij} : H_{(I\setminus\{i\},J\setminus\{j\})} \to H_{(I,J)}$ (cf. (7.12)):

\[ \hat m^*_{kl} H_{(I,J)} = \begin{cases} H_{(I\cup\{k\},J\cup\{l\})} & \text{if } k \notin I \text{ and } l \notin J, \\ 0 & \text{otherwise.} \end{cases} \tag{7.21} \]

These mappings are mutually inverse, because $\hat m^*_{ij}\hat m_{ij} = \hat Q_{ij}$ reduces to the identity on $H_{(I,J)}$ and $\hat m_{ij}\hat m^*_{ij} = \hat P_{ij}$ reduces to the identity on $H_{(I\setminus\{i\},J\setminus\{j\})}$.

Theorem 4. Every irreducible representation of $O_m$ is unitarily equivalent to one of the Z-representations defined by (7.3) and (7.4).

Proof. It remains to prove that the spaces $H_{(I,J)}$ are one-dimensional. Choosing a unit vector $\omega_{(I_0,J_0)}$ in one of them, say $H_{(I_0,J_0)}$, and constructing for each $(I, J)$ one vector $\omega_{(I,J)} \in H_{(I,J)}$ by acting with $\hat m$'s and $\hat m^*$'s on $\omega_{(I_0,J_0)}$, according to formulae (7.3) and (7.4), we obtain an invariant subspace of the representation spanned by all these $\omega$'s. This subspace is isomorphic to the canonical representation space $H_Z$ by identification of the $\omega$'s with the $\Omega$'s. The irreducibility of our representation implies that this subspace must be equal to the entire Hilbert space $H$.

As a result of this discussion, the physical Hilbert space $H$ is the unique (up to unitary equivalence) representation space of the observable algebra $O$. Due to (7.2) we have

\[ H = \bigoplus_{Z} \big\{ H_{e\text{-}m}(T) \otimes H_Z \big\}, \tag{7.22} \]

where $H_{e\text{-}m}(T)$ is the space of bosonic wave functions depending on the magnetic variables $B_{x,x+\hat k}(T)$, equipped with the standard $L^2$-Hilbert space structure. This is the decomposition of the physical Hilbert space into charge superselection sectors, corresponding to the direct sum decomposition (6.22) of $O$. This means that $\hat Q$ acts on each sector $H_{e\text{-}m}(T) \otimes H_Z$ as $eZ\,\mathbb{1}$ and that it generates the whole center of $O$. Finally, we stress that, since $O$ does not depend upon the choice of a tree, the representation space $H$ is also independent of this choice.


8. Wave Function Description of Physical States and Relation to the Gauge Fixing Approach

As already mentioned above, $H_{e\text{-}m}(T)$ is the space of bosonic wave functions depending on the magnetic variables $B_{x,x+\hat k}(T)$. Treating them as combinations (4.1) of the potentials $A_{x,x+\hat k}$ we may interpret elements of $H_{e\text{-}m}(T)$ as gauge-invariant wave functions in the sense of Sect. 3. Here we also give a wave function interpretation to elements of $H_m(T)$. For this purpose we denote $A_i := \int_{(x_0,x)_T} A$, where $x \in \Lambda^0$ corresponds to the $i$th degree of freedom. Moreover, for $x \in \partial\Lambda^0$ we define $A_{(x_0,x,\infty)_T} := A_i + A_{x,\infty}$. We identify $\Omega_{(I,J)} \in H_{(I,J)}$ with the following gauge-invariant product:

\[ \Omega_{(I,J)} := \Big( \prod_{i \in I} \exp(-igA_i)\,\phi^*_i \Big) \Big( \prod_{j \in J} \exp(igA_j)\,\varphi^*_j \Big) \Big( \prod_{x \in \partial\Lambda^0} \exp\big( \tfrac{i}{\hbar} E_{x,\infty} A_{(x_0,x,\infty)_T} \big) \Big), \tag{8.1} \]

where the boundary data $E_{x,\infty}$ are multiples of $e$ and have to fulfil (3.11). Observe that tensorising these two types of wave functions gives the whole space of gauge-invariant wave functions with a uniquely defined Hilbert space structure inherited from $H_{e\text{-}m}(T)$ and $H_m(T)$. It is easy to see that the representation of $O$ constructed in the previous section coincides with the representation of gauge-invariant combinations of the operators $(\hat A, \hat E, \hat\psi, \hat\psi^*)$ acting on such wave functions, in the sense of Sects. 2 and 3.

We conclude that the Hilbert space $H$, which was obtained as the unique representation space of $O$, has a natural realization as the space of gauge-invariant wave functions $\Psi = \Psi(A, \phi^*, \varphi^*)$. In this realization no tree-decomposition is necessary. However, the tree is useful to explicitly describe the scalar product in the space of such wave functions:

Definition 5. Given a tree $T$ we define the scalar product on the space of gauge invariant wave functions in the following way:

1. We choose the “tree gauge”, i.e. we fix arbitrarily all the values of on-tree variables $A_{x,x+\hat k}$, $(x, x+\hat k) \in T$, and all the values of external variables $A_{x,\infty}$.
2. We treat $\Psi$ as a function of the remaining variables: off-tree potentials $A_{x,x+\hat k}$, $(x, x+\hat k) \notin T$, and Grassmann variables $(\phi^*, \varphi^*)$.
3. We take the $L^2$-scalar product with respect to these variables (integration with respect to Grassmann variables is meant in the sense of Berezin).

One easily shows the following

Proposition 5. The above scalar product does not depend upon the choice of both the tree and the tree gauge. It coincides, after identification (8.1), with the previously introduced, canonical scalar product in $H$.

Remark 8. Irreducible representations of $O$ corresponding to the same value of $Q$ are equivalent, even if they differ by the boundary data $E_{x,\infty}$.
The intertwining operator between two such representations is obtained by replacing the last factor in (8.1), corresponding to the first choice of data, by the factor corresponding to the second choice of data.
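The role of the tree in Definition 5 can be illustrated on a toy lattice. The sketch below is not taken from the paper: it assumes a single square plaquette with sites 0, 1, 2, 3, a root $x_0 = 0$, real-valued classical potentials with $A_{yx} = -A_{xy}$, and a gauge transformation of the hypothetical form $A_{xy} \to A_{xy} + \lambda_y - \lambda_x$. It evaluates the closed-loop combination that appears in Eq. (8.2) and shows that it is gauge invariant and, in the tree gauge with vanishing on-tree potentials, equal to the off-tree potential.

```python
def tree_path(parent, x):
    """Sites on the unique tree path from the root to x (root first)."""
    path = [x]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

def potential(A, x, y):
    """Oriented link potential, using A[(y, x)] = -A[(x, y)]."""
    return A[(x, y)] if (x, y) in A else -A[(y, x)]

def path_sum(A, sites):
    """Sum of link potentials along a sequence of neighbouring sites."""
    return sum(potential(A, a, b) for a, b in zip(sites, sites[1:]))

def flux(A, parent, x, y):
    """Loop sum root -> x -> y -> root for an off-tree link (x, y), cf. Eq. (8.2)."""
    to_x = path_sum(A, tree_path(parent, x))
    back = path_sum(A, tree_path(parent, y)[::-1])
    return potential(A, x, y) + to_x + back

def gauge_transform(A, lam):
    """Assumed form of a gauge transformation: A_xy -> A_xy + lam_y - lam_x."""
    return {(x, y): a + lam[y] - lam[x] for (x, y), a in A.items()}

# Tree 0-1-2-3 rooted at 0; the only off-tree link is (3, 0).
parent = {0: None, 1: 0, 2: 1, 3: 2}
A = {(0, 1): 0.5, (1, 2): 1.0, (2, 3): -0.25, (3, 0): 2.0}
lam = {0: 1.0, 1: -2.0, 2: 0.5, 3: 4.0}
print(flux(A, parent, 3, 0), flux(gauge_transform(A, lam), parent, 3, 0))
```

Because the loop is closed, the $\lambda$ contributions cancel term by term, so the same number is printed twice; this is the sense in which the off-tree variables carry the gauge-invariant information once the on-tree variables are fixed.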


Remark 9. For the special tree gauge, $A_{x,x+\hat k} = 0$ for $(x, x+\hat k) \in T$ and $A_{x,\infty} = 0$, $\Omega_{(I,J)}$ defined by (8.1) coincides with $\Omega_{(I,J)}$ given by (7.6).

Using the gauge fixing concept we can also define physically meaningful potential operators $\hat A_{x,x+\hat k}$. We call a gauge fixing condition complete if it allows us to express potential operators uniquely in terms of magnetic fluxes. The standard example of a complete gauge fixing is the tree gauge, where we choose arbitrarily all values of the along-tree potential operators $\hat A_{x,x+\hat k}$, for $(x, x+\hat k) \in T$. Then the off-tree potential operators are in one-to-one correspondence with the along-tree magnetic flux operators

\[ \hat B_{x,x+\hat k}(T) = \hat A_{x,x+\hat k} + \hat A_{(x_0,x)_T} + \hat A_{(x+\hat k,x_0)_T}, \qquad (x, x+\hat k) \notin T, \tag{8.2} \]

because the last two terms of the above sum are given by the gauge fixing condition. We stress that the along-tree potentials have to be chosen as self-adjoint operators (e.g. c-numbers). Then the off-tree potential operators defined by (8.2) act automatically as self-adjoint operators on the above defined Hilbert space of gauge invariant wave functions. In this gauge, the representation of fermionic operators on this space is obvious: $\hat\phi^*_i$ and $\hat\varphi^*_j$ act as multiplication operators, whereas $\hat\phi_i$ and $\hat\varphi_j$ act as differentiation operators. Of course, the gauge invariant generators $\{\hat E, \hat B, \hat M, \hat M^*\}$ of our observable algebra can be uniquely expressed in terms of combinations of operators $\hat A$ and $\hat\psi_{xa}$, fulfilling the above gauge condition.

Finally, we note that a tree gauge may be easily replaced by any elliptic gauge condition, i.e. a condition which enables us to solve Eqs. (4.1) uniquely, at a given instant of time, with respect to potentials. A typical example is provided by the Coulomb gauge:

\[ \mathrm{div}_x \hat A := \sum_{\hat k} \hat A_{x,x+\hat k} = \hat F_x, \tag{8.3} \]

where the operators $\hat F_x$ are self-adjoint, commute with all the magnetic fluxes and satisfy the global condition $\sum_x \hat F_x = 0$. For a lattice with non-vanishing boundary $\partial\Lambda \neq \emptyset$, this is not a complete gauge: we still have the freedom to fix boundary values $\hat A_{x,x+\hat k}$ for $(x, x+\hat k) \in \partial\Lambda^1$. An additional (“residual”) gauge condition may be imposed: $\mathrm{div}^{(2)}_x \hat A := \sum_{(x,x+\hat k) \in \partial\Lambda^1} \hat A_{x,x+\hat k} = \hat C_x$, for any $x \in \partial\Lambda^0$. Observe that the operators $\hat C$ have to fulfil the consistency condition $\sum_{x \in \partial\Lambda^0} \hat C_x = 0$. One easily shows that the above gauge fixing, together with Eq. (4.1), uniquely defines all operators $\hat A_{x,x+\hat k}$ as linear combinations of magnetic flux operators $\hat B_{x;\hat k,\hat l}$ and gauge fixing operators $\hat F_x$ and $\hat C_x$.

9. Local Observables and Lattice Quantum Hamiltonian

Operators $\hat M$ are non-local. Having in mind a possible continuum limit of the theory we would like to be able to describe it in terms of local quantities. For this purpose we may restrict ourselves to creation and annihilation operators of pairs, which are located at the same point or separated by at most one lattice link:

\begin{align}
\hat u_{x;L}{}^{K} &:= \hat\varphi_{x;L}\,\hat\phi^{K}_{x}, \tag{9.1} \\
\hat w_{x,x+\hat k;L}{}^{K} &:= \hat\varphi_{x+\hat k;L}\,\exp(-ig\hat A_{x,x+\hat k})\,\hat\phi^{K}_{x}. \tag{9.2}
\end{align}


Using (4.33) we can express all pair creation and annihilation operators along any long path γ as a multiple commutator of the above short pair creation operators assigned to links and points belonging to γ. This way, the observable algebra O may be viewed ˆ B, ˆ u, as the algebra generated by the following set of local generators: {E, ˆ w}. ˆ Of course, the above generators are not independent. They fulfil conditions (4.30)– ˆ replaced by u’s (4.36), whenever both sides of the equation, with M ˆ and w’s, ˆ make sense. For concrete future calculations one has to choose a set of independent generators, e. g., one may choose any three u’s ˆ at each lattice site and one wˆ on each link. The quantum evolution is governed by the second quantized Hamiltonian (2.2). Using standard lattice approximation recipes one gets: Hˆ = Hˆ e-m + Hˆ m + Hˆ kin , 2 1 X 2 1 X ˆ Hˆ e-m = Ex,x+kˆ + Bˆ x;k, ˆ lˆ 2 2 ˆ ˆ lˆ (x,x+k) x;k, X ∗ (tr uˆ x ) , tr uˆ x Hˆ m = m ~ Hˆ kin = − a

x

X

sgn kˆ · Im

n

o K σ kL K · wˆ x,x+k;L . ˆ

(9.3) (9.4) (9.5) (9.6)

x,kˆ

The above Hamiltonian is bounded from below, because $\hat H_{e\text{-}m}$ is positive definite and the remaining terms are bounded operators. Hence, we may define the dynamical vacuum as the minimal energy state in the vacuum sector $Q = 0$ and the notion of a dressed particle as the minimal energy state in the $Q = e$ sector, with appropriately chosen boundary data $E_{x,\infty}$. We stress that these states have nothing to do with the perturbative vacuum and the notion of a bare particle in the perturbative approach.

There is, probably, no way to obtain an exact, analytic expression for these states, even on the lattice level, and a numerical analysis will remain as the only tool to investigate the spectrum of the operator $\hat H$. There is, however, an interesting idea (see [28]), namely to consider $\hat H_{kin}$ as the perturbation to the operator $\hat H_0 := \hat H_{e\text{-}m} + \hat H_m$. We stress that this type of perturbative approach has nothing to do with “switching the interaction off”, because quantum states corresponding to $g = 0$ and to $g \neq 0$ belong to completely different Hilbert spaces.

10. Towards Continuum Theory

Here, we present some heuristic ideas concerning the construction of the full continuum quantum theory. Heuristically, the algebra of observables of the continuum theory should be constructed as an inductive limit of our algebras $O_\Lambda$, describing a finite number of degrees of freedom, related to the finite lattice $\Lambda$ (see e.g. [33] and [34]). For this purpose an order relation “$\prec$” in the set of finite lattices has to be chosen. We say that the lattice $\Lambda_2$ is “later” than $\Lambda_1$ (or $\Lambda_1 \prec \Lambda_2$) if it describes more field degrees of freedom than $\Lambda_1$ does. Thus, being “later” means being “bigger” respectively “finer”, or both. Given a pair $\Lambda_1 \prec \Lambda_2$, there is, obviously, a natural embedding

\[ P_{\Lambda_2,\Lambda_1} : O_{\Lambda_1} \to O_{\Lambda_2} \tag{10.1} \]

which preserves the properties (4.29)–(4.36). The Gauss law (5.1), however, is no longer true in its original version but in a new version, with the charge jˆx0 replaced by the sum of


all charges $\hat j^0_{x_i}$ corresponding to the sites $x_i$ of $\Lambda_2$ which are contained in the same cell of the dual lattice as $x$. There is a natural compatibility relation, $P_{\Lambda_3,\Lambda_2} P_{\Lambda_2,\Lambda_1} = P_{\Lambda_3,\Lambda_1}$, if $\Lambda_1 \prec \Lambda_2 \prec \Lambda_3$.

The inductive limit of our observable algebras describes, in principle, degrees of freedom of the continuum theory. To avoid singular objects, we may smear the fields $\hat E$, $\hat B$ and $\hat u$ with sufficiently regular test functions and obtain this way “observable-valued distributions”. The field $\hat w$ cannot be smeared directly, because of its non-additive character. A natural way to encode the information about the field $\hat w$ in an “observable-valued distribution” consists in replacing it by the field

\[ \hat v_{k;L}{}^{K}(x) := \lim_{a \to 0} \frac{1}{a} \Big( \hat w_{x,x+\hat k;L}{}^{K} - \hat u_{x;L}{}^{K} \Big) \simeq D_k \hat\varphi_{x;L}\,\hat\phi^{K}_{x}. \tag{10.2} \]

The results obtained in [4] and [5] suggest that $\hat v$ might arise as one of the fundamental fields of the continuum theory.

Once the observable algebra of the continuum theory is given, its Hilbert space representations may be constructed via the GNS construction, provided a vacuum state is given. This idea is also followed in [33]. But there it was proposed to use a perturbative vacuum, constructed in a fully kinematic way (in the case of bosonic degrees of freedom the construction of the Hilbert space proposed in [33] consists, in fact, in choosing the Gaussian wave function as a vacuum state). In our opinion such a choice might possibly lead to an unphysical sector of the theory, because it is not plausible that the perturbative vacuum belongs to the physical sector.

A possible way to avoid this difficulty could be based on an idea presented in [34], where one approximates the vacuum state of the continuum theory by the true vacuum states of its lattice approximations. For this purpose observe that the space of states $S_\Lambda$ (not necessarily pure states, but all the mixed states) may be treated as being dual to the observable algebra $O_\Lambda$. This implies that we have a family of dual mappings

\[ P^*_{\Lambda_2,\Lambda_1} : S_{\Lambda_1} \leftarrow S_{\Lambda_2} \tag{10.3} \]

defined for $\Lambda_1 \prec \Lambda_2$. Physically, $O_{\Lambda_1}$ may be thought of as a subsystem of a bigger physical system $O_{\Lambda_2}$, containing more degrees of freedom. The above mapping assigns, to every state of a bigger system, a mixed state of the subsystem. This state is obtained by “forgetting” about those degrees of freedom which are not contained in the subsystem. Let $\omega_\Lambda \in S_\Lambda$ denote the vacuum state, corresponding to the minimum of the Hamiltonian (9.3). Let us project the vacuum from “finer lattices” backwards to “coarser lattices” and define

\[ \omega_{\Lambda_1,\Lambda_2} := P^*_{\Lambda_2,\Lambda_1}\,\omega_{\Lambda_2} \in S_{\Lambda_1}. \tag{10.4} \]

Suppose that the limit $\Omega_{\Lambda_1} = \lim_{\Lambda_2} \omega_{\Lambda_1,\Lambda_2}$ exists. If this is true for every $\Lambda_1$, then the compatibility condition $\Omega_{\Lambda_1} = P^*_{\Lambda_2,\Lambda_1}\Omega_{\Lambda_2}$ is automatically fulfilled and the state $\Omega := \{\Omega_\Lambda\}$ belongs to the projective limit of the spaces $S_\Lambda$. Therefore, it defines a state on the inductive limit of the algebras $O_\Lambda$. As a limit of approximate vacuum states, it is a natural candidate for the non-perturbative vacuum of the continuum theory and the starting point for the GNS construction of the Hilbert space of its quantum states.

The existence of the above limit of vacuum states may be extremely difficult to prove. A realistic attitude consists, therefore, in a detailed analysis of lattice approximations of the theory presented in this paper: if the continuum limit of these approximations does exist, the numerical results obtained on the level of a sufficiently “late” lattice $\Lambda$ should approximate the true physical quantities.
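The “forgetting” map (10.3) has a familiar finite-dimensional analogue, the partial trace. The following sketch is an illustration only and rests on a simplifying assumption that is not made in the text: the bigger algebra is taken to be a tensor product, with the embedding (10.1) modelled as $A \mapsto A \otimes 1$. Matrices are plain nested lists and all function names are hypothetical.

```python
def partial_trace(rho, d_keep, d_drop):
    """Reduce a matrix on C^d_keep (x) C^d_drop to C^d_keep by tracing out the second factor."""
    return [[sum(rho[a * d_drop + e][b * d_drop + e] for e in range(d_drop))
             for b in range(d_keep)]
            for a in range(d_keep)]

def embed(op, d_drop):
    """Model of the embedding (10.1): op -> op (x) identity on C^d_drop."""
    n = len(op) * d_drop
    return [[op[a // d_drop][b // d_drop] if a % d_drop == b % d_drop else 0
             for b in range(n)]
            for a in range(n)]

def expectation(rho, op):
    """Value tr(rho op) of the state rho on the observable op."""
    n = len(rho)
    return sum(rho[a][b] * op[b][a] for a in range(n) for b in range(n))

# A pure entangled state on C^2 (x) C^2 reduces to a mixed state on C^2.
bell = [[0.5, 0, 0, 0.5],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0.5, 0, 0, 0.5]]
reduced = partial_trace(bell, 2, 2)
obs = [[1, 2], [2, -1]]
print(reduced, expectation(bell, embed(obs, 2)), expectation(reduced, obs))
```

In this toy model the defining duality, namely that the reduced state evaluated on an observable equals the big state evaluated on the embedded observable, holds by construction, and reducing in two steps agrees with reducing in one step, which mirrors the compatibility relation for the embeddings.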


A. Appendix: Proof of Theorem 3

Consider a long projector (7.13) of maximal length $\#K = \#L = N - |Z|$. Identities (6.17) and (6.18) imply that for $k \in K_+$ and $l \in L_+$ the operator $\hat m_{kl}$ maps $H_{(K_-,L_-)(K_+,L_+)}$ into $H_{(K_-\cup\{k\},L_-\cup\{l\})(K_+\setminus\{k\},L_+\setminus\{l\})}$ and $\hat m^*_{kl}$ maps the latter space back into the first one. These mappings are mutually inverse. We have, therefore, constructed a net of mutually isomorphic subspaces, with the operators $\hat m_{kl}$ and $\hat m^*_{kl}$ playing the role of unitary intertwining operators between them. However, this construction makes sense only for $k \in K$ and $l \in L$. Observe that for $\Omega \in H_{(K_-,L_-)(K_+,L_+)}$ and the “external indices” $r \notin K$, $s \notin L$ we have

\[ \hat Q_{rs}\Omega = 0 = \hat P_{rs}\Omega. \tag{A.1} \]

Indeed, the assumption that the contrary is true, i.e. that we have

\[ 0 \neq \hat Q_{rs}\Omega = \hat Q_{rs}\hat P_{(K_-,L_-)(K_+,L_+)}\Omega = \hat P_{(K_-,L_-)(K_+\cup\{r\},L_+\cup\{s\})}\Omega \tag{A.2} \]

or

\[ 0 \neq \hat P_{rs}\Omega = \hat P_{rs}\hat P_{(K_-,L_-)(K_+,L_+)}\Omega = \hat P_{(K_-\cup\{r\},L_-\cup\{s\})(K_+,L_+)}\Omega, \tag{A.3} \]

would contradict our assumption about vanishing of all the projectors which are longer than the maximal rank $N - |Z|$.

We conclude that $\hat m_{rs}$ and $\hat m^*_{rs}$, with “external indices” $r \notin K$, $s \notin L$, annihilate all these subspaces $H_{(K_-,L_-)(K_+,L_+)}$, because of (7.15) and (7.16). Suppose that also all the remaining operators $\hat m^*_{rl}$ and $\hat m^*_{ks}$, with $r \notin K$, $l \in L$, $k \in K$, $s \notin L$, annihilate all these subspaces. The identities $\hat m_{rl} = [\hat m_{rs}, [\hat m^*_{ks}, \hat m_{kl}]]$ and $\hat m_{ks} = [\hat m_{kl}, [\hat m^*_{rl}, \hat m_{rs}]]$ imply that also $\hat m_{rl}$ and $\hat m_{ks}$ annihilate them. Hence, the sum of these subspaces is an invariant subspace for the representation. It follows from the irreducibility of the representation that this sum is equal to the entire Hilbert space. It is annihilated by at least one among the operators $\hat m$. This implies that the representation is trivial.

We conclude that, for a non-trivial representation, at least one among the operators $\hat m^*_{rl}$ and $\hat m^*_{ks}$ does not vanish identically on the sum of our subspaces. We restrict ourselves to the discussion of the case that $\hat m^*_{k_0 s_0}$, with $k_0 \in K$, $s_0 \notin L$, does not vanish identically on the direct sum of the spaces $H_{(K_-,L_-)(K_+,L_+)}$. The second case can be dealt with in a completely analogous way.

Of course, $\hat m^*_{ks}$ vanishes on all those subspaces for which $k \in K_+$, because we have

\[ \hat m^*_{ks}\Omega = \hat m^*_{ks}\hat Q_{kl}\Omega = \hat m^*_{ks}\hat m^*_{kl}\hat m_{kl}\Omega = 0 \tag{A.4} \]

for some $l \in L_+$ (the identity $\hat m^*_{ks}\hat m^*_{kl} = 0$ follows from (5.5)). We suppose, therefore, that there is a vector $\Omega$ belonging to the direct sum of the subspaces (7.17) having $k_0 \in K_-$, and such that

\[ \hat m^*_{k_0 s_0}\Omega \neq 0 \tag{A.5} \]

for some $s_0 \notin L$.

Lemma 8. For $k_0 \in K_-$ and $s_0 \notin L$, the operator $\hat m^*_{k_0 s_0}$ does not vanish identically on any of the subspaces $H_{(K_-,L_-)(K_+,L_+)}$.


Proof. Suppose, on the contrary, that m ˆ ∗k0 s0 vanishes on one of these subspaces. We ˆ ij , with i ∈ K, know that there exists a polynomial combination of operators m ˆ ∗ij and m j ∈ L and i 6= k0 , giving a unitary isomorphism between any two such spaces. Since ˆ ij 6= 0, these operators commute with m ˆ ∗k0 s0 , we have for m ˆ ij |m ˆ ∗k0 s0 m ˆ ij = m ˆ ∗k0 s0 |m ˆ ∗k0 s0 Qˆ ij = m ˆ ∗k0 s0 |m ˆ ∗k0 s0 , (A.6) m ˆ ∗k0 s0 m ˆ ij by m ˆ ∗ij and because Qˆ ij = in this case. A similar argument works if we replace m ∗ ˆ ˆ Qij by Pij . This means that vanishing of m ˆ k0 s0 on one of these subspaces would imply vanishing on all of them. We conclude that, for k0 ∈ K− and s0 6∈ L, also Pˆk0 s0 does not vanish identically on any of the subspaces H(K− ,L− )(K+ ,L+ ) . Let us, therefore, choose one of them and let us consider the corresponding restricted subspace H0 := Pˆk0 s0 H(K− ,L− )(K+ ,L+ )

(A.7)

with k0 ∈ K− and s0 6∈ L. Lemma 9. Let ∈ H0 . For k 6∈ N \ K− and s 6∈ L+ we have Pˆks = .

(A.8)

Proof. 1. For k ∈ K− and s ∈ L− the assertion follows trivially from the fact that Pˆ(K− ,L− )(K+ ,L+ ) contains Pˆks as a factor. 2. It remains to consider the case k ∈ K− and s 6∈ L. For that purpose choose an arbitrary l ∈ L− and take the following operator ∗ ˆ kl , m ˆ k0 l . (A.9) lˆk0 k := m In a first step we show that for any ∈ H(K− ,L− )(K+ ,L+ ) we have: ˆ ∗k0 s0 = m ˆ ∗ks0 lˆk0 k m

(A.10)

where k ∈ K− and k0 ∈ K− . Indeed, ∗ ∗ lˆk0 k m ˆ ∗k0 s0 = m ˆ ∗kl m ˆ k0 l − m ˆ k0 l m ˆ ∗kl m ˆ k 0 s0 m ˆ ∗k0 l m ˆ kl − m ˆ k 0 s0 m ˆ kl m ˆ ∗k0 l ˆ k 0 s0 = m ∗ = m ˆ ks0 − m ˆ kl m ˆ ∗k0 l m ˆ k 0 s0 + m ˆ ∗k0 l m ˆ kl m ˆ k 0 s0 ˆ ∗k0 s0 m ˆ k0 l m ˆ ∗kl + m ˆ ∗k0 s0 m ˆ ∗kl m ˆ k0 l . (A.11) =m ˆ ∗ks0 − m ˆ k0 l m ˆ ∗k0 l . Hence m ˆ k0 l = 0 and the last term vanishes. Also the But, = Pˆk0 l = m ˆ ∗kl m ˆ k0 l = 0 according to (6.9). This proves second term vanishes, because we have m ˆ k0 l m ˆ ∗ks0 = m ˆ ∗k0 s0 . Hence: (A.10). Changing the role of k0 and k, we have also lˆkk0 m km ˆ ∗ks0 k2 = lˆk0 k m ˆ ∗k0 s0 |m ˆ ∗ks0 = m ˆ ∗k0 s0 |lˆkk0 m ˆ ∗ks0 =k m ˆ ∗k0 s0 k2 . (A.12) In a similar way, using operator ˆ is0 , m ˆ ∗is , (A.13) rˆss0 := m ˆ ∗ks0 k2 for any k ∈ K− and with an arbitrary i 6∈ K, we prove that k m ˆ ∗ks k2 =k m s 6∈ L.


We conclude that for every k ∈ K− and s 6∈ L, the following identity holds: ˆ ∗k0 s0 k or, equivalently, km ˆ ∗ks k=k m |Pˆks = |Pˆk0 s0

(A.14)

for ∈ H(K− ,L− )(K+ ,L+ ) . Hence, for vectors fulfilling Pˆk0 s0 = we have Pˆks = . Lemma 10. Let there be given a pair (k, l) ∈ (K, L) and let us denote in accordance with (A.7), (A.15) H1 := Pˆk0 s0 H(K− \{k},L− \{l})(K+ ∪{k},L+ ∪{l}) , for (k, l) ∈ (K− , L− ), and H2 := Pˆk0 s0 H(K− ∪{k},L− ∪{l})(K+ \{k},L+ \{l}) ,

(A.16)

for (k, l) ∈ (K+ , L+ ). 1. The operators m ˆ ∗kl map H0 isomorphically onto H1 if k ∈ K− and l ∈ L− , and vanish otherwise. 2. The operators m ˆ kl map H0 isomorphically onto H2 if k ∈ K+ and l ∈ L+ , and vanish otherwise. Proof. Let ∈ H0 . We know already that m ˆ ∗kl ∈ H(K− \{k},L− \{l})(K+ ∪{k},L+ ∪{l}) .

(A.17)

Using the fact that m ˆ ∗kl commutes with Pˆis , for i 6= k and s 6∈ L, we get Pˆis m ˆ ∗kl = m ˆ ∗kl for i ∈ K− \ {k} and l ∈ L− \ {l}. Analogously, we have m ˆ kl ∈ H(K− ∪{k},L− ∪{l})(K+ \{k},L+ \{l}) . Choosing any i ∈ K− we have m ˆ ks = m ˆ ks Pˆis = 0. Hence Pˆks m ˆ kl = m ˆ ks m ˆ ∗ks m ˆ kl = m ˆ kl − m ˆ kl m ˆ ∗ks m ˆ ks = m ˆ kl .

(A.18)

(A.19)

Obviously, m ˆ kl and m ˆ ∗kl are mutually inverse isometries, because m ˆ kl m ˆ ∗kl = Pˆkl ∗ ˆ reduces to the identity in the first part and m ˆ kl m ˆ kl = Qkl reduces to the identity in the second part of the lemma. Lemma 11. Let ∈ H0 . For k ∈ N \ K− or s ∈ L+ we have Pˆks = 0 .

(A.20)

Proof. 1. For k ∈ K+ and s arbitrary the proof is trivial, because Pˆ(K− ,L− )(K+ ,L+ ) contains a factor Qˆ kr , which is annihilated by Pˆks , due to (6.15). 2. For k arbitrary and s ∈ L+ a similar argument applies, because Pˆ(K− ,L− )(K+ ,L+ ) contains a factor Qˆ js , which is annihilated by Pˆks , due to (6.16). 3. It remains to consider the case k 6∈ K and s arbitrary. Since Pˆks = m ˆ ks m ˆ ∗ks , it ∗ is sufficient to prove that the operators m ˆ ks , with k 6∈ K, annihilate H0 : Indeed, this statement is trivially true if s ∈ L+ or s 6∈ L. For the case s ∈ L− we have


∗ m ˆ ∗ks = m ˆ kl , m ˆ ∗rl , m ˆ rs = m ˆ ∗rs m ˆ rl m ˆ ∗kl + m ˆ ∗kl m ˆ rl m ˆ ∗rs − m ˆ rl m ˆ ∗rs m ˆ ∗kl − m ˆ ∗rs m ˆ ∗kl m ˆ rl , (A.21) for any k ∈ K− and arbitrary l. But the first and the third operator on the right-hand ˆ ∗kl are external side annihilate every ∈ H(K− ,L− )(K+ ,L+ ) , because both indices of m with respect to (K, L). This is also true for the second and the last term, because m ˆ rl annihilates H0 . Lemma 12. Let ∈ H0 . For k ∈ N \ K− and s ∈ L+ we have Qˆ ks = .

(A.22)

Proof. 1. For k ∈ K+ and s ∈ L+ the assertion follows trivially from the fact that P(K− ,L− )(K+ ,L+ ) contains Qˆ ks as a factor. 2. It remains to consider the case k 6∈ K and s ∈ L+ . Choosing any r ∈ K+ and l 6∈ L, we obtain ˆ kl , m ˆ ∗rl , m ˆ rs = m ˆ ∗rl m ˆ rs , (A.23) ˆ kl m m ˆ ks = m because the remaining terms coming from the double commutator annihilate . Hence, ˜ := m denoting ˆ rs , we get ˜ m ˜ . ˆ rl m ˆ ∗kl m ˆ kl m ˆ ∗rl (A.24) km ˆ ks k2 = | But

˜ =m ˜ =m ˜ ˆ rl m ˆ ∗kl m ˆ kl m ˆ ∗rl ˆ ∗rl m ˆ kl m ˆ ∗kl ˆ ∗rl ˆ ∗rl − m ˆ rl m m ˆ rl m

(A.25)

because both k and l are external indices. Due to Lemma 10, m ˆ ∗rl preserves the length ˜ of m ˆ rs = and mrs preserves the length of . Hence, ˜ k2 =k k2 ˜ k2 =k ˆ ks k2 =k m ˆ ∗rl (A.26) |Qˆ rl =k m which implies (A.22).

Lemma 13. Let ∈ H0 . For k 6∈ N \ K− or s 6∈ L+ we have Qˆ ks = 0 .

(A.27)

Proof. 1. For k ∈ K− and s arbitrary the proof is trivial, because Pˆ(K− ,L− )(K+ ,L+ ) contains a factor Pˆkr , which is annihilated by Qˆ ks , due to (6.15). 2. For k arbitrary and s ∈ L− a similar argument applies, because Pˆ(K− ,L− )(K+ ,L+ ) contains a factor Pˆjs , which is annihilated by Qˆ ks , due to (6.16). 3. It remains to consider the case k arbitrary and s 6∈ L. Since Qˆ ks = m ˆ ∗ks m ˆ ks , it is sufficient to prove that the operators m ˆ ks annihilate all the subspaces H(K− ,L− )(K+ ,L+ ) . This is obviously the case for k 6∈ K+ . Thus, it remains to consider the case k ∈ K+ only. Choosing any l ∈ L+ , we have (A.28) km ˆ ks k2 = |Qˆ ks = |Qˆ ks Qˆ kl . But Since

Qˆ ks Qˆ kl = m ˆ ∗ks m ˆ ks m ˆ ∗kl m ˆ kl = m ˆ ∗kl − m ˆ ∗kl m ˆ ks m ˆ ∗ks m ˆ kl . m ˆ ∗ks

(A.29)

preserves the length of m ˆ kl , we get km ˆ ks k2 =k m ˆ kl k2 − k m ˆ ∗ks m ˆ kl k2 = 0 .

(A.30)

Equations (A.8), (A.20), (A.22) and (A.27) prove that the subspace H0, defined by (A.7), satisfies conditions (7.7)–(7.10), if we put I = N \ K− and J = L+.

Acknowledgement. The authors are very much indebted to B. Crell, A. Uhlmann, S. L. Woronowicz and C. Śliwa for helpful discussions and remarks. One of the authors (J. K.) is grateful to the Polish National Committee for Scientific Research (KBN), Warsaw, for financial support.

References

1. v. Neumann, J.: Math. Ann. 104, 570 (1931)
2. Jordan, P. and Wigner, E.P.: Zeitschr. f. Phys. 47, 631 (1928)
3. Gårding, L. and Wightman, A.S.: Proc. of the Nat. Acad. of Sc. 40, 617 (1954)
4. Kijowski, J., Rudolph, G.: Lett. Math. Phys. 29, 103 (1993)
5. Kijowski, J., Rudolph, M. and Rudolph, G.: Lett. Math. Phys. 33, 139 (1995)
6. Takabayashi, T.: Suppl. Progr. Theor. Phys., No. 4, 1 (1957)
7. Mickelson, J.: Czech J. Phys. B 32, 521 (1982)
8. Mikhov, S.G. and Stoyanov, D.T.: Preprint E2-12865 Dubna (1979)
9. Jakubiec, A., Kijowski, J.: Lett. Math. Phys. 9, 1 (1985); Jakubiec, A.: Lett. Math. Phys. 9, 171 (1985)
10. Kijowski, J. and Rudolph, G.: Nucl. Phys. B325, 211 (1989)
11. Faddeev, L.D. and Popov, V.N.: Phys. Lett. B25, 30 (1967)
12. Mandelstam, S.: Ann. Phys. 19, 1 (1962); Phys. Rev. D, 175, N. 5, 1580 (1968)
13. Kijowski, J. and Rudolph, G.: Lett. Math. Phys. 16, 27 (1988)
14. Kijowski, J. and Rudolph, G.: Phys. Rev. D31, 856 (1985); Rudolph, G.: Annalen der Physik, 7. Folge, Bd. 47, 2/3, 211 (1990)
15. Kijowski, J., Thielmann, A.: J. of Geom. and Phys. 19, 173 (1996)
16. Kijowski, J., Rudolph, G. and Rudolph, M.: Effective Bosonic Degrees of Freedom for One-flavour Chromodynamics. To appear in Ann. Inst. H. Poincaré
17. Berezin, F.A.: The Method of Second Quantization. New York–London: Academic Press, 1966
18. Haag, R. and Kastler, D.: J. Math. Phys. 5, 848 (1964)
19. Doplicher, S., Haag, R. and Roberts, J.: Commun. Math. Phys. 23, 199 (1971)
20. Strocchi, F. and Wightman, A.: J. Math. Phys. 15, 2198 (1974)
21. Strocchi, F.: Commun. Math. Phys. 56, 57 (1977)
22. Strocchi, F.: Phys. Rev. D 17, 2010 (1978)
23. Fröhlich, J.: Commun. Math. Phys. 66, 223 (1979)
24. Buchholz, D.: Commun. Math. Phys. 85, 49 (1982); Buchholz, D.: Phys. Lett. B174, 331 (1986)
25. Fredenhagen, K. and Marcu, M.: Commun. Math. Phys. 92, 81 (1983)
26. Seiler, E.: Gauge Theories as a Problem of Constructive Quantum Field Theory and Statistical Mechanics. Lecture Notes in Phys. Vol. 159, Berlin–Heidelberg–New York: Springer, 1982; Seiler, E.: Constructive Quantum Field Theory: Fermions. In: Gauge Theories: Fundamental Interactions and Rigorous Results, eds. P. Dita, V. Georgescu, R. Purice
27. Kuchař, K.: Phys. Rev. D34, 3031 (1986); Phys. Rev. D34, 3044 (1986)
28. Białynicki-Birula, I.: The Hamiltonian of Quantum Dynamics. In: QED and Quantum Optics. Ed. A.O. Barut, New York: Plenum Press, 1984
29. Greub, W.H.: Linear Algebra. Berlin–Heidelberg–New York: Springer-Verlag, 1967
30. Woronowicz, S.L.: Rev. Math. Phys. 7, 481 (1995)
31. Baaj, S. and Jungl, P.: C. R. Acad. Sci. Paris, Série I, 296, 875 (1983)
32. Dixmier, J.: Les C*-Algèbres et leurs Représentations. Paris: Gauthier-Villars Editeur, 1969
33. Ashtekar, A. and Lewandowski, J.: J. of Geom. Phys. 17, 191 (1995)
34. Kijowski, J.: Rep. Math. Phys. 11, 97 (1977); Zakrzewski, S.: On the lattice approximation and Feynman path integrals for gauge fields. Thesis, Warsaw 1980, unpublished; Zakrzewski, S.: Unitary relations. Publicações de Física Matemática 4, (1985), Coimbra

Communicated by G. Felder

Commun. Math. Phys. 188, 565 – 584 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Diffusive Hydrodynamic Limits for Systems of Interacting Diffusions with Finite Range Random Interaction

René A. Carmona*, Lin Xu**

Statistics & Operations Research Program, CEOR, Princeton University, Princeton, NJ 08544, USA

Received: 3 October 1996 / Accepted: 13 February 1997

* Partially supported by ONR N00014-91-1010.
** Supported by ONR N00014-91-1010.

Abstract: We consider a system of interacting diffusive particles with finite range random interaction. The variables can be interpreted as charges at sites indexed by a periodic multidimensional lattice. The equilibrium states of the system are canonical Gibbs measures with finite range random interaction. Under the diffusive scaling of lattice spacing and time, we derive a deterministic nonlinear diffusion equation for the time evolution of the macroscopic charge density. This limit is almost sure with respect to the random environment.

1. Model and Main Results

The study of the scaling limit for large random dynamical systems with conservation laws has attracted the attention of many mathematical physicists in the last fifteen years. In particular, diffusive scaling limits for time dependent Ginzburg-Landau (TDGL) models have been obtained under various conditions; [F1, F2, GPV, R] are relevant references. Except for the remarkable work of [R], the diffusive scaling limit for the TDGL with general finite range interaction was always studied without the possibility of a phase transition, and even [R] is restricted to translation invariant Gibbs states. Meanwhile, especially in statistical physics and material science, much attention has been devoted to Gibbs states in a random environment (see [KS] for details). Naturally, it is of great interest to extend the work of [R] to systems of interacting diffusions with finite range random interaction where the extremal equilibria of the systems are canonical Gibbs states with finite range random interaction. Given the state of the current mathematical technology, there are substantial difficulties to overcome in order to analyze such models completely. Even though many new methods have been developed (see for example [S]), the entropy production method which appeared first in [GPV] remains the most general and effective approach. Like many other contributions to the field, the work of [R] uses the entropy production method as the main tool and it relies heavily on the translation invariance of the system. Unfortunately, the only translation invariance that random systems enjoy is invariance of the statistics, and obviously, for each fixed realization of the random environment, translation invariance fails. It is not clear how one should deal with this difficulty. The main thrust of the paper is to "average out the random environment" first: in some sense, we homogenize the medium before we perform the hydrodynamic limit. To achieve this goal, we have to let the scaling parameter $N$ go to infinity while keeping good control of the averages over large intermediate size blocks (the size of which should be smaller than $N$).

Let us describe our specific model. For any integer $N$, let
\[
T_N^d = \Big\{ \Big(\frac{i_1}{N}, \dots, \frac{i_d}{N}\Big) \in T^d : i_1, \dots, i_d = 1, \dots, N \Big\},
\]
where $T^d$ denotes the $d$-dimensional torus $\mathbb{R}^d/\mathbb{Z}^d$, and let $X^N = \mathbb{R}^{T_N^d}$ be the configuration space over $T_N^d$. A configuration in $X^N$ is denoted by $\bar x = \{x_i;\ i \in T_N^d\}$. Suppose that $\phi : \mathbb{R} \to \mathbb{R}$ is a continuously differentiable function with the following properties:
\[
\int \exp[-\phi(x)]\,dx = 1, \tag{1}
\]
\[
\int \exp[\lambda x - \phi(x)]\,dx < \infty, \tag{2}
\]
\[
\int \exp[\lambda \phi'(x) - \phi(x)]\,dx < \infty \tag{3}
\]
for all $\lambda$ in $\mathbb{R}$. Let $F$ be a function which depends on a fixed finite number of coordinates. We assume that $F$ is bounded, continuously differentiable and has bounded first derivatives. Let $\{\beta_i;\ i \in \mathbb{Z}^d\}$ be a set of independent and identically distributed (i.i.d. for short) random variables uniformly distributed on $[0, 1]$. These random variables account for the impurities in the medium. The law of $\bar\beta = \{\beta_i;\ i \in \mathbb{Z}^d\}$ is denoted by $\bar l$ and the average corresponding to $\bar l$ is denoted by $\langle\,\cdot\,\rangle$. The interaction energy corresponding to the random interaction field $\{\beta_i F(\tau_i \bar x);\ i \in \mathbb{Z}^d\}$ is defined by:
\[
H_N = \sum_{i \in T_N^d} [\beta_i F(\tau_i \bar x) + \phi(x_i)], \tag{4}
\]

where $\tau_i$ denotes the shift operator in the space $T_N^d$. Our dynamics $\bar x(t)$ for the charges $\bar x$ is defined as the diffusion process in the phase space $X^N$ with the generator
\[
L_N = \frac{N^2}{2} \sum_{i,j} \left[ \Big(\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_j}\Big)^2 - \Big(\frac{\partial H_N}{\partial x_i} - \frac{\partial H_N}{\partial x_j}\Big)\Big(\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_j}\Big) \right], \tag{5}
\]
where the sum is over the adjacent sites $i$ and $j$ in $T_N^d$. This generator is reversible with respect to the Gibbs measure $\nu_N$ defined by:
\[
\nu_N(d\bar x) = \frac{1}{Z_N} \exp[H_N]\,d\bar x, \tag{6}
\]
where $Z_N$ is as usual a normalizing constant. Our primary interest is the behavior of the macroscopic charge measure:
\[
\mu_N(t) = \frac{1}{N^d} \sum_{i \in T_N^d} x_i(t)\,\delta_{i/N} \tag{7}
\]
as $N$ tends to infinity. Here $\delta_{i/N}$ is the unit point mass at $i/N$. $\mu_N(t)$ should be viewed as a signed measure on $T^d$. Let the diffusion $\bar x(t)$ start from an initial distribution with density $f_N^0$ with respect to the Gibbs measure $\nu_N$ and let us assume that $f_N^0$ satisfies the entropy condition:
\[
\Big\langle \limsup_{N\to\infty} \frac{1}{N^d} \int f_N^0 \log f_N^0 \,d\nu_N \Big\rangle < \infty. \tag{8}
\]
Then the dynamics gives us the density $f_N^t$ for the distribution of $\bar x(t)$. $f_N^t$ satisfies:
\[
\frac{\partial f_N^t}{\partial t} = L_N f_N^t. \tag{9}
\]
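As a small numerical illustration of the macroscopic charge measure (7), the following Python sketch evaluates the pairing of $\mu_N$ with a test function $J$, that is $N^{-d}\sum_i J(i/N)\,x_i$. The function name `pair_with_test_function`, the profile `m0` and the choice of `J` are hypothetical and serve only as an example; they are not taken from the paper.

```python
import numpy as np

def pair_with_test_function(x, J):
    """Return N^{-d} * sum_i J(i/N) * x_i for a charge array x of shape (N,)*d."""
    N = x.shape[0]
    d = x.ndim
    # Lattice sites i/N with i_k = 1, ..., N, as in the definition of T_N^d.
    axes = [np.arange(1, N + 1) / N for _ in range(d)]
    grid = np.meshgrid(*axes, indexing="ij")
    return float(np.sum(J(*grid) * x) / N**d)

# Hypothetical example: deterministic charges following a smooth profile on the 1-d torus.
N = 2000
theta = np.arange(1, N + 1) / N
m0 = 1.0 + 0.5 * np.cos(2 * np.pi * theta)
value = pair_with_test_function(m0, lambda t: np.cos(2 * np.pi * t))
```

For this deterministic profile the pairing reduces to a Riemann sum of $\int J(\theta)\,m_0(\theta)\,d\theta$, which is the quantity compared with the empirical sum in the assumption on the initial data.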

We further assume the existence of a continuous function $m_0(\theta)$ such that for any positive number $\delta$ and any continuous function $J(\theta)$ we have:
\[
\lim_{N\to\infty} \int_{E_{N,\delta}} f_N^0 \,d\nu_N = 0, \tag{10}
\]
where
\[
E_{N,\delta} = \Big\{ \bar x;\ \Big| \frac{1}{N^d} \sum_{i \in T_N^d} J\Big(\frac{i}{N}\Big) x_i - \int J(\theta)\, m_0(\theta)\,d\theta \Big| \ge \delta \Big\}.
\]
We shall use the notation $h$ for the convex conjugate of the specific free energy $\psi$ defined by:
\[
\psi(\lambda) = \lim_{N\to\infty} \frac{1}{N^d} \log \int \exp\Big[\lambda \sum_i x_i\Big]\, \nu_N(d\bar x). \tag{11}
\]
It is well known that $\psi$ is a deterministic convex function (even though the interaction is random). Consequently, $h$ is also a deterministic convex function. The main result of this paper is the following.

Theorem 1.1. If the initial density of charges $f_N^0$ satisfies (8) and (10), then for all $t \ge 0$, every smooth function $J$ and each $\delta > 0$ we have
\[
\lim_{N\to\infty} \int_{E_{N,\delta}^t} f_N^t \,d\nu_N = 0
\]
with $\bar l$-probability one, where
\[
E_{N,\delta}^t = \Big\{ \bar x;\ \Big| \frac{1}{N^d} \sum_{i \in T_N^d} J\Big(\frac{i}{N}\Big) x_i - \int J(\theta)\, m(t,\theta)\,d\theta \Big| \ge \delta \Big\},
\]
and $m(t,\theta)$ is the unique weak solution of the nonlinear parabolic equation:
\[
\frac{\partial m}{\partial t} = \frac{1}{2}\,\Delta\, h'(m(t,\theta)), \qquad m(0,\theta) = m_0(\theta), \tag{12}
\]
where $\Delta$ is the Laplacian on the torus $T^d$.
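To make the limiting equation (12) concrete, here is a minimal Python sketch of an explicit finite-difference scheme for $\partial_t m = \tfrac12 \Delta h'(m)$ on the one-dimensional torus. It is only an illustration under stated assumptions: the function name `evolve`, the grid size, the time step and the choice $h'(m) = m$ (which corresponds to $F = 0$ with a standard Gaussian single-site measure) are hypothetical and not taken from the paper.

```python
import numpy as np

def evolve(m0, hprime, t_final, dt):
    """Explicit Euler scheme for dm/dt = (1/2) * Laplacian(h'(m)) on the 1-d torus."""
    m = np.asarray(m0, dtype=float).copy()
    n = m.size
    dx = 1.0 / n
    steps = int(round(t_final / dt))
    for _ in range(steps):
        g = hprime(m)
        # Periodic second difference of h'(m).
        lap = (np.roll(g, 1) - 2.0 * g + np.roll(g, -1)) / dx**2
        m = m + 0.5 * dt * lap
    return m

n = 64
theta = np.arange(n) / n
m_init = 1.0 + 0.5 * np.cos(2 * np.pi * theta)
# With h'(m) = m the explicit scheme is stable when 0.5 * dt / dx**2 <= 0.5.
dt = 0.2 / n**2
m_final = evolve(m_init, lambda m: m, t_final=0.05, dt=dt)
```

Because the right-hand side is a discrete divergence, the scheme conserves the total charge up to rounding, mirroring the conservation law behind the hydrodynamic limit. A degenerate (non strictly increasing) $h'$ would need a more careful scheme than this sketch.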

The function $h'$ is strictly increasing when $F = 0$. It is well known in statistical mechanics that for some nonzero $F$, the function $h'$ is not strictly increasing. This could make the evolution (12) degenerate. This corresponds to a phase transition. Nevertheless, as in [R], we can still derive the hydrodynamic equation for the charge density $m$ even in the presence of the phase transition.

The rest of the paper is organized as follows. In Sect. 2, we give an ergodic theorem for the canonical Gibbs states corresponding to the interaction under consideration. As in most of nonequilibrium statistical mechanics, we need to understand what is meant by equilibrium states in our model. Fortunately this can be achieved fairly easily by combining the work of [C] and the large deviation techniques used in Sect. 2 of [R]. Throughout this part we mostly state results without proofs, since the proofs can be derived from these two works in a straightforward manner. In Sect. 3, we establish a local ergodic theorem away from equilibrium (one and two block estimates). It is only at this point that new ideas have to be introduced. They are needed to deal with the random environment. In Sect. 4, we establish the a priori estimates for the macroscopic density $m(t, \theta)$ which are necessary for the uniqueness of the limiting equation. We derive this hydrodynamic equation in Sect. 5.

Our model falls in the category of gradient models. See [S] for a definition. Some earlier works in the context of large scale behavior for TDGL are [CY, Y, Fu]. Among them, [Fu] is for continuous space models under conditions that exclude phase transition. The nongradient version of our model is still a completely open problem, even though remarkable progress has recently been made. [V1, Q, VY, X] are relevant references for such models.

2. Ergodic Theorem for Canonical Gibbs State

Throughout the rest of the paper, we use the following notation. $\mathcal{X}$ is a Polish space, i.e. a metrizable complete separable topological space, $\mathcal{B}(\mathcal{X})$ is its Borel $\sigma$-field and $\mathcal{P}(\mathcal{X})$ is the set of probability measures on $\mathcal{X}$. Let $W = \mathbb{R} \times [0, 1]$, and let $\pi : W \to [0, 1]$ be the canonical projection. We consider the configuration spaces $\mathcal{W} = W^{\mathbb{Z}^d}$, $\mathcal{X} = \mathbb{R}^{\mathbb{Z}^d}$ and $\mathcal{Y} = [0, 1]^{\mathbb{Z}^d}$. All these product spaces are endowed with their respective product topologies and as a consequence, they are all Polish spaces. An element in $\mathcal{X}$ is usually denoted by $\bar x = (x_i : i \in \mathbb{Z}^d)$. Similar notations are used to represent elements of $\mathcal{W}$ and $\mathcal{Y}$. The projection $\pi$ induces a projection $\Pi$ from $\mathcal{W}$ onto $\mathcal{Y}$, $\Pi \bar w = \bar y$ with $\pi w_i = y_i$, and $\Pi$ itself induces a projection $\tilde\Pi$ from $\mathcal{P}(\mathcal{W})$ into $\mathcal{P}(\mathcal{Y})$, $\tilde\Pi(Q) = Q \circ \Pi^{-1}$. For each $i$ we let $\tau_i$ be the translation operator on $\mathcal{X}$ defined by $(\tau_i \bar x)(j) = \bar x(i + j)$. The translations $\tau_i^w$ and $\tau_i^y$ are defined in a similar way. We denote by $\mathcal{P}_s(\mathcal{W})$ the space of translation invariant probability measures on $\mathcal{W}$. It is endowed with the topology of weak convergence as usual. For each integer $n$, we define the box:
\[
\Lambda_n = \{ i \in \mathbb{Z}^d;\ -n \le i_k \le n \text{ for all } k \text{ and } i = (i_k,\ 1 \le k \le d) \}.
\]
For each $\bar w$ and $n$, we define $\bar w^n$ by:
\[
\bar w^n(i) = \bar w(i) \ \text{ for all } i \in \Lambda_n, \qquad \bar w^n(i + (2n+1)e_k) = \bar w^n(i) \ \text{ for all } i \in \mathbb{Z}^d, \tag{13}
\]
where $e_k$ is the $k$th vector in the canonical basis of $\mathbb{Z}^d$. The empirical field is defined as:
\[
R_{\Lambda_n, \bar w} = \frac{1}{|\Lambda_n|} \sum_{i \in \Lambda_n} \delta_{\tau_i \bar w^n}, \tag{14}
\]
where $|\Lambda_n|$ is the cardinality of $\Lambda_n$. Obviously, $R_{\Lambda_n, \bar w} \in \mathcal{P}_s(\mathcal{W})$. Let $\phi : \mathbb{R} \to \mathbb{R}$ be a continuously differentiable function satisfying the properties (1), (2) and (3) stated in the introduction for all $\lambda$ in $\mathbb{R}$. Equations (2) and (3) are equivalent to saying that there exists a convex symmetric function $\gamma(x)$ satisfying:
\[
\lim_{x\to\infty} \frac{x}{\gamma(x)} = 0, \qquad \gamma(x) \ge x \ \text{ for all } x,
\]
and
\[
\int e^{\gamma(x) - \phi(x)}\,dx < \infty \quad\text{and}\quad \int e^{\gamma(\phi'(x)) - \phi(x)}\,dx < \infty \tag{15}
\]

(see [GPV] for a proof). Consider the probability measures $\rho(dx) = e^{-\phi(x)}\,dx$, $l(dy) = dy$ and $\nu(dw) = \rho(dx) \times dy$ on $X = \mathbb{R}$, $[0, 1]$ and $W$ respectively. For each subset $T$ of $\mathbb{Z}^d$, the measures $\rho^T$, $l^T$ and $\nu^T$ are product measures of $\rho$ on $X^T$, $l$ on $[0, 1]^T$ and $\nu$ on $W^T$. If $T = \mathbb{Z}^d$, we simply denote the product measures by $\bar\rho$, $\bar l$ and $\bar\nu$. Define $\beta_i(\bar w) = y_i$, where $w_i = (x_i, y_i)$. Then $\{\beta_i;\ i \in \mathbb{Z}^d\}$ is an i.i.d. random field on $\mathbb{Z}^d$ whose common distribution is the uniform distribution on $[0, 1]$. Since $\mathcal{W}$ is a Polish space, we can define a regular version $\bar\nu(\cdot\,|\,\bar y)$ of $\bar\nu$ conditionally on $\bar\pi \bar w = \bar y$. Before stating the large deviation principle for $R_{n,\bar w}$ under $\bar\nu(\cdot\,|\,\bar y)$, we recall the definition of the relative entropy. For $Q \in \mathcal{P}_s$, we set:
\[
H(Q_{\Lambda_n} | \bar\nu_{\Lambda_n}) = \sup_G \Big\{ \int G\,dQ - \ln \int e^G \,d\bar\nu \Big\}, \tag{16}
\]
where the supremum is over all the bounded continuous functions $G$ on $\mathcal{W}$ which depend only on $\{w_i;\ i \in \Lambda_n\}$, and we define
\[
H(Q | \bar\nu) = \lim_{n\to\infty} \frac{1}{|\Lambda_n|} H(Q_{\Lambda_n} | \bar\nu_{\Lambda_n}).
\]
The limit exists because of the subadditivity of $H(Q_{\Lambda_n} | \bar\nu_{\Lambda_n})$. It is known [C] that the following large deviation principle holds.

Lemma 2.1. The large deviation principle for the sequence of conditional distributions $\bar\nu(R_{n,\bar w} \in \cdot\,|\,\bar y)$ of the empirical processes under $\bar\nu$ given $\bar\pi \bar w = \bar y$ holds with $\bar l$-probability one. The rate function $I$ is given on $\mathcal{P}(\mathcal{W})$ by:
\[
I(Q) = \begin{cases} H(Q|\bar\nu) & \text{if } \tilde\Pi(Q) = \bar l, \\ +\infty & \text{otherwise.} \end{cases}
\]

We shall denote by $C^0_{loc}(\mathcal{W})$ (resp. $C^0_{loc}$) the space of bounded continuous functions on $\mathcal{W}$ (resp. $\mathcal{X}$) which depend only upon finitely many coordinates. If $\bar x \in \mathcal{X}$ and $T$ is a subset of $\mathbb{Z}^d$, $\bar x_T$ denotes the restriction of $\bar x$ to $T$. Let $\bar z$ be another configuration; then $\bar x_T \vee \bar z$ denotes the configuration which agrees with $\bar x$ on $T$ and with $\bar z$ on $T^c = \mathbb{Z}^d - T$. If $F \in C^0_{loc}$ depends only upon the coordinates $x_i$ for $i$ in a finite set $\Lambda \subset \mathbb{Z}^d$, we define the interaction energy by:
\[
H_{F,T,\bar\beta} = \sum \beta_i F(\tau_i \bar x),
\]
where the sum is over the indices $i$ such that $i + \Lambda \subset T$. Given a boundary condition $\bar z$ in $\mathcal{X}$, we define
\[
H_{F,T,\bar\beta,\bar z} = H_{F,T,\bar\beta} + \sum \beta_i F((\tau_i \bar x)_T \vee \bar z),
\]
where the sum is over the indices $i$ such that
\[
(i + \Lambda) \cap T \ne \emptyset \quad\text{and}\quad (i + \Lambda) \cap T^c \ne \emptyset.
\]
In the same way $\tilde H_{G,T}$ and $\tilde H_{G,T,\bar z}$ are defined for $G \in C^0_{loc}(\mathcal{W})$. Obviously
\[
H_{F,T,\bar\beta} = \tilde H_{\beta_0 F,T}, \qquad H_{F,T,\bar\beta,\bar z} = \tilde H_{\beta_0 F,T,\bar z}.
\]
Now we define the finite volume Gibbs measure $\mu_{F,T}$ associated with $F$ and $\rho$ by:
\[
\mu_{F,T}(d\bar x) = \frac{1}{Z_{F,T}} \exp\big[H_{F,T,\bar\beta}\big]\, \rho^T(d\bar x), \tag{17}
\]
where the normalizing constant is defined by:
\[
Z_{F,T} = \int \exp\big[H_{F,T,\bar\beta}\big]\, \rho^T(d\bar x). \tag{18}
\]
In the same way we define $\mu_{F,T,\bar z}$ and $Z_{F,T,\bar z}$ by replacing $H_{F,T,\bar\beta}$ with $H_{F,T,\bar\beta,\bar z}$ in (17) and (18). We also consider the finite volume canonical Gibbs measures $\mu^a_{F,T}$ and $\mu^a_{F,T,\bar z}$ which are the conditional distributions of $\mu_{F,T}$ and $\mu_{F,T,\bar z}$ respectively, given $m_T = a$, where
\[
m_T = \frac{1}{|T|} \sum_{i \in T} x_i. \tag{19}
\]

The families of infinite volume Gibbs measures and canonical Gibbs measures without symmetry breaking (see [C], p. 421) are defined by:
\[
\mathcal{G}_{F,\bar y} = \{ \mu \in \mathcal{P}(\mathcal{X});\ \mu \otimes \bar l \in \mathcal{P}_s(\mathcal{W}),\ \mu(\cdot\,|\,x_i = z_i \text{ for } i \in T^c) = \mu_{F,T,\bar z} \text{ for all } \bar z \text{ and finite } T \subset \mathbb{Z}^d \},
\]
\[
\mathcal{G}^c_{F,\bar y} = \{ \mu \in \mathcal{P}(\mathcal{X});\ \mu \otimes \bar l \in \mathcal{P}_s(\mathcal{W}),\ \mu(\cdot\,|\,x_i = z_i \text{ for } i \in T^c \text{ and } m_T(\cdot) = a) = \mu^a_{F,T,\bar z} \text{ for all } \bar z \text{ and finite } T \subset \mathbb{Z}^d \}.
\]
In the same way $\tilde\mu_{G,T,\bar z}$, $\tilde\mu^a_{G,T,\bar z}$, $\tilde{\mathcal{G}}_{G,\bar\beta}$ and $\tilde{\mathcal{G}}^c_{G,\bar\beta}$ are defined for $G \in C^0_{loc}(\mathcal{W})$. It is known from [C] that:

1) With $\bar l$-probability one,
\[
\Psi(F) = \lim_{n\to\infty} \frac{1}{|\Lambda_n|} \ln Z_{F,\Lambda_n,\bar z} \tag{20}
\]
converges uniformly in $\bar z$. $\Psi(F)$ is independent of $\bar z$ and $\bar y$. Alternatively, $\Psi(F)$ can be computed by:
\[
\Psi(F) = \sup_Q \big\{ E^Q\{\beta_0 F\} - I(Q) \big\},
\]
where $E^Q\{\,\cdot\,\}$ denotes the expectation with respect to the probability measure $Q$.

2) The laws of the empirical field $R_{n,\bar w}$ under the measures $\mu_{F,\Lambda_n,\bar z_n}(d\bar x)$ satisfy a large deviation principle on $\mathcal{P}(\mathcal{W})$ with rate function $\tilde I_F$:
\[
\tilde I_F(Q) = I(Q) - E^Q\{\beta_0 F\} + \Psi(F), \tag{21}
\]
irrespective of the choice of the sequence $\Lambda_n$ and the sequence of boundary conditions $\bar z_n$.

As an immediate consequence, one obtains the existence and the convexity of the free energy $\psi$ defined by:
\[
\psi(\lambda) = \psi(\lambda, F) = \limsup_{n\to\infty} \frac{1}{|\Lambda_n|} \ln \int \exp\Big[\lambda \sum_{i \in \Lambda_n} x_i\Big]\, \mu_{F,\Lambda_n,\bar z_n}(d\bar x). \tag{22}
\]
Let $h$ be the convex conjugate (also known as the Legendre transform) of $\psi$, i.e.
\[
h(x) = \sup_\lambda\, [\lambda x - \psi(\lambda)], \tag{23}
\]
and let us denote by $P^{a_n}_{F,\Lambda_n,\bar z_n}$ the law of $R_{n,\bar w}$ with respect to $\mu^{a_n}_{F,\Lambda_n,\bar z_n}(d\bar x)$. The following results can be obtained easily from a combination of arguments from Sect. 2 of [R] and from the results of [C]. We state them separately for future reference.
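The Legendre transform (23) can be approximated numerically by maximizing over a finite grid of $\lambda$ values. The Python sketch below does this for a hypothetical free energy $\psi(\lambda) = \lambda^2/2$, which is what (22) gives when $F = 0$ and the single-site measure is standard Gaussian; this choice, the grid and the function name `legendre_transform` are assumptions made for illustration only.

```python
import numpy as np

def legendre_transform(psi, lambdas):
    """Return h with h(x) = max over the grid `lambdas` of [lambda * x - psi(lambda)]."""
    psi_vals = psi(lambdas)

    def h(x):
        # Grid approximation of the supremum in the convex conjugate.
        return float(np.max(lambdas * x - psi_vals))

    return h

# Hypothetical free energy: psi(lambda) = lambda^2 / 2, whose conjugate is h(x) = x^2 / 2.
lambdas = np.linspace(-10.0, 10.0, 20001)
h = legendre_transform(lambda lam: 0.5 * lam**2, lambdas)
h_at_one = h(1.0)
```

The grid must be wide enough to contain the maximizing $\lambda$ (here $\lambda = x$); outside that range the approximation underestimates $h$.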

Lemma 2.2. Let $\{\bar z_n\}$ be a sequence of boundary conditions and $\{a_n\}$ be a sequence of real numbers such that $\lim_{n\to\infty} a_n = a$. Then the family $P^{a_n}_{F,\Lambda_n,\bar z_n}$ has the upper bound large deviation property with rate function
\[
\tilde I^a_F(Q) = \begin{cases} \tilde I_F(Q) - h(a), & \text{if } \int x_0 \,dQ = a; \\ +\infty, & \text{otherwise.} \end{cases} \tag{24}
\]

Let $g_{n,F,\bar z}(x)$ be the density of $m_{\Lambda_n}$ with respect to $\mu^a_{F,T,\bar z}$ and let $F \in C^0_{loc}$. For each $\lambda$ we denote $\tilde{\mathcal{G}}_{\beta_0 F + \lambda x_0, \bar\beta}$ by $\mathcal{G}_\lambda$.

Lemma 2.3. Suppose that $G$ is a bounded continuous cylindrical function on $\mathcal{W}$ or that $G$ is of the form $G(\bar w) = G_1(\bar w) + \phi'(x_0)$ for some $G_1$ which is a bounded continuous cylindrical function on $\mathcal{W}$. Let $\{\bar z_n\}$ be a sequence in $\mathcal{X}$, let $\{a_n\}$ be a sequence of real numbers such that $\lim_{n\to\infty} a_n = a$ and let $F \in C^0_{loc}$. Then with $\bar l$-probability one, the family
\[
\alpha_n(dt) = \mu^{a_n}_{F,\Lambda_n,\bar z_n}\Big( \frac{1}{|\Lambda_n|} \sum_{i \in \Lambda_n} G(\tau_i^w \bar w) \in dt \Big)
\]
satisfies the upper bound of the large deviation principle with rate function:
\[
J_G(t) = \inf \Big\{ \tilde I^a_F(Q);\ \int G\,dQ = t \Big\}.
\]
Moreover $J_G(t) = 0$ if and only if $t = \int G\,dQ$ for some $Q \in \mathcal{P}_s(\mathcal{W})$ such that $Q(\cdot\,|\,\bar y) \in \mathcal{G}_{h'(a)}$, $\tilde\Pi Q = \bar l$ and $\int x_0\,dQ = a$.

Corollary 2.1. If
\[
G_0(\bar x) = \phi'(x_0) - \frac{\partial H_{F,\Lambda_n,\bar\beta}}{\partial x_0}(\bar x),
\]
then $J_{G_0}(t) = 0$ is equivalent to $t = h'(a)$.

Theorem 2.1. Let $K$ be a bounded subset of $\mathbb{R}$ and let $G_0$ be as in the previous corollary. Then
\[
\lim_{n\to\infty} \sup_{a \in K,\ \bar z \in \mathcal{X}} \int \Big| \frac{1}{|\Lambda_n|} \sum_{i \in \Lambda_n} G_0(\tau_i \bar x) - h'(a) \Big| \,d\mu^a_{F,\Lambda_n,\bar z} = 0. \tag{25}
\]

Lemma 2.4.
\[
\frac{1}{|\Lambda_n|}\, \frac{g'_{n,F,\bar z}(a)}{g_{n,F,\bar z}(a)} \to -h'(a) \tag{26}
\]
uniformly in $\bar z \in \mathcal{X}$ and $a$ in a compact subset of $\mathbb{R}$.

Lemma 2.5. There exists a constant $c$ such that
\[
|h'(x)| \le c + h(x). \tag{27}
\]

3. Local Ergodic Theorem for the Dynamics

Recall that our dynamics starts initially from the distribution having density $f_N^0$ with respect to the Gibbs states $\nu_N$ and that $f_N^0$ satisfies the entropy condition:
\[
\Big\langle \limsup_{N\to\infty} \frac{1}{N^d} \int f_N^0 \log f_N^0 \,d\nu_N \Big\rangle < \infty.
\]
We denote by $\{\bar x(t);\ t \ge 0\}$ the diffusion process generated by $L_N$. For each $T > 0$, let $Q_{N,T}$ and $P_{N,T}$ be the laws on $C([0, T], X^N)$ of the evolution $\{\bar x(s);\ 0 \le s \le T\}$ starting at time $t = 0$ with the distributions $d\nu_N$ and $f_N^0\,d\nu_N$ respectively. We shall denote them by $Q_N$ and $P_N$ for short whenever no confusion is possible. We have:
\[
\Big\langle \limsup_{N\to\infty} \frac{1}{N^d} H(P_N | Q_N) \Big\rangle < \infty, \tag{28}
\]
where $H(\cdot\,|\,\cdot)$ is the relative entropy function whose precise definition was given in (16). Let $f_N^t$ be the probability density function of $\bar x(t)$ under $Q_{N,T}$. The main result in this section is the following ergodic theorem for the dynamics of $\{\bar x(t)\}$.

Theorem 3.1. Let $G_0$ be as in Corollary 2.1, let $\Lambda_k(i) = i + \Lambda_k$ and let us set
\[
A_{N,N\varepsilon,k,\ell} = E^{P_N}\Big\{ \int_0^T \frac{1}{N^d} \sum_{i \in T_N^d} \Big| \frac{1}{|\Lambda_k|} \sum_{j \in \Lambda_k(i)} G_0(\tau_j \bar x(s)) - U_\ell \circ h'\big(m_{\Lambda_{N\varepsilon}(i)}\big) \Big| \,ds \Big\},
\]
where
\[
U_\ell(x) = \begin{cases} -\ell & \text{if } x < -\ell, \\ x & \text{if } |x| \le \ell, \\ \ell & \text{if } x > \ell. \end{cases}
\]
Then with $\bar l$-probability 1, it holds:
\[
\lim_{\ell\to\infty} \limsup_{\varepsilon\to 0} \limsup_{k\to\infty} \limsup_{N\to\infty} A_{N,N\varepsilon,k,\ell} = 0. \tag{29}
\]
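The truncation $U_\ell$ in Theorem 3.1 is an ordinary clipping operation. The short Python sketch below applies it to values of a stand-in for $h'$; the function `hprime` (a cubic) is purely hypothetical and is not derived from the model, it only supplies unbounded values to truncate.

```python
import numpy as np

def truncate(x, ell):
    """U_ell(x): equal to x on [-ell, ell], and clipped to -ell or ell outside."""
    return np.clip(x, -ell, ell)

def hprime(m):
    # Hypothetical stand-in for h'; chosen only because it is unbounded.
    return m**3

m = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
truncated = truncate(hprime(m), ell=5.0)
```

Sending $\ell \to \infty$ last, as in (29), removes the truncation once the a priori bounds on the density are available.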

Remark. 1. This theorem is the crucial step in the proof of our scaling limit.
2. Since most of the results in this paper are true with $\bar l$-probability 1, from now on we drop "with $\bar l$-probability 1" whenever no confusion is possible.

We shall break the proof of the theorem into two lemmas. In the jargon of the hydrodynamic limit they are usually referred to as "one block" and "two block" estimates respectively.

Lemma 3.1. Using $k$ instead of $N\varepsilon$ in the definition of $A_{N,N\varepsilon,k,\ell}$ it holds:
\[
\lim_{\ell\to\infty} \limsup_{k\to\infty} \limsup_{N\to\infty} A_{N,k,k,\ell} = 0. \tag{30}
\]

Lemma 3.2. If we set
\[
A_{i,k}(\bar x) = \frac{1}{|\Lambda_k|} \sum_{j \in \Lambda_k(i)} x_j,
\]
and for any $a > 0$:
\[
D_{N,k,i,a} = \sup_j E^{P_N}\Big\{ \int_0^T \frac{1}{N^d} \sum_{i \in T_N^d} \chi_{\{|h'(A_{i,k}[\bar x(s)]) - h'(A_{i+j,k}[\bar x(s)])| \ge a\}} \,ds \Big\},
\]
where the supremum is over all indices $j$ such that $\Lambda_l(j) \subset \Lambda_{N\varepsilon}$ and where the generic notation $\chi_\Lambda$ is used for the characteristic function of the set $\Lambda$, then
\[
\limsup_{\varepsilon\to 0} \limsup_{k\to\infty} \limsup_{N\to\infty} D_{N,k,i,a} = 0. \tag{31}
\]
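The quantity controlled by the two block estimate can be computed directly for a given configuration. The Python sketch below evaluates, on a periodic one-dimensional lattice, the block averages $A_{i,k}$ and the fraction of sites $i$ at which $|h'(A_{i,k}) - h'(A_{i+j,k})| \ge a$. The function names, the step-shaped configuration and the choice $h'(m) = m$ are hypothetical and chosen only for illustration.

```python
import numpy as np

def block_average(x, k):
    """A_{i,k}: average of x over {i-k, ..., i+k} on a periodic 1-d lattice."""
    total = np.zeros(len(x), dtype=float)
    for shift in range(-k, k + 1):
        # np.roll(x, shift)[i] equals x[i - shift] with periodic wrap-around.
        total += np.roll(x, shift)
    return total / (2 * k + 1)

def two_block_fraction(x, k, j, a, hprime):
    """Fraction of sites i with |h'(A_{i,k}) - h'(A_{i+j,k})| >= a."""
    avg = block_average(np.asarray(x, dtype=float), k)
    # np.roll(avg, -j)[i] equals avg[i + j] on the periodic lattice.
    shifted = np.roll(avg, -j)
    return float(np.mean(np.abs(hprime(avg) - hprime(shifted)) >= a))

# Hypothetical configuration: a sharp step, far from local equilibrium.
step = np.concatenate([np.zeros(50), 10.0 * np.ones(50)])
fraction = two_block_fraction(step, k=1, j=50, a=5.0, hprime=lambda m: m)
flat_fraction = two_block_fraction(np.ones(100), k=1, j=50, a=5.0, hprime=lambda m: m)
```

For a spatially flat configuration the fraction is zero, while for the step it is close to one; Lemma 3.2 asserts that, under the dynamics and for shifts $j$ of order at most $N\varepsilon$, the time-averaged fraction becomes negligible in the stated limits.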

Both the proofs of Lemma 3.1 and Lemma 3.2 will be carried out in several steps. The following lemma will help us to handle the disorder in the system.

Lemma 3.3. Let $\Theta(\bar y)$ be a bounded measurable cylindrical function on $\mathcal{Y}$. Then with $\bar l$-probability one,
\[
\lim_{N\to\infty} \frac{1}{N^d} \sum_{i \in T_N^d} \Theta(\tau_i^y \bar\beta) = E^{\bar l}\{\Theta(\bar\beta)\}. \tag{32}
\]
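The averaging over shifts in Lemma 3.3 is easy to observe numerically. The Python sketch below draws an i.i.d. uniform environment on a periodic one-dimensional lattice and averages a cylindrical function over all shifts; the choice $\Theta(\bar\beta) = \beta_0 \beta_1$, whose expectation is $1/4$, the lattice size and the function name are hypothetical.

```python
import numpy as np

def shift_average(beta, theta_fn):
    """(1/N) * sum_i Theta(tau_i beta) for Theta depending on sites 0 and 1 of a periodic lattice."""
    # np.roll(beta, -1)[i] equals beta[i + 1], so the pair (beta[i], beta[i + 1]) is Theta's input.
    return float(np.mean(theta_fn(beta, np.roll(beta, -1))))

rng = np.random.default_rng(0)
beta = rng.uniform(0.0, 1.0, size=200_000)
avg = shift_average(beta, lambda b0, b1: b0 * b1)
```

The summands are not independent (neighbouring terms share a coordinate), but they are finitely dependent, which is why a moment argument suffices for the almost sure limit.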

Proof. This almost sure limit can be proved by the classical argument of the proof of the strong law of large numbers by the method of moments. We skip details.

The next lemma will allow us to control the presence of big spins in the system.

Lemma 3.4. With $\bar l$-probability one,
\[
\limsup_{N\to\infty} E^{P_N}\Big\{ \int_0^T \frac{1}{N^d} \sum_{i \in T_N^d} [\gamma(x_i(s)) + \gamma(\phi'(x_i))] \,ds \Big\} < \infty, \tag{33}
\]
where $\gamma(\cdot)$ is defined in (15).

Proof. Set
\[
A_N = E^{P_N}\Big\{ \int_0^T \frac{1}{N^d} \sum_{i \in T_N^d} [\gamma(x_i) + \gamma(\phi'(x_i))] \,ds \Big\}.
\]
Because of the entropy bound (28), we apply the entropy inequality to:
\[
\frac{1}{2T} \int_0^T \sum_{i \in T_N^d} \big(\gamma(x_i(s)) + \gamma(\phi'(x_i))\big) \,ds.
\]
We have:
\[
A_N \le 2T\, \frac{1}{N^d} H(P_N | Q_N) + \frac{1}{N^d} \ln E^{Q_N}\Big\{ \exp\Big[ \frac{1}{2T} \int_0^T \sum_{i \in T_N^d} \big(\gamma(x_i(s)) + \gamma(\phi'(x_i))\big) \,ds \Big] \Big\} = J_{1,N} + J_{2,N}.
\]
Our proof reduces to showing that $\limsup_{N\to\infty} J_{2,N}$ is finite. To this aim, we note
\[
\begin{aligned}
J_{2,N} &= \frac{1}{N^d} \ln E^{Q_N}\Big\{ \exp\Big[ \frac{1}{T} \int_0^T \frac{1}{2} \sum_{i \in T_N^d} \big(\gamma(x_i) + \gamma(\phi'(x_i))\big) \,ds \Big] \Big\} \\
&\le \frac{1}{N^d} \ln E^{\nu_N}\Big\{ \exp\Big[ \frac{1}{2} \sum_{i \in T_N^d} \big(\gamma(x_i) + \gamma(\phi'(x_i))\big) \Big] \Big\} \\
&\le \frac{1}{2N^d} \ln E^{\nu_N}\Big\{ \exp\Big[ \sum_{i \in T_N^d} \gamma(x_i) \Big] \Big\} + \frac{1}{2N^d} \ln E^{\nu_N}\Big\{ \exp\Big[ \sum_{i \in T_N^d} \gamma(\phi'(x_i)) \Big] \Big\} \\
&\le J_{3,N} + J_{4,N}.
\end{aligned}
\]
In the above, we use the convexity of the exponential function and the stationarity of $Q_N$ in the first inequality and we use the Schwarz inequality in the last inequality. Also, notice that (20), (15) and the boundedness of $F$ imply that:
\[
\limsup_{N\to\infty} J_{3,N} < \infty, \qquad \limsup_{N\to\infty} J_{4,N} < \infty.
\]
Thus
\[
\limsup_{N\to\infty} A_N < \infty,
\]
and this completes the proof.
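The entropy inequality used in the proof above states, in its standard form, that $E^P[X] \le \frac{1}{b}\big(\ln E^Q[e^{bX}] + H(P|Q)\big)$ for every $b > 0$. The Python sketch below checks it on a finite state space with randomly generated probability vectors; all names and the random data are hypothetical and serve only as a sanity check of the inequality itself.

```python
import numpy as np

def relative_entropy(p, q):
    """H(P|Q) = sum_i p_i * log(p_i / q_i) for strictly positive probability vectors."""
    return float(np.sum(p * np.log(p / q)))

def entropy_inequality_sides(p, q, x, b):
    """Return (E_P[X], (1/b) * (log E_Q[exp(b X)] + H(P|Q)))."""
    lhs = float(np.dot(p, x))
    rhs = (np.log(np.dot(q, np.exp(b * x))) + relative_entropy(p, q)) / b
    return lhs, float(rhs)

rng = np.random.default_rng(1)
p = rng.random(6) + 0.1
p /= p.sum()
q = rng.random(6) + 0.1
q /= q.sum()
x = rng.normal(size=6)
lhs, rhs = entropy_inequality_sides(p, q, x, b=2.0)
```

Equality holds when $P$ is the exponential tilt of $Q$, that is $p_i \propto q_i e^{b x_i}$; this is why the bound is sharp enough for the superexponential estimates that follow.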

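For each $i$ we consider the diffusion generator $L_{i,i+e}$ defined by:
\[
L_{i,i+e} = \frac{1}{2} \Big(\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_{i+e}}\Big)^2 - \Big(\frac{\partial H_N}{\partial x_i} - \frac{\partial H_N}{\partial x_{i+e}}\Big)\Big(\frac{\partial}{\partial x_i} - \frac{\partial}{\partial x_{i+e}}\Big),
\]
and for each $V \subset \mathbb{Z}^d$, we set:
\[
L_V = \sum L_{i,i+e},
\]
where the sum is over bonds $(i, i+e)$ in $V$. We also consider the associated form $I_V(\nu)$ defined for $\nu \in \mathcal{P}(\mathbb{R}^K)$, for some $K$ such that $V + \Lambda \subseteq K$, by the formula:
\[
I_V(\nu) = \sup_{f > 0} \int \frac{-L_V f}{f} \,d\nu, \tag{34}
\]
where the supremum is over all the positive smooth functions $f$. Note that
\[
I^N(\nu) = I_{T_N^d}(\nu) = \sum I_{i,i+e}(\nu), \qquad I^N(f \nu_N) = \sum D^N_{i,i+e}(f),
\]
where
\[
D^N_{i,i+e}(f) = \frac{1}{2} \int \Big(\frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_{i+e}}\Big)^2 d\nu_N. \tag{35}
\]
Obviously if $I_V(\nu) = 0$, then $\nu(\cdot\,|\,\mathbb{R}^{V^c})$ is a convex combination of canonical Gibbs measures with interaction $\beta_0 F$, since $L_V$ is elliptic on $\sum_{i \in V} x_i = a$.

Lemma 3.5. We have:
\[
\limsup_{\ell\to\infty} \limsup_{k\to\infty} \limsup_{N\to\infty} E_{N,k,b,\alpha,\ell} \le 0 \tag{36}
\]
if $E_{N,k,b,\alpha,\ell}$ is defined for $b > 0$ by:
\[
E_{N,k,b,\alpha,\ell} = \frac{1}{N^d} \ln E^{Q_N}\Big\{ \exp\Big[ b \int_0^T \sum_{i \in T_N^d} \Big( \Big| \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} G_0(\tau_j^w \bar w(s)) - U_\ell \circ h'\Big( \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} x_j(s) \Big) \Big| - \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} \alpha\gamma[\phi'(x_j(s))] \Big) ds \Big] \Big\}.
\]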

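Proof. By the $L^2$ theory of semigroups, we have
\[
E_{N,k,b,\alpha,\ell} \le T\, \frac{1}{N^d} \sup_f \Big\{ E^{f\,d\nu_N}\Big[ \sum_{i \in T_N^d} b \Big( \Big| \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} G_0(\tau_j^w \bar w) - U_\ell \circ h'\Big( \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} x_j \Big) \Big| - \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} \alpha\gamma[\phi'(x_j)] \Big) \Big] - N^2 I^N(f \nu_N) \Big\}.
\]
Note that all the functions appearing in the previous equation are local and that $I$ is a convex function in $\nu$. Consequently:
\[
E_{N,k,b,\alpha,\ell} \le T\, \frac{1}{N^d} \sum_{i \in T_N^d} \sup_{\mu_{i,k}} \Big\{ E^{\mu_{i,k}}\Big[ b \Big( \Big| \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} G_0(\tau_j^w \bar w) - U_\ell \circ h'\Big( \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} x_j \Big) \Big| - \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} \alpha\gamma[\phi'(x_j)] \Big) \Big] - \frac{N^2}{(2k+1)^d} I_{\Lambda + \Lambda_k(i)}(\mu_{i,k}) \Big\},
\]
where $\mu_{i,k}$ is the projection of $f\,d\nu_N$ onto $\Lambda + \Lambda_k(i)$. Now if we let $N$ go to $\infty$ and if we apply Lemma 3.3 we obtain:
\[
\begin{aligned}
\limsup_{N\to\infty} E_{N,k,b,\alpha,\ell} &\le \limsup_{N\to\infty} T\, \frac{1}{N^d} \sum_{i \in T_N^d} \sup_{\mu_{i,k}} \Big\{ E^{\mu_{i,k}}\Big[ b \Big( \Big| \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} G_0(\tau_j^w \bar w) - U_\ell \circ h'\Big( \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} x_j \Big) \Big| \\
&\qquad\qquad - \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k(i)} \alpha\gamma[\phi'(x_j)] \Big) \Big] - \frac{\delta}{(2k+1)^d} I_{\Lambda_k(i)}(\mu_{i,k}) \Big\} \\
&\le E^{\bar l}\Big\{ \sup_{\mu_k} \Big[ E^{\mu_k}\Big[ bT \Big( \Big| \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} G_0(\tau_j^w \bar w) - U_\ell \circ h'\Big( \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} x_j \Big) \Big| \\
&\qquad\qquad - \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} \alpha\gamma[\phi'(x_j)] \Big) \Big] - T\, \frac{\delta}{(2k+1)^d} I_{\Lambda_k}(\mu_k) \Big] \Big\},
\end{aligned}
\]
for any positive number $\delta$, where $\mu_k$ is an arbitrary probability measure on $\mathbb{R}^{\Lambda + \Lambda_k}$. By passing to the limit $\delta \to \infty$, we get:
\[
\begin{aligned}
\limsup_{N\to\infty} E_{N,k,b,\alpha,\ell} &\le E^{\bar l}\Big\{ \sup_{\{\mu_k :\ I_{\Lambda_k}(\mu_k) = 0\}} E^{\mu_k}\Big[ bT \Big( \Big| \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} G_0(\tau_j^w \bar w) - U_\ell \circ h'\Big( \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} x_j \Big) \Big| - \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} \alpha\gamma[\phi'(x_j)] \Big) \Big] \Big\} \\
&\le T b\, E^{\bar l}\Big\{ \sup_{\mu_k} \int \mu_k(d\bar z \times dx) \int \Big( \Big| \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} G_0(\tau_j^w \bar w) - U_\ell \circ h'\Big( \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} x_j \Big) \Big| - \frac{1}{(2k+1)^d} \sum_{j \in \Lambda_k} \alpha\gamma[\phi'(x_j)] \Big) \,d\nu^x_{\Lambda_k, \bar z, \bar\beta} \Big\}.
\end{aligned}
\]
Now we let $k \to \infty$, then $\ell \to \infty$, and as an immediate consequence of Theorem 2.1 we obtain:
\[
\limsup_{\ell\to\infty} \limsup_{k\to\infty} \limsup_{N\to\infty} E_{N,k,b,\alpha,\ell} \le 0.
\]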

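Proof of Lemma 3.1. The entropy inequality gives:
\[
b \times A_{N,k,k,\ell} \le C + b\alpha\, E^{P_N}\Big\{ \int_0^T \frac{1}{N^d} \sum_{i \in T_N^d} \gamma(\phi'(x_i)) \,ds \Big\} + E_{N,k,b,\alpha,\ell},
\]
provided we set:
\[
C = \sup_N \frac{1}{N^d} H(P_N | Q_N).
\]
On account of Eq. (36) and Lemma 3.4, in the limit $\alpha \to 0$ we obtain:
\[
b \times \lim_{\ell\to\infty} \limsup_{k\to\infty} \limsup_{N\to\infty} A_{N,k,k,\ell} \le C.
\]
Sending now $b$ to $\infty$ gives:
\[
\lim_{\ell\to\infty} \limsup_{k\to\infty} \limsup_{N\to\infty} A_{N,k,k,\ell} = 0,
\]
which completes the proof of Lemma 3.1.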

We now switch to the proof of Lemma 3.2.

Lemma 3.6. For any bounded measurable function $\Theta(\cdot)$ on $\mathcal{Y}^{\Lambda_k} \times \mathcal{Y}^{\Lambda_k}$ we have:
\[
\lim_{N\to\infty} \frac{1}{N^d} \sum_{i \in T_N^d} \Theta(\bar y_{\Lambda_k(i)}, \bar y_{\Lambda_k(i+j)}) = E^{\bar l}\{\Theta(\bar y_{\Lambda_k}, \bar y_{\Lambda_k(j)})\},
\]
uniformly in $j$.

Proof. We first notice that, if we set:
\[
A_{N,j} = \frac{1}{N^d} \sum_{i \in T_N^d} \Theta(\bar y_{\Lambda_k(i)}, \bar y_{\Lambda_k(i+j)}) - E^{\bar l}\{\Theta(\bar y_{\Lambda_k}, \bar y_{\Lambda_k(j)})\},
\]
then we easily see that
\[
E^{\bar l}\{|A_{N,j}|^{4(d+2)}\} \le \frac{c}{N^{2(d+2)}}
\]
for some constant $c$ independent of $N$ and $j$. Therefore:
\[
\begin{aligned}
\bar l\{\sup_j |A_{N,j}| \ge N^{-1/4}\} &\le \sum_j \bar l\{|A_{N,j}| \ge N^{-1/4}\} \\
&\le N^{d+2} \sum_j E^{\bar l}\{|A_{N,j}|^{4(d+2)}\} \\
&\le N^{d+2}\, N^d\, \frac{c}{N^{2(d+2)}} \le \frac{c}{N^2},
\end{aligned}
\]
where we used Chebyshev's inequality in the second inequality above. Since $\sum_N c/N^2 < \infty$, the proof becomes an immediate consequence of the first Borel-Cantelli lemma.

One of the key steps in the proof of Lemma 3.2 is the following two block superexponential estimate.

Lemma 3.7. It holds:
\[
\limsup_{\varepsilon\to 0} \limsup_{k\to\infty} \limsup_{N\to\infty} E_{N,k,b,\alpha,j} \le 0 \tag{37}
\]
uniformly in $j$ provided we set, for each $b > 0$:
\[
E_{N,k,b,\alpha,j} = \frac{1}{N^d} \ln E^{Q_N}\Big\{ \exp\Big[ b \int_0^T \sum_{i \in T_N^d} \chi_{\{|h'(A_{i,k}[\bar x(s)]) - h'(A_{i+j,k}[\bar x(s)])| \ge a\}} \,ds \Big] \Big\}.
\]

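Proof. The theory of the $L^2$-semigroup gives:
\[
\begin{aligned}
E_{N,k,b,\alpha,j} &\le T\, \frac{1}{N^d} \sup_f \Big\{ E^{f\,d\nu_N}\Big[ b \sum_{i \in T_N^d} \chi_{\{|h'[A_{i,k}(\bar x)] - h'[A_{i+j,k}(\bar x)]| \ge a\}} \Big] - N^2 I^N(f \nu_N) \Big\} \\
&\le T\, \frac{1}{N^d} \sum_{i \in T_N^d} \sup_{\mu^k_{i,i+j}} \Big\{ E^{\mu^k_{i,i+j}}\big\{ b\,\chi_{\{|h'(\mathrm{Ave}_{i,k}(\bar x)) - h'(\mathrm{Ave}_{i+j,k}(\bar x))| \ge a\}} \big\} - \frac{N^2}{3(2k+1)^d} I_{\Lambda_k(i)}(\mu^k_{i,i+j}) \\
&\qquad\qquad - \frac{N^2}{3(2k+1)^d} I_{\Lambda_k(i+j)}(\mu^k_{i,i+j}) - \frac{1}{3d^2\varepsilon^2} I^{i,i+j}_k(\mu^k_{i,i+j}) \Big\},
\end{aligned}
\]
where $\mu^k_{i,i+j}$ is the projection of $f\,d\nu_N$ onto $\Lambda + \Lambda_k(i) \cup \Lambda_k(i+j)$, where $I^{i,i+j}_k(\nu)$ is defined by:
\[
I^{i,i+j}_k(\nu) = \sup_{g > 0} \int \frac{-L_{i,i+j}\, g}{g} \,d\nu,
\]
for $\nu \in \mathcal{P}(\mathbb{R}^K)$ with $K \subset \Lambda_k(i) \cup \Lambda_k(i+j)$, and where we took advantage of the convexity of $I^N$. We can now apply Lemma 3.6 and by letting $N$ go to $\infty$, we obtain:
\[
\begin{aligned}
E_{N,k,b,\alpha,j} &\le T\, \frac{1}{N^d} \sum_{i \in T_N^d} \sup_{\mu^k_{i,i+j}} \Big\{ E^{\mu^k_{i,i+j}}\big\{ b\,\chi_{\{|h'(\mathrm{Ave}_{i,k}(\bar x)) - h'(\mathrm{Ave}_{i+j,k}(\bar x))| \ge a\}} \big\} - \frac{\delta}{3(2k+1)^d} I_{\Lambda_k(i)}(\mu^k_{i,i+j}) \\
&\qquad\qquad - \frac{\delta}{3(2k+1)^d} I_{\Lambda_k(i+j)}(\mu^k_{i,i+j}) - \frac{1}{3d^2\varepsilon^2} I^{i,i+j}_k(\mu^k_{i,i+j}) \Big\} \\
&\le T\, E^{\bar l}\Big\{ \sup_{\mu^k_{0,j}} \Big[ E^{\mu^k_{0,j}}\big\{ b\,\chi_{\{|h'(\mathrm{Ave}_{i,k}(\bar x)) - h'(\mathrm{Ave}_{i+j,k}(\bar x))| \ge a\}} \big\} - \frac{\delta}{3(2k+1)^d} I_{\Lambda_k}(\mu^k_{0,j}) \\
&\qquad\qquad - \frac{\delta}{3(2k+1)^d} I_{\Lambda_k(j)}(\mu^k_{0,j}) - \frac{1}{3d^2\varepsilon^2} I^{0,j}_k(\mu^k_{0,j}) \Big] \Big\},
\end{aligned}
\]
for any positive number $\delta$. Letting $\delta$ go to $\infty$ gives:
\[
\limsup_{N\to\infty} E_{N,k,b,\alpha,j} \le T\, E^{\bar l}\Big\{ \sup_{\{I_{\Lambda_k}(\mu^k_{0,j}) = 0,\ I_{\Lambda_k(j)}(\mu^k_{0,j}) = 0\}} \Big[ E^{\mu^k_{0,j}}\big\{ b\,\chi_{\{|h'(\mathrm{Ave}_{i,k}(\bar x)) - h'(\mathrm{Ave}_{i+j,k}(\bar x))| \ge a\}} \big\} - \frac{1}{3d^2\varepsilon^2} I^{0,j}_k(\mu^k_{0,j}) \Big] \Big\}.
\]
Now we let $k$ go to $\infty$ first and then $\varepsilon \to 0$. We get:
\[
\limsup_{\varepsilon\to 0} \limsup_{k\to\infty} \limsup_{N\to\infty} E_{N,k,b,\alpha,j} \le \sup_\beta \int_{\{|h'(c_1) - h'(c_2)| \ge a\}} \beta(dc_1, dc_2),
\]
where $\beta(dc_1, dc_2)$ is a limit point of the joint distribution of
\[
\Big( \frac{1}{(2k+1)^d} \sum_{v \in \Lambda_k} x_v,\ \frac{1}{(2k+1)^d} \sum_{v \in \Lambda_k(j)} x_v \Big)
\]
under $\mu^k_{0,j}$ which satisfies $I^{0,j}_k(\mu^k_{0,j}) \to 0$. Lemma 2.4 allows us to follow the proof of Theorem 4.3 in [GPV] and to conclude that $\beta(dc_1, dc_2)$ is concentrated on the set $\{(c_1, c_2);\ h'(c_1) = h'(c_2)\}$. This completes the proof.

Remark on the above proof. A little more work proves that the result holds uniformly in $j$. More precisely, if we set
\[
A_{N,k,b,\alpha,j} = E^{Q_N}\Big\{ \exp\Big[ b \int_0^T \sum_{i \in T_N^d} \chi_{\{|h'(A_{i,k}[\bar x(s)]) - h'(A_{i+j,k}[\bar x(s)])| \ge a\}} \,ds \Big] \Big\},
\]
\[
A_{N,k,b,\alpha} = E^{Q_N}\Big\{ \exp\Big[ b \sup_j \int_0^T \sum_{i \in T_N^d} \chi_{\{|h'(\mathrm{Ave}_{i,k}(\bar x(s))) - h'(\mathrm{Ave}_{i+j,k}(\bar x(s)))| \ge a\}} \,ds \Big] \Big\},
\]
then
\[
A_{N,k,b,\alpha} \le \sum_j A_{N,k,b,\alpha,j},
\]
and from the previous lemma, we have:
\[
\limsup_{\varepsilon\to 0} \limsup_{k\to\infty} \limsup_{N\to\infty} \frac{1}{N^d} \ln A_{N,k,b,\alpha,j} \le 0
\]
uniformly in $j$. Thus
\[
\limsup_{\varepsilon\to 0} \limsup_{k\to\infty} \limsup_{N\to\infty} \frac{1}{N^d} \ln A_{N,k,b,\alpha} \le 0.
\]

Proof of Lemma 3.2. Once Lemma 3.7 is proven, the same argument as in the proof of Lemma 3.1 gives the desired result.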


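4. A Priori Estimates for the Macroscopic Density

Let $\hat{P}_N$ be the law of the empirical process
\[
\mu_N(t) = \frac{1}{N^d} \sum_{i \in T_N^d} x_i(t)\, \delta_{i/N}
\]
under $P_N$ and let us set
\[
C = \sup_N \frac{1}{N^d} H(P_N | Q_N).
\]
The following lemma establishes a priori estimates for the macroscopic density. These estimates are needed in the proof of the uniqueness of the limiting equation and to control the limit $\ell \to \infty$ in Lemma 3.1.

Lemma 4.1. Let $\hat{P}$ be any limit point of the sequence $\{\hat{P}_N;\ N = 1, 2, \cdots\}$. Then

(a) $\hat{P}\{\mu : \mu(t, d\theta) = m(t, \theta)\,d\theta\} = 1$.
(b) $E^{\hat{P}}\bigl\{\sup_{0 \le t \le T} \int_{T^d} h(m(t, \theta))\, d\theta\bigr\} \le C$.
(c) $E^{\hat{P}}\bigl\{\int_0^T \int_{T^d} \{\nabla_\theta [h'(m(t, \theta))]\}^2\, d\theta\, dt\bigr\} \le aC$ for some positive constant $a$.

Proof. First we prove (a) and (b). Observe that for any process $Q_N$ in equilibrium and for any continuous function $J(\theta)$ on $T^d$:
\[
\lim_{N \to \infty} \frac{1}{N^d} \ln E^{Q_N} \exp\Bigl\{ N^d \int J(\theta)\, d\mu_N(t) \Bigr\}
= \lim_{N \to \infty} \frac{1}{N^d} \ln E^{Q_N} \exp\Bigl\{ \sum_{i \in T_N^d} J\bigl(\tfrac{i}{N}\bigr) x_i(t) \Bigr\}
= \int_{T^d} \psi(J(\theta))\, d\theta.
\]
Using this observation we claim that for any finite $J_1, \cdots, J_k$, if we set:
\[
G = \sup_{1 \le i \le k} \Bigl\{ \int J_i(\theta)\, \mu_N(t, d\theta) - \int_{T^d} \psi(J_i(\theta))\, d\theta \Bigr\}
\]
then
\[
\lim_{N \to \infty} \frac{1}{N^d} \ln E^{Q_N}\{\exp(N^d G)\} \le 0.
\]
For this,
\[
\lim_{N \to \infty} \frac{1}{N^d} \ln E^{Q_N}\{\exp(N^d G)\}
\le \lim_{N \to \infty} \frac{1}{N^d} \ln E^{Q_N} \Bigl\{ \sum_{i=1}^k \exp N^d \Bigl[ \int J_i(\theta)\, \mu_N(t, d\theta) - \int_{T^d} \psi(J_i(\theta))\, d\theta \Bigr] \Bigr\}
\]
\[
\le \sup_{1 \le i \le k} \lim_{N \to \infty} \frac{1}{N^d} \ln E^{Q_N} \exp N^d \Bigl[ \int J_i(\theta)\, \mu_N(t, d\theta) - \int_{T^d} \psi(J_i(\theta))\, d\theta \Bigr]
\le 0.
\]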


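Using one more time the entropy inequality, we get:
\[
E^{\hat{P}_N} \Bigl\{ \sup_{1 \le i \le k} \Bigl[ \int J_i(\theta)\, \mu(t, d\theta) - \int_{T^d} \psi(J_i(\theta))\, d\theta \Bigr] \Bigr\}
\le E^{P_N} \Bigl\{ \sup_{1 \le i \le k} \Bigl[ \int J_i(\theta)\, \mu_N(t, d\theta) - \int_{T^d} \psi(J_i(\theta))\, d\theta \Bigr] \Bigr\}
\]
\[
\le \frac{1}{N^d} \ln E^{Q_N}\{\exp(N^d G)\} + \frac{1}{N^d} H(P_N | Q_N)
\le C.
\]
If we let $N$ tend to $\infty$, we obtain:
\[
E^{\hat{P}} \Bigl\{ \sup_{1 \le i \le k} \Bigl[ \int J_i(\theta)\, \mu(t, d\theta) - \int_{T^d} \psi(J_i(\theta))\, d\theta \Bigr] \Bigr\} \le C.
\]
Since $\{J_1, \cdots, J_k\}$ is an arbitrary set of continuous functions,
\[
E^{\hat{P}} \Bigl\{ \sup_{J \in C(T^d)} \Bigl[ \int J(\theta)\, \mu(t, d\theta) - \int_{T^d} \psi(J(\theta))\, d\theta \Bigr] \Bigr\} \le C.
\]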

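Using the fact that $h$ is the convex conjugate of $\psi$, (a) and (b) are immediate consequences of the above inequality. We now establish (c). We first observe that
\[
E^{Q_N} \Bigl\{ \exp \Bigl[ \int_0^T N \sum J\bigl(\tfrac{i}{N}, t\bigr) \Bigl[ \frac{\partial H_N}{\partial x_i} - \frac{\partial H_N}{\partial x_{i+e_k}} \Bigr] dt \Bigr] \Bigr\} \le \exp \Bigl[ \int_0^T \Lambda_N(t)\, dt \Bigr],
\]
where
\[
\Lambda_N(t) = \sup_f \Bigl\{ E^{f \nu_N} \Bigl[ N \sum J\bigl(\tfrac{i}{N}, t\bigr) \Bigl[ \frac{\partial H_N}{\partial x_i} - \frac{\partial H_N}{\partial x_{i+e_k}} \Bigr] \Bigr] - N^2 D_N(\sqrt{f}) \Bigr\}
\]
\[
= \sup_f E^{\nu_N} \Bigl[ N \sum J\bigl(\tfrac{i}{N}, t\bigr) \Bigl( \frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_{i+e_k}} \Bigr) - \frac{N^2}{2} \sum \frac{1}{f} \Bigl( \frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_{i+e_k}} \Bigr)^2 \Bigr]
= \frac{1}{2} \sum J^2\bigl(\tfrac{i}{N}, t\bigr).
\]
Hence
\[
\limsup_{N \to \infty} \frac{1}{N^d} \ln E^{Q_N} \Bigl\{ \exp \Bigl[ \int_0^T \sum \frac{\partial J}{\partial \theta_k}\bigl(\tfrac{i}{N}, t\bigr) \frac{\partial H_N}{\partial x_i}\, dt \Bigr] \Bigr\} \le \frac{1}{2} \int_0^T \int_{T^d} J^2(\theta, t)\, d\theta\, dt.
\]
Let us notice that
\[
\sum \frac{\partial J}{\partial \theta_k}\bigl(\tfrac{i}{N}, t\bigr) \frac{\partial H_N}{\partial x_i}
\]
approximates $\int \frac{\partial J}{\partial \theta_k}(\theta, t)\, h'(m(\theta, t))\, d\theta$ because of the first part of the proof, Lemma 2.5 and Theorem 3.1. This allows us to follow the same argument as in the first step and concludes the proof of part (c).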
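The convex duality between $h$ and $\psi$ used for parts (a) and (b) can be checked numerically in a toy case. The sketch below assumes, purely for illustration, that $\psi(\lambda) = \lambda^2/2$ (the log-moment generating function of a standard Gaussian single-site law, which is not the general setting of the paper); its conjugate is then $h(m) = m^2/2$. The helper name `convex_conjugate` and the grid are hypothetical choices.

```python
import numpy as np

def convex_conjugate(psi, m, lam_grid):
    """Approximate h(m) = sup_lam [lam * m - psi(lam)] by a maximum over a grid."""
    return np.max(lam_grid * m - psi(lam_grid))

# Illustrative assumption: psi(lam) = lam^2 / 2 (standard Gaussian case).
psi = lambda lam: 0.5 * lam**2
lam_grid = np.linspace(-10.0, 10.0, 20001)

# For this psi the exact conjugate is h(m) = m^2 / 2.
h_at_1 = convex_conjugate(psi, 1.0, lam_grid)
h_at_2 = convex_conjugate(psi, 2.0, lam_grid)
```

The grid maximum only approximates the supremum, so the values agree with $m^2/2$ up to an error controlled by the grid spacing.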


5. The Hydrodynamic Limit

Recall that $P_N$ and $Q_N$ are the laws of $\bar{x}(t)$ starting from initial distribution $f_N^0\, d\nu_N$ and $d\nu_N$ respectively and that $\hat{P}_N$ is the law of the empirical process
\[
\mu_N(t) = \frac{1}{N^d} \sum_{i \in T_N^d} x_i(t)\, \delta_{i/N}
\]
under $P_N$. It is possible to adapt the arguments of Lemmas 6.1 and 6.2 in [GPV] to the present situation and conclude in the same way that the sequence $\{\hat{P}_N\}$ is tight. Our goal is now to identify the limit points. The lemma below will provide the uniqueness of the limiting equation. Its proof follows the lines of Sect. 7 of [GPV]. We reproduce it here for the sake of completeness.

Lemma 5.1. Any weak solution of the (nonlinear) partial differential equation:
\[
\frac{\partial m}{\partial t} = \Delta h'(m(t, \theta)), \qquad m(0, \theta) = m_0(\theta), \tag{38}
\]
satisfies:
\[
\sup_{0 \le t \le T} \int_{T^d} h(m(t, \theta))\, d\theta < \infty \tag{39}
\]
\[
\int_0^T \int_{T^d} \{\nabla_\theta [h'(m(t, \theta))]\}^2\, d\theta\, dt < \infty. \tag{40}
\]

Proof. Observe that if $m_1(\theta, t)$ and $m_2(\theta, t)$ are two solutions of Eq. (38), then the $H^{-1}$ norm of $m_1 - m_2$,
\[
\|m_1 - m_2\|^2_{H^{-1}} = \int_{T^d} (m_1 - m_2)(-\Delta)^{-1}(m_1 - m_2)\, d\theta
\]
is decreasing in $t$. Indeed, computing its derivative gives:
\[
\frac{d}{dt} \bigl( \|m_1 - m_2\|^2_{H^{-1}} \bigr) = - \int_{T^d} [h'(m_1) - h'(m_2)](m_1 - m_2)\, d\theta \le 0,
\]
and the desired result is an immediate consequence of the above observation.
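The derivative computation in this proof can be spelled out as follows. This is a sketch under stated assumptions: the equation is written as $\partial_t m = c\,\Delta h'(m)$ with a constant $c > 0$ introduced here only to keep the normalization explicit, $w = m_1 - m_2$ is assumed to have zero mean (the two solutions share the initial datum and mass is conserved), and $h'$ is nondecreasing because $h$ is convex.

```latex
\begin{align*}
\frac{d}{dt}\,\|w\|^2_{H^{-1}}
  &= 2 \int_{T^d} \partial_t w \,(-\Delta)^{-1} w \, d\theta
     && \text{($(-\Delta)^{-1}$ is self-adjoint)} \\
  &= 2c \int_{T^d} \Delta\bigl[h'(m_1) - h'(m_2)\bigr]\,(-\Delta)^{-1} w \, d\theta \\
  &= -2c \int_{T^d} \bigl[h'(m_1) - h'(m_2)\bigr]\,(m_1 - m_2)\, d\theta
     && \text{($\Delta(-\Delta)^{-1} w = -w$ for mean-zero $w$)} \\
  &\le 0 .
\end{align*}
```

The integrand in the last line is pointwise nonnegative by monotonicity of $h'$; the choice $c = 1/2$ reproduces the identity displayed in the proof.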

We now state our main result. Notice that the following form is slightly stronger than the version stated in the introduction as Theorem 1.1.

Theorem 5.1. Let $\hat{P}$ be any limit point of the sequence $\hat{P}_N$. Then $\hat{P}$ is concentrated on the single path $\mu(t, d\theta) = m(t, \theta)\,d\theta$, where $m(t, \theta)$ is the unique weak solution of Eq. (38) satisfying the regularity conditions (39) and (40).

Proof. Let $J(\theta)$ be a smooth test function on $T^d$. We consider the functional
\[
\Xi(N, t, \epsilon, \ell, \mu) = \int J(\theta)\, \mu(t, d\theta) - \int J(\theta)\, \mu(0, d\theta)
- \frac{1}{2N^d} \int_0^t \sum_{i \in T_N^d} \Delta J\bigl(\tfrac{i}{N}\bigr)\, U_\ell \circ h'\bigl(m_{\epsilon N}(\tau_i \bar{x}(s))\bigr)\, ds.
\]
We apply Ito's formula to $\sum_{i \in T_N^d} J(i/N)\, x_i$:
\[
\frac{1}{N^d} \sum_{i \in T_N^d} J\bigl(\tfrac{i}{N}\bigr) x_i(t) - \frac{1}{N^d} \sum_{i \in T_N^d} J\bigl(\tfrac{i}{N}\bigr) x_i(0)
= \frac{1}{2N^d} \sum_i \int_0^t \Bigl[ N^2 \sum_k \Bigl( J\bigl(\tfrac{i + e_k}{N}\bigr) - 2J\bigl(\tfrac{i}{N}\bigr) + J\bigl(\tfrac{i - e_k}{N}\bigr) \Bigr) \Bigr] \frac{\partial H_N}{\partial x_i}(s)\, ds + M_N(t),
\]

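where $M_N$ is a martingale. An explicit calculation shows that its quadratic variation goes to zero. Moreover:
\[
N^2 \sum_k \Bigl( J\bigl(\tfrac{i + e_k}{N}\bigr) - 2J\bigl(\tfrac{i}{N}\bigr) + J\bigl(\tfrac{i - e_k}{N}\bigr) \Bigr) = \Delta J\bigl(\tfrac{i}{N}\bigr) + o(1),
\]
where the $o(1)$ term goes to zero uniformly in $i$ as $N \to \infty$. Using Lemma 3.4 to take care of the large spins we get:
\[
\frac{1}{2N^d} \sum_i \int_0^t \Bigl[ N^2 \sum_k \Bigl( J\bigl(\tfrac{i + e_k}{N}\bigr) - 2J\bigl(\tfrac{i}{N}\bigr) + J\bigl(\tfrac{i - e_k}{N}\bigr) \Bigr) \Bigr] \frac{\partial H_N}{\partial x_i}(s)\, ds
= \frac{1}{2N^d} \sum_i \int_0^t \Delta J\bigl(\tfrac{i}{N}\bigr) \frac{\partial H_N}{\partial x_i}(s)\, ds + e_N,
\]
where $e_N$ tends to zero when $N \to \infty$. At this stage we apply Theorem 3.1 and get:
\[
\limsup_{\ell \to \infty} \limsup_{\epsilon \to 0} \limsup_{N \to \infty} E^{\hat{P}_N}\{|\Xi(N, t, \epsilon, \ell, \mu)|\} = 0.
\]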
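The convergence of the rescaled second difference to the Laplacian, used in the step above, can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical smooth test function $J(\theta) = \sin(2\pi\theta_1)\cos(2\pi\theta_2)$ on the two-dimensional torus (not taken from the paper); the function name `scaled_second_difference` is likewise an illustrative choice.

```python
import numpy as np

def scaled_second_difference(J, theta, N):
    """N^2 * sum_k [J(theta + e_k/N) - 2 J(theta) + J(theta - e_k/N)]."""
    theta = np.asarray(theta, dtype=float)
    total = 0.0
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = 1.0 / N          # lattice spacing in direction e_k
        total += J(theta + step) - 2.0 * J(theta) + J(theta - step)
    return N**2 * total

# Hypothetical periodic test function and its exact Laplacian.
J = lambda th: np.sin(2 * np.pi * th[0]) * np.cos(2 * np.pi * th[1])
lap_J = lambda th: -8.0 * np.pi**2 * J(th)

theta = np.array([0.3, 0.7])
errors = [abs(scaled_second_difference(J, theta, N) - lap_J(theta))
          for N in (10, 100, 1000)]
```

For a smooth $J$ the discrepancy is of order $N^{-2}$, so the entries of `errors` shrink as $N$ grows.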

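Therefore
\[
\limsup_{\ell \to \infty} \limsup_{\epsilon \to 0} E^{\hat{P}}\{|\Xi(t, \epsilon, \ell, \mu)|\} = 0,
\]
where
\[
\Xi(t, \epsilon, \ell, \mu) = \int J(\theta)\, \mu(t, d\theta) - \int J(\theta)\, \mu(0, d\theta)
- \frac{1}{2} \int_0^t \int_{T^d} \Delta J(\theta)\, U_\ell \circ h'\Bigl( \frac{1}{(2\epsilon)^d} \int_{|\vartheta - \theta| \le \epsilon} m(s, \vartheta)\, d\vartheta \Bigr)\, d\theta\, ds.
\]
Hence
\[
\limsup_{\ell \to \infty} E^{\hat{P}}\{|\Xi(t, \ell, \mu)|\} = 0,
\]
where
\[
\Xi(t, \ell, \mu) = \int J(\theta)\, \mu(t, d\theta) - \int J(\theta)\, \mu(0, d\theta)
- \frac{1}{2} \int_0^t \int_{T^d} \Delta J(\theta)\, U_\ell \circ h'(m(s, \theta))\, d\theta\, ds.
\]
Finally taking the limit $\ell \to \infty$ and using Lemma 2.5 and Lemma 4.1 we obtain:
\[
E^{\hat{P}} \Bigl\{ \Bigl| \int J(\theta)\, \mu(t, d\theta) - \int J(\theta)\, \mu(0, d\theta) - \frac{1}{2} \int_0^t \int_{T^d} \Delta J(\theta)\, h'(m(s, \theta))\, d\theta\, ds \Bigr| \Bigr\} = 0.
\]
From Lemma 4.1, we conclude that $\hat{P}$ is concentrated on the single path $\mu(t, d\theta) = m(t, \theta)\,d\theta$, where $m(t, \theta)$ is the unique weak solution of Eq. (38) which satisfies the regularity conditions (39) and (40).

Acknowledgement. We would like to thank Prof. S.R.S. Varadhan for enlightening discussions on Corollary 2.1. Special thanks are also due to Prof. S. Olla for bringing the results of [C] to our attention.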

References

[CY] Chang, C.C. and Yau, H.T. (1992): Fluctuation of one dimensional Ginzburg-Landau models in nonequilibrium. Commun. Math. Phys. 145, 209–234
[C] Comets, F. (1989): Large deviation estimates for a conditional probability distribution-application to random interaction Gibbs measures. Prob. Th. Rel. Fields 80, 407–432
[DV] Donsker, M.D. and Varadhan, S.R.S. (1989): Large deviations from a hydrodynamic scaling limit. Comm. Pure Appl. Math. 42, 243–270
[F1] Fritz, J. (1989): On the hydrodynamic limit of a Ginzburg–Landau lattice model. The law of large numbers in arbitrary dimensions. Probab. Theory Rel. Fields 81, 291–318
[F2] Fritz, J. (1990): On the diffusive nature of entropy flow in infinite systems: Remarks to a paper by Guo–Papanicolau–Varadhan. Commun. Math. Phys. 133, 331–352
[Fu] Funaki, H. (1991): The Hydrodynamic Limit for a System with Interactions Prescribed by Ginzburg–Landau Type Random Hamiltonian. Prob. Th. Rel. Fields 90, 519–562
[GPV] Guo, M.Z., Papanicolaou, G.C. and Varadhan, S.R.S. (1988): Nonlinear diffusion limit for a system with nearest neighbor interactions. Commun. Math. Phys. 118, 31–59
[KOV] Kipnis, C., Olla, S. and Varadhan, S.R.S. (1989): Hydrodynamics and large deviations for a simple exclusion process. Commun. Pure Appl. Math. 42, 115–137
[KS] Krug, J. and Spohn, H. (1991): Kinetic Roughening of Growing Surfaces. In: Solids Far From Equilibrium: Growth, Morphology, and Defects, ed. by C. Godreche. Cambridge: Cambridge Univ. Press
[Q] Quastel, J. (1992): Diffusion of Color in the Simple Exclusion Process. Commun. Pure Appl. Math. 45, n.6
[R] Rezakhanlou, F. (1990): Hydrodynamic limit for a system with finite range interactions. Commun. Math. Phys. 129, 445–480
[S] Spohn, H. (1991): Large Scale Dynamics of Interacting Particles. New York, N.Y.: Springer Verlag
[V1] Varadhan, S.R.S. (1984): Large Deviations and Applications. CBMS-NSF Regional Conference Series in Applied Mathematics, Vol. 46, SIAM
[V2] Varadhan, S.R.S. (1990): Nonlinear Diffusion Limit for Systems with Near Neighbor Interactions. Proc. Taniguchi Symp., Kyoto
[VY] Varadhan, S.R.S. and Yau, H.T. (1996): Diffusive Scaling Limits and its Large Deviations for Lattice Gas Models with Finite Range Interaction: High Temperature Case. In preparation
[X] Xu, L. (1993): Diffusive Scaling Limits for Mean Zero Asymmetric Simple Exclusion Processes. Ph.D Thesis, New York University
[Y] Yau, H.T. (1991): Relative Entropy and The Hydrodynamics of Ginzburg-Landau Models. Lett. Math. Phys. 22, 63–80

Communicated by J. L. Lebowitz

Commun. Math. Phys. 188, 585 – 597 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

The Isometry Groups of Asymptotically Flat, Asymptotically Empty Space–Times with Timelike ADM Four–Momentum

Robert Beig¹, Piotr T. Chruściel²,*

¹ Institut für Theoretische Physik, Universität Wien, A–1090 Wien, Austria. E-mail: [email protected]
² Département de Mathématiques, Faculté des Sciences, Parc de Grandmont, F37200 Tours, France. E-mail: [email protected]
* On leave of absence from the Institute of Mathematics, Polish Academy of Sciences, Warsaw. Supported in part by KBN grant # 2P30209506 and by the Federal Ministry of Science and Research, Austria.

Received: 25 October 1996 / Accepted: 14 February 1997

Abstract: We give a complete classification of all connected isometry groups, together with their actions in the asymptotic region, in asymptotically flat, asymptotically vacuum space–times with timelike ADM four–momentum.

1. Introduction

In any physical theory a privileged role is played by those solutions of the dynamical equations which exhibit symmetry properties. For example, according to a current paradigm, there should exist a large class of isolated gravitating systems which are expected to settle down towards a stationary state, asymptotically in time, outside of black hole regions. If that is the case, a classification of all such stationary states would give exhaustive information about the large–time dynamical behavior of the solutions under consideration. More generally, one would like to understand the global structure of all appropriately regular space–times exhibiting symmetries. Now the local structure of space–times with Killing vectors is essentially understood; the reader is referred to the book [20], a significant part of which is devoted to that question. However, in that reference, as well as in most works devoted to those problems, the global issues arising in this context are not taken into account. In this paper we wish to address the question: what is the structure of the connected component of the identity of the group of isometries of space–times which are asymptotically flat in space–like directions, when the condition of time–likeness of the ADM four–momentum $p^\mu$ is imposed? Recall that the time–likeness of $p^\mu$ can be established when the Einstein tensor satisfies a positivity condition, and when the space–time contains an appropriately regular spacelike surface, see [4] for a recent discussion and a list of references. Thus the condition of time–likeness of $p^\mu$ is a rather weak form of imposing global restrictions on the space–time under consideration. The reader should note that we do not require $p^0$ to be positive, so that our results also apply to space–times with negative mass, as long as the total four–momentum is time–like.

In asymptotically flat space–times one expects Killing vectors to "asymptotically look like" their counterparts in Minkowski space–time. In [4, Proposition 2.1] we have shown that at the leading order this is indeed the case (see also Proposition 2.1 below). This allows one to classify the Killing vectors into "boosts", "translations", etc., according to their leading asymptotic behavior. There exists a large literature concerning the case in which one of the Killing vectors is a time–like translation (e.g., the theory of uniqueness of black holes), but no exhaustive analysis of what Killing vectors are kinematically allowed has been done so far. This might be due to the fact that for Killing vector fields with a rotation–type leading order behaviour, the next to leading order terms are essential to analyse the structure of the orbits, and it seems difficult to control those without some overly restrictive hypotheses on the asymptotic behaviour of the metric. In this work we overcome this difficulty, and prove the following (the reader is referred to Sect. 2 for the definition of a boost–type domain, and for a detailed presentation of the asymptotic conditions used in this paper):

Theorem 1.1. Let $(M, g_{\mu\nu})$ be a space–time containing an asymptotically flat boost–type domain $\Omega$, with time–like (non–vanishing) ADM four momentum $p^\mu$, with fall–off exponent $1/2 < \alpha < 1$ and differentiability index $k \ge 3$ (see Eq. (2.2) below). We shall also assume that the hypersurface $\{t = 0\} \subset \Omega$ can be Lorentz transformed to a hypersurface in $\Omega$ which is asymptotically orthogonal to $p^\mu$. Suppose moreover that the Einstein tensor $G_{\mu\nu}$ of $g_{\mu\nu}$ satisfies in $\Omega$ the fall–off condition
\[
G_{\mu\nu} = O(r^{-3-\epsilon}), \qquad \epsilon > 0. \tag{1.1}
\]
Let $X^\mu$ be a non–trivial Killing vector field on $\Omega$, let $\phi_s[X]$ denote its (perhaps only locally defined) flow. Replacing $X^\mu$ by an appropriately chosen multiple thereof if necessary, one has:

1. There exists $R_1 \ge 0$ such that $\phi_s[X](p)$ is defined for all $p \in \Sigma_{R_1} \equiv \{(0, \vec{x}) \in \Omega : r(\vec{x}) \ge R_1\}$ and for all $s \in [0, 1]$.
2. There exists a constant $a \in \mathbb{R}$ such that, in local coordinates on $\Omega$, for all $x^\mu = (0, \vec{x})$ as in point 1 we have $\phi^\mu_1[X] = x^\mu + a p^\mu + O_k(r^{-\alpha})$.
3. If $a = 0$, then $\phi_1[X](p) = p$ for all $p$ for which $\phi_1[X](p)$ is defined.

The reader should notice that Theorem 1.1 excludes boost-type Killing vectors. This feature is specific to asymptotic flatness at spatial infinity, see [6] for a large class of vacuum space–times with boost symmetries which are asymptotically flat in light–like directions. The theorem is sharp, in the sense that the result is not true if $p^\mu$ is allowed to vanish or to be non–time–like.

When considering asymptotically flat space–times with more than one Killing vector, it is customary to assume that there exists a linear combination of Killing vectors the orbits of which are periodic (and has an axis; see below). However no justification of this property of Killing orbits has been given so far, except perhaps in some special situations. Theorem 1.1 allows us to show that this is necessarily the case. While this property, appropriately understood, can be established without making the hypothesis of completeness of the orbits of the Killing vector fields, the statements become somewhat


awkward. For the sake of simplicity let us therefore assume that we have an action of a connected non–trivial group $G_0$ on $(M, g_{\mu\nu})$ by isometries. Using Theorem 1.1 together with the results of [4] we can classify all the groups and actions. Before doing that we need to introduce some terminology. Consider a space–time $(M, g_{\mu\nu})$ with a Killing vector field $X$. Then $(M, g_{\mu\nu})$ will be said to be:

1. Stationary, if there exists an asymptotically Minkowskian coordinate system $\{y^\mu\}$ on (perhaps a subset of) $\Omega$, with $y^0$ a time coordinate, in which $X = \partial/\partial y^0$. When the orbits of $X$ are complete we shall require that they are diffeomorphic to $\mathbb{R}$, and that $\Sigma_R \equiv \{t = 0, r(\vec{x}) \ge R\}$ intersects the orbits of $X$ only once, at least for $R$ large enough.
2. Axisymmetric, if $X^\mu$ has complete periodic orbits. Moreover $X^\mu$ will be required to have an axis, that is, the set $\{p : X^\mu(p) = 0\} \ne \emptyset$.
3. Stationary-rotating (compare [14]), if the matrix $\sigma^\mu{}_\nu = \lim_{r \to \infty} \partial_\nu X^\mu$ is a rotation matrix, that is, $\sigma^\mu{}_\nu$ has a timelike eigenvector $a^\mu$, with zero eigenvalue.¹ Let $\phi_t[X]$ denote the flow of $X$. We shall moreover require that there exists $T > 0$ such that $\phi_T[X](p) \in I^+(p)$ for $p$ in the exterior asymptotically flat 3-region $\Sigma_{\mathrm{ext}}$.
4. Stationary–axisymmetric, if there exist on $M$ two commuting Killing vector fields $X_a$, $a = 1, 2$, such that $(M, g_{\mu\nu})$ is stationary with respect to $X_1$ and axisymmetric with respect to $X_2$.
5. Spherically symmetric, if, in an appropriate coordinate system on $\Omega$, $SO(3)$ acts on $M$ by rotations of the spheres $r = \mathrm{const}$, $t = \mathrm{const}'$ in $\Omega$, at least for $t = 0$ and $r$ large enough.
6. Stationary–spherically symmetric, if $(M, g_{\mu\nu})$ is stationary and spherically symmetric.

We have the following:

Theorem 1.2. Under the conditions of Theorem 1.1, let $G_0$ denote the connected component of the group of all isometries of $(M, g_{\mu\nu})$. If $G_0$ is non–trivial, then one of the following holds:

1. $G_0 = \mathbb{R}$, and $(M, g_{\mu\nu})$ is either stationary, or stationary–rotating.
2. $G_0 = U(1)$, and $(M, g_{\mu\nu})$ is axisymmetric.
3. $G_0 = \mathbb{R} \times U(1)$, and $(M, g_{\mu\nu})$ is stationary–axisymmetric.
4. $G_0 = SO(3)$, and $(M, g_{\mu\nu})$ is spherically symmetric.
5. $G_0 = \mathbb{R} \times SO(3)$, and $(M, g_{\mu\nu})$ is stationary–spherically symmetric.

We believe that the condition that $\Omega$ be a boost–type domain is unnecessary. Recall, however, that this condition is reasonable for vacuum space–times [9], and one expects it to be reasonable for a large class of couplings of matter fields to gravitation, including electro–vacuum space–times. We wish to point out that in our proof that condition is needed to exclude boost–type Killing vectors, in Proposition 2.2 below, as well as to exclude causality violations in the asymptotic region. We expect that it should be possible to exclude the boost–type Killing vectors purely by an initial data analysis, using the methods of [4]. If that turns out to be the case, the only "largeness requirements" left on $(M, g_{\mu\nu})$ would be the much weaker conditions² needed in Proposition 2.3 below.

¹ If $\sigma^\mu{}_\nu$ has a timelike eigenvector $a^\mu$, we can find a Lorentz frame so that $a^\mu = (a, 0, 0, 0)$. In that frame $\sigma^\mu{}_\nu$ satisfies $\sigma^0{}_\nu = \sigma^\mu{}_0 = 0$, so that it generates space–rotations, if non–vanishing.
² Those global considerations of the proof of Theorem 1.2 which use the structure of $\Omega$ can be carried through under the condition (2.15), provided that the constants $C_1$ and $\hat{C}_1$ appearing there are replaced by some appropriate larger constants. The reader should also note that these considerations are unnecessary when $\Sigma_R$ is assumed to be achronal.


Let us also mention that in stationary space–times with more than one Killing vector all the results below can be proved directly by an analysis of initial data sets, so that no "largeness" conditions on $(M, g_{\mu\nu})$ need to be imposed, see [3]. Let us finally mention that the results here settle in the positive Conjecture 3.2 of [13], when the supplementary hypothesis of existence of at least two Killing vectors is made there.

We find it likely that there exist no electro–vacuum, asymptotically flat space–times which have no black hole region, which are stationary–rotating and for which $G_0 = \mathbb{R}$. A similar statement should be true for domains of outer communications of regular black hole space–times. It would be of interest to prove this result. Let us also point out that the Jacobi ellipsoids [7] provide a Newtonian example of solutions with a one dimensional group of symmetries with a "stationary–rotating" behavior.

2. Definitions, Proofs

Let $W$ be a vector field; throughout we shall use the notation $\phi_t[W]$ to denote the (perhaps defined only locally) flow generated by $W$. Consider a subset $\Omega$ of $\mathbb{R}^4$ of the form
\[
\Omega = \{(t, \vec{x}) \in \mathbb{R} \times \mathbb{R}^3 : r((t, \vec{x})) \ge R,\ |t| \le f(r(\vec{x}))\}, \tag{2.1}
\]
for some constant $R \ge 0$ and some function $f(r) \ge 0$, $f \not\equiv 0$. We shall consider only non–decreasing functions $f$. Here and elsewhere, by a slight abuse of notation, we write
\[
r((t, \vec{x})) = r(\vec{x}) = \sqrt{\sum_{i=1}^3 (x^i)^2}.
\]
Let $\alpha$ be a positive constant; $\Omega$ will be called a boost–type domain if $f(r) = \theta r + C$ for some constants $\theta > 0$ and $C \in \mathbb{R}$ (cf. also [9]). Let $\phi$ be a function defined on $\Omega$. For $\beta \in \mathbb{R}$ we shall say that $\phi = O_k(r^\beta)$ if $\phi \in C^k(\Omega)$, and if there exists a function $C(t)$ such that we have, for $0 \le i \le k$,
\[
|\partial_{\alpha_1} \cdots \partial_{\alpha_i} \phi| \le C(t)(1 + r)^{\beta - i}.
\]
We write $O(r^\beta)$ for $O_0(r^\beta)$. We say that $\phi = o(r^\beta)$ if $\lim_{r \to \infty, t = \mathrm{const}} r^{-\beta} \phi(t, x) = 0$. A metric on $\Omega$ will be said to be asymptotically flat if there exist $\alpha > 0$ and $k \in \mathbb{N}$ such that
\[
g_{\mu\nu} - \eta_{\mu\nu} = O_k(r^{-\alpha}), \tag{2.2}
\]
and if there exists a function $C(t)$ such that
\[
|g_{\mu\nu}| + |g^{\mu\nu}| \le C(t), \tag{2.3}
\]
\[
g^{00} \le -C(t)^{-1}, \qquad g_{00} \le -C(t)^{-1}, \tag{2.4}
\]
\[
\forall X^i \in \mathbb{R}^3 \qquad g_{ij} X^i X^j \ge C(t)^{-1} \sum (X^i)^2. \tag{2.5}
\]
Here and throughout $\eta_{\mu\nu}$ is the Minkowski metric.

Given a set $\Omega$ of the form (2.1) with a metric satisfying (2.2)–(2.5), to every slice $\{t = \mathrm{const}\} \subset \Omega$ one can associate in a unique way the ADM four–momentum vector $p^\mu$ (see [10, 2]), provided that $k \ge 1$, $\alpha > 1/2$, and that the Einstein tensor satisfies the fall–off condition (1.1). Those conditions also guarantee that $p^\mu$ will not depend upon


which hypersurface $t = \mathrm{const}$ has been chosen. The ADM four–momentum of $\Omega$ will be defined as the four–momentum of any of the hypersurfaces $\{t = \mathrm{const}\} \subset \Omega$. We note the following useful result:

Proposition 2.1. Consider a metric $g_{\mu\nu}$ defined on a set $\Omega$ as in (2.1) (with a non–decreasing function $f$), and suppose that $g_{\mu\nu}$ satisfies (2.2)–(2.5) with $k \ge 2$ and $0 < \alpha < 1$. Let $X^\mu$ be a Killing vector field defined on $\Omega$. Then there exist numbers $\sigma_{\mu\nu} = \sigma_{[\mu\nu]}$ such that
\[
X^\mu - \sigma^\mu{}_\nu x^\nu = O_k(r^{1-\alpha}), \tag{2.6}
\]
with $\sigma^\mu{}_\nu \equiv \eta^{\mu\alpha} \sigma_{\alpha\nu}$. If $\sigma_{\mu\nu} = 0$, then there exist numbers $A^\mu$ such that
\[
X^\mu - A^\mu = O_k(r^{-\alpha}). \tag{2.7}
\]
If $\sigma_{\mu\nu} = A^\mu = 0$, then $X^\mu \equiv 0$.

Proof. The result follows from Proposition 2.1 of [4], applied to the slices $\{t = \mathrm{const}\}$, except for the estimates on those partial derivatives of $X$ in which $\partial/\partial t$ factors occur. Those estimates can be obtained from the estimates for the space–derivatives of Proposition 2.1 of [4] and from the equations
\[
\nabla_\mu \nabla_\nu X_\alpha = R^\lambda{}_{\mu\nu\alpha} X_\lambda, \tag{2.8}
\]
which are a well known consequence of the Killing equations.

The proofs of Theorems 1.1 and 1.2 require several steps. Let us start by showing that boost–type Killing vectors are possible only if the ADM four–momentum is spacelike or vanishes:

Proposition 2.2. Let $g_{\mu\nu}$ be a twice differentiable metric on a boost–type domain $\Omega$, satisfying (2.2)–(2.5), with $\alpha > 1/2$ and with $k \ge 2$. Suppose that the Einstein tensor $G_{\mu\nu}$ of $g_{\mu\nu}$ satisfies
\[
G_{\mu\nu} = O(r^{-3-\epsilon}), \qquad \epsilon > 0.
\]
Let $X^\mu$ be a Killing vector field on $\Omega$, set
\[
\sigma^\mu{}_\nu \equiv \lim_{r \to \infty} \frac{\partial X^\mu}{\partial x^\nu} \tag{2.9}
\]
(those limits exist by Proposition 2.1). Then the ADM four–momentum $p_\mu$ of $\Omega$ satisfies
\[
\sigma^\mu{}_\nu\, p_\mu = 0. \tag{2.10}
\]

Proof. If $\sigma^\mu{}_\alpha = 0$ there is nothing to prove, suppose thus that $\sigma^\mu{}_\alpha \ne 0$. Let $\Lambda^\mu{}_\nu$ be a solution of the equation
\[
\frac{d\Lambda^\mu{}_\nu}{ds} = \sigma^\mu{}_\alpha \Lambda^\alpha{}_\nu.
\]
It follows from Proposition 2.1 that the flow $\phi_t[X](p)$ is defined for all $t \in [-\alpha, \alpha]$ and for all $p \in \Sigma_{R_1} \equiv \{t = 0, r(p) \ge R_1\} \subset \Omega$ for some constants $\alpha$ and $R_1$. By [11, Theorem 1], in local coordinates we have
\[
\phi^\mu_t[X] = \Lambda^\mu{}_\nu(t)\, x^\nu + O_k(r^{1-\alpha}), \qquad
\frac{\partial \phi^\mu_t[X]}{\partial x^\nu} = \Lambda^\mu{}_\nu(t) + O_{k-1}(r^{1-\alpha}).
\]


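The error terms above satisfy appropriate decay conditions so that the ADM four–momentum
\[
p_\mu(\phi_t[X](\Sigma_{R_1})) = \int_{\phi_t[X](\Sigma_{R_1})} U_\mu{}^{\alpha\beta}\, dS_{\alpha\beta}
\]
is finite and well–defined. Here $dS_{\mu\nu} = \iota_{\partial_\mu} \iota_{\partial_\nu}\, dx^0 \wedge \ldots \wedge dx^3$, $\iota_X$ denotes the inner product of a vector $X$ with a form, and (cf., e.g., [11])
\[
U_\mu{}^{\alpha\beta} = \delta^{[\alpha}_\lambda \delta^{\beta}_\nu \delta^{\gamma]}_\mu\, \eta^{\lambda\rho} \eta_{\gamma\sigma}\, \partial_\rho g^{\nu\sigma}.
\]
As is well known (see [11] for a proof under the current asymptotic conditions, cf. also [5, 1]), under boosts the ADM four–momentum transforms like a four–vector, that is,
\[
p_\mu(\phi_t[X](\Sigma_{R_1})) = \Lambda_\mu{}^\nu(t)\, p_\nu(\Sigma_{R_1}). \tag{2.11}
\]
On the other hand, the $\phi^\mu_t[X]$'s are isometries, so that
\[
g_{\alpha\beta}(\phi^\mu_t[X](x))\, \frac{\partial \phi^\alpha_t[X]}{\partial x^\mu}(x)\, \frac{\partial \phi^\beta_t[X]}{\partial x^\nu}(x) = g_{\mu\nu}(x),
\]
which gives
\[
U_\alpha{}^{\mu\nu}(\phi^\mu_t[X](x))\, \Lambda^\sigma{}_\mu(t)\, \Lambda^\rho{}_\nu(t) = \Lambda^\gamma{}_\alpha(t)\, U_\gamma{}^{\rho\sigma}(x) + O(r^{-1-2\alpha}). \tag{2.12}
\]
Equations (2.11) and (2.12) give, for all $t$,
\[
\Lambda^\sigma{}_\mu(t)\, p_\sigma = p_\mu, \tag{2.13}
\]
and (2.10) follows by $t$–differentiation of Eq. (2.13).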
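The final differentiation step can be written out explicitly. The sketch below assumes the index placement $\Lambda^\sigma{}_\mu(t)\, p_\sigma = p_\mu$ for Eq. (2.13) and the initial condition $\Lambda^\mu{}_\nu(0) = \delta^\mu_\nu$, which is the natural normalization for the flow at $t = 0$ but is not stated explicitly in the text.

```latex
\begin{align*}
0 &= \frac{d}{dt}\Bigl(\Lambda^\sigma{}_\mu(t)\, p_\sigma - p_\mu\Bigr)\Big|_{t=0}
   = \frac{d\Lambda^\sigma{}_\mu}{dt}(0)\, p_\sigma
   && \text{($p_\mu$ does not depend on $t$)} \\
  &= \sigma^\sigma{}_\alpha\, \Lambda^\alpha{}_\mu(0)\, p_\sigma
   && \text{(defining equation } d\Lambda/ds = \sigma \Lambda\text{)} \\
  &= \sigma^\sigma{}_\mu\, p_\sigma
   && \text{(since } \Lambda^\alpha{}_\mu(0) = \delta^\alpha_\mu\text{)} ,
\end{align*}
```

which is Eq. (2.10) after renaming indices.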

Suppose, now, that the ADM four–momentum $p^\mu$ of the hypersurface $\{t = 0\}$ is timelike. If $\Omega$ is large enough we can find a boost transformation $\Lambda$ such that the hypersurface $\Lambda(\{t = 0\})$ is asymptotically orthogonal to $p^\mu$. It then follows by Proposition 2.2 that the matrix $\sigma$ defined in Eq. (2.9) has vanishing 0-components in that Lorentz frame, and therefore generates space rotations. We need to understand the structure of orbits of such Killing vectors. This is analysed in the proposition that follows:

Proposition 2.3. Let $g_{\mu\nu}$ be a metric on a set $\Omega$ as in Eq. (2.1), and suppose that $g_{\mu\nu}$ satisfies the fall-off condition (2.2) with $0 < \alpha < 1$ and $k \ge 2$. Let $X^\mu$ be a Killing vector field defined on $\Omega$, and suppose that
\[
Z^\mu \partial_\mu \equiv X^\mu \partial_\mu - \omega^i{}_j x^j \partial_i = o(r), \qquad \partial_\sigma Z^\mu = o(1), \tag{2.14}
\]
with $\omega^i{}_j$ a (non–trivial) antisymmetric matrix with constant coefficients, normalized such that $\omega^i{}_j \omega^j{}_i = -2(2\pi)^2$. (It follows from Proposition 2.1 that there exist constants $C_1$, $\hat{C}_1$ such that $|X^0| \le C_1 r^{1-\alpha} + \hat{C}_1$ on $\{t = 0\} \subset \Omega$.) Suppose that the function $f$ in (2.1) satisfies
\[
f(r) \ge C_2 r^{1-\alpha} + \hat{C}_1, \tag{2.15}
\]
where $C_2$ is any constant larger than $C_1$. Let $\phi_s$ denote the flow of $X^\mu$. Then:

1. There exists $R_1 \ge R$ such that $\phi_s(p)$ is well defined for $p \in \Sigma_{R_1} \equiv \{t = 0, r \ge R_1\} \subset \Omega$ and for $s \in [0, 1]$. For those values of $s$ we have $\phi_s(\Sigma_{R_1}) \subset \Omega$.


2. There exist constants $A^\mu$ such that, in local coordinates on $\Omega$, for all $x^\mu \in \Sigma_{R_1}$ we have
\[
\phi^\mu_1 = x^\mu + A^\mu + O_{k-1}(r^{-\alpha}). \tag{2.16}
\]
3. If $A^\mu = 0$, then $\phi_1(p) = p$ for all $p$ for which $\phi_1(p)$ is defined.

Remark. The hypothesis that $\lim_{r \to \infty} \partial_i X^0 = 0$, which is made in (2.14), is not needed for points 2 and 3 above to hold, provided one assumes that the conclusions of point 1 hold.

Proof. Point 1 follows immediately from the asymptotic estimates of Proposition 2.1 and the defining equations for $\phi^\mu_s$,
\[
\frac{d\phi^\mu_s}{ds} = X^\mu \circ \phi_s.
\]
To prove point 2, let $R^i{}_j(s)$ be the solution of the equation
\[
\frac{dR^i{}_j}{ds} = \omega^i{}_k R^k{}_j,
\]
with initial condition $R^i{}_j(0) = \delta^i{}_j$, set $R^0{}_0(s) = 1$, $R^0{}_i(s) = 0$. We have the variation–of–constants formula
\[
\phi^\mu_s(x) = R^\mu{}_\nu(s)\, x^\nu + \int_0^s R^\mu{}_\nu(s - t)\, Z^\nu(\phi_t(x))\, dt,
\]

from which we obtain, in view of Proposition 2.1,
\[
\frac{\partial \phi^\mu_1}{\partial x^\nu} - \delta^\mu_\nu = O_{k-1}(r^{-\alpha}), \tag{2.17}
\]
\[
\phi^\mu_1 - x^\mu = O_k(r^{1-\alpha}). \tag{2.18}
\]
Set $y^\mu(x) = \phi^\mu_1(x)$.
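The normalization $\omega^i{}_j \omega^j{}_i = -2(2\pi)^2$ is what makes the leading term $R(s)x$ of the variation-of-constants formula return to $x$ at $s = 1$, so that only the correction terms survive in (2.17) and (2.18). A minimal numerical sketch follows, assuming a hypothetical rotation axis and using Rodrigues' formula for $R(s) = \exp(s\,\omega)$; the helper names `skew` and `rotation` are illustrative, not from the paper.

```python
import numpy as np

def skew(axis):
    """Antisymmetric matrix K with K @ v = n x v, for the unit vector n along axis."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    return np.array([[0.0, -n[2], n[1]],
                     [n[2], 0.0, -n[0]],
                     [-n[1], n[0], 0.0]])

def rotation(axis, s):
    """R(s) = exp(s * omega) for omega = 2*pi*K, via Rodrigues' formula."""
    K = skew(axis)
    angle = 2.0 * np.pi * s
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

axis = [1.0, 2.0, 2.0]                      # hypothetical rotation axis
omega = 2.0 * np.pi * skew(axis)
normalization = np.trace(omega @ omega)     # omega^i_j omega^j_i
R_half, R_one = rotation(axis, 0.5), rotation(axis, 1.0)
```

With this normalization the rotation has period exactly one in $s$: `R_one` is the identity up to rounding, while `R_half` is a rotation by $\pi$.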

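As $y^\mu(x^\nu)$ is an isometry, we have the equations
\[
\frac{\partial^2 y^\alpha}{\partial x^\mu \partial x^\nu} = \Gamma^\sigma_{\mu\nu}(x)\, \frac{\partial y^\alpha}{\partial x^\sigma} - \Gamma^\alpha_{\beta\gamma}(y(x))\, \frac{\partial y^\beta}{\partial x^\mu} \frac{\partial y^\gamma}{\partial x^\nu}. \tag{2.19}
\]
From (2.17)–(2.18) we obtain
\[
\frac{\partial^2 (y^\alpha - x^\alpha)}{\partial x^\mu \partial x^\nu}
= \Gamma^\alpha_{\mu\nu}(x) - \Gamma^\alpha_{\mu\nu}(y(x)) + O_{k-1}(r^{-1-2\alpha})
\]
\[
= -(y^\rho(x) - x^\rho) \int_0^1 \partial_\rho \Gamma^\alpha_{\mu\nu}(t x + (1 - t) y(x))\, dt + O_{k-1}(r^{-1-2\alpha})
= O_{k-2}(r^{-1-2\alpha}). \tag{2.20}
\]
We can integrate this estimate in $r$ to obtain
\[
\frac{\partial (y^\alpha - x^\alpha)}{\partial x^\mu} = O_{k-1}(r^{-2\alpha}).
\]
If $2\alpha > 1$, the Lemma of Appendix A of [11] shows that the limits $\lim_{r \to \infty, t = 0} (y^\alpha - x^\alpha) = A^\alpha$ exist and we get
\[
y^\alpha - x^\alpha = A^\alpha + O_k(r^{1-2\alpha}).
\]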


Otherwise, decreasing $\alpha$ slightly if necessary, we may assume that $2\alpha < 1$, in which case we simply obtain
\[
y^\alpha - x^\alpha = O_k(r^{1-2\alpha}).
\]
If the last case occurs we can repeat this argument $\ell - 1$ times to obtain $O(r^{-1-(\ell+1)\alpha})$ at the right–hand–side of (2.20) until $-1 - (\ell + 1)\alpha < -2$; at the last iteration we shall thus obtain $O(r^{-2-\epsilon})$ there, with some $\epsilon > 0$. We can again use the Lemma of Appendix A of [11] to conclude that the limits $\lim_{r \to \infty, t = 0} (y^\alpha - x^\alpha) = A^\alpha$ exist. An iterative argument similar to the one above applied to (2.20) gives then
\[
\xi^\alpha \equiv y^\alpha - x^\alpha - A^\alpha = O_k(r^{-\alpha}), \tag{2.21}
\]
which establishes point 2.

Suppose finally that $A^\mu$ vanishes. Equation (2.19) implies an inequality of the form
\[
\Bigl| \frac{\partial^2 (y^\alpha - x^\alpha)}{\partial x^\mu \partial x^\nu} \Bigr| \le C \bigl( |\partial \Gamma|\, |y - x| + |\Gamma|\, |\partial (y - x)| \bigr), \tag{2.22}
\]
for some constant $C$. A standard bootstrap argument using (2.22), (2.17) and (2.18) shows that for all $\sigma \ge 0$ we have
\[
\lim_{r \to \infty} \bigl[ r^\sigma |y - x| + r^\sigma |\partial (y - x)| \bigr] = 0. \tag{2.23}
\]
Define
\[
F = r^{\beta - 2} |y - x|^2 + r^\beta |\partial (y - x)|^2. \tag{2.24}
\]
Choosing $\beta$ large enough one finds from (2.22) that
\[
\frac{\partial F}{\partial r} \ge 0. \tag{2.25}
\]
This implies
\[
R_2 \le r \le r_1 \ \Rightarrow\ F(r_1) \ge F(r) \ge 0. \tag{2.26}
\]
Passing with $r_1 \to \infty$, from (2.23) we obtain $\phi_1(x) = x$ for $x \in \Sigma_{R_1}$. $\phi_1$ is therefore an isometry which reduces to an identity on a spacelike hypersurface, and point 3 follows from [12, Lemma 2.1.1].

We are ready now to pass to the proof of Theorem 1.1:

Proof of Theorem 1.1. Let $y^\alpha(x^\beta)$ be defined as in the proof of Proposition 2.3; as it is an isometry we have the equation:
\[
g_{\mu\nu}(y(x))\, \frac{\partial y^\mu}{\partial x^\alpha} \frac{\partial y^\nu}{\partial x^\beta} = g_{\alpha\beta}(x). \tag{2.27}
\]
Set $\xi_\alpha = \eta_{\alpha\beta} \xi^\beta$, where $\eta_{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$, with $\xi$ defined by Eq. (2.21). Equations (2.21) and (2.27) together with the asymptotic form of the metric, Eq. (2.2), give
\[
\frac{\partial \xi_\alpha}{\partial x^\beta} + \frac{\partial \xi_\beta}{\partial x^\alpha} + g_{\alpha\beta}(x^\sigma + A^\sigma + \xi^\sigma) - g_{\alpha\beta}(x^\sigma) = O_{k-1}(r^{-1-2\alpha}). \tag{2.28}
\]
Suppose first that $A^\sigma \not\equiv 0$; we have


\[
g_{\alpha\beta}(x^\sigma + A^\sigma + \xi^\sigma) - g_{\alpha\beta}(x^\sigma)
= \frac{\partial g_{\alpha\beta}}{\partial x^\rho}(x^\sigma) A^\rho + \int_0^1 \Bigl[ \frac{\partial g_{\alpha\beta}}{\partial x^\rho}(x^\sigma + s(A^\sigma + \xi^\sigma))(A^\rho + \xi^\rho) - \frac{\partial g_{\alpha\beta}}{\partial x^\rho}(x^\sigma) A^\rho \Bigr] ds
\]
\[
= \frac{\partial g_{\alpha\beta}}{\partial x^\rho}(x^\sigma) A^\rho + O(r^{-1-2\alpha}).
\]
A similar calculation for the derivatives of $g_{\alpha\beta}$ gives
\[
g_{\alpha\beta}(x^\sigma + A^\sigma + \xi^\sigma) - g_{\alpha\beta}(x^\sigma) = \frac{\partial g_{\alpha\beta}}{\partial x^\rho}(x^\sigma) A^\rho + O_{k-2}(r^{-1-2\alpha}). \tag{2.29}
\]
In a neighbourhood of $\Sigma_{R_1}$ define a vector field $Y^\mu$ by $Y^\mu = \xi^\mu + A^\mu$. It follows from (2.28)–(2.29) that $Y^\mu$ satisfies the equation
\[
\nabla_\mu Y_\nu + \nabla_\nu Y_\mu = O_{k-2}(r^{-1-2\alpha}).
\]
By hypothesis we have $k \ge 3$ and $2\alpha > 1$, we can thus use [4, Proposition 3.1] to conclude that $A^\mu$ must be proportional to $p^\mu$. The remaining claims follow directly by Proposition 2.3.

To prove Theorem 1.2 we shall need two auxiliary results:

Proposition 2.4. Under the hypotheses of Prop. 2.1, let $W$ be a non–trivial Killing vector field defined on $\Omega$. Suppose that there exists $R_1$ such that for $p \in \Sigma_{R_1}$ the orbits $\phi_s[W](p)$ are defined for $s \in [0, 1]$, with $\phi_1[W](p) = p$. Assume moreover that there exists a non–vanishing antisymmetric matrix with constant coefficients $\omega^i{}_j$ such that
\[
W^\mu \partial_\mu - \omega^i{}_j x^j \partial_i = o(r).
\]
Then the set $\{p : W(p) = 0\}$ is not empty.

Remark. The following half–converse to Proposition 2.4 is well known: Let $W$ be a Killing vector field on a Lorentzian manifold $M$ and suppose that $W(p) = 0$. If there exists a neighborhood $O$ of $p$ such that $W$ is nowhere time–like on $O$, then there exists $T > 0$ such that all orbits which are defined for $t \ge T$ are periodic.

Proof. Let $\phi_s$ denote the flow of $W$ on $\Omega$, and for $p \in \Sigma_{R_1}$ define
\[
\bar{t}(p) = \int_0^1 t \circ \phi_s(p)\, ds, \tag{2.30}
\]
\[
\bar{r}(p) = \int_0^1 r \circ \phi_s(p)\, ds. \tag{2.31}
\]
Note that $(\phi_s)^*$ asymptotes to the matrix $R^\mu{}_\nu(s)$ defined in the proof of Prop. 2.3, which gives
\[
\nabla \bar{r} = \int_0^1 (\phi_s)^* (\nabla r) \circ \phi_s(p)\, ds \approx \nabla r + O(r^{-\alpha}).
\]
Similarly
\[
\nabla \bar{t} \approx \nabla t + O(r^{-\alpha}).
\]
This shows that for $R$ large enough the sets $S_{R,T} = \{p : \bar{r}(p) = R,\ \bar{t}(p) = T\}$ are differentiable spheres. Moreover


r¯ ◦ φs = r, ¯

t¯ ◦ φs = t¯,

(2.32)

so that $W$ is tangent to $S_{R,T}$. As every continuous vector field tangent to a two–dimensional sphere has fixed points, the result follows.

Proof of Theorem 1.2. Let $\mathfrak g$ denote the Lie algebra of $G_0$. As is well known [19, Vol. I, Chapter VI, Theorem 3.4], to any element $h$ of $\mathfrak g$ there is associated a unique Killing vector field $X^\mu(h)$, the orbits of which are complete. Suppose first that $\mathfrak g$ is 1–dimensional. If the constant $a$ of Theorem 1.1 vanishes, $(M, g_{\mu\nu})$ is axisymmetric by part 3 of Theorem 1.1 and by Proposition 2.4. If $a$ does not vanish there are two cases to analyse. Consider first the case in which $\partial_\mu X^\nu \not\to 0$ as $r \to \infty$. Let us perform a Lorentz transformation so that the new hypersurface $t = 0$, still denoted by $\Sigma_R$, is asymptotically normal to $p^\mu$. By Proposition 2.2 we must have $\lim_{r\to\infty} \partial_i X^0 = \lim_{r\to\infty} \partial_0 X^i = 0$, hence Proposition 2.3 applies. As $M$ contains a boost–type domain, for any $T$ we can choose $p \in \Sigma_{R_1}$, with $r(p)$ large enough, so that $\phi_s[X](p)$ is defined for all $s \in [0, T]$, with $\phi_s[X](p) \neq p$ by (2.16). This shows that $G_0$ cannot be $U(1)$, hence $G_0 = \mathbb R$, and $(M, g_{\mu\nu})$ is stationary–rotating as claimed.

The second case to consider is, by Proposition 2.1, that in which $X^\mu \to a p^\mu = A^\mu$ as $r \to \infty$ in $\Omega$. We want to show that $\Sigma_R$ is a global cross–section for $\phi_s[X]$, at least for $R$ large enough. To do that, note that timelikeness of $A^\mu$ implies that we can choose $R_2$ large enough so that $X^\mu$ is transverse to $\Sigma_{R_2}$. Let $(g_{ij}, K_{ij})$ be the induced metric and the extrinsic curvature of $\Sigma_{R_2}$, and let $(\hat M, \hat g_{\mu\nu})$ be the Killing development of $(\Sigma_{R_2}, g_{ij}, K_{ij})$ constructed using the Killing vector field $X^\mu$, see Sect. 2 of [4] for details. Define $\Psi : \hat M \to M_{R_2} \equiv \cup_{t\in\mathbb R}\, \phi_t[X](\Sigma_{R_2})$ by $\Psi(t, \vec x) = \phi_t[X](0, \vec x)$. Then $\Psi$ is a local isometry between $\hat M$ and $M_{R_2}$. $\Psi$ is surjective by construction, and there exists a boost–type domain $\hat\Omega$ in $\hat M$ such that $\Psi|_{\hat\Omega}$ is a diffeomorphism between $\hat\Omega$ and $\Omega$.

Suppose that $\Psi$ is not injective; let us first show that this is equivalent to the statement that $\Psi^{-1}(\Sigma_{R_2})$ is not connected. Indeed, let $p = (t, \vec x)$ and $q = (\tau, \vec y)$ be such that $\Psi(p) = \Psi(q)$, then $\phi_{-t}(\Psi(p)) = \phi_{-t}(\Psi(q))$ so that $\Psi((0, \vec x)) = \Psi((\tau - t, \vec y))$, which leads to $(\tau - t, \vec y) \in \Psi^{-1}(\Sigma_{R_2})$. Consider any connected component $\hat\Sigma$ of $\Psi^{-1}(\Sigma_{R_2})$; as $\Psi$ is a local isometry, $\hat\Sigma$ is an asymptotically flat hypersurface in $\hat M$. By [11, Lemma 1 and Theorem 1], we have

$$\hat\Sigma = \{t = h(\vec x),\ \vec x \in U \subset \mathbb R^3\},$$

where $U$ contains $\mathbb R^3 \setminus B(R_3)$ for some $R_3 \ge R_2$. Moreover there exists a Lorentz matrix $\Lambda^\mu{}_\nu$ such that $h(\vec x) = \Lambda^0{}_i X^i + O(r^{1-\alpha})$. Note that the unit normal to $\hat\Sigma$ approaches, as $r \to \infty$, the Killing vector $X$, hence

$$\Lambda^\mu{}_\nu X^\nu = X^\mu \quad\Rightarrow\quad \Lambda^0{}_i = \Lambda^i{}_0 = 0.$$

It follows that $h(\vec x) = O(r^{1-\alpha})$, so that $\Psi((h(\vec x), \vec x)) \in \Omega$ for $r(\vec x) \ge R_4$ for some constant $R_4 \ge R_3$. Consider a point $q \in \Sigma_{R_4}$; then there exists a point $(0, \vec x)$ such that $\Psi(0, \vec x) = q$ and a point $(h(\vec y), \vec y) \in \hat\Sigma$ such that $\Psi(h(\vec y), \vec y) = q$. This, however, contradicts the fact that $\Psi|_{\hat\Omega}$ is a diffeomorphism between the boost–type domains $\hat\Omega$ and $\Omega$. We conclude that $\Psi$ is injective. It follows that $\Psi$ is a bijection, which implies that all the orbits through $p \in \Sigma_{R_2}$ are diffeomorphic to $\mathbb R$, and that they intersect $\Sigma_{R_2}$ only once.


Suppose next that $\mathfrak g$ is two–dimensional. Then there exist on $M$ two linearly independent Killing vectors $X_a^\mu$, $a = 1, 2$. Propositions 2.2 and 2.3 lead to the following three possibilities:

i) There exist constants $B_a^\mu$, $a = 1, 2$, such that $X_a^\mu - B_a^\mu = o(1)$. By [4, Prop. 3.1] we have $B_a^\mu = a_a p^\mu$ for some constants $a_a$. It follows that there exist constants $(\alpha, \beta) \neq (0, 0)$ such that $\alpha X_1^\mu + \beta X_2^\mu = o(1)$. Proposition 2.1 implies that $\alpha X_1^\mu + \beta X_2^\mu = 0$, which contradicts the hypothesis $\dim \mathfrak g = 2$; therefore this case cannot occur.

ii) There exist constants $B^\mu$ and $\omega^i{}_j = -\omega^j{}_i$ such that

$$X_1^\mu - B^\mu = o(1), \qquad X_2^\mu \partial_\mu - \omega^i{}_j\, x^i \partial_j = o(r). \tag{2.33}$$

Consider the commutator $[X_1, X_2]$. The estimates on the derivatives of $X_a^\mu$ of Proposition 2.1 give $[X_1, X_2]^0 = o(1)$, $[X_1, X_2]^i = o(r)$, so that by Prop. 2.1 the commutator $[X_1, X_2]$ either vanishes, or asymptotes to a constant vector with vanishing time–component, hence spacelike. The latter case cannot occur in view of [4, Prop. 3.1], hence $[X_1, X_2] = 0$. It follows that $\phi_t[X_2 + \alpha X_1] = \phi_t[X_2] \circ \phi_t[\alpha X_1]$. Let $a p^\mu$ be the vector given by Theorem 1.1 for the vector field $X_2^\mu$. In local coordinates we obtain $\phi_1^\mu[X_2 + \alpha X_1] = x^\mu + a p^\mu + \alpha B^\mu + O(r^{-\alpha})$. By [4, Prop. 3.1] we have $B^\mu \sim p^\mu$, so that we can choose $\alpha$ so that $\phi_1^\mu[X_2 + \alpha X_1] = x^\mu + O(r^{-\alpha})$. By point 3 of Theorem 1.1 we obtain $\phi_1[X_2 + \alpha X_1](p) = p$, hence all orbits of $X_2^\mu + \alpha X_1^\mu$ are periodic with period 1. As $p^\mu$ is time–like, the orbits of $X_1^\mu$ must be time–like in the asymptotic region. As before, those orbits cannot be periodic because the coordinates on $\Omega$ cover a boost–type region, hence they must be diffeomorphic to $\mathbb R$. As $[X_1, X_2] = 0$, we obtain that $G_0$ is the direct product $\mathbb R \times U(1)$.

iii) For $\dim \mathfrak g = 2$ the last case left to consider is that when there exist non–zero antisymmetric constants $\omega^a_{ij}$, $a = 1, 2$, such that $X_a^\mu \partial_\mu - \omega^a_{ij} x^i \partial_j = o(r)$. Suppose that the matrices $\omega^a_{ij}$ do not commute; then by well known properties of $so(3)$ the matrices $\omega^a_{ij}$ together with the matrix $\omega^1_{ij}\omega^2_{jk} - \omega^2_{ij}\omega^1_{jk}$ are linearly independent. It follows that $[X_1, X_2]$ is a Killing vector linearly independent of $X_1$ and $X_2$ near infinity, whence everywhere in $\Omega$. It is well known that the orbits of $[X_1, X_2]$ are complete when those of $X_1$ and $X_2$ are [19, Vol. I, Chapter VI, Theorem 3.4], which implies that $G_0$ is at least three–dimensional, which contradicts $\dim \mathfrak g = 2$. If the matrices $\omega^a_{ij}$ commute they are linearly dependent. Thus there exist constants $(\alpha, \beta) \neq (0, 0)$ such that $\alpha X_1^\mu + \beta X_2^\mu = o(r)$. By Proposition 2.1 the Killing vector field $\alpha X_1^\mu + \beta X_2^\mu$ is a translational Killing vector, and the case here is reduced to point ii) above.

Let us turn now to the case of a three dimensional Lie algebra $\mathfrak g$. An analysis similar to the above shows that this can only be the case if three Killing vector fields $X_i^\mu$, $i = 1, 2, 3$, on $M$ can be chosen so that $X_i^\mu \partial_\mu - \epsilon_{ijk} x^j \partial_k = o(r)$. Moreover we must have $[X_i, X_j] = \epsilon_{ijk} X_k$. Then $\mathfrak g$ is the Lie algebra of $SO(3)$, so that $G_0 = SO(3)$, or its covering group $Spin(3) = SU(2)$ [18, p. 117, Problem 7]. Integrating over the group as in the proof of Proposition 2.4 (the integral $\int_0^1$ in Eqs. (2.30)–(2.31) should be replaced by an integral over the group $G_0$ with respect to the Haar measure) one can pass to a new coordinate system, defined perhaps only on a subset of $\Omega$, such that the spheres $t = \mathrm{const}$, $r = \mathrm{const}'$ are invariant under $G_0$. $G_0$ must be $SO(3)$, as $SO(3)$ is$^3$ the largest group acting effectively on $S^2$. The proof of point 5) is left to the reader.

$^3$ This can be seen as follows: Any isometry is uniquely determined by its action at one point of the tangent bundle. Since $SO(3)$ acts transitively on $TS^2$, no larger groups can act effectively there.
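
The $so(3)$ fact invoked in case iii) can be made concrete by identifying an antisymmetric $3 \times 3$ matrix with a vector in $\mathbb R^3$; under this identification the matrix commutator becomes the cross product. The following Python sketch checks numerically that two non-commuting antisymmetric matrices and their commutator are linearly independent, while commuting ones have vanishing commutator. The helper names (`hat`, `vee`, `commutator`, `independent_with_commutator`) are illustrative and not taken from the paper.

```python
def hat(w):
    """Antisymmetric 3x3 matrix associated with the vector w."""
    x, y, z = w
    return ((0, -z, y), (z, 0, -x), (-y, x, 0))

def vee(m):
    """Inverse of hat: read the vector off an antisymmetric matrix."""
    return (m[2][1], m[0][2], m[1][0])

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(3)) for i in range(3))

def triple_product(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def independent_with_commutator(u, v, tol=1e-12):
    """True if hat(u), hat(v) and [hat(u), hat(v)] are linearly independent."""
    w = vee(commutator(hat(u), hat(v)))
    return abs(triple_product(u, v, w)) > tol

print(independent_with_commutator((1, 2, 3), (0, 1, -1)))  # non-commuting pair
print(independent_with_commutator((1, 2, 3), (2, 4, 6)))   # proportional, commuting pair
```

Since the commutator corresponds to $u \times v$, the determinant computed here equals $|u \times v|^2$, which vanishes exactly when the two matrices are proportional.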


3. Concluding Remarks

Theorem 1.1 leaves open the intriguing possibility of a space–time which has only one Killing vector which, roughly speaking, behaves as a spacelike rotation accompanied by a time–like translation. We conjecture that this is not possible when the Einstein tensor $G_{\mu\nu}$ falls off at a sufficiently fast rate, when global regularity conditions are imposed and when positivity conditions on $G_{\mu\nu}$ are imposed.

One would like to go beyond the classification of groups given here, and consider the whole group of isometries $G$, not only the connected component of the identity thereof, $G_0$. Recall, e.g., that a discrete group of conformal isometries acts on the critical space–times which arise in the context of the Choptuik effect [8, 17].

Let us first consider the case of time–periodic space–times. Clearly such space–times exist when no field equations or energy inequalities hold, so that the classification question becomes interesting only when some field equations or energy inequalities are imposed. In the vacuum case some stationarity results have been obtained for spatially compact space–times by Galloway [15]. In the asymptotically flat context non–existence of periodic non–stationary vacuum solutions with an analytic Scri has been established by Papapetrou [21], cf. also Gibbons and Stewart [16]. The hypothesis of analyticity of Scri is, however, difficult to justify; moreover the example of boost–rotation symmetric space–times shows that the condition of asymptotic flatness in light–like directions might lead to essentially different behaviour, as compared to that which arises in the context of asymptotic flatness in space–like directions. One expects that non–stationary time–periodic vacuum space–times do not exist, but no satisfactory analysis of that possibility seems to have been done so far.

Another set of discrete isometries that might arise is that of discrete subgroups of the rotation group, time–reflections, space–reflections, etc. In those cases $G/G_0$ is compact. It is easy to construct initial data $(g_{ij}, K_{ij})$ on a compact or asymptotically flat manifold $\Sigma$ which are invariant under a discrete isometry group, in such a way that the group $H$ of all isometries of $g_{ij}$ which preserve $K_{ij}$ is not connected. By [12, Theorem 2.1.4] the group $H$ will act by isometries on the maximal globally hyperbolic development $(M, g_{\mu\nu})$ of $(\Sigma, g_{ij}, K_{ij})$, and it is rather clear that in generic such situations the group $G$ of all isometries of $(M, g_{\mu\nu})$ will coincide with $H$. In this way one obtains space–times in which $G/G_0$ is compact. It is tempting to conjecture that for, say vacuum, globally hyperbolic space–times with a compact or asymptotically flat, appropriately regular, Cauchy surface, the quotient $G/G_0$ will be a finite set. The proof of such a result would imply non–existence of non–stationary time–periodic space–times in this class of space–times.

Acknowledgement. P.T.C. is grateful to the E. Schrödinger Institute and to the Relativity Group in Vienna for hospitality during part of the work on this paper. We are grateful to A. Fischer and A. Polombo for useful comments.

References

1. Ashtekar, A. and Hansen, R.O.: A unified treatment of null and spatial infinity in general relativity. I. Universal structure, asymptotic symmetries and conserved quantities at spatial infinity. J. Math. Phys. 19, 1542–1566 (1978)
2. Bartnik, R.: The mass of an asymptotically flat manifold. Comm. Pure Appl. Math. 39, 661–693 (1986)
3. Beig, R. and Chruściel, P.T.: Killing Initial Data. Class. Quantum Grav. 14, A83–A92 (1996) (A special issue in honour of Andrzej Trautman on the occasion of his 64th Birthday, J. Tafel, editor.)
4. Beig, R. and Chruściel, P.T.: Killing vectors in asymptotically flat space–times: I. Asymptotically translational Killing vectors and the rigid positive energy theorem. Jour. Math. Phys. 37, 1939–1961 (1996), gr-qc/9510015
5. Beig, R. and Ó Murchadha, N.: The Poincaré group as the symmetry group of canonical general relativity. Ann. Phys. 174, 463–498 (1987)
6. Bičák, J. and Schmidt, B.: Asymptotically flat radiative space–times with boost–rotation symmetry: The general structure. Phys. Rev. D40, 1827–1853 (1989)
7. Chandrasekhar, S.: Ellipsoidal figures of equilibrium. New York: Dover Publ., 1969
8. Choptuik, M.: Universality and scaling in gravitational collapse of a massless scalar field. Phys. Rev. Lett. 70, 9–12 (1993)
9. Christodoulou, D. and Ó Murchadha, N.: The boost problem in general relativity. Commun. Math. Phys. 80, 271–300 (1980)
10. Chruściel, P.T.: Boundary conditions at spatial infinity from a hamiltonian point of view. In: Topological Properties and Global Structure of Space–Time (P. Bergmann and V. de Sabbata, eds.), New York: Plenum Press, 1986, pp. 49–59
11. Chruściel, P.T.: On the invariant mass conjecture in general relativity. Commun. Math. Phys. 120, 233–248 (1988)
12. Chruściel, P.T.: On uniqueness in the large of solutions of Einstein equations (“Strong Cosmic Censorship”). Canberra: Australian National University Press, 1991
13. Chruściel, P.T.: “No Hair” Theorems – folklore, conjectures, results. In: Differential Geometry and Mathematical Physics, J. Beem and K.L. Duggal, eds., vol. 170, Providence, RI: American Mathematical Society, 1994, pp. 23–49, gr-qc/9402032
14. Chruściel, P.T. and Wald, R.M.: Maximal hypersurfaces in stationary asymptotically flat space–times. Commun. Math. Phys. 163, 561–604 (1994), gr-qc/9304009
15. Galloway, G.J.: Splitting theorems for spatially closed space–times. Commun. Math. Phys. 96, 423–429 (1984)
16. Gibbons, G. and Stewart, J.M.: Absence of asymptotically flat solutions of Einstein’s equations which are periodic and empty near infinity. In: Classical general relativity, W.B. Bonnor and M.A.H. MacCallum, eds., Cambridge: Cambridge University Press, 1984, pp. 77–94
17. Gundlach, C.: The Choptuik spacetime as an eigenvalue problem. Phys. Rev. Lett. 75, 3214–3218 (1995), gr-qc/9507054
18. Kirillov, A.: Éléments de la théorie des représentations. Moscow: Mir, 1974, in French (translation from Russian)
19. Kobayashi, S. and Nomizu, K.: Foundations of differential geometry. New York: Interscience Publishers, 1963
20. Kramer, D., Stephani, D., MacCallum, M. and Herlt, E.: Exact solutions of Einstein’s field equations. E. Schmutzer, ed., Cambridge: Cambridge University Press, 1980
21. Papapetrou, A.: On periodic non-singular solutions in the general theory of relativity. Ann. Phys. 6, 399–411 (1957)

Communicated by H. Nicolai

Commun. Math. Phys. 188, 599 – 656 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Semiclassical Eigenvalue Estimates for the Pauli Operator with Strong Non-Homogeneous Magnetic Fields*
II. Leading Order Asymptotic Estimates

László Erdős1, Jan Philip Solovej2

1 Courant Institute, NYU, 251 Mercer Str, New York, NY-10012, USA. E-mail: [email protected]
2 Department of Mathematics, Aarhus University, Ny Munkegade Bgn. 530, DK-8000 Aarhus C, Denmark. E-mail: [email protected]

Received: 11 September 1996 / Accepted: 17 February 1997

Abstract: We give the leading order semiclassical asymptotics for the sum of the negative eigenvalues of the Pauli operator (in dimension two and three) with a strong non-homogeneous magnetic field. As in [LSY-II] for homogeneous field, this result can be used to prove that the magnetic Thomas-Fermi theory gives the leading order ground state energy of large atoms. We develop a new localization scheme well suited to the anisotropic character of the strong magnetic field. We also use the basic Lieb-Thirring estimate obtained in our companion paper [ES-I].

Contents

1 Introduction
1.1 Main results in semiclassics
1.2 Application to the magnetic Thomas-Fermi theory
2 Localization for Operators with Magnetic Fields
3 Semiclassics in Two Dimensions
3.1 Two dimensional Lieb-Thirring inequality
3.2 Constant field approximation
3.3 Lower bound
3.4 Upper bound
4 Semiclassics in Three Dimensions
4.1 Constant field approximation
4.2 Lower bound
4.3 Upper bound
5 Magnetic Thomas-Fermi Theory
5.1 Rescaling
5.2 Reduction to a one-body problem
5.3 Properties of the potentials
5.4 Completing the proof of the MTF Theorem
A The Geometry of the Three Dimensional Magnetic Field

* This work was partially supported by The Danish Natural Science Research Council and the European Union TMR grant FMRX-CT-96-0001

1. Introduction

This work is the continuation of our previous paper [ES-I] on studying semiclassical limits of the Pauli operator with both electric and magnetic fields. Our main concern, compared to most other works in the subject, is to allow for non-homogeneous magnetic fields. This transition from homogeneous to non-homogeneous field is highly non-trivial, partly because of challenging technical difficulties and partly because the non-homogeneous field can exhibit qualitatively different behaviour.

We shall be concerned with dimensions two and three. Though it may seem that dimension three is the physically most important case, certain new experimental techniques (see an extensive review in [LSY-III]) allow one to study effectively two dimensional systems, like quantum dots. For these two dimensional systems laboratory magnetic fields actually have a stronger influence on the structure than for most three dimensional systems. The other reason why we treat the two dimensional case as well is pedagogical; some of our basic ideas can be presented with fewer technicalities in dimension two.

The three dimensional Pauli operator is the following operator acting on the space $L^2(\mathbb R^3; \mathbb C^2)$ of spinor valued functions:

$$H(h, A, V) := [\sigma \cdot (-ih\nabla + A(x))]^2 + V(x) = (-ih\nabla + A(x))^2 + V(x) + h\,\sigma \cdot \mathbf B(x), \tag{1.1}$$

where $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ is the vector of Pauli spin matrices, i.e.,

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

The magnetic field $\mathbf B : \mathbb R^3 \to \mathbb R^3$ is a divergence free field related to the vector potential $A : \mathbb R^3 \to \mathbb R^3$ by $\mathbf B = \nabla \times A$. The potential $V(x)$ describes the electric field. As usual $h$ is the semiclassical parameter. Throughout the paper we shall use the convention of writing $-i\nabla = p$. Let $B(x) := |\mathbf B(x)|$ and $n(x) := \mathbf B(x)/B(x)$ be the field strength and direction, respectively.

The two dimensional Pauli operator has essentially the same form as the three dimensional operator above. The modifications are rather obvious.
The magnetic field is a function $B : \mathbb R^2 \to \mathbb R$, the vector potential is a vector field $A : \mathbb R^2 \to \mathbb R^2$ and we shall write as before $B = \nabla \times A$ with the obvious interpretation. We apply the convention that $\sigma \cdot v := \sigma \cdot (v, 0)$ for any $v \in \mathbb R^2$, $\sigma \cdot p = \sigma \cdot (p, 0)$ for the two dimensional momentum operator $p$, and similarly for the vector product of a 3D and a 2D vector (in particular, we define $n := (0, 0, 1)$ and let $n \times v \in \mathbb R^2$ be $n \times v := (-v_2, v_1)$ for $v \in \mathbb R^2$). We may then write

$$H^{(2)}(h, A, V) := [\sigma \cdot (hp + A(x))]^2 + V(x) = (hp + A(x))^2 + V(x) + h\sigma_3 B(x). \tag{1.2}$$

Since we shall consider $\mathbf B \in C^1(\mathbb R^3) \cap L^\infty(\mathbb R^3)$ in the three dimensional case and $B \in C^1(\mathbb R^2) \cap L^\infty(\mathbb R^2)$ in the two dimensional case, we consider the last terms in (1.1) and (1.2) as bounded multiplication operators on the corresponding $L^2$ space. The vector potential $A$ can and shall be chosen to be $C^1$. For the class of potentials $V$ we shall work with, $H$ and $H^{(2)}$ are defined as the Friedrichs extensions of the operators restricted to $C_0^\infty$ functions.
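
The second equalities in (1.1) and (1.2) rest on the product rule $\sigma_j \sigma_k = \delta_{jk} I + i \sum_l \epsilon_{jkl} \sigma_l$ for the Pauli matrices: the symmetric part produces the square of the magnetic momentum, and the antisymmetric part, combined with the commutators of the components of $hp + A$, produces the spin term. The Python sketch below checks this rule numerically and its special case $(\sigma \cdot v)^2 = |v|^2 I$ for a constant real vector $v$. The function names are illustrative only, and the check concerns the matrix algebra alone, not the differential operators.

```python
import itertools

# Pauli matrices and the 2x2 identity as nested tuples of (complex) numbers.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
IDENTITY = ((1, 0), (0, 1))

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def lincomb(coeffs, mats):
    """Linear combination sum_n coeffs[n] * mats[n] of 2x2 matrices."""
    return tuple(
        tuple(sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2))
        for i in range(2)
    )

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def levi_civita(j, k, l):
    """Sign of the permutation (j, k, l) of (0, 1, 2); zero on repeated indices."""
    return (j - k) * (k - l) * (l - j) // 2

def product_rule_holds():
    """Check sigma_j sigma_k = delta_jk I + i sum_l eps_jkl sigma_l for all j, k."""
    for j, k in itertools.product(range(3), repeat=2):
        lhs = matmul(SIGMA[j], SIGMA[k])
        coeffs = [1 if j == k else 0] + [1j * levi_civita(j, k, l) for l in range(3)]
        rhs = lincomb(coeffs, (IDENTITY,) + SIGMA)
        if not close(lhs, rhs):
            return False
    return True

def sigma_dot_squared_is_norm(v):
    """Check (sigma . v)^2 = |v|^2 I for a constant real 3-vector v."""
    m = lincomb(v, SIGMA)
    norm_sq = sum(c * c for c in v)
    return close(matmul(m, m), lincomb([norm_sq], (IDENTITY,)))

print(product_rule_holds(), sigma_dot_squared_is_norm((1.0, 2.0, -3.0)))
```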


The Pauli operator describes the motion of a non-relativistic electron, where the electron spin is important because of its interaction with the magnetic field. For simplicity we have not included any physical parameters (i.e., the electron mass, the electron charge, the speed of light, or Planck’s constant $\hbar$) in the expressions for the operators. In place of Planck’s constant we have the semiclassical parameter $h$, which we let tend to zero. The last identities in (1.1) and (1.2) can easily be checked. If we note that $\sigma \cdot (hp + A(x))$ is in fact the three dimensional Dirac operator, we recognize the last identity in (1.1) as the Lichnerowicz formula. As a consequence of these identities one sees a significant difference between the Pauli operator and the ‘magnetic’ Schrödinger operator $(hp + A(x))^2 + V(x)$. In particular, for reasonable potentials and reasonable magnetic fields the essential spectrum of the Pauli operator starts at zero. (See [ES-I] for more details.) The physically as well as mathematically interesting quantities connected with the eigenvalues are the number and the sum of the eigenvalues below the essential spectrum (in this case, the negative eigenvalues). Recall that the sum of the negative eigenvalues represents the energy of the non-interacting Fermi gas in the external potential $V$ and magnetic field $B$. In the case of a constant magnetic field it is known [Sol, Sob-1986] that even for a smooth compactly supported potential $V$, which is negative, there will be infinitely many negative eigenvalues. This holds in both two and three dimensions. It was, however, proved in [LSY-II] (three dimensions) and [LSY-III] (two dimensions) that the sum of the negative eigenvalues is finite. The goal in [LSY-II] was to analyze the eigenvalue sum in the semiclassical limit, i.e., as the semiclassical parameter $h$ tends to zero.
In the case where one fixes the magnetic field $B$ and lets $h \to 0$ one finds that the leading order contribution to the sum of the eigenvalues becomes independent of the magnetic field$^1$. It is therefore equal to the non-magnetic Weyl term, which in three dimensions is $-2(15\pi^2)^{-1} h^{-3} \int_{\mathbb R^3} [V]_-^{5/2}$ and in two dimensions is $-h^{-2}(8\pi)^{-1} \int_{\mathbb R^2} [V]_-^2$ ($[V]_-$ denotes the negative part of the function $V$). This type of semiclassical limit is therefore not very well suited to study the effect of magnetic fields. One could maybe hope that higher order terms in the expansion would reveal information about the magnetic field. In this context we should point out, however, that without some assumptions on the classical Hamiltonian flow one cannot establish non-vanishing higher order corrections to the above Weyl term.

$^1$ We believe that this result in its greatest generality was also first proved in [LSY-II], or rather follows from the result in [LSY-II]. In fact, to prove the semiclassics of the sum of all the negative eigenvalues for fixed $B$ one needs to know that the sum is finite, which was first established by the Lieb-Thirring (LT) estimate in [LSY-II].

The observation made in [LSY-II] for homogeneous magnetic fields is that one can establish a semiclassical expression for the sum of the negative eigenvalues which is asymptotically exact uniformly in the magnetic field strength. In contrast to the above standard semiclassical Weyl term, the generalized semiclassical expression, indeed, depends on the magnetic field. In case of three dimensions this formula is given by

$$E_{scl}(h, \mathbf B, V) := -h^{-3} \int_{\mathbb R^3} P(h|\mathbf B(x)|, [V(x)]_-)\,dx \tag{1.3}$$

with

$$P(B, W) := \frac{B}{3\pi^2} \left( W^{3/2} + 2 \sum_{\nu=1}^{\infty} [2\nu B - W]_-^{3/2} \right) = \frac{2}{3\pi} \sum_{\nu=0}^{\infty} d_\nu B\,[2\nu B - W]_-^{3/2} \tag{1.4}$$

being the pressure of the three dimensional Landau gas ($B, W \ge 0$). Here $d_0 = (2\pi)^{-1}$ and $d_\nu = \pi^{-1}$ if $\nu \ge 1$. Observe that if $\|B\| = o(h^{-1})$, then $E_{scl}$ reduces to leading order to the standard Weyl term as $h \to 0$. If $B(x)h \to \infty$ for all $x$, then only the lowest Landau band gives the main contribution, i.e. $E_{scl}$ reduces to leading order to a similar expression where only the first term ($\nu = 0$) is kept in (1.4). Here and throughout the paper $\|\cdot\|$ refers to the supremum norm.

In [LSY-III] the two dimensional problem was studied, but since this paper was aimed mainly at applications the semiclassical formula did not appear explicitly. It is

$$E^{(2)}_{scl}(h, B, V) := -h^{-2} \int_{\mathbb R^2} P^{(2)}(hB(x), [V(x)]_-)\,dx \tag{1.5}$$

with

$$P^{(2)}(B, W) := \frac{B}{2\pi} \left( W + 2 \sum_{\nu=1}^{\infty} [2\nu B - W]_- \right) = \sum_{\nu=0}^{\infty} d_\nu B\,[2\nu B - W]_- \tag{1.6}$$

being the pressure of the two dimensional Landau gas ($d_\nu$ is as above). Again if $\|B\| = o(h^{-1})$ then $E^{(2)}_{scl}$ reduces to the standard Weyl term.

Our goal in this paper is to show that these semiclassical formulas are exact also for non-homogeneous fields. Of course, for non-homogeneous fields it is more subtle exactly what one means by uniformity in the field, since the field is now no longer determined by just one parameter. We shall return to this question later. The physical motivations for studying these issues are explained in [ES-I]. Here we just recall, as one of the most important applications, that the problem of the ground state energy of large atoms in strong magnetic fields can be reduced to the semiclassical limit of the eigenvalue sum, using Thomas-Fermi theories. We shall investigate this question in our context of non-homogeneous magnetic fields.

In applications, it is usually a good approximation to consider the magnetic field as homogeneous. There are several reasons, however, why one would still like to extend the analysis to non-homogeneous fields. First of all it is of course natural to ask whether the features found for homogeneous fields are really stable. Furthermore, a detailed mathematical study often requires one to be able to locally vary the field, even if one is mainly interested in the constant field case. Even though we will find that the semiclassical results known for constant fields really carry over to non-constant fields, we will also see that, in fact, not all features of the constant case are stable. Of course the problem is also of independent interest and raises, as we shall see, many extremely interesting mathematical issues. The homogeneous field case is comparatively simple because the kinetic energy part is an exactly solvable quantum mechanical model.

The semiclassical analysis of negative eigenvalues is really twofold.
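
As a numerical aside on the three dimensional formula (1.4), the following Python sketch evaluates the Landau gas pressure $P(B, W)$ and compares it with the two regimes described above: for weak fields it approaches the Weyl density $2W^{5/2}/(15\pi^2)$, and for $2B \ge W$ only the $\nu = 0$ term survives. The sketch assumes that $[x]_-$ stands for the magnitude of the negative part, $\max(-x, 0)$, and the function names are illustrative rather than taken from the paper.

```python
import math

def landau_pressure_3d(B, W):
    """P(B, W) as in (1.4): (2/(3 pi)) * sum_nu d_nu * B * [2 nu B - W]_-^{3/2}.

    Assumes B > 0 and W >= 0, with [x]_- read as max(-x, 0).
    """
    if W <= 0:
        return 0.0
    total = W ** 1.5 / (2.0 * math.pi)  # nu = 0 term, d_0 = 1/(2 pi)
    nu = 1
    while 2 * nu * B < W:               # only finitely many Landau levels contribute
        total += (W - 2 * nu * B) ** 1.5 / math.pi  # d_nu = 1/pi for nu >= 1
        nu += 1
    return 2.0 * B / (3.0 * math.pi) * total

def weyl_pressure_3d(W):
    """Non-magnetic Weyl density 2 W^{5/2} / (15 pi^2)."""
    return 2.0 * W ** 2.5 / (15.0 * math.pi ** 2)

# Weak field: the sum over Landau levels acts as a Riemann sum for the Weyl density.
print(landau_pressure_3d(1e-3, 1.0), weyl_pressure_3d(1.0))
# Strong field (2B >= W): only the lowest Landau band contributes.
print(landau_pressure_3d(10.0, 1.0), 10.0 / (3.0 * math.pi ** 2))
```

The weak-field agreement reflects the fact that, with the half weight $d_0 = d_\nu / 2$ on the lowest level, the sum in (1.4) is a trapezoidal approximation of $\int_0^W (W - u)^{3/2}\,du$ with step $2B$.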
One must first of all establish non-asymptotic, Lieb-Thirring type estimates on the sum of the negative eigenvalues, allowing one to control errors and contributions coming from non-semiclassical regions. In our context this is the subject of [ES-I]. The second part of the semiclassical study is to show that when all the errors have been controlled one can indeed get the asymptotic formula. This is the main subject of the present paper. There are a multitude of highly developed methods for this part of the analysis, e.g., pseudo-differential operator and Fourier integral operator methods (see e.g., [Sob-1994, Sob-1995] which generalize results in [LSY-II] using these methods, or the book [R] based on the work of Helffer and Robert [HR], or the preprints [I]). Depending on certain properties of the classical Hamiltonian flow, these results even


give higher order corrections to the leading term. However, these methods require, in general, strong regularity assumptions on the data. Also, without some non-asymptotic Lieb-Thirring type a priori estimates, these results usually either refer to only a part of the spectrum which is strictly away from the essential spectrum or compute only quantities which are localized in space, the so-called local traces or local eigenvalue moments. Within this context non-Weyl type formulas similar to (1.3) and (1.5) are studied in the preprints [I]. Our approach to the problem is the more elementary coherent state method also used in [LSY-II, LSY-III]. In addition to the conceptual simplicity, it reveals some important geometric features related to the magnetic field. As always, semiclassical methods require controlling localization errors. For a homogeneous field one simply has to localize in regions where the potential does not vary too much. In the case of non-homogeneous fields one is, however, forced to also localize on the often much shorter scale where the vector potential is nearly constant. This scale turns out to be so small that the standard IMS localization procedure would be too expensive. We have elaborated a new localization scheme, particularly suitable for magnetic problems. Similarly to the proof of the Lieb-Thirring estimate in [ES-I], we shall use a two step localization. First we have a larger scale isotropic localization. On this length scale, the field direction is almost constant and also the field strength does not vary too much. As usual in IMS type arguments, the actual localization function is not essential, only its length scale is determined. In dimension two the field direction is of course stable, therefore this step is only necessary to control the variation of the field strength (this strategy allows us to treat fields whose strength has no uniform positive lower bound).
The second localization is reminiscent of the cylindrical localization in the LT proof since the localization function is supported in a typically elongated cylindrical domain, corresponding to the effective anisotropic character of the magnetic field. The key point is that, in contrast to the standard localization approach, the function itself is very specially chosen. It essentially must be a zero mode of the associated two dimensional Pauli operator with a locally constant field. In a strong magnetic field, the zero mode can be chosen as a well localized Gaussian function. It allows one to localize very strongly (much beyond the IMS localization), essentially for free (the price we pay appears as a slight modification in the magnetic field). To understand this phenomenon, recall that in the typical IMS scheme one pays the price of the lowest eigenvalue of the Dirichlet Laplacian for localization. This is of course an expression of the uncertainty principle. It should be, however, noted that while the uncertainty principle between position and momentum is a universal fact about quantum systems, large momentum does not necessarily imply large energy. This is exactly the case for the two dimensional Pauli operator, where the magnetic field and spin coupling can, and in the right sector does, compensate the “spinless” kinetic energy. This is the reason behind the existence of the Aharonov-Casher zero modes (see [AC, CFKS]), which carry large (angular) momentum, but have no energy. We should mention that probably the first asymptotic formula for eigenvalues of an operator with non-homogeneous magnetic field appeared in [CdV, T], and later was extended in [Mat-1994]. All these works consider the large eigenvalue asymptotics for the ‘magnetic’ Schrödinger operator with a magnetic field increasing at infinity (magnetic bottles). This operator has pure point spectrum (and compact resolvent), therefore the problem is simpler than ours. The analogy to our result is that in both problems the asymptotic formula for the non-homogeneous magnetic field is obtained
The analogy to our result is that in both problems the asymptotic formula for the non-homogeneous magnetic field is obtained


by simply inserting the strength of the non-homogeneous field into the formula for homogeneous field.

The organization of the paper is the following. In the next two subsections we explain the main results in the simplest setup. In Sect. 2 we present our new localization method, which is one of the key points of our analysis. The other key point is the analysis of the geometry of the three dimensional magnetic field (in particular a careful choice of a suitable gauge), which was already used in [ES-I]. For the reader’s convenience we recall the necessary results in the Appendix. In Sect. 3 we work out the two dimensional case. We start with a short presentation of the necessary Lieb-Thirring inequality. Then we apply the localization scheme to estimate the quadratic form of the kinetic energy with a non-homogeneous field by a similar expression with a locally homogeneous field. Finally we work out separately the lower bound, using the Lieb-Thirring inequality, and the upper bound, using coherent states and the variational principle. In both cases we heavily rely on the structure of the Pauli operator with a constant magnetic field (e.g. Landau levels). Section 4 contains the three dimensional semiclassics. Approximating the true kinetic energy by a (locally) constant field kinetic energy requires choosing an ‘economical’ gauge for the approximating field; here we use the results from the Appendix. Once this approximation is done, the lower and upper bounds for the eigenvalue sum are obtained essentially in the same way as in the two dimensional case. For the lower bound, one has to use the more complicated three dimensional Lieb-Thirring inequality. The formulas are somewhat lengthy, as one has complicated error terms, but their basic structure resembles the simpler two dimensional setup. Therefore the reader is advised to start with the two dimensional proof.
The last Section is devoted to the proof of the validity of the Magnetic Thomas Fermi (MTF) theory introduced and studied, even for non-homogeneous fields, in [LSY-II]. The validity of this theory, as an approximation to the ground state energy of large atoms in strong magnetic fields, was proved for homogeneous fields in [LSY-II]. Here we have to cope with two new difficulties, compared to the constant field case in [LSY-II]. The first is that our Lieb-Thirring inequality does not provide a kinetic energy inequality via Legendre transform (because of an extra gradient term, see (1.12) later). The other complication is that one needs more information on the magnetic Thomas Fermi potential, again because of the extra terms in our LT inequality, i.e., one has to prove that these terms are negligible for the potential of MTF theory. We shall use the letter $c$ for various positive universal constants whose exact values are irrelevant.

1.1. Main results in semiclassics. We shall throughout the paper assume the following conditions on the magnetic field, $\mathbf B \in C^1(\mathbb R^3; \mathbb R^3)$, in dimension three:

$$\|B\| < \infty, \tag{1.7}$$

$$l(B)^{-1} := \|\nabla n\| = \left\| \nabla \frac{\mathbf B}{B} \right\| < \infty, \tag{1.8}$$

$$L(B)^{-1} := \left\| \frac{|\nabla B|}{B} \right\| < \infty \tag{1.9}$$

(recall that $\|\cdot\|$ denotes the supremum norm). Here $l = l(B)$ describes the length scale on which the field line geometry changes, while $L = L(B)$ is the length scale on which the field strength varies. Note that the conditions imply that $B(x) := |\mathbf B(x)|$ never vanishes.

Semiclassical Eigenvalue Estimates for the Pauli Operator


We are especially interested in the case when $l \gg L$. In particular, for $l = \infty$, $L < \infty$, we obtain the constant direction (but non-homogeneous) case. In dimension two we assume conditions analogous to (1.7), (1.9),

$$\|B\| < \infty, \tag{1.10}$$

$$\Big\|\frac{|\nabla B|}{B}\Big\| < \infty \tag{1.11}$$

for the two dimensional magnetic field $B \in C^1(\mathbb{R}^2;\mathbb{R})$. Since B has a definite sign by the assumption (1.11), without loss of generality we can assume that $B > 0$. Notice that we do not assume more than a bound² on the first derivative of the magnetic field, in contrast to the more traditional SC approach, which uses pseudodifferential operators and therefore assumes strong regularity conditions on the data.

The potential V is naturally assumed to be in the corresponding $L^p$ spaces which make the semiclassical formulas (1.3), (1.5) finite, that is, we assume $V \in L^{5/2}(\mathbb{R}^3) \cap L^{3/2}(\mathbb{R}^3)$ in the three dimensional case, and $V \in L^1(\mathbb{R}^2) \cap L^2(\mathbb{R}^2)$ in dimension two. As explained in [ES-I], in dimension two these natural assumptions are sufficient since we have a Lieb-Thirring inequality which scales exactly as the semiclassical expression. In fact, in the simple case of a bounded field, which is our concern here, the combination of the methods in [E-1995] and [LSY-III] gives a very simple proof which we present in Sect. 3 (for references on unbounded fields, see [ES-I]). However, in dimension three, our Lieb-Thirring inequality (the main theorem in [ES-I]) bounds the sum of the negative eigenvalues of H in (1.1) by

$$C_1 \Big( h^{-3}\int_{\mathbb{R}^3} [V]_-^{5/2} + \int_{\mathbb{R}^3} (h^{-1}B + d^{-2})\,h^{-1}[V]_-^{3/2} + \int_{\mathbb{R}^3} (h^{-1}B + d^{-2})\,d^{-1}[V]_- + \int_{\mathbb{R}^3} (h^{-1}B + d^{-2})\,|\nabla [V]_-| \Big), \tag{1.12}$$

where

$$d = d(h,\mathbf{B}) := \min\{h^{1/4}\|\mathbf{B}\|^{-1/4}\,l(\mathbf{B})^{1/2},\ L(\mathbf{B}),\ l(\mathbf{B})\}.$$
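As a small numerical illustration of the length scale d(h, B) defined above, the following Python sketch evaluates the minimum directly. The function name `length_scale_d` and the sample values are hypothetical and chosen only to show how the three competing scales interact; passing `l = math.inf` models the constant direction case, in which d reduces to L(B).

```python
import math


def length_scale_d(h: float, b_norm: float, l: float, L: float) -> float:
    """Evaluate d(h, B) = min{ h^(1/4) ||B||^(-1/4) l^(1/2), L, l }.

    h      : semiclassical parameter (positive)
    b_norm : supremum norm ||B|| of the field (positive)
    l      : length scale of the field line geometry (may be math.inf)
    L      : length scale on which the field strength varies
    """
    if h <= 0 or b_norm <= 0 or l <= 0 or L <= 0:
        raise ValueError("all arguments must be positive")
    magnetic_scale = h ** 0.25 * b_norm ** -0.25 * math.sqrt(l)
    return min(magnetic_scale, L, l)


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(length_scale_d(h=1e-2, b_norm=100.0, l=4.0, L=1.0))
    print(length_scale_d(h=1e-2, b_norm=100.0, l=math.inf, L=1.0))
```

For a strong field the first entry of the minimum is the smallest one, so d shrinks as the field strength grows.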

This estimate contains two extra terms which, to be finite, require $[V]_- \in W^{1,1}(\mathbb{R}^3)$. These terms are of lower order as far as the large field semiclassics is concerned, but at least the term involving $\int [V]_-$ cannot be eliminated (see [ES-I]). The other term, with $\int |\nabla [V]_-|$, is a conceptual error coming from our method; it also has to be considered. Our theorem in dimension two is the following.

Theorem 1.1 (2D Semiclassics). Assume that the potential V satisfies $V \in L^1(\mathbb{R}^2) \cap L^2(\mathbb{R}^2)$ and the magnetic field B satisfies (1.10)-(1.11). For $h, b > 0$ let $e_1(h,b), e_2(h,b), \dots$ denote the negative eigenvalues of the operator $H = H^{(2)}(h, b\mathbf{A}, V)$ in (1.2). Then

$$\lim_{h \to 0} \left| \frac{\sum_k e_k(h,b)}{E^{(2)}_{scl}(h, bB, V)} - 1 \right| = 0$$

uniformly for $b \in \mathbb{R}_+$.

² Strictly speaking, we only need the magnetic field to be Lipschitz.


L. Erd˝os, J.P. Solovej

Later we shall in fact prove a slightly stronger result. Instead of considering magnetic fields depending only on the parameter b, we shall prove a statement that is uniform on more general families of magnetic fields. We shall make it more precise in Sect. 3. The formulation of the three dimensional result is more complicated than the two dimensional result, since we do not prove a fully uniform statement.

Theorem 1.2 (3D Semiclassics). Assume that the potential V satisfies $V \in L^{5/2}(\mathbb{R}^3) \cap L^{3/2}(\mathbb{R}^3)$, $[V]_- \in W^{1,1}(\mathbb{R}^3)$ and the magnetic field $\mathbf{B}$ satisfies (1.7)–(1.9). For $h, b > 0$ let $e_1(h,b), e_2(h,b), \dots$ denote the negative eigenvalues of the operator $H = H(h, b\mathbf{A}, V)$ in (1.1). Then

$$\lim_{\substack{h \to 0 \\ bh^3 \to 0}} \left| \frac{\sum_k e_k(h,b)}{E_{scl}(h, b\mathbf{B}, V)} - 1 \right| = 0. \tag{1.13}$$

We shall again prove a slightly stronger result. The reason for this generalization is that in our main application, the magnetic Thomas-Fermi theory, the potential will depend slightly on the magnetic field and on the effective Planck's constant. The details are explained in Sect. 5.

There is a fundamental technical reason for the condition $bh^3 \to 0$. We already explained it in [ES-I], since a similar condition plays a role in the proof of our Lieb-Thirring inequality. Part of the motivation for believing that the semiclassical formula for homogeneous or even constant direction fields should generalize to fully non-homogeneous fields is that these fields on the relevant quantum scales should behave approximately like constant direction fields. This is, however, not true if the field is too strong. A charged particle moving in a magnetic field essentially occupies a region in space of the shape of a cylinder with axis parallel to the magnetic field. For particles of fixed energy e the radius of the cylinder is the Landau radius $r \sim b^{-1/2}h^{1/2}$ and the height is of order $s = he^{-1/2}$ (particles localized in regions of length $he^{-1/2}$ in one dimension have energies of order e).
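The two length scales just introduced, the Landau radius r and the cylinder height s, can be tabulated numerically. The sketch below is only an illustration: the function names and sample values are hypothetical, and all implied constants in the relations r ∼ b^{-1/2} h^{1/2} and s = h e^{-1/2} are set to one. It also returns the dimensionless ratio s²/(l r), which compares the sideways drift of a field line over a height s with the radius r; its square equals b l^{-2} h³ e^{-2}.

```python
import math


def cylinder_scales(h: float, b: float, e: float) -> tuple:
    """Return (r, s): Landau radius r = (h/b)^(1/2) and height s = h / e^(1/2)."""
    r = math.sqrt(h / b)
    s = h / math.sqrt(e)
    return r, s


def field_line_ratio(h: float, b: float, e: float, l: float) -> float:
    """Return s^2 / (l * r); values much smaller than 1 mean that field lines
    stay inside the cylinder of radius r and height s."""
    r, s = cylinder_scales(h, b, e)
    return s * s / (l * r)


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    h, b, e, l = 1e-2, 100.0, 1.0, 1.0
    print(cylinder_scales(h, b, e))
    print(field_line_ratio(h, b, e, l))
```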
The condition that one can approximate the magnetic field within this region by a constant direction field is that the field lines remain within this cylinder, i.e., that $l(\mathbf{B})^{-1}s^2 \ll r$. This condition is simply that $b\,l(\mathbf{B})^{-2} \ll h^{-3}e^2$. Although the above restriction on the magnetic field might seem natural, we believe that it can be removed by an additional geometrical analysis which is beyond the scope of the present work. We intend to return to this issue in the future.

1.2. Application to the magnetic Thomas-Fermi theory. As an application of our semiclassical analysis we shall here generalize Theorem 5.1 in [LSY-II] on the energy of large atoms in strong exterior magnetic fields. We shall work in dimension three, but we are convinced that a very similar analysis can be carried out in dimension two, similarly to [LSY-III]. We consider again a magnetic field $\mathbf{B} = \nabla \times \mathbf{A}$ which satisfies (1.7)–(1.9). Our generalization will be to allow a much more general class of three dimensional exterior fields than in [LSY-II], where only homogeneous fields were treated. The quantum mechanical Hamiltonian for an atom with nuclear charge Z and with N electrons in such an exterior magnetic field is given by

$$H_{N,\mathbf{A},Z} := \sum_{i=1}^{N} \Big( [\boldsymbol{\sigma}_i \cdot (\mathbf{p}_i + \mathbf{A}(x_i))]^2 - Z|x_i|^{-1} \Big) + \sum_{i<j}^{N} |x_i - x_j|^{-1}. \tag{1.14}$$


It operates on spinor-valued wave functions $\psi \in \bigwedge^N L^2(\mathbb{R}^3;\mathbb{C}^2)$. We have here used units in which twice the electron mass $2m_e$, the electron charge e, and Planck's constant $\hbar$ are all equal to unity. We have also used the infinite mass approximation for the nucleus, which is situated at the origin. We define the ground state energy of this atom

$$E(N, \mathbf{B}, Z) := \inf_{\|\psi\| = 1} \langle \psi \,|\, H_{N,\mathbf{A},Z} \,|\, \psi \rangle, \tag{1.15}$$

which, of course, depends only on the magnetic field. In order to state our main result we must introduce the magnetic Thomas-Fermi (MTF) theory of [LSY-II]. It is defined in terms of the energy functional

$$\mathcal{E}[\rho; B, Z] := \int_{\mathbb{R}^3} \tau(B(x), \rho(x))\,dx - \int_{\mathbb{R}^3} Z|x|^{-1}\rho(x)\,dx + \frac{1}{2}\iint_{\mathbb{R}^3 \times \mathbb{R}^3} \rho(x)|x - y|^{-1}\rho(y)\,dx\,dy, \tag{1.16}$$

where $\rho \mapsto \tau(B, \rho)$ is the Legendre transform of the Landau pressure function $W \mapsto P(B, W)$ defined in (1.4), i.e.,

$$\tau(B, \rho) := \sup_{W \ge 0}\,[\rho W - P(B, W)], \tag{1.17}$$

and conversely, since $W \mapsto P(B, W)$ is convex,

$$P(B, W) = \sup_{\rho \ge 0}\,[\rho W - \tau(B, \rho)].$$
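The Legendre transform in (1.17) can be approximated numerically by taking the supremum over a grid of W values. The sketch below does this for a toy convex pressure P(W) = a W^{5/2}, which is an assumption made purely for illustration and is not the Landau pressure of (1.4). For this toy choice the supremum is attained at W* = (2ρ/(5a))^{2/3}, giving the closed form τ(ρ) = (3/5) ρ W*, against which the grid value can be compared. All function names are hypothetical.

```python
def legendre_transform(pressure, rho, w_max=5.0, steps=20000):
    """Approximate tau(rho) = sup_{0 <= W <= w_max} [rho * W - pressure(W)]
    by a maximum over a uniform grid of steps + 1 points."""
    grid = (w_max * i / steps for i in range(steps + 1))
    return max(rho * w - pressure(w) for w in grid)


def toy_pressure(w, a=1.0):
    """Toy convex pressure a * W^(5/2); NOT the Landau pressure of (1.4)."""
    return a * w ** 2.5


def toy_tau_exact(rho, a=1.0):
    """Closed-form Legendre transform of the toy pressure."""
    w_star = (2.0 * rho / (5.0 * a)) ** (2.0 / 3.0)
    return 0.6 * rho * w_star


if __name__ == "__main__":
    for rho in (0.5, 1.0, 2.0):
        print(rho, legendre_transform(toy_pressure, rho), toy_tau_exact(rho))
```

The grid must extend beyond the maximizer W*; for larger densities `w_max` has to be increased accordingly.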

The energy in MTF theory is defined by

$$E^{MTF}(N, B, Z) := \inf_{0 \le \rho,\ \int \rho \le N} \mathcal{E}[\rho; B, Z]. \tag{1.18}$$

We here use the convention that $\mathcal{E}[\rho; B, Z] = +\infty$ unless all integrals are finite. Note that $E^{MTF}(N, B, Z)$ depends only on the magnitude $B(x) = |\mathbf{B}(x)|$ of the magnetic field. The minimization problem (1.18) was studied in great detail in Sect. IV of [LSY-II], and the necessary results will be recalled later. Our main result on the energy of large atoms is given in the following theorem.

Theorem 1.3. Let $\mathbf{B} = \nabla \times \mathbf{A} : \mathbb{R}^3 \to \mathbb{R}^3$ be a fixed magnetic field satisfying (1.7)–(1.9). If $Z, N \to \infty$ with $N/Z$ fixed and $b/Z^3 \to 0$, then

$$E(N, b\mathbf{B}, Z)/E^{MTF}(N, bB, Z) \to 1.$$

In this theorem we introduced a field strength parameter b. The point is that we allow the field strength to be large in comparison to powers of Z. The condition $b/Z^3 \to 0$ is identical to the condition in [LSY-II]. In Sect. 5 we shall slightly generalize this result by allowing more complicated relations between the magnetic field and the atomic parameters. In particular, we shall give conditions for when the field is allowed to vary on the scale of the size of the atom.


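2. Localization for Operators with Magnetic Fields

As usual, one of the important ingredients in the proof of leading order semiclassics is to control localization errors. The most commonly used estimate in this context is often referred to as the IMS formula. It is given in the following lemma.

Lemma 1 (IMS). Let n = 2 or n = 3. For $k \in C^\infty(\mathbb{R}^n, \mathbb{C}^2)$ and $g \in C^\infty(\mathbb{R}^n)$ we have for all $0 < \delta < 1$ the pointwise inequalities

$$|g(x)\,\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}(x))k(x)|^2 \ge (1 - \delta)\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}(x))(gk)(x)|^2 - c\delta^{-1}h^2|\nabla g(x)|^2|k(x)|^2,$$

$$|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}(x))(gk)(x)|^2 \le (1 + \delta)\,|g(x)\,\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}(x))k(x)|^2 + c\delta^{-1}h^2|\nabla g(x)|^2|k(x)|^2.$$

Moreover, on average, even an identity is valid (IMS formula) if, in addition, g is real and compactly supported:

$$\langle k|g^2(\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}))^2|k\rangle + \langle k|(\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}))^2 g^2|k\rangle = 2\langle kg|(\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}))^2|gk\rangle - 2h^2\langle k|(\nabla g)^2|k\rangle. \tag{2.1}$$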

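Proof. The first two inequalities are straightforward to prove using the Leibniz formula. These inequalities are of course essentially identical. The IMS formula can be seen as follows:

$$\begin{aligned}
\langle \boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gk \,|\, \boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gk\rangle
&= \langle g\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})k \,|\, g\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})k\rangle + h^2\langle k|(\nabla g)^2|k\rangle \\
&\quad + 2\,\mathrm{Re}\,\langle g\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})k \,|\, (h\boldsymbol{\sigma} \cdot \mathbf{p}g)k\rangle \\
&= \tfrac{1}{2}\langle k|g^2(\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}))^2|k\rangle + \tfrac{1}{2}\langle k|(\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}))^2 g^2|k\rangle + h^2\langle k|(\nabla g)^2|k\rangle \\
&\quad + 2\,\mathrm{Re}\,\langle g\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})k \,|\, (h\boldsymbol{\sigma} \cdot \mathbf{p}g)k\rangle + \mathrm{Re}\,\langle (h\boldsymbol{\sigma} \cdot \mathbf{p}g^2)(\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}))k \,|\, k\rangle.
\end{aligned}$$

We used integration by parts, and, by reality of g, one finally sees that the last two terms above cancel.

It turns out, as we shall explain below, that the above localization estimates are not adequate for the Pauli operator with strong magnetic fields. In this case it is better to use a very specific localization function, namely the function $\eta^{(0)} \in C^\infty(\mathbb{R}^2, \mathbb{C}^2)$ given by

$$\eta^{(0)}(x) = w^{-1}e^{-w^{-2}x^2},$$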

where $w > 0$ represents the localization scale. That this function does not have compact support turns out to be just a minor difficulty. We also need to define spin-up and spin-down projections as $P_\pm := \tfrac{1}{2}(1 \pm \sigma_3)$.

Lemma 2 (Magnetic localization around the origin). Let $B_0$ be any constant and $k \in C^\infty(\mathbb{R}^2, \mathbb{C}^2)$. Then with the explicit choice of $\eta^{(0)}$ we have for all $0 < \delta < 1$,

$$\begin{aligned}
\big|\eta^{(0)}(x)\,\boldsymbol{\sigma} \cdot (h\mathbf{p} + \tfrac{1}{2}B_0\,\mathbf{n} \times x)k(x)\big|^2
&\ge (1 - \delta)\,\big|\boldsymbol{\sigma} \cdot \big(h\mathbf{p} + \tfrac{1}{2}(B_0 + 4hw^{-2})\,\mathbf{n} \times x\big)(\eta^{(0)}k)(x)\big|^2 \\
&\quad - c\delta^{-1}h^2w^{-2}|xw^{-1}|^2\,|P_+(\eta^{(0)}k)(x)|^2
\end{aligned} \tag{2.2}$$

and

$$\begin{aligned}
\big|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \tfrac{1}{2}B_0\,\mathbf{n} \times x)(\eta^{(0)}k)(x)\big|^2
&\le (1 + \delta)\,\big|\eta^{(0)}(x)\,\boldsymbol{\sigma} \cdot \big(h\mathbf{p} + \tfrac{1}{2}(B_0 - 4hw^{-2})\,\mathbf{n} \times x\big)k(x)\big|^2 \\
&\quad + c\delta^{-1}h^2w^{-2}|xw^{-1}|^2\,|P_+(\eta^{(0)}k)(x)|^2
\end{aligned} \tag{2.3}$$

(recall that for $v \in \mathbb{R}^2$ we defined $\mathbf{n} \times v := (-v_2, v_1) \in \mathbb{R}^2$). Here, as usual, c denotes a positive universal constant.

Proof. Using the Leibniz formula and $\boldsymbol{\sigma} \cdot (\mathbf{n} \times x) = i(\boldsymbol{\sigma} \cdot x)(\boldsymbol{\sigma} \cdot \mathbf{n}) = i(\boldsymbol{\sigma} \cdot x)\sigma_3$, a simple computation gives

$$\boldsymbol{\sigma} \cdot \big(h\mathbf{p} + \tfrac{1}{2}B_0\,\mathbf{n} \times x\big)(\eta^{(0)}k)(x) = \eta^{(0)}(x)\,\boldsymbol{\sigma} \cdot \big(h\mathbf{p} + \tfrac{1}{2}(B_0 - 4hw^{-2})\,\mathbf{n} \times x\big)k(x) + 4ihw^{-2}\eta^{(0)}(x)(\boldsymbol{\sigma} \cdot x)P_+k(x).$$

A simple application of a Cauchy-Schwarz inequality then gives (2.3). Replacing $B_0$ by $B_0 + 4hw^{-2}$ we similarly get (2.2).

We shall use both the magnetic localization and the IMS Lemma. The magnetic localization shall be used to approximate the variable magnetic field by a constant field. We shall now explain why the magnetic localization is superior to the IMS formula for this purpose. Imagine that we attempt to approximate the variable field by a constant field over a region of length w. The approximation error for the vector potential is then $\|B\|L^{-1}w^2$, using that $|\nabla B| \le \|B\|L^{-1}$ and (A.10). [This error will appear squared in the estimate on the Hamiltonian (see e.g. (3.20) and (3.21) below), but this is unimportant for the present discussion.] Since we want to prove estimates uniform in $\|B\|$ we must choose w proportional to $\|B\|^{-1/2}$. The IMS formula would then give an error $w^{-2} \sim \|B\|$, which is not independent of $\|B\|$. The magnetic localization seems at first sight to give the same error. In fact, this is the order of the last terms in (2.2) and (2.3). [Since $\int \eta^{(0)}(x)^2|x|\,dx = cw$ we should think of $|xw^{-1}|$ as being of order one.] The important observation is that the error terms in the magnetic localization contain the spin-up projection $P_+$. If the magnetic field B is bounded from below by a positive constant, then the free Pauli operator restricted to the spin-up subspace is not just positive but, indeed, bounded below by a positive amount proportional to the lower bound on B [see (3.28)]. If the supremum of B is bounded relative to the infimum of B, then the relative localization error (compared to the main term) in the magnetic localization is independent of $\|B\|$, and this is the important fact.

We shall not actually assume that the magnetic field is bounded below. This is just a minor technical problem. In fact, as should be clear from the above discussion, it is only the ratio of the maximum of the field to the minimum that counts. We therefore simply use the standard IMS Lemma to localize in regions where this ratio is bounded.
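The algebraic identity σ · (n × x) = i(σ · x)σ3 used in the proof of Lemma 2 can be checked directly with the standard 2×2 Pauli matrices. The following Python sketch is only a sanity check under the usual matrix representation of σ1, σ2, σ3; the helper names are hypothetical, and plain tuples are used so that no external library is needed. It also confirms that 1 + σ3 = 2P+, which is how the spin-up projection enters the error term.

```python
SIGMA_1 = ((0, 1), (1, 0))
SIGMA_2 = ((0, -1j), (1j, 0))
SIGMA_3 = ((1, 0), (0, -1))
IDENTITY = ((1, 0), (0, 1))


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def scale(c, a):
    """Multiply a 2x2 matrix by the scalar c."""
    return tuple(tuple(c * entry for entry in row) for row in a)


def add(a, b):
    """Sum of two 2x2 matrices."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def sigma_dot(v):
    """sigma . v = v1 * sigma_1 + v2 * sigma_2 for a planar vector v."""
    return add(scale(v[0], SIGMA_1), scale(v[1], SIGMA_2))


def n_cross(x):
    """n x v := (-v2, v1), as defined after (2.3)."""
    return (-x[1], x[0])


def max_abs_diff(a, b):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def identity_defect(x):
    """Size of sigma.(n x x) - i (sigma.x) sigma_3; zero if the identity holds."""
    lhs = sigma_dot(n_cross(x))
    rhs = scale(1j, mat_mul(sigma_dot(x), SIGMA_3))
    return max_abs_diff(lhs, rhs)


def spin_up_projection():
    """P_+ = (1 + sigma_3) / 2."""
    return scale(0.5, add(IDENTITY, SIGMA_3))


if __name__ == "__main__":
    print(identity_defect((0.3, -1.7)))
    print(spin_up_projection())
```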

3. Semiclassics in Two Dimensions

In the introduction we stated our semiclassical result in Theorem 1.1 for fixed potential. First we formulate our more general result for potentials which are allowed to depend mildly on B and h. To describe the precise result, we introduce the 2D magnetic Lieb-Thirring error functional

$$E^{(2)}_{h,B}(V) := h^{-2}\int |V|^2 + \|B\|h^{-1}\int |V| \tag{3.1}$$

(in Sect. 3 all integrals are on $\mathbb{R}^2$, unless otherwise specified). With this notation, the Lieb-Thirring inequality established in Theorem 3.2 in Sect. 3 states that the sum of the negative eigenvalues $e_1(H), e_2(H), \dots$ of $H = H^{(2)}(h, \mathbf{A}, V)$ satisfies the bound

$$\sum_k |e_k(H)| \le cE^{(2)}_{h,B}([V]_-) \le cE^{(2)}_{h,B}(V)$$

with a universal constant c, if the magnetic field satisfies (1.10). Define the following set for $L > 0$:

$$\mathcal{C}_L := \Big\{ 0 < B(x) \in C^1(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2) \,:\, \sup_x \frac{|\nabla B(x)|}{B(x)} \le L^{-1} \Big\}.$$

We may now introduce the conditions on the potential,

$$C_+(V) := \sup_{B \in \mathcal{C}_L,\ 0<h<1} \frac{E^{(2)}_{h,B}(V)}{\|B\|h^{-1} + h^{-2}} < \infty, \tag{3.2}$$

$$\varepsilon_1(V, y) := \sup_{B \in \mathcal{C}_L,\ 0<h<1} \frac{E^{(2)}_{h,B}([V]_\pm - [V(\cdot - y)]_\pm)}{\|B\|h^{-1} + h^{-2}} \to 0 \quad \text{as } y \to 0, \tag{3.3}$$

$$\varepsilon_2(V, \varrho) := \sup_{B \in \mathcal{C}_L,\ 0<h<1} \frac{E^{(2)}_{h,B}([V\chi_\varrho]_- - [V]_-)}{\|B\|h^{-1} + h^{-2}} \to 0 \quad \text{as } \varrho \to \infty, \tag{3.4}$$

where $\chi_\varrho$ denotes the characteristic function of the ball of radius $\varrho$ centered at the origin. Furthermore, we require a joint condition on B and V,

$$C_-(B, V) := \inf_{0<h<1} \frac{|E^{(2)}_{scl}(h, B, V)|}{\|B\|h^{-1} + h^{-2}} > 0. \tag{3.5}$$

Note that (3.2), (3.5), and $P^{(2)}(B, W) \le c(BW + W^2)$ imply that

$$c \le \frac{E^{(2)}_{h,B}(V)}{|E^{(2)}_{scl}(h, B, V)|} \le \frac{C_+(V)}{C_-(B, V)} \tag{3.6}$$

for all $0 < h < 1$ and all $B \in \mathcal{C}_L$. For $C, L > 0$ introduce the set

$$\mathcal{C}_{C,L}(V) := \Big\{ 0 < B(x) \in C^1(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2) \,:\, \sup_x \frac{|\nabla B(x)|}{B(x)} \le L^{-1},\ C_-(B, V) \ge C \Big\}.$$

Theorem 3.1 (2D Semiclassics). Let $V \in L^1(\mathbb{R}^2) \cap L^2(\mathbb{R}^2)$ and let $e_1(H), e_2(H), \dots$ denote the negative eigenvalues of the operator $H = H^{(2)}(h, \mathbf{A}, V)$ in (1.2). For any $C, L > 0$ we have

$$\lim_{h \to 0} \left( \sup_{B \in \mathcal{C}_{C,L}(V)} \left| \frac{\sum_k e_k(H)}{E^{(2)}_{scl}(h, B, V)} - 1 \right| \right) = 0.$$


Remark. What we really need about V and B for the semiclassical limit are the conditions (3.2)–(3.5). For a fixed potential, these conditions follow simply from $V \in L^1 \cap L^2$ and $B \in \mathcal{C}_{C,L}(V)$.

Following [LSY-III], there are two ingredients in the proof: the Lieb-Thirring inequality and localization.

3.1. Two dimensional Lieb-Thirring inequality. For completeness, we formulate here the necessary Lieb-Thirring inequality under more general regularity conditions on the magnetic field.

Theorem 3.2. Let $B \in L^\infty$ and let $\mathbf{A} \in C^1(\mathbb{R}^2; \mathbb{R}^2)$ with $B = \nabla \times \mathbf{A}$. Then for any $\gamma \ge 1$ there exists a universal constant $C_\gamma$ such that the following estimate is valid for the $\gamma$th moment of the negative eigenvalues $\{e^{(2)}_m\}_{m=1,2,\dots}$ of the two dimensional operator $H^{(2)}(1, \mathbf{A}, V) = [\boldsymbol{\sigma} \cdot (\mathbf{p} + \mathbf{A})]^2 + V$:

$$\sum_m |e^{(2)}_m|^\gamma \le C_\gamma \Big( \|B\| \int [V]_-^\gamma + \int [V]_-^{\gamma + 1} \Big). \tag{3.7}$$
In particular we get an estimate for the sum of the eigenvalues (the case $\gamma = 1$).

Remark. Magnetic Lieb-Thirring inequalities for nonhomogeneous magnetic fields were first proven in [E-1995]. There only the three dimensional case was discussed, though the corresponding two dimensional results follow analogously. In fact, [E-1995] mainly focuses on the constant direction case, which essentially requires a two dimensional analysis. The only difficulty stems from the fact that the exponent $\gamma = 1$ is critical in two dimensions, which has to be treated, using Fan's theorem, analogously to [LSY-III]. Later Sobolev [Sob-1996(1)], with a different approach, proved Theorem 3.2 in a more general setting which allows fairly general unbounded magnetic fields. We would like to point out, however, that Theorem 3.2 in the present form, i.e. for bounded field, has a simple proof which follows immediately from [E-1995] and [LSY-III]. Without going into the details, here we just outline the steps for $\gamma = 1$.

Steps of the proof. We can replace V by $-[V]_-$. It is enough to consider the operators $H_\pm := (\mathbf{p} + \mathbf{A})^2 \pm B - [V]_-$ acting on $L^2(\mathbb{R}^2)$ separately, and without loss of generality we can focus on $H_-$ (there is no assumption on the sign of B in this theorem). We introduce the Birman-Schwinger kernel $K_E := [V]_-^{1/2}((\mathbf{p} + \mathbf{A})^2 - B + E)^{-1}[V]_-^{1/2}$ (there is no need to add part of E to the potential, see the proof of Theorem 5.1 in [LSY-III]), and decompose it into a lower and an upper part, $K_E = K^<_{E,L} + K^>_{E,L}$, with

$$K^<_{E,L} := [V]_-^{1/2}\,\Pi_L((\mathbf{p} + \mathbf{A})^2 - B + E)^{-1}\Pi_L\,[V]_-^{1/2},$$

$$K^>_{E,L} := [V]_-^{1/2}\,(I - \Pi_L)((\mathbf{p} + \mathbf{A})^2 - B + E)^{-1}(I - \Pi_L)\,[V]_-^{1/2},$$

where $\Pi_L$ is the spectral projection, onto $[0, L]$, of the nonnegative operator $(\mathbf{p} + \mathbf{A})^2 - B$. Here we cannot separate the lowest Landau level from the rest of the spectrum, as the field is not constant. Nevertheless we artificially cut the spectrum at level L (to be chosen as $2\|B\|$ later) by inserting the spectral projections $\Pi_L$, $I - \Pi_L$ (similarly to (26), (27) in [E-1995], but we now omit the kinetic energy in the third direction). Using the method of [LSY-III] one easily gets, as in (49) of [E-1995], the following bound on $N_E$, the number of eigenvalues of $H_-$ below $-E$:

$$N_E \le \#\{\text{ev.'s of } \Pi_L[V]_-\Pi_L \text{ bigger than } E/4\} + 4\,\mathrm{Tr}[K^>_{E,L}].$$

Therefore, analogously to (50) in [E-1995], we have the following bound on the $\gamma$th moment of the negative eigenvalues of $H_-$:

$$\sum_i |e_i(H_-)|^\gamma \le 4^\gamma \gamma\,\mathrm{Tr}\big[[V]_-^{\gamma/2}\,\Pi_L\,[V]_-^{\gamma/2}\big] + 4\gamma \int_0^\infty \mathrm{Tr}[K^>_{E,L}]\,E^{\gamma - 1}\,dE. \tag{3.8}$$

By the diamagnetic inequality

$$\Pi_L(x, x) \le e^{tL}e^{-tH_-}(x, x) \le e^{tL}e^{t\|B\|}e^{t\Delta}(x, x).$$

Making the particular choice of $L = 2\|B\|$ and $t = (3\|B\|)^{-1}$ we therefore see (as in Proposition 3.1 of [E-1995]) from the explicit formula for the heat kernel $e^{t\Delta}(x, x)$ that $\Pi_L(x, x) \le c\|B\|$. This yields the first term in the present Lieb-Thirring inequality (3.7). The second term in (3.8) is estimated by using the obvious operator inequality

$$(I - \Pi_L)\big((\mathbf{p} + \mathbf{A})^2 - B + E\big)^{-1}(I - \Pi_L) \le \Big(\tfrac{1}{2}[(\mathbf{p} + \mathbf{A})^2 - B] + \tfrac{L}{2} + E\Big)^{-1},$$

and the pointwise inequality

$$\Big(\tfrac{1}{2}[(\mathbf{p} + \mathbf{A})^2 - B] + \tfrac{L}{2} + E\Big)^{-1}(x, x) \le \Big(\tfrac{1}{2}\mathbf{p}^2 + E\Big)^{-1}(x, x), \tag{3.9}$$

which is obtained by rewriting the resolvent kernel as the Laplace transform of the heat kernel, then using the diamagnetic inequality and the monotonicity of the nonmagnetic heat kernel, i.e.

$$\begin{aligned}
\exp\Big[-t\Big(\tfrac{1}{2}[(\mathbf{p} + \mathbf{A})^2 - B] + \tfrac{L}{2} + E\Big)\Big](x, x)
&\le \exp\Big[-t\Big(\tfrac{1}{2}[\mathbf{p}^2 - B] + \tfrac{L}{2} + E\Big)\Big](x, x) \\
&\le \exp\Big[-t\Big(\tfrac{1}{2}\mathbf{p}^2 + E\Big)\Big](x, x),
\end{aligned}$$

since $B \le L/2$. From (3.9) one finishes the proof along the lines of [LSY-III].
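The constant in the bound Π_L(x, x) ≤ c‖B‖ can be made explicit for the stated choice L = 2‖B‖, t = (3‖B‖)^{-1}. The sketch below assumes the standard diagonal value (4πt)^{-1} of the free heat kernel in two dimensions; this value is not quoted in the text above, and the function name is hypothetical. Under that assumption the product e^{tL} e^{t‖B‖} (4πt)^{-1} equals (3e/4π)‖B‖, so one admissible constant is c = 3e/(4π) ≈ 0.649.

```python
import math


def projection_diagonal_bound(b_norm: float) -> float:
    """Evaluate e^{tL} * e^{t||B||} * (4*pi*t)^{-1} with L = 2||B||, t = 1/(3||B||).

    Assumes the free 2D heat kernel on the diagonal equals 1 / (4*pi*t).
    """
    if b_norm <= 0:
        raise ValueError("b_norm must be positive")
    L = 2.0 * b_norm
    t = 1.0 / (3.0 * b_norm)
    free_heat_kernel_diagonal = 1.0 / (4.0 * math.pi * t)
    return math.exp(t * L) * math.exp(t * b_norm) * free_heat_kernel_diagonal


if __name__ == "__main__":
    for b_norm in (0.5, 1.0, 40.0):
        # The ratio to ||B|| is the same constant for every field strength.
        print(b_norm, projection_diagonal_bound(b_norm) / b_norm)
```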

3.2. Constant field approximation. In this section we rewrite the kinetic energy part of the Pauli operator in terms of a spatial average of operators with constant magnetic field plus error terms. Fix $L > 0$ and let us fix two positive functions $\Lambda(h)$ and $\lambda(h)$ satisfying $\Lambda(h) \to \infty$, $\Lambda(h)h^{1/2} \to 0$, $\lambda(h) \to 0$ and $h\lambda(h)^{-1} \to 0$ as $h \to 0$, and $\Lambda(h) \ge 4e$, $\lambda(h) < (8e^2)^{-1}L$. For each $u \in \mathbb{R}^2$ we define the function $\eta^u := \eta^{(0),u}\zeta$, where

$$\eta^{(0),u}(x) := c_{norm,u}\,w_u^{-1}e^{-w_u^{-2}x^2}$$

is the magnetic localization function (modulo the constant $c_{norm,u}$) introduced in Lemma 2 and $\zeta \in C_0^\infty(\mathbb{R}^2)$, $\zeta \equiv 1$ on $B(0, \lambda(h)/2)$, $\mathrm{supp}\,\zeta \subset B(0, \lambda(h))$ and $|\nabla \zeta| \le c\lambda(h)^{-1}$. The scale $w_u$ of $\eta^{(0),u}$ we choose to be

$$w_u := w_u(h, B) := \Lambda(h)\,B^\#(u)^{-1/2}h^{1/2}.$$

We have introduced the notation

$$B^\#(u) := \sup_{|x - u| < 2L} \{B(x)\}.$$
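One concrete way to meet the requirements on the two auxiliary functions, here called Λ(h) (the one tending to infinity) and λ(h) (the one tending to zero), is sketched below. The specific choices Λ(h) = max(4e, h^{-1/4}) and λ(h) = min(h^{1/2}, L/(16e²)) are assumptions made for illustration and are not taken from the text; any pair with the stated limits would do. With these choices Λ(h)h^{1/2} behaves like h^{1/4} and hλ(h)^{-1} like h^{1/2} for small h. The function `localization_scale` evaluates w_u for a given value of B#(u).

```python
import math


def big_lambda(h: float) -> float:
    """Hypothetical choice: Lambda(h) = max(4e, h^(-1/4))."""
    return max(4.0 * math.e, h ** -0.25)


def small_lambda(h: float, L: float) -> float:
    """Hypothetical choice: lambda(h) = min(h^(1/2), L / (16 e^2)),
    which is strictly below the required bound L / (8 e^2)."""
    return min(math.sqrt(h), L / (16.0 * math.e ** 2))


def localization_scale(h: float, b_sharp: float) -> float:
    """w_u = Lambda(h) * B#(u)^(-1/2) * h^(1/2)."""
    return big_lambda(h) * math.sqrt(h / b_sharp)


if __name__ == "__main__":
    L = 1.0
    for h in (1e-2, 1e-4, 1e-8):
        print(
            h,
            big_lambda(h),
            big_lambda(h) * math.sqrt(h),   # should tend to 0
            small_lambda(h, L),             # should tend to 0
            h / small_lambda(h, L),         # should tend to 0
            localization_scale(h, b_sharp=10.0),
        )
```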

The normalization constant $c_{norm,u}$ is chosen such that $\int (\eta^u)^2 = 1$. It is easy to see that, although $c_{norm,u}$ depends on h, $B^\#(u)$, and $\zeta$, there exists a universal positive constant c such that for $k = 0, 1, 2$,

$$\int \eta^u(v)^2 v^{2k}\,dv \le c\,\min\{w_u, \lambda(h)\}^{2k}, \tag{3.10}$$

and that

$$c^{-1} < \int_{\mathrm{supp}\,\zeta} (\eta^{(0),u})^2 < c.$$

Let $\eta^u_v(x) := \eta^u(x - v)$ and $\zeta_v(x) := \zeta(x - v)$. As explained in the discussion after the magnetic localization Lemma (Lemma 2) we shall localize into regions where the maximum of the magnetic field strength is bounded relative to the minimum. To do this we use that for all $B \in \mathcal{C}_L$ we have $e^{-2} \le B(x)/B(y) \le e^2$ for all $|x - y| < 2L$. Thus from the definition of $B^\#(u)$ we conclude that

$$e^{-2}B^\#(u) \le B(u) \le B^\#(u). \tag{3.11}$$

With the above definition of $w_u$ and the assumptions on $\lambda(h)$ and $\Lambda(h)$ we can now prove the main result of this subsection.

Proposition 3. Let $B \in \mathcal{C}_L$ be given. There exists $\varepsilon(h) > 0$, depending only on h and L, such that $\varepsilon(h) \to 0$ as $h \to 0$; and for each $u, v \in \mathbb{R}^2$ there exist a (phase) function $\phi_{u,v}$ in $C^1(\mathbb{R}^2)$ and constant magnetic fields $\hat{B}^\pm_{u,v}$ satisfying

$$|B(u) - \hat{B}^\pm_{u,v}| \le \varepsilon(h)B(u) \quad \text{for all } v \text{ with } |u - v| \le 2\lambda(h), \tag{3.12}$$

such that the following is valid. For any $f \in C^\infty(\mathbb{R}^2, \mathbb{C}^2)$ and $g \in C_0^\infty(B(u, \lambda(h)), \mathbb{R})$ we have

$$\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gf|^2 \ge \iint \Big[ (1 - \varepsilon(h))\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^+_{u,v})(e^{i\phi_{u,v}}\eta^u_v gf)(x)|^2 - \varepsilon(h)\,|(e^{i\phi_{u,v}}\eta^u_v gf)(x)|^2 \Big]\,dx\,dv, \tag{3.13}$$

and for any fixed $v \in \mathbb{R}^2$,

$$\begin{aligned}
\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})(e^{-i\phi_{u,v}}\eta^u_v gf)(x)|^2\,dx
&\le \int (\eta^u_v g)^2(x) \Big[ (1 + \varepsilon(h))\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^-_{u,v})f(x)|^2 + \varepsilon(h)W_{u,v}(x)|f(x)|^2 \\
&\qquad + \varepsilon(h)W^+_{u,v}(x)|P_+f(x)|^2 \Big]\,dx + ch^2 \int |\nabla(\zeta_v g)(x)|^2\,|(\eta^{(0),u}_v f)(x)|^2\,dx,
\end{aligned} \tag{3.14}$$

where

$$W_{u,v}(x) = w_u^{-4}(x - v)^4 \quad \text{and} \quad W^+_{u,v}(x) = cB^\#(u)hw_u^{-2}(x - v)^2,$$

and the vector potentials

$$\hat{\mathbf{A}}^\pm_{u,v}(x) := \tfrac{1}{2}\hat{B}^\pm_{u,v}\,\mathbf{n} \times x \tag{3.15}$$

generate the constant magnetic fields $\hat{B}^\pm_{u,v}$.


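Proof. Since u is fixed we shall omit the u sub- and superscripts in the proof.

Step 1. Separation. The separation of the spin up and spin down subspaces is trivial since $P_\pm$ commutes with $\sigma_3 B$:

$$\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gf|^2 = \int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gP_-f|^2 + \int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gP_+f|^2. \tag{3.16}$$

Step 2. Localization. We separately consider the kinetic energies of $P_\pm f =: f_\pm$. For the lower bound we write

$$\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gf_\pm|^2 = \iint \eta_v(x)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})(gf_\pm)(x)|^2\,dx\,dv. \tag{3.17}$$

Note that the above integrals can be restricted to $x, v \in B(u, 2\lambda(h))$. For all $x, v \in B(u, 2\lambda(h))$, we can, since $\lambda(h) < L$, approximate the magnetic field by the constant field $B_v := B(v)$, such that

$$|B(x) - B_v| \le |x - v|\,B^\#(u)L^{-1}. \tag{3.18}$$

We can then approximate $\mathbf{A}$ by $\mathbf{A}_v$, $\mathrm{rot}\,\mathbf{A}_v = B_v$, such that

$$|\mathbf{A}(x) - \mathbf{A}_v(x)| \le cB^\#(u)|x - v|^2L^{-1} \tag{3.19}$$

(using the Poincaré formula, see (A.10) in the Appendix). Using a Cauchy-Schwarz inequality, we have, for any $0 < \delta < 1/2$,

$$\begin{aligned}
\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gf_\pm|^2
&\ge (1 - \delta) \iint \eta(x - v)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}_v)(gf_\pm)(x)|^2\,dx\,dv \\
&\quad - c\delta^{-1} \iint \eta(x - v)^2\,|\mathbf{A}(x) - \mathbf{A}_v(x)|^2\,|gf_\pm(x)|^2\,dx\,dv.
\end{aligned} \tag{3.20}$$

We shall choose $\delta = \delta(h)$ at the end of this section. For the upper bound, we fix $v \in \mathbb{R}^2$, and we consider

$$\begin{aligned}
\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})(e^{-i\phi_v}\eta_v gf_\pm)(x)|^2\,dx
&\le (1 + \delta) \int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}_v)(e^{-i\phi_v}\eta_v gf_\pm)(x)|^2\,dx \\
&\quad + c\delta^{-1} \int \eta(x - v)^2\,|\mathbf{A}(x) - \mathbf{A}_v(x)|^2\,|gf_\pm(x)|^2\,dx,
\end{aligned} \tag{3.21}$$

where the phase function $\phi_v : \mathbb{R}^2 \to \mathbb{R}$ will be chosen below. To control the error in (3.20), we shall apply the following estimate:

$$\delta^{-1} \int \eta_v(x)^2\,|\mathbf{A}(x) - \mathbf{A}_v(x)|^2\,dv \le c\delta^{-1}B^\#(u)^2L^{-2} \int \eta(v)^2 v^4\,dv \le c\delta^{-1}B^\#(u)^2L^{-2}w^4, \tag{3.22}$$

which is a consequence of (3.10) and (3.19). We are left with considering the quadratic forms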


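$$\iint \eta_v(x)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}_v)(gf_\pm)(x)|^2\,dx\,dv = \iint \eta_v(x)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}_v - \nabla\phi_v)(e^{i\phi_v}gf_\pm)(x)|^2\,dx\,dv, \tag{3.23}$$

and

$$\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}_v)(e^{-i\phi_v}\eta_v gf_\pm)(x)|^2\,dx = \int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}_v - \nabla\phi_v)(\eta_v gf_\pm)(x)|^2\,dx \tag{3.24}$$

in (3.20) and (3.21), where we again introduced the gauge transformation $\phi_v$. We choose $\phi_v$ such that $(\mathbf{A}_v - \nabla\phi_v)(x) = \tfrac{1}{2}B_v\,\mathbf{n} \times (x - v)$; this determines $\phi_v$ (up to an irrelevant constant).

Step 3. Magnetic localization. Using the estimates in Lemma 2 with $k(x) = \zeta(x)(gf)(x + v)$ (and including a phase factor in the case of (2.3)), then shifting $x \to x - v$, we obtain from (3.23), (3.24), (3.10), and the estimates in the IMS Lemma (once for $\zeta_v$ and once for $g\zeta_v$) that

$$\begin{aligned}
\iint \eta_v(x)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}_v)(gf_\pm)(x)|^2\,dx\,dv
&\ge (1 - \delta) \iint |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^+_v)(e^{i\phi_v}\eta_v gf_\pm)(x)|^2\,dx\,dv \\
&\quad - (1 - \delta)h^2 \iint |\nabla\zeta_v(x)|^2\,|(\eta^{(0)}_v gf_\pm)(x)|^2\,dx\,dv \\
&\quad - c\delta^{-1}h^2w^{-2} \iint |P_+(\eta_v gf_\pm)(x)|^2\,dx\,dv,
\end{aligned} \tag{3.25}$$

and for each v,

$$\begin{aligned}
\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A}_v)(e^{-i\phi_v}\eta_v gf_\pm)(x)|^2\,dx
&\le (1 + \delta) \int \eta(x - v)^2 g(x)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^-_v)f_\pm(x)|^2\,dx \\
&\quad + (1 + \delta)h^2 \int |\nabla(g\zeta_v(x))|^2\,|(\eta^{(0)}_v f_\pm)(x)|^2\,dx \\
&\quad + c\delta^{-1}h^2w^{-4} \int (x - v)^2\,|P_+(\eta_v gf_\pm)(x)|^2\,dx,
\end{aligned} \tag{3.26}$$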

where the vector potentials $\hat{\mathbf{A}}^\pm_v$ defined in (3.15) generate the constant fields $\hat{B}^\pm_v := B_v \pm 4hw^{-2}$. First we prove (3.12). It follows from (3.11) that for $|u - v| \le 2\lambda(h)$ $(< 2L)$ we have

$$\begin{aligned}
|\hat{B}^\pm_v - B(u)| &\le |B(u) - B(v)| + 4hw^{-2} \\
&\le 2B^\#(u)L^{-1}\lambda(h) + 4e^2hw^{-2}B^\#(u)^{-1}B(u) \\
&\le 2e^2\big(\lambda(h)L^{-1} + 2\Lambda(h)^{-2}\big)B(u).
\end{aligned}$$

We must therefore show that we can choose $\varepsilon(h)$ such that

$$2e^2\big(\lambda(h)L^{-1} + 2\Lambda(h)^{-2}\big) \le \varepsilon(h). \tag{3.27}$$

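We also get the lower bound (recall that we assumed $\lambda(h) \le (8e^2)^{-1}L$ and $\Lambda(h) \ge 4e$)

$$\hat{B}^\pm_v \ge \big(1 - 2e^2(\lambda(h)L^{-1} + 2\Lambda(h)^{-2})\big)B(u) \ge \tfrac{1}{2}e^{-2}B^\#(u)$$

for all v such that $|v - u| \le 2\lambda(h)$. Next, we continue the estimate (3.25). To give a lower bound on the right-hand side of (3.25) we have to control the last term, which is nonzero only for $f_+$. Since for any $k \in C_0^\infty(\mathbb{R}^2, \mathbb{C}^2)$ we have

$$\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^\pm_v)P_+k|^2 \ge h\hat{B}^\pm_v \int |P_+k|^2 \ge \tfrac{1}{2}e^{-2}hB^\#(u) \int |P_+k|^2, \tag{3.28}$$

we see that the last term on the right-hand side of (3.25) can be absorbed into the kinetic energy term if

$$R := c\delta^{-1}h^2w^{-2}(B^\#(u)h)^{-1} = c\delta^{-1}\Lambda(h)^{-2} < 1. \tag{3.29}$$

Putting together (3.16), (3.20), (3.22), (3.23) and (3.25), the final lower bound of this section is

$$\begin{aligned}
\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})gf|^2
&\ge \iint \Big[ (1 - \delta)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^+_v)(e^{i\phi_v}\eta_v gP_-f)(x)|^2 - (Q + c\lambda(h)^{-2}h^2)\,|(e^{i\phi_v}\eta_v gP_-f)(x)|^2 \\
&\qquad + (1 - \delta)^2(1 - R)\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^+_v)(e^{i\phi_v}\eta_v gP_+f)(x)|^2 \\
&\qquad - (Q + c\lambda(h)^{-2}h^2)\,|(e^{i\phi_v}\eta_v gP_+f)(x)|^2 \Big]\,dx\,dv
\end{aligned} \tag{3.30}$$

with

$$Q := c\delta^{-1}B^\#(u)^2L^{-2}w^4 = c\delta^{-1}L^{-2}\Lambda(h)^4h^2.$$

We also used $|\nabla\zeta| \le c\lambda(h)^{-1}$ and $\int_{\mathrm{supp}\,\zeta}(\eta^{(0)})^2 < c$. For the upper bound we use (3.16) with f replaced by $e^{-i\phi_v}\eta_v f$, then combine it with (3.21), (3.22), (3.24) and (3.26) to obtain

$$\begin{aligned}
\int |\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})(e^{-i\phi_v}\eta_v gf)(x)|^2\,dx
&\le \int \eta_v(x)^2 g(x)^2 \Big\{ (1 + \delta)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^-_v)(P_-f)(x)|^2 + Qw^{-4}(x - v)^4\,|P_-f(x)|^2 \\
&\qquad + (1 + \delta)^2\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \hat{\mathbf{A}}^-_v)(P_+f)(x)|^2 \\
&\qquad + \big[Qw^{-4}(x - v)^4 + RB^\#(u)hw^{-2}(x - v)^2\big]\,|P_+f(x)|^2 \Big\}\,dx \\
&\quad + ch^2 \int |(\nabla(g\zeta_v))(x)|^2\,|\eta^{(0)}_v f(x)|^2\,dx.
\end{aligned} \tag{3.31}$$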
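To get a feeling for the sizes of the error quantities R in (3.29), Q in (3.30), the IMS term λ(h)^{-2}h², and the left side of (3.27), one can evaluate them for explicit parameter choices. The sketch below is only an illustration: it sets every universal constant c equal to 1 and uses the hypothetical choices Λ(h) = max(4e, h^{-1/8}), λ(h) = min(h^{1/2}, L/(16e²)) and δ(h) = Λ(h)^{-1}, none of which come from the text. With these choices all four quantities decrease to zero as h → 0.

```python
import math


def error_terms(h: float, L: float) -> dict:
    """Evaluate R, Q, the IMS term and the left side of (3.27) with c = 1.

    Hypothetical parameter choices (assumptions, not from the paper):
      Lambda(h) = max(4e, h^(-1/8)),
      lambda(h) = min(h^(1/2), L / (16 e^2)),
      delta(h)  = 1 / Lambda(h).
    """
    big_lam = max(4.0 * math.e, h ** -0.125)
    small_lam = min(math.sqrt(h), L / (16.0 * math.e ** 2))
    delta = 1.0 / big_lam
    return {
        "delta": delta,
        # R = delta^{-1} Lambda^{-2}
        "R": big_lam ** -2 / delta,
        # Q = delta^{-1} L^{-2} Lambda^4 h^2
        "Q": big_lam ** 4 * h ** 2 / (delta * L ** 2),
        # IMS localization error lambda^{-2} h^2
        "ims": h ** 2 / small_lam ** 2,
        # left side of (3.27): 2 e^2 (lambda / L + 2 Lambda^{-2})
        "eps_lower": 2.0 * math.e ** 2 * (small_lam / L + 2.0 * big_lam ** -2),
    }


if __name__ == "__main__":
    for h in (1e-4, 1e-8, 1e-16):
        print(h, error_terms(h, L=1.0))
```

Any admissible ε(h) must dominate the maximum of `R`, `Q + ims` and `eps_lower`, which is how the requirements listed next are met simultaneously.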


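In order to prove the proposition we must show that we can choose $\varepsilon = \varepsilon(h)$ and $\delta = \delta(h)$ such that $\varepsilon(h), \delta(h) \to 0$ as $h \to 0$, such that (3.27) is satisfied, and such that the following requirements are met:

$$R \le \varepsilon, \qquad Q + c\lambda(h)^{-2}h^2 \le \varepsilon, \qquad (1 + \delta)^2 \le (1 + \varepsilon), \qquad (1 - \delta)^2(1 - R) \ge (1 - \varepsilon).$$

In particular this will imply (3.29). Since $\lambda(h) \to 0$, $\Lambda(h) \to \infty$, $\Lambda(h)h^{1/2} \to 0$, and $h\lambda(h)^{-1} \to 0$ as $h \to 0$, it is clear that we can find $\varepsilon(h)$ and $\delta(h)$ as functions of h such that all these requirements are simultaneously satisfied. This finishes the proof of Proposition 3.

3.3. Lower bound. We shall here prove the one-sided bound

$$\limsup_{h \to 0} \frac{\sum_k e_k(H)}{E^{(2)}_{scl}(h, B, V)} \le 1, \tag{3.32}$$

uniformly for B in $\mathcal{C}_{C,L}$. Since $E^{(2)}_{scl}(h, B, V) \le 0$ this is a lower bound on the eigenvalue sum $\sum_k e_k(H)$. For this bound, we can replace V by its negative part, $-[V]_-$.

We introduce a spherically symmetric function $0 \le \theta_\lambda \in C_0^\infty(\mathbb{R}^2)$, which localizes at scale $\lambda(h)$, i.e. $\int_{\mathbb{R}^2} \theta_\lambda^2 = 1$, $\mathrm{supp}\,\theta_\lambda \subset B(0, \lambda(h))$ and $\int (\nabla\theta_\lambda)^2 = c\lambda(h)^{-2}$. Let $\theta_{u,\lambda}(x) = \theta_\lambda(x - u)$. Instead of the original Pauli operator H, we are going to study

$$\tilde{H} := (1 - 2\delta_1)\,\Theta_\varrho[\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})]^2\Theta_\varrho - [V\chi_\varrho]_- * (\theta_\lambda)^2, \tag{3.33}$$

where $\chi_\varrho$ is the characteristic function of the ball of radius $\varrho$ centered at the origin in $\mathbb{R}^2$ and $\Theta_\varrho \in C_0^\infty(\mathbb{R}^2)$ satisfies $\Theta_\varrho(x) = 1$ on $B(0, \varrho + \lambda(h))$, vanishes outside $B(0, \varrho + 2\lambda(h))$ and $|\nabla\Theta_\varrho| \le c\lambda(h)^{-1}$. Here $0 < \delta_1 = \delta_1(h) < 1/4$ and $\varrho = \varrho(h) > 0$ as functions of h will be chosen at the end of the section. The argument h will frequently be omitted. We then use the localization formula (2.1) to write

$$\langle f|\tilde{H}|f\rangle = \int \langle f\theta_{u,\lambda}|\tilde{H}_u|\theta_{u,\lambda}f\rangle\,du - c(1 - 2\delta_1)h^2\lambda^{-2}\langle f|\Theta_\varrho^2|f\rangle$$

for any $f \in C_0^\infty(\mathbb{R}^2, \mathbb{C}^2)$, where

$$\tilde{H}_u = (1 - 2\delta_1)\,\Theta_\varrho[\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})]^2\Theta_\varrho - [V\chi_\varrho(u)]_-.$$

The relation between the kinetic energy parts of H and $\tilde{H}$ is given by the pointwise inequality in Lemma 1:

$$\langle f|[\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})]^2|f\rangle \ge (1 - \delta_1)\langle f|\Theta_\varrho[\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})]^2\Theta_\varrho|f\rangle - c\delta_1^{-1}h^2\langle f|(\nabla\Theta_\varrho)^2|f\rangle. \tag{3.34}$$

In order to merge the two error terms from these two localizations, we note that

$$c(1 - 2\delta_1)h^2\lambda^{-2}\Theta_\varrho^2 + c\delta_1^{-1}h^2(\nabla\Theta_\varrho)^2 \le c\delta_1^{-1}h^2\lambda^{-2}(\chi_{\varrho + 2\lambda})^2. \tag{3.35}$$

Therefore we may write, using (3.34) and (3.35),

$$\langle f|H|f\rangle \ge \int \langle f\theta_{u,\lambda}|\tilde{H}_u|\theta_{u,\lambda}f\rangle\,du + \langle f|H_{err}|f\rangle, \tag{3.36}$$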


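where

$$H_{err} := \delta_1[\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})]^2 + W_{err} \tag{3.37}$$

with

$$W_{err} = [V\chi_\varrho]_- * (\theta_\lambda)^2 - [V]_- - c\delta_1^{-1}h^2\lambda^{-2}(\chi_{\varrho + 2\lambda})^2.$$

To estimate the effect of $H_{err}$, we use the 2D magnetic Lieb-Thirring error functional. We have

$$\begin{aligned}
E^{(2)}_{h,B}\big(\delta_1^{-1}W_{err}\big)
&\le c\delta_1^{-2}\Big\{ E^{(2)}_{h,B}\big([V\chi_\varrho]_- * (\theta_\lambda)^2 - [V]_- * (\theta_\lambda)^2\big) \\
&\qquad + E^{(2)}_{h,B}\big([V]_- * (\theta_\lambda)^2 - [V]_-\big) + E^{(2)}_{h,B}\big(c\delta_1^{-1}h^2\lambda^{-2}(\chi_{\varrho + 2\lambda})^2\big) \Big\} \\
&\le c\delta_1^{-2}\Big\{ E^{(2)}_{h,B}\big([V\chi_\varrho]_- - [V]_-\big) \\
&\qquad + E^{(2)}_{h,B}\big([V]_- * (\theta_\lambda)^2 - [V]_-\big) + E^{(2)}_{h,B}\big(c\delta_1^{-1}h^2\lambda^{-2}(\chi_{\varrho + 2\lambda})^2\big) \Big\},
\end{aligned}$$

where we used Jensen's inequality. Note that Jensen also implies that

$$E^{(2)}_{h,B}\big([V]_- * (\theta_\lambda)^2 - [V]_-\big) \le \int E^{(2)}_{h,B}\big([V]_- - [V(\cdot - y)]_-\big)\,\theta_\lambda(y)^2\,dy.$$

The integrand vanishes unless $|y| \le \lambda(h)$. Since $\lambda = \lambda(h) \to 0$ as $h \to 0$ it follows, from (3.2)–(3.3) and (3.6), that this term tends to zero relative to $|E^{(2)}_{scl}(h, B, V)|$ as $h \to 0$. If we also use condition (3.4) we see that

$$\frac{E^{(2)}_{h,B}\big(\delta_1^{-1}W_{err}\big)}{|E^{(2)}_{scl}(h, B, V)|} \le c\delta_1^{-2}C_-(B, V)^{-1}\Big[ \varepsilon_2(V, \varrho) + \sup_{|y| \le \lambda(h)} \varepsilon_1(V, y) + \lambda(h)^{-2}h^2 + (\lambda(h)^{-2}h^2)^2(\varrho + 2\lambda(h))^2 \Big].$$

Let $f_1, f_2, \dots, f_N$ be a family of compactly supported smooth orthonormal spinors. We want to estimate

$$\sum_{j=1}^N \langle f_j|H|f_j\rangle \ge \sum_{j=1}^N \int \langle f_j\theta_{u,\lambda}|\tilde{H}_u|\theta_{u,\lambda}f_j\rangle\,du + \sum_{j=1}^N \langle f_j|H_{err}|f_j\rangle \tag{3.38}$$

uniformly in N. By the Lieb-Thirring inequality we have

$$\sum_{j=1}^N \langle f_j|H_{err}|f_j\rangle \ge -c\delta_1 E^{(2)}_{h,B}\big(\delta_1^{-1}W_{err}\big), \tag{3.39}$$

which shall be controlled using (3.38). For the main term, we use

$$\sum_{j=1}^N \langle f_j\theta_{u,\lambda}|\tilde{H}_u|\theta_{u,\lambda}f_j\rangle \ge \sum_{j=1}^N \Big[ \int (1 - \delta_1)\,|\boldsymbol{\sigma} \cdot (h\mathbf{p} + \mathbf{A})(\Theta_\varrho\theta_{u,\lambda}f_j)(x)|^2\,dx - \iint [V\chi_\varrho(u)]_-\,|(\theta_{u,\lambda}\eta^u_v\Theta_\varrho f_j)(x)|^2\,dv\,dx \Big]. \tag{3.40}$$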


For each ν ∈ N and u, v ∈ R2 we define the following positive operators via their kernel: (3.41) Π ± (ν, u, v)(x, y) ν,± (x, y)eiφu,v (y) η u (y − v)θλ (y − u), := θλ (x − u)η u (x − v)e−iφu,v (x) Πu,v ν,± ˆ± is the ν th Landau level projection corresponding to the constant field B where Πu,v u,v ± ˆ u,v , obtained in Proposition 3. We do not need the explicit form of and to the gauge A ν,± ν,± ± which can be found in [LSY-II]. We only need that Πu,v (x, x) = dν h−1 Bˆ u,v . Πu,v ± + Note that Π (ν, u, v) = 0 unless |v − u| ≤ 2λ. In this section we shall use Π (ν, u, v) only, the other operator will be useful in Sect. 3.4. Since θu,λ has support in B(u, λ(h)) we see from (3.13) that X ˜ u |θu,λ fj i hfj θu,λ |H (3.42) j

≥

XZ Z

ˆ +u,v )(eiφu,v ηvu Θ% θu,λ fj )(x)|2 (1 − δ1 )(1 − ε(h))|σ · (hp + A

j

− ([V (u)]− + ε(h))|(eiφu,v ηvu Θ% θu,λ fj )(x)|2 dx dv = (1 − δ2 )

∞ Z XX

+ 2νhBˆ u,v − (1 − δ2 )−1 ([V (u)]− + ε(h))

j

ν=0

× hfj |Θ% Π + (ν, u, v)Θ% |fj idv. Here

(1 − δ2 ) := (1 − δ1 )(1 − ε(h)).

(3.43)

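Observe now that for each $\nu\in\mathbf N$ we have from (3.12) that
\[
\Pi^{\nu,\pm}_{u,v}(x,x)=d_\nu h^{-1}\hat B^\pm_{u,v}\le d_\nu h^{-1}(1+\varepsilon(h))B(u),
\tag{3.44}
\]
if $|u-v|\le2\lambda(h)$. Therefore
\[
\int\sum_j\langle f_j|\Theta_\varrho\Pi^+(\nu,u,v)\Theta_\varrho|f_j\rangle\,dv
\le\int{\rm Tr}\big[\Pi^+(\nu,u,v)\big]\,dv\le d_\nu h^{-1}(1+\varepsilon(h))B(u),
\tag{3.45}
\]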

where $d_\nu$ was defined after (1.4). In the last estimate above we used (3.12). Thus, combining (3.45) with (3.40), (3.42), both the upper and lower bounds of (3.12), and the definition (1.6) of $P^{(2)}$, we have
\[
\begin{aligned}
\sum_{j=1}^N\int\langle f_j\theta_{u,\lambda}|\tilde H_u|\theta_{u,\lambda}f_j\rangle\,du
&\ge-h^{-2}(1-\varepsilon(h)^2)(1-\delta_2)\int_{|u|<\varrho+3\lambda}P^{(2)}\Big(hB(u),\frac{[V(u)]_-+\varepsilon(h)}{1-\delta_3}\Big)du\\
&\ge-h^{-2}\int_{|u|<\varrho+3\lambda}P^{(2)}\Big(hB(u),\frac{[V(u)]_-+\varepsilon(h)}{1-\delta_3}\Big)du
\end{aligned}
\tag{3.46}
\]
independently of $N$, with
\[
(1-\delta_3):=(1-\delta_2)(1-\varepsilon(h))=(1-\delta_1)(1-\varepsilon(h))^2.
\tag{3.47}
\]

We used that the integral on the left side of (3.46) vanishes unless $|u|\le\varrho+3\lambda(h)$. Putting together (3.38), (3.39) and (3.46), we see that
\[
\sum_j\langle f_j|H|f_j\rangle\ge-h^{-2}\int_{|u|<\varrho+3\lambda}P^{(2)}\Big(hB(u),\frac{[V(u)]_-+\varepsilon(h)}{1-\delta_3}\Big)du-c\delta_1E^{(2)}_{h,B}\big(\delta_1^{-1}W_{\rm err}\big).
\tag{3.48}
\]

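Clearly
\[
0\le\partial_2P(B,W):=\frac{\partial}{\partial W}P(B,W)\le cB+cW.
\tag{3.49}
\]
Therefore, using $0\le P(B,W+V)-P(B,W)\le V\,\partial_2P(B,W+V)$ for any $V\ge0$, and (1.10), we obtain, if $\varepsilon(h)\le1$ and $\delta_3\le1/2$,
\[
\begin{aligned}
0&\le h^{-2}\int_{|u|<\varrho+3\lambda}\Big[P^{(2)}\Big(hB(u),\frac{[V(u)]_-+\varepsilon(h)}{1-\delta_3}\Big)-P^{(2)}\big(hB(u),[V(u)]_-\big)\Big]du\\
&\le ch^{-2}\int_{|u|<\varrho+3\lambda}\big(\varepsilon(h)+\delta_3[V(u)]_-\big)\Big[\|B\|h+\big([V(u)]_-+\varepsilon(h)\big)\Big]du\\
&\le c\Big[\varepsilon(h)(\varrho+3\lambda(h))^2\big[\|B\|h^{-1}+h^{-2}\big]+(\varepsilon(h)+\delta_3)E^{(2)}_{h,B}([V]_-)\Big].
\end{aligned}
\tag{3.50}
\]
Here we also used the simple estimate
\[
h^{-2}\varepsilon(h)\int_{|u|<\varrho+3\lambda}[V(u)]_-\,du\le ch^{-2}\varepsilon(h)(\varrho+3\lambda)^2+c\varepsilon(h)h^{-2}\int_{|u|<\varrho+3\lambda}[V(u)]_-^2\,du.
\tag{3.51}
\]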
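As a hedged illustration of where a bound of the form (3.49) can come from, one may start from the level-counting identity used in Sect. 3.4, which amounts to $\partial_2P^{(2)}(B,W)=\sum_{\nu:\,2\nu B<W}d_\nu B$ for $W>0$. The only assumption made here, since the values of $d_\nu$ are fixed after (1.4) and not repeated in this section, is that they are uniformly bounded, say $d_\nu\le\bar d$ for all $\nu$. The number of integers $\nu\ge1$ with $2\nu B<W$ is at most $W/(2B)$, so
\[
\partial_2P^{(2)}(B,W)=\sum_{\nu:\,2\nu B<W}d_\nu B\;\le\;d_0B+\bar d\,B\cdot\frac{W}{2B}\;=\;d_0B+\tfrac12\bar d\,W ,
\]
which is of the form $cB+cW$.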

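Using (1.5), (3.48), (3.50), (3.2–3.5), (3.38),
\[
\begin{aligned}
\frac{\sum_j\langle f_j|H|f_j\rangle}{E^{(2)}_{\rm scl}(h,B,V)}
\le1+cC_-(B,V)^{-1}\Big[&\varepsilon(h)(\varrho+3\lambda(h))^2+(\varepsilon(h)+\delta_3)C_+(V)\\
&+\delta_1^{-2}\Big(\varepsilon_2(V,\varrho)+\sup_{|y|\le\lambda(h)}\varepsilon_1(V,y)+\big(\lambda(h)^{-2}h^2+(\lambda(h)^{-2}h^2)^2\big)(\varrho+2\lambda(h))^2\Big)\Big].
\end{aligned}
\tag{3.52}
\]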

This estimate is valid independently of $N$. The ratio of the sum of the negative eigenvalues $\sum_ke_k(H)$ of $H=H^{(2)}(h,A,V)$ to $E^{(2)}_{\rm scl}(h,B,V)$ is therefore estimated above by the right side of (3.52). It is clear, using (3.4–3.5) and the fact that $B\in C_{C,L}(V)$ implies $C_-(B,V)\ge C$, that we can choose $\delta_1$ and $\varrho$ as functions of $h$ such that the error in (3.52) tends to 0 as $h\to0$. This yields (3.32).

3.4. Upper bound. In this section we shall show the opposite asymptotic bound
\[
\liminf_{h\to0}\frac{\sum_ke_k(H)}{E^{(2)}_{\rm scl}(h,B,V)}\ge1,
\tag{3.53}
\]

uniformly for $B$ in $C_{C,L}$. For each $\nu\in\mathbf N$, $v\in\mathbf R^2$ recall that we defined the following coherent state operators via their kernel:
\[
\Pi^-(\nu,u,v)(x,y):=\theta_\lambda(x-u)\,\eta^u(x-v)\,e^{-i\phi_{u,v}(x)}\,\Pi^{\nu,-}_{u,v}(x,y)\,e^{i\phi_{u,v}(y)}\,\eta^u(y-v)\,\theta_\lambda(y-u)
\]
(we used the notations from Sect. 3.3) and the Landau level projection $\Pi^{\nu,-}_{u,v}$ corresponding to the constant field $\hat B^-_{u,v}$ and to the gauge $\hat A^-_{u,v}$. Note that $\Pi^-(\nu,u,v)=0$ unless $|v-u|<2\lambda(h)$. The following relations are immediate to check:
\[
\sum_{\nu=0}^\infty\int\!\!\int\Pi^-(\nu,u,v)\,dv\,du=1_{L^2(\mathbf R^2,\mathbf C^2)},
\tag{3.54}
\]
\[
{\rm Tr}\,\Pi^-(\nu,u,v)=d_\nu h^{-1}\hat B^-_{u,v}\,\Phi(u,v),\qquad\text{with}\quad\Phi(u,v)=\int\theta_\lambda(x-u)^2\,\eta^u(x-v)^2\,dx,
\tag{3.55}
\]
and from (3.12)
\[
\int{\rm Tr}\big[V\Pi^-(\nu,u,v)\big]\,dv\le d_\nu h^{-1}B(u)\,\Big(\big\{(1+\varepsilon(h))[V]_+-(1-\varepsilon(h))[V]_-\big\}*(\theta_\lambda)^2\Big)(u).
\tag{3.56}
\]
In order to calculate the kinetic energy we first note that since $\Pi^{\nu,-}_{u,v}$ is a projection we can write $\Pi^-(\nu,u,v)=\Xi(\nu,u,v)\,\Xi(\nu,u,v)^*$, where $\Xi(\nu,u,v)$ has the integral kernel
\[
\Xi(\nu,u,v)(x,y)=\theta_\lambda(x-u)\,\eta^u(x-v)\,e^{-i\phi_{u,v}(x)}\,\Pi^{\nu,-}_{u,v}(x,y).
\]

We therefore have
\[
\begin{aligned}
\int{\rm Tr}\Big[[\sigma\cdot(hp+A)]^2\,\Pi^-(\nu,u,v)\Big]dv
&=\int{\rm Tr}\Big[\big(\sigma\cdot(hp+A)\Xi(\nu,u,v)\big)^*\big(\sigma\cdot(hp+A)\Xi(\nu,u,v)\big)\Big]dv\\
&=\int\!\!\int\!\!\int\Big|\big[\sigma\cdot(hp+A)\Xi(\nu,u,v)\big](x,y)\Big|^2dx\,dy\,dv.
\end{aligned}
\tag{3.57}
\]
To estimate this we use (3.14) for each fixed $y$ with $f(x)=\Pi^{\nu,-}_{u,v}(x,y)$ and $g=\theta_{\lambda,u}$. We begin by examining the error terms. From the spectral density expression (3.44), the moment estimate (3.10), the field comparison (3.12) and the normalization of $\theta_{\lambda,u}$ we find
\[
\begin{aligned}
\int(\eta^u_v\theta_{\lambda,u})^2(x)\,W_{u,v}(x)\,|\Pi^{\nu,-}_{u,v}(x,y)|^2\,dy\,dx\,dv
&=\int w_u^{-4}(x-v)^4\,(\eta^u_v(x))^2\,\theta_{\lambda,u}^2(x)\,|\Pi^{\nu,-}_{u,v}(x,x)|\,dx\,dv\\
&\le cd_\nu h^{-1}(1+\varepsilon(h))B(u),
\end{aligned}
\tag{3.58}
\]
where the first identity follows since $\Pi^{\nu,-}_{u,v}$ is a projection. Since $P_+$ commutes with $\Pi^{\nu,-}_{u,v}$ we have $[P_+\Pi^{\nu,-}_{u,v}P_+](x,x)\le\Pi^{\nu,-}_{u,v}(x,x)$. For the second error term we therefore get in the same manner as above that
\[
\begin{aligned}
\int(\eta^u_v\theta_{\lambda,u})^2(x)\,W^+_{u,v}(x)\,|P_+\Pi^{\nu,-}_{u,v}(x,y)|^2\,dy\,dx\,dv
&\le c\nu(hB^\#(u))\,d_\nu h^{-1}(1+\varepsilon(h))B(u)\\
&\le c\nu(hB(u))\,d_\nu h^{-1}(1+\varepsilon(h))B(u).
\end{aligned}
\tag{3.59}
\]

In the last line we used (3.11) to estimate $B^\#(u)$ in terms of $B(u)$. Note that we inserted a $\nu$ in the estimate. This is clearly allowed for $\nu\ge1$. For $\nu=0$ it follows simply because $P_+\Pi^{0,-}_{u,v}=0$ (i.e., the lowest Landau level contains only spinors with spin down). For the last error term in (3.14) we just have to recall that $\int_{{\rm supp}\,\zeta}(\eta^{(0),u})^2<c$ to conclude that
\[
\int|\nabla(\zeta_v\theta_{\lambda,u})(x)|^2\,|(\eta^{(0),u}_v)(x)\,\Pi^{\nu,-}_{u,v}(x,y)|^2\,dx\,dy\,dv\le c\lambda(h)^{-2}d_\nu h^{-1}(1+\varepsilon(h))B(u).
\tag{3.60}
\]
If we insert (3.57–3.60) into (3.14) we arrive at
\[
\begin{aligned}
\int{\rm Tr}\Big[[\sigma\cdot(hp+A)]^2\,\Pi^-(\nu,u,v)\Big]dv
&\le(1+\varepsilon(h))\int\!\!\int\!\!\int\eta^u_v(x)^2\,\theta_{\lambda,u}(x)^2\,\Big|\big[\sigma\cdot(hp+\hat A^-_{u,v})\Pi^{\nu,-}_{u,v}\big](x,y)\Big|^2dx\,dy\,dv\\
&\quad+cd_\nu h^{-1}(1+\varepsilon(h))B(u)\Big[\varepsilon(h)+\nu(hB(u))\varepsilon(h)+h^2\lambda(h)^{-2}\Big].
\end{aligned}
\]
To compute the first term we observe that since $\Pi^{\nu,-}_{u,v}$ is an eigenprojection for $[\sigma\cdot(hp+\hat A^-_{u,v})]^2$ it commutes with $\sigma\cdot(hp+\hat A^-_{u,v})$. Hence, using again the spectral density expression (3.44), we get
\[
\begin{aligned}
\int\Big|\big[\sigma\cdot(hp+\hat A^-_{u,v})\Pi^{\nu,-}_{u,v}\big](x,y)\Big|^2dy
&=\Big[\sigma\cdot(hp+\hat A^-_{u,v})\,\Pi^{\nu,-}_{u,v}\,\sigma\cdot(hp+\hat A^-_{u,v})\Big](x,x)
=\Big[[\sigma\cdot(hp+\hat A^-_{u,v})]^2\,\Pi^{\nu,-}_{u,v}\Big](x,x)\\
&=2\nu h\hat B^-_{u,v}\,\Pi^{\nu,-}_{u,v}(x,x)=2\nu h\hat B^-_{u,v}\,\big(d_\nu h^{-1}\hat B^-_{u,v}\big).
\end{aligned}
\]

If we insert this identity into the estimate above we conclude, again using the field comparison (3.12), that
\[
\int{\rm Tr}\Big[[\sigma\cdot(hp+A)]^2\,\Pi^-(\nu,u,v)\Big]dv\le d_\nu h^{-1}(1+\varepsilon(h))B(u)\Big[(1+c\varepsilon(h))2\nu hB(u)+c\varepsilon(h)+ch^2\lambda(h)^{-2}\Big].
\tag{3.61}
\]
Fix $\varrho>0$ for the moment, and let $M(\nu,u)$ be the characteristic function of the set $\{(\nu,u):2\nu hB(u)<[V(u)]_-,\ |u|\le\varrho\}$. Note that $M(\nu,u)=0$ if $V(u)\ge0$. Define the operator $\gamma$ on $L^2(\mathbf R^2,\mathbf C^2)$ by
\[
\gamma=\sum_{\nu=0}^\infty\int M(\nu,u)\,\Pi^-(\nu,u,v)\,dv\,du,
\]
which satisfies the density matrix condition $0\le\gamma\le1_{L^2(\mathbf R^2,\mathbf C^2)}$ by (3.54). From the variational principle, (3.56) and (3.61) we have (recall that $\lambda(h)^{-2}h^2\le\varepsilon(h)$)
\[
\begin{aligned}
\sum_ke_k(H)\le{\rm Tr}[H\gamma]
\le\sum_\nu\int M(\nu,u)\,d_\nu h^{-1}(1+\varepsilon(h))B(u)\Big[&\Big([V]_+-\frac{1-\varepsilon(h)}{1+\varepsilon(h)}[V]_-\Big)*(\theta_\lambda)^2(u)\\
&+(1+c\varepsilon(h))2\nu hB(u)+c\varepsilon(h)\Big]du.
\end{aligned}
\tag{3.62}
\]

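Moreover, for $u$ such that $V(u)<0$ and $|u|\le\varrho$ we have
\[
\sum_\nu d_\nu h^{-1}B(u)M(\nu,u)=h^{-2}\partial_2P^{(2)}\big(hB(u),[V(u)]_-\big),
\]
therefore we can continue the estimate (3.62)
\[
\begin{aligned}
\sum_ke_k(H)&\le-(1+c\varepsilon(h))^2h^{-2}\int_{|u|\le\varrho}P^{(2)}\big(hB(u),[V(u)]_-\big)\,du+{\rm Error}(h)\\
&\le-h^{-2}\int_{|u|\le\varrho}P^{(2)}\big(hB(u),[V(u)]_-\big)\,du+{\rm Error}(h)
\end{aligned}
\]
with
\[
\begin{aligned}
{\rm Error}(h):=h^{-2}\int_{|u|\le\varrho,\ V(u)<0}&(1+\varepsilon(h))\,\partial_2P^{(2)}\big(hB(u),[V(u)]_-\big)\\
&\times\Big[(1+c\varepsilon(h))[V(u)]_-+\Big([V]_+-\frac{1-\varepsilon(h)}{1+\varepsilon(h)}[V]_-\Big)*(\theta_\lambda)^2(u)+c\varepsilon(h)\Big]du.
\end{aligned}
\tag{3.63}
\]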

Therefore, using (3.49), we have, if $\varepsilon(h)<1$,
\[
\begin{aligned}
{\rm Error}(h)&\le h^{-2}\int_{|u|\le\varrho}c\big(h\|B\|+[V(u)]_-\big)\Big[\big([V]_--[V]_-*(\theta_\lambda)^2\big)(u)+\big([V]_+*(\theta_\lambda)^2-[V]_+\big)(u)+\varepsilon(h)[V(u)]_-+\varepsilon(h)\Big]du\\
&\le c\Big\{\varepsilon(h)\big[h^{-1}\|B\|+h^{-2}\big]\varrho^2+\varepsilon(h)E^{(2)}_{h,B}([V]_-)\\
&\qquad+\sum_\pm\Big[E^{(2)}_{h,B}\big([V]_\pm-[V]_\pm*(\theta_\lambda)^2\big)+E^{(2)}_{h,B}\big([V]_\pm-[V]_\pm*(\theta_\lambda)^2\big)^{1/2}E^{(2)}_{h,B}([V]_-)^{1/2}\Big]\Big\},
\end{aligned}
\tag{3.64}
\]
where we also used (3.51) as in the lower bound. Considering (3.2–3.5), and (3.3) we get, using Jensen's inequality as in the lower bound, that for fixed $\varrho$,
\[
\frac{{\rm Error}(h)}{|E^{(2)}_{\rm scl}(h,B,V)|}\to0
\]
uniformly for $B$ in $C_{C,L}(V)$ as $h\to0$. Hence for all $\varrho>0$ we have
\[
\liminf_{h\to0}\frac{\sum_ke_k(H)}{E^{(2)}_{\rm scl}(h,B,V)}\ge1-\limsup_{h\to0}\Bigg(\frac{h^{-2}\int_{|u|\ge\varrho}P^{(2)}\big(hB(u),[V(u)]_-\big)\,du}{|E^{(2)}_{\rm scl}(h,B,V)|}\Bigg).
\tag{3.65}
\]
Since, by (3.4) and $C_-(B,V)\ge C$,
\[
\begin{aligned}
h^{-2}\int_{|u|\ge\varrho}P^{(2)}\big(hB(u),[V(u)]_-\big)\,du&=h^{-2}\int P^{(2)}\big(hB(u),[V(1-\chi_\varrho)(u)]_-\big)\,du\\
&\le E^{(2)}_{h,B}\big([V-V\chi_\varrho]_-\big)=E^{(2)}_{h,B}\big([V\chi_\varrho]_--[V]_-\big)\le\varepsilon_2(V,\varrho)\,|E^{(2)}_{\rm scl}(h,B,V)|/C_-(B,V),
\end{aligned}
\]
we obtain the final result (3.53).

4. Semiclassics in Three Dimensions

In the introduction we stated our semiclassical result in Theorem 1.2 for a fixed potential and a simple one-parameter family of magnetic fields, but we shall in fact prove a slightly stronger result, which includes a more general family of magnetic fields and which allows the potential to depend mildly on $B$ and $h$. We always take $h\to0$, and we also would like to allow $\|B\|\to\infty$ simultaneously (otherwise the leading term in the semiclassical limit becomes independent of the magnetic field), but always with the restrictions that $hL(B)^{-1}\to0$, $hl(B)^{-1}\to0$ and $h^3\|B\|l(B)^{-2}\to0$. The first two conditions are natural, as they require that the magnetic field should not change considerably on the usual semiclassical distance scale $h$. The role of the third condition was explained in the Introduction after Theorem 1.2.

This means that instead of a single magnetic field, we consider a one-parameter family of magnetic fields, $B_\tau$, parametrized by a real parameter $\tau\in(0,1)$. We also allow the potential $V=V_\tau$ and the semiclassical parameter $h=h_\tau$ to depend on $\tau$ in such a way that $h_\tau<1$, $\lim_{\tau\to0}h_\tau=0$, i.e. we consider a triple of one-parameter families of data $(h_\tau,B_\tau,V_\tau)$. Let $\mu(h,B):=h\max\{L(B)^{-1},l(B)^{-1}\}$, $\kappa(h,B):=h^3\|B\|l(B)^{-2}$ and let $\mu(\tau):=\mu(h_\tau,B_\tau)$, $\kappa(\tau):=\kappa(h_\tau,B_\tau)$ for brevity; then we require that these functions go to zero as $\tau\to0$.

To describe how the potential is allowed to depend on $B$ and $h$ (via $\tau$), we introduce the full 3D magnetic Lieb-Thirring error functional
\[
\begin{aligned}
F_{h,B}(V):=\;&h^{-3}\int|V|^{5/2}+\big(h^{-1}\|B\|+d(h,B)^{-2}\big)h^{-1}\int|V|^{3/2}\\
&+\big(h^{-1}\|B\|+d(h,B)^{-2}\big)d(h,B)^{-1}\int|V|+\big(h^{-1}\|B\|+d(h,B)^{-2}\big)\int|\nabla V|,
\end{aligned}
\tag{4.1}
\]
where
\[
d(h,B):=\min\big\{h^{1/4}\|B\|^{-1/4}l(B)^{1/2},\ L(B),\ l(B)\big\},
\]
and we also introduce the reduced 3D magnetic Lieb-Thirring error functional
\[
E_{h,B}(V):=h^{-3}\int|V|^{5/2}+\|B\|h^{-2}\int|V|^{3/2}
\tag{4.2}
\]


(in Sect. 4 all the integrals are on R3 unless otherwise specified). Recall that in the case of a large magnetic field (which is our main concern), namely if kBk ≥ max{L(B)−2 , l(B)−2 }, Fh,B is of order Z h−3 |V |5/2 + Z Z Z (4.3) −2 3/2 5/4 −5/4 −1/2 −1 |V | + kBk h |V | + kBkh |∇V | + |V | . l + kBkh With this notation, the Lieb-Thirring inequality in [ES-I] states that the sum of the negative eigenvalues e1 (H), e2 (H), . . . of the operator H = H(h, A, V ) satisfies the bound X |ek (H)| ≤ cFh,B ([V ]− ) ≤ cFh,B (V ) k

for any magnetic field B satisfying (1.7), (1.8), (1.9). We may now introduce that set of conditions on the triple (hτ , Bτ , Vτ ) which involve the potential. We require the following C− := lim inf C− (τ ) > 0

for

C− (τ ) := lim inf

|Escl (hτ , Bτ , Vτ )| > 0, −3 kBτ kh−2 τ + hτ

(4.4)

C+ := lim sup C+ (τ ) > 0

for

C+ (τ ) := lim sup

Ehτ ,Bτ , ([Vτ ]− ) < ∞, −3 kBτ kh−2 τ + hτ

(4.5)

τ →0

τ →0

τ →0

τ →0

and lim sup ε± (σ, r) = 0,

τ,r→0 σ≤τ

where ε+ (τ, r) := sup

|y|≤r

ε− (τ, r) := sup

|y|≤r

lim

sup ε2 (τ, %) = 0,

%→∞ τ ∈(0,1)

(4.6)

Ehτ ,Bτ ([Vτ ]+ − [Vτ (· − y)]+ ) , −3 kBτ kh−2 τ + hτ

(4.7)

Fhτ ,Bτ ([Vτ ]− − [Vτ (· − y)]− ) , −3 kBτ kh−2 τ + hτ

(4.8)

Fhτ ,Bτ ([Vτ χ% ]− − [Vτ ]− ) , (4.9) −3 kBτ kh−2 τ + hτ and here χ% denotes the characteristic function of the ball of radius % centered at the origin. Note that (4.4), (4.5) and P (B, W ) ≤ c(BW 3/2 + W 5/2 ) imply that ε2 (τ, %) :=

c≤

C+ (τ ) Ehτ ,Bτ (Vτ ) ≤ . |Escl (hτ , Bτ , Vτ )| C− (τ )

(4.10)

Remark. If $V$ is independent of $\tau$ (i.e. of $B_\tau$ and $h_\tau$), then the conditions follow simply from $V\in L^{5/2}\cap L^{3/2}$, $[V]_-\in W^{1,1}$, $|E_{\rm scl}(h_\tau,B_\tau,V)|\ge C(\|B_\tau\|h_\tau^{-2}+h_\tau^{-3})$ and $\mu(\tau),\kappa(\tau)\to0$ (use that $d(h,B)^{-2}\le h^{-2}(\mu^2(h,B)+\kappa(h,B)^{1/2})$).

Theorem 4.1 (3D Semiclassics). Let $e_1(H_\tau),e_2(H_\tau),\dots$ denote the negative eigenvalues of the operator $H_\tau=H(h_\tau,A_\tau,V_\tau)$ in (1.1). Assume that the triple of the physical data $(h_\tau,B_\tau,V_\tau)$ is such that
\[
\mu(\tau)=h_\tau\max\{L(B_\tau)^{-1},l(B_\tau)^{-1}\}\to0,\qquad\kappa(\tau)=h_\tau^3\|B_\tau\|l(B_\tau)^{-2}\to0\qquad(\text{as }\tau\to0),
\]
and it satisfies (4.4–4.6). Then
\[
\lim_{\tau\to0}\Big(\frac{\sum_ke_k(H_\tau)}{E_{\rm scl}(h_\tau,B_\tau,V_\tau)}-1\Big)=0.
\tag{4.11}
\]
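The inequality quoted in the Remark preceding Theorem 4.1 can be checked directly from the definitions of $d(h,B)$, $\mu(h,B)$ and $\kappa(h,B)$ given above; the following short computation is offered only as a sketch of that check. Since $d(h,B)$ is a minimum of three lengths, $d(h,B)^{-2}$ is the maximum of their inverse squares:
\[
d(h,B)^{-2}=\max\big\{h^{-1/2}\|B\|^{1/2}l(B)^{-1},\ L(B)^{-2},\ l(B)^{-2}\big\}
=h^{-2}\max\big\{\big(h^3\|B\|l(B)^{-2}\big)^{1/2},\ \big(hL(B)^{-1}\big)^2,\ \big(hl(B)^{-1}\big)^2\big\}.
\]
The first entry is $\kappa(h,B)^{1/2}$ and the maximum of the last two is $\mu(h,B)^2$, so
\[
d(h,B)^{-2}=h^{-2}\max\big\{\kappa(h,B)^{1/2},\ \mu(h,B)^2\big\}\le h^{-2}\big(\mu^2(h,B)+\kappa(h,B)^{1/2}\big).
\]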


Remark. From our proof it will be clear that in order to get the corresponding upper bound on the eigenvalue sum in (4.11), we can relax the condition (4.6) involving $\varepsilon_-$ and $\varepsilon_2$ by replacing $F_{h_\tau,B_\tau}$ by $E_{h_\tau,B_\tau}$ in their definitions (4.8)-(4.9).

4.1. Constant field approximation. Fix $\mu,\kappa>0$, $h\in(0,1)$ and consider a magnetic field $B$ such that $hL(B)^{-1},hl(B)^{-1}\le\mu$ and $h^3\|B\|l(B)^{-2}\le\kappa$. The goal of this section is to rewrite the kinetic energy part of the Pauli operator in terms of a spatial average of operators with constant magnetic field plus error terms. The error terms will go to zero as $\mu,\kappa,h\to0$ (notice that, in addition to $h$, $\kappa$ and $\mu$ also play the role of small parameters). Later, this representation will be used to obtain precise lower and upper bounds on the eigenvalue sum. The final result of this section is Proposition 4 below, but we have to introduce some notations before stating it.

We consider two functions, $\lambda=\lambda(\kappa,\mu,h)$ and $\Lambda=\Lambda(\kappa,\mu,h)$, which will be chosen later, but it is required that $\lambda(\kappa,\mu,h)\to0$ and $\Lambda(\kappa,\mu,h)\to\infty$ as $\kappa,\mu,h\to0$. First we introduce spherically symmetric functions $\theta_\lambda(x)=\lambda^{-3/2}\theta(x/\lambda)$ with $\theta\in C_0^\infty(\mathbf R^3)$, which localize at scale $\lambda=\lambda(\kappa,\mu,h)$, i.e. $\int_{\mathbf R^3}\theta_\lambda^2=1$, ${\rm supp}\,\theta_\lambda\subset B(0,\lambda)$ and $\int_{\mathbf R^3}(\nabla\theta_\lambda)^2\le c\lambda^{-2}$. Let $f\in C_0^\infty(\mathbf R^3,\mathbf C^2)$; then by the IMS localization formula (Lemma 1)
\[
\langle f|[\sigma\cdot(hp+A)]^2|f\rangle=\int\!\!\int|\sigma\cdot(hp+A(x))\theta_\lambda(x-u)f(x)|^2\,dx\,du-h^2\langle f|f\rangle\int(\nabla\theta_\lambda)^2
\tag{4.12}
\]
(the gradient operator $p$ always acts on the $x$ variable). We fix a point $u$ and now study
\[
\int|\sigma\cdot(hp+A(x))\theta_\lambda(x-u)f(x)|^2\,dx=\int|\sigma\cdot(hp+A)gf|^2,
\tag{4.13}
\]

where $g(x):=\theta_\lambda(x-u)$ (for the rest of this section $u$ and $\lambda$ remain fixed, so we omit them from the notation). Note that (4.13) is invariant under orthogonal coordinate transformation, so we can choose a coordinate system such that $\mathbf B(u)=(0,0,B(u))$, i.e. $n(u)=(0,0,1)$. Throughout this section we shall work in this coordinate system. We also define the spin projections as
\[
P^u_\pm=P_\pm:=\tfrac12\big(1\pm\sigma\cdot n(u)\big).
\tag{4.14}
\]
Moreover, we define the transversal coordinates, denoted by $x_\perp\in\mathbf R^2$, by $x=(x_\perp,x_3):=(\pi_{1,2}(x-x\cdot n(u)),\,x\cdot n(u))$ ($\pi_{1,2}$ is the standard projection on the first two components from $\mathbf R^3\to\mathbf R^2$), and $p_\perp$, $\sigma_\perp$ are defined analogously. Strictly speaking the notions of "perpendicular" and "third direction" depend on $u$, i.e. formally the notation $(x_\perp,x_3)=(x_{\perp(u)},x_{3(u)})$ would be meticulous, but in this section we omit the $u$ dependence.

We shall need a second localization, which is essentially identical to the localization given in Sect. 3.2. For the reader's convenience we recall that for any $u$ it is given by a function $\eta^u\in C_0^\infty(\mathbf R^2)$ with ${\rm supp}\,\eta^u\subset B(0,\lambda(\kappa,\mu,h))\subset\mathbf R^2$ which has the form $\eta^u:=\eta^{(0),u}\zeta$. Here $\zeta\in C_0^\infty(\mathbf R^2)$, ${\rm supp}\,\zeta\subset B(0,\lambda(\kappa,\mu,h))$, $|\nabla\zeta|\le c\lambda(\kappa,\mu,h)^{-1}$, $\zeta\equiv1$ on $B(0,\lambda(\kappa,\mu,h)/2)$, $0\le\zeta\le1$; and the function $\eta^{(0)}=\eta^{(0),u}$ is defined as
\[
\eta^{(0)}(v)=\eta^{(0),u}(v):=c_{{\rm norm},u}\,w_u^{-1}\,e^{-w_u^{-2}v^2}
\]
with a normalization constant $c_{{\rm norm},u}$ chosen such that $\int_{\mathbf R^2}(\eta^u)^2=1$. The scale $w_u=w_u(\kappa,\mu,h)$ of $\eta^{(0),u}$ is chosen as $w_u:=\Lambda(\kappa,\mu,h)(B^\#(u))^{-1/2}h^{1/2}$. Here we introduced the notation
\[
B^\#(u):=\sup_{x:\,|x-u|<2L(B)}\{|B(x)|\},
\]
and since $e^{-2}\le B(x)/B(y)\le e^2$ for all $|x-y|\le2L(B)$, we note that
\[
e^{-2}B^\#(u)\le B(u)\le B^\#(u).
\tag{4.15}
\]
It is easy to see that, although $c_{{\rm norm},u}$ depends on $\kappa,\mu,h,B^\#(u)$, and $\zeta$, there exists a universal positive constant $c$ such that
\[
\int_{\mathbf R^2}\eta^u(v)^2\,v^{2k}\,dv\le c\min\{w_u,\lambda(\kappa,\mu,h)\}^{2k}
\tag{4.16}
\]
for $k=0,1,2$ and that
\[
c^{-1}<\int_{{\rm supp}\,\zeta}(\eta^{(0),u})^2<c.
\]
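The $w_u^{2k}$ scaling in (4.16) can be seen on the pure Gaussian factor alone. The sketch below is only an illustration under simplifying assumptions: it drops the cutoff $\zeta$ and the normalization constant $c_{{\rm norm},u}$, so it does not capture the $\min\{w_u,\lambda\}$ on the right side of (4.16). It compares a radial midpoint-rule value of $w^{-2}\int_{\mathbf R^2}e^{-2|v|^2/w^2}|v|^{2k}\,dv$ with the closed form $\pi\,k!\,2^{-(k+1)}w^{2k}$. The function names are hypothetical.

```python
import math


def gaussian_moment_exact(w, k):
    """Closed form of  w^{-2} * int_{R^2} exp(-2|v|^2 / w^2) |v|^{2k} dv."""
    return math.pi * math.factorial(k) / 2 ** (k + 1) * w ** (2 * k)


def gaussian_moment_numeric(w, k, n=20000, cutoff=8.0):
    """Same integral by a midpoint rule in the radial variable r = |v|.

    The radial integral is truncated at r = cutoff * w, where the
    Gaussian weight is negligible.
    """
    dr = cutoff * w / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        # polar coordinates in R^2: dv = 2*pi*r dr
        total += math.exp(-2.0 * r * r / (w * w)) * r ** (2 * k + 1)
    return 2.0 * math.pi * total * dr / (w * w)


if __name__ == "__main__":
    for w in (0.3, 2.0):
        for k in (0, 1, 2):
            print(w, k, gaussian_moment_numeric(w, k), gaussian_moment_exact(w, k))
```

For $k=0$ the closed form is $\pi/2$ independently of $w$, which is why a normalization constant of order one suffices when $\zeta\equiv1$ on most of the Gaussian's mass.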

We define
\[
\eta^u_v(x)=\eta_v(x):=\eta(x_\perp-v),
\]
and similarly
\[
\zeta^u_v(x)=\zeta_v(x):=\zeta(x_\perp-v),\qquad\eta^{(0),u}_v(x)=\eta^{(0)}_v(x):=\eta^{(0)}(x_\perp-v)
\]
(note that $\eta_v$, $\eta^{(0)}_v$ and $\zeta_v$ are functions on $\mathbf R^3$ and their $u$-dependence is hidden in $w_u$ and in $\perp=\perp(u)$). Armed with these notations and definitions, we can state the main result of this section.

Proposition 4. There exist two universal constants $c_0\le1$ and $c_1\ge1$, and there exists a function $\varepsilon(\kappa,\mu,h):(0,c_0)\times(0,c_0)\times(0,c_0)\to(0,1/4)$, $\varepsilon(\kappa,\mu,h)\to0$ as $\kappa,\mu,h\to0$ (in fact the function $\varepsilon(\kappa,\mu,h)=c_1\max\{\kappa^{1/6},\mu^{1/3},h^{2/3}\}$ would do) with the following property: for any $0<h,\kappa,\mu<c_0$, for any magnetic field $B$ satisfying $hL(B)^{-1},hl(B)^{-1}\le\mu$, $h^3\|B\|l(B)^{-2}\le\kappa$ and for any $u\in\mathbf R^3$, $v\in\mathbf R^2$ there exist a phase function $\phi_{u,v}$ and constant magnetic fields $\hat{\mathbf B}^\pm_{u,v}$ (with strengths $\hat B^\pm_{u,v}$), parallel with $\mathbf B(u)$, satisfying
\[
|B(u)-\hat B^\pm_{u,v}|\le\varepsilon(\kappa,\mu,h)B(u)\qquad\text{for all $v$ with }|v-u_{\perp(u)}|\le2\lambda(\kappa,\mu,h)
\tag{4.17}
\]
with
\[
\lambda(\kappa,\mu,h):=\min\{c_1h\varepsilon(\kappa,\mu,h)^{-1},\ 10^{-4}h\mu^{-1},\ \varepsilon(\kappa,\mu,h)\},
\tag{4.18}
\]
such that for any $f\in C^\infty(\mathbf R^3,\mathbf C^2)$ and $g\in C_0^\infty(B(u,\lambda(\kappa,\mu,h)),\mathbf R)$ we have
\[
\int|\sigma\cdot(hp+A)gf|^2\ge\int_{\mathbf R^2}\!\int\Big[(1-\varepsilon(\kappa,\mu,h))\big|\sigma\cdot(hp+\hat A^+_{u,v})(e^{i\phi_{u,v}}\eta^u_vgf)(x)\big|^2-\varepsilon(\kappa,\mu,h)\big|(e^{i\phi_{u,v}}\eta^u_vgf)(x)\big|^2\Big]dx\,dv,
\tag{4.19}
\]
and for any $v$-dependent function $f^v\in C^\infty(\mathbf R^3,\mathbf C^2)$,
\[
\begin{aligned}
\int_{\mathbf R^2}\!\int\big|\sigma\cdot(hp+A)(e^{-i\phi_{u,v}}\eta^u_vgf^v)(x)\big|^2dx\,dv
\le\int_{\mathbf R^2}\!\int(\eta^u_vg)(x)^2\Big[&(1+\varepsilon(\kappa,\mu,h))\big|\sigma\cdot(hp+\hat A^-_{u,v})f^v(x)\big|^2\\
&+\varepsilon(\kappa,\mu,h)W_{u,v}(x_\perp)|f^v(x)|^2\\
&+\varepsilon(\kappa,\mu,h)W^+_{u,v}(x_\perp)|P_+f^v(x)|^2\Big]dx\,dv\\
+\,ch^2\int_{\mathbf R^2}\!\int|\nabla(\zeta_vg)(x)|^2\,&|(\eta^{(0),u}_vf^v)(x)|^2\,dx\,dv,
\end{aligned}
\tag{4.20}
\]
where
\[
W_{u,v}(x_\perp)=w_u^{-4}(x_\perp-v)^4+w_u^{-2}(x_\perp-v)^2+1,\qquad W^+_{u,v}(x_\perp)=hB^\#(u)\,w_u^{-2}(x_\perp-v)^2,
\]
\[
w_u=w_u(\kappa,\mu,h):=\Lambda(\kappa,\mu,h)(B^\#(u))^{-1/2}h^{1/2}\qquad\text{with}\qquad\Lambda(\kappa,\mu,h):=c_1\varepsilon(\kappa,\mu,h)^{-1},
\]
and the vector potentials
\[
\hat A^\pm_{u,v}:=\tfrac12\hat B^\pm_{u,v}\,n(u)\times x
\tag{4.21}
\]
generating the constant fields $\hat{\mathbf B}^\pm_{u,v}$.

Proof. Since $u$ is fixed, we shall omit it from the notation, but we recall that $f^v$, $w$, $\hat A^\pm_v$, $\phi_v$, $P_\pm$ and $\perp$ can and will depend on $u$ in the application. To simplify the formulas, we also drop the arguments from $\varepsilon=\varepsilon(\kappa,\mu,h)$, $\lambda=\lambda(\kappa,\mu,h)$ and $\Lambda=\Lambda(\kappa,\mu,h)$ in this proof.

Step 1. Separation of the spin up and spin down subspaces. Let $\alpha=\alpha(\kappa,\mu,h)\le(32e^2)^{-1}$ be chosen later. For any function $\psi$ (to be applied for $\psi=gf$) we have
\[
\begin{aligned}
\int|\sigma\cdot(hp+A)\psi|^2&=\int|(hp+A)\psi|^2-\int(\psi,h\sigma\cdot B\psi)\\
&=\int|(hp+A)P_-\psi|^2-\int(P_-\psi,h\sigma\cdot BP_-\psi)+\int|(hp+A)P_+\psi|^2\\
&\quad-\int(P_+\psi,h\sigma\cdot BP_+\psi)-\int(P_-\psi,h\sigma\cdot BP_+\psi)-\int(P_+\psi,h\sigma\cdot BP_-\psi)\\
&=\int|\sigma\cdot(hp+A)P_-\psi|^2+\int|\sigma\cdot(hp+A)P_+\psi|^2\\
&\quad-\int(P_-\psi,h\sigma\cdot BP_+\psi)-\int(P_+\psi,h\sigma\cdot BP_-\psi),
\end{aligned}
\tag{4.22}
\]
where we used that $P_\pm$ commute with $hp+A$. Therefore we obtain
\[
\Big|\int|\sigma\cdot(hp+A)gf|^2-\int|\sigma\cdot(hp+A)gP_-f|^2-\int|\sigma\cdot(hp+A)gP_+f|^2\Big|
\le2hB^\#(u)\Big(\alpha^{-1}l(B)^{-2}\lambda^2\int|gP_-f|^2+\alpha\int|gP_+f|^2\Big),
\tag{4.23}
\]
using
\[
\begin{aligned}
h|(P_+\psi(x),\sigma\cdot B(x)P_-\psi(x))|&=hB(x)\,|(P_+\psi(x),\sigma\cdot(n(x)-n(u))P_-\psi(x))|\\
&\le2hB^\#(u)\,l(B)^{-1}\lambda\,|P_+\psi(x)||P_-\psi(x)|\\
&\le hB^\#(u)\big(\alpha|P_+\psi(x)|^2+\alpha^{-1}l(B)^{-2}\lambda^2|P_-\psi(x)|^2\big),
\end{aligned}
\]
which is valid by Cauchy-Schwarz and the relation $P_+\sigma\cdot n(u)P_-=0$ for any function $\psi\,(=gf)$ supported in the ball $B(u,\lambda)$. Therefore, we have a lower bound
\[
\int|\sigma\cdot(hp+A)gf|^2\ge\int\Big[|\sigma\cdot(hp+A)gP_-f|^2-2hB^\#(u)\alpha^{-1}l(B)^{-2}\lambda^2|gP_-f|^2\Big]+\int\Big[|\sigma\cdot(hp+A)gP_+f|^2-2hB^\#(u)\alpha|gP_+f|^2\Big],
\tag{4.24}
\]
and, similarly, an upper bound,
\[
\int|\sigma\cdot(hp+A)gf|^2\le\int\Big[|\sigma\cdot(hp+A)gP_-f|^2+2hB^\#(u)\alpha^{-1}l(B)^{-2}\lambda^2|gP_-f|^2\Big]+\int\Big[|\sigma\cdot(hp+A)gP_+f|^2+2hB^\#(u)\alpha|gP_+f|^2\Big].
\tag{4.25}
\]
Step 2. Second localization. We separately consider the kinetic energies of $P_\pm f=:f_\pm$ appearing in (4.24) and (4.25). For the lower bound we write
\[
\int|\sigma\cdot(hp+A)gf_\pm|^2=\int_{\mathbf R^2}\!\int\eta(x_\perp-v)^2\,|\sigma\cdot(hp+A)(gf_\pm)(x)|^2\,dx\,dv.
\tag{4.26}
\]

Since ${\rm supp}\,g\subset B(u,\lambda)$ $(\subset\mathbf R^3)$ and ${\rm supp}\,\eta\subset B(0,\lambda)$ $(\subset\mathbf R^2)$, the above integral can be restricted to $x\in B(u,\lambda)\subset\mathbf R^3$ and $v\in B(u_\perp,2\lambda)\subset\mathbf R^2$. For any $v\in B(u_\perp,2\lambda)$, we can approximate the magnetic field by a constant field $\mathbf B_v=(0,0,B_v)$ such that for all $x\in B(u,2\lambda)$,
\[
|\mathbf B(x)-\mathbf B_v|\le cB^\#(u)\lambda l(B)^{-1}+cB^\#(u)L(B)^{-1}|x_\perp-v|,
\tag{4.27}
\]
by applying Corollary 11 from the Appendix with the particular choice of a cylinder with center in $(v,u_3)$, length $4\lambda$ and radius $|x_\perp-v|\le4\lambda$, and using $\lambda L(B)^{-1}\le\lambda h^{-1}\mu\le10^{-4}$ and a similar estimate for $\lambda l(B)^{-1}$. Notice that the proof of Corollary 11 requires $\lambda l(B)^{-1}\le[(6+3\sqrt3)(8+\sqrt3)]^{-1}$ (see (A.6)), but this is satisfied given $\lambda\le10^{-4}h\mu^{-1}$ and $l(B)^{-1}\le h^{-1}\mu$. We can then approximate $A$ by $A_v$, ${\rm rot}\,A_v=\mathbf B_v$, applying the formula (A.8) from Proposition 12 and using (A.3) to estimate $\nabla(\mathbf B(x)-\mathbf B_v)=\nabla\mathbf B(x)$, such that
\[
|A(x)-A_v(x)|\le cB^\#(u)\lambda l(B)^{-1}|x_\perp-v|+cB^\#(u)L(B)^{-1}|x_\perp-v|^2
\tag{4.28}
\]


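for all $x\in B(u,\lambda)$, $v\in B(u_\perp,2\lambda)$ (the potential $A_v$ may not be $C^1$ but only $C^0$, but that is all we shall need). Using Cauchy-Schwarz, we have, for any $0<\delta<1/2$,
\[
\begin{aligned}
\int|\sigma\cdot(hp+A)gf_\pm|^2\ge\;&(1-\delta)\int_{\mathbf R^2}\!\int\eta(x_\perp-v)^2\,|\sigma\cdot(hp+A_v)(gf_\pm)(x)|^2\,dx\,dv\\
&-c\delta^{-1}\int_{\mathbf R^2}\!\int\eta(x_\perp-v)^2\,|A(x)-A_v(x)|^2\,|(gf_\pm)(x)|^2\,dx\,dv.
\end{aligned}
\tag{4.29}
\]
We shall choose $\delta=\delta(\kappa,\mu,h)$ at the end of this section. For the upper bound, we have to consider functions $f^v$ depending on $v$, and we shall need the bound
\[
\begin{aligned}
\int_{\mathbf R^2}\!\int|\sigma\cdot(hp+A)(e^{-i\phi_v}\eta_vgf^v_\pm)(x)|^2\,dx\,dv\le\;&(1+\delta)\int_{\mathbf R^2}\!\int|\sigma\cdot(hp+A_v)(e^{-i\phi_v}\eta_vgf^v_\pm)(x)|^2\,dx\,dv\\
&+c\delta^{-1}\int_{\mathbf R^2}\!\int\eta(x_\perp-v)^2\,|A(x)-A_v(x)|^2\,|(gf^v_\pm)(x)|^2\,dx\,dv,
\end{aligned}
\tag{4.30}
\]
where the phase function $\phi_v:\mathbf R^3\to\mathbf R$ will be chosen below. To control the error in (4.29), we shall apply the following estimate, which is valid uniformly for all $x\in B(u,\lambda)$,
\[
\begin{aligned}
\delta^{-1}\int_{|v-u_\perp|\le2\lambda}\eta(x_\perp-v)^2\,|A(x)-A_v(x)|^2\,dv
&\le c\delta^{-1}(B^\#(u))^2\int_{\mathbf R^2}\eta(v)^2\big(\lambda^2l(B)^{-2}v^2+L(B)^{-2}v^4\big)\,dv\\
&\le c\delta^{-1}(B^\#(u))^2\big(w^2\lambda^2l(B)^{-2}+h^{-2}\mu^2w^4\big)
\end{aligned}
\tag{4.31}
\]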

using (4.28), (4.16) and, in the applications, for the error term in (4.30), we will need a uniform upper bound for $f^v(x)$. We choose a gauge transformation $\phi_{u,v}=\phi_v$ such that
\[
(A_v-\nabla\phi_v)(x)=(1/2)B_v\,n(u)\times(x-(v,x_3)),
\]
that determines $\phi_v$ (up to an irrelevant constant). Notice that $A_v-\nabla\phi_v$ does not depend on $x_3$ and has zero third component. Therefore the main terms in (4.29) and (4.30) can be written as
\[
\begin{aligned}
&\int_{\mathbf R^2}\!\int\eta(x_\perp-v)^2\,|\sigma\cdot(hp+A_v)(gf_\pm)(x)|^2\,dx\,dv\\
&\qquad=\int_{\mathbf R^2}\!\int\eta(x_\perp-v)^2\Big[|\sigma_\perp\cdot(hp+A_v-\nabla\phi_v)_\perp(e^{i\phi_v}gf_\pm)(x)|^2+|hp_3(e^{i\phi_v}gf_\pm)(x)|^2\Big]dx\,dv,
\end{aligned}
\tag{4.32}
\]
and
\[
\begin{aligned}
&\int_{\mathbf R^2}\!\int|\sigma\cdot(hp+A_v)(e^{-i\phi_v}\eta_vgf^v_\pm)(x)|^2\,dx\,dv\\
&\qquad=\int_{\mathbf R^2}\!\int\Big[|\sigma_\perp\cdot(hp+A_v-\nabla\phi_v)_\perp(\eta_vgf^v_\pm)(x)|^2+|hp_3(\eta_vgf^v_\pm)(x)|^2\Big]dx\,dv.
\end{aligned}
\tag{4.33}
\]
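The Cauchy-Schwarz step behind (4.29) and (4.30) is, pointwise, the elementary pair of inequalities $|a+b|^2\ge(1-\delta)|a|^2-(\delta^{-1}-1)|b|^2$ and $|a+b|^2\le(1+\delta)|a|^2+(1+\delta^{-1})|b|^2$, applied with $a=\sigma\cdot(hp+A_v)\psi$ and $b=\sigma\cdot(A-A_v)\psi$. The following sketch, with hypothetical function names, tests these two inequalities on random complex 2-vectors (standing in for spinor values at a point); it is meant only as a sanity check of the constants, not as part of the proof.

```python
import random


def norm_sq(vec):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(z) ** 2 for z in vec)


def weighted_bounds(a, b, delta):
    """Lower and upper bounds on |a + b|^2, valid for 0 < delta < 1."""
    na, nb = norm_sq(a), norm_sq(b)
    lower = (1.0 - delta) * na - (1.0 / delta - 1.0) * nb
    upper = (1.0 + delta) * na + (1.0 + 1.0 / delta) * nb
    return lower, upper


def check_random(trials=2000, dim=2, seed=0):
    """Return True if both bounds hold on all random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
        b = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
        delta = rng.uniform(0.01, 0.49)
        total = norm_sq([x + y for x, y in zip(a, b)])
        lower, upper = weighted_bounds(a, b, delta)
        # small tolerance for floating-point rounding
        if not (lower - 1e-9 <= total <= upper + 1e-9):
            return False
    return True


if __name__ == "__main__":
    print(check_random())
```

Both inequalities follow from $2|{\rm Re}\,(a,b)|\le\delta|a|^2+\delta^{-1}|b|^2$, which is why the error terms in (4.29) and (4.30) carry the factor $\delta^{-1}$.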


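Step 3. Magnetic localization. Using the estimates in Lemma 2 with the same $\delta$ as above and $k(x_\perp)=\zeta(x_\perp)(gf_\pm)(x_\perp+v,x_3)$ for each $x_3$ (and including the phase factor in the case of (2.3)), then shifting $x_\perp\to x_\perp-v$, we obtain from (4.32), (4.33), (4.16) and the estimates in the IMS Lemma 1 (once for $\zeta_v$, once for $g\zeta_v$) that
\[
\begin{aligned}
\int_{\mathbf R^2}\!\int\eta(x_\perp-v)^2\,|\sigma\cdot(hp+A_v)(gf_\pm)(x)|^2\,dx\,dv
&\ge\int_{\mathbf R^2}\!\int\eta^{(0)}(x_\perp-v)^2\,|\sigma\cdot(hp+A_v)(\zeta_vgf_\pm)(x)|^2\,dx\,dv-ch^2\sup(\nabla\zeta)^2\cdot\int|gf_\pm|^2\\
&\ge(1-\delta)\int_{\mathbf R^2}\!\int|\sigma\cdot(hp+\hat A^+_v)(e^{i\phi_v}\eta_vgf_\pm)(x)|^2\,dx\,dv\\
&\quad-ch^2\lambda^{-2}\int|gf_\pm|^2-c\delta^{-1}h^2w^{-2}\int|P_+gf_\pm|^2,
\end{aligned}
\tag{4.34}
\]
and
\[
\begin{aligned}
\int_{\mathbf R^2}\!\int|\sigma\cdot(hp+A_v)(e^{-i\phi_v}\eta_vgf^v_\pm)(x)|^2\,dx\,dv
&\le(1+\delta)\int_{\mathbf R^2}\!\int\eta^{(0)}(x_\perp-v)^2\,|\sigma\cdot(hp+\hat A^-_v)(\zeta_vgf^v_\pm)(x)|^2\,dx\,dv\\
&\quad+c\delta^{-1}h^2w^{-2}\int_{\mathbf R^2}\!\int|P_+(\eta_vgf^v_\pm)(x)|^2\,dx\,dv\\
&\le(1+\delta)\int_{\mathbf R^2}\!\int\eta_v(x)^2g(x)^2\,|\sigma\cdot(hp+\hat A^-_v)f^v_\pm(x)|^2\,dx\,dv\\
&\quad+(1+\delta)h^2\int_{\mathbf R^2}\!\int|\nabla(\zeta_vg)(x)|^2\,|\eta^{(0)}_vf^v(x)|^2\,dx\,dv\\
&\quad+c\delta^{-1}h^2w^{-4}\int_{\mathbf R^2}\!\int(x_\perp-v)^2\,|P_+(\eta_vgf^v_\pm)(x)|^2\,dx\,dv,
\end{aligned}
\tag{4.35}
\]
where the vector potentials $\hat A^\pm_{u,v}=\hat A^\pm_v(x)$ defined in (4.21) generate the constant fields $\hat{\mathbf B}^\pm_{u,v}=\hat{\mathbf B}^\pm_v:=(0,0,B_v\pm4hw^{-2})$.

To estimate how well this constant field approximates the original one, for $|v-u_\perp|\le2\lambda$ we obtain (using (4.27) for $x=(v,u_3)$, $L(B)^{-1},l(B)^{-1}\le h^{-1}\mu$) that
\[
\begin{aligned}
|B(u)-\hat B^\pm_v|&\le|\mathbf B(u)-\hat{\mathbf B}^\pm_v|\\
&\le|\mathbf B(u)-\mathbf B(v,u_3)|+|\mathbf B(v,u_3)-\mathbf B_v|+4hw^{-2}\\
&\le2B^\#(u)h^{-1}\mu\lambda+cB^\#(u)\lambda l(B)^{-1}+4hw^{-2}\\
&\le cB^\#(u)(h^{-1}\mu\lambda+\Lambda^{-2})\le cB(u)(h^{-1}\mu\lambda+\Lambda^{-2}).
\end{aligned}
\tag{4.36}
\]
In the last step we used (4.15). In order to prove (4.17) we must therefore choose $\varepsilon=\varepsilon(\kappa,\mu,h)$ such that
\[
c(h^{-1}\mu\lambda+\Lambda^{-2})\le\varepsilon.
\tag{4.37}
\]
We also get a lower bound ($\varepsilon\le1/2$)
\[
\hat B^\pm_v\ge(1-\varepsilon)B(u)\ge(1/2)e^{-2}B^\#(u)
\]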


for all $v$ such that $|v-u_\perp|\le2\lambda$ (in particular (4.36) and (4.37) imply that $\mathbf B(u)$, $\hat{\mathbf B}^-_v$ and $\hat{\mathbf B}^+_v$ point in the same direction). Since for all $k\in C_0^\infty(\mathbf R^3,\mathbf C^2)$ we have
\[
\int|\sigma\cdot(hp+\hat A^\pm_v)P_+k|^2\ge h\hat B^\pm_v\int|P_+k|^2\ge(1/2)e^{-2}hB^\#(u)\int|P_+k|^2,
\tag{4.38}
\]
we see that the last error terms in (4.24), (4.25) and (4.34), which are nonzero only for $f_+$, can be absorbed into the kinetic energy (with the constant direction magnetic field $\hat{\mathbf B}^\pm_v$) if $\alpha\le(32e^2)^{-1}$ and
\[
R:=c\delta^{-1}h^2w^{-2}(hB^\#(u))^{-1}=c\delta^{-1}\Lambda^{-2}=cc_1^{-2}\delta^{-1}\varepsilon^2<(8e^2)^{-1}
\tag{4.39}
\]
(using the definition of $w$ and $\Lambda$). We put together (4.24), (4.29), (4.31), (4.32) and (4.34); the final lower bound of this section is
\[
\begin{aligned}
\int|\sigma\cdot(hp+A)gf|^2\ge\int_{\mathbf R^2}\!\int\Big[&(1-\delta)^2\,|\sigma\cdot(hp+\hat A^+_v)(e^{i\phi_v}\eta_vgP_-f)(x)|^2\\
&-(S+Q+ch^2\lambda^{-2})\,|(e^{i\phi_v}\eta_vgP_-f)(x)|^2\\
&+\big[(1-\delta)^2(1-R)-4e^2\alpha\big]\,|\sigma\cdot(hp+\hat A^+_v)(e^{i\phi_v}\eta_vgP_+f)(x)|^2\\
&-(Q+ch^2\lambda^{-2})\,|(e^{i\phi_v}\eta_vgP_+f)(x)|^2\Big]dx\,dv
\end{aligned}
\tag{4.40}
\]
with
\[
S:=2hB^\#(u)\alpha^{-1}l(B)^{-2}\lambda^2,\qquad Q:=Q_1+Q_2,
\]
where
\[
Q_1:=c\delta^{-1}(B^\#(u))^2w^2\lambda^2l(B)^{-2}\qquad\text{and}\qquad Q_2:=c\delta^{-1}(B^\#(u))^2h^{-2}\mu^2w^4.
\]
Using $h^3\|B\|l(B)^{-2}\le\kappa$, $B^\#(u)\le\|B\|$ and the definitions of $w$ and $\Lambda$ we see that
\[
S\le2h^{-2}\alpha^{-1}\lambda^2\kappa,\qquad Q\le cc_1^2\delta^{-1}h^{-2}\varepsilon^{-2}\lambda^2\kappa+cc_1^4\delta^{-1}\varepsilon^{-4}\mu^2.
\tag{4.41}
\]
For the upper bound we use (4.25) for $f=e^{-i\phi_v}\eta_vf^v$, then combine it with (4.30), (4.31), (4.33) and (4.35) to obtain
\[
\begin{aligned}
\int_{\mathbf R^2}\!\int|\sigma\cdot(hp+A)(e^{-i\phi_v}\eta_vgf^v)(x)|^2\,dx\,dv
\le\int_{\mathbf R^2}\!\int\eta(x_\perp-v)^2g(x)^2\Big\{&(1+\delta)^2\,|\sigma\cdot(hp+\hat A^-_v)(P_-f^v)(x)|^2\\
&+\big[S+Q_1w^{-2}(x_\perp-v)^2+Q_2w^{-4}(x_\perp-v)^4\big]\,|P_-f^v(x)|^2\\
&+\big[(1+\delta)^2+4e^2\alpha\big]\,|\sigma\cdot(hp+\hat A^-_v)(P_+f^v)(x)|^2\\
&+\big[(RB^\#(u)h+Q_1)w^{-2}(x_\perp-v)^2+Q_2w^{-4}(x_\perp-v)^4\big]\,|P_+f^v(x)|^2\Big\}dx\,dv\\
+\,ch^2\int_{\mathbf R^2}\!\int|(\nabla(g\zeta_v))(x)|^2\,&|\eta^{(0)}_vf^v(x)|^2\,dx\,dv.
\end{aligned}
\tag{4.42}
\]


In order to prove the proposition we must show that we can choose numbers $c_0$, $c_1$ and positive functions $\varepsilon=\varepsilon(\kappa,\mu,h)\le1/4$, $\delta=\delta(\kappa,\mu,h)$ and $\alpha=\alpha(\kappa,\mu,h)\le(32e^2)^{-1}$ defined on $(0,c_0)\times(0,c_0)\times(0,c_0)$ such that $\varepsilon(\kappa,\mu,h)\to0$ (as $\kappa,\mu,h\to0$) and such that (4.37) is satisfied and the following requirements are met with $\Lambda(\kappa,\mu,h):=c_1\varepsilon(\kappa,\mu,h)^{-1}$:
\[
R\le\varepsilon,\qquad Q+ch^2\lambda^{-2}+S\le\varepsilon,\qquad(1+\delta)^2+4e^2\alpha\le(1+\varepsilon),\qquad(1-\delta)^2(1-R)-4e^2\alpha\ge(1-\varepsilon).
\tag{4.43}
\]
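The two purely algebraic requirements in (4.43), together with the side condition $\alpha\le(32e^2)^{-1}$, can be probed numerically for the choice $\delta=\varepsilon/10$, $\alpha=\varepsilon/(8e^2)$ made in the proof. The sketch below is a sanity check under stated assumptions, not a substitute for the "short calculation" in the text: the quantity $R$ involves unspecified universal constants, so it is modelled here as $R=r\,\varepsilon$ with a hypothetical ratio `r_ratio` (default 0.3), and the requirement $Q+ch^2\lambda^{-2}+S\le\varepsilon$ is not tested at all. The function names are likewise hypothetical.

```python
import math

E2 = math.e ** 2  # the constant e^2 appearing in (4.15) and (4.43)


def choices(eps):
    """The choice delta = eps/10, alpha = eps/(8 e^2) from the proof."""
    return eps / 10.0, eps / (8.0 * E2)


def requirements_hold(eps, r_ratio=0.3, tol=1e-12):
    """Check the algebraic parts of (4.43) for 0 < eps <= 1/4.

    R is modelled as r_ratio * eps (an assumption; the true R depends
    on universal constants that the text does not specify).
    """
    delta, alpha = choices(eps)
    big_r = r_ratio * eps
    side = alpha <= 1.0 / (32.0 * E2) + tol
    upper = (1.0 + delta) ** 2 + 4.0 * E2 * alpha <= 1.0 + eps + tol
    lower = (1.0 - delta) ** 2 * (1.0 - big_r) - 4.0 * E2 * alpha >= 1.0 - eps - tol
    return side and upper and lower


if __name__ == "__main__":
    grid = [0.25 * i / 1000 for i in range(1, 1001)]
    print(all(requirements_hold(eps) for eps in grid))
```

With this choice $4e^2\alpha=\varepsilon/2$, so the third requirement reduces to $\varepsilon/5+\varepsilon^2/100\le\varepsilon/2$, and the fourth holds as long as $R$ stays below roughly $0.3\,\varepsilon$, which is the role of taking $c_1$ large.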

In particular, $R\le\varepsilon$, and therefore a small enough choice of $c_0$ will imply (4.39). One possible choice is $\varepsilon:=c_1\max\{\kappa^{1/6},\mu^{1/3},h^{2/3}\}$, $\delta:=\varepsilon/10$, $\alpha:=\varepsilon/(8e^2)$. Then a short calculation, with the help of (4.41), shows that all the requirements are met for small enough (universal) $c_0$ and large enough (universal) $c_1$ (appearing in (4.18)). This finishes the proof of Proposition 4.

4.2. Lower bound. We shall here prove the one-sided bound
\[
\limsup_{\tau\to0}\frac{\sum_ke_k(H_\tau)}{E_{\rm scl}(h_\tau,B_\tau,V_\tau)}\le1
\tag{4.44}
\]
in Theorem 4.1. Since $E_{\rm scl}(h_\tau,B_\tau,V_\tau)\le0$, this is a lower bound on the eigenvalue sum $\sum_ke_k(H_\tau)$. For this bound, we can replace $V_\tau$ by its negative part, $-[V_\tau]_-$. Obviously for small enough $h_\tau$ we have $h_\tau L(B_\tau)^{-1}\le c_0$, $h_\tau l(B_\tau)^{-1}\le c_0$ and $h_\tau^3\|B_\tau\|l(B_\tau)^{-2}\le c_0$, therefore we can apply Proposition 4 with $B=B_\tau$ and $\kappa=\kappa(\tau)$, $\mu=\mu(\tau)$, where $\kappa(\tau)=h_\tau^3\|B_\tau\|l(B_\tau)^{-2}$ and $\mu(\tau)=h_\tau\max\{L(B_\tau)^{-1},l(B_\tau)^{-1}\}$. We also introduce $d_\tau:=h_\tau\min\{\kappa(\tau)^{-1/4},\mu(\tau)^{-1}\}$. Define $\tilde\lambda(\tau):=\lambda(\kappa(\tau),\mu(\tau),h_\tau)$ and $\tilde\varepsilon(\tau):=\varepsilon(\kappa(\tau),\mu(\tau),h_\tau)$, where the functions $\lambda(\kappa,\mu,h)$ and $\varepsilon(\kappa,\mu,h)$ were obtained in Proposition 4. Remark that $\tilde\lambda(\tau)\le\tilde\varepsilon(\tau)\to0$ as $\tau\to0$. Instead of the original Pauli operator $H_\tau$, we are going to study
\[
\tilde H_\tau:=(1-2\delta_1)\Theta_\varrho[\sigma\cdot(h_\tau p+A_\tau)]^2\Theta_\varrho-[V_\tau\chi_\varrho]_-*\theta_\lambda^2,
\tag{4.45}
\]
where $\theta_\lambda$ is the function defined in Sect. 4.1 with lengthscale $\lambda=\tilde\lambda(\tau)$ (for brevity we use $\lambda$ for $\tilde\lambda(\tau)$) and $\chi_\varrho$ denotes the characteristic function of the ball of radius $\varrho\ge1$ centered at the origin in $\mathbf R^3$, and $\Theta_\varrho\in C_0^\infty(\mathbf R^3)$ satisfies $\Theta_\varrho(x)=1$ on $B(0,\varrho+\lambda)$, ${\rm supp}\,\Theta_\varrho\subset B(0,\varrho+2\lambda)$, and $|\nabla\Theta_\varrho|\le c\lambda^{-1}$. Here $0<\delta_1=\delta_1(\tau)<1/4$ and $\varrho=\varrho(\tau)$ as functions of $\tau$ will be chosen at the end of this section. Let $\theta_{u,\lambda}(x):=\theta_\lambda(x-u)$. We then use the localization formula (2.1) to write
\[
\langle f|\tilde H_\tau|f\rangle=\int\langle f\theta_{u,\lambda}|\tilde H_{u,\tau}|\theta_{u,\lambda}f\rangle\,du-c(1-2\delta_1)h_\tau^2\lambda^{-2}\langle f|\Theta_\varrho^2|f\rangle,
\]
for any $f\in C_0^\infty(\mathbf R^3,\mathbf C^2)$, where
\[
\tilde H_{u,\tau}=(1-2\delta_1)\Theta_\varrho[\sigma\cdot(h_\tau p+A_\tau)]^2\Theta_\varrho-[V_\tau\chi_\varrho(u)]_-.
\]
The relation between the kinetic energy parts of $H_\tau$ and $\tilde H_\tau$ is given by the pointwise inequality in Lemma 1,
\[
\langle f|[\sigma\cdot(h_\tau p+A_\tau)]^2|f\rangle\ge(1-\delta_1)\langle f|\Theta_\varrho[\sigma\cdot(h_\tau p+A_\tau)]^2\Theta_\varrho|f\rangle-c\delta_1^{-1}h_\tau^2\langle f|(\nabla\Theta_\varrho)^2|f\rangle.
\tag{4.46}
\]


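In order to merge the two error terms from these two localizations, we introduce the function $\Theta^*_\varrho\in C_0^\infty(\mathbf R^3)$, satisfying $\Theta^*_\varrho(x)=1$ on $B(0,\varrho+2\lambda)$, ${\rm supp}\,\Theta^*_\varrho\subset B(0,\varrho+3\lambda)$, and $|\nabla\Theta^*_\varrho|\le c\lambda^{-1}$. Then obviously
\[
c(1-2\delta_1)h_\tau^2\lambda^{-2}\Theta_\varrho^2+c\delta_1^{-1}h_\tau^2(\nabla\Theta_\varrho)^2\le c\delta_1^{-1}h_\tau^2\lambda^{-2}(\Theta^*_\varrho)^2.
\tag{4.47}
\]
Therefore we may write, using (4.46), (4.47),
\[
\langle f|H_\tau|f\rangle\ge\int\langle f\theta_{u,\lambda}|\tilde H_{u,\tau}|\theta_{u,\lambda}f\rangle\,du+\langle f|H_{{\rm err},\tau}|f\rangle,
\tag{4.48}
\]
where
\[
H_{{\rm err},\tau}:=\delta_1[\sigma\cdot(h_\tau p+A_\tau)]^2+W_{{\rm err},\tau}
\tag{4.49}
\]
with
\[
W_{{\rm err},\tau}=[V_\tau\chi_\varrho]_-*(\theta_\lambda)^2-[V_\tau]_--c\delta_1^{-1}h_\tau^2\lambda^{-2}(\Theta^*_\varrho)^2.
\]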

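$H_{{\rm err},\tau}$ shall be treated by the LT inequality. To estimate the effect of $H_{{\rm err},\tau}$, we use the full 3D magnetic Lieb-Thirring error functional. We have
\[
\begin{aligned}
F_{h_\tau,B_\tau}\big(\delta_1^{-1}W_{{\rm err},\tau}\big)
&\le C\delta_1^{-5/2}\Big\{F_{h_\tau,B_\tau}\big([V_\tau\chi_\varrho]_-*\theta_\lambda^2-[V_\tau]_-*\theta_\lambda^2\big)+F_{h_\tau,B_\tau}\big([V_\tau]_-*\theta_\lambda^2-[V_\tau]_-\big)+F_{h_\tau,B_\tau}\big(c\delta_1^{-1}h_\tau^2\lambda^{-2}(\Theta^*_\varrho)^2\big)\Big\}\\
&\le C\delta_1^{-5/2}\Big\{F_{h_\tau,B_\tau}\big([V_\tau\chi_\varrho]_--[V_\tau]_-\big)+F_{h_\tau,B_\tau}\big([V_\tau]_-*\theta_\lambda^2-[V_\tau]_-\big)+F_{h_\tau,B_\tau}\big(c\delta_1^{-1}h_\tau^2\lambda^{-2}(\Theta^*_\varrho)^2\big)\Big\},
\end{aligned}
\]
where we used Jensen's inequality. Note that Jensen also implies that
\[
F_{h_\tau,B_\tau}\big([V_\tau]_-*\theta_\lambda^2-[V_\tau]_-\big)\le\int F_{h_\tau,B_\tau}\big([V_\tau]_--[V_\tau(\cdot-y)]_-\big)\,\theta_\lambda(y)^2\,dy.
\]
The integrand vanishes unless $|y|\le\lambda$. Since $\lambda=\tilde\lambda(\tau)\to0$ as $\tau\to0$ it follows from (4.4-4.8), (4.10) and dominated convergence that this term tends to zero relative to $|E_{\rm scl}(h_\tau,B_\tau,V_\tau)|$ as $\tau\to0$. If we also use condition (4.9) and the definition of $\lambda=\tilde\lambda(\tau)=\lambda(\kappa(\tau),\mu(\tau),h_\tau)$ (see (4.18)), after a short computation, using $h_\tau d_\tau^{-1}=\max\{\kappa(\tau)^{1/4},\mu(\tau)\}\le1$ (if $\tau$ is small enough), we see that
\[
\begin{aligned}
\frac{F_{h_\tau,B_\tau}\big(\delta_1^{-1}W_{{\rm err},\tau}\big)}{|E_{\rm scl}(h_\tau,B_\tau,V_\tau)|}
&\le c\delta_1^{-5/2}C_-(\tau)^{-1}\Big(\varepsilon_2(\tau,\varrho)+\sup_{|y|\le\tilde\lambda(\tau)}\varepsilon_-(\tau,y)
+\big(R_\tau^{5/2}+R_\tau^{3/2}+R_\tau+R_\tau\tilde\lambda(\tau)^{-1}h_\tau\big)(\varrho+3\tilde\lambda(\tau))^3\Big)\\
&\le c\delta_1^{-5}C_-(\tau)^{-1}\Big(\varepsilon_2(\tau,\varrho)+\sup_{|y|\le\tilde\lambda(\tau)}\varepsilon_-(\tau,y)+\tilde\varepsilon(\tau)(\varrho+3\tilde\lambda(\tau))^3\Big)
\end{aligned}
\tag{4.50}
\]
with $R_\tau:=\delta_1^{-1}h_\tau^2\tilde\lambda(\tau)^{-2}$ and using $h_\tau^2\tilde\lambda^{-2}(\tau)\le\tilde\varepsilon(\tau)\le1/2$ (see (4.43)) and $|\nabla\Theta^*_\varrho|\le c\tilde\lambda(\tau)^{-1}$.

Let $f_1,f_2,\dots,f_N$ be a family of compactly supported smooth orthonormal spinors. We want to estimate
\[
\sum_{j=1}^N\langle f_j|H_\tau|f_j\rangle\ge\sum_{j=1}^N\int\langle f_j\theta_{u,\lambda}|\tilde H_{u,\tau}|\theta_{u,\lambda}f_j\rangle\,du+\sum_{j=1}^N\langle f_j|H_{{\rm err},\tau}|f_j\rangle
\tag{4.51}
\]
uniformly in $N$. By the Lieb-Thirring inequality we have
\[
\sum_{j=1}^N\langle f_j|H_{{\rm err},\tau}|f_j\rangle\ge-c\delta_1F_{h_\tau,B_\tau}\big(\delta_1^{-1}W_{{\rm err},\tau}\big),
\tag{4.52}
\]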

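which will be controlled using (4.50). For the main term, we use
\[
\sum_{j=1}^N\langle f_j\theta_{u,\lambda}|\tilde H_{u,\tau}|\theta_{u,\lambda}f_j\rangle
\ge\sum_{j=1}^N\int\Big[(1-2\delta_1)\big|\sigma\cdot(h_\tau p+A_\tau)(\Theta_\varrho\theta_{u,\lambda}f_j)(x)\big|^2
-\int[V_\tau\chi_\varrho(u)]_-\big|(\theta_{u,\lambda}\eta^u_v\Theta_\varrho f_j)(x)\big|^2\,dv\Big]dx,
\tag{4.53}
\]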

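where $\eta^u$ is defined in Sect. 4.1 (note that we could insert $\Theta_\varrho$ for free in the potential term thanks to the supports of $\chi_\varrho(u)$ and $\theta_{u,\lambda}(x)$). For each $\nu\in\mathbf N$, $u\in\mathbf R^3$, $v\in B(u_{\perp(u)},2\lambda)$, $p\in\mathbf R$ we define the following positive operators via their kernel:
\[
\begin{aligned}
\Pi^\pm_\tau(\nu,u,v,p)(x,y):=\;&\theta_\lambda(x-u)\,\eta^u(x_{\perp(u)}-v)\,e^{-i\phi_{u,v}(x)}\\
&\times\Pi^{\nu,\pm}_{u,v}(x_{\perp(u)},y_{\perp(u)})\,e^{ip(x_{3(u)}-y_{3(u)})}\,e^{i\phi_{u,v}(y)}\,\eta^u(y_{\perp(u)}-v)\,\theta_\lambda(y-u),
\end{aligned}
\tag{4.54}
\]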

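where we used the notations from Sect. 4.1, and $\Pi^{\nu,\pm}_{u,v}$ is the two dimensional $\nu$th Landau level projection corresponding to the constant field $\hat B^\pm_{u,v}$ and to the gauge $\hat A^\pm_{u,v}$, obtained in Proposition 4 (all these objects naturally depend on $\tau$, but we omit this fact from the notation). Note that $\Pi^\pm_\tau(\nu,u,v,p)=0$ unless $|v-u_{\perp(u)}|\le2\lambda=2\tilde\lambda(\tau)$. In this section we shall use $\Pi^+_\tau(\nu,u,v,p)$ only; the other operator will be useful in Sect. 4.3. Since $\theta_{u,\lambda}$ has support in $B(u,\lambda)$, we see from (4.19) (carefully keeping track of the $u$-dependences which we neglected in the previous section)
\[
\begin{aligned}
\sum_j\langle f_j\theta_{u,\lambda}|\tilde H_{u,\tau}|\theta_{u,\lambda}f_j\rangle
&\ge\sum_j\int_{\mathbf R^2}\!\int\Big[(1-2\delta_1)(1-\tilde\varepsilon(\tau))\big|\sigma\cdot(h_\tau p+\hat A^+_{u,v})(e^{i\phi_{u,v}}\eta^u_v\theta_{u,\lambda}\Theta_\varrho f_j)(x)\big|^2\\
&\qquad\qquad-\big([V_\tau(u)]_-+\tilde\varepsilon(\tau)\big)\big|(e^{i\phi_{u,v}}\eta^u_v\theta_{u,\lambda}\Theta_\varrho f_j)(x)\big|^2\Big]dx\,dv\\
&=(1-\delta_2(\tau))\sum_j\sum_{\nu=0}^\infty\int_{\mathbf R^2}\!\int_{\mathbf R}\Big[h_\tau^2\big(p_3^2+2\nu h_\tau^{-1}\hat B^+_{u,v}\big)-(1-\delta_2(\tau))^{-1}\big([V_\tau(u)]_-+\tilde\varepsilon(\tau)\big)\Big]\\
&\qquad\qquad\times\langle f_j|\Theta_\varrho\Pi^+_\tau(\nu,u,v,p_3)\Theta_\varrho|f_j\rangle\,dp_3\,dv.
\end{aligned}
\tag{4.55}
\]
Here
\[
(1-\delta_2(\tau)):=(1-2\delta_1)(1-\tilde\varepsilon(\tau)).
\tag{4.56}
\]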

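Observe now that for each $\nu \in \mathbb{N}$ we have from (4.17) that

\[
\Pi^{\nu,\pm}_{u,v}(x_\perp,x_\perp) = d_\nu h_\tau^{-1}\hat B^\pm_{u,v} \le d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u), \tag{4.57}
\]

if $|v - u_\perp| \le 2\lambda = 2\tilde\lambda(\tau)$. Therefore

\[
\sum_j \int \langle f_j|\Theta_\varrho\Pi^+_\tau(\nu,u,v,p_3)\Theta_\varrho|f_j\rangle\,dv
\le \int \mathrm{Tr}\big[\Theta_\varrho\Pi^+_\tau(\nu,u,v,p_3)\Theta_\varrho\big]\,dv
\le \int \mathrm{Tr}\,\Pi^+_\tau(\nu,u,v,p_3)\,dv
\le d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u), \tag{4.58}
\]

where $d_\nu$ was defined after (1.4). Thus, combining (4.58) with (4.53), (4.55), both the upper and lower bounds of (4.17), and the definition (1.4) of $P$, after performing the $dp_3$-integral, we have

\[
\begin{aligned}
\sum_{j=1}^N \int \langle f_j\theta_{u,\lambda}|\tilde H_{u,\tau}|\theta_{u,\lambda}f_j\rangle\,du
&\ge -h_\tau^{-3}\,(1-\tilde\varepsilon(\tau)^2)(1-\tilde\varepsilon(\tau))^{1/2}(1-\delta_2(\tau)) \int_{|u|\le\varrho+3\lambda} P\Big(h_\tau B_\tau(u), \frac{[V_\tau(u)]_- + \tilde\varepsilon(\tau)}{1-\delta_3(\tau)}\Big)\,du \\
&\ge -h_\tau^{-3} \int_{|u|\le\varrho+3\lambda} P\Big(h_\tau B_\tau(u), \frac{[V_\tau(u)]_- + \tilde\varepsilon(\tau)}{1-\delta_3(\tau)}\Big)\,du
\end{aligned} \tag{4.59}
\]

independently of $N$, with

\[
(1-\delta_3(\tau)) := (1-\delta_2(\tau))(1-\tilde\varepsilon(\tau)) = (1-2\delta_1)(1-\tilde\varepsilon(\tau))^2. \tag{4.60}
\]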

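We used that the integral on the left side of (4.59) vanishes unless $|u| \le \varrho + 3\lambda$. Putting together (4.51), (4.52) and (4.59), we see that

\[
\sum_j \langle f_j|H_\tau|f_j\rangle \ge -h_\tau^{-3} \int_{|u|\le\varrho+3\lambda} P\Big(h_\tau B_\tau(u), \frac{[V_\tau(u)]_- + \tilde\varepsilon(\tau)}{1-\delta_3(\tau)}\Big)\,du - c\,\delta_1 F_{h_\tau,B_\tau}\big(\delta_1^{-1}W_{{\rm err},\tau}\big). \tag{4.61}
\]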

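Clearly

\[
0 \le \partial_2 P(B,W) := \frac{\partial}{\partial W}P(B,W) = \pi^{-1}\sum_{\nu=0}^\infty d_\nu B\,[2\nu B - W]_-^{1/2} \le cBW^{1/2} + cW^{3/2}. \tag{4.62}
\]

Therefore, using $0 \le P(B,W+V) - P(B,W) \le V\,\partial_2 P(B,W+V)$ for any $V \ge 0$, Hölder's inequality and (1.7), we obtain, if $\tilde\varepsilon(\tau) \le 1$ and $\delta_3 \le 1/2$, that

\[
\begin{aligned}
0 &\le h_\tau^{-3}\int_{|u|<\varrho+3\lambda} \Big[ P\Big(h_\tau B_\tau(u), \frac{[V_\tau(u)]_- + \tilde\varepsilon(\tau)}{1-\delta_3(\tau)}\Big) - P\big(h_\tau B_\tau(u), [V_\tau(u)]_-\big) \Big]\,du \\
&\le c\,h_\tau^{-3}\int_{|u|<\varrho+3\lambda} \big(\tilde\varepsilon(\tau) + \delta_3(\tau)[V_\tau(u)]_-\big) \Big[ h_\tau\|B_\tau\|\big([V_\tau(u)]_- + \tilde\varepsilon(\tau)\big)^{1/2} + \big([V_\tau(u)]_- + \tilde\varepsilon(\tau)\big)^{3/2} \Big]\,du \\
&\le c\Big\{ \tilde\varepsilon(\tau)(\varrho+3\lambda)^3\big[\|B_\tau\|h_\tau^{-2} + h_\tau^{-3}\big] + \big(\tilde\varepsilon(\tau) + \delta_3(\tau)\big)E_{h_\tau,B_\tau}([V_\tau]_-) \Big\}.
\end{aligned} \tag{4.63}
\]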
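The two facts about $P$ used in (4.62) and (4.63) can be sanity-checked numerically. The sketch below is only an illustration: it assumes the weights $d_0 = (2\pi)^{-1}$, $d_\nu = \pi^{-1}$ for $\nu \ge 1$ (the text only says they are defined after (1.4)), reads $[t]_-$ as $\max(-t, 0)$, and obtains $P$ by integrating the series in (4.62) in $W$. The function names are hypothetical.

```python
import math

def d(nu):
    # Assumed Landau-level weights: d_0 = 1/(2*pi), d_nu = 1/pi for nu >= 1.
    return 1.0 / (2.0 * math.pi) if nu == 0 else 1.0 / math.pi

def levels(B, W):
    # Indices nu with 2*nu*B < W, i.e. the levels with [2*nu*B - W]_- > 0 (requires B > 0).
    nu = 0
    while 2.0 * nu * B < W:
        yield nu
        nu += 1

def dP(B, W):
    # Series for the W-derivative of P, as in (4.62).
    return sum(d(nu) * B * math.sqrt(W - 2.0 * nu * B) for nu in levels(B, W)) / math.pi

def P(B, W):
    # Antiderivative of dP in W that vanishes for W <= 0.
    return (2.0 / 3.0) * sum(d(nu) * B * (W - 2.0 * nu * B) ** 1.5 for nu in levels(B, W)) / math.pi

B, W, V = 0.7, 5.3, 0.4
increment = P(B, W + V) - P(B, W)
print(0.0 <= increment <= V * dP(B, W + V))                      # convexity step used for (4.63)
print(dP(B, W) <= (B * math.sqrt(W) + W ** 1.5) / math.pi ** 2)   # bound (4.62) with c = 1/pi^2
```

With these assumed weights the constant $c = \pi^{-2}$ suffices in (4.62), because at most $1 + W/(2B)$ levels contribute and each term is at most $\pi^{-1} B W^{1/2}$.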

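Using (1.3), (4.61), (4.63), (4.4–4.5), (4.50) and (4.10), and recalling that $\lambda = \tilde\lambda(\tau)$, we get

\[
\frac{\sum_j \langle f_j|H_\tau|f_j\rangle}{E_{\rm scl}(h_\tau,B_\tau,V_\tau)} \le 1 + c\,C_-(\tau)^{-1}\Big[ \tilde\varepsilon(\tau)(\varrho+3\tilde\lambda(\tau))^3 + \big(\tilde\varepsilon(\tau)+\delta_3(\tau)\big)C_+(\tau) + \delta_1^{-5}\Big\{ \varepsilon_2(\tau,\varrho) + \sup_{|y|\le\tilde\lambda(\tau)}\varepsilon_-(\tau,y) + \tilde\varepsilon(\tau)(\varrho+3\tilde\lambda(\tau))^3 \Big\} \Big]. \tag{4.64}
\]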

This estimate is valid independently of $N$. The ratio of the sum of the negative eigenvalues $\sum_k e_k(H_\tau)$ of $H_\tau = H(h_\tau,A_\tau,V_\tau)$ to $E_{\rm scl}(h_\tau,B_\tau,V_\tau)$ is therefore estimated above by the right side of (4.64). It is clear that we can choose $\delta_1$ and $\varrho$ as functions of $\tau$ such that the error in (4.64) tends to 0 as $\tau \to 0$, using (4.8), (4.9) and that $\tau \to 0$ implies $\tilde\varepsilon(\tau), \tilde\lambda(\tau) \to 0$. This yields (4.44).

4.3. Upper bound. In this section we shall show the opposite asymptotic bound

\[
\liminf_{\tau\to0} \frac{\sum_k e_k(H_\tau)}{E_{\rm scl}(h_\tau,B_\tau,V_\tau)} \ge 1 \tag{4.65}
\]

in Theorem 4.1. We shall construct a suitable trial density matrix and use the variational principle. As in Sect. 4.2, we define $\kappa(\tau) := h_\tau^3\|B_\tau\|\,l(B_\tau)^{-2}$, $\mu(\tau) := h_\tau\max\{L(B_\tau)^{-1}, l(B_\tau)^{-1}\}$, $\lambda := \tilde\lambda(\tau) := \lambda(\kappa(\tau),\mu(\tau),h_\tau)$, $\tilde\varepsilon(\tau) := \varepsilon(\kappa(\tau),\mu(\tau),h_\tau)$. We shall also need the coherent state operators $\Pi^-_\tau(\nu,u,v,p)$ defined in (4.54), and recall that $\Pi^-_\tau(\nu,u,v,p) = 0$ unless $|v - u_{\perp(u)}| \le 2\lambda$. For any $\varrho > 0$, let $M_{\tau,\varrho}(\nu,u,p)$ be the characteristic function of the set

\[
\{(\nu,u,p) : h_\tau^2(p^2 + 2\nu h_\tau^{-1}B_\tau(u)) < [V_\tau(u)]_-,\ |u| \le \varrho\}.
\]

Note that $M_{\tau,\varrho}(\nu,u,p) = 0$ if $V_\tau(u) \ge 0$. Define the operator $\gamma_{\tau,\varrho}$ on $L^2(\mathbb{R}^3,\mathbb{C}^2)$ by

\[
\gamma_{\tau,\varrho} = (2\pi)^{-1}\sum_{\nu=0}^\infty \int\!\int_{\mathbb{R}}\!\int_{\mathbb{R}^2} M_{\tau,\varrho}(\nu,u,p)\,\Pi^-_\tau(\nu,u,v,p)\,dv\,dp\,du.
\]

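Theorem 4.2. With the notations above, for any $\varrho > 0$, the operator $\gamma_{\tau,\varrho}$ satisfies the density matrix condition $0 \le \gamma_{\tau,\varrho} \le 1_{L^2(\mathbb{R}^3,\mathbb{C}^2)}$, its density function $\rho_{\gamma_{\tau,\varrho}}(x) := \mathrm{Tr}_{\mathbb{C}^2}(\gamma_{\tau,\varrho}(x,x))$ satisfies the estimate

\[
(1-\tilde\varepsilon(\tau)) \le \frac{\rho_{\gamma_{\tau,\varrho}}(x)}{h_\tau^{-3}\int_{|u|\le\varrho}\partial_2 P\big(h_\tau B_\tau(u), [V_\tau(u)]_-\big)\,\theta_\lambda(x-u)^2\,du} \le (1+\tilde\varepsilon(\tau)) \tag{4.66}
\]

(it should be understood such that we allow both the numerator and the denominator to be simultaneously zero), and, as a trial density matrix, $\gamma_{\tau,\varrho}$ gives the exact semiclassical upper bound to the energy asymptotics, i.e. for any $\varrho > 0$,

\[
\liminf_{\tau\to0} \frac{\mathrm{Tr}[H_\tau\gamma_{\tau,\varrho}]}{E_{\rm scl}(h_\tau,B_\tau,V_\tau)} \ge 1 - \limsup_{\tau\to0}\frac{\varepsilon_2(\tau,\varrho)}{C_-(\tau)}. \tag{4.67}
\]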

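Letting $\varrho \to \infty$, (4.67), (4.4) and (4.6) obviously imply (4.65) by using the variational principle: $\sum_k e_k(H_\tau) \le \mathrm{Tr}[H_\tau\gamma_{\tau,\varrho}]$.

Proof. The following relations are immediate to check:

\[
(2\pi)^{-1}\sum_{\nu=0}^\infty \int\!\int_{\mathbb{R}}\!\int_{\mathbb{R}^2} \Pi^-_\tau(\nu,u,v,p)\,dv\,dp\,du = 1_{L^2(\mathbb{R}^3,\mathbb{C}^2)}, \tag{4.68}
\]

\[
\mathrm{Tr}_{\mathbb{C}^2}\big[\Pi^-_\tau(\nu,u,v,p)(x,x)\big] = d_\nu h_\tau^{-1}\hat B^-_{u,v}\,\big(\theta_\lambda(x-u)\,\eta^u(x_{\perp(u)}-v)\big)^2. \tag{4.69}
\]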

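In particular (4.68) implies that $\gamma_{\tau,\varrho}$ satisfies the density matrix condition. The relation

\[
(2\pi)^{-1}\sum_\nu d_\nu h_\tau^{-1}B_\tau(u)\int_{\mathbb{R}} M_{\tau,\varrho}(\nu,u,p)\,dp = h_\tau^{-3}\,\partial_2 P\big(h_\tau B_\tau(u), [V_\tau(u)]_-\big), \tag{4.70}
\]

for $|u| \le \varrho$, and (4.69) immediately give (4.66). For $\mathrm{Tr}[H_\tau\gamma_{\tau,\varrho}]$ we shall need

\[
\mathrm{Tr}\,\Pi^-_\tau(\nu,u,v,p) := \mathrm{Tr}_{L^2(\mathbb{R}^3,\mathbb{C}^2)}\Pi^-_\tau(\nu,u,v,p) = d_\nu h_\tau^{-1}\hat B^-_{u,v}\,\Phi(u,v)
\quad\text{with}\quad
\Phi(u,v) := \int \big(\theta_\lambda(x-u)\,\eta^u(x_{\perp(u)}-v)\big)^2\,dx, \tag{4.71}
\]

and

\[
\int_{\mathbb{R}^2} \mathrm{Tr}\big[V_\tau\Pi^-_\tau(\nu,u,v,p)\big]\,dv \le d_\nu h_\tau^{-1}B_\tau(u)\,\Big(\big\{(1+\tilde\varepsilon(\tau))[V_\tau]_+ - (1-\tilde\varepsilon(\tau))[V_\tau]_-\big\} * (\theta_\lambda)^2\Big)(u), \tag{4.72}
\]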

obtained from (4.17). In order to calculate the kinetic energy we note, similarly to the two dimensional case, that $\Pi^-_\tau(\nu,u,v,p) = \Xi_\tau(\nu,u,v,p)\,\Xi_\tau(\nu,u,v,p)^*$, where $\Xi_\tau(\nu,u,v,p)$ has an integral kernel

\[
\Xi_\tau(\nu,u,v,p)(x,y) = \theta_\lambda(x-u)\,\eta^u(x_{\perp(u)}-v)\,e^{-i\phi_{u,v}(x)}\,\Pi^{\nu,-}_{u,v}(x_{\perp(u)},y_{\perp(u)})\,e^{ip(x_{3(u)}-y_{3(u)})}.
\]

Let, furthermore, $\Gamma^\nu_{u,v,p}$ be the positive operator with kernel

\[
\Gamma^\nu_{u,v,p}(x,y) := \Pi^{\nu,-}_{u,v}(x_{\perp(u)},y_{\perp(u)})\,e^{ip(x_{3(u)}-y_{3(u)})}.
\]
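Relation (4.70), used above to obtain (4.66), can be checked in two lines from the series (4.62). The following is a sketch of that computation; it assumes only the definition of $M_{\tau,\varrho}$ and the convention $[t]_- = \max(-t,0)$.

```latex
% For fixed nu and |u| <= varrho, M is the indicator of
%   h^2 p^2 < [V(u)]_- - 2 nu h B(u),
% an interval in p of half-length h^{-1} ( [V(u)]_- - 2 nu h B(u) )_+^{1/2}.
\begin{align*}
\int_{\mathbb{R}} M_{\tau,\varrho}(\nu,u,p)\,dp
  &= 2h_\tau^{-1}\big[2\nu h_\tau B_\tau(u) - [V_\tau(u)]_-\big]_-^{1/2},\\
(2\pi)^{-1}\sum_\nu d_\nu h_\tau^{-1}B_\tau(u)\int_{\mathbb{R}} M_{\tau,\varrho}(\nu,u,p)\,dp
  &= h_\tau^{-3}\,\pi^{-1}\sum_\nu d_\nu\,\big(h_\tau B_\tau(u)\big)\,
     \big[2\nu h_\tau B_\tau(u) - [V_\tau(u)]_-\big]_-^{1/2}\\
  &= h_\tau^{-3}\,\partial_2 P\big(h_\tau B_\tau(u),[V_\tau(u)]_-\big).
\end{align*}
```

The last step is (4.62) evaluated at $B = h_\tau B_\tau(u)$ and $W = [V_\tau(u)]_-$.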

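We therefore have

\[
\begin{aligned}
\int_{\mathbb{R}^2} \mathrm{Tr}\Big([\sigma\cdot(h_\tau p + A_\tau)]^2\,\Pi^-_\tau(\nu,u,v,p)\Big)\,dv
&= \int_{\mathbb{R}^2} \mathrm{Tr}\Big(\big(\sigma\cdot(h_\tau p + A_\tau)\Xi_\tau(\nu,u,v,p)\big)^*\big(\sigma\cdot(h_\tau p + A_\tau)\Xi_\tau(\nu,u,v,p)\big)\Big)\,dv \\
&= \int_{\mathbb{R}^2}\!\iint \big|[\sigma\cdot(h_\tau p + A_\tau)\Xi_\tau(\nu,u,v,p)](x,y)\big|^2\,dx\,dy\,dv.
\end{aligned} \tag{4.73}
\]

To estimate this, we use (4.20) for each fixed $y$, with $g := \theta_{u,\lambda}$ and $f^v(x) = \Gamma^\nu_{u,v,p}(x,y)$. We begin by examining the error terms. Exactly as in the two dimensional case, from the spectral density expression (4.57), the moment estimate (4.16), the field comparison (4.17) and the normalization of $\theta_{\lambda,u}$ we find

\[
\begin{aligned}
\int (\eta^u_v\theta_{\lambda,u})^2(x)\,W_{u,v}(x_\perp)\,|\Pi^{\nu,-}_{u,v}(x_\perp,y_\perp)|^2\,dy\,dx\,dv
&= \int \big[w_u^{-4}(x-v)^4 + w_u^{-2}(x-v)^2 + 1\big]\,(\eta^u_v(x))^2\,\theta_{\lambda,u}^2(x)\,|\Pi^{\nu,-}_{u,v}(x_\perp,x_\perp)|\,dx\,dv \\
&\le c\,d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u),
\end{aligned} \tag{4.74}
\]

where the first identity follows since $\Pi^{\nu,-}_{u,v}$ is a projection. Likewise, for the second error term we get

\[
\begin{aligned}
\int (\eta^u_v\theta_{\lambda,u})^2(x)\,W^+_{u,v}(x_\perp)\,|P_+\Pi^{\nu,-}_{u,v}(x_\perp,y_\perp)|^2\,dy\,dx\,dv
&\le \nu h_\tau B^\#_\tau(u)\,d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u) \\
&\le c\,\nu\,(h_\tau B_\tau(u))\,d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u).
\end{aligned} \tag{4.75}
\]

In the last line we used (4.15) to estimate $B^\#_\tau(u)$ in terms of $B_\tau(u)$. Note that we inserted a $\nu$ in the estimate. This is clearly allowed for $\nu \ge 1$. For $\nu = 0$ it follows simply because $P_+\Pi^{0,-}_{u,v} = 0$ (i.e., the lowest Landau level contains only spinors with spin down). For the last error term in (4.20) we just have to recall that $\int_{\mathrm{supp}\,\zeta}(\eta^{(0),u})^2 < c$ to conclude that

\[
\int |\nabla(\zeta_v\theta_{\lambda,u})(x)|^2\,(\eta^{(0),u}_v)^2(x)\,|\Pi^{\nu,-}_{u,v}(x_\perp,y_\perp)|^2\,dx\,dy\,dv \le c\,\tilde\lambda(\tau)^{-2}\,d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u). \tag{4.76}
\]

If we insert (4.73–4.76) into (4.20) we arrive at

\[
\begin{aligned}
\int \mathrm{Tr}\Big([\sigma\cdot(h_\tau p + A_\tau)]^2\,\Pi^-_\tau(\nu,u,v,p)\Big)\,dv
&\le (1+\tilde\varepsilon(\tau)) \iiint \eta^u_v(x)^2\,\theta_{\lambda,u}(x)^2\,\Big|\big[\sigma\cdot(h_\tau p + \hat A^-_{u,v})\Gamma^\nu_{u,v,p}\big](x,y)\Big|^2\,dx\,dy\,dv \\
&\quad + c\,d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u)\Big[\tilde\varepsilon(\tau) + \nu(h_\tau B_\tau(u))\tilde\varepsilon(\tau) + h_\tau^2\tilde\lambda(\tau)^{-2}\Big].
\end{aligned}
\]

To compute the first term we observe that since $\Gamma^\nu_{u,v,p}$ is an eigenprojection for $[\sigma\cdot(h_\tau p + \hat A^-_{u,v})]^2$ it commutes with $\sigma\cdot(h_\tau p + \hat A^-_{u,v})$. Hence, using again the spectral density expression (4.57), we get

\[
\begin{aligned}
\int \Big|\big[\sigma\cdot(h_\tau p + \hat A^-_{u,v})\Gamma^\nu_{u,v,p}\big](x,y)\Big|^2\,dy
&= \Big[\sigma\cdot(h_\tau p + \hat A^-_{u,v})\,\Gamma^\nu_{u,v,p}\,\sigma\cdot(h_\tau p + \hat A^-_{u,v})\Big](x,x) \\
&= \Big[[\sigma\cdot(h_\tau p + \hat A^-_{u,v})]^2\,\Gamma^\nu_{u,v,p}\Big](x,x)
 = (h_\tau^2p^2 + 2\nu h_\tau\hat B^-_{u,v})\,\Pi^{\nu,-}_{u,v}(x_\perp,x_\perp) \\
&= (h_\tau^2p^2 + 2\nu h_\tau\hat B^-_{u,v})\,(d_\nu h_\tau^{-1}\hat B^-_{u,v}).
\end{aligned}
\]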

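If we insert this identity into the estimate above we conclude, again using the field comparison (4.17), that

\[
\int_{\mathbb{R}^2} \mathrm{Tr}\Big([\sigma\cdot(h_\tau p + A)]^2\,\Pi^-_\tau(\nu,u,v,p)\Big)\,dv
\le d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u)\Big[(1+c\tilde\varepsilon(\tau))\,h_\tau^2\big(p^2 + 2\nu h_\tau^{-1}B_\tau(u)\big)\int_{\mathbb{R}^2}\Phi(u,v)\,dv + c\tilde\varepsilon(\tau) + c\,h_\tau^2\tilde\lambda(\tau)^{-2}\Big]. \tag{4.77}
\]

From (4.77) and (4.72) we have

\[
\begin{aligned}
\mathrm{Tr}[H_\tau\gamma_{\tau,\varrho}] \le (2\pi)^{-1}\sum_\nu \int\!\int_{\mathbb{R}} & M_{\tau,\varrho}(\nu,u,p)\,d_\nu h_\tau^{-1}(1+\tilde\varepsilon(\tau))B_\tau(u) \\
& \times \Big[\int_{\mathbb{R}^2}\Phi(u,v)\,(1+c\tilde\varepsilon(\tau))^2\,h_\tau^2\big(p^2 + 2\nu h_\tau^{-1}B_\tau(u)\big)\,dv + c\tilde\varepsilon(\tau) + c\,h_\tau^2\tilde\lambda^{-2}(\tau) \\
& \qquad + \Big(\Big\{[V_\tau]_+ - \frac{1-\tilde\varepsilon(\tau)}{1+\tilde\varepsilon(\tau)}[V_\tau]_-\Big\} * (\theta_\lambda)^2\Big)(u)\Big]\,dp\,du.
\end{aligned} \tag{4.78}
\]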

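Using (4.70) we can continue the estimate (4.78) by performing first the $dv$-integral (note that $\int_{\mathbb{R}^2}\Phi(u,v)\,dv = 1$), then the $dp$-integral:

\[
\begin{aligned}
\mathrm{Tr}[H_\tau\gamma_{\tau,\varrho}]
&\le -(1+c\tilde\varepsilon(\tau))^2\,h_\tau^{-3}\int_{|u|\le\varrho} P\big(h_\tau B_\tau(u), [V_\tau(u)]_-\big)\,du + \mathrm{Error}(\tau,\varrho) \\
&\le -h_\tau^{-3}\int_{|u|\le\varrho} P\big(h_\tau B_\tau(u), [V_\tau(u)]_-\big)\,du + \mathrm{Error}(\tau,\varrho)
\end{aligned}
\]

with

\[
\begin{aligned}
\mathrm{Error}(\tau,\varrho) := h_\tau^{-3}\int_{|u|\le\varrho,\ V_\tau(u)<0} & (1+\tilde\varepsilon(\tau))\,\partial_2 P\big(h_\tau B_\tau(u), [V_\tau(u)]_-\big) \\
& \times \Big[(1+c\tilde\varepsilon(\tau))^2[V_\tau(u)]_- + \Big(\Big\{[V_\tau]_+ - \frac{1-\tilde\varepsilon(\tau)}{1+\tilde\varepsilon(\tau)}[V_\tau]_-\Big\} * (\theta_\lambda)^2\Big)(u) + c\tilde\varepsilon(\tau)\Big]\,du.
\end{aligned} \tag{4.79}
\]

Therefore, using (4.62), we have (if $\tilde\varepsilon(\tau) \le 1$, $\lambda = \tilde\lambda(\tau) \le 1$, which we can assume)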

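\[
\begin{aligned}
\mathrm{Error}(\tau,\varrho)
&\le h_\tau^{-3}\int_{|u|\le\varrho} c\big(h_\tau\|B_\tau\|[V_\tau(u)]_-^{1/2} + [V_\tau(u)]_-^{3/2}\big)\Big[\big([V_\tau]_- - [V_\tau]_- * (\theta_\lambda)^2\big)(u) \\
&\qquad\qquad + \big([V_\tau]_+ * \theta_\lambda^2 - [V_\tau]_+\big)(u) + \tilde\varepsilon(\tau)[V_\tau(u)]_- + \tilde\varepsilon(\tau)\Big]\,du \\
&\le c\Big\{\tilde\varepsilon(\tau)\big[h_\tau^{-2}\|B_\tau\| + h_\tau^{-3}\big](\varrho+1)^3 + \tilde\varepsilon(\tau)E_{h_\tau,B_\tau}([V_\tau]_-) \\
&\qquad + \sum_\pm\Big(E_{h_\tau,B_\tau}\big([V_\tau]_\pm - [V_\tau]_\pm * (\theta_\lambda)^2\big)^{2/3}E_{h_\tau,B_\tau}([V_\tau]_-)^{1/3}
 + E_{h_\tau,B_\tau}\big([V_\tau]_\pm - [V_\tau]_\pm * (\theta_\lambda)^2\big)^{2/5}E_{h_\tau,B_\tau}([V_\tau]_-)^{3/5}\Big)\Big\},
\end{aligned} \tag{4.80}
\]

where we used Hölder's inequality. Considering (4.4)-(4.8) and (4.10), recalling that $E_{h_\tau,B_\tau} \le F_{h_\tau,B_\tau}$ and $\lambda = \tilde\lambda(\tau) \to 0$, we get, using Jensen's inequality and dominated convergence, as in the lower bound, that

\[
\frac{\mathrm{Error}(\tau,\varrho)}{|E_{\rm scl}(h_\tau,B_\tau,V_\tau)|} \to 0
\]

as $\tau \to 0$, uniformly in $\varrho > 0$. Hence for all $\varrho > 0$ we have

\[
\liminf_{\tau\to0}\frac{\mathrm{Tr}[H_\tau\gamma_{\tau,\varrho}]}{E_{\rm scl}(h_\tau,B_\tau,V_\tau)} \ge 1 - \limsup_{\tau\to0}\left(\frac{h_\tau^{-3}\int_{|u|\ge\varrho}P\big(h_\tau B_\tau(u),[V_\tau(u)]_-\big)\,du}{|E_{\rm scl}(h_\tau,B_\tau,V_\tau)|}\right). \tag{4.81}
\]

By the definition of $C_-(\tau)$ (see (4.4)) and (4.9),

\[
\begin{aligned}
h_\tau^{-3}\int_{|u|\ge\varrho}P\big(h_\tau B_\tau(u),[V_\tau(u)]_-\big)\,du
&= h_\tau^{-3}\int P\big(h_\tau B_\tau(u),[V_\tau(1-\chi_\varrho)(u)]_-\big)\,du \\
&\le E_{h_\tau,B_\tau}\big([V_\tau - V_\tau\chi_\varrho]_-\big) = E_{h_\tau,B_\tau}\big([V_\tau\chi_\varrho]_- - [V_\tau]_-\big) \\
&\le \varepsilon_2(\tau,\varrho)\,|E_{\rm scl}(h_\tau,B_\tau,V_\tau)|/C_-(\tau),
\end{aligned}
\]

hence we obtain (4.67).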

5. Magnetic Thomas-Fermi Theory

In Sect. 1.2 of the introduction we stated our simplest theorem on the asymptotic validity of the magnetic Thomas-Fermi (MTF) theory as the limit of quantum mechanics and we also introduced the basic notations. Here we recall some further results on the MTF theory, obtained in [LSY-II]. The first important result (Proposition 4.3 of [LSY-II]) is that the MTF energy $E^{\rm MTF}(N,B,Z)$ (defined in (1.18)) is always finite, $E^{\rm MTF}(N,B,Z) > -\infty$, as long as $B$ is a locally bounded function. Furthermore, it was proved in Theorems 4.5–4.7 that there is a unique minimizer $\rho^{\rm MTF}$, which satisfies the Thomas-Fermi equation (see [LSY-II] Eq. (4.27))

\[
\rho^{\rm MTF}(x) = \partial_2 P\big(B(x), [V^{\rm MTF}(x)]_-\big), \tag{5.1}
\]

where

\[
V^{\rm MTF}(x) = -Z|x|^{-1} + \rho^{\rm MTF} * |x|^{-1} + \mu \tag{5.2}
\]

with $\mu := \mu(N,B,Z) := -\partial E^{\rm MTF}(N,B,Z)/\partial N \ge 0$ being the chemical potential (see [LSY-II] Theorem 4.8). Conversely, if the pair $(\rho,\mu)$ satisfies (5.1) and (5.2) (with $\rho$ instead of $\rho^{\rm MTF}$) then there exists $N$ such that $\rho$ is the minimizer of $E$ with $\int\rho \le N$ and $\mu = \mu(N,B,Z)$. Note that according to [LSY-II] Proposition 4.2 the minimizer $\rho^{\rm MTF}$ is in $L^{5/3}_{\rm loc}(\mathbb{R}^3) \cap L^1(\mathbb{R}^3)$. The convolution integral $\rho^{\rm MTF} * |x|^{-1}$ therefore makes sense and for $x \ne 0$ we have

\[
-(4\pi)^{-1}\Delta V^{\rm MTF}(x) = \rho^{\rm MTF}(x). \tag{5.3}
\]

From Theorem 4.8 in [LSY-II] we see that $\int\rho^{\rm MTF} < N \Rightarrow \mu(N,B,Z) = 0$. We therefore have

\[
\mu\int\rho^{\rm MTF} = \mu N. \tag{5.4}
\]

Note that (5.1) and the definition (1.17) of $\tau$ as a Legendre transform imply that

\[
\tau\big(B(x),\rho^{\rm MTF}(x)\big) = \rho^{\rm MTF}(x)\,[V^{\rm MTF}(x)]_- - P\big(B(x),[V^{\rm MTF}(x)]_-\big). \tag{5.5}
\]

We can now use (5.5) and (5.4) to express the energy as follows:

\[
\begin{aligned}
E^{\rm MTF}(N,B,Z) &= E[\rho^{\rm MTF};B,Z] \\
&= -\int P\big(B(x),[V^{\rm MTF}(x)]_-\big)\,dx - \tfrac12\iint\rho^{\rm MTF}(x)\,|x-y|^{-1}\rho^{\rm MTF}(y)\,dx\,dy - \mu N.
\end{aligned} \tag{5.6}
\]
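The step from (5.5) and (5.4) to (5.6) can be written out as follows. This is a sketch that assumes the MTF functional of (1.18) has the usual form $E[\rho;B,Z] = \int\tau(B,\rho) - Z\int\rho|x|^{-1} + \tfrac12 D(\rho,\rho)$, with the shorthand $D(\rho,\rho) := \iint\rho(x)|x-y|^{-1}\rho(y)\,dx\,dy$ introduced here only for brevity.

```latex
% Write rho = rho^MTF, V = V^MTF. By (5.1), rho(x) > 0 forces [V(x)]_- > 0,
% so on the support of rho we have [V]_- = -V = Z|x|^{-1} - rho*|x|^{-1} - mu.
\begin{align*}
\int\tau(B,\rho)\,dx
  &= \int\rho\,[V]_-\,dx - \int P(B,[V]_-)\,dx
  && \text{by (5.5)}\\
  &= Z\!\int\rho\,|x|^{-1}dx - D(\rho,\rho) - \mu\!\int\rho - \int P(B,[V]_-)\,dx
  && \text{by (5.2)}\\
  &= Z\!\int\rho\,|x|^{-1}dx - D(\rho,\rho) - \mu N - \int P(B,[V]_-)\,dx
  && \text{by (5.4)}.
\end{align*}
% Adding the remaining terms of the functional:
\begin{align*}
E[\rho;B,Z]
  = \int\tau(B,\rho)\,dx - Z\!\int\rho\,|x|^{-1}dx + \tfrac12 D(\rho,\rho)
  = -\int P(B,[V]_-)\,dx - \tfrac12 D(\rho,\rho) - \mu N .
\end{align*}
```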

Our main result on the energy of large atoms was given in the introduction, in Theorem 1.3. There only the strength of the magnetic field was rescaled. One could also have asked whether the field can be allowed to vary on a scale depending on the parameters $Z$ and $b$. This is of some interest since the atomic scale, in fact, decreases with increasing $Z$ and $b$, at least asymptotically. We shall see that the size of the atom is of order

\[
s := s(b,Z) := Z^{-1/3}\big(1 + bZ^{-4/3}\big)^{-2/5}. \tag{5.7}
\]

Concerning the shortest allowed length scales of $B$ we have the following version of the limit theorem.

Theorem 5.1. Let $B = \nabla\times A : \mathbb{R}^3 \to \mathbb{R}^3$ be a fixed magnetic field satisfying (1.7–1.9). There exists a constant $K > 0$ depending on $B$ such that if we define a rescaled field by $B_{Z,b}(x) := bB(x/[s(b,Z)K])$ then the following result holds. Assume that $Z, N \to \infty$ with $N/Z$ fixed and $b/Z^2 \to 0$, then $E(N, bB_{Z,b}, Z)/E^{\rm MTF}(N, bB_{Z,b}, Z) \to 1$.

We see therefore that the scale on which we allow the magnetic field to vary is greater than the size of the atom if B(0) Z 2 . It is an open question to allow the magnetic field to vary on the scale of the atom if B(0) Z 2 . Both Theorem 1.3 and Theorem 5.1 are simple consequences of the following stronger result. Theorem 5.2. Consider sequences Nn of positive integers and Zn of positive real numbers with Nn , Zn → ∞ as n → ∞ and Nn /Zn bounded above and below away from zero. If k > 0 is a constant then there exists a constant K > 0 such that if, Bn := ∇ × An : R3 → R3 is a sequence of magnetic fields satisfying Bn (0) ≥ kkBn k and (5.8) L(Bn ) ≥ Ks(kBn k, Zn ) for all n,

\[
l(B_n)^{-1}\,s(\|B_n\|,Z_n)\,\max\Big\{s(\|B_n\|,Z_n)^{-1/2}Z_n^{-1/2},\ \|B_n\|^{1/2}Z_n^{-1}\Big\} \to 0 \quad\text{as } n\to\infty, \tag{5.9}
\]

and

\[
\|B_n\|Z_n^{-3} \to 0 \quad\text{as } n\to\infty, \tag{5.10}
\]

then

\[
E(N_n,B_n,Z_n)/E^{\rm MTF}(N_n,B_n,Z_n) \to 1 \tag{5.11}
\]

as $n \to \infty$.

The roles of the constants $K$ and $k$ may seem mysterious and the corresponding conditions could possibly be weakened. The constant $k$ ensures that we are not considering a magnetic field which is much weaker in the center than its maximum. If this were the case, $s$, as defined here, would not be the correct scale of the atom, since it presumably should involve also the typical field strength around the nucleus. The constant $K$ ensures that the field does not change too fast on the scale of the atom. If this happened the atom could actually have two different relevant scales, one where $B$ is large, another where $B$ is small. In the following all positive constants, denoted by capital $C$ or $C_1$ etc., may depend on $k$. Constants that are universal will be denoted by the common symbol $c$. It is of no importance to the proof whether a constant is universal or depends on $k$. We devote the rest of this chapter to the proof of Theorem 5.2. For simplicity we omit the subscript $n$.

5.1. Rescaling. We rescale the Hamiltonian (1.14) using the unitary $(U_s\psi)(x_1,\ldots,x_N) = s^{-3N/2}\psi(s^{-1}x_1,\ldots,s^{-1}x_N)$, where $s = s(\|B\|,Z)$ is given in (5.7). We obtain that $Z^{-1}sE(N,B,Z)$ is the bottom of the spectrum of the operator

\[
H_{\rm eff} := \sum_{i=1}^N\Big([\sigma_i\cdot(hp_i + A_{\rm eff}(x_i))]^2 - |x_i|^{-1}\Big) + Z^{-1}\sum_{i<j}^N |x_i - x_j|^{-1}, \tag{5.12}
\]

where

\[
B_{\rm eff}(x) := \nabla\times A_{\rm eff}(x) := s^{3/2}Z^{-1/2}B(sx) \tag{5.13}
\]

and

\[
h := s^{-1/2}Z^{-1/2}. \tag{5.14}
\]

Note that the assumptions (5.8)–(5.10) imply that we are considering a limit such that

\[
h \to 0, \qquad L(B_{\rm eff}) \ge K, \qquad l(B_{\rm eff})^{-1}h \to 0, \qquad h^3\|B_{\rm eff}\|\,l(B_{\rm eff})^{-2} \to 0. \tag{5.15}
\]

We shall now perform the same rescaling of the MTF theory. We obtain from (5.6) and the scaling relation $\alpha^{-5/2}P(\alpha B,\alpha W) = P(B,W)$ that

\[
Z^{-1}sE^{\rm MTF}(N,B,Z) = E_{\rm scl}(h,B_{\rm eff},V_{\rm eff}) - \tfrac12Z^{-1}\iint\rho_{\rm eff}(x)\,|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy - s\mu NZ^{-1}, \tag{5.16}
\]

where we have used the definition (1.3) of $E_{\rm scl}$ and

\[
\rho_{\rm eff}(x) := s^3\rho^{\rm MTF}(sx), \tag{5.17}
\]

\[
V_{\rm eff}(x) := sZ^{-1}V^{\rm MTF}(sx) = -|x|^{-1} + Z^{-1}\rho_{\rm eff} * |x|^{-1} + sZ^{-1}\mu. \tag{5.18}
\]
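The rescaling behind (5.16) can be followed term by term. The sketch below assumes that (1.3) defines $E_{\rm scl}(h,B,V) = -h^{-3}\int P(hB(x),[V(x)]_-)\,dx$, which is the form in which $E_{\rm scl}$ enters (4.59) and (4.81).

```latex
% From (5.13), (5.14), (5.18):  h B_eff(x) = s Z^{-1} B(sx),  [V_eff(x)]_- = s Z^{-1} [V^MTF(sx)]_- .
% Scaling relation with alpha = s Z^{-1}, then the substitution y = s x:
\begin{align*}
E_{\rm scl}(h,B_{\rm eff},V_{\rm eff})
  &= -h^{-3}\int P\big(sZ^{-1}B(sx),\,sZ^{-1}[V^{\rm MTF}(sx)]_-\big)\,dx\\
  &= -s^{3/2}Z^{3/2}\,(sZ^{-1})^{5/2}\,s^{-3}\int P\big(B(y),[V^{\rm MTF}(y)]_-\big)\,dy\\
  &= -Z^{-1}s\int P\big(B(y),[V^{\rm MTF}(y)]_-\big)\,dy .
\end{align*}
% Coulomb term, using (5.17) and the same substitution in both variables:
\begin{align*}
\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy
  = s\iint\rho^{\rm MTF}(x)|x-y|^{-1}\rho^{\rm MTF}(y)\,dx\,dy .
\end{align*}
% Multiplying (5.6) by Z^{-1}s and inserting both identities gives (5.16).
```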

5.2. Reduction to a one-body problem. We shall here explain how to relate the many-body operator $H_{\rm eff}$ to a one-body operator, which can be analyzed by the semiclassical methods of the previous sections. As always this is done separately for the upper and the lower bound. Before we begin this analysis we want to note what error terms will be allowed. Since we are aiming at using the results of Sect. 4 we allow errors that are small in comparison to $\|B_{\rm eff}\|h^{-2} + h^{-3}$. Using the definitions of $s$, $B_{\rm eff}$ and $h$ in (5.7), (5.13) and (5.14) we see that

\[
Z \le \|B_{\rm eff}\|h^{-2} + h^{-3} \le 2Z. \tag{5.19}
\]
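The two-sided bound (5.19) is easy to confirm numerically from (5.7), (5.13) and (5.14). The snippet below is a small illustrative check, with hypothetical function names and with $\|B\|$ represented by a single number `b`. Writing $x = (1 + bZ^{-4/3})^{-1}$, the quantity equals $Z(1 - x + x^{3/5})$, which lies in $[Z, 2Z]$ for $0 < x \le 1$.

```python
def atom_scale(b, Z):
    # s(b, Z) from (5.7).
    return Z ** (-1.0 / 3.0) * (1.0 + b * Z ** (-4.0 / 3.0)) ** (-0.4)

def semiclassical_size(b, Z):
    # ||B_eff|| h^-2 + h^-3, with B_eff and h as in (5.13)-(5.14) and ||B|| = b.
    s = atom_scale(b, Z)
    h = (s * Z) ** -0.5
    b_eff = s ** 1.5 * Z ** -0.5 * b
    return b_eff * h ** -2 + h ** -3

for Z in (1.0, 10.0, 1000.0):
    for b in (0.0, 1.0, Z ** (4.0 / 3.0), Z ** 3):
        ratio = semiclassical_size(b, Z) / Z
        print(f"Z={Z:g} b={b:g} ratio={ratio:.6f}")
```

The printed ratio stays between 1 and 2 across the weak-field regime $b \ll Z^{4/3}$ and the strong-field regime up to $b \sim Z^3$.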

We can therefore allow errors that are lower order than $Z$. Note that on the original scale we are aiming at proving that the energy is of order $Zs^{-1}(\|B_{\rm eff}\|h^{-2} + h^{-3})$, i.e., of order $Z^2s^{-1}$.

5.2.1. Reducing the lower bound. In [LSY-II] this was done using the Lieb-Oxford inequality [LO] and the fact that the Lieb-Thirring inequality in [LSY-II] could be written as a kinetic energy inequality (see Corollary 2.2 in [LSY-II]). The standard way of transforming an estimate on the sum of eigenvalues of a one-body operator into a kinetic energy inequality is to use a Legendre transform. This requires, however, that the eigenvalue sum is estimated by the integral of a function of $V$. In our Lieb-Thirring inequality we have a term that depends on the gradient of $V$ and we can therefore not transform it into a kinetic energy bound. Instead of the Lieb-Oxford inequality we use another well known technique for reducing to a one-body problem. Choose a function $\varphi \in C_0^\infty(\mathbb{R}^3)$ which is spherically symmetric, non-negative and satisfies $\int\varphi = 1$. Let, for all $a > 0$, $\varphi_a(x) = a^{-3}\varphi(x/a)$; we then have for all $\tilde\rho : \mathbb{R}^3 \to \mathbb{R}$,

\[
\begin{aligned}
\sum_{1\le i<j\le N}|x_i - x_j|^{-1}
&\ge \sum_{1\le i<j\le N}\iint\varphi_a(x-x_i)\varphi_a(y-x_j)|x-y|^{-1}\,dx\,dy \\
&= \tfrac12\sum_{i,j}\iint\varphi_a(x-x_i)\varphi_a(y-x_j)|x-y|^{-1}\,dx\,dy - \tfrac12N\iint\varphi_a(x)\varphi_a(y)|x-y|^{-1}\,dx\,dy \\
&= \sum_{i=1}^N\iint\tilde\rho(y)\varphi_a(x_i-x)|x-y|^{-1}\,dx\,dy - \tfrac12\iint\tilde\rho(x)|x-y|^{-1}\tilde\rho(y)\,dx\,dy - c_\varphi Na^{-1} \\
&\quad + \tfrac12\iint\Big(\sum_i\varphi_a(x-x_i) - \tilde\rho(x)\Big)\Big(\sum_i\varphi_a(y-x_i) - \tilde\rho(y)\Big)|x-y|^{-1}\,dx\,dy \\
&\ge \sum_i\iint\tilde\rho(y)\varphi_a(x_i-x)|x-y|^{-1}\,dx\,dy - \tfrac12\iint\tilde\rho(x)|x-y|^{-1}\tilde\rho(y)\,dx\,dy - c_\varphi Na^{-1},
\end{aligned}
\]

where in the last inequality we used that $|x-y|^{-1}$ is of positive type (a positive kernel) and $c_\varphi$ depends on $\varphi$, but not on $a$. From this inequality, with $\tilde\rho = \rho_{\rm eff}$, and the definition (5.12) of $H_{\rm eff}$ we get the operator inequality

\[
H_{\rm eff} \ge \sum_{i=1}^N\Big\{[\sigma_i\cdot(hp_i + A_{\rm eff}(x_i))]^2 - |x_i|^{-1} + Z^{-1}\rho_{\rm eff} * \varphi_a * |x_i|^{-1}\Big\} - \tfrac12Z^{-1}\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy - c_\varphi Z^{-1}Na^{-1}.
\]
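The middle steps of the smearing argument are a completion of the square for the Coulomb kernel. The following sketch spells them out; the identification $c_\varphi = \tfrac12\iint\varphi(x)\varphi(y)|x-y|^{-1}\,dx\,dy$ is the natural reading of the self-energy term and is stated here as an assumption.

```latex
% Expanding the non-negative quadratic form (|x-y|^{-1} is a positive kernel):
\begin{align*}
0 &\le \tfrac12\iint\Big(\sum_i\varphi_a(x-x_i)-\tilde\rho(x)\Big)|x-y|^{-1}
        \Big(\sum_j\varphi_a(y-x_j)-\tilde\rho(y)\Big)dx\,dy\\
  &= \tfrac12\sum_{i,j}\iint\varphi_a(x-x_i)\varphi_a(y-x_j)|x-y|^{-1}dx\,dy
     - \sum_i\iint\tilde\rho(y)\varphi_a(x-x_i)|x-y|^{-1}dx\,dy
     + \tfrac12\iint\tilde\rho(x)|x-y|^{-1}\tilde\rho(y)\,dx\,dy .
\end{align*}
% Self-energy of the N diagonal terms i = j, after the substitution x -> a x, y -> a y:
\begin{align*}
\tfrac12 N\iint\varphi_a(x)\varphi_a(y)|x-y|^{-1}dx\,dy
  = \tfrac12 N a^{-1}\iint\varphi(x)\varphi(y)|x-y|^{-1}dx\,dy
  = c_\varphi N a^{-1}.
\end{align*}
```

The first inequality of the chain, replacing point charges by smeared ones, is Newton's theorem for the spherically symmetric $\varphi_a$.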

Thus

\[
H_{\rm eff} \ge \sum_{i=1}^N\Big\{[\sigma_i\cdot(hp_i + A_{\rm eff}(x_i))]^2 + V^-_{a,R}(x_i)\Big\} - \tfrac12Z^{-1}\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy - s\mu NZ^{-1} - NR^{-1} - c_\varphi Z^{-1}Na^{-1}, \tag{5.20}
\]

where

\[
V^-_{a,R}(x) := -|x|^{-1} + Z^{-1}\rho_{\rm eff} * \varphi_a * |x|^{-1} + s\mu Z^{-1}\Theta_R(x)
\]

with $\Theta_R(x) = \Theta(x/R)$ and $\Theta$ a smooth compactly supported function on $\mathbb{R}^3$ with $0 \le \Theta \le 1$ and satisfying $\Theta(x) = 1$ for $|x| \le 1$ and $\Theta(x) = 0$ for $|x| \ge 2$. We define the one-body operator

\[
H_{a,R} := [\sigma\cdot(hp + A_{\rm eff}(x))]^2 + V^-_{a,R}(x). \tag{5.21}
\]

Then with $e_1(H_{a,R}), e_2(H_{a,R}), \ldots$ being the negative eigenvalues of this operator we have

\[
H_{\rm eff} \ge \sum_k e_k(H_{a,R}) - \tfrac12Z^{-1}\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy - s\mu NZ^{-1} - NR^{-1} - c_\varphi Z^{-1}Na^{-1}. \tag{5.22}
\]

Comparing this with (5.16) it should be clear that we intend to analyze the limits $h \to 0$, $R \to \infty$, and $a \to 0$ (but $aZ \to \infty$).

5.2.2. Reducing the upper bound. Here we use Lieb's variational principle (see [L-1981]), which states that if $\gamma$ is an admissible density matrix on $L^2(\mathbb{R}^3;\mathbb{C}^2)$, i.e., an operator satisfying $0 \le \gamma \le 1$ and $\mathrm{Tr}[\gamma] \le N$, then

\[
Z^{-1}sE(N,B,Z) \le \mathrm{Tr}\Big[\gamma\Big([\sigma\cdot(hp + A_{\rm eff}(x))]^2 - |x|^{-1}\Big)\Big] + \tfrac12Z^{-1}\iint\rho_\gamma(x)|x-y|^{-1}\rho_\gamma(y)\,dx\,dy, \tag{5.23}
\]

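where $\rho_\gamma(x) = \mathrm{Tr}_{\mathbb{C}^2}[\gamma(x,x)]$ is a function in $L^1(\mathbb{R}^3)$ with $\int\rho_\gamma \le N$. [Note that if we allow the value $+\infty$ we can consider the above expression as being defined for all admissible $\gamma$ (not only those with range in $H^2(\mathbb{R}^3;\mathbb{C}^2)$).] We shall choose as our $\gamma$ the trial density matrix described in Theorem 4.2 corresponding to the potential $V_{\rm eff}\Theta_R$, where $\Theta_R$ is as above, and the magnetic field $B_{\rm eff}$. (We may ignore $\varrho$ in the definition of $\gamma$ since we can assume that $\varrho > R$.) We then have from (4.66) that

\[
(1-\tilde\varepsilon(h,B_{\rm eff})) \le \frac{\rho_\gamma(x)}{h^{-3}\int\partial_2P\big(hB_{\rm eff}(u),[(\Theta_RV_{\rm eff})(u)]_-\big)\,\theta_\lambda(x-u)^2\,du} \le (1+\tilde\varepsilon(h,B_{\rm eff})),
\]

where according to (5.15) we may assume that $\tilde\varepsilon(h,B_{\rm eff}) \to 0$ in the limit we consider. If we now use the rescaled form of (5.1) we see that

\[
(1-\tilde\varepsilon(h,B_{\rm eff}))\,(\chi_R\rho_{\rm eff}) * \theta_\lambda^2(x) \le \rho_\gamma(x) \le (1+\tilde\varepsilon(h,B_{\rm eff}))\,(\chi_{2R}\rho_{\rm eff}) * \theta_\lambda^2(x), \tag{5.24}
\]

where $\chi_R$ is the characteristic function of the ball of radius $R$. Since

\[
\iint\theta_\lambda(x-z)^2\,|z-w|^{-1}\,\theta_\lambda(y-w)^2\,dz\,dw \le |x-y|^{-1},
\]

we have that

\[
\tfrac12\iint\rho_\gamma(x)|x-y|^{-1}\rho_\gamma(y)\,dx\,dy \le \tfrac12(1+\tilde\varepsilon(h,B_{\rm eff}))^2\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy.
\]

We therefore get from (5.4) and (5.23) that

\[
Z^{-1}sE(N,B,Z) \le \mathrm{Tr}\Big[\gamma\Big([\sigma\cdot(hp + A_{\rm eff})]^2 + V_{\rm eff}\Theta_R\Big)\Big] - s\mu NZ^{-1} - \tfrac12Z^{-1}\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy + \Delta_1^+E + \Delta_2^+E, \tag{5.25}
\]

where according to (5.24) and assuming $R > 2\lambda$,

\[
\Delta_1^+E := \Delta_1^+E(R) := \mu sZ^{-1}\int(\rho_{\rm eff} - \Theta_R\rho_\gamma) \le \mu sZ^{-1}\int\Big(\rho_{\rm eff} - (1-\tilde\varepsilon(h,B_{\rm eff}))(\chi_{R/2}\rho_{\rm eff})\Big), \tag{5.26}
\]

and (again assuming $R > 2\lambda$)

\[
\begin{aligned}
\Delta_2^+E := \Delta_2^+E(R)
&:= Z^{-1}(1+\tilde\varepsilon(h,B_{\rm eff}))^2\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy - Z^{-1}\iint\rho_{\rm eff}(x)|x-y|^{-1}(\Theta_R\rho_\gamma)(y)\,dx\,dy \\
&\le Z^{-1}\iint\rho_{\rm eff}(x)|x-y|^{-1}\Big[\rho_{\rm eff}(y) - (\chi_{R/2}\rho_{\rm eff}) * \theta_\lambda^2(y)\Big]\,dx\,dy \\
&\quad + c\big(\tilde\varepsilon(h,B_{\rm eff}) + \tilde\varepsilon(h,B_{\rm eff})^2\big)Z^{-1}\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy.
\end{aligned} \tag{5.27}
\]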

5.3. Properties of the potentials. In proving our main result we shall apply Theorem 4.1 with $V$ replaced by $V^-_{a,R}$ for the lower bound and with $\Theta_RV_{\rm eff}$ for the upper bound. In order to do this we must show that these potentials satisfy the necessary conditions of Theorem 4.1.

Lemma 5. If $B(0) \ge k\|B\|$ then there exist constants $C_0 > 0$ and $K > 0$ (depending on $k$) such that if $B$ satisfies (5.8) (with this constant $K$) we have

\[
[V_{\rm eff}(x)]_- \le C_0\min\{|x|^{-1},|x|^{-4}\}, \tag{5.28}
\]

\[
Z^{-1}\rho_{\rm eff}(x) \le C_0\min\{|x|^{-3/2},|x|^{-2}\}, \tag{5.29}
\]

\[
Z^{-1}s\mu \le C_0(Z/N), \tag{5.30}
\]

where $s = s(\|B\|,Z)$.

Proof. It is clear from (5.18) that $[V_{\rm eff}(x)]_- \le |x|^{-1}$. We consider $|x| \le r$ for some $r > 0$. Using (5.8) we obtain that on this set

\[
B_0 := k\|B\|e^{-K^{-1}s^{-1}r} \le B(0)e^{-L^{-1}r} \le B(x).
\]

Consider now the magnetic function $\tilde B_r : \mathbb{R}^3 \to \mathbb{R}^3$ which is equal to $B(x)$ for $|x| \le r$ and which is constantly equal to $B_0$ if $|x| > r$. We may now study the MTF theory of atoms in this 'magnetic' field. (The reader may worry that we have not defined a magnetic field, but only a scalar function. The observation is that, although this was not explicit in [LSY-II], MTF theory makes sense for any locally bounded scalar function $B(x)$.) It now follows from Theorem 4.11 in [LSY-II] that the support of this new MTF atom is bounded above by (see (4.32) in [LSY-II])

\[
r_{\max} \le c\max\big\{ZB_0^{-1},\ Z^{1/5}B_0^{-2/5}\big\},
\]

where $c > 0$ is a universal constant. This means that the new density and the negative part of the new effective potential vanish outside this radius (recall the MTF equation (5.1) relating the density and the effective potential). Since $B$ and $\tilde B_r$ agree for $|x| \le r$ we conclude by uniqueness of the minimizer to the MTF equations (5.1) and (5.2) that the original atom has radius $r_{\max}$ if

\[
r_{\max} \le r. \tag{5.31}
\]

We shall now show that we may choose $r$ such that this condition is satisfied. We shall attempt to make a choice consistent with

\[
e^{K^{-1}s^{-1}r} \le 2. \tag{5.32}
\]

Then $k\|B\|/2 \le B_0 \le \|B\|$ and $r_{\max} \le C_1s(\|B\|,Z)$ if $\|B\| > CZ^{4/3}$ for some $C, C_1 > 0$ depending on $k$. If we choose $r = C_1s$ we have satisfied (5.31) and it is clear that if $K$ is large enough then (5.32) is also satisfied. We have thus proved that if $\|B\| > CZ^{4/3}$ then

\[
[V^{\rm MTF}(x)]_- = 0
\]

if $|x| \ge C_1s$. Recalling the definition (5.18) of $V_{\rm eff}$ this identity implies (5.28) if $\|B\| > CZ^{4/3}$. We now turn to the case $\|B\| \le CZ^{4/3}$. Since $\partial_2P(B,W) \ge cW^{3/2}$, we see from (5.3) and (5.1) that for $x \ne 0$, $V^{\rm MTF}(x)$ satisfies

\[
-(4\pi)^{-1}\Delta V^{\rm MTF}(x) = \rho^{\rm MTF}(x) = \partial_2P\big(B(x),[V^{\rm MTF}(x)]_-\big) \ge c\,[V^{\rm MTF}(x)]_-^{3/2}.
\]

Since $\Delta|x|^{-4} = c(|x|^{-4})^{3/2}$ it follows from a simple comparison argument, using that $V^{\rm MTF}(x) \ge -c|x|^{-4}$ for small enough $|x|$ and that $V^{\rm MTF}(x) \ge -Z|x|^{-1} \to 0$ as $|x| \to \infty$, that $V^{\rm MTF}(x) \ge -c|x|^{-4}$ for all $x \ne 0$. This is true for all $B$. If we now use that $\|B\| \le CZ^{4/3}$ and hence $s \ge CZ^{-1/3}$, it then also follows that $V_{\rm eff}(x) \ge -C|x|^{-4}$. We have thus proved (5.28). The Thomas-Fermi equation (5.1) implies that

\[
\rho^{\rm MTF}(x) = \partial_2P\big(B(x),[V^{\rm MTF}(x)]_-\big) \le c\|B\|\,[V^{\rm MTF}(x)]_-^{1/2} + c\,[V^{\rm MTF}(x)]_-^{3/2}.
\]

If we insert the bound $[V^{\rm MTF}(x)]_- \le c|x|^{-4}$, we obtain

\[
\rho_{\rm eff}(x) = s^3\rho^{\rm MTF}(sx) \le cs\|B\|\,|x|^{-2} + cs^{-3}|x|^{-6}, \tag{5.33}
\]

while the bound $[V^{\rm MTF}(x)]_- \le Z|x|^{-1}$ gives

\[
\rho_{\rm eff}(x) \le cs^{5/2}\|B\|Z^{1/2}|x|^{-1/2} + cs^{3/2}Z^{3/2}|x|^{-3/2}. \tag{5.34}
\]

If $\|B\| \le CZ^{4/3}$ we arrive at (5.29) using the bound (5.33) for large $|x|$ and (5.34) for small $|x|$. If $\|B\| \ge CZ^{4/3}$ we prove (5.29) using (5.34) and that, as proved above, $\rho_{\rm eff}(x) = 0$ if $|x| \ge C$. In order to prove the bound on $\mu$ we observe from (5.1) and (5.2) that $\rho^{\rm MTF}(x) = 0$ if $Z|x|^{-1} \le \mu$. Thus from (5.29) we find that if $\mu \ne 0$ then

\[
N = \int\rho_{\rm eff} \le cZ\int_{|x|\le\mu^{-1}Zs^{-1}}|x|^{-2}\,dx = c\mu^{-1}Z^2s^{-1},
\]

which implies (5.30).

We note that the bound in Lemma 5 on $\rho_{\rm eff}$ is not integrable. It follows, however, from Theorem 4.9 in [LSY-II] that $Z^{-1}\int\rho_{\rm eff}(x)\,dx \le 1$. In fact, it follows from the proof of that theorem that

\[
Z^{-1}\rho_{\rm eff} * |x|^{-1} \le |x|^{-1}. \tag{5.35}
\]

We shall now prove a stronger bound than (5.35) for small $|x|$.

Lemma 6. With the same assumptions as in Lemma 5 and if $a > 0$, we obtain the estimates

\[
Z^{-1}\rho_{\rm eff} * |x|^{-1} \le C\min\{1,|x|^{-1}\} \tag{5.36}
\]

and

\[
\int\big|\nabla\big(Z^{-1}\Theta_R(x)\,\rho_{\rm eff} * \varphi_a * |x|^{-1}\big)\big|\,dx \le CR^2
\quad\text{and}\quad
\int\big|\nabla\big(Z^{-1}\Theta_R(x)\,\rho_{\rm eff} * |x|^{-1}\big)\big|\,dx \le CR^2. \tag{5.37}
\]

Proof. Considering (5.35) it is enough, in order to prove (5.36), to show that $Z^{-1}\rho_{\rm eff} * |x|^{-1} \le C$ for $|x| \le 1$. Using (5.29) and $\int\rho_{\rm eff} \le N$ we find for $|x| \le 1$,

\[
Z^{-1}\rho_{\rm eff} * |x|^{-1} \le C\int_{|y|\le2}|y|^{-3/2}|x-y|^{-1}\,dy + N/Z \le C.
\]

To prove (5.37) we write

\[
\int\big|\nabla\big(Z^{-1}\Theta_R(x)\,\rho_{\rm eff} * |x|^{-1}\big)\big|\,dx \le R^{-1}Z^{-1}\int_{|x|<2R}\rho_{\rm eff} * |x|^{-1}\,dx + Z^{-1}\int_{|x|<2R}\rho_{\rm eff} * |x|^{-2}\,dx.
\]

Inserting the bounds (5.29) and (5.36) we obtain (5.37). Note that bounds similar to (5.29) and (5.36), possibly with different constants, hold also if $\rho_{\rm eff}$ is replaced by $\rho_{\rm eff} * \varphi_a$. In fact, to prove (5.36) for $\rho_{\rm eff} * \varphi_a$ (with the same constant) simply note that $\rho_{\rm eff} * |x|^{-1}$ is superharmonic. To prove (5.29) for $\rho_{\rm eff} * \varphi_a$ one simply computes the convolution on both sides of (5.29). We are now ready to control the quantities in (4.4–4.7).

Lemma 7. There exists a constant $C > 0$ (depending only on $k$) such that if $R > 1$ and $a < 1$ we have the estimates

\[
|E_{\rm scl}(h,B_{\rm eff},V^-_{a,R})| \ge C\big(\|B_{\rm eff}\|h^{-2} + h^{-3}\big), \tag{5.38}
\]

\[
|E_{\rm scl}(h,B_{\rm eff},\Theta_RV_{\rm eff})| \ge C\big(\|B_{\rm eff}\|h^{-2} + h^{-3}\big), \tag{5.39}
\]

\[
E_{h,B_{\rm eff}}([V^-_{a,R}]_-) \le C\big(\|B_{\rm eff}\|h^{-2} + h^{-3}\big), \tag{5.40}
\]

and

\[
E_{h,B_{\rm eff}}([\Theta_RV_{\rm eff}]_-) \le E_{h,B_{\rm eff}}([V_{\rm eff}]_-) \le C\big(\|B_{\rm eff}\|h^{-2} + h^{-3}\big). \tag{5.41}
\]

Proof. Since ρeff ∗|x|−1 is superharmonic we have ρeff ∗ϕa ∗|x|−1 ≤ ρeff ∗|x|−1 . Hence − ]− ≥ [Veff ]− ΘR . Using (5.36) and (5.30) and recalling that N/Z is bounded away [Va,R from zero we see that for |x| < C, [Veff ]− ΘR ≥ C|x|−1 . The estimates (5.38) and (5.39) easily follow from this together with the assumptions B(0) ≥ kkBk and L(Beff ) ≥ K from (5.15). The estimate (5.41) is an immediate consequence of (5.28). In order to prove (5.40) observe that − (x) = Veff ∗ ϕa (x) − |x|−1 + ϕa ∗ |x|−1 ΘR (x) Va,R ≥ −[Veff ∗ ϕa (x)]− − ||x|−1 − ϕa ∗ |x|−1 |. From Jensen’s inequality (note that t 7→ [t]− is convex) we conclude that − ]− ) ≤ Eh,Beff ([Veff ]− ) + Eh,Beff (|x|−1 − ϕa ∗ |x|−1 ). Eh,Beff ([Va,R

Since $\varphi_a$ is supported for $|x| < ca$, is spherically symmetric, and has integral 1, it follows from Newton's Theorem that $|x|^{-1} - \varphi_a * |x|^{-1} = 0$ if $|x| > ca$. Since $0 < |x|^{-1} - \varphi_a * |x|^{-1} \le |x|^{-1}$ the estimate (5.40) follows immediately.
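Newton's Theorem, as used here and again in the estimate of the second lower-bound error term, can be illustrated with a concrete smearing function. The sketch below replaces $\varphi_a$ by the normalized indicator of a ball of radius $a$. This is a stand-in chosen for its closed-form potential, not the smooth $\varphi$ of the text, and the function names are hypothetical. It checks that $|x|^{-1} - \varphi_a * |x|^{-1}$ vanishes outside the ball and that its $L^p$ norm scales like $a^{(3-p)/p}$.

```python
import math

def smeared_coulomb(r, a):
    # (phi_a * |x|^-1)(r) for a uniform unit-mass ball of radius a:
    # equal to 1/r outside the ball (Newton's Theorem), quadratic inside.
    if r >= a:
        return 1.0 / r
    return (3.0 * a * a - r * r) / (2.0 * a ** 3)

def lp_norm_of_difference(a, p, n=20000):
    # L^p norm of |x|^-1 - phi_a * |x|^-1, by the midpoint rule in the radial
    # variable on [0, a]; the integrand is zero for r > a.
    dr = a / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        diff = 1.0 / r - smeared_coulomb(r, a)
        total += abs(diff) ** p * 4.0 * math.pi * r * r * dr
    return total ** (1.0 / p)

for p in (1.5, 2.5):
    ratio = lp_norm_of_difference(1.0, p) / lp_norm_of_difference(0.5, p)
    print(p, ratio, 2.0 ** ((3.0 - p) / p))
```

Halving $a$ divides the norm by $2^{(3-p)/p}$, for both $p = 3/2$ and $p = 5/2$, which is the scaling behind the factor $a^{(3-p)/p}$.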

Lemma 8. If $R > 1$ and $a < 1$ then both for $V = V^-_{a,R}$ and $V = V_{\rm eff}\Theta_R$ we have

\[
\frac{E_{h,B_{\rm eff}}\big([V]_\pm - [V(\cdot-y)]_\pm\big)}{\|B_{\rm eff}\|h^{-2} + h^{-3}} \le C\Big(|y|^{1/2} + |y|^{3/2}\big(1 + |\ln(|y|/R)|\big)\Big). \tag{5.42}
\]

Likewise,

\[
\frac{F_{h,B_{\rm eff}}\big([V]_\pm - [V(\cdot-y)]_\pm\big)}{\|B_{\rm eff}\|h^{-2} + h^{-3}} \le C\Big(|y|^{1/2} + |y|^{3/2}\big(1 + |\ln(|y|/R)|\big) + |y|R + hR^2\Big). \tag{5.43}
\]

Proof. Note that for all $V$, $|[V(x)]_\pm - [V(x-y)]_\pm| \le |V(x) - V(x-y)|$. Using the simple case, $\|u * v\|_p \le \|u\|_1\|v\|_p$, of Young's inequality for $p = 1$, $p = 3/2$ or $p = 5/2$ we find for both cases $V = V^-_{a,R}$ and $V = \Theta_RV_{\rm eff}$ that

\[
\|V(\cdot) - V(\cdot-y)\|_p^p \le (1 + (N/Z))^p\,\big\|\Theta_R(\cdot)|\cdot|^{-1} - \Theta_R(\cdot-y)|\cdot-y|^{-1}\big\|_p^p \le C
\begin{cases}
|y|R, & \text{if } p = 1,\\
|y|^{1/2}, & \text{if } p = 5/2,\\
|y|^{3/2}\big(1 + |\ln(|y|/R)|\big), & \text{if } p = 3/2.
\end{cases} \tag{5.44}
\]

This gives (5.42). We next turn to the estimates on $F_{h,B_{\rm eff}}$. First we note that the requirements (5.15) on $B_{\rm eff}$, $l(B_{\rm eff})$ and $L(B_{\rm eff})$ imply that $d(h,B_{\rm eff})^{-1} \le Ch^{-1}$. Thus for all $W$,

\[
F_{h,B_{\rm eff}}(W) \le C\Big[E_{h,B_{\rm eff}}(W) + \big(\|B_{\rm eff}\|h^{-2} + h^{-3}\big)\int|W| + \big(\|B_{\rm eff}\|h^{-2} + h^{-3}\big)h\int|\nabla W|\Big].
\]

In order to prove (5.43) it therefore remains to control $\int|V(x) - V(x-y)|\,dx$ and $\int|\nabla V(x) - \nabla V(x-y)|\,dx$ for the two cases $V = V^-_{a,R}$ and $V = \Theta_RV_{\rm eff}$. The first integral was controlled in (5.44). For the gradient we use (5.37) and the trivial estimate $\int_{|x|\le2R}|\nabla|x|^{-1}|\,dx \le CR^2$ to arrive at

\[
\int|\nabla V(x) - \nabla V(x-y)|\,dx \le \int\big(|\nabla V(x)| + |\nabla V(x-y)|\big)\,dx \le CR^2
\]

in both cases.

Corollary 9. There exist constants $C_\pm > 0$ (depending only on $k$) such that the MTF energy satisfies

\[
-C_-\big(\|B_{\rm eff}\|h^{-2} + h^{-3}\big) \ge Z^{-1}sE^{\rm MTF}(N,B,Z) \ge -C_+\big(\|B_{\rm eff}\|h^{-2} + h^{-3}\big). \tag{5.45}
\]

Note, in particular, that $E^{\rm MTF}(N,B,Z)$ is negative.

Proof. Recall that according to (5.19), $\|B_{\rm eff}\|h^{-2} + h^{-3} \sim Z$. We shall use the expression (5.16) for $E^{\rm MTF}$. From (5.28) and (5.39) we find that

\[
0 < CZ \le |E_{\rm scl}(h,B_{\rm eff},\Theta_RV_{\rm eff})| \le |E_{\rm scl}(h,B_{\rm eff},V_{\rm eff})| \le C^{-1}Z.
\]

We also see from (5.36) that

\[
0 \le Z^{-1}\iint\rho_{\rm eff}(x)|x-y|^{-1}\rho_{\rm eff}(y)\,dx\,dy \le C\int\rho_{\rm eff}(x)\,dx \le CN.
\]

Inserting these two bounds together with (5.30) into (5.16) proves the corollary.

5.4. Completing the proof of the MTF Theorem. We shall now put together the results of the previous sections to complete the proof of the main result, Theorem 5.2. We begin by proving an asymptotic lower bound on $E(N,B,Z)$.

Proof of the lower bound. From (5.22) and (5.16) we conclude that

\[
Z^{-1}sE(N,B,Z) \ge Z^{-1}sE^{\rm MTF}(N,B,Z) - \Delta_1^-E - \Delta_2^-E - \Delta_3^-E,
\]

where we have divided the error into three separate terms

\[
\Delta_1^-E := \Delta_1^-E(a,R) := \Big|\sum_k e_k(H_{a,R}) - E_{\rm scl}(h,B_{\rm eff},V^-_{a,R})\Big|,
\]

\[
\Delta_2^-E := \Delta_2^-E(a,R) := \big|E_{\rm scl}(h,B_{\rm eff},V^-_{a,R}) - E_{\rm scl}(h,B_{\rm eff},V_{\rm eff})\big|,
\]

and

\[
\Delta_3^-E := \Delta_3^-E(a,R) := NR^{-1} + c_\varphi Z^{-1}Na^{-1}.
\]

We shall study these error terms in the limit as $n \to \infty$. Recall that we are omitting the subscript $n$ on $Z$, $N$, and $B$. We shall prove that the three error terms satisfy

\[
\lim_{R\to\infty}\lim_{a\to0}\limsup_{n\to\infty}\frac{\Delta_j^-E}{\|B_{\rm eff}\|h^{-2} + h^{-3}} = 0. \tag{5.46}
\]

It then follows from Corollary 9 that

\[
\limsup_{n\to\infty}\frac{E(N,B,Z)}{E^{\rm MTF}(N,B,Z)} \le 1 \tag{5.47}
\]

(recall that the energies are negative), which is what we want to prove. That (5.46) holds for $\Delta_3^-E$, i.e., for $j = 3$, is trivial. It follows from (5.15) that we consider a limit where $B$ and $h$ satisfy the conditions needed for Theorem 4.1. Finally, Lemmas 7 and 8 show that the conditions on the potential $V^-_{a,R}$ are satisfied. Note that condition (4.9) is trivially satisfied for fixed $R$. We conclude that

\[
\limsup_{n\to\infty}\frac{\Delta_1^-E}{\|B_{\rm eff}\|h^{-2} + h^{-3}} = \limsup_{n\to\infty}\frac{\Delta_1^-E}{|E_{\rm scl}(h,B_{\rm eff},V^-_{a,R})|} = 0.
\]

It remains to prove (5.46) for 1− 2 E. We observe that

652

L. Erd˝os, J.P. Solovej

1− 2 E kBeff kh−2 + h−3 X ≤C k[Veff ]− − ΘR [Veff ]− kpp + kZ −1 ρeff ∗ ϕa ∗ |x|−1 − Z −1 ρeff ∗ |x|−1 kpp . p=3/2 p=5/2

Using the bound (5.28) on Veff it is obvious that lim lim lim sup k[Veff ]− − ΘR [Veff ]− kpp = 0

R→∞ a→0 n→∞

for both p = 3/2 and p = 5/2. Here there really is no dependence on the parameter a. Finally, we have

−1

Z ρeff ∗ |x|−1 − ϕa ∗ |x|−1 p ≤ kZ −1 ρeff k1 k|x|−1 − ϕa ∗ |x|−1 kp ≤ c(N/Z)a(3−p)/p , where we used Newton’s Theorem as in the proof of Lemma 7. This proves the limit (5.46) for 1− 2 E. Proof of the upper bound. We proceed analogously to the lower bound. From (5.25) and (5.16) we have Z −1 sE(N, B, Z) ≤ Z −1 sE MTF (N, B, Z) + 1+1 E + 1+2 E + 1+3 E + 1+4 E, where 1+1 E and 1+2 E were defined in (5.26)–(5.27) and 1+3 E := 1+3 E(R) := Tr γ [σ · (hp + Aeff )]2 + Veff ΘR − Escl (h, Beff , ΘR Veff ) and

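\[ \Delta_4^+ E := \Delta_4^+ E(R) := \big| E_{\rm scl}(h, B_{\rm eff}, \Theta_R V_{\rm eff}) - E_{\rm scl}(h, B_{\rm eff}, V_{\rm eff}) \big|. \]
As for the lower bound the goal is to prove that the four error terms satisfy
\[ \lim_{R\to\infty}\,\limsup_{n\to\infty}\, \frac{\Delta_j^+ E}{\|B_{\rm eff}\| h^{-2} + h^{-3}} = 0. \tag{5.48} \]
It then follows that
\[ \liminf_{n\to\infty} \frac{E(N,B,Z)}{E^{\rm MTF}(N,B,Z)} \ge 1, \tag{5.49} \]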

which together with (5.47) proves Theorem 5.2. As for the lower bound we conclude from the results of the previous sections that we can apply Theorem 4.2 to conclude that (5.48) holds for $\Delta_3^+ E$, i.e., for $j = 3$. That (5.48) holds for $\Delta_4^+ E$ is a simple consequence of (5.28). We turn now to $\Delta_1^+ E$. From (5.1) and (5.2) it is clear that $\rho^{\rm MTF}(x) = 0$ if $Z|x|^{-1} \le \mu$. Thus $\rho_{\rm eff}(x) = 0$ if $|x| > Z s^{-1}\mu^{-1}$. Thus if $R/2 \ge Z s^{-1}\mu^{-1}$ we get from (5.30),
\[ \Delta_1^+ E \le \tilde\varepsilon(h, B_{\rm eff})\, Z^{-1} s \mu \int \rho_{\rm eff} \le \tilde\varepsilon(h, B_{\rm eff})\, C_0 (Z/N) \int \rho_{\rm eff} \le C\, \tilde\varepsilon(h, B_{\rm eff})\, Z, \]
where in the last inequality we used that $\int \rho_{\rm eff} \le Z$ and the assumption that $N/Z$ is bounded below. On the other hand if $R/2 \le Z s^{-1}\mu^{-1}$, i.e., if $\mu s Z^{-1} \le 2R^{-1}$, then (assuming $\tilde\varepsilon(h, B_{\rm eff}) \le 1$) $\Delta_1^+ E \le 2R^{-1}Z$. Thus we have proved that


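\[ \Delta_1^+ E \le C Z \min\big\{ \tilde\varepsilon(h, B_{\rm eff}),\, R^{-1} \big\}. \]
Recalling (5.19) and $\tilde\varepsilon(h, B_{\rm eff}) \to 0$ as $n \to \infty$ we conclude (5.48) for $\Delta_1^+ E$.

It remains to consider $\Delta_2^+ E$. Assuming that $\tilde\varepsilon(h, B_{\rm eff}) < 1$ and $\lambda \le R/2$,
\[ \Delta_2^+ E \le Z^{-1} \iint_{|y| \ge R/2} \rho_{\rm eff}(x) |x-y|^{-1} \rho_{\rm eff}(y)\, dx\, dy + Z^{-1} \iint \rho_{\rm eff}(x) |x-y|^{-1} \Big[ (\chi_{R/2}\rho_{\rm eff})(y) - (\chi_{R/2}\rho_{\rm eff}) * \theta_\lambda^2 (y) \Big] dx\, dy + C\, \tilde\varepsilon(h, B_{\rm eff})\, Z, \]
where we estimated the last term in $\Delta_2^+ E$ using (5.36) and $\int \rho_{\rm eff} \le Z$. From (5.36) and $\int \rho_{\rm eff} \le Z$ we also see that
\[ Z^{-1} \iint_{|y| \ge R/2} \rho_{\rm eff}(x) |x-y|^{-1} \rho_{\rm eff}(y)\, dx\, dy \le C Z R^{-1}. \]
Finally, using (5.29) we see that $\| Z^{-1}\rho_{\rm eff} \|_p < C$ for $3/2 < p < 2$. Thus using Hölder and Young’s inequalities we obtain
\[ Z^{-1} \iint \rho_{\rm eff}(x) |x-y|^{-1} \Big[ (\chi_{R/2}\rho_{\rm eff})(y) - (\chi_{R/2}\rho_{\rm eff}) * \theta_\lambda^2 (y) \Big] dx\, dy \le \| Z^{-1}\rho_{\rm eff} \|_p \, \big\| (\chi_{R/2}\rho_{\rm eff}) * \big( |x|^{-1} - \theta_\lambda^2 * |x|^{-1} \big) \big\|_q \]
\[ \le C \, \| \chi_{R/2}\rho_{\rm eff} \|_1 \, \big\| |x|^{-1} - \theta_\lambda^2 * |x|^{-1} \big\|_q \le C Z \lambda^{(3-q)/q}, \]
where $p^{-1} + q^{-1} = 1$, so that $2 < q < 3$. As before we used Newton’s Theorem to conclude that since $\theta_\lambda$ is supported on $|x| \le 2\lambda$, is spherically symmetric and has integral one, then $|x|^{-1} - \theta_\lambda^2 * |x|^{-1}$ vanishes for $|x| > 2\lambda$ and is bounded by $|x|^{-1}$ for $|x| \le 2\lambda$. Putting these estimates together gives
\[ \Delta_2^+ E \le C Z \big( R^{-1} + \lambda^{2} + \tilde\varepsilon(h, B_{\rm eff}) \big). \]
Since $\tilde\varepsilon(h, B_{\rm eff}) \to 0$ and $\lambda \to 0$ as $n \to \infty$, we see from (5.19) that (5.48) holds also for $\Delta_2^+ E$.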

A. The Geometry of the Three Dimensional Magnetic Field

In this Appendix we recall two results from [ES-I] related to the geometry of a nonhomogeneous three dimensional magnetic field. Here we just give the statements and the necessary notations for the reader’s convenience; the proofs are found in [ES-I]. The following proposition will be used to approximate a general magnetic field by a constant direction field. We recall the definitions $l(\mathbf{B})^{-1} = \| \nabla(\mathbf{B}/B) \|$ and $L(\mathbf{B})^{-1} = \| |\nabla B| / B \|$.


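Proposition 10. Consider an arbitrary cube $\Omega \subset \mathbb{R}^3$ with center $Q$ and edge length $\lambda$ and a nonvanishing $C^1$ magnetic (divergence free) field $\mathbf{B} : \mathbb{R}^3 \to \mathbb{R}^3$. Assume that
\[ a \lambda\, l(\mathbf{B})^{-1} \le 1. \tag{A.1} \]
Then there exists a magnetic (divergence free) field $\tilde{\mathbf{B}}$, with constant direction parallel to the field at the center $Q$ of $\Omega$, such that for all $x \in \Omega$,
\[ |\mathbf{B}(x) - \tilde{\mathbf{B}}(x)| \le \lambda\, l(\mathbf{B})^{-1} \Big( \sup_{|x-Q| \le 5\lambda} B(x) \Big) \Big( b + \frac{a}{2} \lambda \big( l(\mathbf{B})^{-1} + L(\mathbf{B})^{-1} \big) \Big) \tag{A.2} \]
and
\[ |\nabla \tilde{\mathbf{B}}(x)| \le |\nabla \mathbf{B}(x)| \le \Big( \sup_{|x-Q| \le 5\lambda} B(x) \Big) \big( L(\mathbf{B})^{-1} + l(\mathbf{B})^{-1} \big). \tag{A.3} \]
Here
\[ a := 6 + 3\sqrt{3}, \qquad b := \frac{\sqrt{3} + 4\sqrt{6}}{\sqrt{2}}. \tag{A.4} \]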

Remarks. (i) The assumption (A.1) is a geometric condition, which states that the field lines of the field $\mathbf{B}$ should not vary too fast over the scale of the cube.

(ii) In our application, where typically $l \gg \lambda$, the approximation in (A.2) will be better than the straightforward choice $\tilde{\mathbf{B}}(x) := \mathbf{B}(Q)$ (constant field), since that would yield only $|\mathbf{B}(x) - \mathbf{B}(Q)| \le (\sqrt{3}/2)\lambda \sup|\nabla \mathbf{B}|$, which is of order $\sup|\mathbf{B}|\, \lambda (l^{-1} + L^{-1})$. This is worse by a factor of $l\lambda^{-1} \gg 1$ than the similar term in (A.2).

We shall, indeed, also need approximations of the magnetic field by a constant field and not just a constant direction field. In order to keep the same accuracy in the approximation we must restrict to a smaller region. It turns out that we can cover the cube by parallel cylinders such that within each of these we can, without losing in the approximation, approximate the magnetic field by a constant field along the cylinder axis. To formulate this more precisely we choose an orthonormal coordinate system $\{\xi_i\}_{i=1}^3$ in $\mathbb{R}^3$, such that the center $Q$ of the cube is the origin and that $\mathbf{B}(Q) = \mathbf{B}(0)$ points in the positive third direction. Note that the sides of $\Omega$ need not be parallel with the coordinate planes in this new coordinate system. We shall refer to the plane $\mathcal{P} := \{ \xi : \xi_3 = 0 \}$ as the base plane of the cube. We consider cylinders, $C_P$, given in this new coordinate system by
\[ C_P = \{ \xi : |\xi_\perp - P| \le w, \ |\xi_3| \le \sqrt{3}\lambda/2 \}, \tag{A.5} \]
where $P \in \mathcal{P}$ and $w > 0$ (here $\xi_\perp := (\xi_1, \xi_2, 0)$). The point $P$ is called the center of the cylinder. Note that the cylinders are aligned along $\mathbf{B}(Q)$, the magnetic field at the center of the cube, and that the union of all these cylinders covers $\Omega$. Moreover, all the cylinders $C_P$ such that $C_P \cap \Omega \ne \emptyset$ are subsets of the larger cube $\Omega'$, that, in the new coordinate system, is defined by $[-w - \sqrt{3}\lambda/2,\ w + \sqrt{3}\lambda/2]^3$.


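Corollary 11. Let $\Omega$, $\Omega'$ and $C_P$, for $P \in \mathcal{P}$, be as defined above and let $a$ and $b$ be as in (A.4). Assume that the magnetic field $\mathbf{B}$ satisfies
\[ a (2w + \sqrt{3}\lambda)\, l(\mathbf{B})^{-1} \le 1. \tag{A.6} \]
Then within each $C_P$ such that $C_P \cap \Omega \ne \emptyset$, one can approximate the magnetic field $\mathbf{B}$ by a constant field, $\tilde{\mathbf{B}}_P$, pointing along the axis of the cylinder, with the following precision:
\[ |\mathbf{B}(x) - \tilde{\mathbf{B}}_P| \le (2w + \sqrt{3}\lambda)\, l(\mathbf{B})^{-1} \Big( \sup_{|x-Q| < 5(\sqrt{3}\lambda + 2w)} B(x) \Big) \Big( b + \frac{a}{2} (2w + \sqrt{3}\lambda) \big( l(\mathbf{B})^{-1} + L(\mathbf{B})^{-1} \big) \Big) + w \Big( \sup_{|x-Q| < 5(\sqrt{3}\lambda + 2w)} B(x) \Big) \big( l(\mathbf{B})^{-1} + L(\mathbf{B})^{-1} \big), \tag{A.7} \]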

for $x \in C_P$.

Remark. Note that it is only the radius of the cylinder that appears in the last term in (A.7). This is important since in our applications, typically $w \ll \lambda$, i.e., the cylinder is very thin compared to the cube. The size $\lambda$ of the cube appears only together with $l(\mathbf{B})^{-1}$ which, in our setup, will typically be small. It is in this way that we will achieve that the constant field approximation within $C_P$ is as good as the constant direction field approximation within $\Omega$.

The corollary of the previous proposition will provide us with a good approximating constant field within a cylinder (see (A.5)). Here we show that the difference field (which is supposed to be small) can be generated by a small vector potential within this cylinder. In general, if one is given a magnetic field within a domain, then there exists a vector potential bounded by the supremum of the field times the largest linear size of the domain (see (A.10) below). For instance, one can choose the gauge given by the Poincaré formula (see below). This gives a very crude bound for domains which are elongated cylinders. The crucial fact is that, assuming some bound on the first derivative of the field in addition to its supremum bound, one can choose a gauge independent of the longest linear size of the domain. In particular, we can choose a gauge within our cylinder which is bounded by a constant independent of the length of the cylinder.

Proposition 12. Let $\beta : \mathbb{R}^3 \to \mathbb{R}^3$ be a $C^1$ magnetic field and consider a cylinder $C$ with radius $w$. Then there exists a $C^0$ vector potential $\alpha : \mathbb{R}^3 \to \mathbb{R}^3$, such that $\nabla \times \alpha = \beta$ (in distribution sense) and
\[ \sup_C \|\alpha\| \le 4 \big( w \sup_C \|\beta\| + w^2 \sup_C \|\nabla\beta\| \big). \tag{A.8} \]
The bound is uniform in the length of the cylinder.

Remark. For comparison, the Poincaré formula
\[ \alpha(y) = \int_0^1 t \big( \beta(ty) \times y \big)\, dt \tag{A.9} \]
obviously yields a gauge $\alpha$, for any domain $D$, satisfying the bound
\[ \sup_D \|\alpha\| \le \sqrt{3}\, \sup_D \|\beta\| \cdot {\rm diam}(D). \tag{A.10} \]
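The Poincaré formula (A.9) can be checked numerically on a simple example. The sketch below is only an illustration: the linear divergence-free test field $\beta(x,y,z) = (y,z,x)$, the evaluation point, and all function names are assumptions made here and do not come from [ES-I]. It evaluates (A.9) by Gauss-Legendre quadrature and compares a finite-difference curl of $\alpha$ with $\beta$.

```python
import numpy as np

def beta(y):
    """Hypothetical divergence-free test field beta(x, y, z) = (y, z, x)."""
    return np.array([y[1], y[2], y[0]])

# Gauss-Legendre nodes mapped from [-1, 1] to [0, 1] for the t-integral in (A.9).
_nodes, _weights = np.polynomial.legendre.leggauss(8)
T = 0.5 * (_nodes + 1.0)
W = 0.5 * _weights

def poincare_gauge(y):
    """alpha(y) = int_0^1 t * (beta(t*y) x y) dt, evaluated by quadrature."""
    y = np.asarray(y, dtype=float)
    return sum(w * t * np.cross(beta(t * y), y) for t, w in zip(T, W))

def curl(field, y, eps=1e-5):
    """Central-difference curl of a vector field at the point y."""
    y = np.asarray(y, dtype=float)
    jac = np.zeros((3, 3))  # jac[i, j] = d field_i / d y_j
    for j in range(3):
        step = np.zeros(3)
        step[j] = eps
        jac[:, j] = (field(y + step) - field(y - step)) / (2 * eps)
    return np.array([jac[2, 1] - jac[1, 2],
                     jac[0, 2] - jac[2, 0],
                     jac[1, 0] - jac[0, 1]])

point = np.array([0.3, -0.7, 1.1])
curl_alpha = curl(poincare_gauge, point)
print(curl_alpha, beta(point))
```

For this linear field the integral can also be done by hand, $\alpha(y) = \tfrac13\, \beta(y) \times y$, so $\|\alpha\|$ grows with the distance from the origin. This is the diameter dependence in (A.10) that Proposition 12 avoids for thin cylinders.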


Acknowledgement. L. E. gratefully acknowledges financial support from the Forschungsinstitut für Mathematik, ETH, Zürich, where this work was started. He is also grateful for the hospitality and support of Aarhus University during his visits. The authors wish to thank the referee for the careful reading of the manuscript and the many helpful remarks and suggestions.

References

[AC] Aharonov, Y. and Casher, A.: Ground state of spin-1/2 charged particle in a two-dimensional magnetic field. Phys. Rev. A19, 2461–2462 (1979)
[CdV] Colin de Verdière, Y.: L’asymptotique de Weyl pour les bouteilles magnétiques. Commun. Math. Phys. 105, 327–335 (1986)
[CFKS] Cycon, H.L., Froese, R.G., Kirsch, W. and Simon, B.: Schrödinger Operators with Application to Quantum Mechanics and Global Geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[E-1995] Erdős, L.: Magnetic Lieb-Thirring inequalities. Commun. Math. Phys. 170, 629–668 (1995)
[ES-I] Erdős, L. and Solovej, J.P.: Semiclassical eigenvalue estimates for the Pauli operator with strong non-homogeneous magnetic fields. I. Non-asymptotic Lieb-Thirring type estimate. Preprint, 1996
[HR] Helffer, B. and Robert, D.: Propriétés asymptotiques du spectre d’opérateurs pseudodifferentiels sur R^n. Commun. PDE 7, 795–882 (1982)
[I] Ivrii, V.: Semiclassical microlocal analysis and precise spectral asymptotics. Book manuscript
[L-1981] Lieb, E.H.: A variational principle for many-fermion systems. Phys. Rev. Lett. 46, 457–459 (1981); Erratum 47, 69 (1981)
[LO] Lieb, E.H. and Oxford, S.: Improved lower bound on the indirect Coulomb energy. Int. J. Quant. Chem. 19, 427–439 (1981)
[LSY-II] Lieb, E.H., Solovej, J.P. and Yngvason, J.: Asymptotics of heavy atoms in high magnetic fields: II. Semiclassical regions. Commun. Math. Phys. 161, 77–124 (1994)
[LSY-III] Lieb, E.H., Solovej, J.P. and Yngvason, J.: Ground states of large quantum dots in magnetic fields. Phys. Rev. B 51, 10646–10665 (1995)
[Mat-1994] Matsumoto, H.: Semiclassical asymptotics of eigenvalue distributions for Schrödinger operators with magnetic fields. Commun. in PDE. 19 (5/6), 719–759 (1994)
[R] Robert, D.: Autour de l’Approximation Semiclassique. Progr. Math. 68, Boston: Birkhäuser, 1987
[Sob-1986] Sobolev, A.: Asymptotic behavior of the energy levels of a quantum particle in a homogeneous magnetic field, perturbed by a decreasing electric field. J. Sov. Math. 35, 2201–2212 (1986)
[Sob-1994] Sobolev, A.: The quasi-classical asymptotics of local Riesz means for the Schrödinger operator in a strong homogeneous magnetic field. Duke J. Math. 74, 319–428 (1994)
[Sob-1995] Sobolev, A.: Quasi-classical asymptotics of local Riesz means for the Schrödinger operator in a moderate magnetic field. Ann. Inst. H. Poincaré Phys. Théor. 62 no.4, 325–360 (1995)
[Sob-1996(1)] Sobolev, A.: On the Lieb-Thirring estimates for the Pauli operator. Duke J. Math. 82, 607–635 (1996)
[Sol] Solnyshkin, S.N.: The asymptotic behavior of the energy of bound states of the Schrödinger operator in the presence of electric and magnetic fields. Probl. Mat. Fiz. 10, 266–278 (1982)
[T] Tamura, H.: Asymptotic distribution of eigenvalues for Schrödinger operators with magnetic fields. Nagoya Math. J. 105, no. 10 49–69 (1987)

Communicated by B. Simon

Commun. Math. Phys. 188, 657 – 689 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Determinant Representation for Dynamical Correlation Functions of the Quantum Nonlinear Schrödinger Equation

T. Kojima¹,*, V. E. Korepin², N. A. Slavnov³

¹ Research Institute for Mathematical Sciences, Kyoto University, Kyoto 606, Japan. E-mail: [email protected]
² Institute for Theoretical Physics, State University of New York at Stony Brook, Stony Brook, NY 11794-3840, USA. E-mail: [email protected]
³ Steklov Mathematical Institute, Gubkina 8, Moscow 117966, Russia. E-mail: [email protected]
* Research Fellow of the Japan Society for the Promotion of Science.

Received: 11 January 1997 / Accepted: 21 February 1997

Abstract: Painlevé analysis of correlation functions of the impenetrable Bose gas by M. Jimbo, T. Miwa, Y. Mori and M. Sato [1] was based on the determinant representation of these correlation functions obtained by A. Lenard [2]. The impenetrable Bose gas is the free fermionic case of the quantum nonlinear Schrödinger equation. In this paper we generalize the Lenard determinant representation for $\langle \psi(0,0)\psi^\dagger(x,t) \rangle$ to the non-free fermionic case. We also include time and temperature dependence. In forthcoming publications we shall perform the JMMS analysis of this correlation function. This will give us a completely integrable equation and asymptotics for the quantum correlation function of interacting fermions.

1. Introduction

We consider exactly solvable models of statistical mechanics in one space and one time dimension. The Quantum Inverse Scattering Method and Algebraic Bethe Ansatz are effective methods for a description of the spectrum of these models. Our aim is the evaluation of correlation functions of exactly solvable models. Our approach is based on the determinant representation for correlation functions. It consists of a few steps: first, the correlation function is represented as a determinant of a Fredholm integral operator; second, the Fredholm integral operator is described by a classical completely integrable equation; third, the classical completely integrable equation is solved by means of the Riemann-Hilbert problem. This permits us to evaluate the long distance and large time asymptotics of the correlation function. The method is described in [3]. The most interesting correlation functions are time dependent correlation functions. The determinant representation for time dependent correlation functions was known only for the impenetrable Bose gas (the spectrum of the Hamiltonian of this model is equivalent to free fermions). In this paper we have found the determinant representation

for the time dependent correlation function of local fields of the penetrable Bose gas. The main idea for the construction of the determinant representation is the following. We introduce auxiliary Bose fields (acting in the canonical Fock space) in order to remove the two body scattering matrix and to reduce the model to the free fermionic case. We want to emphasize that all the dual fields which we introduce commute (belong to the same Abelian sub-algebra). Therefore we do not have any ordering problem. This will also permit us to perform nonperturbative calculations, which are necessary for the derivation of the integrable equation for the correlation function.

First we shall discuss our model. The quantum nonlinear Schrödinger equation (equivalent to the Bose gas with delta-function interaction) can be described by the canonical Bose fields $\psi(x)$ and $\psi^\dagger(x)$ with the commutation relations
\[ [\psi(x), \psi^\dagger(y)] = \delta(x-y), \qquad [\psi(x), \psi(y)] = [\psi^\dagger(x), \psi^\dagger(y)] = 0, \tag{1.1} \]
acting in the Fock space. The Fock vacuum $|0\rangle$ and dual vector $\langle 0|$ are important. They are defined by the relations
\[ \psi(x)|0\rangle = 0, \qquad \langle 0|\psi^\dagger(x) = 0, \qquad \langle 0|0\rangle = 1. \tag{1.2} \]
The Hamiltonian of the model is
\[ H = \int dx \left( \partial_x \psi^\dagger(x)\, \partial_x \psi(x) + c\, \psi^\dagger(x)\psi^\dagger(x)\psi(x)\psi(x) - h\, \psi^\dagger(x)\psi(x) \right). \tag{1.3} \]
Here $c$ is the coupling constant and $h > 0$ is the chemical potential. We shall consider the repulsive case $0 < c \le \infty$. The spectrum of the model was first described by E. H. Lieb and W. Liniger [4, 5]. The Lax representation for the corresponding classical equation of motion
\[ i \frac{\partial}{\partial t} \psi = [\psi, H] = -\frac{\partial^2}{\partial x^2} \psi + 2c\, \psi^\dagger \psi \psi - h \psi, \tag{1.4} \]
was found by V. E. Zakharov and A. B. Shabat [6]. The Quantum Inverse Scattering Method for the model was formulated by L. D. Faddeev and E. K. Sklyanin [7]. In this paper we shall follow the notations of [3]. First the model is considered in a finite periodic box of length $L$. Later the thermodynamic limit is considered when the length of the box $L$ and the number of particles in the ground state go to infinity, with the ratio $N/L$ held fixed. The quantum nonlinear Schrödinger equation is equivalent to the Bose gas with delta-function interaction. In the sector with $N$ particles the Hamiltonian of the Bose gas is given by
\[ H_N = -\sum_{j=1}^{N} \frac{\partial^2}{\partial z_j^2} + 2c \sum_{1 \le j < k \le N} \delta(z_k - z_j) - N h. \tag{1.5} \]

Now a few words about the organization of the paper. In Sect. 2 we shall review the Algebraic Bethe Ansatz and collect all the known facts necessary for further calculations. In Sect. 3 we shall calculate the form factor of the local field in finite volume. In Sect. 4 we shall present the idea of summation with respect to all intermediate states. In Sect. 5 we introduce an auxiliary Bosonic Fock space and auxiliary Bose fields. This helps us to represent the correlation function as a determinant


in the finite volume. In Sect. 6 we consider the thermodynamic limit of the determinant representation for the correlation function. The length of the periodic box $L$ and the number of particles in the ground state go to infinity but their ratio remains fixed. This leads us to the main result of the paper (see formulæ (6.24)–(6.27)). The correlation function of local fields in the infinite volume is represented as a determinant of a Fredholm integral operator. For evaluation of the thermodynamic limit it is necessary to sum up singular expressions. Appendix A is devoted to these summations. In Appendix B we present a realization of quantum dual fields as linear combinations of the canonical Bose fields. Appendix C shows how to reduce the number of dual fields. Appendix D contains the determinant representation for the temperature correlation function. In forthcoming publications we shall use the determinant representation for the derivation of the completely integrable equation for correlation functions. Later we shall solve this equation by means of the Riemann-Hilbert problem and evaluate the long-distance asymptotics.

2. Algebraic Bethe Ansatz

Let us review some main features of the Algebraic Bethe Ansatz, which we shall use later. We consider the quantum nonlinear Schrödinger model. The starting point and central object of the Quantum Inverse Scattering Method is the R-matrix, which is a solution of the Yang-Baxter equation. For the case of the quantum nonlinear Schrödinger equation, it is of the form:
\[ R(\lambda, \mu) = \begin{pmatrix} f(\mu,\lambda) & 0 & 0 & 0 \\ 0 & g(\mu,\lambda) & 1 & 0 \\ 0 & 1 & g(\mu,\lambda) & 0 \\ 0 & 0 & 0 & f(\mu,\lambda) \end{pmatrix}, \tag{2.1} \]
where
\[ g(\lambda,\mu) = \frac{ic}{\lambda - \mu}, \qquad f(\lambda,\mu) = \frac{\lambda - \mu + ic}{\lambda - \mu}. \tag{2.2} \]
Later we shall also use functions
\[ h(\lambda,\mu) = \frac{\lambda - \mu + ic}{ic}, \qquad t(\lambda,\mu) = \frac{g(\lambda,\mu)}{h(\lambda,\mu)} = \frac{(ic)^2}{(\lambda-\mu)(\lambda-\mu+ic)}. \tag{2.3} \]
Another important object is the monodromy matrix
\[ T(\lambda) = \begin{pmatrix} A(\lambda) & B(\lambda) \\ C(\lambda) & D(\lambda) \end{pmatrix}. \tag{2.4} \]

The operators A, B, C, D are acting in the Fock space where the operator ψ(x) was defined. Their commutation relations are given by R(λ, µ) (T (λ) ⊗ T (µ)) = (T (µ) ⊗ T (λ)) R(λ, µ).

(2.5)
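The statement that the R-matrix (2.1) solves the Yang-Baxter equation can be sanity-checked numerically. The sketch below is an illustration only. It assumes the braid-type form $(I \otimes R(\lambda,\mu))(R(\lambda,\nu) \otimes I)(I \otimes R(\mu,\nu)) = (R(\mu,\nu) \otimes I)(I \otimes R(\lambda,\nu))(R(\lambda,\mu) \otimes I)$, which is the convention compatible with (2.5) but is not written out in the text. The function names and sample spectral parameters are hypothetical.

```python
import numpy as np

def g(lam, mu, c):
    """g(lam, mu) = ic / (lam - mu), as in (2.2)."""
    return 1j * c / (lam - mu)

def f(lam, mu, c):
    """f(lam, mu) = (lam - mu + ic) / (lam - mu), as in (2.2)."""
    return (lam - mu + 1j * c) / (lam - mu)

def r_matrix(lam, mu, c):
    """4x4 matrix R(lam, mu) of (2.1); note the swapped arguments f(mu, lam), g(mu, lam)."""
    ff, gg = f(mu, lam, c), g(mu, lam, c)
    return np.array([[ff, 0, 0, 0],
                     [0, gg, 1, 0],
                     [0, 1, gg, 0],
                     [0, 0, 0, ff]], dtype=complex)

def yang_baxter_defect(lam, mu, nu, c):
    """Largest entry of LHS - RHS for the assumed braid-type relation on C^2 x C^2 x C^2."""
    eye = np.eye(2)
    left = (np.kron(eye, r_matrix(lam, mu, c))
            @ np.kron(r_matrix(lam, nu, c), eye)
            @ np.kron(eye, r_matrix(mu, nu, c)))
    right = (np.kron(r_matrix(mu, nu, c), eye)
             @ np.kron(eye, r_matrix(lam, nu, c))
             @ np.kron(r_matrix(lam, mu, c), eye))
    return np.abs(left - right).max()

defect = yang_baxter_defect(0.7, -0.4, 1.9, c=1.3)
print(defect)
```

The check works because $f = 1 + g$, so $R(\lambda,\mu)$ is $g(\mu,\lambda)$ times the identity plus the permutation matrix on $\mathbb{C}^2 \otimes \mathbb{C}^2$.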

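These relations are written out explicitly in Sect. VII.1 of [3]. The hermiticity properties of $T(\lambda)$ are
\[ \sigma_x T^*(\bar\lambda) \sigma_x = T(\lambda), \tag{2.6} \]
so that $B^\dagger(\lambda) = C(\bar\lambda)$. The Hamiltonian of the model can be expressed in terms of $A(\lambda) + D(\lambda)$ by means of trace identities (Sect. VI.3 of [3]). The vacuum is an eigenvector of the diagonal elements of $T(\lambda)$,
\[ A(\lambda)|0\rangle = a(\lambda)|0\rangle; \qquad D(\lambda)|0\rangle = d(\lambda)|0\rangle; \tag{2.7} \]
\[ \langle 0|A(\lambda) = a(\lambda)\langle 0|; \qquad \langle 0|D(\lambda) = d(\lambda)\langle 0|; \tag{2.8} \]
\[ a(\lambda) = \exp\left\{ -\frac{iL\lambda}{2} \right\}; \qquad d(\lambda) = \exp\left\{ \frac{iL\lambda}{2} \right\}. \tag{2.9} \]
Later we shall also use the function
\[ r(\lambda) = \frac{a(\lambda)}{d(\lambda)} = e^{-i\lambda L}. \tag{2.10} \]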

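The operator $C(\lambda)$ annihilates the vacuum vector and the operator $B(\lambda)$ annihilates the dual vacuum:
\[ C(\lambda)|0\rangle = 0, \qquad \langle 0|B(\lambda) = 0. \tag{2.11} \]
The Hamiltonian of the model commutes with $A(\lambda) + D(\lambda)$ and they can be diagonalized simultaneously. The eigenvectors of the Hamiltonian are
\[ \prod_{j=1}^{N} B(\mu_j)|0\rangle, \qquad \text{and} \qquad \langle 0| \prod_{j=1}^{N} C(\mu_j), \tag{2.12} \]
if $\mu_j$ satisfy Bethe Equations
\[ \frac{a(\mu_j)}{d(\mu_j)} \prod_{\substack{k=1 \\ k \ne j}}^{N} \frac{f(\mu_j,\mu_k)}{f(\mu_k,\mu_j)} = 1, \qquad \text{or} \qquad \frac{a(\mu_j)}{d(\mu_j)} \prod_{k=1}^{N} \frac{h(\mu_j,\mu_k)}{h(\mu_k,\mu_j)} = (-1)^{N-1}. \tag{2.13} \]
It is convenient to rewrite (2.13) in logarithmic form. For the ground state
\[ \varphi_j + \pi \equiv L\mu_j + \sum_{k=1}^{N} i \ln\left( \frac{ic + \mu_j - \mu_k}{ic - \mu_j + \mu_k} \right) = 2\pi \left( j - \frac{N+1}{2} \right). \tag{2.14} \]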
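For small $N$ the ground state equations (2.14) can be solved numerically. The following is a minimal sketch under stated assumptions: on the principal branch, $i \ln\frac{ic + x}{ic - x} = 2\arctan(x/c)$ for real $x$ and $c > 0$, and the iteration is a damped Newton method started from the free-fermion momenta. The function names and the sample values $N = 4$, $L = 10$, $c = 1$ are hypothetical choices made for illustration.

```python
import numpy as np

def kernel(x, y, c):
    """K(x, y) = 2c / (c^2 + (x - y)^2)."""
    return 2.0 * c / (c**2 + (x - y)**2)

def residual(mu, L, c):
    """Left side minus right side of (2.14), using i*ln((ic+x)/(ic-x)) = 2*arctan(x/c)."""
    N = len(mu)
    diff = mu[:, None] - mu[None, :]
    quantum = 2.0 * np.pi * (np.arange(1, N + 1) - (N + 1) / 2.0)
    return L * mu + 2.0 * np.arctan(diff / c).sum(axis=1) - quantum

def jacobian(mu, L, c):
    """Matrix d(phi_j)/d(mu_k) = delta_jk * (L + sum_l K(mu_j, mu_l)) - K(mu_j, mu_k)."""
    K = kernel(mu[:, None], mu[None, :], c)
    return np.diag(L + K.sum(axis=1)) - K

def solve_ground_state(N, L, c, tol=1e-12, max_iter=100):
    """Damped Newton iteration started from the free-fermion momenta."""
    mu = 2.0 * np.pi * (np.arange(1, N + 1) - (N + 1) / 2.0) / L
    for _ in range(max_iter):
        res = residual(mu, L, c)
        if np.abs(res).max() < tol:
            break
        step = np.linalg.solve(jacobian(mu, L, c), res)
        damping = 1.0
        # Halve the step until the residual decreases (plain backtracking).
        while (np.abs(residual(mu - damping * step, L, c)).max() >= np.abs(res).max()
               and damping > 1e-6):
            damping *= 0.5
        mu = mu - damping * step
    return mu

mu = solve_ground_state(N=4, L=10.0, c=1.0)
print(mu)
```

The Jacobian used in the Newton step is the derivative of the left side of (2.14) with respect to the $\mu_k$. It is symmetric and strictly diagonally dominant with positive diagonal, hence positive definite, which is why the linear solve is well posed.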

It is proven in Sect. I.2 of [3] that solutions $\mu_j$ of Eq. (2.14) are real. The distribution of $\mu_j$ in the ground state in the thermodynamic limit can be described by a linear integral equation. The thermodynamic limit is defined in the following way: $N \to \infty$, $L \to \infty$ and $N/L = D$ is fixed. In this limit $\mu_j$ condense ($\mu_{j+1} - \mu_j = O(1/L)$) and fill the symmetric interval $[-q, q]$, where $q$ is the value of the spectral parameter on the Fermi surface. In the thermodynamic limit the function of local density $\rho(\mu)$ can be defined in the following way:
\[ \rho(\mu_j) = \lim \frac{1}{L(\mu_{j+1} - \mu_j)}. \tag{2.15} \]
The lim in the r.h.s. denotes the thermodynamic limit. This function satisfies the Lieb-Liniger integral equation
\[ \rho(\mu) - \frac{1}{2\pi} \int_{-q}^{q} K(\nu,\mu)\rho(\nu)\, d\nu = \frac{1}{2\pi}. \tag{2.16} \]
Here
\[ K(\nu,\mu) = \frac{2c}{c^2 + (\mu - \nu)^2}, \tag{2.17} \]
and
\[ D = \frac{N}{L} = \int_{-q}^{q} d\mu\, \rho(\mu). \tag{2.18} \]
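The integral equation (2.16) is a Fredholm equation of the second kind with a smooth kernel, so it can be approximated by quadrature. The sketch below is an illustration only: it replaces the integral by a Gauss-Legendre rule on $[-q, q]$ (a Nyström discretization) and solves the resulting linear system. The function name, the number of nodes and the sample values of $q$ and $c$ are hypothetical.

```python
import numpy as np

def lieb_liniger_density(q, c, n_nodes=64):
    """Solve rho(mu) - (1/2pi) int_{-q}^{q} K(nu, mu) rho(nu) d nu = 1/(2pi) on Gauss nodes."""
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    mu, weights = q * x, q * w  # map nodes and weights from [-1, 1] to [-q, q]
    K = 2.0 * c / (c**2 + (mu[:, None] - mu[None, :])**2)
    A = np.eye(n_nodes) - K * weights[None, :] / (2.0 * np.pi)
    rho = np.linalg.solve(A, np.full(n_nodes, 1.0 / (2.0 * np.pi)))
    density = np.dot(weights, rho)  # D = int rho(mu) d mu, Eq. (2.18)
    return mu, weights, rho, density

mu, weights, rho, D = lieb_liniger_density(q=1.0, c=2.0)
print(D)
```

In the limit $c \to \infty$ the kernel vanishes, so $\rho \to 1/(2\pi)$ and $D \to q/\pi$, which is the free fermionic value and gives a simple check of the discretization.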

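In such a way we have described the ground state. Now we can define the correlation function of the local fields
\[ \langle \psi(0,0)\psi^\dagger(x,t) \rangle = \lim \frac{ \langle 0| \prod_{j=1}^{N} C(\mu_j)\, \psi(0,0)\psi^\dagger(x,t) \prod_{j=1}^{N} B(\mu_j) |0\rangle }{ \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{j=1}^{N} B(\mu_j) |0\rangle }. \tag{2.19} \]
Here
\[ \psi^\dagger(x,t) = e^{iHt} \psi^\dagger(x,0) e^{-iHt}. \tag{2.20} \]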

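We shall use the notation $\mu_j$ for the ground state only. The square of the norm of the ground state wave function (denominator of the correlation function) was found in [8],
\[ \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{j=1}^{N} B(\mu_j) |0\rangle = c^N \Bigg( \prod_{N \ge j > k \ge 1} g(\mu_j,\mu_k) g(\mu_k,\mu_j) \Bigg) \Bigg( \prod_{j=1}^{N} \prod_{k=1}^{N} h(\mu_j,\mu_k) \Bigg) {\det}_N \left( \frac{\partial \varphi_j}{\partial \mu_k} \right). \tag{2.21} \]
Here $\partial\varphi_j / \partial\mu_k$ is the $N \times N$ matrix
\[ \frac{\partial \varphi_j}{\partial \mu_k} = \delta_{jk} \left[ L + \sum_{l=1}^{N} K(\mu_j,\mu_l) \right] - K(\mu_j,\mu_k). \tag{2.22} \]
Let us emphasize that $\det(\partial\varphi_j / \partial\mu_k) > 0$ (see Sect. I.2 of [3]). The thermodynamic limit of the square of the norm can be described by the following formula:
\[ \lim \left( \frac{ {\det}_N \left( \partial\varphi_j / \partial\mu_k \right) }{ \prod_{j=1}^{N} 2\pi L \rho(\mu_j) } \right) = \det\left( \hat I - \frac{1}{2\pi} \hat K \right), \tag{2.23} \]
where $\hat K$ is an integral operator acting on some trial function $f(\lambda)$ as
\[ (\hat K f)(\lambda) = \int_{-q}^{q} K(\lambda,\mu) f(\mu)\, d\mu. \tag{2.24} \]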

The proof can be found in [8] (see also Sect. X.4 of [3]). In order to calculate the correlation function we shall also need a description of excited states. We need to consider excited states which have one more particle than in the ground state,
\[ \prod_{j=1}^{N+1} B(\lambda_j)|0\rangle, \qquad \text{and} \qquad \langle 0| \prod_{j=1}^{N+1} C(\lambda_j), \tag{2.25} \]
where $\lambda_j$ have to satisfy Bethe Equations
\[ \frac{a(\lambda_j)}{d(\lambda_j)} \prod_{\substack{k=1 \\ k \ne j}}^{N+1} \frac{f(\lambda_j,\lambda_k)}{f(\lambda_k,\lambda_j)} = 1, \qquad \text{or} \qquad \frac{a(\lambda_j)}{d(\lambda_j)} \prod_{k=1}^{N+1} \frac{h(\lambda_j,\lambda_k)}{h(\lambda_k,\lambda_j)} = (-1)^{N}. \tag{2.26} \]
We shall further assume that the number of particles in the ground state $N$ is even. In order to write the logarithmic form of the Bethe Equations it is convenient to introduce
\[ \tilde\varphi_j \equiv L\lambda_j + \sum_{\substack{k=1 \\ k \ne j}}^{N+1} i \ln\left( \frac{\lambda_j - \lambda_k + ic}{\lambda_j - \lambda_k - ic} \right). \tag{2.27} \]

The Bethe equations can now be written as
\[ \tilde\varphi_j = 2\pi n_j, \tag{2.28} \]
where $n_j$ is an ordered set of different integer numbers $n_{j+1} > n_j$. One can prove that all $\lambda_j$ are real. In order to enumerate all the eigenstates in the sector with $N+1$ particles we have to consider all sets of ordered integers $n_j$. The square of the norm of the excited state is
\[ \langle 0| \prod_{j=1}^{N+1} C(\lambda_j) \prod_{j=1}^{N+1} B(\lambda_j) |0\rangle = c^{N+1} \Bigg( \prod_{N+1 \ge j > k \ge 1} g(\lambda_j,\lambda_k) g(\lambda_k,\lambda_j) \Bigg) \Bigg( \prod_{j=1}^{N+1} \prod_{k=1}^{N+1} h(\lambda_j,\lambda_k) \Bigg) {\det}_{N+1} \left( \frac{\partial \tilde\varphi_j}{\partial \lambda_k} \right). \tag{2.29} \]
For the excited state $\det(\partial\tilde\varphi_j / \partial\lambda_k)$ is also positive. We shall also mention that the scattering matrix of elementary excitations can be found in Sect. I.4 of [3]. It depends strongly on momenta; this shows that the model is not free fermionic. Now we can define the form factor in the finite volume
\[ F_N = \langle 0| \prod_{j=1}^{N} C(\mu_j)\, \psi(0,0) \prod_{j=1}^{N+1} B(\lambda_j) |0\rangle. \tag{2.30} \]

We shall calculate it in the next section. We shall also need the conjugated form factor
\[ \langle 0| \prod_{j=1}^{N+1} C(\lambda_j)\, \psi^\dagger(x,t) \prod_{j=1}^{N} B(\mu_j) |0\rangle = e^{-iht} \cdot \exp\left\{ it \left( \sum_{j=1}^{N+1} \lambda_j^2 - \sum_{k=1}^{N} \mu_k^2 \right) - ix \left( \sum_{j=1}^{N+1} \lambda_j - \sum_{k=1}^{N} \mu_k \right) \right\} \cdot \bar F_N. \tag{2.31} \]
Here we used the fact that the energy and momentum of the eigenstate are given by the expressions
\[ E_{N+1} = \sum_{j=1}^{N+1} (\lambda_j^2 - h), \tag{2.32} \]
\[ P_{N+1} = \sum_{j=1}^{N+1} \lambda_j. \tag{2.33} \]

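3. Form Factor

The main purpose of the paper is to evaluate the correlation function. In the finite volume we shall use the notation
\[ \langle \psi(0,0)\psi^\dagger(x,t) \rangle_N = \frac{ \langle 0| \prod_{j=1}^{N} C(\mu_j)\, \psi(0,0)\psi^\dagger(x,t) \prod_{j=1}^{N} B(\mu_j) |0\rangle }{ \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{j=1}^{N} B(\mu_j) |0\rangle }. \tag{3.1} \]
We shall use the standard representation of the correlation function in terms of the form factors
\[ \langle \psi(0,0)\psi^\dagger(x,t) \rangle_N = \sum_{\text{all } \{\lambda\}_{N+1}} \frac{ \langle 0| \prod_{j=1}^{N} C(\mu_j)\, \psi(0,0) \prod_{j=1}^{N+1} B(\lambda_j) |0\rangle \, \langle 0| \prod_{j=1}^{N+1} C(\lambda_j)\, \psi^\dagger(x,t) \prod_{j=1}^{N} B(\mu_j) |0\rangle }{ \langle 0| \prod_{j=1}^{N+1} C(\lambda_j) \prod_{j=1}^{N+1} B(\lambda_j) |0\rangle \, \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{j=1}^{N} B(\mu_j) |0\rangle }. \tag{3.2} \]
In order to calculate the form factor we need to know the action of the local field on the eigenvector. This can be found in [9] (see also Sect. XII.2 of [3]),
\[ \psi(0,0) \prod_{j=1}^{N+1} B(\lambda_j)|0\rangle = -i\sqrt{c} \sum_{\ell=1}^{N+1} a(\lambda_\ell) \Bigg( \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} f(\lambda_\ell,\lambda_m) \Bigg) \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} B(\lambda_m)|0\rangle. \tag{3.3} \]
This permits us to represent the form factor as follows:
\[ F_N = -i\sqrt{c} \sum_{\ell=1}^{N+1} a(\lambda_\ell) \Bigg( \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} g(\lambda_\ell,\lambda_m) \Bigg) \Bigg( \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} h(\lambda_\ell,\lambda_m) \Bigg) \times \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} B(\lambda_m)|0\rangle. \tag{3.4} \]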


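Let us notice that the form factor is a symmetric function of all the $\lambda_j$ because $[B(\lambda_j), B(\lambda_k)] = 0$. We now need to calculate the scalar product between the eigenvector and noneigenvector
\[ \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} B(\lambda_m)|0\rangle, \tag{3.5} \]
where $\mu_j$ satisfy the Bethe equations, but $\lambda_m$ do not. It can be done by the following theorem.

Theorem 3.1. The following determinant representation holds for such scalar products:
\[ \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{j=1}^{N} B(\lambda_j)|0\rangle = \Bigg\{ \prod_{j=1}^{N} d(\mu_j) d(\lambda_j) \Bigg\} \Bigg\{ \prod_{N \ge j > k \ge 1} g(\lambda_j,\lambda_k) g(\mu_k,\mu_j) \Bigg\} \Bigg\{ \prod_{j,k=1}^{N} h(\mu_j,\lambda_k) \Bigg\} \det M_{jk}, \tag{3.6} \]
where
\[ M_{jk} = \frac{g(\mu_k,\lambda_j)}{h(\mu_k,\lambda_j)} - \frac{a(\lambda_j)}{d(\lambda_j)} \frac{g(\lambda_j,\mu_k)}{h(\lambda_j,\mu_k)} \prod_{m=1}^{N} \frac{f(\lambda_j,\mu_m)}{f(\mu_m,\lambda_j)}. \tag{3.7} \]
Here the spectral parameters $\{\mu_j\}$ satisfy the Bethe Ansatz equations (2.13). The spectral parameters $\{\lambda_j\}$ are free and do not satisfy any equations.

This theorem was proved in [10]. For the scalar product, which appears in the expression for the form factor, we get
\[ \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} B(\lambda_m)|0\rangle = \prod_{N \ge j > k \ge 1} g(\mu_j,\mu_k) \cdot \prod_{\substack{N+1 \ge j > k \ge 1 \\ j \ne \ell,\ k \ne \ell}} g(\lambda_k,\lambda_j) \cdot \prod_{j=1}^{N} \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} h(\mu_j,\lambda_m) \times \prod_{j=1}^{N} d(\mu_j) \cdot \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} d(\lambda_m) \cdot {\det}_N M^{(\ell)}. \tag{3.8} \]
Here the entries of the $N \times N$ matrix $M^{(\ell)}$ are
\[ M^{(\ell)}_{jk} = t(\mu_k,\lambda_j) - r(\lambda_j)\, t(\lambda_j,\mu_k) \cdot \prod_{m=1}^{N} \frac{f(\lambda_j,\mu_m)}{f(\mu_m,\lambda_j)}, \qquad j = 1,\dots,\ell-1,\ell+1,\dots,N+1, \quad k = 1,\dots,N. \tag{3.9} \]
Let us recall that
\[ t(\lambda,\mu) = \frac{(ic)^2}{(\lambda-\mu)(\lambda-\mu+ic)} \qquad \text{and} \qquad r(\lambda) = \frac{a(\lambda)}{d(\lambda)}. \]
Remember that the Bethe equations give:
\[ r(\lambda_j) = \prod_{p=1}^{N+1} \frac{h(\lambda_p,\lambda_j)}{h(\lambda_j,\lambda_p)} \qquad \text{and} \qquad \prod_{m=1}^{N} \frac{f(\lambda_j,\mu_m)}{f(\mu_m,\lambda_j)} = (-1)^N \prod_{m=1}^{N} \frac{h(\lambda_j,\mu_m)}{h(\mu_m,\lambda_j)}, \tag{3.10} \]
or equivalently,
\[ a(\lambda_\ell) \prod_{m=1}^{N+1} h(\lambda_\ell,\lambda_m) = d(\lambda_\ell) \prod_{m=1}^{N+1} h(\lambda_m,\lambda_\ell). \tag{3.11} \]
Expression (3.9) becomes
\[ M^{(\ell)}_{jk} = t(\mu_k,\lambda_j) - t(\lambda_j,\mu_k) \Bigg( \prod_{p=1}^{N+1} \frac{h(\lambda_p,\lambda_j)}{h(\lambda_j,\lambda_p)} \Bigg) \cdot \Bigg( \prod_{m=1}^{N} \frac{h(\lambda_j,\mu_m)}{h(\mu_m,\lambda_j)} \Bigg). \tag{3.12} \]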

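Using the obvious equality
\[ \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} g(\lambda_\ell,\lambda_m) = \Bigg( \prod_{m=1}^{\ell-1} g(\lambda_\ell,\lambda_m) \Bigg) \Bigg( \prod_{m=\ell+1}^{N+1} g(\lambda_\ell,\lambda_m) \Bigg) = (-1)^{\ell-1} \Bigg( \prod_{m=1}^{\ell-1} g(\lambda_m,\lambda_\ell) \Bigg) \Bigg( \prod_{m=\ell+1}^{N+1} g(\lambda_\ell,\lambda_m) \Bigg), \tag{3.13} \]
and substituting (3.8) into (3.4), we have
\[ F_N = -i\sqrt{c} \sum_{\ell=1}^{N+1} (-1)^{\ell-1} \prod_{N \ge j > k \ge 1} g(\mu_j,\mu_k) \prod_{N+1 \ge j > k \ge 1} g(\lambda_k,\lambda_j) \times \prod_{m=1}^{N+1} h(\lambda_m,\lambda_\ell) \prod_{j=1}^{N} \prod_{\substack{m=1 \\ m \ne \ell}}^{N+1} h(\mu_j,\lambda_m) \times \prod_{j=1}^{N} d(\mu_j) \prod_{m=1}^{N+1} d(\lambda_m) \cdot {\det}_N M^{(\ell)}. \tag{3.14} \]
One can rewrite the determinant ${\det}_N M^{(\ell)}$ as
\[ {\det}_N M^{(\ell)} = \Bigg( \prod_{\substack{j=1 \\ j \ne \ell}}^{N+1} \prod_{p=1}^{N+1} h(\lambda_p,\lambda_j) \Bigg) \cdot \Bigg( \prod_{\substack{j=1 \\ j \ne \ell}}^{N+1} \prod_{m=1}^{N} \frac{1}{h(\mu_m,\lambda_j)} \Bigg) \cdot {\det}_N S^{(\ell)}, \tag{3.15} \]
where
\[ S^{(\ell)}_{jk} = t(\mu_k,\lambda_j) \frac{ \prod_{m=1}^{N} h(\mu_m,\lambda_j) }{ \prod_{p=1}^{N+1} h(\lambda_p,\lambda_j) } - t(\lambda_j,\mu_k) \frac{ \prod_{m=1}^{N} h(\lambda_j,\mu_m) }{ \prod_{p=1}^{N+1} h(\lambda_j,\lambda_p) }, \tag{3.16} \]
$j = 1,\dots,\ell-1,\ell+1,\dots,N+1$, $k = 1,\dots,N$. Let us substitute (3.15) into (3.14),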

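\[ F_N = -i\sqrt{c} \prod_{N \ge j > k \ge 1} g(\mu_j,\mu_k) \prod_{N+1 \ge j > k \ge 1} g(\lambda_k,\lambda_j) \prod_{m=1}^{N+1} \prod_{j=1}^{N+1} h(\lambda_m,\lambda_j) \times \Bigg( \sum_{\ell=1}^{N+1} (-1)^{\ell+1} {\det}_N S^{(\ell)} \Bigg) \cdot \prod_{j=1}^{N} d(\mu_j) \prod_{m=1}^{N+1} d(\lambda_m). \tag{3.17} \]
In order to simplify this expression let us study
\[ \mathrm{Mi} = \mathrm{Mi}\{\lambda\} \equiv \sum_{\ell=1}^{N+1} (-1)^{\ell-1} {\det}_N S^{(\ell)}. \tag{3.18} \]
Notice that $\mathrm{Mi}$ is an antisymmetric function of all $\{\lambda_j\}$ because $F_N$ is symmetric and the product of functions $g(\lambda_k,\lambda_j)$ is antisymmetric. In particular,
\[ \mathrm{Mi}\{\lambda\} = 0 \qquad \text{if} \qquad \lambda_j = \lambda_k. \tag{3.19} \]
$\det S^{(\ell)}$ can be obtained from $\det S^{(N+1)}$ by interchanging $\lambda_\ell$ and $\lambda_{N+1}$. This is a special case of a permutation
\[ (\lambda_1, \cdots, \lambda_\ell, \cdots, \lambda_N, \lambda_{N+1}) \longrightarrow (\lambda_1, \cdots, \lambda_{N+1}, \cdots, \lambda_N, \lambda_\ell). \tag{3.20} \]
Since $(-1)^{\ell-1}$ is the parity of this permutation,
\[ \mathrm{Mi}\{\lambda\} = \sum_{\text{permutations of } \{\lambda\}_{N+1}} (-1)^P \prod_{j=1}^{N} S_{P(j)j} = \left( 1 + \frac{\partial}{\partial\alpha} \right) {\det}_N \left( S_{jk} - \alpha S_{N+1,k} \right) \Big|_{\alpha=0}. \tag{3.21} \]
Here $S_{jk}$ means $S^{(N+1)}_{jk}$ from (3.16) and $\det S_{jk}$ is the term $\ell = N+1$ in (3.18),
\[ -\frac{\partial}{\partial\alpha} {\det}_N \left( S_{jk} - \alpha S_{N+1,k} \right) \Big|_{\alpha=0} \]
is the sum of $N$ terms where each of them differs from $\det S_{jk}$ by the replacement of the $\ell$th line (corresponding to $\lambda_\ell$) by the $(N+1)$th line. We can use the expression (3.21) to simplify the form factor (3.17),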

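\[ F_N = -i\sqrt{c} \prod_{N \ge j > k \ge 1} g(\mu_j,\mu_k) \prod_{N+1 \ge j > k \ge 1} g(\lambda_k,\lambda_j) \prod_{m=1}^{N+1} \prod_{j=1}^{N+1} h(\lambda_m,\lambda_j) \times \prod_{j=1}^{N} d(\mu_j) \prod_{m=1}^{N+1} d(\lambda_m) \cdot \mathrm{Mi}\{\lambda\}. \tag{3.22} \]
The complex conjugate of the form factor is
\[ \bar F_N = \langle 0| \prod_{j=1}^{N+1} C(\lambda_j)\, \psi^\dagger(0,0) \prod_{j=1}^{N} B(\mu_j)|0\rangle. \tag{3.23} \]
Remember that $c$ and all $\lambda$, $\mu$ are real. Therefore complex conjugation gives
\[ \bar g(\lambda,\mu) = g(\mu,\lambda), \quad \bar f(\lambda,\mu) = f(\mu,\lambda), \quad \bar h(\lambda,\mu) = h(\mu,\lambda), \quad \bar t(\lambda,\mu) = t(\mu,\lambda), \quad \bar a(\lambda) = d(\lambda) = a^{-1}(\lambda). \tag{3.24} \]
So we have
\[ \bar S_{jk} = -S_{jk}, \]
and for even $N$,
\[ \overline{\mathrm{Mi}}\{\lambda\} = (-1)^N \mathrm{Mi}\{\lambda\} = \mathrm{Mi}\{\lambda\}. \tag{3.25} \]
Hence for the complex conjugated form factor $\bar F_N$ we get
\[ \bar F_N = i\sqrt{c} \prod_{N \ge j > k \ge 1} g(\mu_k,\mu_j) \prod_{N+1 \ge j > k \ge 1} g(\lambda_j,\lambda_k) \prod_{m=1}^{N+1} \prod_{j=1}^{N+1} h(\lambda_j,\lambda_m) \times \prod_{j=1}^{N} a(\mu_j) \prod_{m=1}^{N+1} a(\lambda_m)\, \mathrm{Mi}\{\lambda\}. \tag{3.26} \]
For the correlation function the quantity $|F_N|^2$ is important:
\[ F_N \bar F_N = c \Bigg( \prod_{j=1}^{N} a(\mu_j) d(\mu_j) \Bigg) \Bigg( \prod_{m=1}^{N+1} a(\lambda_m) d(\lambda_m) \Bigg) \cdot \Bigg( \prod_{\substack{j,k=1 \\ j \ne k}}^{N} g(\mu_j,\mu_k) \Bigg) \times \Bigg( \prod_{\substack{j,k=1 \\ j \ne k}}^{N+1} g(\lambda_j,\lambda_k) \Bigg) \cdot \Bigg( \prod_{j=1}^{N+1} \prod_{k=1}^{N+1} h(\lambda_j,\lambda_k) \Bigg)^2 (\mathrm{Mi}\{\lambda\})^2, \tag{3.27} \]
or
\[ \frac{ F_N \bar F_N }{ \langle 0| \prod_{j=1}^{N} C(\mu_j) \prod_{j=1}^{N} B(\mu_j)|0\rangle \cdot \langle 0| \prod_{j=1}^{N+1} C(\lambda_j) \prod_{j=1}^{N+1} B(\lambda_j)|0\rangle } = c^{-2N} \frac{ \prod_{j=1}^{N+1} \prod_{k=1}^{N+1} h(\lambda_j,\lambda_k) }{ \prod_{j=1}^{N} \prod_{k=1}^{N} h(\mu_j,\mu_k) } \cdot \frac{ (\mathrm{Mi}\{\lambda\})^2 }{ {\det}_N \left( \partial\varphi_j / \partial\mu_k \right) {\det}_{N+1} \left( \partial\tilde\varphi_j / \partial\lambda_k \right) }. \tag{3.28} \]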

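This formula gives us $|F_N|^2$ at $x = t = 0$. Also it is easy to “switch” on space and time dependence using the formula (2.31). Note that $a(\lambda)d(\lambda) = 1$ for the nonlinear Schrödinger equation. Therefore the correlation function becomes
\[ \langle \psi(0,0)\psi^\dagger(x,t) \rangle_N = \frac{ e^{-iht} c^{-2N} }{ \left( \prod_{j=1}^{N} \prod_{k=1}^{N} h(\mu_j,\mu_k) \right) \cdot {\det}_N \left( \partial\varphi_j / \partial\mu_k \right) } \sum_{\{\lambda\}} \frac{ \Delta(\{\lambda\}) }{ {\det}_{N+1} \left( \partial\tilde\varphi_j / \partial\lambda_k \right) }. \tag{3.29} \]
Here we used the new notation
\[ \Delta(\{\lambda\}) = \Bigg( \prod_{k=1}^{N+1} \prod_{j=1}^{N+1} h(\lambda_j,\lambda_k) \Bigg) (\mathrm{Mi}\{\lambda\})^2 \exp\Bigg\{ \sum_{j=1}^{N+1} \tau(\lambda_j) - \sum_{m=1}^{N} \tau(\mu_m) \Bigg\}, \tag{3.30} \]
where
\[ \tau(\lambda) = it\lambda^2 - ix\lambda. \tag{3.31} \]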

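4. The Idea of Summation with Respect to λ

Now let us consider the sum with respect to all $\{\lambda\}_{N+1}$ in (3.29),

$$\sum_{\{\lambda\}_{N+1}} \frac{\Delta(\{\lambda\})}{\det\nolimits_{N+1} \frac{\partial\tilde\varphi_j}{\partial\lambda_k}}.$$

The idea of summation is the same as that which we used for impenetrable bosons (free fermionic case) [11] (see also Sect. XIII.5 of [3]). The factor $(M_i\{\lambda\})^2$ entering the r.h.s. of (3.30) contains $(N+1)!$ terms, all of them give the same contribution to the sum. So we can replace one of determinants $M_i\{\lambda\}$ by the product $\prod_{j=1}^{N} S_{jj}$,

$$\begin{aligned}
\sum_{\{\lambda\}_{N+1}} \frac{\Delta(\{\lambda\})}{\det\nolimits_{N+1} \frac{\partial\tilde\varphi_j}{\partial\lambda_k}}
={}& (N+1)! \sum_{\{\lambda\}_{N+1}} \prod_{m=1}^{N} e^{-\tau(\mu_m)} \cdot \left(\det\nolimits_{N+1} \frac{\partial\tilde\varphi_j}{\partial\lambda_k}\right)^{-1} \\
&\times \left(\prod_{k=1}^{N+1}\prod_{j=1}^{N+1} h(\lambda_j,\lambda_k)\right) \left(\prod_{j=1}^{N+1} e^{\tau(\lambda_j)}\right) \left(\prod_{j=1}^{N} S_{jj}\right) M_i\{\lambda\}.
\end{aligned}\tag{4.1}$$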

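The sum with respect to all $\{\lambda\}_{N+1}$ means the sum with respect to all ordered sets of integers $\{n_j\}$ from (2.28). We also can admit $n_j = n_k$ because it leads to $\lambda_j = \lambda_k$ which does not contribute to (4.1) because of the antisymmetry of $M_i\{\lambda\}$. The factor $(N+1)!$ is absorbed as

$$(N+1)! \sum_{\{n_j\}} = \prod_{i=1}^{N+1} \sum_{n_i=-\infty}^{\infty}. \tag{4.2}$$

The correlation function becomes

$$\langle\psi(0,0)\psi^\dagger(x,t)\rangle_N = \frac{e^{-iht}c^{-2N} \prod_{m=1}^{N} e^{-\tau(\mu_m)}}{\left(\prod_{j=1}^{N}\prod_{k=1}^{N} h(\mu_j,\mu_k)\right)\cdot \det\nolimits_N \frac{\partial\varphi_j}{\partial\mu_k}} \times \sum_{n_1\cdots n_{N+1}} \frac{\tilde\Delta(\{\lambda\})}{\det\nolimits_{N+1} \frac{\partial\tilde\varphi_j}{\partial\lambda_k}}, \tag{4.3}$$

with

$$\tilde\Delta(\{\lambda\}) = \left(\prod_{j=1}^{N+1}\prod_{k=1}^{N+1} h(\lambda_j,\lambda_k)\right) \prod_{j=1}^{N+1} e^{\tau(\lambda_j)} \times \left(1 + \frac{\partial}{\partial\alpha}\right) \det\nolimits_N\bigl(S_{jj}S_{jk} - \alpha S_{jj}S_{N+1\,k}\bigr)\Bigr|_{\alpha=0}. \tag{4.4}$$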

The main difference between the free fermionic case (coupling constant $c \to +\infty$) and the non-free fermionic case is that in the former it is possible to solve the Bethe equations (2.26) explicitly. On the contrary this is not possible for penetrable bosons. Our approach is based on the formula

$$\sum_{n_1\cdots n_{N+1}} \frac{\tilde\Delta(\{\lambda\})}{\det\nolimits_{N+1} \frac{\partial\tilde\varphi_j}{\partial\lambda_k}} = \sum_{n_1\cdots n_{N+1}} \int_{-\infty}^{\infty} d^{N+1}\lambda\, \tilde\Delta(\{\lambda\}) \prod_{j=1}^{N+1} \delta(\tilde\varphi_j(\lambda) - 2\pi n_j).$$

Remember that $\det \partial\tilde\varphi_j/\partial\lambda_k > 0$. We shall also use the Poisson formula

$$\sum_{n=-\infty}^{\infty} \delta(x - 2\pi n) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} e^{ikx}. \tag{4.5}$$
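As a quick numerical illustration of the Poisson formula (4.5), one can integrate both sides against a smooth test function. The sketch below is not part of the original derivation; it assumes a Gaussian test function $f(x) = e^{-x^2/(2s^2)}$, whose Fourier integral $\int f(x)e^{ikx}dx = s\sqrt{2\pi}\,e^{-s^2k^2/2}$ is known in closed form. The function names and the width `s` are illustrative choices.

```python
import math

def lattice_side(s, terms=50):
    """Left side of (4.5) applied to f: sum over n of f(2*pi*n)."""
    return sum(math.exp(-((2.0 * math.pi * n) ** 2) / (2.0 * s * s))
               for n in range(-terms, terms + 1))

def fourier_side(s, terms=50):
    """Right side of (4.5) applied to f: (1/2pi) * sum over k of the Fourier integral of f."""
    total = sum(s * math.sqrt(2.0 * math.pi) * math.exp(-0.5 * (s * k) ** 2)
                for k in range(-terms, terms + 1))
    return total / (2.0 * math.pi)

if __name__ == "__main__":
    for width in (0.5, 1.0, 3.0):
        print(width, lattice_side(width), fourier_side(width))
```

Both sums converge very quickly for a Gaussian, so a modest truncation is enough to see the two sides agree to machine precision.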

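So we have

$$\begin{aligned}
\sum_{n_1\cdots n_{N+1}} \frac{\tilde\Delta(\{\lambda\})}{\det\nolimits_{N+1} \frac{\partial\tilde\varphi_j}{\partial\lambda_k}}
&= \sum_{n_1\cdots n_{N+1}} \int_{-\infty}^{\infty} d^{N+1}\lambda\, \tilde\Delta(\{\lambda\}) \prod_{j=1}^{N+1} \delta(\tilde\varphi_j(\lambda) - 2\pi n_j) \\
&= \sum_{n_1\cdots n_{N+1}} \left(\frac{1}{2\pi}\right)^{N+1} \int_{-\infty}^{\infty} d^{N+1}\lambda\, \tilde\Delta(\{\lambda\}) \prod_{j=1}^{N+1} e^{i n_j \tilde\varphi_j(\lambda)} \\
&= \sum_{n_1\cdots n_{N+1}} \left(\frac{1}{2\pi}\right)^{N+1} \int_{-\infty}^{\infty} d^{N+1}\lambda\, \tilde\Delta(\{\lambda\}) \prod_{j=1}^{N+1} e^{iL\lambda_j n_j} \left(\prod_{k=1}^{N+1} \frac{h(\lambda_k,\lambda_j)}{h(\lambda_j,\lambda_k)}\right)^{n_j}.
\end{aligned}\tag{4.6}$$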

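Thus we get the following representation for the correlation function:

$$\begin{aligned}
\langle\psi(0,0)\psi^\dagger(x,t)\rangle_N ={}& \frac{e^{-iht}c^{-2N} \prod_{m=1}^{N} e^{-\tau(\mu_m)}}{\left(\prod_{j=1}^{N}\prod_{k=1}^{N} h(\mu_j,\mu_k)\right)\cdot \det\nolimits_N \frac{\partial\varphi_j}{\partial\mu_k}} \\
&\times \sum_{n_1\cdots n_{N+1}} \left(\frac{1}{2\pi}\right)^{N+1} \int_{-\infty}^{\infty} d^{N+1}\lambda\, \tilde\Delta(\{\lambda\}) \prod_{j=1}^{N+1} e^{iL\lambda_j n_j} \left(\prod_{k=1}^{N+1} \frac{h(\lambda_k,\lambda_j)}{h(\lambda_j,\lambda_k)}\right)^{n_j}.
\end{aligned}\tag{4.7}$$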

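5. Quantum Dual Fields

In this section we introduce the auxiliary Fock space and auxiliary Bose fields $\phi_0(\lambda)$, $\phi_1(\lambda)$, $\phi_2(\lambda)$, $\phi_{A_j}(\lambda)$ and $\phi_{D_j}(\lambda)$ ($j = 1, 2$). Further we shall call these operators dual fields [12] (see also Sect. IX.5 of [3]). Dual fields help us to rewrite double products in terms of single products. By definition any operator $\phi_a(\lambda)$ ($a = 0, 1, 2, A_1, A_2, D_1, D_2$) is the sum of two operators: "momentum" $p(\lambda)$ and "coordinate" $q(\lambda)$,

$$\begin{aligned}
\phi_0(\lambda) &= q_0(\lambda) + p_0(\lambda); \\
\phi_{A_j}(\lambda) &= q_{A_j}(\lambda) + p_{D_j}(\lambda); & \phi_{D_j}(\lambda) &= q_{D_j}(\lambda) + p_{A_j}(\lambda); \\
\phi_1(\lambda) &= q_1(\lambda) + p_2(\lambda); & \phi_2(\lambda) &= q_2(\lambda) + p_1(\lambda).
\end{aligned}\tag{5.1}$$

All operators "momenta" $p(\lambda)$ annihilate the vacuum vector $|0)$, all operators $q(\lambda)$ annihilate the dual vacuum $(0|$:

$$p_a(\lambda)|0) = 0, \qquad (0|\,q_a(\lambda) = 0, \qquad \text{for all } a, \qquad (0|0) = 1.$$

The only nonzero commutation relations are

$$\begin{aligned}
&[p_0(\lambda), q_0(\mu)] = \ln\bigl(h(\lambda,\mu)h(\mu,\lambda)\bigr); \\
&[p_{A_j}(\lambda), q_{A_k}(\mu)] = \delta_{jk} \ln h(\mu,\lambda); \qquad [p_{D_j}(\lambda), q_{D_k}(\mu)] = \delta_{jk} \ln h(\lambda,\mu); \\
&[p_1(\lambda), q_1(\mu)] = \ln\frac{h(\lambda,\mu)}{h(\mu,\lambda)}; \qquad [p_2(\lambda), q_2(\mu)] = \ln\frac{h(\mu,\lambda)}{h(\lambda,\mu)}.
\end{aligned}\tag{5.2}$$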

(We remind the reader that $h(\lambda,\mu) = (\lambda - \mu + ic)/ic$.) The realization of these operators as linear combinations of canonical Bose fields is given in Appendix B. It is easy to check that all dual fields commute with each other,

$$[\phi_a(\lambda), \phi_b(\mu)] = 0,$$

where $a$, $b$ run through all possible indices. Using this property we can define functions of operators $F(\{e^{\phi_a(\lambda)}\})$. One should understand such an expression, for example, as a power series over $\{e^{\phi_a(\lambda)}\}$. The following simple formulæ are useful:

$$e^{p_a(\lambda)} e^{q_a(\mu)} = e^{q_a(\mu)} e^{p_a(\lambda)} e^{[p_a(\lambda), q_a(\mu)]}, \tag{5.3}$$

$$(0| \prod_{j=1}^{M_1} e^{\alpha_j p_a(\lambda_j)} \prod_{k=1}^{M_2} e^{\beta_k q_a(\mu_k)} |0) = \prod_{j=1}^{M_1}\prod_{k=1}^{M_2} e^{\alpha_j\beta_k [p_a(\lambda_j), q_a(\mu_k)]}, \tag{5.4}$$

$$\prod_{j=1}^{M} e^{\beta_j p_a(\lambda_j)}\, F\bigl(e^{\phi_a(\mu)}\bigr) |0) = F\Bigl(e^{\phi_a(\mu)} \prod_{j=1}^{M} e^{\beta_j [p_a(\lambda_j), q_a(\mu)]}\Bigr) |0), \tag{5.5}$$

$$F\bigl(e^{p_a(\mu)}\bigr) \prod_{j=1}^{M} e^{\beta_j \phi_a(\lambda_j)} |0) = \prod_{j=1}^{M} e^{\beta_j \phi_a(\lambda_j)} |0)\, F\Bigl(\prod_{j=1}^{M} e^{\beta_j [p_a(\mu), q_a(\lambda_j)]}\Bigr). \tag{5.6}$$

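Here $\{\lambda\}$, $\{\mu\}$, $\{\beta\}$, $\{\alpha\}$ are arbitrary complex numbers, $F$ is a function. One can easily prove these formulæ. Let us define the very important dual field $\psi(\lambda)$ as

$$\psi(\lambda) = \phi_0(\lambda) + \phi_{A_1}(\lambda) + \phi_{D_2}(\lambda) + \phi_2(\lambda). \tag{5.7}$$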
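As a worked instance of how (5.4) follows from (5.3), consider the simplest case $M_1 = M_2 = 1$. The sketch below assumes only that the commutator $[p_a(\lambda), q_a(\mu)]$ is a c-number, so that (5.3) holds with rescaled exponents, and uses the vacuum properties $(0|\,q_a = 0$, $p_a|0) = 0$, $(0|0) = 1$ stated after (5.1).

```latex
\begin{aligned}
(0|\, e^{\alpha p_a(\lambda)} e^{\beta q_a(\mu)} \,|0)
  &= (0|\, e^{\beta q_a(\mu)} e^{\alpha p_a(\lambda)} \,|0)\; e^{\alpha\beta [p_a(\lambda),\, q_a(\mu)]}
     && \text{by (5.3) with } p_a \to \alpha p_a,\ q_a \to \beta q_a \\
  &= (0|0)\; e^{\alpha\beta [p_a(\lambda),\, q_a(\mu)]}
     && \text{since } (0|\, e^{\beta q_a(\mu)} = (0|,\quad e^{\alpha p_a(\lambda)} |0) = |0) \\
  &= e^{\alpha\beta [p_a(\lambda),\, q_a(\mu)]}.
\end{aligned}
```

Repeating this step for every pair $(j, k)$ gives the double product on the right-hand side of (5.4).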


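Theorem 5.1. The correlation function (4.7) can be presented as the following vacuum expectation value in auxiliary Fock space:

$$\begin{aligned}
\langle\psi(0,0)\psi^\dagger(x,t)\rangle_N ={}& \frac{e^{-iht}c^{-2N}}{\det\nolimits_N \frac{\partial\varphi_j}{\partial\mu_k}}\, (0| \prod_{m=1}^{N} e^{p_0(\mu_m)} e^{p_1(\mu_m)} \\
&\times \sum_{n_1\cdots n_{N+1}} \frac{1}{(2\pi)^{N+1}} \int_{-\infty}^{\infty} d^{N+1}\lambda \left(\hat\gamma_1(\lambda_{N+1}) + \frac{\partial}{\partial\alpha}\right) \\
&\times \det\nolimits_N\Bigl(\hat S_{jj}\hat S_{jk}\,\hat\gamma_1(\lambda_j)\hat\gamma_2(\mu_j)\hat\gamma_2(\mu_k) - \alpha \hat S_{jj}\hat S_{N+1\,k}\,\hat\gamma_1(\lambda_j)\hat\gamma_1(\lambda_{N+1})\hat\gamma_2(\mu_j)\hat\gamma_2(\mu_k)\Bigr) |0)\Bigr|_{\alpha=0},
\end{aligned}\tag{5.8}$$

where

$$\hat S_{jk} = t(\mu_k,\lambda_j)\, e^{-\phi_{D_1}(\lambda_j)} - t(\lambda_j,\mu_k)\, e^{-\phi_{A_2}(\lambda_j)}, \tag{5.9}$$

and

$$\hat\gamma_1(\lambda_j) = e^{iL\lambda_j n_j + \tau(\lambda_j) + \psi(\lambda_j) + n_j\phi_1(\lambda_j)}, \qquad \hat\gamma_2(\mu) = e^{-\frac12(\tau(\mu) + \psi(\mu))}.$$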

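Proof. Let us move factors $\hat\gamma_1(\lambda)$ and $\hat\gamma_2(\mu)$ out of the determinant in (5.8). In the r.h.s. of (5.8) we get

$$\begin{aligned}
&(0| \prod_{m=1}^{N} e^{p_0(\mu_m)} e^{p_1(\mu_m)} \prod_{m=1}^{N+1} e^{iL\lambda_m n_m + \tau(\lambda_m) + \psi(\lambda_m) + n_m\phi_1(\lambda_m)} \prod_{m=1}^{N} e^{-\tau(\mu_m) - \psi(\mu_m)} \\
&\qquad\times \left(1 + \frac{\partial}{\partial\alpha}\right) \det\nolimits_N\bigl(\hat S_{jj}\hat S_{jk} - \alpha \hat S_{jj}\hat S_{N+1\,k}\bigr) |0)\Bigr|_{\alpha=0}.
\end{aligned}\tag{5.10}$$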

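Using (5.4) we find

$$\begin{aligned}
&(0| \prod_{m=1}^{N} e^{p_0(\mu_m)} e^{p_1(\mu_m)} \prod_{m=1}^{N+1} e^{\phi_0(\lambda_m) + n_m\phi_1(\lambda_m) + \phi_2(\lambda_m)} \prod_{m=1}^{N} e^{-\phi_0(\mu_m) - \phi_2(\mu_m)} |0) \\
&\qquad= \frac{\prod_{j=1}^{N+1}\prod_{k=1}^{N+1} h(\lambda_j,\lambda_k)}{\prod_{j=1}^{N}\prod_{k=1}^{N} h(\mu_j,\mu_k)} \prod_{j=1}^{N+1}\prod_{k=1}^{N+1} \left(\frac{h(\lambda_k,\lambda_j)}{h(\lambda_j,\lambda_k)}\right)^{n_j}.
\end{aligned}\tag{5.11}$$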

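Using (5.5) we obtain

$$\begin{aligned}
&(0| \prod_{m=1}^{N+1} e^{\phi_{A_1}(\lambda_m) + \phi_{D_2}(\lambda_m)} \prod_{m=1}^{N} e^{-\phi_{A_1}(\mu_m) - \phi_{D_2}(\mu_m)} \det\nolimits_N\bigl(\hat S_{jj}\hat S_{jk} - \alpha \hat S_{jj}\hat S_{N+1\,k}\bigr) |0) \\
&\qquad= \det\nolimits_N\bigl(S_{jj}S_{jk} - \alpha S_{jj}S_{N+1\,k}\bigr).
\end{aligned}\tag{5.12}$$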

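Combining (5.11) and (5.12) we get the r.h.s. of (4.7) with $\tilde\Delta(\{\lambda\})$ defined in (4.4). The theorem is proved.

Now we can rewrite the r.h.s. of (5.8) as follows:

$$\begin{aligned}
&\sum_{n_1\cdots n_{N+1}} \frac{1}{(2\pi)^{N+1}} \int_{-\infty}^{\infty} d^{N+1}\lambda \left(\hat\gamma_1(\lambda_{N+1}) + \frac{\partial}{\partial\alpha}\right) \\
&\qquad\times \det\nolimits_N\Bigl(\hat S_{jj}\hat S_{jk}\,\hat\gamma_1(\lambda_j)\hat\gamma_2(\mu_j)\hat\gamma_2(\mu_k) - \alpha \hat S_{jj}\hat S_{N+1\,k}\,\hat\gamma_1(\lambda_j)\hat\gamma_1(\lambda_{N+1})\hat\gamma_2(\mu_j)\hat\gamma_2(\mu_k)\Bigr)\Bigr|_{\alpha=0} \\
&= \left(\frac{1}{2\pi} \sum_{n_{N+1}} \int_{-\infty}^{\infty} d\lambda_{N+1}\, \hat\gamma_1(\lambda_{N+1}) + \frac{\partial}{\partial\alpha}\right) \\
&\qquad\times \det\nolimits_N\Bigl(\frac{1}{2\pi} \sum_{n_j} \int_{-\infty}^{\infty} d\lambda_j\, \hat S_{jj}\hat S_{jk}\,\hat\gamma_1(\lambda_j)\hat\gamma_2(\mu_j)\hat\gamma_2(\mu_k) \\
&\qquad\qquad - \alpha \frac{1}{4\pi^2} \sum_{n_j,\,n_{N+1}} \int_{-\infty}^{\infty} d\lambda_{N+1}\, d\lambda_j\, \hat S_{jj}\hat S_{N+1\,k}\,\hat\gamma_1(\lambda_j)\hat\gamma_1(\lambda_{N+1})\hat\gamma_2(\mu_j)\hat\gamma_2(\mu_k)\Bigr)\Bigr|_{\alpha=0}.
\end{aligned}\tag{5.13}$$

In this formula we can perform the summation over the integers $\{n_j\}$. Indeed, $n_j$ enters only in the function $\hat\gamma_1(\lambda_j)$:

$$\hat\gamma_1(\lambda_j) = e^{iL\lambda_j n_j + \tau(\lambda_j) + \psi(\lambda_j) + n_j\phi_1(\lambda_j)}.$$

Recall that all dual fields commute with each other: $[\phi_a(\lambda), \phi_b(\mu)] = 0$. This means that we can treat operators $\phi_a(\lambda)$ as diagonal operators. Due to the formula (B.8) from Appendix B we can consider the operator $i\phi_1(\lambda)$ as a real function of $\lambda$. Hence we can use formula (4.5) to sum up with respect to $n_j$,

$$\frac{1}{2\pi} \sum_{n=-\infty}^{\infty} e^{in(L\lambda - i\phi_1(\lambda))} = \sum_{n=-\infty}^{\infty} \delta(L\lambda - i\phi_1(\lambda) - 2\pi n). \tag{5.14}$$

It means that $\lambda = \lambda_n$, where $\lambda_n$ is a root of the equation

$$L\lambda_n - 2\pi n = i\phi_1(\lambda_n). \tag{5.15}$$

The expression (5.15) is an operator equality, which is defined only on vectors of the form $\prod_m e^{\phi_2(\lambda_m)}|0)$. Therefore one should understand this equation in the sense of the mean value:

$$(0|\bigl(L\lambda_n - i\phi_1(\lambda_n) - 2\pi n\bigr) \prod_m e^{\phi_2(\lambda_m)} |0) = 0, \tag{5.16}$$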

where $\{\lambda_m\}$ are arbitrary real parameters. Then we can rewrite Eq. (5.16):

$$L\lambda_n - 2\pi n = i \sum_m \ln\frac{h(\lambda_m,\lambda_n)}{h(\lambda_n,\lambda_m)} = i \sum_m \ln\frac{\lambda_m - \lambda_n + ic}{\lambda_n - \lambda_m + ic}. \tag{5.17}$$

The r.h.s. of (5.17) is a real bounded function of $\lambda_n$. Moreover it is a decreasing function of $\lambda_n$, because

$$\frac{\partial}{\partial\lambda_n} \sum_m i \ln\frac{\lambda_m - \lambda_n + ic}{\lambda_n - \lambda_m + ic} = -\sum_m \frac{2c}{(\lambda_n - \lambda_m)^2 + c^2} < 0.$$

The l.h.s. of (5.17) is a linear increasing function of $\lambda_n$, hence Eq. (5.17) has one real solution and this solution is unique. Also we have

$$(0|\bigl(L - i\phi_1'(\lambda_n)\bigr) \prod_m e^{\phi_2(\lambda_m)} |0) = L + \sum_m \frac{2c}{(\lambda_n - \lambda_m)^2 + c^2} > 0. \tag{5.18}$$

Therefore, one can write

$$\delta(L\lambda - i\phi_1(\lambda) - 2\pi n) = \frac{\delta(\lambda - \lambda_n)}{L - i\phi_1'(\lambda)}, \tag{5.19}$$

where $\lambda_n$ is a solution of Eq. (5.15). Later we shall use the notation

$$2\pi\hat\rho(\lambda) = 1 - \frac{i}{L}\phi_1'(\lambda). \tag{5.20}$$
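The monotonicity argument for (5.17) can be illustrated numerically. The sketch below is an illustration, not part of the paper: it treats $\{\lambda_m\}$ as given real numbers, uses the fact that on the principal branch $i\ln\frac{\lambda_m-\lambda+ic}{\lambda-\lambda_m+ic} = 2\arctan\frac{\lambda_m-\lambda}{c}$ for $c>0$ (an elementary identity that the code checks), and finds the unique root by bisection. All function names and parameter values are assumptions chosen for the example.

```python
import cmath
import math

def rhs_log(lam, others, c):
    """Right-hand side of (5.17) evaluated with the complex logarithm."""
    return sum((1j * cmath.log((lm - lam + 1j * c) / (lam - lm + 1j * c))).real
               for lm in others)

def rhs_arctan(lam, others, c):
    """Same quantity in manifestly real form: bounded by pi*len(others), decreasing in lam."""
    return sum(2.0 * math.atan((lm - lam) / c) for lm in others)

def solve_root(n, L, others, c, tol=1e-12):
    """Bisection for L*lam - 2*pi*n = rhs(lam); the left side minus the right side is increasing."""
    bound = math.pi * len(others)
    lo = (2.0 * math.pi * n - bound) / L - 1.0   # residual is negative here
    hi = (2.0 * math.pi * n + bound) / L + 1.0   # residual is positive here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if L * mid - 2.0 * math.pi * n - rhs_arctan(mid, others, c) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    others = [-1.0, 0.2, 0.9]
    root = solve_root(n=1, L=10.0, others=others, c=1.0)
    print(root, 10.0 * root - 2.0 * math.pi - rhs_arctan(root, others, 1.0))
```

Because the residual is strictly increasing, bisection cannot miss or duplicate a root, which mirrors the uniqueness statement in the text.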

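We now arrive at the following formula for the correlation function in a finite volume:

$$\begin{aligned}
\langle\psi(0,0)\psi^\dagger(x,t)\rangle_N ={}& \frac{e^{-iht}c^{-2N}}{\det\nolimits_N \frac{\partial\varphi_j}{\partial\mu_k}}\, (0| \prod_{m=1}^{N} e^{p_0(\mu_m)} e^{p_1(\mu_m)} \\
&\times \left(G_N(x,t) + \frac{\partial}{\partial\alpha}\right) \cdot \det\nolimits_N\bigl(\hat U_{jk} - \alpha \hat Q_j \hat Q_k\bigr) |0)\Bigr|_{\alpha=0},
\end{aligned}\tag{5.21}$$

where

$$G_N(x,t) = \frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{e^{\psi(\lambda_n) + \tau(\lambda_n)}}{2\pi\hat\rho(\lambda_n)}, \tag{5.22}$$

and

$$\begin{aligned}
\hat U_{jk} ={}& \frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{e^{\psi(\lambda_n) + \tau(\lambda_n)}\, e^{-\frac12(\psi(\mu_j) + \psi(\mu_k) + \tau(\mu_j) + \tau(\mu_k))}}{2\pi\hat\rho(\lambda_n)} \\
&\times \Bigl(t(\mu_k,\lambda_n)\, e^{-\phi_{D_1}(\lambda_n)} - t(\lambda_n,\mu_k)\, e^{-\phi_{A_2}(\lambda_n)}\Bigr) \\
&\times \Bigl(t(\mu_j,\lambda_n)\, e^{-\phi_{D_1}(\lambda_n)} - t(\lambda_n,\mu_j)\, e^{-\phi_{A_2}(\lambda_n)}\Bigr),
\end{aligned}\tag{5.23}$$

$$\hat Q_j = \frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{e^{\psi(\lambda_n) + \tau(\lambda_n)}\, e^{-\frac12(\psi(\mu_j) + \tau(\mu_j))}}{2\pi\hat\rho(\lambda_n)} \times \Bigl(t(\mu_j,\lambda_n)\, e^{-\phi_{D_1}(\lambda_n)} - t(\lambda_n,\mu_j)\, e^{-\phi_{A_2}(\lambda_n)}\Bigr). \tag{5.24}$$

Formula (5.21) is the determinant representation for the quantum correlation function in a finite volume.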

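6. Thermodynamic Limit

In order to calculate the correlation function in the ground state one should consider the limit where the number of particles and the length of the box tend to infinity with fixed constant density: $N \to \infty$, $L \to \infty$, $N/L = D = \text{const}$. In this limit the parameters $\{\lambda_n\}$ are described by the distribution density $\hat\rho(\lambda)$,

$$\hat\rho(\lambda) = \frac{1}{2\pi}\left(1 - \frac{i}{L}\phi_1'(\lambda)\right)$$

(see Appendix A). The sums in the expressions for $\hat U_{jk}$ and $\hat Q_j$ can be replaced by the corresponding integrals. Let us introduce the new function $Z(\lambda,\mu)$,

$$Z(\lambda,\mu) = \frac{e^{-\phi_{D_1}(\lambda)}}{h(\mu,\lambda)} + \frac{e^{-\phi_{A_2}(\lambda)}}{h(\lambda,\mu)}. \tag{6.1}$$

Then we can rewrite (5.23) and (5.24) as

$$\hat U_{jk} = \frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{(ic)^2\, e^{\psi(\lambda_n) + \tau(\lambda_n)}}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu_j)(\lambda_n - \mu_k)} \times e^{-\frac12(\psi(\mu_j) + \psi(\mu_k) + \tau(\mu_j) + \tau(\mu_k))}\, Z(\lambda_n,\mu_k) Z(\lambda_n,\mu_j), \tag{6.2}$$

$$\hat Q(\mu) = \frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{ic\, e^{\psi(\lambda_n) + \tau(\lambda_n)}\, e^{-\frac12(\psi(\mu) + \tau(\mu))}}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu)}\, Z(\lambda_n,\mu). \tag{6.3}$$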

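Here $\hat Q_j = -\hat Q(\mu_j)$, $\hat Q_k = -\hat Q(\mu_k)$. Using formula (A.25), we get

$$\begin{aligned}
\hat U_{jk} ={}& L\delta_{jk}\, \frac{(ic)^2\, 2\pi\hat\rho(\mu_j)}{4\sin^2\frac{L}{2}\hat\xi(\mu_j)}\, Z^2(\mu_j,\mu_j) \\
&- \frac{(ic)^2}{2}\, \delta_{jk} \cot\frac{L}{2}\hat\xi(\mu_j)\, \frac{\partial}{\partial\mu_j} \Bigl[ e^{\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_j,\mu_k) Z(\mu_j,\mu_j) \\
&\qquad\qquad - e^{-\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_k,\mu_j) Z(\mu_k,\mu_k) \Bigr]_{\mu_k = \mu_j} \\
&- \frac{(ic)^2}{2}\, \frac{1 - \delta_{jk}}{\mu_j - \mu_k} \Bigl( e^{\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_j,\mu_k) Z(\mu_j,\mu_j) \cot\frac{L}{2}\hat\xi(\mu_j) \\
&\qquad\qquad - e^{-\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_k,\mu_j) Z(\mu_k,\mu_k) \cot\frac{L}{2}\hat\xi(\mu_k) \Bigr) \\
&+ \frac{(ic)^2}{2\pi(\mu_j - \mu_k)}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} d\lambda \left(\frac{1}{\lambda - \mu_j} - \frac{1}{\lambda - \mu_k}\right) e^{\psi(\lambda) + \tau(\lambda)} \\
&\qquad\qquad \times e^{-\frac12(\psi(\mu_j) + \tau(\mu_j) + \psi(\mu_k) + \tau(\mu_k))} Z(\lambda,\mu_k) Z(\lambda,\mu_j) + O(1/L).
\end{aligned}\tag{6.4}$$

Here the principal value is understood as

$$\mathrm{V.P.}\!\!\int_{-\infty}^{\infty} \frac{d\lambda\,(\cdot)}{\lambda - \mu} = \frac12 \int_{-\infty}^{\infty} \frac{d\lambda\,(\cdot)}{\lambda - \mu + i0} + \frac12 \int_{-\infty}^{\infty} \frac{d\lambda\,(\cdot)}{\lambda - \mu - i0}.$$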

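Using (A.21) we calculate the sum (6.3):

$$\hat Q(\mu) = \frac{ic}{2\pi}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} \frac{d\lambda}{\lambda - \mu}\, e^{\psi(\lambda) + \tau(\lambda)}\, e^{-\frac12(\psi(\mu) + \tau(\mu))} Z(\lambda,\mu) - \frac{ic}{2}\, e^{\frac12(\psi(\mu) + \tau(\mu))} Z(\mu,\mu) \cot\frac{L}{2}\hat\xi(\mu) + O(1/L). \tag{6.5}$$

The function $\hat\xi(\mu)$ is defined in Appendix A as

$$\hat\xi(\mu) = \mu - \frac{i}{L}\phi_1(\mu)$$

(see (A.2)), and hence

$$\cot\frac{L}{2}\hat\xi(\mu) = i\, \frac{e^{iL\mu + \phi_1(\mu)} + 1}{e^{iL\mu + \phi_1(\mu)} - 1}, \qquad \sin^2\frac{L}{2}\hat\xi(\mu) = \frac14\Bigl(2 - e^{iL\mu + \phi_1(\mu)} - e^{-iL\mu - \phi_1(\mu)}\Bigr).$$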

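Let us turn back to the formula (5.21). We can move all $e^{p_1(\mu_m)}$ to the right vacuum $|0)$. Then each operator $\phi_1$ entering into $\hat U$ and $\hat Q$ should be replaced by the rule (see (5.5))

$$\prod_{m=1}^{N} e^{p_1(\mu_m)}\, e^{\phi_1(\mu_j)} = \prod_{m=1}^{N} \frac{h(\mu_m,\mu_j)}{h(\mu_j,\mu_m)}\, e^{\phi_1(\mu_j)} \prod_{m=1}^{N} e^{p_1(\mu_m)}.$$

Taking into account the Bethe equations (2.13),

$$e^{iL\mu_j} \prod_{m=1}^{N} \frac{h(\mu_m,\mu_j)}{h(\mu_j,\mu_m)} = -1, \qquad (N \text{ even}),$$

we get

$$\prod_{m=1}^{N} e^{p_1(\mu_m)} \cot\frac{L}{2}\hat\xi(\mu_j) = i\, \frac{\omega_-(\mu_j)}{\omega_+(\mu_j)} \prod_{m=1}^{N} e^{p_1(\mu_m)}, \qquad
\prod_{m=1}^{N} e^{p_1(\mu_m)} \sin^2\frac{L}{2}\hat\xi(\mu_j) = \left(\frac{\omega_+(\mu_j)}{2}\right)^{2} \prod_{m=1}^{N} e^{p_1(\mu_m)},$$

where

$$\omega_\pm(\mu) = e^{\phi_1(\mu)/2} \pm e^{-\phi_1(\mu)/2}. \tag{6.6}$$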
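The two trigonometric replacements leading to (6.6) amount to substituting $e^{iL\mu_j+\phi_1(\mu_j)} \to -e^{\phi_1(\mu_j)}$ in the expressions for $\cot\frac{L}{2}\hat\xi$ and $\sin^2\frac{L}{2}\hat\xi$. The following sketch checks that algebra numerically for a c-number stand-in `phi` of the (commuting) field $\phi_1$; the helper name and the sample values are illustrative assumptions.

```python
import cmath

def replacement_errors(phi):
    """Compare cot(x/2), sin^2(x/2) at e^{ix} = -e^{phi} with i*w_-/w_+ and (w_+/2)^2."""
    x = cmath.pi - 1j * phi            # chosen so that exp(i*x) = -exp(phi)
    cot_half = cmath.cos(x / 2) / cmath.sin(x / 2)
    sin2_half = cmath.sin(x / 2) ** 2
    w_plus = cmath.exp(phi / 2) + cmath.exp(-phi / 2)
    w_minus = cmath.exp(phi / 2) - cmath.exp(-phi / 2)
    return (abs(cot_half - 1j * w_minus / w_plus),
            abs(sin2_half - (w_plus / 2) ** 2))

if __name__ == "__main__":
    for phi in (0.0, 0.7, 0.3 + 0.4j, -1.1j):
        print(phi, replacement_errors(phi))
```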

The operator $2\pi\hat\rho(\lambda)$ also contains $\phi_1(\lambda)$, so it does not commute with $p_1(\mu)$:

$$\prod_{m=1}^{N} e^{p_1(\mu_m)}\, 2\pi\hat\rho(\mu_j) = 2\pi\hat R(\mu_j) \prod_{m=1}^{N} e^{p_1(\mu_m)},$$

where

$$2\pi\hat R(\mu) = 1 + \frac{1}{L} \sum_{m=1}^{N} K(\mu,\mu_m) - \frac{i}{L}\phi_1'(\mu). \tag{6.7}$$

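Hence we have the new representation for the correlation function

$$\begin{aligned}
\langle\psi(0,0)\psi^\dagger(x,t)\rangle_N ={}& \frac{e^{-iht}c^{-2N}}{\det\nolimits_N \frac{\partial\varphi_j}{\partial\mu_k}} \\
&\times (0| \prod_{m=1}^{N} e^{p_0(\mu_m)} \left(G_N(x,t) + \frac{\partial}{\partial\alpha}\right) \det\nolimits_N\Bigl(\tilde U_{jk} - \alpha \tilde Q(\mu_j)\tilde Q(\mu_k) + O(1/L)\Bigr) |0)\Bigr|_{\alpha=0},
\end{aligned}\tag{6.8}$$

where

$$\begin{aligned}
\tilde U_{jk} ={}& L\delta_{jk}\, \frac{(ic)^2\, 2\pi\hat R(\mu_j)}{\omega_+^2(\mu_j)}\, Z^2(\mu_j,\mu_j) \\
&- i\, \frac{(ic)^2}{2}\, \delta_{jk}\, \frac{\omega_-(\mu_j)}{\omega_+(\mu_j)}\, \frac{\partial}{\partial\mu_j} \Bigl[ e^{\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_j,\mu_k) Z(\mu_j,\mu_j) \\
&\qquad\qquad - e^{-\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_k,\mu_j) Z(\mu_k,\mu_k) \Bigr]_{\mu_k = \mu_j} \\
&- i\, \frac{(ic)^2}{2}\, \frac{1 - \delta_{jk}}{\mu_j - \mu_k} \Bigl( e^{\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_j,\mu_k) Z(\mu_j,\mu_j)\, \frac{\omega_-(\mu_j)}{\omega_+(\mu_j)} \\
&\qquad\qquad - e^{-\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_k,\mu_j) Z(\mu_k,\mu_k)\, \frac{\omega_-(\mu_k)}{\omega_+(\mu_k)} \Bigr) \\
&+ \frac{(ic)^2}{2\pi(\mu_j - \mu_k)}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} d\lambda \left(\frac{1}{\lambda - \mu_j} - \frac{1}{\lambda - \mu_k}\right) e^{\psi(\lambda) + \tau(\lambda)} \\
&\qquad\qquad \times e^{-\frac12(\psi(\mu_j) + \tau(\mu_j) + \psi(\mu_k) + \tau(\mu_k))} Z(\lambda,\mu_k) Z(\lambda,\mu_j),
\end{aligned}\tag{6.9}$$

$$\tilde Q(\mu) = \frac{ic}{2\pi}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} \frac{d\lambda}{\lambda - \mu}\, e^{\psi(\lambda) + \tau(\lambda)}\, e^{-\frac12(\psi(\mu) + \tau(\mu))} Z(\lambda,\mu) + \frac{c}{2}\, e^{\frac12(\psi(\mu) + \tau(\mu))} Z(\mu,\mu)\, \frac{\omega_-(\mu)}{\omega_+(\mu)}. \tag{6.10}$$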

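Let us simplify formulæ (6.9) and (6.10). First, in (6.9) the term proportional to $1 - \delta_{jk}$ is defined only for $j \ne k$. Let us continue this term for all $j$ and $k$ using l'Hôpital's rule for $j = k$. Then

$$\begin{aligned}
\tilde U_{jk} ={}& L\delta_{jk}\, \frac{(ic)^2\, 2\pi\rho_L(\mu_j)}{\omega_+^2(\mu_j)}\, Z^2(\mu_j,\mu_j) \\
&- i\, \frac{(ic)^2}{2}\, \frac{1}{\mu_j - \mu_k} \Bigl( e^{\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_j,\mu_k) Z(\mu_j,\mu_j)\, \frac{\omega_-(\mu_j)}{\omega_+(\mu_j)} \\
&\qquad\qquad - e^{-\frac12(\psi(\mu_j) - \psi(\mu_k) + \tau(\mu_j) - \tau(\mu_k))} Z(\mu_k,\mu_j) Z(\mu_k,\mu_k)\, \frac{\omega_-(\mu_k)}{\omega_+(\mu_k)} \Bigr) \\
&+ \frac{(ic)^2}{2\pi(\mu_j - \mu_k)}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} d\lambda \left(\frac{1}{\lambda - \mu_j} - \frac{1}{\lambda - \mu_k}\right) e^{\psi(\lambda) + \tau(\lambda)} \\
&\qquad\qquad \times e^{-\frac12(\psi(\mu_j) + \tau(\mu_j) + \psi(\mu_k) + \tau(\mu_k))} Z(\lambda,\mu_k) Z(\lambda,\mu_j),
\end{aligned}\tag{6.11}$$

where

$$2\pi\rho_L(\mu) = 1 + \frac{1}{L} \sum_{m=1}^{N} K(\mu,\mu_m). \tag{6.12}$$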

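Using the Sokhotsky formula

$$\mathrm{V.P.}\,\frac{1}{x} = \frac{1}{x \pm i0} \pm i\pi\delta(x),$$

one can rewrite the expressions (6.11) and (6.10) as follows:

$$\begin{aligned}
\tilde U_{jk} ={}& L\delta_{jk}\, \frac{(ic)^2\, 2\pi\rho_L(\mu_j)}{\omega_+^2(\mu_j)}\, Z^2(\mu_j,\mu_j) \\
&+ \frac{(ic)^2}{2\pi(\mu_j - \mu_k)} \int_{-\infty}^{\infty} \frac{d\lambda}{\omega_+(\lambda)} \left( \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu_j + i0} + \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu_j - i0} - \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu_k + i0} - \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu_k - i0} \right) \\
&\qquad\qquad \times e^{\psi(\lambda) + \tau(\lambda)}\, e^{-\frac12(\psi(\mu_j) + \tau(\mu_j) + \psi(\mu_k) + \tau(\mu_k))} Z(\lambda,\mu_k) Z(\lambda,\mu_j),
\end{aligned}\tag{6.13}$$

$$\tilde Q(\mu) = \frac{ic}{2\pi} \int_{-\infty}^{\infty} \frac{d\lambda}{\omega_+(\lambda)} \left( \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu + i0} + \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu - i0} \right) e^{\psi(\lambda) + \tau(\lambda)}\, e^{-\frac12(\psi(\mu) + \tau(\mu))} Z(\lambda,\mu). \tag{6.14}$$

(The assignment of $e^{\pm\frac12\phi_1}$ to the $\pm i0$ prescriptions is fixed by requiring that the $\delta$-function parts reproduce the $\omega_-/\omega_+$ terms of (6.10) and (6.11).)

Now let us move the term proportional to the length of the box $L$ out of the determinant:

$$\begin{aligned}
\det\nolimits_N\bigl(\tilde U_{jk} - \alpha\tilde Q(\mu_j)\tilde Q(\mu_k)\bigr) ={}& \prod_{a=1}^{N} \left[ (ic)^2\, 2\pi\rho_L(\mu_a) L \left(\frac{Z(\mu_a,\mu_a)}{\omega_+(\mu_a)}\right)^{2} \right] \\
&\times \det\nolimits_N\left( \frac{\omega_+(\mu_j)\omega_+(\mu_k)}{(ic)^2\, 2\pi\rho_L(\mu_k) L} \cdot \frac{\tilde U_{jk} - \alpha\tilde Q(\mu_j)\tilde Q(\mu_k)}{Z(\mu_j,\mu_j) Z(\mu_k,\mu_k)} \right).
\end{aligned}\tag{6.15}$$

One can move the product $\prod_{a=1}^{N} \bigl(Z(\mu_a,\mu_a)/\omega_+(\mu_a)\bigr)^2$ to the left vacuum $(0|$:

$$(0| \prod_{a=1}^{N} e^{p_0(\mu_a)} \left(\frac{Z(\mu_a,\mu_a)}{\omega_+(\mu_a)}\right)^{2} = (0| \prod_{a=1}^{N} e^{p_0(\mu_a)} \left( \frac{e^{-p_{A_1}(\mu_a)} + e^{-p_{D_2}(\mu_a)}}{e^{p_2(\mu_a)/2} + e^{-p_2(\mu_a)/2}} \right)^{2} \equiv (0| \prod_{a=1}^{N} \mathcal P(\mu_a). \tag{6.16}$$

Now let us move the product in the r.h.s. of (6.16) to the right vacuum. The only operator in the expressions for $\tilde U$ and $\tilde Q$ which does not commute with $\mathcal P(\mu_a)$ is $\psi(\lambda) = \phi_0(\lambda) + \phi_{A_1}(\lambda) + \phi_{D_2}(\lambda) + \phi_2(\lambda)$. In order to move $\prod_{a=1}^{N} \mathcal P(\mu_a)$ through the determinant we use the following lemma.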


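Lemma 6.1. For arbitrary $M = 1, 2, \ldots$ and arbitrary complex numbers $\lambda_1, \ldots, \lambda_M$, $\beta_1, \ldots, \beta_M$,

$$\mathcal P(\mu_a) \prod_{m=1}^{M} e^{\beta_m\psi(\lambda_m)} |0) = \prod_{m=1}^{M} e^{\beta_m\psi(\lambda_m)} |0). \tag{6.17}$$

Proof. The proof is straightforward:

$$\begin{aligned}
\mathcal P(\mu_a) \prod_{m=1}^{M} e^{\beta_m\psi(\lambda_m)} |0)
&= e^{p_0(\mu_a)} \left( \frac{e^{-p_{A_1}(\mu_a)} + e^{-p_{D_2}(\mu_a)}}{e^{p_2(\mu_a)/2} + e^{-p_2(\mu_a)/2}} \right)^{2} \prod_{m=1}^{M} e^{\beta_m\psi(\lambda_m)} |0) \\
&= \prod_{m=1}^{M} \bigl[h(\mu_a,\lambda_m) h(\lambda_m,\mu_a)\bigr]^{\beta_m} \\
&\quad\times \left( \frac{\prod_{m=1}^{M} [h(\lambda_m,\mu_a)]^{-\beta_m} + \prod_{m=1}^{M} [h(\mu_a,\lambda_m)]^{-\beta_m}}{\prod_{m=1}^{M} \left[\frac{h(\mu_a,\lambda_m)}{h(\lambda_m,\mu_a)}\right]^{\beta_m/2} + \prod_{m=1}^{M} \left[\frac{h(\lambda_m,\mu_a)}{h(\mu_a,\lambda_m)}\right]^{\beta_m/2}} \right)^{2} \prod_{m=1}^{M} e^{\beta_m\psi(\lambda_m)} |0) \\
&= \prod_{m=1}^{M} e^{\beta_m\psi(\lambda_m)} |0).
\end{aligned}$$

This proves the lemma.

Since the determinant in the r.h.s. of (6.15), being a function of the operator $\psi$, is some linear combination of products of the type $\prod_{m=1}^{M} e^{\beta_m\psi(\lambda_m)}$ (with different $M$, $\{\beta\}$ and $\{\lambda\}$), we can move $\prod_{a=1}^{N} \mathcal P(\mu_a)$ to the right vacuum without changing the matrix elements of the determinant (6.15). Therefore we have

$$\langle\psi(0,0)\psi^\dagger(x,t)\rangle_N = e^{-iht}\, \frac{\prod_{a=1}^{N} 2\pi\rho_L(\mu_a) L}{\det\nolimits_N \frac{\partial\varphi_j}{\partial\mu_k}} \times (0| \left(G_N(x,t) + \frac{\partial}{\partial\alpha}\right) \cdot \det\nolimits_N\bigl(W_{jk} + O(1/L^2)\bigr) |0)\Bigr|_{\alpha=0}, \tag{6.18}$$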

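where

$$W_{jk} = \delta_{jk} + \frac{1}{2\pi\rho_L(\mu_k) L} \bigl(V_{jk} - \alpha P(\mu_j) P(\mu_k)\bigr), \tag{6.19}$$

and

$$\begin{aligned}
V_{jk} ={}& \frac{\omega_+(\mu_j)\omega_+(\mu_k)}{2\pi(\mu_j - \mu_k) Z(\mu_j,\mu_j) Z(\mu_k,\mu_k)} \\
&\times \int_{-\infty}^{\infty} \frac{d\lambda}{\omega_+(\lambda)} \left( \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu_j + i0} + \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu_j - i0} - \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu_k + i0} - \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu_k - i0} \right) \\
&\times e^{\psi(\lambda) + \tau(\lambda)}\, e^{-\frac12(\psi(\mu_j) + \tau(\mu_j) + \psi(\mu_k) + \tau(\mu_k))} Z(\lambda,\mu_k) Z(\lambda,\mu_j),
\end{aligned}\tag{6.20}$$

$$P(\mu) = \frac{\omega_+(\mu)}{2\pi Z(\mu,\mu)} \int_{-\infty}^{\infty} \frac{d\lambda}{\omega_+(\lambda)} \left( \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu + i0} + \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu - i0} \right) \times e^{\psi(\lambda) + \tau(\lambda)}\, e^{-\frac12(\psi(\mu) + \tau(\mu))} Z(\lambda,\mu). \tag{6.21}$$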
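The cancellation at the heart of the proof of Lemma 6.1 is a scalar identity: once the commutators (5.2) are inserted, the prefactor from $e^{p_0}$ exactly compensates the squared ratio. The sketch below checks this numerically. It is an illustration under stated assumptions: the commutators are represented by sums of logarithms on a single fixed branch, and the function names and sample values are hypothetical.

```python
import cmath

def h(lam, mu, c):
    """h(lam, mu) = (lam - mu + i c) / (i c), as recalled after (5.2)."""
    return (lam - mu + 1j * c) / (1j * c)

def lemma_factor(mu_a, lams, betas, c):
    """Scalar factor produced when P(mu_a) is moved through prod_m exp(beta_m * psi(lam_m))."""
    log_a = sum(b * cmath.log(h(mu_a, lam, c)) for lam, b in zip(lams, betas))
    log_b = sum(b * cmath.log(h(lam, mu_a, c)) for lam, b in zip(lams, betas))
    prefactor = cmath.exp(log_a + log_b)                # from exp(p_0)
    numerator = cmath.exp(-log_b) + cmath.exp(-log_a)   # from exp(-p_A1) + exp(-p_D2)
    denominator = (cmath.exp(0.5 * (log_a - log_b))
                   + cmath.exp(0.5 * (log_b - log_a)))  # from exp(p_2/2) + exp(-p_2/2)
    return prefactor * (numerator / denominator) ** 2

if __name__ == "__main__":
    factor = lemma_factor(0.3, [-1.2, 0.5 + 0.2j, 2.0], [1.0, -0.5, 0.25 + 0.1j], 1.5)
    print(factor)
```

Writing $A = e^{\texttt{log\_a}}$ and $B = e^{\texttt{log\_b}}$, both the numerator times the prefactor and the squared denominator equal $(A+B)^2/(AB)$, which is why the factor is one.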

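Recall that in the thermodynamic limit

$$\det\nolimits_N \frac{\partial\varphi_j}{\partial\mu_k} \to \prod_{a=1}^{N} 2\pi\rho(\mu_a) L\; \det\Bigl(\hat I - \frac{1}{2\pi}\hat K\Bigr), \qquad \text{(see (2.23))}, \tag{6.22}$$

$$G_N(x,t) \to G(x,t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{\psi(\nu) + \tau(\nu)}\, d\nu, \tag{6.23}$$

$$\rho_L(\lambda) \to \rho(\lambda), \qquad \text{(see (2.16))}.$$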

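The formula (6.18) contains the determinant of the matrix $W$. In the thermodynamic limit it will turn into a determinant of an integral operator. The simplest way to see this is to express $\det W$ in terms of traces of powers of the matrix $(V - \alpha P P)$. The replacement of summation by integration (in the limit) is straightforward and is explained in detail in Sect. XI.4 of [3]. Therefore the determinant tends to the Fredholm determinant. Now we arrive at the main theorem.

Theorem 6.2. In the thermodynamic limit, the time-dependent correlation function has the following Fredholm determinant formula:

$$\langle\psi(0,0)\psi^\dagger(x,t)\rangle = e^{-iht}\, (0| \left(G(x,t) + \frac{\partial}{\partial\alpha}\right) \cdot \frac{\det\bigl(\hat I + \frac{1}{2\pi}\hat V_\alpha\bigr)}{\det\bigl(\hat I - \frac{1}{2\pi}\hat K\bigr)} |0)\Bigr|_{\alpha=0}. \tag{6.24}$$

Here the integral operator $\hat V_\alpha$ is given by

$$\bigl(\hat V_\alpha f\bigr)(\lambda) = \int_{-q}^{q} \bigl(\hat V(\lambda,\mu) - \alpha\hat P(\mu)\hat P(\lambda)\bigr) f(\mu)\, d\mu, \tag{6.25}$$

where $q$ is the value of the spectral parameter on the Fermi surface. Here the kernels $\hat V(\mu_1,\mu_2)$ and $\hat P(\mu)$ are given by

$$\begin{aligned}
\hat V(\mu_1,\mu_2) ={}& \frac{\omega_+(\mu_1)\omega_+(\mu_2)}{2\pi(\mu_1 - \mu_2) Z(\mu_1,\mu_1) Z(\mu_2,\mu_2)} \\
&\times \int_{-\infty}^{\infty} \frac{d\lambda}{\omega_+(\lambda)} \left( \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu_1 + i0} + \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu_1 - i0} - \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu_2 + i0} - \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu_2 - i0} \right) \\
&\times e^{\psi(\lambda) + \tau(\lambda)}\, e^{-\frac12(\psi(\mu_1) + \tau(\mu_1) + \psi(\mu_2) + \tau(\mu_2))} Z(\lambda,\mu_2) Z(\lambda,\mu_1),
\end{aligned}\tag{6.26}$$

and

$$\hat P(\mu) = \frac{\omega_+(\mu)}{2\pi Z(\mu,\mu)} \int_{-\infty}^{\infty} \frac{d\lambda}{\omega_+(\lambda)} \left( \frac{e^{\frac12\phi_1(\lambda)}}{\lambda - \mu + i0} + \frac{e^{-\frac12\phi_1(\lambda)}}{\lambda - \mu - i0} \right) \times e^{\psi(\lambda) + \tau(\lambda)}\, e^{-\frac12(\psi(\mu) + \tau(\mu))} Z(\lambda,\mu), \tag{6.27}$$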

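where

$$\omega_+(\mu) = e^{\phi_1(\mu)/2} + e^{-\phi_1(\mu)/2}, \qquad Z(\lambda,\mu) = \frac{e^{-\phi_{D_1}(\lambda)}}{h(\mu,\lambda)} + \frac{e^{-\phi_{A_2}(\lambda)}}{h(\lambda,\mu)}, \qquad \tau(\lambda) = it\lambda^2 - ix\lambda.$$

The integral operator $\hat K$ is given in (2.24).

We want to emphasize that formula (6.24) is our main result. It is easy to show that it has the correct free fermionic limit. If $c \to +\infty$ (free fermionic case), then all commutators (5.2) of auxiliary "momenta" and "coordinates" go to zero because in this limit $h(\lambda,\mu) \to 1$. Hence one can put all dual fields $\phi_a(\lambda) = 0$. In particular

$$\psi(\lambda) = 0, \qquad \phi_1(\lambda) = 0, \qquad \omega_+(\lambda) = 2, \qquad Z(\lambda,\mu) = 2,$$

whereby we have, for $c \to \infty$,

$$\hat V(\mu_1,\mu_2) = \frac{2}{\pi(\mu_1 - \mu_2)}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} d\lambda \left(\frac{1}{\lambda - \mu_1} - \frac{1}{\lambda - \mu_2}\right) e^{\tau(\lambda) - \frac12\tau(\mu_1) - \frac12\tau(\mu_2)}, \tag{6.28}$$

$$\hat P(\mu) = \frac{1}{\pi}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} \frac{d\lambda}{\lambda - \mu}\, e^{\tau(\lambda) - \frac12\tau(\mu)}, \tag{6.29}$$

$$\hat K = 0, \tag{6.30}$$

$$G(x,t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\nu\, e^{\tau(\nu)}. \tag{6.31}$$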

Substitution of these formulæ into (6.24) reproduces the result of [11]. In order to obtain Lenard’s determinant formula [2] one should also put t = 0.
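The remark before Theorem 6.2, that $\det W$ can be expressed through traces of powers of a matrix and then passes to a Fredholm determinant, can be illustrated on a simple c-number kernel. The sketch below is an illustration only: it uses the Lorentzian kernel $K(\lambda,\mu) = 2c/(c^2+(\lambda-\mu)^2)$ on $[-q,q]$ as a stand-in (the operator-valued kernel $\hat V_\alpha$ is not implemented), a midpoint quadrature chosen for simplicity, and hypothetical function names. It compares $\det(I+A)$ obtained by elimination with $\exp\sum_{k\ge1}\frac{(-1)^{k+1}}{k}\operatorname{tr}A^k$, where $A$ discretizes $-\frac{1}{2\pi}K$.

```python
import math

def kernel(lam, mu, c):
    """Lorentzian kernel 2c / (c^2 + (lam - mu)^2), used here as a stand-in."""
    return 2.0 * c / (c * c + (lam - mu) ** 2)

def discretize(q, c, n):
    """Midpoint-rule matrix A approximating the operator -(1/2pi) K on [-q, q]."""
    step = 2.0 * q / n
    nodes = [-q + (i + 0.5) * step for i in range(n)]
    return [[-step / (2.0 * math.pi) * kernel(x, y, c) for y in nodes] for x in nodes]

def det_identity_plus(a):
    """det(I + A) by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return det

def matmul(a, b):
    """Plain dense matrix product."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]

def det_from_traces(a, terms):
    """exp( sum_k (-1)^(k+1) tr(A^k) / k ), valid when the spectral radius of A is below 1."""
    log_det = 0.0
    power = [row[:] for row in a]
    for k in range(1, terms + 1):
        trace = sum(power[i][i] for i in range(len(a)))
        log_det += (-1.0) ** (k + 1) * trace / k
        power = matmul(power, a)
    return math.exp(log_det)

if __name__ == "__main__":
    a = discretize(q=1.0, c=1.0, n=20)
    print(det_identity_plus(a), det_from_traces(a, 60))
```

Each trace $\operatorname{tr}A^k$ is a $k$-fold Riemann sum, so refining the grid turns it into the corresponding multiple integral of the kernel; this is the mechanism by which the finite determinant tends to the Fredholm determinant.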

Summary

The main result of the paper is formula (6.24). It represents the correlation function of local fields (in the infinite volume) as a mean value of a determinant of a Fredholm integral operator. In order to obtain this formula we introduced an auxiliary Fock space and auxiliary Bose fields (all of them belong to the same Abelian sub-algebra). This is the first step in the description of the correlation function. In forthcoming publications we shall describe the Fredholm determinant by a completely integrable integro-differential equation. Then we shall solve this equation by means of the Riemann-Hilbert problem and evaluate its long-distance asymptotics.


A. Summation of Singular Expressions

Let us consider Eq. (5.15),

$$L\lambda_n - i\phi_1(\lambda_n) = 2\pi n, \tag{A.1}$$

where $i\phi_1(\lambda)$ is a real and bounded function for $\operatorname{Im}\lambda = 0$. Let us introduce a function $\hat\xi(\lambda)$,

$$\hat\xi(\lambda) = \lambda - \frac{i}{L}\phi_1(\lambda). \tag{A.2}$$

Obviously

$$\hat\xi(\lambda_n) = \frac{2\pi n}{L}. \tag{A.3}$$

Comparing with (5.20) we get

$$2\pi\hat\rho(\lambda) = 1 - \frac{i}{L}\phi_1'(\lambda) = \hat\xi'(\lambda). \tag{A.4}$$

It follows from Eq. (A.1) that

$$|\lambda_{n+1} - \lambda_n| \le \frac{2}{L}(\pi + M), \qquad \text{where} \quad M = \sup_{-\infty<\lambda<\infty} |\phi_1(\lambda)|.$$

Hence, $|\lambda_{n+1} - \lambda_n| \to 0$ if $L \to \infty$ and we can make the following estimate:

$$\frac{1}{L(\lambda_{n+1} - \lambda_n)} = \hat\rho(\lambda_n) + O(1/L^2). \tag{A.5}$$

Due to (5.18) we have $2\pi\hat\rho(\lambda) > 0$. During the study of the thermodynamic limit the following sums appeared:

$$S = \frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{f(\lambda_n)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu)}. \tag{A.6}$$

Here $f(\lambda)$ is some smooth function, $\mu$ is some fixed point on the real axis. We shall be interested in the asymptotic of this sum when $L$ goes to infinity. Let us present (A.6) as the sum of three summands

$$S = \frac{1}{2\pi L} \left( \sum_{n=-\infty}^{N_1-1} \frac{f(\lambda_n)}{\hat\rho(\lambda_n)(\lambda_n - \mu)} + \sum_{n=N_1}^{N_2} \frac{f(\lambda_n)}{\hat\rho(\lambda_n)(\lambda_n - \mu)} + \sum_{n=N_2+1}^{\infty} \frac{f(\lambda_n)}{\hat\rho(\lambda_n)(\lambda_n - \mu)} \right). \tag{A.7}$$

Here $N_1$ and $N_2$ are integers such that in the limit $L \to \infty$, the following properties are valid:

$$0 < \mu - \lambda_{N_1} < \infty, \qquad 0 < \lambda_{N_2} - \mu < \infty. \tag{A.8}$$

Obviously the first and the third summands in (A.7) have no singularities in the domain of summation. The corresponding sums are integral sums, for example

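$$\begin{aligned}
S_1 &= \frac{1}{L} \sum_{n=-\infty}^{N_1-1} \frac{f(\lambda_n)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu)} = \sum_{n=-\infty}^{N_1-1} \frac{f(\lambda_n)(\lambda_{n+1} - \lambda_n)}{2\pi(\lambda_n - \mu)} + O(1/L^2) \\
&= \frac{1}{2\pi} \int_{-\infty}^{\lambda_{N_1}} \frac{f(\lambda)}{\lambda - \mu}\, d\lambda + O(1/L).
\end{aligned}\tag{A.9}$$

An analogous formula is valid for $S_3$,

$$S_3 = \frac{1}{L} \sum_{n=N_2+1}^{\infty} \frac{f(\lambda_n)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu)} = \frac{1}{2\pi} \int_{\lambda_{N_2}}^{\infty} \frac{f(\lambda)}{\lambda - \mu}\, d\lambda + O(1/L). \tag{A.10}$$

Consider the second summand in (A.7),

$$S_2 = \frac{1}{L} \sum_{n=N_1}^{N_2} \frac{f(\lambda_n)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu)}.$$

One can present $S_2$ in the following form: $S_2 = S_2^{(1)} + S_2^{(2)}$, where

$$S_2^{(1)} = \frac{1}{L} \sum_{n=N_1}^{N_2} \left( \frac{f(\lambda_n)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu)} - \frac{f(\mu)}{\hat\xi(\lambda_n) - \hat\xi(\mu)} \right), \qquad
S_2^{(2)} = \frac{f(\mu)}{L} \sum_{n=N_1}^{N_2} \frac{1}{\hat\xi(\lambda_n) - \hat\xi(\mu)}. \tag{A.11}$$

Due to (A.4), $S_2^{(1)}$ has no singularities in the domain of summation. Therefore it can be replaced by the corresponding integral

$$S_2^{(1)} = \frac{1}{2\pi} \int_{\lambda_{N_1}}^{\lambda_{N_2}} \left( \frac{f(\lambda)}{\lambda - \mu} - \frac{2\pi\hat\rho(\lambda) f(\mu)}{\hat\xi(\lambda) - \hat\xi(\mu)} \right) d\lambda + O(1/L). \tag{A.12}$$

Using (A.3) we can rewrite $S_2^{(2)}$ in the following form:

$$S_2^{(2)} = \frac{f(\mu)}{2\pi} \sum_{n=N_1}^{N_2} \frac{1}{n - \frac{L}{2\pi}\hat\xi(\mu)}.$$

The last sum can be calculated explicitly in terms of the logarithmic derivative of the $\Gamma$-function:

$$\psi(x) = \frac{d}{dx} \ln\Gamma(x).$$

We shall use the following properties of the $\psi$-function: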

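$$\psi(x) + \frac{1}{x} = \psi(x+1), \tag{A.13}$$

$$\psi(x) - \psi(1-x) = -\pi\cot\pi x, \tag{A.14}$$

$$\psi(x) \to \ln x + O(1/x), \qquad x \to +\infty. \tag{A.15}$$

Now using (A.13) we can write

$$S_2^{(2)} = \frac{f(\mu)}{2\pi} \left( \psi\Bigl(N_2 - \frac{L}{2\pi}\hat\xi(\mu) + 1\Bigr) - \psi\Bigl(N_1 - \frac{L}{2\pi}\hat\xi(\mu)\Bigr) \right). \tag{A.16}$$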

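The argument of the second $\psi$-function in (A.16) is negative. Using (A.14) one can flip the sign of this argument

$$S_2^{(2)} = \frac{f(\mu)}{2\pi} \left( \psi\Bigl(N_2 - \frac{L}{2\pi}\hat\xi(\mu) + 1\Bigr) - \psi\Bigl(\frac{L}{2\pi}\hat\xi(\mu) - N_1 + 1\Bigr) - \pi\cot\frac{L}{2}\hat\xi(\mu) \right). \tag{A.17}$$

Remember now that $0 < \mu - \lambda_{N_1} < \infty$ and $0 < \lambda_{N_2} - \mu < \infty$ (see (A.8)). This means that the arguments of the $\psi$-functions in (A.17) tend to infinity if $L \to \infty$. Therefore we can use the asymptotic formula (A.15),

$$S_2^{(2)} = \frac{f(\mu)}{2\pi} \left[ \ln\left( \frac{N_2 - \frac{L}{2\pi}\hat\xi(\mu)}{\frac{L}{2\pi}\hat\xi(\mu) - N_1} \right) - \pi\cot\frac{L}{2}\hat\xi(\mu) \right] + O(1/L). \tag{A.18}$$

Now let us turn back to (A.12). Let us present the r.h.s. as the difference of two integrals. Both of them should be understood in the sense of principal value (V.P.):

$$S_2^{(1)} = \frac{1}{2\pi}\, \mathrm{V.P.}\!\!\int_{\lambda_{N_1}}^{\lambda_{N_2}} \frac{f(\lambda)}{\lambda - \mu}\, d\lambda - \mathrm{V.P.}\!\!\int_{\lambda_{N_1}}^{\lambda_{N_2}} \frac{f(\mu)\hat\rho(\lambda)}{\hat\xi(\lambda) - \hat\xi(\mu)}\, d\lambda + O(1/L). \tag{A.19}$$

Due to (A.4) one can compute the second term in (A.19) explicitly

$$\begin{aligned}
\mathrm{V.P.}\!\!\int_{\lambda_{N_1}}^{\lambda_{N_2}} \frac{f(\mu)\hat\rho(\lambda)}{\hat\xi(\lambda) - \hat\xi(\mu)}\, d\lambda
&= \frac{f(\mu)}{2\pi}\, \mathrm{V.P.}\!\!\int_{\lambda_{N_1}}^{\lambda_{N_2}} \frac{d\hat\xi(\lambda)}{\hat\xi(\lambda) - \hat\xi(\mu)}
= \frac{f(\mu)}{2\pi} \ln\left( \frac{\hat\xi(\lambda_{N_2}) - \hat\xi(\mu)}{\hat\xi(\mu) - \hat\xi(\lambda_{N_1})} \right) \\
&= \frac{f(\mu)}{2\pi} \ln\left( \frac{N_2 - \frac{L}{2\pi}\hat\xi(\mu)}{\frac{L}{2\pi}\hat\xi(\mu) - N_1} \right).
\end{aligned}\tag{A.20}$$

Combining now (A.18), (A.19) and (A.20) we get

$$S_2 = \frac{1}{2\pi}\, \mathrm{V.P.}\!\!\int_{\lambda_{N_1}}^{\lambda_{N_2}} \frac{f(\lambda)}{\lambda - \mu}\, d\lambda - \frac{f(\mu)}{2} \cot\frac{L}{2}\hat\xi(\mu) + O(1/L).$$

Finally, using (A.9) and (A.10) we find

$$S = \frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{f(\lambda_n)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu)} = \frac{1}{2\pi}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} \frac{f(\lambda)}{\lambda - \mu}\, d\lambda - \frac{f(\mu)}{2} \cot\frac{L}{2}\hat\xi(\mu) + O(1/L). \tag{A.21}$$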

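This formula describes the asymptotic behavior of the sum (A.6). For the evaluation of the thermodynamic limit of our determinant representation it is necessary to consider a sum containing the second order pole,

$$S' = \frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{f(\lambda_n)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu)^2}. \tag{A.22}$$

Taking the derivative of (A.21) with respect to $\mu$, we get

$$S' = L\, \frac{2\pi\hat\rho(\mu) f(\mu)}{4\sin^2\frac{L}{2}\hat\xi(\mu)} + \frac{1}{2\pi} \frac{\partial}{\partial\mu}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} \frac{f(\lambda)}{\lambda - \mu}\, d\lambda - \frac12 f'(\mu) \cot\frac{L}{2}\hat\xi(\mu) + O(1/L). \tag{A.23}$$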

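We can use formulæ (A.21) and (A.23) to calculate the thermodynamic limit of (6.2),

$$\begin{aligned}
&\frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{f(\lambda_n|\mu_j,\mu_k)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu_j)(\lambda_n - \mu_k)} \\
&\qquad= -\frac{1}{2(\mu_j - \mu_k)} \left( f(\mu_j|\mu_j,\mu_k) \cot\frac{L}{2}\hat\xi(\mu_j) - f(\mu_k|\mu_j,\mu_k) \cot\frac{L}{2}\hat\xi(\mu_k) \right) \\
&\qquad\quad + \frac{1}{2\pi(\mu_j - \mu_k)}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} d\lambda\, f(\lambda|\mu_j,\mu_k) \left( \frac{1}{\lambda - \mu_j} - \frac{1}{\lambda - \mu_k} \right) + O(1/L).
\end{aligned}\tag{A.24}$$

One should understand the r.h.s. of this equality by l'Hôpital's rule if $j = k$. It is also useful to extract explicitly the term proportional to the length of the box $L$,

$$\begin{aligned}
&\frac{1}{L} \sum_{n=-\infty}^{\infty} \frac{f(\lambda_n|\mu_j,\mu_k)}{2\pi\hat\rho(\lambda_n)(\lambda_n - \mu_j)(\lambda_n - \mu_k)} = \delta_{jk} L\, \frac{2\pi\hat\rho(\mu_j) f(\mu_j|\mu_j,\mu_j)}{4\sin^2\frac{L}{2}\hat\xi(\mu_j)} \\
&\qquad + \frac{1}{2\pi(\mu_j - \mu_k)}\, \mathrm{V.P.}\!\!\int_{-\infty}^{\infty} d\lambda\, f(\lambda|\mu_j,\mu_k) \left( \frac{1}{\lambda - \mu_j} - \frac{1}{\lambda - \mu_k} \right) \\
&\qquad - \frac{1 - \delta_{jk}}{2(\mu_j - \mu_k)} \left( f(\mu_j|\mu_j,\mu_k) \cot\frac{L}{2}\hat\xi(\mu_j) - f(\mu_k|\mu_j,\mu_k) \cot\frac{L}{2}\hat\xi(\mu_k) \right) \\
&\qquad - \frac{\delta_{jk}}{2} \cot\frac{L}{2}\hat\xi(\mu_j)\, \frac{\partial}{\partial\mu_j} \Bigl( f(\mu_j|\mu_j,\mu_k) - f(\mu_k|\mu_j,\mu_k) \Bigr)\Bigr|_{\mu_k=\mu_j} + O(1/L).
\end{aligned}\tag{A.25}$$

We have used formulæ (A.21) and (A.25) in Sect. 6.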

B. Representation of Dual Fields

What is the relation between our dual fields and the canonical Bose fields? Canonical Bose fields $\psi_l(\lambda)$ can be characterized as follows:

$$[\psi_l(\lambda), \psi_m^\dagger(\mu)] = \delta_{lm}\delta(\lambda - \mu), \tag{B.1}$$

(do not confuse this $\psi_l(\lambda)$ with the dual field $\psi(\lambda)$) and

$$\psi_l(\lambda)|0) = 0, \qquad (0|\,\psi_l^\dagger(\lambda) = 0. \tag{B.2}$$

The dual fields which appeared in this paper have the form

$$\phi_a(\lambda) = q_a(\lambda) + p_a(\lambda), \tag{B.3}$$

where $p_a(\lambda)$ is the annihilation part of $\phi_a(\lambda)$ and $q_a(\lambda)$ is its creation part. Their commutation relations are

$$[p_a(\lambda), q_b(\mu)] = \alpha_{ab}(\lambda,\mu). \tag{B.4}$$

Here $\alpha_{ab}(\lambda,\mu)$ is some complex function,

$$p_a(\lambda)|0) = 0, \qquad (0|\,q_a(\lambda) = 0. \tag{B.5}$$

One can represent our $p_a$ and $q_b$ in terms of $\psi_l$ and $\psi_l^\dagger$, for example as

$$p_a(\lambda) = \psi_a(\lambda), \tag{B.6}$$

$$q_b(\mu) = \sum_c \int_{-\infty}^{\infty} d\nu\, \alpha_{cb}(\nu,\mu)\, \psi_c^\dagger(\nu). \tag{B.7}$$
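As a short check that the realization (B.6), (B.7) reproduces the commutation relations (B.4), one can insert it into the commutator and use (B.1); nothing beyond those three formulæ is assumed.

```latex
\begin{aligned}
[p_a(\lambda), q_b(\mu)]
  &= \sum_c \int_{-\infty}^{\infty} d\nu\, \alpha_{cb}(\nu,\mu)\, [\psi_a(\lambda), \psi_c^\dagger(\nu)] \\
  &= \sum_c \int_{-\infty}^{\infty} d\nu\, \alpha_{cb}(\nu,\mu)\, \delta_{ac}\, \delta(\lambda - \nu) \\
  &= \alpha_{ab}(\lambda,\mu).
\end{aligned}
```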

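This shows that the dual fields which appear in this paper are linear combinations of the standard Bose fields. Let us now consider a related issue. We can realize the dual fields $\phi_1(\lambda)$ and $\phi_2(\lambda)$ as

$$\begin{aligned}
q_2(\lambda) &= \psi_1^\dagger(\lambda), & p_1(\lambda) &= \psi_1(\lambda); \\
q_1(\lambda) &= \int_{-\infty}^{\infty} \ln\frac{h(\nu,\lambda)}{h(\lambda,\nu)}\, \psi_2^\dagger(\nu)\, d\nu, & p_2(\lambda) &= \int_{-\infty}^{\infty} \ln\frac{h(\nu,\lambda)}{h(\lambda,\nu)}\, \psi_2(\nu)\, d\nu.
\end{aligned}$$

Here $\dagger$ means Hermitian conjugation, and

$$[\psi_1(\lambda), \psi_2^\dagger(\mu)] = [\psi_2(\lambda), \psi_1^\dagger(\mu)] = \delta(\lambda - \mu).$$

Other commutators are equal to zero. These commutation relations differ from (B.1) only by a trivial relabeling. Then

$$\phi_2(\lambda) = \psi_1^\dagger(\lambda) + \psi_1(\lambda), \qquad \phi_1(\lambda) = \int_{-\infty}^{\infty} \ln\frac{h(\nu,\lambda)}{h(\lambda,\nu)}\, \bigl(\psi_2^\dagger(\nu) + \psi_2(\nu)\bigr)\, d\nu.$$

This means that $\phi_2(\lambda)$ and $i\phi_1(\lambda)$ are Hermitian operators:

$$\phi_2^\dagger(\lambda) = \phi_2(\lambda), \qquad (i\phi_1(\lambda))^\dagger = i\phi_1(\lambda), \qquad \text{for } \operatorname{Im}\lambda = 0. \tag{B.8}$$

After diagonalization they will turn into real functions.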

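C. Reduction of Number of Dual Fields

We would like to reduce the number of dual fields in the determinant formula for the correlation function in (6.24). Here we shall show that

$$\phi_1(\lambda) = \phi_{D_1}(\lambda) - \phi_{A_2}(\lambda) \qquad \text{and} \qquad \phi_2(\lambda) = 0. \tag{C.1}$$

Recall the definition of the dual quantum fields (5.1) and (5.2),

$$\begin{aligned}
\phi_0(\lambda) &= q_0(\lambda) + p_0(\lambda); \\
\phi_{A_j}(\lambda) &= q_{A_j}(\lambda) + p_{D_j}(\lambda); & \phi_{D_j}(\lambda) &= q_{D_j}(\lambda) + p_{A_j}(\lambda); \\
\phi_1(\lambda) &= q_1(\lambda) + p_2(\lambda); & \phi_2(\lambda) &= q_2(\lambda) + p_1(\lambda);
\end{aligned}\tag{C.2}$$

$$\begin{aligned}
&[p_0(\lambda), q_0(\mu)] = \ln\bigl(h(\lambda,\mu)h(\mu,\lambda)\bigr); \\
&[p_{D_j}(\lambda), q_{D_k}(\mu)] = \delta_{jk} \ln h(\lambda,\mu); \qquad [p_{A_j}(\lambda), q_{A_k}(\mu)] = \delta_{jk} \ln h(\mu,\lambda); \\
&[p_1(\lambda), q_1(\mu)] = \ln\frac{h(\lambda,\mu)}{h(\mu,\lambda)}; \qquad [p_2(\lambda), q_2(\mu)] = \ln\frac{h(\mu,\lambda)}{h(\lambda,\mu)}.
\end{aligned}\tag{C.3}$$

Remember also the definition of the field $\psi(\lambda)$ (5.7),

$$\psi(\lambda) = \phi_0(\lambda) + \phi_{A_1}(\lambda) + \phi_{D_2}(\lambda) + \phi_2(\lambda).$$

Notice that $q_1(\lambda)$ and $p_2(\lambda)$ entering into $\phi_1(\lambda)$ do not commute only with $\psi(\mu)$:

$$[p_2(\lambda), \psi(\mu)] = \ln\frac{h(\mu,\lambda)}{h(\lambda,\mu)}, \qquad [\psi(\mu), q_1(\lambda)] = \ln\frac{h(\mu,\lambda)}{h(\lambda,\mu)}.$$

On the other hand $q_{D_1}(\lambda) - q_{A_2}(\lambda)$ and $p_{A_1}(\lambda) - p_{D_2}(\lambda)$ entering into $\phi_{D_1}(\lambda) - \phi_{A_2}(\lambda)$ also do not commute only with $\psi(\mu)$:

$$[(p_{A_1}(\lambda) - p_{D_2}(\lambda)), \psi(\mu)] = \ln\frac{h(\mu,\lambda)}{h(\lambda,\mu)}, \qquad [\psi(\mu), (q_{D_1}(\lambda) - q_{A_2}(\lambda))] = \ln\frac{h(\mu,\lambda)}{h(\lambda,\mu)},$$

so we can identify

$$\phi_1(\lambda) = \phi_{D_1}(\lambda) - \phi_{A_2}(\lambda). \tag{C.4}$$

Then we can also put $\phi_2(\lambda) = q_2(\lambda) + p_1(\lambda) = 0$ because after the replacement (C.4), operators $q_2(\lambda)$ and $p_1(\lambda)$ commute with everything. Such a replacement implies

$$\omega_+(\lambda) = e^{\frac12(\phi_{D_1}(\lambda) + \phi_{A_2}(\lambda))}\, Z(\lambda,\lambda). \tag{C.5}$$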
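The relation (C.5) can be verified in one line from the definitions. The sketch below uses only $h(\lambda,\lambda) = 1$, the definition (6.1) of $Z$, the definition (6.6) of $\omega_+$, the identification (C.4), and the fact that all dual fields commute, so the exponentials can be manipulated as ordinary functions.

```latex
\begin{aligned}
e^{\frac12(\phi_{D_1}(\lambda) + \phi_{A_2}(\lambda))}\, Z(\lambda,\lambda)
  &= e^{\frac12(\phi_{D_1}(\lambda) + \phi_{A_2}(\lambda))}
     \bigl( e^{-\phi_{D_1}(\lambda)} + e^{-\phi_{A_2}(\lambda)} \bigr) \\
  &= e^{-\frac12(\phi_{D_1}(\lambda) - \phi_{A_2}(\lambda))}
     + e^{\frac12(\phi_{D_1}(\lambda) - \phi_{A_2}(\lambda))} \\
  &= e^{-\phi_1(\lambda)/2} + e^{\phi_1(\lambda)/2} = \omega_+(\lambda).
\end{aligned}
```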

This means that the Fredholm determinant in (6.24) really depends on only three dual fields: $\psi(\lambda)$, $\phi_{A_2}(\lambda)$ and $\phi_{D_1}(\lambda)$. We shall use this fact in our next publications.

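D. Thermodynamics

The thermodynamics of the quantum nonlinear Schrödinger equation was described by C. N. Yang and C. P. Yang [13]. It involves few equations. The central equation is for an energy of excitation $\varepsilon(\lambda)$:

$$\varepsilon(\lambda) = \lambda^2 - h - \frac{T}{2\pi} \int_{-\infty}^{\infty} \frac{2c}{c^2 + (\lambda - \mu)^2} \ln\Bigl(1 + e^{-\frac{\varepsilon(\mu)}{T}}\Bigr)\, d\mu. \tag{D.1}$$

Other important functions are the local density (in momentum space) of particles $\rho_p(\lambda)$ and the total local density $\rho_t(\lambda)$ (it includes particles and holes). They satisfy equations:

$$2\pi\rho_t(\lambda) = 1 + \int_{-\infty}^{\infty} \frac{2c}{c^2 + (\lambda - \mu)^2}\, \rho_p(\mu)\, d\mu, \tag{D.2}$$

$$\frac{\rho_p(\lambda)}{\rho_t(\lambda)} = \Bigl(1 + e^{\frac{\varepsilon(\lambda)}{T}}\Bigr)^{-1} \equiv \vartheta(\lambda). \tag{D.3}$$

The global density $D = N/L$ can be represented as

$$D = \int_{-\infty}^{\infty} \rho_p(\lambda)\, d\lambda. \tag{D.4}$$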
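Equation (D.1) is a nonlinear integral equation that is commonly solved by fixed-point iteration. The sketch below is a minimal numerical illustration, not taken from the paper: it assumes a finite rapidity cutoff, a trapezoidal grid, and the starting point $\varepsilon_0(\lambda) = \lambda^2 - h$; the function name and all parameter values are hypothetical choices.

```python
import math

def solve_yang_yang(c, h, T, cutoff=8.0, n=81, tol=1e-12, max_iter=500):
    """Iterate eps -> lam^2 - h - (T/2pi) * int K(lam, mu) ln(1 + exp(-eps(mu)/T)) dmu."""
    step = 2.0 * cutoff / (n - 1)
    grid = [-cutoff + i * step for i in range(n)]
    weights = [step] * n
    weights[0] = weights[-1] = 0.5 * step            # trapezoidal rule
    kern = [[2.0 * c / (c * c + (x - y) ** 2) for y in grid] for x in grid]
    eps = [x * x - h for x in grid]                  # free starting point
    for _ in range(max_iter):
        logs = [math.log1p(math.exp(-e / T)) for e in eps]
        new_eps = [
            grid[i] ** 2 - h
            - T / (2.0 * math.pi) * sum(kern[i][j] * weights[j] * logs[j] for j in range(n))
            for i in range(n)
        ]
        change = max(abs(a - b) for a, b in zip(new_eps, eps))
        eps = new_eps
        if change < tol:
            break
    return grid, eps

if __name__ == "__main__":
    grid, eps = solve_yang_yang(c=1.0, h=1.0, T=1.0)
    middle = len(grid) // 2
    print(grid[middle], eps[middle])
```

Once $\varepsilon(\lambda)$ is known on the grid, the weight $\vartheta(\lambda)$ of (D.3) follows pointwise, and (D.2) becomes a linear equation for $\rho_t$ that can be discretized on the same grid.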

In order to obtain the determinant representation of the temperature correlation function we can use the following representation:

$$\langle\psi(0,0)\psi^\dagger(x,t)\rangle_T \equiv \frac{\operatorname{tr}\bigl(e^{-\frac{H}{T}}\psi(0,0)\psi^\dagger(x,t)\bigr)}{\operatorname{tr} e^{-\frac{H}{T}}} = \frac{\langle T|\psi(0,0)\psi^\dagger(x,t)|T\rangle}{\langle T|T\rangle}. \tag{D.5}$$

Here $|T\rangle$ is one of the eigenfunctions of the Hamiltonian, which is present in the state of thermo-equilibrium. It is proven in Sect. I.8 of [3] that the r.h.s. of (D.5) does not depend on the particular choice of $|T\rangle$. Now we have to recalculate the thermodynamic limit. First we shall return to the determinant representation of the correlation function in a finite volume, see formulæ (6.18)–(6.21). We should also notice that the thermodynamic limit of the square of the norm should be changed (compared to (6.22)) as follows:

$$\det\nolimits_N \frac{\partial\varphi_j}{\partial\mu_k} \to \prod_{a=1}^{N} 2\pi\rho_t(\mu_a) L\; \det\Bigl(\hat I - \frac{1}{2\pi}\hat K_T\Bigr), \tag{D.6}$$

see Sect. X.4 of [3]. Here $\mu_j$ correspond to $|T\rangle$. In the thermodynamic limit summation with respect to indices $j$ and $k$ in (6.18) will be replaced by integration $\rho_p(\mu_k)\, d\mu_k$. Also $\rho_L(\mu)$ defined in (6.12) goes to $\rho_t$:

$$\rho_L(\mu) \to \rho_t(\mu).$$

After dividing $\rho_p(\mu_k)\, d\mu_k$ by $\rho_t(\mu_k)$, which appears in the denominator of (6.19), we shall obtain an integration $\int_{-\infty}^{\infty} \vartheta(\lambda)\, d\lambda\,(\cdot)$ instead of $\int_{-q}^{q} d\lambda\,(\cdot)$. The details of these calculations are explained in Sect. XI.5 of [3].

The integral operator $\hat K_T$ can be defined by its kernel

$$K_T(\mu_1,\mu_2) = \frac{2c}{c^2 + (\mu_1 - \mu_2)^2}\, \sqrt{\vartheta(\mu_1)}\, \sqrt{\vartheta(\mu_2)}. \tag{D.7}$$

Now let us formulate the final formula for the representation of the temperature correlation function of local fields in the thermodynamic limit

$$\langle\psi(0,0)\psi^\dagger(x,t)\rangle_T = e^{-iht}\, (0| \left(G(x,t) + \frac{\partial}{\partial\alpha}\right) \cdot \frac{\det\bigl(\hat I + \frac{1}{2\pi}\hat V_{\alpha,T}\bigr)}{\det\bigl(\hat I - \frac{1}{2\pi}\hat K_T\bigr)} |0)\Bigr|_{\alpha=0}. \tag{D.8}$$

Here the integral operator $\hat V_{\alpha,T}$ is given by

$$\bigl(\hat V_{\alpha,T} f\bigr)(\lambda) = \int_{-\infty}^{\infty} \bigl(\hat V_T(\lambda,\mu) - \alpha\hat P_T(\mu)\hat P_T(\lambda)\bigr) f(\mu)\, d\mu. \tag{D.9}$$

Here the kernel $\hat V_T(\mu_1,\mu_2) - \alpha\hat P_T(\mu_1)\hat P_T(\mu_2)$ differs from the zero temperature case by the measure and limits of integration:

$$\hat V_T(\mu_1,\mu_2) - \alpha\hat P_T(\mu_1)\hat P_T(\mu_2) = \sqrt{\vartheta(\mu_1)}\, \bigl(\hat V(\mu_1,\mu_2) - \alpha\hat P(\mu_1)\hat P(\mu_2)\bigr)\, \sqrt{\vartheta(\mu_2)}. \tag{D.10}$$

It acts on the whole real axis $-\infty < \mu < \infty$. Here $\hat V(\mu_1,\mu_2)$ is given exactly by (6.26), $\hat P(\mu)$ is given by formula (6.27) and $G(x,t)$ is given by (6.23).

Acknowledgement. We wish to thank Professor Y. Matsuo for useful discussions and A. Waldron for assistance. This work is partly supported by the National Science Foundation (NSF) under Grant No. PHY9321165, the Japan Society for the Promotion of Science, the Russian Foundation of Basic Research under Grant No. 96-01-00344 and INTAS-01-166-ext.

References 1. Jimbo, M., Miwa, T., Mˆori, Y. and Sato, M.: Density martix of an inpenetrable Bose gas and the fith Painlev´e transcendent. Physica 1D 80–158 (1980) 2. Lenard, A.:One dimensional impenetrable bosons in thermal equilibrium. J. Math. Phys. 7, 1268–1272 (1966) 3. Korepin, V.E., Bogoliubov, N.M., Izergin, A.G.: Quantum Inverse Scattering Method and Correlation Functions. Cambridge: Cambridge University Press. 1993 4. Lieb, E.H.: Exact analysis of an interacting Bose gas II. The excitation spectrum. Phys. Rev. 130, 1616–1634 (1963) 5. Lieb, E.H., Liniger, W.: Exact analysis of an interacting Bose gas I. The general solution and the ground state. Phys. Rev. 130, 1605–1616 (1963) 6. Zakharov, V.E., Shabat, A.B.: Exact theory of two-dimensioanl self-focusing and one-dimensional selfmodulation of waves in nonlinear media. Sov. Phys. JETP 34, 62–69 (1972) 7. Faddeev, L.D., Sklyanin, E.K.: Quantum-mechanical approach to completely integrable field theory models. Sov. Phys. Dokl. 23 902–904 (1978)


8. Korepin, V.E.: Calculations of norms of Bethe wave functions. Commun. Math. Phys. 86, 391–418 (1982)
9. Izergin, A.G., Korepin, V.E., Reshetikhin, N.Yu.: Correlation functions in one-dimensional Bose gas. J. Phys. A20, 4799–4822 (1987)
10. Slavnov, N.A.: Calculation of scalar products of the wave functions and form factors in the framework of the Algebraic Bethe Ansatz. Theor. Math. Phys. 79, 502–509 (1989)
11. Korepin, V.E., Slavnov, N.A.: The time-dependent correlation function of an impenetrable Bose gas as a Fredholm minor. Commun. Math. Phys. 129, 103–113 (1990)
12. Korepin, V.E.: Dual field formulation of quantum integrable models. Commun. Math. Phys. 113, 177–190 (1987)
13. Yang, C.N., Yang, C.P.: Thermodynamics of a one dimensional system of bosons with repulsive delta-function interaction. J. Math. Phys. 10, 1115–1122 (1969)
14. Barouch, E., McCoy, B.M., Wu, T.T.: Zero-field susceptibility of the two-dimensional Ising model near $T_c$. Phys. Rev. Lett. 31, 1409 (1973)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 188, 691 – 708 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Some Propagation Properties of the Iwatsuka Model

Marius Măntoiu, Radu Purice

Institute of Mathematics of the Romanian Academy, P.O. Box 1–764, RO-70700 Bucharest, Romania

Received: 12 February 1997 / Accepted: 26 February 1997

Abstract: In this paper we study a two dimensional magnetic field Schrödinger Hamiltonian introduced in [7]. This model has some interesting propagation properties, as conjectured in [2], and at the same time is a special case of the class of analytically decomposable Hamiltonians [5]. Our aim is to start from a conjugate operator, intimately related to the band structure of the Hamiltonian, and to prove existence of an asymptotic velocity in one spatial direction and a theorem giving minimal and maximal velocity bounds for the propagation associated to the Hamiltonian. A simple example of this model, with a very simple conjugate operator, has been given in [9]. At the same time, by using the Virial Theorem, we obtain a generalisation of the hypothesis in [7].

1. Introduction

There are many quantum systems with Hamiltonians admitting a direct integral decomposition with the fibre Hamiltonian having discrete spectrum, its eigenvalues being nonconstant analytic functions. We shall refer to this structure as a “band” structure, although in most cases the analytic functions describing the different eigenvalues may intersect so that their images may overlap in a complicated way. Anyhow, for some specific models or for some domains of energies, the “band” structure may be simple enough to allow a detailed analysis. We can consider the “effective Hamiltonian” associated to one analytic function describing an eigenvalue and define a conjugate operator for it. Once we have done that, the general procedure of Sigal and Soffer [11] allows us to obtain some propagation estimates. If the “band” structure is not too complicated, then we can obtain propagation estimates for some physically interesting observables, leading to an asymptotic velocity and to minimal and maximal velocity bounds. The aim of this paper is to apply this procedure to a two dimensional model proposed in [7]. While working on this problem we learned about the paper [5] by C. Gérard and F. Nier making a general analysis of this kind of “analytically decomposable Hamiltonians” and defining conjugate operators for them. Anyhow, the model we are looking at is not considered in [5] and moreover, we obtain for this case some propagation estimates that support the heuristic argument given in [2] for the physical behaviour of this model.

Let us consider a two dimensional magnetic Hamiltonian acting in $L^2(\mathbb{R}^2)$:
\[ H := (-i\partial_x - b_x)^2 + (-i\partial_y - b_y)^2, \tag{1.1} \]

where $b_x, b_y : \mathbb{R}^2 \to \mathbb{R}$ are the components of the magnetic vector potential with corresponding magnetic field $B := \partial_x b_y - \partial_y b_x$. It is known that $H$ may present a wide class of spectral properties. If $B$ is constant and non-zero, there are only infinitely degenerate eigenvalues (the Landau levels). When $B$ grows at infinity, $H$ has compact resolvent, hence a purely discrete spectrum. Besides, even if $B$ decays at infinity, different situations may occur. For a short range $B$, the spectrum is purely absolutely continuous. On the other hand there is the well-known example of Miller and Simon: B(x, y) =

cr 2c + γ (1 + r) (1 + r)1+γ

with $c, \gamma > 0$, $r^2 = x^2 + y^2$. If $\gamma > 1$, $B$ is short-range, so that $H$ has no singular spectrum; but if $\gamma \in (0,1)$ there will be a dense pure point spectrum, and for $\gamma = 1$ we have a mobility edge: dense point spectrum in the interval $[0, c^2]$ and absolutely continuous spectrum filling the semiaxis $[c^2, \infty)$. We refer to [2] for the precise statements and for further discussion. In consequence, it is interesting to point out new situations when the spectral analysis of the magnetic Hamiltonian (1.1) is understood. One tentative case in this direction is in [7], which we briefly describe here. More details will be given in Sect. 2.

Let us consider a smooth two dimensional magnetic field that depends only on the first variable in $\mathbb{R}^2$; hence it is described by a function $B \in C^\infty(\mathbb{R};\mathbb{R})$. Such a magnetic field may be obtained with a vector potential of the form $b(x,y) := (0, \beta(x))$, where:
\[ \beta(x) = \int_0^x B(t)\,dt. \tag{1.2} \]

We assign to it a self-adjoint operator acting in $\mathcal{H} = L^2(\mathbb{R}^2)$, given by:
\[ H := -\partial_x^2 + (-i\partial_y - \beta(x))^2. \tag{1.3} \]

It is essentially self-adjoint on $C_0^\infty(\mathbb{R}^2)$ (see for instance [2]). We shall work systematically under the assumption:

Hypothesis 1.1. There are constants $M_\pm$ such that $0 < M_- \le B(x) \le M_+ < \infty$ for every $x \in \mathbb{R}$.

Taking advantage of the fact that $H$ commutes with the translations in the $y$ direction, it is easy to see that $H$ is unitarily equivalent to a direct integral in $L^2(\mathbb{R}_\xi; L^2(\mathbb{R}_x))$ of a family $\{\tilde H(\xi)\}_{\xi\in\mathbb{R}}$ of self-adjoint operators in $L^2(\mathbb{R}_x)$, of the form:
\[ \tilde H(\xi) := -\partial_x^2 + (\xi - \beta(x))^2, \tag{1.4} \]
having compact resolvent and depending analytically on $\xi$. The spectrum of $\tilde H(\xi)$ is thus discrete and consists of a sequence of non-degenerate eigenvalues:
\[ 0 < \lambda_0(\xi) < \lambda_1(\xi) < \dots \]
such that $\lim_{n\to\infty}\lambda_n(\xi) = \infty$. In order to show that $H$ is purely absolutely continuous, it is enough to prove that for every natural $n$ the eigenvalue $\lambda_n(\xi)$ is not a constant function of $\xi$ (see Theorem XIII.86 in [10]). Iwatsuka succeeds in doing this under any of the following hypotheses:

Hypothesis 1.2. Hypothesis 1.1 holds true and moreover one of the following conditions is also verified: lim sup B(x)
x→+∞

or

lim sup B(x)
x→−∞

Hypothesis 1.3. Hypothesis 1.1 holds true and there exists a constant $B_0 \in \mathbb{R}$ such that $B - B_0$ is not identically zero, has compact support and there is a point $\bar x \in \mathbb{R}$ such that $B'(x_-)B'(x_+) \le 0$ for any $x_- \le \bar x \le x_+$.

The proof of Iwatsuka basically makes use of the min-max principle. One aim of the present paper is to prove that the eigenvalues $\lambda_n(\xi)$ are not constant under an alternate assumption, namely:

Hypothesis 1.4. Hypothesis 1.1 holds true and moreover the function $B$ is not constant and there is a point $x_0 \in \mathbb{R}$ such that, for all points $x_1$ and $x_2$ with $x_1 \le x_0 \le x_2$, one has one of the two conditions below:
\[ B(x_1) \le B(x_0) \le B(x_2) \quad\text{or}\quad B(x_1) \ge B(x_0) \ge B(x_2). \]

The precise assertion will be formulated and proved in Sect. 3 by using a very simple argument relying on a version of the Virial Theorem. Notice that Hypothesis 1.4 is much weaker than 1.3. There is some overlap of Hypothesis 1.2 and 1.4. Actually, by coupling our Theorem 3.2 with Lemma 4.1 of [7], one may enlarge sensibly the range of assumptions under which λn (ξ) are not constant. Anyhow, we are far from proving the conjecture in [2], claiming that H must be purely absolutely continuous as soon as B is not constant. There is an extra feature that makes reference [2] interesting in our context. Namely, in Sect. 6.3 of [2], a heuristic argument is given suggesting the type of propagation expected in such situations as those described above. A simple model is chosen, with a magnetic field B taking only two distinct values B+ and B− respectively in the upper and the lower halfplane. Having in mind that a classical particle in a constant magnetic field moves along a circle with the radius inversely proportional to the intensity of the field, for suitable initial conditions the particle will go from one halfplane to the other changing the radius of its trajectory and by a cumulative effect it will practically propagate along the y axis towards infinity. Let us notice however that there are plenty of initial conditions for which the particle will remain forever in a halfplane of constant magnetic field and thus move on a fixed circle. This shows that the absolute continuity of the quantum Hamiltonian is due to quantum delocalization. Moreover, for a more complicated situation it is very difficult to predict the form of the classical trajectories. Motivated by these facts, in Sect. 5 we shall give estimations on the evolution group e−itH , valid under rather wide conditions imposed on the magnetic field, supporting the assertion that the quantum particle goes to infinity in the y direction. 
These estimations are of the form of minimal and maximal velocity bounds of the type given by Sigal and Soffer in a series of papers treating the N -body problem, [11] being the most convenient reference for us. Part of the proof of our Theorem 5.2 will consist of an adaptation to our


situation of Lemma 4.1 and Theorem 4.2 from [11], so that we shall almost skip it. Our problem will be to convert the usual estimation into one for physically interesting objects and in doing that we shall rely on the technical Lemma 2.5. Moreover, in Sect. 4 we prove the existence of an asymptotic velocity in the $y$ direction and a precise formula for it. For all these results we shall need the following hypothesis concerning the magnetic field:

Hypothesis 1.5. Hypothesis 1.1 holds true and moreover the following limits exist:
\[ B_\pm := \lim_{x\to\pm\infty} B(x). \]

2. Description of the Model

In this section we shall describe more closely the self-adjoint operator (1.3), referring to [7] for more details and proofs and proving some technical results needed in the following developments. Let us denote by $\mathcal{F}$ the Fourier transform on $\mathcal{S}(\mathbb{R})$, given by the formula:
\[ (\mathcal{F}\varphi)(\xi) \equiv \hat\varphi(\xi) := \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} e^{-i\xi y}\varphi(y)\,dy, \]
and at the same time its extension to a bijection on $\mathcal{S}'(\mathbb{R})$ given by duality: $(\mathcal{F}T)\varphi := T(\mathcal{F}\varphi)$ for any $T \in \mathcal{S}'(\mathbb{R})$ and any $\varphi \in \mathcal{S}(\mathbb{R})$. Let us denote by $J$ the operator of reflection in $\mathcal{S}(\mathbb{R})$: $(J\varphi)(x) := \varphi(-x)$, and also its extension to $\mathcal{S}'(\mathbb{R})$. Thus we have the identity: $\mathcal{F}^{-1} = J\mathcal{F}$.

Let us now describe some useful representations of the Hilbert space $\mathcal{H}$ associated to our model. In order to keep track of the variables we have in mind, we shall use them to index the sets $\mathbb{R}$ to which we associate them. As usual we consider $\mathcal{H}$ to be the spectral representation of the position operators so that:
\[ \mathcal{H} := L^2(\mathbb{R}_x \times \mathbb{R}_y) \cong L^2(\mathbb{R}_x) \otimes L^2(\mathbb{R}_y). \]
Considering the derivative $-i\partial_y$ with respect to the second variable in $\mathbb{R}_x \times \mathbb{R}_y$ and its spectral representation, we also define:
\[ \tilde{\mathcal{H}} := L^2(\mathbb{R}_x \times \mathbb{R}_\xi) \cong L^2(\mathbb{R}_x) \otimes L^2(\mathbb{R}_\xi), \]
so that the operator $1 \otimes \mathcal{F}$ can be considered as a unitary map from $\mathcal{H}$ to $\tilde{\mathcal{H}}$. In fact we shall use a slightly different interpretation, by denoting $\mathcal{K} := L^2(\mathbb{R}_x)$ and using the following canonical isomorphisms:
\[ \mathcal{H} \cong L^2(\mathbb{R}_y;\mathcal{K}); \qquad \tilde{\mathcal{H}} \cong L^2(\mathbb{R}_\xi;\mathcal{K}). \]
In this setting, we denote by $\hat{\mathcal{F}}$ the operator $1 \otimes \mathcal{F}$ acting on $\mathcal{K}$-valued $L^2$-functions and observe that it is the restriction to $L^2(\mathbb{R};\mathcal{K})$ of the Fourier transform on $\mathcal{S}'(\mathbb{R};\mathcal{K})$, the space of $\mathcal{K}$-valued tempered distributions, satisfying the relation:
\[ (\hat{\mathcal{F}}T)\varphi := T(\mathcal{F}\varphi) \in \mathcal{K} \tag{2.1} \]

for any $T \in \mathcal{S}'(\mathbb{R};\mathcal{K})$ and any $\varphi \in \mathcal{S}(\mathbb{R})$.

Let $H$ be the self-adjoint operator given by (1.3) and acting in $\mathcal{H}$, and let $\tilde H := \hat{\mathcal{F}} H \hat{\mathcal{F}}^{-1}$, considered as a self-adjoint operator acting in $\tilde{\mathcal{H}}$. Then for any $\xi \in \mathbb{R}$ we can define the self-adjoint operator $\tilde H(\xi)$, given by the formula (1.4) and acting in $\mathcal{K}$, such that $\tilde H$ is the direct integral of the family $\{\tilde H(\xi)\}_{\xi\in\mathbb{R}}$. Our Hypothesis 1.1 implies that $\lim_{|x|\to\infty}|\beta(x)| = \infty$, so that $\tilde H(\xi)$ has compact resolvent and hence a purely discrete spectrum:
\[ \sigma(\tilde H(\xi)) = \{\lambda_n(\xi)\}_{n\in\mathbb{N}} \]

with $\lambda_n(\xi) \le \lambda_{n+1}(\xi)$ for any $\xi \in \mathbb{R}$ and any $n \in \mathbb{N}$. In the sequel we shall constantly denote with an upper dot the derivative with respect to the variable $\xi$ (i.e. $\partial_\xi f \equiv \dot f$). The next two propositions gather a collection of technical results concerning the family $\{\tilde H(\xi)\}_{\xi\in\mathbb{R}}$, taken from Lemmas 2.3 and 2.4 in [7]:

Proposition 2.1. Under Hypothesis 1.1 we have:
1. $\{\tilde H(\xi)\}_{\xi\in\mathbb{R}}$ defines an analytic family of type (A) on a complex neighbourhood of the real axis (see Sect. XII.2 in [10]);
2. the eigenvalues $\lambda_n(\xi)$ are nondegenerate for any $\xi \in \mathbb{R}$;
3. the functions $\lambda_n : \mathbb{R} \to \mathbb{R}$ are analytic for any $n \in \mathbb{N}$;
4. if we denote by $\bar A$ the closure of the set $A$, we have the inclusions: $\overline{\lambda_n(\mathbb{R})} \subset (2n+1)\overline{B(\mathbb{R})} \subset [(2n+1)M_-, (2n+1)M_+]$;
5. for any $\xi \in \mathbb{R}$, we can choose an orthonormal basis $\{\Psi_n(\xi)\}_{n\in\mathbb{N}}$ in $\mathcal{K}$ such that $\tilde H(\xi)\Psi_n(\xi) = \lambda_n(\xi)\Psi_n(\xi)$, in such a way that for any $n \in \mathbb{N}$ the function $\Psi_n$ is infinitely differentiable in $x$ for fixed $\xi$ and analytic in $\xi$ as an element of $\mathcal{K}$.

Proposition 2.2. Under Hypothesis 1.5 the following limits exist and satisfy the equalities:
\[ \lim_{\xi\to\pm\infty}\lambda_n(\xi) = (2n+1)B_\pm. \]
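As a sanity check on these statements, the following sketch works out the constant-field case, which is not treated separately in the text; the symbol $B_c$ for the constant value of the field is introduced here only for illustration. It recovers the Landau levels and shows why a non-constant $B$ is needed for the bands to be non-trivial.

```latex
% Illustration only: constant field B(x) = B_c > 0 (B_c is a hypothetical symbol).
% Then M_- = M_+ = B_c and, by (1.2), \beta(x) = B_c x.
\begin{align*}
\tilde H(\xi) &= -\partial_x^2 + (\xi - B_c x)^2
               = -\partial_x^2 + B_c^2\Bigl(x - \frac{\xi}{B_c}\Bigr)^2 .
\end{align*}
% This is a harmonic oscillator with frequency B_c, centred at x = \xi / B_c,
% so its eigenvalues do not depend on \xi:
\begin{align*}
\lambda_n(\xi) &= (2n+1)B_c, \qquad n \in \mathbb{N},\ \xi \in \mathbb{R}.
\end{align*}
% Consistent with item 4 of Proposition 2.1 (the interval
% [(2n+1)M_-, (2n+1)M_+] collapses to a point) and with Proposition 2.2.
```

In this case every $\lambda_n$ is constant, so the criterion of Theorem XIII.86 in [10] does not apply, in agreement with the pure point Landau spectrum.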

Let us now make some comments concerning these results in order to fix the ideas for further developments. First let us consider the operator domains $D(\tilde H(\xi))$ for $\xi \in \mathbb{R}$. We evidently have the following relations on $C_0^\infty(\mathbb{R}_x)$:
\[ \tilde H(\xi) = \tilde H(0) - 2\xi\beta + \xi^2, \tag{2.2} \]
\[ \|\beta\varphi\|^2 = \langle\varphi, \beta^2\varphi\rangle \le \langle\varphi, \tilde H(0)\varphi\rangle \le \|\tilde H(0)\varphi\|^2 + \tfrac14\|\varphi\|^2, \tag{2.3} \]
so that for any $\xi \in \mathbb{R}$: $D(\tilde H(\xi)) = D(\tilde H(0)) \equiv \mathcal{D}$, and we shall consider $\mathcal{D}$ as a Hilbert space with the graph-scalar product associated to $\tilde H(0)$.

The second comment we make concerns the eigenprojections of the operators $\tilde H(\xi)$. For any $n \in \mathbb{N}$ and any $\xi \in \mathbb{R}$ let us choose a contour $\Gamma_n$ in the complex plane surrounding $\lambda_n(\xi)$ and no other point of $\sigma(\tilde H(\xi))$. Evidently, this property of $\Gamma_n$ stays true for any $\xi'$ in a small neighbourhood of $\xi$. Then for $P_n(\xi)$, the eigenprojection of $\tilde H(\xi)$ associated to $\lambda_n(\xi)$, we have the formula:
\[ P_n(\xi) = -\frac{1}{2\pi i}\oint_{\Gamma_n}(\tilde H(\xi) - z)^{-1}\,dz. \tag{2.4} \]


Lemma 2.3. Under Hypothesis 1.5, for any $n \in \mathbb{N}$ and any $p \in \mathbb{N}$ there exist constants $C_{n,p}$ such that:
\[ \sup_{\xi\in\mathbb{R}}\left\|\frac{d^p}{d\xi^p}P_n(\xi)\right\|_{B(\mathcal{K})} \le C_{n,p}. \]

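Proof. Let us start by observing that with respect to the strong topology of $B(\mathcal{K})$, when applied to functions from $C_0^\infty(\mathbb{R}_x)$, we have:
\[ \partial_\xi\tilde H(\xi) = 2(\xi - \beta(x)), \qquad \partial_\xi^2\tilde H(\xi) = 2, \]
and all the other derivatives are zero. Thus $\partial_\xi^2\tilde H(\xi)$ is a bounded operator on $\mathcal{K}$ uniformly in $\xi \in \mathbb{R}$, while $\partial_\xi\tilde H(\xi)$ can be extended to $\mathcal{D}$ and from (2.3) one has:
\[ (\partial_\xi\tilde H(\xi))^2 \le 4\tilde H(\xi). \]
On the other hand it is easy to deduce from Propositions 2.1 and 2.2 that for any $n \in \mathbb{N}$ there exists a constant $\delta_n > 0$ such that:
\[ \{z \mid |z - \lambda_n(\xi)| < 2\delta_n\} \cap \sigma(\tilde H(\xi)) = \{\lambda_n(\xi)\} \]
for any $\xi \in \mathbb{R}$. Thus in (2.4) we can take $\Gamma_n = \Gamma_n(\xi)$ a circle of radius $\delta_n$, uniformly in $\xi \in \mathbb{R}$. A simple but tedious computation shows that for any $p \in \mathbb{N}$, the $p$th derivative of $P_n$ with respect to $\xi$ is given by an integral along $\Gamma_n$ of a sum of products of three kinds of factors: $(\tilde H(\xi) - z)^{-1}$, $\partial_\xi\tilde H(\xi)(\tilde H(\xi) - z)^{-1}$, $\partial_\xi^2\tilde H(\xi)$. Due to our choice of $\Gamma_n(\xi)$, for any $\xi \in \mathbb{R}$ and for any $z \in \Gamma_n(\xi)$ we have:
\[ \|(\tilde H(\xi) - z)^{-1}\| \le \frac{1}{\delta_n}, \qquad \|\partial_\xi^2\tilde H(\xi)\| = 2, \]
\[ (\tilde H(\xi) - \bar z)^{-1}(\partial_\xi\tilde H(\xi))^2(\tilde H(\xi) - z)^{-1} \le 4\left\{(\tilde H(\xi) - \bar z)^{-1} + z(\tilde H(\xi) - \bar z)^{-1}(\tilde H(\xi) - z)^{-1}\right\}, \]
so that, taking into account that $\lambda_n(\xi)$ is bounded and thus $|z|$ is bounded on $\Gamma_n(\xi)$ for any $\xi \in \mathbb{R}$, we get:
\[ \left\|\partial_\xi\tilde H(\xi)(\tilde H(\xi) - z)^{-1}\right\|^2 \le \frac{4}{\delta_n}\left(1 + \frac{|z|}{\delta_n}\right) \le C_n. \]

Following Sect. II.4.2 in [8] we can define for each $n \in \mathbb{N}$ an analytic function $U_n(\xi)$ taking values in the set of unitary operators on $\mathcal{K}$ and such that: $P_n(\xi) = U_n(\xi)P_n(0)U_n(\xi)^{-1}$. Kato’s choice, that we shall also use, is to take for $U_n(\xi)$ the unique solution of the Cauchy problem:
\[ \dot U_n(\xi) = \left[\dot P_n(\xi), P_n(\xi)\right]U_n(\xi), \qquad U_n(0) = 1. \tag{2.5} \]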


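We can now make a precise choice of the eigenvectors $\Psi_n(\xi)$ for each $\xi \in \mathbb{R}$. We set: $\Psi_n(\xi) := U_n(\xi)\Psi_n(0)$ and observe that for any $\xi \in \mathbb{R}$, $P_n(\xi)\dot P_n(\xi)P_n(\xi) = 0$, so that for any $n \in \mathbb{N}$ and any $\xi \in \mathbb{R}$,
\[ \langle\Psi_n(\xi), \dot\Psi_n(\xi)\rangle = 0. \]
Let us now study the regularity properties of the functions $\lambda_n(\xi)$ and $\Psi_n(\xi)$.

Lemma 2.4. Under Hypothesis 1.5, we have for any $n \in \mathbb{N}$ the estimations:
1. $\sup_{\xi\in\mathbb{R}}\lambda_n(\xi) \le (2n+1)\|B\|_{L^\infty(\mathbb{R}_x)}$;
2. $|\dot\lambda_n(\xi)| \le 2\sqrt{\lambda_n(\xi)}$, $\forall\xi \in \mathbb{R}$;
3. $\forall p \in \mathbb{N}$: $\sup_{\xi\in\mathbb{R}}\left|\frac{d^p}{d\xi^p}\lambda_n(\xi)\right| \le c_{n,p}$;
4. $\forall p \in \mathbb{N}$: $\sup_{\xi\in\mathbb{R}}\left\|\frac{d^p}{d\xi^p}\Psi_n(\xi)\right\|_{\mathcal{K}} \le c_{n,p}$.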
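Item 2 can be checked by a short computation. The sketch below is one possible route, not necessarily the authors' argument; it assumes the Feynman–Hellmann type identity $\dot\lambda_n = \langle\Psi_n, \partial_\xi\tilde H\,\Psi_n\rangle$ (formula (3.1) of Sect. 3) and uses only $-\partial_x^2 \ge 0$ and $\|\Psi_n(\xi)\|_{\mathcal{K}} = 1$.

```latex
% Sketch: |\dot\lambda_n| \le 2\sqrt{\lambda_n}, assuming (3.1) and \|\Psi_n(\xi)\| = 1.
\begin{align*}
|\dot\lambda_n(\xi)|
  &= 2\,\bigl|\langle \Psi_n(\xi), (\xi-\beta)\Psi_n(\xi)\rangle\bigr|
   \le 2\,\|(\xi-\beta)\Psi_n(\xi)\|_{\mathcal K}
   && \text{(Cauchy--Schwarz)} \\
  &= 2\,\langle \Psi_n(\xi), (\xi-\beta)^2\Psi_n(\xi)\rangle^{1/2}
   \le 2\,\langle \Psi_n(\xi), \tilde H(\xi)\Psi_n(\xi)\rangle^{1/2}
   && \text{(since } -\partial_x^2 \ge 0\text{)} \\
  &= 2\sqrt{\lambda_n(\xi)} .
\end{align*}
```

Combined with item 1, this gives the uniform bound $|\dot\lambda_n(\xi)| \le 2\sqrt{(2n+1)\|B\|_{L^\infty}}$ that is used in Sect. 4.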

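Proof. Let us observe that $\lambda_n(\xi) = \|\lambda_n(\xi)P_n(0)\|$ and we have:
\[ \lambda_n(\xi)P_n(0) = P_n(0)U_n(\xi)^{-1}\tilde H(\xi)U_n(\xi)P_n(0) = U_n(\xi)^{-1}P_n(\xi)\tilde H(\xi)P_n(\xi)U_n(\xi). \]
For the derivatives of $\lambda_n(\xi)$ one uses this equality and the differential equation satisfied by $U_n(\xi)$ and by $U_n(\xi)^*$; in order to estimate the terms that appear one uses Lemma 2.3 for the derivatives of the eigenprojections and an argument similar to that in the end of the proof of the same lemma for the term $P_n(\xi)\partial_\xi\tilde H(\xi)P_n(\xi)$. Similar procedures can be used to prove the estimations for the eigenvectors too.

For any $\xi \in \mathbb{R}$ we have the direct sum decomposition of $\mathcal{K}$ associated to the orthonormal basis $\{\Psi_n(\xi)\}_{n\in\mathbb{N}}$:
\[ \mathcal{K} = \bigoplus_{n\in\mathbb{N}}\mathbb{C}\Psi_n(\xi). \]
Let us now consider an element $f \in \tilde{\mathcal{H}} = L^2(\mathbb{R}_\xi;\mathcal{K})$ and define:
\[ f_n(\xi) := \int_{\mathbb{R}}\overline{\Psi_n(\xi,x)}\,f(\xi,x)\,dx = \langle\Psi_n(\xi), f(\xi)\rangle_{\mathcal{K}} \]
so that:
\[ f = \sum_{n\in\mathbb{N}} f_n\Psi_n, \]
where for each fixed $\xi \in \mathbb{R}$ the series converges in $\mathcal{K}$. Let us observe that:
\[ \int_{\mathbb{R}}\overline{f_n(\xi)}f_n(\xi)\,d\xi = \int_{\mathbb{R}}\langle\Psi_n(\xi), f(\xi)\rangle_{\mathcal{K}}\langle f(\xi), \Psi_n(\xi)\rangle_{\mathcal{K}}\,d\xi \le \int_{\mathbb{R}}\|f(\xi)\|_{\mathcal{K}}^2\|\Psi_n(\xi)\|_{\mathcal{K}}^2\,d\xi = \int_{\mathbb{R}}\int_{\mathbb{R}}|f(\xi,x)|^2\,dx\,d\xi = \|f\|_{\tilde{\mathcal{H}}}^2 \]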

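so that $f_n \in L^2(\mathbb{R}_\xi)$. Moreover:
\[ \sum_{n\le N}\int_{\mathbb{R}}\overline{f_n(\xi)}f_n(\xi)\,d\xi = \int_{\mathbb{R}}\sum_{n\le N}\langle\Psi_n(\xi), f(\xi)\rangle_{\mathcal{K}}\langle f(\xi), \Psi_n(\xi)\rangle_{\mathcal{K}}\,d\xi, \]
and using the Lebesgue dominated convergence theorem one proves that this series converges for $N \to \infty$ and has the limit:
\[ \sum_{n\in\mathbb{N}}\int_{\mathbb{R}}\overline{f_n(\xi)}f_n(\xi)\,d\xi = \int_{\mathbb{R}}\|f(\xi)\|_{\mathcal{K}}^2\,d\xi = \|f\|_{\tilde{\mathcal{H}}}^2. \]
Thus, if we denote: $\tilde{\mathcal{H}}_n := \{f\Psi_n \mid f \in L^2(\mathbb{R}_\xi)\}$, we have that $\tilde{\mathcal{H}}_n \subset \tilde{\mathcal{H}}$ for any $n \in \mathbb{N}$ and moreover:
\[ \tilde{\mathcal{H}} = \bigoplus_{n\in\mathbb{N}}\tilde{\mathcal{H}}_n. \]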

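We can easily see that each $\tilde{\mathcal{H}}_n$ is invariant under $\tilde H$ and:
\[ \tilde H(f_n\Psi_n) = (\lambda_n f_n)\Psi_n, \tag{2.6} \]
so that $\tilde H$ is bounded by $(2n+1)\|B\|_{L^\infty(\mathbb{R}_x)}$ on each $\tilde{\mathcal{H}}_n$ and $\tilde H = \bigoplus_{n\in\mathbb{N}}\tilde H_n$, with $\tilde H_n$ the restriction of $\tilde H$ to $\tilde{\mathcal{H}}_n$ given by the above formula.

Let us now denote: $\mathcal{H}_n := \hat{\mathcal{F}}^{-1}\tilde{\mathcal{H}}_n$, $H_n := \hat{\mathcal{F}}^{-1}\tilde H_n\hat{\mathcal{F}}$, so that we have $\mathcal{H} = \bigoplus_{n\in\mathbb{N}}\mathcal{H}_n$ and $H = \bigoplus_{n\in\mathbb{N}}H_n$. For any function $\rho \in L^\infty(\mathbb{R}_\xi)$ let $\rho$ also denote the operator on $L^2(\mathbb{R}_y)$ given by:
\[ \rho(\mathcal{F}^{-1}(u)) := \mathcal{F}^{-1}((\rho u)). \tag{2.7} \]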

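The elements of $\mathcal{H}_n$ are of the form $\hat{\mathcal{F}}^{-1}(g_n\Psi_n)$ and we shall now develop a little bit the study of their structure. Let us start by recalling that $\Psi_n \in L^\infty(\mathbb{R};\mathcal{K}) \cap C^\infty(\mathbb{R};\mathcal{K})$, so that it may be considered to be a tempered $\mathcal{K}$-valued distribution, i.e. for any $\chi \in \mathcal{S}(\mathbb{R})$ we can define:
\[ \Psi_n(\chi) := \int_{\mathbb{R}}\Psi_n(\xi)\chi(\xi)\,d\xi \]
as an element of $\mathcal{K}$, and there exist a constant $c$ and a seminorm $|||\cdot|||$ on $\mathcal{S}(\mathbb{R})$ such that: $\|\Psi_n(\chi)\|_{\mathcal{K}} \le c\,|||\chi|||$. We observe that in fact we have the relation: $\|\Psi_n(\chi)\|_{\mathcal{K}} \le \|\Psi_n\|_{\infty\mathcal{K}}\|\chi\|_{L^1}$, where:
\[ \|\Psi_n\|_{\infty\mathcal{K}} := \sup_{\xi\in\mathbb{R}}\|\Psi_n(\xi)\|_{\mathcal{K}}. \]
Now let us consider the extension of the Fourier transform $\hat{\mathcal{F}}$ to $\mathcal{S}'(\mathbb{R};\mathcal{K})$, as explained before (2.1). Thus we can define $\hat{\mathcal{F}}^{-1}(\Psi_n) \equiv \Phi_n \in \mathcal{S}'(\mathbb{R}_y;\mathcal{K})$. But for any $\chi \in \mathcal{S}(\mathbb{R})$:
\[ \|\Phi_n(\chi)\|_{\mathcal{K}} = \|\Psi_n(\mathcal{F}^{-1}\chi)\|_{\mathcal{K}} \le \|\Psi_n\|_{\infty\mathcal{K}}\|\mathcal{F}^{-1}\chi\|_{L^1}. \]
Thus we have extended $\Phi_n$ to a bounded linear application from $\mathcal{F}L^1(\mathbb{R})$ to $\mathcal{K}$. In order to study the element $\hat{\mathcal{F}}^{-1}(g_n\Psi_n)$ for $g_n \in L^2(\mathbb{R})$, let us define the convolution $\eta * \Phi_n$ for any $\eta \in C_0^\infty(\mathbb{R})$ and let us denote by $\tilde\eta$ the function defined by $\tilde\eta(x) := \eta(-x)$. Then, for any $\chi \in \mathcal{S}(\mathbb{R})$, we have:
\[ (\eta * \Phi_n)(\chi) = \Phi_n(\tilde\eta * \chi). \]
Using the Hölder inequality and the Plancherel theorem we get:
\[ \|(\eta * \Phi_n)(\chi)\|_{\mathcal{K}} \le \|\Psi_n\|_{\infty\mathcal{K}}\|\eta\|_{L^2}\|\chi\|_{L^2}. \]
We conclude that $\eta * \Phi_n \in L^2(\mathbb{R};\mathcal{K})$ and its $L^2$-norm is bounded by $\|\Psi_n\|_{\infty\mathcal{K}}\|\eta\|_{L^2}$. Approximating $\hat{\mathcal{F}}^{-1}g_n$ in $L^2$-norm with elements from $C_0^\infty(\mathbb{R})$, we get:
\[ \hat{\mathcal{F}}^{-1}(g_n\Psi_n) = (\hat{\mathcal{F}}^{-1}g_n) * \Phi_n, \]
where for any $\chi \in \mathcal{S}(\mathbb{R})$ we have the following relation allowing for explicit calculations:
\[ \left[\hat{\mathcal{F}}^{-1}(g_n\Psi_n)\right](\chi) = \left[(\hat{\mathcal{F}}^{-1}g_n) * \Phi_n\right](\chi) = \Phi_n((\hat{\mathcal{F}}g_n) * \chi). \]
This relation allows one to work formally with the usual formula for convolutions, having in mind the above interpretation. In connection with the above analysis, one can easily prove that:
\[ \langle f_n * \Phi_n, g_n * \Phi_n\rangle_{\mathcal{H}} = \langle f_n, g_n\rangle_{L^2(\mathbb{R}_y)} \tag{2.8} \]
and with the above notations:
\[ H_n(f_n * \Phi_n) = (\lambda_n f_n) * \Phi_n. \tag{2.9} \]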

Let us now briefly describe the spectrum of $H$. Since each $H_n$ is unitarily equivalent with the operator of multiplication by the analytic function $\lambda_n$, its spectrum is $\overline{\lambda_n(\mathbb{R})}$. In consequence the spectrum of $H$ is a union of “bands”:
\[ \sigma(H) = \bigcup_{n\in\mathbb{N}}\overline{\lambda_n(\mathbb{R})}. \]
We remark that these bands may overlap. If $B$ is constant, the $n$th band consists of the single point $(2n+1)B$. If $B$ is strictly monotone (increasing for instance), the formula (3.4) shows that the bands surely overlap at least for large $n$.

In order to obtain our propagation estimate for the $y$ variable we shall need a technical result relating the operator of multiplication with $y$ in $\mathcal{H}$ with some operator associated to the above decomposition of $\mathcal{H}$. Let us denote by $q_y$ the operator of multiplication by the variable in $L^2(\mathbb{R}_y)$ and by $Q_y$ the operator of multiplication with the second variable in $\mathcal{H} = L^2(\mathbb{R}_x \times \mathbb{R}_y)$. Related to $q_y$ there is an interesting operator defined on suitable elements from $\mathcal{H}_n$:
\[ Y(f_n * \Phi_n) := (q_y f_n) * \Phi_n. \]
Of course, it may be extended to an (unbounded) operator in $\mathcal{H}$. Remark that $Q_y$ is the second component of the position operator. Because of the simple manner (2.9) in which $H_n$ acts on $\mathcal{H}_n$, it is quite easy to get estimates on $e^{-itH_n}$ in which the operator $Y$ appears. Since the physically relevant object is associated to the operator $Q_y$, we need a way to turn these estimations into estimations in terms of $Q_y$. The next lemma states that what we lose when going from functions of $Y/t$ to functions of $Q_y/t$ is asymptotically small in $t$. We shall denote by $H^1$ the Sobolev space of order 1 on $\mathbb{R}$ and we shall use the following norm on it:
\[ \|f\|_{H^1} := \left\{\|f\|_{L^2}^2 + \|\partial_y f\|_{L^2}^2\right\}^{1/2}. \tag{2.10} \]
From now on we shall frequently use the same symbol $C$ for different unimportant constants appearing in estimations.


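Lemma 2.5. Let $L : \mathbb{R} \to \mathbb{R}$ be a $C^2$ function such that $L$ and its first and second derivatives are bounded on $\mathbb{R}$. Then there is a constant $C_n$ such that for any $f \in H^1$ one has:
\[ \left\|\left[L\!\left(\frac{Q_y}{t}\right) - L\!\left(\frac{Y}{t}\right)\right](f * \Phi_n)\right\|_{\mathcal{H}} \le \frac{C_n}{t}\|f\|_{H^1}. \]

Proof. One must show that for any $\chi \in \mathcal{S}(\mathbb{R})$:
\[ \left\|\left\{\left[L\!\left(\frac{Q_y}{t}\right) - L\!\left(\frac{Y}{t}\right)\right](f * \Phi_n)\right\}(\chi)\right\|_{\mathcal{K}} \le \frac{C_n}{t}\|f\|_{H^1}\|\chi\|_{L^2}. \]
As explained above, we have:
\[ \left\{\left[L\!\left(\frac{Q_y}{t}\right) - L\!\left(\frac{Y}{t}\right)\right](f * \Phi_n)\right\}(\chi) = (f * \Phi_n)\!\left(L\!\left(\frac{q_y}{t}\right)\chi\right) - \left(\left[L\!\left(\frac{q_y}{t}\right)f\right] * \Phi_n\right)(\chi) = \Phi_n\!\left(\tilde f * \left[L\!\left(\frac{q_y}{t}\right)\chi\right]\right) - \Phi_n\!\left(\left\{L\!\left(\frac{q_y}{t}\right)f\right\}^{\sim} * \chi\right). \]
But
\[ \left(\tilde f * \left[L\!\left(\frac{q_y}{t}\right)\chi\right] - \left\{L\!\left(\frac{q_y}{t}\right)f\right\}^{\sim} * \chi\right)(y) = \int_{\mathbb{R}}\left[L\!\left(\frac{y'}{t}\right) - L\!\left(\frac{y'-y}{t}\right)\right]f(y'-y)\chi(y')\,dy' = \frac{y}{t}\int_{\mathbb{R}}\int_0^1 L'\!\left(\frac{y'-sy}{t}\right)ds\,f(y'-y)\chi(y')\,dy'. \]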
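The last equality is the fundamental theorem of calculus applied along the segment joining $(y'-y)/t$ to $y'/t$. The following lines spell out this step; the auxiliary parameter $\sigma$ is introduced here only for the computation.

```latex
% Sketch of the mean-value step; \sigma is an auxiliary parameter (not in the text).
\begin{align*}
L\Bigl(\frac{y'}{t}\Bigr) - L\Bigl(\frac{y'-y}{t}\Bigr)
  &= \int_0^1 \frac{d}{d\sigma}\,
       L\Bigl(\frac{y'-y+\sigma y}{t}\Bigr)\,d\sigma
   = \frac{y}{t}\int_0^1 L'\Bigl(\frac{y'-(1-\sigma)y}{t}\Bigr)\,d\sigma \\
  &= \frac{y}{t}\int_0^1 L'\Bigl(\frac{y'-sy}{t}\Bigr)\,ds
   \qquad (s = 1-\sigma).
\end{align*}
% The prefactor y/t is the source of the 1/t decay in Lemma 2.5; the
% unbounded factor y is what the weight (1+y^2)^{-1} in the kernel m_t
% has to control.
```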

First, let us treat this expression in order to prove that it defines an element from $\mathcal{F}^{-1}L^1(\mathbb{R})$ and apply the procedure presented above. For doing this, let us define the kernel:
\[ m_t(y,y') := \frac{f(y'-y)}{1+y^2}\int_0^1 L'\!\left(\frac{y'-sy}{t}\right)ds \]
that evidently defines an integral operator $M_t$ on $L^2(\mathbb{R})$ with a bounded Hilbert–Schmidt norm, uniformly in $t \in \mathbb{R}$:
\[ \|m_t\|_{L^2(\mathbb{R}^2)}^2 \le C\|L'\|_{L^\infty}^2\|f\|_{L^2}^2. \]
Let us also observe that the derivative of $m_t$ with respect to the first variable still defines the kernel of an integral operator $M_t'$ on $L^2(\mathbb{R})$ with bounded Hilbert–Schmidt norm, uniformly in $t \in \mathbb{R}$:
\[ \|\partial_y m_t\|_{L^2(\mathbb{R}^2)}^2 \le C\left\{\|L'\|_{L^\infty}^2 + \|L''\|_{L^\infty}^2\right\}\|f\|_{H^1}^2. \]
In conclusion:

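\[ \left\|(\mathcal{F}^{-1}M_t)(\chi)\right\|_{L^1} = \int_{\mathbb{R}}\left|\left[(\mathcal{F}^{-1}M_t)(\chi)\right](\xi)\right|d\xi \le \left\{\int_{\mathbb{R}}\frac{d\xi}{1+\xi^2}\right\}^{1/2}\left\{\int_{\mathbb{R}}(1+\xi^2)\left|(\mathcal{F}^{-1}M_t)(\chi)(\xi)\right|^2d\xi\right\}^{1/2} \]
\[ \le C\left\{\int_{\mathbb{R}}\left[\left|(\mathcal{F}^{-1}M_t)(\chi)(\xi)\right|^2 + \left|(\mathcal{F}^{-1}M_t')(\chi)(\xi)\right|^2\right]d\xi\right\}^{1/2} \le C\left\{\|L'\|_{L^\infty} + \|L''\|_{L^\infty}\right\}\|f\|_{H^1}\|\chi\|_{L^2}. \]
Secondly, let us remark that in order to make finite the Hilbert–Schmidt norm of the kernel $m_t$ we need an extra factor $(1+y^2)^{-1}$ that we have to compensate; thus we have:
\[ \left\{\left[L\!\left(\frac{Q_y}{t}\right) - L\!\left(\frac{Y}{t}\right)\right](f * \Phi_n)\right\}(\chi) = \frac{1}{t}\left[\left(Q_y + Q_y^3\right)\hat{\mathcal{F}}^{-1}\Psi_n\right]\{M_t\chi\} = \frac{1}{t}\left[\partial_\xi\Psi_n + \partial_\xi^3\Psi_n\right]\left(\mathcal{F}^{-1}M_t(\chi)\right). \]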

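One can easily verify that due to point 4 in Lemma 2.4, the derivatives of $\Psi_n$ satisfy the same conditions as $\Psi_n$, so that we can apply to them the same analysis and thus they define bounded applications from $\hat{\mathcal{F}}^{-1}L^1(\mathbb{R})$ to $\mathcal{K}$. In conclusion, we have proved that:
\[ \left\|\left\{\left[L\!\left(\frac{Q_y}{t}\right) - L\!\left(\frac{Y}{t}\right)\right](f * \Phi_n)\right\}(\chi)\right\|_{\mathcal{K}} \le \frac{C}{t}\left\{\|L'\|_{L^\infty} + \|L''\|_{L^\infty}\right\}\|f\|_{H^1}\|\chi\|_{L^2} \le \frac{C_n}{t}\|f\|_{H^1}\|\chi\|_{L^2}. \]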

3. Absence of the Singular Spectrum

As we said in the introduction, in order to prove that the operator $H$ has no singular spectrum it is enough to prove that the eigenvalues $\lambda_n(\xi)$ of $\tilde H(\xi)$ are not constant. Starting with the formula:
\[ \lambda_n(\xi) = \langle\Psi_n(\xi), \tilde H(\xi)\Psi_n(\xi)\rangle, \]
it comes out that we only need to show that
\[ \partial_\xi\lambda_n(\xi) = \langle\Psi_n(\xi), \partial_\xi\tilde H(\xi)\Psi_n(\xi)\rangle = 2\int_{\mathbb{R}}(\xi - \beta(x))\,|\Psi_n(x,\xi)|^2\,dx \tag{3.1} \]

is not identically zero. To do this, we use the following form of the Virial Theorem (see [1]):

Lemma 3.1. Let $K$, $A$ be two self-adjoint operators in the Hilbert space $\mathcal{K}$ and $\mathcal{M}$ a core for $K$ such that:
1. $e^{itA}\mathcal{M} \subset \mathcal{M}$ for all $t \in \mathbb{R}$;
2. $\sup_{|t|\le 1}\|Ke^{itA}u\| = c(u) < \infty$ for all $u \in \mathcal{M}$;
3. the derivative
\[ \frac{d}{dt}\,e^{-itA}Ke^{itA}u\Big|_{t=0} \equiv i[K,A]u \]
exists weakly in $\mathcal{K}$ for every $u \in \mathcal{M}$ and satisfies: $\|i[K,A]u\|^2 \le a\left\{\|u\|^2 + \|Ku\|^2\right\}$, where $a$ is independent of $u$.

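Then we have: $\langle v, i[K,A]v\rangle = 0$ for every $v \in D(K)$ (the domain of $K$) such that $Kv = \lambda v$ for a real $\lambda$.

Then we can prove:

Theorem 3.2. If $B$ satisfies Hypothesis 1.4, then $\lambda_n$ is not constant for all $n \in \mathbb{N}$, so that $H$ has purely absolutely continuous spectrum.

Proof. Let us take in Lemma 3.1: $\mathcal{K} = L^2(\mathbb{R}_x)$, $K = \tilde H(\xi)$, $A = p_x := -i\partial_x$ and $\mathcal{M} = C_0^\infty(\mathbb{R})$. Then point (1) is obviously satisfied. From the following formula, valid on $C_0^\infty(\mathbb{R})$:
\[ e^{-itp_x}\tilde H(\xi)e^{itp_x} = -\partial_x^2 + (\xi - \beta(\cdot - t))^2, \]
point (2) follows immediately. The existence of the weak derivative appearing in point (3) also follows, given by the relation:
\[ i\left[\tilde H(\xi), p_x\right]u = -2\beta'(\xi - \beta)u. \]
Then:
\[ \left\|i\left[\tilde H(\xi), p_x\right]u\right\|_{\mathcal{K}}^2 \le 4\|B\|_{L^\infty}^2\|(\xi - \beta)u\|_{\mathcal{K}}^2 \le 4\|B\|_{L^\infty}^2\langle u, \tilde H(\xi)u\rangle_{\mathcal{K}} \le 2\|B\|_{L^\infty}^2\left\{\|u\|_{\mathcal{K}}^2 + \|\tilde H(\xi)u\|_{\mathcal{K}}^2\right\} \]
so that point (3) is completely proved. In consequence, one has:
\[ \left\langle\Psi_n(\xi), i\left[\tilde H(\xi), p_x\right]\Psi_n(\xi)\right\rangle_{\mathcal{K}} = -2\langle\Psi_n(\xi), B(\xi - \beta)\Psi_n(\xi)\rangle_{\mathcal{K}} = 0. \tag{3.2} \]
Combining now (3.1) with (3.2) we get for any $B_0 \in \mathbb{R}\setminus\{0\}$:
\[ \partial_\xi\lambda_n(\xi) = \frac{2}{B_0}\int_{\mathbb{R}}(B(x) - B_0)(\beta(x) - \xi)\,|\Psi_n(x,\xi)|^2\,dx. \tag{3.3} \]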
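The passage from (3.1) and (3.2) to (3.3) is a one-line algebraic manipulation; the following sketch writes it out, abbreviating $|\Psi_n(x,\xi)|^2\,dx$ by $d\mu$ (a shorthand introduced here only for readability).

```latex
% Sketch: (3.3) from (3.1) and (3.2); d\mu := |\Psi_n(x,\xi)|^2 dx is a local shorthand.
\begin{align*}
\partial_\xi\lambda_n(\xi)
  &= 2\int_{\mathbb R}(\xi-\beta)\,d\mu
     \;-\; \frac{2}{B_0}\underbrace{\int_{\mathbb R} B\,(\xi-\beta)\,d\mu}_{=\,0
       \text{ by } (3.2)} \\
  &= \frac{2}{B_0}\int_{\mathbb R}(B_0 - B)(\xi-\beta)\,d\mu
   = \frac{2}{B_0}\int_{\mathbb R}\bigl(B(x)-B_0\bigr)\bigl(\beta(x)-\xi\bigr)\,d\mu .
\end{align*}
```

The freedom in choosing $B_0$ is what allows the integrand to be given a definite sign in the next step.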

A simple variant of the unique continuation principle now implies that $|\Psi_n(x,\xi)|$ cannot vanish on any open and nonvoid subset of $\mathbb{R}$. Choose $x_0$ to be the point appearing in Hypothesis 1.4 and set $B_0 := B(x_0)$, $\xi_0 := \beta(x_0)$. Taking into account the monotonicity of $\beta$, the expression $(B(x) - B_0)(\beta(x) - \xi_0)$ is nonnegative in the first case of Hypothesis 1.4 (and nonpositive in the second), and it is not identically zero because $B$ is not constant. We can then conclude that $\partial_\xi\lambda_n(\xi_0) > 0$ (respectively $< 0$), so that $\lambda_n$ is not constant.

Remark 3.3. Let us take a look at a simple particular case. Assume $B$ strictly monotone (increasing for example); being bounded, it will have limits $B^\pm = \lim_{x\to\pm\infty}B(x)$. Hypothesis 1.4 is true for any $x_0 \in \mathbb{R}$, so that $\partial_\xi\lambda_n(\xi) > 0$ for any $\xi \in \mathbb{R}$, i.e. $\lambda_n$ is a strictly increasing function. Then Proposition 2.2 implies that:
\[ \lambda_n(\mathbb{R}) = \left((2n+1)B^-, (2n+1)B^+\right). \tag{3.4} \]


4. Asymptotic Velocity

From now on we shall always suppose that for any $n \in \mathbb{N}$ the function $\lambda_n(\xi)$ is not constant and that Hypothesis 1.5 holds true. With respect to the non-constancy of the functions $\lambda_n(\xi)$, let us remark that in the preceding section we have discussed an explicit case when this happens; for more aspects concerning this condition one can look in [7], as mentioned in the Introduction of our paper. Moreover, by Proposition 2.2 the following limits exist:
\[ \lambda_n^\pm := \lim_{\xi\to\pm\infty}\lambda_n(\xi) = (2n+1)B^\pm, \]
so that if $B^+ \ne B^-$ then the functions $\lambda_n(\xi)$ are surely not constant.

Let us start by recalling that proving existence of an asymptotic velocity for a quantum system whose evolution is described by a unitary group $e^{-itH}$ with $H$ self-adjoint in $L^2(\mathbb{R}^n)$ means to show the existence of the limits:
\[ \lim_{t\to\pm\infty}e^{itH}\,\frac{Q}{t}\,e^{-itH} \]
in a suitable sense, where $Q$ is some component of the position observable (see [3, 4]). In our specific situation, the relation (2.9) suggests that we look first at the existence of an asymptotic velocity for convolution operators as (2.7). Such results exist ([1, 6]) and we shall make use of Appendix 7.C from [1], weakened suitably to fit our framework:

Lemma 4.1. Let $G : \mathbb{R} \to \mathbb{R}$ be a bounded Borel function. Then one has the following relation in $L^2(\mathbb{R})$:
\[ \operatorname*{s-lim}_{|t|\to\infty}\,e^{it\lambda_n}G(q_y/t)e^{-it\lambda_n} = G(\dot\lambda_n). \tag{4.1} \]
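The mechanism behind (4.1) can be seen from a formal computation in the $\xi$-representation. The sketch below is only heuristic and is not the proof from [1]; it assumes the Fourier convention of Sect. 2, under which $q_y$ acts as $i\partial_\xi$, and a sufficiently regular function $g$ of $\xi$ (a symbol introduced here for illustration).

```latex
% Heuristic sketch: in the \xi-representation q_y = i\partial_\xi and
% \lambda_n acts by multiplication with \lambda_n(\xi); g is a test function.
\begin{align*}
e^{it\lambda_n}\,\frac{q_y}{t}\,e^{-it\lambda_n} g
  &= \frac{1}{t}\,e^{it\lambda_n(\xi)}\, i\partial_\xi
     \bigl(e^{-it\lambda_n(\xi)} g(\xi)\bigr)
   = \dot\lambda_n(\xi)\, g(\xi) + \frac{1}{t}\, i\partial_\xi g(\xi) .
\end{align*}
% Hence e^{it\lambda_n} (q_y/t) e^{-it\lambda_n} = \dot\lambda_n + q_y/t,
% and the second term tends to zero on the domain of q_y as |t| \to \infty.
```

Passing from this identity to bounded Borel functions $G$ is the content of the cited appendix.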

We denote by $G(\dot H_n)$ the bounded operator acting on $\mathcal{H}_n$ and given by the formula:
\[ G(\dot H_n)(f * \Phi_n) := \left[G(\dot\lambda_n)f\right] * \Phi_n, \]
where $G(\dot\lambda_n)$, appearing already in (4.1), is defined by: $\mathcal{F}\left[G(\dot\lambda_n)f\right] := G(\dot\lambda_n)(\mathcal{F}f)$. The next result says that the role of the asymptotic velocity in the direction $y$, for the dynamics generated by $H$, is played by the operator $\dot H := \bigoplus_n\dot H_n$. We remark that this operator is intimately connected with the “band” structure of $H$ and has a very complicated form in the initial representation $L^2(\mathbb{R}_x \times \mathbb{R}_y)$.

Theorem 4.2. Let $G \in C^2(\mathbb{R};\mathbb{R})$ be bounded and have bounded derivatives of first and second order; then one has the following relation in $\mathcal{H}$:
\[ \operatorname*{s-lim}_{|t|\to\infty}\,e^{itH}G(Q_y/t)e^{-itH} = G(\dot H). \tag{4.2} \]

Proof. From (4.1) and (2.8) we get:
\[ \operatorname*{s-lim}_{|t|\to\infty}\,e^{itH_n}G(Y/t)e^{-itH_n} = G(\dot H_n), \tag{4.3} \]
from which we conclude by using Lemma 2.5 that:
\[ \operatorname*{s-lim}_{|t|\to\infty}\,e^{itH_n}G(Q_y/t)e^{-itH_n} = G(\dot H_n). \tag{4.4} \]
By summing over a finite set of indices $n$, we get for any $\varphi$ belonging to the algebraic direct sum of the Hilbert spaces $\mathcal{H}_n$ that:
\[ \lim_{|t|\to\infty}e^{itH}G(Q_y/t)e^{-itH}\varphi = G(\dot H)\varphi. \tag{4.5} \]

Since:
\[ \left\|e^{itH}G(Q_y/t)e^{-itH}\right\|_{B(\mathcal{H})} \le \|G\|_{L^\infty}, \]
relation (4.5) can be extended to $\mathcal{H}$.

An interesting situation occurs when the right-hand side in (4.2) is zero. This happens when $G$ is supported outside the spectrum of $\dot H$. Due to the direct sum decomposition of $\dot H$ and the form of each $\dot H_n$, one can prove the following statement:

Corollary 4.3. Assume that $G : \mathbb{R} \to \mathbb{R}$ is a $C^2$ function that is bounded and has bounded derivatives of first and second order and satisfies the condition:
\[ \operatorname{supp}G \cap \overline{\bigcup_{n\in\mathbb{N}}\dot\lambda_n(\mathbb{R})} = \emptyset. \]
Then:
\[ \operatorname*{s-lim}_{|t|\to\infty}\,e^{itH}G(Q_y/t)e^{-itH} = 0. \tag{4.6} \]

Unfortunately, the hypothesis of the above corollary is implicit as soon as there are no good evaluations for the sets $\dot\lambda_n(\mathbb{R})$. Due to points (1) and (2) of Lemma 2.4, we get (using also Hypothesis 1.1):
\[ -2\sqrt{(2n+1)M_+} \le \dot\lambda_n(\xi) \le 2\sqrt{(2n+1)M_+}, \tag{4.7} \]
estimates that are unfortunately useless for our corollary because the union of all the intervals appearing in the above relation is the whole real axis. Anyhow, these estimations may become useful if we localize in energy. Let us choose a real interval $I$ contained in the union of the first $N$ bands of the form $\lambda_n(\mathbb{R})$. Then, denoting by $E$ the spectral measure associated to $H$, any vector $\varphi \in E(I)\mathcal{H}$ may be written as:
\[ \varphi = \sum_{n=0}^{N}f_n * \Phi_n, \]
and we have:
\[ G(\dot H)\varphi = \sum_{n=0}^{N}\left[G(\dot\lambda_n)f_n\right] * \Phi_n. \]
This expression is zero if:
\[ \operatorname{supp}G \cap \left[-2\sqrt{(2N+1)M_+},\ 2\sqrt{(2N+1)M_+}\right] = \emptyset. \]
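For the reader's convenience, here is a sketch of how the bound (4.7) and the vanishing condition fit together; it only chains statements already made in the text (Lemma 2.4, Hypothesis 1.1) and adds no new estimate.

```latex
% Sketch: chain of inequalities behind (4.7) and the energy-localized bound.
\begin{align*}
|\dot\lambda_n(\xi)|
  \;\le\; 2\sqrt{\lambda_n(\xi)}
  \;\le\; 2\sqrt{(2n+1)\,\|B\|_{L^\infty(\mathbb R_x)}}
  \;\le\; 2\sqrt{(2n+1)\,M_+} .
\end{align*}
% For 0 \le n \le N the right-hand side is at most 2\sqrt{(2N+1)M_+}, so
\begin{align*}
\dot\lambda_n(\mathbb R) \subset
  \Bigl[-2\sqrt{(2N+1)M_+},\; 2\sqrt{(2N+1)M_+}\Bigr],
  \qquad 0 \le n \le N .
\end{align*}
% If supp G avoids this interval, each G(\dot\lambda_n) with n \le N is the
% zero multiplication operator, hence G(\dot H)\varphi = 0 for \varphi \in E(I)\mathcal H.
```

Without the energy cut-off the intervals grow like $\sqrt{n}$ and exhaust $\mathbb{R}$, which is why (4.7) alone says nothing in Corollary 4.3.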

The formula (4.6) in the case of the situation just described might be interpreted as a maximal velocity bound. It is less precise than the results of the next section (especially when estimation (4.7) is far from being optimal), but it has the advantage of being true even when the interval I contains “critical energies” belonging to the set τ defined at the beginning of Sect. 5. However, there is an interesting global result that may be stated in the monotonic case:

Corollary 4.4. Assume that $B$ satisfies Hypothesis 1.1 and is strictly increasing, and that $G$ is as in the hypothesis of Theorem 4.2 and also satisfies the conditions: $G(x) = 1$ if $x < -\varepsilon$ and $G(x) = 0$ if $x \ge 0$, for a strictly positive $\varepsilon$. Then the equality (4.6) holds.

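5. Minimal and Maximal Velocity Bounds

We begin with some notations:

$$c(\lambda_n) := \{ \lambda \in \mathbb{R} \mid \exists\, \xi \in \mathbb{R} \text{ such that } \lambda_n(\xi) = \lambda \text{ and } \dot\lambda_n(\xi) = 0 \},$$

$$\tau(\lambda_n) := c(\lambda_n) \cup \{ \lambda_n^-, \lambda_n^+ \}, \qquad \tau := \bigcup_{n\in\mathbb{N}} \tau(\lambda_n),$$

and call $c(\lambda_n)$ the set of “critical values” of the function $\lambda_n$. Let us fix an open interval J = (a, b), whose closure is disjoint from τ. Remark that the endpoints of the intervals $\lambda_n(\mathbb{R})$ are in $\tau(\lambda_n)$ and in consequence, for a given n, either $J \subset \lambda_n(\mathbb{R})$ or $\bar J \cap \lambda_n(\mathbb{R}) = \emptyset$. Then we denote:

$$N(J) := \{ n \in \mathbb{N} \mid J \subset \lambda_n(\mathbb{R}) \}.$$

Assuming J included in the spectrum of H, N(J) will be a non-void finite set. We also use the notations:

$$\rho_n := \inf\{ |\dot\lambda_n(\xi)| \mid \lambda_n(\xi) \in J \}, \qquad \theta_n := \sup\{ |\dot\lambda_n(\xi)| \mid \lambda_n(\xi) \in J \},$$

$$\rho := \min_{n\in N(J)} \rho_n, \qquad \theta := \max_{n\in N(J)} \theta_n.$$

By our assumptions on J, one has: 0 < ρ < θ < ∞. For any $f \in \mathcal{S}(\mathbb{R})$ we set:

$$a_n^{\circ} f := \tfrac{1}{2}\big[ \dot\lambda_n (q_y f) + q_y (\dot\lambda_n f) \big]$$

and observe that it defines an essentially self-adjoint operator (using also point (3) from Lemma 2.4), whose closure we shall denote by $a_n = \tfrac{1}{2}\big( \dot\lambda_n q_y + q_y \dot\lambda_n \big)$. It is easy to see that:

$$i[\lambda_n, a_n] = \dot\lambda_n^2, \qquad \rho_n^2\, \phi(\lambda_n)^2 \le \phi(\lambda_n)\, i[\lambda_n, a_n]\, \phi(\lambda_n) \le \theta_n^2\, \phi(\lambda_n)^2 \tag{5.1}$$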

for any φ ∈ C0∞ (R) with support included in J. The first inequality is a “Mourre estimate” for λn with respect to the conjugate operator an .


Remark 5.1. Setting $A := \bigoplus_{n\in\mathbb{N}} a_n$, one proves easily that:

$$\rho^2 \phi(H)^2 \le \phi(H)\, i[H, A]\, \phi(H) \le \theta^2 \phi(H)^2.$$

In particular, H satisfies a “Mourre estimate” with respect to A and this fact has useful consequences, for example in the spectral analysis and scattering theory for some classes of perturbations of H. But (5.1) will suffice for our purpose, that is to prove the next theorem.

Theorem 5.2. Let Hypothesis 1.5 be satisfied and let J be an interval of the real axis chosen as above. For any function $F \in C^\infty(\mathbb{R}; \mathbb{R}_+)$ with support disjoint from the interval [ρ, θ] and for any function $\phi \in C_0^\infty(\mathbb{R}; \mathbb{R}_+)$ with support included in J, there is a finite constant C such that for any $f \in \mathcal{H}$:

$$\int_1^\infty \frac{dt}{t}\, \big\| F(|Q_y|/t)\, e^{-itH} \phi(H) f \big\|_{\mathcal{H}}^2 \le C\, \| \phi(H) f \|_{\mathcal{H}}^2. \tag{5.2}$$

Proof. The proof consists of several steps. The first two of them are straightforward adaptations of the proofs of Lemma 4.1 and Theorem 4.2 from [11], so that we shall give only some brief comments.

Step 1. Let G : R → [0, ∞) be a smooth and compactly supported function with support disjoint from the interval $[\rho_n^2, \theta_n^2]$ and let φ be as in the statement of the theorem; then there is a constant $C_n$ such that for any $f \in L^2(\mathbb{R}_y)$ one has:

$$\int_1^\infty \frac{dt}{t}\, \big\| G(a_n/t)\, e^{-it\lambda_n} \phi(\lambda_n) f \big\|_{L^2(\mathbb{R})}^2 \le C_n\, \| \phi(\lambda_n) f \|_{L^2(\mathbb{R})}^2. \tag{5.3}$$

Basic to the proof of this statement is (5.1). We also use the boundedness of $\dot\lambda_n$ and $\ddot\lambda_n$, contained in point 3 of Lemma 2.4 and needed for some commutator expansions. Notice that in Lemma 4.1 in [11] it was assumed that the support of G is contained in $(-\infty, \rho_n^2)$, and the proof was first done for G with small support and then extended by a covering argument. In order to avoid these problems we start with a “propagation observable” of the following type: we decompose $G^2 = G_-^2 + G_+^2$, where $G_-^2$ is supported in $(-\infty, \rho_n^2)$ and $G_+^2$ in $(\theta_n^2, \infty)$, define $\tilde G := G_-^2 - G_+^2$ and take the observable $K(a_n/t)$ with $K' = \tilde G$.

Step 2. With the same assumptions on G and φ as in Step 1 one has:

$$\int_1^\infty \frac{dt}{t}\, \big\| G(q_y^2/t^2)\, e^{-it\lambda_n} \phi(\lambda_n) f \big\|_{L^2(\mathbb{R})}^2 \le C_n\, \| \phi(\lambda_n) f \|_{L^2(\mathbb{R})}^2 \tag{5.4}$$

for some finite constant $C_n$ independent of f. In order to prove this assertion one uses the “propagation observable” $K(q_y^2/t^2)$, with the same function K as in Step 1. The proof of (5.4) relies on (5.3), on the formula $i[\lambda_n, q_y^2] = 2a_n$ and again on the boundedness of the derivatives of $\lambda_n$. The explicit form of the operator $a_n/t$ is also used by observing that $\dot\lambda_n$ is bounded and $q_y/t$ behaves well if multiplied by compactly supported functions of $q_y^2/t^2$.

Step 3. In (5.4) one can replace $G(q_y^2/t^2)$ with $F(|q_y|/t)$, with F as in the hypothesis of the theorem. Using (2.8) the obtained formula is turned into:

$$\int_1^\infty \frac{dt}{t}\, \big\| F(|Y|/t)\, e^{-itH_n} \phi(H_n)(f * \Phi_n) \big\|_{\mathcal{H}}^2 \le C_n\, \| \phi(H_n)(f * \Phi_n) \|_{\mathcal{H}}^2. \tag{5.5}$$

Now we use Lemma 2.5 with L(·) = F(|·|). We get:

$$\int_1^\infty \frac{dt}{t}\, \big\| F(|Q_y|/t)\, e^{-itH_n} \phi(H_n)(f * \Phi_n) \big\|_{\mathcal{H}}^2 \le C_n\, \| \phi(\lambda_n) f \|_{H^1}^2. \tag{5.6}$$

But in computing the $H^1$-norm we observe that:

$$\big\| (\phi(\lambda_n) f)' \big\|_{L^2} = \big\| \xi\, \phi(\lambda_n) \hat f \big\|_{L^2} \le \sup\{ |\xi| \mid \xi \in \operatorname{supp}(\phi \circ \lambda_n) \}\, \| \phi(\lambda_n) f \|_{L^2}.$$

Since the support of φ is contained in J and $\bar J \cap \tau(\lambda_n) = \emptyset$, we easily see that $\phi \circ \lambda_n$ has compact support, so that (5.6) becomes:

$$\int_1^\infty \frac{dt}{t}\, \big\| F(|Q_y|/t)\, e^{-itH_n} \phi(H_n)(f * \Phi_n) \big\|_{\mathcal{H}}^2 \le C_n\, \| \phi(H_n)(f * \Phi_n) \|_{\mathcal{H}}^2. \tag{5.7}$$

Step 4. By summing the inequalities (5.7) for all indices n ∈ N(J), one obtains (5.2).

Remark 5.3. Let us set $M_t(J) := \{ (x, y) \in \mathbb{R}^2 \mid |y| \in [\rho t, \theta t] \}$. The estimate (5.2) means (in a certain weak sense) that at energies belonging to the interval J, the evolution is concentrated in $M_t(J)$, asymptotically for t → +∞. Of course, a similar result is true for t → −∞.

Remark 5.4. Suppose that B is strictly increasing; then:

$$\tau = \bigcup_{n\in\mathbb{N}} \{ \lambda_n^-, \lambda_n^+ \},$$

and for $\lambda_n$ one may use the simpler conjugate operator $q_y$. By a similar (but somewhat simpler) argument one gets (5.2) with $F(|Q_y|/t)$ replaced by $F(Q_y/t)$, i.e. the propagation also has a definite sense, not only a direction (one may compare this result with Corollary 4.4). This particular case is treated in [9].

References

1. Amrein, W.O., Boutet de Monvel, Anne, Georgescu, V.: C0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians. Birkhäuser-Verlag, 1996
2. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators with Applications to Quantum Mechanics and Global Geometry. Berlin-Heidelberg-New York: Springer-Verlag, 1987
3. Enss, V.: Asymptotic observables on scattering states. Commun. Math. Phys. 89, 245–268 (1983)
4. Derezinsky, J.: Algebraic approach to the N-body long range scattering. Rev. Math. Phys. 3, 1–62 (1991)
5. Gérard, Ch., Nier, F.: Théorie de la diffusion pour les opérateurs analytiquement décomposables. Preprint Ecole Polytechnique Palaiseau, no. 1134, 1996
6. Hörmander, L.: The Analysis of Linear Partial Differential Operators. Vol. I, Berlin: Springer, 1983
7. Iwatsuka, A.: Examples of absolutely continuous Schrödinger operators in magnetic fields. Publ. RIMS, Kyoto Univ. 21, 385–401 (1985)
8. Kato, T.: Perturbation Theory for Linear Operators. Berlin-Heidelberg-New York: Springer-Verlag, 1966
9. Măntoiu, M., Purice, R.: Propagation in a magnetic field model by Iwatsuka. Proceedings of the Conference on Partial Differential Equations, Potsdam 1996, eds. Demuth, M. and Schulze, W., Berlin: Akademie-Verlag, 1997
10. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. Vol. IV, New York: Academic Press, 1975
11. Sigal, I.M., Soffer, A.: Long-range many-body scattering. Asymptotic clustering for Coulomb-type potentials. Invent. Math. 99, 115–143 (1990)

Communicated by B. Simon

Commun. Math. Phys. 188, 709 – 721 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

On Nodal Sets for Dirac and Laplace Operators

Christian Bär*

Mathematisches Institut, Universität Freiburg, Eckerstraße 1, D-79104 Freiburg, Germany. E-mail: [email protected]

* Partially supported by SFB 256 and by the GADGET program of the EU

Received: 28 October 1996 / Accepted: 3 March 1997

Abstract: We prove that the nodal set (zero set) of a solution of a generalized Dirac equation on a Riemannian manifold has codimension 2 at least. If the underlying manifold is a surface, then the nodal set is discrete. We obtain a quick proof of the fact that the nodal set of an eigenfunction for the Laplace-Beltrami operator on a Riemannian manifold consists of a smooth hypersurface and a singular set of lower dimension. We also see that the nodal set of a Δ-harmonic differential form on a closed manifold has codimension 2 at least; a fact which is not true if the manifold is not closed. Examples show that all bounds are optimal.

1. Introduction

The motion of a vibrating membrane M fixed at the boundary is described by a function u : M × R → R satisfying the wave equation

$$\frac{\partial^2 u}{\partial t^2} + \Delta u = 0$$

and Dirichlet boundary conditions $u|_{\partial M} = 0$. One can expand u into a series

$$u(x, t) = \sum_{j=0}^{\infty} \big( a_j \sin(\sqrt{\lambda_j}\, t) + b_j \cos(\sqrt{\lambda_j}\, t) \big)\, \phi_j(x),$$

where $\phi_j$ are the Δ-eigenfunctions on M for the eigenvalue $\lambda_j$. A “pure sound” is given by $u(x, t) = (a_j \sin(\sqrt{\lambda_j}\, t) + b_j \cos(\sqrt{\lambda_j}\, t))\, \phi_j(x)$ for some fixed j. The zero set of $\phi_j$ describes those points of the membrane which do not move during the vibration. They can be made visible by putting fine powder on the membrane.

The structure of the zero set of eigenfunctions of the Laplace-Beltrami operator on surfaces is well understood. They consist of smooth arcs, called nodal lines, and isolated singular points where these arcs meet. One knows that the arcs meeting at a singular point form an equiangular configuration, see [1] or [7, Satz 1] for a proof. One also has lower and upper bounds for the length of the nodal lines [7, Satz 2], [17, Theorem 1.2, Corollary 1.3], [13, Theorem 4.2].

Such a precise understanding of nodal sets seems difficult in higher dimensions. But one has the following regularity result. If φ is an eigenfunction of the Laplace-Beltrami operator on an n-dimensional Riemannian manifold, then the nodal set of φ consists of a smooth hypersurface and a singular part of dimension ≤ n − 2. Essentially this fact has been stated as Theorem 2.2 in [12]. Once it is established the standard proof of Courant’s nodal domain theorem in dimension 2 carries over to higher dimensions. A nodal domain is a connected component of the complement of the nodal set. Courant’s nodal domain theorem states that the number of nodal domains of the ith eigenfunction of Δ is less than or equal to i. The point is that one has to use a Green formula over a nodal domain and this requires some regularity of its boundary.

It was pointed out by Y. Colin de Verdière that the proof of Theorem 2.2 in [12] has a serious gap, see [3, App. E]. Therefore Bérard and Meyer modified the proof of Courant’s nodal domain theorem. They approximate the nodal domains by regular domains [3, App. D]. The above regularity statement on Laplace-eigenfunctions has (to our knowledge) first been proved by R. Hardt and L. Simon in [18, Theorem 1.10], see also [8] for a special case. We will obtain a quick proof of this fact once our theorem on the nodal set of solutions of Dirac equations is established (Corollary 2).

One has estimates for the (n−1)-dimensional Hausdorff measure of the nodal set, see [14, Theorem 4.2], [15, Theorem 1.2], [16], and [18, Theorem 5.3], some of which only work if the Riemannian metric is assumed to be real analytic. There are also estimates for the volume of the nodal domains, see [21, Theorem B], [10, Theorem 2].
Not much is known about the topology of the singular set. The problem is that in dimension n ≥ 3 the nodal set need not be locally homeomorphic to its tangent cone. This precisely was the problem in [12]. For a structural result in dimension n = 3 see [11, Theorem 1.2].

The purpose of this paper is to study the nodal set of solutions of certain systems of elliptic linear partial differential equations of first order, generalized Dirac equations. Precise definitions will be given in the next section. To get some feeling of what to expect let us first look at the trivial one-dimensional case. A Laplace equation is then nothing but a second order linear ordinary differential equation. The standard theory tells us that zeros of solutions must be isolated. A Dirac equation becomes a first order linear ordinary differential equation and we see that nontrivial solutions have no zeros at all.

Another interesting test case is provided by holomorphic functions on Riemann surfaces. The Cauchy-Riemann equations are special generalized Dirac equations. As is well-known, zeros of solutions must form a discrete set. In higher dimensions, a holomorphic function defines a complex subvariety having real codimension 2. This, as well as other special cases, leads us towards the conjecture that the nodal set of a solution of a generalized Dirac equation on an n-dimensional manifold has dimension ≤ n − 2. The main result of this paper says that this conjecture is indeed true. Examples show that this bound is optimal. In the two-dimensional case we have a similar unique-continuation theorem as we have for holomorphic functions (Corollary 3). If the nodal set of a solution of a generalized Dirac equation on a connected surface has an accumulation point, then this solution must vanish identically.

There is a physical interpretation in quantum mechanics similar to the vibrating membrane for the Laplace equation. The “wave function” of a fermion (e.g. an electron) is given by a spinor field, in the simplest case by a function $\psi : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{C}^2$, satisfying the equation

$$i \frac{\partial}{\partial t} \psi + (D + h)\psi = 0,$$

where D is the Dirac operator on $\mathbb{R}^3$ and h is a potential. The particle is in “pure state” if ψ has the form $\psi(x, t) = e^{i\lambda t}\, \Psi(x)$ where Ψ is an eigenfunction of D + h for the eigenvalue λ. The scalar function $|\psi(x, t)|^2 = |\Psi(x)|^2$ can be interpreted as the probability measure for the particle to be found at the point x. Hence the nodal set of Ψ is the set of points where this probability measure is zero. Our main result then says that this “exclusive set” is not very big, it is at most one-dimensional.

We believe that the nodal set of solutions of Dirac equations very often carries important information about the underlying manifold. In the special case of holomorphic functions on complex manifolds this is classical. Recently, Taubes has given an impressive example in the theory of 4-dimensional symplectic manifolds. Starting from solutions $\psi_r$ of a one-parameter family of Dirac equations (deformed first Seiberg-Witten equation), r ∈ R, he constructs pseudoholomorphic curves. Philosophically, these curves are given by the zero locus of $\psi_\infty$ (which is 2-dimensional!). See [23, 24], and [19] for a very readable survey.

It is nice that information about solutions of Dirac equations yields also information about solutions of Laplace equations. It was mentioned earlier that we will obtain a quick proof of the fact that the nodal set of a Δ-eigenfunction is the union of a smooth hypersurface and a singular part of dimension ≤ n − 2. We will also see that the nodal set of a Δ-harmonic differential form on a closed manifold has codimension 2 at least (Corollary 1). This is surprising since it is not true if the underlying manifold is not closed nor does it hold for other Δ-eigenforms even if the manifold is closed.

The paper is organized as follows. In the next section we give precise definitions and we collect a few well-known facts about Dirac operators for later use. We then formulate the main result.
In the third section we give the most important examples for generalized Dirac operators and for some of these we draw conclusions from our main theorem. In Sect. 4 we give the proof. We employ tools similar to those used in complex algebraic geometry when one studies the topological structure of complex algebraic varieties. We use an analog of Weierstrass’ preparation theorem for differentiable functions due to Malgrange to write a solution of a Dirac equation in a certain normal form. This theorem is important e.g. in catastrophe theory. To ensure that we can apply this theorem we have to use Aronszajn’s unique continuation theorem. We easily conclude that the nodal set has dimension not bigger than n − 1. The difficult part is to show that simultaneous vanishing of several components of our section must reduce the dimension of its zero set once more. This requires a careful investigation of certain resultants and this is where we really use the Dirac equation.


2. Statement of Result

Let M be an n-dimensional Riemannian manifold, let S be a Riemannian or Hermitian vector bundle over M on which the Clifford bundle Cl(TM) acts from the left. This means that at every point p ∈ M there is a linear map $T_pM \otimes S_p \to S_p$, $v \otimes s \mapsto v \cdot s$, satisfying the relations

$$v \cdot w \cdot s + w \cdot v \cdot s = -2\langle v, w\rangle s \qquad\text{and}\qquad \langle v \cdot s_1, s_2\rangle = -\langle s_1, v \cdot s_2\rangle.$$

Moreover, let ∇ be a metric connection on S satisfying the Leibniz rule for Clifford multiplication

$$\nabla(v \cdot s) = (\nabla v) \cdot s + v \cdot \nabla s.$$

Such an S is called a Dirac bundle. All geometric data such as metrics and Clifford multiplication are assumed to be $C^\infty$-smooth. The (generalized) Dirac operator acts on sections of S and is defined by

$$Ds = \sum_{i=1}^{n} e_i \cdot \nabla_{e_i} s,$$

where $e_1, \dots, e_n$ denote a local orthonormal frame of TM. The Dirac operator is easily seen to be independent of the choice of orthonormal frame. It is a first order formally self-adjoint elliptic differential operator, see [4] or [20] for details.

Three general facts about Dirac operators will be used later on and should be stated here for completeness. The proofs are simple computations.

1. If s is a differentiable section of S and f a differentiable function on M, then the following Leibniz rule holds [20, p. 116, Lemma 5.5]:

$$D(f \cdot s) = f \cdot Ds + \nabla f \cdot s, \tag{1}$$

where ∇f is the gradient of f.

2. The connection ∇ maps sections of S into those of $T^*M \otimes S$. Denote the formal adjoint of this operator by $\nabla^*$. Then we have the following Weitzenböck formula [20, p. 155, Theorem 8.2]:

$$D^2 = \nabla^*\nabla + \mathfrak{R}, \tag{2}$$

where $\mathfrak{R}$ is an endomorphism field given by curvature. Operators of the form $\nabla^*\nabla$ + endomorphism field are called generalized Laplacians.

3. If M has smooth boundary ∂M with outer unit normal field ν, if $s_1$ and $s_2$ are compactly supported $C^1$-sections of S, then the following Green formula holds [20, p. 115, Eq. (5.7)]:

$$(Ds_1, s_2)_{L^2(M)} - (s_1, Ds_2)_{L^2(M)} = \int_{\partial M} \langle \nu \cdot s_1, s_2\rangle. \tag{3}$$

If s is an eigensection of D, i.e. Ds = λs, λ ∈ R, or, more generally, s satisfies (D + h)s = 0 for some endomorphism field h of S, then the zero locus of s, {x ∈ M | s(x) = 0}, is called the nodal set of s. The main purpose of this paper is to study the structure of such nodal sets.


Main Theorem. Let M be a connected n-dimensional Riemannian manifold with Dirac bundle S and generalized Dirac operator D. Let h be a smooth endomorphism field for S and let $s \not\equiv 0$ be a solution of (D + h)s = 0. Then the nodal set of s is a countably (n − 2)-rectifiable set and thus has Hausdorff dimension n − 2 at most. If n = 2, then the nodal set of s is a discrete subset of M.

Recall that a subset of an n-dimensional Riemannian manifold M is called countably k-rectifiable if it can be written as a countable union of sets of the form Φ(X), where $X \subset \mathbb{R}^k$ is bounded and Φ : X → M is a Lipschitz map. The proof will be given in the fourth section.

3. Examples and Consequences

Let us look at the most important examples and draw some conclusions.

Example 1. The Clifford algebra bundle Cl(TM) acts on itself by Clifford multiplication. As a vector bundle Cl(TM) can be canonically identified with the exterior form bundle $\Lambda^*(TM)$. We thus obtain a real Dirac bundle $S = \Lambda^*(TM)$ with Levi-Civita connection ∇. The Dirac operator is D = d + δ, where d denotes exterior differentiation and δ codifferentiation.

Example 2. Let M be a spin manifold. Then there exists the spinor bundle S, a complex Dirac bundle of rank $2^{[n/2]}$. The connection ∇ is induced by the Levi-Civita connection. The corresponding Dirac operator D is the classical Dirac operator.

Example 3. Let M be a spin$^c$ manifold. Again, there exists the complex spinor bundle S of rank $2^{[n/2]}$. The connection ∇ depends on the Levi-Civita connection and the choice of a connection on a certain U(1)-bundle, the determinant bundle.

Example 4. Let M be an almost complex manifold. Then M is canonically spin$^c$ and the spinor bundle of Example 3 can be identified with the bundle of mixed (0, p)-forms, $S = \Lambda^{0,*}(TM \otimes \mathbb{C})$. The Dirac operator is given by $D = \sqrt{2} \cdot (\bar\partial + \bar\partial^*) + h$ where h is a zero order term which vanishes if M is Kähler.

Example 5. Let S be a Dirac bundle over M, and let E be a Riemannian or Hermitian bundle over M with metric connection. Then S ⊗ E canonically becomes a Dirac bundle and the corresponding Dirac operator is called twisted Dirac operator with coefficients in E.

Let us see what the theorem tells us when applied to the example of differential forms. First of all, we see that the bound in the theorem is optimal. Namely, let F be a surface of higher genus, let $\omega_1$ be a nontrivial closed and coclosed 1-form on F. In other words, $\omega_1$ is a solution of $(d + \delta)\omega_1 = 0$. The nodal set of $\omega_1$ consists of isolated points and it is nonempty since the Euler number of F is nonzero. Let $T^{n-2}$ be a flat (n − 2)-torus and let $\omega_2$ be a parallel k-form on $T^{n-2}$. Put $M = F \times T^{n-2}$ and denote the projections onto the factors by $\pi_1 : M \to F$ and $\pi_2 : M \to T^{n-2}$. Then $\omega = \pi_1^*\omega_1 \wedge \pi_2^*\omega_2$ is a closed and coclosed (k + 1)-form on M. Its nodal set is a disjoint union of copies of $T^{n-2}$, and therefore has codimension 2 in M.


Corollary 1. Let M be a compact connected Riemannian manifold without boundary. Let Δ = dδ + δd be the Laplace-Beltrami operator acting on k-forms. Let $\omega \not\equiv 0$ be a harmonic k-form, i.e. Δω = 0. Then the nodal set of ω is a countably (n − 2)-rectifiable set and thus has Hausdorff dimension n − 2 at most. If n = 2, then the nodal set of ω is a discrete subset of M.

Proof. Taking $L^2$-products and using the Green formula (3) yields

$$0 = (\Delta\omega, \omega)_{L^2(M)} = ((d + \delta)^2\omega, \omega)_{L^2(M)} = ((d + \delta)\omega, (d + \delta)\omega)_{L^2(M)}.$$

We conclude (d + δ)ω ≡ 0. Now the theorem tells us that the nodal set of ω is a countably (n − 2)-rectifiable set.

It is remarkable that Corollary 1 fails if we drop the assumption that M be compact. For example, $\omega = x_1\, dx_1 \wedge \dots \wedge dx_k$ is a harmonic k-form on $\mathbb{R}^n$ whose nodal set has codimension 1. This means that despite the local nature of the statement of Corollary 1 it cannot be proved by purely local methods.

One can extend Corollary 1 to complete noncompact manifolds by imposing suitable decay conditions on the form ω. This will make the additional boundary term $\int_{\partial B(R)} \langle \nu \cdot (d + \delta)\omega, \omega\rangle$ tend to zero as R → ∞, where B(R) denotes the distance ball of radius R around some fixed point. Demanding ω to be in the Sobolev space $H^{\frac{1}{2},2}(M)$ should be sufficient.

Corollary 1 also fails if we replace harmonic forms by other eigenforms of the Laplace-Beltrami operator even if the manifold is compact. For example, $\omega = \sin(2\pi m x_1)\, dx_1 \wedge \dots \wedge dx_k$, m ∈ Z, is a Δ-eigenform on the torus $T^n = \mathbb{R}^n/\mathbb{Z}^n$ whose nodal set has codimension 1.

As another application of the main theorem we obtain a quick proof of the following theorem first proven in 1989 by R. Hardt and L. Simon [18, Theorem 1.10].

Corollary 2. Let M be an n-dimensional connected Riemannian manifold. Let f be a nontrivial Δ-eigenfunction on M, i.e. Δf = λf for some λ > 0. Then the nodal set of f is a disjoint union $N_{reg} \cup N_{sing}$, where $N_{reg}$ is a smooth hypersurface of M and $N_{sing}$ is a countably (n − 2)-rectifiable set and thus has Hausdorff dimension n − 2 at most.

Proof. Put $N_{reg} = \{x \in M \mid f(x) = 0,\ df(x) \neq 0\}$ and $N_{sing} = \{x \in M \mid f(x) = 0,\ df(x) = 0\}$. Then the nodal set of f is given by $N_{reg} \cup N_{sing}$. By the implicit function theorem $N_{reg}$ is a smooth hypersurface. The other set $N_{sing}$ is the zero locus of the mixed differential form $\omega = \sqrt{\lambda}\, f + df$. Now ω is an eigenform for the generalized Dirac operator D = d + δ,

$$(d + \delta)\omega = \sqrt{\lambda}\, \omega.$$

The main theorem says $N_{sing}$ is a countably (n − 2)-rectifiable set.
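The eigenform identity used in the proof of Corollary 2 can be checked numerically in the simplest setting. The sketch below is an illustration under assumptions that are not in the paper: it works on the real line with the flat metric, takes f(x) = sin(3x) (so λ = 9 for the nonnegative Laplacian), represents the mixed form g + h dx by the pair (g, h), and uses d(g) = g′ dx and δ(h dx) = −h′ with central differences. All names are hypothetical.

```python
import math

M_FREQ = 3            # hypothetical frequency: f(x) = sin(3x)
LAM = M_FREQ ** 2     # eigenvalue of -d^2/dx^2 on f

def omega0(x):
    """0-form part of omega = sqrt(lambda) f + df."""
    return math.sqrt(LAM) * math.sin(M_FREQ * x)

def omega1(x):
    """Coefficient of dx in omega, i.e. f'(x)."""
    return M_FREQ * math.cos(M_FREQ * x)

def d_plus_delta(g, h, x, eps=1e-5):
    """Apply d + delta to g + h dx at x: the result is -h' + g' dx."""
    dg = (g(x + eps) - g(x - eps)) / (2 * eps)
    dh = (h(x + eps) - h(x - eps)) / (2 * eps)
    return (-dh, dg)

def eigen_defect(samples=200):
    """Largest sampled deviation of (d + delta) omega from sqrt(lambda) omega."""
    worst = 0.0
    for i in range(samples):
        x = 2 * math.pi * i / samples
        a, b = d_plus_delta(omega0, omega1, x)
        worst = max(worst,
                    abs(a - math.sqrt(LAM) * omega0(x)),
                    abs(b - math.sqrt(LAM) * omega1(x)))
    return worst

def min_pointwise_norm(samples=200):
    """Smallest sampled value of |omega|; it stays positive, so omega has no zeros."""
    return min(math.hypot(omega0(2 * math.pi * i / samples),
                          omega1(2 * math.pi * i / samples))
               for i in range(samples))

print(eigen_defect(), min_pointwise_norm())
```

In this one-dimensional model ω never vanishes, which matches the statement that $N_{sing}$ has dimension at most n − 2: for n = 1 it must be empty.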


Remark. If we knew that Corollary 2 is true for generalized Laplacians, then the main theorem could be derived from it, at least for eigensections s of a Dirac operator, as follows. Let Ds = λs, λ ∈ R. The Weitzenböck formula (2) yields

$$\nabla^*\nabla s + (\mathfrak{R} - \lambda^2)s = (D^2 - \lambda^2)s = 0.$$

Hence we could conclude that the nodal set of s is of the form $N_{reg} \cup N_{sing}$, where $N_{reg}$ is a smooth hypersurface and $N_{sing}$ has at least codimension 2. It would remain to show $N_{reg} = \emptyset$. Assume there is $x_0 \in N_{reg}$. Choose a small ball B around $x_0$ which is cut by $N_{reg}$ into two pieces $B_1$ and $B_2$. Define

$$\tilde s(x) = \begin{cases} s(x), & \text{if } x \in B_1, \\ 0, & \text{if } x \in B_2. \end{cases}$$

Then $\tilde s$ is a continuous section of S over B. Let φ be a smooth test section of S with compact support contained in B. Let ν be the unit normal field of $N_{reg} \cap B$ pointing into $B_2$.

Fig. 1. The ball B around $x_0$, cut by $N_{reg}$ into the pieces $B_1$ and $B_2$, with the normal ν pointing into $B_2$.

Then by (3),

$$(\tilde s, D\phi)_{L^2(B)} = (s, D\phi)_{L^2(B_1)} = (Ds, \phi)_{L^2(B_1)} + \int_{B \cap N_{reg}} \langle s, \nu \cdot \phi\rangle = (\lambda s, \phi)_{L^2(B_1)} + 0 = (\lambda \tilde s, \phi)_{L^2(B)}.$$

Hence $D\tilde s = \lambda \tilde s$ holds in B in the sense of distributions. By elliptic regularity theory $\tilde s$ is smooth. Since $\tilde s$ vanishes identically on $B_2$ we know from Aronszajn’s unique continuation theorem (see the next section) that $\tilde s \equiv 0$ on B. Hence s ≡ 0 on $B_1$. Applying Aronszajn’s theorem once more we conclude s ≡ 0 on M.


Unfortunately, to our knowledge Corollary 2 is not established for generalized Laplacians. But, using the methods of the next section, it is easy to see that nodal sets for generalized Laplacians have Hausdorff dimension n − 1 at most. Note that the nodal set of a solution of a general linear elliptic system of second order can be very irregular. For example, it is not hard to show the following. Let $A \subset \mathbb{R}^{n-1}$ be any closed subset. Then there is a linear elliptic differential operator of second order, P, acting on functions $u : \mathbb{R}^n \to \mathbb{R}^2$ and there is a solution u of Pu = 0 such that $u^{-1}(0) = A \times \{0\} \subset \mathbb{R}^n$. To conclude this section let us emphasize once more the two-dimensional case. We have the following generalization of the well-known uniqueness theorem for holomorphic functions.

Corollary 3. Let M be a two-dimensional connected Riemannian manifold. If the zero set of a solution of a generalized Dirac equation on M has an accumulation point, then this solution must vanish identically.

This corollary has been proven for many special cases. The Dirac equation on a surface can be written as a generalized Cauchy-Riemann equation. If for example the real dimension of the Dirac bundle is 2, then the theory of “generalized analytic functions” applies, see [25], also compare [9].

4. Proof of the Main Theorem

This section is devoted to the proof of the main theorem. There are two important ingredients to the proof which we state first. We have

Aronszajn’s Unique Continuation Theorem. Let M be a connected Riemannian manifold. Let L be an operator of the form $L = \nabla^*\nabla + L_1 + L_0$ acting on sections of a vector bundle S over M, where $L_1$ and $L_0$ are differential operators of first and zeroth order respectively. Let s be a solution of Ls = 0. If s vanishes at some point of infinite order, i.e. if all derivatives vanish at that point, then s ≡ 0.

For a proof see [2, Theorem on p. 235 and Remark 3 on p. 248]. The other essential ingredient is a version of Weierstrass’ preparation theorem for differentiable functions [22, Chapter V], [6, 6.3].

Malgrange’s Preparation Theorem (Special case). Let $U \subset \mathbb{R}^n$ be an open neighborhood of 0, let f : U → R be a $C^\infty$-function vanishing of kth order at 0 but not of (k + 1)st order, k ∈ N. Then, after possibly shrinking U to a smaller neighborhood and applying a linear coordinate transformation to $\mathbb{R}^n$, there exist $C^\infty$-functions v : U → R and $u_j : U \cap (\{0\} \times \mathbb{R}^{n-1}) \to \mathbb{R}$ such that

$$f(x) = v(x) \cdot \Big( x_1^k + \sum_{j=0}^{k-1} u_j(x')\, x_1^j \Big)$$

for all $x = (x_1, x') \in U$, where v(x) ≠ 0 for all x ∈ U and $u_j$ vanishes of order k − j at $0 \in \mathbb{R}^{n-1}$.

Remark. By Taylor’s theorem we can write $f = \hat f + \psi$, where $\hat f$ is a homogeneous polynomial of degree k and ψ vanishes of order k + 1 at 0. The linear coordinate transformation in the Preparation Theorem must be such that a vector $w \in \mathbb{R}^n$ for which $\hat f(w) \neq 0$ is transformed into (1, 0, . . . , 0).

Proof of Main Theorem. To prove the main theorem let s be a section of the Dirac bundle satisfying

$$(D + h)s = 0. \tag{4}$$

We assume that the Dirac bundle is real; in the complex case we simply forget the complex structure. Let p ∈ M be a point of its nodal set, i.e. s(p) = 0. Applying D to (4) and using the Weitzenböck formula (2) we get

$$(\nabla^*\nabla + D \circ h + \mathfrak{R})s = 0. \tag{5}$$

By Aronszajn’s unique continuation theorem s cannot vanish of infinite order at p. Say s vanishes at p of order k but not of order k + 1. We choose normal coordinates around p and trivialize the Dirac bundle. Then s corresponds to a vector valued function $(s_1, \dots, s_r)$ defined in a neighborhood of 0. Here r is the (real) rank of the Dirac bundle S. All component functions vanish of order k at p and at least one of them does not vanish of order k + 1. Other components could a priori vanish of higher order but by choosing the trivialization appropriately we can assume that this is not the case. To see this, note that trivializing the bundle amounts to exhibiting linearly independent linear functionals on the bundle. Pick one linear functional $l_1$ such that $s_1 = l_1 \circ s$ does not vanish of order k + 1 at 0. Choose the other r − 1 functionals linearly independent but close to $l_1$.

Moreover, if the Dirac bundle is trivialized in this way, then there is a direction in which the kth derivative of all components $s_1, \dots, s_r$ does not vanish. Hence we can use Malgrange’s preparation theorem with the same linear coordinate transformation (the same $x_1$-direction) for all the components. Therefore we can write

$$s_m(x) = v_m(x) \cdot \Big( x_1^k + \sum_{j=0}^{k-1} u_{m,j}(x')\, x_1^j \Big),$$

where $v_m$ are nonvanishing and $u_{m,j}$ vanish of order k − j at 0, m = 1, . . . , r. We see already that the nodal set of s near p is countably (n − 1)-rectifiable. Namely, for any m the nodal set is contained in the set

$$N_m = \Big\{ x = (x_1, x') \ \Big|\ x_1^k + \sum_{j=0}^{k-1} u_{m,j}(x')\, x_1^j = 0 \Big\}.$$


That this set $N_m$ is countably (n − 1)-rectifiable follows easily from the following

Fact. $\mathbb{R}^k$ can be written as a disjoint union of countably many bounded subsets $A_\nu$ such that the number of pairwise distinct real roots of the polynomial $P_u(t) = t^k + \sum_{j=0}^{k-1} u_j t^j$ is constant for $u = (u_0, \dots, u_{k-1}) \in A_\nu$ and the real roots (ordered by magnitude) are Lipschitz functions on $A_\nu$.

Fig. 2. The set $N_m$ in the coordinates $x_1 \in \mathbb{R}$ and $x' \in \mathbb{R}^{n-1}$.

Recall that two polynomials $F = \sum_{j=0}^{k} a_j t^j$ and $G = \sum_{j=0}^{k} b_j t^j$ have a common root if and only if the resultant $R_{F,G}$ vanishes. The resultant is a weighted homogeneous polynomial of degree $k^2$ in the coefficients $a_j$ and $b_j$, where $a_j$ and $b_j$ have weight k − j, see [5, Sect. 4].

More generally, r polynomials $P_m = \sum_{j=0}^{k} u_{m,j} t^j$, m = 1, . . . , r, have a common root if and only if any two linear combinations $F = \sum_{m=1}^{r} \alpha_m P_m$ and $G = \sum_{m=1}^{r} \beta_m P_m$ have vanishing resultant $R_{F,G}$.

We will show that there are two linear combinations $F = \sum_{m=1}^{r} \alpha_m v_m(0) P_m$ and $G = \sum_{m=1}^{r} \beta_m v_m(0) P_m$ of our polynomials $P_m = t^k + \sum_{j=0}^{k-1} u_{m,j}(x')\, t^j$ such that the resultant $R_{F,G}$ regarded as a function in x′ does not vanish of infinite order at x′ = 0. Then, by applying once more Malgrange’s preparation theorem and the fact on the real roots of polynomials mentioned above, we see that the zero locus of $R_{F,G}$ is a countably (n − 2)-rectifiable subset of $\mathbb{R}^{n-1}$ and that $N = \bigcap_{m=1}^{r} N_m$ is contained in a countably (n − 2)-rectifiable subset of $\mathbb{R}^n$. Moreover, if n = 2, we see that x = 0 is an isolated zero of s and the theorem is proved.

To find two such linear combinations F and G we look at the Taylor expansion

$$v_m(x) = v_m(0) + \text{first order terms}, \qquad u_{m,j}(x') = \hat u_{m,j}(x') + \text{higher order terms}.$$

Here $\hat u_{m,j}(x')$ is a homogeneous polynomial of degree k − j. The Taylor expansion of $s_m$ is then given by

$$s_m(x) = v_m(0) \cdot \Big( x_1^k + \sum_{j=0}^{k-1} \hat u_{m,j}(x')\, x_1^j \Big) + \text{higher order terms}.$$
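The resultant criterion recalled above can be tried out on small examples. The following sketch is only an illustration: it computes the resultant as the determinant of the Sylvester matrix with exact rational arithmetic, assumes both leading coefficients are nonzero, and detects common roots over the complex numbers. The function names and the sample polynomials are hypothetical.

```python
from fractions import Fraction

def sylvester(p, q):
    """Sylvester matrix of p and q; coefficients are listed from the highest degree down."""
    m, n = len(p) - 1, len(q) - 1
    rows = []
    for i in range(n):                      # n shifted copies of p
        rows.append([0] * i + list(p) + [0] * (n - 1 - i))
    for i in range(m):                      # m shifted copies of q
        rows.append([0] * i + list(q) + [0] * (m - 1 - i))
    return rows

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result                # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return result

def resultant(p, q):
    """Resultant of p and q as the determinant of their Sylvester matrix."""
    return det(sylvester(p, q))

F = [1, -3, 2]            # (t - 1)(t - 2)
G_SHARED = [1, 3, -10]    # (t - 2)(t + 5), shares the root t = 2 with F
G_DISJOINT = [1, 2, -15]  # (t - 3)(t + 5), no root in common with F

print(resultant(F, G_SHARED) == 0)
print(resultant(F, G_DISJOINT) != 0)
```

In the proof the coefficients are functions of x′, so the resultant becomes a function on $\mathbb{R}^{n-1}$ whose zero locus is then studied.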

Looking at the lowest order term of the left hand side of Eq. (4) we get ˆ = 0, Dw

(6)

Pk−1 ˆ = where w(x) = (w1 (x), . . . , wr (x)), wm (x) = vm (0) · xk1 + j=0 uˆ m,j (x0 )xj1 and D Pn n i=1 γi ∂/∂xi is the Dirac operator on Tp M = R . Here γi are generalized Pauli matrices satisfying the relations γi γj + γj γi = −2δij . Putting yj (x0 ) = (v1 (0)uˆ 1,j (x0 ), . . . , vr (0)uˆ r,j (x0 )) for j = 0, . . . , k − 1 and yk (x0 ) = (v1 (0), . . . , vr (0)) we have k X w(x) = yj (x0 )xj1 . (7) j=0

Write $D_1 = \sum_{j=2}^{n} \gamma_1 \gamma_j\, \partial/\partial x_j$. Then $D_1$ is a generalized Dirac operator on $\mathbb{R}^{n-1}$. Plugging (7) into (6) and comparing coefficients of $x_1^j$ yields

$$D_1 y_j = (j+1) \cdot y_{j+1}, \quad j = 0, \dots, k-1, \qquad D_1 y_k = 0.$$

Hence

$$y_j = \frac{1}{j!} D_1^j y_0, \quad j = 0, \dots, k. \tag{8}$$

Given two linear combinations $\hat F = \sum_{m=1}^{r} \alpha_m w_m$ and $\hat G = \sum_{m=1}^{r} \beta_m w_m$, the resultant $R_{\hat F, \hat G}$ is precisely the lowest order term of the resultant of the corresponding linear combinations of the $P_m$'s, $F = \sum_{m=1}^{r} \alpha_m v_m(0) P_m$, $G = \sum_{m=1}^{r} \beta_m v_m(0) P_m$:

$$R_{F,G}(x') = R_{\hat F, \hat G}(x') + \text{higher order terms}. \tag{9}$$
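The criterion recalled above (two polynomials share a root exactly when their resultant vanishes) can be tried on small examples. The following is only a minimal sketch, assuming the resultant is computed as the determinant of the Sylvester matrix over the rationals; the function names and the sample polynomials are illustrative and do not come from the paper.

```python
from fractions import Fraction


def sylvester_matrix(f, g):
    """Sylvester matrix of two polynomials given as coefficient lists,
    highest degree first, e.g. t^2 - 3t + 2 -> [1, -3, 2]."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for i in range(n):  # n shifted copies of f
        rows.append([0] * i + list(f) + [0] * (size - m - 1 - i))
    for i in range(m):  # m shifted copies of g
        rows.append([0] * i + list(g) + [0] * (size - n - 1 - i))
    return [[Fraction(x) for x in row] for row in rows]


def determinant(matrix):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def resultant(f, g):
    return determinant(sylvester_matrix(f, g))


# (t-1)(t-2) and (t-1)(t-3) share the root t = 1.
shared = resultant([1, -3, 2], [1, -4, 3])
# (t-1)(t-2) and (t-3)(t-4) have no common root.
disjoint = resultant([1, -3, 2], [1, -7, 12])
print(shared, disjoint)
```

Exact rational arithmetic is used so that the vanishing of the resultant is a genuine equality rather than a floating point tolerance.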

To show that $R_{F,G}$ does not vanish of infinite order at $x' = 0$ it is thus sufficient to show that $R_{\hat F, \hat G}$ does not vanish identically. Therefore the theorem is proved if we can choose the $\alpha_m$ and $\beta_m$ in such a way that $R_{\hat F, \hat G}$ does not vanish identically.

Assume the resultant $R_{\hat F, \hat G}$ vanishes for all linear combinations $\hat F$ and $\hat G$. Then for any $x' \in \mathbb{R}^{n-1}$ there exists a common root $\xi(x')$ of the polynomials $w_1, \dots, w_r$. By (7) and (8) we have

$$\sum_{j=0}^{k} \frac{1}{j!} D_1^j y_0(x')\, \xi(x')^j = 0. \tag{10}$$

There is a nonempty open subset of $\mathbb{R}^{n-1}$ on which $\xi$ can be chosen such that it depends smoothly on $x'$. On this subset we apply $D_1$ to (10) and use (1) to obtain

$$\sum_{j=0}^{k-1} \frac{\xi(x')^j}{j!} \cdot (1 + \nabla\xi) \cdot D_1^{j+1} y_0(x') = 0. \tag{11}$$

If $v \in V$ is a vector in a finite dimensional Euclidean vector space $V$, then the element $1 + v$ is invertible in the Clifford algebra $Cl(V)$ with inverse $\frac{1-v}{1+|v|^2}$. Thus (11) gives

$$\sum_{j=0}^{k-1} \frac{\xi(x')^j}{j!} D_1^{j+1} y_0(x') = 0. \tag{12}$$

Repeating this argument inductively we eventually get $D_1^k y_0 = 0$. But this means $y_k(x') = (v_1(0), \dots, v_r(0)) = 0$, a contradiction.
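The invertibility of $1 + v$ used in passing from (11) to (12) can be checked in one line. The computation below assumes the sign convention $\gamma_i \gamma_j + \gamma_j \gamma_i = -2\delta_{ij}$ stated earlier, under which a vector $v$ satisfies $v \cdot v = -|v|^2$ in $Cl(V)$.

```latex
(1 + v)(1 - v) = 1 - v \cdot v = 1 + |v|^2,
\qquad\text{so}\qquad
(1 + v)^{-1} = \frac{1 - v}{1 + |v|^2}.
```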

References

1. Albert, J.H.: Nodal and Critical Sets for Eigenfunctions of Elliptic Operators. Proc. Symp. Pure Math. 23, 71–78 (1973)
2. Aronszajn, N.: A Unique Continuation Theorem for Solutions of Elliptic Partial Differential Equations or Inequalities of Second Order. J. Math. Pures Appl. 36, 235–249 (1957)
3. Bérard, P., Meyer, D.: Inégalités isopérimétriques et applications. Ann. Sci. Éc. Norm. Sup. 15, 513–542 (1982)
4. Berline, N., Getzler, E., Vergne, M.: Heat Kernels and Dirac Operators. Berlin–Heidelberg: Springer-Verlag, 1992
5. Brieskorn, E., Knörrer, H.: Plane Algebraic Curves. Basel: Birkhäuser Verlag, 1986
6. Bröcker, T., Lander, L.: Differentiable Germs and Catastrophes. Cambridge: Cambridge Univ. Press, 1975
7. Brüning, J.: Über Knoten von Eigenfunktionen des Laplace-Beltrami-Operators. Math. Z. 158, 15–21 (1978)
8. Caffarelli, L., Friedman, A.: Partial Regularity of the Zero Set of Linear and Superlinear Elliptic Equations. J. Diff. Eq. 60, 420–439 (1985)
9. Carleman, T.: Sur les systèmes linéaires aux dérivées partielles du premier ordre à deux variables. C. R. Acad. Sci. Paris 197, 471–474 (1933)
10. Chanillo, S., Muckenhoupt, B.: Nodal Geometry on Riemannian Manifolds. J. Diff. Geom. 34, 85–91 (1991)
11. Chen, J.: The Local Structure of Nodal Set of Solutions of Schroedinger Equation on Riemannian 3-manifolds. Manuscr. Math. 85, 255–263 (1994)
12. Cheng, S.-Y.: Eigenfunctions and Nodal Sets. Comment. Math. Helv. 51, 43–55 (1976)
13. Dong, R.-T.: Nodal Sets of Eigenfunctions on Riemann Surfaces. J. Diff. Geom. 36, 493–506 (1992)
14. Donnelly, H.: Nodal Sets for Sums of Eigenfunctions on Riemannian Manifolds. Proc. Am. Math. Soc. 121, 967–973 (1994)
15. Donnelly, H., Fefferman, C.: Nodal Sets of Eigenfunctions on Riemannian Manifolds. Invent. Math. 93, 161–183 (1988)
16. Donnelly, H., Fefferman, C.: Nodal Sets of Eigenfunctions: Riemannian Manifolds with Boundary. In: Analysis, et cetera, Res. Pap. in Honor of J. Moser's 60th Birthd. 251–262 (1990)
17. Donnelly, H., Fefferman, C.: Nodal Sets for Eigenfunctions of the Laplacian on Surfaces. J. Am. Math. Soc. 3, 333–353 (1990)
18. Hardt, R., Simon, L.: Nodal Sets for Solutions of Elliptic Equations. J. Diff. Geom. 30, 505–522 (1989)
19. Kotschick, D.: The Seiberg-Witten Invariants of Symplectic Four-Manifolds. Astérisque (to appear)
20. Lawson, H.B., Michelsohn, M.-L.: Spin Geometry. Princeton, N. J.: Princeton Univ. Press, 1989
21. Lu, G.: Covering Lemmas and an Application to Nodal Geometry on Riemannian Manifolds. Proc. Am. Math. Soc. 117, 971–978 (1993)
22. Malgrange, B.: Ideals of Differentiable Functions. Oxford: Oxford Univ. Press, 1966
23. Taubes, C.H.: The Seiberg-Witten and Gromov Invariants. Math. Res. Letters 2, 221–238 (1995)
24. Taubes, C.H.: SW ⇒ Gr, From the Seiberg-Witten Equations to Pseudoholomorphic Curves. J. Am. Math. Soc. (to appear)
25. Vekua, I.N.: Generalized Analytic Functions. Oxford: Pergamon Press, 1962

Communicated by A. Connes

Commun. Math. Phys. 188, 723 – 735 (1997)

Communications in Mathematical Physics
© Springer-Verlag 1997

Towards an Existence Proof of MacKay's Fixed Point

Andreas Stirnemann
Zeltweg 4, CH-8032 Zürich, Switzerland

Received: 3 June 1996 / Accepted: 27 March 1997

Abstract: It is proved, using a computer, that there exist symmetric analytic twist maps $U$ and $T$ which satisfy the fixed point equations $U = BTUTB^{-1}$, $T = BTUTUTB^{-1}$. Here $B$ is a diagonal $2 \times 2$ matrix. If $U$ and $T$ commute, then $(U, T)$ is either a fixed point or a period-three point for MacKay's renormalisation operator.

1. Introduction

1.1. Background.

Brevis esse laboro, obscurus fio.
Horatius

1.1.1. Twist maps. We consider one-parameter families of area-preserving twist maps of the cylinder $S^1 \times \mathbb{R}$ as, for instance, the standard family,

$$x' = x + y', \qquad y' = y - \frac{\kappa}{2\pi} \sin 2\pi x,$$

where $x$ and $y$ are the "angular" and the "vertical" coordinates respectively, where $x'$ and $y'$ are the coordinates of the image point of $(x, y)$, and where $\kappa$ is the parameter of the family. (A useful reference for this and the following is [2]. A homeomorphism given by

$$x' = f(x, y), \qquad y' = g(x, y)$$

is called a twist map if $\partial f / \partial y$ has a fixed sign. Area-preserving twist maps arise naturally as section maps of Hamiltonian mechanical systems.)

1.1.2. Periodic twist maps. We shall always work in the covering space $\mathbb{R}^2$ of the cylinder, i.e., we shall look at twist maps as self-maps of the plane. The condition for such a map to lift to the cylinder is that it commutes with the backward rotation $R : (x, y) \mapsto (x - 1, y)$. A twist map of the plane which commutes with $R$ will be called a periodic twist map.

1.1.3. KAM theory. The standard map $F_0$ corresponding to $\kappa = 0$ is simply the linear shear $x' = x + y$, $y' = y$, which has the lines $y = \text{const}$ as invariant curves. Regarded as an object on the cylinder, such a line is a homotopically non-trivial invariant circle of $F_0$, and the induced circle map has rotation number $y$. According to the celebrated KAM theorem, the curves with Diophantine rotation numbers persist and stay smooth (in fact, analytic) as the parameter $\kappa$ increases slightly. As it continues to grow, more and more of these curves break up, leaving behind compact invariant Cantor sets (with the same rotation numbers), so-called Aubry-Mather sets.

1.1.4. The golden curve. There is some experimental evidence to suggest that, in "typical" families, the last curve to break up (the most robust one, so to speak) is the one with the golden mean rotation number, i.e., the one with rotation number

$$\omega = \frac{\sqrt{5} - 1}{2};$$

we shall refer to this curve as the golden curve. In the case of the standard family, for instance, there appears to be a critical parameter value $\kappa_c \approx 0.9716$ such that for $\kappa < \kappa_c$ there exists a smooth invariant golden curve while for $\kappa > \kappa_c$ there exists no (homotopically non-trivial) invariant curve at all. For the critical parameter value $\kappa = \kappa_c$ itself, there appears to exist a golden curve which is no longer smooth. We shall refer to this one as the critical golden curve.

1.1.5. Renormalisation.
The transition from a smooth golden curve via the critical golden curve to a "golden" Aubry-Mather set seems to be governed by a fixed point of a renormalisation operator. The pertinent renormalisation group analysis was developed by R. S. MacKay in his thesis [4], which to my knowledge is still the best general reference. (For a review, see [5].) MacKay's renormalisation operator for pairs $(U, T)$ of twist maps is formally given by

$$(U, T) \mapsto (\tilde U, \tilde T) := (BTB^{-1}, BTUB^{-1}), \tag{1}$$

where $B$ is a linear diagonal rescaling matrix, which depends on the "input maps" $U$ and $T$. (Here, $TU$ denotes the composition of the maps $T$ and $U$.) Thus, the renormalisation is essentially a composition, followed by a change of coordinates, just as in Feigenbaum's renormalisation theory for unimodal maps.

Working with pairs, rather than with single maps, can be seen as a way of implementing periodicity: to a periodic twist map $F$, we associate the commuting pair $(R, F)$, to which MacKay's operator can be applied. Obviously, if the input pair commutes, so does the output pair: MacKay's operator preserves commutativity. Since conjugation by a map with constant Jacobian sends area-preserving maps into area-preserving maps, it also preserves area-preservation.

The fixed point problem for this operator is of considerable importance for the dynamics of twist maps. In particular, it can be shown that any (pair of) twist map(s) in the domain of attraction of a fixed point (meeting certain conditions) has a homotopically non-trivial invariant circle with the golden mean rotation number [6].

1.1.6. Critical fixed point. The fixed point of MacKay's operator which is relevant to the breaking up of the golden curve will be referred to as the critical fixed point or as MacKay's fixed point; its existence was conjectured in [4]. It is thought that its stable manifold, which has (essentially) codimension 1, is the boundary in the space of twist maps between maps with a (smooth) KAM curve and maps without a golden curve.

1.2. Result.

Geßler: Das ist Tells Geschoß.
Tell: Du kennst den Schützen, suche keinen andern!
F. Schiller, Wilhelm Tell, 4. Akt

The aim here is to announce a result which goes a long way towards proving existence of the critical fixed point and to provide as much information as needed to enable an expert to reproduce the proof of this result. The method of proof had previously been applied to a simplified (but dynamically uninteresting) version of MacKay's operator; see [8], where more motivation can also be found.

1.2.7. Symmetry. The existence proof does not deal directly with MacKay's operator, but with the following symmetrised version of its third iteration (see [6, p. 375]):

$$N : (U, T) \mapsto (\tilde U, \tilde T) := (BTUTB^{-1}, BTUTUTB^{-1}). \tag{2}$$
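To see in what sense (2) is a symmetrised third iterate of (1), one can track the compositions as words in the letters U and T. The sketch below is an illustration only: it ignores the rescaling $B$ altogether, and the helper names are hypothetical. Three applications of $(U, T) \mapsto (T, TU)$ give the words TUT and TUTTU; the second has the same letters as TUTUT, so the two agree whenever $U$ and $T$ commute, but only TUTUT is a palindrome.

```python
def mackay_step(pair):
    """One application of (1) on words, with the rescaling B ignored:
    (U, T) -> (T, TU), where juxtaposition stands for composition."""
    u, t = pair
    return (t, t + u)


def is_palindrome(word):
    return word == word[::-1]


pair = ("U", "T")
for _ in range(3):
    pair = mackay_step(pair)

third_iterate = pair                 # words produced by iterating (1) three times
symmetrised = ("TUT", "TUTUT")       # words appearing in (2)

# Words with the same letter counts represent the same map if U and T commute.
same_letters = [sorted(p) == sorted(q) for p, q in zip(third_iterate, symmetrised)]

print(third_iterate)
print(same_letters)
print([is_palindrome(w) for w in third_iterate],
      [is_palindrome(w) for w in symmetrised])
```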

Any commuting fixed point of this scheme is either a fixed point or a point of period three for MacKay's operator. (From the point of view of application, a period-three point is just as useful as a fixed point.) The advantage of this scheme is that, unlike MacKay's operator, it preserves symmetry.

Definition 1. Let $S$ be the reflection $(x, y) \mapsto (-x, y)$. A homeomorphism $F$ of the plane is called symmetric if $S$ conjugates it to its inverse: $SFS = F^{-1}$.

For instance, the maps of the standard family are symmetric in appropriate coordinates. Moreover, MacKay's fixed point was known to be symmetric. Therefore, it can be justified to restrict the fixed point problem to the space of symmetric pairs.

1.2.8. Pragmatics. The reason for doing so, i.e., for using the symmetrised version (2), is purely pragmatic, i.e., ultimately dictated by limitations of computing machinery. Restriction to the space of symmetric pairs makes it possible to work with a special brand of generating functions in terms of which the relevant implicit equations (see below) do not involve derivatives. The standard "area-preserving" generating functions, by contrast, lead to implicit equations in terms of first derivatives. To the best of my knowledge, any occurrence of a derivative in the definition of the operator would make the proof impossible, because it would require running the program with a truncation degree far too high for currently available computing power.

The absence of derivatives in the definition of the operator, however, comes at a price: we have to deal a priori with up to four compositions, rather than with just one as in the case of MacKay's operator. This in itself would be completely hopeless, but symmetry makes it possible to reduce the number of compositions by a factor of two (see the next section). That a computer proof should have succeeded¹ may still be hard to believe; it continues to amaze me.

1.2.9. Statement of the result. Here is the main result:

Theorem 1. We consider the operator $N$ formally given by (2) where the dependency of the rescaling matrix $B = \mathrm{diag}(\alpha, \beta)$ on the input pair $(U, T)$ will be specified below. The operator $N$ has a symmetric fixed point which is locally unique in the space of symmetric pairs. For the values of the rescaling factors $\alpha$ and $\beta$, we have the bounds

$$\alpha \in [-2.8321648, -2.8321625], \qquad \beta \in [-28.846551, -28.846545].$$

The proof uses Lanford's method [3] and the Eckmann-Koch-Wittwer framework [1] for rigorous functional analysis in Banach spaces. For more detail, see [7], where the application of the same methods to the Siegel disc renormalisation is described. The program was run with an overall truncation degree of 35. For solving the implicit equations, the truncation degree was temporarily raised to 45. (This is akin to using double precision arithmetic for the sensitive parts of a big computation, the rest of which is done in single precision for efficiency.)

1.2.10. Limitations. A few comments about this result are in order:

• As already hinted at above, the proof does not work directly in terms of the input maps $U$ and $T$, but in terms of their generating functions $u$ and $t$, see below. This means that we establish existence of a fixed point of the induced renormalisation of generating functions. To translate the results back to the framework of maps requires some (easy) computational work, which has not yet been done.
• It has not been proved that the fixed point commutes. To modify the proof so that commutativity is also established should not be hard. (Basically, one would precede the operator by a projection to the submanifold of maps commuting to zeroth order at a certain point. A fixed point of this expanded scheme would then automatically commute. This approach was successful in the case of the Siegel disc renormalisation, see [7].)
• It has not been proved that the fixed point is area-preserving. In order to establish area-preservation, one would precede the operator by the (linear) projection to the area-preserving subspace. This should not be hard.
• It has not been proved that our pair $(U, T)$ is a fixed point for MacKay's operator: even if we take commutativity for granted, it might be a point of an orbit of period three. In order to establish that it is really a fixed point, one could apply MacKay's operator to it and verify that the image is so close to $(U, T)$ that, by local uniqueness, it must be identical to $(U, T)$. This is not hard in principle but might be tedious.

Despite these limitations, I feel that the above result represents a substantial step towards making MacKay's renormalisation theory rigorous. The upgrading of the proof for stronger results is left to the future.

¹ It took one week of pure CPU time on a DEC Alpha.

1.3. Plans for future work.

Ars longa, vita brevis.

In order to complete the programme suggested by MacKay's work and by [6], the following steps (roughly in order of increasing difficulty) have to be taken:

• "Solve" for the twist maps $U$ and $T$.
• Verify the conditions of [6].
• Show that $(U, T)$ is a fixed point of MacKay's operator. (This step is actually dispensable.)
• Implement area-preservation.
• Implement commutativity.
• Prove that the fixed point is hyperbolic with an essentially one-dimensional unstable manifold.
• Prove that the standard family intersects the stable manifold transversally.
• Prove that one branch of the unstable manifold ends at the so-called "simple fixed point".

The result will be a global KAM theorem for the golden mean rotation number, including a precise characterisation of the critical curve.

1.4. Organisation of the paper. The rest of this paper is a brief exposition of the essential technical details of the proof. We are going to describe the machinery of symmetry-generating functions (introduced in [1]), the definition of the rescaling factors $\alpha$ and $\beta$, the solution of the relevant implicit equations, and, finally, the domains on which the generating functions $u$ and $t$ are expanded. (The domains determine the underlying Banach space of pairs; the norm is the usual $\ell^1$-norm induced by the domains.)

1.5. Omissions. The precise definition of the renormalisation operator is given in Sects. 3 and 4. Of course, in order to apply Lanford's method, the derivative of the operator is needed as well. The computation of this derivative is a tedious exercise in taking variations; it is not included here. (It can be found (essentially) in [5, p. 41–44].) We do not comment on the computational aspects in any detail; the subjects of interval arithmetic, of evaluation and composition of (truncated) power series, and of solving implicit equations are extensively discussed in [1]. Nor do we describe the overall strategy of the computer proof; the reader is referred to [3].

2. Symmetry-Generating Functions

The aim of this section is to describe the machinery of symmetry-generating functions, which are closely related to the ones introduced in [1]. For the following, compare [8, §3], where the construction of the generating function for $TUT$ is described. (It is reproduced for the sake of completeness.) The only new thing here is the construction of the generating function for $TUTUT$.

Let $f(x, x')$ be a scalar function of two variables. We define the action of the symmetry $S$ on $f$ by $Sf(x, x') := f(-x', -x)$. Let, in addition, $f$ be smooth and such that its first partial derivative $\partial f / \partial x$ has a fixed sign. Then the equations

$$y' = f(x, x'), \qquad y = Sf(x, x')$$

can be solved for $(x', y')$ in terms of $(x, y)$, and the resulting map $(x, y) \mapsto (x', y')$ is a symmetric twist map; conversely, any symmetric twist map can be brought into the above form [5, p. 35]. We call the function $f$ the symmetry-generating function of the resulting twist map. Notice that, in contrast to the usual ("area-preserving") generating function, the symmetry-generating function is uniquely determined by the twist map.

2.1. Composition. The main difficulty in the implementation of the operator $N$ (2) is the computation of the generating functions of the maps $TUT$ and $TUTUT$.

2.1.11. TUT. Let $U$ and $T$ be symmetric twist maps generated by $u(x, x')$ and $t(x, x')$ respectively. Any palindromic composition of $U$ and $T$ is symmetric. For instance, we have

$$STUTS = (STS)(SUS)(STS) = T^{-1} U^{-1} T^{-1} = (TUT)^{-1}.$$

We want to find the generating function of the map $TUT$. Consider the diagram

$$\begin{pmatrix} x \\ y \end{pmatrix} \xrightarrow{\;T\;} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \xrightarrow{\;U\;} \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} \xrightarrow{\;T\;} \begin{pmatrix} x' \\ y' \end{pmatrix}.$$

The compatibility conditions

$$t(x, x_1) = y_1 = u(-x_2, -x_1) \quad\text{and}\quad u(x_1, x_2) = y_2 = t(-x', -x_2)$$

lead to the following system for the functions $x_1 = z_1(x, x')$ and $x_2 = z_2(x, x')$:

$$t(x, z_1) - u(-z_2, -z_1) = 0, \qquad t(-x', -z_2) - u(z_1, z_2) = 0. \tag{3}$$

Therefore, as one might expect, the construction of the generating function of $TUT$ involves two unknown functions, corresponding to two "intermediate points". By making use of the symmetry, however, we can eliminate one of them. This can be seen heuristically as follows: Applying the operator $S$ to (3), followed by exchanging the two equations, has the same effect as the substitution $(z_1, z_2) \mapsto (-Sz_2, -Sz_1)$. Therefore, if $(z_1, z_2)$ is a solution of the system, then so is $(-Sz_2, -Sz_1)$. Since the solution of this system is unique (under appropriate conditions), this implies that $z_2 = -Sz_1$.

The corresponding rigorous argument goes as follows: Suppose that we can find a function $z(x, x')$ such that $z_1 := z$ and $z_2 := -Sz$ solve the first equation of (3). Then, the second equation is satisfied automatically, since it arises by applying $S$ to both sides of the first one. This should motivate the following claim:

Lemma 1. Let $z(x, x')$ be a solution of the equation

$$t(x, z(x, x')) - u(Sz(x, x'), -z(x, x')) \equiv 0.$$

Then the function

$$\hat u(x, x') := t(-Sz(x, x'), x')$$

generates the map $TUT$.

Proof. As above, we put $z_1 := z$ and $z_2 := -Sz$. It was already shown that $(z_1, z_2)$ solves (3). Since $t$ generates the map $(x_2, y_2) \mapsto (x', y')$, we have

$$y' = t(x_2, x') = t(-Sz, x') = \hat u(x, x')$$

and furthermore, since $t$ generates the map $(x, y) \mapsto (x_1, y_1)$,

$$y = t(-x_1, -x) = t(-z, -x) = S\,[t(-Sz, x')] = S\hat u(x, x').$$

Therefore, the function $\hat u$ generates the map $(x, y) \mapsto (x', y')$.
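Lemma 1 can be checked by hand in the simplest possible setting. The sketch below assumes linear generating functions $f(x, x') = f_1 x + f_2 x'$, for which the equation of the lemma reduces to a $2 \times 2$ linear system for the coefficients of $z = a x + b x'$. The function names and the second generating function `u` are hypothetical choices made for illustration; only `t`, which generates the shear $x' = x + y$, $y' = y$, corresponds to a map mentioned in the text. The result is compared with the direct matrix product for $TUT$.

```python
from fractions import Fraction as Fr


def lemma1_linear(t, u):
    """t, u: coefficient pairs (f1, f2) of f(x, x') = f1*x + f2*x'.
    Returns the coefficients of z and of u_hat = t(-Sz, x')."""
    t1, t2 = t
    u1, u2 = u
    s = t2 + u2
    # With z = a*x + b*x' one has Sz = -b*x - a*x', and
    # t(x, z) - u(Sz, -z) = 0 splits into
    #   x-coefficient : t1 + s*a + u1*b = 0
    #   x'-coefficient: u1*a + s*b = 0
    det = s * s - u1 * u1
    a = -t1 * s / det
    b = t1 * u1 / det
    u_hat = (t1 * b, t1 * a + t2)  # uses -Sz = b*x + a*x'
    return (a, b), u_hat


def map_matrix(f):
    """Matrix of the linear map (x, y) -> (x', y') defined by
    y' = f(x, x') and y = Sf(x, x') = f(-x', -x)."""
    f1, f2 = f
    # y = -f1*x' - f2*x  gives  x' = -(f2/f1)*x - (1/f1)*y
    return ((-f2 / f1, -1 / f1),
            (f1 - f2 * f2 / f1, -f2 / f1))


def matmul(p, q):
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


t = (Fr(-1), Fr(1))  # f = x' - x generates the shear x' = x + y, y' = y
u = (Fr(-1), Fr(2))  # hypothetical second symmetric twist map

z, u_hat = lemma1_linear(t, u)
T, U = map_matrix(t), map_matrix(u)
direct = matmul(T, matmul(U, T))  # the composition T U T

print(z, u_hat)
print(map_matrix(u_hat) == direct)
```

In this linear toy case the intermediate function comes out as $z = (3x + x')/8$, and the map generated by $\hat u$ coincides with the matrix product.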

2.1.12. TUTUT. The case of the map $TUTUT$ is treated in the same way. Consider the diagram

$$\begin{pmatrix} x \\ y \end{pmatrix} \xrightarrow{\;T\;} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \xrightarrow{\;U\;} \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} \xrightarrow{\;T\;} \begin{pmatrix} x_3 \\ y_3 \end{pmatrix} \xrightarrow{\;U\;} \begin{pmatrix} x_4 \\ y_4 \end{pmatrix} \xrightarrow{\;T\;} \begin{pmatrix} x' \\ y' \end{pmatrix}.$$

The compatibility conditions

$$\begin{aligned}
t(x, x_1) &= y_1 = u(-x_2, -x_1), \\
u(x_1, x_2) &= y_2 = t(-x_3, -x_2), \\
t(x_2, x_3) &= y_3 = u(-x_4, -x_3), \\
u(x_3, x_4) &= y_4 = t(-x', -x_4)
\end{aligned}$$

lead to a system for four unknown functions $x_j = v_j(x, x')$, $j = 1, 2, 3, 4$:

$$t(x, v_1) - u(-v_2, -v_1) = 0, \tag{4}$$
$$t(-v_3, -v_2) - u(v_1, v_2) = 0, \tag{5}$$
$$t(v_2, v_3) - u(-v_4, -v_3) = 0, \tag{6}$$
$$t(-x', -v_4) - u(v_3, v_4) = 0. \tag{7}$$

Applying the operator $S$ to this system, followed by exchanging (4) with (7) and (5) with (6), has the same effect as the substitution

$$(v_1, v_2, v_3, v_4) \mapsto (-Sv_4, -Sv_3, -Sv_2, -Sv_1).$$

By the same reasoning as before, we may put $v_4 = -Sv_1$ and $v_3 = -Sv_2$. Writing $(v, w)$ instead of $(v_1, v_2)$, we can summarise as follows:

Lemma 2. Let $v(x, x')$ and $w(x, x')$ solve the equations

$$t(x, v(x, x')) - u(-w(x, x'), -v(x, x')) \equiv 0, \qquad t(Sw(x, x'), -w(x, x')) - u(v(x, x'), w(x, x')) \equiv 0.$$

Then the function $\hat t(x, x') := t(-Sv(x, x'), x')$ generates the map $TUTUT$.

This is proved as Lemma 1.
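As a sanity check of Lemma 2, the two equations can be solved exactly when $t$ and $u$ are linear, $f(x, x') = f_1 x + f_2 x'$. This is only a sketch under that assumption; the helper names and the choice of `u` are hypothetical. Writing $v = a x + b x'$ and $w = c x + d x'$, the equations become a $4 \times 4$ linear system, and the map generated by $\hat t$ is compared with the matrix product for $TUTUT$.

```python
from fractions import Fraction as Fr


def solve(matrix, rhs):
    """Gauss-Jordan elimination over the rationals (assumes a regular system)."""
    n = len(matrix)
    m = [[Fr(x) for x in row] + [Fr(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col:
                m[r] = [x - m[r][col] * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


def lemma2_linear(t, u):
    """Coefficients of t_hat = t(-Sv, x') for linear t and u."""
    t1, t2 = t
    u1, u2 = u
    s = t2 + u2
    # Unknowns (a, b, c, d) with v = a*x + b*x', w = c*x + d*x'; Sw = -d*x - c*x'.
    rows = [[s, 0, u1, 0],    # x  coefficient of t(x, v) - u(-w, -v)
            [0, s, 0, u1],    # x' coefficient of the same equation
            [u1, 0, s, t1],   # x  coefficient of u(v, w) - t(Sw, -w)
            [0, u1, t1, s]]   # x' coefficient of the same equation
    a, b, c, d = solve(rows, [-t1, 0, 0, 0])
    return (t1 * b, t1 * a + t2)  # uses -Sv = b*x + a*x'


def map_matrix(f):
    """Matrix of the map defined by y' = f(x, x'), y = f(-x', -x)."""
    f1, f2 = f
    return ((-f2 / f1, -1 / f1),
            (f1 - f2 * f2 / f1, -f2 / f1))


def matmul(p, q):
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


t = (Fr(-1), Fr(1))  # generates the shear x' = x + y, y' = y
u = (Fr(-1), Fr(2))  # hypothetical second symmetric twist map

t_hat = lemma2_linear(t, u)
T, U = map_matrix(t), map_matrix(u)
direct = matmul(T, matmul(U, matmul(T, matmul(U, T))))  # T U T U T

print(t_hat, map_matrix(t_hat) == direct)
```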

3. Rescaling Factors

We have to define the rescaling factors $\alpha$ and $\beta$ in terms of the generating functions $u$ and $t$. The definition is exactly as described in [8, §4]. We reproduce it here for the sake of completeness but dispense with the motivation given there. We put

$$\xi := 0.47003103, \qquad \eta := 0.01489080, \qquad \beta_0 := -28.846548.$$

(More precisely, the left hand sides represent fixed numerical constants which are represented by IEEE double precision floating point numbers close to the decimal fractions on the right hand sides.) The rescaling factor $\alpha$ is found by solving the equation

$$\eta = t\Big(\frac{\xi}{\alpha}, \xi\Big).$$

The rescaling factor $\beta$ is defined by

$$\beta = -\frac{\beta_0}{t(0, 0)}.$$
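The two definitions above can be sketched numerically. The function `t_toy` below is a hypothetical stand-in for the generating function $t$ (it is not the function from the proof, so the resulting numbers carry no meaning), and plain bisection in floating point is an assumption of this sketch; the proof itself relies on rigorous interval arithmetic.

```python
XI, ETA, BETA0 = 0.47003103, 0.01489080, -28.846548


def t_toy(x, xp):
    """Hypothetical stand-in for the generating function t."""
    return xp - x - 0.62 + 0.05 * x * xp


def solve_alpha(t, lo=-10.0, hi=-1.0, tol=1e-14):
    """Solve eta = t(xi / alpha, xi) for alpha by bisection on [lo, hi]."""
    def residual(alpha):
        return t(XI / alpha, XI) - ETA

    r_lo = residual(lo)
    if r_lo * residual(hi) > 0:
        raise ValueError("no sign change on the bracketing interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        r_mid = residual(mid)
        if r_lo * r_mid <= 0:
            hi = mid
        else:
            lo, r_lo = mid, r_mid
    return 0.5 * (lo + hi)


alpha = solve_alpha(t_toy)
beta = -BETA0 / t_toy(0.0, 0.0)   # beta = -beta_0 / t(0, 0)
print(alpha, beta)
```

Only function values of $t$ enter, in line with the absence of derivatives in these definitions.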

Notice that these definitions do not involve any derivatives of the generating functions.

4. Implicit Equations

4.1. Induced renormalisation.

4.1.13. Conceptual procedure. Conceptually, the induced renormalisation operator $(u, t) \mapsto (\tilde u, \tilde t)$ is defined as follows:

• Construct the generating function $\hat u$ of $TUT$ using Lemma 1 and the generating function $\hat t$ of $TUTUT$ using Lemma 2.
• Compute the rescaling factors $\alpha$ and $\beta$ following the preceding section.

The renormalised pair $(\tilde u, \tilde t)$ is then given by the equations

$$\tilde u(x, x') = \beta\, \hat u\Big(\frac{x}{\alpha}, \frac{x'}{\alpha}\Big), \qquad \tilde t(x, x') = \beta\, \hat t\Big(\frac{x}{\alpha}, \frac{x'}{\alpha}\Big),$$

which reflect the renormalisation $\tilde U = BTUTB^{-1}$, $\tilde T = BTUTUTB^{-1}$ (with $B = \mathrm{diag}(\alpha, \beta)$) of the generated maps.

4.1.14. Computational procedure. Computationally, however, it is simpler to put the rescaling factors into the implicit equations and to solve directly for the "rescaled" functions $z(x/\alpha, x'/\alpha)$, $v(x/\alpha, x'/\alpha)$, and $w(x/\alpha, x'/\alpha)$. (In this, we are following [1, p. 13].) From now on, with a slight abuse of notation, we denote these rescaled functions by $z$, $v$, and $w$.

This, then, is the final definition of the function $\tilde u$: Having determined the rescaling factors $\alpha$ and $\beta$, find $z(x, x')$ by solving the implicit equation

$$t\Big(\frac{x}{\alpha}, z(x, x')\Big) - u(Sz(x, x'), -z(x, x')) = 0, \tag{8}$$

and then put

$$\tilde u(x, x') := \beta\, t\Big(-Sz(x, x'), \frac{x'}{\alpha}\Big).$$

For constructing the function $\tilde t$, find $v(x, x')$ and $w(x, x')$ by solving the implicit system

$$t\Big(\frac{x}{\alpha}, v(x, x')\Big) - u(-w(x, x'), -v(x, x')) \equiv 0, \qquad t(Sw(x, x'), -w(x, x')) - u(v(x, x'), w(x, x')) \equiv 0, \tag{9}$$

and then put

$$\tilde t(x, x') := \beta\, t\Big(-Sv(x, x'), \frac{x'}{\alpha}\Big).$$

4.2. Solving the implicit equation. In order to solve Eq. (8), we note that it is of the form

$$F(x, x', z(x, x'), Sz(x, x')) = 0, \tag{10}$$

where $F(x, x', z, w)$ is an analytic function of four variables. The solution, which is based on a variant of Newton's method, was described in [8, §5]; we reproduce it here for the sake of completeness.

Let us simplify notation by writing $F(z) := F(x, x', z(x, x'), Sz(x, x'))$. Obviously, $F(z)$ is a function of $x$ and $x'$, and we can apply the operator $S$ to it:

$$S(F(z)) = F(-x', -x, Sz(x, x'), z(x, x')).$$

Furthermore, we introduce the partial derivatives

$$F_z(z) := F_z(x, x', z(x, x'), Sz(x, x')), \qquad F_w(z) := F_w(x, x', z(x, x'), Sz(x, x')) \tag{11}$$

of $F$ with respect to $z$ and $w = Sz$, and define

$$\Delta(z) = S(F_z(z)) \cdot F_z(z) - S(F_w(z)) \cdot F_w(z).$$

Let $z_0$ be an approximate solution of the implicit equation such that $\Delta(z_0) \neq 0$, and consider the operator

$$\Phi(z) = z - \Delta(z_0)^{-1} \big( S(F_z(z_0)) \cdot F(z) - F_w(z_0) \cdot S(F(z)) \big).$$

We claim that any fixed point $z^*$ of $\Phi$ solves the implicit Eq. (10). Indeed, assume that $z^*$ is such a fixed point:

$$\Phi(z^*) = z^* = z^* - \Delta(z_0)^{-1} \big( S(F_z(z_0)) \cdot F(z^*) - F_w(z_0) \cdot S(F(z^*)) \big).$$

Since $\Delta(z_0)^{-1} \neq 0$, it follows that

$$S(F_z(z_0)) \cdot F(z^*) - F_w(z_0) \cdot S(F(z^*)) = 0.$$

Operating with $S$ on this equation yields

$$F_z(z_0) \cdot S(F(z^*)) - S(F_w(z_0)) \cdot F(z^*) = 0.$$

The last two equations are a linear system for $(F(z^*), S(F(z^*)))$, the determinant of which is equal to $\Delta(z_0)$. Since, by assumption, $\Delta(z_0) \neq 0$, it follows that $F(z^*) = S(F(z^*)) = 0$ as claimed.

It is straightforward to write down the derivative of $\Phi$ and to write the code to estimate its norm. Existence of a fixed point $z^*$ is proved by applying the contraction mapping principle.

4.3. Solving the implicit system. The implicit system (9) is solved in the same way. It can also be written in the form (10), where now the functions $z$ and $F$ are 2-vector-valued. Unfortunately, applying the above method (with this new interpretation) verbatim does not work, because the partial derivatives (11) of $F$ are now matrices which do not commute. In order to show how this difficulty is overcome, we describe the pertinent variant of Newton's method in more detail. (The following can also serve as a motivation for the preceding subsection.)

Let $\delta z$ denote a small perturbation of $z$. Neglecting non-linear terms, we have

$$F(z + \delta z) = F(z) + F_z\, \delta z + F_w\, \delta Sz.$$

Newton's method amounts to putting the right hand side to zero and solving for $\delta z$. Notice, incidentally, that $\delta Sz = S\delta z$. The first equation for $\delta z$ is

$$F + F_z\, \delta z + F_w\, \delta Sz = 0.$$

Operating with $S$ yields the second equation

$$SF + SF_w\, \delta z + SF_z\, \delta Sz = 0.$$

We are thus facing a system of the form

$$a\, \delta z + b\, S\delta z = -F(z), \qquad c\, \delta z + d\, S\delta z = -SF(z), \tag{12}$$

where the coefficients are $2 \times 2$ matrices (or, rather, matrix-valued functions of the variables $x$ and $x'$). In detail, we have

$$a = F_z, \qquad b = F_w, \qquad c = SF_w, \qquad d = SF_z;$$

notice that $c = Sb$ and that $d = Sa$. Eliminating $S\delta z$ from (12) yields

$$\delta z = (a - bd^{-1}c)^{-1} (-F(z) + bd^{-1} SF(z)), \tag{13}$$

provided, of course, that $d$ and $a - bd^{-1}c$ are invertible. (We cannot simplify this in the usual way, since matrix multiplication is not commutative.) In the same way, elimination of $\delta z$ from (12) yields

$$S\delta z = (d - ca^{-1}b)^{-1} (-SF(z) + ca^{-1} F(z)). \tag{14}$$

Notice that

$$S(d - ca^{-1}b) = S(Sa - Sb\, a^{-1} b) = a - b(Sa)^{-1} Sb = a - bd^{-1}c.$$

Therefore, the matrix $d - ca^{-1}b$ is invertible if and only if $a - bd^{-1}c$ is invertible, in which case the system (12) is regular. Notice, moreover, that $S(bd^{-1}) = ca^{-1}$: operating with $S$ on (13) transforms it into (14), which is also obvious from the construction of these equations.

Let $z_0$ be an approximate solution of the implicit equation. We put

$$a = F_z(z_0), \qquad b = F_w(z_0), \qquad c = SF_w(z_0), \qquad d = SF_z(z_0)$$

and

$$\beta = bd^{-1}, \qquad \Delta = a - \beta c.$$

We have shown that the system (12) is regular if and only if $\Delta$ is invertible. Assuming that, in fact, $\Delta$ is invertible, we now consider the operator

$$\Phi(z) = z - \Delta^{-1} \big( F(z) - \beta S(F(z)) \big)$$

and claim that any fixed point $z^*$ of $\Phi$ solves the implicit system (10). Indeed, assuming that $z^*$ is such a fixed point, we have

$$z^* = \Phi(z^*) = z^* - \Delta^{-1} \big( F(z^*) - \beta S(F(z^*)) \big).$$

This implies that

$$0 = \Delta^{-1} (-F(z^*) + \beta SF(z^*)),$$

which is precisely Eq. (13) for $(F(z^*), SF(z^*))$ with vanishing left hand side. Operating with $S$ on this equation yields

$$0 = (S\Delta)^{-1} (-SF(z^*) + (S\beta) F(z^*)),$$

which is, of course, Eq. (14) for $(F(z^*), SF(z^*))$ with vanishing left hand side. Since (13) and (14) define the inverse of the regular system (12), it follows that $(F(z^*), SF(z^*))$ vanishes identically. In particular, $F(z^*) = 0$, as claimed. Again, existence of a fixed point $z^*$ is proved using the contraction mapping principle.
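The elimination formulas (13) and (14) are ordinary block elimination with non-commuting blocks, and can be verified on constant matrices. The sketch below is purely linear algebra: the matrices are arbitrary illustrative choices, the relations $c = Sb$ and $d = Sa$ are not modelled, and the vector `G` simply plays the role of $SF(z)$ while `e` plays the role of $S\delta z$.

```python
from fractions import Fraction as Fr


def mul(p, q):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def apply(p, v):
    """Matrix times 2-vector."""
    return tuple(sum(p[i][k] * v[k] for k in range(2)) for i in range(2))


def inv(p):
    """Inverse of a 2x2 matrix (assumed regular)."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return ((p[1][1] / det, -p[0][1] / det),
            (-p[1][0] / det, p[0][0] / det))


def sub(p, q):
    return tuple(tuple(x - y for x, y in zip(r, s)) for r, s in zip(p, q))


def vadd(v, w):
    return tuple(x + y for x, y in zip(v, w))


def vneg(v):
    return tuple(-x for x in v)


# Arbitrary test data (hypothetical, chosen so that all inverses exist).
a = ((Fr(2), Fr(1)), (Fr(0), Fr(3)))
b = ((Fr(1), Fr(2)), (Fr(1), Fr(0)))
c = ((Fr(0), Fr(1)), (Fr(1), Fr(1)))
d = ((Fr(4), Fr(1)), (Fr(1), Fr(2)))
F = (Fr(1), Fr(-2))
G = (Fr(3), Fr(5))

beta = mul(b, inv(d))            # beta = b d^{-1}
delta = sub(a, mul(beta, c))     # Delta = a - beta c

# Eq. (13): dz = Delta^{-1} (-F + beta G)
dz = apply(inv(delta), vadd(vneg(F), apply(beta, G)))
# Eq. (14): e = (d - c a^{-1} b)^{-1} (-G + c a^{-1} F)
ca_inv = mul(c, inv(a))
e = apply(inv(sub(d, mul(ca_inv, b))), vadd(vneg(G), apply(ca_inv, F)))

print(mul(a, b) != mul(b, a))    # the blocks do not commute
print(vadd(apply(a, dz), apply(b, e)) == vneg(F))
print(vadd(apply(c, dz), apply(d, e)) == vneg(G))
```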

5. Domains

As in all these proofs, the choice of the domains of the relevant functions (here $u$ and $t$) is crucial both conceptually and computationally. Conceptually, the domains implicitly define the underlying Banach space, see below; the basic requirement is obviously that the operator be well defined on an open neighbourhood of a suitable approximate fixed point (usually given as a polynomial). Computationally, the domains critically influence the condition of the fixed point problem.

Good domains are found experimentally. How this is done in a slightly simpler case is described in [8, §6]. Again, it turns out that squares parallel to the axes and centred on the symmetry line $\{x' = -x\}$ do the job. Denoting the $x$-coordinates of the centres of the domains of $u$ and $t$ by $c_u$ and $c_t$ respectively and their half-lengths by $\tau_u$ and $\tau_t$ respectively, we represent $u$ and $t$ by the power series

$$u(x, x') = \sum_{i,j} u_{ij} \Big( \frac{x - c_u}{\tau_u} \Big)^i \Big( \frac{x' + c_u}{\tau_u} \Big)^j, \qquad t(x, x') = \sum_{i,j} t_{ij} \Big( \frac{x - c_t}{\tau_t} \Big)^i \Big( \frac{x' + c_t}{\tau_t} \Big)^j.$$

The space of pairs $(u, t)$ is normed by the usual $\ell^1$-norm

$$\|(u, t)\| = \sum_{ij} |u_{ij}| + |t_{ij}|.$$

The following values were used in the proof:

$$\begin{aligned}
c_u &= 0.4719577461, \\
\tau_u &= 0.4682737434, \\
c_t &= -0.2041532767, \\
\tau_t &= 0.6181003581.
\end{aligned}$$
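The role of the domains and of the norm can be illustrated with a small sketch. The centres and half-lengths below are the values quoted above; the coefficient tables `u_toy` and `t_toy` are hypothetical low-degree examples, not the fixed point. The point of the normalisation is that both scaled variables have modulus at most 1 on the domain, so the $\ell^1$-norm of the coefficients bounds the function there.

```python
C_U, TAU_U = 0.4719577461, 0.4682737434
C_T, TAU_T = -0.2041532767, 0.6181003581


def evaluate(coeffs, c, tau, x, xp):
    """Sum of coeffs[(i, j)] * ((x - c)/tau)**i * ((xp + c)/tau)**j."""
    p, q = (x - c) / tau, (xp + c) / tau
    return sum(cij * p**i * q**j for (i, j), cij in coeffs.items())


def l1_norm(u_coeffs, t_coeffs):
    """Norm of the pair: sum of |u_ij| plus sum of |t_ij|."""
    return (sum(abs(v) for v in u_coeffs.values())
            + sum(abs(v) for v in t_coeffs.values()))


# Hypothetical coefficients, for illustration only.
u_toy = {(0, 0): 0.3, (1, 0): -0.5, (0, 1): 0.4, (1, 1): 0.05}
t_toy = {(0, 0): -1.0, (1, 0): -0.6, (0, 1): 0.7, (2, 0): 0.02}

norm = l1_norm(u_toy, t_toy)
centre_u = evaluate(u_toy, C_U, TAU_U, C_U, -C_U)                    # centre of the square
corner_u = evaluate(u_toy, C_U, TAU_U, C_U + TAU_U, -C_U - TAU_U)    # a corner of the square
print(norm, centre_u, corner_u)
```

At the centre $(c_u, -c_u)$ only the constant coefficient survives, and at any point of the square the value of the toy function is bounded by the sum of the absolute values of its coefficients.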

Acknowledgement. I am indebted to my former advisor Oscar E. Lanford III for encouragement and for substantial help in implementing the routines for doing rigorous arithmetic. The bulk of this work was done while I was a Research Fellow at the University of Exeter, funded by EPSRC grant GR/H38386; I am deeply grateful that I had the opportunity to work under such favourable conditions. The finishing touches were applied during a one-month invited stay at the Forschungsinstitut für Mathematik of the ETH Zürich. The director of this institute, Prof. Jürgen Moser, kindly granted me access to the powerful computers of the ETH, without which this proof would have been impossible. Last but not least, I wish to thank my mentor Andrew H. Osbaldestin and his research student Andrew Burbanks for their interest in my work. Their questions and comments helped me to understand better what I was trying to do.

References

1. Eckmann, J.-P., Koch, H., Wittwer, P.: A computer-assisted proof of universality for area-preserving maps. Mem. Am. Math. Soc. ISSN 0065-9266, Volume 47, Number 289, 1984
2. Katok, A.: Periodic and Quasi-Periodic Orbits for Twist Maps. Springer Lecture Notes in Physics 179, 47–65 (1983). Reprinted in MacKay, R.S., Meiss, J.D. (eds.): Hamiltonian Dynamical Systems. Bristol and Philadelphia: Adam Hilger, 1987
3. Lanford III, O.E.: Computer-Assisted Proofs in Analysis. Proceedings of the International Congress of Mathematicians, Berkeley, California, USA, 1986
4. MacKay, R.S.: Renormalisation in area-preserving maps. Advanced series in nonlinear dynamics, Vol. 6, ISBN 981-02-1371-9. Singapore: World Scientific, 1993
5. Stirnemann, A.: Renormalization for Golden Circles. PhD thesis, Diss. ETH No. 9843
6. Stirnemann, A.: Renormalization for Golden Circles. Commun. Math. Phys. 152, 369–431 (1993)
7. Stirnemann, A.: Existence of the Siegel Disc Renormalization Fixed Point. Nonlinearity 7, 959–974 (1994)
8. Stirnemann, A.: A Step Towards an Existence Proof of MacKay's Fixed Point. Preprint 95-031, Mathematics Department, University of Edinburgh, 1995

Communicated by A. Jaffe

Commun. Math. Phys. 188, 737 – 751 (1997)

Communications in Mathematical Physics © Springer-Verlag 1997

Multiple Instantons Representing Higher-Order Chern–Pontryagin Classes

Joel Spruck1,*, D. H. Tchrakian2,**, Yisong Yang3,***

1 Department of Mathematics, Johns Hopkins University, Baltimore, MD 21218, USA
2 Department of Mathematical Physics, St. Patrick’s College at Maynooth, Maynooth, Ireland, and School of Theoretical Physics, Dublin Institute for Advanced Studies, Dublin 4, Ireland
3 Department of Applied Mathematics and Physics, Polytechnic University, Brooklyn, New York 11201, USA

Received: 20 May 1996 / Accepted: 30 April 1997

Abstract: It has been shown in the work of Chakrabarti, Sherry and Tchrakian that the chiral $SO_\pm(4p)$ Yang–Mills theory in Euclidean $4p$ ($p \ge 2$) dimensions allows an axially symmetric self-dual system of equations similar to Witten’s instanton equations in the classical 4-dimensional $SU(2) \sim SO_\pm(4)$ theory, and that the solutions represent a new class of instantons. However, the rigorous existence of these higher-dimensional instanton solutions has remained open except for the solution of unit charge representing a single instanton. In this paper we establish an existence and uniqueness theorem for multi-instantons of arbitrary charges in the case $p \ge 2$. These solutions are the first known instantons, with the Chern–Pontryagin index greater than one, of the Yang–Mills model in higher dimensions. Our approach is a study of a nonlinear variational equation defined on the Poincaré half plane.

* Research supported partially by NSF under grant DMS-9403918
** Research supported partially by CEC under grant HCM–ERBCHRXCT930362
*** Research supported partially by NSF under grant DMS–9596041

1. Introduction

It is well known that the classical Yang–Mills theory in $\mathbb{R}^4$ allows an important family of energy-minimizing solutions called instantons which are topologically characterized by the second Chern index, $c_2$. The work of Tchrakian [20–22] shows that one can systematically develop the Yang–Mills theory in $4p$ dimensions so that the higher-dimensional instantons are characterized by the $2p$th Chern index, $c_{2p}$, for any $p = 1, 2, \ldots$. Indeed the unit topological charge spherically symmetric solutions for all $p$, including the $p = 1$ case found by Belavin, Polyakov, Tyupkin and Schwartz (BPST) [3], were found explicitly in a unified way in [21]. For the Yang–Mills theory on $\mathbb{R}^4$, Witten [25] imposed an interesting axial symmetry and constructed explicitly a special family of solutions for which $N$ instantons are aligned along an axis. He called such solutions pseudoparticles, which are afterwards commonly referred to as “Witten’s Instantons” [1]. Chakrabarti, Sherry and Tchrakian [7] were able to extend Witten’s axial symmetry to the Yang–Mills theory on $\mathbb{R}^{4p}$ (for all $p \ge 2$) and to obtain a class of 1-instantons explicitly. Recently, Burzlaff, Chakrabarti and Tchrakian have argued [4], by a numerical approach and asymptotic analysis, that $N$-instantons of the Witten type should exist on $\mathbb{R}^{4p}$ ($p \ge 2$) as well for any integer $N$. Our purpose in the present paper is to prove rigorously the existence of such solutions for $p \ge 2$. For fixed $N$ these solutions of course belong to the topological class $c_{2p} = N$.

We must point out that the structure of the instanton equations in $\mathbb{R}^{4p}$ ($p \ge 2$) is different from that in $\mathbb{R}^4$ studied by Witten [25]. In the latter case the key ingredient in Witten’s construction is the use of a conformal invariance of the reduced equation so that it can be transformed into a Liouville type equation under a suitable change of variables and hence solutions are found explicitly. In the former case an analogous reduction is not available and an explicit construction is impossible. Thus we must rely on nonlinear functional analysis to establish the existence of such $N$-instanton solutions. As in the problem of Witten [25], the equation to be studied is defined on the Poincaré half plane with the hyperbolic metric $ds^2 = r^{-2}(dr^2 + dt^2)$. The two major difficulties are the nonlinearity of the reduced elliptic equation and the control of the solution near the boundary of the half plane. We will see that the first difficulty can be overcome by a change of dependent variable and a suitable extension of the function that defines the right-hand side of the equation, and that the second difficulty can be overcome by a limiting argument.

The rest of the paper is organized as follows. In Sect. 2 we recall the $4p$-dimensional Yang–Mills hierarchy [20–22] on $\mathbb{R}^{4p}$ and the self-dual equations of [7] subject to the axial symmetry of Witten.
We then state our main existence theorem. In Sect. 3 we prove the existence of a weak solution for the associated nonlinear elliptic equation on the Poincaré half plane. We shall first solve the problem over any given bounded domain $\Omega$ and then recognize that the solution is a global minimizer of the energy functional restricted to $\Omega$. This simple observation allows us to control the energy as $\Omega \to$ the full space. We next show that as $\Omega \to$ the full space a weak solution of the original equation is obtained in the limit. In fact this weak solution is a classical solution for which only the expected boundary conditions are to be established. In Sect. 4 we show that the desired boundary conditions for the obtained solution may be recovered. Section 5 presents a general discussion.

2. N-Instantons of the Witten Type

Following [20] we use $F(2p)$ to denote the totally anti-symmetrized $p$-fold product of the curvature 2-form $F(2) = F_{\mu\nu}$ of the Yang–Mills theory on $\mathbb{R}^{4p}$ with the gauge group $SO_\pm(4p)$. Then the conformally invariant Yang–Mills action

\[ S = \int_{\mathbb{R}^{4p}} \operatorname{tr} F(2p)^2 \, dx \]

is minimized by the solutions of the self-dual equations $F(2p) = \star F(2p)$, where $\star$ is the Hodge dual. The spherically symmetric solutions of this equation for all $p$ were given in [21] and the particular case with $p = 1$ was independently considered by


Grossman, Kephart and Stasheff [10], but not in the context of generalised Yang–Mills [20–22] dynamics.

Let $t = x_{4p}$ and $r = \sqrt{x_j x_j}$ ($j = 1, 2, \ldots, 4p-1$). The axial symmetry in [7, 4] generalizing that of Witten [25] involves the imposition of spherical symmetry in the $(4p-1)$-dimensional subspace of $\mathbb{R}^{4p}$, and the latter is adapted from Schwartz’s calculus of symmetric gauge fields [18]. This involves the evaluation of the components of the symmetric connection at the north pole of the $(d-1)$-dimensional sphere $S^{d-1}$ in $\mathbb{R}^d$, which is sufficient since all the interesting quantities like Lagrange densities and Chern classes are gauge and rotationally invariant, while the dynamical equations involving curvatures and covariant derivatives are gauge covariant.

Subjecting the $SO_\pm(4p)$ gauge field to spherical symmetry in the $(4p-1)$-dimensional subspace of $\mathbb{R}^{4p}$ results in a residual $U(1)$ gauge field interacting via a $U(1)$ covariant derivative with a dimensionless complex scalar field on $\mathbb{R}^2_+$ whose coordinates are $x_i = (r, t)$ introduced above, while the components on $S^{4p-2}$ are given explicitly [18]. The dependence on the $4p-2$ angular coordinates $x_a$, $a = 3, 4, \ldots, 4p$, is suppressed when the components of the gauge field $(F_{ij}, F_{ia}, F_{ab})$ are evaluated on the north pole. The complex scalar field can be regarded as a Higgs field since the residual subsystem resulting from the dimensional descent features a symmetry breaking self-interaction potential in terms of this complex field. We refer the reader to [7, 4] and references therein for the details, and simply state the components of this curvature on $\mathbb{R}^2_+ \times S^{4p-2}$ here,

\[ F_{ij}|_{\mathrm{n.p.}} = \frac{i}{2} f_{ij}\, \Gamma_{4p+1}, \]
\[ F_{ia}|_{\mathrm{n.p.}} = -\frac{i}{2}\left( D_i\phi\, \Sigma_a^+ - \overline{D_i\phi}\, \Sigma_a^- \right), \]
\[ F_{ab}|_{\mathrm{n.p.}} = -(1 - |\phi|^2)\, \Gamma_{ab}. \]

Here $\phi$ is a complex scalar field, $f_{ij} = \partial_{[i} a_{j]} = \partial_i a_j - \partial_j a_i$ is the residual $U(1)$ curvature and $D_i\phi = \partial_i\phi + i a_i \phi$ the corresponding covariant derivative, all on $\mathbb{R}^2_+$. The fields $F(2)$ are evaluated on the north pole (n.p.). $\Gamma_a$ are the gamma matrices and $\Gamma_{4p+1}$ is the chirality matrix in $4p-1$ dimensions, while $\Sigma_a^\pm = \frac{1}{2}(1 \pm \Gamma_{4p+1})\Gamma_a$. The $2p$-form curvature $F(2p)$ can be readily calculated and subsequently the Lagrange density. In the last step due account must be taken of the metric $g_{ij} = g^{ij} = \delta_{ij}$ on $\mathbb{R}^2_+$ and $g_{ab} = r^2 \delta_{ab}$, $g^{ab} = r^{-2}\delta^{ab}$ on $S^{4p-2}$, when computing the scalar products $F(2p)^2 = \sqrt{g}\, g^{\mu_1\nu_1} \cdots g^{\mu_{2p}\nu_{2p}} F_{\mu_1 \ldots \mu_{2p}} F_{\nu_1 \ldots \nu_{2p}}$, with $g^{\mu\nu} = \operatorname{diag}(g^{ij}, g^{ab})$ the metric on $\mathbb{R}^2_+ \times S^{4p-2}$. As a result of all this the above Yang–Mills action reduces to the following 2-dimensional action:

\[ S = \int_{\mathbb{R}^2_+} L^{(p)}(a_j, \phi)\, dx, \]

where $L^{(p)}(a_j, \phi)$ is the Lagrangian density of the $p$th residual Abelian Higgs model on $\mathbb{R}^2_+ = \{x = (x_1, x_2) = (r, t) \in \mathbb{R}^2 \mid r > 0\}$ in which $a_j = (a_1, a_2)$ is the gauge connection, $\phi$ is a complex field, $dx = dr\,dt$, $f_{jk} = \partial_j a_k - \partial_k a_j$ is the Abelian curvature 2-form, and $L^{(p)}$ is defined by the expressions

\[ L^{(1)} = r^2 f_{jk}^2 + 4|D_j\phi|^2 + \frac{2}{r^2}(1 - |\phi|^2)^2 \]

and

\[ L^{(p)} = (1 - |\phi|^2)^{2(p-2)} \left( r^2 \left( (1 - |\phi|^2) f_{jk} - i D_{[j}\phi\, \overline{D_{k]}\phi} \right)^2 + a (1 - |\phi|^2)^2 |D_j\phi|^2 + \frac{b}{r^2}(1 - |\phi|^2)^4 \right), \quad p \ge 2. \]

Here $a$, $b$ are positive constants depending on the integer $p$. The Chern–Pontryagin index $c_{2p}$, which is the volume integral of the density

\[ \varepsilon_{\mu_1 \ldots \mu_{2p} \nu_1 \ldots \nu_{2p}} \operatorname{tr} F(2p)_{\mu_1 \ldots \mu_{2p}} F(2p)_{\nu_1 \ldots \nu_{2p}} = \operatorname{tr} \underbrace{F \wedge F \wedge \cdots \wedge F}_{2p \text{ times}}, \]

reduces, for this field configuration, to

\[ \int_{\mathbb{R}^2_+} (1 - |\phi|^2)^{2(p-1)}\, \varepsilon_{ij} \left( (1 - |\phi|^2) f_{ij} + i(2p-1) D_{[i}\phi\, \overline{D_{j]}\phi} \right) dx, \]

up to a normalisation constant. The integrand here is a total divergence $\partial_i \Omega^{(p)}_i$, with

\[ \Omega^{(1)}_i = \varepsilon_{ij} \left( i a_j - \phi\, \partial_j \phi \right), \]
\[ \Omega^{(2)}_i = \varepsilon_{ij} \left( i a_j - \left[ 1 + (1 - |\phi|^2) + (1 - |\phi|^2)^2 \right] \phi\, \partial_j \phi \right) \]

for $p = 1$ and $p = 2$. That the reduced Chern–Pontryagin densities resulting from the imposition of symmetries are total divergences, just like the original densities $\operatorname{tr} F \wedge F \wedge \cdots \wedge F$ are the divergences of the corresponding Chern–Simons densities, is expected, as was demonstrated in detail in [19, 13].

What we have given above is the residual $U(1)$ system for the $p$th Yang–Mills system, and we have found that this is related to the $p$th (generalised) Abelian Higgs model [5] in exactly the same way as Witten’s [25] axially symmetric subsystem is related to the usual Abelian Higgs model, namely the $p = 1$ case here. The proof of existence for the vortex solutions of the hierarchy of Abelian Higgs models [5] was given in [24]. It is important in this connection to stress the qualitative difference between the axially symmetric instantons of the $p$th Yang–Mills system and the vortices of the $p$th Abelian Higgs model. The first model is scale invariant and hence the instantons are power localised to an arbitrary scale, while the second, which is derived from the first by dimensional reduction leading to the introduction of a dimensional constant in the form of the Higgs vacuum expectation value, is exponentially localised to this dimensional absolute scale. Quantitatively, the topological charge densities of the Abelian Higgs models are given by $\Omega^{(1)}$ and $\Omega^{(2)}$ respectively, with the number 1 replaced by the square of the dimensional Higgs vacuum expectation value $\eta^2$, and by multiplying $a_j$ with $\eta^{2p}$. As a result of the exponential decay [5] of the functions $\phi$, only the first $a_j$-dependent terms contribute a non-vanishing amount to the topological charge, which in these cases are the magnetic fluxes of the vortices. In the instanton case, however, each of the terms in $\Omega^{(p)}$ contributes because of the power decay of both $\phi$ and $a_j$.

For the field configurations in the topological class $c_{2p} = N$, the action has a lower bound proportional to $N$. This lower bound is saturated if and only if the self-dual Bogomol’nyi equations [4]

\[ D_j\phi = -i \varepsilon_{jk} D_k\phi, \qquad \frac{2(2p-1)}{r^2}(1 - |\phi|^2)^2 = -\varepsilon_{jk} \left( (1 - |\phi|^2) f_{jk} - i(p-1) D_{[j}\phi\, \overline{D_{k]}\phi} \right), \qquad x \in \mathbb{R}^2_+ \tag{1} \]


are fulfilled, with $D_{[j}\phi\, \overline{D_{k]}\phi} = D_j\phi\, \overline{D_k\phi} - D_k\phi\, \overline{D_j\phi}$. The purpose of the present paper is to obtain solutions of the above equations for all $p \ge 2$, which are the $4p$-dimensional extensions of Witten’s pseudoparticle solutions. It is well known that the topological integer $N$ can be realized by the algebraic number of zeros of the complex field $\phi$. Our main existence theorem may be stated as follows.

Theorem. Let $p \ge 2$ in (1). For any points $p_1, p_2, \ldots, p_N \in \mathbb{R}^2_+$ the system (1) has a finite-action smooth solution $(\phi, a_j)$ so that $\phi$ vanishes exactly at these prescribed points. Such a solution gives rise to an axially symmetric $N$-instanton solution of the Witten type which is characterized by the following uniform boundary conditions imposed on the Higgs field $\phi$:

\[ 0 < 1 - |\phi|^2 = O(r^{2-\varepsilon}) \quad \text{for small } r > 0, \]
\[ 0 < 1 - |\phi|^2 = O(r^{2-\varepsilon} |x|^{-2(2-\varepsilon)}) \quad \text{for large } |x|, \]

where $\varepsilon > 0$ is an arbitrarily small number, and belongs to the topological class $c_{2p} = N$. Besides, in the category of this type of solutions, uniqueness holds.

Remark. In [7] a single instanton solution (for $N = 1$) is found explicitly which makes $|\phi|^2$ take the form

\[ |\phi|^2 = \frac{([r-1]^2 + t^2)([r+1]^2 + t^2)}{(1 + r^2 + t^2)^2}. \]

It is easily seen that the asymptotic estimates in the theorem are consistent with the above expression. Thus these estimates are sharp.

3. Proof of Existence

Let $p_1, p_2, \ldots, p_N \in \mathbb{R}^2_+$ (with possible multiplicities) be as given in the theorem in Sect. 2. Then the substitution $u = \ln|\phi|$ transforms (1) into the equivalent scalar equation

\[ (e^{2u} - 1)\Delta u = \frac{(2p-1)}{r^2}(e^{2u} - 1)^2 - 2(p-1) e^{2u} |\nabla u|^2 - 2\pi \sum_{j=1}^{N} \delta_{p_j}, \qquad x \in \mathbb{R}^2_+, \tag{2} \]

where $\delta_{p_j}$ is the Dirac measure concentrated at $p_j$. We are to look for a solution $u$ of (2) so that $u(x) \to 0$ (hence $|\phi(x)| \to 1$) as $x \to \partial\mathbb{R}^2_+$ or as $|x| \to \infty$. Since the maximum principle implies that $u(x) \le 0$ everywhere, it will be more convenient to use the new variable

\[ v = f(u) = 2(-1)^p \int_0^u (e^{2s} - 1)^{p-1}\, ds, \qquad u \le 0. \tag{3} \]

It is easily seen that

\[ f : (-\infty, 0] \to [0, \infty) \]

is strictly decreasing and convex. For later use, we note that

\[ f'(u) = 2(-1)^p (e^{2u} - 1)^{p-1}, \qquad f''(u) = 4(-1)^p (p-1) e^{2u} (e^{2u} - 1)^{p-2}. \]

Set

\[ u = F(v) = f^{-1}(v), \qquad v \ge 0. \]
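The change of variable (3) and its inverse can be explored numerically. The following is only a minimal sketch: the Simpson quadrature, the bisection bracket `lo=-50.0` and the helper names `f` and `F` as Python functions are illustrative assumptions, not part of the paper. It evaluates $f$ from (3) and inverts it by bisection, which is legitimate because $f$ is strictly decreasing on $(-\infty, 0]$.

```python
import math

def f(u, p, n=2000):
    """v = f(u) = 2(-1)^p * int_0^u (e^{2s}-1)^{p-1} ds for u <= 0 (Simpson's rule, n even)."""
    if u == 0.0:
        return 0.0
    h = u / n  # signed (negative) step, so the sum approximates the integral from 0 to u
    g = lambda s: (math.exp(2.0 * s) - 1.0) ** (p - 1)
    total = g(0.0) + g(u)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return 2.0 * (-1) ** p * total * h / 3.0

def F(v, p, lo=-50.0):
    """u = F(v) = f^{-1}(v) by bisection on [lo, 0]; assumes 0 <= v <= f(lo, p)."""
    hi = 0.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f(mid, p) > v:   # f is decreasing: mid is still too negative
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for p in (2, 3):
        u = -0.7
        v = f(u, p)
        print(p, v, F(v, p))  # F(f(u)) should reproduce u up to numerical error
```

For $p = 2$ the integral in (3) can be done by hand, giving $f(u) = e^{2u} - 2u - 1$, which is a convenient cross-check of the quadrature. Near $u = 0$ one finds $f(u) \approx (2^p/p)\,|u|^p$, which is the relation $v \sim |u|^p$ used at the end of Sect. 4.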


Then Eq. (2) is simplified to

\[ \Delta v = \frac{2(-1)^p (2p-1)}{r^2}(e^{2F(v)} - 1)^p - 4\pi \sum_{j=1}^{N} \delta_{p_j} \quad \text{in } \mathbb{R}^2_+. \tag{4} \]
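The passage from (2) to (4) rests on the chain rule $\Delta v = f'(u)\Delta u + f''(u)|\nabla u|^2$ together with the formulas for $f'$ and $f''$ noted after (3). The sketch below checks this algebra pointwise, away from the points $p_j$, for arbitrary sample values of $u$, $|\nabla u|^2$ and $r$; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import math

def rhs_eq2_regular(u, grad_sq, r, p):
    """Right-hand side of (2) away from the points p_j (no Dirac terms)."""
    e = math.exp(2.0 * u)
    return (2 * p - 1) / r**2 * (e - 1.0) ** 2 - 2 * (p - 1) * e * grad_sq

def laplacian_v_from_eq2(u, grad_sq, r, p):
    """Delta v = f'(u) * Delta u + f''(u) * |grad u|^2, with Delta u taken from (2)."""
    e = math.exp(2.0 * u)
    lap_u = rhs_eq2_regular(u, grad_sq, r, p) / (e - 1.0)
    f1 = 2 * (-1) ** p * (e - 1.0) ** (p - 1)
    f2 = 4 * (-1) ** p * (p - 1) * e * (e - 1.0) ** (p - 2)
    return f1 * lap_u + f2 * grad_sq

def rhs_eq4_regular(u, r, p):
    """Right-hand side of (4) away from the points p_j, written in terms of u = F(v)."""
    return 2 * (-1) ** p * (2 * p - 1) / r**2 * (math.exp(2.0 * u) - 1.0) ** p

if __name__ == "__main__":
    for p in (2, 3, 4):
        a = laplacian_v_from_eq2(-0.8, 1.7, 0.6, p)
        b = rhs_eq4_regular(-0.8, 0.6, p)
        print(p, a, b)  # the two columns should agree up to rounding
```

The gradient term of (2) is cancelled exactly by $f''(u)|\nabla u|^2$, which is the reason the variable $v$ is convenient.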

To approach (4), we introduce its modification of the form

\[ \Delta v = \frac{2(2p-1)}{r^2} R(v) - 4\pi \sum_{j=1}^{N} \delta_{p_j}, \tag{5} \]

where the right-hand-side function $R(v)$ is defined by

\[ R(v) = \begin{cases} (-1)^p (e^{2F(v)} - 1)^p, & v \ge 0, \\ p v, & v < 0. \end{cases} \]
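The matching of the two branches of $R$ at $v = 0$ can be illustrated numerically. The sketch below fixes $p = 2$, for which (3) integrates to $f(u) = e^{2u} - 2u - 1$; this closed form, the bisection bracket and the names `P`, `f`, `F`, `R`, `dR_exact` are assumptions made for the illustration. Differentiating the upper branch with $F'(v) = 1/f'(F(v))$ gives $R'(v) = p\, e^{2F(v)}$ for $v \ge 0$, which tends to $p$ as $v \to 0^+$ and so agrees with the slope of the linear branch.

```python
import math

P = 2  # illustrative choice; the closed form of f below is specific to p = 2

def f(u):
    """f(u) = e^{2u} - 2u - 1, i.e. (3) evaluated in closed form for p = 2."""
    return math.expm1(2.0 * u) - 2.0 * u

def F(v):
    """Inverse of f on (-inf, 0] by bisection; assumes 0 <= v < f(-1000)."""
    lo, hi = -1.0e3, 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > v:      # f is decreasing on (-inf, 0]
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def R(v):
    """Modified nonlinearity of (5): upper branch for v >= 0, linear branch for v < 0."""
    if v >= 0.0:
        return (-1) ** P * math.expm1(2.0 * F(v)) ** P
    return P * v

def dR_exact(v):
    """R'(v) = p e^{2F(v)} for v >= 0 and R'(v) = p for v < 0."""
    return P * math.exp(2.0 * F(v)) if v >= 0.0 else float(P)

if __name__ == "__main__":
    h = 1e-4
    print((R(1.0 + h) - R(1.0 - h)) / (2 * h), dR_exact(1.0))  # derivative at v = 1
    print(R(1e-6) / 1e-6, (R(0.0) - R(-1e-6)) / 1e-6)          # one-sided slopes at 0
```

Both one-sided difference quotients at $v = 0$ come out close to $p$, consistent with $R \in C^1$.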

Then it is straightforward to check that $R \in C^1$. In order to obtain a solution of the original equation (4), it suffices to get a solution of (5) satisfying $v(x) \ge 0$ in $\mathbb{R}^2_+$ and $v(x) \to 0$ as $x \to \partial\mathbb{R}^2_+$ or as $|x| \to \infty$.

The main technical difficulty in (4) or (5) is the singular boundary of $\mathbb{R}^2_+$. We will employ a limiting argument to overcome this difficulty. We first solve (5) on a given bounded domain away from $r = 0$ under the homogeneous Dirichlet boundary condition. It will be seen that the obtained solution is indeed nonnegative and thus (4) is recovered. Such a property also allows us to control its energy and pointwise bounds conveniently. We then choose a sequence of bounded domains to approximate the full $\mathbb{R}^2_+$. The corresponding sequence of solutions is shown to converge to a weak solution of (4). This weak solution is actually a positive classical solution of (4) which necessarily vanishes asymptotically as desired. Then the stated decay rates are established by suitable comparison functions.

To proceed, choose a function, say $v_0$, satisfying the requirement that it is compactly supported in $\mathbb{R}^2_+$ and smooth everywhere except at $p_1, p_2, \ldots, p_N$ so that

\[ \Delta v_0 + 4\pi \sum_{j=1}^{N} \delta_{p_j} = g(x) \in C_0^\infty(\mathbb{R}^2_+). \]

Let $\Omega$ be any given bounded domain containing the support of $v_0$ and $\overline{\Omega} \subset \mathbb{R}^2_+$ (where and in the sequel, all bounded domains have smooth boundaries). Then $v = v_0 + w$ changes (5) into a regular form without the Dirac measure source terms on the right-hand side, which is the equation in the following boundary value problem:

\[ \Delta w = \frac{2(2p-1)}{r^2} R(v_0 + w) - g \quad \text{in } \Omega, \qquad w = 0 \quad \text{on } \partial\Omega. \tag{6} \]

We first apply a variational method to prove the existence of a solution to (6).

Lemma 1. The problem (6) has a unique solution.


Proof. It is seen that (6) is the variational equation of the functional

\[ I(w) = \int_{\Omega} \left\{ \frac{1}{2}|\nabla w|^2 + \frac{2(2p-1)}{r^2} Q(v_0 + w) - g w \right\} dx, \qquad w \in H_0^1(\Omega), \tag{7} \]

where the function $Q(s)$ is defined by

\[ Q(s) = \int_0^s R(s')\, ds' = \begin{cases} \displaystyle\int_0^s (-1)^p (e^{2F(s')} - 1)^p\, ds', & s \ge 0, \\[2mm] \displaystyle\int_0^s p s'\, ds' = \frac{p}{2} s^2, & s < 0, \end{cases} \tag{8} \]

which is positive except at $s = 0$. This property and the Poincaré inequality indicate that the functional (7) is coercive and bounded from below on $H_0^1(\Omega)$. On the other hand, since $F(s) \le 0$ for $s \ge 0$, we have

\[ \left| \frac{d}{ds} Q(s) \right| = |R(s)| \le \max\{1, p|s|\}. \]

This feature says that the functional (7) is continuous on $H_0^1(\Omega)$ because $\overline{\Omega}$ is away from the boundary of $\mathbb{R}^2_+$ and, so, the weight $2(2p-1)/r^2$ is bounded. Besides, the definition of $F(s)$ gives us the result

\[ \frac{d^2}{ds^2} Q(s) = \begin{cases} p\, e^{2F(s)}, & s \ge 0, \\ p, & s < 0, \end{cases} \]

which says that the functional (7) is also convex. Thus, by convex analysis, the functional is weakly lower semicontinuous on $H_0^1(\Omega)$ and the existence and uniqueness of a critical point is ensured. The standard elliptic theory then implies that such a critical point is a classical solution of (6).

Lemma 2. Let $w$ be the solution of (6) obtained in Lemma 1. Then $w$ satisfies $v_0 + w > 0$ in $\Omega$.

Proof. The function $v = v_0 + w$ satisfies (5) and assumes arbitrarily large values near $p_j$ ($j = 1, 2, \ldots, N$). Since $\operatorname{supp}(v_0) \subset \Omega$, we have $v = 0$ on $\partial\Omega$. The fact that $R(v) < 0$ for $v < 0$ and the maximum principle then lead us to the conclusion that $v > 0$ in $\Omega$ as stated.

We now choose a sequence of bounded domains $\{\Omega_n\}$ satisfying

\[ \Omega \subset \Omega_1, \qquad \Omega_n \subset \Omega_{n+1}, \qquad \overline{\Omega}_n \subset \mathbb{R}^2_+, \qquad n = 1, 2, \ldots, \qquad \lim_{n\to\infty} \Omega_n = \mathbb{R}^2_+. \]

Lemma 3. Let $w_n$ be the solution of (6) for $\Omega = \Omega_n$ obtained in Lemma 1 and $I(\cdot\,; \Omega_n)$ be the functional (7) with $\Omega = \Omega_n$. Then we have the monotonicity

\[ I(w_n; \Omega_n) \ge I(w_{n+1}; \Omega_{n+1}), \qquad n = 1, 2, \ldots. \]

Proof. In fact for given $n$ the function $w_n$ is the unique minimizer of $I(\cdot\,; \Omega_n)$ on $H_0^1(\Omega_n)$. Now set $w_n = 0$ on $\Omega_{n+1} - \Omega_n$. Then $w_n \in H_0^1(\Omega_{n+1})$ and $I(w_n; \Omega_n) = I(w_n; \Omega_{n+1})$. However $w_{n+1}$ is the global minimizer of $I(\cdot\,; \Omega_{n+1})$ on $H_0^1(\Omega_{n+1})$. Therefore the stated monotonicity follows.


To see that the energies are bounded from below, we need

Lemma 4. For any $H_0^1(\mathbb{R}^2_+)$ function $w$ there holds the Poincaré inequality

\[ \int_{\mathbb{R}^2_+} \frac{1}{r^2} w^2(x)\, dx \le 4 \int_{\mathbb{R}^2_+} |\nabla w(x)|^2\, dx. \tag{9} \]
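Inequality (9) reduces to a one-dimensional statement in the variable $r$ for each fixed $t$. As a sanity check only, the sketch below evaluates both sides of that one-dimensional inequality, and the integration-by-parts identity $\int_0^\infty r^{-2} w^2\, dr = 2\int_0^\infty r^{-1} w\, w'\, dr$ behind it, for the test profile $w(r) = r^2 e^{-r}$. This profile, the truncation of the integrals to $[10^{-12}, 40]$ and the helper `simpson` are illustrative assumptions; the profile is not compactly supported but decays fast enough for the identity to hold.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def w(r):
    return r * r * math.exp(-r)

def dw(r):
    return (2.0 * r - r * r) * math.exp(-r)

A, B = 1e-12, 40.0  # truncated integration range; the tails are negligible

weighted = simpson(lambda r: w(r) ** 2 / r ** 2, A, B)   # int w^2 / r^2 dr
by_parts = simpson(lambda r: 2.0 * w(r) * dw(r) / r, A, B)  # 2 int (1/r) w w' dr
gradient = simpson(lambda r: dw(r) ** 2, A, B)           # int (w')^2 dr

if __name__ == "__main__":
    print(weighted, by_parts, 4.0 * gradient)
```

For this profile all three integrals can be computed by hand: the weighted integral and the integrated-by-parts form both equal $1/4$, while $4\int (w')^2\, dr = 1$, so the inequality holds with room to spare.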

Proof. For $w \in C_0^1(\mathbb{R}^2_+)$ we have after integration by parts,

\[ \int_0^\infty \frac{1}{r^2} w^2(r, t)\, dr = 2 \int_0^\infty \frac{1}{r} w(r, t) \frac{d}{dr} w(r, t)\, dr. \]

Thus the Schwarz inequality gives us

\[ \int_{\mathbb{R}^2_+} \frac{1}{r^2} w^2(x)\, dx \le 4 \int_{\mathbb{R}^2_+} \left( \frac{dw}{dr} \right)^2 (x)\, dx, \]

which is actually stronger than (9). Thus the lemma follows.

Lemma 5. Let $\{w_n\}$ be the solution sequence stated in Lemma 3. Then $w_n < w_{n+1}$ on $\Omega_n$, $n = 1, 2, \ldots$.

Proof. Set $v_n = v_0 + w_n$. Then Lemma 2 says that $v_n > 0$ in $\Omega_n$. In particular $v_{n+1} > 0$ on $\overline{\Omega}_n$. Thus the equation

\[ \Delta(v_{n+1} - v_n) = \frac{2p(2p-1)}{r^2} e^{2F(\xi_n)} (v_{n+1} - v_n), \]

where $\xi_n$ lies between $v_n$ and $v_{n+1}$, and the boundary property $v_{n+1} - v_n > 0$ on $\partial\Omega_n$ imply that $v_{n+1} - v_n > 0$ in $\Omega_n$ as expected.

Lemma 6. Let $\{w_n\}$ be the sequence stated in Lemma 3. There are positive constants $C_1$, $C_2$ independent of $n = 1, 2, \ldots$ so that

\[ I(w_n; \Omega_n) \ge C_1 \|\nabla w_n\|^2_{L^2(\mathbb{R}^2_+)} - C_2, \qquad n = 1, 2, \ldots. \]

Proof. The expression (8) says that $Q \ge 0$. Since $g$ is of compact support in $\mathbb{R}^2_+$, the Schwarz inequality and Lemma 4 give us

\[ I(w_n; \Omega_n) \ge \frac{1}{4} \int_{\mathbb{R}^2_+} |\nabla w_n|^2\, dx - 4 \int_{\mathbb{R}^2_+} r^2 g^2\, dx. \]

Lemma 7. For a given bounded subdomain $\Omega_0$ with $\overline{\Omega}_0 \subset \mathbb{R}^2_+$, the sequence $\{w_n\}$ is weakly convergent in $H^1(\Omega_0)$. The weak limit, say $w_0$, is a solution of Eq. (6) with $\Omega = \Omega_0$ (neglecting the boundary condition) which satisfies $w_0(x) > 0$.

Proof. Using Lemmas 3 and 6 we see that there is a constant $C > 0$ such that

\[ \sup_n \|\nabla w_n\|^2_{L^2(\mathbb{R}^2_+)} \le C. \tag{10} \]

From (9) and (10) we obtain the boundedness of $\{w_n\}$ in $H^1(\Omega_0)$. Combining this with the monotonicity property stated in Lemma 5 we conclude that $\{w_n\}$ is weakly convergent in $H^1(\Omega_0)$. It then follows from the compact embedding $H^1(\Omega_0) \to L^2(\Omega_0)$


that $R(v_0 + w_n)$ is convergent in $L^2(\Omega_0)$. On the other hand, since for sufficiently large $n$ we have $\Omega_0 \subset \Omega_n$, consequently

\[ \int_{\mathbb{R}^2_+} \left\{ \nabla w_n \cdot \nabla\xi + \frac{2(2p-1)}{r^2} R(v_0 + w_n)\,\xi - g\,\xi \right\} dx = 0, \qquad \forall\, \xi \in C_0^1(\Omega_0). \tag{11} \]

Letting $n \to \infty$ in (11) we see that $w_0$ is a weak solution of (6) (without considering the boundary condition). The standard elliptic regularity theory then implies that it is also a classical (hence, smooth) solution. Since $w_n > 0$, we have $w_0 \ge 0$. The maximum principle then yields $w_0 > 0$ in $\Omega_0$. Thus our lemma follows.

Set $w(x) = w_0(x)$ for $x \in \Omega_0$ for any given $\Omega_0$ stated in Lemma 7. In this way we obtain a global solution of the equation in (6) over the full $\mathbb{R}^2_+$. Lemmas 3 and 6 imply that there is a constant $C > 0$ to make

\[ I(w) \le C, \qquad \|\nabla w\|_{L^2(\mathbb{R}^2_+)} \le C. \tag{12} \]

In the next section we establish the desired asymptotic behavior of the obtained solution $w$. The boundedness result (12) is not sufficient to ensure the decay of $w$ at $r = 0$ and at infinity. We need also to show that $w$ is pointwise bounded as a preparation.

4. The Asymptotics

For technical reasons which will become clear later, we need to show first that the solution $w$ is pointwise bounded. This will be accomplished by the following lemma.

Lemma 8. Let $\{w_n\}$ be the sequence of local solutions stated in Lemma 3 and $\Omega \subset \Omega_1$ be as defined in the last section. There exists a constant $C > 0$ independent of $n$ so that

\[ \sup_{x \in \Omega_n} w_n(x) \le \sup_{x \in \partial\Omega} \{w_n(x)\} + C \sup_{x \in \Omega} |g(x)|, \qquad n = 1, 2, \ldots. \tag{13} \]

Proof. Set $D_n = \Omega_n - \Omega$. We consider $w_n$ on $D_n$ and $\Omega$ separately. Note that $w_n$ satisfies $\Delta w_n \ge -g$ and $v_0 + w_n > 0$ in $\Omega$. Hence the inequality (13) is standard if on the left-hand side the domain $\Omega_n$ is replaced by its subdomain $\Omega$, because $v_0 = 0$ on $\partial\Omega$ implies $w_n(x) > 0$ in view of Lemma 2 (applied to $w_n$). In this situation the constant $C$ only depends on the size of $\Omega$ (cf. [8]). Now consider the other case, $x \in D_n$. Set $\eta_n = \sup\{w_n(x) \mid x \in \partial\Omega\}$. Then the property $v_0 = 0$, $g = 0$ in $D_n$ gives us

\[ \Delta(w_n - \eta_n) \ge \frac{2(2p-1)}{r^2} \left( [-1]^p [e^{2F(w_n)} - 1]^p - [-1]^p [e^{2F(\eta_n)} - 1]^p \right) \quad \text{in } D_n. \tag{14} \]

Since the function $(-1)^p (e^{2F(s)} - 1)^p$ is strictly increasing for $s \ge 0$ and $w_n - \eta_n \le 0$ on $\partial D_n$, we obtain by the maximum principle the result $w_n \le \eta_n$ in $D_n$. Therefore (13) follows immediately.

Lemma 9. Let $w$ be the solution of (6) over the full $\mathbb{R}^2_+$ obtained in the last section. Then $w$ is bounded.


Proof. Since $w_n < w$ in $\Omega_n$, we have in particular

\[ \sup_{x \in \partial\Omega} \{w_n(x)\} < \sup_{x \in \partial\Omega} \{w(x)\}, \qquad n = 1, 2, \ldots. \]

Hence Lemma 8 says that there is a constant $C > 0$ independent of $n$ so that

\[ \sup_{x \in \Omega_n} \{w_n(x)\} \le C, \qquad n = 1, 2, \ldots. \tag{15} \]

A simple application of the embedding theory gives us the pointwise convergence $w_n \to w$ as $n \to \infty$. Thus (15) yields the boundedness of $w$ from above. However, $v_0 + w > 0$ (see Lemma 2) implies already the boundedness of $w$ from below. The lemma is consequently proven.

Lemma 9 enables us to establish the asymptotic behavior of $w$ near infinity and the boundary $r = 0$ as was done for the multi-meron solutions [9, 11, 6, 17]. The proof of the following lemma is adapted from [11].

Lemma 10. Let $w$ be the solution stated in Lemma 9. Then for $x = (r, t) \in \mathbb{R}^2_+$ we have the uniform limits

\[ \lim_{r \to 0} w(x) = \lim_{|x| \to \infty} w(x) = 0. \tag{16} \]

Proof. Given $x = (r, t)$, let $D$ be the disk centered at $x$ with radius $r/2$. The Dirichlet Green’s function $G(x', x'')$ of the Laplacian $\Delta$ on $D$ (satisfying $G(x', x'') = 0$ for $|x'' - x| = r/2$) is defined by the expression

\[ G(x', x'') = \frac{1}{2\pi} \ln \sqrt{|x' - x|^2 + |x'' - x|^2 - 2(x' - x)\cdot(x'' - x)} - \frac{1}{2\pi} \ln \sqrt{\left(\frac{r}{2}\right)^2 + \left(\frac{2|x' - x||x'' - x|}{r}\right)^2 - 2(x' - x)\cdot(x'' - x)}, \]

where $x', x'' \in D$ but $x' \ne x''$. Hence $w$ at $x' \in D$ can be represented as

\[ w(x') = \int_D dx'' \left\{ \frac{2(2p-1)}{r''^2} (e^{2F(v_0 + w)} - 1)^p - g w \right\}(x'')\, G(x', x'') + \int_{\partial D} dS''\, \frac{\partial G}{\partial n''}(x', x'')\, w(x''), \tag{17} \]

where $x'' = (r'', t'')$ and $\partial/\partial n''$ denotes the outer normal derivative on $\partial D$ with respect to the variable $x''$. We need to first evaluate $|r(\nabla_x w)(x)|$. This can be done by differentiating (17) and then setting $x' = x$. Note that

\[ \left( \nabla_{x'} G(x', x'') \right)_{x' = x} = \frac{1}{2\pi} \left( \frac{4}{r^2} - \frac{1}{|x'' - x|^2} \right)(x'' - x), \]
\[ \left( \nabla_{x'} \frac{\partial G}{\partial n''}(x', x'') \right)_{x' = x} = \left( \nabla_{x'} \left[ \frac{x'' - x}{|x'' - x|} \cdot \nabla_{x''} G(x', x'') \right] \right)_{x' = x} = \frac{8}{\pi r^3}(x'' - x), \qquad x'' \in \partial D. \]
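The two properties of $G$ used above, vanishing for $x''$ on $\partial D$ and the formula for $\nabla_{x'} G$ at $x' = x$, can be checked numerically. The following is a minimal sketch with hypothetical helper names (`green`, `grad_at_center`, `grad_formula`) and an arbitrary sample point; the gradient is approximated by central finite differences, which is an assumption of the illustration rather than anything used in the proof.

```python
import math

def green(xp, xpp, x, r):
    """Dirichlet Green's function on the disk of radius r/2 centred at x.

    Uses (1/2pi) ln sqrt(A) = (1/4pi) ln A for the two square roots in the text.
    """
    ax, ay = xp[0] - x[0], xp[1] - x[1]
    bx, by = xpp[0] - x[0], xpp[1] - x[1]
    a2, b2, ab = ax * ax + ay * ay, bx * bx + by * by, ax * bx + ay * by
    rad = r / 2.0
    first = a2 + b2 - 2.0 * ab
    second = rad * rad + a2 * b2 / (rad * rad) - 2.0 * ab
    return (math.log(first) - math.log(second)) / (4.0 * math.pi)

def grad_at_center(xpp, x, r, h=1e-5):
    """Central-difference gradient of G with respect to x', evaluated at x' = x."""
    gx = (green((x[0] + h, x[1]), xpp, x, r) - green((x[0] - h, x[1]), xpp, x, r)) / (2 * h)
    gy = (green((x[0], x[1] + h), xpp, x, r) - green((x[0], x[1] - h), xpp, x, r)) / (2 * h)
    return gx, gy

def grad_formula(xpp, x, r):
    """(1/2pi) (4/r^2 - 1/|x''-x|^2) (x'' - x), as stated in the text."""
    bx, by = xpp[0] - x[0], xpp[1] - x[1]
    c = (4.0 / r**2 - 1.0 / (bx * bx + by * by)) / (2.0 * math.pi)
    return c * bx, c * by

if __name__ == "__main__":
    x, r = (2.0, 0.5), 2.0              # x = (r, t), so the disk has radius r/2 = 1
    xpp = (x[0] + 0.3, x[1] - 0.4)      # a point inside the disk
    print(grad_at_center(xpp, x, r), grad_formula(xpp, x, r))
    on_boundary = (x[0] + math.cos(1.0), x[1] + math.sin(1.0))
    print(green((2.2, 0.6), on_boundary, x, r))  # should vanish up to rounding
```

The second logarithm is the image-charge term: for $|x'' - x| = r/2$ its argument coincides with that of the first logarithm, which is why $G$ vanishes on $\partial D$.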


Let now

\[ C_1 = \sup\{\, |2(2p-1)(e^{2F(v_0 + w)} - 1)^p(x) - r^2 g(x) w(x)| \;\mid\; x = (r, t) \in \mathbb{R}^2_+ \,\}, \qquad C_2 = \sup\{\, |w(x)| \;\mid\; x \in \mathbb{R}^2_+ \,\}. \]

Differentiate (17) with respect to $x'$, set $x' = x$, apply the above results, and use $r'' \ge r/2$. We have

\[ |\nabla w(x)| \le \frac{2C_1}{\pi r^2} \int_D \frac{1}{|x'' - x|}\, dx'' + \frac{8C_2}{\pi r^3} \int_{\partial D} |x'' - x|\, dS'' \le \frac{C}{r}, \tag{18} \]

where $C$ is a constant independent of $r > 0$. Thus the claimed bound for $|r\nabla w(x)|$ over $\mathbb{R}^2_+$ is established.

To show that (16) holds for $w$, we argue by contradiction. Let $x_n = (r_n, t_n)$ be a sequence in $\mathbb{R}^2_+$ satisfying either $r_n \to 0$ or $|x_n| \to \infty$, but $|w(x_n)| \ge$ some $\varepsilon > 0$. Without loss of generality we may also assume that the sequence is so chosen that the disks centered at $x_n$ with radius $r_n/2$ are non-overlapping. Set then

\[ D_n = \{x \in \mathbb{R}^2_+ \mid |x - x_n| < \varepsilon_0 r_n\}, \qquad \varepsilon_0 = \min\left\{ \frac{1}{2}, \frac{\varepsilon}{4C} \right\}, \]

where $C > 0$ is the constant given in (18). For $x = (r, t) \in D_n$ we have $3r_n/2 \ge r \ge r_n/2$. Thus, integrating $\nabla w$ over the straight line $L$ from $x_n$ to $x \in D_n$ and using $|\nabla w(x')| < 2C/r_n$ ($\forall x' \in D_n$), we obtain the estimate

\[ |w(x)| = \left| w(x_n) + \int_L \nabla w(x') \cdot dl' \right| \ge \varepsilon - \frac{2C}{r_n} \cdot \frac{\varepsilon}{4C} r_n = \frac{\varepsilon}{2}, \qquad x \in D_n. \]

Therefore we arrive at the contradiction

\[ \int_{\mathbb{R}^2_+} \frac{w^2}{r^2}\, dx \ge \sum_{n=1}^{\infty} \int_{D_n} \frac{w^2}{r^2}\, dx \ge \sum_{n=1}^{\infty} \left( \frac{2}{3r_n} \right)^2 \left( \frac{\varepsilon}{2} \right)^2 \pi (\varepsilon_0 r_n)^2 = \infty, \]

since the left-hand side is finite by (9) and (12). So (16) must hold and thus the proof of the lemma is complete.

We now strengthen the above result and prove

Lemma 11. Let $w$ be the solution stated in Lemma 10. There are constants $r_0 > 0$ (small) and $\rho_0 > 0$ (large) so that for any $0 < \varepsilon < 1$ there is a constant $C(\varepsilon) > 0$ to make the asymptotic bounds


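\[ \begin{aligned} 0 < w(x) &< C(\varepsilon)\, r^{2p-\varepsilon}, & 0 < r &< r_0; \\ 0 < w(x) &< C(\varepsilon)\, r^{2p-\varepsilon} |x|^{-2(2p-\varepsilon)}, & |x| &> \rho_0, \end{aligned} \qquad x = (r, t) \in \mathbb{R}^2_+ \tag{19} \]

valid. In other words, roughly speaking, there hold $w(x) = O(r^{2p})$ as $r \to 0$ and $w(x) = O(|x|^{-2p})$ as $|x| \to \infty$.

Proof. First let $r_0 > 0$ be small so that $\operatorname{supp}(v_0) \subset \{x = (r, t) \in \mathbb{R}^2_+ \mid r > r_0\}$. Consider the infinite strip $R_0 = \{x = (r, t) \in \mathbb{R}^2_+ \mid 0 < r < r_0\}$ and set

\[ \sigma(x) = C r^\beta. \tag{20} \]

Then $r^2 \Delta\sigma = \beta(\beta - 1)\sigma$. On the other hand, the solution $w$ satisfies

\[ \begin{aligned} r^2 \Delta w &= 2(2p-1)(-1)^p (e^{2F(w)} - 1)^p \\ &= 2p(2p-1)(-1)^p (e^{2F(\xi)} - 1)^{p-1} e^{2F(\xi)}\, 2F'(\xi)\, w \\ &= 2p(2p-1) e^{2F(\xi)} w, \qquad \xi \in (0, w). \end{aligned} \]

Now take $\beta = 2p - \varepsilon$. Since $w \to 0$ uniformly as $r \to 0$, we may choose $r_0$ small enough so that $2p(2p-1) e^{2F(w)} > \beta(\beta - 1)$ for $x \in R_0$. Consequently

\[ r^2 \Delta(w - \sigma) > \beta(\beta - 1)(w - \sigma), \qquad x \in R_0. \tag{21} \]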
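Two elementary facts drive (21): the identity $r^2\Delta\sigma = \beta(\beta-1)\sigma$ for $\sigma = Cr^\beta$, and the strict gap between $2p(2p-1)$ and $\beta(\beta-1)$ when $\beta = 2p - \varepsilon$. A minimal sketch checking both follows; the helper names and the finite-difference step are illustrative assumptions.

```python
def sigma(r, beta, C=1.0):
    """Comparison function (20); it depends on r only."""
    return C * r ** beta

def r2_laplacian_sigma(r, beta, C=1.0, rel_step=1e-4):
    """r^2 * Laplacian of sigma via a central second difference in r (no t-dependence)."""
    h = rel_step * r
    second = (sigma(r + h, beta, C) - 2.0 * sigma(r, beta, C) + sigma(r - h, beta, C)) / (h * h)
    return r * r * second

def threshold(p, eps):
    """Value that e^{2F(w)} must exceed for 2p(2p-1) e^{2F(w)} > beta(beta-1), beta = 2p - eps."""
    beta = 2 * p - eps
    return beta * (beta - 1.0) / (2 * p * (2 * p - 1))

if __name__ == "__main__":
    beta = 3.7
    print(r2_laplacian_sigma(0.3, beta), beta * (beta - 1.0) * sigma(0.3, beta))
    print(threshold(2, 0.1))  # strictly below 1, so the inequality holds once w is small
```

Because $F(w) \to 0$ as $w \to 0$, the factor $e^{2F(w)}$ tends to 1 and eventually exceeds the threshold, which is strictly less than 1 for every $\varepsilon \in (0, 1)$. This is where the loss of $\varepsilon$ in the exponent comes from.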

Let $C$ in (20) be so large that $(w - \sigma)|_{r = r_0} < 0$. Using this and the property $w - \sigma \to 0$ as $r \to 0$ and $w \to 0$ as $|x| \to \infty$ in (21) we obtain the first line in (19), namely,

\[ 0 < w(x) < C r^\beta, \qquad 0 < r < r_0. \tag{22} \]

Next, put $S_0 = \{x \in \mathbb{R}^2_+ \mid |x| > \rho_0\}$, where $\rho_0 > 0$ is so large that $\operatorname{supp}(v_0) \subset \mathbb{R}^2_+ - S_0$. Define the comparison function

\[ \sigma(x) = C_1 r^\beta (1 + r^2 + t^2)^{-\beta}, \qquad x \in S_0, \tag{23} \]

where $\beta$ is as defined in (20). Then

\[ r^2 \Delta\sigma = \beta(\beta - 1) \left( 1 - 4\frac{r^2}{(1 + |x|^2)^2} \right) \sigma. \tag{24} \]

Using $w \to 0$ as $|x| \to \infty$ we obtain (21) for $x \in S_0$, where $\rho_0$ is sufficiently large. From (22) and (23) we see that the constant $C_1 > 0$ may be chosen so that $(w - \sigma)|_{|x| = \rho_0} < 0$. Using this property and Lemma 10 in (21) so that $R_0$ is replaced by $S_0$ we have $w < \sigma$ throughout $S_0$. This is the second line in (19) and the proof is complete.

Since $v_0$ is compactly supported, $v$ behaves like $w$ asymptotically. Recall the relation between $v$ and $u$. We have $v \sim |u|^p$ for $r \to 0$ or $|x| \to \infty$. Hence the asymptotic estimates stated in the theorem in Sect. 2 hold.
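These decay rates can be compared with the explicit $N = 1$ solution quoted in the Remark of Sect. 2. Expanding the numerator there as $(1 + r^2 + t^2)^2 - 4r^2$ gives $1 - |\phi|^2 = 4r^2/(1 + r^2 + t^2)^2$. The sketch below verifies this identity numerically and evaluates the two ratios that correspond to the boundary conditions of the theorem; the function names and sample points are chosen here for illustration.

```python
def phi_sq(r, t):
    """|phi|^2 of the explicit single-instanton solution of [7], as quoted in the Remark."""
    return ((r - 1.0) ** 2 + t * t) * ((r + 1.0) ** 2 + t * t) / (1.0 + r * r + t * t) ** 2

def one_minus_phi_sq(r, t):
    """Closed form 1 - |phi|^2 = 4 r^2 / (1 + r^2 + t^2)^2."""
    return 4.0 * r * r / (1.0 + r * r + t * t) ** 2

if __name__ == "__main__":
    # Small r: (1 - |phi|^2) / r^2 stays bounded and tends to 4 / (1 + t^2)^2.
    print(one_minus_phi_sq(1e-3, 0.5) / 1e-3 ** 2)
    # Large |x|: (1 - |phi|^2) |x|^4 / r^2 tends to 4.
    r, t = 3.0, 1000.0
    x2 = r * r + t * t
    print(one_minus_phi_sq(r, t) * x2 * x2 / (r * r))
    # The only zero of phi in the half plane is at (r, t) = (1, 0).
    print(phi_sq(1.0, 0.0))
```

Thus for this solution $1 - |\phi|^2$ behaves exactly like $r^2$ near the boundary and like $r^2|x|^{-4}$ at infinity, matching the bounds $O(r^{2-\varepsilon})$ and $O(r^{2-\varepsilon}|x|^{-2(2-\varepsilon)})$ in the limit $\varepsilon \to 0$.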


5. Discussion

We have proved the existence of a topological charge $N$ self-dual multi-instanton axially symmetric solution to the $SO_\pm(4p)$ Yang–Mills model in $4p$ dimensions. As such this is the first instanton with topological charge greater than one of the Yang–Mills model in higher dimensions. To put this result in perspective, we give a general discussion below.

The main mathematical interest of the scale invariant hierarchy of Yang–Mills models [20–22] in $4p$ dimensions arises from the fact that it is a natural generalization of the usual Yang–Mills model on $\mathbb{R}^4$, which is the $p = 1$ member of this hierarchy. It is therefore interesting to find out how deep this analogy goes, namely, what are the properties of the instanton solutions of the members of this hierarchy, what Dirac [12] equations they relate to, and associated index theorems and parameter counts for the corresponding moduli spaces, etc. Recall that one of the most important features of the Yang–Mills system on $\mathbb{R}^4$ is that the field equations are solved by the first order self-duality equations

\[ F(2) = \star F(2), \tag{25} \]

where $\star F(2)$ is the Hodge dual of the 2-form curvature $F(2) = F_{\mu\nu}$. The most general solutions have been obtained by Atiyah, Drinfeld, Hitchin and Manin [2]. On the other hand, the hierarchy of self-duality equations stated in Sect. 2 takes the form (25) in which the curvature 2-form, $F(2)$, is replaced by a $2p$-form $F(2p)$ so that the associated topological charge, $N$, corresponds to the well-defined higher-order Chern–Pontryagin index, $c_{2p}$. Thus, to realize these “canonical” topological classes, it is important to construct all possible self-dual solutions in the hierarchy. However, there is a mathematical subtlety that must be tackled, which concerns the richness of self-dual solutions. To see this, note that the algebra-valued components of the hierarchy of self-dual equations comprise $N(p) = (4p)!/(2[(2p)!]^2)$ algebra-valued conditions, which determine the $4p - 1$ gauge independent components of the connection $A_\mu$. Then, because $N(p) \ge (4p - 1)$, with the equality holding only for $p = 1$, it is concluded that the self-dual equations in $4p$ dimensions are overdetermined [23] for all $p \ge 2$. This means that with the exception of the $p = 1$ case (25), one cannot expect a rich family of non-trivial solutions and some kind of symmetry properties should be imposed to reduce the extra number of equations. Indeed, it turns out that, for $SO_\pm(4p)$ connections, non-trivial spherically symmetric instanton solutions in all $4p$ dimensions do exist and were given explicitly in [21]. By $SO_\pm(4p)$ we mean one or the other of the chiral representations of $SO(4p)$ given in terms of the gamma matrices $\Gamma_\mu$ and the chirality matrix $\Gamma_{4p+1}$. This is the spherically symmetric hierarchy of generalised Yang–Mills instantons, the first member of which, that with $p = 1$, is the $SU(2)$ BPST instanton [3], where now $SO_\pm(4)$ is read either as $SU_R(2)$ or $SU_L(2)$ respectively.
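The counting argument in this paragraph is easy to tabulate. The sketch below reads the count of conditions as $N(p) = (4p)!/(2[(2p)!]^2)$, i.e. half the binomial coefficient $\binom{4p}{2p}$, and compares it with the $4p - 1$ gauge independent components; the function names are hypothetical.

```python
from math import factorial

def n_conditions(p):
    """N(p) = (4p)! / (2 ((2p)!)^2): number of algebra-valued self-duality conditions."""
    return factorial(4 * p) // (2 * factorial(2 * p) ** 2)

def n_unknowns(p):
    """Number of gauge independent components of the connection in 4p dimensions."""
    return 4 * p - 1

if __name__ == "__main__":
    for p in range(1, 5):
        print(p, n_conditions(p), n_unknowns(p))
    # p = 1 gives 3 conditions for 3 unknowns; p = 2 already gives 35 conditions for 7.
```

The number of conditions grows combinatorially while the number of unknowns grows linearly, so the mismatch is already large at $p = 2$.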
The existence of these spherically symmetric solutions implies that the spherically symmetric restriction of the self-dual equations is not overdetermined.

Before proceeding to discuss the less symmetric solutions, we remark that in addition to the spherically symmetric solutions [21] on $\mathbb{R}^{4p}$ to the self-dual equations $F(2p) = \star F(2p)$, these equations are satisfied also by the $SO_\pm(4p)$ connections on $S^{4p}$. By virtue of the scale invariance of the systems $L^{(p)} = \operatorname{tr} F(2p)^2$, a stereographic transformation resulting in the corresponding system on the $4p$-sphere $S^{4p}$ leaves the Lagrangian form-invariant, and it follows that the self-dual equations [21] are satisfied by the $SO_\pm(4p)$ connections on $S^{4p}$, given in [15, 16]. Indeed, it was shown [14] that the self-dual equations are satisfied also by the $SU(n) \times U(1)$ connections on $CP^{2n}$, and hence


are expected to be satisfied also by the appropriate connections on the other compact symmetric coset spaces. It would be interesting to give a complete classification of these symmetric solutions.

We proceed to the next question, namely whether the self-dual equations in $4p$ dimensions have any other non-trivial instanton solutions, less symmetric than the spherically symmetric solutions [21] on $\mathbb{R}^{4p}$. The answer to this question was pursued in [23], where it was shown, using some indirect methods, that there are no solutions which are less than axially symmetric. This is the extent to which the higher dimensional self-dual equations are overdetermined.

The last question then is, do axially symmetric solutions actually exist? This problem was first considered in [7, 4], which generalize Witten’s multipseudoparticle equations. The main question is whether there are Witten type solutions realizing any prescribed “vortex” lumps or the Chern index $c_{2p} = N$. The works [4, 7] succeeded only in showing that it is plausible that this axially symmetric restriction supports multiple instanton solutions, but did not give a rigorous proof of existence, and were unable to give an explicit solution as in Witten’s case. The present paper succeeded in proving the existence of the Witten type multiple instanton solutions for the entire hierarchy of the $4p$-dimensional self-dual Yang–Mills equations obtained in [7] realizing any topological class $c_{2p} = N$.

Another interesting question to settle is the parameter count for these solutions. It is hoped that this task will be facilitated by relating the zero-mode problem for the hierarchy of self-dual equations to the solutions of the corresponding hierarchy of (generalised) Dirac equations introduced in [12].

Finally we note that a simplified version of the method here may be used to reproduce Witten’s multiple instantons in four dimensions.

References

1. Actor, A.: Classical solutions of SU(2) Yang–Mills theories. Rev. Mod. Phys. 51, 461–525 (1979)
2. Atiyah, M.F., Drinfeld, V.G., Hitchin, N.J. and Manin, Yu.I.: Construction of instantons. Phys. Lett. A65, 185–187 (1978)
3. Belavin, A.A., Polyakov, A.M., Schwartz, A.S. and Tyupkin, Yu.S.: Pseudoparticle solutions of the Yang–Mills equations. Phys. Lett. B59, 85–87 (1975)
4. Burzlaff, J., Chakrabarti, A. and Tchrakian, D.H.: Axially symmetric instantons in generalized Yang–Mills theory in 4p dimensions. J. Math. Phys. 34, 1665–1680 (1993)
5. Burzlaff, J., Chakrabarti, A. and Tchrakian, D.H.: Generalised Abelian Higgs models with self-dual vortices. J. Phys. A27, 1617–1624 (1994)
6. Caffarelli, L., Gidas, B. and Spruck, J.: On multimeron solutions of the Yang–Mills equation. Commun. Math. Phys. 87, 485–495 (1983)
7. Chakrabarti, A., Sherry, T.N. and Tchrakian, D.H.: On axially symmetric self-dual field configurations in 4p dimensions. Phys. Lett. B162, 340–344 (1985)
8. Gilbarg, D. and Trudinger, N.: Elliptic Partial Differential Equations of Second Order. Berlin and New York: Springer, 1977
9. Glimm, J. and Jaffe, A.: Multiple meron solution of the classical Yang–Mills equation. Phys. Lett. B73, 167–170 (1978)
10. Grossman, B., Kephart, T.W. and Stasheff, J.D.: Solutions to the Yang–Mills field equations in 8 dimensions and the last Hopf map. Commun. Math. Phys. 96, 431–437 (1984)
11. Jonsson, T., McBryan, O., Zirilli, F. and Hubbard, J.: An existence theorem for multimeron solutions to classical Yang–Mills field equations. Commun. Math. Phys. 68, 259–273 (1979)
12. Lechtenfeld, O., Nahm, W. and Tchrakian, D.H.: Dirac equations in 4p-dimensions. Phys. Lett. B162, 143–147 (1985)
13. Zhong-Qi Ma, O’Brien, G.M. and Tchrakian, D.H.: Dimensional reduction and higher-order topological invariants: Descent by even steps and applications. Phys. Rev. D33, 1177–1180 (1986)
14. Zhong-Qi Ma and Tchrakian, D.H.: Gauge Field Systems on CP^n. J. Math. Phys. 31, 1506–1512 (1990)


15. O’Brien, G.M. and Tchrakian, D.H.: Spin-connection self-dual GYM fields on double-self-dual GEC backgrounds. J. Math. Phys. 29, 1212–1219 (1988)
16. O’Sé, D. and Tchrakian, D.H.: Conformal properties of the BPST instantons of the generalised Yang–Mills system. Lett. Math. Phys. 13, 211–218 (1987)
17. Renardy, M.: On bounded solutions of a classical Yang–Mills equation. Commun. Math. Phys. 76, 277–287 (1980)
18. Schwartz, A.S.: On symmetric gauge fields. Commun. Math. Phys. 56, 79–86 (1977)
19. Sherry, T.N. and Tchrakian, D.H.: Dimensional reduction and higher order topological invariants. Phys. Lett. B147, 121–126 (1984)
20. Tchrakian, D.H.: N-dimensional instantons and monopoles. J. Math. Phys. 21, 166–169 (1980)
21. Tchrakian, D.H.: Spherically symmetric gauge field configurations in 4p dimensions. Phys. Lett. B150, 360–362 (1985)
22. Tchrakian, D.H.: Yang–Mills hierarchy. Int. J. Mod. Phys. (Proc. Suppl.) A3, 584–587 (1993)
23. Tchrakian, D.H. and Chakrabarti, A.: How overdetermined are the generalised self-duality relations? J. Math. Phys. 32, 2532–2539 (1991)
24. Tchrakian, D.H. and Yang, Y.: The existence of generalized self-dual Chern–Simons vortices. Lett. Math. Phys. 36, 403–413 (1996)
25. Witten, E.: Some exact multipseudoparticle solutions of classical Yang–Mills theory. Phys. Rev. Lett. 38, 121–124 (1977)

Communicated by D. Brydges