Commun. Math. Phys. 222, 1 – 7 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Estimates of Weil–Petersson Volumes via Effective Divisors

Georg Schumacher (1), Stefano Trapani (2)

(1) Fachbereich Mathematik und Informatik der Philipps-Universität, Hans-Meerwein-Strasse, Lahnberge, 35032 Marburg, Germany. E-mail: [email protected]
(2) Dipartimento di Matematica, Università di Roma "Tor Vergata", Via della Ricerca Scientifica, 00179 Roma, Italy. E-mail: [email protected]

Received: 18 May 2000 / Accepted: 1 April 2001
Abstract: We study the asymptotics of the Weil–Petersson volumes of the moduli spaces of compact Riemann surfaces of genus $g$ with $n$ punctures, for fixed $n$ as $g \to \infty$.

1. Introduction and Statement of Results

The total free energy of two-dimensional gravity, which is a generating function for certain intersection numbers on the compactified moduli spaces $\overline{M}_{g,n}$ of stable $n$-pointed curves, was conjectured by Witten (and proved by Kontsevich) to satisfy certain KdV equations. This gave new insight into the geometry of those moduli spaces [WI, KO, DIJ].

The Mumford class $\kappa_1$ on $\overline{M}_{g,0}$ was shown to be proportional to the cohomology class $[\omega_{WP}]$ of the Weil–Petersson form by Wolpert in [WO]. Furthermore he showed that the restriction of this class to any component of the compactifying divisor coincides with the corresponding Weil–Petersson class. Arbarello and Cornalba introduced classes $\kappa_1$ on $\overline{M}_{g,n}$, proved a similar restriction property for these and concluded proportionality on all $\overline{M}_{g,n}$ [A-C 2]. So the Weil–Petersson volumes $\mathrm{vol}(M_{g,n})$ are, up to a normalizing factor, the intersection numbers
\[ V_{g,n} = \int_{\overline{M}_{g,n}} \kappa_1^{3g-3+n}. \]
Recently, in papers by Kaufmann, Manin, Zagier, and Zograf [K-M-Z, M-Z, ZO3] a generating function was introduced for intersection numbers of Mumford’s tautological classes [MU2], and shown to be equal to the above generating function up to a change of variables. Previously, Zograf had computed the volumes for genus 0,1, and 2 explicitly [ZO1, ZO2]. Manin and Zograf also gave estimates of the volume for fixed genus and n → ∞. The aim of this note is to study the Weil–Petersson volume of the moduli spaces Mg,n for fixed n and large g. Introducing the decorated Teichmüller space, Penner [PE]
gave a technique for integrating top-degree differential forms on $M_{g,n}$, which led to an estimate of the volumes of $M_{g,1}$ from below as $g \to \infty$. With these methods, Grushevsky [GR] recently proved an upper bound for the volume of $M_{g,n}$ for fixed $n > 0$ and large $g$. For $n = 1$ his upper estimate has the same order of growth as Penner's lower estimate. However, for the classical moduli spaces $M_{g,0}$ the asymptotics of the volume for $g \to \infty$ have not been treated.

Here, we give a different approach, which does not require the existence of punctures. On one hand we use the known push-pull type formulas in the spirit of Arbarello and Cornalba [A-C 1, A-C 2] to estimate the volume of $M_{g,n+1}$ from below in terms of the volume of $M_{g,n}$ for any given $g$ and $n \ge 0$. On the other hand, we base our estimates on the fact that $\kappa_1$ is ample and on the above restriction property. In this way it is possible to estimate the volume of the moduli space $M_{g,0}$ from below in terms of the volumes of moduli spaces of lower genus.

We set $V_{0,3} = 0$. The values $V_{0,4} = 1$, $V_{0,5} = 5$, $V_{1,1} = \frac{1}{24}$, and $V_{1,2} = \frac{1}{8}$ are known. We prove the following theorems.

Theorem 1. Let $2g - 2 + n > 0$ and $(g, n) \ne (0, 4), (1, 1)$. Then
\[ V_{g,n+1} \ge \frac{1}{2}(3g - 2 + n)(7g - 7 + 3n) \cdot V_{g,n} + \frac{1}{24^g\, g!}. \tag{1} \]

Theorem 2. Let $g > 1$. Then
\[ V_{g,0} \ge \frac{1}{28} V_{g-1,2} + \frac{1}{672} V_{g-1,1} + \frac{1}{14} \sum_{j=2}^{[g/2]} V_{j,1} V_{g-j,1} - \frac{1}{28} \left(V_{\frac{g}{2},1}\right)^2, \tag{2} \]
with $V_{\frac{g}{2},1} = 0$ if $g$ is odd.

Together with the results of Penner and Grushevsky these imply the existence of constants $0 < c < C$, independent of $n$, such that
\[ c^g (2g)! \le \frac{V_{g,n}}{(3g - 3 + n)!} \le C^g (2g)! \tag{3} \]
for all fixed $n \ge 0$ and large $g$. In particular, for all $n \ge 0$,
\[ \lim_{g \to \infty} \frac{\log \dfrac{V_{g,n}}{(3g-3+n)!}}{g \log g} = 2. \tag{4} \]
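The right-hand side of (1) is easy to evaluate exactly. The following is a minimal sketch, not part of the original argument: the function name `theorem1_lower_bound` is hypothetical, and the input values $V_{0,5} = 5$ and $V_{1,2} = 1/8$ are the known values quoted above.

```python
from fractions import Fraction
from math import factorial


def theorem1_lower_bound(g, n, v_gn):
    """Right-hand side of (1): a lower bound for V_{g,n+1} in terms of V_{g,n}."""
    if 2 * g - 2 + n <= 0 or (g, n) in {(0, 4), (1, 1)}:
        raise ValueError("Theorem 1 does not cover this pair (g, n)")
    # (1/2)(3g - 2 + n)(7g - 7 + 3n), kept as an exact rational number
    coefficient = Fraction((3 * g - 2 + n) * (7 * g - 7 + 3 * n), 2)
    # 1 / (24^g g!) is the value of the psi-class integral used in the proof
    psi_term = Fraction(1, 24**g * factorial(g))
    return coefficient * Fraction(v_gn) + psi_term


print(theorem1_lower_bound(0, 5, 5))               # 61
print(theorem1_lower_bound(1, 2, Fraction(1, 8)))  # 7/6
```

Exact rational arithmetic is used so that the factor $1/(24^g g!)$ is not lost to rounding for larger $g$.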
2. Proof of the Estimates

For any $g$ and $n$ with $2g - 2 + n > 0$ the map $M_{g,n+1} \to M_{g,n}$ forgetting the last puncture is known to extend holomorphically to a map $\pi_{n+1} : \overline{M}_{g,n+1} \to \overline{M}_{g,n}$. For $n > 0$ it possesses natural sections $\sigma_j$, $j = 1, \dots, n$ (cf. [A-C 2, Sect. 1]) with corresponding divisors $D_j$. Denote the relative dualizing sheaf $\omega_{\overline{M}_{g,n+1}/\overline{M}_{g,n}}$ by $\omega_{n+1}$. Let
\[ \psi_j := c_1(\sigma_j^{*}\, \omega_{n+1}), \]
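and
\[ K := c_1(\omega_{n+1}(D)) \in H^2(\overline{M}_{g,n+1}, \mathbb{R}), \]
where $D = D_1 + \dots + D_n$. Finally
\[ \kappa_j := \pi_{n+1*}(K^{j+1}) \in H^{2j}(\overline{M}_{g,n+1}, \mathbb{R}) \]
for $j = 0, \dots, 3g - 3 + n$. For $n = 0$ these are equal to the Mumford classes (denoted also by $\kappa_j$). Moreover $\kappa_1$ is ample on $\overline{M}_{g,n}$. According to Mumford [MU2] the classes $\kappa_j$ are numerically effective on $\overline{M}_{g,0}$ for $j = 1, \dots, 3g - 3$ in the sense that for any complete subvariety $W \subset \overline{M}_{g,0}$ of dimension $j$
\[ \int_W \kappa_j \ge 0 \]
holds. Also, for $j_1 + \dots + j_k = 3g - 3$,
\[ \int_{\overline{M}_{g,0}} \kappa_{j_1} \cdot \ldots \cdot \kappa_{j_k} \ge 0 \tag{5} \]
following [HA2]. We shall need the following formulas for cohomology classes, to be found in [A-C 2, (1.7), (1.9), and (1.10)]:
\[ \pi_{n*}\bigl(\psi_1^{a_1} \cdot \ldots \cdot \psi_{n-1}^{a_{n-1}} \cdot \psi_n^{a_n + 1}\bigr) = \psi_1^{a_1} \cdot \ldots \cdot \psi_{n-1}^{a_{n-1}} \cdot \kappa_{a_n} \quad \text{for } a_j \ge 0, \tag{6} \]
\[ \pi_{n*}\bigl(\psi_1^{a_1} \cdot \ldots \cdot \psi_{n-1}^{a_{n-1}}\bigr) = \sum_{j;\, a_j > 0} \psi_1^{a_1} \cdot \ldots \cdot \psi_{j-1}^{a_{j-1}} \cdot \psi_j^{a_j - 1} \cdot \psi_{j+1}^{a_{j+1}} \cdot \ldots \cdot \psi_{n-1}^{a_{n-1}}, \tag{7} \]
\[ \kappa_a = \pi_{n+1}^{*}(\kappa_a) + \psi_{n+1}^{a} \quad \text{on } \overline{M}_{g,n+1}, \tag{8} \]
\[ \kappa_0 = 2g - 2 + n. \tag{9} \]

Lemma 1. Let $m_j \ge 0$ be integers such that $\sum_{j=1}^{k} j \cdot m_j = 3g - 3 + n$ with $n \ge 0$. Then
\[ \int_{\overline{M}_{g,n}} \kappa_1^{m_1} \cdot \ldots \cdot \kappa_k^{m_k} \ge 0. \tag{10} \]

Proof. The above Eq. (5) is the statement for $n = 0$. We proceed by induction on $n$. Assume (10) for some $n \ge 0$. Then
\[ \int_{\overline{M}_{g,n+1}} \kappa_1^{m_1} \cdot \ldots \cdot \kappa_k^{m_k} = \int_{\overline{M}_{g,n}} \pi_{n+1*}\Bigl( (\pi_{n+1}^{*}\kappa_1 + \psi_{n+1})^{m_1} \cdot (\pi_{n+1}^{*}\kappa_2 + \psi_{n+1}^{2})^{m_2} \cdot \ldots \cdot (\pi_{n+1}^{*}\kappa_k + \psi_{n+1}^{k})^{m_k} \Bigr). \]
Since
\[ \int_{\overline{M}_{g,n}} \pi_{n+1*}\Bigl( \pi_{n+1}^{*}\bigl(\kappa_1^{j_1} \cdot \ldots \cdot \kappa_k^{j_k}\bigr) \cdot \psi_{n+1}^{\ell+1} \Bigr) = \int_{\overline{M}_{g,n}} \kappa_1^{j_1} \cdot \ldots \cdot \kappa_k^{j_k} \cdot \kappa_\ell, \]
the above integral can be expressed as a sum of non-negative terms.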
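Proof of Theorem 1. We first note that for $g > 0$,
\[ \int_{\overline{M}_{g,n+1}} \psi_{n+1}^{3g-2+n} = \int_{\overline{M}_{g,n+1}} \psi_1^{3g-2+n} = \int_{\overline{M}_{g,n}} \psi_1^{3g-3+n} = \dots = \int_{\overline{M}_{g,1}} \psi_1^{3g-2} \]
by (7). These integrals are known to be equal to $1/(24^g \cdot g!)$ (cf. [F-P]). For $g = 0$,
\[ \int_{\overline{M}_{0,n+1}} \psi_{n+1}^{n-2} = \int_{\overline{M}_{0,4}} \psi_1 = \int_{\overline{M}_{0,3}} \kappa_0 = 1. \]
By Lemma 1, for $j < 3g - 2 + n$,
\[ \int_{\overline{M}_{g,n+1}} \pi_{n+1}^{*}(\kappa_1^{j}) \cdot \psi_{n+1}^{3g-2+n-j} = \int_{\overline{M}_{g,n}} \kappa_1^{j} \cdot \kappa_{3g-3+n-j} \ge 0, \]
hence
\[
\begin{aligned}
\int_{\overline{M}_{g,n+1}} \kappa_1^{3g-2+n}
&= \int_{\overline{M}_{g,n+1}} \bigl(\pi_{n+1}^{*}(\kappa_1) + \psi_{n+1}\bigr)^{3g-2+n} \\
&= \sum_{j=0}^{3g-2+n} \binom{3g-2+n}{j} \int_{\overline{M}_{g,n+1}} \bigl(\pi_{n+1}^{*}(\kappa_1)\bigr)^{j} \cdot \psi_{n+1}^{3g-2+n-j} \\
&\ge (3g-2+n) \int_{\overline{M}_{g,n}} \kappa_1^{3g-3+n} \cdot \kappa_0 + \frac{1}{2}(3g-2+n)(3g-3+n) \int_{\overline{M}_{g,n}} \kappa_1^{3g-4+n}\, \kappa_1 + \int_{\overline{M}_{g,n+1}} \psi_{n+1}^{3g-2+n} \\
&= \Bigl( (3g-2+n)(2g-2+n) + \frac{1}{2}(3g-2+n)(3g-3+n) \Bigr) \cdot \int_{\overline{M}_{g,n}} \kappa_1^{3g-3+n} + \int_{\overline{M}_{g,n+1}} \psi_{n+1}^{3g-2+n} \\
&= \frac{1}{2}(3g-2+n)(7g-7+3n) \int_{\overline{M}_{g,n}} \kappa_1^{3g-3+n} + \frac{1}{24^g\, g!}.
\end{aligned}
\]
Here the inequality keeps only the terms $j = 3g-3+n$, $j = 3g-4+n$ and $j = 0$ of the sum, and the last step uses $(2g-2+n) + \frac{1}{2}(3g-3+n) = \frac{1}{2}(7g-7+3n)$.

For any family $f : \mathcal{C} \to S$ of stable curves, $\det(f_{*}\omega_{\mathcal{C}/S})$ is a line bundle over $S$, where $\omega_{\mathcal{C}/S}$ denotes the relative dualizing sheaf. These determinant sheaves give rise to a $\mathbb{Q}$-divisor on $\overline{M}_{g,0}$, which is usually denoted by $\lambda$. In a similar way the singular fibers of the above family define divisors, which also give rise to $\mathbb{Q}$-divisors on $\overline{M}_{g,0}$, usually denoted by $\delta_i$. The irreducible components of the divisor at infinity on $\overline{M}_{g,0}$ are denoted by $\Delta_i$, $i = 0, \dots, [g/2]$, with classes $[\Delta_i] = \delta_i$ for $i \ne 1$ and $[\Delta_1] = 2\delta_1$. These are characterized as follows:

(i) The generic point of $\Delta_0$ corresponds to an irreducible, stable curve of genus $g - 1$ with one ordinary double point. In fact there is a generically 2:1 surjective holomorphic map $\overline{M}_{g-1,2} \to \Delta_0$.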
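(ii) For $i = 1, \dots, [g/2]$ the generic points of $\Delta_i$ correspond to stable curves with one ordinary double point and two irreducible components of genus $i$ and $g - i$ respectively. There exists a surjective holomorphic map $\overline{M}_{i,1} \times \overline{M}_{g-i,1} \to \Delta_i$, which is generically 1:1 for $i \ne g/2$ and 2:1 for $i = g/2$ (cf. [H-M]).

Theorem 3. Let
\[ D = p \cdot \lambda - \sum_{j=0}^{[g/2]} q_j \cdot \delta_j; \qquad p, q_j > 0 \]
be an effective $\mathbb{Q}$-divisor on $\overline{M}_{g,0}$ such that
\[ \mu_j = \frac{12 q_j - p}{p} > 0. \]
Then
\[ V_{g,0} > \frac{\mu_0}{2} \cdot V_{g-1,2} + \frac{\mu_1}{48} \cdot V_{g-1,1} + \sum_{j=2}^{[g/2]} \mu_j \cdot V_{j,1} \cdot V_{g-j,1} - \frac{\mu_{g/2}}{2} \cdot (V_{g/2,1})^2 \]
(where $\mu_{g/2} = V_{g/2,1} = 0$ if $g$ is odd).

Proof. According to [MU1]
\[ \kappa_1 = 12\lambda - \sum_{j=0}^{[g/2]} \delta_j. \]
We want to write the divisor in the form
\[ D = \alpha \kappa_1 - \sum_j \beta_j \delta_j \]
for $\alpha, \beta_j > 0$. This gives
\[ \alpha = \frac{p}{12}; \qquad \beta_j = q_j - \frac{p}{12} > 0. \]
Now
\[ 0 < \kappa_1^{3g-4} \cdot D = \alpha\, \kappa_1^{3g-3} - \sum_j \beta_j\, \kappa_1^{3g-4} \cdot \delta_j. \]
We use the restriction property of $\kappa_1$ with respect to $\Delta_j$. On $\overline{M}_{g-1,2}$ the class $\kappa_1$ is invariant under the action of $\mathbb{Z}_2$ exchanging the punctures. So under the natural map $\overline{M}_{g-1,2} \to \Delta_0$ it descends to the restriction of the class $\kappa_1$ on the ambient space $\overline{M}_{g,0}$. Now
\[ \kappa_1^{3g-4} \cdot \delta_0 = \int_{\Delta_0} \kappa_1^{3g-4} = \frac{1}{2} V_{g-1,2}. \]
In a similar way, we get
\[ \kappa_1^{3g-4} \cdot \delta_1 = \frac{1}{2} V_{1,1} \cdot V_{g-1,1}, \]
and for $i > 1$
\[ \kappa_1^{3g-4} \cdot \delta_i = V_{i,1} \cdot V_{g-i,1} \]
with an extra factor $1/2$ for $(V_{\frac{g}{2},1})^2$, if $g$ is even. Also we use $V_{1,1} = 1/24$.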
Remark. Combining push-pull formulas and computations of intersections of powers of $\kappa_1$ with various effective divisors, one can also estimate the intersection numbers $V_{g,n}$ from above in terms of $V_{g-1,n+2}$, and the numbers $V_{j,\ell}$ with $j = g$, $\ell < n$ or $j < g$, $\ell \le n + 1$.

The above Theorem 2 follows, since for every rational $\varepsilon > 0$ the $\mathbb{Q}$-divisor $D = (11.2 + \varepsilon) \cdot \lambda - \delta$ is ample [MU1], where $\delta = \sum_j \delta_j$. A stronger estimate for $g \ge 23$ follows from the fact that $\overline{M}_{g,0}$ has positive Kodaira dimension according to the theorems of Eisenbud, Harris, and Mumford [E-H, HA1, H-M]. The equality
\[ K_{\overline{M}_{g,0}} = 13\lambda - 2\delta_0 - 3\delta_1 - 2 \sum_{j=2}^{[g/2]} \delta_j \]
in $\mathrm{Pic}(\overline{M}_{g,0}) \otimes \mathbb{Q}$ was proved in [H-M]. This implies for $g \ge 23$,
\[ V_{g,0} > \frac{11}{26} \cdot V_{g-1,2} + \frac{23}{624} \cdot V_{g-1,1} + \frac{11}{13} \sum_{j=2}^{[g/2]} V_{j,1} \cdot V_{g-j,1} - \frac{11}{26} \cdot (V_{g/2,1})^2. \]
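The numerical coefficients in Theorem 2 and in the estimate for $g \ge 23$ both come from the quantities $\mu_j = (12 q_j - p)/p$ of Theorem 3. The following sketch recomputes them; the function name `theorem3_coefficients` is hypothetical, and as a simplifying assumption a single value $q_j$ is used for all $j \ge 2$, which is the case for both divisors considered here.

```python
from fractions import Fraction


def theorem3_coefficients(p, q0, q1, qj):
    """Coefficients of V_{g-1,2}, V_{g-1,1}, the sum over j >= 2, and (V_{g/2,1})^2.

    Assumes one common coefficient qj for all delta_j with j >= 2.
    """
    p = Fraction(p)

    def mu(q):
        # mu_j = (12 q_j - p) / p, required to be positive in Theorem 3
        value = (12 * Fraction(q) - p) / p
        if value <= 0:
            raise ValueError("hypothesis mu_j > 0 of Theorem 3 is violated")
        return value

    mu0, mu1, muj = mu(q0), mu(q1), mu(qj)
    return mu0 / 2, mu1 / 48, muj, muj / 2


# Limit epsilon -> 0 of D = (11.2 + epsilon) * lambda - delta, i.e. p = 56/5, q_j = 1
print(theorem3_coefficients(Fraction(56, 5), 1, 1, 1))
# Canonical divisor 13*lambda - 2*delta_0 - 3*delta_1 - 2*sum_{j>=2} delta_j
print(theorem3_coefficients(13, 2, 3, 2))
```

The first call reproduces the constants $1/28$, $1/672$, $1/14$, $1/28$ of (2), which is why Theorem 2 is stated with a non-strict inequality: it is the limit $\varepsilon \to 0$ of strict ones. The second call reproduces $11/26$, $23/624$, $11/13$, $11/26$.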
Further computations of effective divisors in terms of $\lambda$ and $\delta_j$ were provided in [H-M, E-H]. All of these divisors satisfy the hypothesis of Theorem 3 and can be used for new estimates of the Weil–Petersson volumes.

We finally discuss how to arrive at the asymptotic estimates (3). A rough estimate following from Theorem 1 is
\[ \frac{V_{g,n}}{(3g - 3 + n)!} \ge V_{g,1}, \]
which, together with Penner's lower estimate for $V_{g,1}$, already implies the existence of a constant $c > 0$, independent of $n \ge 1$, such that for large $g$,
\[ \frac{V_{g,n}}{(3g - 3 + n)!} \ge c^g (2g)!. \]
The corresponding upper estimate is due to Grushevsky, so that (3) follows for $n \ge 1$. For $n = 0$ and $g > 1$ the above Theorem 1 and Theorem 2 give
\[ \frac{1}{672} V_{g-1,1} \le V_{g,0} < \frac{2}{(3g - 2)(7g - 7)} V_{g,1}. \]
Again, with [GR, PE] these inequalities yield the following corollary:

Corollary 1. There exist constants $0 < \tilde{c} < \tilde{C}$ such that for $g \gg 0$
\[ \tilde{c}^{\,g} (2g)! \le \frac{V_{g,0}}{(3g - 3)!} \le \tilde{C}^{\,g} (2g)!. \]
Acknowledgements. This paper was written, while the second named author was visiting the University of Marburg. He would like to thank the DFG for support (Schwerpunktprogramm 1094).
References

[A-C 1] Arbarello, E., Cornalba, M.: The Picard groups of the moduli spaces of curves. Topology 26, 153–171 (1987)
[A-C 2] Arbarello, E., Cornalba, M.: Combinatorial and algebro geometric cohomology classes on the moduli space of curves. J. Alg. Geom. 5, 705–749 (1996)
[DIJ] Dijkgraaf, R.: Intersection theory, integrable hierarchies and topological field theory. In: New symmetry principles in quantum field theory, Cargèse, 1991, Adv. Sci. Inst. Ser. B Phys. 295, New York: Plenum, 1992
[E-H] Eisenbud, D., Harris, J.: The Kodaira dimension of the moduli space of curves of genus ≥ 23. Invent. Math. 90, 359–387 (1987)
[F-P] Faber, C., Pandharipande, R.: Hodge integrals and Gromov–Witten theory. Invent. Math. 139, 173–199 (2000)
[GE] Getzler, E.: Intersection theory on M_{1,4} and elliptic Gromov–Witten invariants. J. Am. Math. Soc. 10, 973–998 (1997)
[GR] Grushevsky, S.: Explicit upper bound for the Weil–Petersson volumes. Preprint math.AG/0003217
[HA1] Harris, J.: On the Kodaira dimension of the moduli space of curves. II: The even-genus case. Invent. Math. 75, 437–466 (1984)
[HA2] Harris, J.: Families of smooth curves. Duke Math. J. 51, 409–419 (1984)
[H-M] Harris, J., Mumford, D.: On the Kodaira dimension of the moduli space of curves. Invent. Math. 67, 23–86 (1982)
[K-M-Z] Kaufmann, R., Manin, Yu., Zagier, D.: Higher Weil–Petersson volumes of moduli spaces of stable n-pointed curves. Commun. Math. Phys. 181, 763–787 (1996)
[KO] Kontsevich, M.: Intersection theory on the moduli space of curves and the matrix Airy function. Commun. Math. Phys. 147, 1–23 (1992)
[M-Z] Manin, Y., Zograf, P.: Invertible Cohomological Field Theories and Weil–Petersson volumes. Preprint math.AG/9902051
[MU1] Mumford, D.: Stability of projective varieties. Enseign. Math., II. Ser. 23, 39–110 (1977)
[MU2] Mumford, D.: Towards an enumerative geometry of the moduli space of curves. Arithmetic and geometry, Pap. dedic. I.R. Shafarevich, Vol. II. Geometry, Prog. Math. 36, 271–328 (1983)
[PE] Penner, R.C.: Weil–Petersson volumes. J. Diff. Geom. 35, 559–608 (1992)
[WI] Witten, E.: Two dimensional gravity and intersection theory on moduli spaces. Surveys in Diff. Geom. 1, 243–310 (1991)
[WO] Wolpert, S.: On the homology of the moduli spaces of stable curves. Ann. Math. 118, 491–523 (1983)
[ZO1] Zograf, P.: The Weil–Petersson volume of the moduli space of punctured spheres. In: Bödigheimer, et al., Mapping class groups and moduli spaces of Riemann surfaces. Proceedings of workshops Göttingen 1991, Seattle, 1991, American Math. Soc., Contemp. Math. 150, 367–372 (1993)
[ZO2] Zograf, P.: Weil–Petersson volumes of low genus moduli spaces. Funct. Anal. Appl. 32, 78–81 (1998)
[ZO3] Zograf, P.: Weil–Petersson volumes of moduli spaces of curves and the genus expansion in two dimensional gravity. Preprint math.AG/9811026

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 222, 9 – 43 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On Asymptotic Properties of a Number Theoretic Function Arising out of a Spin Chain Model in Statistical Mechanics

J. Kallies (1), A. Özlük (2), M. Peter (3), C. Snyder (2)

(1) Mathematisches Institut, Universität zu Köln, Köln, Germany. E-mail: [email protected]
(2) Department of Mathematics, University of Maine, and Research Institute of Mathematics, Orono, ME, USA. E-mail: [email protected]; [email protected]
(3) Mathematisches Institut, Universität Freiburg, Freiburg, Germany. E-mail: [email protected]

Received: 4 January 2001 / Accepted: 10 April 2001

Dedicated to Curt Meyer, a great teacher and friend

Abstract: Let
\[ A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \]
Let $N$ be a positive integer and define $\Phi(N)$ as the number of matrices $C$ which are products of $A$ and $B$, where both $A$ and $B$ must occur, such that the trace $\mathrm{Tr}(C) = N$. It has been conjectured that
\[ \Phi(N) \sim N \log N \quad (N \to \infty), \]
see [10]. In this note we consider the summatory function
\[ \Psi(N) = \sum_{n \le N} \Phi(n) \]
and show that
\[ \Psi(N) \sim \frac{6}{\pi^2} N^2 \log N \quad (N \to \infty). \]
1. Introduction

The relationship between number theory and equilibrium statistical mechanics is well documented; see for instance [4, 5], and [6]. For example, the Riemann zeta function $\zeta(s)$ can be interpreted as the partition function of a system of interacting primes at inverse temperature $s$, cf. [2, 3], and [8]. Knauf [11] has shown that
\[ Z(s) = \frac{\zeta(s - 1)}{\zeta(s)} \]
is the partition function of an infinite spin chain with ferromagnetic interactions. Kleban and Özlük [10] introduced a spin chain model that is translation invariant, where the energy of each spin configuration is defined in terms of Farey fractions. In these models, the phase transition can be analyzed by studying the singularities of the partition function. It is also important to have good estimates for the size of the coefficients when the partition function $Z(s)$ is expanded into a Dirichlet series. For example, in the Knauf model, $Z(s)$ turns out to be
\[ \sum_{n=1}^{\infty} \frac{\phi(n)}{n^s}, \]
where $\phi(n)$ is Euler's phi-function. The average growth rate of $\phi(n)$ is well understood. In this paper, we study the asymptotic growth rate of the corresponding number theoretic function $\Phi(n)$, defined below, arising in the model introduced by Kleban and Özlük.

We now elaborate. Let
\[ A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad B = A^t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \]
Let $\mathcal{M}$ denote the multiplicative monoid generated by $A$ and $B$. Let
\[ \mathcal{N} = \{M_1 A B M_2 : M_i \in \mathcal{M}\} \cup \{M_1 B A M_2 : M_i \in \mathcal{M}\} \subset \mathcal{M}, \]
i.e. $\mathcal{N}$ contains all possible products of $A$ and $B$, where both $A$ and $B$ appear to a positive integral power. Furthermore, let $N$ be a positive integer and define
\[ \mathcal{N}_N = \{C \in \mathcal{N} : \mathrm{Tr}(C) = N\}, \qquad \Phi(N) = |\mathcal{N}_N|. \]
Finally, let
\[ \mathcal{N}_{\le N} = \{C \in \mathcal{N} : \mathrm{Tr}(C) \le N\}, \qquad \Psi(N) = |\mathcal{N}_{\le N}|, \]
where $|\ |$ denotes the cardinality of a set. In [10] it is conjectured that
\[ \Phi(N) \sim c N \log N \quad (N \to \infty), \]
where $c = 1$. If we use partial summation, then we see that this conjecture implies
\[ \Psi(N) \sim \frac{1}{2} c N^2 \log N \quad (N \to \infty), \]
where the constant is the same as above. The purpose of this article is to prove the stronger result that
\[ \Psi(N) = \frac{1}{2} c N^2 \log N + O(N^2 \log\log N) \quad (N \to \infty), \]
where $c$ is explicitly given as
\[ c = \frac{12}{\pi^2}. \]
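For small $N$ the counting function defined above can be tabulated by direct enumeration. The sketch below is an illustration only and not the method of the paper; the function name `phi_counts` is hypothetical. It relies on two assumptions: that distinct words in $A$ and $B$ give distinct matrices (the monoid generated by $A$ and $B$ is free), and that every word containing both letters starts with a block $A^k B$ or $B^k A$, $k \ge 1$. Once both letters have occurred, both off-diagonal entries are positive, so each further factor strictly increases the trace and the search terminates.

```python
def phi_counts(max_trace):
    """Return {N: number of products of A and B, both occurring, with trace N}
    for 3 <= N <= max_trace. Matrices are stored as tuples (a, b, c, d)
    meaning [[a, b], [c, d]]."""
    counts = {n: 0 for n in range(3, max_trace + 1)}

    def extend(matrix):
        a, b, c, d = matrix
        if a + d > max_trace:
            return  # the trace only grows from here on
        counts[a + d] += 1
        extend((a + b, b, c + d, d))  # right-multiply by A = [[1, 0], [1, 1]]
        extend((a, a + b, c, c + d))  # right-multiply by B = [[1, 1], [0, 1]]

    # A^k B and B^k A both have trace k + 2, so k <= max_trace - 2 suffices.
    for k in range(1, max_trace - 1):
        extend((1, 1, k, k + 1))   # A^k B
        extend((k + 1, k, 1, 1))   # B^k A
    return counts


counts = phi_counts(12)
print(counts[3], counts[4])   # 2 6
print(sum(counts.values()))   # the summatory count of matrices with trace <= 12
```

The recursion depth is bounded by `max_trace`, so for large bounds an explicit stack would be preferable to recursion.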
It turns out that even the modified conjecture
\[ \Phi(N) \sim \frac{12}{\pi^2} N \log N \quad (N \to \infty) \]
is false. M. Peter has shown that the limit
\[ \lim_{N \to \infty} \frac{\Phi(N)}{N \log N} \]
does not exist, see [16]. The proof of this result depends essentially on the tools developed in the present paper. It shows that the analytic continuation of the partition function cannot be done with such crude methods as were proposed in [10], Eq. (38). On the other hand, via partial summation the result of the present paper shows that
\[ Z(\beta) = \sum_{n \ge 1} \frac{\Phi(n)}{n^{\beta}} = \frac{12}{\pi^2} (\beta - 2)^{-2} + O\bigl((\beta - 2)^{-1} \log (\beta - 2)^{-1}\bigr) \]
as $\beta \to 2 + 0$. Thus the singularity of the partition function at $\beta = 2$ is of order 2 from the right. We do not get any information on analytic continuation of $Z(\beta)$ beyond its abscissa of convergence.

In Sect. 2 we collect, for the benefit of readers who are not specialists in number theory, some well known facts about the relation between real quadratic fields, continued fractions and certain $2 \times 2$ matrices. For the proof of the main result we need mean value formulae for certain arithmetical functions $r_\Delta$ which are defined in terms of reduced quadratic irrationalities (see the paragraph after Eq. (4.2)). Since it is difficult to work with them directly they are approximated by multiplicative functions $\rho_\Delta$ (see Eq. (3.1)). In Sect. 3 the necessary mean value result for the latter functions is proved (Proposition 3.13). In Sect. 4 the condition on the matrices which are counted in $\Psi(N)$ is expressed in terms of traces of quadratic units (Theorem 4.1). Then $\Psi(N)$ is written as a sum over reduced quadratic irrationalities (Eq. (4.1)). A crude approximation reduces this sum to a sum of mean values of the functions $r_\Delta$ (Proposition 4.10). The mean value formula for $\rho_\Delta$ from Sect. 3 then gives an asymptotic formula for $\Psi(N)$ as a weighted mean value of class numbers ordered according to their fundamental units (Proposition 4.13). This mean value is closely related to that considered by Sarnak [18] but his method does not work in our situation. Therefore a different approach is used in Proposition 4.14.

2. Facts from Real Quadratic Fields and Continued Fractions

For the convenience of the reader, we present now some background material on real quadratic number fields and continued fractions which will be used in a later section of this paper. Let $\Delta$ be any non-square discriminant. Then $\Delta = f^2 \Delta_0$ for some fundamental discriminant $\Delta_0$ and $f \in \mathbb{N}$, where $\mathbb{N}$ denotes the set of positive integers; $\Delta$ can be associated with a unique order $\mathcal{O}_\Delta$ in the corresponding quadratic field $K = \mathbb{Q}(\sqrt{\Delta_0})$ (cf. [1], p. 234). We let the group of units of $\mathcal{O}_\Delta$ be given by
\[ E_\Delta := \{\varepsilon \mid \varepsilon \text{ unit of } \mathcal{O}_\Delta\} = \Bigl\{ \frac{t + u\sqrt{\Delta}}{2} \Bigm| (t, u) \in \mathbb{Z}^2,\ t^2 - \Delta u^2 = \pm 4 \Bigr\} \tag{2.1} \]
and let $\varepsilon_\Delta > 1$ be the corresponding fundamental unit. The subgroup
\[ E_\Delta^{+} = \Bigl\{ \frac{t + u\sqrt{\Delta}}{2} \Bigm| (t, u) \in \mathbb{Z}^2,\ t > 0 \text{ and } t^2 - \Delta u^2 = +4 \Bigr\} \tag{2.2} \]
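of the totally positive units is generated by
\[ \varepsilon_\Delta^{+} := \begin{cases} \varepsilon_\Delta & \text{if } N(\varepsilon_\Delta) = +1, \\ \varepsilon_\Delta^{2} & \text{if } N(\varepsilon_\Delta) = -1, \end{cases} \tag{2.3} \]
where $N$ denotes the norm from $K$ to $\mathbb{Q}$. Note that the last case can only happen if $N(\varepsilon_{\Delta_0}) = -1$. We shall need the following

Proposition 2.1. Let $\eta \in E_\Delta^{+}$, $\eta > 1$. Then
\[ 0 < \mathrm{tr}(\eta) - \eta < 1/2. \]

Proof. Write $\eta = (t + u\sqrt{\Delta})/2$, where $t, u$ are positive integers. Hence $\mathrm{tr}(\eta) = t$ and so we need to show that
\[ t - 1/2 < \frac{t + u\sqrt{\Delta}}{2} < t. \]
The right-hand inequality holds because $u^2 \Delta = t^2 - 4 < t^2$. The left-hand inequality is equivalent to $t - 1 < u\sqrt{\Delta}$, that is, to $(t - 1)^2 < t^2 - 4$, which holds if and only if $t > 5/2$. So it can fail only if $t \le 5/2$, i.e. if and only if $t = 1, 2$. But in these two cases $u^2 \Delta = t^2 - 4 \le 0$, which cannot happen. Thus the proposition is established.

Let $I_\Delta$ be the set of $\mathbb{Z}$-modules $\mathfrak{a}$ of $K$ with coefficient ring $\mathcal{O}_\Delta$ (cf. [1], pp. 228, 320). The equivalence relations
\[
\begin{aligned}
\mathfrak{a}_1 \sim \mathfrak{a}_2 &:\Longleftrightarrow \mathfrak{a}_1 = \delta \mathfrak{a}_2 \quad \text{for some } \delta \in K^{\times}, \\
\mathfrak{a}_1 \approx \mathfrak{a}_2 &:\Longleftrightarrow \mathfrak{a}_1 = \delta \mathfrak{a}_2 \quad \text{for some } \delta \in K^{\times},\ N(\delta) > 0,
\end{aligned}
\tag{2.4}
\]
yield the wide and narrow class groups
\[ Cl(\Delta) := I_\Delta / \sim, \qquad Cl^{+}(\Delta) := I_\Delta / \approx, \tag{2.5} \]
respectively. They have, as is well known, finite class numbers $h(\Delta) := |Cl(\Delta)|$ and $h^{+}(\Delta) := |Cl^{+}(\Delta)|$, which are related as
\[ h^{+}(\Delta) = \begin{cases} 2h(\Delta) & \text{if } N(\varepsilon_\Delta) = +1, \\ h(\Delta) & \text{if } N(\varepsilon_\Delta) = -1. \end{cases} \tag{2.6} \]
For wide, respectively narrow, ideal classes $[\mathfrak{a}]$ and $\{\mathfrak{a}\}$ where $\mathfrak{a} \in I_\Delta$, we have
\[ [\mathfrak{a}] = \begin{cases} \{\mathfrak{a}\} \cup \{\mathfrak{a}^{*}\} & \text{if } N(\varepsilon_\Delta) = +1, \\ \{\mathfrak{a}\} & \text{if } N(\varepsilon_\Delta) = -1, \end{cases} \tag{2.7} \]
where the union is disjoint, and the complementary class $\{\mathfrak{a}^{*}\}$ is given by $\mathfrak{a}^{*} = (\alpha)\mathfrak{a}$, $\alpha \in K^{\times}$, $N(\alpha) < 0$ (cf. [1], p. 313 ff.).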
It is known that certain properties of real quadratic fields are related to continued fractions. Let us summarize some of these relations by following the presentation in [14] (wide theory for $f = 1$) and [9] (narrow theory). The unique map
\[ \mathbb{R} \setminus \mathbb{Q} \longrightarrow \mathbb{Z} \times \prod_{2}^{\infty} \mathbb{N} \]
sending $\alpha$ to $(a_1, a_2, \dots, a_n, \dots)$, where
\[
\begin{aligned}
\alpha &= a_1 + \frac{1}{\alpha_2}, & a_1 &:= [\alpha], \\
\alpha_2 &= a_2 + \frac{1}{\alpha_3}, & a_2 &:= [\alpha_2], \\
&\ \ \vdots \\
\alpha_n &= a_n + \frac{1}{\alpha_{n+1}}, & a_n &:= [\alpha_n], \\
&\ \ \vdots
\end{aligned}
\tag{2.8}
\]
has an inverse mapping $(a_1, a_2, \dots, a_n, \dots)$ to the infinite continued fraction
\[ \alpha = \lim_{n \to \infty} t_n =: [a_1, a_2, \dots, a_n, \dots], \]
where
\[ t_n = [a_1, a_2, \dots, a_n] = a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}} \]
is a finite continued fraction. It is a famous result of Lagrange that the continued fraction of a quadratic irrationality $\alpha$ is ultimately periodic and conversely. We denote such a continued fraction by
\[ \alpha = [a_1, a_2, \dots, a_k, \overline{a_{k+1}, \dots, a_{k+n}}], \]
where we write $\mathrm{per}(\alpha) := n$ when $n$ is the minimal period length. We further define the $n$th convergent for $n \ge 1$ as
\[ \frac{P_n}{Q_n} = [a_1, a_2, \dots, a_n], \]
where $P_n$ and $Q_n$ are relatively prime integers with $Q_n > 0$. We then have, for $n \ge 2$,
\[
\begin{aligned}
P_0 &= 1, & P_1 &= a_1, & P_n &= a_n P_{n-1} + P_{n-2}, \\
Q_0 &= 0, & Q_1 &= 1, & Q_n &= a_n Q_{n-1} + Q_{n-2}.
\end{aligned}
\tag{2.9}
\]
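Basic formulas concerning the convergents are
\[ P_n Q_{n-1} - P_{n-1} Q_n = (-1)^n \tag{2.10} \]
and
\[ \alpha = [a_1, a_2, \dots, a_n, \dots] = [a_1, a_2, \dots, a_n, x] = \frac{x P_n + P_{n-1}}{x Q_n + Q_{n-1}} \tag{2.11} \]
for $n \ge 1$. In order to write this equation in a more convenient matrix form we introduce
\[ M(a) := \begin{pmatrix} a & 1 \\ 1 & 0 \end{pmatrix} \quad \text{for } a \in \mathbb{Z}, \qquad M(a_1, a_2, \dots, a_n) := M(a_1) \cdot M(a_2) \cdots M(a_n), \tag{2.12} \]
where $(a_1, a_2, \dots, a_n) \in \mathbb{Z}^n$ and the matrices $M$ are understood as elements of
\[ G := GL_2(\mathbb{Z}) / \pm I, \quad \text{where } I := \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
More precisely, the elements $M = \pm \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ act on $x \in \mathbb{R}$ by
\[ M \circ x := \frac{ax + b}{cx + d}. \tag{2.13} \]
Notice the identity
\[ M(a) \circ x = \frac{ax + 1}{x} = a + \frac{1}{x}. \]
So, if $\alpha = [a_1, a_2, \dots, a_n, \dots]$ is a continued fraction, one can immediately check by induction using (2.9) that
\[ M(a_1, a_2, \dots, a_n) = \begin{pmatrix} P_n & P_{n-1} \\ Q_n & Q_{n-1} \end{pmatrix} \tag{2.14} \]
for $n \ge 1$. Moreover for $m, k \ge 1$, if $\alpha = [\overline{a_1, \dots, a_m}]$, where $m$ is not necessarily minimal, then
\[ M(a_1, a_2, \dots, a_m)^k = \begin{pmatrix} P_m & P_{m-1} \\ Q_m & Q_{m-1} \end{pmatrix}^{k} = \begin{pmatrix} P_{km} & P_{km-1} \\ Q_{km} & Q_{km-1} \end{pmatrix}. \tag{2.15} \]
With (2.14) we get a reformulation of (2.11) as
\[ \alpha = [a_1, a_2, \dots, a_n, x] = M(a_1, a_2, \dots, a_n) \circ x = \begin{pmatrix} P_n & P_{n-1} \\ Q_n & Q_{n-1} \end{pmatrix} \circ x. \tag{2.16} \]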
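The recurrence (2.9), the matrix product (2.14) and the determinant formula (2.10) can be checked together on a small example. The following is a minimal sketch; the function name `convergent_matrix` is hypothetical, and the identity matrix is used as the empty product, which matches $P_0 = 1$, $Q_0 = 0$.

```python
from fractions import Fraction


def convergent_matrix(partial_quotients):
    """Return ((P_n, P_{n-1}), (Q_n, Q_{n-1})) = M(a_1) M(a_2) ... M(a_n)."""
    p, p_prev, q, q_prev = 1, 0, 0, 1  # identity matrix as the empty product
    for a in partial_quotients:
        # Right-multiplication by M(a) = [[a, 1], [1, 0]] is the recurrence (2.9).
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return (p, p_prev), (q, q_prev)


(p_n, p_prev), (q_n, q_prev) = convergent_matrix([1, 2, 3])
print(Fraction(p_n, q_n))             # 10/7, i.e. 1 + 1/(2 + 1/3)
print(p_n * q_prev - p_prev * q_n)    # -1 = (-1)^3, as in (2.10)
```

Keeping only the two columns of the running product avoids any matrix library and makes the link between (2.9) and (2.14) explicit.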
We now return to the quadratic field $K$ and consider the set of quadratic irrationalities with discriminant $\Delta$, which we denote by
\[ X(\Delta) := \{\alpha \mid \alpha \text{ quadratic irrationality},\ \mathrm{disc}(\alpha) = \Delta\} \subset K. \]
As we noticed above, the elements of $X(\Delta)$ have an ultimately periodic continued fraction expansion. Moreover the group of matrices $G$ operates on $X(\Delta)$ in the sense of (2.13). Let $X(\Delta)/G$ be the resulting set of equivalence classes. We define the subset of reduced elements as
\[
\begin{aligned}
R(\Delta) &:= \{\omega \in X(\Delta) \mid \omega > 1,\ 0 > \omega' > -1\} \\
&= \Bigl\{ \frac{b + \sqrt{\Delta}}{2a} \Bigm| a, b \in \mathbb{Z},\ a > 0,\ 0 < b < \sqrt{\Delta},\ 4a \mid (b^2 - \Delta), \\
&\qquad\quad \Bigl(a, b, \frac{b^2 - \Delta}{4a}\Bigr) = 1,\ \frac{-b + \sqrt{\Delta}}{2a} < 1 < \frac{b + \sqrt{\Delta}}{2a} \Bigr\},
\end{aligned}
\tag{2.17}
\]
where $\omega'$ is the conjugate of $\omega$. It is well known that these reduced elements are purely periodic and conversely; namely
\[ \omega = [\overline{a_1, a_2, \dots, a_n}]. \tag{2.18} \]
Moreover, if $n$ is minimal, we define the following period lengths of $\omega$ by
\[ \mathrm{per}(\omega) := n, \qquad \mathrm{eper}(\omega) := \begin{cases} n & \text{if } n \text{ is even}, \\ 2n & \text{if } n \text{ is odd}. \end{cases} \]
Notice that a reduced $\omega$ can always be expanded as
\[ \omega = [\overline{a_1, a_2, \dots, a_{\mathrm{eper}(\omega)}}]. \tag{2.19} \]
Let us summarize some important facts concerning the reduction theory, which is essentially analogous to the known (narrow) theory of quadratic forms.

(2.20) The number of elements in $R(\Delta)$ is finite. We call this number the caliber of the discriminant $\Delta$ and denote it by $\kappa(\Delta)$, i.e. $\kappa(\Delta) := |R(\Delta)|$.

(2.21) An element
\[ \alpha = [a_1, a_2, \dots, a_k, \overline{a_{k+1}, \dots, a_{k+n}}] \in X(\Delta) \]
is $G$-equivalent to the reduced element $\omega := [\overline{a_{k+1}, a_{k+2}, \dots, a_{k+n}}]$; more precisely, using (2.16),
\[ \alpha = [a_1, a_2, \dots, a_k, \omega] = M(a_1, a_2, \dots, a_k) \circ \omega. \]
Each class of $X(\Delta)/G$ can therefore be represented by a reduced element. By (2.20), $X(\Delta)/G$ is a finite set. We shall see later that the number of classes $|X(\Delta)/G|$ equals $h(\Delta) = |Cl(\Delta)|$.
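(2.22) If we start with a reduced element
\[ \omega_1 := \omega = [\overline{a_1, a_2, \dots, a_n}] \in R(\Delta), \]
we get all the reduced elements which are $G$-equivalent to $\omega$ as
\[ \omega_1 := \omega, \qquad \omega_{k+1} = M(a_k)^{-1} \circ \omega_k, \quad k = 1, \dots, n - 1. \]
According to (2.16) this shows that
\[ \omega_j = [\overline{a_j, a_{j+1}, \dots, a_n, a_1, \dots, a_{j-1}}] \quad \text{for } j = 1, \dots, n, \]
are exactly the reduced elements which are $G$-equivalent to $\omega$. They are produced by cyclic permutation of the period of the continued fraction of $\omega$. Thus the set of the coefficients of the period and its length $\mathrm{per}(\omega)$ are invariants of the equivalence class $\mathrm{cl}(\omega) \in X(\Delta)/G$ containing $\omega$.

(2.23) We mention that the mapping
\[ I_\Delta \longrightarrow X(\Delta)/G, \qquad (\alpha, \beta)_{\mathbb{Z}} \longmapsto \mathrm{cl}(\beta/\alpha) \]
(with ordered basis $(\alpha, \beta)$) induces a bijection
\[ Cl(\Delta) := I_\Delta / \sim\ \ \simeq\ X(\Delta)/G. \]
This yields in particular $|X(\Delta)/G| = h(\Delta)$.

We now turn to the units (2.1) of the quadratic field and describe their connections to reduced elements and continued fractions.

(2.24) For $\omega \in R(\Delta)$ let
\[ F_\omega := \{M \in G \mid M \circ \omega = \omega\} \]
be the fixgroup of $\omega$. If $\omega := [\overline{a_1, a_2, \dots, a_n}]$ ($n$ not necessarily minimal) we have, using (2.14) and (2.16), $M(a_1, a_2, \dots, a_n) \circ \omega = \omega$ and thus
\[ M(a_1, a_2, \dots, a_n) = \begin{pmatrix} P_n & P_{n-1} \\ Q_n & Q_{n-1} \end{pmatrix} \in F_\omega. \]

(2.25) The map $F_\omega \to E_\Delta$ given by
\[ \lambda_\omega : M := \begin{pmatrix} a & b \\ c & d \end{pmatrix} \longmapsto \varepsilon := c\omega + d \]
is an isomorphism
\[ \lambda_\omega : F_\omega \xrightarrow{\ \sim\ } E_\Delta \]
with
\[ \det(M) = N(\lambda_\omega(M)) = N(c\omega + d), \qquad \mathrm{Tr}(M) = \mathrm{tr}(\lambda_\omega(M)) = \mathrm{tr}(c\omega + d), \tag{2.26} \]
where $\mathrm{Tr}$ is the matrix trace and $\mathrm{tr}$ denotes the trace from $K$ to $\mathbb{Q}$.

(2.27) By studying the above isomorphism, it can be shown that for each $\omega = [\overline{a_1, a_2, \dots, a_n}] \in R(\Delta)$, $n = \mathrm{per}(\omega)$, and
\[ \Pi := M(a_1, a_2, \dots, a_n) = \begin{pmatrix} P_n & P_{n-1} \\ Q_n & Q_{n-1} \end{pmatrix} \]
(cf. (2.14)) we have $\lambda_\omega(\Pi) = \varepsilon_\Delta$. Computing the image of the map, we get
\[ \varepsilon_\Delta = Q_n \omega + Q_{n-1}. \tag{2.28} \]
Since $N(\varepsilon_\Delta) = N(\lambda_\omega(\Pi)) = \det(\Pi) = (-1)^n$, as stated in (2.10), we can claim
\[ N(\varepsilon_\Delta) = 1 \ \text{ iff } n \text{ is even}, \qquad N(\varepsilon_\Delta) = -1 \ \text{ iff } n \text{ is odd}. \tag{2.29} \]
Thus the parity of the period length of a reduced element, i.e. the caliber of an equivalence class (cf. (2.22)), is an invariant of the ring $\mathcal{O}_\Delta$ of $K$. We establish a formula analogous to (2.28) for the totally positive fundamental unit $\varepsilon_\Delta^{+}$. With $l := \mathrm{eper}(\omega)$, let
\[ \Pi^{+} := \begin{pmatrix} P_l & P_{l-1} \\ Q_l & Q_{l-1} \end{pmatrix} = \begin{cases} \Pi & \text{if } n \text{ is even, i.e. } l = n, \\ \Pi^{2} & \text{if } n \text{ is odd, i.e. } l = 2n. \end{cases} \tag{2.30} \]
The last case follows from (2.15). Consequently, using (2.29), we obtain
\[ \lambda_\omega(\Pi^{+}) = \begin{cases} \lambda_\omega(\Pi) = \varepsilon_\Delta & \text{if } N(\varepsilon_\Delta) = 1 \\ \lambda_\omega(\Pi^{2}) = \varepsilon_\Delta^{2} & \text{if } N(\varepsilon_\Delta) = -1 \end{cases} \;=\; \varepsilon_\Delta^{+}, \tag{2.31} \]
and computing the image on the left-hand side
\[ \varepsilon_\Delta^{+} = Q_l \omega + Q_{l-1}, \tag{2.32} \]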
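where again $l = \mathrm{eper}(\omega)$. Note that $\Pi^{+}$ is the generator of the cyclic subgroup $\lambda_\omega^{-1}(E_\Delta^{+})$ of $F_\omega$ because $\varepsilon_\Delta^{+}$ is the generator of the isomorphic cyclic group $E_\Delta^{+} \subseteq E_\Delta$. With respect to these results, we shall find it convenient to introduce the following notation:
\[ \varepsilon(\omega) := \varepsilon_\Delta = Q_n \omega + Q_{n-1}, \qquad \varepsilon^{+}(\omega) := \varepsilon_\Delta^{+} = Q_l \omega + Q_{l-1}, \tag{2.33} \]
where $\omega$ is any reduced element in $R(\Delta)$, $l = \mathrm{eper}(\omega)$, $n = \mathrm{per}(\omega)$, and so in particular $\mathrm{disc}(\omega) = \Delta$.

(2.34) The inverse map $E_\Delta \xrightarrow{\ \sim\ } F_\omega$ of $\lambda_\omega$ (cf. (2.25)) turns out to be
\[ \varepsilon := \frac{t + u\sqrt{\Delta}}{2} \longmapsto \begin{pmatrix} \dfrac{t - bu}{2} & -cu \\[2mm] au & \dfrac{t + bu}{2} \end{pmatrix}, \]
where $ax^2 + bx + c = 0$ ($a, b, c \in \mathbb{Z}$, $a > 0$, $(a, b, c) = 1$) is the minimal equation of $\omega$.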
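The explicit description (2.17) makes the finiteness statement (2.20) effective: the reduced elements of a given discriminant can be listed by a finite search. The sketch below is an illustration under that reading of (2.17); the function name `reduced_elements` is hypothetical, and all comparisons with $\sqrt{\Delta}$ are carried out on squares so that only integer arithmetic is used.

```python
from math import gcd, isqrt


def reduced_elements(disc):
    """List the pairs (a, b) for which (b + sqrt(disc)) / (2a) satisfies (2.17)."""
    if disc <= 0 or disc % 4 not in (0, 1) or isqrt(disc) ** 2 == disc:
        raise ValueError("expected a positive non-square discriminant")
    found = []
    for b in range(1, isqrt(disc) + 1):        # 0 < b < sqrt(disc)
        if (disc - b * b) % 4:
            continue
        n = (disc - b * b) // 4                # n = a * |c| > 0
        for a in range(1, n + 1):
            if n % a:
                continue                       # need 4a | (b^2 - disc)
            c = -(n // a)                      # c = (b^2 - disc) / (4a)
            if gcd(gcd(a, b), c) != 1:
                continue
            # (sqrt(disc) - b) / (2a) < 1  <=>  disc < (2a + b)^2
            below_one = disc < (2 * a + b) ** 2
            # 1 < (sqrt(disc) + b) / (2a)  <=>  2a - b < sqrt(disc)
            above_one = 2 * a - b <= 0 or (2 * a - b) ** 2 < disc
            if below_one and above_one:
                found.append((a, b))
    return found


for disc in (5, 8, 12, 13):
    print(disc, reduced_elements(disc))   # the length of each list is the caliber
```

For $\Delta = 12$ the two pairs found correspond to $1 + \sqrt{3}$ and $(1 + \sqrt{3})/2$, whose continued fraction periods are cyclic shifts of each other, in line with (2.22).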
3. Number Theoretic Functions Related to the Problem

In Sect. 4, containing our main result, we need some knowledge about particular number theoretic functions. We find it convenient to present these more technical results in this separate section. Let $m$ and $a$ be positive integers. Then
\[ P_m(a) := \Bigl\{ b \in \mathbb{Z} \Bigm| 0 \le b < 2a,\ b^2 \equiv m \bmod 4a,\ (a, b, c) = 1,\ c = \frac{b^2 - m}{4a} \Bigr\}; \]
moreover, let
\[ \rho_m(a) = |P_m(a)|. \tag{3.1} \]
Now we study properties of $\rho_\Delta(a)$. Our immediate goal is to prove Proposition 3.12 below. This proposition is surely not new, but since we could not find it explicitly in the literature, we shall give a proof here. We start by making some more definitions. For $m$ and $a$ positive integers, let
\[ P'_m(a) := \{b \in \mathbb{Z} \mid 0 \le b < 2a,\ b^2 \equiv m \bmod 4a\}, \qquad P''_m(a) := \{b \in \mathbb{Z} \mid 0 \le b < a,\ b^2 \equiv m \bmod a\}. \]
Furthermore, define $\rho'_m(a) = |P'_m(a)|$, and $\rho''_m(a) = |P''_m(a)|$.
We shall compute $\rho''_m(a)$ first and then use this computation to calculate $\rho'_m(a)$. In order to calculate $\rho''_m(a)$, we need only consider $a$ as an arbitrary prime power, since by the Chinese Remainder Theorem
\[ \rho''_m(a) = \prod_{p \mid a} \rho''_m\bigl(p^{\mathrm{ord}_p a}\bigr), \]
where the product is over the primes dividing $a$. (Notice that $\rho''_m(1) = 1$.) We consider two cases, first when $p \nmid m$, and then when $p \mid m$.

Lemma 3.1. Suppose $p \nmid m$. Then for all $\ell \ge 1$,
\[ \rho''_m(p^{\ell}) = \begin{cases}
1 + \bigl(\frac{m}{p}\bigr) & \text{if } p > 2, \\
1 & \text{if } p = 2,\ \ell = 1, \\
0 & \text{if } p = 2,\ \ell = 2,\ m \equiv 3 \bmod 4, \\
2 & \text{if } p = 2,\ \ell = 2,\ m \equiv 1 \bmod 4, \\
0 & \text{if } p = 2,\ \ell > 2,\ m \not\equiv 1 \bmod 8, \\
4 & \text{if } p = 2,\ \ell > 2,\ m \equiv 1 \bmod 8,
\end{cases} \]
where $(m/p)$ represents the Legendre symbol.

Proof. For a proof, see for example Landau [12], Theorem 87.
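The case distinction in Lemma 3.1 can be compared with a brute-force count of the solutions of $b^2 \equiv m$ modulo a prime power. The sketch below is only a numerical illustration, not a proof; the helper names `count_square_roots` and `legendre` are hypothetical, and the Legendre symbol is computed by Euler's criterion.

```python
def count_square_roots(m, modulus):
    """Number of b with 0 <= b < modulus and b^2 congruent to m mod modulus."""
    return sum(1 for b in range(modulus) if (b * b - m) % modulus == 0)


def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, via Euler's criterion."""
    residue = pow(m, (p - 1) // 2, p)
    return -1 if residue == p - 1 else residue


# Odd primes: the count is 1 + (m/p) for every exponent.
for p in (3, 5, 7):
    for m in range(1, 20):
        if m % p:
            for exponent in (1, 2, 3):
                assert count_square_roots(m, p**exponent) == 1 + legendre(m, p)

# p = 2: the count depends on m mod 4 (exponent 2) and on m mod 8 (exponent > 2).
print(count_square_roots(1, 2), count_square_roots(3, 4), count_square_roots(1, 4))
print(count_square_roots(5, 8), count_square_roots(1, 8), count_square_roots(17, 16))
```

The same counter also illustrates the product formula over prime powers: for coprime moduli the counts multiply, as the Chinese Remainder Theorem predicts.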
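Lemma 3.2. Suppose $p \mid m$, say $m = p^e m_1$ with $p \nmid m_1$. Then
\[ \rho''_m(p^{\ell}) = \begin{cases}
p^{[\ell/2]} & \text{if } \ell \le e, \\
0 & \text{if } \ell > e,\ e \text{ odd}, \\
p^{e/2}\bigl(1 + \bigl(\frac{m_1}{p}\bigr)\bigr) & \text{if } \ell > e,\ e \text{ even},\ p > 2, \\
p^{e/2} & \text{if } \ell = e + 1,\ e \text{ even},\ p = 2, \\
0 & \text{if } \ell = e + 2,\ e \text{ even},\ p = 2,\ m_1 \equiv 3 \bmod 4, \\
2p^{e/2} & \text{if } \ell = e + 2,\ e \text{ even},\ p = 2,\ m_1 \equiv 1 \bmod 4, \\
0 & \text{if } \ell > e + 2,\ e \text{ even},\ p = 2,\ m_1 \not\equiv 1 \bmod 8, \\
4p^{e/2} & \text{if } \ell > e + 2,\ e \text{ even},\ p = 2,\ m_1 \equiv 1 \bmod 8.
\end{cases} \]

Proof. First, suppose $\ell \le e$. Then
\[ b^2 \equiv p^e m_1 \bmod p^{\ell} \iff b^2 \equiv 0 \bmod p^{\ell} \iff b \equiv 0 \bmod p^{\lceil \ell/2 \rceil}, \]
where $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$. Hence the number of solutions for $b \in [0, p^{\ell})$ satisfying $b^2 \equiv m \bmod p^{\ell}$ is $p^{\ell - \lceil \ell/2 \rceil} = p^{[\ell/2]}$, as desired. Second, suppose $\ell > e$ and $e$ is odd. Then there are no solutions for $b$ satisfying the congruence $b^2 \equiv p^e m_1 \bmod p^{\ell}$. Finally, suppose $\ell > e$ and $e$ is even, say $e = 2e_1$. Suppose $b$ satisfies $b^2 \equiv p^{2e_1} m_1 \bmod p^{\ell}$. This can happen if and only if $b = p^{e_1} b_1$ for some $b_1 \not\equiv 0 \bmod p$, with $b_1^2 \equiv m_1 \bmod p^{\ell - 2e_1}$. Now each such $b_1$ pulls back to $p^{e_1}$ solutions for $b \in [0, p^{\ell})$ satisfying the congruence $b^2 \equiv p^{2e_1} m_1 \bmod p^{\ell}$, namely $b = p^{e_1}(b_1 + p^{\ell - 2e_1} k)$, for any integer $k$. This consideration along with Lemma 3.1 proves the lemma.

Next, we relate the function $\rho'_m$ to $\rho''_m$.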
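Lemma 3.3. Let $m$ and $a$ be positive integers. Then
\[ \rho'_m(a) = \prod_{p \mid 2a} \frac{1}{p^{\delta(2,p)}}\, \rho''_m\bigl(p^{\mathrm{ord}_p a + 2\delta(2,p)}\bigr), \]
where
\[ \delta(2, p) = \begin{cases} 1 & \text{if } p = 2, \\ 0 & \text{if } p > 2. \end{cases} \]

Proof. This is an immediate consequence of the Chinese Remainder Theorem and the definitions of $\rho'$ and $\rho''$.

Lemma 3.4. Let $\Delta$ be a positive non-square discriminant. Then $\rho'_\Delta(a)$ is multiplicative in the argument $a$.

Proof. Let $a_1, a_2$ be relatively prime positive integers. Then by assumption and Lemma 3.3,
\[
\begin{aligned}
\rho'_\Delta(a_1 a_2) &= \prod_{p \mid 2a_1a_2} \frac{1}{p^{\delta(2,p)}}\, \rho''_\Delta\bigl(p^{\mathrm{ord}_p(a_1a_2) + 2\delta(2,p)}\bigr) \\
&= \prod_{\substack{p \mid a_1 \\ p > 2}} \rho''_\Delta\bigl(p^{\mathrm{ord}_p a_1}\bigr) \prod_{\substack{p \mid a_2 \\ p > 2}} \rho''_\Delta\bigl(p^{\mathrm{ord}_p a_2}\bigr) \cdot \frac{1}{2}\, \rho''_\Delta\bigl(2^{\mathrm{ord}_2(a_1a_2) + 2}\bigr).
\end{aligned}
\]
Now, if $2 \nmid a_1a_2$, then
\[ \frac{1}{2}\, \rho''_\Delta\bigl(2^{\mathrm{ord}_2(a_1a_2) + 2}\bigr) = \frac{1}{2}\, \rho''_\Delta(4) = 1, \]
by Lemmas 3.1 and 3.2, since $\Delta \equiv 0, 1 \bmod 4$. This implies that
\[ \rho'_\Delta(a_1a_2) = \rho'_\Delta(a_1)\, \rho'_\Delta(a_2) \]
in this case. The proof is similar when $a_1a_2$ is even. Thus the lemma is proved.

Lemma 3.5. Let $\Delta$ be a positive non-square discriminant and $\Delta_0$ the corresponding fundamental discriminant. Suppose $\Delta = b^2 - 4ac$ and that $p$ is a prime such that $p \mid (a, b, c)$. Then $p^2 \mid \Delta/\Delta_0$.

Proof. Write $(a, b, c) = p(a_1, b_1, c_1)$. Then $\Delta = p^2(b_1^2 - 4a_1c_1)$. Then $b_1^2 - 4a_1c_1$ is again a discriminant associated with the same fundamental discriminant $\Delta_0$. Thus we know that $b_1^2 - 4a_1c_1 = f_1^2 \Delta_0$ for some positive integer $f_1$. Therefore, $\Delta = (f_1 p)^2 \Delta_0$, as desired.

Lemma 3.6. Suppose $(a, \Delta/\Delta_0) = 1$. Then $\rho_\Delta(a) = \rho'_\Delta(a)$.

Proof. Suppose $\Delta = b^2 - 4ac$ and $p$ is a prime with $p \mid (a, b, c)$. Then $p \mid a$ and $p^2 \mid (\Delta/\Delta_0)$ by Lemma 3.5, contrary to assumption.

More generally, we have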
Asymptotic of a Function in Statistical Mechanics
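Lemma 3.7. Let $\Delta$ be a positive non-square discriminant and $a$ any positive integer. Then
\[
\rho_\Delta(a) = \sum_{\substack{g \mid a \\ g^2 \mid (\Delta/\Delta_0)}} \mu(g)\, \rho'_{\Delta/g^2}\!\left(\frac{a}{g}\right).
\]

Proof. If $(a, \Delta/\Delta_0) = 1$, then the lemma follows from Lemma 3.6. On the other hand, suppose $(a, \Delta/\Delta_0) = p_1^{e_1} \cdots p_t^{e_t}$, where the $p_i$ are distinct primes. Then, by inclusion-exclusion,
\[
\rho_\Delta(a) = \rho'_\Delta(a) - \sum_{i=1}^{t} \rho'_{\Delta/p_i^2}\!\left(\frac{a}{p_i}\right) + \sum_{i<j} \rho'_{\Delta/(p_i^2 p_j^2)}\!\left(\frac{a}{p_i p_j}\right) - + \cdots,
\]
as desired.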
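Lemma 3.8. $\rho_\Delta(a)$ is multiplicative in the argument $a$.

Proof. Suppose $(a_1, a_2) = 1$. Then by the multiplicativity of $\rho'$, we have
\[
\rho_\Delta(a_1 a_2) = \sum_{\substack{g_i \mid a_i \\ g_i^2 \mid (\Delta/\Delta_0)}} \mu(g_1)\, \mu(g_2)\, \rho'_{\Delta/(g_1^2 g_2^2)}\!\left(\frac{a_1}{g_1}\right) \rho'_{\Delta/(g_1^2 g_2^2)}\!\left(\frac{a_2}{g_2}\right).
\]
But since $(g_2, a_1/g_1) = 1$, it follows from Lemma 3.3 that
\[
\rho'_{\Delta/(g_1^2 g_2^2)}\!\left(\frac{a_1}{g_1}\right) = \rho'_{\Delta/g_1^2}\!\left(\frac{a_1}{g_1}\right).
\]
Similarly,
\[
\rho'_{\Delta/(g_1^2 g_2^2)}\!\left(\frac{a_2}{g_2}\right) = \rho'_{\Delta/g_2^2}\!\left(\frac{a_2}{g_2}\right).
\]
This establishes the lemma.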
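The Möbius-type formula of Lemma 3.7 lends itself to a brute-force check. The sketch below assumes (as suggested by the proof of Proposition 4.11, the definition being outside this passage) that the primitive count runs over $b \in [0, 2a)$ with $b^2 \equiv \Delta \bmod 4a$ and $\gcd(a, b, c) = 1$, $c = (b^2 - \Delta)/4a$. The condition $g^2 \mid \Delta/\Delta_0$ is tested in the equivalent form "$g^2 \mid \Delta$ and $\Delta/g^2 \equiv 0, 1 \bmod 4$", which avoids computing $\Delta_0$. All names are hypothetical.

```python
from math import gcd


def rho_all(delta, a):
    """Count b in [0, 2a) with b*b congruent to delta modulo 4a (no gcd condition)."""
    return sum(1 for b in range(2 * a) if (b * b - delta) % (4 * a) == 0)


def rho_primitive(delta, a):
    """Same count, restricted to gcd(a, b, c) = 1 with c = (b*b - delta) / (4a)."""
    count = 0
    for b in range(2 * a):
        if (b * b - delta) % (4 * a) == 0:
            c = (b * b - delta) // (4 * a)
            if gcd(gcd(a, b), c) == 1:
                count += 1
    return count


def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # square factor
            result = -result
        p += 1
    return -result if n > 1 else result


def rho_via_mobius(delta, a):
    """Right-hand side of the Lemma 3.7 formula."""
    total = 0
    for g in range(1, a + 1):
        if a % g == 0 and delta % (g * g) == 0 and (delta // (g * g)) % 4 in (0, 1):
            total += mobius(g) * rho_all(delta // (g * g), a // g)
    return total
```

For example, with $\Delta = 20$ and $a = 2$ the only candidate is $b = 2$, which gives $c = -2$ and $\gcd(2, 2, -2) = 2$, so the primitive count is $0$; the formula gives $1 - 1 = 0$ as well.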
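By this lemma, we need only compute $\rho_\Delta(p^\ell)$. Also notice that $\rho_\Delta(1) = 1$ by definition of $\rho_\Delta$ and since $\Delta \equiv 0, 1 \bmod 4$. To simplify our presentation, we first consider $p \nmid (\Delta/\Delta_0)$.

Lemma 3.9. Suppose $p \nmid (\Delta/\Delta_0)$; then
\[
\rho_\Delta(p^\ell) = \begin{cases} 1 + \left(\frac{\Delta}{p}\right) & \text{if } p \nmid \Delta, \\ 1 & \text{if } p \mid \Delta,\ \ell = 1, \\ 0 & \text{if } p \mid \Delta,\ \ell > 1. \end{cases}
\]

Proof. Since $p \nmid (\Delta/\Delta_0)$, $\rho_\Delta(p^\ell) = \rho'_\Delta(p^\ell)$, by Lemma 3.6. But then by Lemma 3.3, $\rho'_\Delta(p^\ell) = (1/2)\, \rho''_\Delta(4)\, \rho''_\Delta(p^\ell) = \rho''_\Delta(p^\ell)$ if $p > 2$, while $\rho'_\Delta(2^\ell) = (1/2)\, \rho''_\Delta(2^{\ell+2})$. If $p \nmid \Delta$, then $\rho'_\Delta(p^\ell) = 1 + (\Delta/p)$, by Lemma 3.1, where we as usual define
\[
\left(\frac{\Delta}{2}\right) = \begin{cases} 1 & \text{if } \Delta \equiv 1 \bmod 8, \\ -1 & \text{if } \Delta \equiv 5 \bmod 8. \end{cases}
\]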
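If, on the other hand, $p \mid \Delta$, but $p \nmid (\Delta/\Delta_0)$, then $\operatorname{ord}_p \Delta = 1$ when $p$ is odd and is 2 or 3 if $p = 2$. From this it follows by Lemma 3.2 that
\[
\rho_\Delta(p^\ell) = \begin{cases} 1 & \text{if } \ell = 1, \\ 0 & \text{if } \ell > 1. \end{cases}
\]

Lemma 3.10. Suppose $p \mid (\Delta/\Delta_0)$, say $\Delta = p^e d_1$, where $p \nmid d_1$. Finally, suppose $e$ is odd. (Notice then that $e \ge 3$.) Then
\[
\rho_\Delta(p^\ell) = \begin{cases} 0 & \text{if } \ell \le e - 1 - 2\delta(2,p),\ \ell \text{ odd}, \\ \varphi(p^{\ell/2}) & \text{if } \ell \le e - 1 - 2\delta(2,p),\ \ell \text{ even}, \\ p^{[\ell/2]} & \text{if } \ell = e - 2\delta(2,p), \\ 0 & \text{if } \ell > e - 2\delta(2,p). \end{cases}
\]

Proof. By Lemma 3.7, $\rho_\Delta(p^\ell) = \rho'_\Delta(p^\ell) - \rho'_{\Delta/p^2}(p^{\ell-1})$. First assume $p$ is odd. By Lemma 3.3,
\[
\rho'_\Delta(p^\ell) - \rho'_{\Delta/p^2}(p^{\ell-1}) = \rho''_\Delta(p^\ell) - \rho''_{\Delta/p^2}(p^{\ell-1}).
\]
Thus by Lemma 3.2,
\[
\rho''_\Delta(p^\ell) - \rho''_{\Delta/p^2}(p^{\ell-1}) = \begin{cases} p^{[\ell/2]} - p^{[(\ell-1)/2]} & \text{if } \ell \le e - 1, \\ p^{[\ell/2]} & \text{if } \ell = e, \\ 0 & \text{if } \ell > e. \end{cases}
\]
But notice that $p^{[\ell/2]} - p^{[(\ell-1)/2]} = 0$ if $\ell$ is odd and $= p^{\ell/2} - p^{(\ell/2)-1} = \varphi(p^{\ell/2})$ if $\ell$ is even.

Now suppose $p = 2$. Then by Lemma 3.3,
\[
\rho'_\Delta(2^\ell) - \rho'_{\Delta/4}(2^{\ell-1}) = \frac12\, \rho''_\Delta(2^{\ell+2}) - \frac12\, \rho''_{\Delta/4}(2^{\ell+1}).
\]
But then by Lemma 3.2,
\[
\frac12\, \rho''_\Delta(2^{\ell+2}) - \frac12\, \rho''_{\Delta/4}(2^{\ell+1}) = \begin{cases} \frac12\bigl(2^{[(\ell+2)/2]} - 2^{[(\ell+1)/2]}\bigr) & \text{if } \ell \le e - 3, \\ \frac12\, 2^{[(\ell+2)/2]} & \text{if } \ell = e - 2, \\ 0 & \text{if } \ell > e - 2. \end{cases}
\]
The rest follows as in the odd case.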
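As an illustration of Lemma 3.10 for odd $p$, one can compare the stated values with a direct count. This is a sketch under the assumption that $\rho_\Delta(a)$ counts $b \in [0, 2a)$ with $b^2 \equiv \Delta \bmod 4a$ and $\gcd(a, b, (b^2-\Delta)/4a) = 1$; the function names are hypothetical. The test discriminants $\Delta = 189 = 3^3 \cdot 7$ and $\Delta = 125 = 5^3$ both have $e = 3$.

```python
from math import gcd


def rho_primitive(delta, a):
    """Count b in [0, 2a) with b*b = delta (mod 4a) and gcd(a, b, c) = 1."""
    count = 0
    for b in range(2 * a):
        if (b * b - delta) % (4 * a) == 0:
            c = (b * b - delta) // (4 * a)
            if gcd(gcd(a, b), c) == 1:
                count += 1
    return count


def lemma_3_10_value(p, e, l):
    """Value stated in Lemma 3.10 for an odd prime p and odd exponent e."""
    if l < e:
        if l % 2 == 1:
            return 0
        return p ** (l // 2) - p ** (l // 2 - 1)  # Euler phi of p**(l/2)
    if l == e:
        return p ** (l // 2)
    return 0
```

For $\Delta = 189$ the values for $\ell = 1, 2, 3, 4$ are $0, 2, 3, 0$.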
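Lemma 3.11. Suppose $p \mid (\Delta/\Delta_0)$, say $\Delta = p^e d_1$, $p \nmid d_1$. Suppose $e$ is even. Then for $p > 2$,
\[
\rho_\Delta(p^\ell) = \begin{cases} 0 & \text{if } \ell \le e - 1,\ \ell \text{ odd}, \\ \varphi(p^{\ell/2}) & \text{if } \ell \le e - 1,\ \ell \text{ even}, \\ p^{e/2} - p^{e/2-1}\bigl(1 + \bigl(\tfrac{d_1}{p}\bigr)\bigr) & \text{if } \ell = e, \\ \varphi(p^{e/2})\bigl(1 + \bigl(\tfrac{d_1}{p}\bigr)\bigr) & \text{if } \ell > e; \end{cases}
\]
for $p = 2$ and $e \ge 4$,
\[
\rho_\Delta(2^\ell) = \begin{cases} 0 & \text{if } \ell \le e - 2,\ \ell \text{ odd}, \\ \varphi(2^{\ell/2}) & \text{if } \ell \le e - 2,\ \ell \text{ even}, \\ 2^{e/2-1} & \text{if } \ell = e - 1,\ d_1 \equiv 3 \bmod 4, \\ 0 & \text{if } \ell = e - 1,\ d_1 \equiv 1 \bmod 4, \\ 0 & \text{if } \ell = e,\ d_1 \equiv 3 \bmod 4, \\ 2^{e/2-1}\bigl(1 - \bigl(\tfrac{d_1}{2}\bigr)\bigr) & \text{if } \ell = e,\ d_1 \equiv 1 \bmod 4, \\ 0 & \text{if } \ell > e,\ d_1 \equiv 3 \bmod 4, \\ \varphi(2^{e/2})\bigl(1 + \bigl(\tfrac{d_1}{2}\bigr)\bigr) & \text{if } \ell > e,\ d_1 \equiv 1 \bmod 4; \end{cases}
\]
and finally for $p = 2$ and $e = 2$ (in which case $\Delta = 4d_1$ with $d_1 \equiv 1 \bmod 4$),
\[
\rho_\Delta(2^\ell) = \begin{cases} 0 & \text{if } \ell = 1, \\ 1 - \bigl(\tfrac{d_1}{2}\bigr) & \text{if } \ell = 2, \\ 1 + \bigl(\tfrac{d_1}{2}\bigr) & \text{if } \ell > 2. \end{cases}
\]

Proof. First assume $p > 2$. Then $\rho_\Delta(p^\ell) = \rho''_\Delta(p^\ell) - \rho''_{\Delta/p^2}(p^{\ell-1})$. Then as in the proof of Lemma 3.10, this lemma holds for $\ell \le e - 1$. On the other hand, for $\ell = e$, $\rho''_\Delta(p^\ell) = p^{e/2}$ and $\rho''_{\Delta/p^2}(p^{\ell-1}) = p^{(e-2)/2}(1 + (d_1/p))$, by Lemma 3.2. Again by Lemma 3.2, if $\ell > e$, then
\[
\rho''_\Delta(p^\ell) - \rho''_{\Delta/p^2}(p^{\ell-1}) = p^{e/2}(1 + (d_1/p)) - p^{e/2-1}(1 + (d_1/p)) = \varphi(p^{e/2})(1 + (d_1/p)),
\]
as desired in this case.

Next, assume $p = 2$ and $e \ge 4$. Notice that
\[
\rho_\Delta(2^\ell) = \frac12 \bigl(\rho''_\Delta(2^{\ell+2}) - \rho''_{\Delta/4}(2^{\ell+1})\bigr).
\]
Then as above, the lemma holds for $\ell \le e - 3$. Suppose $\ell = e - 2$. Then $\ell + 2 = e$ and $\ell + 1 = (e-2) + 1$, which implies by Lemma 3.2 that $\rho''_\Delta(2^{\ell+2}) = 2^{[(\ell+2)/2]}$ and that $\rho''_{\Delta/4}(2^{\ell+1}) = 2^{(e-2)/2}$. This yields the lemma for $\ell = e - 2$. Now suppose $\ell = e - 1$. Then $\ell + 2 = e + 1$ and $\ell + 1 = (e-2) + 2$. Then by Lemma 3.2, $\rho''_\Delta(2^{\ell+2}) = 2^{e/2}$ and $\rho''_{\Delta/4}(2^{\ell+1}) = 0$ if $d_1 \equiv 3 \bmod 4$ and $= 2^{e/2}$ if $d_1 \equiv 1 \bmod 4$. Putting this together proves the lemma for this $\ell$. Now it is not hard to see that if $\ell \ge e$ and $d_1 \equiv 3 \bmod 4$ then $\rho_\Delta(2^\ell) = 0$. Thus for the rest of this paragraph, assume $d_1 \equiv 1 \bmod 4$. If $\ell = e$, so that $\ell + 2 = e + 2$ and $\ell + 1 = (e-2) + 3$, then $\rho''_\Delta(2^{\ell+2}) = 2^{e/2+1}$ and $\rho''_{\Delta/4}(2^{\ell+1}) = 2^{e/2}(1 + (d_1/2))$. Thus $\rho_\Delta(2^\ell) = 2^{e/2-1}(1 - (d_1/2))$, as desired. If $\ell > e$, then a similar argument shows that $\rho_\Delta(2^\ell) = 2^{e/2-1}(1 + (d_1/2))$.

Finally consider $p = 2$ and $e = 2$. First suppose $\ell = 1$. By Lemma 3.2, $\rho''_\Delta(8) = 2$, whereas by Lemma 3.1, $\rho''_{\Delta/4}(4) = 2$. Thus $\rho_\Delta(2) = 0$. The rest of the proof of the lemma is similar.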
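With these lemmas, we are now able to represent the Dirichlet series
\[
\sum_{n=1}^{\infty} \frac{\rho_\Delta(n)}{n^s}
\]
in terms of the Riemann zeta function and Dirichlet L-series. First notice that since $\rho_\Delta$ is multiplicative (Lemma 3.8),
\[
\sum_{n=1}^{\infty} \frac{\rho_\Delta(n)}{n^s} = \prod_p f_{\Delta,p}(s), \qquad \text{where} \qquad f_{\Delta,p}(s) = \sum_{\ell=0}^{\infty} \frac{\rho_\Delta(p^\ell)}{p^{\ell s}}.
\]
We now break this product up into three parts:
\[
\prod_{p \nmid \Delta} f_{\Delta,p}(s) \prod_{\substack{p \mid \Delta \\ p \nmid (\Delta/\Delta_0)}} f_{\Delta,p}(s) \prod_{p \mid (\Delta/\Delta_0)} f_{\Delta,p}(s),
\]
and then proceed to compute the $f_{\Delta,p}(s)$'s. If $p \nmid \Delta$, then by Lemma 3.9,
\[
f_{\Delta,p}(s) = 1 + \left(1 + \left(\frac{\Delta}{p}\right)\right) \sum_{\ell=1}^{\infty} \frac{1}{p^{\ell s}} = \left(1 + \frac{1}{p^s}\right) \left(1 - \left(\frac{\Delta}{p}\right) \frac{1}{p^s}\right)^{-1}.
\]
If $p \mid \Delta$, but $p \nmid \Delta/\Delta_0$, then Lemma 3.9 implies
\[
f_{\Delta,p}(s) = 1 + \frac{1}{p^s}.
\]
Now, if $p \mid \Delta/\Delta_0$, then $f_{\Delta,p}(s)$ is more complicated and will be computed below. In the meantime, notice from the above that
\[
\begin{aligned}
\sum_{n=1}^{\infty} \frac{\rho_\Delta(n)}{n^s}
&= \prod_{p \nmid \Delta} \left(1 + \frac{1}{p^s}\right) \left(1 - \left(\frac{\Delta}{p}\right) \frac{1}{p^s}\right)^{-1} \prod_{\substack{p \mid \Delta \\ p \nmid (\Delta/\Delta_0)}} \left(1 + \frac{1}{p^s}\right) \prod_{p \mid (\Delta/\Delta_0)} f_{\Delta,p}(s) \\
&= \prod_{p} \left(1 + \frac{1}{p^s}\right) \left(1 - \left(\frac{\Delta}{p}\right) \frac{1}{p^s}\right)^{-1} \prod_{p \mid (\Delta/\Delta_0)} \left(1 + \frac{1}{p^s}\right)^{-1} f_{\Delta,p}(s) \\
&= \frac{\zeta(s)}{\zeta(2s)}\, L(s, \chi_\Delta)\, g_\Delta(s),
\end{aligned}
\]
with
\[
g_\Delta(s) = \prod_{p \mid (\Delta/\Delta_0)} g_{\Delta,p}(s), \qquad g_{\Delta,p}(s) = \left(1 + \frac{1}{p^s}\right)^{-1} f_{\Delta,p}(s),
\]
and $L(s, \chi_\Delta)$ is the Dirichlet L-function associated with the Kronecker character $\chi_\Delta$.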
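We now compute the $g_{\Delta,p}(s)$ and their derivatives at $s = 1$. First suppose $p > 2$. In addition, assume $p \mid \Delta_0$. Hence $\Delta = p^{2m+1} d_1$ with $p \nmid d_1$. Then by Lemma 3.10,
\[
f_{\Delta,p}(s) = \sum_{j=0}^{m} \frac{\varphi(p^j)}{p^{2js}} + \frac{p^m}{p^{(2m+1)s}}.
\]
A straightforward calculation then shows that
\[
g_{\Delta,p}(s) = \left(1 - \frac{1}{p^{2s-1}}\right)^{-1} \left(1 - \frac{1}{p^s} - \frac{1}{p^{(2s-1)(m+1)}} + \frac{1}{p^{(2s-1)m+s}}\right).
\]
Thus $g_{\Delta,p}(1) = 1$ and
\[
g'_{\Delta,p}(1) = -\left(1 - \frac1p\right)^{-1} \left(\frac1p - \frac{1}{p^{m+1}}\right) \log p \ll \frac{\log p}{p}.
\]
Next, assume that $p > 2$ and $p \nmid \Delta_0$ and so $\Delta = p^{2m} f_1^2 \Delta_0$, where $p \nmid f_1^2 \Delta_0$. Then by Lemma 3.11,
\[
f_{\Delta,p}(s) = \sum_{j=0}^{m-1} \frac{\varphi(p^j)}{p^{2js}} + \frac{p^m - p^{m-1}(1 + (\Delta_0/p))}{p^{2ms}} + \sum_{j=2m+1}^{\infty} \frac{\varphi(p^m)(1 + (\Delta_0/p))}{p^{js}}.
\]
Again, a straightforward calculation shows that
\[
\begin{aligned}
g_{\Delta,p}(s) = {} & \left(1 - \frac{(\Delta_0/p)}{p^s}\right)^{-1} \left(1 - \frac{1}{p^{2s-1}}\right)^{-1} \\
& \times \left(1 - \frac{1}{p^s} + \frac{1}{p^{(2s-1)m+s}} - \frac{1}{p^{(2s-1)(m+1)}} + \frac{(\Delta_0/p)}{p^{(2s-1)m+s}} - \frac{(\Delta_0/p)}{p^s} + \frac{(\Delta_0/p)}{p^{2s}} - \frac{(\Delta_0/p)}{p^{(2s-1)m+1}}\right).
\end{aligned}
\]
Thus $g_{\Delta,p}(1) = 1$ and
\[
g'_{\Delta,p}(1) = -\left(1 - \frac1p\right)^{-1} \frac{\log p}{p} + \left(1 - \frac1p\right)^{-1} \left(1 - \frac{(\Delta_0/p)}{p}\right)^{-1} \left(1 - \left(\frac{\Delta_0}{p}\right)\right) \frac{\log p}{p^{m+1}} \ll \frac{\log p}{p}.
\]
When $p = 2$ a similar argument, which we leave to the reader, shows that $g_{\Delta,2}(1) = 1$ and $g'_{\Delta,2}(1) \ll 1$. Putting all this together we see that
\[
g'_\Delta(1) = \sum_{p \mid (\Delta/\Delta_0)} g'_{\Delta,p}(1) \ll \sum_{p \mid \Delta} \frac{\log p}{p} \le \sum_{p \le \log \Delta} \frac{\log p}{p} + \sum_{\substack{p \mid \Delta \\ p > \log \Delta}} \frac{\log p}{p}.
\]
The first sum on the right-hand side is $O(\log\log\Delta)$. The number of terms in the second sum is $\le \log\Delta / \log\log\Delta$, and each term is $O(\log\log\Delta / \log\Delta)$. Therefore the right-hand side is $O(\log\log\Delta)$. We have the following proposition.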
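Proposition 3.12. Let $\Delta$ be any positive non-square discriminant. Then for $\operatorname{Re}(s) > 1$,
\[
\sum_{n=1}^{\infty} \frac{\rho_\Delta(n)}{n^s} = \frac{\zeta(s)}{\zeta(2s)}\, L(s, \chi_\Delta)\, g_\Delta(s),
\]
where $g_\Delta(s)$ is analytic for $\sigma = \operatorname{Re}(s) > 1/2$ and $g_\Delta(1) = 1$, $g'_\Delta(1) \ll \log\log\Delta$. Moreover,
\[
|g_{\Delta,p}(s)| < 1 \big/ \bigl((1 - 2^{-\sigma})(1 - 2^{-\sigma+1/2})\bigr).
\]
In particular, if $\sigma \ge 3/4$, then $|g_{\Delta,p}(s)| < 16$.

Proof. The only part left to prove is the analyticity and inequality involving $g_{\Delta,p}(s)$. By the lemmas above, notice that $\rho_\Delta(p^\ell) \le p^{\ell/2}$. Hence for $s = \sigma + it$ with $\sigma > 1/2$, we have
\[
|g_{\Delta,p}(s)| \le |f_{\Delta,p}(s)| (1 - p^{-\sigma})^{-1} \le \sum_{\ell=0}^{\infty} p^{-(\sigma - 1/2)\ell} (1 - p^{-\sigma})^{-1} = \frac{1}{(1 - p^{-\sigma})(1 - p^{-(\sigma-1/2)})} \le \frac{1}{(1 - 2^{-\sigma})(1 - 2^{-\sigma+1/2})}.
\]
That $g_\Delta(s)$ is analytic for $\sigma > 1/2$ follows from the above upper bounds.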
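The numerical constant 16 in Proposition 3.12 can be confirmed by evaluating the bound at the edge $\sigma = 3/4$, where it is largest. A minimal sketch (`local_factor_bound` is a hypothetical name):

```python
def local_factor_bound(sigma):
    """Upper bound 1 / ((1 - 2^-sigma) * (1 - 2^(-sigma + 1/2))), valid for sigma > 1/2."""
    return 1.0 / ((1.0 - 2.0 ** (-sigma)) * (1.0 - 2.0 ** (-sigma + 0.5)))
```

At $\sigma = 3/4$ the value is roughly $15.5$, and the bound decreases as $\sigma$ grows, so it stays below 16 for all $\sigma \ge 3/4$.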
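We now prove the following useful proposition.

Proposition 3.13. If $L(s, \chi_\Delta)$ has no zeros in
\[
R(21/22 - \varepsilon, \Delta^{4/15}) := \{\, s = \sigma + it \mid 21/22 - \varepsilon \le \sigma \le 1,\ |t| \le \Delta^{4/15} \,\},
\]
then
\[
\sum_{a \le x} \frac{\rho_\Delta(a)}{a} = \frac{6}{\pi^2} L(1, \chi_\Delta) \log x + \frac{6}{\pi^2} L(1, \chi_\Delta) \left(\gamma - \frac{12}{\pi^2} \zeta'(2)\right) + \frac{6}{\pi^2} L(1, \chi_\Delta)\, g'_\Delta(1) + \frac{6}{\pi^2} L'(1, \chi_\Delta) + O_\varepsilon(\Delta^{-1/60+\varepsilon})
\]
for $\sqrt{\Delta}/2 \le x \le \sqrt{\Delta}$.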
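Proof. Lemmas 3.9, 3.10 and 3.11 show that for $p \nmid (\Delta/\Delta_0)$, $\rho_\Delta(p^\ell) \le 2$, and for $p \mid (\Delta/\Delta_0)$, $\rho_\Delta(p^\ell) \le p^{\ell/2}$. Therefore
\[
\rho_\Delta(a) \le \prod_{\substack{p \nmid (\Delta/\Delta_0) \\ p \mid a}} 2 \prod_{\substack{p \mid (\Delta/\Delta_0) \\ p \mid a}} p^{\operatorname{ord}_p a / 2} \le 2^{\nu(a)} a^{1/2} \ll_\varepsilon a^{1/2+\varepsilon},
\]
where $\nu(m)$ is the number of distinct prime factors of the integer $m$.

Let $T := \Delta^{4/15}$ and $x \in [\sqrt{\Delta}/2, \sqrt{\Delta}]$ a half-integer. From the proof of Perron's formula (e.g. [19], Lemma 3.12) it follows that
\[
\sum_{a \le x} \frac{\rho_\Delta(a)}{a} = \frac{1}{2\pi i} \int_{\varepsilon - iT}^{\varepsilon + iT} h(s)\, ds + O\!\left(\frac{x^\varepsilon}{T} \sum_{a=1}^{\infty} \frac{\rho_\Delta(a)}{a^{1+\varepsilon}} + \frac{1}{T} \sum_{x/2 < a < 2x} \frac{\rho_\Delta(a)}{|x - a|}\right),
\]
where
\[
h(s) = \frac{x^s}{s} \sum_{a=1}^{\infty} \frac{\rho_\Delta(a)}{a^{1+s}} = \frac{x^s}{s}\, \frac{\zeta(1+s)}{\zeta(2+2s)}\, L(1+s, \chi_\Delta)\, g_\Delta(1+s).
\]
Here
\[
\sum_{a=1}^{\infty} \frac{\rho_\Delta(a)}{a^{1+\varepsilon}} = \prod_p \sum_{\ell=0}^{\infty} \frac{\rho_\Delta(p^\ell)}{p^{(1+\varepsilon)\ell}} \le \prod_{p \nmid (\Delta/\Delta_0)} \left(1 + \frac{2}{p^{1+\varepsilon} - 1}\right) \prod_{p \mid (\Delta/\Delta_0)} \sum_{\ell=0}^{\infty} \frac{p^{\ell/2}}{p^{(1+\varepsilon)\ell}} \ll_\varepsilon \prod_{p \mid (\Delta/\Delta_0)} \frac{1}{1 - p^{-1/2-\varepsilon}}.
\]
Since $\varepsilon > 0$ we see that
\[
\frac{1}{1 - p^{-1/2-\varepsilon}} \le \frac{1}{1 - \frac{1}{\sqrt2}} < 4.
\]
Thus
\[
\sum_{a=1}^{\infty} \frac{\rho_\Delta(a)}{a^{1+\varepsilon}} \ll_\varepsilon 4^{\nu(\Delta/\Delta_0)} \ll_\varepsilon \Delta^\varepsilon.
\]
Furthermore,
\[
\sum_{x/2 < a < 2x} \frac{\rho_\Delta(a)}{|x - a|} \ll_\varepsilon \sum_{x/2 < a < 2x} \frac{a^{1/2+\varepsilon}}{|x - a|} \ll x^{1/2+\varepsilon} \log x \ll x^{1/2+2\varepsilon}.
\]
Therefore we have
\[
\sum_{a \le x} \frac{\rho_\Delta(a)}{a} = \frac{1}{2\pi i} \int_{\varepsilon - iT}^{\varepsilon + iT} h(s)\, ds + O\!\left(\frac{(x\Delta)^\varepsilon}{T} + \frac{x^{1/2+2\varepsilon}}{T}\right).
\]
We now estimate the integral
\[
\frac{1}{2\pi i} \int_{\varepsilon - iT}^{\varepsilon + iT} h(s)\, ds.
\]
To this end, consider the contour integral
\[
\frac{1}{2\pi i} \oint_{K} h(s)\, ds,
\]
where $K$ is the contour traveling counter-clockwise along the perimeter of the rectangle with vertices $\varepsilon - iT$, $\varepsilon + iT$, $-1/22 + iT$, and $-1/22 - iT$. $h(s)$ has only one pole in this rectangle, namely at $s = 0$. Its order is 2. Since $g_\Delta(1) = 1$ and
\[
\frac{x^s}{s} = \frac1s + \log x + O(s), \qquad \zeta(1+s) = \frac1s + \gamma + O(s),
\]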
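as $s \to 0$, we see that the residue of $h(s)$ at $s = 0$ is
\[
\begin{aligned}
& (\gamma + \log x) \frac{L(1, \chi_\Delta)}{\zeta(2)}\, g_\Delta(1) + \frac{L(1, \chi_\Delta)}{\zeta(2)}\, g'_\Delta(1) + \frac{L'(1, \chi_\Delta)}{\zeta(2)}\, g_\Delta(1) - \frac{2 \zeta'(2) L(1, \chi_\Delta)}{\zeta(2)^2}\, g_\Delta(1) \\
& \quad = (\gamma + \log x) \frac{6}{\pi^2} L(1, \chi_\Delta) + \frac{6}{\pi^2} L(1, \chi_\Delta)\, g'_\Delta(1) + \frac{6}{\pi^2} L'(1, \chi_\Delta) - \frac{72}{\pi^4} \zeta'(2) L(1, \chi_\Delta).
\end{aligned}
\]
Now we need to estimate the integrals:
\[
\int_{-1/22 - iT}^{-1/22 + iT} h(s)\, ds, \qquad \int_{-1/22 + iT}^{\varepsilon + iT} h(s)\, ds, \qquad \int_{-1/22 - iT}^{\varepsilon - iT} h(s)\, ds.
\]
Consider the first integral. Recall by Proposition 3.12 that if $\operatorname{Re}(1+s) \ge 3/4$, then $|g_\Delta(1+s)| \le 16^{\nu(\Delta)} \ll \Delta^\varepsilon$. Moreover, since by assumption $L(s, \chi_\Delta)$ has no zero in $R(21/22 - \varepsilon, 2T)$ it follows by a standard argument (cf. [15], Lemma 4.10) that $L(\sigma + it, \chi_\Delta) \ll_\varepsilon \Delta^\varepsilon$ for $21/22 \le \sigma \le 1 + \varepsilon$, $|t| \le T$. Also we have (cf. [19, p. 96]) $\zeta(\sigma + it) \ll_\varepsilon |t|^{(1-\sigma)/2+\varepsilon}$ for $1/2 \le \sigma \le 1 + \varepsilon$, $|t| \ge 1$. Putting all this together, we see that
\[
\int_{-1/22 - iT}^{-1/22 + iT} h(s)\, ds \ll x^{-1/22} \Delta^{2\varepsilon} T^{1/44+\varepsilon}.
\]
On the other hand, notice that
\[
\int_{-1/22 + iT}^{\varepsilon + iT} h(s)\, ds \ll \int_{-1/22}^{\varepsilon} \frac{x^\sigma}{T}\, T^{-\sigma/2+\varepsilon} \Delta^{2\varepsilon}\, d\sigma \ll \frac{\Delta^{2\varepsilon} T^\varepsilon}{T} \int_{-1/22}^{\varepsilon} (x T^{-1/2})^\sigma\, d\sigma \ll \frac{\Delta^{2\varepsilon} T^\varepsilon}{T} \left((x T^{-1/2})^\varepsilon + (x T^{-1/2})^{-1/22}\right),
\]
since the integrand is monotonic. This last estimate also holds for the integral
\[
\int_{-1/22 - iT}^{\varepsilon - iT} h(s)\, ds.
\]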
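Collecting everything we get
\[
\begin{aligned}
\sum_{a \le x} \frac{\rho_\Delta(a)}{a} = {} & (\gamma + \log x) \frac{6}{\pi^2} L(1, \chi_\Delta) + \frac{6}{\pi^2} L(1, \chi_\Delta)\, g'_\Delta(1) + \frac{6}{\pi^2} L'(1, \chi_\Delta) - \frac{72}{\pi^4} \zeta'(2) L(1, \chi_\Delta) \\
& + O\!\left(\frac{(x\Delta)^\varepsilon}{T} + \frac{x^{1/2+2\varepsilon}}{T}\right) + O\!\left(x^{-1/22} \Delta^{2\varepsilon} T^{1/44+\varepsilon} + \frac{\Delta^{2\varepsilon} x^\varepsilon}{T^{1-\varepsilon/2}}\right).
\end{aligned}
\]
Since $T = \Delta^{4/15}$ and $x \asymp \sqrt{\Delta}$ we can estimate the error terms as stated in the proposition.

4. Main Results

Assume all the notation in the Introduction. Further let $m$ be a positive integer and define
\[
V_{2m}(N) = \{ (a_1, \dots, a_{2m}) \in \mathbb{N}^{2m} \mid \operatorname{Tr}(B^{a_1} A^{a_2} \cdots B^{a_{2m-1}} A^{a_{2m}}) \le N \},
\]
$V_e(N) = \cup_m V_{2m}(N)$, and $v_e(N) = |V_e(N)|$. Similarly, let
\[
V_{2m+1}(N) = \{ (a_1, \dots, a_{2m+1}) \in \mathbb{N}^{2m+1} \mid \operatorname{Tr}(B^{a_1} A^{a_2} \cdots A^{a_{2m}} B^{a_{2m+1}}) \le N \},
\]
$V_o(N) = \cup_m V_{2m+1}(N)$, and $v_o(N) = |V_o(N)|$. Notice (using the notation from (2.12) and (2.30)) that $B^z A^w = M(z) M(w) = M(z, w)$. More generally we find that $B^{a_1} \cdots A^{a_{2m}} = M(a_1, \dots, a_{2m})$ for $a_1, \dots, a_{2m} \in \mathbb{Z}$. From this together with the uniqueness of the coefficients of the continued fraction expansion it follows that the elements of $M$ are uniquely represented as products of $A$'s and $B$'s. Therefore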
\[
\Psi(N) = 2(v_e(N) + v_o(N)).
\]
We now transform the task of computing the trace $\operatorname{Tr}(B^{a_1} \cdots A^{a_{2m}})$ to that of computing the trace of a particular real quadratic unit. This is a main idea of the paper, because it lets us use well-developed tools from number theory to solve our problem.

Let $a_1, \dots, a_{2m}$ be positive integers and define $\omega = [\,\overline{a_1, \dots, a_{2m}}\,]$. (We note that $2m$ need not equal $\operatorname{eper}(\omega)$.) Then $\omega$ is a reduced quadratic irrationality, as we stated in (2.18). For $\Delta := \operatorname{disc}(\omega)$ and $\Delta = f^2 \Delta_0$, $\Delta_0$ the fundamental discriminant, we have
\[
\omega \in R(\Delta) \subset \mathbb{Q}(\sqrt{\Delta_0}) =: K.
\]
Moreover, let $\varepsilon^+(\omega) := \varepsilon^+_\Delta > 1$ (cf. (2.33)) be the totally positive fundamental unit of the order $\mathcal{O}_\Delta$ of $K$. We start by obtaining an asymptotic formula for $v_e(N)$. To this end, we consider the following useful theorem and propositions.
Theorem 4.1. Let ω = [ a1 , · · · , a2m ] and l := eper(ω) . So we have 2m = kl, where k is a positive integer. Then Tr(B a1 · · · Aa2m ) = tr(ε+ (ω)k ). Proof. With the given notation we find that B a1 · · · Aa2m = M(a1 , . . . , a2m )
Pl Pl−1 = M(a1 , . . . , al ) = Ql Ql−1 k
k = (7+ )k .
By using the mapping formula (2.31) and the trace formula (2.26) we conclude that Tr(B a1 · · · Aa2m ) = Tr((7+ )k ) = tr(λω ((7+ )k )) = tr(λω (7+ )k ) = tr(ε+ (ω)k ).
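The trace identity of Theorem 4.1 can be illustrated with exact integer arithmetic. The sketch below assumes the generators $A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ (they are fixed in the Introduction, which is outside this passage, so this choice is an assumption), and all function names are hypothetical. For a matrix of determinant 1 with trace $t > 2$ and larger eigenvalue $\varepsilon$, the trace of the $k$-th power equals $\varepsilon^k + \varepsilon^{-k}$ and satisfies the recurrence $t_k = t\, t_{k-1} - t_{k-2}$, which is what a $k$-fold repetition of the exponent tuple produces.

```python
from math import sqrt

A = ((1, 0), (1, 1))  # assumed generator
B = ((1, 1), (0, 1))  # assumed generator


def mat_mul(x, y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return (
        (x[0][0] * y[0][0] + x[0][1] * y[1][0], x[0][0] * y[0][1] + x[0][1] * y[1][1]),
        (x[1][0] * y[0][0] + x[1][1] * y[1][0], x[1][0] * y[0][1] + x[1][1] * y[1][1]),
    )


def word_matrix(exponents):
    """B^a1 A^a2 B^a3 ... for a tuple of positive exponents (B first, then alternating)."""
    result = ((1, 0), (0, 1))
    for index, a in enumerate(exponents):
        generator = B if index % 2 == 0 else A
        for _ in range(a):
            result = mat_mul(result, generator)
    return result


def trace(m):
    return m[0][0] + m[1][1]


def trace_of_power(t, k):
    """Trace of M^k for det M = 1 and trace t, via t_k = t*t_{k-1} - t_{k-2}."""
    previous, current = 2, t
    for _ in range(k - 1):
        previous, current = current, t * current - previous
    return current


def unit_from_trace(t):
    """Larger root of x^2 - t*x + 1 = 0."""
    return (t + sqrt(t * t - 4)) / 2
```

With exponents $(1, 1)$ the product is $BA = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ with trace 3, whose larger eigenvalue is $((1+\sqrt5)/2)^2$.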
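Corollary 4.2. $(a_1, \dots, a_{2m}) \in V_e(N)$ if and only if $\operatorname{tr}(\varepsilon^+(\omega)^k) \le N$, where $\omega := [\,\overline{a_1, \dots, a_{2m}}\,]$ and $k := 2m / \operatorname{eper}(\omega)$.

Proposition 4.3. Let $k$ and $N$ be arbitrary positive integers. Define $\tau_k(N)$ as the finite sum
\[
\tau_k(N) = \sum_{\substack{\omega \in R \\ \operatorname{tr}(\varepsilon^+(\omega)^k) \le N}} 1,
\]
where $R = \cup_\Delta R(\Delta)$ and the union is over all positive non-square discriminants $\Delta$. Then
\[
v_e(N) = \sum_{k < 2 \log N} \tau_k(N).
\]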
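Proof. For $k \in \mathbb{N}$ let
\[
T_k(N) := \{ (k, \omega) \mid \omega \in R,\ \operatorname{tr}(\varepsilon^+(\omega)^k) \le N \}, \qquad T(N) := \bigcup_{k=1}^{\infty} T_k(N).
\]
So we have
\[
\tau(N) := |T(N)| = \sum_{k=1}^{\infty} |T_k(N)| = \sum_{k=1}^{\infty} \tau_k(N),
\]
because the $T_k(N)$ are obviously disjoint sets. We claim that the last sum is finite, namely
\[
\tau(N) = \sum_{k < 2 \log N} \tau_k(N).
\]
This formula is a result of
\[
T_k(N) = \emptyset \quad \text{for } k \ge 2 \log N.
\]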
Indeed, if $(k, \omega) \in T_k(N)$ and $k \ge 2 \log N$, we have $\operatorname{tr}(\varepsilon^+(\omega)^k) \le N$ and by Proposition 2.1,
\[
\varepsilon^+(\omega)^k < \operatorname{tr}(\varepsilon^+(\omega)^k) \le N.
\]
Thus $k \log \varepsilon^+(\omega) < \log N$ and $k$ is bounded by
\[
k < \frac{\log N}{\log \varepsilon^+(\omega)} < 2 \log N,
\]
since $\log \varepsilon^+(\omega) > 1$, except for $\Delta = 5$, in which case $\varepsilon^+ = ((1 + \sqrt5)/2)^2$ so that
\[
\frac{1}{\log \varepsilon^+(\omega)} \le 1.0391 < 2,
\]
contrary to the assumption $k \ge 2 \log N$.

The proposition will be proven once we show $v_e(N) = \tau(N)$. To this end, the map defined by
\[
j : V_e(N) \to T(N), \qquad (a_1, \dots, a_{2m}) \mapsto (k, \omega),
\]
where $\omega := [\,\overline{a_1, \dots, a_{2m}}\,]$, $l := \operatorname{eper}(\omega)$, $k := 2m/l$ and $\operatorname{tr}(\varepsilon^+(\omega)^k) \le N$, is shown to be a bijection. A preimage of a pair $(k, \omega) \in T(N)$, where $\omega \in R$ with a periodic continued fraction expansion $\omega = [\,\overline{a_1, \dots, a_l}\,]$, $l := \operatorname{eper}(\omega)$ (cf. (2.19)) and $\operatorname{tr}(\varepsilon^+(\omega)^k) \le N$, is a $k$-fold repetition
\[
A_k := (a_1, \dots, a_l, \dots\dots, a_1, \dots, a_l) \in \mathbb{N}^{kl}.
\]
Since $[\,\overline{a_1, \dots, a_l, \dots\dots, a_1, \dots, a_l}\,] = \omega$ and $\operatorname{tr}(\varepsilon^+(\omega)^k) \le N$ we have $A_k \in V_e(N)$ by the above corollary. Injectivity can be proved by analogous arguments.

For the next proposition we define
\[
r(N) = \sum_{\substack{\omega \in R \\ \varepsilon^+(\omega) < N}} 1.
\]
By Theorem II of [7] we have

Proposition 4.4.
\[
r(N) \sim \frac{3 \log 2}{\pi^2} N^2,
\]
as $N \to \infty$.

With the help of this very powerful proposition we obtain an asymptotic formula for $v_e(N)$.
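Proposition 4.5.
\[
v_e(N) \sim \frac{3 \log 2}{\pi^2} N^2,
\]
as $N \to \infty$.

Proof. We first obtain a relation between $v_e(N)$ and $r(N)$. To this end, notice that
\[
r\bigl((N - 1/2)^{1/k}\bigr) < \tau_k(N) < r\bigl(N^{1/k}\bigr)
\]
for $k \ge 1$, since, by Proposition 2.1,
\[
\varepsilon^+(\omega)^k < \operatorname{tr}\bigl(\varepsilon^+(\omega)^k\bigr) < \varepsilon^+(\omega)^k + \frac12.
\]
Now by Proposition 4.3,
\[
\sum_{k < 2 \log N} r\bigl((N - 1/2)^{1/k}\bigr) < v_e(N) < \sum_{k < 2 \log N} r\bigl(N^{1/k}\bigr).
\]
Next notice that by Proposition 4.4,
\[
r\bigl(N^{1/k}\bigr) \sim \frac{3 \log 2}{\pi^2} N^{2/k} \ll N,
\]
as $N \to \infty$, if $2/k \le 1$. We thus have
\[
\sum_{2 \le k < 2 \log N} r\bigl((N - 1/2)^{1/k}\bigr) < \sum_{2 \le k < 2 \log N} r\bigl(N^{1/k}\bigr) \ll N \log N = o(N^2).
\]
Therefore, again using Proposition 4.4,
\[
v_e(N) \sim \frac{3 \log 2}{\pi^2} N^2,
\]
as desired.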
Notice that if $\Psi(N)$ is to have the desired asymptotic property, then the main contribution comes from $2 v_o(N)$. We now connect the missing asymptotic expression for $v_o(N)$ to that of $v_e(N)$ and start by considering the mapping $\sigma$ given by
\[
(a_1, \dots, a_{2m}) \mapsto \omega = [\,\overline{a_1, \dots, a_{2m}}\,].
\]
Let $l = \operatorname{eper}(\omega)$ and write $2m = kl$. Notice that $\sigma$ is not injective, as any ordered tuple of positive integers which gives an even period (not necessarily minimal) for the continued fraction expansion of $\omega$ will map to $\omega$. Thus we have
\[
|\sigma^{-1}(\omega) \cap V_e(N)| = |\{ k \in \mathbb{N} \mid \operatorname{tr}(\varepsilon^+(\omega)^k) \le N \}| =: \beta(\Delta, N),
\]
where $\Delta$ is the discriminant of $\omega$. Also notice that $\beta(\Delta, N)$ only depends on the discriminant of $\omega$ and not on $\omega$ itself, as the notation suggests. Concerning $\beta(\Delta, N)$ we have the following lemma and proposition.
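Lemma 4.6.
\[
\frac{\log(N - 1/2)}{\log \varepsilon^+} - 1 < \beta(\Delta, N) < \frac{\log N}{\log \varepsilon^+},
\]
where $\varepsilon^+ = \varepsilon^+_\Delta$. Hence, since $\beta$ is integral, $\beta(\Delta, N) = 0$ when $N < \varepsilon^+$.

Proof. Recall that
\[
\beta(\Delta, N) = |\{ k \in \mathbb{N} : \operatorname{tr}((\varepsilon^+)^k) \le N \}|.
\]
But $\operatorname{tr}((\varepsilon^+)^k) = (\varepsilon^+)^k + \bigl(\operatorname{tr}((\varepsilon^+)^k) - (\varepsilon^+)^k\bigr)$, which implies that
\[
\operatorname{tr}((\varepsilon^+)^k) \le N \quad \text{iff} \quad k \le \frac{\log\bigl(N - \operatorname{tr}((\varepsilon^+)^k) + (\varepsilon^+)^k\bigr)}{\log \varepsilon^+}.
\]
Proposition 2.1 establishes the lemma, since then $0 < \operatorname{tr}((\varepsilon^+)^k) - (\varepsilon^+)^k < 1/2$.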
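The two-sided bound of Lemma 4.6 can be tested numerically. In the sketch below the unit is represented only by its trace $t = \operatorname{tr}(\varepsilon^+) \ge 3$, so that $\varepsilon^+ = (t + \sqrt{t^2 - 4})/2$ and $\operatorname{tr}((\varepsilon^+)^k)$ follows the recurrence $t_k = t\, t_{k-1} - t_{k-2}$; the parametrisation by $t$ and the function names are assumptions made for illustration.

```python
from math import log, sqrt


def beta_from_trace(t, n_bound):
    """Number of k >= 1 with tr(eps^k) <= n_bound, where tr(eps) = t >= 3."""
    count = 0
    previous, current = 2, t  # traces of eps^0 and eps^1
    while current <= n_bound:
        count += 1
        previous, current = current, t * current - previous
    return count


def lemma_4_6_bounds(t, n_bound):
    """Lower and upper bounds of Lemma 4.6 (requires n_bound >= 1)."""
    eps = (t + sqrt(t * t - 4)) / 2
    return log(n_bound - 0.5) / log(eps) - 1, log(n_bound) / log(eps)
```

For $t = 3$ (the case $\Delta = 5$) the traces are $3, 7, 18, 47, 123, \dots$, so exactly four powers have trace at most 100.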
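The lemma and Corollary 4.2 yield

Proposition 4.7.
\[
v_e(N) = \sum_{\varepsilon^+_\Delta < N}\ \sum_{\omega \in R(\Delta)} \beta(\Delta, N),
\]
where the first sum is over those non-square positive discriminants $\Delta$ with $\varepsilon^+_\Delta < N$.

From this proposition along with Proposition 4.5 and Lemma 4.6, it follows easily that

Corollary 4.8.
\[
\sum_{\varepsilon^+_\Delta < N} \frac{1}{\log \varepsilon^+_\Delta}\, \kappa(\Delta) \ll N^2 / \log N.
\]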
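Now consider the mapping $\mu$ given by
\[
(a_1, \dots, a_{2m+1}) \mapsto \omega = [\,\overline{a_1 + a_{2m+1}, a_2, \dots, a_{2m}}\,].
\]
Suppose that the discriminant of $\omega$ is $\Delta$. Notice that
\[
|\mu^{-1}(\omega) \cap V_o(N)| = \beta(\Delta, N)([\omega] - 1),
\]
where $[\ ]$ denotes the greatest integer function. The second factor on the right accounts for the number of ways of writing $\omega = [\,\overline{b_1, a_2, \dots, a_{2m}}\,]$ with $b_1 = a_1 + a_{2m+1}$, and $b_1 = [\omega]$. From this we have
\[
v_o(N) = \sum_{\varepsilon^+_\Delta < N}\ \sum_{\omega \in R(\Delta)} \beta(\Delta, N)([\omega] - 1).
\]
But notice then that by Proposition 4.7 the right-hand side above is equal to
\[
\sum_{\varepsilon^+_\Delta < N}\ \sum_{\omega \in R(\Delta)} \beta(\Delta, N)[\omega] - v_e(N).
\]
Thus we have
\[
\Psi(N) = 2(v_e(N) + v_o(N)) = 2 \sum_{\varepsilon^+_\Delta < N}\ \sum_{\omega \in R(\Delta)} \beta(\Delta, N)[\omega]. \tag{4.1}
\]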
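Now our goal is to obtain a simple asymptotic formula for the right-hand side of this equation. This is done by studying the asymptotics of the expression
\[
\sum_{\omega \in R(\Delta)} [\omega], \tag{4.2}
\]
connected to a function $r_\Delta(a)$ that we define in the following way. Let $a$ be a positive integer. We define
\[
R_\Delta(a) := \left\{ \frac{b + \sqrt\Delta}{2a} \;\middle|\; b \in \mathbb{Z},\ 0 < b < \sqrt\Delta,\ 4a \mid (b^2 - \Delta),\ (a, b, c) = 1,\ c = \frac{b^2 - \Delta}{4a},\ \frac{-b + \sqrt\Delta}{2a} < 1 < \frac{b + \sqrt\Delta}{2a} \right\},
\]
and $r_\Delta(a) := |R_\Delta(a)|$.

Proposition 4.9. Suppose $\omega \in R_\Delta(a)$ for some $a \in \mathbb{N}$. Then
\[
\frac{\sqrt\Delta}{a} - 1 < \omega < \frac{\sqrt\Delta}{a}.
\]

Proof. Since $\omega = (b + \sqrt\Delta)/2a \in R_\Delta(a)$, we have
\[
0 < b < \sqrt\Delta, \qquad \frac{-b + \sqrt\Delta}{2a} < 1 < \frac{b + \sqrt\Delta}{2a}.
\]
Hence
\[
\omega = \frac{b + \sqrt\Delta}{2a} < \frac{\sqrt\Delta + \sqrt\Delta}{2a} = \frac{\sqrt\Delta}{a}.
\]
On the other hand,
\[
\frac{\sqrt\Delta}{a} - 1 = \frac{2\sqrt\Delta - 2a}{2a} < \frac{2\sqrt\Delta - (-b + \sqrt\Delta)}{2a} = \omega,
\]
since $-b + \sqrt\Delta < 2a$.

Proposition 4.10.
\[
\sqrt\Delta \sum_{a < \sqrt\Delta} \frac{r_\Delta(a)}{a} - 2\kappa(\Delta) < \sum_{\omega \in R(\Delta)} [\omega] < \sqrt\Delta \sum_{a < \sqrt\Delta} \frac{r_\Delta(a)}{a}.
\]

Proof. This follows from the previous proposition along with the observation that
\[
\sum_{a < \sqrt\Delta} r_\Delta(a) = \kappa(\Delta)
\]
(refer to Proposition 4.11 below).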
We shall need to study the asymptotics of
\[
\sum_{a < \sqrt\Delta} \frac{r_\Delta(a)}{a}.
\]
But $r_\Delta(a)$ is hard to count when $a > \sqrt\Delta/2$. We start instead by relating $r_\Delta(a)$ to the function $\rho_\Delta(a)$ (considered in the preceding section, cf. (3.1)) as follows.

Proposition 4.11. For any positive integer $a$,
1. if $a > \sqrt\Delta$, then $r_\Delta(a) = 0$;
2. if $a > \sqrt\Delta/2$, then $r_\Delta(a) \le \rho_\Delta(a)$;
3. if $a < \sqrt\Delta/2$, then $r_\Delta(a) = \rho_\Delta(a)$.

Proof. 1. Since $\omega \in R(\Delta)$, $\omega > 1$ and $0 < b < \sqrt\Delta$. Hence, if $a > \sqrt\Delta$, then $(b + \sqrt\Delta)/2a < 1$, which cannot happen.

2. Suppose $a > \sqrt\Delta/2$. Let $\omega \in R_\Delta(a)$, say $\omega = (b + \sqrt\Delta)/2a$. Then notice that $b \in P_\Delta(a)$. This correspondence $\omega \mapsto b$ is clearly injective.

3. Now assume $a < \sqrt\Delta/2$. Suppose $b \in P_\Delta(a)$. Let $b_1$ be the unique integer satisfying
\[
b_1 \equiv b \bmod 2a, \qquad \sqrt\Delta - 2a < b_1 < \sqrt\Delta.
\]
Then we claim that the mapping $j : b \mapsto \omega = (b_1 + \sqrt\Delta)/2a$ is a bijection of $P_\Delta(a)$ onto $R_\Delta(a)$. First we need to show that $\omega \in R_\Delta(a)$. Indeed, if $b \in P_\Delta(a)$, then $b_1^2 \equiv \Delta \bmod 4a$. Moreover, by the assumptions $2a < \sqrt\Delta$ and $\sqrt\Delta - 2a < b_1 < \sqrt\Delta$, we see that $0 < b_1 < \sqrt\Delta$ and $(-b_1 + \sqrt\Delta)/2a < 1 < \omega$. Finally, we claim that $(a, b_1, c_1) = 1$, where $c_1 = (b_1^2 - \Delta)/4a$. In fact, if $(a, b_1) = 1$, then we are done. Suppose, then, that $(a, b_1) > 1$. Let $p$ be a prime dividing $a$ and $b_1$; thus $p \mid (a, b)$ and hence $p \nmid c$ by definition of $P_\Delta(a)$. Now let $b_1 = b + 2ak$. Then notice that $c_1 = c + bk + ak^2$; therefore $p \nmid c_1$. Thus $(a, b_1, c_1) = 1$, as claimed. Consequently, $j : P_\Delta(a) \to R_\Delta(a)$.

Notice that $j$ is clearly injective. But $j$ is also surjective; for if $\omega = (b_1 + \sqrt\Delta)/2a \in R_\Delta(a)$, then let $b$ be chosen so that $0 \le b < 2a$ and $b \equiv b_1 \bmod 2a$. Then by reversing the steps above, we see that $b \in P_\Delta(a)$ and $j(b) = \omega$.

We now use our information about $\rho_\Delta(a)$ to study the asymptotics of $\sum_{a < \sqrt\Delta} r_\Delta(a)/a$ and ultimately that of $\Psi(N)$.

Proposition 4.12. If $L(s, \chi_\Delta)$ has no zeros in $R(21/22 - \varepsilon, \Delta^{4/15})$ then
\[
\sqrt\Delta \sum_{a < \sqrt\Delta} \frac{r_\Delta(a)}{a} = \frac{6}{\pi^2} h^+(\Delta) \log \varepsilon^+_\Delta \log \sqrt\Delta + \frac{6}{\pi^2} \sqrt\Delta\, L'(1, \chi_\Delta) + O\bigl(h^+(\Delta) \log \varepsilon^+_\Delta \log\log\Delta\bigr),
\]
where the implied constant is independent of $\Delta$.
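The three parts of Proposition 4.11 can be sanity-checked with integer arithmetic only. The sketch below assumes that $\rho_\Delta(a) = |P_\Delta(a)|$ with $P_\Delta(a)$ the set of $b \in [0, 2a)$ such that $b^2 \equiv \Delta \bmod 4a$ and $\gcd(a, b, c) = 1$ (this is how $P_\Delta(a)$ is used in the proof above; the formal definition lies outside this passage). The inequalities defining $R_\Delta(a)$ are rewritten without square roots: $\sqrt\Delta < 2a + b$ becomes $\Delta < (2a+b)^2$, and $\sqrt\Delta > 2a - b$ holds automatically when $2a - b \le 0$ and otherwise becomes $\Delta > (2a-b)^2$. Function names are hypothetical.

```python
from math import gcd


def rho_primitive(delta, a):
    """|P(a)|: b in [0, 2a) with b*b = delta (mod 4a) and gcd(a, b, c) = 1."""
    count = 0
    for b in range(2 * a):
        if (b * b - delta) % (4 * a) == 0:
            c = (b * b - delta) // (4 * a)
            if gcd(gcd(a, b), c) == 1:
                count += 1
    return count


def r_reduced(delta, a):
    """|R(a)|: reduced numbers (b + sqrt(delta)) / (2a) for a non-square delta."""
    count = 0
    b = 1
    while b * b < delta:  # 0 < b < sqrt(delta)
        if (b * b - delta) % (4 * a) == 0:
            c = (b * b - delta) // (4 * a)
            conjugate_ok = delta < (2 * a + b) ** 2  # (-b + sqrt(delta)) / 2a < 1
            larger_than_one = 2 * a - b <= 0 or delta > (2 * a - b) ** 2
            if gcd(gcd(a, b), c) == 1 and conjugate_ok and larger_than_one:
                count += 1
        b += 1
    return count
```

For $\Delta = 20$ and $a = 1$ both counts equal 1: the set $P_{20}(1)$ contains $b = 0$, and the corresponding reduced number is $(4 + \sqrt{20})/2$.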
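Proof. By Propositions 4.11, 3.13 and 3.12 we have
\[
\begin{aligned}
\sqrt\Delta \sum_{a < \sqrt\Delta} \frac{r_\Delta(a)}{a}
&= \sqrt\Delta \sum_{a < \sqrt\Delta/2} \frac{\rho_\Delta(a)}{a} + O\!\left(\sqrt\Delta \sum_{\sqrt\Delta/2 < a < \sqrt\Delta} \frac{\rho_\Delta(a)}{a}\right) \\
&= \frac{6}{\pi^2} \sqrt\Delta\, L(1, \chi_\Delta) \log(\sqrt\Delta/2) + \frac{6}{\pi^2} \sqrt\Delta\, L'(1, \chi_\Delta) \\
&\quad + O\bigl(\sqrt\Delta\, L(1, \chi_\Delta) \log\log\Delta\bigr) + O(\Delta^{29/60+\varepsilon}) + O\bigl(\sqrt\Delta\, L(1, \chi_\Delta)(\log\sqrt\Delta - \log(\sqrt\Delta/2))\bigr).
\end{aligned}
\]
But
\[
L(1, \chi_\Delta) = \frac{h^+(\Delta) \log \varepsilon^+_\Delta}{\sqrt\Delta},
\]
which establishes the proposition.

Proposition 4.13.
\[
\Psi(N) = \frac{12}{\pi^2} \sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N) \left(h^+(\Delta) \log \varepsilon^+_\Delta \log\sqrt\Delta + \sqrt\Delta\, L'(1, \chi_\Delta)\right) + O(N^2 \log\log N),
\]
as $N \to \infty$.

Proof. We have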
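\[
\Psi(N) = 2 \sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N) \sum_{\omega \in R(\Delta)} [\omega]
= 2 \sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N) \sqrt\Delta \sum_{a < \sqrt\Delta} \frac{r_\Delta(a)}{a} + O\!\left(\sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N)\, \kappa(\Delta)\right), \tag{4.3}
\]
by Proposition 4.10. Now
\[
\sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N)\, \kappa(\Delta) \le \log N \sum_{\varepsilon^+_\Delta < N} \frac{\kappa(\Delta)}{\log \varepsilon^+_\Delta} \ll N^2,
\]
by Lemma 4.6 and Corollary 4.8, respectively. Let $N(\sigma, T, \chi)$ be the number of zeros of $L(s, \chi)$ in $R(\sigma, T)$. From [13], Theorem 12.2, it follows that
\[
\sum_{\Delta_0 \le N^2} N(21/22 - \varepsilon, N, \chi_{\Delta_0}) \ll N^{10/21+25\varepsilon},
\]
where $\Delta_0$ runs through fundamental discriminants. Furthermore, define
\[
E(N) := \sideset{}{^*}\sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N),
\]
where $*$ indicates that $\Delta$ runs through discriminants for which $L(s, \chi_\Delta)$ has a zero in $R(21/22 - \varepsilon, N)$. If $\Delta = \Delta_0 r^2$ with a fundamental discriminant $\Delta_0$, then
\[
\beta(\Delta, N) = |\{ (u, v) \in \mathbb{N}^2 \mid u^2 - \Delta_0 (rv)^2 = 4,\ u \le N \}|.
\]
Since $L(s, \chi_\Delta)$ and $L(s, \chi_{\Delta_0})$ have the same zeros with positive real part we have
\[
E(N) = \sideset{}{^*}\sum_{\Delta_0 \le N^2}\ \sum_{\substack{(u, v') \in \mathbb{N}^2 : \\ u^2 - \Delta_0 (v')^2 = 4,\ u \le N}}\ \sum_{r \mid v'} 1.
\]
The innermost sum equals $d(v') \ll_\varepsilon (v')^\varepsilon \ll N^\varepsilon$. Thus by Lemma 4.6,
\[
E(N) \ll_\varepsilon N^\varepsilon \sideset{}{^*}\sum_{\Delta_0 \le N^2} \beta(\Delta_0, N) \ll N^{2\varepsilon} \sum_{\Delta_0 \le N^2} N(21/22 - \varepsilon, N, \chi_{\Delta_0}) \ll N^{10/21+27\varepsilon}.
\]
If $\Delta \le N^2$ and $L(s, \chi_\Delta)$ has no zero in $R(21/22 - \varepsilon, N)$ we use Proposition 4.12 to estimate the inner sum in the main term of (4.3). If $L(s, \chi_\Delta)$ has a zero in $R(21/22 - \varepsilon, N)$ we use $\rho_\Delta(a) \ll a^{1/2+\varepsilon}$, $L(1, \chi_\Delta) \ll \log\Delta$, $L'(1, \chi_\Delta) \ll \log^2\Delta$ to get the trivial estimate
\[
\sqrt\Delta \sum_{a < \sqrt\Delta} \frac{r_\Delta(a)}{a} - \frac{6}{\pi^2} \sqrt\Delta\, L(1, \chi_\Delta) \log\sqrt\Delta - \frac{6}{\pi^2} \sqrt\Delta\, L'(1, \chi_\Delta)
\ll_\varepsilon \sqrt\Delta \sum_{a < \sqrt\Delta} a^{-1/2+\varepsilon} + \sqrt\Delta \log^2\Delta \ll \Delta^{3/4+\varepsilon/2} \ll N^{3/2+\varepsilon}.
\]
Thus
\[
\begin{aligned}
\Psi(N) = {} & 2 \sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N)\, \frac{6}{\pi^2} \left(\sqrt\Delta\, L(1, \chi_\Delta) \log\sqrt\Delta + \sqrt\Delta\, L'(1, \chi_\Delta)\right) + O(N^2) \\
& + O\!\left(\sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N)\, h^+(\Delta) \log \varepsilon^+_\Delta \log\log\Delta\right) + O\bigl(N^{3/2+\varepsilon} E(N)\bigr).
\end{aligned}
\]
But
\[
\sum_{\varepsilon^+_\Delta < N} \beta(\Delta, N)\, h^+(\Delta) \log \varepsilon^+_\Delta \le \log N \sum_{\varepsilon^+_\Delta < N} h^+(\Delta) \ll N^2,
\]
by Lemma 4.6 and Theorem 3.1 of [18], respectively.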
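In order to obtain the desired asymptotics for $\Psi(N)$ we need one last important result.

Proposition 4.14. For any $\varepsilon > 0$,
\[
\sum_{\varepsilon^+_\Delta < x} \beta(\Delta, x)\, h^+(\Delta) \log \varepsilon^+_\Delta \log\Delta = x^2 \log x - \kappa_2 x^2 + O_\varepsilon(x^{\rho+\varepsilon}),
\]
\[
\sum_{\varepsilon^+_\Delta < x} \beta(\Delta, x) \sqrt\Delta\, L'(1, \chi_\Delta) = -\kappa_3 x^2 + O_\varepsilon(x^{\rho+\varepsilon}),
\]
where $\kappa_2, \kappa_3$ are numerically computable positive constants and $\rho = (\sqrt{113} - 3)/4 \approx 1.9075$.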
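Proof. These mean value formulae are closely related to that considered by Sarnak [18] but they cannot be reduced to his result. His method does not apply either. Instead we use the method from [15]. Since we work in the ground field $\mathbb{Q}$ and consider only first moments of class numbers the proof is much simpler. On the other hand we have the additional factor $\log\Delta$. For a positive non-square discriminant $\Delta$, define
\[
f(\Delta) := h^+(\Delta) \log \varepsilon^+_\Delta \log\Delta = \sqrt\Delta\, L(1, \chi_\Delta) \log\Delta \qquad \text{(case I)}
\]
respectively
\[
f(\Delta) := \sqrt\Delta\, L'(1, \chi_\Delta). \qquad \text{(case II)}
\]
Define $\Delta_1 := 0$, $\Delta_2 := 1$. For $j \in \{1, 2\}$, $v \in \mathbb{N}$, let $m_{ij}(v)$, $1 \le i \le s_j(v)$, be all solutions of $m^2 \equiv \Delta_j v^2 + 4 \ (4v^2)$ in $[0, 4v^2)$. Define
\[
p_{ijv}(t) := 16 v^2 t^2 + 8 m_{ij}(v)\, t + \frac{m_{ij}(v)^2 - 4}{v^2} \in \mathbb{Z}[t].
\]
Then
\[
S(x) := \sum_{\varepsilon^+_\Delta < x} \beta(\Delta, x) f(\Delta) = \sum_{\substack{(u, v) \in \mathbb{N}^2,\ \Delta \ge 1 : \\ u^2 - \Delta v^2 = 4,\ 2 < u \le x}} f(\Delta) = \sum_{1 \le v \le x}\ \sum_{j=1}^{2}\ \sum_{\substack{\Delta \equiv \Delta_j\ (4),\ 2 < u \le x : \\ u^2 - \Delta v^2 = 4}} f(\Delta). \tag{4.4}
\]
By dissecting the range $2 < u \le x$ into congruence classes modulo $4v^2$ we see that the innermost sum equals
\[
\sum_{i=1}^{s_j(v)}\ \sum_{\substack{t \in \mathbb{Z} : \\ 2 < m_{ij}(v) + 4v^2 t \le x}} f(p_{ijv}(t)).
\]
Now we need an asymptotic formula for $L(1, \chi_\Delta)$, respectively $L'(1, \chi_\Delta)$. Let $\sigma_0 := (25 - \sqrt{113})/16$, $\sigma_0 < \gamma < 1$, $\Delta \le x^2$. First assume that $L(s, \chi_\Delta)$ has no zero in $R(\sigma_0, \log^2 x)$. For $y := x^\alpha$, $\alpha := 1/(2 - \sigma_0)$, $T := (\log^2 x)/2$, the theorem of residues gives
\[
\frac{1}{2\pi i} \int_{2 - iT}^{2 + iT} \Gamma(s-1)\, L(s, \chi_\Delta)\, y^{s-1}\, ds = L(1, \chi_\Delta) + \frac{1}{2\pi i} \left(\int_{2 - iT}^{\gamma - iT} + \int_{\gamma - iT}^{\gamma + iT} + \int_{\gamma + iT}^{2 + iT}\right). \tag{4.5}
\]
Stirling's formula shows that $\Gamma(s-1) \ll |t|^{\sigma - 3/2} e^{-\pi|t|/2}$ for $s = \sigma + it$, $|t| \ge 1$, $\sigma_0 \le \sigma \le 2$. Mellin's formula states that
\[
\frac{1}{2\pi i} \int_{a - i\infty}^{a + i\infty} \Gamma(s)\, u^{-s}\, ds = e^{-u} \quad \text{for } a, u > 0.
\]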
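Therefore the left-hand side of (4.5) equals
\[
\sum_{n=1}^{\infty} \frac{\chi_\Delta(n)}{n}\, e^{-n/y} + O\bigl(T^{1/2}\, y\, e^{-\pi T/2}\bigr).
\]
Since $L(s, \chi_\Delta)$ has no zeros in $R(\sigma_0, \log^2 x)$ it follows as in the proof of Proposition 3.13 that $L(s, \chi_\Delta) \ll_\varepsilon x^\varepsilon$ for $\gamma \le \sigma \le 2$, $|t| \le T$. An application of Cauchy's integral formula gives $L'(s, \chi_\Delta) \ll_\varepsilon x^\varepsilon$ in the same region. Estimating the integrals on the right-hand side of (4.5) gives
\[
L(1, \chi_\Delta) = \sum_{n=1}^{\infty} \frac{\chi_\Delta(n)}{n}\, e^{-n/y} + I(\Delta, y),
\]
where
\[
I(\Delta, y) \ll x^{2\varepsilon}\, y\, e^{-(\pi \log^2 x)/4} + x^\varepsilon y^{\gamma - 1}.
\]
If $L(s, \chi_\Delta)$ has a zero in $R(\sigma_0, \log^2 x)$ we have the trivial estimate
\[
I(\Delta, y) \ll |L(1, \chi_\Delta)| + \sum_{n=1}^{\infty} \frac1n\, e^{-n/y} \ll \log\Delta + \sum_{n \le y} \frac1n + \sum_{\nu=1}^{\infty}\ \sum_{\nu y < n \le (\nu+1) y} \frac{1}{\nu y}\, e^{-\nu} \ll \log x + \log y \ll_\varepsilon (xy)^\varepsilon.
\]
A similar argument shows that
\[
L'(1, \chi_\Delta) = -\sum_{n=1}^{\infty} \frac{\chi_\Delta(n) \log n}{n}\, e^{-n/y} + I'(\Delta, y),
\]
with the same upper bounds for $I'(\Delta, y)$ as for $I(\Delta, y)$. From (4.4) it follows that $S(x) = H(x) + R(x)$, where
\[
H(x) := \sum_{1 \le v \le x}\ \sum_{j=1}^{2}\ \sum_{i=1}^{s_j(v)}\ \sum_{\substack{t \in \mathbb{Z} :\ 2 < m_{ij}(v) + 4v^2 t \le x, \\ \Delta := p_{ijv}(t)}} \sqrt\Delta \log\Delta \sum_{n=1}^{\infty} \frac{\chi_\Delta(n)}{n}\, e^{-n/y}
\]
in case I. In case II the terms of the innermost sum have the additional weights $-\log n$, and the factor $\log\Delta$ does not occur. Furthermore, in case I
\[
R(x) := \sum_{\substack{(u, v) \in \mathbb{N}^2,\ \Delta \ge 1 : \\ u^2 - \Delta v^2 = 4,\ 2 < u \le x}} \sqrt\Delta \log\Delta\; I(\Delta, y).
\]
The same holds in case II with $\log\Delta\; I(\Delta, y)$ replaced by $I'(\Delta, y)$. The estimate
\[
|\{ (u, v, \Delta) \mid u, v \in \mathbb{N},\ \Delta \ge 1 \text{ a discriminant},\ u^2 - \Delta v^2 = 4,\ 2 < u \le x,\ L(s, \chi_\Delta) \text{ has a zero in } R(\sigma_0, \log^2 x) \}| \ll_\varepsilon x^{\rho - 1 + \varepsilon}
\]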
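follows as in the proof of Proposition 4.13. Also see the proof of Satz 1.1 and the Bemerkung following this theorem in [15]. Consequently in both cases I and II,
\[
R(x) \ll x^{2+3\varepsilon} y^{\gamma - 1} + x^{\rho + 4\varepsilon}.
\]
For $j \in \{1, 2\}$, $n, v \in \mathbb{N}$, define
\[
c_j(v, n) := \sum_{i=1}^{s_j(v)}\ \sum_{a \bmod 4n} \left(\frac{p_{ijv}(a)}{n}\right), \qquad c(v, n) := c_1(v, n) + c_2(v, n), \qquad s(v) := s_1(v) + s_2(v).
\]
Using the $4n$-periodicity of $a \mapsto (p_{ijv}(a)/n)$ gives for $v, n \in \mathbb{N}$, $x \ge 3$,
\[
\sum_{\substack{\Delta \ge 1,\ 2 < u \le x : \\ u^2 - \Delta v^2 = 4}} \chi_\Delta(n) = \sum_{j=1}^{2}\ \sum_{i=1}^{s_j(v)}\ \sum_{\substack{t \in \mathbb{Z} :\ 2 < m_{ij}(v) + 4v^2 t \le x, \\ \Delta := p_{ijv}(t)}} \chi_\Delta(n) = \frac{x}{16 v^2 n}\, c(v, n) + O(s(v)\, n).
\]
The computation of $c(v, n)$ and $s(v)$ requires rather tedious but completely elementary calculations. Therefore only the results are given here:
\[
s_j(v) = 2^{|\{p > 2 \,\mid\, p \mid v\}|} \cdot \begin{cases} 2 & \text{if } 2 \nmid v, \\ 4 & \text{if } 2 \,\|\, v,\ j = 1, \\ 0 & \text{if } 2 \,\|\, v,\ j = 2, \\ 8 & \text{if } 4 \,\|\, v,\ j = 1, \\ 0 & \text{if } 4 \,\|\, v,\ j = 2, \\ 8 & \text{if } 8 \mid v, \end{cases}
\qquad
c(v, n) = c\bigl(v, 2^{\operatorname{ord}_2 n}\bigr) \prod_{\substack{p \mid n \\ p > 2}} c\bigl(v, p^{\operatorname{ord}_p n}\bigr),
\]
where, for $p > 2$,
\[
c(v, p^a) = \begin{cases} p^{a-1}(p - 1) & \text{if } a \ge 2 \text{ even},\ p \mid v, \\ p^{a-1}(p - 2) & \text{if } a \ge 2 \text{ even},\ p \nmid v, \\ 0 & \text{if } a \ge 1 \text{ odd},\ p \mid v, \\ -p^{a-1} & \text{if } a \ge 1 \text{ odd},\ p \nmid v, \end{cases}
\]
and
\[
c(v, 2^a) = \begin{cases} 4 s(v) & \text{if } a = 0, \\ 2^{a+2} s_2(v) & \text{if } a \ge 2 \text{ even}, \\ 2^{a+2}\bigl(2 s'(v) - s_2(v)\bigr) & \text{if } a \ge 1 \text{ odd}, \end{cases}
\qquad
s'(v) = 2^{|\{p > 2 \,\mid\, p \mid v\}|} \cdot \begin{cases} 0 & \text{if } 8 \nmid v, \\ 4 & \text{if } 8 \mid v. \end{cases}
\]
In particular,
\[
c(v, n) \ll 2^{\nu(v)} \frac{n}{K(n)}, \qquad s(v) \ll 2^{\nu(v)},
\]
where $K(n)$ is the squarefree kernel of $n$. Partial summation gives for $n \in \mathbb{N}$, $x \ge 3$, $1 \le v \le x$,
\[
\sum_{\substack{\Delta \ge 1,\ 2 < u \le x : \\ u^2 - \Delta v^2 = 4}} \sqrt\Delta \log\Delta\; \chi_\Delta(n) = \frac{c(v, n)}{16 v^3 n}\, x^2 \log x - \frac{c(v, n)(1 + 2 \log v)}{32 v^3 n}\, x^2 + O\!\left(\frac{2^{\nu(v)}\, n\, x \log x}{v}\right),
\]
\[
\sum_{\substack{\Delta \ge 1,\ 2 < u \le x : \\ u^2 - \Delta v^2 = 4}} \sqrt\Delta\; \chi_\Delta(n) = \frac{c(v, n)}{32 v^3 n}\, x^2 + O\!\left(\frac{2^{\nu(v)}\, n\, x}{v}\right).
\]
In case I it follows that
\[
H(x) = x^2 \log x \sum_{1 \le v \le x}\ \sum_{n=1}^{\infty} \frac{c(v, n)}{16 v^3 n^2}\, e^{-n/y} - x^2 \sum_{1 \le v \le x}\ \sum_{n=1}^{\infty} \frac{c(v, n)(1 + 2 \log v)}{32 v^3 n^2}\, e^{-n/y} + O(x^{1+\varepsilon} y).
\]
Using $|e^{-n/y} - 1| \ll \min\{1, n/y\}$ and $\sum_{n \le z} K(n)^{-1} \ll z^{1/2}$ we see that the first double sum equals
\[
\kappa_1 + O(x^{-2+\varepsilon} + y^{-1/2}), \qquad \kappa_1 := \sum_{n, v = 1}^{\infty} \frac{c(v, n)}{16 v^3 n^2},
\]
and the second double sum equals
\[
\kappa_2 + O(x^{-2+\varepsilon} + y^{-1/2}), \qquad \kappa_2 := \sum_{n, v = 1}^{\infty} \frac{c(v, n)(1 + 2 \log v)}{32 v^3 n^2}.
\]
Consequently
\[
H(x) = \kappa_1 x^2 \log x - \kappa_2 x^2 + O(y^{-1/2} x^{2+\varepsilon} + x^{1+\varepsilon} y).
\]
Analogously, it follows in case II that
\[
H(x) = -\kappa_3 x^2 + O(y^{-1/2+\varepsilon} x^2 + (xy)^{1+\varepsilon}), \qquad \text{where } \kappa_3 := \sum_{n, v = 1}^{\infty} \frac{c(v, n) \log n}{32 v^3 n^2}.
\]
Collecting all results and using $y = x^\alpha$ with the value for $\alpha$ given above we get in case I
\[
S(x) = \kappa_1 x^2 \log x - \kappa_2 x^2 + O(x^{\rho + 4\varepsilon}),
\]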
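and in case II
\[
S(x) = -\kappa_3 x^2 + O(x^{\rho + 6\varepsilon}).
\]
To compute the constants the Dirichlet series
\[
D(s) := \sum_{n, v = 1}^{\infty} \frac{c(v, n)}{n^2 v^s}, \qquad \operatorname{Re}(s) > 1,
\]
may be used. Then $D(3) = 16 \kappa_1$, $D'(3) = 8 \kappa_1 - 16 \kappa_2$. Using the explicit formula for $c(v, n)$ we get for $v \in \mathbb{N}$,
\[
\sum_{n=1}^{\infty} \frac{c(v, n)}{n^2} = A \prod_{p > 2} \left(1 - \frac{2}{p^3 - p}\right) \prod_{\substack{p \mid v \\ p > 2}} 2 \left(1 + \frac{1}{p(p+1)}\right) \left(1 - \frac{2}{p^3 - p}\right)^{-1},
\]
where
\[
A = \begin{cases} \frac{40}{3} & \text{if } 2 \nmid v, \\ 16 & \text{if } 2 \,\|\, v, \\ 32 & \text{if } 4 \,\|\, v, \\ \frac{224}{3} & \text{if } 8 \mid v. \end{cases}
\]
From this it follows that
\[
D(s) = \prod_{p > 2} \left(1 - \frac{2}{p^3 - p}\right) \prod_p D_p(s),
\]
where
\[
D_2(s) = \frac{40}{3} + \frac{16}{2^s} + \frac{32}{2^{2s}} + \frac{224}{3} \cdot \frac{1}{2^{2s}(2^s - 1)}, \qquad
D_p(s) = 1 + \frac{1}{p^s - 1} \cdot \frac{2(p^3 - 1)}{p^3 - p - 2}, \quad p > 2.
\]
Consequently $D(3) = 16$ and $\kappa_1 = 1$. Furthermore,
\[
\frac{D'}{D}(3) = \sum_p \frac{D_p'}{D_p}(3)
\]
and therefore
\[
\kappa_2 = \frac12 + \frac{37 \log 2}{168} + 2 \sum_{p > 2} \frac{p^2 \log p}{(p^3 - 1)(p^2 - 1)}.
\]
For $\kappa_3$ there does not seem to be a reasonably simple series representation.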
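The evaluation $D(3) = 16$ rests on two exact identities: $D_2(3) = 16$, and for every odd prime the factor $\bigl(1 - \tfrac{2}{p^3 - p}\bigr) D_p(3)$ equals 1. Both can be confirmed with rational arithmetic. The following is a short sketch (function names are hypothetical):

```python
from fractions import Fraction


def d2(s):
    """Euler factor D_2(s) at a positive integer s."""
    return (
        Fraction(40, 3)
        + Fraction(16, 2**s)
        + Fraction(32, 2 ** (2 * s))
        + Fraction(224, 3) / (2 ** (2 * s) * (2**s - 1))
    )


def odd_prime_contribution(p, s):
    """(1 - 2/(p^3 - p)) * D_p(s) for an odd prime p and a positive integer s."""
    correction = 1 - Fraction(2, p**3 - p)
    d_p = 1 + Fraction(1, p**s - 1) * Fraction(2 * (p**3 - 1), p**3 - p - 2)
    return correction * d_p
```

Since every odd prime contributes exactly 1 at $s = 3$, the product collapses to $D_2(3) = 16$, which gives $\kappa_1 = D(3)/16 = 1$.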
Propositions 4.13 and 4.14 give the main result.

Theorem 4.15.
\[
\Psi(N) = \frac{6}{\pi^2} N^2 \log N + O(N^2 \log\log N),
\]
as $N \to \infty$.

Acknowledgements. We would like to thank Ram Murty and H. W. Lenstra for pointing out some literature relevant to this problem. We also thank Professor Lenstra for his helpful hints concerning the behavior of certain number theoretic functions related to the problem solved in this article. C.S. would like to thank the Department of Mathematics of the University of Maine for providing travel funds to the University of Cologne to work on the results in this and other papers during his sabbatical leave.
References
1. Adams, W. and Goldstein, L.: Introduction to Number Theory. Englewood Cliffs, NJ: Prentice-Hall, 1976
2. Bost, J.-B. and Connes, A.: Produits Euleriens et Facteurs de type III. C.R. Acad. Paris Ser. I Math. 315, 279–284 (1992)
3. Bost, J.-B. and Connes, A.: Hecke Algebras, Type III Factors and Phase Transitions with Spontaneous Symmetry Breaking in Number Theory. Selecta Math. New Series 1, No. 3, 411–457 (1995)
4. Contucci, P. and Knauf, A.: The Low Activity Phase of Some Dirichlet Series. J. Math. Phys. 37, 5458–5475 (1997)
5. Contucci, P. and Knauf, A.: The phase transition of the Number-Theoretical Spin Chain. Forum Mathematicum 9, 547–567 (1997)
6. Cvitanović, P.: Circle Maps: Irrationally winding. In: Waldschmidt, Moussa, Luck, Itzykson, eds., From Number Theory to Physics, Berlin–Heidelberg–New York: Springer-Verlag, 1992
7. Faivre, C.: Distribution of Lévy constants for quadratic numbers. Acta Arith. LXI.1, 13–34 (1992)
8. Julia, B.L.: Thermodynamic Limit in Number Theory: Riemann–Beurling Gases. Physica A 203, 425–436 (1994)
9. Kallies, J.: Eine engere Theorie quadratischer Zahlkörper. Preprint MPRESS (1998)
10. Kleban, P. and Özlük, A.: A Farey Fraction Spin Chain. Commun. Math. Phys. 203, 635–647 (1999)
11. Knauf, A.: On a ferromagnetic spin chain. Commun. Math. Phys. 153, 77–115 (1993)
12. Landau, E.: Elementary Number Theory. New York, N.Y.: Chelsea Publishing Co., 1958
13. Montgomery, H.L.: Topics in Multiplicative Number Theory. Lecture Notes in Mathematics 227, Berlin–Heidelberg: Springer-Verlag, 1971
14. Ono, T.: An Introduction to Algebraic Number Theory. New York, London: Plenum Press, 1990
15. Peter, M.: Momente der Klassenzahlen binärer quadratischer Formen mit ganzalgebraischen Koeffizienten. Acta Arith. LXX.1, 43–77 (1995)
16. Peter, M.: The limit distribution of a number theoretic function arising from a problem in statistical mechanics. To appear in J. Number Th.
17. Smith, H.: Note on the theory of the Pellian equation, and of binary quadratic forms of a positive determinant. Proc. London Math. Soc. 7, 199–208 (1876)
18. Sarnak, P.: Class Numbers of Indefinite Quadratic Forms. J. Number Theory 15, 229–247 (1982)
19. Titchmarsh, E.: The Theory of the Riemann Zeta-function. 2nd Edition, Oxford: Oxford Science Publications, 1986

Communicated by H. Spohn
Commun. Math. Phys. 222, 45 – 96 (2001)
Notes for a Quantum Index Theorem
Roberto Longo
Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]
Received: 8 March 2000 / Accepted: 17 April 2001
Dedicated to Sergio Doplicher on the occasion of his sixtieth birthday
Abstract: We view DHR superselection sectors with finite statistics as Quantum Field Theory analogs of elliptic operators where KMS functionals play the role of the trace composed with the heat kernel regularization. We extend our local holomorphic dimension formula and prove an analogue of the index theorem in the Quantum Field Theory context. The analytic index is the Jones index, more precisely the minimal dimension, and, on a 4-dimensional spacetime, the DHR theorem gives the integrality of the index. We introduce the notion of holomorphic dimension; the geometric dimension is then defined as the part of the holomorphic dimension which is symmetric under charge conjugation. We apply the AHKT theory of chemical potential and we extend it to the low dimensional case, by using conformal field theory. Concerning Quantum Field Theory on a curved spacetime, the geometry of the manifold enters in the expression for the dimension. If a quantum black hole is described by a spacetime with bifurcate Killing horizon and sectors are localizable on the horizon, the variation of logarithm of the geometric dimension is proportional to the incremental free energy, due to the addition of the charge, and to the inverse temperature, hence to the inverse of the surface gravity in the Hartle–Hawking KMS state. For this analysis we consider a conformal net obtained by restricting the field to the horizon (“holography”). Compared with our previous work on Rindler spacetime, this result differs inasmuch as it concerns true black hole spacetimes, like the Schwarzschild–Kruskal manifold, and pertains to the entropy of the black hole itself, rather than of the outside system. An outlook concerns a possible relation with supersymmetry and noncommutative geometry.
Supported in part by MURST and GNAMPA–INDAM.
Contents
0. Introduction
1. First Properties of Holomorphic Cocycles
1.1 Index formulae. Factor case
1.2 The holomorphic dimension in the C∗-case
2. The Chemical Potential in Quantum Field Theory
2.1 The absolute and the relative part of the incremental free energy
3. Chemical Potential. Low Dimensional Case
3.1 KMS states and the generation of conformal nets
3.2 Normality of superselection sectors in temperature states
3.3 Extension of KMS states to the quantum double
4. Roberts and Connes–Takesaki Cohomologies
5. Increment of Black Hole Entropy as an Index
5.1 The conformal structure on the black hole horizon
5.2 Charges localizable on the horizon
5.3 An index formula
6. On the Index of the Supercharge Operator
7. Outlook. Comparison with the JLO Theory
7.1 Induction of cyclic cocycles
7.2 Index and super-KMS functionals
A. Appendix. Some Properties of Sectors in the C∗-Case
A.1 The canonical endomorphism
A.2 The quantum double in the C∗ setting
0. Introduction
These notes are a natural outgrowth of our previous work on a local holomorphic formula for the dimension of a superselection sector [47] and were motivated by the aim of giving a geometrical picture of aspects of local quantum physics related to the superselection structure. They may be read from different points of view; in particular, guided by the similarity of the statistical dimension with the Fredholm index and by a possible index theorem, already suggested in [15], we shall regard the DHR localized endomorphisms [16] as quantum analogs of elliptic differential operators.
A heuristic preamble. We begin by giving a heuristic, but elementary, motivation for our dimension formula, postponing for the moment the specification of the underlying structure. Let the selfadjoint operator $H_0$ be a reference Hamiltonian for a Quantum Statistical Mechanics system, generating the evolution on an operator algebra $A$. $H_0$ may be thought to correspond to the Laplacian on a compact Riemannian manifold. We write any other Hamiltonian $H$ as a perturbation $H = H_0 + P$ of $H_0$. We assume a supersymmetric structure, essentially the existence of a Dirac operator $D$ (an odd square root of $H$) that implements an odd derivation of $A$. The McKean–Singer lemma then shows that
$$\mathrm{Tr}_s(e^{-\beta H}) = \text{Fredholm index of } D \tag{1}$$
for any $\beta > 0$, where $\mathrm{Tr}_s$ denotes the super-trace. In particular $\mathrm{Tr}_s(e^{-\beta H})$ (the Witten index) is an integer. Now $\omega^{(\beta)} = \mathrm{Tr}_s(e^{-\beta H_0}\,\cdot\,)/\mathrm{Tr}_s(e^{-\beta H_0})$ is the normalized super-Gibbs functional at inverse temperature $\beta$ for the dynamics generated by $H_0$ and if we consider the unitary cocycle
$$u_P(t) \equiv e^{itH}e^{-itH_0} \tag{2}$$
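relating the two evolutions, and that belongs to $A$ if $P \in A$, we may write the following formula in terms of $\omega^{(\beta)}$ and $u_P$:
$$\underset{t\to i\beta}{\mathrm{anal.cont.}}\ \omega^{(\beta)}(u_P(t)) = \frac{\mathrm{Tr}_s(e^{-\beta H})}{\mathrm{Tr}_s(e^{-\beta H_0})}, \tag{3}$$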
provided the latter makes sense. We will thus regard the above expression as a multiplicative relative index between H0 and H .1 As shown in [32, 33] this index is invariant under deformations, in particular Index(uP ) = Index(uI ) if P is an odd element of A in the domain of the superderivation and may be obtained by evaluating at the identity the JLO Chern character associated with ω(β) . At infinite volume however, the integrality of the index is not evident. In the case of infinite volume systems, likewise for the Laplacian on non-compact manifolds, the Hamiltonian no longer has any discrete spectrum and may exist only as a derivation. What survives after the thermodynamical limit is the time evolution, a one-parameter automorphism group α of a C∗ -algebra A. The equilibrium states ω(β) are characterized by the KMS condition [26]. However the cocycle (2) may well exist and belong to A, so that formula (3) may be generalized to define the index of a cocycle, if the analytic continuation exists. In this paper we shall consider the case of Relativistic Quantum Statistical Mechanics, by which we mean the consideration of KMS functionals for the time evolution in Quantum Field Theory. The basic relativistic property, locality or finite propagation of speed of light, will select the appropriate class of cocycles and will imply the integrality of the index as will be explained. For reasons that will partly be clarified, the supersymmetric structure will not play a direct role in our formulae for the dimension. Ingredients for QFT analysis. Let us discuss now some fundamental aspects of Physics and Analysis. Quantum Field Theory can be considered at the same time as a generalization of two Physical theories of very different nature: Classical (Lagrangian) Field Theory and Quantum Mechanics. Both of them extend Classical Mechanics, but point in apparently divergent directions. 
In the first case one goes from finitely many to infinitely many degrees of freedom, but remains in the classical framework. In the second case one replaces classical variables by quantum variables (operators), but remains within finitely many degrees of freedom. Quantum Field Theory inherits the richness of the two theories by treating infinitely many quantum variables and enhances them further, in particular by interaction, particle creation/annihilation, special relativity. There are thus two paths from the finite-dimensional classical calculus to QFT according to the following diagram:
$$\begin{array}{ccc} \text{Classical, finite dim.} & \longrightarrow & \text{Variational calculus} \\ \downarrow & & \downarrow \\ \text{Quantum, finite dim.} & \longrightarrow & \text{Quantum Field Theory} \end{array} \tag{4}$$
Note now that the passage from ordinary manifolds to variational calculus did not require a new calculus; for example the notion of derivative still makes sense replacing points by functions.
1 As a counterpart, an additive relative index generalizing $\mathrm{Tr}_s(e^{-\beta H} - e^{-\beta H_0})$ appeared in [8, 35].
On the other hand, the passage from classical to quantum mechanics does require a new structure (non-commutativity) and a new calculus. The standard quantization procedure replaces functions by selfadjoint operators and Poisson brackets by commutators. In this correspondence $x_h \to P_h$ and $-i\frac{\partial}{\partial x_h} \to Q_h$ give position and momentum operators that satisfy the Heisenberg commutation relations $[P_h, Q_k] = i\delta_{hk}I$. A quantized, finite-dimensional, calculus has been developed in recent times by A. Connes (see [13]); a sample dictionary is given below:
Classical      | Quantum
Variable       | Operator
Differential   | [F, ·]
Integral       | ⨍ (Dixmier trace)
Infinitesimal  | Compact operator
···            | ···
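The Heisenberg relations quoted before the dictionary can be checked by a direct computation. The following is only a sketch under an explicit assumption that is not spelled out in the text: $P_h$ is taken to act as multiplication by $x_h$ and $Q_k$ as $-i\,\partial/\partial x_k$ on smooth test functions $f$ (the symbol $f$ is introduced here for illustration).

```latex
% Assumption: P_h = multiplication by x_h, Q_k = -i \partial/\partial x_k,
% both acting on smooth test functions f(x_1, ..., x_n).
\begin{align*}
[P_h, Q_k]\, f
  &= x_h\Bigl(-i\,\frac{\partial f}{\partial x_k}\Bigr)
     + i\,\frac{\partial}{\partial x_k}\bigl(x_h f\bigr) \\
  &= -i\,x_h\,\frac{\partial f}{\partial x_k}
     + i\,\delta_{hk}\,f
     + i\,x_h\,\frac{\partial f}{\partial x_k} \\
  &= i\,\delta_{hk}\,f .
\end{align*}
```

The only non-commuting contribution comes from the product rule applied to $x_h f$, which is the source of the term $i\delta_{hk}I$.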
Concerning Quantum Field Theory, more or less implicit suggestions concerning a “second quantized” or QFT calculus can be found in [12, 15, 36, 37, 31, 44]. In particular we consider Jones subfactors and endomorphisms or Connes correspondences to be basic objects [44] in this setting. The underlying structure at each level is illustrated in the following table:
Classical | Classical variables | Variational calculus
          | Differential forms  | Infinite dimensional manifolds
          | Chern classes       | Function spaces
          |                     | Wiener measure
Quantum   | Quantum geometry    | Subfactors
          | Fredholm operators  | Correspondences, Endomorphisms
          | Index               | Multiplicative index
          | Cyclic cohomology   | Supersymmetric QFT, (A, H, Q)
Note that there is a non-trivial map from points $\longrightarrow$ fields, horizontally in diagram (4), that further enriches the structure. At the quantum level this is the second quantization functor; this is partly at the basis of the multiplicative structure of the index (cf. [4] for an example). We mention a first result, due to Connes [12], that may be read within the context of QFT analysis: the index map $\mathrm{Ind}_Q : K_0(A(O)) \to \mathbb{Z}$ is not polynomial and the K-theory group $K_0(A(O))$ is of infinite rank. Here $A(O)$ is a “smooth” local Bosonic algebra associated with a free massive supersymmetric field on the cylinder. After these premises, let us discuss the basic objects of our analysis, superselection sectors.
Superselection sectors as QFT analogs of elliptic operators. The celebrated Atiyah–Singer index theorem equates the analytic index of an elliptic operator to a geometric–topological index. The analytic index is the Fredholm index, which is manifestly an integer. The geometric index is intrinsically invariant under deformations. A major consequence of the index theorem is then the integrality of the geometrical index.
As discussed, Operator Algebras provide the proper quantization (non-commutative setting) for measure theory, topology and geometry. In particular, extensions of the index theorem by means of noncommutative K-theory and cyclic cohomology occur naturally in Noncommutative Geometry [12]. These results pertain to Connes’ quantized calculus. Here we shall deal with a quantum field theory analog of the index theorem. The role of the Fredholm linear operators is now played by the endomorphisms of an infinite factor with finite Jones index [44]. In Quantum Field Theory the index-statistics theorem [43] equates the DHR statistical dimension with the square root of the Jones index or, more precisely, with the minimal dimension, whence the integrality of the index is immediate by the integrality of the statistical dimension [16]. We then look for possible geometric counterparts of the statistical dimension. Our framework is Quantum Field Theory and its superselection structure [66], in the Doplicher–Haag–Roberts framework [16]. The local observable algebra $A(O)$ associated with a region $O$ of the spacetime provides a noncommutative version of the algebra of functions with support in $O$, and the localized endomorphisms with finite statistics are analogous to the elliptic differential operators, as suggested by S. Doplicher [15]. Note indeed that an endomorphism $\rho$ localized in the double cone $O_0$ is local in the sense that
$$\rho(A(O)) \subset A(O), \qquad O \supset O_0,$$
similarly to the locality property that characterizes the differential operators in the classical setting [54]. More is true, the correspondence $O \to A(O)$ is endowed with a natural, but not manifest, sheaf structure with respect to which covariant localized endomorphisms are sheaf maps [21], a fact that will be used only implicitly (it gives the automatic covariance used in Sect. 3.2). It is now clear that, in our context, geometric information may be contained in the classical geometry of the spacetime and in the net $O \to A(O)$. A geometrical description of the superselection structure of $A$ has been given by Roberts [58], who defined a non-abelian cohomology ring $H^1_R(A)$ whose elements correspond to the superselection sectors of $A$. In his formalism, however, it is unclear how to integrate a cohomological class to obtain an invariant, the dimension. A conceptually different cohomological structure appears in [14] by considering unitary cocycles associated with a dynamics. As we shall see in Sect. 4, if we consider only localized unitary cocycles associated with translations, we obtain a cohomology ring $H^1_\tau(A)$ which describes the covariant superselection sectors. Denoting by $S_{\mathrm{KMS}}$ the set of extremal KMS states for the time evolution, at inverse temperature $\beta$, satisfying Haag duality, we have indeed a pairing
$$S_{\mathrm{KMS}} \times H^1_\tau(A) \ni \varphi \times [u] \mapsto \langle \varphi, [u] \rangle = \int u(i\beta)\,d\varphi \in \mathbb{R} \tag{5}$$
that we shall below describe in Eqs. (6, 7). We have used, only here, the notation $\int u\,d\varphi \equiv \varphi(u)$ in order to provide resemblance with the classical context. It is now useful to compare in a table the context in the Atiyah–Singer theorem and in Quantum Field Theory, including the role possibly played by supersymmetry (see below).
                       | Atiyah–Singer context    | QFT context
sheaf structure        | functions on manifold V  | net (sheaf) of C∗-algebras on V
smooth structure       | smooth functions         | net of dense ∗-algebras
differential operator  | sheaf map                | localized endomorphism
elliptic operator      | Fredholm operator        | finite index endomorphism
analytical index       | Fredholm index           | minimal dimension
integrality            | Fredholm index ∈ Z       | statistical dimension ∈ N
geometric index        | associated with (D, V)   | associated with (ρ, A, V)
cohomology             | De Rham                  | Roberts; cyclic
deformation invariance | intrinsic                | perturbation invariance
Chern character        | Chern character          | pairing (5); JLO cyclic cocycle
Hamiltonian            | Laplacian                | Killing Hamiltonian
grading                | Dirac operator           | supersymmetry
spectral formula       | heat kernel              | (super)-Gibbs state
In this table we have considered finite volume spaces, although we will deal with non-compact spacetimes. As is known, Gibbs states become KMS states at infinite volume [26]. The analogous replacement of super-Gibbs functionals by super-KMS functionals is far less obvious, see the outlook. However, as mentioned, for many purposes concerning the index, one may work with ordinary KMS states.
Black holes, conformal symmetries and holography. There is a first important setting where the above programme may be implemented, with a formula for the geometric index involving (classical) spacetime geometry, namely the analysis of charge addition for a quantum black hole in a thermal state. We introduce here another piece of structure, namely we consider Quantum Field Theory on a curved spacetime. This physical theory combines General Relativity and Quantum Field Theory, but treats the gravitational field as a background field and therefore disregards effects occurring at the Planck length. Yet important effects, such as the Hawking effect [27], pertain to this context. We start with a black hole described by a globally hyperbolic spacetime with bifurcate Killing horizon, for example the Schwarzschild–Kruskal spacetime, and we consider quantum effects on this gravitational background. We then consider the incremental entropy due to the addition of a localized charge. The case of the Rindler spacetime has been previously described by means of a local analogue of a Kac–Wakimoto formula [47]. Here however we use a different point of view and conceptual scheme. The first basic point is that the restriction of the net $A$ to each of the two horizon components $h_+$ and $h_-$ gives a conformal net on $S^1$, a general fact that is obtained by applying Wiesbrock’s characterization of conformal nets [67], a structure already discussed in [48, 24].
It may appear analogous to the holography on the anti-de Sitter spacetime that independently appeared in the Maldacena–Witten conjecture² [52, 69], proved by Rehren [56]. Yet their context differs inasmuch as the anti-de Sitter spacetime is not globally hyperbolic and the holography there is a peculiarity of that spacetime, rather than a general phenomenon.
2 We thank M. Kontsevich for pointing out this similarity to us.
With these conformal nets at hand, and assuming the due duality properties, we may consider endomorphisms $\rho$ and $\sigma$ of $A$ that are localizable on $h_+$ or $h_-$. In particular, we will have the formula for the difference of the logarithm of dimensions:
$$\log d(\rho) - \log d(\sigma) = \frac{2\pi}{\kappa(V)}\bigl(F(\varphi_\rho|\varphi_\sigma) + F(\varphi_{\bar\rho}|\varphi_{\bar\sigma})\bigr),$$
where F -terms represent the incremental free energy between the thermal equilibrium states with the charges ρ and σ or the conjugate charges ρ¯ and σ¯ . Here ϕ is the Hartle– Hawking state and ϕρ is the corresponding equilibrium state in the presence of the charge ρ. The geometry appears here in the surface gravity κ(V) associated to the spacetime manifold V, see [65]. As a consequence the right hand side is the difference of the logarithm of two integers. We will consider also different temperature states. For a finer analysis of this context we refer to our Sect. 5. We shall need to study the chemical potential, as we are going to explain. Chemical potential. Relativistic case. The chemical potential is a label that each charge sets on different equilibrium states at the same temperature. Its structure in Quantum Statistical Mechanics has been explained in the work of Araki, Haag, Kastler and Takesaki [1], see also [2]. The labels appear by considering the extensions of these states from the observable algebra to KMS states of the field algebra (with the time evolution modified by one-parameter subgroups of the gauge group). Our aim is to consider temperature states in Quantum Field Theory. This may be motivated by the wish to study extreme physical contexts such as the early universe (not discussed in this paper) or black holes. In the context of Quantum Field Theory on the d + 1-dimensional Minkowski spacetime with d ≥ 2, Doplicher and Roberts [18] have constructed the field algebra associated with local observables and short range interaction charges. It is not difficult to apply the AHKT analysis, originally made in the context of C∗ dynamics, to the case of Quantum Field Theory on Minkowski spacetime as above. 
It turns out that every covariant irreducible localized endomorphism $\rho$ with finite dimension extends to the weak closure in the GNS representation $\pi_\varphi$ associated with a KMS state $\varphi$, a fact that should be expected on physical grounds because the addition of a single charge should not lead to an inequivalent representation for an infinite system. If $\varphi$ is extremal KMS, we are then led to the context of endomorphisms of factors with finite index. Assuming $\pi_\varphi$ to satisfy Haag duality, the extension of $\rho$ to the weak closure is still irreducible. If $u$ is the time covariance cocycle for $\rho$, then $u$ is a Connes Radon–Nikodym cocycle [11] up to phase, hence it satisfies certain holomorphic properties. We have the formula, that generalizes [47],
$$\log d_\varphi(u) = \log d(\rho) + \beta\mu_\rho(\varphi), \tag{6}$$
where the holomorphic dimension is defined by
$$d_\varphi(u) \equiv \underset{t\to i\beta}{\mathrm{anal.cont.}}\ \varphi(u(t)). \tag{7}$$
If there is a canonical choice for u, for example if the ρ is Poincaré covariant in the vacuum sector, then the above formulae define the chemical potential µρ (ϕ) of ϕ corresponding to the charge ρ. This extends an analogous expression in the AHKT work in the case of abelian charges (d(ρ) = 1). In general only the difference µρ (ϕ) − µρ (ψ) is intrinsic.
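The statement that only differences of chemical potentials are intrinsic can be illustrated by a short computation from (6) and (7). This is a sketch under an illustrative assumption that is not part of the text: the covariance cocycle is replaced by $u'(t) = e^{i\lambda t}u(t)$ with $\lambda$ real, the states $\varphi$ and $\psi$ are KMS at the same inverse temperature $\beta$, and $\mu'_\rho$ denotes the chemical potential computed from $u'$.

```latex
% Assumption (illustrative): u'(t) = e^{i\lambda t} u(t), \lambda \in \mathbb{R},
% is another covariance cocycle for \rho; \mu'_\rho is defined from u' via (6).
\begin{align*}
d_\varphi(u')
  &= \underset{t\to i\beta}{\mathrm{anal.cont.}}\; e^{i\lambda t}\,\varphi(u(t))
   = e^{-\lambda\beta}\, d_\varphi(u), \\
\log d_\varphi(u')
  &= \log d(\rho) + \beta\bigl(\mu_\rho(\varphi) - \lambda\bigr)
  \quad\Longrightarrow\quad
  \mu'_\rho(\varphi) = \mu_\rho(\varphi) - \lambda, \\
\mu'_\rho(\varphi) - \mu'_\rho(\psi)
  &= \mu_\rho(\varphi) - \mu_\rho(\psi).
\end{align*}
```

The phase shifts the chemical potential by the same constant $\lambda$ in every state, so it drops out of the difference.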
Now the localized endomorphisms form a C∗-tensor category [16] and in a C∗-tensor category there exists a natural anti-linear conjugation on the arrows between finite-dimensional objects [51]:
$$T \in (\rho, \sigma) \mapsto T^\bullet \in (\bar\rho, \bar\sigma).$$
Therefore, even if in general $u$ is defined only up to phase, $u^\bullet$ is a covariance cocycle for the conjugate charge $\bar\rho$ with opposite chemical potential
$$\mu_{\bar\rho}(\varphi) = -\mu_\rho(\varphi), \tag{8}$$
and we obtain an expression for the intrinsic dimension $d(\rho)$ as the geometric mean
$$d(\rho) = \sqrt{d_\varphi(u)\,d_\varphi(u^\bullet)},$$
which is independent of the phase fixing. We regard the right-hand side of this expression as a geometric dimension, according to what was explained before, being a candidate for geometric interpretation. The quantity $\beta^{-1}\log d_\varphi(u)$ represents the incremental free energy (adding the charge $\rho$). It has then a canonical decomposition as a sum of an intrinsic part $\beta^{-1}\log d(\rho)$, which is independent of $\varphi$, where $\log d(\rho)$ is half the incremental entropy associated with $\rho$ (cf. [55, 47]), and the chemical potential part $\mu_\rho(\varphi)$ characterized by the asymmetry with respect to charge conjugation in (8).
Chemical potential. Low dimensional case. Motivated by black hole thermodynamics and the associated conformal nets on the black hole horizon, among other considerations, one is led to the analysis of the chemical potential in one-dimensional Quantum Field Theory. In this context the field algebra no longer exists and the AHKT work is not applicable. The notion of holomorphic dimension still makes sense and is the basis of our analysis. The crucial point however is to show that localized endomorphisms are normal in the representation $\pi_\varphi$ associated to a thermal state $\varphi$. Based on the Wiesbrock characterization of conformal nets on $S^1$ [67], we shall see that there is a conformal net associated with $\varphi$, the conformal thermal completion. If $\pi_\varphi$ satisfies duality for half-lines, we prove that the thermal completion automatically satisfies Haag duality, namely it is strongly additive [23]. Then, a localized endomorphism $\rho$ with finite dimension of the original net gives rise to a transportable localized endomorphism of the thermal completion. By a result in [21] $\rho$ is automatically conformally covariant. We finally use this conformal covariance to show the normality of $\rho$ in the representation $\pi_\varphi$.
For completeness, we extend the work of AHKT also in regard to describing the chemical potential in terms of extensions of KMS states. This will be achieved by considering extensions to the quantum double, a C∗ -algebraic version of a construction in [50], that is a substitute for the non-existing field algebra. At this point the analysis of the chemical potential goes through as in the higher dimensional case with the corresponding formulae for the incremental free energy. In the case of globally hyperbolic spacetimes with bifurcate Killing horizon, these results allow one to treat thermal states for the Killing evolution and charges localizable on the horizon, as explained. The expected role of supersymmetry. Connes cyclic cohomology enters in Supersymmetric Quantum Field Theory via the work of Jaffe, Lesniewski and Osterwalder [31],
see also [37]. There is a noncommutative Chern character associated with a thermal equilibrium state, i.e. an entire cyclic cocycle associated to a supersymmetric KMS functional for the time evolution. In this context an index formula appears, which essentially coincides with our formula (7). There, however, the index formula acquires the geometric meaning of evaluating the JLO cycle at the identity. Indeed such a formula was used to show the deformation invariance of the index. On the other hand, in our context, our formula for the dimension is independent of an underlying supersymmetric structure. When we consider unitary cocycles associated with charges localizable in a bounded region, the dimension varies, but remains an integer. If we consider localized endomorphisms and the super-KMS functional, namely if we consider the index associated with a covariance cocycle and a super-KMS functional, we should expect our index formula to be read geometrically by the JLO cyclic cocycle. We will explain this point at the end of this paper, as it can certainly give further insight. But the present picture is too primitive to be directly applicable because super-KMS functionals fail to exist when the spacetime is non-compact in the most natural situation when the functional is translation invariant and space translations act in an asymptotically abelian fashion [10]. This drawback is entirely caused by the assumption that the super-KMS functional is bounded. The structure associated with the unbounded super-KMS functional is a problem for future investigation.
1. First Properties of Holomorphic Cocycles
In this section we begin to study holomorphic cocycles and give first formulae for the dimension.
1.1. Index formulae. Factor case. Here we recall and extend results in [47, Sect. 1]. We shall show how to obtain a holomorphic formula for the dimension of a sector that assumes neither a perfect symmetry group nor a PCT anti-automorphism.
The reader is however assumed to have read the above quoted reference.
Let $M$ be an infinite factor and denote by $\mathrm{End}(M)$ the (injective, normal, unital)³ endomorphisms of $M$ with finite Jones index and by $\mathrm{Sect}(M)$ the sectors of $M$, namely the equivalence classes of $\mathrm{End}(M)$ modulo inner automorphisms of $M$. $\mathrm{End}(M)$ is a tensor category where the tensor product is the composition, cf. [18]. The intertwiner space $(\rho, \sigma)$ between objects $\rho, \sigma \in \mathrm{End}(M)$ is defined as
$$(\rho, \sigma) \equiv \{T \in M : T\rho(X) = \sigma(X)T,\ \forall X \in M\}.$$
3 The “index” or “dimension” terminology is here interchangeable; analogously, the Fredholm index of an isometry is the dimension of its cokernel.
$\mathrm{Sect}(M)$ is endowed with a natural conjugation. $\bar\rho \in \mathrm{End}(M)$ is a conjugate of $\rho$ iff the conjugate equation holds true: there exist multiples of isometries $R \in (\iota, \bar\rho\rho)$ and $\bar R \in (\iota, \rho\bar\rho)$, that we normalize with $\|R\| = \|\bar R\|$, such that
$$\bar R^*\rho(R) = 1, \qquad R^*\bar\rho(\bar R) = 1. \tag{9}$$
The minimum
$$d(\rho) \equiv \min \|R\|\,\|\bar R\|$$
over all possible choices of $R$ and $\bar R$ is the intrinsic dimension of $\rho$. A pair $R_\rho$, $\bar R_\rho$ where the minimum is attained exists and coincides with a standard solution as discussed in [51]. It turns out that $d(\rho) = d_{an}(\rho)$, the analytical dimension defined as $d_{an}(\rho) = \sqrt{[M : \rho(M)]}$, the square root of the minimal index $[M : \rho(M)]$ (Jones–Kosaki index with respect to the minimal expectation) [44]. For each $T \in (\rho_1, \rho_2)$, the conjugate arrow $T^\bullet \in (\bar\rho_1, \bar\rho_2)$ is defined by
$$T^\bullet = \bar\rho_2(\bar R_{\rho_1}^* T^*)R_{\rho_2},$$
where $R_{\rho_i}$ and $\bar R_{\rho_i}$ give a standard solution for the conjugate equation defining the conjugate $\bar\rho_i$. The map $T \to T^\bullet$ is anti-linear and satisfies natural properties, see [51].
Fix a normal faithful state $\varphi$ of $M$ and let $\sigma^\varphi$ be the modular group of $\varphi$. As is well known, $\sigma^\varphi$ satisfies the KMS condition with respect to $\varphi$ at inverse temperature $-1$, namely, setting $\alpha_t = \sigma^\varphi_{-t}$, the relation (12) holds with $\beta = 1$. For a fixed $\rho$, let $u(\rho, \cdot)$ be a unitary $\sigma^\varphi$-cocycle (i.e. $u(\rho, t)$ is a unitary in $M$ and $u(\rho, t+s) = u(\rho, t)\sigma^\varphi_t(u(\rho, s))$, $t, s \in \mathbb{R}$) such that
$$\mathrm{Ad}\,u(\rho, t) \cdot \sigma^\varphi_t \cdot \rho = \rho \cdot \sigma^\varphi_t,$$
that is $u(\rho, t) \in (\rho_t, \rho)$, where $\rho_t \equiv \sigma^\varphi_t \cdot \rho \cdot \sigma^\varphi_{-t}$. Note that $u(\rho, \cdot)$ is not assumed to be continuous. Once standard $R_\rho$ and $\bar R_\rho$ are given, we assume the corresponding operators for the conjugate equation for $\rho_t$ and $\bar\rho_t \equiv \sigma^\varphi_t \cdot \bar\rho \cdot \sigma^\varphi_{-t}$ to be given by $R_{\rho_t} = \sigma^\varphi_t(R_\rho)$ and $\bar R_{\rho_t} = \sigma^\varphi_t(\bar R_\rho)$. Then $u(\rho, t)^\bullet \in (\bar\rho_t, \bar\rho)$ is given by
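$$u(\rho, t)^\bullet = \bar\rho\bigl(\bar R_{\rho_t}^*\,u(\rho, t)^*\bigr)R_\rho = \bar\rho\bigl(\sigma^\varphi_t(\bar R_\rho^*)\,u(\rho, t)^*\bigr)R_\rho.$$
If $\rho$ is irreducible, the choice of $R_\rho$ and $\bar R_\rho$ is unique up to a phase, therefore $u(\rho, t)^\bullet$ is uniquely defined. This holds in more generality, if $\rho$ is reducible.
Proposition 1.1. $u(\rho, t)^\bullet$ is well defined, namely it does not depend on the choice of $R_\rho$ and $\bar R_\rho$ giving a solution for the conjugate equation for $\rho$ and $\bar\rho$.
Proof. Let $R_\rho$ and $\bar R_\rho$ be a standard solution. If $R'_\rho$ and $\bar R'_\rho$ is another solution of the conjugate equation, then $R'_\rho = \bar\rho(v)R_\rho$ and $\bar R'_\rho = v^{*-1}\bar R_\rho$ for some invertible $v \in (\rho, \rho)$ [51], hence the conjugate of $u(\rho, t)$ with respect to $R'_\rho$ and $\bar R'_\rho$ is given by
$$\begin{aligned}
\bar\rho\bigl(\sigma^\varphi_t(\bar R_\rho^*)\,\sigma^\varphi_t(v^{-1})\,u(\rho, t)^*\bigr)\bar\rho(v)R_\rho
&= \bar\rho\bigl(\sigma^\varphi_t(\bar R_\rho^*)\,u(\rho, t)^*\,\sigma^{\varphi_\rho}_t(v^{-1})\bigr)\bar\rho(v)R_\rho \\
&= \bar\rho\bigl(\sigma^\varphi_t(\bar R_\rho^*)\,u(\rho, t)^*\,v^{-1}\bigr)\bar\rho(v)R_\rho \\
&= \bar\rho\bigl(\sigma^\varphi_t(\bar R_\rho^*)\,u(\rho, t)^*\bigr)R_\rho = u(\rho, t)^\bullet,
\end{aligned} \tag{10}$$
where $\sigma^{\varphi_\rho}$ is the modular group of $\varphi \cdot \Phi_\rho$, so that $\sigma^{\varphi_\rho}_t(v) = v$ because the minimal left inverse $\Phi_\rho$ of $\rho$ is tracial on $(\rho, \rho)$.
Proposition 1.2. Let $\rho$ be irreducible. We have
$$d(\rho)^2 = \underset{t\to -i}{\mathrm{anal.cont.}}\ \varphi(u(\rho, t))\,\varphi(u(\rho, t)^\bullet).$$
If $u(\rho, \cdot)$ is weakly continuous, then $u(\rho, \cdot)$ and $u(\rho, \cdot)^\bullet$ are holomorphic (see below) in the state $\varphi$, and
$$d(\rho)^2 = \underset{t\to -i}{\mathrm{anal.cont.}}\ \varphi(u(\rho, t))\ \underset{t\to -i}{\mathrm{anal.cont.}}\ \varphi(u(\rho, t)^\bullet).$$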
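Proof. We may suppose that $d(\rho) < \infty$. Set $\varphi_\rho \equiv \varphi \cdot \Phi_\rho$, where $\Phi_\rho$ is the minimal left inverse of $\rho$ as before. Since the Connes Radon–Nikodym cocycle $(D\varphi_\rho : D\varphi)$ is a covariance cocycle for $\rho$ [47], there exists a one-dimensional character $\chi$ of $\mathbb{R}$ such that
$$u(\rho, t) = \chi(t)\,d(\rho)^{it}(D\varphi_\rho : D\varphi)_t$$
(both cocycles intertwine $\sigma^\varphi$ and $\sigma^{\varphi_\rho}$). As shown in [47],
$$\bigl(d(\rho)^{it}(D\varphi_\rho : D\varphi)_t\bigr)^\bullet = d(\rho)^{it}(D\varphi_{\bar\rho} : D\varphi)_t,$$
$$u(\rho, t)^\bullet = \bar\chi(t)\bigl(d(\rho)^{it}(D\varphi_\rho : D\varphi)_t\bigr)^\bullet = \bar\chi(t)\,d(\rho)^{it}(D\varphi_{\bar\rho} : D\varphi)_t.$$
Moreover the Connes cocycle is holomorphic and $\underset{t\to -i}{\mathrm{anal.cont.}}\ \varphi((D\varphi_\rho : D\varphi)_t) = \varphi_\rho(1) = 1$ [12], therefore
$$\underset{t\to -i}{\mathrm{anal.cont.}}\ \varphi(u(\rho, t))\,\varphi(u(\rho, t)^\bullet) = \underset{t\to -i}{\mathrm{anal.cont.}}\ d(\rho)^{2it}\,\varphi((D\varphi_\rho : D\varphi)_t)\,\varphi((D\varphi_{\bar\rho} : D\varphi)_t) = d(\rho)^2. \tag{11}$$
If $u(\rho, \cdot)$ is continuous, then $\chi$ is continuous. So $\chi$ extends to an entire function, hence $u(\rho, \cdot)$ is holomorphic, and the second formula in the statement follows from the first one.
Remark. Let $T \subset \mathrm{End}(M)$ be a C∗ tensor category as before, and $u(\rho, t)$ a two-variable cocycle. Setting $d_\varphi(\rho) = d_\varphi(u(\rho, \cdot)) \equiv \underset{t\to -i}{\mathrm{anal.cont.}}\ \varphi(u(\rho, t))$ (see below) the arguments in [47] show that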
dϕ (ρ1 ⊕ ρ2 ) = dϕ (ρ1 ) + dϕ (ρ2 ), dϕ (ρ1 ρ2 ) = dϕ (ρ1 )dϕ (ρ2 ), ρ1 , ρ2 ∈ T . In particular, if T is rational, namely there are only finitely many inequivalent irreducible objects, the usual application of the Perron–Frobenious theorem entails d(ρ) = dϕ (ρ) for all objects of T (no chemical potential, see Sect. 2). 1.1.1. Case of a non-full C∗ tensor sub-category. Now let T be a C∗ tensor category with conjugates contained in End(M), thus the objects of T are finite-index endomorphisms of M, but we do not assume that T is a full sub-category of End(M), namely the intertwiner spaces (ρ, σ ) in T can be strictly contained in the corresponding intertwiner spaces in End(M). Assume that the modular group σ ϕ gives an action of R on T , that is ρt ∈ T , for all ϕ t ∈ R, ρ ∈ T , and σt ((ρ, ρ )) = (ρt , ρt ) if ρ, ρ ∈ T are objects of T . Recall that u is a two-variable unitary cocycle for the above action if, for each fixed object ρ ∈ T , u(ρ, ·) is a unitary σ ϕ -cocycle as above and, for each fixed t ∈ R, u(ρ, ·)∗ ϕ is cocycle with respect to σt , namely u(ρσ, t) = ρ(u(σ, t))u(ρ, t),
ρ, σ ∈ T ,
and ϕ
T u(ρ, t) = u(σ, t)σt (T ),
ρ, σ ∈ T , T ∈ (ρ, σ ).
Note that if $u$ is also a two-variable cocycle for the full tensor subcategory of $\mathrm{End}(M)$ with the same objects as $\mathcal{T}$, then $u(\rho,t)^{\bullet}$ defined there coincides with $u(\rho,t)^{\bullet}$ defined in $\mathcal{T}$, cf. [47], Propositions 1.5 and A.2.
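The two properties of the Connes cocycle used in the proof above (the cocycle identity with respect to $\sigma^{\varphi}$, and the value of the analytic continuation at $t = -i$) can be checked by hand in the simplest situation. The following is a minimal numerical sketch, assuming a finite-dimensional matrix algebra in which $\varphi = \mathrm{Tr}(h_\varphi\,\cdot)$ and $\psi = \mathrm{Tr}(h_\psi\,\cdot)$ are given by positive-definite density matrices, so that $(D\psi : D\varphi)_t = h_\psi^{it} h_\varphi^{-it}$ is entire in $t$ and can be evaluated directly at complex $t$. All names (`matrix_power_pd`, `connes_cocycle`, the dimension and the seed) are illustrative choices and do not come from the text.

```python
import numpy as np

def matrix_power_pd(h, z):
    """Return h**z for a positive-definite Hermitian matrix h and complex z."""
    w, v = np.linalg.eigh(h)
    # Scale eigenvector column j by w_j**z, then rotate back.
    return (v * (w.astype(complex) ** z)) @ v.conj().T

def random_density(n, rng, trace=1.0):
    """Random positive-definite n x n matrix with the prescribed trace."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = a @ a.conj().T + np.eye(n)      # eigenvalues >= 1, hence invertible
    return trace * h / np.trace(h).real

rng = np.random.default_rng(0)
n = 4
h_phi = random_density(n, rng, trace=1.0)   # phi(x) = Tr(h_phi x), phi(1) = 1
h_psi = random_density(n, rng, trace=2.5)   # psi(x) = Tr(h_psi x), psi(1) = 2.5

def phi(x):
    return np.trace(h_phi @ x)

def sigma(t, x):
    """Modular group of phi: sigma_t(x) = h_phi^{it} x h_phi^{-it}."""
    return matrix_power_pd(h_phi, 1j * t) @ x @ matrix_power_pd(h_phi, -1j * t)

def connes_cocycle(t):
    """(D psi : D phi)_t = h_psi^{it} h_phi^{-it}; t may be complex."""
    return matrix_power_pd(h_psi, 1j * t) @ matrix_power_pd(h_phi, -1j * t)

t, s = 0.7, -1.3
u_t, u_s = connes_cocycle(t), connes_cocycle(s)

# Cocycle identity u_{t+s} = u_t sigma_t(u_s) and unitarity for real t.
cocycle_defect = np.linalg.norm(connes_cocycle(t + s) - u_t @ sigma(t, u_s))
unitarity_defect = np.linalg.norm(u_t @ u_t.conj().T - np.eye(n))

# Value of the continuation of t -> phi((D psi : D phi)_t) at t = -i.
continued_value = phi(connes_cocycle(-1j))

print(cocycle_defect, unitarity_defect, continued_value, np.trace(h_psi).real)
```

In this toy setting the continued value agrees with $\psi(1) = \mathrm{Tr}(h_\psi)$ up to rounding, which is the finite-dimensional shadow of $\underset{t\to -i}{\mathrm{anal.cont.}}\,\varphi((D\varphi_\rho : D\varphi)_t) = \varphi_\rho(1)$.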
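Corollary 1.3. With the above notations, if $u(\rho,t)$ is a weakly continuous unitary two-variable cocycle for the action of $\mathbb{R}$ on $\mathcal{T}$ given by $\sigma^{\varphi}$, then
\[ d(\rho) \le \sqrt{\underset{t\to -i}{\mathrm{anal.cont.}}\,\varphi(u(\rho,t))\,\varphi(u(\bar\rho,t))} \]
for all irreducible $\rho \in \mathcal{T}$. Here $d(\rho)$ is the intrinsic dimension of $\rho$ as an object of $\mathcal{T}$.

Proof. We have $u(\rho,t)^{\bullet} = u(\bar\rho,t)$ [47] (the conjugate map is the one associated with $\mathcal{T}$), hence the above Proposition 1.1 applies, provided $\rho$ is irreducible in $\mathrm{End}(M)$. If $\rho$ is reducible in $\mathrm{End}(M)$ we have $u(\rho,t) = z(t)\,d_{\mathrm{an}}(\rho)^{it}(D\varphi_\rho : D\varphi)_t$ with $z(t) \in (\rho,\rho)$. By using the tracial property of $\Phi_\rho$ it is easy to check that $z$ is a one-parameter group of unitaries in the finite-dimensional algebra $(\rho,\rho)$, and therefore can be diagonalized. If $\rho = \oplus_i \rho_i$ is a decomposition of $\rho$ into irreducibles (with eigen-projections of $z$) in $\mathrm{End}(M)$, we have a corresponding decomposition of $d_{\mathrm{an}}(\rho)^{it}(D\varphi_\rho : D\varphi)_t$ as direct sum of the $d_{\mathrm{an}}(\rho_i)^{it}(D\varphi_{\rho_i} : D\varphi)_t$, thus $d_\varphi(u(\rho_i,\cdot)) = \ell_i\, d_{\mathrm{an}}(\rho_i)$ with $\ell_i > 0$, hence
\[ d_\varphi(u_\rho)\,d_\varphi(u_{\bar\rho}) = \sum_{i,j} \ell_i\, d_{\mathrm{an}}(\rho_i)\,\ell_j^{-1} d_{\mathrm{an}}(\rho_j) \ge \sum_{i,j} d_{\mathrm{an}}(\rho_i)\,d_{\mathrm{an}}(\rho_j) = d_{\mathrm{an}}(\rho)^2 \ge d(\rho)^2 . \]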
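The first inequality in the last display of the proof is elementary; as a sketch of the step, write $a_i \equiv d_{\mathrm{an}}(\rho_i) > 0$ (a shorthand introduced only here) and let $\ell_i > 0$ be the scalars with $d_\varphi(u(\rho_i,\cdot)) = \ell_i\, d_{\mathrm{an}}(\rho_i)$. Pairing the terms $(i,j)$ and $(j,i)$ and using $x + x^{-1} \ge 2$ for $x > 0$ gives
\[
\sum_{i,j} \ell_i \ell_j^{-1} a_i a_j
= \sum_i a_i^2 + \sum_{i<j} \Bigl(\frac{\ell_i}{\ell_j} + \frac{\ell_j}{\ell_i}\Bigr) a_i a_j
\ \ge\ \sum_i a_i^2 + 2\sum_{i<j} a_i a_j
= \Bigl(\sum_i a_i\Bigr)^2 ,
\]
with equality exactly when all the $\ell_i$ coincide. The right-hand side is $d_{\mathrm{an}}(\rho)^2$ by additivity of $d_{\mathrm{an}}$ on the direct sum $\rho = \oplus_i \rho_i$.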
We now recall that the following holds.

Proposition 1.4 ([47]). In the setting of Prop. 1.1, if there exists a $\varphi$-preserving anti-automorphism $j$ of $M$ inducing an anti-automorphism of $\mathcal{T}$ such that $j\cdot\rho\cdot j = \bar\rho$ and $j(u(\rho,t)) = u(\bar\rho,-t)$, then
\[ d(\rho) = \underset{t\to -i}{\mathrm{anal.cont.}}\,\varphi(u(\rho,t)) \]
for all objects $\rho \in \mathcal{T}$.

Proof. See [47, Prop. 1.7].
1.2. The holomorphic dimension in the C∗-case. In this section we give a first look at the structure that emerges in the C∗ context, in analogy to what was studied in the previous section in the setting of factors. Here we assume from the start that a holomorphic dimension is definable, postponing the more relevant derivation of the holomorphic property and the analysis of the chemical potential to subsequent sections.

Let $\mathfrak{A}$ be a unital C∗-algebra and $\alpha$ a one-parameter automorphism group of $\mathfrak{A}$. A linear functional $\varphi \in \mathfrak{A}^*$ is said to be a KMS functional with respect to $\alpha$ at inverse temperature $\beta > 0$ if for any given $a, b \in \mathfrak{A}$ there exists a function $F_{a,b} \in A(S_\beta)$ such that
\[ F_{a,b}(t) = \varphi(\alpha_t(a)\,b), \tag{12} \]
\[ F_{a,b}(t + i\beta) = \varphi(b\,\alpha_t(a)). \tag{13} \]
Here $S_\beta$ is the strip $\{0 < \mathrm{Im}\, z < \beta\}$ and $A(S_\beta)$ is the algebra of functions analytic in $S_\beta$, bounded and continuous on the closure of $S_\beta$. We do not assume $\alpha$ to be pointwise norm continuous; nonetheless a weaker continuity property follows from the KMS condition. Note that a KMS functional $\varphi$ is $\alpha$-invariant.

Let now $u \in \mathfrak{A}$ be a unitary cocycle with respect to $\alpha$, namely $t \in \mathbb{R} \to u(t) \in \mathfrak{A}$ is a map taking values in the unitaries of $\mathfrak{A}$ satisfying the equation
\[ u(t + s) = u(t)\,\alpha_t(u(s)). \]
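To make conditions (12) and (13) and the cocycle equation concrete, here is a minimal numerical sketch on a matrix algebra. It assumes a Gibbs functional $\varphi = \mathrm{Tr}(e^{-\beta H}\,\cdot)/\mathrm{Tr}(e^{-\beta H})$ and the dynamics $\alpha_t = \mathrm{Ad}\,e^{-itH}$; this sign convention is an assumption, chosen so that (12) and (13) hold exactly as written above. The perturbation $V$, the cocycle $u(t) = e^{-it(H+V)}e^{itH}$ and all function names are hypothetical illustrations, not constructions taken from the text.

```python
import numpy as np

def expm_hermitian(h, z):
    """Return exp(z*h) for a Hermitian matrix h and a complex scalar z."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(z * w)) @ v.conj().T

rng = np.random.default_rng(1)
n, beta = 3, 0.8

def random_matrix():
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def random_hermitian():
    m = random_matrix()
    return (m + m.conj().T) / 2

H = random_hermitian()                      # assumed Hamiltonian
gibbs = expm_hermitian(H, -beta)
gibbs /= np.trace(gibbs).real               # normalised density matrix

def phi(x):
    """Gibbs functional phi(x) = Tr(exp(-beta H) x) / Tr(exp(-beta H))."""
    return np.trace(gibbs @ x)

def alpha(z, a):
    """alpha_z(a) = exp(-izH) a exp(izH); z may be complex."""
    return expm_hermitian(H, -1j * z) @ a @ expm_hermitian(H, 1j * z)

a, b = random_matrix(), random_matrix()

def F(z):
    """F_{a,b}(z) = phi(alpha_z(a) b), entire in z in finite dimensions."""
    return phi(alpha(z, a) @ b)

t, s = 0.37, 1.1
# Condition (13): F_{a,b}(t + i beta) = phi(b alpha_t(a)).
kms_defect = abs(F(t + 1j * beta) - phi(b @ alpha(t, a)))

V = random_hermitian()                      # hypothetical perturbation

def u(t):
    """Candidate unitary alpha-cocycle u(t) = exp(-it(H+V)) exp(itH)."""
    return expm_hermitian(H + V, -1j * t) @ expm_hermitian(H, 1j * t)

# Cocycle equation u(t+s) = u(t) alpha_t(u(s)).
cocycle_defect = np.linalg.norm(u(t + s) - u(t) @ alpha(t, u(s)))

print(kms_defect, cocycle_defect)
```

Both defects vanish up to rounding. In finite dimensions every function involved is entire, so the boundedness and continuity requirements in the definition of $A(S_\beta)$ are automatic; the substance of the KMS condition in the text lies in the infinite-dimensional case.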
We shall say that the cocycle $u$ is holomorphic, in the functional $\varphi$, if the function $t \in \mathbb{R} \to \varphi(u(t))$ is the boundary value of a function in $A(S_\beta)$. If $u$ is holomorphic in the state $\varphi$, we define the holomorphic dimension of the cocycle $u$ (with respect to $\varphi$) by
\[ d_\varphi(u) = \underset{t\to i\beta}{\mathrm{anal.cont.}}\,\varphi(u(t)). \]
As we shall see, in our context $d_\varphi(u)$ will be a positive number related to a (noncommutative) relative index. Clearly, for a given dynamics $\alpha$, $d_\varphi(u)$ may depend on the KMS state $\varphi$. We shall sometimes rescale the “time parameter” to make the inverse temperature $\beta = 1$. If $u$ is not holomorphic we write $d_\varphi(u) = +\infty$.

Let $\rho$ be an endomorphism of $\mathfrak{A}$. We shall say that $\rho$ is covariant if there exists an $\alpha$-cocycle of unitaries $u(\rho,t) \in \mathfrak{A}$ such that
\[ \alpha_t\cdot\rho\cdot\alpha_{-t} = \mathrm{Ad}\,u(\rho,t)^*\cdot\rho . \tag{14} \]
We shall say that $\rho$ has finite holomorphic dimension (with respect to the KMS state $\varphi$) if it is covariant and there exists a covariance cocycle $u(\rho,\cdot)$ as above with finite holomorphic dimension. Note that, if $\mathfrak{A}$ has trivial centre and $\rho$ is irreducible, $u(\rho,\cdot)$ is unique up to a phase, which does not alter the finite-dimensional property of $u(\rho,\cdot)$, provided such a phase is chosen to be a continuous character of $\mathbb{R}$.

In the rest of this section we begin to study whether endomorphisms of $\mathfrak{A}$ with finite holomorphic dimension extend to $\pi_\varphi(\mathfrak{A})''$. In the following we identify $\mathfrak{A}$ with its image $\pi_\varphi(\mathfrak{A})$ and suppress the suffix $\varphi$.

Lemma 1.5. Let $\mathfrak{A}$ be a C∗-algebra acting on a Hilbert space $H$ and $\rho$ an endomorphism of $\mathfrak{A}$. Let $\xi$ be a cyclic separating vector for $M = \mathfrak{A}''$ and $\varphi = (\,\cdot\,\xi,\xi)$. If $\varphi\cdot\rho$ extends to a normal faithful positive functional of $M$, then $\rho$ extends to a normal endomorphism of $M$.

Proof. Let $\eta$ be a cyclic separating vector such that $(\rho(a)\xi,\xi) = (a\eta,\eta)$, $a \in \mathfrak{A}$, and let $V$ be the isometry of $H$ given by
\[ V a\eta = \rho(a)\xi, \qquad a \in \mathfrak{A}. \tag{15} \]
The final projection of $V$ is given by
\[ e = VV^* = [\rho(\mathfrak{A})\xi] \in \rho(\mathfrak{A})', \tag{16} \]
where $[\rho(\mathfrak{A})\xi]$ denotes the projection onto the closure of $\rho(\mathfrak{A})\xi$, thus $x \in M \to VxV^*$ is a homomorphism of $M$ onto $\rho(\mathfrak{A})''e$ and
\[ VaV^* = \rho(a)e, \qquad a \in \mathfrak{A}. \tag{17} \]
Now the central support of $e$ in $\rho(\mathfrak{A})'$ is 1 as $\overline{\rho(\mathfrak{A})'\xi} \supset \overline{M'\xi} = H$, hence if $x \in M$ there exists a unique $\rho(x) \in \rho(\mathfrak{A})''$ such that $\rho(x)e = VxV^*$, providing an extension of $\rho$ to $M$. As $\eta$ is separating, $\rho$ is an isomorphism.
Now, as in the factor case, $\mathrm{End}(\mathfrak{A})$ is the tensor category whose objects $\rho, \sigma, \dots$ are the endomorphisms of $\mathfrak{A}$: the monoidal product $\rho \otimes \sigma = \rho\sigma$ is given by the composition of maps, while the intertwiner space $(\rho,\sigma)$ is given by $\{T \in \mathfrak{A} : T\rho(a) = \sigma(a)T,\ \forall\, a \in \mathfrak{A}\}$. The tensor product of intertwiners is also defined in a natural fashion, see [18]. The conjugate equation is defined as in the previous section. The intrinsic dimension $d(\rho)$ and the conjugation on arrows are defined as well; in fact all these notions make sense for a general tensor C∗-category [51].

The following proposition is a special case of results in [51]. We state it in the particular case needed for our applications.

Proposition 1.6 ([51]). Let $\mathfrak{A}$ be a unital C∗-algebra, acting on a Hilbert space, with $M = \mathfrak{A}''$ a factor. Let $\mathcal{T}$ be a tensor category with conjugates and subobjects of endomorphisms of $\mathfrak{A}$ admitting a unitary braid group symmetry. Suppose that every endomorphism $\rho \in \mathcal{T}$ extends to a normal endomorphism $\hat\rho$ of $M$. Then $d(\hat\rho) = d(\rho)$, where $d(\hat\rho) = d_{\mathrm{an}}(\hat\rho)$, i.e. $d(\hat\rho)^2$ is the minimal index $[M : \hat\rho(M)]$, and $d(\rho)$ is the intrinsic dimension of $\rho$ in $\mathcal{T}$.

Proof. The map $\rho \to \hat\rho$ is a functor of C∗ tensor categories from $\mathcal{T}$ to a sub-tensor category of $\mathrm{End}(M)$, hence $d(\hat\rho) \le d(\rho)$. As $\mathcal{T}$ has a unitary braiding, every real object $\sigma \in \mathcal{T}$ is amenable [51, Th. 5.31], thus $d(\sigma) = \|m_\sigma\|$, where $\|m_\sigma\|$ is the $\ell^2$ norm of the fusion matrix $m_\sigma$ associated with $\sigma$, therefore
\[ d(\sigma) = \|m_\sigma\| \le \|m_{\hat\sigma}\| \le d(\hat\sigma) \le d(\sigma), \]
thus $d(\hat\sigma) = d(\sigma)$. For any $\rho \in \mathcal{T}$, the object $\sigma = \rho\bar\rho$ is real hence, by the multiplicativity of the dimension, $d(\hat\rho) = d(\rho)$.

1.2.1. Case of a unique KMS state. We now restrict our attention to the case of a unique KMS functional. This is done more with an illustrative intent, rather than for later applications, where we shall treat a more general context.

Proposition 1.7. Suppose $\varphi$ is the unique KMS functional for $\alpha$. Then a covariant endomorphism $\rho$ of $\mathfrak{A}$ with finite holomorphic dimension has a normal extension to $M \equiv \mathfrak{A}''$.

Proof. Let $u = u(\rho,\cdot)$ be a holomorphic unitary covariance $\alpha$-cocycle. As $t \to \varphi(u(t))$ is continuous, the map $t \in \mathbb{R} \to u(t) \in M$ is strongly continuous. To check this, note that by the cocycle property it is enough to verify the continuity at $t = 0$, because then the strong limit
\[ u(s + t) = u(s)\,\alpha_s(u(t)) \to u(s) \quad \text{as } t \to 0 \tag{18} \]
exists due to the normality of $\alpha_s$. Let then $x \in M$ be a weak limit point of $u(t)$ as $t \to 0$. Then $\|x\| \le 1$ and
\[ (x\xi_\varphi,\xi_\varphi) = \lim_{i\to\infty} (u(t_i)\xi_\varphi,\xi_\varphi) = \lim_{i\to\infty} \varphi(u(t_i)) = \varphi(1) = (\xi_\varphi,\xi_\varphi) \tag{19} \]
for some sequence $t_i \to 0$. Thus $x\xi_\varphi = \xi_\varphi$ by the limit case of the Schwarz inequality, thus $x = 1$ because $\xi_\varphi$ is separating.

Therefore by Connes’ theorem [11] there exists a normal faithful semi-finite weight $\varphi_\rho$ on $M$ with $(D\varphi_\rho : D\varphi)_t = u(t)$ and, by the finite holomorphic dimension assumption, $\varphi_\rho(1) = d_\varphi(u) < \infty$, so that $\varphi_\rho$ is indeed a positive linear functional. Then $\varphi_\rho$ is a KMS functional for its modular group $\alpha^{\rho}_{-t} = \mathrm{Ad}\,u(-t)\cdot\alpha_{-t}$ (we are setting $\beta = 1$ here).
The functional $\varphi'$ on $\mathfrak{A}$,
\[ \varphi'(a) = \varphi_\rho(\rho(a)), \qquad a \in \mathfrak{A}, \tag{20} \]
is KMS with respect to $\alpha$:
\[
\begin{aligned}
\underset{t\to i}{\mathrm{anal.cont.}}\,\varphi'(a\,\alpha_t(b)) &= \underset{t\to i}{\mathrm{anal.cont.}}\,\varphi_\rho(\rho(a)\,\rho(\alpha_t(b))) \\
&= \underset{t\to i}{\mathrm{anal.cont.}}\,\varphi_\rho(\rho(a)\,\alpha^{\rho}_t(\rho(b))) \\
&= \varphi_\rho(\rho(b)\rho(a)) = \varphi'(ba).
\end{aligned} \tag{21}
\]
Hence, by the uniqueness of the KMS state,
\[ \varphi' = \lambda\varphi \tag{22} \]
on $\mathfrak{A}$, for some $\lambda > 0$. Thus $\varphi_\rho$ is a normal faithful functional and $\varphi_\rho\cdot\rho$ is normal too. Therefore the proof is completed by Lemma 1.5.

To shorten notation, we shall often set $u_\rho = u(\rho,\cdot)$.

Proposition 1.8. Let $\varphi$ be the unique KMS state as in Proposition 1.7. If $\rho$ and $\bar\rho$ are conjugate and $d_\varphi(u_\rho) < \infty$, $d_\varphi(u_{\bar\rho}) < \infty$, then the extension of $\rho$ to $M$ has finite Jones index.

Proof. By assumption and Proposition 1.7 both $\rho$ and $\bar\rho$ extend to $M$. As $R_\rho$ and $\bar R_\rho$ are also intertwiners on $M$ by weak continuity, and the conjugate equation for $\rho$ and $\bar\rho$ is obviously satisfied on $M$, the extension of $\rho$ to $M$ has finite index.

Proposition 1.9. Suppose that $\varphi$ is faithful and the unique KMS state for $\alpha$, as in Prop. 1.8. Let $\mathcal{T}$ be a tensor category with conjugates of endomorphisms of $\mathfrak{A}$ and $u(\rho,t) \in \mathfrak{A}$ a unitary two-variable cocycle. If $u_\rho$ has finite holomorphic dimension for all irreducible objects $\rho$, then the intrinsic dimension $d(\rho)$ of $\rho$ is bounded by
\[ d(\rho) \le \sqrt{d_\varphi(u_\rho)\,d_\varphi(u_{\bar\rho})}, \tag{23} \]
and equality holds if $\rho$ is irreducible and extending to $M$ is a full functor (thus $\rho$ is irreducible on $M$).

Proof. By Proposition 1.8, $\rho$ extends to $M$ and has finite index. We are then in the case covered by Prop. 1.2 and Corollary 1.3.

Motivated by the above Proposition, we define the geometric dimension $d_{\mathrm{geo}}(\rho)$ as
\[ d_{\mathrm{geo}}(\rho) \equiv \sqrt{d_\varphi(u_\rho)\,d_\varphi(u_\rho^{\bullet})}. \]
If $\rho$ is irreducible, $d_{\mathrm{geo}}(\rho)$ does not depend on the choice of the covariance cocycle $u(\rho,t)$, because, if we multiply $u(\rho,t)$ by a phase $\chi(t)$, then $u(\rho,t)^{\bullet}$ has to be replaced by $\overline{\chi(t)}\,u(\rho,t)^{\bullet}$. Of course, since for a two-variable cocycle $u(\rho,t)$ we have $u(\rho,t)^{\bullet} = u(\bar\rho,t)$,
in this case we also have
\[ d_{\mathrm{geo}}(\rho) = \sqrt{d_\varphi(u_\rho)\,d_\varphi(u_{\bar\rho})}. \]
Also, since $u(\rho\bar\rho,t) = \rho(u(\bar\rho,t))\,u(\rho,t)$, we have $d_{\mathrm{geo}}(\rho) = \sqrt{d_\varphi(u_{\rho\bar\rho})}$ if $\rho$ and $\bar\rho$ extend to $M$. A priori $d_{\mathrm{geo}}(\rho)$ might depend on the KMS functional $\varphi$ but, as we shall see, in most interesting cases it will actually be independent of $\varphi$.

1.2.2. Graded KMS functionals: reduction to ordinary KMS states. The above results extend to graded KMS functionals. Indeed the analysis of these functionals can be reduced to the case of ordinary KMS states.

Let $\mathfrak{A}$ be a $\mathbb{Z}_2$-graded unital C∗-algebra, namely $\mathfrak{A}$ is a unital C∗-algebra equipped with an involutive automorphism $\gamma$.$^4$ Given a graded one-parameter automorphism group $\alpha$ of $\mathfrak{A}$ (i.e. one commuting with $\gamma$), a linear functional $\varphi \in \mathfrak{A}^*$ is said to be a graded KMS functional with respect to $\alpha$ at inverse temperature $\beta > 0$ if for any given $a, b \in \mathfrak{A}$ there exists a function $F_{a,b} \in A(S_\beta)$ such that
\[ F_{a,b}(t) = \varphi(\alpha_t(a)\,b), \tag{24} \]
\[ F_{a,b}(t + i\beta) = \varphi(\gamma(b)\,\alpha_t(a)). \tag{25} \]
Note that a graded KMS functional $\varphi$ is $\gamma$-invariant. We recall the following.

Proposition 1.10 ([61, 10]). Let $\varphi$ be a graded KMS functional of $\mathfrak{A}$ for $\alpha$ and let $\omega = |\varphi|$ be the modulus of $\varphi$. Then $\omega$ is an ordinary KMS positive functional and $\pi_\omega\cdot\gamma$ extends to an inner automorphism of $M \equiv \pi_\omega(\mathfrak{A})''$, implemented by a selfadjoint unitary $\Gamma$ in the centralizer $M_\omega$ of $\omega$. Moreover $\varphi$ is proportional to $\omega(\Gamma\,\cdot)$.

Corollary 1.11. If $\varphi$ is the unique non-zero graded KMS functional (up to a phase) of $\mathfrak{A}$, then $\omega = |\varphi|$ is extremal KMS, i.e. $M = \pi_\omega(\mathfrak{A})''$ is a factor.

Proof. As usual $\mathfrak{A}$ is identified with $\pi_\omega(\mathfrak{A})$. If $Z(M) \neq \mathbb{C}$, there exist two non-zero projections $z_1, z_2 \in Z(M)$ with sum 1. Thus $\varphi(z_1\,\cdot)$, $\varphi(z_2\,\cdot)$ are different graded KMS functionals on $\mathfrak{A}$. This can be checked since both the extensions of $\alpha$ and $\gamma$ act trivially on $Z(M)$, and by usual approximation arguments. By the uniqueness assumption there exists a constant $\lambda$ with $\varphi(z_1 a) = \lambda\varphi(z_2 a)$, $a \in \mathfrak{A}$; then by continuity the same equality holds for $a \in M$, thus $\varphi(z_1 a) = \varphi(z_1 z_1 a) = \lambda\varphi(z_2 z_1 a) = 0$. Analogously $\varphi(z_2\,\cdot) = 0$, thus $\varphi = 0$.

For our purposes Proposition 1.10 allows us to consider ordinary KMS states instead of general graded KMS functionals.

Let $\rho$ be a graded endomorphism of $\mathfrak{A}$, namely an endomorphism of $\mathfrak{A}$ commuting with $\gamma$. In this graded context, we shall say that $\rho$ is covariant if there exists a covariance $\alpha$-cocycle of unitaries $u(t) \in \mathfrak{A}$ such that $\gamma(u(t)) = u(t)$. In the following we again identify $\mathfrak{A}$ with its image $\pi_\omega(\mathfrak{A})$.

$^4$ In this paper morphisms always commute with the ∗-mapping and preserve the unit.
Lemma 1.12. Let $\varphi$ be a graded KMS functional for $\alpha$ and $\rho$ a graded covariant endomorphism of $\mathfrak{A}$ as above. If $M$ is a factor and $\rho$ extends to a finite index irreducible endomorphism of $M$, then $\rho$ has finite holomorphic dimension, both with respect to $\omega$ and $\varphi$. Indeed, if $\omega(u(\cdot))$ is continuous, $d_\varphi(u_\rho) = \pm d_\omega(u_\rho)$, namely
\[ \varphi(1)\,d_\omega(u) = \pm\,\underset{t\to i\beta}{\mathrm{anal.cont.}}\,\varphi(u(t)). \]

Proof. The holomorphic property of $u$ is a direct consequence of the holomorphic property of the Connes cocycle because $u$ is indeed a Connes Radon-Nikodym cocycle, up to phase, with respect to two bounded positive normal functionals of $M$ (see [47] and the previous section).

Note now that, since the extension of $\rho$ to $M$ (still denoted by $\rho$) commutes with $\gamma = \mathrm{Ad}\,\Gamma$, we have $\rho(\Gamma)\Gamma^* \in \rho(M)' \cap M = \mathbb{C}$, thus $\rho(\Gamma) = \pm\Gamma$ because $\Gamma$ is self-adjoint, hence $\Phi_\rho(\Gamma) = \pm\Gamma$, where $\Phi_\rho$ is the (unique) left inverse of $\rho$.

To check the formula in the statement of Lemma 1.12, recall that, by the holomorphic properties of Connes cocycles, we have (see [47]):
\[ \underset{t\to i\beta}{\mathrm{anal.cont.}}\,\omega(X u(t)) = d_\omega(u)\,\omega\cdot\Phi_\rho(X), \qquad \forall X \in M. \tag{26} \]
As above we have the polar decomposition $\varphi = \omega(\Gamma\,\cdot)$. Then
\[ \underset{t\to i\beta}{\mathrm{anal.cont.}}\,\varphi(u(t)) = \underset{t\to i\beta}{\mathrm{anal.cont.}}\,\omega(\Gamma u(t)) = d_\omega(u)\,\omega\cdot\Phi_\rho(\Gamma) = \pm d_\omega(u)\,\omega(\Gamma) = \pm\varphi(1)\,d_\omega(u), \]
namely the holomorphic dimension with respect to $\varphi$ coincides with the holomorphic dimension with respect to $\omega$, up to a sign.

Because of the above Lemma 1.12, it is more convenient to define the holomorphic dimension $d_\varphi(u)$ directly with respect to the modulus $\omega$ of $\varphi$.

Before concluding this section, we recall that the interest in (bounded) graded KMS functionals is limited by the following no-go theorem.

Proposition 1.13 ([10]). Let $\varphi$ be a graded KMS functional of $\mathfrak{A}$ with respect to $\alpha$ as above. If there exists a $\varphi$-asymptotically abelian sequence $\beta_n \in \mathrm{Aut}(\mathfrak{A})$, then $\gamma = \iota$ and $\varphi$ is an ordinary KMS functional.

Here the $\beta_n$’s commute with the grading, and $\varphi$-asymptotic abelianness means that $\varphi\cdot\beta_n = \varphi$ and $\varphi(c[\beta_n(a), b]) \to 0$ for all $a, b, c \in \mathfrak{A}$, where the commutator is a graded commutator.

1.2.3. Table of dimensions. Before concluding this section we display the following table that summarizes the various notions of dimension we are dealing with:

Dimension | Definition | Context
Intrinsic | $d(\rho) = \|R_\rho\|\,\|\bar R_\rho\|$ (standard $R_\rho$, $\bar R_\rho$) | Tensor C∗-categories
Analytical | $d_{\mathrm{an}}(\rho) = \sqrt{[M : \rho(M)]}$ | Subfactors
Statistical | $d_{\mathrm{DHR}}(\rho) = |\Phi_\rho(\varepsilon_\rho)|^{-1}$ ($\varepsilon_\rho$ stat. operator) | QFT, localized endomorphisms
Holomorphic | $d_\varphi(u) = \underset{t\to i\beta}{\mathrm{anal.cont.}}\,\varphi(u(t))$ | Unitary cocycles
Geometric | $d_{\mathrm{geo}}(\rho) = \sqrt{d_\varphi(u_\rho)\,d_\varphi(u_{\bar\rho})}$ | Covariant endomorphisms
Here $[M : \rho(M)]$ denotes the minimal index, namely the Jones index with respect to the minimal expectation. We have omitted the notion of minimal dimension $d_{\min}(\rho) \equiv \min \|R_\rho\|\,\|\bar R_\rho\|$, in the context of C∗-tensor categories, as it turns out to coincide with the intrinsic dimension [51].

2. The Chemical Potential in Quantum Field Theory

In this section we examine certain aspects of the chemical potential for thermal states in Quantum Field Theory. Our discussion will rely on basic results such as the description of the chemical potential in terms of extensions of KMS states [1] and the construction of the field net and the gauge group [18]. Together with certain results for tensor categories [51], our analysis will show a splitting of the incremental free energy into an absolute part, that depends only on the charge and not on the state, and a part which is asymmetric with respect to the charge conjugation; this indeed represents the chemical potential that labels the equilibrium states.

Let $\mathbb{M}$ be the Minkowski spacetime $\mathbb{R}^{d+1}$, with $d \ge 2$, and $\mathfrak{A}$ a net of local observable von Neumann algebras on $\mathbb{M}$, namely we have an inclusion preserving map $O \to \mathfrak{A}(O)$ from the set $\mathcal{K}$ of (open, non-empty) double cones of $\mathbb{M}$ to von Neumann algebras $\mathfrak{A}(O)$ on a Hilbert space $H$. If $E \subset \mathbb{M}$ is arbitrary, we set $\mathfrak{A}(E)$ for the C∗-algebra generated by the von Neumann algebras $\mathfrak{A}(O)$ as $O \in \mathcal{K}$ varies with $O \subset E$ ($\mathfrak{A}(E) = \mathbb{C}$ if the interior of $E$ is empty) and denote by $\mathfrak{A} \equiv \mathfrak{A}(\mathbb{M})$ the quasi-local C∗-algebra. We denote the quasi-local C∗-algebra and the net itself by the same symbol, but this should not create confusion. We shall also denote by $\mathcal{A}(E) = \mathfrak{A}(E)''$ the von Neumann algebra generated by $\mathfrak{A}(E)$ (of course $\mathcal{A}(O) = \mathfrak{A}(O)$ if $O \in \mathcal{K}$).

We assume the following properties$^5$:

Additivity. If $O, O_1, \dots, O_n$ are double cones and $O_1 \cup \dots \cup O_n \subset O$, then $\mathfrak{A}(O_1) \vee \dots \vee \mathfrak{A}(O_n) \subset \mathfrak{A}(O)$. Here and in the following, the lattice symbol $\vee$ denotes the von Neumann algebra generated.

Wedge duality (or essential duality). If $W \subset \mathbb{M}$ is a wedge region (namely a Poincaré translate of the region $\{x \in \mathbb{M} : x_1 > |x_0|\}$), then $\mathcal{A}(W') = \mathcal{A}(W)^c$. Here $N^c$ denotes the relative commutant in the von Neumann algebra $M \equiv \mathfrak{A}''$, namely $N^c \equiv N' \cap M$, and $W'$ denotes the spacelike complement of $W$. In particular the net $\mathfrak{A}$ is local, namely $\mathfrak{A}(O_1)$ and $\mathfrak{A}(O_2)$ commute if the double cones $O_1$ and $O_2$ are space-like separated.

As in the case of irreducible nets, one may consider the dual net, here defined as
\[ \mathfrak{A}^d(O) \equiv \mathcal{A}(O')^c, \qquad O \in \mathcal{K}, \]
and show the following [59]:

$^5$ These properties automatically hold, in particular, in a Wightman theory [5]. Here we will also consider the case of a reducible net.
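Proposition 2.1. $\mathfrak{A}^d$ satisfies Haag duality, by which we mean here that
\[ \mathfrak{A}^d(O) = \mathfrak{A}^d(O')^c, \qquad O \in \mathcal{K}, \]
where $\mathfrak{A}^d(O')$ is the C∗-algebra associated to $O'$ in the net $\mathfrak{A}^d$. In particular $\mathfrak{A}^d$ is local. Moreover $\mathfrak{A}$ and $\mathfrak{A}^d$ have the same weak closure: $\mathfrak{A}'' = (\mathfrak{A}^d)'' = M$.

Proof. Because of additivity and wedge duality one can write
\[ \mathfrak{A}^d(O) = \bigcap_{W \supset O} \mathcal{A}(W) \tag{27} \]
(intersection over all wedges containing $O$). If $O_1 \in \mathcal{K}$ is spacelike separated from $O \in \mathcal{K}$ there is a wedge $W$ with $W \supset O$ and $W' \supset O_1$, hence $\mathfrak{A}^d$ is local. As $\mathfrak{A}^d$ extends $\mathfrak{A}$ and is local, wedge duality must hold for $\mathfrak{A}^d$ too, therefore $\mathcal{A}^d(W) = \mathcal{A}(W)$ for all wedges $W$, where $\mathcal{A}^d(W) = \mathfrak{A}^d(W)''$. In particular the global von Neumann algebra associated with $\mathfrak{A}^d$ coincides with the one associated with $\mathfrak{A}$:
\[ \Bigl(\bigcup_{O} \mathfrak{A}^d(O)\Bigr)'' = \Bigl(\bigcup_{W} \mathcal{A}^d(W)\Bigr)'' = \Bigl(\bigcup_{W} \mathcal{A}(W)\Bigr)'' = M. \]
Analogously, it follows from Formula (27) that $\mathfrak{A}(O')'' = \mathfrak{A}^d(O')''$, namely $\mathfrak{A}^d(O) = \mathfrak{A}^d(O')^c$, that is to say Haag duality holds for $\mathfrak{A}^d$.

Note however that, since $M$ is not a type I factor in general, $\mathcal{A}(O')$ may be non-normal in $M$, namely $\mathfrak{A}^d(O)^c = \mathcal{A}(O')^{cc}$ may be strictly larger than $\mathcal{A}(O')$ if $O \in \mathcal{K}$.

Translation covariance. There exists a unitary representation $U$ of $\mathbb{R}^{d+1}$ making $\mathfrak{A}$ covariant:
\[ \tau_x(\mathfrak{A}(O)) = \mathfrak{A}(O + x), \qquad x \in \mathbb{R}^{d+1},\ O \in \mathcal{K}, \]
where we have set $\tau_x \equiv \mathrm{Ad}\,U(x)$.

Proper infiniteness and Borchers property B$^6$. If $O \in \mathcal{K}$, then $\mathfrak{A}(O)$ is a properly infinite von Neumann algebra. If $O, \tilde O$ are double cones and $O + x \subset \tilde O$ for $x$ in a neighborhood of 0 in $\mathbb{R}^{d+1}$, then every non-zero projection $E \in \mathfrak{A}(O)$ is equivalent to 1 in $\mathfrak{A}(\tilde O)$.

Factoriality. $M \equiv \mathfrak{A}''$ is a factor.

A localized endomorphism $\rho$ of $\mathfrak{A}$ is an endomorphism of $\mathfrak{A}$ such that $\rho|_{\mathfrak{A}(O')} = \mathrm{id}|_{\mathfrak{A}(O')}$ for some $O \in \mathcal{K}$. Two localized endomorphisms $\rho, \rho'$ are equivalent ($\rho \simeq \rho'$) if there is a unitary $u \in M$ such that $\rho' = \mathrm{Ad}\,u\cdot\rho$. In the DHR theory [16] $M = B(H)$, namely $\mathfrak{A}$ is irreducible, but most of what we are saying holds in the reducible case as well.

$^6$ In the vacuum representation these properties follow from positivity of the energy, see [41, 25].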
A localized endomorphism $\rho$ is translation covariant if there exists a $\tau$-cocycle of unitaries $u(\rho,x) \in M$ such that
\[ \mathrm{Ad}\,u(\rho,x)\cdot\rho_x = \rho, \qquad x \in \mathbb{R}^{d+1}, \tag{28} \]
where $\rho_x \equiv \tau_x\cdot\rho\cdot\tau_{-x}$. The equivalence classes of irreducible, translation covariant, localized endomorphisms are the superselection sectors of $\mathfrak{A}$. The translation covariant endomorphisms of $\mathfrak{A}$ form a tensor category.

If $T \in (\rho,\rho')$ is an intertwiner between $\rho, \rho'$, namely $T \in M$ and $T\rho(X) = \rho'(X)T$ for all $X \in \mathfrak{A}$, then, as an immediate consequence of the Haag duality property for $\mathfrak{A}^d$, we have $T \in \mathfrak{A}^d(O)$ if $O \in \mathcal{K}$ contains the localization regions of both $\rho$ and $\rho'$. Now the unitary intertwiners changing the localization region of $\rho$ (in particular the unitaries $u(\rho,x)$ above) are used to define Roberts cohomology [58]. In particular the endomorphism $\rho$ can be reconstructed from these charge transfers; as they are local operators in $\mathfrak{A}^d$, they also provide an extension of $\rho$ to $\mathfrak{A}^d$ with the same localization (only in the case $d \ge 2$). The superselection structures for $\mathfrak{A}$ and $\mathfrak{A}^d$ coincide (the extension map is a full functor) and, replacing $\mathfrak{A}$ by $\mathfrak{A}^d$, we may thus assume that $\mathfrak{A}$ satisfies Haag duality (in the original representation of $\mathfrak{A}$).

As shown in [16], attached to any localized endomorphism $\rho$ there is a unitary representation of the permutation group $P_\infty$, the statistics of $\rho$, that is classified by a statistics parameter $\lambda_\rho$ whose possible values are $\lambda_\rho = 0, \pm 1, \pm\tfrac{1}{2}, \pm\tfrac{1}{3}, \dots$. Thus the statistical dimension $d_{\mathrm{DHR}}(\rho) = |\lambda_\rho|^{-1}$ takes integral values $d_{\mathrm{DHR}}(\rho) = 1, 2, 3, \dots, +\infty$. By the index-statistics theorem [43, 44], $d_{\mathrm{DHR}}(\rho)$ coincides with an analytic dimension, the square root of the Jones index
\[ d_{\mathrm{DHR}}(\rho) = [\mathfrak{A} : \rho(\mathfrak{A})]^{\frac{1}{2}} \]
(one way to read $[\mathfrak{A} : \rho(\mathfrak{A})]$ is $[\mathfrak{A}(W)'' : \rho(\mathfrak{A}(W))'']$, with $W$ a wedge region and $\rho$ localized in $W$).

We shall denote by $\mathcal{L}$ the tensor category of translation covariant localized endomorphisms of $\mathfrak{A}$ with finite statistics (i.e. with finite dimension). For an object $\rho$ of $\mathcal{L}$, the intrinsic dimension coincides with the statistical dimension [43, 44]: $d(\rho) = d_{\mathrm{DHR}}(\rho)$.

Theorem 2.2. Let $\mathfrak{A}$ be as above, $\alpha_t = \tau_{x(t)}$ a one-parameter automorphism group of translations of $\mathfrak{A}$ and $\varphi$ a translation invariant state which is extremal KMS for $\alpha$. If $u(\rho,t)$ is an $\alpha$-covariance cocycle for the irreducible localized endomorphism $\rho$ (i.e. Eq. (14) holds), then $u_\rho$ is holomorphic and $d(\rho) \le d_{\mathrm{geo}}(\rho)$. If moreover $\pi_\varphi$ satisfies Haag duality, then
\[ d(\rho) = d_{\mathrm{geo}}(\rho). \tag{29} \]
In particular the right-hand side in the above formula is independent of the KMS state $\varphi$.

As is known, the cyclic vector $\xi_\varphi$ in the GNS representation of a KMS state $\varphi$ is separating for $\pi_\varphi(\mathfrak{A})''$, namely $\varphi$ is a separating state. This is crucial for the following theorem of Takesaki and Winnink.
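Theorem 2.3 ([64]). A KMS state $\varphi$ is locally normal, namely $\varphi|_{\mathfrak{A}(O)}$ is normal for any double cone $O \in \mathcal{K}$.

Note that if $\mathfrak{A}$ is a Poincaré covariant net and $\rho$ a Poincaré covariant localized endomorphism, then the covariance cocycle $u(\rho,L)$ ($\rho \in \mathcal{L}$, $L \in \mathcal{P}_+^{\uparrow}$) is uniquely fixed because $\mathcal{P}_+^{\uparrow}$ has no non-trivial finite-dimensional unitary representation, hence it is a two-variable cocycle. In particular, if $\rho$ extends to an irreducible endomorphism of the weak closure, then $d(\rho) = d_{\mathrm{geo}}(\rho)$ by Prop. 1.3. We shall return to this point in the next section.

If moreover there is a PCT symmetry $j$ for $\mathfrak{A}$, or an anti-automorphism $j$ of $\mathfrak{A}$ such that $j^{-1}\cdot\rho\cdot j = \bar\rho$ and $\varphi\cdot j = \varphi$, then $d(\rho) = d_\varphi(u_\rho)$ by Prop. 1.4.

We have essentially mentioned that the tensor category $\mathcal{L}$ has a permutation symmetry (if $\dim(\mathbb{M}) \ge 3$), as shown in [16]. Then, by [18] there exists a field net $\mathfrak{F}$ of von Neumann algebras $\mathfrak{F}(O) \supset \mathfrak{A}(O)$, with normal commutation relations, with $\mathfrak{A} = \mathfrak{F}^G$ the fixed point algebra of $\mathfrak{F}$ under the action $\gamma$ of a compact group $G$ of internal symmetries of $\mathfrak{F}$. One has $\mathfrak{A}' \cap \mathfrak{F} = \mathbb{C}$, where $\mathfrak{F}$ is the quasi-local field C∗-algebra $\mathfrak{F}(\mathbb{M})$. Every endomorphism $\rho \in \mathcal{L}$ is implemented by a Hilbert space of isometries in $\mathfrak{F}$.

We now relax the pointwise continuity condition in [1]. We denote by $\tilde\tau$ the translation automorphism group on $\mathfrak{F}$ extending $\tau$ and by $\tilde\alpha = \tilde\tau_{x(\cdot)}$ the one-parameter automorphism group extending $\alpha$.

Lemma 2.4. Let $\varphi$ be an extremal KMS state of $\mathfrak{A}$ with respect to $\alpha$. There exists a locally normal state $\psi$ of $\mathfrak{F}$ that extends $\varphi$ and is extremal KMS with respect to $\tilde\alpha_t\cdot\gamma_{g(t)}$, with $t \to g(t)$ a one-parameter subgroup of $G$.

Proof. Let $\mathfrak{F}_c \subset \mathfrak{F}$ denote the sub-C∗-algebra of all elements with pointwise norm continuous orbit under the action $\tilde\tau\cdot\gamma$ of $\mathbb{R}^{d+1} \times G$, and set $\mathfrak{F}_c(O) \equiv \mathfrak{F}_c \cap \mathfrak{F}(O)$. We have
\[ \mathfrak{F}_c(O)^{-} \supset \mathfrak{F}(O_0), \qquad \forall\, O_0 \in \mathcal{K},\ \bar O_0 \subset O, \]
where $\mathfrak{F}_c(O)^{-}$ is the $\sigma$-weak closure of $\mathfrak{F}_c(O)$. Indeed if $X \in \mathfrak{F}(O_0)$ and $j_n$ is an approximation of the identity in $\mathbb{R}^{d+1}$ by continuous functions with support in a ball of radius $\frac{1}{n}$, then $X_n \equiv \int j_n(x)\,\tilde\tau_x(X)\,dx \to X$ ($\sigma$-weak convergence) and $X_n$ has pointwise norm continuous $\tilde\tau$-orbit and, for large $n$, belongs to $\mathfrak{F}(O)$ (the $\gamma$-continuity is checked similarly).

Set $\varphi_c \equiv \varphi|_{\mathfrak{A}_c}$, where $\mathfrak{A}_c \equiv \mathfrak{F}_c \cap \mathfrak{A}$, and $\tilde\varphi = \varphi\cdot\varepsilon$, where $\varepsilon = \int_G \gamma_g\,dg$ is the expectation of $\mathfrak{F}$ onto $\mathfrak{A}$ as above. Clearly $\tilde\varphi$ is a $\tilde\tau$-invariant locally normal state of $\mathfrak{F}$ and so is its restriction $\tilde\varphi_c$ to $\mathfrak{F}_c$. Let $\psi_c \prec \tilde\varphi_c$ be an extremal $\tilde\alpha$-invariant state of $\mathfrak{F}_c$ extending $\varphi_c$. By the AHKT theorem [1] $\psi_c$ is KMS with respect to a one-parameter automorphism group $t \to \tilde\alpha_t\cdot\gamma_{g(t)}$ of $\mathfrak{F}_c$. Since $\tilde\varphi_c$ is locally normal and dominates $\psi_c$, $\psi_c$ is also locally normal, thus it extends to a locally normal state $\psi$ of $\mathfrak{F}$. By usual arguments, $\psi$ is a KMS state on $\mathfrak{F}$ with respect to $\tilde\alpha$.

Lemma 2.5 ([1, Prop. III.3.2]). Let $\mathfrak{A} \subset \mathfrak{F}$ be C∗-algebras, $\varphi$ a state of $\mathfrak{A}$ and $\tilde\varphi$ an extension to $\mathfrak{F}$. If $\tilde\varphi$ is separating, then $\pi_{\tilde\varphi}|_{\mathfrak{A}}$ is quasi-equivalent to $\pi_\varphi$.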
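Corollary 2.6. Let $\mathfrak{F}$ be a C∗-algebra and $\rho$ an inner endomorphism of $\mathfrak{F}$. If $\varphi$ is a separating state of $\mathfrak{F}$, the GNS representations $\pi_\varphi$ and $\pi_{\varphi\cdot\rho}$ are quasi-equivalent.

Proof. Let $H \subset \mathfrak{F}$ be a Hilbert space of isometries implementing $\rho$ on $\mathfrak{F}$ and let $\{v_i,\ i = 1, \dots, n\}$ be an orthonormal basis of $H$, thus the $v_i$’s are isometries in $\mathfrak{F}$ with orthogonal final projections summing up to the identity and $\rho(X) = \sum_i v_i X v_i^*$, $X \in \mathfrak{F}$. Then
\[ (\pi_{\varphi\cdot\rho}(X)\xi_{\varphi\cdot\rho},\xi_{\varphi\cdot\rho}) = \varphi(\rho(X)) = (\pi_\varphi(\rho(X))\xi_\varphi,\xi_\varphi) = \sum_i (\pi_\varphi(X)\xi_i,\xi_i) = \bigl((\pi_\varphi(X) \oplus \dots \oplus \pi_\varphi(X))\,\bar\xi,\bar\xi\bigr), \qquad X \in \mathfrak{F}, \]
where $\xi_i = \pi_\varphi(v_i^*)\xi_\varphi$ and $\bar\xi = \oplus_i\,\xi_i$. Therefore $\pi_{\varphi\cdot\rho} \prec \pi_\varphi \oplus \dots \oplus \pi_\varphi$. On the other hand $\bar\xi$ is a separating vector for $(\pi_\varphi \oplus \dots \oplus \pi_\varphi)(\mathfrak{F})''$; indeed elements of $(\pi_\varphi \oplus \dots \oplus \pi_\varphi)(\mathfrak{F})''$ have the form $Y = X \oplus \dots \oplus X$, $X \in \pi_\varphi(\mathfrak{F})''$, thus $Y\bar\xi = 0$ iff $X\pi_\varphi(v_i^*)\xi_\varphi = X\xi_i = 0$, $\forall i$, thus iff $X\pi_\varphi(v_i^*) = 0$ because $\xi_\varphi$ is separating, which is equivalent to $X = 0$ because $\sum_i v_i^* v_i = n\cdot 1 \neq 0$.

Lemma 2.7. Let $\varphi$ be an extremal KMS state of $\mathfrak{A}$. The GNS representations $\pi_\varphi$ and $\pi_{\varphi\cdot\rho}$ are quasi-equivalent.

Proof. Let $\psi$ be a KMS state of $\mathfrak{F}$ extending $\varphi$. As $\psi$ is a separating state of $\mathfrak{F}$, $\psi\cdot\rho_H$ is also separating as in the proof of Corollary 2.6, where $\rho_H$ is the inner endomorphism of $\mathfrak{F}$ implemented by $H$. As $\psi\cdot\rho_H$ extends $\varphi\cdot\rho$ we have by Lemma 2.5 and Corollary 2.6 that
\[ \pi_\varphi \simeq \pi_\psi|_{\mathfrak{A}} \simeq \pi_{\psi\cdot\rho_H}|_{\mathfrak{A}} \simeq \pi_{\varphi\cdot\rho}, \]
where the symbol “$\simeq$” denotes quasi-equivalence.

Corollary 2.8. Every localized endomorphism $\rho \in \mathcal{L}$ is normal with respect to $\varphi$.

Proof. The proof now follows by Lemma 2.7 and Lemma 1.5.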
Proof of Theorem 2.2. As $\varphi$ is extremal KMS, $M \equiv \pi_\varphi(\mathfrak{A})''$ is a factor. If $\rho$ is a localized endomorphism, by Corollary 2.8 both $\rho$ and its conjugate extend to $M$. The conjugate equation then holds on $M$, showing that the extension $\hat\rho$ of $\rho$ to $M$ has finite dimension. We have $d_{\mathrm{an}}(\hat\rho) = d(\rho)$ (a priori we only have $d_{\mathrm{an}}(\hat\rho) \le d(\rho)$). Indeed Proposition 1.6 applies. If Haag duality holds in the representation $\pi_\varphi$, then by the following Lemma 2.9 $\hat\rho$ is irreducible if $\rho$ is irreducible, therefore the last part of the statement follows by Prop. 1.9. The rest now follows from the analysis in the previous section.

Lemma 2.9. With the above notation, if $\pi_\varphi$ satisfies Haag duality, then the extension map $\rho \in \mathcal{L} \to \hat\rho \in \mathrm{End}(M)$ is a full functor.

Proof. With $\rho \in \mathcal{L}$ irreducible, we have to show that $\hat\rho$ is irreducible too. Let $T \in (\hat\rho,\hat\rho)$, namely $T \in M$ and $T\hat\rho(X) = \hat\rho(X)T$ for all $X$ in $M$. As $\rho$ is localized in a double cone $O$, $\rho$ acts identically on $\mathfrak{A}(O')$, hence $T \in \mathfrak{A}(O')' \cap M = \mathfrak{A}(O)$, thus $T \in (\rho,\rho) = \mathbb{C}$ as desired.

2.1. The absolute and the relative part of the incremental free energy. Besides the description of the chemical potential in terms of extensions of KMS states, the AHKT work provides an intrinsic description of the chemical potential within the observable
algebra [1], see also [25], that was made explicit only in the case of abelian charges (automorphisms). Let us recall this point.

Let $\mathfrak{A}$ be a unital C∗-algebra with trivial centre, $\alpha$ a one-parameter automorphism group as before and $\rho$ a covariant automorphism of $\mathfrak{A}$, thus $\rho\cdot\alpha_t\cdot\rho^{-1} = \mathrm{Ad}\,u(t)\cdot\alpha_t$ for some $\alpha$-cocycle of unitaries $u(t) \in \mathfrak{A}$. Notice that $u$ is unique up to multiplication by a one-dimensional character of $\mathbb{R}$, that one fixes once and for all. If now $\varphi$ is an extremal KMS state for $\alpha$ such that $\varphi\cdot\rho^{-1}$ and $\varphi$ are quasi-equivalent, thus $\rho$ extends to the factor $M \equiv \pi_\varphi(\mathfrak{A})''$, then
\[ u(t) = e^{-i\mu_\rho(\varphi)t}\,(D\varphi\cdot\rho^{-1} : D\varphi)_{-\beta^{-1}t} \tag{30} \]
for some $\mu_\rho(\varphi) \in \mathbb{R}$, called the chemical potential of $\varphi$ (we are assuming that $\varphi(u(\cdot))$ is continuous).

A relevant observation is that, although $\mu_\rho(\varphi)$ depends on the initial phase fixing for $u$, $\mu_\rho(\varphi'|\varphi) \equiv \mu_\rho(\varphi') - \mu_\rho(\varphi)$ is independent of that and is therefore an intrinsic quantity associated with a pair $\varphi, \varphi'$ of extremal KMS states. In other words the chemical potential is a label for the different extremal KMS states.

Now the above argument holds in more generality if $\rho$ is an endomorphism of $\mathfrak{A}$ that extends to a finite-index irreducible endomorphism of $M$, once we replace $\varphi\cdot\rho^{-1}$ with $\varphi_\rho \equiv \varphi\cdot\Phi_\rho$, where $\Phi_\rho$ is the minimal left inverse of the extension of $\rho$. In general, for a given charge $\rho$, the chemical potential is only defined with respect to the two thermal states:
\[ \mu_\rho(\varphi'|\varphi) \equiv \beta^{-1}\log d_{\varphi'}(u) - \beta^{-1}\log d_\varphi(u) = \beta^{-1}\log d_{\varphi'}((D\varphi_\rho : D\varphi)). \]
Moreover, if there is a canonical way to choose the cocycle $u$, independently of the state $\varphi$, then
\[ \mu_\rho(\varphi) \equiv \beta^{-1}\log d_\varphi(u) - \beta^{-1}\log d(\rho) \tag{31} \]
defines an absolute chemical potential in the state $\varphi$, associated with the charge $\rho$.

The above discussion relies of course on the normality of the endomorphism $\rho$ in a thermal state, a deep fact, proved in [1] when the endomorphisms $\rho$ are associated to the dual of a compact gauge group, with certain asymptotically abelian and cluster properties for the dynamics.

We now apply the above discussion to the case of quantum relativistic statistical mechanics, namely we consider thermal states for the time evolution in a quantum field theory on Minkowski spacetime. In this situation, there is a two-variable cocycle for the Poincaré covariant endomorphisms with finite dimension, which is unique because the Poincaré group has no non-trivial finite-dimensional unitary representation. Hence $\mu_\rho(\varphi)$ can be defined intrinsically by Eq. (31).

To be more explicit, let $\mathfrak{A}$ be a net of von Neumann algebras on the Minkowski space, as in the previous section. We further assume that $\mathfrak{A}$ is Poincaré covariant, namely there is a unitary representation $U$ of $\mathcal{P}_+^{\uparrow}$ on $H$ that acts covariantly on $\mathfrak{A}$ and extends the translation unitary group.
Our endomorphisms ρ are assumed to be covariant with respect to the action of P+ , namely there exists an α cocycle of unitaries u(ρ, L) ∈ A such that Adu(ρ, L) · ρ = αL · ρ · αL−1 ,
↑
L ∈ P+ ,
(32)
where αL = AdU (L). This is indeed a two-variable cocycle. Restricting this cocycle to the subgroup of time translation, we obtain a canonical choice for the unitary cocycle u(ρ, t), t ∈ R, for the one parameter group α, which is a two-variable cocycle in L × R. Theorem 2.10. Let A be the quasi-local observable C∗ -algebra and α a one-parameter (time) translation automorphism group as in the previous section. If ρ is a Poincaré covariant irreducible localized endomorphism with finite dimension, we have: (i) If ϕ is an extremal KMS state for α satisfying Haag duality, there exists a chemical potential µρ (ϕ) associated with ρ defined by the canonical splitting log dϕ (uρ ) = log d(ρ) + βµρ (ϕ) and satisfies µρ (ϕ) = −µρ¯ (ϕ). The intrinsic dimension is thus given by log d(ρ) =
1 (log dϕ (uρ ) + log dϕ (uρ¯ )) 2
independently of ϕ. (ii) If there is a time reversal symmetry as in Prop. 1.4 and ϕ · j = ϕ, then µρ (ϕ) = 0, namely log dϕ (uρ ) = log d(ρ), independently of ϕ. ↑
Proof. As noted above, $u(\rho, L)$ is a two-variable cocycle for the action of $L_0 \times P_+^\uparrow$ on $A$, where $L_0 \subset L$ is the tensor category with conjugates generated by $\rho$ (see [47]). Hence by [47] $u(\rho, L)^\bullet = u(\bar\rho, L)$. On the other hand, by the results in the previous section, we may extend $\rho$ to the weak closure of $A$ in the GNS representation of $\varphi$, and if we then compare with the Connes cocycle, we have
\[ u(\rho, t) = e^{i\mu_\rho(\varphi)t}\, d(\rho)^{-i\beta^{-1}t}\, (D\varphi_\rho : D\varphi)_{-\beta^{-1}t}. \]
But $\bigl(d(\rho)^{it}(D\varphi_\rho : D\varphi)_t\bigr)^\bullet = d(\rho)^{it}(D\varphi_{\bar\rho} : D\varphi)_t$, hence
\[ u(\bar\rho, t) = e^{-i\mu_\rho(\varphi)t}\, d(\rho)^{-i\beta^{-1}t}\, (D\varphi_{\bar\rho} : D\varphi)_{-\beta^{-1}t}, \]
namely $\mu_{\bar\rho}(\varphi) = -\mu_\rho(\varphi)$. The last point is a consequence of Proposition 1.4.
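The canonical splitting in Theorem 2.10 separates the logarithm of the holomorphic dimension into a part that is symmetric under charge conjugation, $\log d(\rho)$, and an antisymmetric part, $\beta\mu_\rho(\varphi)$. The following is only a numerical sketch of that arithmetic: the function name and the sample values of $d(\rho)$, $\mu_\rho(\varphi)$ and $\beta$ are hypothetical and do not come from the text.

```python
import math

def split_holomorphic_dimension(log_d_rho, log_d_rho_bar, beta):
    """Recover (log d(rho), mu_rho) from the two holomorphic dimensions.

    Assumes the splitting of Theorem 2.10:
        log_d_rho     = log d(rho) + beta * mu
        log_d_rho_bar = log d(rho) - beta * mu
    """
    log_d = 0.5 * (log_d_rho + log_d_rho_bar)         # symmetric part
    mu = (log_d_rho - log_d_rho_bar) / (2.0 * beta)   # antisymmetric part
    return log_d, mu

# Hypothetical sample values, chosen only for illustration.
beta, d, mu = 1.5, 2.0, 0.3
log_d_rho = math.log(d) + beta * mu
log_d_rho_bar = math.log(d) - beta * mu
log_d, mu_rec = split_holomorphic_dimension(log_d_rho, log_d_rho_bar, beta)
```

With these inputs the symmetric part returns $\log 2$ and the antisymmetric part returns the chosen chemical potential, up to rounding.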
Note in particular that, by point (ii) of the above theorem, the chemical potential vanishes if there is a time-reversal symmetry; therefore a non-trivial chemical potential sets an arrow of time, in accordance with the second principle of thermodynamics.

Now we make contact with the analysis in [47]. Since the covariance cocycle $u = u_\rho$ is canonically defined in the above context, once we choose the thermal state $\varphi$, the Hamiltonian in the state $\varphi_\rho$ is canonically defined as
\[ H_\rho \equiv -i\,\frac{d}{dt}\, u(\rho, t)e^{itH}\Big|_{t=0}, \]
where $H = -\beta^{-1}\log\Delta_\xi$ is the Hamiltonian in the state $\varphi$ and $\xi$ is the GNS vector associated with $\varphi$. The increment of the free energy between the states $\varphi$ and $\varphi_\rho$ is then defined (cf. [47]) as
\[ F(\varphi|\varphi_\rho) \equiv \varphi_\rho(H_\rho) - \beta^{-1}S(\varphi|\varphi_\rho) = -\beta^{-1}\log(e^{-\beta H_\rho}\xi, \xi), \]
where $S(\varphi|\varphi_\rho)$ is Araki's relative entropy. It is immediate from the last expression that
\[ F(\varphi|\varphi_\rho) = -\beta^{-1}\log d_\varphi(u_\rho) = -\beta^{-1}\log d(\rho) - \mu_\rho(\varphi). \]
If $\varphi'$ is another extremal KMS state as above, so that $\rho$ is normal with respect to $\varphi'$, we may now define the increment of the free energy $F(\varphi'|\varphi'_\rho) = -\beta^{-1}\log d_{\varphi'}(u_\rho)$, where $\varphi'_\rho = \varphi' \cdot \Phi_\rho$, so we have
\[ F(\varphi'|\varphi'_\rho) - F(\varphi|\varphi_\rho) = \mu_\rho(\varphi) - \mu_\rho(\varphi') = \mu(\varphi'|\varphi); \]
moreover
\[ S_c(\rho) = \log d(\rho)^2 = -\beta\bigl(F(\varphi|\varphi_\rho) + F(\varphi|\varphi_{\bar\rho})\bigr) \]
is an integer independent of $\varphi$. Here $S_c(\rho)$ is the conditional entropy of $\rho$, see [47]. According to the thermodynamical formula "$dF = dE - T\,dS$", we have obtained the following relation:
\[ F(\varphi|\varphi_\rho) = \mu_\rho(\varphi) - \tfrac{1}{2}\beta^{-1}S_c(\rho). \]
The quantity $\mu_\rho(\varphi)$ may be interpreted as part of the energy increment obtained by adding the charge $\rho$ to the identical charge, more specifically the part which is asymmetric with respect to charge conjugation. The total increment of the free energy also contains a part which is symmetric under charge conjugation and independent of the thermal equilibrium state, namely the intrinsic increment of the entropy $\tfrac{1}{2}S_c(\rho)$ multiplied by the temperature $\beta^{-1}$.

The above analysis goes through unchanged when we consider the increment of the free energy between two thermal states $\varphi_\rho$ and $\varphi_\sigma$ (cf. [47]). We shall make this explicit in the context of Sect. 5.

3. Chemical Potential. Low Dimensional Case

We now study the chemical potential structure in the low dimensional case. The higher dimensional methods [1] cannot be applied to this context, but we shall see that an analysis is possible by using Wiesbrock's characterizations of conformal nets on $S^1$ [67].
3.1. KMS states and the generation of conformal nets. Let $\mathcal{I}$ denote the set of all bounded open non-empty intervals of $\mathbb{R}$. We shall consider a net $A$ of von Neumann algebras on $\mathbb{R}$, namely an inclusion preserving map $I \in \mathcal{I} \to A(I)$ from $\mathcal{I}$ to von Neumann algebras $A(I)$, not necessarily acting on the same Hilbert space. We denote by the same symbol the quasi-local observable C∗-algebra $A = \bigl(\bigcup_{I \in \mathcal{I}} A(I)\bigr)^-$ (norm closure). We shall assume the following properties of $A$:

a) Translation covariance. There exists a one-parameter automorphism group $\tau$ of $A$ that corresponds to the translations on $\mathbb{R}$,
\[ \tau_s(A(I)) = A(I + s), \qquad I \in \mathcal{I},\ s \in \mathbb{R}. \]
b) Properly infiniteness. For each $I \in \mathcal{I}$, the von Neumann algebra $A(I)$ is properly infinite.⁷

⁷ This assumption is needed only for the local normality of the KMS states (above Th. 2.3 from [64]) and can alternatively be replaced by the factoriality of $A(I)$.

Let now $\varphi$ be a KMS state on $A$ with respect to $\tau$; for simplicity we set $\beta = 1$. Let $(H_\varphi, \pi_\varphi, \xi_\varphi)$ be the associated GNS triple and $V$ the one-parameter unitary group implementing $\tau$:
\[ V(s)\pi_\varphi(a)\xi_\varphi = \pi_\varphi(\tau_s(a))\xi_\varphi. \]
Note that by the KMS condition $s \to \varphi(a\tau_s(b))$ is a continuous map for all $a, b \in A$, hence $V$ is strongly continuous.

Recall now that $A$ is additive (resp. strongly additive) if
\[ A(I) \subset A(I_1) \vee A(I_2), \]
whenever $I, I_1, I_2 \in \mathcal{I}$ and $I \subset I_1 \cup I_2$ (resp. $I \subset \bar I_1 \cup \bar I_2$), where the bar denotes the closure. We now set $A(I) = \pi_\varphi(A(I))$, $I \in \mathcal{I}$, which is a von Neumann algebra by Th. 2.3, and
\[ A(E) \equiv \bigvee\{A(I) : I \in \mathcal{I},\ I \subset E\} \]
for any set $E \subset \mathbb{R}$ (the von Neumann algebra generated). Again we now assume $\pi_\varphi$ to be one-to-one and identify $A(I)$ with $A(I)$, namely we consider the net already in its GNS representation on $H_\varphi = H$. The following KMS version of the Reeh–Schlieder theorem is known to experts.

Proposition 3.1. $\xi$ is cyclic and separating for $A(I)$, if $I$ is a half-line. If $A$ is additive, then $\xi$ is cyclic and separating for all $A(I)$, $I \in \mathcal{I}$.

Proof. Assume first that $I$ is a half-line and let $\eta \in H$ be orthogonal to $A(I)\xi$; we have to show that $\eta = 0$. Indeed if $I_0$ is a half-line and $\bar I_0 \subset I$, then for all $a \in A(I_0)$,
\[ (\eta, V(s)a\xi) = 0 \]
for all $s \in \mathbb{R}$ such that $I_0 + s \subset I$. But, because of the KMS property, the function $s \to (\eta, V(s)a\xi)$ is the boundary value of a function analytic in the strip $0 < \operatorname{Im} z < \tfrac{1}{2}$ (as $V(\tfrac{i}{2}) = \Delta^{1/2}$ and $\mathrm{Dom}(\Delta^{1/2}) \supset A(I_0)\xi$), hence it must vanish everywhere. It follows that $\eta$ is orthogonal to $A(I_0 + s)\xi$ for all $s \in \mathbb{R}$, hence $\eta$ is orthogonal to $\overline{\bigcup_{s \in \mathbb{R}} A(I_0 + s)\xi} \supset \overline{A\xi} = H$.

Assume now that $I \in \mathcal{I}$ and $A$ is additive. Set $A_0(I) = \bigvee\{A(I_0) : I_0 \in \mathcal{I},\ \bar I_0 \subset I\}$. We shall show that $\xi$ is cyclic for $A_0(I)$, hence for $A(I)$. By the same argument as above
\[ \eta \perp A_0(I)\xi \ \Rightarrow\ \eta \perp A(I_0 + s)\xi, \qquad \forall s \in \mathbb{R},\ \bar I_0 \subset I, \]
namely the orthogonal projection $P$ onto $\overline{A_0(I + s)\xi}$ is independent of $s \in \mathbb{R}$ and thus belongs to $\bigcap_s A_0(I + s)' = \bigl(\bigvee_s A_0(I + s)\bigr)' = A'$ (by additivity). As $\xi$ is separating for $A'$ and $P\xi = \xi$ it follows that $P = 1$, namely $\overline{A_0(I)\xi} = H$.

Now $\tau$ extends to the rescaled modular group of $M \equiv A(\mathbb{R})$ with respect to $\varphi$ and
\[ \tau_s(N) = M(s, \infty) \subset N, \qquad s > 0, \]
where $N \equiv M(0, \infty)$, thus $(N \subset M, \xi)$ is a half-sided modular inclusion of von Neumann algebras and by Wiesbrock's theorem [67] there exists a $\xi$-fixing one-parameter unitary group $U$ on $H$ with positive generator such that
\[ V(s)U(t)V(-s) = U(e^s t), \tag{33} \]
\[ U(1)MU(-1) = N. \tag{34} \]
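Relation (33) is the commutation rule between dilations and translations in the "ax + b" group. It can be checked at the level of affine maps of the line. The sketch below assumes that $\mathrm{Ad}\,V(s)$ and $\mathrm{Ad}\,U(t)$ act geometrically as $x \mapsto e^s x$ and $x \mapsto x + t$; the helper names are hypothetical, and the code verifies only the group relation, not any statement about the operators themselves.

```python
import math

# An affine map x -> a*x + b is stored as the pair (a, b).
def compose(f, g):
    """Return the composition f after g, i.e. x -> f(g(x))."""
    a1, b1 = f
    a2, b2 = g
    return (a1 * a2, a1 * b2 + b1)

def dilation(s):
    """Geometric action assumed for V(s): x -> e^s * x."""
    return (math.exp(s), 0.0)

def translation(t):
    """Geometric action assumed for U(t): x -> x + t."""
    return (1.0, t)

def conjugated_translation(s, t):
    """Affine map corresponding to V(s) U(t) V(-s)."""
    return compose(dilation(s), compose(translation(t), dilation(-s)))

# Sample parameters, chosen only for illustration.
s, t = 0.7, 2.0
lhs = conjugated_translation(s, t)
rhs = translation(math.exp(s) * t)   # the map for U(e^s t)
```

Here `lhs` and `rhs` agree up to rounding, which is the affine-group form of (33).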
Setting $B(a, b) \equiv A(\log a, \log b)$, $b > a > 0$, we have a net $B$ on the intervals of $(0, \infty)$ whose closure is contained in $(0, \infty)$. We have the following, compare with [7]:

Proposition 3.2. Let $A$ be an additive net as above and $\varphi$ a KMS state. There exists a net $B$ on the intervals of $(0, \infty)$ such that $B(a, b) = \pi_\varphi(A(\log a, \log b))$ if $b > a > 0$. $B$ is dilation covariant and $V$ is the dilation one-parameter group. $B$ is also translation covariant with positive energy on half-lines, namely there is a one-parameter $\xi$-fixing unitary group $U$ with positive generator such that
\[ \mathrm{Ad}\,U(t)B(a, \infty) = B(a + t, \infty), \]
where $B(a, \infty) = \bigvee_{b > a} B(a, b)$.

Proof. Clearly $V(s)B(a, b)V(-s) = B(e^s a, e^s b)$ for positive $a, b$. Setting $\bar B(a, \infty) \equiv \mathrm{Ad}\,U(a)M$ we have
\[ \mathrm{Ad}\,U(t)\bar B(a, \infty) = \bar B(t + a, \infty), \]
therefore, by using the relation $V(s)U(t)V(-s) = U(e^s t)$, it follows that
\[ \mathrm{Ad}\,V(s)\bar B(a, \infty) = \bar B(e^s a, \infty). \]
On the other hand $\bar B(1, \infty) = \mathrm{Ad}\,U(1)M = N = B(1, \infty)$, hence
\[ B(e^s, \infty) = \mathrm{Ad}\,V(s)B(1, \infty) = \mathrm{Ad}\,V(s)U(1)M = \mathrm{Ad}\,U(e^s)M = \bar B(e^s, \infty), \]
showing the last part of the statement.
We shall call the net $B$ the thermal completion of $A$ with respect to $\varphi$. Note that the translation unitary group $V$ for $A$ becomes the dilation unitary group for $B$. Proposition 3.2 does not give the translation covariance of the net $B$ on the bounded intervals ($B(a, b)$ is not even defined if $a < 0$). Further insight into the structure of the thermal completion net may be obtained by considering a local net $A$, namely assuming the locality condition
\[ [A(I_1), A(I_2)] = \{0\} \quad \text{if } I_1 \cap I_2 = \emptyset. \]
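The remark that translations for $A$ become dilations for $B$ is just the change of variable $B(a, b) = A(\log a, \log b)$ at the level of intervals. A minimal sketch of this bookkeeping follows, with hypothetical helper names and intervals stored as pairs of endpoints; it illustrates the relabelling only and says nothing about the algebras.

```python
import math

def to_B_interval(interval):
    """Send the interval (x, y) labelling A(x, y) to the interval
    (e^x, e^y) labelling the same algebra as B(e^x, e^y)."""
    x, y = interval
    return (math.exp(x), math.exp(y))

def translate(interval, s):
    """Translate an interval of R by s (action on the labels of A)."""
    x, y = interval
    return (x + s, y + s)

def dilate(interval, s):
    """Dilate an interval of (0, inf) by the factor e^s (labels of B)."""
    a, b = interval
    return (math.exp(s) * a, math.exp(s) * b)

# Sample interval and shift, chosen only for illustration.
I, s = (-1.0, 0.5), 0.3
via_translation = to_B_interval(translate(I, s))
via_dilation = dilate(to_B_interval(I), s)
```

Both routes give the same interval of $(0, \infty)$, matching $V(s)B(a, b)V(-s) = B(e^s a, e^s b)$.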
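To construct a translation covariant net we then define the following von Neumann algebras:
\[ \tilde B(0, 1) = \bigvee_{s \le 0} \mathrm{Ad}\,V_1(s)B(0, 1), \tag{35} \]
\[ \tilde B(0, a) = \mathrm{Ad}\,V(a)\tilde B(0, 1), \qquad a \in \mathbb{R}, \tag{36} \]
\[ \tilde B(a, b) = \mathrm{Ad}\,U(a)\tilde B(0, b - a), \qquad a, b \in \mathbb{R}. \tag{37} \]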
Here we have set $V_1(s) \equiv U(1)V(s)U(-1)$, the one-parameter unitary group associated with the dilations with respect to the point $1 \in \mathbb{R}$. From now on the net $A$ will be assumed to be local.

Theorem 3.3. Let the net $A$ on the intervals of $\mathbb{R}$ be translation covariant, local, and additive and $\varphi$ a KMS state. With the above notations, $\tilde B$ defines a conformal net, indeed $\tilde B$ has a conformal extension to $S^1$. As a consequence the dual net of $\tilde B$ on $\mathbb{R}$ is strongly additive and conformal.

We shall see that
\[ \tilde B(a, \infty) = B(a, \infty), \qquad a \in \mathbb{R}, \]
hence $\tilde B$ is an extension of $B$ on the positive half-lines and is conformal. We shall call $\tilde B$ the conformal thermal completion of $A$ with respect to $\varphi$.

Proof. We first show that the triple $\{B(0, \infty)', \tilde B(0, 1), B(1, \infty), \xi\}$ is a +hsm factorization with respect to $\xi$ in the sense of [23], namely these three algebras mutually commute and $(\tilde B(0, 1) \subset B(0, \infty), \xi)$, $(B(1, \infty) \subset \tilde B(0, 1)', \xi)$ and $(B(0, \infty)' \subset B(1, \infty)', \xi)$ are +half-sided modular inclusions.

Now $B(0, \infty)'$ and $B(1, \infty)$ commute by the isotony of $A$; for the same reason $B(1, \infty)$ commutes with $\tilde B(0, 1)$, indeed $B(1, \infty)$ is $\mathrm{Ad}\,V_1$-invariant where, as above, $V_1(s) = U(1)V(s)U(-1)$. Again $B(0, \infty) \supset B(0, 1)$, hence $B(0, \infty) \supset \tilde B(0, 1)$ because $V_1(s)B(0, \infty)V_1(-s) \subset B(0, \infty)$ if $s \le 0$ by translation-dilation covariance of $B$ on positive half-lines (Prop. 3.2).

Concerning the hsm properties, the only non-trivial verification is that $(\tilde B(0, 1) \subset B(0, \infty), \xi)$ is a +hsm inclusion, namely that
\[ \mathrm{Ad}\,V(s)\tilde B(0, 1) \subset \tilde B(0, 1), \qquad s < 0. \]
We thus need to show that for any fixed $t < 0$ we have
\[ \mathrm{Ad}\,V(s)V_1(t)B(0, 1) \subset \tilde B(0, 1), \qquad s < 0. \]
Indeed if $s < 0$ and $t < 0$, there exist $s' < 0$ and $t' < 0$ such that $V(s)V_1(t) = V_1(t')V(s')$, as follows immediately from the corresponding relation in the "ax + b" group. Therefore
\[ \mathrm{Ad}\,V(s)V_1(t)B(0, 1) = \mathrm{Ad}\,V_1(t')V(s')B(0, 1) \subset \mathrm{Ad}\,V_1(t')B(0, 1) \subset \tilde B(0, 1) \]
as desired.
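The "ax + b" relation used in the last step can be made explicit. Assuming the geometric actions $V(s) : x \mapsto e^s x$ and $V_1(t) : x \mapsto e^t(x - 1) + 1$, comparing the two compositions gives $e^{t'} = 1 - e^s(1 - e^t)$ and $s' = s + t - t'$, and both are negative when $s, t < 0$. The sketch below checks this numerically; the function names are hypothetical and only the affine maps are tested.

```python
import math

def compose(f, g):
    """Composition f after g of affine maps stored as (a, b): x -> a*x + b."""
    return (f[0] * g[0], f[0] * g[1] + f[1])

def V(s):
    """Dilation about the point 0: x -> e^s * x."""
    return (math.exp(s), 0.0)

def V1(t):
    """Dilation about the point 1: x -> e^t * (x - 1) + 1."""
    return (math.exp(t), 1.0 - math.exp(t))

def swap_dilations(s, t):
    """For s, t < 0 return (t2, s2) with V(s) V1(t) = V1(t2) V(s2).

    V(s) V1(t)   : x -> e^(s+t) x + e^s (1 - e^t)
    V1(t2) V(s2) : x -> e^(s2+t2) x + (1 - e^t2)
    """
    t2 = math.log(1.0 - math.exp(s) * (1.0 - math.exp(t)))
    s2 = s + t - t2
    return t2, s2

# Sample negative parameters, chosen only for illustration.
s, t = -0.4, -1.1
t2, s2 = swap_dilations(s, t)
lhs = compose(V(s), V1(t))
rhs = compose(V1(t2), V(s2))
```

Since $0 < e^s(1 - e^t) < 1$ for $s, t < 0$, the logarithm is well defined and negative, and $e^{s'} = e^{s+t}/(1 - e^s + e^{s+t}) < 1$ gives $s' < 0$.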
By a result in [23] there exists a conformal net $\tilde B$ on $\mathbb{R}$ such that the local von Neumann algebras associated to $(-\infty, 0)$, $(0, 1)$ and $(1, \infty)$ are respectively $B(0, \infty)'$, $\tilde B(0, 1)$ and $B(1, \infty)$ and having $U$ and $V$ as translation and dilation unitary groups. By translation-dilation covariance, $\tilde B$ is then a conformal thermal completion of $A$.

We may also directly define the dual net $B_d$ of $\tilde B$ as the one associated with the half-sided modular factorization $(B(0, \infty)', B(1, \infty)' \cap B(0, \infty), B(1, \infty), \xi)$. This net is conformal, strongly additive and
\[ B_d(a, b) = B(a, \infty) \cap B(b, \infty)'. \]
This is due to the equivalence between strong additivity and Haag duality on the real line for a conformal net, see [23]. Clearly we have
\[ B(a, b) \subset \tilde B(a, b) \subset B_d(a, b), \]
and thus $B_d$ is the dual net of $\tilde B$ and
\[ B_d = \tilde B \ \Leftrightarrow\ \tilde B \text{ is strongly additive}. \]

Corollary 3.4. If $A$ is strongly additive
\[ B(1, \infty)' \cap B(0, \infty) = \bigvee_{s < 0} V_1(s)B(0, 1)V_1(-s). \]

Proof. If $A$ is strongly additive, then $B$ is strongly additive (on the intervals of $(0, \infty)$), hence $\tilde B$ is strongly additive and the above comment applies.

More directly, Corollary 3.4 states that the relative commutant $A(0, \infty)' \cap M$ is the smallest von Neumann algebra containing $A(-\infty, 0)$ which is mapped into itself by $\mathrm{Ad}\,\Delta^{it}$, $t > 0$, where $\Delta$ is the modular operator associated with $(A(0, \infty), \xi)$.

We shall say that the state $\varphi$ of $A$ satisfies essential duality if
\[ A(0, \infty)' \cap M = A(-\infty, 0). \]
We have:

Proposition 3.5. The following are equivalent:
(i) $\varphi$ satisfies essential duality,
(ii) For some (hence for all) $0 < a < b$ we have $A(b, \infty)' \cap A(a, \infty) = A(a, b)$,
(iii) $A$ is strongly additive and $\mathrm{Ad}\,\Delta^{it}A(-\infty, 0) \subset A(-\infty, 0)$ for all $t > 0$, where $\Delta$ is the modular operator of $(A(0, \infty), \xi)$.

Proof. (i) ⇔ (iii) follows by the above comments. On the other hand (i) ⇔ (ii) because they are equivalent to the relative commutant property $B(a, b) = B(b, \infty)' \cap B(a, \infty)$ for $b > a$ and either $a = 0$ or $a > 0$, which are indeed equivalent conditions in the conformal case [23].

Corollary 3.6. If $\varphi$ satisfies essential duality, then $\varphi$ satisfies Haag duality, namely
\[ A(a, b) = \bigl(A(-\infty, a) \vee A(b, \infty)\bigr)^c, \qquad a < b, \]
where $\cdot^c$ denotes the relative commutant in $M$.
Proof. If $\varphi$ satisfies essential duality then, since $A(b, \infty)' \cap A(a, \infty) \subset M$ by (ii) of the above proposition, we have
\[ A(a, b) = A(b, \infty)' \cap A(a, \infty) = A(b, \infty)^c \cap A(a, \infty) \]
for $b > a > 0$. On the other hand, by essential duality, we have $A(a, \infty) = A(-\infty, a)^c$, hence
\[ A(a, b) = A(-\infty, a)^c \cap A(b, \infty)^c = \bigl(A(-\infty, a) \vee A(b, \infty)\bigr)^c \]
as desired. The case of arbitrary $a < b$ is obtained by translation covariance.
Hence essential duality in a thermal state can occur only if the original net is strongly additive. It would be interesting to see if the converse holds true, namely if all KMS states on a strongly additive (local) net satisfy essential duality. Note also that, in contrast to the situation occurring in the vacuum representation, the equality
\[ A(a, b)' \cap M = A(-\infty, a) \vee A(b, \infty) \]
cannot hold in any thermal state, unless the superselection structure is trivial [38] (this would be equivalent to the triviality of the 2-interval inclusion for the net $B$).

3.2. Normality of superselection sectors in temperature states. In this section $A$ will denote a local net of von Neumann algebras on the intervals $\mathcal{I}$ of $\mathbb{R}$ satisfying the properties a) and b) in the previous section. With $\tau$ the translation automorphism group of $A$, we shall say that an endomorphism $\rho$ of the quasi-local C∗-algebra $A$ is localized in the interval $I \in \mathcal{I}$ if $\rho$ acts identically on $A(I')$, where $I' \equiv \mathbb{R} \setminus I$ and, for any open set $E \subset \mathbb{R}$, $A(E)$ denotes as before the C∗-algebra generated by the $\{A(I) : I \in \mathcal{I},\ I \subset E\}$. We have also set $A(E) \equiv \pi_\varphi(A(E))''$. As above, $\rho$ is translation covariant if there exists a $\tau$-cocycle of unitaries $u(s) \in A$ such that
\[ \mathrm{Ad}\,u(s) \cdot \tau_s \cdot \rho \cdot \tau_{-s} = \rho. \]
Let $\varphi$ be a KMS state of $A$ with respect to the translation group. In the following $\rho$ is a translation covariant endomorphism of $A$ localized in an interval $I \in \mathcal{I}$. By translation covariance we may assume that $I \subset (0, \infty)$. Our main result in this section is the following.

Theorem 3.7. Let $A$ be a translation covariant net on $\mathbb{R}$ and $\varphi$ a KMS state of $A$ satisfying essential duality. If $\rho$ is a translation covariant localized endomorphism of $A$ with finite dimension $d(\rho)$, then $\rho$ is normal with respect to $\varphi$, namely $\rho$ extends to a normal endomorphism of $M = \pi_\varphi(A)''$. If $\varphi$ is an extremal KMS state, i.e. $M$ is a factor, then the extension of $\rho$ to $M$ has the same dimension $d(\rho)$.

Assuming that $A$ acts on $H_\varphi$, as above, we have:

Lemma 3.8.
$\rho|_{A(a, \infty)}$ extends to a normal endomorphism of $A(a, \infty)$ for any $a \le 0$.

Proof. By translation covariance there exists a unitary $u \in M$ such that $\rho' \equiv \mathrm{Ad}\,u \cdot \rho$ is localized in an interval contained in $(-\infty, a)$, thus $\rho = \mathrm{Ad}\,u^* \cdot \rho' = \mathrm{Ad}\,u^*$ on $A(I)$ for all $I \subset (a, \infty)$, $I \in \mathcal{I}$. It follows that $\mathrm{Ad}\,u^*$ is a normal extension of $\rho$ to $A(a, \infty)$.

If the endomorphism $\rho$ of $A$ is localized in the interval $I \in \mathcal{I}$ then $\rho(A(I_1)) \subset A(I_1)$ for all intervals $I_1$ containing $I$ by Haag duality. We shall say that $\rho$ has finite dimension if the index $[A(I_1) : \rho(A(I_1))]$ is finite and independent of $I_1$ (the index is here defined for example by the Pimsner–Popa inequality [55]).
Lemma 3.9. If the endomorphism $\rho$ has finite dimension, then the corresponding endomorphism of $A(0, \infty)$ given by Lemma 3.8 has finite dimension (i.e. finite index).

Proof. Setting $M_n \equiv A(0, n)$, $n \in \mathbb{N}$, we have $\rho(M_n) \subset M_n$ for large $n$. Moreover $M_n$ and $\rho(M_n)$ converge increasingly respectively to $A(0, \infty)$ and $\rho(A(0, \infty))$. Thus Prop. 4 of [38] applies and gives
\[ [A(0, \infty) : \rho(A(0, \infty))] \le \liminf_{n \to \infty}\, [M_n : \rho(M_n)]. \]
Proof of Theorem 3.7. Let $B$ be the conformal thermal completion of $A$, thus in particular
\[ B(a, \infty) = A(\log a, \infty), \qquad a > 0, \]
and denote by $U$ and $V$ the translation and dilation with respect to $B$ as above. We set $\alpha_s = \mathrm{Ad}\,V(s)$ and $\alpha_s^{(1)} = \mathrm{Ad}\,U(1)V(s)U(-1)$. By Proposition 3.2 then $\alpha_s^{(1)}$ acts on $B$ as a dilation with respect to the point 1, namely
\[ \alpha_s^{(1)}(B(a, \infty)) = B(e^s a + 1 - e^s, \infty), \qquad a, s \in \mathbb{R}, \]
and $\alpha^{(1)}|_{B(1, \infty)}$ is the (rescaled) modular group associated with $(B(1, \infty), \xi)$. As $\rho$ gives rise to a finite index endomorphism of $B(1, \infty)$ (Lemmas 3.8 and 3.9) there exists a unitary $\alpha^{(1)}$-cocycle $u^{(1)}(s) \in B(1, \infty)$ such that
\[ \mathrm{Ad}\,u^{(1)}(s) \cdot \alpha_s^{(1)} \cdot \rho \cdot \alpha_{-s}^{(1)}(X) = \rho(X), \qquad X \in B(1, \infty),\ s \in \mathbb{R}. \tag{38} \]
Indeed we may take $u^{(1)}$ as the Connes Radon–Nikodym cocycle
\[ u^{(1)}(s) = (D\varphi \cdot \Phi : D\varphi)_s, \]
where $\Phi$ is a normal faithful left inverse of $\rho$ on $A(0, \infty)$ and $\varphi$ is considered as a state on $A(0, \infty)$ [47].

Now, if $\varepsilon \in (0, 1)$, $\rho$ acts trivially on $B(\varepsilon, 1) = \pi_\varphi(A(\log \varepsilon, 0))$ and, since $\varphi$ satisfies essential duality, the conformal thermal completion is strongly additive and
\[ B(\varepsilon, \infty) = B(\varepsilon, 1) \vee B(1, \infty). \]
As $\alpha_{-s}^{(1)}(X) \in B(\varepsilon, 1)$ if $X \in B(\varepsilon, 1)$ and $s > 0$, it follows that Eq. (38) holds true for all $X \in B(\varepsilon, \infty)$, $s > 0$. Setting then for a fixed $s > 0$,
\[ \rho(X) \equiv \mathrm{Ad}\,u^{(1)}(s) \cdot \alpha_s^{(1)} \cdot \rho \cdot \alpha_{-s}^{(1)}(X), \qquad X \in M, \]
this formula does not depend on the choice of $s > 0$ and provides an extension of $\rho$ to $M$ because of formula (38).

Now $B$ is a strongly additive local conformal net on $\mathbb{R}$ and (the extension of) $\rho$ is a localized endomorphism of $B$ with finite dimension, hence $\rho$ is Möbius covariant [21], therefore the index of $\rho(B(I)) \subset B(I)$ is independent of the interval $I \in \mathcal{I}$, provided $\rho$ is localized within $I$ [22]. This clearly implies the last part of the statement.

Our results then give here a version of Theorem 2.10.
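The step "$\alpha_{-s}^{(1)}(X) \in B(\varepsilon, 1)$ if $X \in B(\varepsilon, 1)$ and $s > 0$" rests on the geometric fact that a dilation centred at 1 with factor $e^{-s} < 1$ fixes the point 1 and moves $\varepsilon$ towards 1. A small sketch of that computation follows, using the formula $a \mapsto e^s a + 1 - e^s$ from the proof; the function name and the sample values are hypothetical.

```python
import math

def dilate_about_one(a, s):
    """Image of the point a under the dilation centred at 1:
    a -> e^s * a + 1 - e^s."""
    return math.exp(s) * a + 1.0 - math.exp(s)

# Sample values, chosen only for illustration.
eps, s = 0.2, 0.8
new_left = dilate_about_one(eps, -s)    # image of the left endpoint
new_right = dilate_about_one(1.0, -s)   # the centre 1 is a fixed point
```

The image of $(\varepsilon, 1)$ is therefore $(\varepsilon', 1)$ with $\varepsilon < \varepsilon' < 1$, an interval contained in $(\varepsilon, 1)$.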
Corollary 3.10. Let $A$ be a local net on $\mathbb{R}$ as above, $\tau$ the translation automorphism group and $\rho$ a translation covariant localized endomorphism with finite dimension. Then $\rho$ has finite holomorphic dimension in each extremal KMS state $\varphi$ fulfilling essential duality. The chemical potential associated with $\rho$ is defined by the canonical splitting
\[ \beta^{-1}\log d_\varphi(u_\rho) = \beta^{-1}\log d(\rho) + \mu_\rho(\varphi), \]
and satisfies $\mu_{\bar\rho}(\varphi) = -\mu_\rho(\varphi)$. In particular $d_{\mathrm{geo}}(\rho) = d(\rho)$ for all irreducible $\rho$, independently of $\varphi$.

3.3. Extension of KMS states to the quantum double. The purpose of this subsection is to provide a description of the chemical potential, in a low dimensional theory, in terms of extensions of KMS states, in analogy to what is described in [1] in the higher dimensional case. As in our case charges are no longer associated to the dual of a compact gauge group, we shall replace the field algebras by a quantum double construction [50], which we perform in the C∗-case in Appendix A.2 for our purposes: the reader is referred to this appendix for the necessary notations and background.

Let then $A$ be a unital C∗-algebra with trivial centre and $T \subset \mathrm{End}(A)$ a tensor category of endomorphisms with conjugates and sub-objects. Let $\alpha$ be a one-parameter group of automorphisms of $A$ and $u(\rho, t)$ a unitary covariance cocycle (Eq. (14)) which is a two-variable cocycle. We consider $T^{\mathrm{op}} \subset A$ by setting $\rho^{\mathrm{op}} = \bar\rho$ and $(\rho^{\mathrm{op}}, \sigma^{\mathrm{op}}) = (\rho, \sigma)^\bullet$. Let $I$ be an index set so that $\rho_i$, $i \in I$, is a family of inequivalent irreducible objects of $T$, one for each equivalence class. Then
\[ \tilde u(\tilde\rho_i, t) \equiv u(\rho_i, t) \otimes u(\rho_i, t)^\bullet, \]
where $\tilde\rho = \rho \otimes \rho$, extends to a two-variable cocycle for $\alpha_t \otimes \alpha_t$ and $T \times T^{\mathrm{op}}$. Indeed $\tilde u(\tilde\rho_i, t)$ is independent of its choice (phase fixing) due to the anti-linearity of the Frobenius map $T \to T^\bullet$. We may extend $\alpha_t \otimes \alpha_t$ to a one-parameter automorphism group $\tilde\alpha$ of $B$ by setting
\[ \tilde\alpha_t(a) = \alpha_t \otimes \alpha_t(a), \qquad a \in \tilde A, \tag{39} \]
\[ \tilde\alpha_t(R_i) = \tilde u(\tilde\rho_i, t)^* R_i, \tag{40} \]
(cf. [29]). If $\varphi$ is a KMS state for $\alpha_t$ on $A$, then $\varphi \otimes \varphi$ is a KMS state for $\alpha_t \otimes \alpha_t$ on $\tilde A \equiv A \otimes A$.

Proposition 3.11. Let $\varphi$ be an extremal KMS state for $\alpha_t$. Then $\varphi \otimes \varphi \cdot \varepsilon$ extends to a KMS state of $\pi_{\tilde\varphi}(B)''$ with respect to $\tilde\alpha_t \cdot \theta_t$ for some one-parameter automorphism group of $\pi_{\tilde\varphi}(B)''$ leaving $\tilde A$ pointwise fixed, if and only if each $\rho \in T$ is normal with respect to $\varphi$.

Proof. If $\tilde\varphi \equiv \varphi \otimes \varphi \cdot \varepsilon$ is a KMS state, then $\tilde\varphi$ is a separating state. If $R \in B$ is a non-zero multiple of an isometry, then also $\psi = \tilde\varphi(R \cdot R^*)$ is a separating positive functional, which is quasi-equivalent to $\tilde\varphi$ because $R^*\xi_{\tilde\varphi}$ is a separating vector for $\pi_{\tilde\varphi}(B)''$.

Take $R = R_i$, the element of $B$ that implements $\tilde\rho_i$. If $x \in \tilde A$ we have
\[ \psi(x) = \tilde\varphi(R_i x R_i^*) = \tilde\varphi(\tilde\rho_i(x)E_i) = \varphi \otimes \varphi\bigl(\varepsilon(\tilde\rho_i(x)E_i)\bigr) = \varphi \otimes \varphi \cdot \tilde\rho_i(x), \]
where $E_i = R_i R_i^*$, thus $\varepsilon(E_i) = 1$. Thus $\psi = \varphi \otimes \varphi \cdot \tilde\rho_i$, hence
\[ \pi_{\varphi \otimes \varphi \cdot \tilde\rho_i} \preceq \pi_\psi|_{\tilde A} \preceq \pi_{\tilde\varphi}|_{\tilde A} \preceq \pi_{\varphi \otimes \varphi}, \]
and this implies that $\rho_i$ is normal with respect to $\varphi$.

Conversely, let us assume that each $\rho_i$ is normal with respect to $\varphi$ and still denote by $\tilde\rho_i$ the extension of $\tilde\rho_i$ to the von Neumann algebra $\tilde M \equiv \pi_{\varphi \otimes \varphi}(\tilde A)''$. By rescaling the parameter we may set the inverse temperature $\beta$ equal to $-1$. We may then define the *-algebra $B_0$ as in the appendix with $A$ replaced by its weak closure $\tilde M$ (but the $C_{ij}^k$ are still defined with respect to $\tilde A$).

Let $J_{\tilde\rho_i} = C_{i\bar i}^{0*}\, \bar{\tilde\rho}_i(\cdot)\, C_{i\bar i}^0$, thus $J_{\tilde\rho_i}$ is a (not necessarily standard) non-normalized left inverse of $\tilde\rho_i$. With
\[ V(\tilde\rho_i, t) = (D\varphi \otimes \varphi \cdot J_{\tilde\rho_i} : D\varphi \otimes \varphi)_t, \]
define a one-parameter automorphism group $\gamma$ of $B_0$ extending $\alpha \otimes \alpha$ as was done for $\tilde\alpha$, but using $V$ instead of $U$, thus, with obvious notations,
\[ \gamma_t(R_i) = V(\tilde\rho_i, t)^* R_i. \]
By using the holomorphic properties of the Connes cocycles and the corresponding two-variable cocycle property (that can be checked similarly as in [47]), it can be seen by elementary calculations that $\tilde\varphi$ is KMS with respect to $\gamma$.

Clearly both $\gamma$ and $\tilde\alpha$ extend to $N = \pi_{\tilde\varphi}(B_0)''$, indeed the extension of $\gamma$ is the modular group with respect to $\tilde\varphi$, thus $\gamma$ and $\tilde\alpha$ commute and
\[ \theta_t = \tilde\alpha_t \cdot \gamma_{-t} \]
is a one-parameter group of $N$ leaving $\tilde M$ pointwise fixed.

Let $G$ be the "gauge group", namely the group of all automorphisms of $B$ that leave $\tilde A$ pointwise invariant, and $Z(G)$ the centre of $G$. Any extension $\hat\alpha_t$ of $\alpha_t \otimes \alpha_t$ to a one-parameter automorphism group of $B$ is clearly given by
\[ \tilde\alpha_t = \hat\alpha_t \cdot \theta_t, \]
where $\theta_t \in G$. Reasoning as in the above proof one has the following.

Corollary 3.12. Let $\varphi$ and $\psi$ be extremal KMS states for $\alpha$. Assume that all $\rho \in T$ are normal with respect to both $\varphi$ and $\psi$ and that if $\rho$ is irreducible the normal extensions of $\rho$ with respect to $\varphi$ and $\psi$ are still irreducible. Then $\varphi \otimes \psi \cdot \varepsilon$ is a KMS state of $B$ with respect to $\tilde\alpha \cdot \theta$, with $\theta$ a one-parameter subgroup of $Z(G)$.
In particular this is the case if $\varphi$ and $\psi$ are extremal KMS states with respect to translations of a net of local von Neumann algebras as in Sect. 3, satisfying essential duality. The one-parameter group $\theta$ is related to the chemical potential associated with the charge $\rho_i$ by
\[ \theta_t(R_i) = e^{-i\mu_{\rho_i}(\varphi|\psi)t} R_i. \]
4. Roberts and Connes–Takesaki Cohomologies

Roberts [58] has obtained a geometrical description of the theory of superselection sectors by considering a non-abelian local cohomology naturally associated with a net $A$. Rather than stating his formal definitions, which can be obtained from different geometrical pictures, we recall here the main idea underlying the construction of the first cohomology semiring $H^1_R(A)$. Here $A$ is a net of von Neumann algebras on the Minkowski spacetime $M$, fulfilling the properties stated in Sect. 2, but the construction may be extended to more general globally hyperbolic spacetimes, see [24].
Given a representation $\pi$ of $A$ localizable in all double cones we may, as usual, fix a double cone $O \in K$ and choose an endomorphism $\rho_O$ of $A$ localized in $O$ and equivalent to $\pi$. For each $O_1 \in K$ we choose a unitary intertwiner $u_{O,O_1}$ between $\rho_O$ and $\rho_{O_1}$ and set
\[ z_{O_2,O_1} \equiv u^*_{O_2,O}\, u_{O,O_1}, \qquad O_1, O_2 \in K. \]
By duality, $z_{O_2,O_1}$ belongs to $A_d(O_1 \cup O_2)$, in particular $z_{O_2,O_1} \in A(\tilde O)$ if $\tilde O$ is a double cone containing $O_1 \cup O_2$, and satisfies the cocycle condition
\[ z_{O_3,O_1} = z_{O_3,O_2}\, z_{O_2,O_1}. \]
The choice of $\rho_O$ and $u_{O,O_1}$ is not unique, yet different choices give cohomologous cocycles under a natural equivalence relation. One can then formalize the definition of $H^1_R(A)$. The relevant point is that the map
\[ \text{Superselection sectors} \longrightarrow H^1_R(A) \]
is invertible, namely each localized cocycle $z$ arises from a localized endomorphism $\rho$ localized in a given $O \in K$, which is well-defined by the formula
\[ \rho(X) = z_{O_1,O}\, X\, z^*_{O_1,O}, \qquad \forall X \in A(\tilde O), \]
where $\tilde O$ is a double cone containing $O$ and $O_1$ is any double cone contained in $\tilde O'$.

As $H^1_R(A)$ is in one-to-one correspondence with the superselection sectors, it is endowed with a ring structure, that can be expressed more directly. Analogously $Z^1_R(A)$ is a tensor C∗-category.

Let us now recall the cohomology considered by Connes and Takesaki [14]. This concerns an automorphism group action $\tau$ on a C∗-algebra $A$, in our case the translation automorphism group on the quasi-local C∗-algebra $A$. Restricting to unitary cocycles, a map $z : \mathbb{R}^4 \to A$, taking values in the unitaries of $A$, is a cocycle if it satisfies the cocycle equation
\[ z(x + y) = z(x)\tau_x(z(y)), \qquad x, y \in \mathbb{R}^4; \tag{41} \]
two cocycles $z$ and $z'$ are cohomologous if there exists a unitary $u \in A$ such that
\[ z'(x) = u z(x)\tau_x(u^*), \qquad x \in \mathbb{R}^4. \tag{42} \]
A cocycle $z$ gives rise to a perturbed automorphism group $\tau^z$ defined by
\[ \tau^z_x \equiv z(x)\tau_x(\cdot)z(x)^*. \]
We are interested in the case where $\tau^z$ is a local perturbation, in the sense that, if $x$ varies in a bounded set, there exists a double cone $O \in K$ such that $\tau^z_x|_{A(O')} = \tau_x|_{A(O')}$. This amounts to defining a (unitary) localized cocycle as a unitary map $z : \mathbb{R}^4 \to A$ satisfying the cocycle condition (41) and the locality condition:
\[ \exists\, \delta > 0 \text{ and } O \in K \text{ such that } z(x) \in A(O), \quad \forall x \in \mathbb{R}^4,\ |x| < \delta. \]
Here $|x|$ is the Euclidean modulus of $x$. By iterating the cocycle equation (41) it follows immediately from the locality condition that all the $z(x)$'s live in a common double cone as $x$ varies in a bounded set. Indeed the following holds.
Lemma 4.1. Let $z$ be a localized cocycle. There exists $r > 0$ such that
\[ z(x) \in A(O_{r+|x|}), \qquad \forall x \in \mathbb{R}^4, \]
where $O_r$ denotes the double cone with basis the ball of radius $r > 0$, centered at 0, in $\mathbb{R}^3$.

Proof. Let $\delta, r > 0$ be such that $z(x) \in A(O_r)$ if $|x| < \delta$. If $(n - 1)r \le |x| < nr$, write $x = x_1 + x_2 + \cdots + x_n$ with $|x_i| < r$. By the cocycle equation
\[ z(x) = z(x_1)\tau_{x_1}(z(x_2))\tau_{x_1+x_2}(z(x_3)) \cdots \tau_{x_1+x_2+\cdots+x_{n-1}}(z(x_n)), \]
we see that $z(x) \in \bigvee_{i=1}^{n-1} A(x_1 + \cdots + x_i + O_r) \subset A(O_{(n-1)r}) \subset A(O_{r+|x|})$.

We denote by $Z^1_\tau(A)$ the set of unitary localized cocycles, and by $H^1_\tau(A)$ the quotient of $Z^1_\tau(A)$ modulo the equivalence relation (42), with the further restriction that the unitary $u$ is local, namely $u$ belongs to $A(O)$ for some $O \in K$.

Now a covariant localized endomorphism gives rise to a localized cocycle (Formula (28)), hence a covariant sector to an element of $H^1_\tau(A)$. As for the Roberts cohomology, the converse is true.

Proposition 4.2. There is a natural map
\[ H^1_\tau(A) \longrightarrow \text{Covariant superselection sectors}. \]

Proof. We shall associate a covariant localized endomorphism $\rho$ to a given $z \in Z^1_\tau(A)$. Similarly as in Eq. (4), we set
\[ \rho(X) \equiv z(x)Xz(x)^*, \qquad X \in A(O_R),\ |x| > 2R, \]
where $x$ is a vector in the time-zero hyperplane and $R > r$, with $r$ the radius of the double cone in Lemma 4.1. We have to show that, for a fixed $R$, the above definition is independent of the choice of $x$, namely $z(x)Xz(x)^* = z(x')Xz(x')^*$ if also $|x'| > 2R$. As $O_{2R}'$ is connected, by iterating the procedure, it is enough to check this if $x' = x + y$ with $|y| < \delta$. Then, by local commutativity,
\[ z(x')Xz(x')^* = z(x)\tau_x(z(y))X\tau_x(z(y))^*z(x)^* = z(x)Xz(x)^* \]
because $\tau_x(z(y))$ belongs to $A(O_r + x)$ and $O_r + x \subset O_R'$.
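The cocycle equation (41) and the cohomology relation (42) can be illustrated in a finite-dimensional toy model. The sketch below is only an analogy under loud assumptions: the algebra is the 2×2 complex matrices, the parameter $x$ is a real number rather than a vector of $\mathbb{R}^4$, $\tau_x = \mathrm{Ad}\,e^{ixH}$ for a hypothetical diagonal $H$, and $z(x) = u\tau_x(u^*)$ is the cocycle cohomologous to the trivial one via a hypothetical unitary $u$. No locality is modelled.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

H = (0.0, 1.0)  # eigenvalues of a hypothetical diagonal generator

def tau(x, a):
    """tau_x(a) = e^{ixH} a e^{-ixH} for the diagonal generator H."""
    d = [cmath.exp(1j * x * h) for h in H]
    return [[d[i] * a[i][j] * d[j].conjugate() for j in range(2)]
            for i in range(2)]

theta = 0.6  # hypothetical rotation angle defining the unitary u
u = [[complex(math.cos(theta)), complex(-math.sin(theta))],
     [complex(math.sin(theta)), complex(math.cos(theta))]]

def z(x):
    """Cocycle z(x) = u tau_x(u*), i.e. (42) applied to the trivial cocycle."""
    return matmul(u, tau(x, dagger(u)))

def max_diff(a, b):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Check the cocycle equation z(x + y) = z(x) tau_x(z(y)) at sample points.
x, y = 0.4, 1.3
defect = max_diff(z(x + y), matmul(z(x), tau(x, z(y))))
```

The quantity `defect` vanishes up to rounding, because $z(x)\tau_x(z(y)) = u\tau_x(u^*)\tau_x(u)\tau_{x+y}(u^*) = u\tau_{x+y}(u^*)$.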
The map given by Prop. 4.2 is not invertible: two cocycles that differ by multiplication by a one dimensional character give rise to the same localized endomorphism; in the irreducible case this is the only ambiguity, which could be eliminated by considering inner automorphisms rather than unitary operators. Apart from this point, $H^1_\tau(A)$ describes faithfully the covariant superselection sectors. Note also that, if a certain strong additivity assumption holds, all sectors with finite dimension are covariant [21].

Although the Roberts cohomology describes the superselection sectors, the definition of the statistical dimension is not manifest within $H^1_R(A)$. But, passing to $H^1_\tau(A)$, we discover a pairing between factor KMS states $\varphi$ for the time evolution satisfying duality and $H^1_\tau(A)$,
\[ \log\langle\varphi, [z]\rangle \equiv \log d_\varphi(z) = \log d(\rho) + \beta\mu_\rho(\varphi), \]
hence we have the geometrical description of the holomorphic dimension by the diagram
\[ \begin{array}{ccc} \text{Covariant sectors} & \longrightarrow & H^1_R(A) \\ H^1_\tau(A) & \longrightarrow & \mathbb{R} \cup \{\infty\} \end{array} \]
Of course, by considering the involution $\bullet$ as before, we obtain an expression for $d(\rho)$ as well.
5. Increment of Black Hole Entropy as an Index

This section is devoted to the illustration of a physical context where the dimension is expressed by a quantity that also includes classical geometric data of the underlying spacetime. The results below have been announced in [49].

We shall consider the increment of the entropy of a quantum black hole which is represented by a globally hyperbolic spacetime with bifurcate Killing horizon. The case of a Rindler black hole has been studied in [47]; the reader can find the basic ideas and motivations in this reference.

The main points here are the following. Firstly we perform our analysis in the case of more realistic black hole spacetimes, in particular we consider Schwarzschild black holes. Secondly we shall consider charges localizable on the horizon, obtaining in this way quantum numbers for the increment of the entropy of the black hole itself, rather than of the outside region as done in Rindler spacetime. Indeed the restriction of quantum fields to the horizon will define a conformal quantum field theory on $S^1$. This key point has been discussed in [48, 24] with small variations. We shall return to this with further comments. Finally, we shall deal with general KMS states, besides the Hartle–Hawking temperature state. In this context, a non-zero chemical potential can appear.

The extension of the DHR analysis of the superselection structure [16] to a quantum field theory on a curved globally hyperbolic spacetime has been given in [24] and we refer to this paper for the necessary background material. We however recall here the construction of conformal symmetries for the observable algebras on the horizon of the black hole.

To be more explicit let $V$ be a $d + 1$ dimensional globally hyperbolic spacetime with a bifurcate Killing horizon.
A typical example is given by the Schwarzschild–Kruskal manifold that, by the Birkhoff theorem, is the only spherically symmetric solution of the Einstein–Hilbert equation; one might first focus on this specific example, as the more general case is treated similarly. We denote by $h_+$ and $h_-$ the two codimension 1 submanifolds that constitute the horizon $h = h_+ \cup h_-$. We assume that the horizon splits $V$ in four connected components, the future, the past and the "left and right wedges" that we denote by $R$ and $L$ (the reader may visualize this also in the analogous case of Minkowski spacetime, where the Killing flow is a one-parameter group of pure Lorentz transformations).

Let $\kappa = \kappa(V)$ be the surface gravity, namely, denoting by $\chi$ the Killing vector field, the equation on $h$
\[ \nabla g(\chi, \chi) = -2\kappa\chi, \tag{43} \]
with $g$ the metric tensor, defines a function $\kappa$ on $h$, that is actually constant on $h$, as can be checked by taking the Lie derivative of both sides of Eq. (43) with respect to $\chi$ [65]. If $V$ is the Schwarzschild–Kruskal manifold, then
\[ \kappa(V) = \frac{1}{4M}, \]
where $M$ is the mass of the black hole. In this case $R$ is the exterior of the Schwarzschild black hole. Our spacetime is $R$ and we regard $V$ as a completion of $R$.

Let $A(O)$ be the von Neumann algebra on a Hilbert space $H$ of the observables localized in the bounded diamond $O \subset R$. We make the assumptions of Haag duality, properly infiniteness of $A(O)$, and Borchers property B. The Killing flow $N_t$ of $V$ gives rise to a one-parameter group of automorphisms $\alpha$ of the C∗-algebra $A = A(R)$ since $R \subset V$ is an $N$-invariant region. Here, as before, the C∗-algebra associated with an unbounded region is the C∗-algebra generated by the von Neumann algebras associated with bounded diamonds contained in the region.
5.1. The conformal structure on the black hole horizon. We now consider a locally normal α-invariant state ϕ on A(R), that restricts to a KMS state at inverse temperature β > 0 on the horizon algebra, as we will explain. This will be the case in particular if ϕ is a KMS state on all A(R). For convenience, we shall assume that the net A is already in the GNS representation of ϕ, hence ϕ is represented by a cyclic vector ξ. Denote by Ra the wedge R “shifted by” a ∈ R along, say, h+ (see [24]). If I = (a, b) is a bounded interval of R+, we set
\[ C(I) = A(R_a)'' \cap A(R_b)', \qquad 0 < a < b. \]
We denote by $\mathfrak{C}(I)$ (I ⊂ (0, ∞)) the C∗-algebra generated by all C(a, b), b > a > 0, with (a, b) ⊂ I. We shall also set C(I) = $\mathfrak{C}(I)''$. We obtain in this way a net of von Neumann algebras on the intervals of (0, ∞), where the Killing automorphism group α acts covariantly by rescaled dilations. (With the due assumptions on duality, C(I) turns out to be the intersection of the von Neumann algebras associated with all diamonds containing the “interval I” of the horizon h+ [24].) We can now state our assumption: ϕ|C(0,∞) is a KMS state with respect to α at inverse temperature β > 0. The net (a, b) → C($e^a$, $e^b$) is obviously a net on R where α is the translation automorphism group. We are therefore in the setting treated in Sect. 3, whose results may now be applied. In particular, by using the Wiesbrock theorem [67], we now show that the restriction of the net to the black hole horizon h+ has many more symmetries than the original net, cf. [62].

Theorem 5.1 ([24]). The Hilbert space H0 = $\overline{C(I)\xi}$ is independent of the bounded open interval I. The net C extends to a conformal net R of von Neumann algebras acting on H0, where the Killing flow corresponds to the rescaled dilations.
Proof. Setting A(a, b) ≡ C(ea , eb ), b > a, we trivially obtain a local net on R and ϕ is a KMS state with respect to translations. The conformal extension of C is then the conformal thermal completion of A given by Th. 3.3; the additivity assumption is here unnecessary since H0 = C(I )ξ is independent of I . A detailed proof can be found in Prop. 4A.2 of [24]. This theorem says in particular that we may compactify R to the circle S 1 , extend the definition of C(I ) for all proper intervals I ⊂ S 1 , find a unitary positive energy representation of the Möbius group PSL(2, R) acting covariantly on C, so that the rescaled dilation subgroup is the Killing automorphism group. If ξ is cyclic for C on H, namely H0 = H (as is true in the Rindler case for a free field, see [24]), then the net C automatically satisfies Haag duality on R. Otherwise one would pass to the dual net Cd of C, which is automatically conformal and strongly additive [23]. The following proposition gives a condition for C itself to be strongly additive. Proposition 5.2. If ϕ is KMS on A(R) and the strong additivity for A holds in the sense that C(0, 1) ∨ A(R1 ) = A(R) , then C satisfies Haag duality on R. Proof. Note first that there is a conditional expectation ε : A(Ra ) → C(a, ∞), a ∈ R, given by ε(X)ξ = EXξ , X ∈ A(Ra ) , where E is the orthogonal projection onto H0 . The case a = 0 is clearly a consequence of Takesaki’s theorem since C(0, ∞) is globally invariant under the modular group of A(R) . Now the translations on A (constructed as in Prop. 3.2) restrict to the translations on C and commute with ε due to the cyclicity of ξ for C(a, ∞) on H0 . Hence ε maps A(Ra ) onto C(a, ∞), a ∈ R. We then have C(0, 1) ∨ C(1, ∞) = C(0, 1) ∨ ε(A(R1 ) ) = ε(C(0, 1) ∨ A(R1 ) ) = ε(A(R) ) = C(0, ∞),
(44)
hence C is strongly additive, thus it satisfies Haag duality on R because C is conformal. In the following we assume that A is strongly additive. 5.2. Charges localizable on the horizon. We now consider an irreducible endomorphism ρ with finite dimension of A(R) that is localizable in an interval (a, b) of h+, a > 0, namely ρ acts trivially on A(Rb) and on C(0, a), thus it restricts to a localized endomorphism of C. This last requirement is necessary to extend ρ to a normal endomorphism of πϕ(A(R))′′, with ϕ as above or a different extremal KMS state.

Remark. If we assume that the net A is defined on all V and that ρ is a transportable localized endomorphism of A(V), then ρ has a normal extension to the von Neumann algebra of A(R) which exists by transportability, cf. [47, 24], and the strong additivity assumption is unnecessary. This case may be treated just as the case of the Rindler spacetime [47], the only difference being the possible appearance of a non-trivial chemical potential. We omit the detailed discussion of this context.
By a result in [21], a transportable localized endomorphism with finite dimension ρ of C is Möbius covariant. In particular we may choose the covariance cocycle so that it verifies the two-variable cocycle property with respect to the action of the Möbius group, and this uniquely fixes it. Let σ be another irreducible endomorphism of A localized in (a, b) ⊂ h+ and denote by ϕρ and ϕσ the thermal states for the Killing automorphism group in the representations ρ and σ. As shown in [47], ϕρ = ϕ · Φρ, where Φρ is the left inverse of ρ, and similarly for σ. We now show that the dimension of ρ and of its restriction to C coincide. In fact the following is true.

Lemma 5.3. With the above assumptions, if ρ is localized in an interval of h+, then ρ|C(0,∞) has a normal extension to the weak closure C(0, ∞)′′ with dimension d(ρ).

Proof. If ρ is localized in the interval (a, b) (b > a > 0) of h+, then clearly ρ restricts to the von Neumann algebra C(c, d) for all d > b > a > c as it acts trivially on A(Rc) and by duality for C. Hence ρ restricts to the C∗-algebra C(0, ∞), and then it extends to C(0, ∞)′′ by Theorem 3.7. Let ρ̄ also be localized in the interval (a, b). If R, R̄ are standard solutions for the conjugate equation of ρ and ρ̄, then R, R̄ ∈ A(Rb)′ ∩ A(R)′′ = C(0, b) and commute with C(0, a), hence they belong to C(a, b) by strong additivity [23]. Conversely, if R, R̄ are a standard solution for the conjugate equation of the restriction of ρ and ρ̄, then R and R̄ belong to C(a, b) by the Haag duality for C, hence the conjugate equation is valid for all elements of C(0, b) ∨ A(Rb) = A(R). This implies the dimension is the same for ρ and its restriction to C.
The increment of the free energy between the thermal equilibrium states ϕρ and ϕσ is expressed as in [47] by
\[ F(\varphi_\rho|\varphi_\sigma) = \varphi_\rho(H_{\rho\bar\sigma}) - \beta^{-1} S(\varphi_\rho|\varphi_\sigma). \]
Here S is the Araki relative entropy and $H_{\rho\bar\sigma}$ is the Hamiltonian on H0 corresponding to the composition of the charge ρ and the charge conjugate to σ in C(−∞, 0) as in [47]; it is well defined as $e^{itH_{\rho\bar\sigma}} e^{-itH_\iota}$ is the two-variable cocycle. In particular, if σ is the identity representation, then $H_{\rho\bar\sigma} = H_\rho$ is the Killing Hamiltonian in the representation ρ. The analysis made in Sect. 2.1 works also in this context, and in fact the formulae there can be written in the case of two different KMS states ϕρ and ϕσ. In particular
\[ F(\varphi_\sigma|\varphi_\rho) = \tfrac{1}{2}\,\beta^{-1}\bigl(S_c(\sigma) - S_c(\rho)\bigr) + \mu(\varphi_\sigma|\varphi_\rho), \tag{45} \]
where $\mu(\varphi_\sigma|\varphi_\rho) = \beta^{-1}\log d_\varphi(u_\sigma) - \beta^{-1}\log d_\varphi(u_\rho)$. Here $S_c(\rho) = \log d(\rho)^2$ is the conditional entropy associated with ρ (see [55]) and the above formula gives a canonical splitting for the incremental free energy. 5.3. An index formula. We now consider the Hartle–Hawking state ϕ, see [67]. In several cases this is the unique KMS state for the Killing evolution. The corresponding Hawking temperature is related to the surface gravity of R:
\[ \beta^{-1} = \frac{\kappa(R)}{2\pi}. \]
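As a quick consistency check, this relation can be combined with the Schwarzschild value of the surface gravity quoted earlier in this section, taking κ(R) to be that value κ(V) = 1/(4M). The lines below are only a sketch in the units implicit in the text: no physical constants are displayed there, so natural units are assumed here.

```latex
% Assumption: natural units (no constants appear in the formulas of the text),
% and \kappa(R) identified with the Schwarzschild value \kappa(V) = 1/(4M).
\[
  \beta^{-1} = \frac{\kappa(R)}{2\pi}
             = \frac{1}{2\pi}\cdot\frac{1}{4M}
             = \frac{1}{8\pi M},
  \qquad
  \beta = 8\pi M,
  \qquad
  \frac{\pi}{\kappa(R)} = \frac{\beta}{2} = 4\pi M .
\]
```

The last quantity is the prefactor β/2 = π/κ(R) that enters the index formula of Theorem 5.4.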
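Theorem 5.4. With ϕ the Hartle–Hawking state, if ρ and σ are localizable as above, then
\[ \log d(\rho) - \log d(\sigma) = \frac{\pi}{\kappa(R)}\bigl(F(\varphi_\rho|\varphi_\sigma) + F(\varphi_{\bar\rho}|\varphi_{\bar\sigma})\bigr); \]
in particular, once we fix ρ, the exponential of the right hand side of the above equation is proportional to an integer.

Proof. Formula (45) states that
\[ \log d(\rho) - \log d(\sigma) = \beta F(\varphi_\rho|\varphi_\sigma) - \beta\mu(\varphi_\sigma|\varphi_\rho). \tag{46} \]
By the asymmetry of the chemical potential, $\mu(\varphi_\sigma|\varphi_\rho) + \mu(\varphi_{\bar\sigma}|\varphi_{\bar\rho}) = 0$, thus
\[ \log d(\rho) - \log d(\sigma) = \beta F(\varphi_{\bar\rho}|\varphi_{\bar\sigma}) + \beta\mu(\varphi_\sigma|\varphi_\rho). \tag{47} \]
Summing up Eqs. (46) and (47) and setting β/2 = π/κ(R) we obtain
\[ \log d(\rho) - \log d(\sigma) = \log d_{\mathrm{geo}}(\rho) - \log d_{\mathrm{geo}}(\sigma) = \frac{\pi}{\kappa(R)}\bigl(F(\varphi_\rho|\varphi_\sigma) + F(\varphi_{\bar\rho}|\varphi_{\bar\sigma})\bigr). \]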
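Spelled out, the summation step of the proof is elementary. The following sketch only restates Eqs. (46) and (47) as given above and adds them; nothing beyond those two equations is assumed.

```latex
\begin{align*}
  \log d(\rho) - \log d(\sigma)
    &= \beta F(\varphi_\rho|\varphi_\sigma)
       - \beta\mu(\varphi_\sigma|\varphi_\rho),
    && \text{(46)} \\
  \log d(\rho) - \log d(\sigma)
    &= \beta F(\varphi_{\bar\rho}|\varphi_{\bar\sigma})
       + \beta\mu(\varphi_\sigma|\varphi_\rho),
    && \text{(47)} \\
  2\bigl(\log d(\rho) - \log d(\sigma)\bigr)
    &= \beta\bigl(F(\varphi_\rho|\varphi_\sigma)
       + F(\varphi_{\bar\rho}|\varphi_{\bar\sigma})\bigr),
    && \text{(sum: the $\mu$ terms cancel)}
\end{align*}
```

Dividing by 2 and inserting β/2 = π/κ(R) gives the statement of the theorem.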
We have thus expressed the analytical index log d(ρ)−log d(σ ) in terms of a physical quantity, the incremental free energy, and the quantity κ(R) associated with the geometry of the spacetime. Corollary 5.5. F (ϕσ |ϕρ ) + F (ϕσ¯ |ϕρ¯ ) = β −1 (Sc (σ ) − Sc (ρ)), where Sc (ρ) denotes the conditional entropy of the sector ρ. Proof. Immediate. The above corollary shows that the part of the incremental free energy which is independent of charge conjugation is proportional to the increment of the conditional entropy. 6. On the Index of the Supercharge Operator The purpose of this section is to make a few remarks to interpret the statistical dimension of a superselection sector as the Fredholm index of an operator associated with the supercharge operator in a supersymmetric theory. Let F(O) be the von Neumann algebra on a Hilbert space H generated by the fields localized in the region O ∈ K of a spacetime M, say the Minkowski spacetime, in a Quantum Field Theory, as in Sect. 2, in the vacuum representation. Let F = ∪O∈K F(O)− be the quasi-local C∗ -algebra and γ : g ∈ G → γg ∈ Aut(F) an action of a compact group of internal symmetries with g0 an involutive element of the center of G providing a grading automorphism γ0 = γg0 with normal commutation relations: F1 F2 ± F2 F1 = 0,
Fi ∈ F(Oi), O1 ⊂ O2′,
where F1, F2 ∈ F± ≡ {F : γ0(F) = ±F}, and the + sign occurs iff both F1 and F2 belong to F−. With Γ the canonical selfadjoint unitary on H implementing γ0, the Hilbert space decomposes according to the eigenvalues of Γ,
\[ H = H_+ \oplus H_-. \tag{48} \]
As is known [16], each irreducible representation $\pi \in \hat G$ gives rise to a superselection sector ρπ of the observable algebra $A \equiv F^G$: given a Hilbert space of isometries Hπ ⊂ F(O) carrying the representation π, one has a covariant endomorphism ρπ of A,
\[ \rho_\pi(X) = \sum_{i=1}^{d} v_i X v_i^*, \qquad X \in F, \tag{49} \]
where {v1, v2, . . . , vd} is an orthonormal basis of Hπ and d = dim(π) = dDHR(ρπ). The unitary representation of G implementing γ gives rise to a decomposition of H,
\[ H = \bigoplus_{\pi\in\hat G} H_\pi \]
and, denoting by π0 the identity representation of A on H, one has the unitary equivalence π0|Hπ ≈ dim(π) π0 · ρπ|Hι. Let H be the Hamiltonian of F, namely the generator of the time evolution on H. We now assume the existence of a supersymmetric structure on F, namely there exists a supercharge operator Q, an odd selfadjoint operator with $Q^2 = H$; corresponding to the decomposition (48) of H, Q can be written as
\[ \begin{pmatrix} 0 & Q_+ \\ Q_- & 0 \end{pmatrix} \]
with $Q_- = Q_+^*$, where $Q_\pm : H_\pm \to H_\mp$. In case (not assumed here) $e^{-\beta H}$ is a trace class operator, ∀β > 0, one has the well-known formula
\[ \operatorname{Index}(Q_+) = \operatorname{Tr}(\Gamma e^{-\beta H}), \qquad \beta > 0, \]
with Index(Q+) the Fredholm index of Q+, that turns out to coincide with the dimension of the kernel of H, thus Index(Q+) = 1 if there exists a unique vacuum vector. Now fix an irreducible localized endomorphism ρ = ρπ of A as above and let Hρ be the Hamiltonian in the representation ρ, namely the generator of the unitary time evolution in the representation ρ. The Hamiltonian Hρ is not supersymmetric, in the sense that there exists no odd square root of Hρ, as such an operator would interchange Bose and Fermi sectors and cannot map the Hilbert space of ρ (which is either contained in H+ or in H−) into itself. Yet, we may define a supercharge operator Qρ on the global Hilbert space H by setting
\[ Q_\rho = \sum_i v_i Q v_i^*, \]
with the vi's forming a basis of Hπ as above.
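The trace formula for Index(Q+) recalled above rests on a standard pairing argument, sketched here under the trace class assumption (which, as noted, is not assumed elsewhere in this section). Here Γ is the selfadjoint unitary implementing the grading γ0, and the notation $H_\pm(E)$, for the eigenspace of the Hamiltonian in H± with eigenvalue E, is introduced only for this sketch.

```latex
% Sketch, assuming e^{-\beta H} is trace class, so that the Hamiltonian has
% discrete spectrum with finite multiplicities.
% H_\pm(E): eigenspace with eigenvalue E inside H_\pm (notation local to this sketch).
\[
  \operatorname{Tr}(\Gamma e^{-\beta H})
    = \sum_{E \ge 0} e^{-\beta E}\,\bigl(\dim H_+(E) - \dim H_-(E)\bigr).
\]
% For E > 0: Q_+ maps H_+(E) into H_-(E), with Q_-Q_+ = E on H_+(E) and
% Q_+Q_- = E on H_-(E); hence Q_+ : H_+(E) -> H_-(E) is invertible and the
% two dimensions agree, so only E = 0 contributes.
\[
  \operatorname{Tr}(\Gamma e^{-\beta H})
    = \dim H_+(0) - \dim H_-(0)
    = \dim\ker Q_+ - \dim\ker Q_-
    = \operatorname{Index}(Q_+).
\]
```

In particular the right hand side does not depend on β, as the formula requires.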
Lemma 6.1. $Q_\rho^2|_{H_0} = H_\rho$.

Proof. Indeed
\[ Q_\rho^2 = \Bigl(\sum_i v_i Q v_i^*\Bigr)^2 = \sum_i v_i Q^2 v_i^* = \sum_i v_i H v_i^*. \tag{50} \]
Because of the expression (49), one checks easily that U(t)ρ(X)U(−t) = ρ(αt(X)), X ∈ A, where $U(t) = \sum_i v_i e^{itH} v_i^*$ and α is the one-parameter automorphism group. Since H commutes with γG, it follows that H0 is an invariant subspace for $\sum_i v_i H v_i^*$, thus the latter restricts to Hρ on H0 and formula (50) implies the statement.

We now assume that ρ is a Bose sector, thus Hπ ⊂ F+, namely each vi commutes with Γ and thus preserves the decomposition (48) of the Hilbert space so that
\[ Q_\rho = \begin{pmatrix} 0 & Q_{\rho+} \\ Q_{\rho-} & 0 \end{pmatrix}, \]
where in particular $Q_{\rho+} = \sum_i v_i Q_+ v_i^*$. Since the restriction of $v_i Q_+ v_i^*$ to $v_i v_i^* H_+$ is unitarily equivalent to Q+, clearly Qρ+ is unitarily equivalent to Q+ ⊕ · · · ⊕ Q+ (d times) and therefore Index(Qρ+) = d(ρ) · Index(Q+). If σ is another sector as above, we thus have the formula
\[ \frac{d(\rho)}{d(\sigma)} = \frac{\operatorname{Index}(Q_{\rho+})}{\operatorname{Index}(Q_{\sigma+})}, \]
showing an interpretation of the (statistical) dimension as a multiplicative relative Fredholm index. It remains to provide a model where the above structure can be realized. To this end let Ab and Af be the nets of local algebras associated with the free scalar Bose field and the free Fermi–Dirac field on Minkowski spacetime, or on the cylinder spacetime. Then F = Ab ⊗ Af is equipped with a supersymmetric structure (see e.g. [12, 25, 31]), where the grading unitary is $1 \otimes (-1)^N$, with $(-1)^N$ the even–odd symmetry. We fix a positive integer n ≥ 2 and consider
\[ \tilde F \equiv F \otimes F \otimes \cdots \otimes F \quad (n \text{ factors}), \tag{51} \]
as our field algebra, with the gauge group G = Pn acting by permuting the order of the bosonic algebra Ab ⊗ · · · ⊗ Ab in (51). The only thing to check is the existence of a supersymmetric structure, which is given by the following lemma.

Lemma 6.2. The tensor product of two supersymmetric QFT nets is supersymmetric.

Proof. To simplify notations we shall consider the tensor product $\tilde F = F \otimes F$ of the same net F by itself. The decomposition (48) of the Hilbert space H of F gives a decomposition of $\tilde H = H \otimes H$,
\begin{align}
\tilde H_+ &= (H_+ \otimes H_+) \oplus (H_- \otimes H_-), \tag{52} \\
\tilde H_- &= (H_+ \otimes H_-) \oplus (H_- \otimes H_+), \tag{53}
\end{align}
and we can define the operators $\tilde Q_\pm : \tilde H_\pm \to \tilde H_\mp$,
\begin{align}
\tilde Q_+ &= (Q_+ \otimes 1 + 1 \otimes Q_+) \oplus (Q_- \otimes 1 - 1 \otimes Q_-), \tag{54} \\
\tilde Q_- &= (Q_+ \otimes 1 + 1 \otimes Q_-) \oplus (Q_- \otimes 1 - 1 \otimes Q_+). \tag{55}
\end{align}
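The relative signs in (54) and (55) are what make the cross terms cancel when the two operators are composed. A sketch of the check on the summand H+ ⊗ H+, using only $Q_-Q_+ = H$ on H+ and the ordinary (ungraded) tensor product of operators; the latter is an assumption about the intended reading of (54) and (55).

```latex
% Let \psi \in H_+ \otimes H_+. By (54),
%   \tilde Q_+ \psi = (Q_+ \otimes 1)\psi + (1 \otimes Q_+)\psi ,
% with (Q_+ \otimes 1)\psi \in H_- \otimes H_+ and (1 \otimes Q_+)\psi \in H_+ \otimes H_-.
% By (55), \tilde Q_- acts as  Q_+ \otimes 1 + 1 \otimes Q_-  on H_+ \otimes H_-
% and as  Q_- \otimes 1 - 1 \otimes Q_+  on H_- \otimes H_+.
\begin{align*}
  \tilde Q_- \tilde Q_+ \psi
    &= (Q_- \otimes 1 - 1 \otimes Q_+)(Q_+ \otimes 1)\psi
     + (Q_+ \otimes 1 + 1 \otimes Q_-)(1 \otimes Q_+)\psi \\
    &= (Q_-Q_+ \otimes 1)\psi - (Q_+ \otimes Q_+)\psi
     + (Q_+ \otimes Q_+)\psi + (1 \otimes Q_-Q_+)\psi \\
    &= (H \otimes 1 + 1 \otimes H)\psi .
\end{align*}
```

The other three summands can be treated in the same way; on each of them the two cross terms appear with opposite signs.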
The tensor product Hamiltonian $\tilde H = (H \otimes 1) \oplus (1 \otimes H)$ is equal to $\tilde Q_-\tilde Q_+ \oplus \tilde Q_+\tilde Q_-$. It should be noticed that the choice of the tensor product supercharge $\tilde Q$ is not canonical: having chosen $\tilde Q$ we get another supercharge $\tilde Q_g$ by permuting the order of the tensor product factors with a permutation g.

Remark. If we apply the above procedure to chiral conformal field theory, with G a finite group of internal symmetries, the observable algebra $A = F^G$ has an extra family of sectors {σi}i, beside the above ones $\{\rho_\pi\}_{\pi\in\hat G}$ [38]; indeed $\sum_\pi d(\rho_\pi)^2 = |G|$, while
\[ \sum_{\pi\in\hat G} d(\rho_\pi)^2 + \sum_i d(\sigma_i)^2 = |G|^2. \]
The dimension d(σi) is not necessarily integral and therefore cannot be related to a Fredholm index. Our formulae, nevertheless, still make sense.
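To get a feeling for the size of the extra family, one can insert small groups into the two sum rules of the Remark. The groups $\mathbb{Z}_2$ and $S_3$ below are illustrative choices, not examples taken from the text; the dimensions of their irreducible representations are the standard ones, and d(ρπ) = dim(π) as stated after Eq. (49).

```latex
% Illustration only: the groups are chosen here, not in the text.
% Z_2: two irreducible representations, both of dimension 1.
\[
  \sum_{\pi} d(\rho_\pi)^2 = 1 + 1 = 2 = |G|,
  \qquad
  \sum_i d(\sigma_i)^2 = |G|^2 - |G| = 4 - 2 = 2 .
\]
% S_3: irreducible representations of dimensions 1, 1 and 2.
\[
  \sum_{\pi} d(\rho_\pi)^2 = 1 + 1 + 4 = 6 = |G|,
  \qquad
  \sum_i d(\sigma_i)^2 = |G|^2 - |G| = 36 - 6 = 30 .
\]
```

In general the two rules leave |G|(|G| − 1) for the sum of the squared dimensions of the extra sectors σi.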
7. Outlook. Comparison with the JLO Theory

This section contains a tentative proposal to analyze the superselection sectors by noncommutative geometry. Although it is in a primitive form, we hope it will provide an insight into the structure. 7.1. Induction of cyclic cocycles. Let A be a Z2-graded unital pre-C∗-algebra, with C∗-completion $\bar A$. Morphisms of A will be assumed to be bounded, i.e. to extend to $\bar A$. We assume that A is a Banach ∗-algebra with respect to a norm |·| preserved by the grading γ, so that A is a graded Banach ∗-algebra. We briefly recall some basic definitions about Connes' entire cyclic cohomology [12]. Let Cn(A) be the Banach space of the (n + 1)-linear functionals on A with finite norm
\[ |f_n| = \sup_{|a_i|\le 1} |f_n(a_0, a_1, \dots, a_n)|, \]
and let C(A) be the space of the entire cochains, namely the elements of C(A) are the sequences f = (f0, f1, f2, . . . ), fn ∈ Cn(A), such that
\[ \|f\| = \sum_{n\ge 0} \sqrt{n!}\,|f_n|\,z^n \tag{56} \]
is an entire function of z. The grading γ lifts to C(A), and let C±(A) denote the corresponding splitting of the entire cochains.
Denoting by Ce and Co the spaces of the even and odd entire cochains (f0, f2, f4, . . . ) and (f1, f3, f5, . . . ), the entire cohomology groups $H^e_+(A)$ and $H^o_+(A)$ are the ones associated with the complex
\[ \cdots \to C_e^+ \xrightarrow{\ \partial\ } C_o^+ \xrightarrow{\ \partial\ } C_e^+ \to \cdots, \tag{57} \]
where the coboundary operator is ∂ = b + B,
\begin{align*}
(Bf)_{n-1}(a_0, a_1, \dots, a_{n-1}) &= \sum_{j=0}^{n-1} \Bigl[ (-1)^{(n-1)j} f_n(1, a^\gamma_{n-j}, \dots, a^\gamma_{n-1}, a_0, \dots, a_{n-j-1}) \\
&\qquad\quad + (-1)^{n-1} f_n(a^\gamma_{n-j}, \dots, a_{n-j-1}, 1) \Bigr], \\
(bf)_{n+1} &= \sum_{j=0}^{n} f_n(a_0, \dots, a_j a_{j+1}, \dots, a_{n+1}) + (-1)^{n+1} f_n(a^\gamma_{n+1} a_0, a_1, \dots, a_n).
\end{align*}
If B is another Banach ∗-algebra and graded pre-C∗-algebra, and ρ : A → B a homomorphism of A into B, any cyclic cocycle τ of B has a pull-back to a cyclic cocycle (τ · ρ) of A,
\[ (\tau\cdot\rho)_n(a_0, a_1, \dots, a_n) = \tau_n(\rho(a_0), \rho(a_1), \dots, \rho(a_n)). \]
In particular, let B be contained in A, where both $\bar A$ and $\bar B$ are unital with trivial center, and let λ be a canonical endomorphism of A with respect to B, namely Eq. (61) holds with T ∈ A and S ∈ B, see the Appendix. In this case, if (τn) is a cyclic cocycle of B, its pull-back to A via λ will be called the induction of (τn) to A.

Proposition 7.1. The induction gives rise to a well-defined map from H 1 (B) to H 1 (A), independently of the choice of λ.

Proof. Immediate by Proposition A.2 and the fact that inner automorphisms give the identity map in cyclic cohomology [12].

Let T be a tensor category, with conjugates and subobjects, of bounded endomorphisms of A and let ρ be an object of T. If (τn) is a cyclic cocycle of A, then we obtain a cyclic cocycle $(\tau_n^\rho) \equiv (\tau_n\cdot\rho^{-1})$ on B = ρ(A) as pull-back by $\rho^{-1}$, hence a cyclic cocycle of A by inducing it up to A.

Proposition 7.2. The induction of $(\tau_n\cdot\rho^{-1})$ from ρ(A) to A is equivalent to $(\tau_n\cdot\bar\rho)$, namely to the pull-back of (τn) via the conjugate charge $\bar\rho$.

Proof. Since $\lambda = \rho\bar\rho$ is the canonical endomorphism of A into ρ(A) (see the Appendix), the result is immediate.

Remark. There is a product on the localized cyclic cocycles (i.e. cocycles of the form τ · ρ, for some localized endomorphism ρ), by setting (τ · ρ) ⊗ (τ · σ) ≡ τ · ρσ. This product passes to equivalence classes and gives a well defined product in cohomology. The interest in this point relies on the fact that in the passage from the commutative to the noncommutative case the ring structure in cohomology is usually lost (viewed on the K-theoretical side, there is no noncommutative version of the tensor product of fiber bundles).
7.2. Index and super-KMS functionals. Let Ab, Af be pre-C∗-algebras with C∗-completions $\bar A_b$, $\bar A_f$, and let A ≡ Ab ⊗ Af be their tensor product. We assume that $\bar A_f$ is Z2-graded, while the grading on Ab is trivial. Let γ be the involutive automorphism of A giving the grading on A, trivial on Ab. Let $\alpha \equiv \alpha^{(b)} \otimes \alpha^{(f)}$ be a one-parameter group of automorphisms of $\bar A$ leaving A invariant and δ be an unbounded odd derivation of $\bar A$, with a dense ∗-algebra Aα = Aα,b ⊗ Aα,f ⊂ A contained in the domain of δ, so that δ is a square root of the generator D of α,
\[ \delta^2 = D \equiv \frac{d}{dt}\alpha_t\Big|_{t=0} \quad \text{on } A_\alpha, \]
cf. [33, 39]. We consider now a super-KMS functional ϕ at inverse temperature β = 1, namely ϕ is a linear functional on A that satisfies Eq. (24) for all a, b ∈ A and
\[ \varphi(\delta a) = 0, \qquad a \in A_\alpha. \]
We assume that ϕ is bounded, as is the case for quantum field theory on a compact space, where ϕ is a super-Gibbs functional. At the end of this section we shall add comments on the case ϕ unbounded. Associated with ϕ there is an entire cyclic cocycle, the JLO cocycle [31, 37, 32], on the algebra Aα normed with the norm $|||a||| \equiv \|a\| + \|\delta a\|$. For n even, it is defined as
\[ \tau_n(a_0, a_1, \dots, a_n) \equiv (-1)^{-\frac{n}{2}} \int_{0\le t_1\le\cdots\le t_n\le 1} \varphi\bigl(a_0\,\alpha_{it_1}(\delta a_1^{\gamma})\,\alpha_{it_2}(\delta a_2)\cdots\alpha_{it_n}(\delta a_n^{\gamma^n})\bigr)\,dt_1\,dt_2\cdots dt_n. \tag{58} \]
There is a special class of unitary α-cocycles with finite holomorphic dimension that corresponds to bounded perturbations of the dynamics [32]. If q ∈ A is even there is a perturbed structure $(A, \alpha^q, \delta_q)$, with $\delta_q = \delta + [\,\cdot\,, q]$ and $\alpha^q = \mathrm{Ad}\,u^q \cdot \alpha$, where $u^q$ is the unitary cocycle in A given by the solution of [25]
\[ -i\,\frac{d}{dt}u^q(t) = u^q(t)\,\alpha_t(h), \qquad h = \delta q + q^2; \]
the functional
\[ \varphi^q(a) = \underset{t\to i}{\mathrm{anal.cont.}}\ \varphi(a\,u^q(t)) \]
is super-KMS with respect to $\alpha^q$ and the topological index of $u^q$ is given by the Chern character, the JLO cocycle $\tau^q$ associated with $\varphi^q$ evaluated at the identity
\[ d_\varphi(u^q) = \varphi^q(1) \equiv \sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{n!}\,\tau^q_{2n}(1, 1, \dots, 1). \]
Of course, the second equality is trivially true as δ1 = 0, thus only the first term in the series may be non-zero. We state the result on the deformation invariance of the topological index in [32]: Theorem 7.3 ([32]). If q ∈ A− then ϕ q (1) = dϕ (uq ) = dϕ (1) = ϕ(1).
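The remark preceding Theorem 7.3, that only the first term of the series can survive, can be made explicit. A sketch, using only δ1 = 0 (hence also $\delta_q 1 = \delta 1 + [1, q] = 0$) and the form of the JLO cocycle (58):

```latex
% For n >= 1 the integrand of \tau^q_{2n}(1,\dots,1) contains the factor
% \delta_q 1 = 0, so these terms vanish; for n = 0 formula (58) reduces to
% \tau^q_0(a_0) = \varphi^q(a_0).
\[
  \tau^q_{2n}(1, 1, \dots, 1) = 0 \quad (n \ge 1),
  \qquad\text{hence}\qquad
  \sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{n!}\,\tau^q_{2n}(1, 1, \dots, 1)
  = \tau^q_0(1) = \varphi^q(1).
\]
```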
In the case of Thm. 7.3, $(\tau_n^q)$ is indeed cohomologous to (τn). We consider now a different class of unitary cocycles. We specialize to the case where Ab ⊗ Af is dense in the quasi-local C∗-algebra associated with a supersymmetric quantum field theory (cf. Sect. 6). Let T be a tensor category, with conjugates and subobjects, of localized endomorphisms contained in End(Aα,b). We keep the same symbol for the endomorphism ρ of Ab to denote the endomorphism ρ ⊗ ι of $\bar A$ and assume each ρ to be α-covariant, in the sense that there is an α-cocycle of unitaries u(ρ, t) ∈ Aα,b such that Eq. (14) holds. Suppose first that ρ is an automorphism. Then, setting $\delta^\rho \equiv \rho\cdot\delta\cdot\rho^{-1}$, we see that $\delta^\rho - \delta$ acts identically where ρ acts identically, that is, by Haag duality, $\delta^\rho$ is a perturbation of δ by a derivation localized in a double cone. In general, when ρ is an endomorphism, $\delta^\rho$ is defined on ρ(Aα). We shall then define the multilinear form $\tau_n^\rho$ on ρ(Aα),
\begin{multline}
\tau_n^\rho(a_0, a_1, \dots, a_n) \equiv (-1)^{-\frac{n}{2}} \int_{S_n} \varphi\bigl(a_0\,u(is_1)\,\alpha_{is_1}(\delta^\rho(a_1^{\gamma})\,u(is_2))\,\alpha_{i(s_1+s_2)}(\delta^\rho(a_2)\,u(is_3)) \\
\cdots u(is_n)\,\alpha_{i(s_1+\cdots+s_n)}(\delta^\rho(a_n^{\gamma^n})\,u(is_{n+1}))\bigr)\,ds_1\cdots ds_{n+1}, \tag{59}
\end{multline}
where $S_n \equiv \{(s_1, \dots, s_{n+1}) : s_i \ge 0,\ s_1 + \cdots + s_{n+1} = 1\}$ and $u(t) \equiv u_\rho(t)$. We now assume that ϕ is of the form ϕb ⊗ ϕf, which is automatically the case if ϕ is factorial with a cluster property, and that ϕb satisfies Haag duality, so that we can apply the results in Sect. 2. Since an irreducible object ρ of T extends to an irreducible endomorphism of the weak closure of Aα,b in the state ϕ (Cor. 2.8), we have the following.

Proposition 7.4. $\tau_n^\rho = d_\varphi(u_\rho)\,\tau_n\cdot\rho^{-1}$ on ρ(Aα,b).

Proof. Setting
\[ \varphi_\rho = \underset{t\to i}{\mathrm{anal.cont.}}\ \varphi(\,\cdot\,u(t)), \]
we have (see Sect. 1.1 and [47]) $\varphi_\rho = d_\varphi(u_\rho)\,\varphi\cdot\Phi_\rho$, thus $\varphi_\rho|_{\rho(A)} = d_\varphi(u_\rho)\,\varphi\cdot\rho^{-1}$, which is a super-KMS functional with respect to the evolution $\mathrm{Ad}\,u(t)\cdot\alpha_t = \rho\cdot\alpha_t\cdot\rho^{-1}$ and the superderivation $\delta^\rho = \rho\cdot\delta\cdot\rho^{-1}$. A direct verification shows that $(\tau_n^\rho)$ is the JLO cocycle on ρ(Aα) associated with $\varphi_\rho|_{\rho(A_\alpha)}$. Thus the above expression is well defined, if (τn) is well defined.

In order to have a cocycle on all Aα,b we consider the induction $(\tilde\tau_n^\rho)$ of $(\tau_n^\rho)$ to Aα. By the previous discussion and Prop. 7.4 we then have
\[ \tilde\tau_n^\rho = d_\varphi(u_\rho)\,\tau_n\cdot\bar\rho. \]
We summarize our discussion in the following corollary.
Corollary 7.5. Consider the supersymmetric structure as above on A = Ab ⊗ Af with $\bar A_b$ the quasi-local C∗-algebra as in Sect. 2. If ϕ = ϕb ⊗ ϕf is a supersymmetric KMS functional satisfying Haag duality and ρ is a translation covariant localized endomorphism of Ab mapping Aα,b into itself, then
\[ d_\varphi(\rho) = \tilde\tau_0^\rho(1) \equiv \sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{n!}\,\tilde\tau^\rho_{2n}(1, 1, \dots, 1) \]
and
\[ d_{\mathrm{geo}}(\rho) = \sqrt{d_\varphi(u_\rho)\,d_\varphi(u_{\bar\rho})} = d_{\mathrm{DHR}}(\rho) \in \mathbb{N}. \]
Remark. The discussion in this section is incomplete in two respects. On the one hand, if we consider a Quantum Field Theory on a compact non-simply connected low dimensional spacetime, such as a chiral conformal net on $S^1$, then the quasi-local C∗-algebra has non-trivial center with a non-trivial action of the superselection structure [19]; our results then need to be extended to this context with further study, possibly with a connection with the low dimensional QFT topology. On the other hand, when dealing with quantum field theory on Minkowski spacetime, as in Sect. 2, the boundedness requirement for the graded KMS functional ϕ should be omitted because of Prop. 1.10. Thus the construction of $(\tilde\tau_n^\rho)$, in its present form, is not satisfactory except, perhaps, for QFT on a contractible curved spacetime. This difficulty vanishes if we drop the boundedness requirement for ϕ, but only demand the restriction ϕb ≡ ϕ|Ab to be bounded. It is then natural to deal only with the restriction of the JLO cocycle (τn) to the Bosonic algebra Aα,b. An example of such an unbounded super-KMS functional seems to occur naturally. However, a general study of the unbounded super-KMS functional does not presently exist; in particular it should be checked that the JLO formula still gives a well defined and entire cyclic cocycle (τn).

A. Appendix. Some Properties of Sectors in the C∗-Case

For the convenience of the reader we describe here a few facts concerning endomorphisms of C∗-algebras, which are natural counterparts of the corresponding results in the context of factors. A.1. The canonical endomorphism. Let A be a unital C∗-algebra with trivial centre. An endomorphism λ of A is called canonical (with finite index) if there exist isometries T ∈ (ι, λ) and S ∈ (λ, λ²) such that
\[ \lambda(S)S = S^2, \qquad S^*\lambda(T) \in \mathbb{C}\setminus\{0\}, \qquad T^*S \in \mathbb{C}\setminus\{0\}. \tag{60} \]
If B ⊂ A is a unital C∗-subalgebra and λ is an endomorphism of A with λ(A) ⊂ B, we shall say that λ is a canonical endomorphism of A with respect to B (or into B) if there exist intertwiners T ∈ (ι, λ), S ∈ (ι|B, λ|B) such that
\[ S^*\lambda(T) \in \mathbb{C}\setminus\{0\}, \qquad T^*S \in \mathbb{C}\setminus\{0\}. \tag{61} \]
In this case λ is clearly canonical, in the sense that Eq. (60) holds, since S ∈ B. The converse is also true.
Proposition A.1. If λ is canonical (Eq. 60), then it is canonical with respect to a natural C∗-subalgebra B ⊂ A (Eq. 61).

The proof is the same as in the factor case [46]: one defines B as the range of S∗λ(·)S and checks by Eq. (60) that B is a C∗-subalgebra and that S∗λ(·)S is a conditional expectation of A onto B. The above definitions extend to the case of pre-C∗-algebras; in this case we assume that endomorphisms are bounded and that the intertwiners live in the ∗-algebras. Note now that if ρ is an endomorphism of A, a conjugate $\bar\rho$ of ρ, if it exists, is unique as a sector, i.e. up to inner automorphisms of A; in fact a more general result holds in the context of tensor categories [51]. This implies the following.

Proposition A.2. Let B ⊂ A be unital pre-C∗-algebras with trivial centre and λ1, λ2 canonical endomorphisms of A into B (namely Eq. (61) holds). There exists a unitary u ∈ B such that λ2 = Ad u · λ1.

Proof. The proof could be given similarly to the one given in [46, Props. 4.1 and 4.2]. However, it is easier to observe that one can consider sectors and conjugate sectors between different C∗-algebras. Then the canonical endomorphism λ, as a map of A into B, is just the conjugate sector for the embedding ι of B into A and thus the uniqueness of λ modulo inner automorphisms of B is just a consequence of the uniqueness of the conjugate in a tensor 2-C∗-category, see [50].

Proposition A.3. If ρ and $\bar\rho$ are conjugate as above, then $\lambda = \rho\bar\rho$ is the canonical endomorphism of A with respect to the subalgebra ρ(A).

Proof. Let R, $\bar R$ be the operators in the conjugate equation for ρ, $\bar\rho$; setting T = $\bar R$ and S = ρ(R), Eq. (61) holds true.

A.2. The quantum double in the C∗ setting. We consider now a construction in the C∗ context, that corresponds to the one given in [50] in the context of factors, see also [57, 53, 38].
Let A be a unital C∗-algebra with trivial centre and T ⊂ End(A) a tensor category of endomorphisms with conjugates and sub-objects. We shall denote by $A^{op}$ the opposite C∗-algebra and by $\tilde A \equiv A \otimes A^{op}$ the tensor product with respect to the maximal C∗ tensor norm. Let I be an index set and choose $\{\rho_i\}_{i\in I}$ a family of all irreducible objects of T, one for each equivalence class, with ρ0 = ι and $\rho_{\bar i} = \bar\rho_i$, and set $\tilde\rho_i \equiv \rho_i \otimes \rho_i^{op}$, where $\rho^{op} \equiv j\cdot\rho\cdot j$ with $j : A \to A^{op}$ the canonical anti-linear isomorphism. Let B0 be the linear space of functions $X : I \to \tilde A$ with finite support. We consider the following product and ∗-operation on B0:
\begin{align}
X \star Y(k) &\equiv \sum_{i,j} X(i)\,\tilde\rho_i(Y(j))\,C_{ij}^k, \tag{62} \\
X^*(k) &\equiv C^{0*}_{k\bar k}\,\tilde\rho_k(X(\bar k)^*). \tag{63}
\end{align}
Here $C_{ij}^k \in (\tilde\rho_k, \tilde\rho_i\tilde\rho_j)$ is the canonical intertwiner
\[ C_{ij}^k = \sqrt{\frac{d(\rho_i)\,d(\rho_j)}{d(\rho_k)}}\ \sum_\ell v_\ell \otimes j(v_\ell) \]
with $\{v_\ell\}_\ell$ any orthonormal basis in (ρk, ρiρj). Clearly $\tilde A$ can be identified with the subalgebra of B0 of functions with support in 0.
Setting $R_i(k) \equiv \delta_{ik}$, the $R_i \in B_0$ satisfy the relations
\begin{align}
R_i X &= \tilde\rho_i(X)\,R_i, \quad X \in \tilde A, & R_i^* R_i &= d(\rho_i)^2, \notag \\
R_i R_j &= \sum_k C_{ij}^k R_k, & R_i^* &= C^{0*}_{\bar i i} R_{\bar i}, \tag{64}
\end{align}
where we have omitted the $\star$ in the product. B0 is then a ∗-algebra and every X ∈ B0 has a unique expansion
\[ X = \sum_i X(i) R_i, \]
where the coefficients are uniquely determined by $X(i) = \varepsilon(X R_i^*)$. Here $\varepsilon : B_0 \to \tilde A$ is the conditional expectation given by ε(X) = X(0), that can be shown to be faithful as in [38], App. A. Extending states from $\tilde A$ to B0 via ε we obtain a faithful family of states, hence B0 has a maximal C∗-norm, the completion under which is a C∗-algebra B. Now
\[ \|X^*X\| \equiv \sup_\psi \psi(X^*X) \ge \sup_{\varphi_1\otimes\varphi_2} \varphi_1\otimes\varphi_2\cdot\varepsilon(X^*X) = \|\varepsilon(X^*X)\|, \]
where ψ ranges over the states of B0 and ϕ1 ⊗ ϕ2 over the product states of $\tilde A$, namely ε is bounded and extends to a conditional expectation $\varepsilon : B \to \tilde A$. To each X ∈ B we may associate the formal expansion
\[ X = \sum_i X(i) R_i, \]
where $X(i) \equiv \varepsilon(X R_i^*)$ and one has X = 0 ⇔ X(i) = 0 ∀i ∈ I. We have $\tilde A' \cap B = \mathbb{C}$. Indeed if $X \in \tilde A' \cap B$, then for every $a \in \tilde A$,
\[ \sum_i a X(i) R_i = aX = Xa = \sum_i X(i) R_i a = \sum_i X(i)\,\tilde\rho_i(a) R_i, \]
thus $X(i) \in (\iota, \tilde\rho_i)$; thus it follows by the irreducibility of $\tilde\rho_i$ that $X = X(0) \in \tilde A \cap \tilde A' = \mathbb{C}$. The contact with the construction in [50] is visible by the following proposition.

Proposition A.4. If T is rational (namely I is a finite set), then
\[ \lambda = \bigoplus_{i\in I} \rho_i \otimes \rho_i^{op} \]
is the restriction to $\tilde A$ of the canonical endomorphism of B to $\tilde A$.
Although the chemical potential can be described via extension of the KMS state from $\tilde A$ to B, in Sect. 3.3 we prefer to deal with a slight variation of the above construction. Indeed the definition of B can be modified by replacing $\tilde A$ with A ⊗ C, where C is any C∗-algebra with trivial centre where T acts faithfully, namely T ⊂ End(C). Accordingly one puts in this case $\tilde\rho_i \equiv \rho_i \otimes \bar\rho_i$ and $C_{ij}^k = \sqrt{\frac{d(\rho_i)\,d(\rho_j)}{d(\rho_k)}}\,\sum_\ell v_\ell \otimes v_\ell^{\bullet}$. In particular we may take C = A, and indeed we specialize to this case in Sect. 3.3. It is rather obvious how to formulate in the above setting the structure and the results obtained for the quantum double case.

Acknowledgements. Among others, we wish to thank in particular S. Doplicher for early motivational comments, and A. Connes, K. Fredenhagen and A. Jaffe for inspiring conversations and invitations respectively at the IHES, Hamburg University and Harvard University, while this work was at different stages.
References 1. Araki, H., Haag, R., Kastler, D., Takesaki, M.: Extension of KMS states and chemical potential. Commun. Math. Phys. 53, 97–134 (1977) 2. Araki, H., Kishimoto, A.: Symmetry and equilibrium states. Commun. Math. Phys. 52, 211–232 (1977) 3. Atiyah, M.F., Singer, I.M.: The index of elliptic operators. Ann. Math. (2) 87, 484–530 (1968) 4. Böckenhauer, J.: An algebraic formulation of level one Wess–Zumino–Witten models. Rev. Math. Phys. 8, 925–947 (1996) 5. Bisognano, J., Wichmann, E.: On the duality condition for a Hermitian scalar field. J. Math. Phys. 16, 985 (1975) 6. Borchers, H.J.: The CPT-theorem in two-dimensional theories of local observables. Commun. Math. Phys. 143, 315–332 (1992) 7. Borchers, H.J., Yngvason, J.: Modular groups of quantum fields in thermal states. J. Math. Phys. 40, 601–624 (1999) 8. Borisov, N.V., Müller, W., Schrader, R.: Relative index theorems and supersymmetric scattering theory. Commun. Math. Phys. 114, 475–513 (1988) 9. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal Quantum Field Theory. Commun. Math Phys. 156, 201–219 (1993) 10. Buchholz, D., Longo, R.: Graded KMS functionals and the breakdown of supersymmetry. Adv. Theor. Math. Phys. 3, 615–626 (2000); Addendum; ibidem 6, 1909–1910 (2000) 11. Connes, A.: Une classification des facteurs de type III. Ann. Sci. Ec. Norm. Sup. 6, 133–252 (1973) 12. Connes, A.: Noncommutative Geometry. London–New York: Academic Press (1994) 13. Connes, A.: Noncommutative differential geometry and the structure of space and time. On Operator Algebras and Quantum Field Theory, Doplicher, Longo, Roberts & Zsido, eds., Boston, MA: International Press, 1997 14. Connes, A., Takesaki, M.: The flow of weights on factors of type III. Tohoku Math. J. 29, 73–555 (1977) 15. Doplicher, S.: Problems in operator algebras and quantum field theory. In Lecture Notes in Mathematics 1139. Berlin–Heidelberg–New York: Springer-Verlag, 1985 16. 
Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics. I. Commun. Math. Phys. 23, 199–230 (1971) 17. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics. II. Commun. Math. Phys. 35, 49–85 (1974) 18. Doplicher, S., Roberts, J.E.: Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Commun. Math. Phys. 131, 52 (1990) 19. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras. II. Rev. Math. Phys. Special Issue, 113–157 (1992) 20. Evans, D.E., Kawahigashi, Y.: Quantum Symmetries on Operator Algebras. Oxford: Oxford University Press, 1998 21. Guido, D., Longo, R.: Relativistic invariance and charge conjugation in quantum field theory. Commun. Math. Phys. 148, 521 (1992) 22. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11 (1996)
23. Guido, D., Longo, R., Wiesbrock, H.-W.: Extension of conformal nets and superselection structures. Commun. Math. Phys. 192, 217–244 (1998) 24. Guido, D., Longo, R., Roberts, J.E., Verch, R.: Charged sectors, spin and statistics in quantum field theory on curved spacetimes. Rev. Math. Phys. 13, 125–198 (2001) 25. Haag, R.: Local Quantum Physics. Berlin–Heidelberg–New York: Springer-Verlag, 1996 26. Haag, R., Hugenoltz, N.M., Winnink, M.: On the equilibrium states in quantum statistical mechanics. Commun. Math. Phys. 5, 215 (1967) 27. Hawking, S.W.: Particle creation by black holes. Commun. Math. Phys. 43, 199 (1975) 28. Hislop, P.D., Longo, R.: Modular structure of the local algebras associated with the free massless scalar field theory. Commun. Math. Phys. 84, 71 (1982) 29. Isola, T.: Modular structure of the crossed product by a group dual. J. Oper. Th. 33, 3–31 (1995) 30. Izumi, M.: The structure of sectors associated with the Longo-Rehren inclusion I. General theory. Commun. Math. Phys. 213 127–179 (2000) 31. Jaffe, A., Lesniewski, A., Osterwalder, K.: Quantum K-theory I. The Chern character. Commun. Math. Phys. 118, 1–14 (1988) 32. Jaffe, A., Lesniewski, A., Wisniowski, M.: Deformations of super-KMS functionals. Commun. Math. Phys. 121, 527–540 (1989) 33. Jaffe, A., Lesniewski, A.: An index theorem for super-derivations. Commun. Math. Phys. 125, 147-152 (1989) 34. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983) 35. Julg, P.: Indice relatif et K-théorie bivariante de Kasparov. C. R. Acad. Sci. Paris 307, Série I, 243–248 (1988) 36. Kadison, L., Kastler, D.: Cohomological aspects and relative separability of finite Jones index subfactors. Nachr. Akad. Wissen. Göttingen II. Math.-Phys. Klasse 4, 1–11 (1992) 37. Kastler, D.: Cyclic cocycles from graded KMS functionals. Commun. Math. Phys. 121, 345–350 (1989) 38. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. 
Commun. Math. Phys. 219, 631–669 (2001) 39. Kishimoto, A., Nakamura, H.: Superderivations. Commun. Math. Phys. 159, 15–27 (1994) 40. Kosaki, H.: Extension of Jones’ theory on index to arbitrary subfactors. J. Funct. Anal. 66, 123–140 (1986) 41. Longo, R.: Algebraic and modular structure of von Neumann algebras of physics. Proc. Symp. Pure Math. 38, Part 2, 551 (1982) 42. Longo, R.: Solution of the factorial Stone–Weierstrass conjecture. Invent. Math. 76, 145 (1984) 43. Longo, R.: Index of subfactors and statistics of quantum fields. I. Commun. Math. Phys. 126, 217–247 (1989) 44. Longo, R.: Index of subfactors and statistics of quantum fields. II, Correspondences, braid group statistics and Jones polynomial. Commun. Math. Phys. 130, 285–309 (1990) 45. Longo, R.: Minimal index and braided subfactors. J. Funct. Anal. 109, 98–112 (1992) 46. Longo, R.: A duality for Hopf algebras and for subfactors. Commun. Math. Phys. 159, 133–150 (1994) 47. Longo, R.: An analogue of the Kac–Wakimoto formula and black hole conditional entropy. Commun. Math. Phys. 186, 451–479 (1997) 48. Longo, R.: Abstracts for the talks at the Oberwolfach meetings on “C∗ -algebras”, February 1998, and “Noncommutative geometry”, August 1998 49. Longo, R.: The Bisognano-Wichmann theorem for charged states and the conformal boundary of a black hole. Proc. of the conference on “Mathematical Physics and Quantum Field Theory”, Berkeley, June 1999, Electronic J. Diff. Equations, Conf. 04, 2000, pp. 159–164 50. Longo, R., Rehren, K.H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995) 51. Longo, R., Roberts, J.E.: A theory of dimension. K-Theory 11, 103–159 (1997) 52. Maldacena, J.: The large N limit of superconformal field theories and supergravity. Adv. Theor. Math. Phys. 2, 231–252 (1998) 53. Masuda, T.: Generalization of Longo–Rehren construction to subfactors of infinite depth and amenability of fusion algebra. J. Funct. Anal. 171, 53–77 (2000) 54. 
Peetre, J.: Une caractérisation abstraicte des opérateurs différentiels. Math. Scand. 8, 116–120 (1960) 55. Pimsner, M., Popa, S.: Entropy and index for subfactors. Ann. Sci. Ec. Norm. Sup. 19, 57–106 (1986) 56. Rehren, H.K.: Algebraic holography. Ann. H. Poincaré 1, 607–623 (2000) 57. Rehren, K.-H.: Space-time fields and exchange fields. Commun. Math. Phys. 132, 461–483 (1990) 58. Roberts, J.E.: Lectures on algebraic quantum field theory. In: The Algebraic Theory of Superselection Sectors. Introduction and Recent Results. D. Kastler ed., Singapore: World Scientific, 1990 59. Roberts, J.E.: Spontaneously broken gauge symmetries and superselection rules. In: Proc. of School of Math. Physics, Univ. di Camerino, 1974
60. Sewell, G.L.: Quantum fields on manifolds, PCT and gravitationally induced thermal states. Ann. Phys. 141, 201 (1982) 61. Stoytchev, O.: The modular group and super-KMS functionals. Lett. Math. Phys. 27, 43–50 (1993) 62. Summers, S., Verch, R.: Modular inclusions, the Hawking temperature and quantum field theory in curved spacetime. Lett. Math. Phys. 37, 145 (1996) 63. Takesaki, M.: Tomita theory of modular Hilbert algebras. Lect. Notes in Math. 128. New York– Heidelberg–Berlin: Springer Verlag, 1970 64. Takesaki, M., Winnink, M.: Local normality in quantum statistical mechanics. Commun. Math. Phys. 30, 129–152 (1973) 65. Wald, R.M.: General Relativity. Chikago, IL: University of Chicago Press, 1984 66. Wick, G.C., Wightman, A.S., Wigner, E.P.: The intrinsic parity of elementary particles. Phys. Rev. 88, 101–105 (1952) 67. Wiesbrock, H.-W.: Conformal quantum field theory and half-sided modular inclusions of von Neumann algebras. Commun. Math. Phys. 158, 537 (1993) 68. Wiesbrock, H.-W.: Superselection structure of conformal field theory on the circle and localized Connes cocycles. Rev. Math. Phys. 7, 133–160 (1995) 69. Witten, E.: Anti de Sitter space and holography. Adv. Theor. Math. Phys. 2, 253–291 (1998) Communicated by A. Connes
Commun. Math. Phys. 222, 97 – 116 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Operator Algebras and Poisson Manifolds Associated to Groupoids

N. P. Landsman

Korteweg–de Vries Institute for Mathematics, University of Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands. E-mail: [email protected]

Received: 6 December 2000 / Accepted: 19 April 2001
Dedicated to Dai Evans at his 50th birthday

Abstract: It is well known that a measured groupoid $G$ defines a von Neumann algebra $W^*(G)$, and that a Lie groupoid $G$ canonically defines both a $C^*$-algebra $C^*(G)$ and a Poisson manifold $A^*(G)$. We construct suitable categories of measured groupoids, Lie groupoids, von Neumann algebras, $C^*$-algebras, and Poisson manifolds, with the feature that in each case Morita equivalence comes down to isomorphism of objects. Subsequently, we show that the maps $G \mapsto W^*(G)$, $G \mapsto C^*(G)$, and $G \mapsto A^*(G)$ are functorial between the categories in question. It follows that these maps preserve Morita equivalence.

1. Introduction

Kontsevich has introduced the idea of the “three worlds”, viz. commutative, Lie, and associative algebras, relating these worlds to each other and to “formal” noncommutative geometry [17]. In the context of noncommutative geometry in the sense of Connes [4], and in particular of its relationship with quantum theory and quantization, three other worlds are relevant, namely von Neumann algebras, $C^*$-algebras, and Poisson manifolds. Groupoids provide access to each of these. Firstly, measured groupoids $G$ [29, 38, 13, 10, 2, 33] define von Neumann algebras $W^*(G)$ in standard form [5, 14, 43, 45, 40]. Secondly, Lie groupoids $G$ [27] canonically define $C^*$-algebras $C^*(G)$ [3, 4]. Thirdly, one may canonically associate a Poisson manifold $A^*(G)$ with a Lie groupoid $G$ [6, 7, 9]. For the most basic examples of these associations, first note that a set $S$ defines two entirely different groupoids. The first has $S$ as the total space $G_1$, and also as the base space $G_0$ of $G$. If $S$ is an analytic measure space $(X, \mu)$, this leads to $W^*(X) \cong L^\infty(X, \mu)$, and if $S$ is a manifold $M$ one obtains $C^*(M) \cong C_0(M)$, and $A^*(M) \cong M$

The results in this paper were first presented in Cardiff on 10. 10. 2000.
Supported by a Fellowship from the Royal Netherlands Academy of Arts and Sciences (KNAW).
with zero Poisson bracket. The second is the pair groupoid of $S$, with $G_1 = S \times S$ and $G_0 = S$. In that case one has $W^*(X \times X) \cong B(L^2(X, \mu))$, $C^*(M \times M) \cong K(L^2(M))$, and $A^*(M \times M) \cong T^*(M)$. If the groupoid is a group, one recovers the usual von Neumann algebra and $C^*$-algebra defined by a locally compact group. The Poisson manifold defined by a Lie group is just the dual of the Lie algebra, equipped with the Lie–Poisson structure. Group actions define the associated action groupoids [27], which in turn reproduce the group measure space construction of Murray and von Neumann, the notion of a transformation group $C^*$-algebra, and the class of semidirect Poisson structures, respectively (for the latter cf. [19]). For example, in the ergodic case all hyperfinite factors arise in this way. Finally, the von Neumann algebras and $C^*$-algebras defined by foliations [2–4, 33] may be seen as special cases of the above constructions as well, where $G$ is the holonomy groupoid of a smooth foliation. This class of examples formed a major motivation for the development of noncommutative geometry.

For fixed $G$, there are certain relationships between these constructions. Under appropriate technical conditions, both measured and Lie groupoids may be seen as special instances of locally compact groupoids with Haar system [40]; see [39] and [23, 21], respectively. The von Neumann algebra $W^*(G)$ is then simply the weak closure of $C^*(G)$ in its regular representation. The connection between $A^*(G)$ and $C^*(G)$ is deeper: $C^*(G)$ is a strict deformation quantization of $A^*(G)$ [21–23]. This means, among other things, that there exists a continuous field of $C^*$-algebras over $[0, 1]$, whose fiber above 0 is the commutative $C^*$-algebra $C_0(A^*(G))$, all other fibers being $C^*(G)$.
The C ∗ -algebra of continuous cross-sections of this continuous field turns out to be the C ∗ -algebra of the normal groupoid [15] defined by the embedding G0 → G1 of the unit space of G into its total space (Connes’s tangent groupoid [4] corresponds to the special case of a pair groupoid G = M × M). In the present paper, we examine and compare the properties of the associations G → W ∗ (G), G → C ∗ (G) and G → A∗ (G) as a function of G. Our main result is that each of these maps is functorial, though not with respect to the obvious arrows defining the pertinent categories. The categories that are involved have the desirable property that isomorphism of objects is the same as Morita equivalence (as previously defined by Rieffel for von Neumann algebras and C ∗ -algebras [42] and by Xu for Poisson manifolds [54]), so that functoriality implies that Morita equivalence is preserved. Often involving different terminology, for von Neumann algebras many special cases of the latter property have been known for some time, starting with Mackey’s ergodic imprimitivity theorem [29, 38], and including results in [10, 18, 39, 49]. For C ∗ -algebras and Poisson manifolds the preservation of Morita equivalence was already known in full generality; see [36] and [24], respectively. Special cases of our functoriality results include also [34, 35, 48, 47]. We surmise that the computations in [15], taking place in the category KK of separable C ∗ -algebras as objects and KK-groups as arrows, can be generalized to arbitrary Lie groupoids; they should then be related to our results as well. The plan of this paper is as follows. In Sect. 2 we deal with measured groupoids and von Neumann algebras, in Sect. 3 we treat Lie groupoids and C ∗ -algebras, and in Sect. 4 we end with Lie groupoids and Poisson manifolds. Our main results are Theorems 1, 2, and 3. 
The reader will notice that the category of measured groupoids and the category of Lie groupoids are defined in an apparently totally different way. The fact that these categories are actually closely related is explained in [26], to which we refer in general for motivation and for more details about the categories we use here. This includes the
proof that, as already mentioned, in each case recognized notions of Morita equivalence turn out to coincide with isomorphism of objects in the pertinent category.

Notation. We use the notation
$$A \times_B^{f,g} C = \{(a, c) \in A \times C \mid f(a) = g(c)\}$$
for the fiber product of sets $A$ and $C$ with respect to maps $f : A \to B$ and $g : C \to B$. The total space of a groupoid $G$ is denoted by $G_1$, and its base space by $G_0$. The source and target projections are called $s : G_1 \to G_0$ and $t : G_1 \to G_0$, and multiplication is a map $m : G_2 \to G_1$, with $G_2 = G_1 \times_{G_0}^{s,t} G_1$. The inversion is denoted by $I : G_1 \to G_1$. A functor $\Phi : G \to H$ decomposes into $\Phi_i : G_i \to H_i$, $i = 0, 1$, subject to the usual axioms.

2. Functoriality of $G \mapsto W^*(G)$

2.1. The category MG of measured groupoids and functors. The concept of a measured groupoid emerged from the work of Mackey on ergodic theory and group representations [29]. For the technical development of this concept see [38, 13, 10]. A different approach was initiated by Connes [2]. The connection between measured groupoids and locally compact groupoids is laid out in [40, 39].

Definition 1. A Borel groupoid is a groupoid $G$ for which $G_1$ is an analytic Borel space, $I$ is a Borel map, $G_2 \subset G_1 \times G_1$ is a Borel subset, and multiplication $m$ is a Borel map. It follows that $G_0$ is a Borel set in $G_1$, and that $s$ and $t$ are Borel maps. A left Haar system on a Borel groupoid is a family of measures $\{\nu^u\}_{u \in G_0}$, where $\nu^u$ is supported on the $t$-fiber $G^u = t^{-1}(u)$, which is left-invariant in that
$$\int d\nu^{s(x)}(y)\, f(xy) = \int d\nu^{t(x)}(y)\, f(y) \tag{2.1}$$
for all $x \in G_1$ and all positive Borel functions $f$ on $G_1$ for which both sides are finite. A measured groupoid is a Borel groupoid equipped with a Haar system as well as a Borel measure $\tilde\nu$ on $G_0$ with the property that the measure class of the measure $\nu$ on $G_1$, defined by
$$\nu = \int_{G_0} d\tilde\nu(u)\, \nu^u, \tag{2.2}$$
is invariant under $I$ (in other words, $\nu^{-1} = I(\nu) \sim \nu$). Recall that the push-forward of a measure under a Borel map is given by $t(\nu)(E) = \nu(t^{-1}(E))$ for Borel sets $E \subset G_0$.

This definition turns out to be best suited for categorical considerations. It differs from the one in [38, 13], which is stated in terms of measure classes. However, the measure class of $\nu$ defines a measured groupoid in the sense of [38, 13], and, conversely, the latter is also a measured groupoid according to Definition 1 provided one removes a suitable null set from $G_0$, as well as the corresponding arrows in $G_1$; cf. Thm. 3.7 in [13]. Similarly, Definition 1 leads to a locally compact groupoid with Haar system [40] after removal of such a set; see Thm. 4.1 in [39]. A measured groupoid according to Connes [2] satisfies Definition 1 as well, with $\tilde\nu$ constructed from the Haar system
and a transverse measure [33]. See all these references for extensive information and examples. The fact that a specific choice of a measure in its class is made in Definition 1 is balanced by the concept of a measured functor between measured groupoids, which is entirely concerned with measure classes rather than individual measures. Moreover, one merely uses the measure class of $\tilde\nu$. The measure $\tilde\nu$ on $G_0$ induces a measure $\hat\nu$ on $G_0/G$, as the push-forward of $\tilde\nu$ under the canonical projection, and similarly for a measured groupoid $H$, for whose measures we will use the symbol $\lambda$ instead of $\nu$. We say that a functor $\Phi$ is Borel if both $\Phi_0$ and $\Phi_1$ are. If so, $\Phi_0$ induces a Borel map $\hat\Phi_0 : G_0/G \to H_0/H$ in the obvious way.

Definition 2. A measured functor $\Phi : G \to H$ between two measured groupoids is a Borel map that is algebraically a functor and satisfies $\hat\Phi_0(\hat\nu) \prec\prec \hat\lambda$.

What we here call a measured functor is called a strict homomorphism in [38], and a homomorphism in [39]. Also, note that in [29, 38, 10] various more liberal definitions are used (in that one does not impose that $\Phi$ be a functor algebraically at all points), but it is shown in [39] that if one passes to natural isomorphism classes, this greater liberty gains little.

Definition 3. The category MG has measured groupoids as objects, and isomorphism classes of measured functors as arrows. (Here a natural transformation $\nu : G_0 \to H_1$ between Borel functors from $G$ to $H$ is required to be a Borel map.) Composition is defined by $[\Psi] \circ [\Phi] = [\Psi \circ \Phi]$, and the unit arrow at a groupoid $G$ is $1_G = [\mathrm{id}_G]$, where $\mathrm{id}_G : G \to G$ is the identity functor.

2.2. The category $\mathrm{W}^*$ of von Neumann algebras and correspondences. Let $M$, $N$ be von Neumann algebras. Recall that an $M$-$N$ correspondence $M \rightarrowtail \mathcal{H} \leftarrowtail N$ is given by a Hilbert space $\mathcal{H}$ carrying commuting normal unital representations of $M$ and $N^{\mathrm{op}}$. See [4]. The notion of isomorphism of correspondences is the obvious one: one requires a unitary isomorphism between the Hilbert spaces in question that intertwines the left and right actions. Given two matched correspondences $M \rightarrowtail \mathcal{H} \leftarrowtail N$ and $N \rightarrowtail \mathcal{K} \leftarrowtail P$, one may define an $M$-$P$ correspondence $M \rightarrowtail \mathcal{H} \boxtimes_N \mathcal{K} \leftarrowtail P$, called the relative tensor product or “Connes fusion” of the given correspondences. This construction is a von Neumann algebraic version of the bimodule tensor product in pure algebra. Various definitions exist [4, 44, 51], which coincide up to isomorphism. This composition is associative up to isomorphism. A standard representation of a von Neumann algebra $M$ on $\mathcal{H} = L^2(M)$, unique up to unitary equivalence, is best seen as an $M$-$M$ correspondence with special properties. One of these is that $L^2(M)$ acts as a two-sided unit for $\boxtimes_M$, again merely up to isomorphism.

Definition 4. The category $\mathrm{W}^*$ has von Neumann algebras as objects, and isomorphism classes of correspondences as arrows, composed by the relative tensor product, for which the standard forms $L^2(M)$ are units.

To detail, we here regard an isomorphism class $[M \rightarrowtail \mathcal{H} \leftarrowtail N]$ as an arrow from $M$ to $N$, so that the composition is $[N \rightarrowtail \mathcal{K} \leftarrowtail P] \circ [M \rightarrowtail \mathcal{H} \leftarrowtail N] = [M \rightarrowtail \mathcal{H} \boxtimes_N \mathcal{K} \leftarrowtail P]$.
Using results in [42, 44], it is easily seen that two von Neumann algebras are Morita equivalent iff they are isomorphic in $\mathrm{W}^*$ [26], and this is true iff there is a correspondence in which the commutant of one is isomorphic to the opposite algebra of the other, or iff they are stably isomorphic.

2.3. The map $G \mapsto W^*(G)$ as a functor. It is well known that a measured groupoid defines a von Neumann algebra in standard form [5, 14, 45, 43, 40]. In this section, we extend the map $G \mapsto W^*(G)$ to a map from MG to $\mathrm{W}^*$, and establish its functoriality. The precise classes of Borel functions $f, g$ on $G_1$ for which the formulae below are well defined are spelled out in the above papers; for example, one may assume that $f, g \in II(G_1)$ as defined in [14]. Let $G$ be a measured groupoid (cf. Definition 1). Convolution on $G$ is defined by
$$f * g(x) = \int_{G_1} d\nu^{s(x)}(y)\, f(xy)\, g(y^{-1}), \tag{2.3}$$
and involution is
$$f^*(x) = \overline{f(x^{-1})}. \tag{2.4}$$
We here use the conventions in [40]; many authors include the modular homomorphism $\Delta : G_1 \to \mathbb{R}_+$ in (2.4), defined by $\Delta(x) = d\nu(x)/d\nu^{-1}(x)$. We write $L^2(G)$ for $L^2(G_1, \nu)$. For $\psi \in L^2(G)$ the formulae
$$\pi_L(f)\psi = (\Delta^{-1/2} f) * \psi; \tag{2.5}$$
$$\pi_R(f)\psi = \psi * f \tag{2.6}$$
define the left and right regular representations of $II(G_1)$; one then has $W^*(G) = \pi_L(II(G_1))''$, which is in standard form with respect to $J : L^2(G) \to L^2(G)$ defined by
$$J\psi(x) = \Delta(x)^{-1/2}\, \psi^*(x). \tag{2.7}$$
One then has $W^*(G)' = J W^*(G) J = \pi_R(II(G_1))''$.

We have now defined the alleged functor $G \mapsto W^*(G)$ on objects. To define it on arrows, let $H$ be a second measured groupoid (with Haar system $\lambda$), and let $\Phi : G \to H$ be a measured functor (cf. Definition 2). Define a Hilbert space
$$L^2(\Phi) = L^2\Big(G_0 \times_{H_0}^{\Phi_0, t} H_1,\ \int_{G_0} d\tilde\nu(u)\, \lambda^{\Phi_0(u)}\Big). \tag{2.8}$$
Compare (2.2). Also, define $\pi_\lambda : II(G_1) \to B(L^2(\Phi))$ and $\pi_\rho : II(H_1) \to B(L^2(\Phi))$ by
$$\pi_\lambda(f)\varphi(u, h) = \int_{G_1} d\nu^u(y)\, \Delta(y)^{-1/2} f(y)\, \varphi(s(y), \Phi_1(y^{-1})h); \tag{2.9}$$
$$\pi_\rho(g)\varphi(u, h) = \int_{H_1} d\lambda^{s(h)}(l)\, g(l^{-1})\, \varphi(u, hl). \tag{2.10}$$
These expressions extend to $f \in W^*(G)$ and $g \in W^*(H)$ by continuity, and it is easily seen that one thus defines a correspondence $W^*(G) \rightarrowtail L^2(\Phi) \leftarrowtail W^*(H)$.
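As a sanity check on the conventions (2.1), (2.3) and (2.4), the following minimal Python sketch specializes them to the pair groupoid on a finite set. It assumes counting measures as Haar system, so that the modular function is trivial; under these assumptions convolution reduces to matrix multiplication, in line with $W^*(X \times X) \cong B(L^2(X, \mu))$ from the Introduction. All function names are illustrative and do not come from the paper.

```python
from itertools import product

# Assumption: the pair groupoid on {0, ..., n-1}. The arrow (a, b) has
# target a and source b, and every Haar measure nu^u is counting measure
# on the fiber t^{-1}(u), so the modular function is identically 1.

def arrows(n):
    return list(product(range(n), repeat=2))

def target(x):
    return x[0]

def source(x):
    return x[1]

def compose(x, y):
    # The product xy is defined only when s(x) = t(y).
    assert source(x) == target(y)
    return (x[0], y[1])

def inverse(x):
    return (x[1], x[0])

def haar_sum(F, u, n):
    """Integral of the function F against nu^u: a sum over t^{-1}(u)."""
    return sum(F(y) for y in arrows(n) if target(y) == u)

def convolve(f, g, n):
    """Convolution (2.3): sum over y in t^{-1}(s(x)) of f(xy) g(y^{-1})."""
    return {
        x: haar_sum(lambda y: f[compose(x, y)] * g[inverse(y)], source(x), n)
        for x in arrows(n)
    }

def involution(f):
    """Involution (2.4): f*(x) is the complex conjugate of f(x^{-1})."""
    return {x: complex(f[inverse(x)]).conjugate() for x in f}

def matrix_product(f, g, n):
    """Ordinary matrix product of f and g, viewed as n-by-n matrices."""
    return {
        (a, b): sum(f[(a, c)] * g[(c, b)] for c in range(n))
        for a in range(n)
        for b in range(n)
    }
```

With functions stored as dictionaries indexed by arrows, `convolve(f, g, n)` can be compared entry by entry with `matrix_product(f, g, n)`, and `involution` with the conjugate transpose.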
Theorem 1. The map $W^* : \mathrm{MG} \to \mathrm{W}^*$, defined on objects by $W_0^*(G) = W^*(G)$ as above, and on arrows (i.e., natural isomorphism classes of measured functors $\Phi : G \to H$) by $W_1^*([\Phi]) = [W^*(G) \rightarrowtail L^2(\Phi) \leftarrowtail W^*(H)]$, is a functor.

Proof. For $H = G$ and $\Phi = \mathrm{id}$ one easily sees that $L^2(\mathrm{id}) \cong L^2(G)$, $\pi_\lambda \cong \pi_L$, and $\pi_\rho \cong \pi_R$ (the $\cong$ here standing for unitary equivalence). Hence one obtains the standard form $W_1^*(\mathrm{id}) = [W^*(G) \rightarrowtail L^2(G) \leftarrowtail W^*(G)]$. Since the unit arrows in $\mathrm{W}^*$ are precisely the standard forms, this shows that $W^*$ maps units into units. We now need to show that, for a third measured groupoid $K$ and a measured functor $\Psi : H \to K$, one has
$$W^*(G) \rightarrowtail L^2(\Phi) \boxtimes_{W^*(H)} L^2(\Psi) \leftarrowtail W^*(K) \tag{2.11}$$
$$\cong\ W^*(G) \rightarrowtail L^2(\Psi \circ \Phi) \leftarrowtail W^*(K). \tag{2.12}$$
Since $W^*(H) \rightarrowtail L^2(H)$ is in standard form, one can easily compute the relative tensor product by applying the general prescriptions in [44] to the case at hand. We use the notation in [44] and [14]. Thus $A_I \subset L^2(H)$ is the left Hilbert algebra associated to the above standard form. This defines a normal semi-finite faithful weight $\lambda$ on $W^*(H)$ by $\lambda(f^* * f) = \|f\|^2_{L^2(H)}$ for $f \in A_I$, and $\lambda(f^* * f) = \infty$ otherwise. The space of $\lambda$-bounded vectors in $L^2(\Psi)$ is called $D(L^2(\Psi), \lambda)$. One defines a sesquilinear form on $L^2(\Phi) \otimes D(L^2(\Psi), \lambda)$ (algebraic tensor product over $\mathbb{C}$) by sesquilinear extension of
$$(\varphi_1 \otimes \psi_1, \varphi_2 \otimes \psi_2)_0 = (\varphi_1, \pi_\rho(\langle \psi_1, \psi_2 \rangle_\lambda)\varphi_2)_{L^2(\Phi)}, \tag{2.13}$$
where $\langle \psi_1, \psi_2 \rangle_\lambda \in W^*(H)$ in fact lies in $A_I$, and may be determined by its property
$$(f, \langle \psi_1, \psi_2 \rangle_\lambda)_{L^2(H)} = (\psi_1, \pi_\lambda(Jf)\psi_2)_{L^2(\Psi)}, \tag{2.14}$$
where $f \in A_I$ is arbitrary. The form $(\,,)_0$ is positive semidefinite, and the completion of the quotient of $L^2(\Phi) \otimes D(L^2(\Psi), \lambda)$ by the null space of $(\,,)_0$ in the induced norm is the Hilbert space $L^2(\Phi) \boxtimes_{W^*(H)} L^2(\Psi)$. The actions of $W^*(G)$ and $W^*(K)$ on $L^2(\Phi)$ and $D(L^2(\Psi), \lambda) \subset L^2(\Psi)$ (which is stable under $W^*(K)$), respectively, induce actions on $L^2(\Phi) \boxtimes_{W^*(H)} L^2(\Psi)$, defining this Hilbert space as a $W^*(G)$-$W^*(K)$ correspondence. Denoting the Haar system on $K$ by $\rho$, from (2.14) one easily finds
$$\langle \psi_1, \psi_2 \rangle_\lambda(h) = \int_{K_1} d\rho^{\Psi_0(s(h))}(k)\, \overline{\psi_1(s(h), k)}\, \psi_2(t(h), \Psi_1(h)k), \tag{2.15}$$
from which the form (2.13) may explicitly be computed. Now define $\tilde U : L^2(\Phi) \otimes D(L^2(\Psi), \lambda) \to L^2(\Psi \circ \Phi)$ by linear extension of
$$\tilde U(\varphi \otimes \psi) : (u, k) \mapsto \int_{H_1} d\lambda^{\Phi_0(u)}(h)\, \varphi(u, h)\, \psi(s(h), \Psi_1(h^{-1})k). \tag{2.16}$$
Using (2.15) and (2.13), one finds that
$$(\tilde U(\varphi_1 \otimes \psi_1), \tilde U(\varphi_2 \otimes \psi_2))_{L^2(\Psi \circ \Phi)} = (\varphi_1 \otimes \psi_1, \varphi_2 \otimes \psi_2)_0. \tag{2.17}$$
Hence $\tilde U$ descends to an isometric map $U : L^2(\Phi) \boxtimes_{W^*(H)} L^2(\Psi) \to L^2(\Psi \circ \Phi)$. Using the fact that the underlying measure spaces are analytic, it is easily shown that the range of $\tilde U$ is dense, so that $U$ is unitary. A simple computation finally shows that $U$ intertwines the pertinent actions of $W^*(G)$ and $W^*(K)$. This proves (2.12).

Since Morita equivalence for measured groupoids is isomorphism in MG, and Morita equivalence of von Neumann algebras is isomorphism in $\mathrm{W}^*$, it follows that the map $G \mapsto W^*(G)$ preserves Morita equivalence.

3. Functoriality of $G \mapsto C^*(G)$

Most of the following constructions apply to locally compact groupoids with Haar system as well, but a key technical step in the proof of functoriality appears to be valid only in the smooth case; cf. the paragraph preceding (3.10). Another reason for our restriction to Lie groupoids is that the beautiful parallel with the classical case is only pertinent in the smooth case.
3.1. The category LG of Lie groupoids and principal bibundles. Lie groupoids [27] play a central role in differential geometry, once one starts looking for them. This applies, in particular, to foliation theory [3, 4]. In addition, many physical systems can be modeled by Lie groupoids [21].

Definition 5. A Lie groupoid is a groupoid for which $G_1$ and $G_0$ are manifolds, $s$ and $t$ are surjective submersions, and $m$ and $I$ are smooth.

It follows that object inclusion is an immersion, that $I$ is a diffeomorphism, that $G_2$ is a closed submanifold of $G_1 \times G_1$, and that for each $q \in G_0$ the fibers $s^{-1}(q)$ and $t^{-1}(q)$ are submanifolds of $G_1$. In this paper we include Hausdorffness in the definition of a manifold for simplicity, though the total space $G_1$ of the holonomy groupoid of a foliation usually fails to satisfy this condition. With more technical machinery, our results should extend to that case also. The category LG, and the key concept of a principal bibundle occurring in its definition, arose in the work of Moerdijk [30], originally in the context of topos theory. Similar structures independently emerged in foliation theory [3, 12, 15]. The connection between these two points of entry was made by Mrčun [34, 35], from which the following definitions are taken; for the basic underlying notion of a Lie groupoid action cf. [27].

Definition 6. A $G$-$H$ bibundle is a manifold $M$ equipped with smooth maps $\tau : M \to G_0$ and $\sigma : M \to H_0$, a left $G$-action $(x, m) \mapsto xm$ from $G \times_{G_0}^{s,\tau} M$ to $M$, and a right $H$-action $(m, h) \mapsto mh$ from $M \times_{H_0}^{\sigma,t} H$ to $M$, such that $\tau(mh) = \tau(m)$, $\sigma(xm) = \sigma(m)$, and $(xm)h = x(mh)$ for all $(m, h) \in M \times H$ and $(x, m) \in G \times M$. We write $G \rightarrowtail M \leftarrowtail H$. Such a bibundle is called left principal when $\sigma$ is a surjective submersion, the $G$ action is free (in that $xm = m$ iff $x \in G_0$) and transitive along the fibers of $\sigma$. Equivalently, the map from $G_1 \times_{G_0}^{s,\tau} M \to M \times_{H_0} M$ given by $(x, m) \mapsto (xm, m)$ is a diffeomorphism.
A $G$-$H$ bibundle $M$ is called regular when it is left principal and the right $H$ action is proper (in that the map $(m, h) \mapsto (m, mh)$ from $M \times_{H_0} H$ to $M \times M$ is proper). Two $G$-$H$ bibundles $M$, $N$ are called isomorphic if there is a diffeomorphism $M \to N$ that intertwines the maps $M \to G_0$, $M \to H_0$ with the maps $N \to G_0$, $N \to H_0$, and in addition intertwines the $G$ and $H$ actions (the latter condition is well defined because of the former).

Note that the $G$ action in a left principal $G$-$H$ bibundle is automatically proper. In the topos literature a left principal bibundle is seen as a generalized map from $H$ to $G$, whereas in the foliation literature it is regarded as the graph of a map between the leaf spaces of the foliations defining $G$ and $H$. Now suppose one has left principal bibundles $G \rightarrowtail M \leftarrowtail H$ and $H \rightarrowtail N \leftarrowtail K$. The fiber product $M \times_{H_0} N$ carries a right $H$ action, given by $h : (m, n) \mapsto (mh, h^{-1}n)$ (defined as appropriate). We denote the orbit space by
$$M \circledast_H N = (M \times_{H_0} N)/H. \tag{3.1}$$
This is a manifold, and, indeed, a $G$-$K$ bibundle under the obvious maps. The “tensor product” $\circledast$ is well defined on isomorphism classes. The canonical $G$-$G$ bibundle $G$, defined by putting $M = H = G$, $\tau = t$, and $\sigma = s$ in the above definitions, with left and right actions given by multiplication in the groupoid, is a left and a right unit for the bibundle tensor product (3.1), up to isomorphism.

Definition 7. The category LG has Lie groupoids as objects and isomorphism classes of regular (i.e., left principal and right proper) bibundles as arrows. The arrows are composed by (3.1), descending to isomorphism classes. The units $1_G$ in LG are the isomorphism classes $[G \rightarrowtail G \leftarrowtail G]$ of the canonical bibundles.

A number of definitions of Morita equivalence of Lie groupoids have appeared in the literature [12, 36, 30, 53, 34, 35]; it can be shown that these are all equivalent, and that two Lie groupoids are Morita equivalent iff they are isomorphic objects in LG [34, 26].

3.2. The category $\mathrm{C}^*$ of $C^*$-algebras and Hilbert bimodules. The definition of $\mathrm{C}^*$ is based on the concept of an $A$-$B$ Hilbert bimodule, which is what Rieffel [42] called an Hermitian $B$-rigged $A$-module, with strict continuity of the $A$ action added. Thus an $A$-$B$ Hilbert bimodule is a Hilbert $C^*$-module $E$ over $B$, along with a nondegenerate $*$-homomorphism of $A$ into $\mathcal{L}_B(E)$. We write $A \rightarrowtail E \rightleftharpoons B$. Two $A$-$B$ Hilbert bimodules $E$, $F$ are called isomorphic when there is a unitary $U \in \mathcal{L}_B(E, F)$; cf. [20], p. 24. The canonical bimodule $1_B$ over a $C^*$-algebra $B$ is defined by $\langle A, B \rangle_B = A^* B$, and the left and right actions are given by left and right multiplication, respectively. Rieffel’s interior tensor product [42, 20] maps an $A$-$B$ Hilbert bimodule $E$ and a $B$-$C$ Hilbert bimodule $F$ into an $A$-$C$ Hilbert bimodule $E \hat\otimes_B F$. This operation is well defined on unitary isomorphism classes, and $1_B$ acts as a two-sided unit for $\hat\otimes_B$, up to isomorphism.

Definition 8.
The category C∗ has C ∗ -algebras as objects, and isomorphism classes of Hilbert bimodules as arrows. The arrows are composed by Rieffel’s interior tensor product, for which the canonical Hilbert bimodules 1A are units. This category was introduced independently in [46], and, in the guise of a bicategory (where the arrows are Hilbert bimodules rather than isomorphism classes thereof), in [25]. It was shown in [46] that two C ∗ -algebras are Morita equivalent as defined by
Rieffel [42] iff they are isomorphic as objects in C∗ ; also see [26] for a detailed proof. The nondegeneracy condition in the definition of the arrows in C∗ is essential for this result. It should be noted that Thm. 2.2 in [1] implies that the category W∗ of Definition 4 is isomorphic to the subcategory of C∗ consisting of von Neumann algebras as objects and normal selfdual Hilbert bimodules as arrows.
3.3. The map G → C ∗ (G) as a functor. We will now prove that the map G → C ∗ (G) mentioned in the Introduction may be extended so as to associate Hilbert bimodules to regular bibundles, thus defining a functor from LG to C∗ . Although it should be possible to use the geometric definition of C ∗ (G) in terms of half-densities [4], as in our previous direct proof that G → C ∗ (G) preserves Morita equivalence [24], we find it much easier to regard a Lie groupoid as a locally compact groupoid with smooth Haar system (cf. the Introduction). Specifically, a Lie groupoid G has a left Haar system {ν q }q∈G0 such that ν q is supported on t −1 (q) and is equivalent to Lebesgue measure in each coordinate chart (recall that t −1 (q) is a submanifold of G1 ). Furthermore, for each f ∈ Cc∞ (G1 ) the function q → dν q (x) f (x) on G0 is smooth. This endows Cc∞ (G) with the structure of a ∗ -algebra under the operations (2.3) and (2.4). The groupoid C ∗ -algebra C ∗ (G) is a suitable completion of the ∗ -algebra Cc∞ (G); see [40] for the analogous case of Cc (G), or [4, 21] for the smooth case. We have now defined the map G → C ∗ (G) on objects. To define it on arrows, let G M H be a regular bibundle (cf. Definition 6 for the notation that will be used throughout this chapter). A key fact is that a Haar system on G defines a family of measures {µr }r∈H0 on M, where µr is supported on σ −1 (r), on which it is equivalent to Lebesgue measure in each coordinate chart. Moreover, for each f ∈ Cc∞ (M) the r function r → dµ (m) f (m) on H0 is smooth, and the family is H -equivariant (in the sense of [41]) with respect to σ , the given H action on M, and the natural right H action on H0 . This means that for each f ∈ Cc∞ (M) one has
$$\int d\mu^{t(h)}(m)\, f(mh) = \int d\mu^{s(h)}(m)\, f(m). \tag{3.2}$$
Namely, for fixed r ∈ H0 this system is defined by choosing m0 ∈ σ −1 (r), and putting
$$\int d\mu^{r}(m)\, f(m) = \int d\nu^{\tau(m_0)}(x)\, f(x^{-1} m_0). \tag{3.3}$$
Using (2.1), one verifies that this is independent of the choice of m0 (despite the fact that τ (m0 ) is not constant on σ −1 (r)). This definition is evidently possible because in a regular bibundle the G action is principal over σ . The following lemma is similar to Thm. 2.8 in [36], and also appeared in [48] for the locally compact case (this paper was drawn to our attention after the circulation of an earlier draft of this paper as an e-print); our assumptions are weaker, since we do not have an equivalence bibundle but merely a regular one. However, what is really used in [36] is precisely our regularity properties.
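The independence of (3.3) from the choice of $m_0$ can be spelled out as follows. This is only a sketch: it assumes that the invariance property (2.1), which is not reproduced in this section, has the usual form $\int d\nu^{s(y)}(x)\, F(yx) = \int d\nu^{t(y)}(x)\, F(x)$ for a left Haar system, and it uses principality of the $G$ action over $\sigma$ to write any other base point in $\sigma^{-1}(r)$ as $m_0' = y m_0$ with a unique $y \in G_1$, so that $s(y) = \tau(m_0)$ and $t(y) = \tau(m_0')$.

```latex
\begin{align*}
\int d\nu^{\tau(m_0')}(x)\, f(x^{-1} m_0')
  &= \int d\nu^{t(y)}(x)\, f(x^{-1} y\, m_0) \\
  &= \int d\nu^{s(y)}(x')\, f\big((y x')^{-1} y\, m_0\big)
     && \text{(assumed form of (2.1), with } x = y x'\text{)} \\
  &= \int d\nu^{\tau(m_0)}(x')\, f(x'^{-1} m_0).
\end{align*}
```

Both choices of base point therefore give the same value, even though $\tau(m_0') \neq \tau(m_0)$ in general.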
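Lemma 1. Let G M H be a regular bibundle. The formulae
$$\langle\varphi,\psi\rangle : h \mapsto \int d\mu^{t(h)}(m)\, \overline{\varphi(m)}\,\psi(mh); \tag{3.4}$$
$$f\cdot\varphi : m \mapsto \int d\nu^{\tau(m)}(x)\, f(x)\,\varphi(x^{-1}m); \tag{3.5}$$
$$\varphi\cdot g : m \mapsto \int d\lambda^{\sigma(m)}(h)\, g(h^{-1})\,\varphi(mh), \tag{3.6}$$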
where $\varphi, \psi \in C_c^\infty(M)$, $f \in C_c^\infty(G)$, and $g \in C_c^\infty(H)$, define functions in $C_c^\infty(H)$, $C_c^\infty(M)$, and $C_c^\infty(M)$, respectively. This equips $C_c^\infty(M)$ with the structure of a pre Hilbert $C^*$-module over $C_c^\infty(H)$ (seen as a dense subalgebra of $C^*(H)$), on which $C_c^\infty(G)$ (seen as a dense subalgebra of $C^*(G)$) acts nondegenerately by adjointable operators. This structure may be completed to a $C^*(G)$-$C^*(H)$ Hilbert bimodule, which we call $E(M)$.

Proof. It should now be obvious why the right $H$ action on a regular bibundle has to be proper, since this guarantees $C_c^\infty(H)$-valuedness of the inner product (otherwise, one could land in $C^\infty(H)$). The necessary algebraic properties may be checked by elementary computations. The property $\langle\varphi,\psi\rangle^* = \langle\psi,\varphi\rangle$ follows from (3.2), the property $\langle\varphi,\psi\cdot g\rangle = \langle\varphi,\psi\rangle * g$ is an identity, the properties $\langle\varphi, f\cdot\psi\rangle = \langle f^*\cdot\varphi,\psi\rangle$ and $(f_1 * f_2)\cdot\varphi = f_1\cdot(f_2\cdot\varphi)$ require (3.3) and (2.1), and finally $\varphi\cdot(g_1 * g_2) = (\varphi\cdot g_1)\cdot g_2$ follows from (2.1) for $\lambda$. The proof of positivity of $\langle\,\cdot\,,\cdot\,\rangle$ is the same as in [36]; it follows from Prop. 2.10 in [36] and the argument of P. Green (see the remark following Lemma 2 in [11]). This also proves the nondegeneracy of the action of $C_c^\infty(G)$ (and hence of the ensuing action of $C^*(G)$). We cannot use the entire argument in [36] to the effect that everything can be completed, since in [36] one has a $C_c^\infty(G)$-valued inner product as well. However, it is quite trivial to proceed, since by the above results $C_c^\infty(M)$ is a pre Hilbert $C^*$-module over $C_c^\infty(H)$, which can be completed to a Hilbert $C^*$-module $E(M)$ over $C^*(H)$ in the standard way (cf. Ch. 1 in [20] or Cor. IV.2.1.4 in [21]). One then copies the proof in [36] of the property
$$\langle f\cdot\varphi, f\cdot\varphi\rangle \le \|f\|^2\, \langle\varphi,\varphi\rangle, \tag{3.7}$$
where the norm is in $C^*(G)$, to complete the argument.
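As an illustration of the "elementary computations" mentioned in the proof, the symmetry $\langle\varphi,\psi\rangle^* = \langle\psi,\varphi\rangle$ can be checked in a few lines. The sketch below assumes that the involution (2.4) on $C_c^\infty(H)$ is $g^*(h) = \overline{g(h^{-1})}$ and that the inner product (3.4) is conjugate-linear in its first argument; neither convention is restated in this section. Equation (3.2) is applied to the auxiliary function $F(m) = \varphi(m)\,\overline{\psi(mh^{-1})}$ on $\sigma^{-1}(s(h))$.

```latex
\begin{align*}
\langle\varphi,\psi\rangle^*(h)
  &= \overline{\langle\varphi,\psi\rangle(h^{-1})}
   = \int d\mu^{s(h)}(m)\, \varphi(m)\,\overline{\psi(m h^{-1})} \\
  &= \int d\mu^{t(h)}(m)\, \varphi(m h)\,\overline{\psi(m)}
     && \text{(by (3.2) applied to } F\text{)} \\
  &= \langle\psi,\varphi\rangle(h).
\end{align*}
```

The first equality uses $t(h^{-1}) = s(h)$, and the second uses $F(mh) = \varphi(mh)\,\overline{\psi(m)}$.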
Theorem 2. The map $C^*$ : LG → C∗, defined on objects by $C_0^*(G) = C^*(G)$, and on arrows by $C_1^*([G\ M\ H]) = [C^*(G)\ E(M)\ C^*(H)]$, is a functor.

Proof. We begin with the unit arrows. We claim that the construction in Lemma 1 maps the canonical bibundle G G G into the canonical Hilbert bimodule $C^*(G)\ C^*(G)\ C^*(G)$. It is easy to check from (3.4)–(3.6) that $\langle\varphi,\psi\rangle = \varphi^* * \psi$, $f\cdot\varphi = f * \varphi$, and $\varphi\cdot g = \varphi * g$. These properties pass to the completions by continuity. Hence $C^*$ preserves units. Now let H N K be a second regular bibundle, so that one may form the bibundle tensor product $M \circledast_H N$ (cf. (3.1)) and its associated $C^*(G)$-$C^*(K)$ Hilbert
bimodule $E(M \circledast_H N)$. To compare this with the $C^*(G)$-$C^*(K)$ Hilbert bimodule $E(M)\hat\otimes_{C^*(H)} E(N)$, we define a map $\tilde U : C_c^\infty(M) \otimes_{\mathbb{C}} C_c^\infty(N) \to C_c^\infty(M \circledast_H N)$ by
$$\tilde U(\varphi \otimes_{\mathbb{C}} \psi) : [m,n]_H \mapsto \int d\lambda^{\sigma(m)}(h)\, \varphi(mh)\,\psi(h^{-1}n). \tag{3.8}$$
Note that the right-hand side is well defined on $[m,n]_H$ rather than $(m,n)$ because of the invariance property (2.1) for $H$. This map was introduced by Mrčun [34] for smooth étale groupoids; we have merely replaced the counting measure by a general Haar system. We now show that the map $\tilde U$ leaves the kernel of the canonical projection $C_c^\infty(M) \otimes_{\mathbb{C}} C_c^\infty(N) \to E(M)\hat\otimes_{C^*(H)} E(N)$ stable, that $\tilde U$ has dense range, and that accordingly the corresponding quotient map $U$, extended by continuity, defines an isomorphism
$$E(M)\hat\otimes_{C^*(H)} E(N) \simeq E(M \circledast_H N) \tag{3.9}$$
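as $C^*(G)$-$C^*(K)$ Hilbert bimodules. A lengthy but straightforward computation shows that
$$\big\langle \tilde U(\varphi_1 \otimes_{\mathbb{C}} \psi_1),\, \tilde U(\varphi_2 \otimes_{\mathbb{C}} \psi_2)\big\rangle^{E(M \circledast_H N)}_{C^*(K)}$$
is equal to
$$\big\langle \psi_1,\, \langle\varphi_1,\varphi_2\rangle^{E(M)}_{C^*(H)} \cdot \psi_2 \big\rangle^{E(N)}_{C^*(K)},$$
which by definition is equal to
$$\big\langle \varphi_1 \otimes_{C^*(H)} \psi_1,\, \varphi_2 \otimes_{C^*(H)} \psi_2 \big\rangle^{E(M)\hat\otimes_{C^*(H)} E(N)}_{C^*(K)}.$$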
Here $\varphi \otimes_{C^*(H)} \psi$ is the image of $\varphi \otimes_{\mathbb{C}} \psi$ in $E(M)\hat\otimes_{C^*(H)} E(N)$. In view of the definitions of the various Hilbert $C^*$-modules over $C^*(K)$ involved, this computation implies that $\tilde U$ quotients and extends to an isometry $U$ from $E(M)\hat\otimes_{C^*(H)} E(N)$ to $E(M \circledast_H N)$. Moreover, using the fact that $M$ and $N$ are manifolds, it is easily seen that $\tilde U$ has a dense range in $C_c^\infty(M \circledast_H N)$ with respect to the inductive limit topology, so that it certainly has a dense range for the topology induced on $C_c^\infty(M \circledast_H N)$ by the norm on $E(M \circledast_H N)$ as a Hilbert $C^*$-module over $C^*(K)$ (since the latter topology is coarser than the former). Since $C_c^\infty(M \circledast_H N)$ is itself dense in $E(M \circledast_H N)$ in the latter topology, it follows that $\tilde U$ has dense range when seen as a map taking values in $E(M \circledast_H N)$. Hence $U$ is an isometric isomorphism between $E(M)\hat\otimes_{C^*(H)} E(N)$ and $E(M \circledast_H N)$ as Banach spaces. Note that the first claim in this paragraph is not obvious in the general locally compact case; this is one of the reasons why we have restricted ourselves to Lie groupoids in this chapter.

Another elementary computation shows that
$$\tilde U(\varphi \otimes_{\mathbb{C}} (\psi\cdot g)) = \tilde U(\varphi \otimes_{\mathbb{C}} \psi)\cdot g \tag{3.10}$$
for $\varphi \in C_c^\infty(M)$, $\psi \in C_c^\infty(N)$, and $g \in C_c^\infty(K)$. This implies that
$$U(\varphi \otimes_{C^*(H)} (\psi\cdot g)) = U(\varphi \otimes_{C^*(H)} \psi)\cdot g \tag{3.11}$$
for all $\varphi \in E(M)$, $\psi \in E(N)$, and $g \in C^*(K)$. The reason for this is that a continuous $B_0$-linear map between two pre Hilbert $C^*$-modules over a dense subalgebra $B_0$ of $B$ extends to a $B$-linear map between the completions; this easily follows from the bound $\|\psi B\| \le \|B\|\,\|\psi\|$. We conclude that $U$ is a $C^*(K)$-linear isometric isomorphism, and hence by Thm. 3.5 in [20] it is actually unitary (in particular, it now follows that $U$ is adjointable). Finally, analogously to (3.10) one obtains the equality
$$\tilde U(f\cdot(\varphi \otimes_{\mathbb{C}} \psi)) = f\cdot\tilde U(\varphi \otimes_{\mathbb{C}} \psi), \tag{3.12}$$
where $f \in C_c^\infty(G)$. This time, the passage of this property to the pertinent completions is achieved through (3.7), which leads to the bound $\|A\psi\| \le \|A\|\,\|\psi\|$ for any adjointable operator on a (pre) Hilbert $C^*$-module. Thus $U$ is $C^*(G)$-linear as well. This proves (3.9). Hence $C^*$ preserves composition of arrows, and Theorem 2 follows.

Since Morita equivalence of Lie groupoids is isomorphism in LG, and Morita equivalence of $C^*$-algebras is isomorphism in C∗, we recover the known result that the map $G \mapsto C^*(G)$ preserves Morita equivalence [36, 24].

4. Functoriality of $G \mapsto A^*(G)$

The category on which the map $A^*$ is going to be defined is as follows.

Definition 9. The category LGc has s-connected and s-simply connected Lie groupoids as objects, and isomorphism classes of left principal bibundles as arrows. The arrows and units are as in Definition 7.

In contrast with Definition 7, the class of objects is more restricted; this will be necessary for our functor to preserve units. On the other hand, the bibundles need not be right proper.
4.1. The category Poisson of Poisson manifolds and dual pairs. The definition of a suitable category of Poisson manifolds [26] is based on the theory of symplectic groupoids (cf. [6, 50] and refs. therein). The objects in Poisson are defined as follows.

Definition 10. A Poisson manifold $P$ is called integrable when there exists a symplectic groupoid $\Gamma(P)$ over $P$.

This definition is due to [6]. Using Thms. 5.2, 5.3, and A1 in [28] and Prop. 3.3 in [31], it follows that if $P$ is integrable, then there exists an s-connected and s-simply connected symplectic groupoid $\Gamma(P)$ over $P$, which is unique up to isomorphism [26].

The arrows in Poisson will be isomorphism classes of certain dual pairs. Given two Poisson manifolds $P$ and $Q$, a dual pair $Q \leftarrow S \rightarrow P$ consists of a symplectic manifold $S$ and Poisson maps $q : S \to Q$ and $p : S \to P^-$, such that $\{q^* f, p^* g\} = 0$ for all $f \in C^\infty(Q)$ and $g \in C^\infty(P)$ [52, 16]. In a complete dual pair the maps $p$ and $q$ are complete; a Poisson map $J : S \to P$ is called complete when, for every $f \in C^\infty(P)$ with complete Hamiltonian flow, the Hamiltonian flow of $J^* f$ on $S$ is complete as well (that is, defined for all times). Two $Q$-$P$ dual pairs $Q \xleftarrow{q_i} \tilde S_i \xrightarrow{p_i} P$, $i = 1, 2$, are
isomorphic when there is a symplectomorphism $\varphi : \tilde S_1 \to \tilde S_2$ for which $q_2 \varphi = q_1$ and $p_2 \varphi = p_1$.

Based on results in [6, 8, 54], it can be shown that for integrable Poisson manifolds $P$ and $Q$, with associated s-connected and s-simply connected symplectic groupoids $\Gamma(P)$ and $\Gamma(Q)$, there is a natural bijective correspondence between complete dual pairs $Q \leftarrow S \rightarrow P$ and symplectic bibundles $\Gamma(Q)\ S\ \Gamma(P)$. In particular, the canonical symplectic bibundle associated to the dual pair $P \xleftarrow{t} \Gamma(P) \xrightarrow{s} P$ is $\Gamma(P)\ \Gamma(P)\ \Gamma(P)$. Accordingly, we say that a dual pair is regular when it is complete and when the associated symplectic bibundle is left principal (it is not necessary to impose properness of the right $\Gamma(P)$ action).

Let $R$ be a third integrable Poisson manifold, with associated s-connected and s-simply connected symplectic groupoid $\Gamma(R)$, and let $Q \leftarrow S_1 \rightarrow P$ and $P \leftarrow S_2 \rightarrow R$ be regular dual pairs. The embedding $S_1 \times_P S_2 \subset S_1 \times S_2$ is coisotropic [21]; we denote the corresponding symplectic quotient by $S_1 \circledcirc_P S_2$. This is the middle space of a regular dual pair $Q \leftarrow S_1 \circledcirc_P S_2 \rightarrow R$, which we regard as the tensor product of the given dual pairs. An alternative way of defining this tensor product is to construct the groupoid tensor product $\Gamma(Q)\ S_1 \circledast_{\Gamma(P)} S_2\ \Gamma(R)$ of the associated symplectic bibundles [53]. Thus we have
$$S_1 \circledcirc_P S_2 = S_1 \circledast_{\Gamma(P)} S_2 \tag{4.1}$$
as symplectic manifolds, as $\Gamma(Q)$-$\Gamma(R)$ symplectic bibundles, and as $Q$-$R$ dual pairs. In any case, this tensor product is associative up to isomorphism, and the dual pair $P \xleftarrow{t} \Gamma(P) \xrightarrow{s} P$ is a two-sided unit for $\circledcirc_P$, up to isomorphism [26].

Definition 11. The category Poisson has integrable Poisson manifolds as objects, and isomorphism classes of regular dual pairs as arrows. The arrows are composed by the tensor product $\circledcirc$, for which the dual pairs $P \xleftarrow{t} \Gamma(P) \xrightarrow{s} P$ are units. Here $\Gamma(P)$ is "the" s-connected and s-simply connected symplectic groupoid over $P$.

The original reason for the introduction of this category was not so much the subsequent functoriality theorem, but rather the fact that two Poisson manifolds are Morita equivalent in the sense of Xu [54] iff they are isomorphic objects in Poisson [26]. In particular, a Poisson manifold is integrable iff it is Morita equivalent to itself. Moreover, we now have a classical analogue of the categories W∗ and C∗.

4.2. The map $G \mapsto A^*(G)$ as a functor. A Lie groupoid $G$ defines an associated "infinitesimal" object, its Lie algebroid $A(G)$ [37]; see [27, 6, 21] for reviews. The main point is that $A(G)$ is a vector bundle over $G_0$, endowed with an "anchor map" $\alpha : A(G) \to T(G_0)$ and a Lie algebra structure on its space of sections $C^\infty(G_0, A(G))$ that is compatible with the anchor map in a certain way. It is of central importance to us that the dual vector bundle $A^*(G)$ is a Poisson manifold in a canonical way [6, 7, 9], which generalizes the well-known Lie–Poisson structure on the dual of a Lie algebra. We look at the passage $G \mapsto A^*(G)$ as a classical analogue of the map $G \mapsto C^*(G)$.

Another important construction is that of the cotangent bundle $T^*(G)$ of $G$. This is not merely a symplectic space (equipped, in our conventions [21, 24], with minus the usual symplectic form on a cotangent bundle, so that we write $T^*(G)^-$ when this aspect is relevant), but a symplectic groupoid with $T^*(G)_1 = T^*(G_1)$ over $T^*(G)_0 = A^*(G)$
[6] (also see [50] for a review). For simplicity we will write $T^*(G)$ for $T^*(G_1)$, and denote the source and target projections of $T^*(G)$ by $\tilde s$ and $\tilde t$, respectively.

Lemma 2. The s-connected and s-simply connected symplectic groupoid over $A^*(G)$ is $T^*(\tilde G)$, where $\tilde G$ is the s-connected and s-simply connected Lie groupoid with Lie algebroid $A(\tilde G) = A(G)$.

Proof. The existence of $\tilde G$ is guaranteed by Prop. 3.3 in [31]. Since the Poisson structure on $A^*(G)$ is entirely determined by the Lie algebroid structure of $A(G)$, one has $A^*(G) = A^*(\tilde G)$ as Poisson manifolds. It may be checked from its definition that $T^*(G)$ is s-connected and s-simply connected iff $G$ is.

In view of this lemma, we will henceforth assume that all Lie groupoids are s-connected and s-simply connected, and drop the tilde. Thus we have defined the map $A^*$ : LGc → Poisson on objects. In order to define this map on arrows, we recall a number of results from [24], which we here combine into a lemma.

Lemma 3. Any bibundle G M H (cf. Definition 6) defines a symplectic bimodule
$$A^*(G) \xleftarrow{\;J_L^G\;} T^*(M)^- \xrightarrow{\;J_R^H\;} A^*(H), \tag{4.2}$$
with associated symplectic bibundle T ∗ (G)− T ∗ (M)− T ∗ (H )− .
(4.3)
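The explicit form of the "momentum map" $J_R^H$ is
$$\left\langle J_R^H(\theta_m), \frac{dh(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{\sigma(m)} = \left\langle \theta_m, \frac{d\, m h(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{m}, \tag{4.4}$$
where $\theta_m \in T_m^*(M)$, $\sigma(m) = h(0)$, and $h(\lambda) \in t^{-1}(\sigma(m))$, so that $\dot h(0)$ lies in $A_{\sigma(m)}(H)$ and $J_R^H(\theta_m) \in A^*_{\sigma(m)}(H)$. The associated right action of $T^*(H)$ on $T^*(M)$ is given by
$$\left\langle \theta_m \cdot (\alpha_h)^{-1}, \frac{dm(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{mh^{-1}} = \left\langle \theta_m, \frac{d\, m(\lambda)\tilde h(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{m} - \left\langle \alpha_h, \frac{d\tilde h(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{h}, \tag{4.5}$$
where $m(0) = mh^{-1}$, and $\tilde h(\cdot)$ is a curve in $H$ satisfying $\tilde h(0) = h$ and $\sigma(m(\lambda)) = t(\tilde h(\lambda))$. As explained in [24], Eq. (4.5) is independent of the choice of $\tilde h$ because of the compatibility condition $J_R^H(\theta_m) = \tilde s(\alpha_h)$ under which $\theta_m \cdot (\alpha_h)^{-1}$ is defined; cf. Definition 6. Explicitly, this condition reads $\sigma(m) = s(h)$, along with
$$\left\langle \theta_m, \frac{d\, m\chi(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{m} = \left\langle \alpha_h, \frac{d\, h\chi(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{h}, \tag{4.6}$$
for all curves $\chi(\cdot) \in t^{-1}(s(h))$ subject to $\chi(0) = s(h)$. Note that these formulae for right actions are not given in [24], but they may be derived from those for left actions,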
together with the formula $\alpha^{-1} = -I^*(\alpha)$ for the inverse in $T^*(G)$ (where $I : G_1 \to G_1$ is the inverse in $G$) [6]. The explicit form of $J_L$ will shortly be needed not for G M H, but for a second bibundle H N K; hence we state it for the latter. The momentum map $J_L^H : T^*(N) \to A^*(H)$, then, is given by [24]
$$\left\langle J_L^H(\eta_n), \frac{dh(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{\rho(n)} = -\left\langle \eta_n, \frac{d\, h(\lambda)^{-1} n}{d\lambda}\Big|_{\lambda=0} \right\rangle_{n}, \tag{4.7}$$
where $\eta_n \in T_n^*(N)$, $\rho(n) = h(0)$, and $h(\lambda) \in t^{-1}(\rho(n))$; recall that $\rho : N \to H_0$ is the base map of the $H$ action on $N$. The associated left action of $T^*(H)$ on $T^*(N)$ is given by
$$\left\langle \alpha_h \cdot \eta_n, \frac{dn(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{hn} = \left\langle \eta_n, \frac{d\, \hat h(\lambda)^{-1} n(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{n} + \left\langle \alpha_h, \frac{d\hat h(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{h}, \tag{4.8}$$
where $n(0) = hn$, and $\hat h(\cdot)$ is a curve in $H$ satisfying $\hat h(0) = h$ and $\rho(n(\lambda)) = t(\hat h(\lambda))$. The condition under which $\alpha_h \cdot \eta_n$ is defined is $J_L^H(\eta_n) = \tilde s(\alpha_h)$, which reads $\rho(n) = s(h)$, along with
$$-\left\langle \eta_n, \frac{d\, \chi(\lambda)^{-1} n}{d\lambda}\Big|_{\lambda=0} \right\rangle_{n} = \left\langle \alpha_h, \frac{d\, h\chi(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{h}, \tag{4.9}$$
for $\chi$ as specified after (4.6). This completes the exposition of Lemma 3.

Theorem 3. The map $A^*$ : LGc → Poisson, defined on objects by $A_0^*(G) = A^*(G)$ and on arrows by $A_1^*([G\ M\ H]) = [A^*(G) \leftarrow T^*(M)^- \rightarrow A^*(H)]$, is a functor.

Proof. The object map $A_0^*$ is well defined between the given categories by Lemma 2. Turning to the unit arrows, we note that the construction in Lemma 3 maps the canonical bibundle G G G into the symplectic bimodule
$$A^*(G) \xleftarrow{\;\tilde t\;} T^*(G)^- \xrightarrow{\;\tilde s\;} A^*(G).$$
To see this, recall that $\tilde s$ and $\tilde t$ are the source and target maps of the symplectic groupoid $T^*(G)^-$. The claim follows because, as already remarked in [24], $\tilde s$ and $\tilde t$ as defined in [6] coincide with the momentum mappings $J_R^G$ and $J_L^G$ defined by Lemma 3, applied to the canonical bibundle. It is here that the assumption of s-connectedness and s-simply connectedness is essential.

We now turn to the composition of arrows. Let G M H and H N K be regular bibundles, with associated symplectic bimodules $A^*(G) \leftarrow T^*(M)^- \rightarrow A^*(H)$ and $A^*(H) \leftarrow T^*(N)^- \rightarrow A^*(K)$, respectively (cf. Lemma 3). We will prove that the tensor product
$$A^*(G) \leftarrow T^*(M)^- \circledcirc_{A^*(H)} T^*(N)^- \rightarrow A^*(K) \tag{4.10}$$
of these symplectic bimodules is isomorphic to the symplectic bimodule
$$A^*(G) \leftarrow T^*(M \circledast_H N)^- \rightarrow A^*(K) \tag{4.11}$$
associated with the bibundle $G\ \ M \circledast_H N\ \ K$. We omit all suffixes "−" (as in $S^-$), unless strictly necessary. By (4.1) and (3.1) we have
$$T^*(M) \circledcirc_{A^*(H)} T^*(N) = \big(T^*(M) *_{A^*(H)} T^*(N)\big)/T^*(H) \tag{4.12}$$
as $A^*(G)$-$A^*(K)$ symplectic bimodules. By (3.1), one has
$$T^*(M \circledast_H N) = T^*\big((M *_{H_0} N)/H\big), \tag{4.13}$$
so we start by proving that
$$\big(T^*(M) *_{A^*(H)} T^*(N)\big)/T^*(H) \simeq T^*\big((M *_{H_0} N)/H\big) \tag{4.14}$$
as symplectic manifolds. To do so, we first show that
$$T^*\big((M *_{H_0} N)/H\big) \simeq \big(T^*(M) *_{A^*(H)} T^*(N)\big)/\!\sim \tag{4.15}$$
as manifolds, where $\sim$ is an equivalence relation defined as follows. For $(\theta_m, \eta_n) \in T^*(M) *_{A^*(H)} T^*(N)$ (i.e., $\sigma(m) = \rho(n)$ and $J_R^H(\theta_m) = J_L^H(\eta_n)$), $h \in s^{-1}(\sigma(m))$, and $(\theta'_{mh^{-1}}, \eta'_{hn}) \in T^*(M) *_{A^*(H)} T^*(N)$, we say that $(\theta'_{mh^{-1}}, \eta'_{hn}) \sim (\theta_m, \eta_n)$ iff for each pair of vectors $\dot m(0) \in T_{mh^{-1}}(M)$ and $\dot n(0) \in T_{hn}(N)$ such that
$$\sigma_*(\dot m(0)) = \rho_*(\dot n(0)), \tag{4.16}$$
there exists a curve $h(\cdot)$ in $H$ with $h(0) = h$ and $t(h(\lambda)) = \sigma(m(\lambda)) = \rho(n(\lambda))$ (the latter equality may be imposed for convenience because of (4.16)), such that
$$\left\langle \theta'_{mh^{-1}}, \frac{dm(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{mh^{-1}} + \left\langle \eta'_{hn}, \frac{dn(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{hn} = \left\langle \theta_m, \frac{d\, m(\lambda)h(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{m} + \left\langle \eta_n, \frac{d\, h^{-1}(\lambda)n(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{n}. \tag{4.17}$$
To prove (4.15), note that for any (possibly singular) smooth foliation $\Phi$ of a manifold $Q$ with smooth leaf space $Q/\Phi$ one has an isomorphism
$$C^\infty(Q/\Phi, T^*(Q/\Phi)) \simeq C^\infty(Q, T(\Phi)^{00}), \tag{4.18}$$
where the right-hand side consists of all 1-forms $\omega$ on $Q$ that satisfy $i_\xi \omega = 0$ (forming $T(\Phi)^0 \subset T^*(Q)$) and $i_\xi d\omega = 0$ (defining $T(\Phi)^{00}$), for all $\xi \in C^\infty(Q, T(\Phi))$. This is well known for regular foliations (cf. [32]), and the proof is the same in the singular case (it merely depends on the smoothness of the leaf space). These conditions may be rewritten as $i_\xi \omega = L_\xi \omega = 0$ (where $L$ is the Lie derivative), or as $i_\xi \omega = 0$ for all vector fields $\xi$ as above and $\varphi^* \omega = \omega$ for all diffeomorphisms $\varphi$ of $Q$ that are generated by such $\xi$. The isomorphism (4.18) is then given by $\alpha \leftrightarrow \pi^* \alpha$, where $\pi : Q \to Q/\Phi$ is the canonical projection. In addition, one has
$$C^\infty(Q, T(\Phi)^{00}) \simeq C^\infty(Q/\Phi, T(\Phi)^0/\!\sim), \tag{4.19}$$
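where the equivalence relation $\sim$ on $T(\Phi)^0$ is defined by $\beta' \sim \beta$ iff $\beta' = \varphi^* \beta$ for some diffeomorphism $\varphi$ as specified above. The isomorphism (4.19) associates a section $q \mapsto \beta(q)$ with a section $[q] \mapsto [\beta(q)]_\sim$. Hence the ensuing isomorphism
$$C^\infty(Q/\Phi, T^*(Q/\Phi)) \simeq C^\infty(Q/\Phi, T(\Phi)^0/\!\sim) \tag{4.20}$$
is given by $\alpha \leftrightarrow [\pi^* \alpha]_\sim$. We apply this to $Q = M *_{H_0} N$, where $\Phi$ is the foliation by the orbits of the diagonal $H$ action. The condition of lying in $T(\Phi)^0$ then has $T^*(M) *_{A^*(H)} T^*(N)$ as its solution set, and the equivalence relation $\sim$ defined for $\Phi$ is precisely the one imposed by (4.17) and preceding text. This proves (4.15).

Next, we show that the equivalence relation $\sim$ on $T^*(M) *_{A^*(H)} T^*(N)$ coincides with $\approx$, defined as follows. We say that $(\theta'_{mh^{-1}}, \eta'_{hn}) \approx (\theta_m, \eta_n)$ iff there exists $\alpha_h \in T_h^*(H)$ satisfying
$$\tilde s(\alpha_h) = J_R^H(\theta_m) \tag{4.21}$$
(and therefore also $\tilde s(\alpha_h) = J_L^H(\eta_n)$), such that for each pair of vectors $\dot m(0) \in T_{mh^{-1}}(M)$ and $\dot n(0) \in T_{hn}(N)$ (not necessarily satisfying (4.16)), there exist curves $\hat h(\cdot)$ and $\tilde h(\cdot)$ in $H$ subject to $\hat h(0) = \tilde h(0) = h$, $t(\tilde h(\lambda)) = \sigma(m(\lambda))$, $t(\hat h(\lambda)) = \rho(n(\lambda))$, for which one has
$$\begin{aligned}
\left\langle \theta'_{mh^{-1}}, \frac{dm(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{mh^{-1}} + \left\langle \eta'_{hn}, \frac{dn(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{hn}
&= \left\langle \theta_m, \frac{d\, m(\lambda)\tilde h(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{m} + \left\langle \eta_n, \frac{d\, \hat h^{-1}(\lambda)n(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{n} \\
&\quad + \left\langle \alpha_h, \frac{d\hat h(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{h} - \left\langle \alpha_h, \frac{d\tilde h(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle_{h}.
\end{aligned} \tag{4.22}$$
We stress that $\hat h$ and $\tilde h$ do, and $\alpha_h$ does not depend on the vectors $\dot m(0)$ and $\dot n(0)$. The full right-hand side of (4.22) is independent of the choice of $\hat h$ and $\tilde h$; cf. the comment following (4.6).

First, $\approx$ implies $\sim$ (i.e., $A \approx B \Rightarrow A \sim B$), for if (4.16), and hence $\sigma(m(\lambda)) = \rho(n(\lambda))$, holds, one may choose $h = \tilde h = \hat h$, and the final line in (4.22) drops out, implying (4.17). To prove the converse, we note that, since the bibundle G M H is regular, the map $\sigma : M \to H_0$ is a surjective submersion, so that
$$T_m(M) \simeq T_m^\sigma(M) \oplus T_{\sigma(m)}(H_0).$$
Here $T_m^\sigma(M)$ is the kernel of $\sigma_* : T(M) \to T(H_0)$ at $m$. This induces the decomposition
$$T_m(M) \oplus T_n(N) \simeq T^{\sigma=\rho}_{(m,n)}(M \times N) \oplus T_{\sigma(m)}(H_0), \tag{4.23}$$
where $T^{\sigma=\rho}_{(m,n)}(M \times N)$ is the kernel of $\sigma_* - \rho_*$ at $(m,n)$. Explicitly, the decomposition of a given vector according to (4.23) reads $(\xi_1, \xi_2, \zeta) = (\xi_1, \rho_*(\zeta), \zeta) + (0, \xi_2 - \rho_*(\zeta), 0)$,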
where $\xi_1 \in T_m^\sigma(M)$, $\xi_2 \in T_{\sigma(m)}(H_0)$, and $\zeta \in T_n(N)$. Now, in order to verify (4.22) given (4.17), we examine the two possible cases allowed by (4.23). A dimension count shows that one can always choose $\alpha_h$ so as to satisfy (4.22) on $T_{\sigma(m)}(H_0)$. This is because in a Lie groupoid one has [27, 21]
$$T_h(H) \simeq T_h^t(H) \oplus T_{t(h)}(H_0),$$
and the condition (4.21) constrains $\alpha_h$ only on $T_h^t(H)$, leaving its value on $T_{t(h)}(H_0)$ free. On the other hand, if (4.16) holds, so that $(\dot m(0), \dot n(0))$ lies in $T^{\sigma=\rho}_{(m,n)}(M \times N)$, and we assume (4.17), then (4.22) is satisfied for any $\alpha_h$, as one may choose $\tilde h = \hat h = h$. Hence $\sim$ implies $\approx$, and we have shown that these equivalence relations coincide.

Comparing (4.22) with (4.5) and (4.8), and using (4.15), it is clear that (4.14) holds at the manifold level. But it is almost trivial that the identification we have made preserves the symplectic structure, so that (4.14) is valid for symplectic manifolds as well.

Finally, we need to verify that the symplectomorphism (4.14) is compatible with the $A^*(G)$-$A^*(K)$ symplectic bimodule structure that both sides have. This is, indeed, obvious from the explicit structure of the pertinent Poisson maps. For example, denoting the appropriate Poisson map from $T^*(M)^- \circledcirc_{A^*(H)} T^*(N)^-$ to $A^*(G)$ by $\hat J_L^G$, we have $\hat J_L^G([\theta_m, \eta_n]) = J_L^G(\theta_m)$, so that
$$\left\langle \hat J_L^G([\theta_m, \eta_n]), \frac{d\gamma(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle = -\left\langle \theta_m, \frac{d\, \gamma(\lambda)^{-1} m}{d\lambda}\Big|_{\lambda=0} \right\rangle. \tag{4.24}$$
Here $[\theta_m, \eta_n]$ is the equivalence class of $(\theta_m, \eta_n)$ under either the $T^*(H)$ orbits or under the null foliation with respect to the inclusion $T^*(M)^- *_{A^*(H)} T^*(N)^- \hookrightarrow T^*(M)^- \times T^*(N)^-$; these coincide by (4.12). On the other hand, $\tilde J_L^G : T^*(M \circledast_H N)^- \to A^*(G)$ is given by
$$\left\langle \tilde J_L^G(F_{[m,n]_H}), \frac{d\gamma(\lambda)}{d\lambda}\Big|_{\lambda=0} \right\rangle = -\left\langle F_{[m,n]_H}, \frac{d\, [\gamma(\lambda)^{-1} m, n]_H}{d\lambda}\Big|_{\lambda=0} \right\rangle. \tag{4.25}$$
It is trivial from the explicit form of the isomorphism (4.14) described above that (4.24) is duly transferred to (4.25). This completes the proof of the isomorphism between (4.10) and (4.11), and therefore of Theorem 3.

Since Morita equivalence of s-connected and s-simply connected Lie groupoids is isomorphism in LGc, and Morita equivalence of Poisson manifolds is isomorphism in Poisson, we recover the known result [24] that the map $G \mapsto A^*(G)$ preserves Morita equivalence.

References
1. Baillet, M., Denizeau, Y., Havet, J.-F.: Indice d’une espérance conditionnelle. Compositio Math. 66, 199–236 (1988) 2. Connes, A.: Sur la théorie non commutative de l’intégration. Lecture Notes in Math. 725, 1979, pp. 19–143 3. Connes, A.: A survey of foliations and operator algebras. Proc. Sympos. Pure Math. 38, 521–628 (1982) 4. Connes, A.: Noncommutative Geometry. San Diego: Academic Press, 1994 5. Connes, A., Takesaki, M.: The flow of weights on factors of type III. Tohoku Math. J. 29, 473–575 (1977); Err. ibid. 30, 653–655 (1978)
6. Coste, A., Dazord, P., Weinstein, A.: Groupoides symplectiques. Publ. Dépt. Math., Univ. C. Bernard, Lyon I 2A, 1–62 (1987) 7. Courant, T.J.: Dirac manifolds. Trans. Am. Math. Soc. 319, 631–661 (1990) 8. Dazord, P.: Groupoïdes symplectiques et troisième théorème de Lie “non linéaire”. Lecture Notes in Math. 1416, Berlin–Heidelberg–New York: Springer, 1990, pp. 39–74 9. Dazord, P., Sondaz, D.: Variétés de Poisson – algébroïdes de Lie. Séminaire Sud-Rhodanien, Univ. Claude Bernard, Lyon, 1–66 (1988) 10. Feldman, J., Hahn, P., Moore, C.C.: Orbit structure and countable sections for actions of continuous groups. Adv. Math. 28, 186–230 (1978) 11. Green, P.: The local structure of twisted covariance algebras. Acta Math. 140, 191–250 (1978) 12. Haefliger, A.: Groupoïdes d’holonomie et classifiants. Astérisque 116, 70–97 (1984) 13. Hahn, P.: Haar measure for measure groupoids. Trans. Am. Math. Soc. 242, 1–33 (1978) 14. Hahn, P.: The regular representations of measure groupoids. Trans. Am. Math. Soc. 242, 35–72 (1978) 15. Hilsum, M., Skandalis, G.: Morphismes K-orientés d’espaces de feuilles et fonctorialitéé en théorie de Kasparov (d’aprés une conjecture d’A. Connes). Ann. Sci. ÉÉcole Norm. Sup. (4) 20, 325–390 (1987) 16. Karasev, M.V.: The Maslov quantization conditions in higher cohomology and analogs of notions developed in Lie theory for canonical fibre bundles of symplectic manifolds. I, II. Selecta Math. Soviet. 8, 213–234, 235–258 (1989) 17. Kontsevich, M.: Formal (non)commutative symplectic geometry. In: The Gelfand Mathematical Seminars, 1990–1992, Boston: Birkhäuser, 1993, pp. 173–187 18. Krieger, W.: On ergodic flows and the isomorphism of factors. Math. Ann. 223, 19–70 (1976) 19. Krishnaprasad, P.S., Marsden, J.E.: Hamiltonian structure and stability for rigid bodies with flexible attachments. Arch. Rat. Mech. An. 98, 137–158 (1987) 20. Lance, E.C.: Hilbert C ∗ -Modules. A Toolkit for Operator Algebraists. Cambridge: Cambridge University Press, 1995 21. 
Landsman, N.P.: Mathematical Topics Between Classical and Quantum Mechanics. New York: Springer, 1998 22. Landsman, N.P.: Lie groupoid C ∗ -algebras and Weyl quantization. Commun. Math. Phys. 206, 367–381 (1999); E-print math-ph/9807028 23. Landsman, N.P., Ramazan, B.: Quantization of Poisson algebras associated to Lie algebroids. In: Kaminker, J., Ramsay, A., Renault, J., Weinstein, A. (eds.), Proc. Conf. on Groupoids in Physics, Analysis and Geometry. Contemporary Mathematics, to appear; E-print math-ph/0001005 24. Landsman, N.P.: The Muhly–Renault–Williams theorem for Lie groupoids and its classical counterpart. Lett. Math. Phys. 54, 43–59 (2001); E-print math-ph/0008005 25. Landsman, N.P.: Bicategories of operator algebras and Poisson manifolds. In: Longo, R. (ed.) Mathematical Physics in Mathematics and Physics. Quantum and Operator Algebraic Aspects. Fields Inst. Comm., to appear (2001); E-print math-ph/0008003 26. Landsman, N.P.: Quantized reduction as a tensor product. In: Landsman, N.P., Pflaum, M., Schlichenmaier, M. (eds.), Quantization of Singular Symplectic Quotients, Basel: Birkhäuser, 2001; E-print mathph/0008004 27. Mackenzie, K.C.H.: Lie Groupoids and Lie Algebroids in Differential Geometry. Cambridge: Cambridge University Press, 1987 28. Mackenzie, K.C.H., Xu, P.: Integration of Lie bialgebroids. Topology 39, 445–476 (2000) 29. Mackey, G.W.: Ergodic theory and virtual groups. Math. Ann. 166, 187–207 (1966) 30. Moerdijk, I.: Toposes and groupoids. Lecture Notes in Math. 1348, Berlin–Heidelberg–New–York: Springer, 1988, pp. 280–298 31. Moerdijk, I., Mrˇcun, J.: On integrability of infinitesimal actions. Eprint math.DG/0006042 32. Molino, P.: Riemannian Foliations. Basel: Birkhäuser, 1988 33. Moore, C.C., Schochet, C.: Global Analysis on Foliated Spaces. New York: Springer, 1988 34. Mrˇcun, J.: Stability and Invariants of Hilsum–Skandalis Maps. Ph.D. Thesis, Univ. of Utrecht (1996) 35. 
Mrˇcun, J.: Functoriality of the bimodule associated to a Hilsum–Skandalis map. K-Theory 18, 235–253 (1999) 36. Muhly, P., Renault, J., Williams, D.: Equivalence and isomorphism for groupoid C ∗ -algebras. J. Operator Th. 17, 3–22 (1987) 37. Pradines, J.: Théorie de Lie pour les groupoïdes différentiables. Relations entre propriétés locales et globales. C. R. Acad. Sc. Paris A263, 907–910 (1966) 38. Ramsay, A.: Virtual groups and group actions. Adv. Math. 6, 253–322 (1971) 39. Ramsay, A.: Topologies on measured groupoids. J. Funct. Anal. 47, 314–343 (1982) 40. Renault, J.: A Groupoid Approach to C ∗ -algebras. Lecture Notes in Math. 793. Berlin: Springer, 1980 41. Renault, J.: Représentation des produits croisés d’algèbres de groupoïdes. J. Operator Theory 18, 67–97 (1987) 42. Rieffel, M.A.: Morita equivalence for C ∗ -algebras and W ∗ -algebras. J. Pure Appl. Alg. 5, 51–96 (1974)
43. Samuélidès, M.: Mesures de Haar et W ∗ -couple d’un groupoïde mesuré. Bull. Soc. Math. France 106, 261–278 (1978) 44. Sauvageot, J.-L.: Sur le produit tensoriel relatif d’espaces de Hilbert. J. Operator Theory 9, 237–252 (1983) 45. Sauvageot, J.-L.: Image d’un homomorphisme et flot des poids d’une relation d’équivalence mesurée. Math. Scand. 42, 71–100 (1978) 46. Schweizer, J.: Crossed products by equivalence bimodules. Univ. Tübingen preprint (1999) 47. Stachura, P.: Differential groupoids and C ∗ -algebras. E-print math.QA/9905097 48. Stadler, M.O., O’Uchi, M.: Correspondence of groupoid C ∗ -algebras. J. Operator Th. 42, 103–119 (1999) 49. Sutherland, C.E., Takesaki, M.: Right inverse of the module of approximately finite-dimensional factors of type III and approximately finite ergodic principal measured groupoids. Fields Inst. Commun. 20, 149–159 (1998) 50. Vaisman, I.: Lectures on the Geometry of Poisson Manifolds. Basel: Birkhäuser, 1994 51. Wassermann, A.: Operator algebras and conformal field theory. III. Fusion of positive energy representations of LSU(N ) using bounded operators. Invent. Math. 133, 467–538 (1998) 52. Weinstein, A.: The local structure of Poisson manifolds. J. Diff. Geom. 18, 523–557 (1983); Err. ibid. 22, 255 (1985) 53. Xu, P.: Morita equivalent symplectic groupoids. In: Dazord, P., Weinstein, A. (eds.), Symplectic Geometry, Groupoids, and Integrable Systems, New York: Springer, 1991, pp. 291–311 54. Xu, P.: Morita equivalence of Poisson manifolds. Commun. Math. Phys. 142, 493–509 (1991) Communicated by A. Connes
Commun. Math. Phys. 222, 117 – 146 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Well Posedness for Pressureless Flow
Feimin Huang1,2, Zhen Wang3
1 S.I.S.S.A., Via Beirut 2–4, 34010 Trieste, Italy
2 Institute of Applied Mathematics, Academia Sinica, Beijing 100080, P.R. China
3 Department of Mathematics, City University of Hong Kong, Hong Kong, P.R. China
Received: 26 September 2000 / Accepted: 25 April 2001
Abstract: We study the uniqueness problem for pressureless gases. Previous results on this topic are only known for the case when the initial data is assumed to be a bounded measurable function. This assumption is unnatural because the solution is in general a Radon measure. In this paper, the uniqueness of the weak solution is proved for the case when the initial data is a Radon measure. We show that, besides the Oleinik entropy condition, it is also important to require the energy to be weakly continuous initially. Our uniqueness result is obtained in the same functional space as the existence theorem.

1. Introduction

The one-dimensional pressureless gas equations read
$$\rho_t + (\rho u)_x = 0, \qquad (\rho u)_t + (\rho u^2)_x = 0, \tag{1.1}$$
where $\rho \ge 0$ denotes the density of mass and $u$ the velocity. Recently, the problem of pressureless gas dynamics has attracted a great deal of attention (see [1, 3–5, 10, 12, 16, 19, 21, 22]). Physically, this problem is studied as the sticky particle model in many different areas of physics (see [23]). Mathematically, the solutions to these equations are naturally measures, and this poses new challenges to the analysis of the solutions. The existence of the global weak solution of (1.1) was first obtained independently by Brenier and Grenier [5] and by E, Rykov and Sinai [10]. An explicit formula for the weak solution was obtained in [10] by Generalized Variational Principles (GVP). Wang, Huang and Ding [21] extended their results to the general case when $u_0$ is not continuous. Similar results were also obtained by Chen, Li, Zhang [16]. Boudin [4] showed that the weak solution could also be obtained as a limit of solutions to a viscous pressureless model.

Supported partially by NSFC # 19801039.
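The sticky particle picture mentioned above can be made concrete for finitely many point masses. The following is a minimal sketch, not the construction used in the papers cited: the names `Particle`, `evolve`, `momentum` and `energy` are hypothetical, and the merge rule (colliding particles stick together, conserving mass and momentum) is the usual informal description of sticky particles. It illustrates that momentum is conserved while kinetic energy can only decrease, and that no energy is lost before the first collision.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    x: float  # position
    m: float  # mass (assumed positive)
    u: float  # velocity


def momentum(ps):
    return sum(p.m * p.u for p in ps)


def energy(ps):
    return sum(0.5 * p.m * p.u ** 2 for p in ps)


def next_collision(ps):
    """Earliest collision among neighbours: returns (dt, i) or (None, None)."""
    best_dt, best_i = None, None
    for i in range(len(ps) - 1):
        a, b = ps[i], ps[i + 1]
        closing = a.u - b.u  # positive when a catches up with b
        if closing > 0:
            dt = max(0.0, (b.x - a.x) / closing)
            if best_dt is None or dt < best_dt:
                best_dt, best_i = dt, i
    return best_dt, best_i


def evolve(particles, t_end):
    """Free transport up to time t_end, merging particles when they collide."""
    ps = sorted((Particle(p.x, p.m, p.u) for p in particles), key=lambda p: p.x)
    t = 0.0
    while True:
        dt, i = next_collision(ps)
        if dt is None or t + dt > t_end:
            for p in ps:  # no further collision before t_end
                p.x += p.u * (t_end - t)
            return ps
        for p in ps:
            p.x += p.u * dt
        t += dt
        a, b = ps[i], ps[i + 1]
        mass = a.m + b.m
        # Sticky collision: masses add, velocity is fixed by momentum conservation.
        ps[i:i + 2] = [Particle(a.x, mass, (a.m * a.u + b.m * b.u) / mass)]


initial = [Particle(0.0, 1.0, 1.0), Particle(1.0, 1.0, 0.0), Particle(3.0, 2.0, -1.0)]
early = evolve(initial, 0.5)  # before the first collision, which occurs at t = 1
late = evolve(initial, 3.0)   # after all three particles have merged
```

With this initial data the first collision occurs at t = 1, so `early` still consists of three particles with the initial kinetic energy, whereas `late` is a single particle of mass 4 carrying the same total momentum but less energy.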
F. Huang, Z. Wang
After the existence of the weak solution is obtained, one naturally wants to know whether the weak solution is unique. It is well known that, in general, weak solutions of hyperbolic systems of conservation laws are not unique without an entropy condition. The question is which entropy condition is suitable for pressureless gases. The authors of [1, 5, 10] found that the Lax entropy condition was insufficient to guarantee the uniqueness for (1.1). In [10], E, Rykov and Sinai pointed out further that the Oleinik entropy condition might be necessary. Along this direction, Wang and Ding [22] proved the uniqueness of the weak solution for the case when the initial data ρ0 is a bounded measurable function. Similar results were also obtained by Bouchut and James [3]. Their results showed that the conjecture of [10] was true, i.e. the Oleinik entropy condition was suitable for pressureless gases. However, the uniqueness results of [3, 22] are far from satisfactory because the solution of (1.1) is in general a Radon measure.

In the present paper, we establish the uniqueness of the weak solution to (1.1) when the initial data is a Radon measure. We show that, besides the Oleinik entropy condition, it is also important to require the energy to be weakly continuous initially, i.e. even though the system (1.1) loses energy as time goes on, it does not do so right away. This condition is called the energy condition. A weak solution of (1.1) is called an entropy solution if it satisfies the Oleinik entropy condition and the energy condition (see Definition 2 below for details). Without the energy condition, the weak solution satisfying the Oleinik entropy condition is not unique (see Sect. 4). This indicates that the energy condition is necessary and sufficient for the uniqueness. Our proofs are based on the idea that any entropy solution should coincide with an entropy solution generated by the generalized potential
\[
F(y; x, t) = \int_{0+0}^{y-0} \bigl(t u_0(\eta) + \eta - x\bigr)\, dm_0(\eta), \tag{1.2}
\]
with m0 (x) = ρ0 ([0, x[). This naturally leads to the uniqueness of the entropy solution. It is noted that F (y; x, t) only depends on the initial data. Obviously, our idea includes two parts: Part I constructing an entropy solution (ρs , us ) by F (y; x, t), Part II showing that any entropy solution (ρ, u) coincides with (ρs , us ). For Part I, we give an explicit formula of entropy solution. We note that our constructive procedure is essentially equivalent to that of [10] by GVP. We thank the anonymous referee for reminding us of the relation between the two kinds of constructive procedures. For Part II, we first establish a uniqueness theorem for a linear equation with discontinuous coefficients (see Sect. 3). By the new uniqueness theorem for the linear equation, we prove the uniqueness of entropy solution of (1.1) in the region t ≥ t1 > 0. Namely, any entropy solution in the region t ≥ t1 (denoted by (ρt1 , ut1 )) can be expressed explicitly by its value at t = t1 . This is possible because (1.1) has only one family of characteristics. Then, we study the convergence of the sequence (ρt1 , ut1 ) as t1 → 0. We show that the limit of (ρt1 , ut1 ) coincides with (ρs , us ). This leads to the uniqueness of the entropy solution of (1.1) for general initial data. It is worthwhile to point out that our uniqueness theorem is obtained in the same functional space as the existence theorem. We note here that the usual initial condition (i.e. the weak continuity of the mass ρ, the momentum ρu at the initial time), together with the Oleinik entropy condition, is not sufficient to guarantee the uniqueness of the limits as t1 → 0. Before recalling the definition of the weak solution introduced by Wang, Huang and Ding (see [21]), we introduce some notations first.
If ρ, u are bounded measurable functions, then
\[
m(x, t) = \int_{(0,0)}^{(x,t)} \rho\, dx - \rho u\, dt
\]
is independent of the integral path and satisfies $m_x = \rho$ due to the fact that the first equation of (1.1) is conserved. Therefore the system (1.1) becomes, by the new variable m(x, t),
\[
m_t + u m_x = 0, \qquad (m_x u)_t + (m_x u^2)_x = 0. \tag{1.3}
\]
We focus our attention on the system (1.3) instead of (1.1).

Definition 1. Let m(x, t) be of bounded variation locally in x and u(x, t) be bounded and measurable to $m_x$. Assume that the measures $m_x$ and $u m_x$ are weakly continuous in t. $(\rho, u) = (m_x, u)$ is called a weak solution of (1.1), or (m, u) is called a weak solution of (1.3), if
\[
\iint \varphi_t\, m\, dx\, dt - \iint \varphi u\, dm\, dt = 0, \qquad
\iint \bigl(\psi_t u + \psi_x u^2\bigr)\, dm\, dt = 0 \tag{1.4}
\]
holds for all $\varphi, \psi \in C_0^\infty(R_+^2)$. Here $\iint \cdots\, dm\, dt$ denotes the Lebesgue–Stieltjes integral.
The initial value is understood in the following sense: as t → +0, the measures ρ and ρu weakly converge to $\rho_0$ and $\rho_0 u_0$ respectively, where $u_0$ is bounded and measurable to $\rho_0$.

Definition 2. Let (ρ, u) be a weak solution of (1.1). (ρ, u) is called an entropy solution of (1.1) if
\[
\frac{u(x_2, t) - u(x_1, t)}{x_2 - x_1} \le \frac{1}{t} \tag{1.5}
\]
holds for any $x_1 < x_2$ and almost every t > 0, and the measure $\rho u^2$ weakly converges to $\rho_0 u_0^2$ as t → 0.

Our main results are the following.

Theorem 1 (Existence theorem). Let $0 \le \rho_0 \in M_{loc}(R)$ (or $m_0(x) = \rho_0([0, x[)$ be increasing) and $u_0$ be bounded and measurable to $\rho_0$. Then the system (1.1) admits at least one entropy solution.

Theorem 2 (Uniqueness theorem). Let $0 \le \rho_0 \in M_{loc}(R)$ (or $m_0(x) = \rho_0([0, x[)$ be increasing) and $u_0$ be bounded and measurable to $\rho_0$. Let $(m_1, u_1)$ and $(m_2, u_2)$ be two entropy solutions of (1.1) with the same initial data $\rho_0$, $u_0$ in the above sense. Then
\[
m_1 = m_2 \quad \text{a.e.} \tag{1.6}
\]
and
\[
u_1 = u_2 \quad \text{a.e.} \tag{1.7}
\]
with respect to the measure $m_{1x} = m_{2x}$.

An outline of this paper is as follows. In Sect. 2, we construct the entropy solution and study its structure in detail. In Sect. 3, we prove a new uniqueness theorem for the linear equation with a discontinuous coefficient. In Sect. 4, we study the general uniqueness for pressureless gases.
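The sticky particle picture mentioned in the Introduction can be illustrated for finitely many point masses. The following is a minimal sketch only, not the construction used in this paper: the function names `sticky_particles`, `momentum` and `energy` are hypothetical, and the merging rule (colliding particles stick and keep their total momentum) is the standard sticky particle rule assumed here. It shows that momentum is conserved while kinetic energy can only decrease.

```python
def sticky_particles(particles, t_end):
    """Evolve finitely many sticky particles up to time t_end.

    particles: list of (position, velocity, mass) tuples (illustrative data).
    Returns the clusters at time t_end, sorted by position.
    """
    ps = sorted(particles)
    t = 0.0
    while True:
        # Earliest collision between neighbours that approach each other.
        best_dt, best_i = None, None
        for i in range(len(ps) - 1):
            (x1, u1, _), (x2, u2, _) = ps[i], ps[i + 1]
            if u1 > u2:
                dt = max((x2 - x1) / (u1 - u2), 0.0)
                if best_dt is None or dt < best_dt:
                    best_dt, best_i = dt, i
        if best_dt is None or t + best_dt > t_end:
            # No further collision before t_end: free transport.
            return [(x + u * (t_end - t), u, w) for x, u, w in ps]
        # Move every particle to the collision time, then merge the pair
        # into one cluster carrying the total mass and total momentum.
        ps = [(x + u * best_dt, u, w) for x, u, w in ps]
        t += best_dt
        (x1, u1, w1), (_, u2, w2) = ps[best_i], ps[best_i + 1]
        w = w1 + w2
        ps[best_i:best_i + 2] = [(x1, (w1 * u1 + w2 * u2) / w, w)]


def momentum(ps):
    return sum(w * u for _, u, w in ps)


def energy(ps):
    return sum(0.5 * w * u * u for _, u, w in ps)


# Two unit masses moving towards each other; they meet at time 0.5.
initial = [(1.0, 1.0, 1.0), (2.0, -1.0, 1.0)]
final = sticky_particles(initial, 1.0)
print(final, momentum(final), energy(initial), energy(final))
```

In this example the two particles merge into a single cluster at rest, so the whole kinetic energy is lost at the collision time and none before it, which is the behaviour the energy condition of Definition 2 refers to at t = 0.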
2. Existence of Entropy Solution

In this section, we use the generalized potential introduced in [21] to construct an entropy solution. As in [21], let
\[
F(y; x, t) = \int_{0+0}^{y-0} \bigl(t u_0(\eta) + \eta - x\bigr)\, dm_0(\eta), \tag{2.1}
\]
where $m_0(x) = \rho_0([0, x[)$ is an increasing function. Since the case $\rho_0 = 0$ is trivial, we assume that there always exist a < b such that $\rho_0([a, b[) > 0$, i.e. the support of $\rho_0$ is not empty. Before constructing the entropy solution, we give some lemmas first.

Lemma 2.1. For any point (x, t), as a function of y, F(y; x, t) has a finite lower bound.

Proof. For any $y_1 < y_2 \le x - Mt$, or $y_1 > y_2 \ge x + Mt$, we have
\[
F(y_1; x, t) - F(y_2; x, t) = t \int_{y_2-0}^{y_1-0} \Bigl(u_0(\eta) - \frac{x - \eta}{t}\Bigr)\, dm_0(\eta) \ge 0,
\]
where $M = \|u_0\|_{L^\infty}$. This implies the lemma.

Let
\[
\nu(x, t) = \min_y F(y; x, t), \qquad
S(x, t) = \{y;\ \exists\, y_n \to y \text{ s.t. } F(y_n; x, t) \to \nu(x, t)\}. \tag{2.2}
\]
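For a finite sum of point masses the potential F(·; x, t) is a left-continuous step function, so its minimum is the smallest partial sum of the terms (t u0(η) + η − x) ρ0({η}). The following sketch evaluates ν(x, t) in that special case. It is only an illustration under stated assumptions: the names `potential_minimum` and `mass_left` are hypothetical, all masses are assumed to sit at positions η > 0 so that the lower limit 0 + 0 plays no role, and ties are resolved towards the leftmost minimizer.

```python
def potential_minimum(masses, x, t):
    """Minimise F(.; x, t) for a finite sum of point masses.

    masses: list of (eta, u0, w), i.e. mass w at position eta > 0 with
    initial velocity u0 (illustrative data, not from the paper).
    Returns (nu, k): the minimum value and the smallest number of leading
    masses whose partial sum attains it.
    """
    best_nu, best_k = 0.0, 0      # empty prefix: F = 0 to the left of all masses
    partial = 0.0
    for k, (eta, u0, w) in enumerate(sorted(masses), start=1):
        partial += (t * u0 + eta - x) * w
        if partial < best_nu:
            best_nu, best_k = partial, k
    return best_nu, best_k


def mass_left(masses, x, t):
    """Total mass of the minimising prefix."""
    _, k = potential_minimum(masses, x, t)
    return sum(w for _, _, w in sorted(masses)[:k])


# Two unit masses at 1 and 2 with velocities +1 and -1.
data = [(1.0, 1.0, 1.0), (2.0, -1.0, 1.0)]
nu, k = potential_minimum(data, 1.6, 1.0)
print(nu, k, mass_left(data, 1.4, 1.0), mass_left(data, 1.5, 0.25))
```

For this data the minimising prefix at time t = 1 is empty for x < 1.5 and contains both masses for x > 1.5, while at t = 0.25 and x = 1.5 it contains only the first mass. The mass of the minimising prefix is the quantity that Section 2 later uses to build m(x, t).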
Since F(y; x, t) is left continuous (not continuous) with respect to y, for any $y_0 \in S(x, t)$ the following holds:
\[
\nu(x, t) =
\begin{cases}
F(y_0; x, t), & \text{if } F(y; x, t) \text{ achieves its minimum at } y_0,\\
F(y_0 + 0; x, t), & \text{otherwise.}
\end{cases} \tag{2.3}
\]
Thus, we have

Lemma 2.2. Assume that $y_0 \in S(x, t)$ and $[m_0(y_0)] = m_0(y_0 + 0) - m_0(y_0 - 0) > 0$, then
\[
\begin{aligned}
\nu(x, t) &= \min_y F(y; x, t) = F(y_0; x, t), && \text{if } x \le y_0 + t u_0(y_0),\\
\nu(x, t) &= \min_y F(y; x, t) = F(y_0 + 0; x, t), && \text{if } x > y_0 + t u_0(y_0).
\end{aligned} \tag{2.4}
\]
Proof. Since u0 is measurable to ρ0 , u0 must be well defined at the point y0 , where ρ0 has a positive mass. If x ≤ y0 + tu0 (y0 ), we compute F (y0 + 0; x, t) − F (y0 − 0; x, t) = (tu0 (y0 ) + y0 − x)[m0 (y0 )] ≥ 0. This indicates that F (y; x, t) achieves its minimum at y0 due to the left continuity of F (y; x, t). In the same fashion, the second term of (2.4) also holds. Lemma 2.3. Let (xn , tn ) and yn ∈ S(xn , tn ) converge to (x, t) and y0 , respectively, then y0 ∈ S(x, t).
Proof. First we show that ν(x, t) is a continuous function of x and t. Since F(y; x, t) is continuous in x, t and $\nu(x_n, t_n) = F(y_n; x_n, t_n)$ or $F(y_n + 0; x_n, t_n)$, it is easy to see that $\nu(x, t) \le \liminf_{n\to\infty} \nu(x_n, t_n)$. On the other hand, we compute
\[
F(y; x, t) \ge \nu(x_n, t_n) + F(y; x, t) - F(y; x_n, t_n).
\]
Letting n → ∞ gives $\nu(x, t) \ge \limsup_{n\to\infty} \nu(x_n, t_n)$. This implies
\[
\nu(x, t) = \lim_{n\to\infty} \nu(x_n, t_n).
\]
Now we choose a subsequence of $y_n$ (still denoted by $y_n$) such that $\nu(x_n, t_n) = F(y_n; x_n, t_n)$ (or $\nu(x_n, t_n) = F(y_n + 0; x_n, t_n)$) always holds. Without loss of generality, we assume that $\nu(x_n, t_n) = F(y_n; x_n, t_n)$. Due to $|F(y_n; x, t) - F(y_n; x_n, t_n)| = o(1)$, we have $F(y_n; x, t) \to \nu(x, t)$ as n → ∞. Thus, the definition of S(x, t) gives $y_0 \in S(x, t)$.

Let
\[
y_m(x, t) = \inf_{y \in S(x,t)} y, \qquad y^m(x, t) = \sup_{y \in S(x,t)} y.
\]
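From Lemma 2.1, $\inf\{y;\ y \in \mathrm{spt}\{\rho_0\}\}$ is finite if $y_m(x, t) = -\infty$. In the same way, $\sup\{y;\ y \in \mathrm{spt}\{\rho_0\}\}$ is also finite if $y^m(x, t) = +\infty$. Let
\[
y_M(x, t) =
\begin{cases}
y', & \text{if } (x, t) \in A(x, t) \text{ and } y' \in \mathrm{spt}\{\rho_0\},\\
y_m, & \text{otherwise,}
\end{cases} \tag{2.5}
\]
\[
y^M(x, t) =
\begin{cases}
y', & \text{if } (x, t) \in B(x, t) \text{ and } y' \in \mathrm{spt}\{\rho_0\},\\
y^m, & \text{otherwise,}
\end{cases} \tag{2.6}
\]
where
\[
\begin{aligned}
A(x, t) &= \{(x, t);\ \exists\, y' > y_m \text{ s.t. } \rho_0 = 0 \text{ on } (y_m, y') \text{ and } \forall y \in (y_m, y'),\ y \in S(x, t)\},\\
B(x, t) &= \{(x, t);\ \exists\, y' < y^m \text{ s.t. } \rho_0 = 0 \text{ on } (y', y^m) \text{ and } \forall y \in (y', y^m),\ y \in S(x, t)\}.
\end{aligned}
\]
Let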
\[
y_*(x, t) = \min\{y_M(x, t),\ y^M(x, t)\}, \qquad y^*(x, t) = y^M(x, t). \tag{2.7}
\]
It is easy to see $y_*(x, t),\ y^*(x, t) \in S(x, t) \cap \mathrm{spt}\{\rho_0\}$. Analogous to [21], for each point $(x_0, t_0)$, we introduce the left and right backward generalized characteristics $L_1$, $L_2$ as follows:
\[
L_1:\quad x = x_0 + \frac{x_0 - y_*(x_0, t_0)}{t_0}(t - t_0) \quad \text{for } t < t_0, \tag{2.8}
\]
\[
L_2:\quad x = x_0 + \frac{x_0 - y^*(x_0, t_0)}{t_0}(t - t_0) \quad \text{for } t < t_0. \tag{2.9}
\]
We claim that there is only one minimum point of F (y; x, t) for each (x, t) along the backward lines L1 , L2 .
Lemma 2.4. For any $y_0 \in S(x_0, t_0)$, $y_*(x, t) = y^*(x, t)$ holds along the line
\[
L:\quad x = x_0 + \frac{x_0 - y_0}{t_0}(t - t_0), \qquad t < t_0.
\]
Furthermore, $y_*(x, t) = y^*(x, t) \le y_*(x_0, t_0)$ along the line $L_1$ and $y_*(x, t) = y^*(x, t) = y^*(x_0, t_0)$ along the line $L_2$.

Proof. Let $y_0 \in S(x_0, t_0)$. Without loss of generality, we assume that $\nu(x_0, t_0) = F(y_0; x_0, t_0)$. On the line L, for any $y \ne y_0$, we compute
\[
\begin{aligned}
F(y; x, t) - F(y_0; x, t)
&= \int_{y_0-0}^{y-0} \bigl(t u_0(\eta) + \eta - x\bigr)\, dm_0(\eta)\\
&= \frac{t}{t_0} \int_{y_0-0}^{y-0} \bigl(t_0 u_0(\eta) + \eta - x_0\bigr)\, dm_0(\eta)
 + \frac{t_0 - t}{t_0} \int_{y_0-0}^{y-0} (\eta - y_0)\, dm_0(\eta)\\
&= \frac{t}{t_0} \bigl(F(y; x_0, t_0) - \nu(x_0, t_0)\bigr)
 + \frac{t_0 - t}{t_0} \int_{y_0-0}^{y-0} (\eta - y_0)\, dm_0(\eta) \ \ge\ 0,
\end{aligned} \tag{2.10}
\]
which infers $y_0 \in S(x, t)$ and $y_*(x, t) = y^*(x, t)$ along the line L. Now we prove Lemma 2.4 for $L_1$, i.e. $y_0 = y_*(x_0, t_0)$. There are four subcases.
(1) $\exists\, y_1 < y_0 < y_2$ s.t. $\rho_0 = 0$ on $(y_1, y_0) \cup (y_0, y_2)$;
(2) $\exists\, y_1 < y_0$ s.t. $\rho_0 = 0$ on $(y_1, y_0)$, and for any $y_2 > y_0$, $\int_{y_0+0}^{y_2-0} dm_0 > 0$;
(3) $\exists\, y_2 > y_0$ s.t. $\rho_0 = 0$ on $(y_0, y_2)$, and for any $y_1 < y_0$, $\int_{y_1+0}^{y_0-0} dm_0 > 0$;
(4) For any $y_1 < y_0 < y_2$, the integrals $\int_{y_1+0}^{y_0-0} dm_0$ and $\int_{y_0+0}^{y_2-0} dm_0$ are positive.
It is easy to see $y_*(x, t) = y^*(x, t) = y_0$ for the subcases (1), (3) and (4) and $y_*(x, t) = y^*(x, t) \le y_0$ for (2). In the same way, we can also prove $y_*(x, t) = y^*(x, t) = y_0$ along the line $L_2$.

In terms of Lemma 2.4, for each point (x, t), the backward generalized characteristics $L_1$, $L_2$ and the X-axis form a characteristic triangle. We denote it by Δ(x, t). Fix the time t; as x changes, all these characteristic triangles Δ(x, t) never interact with each other except on the X-axis. Thus, we have the following lemma.

Lemma 2.5. $y_*$, $y^*$ are increasingly monotonic in x. In particular, $y^*(x_1, t) \le y_*(x_2, t)$ holds for any $x_1 < x_2$.

Proof. If there exist two points $(x_1, t)$ and $(x_2, t)$ such that $x_1 < x_2$, $y^*(x_1, t) > y_*(x_2, t)$, then the line $L_2(x_1)$ (defined in (2.9)) must interact with $L_1(x_2)$ at a point $(x_s, t_s)$, with $t_s > 0$. From Lemma 2.4, we have $y_*(x_s, t_s) = y^*(x_s, t_s) = y^*(x_1, t) \le y_*(x_2, t)$. This contradicts the assumption $y^*(x_1, t) > y_*(x_2, t)$.

Lemma 2.6. Each point $(x_1, t_1)$ at $t_1 > 0$ uniquely determines a Lipschitz continuous curve x = x(t), $x_1 = x(t_1)$. In particular, at every $t \in \{\tau;\ \tau \ge t_1\}$,
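\[
x'(t) =
\begin{cases}
\dfrac{x - y_*}{t}, & \text{if } t \in V_1 \cap (V_2 \cup V_3),\\[2mm]
\dfrac{x(t_a) - x(t)}{t_a - t}, & \text{if } t \in V_1 \cap V_4,\\[2mm]
\displaystyle \lim_{x_2, x_1 \to x \pm 0}
\frac{\int_{\tilde y_*(x_1,t)+0}^{\tilde y^*(x_2,t)-0} u_0(\eta)\, dm_0(\eta)}
     {\int_{\tilde y_*(x_1,t)+0}^{\tilde y^*(x_2,t)-0} dm_0(\eta)},
 & \text{if } t \in V_5 \cup (V_1 \setminus (V_2 \cup V_3 \cup V_4)),
\end{cases} \tag{2.11}
\]
where
\[
\begin{aligned}
V_1 &= \{t;\ y_*(x, t) = y^*(x, t)\},\\
V_2 &= \{(x, t);\ \exists\, t_a > t \text{ s.t. } y_*(x(t_a), t_a) = y^*(x(t_a), t_a) \text{ and } (x(t), t) \in \Delta(x(t_a), t_a)\},\\
V_3 &= \{t;\ \forall t_a > t,\ y_*(x(t_a), t_a) < y^*(x(t_a), t_a), \text{ and } y_*(x(t_a), t_a) \to y_*(x, t),\ y^*(x(t_a), t_a) \to y^*(x, t) \text{ as } t_a \to t\},\\
V_4 &= \{(x, t);\ \forall t_a > t \text{ s.t. } y_*(x(t_a), t_a) = y^*(x(t_a), t_a) \text{ and } (x(t), t) \in R_+^2 \setminus \Delta(x(t_a), t_a)\},\\
V_5 &= \{t;\ y_*(x, t) < y^*(x, t)\},
\end{aligned}
\]
and
\[
\begin{aligned}
x'(t) &= \lim_{t'', t' \to t+0} \frac{x(t'') - x(t')}{t'' - t'},\\
\tilde y^*(x_2, t) &= y^*(x_2, t) + H\bigl(x_2 - y_*(x + 0, t) - u_0(y_*(x + 0, t))\,t\bigr)(x_2 - x),\\
\tilde y_*(x_1, t) &= y_*(x_1, t) + H\bigl(u_0(y_*(x - 0, t))\,t + y_*(x - 0, t) - x_1\bigr)(x_1 - x),\\
H(x) &= \begin{cases} 1, & x > 0,\\ 0, & x \le 0. \end{cases}
\end{aligned}
\]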
Proof. Consider any line $t = t_0 > t_1$. Let
\[
\begin{aligned}
a(t_0) &= \sup\Bigl\{x;\ x - \frac{x - y_*(x, t_0)}{t_0}(t_0 - t_1) < x_1\Bigr\},\\
b(t_0) &= \inf\Bigl\{x;\ x - \frac{x - y_*(x, t_0)}{t_0}(t_0 - t_1) > x_1\Bigr\}.
\end{aligned}
\]
Obviously, $a(t_0) \le b(t_0)$ holds. If $a(t_0) < b(t_0)$, then for any $c \in (a(t_0), b(t_0))$, we have $c - \frac{c - y_*(c, t_0)}{t_0}(t_0 - t_1) = x_1$, which contradicts the fact that all characteristic triangles $\Delta(x, t_0)$, $-\infty < x < \infty$, never interact with each other except on the X-axis. Thus $a(t_0) = b(t_0)$ always holds. Therefore, for any $t = t_0$, there always exists a unique point $x = x(t_0) = a(t_0)$ on the line $t = t_0$. When the time $t_0$ changes, these points $x = x(t_0)$ form a curve x = x(t) with $x_1 = x(t_1)$ in $t > t_1$. We denote it by $L(x_1, t_1)$. In particular, for any $t_2 > t > t_1$, the characteristic triangle $\Delta(x, t_2)$ does not contain the points $(x(t_1), t_1)$ and (x(t), t) if $x \ne x(t_2)$. This means that the curve $L(x_1, t_1)$ coincides with the curve $L(x_2, t_2)$ originating from $(x_2, t_2)$ in $t \ge t_2$, if $x_2 = x(t_2)$, $x_1 = x(t_1)$, $t_2 \ge t_1$.
We first consider the case $y_*(x, t) < y^*(x, t)$. Let $y_* = y_*(x, t)$, $y^* = y^*(x, t)$. By the definition of $y_*$, $y^*$, it is easy to check
\[
\int_{y_*+0}^{y^*-0} dm_0(\eta) > 0, \tag{2.12}
\]
or
\[
\rho_0 = 0 \text{ on } (y_*, y^*), \qquad (y_*, y^*) \cap S(x, t) = \emptyset, \tag{2.13}
\]
which indicates
\[
\Delta(x(t), t) \subset \Delta(x(t'), t'), \qquad \forall t' > t. \tag{2.14}
\]
In fact, if $\Delta(x(t'), t') \cap \Delta(x(t), t) \cap R_+^2 = \emptyset$, we have $y^*(x(t'), t') \le y_*$ or $y_*(x(t'), t') \ge y^*$. Without loss of generality, we assume that $y^*(x(t'), t') \le y_*$. Now we choose a sequence of points $(x_n, t')$ with $x_n > x(t') = a(t')$, $x_n \to x(t')$, as n → ∞. By the definition of $a(t')$, we have $\varsigma_n = x_n - \frac{x_n - y_*(x_n, t')}{t'}(t' - t_1) > x(t_1)$. This implies $y_*(x_n, t') \ge y^*$. Let $y_0 = \liminf_{n\to\infty} y_*(x_n, t')$, then
\[
y_0 \ge y^* > y_* > y^*(x(t'), t'). \tag{2.15}
\]
On the other hand, from Lemma 2.3, we have y0 ∈ S(x(t ), t ). This contradicts the definition of y ∗ due to (2.12) and (2.13). Let x = x(t ), x = x(t ), t > t > t, and y = y∗ (x , t ),
y = y ∗ (x , t ),
y = y∗ (x , t ),
y = y ∗ (x , t ).
From (2.14), we have "(x, t) ⊂ "(x , t ) ⊂ "(x , t ), y ≤ y ≤ y∗ ≤ y ∗ ≤ y ≤ y .
(2.16)
It is easy to check that y∗ (x +0, t) = limt →t+0 y and y∗ (x −0, t) = limt →t+0 y . Without loss of generality, we assume that y2 < y , y , y1 > y , y and ν(x , t ) = F (y ; x , t ) = F (y ; x , t ),
ν(x , t ) = F (y ; x , t ) = F (y ; x , t ). By the definition of ν(x, t), we have F (y ; x , t ) − F (y ; x , t ) ≤ F (y ; x , t ) − F (y ; x , t ),
(2.17)
i.e.,
y −0
y −0
t u0 + η − x dm0 (η) ≤
which gives x − x t − t
y −0 y −0
y −0
y −0
dm0 (η) −
t u0 + η − x dm0 (η),
y −0 y −0
u0 dm0 (η) ≥ 0.
(2.18)
(2.19)
In the same way, we also have F (y ; x , t ) − F (y ; x , t ) ≤ F (y ; x , t ) − F (y; x , t ), which gives x − x t − t
y −0 y −0
dm0 (η) −
y −0 y −0
u0 dm0 (η) ≤ 0.
(2.20)
(2.21)
Combining (2.19) and (2.21), we have y −0
y −0 y −0 u0 dm0 (η) x − x ≤ . ≤ y −0 y −0 t − t y −0 dm0 (η) y −0 dm0 (η)
y −0
u0 dm0 (η)
Letting t → t + 0 yields
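\[
x'(t) = \frac{\int_{y_*(x-0,t)-0}^{y_*(x+0,t)+0} u_0\, dm_0}{\int_{y_*(x-0,t)-0}^{y_*(x+0,t)+0} dm_0}
= \lim_{x_2, x_1 \to x \pm 0}
\frac{\int_{\tilde y_*(x_1,t)+0}^{\tilde y^*(x_2,t)-0} u_0(\eta)\, dm_0(\eta)}
     {\int_{\tilde y_*(x_1,t)+0}^{\tilde y^*(x_2,t)-0} dm_0(\eta)}, \tag{2.22}
\]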
where we have used the fact that y˜ ∗ (x2 , t) → y∗ (x + 0, t) + 0 and y˜∗ (x1 , t) → y∗ (x − 0, t) − 0, as x2 , x1 → x ± 0. Next we turn to the case y∗ = y ∗ . There are three subcases. (1) t ∈ V1 ∩ (V2 ∪ V3 ); (2) t ∈ V1 ∩ V4 ; (3) t ∈ V1 /(V2 ∪ V2 ∪ V3 ). Proof of (1). If t ∈ V1 ∩ V3 , then the curve x = x(t), t < t lies in "(x(t ), t ). Thus, we have x − y x − y x − x ≤ ≤ , (2.23) t t − t t which indicates limt ,t →t+0
x(t )−x(t ) t −t
=
x−y∗ t
by letting t , t → t + 0. If t ∈ V1 ∩ V2 ,
∗ (τ − ta ). It is easy to then (x(t), t) is located on the line Lta : ς (τ ) = x(ta ) + x(tat)−y a check that (x(τ ), τ ), t < τ < ta is also located on the line Lta . Thus, we have
x − y∗ x − x = , t > t > t. t − t t Proof of (2). There exists ta > t such that for any t < t ≤ ta , y∗ (x(t ), t ) = y∗ (x(t), t) = y∗ and "(x(t ), t ) does not contain the point (x(t), t). Obviously, we have x(ta ) − y∗ x(ta ) − (ta − t) < x(t), (2.24) ta or, x(ta ) − y∗ x(ta ) − (ta − t) > x(t). (2.25) ta Without loss of generality, we assume that x(ta ) −
x(ta ) − y∗ (ta − t) < x(t). ta
Now we choose a sequence of $(x_n, t_a)$ with $x_n > x(t_a)$, $x_n \to x(t_a)$, n → ∞. Since for any x, $\Delta(x, t')$, $t < t' \le t_a$, does not contain the point (x(t), t), we have, from (2.24),
\[
y_{t_a} = \liminf y_*(x_n, t_a) \ge x(t_a) - \frac{x(t_a) - x(t)}{t_a - t}\, t_a > y_*. \tag{2.26}
\]
Furthermore, $y_{t_a} = y^m(x(t_a), t_a) \in S(x(t_a), t_a) \cap \mathrm{spt}\{\rho_0\}$ due to Lemma 2.3. Define
\[
L_{t_a}:\quad \xi = x(t_a) + \frac{x(t_a) - y_{t_a}}{t_a}(\tau - t_a), \qquad \tau < t_a. \tag{2.27}
\]
From Lemma 2.4, we have
\[
y_*(\xi, \tau) = y^*(\xi, \tau) = y_*. \tag{2.28}
\]
Let $(x_s, t_s)$ be the interaction point of $L_{t_a}$ with the line
\[
L(t, t_a):\quad \xi = x(t) + \frac{x(t) - y_*}{t}(\tau - t), \qquad \tau > 0.
\]
In view of (2.26), we have $t_s \ge t$. If $t_s > t$, then (2.28) gives $\Delta(x(t), t) \subset \Delta(x_s, t_s)$. This contradicts the fact that for any x, $\Delta(x, t_s)$ does not contain the point (x(t), t). Therefore $t_s = t$ and then $(x(t), t) \in L_{t_a}$. In the same way, we also have $(x(t), t) \in L_{t'}$ if $t < t' < t_a$, where $L_{t'}$ is defined as in (2.27). It is easy to see $y_{t_a} = y_{t'}$. This implies that any line $L_{t'}$ coincides with $L_{t_a}$. Thus, we have
\[
\frac{x'' - x'}{t'' - t'} = \frac{x(t_a) - x(t)}{t_a - t}, \qquad t'' > t' > t.
\]
Proof of (3). This proof is similar to the case $y_* < y^*$, and we omit it here. On the other hand, it is easy to see that x(t) is Lipschitz continuous in t by the argument above. Therefore Lemma 2.6 is proved.

We call x = x(t) in Lemma 2.6 the forward generalized characteristic generated by $(x_1, t_1)$. Obviously, the curve x = x(t) is Lipschitz continuous. We define
\[
u(x, t) =
\begin{cases}
\displaystyle \lim_{t'', t' \to t+0} \frac{x(t'') - x(t')}{t'' - t'}, & \text{if } |x - y_*(x, t)| \le Mt,\\
M, & \text{if } x - y_*(x, t) > Mt,\\
-M, & \text{if } x - y_*(x, t) < -Mt,
\end{cases} \tag{2.29}
\]
where the curve ξ = x(τ) originates from the point (x, t). Obviously, u(x, t) is well defined in $R_+^2$. Some properties of u follow from Lemma 2.6.

Lemma 2.7. (i) $u(x - 0, t) = \frac{x - y_*(x-0,t)}{t}$, $u(x + 0, t) = \frac{x - y_*(x+0,t)}{t}$, if $|x - y_*(x \pm 0, t)| \le Mt$.
(ii) For almost every t > 0, u(x, t) satisfies
\[
u(x + 0, t) < u(x, t) < u(x - 0, t). \tag{2.30}
\]
(iii) For any $x_1 < x_2$ and almost every t > 0, u(x, t) satisfies the Oleinik entropy condition, i.e.
\[
\frac{u(x_2, t) - u(x_1, t)}{x_2 - x_1} \le \frac{1}{t}.
\]
Proof. (i), (ii) are easy due to Lemmas 2.1, 2.5–2.6. We only prove (iii). For any $x_1 < x_2$, without loss of generality, we assume that $|x_2 - y_*(x_2 + 0, t)| \le Mt$ and $|x_1 - y_*(x_1 - 0, t)| \le Mt$. We compute
\[
u(x_2, t) - u(x_1, t) \le u(x_2 - 0, t) - u(x_1 + 0, t)
= \frac{x_2 - y_*(x_2 - 0, t)}{t} - \frac{x_1 - y_*(x_1 + 0, t)}{t} \le \frac{x_2 - x_1}{t}.
\]
Therefore (iii) is proved.

Now we construct the solution m(x, t). First we introduce the forward generalized characteristics originating from the X-axis. For each point $\eta \in \mathrm{spt}\{\rho_0\}$, let
\[
a(\eta, t) =
\begin{cases}
\eta + u_0(\eta)t, & \text{if } N_1 \cup N_2 = \emptyset,\\
\inf\{x;\ x \in N_2\}, & \text{if } N_1 = \emptyset,\ N_2 \ne \emptyset,\\
\sup\{x;\ x \in N_1\}, & \text{if } N_1 \ne \emptyset,
\end{cases} \tag{2.31}
\]
\[
b(\eta, t) =
\begin{cases}
\eta + u_0(\eta)t, & \text{if } N_1 \cup N_2 = \emptyset,\\
\sup\{x;\ x \in N_1\}, & \text{if } N_1 \ne \emptyset,\ N_2 = \emptyset,\\
\inf\{x;\ x \in N_2\}, & \text{if } N_2 \ne \emptyset,
\end{cases} \tag{2.32}
\]
where $N_1 = \{x;\ y_*(x, t) < \eta\}$, $N_2 = \{x;\ y_*(x, t) > \eta\}$. Note that (η, 0) is located on the X-axis, so a(η, t) may not be equal to b(η, t). Namely, each point (η, 0) may generate at least one Lipschitz continuous curve x(t), x(0) = η. If there is only one forward generalized characteristic x(t) from (η, 0), we denote x(η, t) = x(t). If there are at least two curves from (η, 0), where $\rho_0$ has a positive mass, let
\[
x(\eta, t) =
\begin{cases}
a(\eta, t), & \text{if } \eta + u_0(\eta)t \le a(\eta, t),\\
\eta + u_0(\eta)t, & \text{if } a(\eta, t) \le \eta + u_0(\eta)t \le b(\eta, t),\\
b(\eta, t), & \text{if } b(\eta, t) \le \eta + u_0(\eta)t.
\end{cases} \tag{2.33}
\]
For each point $\eta \in R \setminus \mathrm{spt}\{\rho_0\}$, let c(η, t) = η + Mt. Then there exists $t_\eta > 0$ such that c(η, t) does not interact with other curves $x(\tilde\eta, t)$, $\tilde\eta \in \mathrm{spt}\{\rho_0\}$, for $0 < t < t_\eta$. Let $\tilde t_\eta$ be the first time when c(η, t) interacts with a curve $x(\eta_0, t)$, $\eta_0 \in \mathrm{spt}\{\rho_0\}$, and let
\[
x(\eta, t) =
\begin{cases}
\eta + Mt, & 0 < t \le \tilde t_\eta,\\
x(\eta_0, t), & t > \tilde t_\eta.
\end{cases}
\]
It is easy to see that x(η, t) is increasing in η and Lipschitz continuous in t > 0. Thus x = x(η, t) is well defined with respect to the measure $\rho_0$ due to Wang and Ding [22].
By Lemma 2.6, we have
\[
\frac{\partial x(\eta, t)}{\partial t} = u(x(\eta, t), t), \quad \text{if } |x - y_*(x(\eta, t), t)| \le Mt \text{ and } \eta \in \mathrm{spt}\{\rho_0\}. \tag{2.34}
\]
In view of the definition of $y_*$ and the forward generalized characteristic, we define
\[
m(x, t) =
\begin{cases}
\int_{0+0}^{y_*-0} dm_0, & \text{if } F \text{ achieves its minimum at } y_*,\\
\int_{0+0}^{y_*+0} dm_0, & \text{otherwise,}
\end{cases} \tag{2.35}
\]
\[
q(x, t) =
\begin{cases}
\int_{0+0}^{y_*-0} u_0\, dm_0, & \text{if } F \text{ achieves its minimum at } y_*,\\
\int_{0+0}^{y_*+0} u_0\, dm_0, & \text{otherwise,}
\end{cases} \tag{2.36}
\]
\[
E(x, t) =
\begin{cases}
\frac12 \int_{0+0}^{y_*-0} u_0\, u(x(\eta, t), t)\, dm_0, & \text{if } F \text{ achieves its minimum at } y_*,\\
\frac12 \int_{0+0}^{y_*+0} u_0\, u(x(\eta, t), t)\, dm_0, & \text{otherwise.}
\end{cases} \tag{2.37}
\]
It is noted that u(x(η, t), t) is of bounded variation locally in x and satisfies
\[
\frac{u(x(\eta_2, t), t) - u(x(\eta_1, t), t)}{x(\eta_2, t) - x(\eta_1, t)} \le \frac{1}{t}.
\]
Thus u(x(η, t), t) is also measurable to the measure $\rho_0$ due to Wang and Ding [22]. Obviously, m(x, t), q(x, t) and E(x, t) are of bounded variation locally in x. For these functions, we have the following properties.

Lemma 2.8. In the sense of Radon-Nikodym derivatives, for almost every t > 0, m(x, t), q(x, t) and E(x, t) satisfy
(1) $q_x = u m_x$,
(2) $E_x = \frac12 u^2 m_x$.

Proof. First we consider the simplest case that there is a neighborhood $U_{(x,t)}$ of (x, t) such that $y_*$ keeps constant in $U_{(x,t)}$. Namely, for any point $(\xi, \tau) \in U_{(x,t)}$, $y_*(\xi, \tau) = y_*(x, t)$. In terms of Lemma 2.3, we also have $y^*(\xi, \tau) = y_*(x, t)$. There are two cases.

Case 1. F(y; ξ, τ) always achieves or never achieves its minimum at $y_*$ in $U_{(x,t)}$. This case is trivial because m, q and E keep constant in $U_{(x,t)}$. We note that the case $|x - y_*(x, t)| > Mt$ belongs to Case 1.

Case 2. F(y; ξ, τ) achieves its minimum at $y_*$ only for some points of $U_{(x,t)}$, not for the whole set $U_{(x,t)}$. Obviously, m, q, E take two values in $U_{(x,t)}$. When m(x − 0, t) = m(x + 0, t), it is the same as Case 1 by choosing a smaller neighborhood of (x, t). When m(x − 0, t) ≠ m(x + 0, t), F(y; ξ, τ) achieves its minimum at $y_*$ only for ξ < x. Let $y_0 = y_*(x, t)$; it is easy to see that $\rho_0$ has a positive mass at the point $y_0$. Thus Lemmas 2.2 and 2.6 give
\[
u(x, t) = u_0(y_0) = \frac{x - y_0}{t}. \tag{2.38}
\]
We compute
\[
\lim_{x_2, x_1 \to x \pm 0} \frac{q(x_2, t) - q(x_1, t)}{m(x_2, t) - m(x_1, t)}
= \frac{\int_{y_0-0}^{y_0+0} u_0\, dm_0(\eta)}{\int_{y_0-0}^{y_0+0} dm_0(\eta)} = u_0(y_0) = u(x, t), \tag{2.39}
\]
\[
\lim_{x_2, x_1 \to x \pm 0} \frac{E(x_2, t) - E(x_1, t)}{m(x_2, t) - m(x_1, t)}
= \frac{\frac12 \int_{y_0-0}^{y_0+0} u_0\, u(x, t)\, dm_0(\eta)}{\int_{y_0-0}^{y_0+0} dm_0(\eta)} = \frac12 u^2(x, t). \tag{2.40}
\]
Next we turn to the case that $y_*$ is not a constant in any neighborhood of (x, t). Let $x_1 < x < x_2$, $y_1 = y_*(x_1, t)$, $y_2 = y_*(x_2, t)$, then $y_1 < y_2$. Analogous to Lemma 2.6, we first consider the case $y_* < y^*$. It is easy to check that
\[
\lim_{x_2, x_1 \to x \pm 0} \frac{q(x_2, t) - q(x_1, t)}{m(x_2, t) - m(x_1, t)}
= \lim_{x_2, x_1 \to x \pm 0}
\frac{\int_{\tilde y_*(x_1,t)+0}^{\tilde y^*(x_2,t)-0} u_0(\eta)\, dm_0(\eta)}
     {\int_{\tilde y_*(x_1,t)+0}^{\tilde y^*(x_2,t)-0} dm_0(\eta)} = u(x, t). \tag{2.41}
\]
When $y_* = y^*$, there are three subcases. The proof of (3) is similar to the case $y_* < y^*$, and we omit it here. For (1), if $t \in V_1 \cap V_3$, then we have $y_1, y_2 \to y_*$. Without loss of generality, we assume that
\[
\nu(x_1, t) = F(y_1; x_1, t), \qquad \nu(x_2, t) = F(y_2; x_2, t).
\]
By the definition of ν(x, t), we have $\nu(x_2, t) \le F(y_1; x_2, t)$, $F(y_2; x_1, t) \ge \nu(x_1, t)$, which indicates
\[
\int_{y_1-0}^{y_2-0} u_0\, dm_0(\eta) \le \int_{y_1-0}^{y_2-0} \frac{x_2 - \eta}{t}\, dm_0(\eta), \tag{2.42}
\]
\[
\int_{y_1-0}^{y_2-0} u_0\, dm_0(\eta) \ge \int_{y_1-0}^{y_2-0} \frac{x_1 - \eta}{t}\, dm_0(\eta). \tag{2.43}
\]
Therefore, we have
\[
\frac{x_1 - y_2}{t}
\le \frac{\int_{y_1-0}^{y_2-0} \frac{x_1 - \eta}{t}\, dm_0(\eta)}{\int_{y_1-0}^{y_2-0} dm_0(\eta)}
\le \frac{q(x_2, t) - q(x_1, t)}{m(x_2, t) - m(x_1, t)}
\le \frac{\int_{y_1-0}^{y_2-0} \frac{x_2 - \eta}{t}\, dm_0(\eta)}{\int_{y_1-0}^{y_2-0} dm_0(\eta)}
\le \frac{x_2 - y_1}{t}. \tag{2.44}
\]
Letting $x_1 \to x - 0$, $x_2 \to x + 0$ in (2.44), we get
\[
\lim_{x_2, x_1 \to x \pm 0} \frac{q(x_2, t) - q(x_1, t)}{m(x_2, t) - m(x_1, t)} = \frac{x - y_*}{t}.
\]
In the same fashion, we can also prove (2.44) for the case $t \in V_1 \cap V_2$. Now we study the case (2). Without loss of generality, we assume that
\[
x(t_a) - \frac{x(t_a) - y_*}{t_a}(t_a - t) < x(t). \tag{2.45}
\]
It is observed that $\rho_0 = 0$ on the interval $(y_*, y_{t_a})$. This infers $y_* \in S(x_1, t)$ when $x_1$ is close to x. Furthermore, $F(y; x_1, t)$ does not achieve its minimum at $y_*$ if $\rho_0$ has a positive mass at $y_*$. By (2.44), we have
\[
\frac{x_1 - y_2}{t}
\le \frac{\int_{y_{t_a}-0}^{y_2-0} \frac{x_1 - \eta}{t}\, dm_0(\eta)}{\int_{y_{t_a}-0}^{y_2-0} dm_0(\eta)}
\le \frac{q(x_2, t) - q(x_1, t)}{m(x_2, t) - m(x_1, t)}
\le \frac{\int_{y_{t_a}-0}^{y_2-0} \frac{x_2 - \eta}{t}\, dm_0(\eta)}{\int_{y_{t_a}-0}^{y_2-0} dm_0(\eta)}
\le \frac{x_2 - y_{t_a}}{t}.
\]
Letting $x_2, x_1 \to x \pm 0$, we have
\[
\lim_{x_2, x_1 \to x \pm 0} \frac{q(x_2, t) - q(x_1, t)}{m(x_2, t) - m(x_1, t)} = \frac{x(t_a) - x(t)}{t_a - t},
\]
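where we have used the fact that $y_2 \to y_{t_a}$ as $x_2 \to x + 0$. Thus the first part of Lemma 2.8 is proved. On the other hand, it is observed that
\[
x(\eta, t) = x, \quad \text{as } y_*(x - 0, t) < \eta \in \mathrm{spt}\{\rho_0\} < y_*(x + 0, t), \tag{2.46}
\]
\[
x_1 \le x(\eta, t) \le x_2, \quad \text{as } y_1 < \eta \in \mathrm{spt}\{\rho_0\} < y_2. \tag{2.47}
\]
Thus, we have
\[
\begin{aligned}
&u(x(\eta, t), t) = u(x, t), && \text{as } y_*(x - 0, t) < \eta \in \mathrm{spt}\{\rho_0\} < y_*(x + 0, t),\\
&\frac{x_1 - y_2}{t} \le u(x(\eta, t), t) \le \frac{x_2 - y_1}{t}, && \text{as } y_1 < \eta \in \mathrm{spt}\{\rho_0\} < y_2.
\end{aligned}
\]
If $\rho_0$ has a positive mass on $y_*(x - 0, t)$ and $y_*(x + 0, t)$, then we have
\[
\begin{cases}
x(y_*(x - 0, t), t) < x, & \text{if } \exists\, x_1 < x \text{ s.t. } y_1 = y_*(x - 0, t) \text{ and } \frac{x - y_*(x-0,t)}{t} > u_0(y_*(x - 0, t)),\\
x(y_*(x - 0, t), t) = x, & \text{otherwise,}
\end{cases} \tag{2.48}
\]
and
\[
\begin{cases}
x(y_*(x + 0, t), t) > x, & \text{if } \exists\, x_2 > x \text{ s.t. } y_2 = y_*(x + 0, t) \text{ and } \frac{x - y_*(x+0,t)}{t} > u_0(y_*(x + 0, t)),\\
x(y_*(x + 0, t), t) = x, & \text{otherwise.}
\end{cases} \tag{2.49}
\]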
It is noted that in the first relation of (2.48) and (2.49), $\tilde y^*(x_2, t) = y_*(x + 0, t)$ and $\tilde y_*(x_1, t) = y_*(x - 0, t)$ hold. Applying Lemma 2.6, (2.46)–(2.49) and the fact that $q_x = u m_x$, we have $E_x = \frac12 u^2 m_x$.

Lemma 2.9.
\[
\int_{x_1}^{x_2} m(x, t)\, dx = \nu(x_1, t) - \nu(x_2, t), \tag{2.50}
\]
\[
\int_{t_1}^{t_2} q(x, t)\, dt = \nu(x, t_2) - \nu(x, t_1). \tag{2.51}
\]
Proof. Let $y_* = y_*(x, t)$, $y_*' = y_*(x', t)$. Without loss of generality, we assume that F(y; x, t) achieves its minimum at $y_*$ and $F(y; x', t)$ does not achieve its minimum at $y_*'$. Then we have $\nu(x, t) = F(y_*; x, t)$ and $\nu(x', t) = F(y_*' + 0; x', t)$. We compute
\[
\begin{aligned}
\nu(x, t) - \nu(x', t)
&= [\nu(x, t) - F(y_*' + 0; x, t)] + [F(y_*' + 0; x, t) - \nu(x', t)]\\
&= [F(y_*; x, t) - F(y_*; x', t)] + [F(y_*; x', t) - \nu(x', t)].
\end{aligned}
\]
In the first relation the first term on the right is ≤ 0; in the second relation the second term is ≥ 0. This means that $\nu(x, t) - \nu(x', t)$ lies between the limits
\[
F(y_*' + 0; x, t) - \nu(x', t) = (x' - x)\, m(x', t), \qquad
F(y_*; x, t) - F(y_*; x', t) = (x' - x)\, m(x, t). \tag{2.52}
\]
Letting x, x' run through a subdivision of the interval $(x_1, x_2)$, we get (2.50) by integrating (2.52) over $(x_1, x_2)$. Similar to (2.50), we also have (2.51). Lemma 2.9 is proved.

Proof of Theorem 1. We now prove that (u(x, t), m(x, t)) defined in (2.29), (2.35) is an entropy solution of (1.1). From Lemma 2.9, we have $\nu_x = -m(x, t)$ and $\nu_t = q(x, t)$ in the sense of distributions. This infers $m_t + q_x = m_t + u m_x = 0$. In fact, for any $\varphi \in C_0^\infty(R_+^2)$, we have
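\[
\iint m \varphi_t\, dx\, dt - \iint u \varphi\, dm\, dt
= \iint (m \varphi_t + q \varphi_x)\, dx\, dt
= \iint (\nu_t \varphi_x - \nu_x \varphi_t)\, dx\, dt = 0. \tag{2.53}
\]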
Therefore the first relation of (1.4) is proved. The proof of the second equation of (1.4) is slightly more complicated. We need to introduce another generalized potential to get some information on q, E like Lemma 2.9. Similar to F(y; x, t), we define
\[
G(y; x, t) = \int_{0+0}^{y-0} (u_0(\eta) + k)(x(\eta, t) - x)\, dm_0(\eta), \tag{2.54}
\]
where k is any constant satisfying k > M and x(η, t) is defined in (2.33). Since x(η, t) is increasing in η and $x(y_* - 0, t) \le x$, $x(y^* + 0, t) \ge x$, it is easy to see that G(y; x, t) has a finite lower bound for any (x, t) and $y_*(x, t) = G_*(x, t) \in S_G(x, t)$, where $G_*(x, t)$ and $S_G(x, t)$ are defined like (2.2) and (2.7). Let $\mu(x, t) = \min_y G(y; x, t)$. In the same way, Lemma 2.9 still holds for G(y; x, t). Therefore, we have
\[
\mu_x = -(\bar q + k \bar m), \tag{2.55}
\]
\[
\mu_t = 2\bar E + k v(x, t), \tag{2.56}
\]
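where
\[
\bar m(x, t) =
\begin{cases}
\int_{0+0}^{y_*-0} dm_0, & \text{if } G \text{ achieves its minimum at } y_*,\\
\int_{0+0}^{y_*+0} dm_0, & \text{otherwise,}
\end{cases}
\qquad
\bar q(x, t) =
\begin{cases}
\int_{0+0}^{y_*-0} u_0\, dm_0, & \text{if } G \text{ achieves its minimum at } y_*,\\
\int_{0+0}^{y_*+0} u_0\, dm_0, & \text{otherwise,}
\end{cases}
\]
\[
\bar E(x, t) =
\begin{cases}
\frac12 \int_{0+0}^{y_*-0} u_0\, u(x(\eta, t), t)\, dm_0, & \text{if } G \text{ achieves its minimum at } y_*,\\
\frac12 \int_{0+0}^{y_*+0} u_0\, u(x(\eta, t), t)\, dm_0, & \text{otherwise,}
\end{cases}
\]
\[
v(x, t) =
\begin{cases}
\int_{0+0}^{y_*-0} u(x(\eta, t), t)\, dm_0, & \text{if } G \text{ achieves its minimum at } y_*,\\
\int_{0+0}^{y_*+0} u(x(\eta, t), t)\, dm_0, & \text{otherwise.}
\end{cases}
\]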
We claim that for any t > 0,
\[
\bar m = m, \qquad \bar q = q, \qquad \bar E = E \qquad \text{a.e.}
\]
In fact, due to $y_*(x, t) = G_*(x, t)$, only the case that $[m_0(y_*(x, t))] > 0$ may cause $\bar m(x, t) \ne m(x, t)$. It is observed that such points $(y_*, 0)$ are at most denumerable. Since the map from (x, t) to $(y_*, 0)$ is not one to one, we have to consider the case that an interval maps to one point, i.e. $(y_*, 0)$ generates at least two forward generalized characteristics. Let $x \in (a(\eta, t), b(\eta, t))$. There are two subcases:
(1) $y_* + u_0(y_*)t < x$,
(2) $x < y_* + u_0(y_*)t$.
For (1), F(y; x, t) and G(y; x, t) do not attain their minimum at $y_*$ in terms of Lemma 2.2. Thus we have $\bar m = m$. For (2), both F(y; x, t) and G(y; x, t) attain their minimum at $y_*$. So F and G always achieve or do not achieve their minimum at $y_*$ at the same time. This infers $\bar m(x, t) = m(x, t)$. In the same fashion, we also have $\bar q = q$, $\bar E = E$, a.e. Since k > M is arbitrary, we have, from (2.55), (2.56),
\[
\theta_x = -q, \qquad \theta_t = 2E, \tag{2.57}
\]
where
\[
\theta(x, t) =
\begin{cases}
\int_{0+0}^{y_*-0} u_0(\eta)(x(\eta, t) - x)\, dm_0, & \text{if } G \text{ achieves its minimum at } y_*,\\
\int_{0+0}^{y_*+0} u_0(\eta)(x(\eta, t) - x)\, dm_0, & \text{otherwise.}
\end{cases} \tag{2.58}
\]
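For any $\phi \in C_0^\infty(R_+^2)$, (2.57) and Lemma 2.8 give
\[
\iint \phi_t u\, dm\, dt + \iint \phi_x u^2\, dm\, dt
= \iint \theta_x \phi_{xt}\, dx\, dt - \iint \theta_t \phi_{xx}\, dx\, dt = 0. \tag{2.59}
\]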
Therefore the second relation of (1.4) is proved. To prove that (m, u) is an entropy solution of (1.1), it is sufficient to prove that $m_x$, $u m_x$, $u^2 m_x$ are weakly continuous at t = 0. We first consider the weak continuity of the mass at the initial time. It is observed that $m_0(x)$ is continuous except at an at most denumerable set of points. We only study the points x where $m_0(x)$ is continuous. If there exists $t_0$ such that $|x - y_*(x, t_0)| > Mt_0$, then $\rho_0 = 0$ on (x − Mt, x + Mt). This implies that for any $t < t_0$, $y_*(x, t) = y_*(x, t_0)$ and $m(x, t) \to m_0(x)$ as t → +0. If there exists $t_0$ such that $|x - y_*(x, t)| \le Mt$ for all $t < t_0$, then $y_*(x, t) \to x$ as t → 0. From (2.35), we have $m(x, t) \to m_0(x)$. Thus, $m(x, t) \to m_0(x)$ a.e. This indicates the weak continuity of $m_x$ at the initial time. In the same way, $q_x$ is also weakly continuous initially.

Now we study the initial condition for the energy $\frac12 \rho u^2$. By (2.37), we only need to prove $u(x(\eta, t), t) \to u_0(\eta)$, a.e. with respect to the measure $\rho_0$, as t → 0+. Let $\eta_0$ be a point satisfying
\[
u_0(\eta_0) = \lim_{\eta_2, \eta_1 \to \eta_0 \pm 0}
\frac{\int_{\eta_1-0}^{\eta_2+0} u_0(\eta)\, dm_0(\eta)}{\int_{\eta_1-0}^{\eta_2+0} dm_0(\eta)}. \tag{2.60}
\]
We shall show $u(x(\eta_0, t), t) \to u_0(\eta_0)$ as t → 0+. In view of the definition of x(η, t), we have $y_*(x(\eta_0, t) - 0, t) \le \eta_0$. Let $y^* = \lim_{t \to 0+} y_*(x(\eta_0, t) - 0, t)$. There are two subcases: (a) $y^* < \eta_0$; (b) $y^* = \eta_0$.

Proof of (a). It is easy to see that $\rho_0 = 0$ on $(y^*, \eta_0)$ and $y^* \in S(x(\eta_0, t), t)$ when t is sufficiently small. By Lemma 2.2, $F(y; x(\eta_0, t), t)$ does not achieve its minimum at $y^*$ if $\rho_0$ has a positive mass at $y^*$. This implies $\eta_0 = y_*(x(\eta_0, t), t)$. Let $y = \eta_0 + ht$. We have
\[
\begin{aligned}
\int_{\eta_0-0}^{y-0} u_0(\eta)\, dm_0(\eta)
&\ge \int_{\eta_0-0}^{y-0} \frac{x(\eta_0, t) - \eta}{t}\, dm_0(\eta)\\
&\ge \frac{x(\eta_0, t) - \eta_0}{t} \int_{\eta_0-0}^{y-0} dm_0(\eta) - h \int_{\eta_0-0}^{y-0} dm_0(\eta).
\end{aligned} \tag{2.61}
\]
From Lemma 2.6, we have
\[
\frac{\int_{\eta_0-0}^{y-0} u_0(\eta)\, dm_0(\eta)}{\int_{\eta_0-0}^{y-0} dm_0(\eta)} \ge u(x(\eta_0, t), t) - h. \tag{2.62}
\]
Letting t → 0+, we have
\[
u_0(\eta_0) \ge \limsup_{t \to 0} u(x(\eta_0, t), t) - h. \tag{2.63}
\]
Again letting h → 0, we have
\[
u_0(\eta_0) \ge \limsup_{t \to 0} u(x(\eta_0, t), t). \tag{2.64}
\]
134
F. Huang, Z. Wang
The case (b) can be treated similarly. Thus, (2.64) always holds. In the same fashion, we also have
$$u_0(\eta_0) \le \liminf_{t \to 0} u(x(\eta_0, t), t), \tag{2.65}$$
which, together with (2.64), indicates
$$u_0(\eta_0) = \lim_{t \to 0} u(x(\eta_0, t), t). \tag{2.66}$$
Thus, $u(x(\eta, t), t) \to u_0(\eta)$ a.e. with respect to the measure $\rho_0$, as $t \to 0+$. Then we have $E(x, t) \to E_0(x)$ a.e. as $t \to 0+$ due to (2.37). Therefore Theorem 1 is proved.

Finally, it is worthwhile to point out that $m(x, t)$, $q(x, t)$ and $E(x, t)$ can also be expressed by
$$m(x, t) = \begin{cases} \int_{0+0}^{y_0 - 0} dm_0, & F \text{ achieves its minimum at } y_0, \\ \int_{0+0}^{y_0 + 0} dm_0, & \text{otherwise}, \end{cases} \tag{2.67}$$
$$q(x, t) = \begin{cases} \int_{0+0}^{y_0 - 0} u_0\,dm_0, & F \text{ achieves its minimum at } y_0, \\ \int_{0+0}^{y_0 + 0} u_0\,dm_0, & \text{otherwise}, \end{cases} \tag{2.68}$$
$$E(x, t) = \begin{cases} \frac{1}{2} \int_{0+0}^{y_0 - 0} u_0\,u(x(\eta, t), t)\,dm_0, & F \text{ achieves its minimum at } y_0, \\ \frac{1}{2} \int_{0+0}^{y_0 + 0} u_0\,u(x(\eta, t), t)\,dm_0, & \text{otherwise}, \end{cases} \tag{2.69}$$
with any $y_0 \in S(x, t)$. These are due to the fact that, for any $t > 0$, the points $(x, t)$ satisfying $y_* < y^*$ are at most denumerable and $\rho_0 = 0$ on $(y_0, y_*)$ for the points where $y_* = y^*$, if $y_0 < y_*$ (or $\rho_0 = 0$ on $(y_*, y_0)$ if $y_0 > y_*$).

In addition, for any $(x_0, t_0) \in \mathbb{R}_+^2$, from Lemma 2.4, it is easy to see that
$$u(x, t) = \frac{x - y_0}{t} \tag{2.70}$$
holds along the line
$$L : x = x_0 + \frac{x_0 - y_0}{t_0}(t - t_0), \quad t < t_0,$$
with any $y_0 \in S(x_0, t_0) \cap \operatorname{spt}\{\rho_0\}$ and $\left|\frac{x_0 - y_0}{t_0}\right| \le M$.
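The representation (2.67) can be illustrated for finitely many point masses. The following is a minimal sketch, not the authors' construction: it assumes the potential has the form $F(y; x, t) = \int_{0+0}^{y-0} (t u_0(\eta) + \eta - x)\,dm_0(\eta)$ (the form that appears in (4.36)), that all particles sit at positive positions, and that ties are resolved by taking the smallest minimizer. The function name `mass_left_of` and the tuple layout are hypothetical.

```python
def mass_left_of(x, t, particles):
    """Return m(x, t), the mass to the left of x at time t, for discrete data.

    particles: list of (eta, u0, mass) tuples (hypothetical layout) giving the
    initial position, initial velocity and mass of each point mass.
    For a discrete measure, F(y; x, t) is a prefix sum over the particles
    sorted by eta, so minimizing over y means minimizing over prefixes.
    """
    ordered = sorted(particles)          # sort by initial position eta
    best_value, best_mass = 0.0, 0.0     # empty prefix: F = 0, no mass
    value, mass = 0.0, 0.0
    for eta, u0, m in ordered:
        value += (t * u0 + eta - x) * m  # one more term of the potential F
        mass += m
        if value < best_value:           # strict test keeps the smallest minimizer
            best_value, best_mass = value, mass
    return best_mass


# Two unit masses: one at eta = 1 moving with speed 1, one at rest at eta = 2.
# They stick at time 1 at x = 2 and then travel together with speed 1/2.
particles = [(1.0, 1.0, 1.0), (2.0, 0.0, 1.0)]
print(mass_left_of(2.4, 2.0, particles))  # left of the cluster at x = 2.5
print(mass_left_of(2.6, 2.0, particles))  # right of the cluster
```

At time $t = 2$ the cluster sits at $x = 2.5$, so the mass function jumps from 0 to 2 there, which is what the minimization returns.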
3. Linear Equation with Discontinuous Coefficient

This section is devoted to the existence and uniqueness of weak solutions of the linear equation with discontinuous coefficient in the region $t \ge t_1 > 0$. In fact, we get an explicit formula for the weak solution if its value at $t = t_1$ is given. We consider the following linear equation:
$$\omega_t + u(x, t)\,\omega_x = 0, \tag{3.1}$$
Well Posedness for Pressureless Flow
135
where $u(x, t)$ is a given bounded measurable function satisfying
$$\frac{u(x_2, t) - u(x_1, t)}{x_2 - x_1} \le \frac{1}{t} \tag{3.2}$$
for any $x_1 < x_2$ and almost all $t > 0$. Similar to Definition 1, our definition of the weak solution of (3.1) is as follows.

Definition 3.1. Suppose that $\omega(x, t)$ is of bounded variation locally in $x$ and the measure $\omega_x$ is weakly continuous with respect to $t$. Then $\omega(x, t)$ is called a weak solution of (3.1) if
$$\iint \phi_t\,\omega\,dx\,dt - \iint \phi\,u\,d\omega\,dt = 0 \tag{3.3}$$
holds for all $\phi \in C_0^\infty(\mathbb{R}_+^2)$.
In the sense of Definition 3.1, we have the following existence and uniqueness result.

Theorem 3.1. Suppose that $\omega(x, t_1) \in BV_{loc}(\mathbb{R})$, $t_1 > 0$, and $u(x, t)$ satisfies (3.2). Then (3.1) admits only one weak solution in $t \ge t_1$.

Proof. Let $M = \|u_0\|_{L^\infty}$ and $u^\varepsilon = u * j_\varepsilon$, where $j_\varepsilon$ is a standard mollifier. Then $|u^\varepsilon| \le M$ and $u^\varepsilon_x \le \frac{1}{t}$. We denote by $x = X^\varepsilon_{t_1}(\xi, t)$ the characteristic curve defined by
$$\frac{dx}{dt} = u^\varepsilon, \qquad x(t_1) = \xi. \tag{3.4}$$
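As an illustration of the characteristic problem (3.4), the sketch below integrates $dx/dt = u(x, t)$, $x(t_1) = \xi$, numerically. It is only an example under stated assumptions: the velocity field $u(x, t) = x/t$ is a hypothetical choice (it satisfies (3.2) with equality and has the exact characteristics $x = \xi t / t_1$), and the classical Runge–Kutta scheme stands in for the limiting procedure of the paper.

```python
def characteristic(xi, t1, t_end, velocity, steps=1000):
    """Integrate dx/dt = velocity(x, t), x(t1) = xi, with classical RK4."""
    h = (t_end - t1) / steps
    x, t = xi, t1
    for _ in range(steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = velocity(x + h * k3, t + h)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x


def rarefaction(x, t):
    """Hypothetical smooth field with u_x = 1/t, the borderline case of (3.2)."""
    return x / t


# The exact characteristic through (xi, t1) = (1, 1) is x = t.
print(characteristic(1.0, 1.0, 2.0, rarefaction))
```

For this field the characteristics spread apart at the maximal rate allowed by (3.2), which is why the one-sided bound rules out faster expansion but permits arbitrary compression.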
Some properties of these characteristics have been studied in [13, 22]. We state them in Lemma 3.2.

Lemma 3.2.
(1) There exists a subsequence $X^{\varepsilon_i}_{t_1}(\xi, t)$ such that
$$\lim_{\varepsilon_i \to 0} X^{\varepsilon_i}_{t_1}(\xi, t) = X_{t_1}(\xi, t) \tag{3.5}$$
for all $\xi$ and $t$. Furthermore, $X_{t_1}(\xi, t)$ is Lipschitz continuous with respect to $t$.
(2) If $X_{t_1}(\xi_1, t_0) = X_{t_1}(\xi_2, t_0)$ holds for some $\xi_1 < \xi_2$ and $t_0 \ge t_1 > 0$, then $X_{t_1}(\xi_1, t) = X_{t_1}(\xi_2, t)$ for all $t \ge t_0$. Furthermore, the curves $X_{t_1}(\xi, t)$, $\xi \in \mathbb{R}$, cover the strip $\{t \ge t_1\}$.
(3) If $t_2 < t_1$, then $X_{t_2}(\xi, t) = X_{t_1}(\xi', t)$ holds for any $t \ge t_1$, where $\xi' = X_{t_2}(\xi, t_1)$.
(4) Let $U = \{\xi : \exists t > 0 \text{ s.t. } X(\xi - 0, t) \ne X(\xi + 0, t)\}$, where $X(\xi, t)$ is defined by letting $t_1 = 0$ in (3.4). Then for any $\xi \in \mathbb{R} \setminus U$ and almost all $t > 0$,
$$\frac{\partial X(\xi, t)}{\partial t} = u(X(\xi, t), t).$$
(5) Let $\xi_{t_1}(x, t) = \sup\{\xi : X_{t_1}(\xi, t) < x\}$, $\eta_{t_1}(x, t) = \inf\{\xi : X_{t_1}(\xi, t) > x\}$, then the set
Similar to $F(y; x, t)$, we define
$$H_{t_1}(y; x, t) = \int_{0+0}^{y-0} \bigl(X_{t_1}(\eta, t) - x\bigr)\,d\omega(\eta, t_1). \tag{3.6}$$
Since BV functions can always be expressed as the difference of two increasing functions, we assume that $\omega_0(x)$ is increasing. Thus the function $H_{t_1}(y; x, t)$ has a finite lower bound for each fixed point $(x, t)$. It is easy to see that
$$H_{t_1}(\xi_{t_1}(x, t); x, t) = \min_y H_{t_1}(y; x, t) \tag{3.7}$$
always holds due to the fact that $x = X_{t_1}(\xi_{t_1}(x, t), t)$ (see (2), (5) of Lemma 3.2). In other words, $H_{t_1}(y; x, t)$ always achieves its minimum at $\xi_{t_1}(x, t)$. We note here that (3.7) does not always hold when $t_1 = 0$ because the curves $X(\xi, t)$, $\xi \in \mathbb{R}$, do not cover the half space $\mathbb{R}_+^2$.

Define
$$\omega(x, t) = \int_{0+0}^{\xi_{t_1} - 0} d\omega(\eta, t_1), \tag{3.8}$$
$$q_2(x, t) = \int_{0+0}^{\xi_{t_1} - 0} u(X_{t_1}(\eta, t), t)\,d\omega(\eta, t_1). \tag{3.9}$$
In terms of the existence result of Wang and Ding [22], we have $q_{2x} = u\omega_x$ in the sense of the Radon-Nikodym derivative. On the other hand, it is easy to check that Lemma 2.9 still holds here. Therefore we have
$$\frac{\partial}{\partial x} \min_y H_{t_1}(y; x, t) = -\omega(x, t), \qquad \frac{\partial}{\partial t} \min_y H_{t_1}(y; x, t) = q_2(x, t). \tag{3.10}$$
This implies that $\omega(x, t)$ is a weak solution of (3.1).

Now we study the uniqueness of the weak solution of (3.1) in the region $t \ge t_1$. Consider two weak solutions $\omega_1(x, t)$, $\omega_2(x, t)$ satisfying $\omega_1(x, t_1) = \omega_2(x, t_1)$ a.e. on the line $t = t_1 > 0$. Let $z = \omega_1 - \omega_2$, then $(u, z)$ satisfies (3.3) with $z(x, t_1) = 0$ a.e. in $\mathbb{R}^1$. For any $N > 0$, let $\Omega = \{(x, t);\ 0 < t < N/M,\ |x| < N - Mt\}$. To prove the uniqueness, it is sufficient to prove $z(x_0, t_0) = 0$ for any Lebesgue point $(x_0, t_0)$ of $\Omega \cap \{t > t_1\}$.

We first choose a sequence of smooth functions $u_h$ such that $u_h$ converges pointwise to $u$ and $u_{hx} \le \frac{1}{t}$. We note that choosing such an approximating function is possible because we can mollify $u$ by a special smooth kernel introduced by Wang and Ding in [22]. Let $C_h = \iint_\Omega |u_h - u|\,|dz|\,dt$, then $C_h \to 0$ as $h \to +0$. From (3.3), we have
$$\left| \iint \bigl(\psi_t + (\psi u_h)_x\bigr) z\,dx\,dt \right| \le C_h \|\psi\|_{C_0}, \quad \forall \psi \in C_0^\infty(\Omega). \tag{3.11}$$
For any $-N + t_0 M < \xi < N - t_0 M$, there exists a smooth curve $x = \tilde{X}(\xi, t)$ satisfying
$$\frac{dx}{dt} = u_h, \qquad x(t_0) = \xi. \tag{3.12}$$
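Then
$$\tilde{X}(\xi, t) = \xi + \int_{t_0}^{t} u_h(\tilde{X}(\xi, s), s)\,ds. \tag{3.13}$$
We compute
$$\frac{\partial \tilde{X}(\xi, t)}{\partial \xi} = e^{\int_{t_0}^{t} u_{hx}\,ds}. \tag{3.14}$$
We now make a coordinate transformation
$$A : (x, t) \to (\xi, t) \tag{3.15}$$
such that $A^{-1}(\xi, t) = (\tilde{X}(\xi, t), t)$. From (3.13) and (3.14), (3.11) becomes
$$\begin{aligned} \left| \iint \bigl(\psi_t + (\psi u_h)_x\bigr) z\,dx\,dt \right| &= \left| \iint (\psi_t + u_h \psi_x + u_{hx} \psi)\,z\,dx\,dt \right| \\ &= \left| \iint \Bigl( \frac{\partial}{\partial t} \psi(\tilde{X}(\xi, t), t) + u_{hx} \psi \Bigr) e^{\int_{t_0}^{t} u_{hx}\,ds}\,z\,d\xi\,dt \right| \\ &= \left| \iint \Bigl( \psi\,e^{\int_{t_0}^{t} u_{hx}\,ds} \Bigr)_t z\,d\xi\,dt \right| \le C_h \|\psi\|_{C_0}. \end{aligned} \tag{3.16}$$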
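For any $\phi(\xi) \in C_0^\infty(-N + t_0 M, N - t_0 M)$ and $t_2 \in (t_1, t_0)$, let
$$\varphi(t) = \int_{-\infty}^{t} \bigl(\alpha_\varepsilon(s - t_2) - \alpha_\varepsilon(s - t_1)\bigr)\,ds, \qquad \psi(\tilde{X}(\xi, t), t)\,e^{\int_{t_0}^{t} u_{hx}\,ds} = \phi(\xi)\varphi(t), \tag{3.17}$$
where $\alpha_\varepsilon$ is a standard mollifier. Substituting (3.17) into (3.16) gives
$$\left| \iint \phi(\xi)\varphi'(t)\,z\,d\xi\,dt \right| \le C_h \Bigl\| \phi\varphi\,e^{-\int_{t_0}^{t} u_{hx}\,ds} \Bigr\|_{C_0} \le C_h \Bigl\| \phi\varphi\,e^{\int_{t_1}^{t_0} \frac{ds}{s}} \Bigr\|_{C_0} \le \frac{2t_0}{t_1} C_h \|\phi\|_{C_0}, \tag{3.18}$$
or,
$$\left| \iint \phi\,\bigl(\alpha_\varepsilon(t - t_2) - \alpha_\varepsilon(t - t_1)\bigr)\,z\,d\xi\,dt \right| \le \frac{2t_0}{t_1} C_h \|\phi\|_{C_0}. \tag{3.19}$$
Let $\varepsilon \to 0$. We get
$$\left| \int \phi(\xi)\,z(\tilde{X}(\xi, t_2), t_2)\,d\xi \right| \le \frac{2t_0}{t_1} C_h \|\phi\|_{C_0} \tag{3.20}$$
for almost all $t_1 < t_2 < t_0$.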
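By approximation with smooth functions, (3.20) gives
$$\frac{1}{\sqrt{C_h}} \left| \int_{t_0 - \sqrt{C_h}}^{t_0} \int \phi(\xi)\,z(\tilde{X}(\xi, t), t)\,d\xi\,dt \right| \le \frac{2t_0}{t_1} C_h \|\phi\|_{L^\infty} \tag{3.21}$$
for all $\phi \in L^\infty(-N + t_0 M, N - t_0 M)$. Choosing $\phi(\xi) = \frac{1}{2\sqrt{C_h}} \bigl(H(\xi - x_0 + \sqrt{C_h}) - H(\xi - x_0 - \sqrt{C_h})\bigr)$ with $H(s)$ the Heaviside function, we get
$$\frac{1}{2C_h} \left| \iint_D z(\tilde{X}(\xi, t), t)\,d\xi\,dt \right| \le \frac{2t_0}{t_1} C_h, \tag{3.22}$$
where $D = [x_0 - \sqrt{C_h}, x_0 + \sqrt{C_h}] \times [t_0 - \sqrt{C_h}, t_0]$. Since $A^{-1}(D) \subset \{(x, t);\ t_0 - \sqrt{C_h} < t < t_0,\ |x - x_0| < \sqrt{C_h} + M|t - t_0|\}$, we have $\operatorname{mes}(A^{-1}(D)) \le 2(M + 1)C_h$. We compute
$$\begin{aligned} |z(x_0, t_0)| &\le \frac{1}{2C_h} \iint_D |z - z(x_0, t_0)|\,d\xi\,dt + \frac{1}{2C_h} \left| \iint_D z(\tilde{X}(\xi, t), t)\,d\xi\,dt \right| \\ &\le \frac{1}{2C_h} \iint_D |z - z(x_0, t_0)|\,d\xi\,dt + \frac{2t_0}{t_1} C_h \\ &= \frac{1}{2C_h} \iint_{A^{-1}(D)} |z(x, t) - z(x_0, t_0)|\,e^{-\int_{t_0}^{t} u_{hx}\,ds}\,dx\,dt + \frac{2t_0}{t_1} C_h \\ &\le \frac{2(M + 1)}{\operatorname{mes}(A^{-1}(D))} \iint_{A^{-1}(D)} |z(x, t) - z(x_0, t_0)|\,dx\,dt + \frac{2t_0}{t_1} C_h. \end{aligned} \tag{3.23}$$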
Letting $C_h \to +0$ in (3.23), we have $z(x_0, t_0) = 0$. This proves Theorem 3.1.

By Theorem 3.1, any weak solution $\omega(x, t)$ of (3.1) must be given by the following formula if its value at $t = t_1$ is given:
$$\omega(x, t) = \int_{0+0}^{\xi_{t_1} - 0} d\omega(\eta, t_1). \tag{3.24}$$
4. Uniqueness of Entropy Solution

This section is devoted to the uniqueness of the entropy solution. First, we prove the uniqueness of the entropy solution in the region t ≥ t1 > 0 with arbitrary t1 , by the uniqueness results for the linear equation with discontinuous coefficient. In other words, any entropy solution can be explicitly expressed by its value at the time t = t1 > 0. Then, by letting t1 → 0, we show that any entropy solution (ρ, u) of (1.1) coincides with the entropy solution (ρs , us ) constructed in Sect. 2 by the generalized potential F (y; x, t) with the same initial data. This leads to our uniqueness theorem.

Proof of Theorem 2. Suppose that (ρ, u) = (mx , u) is any entropy solution of (1.1) with initial data ρ0 , u0 , where ρ0 is a non-negative Radon measure and u0 is bounded and
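measurable with respect to $\rho_0$. It is easy to see that $m(x, t)$ is a weak solution of
$$m_t + u m_x = 0 \tag{4.1}$$
in the sense of Definition 3.1. Next, we shall construct a function $q(x, t)$ which is a weak solution of the scalar transport equation
$$q_t + u q_x = 0 \tag{4.2}$$
and satisfies $q_x = u m_x$.

We choose a smooth function $\varphi(x) \in C_0^\infty(\mathbb{R})$ with $\int_{-\infty}^{+\infty} \varphi(x)\,dx = 1$. Let $q(x, t) = p(x, t) - f(t)$, where
$$p(x, t) = \int_{0+0}^{x-0} u\,dm, \qquad f(t) = \int_{-\infty}^{+\infty} \varphi(x)\bigl(p(x, t) - p(x, t_1)\bigr)\,dx + \int_{t_1}^{t} h(\tau)\,d\tau, \qquad h(t) = \int_{-\infty}^{+\infty} \varphi(x)\,u\,dp.$$
Obviously, $q_x = p_x = u m_x$. For any $\phi \in C_0^\infty(\mathbb{R} \times \mathbb{R}_{t_1})$ with $\mathbb{R}_{t_1} = \{t \in \mathbb{R} : t \ge t_1\}$, let
$$g(t) = \int_{-\infty}^{+\infty} \phi(y, t)\,dy, \qquad \psi(x, t) = \int_{-\infty}^{x} \bigl(\phi(y, t) - \varphi(y) g(t)\bigr)\,dy,$$
then $g(t) \in C_0^\infty(\mathbb{R}_{t_1})$, $\psi \in C_0^\infty(\mathbb{R} \times \mathbb{R}_{t_1})$ and $\psi_x(x, t) = \phi(x, t) - \varphi(x) g(t)$. Substituting $\psi$ in the second relation of (1.4), we have
$$\iint \psi_t\,u\,dm\,dt + \iint \psi_x\,u^2\,dm\,dt = 0,$$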
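which indicates
$$-\iint \psi_{xt}\,q\,dx\,dt + \iint \psi_x\,u\,dq\,dt = 0,$$
i.e.
$$\iint \bigl(\phi(x, t) - \varphi(x) g(t)\bigr)_t\,q\,dx\,dt - \iint \bigl(\phi(x, t) - \varphi(x) g(t)\bigr)\,u\,dq\,dt = 0. \tag{4.3}$$
From (4.3), we have
$$\begin{aligned} \iint \phi_t\,q\,dx\,dt - \iint \phi\,u\,dq\,dt &= \iint \varphi(x) g_t(t)\,q\,dx\,dt - \iint \varphi(x) g(t)\,u\,dq\,dt \\ &= \iint \varphi(x) g_t(t)\bigl(p(x, t) - f(t)\bigr)\,dx\,dt - \int g(t) h(t)\,dt \\ &= \int g_t(t) \Bigl( \int \varphi(x)\,p\,dx - f(t) + \int_{t_1}^{t} h(\tau)\,d\tau \Bigr)\,dt \\ &= \int_{-\infty}^{+\infty} \varphi(x)\,p(x, t_1)\,dx \int g_t\,dt = 0, \end{aligned} \tag{4.4}$$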
where we have used the definitions of $f(t)$ and $h(t)$. Therefore $q(x, t)$ is a weak solution of (4.2) in the sense of Definition 3.1. By the uniqueness part of Theorem 3.1 and (3.24), we have the following expressions of $m(x, t)$ and $q(x, t)$ in the region $t \ge t_1$:
$$m(x, t) = \int_{0+0}^{\xi_{t_1}(x, t) - 0} dm(\eta, t_1), \tag{4.5}$$
$$q(x, t) = \int_{0+0}^{\xi_{t_1}(x, t) - 0} dq(\eta, t_1), \tag{4.6}$$
where $\xi_{t_1}(x, t) = \sup\{\xi : X_{t_1}(\xi, t) < x\}$ and $X_{t_1}(\xi, t)$ is defined in (3.5). We note that (4.5) and (4.6) are not our desired results because $\xi_{t_1}(x, t)$ depends on $u(x, t)$. However, they provide some information on the structure of the entropy solution. Next, we shall use (4.5), (4.6) to show
$$m(x, t) = -\frac{\partial}{\partial x} \min_y F_{t_1}(y; x, t), \qquad q(x, t) = \frac{\partial}{\partial t} \min_y F_{t_1}(y; x, t), \tag{4.7}$$
in the sense of distributions, where
$$F_{t_1}(y; x, t) = \int_{0+0}^{y-0} \bigl((t - t_1)\,u(\eta, t_1) + \eta - x\bigr)\,dm(\eta, t_1) \tag{4.8}$$
is the generalized potential, which only depends on the value of $(m, q)$ at $t = t_1$. This leads to the uniqueness of the entropy solution in the region $t \ge t_1$. Since
$$m_t + q_x = m_t + u m_x = 0 \tag{4.9}$$
is a conservation law, we introduce the following generalized potential $D_{t_1}(x, t)$:
$$D_{t_1}(x, t) = \int_{(0, t_1)}^{(x, t)} m(x - 0, t)\,dx - q(x - 0, t)\,dt. \tag{4.10}$$
Obviously, $D_{t_1}(x, t)$ is independent of the integral path and $(D_{t_1})_x = m$, $(D_{t_1})_t = -q$ hold in the sense of distributions. Now we prove
$$D_{t_1}(x, t) = -\min_y F_{t_1}(y; x, t) \tag{4.11}$$
by choosing two integral paths.

For any point $(x_0, t_0) \in \mathbb{R}_+^2$, the first integral path is from $(0, t_1)$ to $(\xi_{t_1}(x_0, t_0), t_1)$ along the line $t = t_1$ and from $(\xi_{t_1}(x_0, t_0), t_1)$ to $(x_0, t_0)$ along the curve $L : x = X_{t_1}(\xi_{t_1}(x_0, t_0), t)$. Let $\xi_{t_1} = \xi_{t_1}(x_0, t_0)$ and $\tau_0 = t_0 - t_1$. We compute
$$\begin{aligned} D_{t_1}(x_0, t_0) &= \int_0^{\xi_{t_1}} m(\eta - 0, t_1)\,d\eta + \int_L m(x - 0, t)\,dx - q(x - 0, t)\,dt \\ &= \int_{0+0}^{\xi_{t_1} - 0} (\xi_{t_1} - \eta)\,dm(\eta, t_1) - \tau_0 \int_{0+0}^{\xi_{t_1} - 0} u(\eta, t_1)\,dm(\eta, t_1) + (x_0 - \xi_{t_1}) \int_{0+0}^{\xi_{t_1} - 0} dm(\eta, t_1) \\ &= \int_{0+0}^{\xi_{t_1} - 0} \bigl(x_0 - \eta - \tau_0\,u(\eta, t_1)\bigr)\,dm(\eta, t_1), \end{aligned} \tag{4.12}$$
where we have used (4.5) and (4.6) and the fact that $\xi_{t_1}(x, t) = \xi_{t_1}$ along the curve $X_{t_1}(\xi_{t_1}, t)$.

On the other hand, we choose another integral path that runs from $(0, t_1)$ to $(x_0 - v\tau_0, t_1)$ along the line $t = t_1$ and from $(x_0 - v\tau_0, t_1)$ to $(x_0, t_0)$ along the line $L : x = x_0 + v(t - t_0)$, $t_1 \le t \le t_0$, with arbitrary $v$. We compute
$$\begin{aligned} D_{t_1}(x_0, t_0) &= \int_0^{x_0 - v\tau_0} m(\eta - 0, t_1)\,d\eta + \int_L m\,dx - q\,dt \\ &= \int_0^{x_0 - v\tau_0} m(\eta - 0, t_1)\,d\eta + \int_{t_1}^{t_0} \bigl(v\,m(x - 0, t) - q(x - 0, t)\bigr)\,dt, \end{aligned} \tag{4.13}$$
where $x = x_0 + v(t - t_0)$, $t_1 \le t \le t_0$. Since $v$ is a constant, from (4.1) and (4.2), we have
$$\iint (\phi_t + v\phi_x)(q - vm)\,dx\,dt = \iint \phi\,(u^2 - vu)\,dm\,dt - v \iint \phi\,(u - v)\,dm\,dt = \iint \phi\,(u - v)^2\,dm\,dt \ge 0 \tag{4.14}$$
for any $\phi \ge 0$, $\phi \in C_0^\infty(\mathbb{R}_+^2)$. Let $y = x - vt$, $\hat{\phi}(y, t) = \phi(x, t)$, then (4.14) becomes
$$\iint \frac{\partial \hat{\phi}}{\partial t}\,(\hat{q} - v\hat{m})\,dy\,dt \ge 0. \tag{4.15}$$
Let $\tau = t - t_1$, $t \ge t_1$, then (4.15) implies
$$q(x, t) - v\,m(x, t) \le q(x - v\tau, t_1) - v\,m(x - v\tau, t_1) \quad \text{a.e.,} \tag{4.16}$$
which indicates
$$q(x - 0, t) - v\,m(x - 0, t) \le q(x_0 - v\tau_0 - 0, t_1) - v\,m(x_0 - v\tau_0 - 0, t_1) \tag{4.17}$$
along the line $L : x = x_0 + v(t - t_0)$, $t_1 \le t \le t_0$. From (4.13) and (4.17), we have
$$\begin{aligned} D_{t_1}(x_0, t_0) &\ge \int_0^{x_0 - v\tau_0} m(\eta - 0, t_1)\,d\eta + v\tau_0\,m(x_0 - v\tau_0 - 0, t_1) - \tau_0\,q(x_0 - v\tau_0 - 0, t_1) \\ &= \int_{0+0}^{x_0 - v\tau_0 - 0} (x_0 - v\tau_0 - \eta)\,dm(\eta, t_1) + v\tau_0 \int_{0+0}^{x_0 - v\tau_0 - 0} dm(\eta, t_1) - \tau_0 \int_{0+0}^{x_0 - v\tau_0 - 0} u(\eta, t_1)\,dm(\eta, t_1) \\ &= \int_{0+0}^{x_0 - v\tau_0 - 0} \bigl(x_0 - \eta - \tau_0\,u(\eta, t_1)\bigr)\,dm(\eta, t_1). \end{aligned} \tag{4.18}$$
Since $v$ is arbitrary, we have
$$D_{t_1}(x_0, t_0) \ge \max_y \int_{0+0}^{y-0} \bigl(x_0 - \eta - \tau_0\,u(\eta, t_1)\bigr)\,dm(\eta, t_1). \tag{4.19}$$
Combining (4.12) and (4.19), we have
$$D_{t_1}(x_0, t_0) = \max_y \int_{0+0}^{y-0} \bigl(x_0 - \eta - (t_0 - t_1)\,u(\eta, t_1)\bigr)\,dm(\eta, t_1) = -\min_y F_{t_1}(y; x_0, t_0). \tag{4.20}$$
On the other hand, by the existence Theorem 1, we can construct a standard entropy solution $m_{t_1,s}$, $u_{t_1,s}$ satisfying $m_{t_1,s} = (D_{t_1})_x$, $q_{t_1,s} = -(D_{t_1})_t$, where $(q_{t_1,s})_x = u_{t_1,s}(m_{t_1,s})_x$. We note that the constructive procedure above starts from the line $t = t_1 > 0$. Since $m = (D_{t_1})_x$, $q = -(D_{t_1})_t$, we have $m = m_{t_1,s}$, $q = q_{t_1,s}$ a.e. in $t \ge t_1 > 0$. Therefore the uniqueness of the entropy solution of (1.1) in the region $t \ge t_1 > 0$ is proved.

Next, we shall show that $m(x, t)$, $q(x, t)$ only depend on the initial value. This leads to the uniqueness of the entropy solution for pressureless gases. For any $(x_0, t_0)$, from (3.8), (4.5) and (2.67)–(2.69), we know that $\xi_{t_1}(x_0, t_0)$ belongs to the set of minimum points of
$$F_{t_1}(y; x_0, t_0) = \int_{0+0}^{y-0} \bigl((t_0 - t_1)\,u(\eta, t_1) + \eta - x_0\bigr)\,dm(\eta, t_1)$$
except at denumerably many points, i.e. $\xi_{t_1} \in S_{t_1}(x_0, t_0)$, where
$$S_{t_1}(x_0, t_0) = \{y;\ \exists y_n \to y \text{ s.t. } F_{t_1}(y_n; x_0, t_0) \to \min_y F_{t_1}(y; x_0, t_0)\}. \tag{4.21}$$
Now, we choose a sequence of $t_i > 0$, $i = 1, 2, \cdots$, such that $t_i \to 0$ as $i \to \infty$. By Lemma 2.4 and (2.67)–(2.69), $m$ and $E$ keep constant along the straight lines
$$L_{t_i} : x = x_0 + \frac{x_0 - \xi_{t_i}}{t_0 - t_i}(t - t_0), \quad t_i \le t \le t_0.$$
Furthermore,
$$u(x, t) = \frac{x_0 - \xi_{t_i}}{t_0 - t_i} \tag{4.22}$$
holds along $L_{t_i}$ due to (2.70) and (5) of Lemma 3.2 if $\xi_{t_i}(x_0, t_0) \in \operatorname{spt}\{\rho_0\}$. If $\xi_{t_i}(x_0, t_0) \in \mathbb{R} \setminus \operatorname{spt}\{\rho_0\}$, then $m$, $q$ must be constant in some neighborhood $V_s$ of $(x, t)$, i.e. $\rho = m_x = 0$ in $V_s$. We redefine
$$u(x, t) = \frac{x_0 - \xi_{t_i}}{t_0 - t_i}.$$
Since $\xi_{t_i}(x, t)$ is increasing in $x$, the solution we redefine still satisfies the Oleinik entropy condition. In view of Lemma 3.2, we have $|\xi_{t_i} - \xi_{t_j}| \le M|t_i - t_j|$. Therefore, as $t_i \to 0$, $\xi_{t_i}$ converges toward $H_*(x_0, t_0)$ and $L_{t_i}$ converges to a straight line
$$L_0 : x = x_0 + \frac{x_0 - H_*(x_0, t_0)}{t_0}(t - t_0), \quad 0 \le t \le t_0.$$
Therefore $m(x, t)$ and $q(x, t)$ are constant along the line $L_0$ and $u(x, t) = \frac{x_0 - H_*(x_0, t_0)}{t_0}$. Namely, $m(x_0, t_0)$, $q(x_0, t_0)$ are functions of $H_*(x_0, t_0)$.

It is easy to see that the map from $(x, t)$ to $H_*(x, t)$ is not one to one. If there exist $x_1 < x_2$ such that $H_*(x_1, t) = H_*(x_2, t)$, then for any $x \in (x_1, x_2)$, $H_*(x, t) = H_*(x_1, t)$. On the other hand, each point $(\eta, 0)$ of the $x$-axis generates at least one curve if we choose $t_1 = 0$ in (3.4) and let $\varepsilon_i \to 0$. Define
$$\Omega_s = \{\eta : \forall t > 0,\ \exists x \text{ s.t. } \eta(x, t) - \xi(x, t) > 0 \text{ and } \xi(x, t) \le \eta \le \eta(x, t)\},$$
where
$$\xi(x, t) = \sup\{\xi;\ X(\xi, t) < x\}, \qquad \eta(x, t) = \inf\{\xi;\ X(\xi, t) > x\}.$$
Since the weak solution of (3.1) admits at most countably many shock waves due to Lemma 3.2, $\Omega_s$ is a set of measure zero. It is easy to see that, for any $\eta \in \mathbb{R} \setminus \Omega_s$, there is $t_0 > 0$ such that the curve $X(\eta, t)$, $X(\eta, 0) = \eta$, $0 < t \le t_0$, lies in $\mathbb{R} \setminus \Sigma$, with $\Sigma = \{(x, t);\ \xi(x, t) \ne \eta(x, t),\ t > 0\}$. Therefore $H_*(X(\eta, t), t) = \eta$ holds for any $0 < t \le t_0$ from Lemma 3.2.

Let $w(\eta) = \int_{0+0}^{\eta - 0} dm_0(x) - m(x(\eta), t(\eta))$, where $(x(\eta), t(\eta))$ is any point located on the curve $X(\eta, t)$ and satisfying $H_*(x(\eta), t(\eta)) = \eta$. It is easy to see that $w(\eta)$ is independent of the choice of $(x(\eta), t(\eta))$ if $\eta \in \mathbb{R} \setminus (U \cup \Omega_s)$, with $U$ introduced in (4) of Lemma 3.2. Thus $w(\eta)$ is well defined almost everywhere in $\mathbb{R}^1$, since $U$ has measure zero. For all $\varphi(\eta) \in C_0^\infty(\mathbb{R}^1)$, we compute
$$\left| \int \varphi(\eta) w(\eta)\,d\eta \right| \le \left| \int \varphi(\eta) \Bigl( \int_{0+0}^{\eta - 0} dm(x, t(\eta)) - m(x(\eta), t(\eta)) \Bigr) d\eta \right| + \left| \int \varphi(\eta) \Bigl( \int_{0+0}^{\eta - 0} dm(x, t(\eta)) - \int_{0+0}^{\eta - 0} dm_0(x) \Bigr) d\eta \right|. \tag{4.23}$$
Take any $t_0 > 0$, and let $Y_{t_0} = \{\eta \in \operatorname{spt}\{\varphi\} : \exists t(\eta) = t_0 \text{ s.t. } H_*(x(\eta), t(\eta)) = \eta\}$. Obviously, $Y_{t_0}$ converges to $\operatorname{spt}\{\varphi\} \setminus (\Omega_s \cap U)$ as $t_0 \to 0$, i.e. $\lim_{t_0 \to 0} \operatorname{meas}\{\operatorname{spt}\{\varphi\} \setminus Y_{t_0}\} = 0$. Thus (4.23) becomes
$$\left| \int \varphi(\eta) w(\eta)\,d\eta \right| \le \left| \int \varphi(\eta) \Bigl( \int_{0+0}^{\eta - 0} dm(x, t_0) - m(x(\eta), t_0) \Bigr) d\eta \right| + \left| \int \varphi(\eta) \Bigl( \int_{0+0}^{\eta - 0} dm(x, t_0) - \int_{0+0}^{\eta - 0} dm_0(x) \Bigr) d\eta \right| + C \operatorname{meas}\{\operatorname{spt}\{\varphi\} \setminus Y_{t_0}\}. \tag{4.24}$$
Since the mass is weakly continuous at $t = 0$ and $|x(\eta) - \eta| \le Mt_0$ for $\eta \in Y_{t_0}$, letting $t_0 \to 0$ gives $w(\eta) = 0$ a.e. in $\mathbb{R}^1$. Similarly, let $z(\eta) = \int_{0+0}^{\eta - 0} u_0\,dm_0 - q(x(\eta), t(\eta))$, then we get $z(\eta) = 0$ a.e. in $\mathbb{R}^1$. It is noted that, for any $(x, t) \in \mathbb{R}_+^2$, there exists a point $\eta$ on the $x$-axis such that $H_*(x, t) = \eta$. Thus, we have
$$m(x, t) = \int_{0+0}^{H_*(x, t) - 0} dm_0(\eta), \quad \text{if } \forall \tilde{x} \ne x,\ H_*(\tilde{x}, t) \ne H_*(x, t), \tag{4.25}$$
$$q(x, t) = \int_{0+0}^{H_*(x, t) - 0} u_0(\eta)\,dm_0(\eta), \quad \text{if } \forall \tilde{x} \ne x,\ H_*(\tilde{x}, t) \ne H_*(x, t). \tag{4.26}$$
Next we turn to the case that a whole interval maps into one point $H_*(x, t)$, i.e. $H_*(x, t) \in U$. Without loss of generality, we assume that $H_*(x, t) = 0$ and denote $x_1(t) = X(0 - 0, t)$, $x_2(t) = X(0 + 0, t)$. Obviously, $m(x, t)$, $q(x, t)$ keep constant along the line $x = ct$, $x_1(t)/t < c < x_2(t)/t$. Therefore, the region $V$ surrounded by $x_1(t)$, $x_2(t)$ is convex and $m$, $q$ are functions of $\xi = \frac{x}{t}$. Of course, $u(x, t) = \frac{x}{t}$ is also a function of $\xi$. Since the velocity $u$ is always finite and $x_1(t)$, $x_2(t)$ are monotone functions, $x_1(t)$, $x_2(t)$ have finite limits as $t \to 0$. Let $\xi_1 = \lim_{t \to 0} x_1(t)$, $\xi_2 = \lim_{t \to 0} x_2(t)$. We define
$$m(\xi) = \begin{cases} \lim_{t \to 0} m(x_1(t) - 0, t), & \xi < \xi_1, \\ m(x, t), & \xi_1 < \xi < \xi_2, \\ \lim_{t \to 0} m(x_2(t) + 0, t), & \xi > \xi_2. \end{cases} \tag{4.27}$$
Choosing a smooth function $\phi_\varepsilon$ satisfying $\phi_\varepsilon = 1$ in $(x_1(t) - \varepsilon, x_2(t) + \varepsilon)$ and $\phi_\varepsilon = 0$ in $(-\infty, x_1(t) - 2\varepsilon)$ and $(x_2(t) + 2\varepsilon, +\infty)$, we compute that
$$\int \phi_\varepsilon\,dm(x, t) = \int_{x_1(t) - 2\varepsilon}^{x_1(t) - 0} \phi_\varepsilon\,dm(x, t) + [m(x_1(t), t)] + \int_{\frac{x_1(t)}{t} + 0}^{\frac{x_2(t)}{t} - 0} dm(\xi) + [m(x_2(t), t)] + \int_{x_2(t) + 0}^{x_2(t) + 2\varepsilon} \phi_\varepsilon\,dm(x, t), \tag{4.28}$$
where $[m(x, t)] = m(x + 0, t) - m(x - 0, t)$. Since the first and fifth terms of the right-hand side of (4.28) tend to zero as $\varepsilon \to 0$ and $m_x$ is weakly continuous at $t = 0$, letting $t \to 0$ we have
$$[m_0(0)] = \lim_{t \to 0} [m(x_1(t), t)] + \int_{\xi_1 + 0}^{\xi_2 - 0} dm(\xi) + \lim_{t \to 0} [m(x_2(t), t)] = \int_{\xi_1 - 0}^{\xi_2 + 0} dm(\xi). \tag{4.29}$$
Similarly, we also have
$$[m_0(0)]\,u_0(0) = \int_{\xi_1 - 0}^{\xi_2 + 0} \xi\,dm(\xi), \tag{4.30}$$
$$\frac{1}{2} [m_0(0)]\,u_0^2(0) = \int_{\xi_1 - 0}^{\xi_2 + 0} \frac{1}{2} \xi^2\,dm(\xi), \tag{4.31}$$
where we have used the energy condition in (4.31), i.e. $u^2 m_x$ is weakly continuous at $t = 0$. Combining (4.29), (4.30) and (4.31) gives
$$\int_{\xi_1 - 0}^{\xi_2 + 0} (\xi - u_0(0))^2\,dm(\xi) = 0. \tag{4.32}$$
We claim that from (4.32) we can obtain a formula for $m$, $q$ in terms of the initial value. If $[m_0(0)] = 0$, then $m(\xi)$ is constant in $V$, which implies that
$$m(x, t) = m(x_1(t) - 0, t) = m(x_2(t) + 0, t) = \int_{0+0}^{H_*(x, t) - 0} dm_0(\eta). \tag{4.33}$$
When $[m_0(0)] > 0$, $m(\xi)$ must be discontinuous at $\xi = u_0(0)$ because $m(\xi)$ is increasing in $\xi$. From (4.24), we have $\xi_1 \le u_0(0) \le \xi_2$. This means $m(\xi)$ takes two values determined by $\xi = u_0(0)$. Namely,
$$m(x, t) = \begin{cases} \int_{0+0}^{H_*(x, t) - 0} dm_0(\eta), & \text{if } x < u_0(H_*(x, t))\,t, \\ \int_{0+0}^{H_*(x, t) + 0} dm_0(\eta), & \text{if } x > u_0(H_*(x, t))\,t. \end{cases} \tag{4.34}$$
By (4.34), it is easy to check that
$$q(x, t) = \begin{cases} \int_{0+0}^{H_*(x, t) - 0} u_0(\eta)\,dm_0(\eta), & \text{if } x < u_0(H_*(x, t))\,t, \\ \int_{0+0}^{H_*(x, t) + 0} u_0(\eta)\,dm_0(\eta), & \text{if } x > u_0(H_*(x, t))\,t. \end{cases} \tag{4.35}$$
Therefore any entropy solution $(m(x, t), q(x, t))$ can be expressed by its initial value. Furthermore, $m(x, t)$ and $q(x, t)$ keep constant along the straight line which connects $(x, t)$ and $(H_*(x, t), 0)$. It is observed that the previous argument for $t_1 > 0$ still holds here if $(m, q)$ can be expressed by its initial data. Therefore we have $D_x = m(x, t)$, $D_t = -q(x, t)$, where
$$D(x, t) = \max_y \int_{0+0}^{y-0} \bigl(x - \eta - t\,u_0(\eta)\bigr)\,dm_0(\eta) = -\min_y F(y; x, t). \tag{4.36}$$
Assume that $(m_s, u_s)$ is the standard entropy solution constructed by (4.36), then $m = D_x = m_s$ and $q = -D_t = q_s$ hold almost everywhere in $\mathbb{R}_+^2$. Therefore the uniqueness Theorem 2 is proved.

Finally, we show that the energy condition is necessary for the uniqueness of the entropy solution. Without the energy condition, we are able to construct at least two weak solutions satisfying the Oleinik entropy condition with the same initial data. For example, we consider the initial data
$$(\rho_0, u_0) = \begin{cases} (1, 0), & x < 0, \\ (\delta(x), 1), & x = 0, \\ (1, 2), & x > 0. \end{cases} \tag{4.37}$$
We construct
$$(\rho_1, u_1) = \begin{cases} (1, 0), & x < 0, \\ (0, x/t), & 0 \le x < t, \\ (\delta(x - t), 1), & x = t, \\ (0, x/t), & t < x \le 2t, \\ (1, 2), & x > 2t, \end{cases} \tag{4.38}$$
$$(\rho_2, u_2) = \begin{cases} (1, 0), & x < 0, \\ (1/2t, x/t), & 0 \le x < 2t, \\ (1, 2), & x > 2t. \end{cases} \tag{4.39}$$
Obviously, both $(\rho_1, u_1)$ and $(\rho_2, u_2)$ are weak solutions of (1.1) with initial data (4.37). Moreover, $u_1$, $u_2$ satisfy the Oleinik entropy condition.
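The difference between the two candidates can be checked by integrating over the central fan of (4.39). The sketch below is an illustrative computation, not part of the original argument: it approximates the integrals of $\rho_2$, $\rho_2 u_2$ and $\rho_2 u_2^2$ over $0 \le x \le 2t$ with a midpoint rule, and the function name `fan_moments` is hypothetical. The delta mass in (4.37) carries mass 1, momentum 1 and $u_0^2 \rho_0$-weight 1.

```python
def fan_moments(t, n=100_000):
    """Midpoint-rule integrals of rho, rho*u and rho*u^2 over the fan
    rho = 1/(2t), u = x/t on 0 <= x <= 2t from (4.39)."""
    h = 2.0 * t / n
    rho = 1.0 / (2.0 * t)
    mass = momentum = energy = 0.0
    for k in range(n):
        x = (k + 0.5) * h      # midpoint of the k-th cell
        u = x / t
        mass += rho * h
        momentum += rho * u * h
        energy += rho * u * u * h
    return mass, momentum, energy


mass, momentum, energy = fan_moments(t=0.01)
print(mass, momentum, energy)
```

For every $t > 0$ the fan carries mass 1 and momentum 1, matching the initial delta, but its $u^2\rho$ integral is $4/3$ rather than 1, so the energy is not weakly continuous at $t = 0$ for $(\rho_2, u_2)$.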
By direct computation, $u_1^2 \rho_1$ converges weakly to $u_0^2 \rho_0$. However, this does not hold for $(\rho_2, u_2)$. Therefore $(\rho_1, u_1)$ is the entropy solution of (1.1), (4.37) in the sense of Definition 1.2. This shows that the energy initial condition is necessary for the uniqueness of the weak solution to pressureless gases (1.1).

Acknowledgement. The authors wish to thank Prof. Xiaqi Ding for kindly suggesting this problem and for helpful discussions. The first author also thanks Prof. Alberto Bressan for his kind hospitality at SISSA. The research was partially supported by NSFC Grant No. 19801039.
References

1. Bouchut, F.: On zero pressure gas dynamics. Advances in kinetic theory and computing, Series on Advances in Mathematics for Applied Sciences, Vol. 22, Singapore: World Scientific, 1994, pp. 171–190
2. Bouchut, F. and James, F.: One-dimensional transportation equations with discontinuous coefficients. Nonlinear Analysis TMA 32, 891–933 (1998)
3. Bouchut, F. and James, F.: Duality solutions for pressureless gases, monotone scalar conservation laws and uniqueness. Comm. Part. Diff. Equs. 24, 2173–2190 (1999)
4. Boudin, L.: A solution with bounded expansion rate to the model of viscous pressureless gases. SIAM J. Math. Anal. 32, No. 1, 172–193 (2000)
5. Brenier, Y. and Grenier, E.: Sticky particles and scalar conservation laws. SIAM J. Num. Anal. 35, No. 6, 2317–2328 (1998)
6. Dafermos, C.: Generalized characteristics and the structure of solutions of hyperbolic conservation laws. Indiana Univ. Math. J. 26, No. 6, 1097–1119 (1977)
7. Dal Maso, G., Le Floch, P. and Murat, F.: Definition and weak stability of nonconservative products. J. Math. Pure Appl. 74, 483–548 (1995)
8. Ding, X.: On a non-strictly hyperbolic system. Preprint, Dept. of Math., University of Jyvaskyla, No. 167 (1993)
9. Ding, X. and Wang, Z.: Existence and Uniqueness of Discontinuous Solutions Defined by Lebesgue–Stieltjes Integral. Science in China (Series A) 39, No. 8, 807–819 (1996)
10. E, W., Rykov, Y. and Sinai, Y.: Generalized variational principles, global weak solutions and behavior with random initial data for systems of conservation laws arising in adhesion particle dynamics. Commun. Math. Phys. 177, 349–380 (1996)
11. Hopf, E.: The partial differential equation $u_t + uu_x = \mu u_{xx}$. Comm. Pure Appl. Math. 3, 201–230 (1950)
12. Hu, J.: The Riemann problem for pressureless fluid dynamics with distribution solutions in Colombeau’s sense. Commun. Math. Phys. 194, 191–205 (1998)
13. Huang, F., Li, C. and Wang, Z.: Solutions Containing Delta-Waves of Cauchy Problems for a Nonstrictly Hyperbolic System. Acta Math. Appl. Sinica, 4 (1995)
14. Huang, F.: Existence and uniqueness of discontinuous solutions for a hyperbolic system. Proc. Roy. Soc. Edinb., Scotland, A 127, 1193–1205 (1997)
15. Huang, F.: Existence and uniqueness of discontinuous solutions for a class of nonstrictly hyperbolic system. In: Advances in Nonlinear Partial Differential Equations and related areas, Singapore: Scientific Press, 1998, pp. 187–208
16. Chen, S., Li, J. and Zhang, T.: Explicit construction of measure solutions of the Cauchy problem for the transportation equations. Science in China (Series A) 40, 12, 1287–1299 (1997)
17. Oleinik, O.: Discontinuous solutions of nonlinear differential equations. Am. Math. Soc. Transl. (2) 26, 95–172 (1963)
18. Poupaud, F. and Rascle, M.: Measure solutions to the linear multidimensional transportation with discontinuous coefficients. Comm. Part. Diff. Equs. 22, 337–358 (1997)
19. Sheng, W. and Zhang, T.: The Riemann problem for transportation equations in gas dynamics. Mem. Am. Math. Soc. 564, 137 (1999)
20. Volpert, A.: The space BV and quasilinear equations. Mat. Sb. 73, 255–302 (1967); English translation in Math. USSR, Sb. 21, 225–267 (1967)
21. Wang, Z., Huang, F. and Ding, X.: On the Cauchy problem of transportation equation. Acta Math. Appl. Sinica 2, 113–122 (1997)
22. Wang, Z. and Ding, X.: Uniqueness of generalized solution for the Cauchy problem of transportation equations. Acta Math. Scientia 17, 3, 341–352 (1997)
23. Zeldovich, Y.: Gravitational instability: An approximate theory for large density perturbations. Astron. & Astrophys. 5, 84–89 (1970)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 222, 147 – 179 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
The Low-Temperature Expansion of the Wulff Crystal in the 3D Ising Model

Raphaël Cerf, Richard Kenyon

Université Paris Sud and CNRS, Mathématique, Bâtiment 425, 91405 Orsay Cedex, France. E-mail: [email protected]; [email protected]

Received: 15 February 2001 / Accepted: 11 May 2001
Abstract: We compute the expansion of the surface tension of the 3D random cluster model for q ≥ 1 in the limit where p goes to 1. We also compute the asymptotic shape of a plane partition of n as n goes to ∞. This same shape determines the Wulff crystal to order o(ε) in the 3D Ising model (and more generally in the 3D random cluster model for q ≥ 1) at temperature ε.

1. Introduction

The three-dimensional Ising model on the cubic lattice is one of the most challenging models of modern statistical physics. Many qualitative properties of the model are understood but despite a great deal of attention by physicists, very few exact, quantitative results have been obtained.

In this paper we study the low-temperature expansion of both the surface tension and the Wulff shape in the three-dimensional Ising model, and more generally the three-dimensional random cluster (FK percolation) model. At temperature ε, we obtain an expansion of the surface tension and Wulff shape to order ε.

A closely related probabilistic model which we study is the random plane partition. This is a three-dimensional analogue of a random integer partition; we prove the existence of, and compute a formula for, the asymptotic limit shape for a “3D Young diagram” of a plane partition of n as n → ∞. This is accompanied by a large deviation principle. This limit shape turns out to be the same as the asymptotic Wulff shape for the three-dimensional Ising model (taken near a corner and appropriately scaled). Here are more precise statements of our results.
1.1. The surface tension. We consider the FK percolation (random cluster) model in the three dimensional cubic lattice $\mathbb{L}^3$ with parameters $p, q$. In a finite box $B_M = [-M, M]^3 \cap \mathbb{L}^3$ this is the probability measure $\Phi^{p,q}_M$ on subsets of edges of $B_M$, such that for $C$ a set of edges, $\Phi^{p,q}_M(C)$ is proportional to $(p/(1 - p))^{|C|} q^{n_C}$, where $n_C$ is the number of connected components of $C$. We are interested in the case where $q \ge 1$ and $p$ is close to 1. From [18], we know that there exists a value $p_0(q) < 1$ depending on $q$ such that, for any $p$ in $(p_0(q), 1]$, there exists a unique infinite volume FK measure $\Phi^{p,q}_\infty$ on $\mathbb{L}^3$ corresponding to the parameters $p, q$.

We fix $q \ge 1$. Let $p$ belong to $(p_0(q), 1]$ and let $\nu$ be a vector in the unit sphere $S^2$. Let $A$ be a unit square orthogonal to $\nu$, and let $\operatorname{cyl} A$ be the cylinder $A + \mathbb{R}\nu$. We define the surface tension $\tau(\nu, p)$ depending on the direction $\nu$ and the parameter $p$ as
$$\tau(\nu, p) = \lim_{n \to \infty} -\frac{1}{n^2} \ln \Phi^{p,q}_\infty \left( \begin{array}{l} \text{inside } n \operatorname{cyl} A \text{ there exists a finite set of closed edges } E \text{ which cuts} \\ n \operatorname{cyl} A \text{ in at least two unbounded components and the edges of } E \text{ at} \\ \text{distance less than 6 from } \partial n \operatorname{cyl} A \text{ are at distance less than 6 from } nA \end{array} \right).$$
The existence of the limit follows from the FKG property of the measure $\Phi^{p,q}_\infty$ and a classical subadditivity argument (see [6] for a detailed proof). We know that for a fixed value of $p$ sufficiently close to 1, the map $\tau(\cdot, p) : \nu \in S^2 \mapsto \tau(\nu, p)$ is strictly positive, continuous, and invariant under the isometries which leave $\mathbb{Z}^3$ invariant. Furthermore it satisfies the weak simplex inequality, that is, the homogeneous extension of $\tau(\cdot, p)$ to $\mathbb{R}^3$ is convex.

Theorem 1.1. We have the following expansion of $\tau(\nu, p)$ as $p$ goes to 1: for any $q \ge 1$, uniformly over $\nu$ in $S^2$, as $p$ goes to 1,
$$\tau(\nu, p) = |\nu|_1 \ln\Bigl(\frac{1}{1 - p}\Bigr) - |\nu|_1 \operatorname{ent}(\nu) + o(1),$$
where for any $\nu = (a, b, c)$ in the unit sphere $S^2$ we set
$$|\nu|_1 = |a| + |b| + |c|, \qquad \operatorname{ent}(\nu) = \frac{1}{\pi} L\Bigl(\pi \frac{|a|}{|\nu|_1}\Bigr) + \frac{1}{\pi} L\Bigl(\pi \frac{|b|}{|\nu|_1}\Bigr) + \frac{1}{\pi} L\Bigl(\pi \frac{|c|}{|\nu|_1}\Bigr),$$
and $L$ is the Lobachevsky function given by
$$\forall x \in [0, \pi] \quad L(x) = -\int_0^x \ln(2 \sin t)\,dt.$$
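The entropy term of Theorem 1.1 can be evaluated numerically. The following sketch is an illustration only: it assumes the classical Fourier expansion $-\ln(2 \sin t) = \sum_{k \ge 1} \cos(2kt)/k$ on $(0, \pi)$, which after integration gives $L(x) = \frac{1}{2} \sum_{k \ge 1} \sin(2kx)/k^2$. This series is not taken from the paper, and the function names and truncation level are hypothetical choices.

```python
import math


def lobachevsky(x, terms=20000):
    """L(x) = -int_0^x ln(2 sin t) dt, via the truncated series
    (1/2) * sum_{k>=1} sin(2kx) / k^2 (an assumed, standard expansion)."""
    return 0.5 * sum(math.sin(2 * k * x) / (k * k) for k in range(1, terms + 1))


def ent(a, b, c):
    """ent(nu) for nu = (a, b, c), following the formula of Theorem 1.1."""
    norm1 = abs(a) + abs(b) + abs(c)
    return sum(lobachevsky(math.pi * abs(s) / norm1) for s in (a, b, c)) / math.pi


print(ent(1.0, 1.0, 1.0))  # diagonal direction
print(ent(1.0, 0.0, 0.0))  # coordinate direction
```

Since $\operatorname{ent}$ only depends on the ratios $|a| : |b| : |c|$, the direction need not be normalized here. The entropy vanishes in the coordinate directions, where the term $|\nu|_1 \ln(1/(1-p))$ alone gives the expansion.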
Note that the first two terms of this expansion are independent of $q$. When $q = 1$, we have the surface tension of the Bernoulli percolation model [6]; when $q = 2$, it is the surface tension of the Ising model [4, 7], and when $q$ is an integer larger than 2, it is the surface tension of the $q$-state Potts model [8].

1.2. The Wulff crystal. We denote by $W_\tau$ the Wulff set associated to the surface tension $\tau = \tau(\nu, p)$, also called the crystal of $\tau$,
$$W_\tau = \{x \in \mathbb{R}^3 : x \cdot w \le \tau(w) \text{ for all } w \text{ in } S^2\}. \tag{1}$$
Since $\tau$ is continuous and bounded away from 0, its crystal $W_\tau$ is convex, closed, bounded and contains the origin 0 in its interior [15, Prop. 3.5].
Low-Temperature Expansion of Wulff Crystal in 3D Ising Model
149
The surface energy $\mathcal{I}(A)$ of a set $A$ having a smooth boundary $\partial A$ is defined to be the surface integral of $\tau$ on the boundary of $A$, that is
$$\mathcal{I}(A) = \int_{\partial A} \tau(\nu_A(x))\,d\mathcal{H}^2(x),$$
where $\nu_A(x)$ is the exterior normal vector to $\partial A$ at $x$ and $\mathcal{H}^2$ is the two dimensional Hausdorff measure in $\mathbb{R}^3$. The Wulff Theorem asserts that, up to dilations and translations, the Wulff crystal $W_\tau$ is the unique solution to the isoperimetric problem associated to the surface energy $\mathcal{I}$, that is, it is the unique set enclosing a fixed volume and minimizing the surface energy $\mathcal{I}$. In the case where $\tau$ is constant, we have the classical isoperimetric problem and the Wulff crystal is a Euclidean ball. In the case of anisotropic functions $\tau$, the first attempts to solve this problem are due to Wulff, at the turn of the century [37]. Later Dinghas [12] proved that, among convex polyhedra, the Wulff crystal $W_\tau$ is the solution to the problem. Taylor obtained general existence and uniqueness results in the framework of Geometric Measure Theory [33–35]. Recently, Fonseca and Müller reworked and slightly enhanced these results using the theory of Caccioppoli sets [15, 16].

We recall that the Hausdorff distance between two compact sets $A_1$, $A_2$ is defined by
$$d_H(A_1, A_2) = \max\Bigl( \sup_{a_1 \in A_1} \inf_{a_2 \in A_2} |a_1 - a_2|_2,\ \sup_{a_2 \in A_2} \inf_{a_1 \in A_1} |a_2 - a_1|_2 \Bigr).$$
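For finite point sets the suprema and infima in this definition become maxima and minima, which gives a direct way to experiment with the metric. The sketch below is only an illustration under that assumption (finite samples standing in for compact sets); the function name is hypothetical.

```python
import math


def hausdorff_distance(set_a, set_b):
    """Hausdorff distance between two finite, non-empty point sets,
    using the Euclidean norm as in the definition of d_H."""
    def one_sided(src, dst):
        # sup over src of the distance to the nearest point of dst
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(one_sided(set_a, set_b), one_sided(set_b, set_a))


print(hausdorff_distance([(0, 0, 0)], [(3, 4, 0)]))
print(hausdorff_distance([(0, 0), (1, 0)], [(0, 0)]))
```

The second call shows why both one-sided terms are needed: every point of the smaller set lies in the larger one, yet the distance is 1 because the point (1, 0) is far from the smaller set.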
A direct consequence of Theorem 1.1 is that we can compute an approximation of the Wulff crystal $W_\tau$ as $p$ goes to 1. Here is a corollary to Theorem 1.1:

Corollary 1.2. Let $W_\sigma$ be the Wulff crystal associated to the function $\sigma$ defined by
$$\forall \nu \in S^2 \quad \sigma(\nu, p) = -|\nu|_1 \ln(1 - p) - |\nu|_1 \operatorname{ent}(\nu).$$
Then, for any $q \ge 1$,
$$\lim_{p \to 1} d_H(W_\sigma, W_\tau) = 0.$$

Remark. For $p$ close to 1, the diameter of these Wulff crystals is of order $-\ln(1 - p)$, which tends to infinity as $p \to 1$. If we rescale the Wulff crystal to have volume 1, then the corollary gives an approximation to order $o(1/\log(1 - p))$.

Proof. The functions $\tau$ and $\sigma$ are the support functions of the crystals $W_\tau$ and $W_\sigma$. By an identity due to Rådström [27] and Hörmander [21], we have
$$d_H(W_\sigma, W_\tau) = \sup_{\nu \in S^2} |\sigma(\nu, p) - \tau(\nu, p)|.$$
The result then follows directly from the uniformity of the asymptotic expansion of Theorem 1.1.

The following theorem determines exactly $W_\sigma$ and hence $W_\tau$ to order $o(1)$ in the Hausdorff metric.
Fig. 1. The surface S0
Theorem 1.3. Let ε = −1/ ln(1 − p) and let Tε be the map $T_\varepsilon(x) = \frac{1}{\varepsilon}(1, 1, 1) - x$. The portion of the boundary of Tε(Wτ) inside $]0, 1/\varepsilon[^3$ converges with respect to the Hausdorff metric to the surface
$$S_0 = \{ (f(A,B,C) - \ln A,\ f(A,B,C) - \ln B,\ f(A,B,C) - \ln C) \mid A, B, C > 0 \},$$
where for A, B, C positive,
$$f(A,B,C) = \frac{1}{4\pi^2} \int_{[0,2\pi]} \int_{[0,2\pi]} \ln |A + B e^{iu} + C e^{iv}| \, du \, dv.$$
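The function f can be approximated numerically, which gives a quick way to plot points of S0. The following is a minimal sketch, assuming a plain midpoint rule on an n × n grid; the names `f_torus` and `s0_point` are ours. When A > B + C the integrand is smooth and the rule is very accurate; when A, B, C satisfy the triangle inequality the integrand has logarithmic singularities and the accuracy drops.

```python
import cmath
import math

def f_torus(A, B, C, n=128):
    """Midpoint-rule approximation of
    (1 / 4 pi^2) * integral over [0, 2 pi]^2 of ln|A + B e^{iu} + C e^{iv}| du dv."""
    h = 2.0 * math.pi / n
    phases = [cmath.exp(1j * (k + 0.5) * h) for k in range(n)]
    total = 0.0
    for zu in phases:
        for zv in phases:
            total += math.log(abs(A + B * zu + C * zv))
    # h^2 * total / (4 pi^2) simplifies to total / n^2
    return total / (n * n)

def s0_point(A, B, C, n=128):
    """Point of S0 with parameters (A, B, C), following the formula of Theorem 1.3."""
    value = f_torus(A, B, C, n)
    return (value - math.log(A), value - math.log(B), value - math.log(C))

# For A >= B + C the first coordinate should vanish (the point lies in the yz plane)
print(s0_point(3.0, 1.0, 1.0))
```

Since f(tA, tB, tC) = ln t + f(A, B, C), the point `s0_point(A, B, C)` depends only on the ratio A : B : C.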
The definition of the topology for which this convergence is proved is given in Sect. 3.2. Note that this theorem and the previous corollary do not give any control on the size of the facets of the true crystal. They only determine the shape of the crystal to order o(1). A plot of S0 is shown in Fig. 1.
1.3. Plane partitions. A plane partition of a positive integer n is a collection of nonnegative integers $\{p_{i,j}\}_{1 \le i,j < \infty}$ indexed by pairs (i, j) of positive integers with the properties
$$\sum_{i,j} p_{i,j} = n \quad \text{and} \quad \forall\, i, j \in \mathbb{N} \quad p_{i,j} \ge p_{i+1,j}, \quad p_{i,j} \ge p_{i,j+1}.$$
To a plane partition {pi,j } we associate a 3D Young diagram by putting a column of unit cubes of height pi,j over the unit square centered at (−1/2, −1/2, 0) + (i, j, 0) in the horizontal plane. For convenience, we consider also that the axis planes belong to the 3D Young diagram. The 3D Young diagram provides a way to view a plane partition as a stack of cubes, see Fig. 2 for an example. The exposed surface of the stack of cubes and the axis planes will be called the surface of the 3D Young diagram.
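The correspondence between plane partitions and stacks of cubes is easy to make concrete. The following is a minimal sketch, assuming a plane partition is stored as a list of rows with zero entries omitted; the helper names are ours, and indices in the code start at 0 whereas the text starts at 1.

```python
def entry(rows, i, j):
    """Entry p_{i+1, j+1} of the array, with 0 outside the stored rows."""
    return rows[i][j] if i < len(rows) and j < len(rows[i]) else 0

def is_plane_partition(rows):
    """Check that entries are nonnegative and weakly decreasing in i and in j."""
    ncols = max((len(r) for r in rows), default=0)
    for i in range(len(rows)):
        for j in range(ncols):
            v = entry(rows, i, j)
            if v < 0 or v < entry(rows, i + 1, j) or v < entry(rows, i, j + 1):
                return False
    return True

def young_diagram_cubes(rows):
    """Unit cubes of the 3D Young diagram, each given by its corner nearest the origin."""
    return {(i, j, k)
            for i, row in enumerate(rows)
            for j, height in enumerate(row)
            for k in range(height)}

# Illustrative example: a plane partition of n = 10
example = [[3, 2, 1], [2, 1], [1]]
print(is_plane_partition(example), len(young_diagram_cubes(example)))
```

The number of cubes equals the sum of the entries, so the stack built from a plane partition of n has volume n.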
Fig. 2. 3D Young diagram of a plane partition
Let now $Y_n^{=}$ (respectively $Y_n^{\le}$) be a 3D Young diagram chosen randomly with the uniform distribution over the set of all 3D Young diagrams associated to n (respectively to integers less than or equal to n). The asymptotic form of $Y_n^{=}$ for large n is obtained as the solution of a variational problem for minimizing a certain surface energy while enclosing a fixed volume. The surface energy is exactly that given by the function $-|\nu|_1\, \mathrm{ent}(\nu)$. We prove:
Theorem 1.4. The surface $(\zeta(3)/4)^{-1/3} S_0$ is the asymptotic shape of (the surface of) a random rescaled 3D Young diagram $n^{-1/3} Y_n^{=}$ or $n^{-1/3} Y_n^{\le}$, where $\zeta(3) = \sum_{n \ge 1} 1/n^3$.
See the precise statement in Theorem 3.4. Moreover we prove a large deviation principle for $Y_n^{=}$ and $Y_n^{\le}$.
1.4. Organization. The remainder of the paper is organized as follows. In Sect. 2 we prove Theorem 1.3 using the expansion of Theorem 1.1. This is a straightforward computation using the Wulff construction. We also compute the volume under the surface S0 and we study its smoothness. In Sect. 3 we derive a large deviation principle for 3D Young diagrams and use it to prove Theorem 1.4. This section relies on the results of [10]. In that paper the authors deal with a four-parameter family of “domino tilings”; in the current paper we need a specialization of their results, when one of the parameters is set to 0. Specifically, the four parameters a, b, c, d are edge activities in a dimer model on $\mathbb{Z}^2$. The edge weights a, b, c, d are staggered, with a and b alternating on horizontal edges and c and d alternating on vertical edges of $\mathbb{Z}^2$, in such a way that around each lattice square all four activities occur. As noted in [10], when one of the activities, say d, is zero, the model becomes isomorphic to a dimer model on a honeycomb lattice, with activities a, b, c. This is dual to the lozenge-tiling model which is the model we need in this paper. Sections 4 and 5 below are devoted to the proof of Theorem 1.1.
2. Proof of Theorem 1.3
To determine Wσ, we need to find for each direction ν ∈ S²,
$$r(\nu) = \min_{w \in S^2} \frac{\sigma(w)}{\nu \cdot w}.$$
Then r(ν) will be the radius of Wσ in direction ν. If we extend homogeneously σ to a function on $\mathbb{R}^3 \setminus \{0\}$ by setting $\sigma(\nu) = |\nu|_2\, \sigma(\nu/|\nu|_2)$ for $\nu \in \mathbb{R}^3 \setminus \{0\}$, we have
$$r(\nu) = \min_{w \in \mathbb{R}^3 \setminus \{0\}} \frac{\sigma(w)}{\nu \cdot w}. \tag{2}$$
From Theorem 1.1, setting w = (x, y, z), we see that σ is of the form
$$\sigma(x, y, z) = (|x| + |y| + |z|)\Big(\frac{1}{\varepsilon} - \mathrm{ent}(x, y, z)\Big),$$
where ε = −1/ ln(1 − p) and
$$\mathrm{ent}(x, y, z) = \frac{1}{\pi} L\Big(\pi \frac{|x|}{|w|_1}\Big) + \frac{1}{\pi} L\Big(\pi \frac{|y|}{|w|_1}\Big) + \frac{1}{\pi} L\Big(\pi \frac{|z|}{|w|_1}\Big).$$
By symmetry we need only to work in the positive orthant $O^+ = \{ (x, y, z) \in (\mathbb{R}^+)^3 : x \ge 0,\ y \ge 0,\ z \ge 0,\ x + y + z > 0 \}$. Letting ν = (a, b, c) we have from (2),
$$r(a, b, c) = \min_{(x,y,z) \in O^+} \frac{(x + y + z)(1 - \varepsilon\, \mathrm{ent}(x, y, z))}{\varepsilon (ax + by + cz)}.$$
Setting derivatives with respect to x, y and z respectively equal to zero we find three equations for the minimum (any two of which suffice to determine the minimum), where $\mathrm{ent}_x, \mathrm{ent}_y, \mathrm{ent}_z$ denote the partial derivatives of ent:
$$(ax + by + cz)(1 - \varepsilon\, \mathrm{ent} - \varepsilon (x + y + z)\, \mathrm{ent}_x) - a(x + y + z)(1 - \varepsilon\, \mathrm{ent}) = 0,$$
$$(ax + by + cz)(1 - \varepsilon\, \mathrm{ent} - \varepsilon (x + y + z)\, \mathrm{ent}_y) - b(x + y + z)(1 - \varepsilon\, \mathrm{ent}) = 0,$$
$$(ax + by + cz)(1 - \varepsilon\, \mathrm{ent} - \varepsilon (x + y + z)\, \mathrm{ent}_z) - c(x + y + z)(1 - \varepsilon\, \mathrm{ent}) = 0.$$
Rather than solving for (x, y, z) as a function of (a, b, c), these can be solved for (a, b, c) in terms of (x, y, z). The solution ν = (a, b, c) is only defined up to a constant multiple, which is chosen so that σ(w) = ν · w. Using the fact that $x\, \mathrm{ent}_x + y\, \mathrm{ent}_y + z\, \mathrm{ent}_z = 0$, since ent depends only on the direction of w = (x, y, z) and not on its length, the solution is
$$a = \varepsilon^{-1} - \mathrm{ent} - (x + y + z)\, \mathrm{ent}_x, \quad b = \varepsilon^{-1} - \mathrm{ent} - (x + y + z)\, \mathrm{ent}_y, \quad c = \varepsilon^{-1} - \mathrm{ent} - (x + y + z)\, \mathrm{ent}_z.$$
The interpretation of this solution is that when ν = (a, b, c) is of the above form, then w = (x, y, z) minimizes σ(w)/(ν · w) (and the minimum value is 1). Therefore as
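(x, y, z) runs over S², (a, b, c) gives a parametric representation of ∂Wσ. Thus, we have a parametric representation of Wσ in the positive orthant, given by
$$\partial W_\sigma = \big\{ (\varepsilon^{-1} - \mathrm{ent} - (x + y + z)\, \mathrm{ent}_x,\ \varepsilon^{-1} - \mathrm{ent} - (x + y + z)\, \mathrm{ent}_y,\ \varepsilon^{-1} - \mathrm{ent} - (x + y + z)\, \mathrm{ent}_z) \mid (x, y, z) \in O^+ \big\}.$$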
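If we translate by $(-\varepsilon^{-1}, -\varepsilon^{-1}, -\varepsilon^{-1})$ so that the corner of the cube (to which Wσ tends as p → 1) is at the origin in $\mathbb{R}^3$, then the surface $\partial W_\sigma - (\varepsilon^{-1}, \varepsilon^{-1}, \varepsilon^{-1})$ is −1 times the fixed surface S0 defined by the parametric equation
$$S_0 = \big\{ (\mathrm{ent} + (x + y + z)\, \mathrm{ent}_x,\ \mathrm{ent} + (x + y + z)\, \mathrm{ent}_y,\ \mathrm{ent} + (x + y + z)\, \mathrm{ent}_z) \mid (x, y, z) \in O^+ \big\}.$$
Recall
$$\mathrm{ent}(x, y, z) = \frac{1}{\pi} \Big( L\Big(\frac{\pi x}{x + y + z}\Big) + L\Big(\frac{\pi y}{x + y + z}\Big) + L\Big(\frac{\pi z}{x + y + z}\Big) \Big),$$
where
$$L(x) = -\int_0^x \ln(2 \sin t)\, dt$$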
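is the Lobachevsky function. We have $L'(x) = -\ln(2 \sin x)$. Setting
$$\theta_x = \frac{\pi x}{x + y + z}, \quad \theta_y = \frac{\pi y}{x + y + z}, \quad \theta_z = \frac{\pi z}{x + y + z},$$
a short computation gives
$$(x + y + z)\, \mathrm{ent}_x(x, y, z) = \frac{\theta_x}{\pi} \ln \sin \theta_x + \frac{\theta_y}{\pi} \ln \sin \theta_y + \frac{\theta_z}{\pi} \ln \sin \theta_z - \ln \sin \theta_x.$$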
Similar expressions hold for $(x + y + z)\, \mathrm{ent}_y$ and $(x + y + z)\, \mathrm{ent}_z$. Note that ent and $(x + y + z)\, \mathrm{ent}_x$ depend only on the direction of (x, y, z), not on its length. From [10] we have the identity
$$\mathrm{ent}(x, y, z) + \frac{\theta_x}{\pi} \ln \sin \theta_x + \frac{\theta_y}{\pi} \ln \sin \theta_y + \frac{\theta_z}{\pi} \ln \sin \theta_z = \frac{1}{4\pi^2} \int_0^{2\pi} \int_0^{2\pi} \ln \big| \sin \theta_x + e^{iu} \sin \theta_y + e^{iv} \sin \theta_z \big| \, du \, dv.$$
Replacing $(\sin \theta_x, \sin \theta_y, \sin \theta_z)$ with (A, B, C) satisfying $A : B : C = \sin \theta_x : \sin \theta_y : \sin \theta_z$, S0 can be written as in the statement of the theorem. This completes the proof.
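The identity quoted from [10] can be checked numerically in a given direction. The following is a rough sketch, assuming midpoint rules for both the Lobachevsky integral and the torus integral; the function names and grid sizes are ours. Because the torus integrand has logarithmic singularities here, the two sides are expected to agree only to a few digits.

```python
import cmath
import math

def lobachevsky(x, steps=20000):
    """L(x) = -integral_0^x ln(2 sin t) dt, midpoint rule, for 0 <= x <= pi."""
    if x == 0.0:
        return 0.0
    h = x / steps
    return -h * sum(math.log(2.0 * math.sin((k + 0.5) * h)) for k in range(steps))

def ent(x, y, z):
    """ent(x, y, z) = (1/pi) * sum of L(pi * coordinate / (x + y + z))."""
    s = x + y + z
    return sum(lobachevsky(math.pi * c / s) for c in (x, y, z)) / math.pi

def f_torus(A, B, C, n=200):
    """Midpoint rule for (1 / 4 pi^2) * double integral of ln|A + B e^{iu} + C e^{iv}|."""
    h = 2.0 * math.pi / n
    phases = [cmath.exp(1j * (k + 0.5) * h) for k in range(n)]
    return sum(math.log(abs(A + B * zu + C * zv))
               for zu in phases for zv in phases) / (n * n)

def identity_sides(x, y, z):
    """Left and right hand sides of the identity, for a direction in the open orthant."""
    s = x + y + z
    thetas = [math.pi * c / s for c in (x, y, z)]
    lhs = ent(x, y, z) + sum(t / math.pi * math.log(math.sin(t)) for t in thetas)
    rhs = f_torus(*(math.sin(t) for t in thetas))
    return lhs, rhs

print(identity_sides(1.0, 1.0, 1.0))
```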
Fig. 3. The intersection of the surface with the plane y = z; the horizontal axis is the x-axis and the vertical is the y-coordinate (or z-coordinate)
2.1. Properties of S0. When A ≥ B + C, we have f(A, B, C) = ln A. The part of the surface S0 described by these parameters is given by the set of points (0, ln(A/B), ln(A/C)) as A, B, C vary while satisfying A ≥ B + C and B, C > 0. This set consists of the points in the yz plane lying above the curve parametrized by $(0, \ln\frac{B+C}{B}, \ln\frac{B+C}{C})$, which is the curve $\{(0, y, z) \mid e^{-y} + e^{-z} = 1\}$. In particular the “curved” part of S0 intersects the yz plane in the curve $\{(0, y, z) \mid e^{-y} + e^{-z} = 1\}$, the xy plane in the curve $\{(x, y, 0) \mid e^{-x} + e^{-y} = 1\}$, and the xz-plane in the curve $\{(x, 0, z) \mid e^{-x} + e^{-z} = 1\}$. Surprisingly, each of these curves is, up to scale, the boundary of the asymptotic Wulff crystal of the two-dimensional Ising model when the temperature goes to 0 (and the asymptotic shape of the 2D Young diagram of a uniform partition of n), see [9, 36, 32]. Another property of the surface S0 is that it is $C^1$ but not $C^2$ at the points where it touches the axis planes. For example, when (A, B, C) = (2 − δ, 1, 1) we have
$$f(2 - \delta, 1, 1) = \ln 2 - \frac{\delta}{2} + \frac{2}{3\pi} \delta^{3/2} + O(\delta^2),$$
so that the intersection of S0 with the plane y = z is a curve which, near x = 0, is parametrized by
$$\big(f(2 - \delta, 1, 1) - \ln(2 - \delta),\ f(2 - \delta, 1, 1),\ f(2 - \delta, 1, 1)\big) = \Big( \frac{2}{3\pi} \delta^{3/2} + O(\delta^2),\ \ln 2 - \frac{\delta}{2} + O(\delta),\ \ln 2 - \frac{\delta}{2} + O(\delta) \Big),$$
so that $x = (c_1 - c_2 y)^{3/2}$ near x = 0 for constants $c_1, c_2$. The actual curve is shown in Fig. 3. Notice that the facets of the Wulff crystal in the Ising model still exist for fixed small temperature [5, 24]. We finally compute the volume under S0.
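Proposition 2.1. The volume under the surface S0 is equal to ζ(3)/4.
Proof. This volume is $\int_{\mathbb{R}^2} z\, dx\, dy$, where (x, y, z) are given by the parametric equation of S0 in the canonical basis. We proceed in two steps. First change coordinates from x, y, z to A, B, C. Here we let A, B be the independent variables and fix C = 1; the curved part of S0 corresponds to pairs A, B for which {A, B, 1} satisfies the triangle inequality. We have the identity ([10])
$$\frac{\partial}{\partial A} f(A, B, C) = \frac{\theta_A}{A\pi},$$
where $\theta_A$ is the angle opposite edge A in a triangle of edge lengths A, B, C. Similar expressions hold for B and C. We compute
$$dx = df(A, B, 1) - d \ln A = \Big( \frac{\theta_A}{\pi A} - \frac{1}{A} \Big) dA + \frac{\theta_B}{\pi B}\, dB,$$
$$dy = df(A, B, 1) - d \ln B = \frac{\theta_A}{\pi A}\, dA + \Big( \frac{\theta_B}{\pi B} - \frac{1}{B} \Big) dB,$$
yielding
$$dx\, dy = \frac{\theta_C}{\pi A B}\, dA\, dB,$$
where we used $\theta_A + \theta_B + \theta_C = \pi$. Changing coordinates again, to $\theta_A, \theta_B$ (where $\theta_C = \pi - \theta_A - \theta_B$) we have $A = \sin \theta_A / \sin \theta_C$ and $B = \sin \theta_B / \sin \theta_C$. A short computation gives
$$\frac{\theta_C}{\pi} \frac{dA\, dB}{AB} = \frac{\theta_C}{\pi}\, d \ln A\, d \ln B = \frac{\theta_C}{\pi}\, d\theta_A\, d\theta_B.$$
Now
$$f(A, B, 1) = \frac{1}{\pi} \big( L(\theta_A) + L(\theta_B) + L(\theta_C) \big) + \frac{\theta_A}{\pi} \ln A + \frac{\theta_B}{\pi} \ln B.$$
We use the expansion
$$L(\theta) = \frac{1}{2} \sum_{n=1}^{\infty} \frac{\sin(2n\theta)}{n^2}.$$
We have
$$\int_0^{\pi} \int_0^{\pi - \theta_A} \sin(2n\theta_A)\, \frac{\theta_C}{\pi}\, d\theta_B\, d\theta_A = \frac{\pi}{4n}$$
with the same expression holding for $\sin(2n\theta_B)$; and
$$\int_0^{\pi} \int_0^{\pi - \theta_A} \sin(2n\theta_C)\, \frac{\theta_C}{\pi}\, d\theta_B\, d\theta_A = 0.$$
In particular
$$\int_0^{\pi} \int_0^{\pi - \theta_A} \frac{1}{\pi} \big( L(\theta_A) + L(\theta_B) + L(\theta_C) \big) \frac{\theta_C}{\pi}\, d\theta_B\, d\theta_A = \frac{1}{4} \sum_{n=1}^{\infty} \frac{1}{n^3} = \frac{1}{4} \zeta(3).$$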
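The integral of the terms
$$\frac{\theta_A}{\pi} \ln \frac{\sin \theta_A}{\sin \theta_C} + \frac{\theta_B}{\pi} \ln \frac{\sin \theta_B}{\sin \theta_C}$$
is easily shown to be zero. In conclusion we have
$$\int_0^{\pi} \int_0^{\pi - \theta_A} f(A, B, 1)\, \frac{\theta_C}{\pi}\, d\theta_B\, d\theta_A = \frac{1}{4} \zeta(3).$$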
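The elementary integrals used in the proof of Proposition 2.1 can be confirmed numerically. The following is a small sketch, assuming a midpoint rule on the triangle of angles; `triangle_integral` and `zeta3_over_4` are our names. Combined with the expansion of L, the first two integrals each contribute ζ(3)/8, which gives the total ζ(3)/4.

```python
import math

def triangle_integral(g, m=400):
    """Midpoint rule for the integral of g(theta_A, theta_B) * theta_C / pi over
    theta_A, theta_B > 0, theta_A + theta_B < pi, with theta_C = pi - theta_A - theta_B."""
    h = math.pi / m
    total = 0.0
    for i in range(m):
        a = (i + 0.5) * h
        for j in range(m - i):
            b = (j + 0.5) * h
            c = math.pi - a - b
            if c > 0.0:  # cells centred on the hypotenuse carry weight 0
                total += g(a, b) * c / math.pi
    return total * h * h

n = 1
print(triangle_integral(lambda a, b: math.sin(2 * n * a)), math.pi / (4 * n))
print(triangle_integral(lambda a, b: math.sin(2 * n * (math.pi - a - b))))

# Partial sum of the series giving the volume zeta(3) / 4
zeta3_over_4 = sum(1.0 / k ** 3 for k in range(1, 10001)) / 4.0
print(zeta3_over_4)
```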
3. Plane Partitions
3.1. Monotone sets and entropy. Our goal is to derive a large deviation principle and a law of large numbers for the rescaled random 3D Young diagrams $n^{-1/3} Y_n^{=}$ and $n^{-1/3} Y_n^{\le}$. To this end, we need first to define our topological framework and to embed our random objects in a continuous space. We consider the space $\mathcal{E}$ consisting of closed subsets E of $(\mathbb{R}^+)^3$ having finite volume ($\mathcal{L}^3(E) < \infty$) and satisfying the following monotonicity property: for any $(x, y, z) \in (\mathbb{R}^+)^3$,
$$(x, y, z) \in E \ \Rightarrow\ [0, x] \times \{y\} \times \{z\} \subset E, \quad \{x\} \times [0, y] \times \{z\} \subset E, \quad \{x\} \times \{y\} \times [0, z] \subset E.$$
For convenience, we impose also that the axis planes are included in E. Let P111 be the plane containing the origin and orthogonal to the vector (1, 1, 1), i.e., P111 = { (a, b, c) ∈ R3 : a + b + c = 0 }. The boundary of an element E of E can be conveniently parametrized by looking at the height of E over the plane P111 , that is, to E we associate the function fE : P111 → R+ defined by ∀ x ∈ P111
fE (x) = sup{ t ∈ R+ : x + t (1, 1, 1) ∈ E }.
The set E can be recovered from its height function fE , indeed E = { (a, b, c) ∈ (R+ )3 : a + b + c ≤ fE (π111 (a, b, c)) }, where π111 is the projection on P111 parallel to the direction (1, 1, 1). The monotonicity condition satisfied by the set E implies that ∀ x, y ∈ P111
fE (y) ≥ fE (x) − |x − y|2 ,
hence the height function fE belongs to the space Lip1 (P111 , R+ ) of the maps from P111 to R+ which are Lipschitz with Lipschitz constant 1. In particular the boundary of an element E of E admits a Lipschitz parametrization. By Rademacher’s Theorem, a Lipschitz function is differentiable almost everywhere with respect to the Lebesgue measure, hence the set E admits a tangent plane at (x, fE (x)) for λ almost all x, where λ is the planar Lebesgue measure in the plane P111 . We denote by νEclassic (x) the normal vector at (x, fE (x)). Let axis(·) be the height function associated to the axis planes, that is ∀ x ∈ P111
axis(x) = inf{ t : x + t (1, 1, 1) ∈ (R+ )3 }.
We define the domain of E as $\mathrm{dom}(E) = \{ x \in P_{111} : f_E(x) > \mathrm{axis}(x) \}$ and the entropy of E as
$$\mathrm{ent}(E) = \int_{\mathrm{dom}(E)} \mathrm{ent}(\nu_E^{\mathrm{classic}}(x))\, d\lambda(x),$$
where ent is the function appearing in Theorem 1.1; notice that in the integral we do not take into account the boundary points lying in the axis planes. A more flexible way to define the entropy is to work in the space of functions having bounded variation. Since f is Lipschitz, it is absolutely continuous and it belongs to the space BVloc (P111 ), so that its distributional derivative is the Radon measure having density ∇fE with respect to λ, the Lebesgue measure in P111 . As in [31, Formula (15)], to the entropy function |ν|1 ent(ν) we associate the convex set S defined by S = { x ∈ (R+ )3 : ∀ ν ∈ (R+ )3
x · ν ≥ |ν|1 ent(ν) }.
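Since $|\nu|_1\, \mathrm{ent}(\nu)$ is concave, we have the dual relation
$$\forall\, \nu \in (\mathbb{R}^+)^3 \quad |\nu|_1\, \mathrm{ent}(\nu) = \inf_{x \in S} x \cdot \nu.$$
Let $C_0^1(\mathbb{R}^3, S)$ be the set of the $C^1$ vector fields with compact support in $\mathbb{R}^3$ and taking values in S. Like in [6, Chap. 6], it can be shown that
$$\mathrm{ent}(E) = \inf \Big\{ \int_E \mathrm{div}\, g(x)\, d\mathcal{L}^3(x) : g \in C_0^1(\mathbb{R}^3, S) \Big\}.$$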
From this last expression, it is obvious that the entropy ent is upper semicontinuous in the topology $L^1_{loc}(P_{111})$, as well as in the stronger topology $L^1(P_{111})$.
Lemma 3.1. The function ent is upper semicontinuous in the topology $L^1_{loc}(P_{111})$.
Another way to prove this lemma is to rely on the specific fact that we work with the set of functions $\mathrm{Lip}_1(P_{111}, \mathbb{R}^+)$. On $\mathrm{Lip}_1(P_{111}, \mathbb{R}^+)$, the topology $L^1_{loc}(P_{111})$ agrees with the topology of uniform convergence on compact subsets. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of functions belonging to $\mathrm{Lip}_1(P_{111}, \mathbb{R}^+)$ and converging in $L^1_{loc}(P_{111})$ (and therefore uniformly on compact subsets of $P_{111}$) towards f. By [10, Lemma 2.1], for any compact set K,
$$\limsup_{n \to \infty} \int_K \mathrm{ent}(\nabla f_n)\, d\lambda \le \int_K \mathrm{ent}(\nabla f)\, d\lambda.$$
We can write $P_{111}$ as the union $P_{111} = \bigcup_{m \in \mathbb{N}} K_m$, where the sets $K_m$, m ∈ N, are compact and satisfy
$$\forall\, m_1, m_2 \in \mathbb{N}, \quad m_1 \ne m_2, \quad \lambda(K_{m_1} \cap K_{m_2}) = 0.$$
Thus
$$\limsup_{n \to \infty} \mathrm{ent}(f_n) = \limsup_{n \to \infty} \sum_{m \in \mathbb{N}} \int_{K_m} \mathrm{ent}(\nabla f_n)\, d\lambda \le \sum_{m \in \mathbb{N}} \limsup_{n \to \infty} \int_{K_m} \mathrm{ent}(\nabla f_n)\, d\lambda \le \sum_{m \in \mathbb{N}} \int_{K_m} \mathrm{ent}(\nabla f)\, d\lambda = \mathrm{ent}(f).$$
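3.2. Topology on $\mathcal{E}$. We endow $\mathcal{E}$ with the topology H of Hausdorff convergence on compact sets, that is, the topology whose basis elements are
$$\big\{ F \in \mathcal{E} : d_H(F \cap K, E \cap K) < \varepsilon \big\}, \quad \varepsilon > 0, \quad E \in \mathcal{E}, \quad K \text{ compact subset of } (\mathbb{R}^+)^3,$$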
where $d_H$ is the Hausdorff metric. For other possible topologies on $\mathcal{E}$, as well as their relationships with the topology H, see [30]. It is likely that our results hold with a finer topology, for instance the topology $L^1$. If we express the topology H on the space $\mathrm{Lip}_1(P_{111}, \mathbb{R}^+)$ through the map $E \in \mathcal{E} \mapsto f_E \in \mathrm{Lip}_1(P_{111}, \mathbb{R}^+)$, the corresponding functional topology is the topology of uniform convergence over compact subsets of $P_{111}$. This topology is metrizable: if $(K_n)_{n \in \mathbb{N}}$ is an increasing sequence of compact sets such that $(\mathbb{R}^+)^3 = \bigcup_{n \in \mathbb{N}} K_n$, setting for E, F in $\mathcal{E}$,
$$\mathrm{dist}(E, F) = \sum_{n \in \mathbb{N}} 2^{-n} \min\Big( \frac{d_H(E \cap K_n, F \cap K_n)}{\mathrm{diam}\, K_n},\ 1 \Big),$$
we get a metric compatible with this topology. A standard diagonal argument shows that for any α > 0, the subset $\mathcal{E}_\alpha = \{ E \in \mathcal{E} : \mathcal{L}^3(E) \le \alpha \}$ is compact. Indeed, let $(E_n)_{n \in \mathbb{N}}$ be a sequence in $\mathcal{E}_\alpha$ and let $(f_{E_n})_{n \in \mathbb{N}}$ be the associated height functions in $\mathrm{Lip}_1(P_{111}, \mathbb{R}^+)$. Since
$$\forall\, n \in \mathbb{N} \quad \alpha \ge \mathcal{L}^3(E_n) \ge \big( f_{E_n}(0)/\sqrt{3} \big)^3,$$
the sequence $(f_{E_n}(0))_{n \in \mathbb{N}}$ is bounded. The functions $f_{E_n}$, n ∈ N, being Lipschitz with Lipschitz constant 1, by a diagonal argument, we can extract a subsequence (which we redenote by $f_{E_n}$) which converges uniformly on any compact subset of $P_{111}$ towards a function f. Let E be the element of $\mathcal{E}$ associated to f. Then the sequence $(E_n)_{n \in \mathbb{N}}$ converges towards E with respect to the topology H and by the Fatou Lemma,
$$\mathcal{L}^3(E) = \int_{P_{111}} |f(x) - \mathrm{axis}(x)|\, d\lambda(x) \le \liminf_{n \to \infty} \int_{P_{111}} |f_{E_n}(x) - \mathrm{axis}(x)|\, d\lambda(x) \le \alpha.$$
The topology H on E is identical to the topology L1loc (P111 ), hence the entropy ent is upper semicontinuous when E is endowed with the topology H.
3.3. Large deviation principle. We are now ready to state our large deviation principle. Notice that the rescaled 3D Young diagrams $n^{-1/3} Y_n^{=}$ and $n^{-1/3} Y_n^{\le}$ belong to the space $\mathcal{E}_1 = \{ E \in \mathcal{E} : \mathcal{L}^3(E) \le 1 \}$.
Theorem 3.2. The sequences $(n^{-1/3} Y_n^{=})_{n \in \mathbb{N}}$ and $(n^{-1/3} Y_n^{\le})_{n \in \mathbb{N}}$ satisfy a large deviation principle in $\mathcal{E}_1$ endowed with the topology H, with speed $n^{2/3}$, governed by the good rate function I defined by
$$\forall\, E \in \mathcal{E}_1 \quad I(E) = \sup_{F \in \mathcal{E}_1} \mathrm{ent}(F) - \mathrm{ent}(E),$$
i.e., for ∗ equal either to = or to ≤, for any open subset O of $(\mathcal{E}_1, H)$,
$$-\inf \{ I(E) : E \in O \} \le \liminf_{n \to \infty} \frac{1}{n^{2/3}} \ln P\big( n^{-1/3} Y_n^{*} \in O \big), \tag{3}$$
and, for any closed subset F of $(\mathcal{E}_1, H)$,
$$\limsup_{n \to \infty} \frac{1}{n^{2/3}} \ln P\big( n^{-1/3} Y_n^{*} \in F \big) \le -\inf \{ I(E) : E \in F \}. \tag{4}$$
Proof. For n ∈ N, we denote by N(n) the number of 3D Young diagrams associated to n. We deal first with the large deviations lower bound for $(Y_n^{=})_{n \in \mathbb{N}}$. Let E belong to $\mathcal{E}_1$ and let $f_E$ be the associated height function. By density, we need only to consider the case $\mathcal{L}^3(E) < 1$. For an M > 0 we set $E_M = E \cap [0, M]^3$ and $R_M^* = \pi_{111}([0, M]^3)$. There is a one to one correspondence between random lozenge tilings (with step size $n^{-1/3}$) of the region $R_M^*$, and 3D Young diagrams contained in $[0, M]^3$. Each such 3D Young diagram has an associated height function h which agrees with axis(·) outside of $R_M^*$. We are interested in 3D Young diagrams whose height function h lies close to $f_{E_M}$. The volume of the 3D Young diagram with height function h is less than
$$\mathcal{L}^3(E_M) + \lambda(R_M^*) \sup_{x \in R_M^*} |f_{E_M}(x) - h(x)|.$$
Now let ε > 0 and take M large enough that $\mathrm{ent}(f_{E_M}) \ge \mathrm{ent}(E) - \varepsilon$. Since the entropy ent vanishes in the axis directions, we have $I(E_M) \le I(E)$. Since $\mathcal{L}^3(E) < 1$, there exists δ > 0 small enough so that
$$\mathcal{L}^3(E_M) + \delta \lambda(R_M^*) < 1.$$
By [10, Theorem 4.3], for n large enough, the number of height functions h over $R_M^*$ corresponding to random tilings such that $\sup_{x \in R_M^*} |f_{E_M}(x) - h(x)| < \delta$ is larger than
$$\exp\big( n^{2/3} (\mathrm{ent}(f_{E_M}) - \varepsilon) \big).$$
Each such height function corresponds to a rescaled 3D Young diagram associated to an integer m with m ≤ n. By adding a column of n − m cubes along the z axis we obtain a 3D Young diagram $y_n$ associated to n which still satisfies
$$d_H(n^{-1/3} y_n \cap [0, M]^3, E_M) < 2\delta$$
for n large enough, say $n^{-1/3} < \delta$ (notice that we use here the fact that the axis planes belong to the set E). Thus, using the inequality $\mathrm{ent}(f_{E_M}) \ge \mathrm{ent}(E) - \varepsilon$,
$$\frac{1}{n^{2/3}} \ln P\big( d_H(n^{-1/3} Y_n^{=} \cap [0, M]^3, E_M) < 2\delta \big) \ge \mathrm{ent}(E) - 2\varepsilon - \frac{1}{n^{2/3}} \ln N(n). \tag{5}$$
We deal next with the large deviations upper bound for $(Y_n^{\le})_{n \in \mathbb{N}}$. Let E belong to $\mathcal{E}_1$. For M > 0, we set $E_M = E \cap [0, M]^3$. Notice that the map $t \in \mathbb{R}^+ \mapsto H^2(E \cap \{x = t\}) \in \mathbb{R}^+$ decreases monotonically to 0 as t increases to ∞ (here $H^2$ is the two dimensional Hausdorff measure in $\mathbb{R}^3$ and {x = t} is the plane consisting of the points whose first coordinate is equal to t). Let ε > 0. We choose $M_1$ large enough so that $H^2(E \cap \{x = M_1\}) < \varepsilon$. We proceed similarly for the two other axis directions to get $M_2, M_3$. Let also $M_4$ be such that $I(E_{M_4}) \ge I(E) - \varepsilon$. Let $M = \max(M_1, M_2, M_3, M_4, 1/\varepsilon)$. Since E is closed, by the dominated convergence theorem,
$$\lim_{\delta \to 0} H^2\big( \{ a \in [0, M]^3 : d(a, E) < \delta \} \cap \{x = M\} \big) = H^2(E_M \cap \{x = M\}),$$
and there exists δ1 > 0 such that ∀F ∈ E
dH (EM , F ∩ [0, M]3 ) < δ1 ⇒ H2 (F ∩ ({M} × [0, M]2 )) < 2ε.
We proceed similarly for the two other axis directions to get $\delta_2, \delta_3$. By [10, Theorem 4.3], there exists $\delta_4 > 0$ such that, for n sufficiently large, the number of tilings of $R_M^* = \pi_{111}([0, M]^3)$, whose corresponding height function h satisfies
$$\sup_{x \in R_M^*} |f_{E_M}(x) - h(x)| < \delta_4$$
is less than $\exp\big( n^{2/3} (\mathrm{ent}(E_M) + \varepsilon) \big)$. We set $\delta = \min(\delta_1, \delta_2, \delta_3, \delta_4)$. We have then ∀F ∈ E
dH (EM , F ∩ [0, M]3 ) < δ ⇒ H2 (F ∩ ({M} × [0, M]2 )) < 2ε,
H2 (F ∩ ([0, M] × {M} × [0, M])) < 2ε,
H2 (F ∩ ([0, M]2 × {M})) < 2ε.
We will now compute an upper bound on the number of 3D Young diagrams yn such that dH (n−1/3 yn ∩ [0, M]3 , EM ) < δ. Let yn be such a 3D Young diagram. By the choice of M, δ, we have H2 (n−1/3 yn ∩ ({M} × [0, M]2 )) < 2ε, and moreover MH2 (n−1/3 yn ∩ ({M} × ((R+ )2 \ [0, M]2 ))) ≤ 1, hence
H2 (n−1/3 yn ∩ ({M} × (R+ )2 )) < 3ε.
To $y_n$ we associate a height function over $R_M^*$ given by $f_{n^{-1/3} y_n}|_{R_M^*}$ (the restriction of $f_{n^{-1/3} y_n}$ to $R_M^*$) as well as the three 3D Young diagrams $y_n^{i,M}$, 1 ≤ i ≤ 3, corresponding to $y_n \setminus [0, n^{1/3} M]^3$. More precisely,
$$y_n^{1,M} = \big( y_n \setminus ([0, n^{1/3} M] \times (\mathbb{R}^+)^2) \big) - (n^{1/3} M, 0, 0),$$
and $y_n^{2,M}$, $y_n^{3,M}$ are defined analogously, considering the two other axis directions. The number of possible configurations for $n^{-1/3} y_n \cap [0, M]^3$ is bounded by the number of corresponding tilings, that is, by $\exp \tfrac{1}{2} n^{2/3} (\mathrm{ent}(E_M) + \varepsilon)$. The number of possible configurations for $y_n^{i,M}$, 1 ≤ i ≤ 3, is estimated with the help of the following lemma.
Lemma 3.3. There exists a constant c > 0 such that for any ε > 0, any n ∈ N, the number of 3D Young diagrams associated to an integer less than n, with step size $n^{-1/3}$ and having less than $\varepsilon n^{2/3}$ cubes intersecting one of the axis planes is less than $\exp(c \varepsilon^{1/4} n^{2/3})$.
Proof. Recall that the number of ordinary (2D) partitions of n is at most $c^{\sqrt{n}}$ for a constant c. Let $Z_{n,\varepsilon}$ be the set of 3D Young diagrams whose intersection with the xy-plane has area less than $\varepsilon n^{2/3}$. Let $K = \sqrt{\varepsilon}\, n^{1/3}$. For $Y \in Z_{n,\varepsilon}$, let $Y^{(1)} = Y \cap \{(x, y, z) \mid 0 \le y \le K\}$ and $Y^{(2)} = Y \cap \{(x, y, z) \mid y \ge K \text{ and } 0 \le x \le K\}$. Then $Y = Y^{(1)} \cup Y^{(2)}$ and $Y^{(1)}$ and $Y^{(2)}$ are disjoint. Now $Y^{(1)}$ is made up of the K 2D-Young diagrams
$$Y^{(1)}(i) = \{(x, y, z) \in Y^{(1)} \mid i \le y < i + 1\}, \quad 0 \le i < K,$$
and similarly $Y^{(2)}$ is made up of
$$Y^{(2)}(j) = \{(x, y, z) \in Y^{(2)} \mid j \le x < j + 1\}, \quad 0 \le j < K.$$
Let $M_1$ be the volume of $Y^{(1)}$, and let $m_i$ be the volume of $Y^{(1)}(i)$. Given the volumes $m_1, \dots, m_K$, the number of choices for the $Y^{(1)}(i)$ is at most
$$c^{\sqrt{m_1} + \cdots + \sqrt{m_K}} \le c^{\sqrt{K} \sqrt{M_1}}.$$
The number of choices for the volumes $m_1, \dots, m_K$ is at most the number of partitions of $M_1$, which is at most $c^{\sqrt{n}}$. Similar bounds hold for $Y^{(2)}$. Therefore the total number of elements of $Z_{n,\varepsilon}$ is at most
$$|Z_{n,\varepsilon}| \le c_2^{\sqrt{n} + \sqrt{Kn}} \le c_3^{\varepsilon^{1/4} n^{2/3}}$$
for a constant $c_3$. This completes the proof.
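The two ingredients of the proof of Lemma 3.3 can be illustrated numerically: the bound $c^{\sqrt{n}}$ on the number of ordinary partitions and the Cauchy–Schwarz step $\sqrt{m_1} + \cdots + \sqrt{m_K} \le \sqrt{K}\sqrt{M_1}$. The following sketch uses the classical explicit constant $c = e^{\pi\sqrt{2/3}}$, which is an assumption on our part since the paper does not specify c; the function names are ours.

```python
import math

def partition_counts(n_max):
    """p[0..n_max], the numbers of ordinary partitions, by the standard dynamic program."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p

def satisfies_sqrt_bound(p, n):
    """Check p(n) <= exp(pi * sqrt(2 n / 3)), a classical explicit form of c^sqrt(n)."""
    return math.log(p[n]) <= math.pi * math.sqrt(2.0 * n / 3.0)

def cauchy_schwarz_holds(volumes):
    """Check sqrt(m_1) + ... + sqrt(m_K) <= sqrt(K) * sqrt(m_1 + ... + m_K)."""
    lhs = sum(math.sqrt(m) for m in volumes)
    return lhs <= math.sqrt(len(volumes)) * math.sqrt(sum(volumes)) + 1e-12

p = partition_counts(200)
print(p[10], all(satisfies_sqrt_bound(p, n) for n in range(1, 201)))
print(cauchy_schwarz_holds([9, 4, 4, 1]))
```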
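We conclude that:
$$\forall\, E \in \mathcal{E}_1 \quad \forall\, \varepsilon > 0 \quad \exists\, M, \delta > 0 \quad \exists\, N \quad \forall\, n > N,$$
$$P\big( d_H(n^{-1/3} Y_n^{\le} \cap [0, M]^3, E_M) < \delta \big) \le \Big( \sum_{1 \le k \le n} N(k) \Big)^{-1} \exp\big( n^{2/3} (\mathrm{ent}(E) + 2\varepsilon + 9c\varepsilon^{1/4}) \big). \tag{6}$$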
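Inequality (5) implies on one hand that
$$\liminf_{n \to \infty} \frac{1}{n^{2/3}} \ln N(n) \ge \max\{ \mathrm{ent}(E) : E \in \mathcal{E}_1 \}.$$
On the other hand, let ε > 0; for each set E in $\mathcal{E}_1$ consider the neighborhood $\{ F \in \mathcal{E}_1 : d_H(F \cap [0, M]^3, E_M) < \delta \}$, where M, δ are chosen as in inequality (6). The space $\mathcal{E}_1$ being compact, we can extract from this covering a finite subcover, associated to $(E_i, M_i, \delta_i)$, i ∈ I. For n large enough, inequality (6) is satisfied for each set $E_i$, i ∈ I, hence,
$$1 = P\big( n^{-1/3} Y_n^{\le} \in \mathcal{E}_1 \big) \le \sum_{i \in I} P\big( d_H(n^{-1/3} Y_n^{\le} \cap [0, M_i]^3, E_{M_i}) < \delta_i \big) \le \Big( \sum_{1 \le k \le n} N(k) \Big)^{-1} \sum_{i \in I} \exp\big( n^{2/3} (\mathrm{ent}(E_i) + 2\varepsilon + 3c\varepsilon^{1/4}) \big),$$
whence
$$\limsup_{n \to \infty} \frac{1}{n^{2/3}} \ln N(n) \le \max_{i \in I} \mathrm{ent}(E_i) + 2\varepsilon + 3c\varepsilon^{1/4}.$$
Sending ε to 0, we conclude that
$$\lim_{n \to \infty} \frac{1}{n^{2/3}} \ln N(n) = \max\{ \mathrm{ent}(E) : E \in \mathcal{E}_1 \}. \tag{7}$$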
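The growth rate in (7) can be observed on exact counts. The following sketch computes N(n) from MacMahon's generating function $\prod_{k \ge 1} (1 - q^k)^{-k}$, which is a classical result that is not used in this paper, and prints $\ln N(n) / n^{2/3}$ for a few n. For comparison it also prints the classical constant $3(\zeta(3)/4)^{1/3}$ from Wright's asymptotic formula, again an outside fact and not a statement of this paper; the convergence is slow.

```python
import math

def plane_partition_counts(n_max):
    """N[0..n_max] from MacMahon's product prod_{k >= 1} (1 - q^k)^(-k)."""
    counts = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        for _ in range(k):  # multiply the series by 1 / (1 - q^k), k times
            for total in range(k, n_max + 1):
                counts[total] += counts[total - k]
    return counts

counts = plane_partition_counts(120)
print(counts[:11])
for n in (30, 60, 120):
    print(n, math.log(counts[n]) / n ** (2.0 / 3.0))

# Classical limiting constant (assumption: taken from Wright's formula, not from this paper)
zeta3 = sum(1.0 / k ** 3 for k in range(1, 10001))
print(3.0 * (zeta3 / 4.0) ** (1.0 / 3.0))
```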
Now the large deviations lower bound (3) for $Y_n^{=}$ follows directly from inequality (5) and equality (7). The large deviations upper bound (4) for $Y_n^{\le}$ follows directly from inequality (6), equality (7) and the compactness of $\mathcal{E}_1$. Because N(n) increases with n, we have $\sum_{1 \le k \le n} N(k) \le n N(n)$ and for any $E \in \mathcal{E}_1$, M, δ > 0 and n ∈ N,
$$P\big( d_H(n^{-1/3} Y_n^{=} \cap [0, M]^3, E_M) < \delta \big) \le n\, P\big( d_H(n^{-1/3} Y_n^{\le} \cap [0, M]^3, E_M) < \delta \big).$$
Therefore the large deviations upper bound for $Y_n^{\le}$ implies the large deviations upper bound for $Y_n^{=}$, while the large deviations lower bound for $Y_n^{=}$ implies the large deviations lower bound for $Y_n^{\le}$. Our large deviation principle implies automatically a law of large numbers for the random rescaled 3D Young diagrams. We recall that
$$S = \{ x \in \mathbb{R}^3 : \forall\, \nu \in S^2 \quad x \cdot \nu \ge |\nu|_1\, \mathrm{ent}(\nu) \}$$
and that the boundary of S is the surface S0.
Theorem 3.4. The set $(\zeta(3)/4)^{-1/3} ((\mathbb{R}^+)^3 \setminus S)$ is the asymptotic shape of a random rescaled 3D Young diagram: for any M, δ > 0, we have
$$\limsup_{n \to \infty} \frac{1}{n^{2/3}} \ln P\big( d_H(n^{-1/3} Y_n^{=} \cap [0, M]^3,\ [0, M]^3 \setminus (\zeta(3)/4)^{-1/3} S) \ge \delta \big) < 0.$$
Proof. The solution of the variational problem
$$\text{minimize } I(E) \text{ over } E \in \mathcal{E}_1,$$
which is of course equivalent to the problem
$$\text{maximize } \mathrm{ent}(E) \text{ over } E \in \mathcal{E}_1,$$
is given by a slight variant of the famous Wulff isoperimetric theorem. The unique solution is the adequate dilation of $((\mathbb{R}^+)^3 \setminus S)$ which encloses a volume 1 (see [31]). Since the volume under S0 is $\frac{1}{4}\zeta(3)$, it is the set $(\frac{1}{4}\zeta(3))^{-1/3} ((\mathbb{R}^+)^3 \setminus S)$. The large deviation principle for $Y_n^{=}$ implies that
$$\limsup_{n \to \infty} \frac{1}{n^{2/3}} \ln P\big( d_H(n^{-1/3} Y_n^{=} \cap [0, M]^3,\ [0, M]^3 \setminus S) \ge \delta \big) \le -\inf \{ I(E) : E \in \mathcal{E}_1,\ d_H(E, [0, M]^3 \setminus S) \ge \delta \},$$
and the right-hand side is strictly negative.
4. Preliminaries for Theorem 1.1
In this section we introduce first the notation and we give some basic definitions. In the second part, we recall some basic properties of FK (or random cluster) measures.
4.1. Notation. The cardinality of a set A is denoted by |A|. The symmetric difference between two sets A1 , A2 is denoted by $A_1 \triangle A_2$. We denote by dp the metric associated with the p-norm, i.e., dp (x, y) = |x − y|p for any x, y in R3 . We will only use the 1, 2 and ∞ norms. The dp distance between two subsets E1 and E2 of R3 is dp (E1 , E2 ) = inf{ |x1 − x2 |p : x1 ∈ E1 , x2 ∈ E2 }. The r-neighborhood of E ⊂ R3 with respect to the d2 metric is the set V(E, r) = { x ∈ R3 : d2 (x, E) < r }. We will usually work with the Euclidean distance d2 on the continuous space R3 and with the distance d1 or d∞ on the discrete lattice Z3 . The unit sphere of R3 is denoted by S 2 . We denote by H2 the standard 2-dimensional Hausdorff measure. We turn Z3 into a graph with vertex set Z3 and edge set E1 = { {x, y} : x, y ∈ Z3 , |x − y|1 = 1 }. This graph is called the three dimensional cubic lattice and is denoted by L1 . Let D be a subset of R3 . An edge {x, y} of E1 is said to be included in D if both sites x, y belong to D. We denote by E1 (D) the set of the edges of E1 included in D. For D a subset of Z3 ,
the graph (D, E1 (D)) will be often identified with its vertex set D. Let A be a subset of Z3 . We define $\partial^{in}_\infty A = \{ x \in A : \exists\, y \in A^c,\ d_\infty(x, y) = 1 \}$. The set A ⊂ Z3 is said to be connected or L1 -connected (respectively L∞ -connected) if any two of its points are connected by a path x0 , x1 , . . . , xn of points of A with d1 (xi , xi+1 ) = 1, 0 ≤ i < n (respectively d∞ (xi , xi+1 ) = 1). Note that L1 -connectedness implies L∞ -connectedness. The object dual to the edge e = {x, y} of E1 is the unit square orthogonal to e centered at (x + y)/2, also called the plaquette associated to e, and denoted by p(e). Two edges in E1 are said to be adjacent if (and only if) their corresponding plaquettes meet along a unit segment. A set of edges E ⊂ E1 is said to be connected if any two edges are connected by a path of adjacent edges. Let A, B, D be subsets of R3 with A ∩ D ∩ B = ∅. A set of edges E ⊂ E1 is said to separate A and B in D if there is no path in the graph (Z3 ∩ D, E1 (D) \ E) connecting a vertex of A and a vertex of B. The set E separates ∞ in D if the graph (Z3 ∩ D, E1 (D) \ E) has at least two infinite components. One is naturally led to work simultaneously with the two metrics L1 and L∞ for topological reasons. Here we will use the following result.
Lemma 4.1. If A is an L∞ connected set of vertices and R is an L1 connected component of $A^c$, then $\partial^{in}_\infty R$ is L1 connected.
Proof. From R, we construct a three dimensional manifold M with boundary as follows. The manifold M is the ε-neighborhood of the union of those edges whose vertices are both in R and the set of (solid) unit cubes all of whose 8 vertices are in R. The boundary of M consists of closed oriented two dimensional manifolds. Now each boundary component of M divides R3 into an outside and an inside: this is a classical theorem of algebraic topology (essentially the generalized Schoenflies’ Theorem, see [29]). Note that two boundary components of M are at d∞ -distance at least 1 + 2ε. Let x1 , x2 be two vertices in $\partial^{in}_\infty R$ and $x_1', x_2'$ points of ∂M to which they are closest. We claim that $x_1', x_2'$ are on the same boundary component of M. This is because (using connectedness of A) there is a path in R3 from $x_1'$ to $x_2'$ which does not pass through the interior of M. Since $x_1', x_2'$ are on the same boundary component of M, there exists a path on this boundary component from $x_1'$ to $x_2'$. This path can be pushed onto an L1 -path on the underlying edges close to the boundary, whose vertices belong to $\partial^{in}_\infty R$. A detailed proof of a similar result has been done by Kesten [22, Lemma 2.23]; see also [11, Lemma 2.1].
4.2. FK percolation. We give here a short account of FK measures; we refer to [18, 26] for a more detailed exposition. For E ⊂ E1 with E ≠ ∅, we write A(E) for the set $\{0, 1\}^E$; its elements are called edge configurations in E. The natural projections are given by ω ∈ A(E) → ω(e) ∈ {0, 1}, where e ∈ E. An edge e is called open in the configuration ω if ω(e) = 1, and closed otherwise. For A ⊂ Z3 , let $A_A$ = A(E1 (A)), the set of the configurations within A (recall that E1 (A) denotes the set of edges between sites in A). We set also A = $A_{\mathbb{Z}^3}$. Given ω ∈ A and E ⊂ E1 , we denote by ω(E) the restriction of ω to A(E). Given ω ∈ A, we denote by O(ω) the set of the edges of E1 which are open in the configuration ω. The connected components of the graph (Z3 , O(ω)) are called ω-clusters. The path γ = (x1 , e1 , x2 , . . . ) is said to be ω-open if all
the edges ei belong to O(ω). An edge e = {x, y} is said to be wired in the configuration ω if there exists an ω-open path joining the endvertices x, y of e which does not use the edge e itself. Let ω ∈ A and V ⊂ Z3 be a finite subset of Z3 . The open clusters in V are the connected components of the random graph (V , O(ω(E1 (V )))). The number of open clusters in V is denoted by cl(ω). Let FV be the σ -field with atoms {ω}, ω ∈ A(V ). For fixed p ∈ [0, 1] and q ≥ 1, the FK measure with parameters (p, q) is a probability measure $\Phi_V^{p,q}$ on FV , defined by the formula
$$\forall\, \omega \in A_V \quad \Phi_V^{p,q}[\{\omega\}] = \frac{1}{Z_V^{p,q}} \Big( \prod_{e \in E_1(V)} p^{\omega(e)} (1 - p)^{1 - \omega(e)} \Big) q^{\mathrm{cl}(\omega)}, \tag{8}$$
where $Z_V^{p,q}$ is the appropriate normalization factor. There exists p0 (q) in (0, 1) such that, for any p in (p0 (q), 1), the weak limit
$$\Phi_\infty^{p,q} = \lim_{V \to \mathbb{Z}^3} \Phi_V^{p,q}$$
exists and is the unique FK measure in infinite volume corresponding to the parameters p, q [18, Proof of Thm. 5.3]. We will work in this regime throughout the paper. By conditioning on the wiring status of the endvertices of a fixed edge, we obtain the following estimates for the probabilities of this edge to be open or closed:
$$\forall\, e \in E_1 \quad \Phi_\infty^{p,q}(e \text{ is open}) \ge \frac{p}{p + q(1 - p)}, \quad \Phi_\infty^{p,q}(e \text{ is closed}) \ge 1 - p.$$
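On a small finite graph the FK measure of formula (8) can be computed by brute force, and the two single-edge estimates can be observed directly. The following is a minimal sketch on a 4-cycle with free boundary conditions, which is our illustrative choice and not the infinite-volume setting of the paper; all names are ours.

```python
from itertools import product

def count_clusters(vertices, open_edges):
    """Number of connected components of (vertices, open_edges), via union-find."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    for x, y in open_edges:
        parent[find(x)] = find(y)
    return len({find(v) for v in vertices})

def fk_measure(vertices, edges, p, q):
    """Map each configuration (tuple of 0/1 indexed like edges) to its FK probability."""
    weights = {}
    for omega in product((0, 1), repeat=len(edges)):
        open_edges = [e for e, state in zip(edges, omega) if state]
        n_open = len(open_edges)
        weights[omega] = (p ** n_open * (1 - p) ** (len(edges) - n_open)
                          * q ** count_clusters(vertices, open_edges))
    z = sum(weights.values())  # normalization factor
    return {omega: w / z for omega, w in weights.items()}

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
p, q = 0.7, 2.0
mu = fk_measure(vertices, edges, p, q)
prob_open = sum(pr for omega, pr in mu.items() if omega[0] == 1)
print(p / (p + q * (1 - p)), prob_open, p)
```

Conditionally on the other edges, a given edge is open with probability p if its endvertices are already connected and p/(p + q(1 − p)) otherwise, so the printed probability lies between the two outer values.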
There is a partial order ⪯ in A given by ω ⪯ ω′ if and only if ω(e) ≤ ω′(e) for every e ∈ E1 . A function f : A → R is called increasing if f (ω) ≤ f (ω′) whenever ω ⪯ ω′. An event is called increasing if its characteristic function is increasing. A property of crucial importance is that for q ≥ 1, p > p0 (q), $\Phi_\infty^{p,q}$ satisfies the FKG inequality, i.e., for all F-measurable bounded increasing functions f, g, we have (see [18])
$$\Phi_\infty^{p,q}(f g) \ge \Phi_\infty^{p,q}(f)\, \Phi_\infty^{p,q}(g).$$
Lemma 4.2. Let p > p0 (q). Let F be a fixed finite set of edges and let N be an integer with N ≤ |F |. Then
$$\Phi_\infty^{p,q}\big( \text{the edges of } F \text{ are closed, at least } N \text{ edges of } F \text{ are wired} \big) \le \Big( \frac{1 - p}{p} \Big)^{|F|} q^{|F| - N}.$$
Proof. We denote by $\mathcal{E}$ the event
$$\mathcal{E} = \big\{ \text{the edges of } F \text{ are closed, at least } N \text{ edges of } F \text{ are wired} \big\}.$$
Since we work in the region where there is uniqueness of the infinite volume FK measure, we have
$$\Phi_\infty^{p,q}(\mathcal{E}) = \lim_{E \to \mathbb{Z}^3} \Phi_E^{p,q}(\mathcal{E}).$$
Let E be a box containing all the edges of F . To a configuration η in $\mathcal{E}$ ∩ A(E), we associate the configuration $\bar\eta$ obtained by changing the states of all the edges of F and keeping the remaining edges unchanged. If at least N edges of F are wired in the
166
R. Cerf, R. Kenyon
configuration η, then cl(η) ≥ cl(η) + N − |F |. This follows by induction using the fact that edges in F are closed. Therefore
η∈E ∩A(E)
p,q
E (E) ≤
η∈E ∩A(E)
e∈E1 (E)
p η(e) (1 − p)1−η(e) q cl(η)
e∈E1 (E)
1 − p |F |
≤
p η(e) (1 − p)1−η(e) q cl(η)
p
q |F | − N .
Letting E grow to Z3 , we obtain the inequality stated in the lemma.
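The combinatorial step in the proof of Lemma 4.2, namely that opening a set $F$ of closed edges of which at least $N$ are wired lowers the number of clusters by at most $|F| - N$, can be tested exhaustively on a small graph. The following is a minimal sketch; the helper names and the toy graph are assumptions of this example, and a closed edge is treated as wired exactly when its endvertices already lie in the same open cluster.

```python
from itertools import combinations, product


def component_roots(n_vertices, edges, omega):
    """Return, for each vertex, a representative of its open cluster."""
    labels = list(range(n_vertices))

    def find(x):
        while labels[x] != x:
            labels[x] = labels[labels[x]]
            x = labels[x]
        return x

    for (x, y), state in zip(edges, omega):
        if state:
            labels[find(x)] = find(y)
    return [find(v) for v in range(n_vertices)]


def check_cluster_inequality(n_vertices, edges):
    """Check cl(eta_bar) >= cl(eta) + N - |F| for every eta and every closed set F."""
    for eta in product((0, 1), repeat=len(edges)):
        roots = component_roots(n_vertices, edges, eta)
        cl_eta = len(set(roots))
        closed = [i for i, state in enumerate(eta) if state == 0]
        for size in range(len(closed) + 1):
            for F in combinations(closed, size):
                # N = number of edges of F whose endvertices are joined by an open path.
                wired = sum(1 for i in F if roots[edges[i][0]] == roots[edges[i][1]])
                # eta_bar: open all the edges of F, keep the other edges unchanged.
                eta_bar = tuple(1 if i in F else s for i, s in enumerate(eta))
                cl_bar = len(set(component_roots(n_vertices, edges, eta_bar)))
                if cl_bar < cl_eta + wired - len(F):
                    return False
    return True


# Hypothetical toy graph: a square with one diagonal.
TOY_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(check_cluster_inequality(4, TOY_EDGES))
```

Taking $N$ equal to the exact number of wired edges is the strongest case, so the check covers every admissible $N$.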
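Lemma 4.3. Let $p > \max(p_0(q), 1/2)$. Let $E, F, G$ be three finite sets of edges, with $G \subset F$. Then

\[
\Phi_\infty^{p,q}\bigl(\text{the edges of } E \cup (F \setminus G) \text{ are closed, the edges of } G \text{ are open}\bigr)
\le \Bigl(2q\, \frac{1-p}{p}\Bigr)^{|F \setminus G|}\, \Phi_\infty^{p,q}\bigl(\text{the edges of } E \text{ are closed, the edges of } F \text{ are open}\bigr).
\]

Proof. We denote by $\mathcal{E}$ and $\mathcal{F}$ the events

\[
\mathcal{E} = \bigl\{\text{the edges of } E \cup (F \setminus G) \text{ are closed, the edges of } G \text{ are open}\bigr\}, \qquad
\mathcal{F} = \bigl\{\text{the edges of } E \text{ are closed, the edges of } F \text{ are open}\bigr\}.
\]

Since we work in the region where there is uniqueness of the infinite volume FK measure, we have

\[
\Phi_\infty^{p,q}(\mathcal{E}) = \lim_{\Lambda \to \mathbb{Z}^3} \Phi_\Lambda^{p,q}(\mathcal{E}), \qquad
\Phi_\infty^{p,q}(\mathcal{F}) = \lim_{\Lambda \to \mathbb{Z}^3} \Phi_\Lambda^{p,q}(\mathcal{F}).
\]

Let $\Lambda$ be a box containing all the edges of $E \cup F$. To a configuration $\eta$ in $A(\Lambda)$, we associate the configuration $\bar\eta$ obtained by opening all the edges of $F \setminus G$ and keeping the remaining edges unchanged. We have $\mathrm{cl}(\bar\eta) \ge \mathrm{cl}(\eta) - |F \setminus G|$. Therefore

\[
\begin{aligned}
\Phi_\Lambda^{p,q}(\mathcal{E})
&= \frac{1}{Z_\Lambda^{p,q}} \sum_{\eta \in \mathcal{E} \cap A(\Lambda)} \ \prod_{e \in \mathbb{E}^1(\Lambda)} p^{\eta(e)} (1-p)^{1-\eta(e)}\, q^{\mathrm{cl}(\eta)} \\
&= \frac{1}{Z_\Lambda^{p,q}} \sum_{\rho \in \mathcal{F} \cap A(\Lambda)} \ \sum_{\substack{\eta \in \mathcal{E} \cap A(\Lambda) \\ \bar\eta = \rho}} \ \prod_{e \in \mathbb{E}^1(\Lambda)} p^{\eta(e)} (1-p)^{1-\eta(e)}\, q^{\mathrm{cl}(\eta)} \\
&\le \frac{1}{Z_\Lambda^{p,q}} \sum_{\rho \in \mathcal{F} \cap A(\Lambda)} \Bigl(2q\, \frac{1-p}{p}\Bigr)^{|F \setminus G|} \prod_{e \in \mathbb{E}^1(\Lambda)} p^{\rho(e)} (1-p)^{1-\rho(e)}\, q^{\mathrm{cl}(\rho)} \\
&\le \Bigl(2q\, \frac{1-p}{p}\Bigr)^{|F \setminus G|}\, \Phi_\Lambda^{p,q}(\mathcal{F}).
\end{aligned}
\]

Letting $\Lambda$ grow to $\mathbb{Z}^3$, we obtain the inequality stated in the lemma.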
5. Proof of Theorem 1.1

Let ν belong to S² and let A be a unit square orthogonal to ν. Let cyl A be the cylinder A + Rν. Let n belong to N. We define E(n, A, ν) as the collection of all the subsets E ⊂ E1 such that
• E is a finite connected subset of n cyl A.
• E separates n cyl A in at least two unbounded components.
• The edges of E at distance less than 6 from ∂n cyl A are at distance less than 6 from nA.

We define next the event W(n, A, ν) = { there exists E in E(n, A, ν) such that all the edges of E are closed }. The only difference between the collections of edges in E(n, A, ν) and those realizing the event appearing in the definition of the surface tension is the additional connectedness constraint. The next lemma shows however that this constraint is irrelevant to compute the surface tension.

Lemma 5.1. Let A be a square in R³. Let E be a finite set of edges which separates ∞ in cyl A and such that the edges of E at distance less than 6 from ∂cyl A are at distance less than 6 from A. Then there exists a connected subset E∗ of E which separates ∞ in cyl A.

Proof. Let Y be the union of plaquettes associated to edges in E. Note that this set disconnects cyl A \ V(∂cyl A, 6) ∩ V(A, 6) in a topological sense (no continuous path joins the top to the bottom). This is because any continuous path avoiding a set of plaquettes can be pushed onto a lattice path avoiding the same set of plaquettes. Let X be the set X = V(Y, ε) ∪ V(∂cyl A, 6) ∩ V(A, 6). We locally modify X at each vertex as follows: let x be a vertex of a plaquette of Y. If Y contains several plaquettes with vertex x which are not connected locally (in the sense of not being connected via a chain of adjacent plaquettes containing x), separate X near x so that locally, different components of Y correspond to different components of X: one can do this by pushing each local component of X slightly off of x.
Note that this does not destroy the property of X of disconnecting cyl A \ V(∂cyl A, 6) ∩ V(A, 6). Now ∂X consists of closed two-manifolds and by the generalized Schoenflies theorem [29], at least one component C of ∂X ∩ cyl A disconnects cyl A \ V(∂cyl A, 6) ∩ V(A, 6). Let E∗ be the set of edges corresponding to plaquettes in the ε-neighborhood of C. Then E∗ is connected (in the sense that any two plaquettes are connected by a path of adjacent plaquettes) and separates ∞ in cyl A.

Corollary 5.2. We have

\[
\tau(\nu, p) = \lim_{n \to \infty} -\frac{1}{n^2} \ln \Phi_\infty^{p,q}(W(n, A, \nu)).
\]

We define $f(n, A, \nu)$ to be the minimal number of edges of a set in $\mathcal{E}(n, A, \nu)$, that is,

\[
f(n, A, \nu) = \min \{\, |E| : E \in \mathcal{E}(n, A, \nu) \,\},
\]

and $\phi(n, A, \nu)$ to be the number of elements in $\mathcal{E}(n, A, \nu)$ having minimal cardinality, i.e.,

\[
\phi(n, A, \nu) = \bigl|\{\, E \in \mathcal{E}(n, A, \nu) : |E| = f(n, A, \nu) \,\}\bigr|.
\]
Lemma 5.3. For any $\nu$ in $S^2$, any unit square $A$ orthogonal to $\nu$, we have

\[
\lim_{n \to \infty} \frac{1}{n^2} f(n, A, \nu) = |\nu|_1, \qquad
\lim_{n \to \infty} \frac{1}{n^2 |\nu|_1} \ln \phi(n, A, \nu) = \mathrm{ent}(\nu),
\]

where $\mathrm{ent}(\nu)$ is the function defined in Theorem 1.1.

Proof. For the first statement, note that the projection of $A$ onto the $xy$-plane has area $|\nu_z|$, the absolute value of the $z$-component of $\nu$. Similarly the projections onto the $xz$- and $yz$-planes have areas $|\nu_y|$ and $|\nu_x|$ respectively. Replacing edges in $E$ by their plaquettes, we clearly need at least $n^2 |\nu_z| + O(n)$ plaquettes of type $xy$, $n^2 |\nu_y| + O(n)$ plaquettes of type $xz$, and $n^2 |\nu_x| + O(n)$ plaquettes of type $yz$ for $E$ to disconnect $n\,\mathrm{cyl}\,A$. It is also easy to see that these are enough: take the set $X$ of unit lattice cubes intersecting $nA$; the upper boundary of $X$ is an element of $\mathcal{E}(n, A, \nu)$ with the required number of plaquettes. For the second statement, see [10, Thm. 4.1]. In particular setting $d = 0$ in the entropy formula of [10] yields the desired entropy formula for lozenge tilings.

Let now $q \ge 1$ be fixed and let $p$ belong to $(p_0(q), 1]$. We set

\[
?(n, p) = \frac{1}{n^2} \ln \Phi_\infty^{p,q}(W(n, A, \nu)) - \frac{1}{n^2} f(n, A, \nu) \ln(1 - p).
\]

Lemma 5.3 and the definition of the surface tension imply immediately that

\[
\lim_{n \to \infty} ?(n, p) = -\tau(\nu, p) - |\nu|_1 \ln(1 - p).
\]

Thus the asymptotic expansion of Theorem 1.1 is equivalent to saying that

\[
\lim_{\substack{p \to 1 \\ p < 1}} \ \lim_{n \to \infty} ?(n, p) = |\nu|_1\, \mathrm{ent}(\nu).
\]

We will prove the even stronger statement

\[
\lim_{\substack{(n,p) \to (\infty,1) \\ n < \infty,\ p < 1}} ?(n, p) = |\nu|_1\, \mathrm{ent}(\nu).
\]

To this end, we study separately the infimum and the supremum limits of $?(n, p)$.

Lemma 5.4. For any $\nu$ in $S^2$, any unit square $A$ orthogonal to $\nu$, we have

\[
\liminf_{\substack{(n,p) \to (\infty,1) \\ n < \infty,\ p < 1}} ?(n, p) \ge |\nu|_1\, \mathrm{ent}(\nu).
\]
Proof. Let $D(n, A, \nu)$ be the set of the edges which are at distance less than 6 from the boundary of $n\,\mathrm{cyl}\,A$ and at distance less than 6 from $nA$. Let $E \in \mathcal{E}(n, A, \nu)$. Let $F$ be the set of edges which share a vertex with an edge of $E$ but which are not in $E$. Let $W(E)$ be the event: all the edges of $E$ are closed and all the edges in

\[
E^* = (n\,\mathrm{cyl}\,A) \cap (D(n, A, \nu) \setminus E) \cup F
\]

are open. Whenever $W(E)$ occurs, the set $E$ is the unique element of $\mathcal{E}(n, A, \nu)$ realizing the event $W(n, A, \nu)$. Therefore the events

\[
W(E), \qquad E \in \mathcal{E}(n, A, \nu), \quad |E| = f(n, A, \nu),
\]

for different $E$'s are pairwise disjoint and

\[
\Phi_\infty^{p,q}(W(n, A, \nu)) \ge \sum_{\substack{E \in \mathcal{E}(n, A, \nu) \\ |E| = f(n, A, \nu)}} \Phi_\infty^{p,q}(W(E)).
\]

We fix now a set of edges $E$ in $\mathcal{E}(n, A, \nu)$ such that $|E| = f(n, A, \nu)$ and we compute a lower bound on $\Phi_\infty^{p,q}(W(E))$ as follows:

\[
\begin{aligned}
\Phi_\infty^{p,q}(W(E)) &= \Phi_\infty^{p,q}(\text{the edges of } E \text{ are closed, the edges of } E^* \text{ are open}) \\
&= \Phi_\infty^{p,q}(\text{the edges of } E \text{ are closed} \mid \text{the edges of } E^* \text{ are open}) \cdot \Phi_\infty^{p,q}(\text{the edges of } E^* \text{ are open}).
\end{aligned}
\]

There exists a constant $c$ such that $|D(n, A, \nu)| \le cn$, hence $|E^*| \le cn + 5|E| \le cn + 5 f(n, A, \nu)$ and, using the FKG inequality with the lower bound on the probability of an edge to be open,

\[
\Phi_\infty^{p,q}(\text{the edges of } E^* \text{ are open}) \ge \prod_{e \in E^*} \Phi_\infty^{p,q}(e \text{ open}) \ge \Bigl(\frac{p}{p + q(1-p)}\Bigr)^{cn + 5 f(n, A, \nu)}.
\]

On the other hand, if we define

\[
\partial^{\mathrm{vert}} E = \{\, x \in \mathbb{Z}^3 : \exists\, y, z \in \mathbb{Z}^3,\ \{x, y\} \in \mathbb{E}^1 \setminus E,\ \{x, z\} \in E \,\}
\]

we have

\[
\begin{aligned}
\Phi_\infty^{p,q}(\text{the edges of } E \text{ are closed} \mid \text{the edges of } E^* \text{ are open})
&\ge \Phi_\infty^{p,q}(\text{the edges of } E \text{ are closed} \mid \text{the vertices of } \partial^{\mathrm{vert}} E \text{ are wired}) \\
&\ge (1 - p)^{f(n, A, \nu)},
\end{aligned}
\]

so that

\[
\Phi_\infty^{p,q}(W(E)) \ge (1 - p)^{f(n, A, \nu)} \Bigl(\frac{p}{p + q(1-p)}\Bigr)^{cn + 5 f(n, A, \nu)}.
\]

Plugging this estimate in the initial sum, we get

\[
\Phi_\infty^{p,q}(W(n, A, \nu)) \ge \phi(n, A, \nu)\, (1 - p)^{f(n, A, \nu)} \Bigl(\frac{p}{p + q(1-p)}\Bigr)^{cn + 5 f(n, A, \nu)}
\]

and for $n \in \mathbb{N}$, $p \in (p_0(q), 1]$,

\[
?(n, p) \ge \frac{1}{n^2} \ln \phi(n, A, \nu) + \Bigl(\frac{c}{n} + \frac{5}{n^2} f(n, A, \nu)\Bigr) \ln \frac{p}{p + q(1-p)}.
\]

Sending $(n, p)$ to $(\infty, 1)$ and using Lemma 5.3, we obtain the claim of Lemma 5.4.
We turn now to the study of the supremum limit of $?(n, p)$. We consider the general case where $\nu$ belongs to the positive orthant $O^+ = \{\, (x, y, z) \in (\mathbb{R}^+)^3,\ x + y + z > 0 \,\}$. The first step is to reduce the problem to collections of edges which are close to the minimal ones. To this end, we relax the definitions with the help of an additional parameter $\alpha$ representing the allowed fraction of additional edges. For $\alpha$ in $[0, \infty]$ (the values 0 and $\infty$ are not excluded), we define successively

\[
\mathcal{E}(n, A, \nu, \alpha) = \{\, E \in \mathcal{E}(n, A, \nu) : |E| \le f(n, A, \nu) + \alpha n^2 \,\}, \qquad
\phi(n, A, \nu, \alpha) = |\mathcal{E}(n, A, \nu, \alpha)|.
\]

We define the event

\[
W(n, A, \nu, \alpha) = \bigl\{\text{there exists } E \text{ in } \mathcal{E}(n, A, \nu, \alpha) \text{ such that all the edges of } E \text{ are closed}\bigr\}.
\]

For the particular values $\alpha = 0, \infty$, the previous definitions yield $\mathcal{E}(n, A, \nu, \infty) = \mathcal{E}(n, A, \nu)$, $W(n, A, \nu, \infty) = W(n, A, \nu)$, $\phi(n, A, \nu, 0) = \phi(n, A, \nu)$. We set finally

\[
?(n, p, \alpha) = \frac{1}{n^2} \ln \Phi_\infty^{p,q}(W(n, A, \nu, \alpha)) - \frac{1}{n^2} f(n, A, \nu) \ln(1 - p).
\]

We first show that the study can be reduced to $?(n, p, \alpha)$.

Lemma 5.5. For any positive $\alpha$, we have

\[
\limsup_{\substack{(n,p) \to (\infty,1) \\ n < \infty,\ p < 1}} ?(n, p, \alpha) = \limsup_{\substack{(n,p) \to (\infty,1) \\ n < \infty,\ p < 1}} ?(n, p).
\]

Proof. Clearly, for any $\alpha > 0$, any $n \in \mathbb{N}$ and $p < 1$, we have $?(n, p, \alpha) \le\ ?(n, p)$ and therefore

\[
\limsup_{\substack{(n,p) \to (\infty,1) \\ n < \infty,\ p < 1}} ?(n, p, \alpha) \le \limsup_{\substack{(n,p) \to (\infty,1) \\ n < \infty,\ p < 1}} ?(n, p).
\]

Let us prove the converse inequality. There exists a constant $c_0 \ge 1$ such that, for any $m \in \mathbb{N}$,

\[
\bigl|\{\, F \subset \mathbb{E}^1 : F \text{ connected},\ |F| = m,\ F \text{ contains an edge with } 0 \text{ as vertex} \,\}\bigr| \le c_0^m. \tag{9}
\]

Let $D(n, A, \nu)$ be the set of the edges which are at distance less than 6 from the boundary of $n\,\mathrm{cyl}\,A$ and at distance less than 6 from $nA$. We have

\[
\begin{aligned}
\Phi_\infty^{p,q}(W(n, A, \nu)) - \Phi_\infty^{p,q}(W(n, A, \nu, \alpha))
&\le \Phi_\infty^{p,q}\Bigl(\begin{array}{l}\text{there is a connected set } E \text{ of closed edges such that} \\ E \cap D(n, A, \nu) \ne \emptyset \text{ and } |E| \ge f(n, A, \nu) + \alpha n^2\end{array}\Bigr) \\
&\le |D(n, A, \nu)| \sum_{m \ge f(n, A, \nu) + \alpha n^2} \ \sum_{F} \Phi_\infty^{p,q}(\text{all the edges of } F \text{ are closed}),
\end{aligned}
\]
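where the last summation extends over all connected sets of edges $F$ such that $|F| = m$ and $F$ contains an edge with 0 as endvertex. Using (9) and applying Lemma 4.2, we get

\[
\Phi_\infty^{p,q}(W(n, A, \nu)) - \Phi_\infty^{p,q}(W(n, A, \nu, \alpha)) \le |D(n, A, \nu)| \sum_{m \ge f(n, A, \nu) + \alpha n^2} \Bigl(c_0\, \frac{q(1-p)}{p}\Bigr)^m.
\]

There exists a constant $c_1$ such that $|D(n, A, \nu)| \le c_1 n^2$. For $p$ sufficiently close to 1, so that $c_0\, q(1-p)/p < 1/2$, we have thus

\[
\Phi_\infty^{p,q}(W(n, A, \nu)) - \Phi_\infty^{p,q}(W(n, A, \nu, \alpha)) \le 2 c_1 n^2 \Bigl(c_0\, \frac{q(1-p)}{p}\Bigr)^{f(n, A, \nu) + \alpha n^2}.
\]

Using the inequality $\Phi_\infty^{p,q}(W(n, A, \nu, \alpha)) \ge (1 - p)^{f(n, A, \nu)}$, we obtain

\[
\begin{aligned}
?(n, p) -\ ?(n, p, \alpha)
&= \frac{1}{n^2} \ln \Bigl(1 + \frac{\Phi_\infty^{p,q}(W(n, A, \nu)) - \Phi_\infty^{p,q}(W(n, A, \nu, \alpha))}{\Phi_\infty^{p,q}(W(n, A, \nu, \alpha))}\Bigr) \\
&\le 2 c_1 \Bigl(c_0\, \frac{q(1-p)}{p}\Bigr)^{f(n, A, \nu) + \alpha n^2} (1 - p)^{-f(n, A, \nu)}.
\end{aligned}
\]

There exists $n_1$ such that $f(n, A, \nu) \le 2 n^2 |\nu|_1$ for $n \ge n_1$ and

\[
?(n, p) -\ ?(n, p, \alpha) \le 2 c_1 \Bigl( \Bigl(c_0\, \frac{q}{p}\Bigr)^{2|\nu|_1} \Bigl(c_0\, \frac{q(1-p)}{p}\Bigr)^{\alpha} \Bigr)^{n^2}.
\]

Let $p_1 > p_0(q)$ be such that

\[
\Bigl(c_0\, \frac{q}{p_1}\Bigr)^{2|\nu|_1} \Bigl(c_0\, \frac{q(1-p_1)}{p_1}\Bigr)^{\min(\alpha, 1)} < \frac{1}{2}.
\]

For any $n \ge n_1$, any $p$ in $(p_1, 1)$, we have

\[
?(n, p) -\ ?(n, p, \alpha) \le 2 c_1\, 2^{-n^2},
\]

which implies the desired inequality on the supremum limits.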
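The passage from the sum over $m$ to the factor $2 c_1 n^2$ in the proof above rests on the elementary fact that a geometric tail with ratio $r < 1/2$ is at most twice its first term. A minimal numerical sketch follows; the function name `geometric_tail` and the sample values are assumptions of this example.

```python
def geometric_tail(r, start, terms=10_000):
    """Truncated tail sum_{m >= start} r**m, with enough terms for r < 1/2."""
    return sum(r ** m for m in range(start, start + terms))


# Hypothetical sample values: r plays the role of c0 * q * (1 - p) / p.
R, START = 0.4, 7
TAIL = geometric_tail(R, START)
CLOSED_FORM = R ** START / (1 - R)  # exact value of the infinite tail
BOUND = 2 * R ** START              # bound used in the proof, valid for r < 1/2
print(TAIL, CLOSED_FORM, BOUND)
```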
We next estimate the supremum limit of $?(n, p, \alpha)$ with the help of $\phi(n, A, \nu, \alpha)$.

Lemma 5.6. For any positive $\alpha$, we have

\[
\limsup_{\substack{(n,p) \to (\infty,1) \\ n < \infty,\ p < 1}} ?(n, p, \alpha) \le \limsup_{n \to \infty} \frac{1}{n^2} \ln \phi(n, A, \nu, \alpha) + 3\alpha \ln q.
\]

Proof. Using the symmetry of the lattice, we need only to consider vectors $\nu$ whose three coordinates are non-negative. To avoid unessential discussions, we suppose also that all three coordinates of $\nu$ are strictly positive. Let as usual $A$ be a unit square orthogonal to $\nu$; for simplicity we suppose that $A$ is centered at the origin. The main technical problem to get the correct upper bound on $?(n, p, \alpha)$ is to show that, whenever $p$ is close to 1 and the event $W(n, A, \nu, \alpha)$ occurs, most of the closed edges realizing the event have
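their endvertices wired. Let $D(n, A, \nu)$ be the set of the edges which are at distance less than 6 from the boundary of $n\,\mathrm{cyl}\,A$ and at distance less than 6 from $nA$. Let us define

\[
P_1(n, A) = \bigl\{\, (x, y) \in \mathbb{Z} \times \mathbb{Z} : \exists\, z \in \mathbb{Z} \ \ d_2((x, y, z), nA) \le 1,\ \ \forall\, z \in \mathbb{Z} \ \ d_2((x, y, z), D(n, A, \nu)) \ge 2 \,\bigr\}
\]

and for $E$ a set of edges and $x, y$ in $\mathbb{Z} \times \mathbb{Z}$,

\[
\pi_1(E, x, y) = \bigl\{\, e \in E : \exists\, z \in \mathbb{Z} \ \ e = \{(x, y, z), (x, y, z + 1)\} \,\bigr\}.
\]

Let $(x, y)$ belong to $P_1(n, A)$. Then the set of edges $\{(x, y, z), (x, y, z + 1)\}$, $z \in \mathbb{Z}$, contains a finite path of edges linking the two connected components of $\{\, w \in \mathbb{R}^3 : d_2(w, n\,\partial\mathrm{cyl}\,A) < 6 \le d_2(w, nA) \,\}$. Let $E$ be a set of edges in $\mathcal{E}(n, A, \nu)$. Since by definition $E$ contains no edge having an endvertex in the above set, necessarily at least one edge of the previous path belongs to $E$. We define next

\[
T_1(E) = \bigcup_{(x,y) \in P_1(n,A)} \pi_1(E, x, y), \qquad
T_1^*(E) = \bigcup_{\substack{(x,y) \in P_1(n,A) \\ |\pi_1(E,x,y)| = 1}} \pi_1(E, x, y).
\]

We define the analogous quantities related to the two other directions parallel to the axis, $P_i(n, A)$, $\pi_i(E, x, y)$, $T_i(E)$, $T_i^*(E)$, $1 \le i \le 3$. We proceed now to estimating $\Phi_\infty^{p,q}(W(n, A, \nu, \alpha))$. We write

\[
\Phi_\infty^{p,q}(W(n, A, \nu, \alpha)) \le \sum_{E \in \mathcal{E}(n, A, \nu, \alpha)} \Phi_\infty^{p,q}(\text{the edges of } E \text{ are closed}). \tag{10}
\]

Let $V(E)$ be the set of the vertices belonging to an edge of $E$. Since $E$ is connected, then $V(E)$ is $L^1$-connected and therefore $L^\infty$-connected. Let $R$ be the unbounded $L^1$ component of $V(E)^c$. By Lemma 4.1, $\partial_\infty^{\mathrm{in}} R$ is $L^1$ connected. Let $F$ be the edges having an endvertex in $\partial_\infty^{\mathrm{in}} R$. Let us denote by $\mathcal{E}$ the event

\[
\mathcal{E} = \{\text{the edges of } E \text{ are closed, the edges of } F \text{ are open}\}.
\]

We have, using Lemma 4.3,

\[
\begin{aligned}
\Phi_\infty^{p,q}(\text{the edges of } E \text{ are closed})
&\le \sum_{G \subset F} \Phi_\infty^{p,q}(\text{the edges of } E \cup (F \setminus G) \text{ are closed, the edges of } G \text{ are open}) \\
&\le \sum_{G \subset F} \Bigl(2q\, \frac{1-p}{p}\Bigr)^{|F \setminus G|} \Phi_\infty^{p,q}(\mathcal{E})
\le \sum_{0 \le N \le |F|} \ \sum_{\substack{G \subset F \\ |G| = N}} \Bigl(2q\, \frac{1-p}{p}\Bigr)^{|F| - N} \Phi_\infty^{p,q}(\mathcal{E}) \\
&\le \sum_{N \le |F|} \binom{|F|}{N} \Bigl(2q\, \frac{1-p}{p}\Bigr)^{|F| - N} \Phi_\infty^{p,q}(\mathcal{E})
= \Bigl(1 + 2\, \frac{q}{p}\, (1 - p)\Bigr)^{|F|} \Phi_\infty^{p,q}(\mathcal{E}).
\end{aligned}
\]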
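The last two steps of the chain above group the subsets $G \subset F$ by cardinality and then apply the binomial theorem. A small sketch that checks this identity by direct enumeration is given below; the function names and the sample values of $p$, $q$ and $|F|$ are assumptions of this example.

```python
from itertools import combinations
from math import comb


def sum_over_subsets(size_f, x):
    """Sum of x**|F \\ G| over all subsets G of a set F with size_f elements."""
    total = 0.0
    for n in range(size_f + 1):
        for _subset in combinations(range(size_f), n):
            total += x ** (size_f - n)
    return total


def binomial_form(size_f, x):
    """Same sum after grouping the subsets by their cardinality N."""
    return sum(comb(size_f, n) * x ** (size_f - n) for n in range(size_f + 1))


# Hypothetical sample values; X stands for 2q(1 - p)/p.
P_SAMPLE, Q_SAMPLE, SIZE_F = 0.9, 2.0, 6
X = 2 * Q_SAMPLE * (1 - P_SAMPLE) / P_SAMPLE
print(sum_over_subsets(SIZE_F, X), binomial_form(SIZE_F, X), (1 + X) ** SIZE_F)
```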
Yet $|E| \le f(n, A, \nu) + \alpha n^2$, thus for $n$ large enough, $|V(E)| \le 2(2|\nu|_1 + \alpha) n^2$ and $|F| \le c_1 n^2$ with some positive constant $c_1$ so that

\[
\Phi_\infty^{p,q}(\text{the edges of } E \text{ are closed}) \le \Bigl(1 + 2\, \frac{q}{p}\, (1 - p)\Bigr)^{c_1 n^2} \Phi_\infty^{p,q}(\mathcal{E}). \tag{11}
\]

On the event $\mathcal{E}$, all the edges of $F$ are open, hence all the vertices of $\partial_\infty^{\mathrm{in}} R$ are connected by open paths. Moreover, the endvertices of any edge in $T_1^*(E) \cup T_2^*(E) \cup T_3^*(E)$ are connected by an open edge of $F$ to a vertex of $\partial_\infty^{\mathrm{in}} R$. In conclusion, on the event $\mathcal{E}$, all the edges in $T_1^*(E) \cup T_2^*(E) \cup T_3^*(E)$ have their endvertices wired. Whenever $(x, y) \in P_1(n, A)$ and $|\pi_1(E, x, y)| \ne 1$, the line $\{(x, y, z) : z \in \mathbb{Z}\}$ contains at least two edges of $E$, therefore

\[
\sum_{1 \le i \le 3} \bigl( |T_i^*(E)| + 2(|P_i(n, A)| - |T_i^*(E)|) \bigr) \le |E|,
\]

whence

\[
\sum_{1 \le i \le 3} |T_i^*(E)| \ge 2 \sum_{1 \le i \le 3} |P_i(n, A)| - |E| \ge f(n, A, \nu) - 2\alpha n^2,
\]

the last inequality being valid for $n$ large enough, since

\[
\lim_{n \to \infty} \frac{1}{n^2} \sum_{1 \le i \le 3} |P_i(n, A)| = |\nu|_1.
\]

Thus at least $f(n, A, \nu) - 2\alpha n^2$ edges of $E$ have their endvertices wired together. By Lemma 4.2, we have

\[
\Phi_\infty^{p,q}(\mathcal{E}) \le \Phi_\infty^{p,q}\bigl(\text{the edges of } E \text{ are closed, at least } f(n, A, \nu) - 2\alpha n^2 \text{ edges of } E \text{ are wired}\bigr) \le \Bigl(\frac{1-p}{p}\Bigr)^{f(n, A, \nu)} q^{3\alpha n^2}.
\]

Plugging this estimate in (10) and (11), we obtain

\[
\Phi_\infty^{p,q}(W(n, A, \nu, \alpha)) \le \phi(n, A, \nu, \alpha) \Bigl(1 + 2\, \frac{q}{p}\, (1 - p)\Bigr)^{c_1 n^2} \Bigl(\frac{1-p}{p}\Bigr)^{f(n, A, \nu)} q^{3\alpha n^2},
\]

whence

\[
?(n, p, \alpha) \le \frac{1}{n^2} \ln \phi(n, A, \nu, \alpha) + c_1 \ln\bigl(1 + 2q(1-p)/p\bigr) - \frac{f(n, A, \nu)}{n^2} \ln p + 3\alpha \ln q.
\]

Taking the supremum limit as $(n, p) \to (\infty, 1)$ yields the desired result.
We finally prove that the entropy is continuous with respect to α at α = 0.
Lemma 5.7. For any $\nu$ in $S^2$, we have

\[
\lim_{\substack{\alpha \to 0 \\ \alpha > 0}} \ \limsup_{n \to \infty} \frac{1}{n^2} \ln \phi(n, A, \nu, \alpha) = |\nu|_1\, \mathrm{ent}(\nu).
\]

Proof. Using the symmetry of the lattice, we need only to consider vectors $\nu$ whose three coordinates are non-negative. To avoid unessential discussions, we suppose also that all three coordinates of $\nu$ are strictly positive. Let as usual $A$ be a unit square orthogonal to $\nu$; for simplicity we suppose that $A$ is centered at the origin. Let $P_{111}$ be the plane containing the origin and orthogonal to the vector $(1, 1, 1)$, i.e., $P_{111} = \{\, (x, y, z) \in \mathbb{R}^3 : x + y + z = 0 \,\}$. Let also $\pi_{111}$ be the projection on $P_{111}$ parallel to the direction $(1, 1, 1)$. Let $D$ be the parallelogram $D = \pi_{111}(nA)$. Let $k$ be an integer. We tile $D$ with $k^2$ translates of $D/k$, which we denote by $D_i$, $1 \le i \le k^2$. Let $E \in \mathcal{E}(n, A, \nu, \alpha)$. For $n$ sufficiently large, we have $|\nu|_1 n^2 - \alpha n^2 \le f(n, A, \nu) \le |\nu|_1 n^2 + \alpha n^2$, hence $|\nu|_1 n^2 - \alpha n^2 \le |E| \le |\nu|_1 n^2 + 2\alpha n^2$. Let $E^* \in \mathcal{E}(n, A, \nu)$ be such that $|E^*| = f(n, A, \nu)$. Let $a^*, b^*, c^*$ (respectively $a, b, c$) be the number of the edges of $E^*$ (respectively $E$) parallel to the first, second and third axis respectively. We have

\[
\max(|a - a^*|, |b - b^*|, |c - c^*|) \le 3\alpha n^2, \qquad \lim_{n \to \infty} \frac{1}{n^2} (a^*, b^*, c^*) = \nu.
\]

The plaquette associated to an edge $e$ is denoted by $p(e)$. We say that a parallelogram $D_i$ is good if the $\pi_{111}$ projection of the collection of the plaquettes associated to $E$ above $D_i$ is one to one in the following sense:

\[
\forall\, e_1, e_2 \in E, \quad e_1 \ne e_2, \qquad \mathcal{H}^2\bigl(\pi_{111}(p(e_1)) \cap \pi_{111}(p(e_2)) \cap D_i\bigr) = 0,
\]

where $\mathcal{H}^2$ is the two dimensional Hausdorff measure. We denote by $I(E)$ the set of the indices of the good parallelograms. The area of the $\pi_{111}$ projection is the same for the three types of plaquettes; call this area $H$. Since $E$ belongs to $\mathcal{E}(n, A, \nu, \alpha)$, we have

\[
\mathcal{H}^2\Bigl(\pi_{111}\Bigl(\bigcup_{e \in E} p(e)\Bigr)\Bigr) \ge n^2 |\nu|_1 H - O(n),
\]

so that for $n$ large enough, the number of bad parallelograms is less than $3\alpha n^2$ and therefore $|I(E)| \ge k^2 - 3\alpha n^2$. If $B$ is a subset of $P_{111}$, we say that an edge $e$ is above $B$ if $\pi_{111}(p(e)) \cap B \ne \emptyset$. Let $B(E)$ be the edges of $E$ which are above good parallelograms, i.e.,

\[
B(E) = \{\, e \in E : \exists\, i \in I(E) \ \ \pi_{111}(p(e)) \cap D_i \ne \emptyset \,\}.
\]

Let next $F(E)$ be the edges of $E$ which are above the boundaries of the parallelograms $D_i$, $1 \le i \le k^2$, i.e.,

\[
F(E) = \{\, e \in E : \exists\, i \in \{1, \dots, k^2\} \ \ \pi_{111}(p(e)) \cap \partial D_i \ne \emptyset \,\}.
\]

We have $|F(E)| = k^2\, O(n/k) = O(kn)$. Let finally $M(E) = (E \setminus B(E)) \cup F(E)$. Let $i$ belong to $I(E)$. The set of the edges of $E$ which are above $D_i$ cuts the cylinder
of basis $D_i$ and direction $\nu$ in two infinite components and it has thus cardinality larger than $|\nu|_1 n^2/k^2 - O(n/k)$. Therefore

\[
|B(E)| \ge \bigl(|\nu|_1 (n^2/k^2) - O(n/k)\bigr) |I(E)| - 4 |F(E)| \ge |\nu|_1 n^2 - O(nk) - |\nu|_1\, 3\alpha n^4/k^2
\]

and

\[
|M(E)| \le 2\alpha n^2 + O(nk) + |\nu|_1\, 3\alpha n^4/k^2.
\]

Since $M(E)$ is a connected set of edges intersecting $D(n, A, \nu)$, the total number of possible configurations for $M(E)$ is less than $\exp O\bigl(2\alpha n^2 + O(nk) + |\nu|_1\, 3\alpha n^4/k^2\bigr)$. We next estimate the number of possible configurations for $B(E)$ once $M(E)$ is given. For each $i$ in $I(E)$, let $B_i$ be the edges of $B(E)$ which are above $D_i$ and let $a_i, b_i, c_i$ be the number of the edges of $B_i$ parallel to the first, second and third axis respectively. The number of possible choices for $B_i$ corresponding to a fixed value of $a_i, b_i, c_i$ is estimated with the help of the following lemma.

Lemma 5.8. Let $R$ be a parallelogram in $P_{111}$. Let $a, b, c$ belong to $\mathbb{N}$. Let $\mathcal{B}(a, b, c, R)$ be the set of the collections $B$ of edges above $R$ such that
• $B$ is connected and at least one edge of $B$ is at distance less than one from $P_{111}$,
• the $\pi_{111}$ projection of the collection of the plaquettes associated to $B$ is one to one and covers $R$,
• the number of the edges of $B$ parallel to the first, second and third axis are equal to $a, b, c$.

For any $\varepsilon > 0$, there exists $n(\varepsilon)$ such that:

\[
\forall\, n \ge n(\varepsilon) \quad \forall\, (a, b, c) \in \mathbb{N}^3 \qquad |\mathcal{B}(a, b, c, nR)| \le \exp\bigl((a + b + c)(\mathrm{ent}(a, b, c) + \varepsilon)\bigr).
\]

Remark. Notice that whenever $\mathcal{B}(a, b, c, nR)$ is not empty, then $a + b + c$ is of order $|\nu|_1 n^2$.

Proof. This follows from [10, Thm 1.1]: partition $\mathcal{B}(a, b, c, nR)$ into sets having the same boundary configurations (the edges which project to an $O(1)$ neighborhood of $\partial nR$). The size of each element of the partition is determined by the entropy formula of [10, Thm. 1.1]. The boundary configuration of largest entropy is the one whose boundary edges approximate a plane of slope $(a, b, c)$. The size of this element of the partition is less than $\exp((a + b + c)(\mathrm{ent}(a, b, c) + \varepsilon))$ uniformly over $(a, b, c)$ for $a + b + c$ sufficiently large. Since the number of elements of the partition is at most exponential in the length of the boundary, whereas $a, b, c$ are quadratic, summing over all elements of the partition gives the same bound up to a lower order error.

We apply Lemma 5.8 to the parallelogram $D_i$. For any $\varepsilon > 0$, there exists $\rho_0$ such that for any $n, k$ such that $k/n < \rho_0$, for each $i$ in $I(E)$, the number of possible choices for $B_i$ corresponding to a fixed value of $a_i, b_i, c_i$ is less than $\exp\bigl((a_i + b_i + c_i)(\mathrm{ent}(a_i, b_i, c_i) + \varepsilon)\bigr)$. Letting

\[
a_I = \sum_{i \in I} a_i, \qquad b_I = \sum_{i \in I} b_i, \qquad c_I = \sum_{i \in I} c_i,
\]
we have

\[
\max(|a - a_I|, |b - b_I|, |c - c_I|) \le |M(E)|,
\]

whence

\[
\max(|a^* - a_I|, |b^* - b_I|, |c^* - c_I|) \le 5\alpha n^2 + O(nk) + |\nu|_1\, 3\alpha n^4/k^2.
\]

The concavity of $\nu \in \mathbb{R}^3 \mapsto |\nu|_1\, \mathrm{ent}(\nu)$ yields

\[
\sum_{i \in I} (a_i + b_i + c_i)\, \mathrm{ent}(a_i, b_i, c_i) \le (a_I + b_I + c_I)\, \mathrm{ent}(a_I, b_I, c_I),
\]

and we conclude that the total number of possible configurations for $B(E)$ once $M(E)$, $a_I, b_I, c_I$ are fixed is bounded above by

\[
\binom{a_I + |I| - 1}{|I| - 1} \binom{b_I + |I| - 1}{|I| - 1} \binom{c_I + |I| - 1}{|I| - 1} \cdot \exp\bigl((a_I + b_I + c_I)(\mathrm{ent}(a_I, b_I, c_I) + \varepsilon)\bigr).
\]

(Recall that $\binom{p + q - 1}{q - 1}$ is the number of ways to partition $p$ identical elements into $q$ labelled subsets.) We now choose $\alpha < \rho_0^4$ and $k = \alpha^{1/4} n$. We then have $|I| \le \sqrt{\alpha}\, n^2$ which implies

\[
\binom{a_I + |I| - 1}{|I| - 1} \le \binom{|\nu|_1 n^2 (1 + \sqrt{\alpha})}{\sqrt{\alpha}\, n^2} = \exp\bigl(n^2\, O(\sqrt{\alpha} \ln \alpha)\bigr).
\]

Therefore

\[
\binom{a_I + |I| - 1}{|I| - 1} \binom{b_I + |I| - 1}{|I| - 1} \binom{c_I + |I| - 1}{|I| - 1} = \exp\bigl(O(\alpha^{1/4} n^2)\bigr).
\]
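The binomial coefficient recalled in the proof counts the ways of distributing identical elements among labelled subsets (the usual stars and bars count). The sketch below compares it with a direct enumeration for small values; the function name `count_distributions` and the parameter names `items` and `boxes` are assumptions of this example.

```python
from itertools import product
from math import comb


def count_distributions(items, boxes):
    """Count tuples (n_1, ..., n_boxes) of non-negative integers summing to items."""
    return sum(
        1
        for parts in product(range(items + 1), repeat=boxes)
        if sum(parts) == items
    )


# Compare the enumeration with the binomial coefficient for a few small cases.
for items in range(5):
    for boxes in range(1, 5):
        print(items, boxes, count_distributions(items, boxes),
              comb(items + boxes - 1, boxes - 1))
```

In the proof, the three factors correspond to splitting $a_I$, $b_I$ and $c_I$ among the $|I|$ good parallelograms.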
Now

\[
\max(|a^* - a_I|, |b^* - b_I|, |c^* - c_I|) \le \bigl(5\alpha + O(\alpha^{1/4}) + 9\sqrt{\alpha}\bigr) n^2 = O(\alpha^{1/4} n^2),
\]

so the number of possible values for $a_I, b_I, c_I$ is bounded by $O(\alpha^{3/4} n^6)$. Moreover ent is continuous with respect to the direction; hence there exists $\rho_1 > 0$ such that, for $\alpha < \rho_1$ and $n$ sufficiently large, we have

\[
\bigl| a^* + b^* + c^* - |\nu|_1 n^2 \bigr| \le \varepsilon n^2, \qquad |\mathrm{ent}(\nu) - \mathrm{ent}(a_I, b_I, c_I)| \le \varepsilon.
\]

Putting together the previous estimates, we see that for $\alpha < \min(\rho_0^4, \rho_1)$, and $n$ sufficiently large, the total number of possible choices for $E$ is less than

\[
O(\alpha^{3/4} n^6)\, \exp\bigl(O(\alpha^{1/4} n^2)\bigr)\, \exp\bigl((|\nu|_1 + \varepsilon) n^2 (\mathrm{ent}(\nu) + 2\varepsilon)\bigr).
\]

Since $\varepsilon$ and $\alpha$ can be chosen arbitrarily small, the desired estimate on $\phi(n, A, \nu, \alpha)$ follows.

Lemmas 5.5–5.7 together imply the following result, which implies the expansion stated in Theorem 1.1.
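Corollary 5.9. For any $\nu$ in $S^2$, any unit square $A$ orthogonal to $\nu$, we have

\[
\limsup_{\substack{(n,p) \to (\infty,1) \\ n < \infty,\ p < 1}} ?(n, p) \le |\nu|_1\, \mathrm{ent}(\nu).
\]

It remains to prove that the expansion is uniform with respect to $\nu$ in $S^2$. For $\varepsilon > 0$, let $T_\varepsilon$ be the symmetry in $\mathbb{R}^3$ defined by

\[
\forall\, x \in \mathbb{R}^3 \qquad T_\varepsilon(x) = \frac{e}{\varepsilon} - x, \qquad \text{where } e = (1, 1, 1).
\]

Since $\tau(\nu, p)$ is the support function of $W_\tau$, we have for any $\nu$ in $(\mathbb{R}^+)^3$,

\[
|\nu|_2\, \tau\Bigl(\frac{\nu}{|\nu|_2}, p\Bigr) = \sup_{w \in W_\tau} w \cdot \nu = \sup_{w \in T_\varepsilon(W_\tau)} T_\varepsilon(w) \cdot \nu = \frac{e \cdot \nu}{\varepsilon} + \sup_{w \in T_\varepsilon(W_\tau)} -w \cdot \nu.
\]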
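The identity above only uses that $T_\varepsilon$ is an involution, so it can be checked numerically on any finite set of points. In the sketch below a random finite point cloud stands in for the Wulff crystal $W_\tau$; this substitution, the helper names `support` and `reflect`, and the sample values of $\varepsilon$ and $\nu$ are assumptions made for illustration only.

```python
import random


def support(points, nu):
    """Support function of a finite point set: max over w of w . nu."""
    return max(sum(w_i * n_i for w_i, n_i in zip(w, nu)) for w in points)


def reflect(points, eps):
    """Apply T_eps(x) = e/eps - x with e = (1, 1, 1) to every point."""
    return [tuple(1.0 / eps - w_i for w_i in w) for w in points]


random.seed(0)
# Hypothetical stand-in for W_tau: 50 random points in the positive orthant.
W = [tuple(random.uniform(0.0, 5.0) for _ in range(3)) for _ in range(50)]
EPS = 0.25
NU = (0.2, 0.5, 0.3)  # a vector with non-negative coordinates

LHS = support(W, NU)
REFLECTED = reflect(W, EPS)
NEGATED = [tuple(-w_i for w_i in w) for w in REFLECTED]
# e . nu / eps  +  sup over T_eps(W) of (-w . nu)
RHS = sum(NU) / EPS + support(NEGATED, NU)
print(LHS, RHS)
```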
Restricting this identity to $S^2 \cap (\mathbb{R}^+)^3$, we get

\[
\forall\, \nu \in S^2 \cap (\mathbb{R}^+)^3 \qquad \tau(\nu, p) - \frac{|\nu|_1}{\varepsilon} = \sup_{w \in T_\varepsilon(W_\tau)} -w \cdot \nu,
\]

which converges pointwise towards $-|\nu|_1\, \mathrm{ent}(\nu)$. By homogeneity we see that

\[
\nu \mapsto |\nu|_2\, \tau\Bigl(\frac{\nu}{|\nu|_2}, p\Bigr) - \frac{|\nu|_1}{\varepsilon}
\]

is a sequence of convex functions from $(\mathbb{R}^+)^3$ to $\mathbb{R}$ which converges pointwise to the continuous function $-|\nu|_1\, \mathrm{ent}(\nu)$. The uniformity stated in Theorem 1.1 follows from the following lemma.

Lemma 5.10. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of convex functions from $(\mathbb{R}^+)^d$ to $\mathbb{R}$ which converges pointwise to a continuous function $f$. Then the convergence is uniform on any compact set included in $(\mathbb{R}^+)^d$.

Remark. A classical result shows that the convergence is uniform on any compact set included in the interior of $(\mathbb{R}^+)^d$ (see for instance [28, Theorem 10.8]). For a related result concerning more complicated domains, see [19].

Proof. The proof is done by induction on the dimension $d$. In the case $d = 0$, $(\mathbb{R}^+)^0 = \{0\}$ and pointwise convergence implies uniform convergence! Suppose now that the result holds in dimension $d - 1$, where $d$ is a fixed integer, $d \ge 1$. Let $(f_n)_{n \in \mathbb{N}}$, $f$ be functions from $(\mathbb{R}^+)^d$ to $\mathbb{R}$ satisfying the hypothesis of the lemma. Let $\varepsilon > 0$ and $M > 0$. The function $f$ is uniformly continuous on the compact set $[0, M]^d$, thus

\[
\exists\, \delta > 0 \quad \forall\, x, y \in [0, M]^d \qquad |x - y|_2 < 2\sqrt{d}\, \delta \ \Rightarrow\ |f(x) - f(y)| < \varepsilon.
\]

By the classical result quoted in the above remark, the sequence $(f_n)_{n \in \mathbb{N}}$ converges uniformly to $f$ on $[\delta, M]^d$, hence

\[
\exists\, N_1(\delta, M, \varepsilon) \quad \forall\, n \ge N_1 \quad \forall\, x \in [\delta, M]^d \qquad |f_n(x) - f(x)| < \varepsilon.
\]
For each $k$ in $\{1, \dots, d\}$, the restrictions of the functions $(f_n)_{n \in \mathbb{N}}$, $f$ to the $d - 1$ dimensional hyperplane $x_k = 0$ ($x_k$ is the $k$-th coordinate in the canonical basis of $\mathbb{R}^d$) are convex functions from $(\mathbb{R}^+)^{d-1}$ to $\mathbb{R}$ which satisfy the induction hypothesis. Therefore we have uniform convergence on the set

\[
D = [0, M]^d \cap \bigcup_{1 \le k \le d} \{\, x \in \mathbb{R}^d : x_k = 0 \,\},
\]

that is,

\[
\exists\, N_2(M, \varepsilon) \quad \forall\, n \ge N_2 \quad \forall\, x \in D \qquad |f_n(x) - f(x)| < \varepsilon.
\]

Let now $n$ be an integer larger than $\max(N_1, N_2)$. Let $y$ belong to $[0, M]^d \setminus D \setminus [\delta, M]^d$. Let $z_0$ be the orthogonal projection of $y$ on $[\delta, M]^d$ (that is, the point of $[\delta, M]^d$ closest to $y$). Let $z_1$ be the intersection of the line $(y\, z_0)$ and $D$ and let $z_2$ be the point symmetric of $y$ with respect to $z_0$. We have $|z_1 - z_0|_2 < \sqrt{d}\, \delta$. Since $y$ belongs to the segment $[z_0, z_1]$, there exist $\alpha, \beta \ge 0$ such that $\alpha + \beta = 1$ and $y = \alpha z_0 + \beta z_1$. By convexity and the previous inequalities, we have then

\[
f_n(y) \le \alpha f_n(z_0) + \beta f_n(z_1) \le \alpha f(z_0) + \beta f(z_1) + \varepsilon \le f(y) + 2\varepsilon.
\]

Finally, we have also $f_n(z_0) \le (1/2)(f_n(y) + f_n(z_2))$, whence

\[
f_n(y) \ge 2 f_n(z_0) - f_n(z_2) \ge 2 f(z_0) - 2\varepsilon - f(z_2) - \varepsilon \ge f(y) - 6\varepsilon.
\]

Thus we have uniform convergence over $[0, M]^d$ and the induction step is completed.

References
1. Aizenman, M., Chayes, J.T., Chayes, L., Fröhlich, J. and Russo, L.: On a sharp transition from area law to perimeter law in a system of random surfaces. Commun. Math. Phys. 92, 19–69 (1983)
2. Alexander, K.S., Chayes, J.T. and Chayes, L.: The Wulff construction and asymptotics of the finite cluster distribution for two–dimensional Bernoulli percolation. Commun. Math. Phys. 131, 1–50 (1990)
3. Almgren, F.: Questions and answers about area–minimizing surfaces and geometric measure theory. In: Differential geometry. Part 1, R. Greene (ed.) et al. Proc. Symp. Pure Math. 54, 29–53 (1993)
4. Bodineau, T.: The Wulff construction in three and more dimensions. Commun. Math. Phys. 207, 1, 197–229 (1999)
5. Bricmont, J., El Mellouki, A., Fröhlich, J.: Random surfaces in statistical mechanics: roughening, rounding, wetting. J. Stat. Phys. 42, 5–6, 743–798 (1986)
6. Cerf, R.: Large deviations for three dimensional supercritical percolation. Astérisque 267, (2000)
7. Cerf, R. and Pisztora, A.: On the Wulff crystal in the Ising model. Ann. Probab. 28, 3, 945–1015 (2000)
8. Cerf, R. and Pisztora, A.: Phase coexistence in Ising, Potts and percolation models. Ann. Inst. H. Poincaré (2001)
9. Cheng, H., Wu, T.T.: Phys. Rev. 164, 719 (1967)
10. Cohn, H., Kenyon, R., Propp, J.: A variational principle for domino tilings. J. AMS To appear
11. Deuschel, J.D. and Pisztora, A.: Surface order large deviations for high–density percolation. Probab. Theory Relat. Fields 104, 4, 467–482 (1996)
12. Dinghas, A.: Uber einen geometrischen Satz von Wulff für die Gleichgewichtsform von Kristallen. Z. Kristallogr. 105, 304–314 (1944)
13. Dobrushin, R.L., Kotecký, R. and Shlosman, S.B.: Wulff construction: a global shape from local interaction. Providence, RI: AMS translations series, 1992
14. Dobrushin, R.L. and Shlosman, S.B.: Thermodynamic inequalities for the surface tension and the geometry of the Wulff construction. In: Ideas and methods in quantum and statistical physics, 2, S. Albeverio (ed.), Cambridge: Cambridge University Press, (1992) pp. 461–483
15. Fonseca, I.: The Wulff theorem revisited. Proc. R. Soc. Lond. Ser. A 432 No. 1884, 125–145 (1991)
16. Fonseca, I. and Müller, S.: A uniqueness proof for the Wulff theorem. Proc. R. Soc. Edinb. Sect. A 119 No. 1/2, 125–136 (1991)
17. Grimmett, G.R.: Percolation. Berlin–Heidelberg–New York: Springer–Verlag, 1989
18. Grimmett, G.R.: The stochastic random-cluster process and the uniqueness of random-cluster measures. Ann. Probab. 23, 1461–1510 (1995)
19. Guberman, I.Ja.: On the uniform convergence of convex functions in a closed domain. Leningrad. Gos. Ped. Inst. Uv cen. Zap. 274, 7–37 (1965)
20. Herman, G.T. and Webster, D.: Surfaces of organs in discrete three–dimensional space. In: Mathematical aspects of computerized tomography, Oberwolfach, Lect. Notes Med. Inf. 8, 1981, pp. 204–224
21. Hörmander, L.: Sur la fonction d’appui des ensembles convexes dans un espace localement convexe. Arkiv Matematik 3, 181–186 (1954)
22. Kesten, H.: Aspects of first passage percolation. Ecole d’été de probabilités de Saint–Flour, XIV–1984, Lecture Notes in Math. 1180, Berlin–Heidelberg–New York: Springer, 1986, pp. 125–264
23. Messager, A., Miracle-Solé, S. and Ruiz, J.: Convexity properties of the surface tension and equilibrium crystals. J. Stat. Phys. 67, 3/4, 449–469 (1992)
24. Miracle-Solé, S.: Surface tension, step free energy, and facets in the equilibrium crystal. J. Stat. Phys. 79, 1/2, 183–214 (1995)
25. Pfister, C.E.: Large deviations and phase separation in the two–dimensional Ising model. Helv. Phys. Acta 64, 4, 953–1054 (1991)
26. Pisztora, A.: Surface order large deviations for Ising, Potts and percolation models. Probab. Theory Relat. Fields 104, 4, 427–466 (1996)
27. Rådström, H.: An embedding theorem for spaces of convex sets. Proc. Am. Math. Soc. 3, 165–169 (1952)
28. Rockafellar, R.T.: Convex analysis. Princeton, NJ: Princeton Univ. Press, 1970
29. Rolfsen, D.: Knots and links. Houston: Publish or Perish, 1990
30. Salinetti, G. and Wets, R.J.-B.: On the convergence of sequences of convex sets in finite dimensions. Siam review 21, 1, 18–33 (1979)
31. Shlosman, S.: Geometric variational problems of statistical mechanics and of combinatorics. Probabilistic techniques in equilibrium and nonequilibrium statistical physics. J. Math. Phys. 41, 3, 1364–1370 (2000)
32. Szalay, M. and Turán, P.: On some problems of the statistical theory of partitions with application to characters of the symmetric group. I,II,III. Acta Math. Acad. Sci. 29, 361–379, 381–392 (1977); 32, 129–155 (1978)
33. Taylor, J.E.: Crystalline variational problems. Bull. Am. Math. Soc. 84, 4, 568–588 (1978)
34. Taylor, J.E.: Existence and structure of solutions to a class of nonelliptic variational problems. Symposia Mathematica 14, 4, 499–508 (1974)
35. Taylor, J.E.: Unique structure of solutions to a class of nonelliptic variational problems. Proc. Symp. pure Math. 27, 419–427 (1975)
36. Vershik, A.: Statistical mechanics of combinatorial partitions and their limit shapes. Func. Anal. Appl. 30, 2, 90–105 (1996)
37. Wulff, G.: Zur Frage der Geschwindigkeit des Wachstums und der Auflösung der Kristallflächen. Z. Kristallogr. 34, 449–530 (1901)

Communicated by M. Aizenman
Commun. Math. Phys. 222, 181 – 200 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
The Hamiltonian Operator Associated with Some Quantum Stochastic Evolutions

M. Gregoratti

Dipartimento di Matematica “F.Brioschi”, Politecnico di Milano, piazza Leonardo da Vinci 32, 20133 Milano, Italy. E-mail: [email protected]

Received: 28 March 2001 / Accepted: 11 May 2001

Abstract: We consider the Hamiltonian operator associated to the quantum stochastic differential equation introduced by Hudson and Parthasarathy to describe a quantum mechanical evolution in the presence of a “quantum noise”. We characterize such a Hamiltonian in the case of arbitrary multiplicity and bounded coefficients: we find an essentially self-adjoint restriction of the operator and, in particular, we provide an explicit construction of a dense set of vectors belonging to its domain.

1. Introduction

Many important applications of quantum stochastic calculus (QSC) [17, 15] are based on the unitary solutions (stochastic evolutions) of quantum stochastic differential equations (QSDE’s) of the form [12]

\[
dV(t) = \sum_{i,j=0}^{N} L_{ji}\, V(t)\, dM_i^j(t), \qquad V(0) = 1, \tag{1}
\]

where $dM_0^0(t) = dt$ and the other differentials $dM_i^j(t)$ are the basic quantum noises $d\Lambda_{ij}(t)$, $dA_i(t)$, $dA_i^\dagger(t)$ of Hudson and Parthasarathy, while the coefficients $L_{ji}$ are linear operators on a complex separable Hilbert space $\mathcal{H}$, the initial space. We consider this class of QSDE’s with bounded coefficients (and arbitrary multiplicity $N$, finite or infinite). In this case there is a full characterization of the coefficients $L_{ji}$ such that the solutions $V$ are unitary [12, 16]. As soon as Hudson and Parthasarathy introduced QSDE’s of this kind, Frigerio and Maassen independently realized that each unitary solution $V$ enjoys the cocycle property (previously introduced by Accardi [1, 2]) and thus it is naturally associated to a strongly continuous one-parameter unitary group $U$ [8, 9, 13, 14]. Nevertheless it took more than
ten years to have the first results about the infinitesimal generator $K$ of $U$, which completely characterizes not only $U$, but also the solution $V$ of (1). Indeed, if we introduce a well known group $\Theta$ of shift operators with its generator $E_0$, the Hamiltonian $K$ allows us to obtain the solution $V(t)$ as a product

$$V(t) = \Theta_t^*\, U_t = e^{iE_0 t}\, e^{-iKt}, \qquad t \ge 0, \tag{2}$$

of two evolution groups, which are solutions of non-stochastic differential equations, formally $dU_t = -iK\,U_t\,dt$ and $d\Theta_t^* = iE_0\,\Theta_t^*\,dt$. The first results about the Hamiltonian $K$ were obtained by Chebotarev with the hypotheses of multiplicity $N = 1$ and of commuting coefficients $L_j^i$ endowed with a joint spectral family [3–6]. In this case he determined the structure of $K$ and its relation with the coefficients. As noted by Chebotarev, his operator $K$ turns out to be symmetric independently of the assumption about the coefficients. Afterwards, in the 1-multiplicity case with arbitrary bounded coefficients, it was shown that the Hamiltonian $K$ associated to $V$ always admits a restriction which is a densely defined symmetric operator of Chebotarev’s kind and in some particular cases it was proved that this restriction is essentially self-adjoint [10]. The PhD Thesis of the author [11] completed the study of the 1-multiplicity case by showing that this restriction is always essentially self-adjoint. Here, we not only present the result obtained in [11], but we also generalize it to the case of arbitrary multiplicity and bounded coefficients: the Hamiltonian $K$ associated to a stochastic evolution (1) is completely characterized by finding an essentially self-adjoint restriction of $K$. The generalization is not trivial even for the 2-multiplicity case: whenever the multiplicity $N$ is strictly bigger than 1, the construction of a dense set of vectors belonging to $D(K)$ leads us to introduce a new class of vectors, the pseudo-exponential vectors defined by (40).

The paper is organized as follows: Section 2 introduces the basic notions of QSC, including the quantum stochastic evolutions. Section 3 introduces the basic tools to study the Hamiltonian $K$: the matrix elements of a stochastic evolution are studied and some (unbounded) operators are defined together with their domains.
Section 4 characterizes the Hamiltonian $K$: first we find a restriction of $K$, then we prove that this restriction is densely defined and, finally, we show that it is essentially self-adjoint.

2. Notations and Preliminaries

The following introduction of notions of QSC is based mainly on [12, 17] and it is not a general one. Indeed we deal with a problem which can be stated using a quite simple version of QSC and moreover we can tackle it by employing fundamental properties of QSC which do not need more general formulations. Every Hilbert space in this paper is assumed to be complex and separable. The norm and the scalar product in such a space $\mathcal H$ are denoted by $\|\cdot\|_{\mathcal H}$ and $\langle\cdot|\cdot\rangle_{\mathcal H}$, or simply by $\|\cdot\|$ and $\langle\cdot|\cdot\rangle$; the scalar product is conjugate-linear in the first variable and linear in the second. We denote by $\mathcal B(\mathcal H)$ and $\mathcal U(\mathcal H)$ respectively the $C^*$-algebra of all bounded operators on $\mathcal H$ and the group of all unitary operators on $\mathcal H$; analogously we denote by $\mathcal B(\mathcal H_1;\mathcal H_2)$ the Banach space of all bounded operators from $\mathcal H_1$ to $\mathcal H_2$. Every time we consider tensor products of Hilbert spaces, each operator $L : \mathcal H_1 \to \mathcal H_2$ is identified with its extension $L\otimes 1 : \mathcal H_1\otimes\mathcal H \to \mathcal H_2\otimes\mathcal H$. For every open $\Omega\subseteq\mathbb R^n$, we denote by
$C^1(\Omega;\mathcal H)$ and $C_c^\infty(\Omega;\mathcal H)$ respectively the space of continuously differentiable functions from $\Omega$ to $\mathcal H$ and the space of infinitely differentiable functions with compact support from $\Omega$ to $\mathcal H$. Let $I$ be a measurable subset of $\mathbb R$ equipped with Lebesgue measure and let $\mathcal Z$ be a Hilbert space where we fix an orthonormal basis $\{z_j\}_{j\in J}$. The dimension of $\mathcal Z$ is called the multiplicity and $\mathcal Z$ is called the multiplicity space. The set $J$ is thought to be $\{1,\dots,N\}$ in the case of finite multiplicity $N$, while $J$ is thought to be $\mathbb N$ in the case of infinite multiplicity. We make the identification

$$L^2(I;\mathcal Z) = L^2(I\times J), \tag{3}$$

got by identifying a vector $f$ in the former space with the function on $I\times J$ defined by $(r,j)\mapsto\langle z_j|f(r)\rangle$. We set $f(r,j) = \langle z_j|f(r)\rangle = f_j(r)$. Introducing another Hilbert space $\mathcal H$, called the initial or system space, we denote by $L^2_{\rm symm}\big((I\times J)^n;\mathcal H\big)$ the space of totally symmetric square integrable functions from $(I\times J)^n$ to $\mathcal H$; then we identify the tensor product $\mathcal K[I] = \Gamma[I]\otimes\mathcal H$, between the symmetric Fock space $\Gamma[I]$ over $L^2(I;\mathcal Z)$ and the space $\mathcal H$, with the direct sum

$$\mathcal K[I] = \bigoplus_{n=0}^{\infty} L^2_{\rm symm}\big((I\times J)^n;\mathcal H\big) \tag{4}$$

consisting of sequences $\Upsilon = (\Upsilon_n)_{n=0}^{\infty}$, where each component $\Upsilon_n$ belongs to the corresponding $L^2_{\rm symm}\big((I\times J)^n;\mathcal H\big)$, and where

$$\|\Upsilon\|^2_{\mathcal K[I]} = \sum_{n=0}^{\infty}\frac{1}{n!}\,\|\Upsilon_n\|^2_{L^2_{\rm symm}((I\times J)^n;\mathcal H)} < \infty.$$
For each $f\in L^2(I;\mathcal Z)$, let $\psi(f)$ be the corresponding exponential vector in $\Gamma[I]$,

$$\psi(f) = (1, f, f^{\otimes 2}, \dots, f^{\otimes n}, \dots),$$

and let $\mathcal E(s)$ be the exponential domain of all the linear combinations of vectors of the form $\psi(f)\otimes h$, $f\in s\subseteq L^2(I;\mathcal Z)$, $h\in\mathcal H$. Note that $\mathcal E(s)$ is a dense subspace of $\mathcal K[I]$ for every dense $s$ in $L^2(I;\mathcal Z)$. We also set $\mathcal E[I] = \mathcal E\big(L^2(I;\mathcal Z)\big)$. Let us recall that, for every decomposition $I = I_1\cup I_2$ such that $I_1\cap I_2 = \emptyset$, we have the basic identification $\Gamma[I] = \Gamma[I_1]\otimes\Gamma[I_2]$, given by $\psi(f) = \psi(f|_{I_1})\otimes\psi(f|_{I_2})$, and also the natural immersion of $\Gamma[I_k]$ in $\Gamma[I] = \Gamma[I_k]\otimes\Gamma[I_k^c]$,

$$\phi\mapsto\phi\otimes\psi(0|_{I_k^c}), \qquad \forall\phi\in\Gamma[I_k].$$
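As a quick consistency check on the normalization above, the scalar product of two exponential vectors can be computed from the $1/n!$-weighted norm. The following worked computation is a sketch that is not displayed in the text; it assumes that the components $f^{\otimes n}$ carry no extra normalization factor, as the definition of $\psi(f)$ suggests.

```latex
\begin{align*}
\langle \psi(f) | \psi(g) \rangle_{\Gamma[I]}
  &= \sum_{n=0}^{\infty} \frac{1}{n!}\,
     \langle f^{\otimes n} | g^{\otimes n} \rangle_{L^2((I\times J)^n)} \\
  &= \sum_{n=0}^{\infty} \frac{1}{n!}\,
     \langle f | g \rangle_{L^2(I;\mathcal Z)}^{\,n}
   = e^{\langle f | g \rangle}, \\
\| \psi(f)\otimes h \|^2_{\mathcal K[I]}
  &= e^{\|f\|^2}\, \|h\|^2_{\mathcal H}.
\end{align*}
```

In particular every $\psi(f)\otimes h$ has finite norm in $\mathcal K[I]$, which is what makes the exponential domain a subspace of $\mathcal K[I]$.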
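We consider adapted processes of operators on $\mathcal K[\mathbb R_+]$, where, for every $t\ge 0$, we have the tensor product decomposition $\mathcal K[\mathbb R_+] = \mathcal K[(0,t)]\otimes\Gamma[(t,+\infty)]$. The basic quantum noises are the processes

$$M_0^i(t) = A_i(t) = A\big(I_{(0,t)}\otimes z_i\big), \qquad M_i^0(t) = A_i^\dagger(t) = A^\dagger\big(I_{(0,t)}\otimes z_i\big), \qquad M_i^j(t) = \Lambda_{ij}(t) = \Lambda\big(\pi_{(0,t)}\otimes|z_i\rangle\langle z_j|\big), \tag{5}$$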
where the indexes $i$ and $j$ belong to $J$, while $I_B$ is the indicator function of $B$ and $\pi_{(0,t)}\otimes|z_i\rangle\langle z_j|$ is the tensor product between the multiplication operator by $I_{(0,t)}$ in $L^2(\mathbb R_+)$ and the operator in $\mathcal Z$ mapping $x\mapsto\langle z_j|x\rangle z_i$. We are interested in the solutions of the QSDE (the right Hudson–Parthasarathy equation)

$$dV(t) = \Big[\sum_{i,j\in J}\big(S_{ij}-\delta_{ij}\big)\,d\Lambda_{ij}(t) - \sum_{i,j\in J} R_i^*S_{ij}\,dA_j(t) + \sum_{j\in J} R_j\,dA_j^\dagger(t) - \Big(iH + \frac12\sum_{j\in J}R_j^*R_j\Big)dt\Big]V(t), \qquad V(0) = 1, \tag{6}$$
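where $H$, $R_j$, $S_{ij}$ are operators on $\mathcal H$, called system operators, such that

1. $H, R_j, S_{ij}\in\mathcal B(\mathcal H)$, $\forall i,j\in J$,
2. $H = H^*$,
3. $R\in\mathcal B(\mathcal H;\mathcal Z\otimes\mathcal H)$, where $Rh = \sum_j z_j\otimes R_jh$,
4. $S\in\mathcal U(\mathcal Z\otimes\mathcal H)$, where $S = \sum_{i,j}|z_i\rangle\langle z_j|\otimes S_{ij}$. (7)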
Let us recall that Condition 3 is equivalent to the strong convergence of $\sum_i R_i^*R_i$, that Condition 4 is equivalent to $\sum_i S_{ij}^*S_{i\ell} = \sum_i S_{ji}S_{\ell i}^* = \delta_{j\ell}$ (where the sums are strongly convergent) and that these conditions imply the strong convergence of $\sum_i R_i^*S_{ij}$. Note that $\sum_i R_i^*R_i = R^*R$. The QSDE (6) is an equation of type (1), where the differential $dM_0^0(t)$ is just $dt$, the other differentials $dM_i^j(t)$ are given by (5) and the coefficients $L_j^i$ are bounded and given through $H$, $R_j$, $S_{ij}$. When we consider unitary adapted processes $V$ satisfying (1) with $L_j^i\in\mathcal B(\mathcal H)$, the conditions (7), which are conditions on the coefficients $L_j^i$, are not only necessary for $V$ to be unitary, but also sufficient. Actually the following theorem holds.

Theorem 1. Let $H$, $R_j$, $S_{ij}$ be system operators satisfying the conditions (7). Then there exists a unique adapted process $V$ satisfying the QSDE (6). Moreover, the solution $V$ is strongly continuous and unitary ([17, Theorem 27.8, p. 228]) and the correspondence $\{H, R_j, S_{ij}\}\mapsto V$ is injective ([17, Proposition 27.3, p. 224]).
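The adjoint process $V^* = \{V(t)^*\}_{t\ge0}$, strongly continuous and unitary, is the unique adapted process satisfying the QSDE (left Hudson–Parthasarathy equation)

$$dV^*(t) = V^*(t)\Big[\sum_{i,j}\big(S_{ij}^*-\delta_{ij}\big)\,d\Lambda_{ji}(t) - \sum_{i,j}S_{ij}^*R_i\,dA_j^\dagger(t) + \sum_j R_j^*\,dA_j(t) + \Big(iH - \frac12\sum_j R_j^*R_j\Big)dt\Big], \qquad V^*(0) = 1. \tag{8}$$

Since $V$ and $V^*$ are adapted processes of bounded operators on $\mathcal K[\mathbb R_+]$, both the operators $V(t)$ and $V^*(t)$ are factorized operators of the form $W\otimes 1$ on $\mathcal K[(0,t)]\otimes\Gamma[(t,\infty)]$. Since $V^*$ satisfies (8), its matrix elements on $\mathcal E[\mathbb R_+]$ satisfy

$$\begin{aligned}\langle\psi(f)\otimes k|V^*(t)\,\psi(v)\otimes h\rangle ={}& \langle\psi(f)\otimes k|\psi(v)\otimes h\rangle \\ &+ \int_0^t\Big\langle\psi(f)\otimes k\Big|V^*(s)\Big[\sum_{i,j}\big(S_{ij}^*-\delta_{ij}\big)\bar f_j(s)v_i(s) - \sum_{i,j}S_{ij}^*R_i\,\bar f_j(s) \\ &\qquad + \sum_j R_j^*\,v_j(s) + iH - \frac12\sum_j R_j^*R_j\Big]\psi(v)\otimes h\Big\rangle\,ds\end{aligned}\tag{9}$$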
for every $t\ge0$. An analogous formula can be deduced from (6) for the matrix elements of $V$. In order to associate a group $U$ to the solution $V$ of (6), we first introduce the strongly continuous one-parameter unitary group $\theta$ of the shift operators on $L^2(\mathbb R;\mathcal Z)$ and its second quantization $\Theta$ on $\Gamma[\mathbb R]$: for every $t\in\mathbb R$,

$$\theta_tf(r) = f(r+t), \quad\forall f\in L^2(\mathbb R), \qquad\qquad \Theta_t\,\psi(f) = \psi(\theta_tf), \quad\forall f\in L^2(\mathbb R). \tag{10}$$

Let us extend the operators $\Theta_t$ and $V(t)$ to the space $\mathcal K = \mathcal K[\mathbb R] = \Gamma[\mathbb R_-]\otimes\Gamma[\mathbb R_+]\otimes\mathcal H$. Thus, on one side we have $\Theta$, which is a strongly continuous one-parameter unitary group, and on the other side we have the strongly continuous unitary adapted process $V$, which is the solution of the Hudson–Parthasarathy equation (6) with system operators satisfying (7). Since they are related by the cocycle property (of $V$ w.r.t. $\Theta$)

$$V(s+t) = \Theta_s^*\,V(t)\,\Theta_s\,V(s), \qquad\forall s,t\ge0, \tag{11}$$

they can be combined to define the strongly continuous one-parameter unitary group $U$ on $\mathcal K$,

$$U_t = \begin{cases}\Theta_t\,V(t), & \text{if } t\ge0,\\ V^*(|t|)\,\Theta_t, & \text{if } t\le0.\end{cases}\tag{12}$$
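The step from the cocycle property to the group law can be made explicit. The following worked computation is a sketch, assuming only (11), the group property of the shift $\Theta$ and the definition (12); it treats $s,t\ge0$ and the inverse, from which the mixed-sign cases follow.

```latex
\begin{align*}
U_t\,U_s
  &= \Theta_t V(t)\,\Theta_s V(s) \\
  &= \Theta_t\Theta_s\,\bigl(\Theta_s^{*} V(t)\,\Theta_s V(s)\bigr)
     && \text{since } \Theta_s\Theta_s^{*} = 1 \\
  &= \Theta_{s+t}\,V(s+t)
     && \text{by the cocycle property (11)} \\
  &= U_{s+t}, \qquad s,t \ge 0, \\[4pt]
U_{-t}
  &= V^{*}(t)\,\Theta_{-t}
   = \bigl(\Theta_t V(t)\bigr)^{*}
   = U_t^{*} = U_t^{-1}, \qquad t \ge 0 .
\end{align*}
```

For $t\ge s\ge0$ the first identity gives $U_t = U_{t-s}U_s$, hence $U_tU_{-s} = U_{t-s}$, which is the pattern used for the remaining sign combinations.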
Theorem 2 ([8, 9, 13, 14]). Let $\Theta$ be the strongly continuous one-parameter unitary group defined by (10) and let $V$ be the solution of the QSDE (6) with system operators satisfying the conditions (7). Then $V$ has the cocycle property (11) and the family of unitary operators $U = \{U_t\}_{t\in\mathbb R}$ defined by (12) is a strongly continuous one-parameter unitary group.

In the applications to the quantum theory of open systems, the Hilbert space $\mathcal K = \Gamma[\mathbb R]\otimes\mathcal H$ describes a system consisting of a field $S_\Gamma$ of bosons “of type $L^2(\mathbb R;\mathcal Z) = L^2(\mathbb R)\otimes\mathcal Z$” plus a system $S_{\mathcal H}$ “of type $\mathcal H$”, and the QSDE (6) is introduced to describe a class of interacting evolutions of $S_\Gamma + S_{\mathcal H}$, where the action of $S_\Gamma$ on $S_{\mathcal H}$ has no memory effects. The free evolution of the field is supposed to be given by $\Theta$, and hence, physically, the degree of freedom described by $L^2(\mathbb R)$ is thought to be the conjugate moment of energy. The group $U$ (12) is interpreted as the reversible evolution of the isolated system $S_\Gamma + S_{\mathcal H}$, and thus $V(t) = \Theta_t^*U_t$ is the evolution operator giving the state dynamics from time 0 to time $t\ge0$ of the same system in interaction picture w.r.t. the free field dynamics. The evolutions $\Theta$ and $U$ are strongly differentiable on the dense domains of their Hamiltonians ($E_0$ and $K$) where they satisfy the Schrödinger equations

$$d\Theta_t = -iE_0\,\Theta_t\,dt, \tag{13}$$

$$dU_t = -iK\,U_t\,dt. \tag{14}$$
Nevertheless, even if $V$ is the product of two differentiable groups, it does not need to be differentiable for domain reasons, but, remarkably, it is an adapted process and it satisfies a QSDE. Thus we are dealing with a class of evolutions, called (quantum) stochastic evolutions, which are not defined in terms of the Hamiltonian $K$ and the usual Schrödinger equation (14), but in terms of the system operators $H$, $R_j$, $S_{ij}$ and the Hudson–Parthasarathy equation (6). The Hamiltonians $E_0$ and $H$ represent the energy of $S_\Gamma$ and $S_{\mathcal H}$ respectively, the Hamiltonian $K$ represents the total energy (comprehending the interaction) of $S_\Gamma + S_{\mathcal H}$ and the system operators $R_j$ and $S_{ij}$ control the interaction between $S_\Gamma$ and $S_{\mathcal H}$. The aim of this paper is the characterization of the Hamiltonian $K$ associated to a QSDE of Hudson–Parthasarathy. This characterization is obtained in the general case of arbitrary multiplicity and arbitrary system operators satisfying (7). If there is no interaction, i.e. if $R_j = 0$ and $S_{ij} = \delta_{ij}$, then $V(t) = e^{-itH}$, $U_t = e^{-itE_0}e^{-itH}$ and therefore $K = E_0 + H$, which is a self-adjoint operator defined on $D(E_0)$. In the case of interaction we find an essentially self-adjoint restriction of $K$ which appears as a singular perturbation of $E_0 + H$.

Before starting the study of $K$, let us recall that the generators $\varepsilon_0$ and $E_0$ of the groups $\theta$ on $L^2(\mathbb R;\mathcal Z)$ and $\Theta$ on $\mathcal K$ are unbounded self-adjoint operators. In order to give explicitly their domains and their actions we introduce some functional spaces. For every open $I\subseteq\mathbb R$ and every (separable) Hilbert space $\mathcal H$ we set $H^{\Sigma}(I^0;\mathcal H) = \mathcal H$, and for every $n\ge1$ we introduce the Sobolev space

$$H^{\Sigma}(I^n;\mathcal H) = \Big\{v\in L^2(I^n;\mathcal H)\ \text{s.t.}\ \sum_{\ell=1}^n\partial_\ell v\in L^2(I^n;\mathcal H)\Big\}, \tag{15}$$

where all the derivatives are in the sense of distributions on $I^n$ taking values in $\mathcal H$. Then $H^{\Sigma}(I^n;\mathcal H)$ contains the usual Sobolev space $H^1(I^n;\mathcal H)$ and it is a complex separable
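Hilbert space with respect to its Sobolev norm,

$$\|v\|^2_{H^{\Sigma}(I^n;\mathcal H)} = \|v\|^2_{L^2(I^n;\mathcal H)} + \Big\|\sum_{\ell=1}^n\partial_\ell v\Big\|^2_{L^2(I^n;\mathcal H)}.$$

Clearly the operator $\sum_{\ell=1}^n\partial_\ell : H^{\Sigma}(I^n;\mathcal H)\to L^2(I^n;\mathcal H)$ is continuous. Note that the condition in (15) concerns a unique directional derivative because, if $\{e_\ell\}_{\ell=1}^n$ denotes the canonical basis for $\mathbb R^n$, the sum $\sum_{\ell=1}^n\partial_\ell v$ coincides with the derivative of $v$ in the direction of $e_1+\dots+e_n$. According to the identifications (3) and (4), we identify $H^{\Sigma}(I^n;\mathcal Z^{\otimes n}\otimes\mathcal H)$ with

$$H^{\Sigma}\big((I\times J)^n;\mathcal H\big) = \Big\{v\in L^2\big((I\times J)^n;\mathcal H\big)\ \text{s.t.}\ v(\cdot,j_1;\dots;\cdot,j_n)\in H^{\Sigma}(I^n;\mathcal H)\ \forall j_1,\dots,j_n\in J,\ \sum_{j_1,\dots,j_n}\Big\|\sum_{\ell=1}^n\partial_\ell v(\cdot,j_1;\dots;\cdot,j_n)\Big\|^2_{L^2(I^n;\mathcal H)}<\infty\Big\},$$

and we set $H^{\Sigma}_{\rm symm}\big((I\times J)^n;\mathcal H\big) = H^{\Sigma}\big((I\times J)^n;\mathcal H\big)\cap L^2_{\rm symm}\big((I\times J)^n;\mathcal H\big)$. Then $D(\varepsilon_0) = H^1(\mathbb R;\mathcal Z)$, $\varepsilon_0v = iv'$, while

$$D(E_0) = \Big\{\Phi\in\mathcal K\ \text{s.t.}\ \Phi_n\in H^{\Sigma}_{\rm symm}\big((\mathbb R\times J)^n;\mathcal H\big)\ \forall n,\ \sum_{n=1}^\infty\frac1{n!}\Big\|\sum_{\ell=1}^n\partial_\ell\Phi_n\Big\|^2<\infty\Big\}, \qquad (E_0\Phi)_n = i\sum_{\ell=1}^n\partial_\ell\Phi_n, \tag{16}$$

where the empty sum is equal to 0, so that $(E_0\Phi)_0 = 0$. Moreover $\psi(v)\otimes h$ belongs to $D(E_0)$ if and only if $v$ belongs to $D(\varepsilon_0) = H^1(\mathbb R;\mathcal Z)$ and

$$E_0\,\psi(v)\otimes h = A^\dagger(iv')\,\psi(v)\otimes h, \qquad\forall v\in H^1(\mathbb R;\mathcal Z),\ h\in\mathcal H. \tag{17}$$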
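Formula (17) can be checked componentwise. The sketch below is not part of the original text; it assumes the standard action of the creation operator on exponential vectors, $A^\dagger(g)\psi(v) = \frac{d}{d\lambda}\psi(v+\lambda g)\big|_{\lambda=0}$, in the unnormalized convention used here for the Fock components (the parameter $\lambda$ is introduced only for this illustration).

```latex
\begin{align*}
\bigl(\psi(v)\otimes h\bigr)_n(r_1,\dots,r_n)
  &= v(r_1)\otimes\cdots\otimes v(r_n)\otimes h, \\
\bigl(E_0\,\psi(v)\otimes h\bigr)_n
  &= i\sum_{\ell=1}^{n}\partial_\ell\bigl(v^{\otimes n}\otimes h\bigr)
   = \sum_{\ell=1}^{n} v^{\otimes(\ell-1)}\otimes (i v')
     \otimes v^{\otimes(n-\ell)}\otimes h, \\
\bigl(A^\dagger(g)\,\psi(v)\otimes h\bigr)_n
  &= \frac{d}{d\lambda}\Big|_{\lambda=0}
     \bigl(\psi(v+\lambda g)\otimes h\bigr)_n
   = \sum_{\ell=1}^{n} v^{\otimes(\ell-1)}\otimes g
     \otimes v^{\otimes(n-\ell)}\otimes h .
\end{align*}
```

Choosing $g = iv'$ makes the last two lines coincide for every $n$, which is (17); the requirement $v\in H^1(\mathbb R;\mathcal Z)$ is what makes $v'$ square integrable.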
The exponential domain $\mathcal E\big(H^1(\mathbb R;\mathcal Z)\big)$ is a domain of essential self-adjointness for $E_0$.

3. The Matrix Elements of a Stochastic Evolution $U$

We begin the study of $K$ by considering the matrix elements of $U$. We develop and generalize some results obtained in [10] for the 1-multiplicity case.
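3.1. The matrix elements of $U$ on the exponential domain $\mathcal E[\mathbb R]$. Let us denote by $\mathrm f(t)$ the general function obtained from a matrix element of $U_t^*$ when $t\in[0,+\infty)$. Since $U_t^* = V^*(t)\,\Theta_t^*$ for every $t\ge0$ and since $\mathcal E[\mathbb R]$ is invariant under $\Theta$, we know from quantum stochastic calculus rules (see (9)) that

$$\begin{aligned}\mathrm f(t) ={}& \langle\psi(f)\otimes k|U_t^*\,\psi(v)\otimes h\rangle = \langle\psi(f)\otimes k|\Theta_t^*\,\psi(v)\otimes h\rangle \\ &+ \int_0^t\Big\langle\psi(f)\otimes k\Big|V^*(s)\Big[\sum_{i,j}\bar f_j(s)v_i(s-t)\big(S_{ij}^*-\delta_{ij}\big) - \sum_{i,j}\bar f_j(s)S_{ij}^*R_i \\ &\qquad + \sum_j v_j(s-t)R_j^* + iH - \frac12\sum_j R_j^*R_j\Big]\Theta_t^*\,\psi(v)\otimes h\Big\rangle\,ds\end{aligned}\tag{18}$$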
for every $k, h$ in $\mathcal H$ and every $f, v$ in $L^2(\mathbb R;\mathcal Z)$. Motivated by Chebotarev’s results, we deduce a different representation for $\mathrm f(t)$ when $v$ has some regularity property. More precisely, we consider $v\in H^1(\mathbb R_*;\mathcal Z)$, where $\mathbb R_* = \mathbb R\setminus\{0\}$. Let us recall that each function $v\in H^1(\mathbb R_*;\mathcal Z)$ is (absolutely) continuous on $\mathbb R_-$ and $\mathbb R_+$ while it can be discontinuous at 0; anyway $v$ admits left and right limits at 0 (say $v(0^-)$ and $v(0^+)$) and $v\in H^1(\mathbb R;\mathcal Z)$ if and only if $v(0^-) = v(0^+)$.

Proposition 1. Let $f\in L^2(\mathbb R;\mathcal Z)$, $v\in H^1(\mathbb R_*;\mathcal Z)$, $k, h\in\mathcal H$. Let $U$ be the stochastic evolution defined by (12), (10), (6) with system operators satisfying the conditions (7). For $t\ge0$, let us define $\mathrm f(t) = \langle\psi(f)\otimes k|U_t^*\,\psi(v)\otimes h\rangle$. Then the function $\mathrm f$ is absolutely continuous and

$$\begin{aligned}\mathrm f(t) ={}& \langle\psi(f)\otimes k|\psi(v)\otimes h\rangle \\ &+ \int_0^t\Big\langle\psi(f)\otimes k\Big|U_s^*\Big[i\Big(H + A^\dagger(iv') + \frac i2\sum_j R_j^*R_j - i\sum_j R_j^*\,v_j(0^-)\Big) \\ &\qquad + \sum_j\bar f_j(s)\Big(\sum_i S_{ij}^*\,v_i(0^-) - \sum_i S_{ij}^*R_i - v_j(0^+)\Big)\Big]\psi(v)\otimes h\Big\rangle\,ds,\end{aligned}\tag{19}$$

where all the sums are strongly convergent.

Proof. It can be proved just following the proof of Proposition 1 in [10]. If $f$ belongs to $C^1(\mathbb R;\mathcal Z)\cap H^1(\mathbb R;\mathcal Z)$ and $v$ belongs to $C^1(\mathbb R_*;\mathcal Z)\cap H^1(\mathbb R_*;\mathcal Z)$ and only finitely many components $f_j$ and $v_j$ are different from 0, then all the sums in (18) are finite (with the exception of $\sum_i S_{ij}^*R_i$ and $\sum_j R_j^*R_j$), the function $\mathrm f$ is continuously differentiable and the formula (19) is recovered by the quantum Ito formula. Then the general case is obtained by continuity. All the sums are strongly convergent by Lemma 27.7 in [17]. □
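Equation (19) can also be displayed in a basis free form:

$$\begin{aligned}\mathrm f(t) ={}& \langle\psi(f)\otimes k|\psi(v)\otimes h\rangle_{\mathcal K} \\ &+ \int_0^t\Big\langle\psi(f)\otimes k\Big|U_s^*\,i\Big[\Big(H + A^\dagger(iv') + \frac i2R^*R\Big)\psi(v)\otimes h - iR^*\,v(0^-)\otimes\psi(v)\otimes h\Big]\Big\rangle_{\mathcal K}\,ds \\ &+ \int_0^t\Big\langle f(s)\otimes\psi(f)\otimes k\Big|U_s^*\Big[S^*\,v(0^-)\otimes\psi(v)\otimes h - S^*R\,\psi(v)\otimes h - v(0^+)\otimes\psi(v)\otimes h\Big]\Big\rangle_{\mathcal Z\otimes\mathcal K}\,ds.\end{aligned}\tag{20}$$

Thanks to (19) or (20), we could already establish if $\psi(v)\otimes h$ (or a linear combination of these regular exponential vectors) belongs to $D(K)$, but we could find $\mathcal E\big(H^1(\mathbb R_*;\mathcal Z)\big)\cap D(K) = \{0\}$. Therefore, in order to determine $K$, we want to extend (19) outside the exponential domain. Let us remark that formula (19) exhibits unbounded operators acting on $\psi(v)\otimes h$, namely

$$A^\dagger(iv')\,\psi(v)\otimes h, \qquad v_j(0^-)\,\psi(v)\otimes h, \qquad v_j(0^+)\,\psi(v)\otimes h, \tag{21}$$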
which are well defined thanks to the regularity of $v$. Therefore, just as Chebotarev [3–7], we define some subspaces of $\mathcal K$, consisting of vectors with “regular” components, and then we use them to extend the unbounded operators (21). Since we employ these vectors to determine a domain for the Hamiltonian $K$, we want these subspaces to be as large as possible, in order to obtain a domain of essential self-adjointness for $K$. Actually our definitions are more general than Chebotarev’s ones.

3.2. Vectors with regular components. For every $n\ge1$, let us consider the Sobolev space $H^{\Sigma}(\mathbb R_*^n;\mathcal H)$, where the open set $\mathbb R_*^n$ is the space $\mathbb R^n$ minus the coordinate hyperplanes. We denote by $Q_m$ ($m = 1,\dots,2^n$) the connected parts of $\mathbb R_*^n$ and we call them standard $n$-particle chambers [7]. Their boundaries are denoted by $\partial Q_m$. From now on, each partial derivative is to be understood as a derivative in the sense of distributions on $\mathbb R_*^n$. For every $\ell = 1,\dots,n$ and $s\in\mathbb R_*\cup\{0^-,0^+\}$, the hyperplane $\{r_\ell = s\}$ is not parallel to $e_1+\dots+e_n$ and thus, for every function $v\in H^{\Sigma}(\mathbb R_*^n;\mathcal H)$, the trace $v|_{\{r_\ell=s\}}$ is naturally defined as a function in $L^2(\mathbb R^{n-1};\mathcal H)$. As usual, the trace operator $\cdot|_{\{r_\ell=s\}} : H^{\Sigma}(\mathbb R_*^n;\mathcal H)\to L^2(\mathbb R^{n-1};\mathcal H)$ is continuous. A function $v\in H^{\Sigma}(\mathbb R_*^n;\mathcal H)$ is regular inside each chamber (i.e. $v|_{Q_m}\in H^{\Sigma}(Q_m;\mathcal H)$ for every $m = 1,\dots,2^n$), but it may be discontinuous through the chamber boundaries (i.e. $v|_{\{r_\ell=0^-\}}$ may be different from $v|_{\{r_\ell=0^+\}}$); it is continuous through all the chamber boundaries (i.e. $v|_{\{r_\ell=0^-\}} = v|_{\{r_\ell=0^+\}}$ for every $\ell = 1,\dots,n$) if and only if $v$ belongs to $H^{\Sigma}(\mathbb R^n;\mathcal H)$, which is a closed subspace of $H^{\Sigma}(\mathbb R_*^n;\mathcal H)$. In this case the sum $\sum_{\ell=1}^n\partial_\ell v$ coincides with the sum of the same derivatives in the sense of the distributions on $\mathbb R^n$. Moreover the usual integration by parts formula holds on every chamber $Q_m$.

Let $v|_{\partial Q_m}$ denote the trace of $v$ on the chamber boundary $\partial Q_m$, obtained combining $n$ traces of the kind $v|_{\{r_\ell=0^\pm\}}$, and let $\eta_m$ denote the outer normal on $\partial Q_m$. Then for every $u, v$ in $H^{\Sigma}(\mathbb R_*^n;\mathcal H)$, we have

$$\int_{Q_m}\Big\langle u\Big|\sum_{\ell=1}^n\partial_\ell v\Big\rangle_{\mathcal H} = -\int_{Q_m}\Big\langle\sum_{\ell=1}^n\partial_\ell u\Big|v\Big\rangle_{\mathcal H} + \int_{\partial Q_m}\sum_{\ell=1}^n\eta_m\cdot e_\ell\,\big\langle u|_{\partial Q_m}\big|v|_{\partial Q_m}\big\rangle_{\mathcal H}. \tag{22}$$
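Note that $\eta_m\cdot e_\ell$ is a function different from 0 only on $\partial Q_m\cap\{r_\ell = 0\}$ and thus $\sum_{\ell=1}^n\eta_m\cdot e_\ell$ can take only the values $\pm1$. We are interested in $H^{\Sigma}(\mathbb R_*^n;\mathcal Z^{\otimes n}\otimes\mathcal H) = H^{\Sigma}\big((\mathbb R_*\times J)^n;\mathcal H\big)$ and in its closed subspace $H^{\Sigma}_{\rm symm}\big((\mathbb R_*\times J)^n;\mathcal H\big)$. Then, for every $v\in H^{\Sigma}\big((\mathbb R_*\times J)^n;\mathcal H\big)$ we can consider both $v|_{\{r_\ell=s\}}\in L^2(\mathbb R^{n-1};\mathcal Z^{\otimes n}\otimes\mathcal H) = \mathcal Z\otimes L^2\big((\mathbb R\times J)^{n-1};\mathcal H\big)$ and $v|_{\{r_\ell=s,\,j_\ell=i\}}\in L^2\big((\mathbb R\times J)^{n-1};\mathcal H\big)$, where $v|_{\{r_\ell=s\}} = \sum_i z_i\otimes v|_{\{r_\ell=s,\,j_\ell=i\}}$. Moreover, for every $u, v\in H^{\Sigma}_{\rm symm}\big((\mathbb R_*\times J)^n;\mathcal H\big)$, the following integration by parts formula holds

$$\begin{aligned}\Big\langle u\Big|\sum_{\ell=1}^n\partial_\ell v\Big\rangle_{L^2((\mathbb R\times J)^n;\mathcal H)} ={}& -\Big\langle\sum_{\ell=1}^n\partial_\ell u\Big|v\Big\rangle_{L^2((\mathbb R\times J)^n;\mathcal H)} \\ &+ n\,\big\langle u|_{\{r_n=0^-\}}\big|v|_{\{r_n=0^-\}}\big\rangle_{\mathcal Z\otimes L^2((\mathbb R\times J)^{n-1};\mathcal H)} - n\,\big\langle u|_{\{r_n=0^+\}}\big|v|_{\{r_n=0^+\}}\big\rangle_{\mathcal Z\otimes L^2((\mathbb R\times J)^{n-1};\mathcal H)} \\ ={}& -\Big\langle\sum_{\ell=1}^n\partial_\ell u\Big|v\Big\rangle_{L^2((\mathbb R\times J)^n;\mathcal H)} \\ &+ n\sum_i\big\langle u|_{\{r_n=0^-,\,j_n=i\}}\big|v|_{\{r_n=0^-,\,j_n=i\}}\big\rangle_{L^2((\mathbb R\times J)^{n-1};\mathcal H)} \\ &- n\sum_i\big\langle u|_{\{r_n=0^+,\,j_n=i\}}\big|v|_{\{r_n=0^+,\,j_n=i\}}\big\rangle_{L^2((\mathbb R\times J)^{n-1};\mathcal H)}.\end{aligned}\tag{23}$$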
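It is obtained from (22) by summing up over the $n$-particle chambers $Q_m\subseteq\mathbb R_*^n$ and exploiting the total symmetry of $u$ and $v$. Now we define some dense subspaces of $\mathcal K$ consisting of regular vectors, i.e. vectors whose components belong to $H^{\Sigma}_{\rm symm}\big((\mathbb R_*\times J)^n;\mathcal H\big)$. Let $s\in\mathbb R_*\cup\{0^-,0^+\}$ and define

$$\mathcal W = \Big\{\Phi\in\mathcal K\ \text{s.t.}\ \Phi_n\in H^{\Sigma}_{\rm symm}\big((\mathbb R_*\times J)^n;\mathcal H\big)\ \forall n,\ \sum_{n=1}^\infty\frac1{n!}\Big\|\sum_{\ell=1}^n\partial_\ell\Phi_n\Big\|^2_{L^2((\mathbb R\times J)^n;\mathcal H)}<\infty\Big\}, \tag{24}$$

$$\mathcal V_s = \Big\{\Phi\in\mathcal W\ \text{s.t.}\ \sum_{n=0}^\infty\frac1{n!}\big\|\Phi_{n+1}|_{\{r_{n+1}=s\}}\big\|^2_{\mathcal Z\otimes L^2((\mathbb R\times J)^n;\mathcal H)}<\infty\Big\}, \tag{25}$$

$$\mathcal V_{0^\pm} = \mathcal V_{0^-}\cap\mathcal V_{0^+}, \tag{26}$$

$$\mathcal V_0 = \Big\{\Phi\in\mathcal V_{0^\pm}\ \text{s.t.}\ \Phi_n\in H^{\Sigma}_{\rm symm}\big((\mathbb R\times J)^n;\mathcal H\big)\ \forall n\Big\}, \tag{27}$$

$$\mathcal V_{00} = \Big\{\Phi\in\mathcal V_0\ \text{s.t.}\ \Phi_n|_{\{r_n=0^-\}} = \Phi_n|_{\{r_n=0^+\}} = 0\ \forall n\Big\}. \tag{28}$$

Clearly $\mathcal V_{00}\subseteq\mathcal V_0\subseteq\mathcal V_{0^\pm}\subseteq\mathcal W$. We discuss the properties of these spaces beginning from $\mathcal W$.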
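The space $\mathcal W$ contains $D(E_0)$ (16) and hence it is dense. It is the natural domain for the unbounded operator

$$E : \mathcal W\to\mathcal K, \qquad (E\Phi)_n = i\sum_{\ell=1}^n\partial_\ell\Phi_n. \tag{29}$$

The operator $E$ is a non-symmetric operator which extends $E_0$. Furthermore $\mathcal W$ contains $\mathcal E\big(H^1(\mathbb R_*;\mathcal Z)\big)$ and a generalization of (17) holds:

$$E\,\psi(v)\otimes h = A^\dagger(iv')\,\psi(v)\otimes h, \qquad\forall v\in H^1(\mathbb R_*;\mathcal Z),\ h\in\mathcal H. \tag{30}$$

Consider now the space $\mathcal V_s\subseteq\mathcal W$, $s\in\mathbb R_*\cup\{0^-,0^+\}$. Besides $E|_{\mathcal V_s}$, for every $i\in J$, we can define the unbounded operator

$$a_i(s) : \mathcal V_s\to\mathcal K, \qquad (a_i(s)\Phi)_n = \Phi_{n+1}|_{\{r_{n+1}=s,\,j_{n+1}=i\}}. \tag{31}$$

Made the identification

$$\mathcal Z\otimes\mathcal K = \bigoplus_{n=0}^\infty\mathcal Z\otimes L^2_{\rm symm}\big((\mathbb R\times J)^n;\mathcal H\big),$$

we also define the unbounded operator

$$a(s) : \mathcal V_s\to\mathcal Z\otimes\mathcal K, \qquad (a(s)\Phi)_n = \Phi_{n+1}|_{\{r_{n+1}=s\}}, \tag{32}$$

and we have $a(s)\Phi = \sum_i z_i\otimes a_i(s)\Phi$. Also each space $\mathcal V_s$ contains $\mathcal E\big(H^1(\mathbb R_*;\mathcal Z)\big)$ (and hence it is dense) and

$$\begin{aligned}a_i(s)\,\psi(v)\otimes h &= v_i(s)\,\psi(v)\otimes h, &&\forall v\in H^1(\mathbb R_*;\mathcal Z),\ h\in\mathcal H,\\ a(s)\,\psi(v)\otimes h &= v(s)\otimes\psi(v)\otimes h, &&\forall v\in H^1(\mathbb R_*;\mathcal Z),\ h\in\mathcal H.\end{aligned}\tag{33}$$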
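The first identity in (33) follows directly from the definition (31) applied to the components of an exponential vector. The short computation below is an illustrative sketch, written in the coordinates $(r_\ell, j_\ell)$ of the identification (3); it uses only (31) and the fact that the $n$-th component of $\psi(v)\otimes h$ is $v_{j_1}(r_1)\cdots v_{j_n}(r_n)\,h$.

```latex
\begin{align*}
\bigl(a_i(s)\,\psi(v)\otimes h\bigr)_n(r_1,j_1;\dots;r_n,j_n)
 &= \bigl(\psi(v)\otimes h\bigr)_{n+1}(r_1,j_1;\dots;r_n,j_n;s,i) \\
 &= v_{j_1}(r_1)\cdots v_{j_n}(r_n)\,v_i(s)\,h \\
 &= v_i(s)\,\bigl(\psi(v)\otimes h\bigr)_n(r_1,j_1;\dots;r_n,j_n).
\end{align*}
```

Since $v_i(s)$ is a complex number, two such operators applied in either order give the same result on exponential vectors.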
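The subspace we are most interested in is the space $\mathcal V_{0^\pm}$, where the operators $E$, $a(0^-)$ and $a(0^+)$ are simultaneously defined. The operator $E$ is non-symmetric also on $\mathcal V_{0^\pm}$. In fact, from (23) we can deduce the following integration by parts formula for every $\Upsilon, \Phi\in\mathcal V_{0^\pm}$:

$$\begin{aligned}\langle\Upsilon|E\Phi\rangle_{\mathcal K} &= \langle E\Upsilon|\Phi\rangle_{\mathcal K} + i\Big(\langle a(0^-)\Upsilon|a(0^-)\Phi\rangle_{\mathcal Z\otimes\mathcal K} - \langle a(0^+)\Upsilon|a(0^+)\Phi\rangle_{\mathcal Z\otimes\mathcal K}\Big) \\ &= \langle E\Upsilon|\Phi\rangle_{\mathcal K} + i\sum_j\Big(\langle a_j(0^-)\Upsilon|a_j(0^-)\Phi\rangle_{\mathcal K} - \langle a_j(0^+)\Upsilon|a_j(0^+)\Phi\rangle_{\mathcal K}\Big).\end{aligned}\tag{34}$$

If we consider the space $\mathcal V_0$, we can clarify the relationships between the operators $E|_{\mathcal V_{0^\pm}}$ and $E_0$ and the formula (34). Clearly $\mathcal E\big(H^1(\mathbb R;\mathcal Z)\big)\subseteq\mathcal V_0\subseteq D(E_0)$ and thus $E|_{\mathcal V_{0^\pm}}$ and $E_0$ coincide on the dense space $\mathcal V_0$, which is a domain of essential self-adjointness for $E_0$. Moreover the definition (32) implies that $\mathcal V_0 = \big\{\Phi\in\mathcal V_{0^\pm}\ \text{s.t.}\ a(0^-)\Phi = a(0^+)\Phi\big\}$. Therefore $E|_{\mathcal V_{0^\pm}}$ is a non-symmetric operator which becomes not only symmetric (see Eq. (34)), but even essentially self-adjoint when it is restricted to the domain $\mathcal V_0$ consisting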
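of vectors $\Phi\in\mathcal V_{0^\pm}$ which satisfy the boundary condition $a(0^-)\Phi = a(0^+)\Phi$. Notice that the last condition really is a boundary condition which has to be satisfied by each component $\Phi_n$ on the coordinate hyperplanes. For vectors in $\mathcal V_0$ we define the unbounded operators

$$\begin{aligned}a_i(0) &: \mathcal V_0\to\mathcal K, & a_i(0)\Phi &= a_i(0^-)\Phi = a_i(0^+)\Phi,\\ a(0) &: \mathcal V_0\to\mathcal Z\otimes\mathcal K, & a(0)\Phi &= a(0^-)\Phi = a(0^+)\Phi.\end{aligned}\tag{35}$$

Now consider the space $\mathcal V_{00}$. Clearly $\mathcal V_{00} = \big\{\Phi\in\mathcal V_{0^\pm}\ \text{s.t.}\ a(0^-)\Phi = a(0^+)\Phi = 0\big\}$, and it is dense because it contains $\mathcal E\big(H_0^1(\mathbb R_*;\mathcal Z)\big)$, where $H_0^1(\mathbb R_*;\mathcal Z) = \big\{v\in H^1(\mathbb R_*;\mathcal Z)\ \text{s.t.}\ v(0^-) = v(0^+) = 0\big\}$. We define the operator $E_{00} = E|_{\mathcal V_{00}}$.

Proposition 2. $E_{00}^* = E$.

Proof. Let us verify first that $D(E_{00}^*)\subseteq\mathcal W$ and that $E|_{D(E_{00}^*)} = E_{00}^*$.

Firstly, consider a vector $\Upsilon\in\mathcal K$ such that $\Upsilon_m\notin H^{\Sigma}_{\rm symm}\big((\mathbb R_*\times J)^m;\mathcal H\big)$ for some $m$. Then $\Upsilon\notin D(E_{00}^*)$. Indeed $\Upsilon_m\notin H^{\Sigma}(\mathbb R_*^m;\mathcal Z^{\otimes m}\otimes\mathcal H)$ and so there exists a sequence of test functions $\{v^k\}\subset C_c^\infty(\mathbb R_*^m;\mathcal Z^{\otimes m}\otimes\mathcal H)$ such that the sequence $\{v^k\}$ is $L^2$-bounded, but $\big(\sum_{\ell=1}^m\partial_\ell\Upsilon_m, v^k\big) = -\big\langle\Upsilon_m\big|\sum_{\ell=1}^m\partial_\ell v^k\big\rangle$ is unbounded (by $(\ ,\ )$ we denote the sesquilinear duality pairing between the space of test functions $C_c^\infty(\mathbb R_*^m;\mathcal Z^{\otimes m}\otimes\mathcal H)$ and the corresponding space of distributions). Since the component $\Upsilon_m$ is totally symmetric, we can suppose that also the functions $v^k$ are totally symmetric and hence we can define the vectors $\Phi^k = (\delta_{n,m}v^k)_{n=0}^\infty$. Then $\Phi^k\in\mathcal V_{00} = D(E_{00})$ for every $k$, the sequence $\{\Phi^k\}$ is bounded in $\mathcal K$, but $\langle\Upsilon|E_{00}\Phi^k\rangle$ is unbounded and so $\Upsilon\notin D(E_{00}^*)$.

Secondly, take $\Upsilon\in D(E_{00}^*)$. Then we have $(E_{00}^*\Upsilon)_n = i\sum_{\ell=1}^n\partial_\ell\Upsilon_n$ for every $n\ge0$ and $\sum_{n=1}^\infty\frac1{n!}\big\|\sum_{\ell=1}^n\partial_\ell\Upsilon_n\big\|^2<\infty$. Indeed, if $n\ge1$ and if $v$ belongs to $C_c^\infty(\mathbb R_*^n;\mathcal Z^{\otimes n}\otimes\mathcal H)\cap L^2_{\rm symm}\big((\mathbb R\times J)^n;\mathcal H\big)$, then the vector $\Phi = (\delta_{m,n}v)_{m=0}^\infty$ belongs to $D(E_{00})$ and thus $\langle\Upsilon|E_{00}\Phi\rangle$ is both equal to $\frac1{n!}\langle(E_{00}^*\Upsilon)_n|v\rangle$ and to $\frac1{n!}\big\langle\Upsilon_n\big|i\sum_{\ell=1}^n\partial_\ell v\big\rangle = \frac1{n!}\big\langle i\sum_{\ell=1}^n\partial_\ell\Upsilon_n\big|v\big\rangle$. So the second statement is proved for every $n\ge1$ thanks to the density of $C_c^\infty(\mathbb R_*^n;\mathcal Z^{\otimes n}\otimes\mathcal H)\cap L^2_{\rm symm}\big((\mathbb R\times J)^n;\mathcal H\big)$ in $L^2_{\rm symm}\big((\mathbb R\times J)^n;\mathcal H\big)$. If $n = 0$ and if $h$ is a vector in $\mathcal H$, then the vector $\Phi = (\delta_{m,0}h)_{m=0}^\infty$ belongs to $D(E_{00})$ and thus the scalar product $\langle\Upsilon|E_{00}\Phi\rangle$ is both equal to $\langle(E_{00}^*\Upsilon)_0|h\rangle$ and 0.

Therefore we have proved that $E_{00}^*\subseteq E$. Still remaining to prove is the inclusion $\mathcal W\subseteq D(E_{00}^*)$. Take $\Upsilon\in\mathcal W$. Using the integration by parts formula (23), we have

$$\langle\Upsilon|E_{00}\Phi\rangle = \sum_{n=1}^\infty\frac1{n!}\Big\langle\Upsilon_n\Big|i\sum_{\ell=1}^n\partial_\ell\Phi_n\Big\rangle = \sum_{n=1}^\infty\frac1{n!}\Big\langle i\sum_{\ell=1}^n\partial_\ell\Upsilon_n\Big|\Phi_n\Big\rangle = \langle E\Upsilon|\Phi\rangle$$

for every $\Phi\in D(E_{00})$. Therefore $\Upsilon\in D(E_{00}^*)$. □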
3.3. Other matrix elements of $U$. Thanks to the introduction of $\mathcal V_{0^\pm}$, $E$, $a(0^-)$, $a(0^+)$, we can extend the equality (20) outside the exponential domain.

Proposition 3. Let $f\in L^2(\mathbb R;\mathcal Z)$, $k\in\mathcal H$ and $\Phi\in\mathcal V_{0^\pm}$. Let $U$ be the stochastic evolution defined by (12), (10), (6) with system operators satisfying the conditions (7). For $t\ge0$, let us define $\mathrm f(t) = \langle\psi(f)\otimes k|U_t^*\,\Phi\rangle$. Then the function $\mathrm f$ is absolutely continuous and

$$\begin{aligned}\mathrm f(t) ={}& \langle\psi(f)\otimes k|\Phi\rangle_{\mathcal K} + \int_0^t\Big\langle\psi(f)\otimes k\Big|U_s^*\,i\Big(H + E + \frac i2R^*R - iR^*a(0^-)\Big)\Phi\Big\rangle_{\mathcal K}\,ds \\ &+ \int_0^t\Big\langle f(s)\otimes\psi(f)\otimes k\Big|U_s^*\Big(S^*a(0^-) - S^*R - a(0^+)\Big)\Phi\Big\rangle_{\mathcal Z\otimes\mathcal K}\,ds.\end{aligned}\tag{36}$$

Proof. It can be proved just following the proof of Proposition 2 in [10]. If $v\in H^1(\mathbb R_*;\mathcal Z)$ and $h\in\mathcal H$, then $\Phi = \psi(v)\otimes h$ belongs to $\mathcal V_{0^\pm}$ and (36) holds thanks to (20), (30), (33). Then, in spite of the unboundedness of $E$, $a(0^-)$ and $a(0^+)$, (36) can be extended by linearity and continuity to the case of an arbitrary $\Phi\in\mathcal V_{0^\pm}$. □

Let us remark that Eq. (36) exhibits all the operators on the right of the evolution group $U$ and that it can be employed to determine a dense subspace of $D(K)$ (see Theorem 3). Thus we can obtain a densely defined restriction of $K$. In order to prove the essential self-adjointness of this restriction, we need another equation, similar to (36), but with all the operators on the left of $U$.

Corollary 1. Let $\Upsilon\in\mathcal V_0$, $\chi\in\mathcal K[\mathbb R_-]$ and $v\in L^2(\mathbb R_+;\mathcal Z)$. Let $U$ be the stochastic evolution defined by (12), (10), (6) with system operators satisfying the conditions (7). For $t\ge0$, let us define $\mathrm g(t) = \langle\Upsilon|U_t\,\chi\otimes\psi(v)\rangle$. Then the function $\mathrm g$ is absolutely continuous and

$$\begin{aligned}\mathrm g(t) ={}& \langle\Upsilon|\chi\otimes\psi(v)\rangle_{\mathcal K} \\ &+ \int_0^t\Big\langle\Upsilon\Big|-i\Big[\Big(H - \frac i2R^*R\Big)U_s\,\chi\otimes\psi(v) - iR^*S\,U_s\,v(s)\otimes\chi\otimes\psi(v)\Big]\Big\rangle_{\mathcal K}\,ds \\ &+ \int_0^t\langle iE\,\Upsilon|U_s\,\chi\otimes\psi(v)\rangle_{\mathcal K}\,ds \\ &+ \int_0^t\Big\langle a(0)\Upsilon\Big|R\,U_s\,\chi\otimes\psi(v) + (S-1)\,U_s\,v(s)\otimes\chi\otimes\psi(v)\Big\rangle_{\mathcal Z\otimes\mathcal K}\,ds.\end{aligned}\tag{37}$$

Proof. Equation (37) immediately follows from Eq. (36) by complex conjugation for every $\chi = \psi(u)\otimes h$, $u\in L^2(\mathbb R_-;\mathcal Z)$, $h\in\mathcal H$. Then it is extended to the case of arbitrary $\chi\in\mathcal K[\mathbb R_-]$ by linearity and continuity. □
4. The Hamiltonian

Finally we can state the main result of the paper. Let us recall that the space $\mathcal V_{0^\pm}$ and the operators $E$, $a(0^-)$, $a(0^+)$ are defined by (26), (29), (32).

Theorem 3. Let $K$ be the Hamiltonian operator associated to the QSDE (6) with system operators satisfying the conditions (7). Then:

1. $D(K)\cap\mathcal V_{0^\pm} = \big\{\Phi\in\mathcal V_{0^\pm}\ \text{s.t.}\ a(0^-)\Phi = S\,a(0^+)\Phi + R\,\Phi\big\}$,
2. $K\Phi = \big(H + E - iR^*a(0^-) + \frac i2R^*R\big)\Phi$, $\forall\,\Phi\in D(K)\cap\mathcal V_{0^\pm}$,
3. $K|_{D(K)\cap\mathcal V_{0^\pm}}$ is essentially self-adjoint.

Proof of Points 1 and 2. The proof follows straight from Proposition 3 just as for the 1-multiplicity case [10]. □

With a non-basis free notation we can write

1. $D(K)\cap\mathcal V_{0^\pm} = \big\{\Phi\in\mathcal V_{0^\pm}\ \text{s.t.}\ a_i(0^-)\Phi = \sum_j S_{ij}\,a_j(0^+)\Phi + R_i\,\Phi\ \ \forall i\in J\big\}$,
2. $K\Phi = \big(H + E - i\sum_j R_j^*\,a_j(0^-) + \frac i2\sum_j R_j^*R_j\big)\Phi$, $\forall\,\Phi\in D(K)\cap\mathcal V_{0^\pm}$,

where $\sum_j S_{ij}\,a_j(0^+)\Phi$ and $\sum_j R_j^*\,a_j(0^-)\Phi$ converge in the norm topology of $\mathcal K$.

The operator $H + E - iR^*a(0^-) + \frac i2R^*R$ on the domain $D(K)\cap\mathcal V_{0^\pm}$ is the restriction of a Hamiltonian operator and hence it is symmetric; however its symmetry can be directly verified by the integration by parts formula (34) and the boundary condition

$$a(0^-)\Phi = S\,a(0^+)\Phi + R\,\Phi. \tag{38}$$
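The direct verification mentioned above can be written out in a few lines. The following is a sketch rather than part of the original argument: it assumes $\Upsilon, \Phi\in D(K)\cap\mathcal V_{0^\pm}$, uses that the scalar product is conjugate-linear in the first variable and that $H = H^*$, and abbreviates $a_\pm = a(0^\pm)$ (a shorthand introduced only here).

```latex
\begin{align*}
\langle \Upsilon | K\Phi\rangle - \langle K\Upsilon | \Phi\rangle
 &= \bigl(\langle \Upsilon | E\Phi\rangle - \langle E\Upsilon | \Phi\rangle\bigr)
    - i\bigl(\langle R\Upsilon | a_-\Phi\rangle
             + \langle a_-\Upsilon | R\Phi\rangle\bigr)
    + i\,\langle R\Upsilon | R\Phi\rangle \\
 &= i\bigl(\langle a_-\Upsilon | a_-\Phi\rangle
           - \langle a_+\Upsilon | a_+\Phi\rangle
           - \langle R\Upsilon | a_-\Phi\rangle
           - \langle a_-\Upsilon | R\Phi\rangle
           + \langle R\Upsilon | R\Phi\rangle\bigr)
    && \text{by (34)} \\
 &= i\bigl(\langle (a_- - R)\Upsilon | (a_- - R)\Phi\rangle
           - \langle a_+\Upsilon | a_+\Phi\rangle\bigr) \\
 &= i\bigl(\langle S a_+\Upsilon | S a_+\Phi\rangle
           - \langle a_+\Upsilon | a_+\Phi\rangle\bigr) = 0
    && \text{by (38) and } S^{*}S = 1 .
\end{align*}
```

The computation also shows why neither $E$ nor $-iR^*a(0^-) + \frac i2R^*R$ is symmetric on its own: only their sum, together with the unitarity of $S$, produces the cancellation.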
Note that (38) is a boundary condition which mixes the components of $\Phi$ and that, in general, neither exponential vectors nor finite particle vectors (vectors $\Phi$ with only finitely many components $\Phi_n$ different from 0) can provide a dense subspace of vectors satisfying (38). Moreover, in the case $S_{ij} = \delta_{ij}$, the boundary condition (38) plus the formal relations (true for the regular exponential vectors)

$$a_i(0^\pm)\,a_j(0^\pm)\,\Phi = a_j(0^\pm)\,a_i(0^\pm)\,\Phi \tag{39}$$

would imply $R_iR_j\Phi = R_jR_i\Phi$, which can not hold on a dense domain if the operators $R_i$ and $R_j$ do not commute. Thus $D(K)\cap\mathcal V_{0^\pm}$ should consist of vectors $\Phi$ regular enough ($\Phi\in\mathcal V_{0^\pm}$) to make (38) meaningful, but not so regular to make (39) true. Therefore it is not even obvious that $D(K)\cap\mathcal V_{0^\pm}$ is dense for any choice of the system operators and the proof of Point 3 of Theorem 3 requires more work.

First of all, we introduce a class of vectors which is very useful to study the Hamiltonian $K$. We construct these vectors in $\mathcal K[I]$, $I\subseteq\mathbb R$, by developing Chebotarev’s idea [7]: in the definition of an exponential vector $\psi(f)\otimes h$, the function $f$, which takes values in the multiplicity space, is replaced by a function $F$ taking values in a space of operators acting on $h$. We shall denote such a vector by $\Psi(F)h$ and we shall call it a pseudo-exponential vector.
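Let us denote $\mathcal B(\mathcal H;\mathcal Z\otimes\mathcal H)$ simply by $\mathcal B$ and, for every $L\in\mathcal B$, let us denote by $L_j$ the bounded operators on $\mathcal H$ defined by

$$Lh = \sum_{j\in J}z_j\otimes L_jh.$$

For every $F\in L^2(I;\mathcal B)$ and $h\in\mathcal H$, we define

$$(\Psi(F)h)_0 = h\in\mathcal H, \qquad (\Psi(F)h)_n(r_1,\dots,r_n) = F(r_n)\cdots F(r_1)\,h\in\mathcal Z^{\otimes n}\otimes\mathcal H, \quad r_n>\dots>r_1,\ r_\ell\in I\ \forall\ell.$$

Each $(\Psi(F)h)_n$ is a measurable function from $\{r_n>\dots>r_1\}\cap I^n$ to $\mathcal Z^{\otimes n}\otimes\mathcal H$ which can be uniquely extended to a totally symmetric function from $(I\times J)^n$ to $\mathcal H$:

$$(\Psi(F)h)_n(r_1,j_1;\dots;r_n,j_n) = F_{j_{\sigma(n)}}(r_{\sigma(n)})\cdots F_{j_{\sigma(1)}}(r_{\sigma(1)})\,h, \tag{40}$$

where $\sigma$ is the permutation of $\{1,\dots,n\}$ such that $r_{\sigma(n)}>\dots>r_{\sigma(1)}$. Since

$$\sum_{j_1,\dots,j_n}\int_{I^n}\big\|(\Psi(F)h)_n\big\|^2_{\mathcal H}\,dr_1\cdots dr_n \le \int_{I^n}\|F(r_1)\|^2_{\mathcal B}\cdots\|F(r_n)\|^2_{\mathcal B}\,\|h\|^2_{\mathcal H}\,dr_1\cdots dr_n = \|F\|^{2n}_{L^2(I;\mathcal B)}\,\|h\|^2_{\mathcal H},$$

each $(\Psi(F)h)_n$ belongs to $L^2_{\rm symm}\big((I\times J)^n;\mathcal H\big)$ and $\Psi(F)h = \big((\Psi(F)h)_n\big)_{n=0}^\infty$ is a well defined vector of $\mathcal K[I]$. Note that the definition of $\Psi(F)h$ does not depend on the chosen basis $\{z_j\}_{j\in J}$ and that, when $\mathcal Z = \mathbb C$, differently from [7] and [10], we do not ask for any commutative property of the image of $F$. If we identify $f\in L^2(I;\mathcal Z)$ with $F\in L^2(I;\mathcal B)$ defined by $F(r)h = f(r)\otimes h$, then the pseudo-exponential vector $\Psi(F)h$ (40) coincides with the usual exponential vector $\psi(f)\otimes h$. We are interested in the case

$$F = v + uL \tag{41}$$

with $v\in L^2(I;\mathcal Z)$, $u\in L^2(I)$, $L\in\mathcal B$, i.e. $F(r)h = v(r)\otimes h + u(r)\,Lh$ for every $h\in\mathcal H$. Let us observe that, for every fixed $v$, $L$ and $h$,

$$\Psi(v+uL)h\to\psi(v)\otimes h \ \text{ in }\mathcal K[I] \qquad\text{as}\qquad u\to0 \ \text{ in } L^2(I). \tag{42}$$

Indeed $\|F\|^2_{L^2(I;\mathcal B)}\le2\big(\|v\|^2_{L^2(I;\mathcal Z)} + \|u\|^2_{L^2(I)}\|L\|^2_{\mathcal B}\big)$ and the following estimation for the components of $\Psi(v+uL)h - \psi(v)\otimes h$ can be proved by induction:

$$\big\|(\Psi(v+uL)h)_n - v^{\otimes n}\otimes h\big\|^2_{L^2((I\times J)^n;\mathcal H)} \le \|u\|^2_{L^2(I)}\,\|L\|^2_{\mathcal B}\,\|h\|^2_{\mathcal H}\;n\,2^n\,\Big(\|v\|^2_{L^2(I;\mathcal Z)} + \|u\|^2_{L^2(I)}\|L\|^2_{\mathcal B}\Big)^{n-1},$$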
which implies

$$\big\|\Psi(v+uL)h - \psi(v)\otimes h\big\|^2_{\mathcal K[I]} \le 2\,\|u\|^2_{L^2(I)}\,\|L\|^2_{\mathcal B}\,\|h\|^2_{\mathcal H}\;e^{2\left(\|v\|^2_{L^2(I;\mathcal Z)} + \|u\|^2_{L^2(I)}\|L\|^2_{\mathcal B}\right)},$$

where the right-hand side goes to 0 as $\|u\|^2_{L^2(I)}\to0$. Take again $I = \mathbb R$ and come back to the Hamiltonian. In order to prove the essential self-adjointness of $K|_{D(K)\cap\mathcal V_{0^\pm}}$, we do not need to investigate the relationship between the regularity of $F$ and the regularity of $\Psi(F)h$, but it suffices to study a particular case of regular pseudo-exponential vectors.

Proposition 4. Let $v\in H^1(\mathbb R_*;\mathcal Z)$, $u\in H^1(\mathbb R_*)$, $L\in\mathcal B$, $h\in\mathcal H$. Then, for every $s\in\{0^-,0^+\}\cup\mathbb R_*$, the pseudo-exponential vector $\Psi(v+uL)h$ belongs to $\mathcal V_s$. Moreover, if $u|_{\mathbb R_+}\equiv0$, then

$$\begin{aligned}a_i(s)\,\Psi(v+uL)h &= \big(v_i(s) + u(s)\,L_i\big)\,\Psi(v+uL)h,\\ a(s)\,\Psi(v+uL)h &= v(s)\otimes\Psi(v+uL)h + u(s)\,L\,\Psi(v+uL)h,\end{aligned}\tag{43}$$

for every $s\ge0^-$ (i.e. $s\in\{0^-,0^+\}\cup\mathbb R_+$).
Proof. Let us denote 7(v + uL)h simply by /. Its components /n belong to H , (R∗ × J )n ; H because v ∈ H 1 (R∗ ; Z) and u ∈ H 1 (R∗ ). Indeed the definition of /n changes through the hyperplanes {r& = r& }, but it does not affect n&=1 ∂& /n because we are taking the derivative of /n in the direction of e1 + · · · + en which is parallel to each {r& = r& }. Furthermore ∞ n 2 1 2 2 2 2 ∂& / n 2 ≤ v + u L 2 2 B hH L (R;Z) L (R) L ((R×J )n ;H) n! n=1
&=1
·
∞ n=1
(n−1) n2n < ∞, v2L2 (R;Z) + u2L2 (R) L2B (n − 1)!
and, for every s ∈ {0− , 0+ } ∪ R∗ , ∞ 1 2 2 2 /n+1 |{r =s} 2 2 ≤ v(s) + |u(s)| L h2H n n+1 Z B Z⊗L ((R×J ) ;H) n! n=0
·
∞ n+1 n 2 v2L2 (R;Z) + u2L2 (R) L2B < ∞, n! n=0
so that 7(v + uL)h belongs to Vs . If u|R+ ≡ 0 and s ∈ {0− , 0+ } ∪ R+ , then Eq. (43) is a simple consequence of the definitions (40), (31) and (32) because, for every r > s and j ∈ J , the operator (v + uL)j (r) is just a multiplication by the complex number vj (r) and thus it commutes
with (v + uL)i (s) for every i ∈ J . Thus, setting (r1 , . . . , rn , s) = (t1 , . . . , tn+1 ), we have, for every i ∈ J and s ∈ {0− , 0+ } ∪ R+ , (ai (s) /)n (r1 , j1 ; . . . ; rn , jn ) = /n+1 (r1 , j1 ; . . . ; rn , jn ; s, i) = (v + uL)jσ (n+1) (tσ (n+1) ) · · · (v + uL)jσ (1) (tσ (1) ) h = (vi (s) + u(s) Li ) /n (r1 , j1 ; . . . ; rn , jn ).
" !
Let us remark that (43) generalizes (33), but it holds only in the special case u|R+ ≡ 0, s ≥ 0− . In the general case we do not have a simple formula for ai (s) 7(v + uL)h or a(s) 7(v + uL)h because the operators Li do not need to commute. Moreover, we want to underline here that these regular pseudo-exponential vectors provide an example of vectors which do not satisfy (39): even if ai (0± ) aj (0± ) 7(v + uL)h and aj (0± ) ai (0± ) 7(v + uL)h are well defined, they are not equal when Li Lj & = Lj Li , essentially because the components of 7(v + uL)h are “discontinuous” along the hyperplanes {r& = r& }. Let us also note that, when u|R+ ≡ 0, we deal with pseudo-exponential vectors which are factorized vectors in K = K[R− ] ⊗ [R+ ]: 7(v + uL)h = 7(v|R− + u|R− L)h ⊗ ψ(v|R+ ). Now we can prove that D(K) ∩ V0± is dense by giving an explicit construction of vectors belonging to this domain. Proposition 5. Let K be the Hamiltonian associated to the QSDE (6) with system operators satisfying the conditions (7). Then D(K) ∩ V0± is dense in K. Proof. Similarly to the 1-multiplicity case [10], we consider the subspace of K, D1 = span 7 v + u Sv(0+ ) + R − v(0− ) h −
v ∈ H (R∗ ; Z), u ∈ H (R∗ ), u(0 ) = 1, u|R+ ≡ 0, h ∈ H , 1
1
(44)
where Sv(0+ ) + R − v(0− ) is the operator in B defined by Sv(0+ ) + R − v(0− ) h = S v(0+ ) ⊗ h + R h − v(0− ) ⊗ h. Then D1 is dense thanks to (42) and to the density of H 1 (R∗ ; Z) in L2 (R; Z), while D1 is contained in D(K) ∩ V0± thanks to Proposition 4. ! " So we have to prove that the densely defined symmetric operator K|D(K)∩V0± is essentially self-adjoint. We shall do this by exploiting the facts that K|D(K)∩V0± is the restriction of an Hamiltonian K and that the related group U is linked to the solution V of the Hudson–Parthasarathy equation. In order to prove the essential self-adjointness of K|D(K)∩V0± , we want to commute K|D(K)∩V0± with Ut and so we need to study the regularity of a vector Ut / when / ∈ D(K) ∩ V0± . We do not know explicitly the action of V and U , but Corollary 1 is sufficient to study this regularity for vectors / ∈ D1 and t ≥ 0. Actually we are able to consider a subspace of D(K) bigger than D1 . Let D2 denote the dense linear space,
D2 = span χ ⊗ ψ(v) χ ∈ K[R− ], v ∈ L2 (R+ ), χ ⊗ ψ(v) ∈ D(K) . (45)
Clearly D1 ⊆ D2 ⊆ D(K). Let us observe that, roughly speaking, a vector χ ⊗ ψ(v) in D1 or D2 enjoys two fundamental properties: it belongs to the natural domain of a quantum stochastic integral, because its factor in [R+ ] is an exponential vector, and simultaneously it belongs to D(K), because its factor in K[R− ] can ensure the boundary condition (38). This suggests that the vectors in D2 should behave well under U and should be especially adapted to study the Hamiltonian K. Indeed the following proposition holds. Proposition 6. Let U be the stochastic evolution defined by (12), (10), (6) with system operators satisfying the conditions (7). Then Ut D2 ⊆ Vs for every s = 0− , 0+ or s > 0 and for every t ≥ 0. Furthermore, for every / ∈ D2 , a(0+ ) Ut / = Ut a(t) /, a(s) Ut / = Ut a(s + t) /,
∀t > 0, ∀s > 0, t ≥ 0.
Proof. It is enough to prove the thesis for every vector / = χ ⊗ ψ(v) generating D2 . Firstly, we prove that Ut / ∈ W for every t ≥ 0. Indeed, for every ϒ ∈ K, the function g(t) = ϒ|Ut / belongs to C 1 [0, +∞) and g (t) = ϒ| − iK Ut / because / ∈ D(K). On the other hand, if we suppose ϒ ∈ V00 , then by Corollary 1, i g (t) = ϒ| − i H − R∗ R − i R∗i Sij vj (t) Ut / + iE00 ϒ|Ut / . 2 i,j
Therefore v has to be continuous; moreover, for fixed t ≥ 0, the scalar product
i E00 ϒ|Ut / = ϒ| K − H − R∗ R − i R∗i Sij vj (t) Ut / 2 i,j
∗) = W is continuous with respect to ϒ ∈ V00 = D(E00 ) and hence Ut / belongs to D(E00 (Proposition 2). ∗ = E, we have Moreover, since E00
i K Ut / = H + E − R∗ R − i R∗i Sij vj (t) Ut /. 2 i,j
Let us remark that the relation χ ⊗ ψ(v) ∈ W implies v ∈ H 1 (R+ ; Z). Secondly, we prove that Ut / ∈ Vs for s = 0+ or s > 0 and for t ≥ 0. It is a consequence of the adaptedness and boundedness of V . Indeed Vt / is a factorized vector of the form η ⊗ ψ(v|(t,+∞) ) with η ∈ K[(−∞, t)]; consequently, if r1 < . . . < r&¯ < 0 < r&+1 < . . . < rn , then ¯ Ut / n+1 (r1 , j1 ; . . . ; rn , jn ; s, i) = t Vt / n+1 (r1 , j1 ; . . . ; rn , jn ; s, i) n vj& (r& + t) vi (s + t) = η&¯(r1 + t, j1 ; . . . ; r&¯ + t, j&¯) ¯ &=&+1
= vi (s + t) Ut / n (r1 , j1 ; . . . ; rn , jn ),
where vi (0+ + t) = vi (0+ ) if t = 0, while vi (0+ + t) = vi (t) if t > 0. Therefore the ∞ 1 2 sum n=0 n! Ut / n+1 |{rn+1 =s} Z⊗K converges to v(s + t)2Z /2K for s = 0+ or s > 0 and for t ≥ 0. We have also found that ai (0+ ) Ut / = vi (t) Ut /
= Ut ai (t) /,
∀t ≥ 0,
+
= Ut a(t) /, ∀t ≥ 0, a(0 ) Ut / = v(t) ⊗ Ut / a(s) Ut / = v(s + t) ⊗ Ut / = Ut a(s + t) /, ∀s > 0, t ≥ 0. Moreover K Ut / = H + E − iR∗ S a(0+ ) − 2i R∗ R Ut /. 2 1 Thirdly, we prove for t ≥ 0 that ∞ n=0 n! Ut / n+1 |{rn+1=0− } < ∞, i.e. that Ut / ∈ V0− . Indeed, for every n ≥ 0, we can show that Ut / n+1 |{rn+1 =0− } = S a(0+ ) + R Ut / n . Take now ϒ ∈ V0 and consider again the function g, so that i g (t) = ϒ| − i H + E − iR∗ S a(0+ ) − R∗ R Ut / K , 2 while, by Corollary 1, i g (t) = ϒ| − i H − iR∗ S a(0+ ) − R∗ R Ut / K + iEϒ|Ut / K 2 + a(0)ϒ| S − 1 a(0+ ) + R Ut / Z⊗K . Therefore
−iϒ|E Ut / K = −iE ϒ|Ut / K + a(0)ϒ| S − 1 a(0+ ) + R Ut / Z⊗K ;
applying Eq. (23) we find that, for every ϒ in V0 , ∞ 1 a(0)ϒ n | Sa(0+ ) + R Ut / n − Ut / n+1 |{rn+1 =0− } Z⊗L2 ((R×J )n ;H) = 0, n! n=0
and hence Ut / n+1 |{rn+1 =0− } = Sa(0+ ) + R Ut / n .
" !
Roughly speaking, a factorized vector χ ⊗ ψ(v) is allowed to belong to D(K) only if its components are sufficiently regular (χ ⊗ ψ(v) ∈ W), which in particular forces v to belong to H 1 (R+ ; Z); therefore a(0+ ) χ ⊗ ψ(v) is well defined and, once a(0+ ) is well defined, a(0− ) has to be well defined, too. Thus χ ⊗ ψ(v) ∈ V0± . Then the structure of the stochastic evolution U , which is the composition of a cocycle and a shift operator, preserves of the vector (for every t ≥ 0 we have Ut χ ⊗ ψ(v) = the factorization χ˜ ⊗ ψ (θt v)|R+ ) and thus it preserves all of its regularity properties. Finally we can conclude the proof of Theorem 3. Proof of Point 3 of Theorem 3. We shall prove that Ker(K|∗D(K)∩V ± ±i) = {0}. Take ϒ ∈ 0 D(K|∗D(K)∩V ± ) such that K|∗D(K)∩V ± ϒ = −iϒ and consider the bounded function 0 0 g(t) = ϒ|Ut / , for t ≥ 0 and / ∈ D2 . Since Ut / ∈ D(K) ∩ V0± for every t ≥ 0, g (t) = ϒ| − iK Ut / = −iK|∗D(K)∩V ± ϒ|Ut / = g(t) . 0
Therefore g(t) = g(0)et and hence g(0) = ϒ|/ = 0. Since D2 is dense, ϒ = 0.
Similarly, take ϒ ∈ D(K|∗D(K)∩V ± ) such that K|∗D(K)∩V ± ϒ = iϒ and consider 0 0 again the function g for t ≥ 0 and / ∈ D2 . Now g (t) = −g(t) and so g(t) = g(0)e−t . Since D2 is dense and Ut is unitary, for every t ≥ 0 we have ϒ = sup |ϒ|/ | = sup |ϒ|Ut / | , /∈D2 /=1
/∈D2 /=1
where |ϒ|Ut / | = e−t |ϒ|/ | for every / ∈ D2 . Therefore ϒ = e−t ϒ and thus ϒ = 0. ! " Acknowledgement. The author would like to thank Professor Barchielli, who posed him the problem, for their several discussions about it, and, in particular, for his fundamental help to find the proper definition of pseudo-exponential vector in the arbitrary multiplicity case.
References
1. Accardi, L.: On the quantum Feynman–Kac formula. Rend. Sem. Mat. Fis. Milano 48, 135–180 (1978)
2. Accardi, L., Frigerio, A., Lewis, J.T.: Quantum stochastic processes. Publ. RIMS 18, 97–133 (1982)
3. Chebotarev, A.M.: The quantum stochastic equation is unitarily equivalent to a symmetric boundary value problem for the Schrödinger equation. Math. Notes 61, 510–518 (1997)
4. Chebotarev, A.M.: Quantum stochastic equation is unitarily equivalent to a symmetric boundary value problem for the Schrödinger equation. In: Stochastic analysis and mathematical physics (Viña del Mar, 1996), New York: World Sci. Publishing, 1998, pp. 42–54
5. Chebotarev, A.M., Victorov, D.V.: Quantum stochastic processes arising from the strong resolvent limits of the Schrödinger evolution in Fock space. In: Quantum probability (Gdańsk, 1997). Banach Center Publ. 43, Warsaw: Polish Acad. Sci. 1998, pp. 119–133
6. Chebotarev, A.M.: Quantum stochastic differential equation is unitary equivalent to a symmetric boundary value problem in Fock space. Inf. Dimens. Anal. Quantum Probab. Relat. Top. 1, 175–199 (1998)
7. Chebotarev, A.M.: Lectures on Quantum Probability. Sociedad Matemática Mexicana, Aportaciones Matemáticas, Nivel Avanzado 14, México, 2000
8. Frigerio, A.: Covariant Markov dilations of quantum dynamical semigroups. Publ. RIMS Kyoto Univ. 21, 657–675 (1985)
9. Frigerio, A.: Construction of stationary quantum Markov processes through quantum stochastic calculus. In: Quantum Probability and Applications II. Lect. Not. Math. 1136, Berlin: Springer-Verlag, 1985, pp. 207–222
10. Gregoratti, M.: On the Hamiltonian operator associated to some quantum stochastic differential equations. Inf. Dimens. Anal. Quantum Probab. Relat. Top. 3 No. 4, 483–503 (2000)
11. Gregoratti, M.: The Hamiltonian operator associated to some quantum stochastic differential equations. Tesi di Dottorato. Università degli Studi di Milano, 2000
12. Hudson, R.L., Parthasarathy, K.R.: Quantum Itô’s formula and stochastic evolutions. Commun. Math. Phys. 93, 301–323 (1984)
13. Maassen, H.: The construction of continuous dilations by solving quantum stochastic differential equations. Semesterbericht Funktionalanalysis Tübingen Sommersemester 1984, 183–204 (1984)
14. Maassen, H.: Quantum Markov processes on Fock space described by integral kernels. In: Quantum Probability and Applications II. Lect. Not. Math. 1136, Berlin: Springer-Verlag, 1985, pp. 361–374
15. Meyer, P.-A.: Quantum Probability for Probabilists, second edition. Lect. Not. Math. 1538, Berlin: Springer-Verlag, 1995
16. Mohary, A., Sinha, K.B.: Quantum stochastic flows with infinite degrees of freedom and countable state Markov processes. Sankhyā 52 A, pt. 1, 43–57 (1990)
17. Parthasarathy, K.R.: An Introduction to Quantum Stochastic Calculus. Basel–Boston–Berlin: Birkhäuser, 1992

Communicated by H. Araki
Commun. Math. Phys. 222, 201 – 227 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On Quantum Ergodicity for Linear Maps of the Torus
Pär Kurlberg, Zeév Rudnick
Raymond and Beverly Sackler School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel. E-mail: [email protected]
Received: 15 October 1999 / Accepted: 4 June 2001
Abstract: We prove a strong version of quantum ergodicity for linear hyperbolic maps of the torus (“cat maps”). We show that there is a density one sequence of integers so that as N tends to infinity along this sequence, all eigenfunctions of the quantum propagator at inverse Planck constant N are uniformly distributed. A key step in the argument is to show that for a hyperbolic matrix in the modular group, there is a density one sequence of integers N for which its order (or period) modulo N is somewhat larger than $\sqrt{N}$.

1. Introduction

1.1. Quantum ergodicity. An important model for understanding the quantization of classically chaotic systems is given by quantum maps, and in particular by the quantizations of linear automorphisms of the torus $T^2$ (“cat maps”). Recall that a linear automorphism of $T^2$ is given by a matrix A in the modular group SL(2, Z). Iterating such a map, we get a discrete dynamical system, well-known to be chaotic if the map is hyperbolic, that is if it has two real eigenvalues $\epsilon, \epsilon^{-1}$ with $|\epsilon| > 1 > |\epsilon^{-1}|$ (equivalently |tr(A)| > 2). A quantization of these “cat maps” was proposed by Hannay and Berry [9], see also [13, 4, 5]. In brief, this procedure restricts Planck’s constant to be an inverse integer: h = 1/N, and the Hilbert space of states $H_N$ is N-dimensional, in keeping with the intuition that each state occupies a Planck cell of volume h = 1/N and the constraint that the total phase-space $T^2$ has volume one. Classical observables (i.e. functions $f \in C^\infty(T^2)$) give rise to operators $Op_N(f)$ on $H_N$. Given a linear automorphism A of the torus, its quantization is a unitary operator $U_N(A)$ on $H_N$, called the quantum propagator, or “quantized cat map”. The eigenfunctions of $U_N(A)$ play the rôle of energy eigenstates.

Supported in part by a grant from the Israel Science Foundation.
Current address: Department of Mathematics, University of Georgia, Athens, GA 30602, USA.
E-mail: [email protected]
In this paper we will use the quantized cat map to illuminate one of the few rigorous results available on the semi-classical limit of eigenstates of classically chaotic systems, namely Quantum Ergodicity [18, 3, 21]. To formulate this notion, recall that if the classical dynamics are ergodic, then almost all trajectories of a particle cover the energy shell uniformly. The intuition afforded by the “Correspondence Principle” leads one to look for an analogous statement about the semi-classical limit of expectation values of observables in an energy eigenstate. As formulated by Schnirelman [18], the corresponding assertion is that when the classical dynamics is ergodic, for almost all eigenstates the expectation values of observables converge to the phase-space average. For quantum maps, the form that this takes is the following ([2, 22, 23]): Fix an observable $f \in C^\infty(T^2)$. Then for any orthonormal basis $\psi_j$ of $H_N$ consisting of eigenfunctions of $U_N(A)$, there is a subset $J(N) \subset \{1, 2, \ldots, N\}$, with $\#J(N)/N \to 1$, so that for $j \in J(N)$ we have:
\[ \langle Op_N(f)\psi_j, \psi_j \rangle \to \int_{T^2} f, \quad \text{as } N \to \infty. \tag{1.1} \]
This is a consequence, using positivity and a standard diagonalization argument, of the following estimate for the variance due to Zelditch [22]: Given $f \in C^\infty(T^2)$, for any orthonormal basis $\psi_j$, j = 1, . . . , N of $H_N$ consisting of eigenfunctions of $U_N(A)$, we have
\[ \frac{1}{N} \sum_{j=1}^{N} \Big| \langle Op_N(f)\psi_j, \psi_j \rangle - \int_{T^2} f \Big|^2 \to 0. \tag{1.2} \]
Note that the result (1.2) does not guarantee that all eigenfunctions in $H_N$ are equidistributed, even for one single value of N.

1.2. Beyond quantum ergodicity. In recent work [14], we have found that there is a commutative group of unitary operators on the state-space which commute with the quantized map and therefore act on its eigenspaces. We called these “Hecke operators”, in analogy with the setting of the modular surface. We showed that the joint eigenfunctions of these and of $U_N(A)$ (which we called “Hecke eigenfunctions”) are all equidistributed, that is (1.1) holds for any choice of Hecke eigenfunctions in $H_N$. Not all eigenfunctions of $U_N(A)$ are Hecke eigenfunctions. In fact, the Hecke eigenspaces have small dimension (at most O(log log N)), while the eigenspaces of $U_N(A)$ may have large dimension. Indeed, the mean degeneracy is N/ord(A, N), where ord(A, N) is the order (or period) of A modulo N, that is the least integer k ≥ 1 for which $A^k = I \bmod N$. It can be shown (see Sect. 3.2) that the mean degeneracy can be as large as N/log N for arbitrarily large N. However, it is reasonable to expect that all eigenfunctions become equidistributed, that is, we have quantum unique ergodicity. In this paper, we show ergodicity of all eigenfunctions of $U_N(A)$ for almost all integers N:

Theorem 1. Let A be a fixed cat map. There is a set of integers $N^*$ of density one so that all eigenfunctions of $U_N(A)$ are equi-distributed, as $N \to \infty$, $N \in N^*$.

Previously, the only result giving an infinite set of N for which all eigenfunctions of $U_N(A)$ become equi-distributed is by Degli-Esposti, Graffi and Isola [5], who, conditional on GRH, give an infinite set of primes.
1.3. Outline of the argument. Our main tool in relating this result to more traditional themes of Number Theory is the following estimate for the fourth power moment of the expectation values, giving a condition in terms of the order of A modulo N:

Theorem 2. There is a sequence of integers of density one so that for all observables $f \in C^\infty(T^2)$ and any orthonormal basis $\{\psi_j\}_{j=1}^N$ of $H_N$ consisting of eigenfunctions of $U_N(A)$ we have:
\[ \sum_{j=1}^{N} \Big| \langle Op_N(f)\psi_j, \psi_j \rangle - \int_{T^2} f \Big|^4 \ll \frac{N (\log N)^{14}}{\mathrm{ord}(A, N)^2}. \]
Thus for any subsequence of integers N such that
\[ \frac{\mathrm{ord}(A, N)}{N^{1/2} (\log N)^7} \to \infty \tag{1.3} \]
(and satisfying an additional “genericity” assumption explained in Sect. 4) we find that for all eigenfunctions ψ of $U_N(A)$, $\langle Op_N(f)\psi, \psi \rangle \to \int_{T^2} f$ as $N \to \infty$.

Theorem 2 reduces the problem of quantum ergodicity to that of finding sequences of integers satisfying (1.3), a problem closely related to the classical Gauss–Artin problem of showing that any integer, other than ±1 or a perfect square, is a primitive root modulo infinitely many primes; see [17] for a nice survey article. We show (Theorem 17) that there is some δ > 0 for which there is a set of integers of density 1 so that $\mathrm{ord}(A, N) \gg N^{1/2} \exp((\log N)^\delta)$. This, combined with Theorem 2, gives Theorem 1.

To prove Theorem 17, we first show in Sect. 5 that on a set of density one, ord(A, N) is not much smaller than the product of the orders of A modulo prime divisors of N. Next, we deal with prime values of N. In Sect. 6 we show (Theorem 14) that given 1/2 < η < 3/5, there is a set of primes of positive density c(η) > 0 so that $\mathrm{ord}(A, p) \gg p^\eta$. We note that this is far short of the truth; by invoking the Generalized Riemann Hypothesis, one can show that for a set of primes of density one, we have $\mathrm{ord}(A, p) \gg p/\log p$ (cf. [6]). In Sect. 7 we prove Theorem 17 by using Theorem 14 together with the elementary observation that for almost all primes p, $\mathrm{ord}(A, p) \ge p^{1/2}/\log p$.

As is apparent from this discussion, our result hinges on the condition (1.3) being satisfied; we can say nothing for N for which this condition fails, of which there are infinitely many examples. We consider it a fundamental problem to get results when ord(A, N) is smaller than $N^{1/2}$.
1.4. Notation. We will use the standard convention of analytic number theory: Thus e(z) stands for $e^{2\pi i z}$, and $f(x) \ll g(x)$ as $x \to \infty$ means that there is some C > 0 so that for x sufficiently large, f(x) < C g(x). Similarly, $f(x) \lesssim g(x)$ as $x \to \infty$ means $\limsup f(x)/g(x) \le 1$. We will write $p^t \| n$ if $p^t$ divides n but $p^{t+1}$ does not. We will denote by ω(N) the number of prime divisors of N.
2. Quantum Mechanics on the Torus

2.1. The Hilbert space of states. We review the basics of quantum mechanics on the torus $T^2$, viewed as a phase space [9, 13, 4, 5], beginning with a description of the Hilbert space of states of such a system: We take state vectors to be distributions on the line which are periodic in both momentum and position representations: $\psi(q + 1) = \psi(q)$, $[F_h\psi](p + 1) = [F_h\psi](p)$, where $[F_h\psi](p) = h^{-1/2} \int \psi(q)\, e(-pq/h)\, dq$. The space of such distributions is finite dimensional, of dimension precisely N = 1/h, and consists of periodic point-masses at the coordinates q = Q/N, Q ∈ Z. We may then identify $H_N$ with the N-dimensional vector space $L^2(Z/NZ)$, with the inner product $\langle \cdot, \cdot \rangle$ defined by
\[ \langle \phi, \psi \rangle = \frac{1}{N} \sum_{Q \bmod N} \phi(Q)\, \overline{\psi(Q)}. \tag{2.1} \]
2.2. Observables. Next we construct quantum observables: A central role is played by the translation operators
\[ [t_1\psi](Q) = \psi(Q + 1) \]
and, letting $e_N(Q) = e^{2\pi i Q/N}$,
\[ [t_2\psi](Q) = e_N(Q)\, \psi(Q), \]
which may be viewed as the analogues of (respectively) differentiation and multiplication operators. In fact, in terms of the usual position and momentum operators on the line, $\hat{q}\psi(q) = q\psi(q)$ and $\hat{p}\psi(q) = \frac{h}{2\pi i} \frac{d}{dq}\psi(q)$, they are given by $t_1 = e(\hat{p})$, $t_2 = e(\hat{q})$. In this context, Heisenberg’s commutation relations read
\[ t_1^a t_2^b = t_2^b t_1^a\, e_N(ab) \quad \forall a, b \in Z. \tag{2.2} \]
More generally, mixed translation operators are defined for $n = (n_1, n_2) \in Z^2$ by
\[ T_N(n) = e_N\Big(\frac{n_1 n_2}{2}\Big)\, t_2^{n_2} t_1^{n_1}. \]
These are unitary operators on $H_N$, whose action on a wave-function $\psi \in L^2(Z/NZ)$ is given by:
\[ T_N(n)\psi(Q) = e^{i\pi n_1 n_2 / N}\, e\Big(\frac{n_2 Q}{N}\Big)\, \psi(Q + n_1). \tag{2.3} \]
This implies that the absolute value of the trace of $T_N(n)$ is given by
\[ |\mathrm{tr}\, T_N(n)| = \begin{cases} N & n \equiv (0, 0) \bmod N, \\ 0 & \text{otherwise} \end{cases} \tag{2.4} \]
(see [14, Lemma 4]). The adjoint/inverse of $T_N(n)$ is given by
\[ T_N(n)^* = T_N(-n). \tag{2.5} \]
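The relations (2.3)–(2.5) can be checked numerically for small N. The following is a minimal sketch, assuming the basis of delta functions on Z/NZ and writing $T_N(n)$ as an N × N matrix directly from (2.3); it also checks the composition rule $T_N(m)T_N(n) = e_N(\omega(m,n)/2)\,T_N(m+n)$, with $\omega(m,n) = m_1 n_2 - m_2 n_1$, that follows from (2.2). The function names are illustrative only and do not come from the paper.

```python
import cmath


def e_N(x, N):
    """e_N(x) = exp(2*pi*i*x/N); x may be a half-integer."""
    return cmath.exp(2j * cmath.pi * x / N)


def translation_matrix(n, N):
    """Matrix of T_N(n) acting on functions on Z/NZ, read off from (2.3):
    [T_N(n) psi](Q) = e^{i pi n1 n2 / N} e(n2 Q / N) psi(Q + n1)."""
    n1, n2 = n
    M = [[0j] * N for _ in range(N)]
    for Q in range(N):
        M[Q][(Q + n1) % N] = e_N(n1 * n2 / 2 + n2 * Q, N)
    return M


def mat_mul(X, Y):
    size = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def adjoint(X):
    size = len(X)
    return [[X[c][r].conjugate() for c in range(size)] for r in range(size)]


def scale(z, X):
    return [[z * entry for entry in row] for row in X]


def trace(X):
    return sum(X[r][r] for r in range(len(X)))


def close(X, Y, tol=1e-9):
    return all(abs(a - b) < tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))


def omega(m, n):
    """Symplectic form omega(m, n) = m1*n2 - m2*n1."""
    return m[0] * n[1] - m[1] * n[0]


if __name__ == "__main__":
    N = 6
    m, n = (2, 3), (1, -4)
    lhs = mat_mul(translation_matrix(m, N), translation_matrix(n, N))
    rhs = scale(e_N(omega(m, n) / 2, N),
                translation_matrix((m[0] + n[0], m[1] + n[1]), N))
    print("composition rule holds:", close(lhs, rhs))
    print("adjoint rule (2.5) holds:",
          close(adjoint(translation_matrix(m, N)),
                translation_matrix((-m[0], -m[1]), N)))
    print("|tr T_N(n)| for n = (N, 2N):",
          abs(trace(translation_matrix((N, 2 * N), N))))
```

Plain Python lists are used so that the sketch has no dependencies; for larger N a numerical linear algebra library would be the natural choice.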
As follows from the commutation relation (2.2), we have
\[ T_N(m)\, T_N(n) = e_N\Big(\frac{\omega(m, n)}{2}\Big)\, T_N(m + n), \tag{2.6} \]
where ω(m, n) is the symplectic form $\omega(m, n) = m_1 n_2 - m_2 n_1$. For any smooth function $f \in C^\infty(T^2)$, define a quantum observable $Op_N(f)$, called the Weyl quantization of f [7],
\[ Op_N(f) = \sum_{n \in Z^2} \hat{f}(n)\, T_N(n), \]
where $\hat{f}(n)$ are the Fourier coefficients of f. Given a state $\psi \in H_N$, the expectation value of the observable f in the state ψ is defined to be $\langle Op_N(f)\psi, \psi \rangle$.

2.3. Cat maps. To introduce dynamics, we consider a linear automorphism of the torus A ∈ SL(2, Z). The iteration of A gives a (discrete) dynamical system, well-known to be chaotic if A is hyperbolic, that is |tr A| > 2 (such a map is called a “cat map” in the physics literature). If we further assume A is “quantizable” (that is $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with $ab \equiv cd \equiv 0 \bmod 2$; for more details see [14, Sect. 3] or [9, p. 273]) then one can assign to A a unitary operator $U_N(A)$ on $H_N$, the quantum propagator, whose iterates give the evolution of the quantum system, and which is characterized by the property (an analogue of “Egorov’s theorem”):
\[ U_N(A)^*\, Op_N(f)\, U_N(A) = Op_N(f \circ A). \tag{2.7} \]
This can be thought of as saying that the evolution of the quantum observable $Op_N(f)$ follows the evolution $f \mapsto f \circ A$ of the classical observable f. That (2.7) holds exactly is a special feature of the linearity of the map A; for general maps, (2.7) is only expected to hold asymptotically as $N \to \infty$ (cf. [15]). The stationary states of the quantum system are given by the eigenfunctions ψ of $U_N(A)$. It is our goal to study the limiting expectation values $\langle Op_N(f)\psi, \psi \rangle$ of observables in (normalized) eigenstates and show that outside a zero density set of N’s, they all converge to the classical average $\int_{T^2} f$ of the observable as $N \to \infty$.

3. The Order of a Matrix Modulo N

3.1. Let A ∈ SL(2, Z) be a hyperbolic matrix, that is |tr(A)| > 2. The order (or period) ord(A, N) of the map A modulo N is the least integer k ≥ 1 so that $A^k = I \bmod N$. We begin to study the order of A modulo an arbitrary integer N, starting with some well-known generalities.
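The order ord(A, N) can be computed directly from its definition by iterating powers of A modulo N. The sketch below assumes a hypothetical example matrix A = [[2, 1], [3, 2]], chosen here because it lies in SL(2, Z), has trace 4, and satisfies the parity condition ab ≡ cd ≡ 0 mod 2; it is not taken from the paper.

```python
def mat_mul_mod(X, Y, N):
    """Product of two 2x2 integer matrices, reduced modulo N."""
    return [[(X[r][0] * Y[0][c] + X[r][1] * Y[1][c]) % N for c in range(2)]
            for r in range(2)]


def order_mod(A, N):
    """Least k >= 1 with A^k = I mod N, for A in SL(2, Z).

    The loop terminates because det A = 1, so A is invertible modulo N
    and generates a finite cyclic group."""
    identity = [[1 % N, 0], [0, 1 % N]]
    A_red = [[entry % N for entry in row] for row in A]
    power, k = A_red, 1
    while power != identity:
        power = mat_mul_mod(power, A_red, N)
        k += 1
    return k


if __name__ == "__main__":
    A = [[2, 1], [3, 2]]  # hypothetical example: det = 1, trace = 4
    for N in (2, 5, 7, 35):
        print(N, order_mod(A, N))
```

The running time is proportional to ord(A, N), which is adequate for small N.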
3.1.1. Firstly, if M and N are co-prime then ord(A, MN) = lcm(ord(A, M), ord(A, N)), and so if N has a prime factorization $N = \prod_i p_i^{k_i}$ then $\mathrm{ord}(A, N) = \mathrm{lcm}\{\mathrm{ord}(A, p_i^{k_i})\}$.

3.1.2. The eigenvalues $\epsilon, \epsilon^{-1}$ of A generate a field extension $K = Q(\epsilon)$, which is a real quadratic field since $\mathrm{tr}(A)^2 > 4$. We label them so that $|\epsilon| > 1$. Let $D_A = 4(\mathrm{tr}(A)^2 - 4)$ so that $K = Q(\sqrt{D_A})$. We denote by $O_K$ the ring of integers of K. The eigenvalues $\epsilon, \epsilon^{-1}$ of A will be units in $O_K$. Adjoining ε to Z gives an order $O = Z[\epsilon] \subseteq O_K$ in K. Then there is an O-ideal $I \subset O$ so that the action of ε by multiplication on I is equivalent to the action of A on $Z^2$, in the sense that there is a basis of I with respect to which the matrix of ε is precisely A (see [19] or [14]). The action of O by multiplication on I gives us an embedding $\iota : O \hookrightarrow \mathrm{Mat}_2(Z)$ so that $\gamma = x + y\epsilon \in O$ corresponds to xI + yA. Moreover, the determinant of xI + yA equals $\mathcal{N}(\gamma) = \gamma\bar{\gamma}$, where $\mathcal{N} : K \to Q$ is the Galois norm. In particular, if γ ∈ O has norm one then γ corresponds to an element in $SL_2(Z)$.

Given an integer N ≥ 1, the embedding $\iota : O \hookrightarrow \mathrm{Mat}_2(Z)$ induces a map $\iota_N : O/NO \to \mathrm{Mat}_2(Z/NZ)$ and the norm $\mathcal{N} : K \to Q$ gives a well-defined map $\mathcal{N} : O/NO \to Z/NZ$. Denote by $C_A(N)$ the group of norm one elements in O/NO:
\[ C_A(N) = \ker\big[ \mathcal{N} : (O/NO)^* \to (Z/NZ)^* \big]. \]
This is a subgroup of SL(2, Z/NZ), containing the residue class of A modulo N. The cardinality of $C_A(N)$ can be computed via the Chinese Remainder Theorem from the cardinality at prime power arguments. To do so, define
\[ \chi(p) = \begin{cases} +1, & p \text{ splits in } K, \\ -1, & p \text{ inert in } K. \end{cases} \]
(Recall that a prime p is inert if (p), the ideal generated by p in the ring of integers of K, is a prime ideal. If (p) is a product of two distinct prime ideals, then p splits, whereas p ramifies if (p) is a square of a prime ideal.) By quadratic reciprocity, χ is a Dirichlet character modulo $D_A$ (not necessarily primitive). It can then be shown (see e.g. [14], Appendix B) that if p does not divide $D_A$, then
\[ \#C_A(p^k) = p^{k-1}(p - \chi(p)), \tag{3.1} \]
while for primes dividing $D_A$, there is some $c_A > 0$ so that
\[ \#C_A(p^k) \le c_A\, p^k. \tag{3.2} \]
As a consequence, we find that if p does not divide $D_A$, then the order of A modulo p divides p − χ(p), and more generally, for any prime power $p^k$, if p does not divide $D_A$, then $\mathrm{ord}(A, p^k)$ divides $p^{k-1}(p - \chi(p))$.
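The two divisibility statements above, together with the lcm rule of 3.1.1, are easy to test numerically. The sketch below is an illustration under stated assumptions: it uses the hypothetical matrix A = [[2, 1], [3, 2]] (so $D_A = 4(16 - 4) = 48$), and it evaluates χ(p) for odd primes p not dividing $D_A$ through Euler's criterion, which decides whether $D_A$ is a square modulo p.

```python
from math import gcd


def mat_mul_mod(X, Y, N):
    return [[(X[r][0] * Y[0][c] + X[r][1] * Y[1][c]) % N for c in range(2)]
            for r in range(2)]


def order_mod(A, N):
    """Least k >= 1 with A^k = I mod N."""
    identity = [[1 % N, 0], [0, 1 % N]]
    A_red = [[entry % N for entry in row] for row in A]
    power, k = A_red, 1
    while power != identity:
        power = mat_mul_mod(power, A_red, N)
        k += 1
    return k


def chi(p, D):
    """+1 if D is a nonzero square mod the odd prime p (p splits),
    -1 if it is a non-square (p inert). Assumes p does not divide D."""
    return 1 if pow(D % p, (p - 1) // 2, p) == 1 else -1


def lcm(a, b):
    return a * b // gcd(a, b)


if __name__ == "__main__":
    A = [[2, 1], [3, 2]]                       # hypothetical example
    D_A = 4 * ((A[0][0] + A[1][1]) ** 2 - 4)   # = 48
    for p in (5, 7, 11, 13, 17, 19, 23):
        k = order_mod(A, p)
        print(p, chi(p, D_A), k, (p - chi(p, D_A)) % k == 0)
    print(order_mod(A, 35) == lcm(order_mod(A, 5), order_mod(A, 7)))
```

For the prime powers $p^k$ one can compare `order_mod(A, p**k)` with `p**(k-1) * (p - chi(p, D_A))` in the same way.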
3.1.3. An upper bound for ord(A, N). Another consequence of (3.1), (3.2) is that for any integer $N = \prod_p p^{k_p}$,
\[ \#C_A(N) = \prod_p \#C_A(p^{k_p}) \ll_A N \prod_{p \mid N} \Big(1 + \frac{1}{p}\Big) \ll_A N \log\log N. \]
Thus, for any integer N, we have as an upper bound for the order
\[ \mathrm{ord}(A, N) \ll N \log\log N. \tag{3.3} \]
3.2. Making ord(A, N) small. As for lower bounds on the order, it is easily seen that $\mathrm{ord}(A, N) \gg \log N$ for all N. In fact, this bound is sharp, as we claim:

Proposition 3. There is an infinite sequence of integers $\{N_k\}_{k=1}^{\infty}$ for which $\mathrm{ord}(A, N_k) \ll \log N_k$.

Proof. To explain the idea, recall first how to find integers n for which 2 has small order modulo n: The trick is to take $n_k = 2^k - 1$, since then $2^k = 1 \bmod n_k$, and so $\mathrm{ord}(2, n_k) \le k \sim \log n_k / \log 2$. To modify this idea to our context, assume for simplicity that the matrix A is “principal”, that is the action of A on $Z^2$ is equivalent to the action of the unit ε on the maximal order $O_K$ (in general we need an ideal in the order $O = Z[\epsilon]$, see Sect. 3.1.2). Then $A^k = I \bmod N$ is equivalent to $\epsilon^k = 1 \bmod N O_K$ (in general, only the implication ⇒ is valid). Factor $|\det(A^k - I)|$ as a product of prime powers:
\[ |\det(A^k - I)| = \prod_S p^{\sigma_p} \prod_I p^{\iota_p} \prod_R p^{\rho_p}, \]
where $\prod_S$ means the product over primes $p = \mathfrak{p}\bar{\mathfrak{p}}$ which split in $K = Q(\epsilon)$, $\prod_I$ the product over inert primes and $\prod_R$ the product over the ramified primes $p = \mathfrak{p}^2$. On the other hand, we have
\[ \det(A^k - I) = \mathcal{N}(\epsilon^k - 1) = -\epsilon^{-k} (\epsilon^k - 1)^2. \]
Write the ideal factorization of $a_k := (\epsilon^k - 1) O_K$ as
\[ a_k = \prod_S \mathfrak{p}^{s'_p}\, \bar{\mathfrak{p}}^{s''_p} \prod_I p^{i_p} \prod_R \mathfrak{p}^{r_p}. \]
Since $a_k^2 = \det(A^k - I)\, O_K$, we get on comparing the prime exponents that
\[ 2s'_p = 2s''_p = \sigma_p, \qquad \iota_p = 2 i_p, \qquad \rho_p = r_p. \]
Since $\sigma_p$ is even, we can set
\[ N_k := \prod_S p^{\sigma_p/2} \prod_I p^{i_p} \prod_R p^{[r_p/2]}. \]
Then
\[ N_k \le |\det(A^k - I)| \le N_k^2\, \delta, \]
where $\delta = \prod_R p$ is the product of all ramified primes of K.
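We have $a_k \subseteq N_k O_K$ and so $\epsilon^k = 1 \bmod N_k O_K$, equivalently $A^k = I \bmod N_k$. Thus we find
\[ \mathrm{ord}(A, N_k) \le k \sim \frac{\log |\det(A^k - I)|}{\log \epsilon} \le \frac{\log N_k^2 \delta}{\log \epsilon} = \frac{2}{\log \epsilon} \log N_k + O(1), \]
and so $\mathrm{ord}(A, N_k) \ll \log N_k$ as required. □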
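The phenomenon behind Proposition 3 can be observed with a simplified variant of the construction. Instead of the modulus $N_k$ built from the ideal factorization in the proof, the sketch below takes the greatest common divisor of the entries of $A^k - I$, which is the largest integer N with $A^k \equiv I \bmod N$; this choice, the function names, and the example matrix A = [[2, 1], [3, 2]] are assumptions made for illustration. By construction the order modulo this N divides k, while N grows exponentially in k.

```python
from math import gcd


def mat_mul(X, Y):
    """Exact product of two 2x2 integer matrices."""
    return [[X[r][0] * Y[0][c] + X[r][1] * Y[1][c] for c in range(2)]
            for r in range(2)]


def largest_modulus(A, k):
    """Largest N with A^k = I mod N: the gcd of the entries of A^k - I."""
    P = [[1, 0], [0, 1]]
    for _ in range(k):
        P = mat_mul(P, A)
    return gcd(gcd(P[0][0] - 1, P[0][1]), gcd(P[1][0], P[1][1] - 1))


def order_mod(A, N):
    """Least k >= 1 with A^k = I mod N."""
    identity = [[1 % N, 0], [0, 1 % N]]
    A_red = [[entry % N for entry in row] for row in A]
    power, k = A_red, 1
    while power != identity:
        power = mat_mul(power, A_red)
        power = [[entry % N for entry in row] for row in power]
        k += 1
    return k


if __name__ == "__main__":
    A = [[2, 1], [3, 2]]  # hypothetical example
    for k in range(2, 13):
        N_k = largest_modulus(A, k)
        print(k, N_k, order_mod(A, N_k))
```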
4. Large Order of A Implies Equidistribution

4.1. In this section we give a relation between the order of the map A modulo N and the distribution of the eigenfunctions of the quantization $U_N(A)$. We start by relating the fourth power-moment of the expectation values $\langle T_N(n)\psi_i, \psi_i \rangle$, for $\psi_i$ ranging over an orthonormal basis of $U_N(A)$-eigenfunctions, to the number of solutions of a certain equation modulo N. Recall our notation: $n = (n_1, n_2)$ will denote a row vector, and the matrix A acts by multiplication on the right: $n \mapsto nA$.

Proposition 4. Let $\{\psi_i\}_{i=1}^N$ be an orthonormal basis of eigenfunctions of $U_N(A)$. Then
\[ \sum_{i=1}^{N} |\langle T_N(n)\psi_i, \psi_i \rangle|^4 \le \frac{N}{\mathrm{ord}(A, N)^4}\, \nu(N, n), \tag{4.1} \]
where ν(N, n) is the number of solutions of the congruence
\[ n(A^i - A^j + A^k - A^l) \equiv 0 \bmod N, \qquad 1 \le i, j, k, l \le \mathrm{ord}(A, N). \]
Proof. Let
\[ D(n) = \frac{1}{\mathrm{ord}(A, N)} \sum_{i=1}^{\mathrm{ord}(A, N)} T_N(nA^i), \]
and let $t_{ij}$ be the matrix coefficients of $T_N(n)$ expressed in terms of the basis $\{\psi_i\}_{i=1}^N$. From (2.7) we have that
\[ T_N(nA^k) = U_N(A^k)^*\, T_N(n)\, U_N(A^k), \tag{4.2} \]
and by assumption $U_N(A)\psi_i = \lambda_i \psi_i$ for $\lambda_i$ a root of unity. Thus, if $\{D_{ij}\}_{i,j=1}^N$ are the matrix coefficients of D in terms of the basis $\{\psi_i\}_{i=1}^N$, then
\[ D_{ij} = \begin{cases} t_{ij} & \text{if } \lambda_i = \lambda_j, \\ 0 & \text{otherwise.} \end{cases} \tag{4.3} \]
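Indeed, from (4.2) we have
\[ D_{ij} = \frac{1}{\mathrm{ord}(A, N)} \sum_{k=1}^{\mathrm{ord}(A, N)} \langle T_N(nA^k)\psi_i, \psi_j \rangle = \frac{1}{\mathrm{ord}(A, N)} \sum_{k=1}^{\mathrm{ord}(A, N)} \langle U_N(A^k)^* T_N(n) U_N(A^k)\psi_i, \psi_j \rangle \]
\[ = \frac{1}{\mathrm{ord}(A, N)} \sum_{k=1}^{\mathrm{ord}(A, N)} \langle T_N(n) U_N(A^k)\psi_i, U_N(A^k)\psi_j \rangle = \frac{1}{\mathrm{ord}(A, N)} \sum_{k=1}^{\mathrm{ord}(A, N)} (\lambda_i \bar{\lambda}_j)^k\, t_{ij}, \]
which gives (4.3). If we denote by $\{v_i\}_{i=1}^N$ the column vectors of D, then the (k, k)-entry of $(D^*D)^2$ is
\[ ((D^*D)^2)_{kk} = \sum_i \langle v_i, v_k \rangle \langle v_k, v_i \rangle = \sum_i |\langle v_i, v_k \rangle|^2, \]
and since $|\langle v_k, v_k \rangle| = \sum_i |D_{ki}|^2$ we get
\[ \sum_{\lambda_i = \lambda_j} |t_{ij}|^4 \le \mathrm{tr}((D^*D)^2). \]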
Substituting the definition of D and using (2.5) and (2.6), we see that $(D^*D)^2$ is given by $\mathrm{ord}(A, N)^{-4}$ times a sum, ranging over $1 \le i, j, k, l \le \mathrm{ord}(A, N)$, of terms
\[ T_N(nA^i)\, T_N(-nA^j)\, T_N(nA^k)\, T_N(-nA^l) = \gamma_{i,j,k,l}\, T_N\big(n(A^i - A^j + A^k - A^l)\big), \]
where $\gamma_{i,j,k,l}$ has absolute value one. Now take the trace and use (2.4): $|\mathrm{tr}\, T_N(n)|$ equals N if n ≡ (0, 0) mod N, and is zero otherwise. The result now follows by taking absolute values and summing over all i, j, k, l. (For more details, see Sect. 6.2 in [14].) □

4.2. A counting problem. In order to make use of Proposition 4 we must bound the number of solutions to
\[ n(A^i - A^j + A^k - A^l) \equiv 0 \bmod N, \qquad 1 \le i, j, k, l \le \mathrm{ord}(A, N). \]
We will show that there are essentially only trivial solutions of this equation, i.e.
\[ (A^i, A^k) = (A^j, A^l), \qquad (A^i, A^k) = (A^l, A^j), \qquad \text{or} \quad (A^i, A^j) = (-A^k, -A^l), \]
where the third possibility only happens if there exists t such that $A^t = -I$. In terms of the exponents i, j, k, l this means that
\[ (i, k) = (j, l), \qquad (i, k) = (l, j), \qquad \text{or} \quad (i, j) = (k + t, l + t), \tag{4.4} \]
where equality is to be interpreted as equality modulo the order of A.
4.2.1. The prime case. Here we assume N = p is prime. Lemma 5. Assume that nA and n are linearly independent modulo p, and that the eigenvalues of A are distinct modulo p. Then there are at most 3 ord(A, p)2 solutions of n(Ai − Aj + Ak − Al ) ≡ 0
mod p, 1 ≤ i, j, k, l ≤ ord(A, p).
(4.5)
Proof. Let $K$ be the real quadratic field containing the eigenvalues of $A$, and let $K_p$ be the residue class field at the prime $p$, i.e., $K_p = O_K/P$, where $P$ is a prime of $K$ lying above $p$. $K_p$ has cardinality $p$ if $p$ splits in $K$, or $p^2$ if $p$ is inert. We may diagonalize the reduction of $A$ modulo $p$ over the field $K_p$. In the eigenvector basis we have
$$A' = \begin{pmatrix} \epsilon & 0 \\ 0 & \epsilon^{-1} \end{pmatrix} \quad \text{and} \quad n' = (n'_1, n'_2),$$
where the assumption of linear independence modulo $p$ implies that both $n'_1, n'_2 \neq 0$ (in $K_p$). Thus (4.5) is equivalent to the following two equations over $K_p$:
$$\epsilon^i - \epsilon^j + \epsilon^k - \epsilon^l = 0, \qquad \epsilon^{-i} - \epsilon^{-j} + \epsilon^{-k} - \epsilon^{-l} = 0, \tag{4.6}$$
which in turn (see Lemma 15 in [14]) is equivalent to
$$\epsilon^l = \epsilon^i - \epsilon^j + \epsilon^k, \qquad (\epsilon^k - \epsilon^i)(\epsilon^k - \epsilon^j)(\epsilon^i + \epsilon^j) = 0. \tag{4.7}$$
Hence $l$ is determined by the triple $(i,j,k)$. Dividing by $\epsilon^k$ and letting $i' = i - k$ and $j' = j - k$ we rewrite the second equation as
$$(1 - \epsilon^{i'})(1 - \epsilon^{j'})(\epsilon^{i'} + \epsilon^{j'}) = 0, \qquad 1 \le i', j' \le \mathrm{ord}(A,p). \tag{4.8}$$
If the first (or second) factor equals zero then $\mathrm{ord}(A,p) \mid i'$ (or $j'$) since the order of $\epsilon$ in $K_p^\times$ equals $\mathrm{ord}(A,p)$. If the third factor is zero then $\mathrm{ord}(A,p) \mid i' - j' - t$, where $\epsilon^t = -1$. In each case this leaves $\mathrm{ord}(A,p)$ possibilities for the pair $(i',j')$, and since $k$ is unconstrained the total number of solutions is at most $3\,\mathrm{ord}(A,p)^2$. □

Remark. The condition of linear independence mod $p$ in Lemma 5 is satisfied for all but finitely many primes. In fact, if we let
$$M = \begin{pmatrix} n_1 & n_2 \\ m_1 & m_2 \end{pmatrix},$$
where $n = (n_1, n_2)$ and $nA = (m_1, m_2)$, then the condition of linear dependence is equivalent to $p \mid \det M$. Now $\det M$ is a nonzero integer, because $A$ has no rational eigenvectors. We also note that if the independence condition is not satisfied then trivially there are at most $\mathrm{ord}(A,p)^4$ solutions to (4.5).

Lemma 6. Let $N$ be square free and coprime to $D_A = 4(\mathrm{tr}(A)^2 - 4)$. Assume further that $nA$ and $n$ are linearly independent modulo $p$ for all $p \mid N$. Then there are at most $3^{\omega(N)}\,\mathrm{ord}(A,N)^2$ solutions of
$$n(A^i - A^j + A^k - A^l) \equiv 0 \bmod N, \qquad 1 \le i,j,k,l \le \mathrm{ord}(A,N). \tag{4.9}$$
On Quantum Ergodicity
211
Proof. Let $(i,j,k,l)$ be a solution to (4.9). If $p \mid N$ then (4.9) holds with $N$ replaced by $p$. Arguing as in Lemma 5, one of the three factors in (4.8) must be zero, and the vanishing factor determines which of the three equations in (4.4) $(i,j,k,l)$ must satisfy modulo $\mathrm{ord}(A,p)$. For example, if the first factor in (4.8) is zero, then $(i,j) \equiv (k,l) \bmod \mathrm{ord}(A,p)$.

Now, the group generated by $A$ modulo $N$ is cyclic and isomorphic to $\bigoplus_{q \in Q} \mathbf{Z}/q^{a_q}\mathbf{Z}$, where the $q$'s are distinct primes. We will denote the $\mathbf{Z}/q^{a_q}\mathbf{Z}$ component of $i$ by $i_q$ and similarly for $j, k, l$. Since $\mathrm{ord}(A,N)$ is equal to the least common multiple of $\{\mathrm{ord}(A,p)\}_{p \mid N}$, there exists for each $q \in Q$ at least one prime $p \mid N$ such that $q^{a_q} \,\|\, \mathrm{ord}(A,p)$.

Claim. If $(i,j,k,l)$ is a solution to (4.9) then $(i_q, j_q, k_q, l_q)$ satisfies one of the equations in (4.4).

The reason is as follows: there is a prime $p \mid N$ such that $q^{a_q} \,\|\, \mathrm{ord}(A,p)$, thus one of the equations in (4.4) is satisfied modulo $\mathrm{ord}(A,p)$. Since $q^{a_q} \,\|\, \mathrm{ord}(A,p)$ this implies that $(i_q, j_q, k_q, l_q)$ satisfies one of the equations in (4.4). (Note in particular that this leaves $q^{2a_q}$ possibilities for $(i_q, j_q, k_q, l_q)$ if we specify one of the equations in (4.4) to be satisfied.) Now, to each $p \mid N$ there are 3 different types of trivial solutions, and since $(i_q, j_q, k_q, l_q)$ must satisfy one of the possibilities in (4.4) for all $q \in Q$, we obtain that there are at most
$$3^{\omega(N)} \prod_{q \in Q} q^{2a_q} = 3^{\omega(N)}\,\mathrm{ord}(A,N)^2$$
solutions to (4.9). □
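The counts in Lemmas 5 and 6 can be checked by brute force for small moduli. The following is a minimal sketch, not taken from the paper: the matrix $A = \begin{pmatrix} 2 & 1 \\ 3 & 2 \end{pmatrix}$ and the vector $n = (1,0)$ are assumed examples (so that $D_A = 48$ and $\det M = 1$), and all function names are hypothetical. It counts solutions of (4.9) by grouping the pairs $(i,k)$ according to the value of $nA^i + nA^k$ and prints the count next to the bound $3^{\omega(N)}\mathrm{ord}(A,N)^2$.

```python
from collections import Counter

A = ((2, 1), (3, 2))   # assumed example: det A = 1, tr A = 4, so D_A = 48
n = (1, 0)             # assumed example vector; nA = (2, 1), det M = 1


def vec_mat(v, M, N):
    """Row vector v times the 2x2 matrix M, reduced modulo N."""
    return ((v[0] * M[0][0] + v[1] * M[1][0]) % N,
            (v[0] * M[0][1] + v[1] * M[1][1]) % N)


def order_mod(M, N):
    """Least k >= 1 with M^k = I mod N (M must be invertible mod N, N >= 2)."""
    e1, e2 = (1, 0), (0, 1)
    r1, r2 = vec_mat(e1, M, N), vec_mat(e2, M, N)   # rows of M mod N
    k = 1
    while (r1, r2) != (e1, e2):
        r1, r2 = vec_mat(r1, M, N), vec_mat(r2, M, N)
        k += 1
    return k


def count_solutions(v, M, N):
    """Return (T, nu): T = ord(M, N) and nu = number of (i, j, k, l) in
    [1, T]^4 with v(M^i - M^j + M^k - M^l) = 0 mod N."""
    T = order_mod(M, N)
    orbit, w = [], v
    for _ in range(T):
        w = vec_mat(w, M, N)
        orbit.append(w)                      # v M^i for i = 1..T
    # The equation says v M^i + v M^k = v M^j + v M^l, so group the
    # ordered pairs by their sum and add up the squares of the group sizes.
    sums = Counter(((a[0] + b[0]) % N, (a[1] + b[1]) % N)
                   for a in orbit for b in orbit)
    return T, sum(c * c for c in sums.values())


def omega(N):
    """Number of distinct prime factors of N."""
    count, p = 0, 2
    while p * p <= N:
        if N % p == 0:
            count += 1
            while N % p == 0:
                N //= p
        p += 1
    return count + (1 if N > 1 else 0)


for N in (5, 7, 11, 13, 35, 55):             # square-free and coprime to 48
    T, nu = count_solutions(n, A, N)
    print(N, T, nu, 3 ** omega(N) * T * T)
```

The trivial solutions $(i,k) = (j,l)$ and $(i,k) = (l,j)$ always contribute $2T^2 - T$ quadruples, so each count lies between that number and the bound of Lemma 6.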
In our applications the hypothesis of linear independence might not hold for all $p \mid N$. However, we have the following

Lemma 7. Let $N$ be square free. Then there are at most
$$O_A\bigl(|n|_2^{8+\varepsilon}\, 3^{\omega(N)}\, \mathrm{ord}(A,N)^2\bigr)$$
solutions to
$$n(A^i - A^j + A^k - A^l) \equiv 0 \bmod N, \qquad 1 \le i,j,k,l \le \mathrm{ord}(A,N). \tag{4.10}$$
Proof. By the remark after Lemma 5, linear dependence modulo $p$ holds if and only if $p \mid \det M$, where $|\det M| \ll_A |n|_2^2$. Let
$$N' = \frac{N}{\gcd(D_A \det M, N)}.$$
Then the hypothesis in Lemma 6 is satisfied for $N'$, leaving $3^{\omega(N')}\,\mathrm{ord}(A,N')^2$ possible values for $(i,j,k,l)$ modulo $\mathrm{ord}(A,N')$. Now, an element in $\mathbf{Z}/\mathrm{ord}(A,N')\mathbf{Z}$ has exactly $\mathrm{ord}(A,N)/\mathrm{ord}(A,N')$ preimages in $\mathbf{Z} \cap [1, \mathrm{ord}(A,N)]$. Hence there are at most
$$3^{\omega(N')}\,\mathrm{ord}(A,N')^2 \left(\frac{\mathrm{ord}(A,N)}{\mathrm{ord}(A,N')}\right)^4$$
solutions to (4.10). Since $|\det(M)| \ll_A |n|_2^2$ we get that
$$\frac{N}{N'} = \gcd(D_A \det M, N) \le D_A \det M \ll_A |n|_2^2.$$
Finally, noting that since $N$ is square-free,
$$\mathrm{ord}(A,N) = \mathrm{lcm}\bigl(\mathrm{ord}(A,N'),\, \mathrm{ord}(A,N/N')\bigr) \le \mathrm{ord}(A,N') \cdot \mathrm{ord}(A,N/N'),$$
we find (by (3.3)) that
$$\frac{\mathrm{ord}(A,N)}{\mathrm{ord}(A,N')} \le \mathrm{ord}(A,N/N') \ll \left(\frac{N}{N'}\right)^{1+\varepsilon}$$
for all $\varepsilon > 0$, and we are done. □
4.3. Conclusion.

Proposition 8. There exists a density-one sequence $S$ of integers such that if $n \neq 0$ and $N \in S$ then
$$\sum_{i=1}^{N} |\langle T_N(n)\psi_i, \psi_i\rangle|^4 \ll |n|_2^{8+\varepsilon}\, \frac{N (\log N)^{14}}{\mathrm{ord}(A,N)^2}.$$

Proof. Let $S$ be the set of integers of the form $N = ds^2$, where $d$ is square free, $s \le \log N$, and $\omega(N) \le \tfrac{3}{2} \log\log N$. By Lemmas 21 and 22, proved in the appendix, $S$ has density one. For $N = ds^2 \in S$, we wish to bound the number of solutions to
$$n(A^i - A^j + A^k - A^l) = 0 \bmod N, \qquad 1 \le i,j,k,l \le \mathrm{ord}(A,N). \tag{4.11}$$
Since $N$ is not square free we cannot apply Lemma 7 directly. For $N = ds^2$, $d$ square-free, we further decompose $d = d_1 \gcd(d,s)$, so that $d_1$ and $N/d_1 = \gcd(d,s)s^2$ are coprime. Given $t \in \mathbf{Z}$ there are exactly $\mathrm{ord}(A,N)/\mathrm{ord}(A,d_1)$ solutions to $A^i \equiv A^t \bmod d_1$ if $i \in \mathbf{Z} \cap [1, \mathrm{ord}(A,N)]$. Thus, a solution of
$$n(A^i - A^j + A^k - A^l) = 0 \bmod d_1, \qquad 1 \le i,j,k,l \le \mathrm{ord}(A,d_1) \tag{4.12}$$
lifts to at most $(\mathrm{ord}(A,N)/\mathrm{ord}(A,d_1))^4$ solutions for which $1 \le i,j,k,l \le \mathrm{ord}(A,N)$. This, together with Lemma 7 applied to (4.12), shows that there are at most
$$\left(\frac{\mathrm{ord}(A,N)}{\mathrm{ord}(A,d_1)}\right)^4 |n|_2^{8+\varepsilon}\, 3^{\omega(d_1)}\, \mathrm{ord}(A,d_1)^2$$
solutions to (4.11). Clearly $\omega(d_1) \le \omega(N)$, $\mathrm{ord}(A,d_1) \le \mathrm{ord}(A,N)$, and since $d_1$, $N/d_1$ are coprime, with $N/d_1 \le s^3$, we have
$$\frac{\mathrm{ord}(A,N)}{\mathrm{ord}(A,d_1)} \le \mathrm{ord}\Bigl(A, \frac{N}{d_1}\Bigr) \ll \Bigl(\frac{N}{d_1}\Bigr)^{1+\varepsilon} \le s^{3(1+\varepsilon)}$$
for all $\varepsilon > 0$ (by (3.3)). Hence the number $\nu(N,n)$ of solutions of (4.11) is bounded by
$$\nu(N,n) \ll |n|_2^{8+\varepsilon}\, s^{12+\varepsilon}\, 3^{\omega(N)}\, \mathrm{ord}(A,N)^2. \tag{4.13}$$
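Thus we find that for $N \in S$ the number of solutions of (4.11) is bounded by
$$|n|_2^{8+\varepsilon} (\log N)^{12+\varepsilon}\, 3^{\frac{3}{2}\log\log N}\, \mathrm{ord}(A,N)^2 \ll |n|_2^{8+\varepsilon} (\log N)^{14}\, \mathrm{ord}(A,N)^2$$
and consequently we see from Proposition 4 that
$$\sum_{i=1}^{N} |\langle T_N(n)\psi_i, \psi_i\rangle|^4 \le |n|_2^{8+\varepsilon}\, \frac{N (\log N)^{14}}{\mathrm{ord}(A,N)^2}$$
as required. □

By a routine argument (see [14], Sect. 6) we get:

Corollary 9. There is a density one sequence of integers $N$ so that for all observables $f \in C^\infty(\mathbf{T}^2)$, we have
$$\sum_{j=1}^{N} \Bigl|\langle \mathrm{Op}_N(f)\psi_j, \psi_j\rangle - \int_{\mathbf{T}^2} f\Bigr|^4 \ll_f \frac{N (\log N)^{14}}{\mathrm{ord}(A,N)^2}.$$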
This reduces the proof of Theorem 1 to showing that for a sequence of density one of integers, $\mathrm{ord}(A,N)$ grows faster than $N^{1/2}(\log N)^7$ as $N \to \infty$. We will do this in Sect. 7 (Theorem 17).

5. Relating the Order of A Modulo Integers to the Order Modulo Primes

Our goal in this section is to show (Proposition 11) that for a set of density one of integers $N$, $\mathrm{ord}(A,N)$ is not much smaller than the product of $\mathrm{ord}(A,p)$ over prime divisors $p$ of $N$.

5.1. For a set of positive integers $M = \{m_1, \dots, m_k\}$, define
$$L(M) = \frac{\prod_{j=1}^{k} m_j}{\mathrm{lcm}\{m_1, \dots, m_k\}}.$$
Then $L(M)$ is a positive integer, $L(\{m\}) = 1$ and $L(\{m_1, m_2\}) = \gcd(m_1, m_2)$. From the definition, a prime $\ell$ divides $L(m_1, \dots, m_k)$ if and only if there are two distinct indices $i \neq j$ so that $\ell$ divides both $m_i$ and $m_j$.

Lemma 10. Let $M = \{m_1, \dots, m_k\}$, $N = \{n_1, \dots, n_k\}$ and suppose that $m_j \mid n_j$, $1 \le j \le k$. Then $L(M)$ divides $L(N)$. In particular,
$$\mathrm{lcm}\{m_1, \dots, m_k\} \ge \frac{\prod_j m_j}{L(N)}.$$

Proof. Factor $m_j = \prod_i p_i^{\alpha_{ij}}$, $n_j = \prod_i p_i^{\alpha_{ij} + \beta_{ij}}$ with $\alpha_{ij}, \beta_{ij} \ge 0$. Then $L(M) = \prod_i p_i^{\mu_i}$, $L(N) = \prod_i p_i^{\nu_i}$, where
$$\mu_i = \sum_{j=1}^{k} \alpha_{ij} - \max_{1 \le j \le k} \alpha_{ij}, \qquad \nu_i = \sum_{j=1}^{k} (\alpha_{ij} + \beta_{ij}) - \max_{1 \le j \le k} (\alpha_{ij} + \beta_{ij}).$$
Thus the lemma reduces to the following easily verified inequality: For any non-negative reals $a_j, b_j \ge 0$, $1 \le j \le k$, we have
$$\sum_j a_j - \max_j a_j \le \sum_j (a_j + b_j) - \max_j \{a_j + b_j\}. \qquad □$$
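The quantity $L(M)$ and the statement of Lemma 10 are easy to experiment with. The sketch below is illustrative only; the function names and the sample lists are assumptions chosen for the example, not data from the paper.

```python
from functools import reduce
from math import gcd


def lcm_list(ms):
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), ms, 1)


def product(ms):
    """Product of a list of integers."""
    return reduce(lambda a, b: a * b, ms, 1)


def L(ms):
    """L(M) = (product of the m_j) / lcm(m_1, ..., m_k)."""
    return product(ms) // lcm_list(ms)


M = [4, 6, 10]      # assumed sample values
N = [8, 18, 10]     # each m_j divides the corresponding n_j

print(L([7]), L([12, 18]), L(M), L(N))
# Lemma 10: L(M) divides L(N), and lcm(M) >= prod(M) / L(N).
print(L(N) % L(M) == 0, lcm_list(M) * L(N) >= product(M))
```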
5.2. We need to apply these considerations to bounding $\mathrm{ord}(A,N)$. Given an integer $N$, we will write $N = ds^2$ with $d$ square-free, and further decompose $d = d_0 \gcd(d, D_A)$, so that $d_0 = d_0(N)$ is square-free and co-prime to $D_A$. Now define
$$L(N) = L(\{p - \chi(p) : p \mid d_0(N)\}). \tag{5.1}$$
Since $d_0 \mid N$, we have
$$\mathrm{ord}(A,N) \ge \mathrm{ord}(A,d_0) = \mathrm{lcm}(\{\mathrm{ord}(A,p) : p \mid d_0\}).$$
Moreover, for $p \mid d_0$ we have $\mathrm{ord}(A,p) \mid p - \chi(p)$ and so by Lemma 10 we find
$$\mathrm{lcm}(\{\mathrm{ord}(A,p) : p \mid d_0\}) \ge \frac{\prod_{p \mid d_0} \mathrm{ord}(A,p)}{L(N)}$$
and thus
$$\mathrm{ord}(A,N) \ge \frac{\prod_{p \mid d_0} \mathrm{ord}(A,p)}{L(N)}.$$
We will show (Proposition 13) that for almost all $N \le x$, we have $L(N) \le \exp(3(\log\log x)^4)$ and consequently we get as the main result of this section:

Proposition 11. For almost all $N \le x$,
$$\mathrm{ord}(A,N) \ge \frac{\prod_{p \mid d_0} \mathrm{ord}(A,p)}{\exp(3(\log\log x)^4)}, \tag{5.2}$$
where $d_0$ is given by writing $N = ds^2$, with $d = d_0 \gcd(d, D_A)$ square-free.
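The divisibility $\mathrm{ord}(A,p) \mid p - \chi(p)$ used above can be tested numerically. The sketch below makes two assumptions that are not taken from this section: the example matrix $A = \begin{pmatrix} 2 & 1 \\ 3 & 2 \end{pmatrix}$ (so $\mathrm{tr}(A)^2 - 4 = 12$), and the identification of $\chi(p)$ with the Legendre symbol of $\mathrm{tr}(A)^2 - 4$ modulo $p$ for primes $p \nmid D_A$. The helper names are hypothetical.

```python
A = ((2, 1), (3, 2))          # assumed example matrix, det A = 1
DISC = (2 + 2) ** 2 - 4       # tr(A)^2 - 4 = 12


def mat_mul(X, Y, p):
    """Product of two 2x2 matrices modulo p."""
    return (((X[0][0] * Y[0][0] + X[0][1] * Y[1][0]) % p,
             (X[0][0] * Y[0][1] + X[0][1] * Y[1][1]) % p),
            ((X[1][0] * Y[0][0] + X[1][1] * Y[1][0]) % p,
             (X[1][0] * Y[0][1] + X[1][1] * Y[1][1]) % p))


def order_mod(M, p):
    """Least k >= 1 with M^k = I mod p."""
    identity = ((1, 0), (0, 1))
    P = mat_mul(M, identity, p)   # M reduced mod p
    k = 1
    while P != identity:
        P = mat_mul(P, M, p)
        k += 1
    return k


def legendre(d, p):
    """Legendre symbol (d/p) for an odd prime p, by Euler's criterion."""
    r = pow(d % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def is_prime(m):
    return m >= 2 and all(m % q for q in range(2, int(m ** 0.5) + 1))


for p in (q for q in range(5, 60) if is_prime(q)):   # primes not dividing 48
    chi = legendre(DISC, p)                          # assumed form of chi(p)
    print(p, chi, order_mod(A, p), (p - chi) % order_mod(A, p) == 0)
```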
5.3. For $x \gg 1$, we set $z = z(x) = (\log\log x)^3$. We say that an integer is $z$-smooth if it has no prime divisors larger than $z$.

Lemma 12. $L(N)$ is $z$-smooth with at most $O(x/\log\log x)$ exceptions for $1 \le N \le x$.

Proof. Suppose that $L(N)$ is divisible by a prime $\ell > z$. From the definition of $L(N)$, this implies that there are two distinct prime divisors $q_1, q_2$ of $d_0(N)$ so that $\ell$ divides $q_i - \chi(q_i)$, $i = 1, 2$. In particular, $\ell \le x^{1/2}$. Thus we find two distinct primes such that
$$q_1 q_2 \mid N \quad \text{and} \quad q_i = \pm 1 \bmod \ell, \quad i = 1, 2. \tag{5.3}$$
For fixed $q_1, q_2$ the number of $N \le x$ divisible by $q_1 q_2$ is $[x/q_1 q_2]$. Thus for fixed $\ell$, the number of $N \le x$ satisfying (5.3) is at most
$$\sum_{q_1, q_2 = \pm 1 \bmod \ell} \frac{x}{q_1 q_2} \le x \Bigl(\sum_{q = \pm 1 \bmod \ell} \frac{1}{q}\Bigr)^2.$$
By Brun–Titchmarsh (Lemma 23; recall $\ell \le x^{1/2}$), this is bounded (up to a constant factor) by $x(\log\log x/\ell)^2$. Summing over all primes $\ell > z$, we find that the number of integers $N \le x$ such that $L(N)$ is divisible by some prime $\ell > z$ is at most
$$x(\log\log x)^2 \sum_{\ell > z} \frac{1}{\ell^2} \ll \frac{x(\log\log x)^2}{z} \ll \frac{x}{\log\log x}. \qquad □$$
Proposition 13. For almost all integers $N \le x$ we have $L(N) \le \exp(3(\log\log x)^4)$.

Proof. By Lemma 12 we may assume that $L(N)$ is $z$-smooth, with $z = (\log\log x)^3$. For $p \mid d_0(N)$, write the $z$-smooth part of $p - \chi(p)$ as $f_p s_p^2$, with $f_p$ square-free. Set
$$S_N = \max_{p \mid d_0} s_p.$$
Note that since $f_p$ is square-free and $z$-smooth, it divides the product of all primes $q \le z$. Thus for $z \gg 1$ we have:
$$f_p \le \prod_{q \le z} q \le e^{3z/2}.$$
Since $L(N)$ is $z$-smooth and divides $\prod_{p \mid d_0} (p - \chi(p))$, it also divides the product $\prod_{p \mid d_0} f_p s_p^2$. Thus
$$L(N) \le \prod_{p \mid d_0} f_p s_p^2 \le \prod_{p \mid d_0} e^{3z/2} S_N^2 \le \bigl(e^{\frac{3}{2} z} S_N^2\bigr)^{\omega(N)}$$
or
$$\frac{\log L(N)}{\omega(N)} - \frac{3}{2} z \le \log S_N^2. \tag{5.4}$$
Now for almost all $N \le x$ we have (Lemma 22)
$$\omega(N) < \frac{3}{2} \log\log x, \tag{5.5}$$
and so by (5.4) if $L(N)$ is large, so is $S_N$. Specifically, if $\log L(N) > 3z \log\log x = 3(\log\log x)^4$ then by (5.4), (5.5), we find
$$\log S_N^2 > z/2 = (\log\log x)^3/2.$$
We will show that this fails for almost all $N \le x$ and thus prove the proposition.

To estimate the number of $N \le x$ for which $\log S_N^2 > z/2 = (\log\log x)^3/2$, recall that by the definition of $S_N$ there is some prime $q$ dividing $d_0$ (and hence dividing $N$) so that the $z$-smooth part of $q - \chi(q)$ is $f_q s_q^2$ and $S_N = s_q$ (in particular if $N \le x$ then $S_N \le x^{1/2}$). Thus there is a prime $q \mid N$ for which $q = \pm 1 \bmod S^2$. Given $q$ there are at most $[x/q]$ integers $N \le x$ divisible by $q$, and hence the total number of $N \le x$ with $\log S_N^2 > z/2$ is at most
$$\sum_{\exp(z/4) < S < x^{1/2}} \; \sum_{\substack{q = \pm 1 \bmod S^2 \\ q \le x}} \frac{x}{q}.$$
By Lemma 23 we have for fixed $S < x^{1/2}$,
$$\sum_{\substack{q = \pm 1 \bmod S^2 \\ q \le x}} \frac{x}{q} \ll \frac{x \log\log x}{S^2},$$
and summing over $S > e^{z/4}$ gives at most
$$x \log\log x \sum_{S > \exp(z/4)} \frac{1}{S^2} \ll \frac{x \log\log x}{\exp(z/4)}.$$
Thus the number of $N \le x$ for which $\log S_N^2 > z/2 = (\log\log x)^3/2$ is at most
$$\frac{x \log\log x}{e^{z/4}} \ll x \log\log x \, \exp\Bigl(-\frac{1}{4}(\log\log x)^3\Bigr) = o(x),$$
and we are done. □

6. Large Order for Primes

In this section we show that $\mathrm{ord}(A,p)$ is large for a positive proportion of primes. Our main result here is:

Theorem 14. Let $1/2 < \eta < 3/5$. Then the number of primes $p \le x$ for which the order of the cat map modulo $p$ satisfies $\mathrm{ord}(A,p) > x^\eta$ is at least $c(\eta)\pi(x) + o(\pi(x))$, where
$$c(\eta) = \frac{3 - 5\eta}{2(1 - \eta)}, \qquad 1/2 < \eta < 3/5. \tag{6.1}$$

We first observe (following Hooley [12]):

Lemma 15. The number of primes for which $\mathrm{ord}(A,p) \le y$ is $\ll y^2$.
Proof. If $\mathrm{ord}(A,p) = k \le y$ then $A^k = I \bmod p$ and so $p \mid \det(A^k - I)$. Thus the number of such primes is bounded by the total number of prime divisors of the integers $\det(A^k - I)$, $k \le y$, that is by
$$\sum_{k \le y} \omega(\det(A^k - I)),$$
where $\omega(n)$ is the number of prime factors of $n$. Now trivially $\omega(n) \le \log |n|$, and $|\det(A^k - I)| \sim \epsilon^k$, where $\epsilon > 1$ is the largest eigenvalue of $A$. Thus we get a bound for the number of primes as above of
$$\sum_{k \le y} \omega(\det(A^k - I)) \ll \sum_{k \le y} k \ll y^2$$
as required. □

For $\eta \ge 1/2$, let $P_\eta(x)$ be the set of primes $p \le x$ for which there is a prime $q > x^\eta$, with $q \mid p - \chi(p)$. The main tool for proving Theorem 14 is:

Proposition 16. For $1/2 < \eta < 3/5$ we have
$$\#P_\eta(x) \ge c(\eta)\pi(x)\,(1 + o(1))$$
with $c(\eta) > 0$ given by (6.1).

Theorem 14 follows from Proposition 16 and the following observation: For all but $o(\pi(x))$ of the primes of $P_\eta(x)$ we have $\mathrm{ord}(A,p) > x^\eta$. Indeed, for $p \nmid D_A$, $\mathrm{ord}(A,p)$ divides $p - \chi(p)$. For $p \in P_\eta(x)$, if $\mathrm{ord}(A,p)$ is not divisible by the large factor $q > x^\eta$ of $p - \chi(p)$ then it divides $(p - \chi(p))/q < x^{1-\eta}$ and so $\mathrm{ord}(A,p)$ is smaller than $y = x^{1-\eta}$; the number of such primes is by Lemma 15 at most $O(x^{2(1-\eta)}) = o(\pi(x))$ since $\eta > 1/2$. Thus for all but $o(\pi(x))$ of the primes in $P_\eta(x)$, we have $q \mid \mathrm{ord}(A,p)$ and so for these primes $\mathrm{ord}(A,p) \ge q > x^\eta$.

6.1. Proof of Proposition 16. The proof of Proposition 16 is a modification of a theorem due to Goldfeld [8] from the case of primes $p$ for which $p + a$ has a large prime factor for fixed $a$, to the case when $a$ is allowed to vary with $p$ in a bounded fashion, depending on a fixed set of congruence conditions.

The idea is as follows: By quadratic reciprocity, $\chi(p)$ only depends on the residue of $p$ modulo $D_A = 4(\mathrm{tr}(A)^2 - 4)$. Thus the number of primes in $P_\eta(x)$ is the sum over all invertible residues $a \bmod D_A$ of the number of primes in
$$P_\eta(x; D_A, a) = \{p \in P_\eta(x) : p = a \bmod D_A\}.$$
We will show
$$\#P_\eta(x; D_A, a) \gtrsim \frac{c(\eta)}{\varphi(D_A)}\, \pi(x), \tag{6.2}$$
where $c(\eta)$ is given by (6.1). Summing (6.2) over all invertible residues $a \bmod D_A$ will give Proposition 16.
6.1.1. As in [8], we consider the sum
$$S_a(x) = \sum_{\substack{m \le x \\ (m, D_A) = 1}} \Lambda(m) \sum_{\substack{p \le x \\ p = a \bmod D_A \\ m \mid p - \chi(a)}} 1,$$
and more generally for $y_1 < y_2 \le x$, we set
$$S_a(y_1, y_2; x) = \sum_{\substack{y_1 < m \le y_2 \\ (m, D_A) = 1}} \Lambda(m) \sum_{\substack{p \le x \\ p = a \bmod D_A \\ m \mid p - \chi(a)}} 1.$$
This is the weighted sum over prime powers $m \in (y_1, y_2]$, coprime to $D_A$, of the number of primes $p \le x$, $p = a \bmod D_A$ with $m \mid p - \chi(p)$. If $(m, D_A) = 1$ then by the Chinese Remainder Theorem, there is a unique $a_m \bmod mD_A$ so that $a_m = \chi(a) \bmod m$, $a_m = a \bmod D_A$. Then we have
$$S_a(y_1, y_2; x) = \sum_{\substack{y_1 < m \le y_2 \\ (m, D_A) = 1}} \Lambda(m)\, \pi(x; mD_A, a_m).$$
6.1.2. Prime powers. Let us first see that the contribution of proper prime powers $m = q^k$, $k > 1$, to $S_a(y_1, y_2; x)$ is at most $O(x/\log x)$, which will allow us to ignore their contribution: Indeed, this contribution is bounded by
$$\sum_{\substack{q^k < x \\ k > 1}} \log q \cdot \pi(x; q^k D_A, a_{q^k}) \le \Bigl(\sum_{\substack{q^k < x^{3/4} \\ k > 1}} + \sum_{\substack{x^{3/4} \le q^k < x \\ k > 1}}\Bigr) \log q \cdot \pi(x; q^k D_A, a_{q^k}).$$
By Brun–Titchmarsh (A.1), if $q^k < x^{3/4}$ then $\pi(x; q^k D_A, a_{q^k}) \ll x/(q^k \log x)$, so that the sum over $q^k < x^{3/4}$ is bounded by
$$\sum_{q^k < x^{3/4}} \log q\, \frac{x}{q^k \log x} \ll \frac{x}{\log x},$$
since
$$\sum_{\substack{q \text{ prime} \\ k > 1}} \frac{\log q}{q^k} < \infty.$$
As for the sum over $x^{3/4} < q^k < x$, we use the trivial bound
$$\pi(x; q^k D_A, a_{q^k}) \ll \frac{x}{q^k D_A} < x^{1/4}$$
(which comes from counting integers in an arithmetic progression) plus the fact that the number of prime powers $q^k < x$ is $O(\log x/\log q)$. Since the primes contributing are no larger than $x^{1/2}$, we bound this sum by
$$\sum_{q < x^{1/2}} \log q\, \frac{\log x}{\log q}\, x^{1/4} \ll x^{3/4},$$
which is negligible.

6.1.3. A reduction. We reduce the study of $P_\eta(x; D_A, a)$ to that of $S_a(x^\eta, x; x)$:
$$\#P_\eta(x; D_A, a) = \sum_{\substack{x^\eta < q \le x \\ q \nmid D_A \text{ prime}}} \pi(x; qD_A, a_q) \ge \frac{1}{\log x} \sum_{\substack{x^\eta < q \le x \\ q \nmid D_A \text{ prime}}} \log q \cdot \pi(x; qD_A, a_q) = \frac{1}{\log x}\, S_a(x^\eta, x; x) + O\Bigl(\frac{x}{\log^2 x}\Bigr),$$
since the prime powers are negligible. (Also note that $q > x^{1/2}$ so that each $p$ is counted exactly once in the first sum.) Thus in order to prove (6.2), we need to show that for $\eta < 3/5$,
$$S_a(x^\eta, x; x) \gtrsim \frac{3 - 5\eta}{2(1 - \eta)}\, \frac{x}{\varphi(D_A)}. \tag{6.3}$$
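6.1.4. A division. We write
$$S_a(x) = S_a\Bigl(1, \frac{x^{1/2}}{\log^c x}; x\Bigr) + S_a\Bigl(\frac{x^{1/2}}{\log^c x}, x^\eta; x\Bigr) + S_a(x^\eta, x; x)$$
with $c > 1$ to be determined later. We will show
$$S_a(x) \sim \frac{x}{\varphi(D_A)}, \tag{6.4}$$
$$S_a\Bigl(1, \frac{x^{1/2}}{\log^c x}; x\Bigr) \sim \frac{1}{2}\, \frac{x}{\varphi(D_A)}, \tag{6.5}$$
$$S_a\Bigl(\frac{x^{1/2}}{\log^c x}, x^\eta; x\Bigr) \lesssim \frac{2\eta - 1}{1 - \eta}\, \frac{x}{\varphi(D_A)}, \tag{6.6}$$
which will give (6.3) and hence our proposition.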
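6.1.5. To show $S_a(x) \sim x/\varphi(D_A)$, we first write $S_a(x)$ as
$$\sum_{\substack{m \le x \\ (m, D_A) = 1}} \Lambda(m) \sum_{\substack{p \le x \\ p = a \bmod D_A \\ m \mid p - \chi(a)}} 1 = \Bigl(\sum_{m \le x} - \sum_{\substack{m \le x \\ (m, D_A) \neq 1}}\Bigr) \Lambda(m) \sum_{\substack{p \le x \\ p = a \bmod D_A \\ m \mid p - \chi(a)}} 1.$$
To evaluate the sum over all $m \le x$, we switch the order of summation and use the identity $\sum_{d \mid n} \Lambda(d) = \log n$ to get
$$\sum_{m \le x} \Lambda(m) \sum_{\substack{p \le x \\ p = a \bmod D_A \\ m \mid p - \chi(a)}} 1 = \sum_{\substack{p \le x \\ p = a \bmod D_A}} \; \sum_{m \mid p - \chi(a)} \Lambda(m) = \sum_{\substack{p \le x \\ p = a \bmod D_A}} \log(p - \chi(a)) \sim \frac{x}{\varphi(D_A)}.$$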
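The identity $\sum_{d \mid n} \Lambda(d) = \log n$ for the von Mangoldt function, used in the step above, can be confirmed numerically. The following is a small illustrative sketch; the function names are hypothetical.

```python
from math import isclose, log


def mangoldt(m):
    """Lambda(m) = log q if m = q^k for a prime q and k >= 1, else 0."""
    if m < 2:
        return 0.0
    q = 2
    while q * q <= m:
        if m % q == 0:
            while m % q == 0:
                m //= q
            # m was a power of q exactly when nothing is left over
            return log(q) if m == 1 else 0.0
        q += 1
    return log(m)   # no divisor up to sqrt(m): m itself is prime


def divisor_sum(n):
    """Sum of Lambda(d) over all divisors d of n."""
    return sum(mangoldt(d) for d in range(1, n + 1) if n % d == 0)


for n in (1, 12, 49, 360):
    print(n, divisor_sum(n), log(n))
```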
To estimate the sum over prime powers $m \le x$, with $\gcd(m, D_A) \neq 1$, note that since the sum is only over the powers of the primes $q$ dividing $D_A$, it suffices to treat each such prime separately. We will show that each contributes at most $O_q(x/\log x)$ and thus prove (6.4). Indeed, the contribution of such a prime is
$$\log q \sum_{\substack{k \ge 1 \\ q^k \le x}} \; \sum_{\substack{p \le x \\ p = a \bmod D_A \\ q^k \mid p - \chi(a)}} 1 \le \log q \sum_{q^k \le x} \; \sum_{\substack{p \le x \\ q^k \mid p - \chi(a)}} 1 \le \log q \sum_{q^k \le x} \pi(x; q^k, \pm 1).$$
The contributing exponents $k$ consist of those (“small” $k$'s) with $q^k \le x/e$ and at most two “large” values of $k$ for which $x/e < q^k \le x$. The contribution of the “large” exponents can be shown to be at most $O(1)$ by noting that $\pi(x; q^k, \pm 1)$ is at most the number of integers $n \le x$ congruent to $\pm 1$ modulo $q^k$, which is at most $x/q^k + 1 = O(1)$. For the “small” exponents ($k \ge 1$ such that $q^k \le x/e$), we use the Brun–Titchmarsh theorem (A.1) to bound
$$\pi(x; q^k, \pm 1) < \frac{2}{1 - q^{-1}}\, \frac{x/q^k}{\log x/q^k},$$
and so the sum over all $k \ge 1$ with $q^k \le x/e$ is at most
$$\log q\, \frac{2}{1 - q^{-1}} \sum_{q^k \le x/e} \frac{x/q^k}{\log x/q^k}.$$
In the range $q \le q^k \le x/e$, the function $k \mapsto \frac{x/q^k}{\log x/q^k}$ is decreasing and so the sum over $1 \le k \le \log(x/e)/\log q$ is bounded by the integral
$$\int_0^{\log(x/e)/\log q} \frac{x/q^k}{\log(x/q^k)}\, dk = \frac{1}{\log q} \int_e^x \frac{dt}{\log t} \ll \frac{1}{\log q}\, \frac{x}{\log x}.$$
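Thus the total contribution of these “small” $k$'s is at most $c_q x/\log x$. Summing over all prime divisors $q$ of $D_A$ gives (6.4).

6.1.6. To evaluate $S_a(1, \frac{x^{1/2}}{\log^c x}; x)$, we replace $\pi(x; mD_A, a_m)$ by $\mathrm{Li}(x)/\varphi(mD_A)$ and use the Bombieri–Vinogradov theorem to bound the error by
$$\sum_{m < x^{1/2}/\log^c x} \log m \max_{(b,m)=1} \Bigl|\pi(x; mD_A, b) - \frac{\mathrm{Li}(x)}{\varphi(mD_A)}\Bigr| \ll \log x\, \frac{x}{\log^2 x} \ll \frac{x}{\log x}$$
($c$ was chosen to give the exponent 2 on the RHS of (A.2)). The main term is evaluated by (note that $\varphi(mD_A) = \varphi(m)\varphi(D_A)$ if $m$ and $D_A$ are coprime)
$$\sum_{\substack{m < x^{1/2}/\log^c x \\ (m, D_A) = 1}} \Lambda(m)\, \frac{\mathrm{Li}(x)}{\varphi(mD_A)} = \frac{\mathrm{Li}(x)}{\varphi(D_A)} \sum_{\substack{m < x^{1/2}/\log^c x \\ (m, D_A) = 1}} \frac{\Lambda(m)}{\varphi(m)} = \frac{\mathrm{Li}(x)}{\varphi(D_A)} \Bigl(\sum_{m < x^{1/2}/\log^c x} \frac{\Lambda(m)}{\varphi(m)} + O(1)\Bigr)$$
$$\sim \frac{\mathrm{Li}(x)}{\varphi(D_A)} \log \frac{x^{1/2}}{\log^c x} \sim \frac{1}{2}\, \frac{x}{\varphi(D_A)}$$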
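as required to prove (6.5).

6.1.7. Finally we estimate $S_a(x^{1/2}/\log^c x, x^\eta; x)$. We will use the Brun–Titchmarsh inequality (A.1), which for $m < x^\eta$, $\eta < 3/5$ gives
$$\pi(x; mD_A, a_m) < \frac{2}{1 - \eta}\, \frac{x}{\varphi(D_A m) \log x}. \tag{6.7}$$
We now find using (6.7) that
$$S_a\Bigl(\frac{x^{1/2}}{\log^c x}, x^\eta; x\Bigr) < \frac{2}{1 - \eta}\, \frac{x}{\log x} \sum_{\substack{x^{1/2}/\log^c x < m \le x^\eta \\ (m, D_A) = 1}} \frac{\Lambda(m)}{\varphi(mD_A)}$$
$$= \frac{1}{\varphi(D_A)}\, \frac{2}{1 - \eta}\, \frac{x}{\log x} \Bigl(\log x^\eta - \log \frac{x^{1/2}}{\log^c x} + O(1)\Bigr) \sim \frac{2(\eta - 1/2)}{1 - \eta}\, \frac{x}{\varphi(D_A)},$$
which gives the required estimate (6.6).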
7. Large Order for Almost All Integers

In this section we will show that for a density one subsequence of the positive integers, the order of $A$ is large enough to give uniform distribution of all eigenfunctions of $U_N(A)$. We will show:

Theorem 17. There exist $\delta > 0$ and a density one subset $S$ of the integers such that for all $N \in S$ we have
$$\mathrm{ord}(A,N) \gg N^{1/2} \exp((\log N)^\delta).$$

Fix $1/2 < \eta < 3/5$. We say that a prime $p$ is good if $p \nmid D_A$ and $\mathrm{ord}(A,p) \ge p^\eta$. Let $P_G$ be the set of good primes, and let $P_G(x)$ be the set of primes in $P_G$ that do not exceed $x$. As shown in Theorem 14, there exists $\gamma = \gamma(\eta) > 0$ such that $P_G(x) \gtrsim \gamma \pi(x)$. If $p \mid D_A$ or $\mathrm{ord}(A,p) < p^\eta$ we call $p$ bad, and if $p \mid D_A$ or $\mathrm{ord}(A,p) < p^{1/2}/\log p$ we call $p$ terrible. As for good primes we let $P_B$ and $P_T$ denote the sets of bad, respectively terrible, primes (note that $P_T \subset P_B$), and we denote by $P_B(x)$ resp. $P_T(x)$ the number of primes less than $x$ in these sets. Since $P_B$ is the complement of $P_G$, which has lower density $\gamma$, we have
$$P_B(x) \lesssim (1 - \gamma)\pi(x). \tag{7.1}$$
As for the size of $P_T$, it is immediate from Lemma 15 that
$$P_T(x) = O\Bigl(\frac{x}{\log^2 x}\Bigr). \tag{7.2}$$
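Given an integer $N$ we write $N = N_G N_B$, where
$$N_G = \prod_{\substack{p_i^{a_i} \| N \\ p_i \in P_G}} p_i^{a_i}, \qquad N_B = \prod_{\substack{p_i^{a_i} \| N \\ p_i \in P_B}} p_i^{a_i}.$$
We also let $N_T \mid N_B$ be given by
$$N_T = \prod_{\substack{p_i^{a_i} \| N \\ p_i \in P_T}} p_i^{a_i}.$$
Define a set of integers $\mathcal{N}_G$ by $n \in \mathcal{N}_G$ if and only if all prime divisors of $n$ are good, and similarly for $\mathcal{N}_B$ and $\mathcal{N}_T$. As for primes we let $\mathcal{N}_G(x)$ (respectively $\mathcal{N}_B(x)$ and $\mathcal{N}_T(x)$) be the elements of $\mathcal{N}_G$ (respectively $\mathcal{N}_B$ and $\mathcal{N}_T$) not exceeding $x$.

Proposition 18. The number $\mathcal{N}_B(x)$ of integers $N \le x$ having all their prime factors in $P_B$ satisfies
$$\mathcal{N}_B(x) \ll \frac{x}{(\log x)^\gamma}.$$

Proof. Let $b_p = 1$ if $p \in P_B$ and let $b_p = 0$ if $p \in P_G$, and for composite integers $d$ put $b_d = \prod_{p \mid d} b_p$. Then $\mathcal{N}_B(x) = \sum_{n \le x} b_n$. Since $P_B(x) \le (1 - \gamma)\pi(x)$ the sieve of Eratosthenes gives that $\mathcal{N}_B(x) = o(x)$. Indeed,
$$\mathcal{N}_B(x) = \#\{n \le x : p \in P_G \Rightarrow p \nmid n\} = x \prod_{p \in P_G(z)} (1 - 1/p) + O(\exp(z)).$$
Putting $z = \log\log x$ and noting that $\lim_{z \to \infty} \prod_{p \in P_G(z)} (1 - 1/p) = 0$, since $\sum_{p \in P_G} 1/p = \infty$, we obtain $\mathcal{N}_B(x) = o(x)$.

Now following Wirsing [20], we consider the smoothed sum $\int_1^x \mathcal{N}_B(t)\, \frac{dt}{t}$. By partial summation we have
$$\int_1^x \mathcal{N}_B(t)\, \frac{dt}{t} = \mathcal{N}_B(x) \log x - \sum_{n \le x} b_n \log n. \tag{7.3}$$
Using the identity $\log n = \sum_{d \mid n} \Lambda(d)$ we obtain:
$$\sum_{n \le x} b_n \log n = \sum_{n \le x} b_n \sum_{d \mid n} \Lambda(d) = \sum_{d \le x} b_d \Lambda(d) \sum_{n \le x/d} b_n = \sum_{n \le x} b_n \sum_{d \le x/n} b_d \Lambda(d). \tag{7.4}$$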
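Now,
$$\sum_{d \le x/n} b_d \Lambda(d) = \sum_{p \in P_B(x/n)} \log p + O\Bigl(\Bigl(\frac{x}{n}\Bigr)^{1/2} \log(x/n)\Bigr) \ll \frac{x}{n} \tag{7.5}$$
by Chebyshev's bound on $\pi(x)$. Moreover, $\mathcal{N}_B(t) = o(t)$ implies that $\int_1^x \frac{\mathcal{N}_B(t)}{t}\, dt = o(x)$. Hence
$$\mathcal{N}_B(x) \log x \ll x \sum_{n \le x} \frac{b_n}{n} + o(x).$$
However,
$$\sum_{n \le x} \frac{b_n}{n} \le \prod_{p \in P_B(x)} (1 + 1/p + 1/p^2 + \dots) = \exp\Bigl(\sum_{p \in P_B(x)} (1/p + O(1/p^2))\Bigr) \ll \exp((1 - \gamma) \log\log x) = (\log x)^{1-\gamma}$$
and thus
$$\mathcal{N}_B(x) \ll \frac{x}{\log x} (\log x)^{1-\gamma} + o\Bigl(\frac{x}{\log x}\Bigr) \ll \frac{x}{(\log x)^\gamma}. \tag{7.6}$$
□

Corollary 19. We have
$$\#\{N \le x : N_G \le \exp((\log x)^{\gamma/2})\} \ll \frac{x}{(\log x)^{\gamma/2}}.$$

Proof. We may write $\#\{N \le x : N_G \le z\}$ as
$$\sum_{N_G \le z} \mathcal{N}_B\Bigl(\frac{x}{N_G}\Bigr),$$
and by Proposition 18 we may bound this sum by
$$\sum_{N_G \le z} \frac{x}{N_G (\log \frac{x}{N_G})^\gamma} \ll \frac{x}{(\log \frac{x}{z})^\gamma} \sum_{N_G \le z} \frac{1}{N_G} \ll \frac{x \log z}{(\log \frac{x}{z})^\gamma}. \tag{7.7}$$
Putting $z = \exp((\log x)^{\gamma/2})$ we obtain the desired conclusion. □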
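We will also need to estimate the number of integers $N$ with $N_T$ large:

Lemma 20. Let $\beta(z) = \sum_{N \in \mathcal{N}_T,\, N \ge z} 1/N$. Then:
(i) The number of integers $N \le x$ for which $N_T \ge z$ is at most $x\beta(z)$.
(ii) $\lim_{z \to \infty} \beta(z) = 0$.

Proof. (i) We have
$$\#\{N \le x : N_T \ge z\} \le \sum_{N_T \ge z} \frac{x}{N_T} = x\beta(z).$$
(ii) By (7.2),
$$\sum_{p \in P_T} 1/p < \infty,$$
and hence
$$\sum_{N \in \mathcal{N}_T} 1/N = \prod_{p \in P_T} (1 + 1/p + 1/p^2 + \dots) < \infty. \qquad □$$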
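Proof of Theorem 17. As in Sect. 5, write $N = ds^2$, where $d$ is square free, $d = d_0 \gcd(d, D_A)$, $D_A = 4(\mathrm{tr}(A)^2 - 4)$. By Proposition 11, for almost all $N \le x$ we have
$$\mathrm{ord}(A,N) \ge \frac{\prod_{p \mid d_0} \mathrm{ord}(A,p)}{\exp(3(\log\log x)^4)}.$$
Fix $1/2 < \eta < 3/5$. Write $d_0 = d_G d_B$, where $d_G$ is “good” and $d_B$ is “bad”. By definition, if $p$ is good then $\mathrm{ord}(A,p) > p^\eta$, hence
$$\prod_{p \mid d_G} \mathrm{ord}(A,p) \ge d_G^\eta.$$
Furthermore,
$$\prod_{p \mid \frac{d_B}{d_T}} \mathrm{ord}(A,p) \ge \prod_{p \mid \frac{d_B}{d_T}} \frac{p^{1/2}}{\log p} \ge \Bigl(\frac{d_B}{d_T}\Bigr)^{1/2} \frac{1}{(\log d_B)^{\omega(d_B)}}.$$
But trivially $\mathrm{ord}(A,p) \ge 1$ for $p \in P_T$, hence
$$\prod_{p \mid d_0} \mathrm{ord}(A,p) \ge d_G^\eta \Bigl(\frac{d_B}{d_T}\Bigr)^{1/2} \frac{1}{(\log d_B)^{\omega(d_B)}} = \frac{d_G^{\eta - 1/2}\, d^{1/2}}{d_T^{1/2} (\log d_B)^{\omega(d_B)}} = \frac{d_G^{\eta - 1/2}}{(d_T s^2)^{1/2} (\log d_B)^{\omega(d_B)}} \times N^{1/2}.$$
Now consider $N \le x$. By the previous results we may, without affecting the density (i.e. for all but $o(x)$), assume that the following holds:
$$d_T \le \log x \quad \text{(Lemma 20)}, \tag{7.8}$$
$$s \le \log x \quad \text{(Lemma 21)}, \tag{7.9}$$
$$\omega(d_B) \le \omega(N) \le 2 \log\log x \quad \text{(Lemma 22)}, \tag{7.10}$$
$$d_G \ge \exp((\log x)^{\gamma/2}) \quad \text{(Corollary 19)}. \tag{7.11}$$
We also use $\log d_B \le \log N \le \log x$. Hence
$$\prod_{p \mid d_0} \mathrm{ord}(A,p) \ge \frac{N^{1/2} \exp\bigl((\eta - 1/2)(\log x)^{\gamma/2}\bigr)}{(\log x)^{3/2 + 3/2 \log\log x}}.$$
Hence by Proposition 11,
$$\mathrm{ord}(A,N) \ge \frac{\prod_{p \mid d_0} \mathrm{ord}(A,p)}{\exp(3(\log\log x)^4)} \ge \frac{N^{1/2} \exp\bigl((\eta - 1/2)(\log x)^{\gamma/2}\bigr)}{\exp\bigl(3(\log\log x)^4 + (3/2 + 3/2 \log\log x) \log\log x\bigr)} \gg N^{1/2} \exp((\log N)^{\gamma/3}).$$
This concludes the proof of Theorem 17. □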
A. Background from Prime Number Theory

A.1. In this Appendix, we collect some facts which we will need in the rest of the paper. The first asserts that most integers have only small square factors:

Lemma 21. The number of integers $N \le x$ which have a square factor $s^2 \mid N$ with $s > \log N$ is $o(x)$.

Proof. If $N \in [x^{1/2}, x]$, then $\log N \ge \tfrac{1}{2} \log x$, and the number of $N \in [x^{1/2}, x]$ such that $s^2 \mid N$ for some $s > \log N$ is bounded by
$$\sum_{s \ge \frac{1}{2} \log x} \frac{x}{s^2} \ll \frac{x}{\log x}.$$
Hence the number of $N \le x$ for which $s^2 \mid N$ for some $s > \log N$ is $\ll \frac{x}{\log x} + x^{1/2} = o(x)$. □

A.2. We will need to know that most integers have few prime factors: Let $\omega(N)$ be the number of prime factors of $N$. As a consequence of the Hardy–Ramanujan theorem [10] (see [11], Theorem 431), we have:

Lemma 22. The set of $N$ such that $\omega(N) \ge \tfrac{3}{2} \log\log N$ has zero density.
A.3. We recall two important theorems: The first is the Brun–Titchmarsh inequality, which we will use in the following convenient form [16]: For all $1 \le k < x$, $(a,k) = 1$,
$$\pi(x; k, a) < \frac{2x}{\varphi(k) \log \frac{x}{k}}. \tag{A.1}$$
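Inequality (A.1) is explicit, so it can be checked directly for small parameters. The sketch below is illustrative only: the range $x = 10^4$, $k \le 50$ and the function names are assumptions made for the example. It counts primes in each reduced residue class with a sieve and compares the count with the right-hand side of (A.1).

```python
from math import gcd, log


def primes_up_to(x):
    """Sieve of Eratosthenes: all primes p <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            # strike out the multiples p*p, p*p + p, ... up to x
            sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
    return [p for p in range(2, x + 1) if sieve[p]]


def phi(k):
    """Euler's totient function, by direct counting."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)


def brun_titchmarsh_holds(x, k_max):
    """Check pi(x; k, a) < 2x / (phi(k) log(x/k)) for all k <= k_max, (a, k) = 1."""
    primes = primes_up_to(x)
    for k in range(1, k_max + 1):
        bound = 2 * x / (phi(k) * log(x / k))
        counts = {}
        for p in primes:
            counts[p % k] = counts.get(p % k, 0) + 1
        for a in range(k):
            if gcd(a, k) == 1 and not counts.get(a, 0) < bound:
                return False
    return True


print(brun_titchmarsh_holds(10_000, 50))
```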
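One consequence we will need is:

Lemma 23. Let $q \le x^{1/2}$. Then
$$\sum_{\substack{p \le x \\ p \equiv \pm 1 \bmod q}} \frac{1}{p} \ll \frac{\log\log x}{\varphi(q)}.$$

The second is the Bombieri–Vinogradov theorem [1] in the form: For every $A > 0$ there is some $B > 0$ so that
$$\sum_{k \le \frac{x^{1/2}}{(\log x)^B}} \max_{(a,k)=1} \Bigl|\pi(x; k, a) - \frac{\mathrm{Li}(x)}{\varphi(k)}\Bigr| \ll \frac{x}{(\log x)^A}. \tag{A.2}$$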
References

1. Bombieri, E.: On the large sieve. Mathematika 12, 201–225 (1965)
2. Bouzouina, A. and De Bièvre, S.: Equipartition of the eigenfunctions of quantized ergodic maps on the torus. Commun. Math. Phys. 178, 83–105 (1996)
3. Colin de Verdière, Y.: Ergodicité et fonctions propres du laplacien. Commun. Math. Phys. 102, 497–502 (1985)
4. Degli Esposti, M.: Quantization of the orientation preserving automorphisms of the torus. Ann. Inst. Poincaré 58, 323–341 (1993)
5. Degli Esposti, M., Graffi, S. and Isola, S.: Classical limit of the quantized hyperbolic toral automorphisms. Commun. Math. Phys. 167, 471–507 (1995)
6. Erdös, P. and Murty, R.: On the order of a (mod p). In: Number theory (Ottawa, ON, 1996), CRM Proc. Lecture Notes 19, Providence, RI: Am. Math. Soc., 1999, pp. 87–97
7. Folland, G.: Harmonic analysis in phase space. Annals of Mathematics Studies 122, Princeton, NJ: Princeton University Press, 1989
8. Goldfeld, M.: On the number of primes p for which p + a has a large prime factor. Mathematika 16, 23–27 (1969)
9. Hannay, J.H. and Berry, M.V.: Quantization of linear maps on a torus - Fresnel diffraction by a periodic grating. Physica D 1, 267–291 (1980)
10. Hardy, G.H. and Ramanujan: Quarterly J. of Math. 48, 76–92 (1917)
11. Hardy, G.H. and Wright, E.M.: An introduction to the theory of numbers. New York: The Clarendon Press, Oxford University Press, 1979
12. Hooley, C.: On Artin's conjecture. J. reine angew. Math. 225, 209–220 (1967)
13. Knabe, S.: On the quantisation of Arnold's cat. J. Phys. A: Math. Gen. 23, 2013–2025 (1990)
14. Kurlberg, P. and Rudnick, Z.: Hecke theory and equidistribution for the quantization of linear maps of the torus. Duke Math. J. 103, 47–78 (2000)
15. Marklof, J. and Rudnick, Z.: Quantum unique ergodicity for parabolic maps. Geom. and Func. Anal. 10, 1554–1578 (2000)
16. Montgomery, H.L. and Vaughan, R.C.: The large sieve. Mathematika 20, 119–134 (1973)
17. Murty, M.: Artin's conjecture for primitive roots. Math. Intelligencer 10, no. 4, 59–67 (1988)
18. Schnirelman, A.: Ergodic properties of eigenfunctions. Usp. Math. Nauk 29, 181–182 (1974)
19. Taussky, O.: Introduction into connections between algebraic number theory and integral matrices. Appendix to H. Cohn, A classical invitation to algebraic numbers and class fields, New York: Springer, 1978
20. Wirsing, E.: Das asymptotische Verhalten von Summen über multiplikative Funktionen. Math. Ann. 143, 75–102 (1961)
21. Zelditch, S.: Uniform distribution of eigenfunctions on compact hyperbolic surfaces. Duke Math. J. 55, 919–941 (1987)
22. Zelditch, S.: Quantum ergodicity of C∗-dynamical systems. Commun. Math. Phys. 177, 507–528 (1996)
23. Zelditch, S.: Index and dynamics of quantized contact transformations. Ann. Inst. Fourier (Grenoble) 47, 305–363 (1997)

Communicated by P. Sarnak
Commun. Math. Phys. 222, 229 – 247 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Quantum Teleportation with Entangled States Given by Beam Splittings Karl-Heinz Fichtner1 , Masanori Ohya2 1 Friedrich-Schiller-Universität Jena, Fakultät für Mathematik und Informatik, Institut für Angewandte
Mathematik, 07740 Jena, Germany. E-mail: [email protected]
2 Department of Information Sciences, Science University of Tokyo, Chiba 278-8510, Japan.
E-mail: [email protected] Received: 21 January 2000 / Accepted: 23 April 2001
Abstract: Quantum teleportation is rigorously demonstrated with coherent entangled states given by beam splittings. The mathematical scheme of beam splitting has been used to study quantum communication [2] and quantum stochastic [8]. We discuss the teleportation process by means of coherent states in this scheme for the following two cases: (1) Delete the vacuum part from coherent states, whose compensation provides a perfect teleportation from Alice to Bob. (2) Use fully realistic (physical) coherent states, which gives a non-perfect teleportation but shows that it is exact when the average energy (density) of the coherent vectors goes to infinity. We show that our quantum teleportation scheme with coherent entangled states is more stable than the previously discussed scheme with EPR pairs.

It is in [3] that quantum teleportation was first studied as a part of quantum cryptography [5]. Quantum teleportation is to send a quantum state itself, containing all information of a certain system, from one place to another. The problem of quantum teleportation is whether there exist a physical device and a key (or a set of keys) by which a quantum state is completely transmitted and a receiver can reconstruct the state sent. It has been considered that such a teleportation would not be realistic because the usual quantum state contains information which cannot be observed simultaneously. In the above paper [3], Bennett et al. showed that such a teleportation is possible through a device (channel) made from proper (EPR) entangled states of the Bell basis. The basic idea behind their discussion is to divide the information encoded in the state into two parts, classical and quantum, and send them through different channels, a classical channel and an EPR channel. The classical channel is nothing but a simple correspondence between sender and receiver, and the EPR channel is constructed by using a certain entangled state.
However the EPR channel is not so stable due to decoherence. In this paper (1) we study the quantum teleportation by means of general beam splitting processes so that it contains the EPR channel, and (2) we construct a more stable teleportation process with coherent entangled states.
230
K.-H. Fichtner, M. Ohya
The quantum teleportation scheme can be mathematically expressed in the following steps [11]:

Step 0. A girl named Alice has an unknown quantum state $\rho$ on an ($N$-dimensional) Hilbert space $H_1$ and she was asked to teleport it to a boy named Bob.

Step 1. For this purpose, we need two other Hilbert spaces $H_2$ and $H_3$; $H_2$ is attached to Alice and $H_3$ is attached to Bob. Prearrange a so-called entangled state $\sigma$ on $H_2 \otimes H_3$ having certain correlations and prepare an ensemble of the combined system in the state $\rho \otimes \sigma$ on $H_1 \otimes H_2 \otimes H_3$.

Step 2. One then fixes a family of mutually orthogonal projections $(F_{nm})_{n,m=1}^{N}$ on the Hilbert space $H_1 \otimes H_2$ corresponding to an observable
$$F := \sum_{n,m} z_{nm} F_{nm},$$
and for a fixed pair of indices $n, m$, Alice performs a first kind of incomplete measurement, involving only the $H_1 \otimes H_2$ part of the system in the state $\rho \otimes \sigma$, which filters the value $z_{nm}$, that is, after measurement on the given ensemble $\rho \otimes \sigma$ of identically prepared systems, only those where $F$ shows the value $z_{nm}$ are allowed to pass. According to the von Neumann rule, after Alice's measurement, the state becomes
$$\rho_{nm}^{(123)} := \frac{(F_{nm} \otimes 1)\, \rho \otimes \sigma\, (F_{nm} \otimes 1)}{\mathrm{tr}_{123}\, (F_{nm} \otimes 1)\, \rho \otimes \sigma\, (F_{nm} \otimes 1)},$$
where tr 123 is the full trace on the Hilbert space H1 ⊗ H2 ⊗ H3 . Step 3. Bob is informed which measurement was done by Alice. This is equivalent to transmitting the information that the eigenvalue znm was detected. This information is transmitted from Alice to Bob without disturbance and by means of classical tools. Step 4. Making only partial measurements on the third part on the system in the state (123) ρnm means that Bob will control a state nm (ρ) on H3 given by the partial (123) trace on H1 ⊗ H2 of the state ρnm (after Alice’s measurement) (123) nm (ρ) = tr 12 ρnm (Fnm ⊗ 1)ρ ⊗ σ (Fnm ⊗ 1) = tr 12 . tr 123 (Fnm ⊗ 1)ρ ⊗ σ (Fnm ⊗ 1)
Thus the whole teleportation scheme given by the family $(F_{nm})$ and the entangled state $\sigma$ can be characterized by the family $(\Lambda_{nm})$ of channels from the set of states on $\mathcal{H}_1$ into the set of states on $\mathcal{H}_3$ and the family $(p_{nm})$ given by
\[ p_{nm}(\rho) := \mathrm{tr}_{123}\,(F_{nm} \otimes 1)(\rho \otimes \sigma)(F_{nm} \otimes 1) \]
of the probabilities that Alice's measurement according to the observable $F$ will show the value $z_{nm}$. The teleportation scheme works perfectly with respect to a certain class $S$ of states $\rho$ on $\mathcal{H}_1$ if the following conditions are fulfilled:

(E1) For each $n, m$ there exists a unitary operator $v_{nm} : \mathcal{H}_1 \to \mathcal{H}_3$ such that
\[ \Lambda_{nm}(\rho) = v_{nm}\, \rho\, v_{nm}^{*} \quad (\rho \in S). \]
Teleportation and Entangled States
(E2)
\[ \sum_{n,m} p_{nm}(\rho) = 1 \quad (\rho \in S). \]
(E1) means that Bob can reconstruct the original state $\rho$ by unitary keys $\{v_{nm}\}$ provided to him. (E2) means that Bob will succeed in finding a proper key with certainty.

In [3, 4], the authors used an EPR spin pair to construct a teleportation model. In order to have a more convenient model, we here use coherent states. One of the main points for such a construction is how to prepare the entangled state. The EPR entangled state used in [3] can be identified with the splitting of a one-particle state, so that the teleportation model of Bennett et al. can be described in terms of Fock spaces and splittings, which makes it possible to treat the whole teleportation process within a general beam splitting scheme. Moreover, working with beams having a fixed number of particles does not seem realistic, especially in the case of a large distance between Alice and Bob, because we have to take into account that the beams will lose particles (or energy). For that reason one should use a class of beams that is insensitive to this loss of particles. That and other arguments lead to superpositions of coherent beams.

In Sect. 2 of this paper, we construct a teleportation model that is perfect in the sense of conditions (E1) and (E2), where we take the Boson Fock space $\Gamma(L^2(G)) := \mathcal{H}_1 = \mathcal{H}_2 = \mathcal{H}_3$ with a certain class of states $\rho$ on this Fock space. In Sect. 3 we consider a teleportation model where the entangled state $\sigma$ is given by the splitting of a superposition of certain coherent states. Unfortunately this model does not work perfectly, that is, neither (E2) nor (E1) holds. However, this model is more realistic than that in Sect. 2, and we show that it provides a good approximation to the perfect case. To estimate the difference between the perfect teleportation and non-perfect teleportation, we add a further step in the teleportation scheme:

Step 5. Bob will perform a measurement on his part of the system according to the projection
\[ F_+ := 1 - |\exp(0)\rangle\langle\exp(0)|, \]
where $|\exp(0)\rangle\langle\exp(0)|$ denotes the vacuum state (the coherent state with density 0).

Then our new teleportation channels (we denote them again by $\Lambda_{nm}$) have the form
\[ \Lambda_{nm}(\rho) := \mathrm{tr}_{12}\, \frac{(F_{nm} \otimes F_+)(\rho \otimes \sigma)(F_{nm} \otimes F_+)}{\mathrm{tr}_{123}\,(F_{nm} \otimes F_+)(\rho \otimes \sigma)(F_{nm} \otimes F_+)} \]
and the corresponding probabilities are
\[ p_{nm}(\rho) := \mathrm{tr}_{123}\,(F_{nm} \otimes F_+)(\rho \otimes \sigma)(F_{nm} \otimes F_+). \]
For this teleportation scheme, (E1) is fulfilled. Furthermore we get
\[ \sum_{n,m} p_{nm}(\rho) = \frac{\left(1 - e^{-d/2}\right)^2}{1 + (N-1)e^{-d}} \quad (\to 1 \ (d \to +\infty)). \]
Here N denotes the dimension of the Hilbert space and d is the expectation value of the total number of particles (or energy) of the beam, so that in the case of high density (or energy) “d → +∞” of the beam the model works perfectly. Specializing this model we consider in Sect. 4 the teleportation of all states on a finite dimensional Hilbert space (through the space Rk ). Further specialization leads to a teleportation model where Alice and Bob are spatially separated, that is, we have to teleport the information given by the state of our finite dimensional Hilbert space from one region X1 ⊆ Rk into another region X2 ⊆ Rk with X1 ∩ X2 = ∅, and Alice can only perform local measurements (inside of region X1 ) as well as Bob (inside of X2 ).
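The dependence of the total success probability on the density parameter $d$ can be tabulated directly from the closed formula $(1 - e^{-d/2})^2 / (1 + (N-1)e^{-d})$ stated above. The following short sketch only evaluates that formula; the function name `total_success_probability` and the sample values of $N$ and $d$ are illustrative choices.

```python
import numpy as np

def total_success_probability(N, d):
    """Sum over n, m of p_nm(rho): (1 - e^{-d/2})^2 / (1 + (N - 1) e^{-d})."""
    return (1.0 - np.exp(-d / 2.0)) ** 2 / (1.0 + (N - 1) * np.exp(-d))

# Illustrative values: the probability approaches 1 as d grows.
for d in (0.5, 2.0, 10.0, 50.0):
    print(d, total_success_probability(N=4, d=d))
```

For small $d$ the probability is far from 1, so the asymptotic perfection is a statement about beams of high density (or energy) only.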
1. Basic Notions and Notations

First we collect some basic facts concerning the (symmetric) Fock space. We will introduce the Fock space in a way adapted to the language of counting measures. For details we refer to [6–8, 2, 9] and other papers cited in [8]. Let $G$ be an arbitrary complete separable metric space. Further, let $\mu$ be a locally finite diffuse measure on $G$, i.e. $\mu(B) < +\infty$ for bounded measurable subsets $B$ of $G$ and $\mu(\{x\}) = 0$ for all singletons $x \in G$. In order to describe the teleportation of states on a finite dimensional Hilbert space through the $k$-dimensional space $\mathbb{R}^k$, we are especially concerned with the case
\[ G = \mathbb{R}^k \times \{1, \dots, N\}, \qquad \mu = l \times \#, \]
where $l$ is the $k$-dimensional Lebesgue measure and $\#$ denotes the counting measure on $\{1, \dots, N\}$.

Now by $M = M(G)$ we denote the set of all finite counting measures on $G$. Since $\varphi \in M$ can be written in the form $\varphi = \sum_{j=1}^{n} \delta_{x_j}$ for some $n = 0, 1, 2, \dots$ and $x_j \in G$ (where $\delta_x$ denotes the Dirac measure corresponding to $x \in G$), the elements of $M$ can be interpreted as finite (symmetric) point configurations in $G$. We equip $M$ with its canonical $\sigma$-algebra $W$ (cf. [6, 7]) and we consider the measure $F$ by setting
\[ F(Y) := \chi_Y(O) + \sum_{n \ge 1} \frac{1}{n!} \int_{G^n} \chi_Y\Big( \sum_{j=1}^{n} \delta_{x_j} \Big)\, \mu^n(d[x_1, \dots, x_n]) \quad (Y \in W). \]
Hereby, $\chi_Y$ denotes the indicator function of a set $Y$ and $O$ represents the empty configuration, i.e., $O(G) = 0$. Observe that $F$ is a $\sigma$-finite measure. Since $\mu$ was assumed to be diffuse one easily checks that $F$ is concentrated on the set of simple configurations (i.e., without multiple points)
\[ \hat{M} := \{ \varphi \in M \mid \varphi(\{x\}) \le 1 \text{ for all } x \in G \}. \]

Definition 1.1. $\mathcal{M} = \mathcal{M}(G) := L^2(M, W, F)$ is called the (symmetric) Fock space over $G$.
In [6] it was proved that $\mathcal{M}$ and the Boson Fock space $\Gamma(L^2(G))$ in the usual definition are isomorphic. For each $\Phi \in \mathcal{M}$ with $\Phi \ne 0$ we denote by $|\Phi\rangle$ the corresponding normalized vector
\[ |\Phi\rangle := \frac{\Phi}{\|\Phi\|}. \]
Further, $|\Phi\rangle\langle\Phi|$ denotes the corresponding one-dimensional projection, describing the pure state given by the normalized vector $|\Phi\rangle$. Now, for each $n \ge 1$ let $\mathcal{M}^{\otimes n}$ be the $n$-fold tensor product of the Hilbert space $\mathcal{M}$. Obviously, $\mathcal{M}^{\otimes n}$ can be identified with $L^2(M^n, F^n)$.

Definition 1.2. For a given function $g : G \to \mathbb{C}$ the function $\exp(g) : M \to \mathbb{C}$ defined by
\[ \exp(g)(\varphi) := \begin{cases} 1 & \text{if } \varphi = O, \\ \prod_{x \in G,\ \varphi(\{x\}) > 0} g(x) & \text{otherwise} \end{cases} \]
is called an exponential vector generated by $g$.

Observe that $\exp(g) \in \mathcal{M}$ if and only if $g \in L^2(G)$ and one has in this case
\[ \|\exp(g)\|^2 = e^{\|g\|^2} \quad \text{and} \quad |\exp(g)\rangle = e^{-\frac{1}{2}\|g\|^2} \exp(g). \]
The projection $|\exp(g)\rangle\langle\exp(g)|$ is called the coherent state corresponding to $g \in L^2(G)$. In the special case $g \equiv 0$ we get the vacuum state $|\exp(0)\rangle = \chi_{\{O\}}$. The linear span of the exponential vectors of $\mathcal{M}$ is dense in $\mathcal{M}$, so that bounded operators and certain unbounded operators can be characterized by their actions on exponential vectors.

Definition 1.3. The operator $D : \mathrm{dom}(D) \to \mathcal{M}^{\otimes 2}$ given on a dense domain $\mathrm{dom}(D) \subset \mathcal{M}$ containing the exponential vectors from $\mathcal{M}$ by
\[ D\psi(\varphi_1, \varphi_2) := \psi(\varphi_1 + \varphi_2) \quad (\psi \in \mathrm{dom}(D),\ \varphi_1, \varphi_2 \in M) \]
is called a compound Hida–Malliavin derivative.

On exponential vectors $\exp(g)$ with $g \in L^2(G)$, one gets immediately
\[ D \exp(g) = \exp(g) \otimes \exp(g). \tag{1} \]

Definition 1.4. The operator $S : \mathrm{dom}(S) \to \mathcal{M}$ given on a dense domain $\mathrm{dom}(S) \subset \mathcal{M}^{\otimes 2}$ containing tensor products of exponential vectors by
\[ S\Phi(\varphi) := \sum_{\tilde{\varphi} \le \varphi} \Phi(\tilde{\varphi}, \varphi - \tilde{\varphi}) \quad (\Phi \in \mathrm{dom}(S),\ \varphi \in M) \]
is called the compound Skorohod integral.
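The norm identity for exponential vectors in Definition 1.2 can be illustrated in a single-mode toy model. The sketch below is an assumption-laden simplification: it replaces $L^2(G)$ by the one-dimensional space $\mathbb{C}$, represents $\exp(\alpha)$ by its number-basis coefficients $\alpha^n/\sqrt{n!}$, and truncates at `nmax` particles. The names `exp_vector` and `coherent` are hypothetical helpers, not notation from the paper.

```python
import numpy as np
from math import factorial

def exp_vector(alpha, nmax=60):
    """Number-basis coefficients alpha^n / sqrt(n!) of exp(alpha), truncated at nmax."""
    return np.array([alpha ** n / np.sqrt(float(factorial(n))) for n in range(nmax)],
                    dtype=complex)

def coherent(alpha, nmax=60):
    """Normalized vector |exp(alpha)> = e^{-|alpha|^2 / 2} exp(alpha)."""
    return np.exp(-abs(alpha) ** 2 / 2) * exp_vector(alpha, nmax)

g, h = 0.7 + 0.3j, -0.4 + 1.1j
norm_sq = np.vdot(exp_vector(g), exp_vector(g)).real   # approximates e^{|g|^2}
overlap = np.vdot(exp_vector(g), exp_vector(h))        # approximates e^{conj(g) h}
vacuum = exp_vector(0.0)                               # (1, 0, 0, ...)
```

The same computation gives $\langle \exp(g), \exp(h) \rangle = e^{\langle g, h \rangle}$ in this toy model, which is the identity used repeatedly for overlaps of coherent vectors.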
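One gets
\[ \langle D\psi, \Phi \rangle_{\mathcal{M}^{\otimes 2}} = \langle \psi, S\Phi \rangle_{\mathcal{M}} \quad (\psi \in \mathrm{dom}(D),\ \Phi \in \mathrm{dom}(S)), \tag{2} \]
\[ S(\exp(g) \otimes \exp(h)) = \exp(g + h) \quad (g, h \in L^2(G)). \tag{3} \]
For more details we refer to [10].

Definition 1.5. Let $T$ be a linear operator on $L^2(G)$ with $\|T\| \le 1$. Then the operator $\Gamma(T)$, called the second quantization of $T$, is the (uniquely determined) bounded operator on $\mathcal{M}$ fulfilling
\[ \Gamma(T) \exp(g) = \exp(Tg) \quad (g \in L^2(G)). \]
Clearly, it holds
\[ \Gamma(T_1)\Gamma(T_2) = \Gamma(T_1 T_2), \qquad \Gamma(T^*) = \Gamma(T)^*. \tag{4} \]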
It follows that $\Gamma(T)$ is a unitary operator on $\mathcal{M}$ if $T$ is a unitary operator on $L^2(G)$.

Lemma 1.6. Let $K_1, K_2$ be linear operators on $L^2(G)$ with the property
\[ K_1^* K_1 + K_2^* K_2 = 1. \tag{5} \]
Then there exists exactly one isometry $\nu_{K_1,K_2}$ from $\mathcal{M}$ to $\mathcal{M}^{\otimes 2} = \mathcal{M} \otimes \mathcal{M}$ with
\[ \nu_{K_1,K_2} \exp(g) = \exp(K_1 g) \otimes \exp(K_2 g) \quad (g \in L^2(G)). \tag{6} \]
Further it holds
\[ \nu_{K_1,K_2} = (\Gamma(K_1) \otimes \Gamma(K_2)) D \tag{7} \]
(at least on $\mathrm{dom}(D)$, but one has the unique extension). The adjoint $\nu_{K_1,K_2}^*$ of $\nu_{K_1,K_2}$ is characterized by
\[ \nu_{K_1,K_2}^* (\exp(h) \otimes \exp(g)) = \exp(K_1^* h + K_2^* g) \quad (g, h \in L^2(G)) \tag{8} \]
and it holds
\[ \nu_{K_1,K_2}^* = S(\Gamma(K_1^*) \otimes \Gamma(K_2^*)). \tag{9} \]
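The isometry property in Lemma 1.6 can be checked on exponential vectors using only $\langle \exp(u), \exp(v) \rangle = e^{\langle u, v \rangle}$. The following is a minimal sketch under the assumption that $L^2(G)$ is replaced by $\mathbb{C}^n$; the pair $K_1, K_2$ is generated at random by splitting an isometry $\mathbb{C}^n \to \mathbb{C}^{2n}$ into two blocks, which is one convenient way (not the paper's) to satisfy condition (5).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # assumed finite dimension standing in for L^2(G)

# Random K1, K2 with K1* K1 + K2* K2 = 1: split an isometry C^n -> C^(2n) into blocks.
Z = rng.normal(size=(2 * n, n)) + 1j * rng.normal(size=(2 * n, n))
Q, _ = np.linalg.qr(Z)          # Q has orthonormal columns, so Q* Q = 1
K1, K2 = Q[:n, :], Q[n:, :]

def exp_inner(u, v):
    """<exp(u), exp(v)> = e^{<u, v>} for exponential vectors."""
    return np.exp(np.vdot(u, v))

g = rng.normal(size=n) + 1j * rng.normal(size=n)
h = rng.normal(size=n) + 1j * rng.normal(size=n)

# <nu exp(g), nu exp(h)> computed on the tensor product via (6)
lhs = exp_inner(K1 @ g, K1 @ h) * exp_inner(K2 @ g, K2 @ h)
# <exp(g), exp(h)>
rhs = exp_inner(g, h)
```

Since the exponential vectors span a dense subspace, agreement of `lhs` and `rhs` for arbitrary `g`, `h` is exactly what the isometry statement requires on that subspace.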
Remark 1.7. From $K_1, K_2$ we get a transition expectation $\xi_{K_1 K_2} : \mathcal{M} \otimes \mathcal{M} \to \mathcal{M}$, using $\nu_{K_1,K_2}^*$, and the lifting $\xi_{K_1 K_2}^*$ may be interpreted as a certain splitting (cf. [2]).

Proof of Lemma 1.6. We consider the operator
\[ B := S(\Gamma(K_1^*) \otimes \Gamma(K_2^*))(\Gamma(K_1) \otimes \Gamma(K_2)) D \]
on the dense domain $\mathrm{dom}(B) \subseteq \mathcal{M}$ spanned by the exponential vectors. Using (1), (3), (4) and (5) we get
\[ B \exp(g) = \exp(g) \quad (g \in L^2(G)). \]
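It follows that the unique bounded linear extension of $B$ onto $\mathcal{M}$ coincides with the identity on $\mathcal{M}$,
\[ B = 1. \tag{10} \]
On the other hand, by Eq. (7) an operator $\nu_{K_1,K_2}$ is defined at least on $\mathrm{dom}(D)$. Using (2) and (4) we obtain
\[ \|\nu_{K_1,K_2}\psi\|^2 = \langle \nu_{K_1,K_2}\psi, \nu_{K_1,K_2}\psi \rangle = \langle \psi, B\psi \rangle \quad (\psi \in \mathrm{dom}(D)), \]
which implies
\[ \|\nu_{K_1,K_2}\psi\|^2 = \|\psi\|^2 \quad (\psi \in \mathrm{dom}(D)) \]
because of (10). It follows that $\nu_{K_1,K_2}$ can be uniquely extended to a bounded operator on $\mathcal{M}$ with
\[ \|\nu_{K_1,K_2}\psi\| = \|\psi\| \quad (\psi \in \mathcal{M}). \]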
Now from (7) we obtain (6) using (1) and the definition of the operators of second quantization. Further, (7), (3) and (4) imply (9) and from (9) we obtain (8) using the definition of the operators of second quantization and Eq. (3).

Here we explain the fundamental scheme of beam splitting [8]. We define an isometric operator $V_{\alpha,\beta}$ for coherent vectors such that
\[ V_{\alpha,\beta} |\exp(g)\rangle = |\exp(\alpha g)\rangle \otimes |\exp(\beta g)\rangle \]
with $|\alpha|^2 + |\beta|^2 = 1$. This beam splitting is a useful mathematical expression for optical communication and quantum measurements [2].

Example 1.8 ($\alpha = \beta = 1/\sqrt{2}$ above). Let $K_1 = K_2$ be the following operator of multiplication on $L^2(G)$:
\[ K_1 g = \frac{1}{\sqrt{2}}\, g = K_2 g \quad (g \in L^2(G)). \]
We put $\nu := \nu_{K_1,K_2}$ and obtain
\[ \nu \exp(g) = \exp\Big( \frac{1}{\sqrt{2}}\, g \Big) \otimes \exp\Big( \frac{1}{\sqrt{2}}\, g \Big) \quad (g \in L^2(G)). \]
Example 1.9. Let $L^2(G) = \mathcal{H}_1 \oplus \mathcal{H}_2$ be the orthogonal sum of the subspaces $\mathcal{H}_1, \mathcal{H}_2$. $K_1$ and $K_2$ denote the corresponding projections.

We will use Example 1.8 in order to describe a teleportation model where Bob performs his experiments on the same ensemble of the systems as Alice. Further we will use a special case of Example 1.9 in order to describe a teleportation model where Bob and Alice are spatially separated (cf. Sect. 5).

Remark 1.10. The property (5) implies
\[ \|K_1 g\|^2 + \|K_2 g\|^2 = \|g\|^2 \quad (g \in L^2(G)). \tag{11} \]

Remark 1.11. Let $U, V$ be unitary operators on $L^2(G)$. If operators $K_1, K_2$ satisfy (5), then the pair $\hat{K}_1 = U K_1$, $\hat{K}_2 = V K_2$ fulfills (5).
2. A Perfect Model of Teleportation

Concerning the general idea we follow the papers [1, 11]. We fix an ONS $\{g_1, \dots, g_N\} \subseteq L^2(G)$, operators $K_1, K_2$ on $L^2(G)$ with (5), a unitary operator $T$ on $L^2(G)$, and $d > 0$. We assume
\[ T K_1 g_k = K_2 g_k \quad (k = 1, \dots, N), \tag{12} \]
\[ \langle K_1 g_k, K_1 g_j \rangle = 0 \quad (k \ne j;\ k, j = 1, \dots, N). \tag{13} \]
Using (11) and (12) we get
\[ \|K_1 g_k\|^2 = \|K_2 g_k\|^2 = \frac{1}{2}. \tag{14} \]
From (12) and (13) we get
\[ \langle K_2 g_k, K_2 g_j \rangle = 0 \quad (k \ne j;\ k, j = 1, \dots, N). \tag{15} \]
The state that Alice is asked to teleport is of the type
\[ \rho = \sum_{s=1}^{N} \lambda_s |\Phi_s\rangle\langle\Phi_s|, \tag{16} \]
where
\[ |\Phi_s\rangle = \sum_{j=1}^{N} c_{sj} |\exp(aK_1 g_j) - \exp(0)\rangle \qquad \Big( \sum_{j} |c_{sj}|^2 = 1;\ s = 1, \dots, N \Big) \tag{17} \]
and $a = \sqrt{d}$. One easily checks that $(|\exp(aK_1 g_j) - \exp(0)\rangle)_{j=1}^{N}$ and $(|\exp(aK_2 g_j) - \exp(0)\rangle)_{j=1}^{N}$ are ONS in $\mathcal{M}$. Here we have subtracted the vacuum state $\exp(0)$, but this is only for computational simplicity. In order to achieve that $(|\Phi_s\rangle)_{s=1}^{N}$ is still an ONS in $\mathcal{M}$ we assume
\[ \sum_{j=1}^{N} \bar{c}_{sj} c_{kj} = 0 \quad (s \ne k;\ s, k = 1, \dots, N). \tag{18} \]
Denote $c_s = [c_{s1}, \dots, c_{sN}] \in \mathbb{C}^N$, then $(c_s)_{s=1}^{N}$ is a CONS in $\mathbb{C}^N$.

Now let $(b_n)_{n=1}^{N}$ be a sequence in $\mathbb{C}^N$,
\[ b_n = [b_{n1}, \dots, b_{nN}], \]
with properties
\[ |b_{nk}| = 1 \quad (n, k = 1, \dots, N), \tag{19} \]
\[ \langle b_n, b_j \rangle = 0 \quad (n \ne j;\ n, j = 1, \dots, N). \tag{20} \]
Then Alice's measurements are performed with projection
\[ F_{nm} = |\xi_{nm}\rangle\langle\xi_{nm}| \quad (n, m = 1, \dots, N) \tag{21} \]
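given by
\[ |\xi_{nm}\rangle = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} b_{nj} |\exp(aK_1 g_j) - \exp(0)\rangle \otimes |\exp(aK_1 g_{j \oplus m}) - \exp(0)\rangle, \tag{22} \]
where $j \oplus m := j + m \pmod N$. One easily checks that $(|\xi_{nm}\rangle)_{n,m=1}^{N}$ is an ONS in $\mathcal{M}^{\otimes 2}$. Further, the state vector $|\xi\rangle$ of the entangled state $\sigma = |\xi\rangle\langle\xi|$ is given by
\[ |\xi\rangle = \frac{1}{\sqrt{N}} \sum_{k} |\exp(aK_1 g_k) - \exp(0)\rangle \otimes |\exp(aK_2 g_k) - \exp(0)\rangle. \tag{23} \]

Lemma 2.1. For each $n, m = 1, \dots, N$ it holds
\[ (F_{nm} \otimes 1)(|\Phi_s\rangle \otimes |\xi\rangle) = \frac{1}{N}\, |\xi_{nm}\rangle \otimes \sum_{j} \bar{b}_{nj} c_{sj} |\exp(aK_2 g_{j \oplus m}) - \exp(0)\rangle \quad (s = 1, \dots, N). \tag{24} \]

Proof. From the fact that
\[ |\gamma_j\rangle := |\exp(aK_1 g_j) - \exp(0)\rangle \quad (j = 1, \dots, N) \tag{25} \]
is an ONS, it follows
\[ \langle \gamma_r \otimes \gamma_{r \oplus m}, \gamma_j \otimes \gamma_k \rangle = \begin{cases} 1 & \text{if } r = j \text{ and } k = r \oplus m, \\ 0 & \text{otherwise.} \end{cases} \tag{26} \]
On the other hand, we have
\[ (F_{nm} \otimes 1)(|\Phi_s\rangle \otimes |\xi\rangle) = \frac{1}{N} \sum_{j} \sum_{r} \sum_{k} c_{sj} \bar{b}_{nr} \langle \gamma_r \otimes \gamma_{r \oplus m}, \gamma_j \otimes \gamma_k \rangle\, \xi_{nm} \otimes |\exp(aK_2 g_k) - \exp(0)\rangle. \tag{27} \]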
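Using (26) and (27), we get (24).

Now we have
\[ \rho \otimes \sigma = \sum_{s=1}^{N} \lambda_s |\Phi_s\rangle\langle\Phi_s| \otimes |\xi\rangle\langle\xi| = \sum_{s=1}^{N} \lambda_s |\Phi_s \otimes \xi\rangle\langle\Phi_s \otimes \xi|, \tag{28} \]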
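which implies
\[ (F_{nm} \otimes 1)(\rho \otimes \sigma)(F_{nm} \otimes 1) = \sum_{s=1}^{N} \lambda_s (F_{nm} \otimes 1)|\Phi_s \otimes \xi\rangle\langle\Phi_s \otimes \xi|(F_{nm} \otimes 1) \]
\[ = \sum_{s=1}^{N} \lambda_s \|(F_{nm} \otimes 1)(\Phi_s \otimes \xi)\|^2\, |(F_{nm} \otimes 1)(\Phi_s \otimes \xi)\rangle\langle(F_{nm} \otimes 1)(\Phi_s \otimes \xi)|. \tag{29} \]
Note $|\Phi_s \otimes \xi\rangle = |\Phi_s\rangle \otimes |\xi\rangle$. From (12) it follows that
\[ \sum_{j} \bar{b}_{nj} c_{sj} |\exp(aK_2 g_{j \oplus m}) - \exp(0)\rangle = \Gamma(T) \sum_{j} \bar{b}_{nj} c_{sj} |\exp(aK_1 g_{j \oplus m}) - \exp(0)\rangle. \tag{30} \]
Further, for each $m, n$ $(= 1, \dots, N)$, we have unitary operators $U_m, B_n$ on $\mathcal{M}$ given by
\[ B_n |\exp(aK_1 g_j) - \exp(0)\rangle = b_{nj} |\exp(aK_1 g_j) - \exp(0)\rangle \quad (j = 1, \dots, N), \tag{31} \]
\[ U_m |\exp(aK_1 g_j) - \exp(0)\rangle = |\exp(aK_1 g_{j \oplus m}) - \exp(0)\rangle \quad (j = 1, \dots, N). \tag{32} \]
Therefore we get
\[ \sum_{j} \bar{b}_{nj} c_{sj} |\exp(aK_1 g_{j \oplus m}) - \exp(0)\rangle = U_m B_n^* |\Phi_s\rangle. \tag{33} \]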
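From (30), (33) and Lemma 2.1 we obtain
\[ (F_{nm} \otimes 1)(|\Phi_s\rangle \otimes |\xi\rangle) = \frac{1}{N}\, |\xi_{nm}\rangle \otimes \Gamma(T) U_m B_n^* |\Phi_s\rangle. \tag{34} \]
It follows
\[ \|(F_{nm} \otimes 1)(|\Phi_s\rangle \otimes |\xi\rangle)\|^2 = \frac{1}{N^2}. \tag{35} \]
Finally from (29), (34) and (35) we have
\[ (F_{nm} \otimes 1)(\rho \otimes \sigma)(F_{nm} \otimes 1) = \frac{1}{N^2}\, F_{nm} \otimes \Gamma(T) U_m B_n^*\, \rho\, B_n U_m^* \Gamma(T^*). \tag{36} \]
That leads to the following solution of the teleportation problem.

Theorem 2.2. For each $n, m = 1, \dots, N$, define a channel $\Lambda_{nm}$ by
\[ \Lambda_{nm}(\rho) := \mathrm{tr}_{12}\, \frac{(F_{nm} \otimes 1)(\rho \otimes \sigma)(F_{nm} \otimes 1)}{\mathrm{tr}_{123}\,(F_{nm} \otimes 1)(\rho \otimes \sigma)(F_{nm} \otimes 1)} \quad (\rho \text{ normal state on } \mathcal{M}). \tag{37} \]
Then we have for all states $\rho$ on $\mathcal{M}$ with (16) and (17)
\[ \Lambda_{nm}(\rho) = \big( \Gamma(T) U_m B_n^* \big)\, \rho\, \big( \Gamma(T) U_m B_n^* \big)^*. \tag{38} \]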
Remark 2.3. In case of Example 1.8 using the operators $B_n$, $U_m$, $\Gamma(T)$, the projections $F_{nm}$ are given by unitary transformations of the entangled state $\sigma$:
\[ F_{nm} = \big( B_n \otimes U_m \Gamma(T^*) \big)\, \sigma\, \big( B_n \otimes U_m \Gamma(T^*) \big)^*, \quad \text{or} \quad |\xi_{nm}\rangle = \big( B_n \otimes U_m \Gamma(T^*) \big) |\xi\rangle. \tag{39} \]

Remark 2.4. If Alice performs a measurement according to the following selfadjoint operator:
\[ F = \sum_{n,m=1}^{N} z_{nm} F_{nm} \]
with $\{ z_{nm} \mid n, m = 1, \dots, N \} \subseteq \mathbb{R} \setminus \{0\}$, then she will obtain the value $z_{nm}$ with probability $1/N^2$. The sum over all these probabilities is 1, so that the teleportation model works perfectly.

3. A Non-Perfect Case of Teleportation

In this section we will construct a model where we also have channels with property (38). But the probability that one of these channels will work in order to teleport the state from Alice to Bob is less than 1, depending on the density parameter $d$ (or energy of the beams, depending on the interpretation). If $d = a^2$ tends to infinity, that probability tends to 1. That is, the model is asymptotically perfect in a certain sense. We consider the normalized vector
\[ |\eta\rangle := \frac{\gamma}{\sqrt{N}} \sum_{k=1}^{N} |\exp(a g_k)\rangle, \qquad \gamma := \left( \frac{1}{1 + (N-1)e^{-d}} \right)^{\frac{1}{2}} = \left( \frac{1}{1 + (N-1)e^{-a^2}} \right)^{\frac{1}{2}}, \tag{40} \]
and we replace in (37) the projector $\sigma$ by the projector $\tilde{\sigma} := |\tilde{\xi}\rangle\langle\tilde{\xi}|$,
\[ \tilde{\xi} := \nu_{K_1,K_2}(\eta) = \frac{\gamma}{\sqrt{N}} \sum_{k=1}^{N} |\exp(aK_1 g_k)\rangle \otimes |\exp(aK_2 g_k)\rangle. \tag{41} \]
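The normalization constant $\gamma$ in (40) follows from the overlaps of the coherent vectors $|\exp(a g_k)\rangle$, which equal $e^{-d}$ for $k \ne j$ when the $g_k$ are orthonormal. A small numerical check, assuming the ONS is realized by the standard basis of $\mathbb{C}^N$ and using illustrative values of $N$ and $d$:

```python
import numpy as np

def coherent_overlap(u, v):
    """<exp(u)|exp(v)> for normalized coherent vectors."""
    return np.exp(-np.vdot(u, u).real / 2 - np.vdot(v, v).real / 2 + np.vdot(u, v))

N, d = 5, 1.3                      # illustrative values
a = np.sqrt(d)
g = np.eye(N)                      # assumed ONS g_1, ..., g_N realized in C^N

# Gram matrix of the vectors |exp(a g_k)>
gram = np.array([[coherent_overlap(a * g[k], a * g[j]) for j in range(N)]
                 for k in range(N)])
gamma = (1.0 / (1.0 + (N - 1) * np.exp(-d))) ** 0.5

# ||eta||^2 = (gamma^2 / N) * sum_{k,j} <exp(a g_k)|exp(a g_j)>
eta_norm_sq = (gamma ** 2 / N) * gram.sum().real
```

The sum of the Gram matrix is $N + N(N-1)e^{-d}$, so $\gamma^2/N$ times this sum is 1, as the word "normalized" in the text requires.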
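Then for each $n, m = 1, \dots, N$, we get the following channels on a normal state $\rho$ on $\mathcal{M}$:
\[ \tilde{\Lambda}_{nm}(\rho) := \mathrm{tr}_{12}\, \frac{(F_{nm} \otimes 1)(\rho \otimes \tilde{\sigma})(F_{nm} \otimes 1)}{\mathrm{tr}_{123}\,(F_{nm} \otimes 1)(\rho \otimes \tilde{\sigma})(F_{nm} \otimes 1)}, \tag{42} \]
\[ \Theta_{nm}(\rho) := \mathrm{tr}_{12}\, \frac{(F_{nm} \otimes F_+)(\rho \otimes \tilde{\sigma})(F_{nm} \otimes F_+)}{\mathrm{tr}_{123}\,(F_{nm} \otimes F_+)(\rho \otimes \tilde{\sigma})(F_{nm} \otimes F_+)}, \tag{43} \]
where $F_+ = 1 - |\exp(0)\rangle\langle\exp(0)|$, i.e., $F_+$ is the projection onto the space $\mathcal{M}_+$ of configurations having no vacuum part, that is, orthogonal to the vacuum:
\[ \mathcal{M}_+ := \{ \psi \in \mathcal{M} \mid \| |\exp(0)\rangle\langle\exp(0)| \psi \| = 0 \}. \]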
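One easily checks that
\[ \Theta_{nm}(\rho) = \frac{F_+ \tilde{\Lambda}_{nm}(\rho) F_+}{\mathrm{tr}\,\big( F_+ \tilde{\Lambda}_{nm}(\rho) F_+ \big)}, \tag{44} \]
that is, after receiving the state $\tilde{\Lambda}_{nm}(\rho)$ from Alice, Bob has to omit the vacuum. From Theorem 2.2 it follows that for all $\rho$ with (16) and (17),
\[ \Lambda_{nm}(\rho) = \frac{F_+ \Lambda_{nm}(\rho) F_+}{\mathrm{tr}\,( F_+ \Lambda_{nm}(\rho) F_+ )}. \]
This is not true if we replace $\Lambda_{nm}$ by $\tilde{\Lambda}_{nm}$; namely, in general it does not hold that $\Theta_{nm}(\rho) = \tilde{\Lambda}_{nm}(\rho)$. But we will prove that for each $\rho$ with (16) and (17) it holds $\Theta_{nm}(\rho) = \Lambda_{nm}(\rho)$, which means
\[ \Theta_{nm}(\rho) = \big( \Gamma(T) U_m B_n^* \big)\, \rho\, \big( \Gamma(T) U_m B_n^* \big)^* \tag{45} \]
because of Theorem 2.2. Further we will show
\[ \mathrm{tr}_{123}\,(F_{nm} \otimes F_+)(\rho \otimes \tilde{\sigma})(F_{nm} \otimes F_+) = \frac{\gamma^2}{N^2} \big( e^{d/2} - 1 \big)^2 e^{-d} \tag{46} \]
and the sum over $n, m$ $(= 1, \dots, N)$ gives the probability
\[ \frac{\left( 1 - e^{-d/2} \right)^2}{1 + (N-1)e^{-d}} \longrightarrow 1 \quad (d \longrightarrow \infty), \]
which means that the teleportation model works perfectly in the limit $d \to \infty$, i.e., Bob will receive one of the states $\Theta_{nm}(\rho)$ given by (44). Thus we formulate the following theorem.

Theorem 3.1. For all states $\rho$ on $\mathcal{M}$ with (16) and (17) and each pair $n, m$ $(= 1, \dots, N)$, Eqs. (44) and (45) hold. Further, we have
\[ \sum_{n,m} \mathrm{tr}_{123}\,(F_{nm} \otimes F_+)(\rho \otimes \tilde{\sigma})(F_{nm} \otimes F_+) = \frac{\left( 1 - e^{-d/2} \right)^2}{1 + (N-1)e^{-d}}. \tag{47} \]

In order to prove Theorem 3.1, we fix $\rho$ with (16) and (17) and start with a lemma.

Lemma 3.2. For each $n, m, s$ $(= 1, \dots, N)$, it holds
\[ (F_{nm} \otimes 1)\big( |\Phi_s\rangle \otimes |\tilde{\xi}\rangle \big) = \frac{\gamma}{N} \big( 1 - e^{-d/2} \big)\, |\xi_{nm}\rangle \otimes \Gamma(T) U_m B_n^* |\Phi_s\rangle + \frac{\gamma}{N} \left( \frac{e^{d/2} - 1}{e^{d}} \right)^{\frac{1}{2}} \langle b_n, c_s \rangle_{\mathbb{C}^N}\, \xi_{nm} \otimes |\exp(0)\rangle. \]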
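Proof. For all $k, j, r = 1, \dots, N$, we get
\[ \alpha_{k,j,r} := \big\langle |\exp(aK_1 g_r) - \exp(0)\rangle \otimes |\exp(aK_1 g_{r \oplus m}) - \exp(0)\rangle,\ |\exp(aK_1 g_j) - \exp(0)\rangle \otimes |\exp(aK_1 g_k)\rangle \big\rangle = \begin{cases} \left( \dfrac{e^{a^2/2} - 1}{e^{a^2/2}} \right)^{\frac{1}{2}} & \text{if } r = j \text{ and } k = r \oplus m, \\ 0 & \text{otherwise} \end{cases} \]
and
\[ |\exp(aK_2 g_{j \oplus m})\rangle = e^{-a^2/4} \big( e^{a^2/2} - 1 \big)^{\frac{1}{2}} |\exp(aK_2 g_{j \oplus m}) - \exp(0)\rangle + e^{-a^2/4} |\exp(0)\rangle. \]
On the other hand, we have
\[ (F_{nm} \otimes 1)\big( |\Phi_s\rangle \otimes |\tilde{\xi}\rangle \big) = \frac{\gamma}{N} \sum_{j} \sum_{r} \sum_{k} c_{sj} \bar{b}_{nr}\, \alpha_{k,j,r}\, \xi_{nm} \otimes |\exp(aK_2 g_k)\rangle. \]
It follows with $a^2 = d$,
\[ (F_{nm} \otimes 1)\big( \Phi_s \otimes \tilde{\xi} \big) = \frac{\gamma}{N} \big( e^{d/2} - 1 \big) e^{-d/2}\, \xi_{nm} \otimes \sum_{j} c_{sj} \bar{b}_{nj} |\exp(aK_2 g_{j \oplus m}) - \exp(0)\rangle + \frac{\gamma}{N} \big( e^{d/2} - 1 \big)^{\frac{1}{2}} e^{-d/2} \sum_{j} c_{sj} \bar{b}_{nj}\, \xi_{nm} \otimes |\exp(0)\rangle \]
\[ = \frac{\gamma}{N} \big( 1 - e^{-d/2} \big)\, \xi_{nm} \otimes \Gamma(T) U_m B_n^* \Phi_s + \frac{\gamma}{N} \left( \frac{e^{d/2} - 1}{e^{d}} \right)^{\frac{1}{2}} \langle b_n, c_s \rangle_{\mathbb{C}^N}\, \xi_{nm} \otimes |\exp(0)\rangle. \]

If $\rho$ is a pure state $\rho = |\Phi_s\rangle\langle\Phi_s|$, then we obtain from Lemma 3.2,
\[ \mathrm{tr}_{123}\,(F_{nm} \otimes 1)(\rho \otimes \tilde{\sigma})(F_{nm} \otimes 1) = \frac{\gamma^2}{N^2} \left( \big( 1 - e^{-d/2} \big)^2 + \frac{e^{d/2} - 1}{e^{d}}\, |\langle b_n, c_s \rangle|^2 \right) = \frac{1}{N^2}\, \frac{\big( 1 - e^{-d/2} \big)^2 + \dfrac{e^{d/2} - 1}{e^{d}}\, |\langle b_n, c_s \rangle|^2}{1 + (N-1)e^{-d}} \]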
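and
\[ \tilde{\Lambda}_{nm}(\rho) \ne \big( \Gamma(T) U_m B_n^* \big)\, \rho\, \big( \Gamma(T) U_m B_n^* \big)^*. \]
Now we have
\[ \Gamma(T) U_m B_n^* \Phi_s \in \mathcal{M}_+, \qquad |\exp(0)\rangle \in \mathcal{M}_+^{\perp}. \]
Hence, Lemma 3.2 implies
\[ (1 \otimes 1 \otimes F_+)(F_{nm} \otimes 1)\big( \Phi_s \otimes \tilde{\xi} \big) = \frac{\gamma}{N} \big( 1 - e^{-d/2} \big)\, \xi_{nm} \otimes \Gamma(T) U_m B_n^* \Phi_s, \]
that is, we have the following lemma.

Lemma 3.3. For each $n, m, s = 1, \dots, N$, it holds
\[ (F_{nm} \otimes F_+)\big( \Phi_s \otimes \tilde{\xi} \big) = \frac{\gamma}{N} \big( 1 - e^{-d/2} \big)\, \xi_{nm} \otimes \Gamma(T) U_m B_n^* \Phi_s. \tag{48} \]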
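The total probability (47) of Theorem 3.1 can be checked numerically in a finite-dimensional model. The sketch below rests on several assumptions that are not part of the paper: every vector involved lies in the span of the vacuum and the $N$ orthonormal vectors $|\exp(aK_r g_j) - \exp(0)\rangle$, so each tensor factor is modelled by $\mathbb{C}^{N+1}$; $\Gamma(T)$ is identified with the map sending the $j$-th basis vector of the second factor to the $j$-th basis vector of the third; the phases $b_{nj}$ are taken to be Fourier phases; indices run from 0 to $N-1$. The helper names are illustrative.

```python
import numpy as np

N, d = 3, 2.0                               # illustrative values
D = N + 1                                   # index 0: vacuum, index j + 1: j-th ONS vector
s, t = np.sqrt(1 - np.exp(-d / 2)), np.exp(-d / 4)
gamma = 1 / np.sqrt(1 + (N - 1) * np.exp(-d))

def e(j):
    """Basis vector standing for |exp(a K_r g_j) - exp(0)>, j = 0..N-1."""
    v = np.zeros(D, dtype=complex)
    v[j + 1] = 1
    return v

vac = np.zeros(D, dtype=complex)
vac[0] = 1

def coh(j):
    """|exp(a K_r g_j)> decomposed into its non-vacuum and vacuum parts."""
    return s * e(j) + t * vac

# Assumed Fourier phases: |b_nj| = 1 and orthogonal rows.
b = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)

def xi(n, m):
    """Vector xi_nm of (22) in the model basis."""
    return sum(b[n, j] * np.kron(e(j), e((j + m) % N)) for j in range(N)) / np.sqrt(N)

xi_tilde = gamma / np.sqrt(N) * sum(np.kron(coh(k), coh(k)) for k in range(N))
F_plus = np.eye(D) - np.outer(vac, vac.conj())

c = np.array([0.5, 0.5j, -np.sqrt(0.5)])    # coefficients c_sj of one pure input state
phi = sum(c[j] * e(j) for j in range(N))
psi = np.kron(phi, xi_tilde)                # |Phi_s> (x) |xi~>

total, aligned = 0.0, []
for n in range(N):
    for m in range(N):
        x = xi(n, m)
        out = np.kron(np.outer(x, x.conj()), F_plus) @ psi   # (F_nm (x) F_+) psi
        p = np.vdot(out, out).real
        total += p
        bob = x.conj() @ out.reshape(D * D, D)               # Bob's unnormalized vector
        target = sum(np.conj(b[n, j]) * c[j] * e((j + m) % N) for j in range(N))
        aligned.append(np.isclose(abs(np.vdot(target, bob)) ** 2, p))

expected = (1 - np.exp(-d / 2)) ** 2 / (1 + (N - 1) * np.exp(-d))
```

In this model `total` reproduces the right-hand side of (47), and each entry of `aligned` records that Bob's vector is parallel to the vector obtained from the input by the key of the corresponding outcome, in agreement with (45) for a pure input state.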
Remark 3.4. Let $K_2$ be a projection of the type
\[ K_2 h = h \chi_X \quad (h \in L^2(G)), \]
where $X \subseteq G$ is measurable. Then (48) also holds if we replace $F_+$ by the projection $F_{+,X}$ onto the subspace $\mathcal{M}_{+,X}$ of $\mathcal{M}$ given by
\[ \mathcal{M}_{+,X} := \{ \psi \in \mathcal{M} \mid \psi(\varphi) = 0 \text{ if } \varphi(X) = 0 \}. \]
Observe that $\mathcal{M}_{+,G} = \mathcal{M}_+$.

Proof of Theorem 3.1. We have assumed that $(|\Phi_s\rangle)_{s=1}^{N}$ is an ONS in $\mathcal{M}$, which implies that $\big( |\xi_{nm}\rangle \otimes \Gamma(T) U_m B_n^* |\Phi_s\rangle \big)_{s=1}^{N}$ is an ONS in $\mathcal{M}^{\otimes 3}$. Hence we obtain Eqs. (45), (46) and (47) by Lemma 3.3. This proves Theorem 3.1.

Remark 3.5. In the special case of Remark 3.4, Eqs. (45), (46) and (47) hold if we replace $F_+$ by $F_{+,X}$ in the definition of the channel $\Theta_{nm}$ and in (46), (47), that is, Bob will only perform a "local" measurement according to the region $X$, which we discuss in more detail in the next sections.

4. Teleportation of States Inside $\mathbb{R}^k$

Let $H$ be a finite-dimensional Hilbert space. We consider the case $H = \mathbb{C}^N = L^2(\{1, \dots, N\}, \#)$ without loss of generality, where $\#$ denotes the counting measure on the set $\{1, \dots, N\}$. We want to teleport states on $H$ with the aid of the constructed channels $(\Lambda_{nm})_{n,m=1}^{N}$ or $(\Theta_{nm})_{n,m=1}^{N}$. We fix
– a CONS $(|j\rangle)_{j=1}^{N}$ of $H$,
– $f \in L^2(\mathbb{R}^k)$, $\|f\| = 1$,
– $d = a^2 > 0$,
– $\hat{K}_1, \hat{K}_2$ linear operators on $L^2(\mathbb{R}^k)$,
– $\hat{T}$ unitary operator on $L^2(\mathbb{R}^k)$
with the two properties
\[ \hat{K}_1^* \hat{K}_1 f + \hat{K}_2^* \hat{K}_2 f = f, \tag{49} \]
\[ \hat{T} \hat{K}_1 f = \hat{K}_2 f. \tag{50} \]
We put
\[ G = \mathbb{R}^k \times \{1, \dots, N\}, \qquad \mu = l \times \#, \]
where $l$ is the Lebesgue measure on $\mathbb{R}^k$. Then $L^2(G) = L^2(G, \mu) = L^2(\mathbb{R}^k) \otimes H$. Further, put
\[ g_j := f \otimes |j\rangle \quad (j = 1, \dots, N). \]
Then $(g_j)_{j=1}^{N}$ is an ONS in $L^2(G)$. We consider linear operators $K_1, K_2$ on $L^2(G)$ with (5) and
\[ K_r g_j = \hat{K}_r f \otimes |j\rangle \quad (j = 1, \dots, N;\ r = 1, 2). \tag{51} \]

Remark 4.1. Equation (51) determines operators $K_1, K_2$ on the subspace of $L^2(G)$ spanned by the ONS $(g_j)_{j=1}^{N}$. On the orthogonal complement, one can put for instance
\[ K_r \psi = \frac{1}{\sqrt{2}}\, \psi. \]
Then $K_1, K_2$ are well defined and fulfill (5) because of (49). Further, one checks that (13) and (15) hold.

Now let $T$ be a unitary operator on $L^2(G)$ with
\[ T(K_1 g_j) = \hat{T} \hat{K}_1 f \otimes |j\rangle. \]
From (13) one can prove the existence of $T$ using the arguments as in Remark 4.1. Further, we get (12) from (50). Summarizing, we obtain that $\{g_1, \dots, g_N\}$, $K_1$, $K_2$, $T$ fulfill all the assumptions required in Sect. 2. Thus we have the corresponding channels $\Lambda_{nm}$, $\Theta_{nm}$ given by (37) and (43) respectively. It follows that we are able to teleport a state $\rho$ on $\mathcal{M} = \mathcal{M}(G)$ with (16) and (17) as stated in Theorem 2.2 and Theorem 3.1, respectively.

In order to teleport states on $H$ through the space $\mathbb{R}^k$ using the above channels, we have to consider:

First: a "lifting" $E^*$ of the states $\hat{\rho}$ on $H$ into the set of states on the bigger state space on $\mathcal{M}$ such that $\rho = E^*(\hat{\rho})$ can be described by (16), (17), (18).

Second: a "reduction" $R$ of (normal) states on $\mathcal{M}$ to states on $H$ such that for all states $\hat{\rho}$ on $H$ it holds
\[ R\Big( \big( \Gamma(T) U_m B_n^* \big) E^*(\hat{\rho}) \big( \Gamma(T) U_m B_n^* \big)^* \Big) = V_{nm}\, \hat{\rho}\, V_{nm}^* \quad (n, m = 1, \dots, N), \tag{52} \]
where $(V_{nm})_{n,m=1}^{N}$ are unitary operators on $H$.
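That we can obtain as follows: We have already stated in Sect. 2 that
\[ \big( |\exp(aK_r g_j) - \exp(0)\rangle \big)_{j=1}^{N} \quad (r = 1, 2) \]
are ONS in $\mathcal{M}$. We denote by $\mathcal{M}_r$ $(r = 1, 2)$ the corresponding $N$-dimensional subspaces of $\mathcal{M}$. Then for each $r = 1, 2$, there exists exactly one unitary operator $W_r$ from $H$ onto $\mathcal{M}_r \subseteq \mathcal{M}$ with
\[ W_r |j\rangle = |\exp(aK_r g_j) - \exp(0)\rangle \quad (j = 1, \dots, N). \tag{53} \]
We put
\[ E^*(\hat{\rho}) := W_1 \hat{\rho}\, W_1^* \Pi_{\mathcal{M}_1} \quad (\hat{\rho} \text{ state on } H), \tag{54} \]
where $\Pi_{\mathcal{M}_r}$ denotes the projection onto $\mathcal{M}_r$ $(r = 1, 2)$. Describing the state $\hat{\rho}$ on $H$ by
\[ \hat{\rho} = \sum_{s=1}^{N} \lambda_s |\hat{\Phi}_s\rangle\langle\hat{\Phi}_s| \tag{55} \]
with
\[ |\hat{\Phi}_s\rangle = \sum_{j=1}^{N} c_{sj} |j\rangle, \]
where $(c_{sj})_{s,j=1}^{N}$ fulfills (18), we obtain that $\rho = E^*(\hat{\rho})$ is given by (16) and (17). Now, for each state $\rho$ on $\mathcal{M}$ we put
\[ R(\rho) := \frac{W_2^* \Pi_{\mathcal{M}_2}\, \rho\, W_2}{\mathrm{tr}\,\big( W_2^* \Pi_{\mathcal{M}_2}\, \rho\, W_2 \big)}. \tag{56} \]
Since
\[ \Pi_{\mathcal{M}_2} \big( \Gamma(T) U_m B_n^* \big) E^*(\hat{\rho}) \big( \Gamma(T) U_m B_n^* \big)^* = \big( \Gamma(T) U_m B_n^* \big) E^*(\hat{\rho}) \big( \Gamma(T) U_m B_n^* \big)^*, \]
we get
\[ \mathrm{tr}\,\Big( W_2^* \Pi_{\mathcal{M}_2} \big( \Gamma(T) U_m B_n^* \big) E^*(\hat{\rho}) \big( \Gamma(T) U_m B_n^* \big)^* W_2 \Big) = 1 \]
and
\[ R\Big( \big( \Gamma(T) U_m B_n^* \big) E^*(\hat{\rho}) \big( \Gamma(T) U_m B_n^* \big)^* \Big) = W_2^* \big( \Gamma(T) U_m B_n^* \big) W_1 \hat{\rho}\, W_1^* \Pi_{\mathcal{M}_1} \big( \Gamma(T) U_m B_n^* \big)^* W_2. \]
We also have the equality
\[ \Pi_{\mathcal{M}_1} \big( \Gamma(T) U_m B_n^* \big)^* W_2 = \big( \Gamma(T) U_m B_n^* \big)^* W_2, \]
which implies
\[ R\Big( \big( \Gamma(T) U_m B_n^* \big) E^*(\hat{\rho}) \big( \Gamma(T) U_m B_n^* \big)^* \Big) = W_2^* \big( \Gamma(T) U_m B_n^* \big) W_1 \hat{\rho}\, W_1^* \big( \Gamma(T) U_m B_n^* \big)^* W_2. \]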
Put
\[ V_{nm} := W_2^*\, \Gamma(T) U_m B_n^*\, W_1 \quad (n, m = 1, \dots, N), \tag{57} \]
then $V_{nm}$ $(n, m = 1, \dots, N)$ is a unitary operator on $H$ and (52) holds. One easily checks
\[ V_{nm} |j\rangle = \bar{b}_{nj}\, |j \oplus m\rangle \quad (j, m, n = 1, \dots, N). \]
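The action $V_{nm}|j\rangle = \bar{b}_{nj}|j \oplus m\rangle$ determines the $N^2$ keys as explicit matrices. The following minimal sketch builds them under two assumptions that are not fixed by the text: the phases $b_{nj}$ are chosen as Fourier phases, which satisfy (19) and (20), and indices run from 0 to $N-1$ instead of 1 to $N$. The function name `V` mirrors the paper's symbol; the dictionary `keys` is an illustrative container.

```python
import numpy as np

N = 3
# Assumed choice: Fourier phases, |b_nj| = 1 and <b_n, b_j> = 0 for n != j.
b = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)

def V(n, m):
    """Matrix of V_nm |j> = conj(b_nj) |j (+) m> in the basis (|j>), indices 0..N-1."""
    M = np.zeros((N, N), dtype=complex)
    for j in range(N):
        M[(j + m) % N, j] = np.conj(b[n, j])
    return M

keys = {(n, m): V(n, m) for n in range(N) for m in range(N)}
```

Each key is a phase matrix followed by a cyclic shift, so Bob can undo it by applying the adjoint once he has been told the pair $(n, m)$. With the Fourier choice the keys are also mutually orthogonal in the Hilbert–Schmidt inner product.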
Summarizing these, we have the following theorem:

Theorem 4.2. Consider the channels on the set of states on $H$
\[ \hat{\Lambda}_{nm} := R \circ \Lambda_{nm} \circ E^* \quad (n, m = 1, \dots, N), \tag{58} \]
\[ \hat{\Theta}_{nm} := R \circ \Theta_{nm} \circ E^* \quad (n, m = 1, \dots, N), \tag{59} \]
where $R$, $E^*$, $\Lambda_{nm}$, $\Theta_{nm}$ are given by (56), (54), (37), (43), respectively. Then for all states $\hat{\rho}$ on $H$, it holds
\[ \hat{\Lambda}_{nm}(\hat{\rho}) = V_{nm}\, \hat{\rho}\, V_{nm}^* = \hat{\Theta}_{nm}(\hat{\rho}) \quad (n, m = 1, \dots, N), \tag{60} \]
where $V_{nm}$ $(n, m = 1, \dots, N)$ are the unitary operators on $H$ given by (57).

Remark 4.3. Remember that the teleportation model according to $(\Lambda_{nm})_{n,m=1}^{N}$ works perfectly in the sense of Remark 2.4, and the model dealing with $(\Theta_{nm})_{n,m=1}^{N}$ was only asymptotically perfect for large $d$ (i.e., high density or high energy of the beams). These properties carry over to $\hat{\Lambda}_{nm}$, $\hat{\Theta}_{nm}$.

Example 4.4. We specialize
\[ \hat{K}_1 h = \hat{K}_2 h = \frac{1}{\sqrt{2}}\, h \quad (h \in L^2(\mathbb{R}^k)), \qquad \hat{T} = 1. \]
Realizing the teleportation in this case means that Alice has to perform measurements $(F_{nm})$ in the whole space $\mathbb{R}^k$ and also Bob (concerning $F_+$).

5. Alice and Bob are Spatially Separated

We specialize the situation in Sect. 4 as follows: We fix
– $t \in \mathbb{R}^k$,
– a measurable decomposition $X_1, X_2, X_3 \subseteq \mathbb{R}^k$ of $\mathbb{R}^k$ such that $l(X_1) \ne 0$ and $X_2 = X_1 + t := \{ x + t \mid x \in X_1 \}$.
Put
\[ \hat{T} h(x) := h(x - t) \quad (x \in \mathbb{R}^k,\ h \in L^2(\mathbb{R}^k)), \]
\[ \hat{K}_r h := h \chi_{X_r} \quad (r = 1, 2,\ h \in L^2(\mathbb{R}^k)), \]
and assume that the function $f \in L^2(\mathbb{R}^k)$ has the properties
\[ f \chi_{X_2} = \hat{T}(f \chi_{X_1}), \qquad f \chi_{X_3} \equiv 0. \]
Then $\hat{T}$ is a unitary operator on $L^2(\mathbb{R}^k)$ and (49), (50) hold. Using the assumption that $X_1, X_2, X_3$ is a measurable decomposition of $\mathbb{R}^k$ we get immediately that $G_s := X_s \times \{1, \dots, N\}$ $(s = 1, 2, 3)$ is a measurable decomposition of $G$. It follows that $\mathcal{M} = \mathcal{M}(G)$ is decomposed into the tensor product
\[ \mathcal{M}(G) = \mathcal{M}(G_1) \otimes \mathcal{M}(G_2) \otimes \mathcal{M}(G_3) \]
[6, 7, 10]. According to this representation, the local algebras $\mathcal{A}(X_s)$ corresponding to regions $X_s \subseteq \mathbb{R}^k$ $(s = 1, 2)$ are given by
\[ \mathcal{A}(X_1) := \{ A \otimes 1 \otimes 1;\ A \text{ bounded operator on } \mathcal{M}(G_1) \}, \]
\[ \mathcal{A}(X_2) := \{ 1 \otimes A \otimes 1;\ A \text{ bounded operator on } \mathcal{M}(G_2) \}. \]
One easily checks in our special case that
\[ F_{nm} \in \mathcal{A}(X_1) \otimes \mathcal{A}(X_1) \quad (n, m = 1, \dots, N), \qquad \sigma \in \mathcal{A}(X_1) \otimes \mathcal{A}(X_2), \]
and $E^*(\hat{\rho})$ gives a state on $\mathcal{A}(X_1)$ (the number of particles outside of $G_1$ is 0 with probability 1). That is, Alice has to perform only local measurements inside of the region $X_1$ in order to realize the teleportation processes described in Sect. 4 or to measure the state $E^*(\hat{\rho})$. On the other hand, $\Lambda_{nm}(E^*(\hat{\rho}))$ and $\Theta_{nm}(E^*(\hat{\rho}))$ give local states on $\mathcal{A}(X_2)$ such that by measuring these states Bob has to perform only local measurements inside of the region $X_2$. The only problem could be that according to the definition (43) of the channels $\Theta_{nm}$ Bob has to perform the measurement by $F_+$, which is not local. However, as we have already stated in Remark 3.5, this problem can be avoided if we replace $F_+$ by $F_{+,X_2} \in \mathcal{A}(X_2)$.

Therefore we can describe the special teleportation process as follows: We have a beam being in the pure state $|\eta\rangle\langle\eta|$ (40). After splitting, one part of the beam is located in the region $X_1$ or will go to $X_1$ (cf. Remark 1.11) and the other part is located in the region $X_2$ or will go to $X_2$. Further, there is a state $E^*(\hat{\rho})$ localized in the region $X_1$. Now Alice will perform the local measurement inside of $X_1$ according to $F = \sum_{n,m} z_{nm} F_{nm}$ involving the first part of the beam and the state $E^*(\hat{\rho})$. This leads to a preparation of the second part of the beam located in the region $X_2$ which can be controlled by Bob, and the second part of the beam will show the behaviour of the state $\Lambda_{nm}(E^*(\hat{\rho})) = \Theta_{nm}(E^*(\hat{\rho}))$ if Alice's measurement shows the value $z_{nm}$. Thus we have teleported the state $\hat{\rho}$ on $H$ from the region $X_1$ into the region $X_2$.
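The requirements (49) and (50) on the translation $\hat{T}$ and the multiplication operators $\hat{K}_r$ of this section can be illustrated on a grid. The sketch below is a discretized stand-in and not the construction of the paper: it assumes $k = 1$, a periodic grid of 20 points, a translation by 5 grid points, and a Gaussian bump as the part of $f$ supported in $X_1$. The names `T_hat` and `K_hat` are illustrative.

```python
import numpy as np

x = np.arange(-10, 10)            # assumed grid on the line (k = 1), spacing 1
shift = 5                         # translation t, in grid points
X1 = (x >= -5) & (x < 0)          # Alice's region
X2 = np.roll(X1, shift)           # X2 = X1 + t, Bob's region

def T_hat(h):
    """(T h)(x) = h(x - t), implemented periodically on the grid."""
    return np.roll(h, shift)

def K_hat(h, region):
    """Multiplication by the indicator function of a region."""
    return h * region

f1 = np.exp(-(x + 2.5) ** 2) * X1             # part of f supported in X1
f = f1 + T_hat(f1)                            # f chi_X2 = T(f chi_X1), f = 0 elsewhere
f = f / np.linalg.norm(f)                     # ||f|| = 1
```

On the grid the translation is a permutation of the sample points and hence unitary, and the two multiplication operators are orthogonal projections with disjoint supports, which is the discrete counterpart of the situation where Alice and Bob act only inside $X_1$ and $X_2$.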
References
1. Accardi, L., Ohya, M.: Teleportation of general quantum states. Volterra Center preprint, 1998
2. Accardi, L., Ohya, M.: Compound channels, transition expectations and liftings. Applied Mathematics & Optimization 39, 33–59 (1999)
3. Bennett, C.H., Brassard, G., Crépeau, C., Jozsa, R., Peres, A., Wootters, W.K.: Teleporting an unknown quantum state via Dual Classical and Einstein–Podolsky–Rosen channels. Phys. Rev. Lett. 70, 1895–1899 (1993)
4. Bennett, C.H., Brassard, G., Popescu, S., Schumacher, B., Smolin, J.A., Wootters, W.K.: Purification of noisy entanglement and faithful teleportation via noisy channels. Phys. Rev. Lett. 76, 722–725 (1996)
5. Ekert, A.K.: Quantum cryptography based on Bell's theorem. Phys. Rev. Lett. 67, 661–663 (1991)
6. Fichtner, K.-H., Freudenberg, W.: Point processes and the position distribution of infinite boson systems. J. Stat. Phys. 47, 959–978 (1987)
7. Fichtner, K.-H., Freudenberg, W.: Characterization of states of infinite Boson systems I. – On the construction of states. Commun. Math. Phys. 137, 315–357 (1991)
8. Fichtner, K.-H., Freudenberg, W., Liebscher, V.: Time evolution and invariance of Boson systems given by beam splittings. Infinite Dim. Anal., Quantum Prob. and Related Topics 1, 511–533 (1998)
9. Lindsay, J.M.: Quantum and Noncausal Stochastic Calculus. Prob. Th. Rel. Fields 97, 65–80 (1993)
10. Fichtner, K.-H., Winkler, G.: Generalized brownian motion, point processes and stochastic calculus for random fields. Math. Nachr. 161, 291–307 (1993)
11. Inoue, K., Ohya, M., Suyari, H.: Characterization of quantum teleportation processes by nonlinear quantum mutual entropy. Physica D 120, 117–124 (1998)

Communicated by H. Araki
Commun. Math. Phys. 222, 249 – 267 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Laplace–Beltrami Operator on a Riemann Surface and Equidistribution of Measures Andre Reznikov Department of Mathematics, Weizmann Institute, Rehovot, 76100 Israel. E-mail: [email protected] Received: 12 March 2001 / Accepted: 23 April 2001
Abstract: We consider the Laplace–Beltrami operator on a compact Riemann surface of constant negative curvature. For any eigenvalue of the Laplace–Beltrami operator there is an associated sequence of measures on the Riemann surface. These measures naturally appear in Quantum Chaos type questions in the theory of electro-magnetic flow on a Riemann surface. The main result of the paper is the claim that this sequence of measures has the Liouville measure as the (weak$^*$) limit. We prove a quantitative version of this equidistribution claim.

1. Introduction

1.1. Motivation. Let $Y$ be a compact Riemann surface of genus greater than one endowed with a Riemannian metric of constant curvature $-1$ and the volume element $dv$. The corresponding Laplace–Beltrami operator $\Delta$ is nonnegative and has purely discrete spectrum on the space $L^2(Y, dv)$ of functions on $Y$. We will denote by $0 = \mu_0 < \mu_1 \le \mu_2 \le \dots$ the eigenvalues of $\Delta$ and by $\phi_{\mu_i}$ the corresponding eigenfunctions of norm one. The study of these eigenfunctions and the corresponding eigenvalues is important in many areas of analysis on Riemann surfaces. The purpose of this paper is to discuss a phenomenon of equidistribution of a certain sequence of measures naturally associated with any eigenvalue. This phenomenon is similar to certain questions in Quantum Chaos, especially the equidistribution of "fuzzy ladders" of Guillemin–Sternberg–Uribe (see [GS, GU, ST, Z3] and a discussion in Sect. 1.4.2).

1.2. Associated measures. To state the problem and results, we introduce necessary notation and recall some well-known facts about the geometry of the Laplace–Beltrami operator on a Riemann surface (see [G6] or [L] and also Sect. 2 for a more detailed discussion).
1.2.1. Casimir operator. We consider the principal bundle $X = T_1^* Y \xrightarrow{\ p_1\ } Y$ of unit spheres in the cotangent bundle $T^* Y$. It is well-known that the manifold $X$ can be endowed with an action of the group $G = SL(2, \mathbb{R})$. In particular, fibers of $p_1$ are isomorphic to $S^1$ and are orbits of the compact subgroup $K = SO(2, \mathbb{R}) \subset G$ under the above action. There exists a unique $G$-invariant measure $\mu_X$ on $X$ of total mass one. We denote by $\|\ \|_X$ the corresponding norm in $L^2(X, d\mu_X)$.

There exists a $G$-invariant second order differential operator on $X$. This operator is unique up to a constant and it generates the algebra of $G$-invariant differential operators on $X$. Let $\omega$ be the second order $G$-invariant differential operator which coincides with $\Delta$ on the space of functions constant along the fibers of $p_1$ (we identify this space of functions with the space of functions on $Y$). Such $\omega$ is called the Casimir operator. We also denote by $W$ the differential operator on $X$ which generates the action of $K$ on $X$. We note that $\omega$ and $W$ commute since $W$ is obtained from the action of $G$ on $X$. Operators $\omega$ and $W$ appear in the quantization of an electro-magnetic flow on $Y$ (see [Su], [T, Z3] and Sect. 1.4.3 below).

The operator $\omega$ is not an elliptic operator. However, the spectrum of $\omega$ is a discrete set and in fact, it coincides with the spectrum of $\Delta$ (though eigenspaces of $\omega$ in $L^2(X)$ are infinite dimensional while those of $\Delta$ in $L^2(Y)$ are finite dimensional). We consider joint eigenfunctions of $\omega$ and of $W$. Denote by $E_{\mu_j}$ an eigenspace of $\omega$ in $L^2(X)$ where $\mu_j \in \mathrm{Spec}(\Delta)$. Consider the restriction of $W$ to $E_{\mu_j}$. The spectrum of $W$ on $E_{\mu_j}$ consists of imaginary even integers: $\mathrm{Spec}(W|_{E_{\mu_j}}) = i2\mathbb{Z}$. The multiplicity is constant along the spectrum and coincides with the multiplicity $m_j$ of the eigenvalue $\mu_j$ in the spectrum of $\Delta$. We denote by $\{\phi^{k}_{\mu_j,l}\}$, $l = 1, 2, \dots, m_j$, $k \in 2\mathbb{Z}$, a corresponding set of orthonormal eigenfunctions:
\[ \omega(\phi^{k}_{\mu_j,l}) = \mu_j \cdot \phi^{k}_{\mu_j,l}, \qquad W(\phi^{k}_{\mu_j,l}) = ik \cdot \phi^{k}_{\mu_j,l}, \qquad \|\phi^{k}_{\mu_j,l}\|_{L^2(X)} = 1. \tag{1.1} \]
(The condition W (φµk j ) = ik · φµk j simply means that the function φµk j is an exponent along the fibers of p1 : φµk j (xθ ) = e2πikθ φµk j (x) for any θ ∈ K S 1 and x ∈ X.)
The functions $\phi^k_{\mu_j,l}$ are important in many areas of analysis on Riemann surfaces (see Sect. 1.4 for some examples). In particular, these functions are essential in the treatment of the equivariant microlocalization of eigenfunctions $\phi_{\mu_i}$, their Wigner distribution and the analysis of pseudo-differential operators on Riemann surfaces. This point of view on the functions $\phi^k_{\mu_j,l}$ was pursued by Zelditch in [Z1].

1.2.2. Measures. In this paper we investigate the distribution of the mass of the functions $\phi^k_{\mu_j,l}$. Throughout the paper we fix one eigenvalue $\mu = \mu_j$ of $\Delta$ and assume it has multiplicity $m$. We define a sequence $\{\omega_{k,l}\}$ of probability measures on $X$ by
\[ \omega_{k,l} = |\phi^k_{\mu,l}(x)|^2\,d\mu_X, \qquad l = 1, 2, \dots, m,\ k \in 2\mathbb{Z}. \tag{1.2} \]
We note that the measures ωk,l are K-invariant and hence can be viewed as measures on the Riemann surface Y .
Equidistribution
1.3. Equidistribution. Our main result is the following

Claim. For a fixed $\mu$ the sequence of measures $\omega_{k,l}$ tends to $\mu_X$ in the weak$^*$ sense.

Moreover, the following quantitative version of the equidistribution claim is proven.

Theorem 1.1. There exists $\delta > 0$ such that for any $f \in C^\infty(X)$ one has
\[ \int_X f(x)\,\omega_{k,l} = \int_X f(x)\,d\mu_X + O(|k|^{-\delta}), \]
as $|k| \to \infty$.

The proof of this theorem is based on the study of the representation theory of the group $SL(2,\mathbb{R})$, which naturally governs properties of the measures $\omega_{k,l}$. The role of $SL(2,\mathbb{R})$ is explained in Sect. 2 in more detail.

1.4. Remarks.

1.4.1. The exponent $\delta$ in the $O$-term is directly connected to the first eigenvalue of $\Delta$. Namely, let $\mu_1 = \frac{1-\lambda_1^2}{4}$ be the first nonzero eigenvalue of $\Delta$. Then $\delta = \min\left(\frac12, \frac{1-\mathrm{Re}(\lambda_1)}{2}\right)$ (i.e. only eigenvalues below $\frac14$ affect the rate of decay). This estimate of the rate of decay is tight (see Appendix A.2). The constant in the $O$-term in the theorem is bounded by a certain Sobolev norm of $f$ in an explicit way (see Proof of Theorem 1.1).
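The dependence of $\delta$ on $\mu_1$ in Remark 1.4.1 can be made concrete with a small sketch. It assumes the parametrization $\mu_1 = (1-\lambda_1^2)/4$ with $\lambda_1 \in (0,1)$ for $\mu_1 < \frac14$ and $\lambda_1$ purely imaginary otherwise, as in Sect. 2.1; the function name `decay_exponent` is ours and purely illustrative.

```python
import math


def decay_exponent(mu1: float) -> float:
    """Exponent delta = min(1/2, (1 - Re(lambda_1)) / 2) of Remark 1.4.1.

    Assumption: mu1 = (1 - lambda_1^2) / 4 is the first nonzero eigenvalue,
    with lambda_1 in (0, 1) when mu1 < 1/4 and lambda_1 purely imaginary
    (so Re(lambda_1) = 0) when mu1 >= 1/4.
    """
    if mu1 <= 0:
        raise ValueError("mu1 must be a positive eigenvalue")
    lambda_squared = 1.0 - 4.0 * mu1
    re_lambda1 = math.sqrt(lambda_squared) if lambda_squared > 0 else 0.0
    return min(0.5, (1.0 - re_lambda1) / 2.0)


# An eigenvalue below 1/4 slows the decay; anything at or above 1/4 does not.
small = decay_exponent(3.0 / 16.0)   # lambda_1 = 1/2
large = decay_exponent(1.0)          # lambda_1 purely imaginary
```

Under these assumptions the exponent drops below $\frac12$ exactly when $\mu_1 < \frac14$, which is the sense in which only small eigenvalues affect the rate.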
1.4.2. Quantum ergodicity. Since the pioneering paper [Sh1] of A. Shnirelman (for proofs see [Sh2], Colin de Verdière [CV] and Zelditch [Z2]) there have been many results concerning equidistribution of eigenfunctions in chaotic dynamical systems. In particular, Guillemin, Sternberg and Uribe introduced a notion of “fuzzy ladders”. In our setting this notion concerns a sequence of joint eigenfunctions of two commuting operators: $\omega + 2W^2$ (the Kaluza–Klein operator) and $-iW$ on $X$. Here $\omega$ is the Casimir operator and $W$ the infinitesimal generator of the $SO(2)$ action on $X$ as before (see [Z3]). One sees immediately that these joint eigenfunctions are nothing else than the eigenfunctions $\phi^k_{\mu_j,l}$. The condition that this sequence forms a “fuzzy ladder” is: $\frac{\mu_j}{k^2} = c + O(k^{-1})$ for some $c > 0$. For such sequences, called hyperbolic ladders, equidistribution on the average has been proven by Zelditch (refining results in [ST]). For example, in our setting it is proven in [Z3] that $\frac{1}{N}\sum |\phi^k_{\mu_j,l}|^2 \to \mu_X$ as $N \to \infty$. Here the sum is over the first $N$ elements in the ladder and the convergence is in a weak sense.

There are two more types of ladders: elliptic and parabolic (in the setting of [Z3] these can be treated at present only for discrete series representations). The sequence $\{\phi^k_{\mu,l}\}$ for $\mu$ fixed and $|k| \to \infty$, considered in this paper, is of parabolic type. We prove the unique ergodicity of this sequence (i.e. that the functions $|\phi^k_{\mu,l}|^2$ with $\mu$ fixed and $k \to \infty$ tend to the Liouville measure individually and not only on average). The only other ladder for which unique ergodicity has been proven is the parabolic ladder in the representation which is a limit of the discrete series. This is discussed at the end of [Z3].

We would like to add that it is quite possible that one could extend the methods of [GS, GU, ST, Z3] to cover other parabolic ladders as well. However, it does not seem to be possible to obtain an analog of our Theorem 1.1 with a remainder term along these lines. In that respect our results (as well as methods) are quite different.
1.4.3. Electro-magnetic flow on Y. The functions $\phi^k_{\mu,l}$ appear in the quantization of an electro-magnetic flow on $Y$ (see [Su, T, Z3]). Consider $Y$ as above and let $B$ be a closed 2-form coming from the form $y^{-2}dx\wedge dy$ on the upper half-plane $H$. Let $p : T^*Y \to Y$ be the cotangent bundle, $\Omega$ the canonical symplectic form on it and $\|\xi\|$ the norm $T_y^*Y \to \mathbb{R}$ on fibers (coming from the Riemannian metric on $Y$). Let $V$ be a positive smooth function on $Y$ (in our case we will have $V \equiv 1$); we denote by the same letter its lift to $T^*Y$. Consider the Hamiltonian on $T^*Y$ given by
\[ H(y,\xi) = (\|\xi\|^2 + V(y))^{\frac12}. \]
The Hamiltonian $H$ is called the electro-magnetic Hamiltonian associated with the magnetic field $B$ and the electric potential $V$ on the symplectic manifold $T^*Y$ endowed with the symplectic form $\Omega_1 = \Omega - p^*B$. $H$ generates an electro-magnetic flow on the energy level $H = \mathrm{const}$. The quantum Hamiltonian is given by the following (first order positive elliptic pseudo-differential) operator on $X = T_1^*Y$:
\[ \hat H = (\Delta + V(x)\cdot W^2)^{\frac12}, \]
where $W$ is as before. For the special case of the constant potential ($V \equiv 1$) we see that the eigenfunctions of $\hat H$ are precisely the functions $\phi^k_{\mu,l}$ for all $\mu \in \mathrm{Spec}(\Delta)$ and $k \in 2\mathbb{Z}$ from (1.1).

1.4.4. Maass operators. The functions $\phi^k_{\mu,l}$ are classical objects of the theory of automorphic functions. They have a description in terms of the so-called rising and lowering Maass operators (see [B]). Let $\Gamma \subset G$ be a lattice. To a function $f$ on $X = \Gamma\backslash G$ of weight $k$ (i.e. satisfying $f(gh_\theta) = e^{ik\theta}f(g)$ for any $h_\theta = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix} \in K$) there corresponds a function $F(z)$ on the upper half plane $H$ such that
\[ f(g) = F(g\cdot i)\,e^{ik\theta}, \]
where $g = \begin{pmatrix} a & b\\ 0 & a^{-1}\end{pmatrix}\cdot h_\theta$. Such $F(z)$ satisfies a well-known condition:
\[ F(\gamma z) = \left(\frac{cz+d}{|cz+d|}\right)^{k} F(z), \qquad \gamma = \begin{pmatrix} a & b\\ c & d\end{pmatrix} \in \Gamma. \]
The Maass (rising) operator, acting from weight $2k$ to weight $2k+2$ functions on $\Gamma\backslash H$, is given by the formula $K_{2k} = iy\partial_x + y\partial_y + k$, and similarly for the lowering Maass operator. These operators come from elements $E_+$ and $E_-$ in the Lie algebra $sl_2(\mathbb{C})$. The Maass operator $K_{2k}$ maps $\phi^{2k}_{\mu,l}$ to $\phi^{2k+2}_{\mu,l}$ (up to an explicit constant).
1.4.5. Generalizations. Our method proves the equidistribution of a more general set of functions. Namely, one can ask for weak limits of sequences of functions of the type $\phi^k_\mu\phi^{-l}_\mu$ for $\mu$ fixed and $k, l \to \infty$ (naturally viewed as distributions on smooth functions on $X$ of the $K$-type $l - k$). The case we treated is $k = l$, i.e. the “diagonal case”. One sees immediately that our method works as long as $|k - l|$ is bounded along the sequence in question. In that case the limit is 0.

A similar question arises for representations of the discrete series. In that case the function $\phi^m_\mu$ we start with is a holomorphic form (instead of $\phi^0_\mu$, which is an eigenfunction of $\Delta$). Here $\mu = m(m-1)$ and $m$ is an integer. As before we can form the sequence of functions $\{\phi^{k+m}_\mu\}$ for $k \in 2\mathbb{Z}$, $k \ge 0$, of weight $k + m$ on $X$. Out of this sequence of functions we form the sequence of measures on $Y$ given by $\omega^m_k = \phi^{k+m}_\mu\,\bar\phi^{k+m}_\mu\,d\mu_X$, for $\mu = m(m-1)$ fixed and $k \to \infty$. Such a sequence of measures should also be equidistributed. However, to prove this (by our method) one has to overcome some technical difficulties related to the form of a geometric trilinear invariant functional in representations of the discrete series. We hope to return to this problem elsewhere.
2. Representation Theoretic Reformulation

We recall the standard connection of the above setting with the representation theory of $SL(2,\mathbb{R})$ (see [G6]).

2.1. Automorphic representations. Let $H$ be the upper half plane with the hyperbolic metric of constant curvature $-1$. The group of motions of $H$ is isomorphic to $PSL(2,\mathbb{R}) = SL(2,\mathbb{R})/\{\pm 1\}$. We denote $G = SL(2,\mathbb{R})$, $K = SO(2,\mathbb{R}) \simeq S^1$ as before and we identify $H$ with $G/K$. By the uniformization theorem we have $Y = \Gamma\backslash H$ for $\Gamma = \pi_1(Y) \subset SL(2,\mathbb{R})$ a discrete co-compact subgroup of $SL(2,\mathbb{R})$. This is the main geometric construction which allows one to make a connection between analysis on a Riemann surface and representation theory.

We denote by $X$ the compact quotient $\Gamma\backslash SL(2,\mathbb{R})$. One can view $X$ as the bundle $T_1^*Y$ of unit spheres in the cotangent bundle of $Y$. The group $G$ acts on $X$ and hence on the space of functions on $X$. We fix the unique $G$-invariant measure $\mu_X$ on $X$ of total mass one. Let $L^2(X) = L^2(X, d\mu_X)$ be the space of square integrable functions. We will denote the norm on $L^2(X)$ by $\|\cdot\|_X$ or simply $\|\cdot\|$, and by $\langle f, g\rangle_X = \int_X f\bar g\,d\mu_X$ the corresponding scalar product.

We will identify the Riemann surface $Y = \Gamma\backslash H$ with $X/K$. This induces the embedding $L^2(Y) \subset L^2(X)$, whose image consists of all $K$-invariant functions. For any eigenfunction $\phi$ of the Laplace–Beltrami operator $\Delta$ on $Y$ we can consider a closed $G$-invariant subspace $L_\phi \subset L^2(X)$ generated by $\phi$ under the action of $G$. It is known that $(\pi, L) = (\pi_\phi, L_\phi)$ is an irreducible unitary representation of $G$ (see [G6]). Such $(\pi_\phi, L_\phi)$ is called an automorphic representation.

Conversely, fix an irreducible unitary representation $(\pi, L)$ of the group $G$ and a $K$-fixed unit vector $v_0 \in L$. Any $G$-morphism $\nu : L \to L^2(X)$ defines an eigenfunction $\phi = \nu(v_0)$ of the Laplace–Beltrami operator $\Delta$ on $Y$; this function is normalized if $\nu$ is an isometry. Thus eigenfunctions $\phi$ correspond to tuples $(\pi, L, v_0, \nu)$.
All nontrivial irreducible unitarizable representations of $G$ with a $K$-fixed vector are classified: these are the representations of the principal and complementary series, the so-called irreducible unitary representations of class one (i.e. with a $K$-invariant vector). An irreducible unitary representation of class one $(\pi, L)$ can be realized in the space $L_\lambda$ of (even) homogeneous functions on $\mathbb{R}^2\setminus 0$ of degree $\lambda - 1$, where $\lambda = it$ is a purely imaginary complex number if $\pi$ is a representation of the principal series, or $\lambda \in (0,1)$ for the complementary series. For representations of the principal series the $G$-invariant scalar product in $L_\lambda$ is given by $P_\lambda(f,g) = \langle f, g\rangle_{\pi_\lambda} = \frac{1}{2\pi}\int_{S^1} f\bar g\,d\theta$. For the complementary series the formula is more complicated (see Sect. 4.1.2). Thus vectors in a representation $L_\lambda$ of the principal series are just even locally $L^2$ functions $f$ on $\mathbb{R}^2\setminus 0$ satisfying $f(ax, ay) = |a|^{\lambda-1}f(x,y)$ for all $a \in \mathbb{R}\setminus 0$. The representation $\pi$ is induced by the natural action of $G$ on $(x,y) \in \mathbb{R}^2$.

The eigenfunction $\phi$ of the Laplace operator corresponding to a representation $(\pi_\lambda, L_\lambda)$ has eigenvalue $\mu = \frac{1-\lambda^2}{4}$. Thus we see that eigenfunctions $\phi$ on the Riemann surface $Y$ with the given eigenvalue $\mu$ correspond to $G$-morphisms $\nu_\phi : L_\lambda \to L^2(X)$ (namely, $\phi = \nu_\phi(v_0)$). The normalization $\|\phi\| = 1$ means that $\nu$ preserves the scalar product.

Let $V_\lambda \subset L_\lambda$ be the subspace of smooth vectors. For $L_\lambda$ of the principal or complementary series as above this simply means that $f(x,y)$ is a smooth function on $\mathbb{R}^2\setminus 0$. It is well known that $\nu(V_\lambda) \subset C^\infty(X)$. This fact will be used often in the sequel.

The space $L^2(X)$ splits under the (unitary) action of $G$ into the orthogonal sum of spaces according to types of representations of $G$: $L^2(X) = \mathbb{C}\oplus L^2_0(X)\oplus L^2_d(X)$. Here $\mathbb{C}$ means the trivial representation, $L^2_0(X) = \oplus_i \pi_i$ is the direct sum of irreducible unitary representations of class one as above, and $L^2_d(X)$ is the direct sum of irreducible unitary representations of the discrete series, i.e. those which do not have $K$-invariant vectors. We consider only the space $L^2_1(X) = \mathbb{C}\oplus L^2_0(X)$. This is the space which corresponds to eigenfunctions of $\Delta$ on $Y$, as we explained above.

2.2. K-types. Throughout the paper we fix an eigenfunction $\phi = \phi_\mu$ with an eigenvalue $\mu$ as above, the space $L = L_\phi$ and the subspace $V = V_\phi \subset L_\phi$ of smooth vectors. The structure of $V$ is well known, as we mentioned (see [L]). Namely, it is isomorphic to $C^\infty(S^1)_{even}$ and, in particular, $V$ has an orthonormal basis $\{e_k\}$ with $e_k = e^{ik\theta}$, $k \in 2\mathbb{Z}$. Such $e_k$ is the unique vector (of norm one) in $V$ of $K$-type $k$ (i.e. a vector satisfying $\pi(h_\theta)e_k = e^{ik\theta}e_k$ for any $h_\theta = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix} \in K$).

We define a sequence of functions $\{\phi^k\}_{k\in 2\mathbb{Z}}$ on $X$ as the (unique) sequence of norm one vectors of all $K$-types in the space $V$. Namely, $\phi^k \in V \subset C^\infty(X)$ and $\phi^k(gh_\theta) = e^{ik\theta}\phi^k(g)$. In particular $\phi^0 = \phi$. The main object of our study is the following sequence of probability measures on $X$:
\[ \omega_k = |\phi^k(x)|^2\,d\mu_X, \qquad k \in 2\mathbb{Z}. \tag{2.1} \]

2.2.1. Multiplicities. It is the essence of the Duality theorem of Gel’fand and Fomin (see [G6]) that the above definition of $\phi^k = \phi^k_\mu$ coincides with the one given in Sect. 1. The space $E_\mu$ introduced in Sect. 1.2.1 has the structure of a tensor product $E_\mu = L_\lambda\otimes M_\mu$, where $L_\lambda$ is the space of the corresponding representation of $G$ and $M_\mu$ is a vector space whose dimension is equal to the multiplicity of $\mu$.
3. Outline of the Proof

3.1. Method of triple products. To prove equidistribution of $\omega_{k,l}$ we use the method of H. Weyl. Namely, we will prove that for a basis $\{\phi_i\}$ in the space of continuous functions on $Y$ with zero mean we have
\[ \int_X \phi_i\,\omega_{k,l} \to 0, \tag{3.1} \]
for any fixed $i \ge 1$ as $|k| \to \infty$. Obviously, $\int_X 1\cdot\omega_{k,l} = 1$.

We note here that from the representation theoretic point of view it is easier (and more natural) to deal with the measures $\omega_k$ introduced in Eq. (2.1) and not with $\omega_{k,l}$ from Eq. (1.2). This is because the $\omega_k$ are associated with a choice of an eigenvalue and of an eigenfunction, and hence with a choice of an irreducible representation, and not with an eigenvalue alone. However, any $\omega_{k,l}$ comes from $V_\phi$ for some choice of $\phi$ as in Eq. (2.1), and hence it is enough to prove Theorem 1.1 for $\omega_k$.

The proof of Eq. (3.1) for $\omega_k$ is based on the uniqueness of trilinear invariant functionals on irreducible unitary representations of $SL(2,\mathbb{R})$. We explain this now.

3.1.1. We choose an orthonormal basis $\{\phi_i\}_{i\ge 1}$, $\phi_i = \phi_{\mu_i}$, of non-constant eigenfunctions of $\Delta$ in $L^2(Y)$. As we explained in Sect. 2.1, this amounts to a decomposition $L^2_0(X) = \oplus_i\pi_i$, $\pi_i = \pi_{\mu_i}$, into irreducible (unitary) representations of $G$. An eigenfunction $\phi_i$ is realized as the unique $K$-invariant vector of norm one in $\pi_i$. As we noted, we have the $G$-invariant orthogonal decomposition $L^2_1(X) = \mathbb{C}\cdot 1\oplus L^2_0(X)$, where $L^2_0(X)$ is the space of functions satisfying $\int_X f(x)\,d\mu_X = 0$.

We fix an eigenfunction $\phi \in L^2_0(X)$ and a corresponding representation $(\pi, V)$. Such a choice gives rise to the family of measures $\omega_k = |\phi^k(x)|^2\,d\mu_X$ described in Sect. 2.2.

Given a function $f \in C^\infty(X)$, in order to prove Theorem 1.1 we first note that all $\omega_k$ are $K$-invariant and hence we can assume that $f$ is also $K$-invariant. For such $f \in C^\infty(X)$ we consider the spectral decomposition with respect to the basis $\{\phi_i\}$:
\[ f(x) = \sum_{i\ge 0} c_i\phi_i(x) = \int_X f(x)\,d\mu_X\cdot 1 + \sum_{i\ge 1} c_i\phi_i, \tag{3.2} \]
where $c_i = \langle f, \phi_i\rangle_X$.

3.1.2. Basic bound. For $i \ge 1$ we introduce the notation
\[ p_i^k = \int_X \phi_i(x)\,\omega_k = \int_X \phi_i(x)|\phi^k(x)|^2\,d\mu_X = \int_Y \phi_i(y)|\phi^k(y)|^2\,dv. \]
In order to prove equidistribution we will show that for a given $\phi_i$ we have the following basic bound
\[ |p_i^k| \le C_i|k|^{-\sigma}, \tag{3.3} \]
for appropriate $\sigma > 0$ and $C_i$ (depending on $\phi_i$ but not on $k$; see Proposition 5.1). Hence the main contribution to $\int_X f(x)\,\omega_k$ comes from the first term in Eq. (3.2):
\[ \int_X f(x)\,\omega_k = \int_X\Big(\int_X f(x)\,d\mu_X\Big)\cdot 1\,\omega_k + \sum_{i\ge 1} c_i\cdot\int_X \phi_i(x)\,\omega_k \to \int_X f(x)\,d\mu_X = \bar f, \]
as $|k| \to \infty$ (at least if $\sum_i C_i\cdot c_i$ is bounded; see Proof of Theorem 1.1).

To prove Eq. (3.3) we use properties of trilinear functionals on irreducible unitary representations of $SL(2,\mathbb{R})$. Let us define them.

3.1.3. Automorphic triple products. For any three automorphic representations
\[ (\pi_j, V_j, \nu_j : V_j \to C^\infty(X)), \qquad j = 1, 2, 3, \]
we define the following $G$-invariant functional $l^{aut}_{\pi_1,\pi_2,\pi_3} : V_1\otimes V_2\otimes V_3 \to \mathbb{C}$, given by
\[ l^{aut}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3) = \int_X \phi_{v_1}(x)\phi_{v_2}(x)\phi_{v_3}(x)\,d\mu_X, \tag{3.4} \]
where $\phi_{v_j}(x) = \nu_j(v_j)(x)$ for $v_j \in V_j$. Clearly $l^{aut}$ is well defined for any vectors in $V_j$ since $\nu_j(V_j) \subset C^\infty(X)$.

In particular, for the quantities in Eq. (3.3) one has $p_i^k = l^{aut}_{\pi,\bar\pi,\pi_i}(e_k\otimes e_k\otimes e_0)$, where $\bar\pi$ is the complex conjugate of $\pi$ in $L^2(X)$. We note that $\bar\pi \simeq \pi^* \simeq \pi$ for the class one representations. This amounts to the fact that eigenfunctions of $\Delta$ on $Y$ can be chosen to assume only real values.

3.1.4. Uniqueness of triple products. The central fact about trilinear functionals is the following theorem:

Uniqueness. Let $(\pi_j, L_j)$ be three nontrivial irreducible unitary representations of $G = SL(2,\mathbb{R})$ of class one and $V_j \subset L_j$ the corresponding spaces of smooth vectors. Then the space of $G$-invariant trilinear functionals on $V_1\otimes V_2\otimes V_3$ is one-dimensional.

The uniqueness statement was proved by Oksak in [O] for the group $SL(2,\mathbb{C})$. However, the same proof works for $SL(2,\mathbb{R})$ as well. For the $p$-adic $GL(2)$ more refined results were obtained by Prasad (see [P]). He also obtained the uniqueness for discrete series representations.

3.1.5. Geometric triple products. Since there exists a unique (up to a constant) invariant trilinear functional, we can choose any representative of it. We choose it for a particular model of the representations $\pi_j$, where it is easy to make computations. (We note that the computation of $l^{aut}$ for particular vectors is a very deep problem, related to special values of automorphic $L$-functions; see, for example, [HK].)

In Sect. 4.2 we choose a model in which an invariant trilinear functional is given quite explicitly. We call it the geometric triple product $l^{geom}$. By the uniqueness principle there exists a constant $a_{\pi,\bar\pi,\pi_i}$ such that
\[ l^{aut}_{\pi,\bar\pi,\pi_i} = a_{\pi,\bar\pi,\pi_i}\cdot l^{geom}_{\pi,\bar\pi,\pi_i}. \tag{3.5} \]
On the other hand, using a specific model for the representations $\pi_j$, one obtains bounds on the following quantities: $q_i^k = l^{geom}_{\pi,\bar\pi,\pi_i}(e_k\otimes e_k\otimes e_0)$. Namely, we prove the following main technical result (see Appendix):

Proposition 3.1. Let $l^{geom}_{\pi,\bar\pi,\pi_i}$ be as in Sect. 4.2. There exists $\sigma > 0$ such that for any fixed $i \ge 1$ the bound
\[ |q_i^k| \le D_i\cdot|k|^{-\sigma} \tag{3.6} \]
holds for some $D_i > 0$ and all $k$.

3.2. Weak$^*$ limit of $\omega_k$. Now we can easily establish the bound Eq. (3.3). We have $|p_i^k| = |a_{\pi,\bar\pi,\pi_i}|\cdot|q_i^k|$ and hence the bound Eq. (3.6) implies Eq. (3.3), since $|a_{\pi,\bar\pi,\pi_i}|$ does not depend on $k$. This implies that $\text{weak}^*\text{-}\lim_{k\to\infty}\omega_k = \mu_X$ by standard considerations (see [Z3]).
3.3. Quantitative version. We explain here ramifications needed in order to prove Theorem 1.1.

3.3.1. In order to prove Theorem 1.1 we have to make the bound Eq. (3.3), $|p_i^k| \le C_i|k|^{-\sigma}$, effective in $i$. To achieve this we have to estimate the coefficients $D_i$ in the geometric bound Eq. (3.6), $|q_i^k| \le D_i\cdot|k|^{-\sigma}$, and the coefficients $a_{\pi,\bar\pi,\pi_i}$ in Eq. (3.5) as $i \to \infty$. We seek a bound which is polynomial in $\lambda_i$.

3.3.2. In Sect. 5 the coefficients $D_i$ are bounded by a direct computation in a geometric model of $l^{geom}$ (see Proposition 5.1). Namely, we prove that
\[ |D_i| \ll |\lambda_i|^{\varepsilon}, \tag{3.7} \]
for any $\varepsilon > 0$. A bound on $a_{\pi,\bar\pi,\pi_i}$ has a deeper nature. (We note here that many bounds in the analytic theory of automorphic functions could be reduced to bounds on these coefficients with respect to different parameters; see [S].) We explain now how to obtain it.
3.3.3. Supremum norm. The automorphic trilinear functional
\[ l^{aut}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3) = \int_X \phi_{v_1}(x)\phi_{v_2}(x)\phi_{v_3}(x)\,d\mu_X \]
is well defined for any vectors $v_j \in V_j$. Moreover, it is bounded with respect to some natural norms. Let $(\pi, V, \nu : V \to C^\infty(X))$ be any automorphic representation. We denote by $N$ the supremum norm induced on the space $V$ by $\nu$: $N(v) = \sup_{x\in X}|\nu(v)(x)|$. We note that $N(v)$ is well defined since $\nu(v)(x)$ is a smooth function on the compact space $X$. From the Cauchy–Schwarz inequality we obtain the following simple bound:
\[ |l^{aut}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3)| \le N(v_1)\,\|v_2\|_{P_2}\,\|v_3\|_{P_3}, \tag{3.8} \]
where $v_j \in V_j$ and $P_j$ is the invariant unitary norm on $V_j$. Hence from (3.5) we obtain the following bound:
\[ |a_{\pi_1,\pi_2,\pi_3}| \le \inf_{v_j\in V_j}\frac{N(v_1)\,\|v_2\|_{P_2}\,\|v_3\|_{P_3}}{|l^{geom}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3)|}. \tag{3.8$'$} \]

3.3.4. Bound on $a_{\pi_1,\pi_2,\pi_3}$. In order to use (3.8$'$) we have to obtain a lower bound on the norm of $l^{geom}$ with respect to the product norm $N(\cdot)\,\|\cdot\|_{P_2}\,\|\cdot\|_{P_3}$ on $V_1\otimes V_2\otimes V_3$. In Sect. 5 we evaluate $l^{geom}$ on a particular choice of vectors $v_i$ and obtain the following bound for a fixed $\pi$:
\[ |a_{\pi,\bar\pi,\pi_i}| \ll |\lambda_i|^{1+\varepsilon}, \tag{3.9} \]
for any $\varepsilon > 0$ as $|\lambda_i| \to \infty$.

We note that one would expect that $|a_{\pi,\bar\pi,\pi_i}| \ll |\lambda_i|^{\varepsilon}$ for any $\varepsilon > 0$ (see Sect. 6.4).

Proof of Theorem 1.1. Bounds (3.6), (3.7) and (3.9) imply Theorem 1.1. As we explained in Sect. 3.1, it is enough to prove the theorem for the measures $\omega_k$ defined in Sect. 2.2. Namely, let $(\pi, V)$ be a fixed automorphic representation and $\phi^k \in V$ be as before. Given a function $f \in C^\infty(X)$, we consider its spectral decomposition as in (3.2):
\[ f(x) = \int_X f(x)\,d\mu_X\cdot 1 + \sum_{i\ge 1} c_i\phi_i. \]
We have
\[ \int_X f(x)\,\omega_k = \int_X\Big(\int_X f(x)\,d\mu_X\Big)\cdot 1\,\omega_k + \sum_{i\ge 1} c_i\cdot\int_X \phi_i(x)\,\omega_k = \bar f + \sum_{i\ge 1} c_i\cdot p_i^k. \tag{3.10} \]
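We claim that the last sum in (3.10) tends to 0 as $|k| \to \infty$. In fact,
\[ \Big|\sum_{i\ge 1} c_i\cdot p_i^k\Big| \le \sum_{i\ge 1}|c_i|\,|a_{\pi,\bar\pi,\pi_i}|\,|q_i^k|. \]
From (3.6), (3.7) and (3.9) we have
\[ \sum_{i\ge 1}|c_i|\,|a_{\pi,\bar\pi,\pi_i}|\,|q_i^k| \ll \sum_{i\ge 1}|c_i|\,|\lambda_i|^{1+\varepsilon}\,|k|^{-\sigma}, \]
for any $\varepsilon > 0$. By Cauchy–Schwarz:
\[ |k|^{-\sigma}\sum_{i\ge 1}|c_i|\,|\lambda_i|^{1+\varepsilon} \le |k|^{-\sigma}\Big(\sum_{i\ge 1}|c_i|^2|\lambda_i|^{4+4\varepsilon}\Big)^{\frac12}\cdot\Big(\sum_{i\ge 1}|\lambda_i|^{-2-2\varepsilon}\Big)^{\frac12}, \]
for any $\varepsilon > 0$. Noting that $\sum_{i\ge 1}|\lambda_i|^{-2-\varepsilon}$ is finite for any $\varepsilon > 0$ because of the Weyl law for $\Delta$ on $Y$:
\[ \#\{\lambda_i \le T\} \le C\cdot T^2, \]
and that $\sum_{i\ge 1}|c_i|^2|\lambda_i|^4$ is the Sobolev norm on $Y$:
\[ S^2_{2+\varepsilon}(f) = |\bar f|^2 + \sum_{i\ge 1}|c_i|^2|\lambda_i|^{4+4\varepsilon} = \int_Y |f(y)|^2 + |\Delta^{1+\varepsilon}f(y)|^2\,dv, \]
we obtain finally:
\[ \int_X f(x)\,\omega_k = \bar f + O\big(S_{2+\varepsilon}(f)\cdot|k|^{-\sigma}\big), \tag{3.11} \]
for any $\varepsilon > 0$.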
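The Cauchy–Schwarz splitting used in the proof can be illustrated numerically. The sketch below uses synthetic data only: the spectral parameters $\lambda_i = \sqrt{i}$ (chosen so that $\#\{\lambda_i \le T\} \le T^2$, in line with the Weyl law as stated) and the rapidly decaying coefficients $c_i = i^{-4}$ are assumptions made for illustration and are not taken from the paper; the function names are ours.

```python
import math


def weyl_factor(lam, eps):
    """Square root of sum_i lambda_i^(-2 - 2*eps), the Weyl-law factor."""
    return math.sqrt(sum(li ** (-2 - 2 * eps) for li in lam))


def sobolev_factor(c, lam, eps):
    """Square root of sum_i |c_i|^2 lambda_i^(4 + 4*eps), the Sobolev-type factor."""
    return math.sqrt(sum(abs(ci) ** 2 * li ** (4 + 4 * eps)
                         for ci, li in zip(c, lam)))


def weighted_sum(c, lam, eps):
    """Left-hand side sum_i |c_i| lambda_i^(1 + eps)."""
    return sum(abs(ci) * li ** (1 + eps) for ci, li in zip(c, lam))


# Synthetic spectrum and coefficients (illustrative assumptions).
N = 2000
lam = [math.sqrt(i) for i in range(1, N + 1)]
c = [i ** -4.0 for i in range(1, N + 1)]
eps = 0.1

lhs = weighted_sum(c, lam, eps)
rhs = sobolev_factor(c, lam, eps) * weyl_factor(lam, eps)
```

With $\lambda_i = \sqrt{i}$ the Weyl-law factor is a partial sum of $\sum_i i^{-1-\varepsilon}$, so it stays bounded as $N$ grows; this is the step that lets the Sobolev norm of $f$ control the constant in (3.11).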
4. Triple Products 4.1. Geometric models of representations. Representations of SL(2, R) described in Sect. 2.1 have many different realizations. Let (πλ , Lλ ) be an irreducible unitary representation of class one and Vλ ⊂ Lλ the space of smooth vectors. We will use the following realizations of πλ (see [G5]). 4.1.1. Plane model Hλ . Realization of Vλ as the space Hλ of smooth functions f on R2 \ 0 satisfying f (ax, ay) = |a|λ−1 f (x, y) for all a ∈ R. The representation π is induced by the natural action of G on (x, y).
4.1.2. Circle model $C_\lambda$. Realization of $V_\lambda$ as the space $C_\lambda$ of smooth functions on $S^1$ (it is obtained as the restriction of functions from $H_\lambda$). For $\lambda$ purely imaginary (such a representation $\pi_\lambda$ is called tempered) the scalar product is given by
\[ \langle f, g\rangle_{\pi_\lambda} = \frac{1}{2\pi}\int_{S^1} f\bar g\,d\theta. \tag{4.1} \]
For $\lambda \in (0,1)$ (such a representation $\pi_\lambda$ is called non-tempered) the scalar product is given by
\[ \langle f, g\rangle_{\pi_\lambda} = \frac{1}{2\pi}\int_{S^1\times S^1} f(\theta)\bar g(\omega)\,\Big|\sin\frac{\theta-\omega}{2}\Big|^{-1+\lambda}\,d\theta\,d\omega. \tag{4.2} \]

4.1.3. Line model $N_\lambda$. In this model, to every vector $v \in H_\lambda$ we assign the function $u \in N_\lambda$ on the line given by $u(x) = v(x,1)$. The line model is convenient for describing the action of the upper triangular subgroup of $SL(2,\mathbb{R})$:
1. $\pi\begin{pmatrix}1 & b\\ 0 & 1\end{pmatrix}u(x) = u(x-b)$.
2. $\pi\begin{pmatrix}a & 0\\ 0 & a^{-1}\end{pmatrix}u(x) = |a|^{\lambda-1}u(a^{-2}x)$.

4.2. Geometric trilinear functionals. Here we describe the invariant trilinear functional using the geometric models. Let $\pi_j = \pi_{\lambda_j}$, $j = 1, 2, 3$, be three irreducible unitary representations of $G$ of class one. We construct explicitly the nontrivial trilinear functional $l^{geom}$ by means of its kernel.

4.2.1. Kernel of $l^{geom}$ in the plane model. Let $[\xi,\eta] = \xi_1\eta_2 - \xi_2\eta_1$ be the $G$-invariant of a pair of noncolinear vectors $\xi, \eta \in \mathbb{R}^2$. We set
\[ K_{\lambda_1,\lambda_2,\lambda_3}(u,v,w) = |[u,v]|^{\frac{-1-\lambda_1-\lambda_2+\lambda_3}{2}}\cdot|[v,w]|^{\frac{-1+\lambda_1-\lambda_2-\lambda_3}{2}}\cdot|[u,w]|^{\frac{-1-\lambda_1+\lambda_2-\lambda_3}{2}}, \tag{4.3} \]
for $u, v, w \in \mathbb{R}^2\setminus 0$. The kernel function $K_{\lambda_1,\lambda_2,\lambda_3}(s_1,s_2,s_3)$, $s_i \in \mathbb{R}^2\setminus 0$, satisfies two main properties:
1. $K$ is homogeneous in each $s_j$ of degree $-1-\lambda_j$. Hence if $f_j$ are homogeneous functions of degree $-1+\lambda_j$, the function $f_1(s_1)f_2(s_2)f_3(s_3)K_{\lambda_1,\lambda_2,\lambda_3}(s_1,s_2,s_3)$ is homogeneous of degree $-2$ in each variable $s_j \in \mathbb{R}^2\setminus 0$.
2. $K$ is $G$-invariant: $K_{\lambda_1,\lambda_2,\lambda_3}(gs_1,gs_2,gs_3) = K_{\lambda_1,\lambda_2,\lambda_3}(s_1,s_2,s_3)$ for any $g \in G$.
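Both properties of the kernel can be checked numerically. The following is a minimal sketch, assuming real parameters $\lambda_j \in (0,1)$ so that all powers are real; the sample vectors, the matrix `g` (with determinant one) and the helper names are illustrative choices, not part of the paper.

```python
import math


def bracket(a, b):
    """The G-invariant [a, b] = a1*b2 - a2*b1 of two plane vectors."""
    return a[0] * b[1] - a[1] * b[0]


def kernel(l1, l2, l3, u, v, w):
    """K_{l1,l2,l3}(u, v, w) as in Eq. (4.3), for real l1, l2, l3."""
    return (abs(bracket(u, v)) ** ((-1 - l1 - l2 + l3) / 2)
            * abs(bracket(v, w)) ** ((-1 + l1 - l2 - l3) / 2)
            * abs(bracket(u, w)) ** ((-1 - l1 + l2 - l3) / 2))


def act(g, s):
    """Natural action of a 2x2 matrix g on a plane vector s."""
    (a, b), (c, d) = g
    return (a * s[0] + b * s[1], c * s[0] + d * s[1])


l1, l2, l3 = 0.3, 0.5, 0.7
u, v, w = (1.0, 0.2), (-0.4, 1.3), (0.8, -0.9)
g = ((2.0, 3.0), (1.0, 2.0))      # determinant 2*2 - 3*1 = 1
t = 1.7

base = kernel(l1, l2, l3, u, v, w)
# Property 1: scaling the first argument by t multiplies K by t^(-1 - l1).
scaled = kernel(l1, l2, l3, (t * u[0], t * u[1]), v, w)
# Property 2: applying the same g to all three arguments leaves K unchanged.
moved = kernel(l1, l2, l3, act(g, u), act(g, v), act(g, w))
```

Here `scaled` should agree with `base * t ** (-1 - l1)` and `moved` with `base`, up to rounding. The invariance follows because each bracket is multiplied by $\det g = 1$.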
4.2.2. $l^{geom}$ in a model. To define $l^{geom}$ we note that there is a natural $G$-invariant functional INT on locally integrable, homogeneous functions of degree $-2$ on $\mathbb{R}^2\setminus 0$. It is given by an integral over any closed curve $E \subset \mathbb{R}^2\setminus 0$ around 0 with respect to the measure $d\sigma$ of degree 2 on $E$ given by the area element inside of $E$. By applying INT separately to each variable $s_i$ of the function $f_1(s_1)f_2(s_2)f_3(s_3)K_{\lambda_1,\lambda_2,\lambda_3}(s_1,s_2,s_3)$, we obtain the $G$-invariant functional $l^{geom}_{\pi_1,\pi_2,\pi_3}(f_1\otimes f_2\otimes f_3)$. In particular, taking $E = S^1 \subset \mathbb{R}^2\setminus 0$ and the standard angle measure $ds$ on $S^1$ we obtain an expression for $l^{geom}$ through the following integral:
\[ l^{geom}_{\pi_1,\pi_2,\pi_3}(f_1\otimes f_2\otimes f_3) = \int_{\|s_j\|=1} f_1(s_1)f_2(s_2)f_3(s_3)K_{\lambda_1,\lambda_2,\lambda_3}(s_1,s_2,s_3)\,ds_1\,ds_2\,ds_3. \tag{4.4} \]
Another choice of a curve above would be $E = \{(x,1),\ x \in \mathbb{R}\}$. This choice is well suited for the line model $N_\lambda$.
5. Computations in the Geometric Model

In this section we prove Proposition 3.1.

A convention. We note that class one representations of $SL(2,\mathbb{R})$ (tempered or non-tempered) are selfdual. Hence we can study $l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_\mu}$ instead of $l^{geom}_{\pi_\lambda,\bar\pi_\lambda,\pi_\mu}$. In what follows we consider $l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_\mu}(e_k\otimes e_{-k}\otimes e_0^\mu)$ instead of $l^{geom}_{\pi_\lambda,\bar\pi_\lambda,\pi_\mu}(e_k\otimes e_k\otimes e_0^\mu)$ as we did before.

5.1. Triple product for K-types. Let us fix one irreducible unitary representation $\pi_\lambda$ and the basis of $K$-equivariant vectors $\{e_k\}$ in it. Let $\pi_\mu$ be another such representation and $e_0^\mu$ the $K$-fixed vector in it. We assume, as before, that both representations are of class one. We compute now the value $l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_\mu}(e_k\otimes e_{-k}\otimes e_0^\mu)$ of the trilinear functional defined in Eq. (4.4).

We note first that $e_k$ and $e_{-k}$ are $K$-equivariant and hence $l^{geom}(e_k\otimes e_{-k}\otimes v) = 0$ for any $v$ orthogonal to $e_0^\mu$. Hence it is enough to compute the integral (4.4) with respect to $s_1$ and $s_2$ for any fixed $s_3$ (since $e_0^\mu$ is a constant function of $s_3$). We denote $s_1 = (\cos x, \sin x)$, $s_2 = (\cos y, \sin y)$ with $x, y \in [-\pi, \pi]$ and set $s_3 = (1, 0)$. From the above discussion we see that
\[ l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_\mu}(e_k\otimes e_{-k}\otimes e_0^\mu) = \int |\cos x\sin y - \sin x\cos y|^{-\frac12+\frac{\mu}{2}-\lambda}\,|\sin x|^{-\frac12-\frac{\mu}{2}}\,|\sin y|^{-\frac12-\frac{\mu}{2}}\,e^{ik(x-y)}\,dx\,dy. \tag{5.1} \]
5.2. Geometric bound. The integral appearing in Eq. (5.1) is explicit enough to be estimated by a straightforward calculation (see Appendix). We obtain the following:

Proposition 5.1. For a fixed $\pi_\lambda$ and any $c > 0$ there exists $\sigma > 0$ such that the bound
\[ |l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_\mu}(e_k\otimes e_{-k}\otimes e_0^\mu)| \le A\cdot|\mu|^{\varepsilon}\cdot|k|^{-\sigma+\varepsilon} \]
holds for any $\varepsilon > 0$, any $\mu$ with $\mathrm{Re}(\mu) > c$ and an appropriate $A$ depending on $\varepsilon$.

Remark 5.1. From the proof of the proposition above it follows that the rate of decay $\sigma$ of the trilinear product depends on the temperedness of the representation $\pi_\mu$ (i.e. on $\mathrm{Re}(\mu)$; see Appendix A.2).

6. Automorphic Bound

In this section we obtain the necessary bound on the proportionality coefficient $a_{\pi_1,\pi_2,\pi_3}$ between $l^{geom}_{\pi_1,\pi_2,\pi_3}$ and $l^{aut}_{\pi_1,\pi_2,\pi_3}$ and establish the bound (3.9): $|a_{\pi_{\lambda_1},\pi_{\lambda_2},\pi_{\lambda_i}}| \ll |\lambda_i|^{1+\varepsilon}$ for any $\varepsilon > 0$.

6.1. Supremum norm. Let $N$ be the supremum norm as defined in Eq. (3.8). By the Cauchy–Schwarz inequality we have
\[ |l^{aut}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3)| = \Big|\int_X \phi_{v_1}(x)\phi_{v_2}(x)\phi_{v_3}(x)\,d\mu_X\Big| \le N(v_1)\,\|v_2\|_{P_2}\,\|v_3\|_{P_3}. \]
Therefore,
\[ |a_{\pi_1,\pi_2,\pi_3}| = \left|\frac{l^{aut}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3)}{l^{geom}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3)}\right| \le \inf_{v_j\in V_j}\frac{N(v_1)\,\|v_2\|_{P_2}\,\|v_3\|_{P_3}}{|l^{geom}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3)|}. \tag{6.1} \]
In order to use (6.1) we have to obtain a lower bound on the norm of $l^{geom}$ with respect to the product norm $N\cdot\|\cdot\|\cdot\|\cdot\|$. The supremum norm $N$ is bounded by an appropriate Sobolev norm on $X$ (see [BR]). Using this fact it is easy to obtain a bound $|a_{\pi_1,\pi_2,\pi_3}| \ll |\lambda_3|^2$ for fixed $\pi_1, \pi_2$. However, we can use results of [BR] in order to obtain a stronger bound. This is based on a computation with $K$-fixed vectors.

6.2. Computation of $l^{geom}$ for K-fixed vectors. In order to use Eq. (6.1) we will bound $|l^{geom}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3)|$ from below and $|l^{aut}_{\pi_1,\pi_2,\pi_3}(v_1\otimes v_2\otimes v_3)|$ from above for a particular choice of $v_i$. We do so by an explicit computation with $K$-fixed vectors $e_{\lambda_i} \in V_i$ of $L^2$-norm one.

In order to compute the value $|l^{geom}_{\pi_1,\pi_2,\pi_3}(e_{\lambda_1}\otimes e_{\lambda_2}\otimes e_{\lambda_3})|$ we use the line model $N_\lambda$ from Sect. 4.1.3, which allows one to reduce the computation to standard integrals. In the line model the $K$-fixed vector is given by
\[ e_\lambda(x) = (1+x^2)^{\frac{\lambda-1}{2}}. \tag{6.2} \]
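We use the formula (4.4) for the trilinear functional $l^{geom}$. We choose $E_1 = \{(x,1),\ x \in \mathbb{R}\}$, $E_2 = \{(y,1),\ y \in \mathbb{R}\}$ and $E_3 = \{(1,z),\ z \in \mathbb{R}\}$ (notice the difference between $E_{1,2}$ and $E_3$). Similarly to (5.1), $K$-invariance allows one to reduce the computation to the computation of the integral over $E_1$ and $E_2$ for a fixed $z$. We set $z = 0$ and obtain the following integral:
\[ l^{geom}_{\pi_1,\pi_2,\pi_3}(e_{\lambda_1}\otimes e_{\lambda_2}\otimes e_{\lambda_3}) = \int (1+x^2)^{\frac{\lambda_1-1}{2}}(1+y^2)^{\frac{\lambda_2-1}{2}}\,|x-y|^{\frac{-1-\lambda_1-\lambda_2+\lambda_3}{2}}\,dx\,dy. \tag{6.3} \]
We use the following well-known table integral for the Fourier transform of $e_\lambda$ (see [M], p. 85):
\[ \hat e_\lambda(\xi) = \frac{1}{\Gamma(\frac12-\lambda/2)}\cdot|\xi|^{-\lambda/2}K_{-\lambda/2}(|\xi|), \tag{6.4} \]
where $K_s$ is the $K$-Bessel function. Transforming the integral (6.3) into an integral of convolutions of Fourier transforms (see [Tit]) and using (6.4) we obtain
\[ l^{geom}_{\pi_1,\pi_2,\pi_3}(e_{\lambda_1}\otimes e_{\lambda_2}\otimes e_{\lambda_3}) = \gamma\Big(\tfrac{1-\lambda_1-\lambda_2+\lambda_3}{2}\Big)\cdot\int \hat e_{\lambda_1}(\xi)\,\hat e_{\lambda_2}(\xi)\,|\xi|^{\frac{-1+\lambda_1+\lambda_2-\lambda_3}{2}}\,d\xi \]
\[ = \frac{\gamma\big(\frac{1-\lambda_1-\lambda_2+\lambda_3}{2}\big)}{\Gamma(\frac12-\frac12\lambda_1)\,\Gamma(\frac12-\frac12\lambda_2)}\cdot\int K_{-\lambda_1/2}(|\xi|)\,K_{-\lambda_2/2}(|\xi|)\,|\xi|^{\frac{-1-\lambda_3}{2}}\,d\xi, \tag{6.5} \]
where $\gamma(s) = \pi^{-\frac{s}{2}}\Gamma(\frac{s}{2})\big/\pi^{-\frac{1-s}{2}}\Gamma(\frac{1-s}{2})$. The last integral is given in terms of the $\Gamma$-function (see [M], p. 101). We have
\[ l^{geom}_{\pi_1,\pi_2,\pi_3}(e_{\lambda_1}\otimes e_{\lambda_2}\otimes e_{\lambda_3}) = \frac{\gamma\big(\frac{1-\lambda_1-\lambda_2+\lambda_3}{2}\big)}{\Gamma(\frac12-\frac12\lambda_1)\,\Gamma(\frac12-\frac12\lambda_2)}\times 2^{\frac12(1-\lambda_3)-3}\,\frac{\prod_{i,j=0,1}\Gamma\big(\frac14(1+(-1)^i\lambda_1+(-1)^j\lambda_2-\lambda_3)\big)}{\Gamma(\frac12-\frac12\lambda_3)}. \]
We are interested in the asymptotics of the above expression as $|\lambda_3| \to \infty$, hence we assume here that $\lambda_3$ is purely imaginary (since there are only finitely many exceptional automorphic representations $\pi_\lambda$ with $\lambda$ real). We also assume for simplicity that both $\lambda_1$ and $\lambda_2$ are purely imaginary. Applying the Stirling formula for the asymptotics of the $\Gamma$-function ($|\Gamma(x+iy)| \sim e^{-\pi|y|/2}|y|^{x-1/2}$) we obtain the following bound:

Lemma. For fixed $\pi_1, \pi_2$ there exists $c > 0$ such that
\[ |l^{geom}_{\pi_1,\pi_2,\pi_3}(e_{\lambda_1}\otimes e_{\lambda_2}\otimes e_{\lambda_3})| \ge c\,|\lambda_3|^{-1}\cdot e^{-\frac{\pi}{4}|\lambda_3|}, \tag{6.6} \]
as $|\lambda_3| \to \infty$.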
6.3. A bound on $a_{\pi,\pi,\pi_i}$. In [BR] it is proved that for fixed $\pi_1, \pi_2$ there exists $C > 0$ such that
\[ |l^{aut}_{\pi_1,\pi_2,\pi_3}(e_{\lambda_1}\otimes e_{\lambda_2}\otimes e_{\lambda_3})| \le C\cdot\ln^{\frac32}|\lambda_3|\cdot e^{-\frac{\pi}{4}|\lambda_3|}. \]
Comparing this with the bound Eq. (6.6) we obtain

Proposition 6.1. For fixed $\pi_{\lambda_1}, \pi_{\lambda_2}$ and for any $\varepsilon > 0$ there exists $C_\varepsilon > 0$ independent of $\lambda_i$ such that $|a_{\pi_{\lambda_1},\pi_{\lambda_2},\pi_{\lambda_i}}| \le C_\varepsilon|\lambda_i|^{1+\varepsilon}$ as $|\lambda_i| \to \infty$.

6.4. A Conjecture. We would like to make the following conjecture concerning the size of the coefficients $a_{\pi_1,\pi_2,\pi_i}$ above:

Conjecture. For fixed $\pi_{\lambda_1}, \pi_{\lambda_2}$ and for any $\varepsilon > 0$ there exists $C_\varepsilon > 0$ independent of $\lambda_i$ such that $|a_{\pi_{\lambda_1},\pi_{\lambda_2},\pi_{\lambda_i}}| \le C_\varepsilon|\lambda_i|^{\varepsilon}$ as $|\lambda_i| \to \infty$.

This conjecture is consistent with the Lindelöf conjecture for some $L$-functions (see [BR] and [S] for more details). We note that the bound in the proposition above corresponds to the so-called convexity bound.

A. Appendix

Here we prove Proposition 5.1. We deal first with tempered representations (i.e. satisfying $\mathrm{Re}(\lambda_i) = 0$) and then we indicate what should be changed in the proof for non-tempered representations.

A.1. Tempered representations. Let $\pi_\lambda$ and $\pi_\tau$ be as in 4.1 with $\mathrm{Re}(\lambda) = 0$ and $\mathrm{Re}(\tau) = 0$, and let $l^{geom}$ be defined as in 4.2. We have the following

Proposition A.1. For a fixed $\pi_\lambda$ and any $\varepsilon > 0$ there exists $A \ge 0$ such that
\[ |l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_\tau}(e_k\otimes e_{-k}\otimes e_0^\tau)| \le A\cdot|\tau|^{\varepsilon}\cdot|k|^{-\frac12+\varepsilon}, \]
uniformly in $\tau$ as $|k| \to \infty$.

Proof. We denote $\tau = 2\mu$ for convenience. We have shown in Eq. (5.1) that the following relation holds:
\[ l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_{2\mu}}(e_k\otimes e_{-k}\otimes e_0^{2\mu}) = \int |\cos x\sin y - \sin x\cos y|^{-\frac12+\mu-\lambda}\,|\sin x|^{-\frac12-\mu}\cdot|\sin y|^{-\frac12-\mu}\,e^{ik(x-y)}\,dx\,dy \]
\[ = \int |\sin(x-y)|^{-\frac12+\mu-\lambda}\,|\sin x|^{-\frac12-\mu}\,|\sin y|^{-\frac12-\mu}\,e^{ik(x-y)}\,dx\,dy. \tag{A.1} \]
Equidistribution
265
Since all functions in (A.1) are $2\pi$-periodic, the integral is taken over the torus $T^2 = S^1 \times S^1$. Let $\theta$ and $\eta$ be new coordinates on $T^2$ obtained from the coordinates $\frac12(x+y)$ and $\frac12(x-y)$ on the plane. We can rewrite (A.1) as follows:
$$\iint |\sin(x-y)|^{-\frac12+\mu-\lambda}\, |\sin x|^{-\frac12-\mu}\, |\sin y|^{-\frac12-\mu}\, e^{ik(x-y)}\, dx\, dy$$
$$= \frac12 \int |\sin(2\theta)|^{-\frac12+\mu-\lambda} \left( \int |\sin(\theta+\eta)|^{-\frac12-\mu}\, |\sin(\theta-\eta)|^{-\frac12-\mu}\, d\eta \right) e^{ik(2\theta)}\, d\theta. \tag{A.2}$$
We denote by $h_\nu(x) = |\sin x|^{-\frac12+\nu}$ a function on $[0, 2\pi]$. The integral (A.2) is equal to the $-k$th Fourier coefficient of the following function:
$$h_{\mu-\lambda}(2\theta) \cdot (h_{-\mu} * h_{-\mu})(\theta), \tag{A.3}$$
where $*$ means the convolution on $S^1$, which we identify with $[0, 2\pi]$. We note that the integrand in (A.1) has $\pi$ as the period and hence it is enough to consider the integral over $[0, \pi]$ (in particular all odd Fourier coefficients are zero). We compute Fourier coefficients of $h_\nu(x)$ first. Proposition A.1 will follow from the standard facts about Fourier coefficients for convolutions and multiplications of functions. We invoke the following formula for Fourier coefficients $\hat h_\nu(\xi)$, $\xi \in \mathbb{Z}$, of $h_\nu$, which is due to Ramanujan (see [M], p. 8):
$$\hat h_\nu(\xi) = \int_0^{\pi} |\sin t|^{a} e^{i\xi t}\, dt = \frac{\pi\, 2^{-a}\, \Gamma(1+a)}{\Gamma(\frac12 a + \frac12 \xi + 1)\, \Gamma(\frac12 a - \frac12 \xi + 1)} \cdot e^{\frac12 i\pi\xi}, \tag{A.4}$$
with $a = -\frac12 + \nu$. Using the functional equation for the $\Gamma$-function we obtain
$$|\hat h_\nu(\xi)| = \left| \frac{2^{\frac12} \sin\!\big(\pi(\tfrac34 - \tfrac12\xi + \tfrac12\nu)\big)\, \Gamma(\tfrac12 - \nu)\, \Gamma(\tfrac14 + \tfrac12\xi + \tfrac12\nu)}{\pi\, \Gamma(\tfrac34 + \tfrac12\xi + \tfrac12\nu)} \right|.$$
We now use the well-known asymptotics for the ratio of $\Gamma$-functions (see [M], p. 12)
$$\frac{\Gamma(z+a)}{\Gamma(z+b)} = z^{(a-b)}\big(1 + O(|z|^{-1})\big), \tag{A.5}$$
for $a, b$ fixed and $|\arg z| < \pi$, and the Stirling formula in order to conclude that
$$|\hat h_\nu(\xi)| \ll (|\nu|^2 + |\xi|^2)^{-\frac14}. \tag{A.6}$$
The bound (A.6) implies that $|(h_{-\mu} * h_{-\mu})\hat{\ }(\xi)| \ll (|\mu|^2 + |\xi|^2)^{-\frac12}$, $\xi \in \mathbb{Z}$. Finally, we see that for a fixed $\lambda$,
$$|(h_{\mu-\lambda} \cdot h_{-\mu} * h_{-\mu})\hat{\ }(\xi)| = \Big| \sum_{\eta \in \mathbb{Z}} \hat h_{\mu-\lambda}(\xi - \eta)\, (h_{-\mu} * h_{-\mu})\hat{\ }(\eta) \Big|$$
$$\le \sum_{\eta \in \mathbb{Z}} (|\mu-\lambda|^2 + |\eta|^2)^{-\frac14}\, (|\mu|^2 + |\xi-\eta|^2)^{-\frac12} \ll |\mu|^{\varepsilon} |\xi|^{-\frac12+\varepsilon},$$
for any ε > 0 uniformly as |µ| → ∞. Setting ξ = k/2 we obtain Proposition A.1.
Remark. One can obtain the same result by a straightforward application of the stationary phase method directly to the integral (A.1).
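The closed form (A.4) and the decay rate (A.6) can be checked numerically in simple real cases. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the exponent $a$ is taken real (so the tempered case of purely imaginary $\nu \ne 0$ is not covered), and the comparison integral uses a plain midpoint rule with $a = 1/2$ so that the integrand is bounded.

```python
import cmath
import math


def ramanujan_coefficient(a, xi):
    """Closed form (A.4) for the integral of |sin t|^a * exp(i*xi*t) over [0, pi].

    Assumes real a > -1 and integer xi such that neither Gamma argument in the
    denominator is a non-positive integer (math.gamma raises ValueError there).
    """
    denominator = math.gamma(0.5 * a + 0.5 * xi + 1.0) * math.gamma(0.5 * a - 0.5 * xi + 1.0)
    return math.pi * 2.0 ** (-a) * math.gamma(1.0 + a) / denominator * cmath.exp(0.5j * math.pi * xi)


def midpoint_coefficient(a, xi, n=50000):
    """Midpoint-rule approximation of the same integral (hypothetical helper)."""
    h = math.pi / n
    total = 0j
    for j in range(n):
        t = (j + 0.5) * h
        total += math.sin(t) ** a * cmath.exp(1j * xi * t)
    return total * h


def scaled_modulus(xi):
    """|h_0^(xi)| * sqrt(xi) for nu = 0, i.e. a = -1/2; (A.6) predicts this stays bounded."""
    return abs(ramanujan_coefficient(-0.5, xi)) * math.sqrt(xi)
```

For real $\nu = 0$ the quantity `scaled_modulus(xi)` should level off as `xi` grows, in line with the $|\xi|^{-1/2}$ decay in (A.6); `xi` has to stay moderate (a few hundred at most) because `math.gamma` overflows for large arguments.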
A.2. Non-tempered representations. Here we deal with the case when one or all $\lambda_i$ are not purely imaginary. First we note that there are only finitely many such representations (those correspond to the spectrum of the Laplacian below $\frac14$). We have the following

Proposition A.2. Let $\sigma \in [0, 1)$. Then for a fixed $\pi_\lambda$ and any $\varepsilon > 0$ there exists $A > 0$ such that
$$|l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_\sigma}(e_k \otimes e_{-k} \otimes e_0^{\sigma})| \le A \cdot |k|^{\frac{-1+\sigma}{2}+\varepsilon}, \quad \text{as } |k| \to \infty.$$

Proof. The proof goes along the same lines as in the tempered case. We want to estimate $l^{geom}_{\pi_\lambda,\pi_\lambda,\pi_\sigma}(e_k \otimes e_{-k} \otimes e_0^{\sigma})$. The $K$-fixed vector $e_0^{\sigma}$ is the constant function as before (although the constant is not 1 and it depends on $\sigma$). Hence, as for the tempered representations, we have to compute the integral of the kernel of $l^{geom}$,
$$|\cos x \sin y - \sin x \cos y|^{-\frac12+\frac12\sigma-\lambda}\, |\sin x|^{-\frac12-\frac12\sigma}\, |\sin y|^{-\frac12-\frac12\sigma},$$
against the functions $e_k(x)$, $e_{-k}(y)$. However, the functions $e_k$, $e_{-k}$ are exponentials, $e^{\pm ikx}$, only up to a constant which now depends on $k$ and $r = \mathrm{Re}(\lambda)$. Namely, it follows from the formula for the scalar product given in Eq. (4.2) that $e_k(x) = c_k \cdot e^{ikx}$, where $c_k = |k|^{\frac r2} + O(1)$ (see [G5]). Hence we have to estimate the following quantity:
$$|k|^{r} \cdot \iint |\sin(x-y)|^{-\frac12+\frac12\sigma-\lambda}\, |\sin x|^{-\frac12-\frac12\sigma}\, |\sin y|^{-\frac12-\frac12\sigma}\, e^{ik(x-y)}\, dx\, dy.$$
This is done as in the tempered case.
Acknowledgements. This paper is a part of a joint project with Joseph Bernstein. I would like to thank him for fruitful discussions and for generous support. I also would like to thank Peter Sarnak for his interest in my work and fruitful suggestions. It is a pleasure to thank Leonid Polterovich for remarks which led to an improvement of the exposition. Special thanks to my daughter Masha for her assistance. Research was partially supported by EC TMR network “Algebraic Lie Representations”, grant no. ERB FMRX-CT97-0100 and by the Israel Science Foundation, the Emmy Noether Institute for Mathematics and the Minerva Foundation of Germany.
References
[B] Bump, D.: Automorphic Forms and Representations. Cambridge: Cambridge University Press, 1997
[BR] Bernstein, J., Reznikov, A.: Analytic continuation of representations. Ann. Math. 150, 329–352 (1999)
[CV] Colin de Verdière, Y.: Ergodicité et fonctions propres du Laplacien. Commun. Math. Phys. 102, 497–502 (1985)
[G5] Gelfand, I., Graev, M., Vilenkin, N.: Generalized Functions. Vol. 5, London–New York: Academic Press, 1966
[G6] Gelfand, I., Graev, M., Piatetski-Shapiro, I.: Representation Theory and Automorphic Forms. Philadelphia–London: Saunders, 1969
[GS] Guillemin, V., Sternberg, S.: Homogeneous quantization. J. Funct. Anal. 47, 344–380 (1982)
[GU] Guillemin, V., Uribe, A.: Circular symmetry and the trace formula. Invent. Math. 96, 385–423 (1989)
[HK] Harris, M., Kudla, S.: The central critical value. Ann. of Math. 133, 605–672 (1991)
[L] Lang, S.: SL(2, R). Berlin–Heidelberg–New York: Springer, GTM105, 1985
[M] Magnus, W. et al.: Formulas and Theorems for the Special Functions. Berlin–Heidelberg–New York: Springer, 1966
[O] Oksak, A.: Trilinear Lorenz invariant forms. Commun. Math. Phys. 29, 189–217 (1973)
[P] Prasad, D.: Trilinear forms for representations of GL(2). Compositio Math. 75, 1–46 (1990)
[S] Sarnak, P.: Arithmetical Quantum Chaos. Israeli Math. Conference Proc., 1995
[Sh1] Shnirelman, A.: Ergodic properties of eigenfunctions. Uspekhi Mat. Nauk 29, 181–182 (1974)
[Sh2] Shnirelman, A.: On the asymptotic properties of eigenfunctions. In: V. Lazutkin, KAM theory and semiclassical approximations to eigenfunctions. Berlin–Heidelberg–New York: Springer, 1993
[ST] Schrader, R., Taylor, M.: Semiclassical asymptotics, gauge fields, and quantum chaos. J. Funct. Anal. 83, no. 2, 258–316 (1989)
[Su] Sunada, T.: Quantum ergodicity. In: Progress in inverse spectral geometry, Basel–Boston: Birkhäuser, 1997, pp. 175–196
[T] Tate, T.: Quantum ergodicity at a finite level. J. Math. Soc. Japan 51, 4, 867–885 (1999)
[Tit] Titchmarsh, E.C.: Fourier integrals. Oxford: Oxford University Press, 1948
[Z1] Zelditch, S.: Pseudo-differential analysis on compact hyperbolic surfaces. J. Funct. Anal. 68, 72–105 (1986)
[Z2] Zelditch, S.: Uniform distribution of eigenfunctions. Duke Math. J. 55, 919–941 (1987)
[Z3] Zelditch, S.: On a quantum chaos theorem. J. Funct. Anal. 109, 1–21 (1992)

Communicated by P. Sarnak
Commun. Math. Phys. 222, 269 – 292 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Universal Scalings in Homoclinic Doubling Cascades
Ale Jan Homburg1, Todd Young2
1 KdV Institute for Mathematics, University of Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands
2 Department of Mathematics, Ohio University, Athens, OH 45701, USA
Received: 16 June 1999 / Accepted: 24 April 2001
Abstract: Cascades of period doubling bifurcations are found in one parameter families of differential equations in $\mathbb{R}^3$. When varying a second parameter, the periodic orbits in the period doubling cascade can disappear in homoclinic bifurcations. In one of the possible scenarios one finds cascades of homoclinic doubling bifurcations. Relevant aspects of this scenario can be understood from a study of interval maps close to $x \mapsto p + r(1 - x^{\beta})^2$, $\beta \in (\frac12, 1)$. We study a renormalization operator for such maps. For values of $\beta$ close to $\frac12$, we prove the existence of a fixed point of the renormalization operator, whose linearization at the fixed point has two unstable eigenvalues. This is in marked contrast to renormalization theory for period doubling cascades, where one unstable eigenvalue appears. From the renormalization theory we derive consequences for universal scalings in the bifurcation diagrams in the parameter plane.

1. Introduction

Cascades of period doubling bifurcations form one of the most intriguing ways to go from simple to chaotic dynamics on paths of differential equations. Their main characteristics are independent of the details of the differential equations; renormalization theory predicts universal scalings in the bifurcation diagram. A similar scenario, albeit with only saddle periodic orbits, exists with cascades of homoclinic bifurcations [12]. In two parameter families of differential equations, these two scenarios can be joined through cascades of homoclinic doubling bifurcations (a homoclinic doubling bifurcation is a codimension two homoclinic bifurcation, for which in the parameter plane a curve of doubled homoclinic orbits branches from a curve of primary homoclinic orbits). Homoclinic doubling cascades were described and shown to occur persistently in a class of two parameter families of three dimensional vector fields in [14], following numerical investigations in [18].
The analogy with period doubling cascades was pointed out, in particular by establishing a relation to one dimensional dynamics. All of this naturally leads to the question of universal scalings as observed in period doubling cascades. Using
renormalization theory, we will discuss universal scalings in families of one dimensional model maps appearing in the study of cascades of homoclinic doubling bifurcations in vector fields.

In the next section we recall some of the theory of homoclinic doubling bifurcations and explain how interval maps arise in the study of cascades of homoclinic doubling bifurcations. We indicate how interval maps of the form $x \mapsto p + r(1 - x^{\beta})^2$ with $x \ge 0$ and $\frac12 < \beta < 1$ occur in the bifurcation study of a codimension three homoclinic bifurcation, an 'inclination flip at resonance' [15]. Here $p$ and $r$ are two parameters. Observe that the interval maps are unimodal with a unique minimum at $x = 1$. They also have infinite derivative at $x = 0$.

In Sects. 3 and 4 we consider universal scalings in the bifurcation diagram of families near the one dimensional model family $x \mapsto p + r(1 - x^{\beta})^2$ for values of $\beta$ slightly larger than $\frac12$. The renormalization analysis we perform provides evidence that the bifurcation diagram of such families is schematically as depicted in Fig. 2 below, and establishes scaling structures in the bifurcation diagram. In Sect. 3 we will define renormalization, and state and prove the renormalization Theorem 3.2. In Sect. 4 we discuss the consequences of the renormalization theorem for universal scalings in the bifurcation diagram (see Theorem 4.1). We illustrate the results by numerically computed bifurcation diagrams.

We mention that numerical experiments on families of differential equations possessing homoclinic doubling cascades have been conducted in [23, 24]. They found and studied cascades of homoclinic doubling bifurcations in the following system, depending on parameters $(\mu_1, \mu_2)$:
$$\dot x = \tfrac18 x + \tfrac78 y - \tfrac18 x^2 - \mu_1 z x (2 - 3x),$$
$$\dot y = \tfrac78 x + \tfrac18 y - \tfrac{21}{16} x^2 - \tfrac{3}{16} x y + 2\mu_1 y z,$$
$$\dot z = -2z + \mu_2 x + 3xz + \mu_1 \big(x^2(1 - x) - y^2\big).$$
Their numerical study of scaling properties is in accordance with our theory.

2. Homoclinic Doubling Cascades

In this section we review the bifurcation theory of homoclinic doubling cascades. We recall the homoclinic doubling bifurcation and explain how cascades of such bifurcations are formed. This section is meant to introduce the reader to the background for the renormalization theory studied in this paper. We point out how interval maps appear in the study of homoclinic doubling cascades as a result of singular rescalings of first return maps, see Proposition 2.4. The material here is not used elsewhere in the paper. For details we refer to [14, 15], while a more extensive overview can be found in [13].

Consider a system
$$\dot x = X_\mu(x), \tag{1}$$
x ∈ R3 , of differential equations depending on parameters µ ∈ Rk . Suppose the origin 0 is an equilibrium of saddle type for Xµ for all µ. We will assume that DXµ (0) has two real negative eigenvalues λss < λs and one real positive eigenvalue λu . A homoclinic
orbit of $X_\mu$ is a solution $\{\gamma(t);\ t \in \mathbb{R}\}$ of $X_\mu$ that converges to 0 as $t \to \pm\infty$. With $\lambda^{ss}, \lambda^{s}, \lambda^{u}$ as above, denote $\alpha = -\lambda^{ss}/\lambda^{u}$, $\beta = -\lambda^{s}/\lambda^{u}$. Note that $\alpha$ and $\beta$ are invariant under time reparametrizations, in contrast to the eigenvalues themselves. Observe that $\alpha > \beta > 0$.

Suppose $\mu$ is a homoclinic bifurcation value of the family $\{X_\mu\}$, meaning that $X_\mu$ has a homoclinic orbit. This homoclinic orbit is an inclination flip homoclinic orbit if
(G1) $\beta \ne 1$,
(G2) the homoclinic orbit is not contained in $W^{ss}(0)$, and
(G3) $W^{s,u}(0)$ is tangent to $W^{ss,s}(0)$ along the homoclinic orbit.
Here $W^{ss}(0)$ denotes the one-dimensional strong stable manifold of 0 and $W^{s,u}(0)$ is a two dimensional center unstable manifold of 0; see also [12].

Consider a system of differential equations $\{X_\mu\}$ as in (1) depending on $\mu \in \mathbb{R}^2$. Suppose that $X_{\bar\mu}$ possesses an inclination flip homoclinic orbit and that at $\mu = \bar\mu$, $\frac12 < \beta < 1$ and $\alpha > 1$. We will assume moreover that $\alpha > 2\beta$. Assume that $W^{s,u}(0)$ has a quadratic tangency with $W^{ss,s}(0)$ along the homoclinic orbit of $X_{\bar\mu}$. Take a cross section transverse to this homoclinic orbit. Following [15], there are coordinates $(x_{ss}, x_u)$ on the cross section so that for $\mu$ near $\bar\mu$, the first return map on the cross section has the following asymptotics:
$$(x_{ss}, x_u) \mapsto \begin{pmatrix} Q + A x_u^{\beta} + B x_u^{2\beta} + O(x_u^{2\beta+\omega}) \\ \mu_2 + \mu_1 x_u^{\beta} + D x_u^{2\beta} + O(|\mu_1| x_u^{\beta+\omega} + x_u^{2\beta+\omega}) \end{pmatrix} \tag{2}$$
for some $\omega > 0$. The coefficients $\mu_1, \mu_2, Q, A, B, D$ depend smoothly on $\mu$; the higher order terms can be differentiated for $x_u > 0$.

Definition 2.1. A homoclinic orbit of $X_\mu$ that intersects the cross section in $k$ points is called a $k$-homoclinic orbit. A periodic orbit of $X_\mu$ that intersects the cross section in $k$ points is called a $k$-periodic orbit.
Fig. 1.
Theorem 2.2 ([17, 14]). Let $\{X_\mu\}$, $\mu$ near $\bar\mu$, be as above. Suppose $\mu \mapsto (\mu_1, \mu_2)$ is a local diffeomorphism. Then in the $(\mu_1, \mu_2)$ plane, the bifurcation diagram is as depicted in Fig. 1. $\{\mu_2 = 0\}$ is a curve of 1-homoclinic orbits from which a curve of 2-homoclinic orbits branches at $(\mu_1, \mu_2) = (0, 0)$. Also branching at $(\mu_1, \mu_2) = (0, 0)$ are a curve of saddle node bifurcations of 1-periodic orbits, dotted in Fig. 1, and a curve of period doubling bifurcations of 1-periodic orbits, dashed in Fig. 1.
The above bifurcation is called the homoclinic doubling bifurcation. Homoclinic doubling bifurcations occur in a cascade when, following a curve of doubled homoclinic orbits, another homoclinic doubling bifurcation occurs. It was established in [14] that such cascades occur persistently in two parameter families of differential equations. We will present a construction from [15] where the existence of homoclinic doubling cascades was established near the unfolding of a codimension three homoclinic bifurcation, an inclination flip with eigenvalues at resonance.

Consider a system of differential equations $\{X_\mu\}$ as in (1), now depending on $\mu \in \mathbb{R}^3$, so that $X_{\bar\mu}$ possesses an inclination flip homoclinic orbit and at $\mu = \bar\mu$, $\beta = \frac12$ and $\alpha > 1$. Assume that $W^{s,u}(0)$ has a quadratic tangency with $W^{ss,s}(0)$ along the homoclinic orbit of $X_{\bar\mu}$. Take a cross section transverse to this homoclinic orbit. Recall that there are coordinates $(x_{ss}, x_u)$ on the cross section so that for $\mu$ near $\bar\mu$, the first return map is given by (2). This defines $\mu_1$ and $\mu_2$. Write
$$\mu_3 = 2 - \frac{1}{\beta}.$$
Note that $\mu_3 > 0$ iff $\beta > \frac12$.

Theorem 2.3 ([14, 15]). Let $\{X_\mu\}$, $\mu$ near $\bar\mu$, be as above. Suppose $\mu \mapsto (\mu_1, \mu_2, \mu_3)$ is a local diffeomorphism. If $D > 1$, then for $\mu_3 > 0$ fixed small, the two parameter family $Y_{\mu_1,\mu_2} = X_\mu$ possesses a connected set of homoclinic bifurcations in the $(\mu_1, \mu_2)$-parameter plane containing a cascade $(\mu_1^n, \mu_2^n)$ of homoclinic doubling bifurcations in which a $2^n$-homoclinic orbit is created.

2.1. Singular rescalings to interval maps. A rescaling brings a first return map to a map which is a small perturbation of an interval map. Let the first return map be as in (2) and define rescaled coordinates $(\hat x_{ss}, \hat x_u)$ by
$$x_{ss} - Q = |\mu_1|\, \hat x_{ss}, \qquad x_u = \left(\frac{\mu_1}{2D}\right)^{1/\beta} \hat x_u.$$
The following proposition, which gives expansions for the first return map in rescaled coordinates, is proved by a direct computation.

Proposition 2.4. Consider the first return map in the rescaled coordinates $(\hat x_{ss}, \hat x_u)$. Write $r = |\mu_1|^{\mu_3}/4D$ and $p = r(4D\mu_2|\mu_1|^{-2} - 1)$. Then, for some $\omega > 0$, the rescaled first return map is
$$(\hat x_{ss}, \hat x_u) \mapsto \begin{pmatrix} \frac{A}{2D} \hat x_u^{\beta} + O(|\mu_1|^{\omega} \hat x_u^{\beta}) \\ p + r(1 - \hat x_u^{\beta})^2 + O(|\mu_1|^{\omega} \hat x_u^{\beta}) \end{pmatrix}.$$
As $\mu_1 \to 0$, restricting $\hat x_u$ to a compact interval and parameters $(\mu_1, \mu_2, \mu_3)$ to a chart near the origin on which $|p|, |r|$ are bounded, the rescaled first return map converges to
$$\hat x_u \mapsto \begin{pmatrix} \frac{A}{2D} \hat x_u^{\beta} \\ p + r(1 - \hat x_u^{\beta})^2 \end{pmatrix}. \tag{3}$$
Hence, the rescaled first return map is essentially one dimensional; the interval map obtained by ignoring the first coordinate is unimodal. Note that homoclinic orbits for the vector field correspond
to periodic orbits through 0 for the interval map. Homoclinic doubling bifurcations correspond to periodic orbits that contain both 0 and the critical point. Compared to a strongly dissipative Hénon map, which is a singular perturbation of the one dimensional quadratic map $(x, y) \mapsto (1 - ax^2, 0)$, we obtain one dimensional maps which are also unimodal but not differentiable at $x = 0$ (and defined only if $x \ge 0$). We remark that the interplay in the dynamics of a point with infinite slope at $x = 0$ and a critical point at $x = 1$ has a profound effect on the dynamics and bifurcations.

In the following sections we consider interval maps as above and small perturbations thereof. For these interval maps we build a renormalization theory. We will not consider a renormalization for the two dimensional return maps. For this one could attempt reasoning as in [2], where renormalization for dissipative Hénon maps is considered.

Fig. 2. Schematic impression of the expected bifurcation set of the rescaled first return map in the parameter plane $\{(r, p) \in \mathbb{R}^2\}$. Curves are labelled SNk (saddle-node bifurcations of k-periodic orbits), PDk (period-doubling bifurcations of k-periodic orbits) and Hk (homoclinic bifurcations of k-homoclinic orbits). The thick dots indicate three subsequent homoclinic doubling bifurcations. Choosing parameters on a line in the right hand side, one encounters a period doubling cascade as well as a cascade of homoclinic bifurcations of $2^n$-homoclinic orbits, a phenomenon described in [12]. These two scenarios are joined through homoclinic doubling bifurcations.

3. The Renormalization Operator

For $0 < \sigma < 1$, let $\mathcal{C}^{2+\sigma}$ denote the set of functions $x \mapsto f(x^{\beta})$, defined on a compact interval $I = [0, L]$, $L > 1$, with $u \mapsto f(u)$ a unimodal $C^{2+\sigma}$ function ($f''$ is $\sigma$-Hölder with bounded Hölder constant) with a unique quadratic minimum at $x = 1$. We can write a function $f \in \mathcal{C}^{2+\sigma}$ as
$$f(x) = p + r(x^{\beta})\big(1 - x^{\beta}\big)^2,$$
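where $p$ is a real number and $u \mapsto r(u)$ is a positive $C^{2+\sigma}$ function. We will be interested in such functions for which $\beta$ is close to $\frac12$, and for which $p$ is small and $r$ is close to 1 on a compact interval. We will write $r(u) = 1 + \varepsilon(u)$. If $f(x) = g(x^{\beta})$, let
$$\|f\|_{2+\sigma} = \max\left\{ \sup_{u = x^{\beta},\ x \in I} |g(u)|,\ \sup_{u = x^{\beta},\ x \in I} |g'(u)|,\ \sup_{u = x^{\beta},\ x \in I} |g''(u)|,\ \sup_{x^{\beta} = u \ne v = y^{\beta},\ x, y \in I} \frac{|g''(u) - g''(v)|}{|u - v|^{\sigma}} \right\}.$$
Equipped with the norm $\|\cdot\|_{2+\sigma}$, $\mathcal{C}^{2+\sigma}$ is a cone in a Banach space.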
Fig. 3. The graphs of $f$ and $f^2$ for $f(x) = r(1 - x^{\beta})^2$ with $\beta = 0.9$ and $r = 1.413178$ (axes $x$ and $y$, both ranging over $[0, 2]$)
The renormalization $R(f)$ of $f$ is a rescaling of the second iterate $f^2$. It will be defined on an open subset of $\mathcal{C}^{2+\sigma}$, given by the following conditions.
• $f(0) > 1$,
• $f(0) \le L$,
• $f(1) < 1$,
• $f(aL) < 1$, where $a \in (0, 1)$ satisfies $f(a) = 1$.
The inequalities $f(0) > 1$ and $f(1) < 1$ imply that there is a point $a \in (0, 1)$ for which $f(a) = 1$. Let
$$R(f)(x) = \frac{1}{a} f^2(ax).$$
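The definition of $R$ can be made concrete for the model family. The sketch below is an illustration only, with hypothetical function names: it takes $r$ constant (so $\varepsilon$ is constant), finds the rescaling factor $a$ by bisection on $(0, 1)$, where $f$ is decreasing, and evaluates $R(f)$ for the parameters quoted in the caption of Fig. 3 with $p = 0$.

```python
def model_map(p, r, beta):
    """Model interval map x -> p + r*(1 - x**beta)**2, defined for x >= 0."""
    def f(x):
        return p + r * (1.0 - x ** beta) ** 2
    return f


def rescaling_factor(f, tol=1e-14):
    """Solve f(a) = 1 on (0, 1) by bisection.

    Assumes f(0) > 1 > f(1) and that f is decreasing on [0, 1],
    which holds for the model map above when r > 0.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def renormalize(f):
    """Return the pair (R(f), a) with R(f)(x) = f(f(a*x)) / a."""
    a = rescaling_factor(f)
    return (lambda x: f(f(a * x)) / a), a


# Parameters from the caption of Fig. 3; p = 0 is the critical-value-zero case.
f = model_map(0.0, 1.413178, 0.9)
Rf, a = renormalize(f)
```

Since $f(a) = 1$ is the critical point of $f$, the point $x = 1$ is a critical point of `Rf`, and for $p = 0$ the critical value of `Rf` is again 0.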
The condition $f(0) > 1$ is equivalent to $p > -\varepsilon(0)$. The rescaling factor $a$ is such that $R(f)$ has its critical point at $x = 1$, so that $R(f) \in \mathcal{C}^{2+\sigma}$.

Remark 3.1. Observe that we rescale the second iterate $f^2$ near $x = 0$, and not near the critical point $x = 1$ as is customary in renormalization theory for smooth unimodal maps. We rescale near $x = 0$ since for $p < 0$ points in the immediate vicinity of the critical point are mapped into $\{x < 0\}$, where $f$ is not defined. For $p = 0$ one can of course define a renormalization operator by rescaling $f^2$ around $x = 1$, compare [21].

Define the following sets of functions:
$$H_k = \{f \in \mathcal{C}^{2+\sigma} : f^k(0) = 0 \text{ with } k \text{ minimal}\},$$
$$HD_k = \{f \in \mathcal{C}^{2+\sigma} : f(1) = 0,\ f^k(0) = 0 \text{ with } k \text{ minimal}\},$$
$$S_k = \{f \in \mathcal{C}^{2+\sigma} : f^k(1) = 1 \text{ with } k \text{ minimal}\}.$$
Note that for $f \in HD_k$, the periodic orbit $O(0)$ goes through the critical point. For $f \in S_k$, the critical point is periodic; in other words, $f$ possesses a superstable periodic orbit.

The following theorem provides an isolated fixed point of the renormalization operator $R$ and gives further properties relevant to universal scalings in the bifurcation diagram, as will be discussed below. The proof of the theorem is contained in the next sections.

Theorem 3.2. For $\beta > \frac12$ and $\beta - \frac12$ small, the renormalization operator $R$ on $\mathcal{C}^{2+\sigma}$ possesses an isolated fixed point $\phi$. The function $\phi$ depends continuously on $\beta$ and converges to $x \mapsto (1 - \sqrt{x})^2$ as $\beta \to \frac12$. The linearization $DR$ at $\phi$ has two unstable eigenvalues $\delta_1, \delta_2$. They depend continuously on $\beta$ and satisfy $\delta_1 \to 2$, $\delta_2 \to \infty$ as $\beta \to \frac12$. The remainder of the spectrum of $DR(\phi)$ is strictly inside the unit disc; the spectral projection to the eigenspace of the unstable eigenvalues is continuous. The unstable manifold of $R$ is 2 dimensional. It intersects transversally the manifolds $HD_2$, $H_2$ and $S_2$, defined above. The one dimensional strong unstable manifold intersects $H_2$ and $S_2$ transversally.

In the following we will prove the above theorem. In Sect. 4 we discuss consequences for universal scalings in bifurcation diagrams.

3.1. Existence of a fixed point for real analytic functions. In this section and the next section we start proving Theorem 3.2 by establishing the existence of a fixed point of $R$. In the following sections we will prove hyperbolicity of the fixed point, identify the two unstable eigenvalues, and prove the transversality statements in Theorem 3.2.

We consider first real analytic functions of the argument $x^{\beta}$. The results will be used in the next section for the study of $C^{2+\sigma}$ functions. Let $\mathcal{C}^{\omega}$ denote the set of functions $x \mapsto f(x)$, defined on a compact interval $[0, L]$, that are of the form $f(x) = g(x^{\beta})$ for a real analytic function $u \mapsto g(u)$, and that are unimodal with a unique quadratic minimum at $x = 1$. We require that $g$ extends to a bounded analytic function on a fixed complex neighborhood of $[0, L^{\beta}]$. For $f \in \mathcal{C}^{\omega}$ with $f(x) = g(x^{\beta})$, let $\|f\|_0 = \sup |g(u)|$, the supremum being taken over $u$ in this complex neighborhood.
Equipped with this norm, Cω is a cone in a Banach space.
Fig. 4. Bifurcation surfaces in the function space $\mathcal{C}^{2+\sigma}$ (surfaces labelled $S_2$, $S_4$, $H_2$, $H_4$, $HD_2$, $HD_4$, together with $\mathcal{C}_0^{2+\sigma}$ and the fixed point $\phi$)
It is important to notice that $R$ maps the set of functions in $\mathcal{C}^{\omega}$ with critical value 0 into itself. Denote by $\mathcal{C}_0^{\omega}$ this set of functions, i.e. $\mathcal{C}_0^{\omega}$ consists of functions on a compact interval $[0, L]$ that can be written $r(x^{\beta})(1 - x^{\beta})^2$ with $u \mapsto r(u)$ a bounded and positive analytic function on the fixed complex neighborhood of $[0, L^{\beta}]$. By a coordinate change $u = x^{\beta}$, we obtain a set of functions $\tilde{\mathcal{C}}^{\omega}$ of the form $\tilde r(u)(1 - u)^{2\beta}$ with $u \mapsto \tilde r(u)$ real analytic. The renormalization operator $R$ induces an operator $\tilde R$ on $\tilde{\mathcal{C}}^{\omega}$. This operator is reminiscent of the renormalization operator studied in [3] for $2\beta$ somewhat larger than 1; however, it is defined in a slightly different way. The renormalization in [3] rescales the second iterate on a small interval around the critical point of $f \in \tilde{\mathcal{C}}^{\omega}$, whereas we rescale the second iterate on a small interval around the critical value 0. We can therefore not directly cite [3] to establish the existence of a fixed point of $\tilde R$, but must re-examine their proof for the altered situation.

At this point, it is convenient to work with an alternative definition of a renormalization operator by choosing a different normalization of the functions. Let $\mathcal{D}_0^{\omega}$ be the set of functions $f : [0, 1] \to [0, 1]$ of the form $f(x) = (1 - r(x^{\beta}) x^{\beta})^2$, with $u \mapsto r(u)$ a positive and bounded analytic function on a fixed neighborhood of $[0, 1]$ in $\mathbb{C}$. The space $\mathcal{D}_0^{\omega}$ will be equipped with the supnorm; if $f(x) = g(x^{\beta})$, then $\|f\|_0 = \sup |g(u)|$, the supremum being taken over $u$ in this neighborhood. Note that $f(0) = 1$ for each $f \in \mathcal{D}_0^{\omega}$. The normalization condition used for functions in $\mathcal{C}^{\omega}$, that the critical point sits at 1, is replaced by the requirement that 0 is mapped to 1. Observe further that $\mathcal{D}_0^{\omega}$ is a submanifold of a Banach space, but not itself an open subset of a Banach space. The renormalization operator $S$ on $\mathcal{D}_0^{\omega}$ is defined by
$$S(f)(x) = \frac{1}{\lambda} f^2(\lambda x), \qquad \lambda = f(1).$$
The domain of $S$ is given by the functions $f \in \mathcal{D}_0^{\omega}$ for which $0 \le f(1) \le 1$. Observe that the rescaling factor $\lambda$ is explicitly defined, whereas the rescaling factor $a$ in the definition of $R$ must be solved from the equation $f(a) = 1$.

Proposition 3.3. For $\beta$ slightly larger than $\frac12$, $S|_{\mathcal{D}_0^{\omega}}$ has a hyperbolic fixed point $\hat\phi$. The linearized operator $DS(\hat\phi)$ has an unstable eigenvalue $\delta_1$, that depends continuously on $\beta$ and approaches 2 as $\beta \to \frac12$. The eigenspace $E(\delta_1)$ of $\delta_1$ is one dimensional and converges to $\{k(1 - \sqrt{x})^2,\ k \in \mathbb{R}\}$ as $\beta \to \frac12$. The rest of the spectrum of $DS(\hat\phi)$ lies strictly inside the unit disc. The spectral projection on $E(\delta_1)$ is continuous.

Proof. Write $r(u) = 1 + \varepsilon(u)$. We have
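$$f^2(x) = \Big[1 - \big(1 + \varepsilon(f(x)^{\beta})\big)\big(1 - (1 + \varepsilon(x^{\beta}))x^{\beta}\big)^{2\beta}\Big]^2$$
$$= \Big[1 - \big(1 + \varepsilon(f(x)^{\beta})\big)\big(1 - 2\beta(1 + \varepsilon(x^{\beta}))x^{\beta} + O((2\beta - 1)x^{2\beta})\big)\Big]^2$$
$$= \Big[\varepsilon(f(x)^{\beta}) - \big(1 + \varepsilon(f(x)^{\beta})\big)\big(2\beta(1 + \varepsilon(x^{\beta})) + O((2\beta - 1)x^{\beta})\big)x^{\beta}\Big]^2.$$
Rescaling by a factor $\varepsilon(1)^2$ and writing $\varepsilon_f = \varepsilon(f(\varepsilon(1)^2 x)^{\beta})$, $\varepsilon = \varepsilon(\varepsilon(1)^{2\beta} x^{\beta})$, we get
$$S(f)(x) = \left[\frac{\varepsilon_f}{\varepsilon(1)} - (1 + \varepsilon_f)\,\varepsilon(1)^{2\beta-1}\Big(2\beta(1 + \varepsilon) + O\big((2\beta - 1)\varepsilon(1)^{2\beta} x^{\beta}\big)\Big)x^{\beta}\right]^2.$$
To get an idea where to find a fixed point of $S$, note the following. If $h(x) = (1 - (1 + \alpha)x^{\beta})^2$ for some constant $\alpha$, then
$$S(h)(x) = \Big[1 - 2\beta\alpha^{2\beta-1}(1 + \alpha)^2 x^{\beta} + O\big((2\beta - 1)\alpha^{2\beta} x^{2\beta}\big)\Big]^2.$$
The order $x^{\beta}$ terms in $h$ and $S(h)$ are equal if
$$1 + \alpha = 2\beta\alpha^{2\beta-1}(1 + \alpha)^2. \tag{4}$$
This defines $\alpha$ as a function of $2\beta - 1$. Each small $\alpha$ corresponds to a small value of $2\beta - 1$. In fact, $2\beta - 1 \sim \frac{-\alpha}{1 + \ln\alpha}$. Observe that $2\beta - 1 \ll \alpha$. This computation suggests writing $\varepsilon(x^{\beta}) = \alpha s(x^{\beta})$ with $\alpha$ given by (4). Note that $\alpha^{2\beta-1} = \frac{1}{2\beta}\frac{1}{1+\alpha}$; we will use this identity below.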
A. J. Homburg, T. Young
Denote by T the operator induced by S on the functions s. So if f (x) = (1 − (1 + αs(x β ))x β )2 ), then S(f )(x) = (1 − (1 + αT (s)(x β ))x β )2 . The operator T is given by the following expression, where sf = s(f (α 2 s(1)2 x)β ) and s = s(α 2β s(1)2β x β ). αsf − αs(1) 1 2β−1 T (s)(x β ) = − + 2β(1 + αs )(1 + αs)(αs(1)) − 1 f α αs(1)x β (5) + O((2β − 1)α 2β−1 s(1)2β x β ). Observe that ((f (((1)2 x)β ) = ((1 − (1 + ((((1)2β x β ))((1)2β x β )2β ) = ((1) − 2β( (1)(1 + ((0))((1)2β x β + O(((1)4β x 2β ). Putting this in (5), gives T (s)(x β ) 1 2β(1 + αs(0))(αs(1))2β−1 s (1)α + 2β(1 + αsf )(1 + αs)(αs(1))2β−1 − 1 = α + O((2β − 1)α 2β−1 s(1)2β x β ) + O(α 4β−1 s(1)4β−1 x β ) 1 s (1)(1 + αs(0)) = s(1)2β−1 1+α 1 1 + (1 + αs(1))(1 + αs(0)) s(1)2β−1 − 1 α 1+α + O((2β − 1)α 2β−1 s(1)2β x β ) + O(α 4β−1 s(1)4β−1 x β ). Let T0 s(x) = s(1) + s(0) + s (1) − 1. Direct inspection gives that T (s) − T0 s and its derivatives with respect to s, converge to 0 as α → 0. Define R( = T − T0 . Rewrite the equation T (s) = s as R( (s) = (I − T0 ) s. It follows from the identity T02 s(x β ) = 2T0 s(x β ) − 1, that 1 + (I − T0 ) R( (s)(x β ) = s(x β ).
(6)
The mapping given by the left-hand side is a contraction in the supnorm on a neighborhood of $s \equiv 1$. Observe that $T_0 - 1$ is linear; its spectrum contains an eigenvalue 2 with the line of constant functions as the eigenspace. The rest of the spectrum of $T_0 - 1$ is at the origin in the complex plane. Standard perturbation theory as found in e.g. [16], and applied in a similar situation in renormalization theory in [3], proves the proposition.

We will show how to obtain from Proposition 3.3 the corresponding result, i.e. the existence of a hyperbolic fixed point with appropriate spectral properties, for the renormalization operator $R$. Let $U_0^{\omega}$ be a small open neighborhood in $\mathcal{C}_0^{\omega}$ of $x \mapsto (1 - x^{\beta})^2$. For $f \in U_0^{\omega}$, let $h(f)$ be given by
$$h(f)(x) = \frac{1}{f(0)} f(f(0)x),$$
so that $h(f)(0) = 1$. Having fixed the region on which $f \in \mathcal{C}_0^{\omega}$ is a bounded analytic function of $u = x^{\beta}$, choose the region in the definition of $\mathcal{D}_0^{\omega}$ small enough so that $h(f) \in \mathcal{D}_0^{\omega}$ for each $f \in U_0^{\omega}$. Then $h$ is smooth on $U_0^{\omega}$ and $h^{-1}$ is smooth on $h(U_0^{\omega})$. Note that, for $\beta$ sufficiently near $\frac12$, $\phi = h^{-1}(\hat\phi) \in U_0^{\omega}$ and this $\phi$ is a fixed point of $R = h^{-1} \circ S \circ h$. To check hyperbolicity of $\phi$, compute
$$DR(\phi) = Dh^{-1}(\hat\phi)\, DS(\hat\phi)\, Dh(\phi).$$
So, from $DS(\hat\phi)\hat e_1 = \delta_1 \hat e_1$, it follows that $DR(\phi)e_1 = \delta_1 e_1$ for $e_1 = Dh^{-1}(\hat\phi)\hat e_1$. Let $\hat\pi_1$ be the spectral projection onto $\hat e_1$ and define $\pi_1 = Dh^{-1}(\hat\phi)\hat\pi_1 Dh(\phi)$. That the rest of the spectrum of $DR(\phi)$ is strictly inside the unit circle in the complex plane follows from
$$DR^k(\phi) - \delta_1^k \pi_1 = Dh^{-1}(\hat\phi)\big(DS^k(\hat\phi) - \delta_1^k \hat\pi_1\big) Dh(\phi),$$
which is a contraction for $k$ large enough.

3.2. Existence of a fixed point for twice differentiable functions. A possible way to show that $\phi$ is an isolated fixed point of $R$ in $\mathcal{C}_0^{2+\sigma}$ goes as follows. Consider the mapping $1 + (I - T_0)R_\varepsilon$ from the proof of Proposition 3.3 (see (6)), and show that there is a bounded ball around $s \equiv 1$ in $\mathcal{C}_0^{2+\sigma}$ which is mapped into itself by $1 + (I - T_0)R_\varepsilon$ and on which $1 + (I - T_0)R_\varepsilon$ is a contraction in the supnorm. Since a closed bounded set in $\mathcal{C}_0^{2+\sigma}$ is closed in the supnorm [10], this would establish the existence of an isolated fixed point. Instead, we will closely follow arguments from [4] to show that $\phi$ is an isolated fixed point of $R$ in $\mathcal{C}_0^{2+\sigma}$ and establish spectral properties in $\mathcal{C}_0^{2+\sigma}$.

Proposition 3.4. For any $0 < \sigma < 1$ and for $\beta$ slightly larger than $\frac12$, $\phi$ is an isolated fixed point of $R|_{\mathcal{C}_0^{2+\sigma}}$. The linearized operator $DR(\phi)$ has an unstable eigenvalue $\delta_1$, that depends continuously on $\beta$ and approaches 2 as $\beta \to \frac12$. The eigenspace $E(\delta_1)$ of $\delta_1$ is one dimensional and converges to $\{k(1 - \sqrt{x})^2,\ k \in \mathbb{R}\}$ as $\beta \to \frac12$. The rest of the spectrum of $DR(\phi)$ lies strictly inside the unit disc. The spectral projection to $E(\delta_1)$ is continuous.

Since the proof closely follows the arguments in [4], we will merely set up the proof and perform the calculations differing from those in [4]. With $\phi$ the fixed point of $R$, write $\psi(x^{\beta}) = \phi(x)$, so that $u \mapsto \psi(u)$ is a real analytic function. The fixed point equation $\frac1a \phi^2(ax) = \phi(x)$, $\phi(a) = 1$, for $\psi$ becomes
$$\frac{1}{a}\psi\big(\psi(a^{\beta} u)^{\beta}\big) = \psi(u).$$
Let $T$ be the corresponding operator on $C^{2+\sigma}$ functions:
$$T(g)(u) = \frac{1}{a} g\big(g(a^{\beta} u)^{\beta}\big), \qquad g(a^{\beta}) = 1. \tag{7}$$
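Calculate the formal derivative of $T$ at $g$:
$$DT(g)h(u) = \frac{1}{a}\Big[h\big(g(a^{\beta}u)^{\beta}\big) + g'\big(g(a^{\beta}u)^{\beta}\big)\beta g(a^{\beta}u)^{\beta-1} h(a^{\beta}u)\Big]$$
$$\quad + \frac{h(a^{\beta})}{a^{1+\beta}\beta g'(a^{\beta})}\, g\big(g(a^{\beta}u)^{\beta}\big) - \frac{\beta h(a^{\beta})}{a g'(a^{\beta})}\, g'\big(g(a^{\beta}u)^{\beta}\big) g(a^{\beta}u)^{\beta-1} g'(a^{\beta}u)u.$$
Write $S$ for the operator $DT$ at the fixed point $\psi$. Noting that
$$\psi'(u) = a^{\beta-1}\beta\,\psi'\big(\psi(a^{\beta}u)^{\beta}\big)\psi(a^{\beta}u)^{\beta-1}\psi'(a^{\beta}u),$$
we have
$$Sh(u) = \frac{1}{a}\Big[h\big(\psi(a^{\beta}u)^{\beta}\big) + \beta\psi'\big(\psi(a^{\beta}u)^{\beta}\big)\psi(a^{\beta}u)^{\beta-1} h(a^{\beta}u)\Big] + \frac{h(a^{\beta})}{a^{\beta}\psi'(a^{\beta})}\left[\frac{1}{\beta}\psi(u) - u\psi'(u)\right]. \tag{8}$$
Following [4], define
$$S_\gamma h(u) = \frac{1}{a}\Big[\big(a^{\beta}\beta\psi(a^{\beta}u)^{\beta-1}\psi'(a^{\beta}u)\big)^{\gamma} h\big(\psi(a^{\beta}u)^{\beta}\big) + a^{\beta\gamma}\psi'\big(\psi(a^{\beta}u)^{\beta}\big)\beta\psi(a^{\beta}u)^{\beta-1} h(a^{\beta}u)\Big]$$
$$= a^{\beta\gamma-1}\Big[\big(\beta\psi(a^{\beta}u)^{\beta-1}\psi'(a^{\beta}u)\big)^{\gamma} h\big(\psi(a^{\beta}u)^{\beta}\big) + \psi'\big(\psi(a^{\beta}u)^{\beta}\big)\beta\psi(a^{\beta}u)^{\beta-1} h(a^{\beta}u)\Big]. \tag{9}$$
Denote by $\rho(S_\gamma)$ the spectral radius of $S_\gamma$ in the supnorm. We will need to estimate $\rho(S_\gamma)$. Note that for $\beta$ close to $\frac12$, the fixed point $\psi$ is close to $u \mapsto (1 - u)^2$. We obtain from this and (9) that $\|S_\gamma\| \le k a^{\beta\gamma-1}$, where $k \to 1$ as $\beta \to \frac12$. Hence, $\|S_\gamma\|$ and therefore also $\rho(S_\gamma)$ is small if $\beta\gamma > 1$ and $\beta$ is close enough to $\frac12$. We cite the following lemma from [4]:

Lemma 3.5. Suppose $\gamma > 1$ and $\rho(S_\gamma) < \rho$, where $1 \le \rho < \delta_1$. The spectral projection $\sigma$ to the unstable eigenspace $E(\delta_1)$ of $\mathcal{C}_0^{\omega}$ extends continuously to $\mathcal{C}_0^{\gamma}$. Given $\varepsilon > 0$, there is a positive integer $m > 0$ so that for all $h \in \mathcal{C}_0^{\gamma}$,
$$\|S^m h - \delta_1^m \sigma h\|_\gamma \le \varepsilon \rho^m \|h\|_\gamma.$$

Choose $\gamma = 2 + \sigma$ with $0 < \sigma < 1$, so that in particular $\gamma > \frac{1}{\beta}$. We can then take $\rho = 1$. The conclusion of the above lemma implies that, apart from a single eigenvalue $\delta_1$, the spectrum of $S$ as an operator on $\mathcal{C}_0^{2+\sigma}$ lies strictly inside the disc with center 0 and radius 1. Note that the unstable manifold of $\phi$ in $\mathcal{C}_0^{2+\sigma}$ is contained in $\mathcal{C}_0^{\omega}$.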
Universal Scalings in Homoclinic Doubling Cascades
281
3.3. The spectrum at the fixed point in the full space. Recall that R leaves invariant the subset C_0^{2+σ} ⊂ C^{2+σ} consisting of functions with critical value 0. The previous sections have shown the existence of a fixed point φ of R in C_0^{2+σ} and have established spectral properties of DR(φ) restricted to C_0^{2+σ}. In this section we calculate the remaining eigenvalue of DR(φ).

Proposition 3.6. For β > 1/2 close to 1/2, the spectrum of DR(φ) consists of the spectrum of DR(φ) restricted to C_0^{2+σ} plus an unstable eigenvalue δ2. This eigenvalue depends continuously on β and satisfies δ2 → ∞ as β → 1/2. The eigenspace E(δ2) of δ2 is one dimensional.

Proof. Note that there is a natural decomposition C^{2+σ} = C_0^{2+σ} ⊕ R, along with projections π1 : C^{2+σ} → C_0^{2+σ} and π2 : C^{2+σ} → R given by
\[
\pi_1(f)(x) = f(x) - f(1), \qquad \pi_2(f)(x) = f(1).
\]
The invariance of C_0^{2+σ} implies that the partial derivative
\[
\delta_2 = \frac{\partial}{\partial p} R(f)(1)\Big|_{\phi}
\]
is in the spectrum of DR(φ). In the next section we show that δ2 is an eigenvalue. We will now estimate the partial derivative and establish in particular that it is a large positive number.

Recall that R(f)(x) = (1/a) f^2(ax) with a given by f(a) = 1. Let us start by calculating the rescaling factor a. Writing s = a^β, we have p + (1 + ε(s))(1 − s)^2 = 1, yielding
\[
s = 1 - \sqrt{\frac{1-p}{1+\varepsilon(s)}} .
\]
Using this equation, s can be solved invoking the implicit function theorem. This gives s as a smooth function of p and ε. Note that a^β ≈ (1/2)(p + ε(0)) in the sense that
\[
\frac{a^\beta}{p + \varepsilon(0)} \to \frac{1}{2}
\]
as p, ε → 0. Though a depends on f, we suppress this dependence from the notation. The fixed point φ of R is of the form
\[
\phi(x) = \big(1 + \hat\varepsilon(x^\beta)\big)\big(1 - x^\beta\big)^2 ,
\]
for some C^ω function u → ε̂(u). We will write â for the rescaling factor at φ; φ(â) = 1. Now
\[
\frac{\partial}{\partial p} R(f)(1) = \frac{\partial}{\partial p}\, \frac{f^2(a)}{a} = \frac{\partial}{\partial p}\, \frac{p}{a} .
\]
Hence
\[
\frac{\partial}{\partial p} R(f)(1)\Big|_{f=\phi} = \frac{1}{\hat a} \approx \Big( \frac{\hat\varepsilon(0)}{2} \Big)^{-1/\beta} .
\]
This is a large number for β close to 1/2, since then ε̂ is small.
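As a rough numerical illustration of the estimate above, the following sketch assumes that ε is a constant (in the text ε is a function, so this is a simplification), takes the root s = a^β near 0 of p + (1 + ε)(1 − s)^2 = 1, and prints the ratio s/(p + ε), which should approach 1/2. The function names and the sample values of β, p and ε are illustrative assumptions, not taken from the paper.

```python
import math


def rescaling_s(p, eps):
    """Root s = a**beta near 0 of p + (1 + eps) * (1 - s)**2 = 1.

    Assumption: eps is treated as a constant, whereas in the text it is a
    function evaluated at s.
    """
    return 1.0 - math.sqrt((1.0 - p) / (1.0 + eps))


def ratio(p, eps):
    """s / (p + eps); the text states that this tends to 1/2 as p, eps -> 0."""
    return rescaling_s(p, eps) / (p + eps)


def inverse_rescaling(p, eps, beta):
    """1/a = s**(-1/beta), the quantity that becomes large for small eps."""
    return rescaling_s(p, eps) ** (-1.0 / beta)


if __name__ == "__main__":
    # The ratio approaches 1/2 as p and eps shrink together.
    for scale in (1e-1, 1e-2, 1e-3, 1e-4):
        print(scale, ratio(scale, 2.0 * scale))
    # 1/a grows quickly as eps shrinks (illustrative beta = 0.6, p = 0).
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, inverse_rescaling(0.0, eps, 0.6))
```

With β closer to 1/2 the exponent 1/β approaches 2, so in this simplified model 1/a behaves roughly like (ε/2)^{-2}.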
3.4. Invariant manifolds of the fixed point. In this section we investigate properties of the unstable manifold of the fixed point φ of the renormalization operator R. In particular we establish the transversality properties stated in Theorem 3.2.

Let us first discuss the statement that HD2 intersects the unstable manifold W^{u,uu}(φ) of φ transversally. Since HD2 is contained in C_0^{2+σ}, it suffices to consider R restricted to C_0^{2+σ}. The unstable manifold of R restricted to C_0^{2+σ} converges to the line {k(1 − √x)^2, k ∈ R} as β ↓ 1/2 (compare [3]). The manifold HD2 is given by {r(x^β)(1 − x^β)^2, r(0) = 1}. Transversality easily follows for β sufficiently close to 1/2.

The next step is to show that the strong unstable manifold W^{uu}(φ) intersects H2 transversally. For this, we perform some cone estimates in C^ω. For α real and positive, let C_α ⊂ C^ω = C_0^ω × R denote the cone
C_α = {λ(v, 1) ∈ C_0^ω × R, ‖v‖ ≤ α, λ ∈ R}.
The next lemma shows that there is an invariant cone field consisting of tiny (meaning α small) cones C_α.

Lemma 3.7. For each α > 0, there is a neighborhood U of x → (1 − x^β)^2, so that for f ∈ U we have DR(f)C_α ⊂ C_α. Moreover, DR(f) expands vectors in C_α.

Proof. Write
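\[
f^2(x) = p + \big(1 + \varepsilon(f(x)^\beta)\big) \Big( 1 - \big[ p + (1 + \varepsilon(x^\beta))(1 - x^\beta)^2 \big]^\beta \Big)^2
= d_0 + d_1 x^\beta + d_2 x^{2\beta} + O(x^{3\beta}).
\]
To calculate the coefficients d_0, d_1, d_2, it is convenient to treat ε(x^β) and ε(f(x)^β) as parameters. Under the assumption that ε = ε(x^β) and ε_f = ε(f(x)^β) are parameters, we will compute a series expansion
\[
f^2(x) = \hat d_0 + \hat d_1 x^\beta + \hat d_2 x^{2\beta} + O(x^{3\beta}).
\]
By filling in, in this power series expansion of f^2, the actual expansions of ε and ε_f, the values of d_0, d_1, d_2 can be obtained. Using the identity
\[
(1 + p + \varepsilon + h)^\beta = (1 + p + \varepsilon)^\beta + \beta (1 + p + \varepsilon)^{\beta-1} h + \tfrac{1}{2}\beta(\beta-1)(1 + p + \varepsilon)^{\beta-2} h^2 + O(h^3),
\]
we get
\[
f^2(x) = p + (1 + \varepsilon_f) \Big( 1 - \big[ 1 + p + \varepsilon - 2(1+\varepsilon) x^\beta + (1+\varepsilon) x^{2\beta} \big]^\beta \Big)^2
\]
\[
\phantom{f^2(x)} = p + (1 + \varepsilon_f) \Big( 1 - (1 + p + \varepsilon)^\beta - 2(1+\varepsilon)\beta(1 + p + \varepsilon)^{\beta-1} x^\beta + \big[ (1+\varepsilon)\beta(1 + p + \varepsilon)^{\beta-1} + 2(1+\varepsilon)^2 \beta(\beta-1)(1 + p + \varepsilon)^{\beta-2} \big] x^{2\beta} \Big)^2 + O(x^{3\beta}),
\]
so that
\[
\hat d_0 = p + (1 + \varepsilon_f)\big( 1 - (1 + p + \varepsilon)^\beta \big)^2 ,
\]
\[
\hat d_1 = -4\beta (1 + \varepsilon_f)(1 + \varepsilon)(1 + p + \varepsilon)^{\beta-1} \big( 1 - (1 + p + \varepsilon)^\beta \big),
\]
\[
\hat d_2 = 4 (1 + \varepsilon_f)(1 + \varepsilon)^2 \beta^2 (1 + p + \varepsilon)^{2\beta-2}
+ 2 (1 + \varepsilon_f)(1 + \varepsilon)\beta (1 + p + \varepsilon)^{\beta-1} \big( 1 - (1 + p + \varepsilon)^\beta \big)
+ 4 (1 + \varepsilon_f)(1 + \varepsilon)^2 \beta(\beta-1)(1 + p + \varepsilon)^{\beta-2} \big( 1 - (1 + p + \varepsilon)^\beta \big).
\]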
So, e.g., the value of d_0 is p + (1 + ε((1 + p + ε(0))^β))(1 − (1 + p + ε(0))^β)^2. The remainder term f^2(x) − d_0 − d_1 x^β − d_2 x^{2β} is analytic as a function of x^β and small of order O(x^{3β}). Recall that the rescaling factor a is implicitly given by
\[
a^\beta = 1 - \sqrt{\frac{1-p}{1+\varepsilon(a^\beta)}} ,
\]
so that a^β ≈ (1/2)(p + ε(0)) in the sense that
\[
\frac{a^\beta}{p + \varepsilon(0)} \to \frac{1}{2}
\]
as p, ε → 0. Note that
\[
R(f)(x) = d_0 a^{-1} + d_1 a^{\beta-1} x^\beta + d_2 a^{2\beta-1} x^{2\beta} + O(a^{3\beta-1} x^{3\beta}).
\]
The remainder term R(f)(x) − d_0 a^{−1} − d_1 a^{β−1} x^β − d_2 a^{2β−1} x^{2β} is analytic in the argument x^β; it is small of order O(a^{3β−1} x^{3β}), so that it converges to 0 uniformly in p, ε.

Take a curve f_t(x) = p_t + (1 + ε_t(x^β))(1 − x^β)^2 in C^ω, for smooth functions p_t = p_0 + t and ε_t = ε_0 + tv + O(t^2), ‖v‖ ≤ α, and with t from an interval containing 0. Denote by a_t the rescaling factor in R(f_t), i.e., f_t(a_t) = 1. Then
\[
\frac{\partial}{\partial t} a_t \Big|_{t=0} \approx \frac{1 + v(0)}{\beta} \Big( \frac{1}{2}\big(p_0 + \varepsilon_0(0)\big) \Big)^{\frac{1}{\beta}-1} , \tag{10}
\]
as p_0, ε_0 → 0. Write R(f_t)(x) = (R(f_t)(x) − R(f_t)(1)) + R(f_t)(1). Observe that x → R(f_t)(x) − R(f_t)(1) is contained in C_0^ω. It follows from (10) that
\[
\frac{\partial}{\partial t} \big( p_t a_t^{-1} \big) \Big|_{t=0} = a_0^{-1} - p_0 a_0^{-2}\, \frac{\partial}{\partial t} a_t \Big|_{t=0}
\]
is bounded from below by a constant times ((1/2)(p_0 + ε_0(0)))^{−1/β} for p_0, ε_0 small and p_0 + ε_0(0) > 0 (which holds for f_0 from the domain of R). Since
\[
R(f_t) = p_t a_t^{-1} + O\big((p_t + \varepsilon_t)^2 a_t^{-1}\big) + O\big((p_t + \varepsilon_t) a_t^{\beta-1}\big) + O\big(a_t^{2\beta-1}\big),
\]
also
\[
\frac{\partial}{\partial t} R(f_t)(1) \Big|_{t=0} \ge c \Big( \frac{1}{2}\big(p_0 + \varepsilon_0(0)\big) \Big)^{-1/\beta}
\]
for some constant c > 0, for all small p_0, ε_0. Similarly,
\[
\Big\| \frac{\partial}{\partial t} \big( R(f_t)(x) - R(f_t)(1) \big) \Big|_{t=0} \Big\| \le C \big(p_0 + \varepsilon_0(0)\big)^{1-\frac{1}{\beta}}
\]
for some C > 0. The lemma easily follows from these bounds.
The tangent spaces of W^{uu}(φ) are contained in the cones C_α from the above lemma. Hence, W^{uu}(φ) contains a curve from φ to {f ∈ C^ω, f(0) = 1}, the boundary of the domain of definition of R. It also shows that W^{uu}(φ) intersects H2 transversally, for β close to 1/2. Indeed, H2 is formed by the functions f(x) = p + (1 + ε(x^β))(1 − x^β)^2 that satisfy p + ε(1 + p + ε(0))(1 − (1 + p + ε(0))^β)^2 = 0. This manifold is tangent to C_0^{2+σ} along HD2 and therefore it intersects W^{uu}(φ) transversally. Similarly one sees that S2 intersects W^{uu}(φ) transversally.

4. Universal Scalings

Theorem 3.2 provides an explanation of universal scalings in the bifurcation diagram of generic two parameter families of functions. We will discuss scaling structures for two parameter families in C^ω, thus made up from real analytic functions. The renormalization operator R on C^ω is smooth. It follows from invariant manifold theory that R possesses local stable and unstable manifolds near φ, which are denoted by W^s(φ) and W^{u,uu}(φ), respectively. The local unstable manifold contains a strong unstable manifold W^{uu}(φ) and the weak unstable manifold W^u(φ) = W^{u,uu}(φ) ∩ C_0^ω. Note that W^u(φ) is a smooth manifold. The next theorem is basic in finding scaling structures in bifurcation diagrams. Following its formulation, we provide a further discussion of its consequences. The theorem will be proved in Sect. 4.1 below.

Theorem 4.1. Consider a two parameter family {f_{p,ε}; (p, ε) ∈ R^2} of functions in C^ω with f_{0,ε} ∈ C_0^ω, that intersects the local stable manifold W^s(φ) transversally at some function f_{0,ε̄}. Let (p_n, ε_n) be a sequence of parameter values tending to (0, ε̄), such that R^{2^n}(f_{p_n,ε_n}) converges to a function f̄ in the local unstable manifold W^{u,uu}(φ). Then
\[
\frac{p_{n+1} - p_n}{p_n - p_{n-1}} \to \frac{1}{\delta_2} \quad \text{if } \bar f \notin W^u(\phi),
\]
\[
\frac{\varepsilon_{n+1} - \varepsilon_n}{\varepsilon_n - \varepsilon_{n-1}} \to \frac{1}{\delta_1} \quad \text{if } \bar f \notin W^{uu}(\phi).
\]
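The ratio statement in Theorem 4.1 is the behaviour one sees for any sequence of the form p_n = c δ^{−n} + O((δκ)^{−n}) with κ > 1, which is the shape of the asymptotics in Proposition 4.6. A minimal sketch with made-up values (δ = 3, κ = 1.5 and c = 0.7 are assumptions chosen for illustration, not values from the paper):

```python
def model_sequence(delta, kappa, c, n_terms):
    """Hypothetical parameter sequence c*delta**-n plus a faster-decaying error."""
    return [c * delta ** (-n) + 0.2 * (delta * kappa) ** (-n)
            for n in range(1, n_terms + 1)]


def difference_ratios(seq):
    """Ratios (s[n+1] - s[n]) / (s[n] - s[n-1]) of successive differences."""
    return [(seq[n + 1] - seq[n]) / (seq[n] - seq[n - 1])
            for n in range(1, len(seq) - 1)]


if __name__ == "__main__":
    ratios = difference_ratios(model_sequence(3.0, 1.5, 0.7, 24))
    # The ratios settle near 1/delta as n grows.
    for n, r in enumerate(ratios, start=2):
        print(n, r)
```

The faster-decaying term only perturbs the early ratios; its influence dies off like κ^{−n}.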
The theorem can be applied to sequences of parameter values (p_n, ε_n) so that R^{2^n}(f_{p_n,ε_n}) is contained in a bifurcation surface (e.g. H2, HD2, or S2) and converges to a function f̄ in the intersection of this bifurcation surface with W^{u,uu}(φ). The theorem shows that scalings exist in bifurcation diagrams of generic two parameter families of functions in C^ω, independent of details of the family, other than the value of β. It relates the unstable eigenvalues at the fixed point φ of the renormalization operator R to scalings in such bifurcation diagrams. In Figs. 5 and 6, bifurcation curves for the family x → p + r(1 − x^β)^2 with β = 0.6 are shown. These figures illustrate the scalings predicted by Theorem 4.1. In particular, the HD points converge exponentially at a rate δ1 which is close to 2 for β close to 1/2. As is illustrated in Fig. 6, the curve pieces in H_{2^{n+1}} between HD_{2^n} and HD_{2^{n+1}} accumulate on the line {p = 0} exponentially fast at the strong rate δ2. It should be noted that Theorem 4.1 does not explain all scalings that exist in the bifurcation diagram. Define H∞ = lim_{k→∞} H_{2^k}; note that H∞ is invariant under R. Similarly, P∞ = lim_{k→∞} S_{2^k} is invariant under R. H∞ and P∞ constitute a ‘boundary of chaos’. Theorem 4.1 does not address scalings of sequences of bifurcation values in S_{2^n}
[Figure 5: curves H2, H4, H8, S2, S4, S8, the HD points and the HD limit, with r on the horizontal axis and p on the vertical axis.]
Fig. 5. Bifurcation curves for x → p + r(1 − x^β)^2 in the (p, r) parameter plane for β = 0.6
converging to a point in P∞ away from W^s(φ) (which are expected to scale exponentially with the usual Feigenbaum rate). Nor does it address scalings of sequences of bifurcation values in H_{2^n} converging to a point in H∞ away from W^s(φ). We have no information on these scalings (however, [19] and [12] contain results on scalings of homoclinic bifurcations in similar situations). The following proposition provides some knowledge on H∞^{u,uu} = H∞ ∩ W^{u,uu}(φ) and P∞^{u,uu} = P∞ ∩ W^{u,uu}(φ). Compare Fig. 5.

Proposition 4.2. H∞^{u,uu} and P∞^{u,uu} are contained in a wedge {(r, p) : |p| ≤ C(r − 1)^{δ2/δ1}, r > 1} for some constant C > 0. Moreover, H∞^{u,uu}\{φ} ⊂ {f(1) < 0} and P∞^{u,uu}\{φ} ⊂ {f(1) > 0}.

Proof. Because W^{uu}(φ) intersects H2 and S2 transversally, H∞^{u,uu} and P∞^{u,uu} are not contained in W^{uu}(φ). The fact that H∞^{u,uu}, P∞^{u,uu} are contained in a wedge as claimed, is now standard. To show that H∞^{u,uu}\{φ} ⊂ {f(1) < 0}, first notice that H∞^{u,uu} is contained in {f(0) ≤ 0} since homoclinic orbits cannot exist for positive critical values. To show that H∞^{u,uu} has no points in {f(1) = 0} outside φ, consider fundamental domains of R restricted to W^u(φ) = W^{u,uu}(φ) ∩ {f(1) = 0}. It suffices to establish the existence of such a fundamental domain on which the kneading sequences are different from that of the Feigenbaum map. Using the continuity of kneading sequences of functions with critical value 0 [22], and closeness of W^u(φ) to the line {k(1 − √x)^2, k ∈ R}, it is easy to provide such a fundamental domain, for β small enough. The analogous statement for P∞^{u,uu} is proved in the same manner.
[Figure 6: curves H2, H4, H8, S2, S4, S8, the HD points and the HD limit, with r on the horizontal axis (about 0.95 to 1.4) and p on the vertical axis (about −0.001 to 0.0002).]
Fig. 6. Blow up of Fig. 5
4.1. Proof of Theorem 4.1. Let us start with a comment on the proof in [3] where universal scalings in period doubling cascades are studied (see also [25]). In [3], a differentiable stable foliation with codimension one leaves for the renormalization operator near the fixed point is constructed. The existence of a differentiable stable foliation (as opposed to a merely continuous foliation) is readily shown to imply the universal scalings observed in period doubling cascades. However, the operator R may not possess a differentiable stable foliation (with leaves of codimension two) near φ. We will therefore use alternative arguments to unravel the universal scalings in homoclinic doubling cascades.

By a smooth local coordinate change, we may assume that W^s(φ), W^u(φ) and W^{uu}(φ) are linear. We thus have smooth coordinates x = (x_s, x_u, x_uu) near φ on C^ω, so that C_0^ω = {x_uu = 0} and
W^s(φ) = {(x_u, x_uu) = (0, 0)},
W^u(φ) = {(x_s, x_uu) = (0, 0)},
W^{uu}(φ) = {(x_s, x_u) = (0, 0)}.
Smooth coordinate changes that leave C_0^ω invariant do not affect the scalings we wish to show. Note that x_u and x_uu are both one dimensional coordinates. The renormalization
operator R in these coordinates can be written as
\[
R(x_s, x_u, x_{uu}) = \begin{pmatrix} A_s x_s + x_s\, g_s(x) \\ \delta_1 x_u + x_u\, g_u^1(x) + x_{uu}\, g_u^2(x) \\ \delta_2 x_{uu} + x_{uu}\, g_{uu}(x) \end{pmatrix}, \tag{11}
\]
where g_s(x), g_u^1(x), g_u^2(x), g_uu(x) = O(‖x‖). It suffices to consider x from a small box B = {‖x_s‖, |x_u|, |x_uu| ≤ η} for some small η > 0. By a linear rescaling, we may assume η = 1. Then s = sup{‖g_s‖, |g_u^1|, |g_u^2|, |g_uu|}, with the supremum taken over B, is small. To obtain convergence properties of {x_i} as N → ∞, we rely on a technique called Shil’nikov variables which amounts to formulating and solving a boundary value problem for orbits {x_i} (see e.g. [26, 5]). Let ξ = (ξ_s, ξ_u, ξ_uu) ∈ B. Let F be the map on the set of sequences N ∩ [0, N] → C^ω defined by
\[
F(\{x_k\})_k = \begin{pmatrix}
A_s^k \xi_s + \sum_{i=0}^{k-1} A_s^{k-i}\, x_{s,i}\, g_s(x_i) \\[4pt]
\delta_1^{k-N} \xi_u + \sum_{i=0}^{N-k-1} \delta_1^{k-N+i} \big( x_{u,N-i}\, g_u^1(x_{N-i}) + x_{uu,N-i}\, g_u^2(x_{N-i}) \big) \\[4pt]
\delta_2^{k-N} \xi_{uu} + \sum_{i=0}^{N-k-1} \delta_2^{k-N+i}\, x_{uu,N-i}\, g_{uu}(x_{N-i})
\end{pmatrix}, \tag{12}
\]
where x_k = (x_{s,k}, x_{u,k}, x_{uu,k}). If {x_i}, 0 ≤ i ≤ N, is an orbit of R with x_{s,0} = ξ_s, x_{u,N} = ξ_u, x_{uu,N} = ξ_uu, then {x_i} is a fixed point of F: F({x_i}) = {x_i}. Indeed, the right hand side of (12) is obtained from the variation of constants formula for orbits. Let σ < 1 be larger than the spectral radius of A_s. With ξ ∈ B as before, let (ξ) be the set of sequences {x_i}_{0≤i≤N} with
(x_{s,0}, x_{u,N}, x_{uu,N}) = (ξ_s, ξ_u, ξ_uu) and ‖x_{s,i}‖σ^{−i}, |x_{u,i}|δ1^{N−i}, |x_{uu,i}|δ2^{N−i} < ∞.
Equipped with the norm
\[
\|\{x_i\}\| = \sup_{0 \le i \le N} \big\{ \|x_{s,i}\| \sigma^{-i},\ |x_{u,i}| \delta_1^{N-i},\ |x_{uu,i}| \delta_2^{N-i} \big\},
\]
(ξ) is a Banach space.

Lemma 4.3. There is a bounded ball B(ξ) in (ξ), so that for each positive integer N, F maps B(ξ) into itself. Moreover, F is a contraction on B(ξ).
Proof. We will show that there exists M > 0, so that for {x_i} ∈ (ξ_s, ξ_u, ξ_uu) with ‖{x_i}‖ ≤ M, we have ‖F({x_i})‖ ≤ M. In the following estimates, C denotes a positive constant that may vary from line to line, but which is bounded uniformly in N.
\[
\|x_{s,k}\| \le C\sigma^k \|\xi_s\| + \sum_{i=0}^{k-1} C \sigma^{k-i} \|x_{s,i}\| \|g_s(x)\|
\le C\sigma^k \|\xi_s\| + \sum_{i=0}^{k-1} C M \sigma^{k} \|g_s(x)\|
\le C\sigma^k \|\xi_s\| + C s M \sigma^k ,
\]
\[
|x_{u,k}| \le \delta_1^{k-N} |\xi_u| + \sum_{i=0}^{N-k-1} \delta_1^{k-N+i} \big( |x_{u,N-i}| |g_u^1(x)| + |x_{uu,N-i}| |g_u^2(x)| \big)
\le \delta_1^{k-N} |\xi_u| + \sum_{i=0}^{N-k-1} \big( M \delta_1^{k-N} |g_u^1(x)| + M \delta_1^{k-N+i} \delta_2^{-i} |g_u^2(x)| \big)
\le \delta_1^{k-N} |\xi_u| + C s M \delta_1^{k-N} ,
\]
\[
|x_{uu,k}| \le \delta_2^{k-N} |\xi_{uu}| + \sum_{i=0}^{N-k-1} \delta_2^{k-N+i} |x_{uu,N-i}| |g_{uu}(x)|
\le \delta_2^{k-N} |\xi_{uu}| + \sum_{i=0}^{N-k-1} M \delta_2^{k-N} |g_{uu}(x)|
\le \delta_2^{k-N} |\xi_{uu}| + C s M \delta_2^{k-N} .
\]
It follows from these estimates that F maps some bounded ball B(ξ) in (ξ) into itself. Similar estimates show that F is a contraction on B(ξ).

The lemma shows that for each ξ ∈ B, there is a unique orbit {x_i}_{0≤i≤N} contained in B and satisfying x_{s,0} = ξ_s, x_{u,N} = ξ_u, x_{uu,N} = ξ_uu. In fact, the computations in the proof of Lemma 4.3 can be easily adapted to show that
|x_{u,0}| ∼ δ1^{−N}|x_{u,N}| if x_{u,N} ≠ 0,
|x_{uu,0}| ∼ δ2^{−N}|x_{uu,N}| if x_{uu,N} ≠ 0.
At this stage we have obtained a weak form of Theorem 4.1:

Proposition 4.4. Let a two parameter family f_{p,ε} and a sequence of parameter values (p_n, ε_n) tending to (0, ε̄) be as in Theorem 4.1. So R^{2^n}(f_{p_n,ε_n}) tends to some function f̄ as n → ∞. Then
\[
-\frac{1}{n} \ln |p_n| \to \ln \delta_2 \quad \text{if } \bar f \notin W^u(\phi),
\]
\[
-\frac{1}{n} \ln |\varepsilon_n - \bar\varepsilon| \to \ln \delta_1 \quad \text{if } \bar f \notin W^{uu}(\phi).
\]
Note that this proposition relates scalings in the bifurcation diagram to the unstable eigenvalues of DR(φ). The statement of the proposition however, is not as strong as the statement of Theorem 4.1. Indeed, it shows that p_n ∼ δ2^{−n} (where a_n ∼ b_n means that a_n/b_n is bounded and bounded away from 0, uniformly in n). The statement of Theorem 4.1 implies that p_n δ2^n converges to some constant c as n → ∞. Similarly for ε_n. We will proceed with the proof of Theorem 4.1. For this, we look at the Shil’nikov variables approach in more detail. First we apply an additional smooth local coordinate change. The proof of the following lemma will be postponed.

Lemma 4.5. There are smooth coordinates x = (x_s, x_u, x_uu) near φ, in which R has the following expression.
\[
R(x_s, x_u, x_{uu}) = \begin{pmatrix} A_s x_s + x_s\, g_s(x) \\ \delta_1 x_u + x_u^2\, g_u^1(x) + x_{uu}\, g_u^2(x) \\ \delta_2 x_{uu} + x_{uu} x_u\, g_{uu}^1(x) + x_{uu}^2\, g_{uu}^2(x) \end{pmatrix},
\]
where g_s(x) = O(‖x‖) and g_u^1(x), g_u^2(x), g_uu^1(x), g_uu^2(x) are all of order O(‖x_s‖).
Proposition 4.6. Take coordinates near φ given by Lemma 4.5. Let {x_i}, 0 ≤ i ≤ N, be an orbit of R contained in B. Then, for some κ > 1,
x_{u,0} = x_{u,N} δ1^{−N} + O((δ1 κ)^{−N}),
x_{uu,0} = x_{uu,N} δ2^{−N} + O((δ2 κ)^{−N}),
as N → ∞.

Proof. Define y_{u,k} = x_{u,k} δ1^{N−k} and y_{uu,k} = x_{uu,k} δ2^{N−k}. Then the fixed point equation F({x_i}) = {x_i}, for the x_u, x_uu coordinates can be written as
\[
y_{u,k} = \xi_u + \sum_{i=0}^{N-k-1} \Big( \delta_1^{-i}\, y_{u,N-i}^2\, g_u^1(x_{N-i}) + \Big(\frac{\delta_2}{\delta_1}\Big)^{-i} y_{uu,N-i}\, g_u^2(x_{N-i}) \Big),
\]
\[
y_{uu,k} = \xi_{uu} + \sum_{i=0}^{N-k-1} \Big( \delta_1^{-i}\, y_{uu,N-i}\, y_{u,N-i}\, g_{uu}^1(x_{N-i}) + \delta_2^{-i}\, y_{uu,N-i}^2\, g_{uu}^2(x_{N-i}) \Big).
\]
Take k = 0 in the above. Since ‖x_{s,i}‖ ≤ Kσ^i for some K > 0, we can estimate that for some C > 0 (independent of N),
\[
\Big| \sum_{i=0}^{N-1} \Big( \delta_1^{-i}\, y_{u,N-i}^2\, g_u^1(x_{N-i}) + \Big(\frac{\delta_2}{\delta_1}\Big)^{-i} y_{uu,N-i}\, g_u^2(x_{N-i}) \Big) \Big|
\le C s \Big( \sigma^N + \delta_1^{-N} + \Big(\frac{\delta_2}{\delta_1}\Big)^{-N} \Big),
\]
and
\[
\Big| \sum_{i=0}^{N-1} \Big( \delta_1^{-i}\, y_{uu,N-i}\, y_{u,N-i}\, g_{uu}^1(x_{N-i}) + \delta_2^{-i}\, y_{uu,N-i}^2\, g_{uu}^2(x_{N-i}) \Big) \Big|
\le C s \big( \sigma^N + \delta_1^{-N} + \delta_2^{-N} \big).
\]
Both sums converge to 0 exponentially fast in N. The proposition follows.

Proof of Theorem 4.1. This is an easy corollary of the previous proposition.
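The Shil’nikov boundary value problem used above can be tried out on a toy model. The sketch below is only an illustration under stated assumptions: it uses a hypothetical two dimensional map with one stable and one unstable coordinate, x_s → A x_s + c x_s^2 and x_u → δ x_u + c x_s x_u^2 (the nonlinearity in x_u mimics the form in Lemma 4.5, with g_u = O(x_s)), prescribes x_{s,0} and x_{u,N}, and solves for the orbit by iterating a forward sweep in the stable coordinate and a backward sweep in the unstable one. The values A = 0.4, δ = 2, c = 0.1 and all function names are made up.

```python
def shilnikov_orbit(xi_s, xi_u, n_steps, a_s=0.4, delta=2.0, c=0.1, sweeps=80):
    """Orbit x_0..x_N of the toy map with x_s[0] = xi_s and x_u[N] = xi_u.

    Toy map (an assumption, not the operator from the paper):
        x_s' = a_s * x_s + c * x_s**2
        x_u' = delta * x_u + c * x_s * x_u**2
    """
    # Initial guess: orbit of the linearized map with the same boundary data.
    xs = [xi_s * a_s ** k for k in range(n_steps + 1)]
    xu = [xi_u * delta ** (k - n_steps) for k in range(n_steps + 1)]
    for _ in range(sweeps):
        # Nonlinear terms are evaluated on the previous iterate.
        fs = [c * xs[k] ** 2 for k in range(n_steps + 1)]
        fu = [c * xs[k] * xu[k] ** 2 for k in range(n_steps + 1)]
        # Forward sweep for the stable coordinate, starting from x_s[0] = xi_s.
        new_xs = [xi_s]
        for k in range(n_steps):
            new_xs.append(a_s * new_xs[k] + fs[k])
        # Backward sweep for the unstable coordinate, starting from x_u[N] = xi_u.
        new_xu = [0.0] * (n_steps + 1)
        new_xu[n_steps] = xi_u
        for k in range(n_steps - 1, -1, -1):
            new_xu[k] = (new_xu[k + 1] - fu[k]) / delta
        xs, xu = new_xs, new_xu
    return xs, xu


def orbit_residual(xs, xu, a_s=0.4, delta=2.0, c=0.1):
    """Largest violation of the toy map equations along the sequence."""
    worst = 0.0
    for k in range(len(xs) - 1):
        worst = max(worst,
                    abs(xs[k + 1] - (a_s * xs[k] + c * xs[k] ** 2)),
                    abs(xu[k + 1] - (delta * xu[k] + c * xs[k] * xu[k] ** 2)))
    return worst


if __name__ == "__main__":
    for n_steps in (5, 10, 20):
        xs, xu = shilnikov_orbit(0.5, 0.5, n_steps)
        # x_u[0] * delta**N / x_u[N] approaches 1, as in Proposition 4.6.
        print(n_steps, orbit_residual(xs, xu), xu[0] * 2.0 ** n_steps / 0.5)
```

In this toy setting the printed ratio tends to 1 as N grows, which is the analogue of x_{u,0} = x_{u,N} δ1^{−N} plus an exponentially smaller correction.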
It remains to prove Lemma 4.5. As a first step, we smoothly linearize R restricted to the local unstable manifold. Despite possible rational dependence of δ1, δ2 (which would obstruct in general the existence of smoothly linearizing coordinates), this is possible because W^u(φ) is a smooth manifold.

Lemma 4.7. There are smooth local coordinates in C^ω near φ, in which R restricted to W^{u,uu}(φ) is linear.

Proof. It suffices to construct a smooth invariant foliation F^u of W^{u,uu}(φ) with one dimensional leaves that includes W^u(φ). Indeed, by [11], there exists a smooth strong unstable foliation F^{uu} of W^{u,uu}(φ) including W^{uu}(φ). The existence of a smooth foliation F^u means that a smooth coordinate change separates the weak unstable and strong unstable coordinate on the local unstable manifold. One dimensional maps with a hyperbolic fixed point can be smoothly linearized, so that the lemma would be proven.

Let (x_u, x_uu) → R(x_u, x_uu) denote the renormalization operator restricted to W^{u,uu}(φ); R is defined on an open neighborhood U of (0, 0). To construct F^u, we construct a DR invariant line bundle which gives F^u by integration. Let R^{(1)} be the induced mapping on U × L(R, R):
R^{(1)}(x_u, x_uu, α) = (R(x_u, x_uu), β), graph β = DR(x_u, x_uu) graph α.
Writing
\[
DR(x_u, x_{uu}) = \begin{pmatrix} a(x_u, x_{uu}) & b(x_u, x_{uu}) \\ c(x_u, x_{uu}) & d(x_u, x_{uu}) \end{pmatrix},
\]
one has
\[
R^{(1)}(x_u, x_{uu}, \alpha) = \Big( R(x_u, x_{uu}),\ \frac{a(x_u, x_{uu})\alpha + b(x_u, x_{uu})}{c(x_u, x_{uu})\alpha + d(x_u, x_{uu})} \Big).
\]
It is now easily seen that R^{(1)} has a hyperbolic singularity at (0, 0, 0) and
\[
DR^{(1)}(0, 0, 0) = \begin{pmatrix} \delta_1 & 0 & 0 \\ 0 & \delta_2 & 0 \\ * & * & \delta_2/\delta_1 \end{pmatrix}.
\]
Observe that δ1 < δ2/δ1 < δ2. The sought for invariant line bundle is given by a smooth invariant manifold F that projects injectively to R^2 × {0} by the coordinate projection. This manifold is obtained as follows: let G^{uu} be a strong unstable foliation for R^{(1)} with one dimensional leaves and let W^u(0, 0, 0) be the weak unstable manifold {(x_u, 0, 0); (x_u, 0) ∈ U}. Then
\[
F = \bigcup_{x \in W^u(0,0,0)} G^{uu}_x ,
\]
where G^{uu}_x is the leaf of G^{uu} that contains the point x. Since W^u(0, 0, 0) is a smooth manifold and G^{uu} a smooth foliation, F provides the sought for invariant line bundle and thus, by integrating, the foliation F^u.
Proof of Lemma 4.5. Applying the coordinate change from Lemma 4.7, we may assume that in (11), g_u^1(x), g_u^2(x), g_uu(x) = O(‖x_s‖). Consider a smooth local coordinate change (x_s, x_u, x_uu) → (y_s, y_u, y_uu) of the form
y_s = x_s,
y_u = x_u + p_u(x_s) x_u,
y_uu = x_uu + p_uu(x_s) x_uu,
for smooth functions p_u, p_uu which vanish at x_s = 0. Write R in the new coordinates as
\[
\tilde R(y_s, y_u, y_{uu}) = \begin{pmatrix} A_s y_s + y_s\, h_s(y) \\ \delta_1 y_u + h_u^1(y_s, y_u, y_{uu})\, y_u + h_u^2(y_s, y_u, y_{uu})\, y_{uu} \\ \delta_2 y_{uu} + h_{uu}(y_s, y_u, y_{uu})\, y_{uu} \end{pmatrix}.
\]
We seek coordinates y = (y_s, y_u, y_uu), so that h_u^1(y_s, 0, 0) = 0 and h_uu(y_s, 0, 0) = 0. Write R = (R_s, R_u, R_uu) and let
P(y_s, p_u, p_uu) = (p_u ∘ R_s(y_s, 0, 0), p_uu ∘ R_s(y_s, 0, 0)).
Treating p_u and p_uu as variables, the demands that h_u^1(y_s, 0, 0) = 0 and h_uu(y_s, 0, 0) = 0, yield
\[
P(y_s, p_u, p_{uu}) = \begin{pmatrix} p_u - \frac{1}{\delta_1}\, g_u^1(y_s, 0, 0) + \text{h.o.t.} \\ p_{uu} - \frac{1}{\delta_2}\, g_{uu}(y_s, 0, 0) + \text{h.o.t.} \end{pmatrix},
\]
where the higher order terms are quadratic and higher order in (y_s, p_u, p_uu). One obtains (p_u, p_uu) as functions of y_s by constructing the strong stable manifold for the mapping (y_s, p_u, p_uu) → (R_s(y_s, 0, 0), P(y_s, p_u, p_uu)).

Acknowledgements. The authors thank the Institute for Physical Science and Technology in College Park, Maryland, for its gracious hospitality.
References
1. Ahlfors, L.V.: Complex analysis. An introduction to the theory of analytic functions of one complex variable. Third edition. New York: McGraw-Hill Book Co., 1978
2. Collet, P., Eckmann, J.-P., Koch, H.: Period doubling bifurcations for families of maps on R^n. J. Statist. Phys. 25, 1–14 (1981)
3. Collet, P., Eckmann, J.-P., Lanford, O.E. III: Universal properties of maps on an interval. Commun. Math. Phys. 76, 211–254 (1980)
4. Davie, A.M.: Period doubling for C^{2+ε} mappings. Commun. Math. Phys. 176, 261–272 (1996)
5. Deng, B.: The Shil’nikov problem, exponential expansion, strong λ-lemma, C^1-linearization, and homoclinic bifurcation. J. of Diff. Eq. 79, 189–231 (1989)
6. Eckmann, J.-P., Epstein, H.: Bounds on the unstable eigenvalue for period doubling. Commun. Math. Phys. 128, 427–435 (1990)
7. Epstein, H.: Fixed points of composition operators II. Nonlinearity 2, 305–310 (1989)
8. Feigenbaum, M.J.: Quantitative universality for a class of nonlinear transformations. J. Statist. Phys. 19, 25–52 (1978)
9. Feigenbaum, M.J.: The universal metric properties of nonlinear transformations. J. Statist. Phys. 21, 669–706 (1979)
10. Henry, D.: Geometric theory of semilinear parabolic equations. Lecture Notes in Mathematics 840, Berlin–Heidelberg–New York: Springer-Verlag, 1981
11. Hirsch, M.W., Pugh, C.C., Shub, M.: Invariant manifolds. Lecture Notes in Mathematics 583, Berlin–Heidelberg–New York: Springer-Verlag, 1977
12. Homburg, A.J.: Global aspects of homoclinic bifurcations of vector fields. Memoirs A.M.S., Vol. 578, Providence, RI: A.M.S., 1996
13. Homburg, A.J.: Cascades of homoclinic doubling bifurcations. To appear in: Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems, ed. B. Fiedler, Berlin–Heidelberg–New York: Springer Verlag, 2001
14. Homburg, A.J., Kokubu, H., Naudot, V.: Homoclinic-doubling cascades. To appear in Arch. Ration. Mech. Anal.
15. Homburg, A.J., Krauskopf, B.: Resonant homoclinic flip bifurcations. J. Dynam. Differential Equations 12, 807–850 (2000)
16. Kato, T.: Perturbation theory for linear operators. Berlin–Heidelberg–New York: Springer Verlag, 1976
17. Kisaka, M., Kokubu, H., Oka, H.: Bifurcations to N-homoclinic orbits and N-periodic orbits in vector fields. J. Dynam. Differential Equations 5, 305–357 (1993)
18. Kokubu, H., Komuro, M., Oka, H.: Multiple homoclinic bifurcations from orbit-flip. I. Successive homoclinic doublings. Int. J. Bif. Chaos 6, 833–850 (1996)
19. Lyubimov, D.V., Pikovsky, A.S., Zaks, M.A.: Universal scenarios of transitions to chaos via homoclinic bifurcations. Sov. Sci. Rev., Sect. C, Math. Phys. Rev. 8, 221–292 (1989)
20. Martens, M.: The periodic points of renormalization. Ann. of Math. 147, 543–584 (1998)
21. McGuire, J.B., Thompson, C.J.: Asymptotic properties of sequences of iterates of nonlinear transformations. J. Statist. Phys. 27, 183–200 (1982)
22. de Melo, W., van Strien, S.: One dimensional dynamics. Ergebnisse der Mathematik und ihrer Grenzgebiete 25, Berlin–Heidelberg–New York: Springer Verlag, 1993
23. Oldeman, B.E., Krauskopf, B., Champneys, A.R.: Death of period-doublings: locating the homoclinic-doubling cascade. Physica D 146, 100–120 (2000)
24. Oldeman, B.E., Krauskopf, B., Champneys, A.R.: Numerical unfoldings of codimension-three resonant homoclinic flip bifurcations. Nonlinearity 14, 597–621 (2001)
25. Palis, J.: A note on the inclination lemma (λ-lemma) and Feigenbaum’s rate of approach. In: Geometric dynamics (Rio de Janeiro, 1981), Lecture Notes in Math. 1007, Berlin–Heidelberg–New York: Springer Verlag, 1983, pp. 630–635
26. Shil’nikov, L.P.: On the generation of periodic motion from trajectories doubly asymptotic to an equilibrium state of saddle type. Math. USSR Sbornik 6, 427–437 (1968)
27. Tresser, C., Coullet, P.: Itérations d’endomorphismes et groupe de renormalisation. C. R. Acad. Sci. Paris Sér. A-B 287, A577–A580 (1978)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 222, 293 – 298 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On the Collapse of Tubes Carried by 3D Incompressible Flows
Diego Cordoba1, Charles Fefferman2
1 Mathematics Department, University of Chicago, Chicago, IL 60637-1546, USA. E-mail: [email protected]
2 Mathematics Department, Fine Hall, Princeton University, Princeton, NJ 08544-1000, USA. E-mail: [email protected]
Received: 5 February 2001 / Accepted: 25 April 2001
Abstract: We define the notion of a “regular tube”, and prove that a regular tube convected by a 3D incompressible flow cannot collapse to zero thickness in finite time.

0. Introduction

The 3-dimensional incompressible Euler equation (“3D Euler”) is as follows:
\[
\Big( \frac{\partial}{\partial t} + u \cdot \nabla_x \Big) u = -\nabla_x p, \qquad x \in \mathbb{R}^3,\ t \ge 0,
\]
\[
\nabla_x \cdot u = 0, \qquad x \in \mathbb{R}^3,\ t \ge 0,
\]
\[
u(x, 0) = u_0(x), \qquad x \in \mathbb{R}^3,
\]
with u_0 a given, smooth, divergence-free, rapidly decreasing vector field on R^3. Here, u(x, t) and p(x, t) are the unknown velocity and pressure for an ideal, incompressible fluid flow at zero viscosity. An outstanding open problem is to determine whether a 3D Euler solution can develop a singularity at a finite time T. A classic result of Beale–Kato–Majda [1] asserts that, if a singularity forms at time T, then the vorticity ω(x, t) = ∇_x × u(x, t) grows so rapidly that
\[
\int_0^T \sup_x |\omega(x, t)|\, dt = \infty .
\]
In [2], Constantin–Fefferman–Majda showed that, if the velocity remains bounded up to the time T of singularity formation, then the vorticity direction ω(x, t)/|ω(x, t)| cannot remain uniformly Lipschitz continuous up to time T.

This work was supported initially by the American Institute of Mathematics.
Partially supported by NSF Grant DMS 0070692.
One scenario for possible formation of a singularity in a 3D Euler solution is a constricting vortex tube. Recall that a vortex line in a fluid is an arc on an integral curve of the vorticity ω(x, t) for fixed t, and a vortex tube is a tubular neighborhood in R^3 arising as a union of vortex lines. In numerical simulations of 3D Euler solutions, one routinely sees that vortex tubes grow longer and thinner, while bending and twisting. If the thickness of a piece of a vortex tube becomes zero in finite time, then one has a singular solution of 3D Euler. It is not known whether this can happen.

Our purpose here is to adapt our work [3, 4] on two-dimensional flows to three dimensions, for application to 3D Euler. We introduce below the notion of a “regular tube”. Under the mild assumption that
\[
\int_0^T \sup_x |u(x, t)|\, dt < \infty ,
\]
we show that a regular tube cannot reach zero thickness at time T. In particular, for 3D Euler solutions, a vortex tube cannot reach zero thickness in finite time, unless it bends and twists so violently that no part of it forms a regular tube. This significantly sharpens the conclusion of [2] for possible singularities of 3D Euler solutions arising from vortex tubes. On the other hand, [2] applies to arbitrary singularities of 3D Euler solutions, while our results apply to “regular tubes”.

Although we are mainly interested in 3D Euler solutions, our result is stated for arbitrary incompressible flows in 3 dimensions. The proof is simple and elementary. The main novelty for readers familiar with [3, 4] is that we can adapt the ideas of [3] to three dimensions, even though there is no scalar that plays the rôle of the stream function on R^2.

1. Regular Tubes

Let Q = I_1 × I_2 × I_3 ⊂ R^3 be a closed rectangular box (with I_j a bounded interval), and let T > 0 be given. A regular tube is a relatively open set Ω_t ⊂ Q parametrized by time t ∈ [0, T), having the form
\[
\Omega_t = \{ (x_1, x_2, x_3) \in Q : \theta(x_1, x_2, x_3, t) < 0 \} \tag{1}
\]
with
\[
\theta \in C^1(Q \times [0, T)), \tag{2}
\]
and satisfying the following properties:
\[
|\nabla_{x_1, x_2} \theta| \ne 0 \ \text{ for } (x_1, x_2, x_3, t) \in Q \times [0, T) \text{ with } \theta(x_1, x_2, x_3, t) = 0; \tag{3}
\]
\[
\Omega_t(x_3) := \{ (x_1, x_2) \in I_1 \times I_2 : (x_1, x_2, x_3) \in \Omega_t \} \ \text{ is non-empty, for all } x_3 \in I_3,\ t \in [0, T); \tag{4}
\]
\[
\text{closure}(\Omega_t(x_3)) \subset \text{interior}(I_1 \times I_2) \ \text{ for all } x_3 \in I_3,\ t \in [0, T). \tag{5}
\]
For example, Ω_t for a fixed time t might be a thin tubular neighborhood of a curve in Q. To meet the conditions for a regular tube, the curve would have to enter Q through the face {x_3 = min I_3} and exit Q through the face {x_3 = max I_3}, with the tangent vector always transverse to the (x_1, x_2) plane.

Let u(x, t) = (u_k(x, t))_{1≤k≤3} be a C^1 velocity field defined on Q × [0, T). We say that the regular tube Ω_t moves with the velocity field u, if we have
\[
\Big( \frac{\partial}{\partial t} + u \cdot \nabla_x \Big) \theta = 0 \ \text{ whenever } (x, t) \in Q \times [0, T),\ \theta(x, t) = 0. \tag{6}
\]
It is well-known that a vortex tube arising from a 3D Euler solution moves with the fluid velocity.

2. Statement of the Main Result

Theorem. Let Ω_t ⊂ Q (t ∈ [0, T)) be a regular tube that moves with a C^1, divergence free velocity field u(x, t). If
\[
\int_0^T \sup_{x \in Q} |u(x, t)|\, dt < \infty , \tag{7}
\]
then
\[
\liminf_{t \to T^-} \mathrm{Vol}(\Omega_t) > 0. \tag{8}
\]
3. Calculus Formulas for Regular Tubes

Let Ω_t be a regular tube, as in (1)···(5). Recall that
\[
\Omega_t(x_3) = \{ (x_1, x_2) \in I_1 \times I_2 : \theta(x_1, x_2, x_3, t) < 0 \}. \tag{9}
\]
Define also
\[
S_t(x_3) = \{ (x_1, x_2) \in \text{interior}(I_1 \times I_2) : \theta(x_1, x_2, x_3, t) = 0 \} \ \text{ for } x_3 \in I_3,\ t \in [0, T). \tag{10}
\]
Also, for intervals I ⊂ I_3, and for t ∈ [0, T), define
\[
\Omega_t(I) = \{ (x_1, x_2, x_3) \in Q : x_3 \in I \text{ and } \theta(x_1, x_2, x_3, t) < 0 \}, \tag{11}
\]
and
\[
S_t(I) = \{ (x_1, x_2, x_3) \in Q : x_3 \in I \text{ and } (x_1, x_2) \in S_t(x_3) \}. \tag{12}
\]
Let ν denote the outward-pointing unit normal to S_t(I_3), and let ν̃ = (ν̃_1, ν̃_2, 0), where (ν̃_1, ν̃_2) is the outward-pointing unit normal to S_t(x_3). Thus, ν and ν̃ are continuous vector-valued functions, defined on S = {(x_1, x_2, x_3, t) ∈ Q × [0, T) : (x_1, x_2, x_3) ∈ S_t(I_3)}. Define also scalar-valued functions σ, σ̃ on S by requiring that
\[
\Big( \frac{\partial}{\partial t} + \sigma \nu \cdot \nabla_x \Big) \theta = \Big( \frac{\partial}{\partial t} + \tilde\sigma \tilde\nu \cdot \nabla_x \Big) \theta = 0 \ \text{ for } x \in S_t(I_3). \tag{13}
\]
Again, σ and σ̃ are well-defined and continuous on S, thanks to (3).

We write ∫ f ds for the integral of a function f over a curve with respect to arclength. We write ∫_Y g dA for the integral of a function g over a surface Y with respect to area. Let F be a continuous function on Q. Then we have the formulas
\[
\frac{d}{dt} \int_{\Omega_t(x_3)} F\, dA = \int_{S_t(x_3)} F \tilde\sigma\, ds \ \text{ for fixed } x_3, \text{ and} \tag{14}
\]
\[
\int_{S_t(I)} F \sigma\, dA = \int_{x_3 \in I} \Big( \int_{S_t(x_3)} F \tilde\sigma\, ds \Big) dx_3 . \tag{15}
\]
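Formula (14) with F ≡ 1 can be sanity-checked numerically on a single planar slice. The sketch below is an illustration under assumptions that are not from the paper: the slice is a star-shaped region {ρ < R(φ, t)} in polar coordinates, so that θ = ρ − R(φ, t), with a made-up boundary function R. It compares a centered finite difference of the area with the boundary integral of the normal speed σ̃ obtained from (13).

```python
import math


def radius(phi, t):
    """Hypothetical boundary R(phi, t) of a star-shaped slice (positive for 0 <= t <= 1)."""
    return 1.0 + 0.3 * math.cos(3.0 * phi) + t * (0.5 + 0.2 * math.sin(phi))


def radius_dphi(phi, t):
    """Partial derivative of R with respect to phi."""
    return -0.9 * math.sin(3.0 * phi) + 0.2 * t * math.cos(phi)


def radius_dt(phi, t):
    """Partial derivative of R with respect to t."""
    return 0.5 + 0.2 * math.sin(phi)


def area(t, n=2000):
    """Area of the slice {rho < R(phi, t)}: (1/2) * integral of R**2 over phi."""
    h = 2.0 * math.pi / n
    return sum(0.5 * radius(i * h, t) ** 2 for i in range(n)) * h


def boundary_integral(t, n=2000):
    """Integral of the normal speed over the boundary curve.

    With theta = rho - R(phi, t): |grad theta| = sqrt(1 + (R_phi / R)**2) on the
    boundary, d(theta)/dt = -R_t, so the normal speed is R_t / |grad theta|,
    and the arclength element is sqrt(R**2 + R_phi**2) d(phi).
    """
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = i * h
        r, r_phi = radius(phi, t), radius_dphi(phi, t)
        grad_norm = math.sqrt(1.0 + (r_phi / r) ** 2)
        normal_speed = radius_dt(phi, t) / grad_norm
        arclength = math.sqrt(r * r + r_phi * r_phi) * h
        total += normal_speed * arclength
    return total


if __name__ == "__main__":
    t, dt = 0.3, 1e-4
    finite_difference = (area(t + dt) - area(t - dt)) / (2.0 * dt)
    print(finite_difference, boundary_integral(t))
```

The two printed numbers should agree up to rounding, since the area is a quadratic polynomial in t for this choice of R.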
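The proofs of (14) and (15) consist merely of elementary calculus, and may be omitted.

4. Proof of the Theorem

We retain the notation of the previous sections. We will define a time-dependent interval
\[
J_t = [A(t), B(t)] \subset I_3 \tag{16}
\]
and establish an obvious formula for the time derivative of Vol Ω_t(J_t). We assume that the endpoints A(t), B(t) are C^1 functions of t. We have
\[
\mathrm{Vol}\, \Omega_t(J_t) = \int_{x_3 \in J_t} \mathrm{Area}\, \Omega_t(x_3)\, dx_3 ,
\]
so that
\[
\frac{d}{dt} \mathrm{Vol}\, \Omega_t(J_t) = B'(t)\, \mathrm{Area}\, \Omega_t(B(t)) - A'(t)\, \mathrm{Area}\, \Omega_t(A(t)) + \int_{x_3 \in J_t} \frac{\partial}{\partial t} \mathrm{Area}\, \Omega_t(x_3)\, dx_3 .
\]
Applying (14) with F ≡ 1, we find that
\[
\frac{d}{dt} \mathrm{Vol}\, \Omega_t(J_t) = B'(t)\, \mathrm{Area}\, \Omega_t(B(t)) - A'(t)\, \mathrm{Area}\, \Omega_t(A(t)) + \int_{x_3 \in J_t} \Big( \int_{S_t(x_3)} \tilde\sigma\, ds \Big) dx_3 .
\]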
In view of (15) (with F ≡ 1 on S), this is equivalent to
\[
\frac{d}{dt} \mathrm{Vol}\, \Omega_t(J_t) = B'(t)\, \mathrm{Area}\, \Omega_t(B(t)) - A'(t)\, \mathrm{Area}\, \Omega_t(A(t)) + \int_{S_t(J_t)} \sigma\, dA . \tag{17}
\]
Now we bring in the hypothesis that Ω_t moves with a divergence-free C^1 velocity field u. From (6) and (13), we see that (σν − u) · ∇_x θ = 0 on S_t(J_t). Thus (σν − u) is orthogonal to ν, so that σ = u · ν on S_t(J_t), and (17) may be rewritten as
\[
\frac{d}{dt} \mathrm{Vol}\, \Omega_t(J_t) = B'(t)\, \mathrm{Area}\, \Omega_t(B(t)) - A'(t)\, \mathrm{Area}\, \Omega_t(A(t)) + \int_{S_t(J_t)} u \cdot \nu\, dA . \tag{18}
\]
On the other hand, since u = (u_1, u_2, u_3) is divergence-free, the divergence theorem yields
\[
0 = \int_{\Omega_t(J_t)} (\nabla_x \cdot u)\, dV = \int_{S_t(J_t)} u \cdot \nu\, dA + \int_{\Omega_t(B(t))} u_3\, dA - \int_{\Omega_t(A(t))} u_3\, dA ,
\]
where dV denotes a volume integral. Hence, (18) may be rewritten in the form
\[
\frac{d}{dt} \mathrm{Vol}\, \Omega_t(J_t) = \int_{\Omega_t(B(t))} [B'(t) - u_3(x, t)]\, dA - \int_{\Omega_t(A(t))} [A'(t) - u_3(x, t)]\, dA . \tag{19}
\]
This is our final formula for the time derivative of $\mathrm{Vol}_t(J_t)$. It is intuitively clear. We now pick the time-dependent interval $J_t = [A(t), B(t)] \subset I_3$. Let $I_3 = [a, b]$, and let $t_0 \in (0, T)$ be a time to be picked below. We define
$$B(t) = b - \int_t^T \max_{x\in Q}|u(x,\tau)|\,d\tau, \tag{20}$$
and
$$A(t) = a + \int_t^T \max_{x\in Q}|u(x,\tau)|\,d\tau. \tag{21}$$
We are assuming that $u(x,\tau)$ is continuous on $Q \times [0, T)$, and that $\int_0^T \max_{x\in Q}|u(x,\tau)|\,d\tau < \infty$. It follows that $A(t), B(t)$ are $C^1$ functions on $[0, T)$, and that
$$a \le A(t) < B(t) \le b \quad \text{for } t \in [t_0, T), \tag{22}$$
provided we pick $t_0$ close enough to $T$. We pick $t_0$ so that (22) holds. Thus, $\Omega_t(J_t) \subset Q$ for $t \in [t_0, T)$. Immediately from (20), (21), we obtain
$$B'(t) = -A'(t) = \max_{x\in Q}|u(x,t)| \ge \max_{x\in\Omega_t(A(t))\cup\Omega_t(B(t))}|u_3(x,t)| \tag{23}$$
(recall $u = (u_1, u_2, u_3)$).
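The choice of endpoints in (20) and (21) can be illustrated on a toy example. The sketch below is not taken from the paper: the interval $[a, b]$, the time $T$, and the speed bound $M(\tau) = (T-\tau)^{-1/2}$ standing in for $\max_{x\in Q}|u(x,\tau)|$ are all hypothetical. The bound blows up at $T$ but is integrable, so the endpoints stay inside $[a, b]$ for $t$ close to $T$, and a finite-difference check recovers $B'(t) = -A'(t) = M(t)$.

```python
import math

# Hypothetical data (not from the paper): I_3 = [a, b], final time T, and a
# stand-in M(tau) for max_x |u(x, tau)| that is unbounded but integrable.
a, b, T = 0.0, 1.0, 1.0

def M(tau):
    return (T - tau) ** -0.5

def tail(t):
    # Closed form of the integral of M over [t, T].
    return 2.0 * math.sqrt(T - t)

def A(t):
    # Left endpoint, in the form of (21).
    return a + tail(t)

def B(t):
    # Right endpoint, in the form of (20).
    return b - tail(t)

# A(t) < B(t) needs 2 * tail(t) < b - a, i.e. T - t < ((b - a) / 4) ** 2.
t0 = T - 0.5 * ((b - a) / 4.0) ** 2

def derivative(f, t, h=1e-6):
    # Central finite difference.
    return (f(t + h) - f(t - h)) / (2.0 * h)

for t in (t0, 0.5 * (t0 + T)):
    assert a <= A(t) < B(t) <= b                           # the ordering (22)
    assert abs(derivative(B, t) - M(t)) < 1e-4 * M(t)      # B'(t) = M(t)
    assert abs(derivative(A, t) + M(t)) < 1e-4 * M(t)      # A'(t) = -M(t)
```

The endpoints move inward exactly as fast as the fastest fluid particle, which is why the boundary terms in (19) cannot be negative.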
From (19) and (23) we see at once that
$$\frac{d}{dt}\mathrm{Vol}_t(J_t) \ge 0 \quad \text{for } t \in [t_0, T). \tag{24}$$
On the other hand, (4) and (22) show that $\mathrm{Vol}_{t_0}(J_{t_0}) > 0$. Consequently,
$$\liminf_{t\to T^-}\mathrm{Vol}\,\Omega_t \ge \liminf_{t\to T^-}\mathrm{Vol}_t(J_t) \ge \mathrm{Vol}_{t_0}(J_{t_0}) > 0.$$
The proof of our theorem is complete.
References
1. Beale, J., Kato, T. and Majda, A.: Remarks on the breakdown of smooth solutions for the 3D Euler equations. Commun. Math. Phys. 94, 61–64 (1984)
2. Constantin, P., Fefferman, C. and Majda, A.: Geometric constraints on potentially singular solutions for the 3D Euler equations. Commun. Part. Diff. Eq. 21, 559–571 (1996)
3. Cordoba, D. and Fefferman, C.: Scalars convected by a 2D incompressible flow. To appear in Commun. Pure Appl. Math.
4. Cordoba, D. and Fefferman, C.: Behavior of several 2D fluid equations in singular scenarios. Proc. Nat. Acad. Sci. 98, 4311–4312 (2001)

Communicated by P. Constantin
Commun. Math. Phys. 222, 299 – 318 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Hilbert Schemes, Separated Variables, and D-Branes
A. Gorsky1,4, N. Nekrasov2,4, V. Rubtsov1,3,4
1 Institute of Theoretical and Experimental Physics, 117259 Moscow, Russia.
E-mail: [email protected]
2 Lyman Laboratory of Physics, Harvard University, Cambridge, MA 02138, USA.
E-mail: [email protected]
3 Département de Mathématiques, Université d’Angers, 49045 Angers, France.
E-mail: [email protected]
4 Institut Mittag-Leffler, Auravägen 17, Djursholm, Sweden
Received: 2 November 2000 / Accepted: 7 May 2001
Abstract: We explain Sklyanin's separation of variables in geometrical terms and construct it for Hitchin and Mukai integrable systems. We construct Hilbert schemes of points on $T^*\Sigma$ for $\Sigma = \mathbb{C}$, $\mathbb{C}^*$ or an elliptic curve, and on $\mathbb{C}^2/\Gamma$, and show that their complex deformations are integrable systems of Calogero–Sutherland–Moser type. We present the hyperkähler quotient constructions for Hilbert schemes of points on cotangent bundles to the higher genus curves, utilizing the results of Hurtubise, Kronheimer and Nakajima. Finally we discuss the connections to physics of D-branes and string duality.

1. Introduction

A way of solving a problem with many degrees of freedom is to reduce it to a problem with a smaller number of degrees of freedom. Solvable models allow one to reduce the original system with $N$ degrees of freedom to $N$ systems with one degree of freedom, each of which reduces to quadratures. This approach is called separation of variables (SoV). Recently, E. Sklyanin came up with a "magic recipe" for the SoV in the large class of quantum integrable models with a Lax representation [1, 2]. The method reduces in the classical case to the technique of separation of variables using poles of the Baker–Akhiezer function, which goes back to the work [3]; see also [4] for recent developments and more references. The basic strategy of this method is to look at the Lax eigenvector (which is the Baker–Akhiezer function) $\Psi(z, \lambda)$:
$$L(z)\Psi(z, \lambda) = \lambda(z)\Psi(z, \lambda) \tag{1.1}$$
with some choice of normalization (this is the artistic part of the method). The poles $z_i$ of $\Psi(z, \lambda)$ together with the eigenvalues $\lambda_i = \lambda(z_i)$ are the separated variables. In all the examples studied so far the most naive way of normalization leads to the canonically conjugate coordinates $\lambda_i, z_i$.

Permanent address: I.H.E.S., 35 route de Chartres, 91440 Bures-sur-Yvette, France. E-mail: [email protected]

The purpose of this paper is to explain the geometry behind the "magic recipe" in a broad class of examples, which include Hitchin systems [5], their deformations [6] and many-body systems considered as their degenerations [7, 8]. We shall use the results of [9, 10, 11]. For a complex surface $X$ let $X^{[h]}$ denote the Hilbert scheme of points on $X$ of length $h$ (if $X$ is compact hyperkähler then so is $X^{[h]}$ [10]).

2. Hitchin Systems

Hitchin systems can be thought of as generalized many-body systems. In fact, the elliptic Calogero–Moser model as well as its various spin and some relativistic generalizations can be thought of as a particular degeneration of the Hitchin system [7, 13, 8].

2.1. The integrable system. Recall the general Hitchin's setup [5]. One starts with the compact algebraic curve $\Sigma$ of genus higher than one and a topologically trivial vector bundle $V$ over it. Let $G = SL_N(\mathbb{C})$, $\mathfrak{g} = \mathrm{Lie}\,G$. The Hitchin system is the integrable system on the moduli space $\mathcal{N}$ of stable Higgs bundles. The point of $\mathcal{N}$ is the gauge equivalence class of a pair (an operator $\bar\partial_A = \bar\partial + \bar A$, a holomorphic section $\phi$ of $\mathrm{ad}(V) \otimes \omega^1_\Sigma$). The holomorphic structure on $V$ is defined with the help of $\bar\partial_A$. The symplectic structure on $\mathcal{N}$ descends from the two-form $\int_\Sigma \mathrm{Tr}\,\delta\phi \wedge \delta\bar A$. The integrals of motion are the Hitchin's hamiltonians:
$$\mathrm{Tr}\,\phi^k \in H^0(\Sigma, \omega_\Sigma^k) \approx \mathbb{C}^{(2k-1)(g-1)}, \quad k > 1.$$
Their total number is equal to
$$\sum_{k=2}^{N} (2k-1)(g-1) = (N^2-1)(g-1) = \tfrac{1}{2}\dim_{\mathbb{C}}\mathcal{N}.$$
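The count of Hamiltonians above is a one-line arithmetic identity, and a short check makes the bookkeeping explicit. The sketch below assumes the sum runs over $k = 2, \dots, N$ (the degrees of the invariants $\mathrm{Tr}\,\phi^k$ for $SL_N$) and that $g \ge 2$, so that Riemann–Roch gives $\dim H^0(\Sigma, \omega_\Sigma^k) = (2k-1)(g-1)$; the function name is illustrative only.

```python
def hitchin_hamiltonian_count(N, g):
    """Number of independent Hitchin Hamiltonians for SL_N on a genus-g curve.

    Assumes g >= 2, so that dim H^0(Sigma, omega^k) = (2k - 1)(g - 1) for k >= 2.
    """
    return sum((2 * k - 1) * (g - 1) for k in range(2, N + 1))

for N in range(2, 8):
    for g in range(2, 7):
        # The sum of the odd numbers 3, 5, ..., 2N - 1 is N^2 - 1.
        assert hitchin_hamiltonian_count(N, g) == (N * N - 1) * (g - 1)
```

The total equals $\dim SL_N \cdot (g-1)$, which is half the dimension of the phase space, as required for complete integrability.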
Thus, $\mathcal{N}$ can be represented as a fibration over
$$\mathcal{B} = \bigoplus_{k=2}^{N} H^0(\Sigma, \omega_\Sigma^k)$$
with the fibre over a generic point $b \in \mathcal{B}$ being an abelian variety $E_b$. This variety is identified by Hitchin with the quotient of the Jacobian variety $J_b$ of the spectral curve $C_b$, defined as the divisor of zeroes of $R(z, \lambda) = \mathrm{Det}(\phi(z) - \lambda) \in H^0(\Sigma, \omega_\Sigma^N)$ in $T^*\Sigma$. The curve $C_b$ has genus $N^2(g-1)+1$, which is higher by $g$ than the dimension of $E_b$. In fact, $E_b = J_b/\mathrm{Jac}(\Sigma)$. The Jacobian of $\Sigma$ is embedded into the Jacobian of $C_b$ as follows. The spectral curve covers $\Sigma$, hence the holomorphic 1-differentials on $\Sigma$ are pulled back to $C_b$, giving the desired embedding. An open dense subset of $\mathcal{N}$ is isomorphic to $T^*\mathcal{M}$, the cotangent bundle to the moduli space of holomorphic stable $G$-bundles on $\Sigma$.

The space $\mathcal{N}$ is a non-compact integrable system. One can compactify it by replacing $T^*\Sigma$ by a K3 surface. This is a natural deformation of the original system in the sense that the infinitesimal neighborhood of $\Sigma$ embedded into K3 is isomorphic to $T^*\Sigma$ (since $c_1(K3) = 0$). Instead of studying the moduli space of the gauge fields on $\Sigma$ together with the Higgs fields $\phi$ one studies the moduli of torsion free sheaves supported on $\Sigma$ [6]. It turns out that this model is important in the studies of the bound states of D2 branes in Type IIA string theory compactified on K3, which wrap a given holomorphic curve $\Sigma$. One can think of the bound state as of the vacuum in the gauge theory on $\Sigma$ according to [14, 15]. It can also be represented classically by a smooth curve of genus $h = N^2(g-1)+1$, which is an $N$-fold cover of $\Sigma$. In the compactified case, the curve $C$ (which is nothing but our fellow spectral curve $C_b$) is embedded into K3. It is also endowed with the line bundle $L$ (from the point of view of Hitchin equations the bundle is simply the eigen-bundle of $\phi$; from the D-brane point of view, the single D-brane carries a $U(1)$ gauge field whose vacuum configurations are the flat connections) which determines a point on its Jacobian $J_b$. Let us assume that $\deg L = h$. Take the generic section of this line bundle. It has $h$ zeroes $p_1, \dots, p_h$. Conversely, given a set $S$ of $h$ points in K3 there is generically a unique curve $C$ in a given homology class $\beta \in H_2(K3, \mathbb{Z})$ of genus $h$ with a line bundle $L$ on it, such that the curve passes through these points and the divisor of $L$ coincides with $S$. This identifies the open dense subset of the moduli space of pairs (a curve $C_b$; a line bundle $L$ on it) with that in the symmetric power $\mathrm{Sym}^h(K3)$ of the K3 surface itself. The symplectic form on the moduli space is therefore the direct sum of $h$ copies of the symplectic forms on K3. Summarizing, the phase space of the integrable system looks locally like $T^*\mathcal{M}$, where $\mathcal{M}$ is the moduli space of rank $N$ vector bundles over $\Sigma$, where $\Sigma$ is holomorphically embedded in K3.
It can also be identified with the moduli space of pairs (a curve $C_b$; a line bundle $L$ on it) where the homology class of $C_b$ is $N$ times that of $\Sigma$, and the topology of $L$ is fixed. This identification provides the action-angle coordinates on the phase space. Namely, the angle coordinates are (following [5]) the linear coordinates on the Jacobian of $C_b$, while the action coordinates are the periods of $d^{-1}\omega$ along the A-cycles on $C_b$. The last identification of the phase space with the symmetric power of K3 provides the SoV in the sense of Sklyanin. Notice the similarity of our description to his "magic recipe".

2.2. Separation of variables in Hitchin system. Let us present an explicit realization of the separation of variables in the Hitchin system. Let $\Sigma$ be a compact smooth genus $g$ algebraic curve and $p$ be the projection $p : T^*\Sigma \to \Sigma$. Let $V$ be a complex vector bundle over $\Sigma$ of rank $N$ and degree $k$. Consider the moduli space $\mathcal{M}_{N,k}$ of the semi-stable holomorphic structures $\mathcal{E}$ on $V$. It can be identified with the quotient of the open subset of the space of $\bar\partial_A$ operators acting on the sections of $V$ by the action of the gauge group $\bar\partial_A \to g^{-1}\bar\partial_A g$. The complex dimension of $\mathcal{M}_{N,k}$ is given by the Riemann–Roch formula
$$\dim_{\mathbb{C}}\mathcal{M}_{N,k} = N^2(g-1) + 1 := h. \tag{2.1}$$
Explicitly, the points in $T^*\mathcal{M}_{N,k}$ are the equivalence classes of pairs $(\mathcal{E}, \Phi)$, where $\mathcal{E}$ is the holomorphic bundle on $\Sigma$ and $\Phi$ is the section of $\mathrm{End}(\mathcal{E}) \otimes \omega^1_\Sigma$. The map $\pi$ is given by the formula:
$$\pi(\mathcal{E}, \Phi) = \{\mathrm{Tr}\,\Phi, \mathrm{Tr}\,\Phi^2, \dots, \mathrm{Tr}\,\Phi^N\}. \tag{2.2}$$
Let $\mathcal{H}$ denote the $h$-dimensional vector space $\mathbb{C}^h$:
$$\mathcal{H} = \bigoplus_{l=1}^{N} H^0(\Sigma, \omega_\Sigma^{\otimes l}).$$
N. Hitchin shows that the partial compactification of the cotangent bundle $T^*\mathcal{M}_{N,k}$ is an algebraically integrable system, i.e. there exists a holomorphic map $\pi : T^*\mathcal{M}_{N,k} \to \mathcal{H}$ whose fibers are abelian varieties which are Lagrangian with respect to the canonical symplectic structure on $T^*\mathcal{M}_{N,k}$. The generic fiber is compact. Geometric separation of variables in the Hitchin system is the content of the following:

Theorem. There exists a birational map $\varphi : T^*\mathcal{M}_{N,k} \to (T^*\Sigma)^{[h]}$, which is a symplectomorphism of the open dense subsets.

Remark. The open dense set in $X^{[h]}$ coincides with $(X^h - \Delta)/S_h$, where $\Delta$ denotes the union of all diagonals and $S_h$ is the symmetric group. The theorem implies that on this dense set one can introduce the coordinates $\{(z_i, \lambda_i)\}$, where $z_i \in \Sigma$, $\lambda_i \in T^*_{z_i}\Sigma$, such that the symplectic form in these coordinates has the separated form:
$$\Omega = \sum_{i=1}^{h} \delta\lambda_i \wedge \delta z_i. \tag{2.3}$$
The coordinates $(\lambda_i, z_i)$ are defined up to permutations.

Proof. First we construct $\varphi$ and prove that this map is biholomorphic. In order to do that we need to choose a specific $k$. As the moduli spaces $\mathcal{M}_{N,k}$ are isomorphic for different $k$ (although not canonically!) this is not a problem. Fix the pair $(\mathcal{E}, \Phi)$. Consider the spectral curve $C \subset T^*\Sigma$ defined as the zero set of the characteristic polynomial
$$P(\lambda, z) = \mathrm{Det}(\Phi(z) - \lambda) \in \Gamma(\omega_\Sigma^{\otimes N}). \tag{2.4}$$
Its genus equals $h$, as can be seen from the adjunction formula or from the Riemann–Hurwitz formula. The pullback $p^*\mathcal{E}$ restricted to $C$ contains the line sub-bundle $L$ of the eigenlines of $\Phi$. Our choice of $k$ is such that the degree of $L$ equals $h$. Generically $L$ has a non-vanishing section $s$, unique up to a multiple. Its zeroes $l_1, \dots, l_h$ determine $h$ points on $C$ and therefore on $T^*\Sigma$. Clearly they are determined uniquely up to permutations. This allows us to define:
$$\varphi(\mathcal{E}, \Phi) = (l_1, \dots, l_h). \tag{2.5}$$
Now let us prove that the image of $\varphi$ coincides with $(T^*\Sigma)^{[h]}$. We do it at the level of the open dense sets. Fix the set of points $(l_1 = (\lambda_1, z_1), \dots, l_h = (\lambda_h, z_h))$ in $T^*\Sigma$. Let $\omega_\alpha^{(j)}$ denote a basis in the space of holomorphic $j$-differentials on $\Sigma$. For given $j$ the index $\alpha$ runs from 1 to $(2j-1)(g-1)$ for $j \ge 2$, to $g$ for $j = 1$ and to 1 for $j = 0$. Consider the space $\mathbf{V}$ of sections of $p^*\omega_\Sigma^{\otimes N}$. The vectors of $\mathbf{V}$ are:
$$P(\lambda, z) = \sum_{j=0}^{N}\sum_{\alpha} V_{j,\alpha}\,\omega_\alpha^{(j)}(z)\,\lambda^{N-j}. \tag{2.6}$$
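The index ranges just listed determine the number of coefficients in (2.6), and hence the dimension of the space of such polynomials. The following sketch simply adds them up and compares with $h = N^2(g-1)+1$ from (2.1); it assumes $g \ge 2$, and the function names are illustrative.

```python
def num_j_differentials(j, g):
    """Dimension of the space of holomorphic j-differentials on a genus-g curve (g >= 2)."""
    if j == 0:
        return 1            # constants
    if j == 1:
        return g            # holomorphic 1-forms
    return (2 * j - 1) * (g - 1)

def dim_of_polynomial_space(N, g):
    # One coefficient for every pair (j, alpha) appearing in (2.6).
    return sum(num_j_differentials(j, g) for j in range(N + 1))

def h(N, g):
    # The dimension from (2.1).
    return N * N * (g - 1) + 1

for N in range(1, 8):
    for g in range(2, 7):
        assert dim_of_polynomial_space(N, g) == h(N, g) + 1
```

Since the space has dimension $h + 1$, imposing $h$ generic vanishing conditions $P(\lambda_i, z_i) = 0$ leaves a single polynomial up to scale, which is what makes the curve through $h$ points unique.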
The dimension of $\mathbf{V}$ equals $h + 1$. Consider the subspace of $P$'s such that $P(\lambda_i, z_i) = 0$ for any $i = 1, \dots, h$. It is one-dimensional if the points $l_1, \dots, l_h$ are distinct. In that case choose any $P$ in this one-dimensional subspace and consider the curve $C$ defined by the equation $P(\lambda, z) = 0$ through the points $l_1, \dots, l_h$. They form a divisor of the unique line bundle $L$ on $C$ of degree $h$ which can be considered as a point of $(T^*\Sigma)^{[h]}$. Then we can consider the direct image $p_*L$, which is a vector bundle over $\Sigma$, and a Higgs form $\Phi : p_*L \to p_*L \otimes \omega_\Sigma$, which is nothing but the multiplication by the elements $\lambda \in C$. Hence we obtain the desired one-to-one correspondence at the level of the open dense subsets.

In fact, one can go further and extend this isomorphism to the loci of positive codimension. We don't need this for the birational isomorphism here, yet let us sketch a direction of thought. A point in $(T^*\Sigma)^{[h]}$ determines a complex line in the space $\mathbf{V}$. The divisor of a non-zero element $P$ of this line defines a "curve" $C$. Now, the point could be identified with a torsion free rank one sheaf on $T^*\Sigma$. For those $C$ which are smooth the restriction of the sheaf becomes a line bundle $L$.

Now let us prove the equality of symplectic forms on $T^*\mathcal{M}_{N,k}$ and on $(T^*\Sigma)^{[h]}$. We show it at the generic point of $(T^*\Sigma)^{[h]}$ corresponding to the set of distinct points $(l_1, \dots, l_h)$. To this end recall the construction of the symplectic form on $T^*\mathcal{M}_{N,k}$ in terms of the data $(C, L)$: Let $A_a$ be a choice of the basis of A-cycles on $C$. The Jacobian $\mathrm{Jac}(C) = H^{(0,1)}(C, \mathbb{C})/H^1(C, \mathbb{Z})$ has the local linear coordinates $\varphi_b$ associated to $A_a$. The differentials $d\varphi_b$ are naturally identified with the holomorphic one-differentials on $C$. They are normalized in such a way that:
$$\oint_{A_a} d\varphi_b = \delta^a_b. \tag{2.7}$$
Introduce the coordinates $I^a$:
$$I^a = \oint_{A_a} \lambda\,dz. \tag{2.8}$$
Then
$$\omega = \sum_{a=1}^{h} \delta I^a \wedge \delta\varphi_a. \tag{2.9}$$
To make contact with (2.3) we recall the Abel map: Let $\Omega_a$ be the basis in the space of holomorphic 1-differentials on $C$ which obeys:
$$\oint_{A_a} \Omega_b = \delta^a_b. \tag{2.10}$$
Then
$$\varphi_a = \sum_{i=1}^{h}\int_{l_*}^{l_i} \Omega_a.$$
Notice that the normal bundle $N_{T^*\Sigma}|_C$ to $C$ is isomorphic to $T^*C$. The deformed curve $\tilde C$ can be identified with the holomorphic section $p(x)dx$ of $T^*C$. It can be expanded as follows:
$$p(x)dx = \sum_{a=1}^{h} p^a\,\Omega_a, \quad p^a \in \mathbb{C}.$$
Using (2.10) we get:
$$p^a = \oint_{A_a} p(x)dx = I^a. \tag{2.11}$$
Let $(p_i, x_i)$, $i = 1, \dots, h$ be a set of distinct points in $T^*C$. Let $C_{a,i} = \Omega_a(x_i) \in T^*_{x_i}C$.

Lemma.
$$\delta\varphi_a = \sum_{i=1}^{h} C_{a,i}\,\delta x_i. \tag{2.12}$$
Proof of the Lemma.
$$\delta\varphi_a = \sum_{i=1}^{h}\delta\int_{l_*}^{(p_i, x_i)} \Omega_a = \sum_{i=1}^{h} \Omega_a(x_i)\,\delta x_i.$$
Thus,
$$\sum_{i=1}^{h}\delta p_i \wedge \delta x_i = \sum_{a=1}^{h}\sum_{i=1}^{h}\delta p^a\,C_{a,i} \wedge \delta x_i = \sum_{a=1}^{h}\delta I^a \wedge \delta\varphi_a.$$
The theorem is proven.

Remark. J. Hurtubise in [9] gave a purely algebro-geometric proof of the main part of the theorem. The basic motivation of our remarks is that some of our arguments look more direct and more in the spirit of the approach of integrable systems.

3. Gaudin Model

Consider the space
$$(\mathcal{O}_1 \times \dots \times \mathcal{O}_k)//G, \tag{3.1}$$
where $\mathcal{O}_l$ are the complex coadjoint orbits of $G = SL_N(\mathbb{C})$ and the symplectic quotient is taken with respect to the diagonal action of $G$. This moduli space parameterizes Higgs pairs on $\mathbb{P}^1$ with singularities at the marked points $z_i \in \mathbb{P}^1$, $i = 1, \dots, k$. This is a natural analogue of the Hitchin space for genus zero. Concretely, the connection to the bundles on $\mathbb{P}^1$ comes about as follows: consider the moduli space of Higgs pairs $(\bar\partial_A, \phi)$ where $\phi$ is a meromorphic section of $\mathrm{ad}(V) \otimes \mathcal{O}(-2)$, with the restriction that $\mathrm{res}_{z=z_i}\phi \in \mathcal{O}_i$. The moduli space is isomorphic to (3.1). This space is an integrable system, studied in [16, 17]. Indeed, consider the solution to the equation
$$\bar\partial_A\phi = \sum_i \mu^c_i\,\delta^{(2)}(z - z_i) \tag{3.2}$$
in the gauge where $\bar A = 0$ (such a gauge exists for stable bundles on $\mathbb{P}^1$ due to Grothendieck's theorem). We get:
$$\phi(z) = \sum_i \frac{\mu^c_i}{z - z_i}, \tag{3.3}$$
provided that $\sum_i \mu^c_i = 0$, and $\phi$ is defined up to a global conjugation by an element of $G$, hence the Hamiltonian reduction in (3.1). Now, consider the following polynomial:
$$\mathrm{Det}(\lambda - \phi(z)) = \sum_{i,l} A_{i,l}\,\lambda^i z^{-l}. \tag{3.4}$$
It is an easy count to check that the number of functionally independent coefficients $A_{i,l}$ is precisely equal to
$$\frac{N(N-1)}{2}\,k + 1 - N^2.$$
Now let us treat explicitly the case $N = 2$. In this case the coadjoint orbits $\mathcal{O}_i$ can be explicitly described as the surfaces in $\mathbb{C}^3$ given by the equations:
$$\mathcal{O}_i : \quad Z_i^2 + X_i^+ X_i^- = \zeta_i^2 \tag{3.5}$$
with the symplectic forms:
$$\omega_i = \frac{dZ_i \wedge dX_i^+}{X_i^+}, \tag{3.6}$$
and the complex moment maps:
$$\mu^c_i = \begin{pmatrix} Z_i & X_i^+ \\ X_i^- & -Z_i \end{pmatrix}. \tag{3.7}$$
The phase space of our interest is $\mathcal{P} = \times_{i=1}^{k}\mathcal{O}_i//SL_2$. It is convenient to work with a somewhat larger space $\mathcal{P}_0 = \times_{i=1}^{k}\mathcal{O}_i/\mathbb{C}^*$, where $\mathbb{C}^* \subset SL_2(\mathbb{C})$ acts as follows:
$$t : (Z_i, X_i^+, X_i^-) \to (Z_i, tX_i^+, t^{-1}X_i^-).$$
The moment map of the torus $\mathbb{C}^*$ action is simply $\sum_i Z_i$. The complex dimension of $\mathcal{P}_0$ is equal to $2(k-1)$. The Hamiltonians are obtained by expanding the quadratic invariant:
$$T(z) = \tfrac{1}{2}\mathrm{Tr}\,\phi(z)^2, \quad \phi(z) = \sum_i \frac{\mu^c_i}{z - z_i},$$
$$T(z) = \sum_i \left[\frac{\zeta_i^2}{(z - z_i)^2} + \frac{H_i}{z - z_i}\right], \tag{3.8}$$
$$H_i = \frac{1}{2}\sum_{j \ne i}\frac{X_i^+ X_j^- + X_i^- X_j^+ + 2Z_i Z_j}{z_i - z_j}.$$
The separation of variables proceeds in this case as follows: write $\phi(z)$ as
$$\phi(z) = \begin{pmatrix} h(z) & f(z) \\ e(z) & -h(z) \end{pmatrix}.$$
Then the Baker–Akhiezer function is given explicitly by:
$$\Psi(z) = \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}, \quad \psi_+ = f, \quad \psi_- = \sqrt{h^2 + ef} - h \tag{3.9}$$
and its zeroes are the roots of the equation
$$f(p_l) = 0 \Leftrightarrow \sum_i \frac{X_i^+}{p_l - z_i} = 0, \quad l = 1, \dots, k-1. \tag{3.10}$$
The eigenvalue $\lambda(p)$ of the Lax operator $\phi$ at the point $p$ is most easily computed using the fact that $T(z) = \lambda(z)^2$. Hence, $\lambda_l = \sum_i \frac{Z_i}{p_l - z_i}$, and
$$X_i^+ = u\,\frac{P(z_i)}{Q'(z_i)}, \quad u \in \mathbb{C}, \qquad Z_i = \sum_l \frac{\lambda_l}{z_i - p_l}\,\frac{Q(p_l)P(z_i)}{Q'(z_i)P'(p_l)}, \tag{3.11}$$
$$P(z) = \prod_{l=1}^{k-1}(z - p_l), \qquad Q(z) = \prod_{i=1}^{k}(z - z_i).$$
The value of $u$ can be set to 1 by the $\mathbb{C}^*$ transformation. The $(\lambda_l, p_l)$'s are the gauge invariant coordinates on $\mathcal{P}_0$. They are defined up to a permutation. It is easy to check that the restriction of the symplectic form $\sum_i \omega_i$ onto the set $\sum_i Z_i = 0$ is the pullback of the form
$$\sum_{l=1}^{k-1} d\lambda_l \wedge dp_l. \tag{3.12}$$
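The change of variables in the $N = 2$ Gaudin model can be tested numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $k = 3$ marked points with made-up values of $z_i$, $X_i^\pm$, $Z_i$ (subject to $\sum_i Z_i = 0$), finds the $k - 1 = 2$ zeroes $p_l$ of $f$ from a quadratic equation, computes $\lambda_l = \sum_i Z_i/(p_l - z_i)$, and then checks that the formulas of (3.11) return the original $X_i^+$ and $Z_i$, with $u = \sum_i X_i^+$. It also checks $T(p_l) = \lambda_l^2$ with $T = h^2 + ef$.

```python
import cmath
import math

# Hypothetical sample data for k = 3 marked points (not from the paper).
z = [0.0, 1.0, 3.0]
Xp = [1.0, 2.0, -0.5]      # X_i^+
Xm = [0.3, -0.8, 1.2]      # X_i^-
Z = [0.7, -1.1, 0.4]       # Z_i, chosen so that sum(Z) == 0

def pole_sum(residues, w):
    """Evaluate sum_i residues[i] / (w - z_i)."""
    return sum(r / (w - zi) for r, zi in zip(residues, z))

def prod_except(vals, i, at):
    """Product over j != i of (at - vals[j])."""
    return math.prod(at - vals[j] for j in range(len(vals)) if j != i)

# The numerator of f is u*w^2 + c1*w + c0 (this closed form is specific to k = 3).
u = sum(Xp)
c1 = -sum(x * (sum(z) - zi) for x, zi in zip(Xp, z))
c0 = sum(x * prod_except(z, i, 0.0) for i, x in enumerate(Xp))
root = cmath.sqrt(c1 * c1 - 4.0 * u * c0)
p = [(-c1 + root) / (2.0 * u), (-c1 - root) / (2.0 * u)]

# Separated variables: zeroes p_l of f and eigenvalues lambda_l = h(p_l).
lam = [pole_sum(Z, pl) for pl in p]

def P(w):
    return math.prod(w - pl for pl in p)

def Q(w):
    return math.prod(w - zi for zi in z)

# Inverse map, following the formulas of (3.11); Q'(z_i) and P'(p_l) are
# products over the remaining roots.
X_rec = [u * P(zi) / prod_except(z, i, zi) for i, zi in enumerate(z)]
Z_rec = [
    sum(
        lam[l] / (zi - p[l]) * Q(p[l]) * P(zi)
        / (prod_except(z, i, zi) * prod_except(p, l, p[l]))
        for l in range(len(p))
    )
    for i, zi in enumerate(z)
]

for i in range(len(z)):
    assert abs(X_rec[i] - Xp[i]) < 1e-9
    assert abs(Z_rec[i] - Z[i]) < 1e-9

# At a zero of f the quadratic invariant T = h^2 + e*f reduces to lambda^2.
for l, pl in enumerate(p):
    T = pole_sum(Z, pl) ** 2 + pole_sum(Xm, pl) * pole_sum(Xp, pl)
    assert abs(T - lam[l] ** 2) < 1e-9
```

The reconstruction of $Z_i$ is Lagrange interpolation: because $\sum_i Z_i = 0$, the polynomial $h(z)Q(z)$ has degree at most $k - 2$ and is fixed by its values $\lambda_l Q(p_l)$ at the $k - 1$ points $p_l$.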
In the last section we present the quantum analogue of this separation of variables.

4. Many-Body Systems: Rational and Trigonometric Cases

In this section we study the Hilbert scheme of points on $S = \mathbb{C}^2$, $S = \mathbb{C}^2/\Gamma$ for $\Gamma \approx \mathbb{Z}_N, \mathbb{Z}$. We show that $S^{[v]}$ has a complex deformation $S^{[v]}_\zeta$ and that each $S^{[v]}_\zeta$ is an integrable model including the complexification of the Sutherland model [18].

4.1. Points on $\mathbb{C} \times \mathbb{C}$. Let us start with $\mathbb{C}^2$. As is well-known [19] the Hilbert scheme of points on $\mathbb{C}^2$ has an ADHM-like description: it is the set of stable triples $(B_1, B_2, I)$, $I \in V \approx \mathbb{C}^v$, $B_1, B_2 \in \mathrm{End}(V)$, $[B_1, B_2] = 0$ modulo the action of $GL(V)$: $(B_1, B_2, I) \sim (gB_1g^{-1}, gB_2g^{-1}, gI)$ for $g \in GL(V)$. Stability means that by acting on the vector $I$ by arbitrary polynomials in $B_1, B_2$ one can generate the whole of $V$. The meaning of the vector $I$ and the operators $B_1, B_2$ is the following. Let $z_1, z_2$ be the coordinates on $\mathbb{C}^2$. Let $Z$ be a zero-dimensional subscheme of $\mathbb{C}^2$ of length $v$. It means that the space $H^0(\mathcal{O}_Z)$ of functions on $Z$ which are the restrictions of holomorphic functions on $\mathbb{C}^2$ has dimension $v$. Let $V$ be this space of functions. Then it has the canonical vector $I$, which is the constant function $f = 1$ restricted to $Z$, and the natural action of two commuting operators: multiplication by $z_1$ and by $z_2$, which are represented by the operators $B_1$ and $B_2$. Conversely, given a stable triple $(B_1, B_2, I)$ the scheme $Z$, or rather the corresponding ideal $I_Z \subset \mathbb{C}[z_1, z_2]$, is reconstructed as follows: $f \in I_Z$ iff $f(B_1, B_2)I = 0$.

Now let us discuss the notion of stability. In Geometric Invariant Theory (GIT) the notion of a stable triple $(B_1, B_2, I)$ would be the following: there exists a holomorphic function $\psi$ on the space of all triples $(\tilde B_1, \tilde B_2, \tilde I)$ such that:
1. For any $g \in GL(V)$, $\psi(g\tilde B_1 g^{-1}, g\tilde B_2 g^{-1}, g\tilde I) = \det(g)\,\psi(\tilde B_1, \tilde B_2, \tilde I)$.
2. $\psi(B_1, B_2, I) \ne 0$.

Let us show the equivalence of the two definitions of stability. Choose any $v$-tuple $f$ of polynomials $f_1, \dots, f_v \in \mathbb{C}[z_1, z_2]$. Choose any non-zero element $\omega \in (\Lambda^v\mathbb{C}^v)^*$. Define a function
$$\tau_f(\tilde B_1, \tilde B_2, \tilde I) = \omega\left(f_1(\tilde B_1, \tilde B_2)\tilde I \wedge \dots \wedge f_v(\tilde B_1, \tilde B_2)\tilde I\right). \tag{4.1}$$
Clearly it obeys the property 1. If the triple $(B_1, B_2, I)$ is stable in the sense of the first definition then there exists a $v$-tuple $f$ for which the vectors $f_1(B_1, B_2)I, \dots, f_v(B_1, B_2)I$ form a basis in $\mathbb{C}^v$ and therefore $\tau_f(B_1, B_2, I) \ne 0$. Conversely, if $\tau_f(B_1, B_2, I) = 0$ for any $f$ then the span $S$ of $\{\mathbb{C}[B_1, B_2]I\}$ is strictly less than $V$. On the other hand $S$ is an invariant subspace.

Now let us discuss another aspect of the space $(\mathbb{C}^2)^{[v]}$. It is a symplectic manifold. To see this let us start with the space of quadruples $(B_1, B_2, I, J)$ with $B_1, B_2, I$ as above and $J \in V^*$. It is a symplectic manifold with the symplectic form
$$\Omega = \mathrm{Tr}\left[\delta B_1 \wedge \delta B_2 + \delta I \wedge \delta J\right] \tag{4.2}$$
which is invariant under the naive action of $G = GL(V)$. The moment map for this action is
$$\mu = [B_1, B_2] + IJ \in (\mathrm{Lie}\,G)^*. \tag{4.3}$$
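The stability condition and the rule "$f \in I_Z$ iff $f(B_1, B_2)I = 0$" from the ADHM-like description can be tried out on the smallest non-trivial case $v = 2$. The sketch below uses hand-picked triples that are assumptions of the example: a double point at the origin along the $z_1$ direction, a pair of distinct points, and an unstable triple. For $v = 2$ stability amounts to $\det[I, B_1 I] \ne 0$ or $\det[I, B_2 I] \ne 0$, which are the functions $\tau_f$ of (4.1) for $f = (1, z_1)$ and $f = (1, z_2)$.

```python
def matvec(M, vec):
    return [sum(M[i][j] * vec[j] for j in range(len(vec))) for i in range(len(M))]

def det2(col0, col1):
    return col0[0] * col1[1] - col0[1] * col1[0]

def is_stable(B1, B2, I):
    """For v = 2: I together with B1*I or B2*I must span C^2."""
    return det2(I, matvec(B1, I)) != 0 or det2(I, matvec(B2, I)) != 0

def apply_poly(poly, B1, B2, I):
    """Compute f(B1, B2) I, with f given as {(a, b): coeff} for z1^a z2^b."""
    total = [0] * len(I)
    for (a, b), coeff in poly.items():
        vec = list(I)
        for _ in range(b):
            vec = matvec(B2, vec)
        for _ in range(a):
            vec = matvec(B1, vec)
        total = [t + coeff * x for t, x in zip(total, vec)]
    return total

def in_ideal(poly, B1, B2, I):
    return all(x == 0 for x in apply_poly(poly, B1, B2, I))

# Double point at the origin, tangent to the z1-axis: the ideal is (z1^2, z2).
B1, B2, I = [[0, 0], [1, 0]], [[0, 0], [0, 0]], [1, 0]
assert is_stable(B1, B2, I)
assert in_ideal({(2, 0): 1}, B1, B2, I)        # z1^2 vanishes on Z
assert in_ideal({(0, 1): 1}, B1, B2, I)        # z2 vanishes on Z
assert not in_ideal({(1, 0): 1}, B1, B2, I)    # z1 does not

# Two distinct points (0, 0) and (1, 2): diagonal matrices, I = (1, 1).
D1, D2, ones = [[0, 0], [0, 1]], [[0, 0], [0, 2]], [1, 1]
assert is_stable(D1, D2, ones)
assert in_ideal({(0, 1): 1, (1, 0): -2}, D1, D2, ones)   # z2 - 2*z1

# Unstable triple: I is a common eigenvector, so C[B1, B2] I is only a line.
zero = [[0, 0], [0, 0]]
assert not is_stable(zero, zero, [1, 0])
```

The first example shows that non-reduced subschemes correspond to non-diagonalizable $B_1$, which is how the Hilbert scheme resolves the symmetric product along the diagonals.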
Let us perform the Hamiltonian reduction, that is: take the zero level set of $\mu$, choose a subset of stable points in the sense of GIT and take the quotient of this subset with respect to $G$. One can show [19] that the stability implies that $J = 0$ and therefore the moment equation reduces to the familiar $[B_1, B_2] = 0$.

Moreover, $(\mathbb{C}^2)^{[v]}$ is an integrable system. Indeed, the functions $\mathrm{Tr}\,B_1^l$ Poisson-commute and are functionally independent for $l = 1, \dots, v$.

It turns out that $(\mathbb{C}^2)^{[v]}$ has an interesting complex deformation which preserves its symplecticity and integrability. Namely, instead of $\mu^{-1}(0)$ in the reduction one should take $\mu^{-1}(\zeta\cdot\mathrm{Id})$ for some $\zeta \in \mathbb{C}$. Now $J \ne 0$. The resulting quotient $S^{[v]}_\zeta$ no longer parametrizes subschemes in $\mathbb{C}^2$ but rather sheaves on a non-commutative $\mathbb{C}^2$, that is the "space" where functions are polynomials in $z_1, z_2$ with the commutation relation $[z_1, z_2] = \zeta$ (see [20] for more details). Nevertheless, the quotient itself is a perfectly well-defined symplectic manifold with an integrable system on it: the functions $H_l = \mathrm{Tr}\,B_1^l$ still Poisson-commute and are functionally independent for $l = 1, \dots, v$. On the dense open subset of $S^{[v]}_\zeta$ where $B_2$ can be diagonalized, $B_2 = \mathrm{diag}(q_1, \dots, q_v)$, the Hamiltonians $H_1, H_2$ can be written as follows:
$$H_1 = \sum_i p_i, \qquad H_2 = \sum_i p_i^2 + \sum_{i<j}\frac{\zeta^2}{(q_i - q_j)^2}, \tag{4.4}$$
where $p_i = (B_1)_{ii}$. These Hamiltonians describe a collection of indistinguishable particles on a (complex) line with a pair-wise potential interaction $\frac{1}{x^2}$. This system is called the rational Calogero model [21]. It is shown in [11] that the space $S^{[v]}_\zeta$ can be used for compactifying the Calogero flows in the complex case and moreover that the same compactification is natural in the KdV/KP realization of Calogero flows [22, 23].
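How an inverse-square potential comes out of the deformed moment map equation can be seen in a small exact computation. The sketch below is an illustration under its own assumptions rather than the paper's derivation: it fixes $v = 3$, takes $B_2 = \mathrm{diag}(q_i)$, and uses the residual diagonal gauge freedom to set $I = (1, \dots, 1)$, which forces $J = (\zeta, \dots, \zeta)$ and $(B_1)_{ij} = \zeta/(q_i - q_j)$ for $i \ne j$. With these conventions $\mathrm{Tr}\,B_1^2$ equals $\sum_i p_i^2 - 2\sum_{i<j}\zeta^2/(q_i - q_j)^2$, which has the inverse-square form of (4.4) up to a rescaling of $\zeta$; the sign and factor in front of the potential depend on such normalization choices.

```python
from fractions import Fraction

def calogero_data(q, p, zeta):
    """Solve [B1, B2] + I J = zeta * Id with B2 = diag(q) and I = (1, ..., 1)."""
    v = len(q)
    B2 = [[q[i] if i == j else Fraction(0) for j in range(v)] for i in range(v)]
    B1 = [[p[i] if i == j else zeta / (q[i] - q[j]) for j in range(v)] for i in range(v)]
    I = [Fraction(1)] * v      # column vector
    J = [zeta] * v             # row vector
    return B1, B2, I, J

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical sample point: distinct positions q_i, arbitrary momenta p_i.
q = [Fraction(0), Fraction(1), Fraction(3)]
p = [Fraction(2), Fraction(-1), Fraction(1, 2)]
zeta = Fraction(5, 7)
v = len(q)
B1, B2, I, J = calogero_data(q, p, zeta)

# Moment map equation mu = [B1, B2] + I J = zeta * Id, checked exactly.
B1B2, B2B1 = matmul(B1, B2), matmul(B2, B1)
mu = [[B1B2[i][j] - B2B1[i][j] + I[i] * J[j] for j in range(v)] for i in range(v)]
assert mu == [[zeta if i == j else 0 for j in range(v)] for i in range(v)]

# Hamiltonians H_l = Tr B1^l for l = 1, 2.
H1 = trace(B1)
H2 = trace(matmul(B1, B1))
pair_sum = sum(zeta ** 2 / (q[i] - q[j]) ** 2 for i in range(v) for j in range(i + 1, v))
assert H1 == sum(p)
assert H2 == sum(x * x for x in p) - 2 * pair_sum
```

Exact rational arithmetic is used so that the moment map equation holds identically rather than up to rounding.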
4.2. Rounding off to $\mathbb{C} \times \mathbb{C}^*$ and to $\mathbb{C}^* \times \mathbb{C}^*$. Now let $z_1, z_2$ be the coordinates on $\mathbb{C} \times \mathbb{C}^*$, i.e. $z_2 \ne 0$. Then the description of the previous section is still valid except that $B_2$ must be invertible now. So in this case the Hilbert scheme of points is obtained by a complex Hamiltonian reduction from the space $T^*(G \times V)$. We skip all the details as they are well-known by now (the complex analogue of the real reduction studied in [24, 25] can be found for example in [7]). The moment map in our notations will be:
$$\mu = B_2^{-1}B_1B_2 - B_1 + IJ, \tag{4.5}$$
which corresponds to the symplectic form:
$$\Omega = \delta\,\mathrm{Tr}\left(B_1B_2^{-1}\delta B_2 + I\delta J\right). \tag{4.6}$$
The reduction at the non-zero level $\mu = \zeta\cdot\mathrm{Id}$ leads to the complex analogue of either the Sutherland [18] or rational Ruijsenaars model [25, 27]. In the former case $H_2 = \mathrm{Tr}\,B_1^2$ while in the latter $H_{\mathrm{rel}} = \mathrm{Tr}\left(B_2 + B_2^{-1}\right)$. On the open dense subset where $B_2$ is diagonalizable, $B_2 = \mathrm{diag}(\exp(2\pi iq_1), \dots, \exp(2\pi iq_v))$, the Hamiltonian $H_2$ equals:
$$H_2 = \sum_i p_i^2 + \sum_{i<j}\frac{\zeta^2}{\sin^2\pi(q_i - q_j)}. \tag{4.7}$$
The expression for $H_{\mathrm{rel}}$ can be found in [7]. Finally let us present the hyperkähler quotient construction of (complex deformation of) the Hilbert scheme of points on $\mathbb{C} \times \mathbb{C}^*$. It is done in terms of the gauge fields on a circle and a triplet of the adjoint Higgs fields. More precisely, consider the space of $su(N)$-valued functions on a circle $\phi^i(t)$, $i = 1, 2, 3$, $t \in \mathbb{R}/(2\pi R)\mathbb{Z}$, and an $su(N)$ gauge field $A$. In addition consider the space of pairs $I, J$, $I \in \mathbb{C}^N$, $J \in \mathbb{C}^{N,*}$. This space is acted on by the gauge group $g(t) \in U(N)$:
$$\phi^i(t) \to g^{-1}(t)\phi^i(t)g(t), \qquad A(t) \to g^{-1}(t)\partial_t g(t) + g^{-1}(t)A(t)g(t),$$
$$I \to g(0)I, \qquad J \to g(0)^{-1}J, \tag{4.8}$$
which preserves the triplet of symplectic forms:
$$\Omega_i = \omega_i(I, J) + \int dt\,\mathrm{Tr}\,\delta A \wedge \delta\phi^i + \frac{i}{2}\varepsilon_{ijk}\mathrm{Tr}\,\delta\phi^j \wedge \delta\phi^k, \tag{4.9}$$
where $\omega_i(I, J)$ is the triplet of symplectic forms on $T^*\mathbb{C}^N$. The gauge group action (4.8) is generated by the triplet of moment maps:
$$m_i = \partial\phi^i + \frac{i}{2}\varepsilon_{ijk}[\phi^j, \phi^k] - \delta(t)\mu_i(I, J), \tag{4.10}$$
where $\mu_i$ are the moment maps for $I, J$: $\mu_3 = II^\dagger - J^\dagger J$, $\mu_1 + i\mu_2 = IJ$. The hyperkähler quotient consists of setting $m_i = \zeta^i\delta(t)$, where $\zeta^i \in \mathbb{R}$, and modding out by the action of the gauge group. If, instead of imposing the $m_3$ equation, one takes the quotient of the space of solutions to $m_{1,2} = \zeta^{1,2}\delta(t)$ by the action of the complexified gauge group $GL_N(\mathbb{C})$ then one recovers the description (4.5), with $B_2 = P\exp\oint_0^{2\pi R}\left(A + i\phi^3\right)$, $B_1 \sim \phi^1(0) + i\phi^2(0)$.

Finally, if both $B_1$ and $B_2$ are invertible then we get the Hilbert scheme of points on $\mathbb{C}^* \times \mathbb{C}^*$. Its complex deformation is a bit more tricky, though. It turns out that it can be obtained via Poisson reduction of $G \times G \times V \times V$. The integrable system one gets in this case is the trigonometric case of the Ruijsenaars model [29].
4.3. ALE models. Slightly generalizing the results of [30–32] one may easily present the finite-dimensional symplectic quotient construction of the Hilbert scheme of points on $T^*\mathbb{P}^1$. Here it is: Take $V = \mathbb{C}^v$, $A = T^*\mathrm{Hom}(V, V)$, $\tilde A = A \oplus A$ and $X = \tilde A \oplus T^*(\mathrm{Hom}(V, \mathbb{C}))$. The space $X$ is acted on by the group $G = GL(V) \times GL(V)$. The maximal compact subgroup $U$ of $G$ preserves the hyperkähler structure of $X$. The hyperkähler quotient of $X$ with respect to $U$ is the Hilbert scheme of points on $T^*\mathbb{P}^1$ of length $v$. This space is an integrable system. We shall prove it in a slightly more general setting. Namely, let $S$ be the deformation of the orbifold $\mathbb{C}^2/\mathbb{Z}_k$, where the generator $\omega = e^{2\pi i/k}$ of $\mathbb{Z}_k$ acts as follows: $(z_0, z_1) \to (\omega z_0, \omega^{-1}z_1)$.

Theorem. $S^{[v]}$ is a holomorphic integrable system.

Proof. The space $S^{[v]}$ can be described as a hyperkähler quotient. Let us take $k + 1$ copies of the space $\mathbb{C}^v$, and denote the $i$-th vector space as $V_i$, $i = 0, \dots, k$. Let us consider the space
$$X = \bigoplus_{i=0}^{k}\left[\mathrm{Hom}(V_i, V_{i+1}) \oplus \mathrm{Hom}(V_{i+1}, V_i)\right] \oplus \mathrm{Hom}(W, V_0) \oplus \mathrm{Hom}(V_0, W),$$
where $k + 1 \equiv 0$. The space $X$ has a natural hyperkähler structure, in particular it has a holomorphic symplectic form:
$$\omega = \mathrm{Tr}\,\delta I \wedge \delta J + \sum_{i=0}^{k}\mathrm{Tr}\,\delta B_{i,i+1} \wedge \delta B_{i+1,i}, \tag{4.11}$$
where $B_{i,j} \in \mathrm{Hom}(V_j, V_i)$, $I \in \mathrm{Hom}(W, V_0)$, $J \in \mathrm{Hom}(V_0, W)$. The space $X$ has a natural symmetry group $G = \prod_{i=0}^{k}U(V_i)$ which acts on $X$ as:
$$B_{i,j} \to g_iB_{i,j}g_j^{-1}, \qquad I \to g_0I, \qquad J \to Jg_0^{-1}.$$
A. Gorsky, N. Nekrasov, V. Rubtsov
The action of the group G preserves the hyperkähler structure of X. The complex moment map has the form: µi = Bi,i+1 Bi+1,i − Bi,i−1 Bi−1,i + δi,0 I J.
(4.12)
The space S [v] is defined as a (projective) quotient of µ−1 (0) by the action of the complexified group G, which we denote as Gc . There is a deformation Sζ[v] which depends on k complex parameters ζ0 , . . . , ζk , i ζi = 0 defined as Sζ[v] = ∩i µ−1 i (ζi Id)/Gc .
(4.13)
Now we present the complete set of Poisson-commuting functions on S [v] : define the “monodromy”: B0 = B0,1 B1,2 . . . Bk,0
(4.14)
which transforms under the action of G in the adjoint representation. The invariants fl = TrBl0 ,
l = 1, . . . , v
(4.15)
clearly Poisson-commute on X, are gauge invariant and therefore descend to the commuting functions on S [v] (and to Sζ[v] as well). The functional independence is easily checked on the dense open set where S [v] can be identified with the symmetric product of S’s. 5. Elliptic Models 5.1. Hilbert scheme of points on T ∗ E. Let E be the ellipic curve, E ∨ ≈ Jac(E) = Pic0 (E). For t ∈ E ∨ let Lt denote the corresponding line bundle. In particular, let 0 ∈ E ∨ be the trivial line bundle: L0 = O. Let S = T ∗ E, let (z1 , z2 ) be the coordinates on the universal cover of S, z2 ∈ T ∗ , π : S → E be the projection. Let Z ∈ S [v] . For each t ∈ E ∨ let Vt be the space H0 (Z, π ∗ Lt |Z ). The spaces Vt form a holomorphic rank v bundle E over E ∨ . Let φ : V → V ⊗ ,1E ∨ be the operator which multiplies a section of π ∗ Lt by a covector (thus we get a map from V × ,1E to V which is the same as V → V ⊗ ,1E ∨ ). The fiber at t = 0 contains a distinguished vector I which is the image of 1 ∈ H0 (Z, π ∗ O|Z ). The triple (E, φ, I ) is stable in the following sense: There exists no holomorphic subbundle F of E, such that φ(F) ⊂ F ⊗ ,1E ∨ and I ∈ F0 . This is an appropriate counterpart of the notion of a stable Higgs pair [33] in the case of genus one. Let Z ∈ S [v] . Consider the cover p : C × C → T ∗ E. Let us lift Z to the covering space. The space V of functions on p ∗ Z is a Z ⊕ Z-module. By Fourier transform we may view this space as a space of functions on a two-torus with values in a v-dimensional vector space. Let t, t¯ be the coordinates on the torus. The elements of the v-dimensional space Vt,t¯ are the functions on p ∗ Z which transform as follows: f (z1 + m + nτ, z2 ) = exp
2π i m t − t¯ + n τ t¯ − τ¯ t f (z1 , z2 ). τ − τ¯
(5.1)
Clearly, the function $f \equiv 1$ belongs to $V_{0,0}$. It is represented by the vector $I \in V_{0,0}$. To reconstruct a scheme $Z$ given a stable triple $(\phi, \bar\partial + \bar A, I)$ one can go to a gauge where both $\phi$ and $\bar A$ are constant ($t$-independent). Then the matrices $(\phi, \bar A)$ commute and together with $I$ form a stable triple suitable for defining a subscheme of $\mathbb{C}^2$ of length $v$. In fact, the gauge where $\phi$ and $\bar A$ are constant allows extra gauge transformations which make the support of the subscheme of $\mathbb{C}^2$ invariant under the action of $\mathbb{Z} \oplus \mathbb{Z}$ by translations. The simplest yet instructive case is that of the diagonalizable $\phi, \bar A$.

One can do more. The Hilbert scheme of points on $T^*E$ can be endowed with a hyperkähler structure. To see this we start with the space $X$ of the pairs $(A, \Phi)$, where $A$ is a $U(N)$ gauge field on the torus $E$ and $\Phi$ is the adjoint-valued one-form. The space $X$ is hyperkähler and the hyperkähler structure is preserved by the action of the gauge group $G$. The latter may also act via evaluation at some points $t_1, \dots, t_s$ on some finite-dimensional hyperkähler manifolds $\mathcal{O}_1, \dots, \mathcal{O}_s$. Let $\mu_k$ be the hyperkähler moment map for $\mathcal{O}_k$. Let us consider the hyperkähler reduction of the space $X \times \mathcal{O}_1 \times \dots \times \mathcal{O}_s$ with respect to $G$. We first impose the hyperkähler moment map equations:
$$\bar\partial\Phi_t + [A_{\bar t}, \Phi_t] = \sum_{k=1}^{s}\mu^c_k\,\delta^{(2)}(t - t_k),$$
$$F_{t\bar t} + [\Phi_t, \Phi_{\bar t}] = \sum_{k=1}^{s}\mu^r_k\,\delta^{(2)}(t - t_k). \tag{5.2}$$
These equations are the genus one analogues of Hitchin's self-duality equations [33]. They were studied from the point of view of complex reduction in [7, 34, 35]. They also appeared in the study of intersecting D-branes on tori [36] (see also [37] for the general discussion of D-branes on tori), and in the attempts to find an appropriate field parameterization of the Yang–Mills theory [38]. The case of our interest here is $s = 1$, $\mathcal{O}_1 = T^*\mathbb{CP}^{N-1}$. The complex reduction in this case has been studied in [34], where it was shown that the solutions to Eqs. (5.2) (in fact of their deformation) form a holomorphic integrable system, which turns out to be the elliptic Calogero–Moser system. The latter describes the system of non-relativistic particles on the elliptic curve $E$, which pairwise interact via the potential $\zeta_c^2\wp(z_i - z_j)$, where $\zeta_c$ is the period of a holomorphic symplectic form on $\mathcal{O}_1$ (see below). The space $\mathcal{O}_1$ in turn can be described as a hyperkähler quotient of $\mathbb{C}^{2N}$ with respect to the action of $U(1)$ which has charges $(+1, -1)$ on $\mathbb{C}^N \oplus \mathbb{C}^N$. Let us denote the elements of $\mathbb{C}^N \oplus \mathbb{C}^N$ as $I \oplus J$, where $J$ and $I$ are the row and the column vectors respectively. Then Eqs. (5.2) can be written explicitly as:
$$\bar\partial\Phi_z + [A_{\bar z}, \Phi_z] = (IJ - \zeta_c\,\mathrm{Id})\,\delta^{(2)}(z),$$
$$F_{z\bar z} + [\Phi_z, \Phi_{\bar z}] = \left(II^\dagger - J^\dagger J - \zeta_r\,\mathrm{Id}\right)\delta^{(2)}(z). \tag{5.3}$$
The solutions to (5.3) are the gauge fields which have monodromy around the puncture z = 0 conjugate to $\exp(I I^\dagger - J^\dagger J - \zeta_r\, \mathrm{Id})$, and the covariantly constant Higgs fields which have a first order pole at z = 0 with the residue given by $(IJ - \zeta_c\, \mathrm{Id})$.

6. Beauville–Mukai Systems

Let S be a surface of K3 type which contains a holomorphically embedded curve Σ of genus g. Fix the numbers N and p. Let $M_{N,p,g}(S, \Sigma)$ be the moduli space of the
pairs (C, L), where [C] = N[Σ], C is a holomorphically embedded curve and L is a line bundle on C of degree p. Let h be the genus of a generic C. It is equal to
\[ h = 1 + N^2 (g - 1). \tag{6.1} \]
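Formula (6.1) is what the adjunction formula gives on a surface with trivial canonical class. A short derivation, under the assumptions that Σ denotes the embedded genus g curve, that C is smooth, and that [C] = N[Σ]:

```latex
% Adjunction on a surface S with trivial canonical class: 2g(D) - 2 = D . D
\[
  2g - 2 = \Sigma \cdot \Sigma ,
  \qquad
  2h - 2 = C \cdot C = N^{2}\, \Sigma \cdot \Sigma = N^{2}(2g - 2)
  \;\Longrightarrow\;
  h = 1 + N^{2}(g - 1).
\]
```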
Let M be the compactification of $M_{N,h,g}(S, \Sigma)$ (the case of general p is left beyond the scope of our investigation), which is defined as the moduli space of pairs (C, L), where C is a curve as above and L is a torsion free rank one coherent sheaf on C (if the curve C is smooth then this is the same as a line bundle), with $\int_C c_1(L) = h$.

Theorem. M is a symplectic manifold, birationally equivalent to $S^{[h]}$ with its standard symplectic structure.

Proof. The statement is actually well-known and is a compilation of two remarks of Beauville [39, 40]. Similar considerations are contained in [41, 42]. Fix a smooth irreducible representative C. A. Beauville proves that under the conditions of our theorem there exists a morphism $\varphi : S \to \mathbb{P}^h$ such that its restriction to C coincides with the canonical embedding of C into $\mathbb{P}^{h-1}$ by the sections of the canonical sheaf $\omega_C$ of C. A hyperplane $H \subset \mathbb{P}^h$ intersects ϕ(S) along a curve $C_H$. The restriction of the line bundle O(1) over $\mathbb{P}^h$ to $C_H$ gives a line bundle $L_H$ of degree h. Given a very ample line bundle L over $C_H$ of the same degree h one may consider the corresponding linear system $|L| = (\mathbb{P}^h)^*$ as the base of the fibration J → |L| with the fiber $\mathrm{Jac}(C_H)$ over a point $C_H$ [39]. Moreover, as it was shown in [43], one can extend this fibration to a “compactified fibration” $\bar J$ with the fibre $\mathrm{Jac}(C_H)$ such that $\dim \bar J = 2h$. This space is an algebraically completely integrable system with respect to a holomorphic symplectic structure introduced in [44]. Through h linearly independent points $p_1, \dots, p_h$ in $S \subset \mathbb{P}^h$ there passes a unique hyperplane H. Hence (generically) there exists a unique genus h curve $C_H$ passing through the points, and a line bundle L whose divisor coincides with $p_1 + \dots + p_h$. This L has a section, unique up to a non-zero constant, with h zeroes at the points $p_1, \dots, p_h$. In other words the pairs (C, L) ∈ M are parameterized by $\bar J$. We have established a correspondence between the points of $\bar M$ and the points of a smooth open subset in the symmetric product $\mathrm{Sym}^h(S)$. Prop. 1.3 of [40] shows that this correspondence actually extends to the birational isomorphism of our theorem. Thus the symplectic structure of Mukai on $\bar J$ maps to a direct sum of symplectic structures on the copies of the surface S.

Remark. The identification of the phase space of Beauville–Mukai integrable systems with the Hilbert scheme of points on the initial K3 surface S provides an analogue of Sklyanin's SoV in terms of projective coordinates in $\mathbb{P}^h$. It again resembles his “magic recipe”: the rôle of zeroes and poles of eigen-bundle sections is played by the zeroes of sections of line bundles L arising under the replacement of $T^*\Sigma$ by the K3 surface S. The following example of a Beauville–Mukai system, proposed in [39], is an illustration of this ad hoc separation.

Example. Let us consider a quartic $S \subset \mathbb{P}^3$ given by the equation $F(X_0, X_1, X_2, X_3) = 0$ in homogeneous coordinates $(X_0 : X_1 : X_2 : X_3)$. This is an example of a K3 surface.
The condition deg F = 4 implies that S has a holomorphic symplectic form, whose associated Poisson bracket can be described quite explicitly. Suppose $X_0 = t \neq 0$. Let $f(x_1, x_2, x_3) = t^{-4} F(t, t x_1, t x_2, t x_3)$. Let g, h be locally defined holomorphic
functions on $S \setminus (S \cap \{X_0 = 0\})$. Extend them to functions of $(x_1, x_2, x_3)$ defined in a neighborhood of $f^{-1}(0)$. Then
\[ \{g, h\} \equiv \frac{dg \wedge dh \wedge df}{dx_1 \wedge dx_2 \wedge dx_3}, \tag{6.2} \]
evaluated at f = 0, is well-defined independently of the choices made, and the Poisson bivector defined in this way extends to the whole of S. Now we want to study the h = 3rd symmetric power of S, more precisely the Hilbert scheme $S^{[3]}$. Introduce the variables $X^i_\alpha$, i = 1, 2, 3, α = 0, 1, 2, 3. We denote $x^i = (x^i_1, x^i_2, x^i_3)$, where $x^i_a = X^i_a / X^i_0$, a = 1, 2, 3. The space $S^{[3]}$ has a natural symplectic form $\omega_h$ which is induced from the symplectic form on S via the Hilbert–Chow morphism $S^{[3]} \to \mathrm{Sym}^3(S)$. Let $\pi = \omega_h^{-1}$ be the corresponding Poisson bivector. We now define three functions on $S^{[3]}$. Assume first that $x^i \neq x^j$ for $i \neq j$. Then the construction of the previous section can be formulated very concretely as follows. There is a unique hyperplane in $\mathbb{P}^3$ which passes through the points $x^1, x^2, x^3 \in S \subset \mathbb{P}^3$. The space of hyperplanes in $\mathbb{P}^3$ is the dual projective space $H = \mathbb{P}^3$:
\[ c = (c_0 : c_1 : c_2 : c_3) \in H \;\mapsto\; \sum_{a=0}^{3} c_a X_a = 0, \qquad X = (X_0 : X_1 : X_2 : X_3) \in \mathbb{P}^3. \tag{6.3} \]
The Hilbert scheme of points contains regions where the points coincide. For a collision of a pair of points $x^1, x^2$ one glues in a line $\mathbb{P}^1$ which determines the direction along which the points collided. In this case there still exists a unique hyperplane in $\mathbb{P}^3$ which passes through this line and the third point $x^3$. Now, if the third point approaches the cluster formed by the first two, then the plane $\mathbb{P}^2$ along which the configuration of the line and the third point collided is contained in the Hilbert scheme of points. Thus we have shown that by associating a plane to the triple of points and the limits one gets a well-defined birational map:
\[ \varphi : S^{[3]} \to H. \tag{6.4} \]
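The map ϕ is given by the explicit formulae:
\[ \varphi_\alpha = \mathrm{Det}\, \Delta_\alpha, \tag{6.5} \]
where $\Delta_\alpha$ is the 3 × 3 matrix obtained from $\|X^i_\alpha\|$ by removing the αth column. In the domain where $\psi_0 := \mathrm{Det}\, \|x^i_a\| \equiv \varphi_0 / (X^1_0 X^2_0 X^3_0) \neq 0$ we may introduce the functions:
\[ H_a = \frac{\psi_a}{\psi_0}, \tag{6.6} \]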
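where $\psi_a = \mathrm{Det}\, M_a$, $(M_a)_{bi} = x^i_b$ for $b \neq a$ and $(M_a)_{ai} = 1$ for any i:
\[
H_1 = \frac{1}{\psi_0}\, \mathrm{Det} \begin{pmatrix} 1 & x^1_2 & x^1_3 \\ 1 & x^2_2 & x^2_3 \\ 1 & x^3_2 & x^3_3 \end{pmatrix}, \quad
H_2 = \frac{1}{\psi_0}\, \mathrm{Det} \begin{pmatrix} x^1_1 & 1 & x^1_3 \\ x^2_1 & 1 & x^2_3 \\ x^3_1 & 1 & x^3_3 \end{pmatrix}, \quad
H_3 = \frac{1}{\psi_0}\, \mathrm{Det} \begin{pmatrix} x^1_1 & x^1_2 & 1 \\ x^2_1 & x^2_2 & 1 \\ x^3_1 & x^3_2 & 1 \end{pmatrix}. \tag{6.7}
\]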
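By Cramer's rule, the ratios of determinants in (6.6) and (6.7) are the coefficients of the affine plane $\sum_a H_a x_a = 1$ through the three points. The sketch below checks this with exact rational arithmetic. The sample points are hypothetical and are not required to lie on the quartic S, so it only illustrates the linear algebra, not the Poisson structure.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def hamiltonians(points):
    """Return (H_1, H_2, H_3) for three affine points x^i = (x^i_1, x^i_2, x^i_3).

    H_a = psi_a / psi_0, where psi_0 = det(x^i_a) and psi_a is the same
    determinant with the a-th column replaced by ones.
    """
    rows = [[Fraction(v) for v in p] for p in points]
    psi0 = det3(rows)
    if psi0 == 0:
        raise ValueError("psi_0 = 0: the configuration is outside the affine chart")
    result = []
    for a in range(3):
        m_a = [[Fraction(1) if b == a else row[b] for b in range(3)] for row in rows]
        result.append(det3(m_a) / psi0)
    return tuple(result)


# Hypothetical sample points, chosen only for illustration.
sample_points = [(1, 2, 3), (2, 0, 1), (-1, 1, 4)]
H = hamiltonians(sample_points)

# Each point should satisfy H_1 x_1 + H_2 x_2 + H_3 x_3 = 1.
incidence = [sum(h * Fraction(x) for h, x in zip(H, p)) for p in sample_points]
```

Rows of the matrices here are indexed by the point label i, matching the display in (6.7).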
These explicit Hamiltonians are defined in $\varphi^{-1}(\mathbb{A}^3)$, where $\mathbb{A}^3$ is the affine part of H corresponding to $\psi_0 \neq 0$. It follows from the identity
\[ \{\psi_a, \psi_b\} = \psi_a \{\psi_0, \psi_b\} - \psi_b \{\psi_0, \psi_a\} \tag{6.8} \]
that the Hamiltonians $H_a$, a = 1, 2, 3, Poisson commute. It would be interesting to investigate this system as an example of a “deformation” of a Hitchin system in the sense of [6] and to study its quantum analogues, which are hopefully related to some Feigin–Odesskii–Sklyanin algebras.

7. Relations to the Physics of D-Branes and Gauge Theories

The above-mentioned constructions of the separation of variables in integrable systems on moduli spaces of holomorphic bundles with some additional structures can be described as a symplectomorphism between the moduli spaces of bundles (more precisely, torsion free sheaves) having different topology, e.g. Chern classes. To be specific let us concentrate on the moduli space $M_v$ of stable torsion free coherent sheaves E on S. Let $\hat A_S = 1 - [\mathrm{pt}] \in H^*(S, \mathbb{Z})$ be the A-roof genus of S. The vector $v = \mathrm{Ch}(E)\, \hat A_S = (r; w; d - r) \in H^*(S, \mathbb{Z})$, $w \in \Gamma^{3,19}$, corresponds to the sheaves with the Chern numbers:
\[ \mathrm{ch}_0(E) = r \in H^0(S; \mathbb{Z}), \qquad \mathrm{ch}_1(E) = w \in H^2(S; \mathbb{Z}), \qquad \mathrm{ch}_2(E) = d \in H^4(S; \mathbb{Z}). \tag{7.1} \]
Type IIA string theory compactified on S has BPS states, corresponding to the Dp-branes, with p even, wrapping various supersymmetric cycles in S, labelled by $v \in H^*(S, \mathbb{Z})$. The actual states correspond to the cohomology classes of the moduli spaces $M_v$ of the configurations of branes. The latter can be identified with the moduli spaces $M_v$ of appropriate sheaves. The string theory compactified on S has a moduli space of vacua, which can be identified with
\[ M_A = O(\Gamma^{4,20}) \backslash O(4, 20; \mathbb{R}) / \bigl( O(4; \mathbb{R}) \times O(20; \mathbb{R}) \bigr), \]
where the arithmetic group $O(\Gamma^{4,20})$ is the group of discrete automorphisms. It maps the states corresponding to different v to each other. The only invariant of its action is $v^2$.

We have studied three realizations of an integrable system.
The first one uses the non-abelian gauge fields on the curve Σ imbedded into the symplectic surface S. Namely, the phase space of the system is the moduli space of stable pairs (E, φ), where E is a rank r vector bundle over Σ of degree l, while φ is a holomorphic section of $\omega^1_\Sigma \otimes \mathrm{End}(E)$. The second realization is the moduli space of pairs (C, L), where C is a curve (divisor) in S which realizes the homology class r[Σ] and L is a line bundle on C. The third realization is the Hilbert scheme of points on S of length h, where $h = \frac{1}{2} \dim M$.

The equivalence of the first and the second realizations corresponds to the physical statement that the bound states of N D2-branes wrapped around Σ are represented by a single D2-brane which wraps a holomorphic curve C which is an N-sheeted covering of the base curve Σ. It is tempting to attribute the equivalence of the second and the third descriptions to T-duality.

8. Discussion

We have attempted to formulate the separation of variables purely in geometric terms. It seems that the proper physical setup is the use of a chain of dualities to get the system of D0 branes on some hyperkähler manifold.¹ A few subjects need further clarification. Evidently an interesting question to investigate is the quantization of the integrable system. According to the work of E. Frenkel and B. Feigin, the quantum separation of variables can be viewed as the geometrical Langlands transform. The latter maps the spectrum of the Hitchin D-module (Hamiltonians) to the data connected to the local system on the dual object [46]. What we have studied is the classical analogue of the Langlands correspondence which maps the phase space of the Hitchin and Mukai systems to the Hilbert scheme of points.

The quantum separation of variables goes as follows. The wave function in the separated variables becomes the product of identical factors, up to a determinant factor $\prod_{i,j} (\xi_i - \xi_j)^{\kappa}$:
\[ \Psi(\xi_1, \dots, \xi_n) = \prod_i Q(\xi_i), \]
where Q(ξ) obeys the so-called Baxter's equation
\[ \mathrm{Det}\, (L(\xi) - \eta)\, Q(\xi) = 0, \]
where L(ξ) is the Lax operator of the dynamical system, evaluated at the zero of the Baker–Akhiezer function. The variable η is viewed as an operator acting on Q(ξ). The commutation relation between ξ and η determines the representation of η, which can be either a differential, a difference or an integral operator. In some sense $Q(\xi_i)$ can be considered as the wave function of a single D0 brane, while the union of integrable systems with all N's is the second quantization.

¹ As our manuscript was at the stage of final preparation, a very interesting paper [45] appeared which studied various aspects of string theory on instanton moduli spaces on hyperkähler manifolds. It also contains an extensive discussion of the duality groups.

As an example of the quantum separation of variables let us consider the Gaudin system. Consider D-modules which describe a system of differential equations on the moduli space $M_G(\Sigma)$ of principal G-bundles over a complex curve Σ with marked points, which is given by a commuting set of differential operators whose symbols coincide with the Hitchin Hamiltonians. Hence they can be considered as a quantization of
Hitchin systems [46]. If the curve has genus zero then the corresponding D-modules give rise to the Gaudin model of the integrable quantum spin chain [47]. Consider the punctured sphere $\Sigma = \mathbb{CP}^1$ and attach to each of the distinct marked points $z_i$, i = 1, ..., n, the $g = sl_2$ Verma module $V_{\lambda_i}$ of highest weight $\lambda_i \in \mathbb{C}$. Represent the Lie algebra $sl_2$ by the operators
\[ e_i = -t_i\, \partial_{t_i}^2 - \lambda_i\, \partial_{t_i}, \qquad h_i = -2 t_i\, \partial_{t_i} - \lambda_i, \qquad f_i = t_i, \tag{8.1} \]
and denote by $C_{ij}$ the Casimir operator $C_{ij} = e_i f_j + f_i e_j + \frac{1}{2} h_i h_j$. The Lax operator is given by
\[ \Phi(z) = \sum_{i=1}^{n} \frac{1}{z - z_i} \begin{pmatrix} h_i & f_i \\ e_i & -h_i \end{pmatrix}. \tag{8.2} \]
Introduce the operators $H_i$:
\[ H_i = \sum_{j \neq i} \frac{C_{ij}}{z_i - z_j}, \tag{8.3} \]
and $\nabla_i = (\kappa + 2)\, \partial_{z_i} - H_i$, where κ ∈ C is a “level”. Consider the infinite-dimensional Lie algebras $\hat g$ and $\hat g_+$: $\hat g = g \otimes \mathbb{C}((z)) \oplus \mathbb{C}K$, where $\mathbb{C}((z))$ are meromorphic functions of a formal parameter z around a marked point, and $\hat g_+ = g \otimes \mathbb{C}[[z]]$ is the Lie subalgebra of $\hat g$ which consists of the power series in z. The quotient $U\hat g / (K - \kappa)$ is denoted by $U_\kappa \hat g$. By definition the conformal blocks are linear functionals
\[ f : \bigotimes_{i=1}^{n} \bigl( U_\kappa(\hat g) \otimes_{U(\hat g_+)} V_{\lambda_i} \bigr) \to \mathbb{C}, \tag{8.4} \]
which are invariant under the action of the Lie algebra $\hat g_{\mathrm{out}}(z_1, \dots, z_n)$ of the Laurent series at the marked points of regular g-valued functions on the marked sphere which vanish at infinity. It is a well-known fact [47] that any such functional defines a number-valued function $f(z_1, \dots, z_n; t_1, \dots, t_n)$ obeying the Knizhnik–Zamolodchikov system
\[ \nabla_i (f) = 0. \tag{8.5} \]
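The differential operators in (8.1) can be checked to satisfy the $sl_2$ relations $[e, f] = h$, $[h, e] = 2e$, $[h, f] = -2f$ by letting them act on polynomials in a single variable t. The sketch below does this with exact rational arithmetic for one site; the value of λ and the helper names are assumptions made for illustration only.

```python
from fractions import Fraction

# A polynomial in t is stored as {power: coefficient}.


def diff(p):
    """Derivative d/dt of a polynomial."""
    return {k - 1: k * c for k, c in p.items() if k != 0}


def times_t(p):
    """Multiplication by t."""
    return {k + 1: c for k, c in p.items()}


def add(p, q, sign=1):
    """Return p + sign * q with zero coefficients removed."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}


def scale(p, s):
    """Return s * p with zero coefficients removed."""
    return {k: s * c for k, c in p.items() if s * c != 0}


def make_ops(lam):
    """Operators of (8.1) for a single site with highest weight lam."""
    def e(p):
        return add(scale(times_t(diff(diff(p))), -1), scale(diff(p), -lam))

    def h(p):
        return add(scale(times_t(diff(p)), -2), scale(p, -lam))

    def f(p):
        return times_t(p)

    return e, h, f


def commutator(a, b, p):
    """Apply [a, b] = ab - ba to the polynomial p."""
    return add(a(b(p)), b(a(p)), sign=-1)


lam = Fraction(7, 3)  # hypothetical sample weight
e, h, f = make_ops(lam)
monomials = [{k: Fraction(1)} for k in range(6)]

ef_is_h = all(commutator(e, f, p) == h(p) for p in monomials)
he_is_2e = all(commutator(h, e, p) == scale(e(p), 2) for p in monomials)
hf_is_minus_2f = all(commutator(h, f, p) == scale(f(p), -2) for p in monomials)
```

Since the operators are linear, agreement on monomials up to a given degree implies agreement on all polynomials of that degree.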
Let us perform the transition to the separated variables $p_i$: $(z, t_1, \dots, t_n) \to (z, p_1, \dots, p_{n-1}, u)$, where the variables (p, u) and t are related by
\[ \sum_{i=1}^{n} \frac{t_i}{z - z_i}\, dz = u\, \frac{\prod_{l=1}^{n-1} (z - p_l)}{\prod_{i=1}^{n} (z - z_i)}\, dz. \tag{8.6} \]
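Relation (8.6) says that u is the leading coefficient and the $p_l$ are the roots of the numerator polynomial $\sum_i t_i \prod_{j \neq i} (z - z_j)$. A minimal sketch for n = 3, where the roots follow from the quadratic formula; the function names and the sample data are hypothetical, and the code assumes $u = \sum_i t_i \neq 0$.

```python
import cmath


def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def numerator(zs, ts):
    """Coefficients of sum_i t_i * prod_{j != i} (z - z_j), lowest degree first."""
    n = len(zs)
    total = [0] * n
    for i in range(n):
        term = [ts[i]]
        for j in range(n):
            if j != i:
                term = poly_mul(term, [-zs[j], 1])
        total = [a + b for a, b in zip(total, term)]
    return total


def separated_variables(zs, ts):
    """Return (u, [p_1, p_2]) for n = 3 marked points, assuming u != 0."""
    c0, c1, c2 = numerator(zs, ts)
    disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
    return c2, [(-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2)]


# Hypothetical sample data.
zs, ts = [0.0, 1.0, 3.0], [2.0, -1.0, 4.0]
u, ps = separated_variables(zs, ts)

# Compare both sides of (8.6) at an arbitrary test point w.
w = 2.5
lhs = sum(t / (w - z) for t, z in zip(ts, zs))
rhs = u * (w - ps[0]) * (w - ps[1]) / ((w - zs[0]) * (w - zs[1]) * (w - zs[2]))
```

For larger n the same numerator polynomial applies, but its roots would have to be found numerically.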
It is clear that the $p_l$, which are the zeroes of the $\Phi_{12}$ element of the Lax matrix, are zeroes of the Baker–Akhiezer function. Therefore the spectral curve $\eta^2 = \mathrm{Det}\, \Phi(z)$
yields the Baxter–Sklyanin equation for the Gaudin magnet
\[ \bigl( \nabla_{p_i}^2 + \mathrm{Det}\, \Phi(p_i) \bigr)\, Q(p_i) = 0, \]
where $\nabla_p = -\partial_p - \sum_i \frac{\lambda_i}{p - z_i}$, which defines a projective connection. If one considers the $g = sl_N$ magnet instead, then the Baxter equation becomes an Nth order differential equation.

In the generic situation there are a few subtle points. First, the proper ordering should be chosen in the quantum operators. Secondly, when discussing the Hitchin system on a higher genus curve one has to deal with globally defined objects with proper modular properties. There are no global solutions to the differential equations in this case. The differential operators are defined globally, though. The explicit computations in the case of genus one [8, 7] give a system which interpolates between the Gaudin model and Calogero–Moser elliptic many-body system. The geometrical approach to SoV for this system is formulated in [48] and elaborated on in [49].

Acknowledgements. We would like to thank B. Feigin, E. Frenkel, V. Ginzburg, B. Enriquez, L. Evain, I. Krichever, S. Kharchev, A. Losev, I. Reider, A. Stoyanovsky and especially D. Markushevich for useful and interesting discussions and to V. Fock for collaboration at the initial stage of the project. Research of A. G. is supported by grants CRDF-RP2-132, INTAS 96-482 and RFFI 97-02-16131; that of N. N. is supported by Harvard Society of Fellows, partially by NSF under grant PHY-98-02-709, partially by RFFI under grant 96-02-18046 and partially by grant 96-15-96455 for scientific schools. He is also grateful to Aspen Center for Physics for hospitality while part of this work was done. V. R. is partly supported by the grant INTAS-96-196, by RFFI under the grants 98-01-00327, by the grants for promotion of French-Russian scientific cooperation: CNRS grant PICS No. 608, RFFI 98-01-2033, and by the grant 96-15-96455 for support of scientific schools. V. R.
is also grateful to Université Louis Pasteur in Strasbourg for its kind hospitality during the Spring term of 1997. His particular thanks are due to M. Audin, O. Debarre and J.-I. Merindol for numerous discussions. We would like to thank the Mittag-Leffler Institute for hospitality at the final stage of preparation of the manuscript.
References
1. Sklyanin, E.: Separation of variables. New trends. solv-int/9505035
2. Sklyanin, E.: Separation of variables in the classical integrable SL(3) magnetic chain. Commun. Math. Phys. 150, 181–191 (1992)
3. Flaschka, H. and McLaughlin, D.W.: Progr. Theor. Phys. 55, 438–456 (1976); Gelfand, I.M. and Dikii, L.A.: Func. Anal. Appl. 13, 8–20 (1979); Novikov, S.P. and Veselov, A.P.: Proc. Steklov Inst. Math. 3, 53–65 (1985)
4. Krichever, I., Phong, D.H.: On the integrable geometry of soliton equations and N=2 supersymmetric gauge theories. hep-th/9604199, J. Diff. Geom. 45, 349–389 (1997)
5. Hitchin, N.: Stable bundles and integrable systems. Duke. Math. J. 54, 91–114 (1987)
6. Donagi, R., Ein, L., Lazarsfeld, R.: alg-geom/9504017; Nilpotent cones and sheaves on K3 surfaces. Birational algebraic geometry (Baltimore, MD, 1996), Contemp. Math., 207, Providence, RI: Am. Math. Soc. 1997, pp. 51–61
7. Nekrasov, N.: Holomorphic bundles and many-body systems. hep-th/9503157, Commun. Math. Phys. 180, 587 (1996)
8. Enriquez, B., Rubtsov, V.: Hitchin systems, higher Gaudin operators and r-matrices. alg-geom/9503010, Math. Res. Lett. 3, no. 3, 343–357 (1996)
9. Hurtubise, J.: Integrable systems and algebraic surfaces. Duke. Math. J. 83, 19–50 (1996)
10. Nakajima, H.: Jack polynomials and Hilbert schemes of points on surfaces. alg-geom/9610021
11. Wilson, G.: Collisions of Calogero–Moser particles and an adelic Grassmanian. Inv. Math. 133, no. 1, 1–41 (1998)
12. Beauville, A.: Variétés Kahleriennes dont la première classe de Chern est nulle. J. Diff. Geom. 18, no. 4, 755–782 (1983)
13. Gorsky, A., Nekrasov, N.: Hamiltonian systems of Calogero type and Two Dimensional Yang–Mills Theory. Nucl. Phys. B 414, 213–238 (1994)
14. Witten, E.: Bound States of Strings and p-Branes. hep-th/9510135, Nucl. Phys. B 460, 335–350 (1996)
15. Vafa, C.: Instantons on D branes. hep-th/9512078, Nucl. Phys. B 463, 435–442 (1996)
16. Garnier, R.: Sur une classe de systemès différentiels Abeliéns déduits de la théorie des équations linéares. Rend. del Circ. Matematice Di Palermo 43, vol. 4 (1919)
17. Gaudin, M.: J. Physique, 37, 1087–1098 (1976)
18. Sutherland, B.: Phys. Rev. A 5, 1372–13 (1972)
19. Nakajima, H.: Lectures on Hilbert schemes of points on surfaces. H.Nakajima’s homepage
20. Nekrasov, N., Schwarz, A.: Instantons on non-commutative R4 and six dimensional (2, 0) superconformal theory. hep-th/9802068, Commun. Math. Phys. 198, 689–703 (1998)
21. Calogero, F.: J. Math. Phys. 12, 419 (1971)
22. Airault, H., McKean, H., Moser, J.: Rational and elliptic solutions of the KdV equation and a related many-body problem. Comm. Pure. Appl. Math. 30, no. 1, 95–148 (1977)
23. Krichever, I.: Funk. Anal. and Appl. 12 1, 76–78 (1978); 14, 282–290 (1980)
24. Kazhdan, D., Kostant, B. and Sternberg, S.: Comm. on Pure and Appl. Math. vol. XXXI, 481–507 (1978)
25. Olshanetsky, M., Perelomov, A.: Phys. Rep. 71, 313 (1981)
26. Ruijsenaars, S.: Commun. Math. Phys. 110, 191–213 (1987)
27. Ruijsenaars, S.N.M., Schneider, H.: Ann. of Physics 170, 370–405 (1986)
28. Moser, J.: Adv. Math. 16, 197–220 (1975)
29. Gorsky, A., Nekrasov, N.: Relativistic Calogero–Moser model as gauged WZW theory. Nucl. Phys. B 436, 582–608 (1995)
30. Kronheimer, P.: The construction of ALE spaces as hyper-kahler quotients. J. Diff. Geom. 28, 665 (1989)
31. Kronheimer, P. and Nakajima, H.: Yang–Mills instantons on ALE gravitational instantons. Math. Ann. 288, 263 (1990)
32. Nakajima, H.: Homology of moduli spaces of instantons on ALE Spaces, I. J. Diff. Geom. 40, 105 (1990); Instantons on ALE spaces, quiver varieties, and Kac–Moody algebras. Preprint; Gauge theory on resolutions of simple singularities and affine Lie algebras. Preprint
33. Hitchin, N.: The self-duality equations on a Riemann surface. Proc. London Math. Soc. 55, 59–126 (1987)
34. Gorsky, A., Nekrasov, N.: Elliptic Calogero–Moser System from Two Dimensional Current Algebra. hep-th/9401021
35. Markman, E.: Spectral curves and integrable systems. Compositio Math. 93, no. 3, 255–290 (1994)
36. Kapustin, A., Sethi, S.: The Higgs Branch of Impurity Theories. hep-th/9804027, Adv. Theor. Math. Phys. 2, 571–591 (1998)
37. Taylor, W.: D-Brane Field Theory on a Compact Space. hep-th/9611042, Phys. Lett. B 394, 283–287 (1997)
38. Bochicchio, M.: The large-N limit of QCD and the collective field of the Hitchin fibration. hep-th/9810015
39. Beauville, A.: Systèmes Hamiltoniens complétement intègrables associés aux surfaces K3. In: Problems in the theory of surfaces and their classification (Cortona, 1988), Sympos. Math. XXXII, London: Academic Press, 1991, pp. 25–31
40. Beauville, A.: Counting rational curves on K3 surfaces. alg-geom/9701019
41. Bershadsky, M., Vafa, C., Sadov, V.: D-branes and topological theories. Nucl. Phys. B 463, 420–434 (1996)
42. Yau, S.-T., Zaslow, E.: BPS States, string duality and nodal curves on K3. hep-th/9512121
43. Altman, A., Kleiman, S.: Compactifying the Picard scheme. Adv. in Math. 35, 50–112 (1980)
44. Mukai, S.: Symplectic structure of the moduli space on a abelian or K3 surface. Invent. Math. 83, 101–116 (1984)
45. Dijkgraaf, R.: Instanton Strings and HyperKähler Geometry. hep-th/9810210
46. Beilinson, A., Drinfeld, V.: Quantization of Hitchin’s fibration and Langlands’ program. In: Algebraic and geometric methods in mathematical physics (Kaciveli, 1993), Math. Phys. Stud. 19, Dordrecht: Kluwer Acad. Publ., 1996, pp. 3–7
47. Feigin, B., Frenkel, E., Reshetikhin, N.: Gaudin Model, Critical Level and Bethe Ansatz. Commun. Math. Phys. 166, 27–62 (1995)
48. Frenkel, E.: Affine algebras, Langlands duality and Bethe ansatz. In: XIth International Congress of Mathematical Physics (Paris, 1994), Cambridge, MA: Internat. Press, 1995, pp. 606–642, q-alg/9506003
49. Enriquez, B., Feigin, B., Rubtsov, V.: Separation of variables for sl2-Calogero-Gaudin system. Compositio Math. 110, no. 1, 1–16 (1998), q-alg/9605030

Communicated by M. Aizenman
Commun. Math. Phys. 222, 319 – 326 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On Connectivity of Julia Sets of Yang–Lee Zeros
Jianyong Qiao¹, Yuhua Li²
¹ Center of Mathematics, China University of Mining & Technology, Beijing 100083, P.R. China. E-mail: [email protected]
² Institute of Mathematics, Chinese Academy of Science, Beijing 100080, P.R. China
Received: 23 May 2000 / Accepted: 9 May 2001
Abstract: We prove that the Julia sets of the Yang–Lee zeros of the Potts model on the diamond hierarchical lattice are connected sets on the complex plane.

1. Introduction

Much interest has been devoted to statistical mechanical models on hierarchical lattices. On such lattices, the exact renormalization-group recursion relation defines a rational mapping on the complex plane (see [3–8]). We consider the Potts model on the diamond hierarchical lattice defined in [3]. The Hamiltonian of the λ-state Potts model is
\[ H = -J \sum_{\langle i, j \rangle} \delta(\sigma_i, \sigma_j), \qquad \sigma_i = 1, 2, \dots, \lambda, \]
where δ is the Kronecker delta, defined by $\delta(\sigma_i, \sigma_j) = 1$ for $\sigma_i = \sigma_j$ and $\delta(\sigma_i, \sigma_j) = 0$ for $\sigma_i \neq \sigma_j$, and the sum is over nearest neighbors. The partition function is then given by
\[ Z = \sum_{\{\sigma_i\}} \exp\Bigl\{ \beta J \sum_{\langle i, j \rangle} \delta(\sigma_i, \sigma_j) \Bigr\}. \]
The recursion relations obeyed by the partition function lead to the following functional equation for the reduced free energy per bond: $f[T_\lambda(z)] = 4 f(z) - 2 \log(2z + \lambda - 2)$, where $z = \exp(\beta J)$ and
\[ T_\lambda(z) = \left( \frac{z^2 + \lambda - 1}{2z + \lambda - 2} \right)^{2}. \tag{1} \]
The research was supported by the 973 Project Foundation of China.
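The map $T_\lambda$ in (1) can be recovered by brute force for integer λ. The sketch below assumes the usual diamond cell, in which one bond is replaced by two parallel paths of two bonds each, and sums over the two internal spins; the ratio of the weights for equal and unequal end spins is the renormalized variable. The function names are hypothetical, and the cell geometry is an assumption about the lattice defined in [3].

```python
from fractions import Fraction
from itertools import product


def T(lam, z):
    """The rational map (1)."""
    return ((z * z + lam - 1) / (2 * z + lam - 2)) ** 2


def diamond_weight(lam, z, s_in, s_out):
    """Sum of Boltzmann weights over the two internal spins of one diamond cell.

    Each satisfied bond (equal neighbouring spins) contributes a factor z.
    """
    total = 0
    for a, b in product(range(lam), repeat=2):
        bonds = (s_in == a) + (a == s_out) + (s_in == b) + (b == s_out)
        total += z ** bonds
    return total


def renormalized_z(lam, z):
    """Effective bond variable: weight for equal ends over weight for unequal ends."""
    return diamond_weight(lam, z, 0, 0) / diamond_weight(lam, z, 0, 1)


z0 = Fraction(3, 2)  # hypothetical sample value of z = exp(beta J)
matches = {lam: renormalized_z(lam, z0) == T(lam, z0) for lam in (2, 3, 4)}
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.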
We recall that the Julia set J(R) of a rational mapping R with deg(R) ≥ 2 is the closure of the set of all repelling periodic points of R (see [1]). It has been shown that the limiting set of the Yang–Lee zeros is the Julia set $J(T_\lambda)$ (see [3, 4, 6]). Since the discovery of the famous Yang–Lee theorem, very little has been known about the distribution of the zeros of the partition function (see [2], [3], [6]); fairly complete information about the Yang–Lee zeros has thus become available due to their connection to the Julia set (see [3–8]). In this paper, we shall prove the following result:

Theorem. $J(T_\lambda)$ is a connected set for arbitrary constant λ ∈ R. $J(T_\lambda)$ is a Jordan curve for λ ∈ R \ (0, 3], while $J(T_\lambda)$ isn't a Jordan curve for λ ∈ (0, 3]. Furthermore, $J(T_\lambda)$ is a quasicircle for λ ∈ R \ [0, 3].

For the fundamental concepts and basic notations of iteration theory of rational mappings, see [1].

2. Some Lemmas

Suppose R is a rational mapping of degree d ≥ 2 from the complex sphere $\bar{\mathbb{C}}$ to itself. We denote the nth iterate of R by $R^n$, and the complement of the Julia set J(R) in $\bar{\mathbb{C}}$ by F(R), which is called the Fatou set of R. J(R) is a closed set and F(R) is an open set. Furthermore, J(R) is nowhere dense in $\bar{\mathbb{C}}$ or $J(R) = \bar{\mathbb{C}}$. In order to prove the above theorem, we need the following results.

Lemma 1 ([1]). Let D be a completely invariant component of F(R). Then all components of F(R) \ D are simply connected and J(R) = ∂D. Furthermore, if F(R) has two completely invariant components D and $D_0$, then $F(R) = D \cup D_0$.

Lemma 2 ([9]). Let F(R) have only two components D and $D_0$. Then J(R) is a Jordan curve. Furthermore, if D and $D_0$ are attractive or superattractive, then J(R) is a quasicircle.

Lemma 3 ([1]). Let $E \subset \bar{\mathbb{C}}$ be a closed set containing at least three points. If $R^{-1}(E) \subset E$, then J(R) ⊂ E.

Lemma 4 ([1]). Suppose $z_0$ is a superattractive fixed point of R and $A(z_0)$ is the immediate basin of $z_0$. If $A(z_0)$ is multiply connected, then $A(z_0)$ contains a critical point c such that $O^+(c) \cap \{z_0\} = \emptyset$, where $O^+(c) = \{z \mid z = R^n(c) \text{ for some } n \in \mathbb{N}\}$.

Lemma 5 ([1], Sullivan's non-wandering domain theorem). For each component D of F(R), there exists a constant n ∈ N such that $R^n(D)$ is a periodic component of F(R).

Lemma 6 ([1]). If D is a forward invariant component of F(R), then there are just five possibilities:
(i) D is the immediate basin of an attractive fixed point, i.e., D is a component of F(R) containing an attractive fixed point;
(ii) D is the immediate basin of a superattractive fixed point, i.e., D is a component containing a superattractive fixed point;
(iii) D is the immediate basin of a parabolic fixed point, i.e., there is a parabolic fixed point $z_0 \in \partial D$ such that $R^n(z) \to z_0$ as n → ∞ for z ∈ D;
(iv) D is a Siegel disc, i.e., R : D → D is analytically conjugate to a Euclidean rotation of the unit disc onto itself;
(v) D is a Herman ring, i.e., R : D → D is analytically conjugate to a Euclidean rotation of some annulus onto itself.
Furthermore, if D is one of the types (i), (ii) and (iii), then D contains at least one critical point of R; if D is one of the types (iv) and (v), then every boundary point of D belongs to the closure of the forward orbit of some critical point of R.

Lemma 7 ([1], Riemann–Hurwitz formula). Let $f : S_0 \to S$ be a branched covering mapping of degree d from the Riemann surface $S_0$ onto the Riemann surface S. Then the number of the branch points, counted with multiplicities, is equal to $d\chi(S) - \chi(S_0)$, where χ is the Euler characteristic.

Lemma 8 ([10]). Denote the set of all rational mappings of degree d by $\mathrm{Rat}_d$. If $R_{w_0} \in \mathrm{Rat}_d$ is a hyperbolic rational mapping, then $R_{w_0}$ is J-stable, i.e., if w is close to $w_0$, the restriction $R_w|_{J(R_w)}$ is topologically conjugate to $R_{w_0}|_{J(R_{w_0})}$, and the conjugating map $\phi_w : J(R_{w_0}) \to J(R_w)$ depends continuously on w.

3. The Proof of Theorem

Firstly, by (1) we know that z = 1, ∞ are always two fixed points of $T_\lambda$ for λ ∈ R and
\[ T_\lambda'(z) = \frac{4 (z - 1)(z + \lambda - 1)(z^2 + \lambda - 1)}{(2z + \lambda - 2)^3}. \tag{2} \]
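Formula (2) follows from the chain and quotient rules applied to (1). The sketch below compares the factorized form with the unsimplified derivative in exact rational arithmetic at a few sample points; the sample values of λ and z are arbitrary choices made for illustration.

```python
from fractions import Fraction


def T(lam, z):
    """The rational map (1)."""
    return ((z * z + lam - 1) / (2 * z + lam - 2)) ** 2


def dT_formula(lam, z):
    """Factorized derivative as stated in (2)."""
    return 4 * (z - 1) * (z + lam - 1) * (z * z + lam - 1) / (2 * z + lam - 2) ** 3


def dT_quotient_rule(lam, z):
    """Derivative of (n/d)^2 with n = z^2 + lam - 1 and d = 2z + lam - 2."""
    n, d = z * z + lam - 1, 2 * z + lam - 2
    return 2 * (n / d) * (2 * z * d - 2 * n) / (d * d)


# Arbitrary sample points (lam, z), kept away from the pole z = 1 - lam/2.
samples = [
    (Fraction(-3), Fraction(2, 5)),
    (Fraction(1, 2), Fraction(3)),
    (Fraction(32, 27), Fraction(-1, 3)),
]
agree = all(dT_formula(lam, z) == dT_quotient_rule(lam, z) for lam, z in samples)

# z = 1 is a fixed point with vanishing derivative whenever lam != 0.
fixed_and_critical = all(
    T(lam, Fraction(1)) == 1 and dT_formula(lam, Fraction(1)) == 0
    for lam, _ in samples
)
```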
So z = 1 and ∞ are two superattractive fixed points of $T_\lambda$ for λ ∈ R \ {0}. We distinguish the following eight cases.

(a) If λ < 0, it is easy to verify that
\[ 0 \le T_\lambda(x) \le T_\lambda(1) = 1 \quad \text{for } x \in [-\sqrt{-\lambda + 1}, \sqrt{-\lambda + 1}], \tag{3} \]
and
\[ x < T_\lambda(x) < 1 \quad \text{for } x \in [0, 1). \tag{4} \]
So it follows from (3) and (4) that $T_\lambda^n(x) \to 1$ as n → ∞ for $x \in [-\sqrt{-\lambda + 1}, \sqrt{-\lambda + 1}]$, and hence $[-\sqrt{-\lambda + 1}, \sqrt{-\lambda + 1}] \subset F(T_\lambda)$. Denote the immediate basin of the fixed point z = 1 by A(1). Since
\[ 0 \in A(1), \qquad T_\lambda^{-1}(0) = \{-\sqrt{-\lambda + 1}, \sqrt{-\lambda + 1}\} \subset A(1), \]
we thus deduce that A(1) is a completely invariant component of $F(T_\lambda)$. Put
\[ f(t) = t^4 - 2t^3 + (2 - \lambda) t - (1 - \lambda), \]
then
\[ f'(t) = 4t^3 - 6t^2 + (2 - \lambda), \qquad f''(t) = 12 t (t - 1). \]
Obviously $f''(t) > 0$ for t > 1 and $f'(1) = -\lambda > 0$, hence $f'(t) > 0$ for t > 1. Furthermore, it follows from f(1) = 0 that f(t) > 0 for t > 1, i.e.,
\[ t^4 + \lambda - 1 > 2t^3 + (\lambda - 2) t \tag{5} \]
for t > 1. If $x > -\frac{\lambda}{2} + 1$, set $x = t^2$, then $2t^2 + (\lambda - 2) > 0$. By (5) we have
\[ \frac{x^2 + \lambda - 1}{2x + \lambda - 2} > \sqrt{x}, \]
and hence $T_\lambda(x) > x$ for $x > -\frac{\lambda}{2} + 1$. It follows that $T_\lambda^n(x) \to \infty$ as n → ∞ for $x > -\frac{\lambda}{2} + 1$, and so $[-\frac{\lambda}{2} + 1, \infty) \subset F(T_\lambda)$. Denote the immediate basin of z = ∞ by A(∞), then $[-\frac{\lambda}{2} + 1, \infty) \subset A(\infty)$. Since $T_\lambda^{-1}(\infty) = \{\infty, -\frac{\lambda}{2} + 1\} \subset A(\infty)$, A(∞) is also a completely invariant component of $F(T_\lambda)$. By Lemma 1 and Lemma 2, $F(T_\lambda) = A(1) \cup A(\infty)$ and $J(T_\lambda)$ is a quasicircle.

(b) If λ = 0, then $T_0(z) = \frac{1}{4}(z + 1)^2$, $T_0'(z) = \frac{1}{2}(z + 1)$, and $T_0$ has only one critical point z = −1. Obviously, z = 1 is a rational indifferent fixed point of $T_0$ and $0 \le T_0(x) < T_0(1) = 1$ for x ∈ [−1, 1). On the other hand, for x ∈ [0, 1) we have $T_0(x) > x$, and hence $T_0^n(x) \to 1$ as n → ∞. So we have $[-1, 1) \subset F(T_0)$. Denote the immediate basin of the rational indifferent fixed point z = 1 by L(1). Since $T_0^{-1}(0) = \{-1\} \subset L(1)$, we deduce that L(1) is a completely invariant component of $F(T_0)$. By Lemma 1 and Lemma 2, $F(T_0) = L(1) \cup A(\infty)$ and $J(T_0)$ is a Jordan curve. It follows from the famous Leau petal theorem (see [1]) that $J(T_0)$ isn't a quasicircle.

(c) If $\lambda \in (0, \frac{32}{27})$, we shall prove that $T_\lambda$ is a hyperbolic rational mapping. For arbitrary λ ∈ (0, 1), it follows from (2) that $T_\lambda$ has six critical points $\pm\sqrt{1 - \lambda}$, $1 - \lambda$, $-\frac{\lambda}{2} + 1$, 1 and ∞. Obviously,
\[ \max_{-\sqrt{1 - \lambda} \le x \le \sqrt{1 - \lambda}} T_\lambda(x) = T_\lambda(1 - \lambda) = (1 - \lambda)^2, \qquad \min_{-\sqrt{1 - \lambda} \le x \le \sqrt{1 - \lambda}} T_\lambda(x) = 0. \tag{6} \]
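The extremal values in (6), and the fate of the critical points inside the interval, can be probed numerically for a sample parameter. The sketch below uses λ = 0.5, an arbitrary illustrative choice in (0, 1), with floating-point arithmetic; it is a numerical sanity check, not a proof.

```python
import math


def T(lam, x):
    """The rational map (1) on the real line."""
    return ((x * x + lam - 1) / (2 * x + lam - 2)) ** 2


def dT(lam, x):
    """Derivative (2)."""
    return 4 * (x - 1) * (x + lam - 1) * (x * x + lam - 1) / (2 * x + lam - 2) ** 3


def iterate(lam, x, steps=200):
    """Apply T repeatedly and return the final point of the orbit."""
    for _ in range(steps):
        x = T(lam, x)
    return x


lam = 0.5  # sample value in (0, 1), chosen for illustration
r = math.sqrt(1 - lam)

# Sample T on [-sqrt(1 - lam), sqrt(1 - lam)] to estimate its maximum and minimum.
grid = [-r + 2 * r * k / 2000 for k in range(2001)]
values = [T(lam, x) for x in grid]

# Orbits of the three critical points lying in the interval.
limits = [iterate(lam, c) for c in (-r, r, 1 - lam)]
x0 = limits[0]
```

For this sample value the three orbits settle on a common point x0 with |T'(x0)| < 1, which is the behaviour expected of an attractive fixed point.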
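We note that $T_\lambda'(z)$ has only one zero z = 1 − λ in $(-\sqrt{1 - \lambda}, \sqrt{1 - \lambda})$, and $T_\lambda(1 - \lambda) = (1 - \lambda)^2 < 1 - \lambda$. So $T_\lambda$ has only one fixed point $x_0$ in $(-\sqrt{1 - \lambda}, \sqrt{1 - \lambda})$. It is easy to verify that $x_0$ is an attractive fixed point. Since
\[ T_\lambda([0, x_0]) = \Bigl[ \Bigl( \frac{\lambda - 1}{\lambda - 2} \Bigr)^2, x_0 \Bigr] \subset (0, x_0] \]
and
\[ T_\lambda([x_0, 1 - \lambda]) = [x_0, (1 - \lambda)^2] \subset [x_0, 1 - \lambda), \]
we have $[0, 1 - \lambda) \subset A(x_0)$. It follows from (6) that
\[ T_\lambda([-\sqrt{1 - \lambda}, \sqrt{1 - \lambda}]) = [0, (1 - \lambda)^2] \subset [0, 1 - \lambda], \]
so that $[-\sqrt{1 - \lambda}, \sqrt{1 - \lambda}] \subset A(x_0)$, and hence the critical points $-\sqrt{1 - \lambda}, \sqrt{1 - \lambda}, 1 - \lambda \in A(x_0)$. Noting $T_\lambda(-\frac{\lambda}{2} + 1) = \infty$, we deduce that $T_\lambda$ is hyperbolic for λ ∈ (0, 1).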
$T_1(z) = \frac{z^4}{(2z - 1)^2}$ has four critical points z = 0, 1, $\frac{1}{2}$, ∞. $T_1$ is obviously hyperbolic too. If λ > 1, by (2) we know that $T_\lambda$ has six critical points: $\pm\sqrt{\lambda - 1}\, i$, $-\lambda + 1$, $-\frac{\lambda}{2} + 1$, 1 and ∞. For arbitrary $\lambda \in (1, \frac{32}{27})$, it is easy to verify that $T_\lambda$ has four real fixed points z = 1, $x_0$, α, β; $x_0$ and 1 are attractive, α and β are repelling, and they satisfy
\[ 0 < x_0 < \alpha < -\frac{\lambda}{2} + 1, \qquad 1 < \beta < \infty. \]
Since
\[ T_\lambda([-\lambda + 1, x_0]) = [(-\lambda + 1)^2, x_0] \subset (-\lambda + 1, x_0], \]
so that $-\lambda + 1 \in [-\lambda + 1, x_0] \subset A(x_0)$. Noting $T_\lambda(\pm\sqrt{\lambda - 1}\, i) = 0$, $0 \in [-\lambda + 1, x_0] \subset A(x_0)$, $T_\lambda(-\frac{\lambda}{2} + 1) = \infty$, we deduce that $T_\lambda$ is also hyperbolic for $\lambda \in (1, \frac{32}{27})$.

By Lemma 4 and Lemma 6, $F(T_1)$ contains only three periodic components A(0), A(1) and A(∞), and they are all simply connected. Since each component of $F(T_1) \setminus (A(0) \cup A(1) \cup A(\infty))$ contains at most one critical point, by Lemma 5 and Lemma 7, we deduce that all components of $F(T_1)$ are simply connected, and hence $J(T_1)$ is a connected set. It is obvious that $J(T_1)$ isn't a Jordan curve. By Lemma 8, $T_\lambda$ is J-stable in $\mathrm{Rat}_4$ for each $\lambda \in (0, \frac{32}{27})$. It follows from the above discussion about $J(T_1)$ that $J(T_\lambda)$ is connected and isn't a Jordan curve for $\lambda \in (0, \frac{32}{27})$.

(d) If $\lambda = \frac{32}{27}$, we can prove that $T_{32/27}$ has only three real fixed points z = 1, $x_0$, $x_1$, with $x_0 \in (0, \frac{11}{27})$ and $x_1 \in (1, +\infty)$; $x_0$ is rational indifferent, 1 is superattractive and $x_1$ is repelling. Since $T_{32/27}(x) \in (0, x_0)$ for $x \in [-\frac{5}{27}, x_0)$ and $T_{32/27}(x) > x$ for $x \in [0, x_0)$, we have $T_{32/27}^n(x) \to x_0$ as n → ∞ for $x \in [-\frac{5}{27}, x_0)$, and hence $[-\frac{5}{27}, x_0) \subset L(x_0)$. Noting $0 \in [-\frac{5}{27}, x_0)$, $T_{32/27}^{-1}(0) = \{\pm\sqrt{5/27}\, i\}$ and that $F(T_{32/27})$ is symmetric with respect to the real axis, we deduce that $L(x_0)$ is a completely invariant component of $F(T_{32/27})$.

By Lemma 1, all components of $F(T_{32/27}) \setminus L(x_0)$ are simply connected, so ∂A(1) and ∂A(∞) are connected. Note $x_1 \in \partial A(1) \cap \partial A(\infty)$. By $J_0$ we denote the component of $J(T_{32/27})$ which contains $\partial A(1) \cup \partial A(\infty)$. Since
sup T 32 (x) = ∞, 11 27 <x<1
27
T 32 (x) = ∞,
sup x0 <x< 11 27
27
T 32 (x) = ∞,
sup 5 −∞<x<− 27
27
inf
T 32 (x) = 1 < x1 ,
inf
T 32 (x) = x0 < x1 ,
11 27 <x<1
x0 <x< 11 27
inf
5 −∞<x<− 27
27
27
T 32 (x) = 27
5 2 < x1 , 27
it follows from the continuity of T 32 that there are three points x1 (j ) (j = 1, 2, 3) 27 satisfying x1 (1) ∈
− ∞, −
11 5 11 , x1 (2) ∈ x0 , − , x1 (3) ∈ ,1 27 27 27
such that T 32 (x1 (j ) ) = x1 (j = 1, 2, 3). Obviously (x1 (3) , x1 ) ⊂ A(1), hence 27
{x1 (3) , x1 } ⊂ ∂A(1). It can be verified that T 32 (x) > x1 for x ∈ (x1 (2) , x1 (3) ), and thus 27
{x1 (2) , x1 (3) } ⊂ ∂D0 , here D0 is a component of F (T 32 ) \ L(x0 ). Since x1 (1) ∈ ∂A(∞) 27
(1) (2) (3) and ∂D0 is connected, we have {x1 , x1 , x1 , x1 } ⊂ J0 . Noting all critical points 5 5 11 i, − 27 , 27 , 1 and ∞ lie on F (T 32 ), we can easily deduce that T 32 −1 (J0 ) = J0 . ± 27 27 27 By Lemma 3, J0 = J (T 32 ), and thus J (T 32 ) is connected. It is obvious that J (T 32 ) isn’t 27 27 27 a Jordan curve.
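(e) If $\lambda \in (\frac{32}{27}, 2)$, put
\[ \Gamma_1 : \left| z + \frac{\lambda}{2} - 1 \right| = \frac{\lambda}{2}, \qquad \Gamma_2 : y^2 = -\frac{\left(x + \frac{\lambda}{2} - 1\right)\left(x^2 + \lambda - 1\right)}{x - \frac{\lambda}{2} + 1} \qquad (z = x + iy). \]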
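It is easy to prove the following properties: (i) $\Gamma_1$ is a circle and $T_\lambda(\Gamma_1) \subset [0, 1]$; (ii) $\Gamma_2$ is a simple curve lying in the strip $\frac{\lambda}{2} - 1 < \mathrm{Re}\,z < -\frac{\lambda}{2} + 1$, it intersects the x-axis at $-\frac{\lambda}{2} + 1$ and intersects the y-axis at $\sqrt{\lambda-1}\,i$ and $-\sqrt{\lambda-1}\,i$, the two ends of $\Gamma_2$ tend to $\frac{\lambda}{2} - 1 + i\cdot\infty$ and $\frac{\lambda}{2} - 1 - i\cdot\infty$ respectively, and $T_\lambda(\Gamma_2) \subset [-\infty, 0]$.
Below, we shall show $A(\infty) \cap \mathrm{Int}(\Gamma_1) = \emptyset$. In fact, otherwise, there is a curve $\gamma_1$ lying in a compact subset of $A(\infty)$ such that $\gamma_1$ connects $\mathrm{Int}(\Gamma_1)$ with $\infty$. Note $[0, 1] \subset \mathrm{Int}(\Gamma_1)$ and $\infty \in \mathrm{Ext}(\Gamma_1)$. Since $T_\lambda(\gamma_1 \cap \Gamma_1) \subset [0, 1)$, we have
\[ T_\lambda(\infty) = \infty, \qquad T_\lambda(\gamma_1) \cap \mathrm{Int}(\Gamma_1) \neq \emptyset. \]
Iterating this procedure, we can deduce that
\[ T_\lambda^n(\gamma_1) \cap \mathrm{Int}(\Gamma_1) \neq \emptyset \quad \text{for all } n \in \mathbb{N}. \]
This contradicts the fact that $T_\lambda^n(\gamma_1) \to \infty$ as $n \to \infty$. We thus obtain $\mathrm{Int}(\Gamma_1) \cap A(\infty) = \emptyset$.
Since the critical points $1 - \lambda$, $\pm\sqrt{\lambda-1}\,i$, $-\frac{\lambda}{2} + 1 \in \mathrm{Int}(\Gamma_1)$, $A(\infty)$ contains only one critical point $z = \infty$. By Lemma 4, $A(\infty)$ is simply connected.
Denote the component of $\mathbb{C} \setminus \Gamma_2$ containing $1 - \lambda$ by $\Omega_L$, and the other component of $\mathbb{C} \setminus \Gamma_2$ by $\Omega_R$. Now we want to show $A(1) \cap \Omega_L = \emptyset$. In fact, otherwise, there is a curve $\gamma_2$ lying in a compact subset of $A(1)$ such that $\gamma_2$ connects $\Omega_L$ with $z = 1$. So $T_\lambda(\gamma_2) \cap (-\infty, 0] \neq \emptyset$, and hence $T_\lambda(\gamma_2) \cap \Omega_L \neq \emptyset$. Since $T_\lambda(1) = 1$, we can iterate this procedure and obtain that
\[ T_\lambda^n(\gamma_2) \cap \Omega_L \neq \emptyset \quad \text{for all } n \in \mathbb{N}. \]
This contradicts the fact that $T_\lambda^n(\gamma_2) \to 1$ as $n \to \infty$. We thus obtain $A(1) \cap \Omega_L = \emptyset$.
Since the critical points $-\lambda + 1$, $\pm\sqrt{\lambda-1}\,i \in \Omega_L$ and $T_\lambda(-\frac{\lambda}{2} + 1) = \infty \in A(\infty)$, $A(1)$ contains only one critical point $z = 1$. By Lemma 4, $A(1)$ is simply connected.
Noting $\pm\sqrt{\lambda-1}\,i$, $-\lambda + 1 \in \Gamma_1$, $T_\lambda(\Gamma_1) = [0, 1]$, $T_\lambda(-\frac{\lambda}{2} + 1) = \infty$, by the same discussion as used above we can prove that the critical point $-\frac{\lambda}{2} + 1$ and any one of the three critical points $\pm\sqrt{\lambda-1}\,i$, $-\lambda + 1$ can't lie on the same component of $F(T_\lambda)$. Furthermore, we shall prove that any two elements of $\{\pm\sqrt{\lambda-1}\,i, -\lambda + 1\}$ can't lie on the same component of $F(T_\lambda)$ either. In fact, otherwise, by the symmetry of $F(T_\lambda)$, we know that $\sqrt{\lambda-1}\,i$ and $-\sqrt{\lambda-1}\,i$ lie on a component $D$ of $F(T_\lambda)$. Then $T_\lambda : D \to D_0$ is a 4-fold covering mapping, where $D_0 = T_\lambda(D)$ is a component of $F(T_\lambda)$. By the above discussion, each one of $T_\lambda^n(D_0)$ $(n \geq 0)$ contains at most one critical point. Denote the repelling fixed point of $T_\lambda$ in $(1, \infty)$ by $x_1$, then $T_\lambda(x) > x$ for $x \in \mathbb{R} \setminus [1, x_1]$. So for any $\eta \in \{\pm\sqrt{\lambda-1}\,i, -\lambda + 1\}$, if $\eta \in J(T_\lambda)$, then $T_\lambda^n(\eta) \to x_1$ as $n \to \infty$; if $\eta \in F(T_\lambda)$, then there exists $n_0 \in \mathbb{N}$ such that $T_\lambda^{n_0}(\eta) \in [1, x_1) \subset A(1)$, or there exists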
$m_0 \in \mathbb{N}$ such that $T_\lambda^{m_0}(\eta) \in A(\infty)$. By Lemma 5 and Lemma 6, $F(T_\lambda)$ has only two periodic components $A(1)$ and $A(\infty)$, and
\[ T_\lambda^n(D_0) = A(1) \quad \text{or} \quad T_\lambda^n(D_0) = A(\infty) \]
for sufficiently large $n$. Since $A(1)$ and $A(\infty)$ are both simply connected, by Lemma 7 we know that $D_0$ is also simply connected. Applying Lemma 7 to $T_\lambda : D \to D_0$, we get
\[ \chi(D) = 4\chi(D_0) - 2 = 2, \]
and thus $D = \overline{\mathbb{C}}$. This is a contradiction. So we have the following conclusion: each component of $F(T_\lambda)$ contains at most one critical point. By Lemma 7, all components of $F(T_\lambda)$ are simply connected, and hence $J(T_\lambda)$ is connected. Since $T_\lambda(-\frac{\lambda}{2} + 1) = \infty$ and $-\frac{\lambda}{2} + 1 \notin A(\infty)$, $F(T_\lambda)$ has infinitely many components. This implies $J(T_\lambda)$ isn't a Jordan curve.
(f) If $\lambda \in [2, 3)$, it is easy to prove that $T_\lambda$ has only one fixed point $x_1$ in $(1, +\infty)$, it is repelling, $[1, x_1) \subset A(1)$ and $(x_1, +\infty] \subset A(\infty)$. So $T_\lambda(x) < x$ for $x \in (1, x_1)$ and $T_\lambda(x) > x$ for $x \in (x_1, +\infty)$. $T_2$ is obviously a hyperbolic rational mapping. If $\lambda \in (2, 3)$, it can be verified that $T_\lambda((\lambda - 1)^2) < (\lambda - 1)^2$. So $T_\lambda(-\lambda + 1) = (\lambda - 1)^2 \in (1, x_1)$, and hence the forward orbit of the critical point $-\lambda + 1$ tends to the superattractive fixed point $1$. Since $\frac{\lambda-1}{\lambda-2} > 2$, we can prove
\[ \frac{\left(\frac{\lambda-1}{\lambda-2}\right)^4 + \lambda - 1}{2\left(\frac{\lambda-1}{\lambda-2}\right)^2 + \lambda - 2} > \frac{\lambda-1}{\lambda-2}. \]
This implies $T_\lambda^2(0) > T_\lambda(0)$. Noting $T_\lambda(0) = \left(\frac{\lambda-1}{\lambda-2}\right)^2 > 1$, we deduce $T_\lambda(0) \in (x_1, +\infty)$, and hence the forward orbits of the critical points $\pm\sqrt{\lambda-1}\,i$ both tend to the superattractive fixed point $\infty$. Since $T_\lambda(-\frac{\lambda}{2} + 1) = \infty$, we know that $T_\lambda$ is also hyperbolic for $\lambda \in (2, 3)$.
Put $\Gamma_1 : |z| = 1$ and $\Gamma_2 : \mathrm{Re}\,z = 0$, then $T_2(\Gamma_1) = [0, 1]$, $T_2(\Gamma_2) = [-\infty, 0]$. Since the critical points $\pm i$, $-1$ and $0$ belong to $\{z : |z| \leq 1\} \cap \{z : \mathrm{Re}\,z \leq 0\}$, by the same discussion as used in the case (e), we can deduce that $J(T_2)$ is connected. Since $F(T_2)$ contains infinitely many components, $J(T_2)$ isn't a Jordan curve. By Lemma 8, for every $\lambda \in [2, 3)$, $T_\lambda$ is J-stable in $\mathrm{Rat}_4$. So it follows from the above discussion about $J(T_2)$ that $J(T_\lambda)$ is connected and isn't a Jordan curve for $\lambda \in [2, 3)$.
(g) If $\lambda = 3$, $T_3$ has six critical points: $-2$, $-\frac{1}{2}$, $1$, $\pm\sqrt{2}\,i$, $\infty$. Since $T_3(-2)$ is a repelling fixed point, $T_3^2(\pm\sqrt{2}\,i) = T_3(-2)$ and $T_3(-\frac{1}{2}) = \infty$, by Lemma 4 and Lemma 6 we can prove that $F(T_3)$ contains only two periodic components $A(1)$ and $A(\infty)$, and they are both simply connected. Put $\Gamma_1 : |z + \frac{1}{2}| = \frac{3}{2}$, then all critical points of $T_3$ belong to $\mathrm{Int}(\Gamma_1)$. Noting $T_3(\Gamma_1) = [0, 1] \subset \mathrm{Int}(\Gamma_1)$, by the same discussion as used in the case (e), we can get $A(\infty) \cap \mathrm{Int}(\Gamma_1) = \emptyset$, and hence $-\frac{1}{2} \notin A(\infty)$. It follows that each component of $F(T_3)$ contains at most one critical point. By Lemma 5 and Lemma 7, we deduce that all components of $F(T_3)$ are simply connected, and hence $J(T_3)$ is connected. It is obvious that $J(T_3)$ isn't a Jordan curve.
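The inequalities used in cases (f) and (g) can be checked numerically. The sketch below assumes that the map has the form $T_\lambda(z) = \left(\frac{z^2+\lambda-1}{2z+\lambda-2}\right)^2$; this form is not restated in this part of the text and is inferred here from the identities $T_\lambda(-\lambda+1) = (\lambda-1)^2$, $T_\lambda(0) = \left(\frac{\lambda-1}{\lambda-2}\right)^2$ and $T_\lambda(-\frac{\lambda}{2}+1) = \infty$ quoted above. The function names and the sample values of $\lambda$ are illustrative only.

```python
import cmath


def T(z, lam):
    """Assumed form of the map: ((z^2 + lam - 1) / (2z + lam - 2))^2."""
    return ((z * z + lam - 1) / (2 * z + lam - 2)) ** 2


def second_iterate_gap(lam):
    """Return T^2(0) - T(0) for a real parameter lam != 2.

    The text states that this gap is positive for lam in (2, 3)
    and negative for lam > 3.
    """
    first = T(0.0, lam)
    return T(first, lam) - first


if __name__ == "__main__":
    for lam in (2.2, 2.5, 2.9, 3.0, 3.5, 5.0):
        # Critical value at z = 1 - lam should equal (lam - 1)^2.
        critical_value = T(1 - lam, lam)
        # The critical points +-sqrt(lam - 1) i should be mapped to 0.
        image_of_imaginary_critical_point = T(cmath.sqrt(1 - lam), lam)
        print(lam, critical_value, abs(image_of_imaginary_critical_point),
              second_iterate_gap(lam))
```

At $\lambda = 3$ the gap vanishes under this assumed form, which agrees with the statement in case (g) that $T_3(-2) = T_3(0)$ is a fixed point.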
(h) If $\lambda > 3$, it is easy to see that $T_\lambda$ has only one fixed point $x_1$ in $(1, +\infty)$, it is repelling and $(x_1, \infty] \subset A(\infty)$, $[1, x_1) \subset A(1)$. So $T_\lambda(x) < x$ for $x \in (1, x_1)$ and $T_\lambda(x) > x$ for $x \in (x_1, +\infty)$. Since $T_\lambda((\lambda-1)^2) > (\lambda-1)^2$ for $\lambda > 3$, we have $T_\lambda(-\lambda+1) = (\lambda-1)^2 > x_1$. Noting
\[ \min_{x \in (-\infty, -\frac{\lambda}{2}+1)} T_\lambda(x) = T_\lambda(-\lambda + 1) = (\lambda - 1)^2, \]
we deduce
\[ T_\lambda\left(\left(-\infty, -\frac{\lambda}{2} + 1\right)\right) \subset (x_1, +\infty) \subset A(\infty). \]
This implies $(-\infty, -\frac{\lambda}{2} + 1) \subset F(T_\lambda)$. On the other hand, $T_\lambda^{-1}(\infty) = \{-\frac{\lambda}{2} + 1, \infty\}$, so that $[-\infty, -\frac{\lambda}{2} + 1] \subset A(\infty)$, and hence $A(\infty)$ is a completely invariant component of $F(T_\lambda)$. It is easy to verify that
\[ \frac{\left(\frac{\lambda-1}{\lambda-2}\right)^4 + \lambda - 1}{2\left(\frac{\lambda-1}{\lambda-2}\right)^2 + \lambda - 2} < \frac{\lambda-1}{\lambda-2} \]
for $\lambda > 3$, i.e., $T_\lambda^2(0) < T_\lambda(0)$. Noting $T_\lambda(0) = \left(\frac{\lambda-1}{\lambda-2}\right)^2 > 1$, we immediately have $T_\lambda(0) \in (1, x_1) \subset A(1)$. Obviously, $T_\lambda(x)$ is monotone decreasing on $[0, 1)$, so
\[ T_\lambda(x) \in (1, T_\lambda(0)] \subset (1, x_1) \subset A(1) \quad \text{for } x \in [0, 1). \]
This shows that $0 \in A(1)$. Since $T_\lambda^{-1}(0) = \{-\sqrt{\lambda-1}\,i, \sqrt{\lambda-1}\,i\}$, $T_\lambda(A(1)) = A(1)$ and $F(T_\lambda)$ is symmetric with respect to the real axis, we thus obtain $\pm\sqrt{\lambda-1}\,i \in A(1)$. This implies $A(1)$ is a completely invariant component of $F(T_\lambda)$. By Lemma 1 and Lemma 2, we deduce that $J(T_\lambda)$ is a quasicircle. The proof of the Theorem is thus complete.

References
1. Beardon, A.F.: Iteration of rational functions. Berlin: Springer, 1991
2. Biskup, M., Kleinwaks, L.J., Chayes, J.T., Kotecký, R.: General theory of Lee-Yang zeros in models with first-order phase transitions. Phys. Rev. Lett. 84, 4794–4797 (2000)
3. Derrida, B., De Seze, L., Itzykson, C.: Fractal structure of zeros in hierarchical models. J. Stat. Phys. 33, 559–569 (1983)
4. Derrida, B., Itzykson, C., Luck J.K.: Oscillatory critical amplitudes in hierarchical models. Commun. Math. Phys. 94, 115–132 (1984)
5. Erzan, A.: Hierarchical q-state Potts models with periodic and aperiodic renormalization group trajectories. Phys. Lett. A 93, 237–240 (1983)
6. Hu, B., Lin, B.: Yang–Lee zeros, Julia sets, and their singularity spectra. Phys. Review A 39, 4789–4796 (1989)
7. Kaufman, K., Griffiths, R.B.: Infinite susceptibility at high temperature in the Migdal-Kadanoff scheme. J. Phys. A 15, L239–L242 (1982)
8. Mckay, S.R., Berker, A.N., Kirkpatrick, S.: Spin-glass behavior in frustrated Ising models with chaotic renormalization-group trajectories. Phys. Rev. Lett. 48, 767–770 (1982)
9. Steinmetz, N.: Jordan and Julia. Math. Ann. 307, 531–541 (1997)
10. Sullivan, D., Thurston, W.: Extending holomorphic motions. Acta Math. 157, 243–257 (1986)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 222, 327 – 369 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Periodic Orbits Contribution to the 2-Point Correlation Form Factor for Pseudo-Integrable Systems
E. Bogomolny, O. Giraud, C. Schmit
Laboratoire de Physique Théorique et Modèles Statistiques, Université de Paris XI, Bât. 100, 91405 Orsay Cedex, France
Received: 10 January 2000 / Accepted: 18 May 2001
Abstract: Using heuristic arguments based on the trace formulas we relate the 2-point correlation form factor, $K_2(\tau)$, at small values of $\tau$ with sums over classical periodic orbits for typical examples of pseudo-integrable systems. The latter sums have been explicitly calculated for the following models: (i) plane billiards in the form of right triangles with one angle $\pi/n$ and (ii) rectangular billiards with the Aharonov-Bohm flux line. In the first model, using the properties of the Veech structure, it is shown that $K_2(0) = (n + \epsilon(n))/(3(n - 2))$, where $\epsilon(n) = 0$ for odd $n$, $\epsilon(n) = 2$ for even $n$ not divisible by 3, and $\epsilon(n) = 6$ for even $n$ divisible by 3. For completeness we also recall informally the main features of the Veech construction. In the second model the answer depends on arithmetical properties of ratios of flux line coordinates to the corresponding sides of the rectangle. When these ratios are non-commensurable irrational numbers, $K_2(0) = 1 - 3\bar\alpha + 4\bar\alpha^2$, where $\bar\alpha$ is the fractional part of the flux through the rectangle when $0 \leq \bar\alpha \leq 1/2$ and it is symmetric with respect to the line $\bar\alpha = 1/2$ when $1/2 \leq \bar\alpha \leq 1$. The comparison of these results with numerical calculations of the form factor is discussed in detail. The above values of $K_2(0)$ differ from all known examples of spectral statistics, thus confirming analytically the peculiarities of statistical properties of the energy levels in pseudo-integrable systems.

1. Introduction

The statistical properties of quantum systems have attracted wide attention in recent years (see e.g. [1]). The investigation of many different models has led to a few accepted conjectures which relate the statistical distribution of quantum energy levels to general properties of the corresponding classical motion. For generic systems these conjectures are the following: for chaotic systems the level spacing distribution follows the Random Matrix statistics [2, 3]; for integrable systems it follows the Poisson statistics [4].
Both conjectures are supported by a large body of numerical evidence and by some analytical arguments [5–7].

Unité Mixte de Recherche de l’Université Paris XI et du CNRS (UMR 8626)
These well-established conjectures are applicable only to completely chaotic or integrable models. But there are systems which are neither chaotic nor integrable. Notable examples of such systems are plane polygonal billiards with all angles, $\alpha_i$, commensurable with $\pi$,
\[ \alpha_i = \pi \frac{m_i}{n_i}, \tag{1} \]
where $m_i$, $n_i$ are co-prime integers. In such systems all trajectories belong to a 2-dimensional surface of genus
\[ g = 1 + \frac{N}{2} \sum_i \frac{m_i - 1}{n_i}, \tag{2} \]
where $N$ is the least common multiple of the $n_i$ (see e.g. [11]). The case where all $m_i = 1$ corresponds to $g = 1$ (i.e. to a torus) which is integrable. If some $m_i > 1$, trajectories belong to a higher genus surface and, consequently, the system is not integrable (at least in the usual sense), but it is not chaotic either since all trajectories belong to a 2-dimensional surface and cannot cover a 3-dimensional energy surface ergodically as is required for chaotic systems. For such reasons these systems are called pseudo-integrable. A natural question appears: what is the spectral statistics of pseudo-integrable systems? Numerical calculations [11–13] clearly demonstrated that statistical properties of such systems differ from standard examples but have many points in common with the statistics of the 3-dimensional Anderson model at the metal-insulator transition point [14]. The full analytical approach to this question meets with difficulties related mostly to the existence of quickly growing terms in the trace formula [15] which do not permit the use of standard methods. The main purpose of this paper is to compute analytically the value of the 2-point correlation form factor, $K_2(\tau)$, in the limit $\tau \to 0$ for two examples of pseudo-integrable systems, (i) a plane billiard in the shape of a right triangle with one angle equal to $\pi/n$ and (ii) a rectangular billiard with a Bohm-Aharonov flux line inside. Using heuristic arguments based on trace formulas we show that in the small-$\tau$ limit the diagonal approximation [16] is valid and the problem reduces to the calculation of distributions of periodic orbit lengths and areas occupied by periodic orbit families. Though for general pseudo-integrable systems very little is known on this subject, triangular billiards in the shape of right triangles with angle $\pi/n$ belong to the so-called Veech polygons [17, 18] and have a hidden group structure which makes possible explicit calculation of the necessary quantities.
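As a small illustration of the genus formula (2) and of the value of $K_2(0)$ quoted in the abstract, the following sketch evaluates both for the right triangle with angles $\pi/2$, $\pi/n$ and $\pi(n-2)/(2n)$. It assumes that $N$ in (2) is the least common multiple of the denominators $n_i$; the function names are hypothetical and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def genus(angles):
    """Genus from Eq. (2); angles is a list of (m_i, n_i) with alpha_i = pi*m_i/n_i."""
    common = lcm(*(n for _, n in angles))  # assumed: N = lcm of the n_i
    total = sum(Fraction(m - 1, n) for m, n in angles)
    g = 1 + Fraction(common, 2) * total
    if g.denominator != 1:
        raise ValueError("angles do not describe a rational polygon")
    return int(g)


def right_triangle(n):
    """Angles pi/2, pi/n, pi*(n-2)/(2n) as reduced fractions (m_i, n_i) of pi."""
    third = Fraction(n - 2, 2 * n)
    return [(1, 2), (1, n), (third.numerator, third.denominator)]


def form_factor_at_origin(n):
    """K_2(0) = (n + eps(n)) / (3 (n - 2)) as stated in the abstract."""
    if n % 2 == 1:
        eps = 0
    elif n % 3 == 0:
        eps = 6
    else:
        eps = 2
    return Fraction(n + eps, 3 * (n - 2))


if __name__ == "__main__":
    for n in range(3, 13):
        print(n, genus(right_triangle(n)), form_factor_at_origin(n))
```

Under these assumptions the triangles with $n = 3, 4, 6$ have $g = 1$ and $K_2(0) = 1$, the Poisson value expected for integrable billiards, while the other values of $n$ give $g > 1$ and $K_2(0) < 1$.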
The calculation yields a finite value of the 2-point correlation form factor at the origin, $0 < K_2(0) < 1$, which is different from both the Poisson distribution (for which $K_2(0) = 1$) and the random matrix results (where $K_2(0) = 0$). An analogous result has also been obtained for rectangular billiards with a Bohm-Aharonov flux line. Non-zero values of the 2-point correlation form factor at the origin confirm peculiar properties of spectral statistics for pseudo-integrable systems. We also discuss the comparison of theoretical predictions with the results of extensive numerical calculations. The plan of the paper is the following. In Sect. 2 the trace formula and the diagonal approximation for the 2-point correlation form factor are discussed. A brief introduction to the Veech structure of certain pseudo-integrable billiards is given in Sect. 3. For clarity we start in Sect. 3.1 with a simple example of square billiards where ideas and methods can easily be illustrated. The required properties of the modular group and the Eisenstein series are briefly reviewed in Sects. 3.2 and 3.3. In Sect. 3.4 the
Veech group for the $\pi/n$ right triangle is derived and in Sect. 3.4.2 the density of periodic orbits for this triangle is computed. In Sect. 3.5 the calculation of the 2-point form factor at the origin is performed and the comparison with the results of numerical calculations is discussed. Section 4 dwells on the calculation of the 2-point form factor for a rectangular billiard with a flux line. As in the previous sections the main point is the calculation of areas swept by periodic orbits around the flux line. The result depends on arithmetical properties of ratios of coordinates of the flux line to the corresponding rectangular sides. In Sect. 5 concluding remarks are presented.

2. The Form Factor in the Diagonal Approximation

2.1. The density of states. The modern semiclassical approximation of multi-dimensional quantum systems is based on various types of trace formulas which express the quantum density of states (and other quantities as well) through quantities computed in purely classical mechanics [8–10]. The main step in deriving trace formulas is the semiclassical approximation for the (advanced) Green function
\[ G_+(\vec x, \vec y) = \sum_n \frac{\Psi_n(\vec x)\,\Psi_n^*(\vec y)}{E - E_n + i\epsilon}, \tag{3} \]
where $E_n$ is the set of energy levels and $\Psi_n$ the eigenfunctions, as a sum over classical trajectories with energy $E$ connecting the initial point $\vec x$ and the final point $\vec y$ [8, 9]
\[ G_+(\vec x, \vec y) = \sum_{\mathrm{tr}} A_{\mathrm{tr}} \exp\left(\frac{i}{\hbar} S_{cl} - i\frac{\pi}{2}\nu\right). \tag{4} \]
$S_{cl}$ is the classical action computed along a trajectory, $A_{\mathrm{tr}}$ is a pre-factor depending on the system considered, and $\nu$ is a phase (the Maslov index) which, roughly speaking, counts points where the simple semiclassical approximation breaks down. For 2-dimensional free motion (and for 2-dimensional polygonal billiards) the semiclassical approximation for $G$ reads (see e.g. [9])
\[ G_+(\vec x, \vec y) = \sum_p \frac{e^{i k l_p - i\frac{\pi}{2}\nu_p - i\frac{3\pi}{4}}}{\sqrt{8\pi k l_p}}, \tag{5} \]
where $l_p$ is the geometrical length of the orbit and $k = \sqrt{E}$ is the wave vector (in the units $\hbar = 1$ and $m = 1/2$). The knowledge of the Green function permits one to find other quantum quantities as well. In particular the quantum density of states
\[ d(E) = \sum_n \delta(E - E_n) \tag{6} \]
may be written by means of the advanced Green function as
\[ d(E) = -\frac{1}{\pi} \mathrm{Im} \int d\vec x\, G_+(\vec x, \vec x). \tag{7} \]
The contribution from very short trajectories gives the mean level density, $\bar d$, and the integration over the space selects periodic orbit contributions [8, 9] and determines an
oscillating part of level density, $d^{(osc)}(E)$. For example, the density of states of an integrable rectangular billiard with sides $a$ and $b$ is
\[ d(E) = \bar d + d^{(osc)}(E). \tag{8} \]
Here the smooth part is
\[ \bar d = \frac{A}{4\pi}, \tag{9} \]
where $A$ is the area of the rectangle (this formula is valid for all 2-dimensional billiards) and the oscillating part is
\[ d_{p.o.}(E) = \sum_{p.p.o.} \sum_{n=1}^{\infty} \frac{A_p}{4\pi} \frac{1}{\sqrt{2\pi k n l_p}}\, e^{i k n l_p - i\frac{\pi}{2}\nu_p - i\frac{\pi}{4}} + \mathrm{c.c.}, \tag{10} \]
where
\[ l_p = \sqrt{(2Ma)^2 + (2Nb)^2}. \tag{11} \]
In the rectangular billiard, the lengths of periodic orbits are 4 times degenerate in the sum (5) because $(\pm M, \pm N)$ give the same length. When the integral (7) is performed, orbits $(M, N)$ and $(-M, N)$ are absorbed in the same $A_p$. The summation in (10) is therefore performed over all primitive periodic orbits of length $l_p$ with $M \geq 0$ repeated $n$ times (an orbit $(M, N)$ and its time-reverse companion $(-M, -N)$ are counted as two different orbits). In all integrable billiards periodic orbits are not isolated but belong to families. $A_p$ is the area of the pencil of periodic orbits of length $l_p$. For the rectangular billiard $A_p = 2A$. Pseudo-integrable systems considered in the paper belong to the class of diffractive systems whose characteristic property is the existence of singularities which make the classical motion undetermined. Each time a classical trajectory hits a singularity there is no unique way to continue it. Quantum mechanics smooths out these singularities and associates with each (not too strong) singularity a diffraction coefficient, $D(\vec n, \vec n')$, (or scattering amplitude) which defines an amplitude of scattering on this singularity from the initial direction $\vec n$ to the final direction $\vec n'$. Correspondingly, the semiclassical approximation of the Green function in the presence of a singularity at point $\vec x_0$ takes the form
\[ G(\vec x, \vec y) = G_0(\vec x, \vec y) + \sum_{\vec n, \vec n'} G_0(\vec x, (\vec x_0, \vec n))\, D(\vec n, \vec n')\, G_0((\vec x_0, \vec n'), \vec y), \tag{12} \]
where $G_0(\vec x, \vec y)$ is the Green function without singularity and $G_0(\vec x, (\vec x_0, \vec n))$ is a contribution to the Green function from a classical trajectory starting at point $\vec x$ and ending at the singularity $\vec x_0$ with momentum in the direction $\vec n$. $G_0((\vec x_0, \vec n'), \vec x')$ is a contribution to the Green function from a classical trajectory starting at point $\vec x_0$ with momentum in the direction $\vec n'$ and ending at point $\vec x'$. This modification of the Green function changes the trace formula. For diffractive systems the density of states can now be written as the sum of three terms [19–21]
\[ d(E) = \bar d + d_{p.o.}(E) + d_{d.o.}(E), \tag{13} \]
where $\bar d$ is the mean level density, $d_{p.o.}$ is the contribution of periodic orbits without singularity, and the third term, $d_{d.o.}(E)$, is a contribution from all classical orbits starting and ending at the singularity (with, in general, different momenta). These trajectories are called diffractive orbits and $d_{d.o.}(E)$ is a sum over all possible combinations of them
\[ d_{d.o.}(E) = \sum_{m=1}^{\infty} \frac{1}{\pi m} \frac{\partial}{\partial E}\, G(\vec n_1, \vec n'_1) D(\vec n'_1, \vec n_2) \ldots G(\vec n_{m-1}, \vec n'_m) D(\vec n'_m, \vec n_1), \tag{14} \]
where $G(\vec n, \vec n')$ is the contribution to the Green function from a classical trajectory starting at the singular point with initial momentum in direction $\vec n$ and ending at it with final momentum in direction $\vec n'$. For polygonal billiards the vertices with $m_i \neq 1$ play the role of singular points [19]. In the case of scattering on the angle $\alpha$ the diffraction coefficient can be derived from Sommerfeld’s exact solution [19]
\[ D(\theta_f, \theta_i) = \frac{2}{\gamma} \sin\frac{\pi}{\gamma} \left[ \frac{1}{\cos\frac{\pi}{\gamma} - \cos\frac{\theta_f + \theta_i}{\gamma}} - \frac{1}{\cos\frac{\pi}{\gamma} - \cos\frac{\theta_f - \theta_i}{\gamma}} \right], \tag{15} \]
where $\gamma = \alpha/\pi$ and $\theta_f$ (resp. $\theta_i$) is the final (resp. initial) scattering angle. For rectangular billiards with Aharonov-Bohm flux lines the flux lines themselves are singular points and the exact solution for an infinite plane with a flux line carrying a flux $\alpha$ [22] gives
\[ D(\theta_f, \theta_i) = \frac{2 \sin \pi\alpha}{\cos\frac{\theta_f - \theta_i}{2}}\, e^{i(\theta_f - \theta_i)/2}. \tag{16} \]
The main difference between pseudo-integrable models discussed in this paper and the usual diffractive models is the divergence of diffraction coefficients (15) and (16) at certain directions (called optical boundaries because in the simplest case they separate illuminated regions from dark ones). Of course, exact solutions do not diverge even in the vicinity of optical boundaries. The divergence comes from the artificial separation of exact waves into geometrical and diffraction parts. Nevertheless, this formal divergence has profound effects on the structure of the trace formula. First, multiple diffraction along optical boundaries needs special treatment. Using a kind of uniform approximation in [15] it was demonstrated that for polygonal billiards such multiple diffraction produces terms proportional up to a numerical factor to $l/k$, where $l$ is the total length of the diffractive orbit. When $l$ is fixed and $k \to \infty$ (as in the usual approach to trace formulas) these terms are smaller than periodic orbit terms (10) but bigger than diffractive terms (14). But to compute spectral correlation functions one needs to consider a limit when $k$ is fixed and $l \to \infty$. In this limit multiple diffraction terms are bigger than both periodic orbit and diffraction terms. Another difficulty is related to the existence of terms corresponding to diffraction not exactly on optical boundaries but sufficiently close to them so that their contributions are also large. Without exact summation of these quickly growing terms it is not possible to find spectral statistics of the systems considered. In the next section we argue that, nevertheless, these terms give a negligible contribution to the value of the 2-point correlation form factor at the origin and only diagonal contributions of periodic orbits will be important for this quantity.
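The behaviour of the two diffraction coefficients can be explored with a short numerical sketch. It transcribes Eqs. (15) and (16) as read from the text above; the function names, the sample angles and the sample flux are illustrative assumptions, and no claim is made about the angle conventions beyond what the formulas themselves state.

```python
import cmath
import math


def wedge_coefficient(theta_f, theta_i, alpha):
    """Diffraction coefficient of Eq. (15) for a vertex of angle alpha."""
    gamma = alpha / math.pi
    c = math.cos(math.pi / gamma)
    plus = 1.0 / (c - math.cos((theta_f + theta_i) / gamma))
    minus = 1.0 / (c - math.cos((theta_f - theta_i) / gamma))
    return (2.0 / gamma) * math.sin(math.pi / gamma) * (plus - minus)


def flux_coefficient(theta_f, theta_i, alpha):
    """Diffraction coefficient of Eq. (16) for a flux line carrying flux alpha."""
    delta = theta_f - theta_i
    return (2.0 * math.sin(math.pi * alpha) / math.cos(delta / 2.0)
            * cmath.exp(1j * delta / 2.0))


if __name__ == "__main__":
    # A vertex with angle pi/3 (m_i = 1): the prefactor sin(pi/gamma) vanishes.
    print(abs(wedge_coefficient(0.4, 0.2, math.pi / 3)))
    # A vertex with angle 3*pi/10 (m_i = 3): the coefficient is non-zero.
    print(abs(wedge_coefficient(0.4, 0.2, 0.3 * math.pi)))
    # Flux line: growth of |D| when theta_f - theta_i approaches pi.
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, abs(flux_coefficient(0.2 + math.pi - eps, 0.2, 0.3)))
```

With these formulas the coefficient vanishes (up to rounding) for a vertex angle $\pi/n$ and for an integer flux, and it grows without bound as the optical boundary $\theta_f - \theta_i = \pi$ of the flux line is approached, which is the formal divergence discussed in the text.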
2.2. The 2-point correlation form factor. The 2-point correlation function is related to the level density by the formal expression
\[ R_2(\epsilon) = \left\langle d\left(E + \frac{\epsilon}{2}\right) d\left(E - \frac{\epsilon}{2}\right) \right\rangle, \tag{17} \]
where the brackets denote an energy averaging around $E$ on an energy window much larger than the mean level spacing $1/\bar d$, and much smaller than the energy $E$. The two-point correlation form factor is the Fourier transform of $R_2(\epsilon)$:
\[ K_2(\tau) = \int_{-\infty}^{\infty} \frac{d\epsilon}{\bar d} \left\langle d\left(E + \frac{\epsilon}{2}\right) d\left(E - \frac{\epsilon}{2}\right) \right\rangle e^{2i\pi \bar d \tau \epsilon}, \tag{18} \]
(the factors are chosen so that $\tau$ and $K_2$ are dimensionless). Trace formulas, roughly speaking, state that the density of states can be represented as a sum over classical orbits (both periodic and diffractive)
\[ d^{(osc)}(E) = \sum_p C_p e^{i S_p(E)/\hbar} + \mathrm{c.c.} \tag{19} \]
Substituting this formal expansion into (17) and using the expansion $S(E + \epsilon) \approx S(E) + T(E)\epsilon$, where $T(E)$ is the period of classical motion, one obtains [16]
\[ \left\langle d\left(E + \frac{\epsilon}{2}\right) d\left(E - \frac{\epsilon}{2}\right) \right\rangle = \sum_{p_1, p_2} C_{p_1} C^*_{p_2} \left\langle \exp\left(\frac{i}{\hbar}\left(S_{p_1}(E) - S_{p_2}(E)\right)\right) \right\rangle e^{i(T_{p_1} + T_{p_2})\epsilon/(2\hbar)}. \tag{20} \]
Here the terms corresponding to the sum of actions are omitted as it is assumed that they are washed out by the smoothing procedure. The corresponding expression for the 2-point correlation form factor is the following:
\[ K_2(\tau) = \frac{2\pi\hbar}{\bar d} \sum_{p_1, p_2} C_{p_1} C^*_{p_2} \left\langle e^{i(S_{p_1}(E) - S_{p_2}(E))/\hbar} \right\rangle \delta\left(\frac{T_{p_1} + T_{p_2}}{2} - 2\pi\hbar \bar d \tau\right). \tag{21} \]
The main difficulty in such an approach is the computation of the mean value of terms with action differences
\[ F(E) = \left\langle e^{i(S_{p_1}(E) - S_{p_2}(E))/\hbar} \right\rangle. \tag{22} \]
The best developed approximation (called the diagonal approximation) consists in taking into account only terms with exactly the same actions [16], i.e.
\[ F(E) = \begin{cases} 1, & \text{if } S_{p_1}(E) = S_{p_2}(E), \\ 0, & \text{if } S_{p_1}(E) \neq S_{p_2}(E), \end{cases} \tag{23} \]
since terms with $S_{p_1}(E) \neq S_{p_2}(E)$ will vanish by smoothing over $E$. In this approximation (assuming that for orbits with equal actions the pre-factors are also equal, which is not always the case) the 2-point correlation form factor takes the form
\[ K_2^{(diag)}(\tau) = \frac{2\pi\hbar}{\bar d} \sum_p g_p^2 |C_p|^2 \delta(T_p - 2\pi\hbar \bar d \tau), \tag{24} \]
where $g_p$ is the multiplicity of a given periodic orbit (i.e. the number of orbits with exactly the same action) and the summation is performed over orbits with different actions. In particular for integrable and pseudo-integrable systems from Eq. (10) one gets
\[ K_2^{(diag)}(\tau) = \frac{1}{8\pi^2 \bar d} \sum_p \frac{|A_p|^2}{l_p} g_p^2\, \delta(l_p - 4\pi k \bar d \tau), \tag{25} \]
where as before $l_p$ is the length of a periodic orbit and $A_p$ is the surface occupied by a periodic orbit family. It is instructive to perform the calculation for the simplest example of the rectangular billiard with sides $a$ and $b$. A periodic orbit in this billiard is defined by 2 integers $m$, $n$ and its length is
\[ l_p = \sqrt{(2ma)^2 + (2nb)^2}. \tag{26} \]
As pairs $(m, n)$ and $(m, -n)$ belong to the same family (or torus) the degeneracy is $g_p = 2$ (we remind the reader that in the rectangular billiard the terms corresponding to $m < 0$ are already taken into account in $A_p$), and it is sufficient to compute the density of periodic orbits with positive $m$, $n$,
\[ \rho(l) = \sum_{m, n \geq 0} \delta(l - l_p). \tag{27} \]
Changing the summation over integers $(m, n)$ to an integration and using the substitution $m = r\cos\phi/(2a)$ and $n = r\sin\phi/(2b)$, one obtains by integrating over $\phi$ from 0 to $\pi/2$,
\[ \rho(l) = \frac{\pi l}{8A}. \tag{28} \]
Since all families of periodic orbits in the rectangle cover the same area $A_p = 2A$ and the length multiplicity is $g_p = 2$, the 2-point correlation form factor for the rectangular billiard in the diagonal approximation is
\[ K_2(\tau) = \frac{2A^2}{\pi^2 \bar d} \int_0^\infty \frac{1}{l}\, \delta(l - 4\pi k \bar d \tau)\, \rho(l)\, dl = 1, \tag{29} \]
which is the expected value for the form factor of integrable systems [16]. The diagonal approximation (23) is known (with physical accuracy) to be valid for generic integrable systems [16] and can be modified [23] to compute mean values of more than 2 actions in the exponent of (23). For general systems the validity of the diagonal approximation is restricted only to small values of $\tau$ [16, 24] and it is usually used to compute the first non-zero term of the expansion of the 2-point correlation form factor in powers of $\tau$.
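The density (28) and the result (29) can be checked numerically with a minimal sketch. It counts the pairs $(m, n)$ with $m, n \geq 0$ whose length (26) does not exceed a cutoff and compares the count with the integral of (28), $\int_0^l \rho(l')\,dl' = \pi l^2/(16A)$. The function names, the cutoff and the side lengths are arbitrary illustrative choices.

```python
import math


def count_orbit_families(l_max, a, b):
    """Number of pairs (m, n), m, n >= 0, not both zero, with l_p <= l_max."""
    count = 0
    for m in range(int(l_max / (2 * a)) + 1):
        rest = l_max ** 2 - (2 * m * a) ** 2
        if rest < 0:
            continue
        # n runs over 0, 1, ..., floor(sqrt(rest) / (2b))
        count += int(math.sqrt(rest) / (2 * b)) + 1
    return count - 1  # remove the trivial pair (0, 0)


def integrated_density(l_max, a, b):
    """Integral of rho(l) = pi*l/(8A) from 0 to l_max, with A = a*b."""
    return math.pi * l_max ** 2 / (16 * a * b)


def diagonal_form_factor(a, b):
    """Right-hand side of Eq. (29) after the delta function is integrated out."""
    area = a * b
    d_bar = area / (4 * math.pi)          # Eq. (9)
    rho_over_l = math.pi / (8 * area)     # Eq. (28) divided by l
    return 2 * area ** 2 / (math.pi ** 2 * d_bar) * rho_over_l


if __name__ == "__main__":
    a, b, l_max = 1.0, 1.3, 400.0
    print(count_orbit_families(l_max, a, b) / integrated_density(l_max, a, b))
    print(diagonal_form_factor(a, b))
```

The ratio printed first approaches 1 as the cutoff grows, the deviation being a boundary correction of relative order $1/l$, and the second number reproduces $K_2 = 1$ independently of the side lengths.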
For diffractive systems with finite diffraction coefficient one can use the diagonal approximation for both periodic orbit terms and diffractive terms. But when the diffraction coefficient diverges in certain directions these calculations lead to difficulties. For example, multiple diffraction on optical boundaries corresponding to $n$ repetitions of a primitive periodic orbit in a pseudo-integrable billiard gives the following terms [15]:
\[ d_{mult.diff.}(E) = \sum_{l, n} \frac{l}{k}\, c_n \cos(knl), \tag{30} \]
where $c_n$ are certain numerical coefficients. The attempt to use the diagonal approximation for these terms leads to the following result
\[ K_2^{(mult.diff.)} \sim k^2 \tau^3, \tag{31} \]
if we take into account that the density of primitive periodic orbits in pseudo-integrable systems (at least for Veech systems (see the next section)) differs only by a numerical factor from Eq. (28). But this expression contains powers of the momentum $k$ and when $k \to \infty$ it cannot be correct. All terms corresponding to diffraction on or close to optical boundaries give similar quickly growing terms which cannot be treated separately. Without a resummation of these terms the determination of spectral statistics of such models does not seem possible. These arguments suggest the following scenario. The 2-point form factor is a sum of two terms
\[ K_2(\tau) = f_1(k^\alpha \tau) + f_2(\tau), \tag{32} \]
where $\alpha$ is a certain positive quantity. The first function, $f_1(x)$, describes the result of the resummation of quickly growing terms connected with the divergence of the diffraction coefficient and when $x \to \infty$ $f_1(x)$ should go quickly to zero. The second function, $f_2(x)$, is a contribution of diffraction far from optical boundaries and can be computed similarly to ordinary diffraction [25] in a perturbation series in $\tau$. Of course, this is only a plausible conjecture and more detailed investigation should be done to give credit to it. Though the divergence of the diffraction coefficient prevents the calculation of the 2-point correlation form factor in the full range, one can still use the trace formula (14) to find its behavior at the origin, $\tau = 0$. The main point is that, even when the diffraction coefficient formally diverges, the exact waves remain finite and using a uniform approximation [15] one can demonstrate that the ratio
\[ \frac{D(\vec n, \vec n')}{\sqrt{kl}} \tag{33} \]
is bounded for all angles, lengths and momenta. Each term in the diffractive trace formula (14) is a product of a certain number of these ratios and the total period of the corresponding composite orbit which appears due to the derivative over energy. Therefore it is of order of $\tau$ multiplied by a constant and in the limit $\tau \to 0$ all diffractive terms disappear. Only the periodic orbit contribution (10) remains important at small values of $\tau$. From Eq. (25) one concludes that for pseudo-integrable systems
\[ K_2(0) = \lim_{\tau \to 0} \frac{1}{8\pi^2 \bar d} \sum_p \frac{|A_p|^2}{l_p} g_p^2\, \delta(l_p - 4\pi k \bar d \tau). \tag{34} \]
The main problem now is how to compute the density of periodic orbits and the distribution of the areas of periodic orbit families. For generic pseudo-integrable systems very little is known and no reliable calculations can be done. E.g. for general plane polygonal billiards with angles commensurable with $\pi$ it has only been proved [27, 28] that the number of periodic orbits with length less than $l$, $N(l_p \leq l)$, obeys the inequalities
\[ c_1 l^2 < N(l_p \leq l) < c_2 l^2 \tag{35} \]
for certain constants $c_1$ and $c_2$ (depending on the polygon). But even the existence of an asymptotic law for $N(l_p \leq l)$ was not proved. Fortunately, there is a sub-class of pseudo-integrable billiards for which all necessary quantities can be computed due to the existence of a hidden group structure, and the triangular billiard in the shape of the right triangle with one angle equal to $\pi/n$ belongs to this class. In the following section we focus on those polygons.

3. Veech Structures for Polygonal Billiards

We start the discussion of a hidden group structure of certain polygonal billiards with the simple example of the square billiard where the necessary ideas and methods can be illustrated clearly without technical difficulties.
3.1. A simple case: The square billiard. How can one evaluate the number of periodic orbits with length less than $l$ in a square billiard of size $2a$ with periodic boundary conditions? The exact expression for the length of the periodic orbits in such a billiard is, of course,
\[ l_p = \sqrt{(2ma)^2 + (2na)^2} \tag{36} \]
with $m \in \mathbb{N}$ and $n \in \mathbb{Z}$ (see (11)). The number of periodic orbits with length less than $l$, $N(l_p \leq l)$, reads
\[ N(l_p \leq l) = \sum_{m, n} \Theta\left(l - 2a\sqrt{m^2 + n^2}\right) \tag{37} \]
and asymptotically when $l \to \infty$
\[ N(l_p \leq l) = \int_0^\infty dm \int_{-\infty}^{\infty} dn\, \Theta\left(l - 2a\sqrt{m^2 + n^2}\right) = \frac{\pi l^2}{8a^2} \tag{38} \]
if one sets $m = (r\cos\varphi)/2a$ and $n = (r\sin\varphi)/2a$. This is the number of all periodic orbits. More interesting questions and rich mathematical structure appear when one is interested in the calculation of the number of primitive periodic orbits $N_{pp}(l_p \leq l)$ (that is, orbits with $m$ and $n$ coprime). The number of such orbits for a square billiard can easily be computed by using the inclusion-exclusion principle. The number of primitive periodic orbits with length less than $l$ is the total number of periodic orbits with length less than $l$ minus the number
of orbits repeated $p$ times with prime $p$, to which we add orbits repeated $p_1 p_2$ times, which had been subtracted twice, etc. Finally one concludes that

$$N_{pp}(l_p \le l) = N(l_p \le l) - \sum_{p} N\!\left(l_p \le \frac{l}{p}\right) + \sum_{p_1, p_2} N\!\left(l_p \le \frac{l}{p_1 p_2}\right) - \sum_{p_1, p_2, p_3} N\!\left(l_p \le \frac{l}{p_1 p_2 p_3}\right) + \dots \tag{39}$$

Using the $l^2$ dependence of $N$ in (38), we have

$$N_{pp}(l_p \le l) = N(l_p \le l)\left(1 - \sum_{p} \frac{1}{p^2} + \sum_{p_1, p_2} \frac{1}{(p_1 p_2)^2} - \sum_{p_1, p_2, p_3} \frac{1}{(p_1 p_2 p_3)^2} + \dots\right) = N \prod_p \left(1 - \frac{1}{p^2}\right) = \frac{N}{\zeta(2)} = \frac{6}{\pi^2} N, \tag{40}$$

where

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \frac{1}{1 - p^{-s}} \tag{41}$$

is the Riemann zeta function. From (38) one gets

$$N_{pp}(l_p \le l) = \frac{3 l^2}{4\pi a^2}. \tag{42}$$

Our aim is to generalize the previous calculation of $N_{pp}(l_p \le l)$ to certain triangular billiards. This generalization naturally appears [17] when one considers carefully the usual geometrical picture of the free motion inside the square billiard. It is well known that any trajectory of such a motion can be unfolded to a straight line when, instead of the square billiard, one considers the motion on the covering space, which for the square billiard is a plane with an infinite square lattice of side $2a$. The vertices of this lattice (which are the images of the vertices of the initial square) have coordinates

$$x = 2am, \quad y = 2an \tag{43}$$
with integers $m$ and $n$ and can be considered as the result of the application of a $2 \times 2$ matrix with integer coefficients to a horizontal vector $(2a, 0)$,

$$\begin{pmatrix} 2am \\ 2an \end{pmatrix} = \begin{pmatrix} m & k \\ n & l \end{pmatrix} \begin{pmatrix} 2a \\ 0 \end{pmatrix}. \tag{44}$$

Thus, the periodic orbit lengths (36) are the distances between these vertices and the initial point $(0, 0)$. The problem of finding the number $N_{pp}$ of primitive periodic orbits with length less than $l$ is therefore equivalent to the problem of finding how many $2 \times 2$ matrices with integer coefficients and determinant equal to 1 (since $m$ and $n$ are coprime one can impose $ml - nk = 1$) exist with $m^2 + n^2 \le x^2$ for a given $x$ (or, which is equivalent, with $n^2 + l^2 \le x^2$). The $2 \times 2$ matrices with integer coefficients and determinant equal to 1 form the group $SL(2, \mathbb{Z})$, and in the next two sections we shall discuss its main properties. Though this material is well known, we find it useful to recall it informally.
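The counting argument of this subsection can be checked by brute force. The following is a minimal sketch, not part of the original calculation: the function name `count_orbits` and the sample values of `a` and `l` are chosen here for illustration. It enumerates the lattice vectors $(m, n)$ with $m \ge 1$, $n \in \mathbb{Z}$ and compares the counts with Eqs. (38) and (42).

```python
import math

def count_orbits(l, a):
    """Count pairs (m, n), m >= 1, n in Z, with 2a*sqrt(m^2 + n^2) <= l.

    Returns (all_orbits, primitive_orbits); "primitive" means gcd(m, |n|) == 1.
    """
    r = l / (2.0 * a)                      # radius in units of the lattice spacing
    r2 = r * r
    total = 0
    primitive = 0
    for m in range(1, int(r) + 1):
        n_max = int(math.sqrt(max(0.0, r2 - m * m)))
        for n in range(-n_max, n_max + 1):
            total += 1
            if math.gcd(m, abs(n)) == 1:
                primitive += 1
    return total, primitive

a, l = 0.5, 200.0                                   # illustrative values
total, primitive = count_orbits(l, a)
n_all_theory = math.pi * l ** 2 / (8 * a ** 2)      # Eq. (38)
n_pp_theory = 3 * l ** 2 / (4 * math.pi * a ** 2)   # Eq. (42)
print(total / n_all_theory, primitive / n_pp_theory)
```

Both ratios should approach 1 as $l$ grows; at finite $l$ they differ from 1 by boundary corrections of relative order $a/l$.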
3.2. The modular group. The subgroup of $SL(2, \mathbb{R})$ containing all $2 \times 2$ matrices with integer coefficients and determinant equal to 1 is called the modular group $SL(2, \mathbb{Z})$. The standard representation of this group (see e.g. [26]) acts on the Poincaré half-plane $H$ with metric

$$ds^2 = \frac{1}{y^2}\,(dx^2 + dy^2). \tag{45}$$

A matrix $g = \begin{pmatrix} m & k \\ n & l \end{pmatrix} \in SL(2, \mathbb{Z})$ is represented by the isometry

$$g : H \to H, \quad z \mapsto \frac{mz + k}{nz + l}. \tag{46}$$
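The modular group is generated by the translation $T : z \mapsto z + 1$ and the inversion $S : z \mapsto -1/z$, which correspond respectively to the matrices

$$\begin{pmatrix} 1 & \alpha \\ 0 & 1 \end{pmatrix} \tag{47}$$

(with $\alpha = 1$ for the modular group) and

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{48}$$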
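As a small illustration of the correspondence between matrices and the maps (46), the sketch below checks the action of the two generators on a sample point and the standard relations $S^2 = -1$ and $(ST)^3 = 1$, which hold for the matrices (47) and (48) with $\alpha = 1$. The helper names `matmul` and `mobius` and the sample point are introduced here only for this example.

```python
def matmul(g, h):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    (e, f), (p, q) = h
    return ((a * e + b * p, a * f + b * q), (c * e + d * p, c * f + d * q))

def mobius(g, z):
    """Action (46) of g = ((m, k), (n, l)) on a point z of the upper half-plane."""
    (m, k), (n, l) = g
    return (m * z + k) / (n * z + l)

T = ((1, 1), (0, 1))     # Eq. (47) with alpha = 1
S = ((0, 1), (-1, 0))    # Eq. (48)
z = complex(0.3, 0.7)    # arbitrary sample point

print(mobius(T, z), z + 1)        # translation
print(mobius(S, z), -1 / z)       # inversion
# The matrix product corresponds to the composition of the maps.
print(mobius(matmul(S, T), z), mobius(S, mobius(T, z)))
ST = matmul(S, T)
print(matmul(S, S), matmul(matmul(ST, ST), ST))
```

Since $g$ and $-g$ define the same map (46), the relation $S^2 = -1$ means that $S$ acts as an involution on $H$.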
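Since the modular group is a discrete group, we can define its fundamental domain $D$ (shown in Fig. 1), that is, a domain of the Poincaré half-plane $H$ whose images under the action of the representation (46) of the group cover $H$. In order to compute the number of matrices $g = \begin{pmatrix} m & k \\ n & l \end{pmatrix} \in SL(2, \mathbb{Z})$ verifying $n^2 + l^2 \le x^2$, we have to evaluate

$$N(x) = \sum_{\substack{g \in \Gamma_\infty \backslash G \\ n^2 + l^2 \le x^2}} 1, \tag{49}$$

where $G = SL(2, \mathbb{Z})$ and $\Gamma_\infty$ is the subgroup of $G$ generated by the translations ($\Gamma_\infty = \{T^n, n \in \mathbb{Z}\}$): since the left multiplication by matrices of the form $T^p$,

$$\begin{pmatrix} 1 & p \\ 0 & 1 \end{pmatrix} \begin{pmatrix} m & k \\ n & l \end{pmatrix} = \begin{pmatrix} m + pn & k + pl \\ n & l \end{pmatrix}, \tag{50}$$

does not change $n$ and $l$, it is necessary to consider the quotient $\Gamma_\infty \backslash G$ (i.e. two matrices which differ by $T^p$ are counted only once), so that the sum is convergent [29]. If we assume that in the limit $x \to \infty$ the sums can be written as integrals over $n$ and $l$ with uniform measure (see later) in the form $(B/\pi)\, dn\, dl$, we get

$$N(x) = \frac{B}{\pi} \int_{n^2 + l^2 \le x^2} dn\, dl = \frac{B}{2}\, x^2, \tag{51}$$

where, as in Eq. (55), the integration runs over the half-disc $0 \le \phi \le \pi$, since $g$ and $-g$ define the same transformation (46).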
[Figure: domain drawn in the $(x, y)$ half-plane; the $x$-axis is marked at $-1$, $-1/2$, $0$, $1/2$, $1$.]
Fig. 1. The fundamental domain of the modular group
3.3. Eisenstein series. In order to compute the coefficient $B$, let us introduce the Eisenstein series

$$E(z, s) = \sum_{g \in \Gamma_\infty \backslash G} (\mathrm{Im}\, g(z))^s \tag{52}$$

for $s > 1$. From expression (46) we get, since $ml - nk = 1$,

$$\mathrm{Im}\, g(z) = \frac{y}{|nz + l|^2}, \tag{53}$$

where $y = \mathrm{Im}\, z$. Since $\mathrm{Im}\, g'g(z) = \mathrm{Im}\, g(z)$ for $g' \in \Gamma_\infty$, the sum over $\Gamma_\infty \backslash G$ is well defined. Let us first compute the asymptotic behavior of $E(z, s)$ when $s \to 1$. For a given $R \in \mathbb{R}$, we can rewrite the sum (52) as a finite sum over elements of $G$ for which $n^2 + l^2 < R^2$ and a sum over elements for which $n^2 + l^2 > R^2$, which diverges as $s \to 1$. The divergent part, $n^2 + l^2 > R^2$, is

$$E^{div}(z, s) = \frac{B}{\pi} \int_{n^2 + l^2 > R^2} dn\, dl\, \frac{y^s}{|nz + l|^{2s}} \tag{54}$$

$$= \frac{B y^s}{\pi} \int_R^\infty \int_0^\pi \frac{r^{1-2s}\, dr\, d\phi}{\left[(x \sin\phi + \cos\phi)^2 + (y \sin\phi)^2\right]^s}. \tag{55}$$

Since

$$\int_R^\infty r^{1-2s}\, dr = \frac{R^{2(1-s)}}{2(s-1)} \underset{s \to 1}{\sim} \frac{1}{2(s-1)}, \tag{56}$$
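and the finite part of the Eisenstein series can be neglected as compared with the divergent part, we have

$$E(z, s) \underset{s \to 1}{\sim} \frac{yB}{2\pi(s-1)} \int_0^\pi \frac{d\phi}{(x \sin\phi + \cos\phi)^2 + (y \sin\phi)^2}. \tag{57}$$

The computation of the integral can be performed in the following way. Setting $A = 1 - x^2 - y^2$, $B = 2x$ and $C = 1 + x^2 + y^2$ (these letters are used only in Eqs. (58) and (59); here $B$ is not the constant of Eq. (51)), we get

$$\int_0^\pi \frac{d\phi}{(x^2 + y^2) \sin^2\phi + \cos^2\phi + 2x \sin\phi \cos\phi} = 2 \int_0^\pi \frac{d\phi}{A \cos 2\phi + B \sin 2\phi + C}, \tag{58}$$

and since $C^2 - A^2 - B^2 = 4y^2$, the integral is equal to

$$\int_0^{2\pi} \frac{d\theta}{\sqrt{A^2 + B^2}\, \cos\theta + C} = \frac{\pi}{y}. \tag{59}$$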
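The value $\pi/y$ of the angular integral, and the expression (53) for the imaginary part on which it rests, can be confirmed numerically. The sketch below uses a plain midpoint rule; the function names, the step count and the sample points are choices made here for illustration only.

```python
import math

def angular_integral(x, y, steps=4000):
    """Midpoint-rule estimate of the integral over phi in [0, pi] of
    1 / ((x sin(phi) + cos(phi))^2 + (y sin(phi))^2), as in Eq. (57)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h
        s, c = math.sin(phi), math.cos(phi)
        total += 1.0 / ((x * s + c) ** 2 + (y * s) ** 2)
    return total * h

def im_of_image(m, k, n, l, z):
    """Imaginary part of (m z + k) / (n z + l), computed directly."""
    return ((m * z + k) / (n * z + l)).imag

z = complex(0.3, 0.7)                                  # arbitrary sample point
print(angular_integral(z.real, z.imag), math.pi / z.imag)
# Eq. (53) for the sample matrix ((2, 1), (1, 1)), which has determinant 1.
print(im_of_image(2, 1, 1, 1, z), z.imag / abs(1 * z + 1) ** 2)
```

The integrand is smooth and periodic in $\phi$ with period $\pi$, so the midpoint rule converges very quickly as long as $y$ is not close to zero.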
Finally

$$E(z, s) \underset{s \to 1}{\sim} \frac{B}{2(s-1)} \tag{60}$$

and this limit does not depend on $z$. Now let us integrate the series in (52) with the invariant measure $d\mu(z) = dx\, dy / y^2$ over a part, $D_Y$, of the fundamental domain $D$ corresponding to a restriction $y \le Y$. If $d$ is the width of the fundamental domain,

$$\int_{D_Y} E(z, s)\, d\mu(z) = \sum_{g \in \Gamma_\infty \backslash G} \int_{D_Y} (\mathrm{Im}\, g(z))^s\, d\mu(z)$$

$$= \sum_{g \in \Gamma_\infty \backslash G} \int_{g D_Y} (\mathrm{Im}\, z)^s\, d\mu(z) \tag{61}$$

$$= \int_0^d dx \int_0^Y \frac{dy}{y^2}\, y^s \tag{62}$$

$$= d\, \frac{Y^{s-1}}{s-1}. \tag{63}$$

In the transformation from Eq. (61) to Eq. (62) we take into account that the image of the fundamental region $D$ (and $D_Y$) under the action of $G$ is a certain region of the upper half-plane which under the action of $\Gamma_\infty$ can be moved into a vertical strip of width $d$ (which is the fundamental region for the group $\Gamma_\infty$). These images cannot intersect and when $Y \to \infty$ will cover the whole strip with $y \le Y$. The asymptotic behavior of this integral is thus the following:

$$\lim_{s \to 1} \int_{D_Y} E(z, s)\, d\mu(z) = \frac{d}{s-1}. \tag{64}$$

By comparing this expression with Eq. (60) one concludes that the value of the constant $B$ is

$$B = 2\, \frac{d}{\mathrm{Vol}\, D}, \tag{65}$$
and from Eq. (51), the final answer for the density of primitive ($n$ and $l$ are coprime) periodic orbits in a square billiard is

$$N_{pp}(n^2 + l^2 \le x^2) = \frac{d}{\mathrm{Vol}\, D}\, x^2. \tag{66}$$

For the modular group $d = 1$, $\mathrm{Vol}\, D = \pi/3$, and for a square billiard with side $2a$ we have $x = l/(2a)$ according to (36), so

$$N_{pp}(l_p < l) = \frac{3 l^2}{4\pi a^2}, \tag{67}$$

which agrees with Eq. (42) obtained by a different method.
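The value $\mathrm{Vol}\, D = \pi/3$ used above can be reproduced numerically. The sketch below assumes the standard choice of fundamental domain, $|x| \le 1/2$ and $x^2 + y^2 \ge 1$ (an assumption about the domain drawn in Fig. 1), performs the $y$-integral analytically and the $x$-integral by a midpoint rule. The function name and the step count are illustrative.

```python
import math

def modular_domain_area(steps=20000):
    """Hyperbolic area of D = {|x| <= 1/2, x^2 + y^2 >= 1} with measure dx dy / y^2.

    The y-integral from sqrt(1 - x^2) to infinity of dy / y^2 equals
    1 / sqrt(1 - x^2); the remaining x-integral is done by the midpoint rule.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = -0.5 + (i + 0.5) * h
        total += 1.0 / math.sqrt(1.0 - x * x)
    return total * h

vol = modular_domain_area()
d = 1.0                       # width of the fundamental domain
B = 2.0 * d / vol             # Eq. (65)
coefficient = d / vol         # N_pp = coefficient * x^2, Eq. (66)
print(vol, math.pi / 3)
print(coefficient, 3 / math.pi)
```

With $x = l/(2a)$ the coefficient $3/\pi$ gives back the prefactor $3/(4\pi a^2)$ of Eq. (67).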
3.4. Veech group for $\pi/n$ right triangles.

3.4.1. The symmetry group. The previous calculations were possible because we have found a group, the modular group, that relates periodic orbits in a square to a simple vector (see Eq. (44)). In order to generalize this construction for more complicated polygons it is important to point out that the modular group is the symmetry group of the unfolding of the square billiard (that is, the lattice whose unit cell is a $1 \times 1$ square). Indeed this square lattice is, evidently, invariant under the following two transformations: the rotation by $\pi/2$ around the center of the square, which we denote by $S$, and the translation of one coordinate (say $x$) by 1, which we denote by $T$. In Cartesian coordinates these transformations are represented by the following matrices:

$$T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{68}$$

The vertices of the lattice are also unchanged under the action of the group generated by $S$ and $T$. But it is well known that this group is exactly the modular group $SL(2, \mathbb{Z})$. Therefore the modular group plays a double role for a square billiard. First, it is the group of invariance of the unfolding of the square and, second, it generates periodic orbits starting from a fixed vector as in Eq. (44).

It has been proved by Veech [17] that for certain polygons (called the Veech polygons) there exists a group with similar properties. In particular, a $\pi/n$ right triangle (i.e. a triangle with angles $\pi/2$, $\pi/n$, $\pi(n-2)/2n$) belongs to the Veech polygons [17, 18]. Let us consider this case in detail. The geometrical construction of the unfolding of the classical trajectories in such a billiard is slightly different for $n$ even and odd. By reflections with respect to the sides corresponding to the $\pi/n$ angle the $\pi/n$ triangle can be unfolded to the regular $n$-gon. For $n$ even the opposite sides of this $n$-gon should be identified by translations (see Fig. 2a). For $n$ odd one has to consider 2 regular $n$-gons reflected with respect to one side and to identify parallel sides by translations as in Fig. 2b. The resulting surface is the surface of genus $(n-1)/2$ for $n$ odd (see 2) to which all trajectories belong. From this construction it is clear that if a group of invariance exists it should include the rotation by $2\pi/n$ around the center of these $n$-gons. In Cartesian coordinates this rotation is defined by the following matrix:

$$\sigma_n = \begin{pmatrix} \cos\frac{2\pi}{n} & -\sin\frac{2\pi}{n} \\ \sin\frac{2\pi}{n} & \cos\frac{2\pi}{n} \end{pmatrix}. \tag{69}$$
[Figure with two panels, (a) and (b); the drawings carry the labels 1, 2, 3, 4.]
Fig. 2. The unfolding of the $\pi/n$ triangle. (a) $n$ is even. (b) $n$ is odd
To find other transformations which leave this surface invariant it is necessary to consider a few families of periodic orbits. For $n$ even, we define in the $n$-gon two important elementary families of periodic orbits: the first one is the family of horizontal primitive periodic orbits, the second one is the family of primitive periodic orbits making an angle $\pi/n$ with the horizontal (see Fig. 3). For $n$ odd we only define the first family (see Fig. 4). For $n$ even ($n = 4p$ or $n = 4p + 2$), the first family has orbits with lengths

$$L_j = 4 \cos\frac{(2j-1)\pi}{n} \cos\frac{\pi}{n} \tag{70}$$

and widths

$$W_j = 2 \cos\frac{(2j-1)\pi}{n} \sin\frac{\pi}{n} \tag{71}$$

with $1 \le j \le p$. The second family has orbits with lengths

$$l_j = 4 \cos\frac{(2j-2)\pi}{n} \cos\frac{\pi}{n} \tag{72}$$

and widths

$$w_j = 2 \cos\frac{(2j-2)\pi}{n} \sin\frac{\pi}{n} \tag{73}$$

with $2 \le j \le p$ if $n = 4p$ or $2 \le j \le p + 1$ if $n = 4p + 2$. The orbit with $j = 1$ is special: it has a length and a width equal to

$$l_1 = 2 \cos\frac{\pi}{n}, \quad w_1 = 2 \sin\frac{\pi}{n}. \tag{74}$$
[Figure: orbit families labelled $L_1$, $L_2$ and $L'_1$, $L'_2$, $L'_3$.]
Fig. 3. The elementary orbits in the decagon
For $n$ odd ($n = 2p + 1$), the lengths and widths are the following:

$$L_j = 4 \sin\frac{2j\pi}{n} \cos\frac{\pi}{n} \tag{75}$$

and

$$W_j = 2 \sin\frac{2j\pi}{n} \sin\frac{\pi}{n} \tag{76}$$

with $1 \le j \le p$. It is of interest to compute the ratio of the length of each periodic orbit family to its width. From the above formulas it follows that for all families, except the one with $j = 1$ for even $n$, this ratio is the same:

$$\frac{l}{w} = 2 \cot\frac{\pi}{n}. \tag{77}$$
For the exceptional family (74) this ratio is 2 times smaller.

[Figure: orbit families labelled $L_1$, $L_2$, $L_3$.]
Fig. 4. The elementary orbits in the heptagon

The unfolding of any family of periodic orbits gives an infinite strip of points of period $L_p$ and width $W_p$. If there is a group of invariance of the unfolded surface, it should include a transformation which leaves periodic orbit strips invariant. Assume that the strip is oriented horizontally. In this case one sees that the shift of the form

$$\begin{pmatrix} 1 & \alpha \\ 0 & 1 \end{pmatrix} \tag{78}$$

displaces the points at height $W_p$ horizontally by $\alpha W_p$, and therefore leaves the points of the strip invariant provided that

$$\alpha W_p = n L_p, \tag{79}$$

where $n$ is an integer. Because for the periodic orbits considered the ratio (77) is constant (and twice smaller for the exceptional family), the invariance group should include the following transformation:

$$\tau_n = \begin{pmatrix} 1 & 2 \cot\frac{\pi}{n} \\ 0 & 1 \end{pmatrix}. \tag{80}$$
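The lengths and widths (70)-(76) are easy to tabulate, which gives a quick check of the ratio (77). The sketch below also compares the total area $\sum L_j W_j$ of each family with the area of the unfolded surface; this comparison is an added consistency check, not taken from the text, and it assumes a triangle with unit hypotenuse (area $\frac{1}{4} \sin\frac{2\pi}{n}$), so that the surface consists of $2n$ triangles for even $n$ and $4n$ for odd $n$. All function names are illustrative.

```python
import math

def elementary_families(n):
    """Return (first, second): lists of (length, width) pairs from Eqs. (70)-(76).

    For odd n the second list is empty. For even n, second[0] is the
    special orbit of Eq. (74).
    """
    c, s = math.cos(math.pi / n), math.sin(math.pi / n)
    first, second = [], []
    if n % 2 == 1:
        p = (n - 1) // 2
        for j in range(1, p + 1):
            t = math.sin(2 * j * math.pi / n)
            first.append((4 * t * c, 2 * t * s))       # Eqs. (75), (76)
    else:
        p, eps = n // 4, (n % 4) // 2                   # n = 4p + 2*eps
        for j in range(1, p + 1):
            t = math.cos((2 * j - 1) * math.pi / n)
            first.append((4 * t * c, 2 * t * s))       # Eqs. (70), (71)
        second.append((2 * c, 2 * s))                   # Eq. (74)
        for j in range(2, p + eps + 1):
            t = math.cos((2 * j - 2) * math.pi / n)
            second.append((4 * t * c, 2 * t * s))      # Eqs. (72), (73)
    return first, second

def surface_area(n):
    """Area of the unfolded surface, assuming a unit-hypotenuse triangle."""
    triangles = 2 * n if n % 2 == 0 else 4 * n
    return triangles * 0.25 * math.sin(2 * math.pi / n)

for n in (7, 8, 10):
    first, second = elementary_families(n)
    ratios = [L / W for L, W in first + second]
    print(n, 2 / math.tan(math.pi / n), ratios)
    print(n, surface_area(n), sum(L * W for L, W in first))
```

For $n = 10$ the function returns two orbits in the first family and three in the second, matching the labels of Fig. 3.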
Veech proved [17, 18] that the invariance group for the $\pi/n$ triangle is a discrete subgroup $\Gamma_n$ of $SL(2, \mathbb{R})$ generated by the two elements (69) and (80). Similarly to the relation (44) for the square, periodic orbits in this triangle are generated by the action of $\Gamma_n$ on the elementary families of periodic orbits considered above. We shall call the members of these families “basis orbits” (and label them by the index $i$). We define the corresponding basis vectors $v_i$ by $v_j = (L_j, 0)$ and $v'_j = (l_j \cos(\pi/n), l_j \sin(\pi/n))$, so that $\{v_i\} = \{v_j, v'_j\}$ for $n$ even and $\{v_i\} = \{v_j\}$ for $n$ odd.
An element $g$ of the symmetry group $\Gamma_n$ has the following matrix representation:

$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \tag{81}$$

The result of the action of this element on one of the basis vectors $v_i = (v_{i1}, v_{i2})$ gives the coordinates of a new primitive periodic orbit (more precisely, a periodic orbit situated on the boundary of the periodic orbit pencil),

$$g v_i = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} v_{i1} \\ v_{i2} \end{pmatrix}, \tag{82}$$

and the length of this primitive periodic orbit is the length of this vector. The first family of basis vectors for even $n$ and all basis vectors for odd $n$ can be chosen in the form $v_i = (L_i, 0)$, and the lengths of periodic orbits generated by applying the group $\Gamma_n$ are

$$L_g = \sqrt{a^2 + c^2}\, L_i, \tag{83}$$

where $a$, $c$ are matrix elements of $g$ (81). The second basis periodic orbits, $v'_i$, are obtained from horizontal vectors by rotation by $\pi/n$. But the matrix corresponding to the inverse of this rotation,

$$r = \begin{pmatrix} \cos\frac{\pi}{n} & \sin\frac{\pi}{n} \\ -\sin\frac{\pi}{n} & \cos\frac{\pi}{n} \end{pmatrix}, \tag{84}$$

does not belong to our group $\Gamma_n$. Nevertheless, this matrix plays the role of a Hecke operator, namely, even if it does not belong to $\Gamma_n$, the conjugation of any matrix from this group does belong to $\Gamma_n$: if $g \in \Gamma_n$, then $r^{-1} g r \in \Gamma_n$. To prove it let us note that

$$r^{-1} \sigma_n^p\, r = \sigma_n^p, \tag{85}$$

where $\sigma_n$ is the generator (69), because all rotations commute, and it is easy to check that

$$r^{-1} \tau_n r = -\sigma_n \tau_n^{-1}, \quad r^{-1} \tau_n^{-1} r = -\tau_n \sigma_n^{-1}. \tag{86}$$
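The conjugation relations can be verified numerically for any given $n$. The sketch below builds $\sigma_n$, $\tau_n$ and $r$ from Eqs. (69), (80) and (84) and checks that $r^{-1} \sigma_n r = \sigma_n$, that $r^{-1} \tau_n r = -\sigma_n \tau_n^{-1}$, and that $r^{-1} \tau_n^{-1} r = -\tau_n \sigma_n^{-1}$ (the inverse of the previous relation). The helper names and the tolerance are chosen here for illustration.

```python
import math

def matmul(g, h):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    (e, f), (p, q) = h
    return ((a * e + b * p, a * f + b * q), (c * e + d * p, c * f + d * q))

def rotation(theta):
    """Counter-clockwise rotation by theta, in the convention of Eq. (69)."""
    return ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta), math.cos(theta)))

def scale(g, factor):
    return tuple(tuple(factor * x for x in row) for row in g)

def close(g, h, tol=1e-9):
    return all(abs(x - y) < tol for rg, rh in zip(g, h) for x, y in zip(rg, rh))

def check_conjugation(n):
    """Return True if the three conjugation relations hold for this n."""
    t = 2.0 / math.tan(math.pi / n)
    sigma, sigma_inv = rotation(2 * math.pi / n), rotation(-2 * math.pi / n)
    tau, tau_inv = ((1.0, t), (0.0, 1.0)), ((1.0, -t), (0.0, 1.0))
    r, r_inv = rotation(-math.pi / n), rotation(math.pi / n)   # r as in Eq. (84)

    def conj(g):
        return matmul(matmul(r_inv, g), r)

    return (close(conj(sigma), sigma)
            and close(conj(tau), scale(matmul(sigma, tau_inv), -1.0))
            and close(conj(tau_inv), scale(matmul(tau, sigma_inv), -1.0)))

print([n for n in range(3, 31) if not check_conjugation(n)])   # prints []
```

For $n = 4$, for example, $r^{-1} \tau_4 r$ is the integer matrix with rows $(0, 1)$ and $(-1, 2)$.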
The right-hand sides of these relations belong to $\Gamma_n$ and, as all matrices from $\Gamma_n$ can be written as a product of generators, we get $r^{-1} g r \in \Gamma_n$ for $g \in \Gamma_n$. Using this conjugation one can rotate the second family of periodic orbits for even $n$ by $-\pi/n$, and the lengths of the orbits generated by the vectors $v'_i$ will be related to matrix elements of $\Gamma_n$ by the same relation as in (83),

$$L_g = \sqrt{a^2 + c^2}\, l_i. \tag{87}$$

Therefore, to find the distribution of periodic orbit lengths it is necessary to compute the distribution of $a^2 + c^2$ for matrices from $\Gamma_n$, which has been done for the modular group in the previous section. Equation (66) can be derived in the same way for $\Gamma_n$. According to the previous section one can compute the density of periodic orbits and other quantities as well by investigating the fundamental domain of $\Gamma_n$. The distribution of areas of periodic orbit families is also easy to obtain: as all matrices from $\Gamma_n$ have unit determinant, the area covered by the pencil corresponding to $g v_i$
($g \in \Gamma_n$) is the same as the area covered by the pencil corresponding to the basis vectors $v_i$, i.e. it is equal to $L_i W_i$. In other words, there is a one-to-one correspondence between pencils of primitive periodic orbits and vectors $g v_i$ for $g \in \Gamma_n$. The discrete group $\Gamma_n$ is related to periodic orbits in the $\pi/n$ right triangle in the same way as the modular group is related to periodic orbits in the square.

3.4.2. The density of periodic orbits. The fundamental domains of the symmetry groups $\Gamma_n$ for $n$ even and odd are described in Figs. 5 and 6 respectively. For $n$ even, it is the union of two triangles with angles $2\pi/n$, $0$, $0$ on the Poincaré half-plane: the area of the domain is $\mathrm{Vol}\, D = 2\pi(n-2)/n$, and its width is $2 \cot(\pi/n)$.

[Figure: domain drawn in the $(x, y)$ half-plane, with the width $2 \cot \pi/n$, the angle $2\pi/n$ and the point $i$ marked.]
Fig. 5. The fundamental domain of $\Gamma_n$ for $n$ even
For $n$ odd the two triangles have angles $\pi/n$, $\pi/2$ and $0$, therefore $\mathrm{Vol}\, D = \pi(n-2)/n$; the width of the fundamental domain is $2 \cot \pi/n$. These shapes of fundamental domains can be obtained by taking into account that the group $\Gamma_n$, considered as a group acting on the Poincaré upper half-plane as in Sect. 3.2, includes (i) the translation by $2 \cot \pi/n$ and (ii) the rotation around the point $i$ by $2\pi/n$ for $n$ odd and by $4\pi/n$ for $n$ even. This difference between even and odd $n$ is related to the fact that the rotation by angle $\pi$ corresponds to the transformation $g \to -g$, but these two matrices are represented by the same function on $H$ (see (46)). For odd $n$ ($n = 2q + 1$), the group generated by the generator (69) contains rotations by angles $2\pi j/(2q + 1)$ for $j = 0, 1, \dots, 2q$. The value $j = q + 1$ corresponds to the rotation by $\pi + \pi/n$. As the rotation by $\pi$ is the identity transformation, the rotation by $\pi/n$ belongs to $\Gamma_n$ and it is a primitive generator of the subgroup $\{\sigma_n^p, p \in \mathbb{Z}\}$ of $\Gamma_n$. For even $n$ ($n = 2q$), the rotation by $2\pi q/n$ is the identity, therefore the rotation by $\pi/n$ does not belong to the group and the primitive generator of the subgroup $\{\sigma_n^p, p \in \mathbb{Z}\}$ is the rotation by $2\pi/n$.
[Figure: domain drawn in the $(x, y)$ half-plane, with the angle $\pi/n$ and the width $2 \cot \pi/n$ marked.]
Fig. 6. The fundamental domain of $\Gamma_n$ for $n$ odd
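Due to (66), we now get the number of matrices

$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_n \tag{88}$$

verifying $a^2 + c^2 \le x^2$:

$$N(a^2 + c^2 \le x^2) = \begin{cases} \dfrac{n}{\pi(n-2)} \cot\dfrac{\pi}{n}\; x^2 & (n \text{ even}), \\[2ex] \dfrac{2n}{\pi(n-2)} \cot\dfrac{\pi}{n}\; x^2 & (n \text{ odd}). \end{cases} \tag{89}$$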
These formulas give the total number of matrices verifying $a^2 + c^2 \le x^2$. But due to the existence of rotation matrices in the group $\Gamma_n$ each primitive periodic orbit length appears a few times in the above calculations. This multiplicity corresponds to different unfoldings of a given periodic orbit. For $n$ odd the $2n$ matrices of the form $\{\pm\beta_n^k g,\ 0 \le k \le 2n - 1\}$, where $\beta_n$ is the matrix of rotation by $\pi/n$, give rise to one primitive periodic orbit. For $n$ even there exist $n$ matrices of the form $\{\pm\sigma_n^k g,\ 0 \le k \le n - 1\}$ (where $\sigma_n$ is the matrix of rotation by $2\pi/n$ (69)) which describe one periodic orbit. For $g$ given by (88), the length of $g v_i$ is $L_i \sqrt{a^2 + c^2}$. So the number of primitive periodic orbits of type $g v_i$ with length less than $l$ is

$$N_{i,pp}(L_p < l) = \frac{1}{\pi(n-2)} \cot\frac{\pi}{n} \left(\frac{l}{L_i}\right)^2. \tag{90}$$

The number of all primitive periodic orbits is the sum over all such contributions:

$$N_{pp}(L_p < l) = C\, \frac{l^2}{A}, \tag{91}$$
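where

$$A = \frac{1}{4} \sin\frac{2\pi}{n} \tag{92}$$

is the area of our triangle and

$$C = \frac{1}{2\pi(n-2)} \cos^2\frac{\pi}{n} \sum_i \frac{1}{L_i^2}. \tag{93}$$

For $n$ odd ($n = 2p + 1$) Eq. (75) gives

$$\sum_{k=1}^{p} \frac{1}{L_k^2} = \frac{1}{16 \cos^2 \pi/n} \sum_{k=1}^{p} \frac{1}{\sin^2(2\pi k/n)}. \tag{94}$$

For $n$ even ($n = 4p + 2\epsilon$, $\epsilon = 0, 1$) from Eqs. (70), (72), and (74) it follows

$$\sum_i \frac{1}{L_i^2} = \frac{1}{16 \cos^2 \pi/n} \left[ \sum_{j=1}^{p} \frac{1}{\cos^2 (2j-1)\pi/n} + \sum_{j=2}^{p+\epsilon} \frac{1}{\cos^2 (2j-2)\pi/n} + 4 \right] = \frac{1}{16 \cos^2 \pi/n} \left[ \sum_{k=1}^{(n-2)/2} \frac{1}{\cos^2 k\pi/n} + 4 \right]. \tag{95}$$

The last sums can be calculated using the evident formulas (for another method of calculation see [17])

$$\frac{1}{\sin^2 x} = \sum_{q=-\infty}^{\infty} \frac{1}{(x - q\pi)^2} \tag{96}$$

and

$$\frac{1}{\cos^2 \pi x/2} = \frac{4}{\pi^2} \sum_{q=1}^{\infty} \left( \frac{1}{(2q - 1 - x)^2} + \frac{1}{(2q - 1 + x)^2} \right). \tag{97}$$

Taking into account that $\sin^2(k\pi/n) = \sin^2((n-k)\pi/n)$ for odd $n$ and performing the following transformations:

$$\sum_{k=1}^{n-1} \sum_{q=-\infty}^{\infty} \frac{1}{(k - qn)^2} = \sum_{\substack{t=-\infty \\ t \ne 0}}^{\infty} \frac{1}{t^2} - \sum_{\substack{q=-\infty \\ q \ne 0}}^{\infty} \frac{1}{n^2 q^2} = \frac{\pi^2}{3} \left(1 - \frac{1}{n^2}\right), \tag{98}$$

one obtains that for odd $n$,

$$\sum_{k=1}^{(n-1)/2} \frac{1}{\sin^2(2\pi k/n)} = \frac{n^2 - 1}{6}. \tag{99}$$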
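Similarly for even $n$,

$$\sum_{k=1}^{(n-2)/2} \frac{1}{\cos^2 k\pi/n} = \frac{n^2 - 4}{6}. \tag{100}$$

Therefore the value of the constant $C$ is

$$C = \frac{1}{192\pi(n-2)} \begin{cases} (n^2 - 1) & \text{for odd } n, \\ (n^2 + 20) & \text{for even } n. \end{cases} \tag{101}$$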
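The closed form (101) can be compared with a direct evaluation of the sum in Eq. (93). The sketch below is an illustration under the conventions of Eqs. (94) and (95): for even $n$ the index $k = 1, \dots, (n-2)/2$ enumerates both families and the special orbit (74) is added separately. The function names are chosen here for the example.

```python
import math

def inverse_square_length_sum(n):
    """Sum of 1 / L_i^2 over all elementary orbit families."""
    c = math.cos(math.pi / n)
    if n % 2 == 1:
        p = (n - 1) // 2
        lengths = [4 * math.sin(2 * k * math.pi / n) * c for k in range(1, p + 1)]
    else:
        lengths = [2 * c]                               # special orbit, Eq. (74)
        lengths += [4 * math.cos(k * math.pi / n) * c for k in range(1, n // 2)]
    return sum(1.0 / L ** 2 for L in lengths)

def c_direct(n):
    """Eq. (93), with the sum evaluated term by term."""
    c = math.cos(math.pi / n)
    return c * c / (2 * math.pi * (n - 2)) * inverse_square_length_sum(n)

def c_closed(n):
    """Eq. (101)."""
    numerator = n * n - 1 if n % 2 == 1 else n * n + 20
    return numerator / (192 * math.pi * (n - 2))

for n in (5, 8, 12, 30):
    print(n, c_direct(n), c_closed(n))

# Coefficient of l^2 for n = 8 with area A = 4*pi, including the factor n for
# the different unfoldings and the factor 2 for time reversal.
print(c_closed(8) * 8 * 2 / (4 * math.pi), 7 / (24 * math.pi ** 2))
```

The last line evaluates to about 0.0295.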
This result corresponds to primitive periodic orbits with geometrically different lengths: time-reversed orbits are not included in the summation. They give an additional factor of 2 in Eq. (101). In [17] the orbits corresponding to different unfoldings of the same periodic orbit have been included in the asymptotic formula, which leads asymptotically to the additional factor $n$ for even $n$ and $2n$ for odd $n$ in Eq. (101). Furthermore, if one needs all periodic orbits including repetitions, Eq. (101) should be multiplied by $\pi^2/6$ as in Sect. 3.1. In Fig. 7 we present numerical results for the cumulative density of primitive periodic orbits in a $\pi/8$ right triangular billiard (with area $A = 4\pi$) when all orbits (time-reversed and for different unfoldings) are included. The solid line is the best quadratic fit to these data,

$$N_{pp}(L_p < l) = 0.0294\, l^2 - 0.6055\, l + 0.3617. \tag{102}$$
One sees that this fit can hardly be distinguished from the numerical results. The theoretical prediction for the coefficient in front of $l^2$ is $C/A$ according to Eq. (91); $C$ is given by Eq. (101), and has to be multiplied by $n$ since the numerical computation has taken into account the repetitions of each orbit, and by 2 as time-reversed orbits are taken into account as well:

$$\frac{C}{A} = \frac{7}{24\pi^2} \approx 0.0295, \tag{103}$$

which is in excellent agreement with the numerical calculations.

3.5. Explicit calculation of the 2-point form factor for the $\pi/n$ right triangle.

3.5.1. First case: No degeneracy of the lengths. We assume in this section that there is no degeneracy between the lengths of the periodic orbits (except the ones connected by the time-reversal transformation) or, more carefully, that there is no pair of primitive periodic orbits whose lengths are commensurable. Since the lengths of the $g v_i$ are proportional to the lengths of the $v_i$, the necessary requirement for the validity of this condition is the absence of commensurability relations between the $L_i$. In this case the 2-point correlation form factor in the diagonal approximation is given by Eq. (25). The sum over different periodic orbits can be split into a sum over all types of periodic orbits, then the sum over periodic orbits of each type can be replaced by an integral; since the density $\rho_i$ of periodic orbits of type $i$ counts a periodic orbit and its time-reverse only once, the degeneracy is $g_p = 2$ and

$$\sum_p g_p^2 = 4 \sum_p = 4 \sum_i \int_0^\infty dl\, \rho_i(l). \tag{104}$$
[Figure: $N_{per}(L)$ plotted against $L$, for $0 \le L \le 800$ and $0 \le N_{per} \le 20000$.]
Fig. 7. The cumulative density of periodic orbits for the $\pi/8$ right triangular billiard
In (25), $A_p$ is the area occupied by a pencil of periodic orbits of length $L_p$; but this area is the same for all trajectories belonging to the same family $i$. So we just have to evaluate the area $A_i = L_i W_i$ occupied by an elementary orbit of type $i$. The lengths (70), (72) and (75) are

$$L_k = 4 \cos\frac{k\pi}{n} \cos\frac{\pi}{n}, \quad 1 \le k \le p - 1 \tag{105}$$

and $L_0 = 2 \cos \pi/n$ if $n = 2p$, and

$$L_k = 4 \sin\frac{2k\pi}{n} \cos\frac{\pi}{n}, \quad 1 \le k \le p \tag{106}$$

if $n = 2p + 1$. The widths are only half the widths $W_i$ given by (71), (73) and (76), since each fundamental pencil is symmetric with respect to the line joining two images of the $\pi/2$ corner of the triangle (see Figs. 3 and 4). So the $W_k$ are

$$W_k = \cos\frac{k\pi}{n} \sin\frac{\pi}{n}, \quad 0 \le k \le p - 1 \tag{107}$$

if $n = 2p$, and

$$W_k = \sin\frac{2k\pi}{n} \sin\frac{\pi}{n}, \quad 1 \le k \le p \tag{108}$$
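if $n = 2p + 1$. Then for small $\tau$,

$$K_2(\tau) = 4 \sum_i \frac{A_i^2}{8\pi^2 \bar d} \int_0^\infty \frac{1}{l}\, \delta(l - 4\pi k \bar d \tau)\, \rho_i(l)\, dl, \tag{109}$$

and replacing the density of orbits of type $i$ by its mean value

$$\rho_i = \frac{\pi^2}{6}\, \frac{dN_{i,pp}(L_p < l)}{dl}, \tag{110}$$

where $N_{i,pp}(L_p < l)$ is given by Eq. (90), we obtain, when performing the integral,

$$K_2(\tau) = \frac{\cot \pi/n}{6\pi(n-2)\bar d} \sum_k W_k^2, \tag{111}$$

where the average density of states is $\bar d = A/4\pi$,

$$A = \frac{1}{4} \sin\frac{2\pi}{n} \tag{112}$$

being the area of the triangle. The sum (111) over the widths (107) and (108) gives

$$\sum_k W_k^2 = \begin{cases} \dfrac{n+2}{4} \sin^2\dfrac{\pi}{n}, & n \text{ even}, \\[2ex] \dfrac{n}{4} \sin^2\dfrac{\pi}{n}, & n \text{ odd}. \end{cases} \tag{113}$$

So we finally get

$$K_2(\tau) = \frac{n + \epsilon(n)}{3(n-2)} \tag{114}$$

with

$$\epsilon(n) = 0 \quad \text{when } n \text{ is odd}, \tag{115}$$

$$\epsilon(n) = 2 \quad \text{when } n \text{ is even}. \tag{116}$$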
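The step from Eq. (111) to Eq. (114) can be reproduced numerically. The sketch below evaluates the sum over the half-widths (107) and (108) term by term, inserts $\bar d = A/4\pi$ with $A$ from Eq. (112), and compares the result with the closed form; the function names are illustrative and no degeneracy of the lengths is taken into account here.

```python
import math

def half_widths(n):
    """Half-widths W_k of Eq. (107) (n even) or Eq. (108) (n odd)."""
    s = math.sin(math.pi / n)
    if n % 2 == 0:
        return [math.cos(k * math.pi / n) * s for k in range(0, n // 2)]
    p = (n - 1) // 2
    return [math.sin(2 * k * math.pi / n) * s for k in range(1, p + 1)]

def k2_small_tau(n):
    """Eq. (111) with the sum over widths evaluated directly."""
    area = 0.25 * math.sin(2 * math.pi / n)       # Eq. (112)
    dbar = area / (4 * math.pi)                   # mean density of states
    cot = 1.0 / math.tan(math.pi / n)
    return cot / (6 * math.pi * (n - 2) * dbar) * sum(w * w for w in half_widths(n))

def k2_closed(n):
    """Eq. (114) with epsilon(n) from Eqs. (115), (116)."""
    eps = 2 if n % 2 == 0 else 0
    return (n + eps) / (3 * (n - 2))

for n in (5, 7, 8, 10):
    print(n, k2_small_tau(n), k2_closed(n))
```

For $n = 8$ both expressions give $5/9$.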
3.5.2. Second case: Degeneracy of the lengths. We have assumed in the previous section that the lengths of all primitive periodic orbits were non-commensurable. In the case of the $\pi/n$ right triangle, there may exist a commensurability relation between the $L_k$ given by (105) or (106) if there is one between the $\cos(k\pi/n)$ ($0 \le k \le p - 1$) for $n$ even, or between the $\sin(2k\pi/n)$ ($1 \le k \le p$) for $n$ odd. It is shown in [30] that if $n$ is an odd prime, there is no such relation between the $\sin(2k\pi/n)$. Ref. [31] deals with the case $(k, n) = 1$ and gives the same conclusion. It seems that in the general case, only one relation of that kind exists between the $\cos(k\pi/n)$, which is

$$2 \cos\left(\frac{n}{3}\, \frac{\pi}{n}\right) = \cos(0), \tag{117}$$
and that no relation exists between the sine terms. Therefore the only degeneracy occurs in the case where $n$ is even and $3 \mid n$, that is $n \in 6\mathbb{Z}$. In that case, from (10) we get

$$d\!\left(E + \frac{\epsilon}{2}\right) d\!\left(E - \frac{\epsilon}{2}\right) = K_2^{diag} + 4 \sum_{p^+ \ne p'^+} \frac{A_p A_{p'}}{16\pi^2} \frac{1}{2\pi k \sqrt{l_p l_{p'}}}\; e^{ik(l_p - l_{p'}) + i\frac{\epsilon}{4k}(l_p + l_{p'})} + \text{c.c.}, \tag{118}$$

where $K_2^{diag}$ is the usual diagonal approximation. The notation $p^+$ means that an orbit and its time-reverse are counted only once in the sum, hence the coefficient 4. If there is a relation $m_1 L_1 = m_2 L_2$ (with $m_1$ and $m_2$ coprime) between two lengths of primitive periodic orbits, we have a contribution $R_2^{deg}$ to the 2-point correlation function (17) which comes from orbits of lengths $q m_1 L_1$ and $q m_2 L_2$, $q \in \mathbb{Z}^*$:

$$R_2^{deg} = \frac{2 A_1 A_2}{4\pi^2 \cdot 2\pi k} \sum_q \frac{1}{\sqrt{q m_1 L_1\, q m_2 L_2}}\; e^{i\frac{\epsilon}{4k}(q m_1 L_1 + q m_2 L_2)} = \frac{2 A_1 A_2}{4\pi^2 \cdot 2\pi k} \sum_q \frac{1}{q m_1 L_1}\; e^{i\frac{\epsilon}{2k} q m_1 L_1}. \tag{119}$$

The sum over all repetition numbers $q$ of a function of $q L_1$ (where $L_1$ is the length of a primitive periodic orbit) can be replaced by an integral:

$$R_2^{deg} = \frac{A_1 A_2}{4\pi^3 k} \int_0^\infty dl\, \frac{1}{m_1 l}\; e^{i\frac{\epsilon m_1}{2k} l}\, \rho_1(l), \tag{120}$$

where $\rho_1(l)$ is the density of periodic orbits of type 1 (that is, with length proportional to $L_1$), given by (110). Performing the Fourier transform (18) and the integral over $l$ gives

$$K_2(0) = \frac{\cot \pi/n}{6\pi(n-2)\bar d} \left( \sum_i W_i^2 + 2\, \frac{W_1 W_2}{m_1 m_2} \right). \tag{121}$$

In our case the degeneracy is given by (117) and

$$W_0 W_{n/3} = \frac{1}{2} \sin^2\frac{\pi}{n}. \tag{122}$$

We finally obtain

$$K_2(0) = \frac{n + \epsilon(n)}{3(n-2)} \tag{123}$$

with

$$\epsilon(n) = \begin{cases} 0 & \text{when } n \text{ is odd}, \\ 2 & \text{when } n \text{ is even and } 3 \nmid n, \\ 6 & \text{when } n \text{ is even and } 3 \mid n. \end{cases} \tag{124}$$
This formula is the main result of our calculations for the triangular billiards. It clearly demonstrates the peculiarities of spectral statistics for pseudo-integrable systems. The non-zero value of the form factor (< 1) at the origin does not correspond to any random matrix ensemble but it is typical for intermediate statistics [25, 32].
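The final result is simple enough to tabulate. The sketch below encodes Eqs. (123) and (124) with exact fractions and cross-checks them against a direct evaluation of Eq. (121). In this cross-check the degenerate pair is taken to be $(W_0, W_{n/3})$ with $m_1 = m_2 = 1$, which follows from $L_0 = L_{n/3}$ in Eq. (105) together with relation (117); the function names are illustrative.

```python
import math
from fractions import Fraction

def epsilon(n):
    """Eq. (124)."""
    if n % 2 == 1:
        return 0
    return 6 if n % 3 == 0 else 2

def k2_at_origin(n):
    """Eq. (123) as an exact fraction."""
    return Fraction(n + epsilon(n), 3 * (n - 2))

def k2_from_widths(n):
    """Eq. (121) evaluated directly; the degenerate term is added when 6 divides n."""
    s = math.sin(math.pi / n)
    if n % 2 == 0:
        widths = [math.cos(k * math.pi / n) * s for k in range(n // 2)]            # Eq. (107)
    else:
        widths = [math.sin(2 * k * math.pi / n) * s for k in range(1, (n - 1) // 2 + 1)]  # Eq. (108)
    total = sum(w * w for w in widths)
    if n % 6 == 0:
        total += 2 * widths[0] * widths[n // 3]       # 2 W_0 W_{n/3}, m1 = m2 = 1
    dbar = 0.25 * math.sin(2 * math.pi / n) / (4 * math.pi)
    return total / (math.tan(math.pi / n) * 6 * math.pi * (n - 2) * dbar)

for n in (5, 7, 8, 12, 30):
    print(n, k2_at_origin(n), k2_from_widths(n))
```

For $n = 8$ this gives $5/9$ and for $n = 12$ it gives $3/5$; for $n = 6$ the formula evaluates to 1.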
3.6. Comparison with numerical calculations. To compare the prediction (123) with numerical results we have computed 20000 levels for triangular billiards in the shape of a right triangle with one angle $\pi/n$ for all $n = 5, 7, \dots, 30$ (the case of $n = 6$ is integrable). For each triangle we take levels from 15000 to 20000 and compute numerically the corresponding 2-point correlation form factor. A typical result is presented in Fig. 8. From data like this, it is quite difficult to find the value of the form factor at the origin because $\tau \to 0$ corresponds, according to Eq. (18), to an infinitely large energy difference in the 2-point correlation function: therefore numerically we always have $K_2(0) = 0$. We found it convenient first to fit the numerical data to the following simple expression for the form factor:

$$K_2(\tau) = \frac{a^2 - 2a + 4\pi^2 \tau^2}{a^2 + 4\pi^2 \tau^2}, \tag{125}$$

and then from it compute $K_2(0)$,

$$K_2(0) = 1 - \frac{2}{a}. \tag{126}$$
The form (125) has been chosen because (i) one wants a simple expression, (ii) when τ → ∞ the form factor should go to 1, (iii) to describe the level repulsion it is necessary that
$$\int_0^\infty \left(1 - K_2(\tau)\right) d\tau = \frac{1}{2}, \tag{127}$$
and (iv) the expression (125) when a = 4 equals the form factor of the so-called semi-Poisson model [25, 32] which serves as a reference point for intermediate statistics. We stress that the above expression has no solid theoretical justification and is used only because it describes our numerical results relatively well. The only fitting parameter is K2(0), related to a by Eq. (126). We tried two fitting procedures. First we fit Eq. (125) to the data with all τ; second, to decrease the influence of very small τ, where numerical accuracy is not very good, we did not consider the data with 0 < τ < 0.25. In Fig. 8 these two fits are presented. The first one gives K2(0) ≈ 0.44 and the second one K2(0) ≈ 0.565. The expected value (123) for n = 8 is 5/9 ≈ 0.56.

Fig. 8. The 2-point form factor K(t) versus t for the π/8 right triangular billiard
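The properties required of the fitting form (125) can be checked directly. The sketch below is illustrative only (the function names are not from the paper): it evaluates the fit, the value at the origin (126), and the level-repulsion integral (127), once through its closed form arctan(2πT/a)/π for a finite upper limit T and once by a simple midpoint rule.

```python
import math


def k2_fit(tau: float, a: float) -> float:
    """Fitting form of Eq. (125)."""
    x = 4.0 * math.pi ** 2 * tau ** 2
    return (a * a - 2.0 * a + x) / (a * a + x)


def k2_at_origin(a: float) -> float:
    """Value of the fit at tau = 0, Eq. (126)."""
    return 1.0 - 2.0 / a


def repulsion_integral_closed(a: float, upper: float) -> float:
    """Closed form of the integral of 1 - K2 from 0 to `upper`."""
    return math.atan(2.0 * math.pi * upper / a) / math.pi


def repulsion_integral_midpoint(a: float, upper: float, steps: int = 20000) -> float:
    """Midpoint-rule estimate of the same integral, as an independent check."""
    h = upper / steps
    return h * sum(1.0 - k2_fit((i + 0.5) * h, a) for i in range(steps))


if __name__ == "__main__":
    a = 4.0  # semi-Poisson reference value
    print(k2_at_origin(a), repulsion_integral_closed(a, 1e9))
```

As the upper limit grows, the integral tends to 1/2 for every a > 0, so condition (127) holds independently of the fitted parameter.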
Fig. 9. K2(0) versus n for π/n right triangles, n = 5 to 30. Circles are theoretical results (123). Squares are the fit (125) when the region of small τ, 0 < τ < 0.25, is omitted. Diamonds are the same fit but with all τ
In Fig. 9 the results of such fitting procedures are given for all triangles. The lower two curves correspond to these fits and the upper curve is the prediction (123). (Of course, only the points are important. Curves are presented for clarity.) The numerical results follow the theoretical formula (123) quite well but there is a small shift which decreases when the region of small τ is ignored. This difference between the curves seems to be a consequence of the fact that the result (123) corresponds to the asymptotic limit k → ∞ but numerical calculations have been performed at large but finite energy. To check this point we present in Fig. 10 the results of the calculation of the mean number variance, $\Sigma^{(2)}(L)$, for the π/30 right triangle, in 10 energy intervals [8000k, 8000(k + 1)], 0 ≤ k ≤ 9 (the energy increases from bottom to top). It is well known (see e.g. [1]) that the behavior of the mean number variance at large distances is related to the value of the form factor at the origin by the simple formula
$$\Sigma^{(2)}(L) \to K_2(0)\, L \quad \text{when } L \to \infty. \tag{128}$$

Fig. 10. $\Sigma^{(2)}(L)$ versus L for energy windows with higher and higher energy
From Fig. 10 it is clearly seen that even for 80000 levels the curve does not stabilize. To find its limiting behavior we extrapolate point by point (with L fixed) these ten curves with a fit $A(L) + B(L)/\sqrt{k}$. (That is, for each L we fit 10 points to find the best A(L) and B(L).) The limit curve (i.e. A(L)) is the uppermost curve in Fig. 10. It perfectly reproduces the expected features of $\Sigma^{(2)}(L)$: it is a straight line with slope K2(0) = 0.38 corresponding to the expected value (123) for n = 30. In the same way, Prosen and Casati [33] have computed $\Sigma^{(2)}(L)$ for triangle billiards with angle π/5 for much larger values of the energy, and it seems that such a fit works well for their calculations and the result for K2(0) agrees with (123). These (and other) calculations clearly demonstrate that the value of the 2-point correlation form factor at the origin converges slowly to the theoretical result (123) with increasing energy. This behavior may be a consequence of the conjectured existence of two different terms (32) in the form factor and, ultimately, a manifestation of the strong diffraction in the vicinity of optical boundaries.

4. Rectangular Billiard with a Flux Line

4.1. Preliminary calculations. This section is devoted to the study of a rectangular billiard with the Aharonov-Bohm flux line [22] at a point r0 = (x0, y0) inside the rectangle. In the polar coordinates, r, ϕ, around this point the vector potential of the flux line has only a ϕ component
$$A_\varphi = \frac{\alpha}{r} \tag{129}$$
and the 2-dimensional Schrödinger equation for the motion in this potential is (when ħ = c = 1 and m = 1/2)
$$\left[\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\left(\frac{\partial}{\partial\varphi} - i\alpha\right)^2 + E_n\right]\Psi_n(r, \varphi) = 0. \tag{130}$$
Similarly to the triangular billiards discussed in previous sections this model belongs to the class of diffractive systems. The diffraction coefficient for the scattering on the flux line (16) diverges in the forward direction but, as for pseudo-integrable systems, the contribution of diffractive orbits can be neglected when computing the value of the 2-point correlation form factor at the origin. It is well known that the Aharonov-Bohm potential (129) does not change classical trajectories but gives an additional phase, Δφ, when a trajectory turns n times around the flux line,
$$\Delta\phi = 2 n \pi \alpha. \tag{131}$$
Therefore the contribution (10) of a periodic orbit to the trace formula will contain an additional phase depending on the winding number of the trajectory around the flux line. Periodic orbits in the rectangle of sides a, b are determined by two integers M and N in the usual way and they are characterized by their length
$$l_p = \sqrt{(2Ma)^2 + (2Nb)^2}, \tag{132}$$
the area occupied by the periodic orbit family, and the winding number around the flux line. Each pencil of primitive periodic orbits occupies an area 2ab = 2A, so its width is $2A/l_p$. The images of the flux line in the unfolding of the rectangular billiard are located at the points
$$\left((\zeta_1 + 2k)a,\ (\zeta_2 + 2k')b\right). \tag{133}$$
Here $\zeta_i$ takes the values $\xi_i$ or $2 - \xi_i$ (i = 1, 2), where
$$\xi_1 = \frac{x_0}{a}, \quad \xi_2 = \frac{y_0}{b} \tag{134}$$
are the ratios of coordinates of the flux line to the corresponding sides, and $k, k' \in \mathbb{Z}$ (see Fig. 11). Let us define [x] as the largest integer less than or equal to x, so that
$$[x] \le x < [x] + 1, \tag{135}$$
and $\{x\} = x - [x] \in [0, 1)$. Each unfolded pencil of a primitive periodic orbit contains two and only two images of the flux line, since it covers twice the area of the rectangle. The periodic orbits from this pencil parallel to the vector (M, N) and going through the images of the flux line (which we shall call the saddle connections) split the pencil of primitive periodic orbits parallel to (M, N) into three pencils of the same length (see Fig. 11). Only the central strip is affected by the presence of the flux line and any trajectory from this strip gets a phase 2πα (according to (131) and since the orbit is primitive). So the winding number of a periodic trajectory is nothing but the repetition number of a periodic orbit belonging to the central pencil.

Fig. 11. An unfolded trajectory in the rectangle. Legend: flux line; periodic orbit (M = 1, N = 2); the same orbit unfolded; pencil of periodic orbits

To compute the width of the central strip as a function of M and N, let us note that the algebraic distance from an image $((\zeta_1 + 2k)a, (\zeta_2 + 2k')b)$ of the flux line to the saddle-connection linking the points (0, 0) to (2Ma, 2Nb) is
$$d = \frac{2A}{l_p}\left(\zeta_2 M - \zeta_1 N + 2k'M - 2kN\right). \tag{136}$$
The two images of the flux line which are inside the pencil are the two nearest points among all the images $((\zeta_1 + 2k)a, (\zeta_2 + 2k')b)$ with integer k and k' that have positive distance to the saddle-connection. They correspond to points such that the distance (136) is positive and less than $2A/l_p$, or less than 1 in units of $2A/l_p$. Let us set
$$Q_\pm = [\xi_2 M \pm \xi_1 N], \quad \epsilon_\pm = \{\xi_2 M \pm \xi_1 N\}. \tag{137}$$
The four possible values of $(\zeta_1, \zeta_2)$ ($(\xi_1, \xi_2)$, $(\xi_1, 2-\xi_2)$, $(2-\xi_1, \xi_2)$, and $(2-\xi_1, 2-\xi_2)$) give four possible families for the distance (136):
$$d_i = 2k_i \pm (Q_+ + \epsilon_+), \quad d_i = 2k_i \pm (Q_- + \epsilon_-), \tag{138}$$
where $k_i$ is a certain integer which depends on k, k', M and N. Among these four families, exactly two points correspond to a distance positive and less than 1: for instance if $Q_+$ and $Q_-$ are even, only $2k_1 + Q_+ + \epsilon_+$ and $2k_2 + Q_- + \epsilon_-$ can be positive and less than one. So we must have $2k_1 + Q_+ = 0$ and $2k_2 + Q_- = 0$ and the two images of the flux line that are in the pencil of primitive periodic orbits are at a distance $\epsilon_+$ and $\epsilon_-$ from the saddle-connection (0, 0) − (2Ma, 2Nb); if both $Q_+$ and $Q_-$ are odd, the two distances are $1 - \epsilon_+$ and $1 - \epsilon_-$. Since the width of the central strip is the difference between the two distances, it is in both cases $|\epsilon_- - \epsilon_+|$. Dealing in the same way with the case where $Q_+$ and $Q_-$ have opposite parity, we get that the width of the central strip in units of $2A/l_p$ is
$$\eta = \begin{cases} |\epsilon_- - \epsilon_+|, & \text{if } Q_+ \text{ and } Q_- \text{ have the same parity} \\ |1 - \epsilon_- - \epsilon_+|, & \text{if } Q_+ \text{ and } Q_- \text{ have opposite parity.} \end{cases} \tag{139}$$
Both cases can be summed up in the following formula:
$$\eta = \varphi(x, y), \tag{140}$$
with
$$x = \xi_2 M + \xi_1 N, \quad y = \xi_2 M - \xi_1 N, \tag{141}$$
where
$$\varphi(x, y) = |f(x) - f(y)|, \quad f(x) = (-1)^{[x]}\left(\{x\} - \frac{1}{2}\right); \tag{142}$$
f(x) is an even function of period 2; if we restrict the study of ϕ to [−1, 1] × [−1, 1] we have
$$\varphi(x, y) = \begin{cases} |x + y| & \text{if } xy \le 0 \\ |x - y| & \text{if } xy \ge 0, \end{cases} \tag{143}$$
and the Fourier expansion of ϕ is
$$\varphi(x, y) = 2\sum_{n=1}^{\infty} \frac{(\cos \pi n x - \cos \pi n y)^2}{\pi^2 n^2}. \tag{144}$$
Using (141) and (144), we obtain that the width of the central strip associated with the orbit (2M, 2N) is
$$\eta = \sum_{n=1}^{\infty} \frac{8 \sin^2(\pi n \xi_2 M)\, \sin^2(\pi n \xi_1 N)}{\pi^2 n^2}. \tag{145}$$
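The equivalence between the closed form (140)-(142) and the Fourier series (145) for the width of the central strip can be checked numerically. The sketch below is only an illustration: the function names, the truncation order and the sample flux-line position are arbitrary choices, and `xi1`, `xi2` stand for the ratios x0/a and y0/b of Eq. (134).

```python
import math


def f(x: float) -> float:
    """Even, period-2 sawtooth of Eq. (142): (-1)^[x] ({x} - 1/2)."""
    floor_x = math.floor(x)
    sign = -1.0 if floor_x % 2 else 1.0
    return sign * ((x - floor_x) - 0.5)


def strip_width(M: int, N: int, xi1: float, xi2: float) -> float:
    """Width of the central strip in units of 2A/l_p, Eqs. (140)-(142)."""
    x = xi2 * M + xi1 * N
    y = xi2 * M - xi1 * N
    return abs(f(x) - f(y))


def strip_width_fourier(M: int, N: int, xi1: float, xi2: float,
                        terms: int = 20000) -> float:
    """Same width from the truncated Fourier series of Eq. (145)."""
    total = 0.0
    for n in range(1, terms + 1):
        s1 = math.sin(math.pi * n * xi2 * M) ** 2
        s2 = math.sin(math.pi * n * xi1 * N) ** 2
        total += 8.0 * s1 * s2 / (math.pi ** 2 * n ** 2)
    return total


if __name__ == "__main__":
    print(strip_width(1, 2, 0.3, 0.45), strip_width_fourier(1, 2, 0.3, 0.45))
```

The truncated series converges slowly (the tail is of order 1/terms), so the two values agree only to a few decimal places.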
4.2. Form factor for the billiard with flux line. The value of the two-point correlation form factor in the diagonal approximation given by Eq. (25), when diffractive contributions have been neglected, still holds for billiards with a flux line. But $A_p$ now includes the phase factor depending on the repetition number of the trajectory. The density of periodic orbits (10) becomes
$$d^{p.o.}(E) = \frac{1}{2} \sum_{pp^+,\, pp^-} \sum_{n=1}^{\infty} \frac{A_{pn}}{2\pi} \frac{1}{\sqrt{2\pi k n l_p}}\, e^{i k n l_p - i\frac{\pi}{2}\nu_p - i\frac{\pi}{4}} + \text{c.c.} \tag{146}$$
Here we distinguish between two types of orbits. The orbits associated with a primitive orbit $pp^+$ have a phase exp(2iπnα) for the orbit repeated n times, and the total coefficient in the trace formula associated with these orbits is
$$A_{p^+ n} = A_{p1} + A_{p2}\, e^{2i\pi n\alpha} + A_{p3}, \tag{147}$$
where $A_{p1}$, $A_{p2}$ and $A_{p3}$ are the areas covered respectively by the three strips into which the pencil of periodic orbits splits. The orbits associated with a primitive orbit $pp^-$ have the complex conjugate phase exp(−2iπnα) and their contribution to the trace formula is proportional to
$$A_{p^- n} = A_{p1} + A_{p2}\, e^{-2i\pi n\alpha} + A_{p3}. \tag{148}$$
When the terms with the same length $l_p = n l_{pp}$ are gathered together, we get
$$d^{p.o.}(E) = \sum_{p^+} \frac{A_{np}}{2\pi} \frac{1}{\sqrt{2\pi k l_p}}\, e^{i k l_p - i\frac{\pi}{2}\nu_p - i\frac{\pi}{4}} + \text{c.c.}, \tag{149}$$
where
$$A_{np} = A_{p1} + A_{p2}\cos(2\pi n\alpha) + A_{p3} \tag{150}$$
and the sum over $p^+$ goes over orbits M, N ≥ 0. Equation (25) now becomes
$$K_2(\tau) = \sum_{pp^+} \sum_{n=1}^{\infty} \frac{A_{np}^2}{n^2}\, \frac{1}{2\pi^2 l_{pp} \bar d}\, \delta\!\left(l_{pp} - \frac{4\pi k \bar d \tau}{n}\right). \tag{151}$$
If $\eta_p$ is the width of the central strip expressed in units of $2A/l_p$, we have
$$A_{np} = 2A(1 - \eta_p + \eta_p \cos 2\pi n\alpha) = 2A(1 - 2\eta_p \sin^2 \pi n\alpha), \tag{152}$$
and (using the fact that $\bar d = A/4\pi$) the 2-point correlation form factor at small τ is a sum of 3 terms
$$K_2(\tau) = \frac{8A}{\pi}\sum_{n=1}^{\infty}\frac{1}{n^2}\sum_{pp}\frac{1}{l_{pp}}\,\delta\!\left(l_{pp} - \frac{Ak\tau}{n}\right) - \frac{32A}{\pi}\sum_{n=1}^{\infty}\frac{\sin^2 \pi n\alpha}{n^2}\sum_{pp}\frac{\eta_{pp}}{l_{pp}}\,\delta\!\left(l_{pp} - \frac{Ak\tau}{n}\right) + \frac{32A}{\pi}\sum_{n=1}^{\infty}\frac{\sin^4 \pi n\alpha}{n^2}\sum_{pp}\frac{\eta_{pp}^2}{l_{pp}}\,\delta\!\left(l_{pp} - \frac{Ak\tau}{n}\right). \tag{153}$$
The summation over primitive periodic orbits can be done by replacing the sum by an integral, taking into account the density of primitive periodic orbits. If $\xi_1$ and $\xi_2$ are rational numbers
$$\xi_i = \frac{p_i}{q_i}, \tag{154}$$
where $p_i$ and $q_i$ are co-prime integers, the width of the central strip (145) only depends on the remainder $r_1$ of M modulo $q_1$ and $r_2$ of N modulo $q_2$,
$$\eta(r_1, r_2) = \sum_{n=1}^{\infty} \frac{8}{\pi^2 n^2} \sin^2\!\left(\pi n \frac{p_1}{q_1} r_1\right) \sin^2\!\left(\pi n \frac{p_2}{q_2} r_2\right). \tag{155}$$
There are $q_1 q_2$ periodic orbit families
$$M = q_1 k + r_1, \quad N = q_2 k' + r_2, \tag{156}$$
with $k, k' \in \mathbb{N}$. To compute the sums in Eq. (153) one needs to know the mean density of primitive periodic orbits for each family, $\rho_{pp,r_1,r_2}(l)$. Let c be the greatest common divisor of $q_1$, $q_2$: $c = (q_1, q_2)$. If $(r_1, r_2, c) \neq 1$, then M and N are not coprime and there is no primitive periodic orbit. In the opposite case it is demonstrated in Appendix A that
$$\rho_{pp,r_1,r_2}(l) = \rho_{pp}(l)\, \alpha(r_1, r_2), \tag{157}$$
where $\rho_{pp}(l)$ is the mean density for all primitive periodic orbits in the rectangle (cf. Eq. (42)) with M, N > 0,
$$\rho_{pp}(l) = \frac{3l}{4\pi A}, \tag{158}$$
and
$$\alpha(r_1, r_2) = \frac{1}{q_1 q_2 \prod_{p | \mathrm{lcf}(q_1,q_2)} (1 - 1/p^2)} \prod_{\substack{p | (q_1,r_1) \\ p \nmid q_2}} \left(1 - \frac{1}{p}\right) \prod_{\substack{p | (q_2,r_2) \\ p \nmid q_1}} \left(1 - \frac{1}{p}\right), \tag{159}$$
and $\mathrm{lcf}(q_1, q_2)$ is the least common factor of $q_1$, $q_2$. The knowledge of the mean density of periodic orbit families permits the computation of mean values of different quantities depending on families. If $f(r_1, r_2)$ is such a quantity its mean value is defined as follows
$$\langle f \rangle = \sum_{\substack{r_i \bmod q_i \\ (r_1,r_2,c)=1}} f(r_1, r_2)\, \alpha(r_1, r_2). \tag{160}$$
In particular
$$\sum_{pp} \frac{\eta_{pp}^{\beta}}{l_{pp}}\, \delta\!\left(l_{pp} - \frac{Ak\tau}{n}\right) = \sum_{\substack{r_i \bmod q_i \\ (r_1,r_2,c)=1}} \eta^{\beta}(r_1, r_2) \int_0^{\infty} \frac{1}{l}\, \rho_{pp,r_1,r_2}(l)\, \delta\!\left(l - \frac{Ak\tau}{n}\right) dl = \frac{3}{4\pi A} \left\langle \eta^{\beta} \right\rangle. \tag{161}$$
The sums over n that appear in (153) can be computed using the standard formula
$$\sum_{n=1}^{\infty} \frac{\cos 2\pi n x}{(\pi n)^2} = x^2 - x + \frac{1}{6}, \quad \text{for } 0 \le x \le 1. \tag{162}$$
It gives
$$\sum_{n=1}^{\infty} \frac{\sin^2 \pi n\alpha}{n^2} = \frac{\pi^2}{2}\, \bar\alpha (1 - \bar\alpha) \tag{163}$$
and
$$\sum_{n=1}^{\infty} \frac{\sin^4 \pi n\alpha}{n^2} = \frac{\pi^2}{4}\, \bar\alpha, \tag{164}$$
where $\bar\alpha$ is the fractional part {α} of the flux through the rectangle when 0 ≤ {α} ≤ 1/2 and $\bar\alpha = 1 - \{\alpha\}$ when 1/2 ≤ {α} ≤ 1. Using (161), (163) and (164) one concludes that the 2-point correlation form factor (153) for τ → 0 is the following
$$K_2(0) = 1 - 12\bar\alpha(1 - \bar\alpha)\langle\eta\rangle + 6\bar\alpha\langle\eta^2\rangle. \tag{165}$$
To use this formula it is necessary to know the values of $\langle\eta\rangle$ and $\langle\eta^2\rangle$. In the case where both $\xi_1$ and $\xi_2$ are irrational non-commensurable quantities the fractional parts $\{n\xi_2 M\}$ and $\{n\xi_1 N\}$ cover the whole interval [−1, 1] and $\langle\eta\rangle$ and $\langle\eta^2\rangle$ can be computed by integrating expression (143) of η over the square [−1, 1] × [−1, 1]. Simple calculations show that in this case
$$\langle\eta\rangle = \frac{1}{3}, \tag{166}$$
$$\langle\eta^2\rangle = \frac{1}{6}. \tag{167}$$
Therefore when the coordinates of the flux line are non-commensurable with the corresponding sides
$$K_2(0) = 1 - 3\bar\alpha + 4\bar\alpha^2. \tag{168}$$
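Formula (165) is simple enough to evaluate exactly. The following sketch (the function name and default arguments are illustrative choices, not part of the paper) takes the flux α and the two averages of η as inputs, with the irrational-case values 1/3 and 1/6 as defaults, so that it reduces to (168).

```python
from fractions import Fraction


def k2_origin_flux(alpha, mean_eta=Fraction(1, 3), mean_eta2=Fraction(1, 6)):
    """Form factor at the origin, Eq. (165), for a flux alpha.

    alpha_bar is {alpha} folded onto [0, 1/2], as defined after Eq. (164).
    """
    frac = alpha % 1
    alpha_bar = min(frac, 1 - frac)
    return 1 - 12 * alpha_bar * (1 - alpha_bar) * mean_eta + 6 * alpha_bar * mean_eta2


if __name__ == "__main__":
    alpha = Fraction(2, 5)
    # Irrational flux-line position: Eq. (168) gives 1 - 3*(2/5) + 4*(2/5)**2.
    print(k2_origin_flux(alpha))
    # Rational position x0 = 5a/9, y0 = 11b/20, using the value of <eta^2>
    # quoted in the text for this case.
    print(k2_origin_flux(alpha, mean_eta2=Fraction(4867, 29160)))
```

With exact rational arithmetic the coefficient of the linear term for the rational example is 12/3 − 6·4867/29160 = 14573/4860, in agreement with the value quoted for Eq. (171).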
In Appendix B, it is shown that for all rational $\xi_i$
$$\langle\eta\rangle = \frac{1}{3}, \tag{169}$$
as in the irrational case. The average $\langle\eta^2\rangle$ is more difficult to compute analytically in this case: we have found an analytical expression only when $q_1$ divides $q_2$ (or similarly $q_2$ divides $q_1$). Though the general formula for $\langle\eta^2\rangle$ is cumbersome, the computation of $\langle\eta^2\rangle$ for rational $\xi_1$ and $\xi_2$ can easily be done numerically using Eqs. (141) and (143). For small denominators the values of $\langle\eta^2\rangle$ are given in Table 1. To check the obtained formulas we have computed numerically the first 1500 energy levels for the rectangular billiard with sides a = 4 and b = π and the flux line with coordinates (from the lower left corner) x0 = 5a/9 and y0 = 11b/20. The typical picture of K2(τ) is shown in Fig. 12. As for the triangular billiards discussed in the previous sections we extrapolated K2(τ) to small τ using the simple expression (125). The results for different values of the flux are presented in Fig. 13. We also checked the following more suitable fit (which obeys the condition (127) when $c = (1 - b)^2$)
$$K_2(\tau) = \begin{cases} b + c\tau, & \text{when } \tau < (1 - b)/c \\ 1, & \text{when } \tau > (1 - b)/c. \end{cases} \tag{170}$$
Table 1. Value of <η²> for a rational flux line (columns: q1, rows: q2)

q2 \ q1      2         3         4         5         6         7         8
   2        1/3       2/9       1/4       2/9       2/9       2/9      11/48
   3        2/9       2/9      13/72     17/90      5/27     47/252   107/576
   4        1/4      13/72      1/6      61/360     1/6      85/504     1/6
   5        2/9      17/90     61/360    14/75     89/540    37/210   167/960
   6        2/9       5/27      1/6      89/540     4/27     35/216    47/288
   7        2/9      47/252    85/504    37/210    35/216    26/147    85/504
   8       11/48    107/576     1/6     167/960    47/288    85/504     1/6
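The entries of Table 1 follow from the definitions (141)-(143), (159) and (160), and a short exact computation along those lines can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: all function names are hypothetical, lcf(q1, q2) is taken to be q1 q2 / c as stated in Appendix A, and the residue classes enter η through (155) with numerators p1 = p2 = 1 by default.

```python
from fractions import Fraction
from math import floor, gcd


def prime_factors(n: int) -> set:
    """Set of distinct primes dividing n (empty for n = 1)."""
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def f(x: Fraction) -> Fraction:
    """Sawtooth of Eq. (142), evaluated exactly."""
    floor_x = floor(x)
    sign = -1 if floor_x % 2 else 1
    return sign * ((x - floor_x) - Fraction(1, 2))


def eta(r1, r2, q1, q2, p1=1, p2=1) -> Fraction:
    """Strip width for the residue class (r1, r2), closed form of Eq. (155)."""
    u, v = Fraction(p1 * r1, q1), Fraction(p2 * r2, q2)
    return abs(f(u + v) - f(u - v))


def weight(r1, r2, q1, q2) -> Fraction:
    """Relative density alpha(r1, r2) of Eq. (159)."""
    c = gcd(q1, q2)
    norm = Fraction(q1 * q2)
    for p in prime_factors(q1 * q2 // c):
        norm *= 1 - Fraction(1, p * p)
    w = 1 / norm
    for q, r in ((q1, r1), (q2, r2)):
        for p in prime_factors(gcd(q, r)):
            if c % p != 0:
                w *= 1 - Fraction(1, p)
    return w


def mean_eta_power(q1, q2, power, p1=1, p2=1) -> Fraction:
    """Average <eta^power> over admissible residue classes, Eq. (160)."""
    c = gcd(q1, q2)
    total = Fraction(0)
    for r1 in range(q1):
        for r2 in range(q2):
            if gcd(gcd(r1, r2), c) == 1:
                total += eta(r1, r2, q1, q2, p1, p2) ** power * weight(r1, r2, q1, q2)
    return total


if __name__ == "__main__":
    for q2 in range(2, 9):
        print(q2, [str(mean_eta_power(q1, q2, 2)) for q1 in range(2, 9)])
```

Calling `mean_eta_power(q1, q2, 0)` returns the normalization sum of Eq. (200), and `mean_eta_power(q1, q2, 1)` gives the average of Eq. (169), which provides two consistency checks on the weights.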
It gives practically the same results. The existing numerical precision does not allow us to distinguish between these two fits. In the case where x0 = 5a/9 and y0 = 11b/20, simple calculations show that $\langle\eta^2\rangle = 4867/29160$; Eq. (165) gives the expected value of K2(0),
$$K_2(0) = 1 - \frac{14573}{4860}\,\bar\alpha + 4\bar\alpha^2, \tag{171}$$
which corresponds to the solid curve in Fig. 13. (Note that the coefficient of $\bar\alpha$ equals approximately 2.99 and is practically indistinguishable from the coefficient 3 for irrational $\xi_i$.) As for triangular billiards, there is a small difference between the theoretical and numerical curves. For triangular billiards, where more levels are available, this difference slowly decreases with energy. We expect the same behavior for rectangular billiards with a flux line.

5. Conclusion

In this paper we have obtained explicit expressions of the 2-point correlation form factor K2(τ) in the limit τ → 0 for a few typical examples of pseudo-integrable billiards: triangular billiards in the shape of right triangles with one angle equal to π/n, and rectangular billiards with a flux line. The obtained values of K2(0) differ from standard examples of spectral statistics (random matrix theory and Poisson statistics), which confirms analytically the peculiarities of spectral statistics of pseudo-integrable systems. The calculations have been performed by analyzing analytically the properties of classical periodic orbits of the systems considered.
Fig. 12. The 2-point form factor K(t) versus t for the rectangular billiard with flux line with α = 0.4 and its smoothed value (white line)
In order to elucidate further the special properties of spectral statistics of polygonal billiards, it would be of interest to compute K2(0) for generic triangular billiards without the Veech structure. Moreover, we have taken into account only the diagonal terms and, consequently, were able to obtain only K2(0). The computation of the next terms in the expansion of K2(τ) in powers of τ should include the exact resummation of singular contributions, coming from the diffraction close to the optical boundaries. The solutions of these problems require the development of new methods beyond the ones used in this paper.

Appendix A

Periodic orbits in a rectangle with sides a, b are determined by 2 integers M and N which count the difference of coordinates of the initial, $(x_i, y_i)$, and final, $(x_f, y_f)$, points
$$x_f = x_i + 2aM, \quad y_f = y_i + 2bN. \tag{172}$$
The length of the periodic orbit is the geometrical length of this vector
$$L_p = \sqrt{(2aM)^2 + (2bN)^2}. \tag{173}$$
The mean cumulative density and the corresponding quantity for primitive periodic orbits (when M, N are co-prime integers) can be computed as for the square billiard (see Eqs. (38) and (42)). When l → ∞ and if only positive M are considered
$$N(L_p < l) \to \frac{\pi l^2}{8ab}, \tag{174}$$
and
$$N_{pp}(L_p < l) \to \frac{3 l^2}{4\pi ab}. \tag{175}$$

Fig. 13. K2(0) for different values of the flux α (points) and the asymptotic theoretical prediction (solid line)
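The asymptotic counts (174) and (175) can be compared with a brute-force enumeration of lattice vectors. The sketch below is illustrative only: the function name, the side lengths and the cutoff length are arbitrary choices made here, and only positive M is counted, as in the text.

```python
from math import gcd, pi, sqrt


def count_orbits(l: float, a: float, b: float, primitive: bool = False) -> int:
    """Count integer pairs (M, N) with M > 0 and orbit length L_p < l.

    With primitive=True only coprime pairs (M, N) are counted.
    """
    count = 0
    M = 1
    while 2 * a * M < l:
        n_max = int(sqrt(l * l - (2 * a * M) ** 2) / (2 * b))
        for N in range(-n_max, n_max + 1):
            if (2 * a * M) ** 2 + (2 * b * N) ** 2 >= l * l:
                continue
            if primitive and gcd(M, N) != 1:
                continue
            count += 1
        M += 1
    return count


if __name__ == "__main__":
    l, a, b = 200.0, 1.0, 1.0
    print(count_orbits(l, a, b), pi * l * l / (8 * a * b))
    print(count_orbits(l, a, b, primitive=True), 3 * l * l / (4 * pi * a * b))
```

The ratio of the two asymptotic expressions is 6/π², the density of coprime pairs, so the enumeration also checks the Möbius-function argument used in the rest of this Appendix.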
The purpose of this Appendix is the computation of the mean cumulative density of primitive periodic orbits for periodic orbit families when
$$M \equiv r_1 \bmod q_1, \quad N \equiv r_2 \bmod q_2. \tag{176}$$
The asymptotics of $N_{pp}(L_p < l)$ when l → ∞ is related to the behavior at small x of the Θ-function associated with these periodic orbits
$$\Theta(x) = \sum_{pp} e^{-x L_p^2}. \tag{177}$$
If
$$\Theta(x) \to \frac{C}{x^{\gamma}}, \quad \text{when } x \to 0, \tag{178}$$
then
$$N_{pp}(L_p < l) \to \frac{C\, l^{2\gamma}}{\gamma\, \Gamma(\gamma)}, \quad \text{when } l \to \infty. \tag{179}$$
We are interested in the following Θ-function:
$$\Theta(x) = \sum_{M,N=-\infty}^{\infty} e^{-x\left((2aM)^2 + (2bN)^2\right)}, \tag{180}$$
where the summation is performed over all integers M, N with the following constraints:
$$(M, N) = 1, \quad M \equiv r_1 \bmod q_1, \quad N \equiv r_2 \bmod q_2. \tag{181}$$
Note that both positive and negative values of M, N are considered. When only positive M are taken into account the formulas below acquire asymptotically a factor 1/2. To impose the restriction M ≡ r mod q it is convenient to introduce the δ-function
$$\delta_{t,q} = \begin{cases} 1, & \text{if } t \equiv 0 \bmod q \\ 0, & \text{otherwise.} \end{cases} \tag{182}$$
Its explicit form may be the following:
$$\delta_{t,q} = \frac{1}{q} \sum_{k=0}^{q-1} e^{2\pi i k t/q}. \tag{183}$$
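The representation (183) of the congruence indicator (182) as a finite sum of roots of unity is easy to confirm numerically. The sketch below uses a hypothetical helper name, `delta_mod`, chosen here for illustration.

```python
import cmath


def delta_mod(t: int, q: int) -> complex:
    """Right-hand side of Eq. (183): (1/q) * sum_k exp(2*pi*i*k*t/q)."""
    return sum(cmath.exp(2j * cmath.pi * k * t / q) for k in range(q)) / q


if __name__ == "__main__":
    q = 4
    for t in range(-4, 9):
        print(t, round(delta_mod(t, q).real, 12))
```

Up to rounding errors the result is 1 when q divides t and 0 otherwise, as required by (182).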
As in Sect. 3.1 the condition (M, N) = 1 can be taken into account by the inclusion-exclusion principle
$$\sum_{\substack{M,N=-\infty \\ (M,N)=1}}^{\infty} f(M, N) = \sum_{M,N=-\infty}^{\infty} \sum_{t=1}^{\infty} f(Mt, Nt)\, \mu(t), \tag{184}$$
where µ(t) is the Möbius function equal to $(-1)^n$ if t is a product of n distinct primes, 0 if t contains a squared factor, and µ(1) = 1. Combining all the necessary restrictions one finds the final expression for the Θ-function (180)
$$\Theta(x) = \frac{1}{q_1 q_2} \sum_{k_i \bmod q_i} \sum_{t=1}^{\infty} \sum_{M,N=-\infty}^{\infty} \mu(t)\, e^{-4xt^2(M^2a^2 + N^2b^2)}\, e^{2\pi i k_1 (Mt - r_1)/q_1 + 2\pi i k_2 (Nt - r_2)/q_2}. \tag{185}$$
Using the Poisson summation formula
$$\sum_{M=-\infty}^{\infty} e^{-xM^2 + 2\pi i y M} = \sqrt{\frac{\pi}{x}} \sum_{M=-\infty}^{\infty} e^{-\pi^2 (M + y)^2/x}, \tag{186}$$
one obtains that
$$\Theta(x) = \frac{\pi}{4abx\, q_1 q_2} \sum_{k_i \bmod q_i} \sum_{t=1}^{\infty} \frac{\mu(t)}{t^2}\, e^{-2\pi i k_1 r_1/q_1 - 2\pi i k_2 r_2/q_2} \sum_{M=-\infty}^{\infty} e^{-\pi^2 (M + k_1 t/q_1)^2/(4xt^2a^2)} \sum_{N=-\infty}^{\infty} e^{-\pi^2 (N + k_2 t/q_2)^2/(4xt^2b^2)}. \tag{187}$$
When x → 0 the dominant contribution comes from terms with zero exponent, i.e. from terms with
$$M + \frac{k_1 t}{q_1} = 0, \quad \text{and} \quad N + \frac{k_2 t}{q_2} = 0, \tag{188}$$
or $k_i t \equiv 0 \bmod q_i$. The asymptotics of the Θ-function is therefore the following:
$$\Theta(x) = \frac{\pi}{4abx} \frac{1}{q_1 q_2} \sum_{k_i \bmod q_i} \sum_{t=1}^{\infty} \delta_{k_1 t, q_1}\, \delta_{k_2 t, q_2}\, \frac{\mu(t)}{t^2}\, e^{-2\pi i k_1 r_1/q_1 - 2\pi i k_2 r_2/q_2}. \tag{189}$$
Using the representation (183) for these δ-functions and performing the summation over $k_i$ one gets that when x → 0,
$$\Theta(x) = \frac{\pi}{4abx}\, F(r_1, r_2), \tag{190}$$
where
$$F(r_1, r_2) = \frac{1}{q_1 q_2} \sum_{l_i \bmod q_i} \sum_{t=1}^{\infty} \delta_{l_1 t - r_1, q_1}\, \delta_{l_2 t - r_2, q_2}\, \frac{\mu(t)}{t^2}. \tag{191}$$
From Eq. (179) one concludes that the asymptotics of the mean cumulative density of the primitive periodic family (181) (with M, N > 0, i.e. with a factor 1/4) is
$$N_{pp}(L_p < l) = \frac{\pi l^2}{16ab}\, F(r_1, r_2). \tag{192}$$
To perform the summation over t in Eq. (191) it is necessary to know the number of solutions of the two equations
$$l_1 t \equiv r_1 \bmod q_1, \quad l_2 t \equiv r_2 \bmod q_2. \tag{193}$$
It is well known (and can be easily checked) that the number of solutions of the equation ax ≡ b mod q depends on the greatest common divisor of a and q, (a, q) = d. If d = 1 there is 1 solution, $x \equiv b a^{-1} \bmod q$. If d > 1 and $d \nmid b$ there is no solution. If d | b there is one solution, $x_0 = (b/d)(a/d)^{-1} \bmod (q/d)$, and consequently there are d solutions modulo q: $x_j = x_0 + (q/d) j$, j = 0, . . . , d − 1. Therefore
$$F(r_1, r_2) = \frac{1}{q_1 q_2} \sum_{t=1}^{\infty} \frac{\mu(t)}{t^2}\, (q_1, t)(q_2, t)\, \delta_{(q_1,t), r_1}\, \delta_{(q_2,t), r_2} = \frac{1}{q_1 q_2} \sum_{\substack{d_1 | (q_1, r_1) \\ d_2 | (q_2, r_2)}} d_1 d_2 \sum_{\substack{(q_1,t)=d_1 \\ (q_2,t)=d_2}} \frac{\mu(t)}{t^2}. \tag{194}$$
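The counting rule for solutions of ax ≡ b mod q quoted above (d = (a, q) solutions when d divides b, none otherwise) can be verified by exhaustive search for small moduli. The helper names in this sketch are illustrative only.

```python
from math import gcd


def count_solutions(a: int, b: int, q: int) -> int:
    """Brute-force number of x in 0..q-1 with a*x congruent to b modulo q."""
    return sum(1 for x in range(q) if (a * x - b) % q == 0)


def predicted_solutions(a: int, b: int, q: int) -> int:
    """Number of solutions according to the gcd rule: d if d | b, else 0."""
    d = gcd(a, q)
    return d if b % d == 0 else 0


if __name__ == "__main__":
    for a, b, q in [(3, 2, 7), (4, 2, 6), (4, 3, 6), (6, 0, 9)]:
        print(a, b, q, count_solutions(a, b, q), predicted_solutions(a, b, q))
```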
Terms corresponding to $(d_1, d_2) > 1$ give a 0 contribution to the sum, since in that case $q_1$, $q_2$, $d_1$ and $d_2$ have a common factor, which contradicts condition (M, N) = 1. The sum (194) can therefore be restricted to $(d_1, d_2) = 1$, and the sum over t is now a sum over t', where $t = d_1 d_2 t'$. Let us denote by P the product of the prime factors of $q_1$ that do not divide c and by P' the product of the prime factors of $q_2$ that do not divide c. Now
$$(c, P) = (c, P') = (P, P') = 1. \tag{195}$$
If p is a prime dividing $d_1$ and $c = (q_1, q_2)$, then p divides $d_1 = (q_1, t)$. Since it divides c it also divides $q_2$, so $p | (q_2, t)$ and $(d_1, d_2) \neq 1$, which is impossible. So the prime divisors of $d_1$ have to be taken among the divisors of P, and in the same way the prime divisors of $d_2$ have to be taken among the divisors of P'. Similarly, we can check that if p is a prime divisor of c which divides t', p divides $d_1 = (q_1, t)$, so $d_1$ would contain a prime factor of c, which is impossible; and if p is a prime factor of P or P', p divides both $d_1$ and $d_2$. So the sum over t' must be restricted to t' which do not contain any prime divisor of $q_1$ or $q_2$. As µ(ab) = µ(a)µ(b) for co-prime a and b, one gets
$$F(r_1, r_2) = \frac{1}{q_1 q_2} \sum_{d_1, d_2, t'} d_1 d_2\, \frac{\mu(d_1)\mu(d_2)\mu(t')}{d_1^2 d_2^2 t'^2}, \tag{196}$$
where the sum is taken over all $d_1$, $d_2$, t' verifying $d_1 | (P, r_1)$, $d_2 | (P', r_2)$, and t' coprime to $q_1$ and $q_2$. Using the identity
$$\prod_{p | k} \left(1 - \frac{1}{p^s}\right) = \sum_{\delta | k} \frac{\mu(\delta)}{\delta^s}, \tag{197}$$
we get
$$F(r_1, r_2) = \frac{1}{q_1 q_2} \prod_{p \nmid q_1,\, p \nmid q_2} \left(1 - \frac{1}{p^2}\right) \prod_{p | (P, r_1)} \left(1 - \frac{1}{p}\right) \prod_{p | (P', r_2)} \left(1 - \frac{1}{p}\right) = \prod_{\text{all } p} \left(1 - \frac{1}{p^2}\right) \alpha(r_1, r_2) = \frac{6}{\pi^2}\, \alpha(r_1, r_2), \tag{198}$$
where
$$\alpha(r_1, r_2) = \frac{1}{q_1 q_2 \prod_{p | \mathrm{lcf}(q_1,q_2)} (1 - 1/p^2)} \prod_{\substack{p | (q_1,r_1) \\ p \nmid c}} \left(1 - \frac{1}{p}\right) \prod_{\substack{p | (q_2,r_2) \\ p \nmid c}} \left(1 - \frac{1}{p}\right), \tag{199}$$
and $\mathrm{lcf}(q_1, q_2)$ is the least common factor of $q_1$, $q_2$ ($\mathrm{lcf}(q_1, q_2) = q_1 q_2/c$). It is also instructive to check directly that the $\alpha(r_1, r_2)$ are normalized correctly,
$$\sum_{\substack{r_i \bmod q_i \\ (r_1,r_2,c)=1}} \alpha(r_1, r_2) = 1, \tag{200}$$
where as above $c = (q_1, q_2)$. We use once more the inclusion-exclusion principle
$$\sum_{\substack{r_i \bmod q_i \\ (r_1,r_2,c)=1}} f(r_1, r_2) = \sum_{t | c} \mu(t) \sum_{\substack{r_i \bmod q_i \\ t | r_i}} f(r_1, r_2). \tag{201}$$
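If we set
$$D = q_1 q_2 \prod_{p | \mathrm{lcf}(q_1,q_2)} \left(1 - \frac{1}{p^2}\right), \tag{202}$$
we get, with (199) and (201),
$$\sum_{\substack{r_i \bmod q_i \\ (r_1,r_2,c)=1}} \alpha(r_1, r_2) = \frac{1}{D} \sum_{t | c} \mu(t) \sum_{t | r_1,\ t | r_2}\ \sum_{\delta_1 | (P, r_1)} \frac{\mu(\delta_1)}{\delta_1} \sum_{\delta_2 | (P', r_2)} \frac{\mu(\delta_2)}{\delta_2} = \frac{1}{D} \sum_{t | c} \mu(t) \sum_{\delta_1 | P} \sum_{\delta_2 | P'} \frac{\mu(\delta_1)}{\delta_1} \frac{\mu(\delta_2)}{\delta_2} \sum_{\substack{t\delta_1 | r_1 \\ t\delta_2 | r_2}} 1 = \frac{q_1 q_2}{D} \sum_{t | c} \frac{\mu(t)}{t^2} \sum_{\delta_1 | P} \frac{\mu(\delta_1)}{\delta_1^2} \sum_{\delta_2 | P'} \frac{\mu(\delta_2)}{\delta_2^2}. \tag{203}$$
Here we have used the fact that if t | c and δ | P, t and δ have no common factor. In the above sums it is always understood that the summation over $r_i$ goes only from $r_i = 0$ to $q_i - 1$. By the identity (197), the product of the last three sums in Eq. (203), multiplied by $q_1 q_2$, exactly equals D because cPP' has the same prime factors as $\mathrm{lcf}(q_1, q_2)$, and Eq. (200) holds.

Appendix B

In the same way one can compute the mean value of η defined in Eq. (155),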
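$$\langle\eta\rangle = \sum_{\substack{r_i \bmod q_i \\ (r_1,r_2,c)=1}} \eta(r_1, r_2)\, \alpha(r_1, r_2) = \frac{8}{\pi^2 D} \sum_{n=1}^{\infty} \frac{1}{n^2} \sum_{t | c} \mu(t) \sum_{\substack{t | r_1,\ t | r_2 \\ \delta_1 | (P, r_1) \\ \delta_2 | (P', r_2)}} \frac{\mu(\delta_1)}{\delta_1} \frac{\mu(\delta_2)}{\delta_2} \sin^2\!\left(\pi n \frac{r_1}{q_1}\right) \sin^2\!\left(\pi n \frac{r_2}{q_2}\right). \tag{204}$$
Since
$$\sum_{t\delta | r} \sin^2\!\left(\pi n \frac{r}{q}\right) = \frac{q}{2t\delta}\left(1 - \delta_{nt\delta, q}\right), \tag{205}$$
one obtains
$$\langle\eta\rangle = \frac{2 q_1 q_2}{\pi^2 D} \sum_{t | c} \frac{\mu(t)}{t^2} \sum_{\delta_1 | P} \frac{\mu(\delta_1)}{\delta_1^2} \sum_{\delta_2 | P'} \frac{\mu(\delta_2)}{\delta_2^2} \sum_{n=1}^{\infty} \frac{1}{n^2} \left(1 - \delta_{nt\delta_1, q_1}\right)\left(1 - \delta_{nt\delta_2, q_2}\right). \tag{206}$$
The sum over n includes 4 terms. The first is the sum over all n,
$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \tag{207}$$
The second sum has the restriction that $n = (q_1/t\delta_1)\, m$ and
$$\sum_{n=1}^{\infty} \frac{1}{n^2}\, \delta_{nt\delta_1, q_1} = \frac{\pi^2}{6} \left(\frac{t\delta_1}{q_1}\right)^2. \tag{208}$$
The third sum is the same but with the substitution 1 → 2. The fourth sum incorporates two restrictions, $nt\delta_1 \equiv 0 \bmod q_1$ and $nt\delta_2 \equiv 0 \bmod q_2$. Remembering the definition of P and P' (see (195)) one concludes that in this last case the restriction is $n = (cPP'/(t\delta_1\delta_2))\, m$ and
$$\sum_{n=1}^{\infty} \frac{1}{n^2}\, \delta_{nt\delta_1, q_1}\, \delta_{nt\delta_2, q_2} = \frac{\pi^2}{6} \left(\frac{t\delta_1\delta_2}{cPP'}\right)^2. \tag{209}$$
Performing the summation over $\delta_i$ and t in Eq. (206) one notes that all three last sums will have as a factor
$$\sum_{\delta_1 | P} \mu(\delta_1) \quad \text{or} \quad \sum_{\delta_2 | P'} \mu(\delta_2). \tag{210}$$
But for any K ≥ 2 we have
$$\sum_{\delta | K} \mu(\delta) = 0. \tag{211}$$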
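Both Möbius-function identities used in the Appendices, the vanishing sum (211) and the product formula (197), can be checked exactly for small arguments. The sketch below is illustrative, with hypothetical helper names and plain trial division.

```python
from fractions import Fraction


def mobius(n: int) -> int:
    """Moebius function: (-1)^k for k distinct primes, 0 if n has a square factor."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def divisors(n: int) -> list:
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius_divisor_sum(k: int, s: int) -> Fraction:
    """Right-hand side of Eq. (197): sum over d | k of mu(d) / d^s."""
    return sum((Fraction(mobius(d), d ** s) for d in divisors(k)), Fraction(0))


def euler_product(k: int, s: int) -> Fraction:
    """Left-hand side of Eq. (197): product over primes p | k of (1 - p^-s)."""
    product = Fraction(1)
    for p in divisors(k):
        if p > 1 and all(p % d for d in range(2, p)):
            product *= 1 - Fraction(1, p ** s)
    return product


if __name__ == "__main__":
    print([sum(mobius(d) for d in divisors(K)) for K in range(1, 13)])
    print(mobius_divisor_sum(12, 2), euler_product(12, 2))
```

The first printed list starts with 1 (for K = 1) and is 0 for every K ≥ 2, which is the statement (211).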
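Since $q_1$ and $q_2$ are greater than 1, it is impossible that simultaneously c = P = 1, or c = P' = 1, so the terms (210) equal zero. Therefore only the term (207) survives and (206) gives
$$\langle\eta\rangle = \frac{q_1 q_2}{3D} \sum_{t | c} \frac{\mu(t)}{t^2} \sum_{\delta_1 | P} \frac{\mu(\delta_1)}{\delta_1^2} \sum_{\delta_2 | P'} \frac{\mu(\delta_2)}{\delta_2^2}. \tag{212}$$
These sums, multiplied by $q_1 q_2$, are exactly equal to D and finally we get
$$\langle\eta\rangle = \frac{1}{3}. \tag{213}$$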
References
1. Bohigas, O.: Chaos and Quantum Mechanics. In: Giannoni, M.-J., Voros, A. and Zinn-Justin, J. (eds.), Les Houches Summer School Lectures LII, 1989, Amsterdam: North Holland, 1991, p. 87
2. Mehta, M.L.: Random Matrix Theory. New York: Springer, 1990
3. Bohigas, O., Giannoni, M.-J., Schmit, C.: Characterization of chaotic quantum spectra and universality of level fluctuation laws. Phys. Rev. Lett. 52, 1 (1984)
4. Berry, M. V., Tabor, M.: Level clustering in the regular spectrum. Proc. Roy. Soc. Lond. 356, 375 (1977)
5. Andreev, A. V., Altshuler, B.L.: Spectral Statistics beyond Random Matrix Theory. Phys. Rev. Lett. 75, 902 (1995); Agam, O., Altshuler, B.L., Andreev, A.V.: Spectral Statistics from disordered to chaotic systems. Phys. Rev. Lett. 75, 4389 (1995)
6. Bogomolny, E. B., Keating, J.P.: Gutzwiller's Trace formula and Spectral Statistics: Beyond the diagonal approximation. Phys. Rev. Lett. 77, 1472 (1996)
7. Marklof, J.: Spectral Form Factors of Rectangle Billiards. Commun. Math. Phys. 199, 169 (1998)
8. Balian, R., Bloch, C.: Distribution of eigenfrequencies for the wave equation in a finite domain: Eigenfrequency density oscillations. Ann. Phys. (N.Y.) 69, 76 (1972)
9. Gutzwiller, M. C.: Chaos and Quantum Mechanics. In: Giannoni, M.-J., Voros, A. and Zinn-Justin, J. (eds.), Les Houches Summer School Lectures LII, 1989, Amsterdam: North Holland, 1991, p. 201
10. Berry, M. V., Tabor, M.: Closed orbits and the regular bound spectrum. Proc. Roy. Soc. Lond. 349, 101 (1976), ibid J. Phys. A Math. Gen. 10, 371 (1977)
11. Richens, P. J., Berry, M. V.: Pseudointegrable systems in classical and quantum mechanics. Physica D 2, 495 (1981)
12. Shudo, A., Shimizu, Y.: Extensive numerical study of spectral statistics for rational and irrational polygonal billiards. Phys. Rev. E 47, 54 (1993)
13. Bogomolny, E. B., Gerland, U., Schmit, C.: Models of intermediate spectral statistics. Phys. Rev. E 59, 1315 (1999)
14. Schklovskii, B.I. et al.: Statistics of spectra of disordered systems near the metal-insulator transition. Phys. Rev. B 47, 11487 (1993)
15. Bogomolny, E. B., Pavloff, N., Schmit, C.: Diffractive corrections in the trace formula for polygonal billiards. Phys. Rev. E 61, 3689 (2000)
16. Berry, M. V.: Semiclassical theory of spectral rigidity. Proc. Roy. Soc. A 400, 229 (1985)
17. Veech, W. A.: Teichmüller curves in moduli space, Eisenstein series and an application to triangular billiards. Invent. Math. 97 (1989), 553–583
18. Vorobets, Y. B.: Planar structures and billiards in rational polygons: The Veech alternative. Russ. Math. Surv. 51, 5 (1996), 779–817
19. Keller, J.B.: Geometrical theory of diffraction. J. Opt. Soc. Am. 52, 116 (1962)
20. Vattay, G., Wirzba, A., Rosenqvist, P.E.: Periodic orbit theory of diffraction. Phys. Rev. Lett. 73, 2304 (1994)
21. Pavloff, N., Schmit, C.: Diffractive orbits in quantum billiards. Phys. Rev. Lett. 75, 61 (1995); Erratum 75, 3779 (E) (1995)
22. Aharonov, Y., Bohm, D.: Significance of electromagnetic potentials in the quantum theory. Phys. Rev. 115, 485 (1959)
23. Bogomolny, E. B.: Action correlations in integrable systems. Nonlinearity 13, 947 (2000)
24. Bogomolny, E. B.: New Directions in Quantum Chaos. In: Proc. of the International School of Physics "Enrico Fermi", course CXLIII, eds. Casati, G., Guarneri, I., Smilansky, U., Amsterdam, Oxford, Tokyo, Washington DC: IOS Press, 2000, p. 333
25. Bogomolny, E. B., Gerland, U., Schmit, C.: Singular statistics. Phys. Rev. E 63 (2001) 036206, pp. 1–16
26. Bogomolny, E. B.: Quantum Dynamics of Simple Systems. In: The Forty Fourth Scottish Universities Summer School in Physics, Oppo, G.L., Barnett, S.M., Riis, E., Wilkinson, M. (eds.), Bristol and Philadelphia: Institute of Physics Publishing, August 1994, p. 17
27. Masur, H.: Holomorphic Functions and Moduli, Vol. I (Berkeley, CA, 1986), Math. Sci. Research Inst. Publ. 10, New York–Berlin: Springer-Verlag, 1988, pp. 215–228
28. Masur, H.: The growth rate of trajectories of a quadratic differential. Ergod. Theory of Dynam. Syst. 10, 151–176 (1990)
29. Terras, A.: Harmonic analysis on symmetric spaces and applications, I. New York–Berlin: Springer-Verlag, 1985, p. 206
30. Jager, H., Lenstra jr., H. W.: Linear independence of cosecant values. Nieuw Archief Wisk (3) 23, 131–144 (1975)
31. Girstmair, K.: Character coordinates and annihilators of cyclotomic numbers. Manuscripta Math. 59, 375–389 (1987)
32. Bogomolny, E. B., Gerland, U., Schmit, C.: Short-range plasma model for intermediate spectral statistics. Europ. Phys. J. B 19, 121–132 (2001)
33. Casati, G., Prosen, T.: Quantum chaos in triangular billiards. Unpublished

Communicated by P. Sarnak
Commun. Math. Phys. 222, 371 – 413 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Asymptotic Properties of Solutions to 3-Particle Schrödinger Equations
Hiroshi Isozaki
Department of Mathematics, Graduate School of Science, Tokyo Metropolitan University, Minami-Osawa, Hachioji, 192-0397, Japan
Received: 20 September 2000 / Accepted: 20 May 2001
Dedicated to Professor Kiyoshi Mochizuki on the occasion of his sixtieth birthday

Abstract: We construct a generalized Fourier transformation $\mathcal F(\lambda)$ associated with the 3-body Schrödinger operator $H = -\Delta + \sum_a V_a(x^a)$ and characterize all solutions of $(H - \lambda)u = 0$ in the Agmon–Hörmander space $\mathcal B^*$ as the image of $\mathcal F(\lambda)^*$. These stationary solutions admit asymptotic expansions in $\mathcal B^*$ in terms of spherical waves associated with scattering channels.

1. Introduction

1.1. Helmholtz equation. Consider the Helmholtz equation in $\mathbf R^n$,
\[ (-\Delta - \lambda)u = 0, \qquad \lambda > 0. \tag{1.1} \]
According to a classical theorem of Sommerfeld–Rellich, $u = 0$ if $u$ satisfies (1.1) and $u = O(|x|^{-s})$ as $|x| \to \infty$ for $s > (n-1)/2$. Non-trivial solutions arise for the decay rates $s \le (n-1)/2$, and the borderline case $s = (n-1)/2$ was characterized by Agmon–Hörmander [2]: Let $u$ be a solution to (1.1). Then $u$ satisfies
\[ \sup_{R>1} \frac{1}{R}\int_{|x|<R} |u(x)|^2\,dx < \infty \tag{1.2} \]
if and only if
\[ u(x) = \int_{S^{n-1}} e^{i\sqrt{\lambda}\,\omega\cdot x}\,\varphi(\omega)\,d\omega \tag{1.3} \]
for some $\varphi \in L^2(S^{n-1})$. This, combined with the stationary phase method on the sphere, implies that every solution $u$ of (1.1) satisfying (1.2) admits an asymptotic expansion
\[ u \simeq C(\lambda)\,r^{-(n-1)/2}e^{i\sqrt{\lambda}r}\varphi(\hat x) + \overline{C(\lambda)}\,r^{-(n-1)/2}e^{-i\sqrt{\lambda}r}\varphi(-\hat x), \tag{1.4} \]
where $r = |x|$, $\hat x = x/r$, and the asymptotic relation $u \simeq v$ means that
\[ \lim_{R\to\infty} \frac{1}{R}\int_{|x|<R} |u(x) - v(x)|^2\,dx = 0. \tag{1.5} \]
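As a quick sanity check of (1.4), one can take $n = 3$ and $\varphi \equiv 1$, where the integral (1.3) is explicit. The sketch below uses the constant $C(\lambda)$ stated in Lemma 2.4 and reads the coefficient of the incoming wave as the complex conjugate of $C(\lambda)$; in this special case the expansion happens to be an identity rather than only an asymptotic relation.

```latex
\begin{align*}
u(x) &= \int_{S^2} e^{i\sqrt{\lambda}\,\omega\cdot x}\,d\omega
      = 2\pi\int_{-1}^{1} e^{i\sqrt{\lambda}\,r t}\,dt
      = \frac{4\pi\sin(\sqrt{\lambda}\,r)}{\sqrt{\lambda}\,r},\\
C(\lambda) &= e^{-\pi i/2}\,(2\pi)\,\lambda^{-1/2} = -\frac{2\pi i}{\sqrt{\lambda}}
      \qquad (n = 3),\\
C(\lambda)\,r^{-1}e^{i\sqrt{\lambda}r} + \overline{C(\lambda)}\,r^{-1}e^{-i\sqrt{\lambda}r}
     &= \frac{2\pi}{\sqrt{\lambda}\,r}\bigl(-i\,e^{i\sqrt{\lambda}r} + i\,e^{-i\sqrt{\lambda}r}\bigr)
      = \frac{4\pi\sin(\sqrt{\lambda}\,r)}{\sqrt{\lambda}\,r}.
\end{align*}
```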
1.2. 2-body Schrödinger equation. There are two directions for generalizing the above facts. One is the extension to Laplacians on non-compact Riemannian manifolds. This is a classical problem and was studied by Helgason [13,14], for example. The general case was studied by Agmon [1], Melrose [27], and Melrose–Zworski [28]. The other is the extension to 2-body Schrödinger operators by Yafaev [39] and Gâtel–Yafaev [7]. For the sake of simplicity, let us mention the 2-body Schrödinger operator with short-range potential $H = -\Delta + V(x)$, $V(x) = O(|x|^{-1-\epsilon})$ ($\epsilon > 0$). Suppose $u$ is a solution to the equation
\[ (H - \lambda)u = 0, \qquad \lambda > 0. \tag{1.6} \]
Then $u$ satisfies (1.2) if and only if $u$ is written as
\[ u = \mathcal F(\lambda)^*\varphi, \qquad \varphi \in L^2(S^{n-1}), \tag{1.7} \]
where $\mathcal F(\lambda)$ is the operator defined by
\[ \mathcal F(\lambda)f = \int_{\mathbf R^n} e^{-i\sqrt{\lambda}\,\omega\cdot x} f(x)\,dx - \int_{\mathbf R^n} e^{-i\sqrt{\lambda}\,\omega\cdot x}\,V(x)R(\lambda + i0)f\,dx, \tag{1.8} \]
with $R(z) = (H - z)^{-1}$. The solution $u$ of (1.6) satisfying (1.2) admits an asymptotic expansion
\[ u \simeq C(\lambda)\,r^{-(n-1)/2}e^{i\sqrt{\lambda}r}\varphi_+(\hat x) + \overline{C(\lambda)}\,r^{-(n-1)/2}e^{-i\sqrt{\lambda}r}\varphi_-(\hat x), \tag{1.9} \]
and $\varphi_\pm$ are related as follows:
\[ \varphi_+ = \hat S(\lambda)J\varphi_-, \tag{1.10} \]
where $(J\varphi)(\omega) = \varphi(-\omega)$ and $\hat S(\lambda)$ is the scattering matrix for $H$. The operator $\mathcal F(\lambda)$ is a spectral representation (generalized Fourier transformation) for $H$. In fact there are two types of generalized Fourier transformations $\mathcal F_\pm(\lambda)$, which are related to the spatial asymptotics of the resolvent in the following way:
\[ \mathcal F_\pm(\lambda)f = \lim_{r\to\infty} C_\pm(\lambda)\,r^{(n-1)/2}e^{\mp i\sqrt{\lambda}r}\,(R(\lambda \pm i0)f)(r\,\cdot). \]
Moreover $\mathcal F(\lambda)f = \mathcal F_+(\lambda)f = J\mathcal F_-(\lambda)f$. The above facts (1.7), (1.9) and (1.10) are thus closely related to each other and arise from fundamental properties of the generalized Fourier transformation associated with $H$.
1.3. 3-body Schrödinger equation. Extending the above results to many-body Schrödinger equations is a very difficult problem. It was shown in [21] that a solution $u$ of the $N$-body Schrödinger equation $(H - \lambda)u = 0$ vanishes identically if $u \in L^{2,-\alpha}$ for some $\alpha < 1/2$ (for the definition of $L^{2,s}$, see §2) and if $\lambda$ is neither an eigenvalue nor a threshold of $H$. This is a generalization of Sommerfeld–Rellich’s classical result. To characterize the solutions in the borderline case is much harder, since it requires detailed knowledge of the $N$-body stationary Schrödinger equation, which remains unknown in spite of the success of the proof of asymptotic completeness by the time-dependent method [5, 31, 8, 3]. In this paper, we shall study this problem in the case of 3-particle systems in $\mathbf R^3$. To get the complete result, we assume that each pair potential decays rapidly. More precisely, each pair potential $V_{ij}(y)$ is assumed to be a real $C^\infty$-function on $\mathbf R^3$ and to satisfy

Assumption.
\[ \partial_y^m V_{ij}(y) = O(|y|^{-m-\rho}), \qquad \forall m \ge 0 \tag{1.11} \]
for some $\rho > 5$. Here $\partial_y^m$ stands for any differentiation of order $m$. Let us stress that this is the only assumption we impose on our 3-particle systems. We make no extra assumptions such as the nonexistence of zero-eigenvalues or zero-resonances. One can allow Coulombic singularities for $V_{ij}$. Namely, our results below also hold if $V_{ij} = V_{ij}^{(1)} + V_{ij}^{(2)}$, where $V_{ij}^{(1)}$ is a smooth function satisfying (1.11), $V_{ij}^{(2)}$ is a compactly supported function satisfying $|V_{ij}^{(2)}(y)| \le C|y|^{-1}$, and all the multiple commutators of $V_{ij}^{(2)}$ and $A = \frac{1}{2i}(x\cdot\nabla + \nabla\cdot x)$ extend to $\Delta$-bounded operators. We do not enter into the details, however.
1.4. Main results. Let us summarize the main results of this paper. For the notation used below, see Sects. 2 and 3. Let $H = H_0 + \sum_a V_a(x^a)$ be the 3-body Schrödinger operator with center of mass removed. Each pair potential is assumed to satisfy (1.11). Let $\mathcal T$ be the set of thresholds for $H$ and $\mathcal T' = \mathcal T \cup \sigma_p(H)$. In the following, $\sum_{a,n} = \sum_a\sum_n$ denotes the sum ranging over all pairs of particles and over the eigenvalues of the subsystem $H^a = -\Delta_{x^a} + V_a(x^a)$. The meaning of the notation $\oplus_{a,n}$ is similar. Let $\mathcal B$ be the Banach space introduced by Agmon–Hörmander [2], whose dual space has a norm equivalent to (1.2) (see Sect. 2).

Theorem 1. For $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$, there exists a bounded operator
\[ \mathcal F(\lambda) : \mathcal B \to L^2(S^5) \oplus \oplus_{a,n} L^2(S^2) \]
having the following properties:
(1) $\mathcal F(\lambda)$ diagonalizes $H$: $\mathcal F(\lambda)Hf = \lambda\,\mathcal F(\lambda)f$.
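(2) Define $(\mathcal F f)(\lambda)$ by $\mathcal F(\lambda)f$. Then the operator $\mathcal F$ is uniquely extended to a partial isometry with initial set $\mathcal H_{ac}(H)$ = the absolutely continuous subspace for $H$, and final set
\[ L^2((0,\infty); L^2(S^5); \rho_0(\lambda)d\lambda) \oplus \oplus_{a,n} L^2((\lambda^{a,n},\infty); L^2(S^2); \rho_{a,n}(\lambda)d\lambda), \]
\[ \rho_0(\lambda) = \frac{\lambda^2}{2}, \qquad \rho_{a,n}(\lambda) = \frac{1}{2}\sqrt{\lambda - \lambda^{a,n}}, \]
where $\lambda^{a,n} \in \sigma_p(H^a)$.
(3) Let $\mathcal F_0(\lambda), \mathcal F_{a,1}(\lambda), \cdots$ be the components of $\mathcal F(\lambda)$. They are eigenoperators of $H$ in the sense that
\[ (H - \lambda)\mathcal F_0(\lambda)^*\varphi_0 = 0, \qquad (H - \lambda)\mathcal F_{a,n}(\lambda)^*\varphi_{a,n} = 0 \]
hold for $\varphi_0 \in L^2(S^5)$, $\varphi_{a,n} \in L^2(S^2)$.
(4) For $f \in \mathcal H_{ac}(H)$, the following inversion formula holds:
\[ f = \int_0^\infty \mathcal F_0(\lambda)^*(\mathcal F_0 f)(\lambda)\,\rho_0(\lambda)\,d\lambda + \sum_{a,n}\int_{\lambda^{a,n}}^\infty \mathcal F_{a,n}(\lambda)^*(\mathcal F_{a,n} f)(\lambda)\,\rho_{a,n}(\lambda)\,d\lambda. \]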
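Theorem 2. For $f \in \mathcal B$ and $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$, the boundary value of the resolvent of $H$ admits the following asymptotic expansion in the sense of (1.5):
\[ R(\lambda + i0)f \simeq C(\lambda)\,\frac{e^{i\sqrt{\lambda}r}}{r^{5/2}}\,\mathcal F_0(\lambda)f + \sum_{a,n} C_{a,n}(\lambda)\,\frac{e^{i\sqrt{\lambda-\lambda^{a,n}}\,r_a}}{r_a}\,\mathcal F_{a,n}(\lambda)f(\omega_a)\otimes\varphi^{a,n}(x^a), \]
\[ C(\lambda) = \sqrt{\frac{\pi}{2}}\,e^{-3\pi i/4}(\lambda_+)^{3/4}, \qquad C_{a,n}(\lambda) = \sqrt{\frac{\pi}{2}}\,h(\lambda - \lambda^{a,n}), \]
where $k_+ = \max\{k, 0\}$ and $h(t) = 1$ if $t \ge 0$, $h(t) = 0$ if $t < 0$, and $\varphi^{a,n}$ is the eigenvector of $H^a$ associated with the eigenvalue $\lambda^{a,n}$.

Theorem 3. Let $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$. Let $u$ satisfy $(H - \lambda)u = 0$. Then $u \in \mathcal B^*$ if and only if $u$ is written as $u = \mathcal F(\lambda)^*\varphi$ for some $\varphi \in L^2(S^5) \oplus \oplus_{a,n} L^2(S^2)$.

Theorem 4. Let $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$. Let $u \in \mathcal B^*$ satisfy $(H - \lambda)u = 0$. Then $u$ admits the asymptotic expansion
\begin{align*}
u \simeq{}& C(\lambda)\,r^{-5/2}e^{i\sqrt{\lambda}r}\varphi_0^{(+)}(\hat x) + \overline{C(\lambda)}\,r^{-5/2}e^{-i\sqrt{\lambda}r}\varphi_0^{(-)}(\hat x) \\
&+ \sum_{a,n} C_{a,n}(\lambda)\,r_a^{-1}e^{i\sqrt{\lambda-\lambda^{a,n}}\,r_a}\,\varphi_{a,n}^{(+)}(\omega_a)\otimes\varphi^{a,n}(x^a) \\
&+ \sum_{a,n} \overline{C_{a,n}(\lambda)}\,r_a^{-1}e^{-i\sqrt{\lambda-\lambda^{a,n}}\,r_a}\,\varphi_{a,n}^{(-)}(\omega_a)\otimes\varphi^{a,n}(x^a) \tag{1.12}
\end{align*}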
in the sense of (1.5), where
\[ C(\lambda) = (2\pi)^{-1/2}e^{-5\pi i/4}(\lambda_+)^{-5/4}, \qquad C_{a,n}(\lambda) = (2\pi)^{-1/2}e^{-\pi i/2}\bigl((\lambda - \lambda^{a,n})_+\bigr)^{-1/2}. \]
Let $\varphi^{(\pm)} = {}^t(\varphi_0^{(\pm)}, \varphi_{a,1}^{(\pm)}, \cdots)$. Then
\[ \varphi^{(+)} = \hat S(\lambda)J\varphi^{(-)}, \tag{1.13} \]
where $\hat S(\lambda)$ is the S-matrix and $J$ is the reflection
\[ J : (\varphi_0(\theta), \varphi_{a,1}(\omega_a), \cdots) \to (\varphi_0(-\theta), \varphi_{a,1}(-\omega_a), \cdots). \tag{1.14} \]
For any $\varphi^{(-)}$, there exist a unique solution $u$ of $(H - \lambda)u = 0$ and a unique $\varphi^{(+)}$ for which the expansion (1.12) is valid. We also discuss the asymptotic expansion of generalized eigenfunctions associated with the scattering process with initial state of the 2-cluster (Theorem 8.4). Theorem 2 is a fundamental result concerning the behavior at infinity of solutions to stationary Schrödinger equations. We shall discuss its other applications in a forthcoming paper.
1.5. Related works. Let us consider the $N$-body Schrödinger operator $H = H_0 + \sum_{1\le i<j\le N} V_{ij}(q_i - q_j)$. As is inferred from the 2-body case, the above problem boils down to the asymptotic expansion of the resolvent $(H - \lambda \mp i0)^{-1}$ at infinity, which leads to the construction of generalized eigenfunctions for $H$ and to the properties of S-matrices. All the difficulties of the $N$-body problem arise from the directions $\{q_i = q_j\}$, along which the pair potential $V_{ij}(q_i - q_j)$ does not decay. Let us call them singular directions in this paper. Most of the study of the stationary $N$-particle Schrödinger equation has been done outside the singular directions. The asymptotic expansion of the resolvent outside the singular directions (free region) was obtained by Herbst–Skibsted [15], and smoothness of the scattering amplitude in the free region was proved by Skibsted [32]. In a series of papers [10–12, 34, 35] Hassell and Vasy continued the study of the generalized eigenfunctions for $H$ and S-matrices in the free region. As for the behavior around the singular directions, Vasy [36] studied it by projecting the solution of $(H - \lambda)u = 0$ onto the bound states of subsystems and investigating the spatial asymptotics in the free region for the subsystem. Let us also mention the work of Vasy [37] on the propagation of singularities for the S-matrix, the underlying idea of which is to look at the micro-local behavior of the resolvent at infinity. The works [18, 19] studied directly the scattering matrix and generalized eigenfunctions around the singular directions. Since they lean heavily upon the spectral property near the zero energy of subsystems, they are restricted to the 3-body case.
1.6. Methods. The present paper is essentially a continuation of our previous works [18, 19], which are based on the micro-local resolvent estimates for $N$-body Schrödinger operators. The germ of the idea of the generalized Fourier transformation for a 3-particle system has already been given in [19]. However, we need two optimal results to overcome new difficulties in the many-body problem. The first one is the Agmon–Hörmander space $\mathcal B$, $\mathcal B^*$, which is not only optimal for the restriction of the Fourier transform on the sphere, but also appropriate to deal with the multi-channel property of the many-body problem. We encounter two types of spherical scattering waves, $r^{-5/2}e^{i\sqrt{\lambda}r}$ and $r_a^{-1}e^{i\sqrt{\lambda-\lambda^{a,n}}\,r_a}$. The former is dominant in the free region, while the latter is dominant near the singular direction $\{x^a = 0\}$. The space $\mathcal B^*$ enables us to show the orthogonality of these two waves and hence the expansion of the resolvent. The second new tool employed in this paper is a micro-local version of Yafaev’s resolvent estimates concerning the spherical part of the radiation condition [40]. Our Theorem 3.5 is a many-body counterpart of Agmon–Hörmander’s result [2], Theorem 7.4, and is crucial to construct the generalized Fourier transformation near the singular directions.

1.7. The plan of the paper. In Sect. 2, we review basic properties of the spaces $\mathcal B$, $\mathcal B^*$. We summarize various estimates for the resolvent and the unitary group for the 3-particle Hamiltonian in Sect. 3. In Sect. 4, we study the generalized Fourier transform associated with the operator $H_a = H_0 + V_a$, which contains only one pair potential. The asymptotic expansion of the resolvent of $H_a$ is derived in Sect. 5. From the technical point of view, Sects. 4 and 5 play the key role in this paper. In Sect. 6, we review the proof of asymptotic completeness of time-dependent wave operators, through which we introduce the generalized Fourier transformation associated with $H$. After studying the scattering matrix in Sect. 7, we prove our main results in Sect. 8. We shall mainly deal with the case $\lambda > 0$. The case $\lambda < 0$ can be treated more easily.

1.8. Notation. The notation used in this paper is almost standard. For two Banach spaces $X$ and $Y$, $\mathbf B(X; Y)$ is the set of all bounded operators from $X$ to $Y$. For $x \in \mathbf R^n$, $\langle x\rangle = (1 + |x|^2)^{1/2}$. $C$’s denote various constants which may vary from line to line. $\mathcal S$ is the Schwartz space of rapidly decreasing functions. For a smooth function $\varphi$ on $\mathbf R^n$, $\varphi(D)$ means the pseudo-differential operator (Ps.D.Op.) with symbol $\varphi(\xi)$. $F(\cdots)$ denotes the characteristic function of a set $\{\cdots\}$. We often use homogeneous functions in this paper. Let $Hom(\mathbf R^n)$ be the set of functions which are homogeneous of degree 0 and smooth except for 0. Let $\widetilde{Hom}(\mathbf R^n)$ be the set of functions which are smooth on $\mathbf R^n$ and homogeneous of degree 0 for $|x| > 1$.

2. Preliminaries

2.1. $\mathcal B$, $\mathcal B^*$ spaces. For $s \in \mathbf R$, let
\[ u \in L^{2,s} \iff \|u\|_s^2 = \int_{\mathbf R^n} (1 + |x|)^{2s}|u(x)|^2\,dx < \infty. \tag{2.1} \]
We also use the Banach space $\mathcal B$ equipped with the norm
\[ \|u\|_{\mathcal B} = \Bigl(\int_{\Omega_0} |u(x)|^2\,dx\Bigr)^{1/2} + \sum_{j=1}^\infty \Bigl(2^{j-1}\int_{\Omega_j} |u(x)|^2\,dx\Bigr)^{1/2} < \infty, \tag{2.2} \]
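where $\Omega_0 = \{x \in \mathbf R^n ; |x| < 1\}$, $\Omega_j = \{x \in \mathbf R^n ; 2^{j-1} < |x| < 2^j\}$ for $j \ge 1$. The norm of the dual space $\mathcal B^*$ is equivalent to the following one:
\[ \|u\|_{\mathcal B^*} = \Bigl(\sup_{R>0} \frac{1}{R}\int_{|x|<R} |u(x)|^2\,dx\Bigr)^{1/2}. \tag{2.3} \]
For $s > 1/2$ we have
\[ L^{2,s} \subset \mathcal B \subset L^{2,1/2} \subset L^2 \subset L^{2,-1/2} \subset \mathcal B^* \subset L^{2,-s}, \tag{2.4} \]
\[ |(u, v)| \le C\|u\|_{\mathcal B}\|v\|_{\mathcal B^*}. \tag{2.5} \]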
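Throughout the paper, we shall use the notation $(\ ,\ )$ to denote the inner product of $L^2$, the pairing of $L^{2,s}$ and $L^{2,-s}$ as well as that of $\mathcal B$, $\mathcal B^*$.

Lemma 2.1. Let $u \in L^{2,-1/2}$ and $\rho(t) \in C_0^\infty((0,\infty))$. Then $\frac{1}{R}\rho\bigl(\frac{|x|}{R}\bigr)u(x) \to 0$ in $\mathcal B$ as $R \to \infty$.

Proof. There exist $0 < a < b < \infty$ such that $\mathrm{supp}\,\rho(\frac{|x|}{R}) \subset \{aR \le |x| \le bR\}$. Take $l, m$ such that $2^{-l} \le a < b \le 2^{m+1}$. Assume that $2^{j-1} \le R < 2^j$. Then $\mathrm{supp}\,\rho(\frac{|x|}{R}) \subset \Omega_{j-l} \cup \Omega_{j-l+1} \cup \cdots \cup \Omega_{j+m+1}$. Let $u_R(x) = \frac{1}{R}\rho(\frac{|x|}{R})u(x)$. Since $|u_R(x)| \le C|u(x)|/|x|$, we have
\[ \|u_R\|_{\mathcal B} = \sum_{n=j-l}^{j+m+1} \Bigl(2^{n-1}\int_{\Omega_n} |u_R(x)|^2\,dx\Bigr)^{1/2} \le C\Bigl(\int_{|x|>2^{j-l-1}} (1 + |x|)^{-1}|u(x)|^2\,dx\Bigr)^{1/2} \to 0 \]
as $R \to \infty$. □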
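Lemma 2.2. Assume that $u \in \mathcal B^*$. Then
\[ \lim_{R\to\infty} \frac{1}{R}\int_{|x|<R} |u(x)|^2\,dx = 0 \tag{2.6} \]
is equivalent to
\[ \lim_{R\to\infty} \frac{1}{R}\int \rho\Bigl(\frac{|x|}{R}\Bigr)|u(x)|^2\,dx = 0, \qquad \forall\rho \in C_0^\infty((0,\infty)). \tag{2.7} \]
Proof. We have only to show that (2.6) is equivalent to
\[ \lim_{R\to\infty} \frac{1}{R}\int_{aR<|x|<bR} |u(x)|^2\,dx = 0, \qquad 0 < \forall a < \forall b < \infty. \tag{2.8} \]
Obviously (2.6) implies (2.8). Since $u \in \mathcal B^*$, for any $\epsilon > 0$ there exists $0 < a < 1$ such that
\[ \frac{1}{R}\int_{|x|<aR} |u(x)|^2\,dx < \epsilon, \qquad \forall R > \frac{1}{a}. \]
Splitting the ball $\{|x| < R\}$ into $\{|x| < aR\}$ and the shell $\{aR \le |x| < R\}$ and applying (2.8) to the shell, we see that the limit superior of the left-hand side of (2.6) is at most $\epsilon$, which proves (2.6). □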
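2.2. Fourier transforms. For $f \in \mathcal S$ and $\lambda > 0$, we define
\[ U_0(\lambda)f(\theta) = \hat f(\sqrt{\lambda}\,\theta) = (2\pi)^{-n/2}\int_{\mathbf R^n} e^{-i\sqrt{\lambda}\,\theta\cdot x} f(x)\,dx. \tag{2.9} \]
Let $R_0(z)$ be the resolvent of $-\Delta$:
\[ R_0(z) = (-\Delta - z)^{-1}. \tag{2.10} \]
By the result of [2], for any $\lambda > 0$,
\[ R_0(\lambda \pm i0) \in \mathbf B(\mathcal B; \mathcal B^*), \tag{2.11} \]
\[ U_0(\lambda) \in \mathbf B(\mathcal B; L^2(S^{n-1})), \qquad U_0(\lambda)^* \in \mathbf B(L^2(S^{n-1}); \mathcal B^*). \tag{2.12} \]
Lemma 2.3. For $f \in L^{2,3/2}$ and $\lambda > 0$, the following strong limit exists in $L^2(S^{n-1})$:
\[ \mathop{\text{s-lim}}_{r\to\infty}\, r^{(n-1)/2}e^{-i\sqrt{\lambda}r}R_0(\lambda + i0)f(r\theta) = C_0(\lambda)U_0(\lambda)f(\theta), \qquad C_0(\lambda) = \sqrt{\frac{\pi}{2}}\,e^{-(n-3)\pi i/4}\lambda^{(n-3)/4}. \]
Proof. If $f$ is rapidly decreasing, this lemma follows from the asymptotic expansion of the Hankel function. For the general case, see [30]. □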
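For orientation, the constant in Lemma 2.3 can be checked by hand in the simplest case $n = 3$, where it should reduce to $C_0(\lambda) = \sqrt{\pi/2}$. The sketch below assumes that $f$ is compactly supported and uses the standard outgoing kernel $e^{i\sqrt{\lambda}|x-y|}/(4\pi|x-y|)$ of $R_0(\lambda+i0)$ in $\mathbf R^3$, which is not written out in the text and is quoted here as a known fact; it shows only the pointwise limit, not the strong $L^2(S^2)$ convergence.

```latex
\begin{align*}
(R_0(\lambda+i0)f)(r\theta)
  &= \int_{\mathbf R^3}\frac{e^{i\sqrt{\lambda}\,|r\theta-y|}}{4\pi\,|r\theta-y|}\,f(y)\,dy,
  \qquad |r\theta-y| = r-\theta\cdot y+O(r^{-1}),\\
r\,e^{-i\sqrt{\lambda}r}\,(R_0(\lambda+i0)f)(r\theta)
  &\longrightarrow \frac{1}{4\pi}\int_{\mathbf R^3}e^{-i\sqrt{\lambda}\,\theta\cdot y}f(y)\,dy
   = \frac{(2\pi)^{3/2}}{4\pi}\,U_0(\lambda)f(\theta)
   = \sqrt{\frac{\pi}{2}}\,U_0(\lambda)f(\theta).
\end{align*}
```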
2.3. Stationary phase method on the sphere.

Lemma 2.4. For $\varphi \in L^2(S^{n-1})$, we have
\[ \int_{S^{n-1}} e^{i\sqrt{\lambda}\,\omega\cdot x}\varphi(\omega)\,d\omega \simeq C(\lambda)\,r^{-(n-1)/2}e^{i\sqrt{\lambda}r}\varphi(\hat x) + \overline{C(\lambda)}\,r^{-(n-1)/2}e^{-i\sqrt{\lambda}r}\varphi(-\hat x), \]
in the sense of (1.5). Here $\hat x = x/r$, $r = |x|$, and $C(\lambda) = e^{-(n-1)\pi i/4}(2\pi)^{(n-1)/2}\lambda^{-(n-1)/4}$.

Proof. This is well-known when $\varphi \in C^\infty(S^{n-1})$. If $\varphi \in L^2(S^{n-1})$, we have only to approximate it by smooth functions. □

By a bounded Ps.D.Op., we mean a pseudo-differential operator with symbol $p(x, \xi)$ such that
\[ \sup_{x,\xi} |\partial_x^\alpha\partial_\xi^\beta p(x, \xi)| < \infty, \qquad \forall\alpha, \beta. \]
A bounded Ps.D.Op. is bounded on $\mathcal B$ and also on $\mathcal B^*$. This is also true for $f(H)$, where $H$ is a Schrödinger operator with bounded potential and $f \in C_0^\infty(\mathbf R)$. These facts follow from Theorem 2.5 of [2].
3. 3-Body Schrödinger Operator

We summarize here various estimates for 3-particle Schrödinger operators. Most of them also hold for the $N$-body case with suitable modifications. We consider 3 particles with mass $m_i > 0$ and position $q_i \in \mathbf R^3$. To remove the motion of the center of mass, our Hamiltonian is defined over the space
\[ X = \Bigl\{(q_1, q_2, q_3) ; \sum_{i=1}^3 m_i q_i = 0\Bigr\} \simeq \mathbf R^6. \tag{3.1} \]
Let $a = (i, j)$ be a pair of particles $i$ and $j$, and $k$ be the 3rd particle. The reduced masses $m_a$, $n_a$ and the Jacobi-coordinates are defined by
\[ \frac{1}{m_a} = \frac{1}{m_i} + \frac{1}{m_j}, \qquad \frac{1}{n_a} = \frac{1}{m_i + m_j} + \frac{1}{m_k}, \tag{3.2} \]
\[ x^a = \sqrt{2m_a}\,(q_i - q_j), \qquad x_a = \sqrt{2n_a}\,\Bigl(q_k - \frac{m_i q_i + m_j q_j}{m_i + m_j}\Bigr). \tag{3.3} \]
The 3-particle Hamiltonians $H_0$, $H$ are defined by
\[ H_0 = -\Delta_{x_a} - \Delta_{x^a}, \qquad H = H_0 + \sum_a V_a(x^a), \tag{3.4} \]
where $\Delta_{x_a}$ ($\Delta_{x^a}$) denotes the Laplacian with respect to the variable $x_a$ ($x^a$). In this section we shall assume that $V_a(y)$ is a real smooth function on $\mathbf R^3$ satisfying
\[ |\partial_y^m V_a(y)| \le C_m(1 + |y|)^{-m-\rho}, \qquad \forall m \ge 0 \tag{3.5} \]
for $\rho > 0$. We put
\[ T_a = -\Delta_{x_a}, \qquad H^a = -\Delta_{x^a} + V_a(x^a), \tag{3.6} \]
\[ H_a = T_a + H^a = H_0 + V_a(x^a). \tag{3.7} \]
Let
\[ R(z) = (H - z)^{-1}, \qquad R_a(z) = (H_a - z)^{-1}, \qquad R^a(z) = (H^a - z)^{-1}. \tag{3.8} \]
Let $B$ be the self-adjoint operator defined by
\[ B = \frac{1}{2i}\Bigl(\frac{x}{\langle x\rangle}\cdot\nabla + \nabla\cdot\frac{x}{\langle x\rangle}\Bigr). \tag{3.9} \]
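The normalization of the Jacobi coordinates can be checked numerically. The sketch below reads (3.3) with the factors $\sqrt{2m_a}$ and $\sqrt{2n_a}$ and verifies, for random velocities in the center-of-mass frame, that the classical kinetic energy $\sum_i m_i|\dot q_i|^2/2$ equals $(|\dot x_a|^2 + |\dot x^a|^2)/4$, which is the classical counterpart of $H_0 = -\Delta_{x_a} - \Delta_{x^a}$ in units with $\hbar = 1$. The function names and the sample masses are hypothetical choices made for illustration only.

```python
import math
import random

def jacobi(q, m, pair=(0, 1)):
    """Jacobi coordinates (x^a, x_a) of (3.2)-(3.3) for the pair a = (i, j).

    q: three vectors in R^3, m: three positive masses.
    """
    i, j = pair
    k = 3 - i - j                                       # index of the third particle
    m_a = 1.0 / (1.0 / m[i] + 1.0 / m[j])               # reduced mass of the pair
    n_a = 1.0 / (1.0 / (m[i] + m[j]) + 1.0 / m[k])      # pair versus third particle
    upper = [math.sqrt(2 * m_a) * (q[i][c] - q[j][c]) for c in range(3)]
    lower = [math.sqrt(2 * n_a)
             * (q[k][c] - (m[i] * q[i][c] + m[j] * q[j][c]) / (m[i] + m[j]))
             for c in range(3)]
    return upper, lower

def center_frame(v, m):
    """Subtract the center-of-mass part so that sum_i m_i v_i = 0."""
    total = sum(m)
    cm = [sum(m[p] * v[p][c] for p in range(3)) / total for c in range(3)]
    return [[v[p][c] - cm[c] for c in range(3)] for p in range(3)]

def kinetic(v, m):
    """Classical kinetic energy sum_i m_i |v_i|^2 / 2."""
    return sum(0.5 * m[p] * sum(x * x for x in v[p]) for p in range(3))

random.seed(0)
m = [1.0, 2.5, 0.7]
v = center_frame([[random.gauss(0, 1) for _ in range(3)] for _ in range(3)], m)
xa_up, xa_low = jacobi(v, m)      # the map is linear, so it applies to velocities too
jacobi_energy = 0.25 * (sum(x * x for x in xa_up) + sum(x * x for x in xa_low))
print(kinetic(v, m), jacobi_energy)
```

The two printed numbers agree up to rounding, and the same holds for the other two choices of the pair $a$.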
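Let $\mathcal T$ be the set of thresholds of $H$, namely
\[ \mathcal T = \{0\} \cup \bigcup_a \sigma_p(H^a), \tag{3.10} \]
where $\sigma_p(H^a)$ is the set of eigenvalues of $H^a$. Let $\mathcal T' = \mathcal T \cup \sigma_p(H)$. We define
\[ a(\lambda) = \inf\{\lambda - t ; t \in \mathcal T', t < \lambda\}. \tag{3.11} \]
Froese–Herbst [6] proved that $\mathcal T' \cap (0, \infty) = \emptyset$, hence $a(\lambda) = \lambda$ if $\lambda > 0$. By the result of [26], we have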
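Theorem 3.1. For $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$, $R(\lambda \pm i0) \in \mathbf B(\mathcal B; \mathcal B^*)$.

We use improved resolvent estimates in weighted $L^2$-spaces. The first one is due to Gérard–Isozaki–Skibsted [9].

Theorem 3.2. Let $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$. Let $f(t) \in C^\infty(\mathbf R)$ be such that $|f^{(m)}(t)| \le C_m(1 + |t|)^{-m}$, $\forall m \ge 0$ and $\mathrm{supp}\,f(t) \subset (-\infty, \sqrt{a(\lambda)})$. Then
\[ f(B)R(\lambda + i0) \in \mathbf B(L^{2,s+s'}; L^{2,s}), \qquad \forall s' > 1,\ \forall s > -1/2. \]
We put for sufficiently small $\epsilon > 0$,
\[ N_\epsilon^a = \Bigl\{x \in \mathbf R^6 ; \frac{|x^a|}{|x|} < \epsilon\Bigr\}, \tag{3.12} \]
\[ N_\epsilon = \cup_a N_\epsilon^a, \qquad M_\epsilon = (N_\epsilon)^C. \tag{3.13} \]
Let $\chi_0, \chi_a \in Hom(\mathbf R^6)$ be such that
\[ \chi_0 + \sum_a \chi_a = 1, \tag{3.14} \]
\[ \chi_a(x) = \begin{cases} 1 & x \in N_\epsilon^a, \\ 0 & x \notin N_{2\epsilon}^a. \end{cases} \tag{3.15} \]
In the following, the index $\alpha$ ranges over $0, a, b, c$. Let $S_\alpha$ be the set of Ps.D.Op.’s $P_\alpha$ with symbol $p_\alpha(x_\alpha, \xi_\alpha)$ satisfying
\[ |\partial_{x_\alpha}^m\partial_{\xi_\alpha}^n p_\alpha(x_\alpha, \xi_\alpha)| \le C_{m,n}\langle x_\alpha\rangle^{-m}\langle\xi_\alpha\rangle^{-n}, \qquad \forall m, n \ge 0. \tag{3.16} \]
Here for $\alpha = 0$, $x_0, \xi_0$ stand for $x, \xi \in \mathbf R^6$. Let $S_\alpha^{(-)}$ be the set of Ps.D.Op.’s $P_\alpha$ with symbol $p_\alpha(x_\alpha, \xi_\alpha)$ satisfying (3.16) and
\[ \sup_{x_\alpha,\xi_\alpha} \frac{x_\alpha}{|x_\alpha|}\cdot\frac{\xi_\alpha}{|\xi_\alpha|} < 1 \quad\text{on}\quad \mathrm{supp}\,p_\alpha(x_\alpha, \xi_\alpha), \tag{3.17} \]
\[ \mathrm{supp}_{\xi_\alpha}\,p_\alpha(x_\alpha, \xi_\alpha) \subset \{C_1 < |\xi_\alpha| < C_2\}, \]
for some constants $0 < C_1 < C_2$.

Theorem 3.3. Let $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$ and $P_\alpha \in S_\alpha^{(-)}$. Then
\[ P_\alpha\chi_\alpha R(\lambda + i0) \in \mathbf B(L^{2,s+s'}; L^{2,s}), \qquad \forall s' > 1,\ s > -1/2. \]
For the proof see [9], Theorem 1.1 and [19], Theorem 3.6. See also Wang [38]. The above Theorem 3.3 and the following Theorem 3.4 due to Yafaev ([40], Theorem 3.5) are closely related to the radiation condition. Let $\nabla_\alpha = \nabla_{x_\alpha}$, $r_\alpha = |x_\alpha|$ and put
\[ \nabla_\alpha^{(s)} = \nabla_\alpha - \frac{x_\alpha}{r_\alpha}\Bigl(\frac{x_\alpha}{r_\alpha}\cdot\nabla_\alpha\Bigr). \tag{3.18} \]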
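Theorem 3.4. Let $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$. Then
\[ \chi_\alpha\nabla_\alpha^{(s)}R(\lambda + i0) \in \mathbf B(L^{2,s}; L^{2,-1/2}), \qquad \forall s > 1/2. \]
We derive a Ps.D.Op. version of Theorem 3.4.

Theorem 3.5. Let $P \in S_\alpha$ be the Ps.D.Op. with symbol $p(x_\alpha, \xi_\alpha)$. Suppose $\mathrm{supp}_{\xi_\alpha}\,p(x_\alpha, \xi_\alpha) \subset \{C_1 < |\xi_\alpha| < C_2\}$ for some constants $0 < C_1 < C_2$. Assume that
\[ p(x_\alpha, \xi_\alpha) = 0 \quad\text{if}\quad \frac{x_\alpha}{|x_\alpha|} = \frac{\xi_\alpha}{|\xi_\alpha|}. \]
Then
\[ \chi_\alpha PR(\lambda + i0) \in \mathbf B(L^{2,s}; L^{2,-1/2}), \qquad \forall s > 1/2. \]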
Proof. For the sake of simplicity, we consider the case $\alpha = 0$ and drop the subscript $\alpha$. The other cases are dealt with similarly. We smear $\chi$ near the origin and assume that $\chi \in C^\infty(\mathbf R^6)$. By the Taylor expansion
\[ p(x, \xi) = \sum_i p^i(x, \xi)\Bigl(\frac{\xi_i}{|\xi|} - \frac{x_i}{|x|}\Bigr) \]
with $p^i(x, \xi) \in S_0$. Therefore we have only to prove the theorem when the symbol has the following form:
\[ p_i(x, \xi) = \rho(x)\psi(\xi)\Bigl(\frac{\xi_i}{|\xi|} - \frac{x_i}{|x|}\Bigr), \]
where $\rho, \psi \in C^\infty(\mathbf R^6)$, $\rho(x) = 1$ for $|x| > 2$, $\rho(x) = 0$ for $|x| < 1$, $\psi(\xi) = 1$ for $|\xi| > \epsilon$, $\psi(\xi) = 0$ for $|\xi| < \epsilon/2$. Take $\rho_\pm(t) \in C^\infty(\mathbf R)$ such that $\rho_+(t) + \rho_-(t) = 1$, $\rho_-(t) = 1$ if $t < -2/3$, $\rho_-(t) = 0$ if $t > -1/3$. We split $p_i(x, \xi)$ into two parts:
\[ p_i(x, \xi) = \rho_+\Bigl(\frac{x}{|x|}\cdot\frac{\xi}{|\xi|}\Bigr)p_i(x, \xi) + \rho_-\Bigl(\frac{x}{|x|}\cdot\frac{\xi}{|\xi|}\Bigr)p_i(x, \xi) = p_i^{(+)}(x, \xi) + p_i^{(-)}(x, \xi). \]
Let $P_i^{(\pm)}$ be the Ps.D.Op. with symbol $p_i^{(\pm)}(x, \xi)$. Since
\[ \chi P_i^{(-)}R(\lambda + i0) \in \mathbf B(L^{2,s+s'}; L^{2,s}), \qquad s' > 1,\ s > -1/2 \]
by Theorem 3.3, we have only to prove the theorem for $P = P_i^{(+)}$. The idea is to use the following identity:
\[ (\xi - (\hat x\cdot\xi)\hat x)^2 = \frac{|\xi|^2}{2}(1 + \hat x\cdot\hat\xi)(\hat x - \hat\xi)^2, \]
where $\hat x = x/|x|$. We then have by the symbolic calculus
\[ \sum_i (P_i^{(+)})^*\chi\langle x\rangle^{-1}\chi P_i^{(+)} = (\nabla^{(s)})^*\cdot Q^*\chi\langle x\rangle^{-1}\chi Q\cdot\nabla^{(s)} + \langle x\rangle^{-2}\tilde Q, \]
with $Q, \tilde Q \in S_0$. Therefore
\[ \sum_i \|\chi P_i^{(+)}R(\lambda + i0)f\|_{-1/2}^2 = \|\chi Q\cdot\nabla^{(s)}R(\lambda + i0)f\|_{-1/2}^2 + (\langle x\rangle^{-2}\tilde Q R(\lambda + i0)f, R(\lambda + i0)f). \]
This together with Theorem 3.4 implies Theorem 3.5. □
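The algebraic identity used in the proof above is elementary: with $c = \hat x\cdot\hat\xi$, both sides equal $|\xi|^2(1 - c^2)$. On the support of $\rho_+$ one has $c \ge -2/3$, so the factor $1 + \hat x\cdot\hat\xi$ is bounded below there. As an illustration, the following sketch checks the identity numerically for random vectors in $\mathbf R^6$; the helper names are hypothetical and the script is not part of the paper’s argument.

```python
import math
import random

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def identity_sides(x, xi):
    """Return both sides of
    |xi - (xhat.xi) xhat|^2 = (|xi|^2 / 2) (1 + xhat.xihat) |xhat - xihat|^2."""
    xhat = [c / math.sqrt(dot(x, x)) for c in x]
    xihat = [c / math.sqrt(dot(xi, xi)) for c in xi]
    radial = dot(xhat, xi)                                  # component of xi along xhat
    tangential = [a - radial * b for a, b in zip(xi, xhat)]
    lhs = dot(tangential, tangential)
    diff = [a - b for a, b in zip(xhat, xihat)]
    rhs = 0.5 * dot(xi, xi) * (1.0 + dot(xhat, xihat)) * dot(diff, diff)
    return lhs, rhs

random.seed(1)
for _ in range(5):
    x = [random.gauss(0, 1) for _ in range(6)]
    xi = [random.gauss(0, 1) for _ in range(6)]
    print(identity_sides(x, xi))
```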
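Let us elucidate the utility of the above Theorem 3.5. As is seen from Lemma 2.3, the Fourier transformation $U_0(\lambda)$ is related to the asymptotic behavior of the resolvent $R_0(\lambda + i0)$. We show another formula.

Theorem 3.6. Let $\rho(t) \in C_0^\infty((0,\infty))$ be such that $\int_0^\infty \rho(t)\,dt = 1$. Then for $f \in L^{2,s}$, $s > 1/2$, the following strong limit exists in $L^2(S^{n-1})$:
\[ \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int_{\mathbf R^n} e^{-i\sqrt{\lambda}\,\theta\cdot x}\rho\Bigl(\frac{|x|}{R}\Bigr)R_0(\lambda + i0)f\,dx = \int_{\mathbf R^n} e^{-i\sqrt{\lambda}\,\theta\cdot x} f(x)\,dx. \]
Proof. Let $u = R_0(\lambda + i0)f$ and $\rho_1(t) = \int_t^\infty \rho(s)\,ds$. Then we have
\[ \int \bigl[(-\Delta - \lambda)e^{-i\sqrt{\lambda}\,\theta\cdot x}\bigr]\rho_1\Bigl(\frac{|x|}{R}\Bigr)u\,dx = 0. \]
Integration by parts yields
\[ -\frac{2i\sqrt{\lambda}}{R}\int e^{-i\sqrt{\lambda}\,\theta\cdot x}\rho\Bigl(\frac{|x|}{R}\Bigr)\theta\cdot\frac{x}{|x|}\,u\,dx = \int e^{-i\sqrt{\lambda}\,\theta\cdot x}\rho_1\Bigl(\frac{|x|}{R}\Bigr)f\,dx + \int e^{-i\sqrt{\lambda}\,\theta\cdot x}\Bigl(\Delta\rho_1\Bigl(\frac{|x|}{R}\Bigr)\Bigr)u\,dx. \]
Therefore
\[ \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int e^{-i\sqrt{\lambda}\,\theta\cdot x}\rho\Bigl(\frac{|x|}{R}\Bigr)\theta\cdot\frac{x}{|x|}\,u\,dx = \int e^{-i\sqrt{\lambda}\,\theta\cdot x} f(x)\,dx. \tag{3.19} \]
Let $P$ be a Ps.D.Op. with symbol $p(x, \xi)$ such that $p(x, \xi) = 0$ if $|\xi| < \sqrt{\lambda}/2$ or $|\xi| > 2\sqrt{\lambda}$ or $|x| < 1$, and $p(x, \xi) = \xi\cdot\frac{x}{|x|}$ if $|x| > 1$ and $3\sqrt{\lambda}/4 < |\xi| < 3\sqrt{\lambda}/2$. Then by (3.19)
\[ \lim_{R\to\infty} -2iU_0(\lambda)\frac{1}{R}\rho\Bigl(\frac{|x|}{R}\Bigr)Pu = U_0(\lambda)f. \]
By virtue of Lemma 2.1 and Theorem 3.5, one can put $\xi/|\xi| = x/|x|$ in the left-hand side. This proves Theorem 3.6. □

We shall use the above formulation to construct the generalized Fourier transformation for the 3-body Schrödinger operator.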
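The integration by parts in the proof of Theorem 3.6 can be written out as follows. This is a sketch of the intermediate step, with the abbreviations $e = e^{-i\sqrt{\lambda}\,\theta\cdot x}$ and $\rho_1 = \rho_1(|x|/R)$ introduced here for brevity; it uses only $(-\Delta - \lambda)e = 0$, $(-\Delta - \lambda)u = f$, $\rho_1' = -\rho$, and the fact that $\rho_1(|x|/R)$ is compactly supported.

```latex
\begin{align*}
0 &= \int \bigl[(-\Delta-\lambda)e\bigr]\,\rho_1 u\,dx
   = \int e\,(-\Delta-\lambda)(\rho_1 u)\,dx \\
  &= \int e\,\rho_1 f\,dx + 2\int u\,\nabla e\cdot\nabla\rho_1\,dx
     + \int e\,(\Delta\rho_1)\,u\,dx,\\
\nabla e &= -i\sqrt{\lambda}\,\theta\,e,\qquad
\nabla\rho_1\Bigl(\frac{|x|}{R}\Bigr) = -\frac{1}{R}\,\rho\Bigl(\frac{|x|}{R}\Bigr)\frac{x}{|x|},\\
2\int u\,\nabla e\cdot\nabla\rho_1\,dx
  &= \frac{2i\sqrt{\lambda}}{R}\int e\,\rho\Bigl(\frac{|x|}{R}\Bigr)\,\theta\cdot\frac{x}{|x|}\,u\,dx .
\end{align*}
```

Moving the last integral to the left-hand side gives the displayed identity preceding (3.19).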
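Lemma 3.7. Let $\lambda \in \sigma_{cont}(H) \setminus \mathcal T'$ and $\varphi \in C_0^\infty(\mathbf R)$, $\psi \in C_0^\infty((0,\infty))$. Then there exists $\epsilon > 0$ such that if $s > 1$,
\[ F\Bigl(\frac{|x^a|}{|x|} < \epsilon\Bigr)R_a(\lambda + i0)\varphi(T_a)\psi(H^a) \in \mathbf B(L^{2,s}; L^2). \]
Proof. We prove that there exists $\epsilon > 0$ such that
\[ \Bigl\|F\Bigl(\frac{|x^a|}{|x|} < \epsilon\Bigr)e^{-itH_a}\varphi(T_a)\psi(H^a)\langle x\rangle^{-s}\Bigr\| \le C_s(1 + t)^{-s} \tag{3.20} \]
for $t, s \ge 0$. The lemma follows from this by passing to the Laplace transform. The idea of the proof of (3.20) is to use the fact that $e^{-itH^a}\psi(H^a)$ and $e^{-itT_a}\varphi(T_a)$ propagate in the regions $\frac{|x^a|}{t} > \delta_1$ and $\frac{|x_a|}{t} < \delta_2$ respectively for suitable $\delta_1, \delta_2 > 0$. In fact by the minimal velocity estimate (see e.g. [4], Theorem 4.16.1 or [33], Theorem 3.3), there exists $\delta > 0$ such that
\[ \Bigl\|F\Bigl(\frac{|x^a|}{t} < \delta\Bigr)e^{-itH^a}\psi(H^a)\langle x^a\rangle^{-s}\Bigr\| \le C_s(1 + t)^{-s}, \qquad \forall t, s \ge 0. \tag{3.21} \]
We also see by integration by parts that if $\mathrm{supp}\,\varphi \subset (-\infty, \gamma)$, $\gamma > 0$,
\[ \Bigl\|F\Bigl(\frac{|x_a|}{t} > 2\sqrt{\gamma}\Bigr)e^{-itT_a}\varphi(T_a)\langle x_a\rangle^{-s}\Bigr\| \le C_s(1 + t)^{-s}, \qquad \forall t, s \ge 0. \tag{3.22} \]
For small $\epsilon > 0$, $|x^a| < \epsilon|x|$ implies $|x_a| > 2t\sqrt{\gamma}$ or $|x^a| < t\delta$. Therefore
\[ F\Bigl(\frac{|x^a|}{|x|} < \epsilon\Bigr) \le F\Bigl(\frac{|x_a|}{t} > 2\sqrt{\gamma}\Bigr) + F\Bigl(\frac{|x^a|}{t} < \delta\Bigr). \tag{3.23} \]
The lemma then follows from (3.21)–(3.23). □

Theorem 3.8. Let $\varphi \in C_0^\infty(\mathbf R \setminus \mathcal T')$.
(1) For $s, \epsilon > 0$, we have
\[ \|\langle x\rangle^{-s}e^{-itH}\varphi(H)\langle x\rangle^{-s}\| \le C(1 + |t|)^{-(s-\epsilon)} \]
for $t \in \mathbf R$.
(2) Let $P$ be a bounded Ps.D.Op. with symbol $p(x, \xi)$ such that
\[ \sup_{x,\xi} \frac{x}{|x|}\cdot\frac{\xi}{|\xi|} < 1 \quad\text{on}\quad \mathrm{supp}\,p(x, \xi). \]
Then if $\mathrm{supp}\,\varphi$ is sufficiently small,
\[ \|\langle x\rangle^{s}P\chi_0 e^{-itH}\varphi(H)\langle x\rangle^{-s'}\| \le C(1 + t)^{-m} \]
for $t \ge 0$, if $s' > s + m$, $s > -1/2$, $m > 0$.

For the proof, see [24, Theorem 3.10] and [20, Theorem 5.7].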
4. Analysis of $H_a$

In this and the next sections, we fix a pair $a = (i, j)$ and discuss the spectral representation for $H_a = H_0 + V_a(x^a)$. Let $\mathcal T_a$ be the set of thresholds for $H_a$.

4.1. Generalized Fourier transformation associated with $H_a$. Let $\lambda^{a,n}$, $n = 1, 2, \cdots$, be the eigenvalues of $H^a$ and $\varphi^{a,n}(x^a)$ the associated normalized eigenvectors. Let
\[ \langle f, \varphi^{a,n}\rangle = \int_{\mathbf R^3} f(x_a, x^a)\varphi^{a,n}(x^a)\,dx^a \tag{4.1} \]
and define for $\lambda > \lambda^{a,n}$ and $f \in \mathcal S$,
\[ U_{a,n}(\lambda)f(\theta_a) = (2\pi)^{-3/2}\int_{\mathbf R^3} e^{-i\sqrt{\lambda-\lambda^{a,n}}\,\theta_a\cdot x_a}\langle f, \varphi^{a,n}\rangle\,dx_a. \tag{4.2} \]
This is clearly well-defined and $U_{a,n}(\lambda)f \in L^2(S^2)$. For $\lambda > 0$ and $f \in \mathcal S$, the generalized Fourier transformation corresponding to the continuous spectrum of $H^a$ is formally defined by
\[ U_a^{cont}(\lambda)f = (2\pi)^{-3}\int_{\mathbf R^6} e^{-i\sqrt{\lambda}\,\theta\cdot x} f(x)\,dx - (2\pi)^{-3}\int_{\mathbf R^6} e^{-i\sqrt{\lambda}\,\theta\cdot x}\,V_a(x^a)R_a(\lambda + i0)f\,dx. \tag{4.3} \]
To give a definite meaning to this expression, we pass to the partial Fourier transformation with respect to $x_a$:
\[ \tilde f(\xi_a, x^a) = (2\pi)^{-3/2}\int_{\mathbf R^3} e^{-ix_a\cdot\xi_a} f(x_a, x^a)\,dx_a. \tag{4.4} \]
Let $R^a(z) = (H^a - z)^{-1}$. We define for $f \in \mathcal S$ and $\lambda > 0$,
\[ U_a^{cont}(\lambda)f = \hat f(\sqrt{\lambda}\,\theta) - (2\pi)^{-3/2}\int_{\mathbf R^3} e^{-i\sqrt{\lambda}\,\theta^a\cdot x^a}\,V_a(x^a)R^a(\lambda|\theta^a|^2 + i0)\tilde f(\sqrt{\lambda}\,\theta_a, \cdot)\,dx^a. \tag{4.5} \]
By the result of Jensen–Kato [25], we have the following expansion in $\mathbf B(L^{2,s}; L^{2,-s})$, $s > 5/2$,
\[ R^a(z) = \frac{B_{-2}}{z} + \frac{B_{-1}}{\sqrt z} + O(1), \quad\text{as}\quad z \to 0, \tag{4.6} \]
\[ B_{-2} = -P_0, \qquad B_{-1} = -iP_0V_aGV_aP_0 + i(\cdot, \psi)\psi, \tag{4.7} \]
where $P_0$ is the projection onto the eigenspace for the 0 eigenvalue, $G$ is an integral operator with kernel $|x^a - y^a|/(24\pi)$ and $\psi$ is the 0-resonance for $H^a$. The last term $O(1)$ in (4.6) is continuous with respect to $z$ near $z = 0$. This follows from Murata [29].
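In [18], we have shown that if $V_a(x^a) = O(|x^a|^{-\rho})$, $\rho > 3 + 1/2$, and $u(x^a)$ is an eigenfunction of $H^a$ with zero eigenvalue,
\[ \int_{\mathbf R^3} e^{-ix^a\cdot\xi^a}\,V_a(x^a)u(x^a)\,dx^a = -i\int_{\mathbf R^3} x^a\cdot\xi^a\,V_a(x^a)u(x^a)\,dx^a + O(|\xi^a|^2), \tag{4.8} \]
as $|\xi^a| \to 0$, and
\[ u(x^a) = O(|x^a|^{-2}) \quad\text{as}\quad |x^a| \to \infty. \tag{4.9} \]
It follows from (4.6)–(4.9) that
\[ U_a^{cont}(\lambda)f = O(|\theta^a|^{-1}) \quad\text{as}\quad |\theta^a| \to 0. \tag{4.10} \]
In particular, $U_a^{cont}(\lambda)f \in L^2(S^5)$.

Theorem 4.1. (1) Parseval’s formula holds: For $\lambda \notin \mathcal T_a$,
\[ \frac{1}{2\pi i}\bigl([R_a(\lambda + i0) - R_a(\lambda - i0)]f, f\bigr) = \frac{1}{2}(\lambda_+)^2\|U_a^{cont}(\lambda)f\|^2_{L^2(S^5)} + \sum_n \frac{1}{2}\sqrt{(\lambda - \lambda^{a,n})_+}\,\|U_{a,n}(\lambda)f\|^2_{L^2(S^2)}, \]
where $k_+ = \max(k, 0)$.
(2)
\[ U_a^{cont}(\lambda) \in \mathbf B(\mathcal B; L^2(S^5)), \qquad \lambda > 0, \]
\[ U_{a,n}(\lambda) \in \mathbf B(\mathcal B; L^2(S^2)), \qquad \lambda \in (\lambda^{a,n}, \infty) \setminus \mathcal T_a. \]
Proof. For $f \in \mathcal S(\mathbf R^6)$, we put
\[ (\mathcal F^a f)(\xi_a, \xi^a) = (2\pi)^{-3/2}\int_{\mathbf R^3} e^{-ix^a\cdot\xi^a}\tilde f(\xi_a, x^a)\,dx^a - (2\pi)^{-3/2}\int_{\mathbf R^3} e^{-ix^a\cdot\xi^a}\,V_a(x^a)R^a(|\xi^a|^2 + i0)\tilde f(\xi_a, \cdot)\,dx^a, \]
\[ (\mathcal F^{a,n} f)(\xi_a) = (2\pi)^{-3/2}\int_{\mathbf R^3} e^{-ix_a\cdot\xi_a}\langle f, \varphi^{a,n}\rangle\,dx_a. \]
Then we have for $z \notin \mathbf R$,
\[ (R_a(z)f, f) = \int_{\mathbf R^3} \bigl(R^a(z - |\xi_a|^2)\tilde f(\xi_a, \cdot), \tilde f(\xi_a, \cdot)\bigr)\,d\xi_a = \int_{\mathbf R^6} \frac{|(\mathcal F^a f)(\xi_a, \xi^a)|^2}{|\xi_a|^2 + |\xi^a|^2 - z}\,d\xi_a\,d\xi^a + \sum_n \int_{\mathbf R^3} \frac{|(\mathcal F^{a,n} f)(\xi_a)|^2}{|\xi_a|^2 + \lambda^{a,n} - z}\,d\xi_a. \]
Parseval’s formula follows from this by the well-known computation. The assertion (2) follows from (1) and Theorem 3.1. □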
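The “well-known computation” can be sketched for the continuous part as follows. This is the standard Stone-formula argument, written under the assumption that $\mathcal F^a f$ is continuous near the sphere $|\xi|^2 = \lambda$, with $\xi = (\xi_a, \xi^a) \in \mathbf R^6$, polar coordinates $\xi = k\theta$, and the identification $(\mathcal F^a f)(\sqrt{\lambda}\,\theta) = U_a^{cont}(\lambda)f(\theta)$ read off from (4.5).

```latex
\begin{align*}
\frac{1}{2\pi i}\Bigl(\frac{1}{|\xi|^2-\lambda-i0}-\frac{1}{|\xi|^2-\lambda+i0}\Bigr)
  &= \delta(|\xi|^2-\lambda),\\
\int_{\mathbf R^6}|(\mathcal F^a f)(\xi)|^2\,\delta(|\xi|^2-\lambda)\,d\xi
  &= \int_0^\infty\!\!\int_{S^5}|(\mathcal F^a f)(k\theta)|^2\,\delta(k^2-\lambda)\,k^5\,d\theta\,dk
   = \frac{\lambda^2}{2}\int_{S^5}|(\mathcal F^a f)(\sqrt{\lambda}\,\theta)|^2\,d\theta .
\end{align*}
```

The discrete terms are treated in the same way in $\mathbf R^3$, where $k^2\,dk$ together with $\delta(k^2 + \lambda^{a,n} - \lambda)$ produces the weight $\frac{1}{2}\sqrt{\lambda - \lambda^{a,n}}$.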
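4.2. Asymptotics of the resolvent and the generalized Fourier transform. We investigate relations between generalized Fourier transformations $U_{a,n}(\lambda)$, $U_a^{cont}(\lambda)$ and the spatial asymptotics of the resolvent $R_a(\lambda + i0)$.

Theorem 4.2. For $f \in \mathcal S$ and $\lambda > \lambda^{a,n}$,
\[ U_{a,n}(\lambda)f = \sqrt{\frac{2}{\pi}}\lim_{r_a\to\infty} r_a e^{-i\sqrt{\lambda-\lambda^{a,n}}\,r_a}\langle R_a(\lambda + i0)f, \varphi^{a,n}\rangle(r_a\theta_a). \]
Proof. Since $\langle R_a(\lambda + i0)f, \varphi^{a,n}\rangle = (-\Delta_{x_a} - (\lambda - \lambda^{a,n} + i0))^{-1}\langle f, \varphi^{a,n}\rangle$, this theorem follows from Lemma 2.3 and (4.2). □

The relationships between $U_a^{cont}(\lambda)f$ and $R_a(\lambda + i0)f$ are summarized in the following three theorems.

Theorem 4.3. If $f \in \mathcal S$ and $\lambda > 0$, we have the following asymptotic expansion in $L^2_{loc}(S^5 \setminus \{\theta^a = 0\})$,
\[ (R_a(\lambda + i0)f)(r\,\cdot) = C(\lambda)\,r^{-5/2}e^{i\sqrt{\lambda}r}\,U_a^{cont}(\lambda)f + O(r^{-7/2}) \]
as $r \to \infty$, where $C(\lambda) = \sqrt{\frac{\pi}{2}}\,e^{-3\pi i/4}\lambda^{3/4}$. We also have $U_a^{cont}(\lambda)f \in C^\infty(S^5 \setminus \{\theta^a = 0\})$.

We take $M_a(x) \in C^\infty(\mathbf R^6 \setminus \{0\})$ such that $M_a(x) > 0$, $M_a(x)$ is homogeneous of degree 1 and satisfies for small $\epsilon > 0$,
\[ M_a(x) = \begin{cases} |x_a| & \text{if } x \in N_\epsilon^a, \\ |x| & \text{if } x \notin N_{2\epsilon}^a. \end{cases} \tag{4.11} \]
We also take $\rho(t) \in C_0^\infty((0,\infty))$ such that $\rho(t) \ge 0$ and $\int_0^\infty \rho(t)\,dt = 1$. Let $\chi_0(x), \chi_a(x) \in Hom(\mathbf R^6)$ be such that
\[ \chi_a(x) = \begin{cases} 1 & \text{if } x \in N_{\epsilon/2}^a, \\ 0 & \text{if } x \notin N_{3\epsilon/4}^a, \end{cases} \tag{4.12} \]
\[ \chi_0(x) + \chi_a(x) = 1. \tag{4.13} \]
Theorem 4.4. For $f \in \mathcal S$ and $\lambda > 0$, the following strong limit exists in $L^2(S^5)$:
\[ \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int_{\mathbf R^6} e^{-i\sqrt{\lambda}\,\theta\cdot x}\rho\Bigl(\frac{M_a}{R}\Bigr)\chi_0(x)R_a(\lambda + i0)f\,dx = (2\pi)^3\frac{\chi_0(\theta)}{M_a(\theta)}\,U_a^{cont}(\lambda)f. \]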
We take a smooth function E(t) > 0 such that E(t) = t if λ− < t < λ−inf σ (Ha ), and E(t) = constant if t < λ − 2 or t > λ − inf σ (Ha ) + .

Theorem 4.5. For $f \in \mathcal S$ and $\lambda > 0$, the following strong limit exists in $L^2(S^5)$:
\[ \lim_{R\to\infty} -\frac{2i}{R}\int_{\mathbf R^6} e^{-i\sqrt{\lambda}\,\theta\cdot x}\rho\Bigl(\frac{|x_a|}{R}\Bigr)E(T_a)\chi_a(x)R_a(\lambda + i0)f\,dx = (2\pi)^3\chi_a(\theta)\,U_a^{cont}(\lambda)f. \]
Proof of Theorem 4.3. We first prove the theorem under the additional assumption that $V_a$ is rapidly decreasing. We need three cut-off functions. Let $\chi(x) \in \widetilde{Hom}(\mathbf R^6)$ be such that there exists $\epsilon > 0$ for which $\chi(x) = 0$ if $|x| < 1/2$ or $x \in N_\epsilon^a$, and $\chi(x) = 1$ if $|x| > 1$ and $x \notin N_{2\epsilon}^a$. Take $\chi_1, \chi_2 \in \widetilde{Hom}(\mathbf R^6)$ such that $\chi_1 = 1$ on $\mathrm{supp}\,\chi$, $\chi_2 = 1$ on $\mathrm{supp}\,\chi_1$ and $\chi_1(x) = \chi_2(x) = 0$ if $|x| < 1/2$ or $x \in N_\epsilon^a$. Let $\varphi_0(t) \in C^\infty(\mathbf R)$ be such that $\varphi_0(t) = 1$ if $|t - \lambda| < \epsilon_1$, $\varphi_0(t) = 0$ if $|t - \lambda| > 2\epsilon_1$, where $\epsilon_1 > 0$ is a sufficiently small constant. We let
\[ Q_a = \varphi_0(H_0)\chi_1(D)\chi_2(x), \tag{4.14} \]
and define
\[ G_a = H_0Q_a - Q_aH_a = [H_0, Q_a] - Q_aV_a. \tag{4.15} \]
Since $[H_0, Q_a] = \varphi_0(H_0)\chi_1(D)[H_0, \chi_2(x)]$, noting that the directions of $x$ and $\xi$ are different if $x \in \mathrm{supp}\,\nabla\chi_2$ and $\xi \in \mathrm{supp}\,\chi_1$, we have by using Theorem 3.3,
\[ [H_0, Q_a]R_a(\lambda + i0) \in \mathbf B(L^{2,s+t}; L^{2,s}), \qquad \forall s > -1/2,\ t > 1. \tag{4.16} \]
Therefore if $V_a(x^a)$ is rapidly decreasing, $G_aR_a(\lambda + i0)$ maps $\mathcal S$ into $\mathcal S$ and
\[ Q_aR_a(\lambda + i0) = R_0(\lambda + i0)(Q_a + G_aR_a(\lambda + i0)). \tag{4.17} \]
If $f \in \mathcal S$, $(1 - \varphi_0(H_a))R_a(\lambda + i0)f$ is rapidly decreasing. Therefore we have only to consider $\varphi_0(H_a)R_a(\lambda + i0)f$. We proceed as follows:
\[ \chi(x)\varphi_0(H_a)R_a(\lambda + i0)f = \chi(x)\varphi_0(H_0)R_a(\lambda + i0)f + O(r^{-5})R_a(\lambda + i0)f. \tag{4.18} \]
We insert $\chi_1(D) + (1 - \chi_1(D))$ between $\varphi_0(H_0)$ and $R_a(\lambda + i0)$. Then
\[ \chi(x)\varphi_0(H_0)(1 - \chi_1(D))R_a(\lambda + i0)f = O(r^{-\infty}). \tag{4.19} \]
Assuming (4.19) for the moment, we continue the proof. We then have
\[ \chi(x)R_a(\lambda + i0)f = \chi(x)Q_aR_a(\lambda + i0)f + O(r^{-7/2}) = \chi(x)R_0(\lambda + i0)(Q_a + G_aR_a(\lambda + i0))f + O(r^{-7/2}). \]
Therefore by Lemma 2.3,
\[ \chi(\theta)(R_a(\lambda + i0)f)(r\theta) = \chi(\theta)\,r^{-5/2}e^{i\sqrt{\lambda}r}\varphi(\theta) + O(r^{-7/2}), \tag{4.20} \]
\[ \varphi(\theta) = \sqrt{\frac{\pi}{2}}\,e^{-3\pi i/4}\lambda^{3/4}\bigl(U_0(\lambda)(Q_a + G_aR_a(\lambda + i0))f\bigr)(\theta). \tag{4.21} \]
It remains to prove (4.19). Let $F_-(t) \in C^\infty(\mathbf R)$ be such that $F_-(t) = 1$ if $t < \sqrt{\lambda} - 2\delta$ and $F_-(t) = 0$ if $t > \sqrt{\lambda} - \delta$, $\delta > 0$ being a sufficiently small constant. Then by Theorem 3.2,
\[ F_-(B)R_a(\lambda + i0) \in \mathbf B(L^{2,s+t}, L^{2,s}), \qquad \forall s > -1/2,\ t > 1. \tag{4.22} \]
On the other hand, letting $F_+(t) = 1 - F_-(t)$, one can show that
\[ \chi(x)\varphi_0(H_0)(1 - \chi_1(D))F_+(B) \in \mathbf B(L^{2,-s}; L^{2,s}), \qquad \forall s > 0. \tag{4.23} \]
This can be seen intuitively, if we regard $F_+(B)$ as a Ps.D.Op. with symbol $F_+\bigl(\frac{x}{\langle x\rangle}\cdot\xi\bigr)$. In fact, on the supports of $\chi(x)$ and $\varphi_0(|\xi|^2)(1 - \chi_1(\xi))$, $|\xi|$ is close to $\sqrt{\lambda}$ and the directions of $x$ and $\xi$ are different. Therefore $\frac{x}{\langle x\rangle}\cdot\xi < \sqrt{\lambda} - 3\delta$ for sufficiently small $\delta > 0$. On the other hand, on the support of $F_+\bigl(\frac{x}{\langle x\rangle}\cdot\xi\bigr)$, $\frac{x}{\langle x\rangle}\cdot\xi > \sqrt{\lambda} - 2\delta$. This implies (4.23). This intuitive argument can be made rigorous by inserting suitable Ps.D.Op.’s and applying the technique discussed in [9], p. 80. We omit the details.

If $V_a$ is not rapidly decreasing, we modify the above proof in the following way. In the definition of $Q_a$ of (4.14), we replace $\chi_2(x)$ by $\chi_2(x)A$, where $A$ is a Ps.D.Op. with symbol $a(x, \xi)$ such that
\[ |\partial_x^\alpha\partial_\xi^\beta(a(x, \xi) - 1)| \le C_{\alpha\beta}\langle x\rangle^{-|\alpha|-\epsilon}\langle\xi\rangle^{-|\beta|}, \tag{4.24} \]
and in the region $|x^a|/|x| > \epsilon$, $\hat x\cdot\hat\xi > -1/2$ ($\hat x = x/|x|$),
\[ 2i\xi\cdot\nabla_x a(x, \xi) + \Delta_x a(x, \xi) - V_a(x^a)a(x, \xi) = O(|x|^{-\infty}). \tag{4.25} \]
Such a construction, based on the transport equation, is well-known (see e.g. [22] or [23]). On this region $H_0G_a - G_aH$ is rapidly decreasing in $x$. The portion on the region $\hat x\cdot\hat\xi < -1/2$ is handled as follows. Let $P_-$ be a bounded Ps.D.Op. whose symbol vanishes if $\hat x\cdot\hat\xi < -1/2$. Inserting $F_\pm(B)$ we have
\[ \varphi_0(H_0)P_-R_a(\lambda + i0) = \varphi_0(H_0)P_-F_+(B)R_a(\lambda + i0) + \varphi_0(H_0)P_-F_-(B)R_a(\lambda + i0). \]
The operator $P_-F_+(B)$ maps $L^{2,-s}$ to $L^{2,s}$ ($s > 0$) and $F_-(B)R_a(\lambda + i0)$ maps $\mathcal S$ to $\mathcal S$ by virtue of Theorem 3.2. We can thus argue as above to see that (4.20) also holds in this case. That $\varphi(\theta) = C(\lambda)U_a^{cont}(\lambda)f$ will be proved in Lemma 4.8. □

We prepare two lemmas to prove Theorems 4.4 and 4.5.

Lemma 4.6. For $f \in \mathcal S$, the following strong limit exists in $L^2(S^5)$:
\[ \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int e^{-i\sqrt{\lambda}\,\theta\cdot x}\rho\Bigl(\frac{M_a}{R}\Bigr)(\theta\cdot\nabla M_a)R_a(\lambda + i0)f\,dx = (2\pi)^3U_a^{cont}(\lambda)f. \tag{4.26} \]
Asymptotic Properties of Solutions to 3-Particle Schrödinger Equations
Proof. Let $\rho_1(t) = \int_t^\infty \rho(s)\,ds$ and $u = R_a(\lambda + i0)f$. We have by integration by parts
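$$\begin{aligned} -\frac{2i\sqrt{\lambda}}{R}\int_{\mathbf{R}^6} e^{-i\sqrt{\lambda}\theta\cdot x}\rho\Big(\frac{M_a}{R}\Big)(\theta\cdot\nabla M_a)u\,dx &= \int_{\mathbf{R}^6} e^{-i\sqrt{\lambda}\theta\cdot x}\rho_1\Big(\frac{M_a}{R}\Big)(-\Delta + V_a - \lambda)u\,dx \\ &\quad - \int_{\mathbf{R}^6} e^{-i\sqrt{\lambda}\theta\cdot x}\rho_1\Big(\frac{M_a}{R}\Big)V_au\,dx + \int_{\mathbf{R}^6} e^{-i\sqrt{\lambda}\theta\cdot x}\Big(\Delta\rho_1\Big(\frac{M_a}{R}\Big)\Big)u\,dx. \end{aligned} \tag{4.27}$$
The first term of the right-hand side of (4.27) converges to $(2\pi)^3\hat f(\sqrt{\lambda}\theta)$ and the third term tends to 0 as $R \to \infty$. It remains to show
$$\lim_{R\to\infty}\int_{\mathbf{R}^6} e^{-i\sqrt{\lambda}\theta\cdot x}\rho_1\Big(\frac{M_a}{R}\Big)V_au\,dx = (2\pi)^{3/2}\int_{\mathbf{R}^3} e^{-i\sqrt{\lambda}\theta^a\cdot x^a}V_aR^a(\lambda|\theta^a|^2 + i0)\tilde f(\sqrt{\lambda}\theta_a, \cdot)\,dx^a. \tag{4.28}$$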
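We first show (4.28) for $|\theta^a| > \delta > 0$, where $\delta$ is an arbitrarily fixed small constant. We put
$$w_R(\xi_a, x^a) = (2\pi)^{-3}\int e^{-ix_a\cdot\xi_a}\rho_1\Big(\frac{M_a(x_a, x^a)}{R}\Big)dx_a, \qquad v(\xi_a, x^a) = (2\pi)^{-3}\int e^{-ix_a\cdot\xi_a}e^{-i\sqrt{\lambda}\theta^a\cdot x^a}V_au\,dx_a.$$
Then we have
$$\begin{aligned} &\int e^{-i\sqrt{\lambda}\theta\cdot x}\rho_1\Big(\frac{M_a}{R}\Big)V_au\,dx - (2\pi)^{3/2}\int e^{-i\sqrt{\lambda}\theta^a\cdot x^a}\rho_1\Big(\frac{M_a(0, x^a)}{R}\Big)V_aR^a(\lambda|\theta^a|^2 + i0)\tilde f(\sqrt{\lambda}\theta_a, \cdot)\,dx^a \\ &\quad = (2\pi)^3\iint w_R(\xi_a, x^a)\big(v(\sqrt{\lambda}\theta_a - \xi_a, x^a) - v(\sqrt{\lambda}\theta_a, x^a)\big)\,d\xi_a\,dx^a. \end{aligned} \tag{4.29}$$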
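We show that the right-hand side of (4.29) converges to 0 as $R \to \infty$ if $|\theta^a| > \delta > 0$. Integration by parts yields
$$|w_R(\xi_a, x^a)| \le C_N|\xi_a|^{-N}R^{3-N}, \quad \forall N > 0.$$
Therefore for any $\epsilon > 0$,
$$\iint_{|\xi_a| > \epsilon}\big|w_R(\xi_a, x^a)\big(v(\sqrt{\lambda}\theta_a - \xi_a, x^a) - v(\sqrt{\lambda}\theta_a, x^a)\big)\big|\,d\xi_a\,dx^a \le C_nR^{-n}\Big(1 + \frac{1}{|\theta^a|}\Big), \quad \forall n > 0.$$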
H. Isozaki
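This tends to 0 as $R \to \infty$. Let
$$\rho_2(\xi_a, x^a) = (2\pi)^{-3}\int e^{-ix_a\cdot\xi_a}\rho_1(M_a(x_a, x^a))\,dx_a.$$
Since $\rho_2 \in \mathcal{S}$ and $w_R(\xi_a, x^a) = R^3\rho_2(R\xi_a, x^a/R)$, we have
$$\int|w_R(\xi_a, x^a)|\,d\xi_a = \int\Big|\rho_2\Big(\xi_a, \frac{x^a}{R}\Big)\Big|\,d\xi_a \le C$$
with a constant $C > 0$ independent of $R$, $x^a$. This implies
$$\Big|\iint_{|\xi_a| < \epsilon} w_R(\xi_a, x^a)\big(v(\sqrt{\lambda}\theta_a - \xi_a, x^a) - v(\sqrt{\lambda}\theta_a, x^a)\big)\,d\xi_a\,dx^a\Big| \le C\sup_{|\xi_a| < \epsilon}\int|v(\sqrt{\lambda}\theta_a - \xi_a, x^a) - v(\sqrt{\lambda}\theta_a, x^a)|\,dx^a.$$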
By the continuity with respect to $\lambda > 0$ of $R^a(\lambda + i0)$, the right-hand side tends to 0 as $\epsilon \to 0$. Therefore to complete the proof of (4.28), we have only to show the strong convergence of the left-hand side of (4.28) in $L^2(S^5 \cap \{|\theta^a| < \delta\})$. Since $V_a(x^a) = O(|x|^{-\rho})$ for $\rho > 5$ in the region $|x^a|/|x| > \epsilon$, we have only to consider
$$\int e^{-i\sqrt{\lambda}\theta\cdot x}\rho_1\Big(\frac{M_a}{R}\Big)F\Big(\frac{|x^a|}{|x|} < \epsilon\Big)V_au\,dx.$$
However $M_a(x) = |x_a|$ if $|x^a|/|x| < \epsilon$. Therefore we have only to show the existence of the strong limit
$$\lim_{R\to\infty}\int e^{-i\sqrt{\lambda}\theta\cdot x}\rho_1\Big(\frac{|x_a|}{R}\Big)V_aR_a(\lambda + i0)f\,dx$$
in $L^2(S^5 \cap \{|\theta^a| < \delta\})$. The proof of this fact is the same as (3.6) of [19], whose proof is rather long and explained in pp. 9–12 of [19]. We omit the details, since only a small change of notation is necessary. Here let us remark that to prove (3.12) of [19], we can use Theorem 3.6 of the present paper. □

Lemma 4.7. For $f \in \mathcal{S}$, the following strong limits exist in $L^2(S^5)$:
$$I_0 = \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int_{\mathbf{R}^6} e^{-i\sqrt{\lambda}\theta\cdot x}\rho\Big(\frac{M_a}{R}\Big)(\theta\cdot\nabla M_a)\chi_0(x)R_a(\lambda + i0)f\,dx, \tag{4.30}$$
$$I_a = \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int_{\mathbf{R}^6} e^{-i\sqrt{\lambda}\theta\cdot x}\rho\Big(\frac{M_a}{R}\Big)(\theta\cdot\nabla M_a)\chi_a(x)R_a(\lambda + i0)f\,dx. \tag{4.31}$$
Proof. By Lemma 4.6, we have only to show the existence of $I_0$. To show this, by virtue of Theorem 4.3, we have only to show the existence of the limit
$$\lim_{R\to\infty}\frac{1}{R}\int e^{-i\sqrt{\lambda}\theta\cdot x}\rho\Big(\frac{M_a}{R}\Big)(\theta\cdot\nabla M_a)\chi_0(x)r^{-5/2}e^{i\sqrt{\lambda}r}\varphi(\hat x)\,dx, \tag{4.32}$$
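where $\varphi(\theta)$ is defined by (4.20). By Lemma 2.4, we have
$$\begin{aligned} \int_{S^5} e^{-i\sqrt{\lambda}\theta\cdot x}\rho\Big(\frac{M_a}{R}\Big)(\theta\cdot\nabla M_a)\chi_0(\hat x)\varphi(\hat x)\,d\hat x &= C(\lambda)r^{-5/2}e^{-i\sqrt{\lambda}r}\rho\Big(\frac{M_a(r\theta)}{R}\Big)(\theta\cdot\nabla M_a(r\theta))\chi_0(\theta)\varphi(\theta) \\ &\quad + \overline{C(\lambda)}r^{-5/2}e^{i\sqrt{\lambda}r}\rho\Big(\frac{M_a(-r\theta)}{R}\Big)(-\theta\cdot\nabla M_a(-r\theta))\chi_0(-\theta)\varphi(-\theta) + O(r^{-7/2}). \end{aligned}$$
It is easy to see that
$$\lim_{R\to\infty}\frac{1}{R}\int_0^\infty e^{2i\sqrt{\lambda}r}\rho\Big(\frac{M_a(-r\theta)}{R}\Big)(\theta\cdot\nabla M_a(-r\theta))\,dr = 0.$$
Therefore we have only to note that
$$\frac{1}{R}\int_0^\infty\rho\Big(\frac{M_a(r\theta)}{R}\Big)\theta\cdot\nabla M_a(r\theta)\,dr = -\int_0^\infty\frac{\partial}{\partial r}\rho_1\Big(\frac{M_a(r\theta)}{R}\Big)dr = 1.$$
□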
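The last identity uses only that $\rho$ integrates to 1 and that $M_a$ is homogeneous of degree one, so that $M_a(r\theta) = rM_a(\theta)$ and $\theta\cdot\nabla M_a(r\theta) = M_a(\theta)$. The following numerical sanity check is a sketch under assumptions that are not taken from the paper: `bump` is a hypothetical choice of $\rho$ supported in $(1, 2)$, and `m` is a hypothetical positive value of $M_a(\theta)$. It only illustrates that the radial average does not depend on $R$ or on $m$.

```python
import math

def bump(t):
    """Unnormalized smooth bump supported in (1, 2); a hypothetical choice of rho."""
    if t <= 1.0 or t >= 2.0:
        return 0.0
    return math.exp(-1.0 / ((t - 1.0) * (2.0 - t)))

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

MASS = midpoint(bump, 1.0, 2.0)

def rho(t):
    """Bump normalized so that its integral over (0, infinity) is 1."""
    return bump(t) / MASS

def radial_average(R, m):
    """(1/R) * integral_0^inf rho(m r / R) * m dr, assuming M_a(r theta) = m r."""
    # rho(m r / R) vanishes unless R/m < r < 2R/m, so integrate there only.
    return midpoint(lambda r: rho(m * r / R) * m, R / m, 2.0 * R / m) / R

for R in (1.0, 10.0, 250.0):
    for m in (0.3, 1.0, 2.5):
        print(R, m, round(radial_average(R, m), 6))
```

Every printed value should be close to 1, which is the normalization used at the end of the proof.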
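In particular we have proven
$$I_0 = 2e^{3\pi i/4}(2\pi)^{5/2}\lambda^{-3/4}\chi_0(\theta)\varphi(\theta). \tag{4.33}$$

Lemma 4.8. Let $\varphi(\theta)$ be as in (4.20). Then
$$\varphi(\theta) = \sqrt{\frac{\pi}{2}}\,e^{-3\pi i/4}\lambda^{3/4}U_a^{cont}(\lambda)f. \tag{4.34}$$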
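Proof. Take $\psi(\xi) \in C_0^\infty(\mathbf{R}^6)$ satisfying $\psi(\xi) = 1$ for $||\xi| - \sqrt{\lambda}| < \epsilon_1$, and $\psi(\xi) = 0$ for $||\xi| - \sqrt{\lambda}| > 2\epsilon_1$, and let $A_R$ be a Ps.D.Op. with symbol
$$\rho\Big(\frac{M_a}{R}\Big)\Big(\frac{\xi}{|\xi|}\cdot\nabla M_a\Big)\chi_0\Big(\frac{\xi}{|\xi|}\Big)\psi(\xi).$$
Then by multiplying $\chi_0(\theta)$ to (4.26), we have
$$(2\pi)^3\chi_0(\theta)U_a^{cont}(\lambda)f = \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int e^{-i\sqrt{\lambda}\theta\cdot x}A_RR_a(\lambda + i0)f\,dx.$$
Let $F_+(t) \in C^\infty(\mathbf{R})$ be such that $F_+(t) = 1$ if $t > \sqrt{\lambda} - \epsilon_2$, $F_+(t) = 0$ if $t < \sqrt{\lambda} - 2\epsilon_2$. Then by virtue of Theorem 3.2, one can insert $F_+(B)$ between $A_R$ and $R_a(\lambda + i0)f$. This means that one can localize the direction of $x$ very close to that of $\xi$. Let us take $\tilde\chi_0 \in \mathrm{Hom}(\mathbf{R}^6)$ such that $\tilde\chi_0(x) = 0$ if $x \in N^a_{\epsilon/2}$ and $\tilde\chi_0 = 1$ on supp $\chi_0$. By the above arguments we have
$$\begin{aligned} (2\pi)^3\chi_0(\theta)U_a^{cont}(\lambda)f &= \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int e^{-i\sqrt{\lambda}\theta\cdot x}A_R\tilde\chi_0(x)R_a(\lambda + i0)f\,dx \\ &= \lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\chi_0(\theta)\int e^{-i\sqrt{\lambda}\theta\cdot x}\rho\Big(\frac{M_a}{R}\Big)(\theta\cdot\nabla M_a)\tilde\chi_0(x)R_a(\lambda + i0)f\,dx. \end{aligned}$$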
The last term can be computed in the same way as in Lemma 4.7 and (4.33). Hence
$$(2\pi)^3\chi_0(\theta)U_a^{cont}(\lambda)f = 2e^{3\pi i/4}(2\pi)^{5/2}\lambda^{-3/4}\chi_0(\theta)\tilde\chi_0(\theta)\varphi(\theta),$$
which proves the lemma. □

Proof of Theorems 4.4 and 4.5. Theorem 4.4 is proved in the same way as $I_0$ in Lemma 4.7. Replace $\theta\cdot\nabla M_a$ by 1 and apply the same arguments as above. Let us prove Theorem 4.5. We first note that by Lemmas 4.6, 4.7, 4.8 and (4.32),
$$\lim_{R\to\infty} -\frac{2i\sqrt{\lambda}}{R}\int e^{-i\sqrt{\lambda}\theta\cdot x}\rho\Big(\frac{M_a}{R}\Big)(\theta\cdot\nabla M_a)\chi_a(x)R_a(\lambda + i0)f\,dx = (2\pi)^3\chi_a(\theta)U_a^{cont}(\lambda)f. \tag{4.35}$$
We rewrite the left-hand side of (4.35). Let $P$ be the Ps.D.Op. with symbol $\frac{x_a}{|x_a|}\cdot\xi_a$. Then
$$\lim_{R\to\infty} -\frac{2i}{R}\int e^{-i\sqrt{\lambda}\theta\cdot x}\rho\Big(\frac{|x_a|}{R}\Big)P\chi_a(x)R_a(\lambda + i0)f\,dx = (2\pi)^3\chi_a(\theta)U_a^{cont}(\lambda)f, \tag{4.36}$$
since $M_a(x) = |x_a|$ on supp $\chi_a$. Let $\psi_1(t) \in C_0^\infty(\mathbf{R})$ be such that $\psi_1(t) = 1$ if $|t - \sqrt{\lambda}| < \epsilon_1$, $\psi_1(t) = 0$ if $|t - \sqrt{\lambda}| > 2\epsilon_1$. Then one can insert $\psi_1(H_a)$ in front of $R_a(\lambda + i0)f$, since $(1 - \psi_1(H_a))R_a(\lambda + i0)f$ is rapidly decreasing. Let $\psi_2(t) \in C^\infty(\mathbf{R})$ be such that $\psi_2(t) = 1$ if $t < \epsilon_2$, $\psi_2(t) = 0$ if $t > 2\epsilon_2$. Then in view of Lemma 3.7, one can insert $\psi_2(H^a)$ in front of $R_a(\lambda + i0)f$. Therefore in the left-hand side of (4.36), $T_a$ is localized in the interval $(\lambda - \epsilon_3, \lambda - \inf\sigma(H^a) + \epsilon_3)$, $\epsilon_3$ being a sufficiently small constant. One can now use Theorem 3.5 and Lemma 2.1 to replace $P$ by $\sqrt{E(T_a)}$, which proves Theorem 4.5. □
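As a quick arithmetic check of the constant in Lemma 4.8 (a sketch, not part of the original argument): the proof equates $(2\pi)^3U_a^{cont}(\lambda)f$ with $2e^{3\pi i/4}(2\pi)^{5/2}\lambda^{-3/4}\varphi$ on the support of $\chi_0$, so the coefficient of $U_a^{cont}(\lambda)f$ in $\varphi$ should equal $\sqrt{\pi/2}\,e^{-3\pi i/4}\lambda^{3/4}$. The function names below are illustrative only.

```python
import cmath
import math

def phi_coefficient(lam):
    """Coefficient c with phi = c * U_a^cont(lam) f, obtained by dividing
    (2 pi)^3 by the factor 2 e^{3 pi i/4} (2 pi)^{5/2} lam^{-3/4} of (4.33)."""
    factor = 2.0 * cmath.exp(3j * math.pi / 4) * (2 * math.pi) ** 2.5 * lam ** (-0.75)
    return (2 * math.pi) ** 3 / factor

def closed_form(lam):
    """sqrt(pi/2) * e^{-3 pi i/4} * lam^{3/4}, the constant stated in (4.34)."""
    return math.sqrt(math.pi / 2) * cmath.exp(-3j * math.pi / 4) * lam ** 0.75

for lam in (0.5, 1.0, 7.0):
    # The difference should vanish up to rounding error.
    print(lam, abs(phi_coefficient(lam) - closed_form(lam)))
```

The agreement reflects the elementary identity $\sqrt{2\pi}/2 = \sqrt{\pi/2}$.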
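Theorem 4.9. For $\lambda > 0$ and $f \in \mathcal{S}$, let $u = R_a(\lambda + i0)f$. Then
$$\lim_{R\to\infty}\frac{1}{\pi R}\Big(\rho\Big(\frac{|x_a|}{R}\Big)\sqrt{E(T_a)}\chi_a(x)u, u\Big) = \frac{\lambda^2}{2}(\chi_a(\theta)U_a^{cont}(\lambda)f, U_a^{cont}(\lambda)f) + \sum_n\frac{(\lambda - \lambda^{a,n})^{1/2}}{2}(U_{a,n}(\lambda)f, U_{a,n}(\lambda)f).$$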
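Proof. Letting $R \to \infty$ in the equality
$$\Big(\rho_1\Big(\frac{M_a}{R}\Big)\Delta u, u\Big) - \Big(\rho_1\Big(\frac{M_a}{R}\Big)u, \Delta u\Big) = -\Big(\Big(\Delta\rho_1\Big(\frac{M_a}{R}\Big)\Big)u, u\Big) - 2\Big(\nabla\rho_1\Big(\frac{M_a}{R}\Big)\cdot\nabla u, u\Big),$$
we have
$$\frac{1}{2\pi i}([R_a(\lambda + i0) - R_a(\lambda - i0)]f, f) = \lim_{R\to\infty}\frac{1}{\pi iR}\Big(\rho\Big(\frac{M_a}{R}\Big)\nabla M_a\cdot\nabla u, u\Big). \tag{4.37}$$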
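We insert $\chi_0 + \chi_a = 1$ in the right-hand side. Using Theorem 4.3 and $\hat x\cdot\nabla M_a = M_a(\hat x)$, $\hat x = x/|x|$, we have
$$\begin{aligned} \lim_{R\to\infty}\frac{1}{\pi iR}\Big(\rho\Big(\frac{M_a}{R}\Big)\nabla M_a\cdot\nabla u, \chi_0u\Big) &= \lim_{R\to\infty}\frac{\sqrt{\lambda}|C(\lambda)|^2}{\pi R}\int_{S^5}\int_0^\infty\rho\Big(\frac{M_a}{R}\Big)M_a(\hat x)\chi_0(\hat x)|U_a^{cont}(\lambda)f(\hat x)|^2\,dr\,d\hat x \\ &= \frac{\lambda^2}{2}(\chi_0(\theta)U_a^{cont}(\lambda)f, U_a^{cont}(\lambda)f). \end{aligned} \tag{4.38}$$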
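Applying the same arguments as in the proof of Theorem 4.5, we have
$$\lim_{R\to\infty}\frac{1}{\pi iR}\Big(\rho\Big(\frac{M_a}{R}\Big)\nabla M_a\cdot\nabla u, \chi_au\Big) = \lim_{R\to\infty}\frac{1}{\pi R}\Big(\rho\Big(\frac{|x_a|}{R}\Big)\sqrt{E(T_a)}u, \chi_au\Big).$$
By Theorem 4.1, (4.37) and (4.38), we obtain the theorem. □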
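5. Asymptotic Expansion of the Resolvent of $H_a$

In this section we derive a spatial asymptotics of the resolvent $R_a(\lambda + i0)$, where $R_a(z) = (H_a - z)^{-1}$. The main result is Theorem 5.8. For $f \in \mathcal{S}$, we let
$$u = R_a(\lambda + i0)f, \tag{5.1}$$
$$u_0 = C(\lambda)r^{-5/2}e^{i\sqrt{\lambda}r}U_a^{cont}(\lambda)f, \qquad C(\lambda) = \sqrt{\frac{\pi}{2}}\,e^{-3\pi i/4}\lambda^{3/4}, \tag{5.2}$$
$$u_n = \sqrt{\frac{\pi}{2}}\,r_a^{-1}e^{i\sqrt{\lambda - \lambda^{a,n}}\,r_a}U_{a,n}(\lambda)f(\omega_a)\otimes\varphi^{a,n}(x^a), \tag{5.3}$$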
where $r_a = |x_a|$, $\omega_a = x_a/r_a$ and $\varphi^{a,n}$ is a normalized eigenfunction of $H^a$ with eigenvalue $\lambda^{a,n}$. Let $\chi_0$, $\chi_a$ and $M_a$ be as in (4.11) ∼ (4.13). Let $\rho(t) \in C_0^\infty((0, \infty))$ be such that $\rho(t) \ge 0$ and $\int_0^\infty\rho(t)\,dt = 1$. The following theorem is an easy consequence of Theorem 4.3.
Theorem 5.1.
$$\lim_{R\to\infty}\frac{1}{R}\Big(\rho\Big(\frac{M_a}{R}\Big)\chi_0(\hat x)(u - u_0), u - u_0\Big) = 0.$$
To derive the asymptotic behavior of $u$ along the singular direction $\{x^a = 0\}$, the essential step is the following theorem.

Theorem 5.2. Let $w = u - u_0 - \sum_n u_n$. Then we have
$$\lim_{R\to\infty}\frac{1}{R}\Big(\rho\Big(\frac{|x_a|}{R}\Big)\chi_a(\hat x)\sqrt{E(T_a)}w, w\Big) = 0.$$
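The proof of Theorem 5.2 is divided into several steps. We put
$$K_R = \frac{1}{R}\rho\Big(\frac{|x_a|}{R}\Big)\chi_a(\hat x)\sqrt{E(T_a)}, \tag{5.4}$$
$$\psi(\theta) = U_a^{cont}(\lambda)f(\theta), \qquad \psi_{a,n}(\omega_a) = U_{a,n}(\lambda)f(\omega_a). \tag{5.5}$$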
A simple computation shows that
$$\begin{aligned} (K_Rw, w) &= (K_Ru, u) + (K_Ru_0, u_0) - (K_Ru, u_0) - (K_Ru_0, u) \\ &\quad + \sum_{m,n}(K_Ru_m, u_n) - \sum_n(K_Ru, u_n) - \sum_n(K_Ru_n, u) \\ &\quad + \sum_n(K_Ru_0, u_n) + \sum_n(K_Ru_n, u_0). \end{aligned}$$
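The expansion above is pure sesquilinear algebra applied to $w = u - u_0 - \sum_n u_n$. A minimal finite-dimensional sketch can confirm the signs and the bookkeeping of the sums: here a random complex matrix stands in for $K_R$ and random vectors stand in for $u$, $u_0$, $u_n$. All names and dimensions are illustrative assumptions, not objects from the paper.

```python
import random

def inner(x, y):
    """Inner product (x, y) = sum_k x_k * conj(y_k), linear in the first slot."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def matvec(K, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(K[i][j] * x[j] for j in range(len(x))) for i in range(len(K))]

def rand_vec(n, rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

def check_expansion(dim=4, channels=3, seed=0):
    """Return |lhs - rhs| for the expansion of (K w, w), w = u - u0 - sum_n u_n."""
    rng = random.Random(seed)
    K = [rand_vec(dim, rng) for _ in range(dim)]   # stands in for K_R
    u, u0 = rand_vec(dim, rng), rand_vec(dim, rng)
    un = [rand_vec(dim, rng) for _ in range(channels)]
    w = [u[k] - u0[k] - sum(v[k] for v in un) for k in range(dim)]

    def form(x, y):
        return inner(matvec(K, x), y)

    lhs = form(w, w)
    rhs = (form(u, u) + form(u0, u0) - form(u, u0) - form(u0, u)
           + sum(form(a, b) for a in un for b in un)
           - sum(form(u, v) for v in un) - sum(form(v, u) for v in un)
           + sum(form(u0, v) for v in un) + sum(form(v, u0) for v in un))
    return abs(lhs - rhs)

print(check_expansion())
```

The printed defect should be at the level of rounding error, independent of the seed.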
Lemma 5.3.
$$\lim_{R\to\infty}(K_Ru_0, u_0) = \frac{\pi}{2}\lambda^2(\chi_a\psi, \psi).$$
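Proof. Let $\widetilde C(\lambda) = e^{5\pi i/4}(2\pi)^{-5/2}\lambda^{5/4}$. Then by the stationary phase method on the sphere
$$u_0 = C(\lambda)\widetilde C(\lambda)(2i\sqrt{\lambda})^{-1}\Big(\frac{\partial}{\partial r} + i\sqrt{\lambda}\Big)\int_{S^5}e^{i\sqrt{\lambda}\theta\cdot x}\psi(\theta)\,d\theta + O(r^{-7/2}).$$
This implies
$$\begin{aligned} K_Ru_0 &= C(\lambda)\widetilde C(\lambda)\frac{1}{R}\rho\Big(\frac{|x_a|}{R}\Big)\chi_a(\hat x)(2i\sqrt{\lambda})^{-1}\Big(\frac{\partial}{\partial r} + i\sqrt{\lambda}\Big)\int_{S^5}e^{i\sqrt{\lambda}\theta\cdot x}\sqrt{\lambda}|\theta_a|\psi(\theta)\,d\theta + \cdots \\ &= \sqrt{\lambda}\,C(\lambda)\frac{1}{R}\rho\Big(\frac{|x_a|}{R}\Big)\chi_a(\hat x)t_a\psi(\hat x)r^{-5/2}e^{i\sqrt{\lambda}r} + \cdots, \end{aligned} \tag{5.6}$$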
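where $t_a = |x_a|/r$ and the term $\cdots$ tends to 0 in $L^{2,s}$ for some $s > 1/2$ as $R \to \infty$. Therefore
$$(K_Ru_0, u_0) = \sqrt{\lambda}|C(\lambda)|^2\frac{1}{R}\int_{S^5}\int_0^\infty\rho\Big(\frac{r_a}{R}\Big)\chi_a(\hat x)t_a|\psi(\hat x)|^2\,dr\,d\hat x + o(1).$$
Using
$$\frac{1}{R}\int_0^\infty\rho\Big(\frac{r_a}{R}\Big)t_a\,dr = 1,$$
we obtain the lemma. □

Lemma 5.4.
$$\lim_{R\to\infty}(K_Ru, u_0) = \lim_{R\to\infty}(K_Ru_0, u) = \frac{\pi}{2}\lambda^2(\chi_a\psi, \psi).$$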
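Proof. By Theorem 4.5, we have
$$\lim_{R\to\infty} -2i\iint e^{-i\sqrt{\lambda}\theta\cdot x}K_Ru\,\overline{\psi(\theta)}\,dx\,d\theta = (2\pi)^3(\chi_a\psi, \psi).$$
We take $\rho_+(t) \in C^\infty(\mathbf{R})$ such that $\rho_+(t) = 1$ if $t > 1/2$, $\rho_+(t) = 0$ if $t < -1/2$ and let $P_a^+$ be the Ps.D.Op. with symbol $\rho_+\big(\frac{x_a}{|x_a|}\cdot\frac{\xi_a}{|\xi_a|}\big)\chi(|x_a|)\chi(|\xi_a|)$, where $\chi \in C^\infty(\mathbf{R})$, $\chi(t) = 0$ if $t < \epsilon$, $\chi(t) = 1$ if $t > 2\epsilon$, $\epsilon$ being a small constant. Then by virtue of Theorem 3.3, one can insert $P_a^+$ in front of $K_R$, which implies
$$\lim_{R\to\infty} -2i\iint e^{-i\sqrt{\lambda}\theta\cdot x}\rho_+\Big(\frac{x_a}{|x_a|}\cdot\theta_a\Big)K_Ru\,\overline{\psi(\theta)}\,dx\,d\theta = (2\pi)^3(\chi_a\psi, \psi).$$
Applying Lemma 2.4 to the integral $\int_{S^5}e^{-i\sqrt{\lambda}\theta\cdot x}\rho_+\big(\frac{x_a}{|x_a|}\cdot\theta_a\big)\overline{\psi(\theta)}\,d\theta$, we get
$$\lim_{R\to\infty}(K_Ru, u_0) = \frac{\pi}{2}\lambda^2(\chi_a\psi, \psi).$$
Since $(K_R^* - K_R)u \to 0$ in $L^{2,s}$ for some $s > 1/2$, we have
$$\lim_{R\to\infty}(K_Ru_0, u) = \lim_{R\to\infty}(u_0, K_Ru) = \frac{\pi}{2}\lambda^2(\chi_a\psi, \psi).$$
□

Lemma 5.5.
$$\lim_{R\to\infty}(K_Ru_m, u_n) = \delta_{mn}\frac{\pi}{2}\sqrt{\lambda - \lambda^{a,n}}\,\|\psi_{a,n}\|^2.$$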
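Proof. We first note that
$$K_Ru_m = \Big(\frac{\pi}{2}\Big)^{1/2}\sqrt{\lambda - \lambda^{a,m}}\cdot\frac{1}{R}\rho\Big(\frac{r_a}{R}\Big)\chi_a(\hat x)\frac{e^{i\sqrt{\lambda - \lambda^{a,m}}\,r_a}}{r_a}\psi_{a,m}(\omega_a)\otimes\varphi^{a,m}(x^a) + g_R(x_a)\otimes\varphi^{a,m}(x^a), \tag{5.7}$$
where $\omega_a = x_a/r_a$, $r_a = |x_a|$, and $g_R \to 0$ in $L^{2,s}(\mathbf{R}^3)$ for some $s > 1/2$. To prove this, we first approximate $r_a^{-1}e^{i\sqrt{\lambda - \lambda^{a,m}}\,r_a}\psi_{a,m}(\omega_a)$ by the integral
$$\mathrm{Const.}\Big(\frac{\partial}{\partial r_a} + i\sqrt{\lambda - \lambda^{a,m}}\Big)\int_{S^2}e^{i\sqrt{\lambda - \lambda^{a,m}}\,\theta_a\cdot x_a}\psi_{a,m}(\theta_a)\,d\theta_a.$$
Applying $\sqrt{E(T_a)}$, we get modulo a term in $L^{2,s}(\mathbf{R}^3)$,
$$\mathrm{Const.}\Big(\frac{\partial}{\partial r_a} + i\sqrt{\lambda - \lambda^{a,m}}\Big)\int_{S^2}e^{i\sqrt{\lambda - \lambda^{a,m}}\,\theta_a\cdot x_a}\sqrt{\lambda - \lambda^{a,m}}\,\psi_{a,m}(\theta_a)\,d\theta_a.$$
Applying Lemma 2.4 again, we get (5.7).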
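We shall write $a_R \sim b_R$ if $a_R - b_R \to 0$ as $R \to \infty$. It follows from (5.7) that
$$(K_Ru_m, u_n) \sim \frac{C}{R}\int\rho\Big(\frac{r_a}{R}\Big)\chi_a(\hat x)\frac{e^{iSr_a}}{r_a^2}\psi_{a,m}(\omega_a)\overline{\psi_{a,n}(\omega_a)}\varphi^{a,m}(x^a)\varphi^{a,n}(x^a)\,dx,$$
where $C = \frac{\pi}{2}\sqrt{\lambda - \lambda^{a,m}}$, $S = \sqrt{\lambda - \lambda^{a,m}} - \sqrt{\lambda - \lambda^{a,n}}$. We split $\chi_a(\hat x)$ as $\chi_a(\hat x) - 1 + 1$. When $m \ne n$, $\varphi^{a,m}(x^a)$ and $\varphi^{a,n}(x^a)$ are orthogonal. Then we have
$$(K_Ru_m, u_n) \sim \frac{C}{R}\int\rho\Big(\frac{r_a}{R}\Big)(\chi_a(\hat x) - 1)\frac{e^{iSr_a}}{r_a^2}\psi_{a,m}(\omega_a)\overline{\psi_{a,n}(\omega_a)}\varphi^{a,m}(x^a)\varphi^{a,n}(x^a)\,dx.$$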
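Here we recall that $|\varphi^{a,n}(x^a)| \le C(1 + r^a)^{-2}$, where $r^a = |x^a|$. This follows from (4.9) when $\lambda^{a,n} = 0$. When $\lambda^{a,n} < 0$ it is well known that $\varphi^{a,n}$ is exponentially decreasing. We then have
$$\begin{aligned} |(K_Ru_m, u_n)| &\le \frac{C}{R}\iint_{r^a \ge Cr_a}\rho\Big(\frac{r_a}{R}\Big)(1 + r^a)^{-2}\,dr_a\,dr^a + o(1) \\ &\le C\int_0^\infty\frac{1}{R}\rho\Big(\frac{r_a}{R}\Big)dr_a\int_{r^a \ge CR}(1 + r^a)^{-2}\,dr^a + o(1) \\ &= C\int_{r^a \ge CR}(1 + r^a)^{-2}\,dr^a + o(1) \to 0. \end{aligned}$$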
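If $m = n$, we have
$$(K_Ru_n, u_n) \sim \frac{\pi}{2}\sqrt{\lambda - \lambda^{a,n}}\,\frac{1}{R}\int\rho\Big(\frac{r_a}{R}\Big)\frac{1}{r_a^2}|\psi_{a,n}(\omega_a)|^2\chi_a(\hat x)|\varphi^{a,n}(x^a)|^2\,dx.$$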
We split $\chi_a(\hat x)$ as $\chi_a(\hat x) - 1 + 1$. Arguing as above, we see that the term containing $\chi_a(\hat x) - 1$ vanishes as $R \to \infty$. From this the lemma easily follows. □

Lemma 5.6.
$$\lim_{R\to\infty}(K_Ru, u_n) = \lim_{R\to\infty}(u, K_Ru_n) = \frac{\pi}{2}\sqrt{\lambda - \lambda^{a,n}}\,\|\psi_{a,n}\|^2.$$
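Proof. Noting the fact that $(K_Ru, u_n) \sim (u, K_Ru_n)$ and using (5.7), we have
$$(K_Ru, u_n) \sim \Big(\frac{\pi}{2}\Big)^{1/2}\sqrt{\lambda - \lambda^{a,n}}\cdot\frac{1}{R}\int\rho\Big(\frac{r_a}{R}\Big)\chi_a(\hat x)\frac{e^{-i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}u(x)\overline{\psi_{a,n}(\omega_a)}\varphi^{a,n}(x^a)\,dx.$$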
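We split $\chi_a(\hat x)$ as $\chi_a(\hat x) - 1 + 1$. Integrating first with respect to $x^a$ and noting Theorem 4.2, we have
$$\begin{aligned} \frac{1}{R}\int\rho\Big(\frac{r_a}{R}\Big)\frac{e^{-i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}u(x)\overline{\psi_{a,n}(\omega_a)}\varphi^{a,n}(x^a)\,dx &\sim \sqrt{\frac{\pi}{2}}\,\frac{1}{R}\int\rho\Big(\frac{r_a}{R}\Big)\frac{1}{r_a^2}r_a^2\,dr_a\,\|\psi_{a,n}\|^2 \\ &= \sqrt{\frac{\pi}{2}}\,\|\psi_{a,n}\|^2. \end{aligned}$$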
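The term containing $\chi_a(\hat x) - 1$ is estimated as follows:
$$\Big|\frac{1}{R}\int\rho\Big(\frac{r_a}{R}\Big)(\chi_a(\hat x) - 1)\frac{e^{-i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}u(x)\overline{\psi_{a,n}(\omega_a)}\varphi^{a,n}(x^a)\,dx\Big| \le \frac{C}{R}\int_{r^a \ge Cr_a}\frac{1}{r_a}\rho\Big(\frac{r_a}{R}\Big)(1 + r^a)^{-2}|(\chi_a(\hat x) - 1)u(x)\psi_{a,n}(\omega_a)|\,dx.$$
Examining the proof of Theorem 4.3 using the Sobolev inequality, one can see that if $f \in \mathcal{S}$, $|(\chi_a(\hat x) - 1)u(x)| \le Cr^{-5/2}$. Therefore the above integral is dominated from above by
$$\begin{aligned} \frac{C}{R}\iint_{r^a \ge Cr_a}\rho\Big(\frac{r_a}{R}\Big)r_a(1 + r^a)^{-5/2}\,dr_a\,dr^a &\le \frac{C}{R}\iint_{r^a \ge Cr_a}\rho\Big(\frac{r_a}{R}\Big)(1 + r^a)^{-3/2}\,dr_a\,dr^a \\ &\le \frac{C}{R}\int\rho\Big(\frac{r_a}{R}\Big)dr_a\int_{r^a \ge CR}(1 + r^a)^{-3/2}\,dr^a. \end{aligned}$$
This converges to 0 as $R \to \infty$. □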
Lemma 5.7.
$$\lim_{R\to\infty}(K_Ru_0, u_n) = \lim_{R\to\infty}(u_0, K_Ru_n) = 0.$$
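Proof. Using (5.6), we have
$$(K_Ru_0, u_n) \sim \frac{C}{R}\int\rho\Big(\frac{r_a}{R}\Big)\chi_a(\hat x)t_a\psi(\hat x)\frac{e^{iS}}{r^{5/2}r_a}\overline{\psi_{a,n}(\omega_a)}\varphi^{a,n}(x^a)\,dx,$$
where $S = \sqrt{\lambda}\,r - \sqrt{\lambda - \lambda^{a,n}}\,r_a$. The right-hand side is dominated from above by
$$\frac{C}{R}\iint_{r^a \le Cr_a}\rho\Big(\frac{r_a}{R}\Big)r_a^{-7/2}(1 + r^a)^{-2}(r_a)^2(r^a)^2\,dr_a\,dr^a \le \frac{C}{R}\iint_{r^a \le Cr_a}\rho\Big(\frac{r_a}{R}\Big)r_a^{-3/2}\,dr_a\,dr^a \le \frac{C}{R}\int_0^\infty\rho\Big(\frac{r_a}{R}\Big)r_a^{-1/2}\,dr_a.$$
The last term tends to 0 as $R \to \infty$. □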
It follows from Lemmas 5.3–5.7 that
$$(K_Rw, w) \sim (K_Ru, u) - \frac{\pi}{2}\lambda^2(\chi_a\psi, \psi) - \frac{\pi}{2}\sum_n\sqrt{\lambda - \lambda^{a,n}}\,\|\psi_{a,n}\|^2.$$
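By virtue of Theorem 4.9 the right-hand side tends to 0, which proves Theorem 5.2.

Theorem 5.8. For $f \in \mathcal{B}$,
$$R_a(\lambda + i0)f \simeq C(\lambda)\frac{e^{i\sqrt{\lambda}r}}{r^{5/2}}U_a^{cont}(\lambda)f + \sqrt{\frac{\pi}{2}}\sum_n\frac{e^{i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}U_{a,n}(\lambda)f(\omega_a)\otimes\varphi^{a,n}(x^a)$$
in the sense of (1.5), where $C(\lambda) = \sqrt{\frac{\pi}{2}}\,e^{-3\pi i/4}\lambda^{3/4}$.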
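Proof. We have only to prove the theorem for $f \in \mathcal{S}$. Since
$$\frac{1}{R}\int_{r^a \ge Cr_a}\rho\Big(\frac{r_a}{R}\Big)\Big|\frac{1}{r_a}\varphi^{a,n}(x^a)\Big|^2dx \to 0,$$
we see by Theorem 5.1
$$\frac{1}{R}\Big(\rho\Big(\frac{M_a}{R}\Big)\chi_0(x)w, w\Big) \to 0.$$
Therefore we have only to show that
$$\frac{1}{R}\Big(\rho\Big(\frac{M_a}{R}\Big)\chi_a(x)w, w\Big) \to 0.$$
Since $E(T_a)$ is positive definite, we have
$$\frac{1}{R}\Big(\rho\Big(\frac{M_a}{R}\Big)\chi_aw, w\Big) \le \frac{C}{R}\Big(\sqrt{E(T_a)}\Big(\rho\Big(\frac{M_a}{R}\Big)\chi_a\Big)^{1/2}w, \Big(\rho\Big(\frac{M_a}{R}\Big)\chi_a\Big)^{1/2}w\Big) \sim \frac{C}{R}\Big(\rho\Big(\frac{M_a}{R}\Big)\chi_a\sqrt{E(T_a)}w, w\Big).$$
The last term converges to 0 by virtue of Theorem 5.2. □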
6. Time-Dependent Scattering Theory and Generalized Fourier Transformation for $H$

We construct the generalized Fourier transformation for $H$ by passing to the Fourier transformation with respect to $t$ in an expression of the time-dependent wave operator. We begin by reviewing the proof of the asymptotic completeness.
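6.1. Behavior in the free region. Let $\lambda_0 > 0$ and $I_\delta = (\lambda_0 - \delta, \lambda_0 + \delta)$, where $0 < \delta < \lambda_0/2$. Take $\varphi \in C_0^\infty(I_{2\delta})$ such that $\varphi = 1$ on $I_\delta$. Let $\widetilde\chi_a(x) \in \mathrm{Hom}(\mathbf{R}^6)$ be such that $\widetilde\chi_a(x) = 1$ if $x \in N^a_{4\epsilon/5}$, $\widetilde\chi_a(x) = 0$ if $x \notin N^a_\epsilon$. We put $\widetilde\chi_0(x) = 1 - \sum_a\widetilde\chi_a(x)$. Then $\widetilde\chi_0(x) = 1$ if $x \in M_\epsilon$, and $\widetilde\chi_0(x) = 0$ if $x \in N_{4\epsilon/5}$. We first consider the behavior of $\widetilde\chi_0(x)e^{-itH}\varphi(H)f$ as $t \to \infty$. We take $\varphi_0 \in C_0^\infty(I_{2\delta})$ such that $\varphi_0 = 1$ on supp $\varphi$. Let $\psi \in \mathrm{Hom}(\mathbf{R}^6)$ be such that $\psi(\xi) = 1$ if $\xi \in M_{4\epsilon/5}$, $\psi(\xi) = 0$ if $\xi \in N_{3\epsilon/5}$. Let $\chi(x) \in \mathrm{Hom}(\mathbf{R}^6)$ be such that $\chi(x) = 1$ if $|x| > 1$ and $x \in M_{2\epsilon/5}$, $\chi(x) = 0$ if $|x| > 1$ and $x \in N_{\epsilon/5}$. We let
$$Q_0 = \varphi_0(H_0)\psi(D)\chi(x), \tag{6.1}$$
$$G_0 = H_0Q_0 - Q_0H = [H_0, Q_0] - Q_0\sum_aV_a. \tag{6.2}$$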
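In Subsects. 6.1–6.3, $f(t) \simeq g(t)$ means that $\|f(t) - g(t)\| \to 0$ as $t \to \infty$.

Lemma 6.1. For $f \in \mathcal{S}$,
$$\widetilde\chi_0(x)e^{-itH}\varphi(H)f \simeq \widetilde\chi_0(x)e^{-itH_0}f_0, \qquad f_0 = Q_0\varphi(H)f + i\int_0^\infty e^{itH_0}G_0e^{-itH}\varphi(H)f\,dt.$$

Proof. We compute
$$\widetilde\chi_0(x)e^{-itH}\varphi(H)f \simeq \widetilde\chi_0(x)\varphi_0(H_0)e^{-itH}\varphi(H)f \simeq \widetilde\chi_0(x)\varphi_0(H_0)\psi(D)e^{-itH}\varphi(H)f \simeq \widetilde\chi_0(x)Q_0e^{-itH}\varphi(H)f.$$
Differentiating and integrating, we have
$$e^{itH_0}Q_0e^{-itH}\varphi(H)f = Q_0\varphi(H)f + i\int_0^te^{isH_0}G_0e^{-isH}\varphi(H)f\,ds.$$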
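The "differentiating and integrating" step is the identity $\frac{d}{ds}\big(e^{isH_0}Q_0e^{-isH}\big) = ie^{isH_0}(H_0Q_0 - Q_0H)e^{-isH}$ integrated from 0 to $t$. The sketch below checks it for $2\times 2$ matrices; the matrices `H0`, `V`, `Q` are hypothetical stand-ins chosen only for illustration, the matrix exponential is a plain Taylor series, and the time integral is approximated by the midpoint rule.

```python
def mul(A, B):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def expm(A, terms=30):
    """exp(A) for a small 2x2 matrix via its Taylor series."""
    result = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    term = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for n in range(1, terms):
        term = scale(1.0 / n, mul(term, A))
        result = add(result, term)
    return result

def duhamel_defect(H0, H, Q, t, steps=400):
    """Largest entry of e^{itH0} Q e^{-itH} - (Q + i int_0^t e^{isH0} G e^{-isH} ds)."""
    G = add(mul(H0, Q), scale(-1.0, mul(Q, H)))   # G = H0 Q - Q H
    h = t / steps
    integral = [[0j, 0j], [0j, 0j]]
    for k in range(steps):
        s = (k + 0.5) * h
        piece = mul(mul(expm(scale(1j * s, H0)), G), expm(scale(-1j * s, H)))
        integral = add(integral, scale(h, piece))
    lhs = mul(mul(expm(scale(1j * t, H0)), Q), expm(scale(-1j * t, H)))
    rhs = add(Q, scale(1j, integral))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

H0 = [[1.0, 0.0], [0.0, 3.0]]        # stand-in for the free Hamiltonian
V = [[0.2, 0.5], [0.5, -0.1]]        # stand-in for the interaction
H = add(H0, V)
Q = [[0.7, 0.1j], [0.3, 0.4]]        # stand-in for the localization Q_0
print(duhamel_defect(H0, H, Q, 1.0))
```

The printed defect is limited only by the quadrature error and shrinks as `steps` grows.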
On the support of the symbol of $[H_0, Q_0]$, directions of $x$ and $\xi$ are different. By Theorem 3.8 (2), $G_0e^{-itH}\varphi(H)f$ is integrable in $t \ge 0$. This implies the lemma. □

Corollary 6.2. For $\lambda \in I_\delta$ and $f \in L^{2,s}$, $s > 1/2$,
$$Q_0R(\lambda + i0)f = R_0(\lambda + i0)g_0, \qquad g_0 = Q_0f + G_0R(\lambda + i0)f.$$

Proof. Apply $H_0 - \lambda$ to $Q_0R(\lambda + i0)f$. □
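The proof of Corollary 6.2 is the algebraic identity $(H_0 - z)Q_0R(z) = Q_0 + G_0R(z)$ with $G_0 = H_0Q_0 - Q_0H$. As a sketch, it can be verified for $2\times 2$ matrices at a non-real spectral parameter $z$ (the boundary value $\lambda + i0$ is not modelled here); the matrices below are hypothetical stand-ins, not objects from the paper.

```python
def mul(A, B):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, sign=1):
    """A + sign * B for 2x2 matrices."""
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]

def resolvent(H, z):
    """(H - z)^{-1} for a 2x2 matrix H via the adjugate formula."""
    a, b = H[0][0] - z, H[0][1]
    c, d = H[1][0], H[1][1] - z
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def corollary_defect(H0, V, Q, z):
    """Largest entry of Q R(z) - R0(z) (Q + G R(z)) with G = H0 Q - Q H."""
    H = add(H0, V)
    G = add(mul(H0, Q), mul(Q, H), sign=-1)
    R, R0 = resolvent(H, z), resolvent(H0, z)
    lhs = mul(Q, R)
    rhs = mul(R0, add(Q, mul(G, R)))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

H0 = [[1.0, 0.0], [0.0, 3.0]]        # stand-in for the free Hamiltonian
V = [[0.2, 0.5], [0.5, -0.1]]        # stand-in for the interaction
Q = [[0.7, 0.1j], [0.3, 0.4]]        # stand-in for the localization Q_0
print(corollary_defect(H0, V, Q, 2.0 + 0.5j))
```

The identity holds for any choice of the three matrices, since it only uses $H = H_0 + V$ and the definition of $G_0$.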
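6.2. Behavior near the singular directions. We take $\chi_a(x) \in \mathrm{Hom}(\mathbf{R}^6)$ such that $\chi_a(x) = 1$ if $|x| \ge 1$, $x \in N^a_{2\epsilon}$, and $\chi_a(x) = 0$ if $x \notin N^a_{3\epsilon}$, and let
$$Q_a = \varphi_0(H_a)\chi_a(x), \qquad G_a = H_aQ_a - Q_aH. \tag{6.3}$$
Then we have for $f \in \mathcal{S}$,
$$\widetilde\chi_a(x)e^{-itH}\varphi(H)f \simeq \widetilde\chi_a(x)Q_ae^{-itH}\varphi(H)f. \tag{6.4}$$
Differentiating and integrating, we have
$$e^{itH_a}Q_ae^{-itH}\varphi(H)f = Q_a\varphi(H)f + i\int_0^te^{isH_a}G_ae^{-isH}\varphi(H)f\,ds.$$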
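Unlike the above case 6.1, $G_ae^{-itH}\varphi(H)f$ does not decay integrably in $t \ge 0$. The remedy comes from re-localizing $e^{-itH}$ near the support of $\nabla\chi_a$. Let us take $\psi_0(\xi) \in \mathrm{Hom}(\mathbf{R}^6)$, $\chi_{a0}(x) \in \mathrm{Hom}(\mathbf{R}^6)$ as follows:
$$\psi_0(\xi) = 1 \ \text{ if } \xi \in N^a_{3\epsilon} \cap (N^a_{2\epsilon})^c, \qquad \psi_0(\xi) = 0 \ \text{ if } \xi \in N^a_\epsilon \cup (N^a_{4\epsilon})^c,$$
$$\chi_{a0}(x) = 1 \ \text{ if } |x| > 1,\ x \in N^a_{4\epsilon} \cap (N^a_\epsilon)^c, \qquad \chi_{a0}(x) = 0 \ \text{ if } x \in N^a_{\epsilon/2} \cup (N^a_{5\epsilon})^c, \qquad \chi_{a0}(x) = 1 \ \text{ on supp } \nabla\chi_a.$$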
Ga
Ga0 = [Ha , Qa0 ],
(6.5)
as follows: Ga = Ga ϕ0 (H ) + Ga (1 − ϕ0 (H )) Vb ϕ0 (H ) + Ga (1 − ϕ0 (H )) = [Ha , Qa ]ϕ0 (H ) − Qa b'=a
=
[Ha , Qa ]Qa0 + Ga1 ,
where Ga1 = [Ha , Qa ] {ϕ0 (H0 )ψ0 (D)(1 − χa0 (x)) + ϕ0 (H0 )(1 − ψ0 (D)) + (ϕ0 (H ) − ϕ0 (H0 ))} Vb ϕ0 (H ) + Ga (1 − ϕ0 (H )). − Qa
(6.6)
b'=a
The essential feature is that on the support of the symbols of $[H_0, Q_{a0}]$ and $[H_0, Q_a]\varphi(H_0)(1 - \psi_0(D))$, the directions of $x$ and $\xi$ are different. One can then use Theorem 3.8 to see that for $j = 0, 1$,
$$\|G_{aj}e^{-itH}\varphi(H)f\| \le C(1 + t)^{-1-\epsilon}, \quad t \ge 0, \tag{6.7}$$
for some $\epsilon > 0$.
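Lemma 6.3. For $f \in \mathcal{S}$,
$$\widetilde\chi_a(x)e^{-itH}\varphi(H)f \simeq \widetilde\chi_a(x)e^{-itH_0}f_{a0} + \widetilde\chi_a(x)e^{-itH_a}f_a,$$
$$f_{a0} = Q_{a0}\varphi(H)f + i\int_0^\infty e^{itH_0}\widetilde G_{a0}e^{-itH}\varphi(H)f\,dt, \qquad \widetilde G_{a0} = H_0Q_{a0} - Q_{a0}H,$$
$$f_a = Q_a(1 - Q_{a0})\varphi(H)f + i\int_0^\infty e^{itH_a}\widetilde G_ae^{-itH}\varphi(H)f\,dt,$$
$$\widetilde G_a = Q_aQ_{a0}\sum_{b \ne a}V_b - Q_aG_{a0} + G_{a1} = H_aQ_a(1 - Q_{a0}) - Q_a(1 - Q_{a0})H.$$

Proof. Noting that
$$[H_a, Q_a]Q_{a0} = [H_a, Q_aQ_{a0}] - Q_aG_{a0} = (H_aQ_aQ_{a0} - Q_aQ_{a0}H) + Q_aQ_{a0}\sum_{b \ne a}V_b - Q_aG_{a0},$$
we have
$$G_a = H_aQ_aQ_{a0} - Q_aQ_{a0}H + \widetilde G_a. \tag{6.8}$$
Therefore
$$e^{itH_a}Q_ae^{-itH}\varphi(H)f = Q_a(1 - Q_{a0})\varphi(H)f + e^{itH_a}Q_aQ_{a0}e^{-itH}\varphi(H)f + i\int_0^te^{isH_a}\widetilde G_ae^{-isH}\varphi(H)f\,ds.$$
Computing
$$e^{itH_0}Q_{a0}e^{-itH}\varphi(H)f = Q_{a0}\varphi(H)f + i\int_0^te^{isH_0}\widetilde G_{a0}e^{-isH}\varphi(H)f\,ds,$$
we see that
$$Q_ae^{-itH}\varphi(H)f \simeq Q_ae^{-itH_0}f_{a0} + e^{-itH_a}f_a.$$
Using (6.4) and the stationary phase method, one obtains the lemma. □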
Corollary 6.4. For $\lambda \in I_\delta$ and $f \in L^{2,s}$, $s > 1/2$,
$$Q_aR(\lambda + i0)f = Q_aQ_{a0}R(\lambda + i0)f + R_a(\lambda + i0)\big[Q_a(1 - Q_{a0}) + \widetilde G_aR(\lambda + i0)\big]f.$$

Proof. We compute
$$(H_a - \lambda)Q_aR(\lambda + i0)f = Q_af + G_aR(\lambda + i0)f = Q_af + \big[(H_a - \lambda)Q_aQ_{a0} - Q_aQ_{a0}(H - \lambda) + \widetilde G_a\big]R(\lambda + i0)f.$$
Multiply $R_a(\lambda + i0)$ to both sides. □
6.3. Asymptotic completeness. The channel wave operators are defined by
$$W_0 = \operatorname*{s-lim}_{t\to\infty}e^{itH}e^{-itH_0}, \tag{6.9}$$
$$W_{a,n} = \operatorname*{s-lim}_{t\to\infty}e^{itH}\big(e^{-it(T_a + \lambda^{a,n})}\otimes P^{a,n}\big), \tag{6.10}$$
where $P^{a,n}$ is the eigenprojection for $H^a$ associated to the eigenvalue $\lambda^{a,n}$.

Theorem 6.5. These channel wave operators are asymptotically complete, and for $f \in \mathcal{H}_{ac}(H)$,
$$\|W_0^*f\|^2 + \sum_{a,n}\|(W_{a,n})^*f\|^2 = \|f\|^2.$$
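Proof. It follows from Lemmas 6.1 and 6.3 that
$$e^{-itH}\varphi(H)f \simeq \widetilde\chi_0(x)e^{-itH_0}f_0 + \sum_a\widetilde\chi_a(x)e^{-itH_0}f_{a0} + \sum_a\widetilde\chi_a(x)e^{-itH_a}f_a.$$
We insert $1 = P^{cont}(H^a) + \sum_nP^{a,n}$ between $e^{-itH_a}$ and $f_a$, where $P^{cont}(H^a)$ denotes the projection onto the continuous subspace for $H^a$. Letting
$$W^a = \operatorname*{s-lim}_{t\to\infty}e^{itH^a}e^{it\Delta_{x^a}}, \qquad (1\otimes P^{a,n})f_a = u_{a,n}\otimes\varphi^{a,n},$$
we have
$$e^{-itH}\varphi(H)f \simeq e^{-itH_0}\widetilde\chi_0(D)f_0 + \sum_ae^{-itH_0}\widetilde\chi_a(D)\big[f_{a0} + (1\otimes(W^a)^*)f_a\big] + \sum_{a,n}e^{-it(T_a + \lambda^{a,n})}u_{a,n}\otimes\varphi^{a,n},$$
where we have used the fact that for $g \in \mathcal{S}$ and $\chi \in \mathrm{Hom}(\mathbf{R}^6)$,
$$\chi(\hat x)e^{-itH_0}g \simeq \chi(\hat x)\frac{C}{t^3}e^{i\frac{x^2}{4t}}\hat g\Big(\frac{x}{2t}\Big) \simeq e^{-itH_0}\chi(D)g,$$
which follows from the stationary phase method. The proof of the asymptotic completeness is now well known. Suppose $f \in \mathcal{H}_{ac}(H)$ is orthogonal to the ranges of all channel wave operators. Let $\varphi \in C_0^\infty(\sigma_{ac}(H)\setminus\mathcal{T})$. Then
$$\|\varphi(H)f\|^2 = \lim_{t\to\infty}(e^{-itH}\varphi(H)f, e^{-itH}\varphi(H)f) = 0,$$
if we replace one of $e^{-itH}\varphi(H)f$ by the above asymptotic expansion and use the assumption that $f \perp \mathrm{Ran}(W_0)$, $f \perp \mathrm{Ran}(W_{a,n})$. □

We remark that we have proven
$$W_0^*\varphi(H)f = \widetilde\chi_0(D)f_0 + \sum_a\widetilde\chi_a(D)\big[f_{a0} + (1\otimes(W^a)^*)f_a\big], \tag{6.11}$$
$$(W_{a,n})^*\varphi(H)f = u_{a,n}\otimes\varphi^{a,n}. \tag{6.12}$$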
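Let us notice that by passing to the Fourier transform
$$f_0 = \int_0^\infty U_0(\lambda)^*U_0(\lambda)[Q_0 + G_0R(\lambda + i0)]\varphi(H)f\rho_0(\lambda)\,d\lambda, \tag{6.13}$$
$$f_{a0} = \int_0^\infty U_0(\lambda)^*U_0(\lambda)[Q_{a0} + \widetilde G_{a0}R(\lambda + i0)]\varphi(H)f\rho_0(\lambda)\,d\lambda, \tag{6.14}$$
$$\begin{aligned} f_a &= \int_0^\infty U_a^{cont}(\lambda)^*U_a^{cont}(\lambda)\big[Q_a(1 - Q_{a0}) + \widetilde G_aR(\lambda + i0)\big]\varphi(H)f\rho_0(\lambda)\,d\lambda \\ &\quad + \sum_n\int_{-\infty}^\infty U_{a,n}(\lambda)^*U_{a,n}(\lambda)\big[Q_a(1 - Q_{a0}) + \widetilde G_aR(\lambda + i0)\big]\varphi(H)f\rho_{a,n}(\lambda)\,d\lambda, \end{aligned} \tag{6.15}$$
where $\rho_0(\lambda) = \lambda^2/2$, $\rho_{a,n}(\lambda) = (\lambda - \lambda^{a,n})_+/2$. This is a well-known computation. In fact to derive (6.13), we compute
$$(f_0, g) = \int_0^\infty(U_0(\lambda)f_0, U_0(\lambda)g)\rho_0(\lambda)\,d\lambda.$$
Since
$$U_0(\lambda)\int_0^\infty e^{itH_0}G_0e^{-itH}\,dt = \int_0^\infty U_0(\lambda)G_0e^{-it(H - \lambda)}\,dt = -iU_0(\lambda)G_0R(\lambda + i0),$$
we obtain (6.13). The others are proved similarly.

6.4. Generalized Fourier transformation associated with $H$. We now define
$$\begin{aligned} F_0(\lambda) &= \widetilde\chi_0(\theta)U_0(\lambda)[Q_0 + G_0R(\lambda + i0)] + \sum_a\widetilde\chi_a(\theta)U_0(\lambda)\big[Q_{a0} + \widetilde G_{a0}R(\lambda + i0)\big] \\ &\quad + \sum_a\widetilde\chi_a(\theta)U_a^{cont}(\lambda)\big[Q_a(1 - Q_{a0}) + \widetilde G_aR(\lambda + i0)\big], \end{aligned} \tag{6.16}$$
$$F_{a,n}(\lambda) = U_{a,n}(\lambda)\big[Q_a(1 - Q_{a0}) + \widetilde G_aR(\lambda + i0)\big]. \tag{6.17}$$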
The operators $F_0(\lambda)$ and $F_{a,n}(\lambda)$ are well-defined, since $U_0(\lambda) \in \mathbf{B}(L^{2,s}; L^2(S^5))$, $s > 1/2$, $U_a^{cont}(\lambda) \in \mathbf{B}(L^{2,s}; L^2(S^5))$, $s > 5/2$, $U_{a,n}(\lambda) \in \mathbf{B}(L^{2,s}; L^2(S^2))$, $s > 1/2$, and the operators in the parentheses are bounded from $L^{2,s+\epsilon}$ to $L^{2,s}$ for some $s > 5/2$, $\epsilon > 0$. The aim of this sub-section is to show that these operators provide a spectral representation for $H$. First we introduce operators $\mathcal{U}_0$, $\mathcal{U}_a^{cont}$, $\mathcal{U}_{a,n}$ by
$$(\mathcal{U}_0f)(\lambda) = U_0(\lambda)f, \tag{6.18}$$
$$(\mathcal{U}_a^{cont}f)(\lambda) = U_a^{cont}(\lambda)f, \tag{6.19}$$
$$(\mathcal{U}_{a,n}f)(\lambda) = U_{a,n}(\lambda)f. \tag{6.20}$$
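These operators are unitary between the following spaces:
$$\mathcal{U}_0 : L^2(\mathbf{R}^6) \to L^2((0, \infty); L^2(S^5); \rho_0(\lambda)d\lambda),$$
$$\mathcal{U}_a^{cont} : (1\otimes P^{cont}(H^a))L^2(\mathbf{R}^6) \to L^2((0, \infty); L^2(S^5); \rho_0(\lambda)d\lambda),$$
$$\mathcal{U}_{a,n} : (1\otimes P^{a,n})L^2(\mathbf{R}^6) \to L^2((\lambda^{a,n}, \infty); L^2(S^2); \rho_{a,n}(\lambda)d\lambda).$$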
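The weight $\rho_0(\lambda) = \lambda^2/2$ is what the radial part of Lebesgue measure on $\mathbf{R}^6$ becomes in the energy variable $\lambda = |\xi|^2$: in dimension $d$ one has $k^{d-1}dk = \tfrac12\lambda^{(d-2)/2}d\lambda$. The sketch below checks this change of variables numerically for $d = 6$, and also for $d = 3$ in terms of the kinetic energy; the Gaussian test profile `g` and the cutoff `kmax` are arbitrary assumptions made only for the illustration.

```python
import math

def midpoint(f, a, b, n=100000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def radial_in_momentum(g, dim, kmax=8.0):
    """integral_0^kmax g(k^2) k^(dim-1) dk, the radial part of d xi in R^dim."""
    return midpoint(lambda k: g(k * k) * k ** (dim - 1), 0.0, kmax)

def radial_in_energy(g, dim, kmax=8.0):
    """integral_0^{kmax^2} g(lam) * lam^((dim-2)/2) / 2 d lam, with lam = k^2."""
    return midpoint(lambda lam: g(lam) * 0.5 * lam ** ((dim - 2) / 2.0),
                    0.0, kmax * kmax)

def g(lam):
    """Hypothetical rapidly decaying test profile."""
    return math.exp(-lam)

for dim in (6, 3):
    print(dim, radial_in_momentum(g, dim), radial_in_energy(g, dim))
```

For $d = 6$ both integrals approximate $\int_0^\infty k^5e^{-k^2}dk = 1$, and for each dimension the two columns agree up to quadrature error.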
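It is well known that
$$1\otimes W^a = (\mathcal{U}_a^{cont})^*\mathcal{U}_0. \tag{6.21}$$
We define operators $\mathcal{F}_0$ and $\mathcal{F}_{a,n}$ by
$$(\mathcal{F}_0f)(\lambda) = F_0(\lambda)f, \tag{6.22}$$
$$(\mathcal{F}_{a,n}f)(\lambda) = F_{a,n}(\lambda)f. \tag{6.23}$$
By (6.11), (6.13) and (6.22), we have
$$\begin{aligned} W_0^*\varphi(H)f &= \mathcal{U}_0^*\Big\{\widetilde\chi_0(\theta)\mathcal{U}_0f_0 + \sum_a\widetilde\chi_a(\theta)\big[\mathcal{U}_0f_{a0} + \mathcal{U}_0(1\otimes(W^a)^*)f_a\big]\Big\} \\ &= \int_0^\infty U_0(\lambda)^*F_0(\lambda)\varphi(H)f\rho_0(\lambda)\,d\lambda. \end{aligned} \tag{6.24}$$
This implies
$$W_0^*\varphi(H) = \mathcal{U}_0^*\mathcal{F}_0\varphi(H). \tag{6.25}$$
We also have by (6.12), (6.15) and (6.17),
$$(W_{a,n})^*\varphi(H) = (\mathcal{U}_{a,n})^*\mathcal{F}_{a,n}\varphi(H). \tag{6.26}$$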
These formulas in particular imply that the operators $F_0(\lambda)$, $F_{a,n}(\lambda)$ are independent of the choice of the cut-off functions $\chi$. Let $E_H(\cdot)$ be the spectral decomposition for $H$. Theorem 6.5 and (6.25), (6.26) imply
$$\|E_H(I_\delta)f\|^2 = \|\mathcal{F}_0E_H(I_\delta)f\|^2 + \sum_{a,n}\|\mathcal{F}_{a,n}E_H(I_\delta)f\|^2. \tag{6.27}$$
Therefore the operator $\mathcal{F} = (\mathcal{F}_0, \mathcal{F}_{a,1}, \cdots, \mathcal{F}_{b,1}, \cdots)$ is uniquely extended to a unitary operator
$$\mathcal{F} : E_H(I_\delta)L^2(\mathbf{R}^6) \to L^2(I_\delta; L^2(S^5); \rho_0(\lambda)d\lambda) \oplus \bigoplus_{a,n}L^2(I_\delta^{a,n}; L^2(S^2); \rho_{a,n}(\lambda)d\lambda),$$
where $I_\delta^{a,n} = I_\delta + \lambda^{a,n} = (\lambda_0 - \delta + \lambda^{a,n}, \lambda_0 + \delta + \lambda^{a,n})$. Parseval's formula (6.27) also holds when $I_\delta \subset (-\infty, 0)$ if we drop the term $\mathcal{F}_0f$. The following Parseval's formula in differential form can be easily proved by (6.27) and Stone's formula.

Theorem 6.6. For $f \in \mathcal{B}$ and $\lambda \in \sigma_{cont}(H)\setminus\mathcal{T}$,
$$\frac{1}{2\pi i}([R(\lambda + i0) - R(\lambda - i0)]f, f) = \frac{1}{2}(\lambda_+)^2\|F_0(\lambda)f\|^2_{L^2(S^5)} + \frac{1}{2}\sum_{a,n}(\lambda - \lambda^{a,n})_+\|F_{a,n}(\lambda - \lambda^{a,n})f\|^2_{L^2(S^2)},$$
where $k_+ = \max(k, 0)$.
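We define
$$\mathcal{F}(\lambda) = (F_0(\lambda), F_{a,1}(\lambda), \cdots, F_{b,1}(\lambda), \cdots). \tag{6.28}$$
Then $\mathcal{F}(\lambda)$ is uniquely extended to a bounded operator
$$\mathcal{F}(\lambda) : \mathcal{B} \to \mathcal{H} = L^2(S^5) \oplus \bigoplus_{a,n}L^2(S^2). \tag{6.29}$$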
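6.5. Expansion of arbitrary functions. The above construction of generalized Fourier transformation is extended to the whole $\mathcal{H}_{ac}(H)$. Namely, one can construct a unitary operator
$$\mathcal{F} : \mathcal{H}_{ac}(H) \to L^2((0, \infty); L^2(S^5); \rho_0(\lambda)d\lambda) \oplus \bigoplus_{a,n}L^2((\lambda^{a,n}, \infty); L^2(S^2); \rho_{a,n}(\lambda)d\lambda)$$
by which any $f \in \mathcal{H}_{ac}(H)$ is expanded as
$$f = \int_0^\infty F_0(\lambda)^*(\mathcal{F}_0f)(\lambda)\rho_0(\lambda)\,d\lambda + \sum_{a,n}\int_{\lambda^{a,n}}^\infty F_{a,n}(\lambda)^*(\mathcal{F}_{a,n}f)(\lambda)\rho_{a,n}(\lambda)\,d\lambda. \tag{6.30}$$
Here
$$\int_0^\infty\cdots d\lambda = \operatorname*{s-lim}_{N\to\infty}\int_{1/N}^N\cdots d\lambda$$
and
$$\int_{\lambda^{a,n}}^\infty\cdots d\lambda = \operatorname*{s-lim}_{I\to(\lambda^{a,n},\infty)\setminus\mathcal{T}}\int_I\cdots d\lambda,$$
$I$ being a finite union of compact intervals in $(\lambda^{a,n}, \infty)\setminus\mathcal{T}$. $\mathcal{F}$ diagonalizes $H$:
$$(\mathcal{F}Hf)(\lambda) = \lambda(\mathcal{F}f)(\lambda). \tag{6.31}$$
$F_0(\lambda)^*$ and $F_{a,n}(\lambda)^*$ are eigenoperators in the sense that
$$(H - \lambda)F_0(\lambda)^*\psi_0 = (H - \lambda)F_{a,n}(\lambda)^*\psi_{a,n} = 0 \tag{6.32}$$
holds for any $\psi_0 \in L^2(S^5)$, $\psi_{a,n} \in L^2(S^2)$ and $\lambda \notin \mathcal{T}$.

There is an abundance of references for the eigenfunction expansion theory for 2-body Schrödinger operators. The method employed above is a 3-body version of the spectral representation theory in terms of stationary wave operators. We have discussed it for the 2-body case in [17] and [16]. The proof for the 3-body case can be done in a parallel manner, hence is omitted.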
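6.6. Asymptotic expansion of the resolvent. In this sub-section $u \simeq v$ means (1.5).

Theorem 6.7. For $f \in \mathcal{B}$ and $\lambda > 0$, we have the following asymptotic expansion:
$$R(\lambda + i0)f \simeq C(\lambda)\frac{e^{i\sqrt{\lambda}r}}{r^{5/2}}F_0(\lambda)f + \sqrt{\frac{\pi}{2}}\sum_{a,n}\frac{e^{i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}F_{a,n}(\lambda)f(\omega_a)\otimes\varphi^{a,n}(x^a), \qquad C(\lambda) = \sqrt{\frac{\pi}{2}}\,e^{-3\pi i/4}\lambda^{3/4}.$$

Proof. We have only to prove the theorem for $f \in \mathcal{S}$. We use the same notation as above, taking the interval $I_\delta$ in Subsect. 6.1 in such a way that $\lambda \in I_\delta$. Then we have
$$\widetilde\chi_0(x)R(\lambda + i0)f \simeq \widetilde\chi_0(x)R(\lambda + i0)\varphi(H)f \simeq \widetilde\chi_0(x)Q_0R(\lambda + i0)f = \widetilde\chi_0(x)R_0(\lambda + i0)g_0,$$
where we have used Corollary 6.2. By virtue of Lemma 2.3, if we restrict $x$ in $M_{2\epsilon}$,
$$\widetilde\chi_0(x)R(\lambda + i0)f \simeq C(\lambda)r^{-5/2}e^{i\sqrt{\lambda}r}U_0(\lambda)g_0 = C(\lambda)r^{-5/2}e^{i\sqrt{\lambda}r}F_0(\lambda)f,$$
where we have used (6.16). In a similar manner we have, if we restrict $x$ in $N^a_{\epsilon/5}$,
$$\widetilde\chi_a(x)R(\lambda + i0)f \simeq \widetilde\chi_a(x)Q_aR(\lambda + i0)f \simeq R_a(\lambda + i0)\big[Q_a(1 - Q_{a0}) + \widetilde G_aR(\lambda + i0)\big]f,$$
where we have used Corollary 6.4. Using Theorem 5.8 and (6.16), (6.17), we have on $N^a_{\epsilon/5}$
$$R_a(\lambda + i0)\big[Q_a(1 - Q_{a0}) + \widetilde G_aR(\lambda + i0)\big]f \simeq C(\lambda)r^{-5/2}e^{i\sqrt{\lambda}r}F_0(\lambda)f + \sqrt{\frac{\pi}{2}}\sum_nF_{a,n}(\lambda)f\otimes\varphi^{a,n}.$$
Since $\epsilon$ can be chosen arbitrarily, this completes the proof of the theorem. □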
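For a bounded operator $A$, we define $\overline{A}$ by
$$\overline{A}u = \overline{A\overline{u}}, \tag{6.33}$$
where $\overline{u}$ denotes the complex conjugate of a function $u$. The following corollary is easily proved by Theorem 6.7. Note that without loss of generality, we can assume that $\varphi^{a,n}$ is real-valued.

Corollary 6.8. For $f \in \mathcal{B}$ and $\lambda > 0$,
$$R(\lambda - i0)f \simeq \overline{C(\lambda)}\frac{e^{-i\sqrt{\lambda}r}}{r^{5/2}}\overline{F_0(\lambda)}f + \sqrt{\frac{\pi}{2}}\sum_{a,n}\frac{e^{-i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}\overline{F_{a,n}(\lambda)}f(\omega_a)\otimes\varphi^{a,n}(x^a).$$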
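7. Scattering Matrices

Let us introduce new notation. For $\mathcal{F}(\lambda)$ defined in Sect. 6, we put
$$\mathcal{F}^{(+)}(\lambda) = \mathcal{F}(\lambda), \tag{7.1}$$
$$\mathcal{F}^{(-)}(\lambda) = J\overline{\mathcal{F}(\lambda)}, \tag{7.2}$$
where $J$ is the reflection defined by (1.14). We see that $\mathcal{F}^{(-)}(\lambda)$ is defined in the same way as $\mathcal{F}^{(+)}(\lambda)$ with $R(\lambda + i0)$ replaced by $R(\lambda - i0)$, if we take $\psi(\xi)$ as an even function. For a triple $\alpha = (a, \lambda^{a,n}, \varphi^{a,n})$, we write
$$W_\alpha^{(\pm)} = \operatorname*{s-lim}_{t\to\pm\infty}e^{itH}e^{-itH_a}(1\otimes P^{a,n}). \tag{7.3}$$
When $\alpha = 0$, $W_0^{(\pm)}$ is defined by
$$W_0^{(\pm)} = \operatorname*{s-lim}_{t\to\pm\infty}e^{itH}e^{-itH_0}. \tag{7.4}$$
The scattering operator $S_{\alpha\beta}$ is defined to be
$$S_{\alpha\beta} = (W_\alpha^{(+)})^*W_\beta^{(-)}. \tag{7.5}$$
Letting $\mathcal{U}_\alpha = \mathcal{U}_0$ or $\mathcal{U}_{a,n}$ (see (6.18) and (6.20)), we define the Fourier transform of $S_{\alpha\beta}$ by
$$\hat S_{\alpha\beta} = \mathcal{U}_\alpha S_{\alpha\beta}(\mathcal{U}_\beta)^*. \tag{7.6}$$
Using $\overline{W_\alpha^{(+)}} = W_\alpha^{(-)}$, (6.25), (6.26) and (7.2), we have
$$\hat S_{\alpha\beta} = \mathcal{F}_\alpha^{(+)}(\mathcal{F}_\beta^{(-)})^*. \tag{7.7}$$

Lemma 7.1. $\hat S_{\alpha\beta}$ is written as a direct integral
$$\hat S_{\alpha\beta} = \int\oplus\hat S_{\alpha\beta}(\lambda)\,d\lambda,$$
where for $\beta = 0$,
$$\hat S_{\alpha 0}(\lambda) = \delta_{\alpha 0} + 2\pi i\rho_\alpha(\lambda)F_\alpha^{(+)}(\lambda)\cdot\Big\{G_0^*U_0(\lambda)^*\widetilde\chi_0(\theta) + \sum_a\big(\widetilde G_{a0}^*U_0(\lambda)^* + \widetilde G_a^*U_a^{cont}(\lambda)^*\big)\widetilde\chi_a(\theta)\Big\},$$
and for $\beta \ne 0$,
$$\hat S_{\alpha\beta}(\lambda) = \delta_{\alpha\beta} + 2\pi i\rho_\alpha(\lambda)F_\alpha^{(+)}(\lambda)\widetilde G_\beta^*U_\beta(\lambda)^*.$$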
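Proof. Parseval's formula implies
$$\frac{1}{2\pi i}(R(\lambda + i0) - R(\lambda - i0)) = \sum_\alpha\rho_\alpha(\lambda)(F_\alpha^{(+)}(\lambda))^*F_\alpha^{(+)}(\lambda). \tag{7.8}$$
Therefore by (6.16) and (6.17),
$$(F_0^{(+)}(\lambda))^* - (F_0^{(-)}(\lambda))^* = -2\pi i\sum_\alpha\rho_\alpha(\lambda)(F_\alpha^{(+)}(\lambda))^*F_\alpha^{(+)}(\lambda)\Big[G_0^*U_0(\lambda)^*\widetilde\chi_0(\theta) + \sum_a\big(\widetilde G_{a0}^*U_0(\lambda)^* + \widetilde G_a^*U_a^{cont}(\lambda)^*\big)\widetilde\chi_a(\theta)\Big],$$
$$(F_\beta^{(+)}(\lambda))^* - (F_\beta^{(-)}(\lambda))^* = -2\pi i\sum_\alpha\rho_\alpha(\lambda)(F_\alpha^{(+)}(\lambda))^*F_\alpha^{(+)}(\lambda)\widetilde G_b^*U_\beta(\lambda)^*.$$
Computing $(\mathcal{F}_\alpha^{(+)}((\mathcal{F}_\beta^{(+)})^* - (\mathcal{F}_\beta^{(-)})^*)\hat f, \hat g)$, one can prove the lemma by using the above formula. □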
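Let us define an operator-valued matrix by
$$\hat S(\lambda) = (\hat S_{\alpha\beta}(\lambda)). \tag{7.9}$$

Theorem 7.2. For $\lambda > 0$ and $\varphi \in \mathcal{H}$, let $\varphi^{(-)} = J\varphi$ and $\varphi^{(+)} = \hat S(\lambda)J\varphi^{(-)}$. Then
$$\begin{aligned} \mathcal{F}^{(-)}(\lambda)^*\varphi &\simeq C(\lambda)\frac{e^{i\sqrt{\lambda}r}}{r^{5/2}}\varphi_0^{(+)}(\hat x) + \overline{C(\lambda)}\frac{e^{-i\sqrt{\lambda}r}}{r^{5/2}}\varphi_0^{(-)}(\hat x) \\ &\quad + \sum_{a,n}\Big\{C_{a,n}(\lambda)\frac{e^{i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}\varphi_{a,n}^{(+)}(\omega_a)\otimes\varphi^{a,n}(x^a) + \overline{C_{a,n}(\lambda)}\frac{e^{-i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}\varphi_{a,n}^{(-)}(\omega_a)\otimes\varphi^{a,n}(x^a)\Big\}, \end{aligned}$$
where $\omega_a = x_a/r_a$, $r_a = |x_a|$ and
$$C(\lambda) = (2\pi)^{-1/2}e^{-5\pi i/4}\lambda^{-5/4}, \qquad C_{a,n}(\lambda) = (2\pi)^{-1/2}e^{-\pi i/2}(\lambda - \lambda^{a,n})^{-1/2}.$$

Proof. We have only to prove the theorem for $\varphi$ in a dense set of $\mathcal{H}$. Take $\varphi = (\varphi_0, \varphi_{a,1}, \cdots)$ such that $\varphi_{a,n} \in C^\infty(S^2)$, $\varphi_0 \in C^\infty(S^5)$ and $\varphi_0 = 0$ near the singular directions. Take $\epsilon > 0$ such that supp $\varphi \subset M_{3\epsilon}$. For this $\epsilon$, construct $\mathcal{F}(\lambda)$ as in Sect. 6. Then $\widetilde\chi_a(\theta) = 0$ on supp $\varphi_0$ and
$$|G_0^*U_0(\lambda)^*\widetilde\chi_0(\theta)\varphi_0^{(-)}| \le C(1 + |x|)^{-\rho},$$
with $\rho$ from (1.11). We also have
$$|\widetilde G_b^*U_\beta(\lambda)^*\varphi_{b,n}^{(-)}| \le C(1 + |x|)^{-4}.$$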
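Therefore $\hat S_{\alpha\beta}(\lambda)\varphi^{(-)}$ is well-defined. We also have by (6.16) and (6.17),
$$\begin{aligned} F_0^{(-)}(\lambda)^* &= \big\{Q_0^* + R(\lambda + i0)G_0^*\big\}U_0(\lambda)^*\widetilde\chi_0(\theta) + \sum_a\big\{Q_{a0}^* + R(\lambda + i0)\widetilde G_{a0}^*\big\}U_0(\lambda)^*\widetilde\chi_a(\theta) \\ &\quad + \sum_a\big\{(1 - Q_{a0}^*)Q_a^* + R(\lambda + i0)\widetilde G_a^*\big\}U_a^{cont}(\lambda)^*\widetilde\chi_a(\theta), \end{aligned}$$
$$F_{a,n}^{(-)}(\lambda)^* = \big\{(1 - Q_{a0}^*)Q_a^* + R(\lambda + i0)\widetilde G_a^*\big\}U_{a,n}(\lambda)^*.$$
Therefore
$$F_0^{(-)}(\lambda)^*\varphi_0 = \chi(x)U_0(\lambda)^*\varphi_0 + R(\lambda + i0)G_0^*U_0(\lambda)^*\varphi_0.$$
We apply Lemma 2.4 to the first term, and Theorem 6.7 to the second term. Then we have
$$F_0^{(-)}(\lambda)^*\varphi_0 \simeq C(\lambda)\frac{e^{i\sqrt{\lambda}r}}{r^{5/2}}\hat S_{00}(\lambda)\varphi_0 + \overline{C(\lambda)}\frac{e^{-i\sqrt{\lambda}r}}{r^{5/2}}J\varphi_0 + \sum_{b,m}C_{b,m}(\lambda)\frac{e^{i\sqrt{\lambda - \lambda^{b,m}}\,r_b}}{r_b}\hat S_{\beta 0}(\lambda)\varphi_0\otimes\varphi^{b,m}.$$
Similarly we have
$$\begin{aligned} F_{a,n}^{(-)}(\lambda)^*\varphi_{a,n} &\simeq U_{a,n}(\lambda)^*\varphi_{a,n} + R(\lambda + i0)\widetilde G_a^*U_{a,n}(\lambda)^*\varphi_{a,n} \\ &\simeq C(\lambda)\frac{e^{i\sqrt{\lambda}r}}{r^{5/2}}\hat S_{0\alpha}(\lambda)\varphi_{a,n} + \sum_{b,m}C_{b,m}(\lambda)\frac{e^{i\sqrt{\lambda - \lambda^{b,m}}\,r_b}}{r_b}\hat S_{\beta\alpha}(\lambda)\varphi_{a,n}\otimes\varphi^{b,m} \\ &\quad + \overline{C_{a,n}(\lambda)}\frac{e^{-i\sqrt{\lambda - \lambda^{a,n}}\,r_a}}{r_a}(J\varphi_{a,n})\otimes\varphi^{a,n}. \end{aligned}$$
Since $\mathcal{F}^{(-)}(\lambda)^*\varphi = F_0^{(-)}(\lambda)^*\varphi_0 + \sum_{a,n}F_{a,n}^{(-)}(\lambda)^*\varphi_{a,n}$, we get the theorem. □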
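Theorem 7.3. For $\varphi \in \mathcal{H}$,
$$\lim_{R\to\infty}\frac{1}{R}\int_{|x|<R}|\mathcal{F}^{(-)}(\lambda)^*\varphi|^2\,dx = C_0(\lambda)\|\varphi_0\|^2 + \sum_{a,n}C_{a,n}(\lambda)\|\varphi_{a,n}\|^2,$$
[…] $> 0$ such that
$$C^{-1}\|\varphi\|_{\mathcal{H}} \le \|\mathcal{F}^{(-)}(\lambda)^*\varphi\|_{\mathcal{B}^*} \le C\|\varphi\|_{\mathcal{H}}.$$

Lemma 7.5. Let $u \in \mathcal{B}^*$ satisfy $(H - \lambda)u = 0$. Then $(u, f) = 0$ for any $f \in \mathcal{B}$ such that $\mathcal{F}^{(-)}(\lambda)f = 0$.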
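Proof. We put $v = R(\lambda - i0)f$. Then we have by integration by parts
$$0 = \Big(\rho_1\Big(\frac{M}{R}\Big)v, (H - \lambda)u\Big) = -2\Big(\nabla\rho_1\Big(\frac{M}{R}\Big)\cdot\nabla v, u\Big) + \Big(\rho_1\Big(\frac{M}{R}\Big)f, u\Big).$$
Hence we have
$$(f, u) = \lim_{R\to\infty} -\frac{2}{R}\Big(\rho\Big(\frac{M}{R}\Big)\nabla M\cdot\nabla v, u\Big). \tag{7.10}$$
The assumption $\mathcal{F}^{(-)}(\lambda)f = 0$ and Corollary 6.8 imply
$$\frac{1}{R}\int\rho\Big(\frac{M}{R}\Big)|v|^2\,dx \to 0.$$
This and the well-known elliptic estimates yield
$$\frac{1}{R}\int\rho\Big(\frac{M}{R}\Big)|\nabla M\cdot\nabla v|^2\,dx \to 0. \tag{7.11}$$
Since $u \in \mathcal{B}^*$, we have
$$\sup_{R>1}\frac{1}{R}\int\rho\Big(\frac{M}{R}\Big)|u|^2\,dx < \infty. \tag{7.12}$$
The lemma then follows from (7.10), (7.11) and (7.12). □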
8. Main Theorems

Theorem 8.1. Let $u$ satisfy $(H - \lambda)u = 0$ for $\lambda \in \sigma_{cont}(H)\setminus\mathcal{T}$. Then $u \in \mathcal{B}^*$ if and only if $u \in \mathrm{Ran}(\mathcal{F}^{(-)}(\lambda)^*)$.

Proof. This is proved in the same way as in [7]. In fact since $\mathrm{Ran}(\mathcal{F}^{(-)}(\lambda)^*)$ is closed by virtue of Corollary 7.4, $u \in \mathcal{B}^*$ belongs to $\mathrm{Ran}(\mathcal{F}^{(-)}(\lambda)^*)$ if and only if $(u, f) = 0$ for all $f \in \mathcal{B}$ such that $\mathcal{F}^{(-)}(\lambda)f = 0$ (see [41] p. 205). Therefore by Lemma 7.5, $u \in \mathrm{Ran}(\mathcal{F}^{(-)}(\lambda)^*)$ if $(H - \lambda)u = 0$. The converse direction is easy to prove. □

This theorem also holds if we replace $\mathcal{F}^{(-)}(\lambda)$ by $\mathcal{F}^{(+)}(\lambda)$.

Theorem 8.2. Let $\lambda \in \sigma_{cont}(H)\setminus\mathcal{T}$. Suppose $u \in \mathcal{B}^*$ satisfies $(H - \lambda)u = 0$. Then there exist $\varphi^{(\pm)} \in \mathcal{H}$ such that $u$ admits the asymptotic expansion (1.12) in the sense of (1.5).

Proof. By virtue of Theorem 8.1, $u$ is written as $u = \mathcal{F}^{(-)}(\lambda)^*\varphi$ for some $\varphi \in \mathcal{H}$. The present theorem then follows from Theorem 7.2. □

Theorem 8.3. Let $\lambda \in \sigma_{cont}(H)\setminus\mathcal{T}$. Then for any $\varphi^{(-)} \in \mathcal{H}$, there exist unique $u \in \mathcal{B}^*$ and $\varphi^{(+)} \in \mathcal{H}$ such that $(H - \lambda)u = 0$ and $u$ admits the asymptotic expansion (1.12) in the sense of (1.5).
Proof. The existence is shown by taking $u = \mathcal{F}^{(-)}(\lambda)^*\varphi^{(-)}$ and $\varphi^{(+)} = \hat S(\lambda)J\varphi^{(-)}$. Let us show the uniqueness. First we note that for given $u \in \mathcal{B}^*$ satisfying $(H - \lambda)u = 0$, the asymptotic expansion (1.12) is unique. In fact, by the computation similar to the one in Sect. 5, we have $\varphi^{(+)} = \varphi^{(-)} = 0$, if they satisfy the asymptotic expansion (1.12) with $u = 0$. Suppose $u \in \mathcal{B}^*$ satisfies $(H - \lambda)u = 0$ and assume further that $\varphi^{(-)} = 0$ in the asymptotic expansion (1.12). By Theorem 8.1, $u$ is written as $u = \mathcal{F}^{(-)}(\lambda)^*\varphi$ for some $\varphi \in \mathcal{H}$. We then have $\varphi = \varphi^{(-)} = 0$ by the uniqueness of the expansion and Theorem 7.2. □
Generalized eigenfunctions with 2-cluster incoming state. Finally, we shall discuss the asymptotic expansion of generalized eigenfunctions for $H$. From the practical point of view, in a real scattering experiment, the most important case is the one in which the initial state is of 2-cluster. Suppose in the remote past the pair $a = (i, j)$ forms a bound state with energy $\lambda^{a,n} < 0$ and eigenstate $\varphi^{a,n}(x^a)$. Then the generalized eigenfunction $M_a(x, \lambda, \omega_a)$ is written as
$$M_a(x, \lambda, \omega_a) = e^{i\sqrt{\lambda - \lambda^{a,n}}\,\omega_a\cdot x_a}\varphi^{a,n}(x^a) - v, \tag{8.1}$$
$$v = R(\lambda + i0)f, \tag{8.2}$$
$$f = \sum_{c \ne a}V_c(x^c)\varphi^{a,n}(x^a)e^{i\sqrt{\lambda - \lambda^{a,n}}\,\omega_a\cdot x_a}. \tag{8.3}$$
In our previous work [19], we derived asymptotic expansions of $v$ at infinity. However, the results were not satisfactory in that we separated 3-cluster scattering and 2-cluster scattering. Moreover, in Theorem 1.3 of [19], we multiplied $v$ by a technical localization factor $\psi_b(D_{x_b})$ (we wrote it as $\psi_\beta(D_{x_\beta})$), although it was removed in [20], Theorem 6.6. By virtue of the analysis of the present paper, we can derive a more transparent asymptotic expansion.

Theorem 8.4. Let $v$ be as in (8.2) and $\alpha = (a, \lambda^{a,n}, \varphi^{a,n})$. Then in the sense of (1.5)
\[ v \simeq C_0(\lambda)\, r^{-5/2} e^{i\sqrt{\lambda}\, r}\, \hat{S}_{0\alpha}(\lambda; \hat{x}, \omega_a) + \sum_{\beta} C_\beta(\lambda)\, r_b^{-1} e^{i\sqrt{\lambda - \lambda^{b,m}}\, r_b}\, A_{\beta\alpha}(\lambda; \theta_b, \omega_a) \otimes \varphi^{a,n}(x^a), \]
where $r = |x|$, $\hat{x} = x/r$, $r_b = |x_b|$, $\theta_b = x_b/r_b$,
\[ C_0(\lambda) = e^{-\pi i/4}\, 2\pi\, \lambda^{-1/4} (\lambda - \lambda^{a,n})^{-1/4}, \qquad C_\beta(\lambda) = 2\pi i\, (\lambda - \lambda^{a,n})^{-1/4} (\lambda - \lambda^{b,m})^{-1/4}, \]
and $A_{\beta\alpha}(\lambda, \theta_b, \omega_a)$ is the scattering amplitude associated with the scattering process in which, after the collision, the pair $b$ takes the bound state $\varphi^{b,m}(x^b)$ with eigenvalue $\lambda^{b,m}$.

This theorem follows from Theorem 6.7. The proof is omitted.

If the initial state is a 3-cluster state, the behavior of the generalized eigenfunction $M_0(x, \lambda, \theta)$ is much more complicated. In fact, Hassell [10] derived an asymptotic expansion of $M_0(x, \lambda, \theta) - e^{i\sqrt{\lambda}\, \theta \cdot x}$ away from singular directions, which contains, in addition to the expected term $r^{-5/2} e^{i\sqrt{\lambda}\, r} a_0(\hat{x})$, extra terms like $r^{-p} e^{i\varphi(\hat{x}) r} a(\hat{x})$ with $0 < p < 5/2$.
References
1. Agmon, S.: A representation theorem for solutions of Schrödinger type equations on non-compact Riemannian manifolds. Astérisque 210, 13–26 (1992)
2. Agmon, S. and Hörmander, L.: Asymptotic properties of solutions of differential equations with simple characteristics. J. d’Anal. Math. 30, 1–38 (1976)
3. Dereziński, J.: Asymptotic completeness for N-particle long-range quantum systems. Ann. of Math. 138, 427–476 (1993)
4. Dereziński, J. and Gérard, C.: Scattering Theory of Classical and Quantum N-particle Systems. Berlin–Heidelberg: Springer-Verlag, 1997
5. Enss, V.: Long-range scattering of two- and three-body systems with potentials of short and long-range. In: Journées “Equations aux dérivées partielles”, Saint Jean de Monts, Juin 1989, Publications Ecole Polytechnique, Palaiseau (1989)
6. Froese, R. and Herbst, I.: Exponential bounds and absence of positive eigenvalues for N-body Schrödinger operators. Commun. Math. Phys. 87, 429–447 (1982)
7. Gâtel, Y. and Yafaev, D.: On solutions of the Schrödinger equation with radiation conditions at infinity: the long-range case. Ann. Inst. Fourier Grenoble 49, 1581–1602 (1999)
8. Graf, G.M.: Asymptotic completeness for N-body short-range systems: A new proof. Commun. Math. Phys. 132, 73–101 (1990)
9. Gérard, C., Isozaki, H. and Skibsted, E.: Commutator algebra and resolvent estimates. In: Spectral and Scattering Theory and Applications, ed. K. Yajima, Advanced Studies in Pure Mathematics 23, 1994, pp. 69–82
10. Hassell, A.: Distorted plane waves for the 3-body Schrödinger operators. GAFA, Geom. Funct. Anal. 10, 1–50 (2000)
11. Hassell, A.: Scattering matrices for the quantum N body problem. To appear in Trans. Am. Math. Soc.
12. Hassell, A. and Vasy, A.: Symbolic functional calculus and N body resolvent estimates. J. Funct. Anal. 173, 257–283 (2000)
13. Helgason, S.: A duality for symmetric spaces with applications to group representations. Adv. in Math. 5, 1–154 (1970)
14. Helgason, S.: Eigenspaces of the Laplacian; Integral representations and irreducibility. J. Funct. Anal. 17, 328–353 (1974)
15. Herbst, I. and Skibsted, E.: Free channel Fourier transform in the long-range N-body problem. J. d’Anal. Math. 65, 297–332 (1995)
16. Ikebe, T. and Isozaki, H.: Completeness of modified wave operators for long-range potentials. Publ. RIMS. Kyoto Univ. 15, 679–718 (1979)
17. Isozaki, H.: On the long-range stationary wave operator. Publ. RIMS. Kyoto Univ. 13, 589–626 (1977)
18. Isozaki, H.: Structures of S-matrices for three-body Schrödinger operators. Commun. Math. Phys. 146, 241–258 (1992)
19. Isozaki, H.: Asymptotic properties of generalized eigenfunctions for three-body Schrödinger operators. Commun. Math. Phys. 153, 1–21 (1993)
20. Isozaki, H.: On N-body Schrödinger operators. Proc. Indian Acad. Sci. Math. Sci. 104, 667–703 (1993)
21. Isozaki, H.: A generalization of the radiation condition of Sommerfeld for N-body Schrödinger operators. Duke Math. J. 74, 557–584 (1994)
22. Isozaki, H. and Kitada, H.: A remark on the micro-local resolvent estimates for two-body Schrödinger operators. Publ. RIMS. Kyoto Univ. 35, 81–107 (1985)
23. Isozaki, H. and Kitada, H.: Scattering matrices for two-body Schrödinger operators. Scientific Papers of the College of Arts and Sciences, The University of Tokyo, 35, 81–107 (1985)
24. Jensen, A.: Propagation estimates for Schrödinger type operators. Trans. Am. Math. Soc. 291, 129–144 (1985)
25. Jensen, A. and Kato, T.: Spectral properties of Schrödinger operators and time-decay for wave functions. Duke Math. J. 46, 583–611 (1979)
26. Jensen, A. and Perry, P.: Commutator methods and Besov space estimates for Schrödinger operators. J. Operator Theory 14, 181–188
27. Melrose, R.B.: Geometric Scattering Theory. Cambridge: Cambridge University Press, 1995
28. Melrose, R.B. and Zworski, M.: Scattering metrics and geodesic flow at infinity. Inv. Math. 124, 389–436 (1996)
29. Murata, M.: Asymptotic expansions in time for solutions of Schrödinger type equations. J. Funct. Anal. 49, 10–56 (1982)
30. Saito, Y.: Spectral representation for Schrödinger operators with long-range potentials. Lecture Notes in Math. 727, Berlin: Springer-Verlag, 1979
31. Sigal, I.M. and Soffer, A.: The N-particle scattering problem: asymptotic completeness for short-range quantum systems. Ann. of Math. 125, 35–108 (1987)
32. Skibsted, E.: Smoothness of N-body scattering amplitudes. Rev. Math. Phys. 4, 279–300 (1992)
33. Skibsted, E.: Propagation estimates for N-body Schrödinger operators. Commun. Math. Phys. 142, 67–98 (1991)
34. Vasy, A.: Structure of the resolvent for three body potentials. Duke Math. J. 90, 379–434 (1997)
35. Vasy, A.: Asymptotic behavior of generalized eigenfunctions in N body scattering. J. Funct. Anal. 148, 170–184 (1997)
36. Vasy, A.: Scattering matrices in many-body scattering. Commun. Math. Phys. 200, 105–124 (1999)
37. Vasy, A.: The propagation of singularities in many-body scattering. To appear in Ann. Sci. Ec. Norm. Sup. (2001)
38. Wang, X.P.: Microlocal resolvent estimates for N-body Schrödinger operators. J. Fac. Sci. Univ. Tokyo Sect. IA, Math. 40, 337–385 (1993)
39. Yafaev, D.: On solutions of the Schrödinger equation with radiation conditions at infinity. Adv. in Sov. Math. 7, 179–204 (1991)
40. Yafaev, D.: Resolvent estimates and scattering matrix for N-particle Hamiltonians. Integ. Equat. Oper. Th. 21, 93–126
41. Yosida, K.: Functional Analysis. Berlin: Springer-Verlag, 1966

Communicated by H. Araki
Commun. Math. Phys. 222, 415 – 448 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Bootstrap Multiscale Analysis and Localization in Random Media
François Germinet1, Abel Klein2
1 UMR 8524 CNRS, UFR de Mathématiques, Université de Lille 1, 59655 Villeneuve d’Ascq Cédex, France. E-mail: [email protected]
2 Department of Mathematics, University of California, Irvine, Irvine, CA 92697-3875, USA. E-mail: [email protected]
Received: 29 November 2000 / Accepted: 21 June 2001
Abstract: We introduce an enhanced multiscale analysis that yields subexponentially decaying probabilities for bad events. For quantum and classical waves in random media, we obtain exponential decay for the resolvent of the corresponding random operators in boxes of side $L$ with probability higher than $1 - e^{-L^\zeta}$, for any $0 < \zeta < 1$. The starting hypothesis for the enhanced multiscale analysis only requires the verification of polynomial decay of the finite volume resolvent, at some sufficiently large scale, with probability bigger than $1 - \frac{1}{841^d}$ ($d$ is the dimension). Note that from the same starting hypothesis we get conclusions that are valid for any $0 < \zeta < 1$. This is achieved by the repeated use of a bootstrap argument. As an application, we use a generalized eigenfunction expansion to obtain strong dynamical localization of any order in the Hilbert–Schmidt norm, and better estimates on the behavior of the eigenfunctions.
Partially supported by NSF Grant DMS-9800883.

1. Introduction

Quantum and classical waves may be described by first or second order differential equations on a Hilbert space $\mathcal{H} = L^2(\mathbb{R}^d, dx; \mathbb{C}^n)$. Quantum waves are described by the Schrödinger equation:
\[ i \frac{\partial}{\partial t} \psi_t = H \psi_t, \tag{1.1} \]
while classical waves may be described by a second order wave equation with an auxiliary condition:
\[ \frac{\partial^2}{\partial t^2} \psi_t = -H \psi_t, \quad \text{with } \psi_t = P_H^\perp \psi_t. \tag{1.2} \]
In both cases $H$ is a self-adjoint operator on $\mathcal{H}$; for the wave equation we have $H \ge 0$ and $P_H^\perp$ is the orthogonal projection on the orthogonal complement of the kernel of $H$. Finite energy solutions of the first order equation (1.1) are of the form
\[ \psi_t = e^{-itH} \phi_0, \quad \phi_0 \in \mathcal{H}, \tag{1.3} \]
while finite energy solutions of the second order equation (1.2) are given by
\[ \psi_t = \cos\bigl(t\sqrt{H}\bigr) P_H^\perp \phi_0 + \sin\bigl(t\sqrt{H}\bigr) P_H^\perp \eta_0, \quad \phi_0, \eta_0 \in \mathcal{H}. \tag{1.4} \]

In this article we discuss questions concerning localization of waves in random media. A random medium will be modeled by a $\mathbb{Z}^d$-ergodic random self-adjoint operator $H_\omega$, where $\omega$ belongs to a probability space with a probability measure $\mathbb{P}$ (e.g., [37, 33, 9, 24, 25, 43, 12]). In this article such an $H_\omega$ will be simply called a random operator. It follows from ergodicity that there exists a nonrandom set $\Sigma$ such that $\sigma(H_\omega) = \Sigma$ with probability one, where $\sigma(A)$ denotes the spectrum of the operator $A$. In addition, the decomposition of $\sigma(H_\omega)$ into pure point spectrum, absolutely continuous spectrum, and singular continuous spectrum is also independent of the choice of $\omega$ with probability one [37, 49, 8, 13].

As an example one can consider the potential
\[ V_\omega(x) = \sum_{i \in \mathbb{Z}^d} \lambda_i(\omega)\, u(x - i), \tag{1.5} \]
where $u$ is a bounded nonnegative function with compact support, and the $\{\lambda_i(\omega)\}_{i \in \mathbb{Z}^d}$ are independent, identically distributed random variables (e.g., [33, 9, 45, 38, 39, 52]). The random Schrödinger operator is given by $H_\omega = -\Delta + \gamma V_\omega(x)$ (plus possibly a bounded periodic potential [45, 38]). The parameter $\gamma$ measures the amount of disorder. For classical waves, examples include Maxwell’s equations, and the equations of acoustics and elasticity, where the random Schrödinger-like operators in (1.2) are of the form $H_\omega = A_\omega^* A_\omega$ with $A_\omega = \sqrt{R_\omega}\, D \sqrt{S_\omega}$, where $D$ is a first order partial differential operator with constant coefficients, and $R_\omega$, $S_\omega$ are strictly positive matrix valued functions of the form $Y_0(x)(1 + \gamma V_\omega(x))^{\pm 1}$, with $Y_0(x)$ a periodic matrix valued function, not necessarily smooth, and $V_\omega(x)$ as in (1.5) [24, 25, 43, 12].

In this paper we restrict ourselves to continuum models. There is a vast literature on the Anderson model and other discrete random operators, e.g., [46, 26, 27, 20, 7, 13, 8, 2, 1, 3, 22, 23, 41, 16, 55]. Our methods and results also apply to discrete random operators, with the obvious modifications.

The main achievement of this paper is an enhanced multiscale analysis, the bootstrap multiscale analysis, which is stated in Theorem 3.4. In this context the multiscale analysis (MSA) is a technique, initially developed in [26, 27] and simplified in [19, 20] (see also [21, 40]), for the purpose of proving Anderson localization (pure point spectrum and exponential decay of eigenfunctions). It was later shown to also yield dynamical localization (non-spreading of the wave packets) [29], and more recently strong dynamical localization (dynamical localization not only with probability one, but in expectation) up to some order [14]. Our enhancement yields strong dynamical localization up to any order. In fact, it yields more: strong dynamical localization in the Hilbert–Schmidt norm.
The usual multiscale analyses, based on von Dreifus and Klein [20], give exponential decay of the resolvent on big boxes with side $L_k \nearrow \infty$, with probability close to 1 up to a polynomially small correction in $L_k$ (i.e., $\ge 1 - L_k^{-p}$ for a fixed $p > 0$). In
comparison, the bootstrap multiscale analysis we present here in Theorem 3.4 requires less in the starting hypotheses, and yields far better probability estimates. In fact, it gives any desired sub-exponential decay for the probabilities of bad events. An important new feature of the enhanced MSA is that the final probability estimates are independent of the probability estimate in the starting hypothesis. This is achieved by a repeated use of a bootstrap argument. Thus one may look for the weakest possible starting hypothesis without affecting the resulting probability estimates.

An important consequence of this bootstrap MSA is given in Theorem 3.8, which we paraphrase as follows. For a large class of random operators, if the bootstrap MSA starting hypothesis holds at a fixed energy $E_0$ ($E_0 > 0$ for Eq. (1.2)), then there exists $\delta_0 > 0$ such that, defining $I(\delta_0) = (E_0 - \delta_0, E_0 + \delta_0)$, one has
\[ \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_y \bigr\|_2^2 \Bigr) \le C_\zeta\, e^{-|x-y|^\zeta} \tag{1.6} \]
for any $0 < \zeta < 1$, where $C_\zeta$ is some finite constant depending only on $\zeta$ and on the parameters of the problem ($\chi_x$ stands for the characteristic function of a box of side 1 centered at $x$; the supremum is taken over Borel functions $f$ of a real variable, with $\|f\| = \sup_{t \in \mathbb{R}} |f(t)|$; $E_{H_\omega}(\cdot)$ denotes the spectral projection of the operator $H_\omega$; $\|B\|_2$ denotes the Hilbert–Schmidt norm of the operator $B$). It follows from (1.6), by a fairly straightforward calculation, that for any bounded region $\Lambda$ and all $q > 0$ we have
\[ \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| |X|^{q/2} f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_\Lambda \bigr\|_2^2 \Bigr) < \infty, \tag{1.7} \]
in which case we will say that the corresponding wave equation (either (1.1) or (1.2)) exhibits strong HS-dynamical localization in the energy interval $I(\delta_0)$. An immediate consequence is strong dynamical localization, meaning that for any finite energy solution $\psi_t$ (as in either (1.3) or (1.4)), we have
\[ \mathbb{E}\Bigl( \sup_t \bigl\| |X|^{q/2} E_{H_\omega}(I(\delta_0)) \psi_t \bigr\|^2 \Bigr) < \infty \tag{1.8} \]
for any $q > 0$ and Cauchy data (either $\phi_0 \in \mathcal{H}$ or $\phi_0, \eta_0 \in \mathcal{H}$) with compact support. (It actually follows from (1.6) that it suffices to have Cauchy data that decays faster than any polynomial in the $L^2$-sense, i.e., the local $L^2$-norms decay faster than any polynomial.)

An application of the results of this paper (the bootstrap multiscale analysis and its application to strong HS-dynamical localization) can be found in [32], where we show a discontinuity of the transport properties of the random media at the Anderson metal-insulator transition (if there is one).

Classical waves may be described by first order Schrödinger-like equations of the same form as (1.1), where the self-adjoint operator $H$ is a first order partial differential operator (see [42]); e.g., Maxwell equations. Such an equation yields two second order wave equations of the form (1.2). It turns out that the bootstrap MSA for one of these second order equations implies the estimate (1.6), and hence also (1.7) and (1.8), for the first order classical wave equation, as well as for the other second order wave equation (see [43]).

Dynamical localization is a term commonly used for the almost sure version of (1.8), i.e.,
\[ \sup_t \bigl\| |X|^{q/2} E_{H_\omega}(I(\delta_0)) \psi_t \bigr\|^2 < \infty \quad \text{for a.e. } \omega. \tag{1.9} \]
It was proven in [29] in the context of this paper (the proof is given for Schrödinger operators, but it is also applicable to classical waves). Dynamical localization implies pure point spectrum by the RAGE Theorem (e.g., the argument in [13, Theorem 9.21]), but the converse is not true. Dynamical localization is actually a strictly stronger notion than pure point spectrum, since the latter can occur while quasi-ballistic motion is observed [18]. The question “what is localization?” was raised in [17], and the last decade has seen many contributions to this subject [18, 29, 28, 3, 53].

In the discrete case, the first results on dynamical localization are due to Jona-Lasinio, Martinelli and Scoppola [35] for a hierarchical model, and to Martinelli and Scoppola [47] for the Anderson model. For the latter, with a bounded, absolutely continuous probability distribution for the single site potential, the Aizenman–Molchanov approach [1, 3] gives strong dynamical localization (in fact, it gives exponential decay in (1.6)). But where the Aizenman–Molchanov approach does not apply (e.g., random operators on the continuum, the Anderson model with a Bernoulli potential in one dimension), dynamical localization has been harder to prove.

In the continuum the first results were obtained by Holden and Martinelli [33], who proved subdiffusive motion for random Schrödinger operators (for the time averaged second moment). Recently, Barbaroux et al. [5] showed the absence of diffusion for the time averaged second moment. The search for a proof of dynamical localization in the continuum ended when Germinet and De Bièvre [29] proved (almost sure) dynamical localization whenever the MSA applied. More recently, Damanik and Stollmann [14] extended the analysis in [29] to prove partial strong dynamical localization. In fact, they proved partial strong operator-dynamical localization. The word partial refers to the fact that they obtain (1.8) and (1.10) for all $q < q_0$, for some $q_0 < \infty$ that depends on the parameters of the problem - the disorder, the energy interval where the result takes place, etc. By strong operator-dynamical localization on an interval $I$ we mean (compare with (1.7))
\[ \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| |X|^{q/2} f(H_\omega) E_{H_\omega}(I) \chi_\Lambda \bigr\| \Bigr) < \infty. \tag{1.10} \]
We note that the $q < q_0$ limitation comes from the fact that they only had at their disposal the usual MSA. Theorem 3.4 below is sufficient to push their analysis to full strong operator-dynamical localization, i.e., (1.10) for all $q > 0$.

In this article we propose an alternative, and quite natural, way to get strong dynamical localization (see also [30]), which yields strong HS-dynamical localization. As in [28], our method uses a generalized eigenfunction expansion (see Sect. 2.3) to exploit the fruits of the bootstrap MSA, instead of resorting to centers of localization as in [29, 14]. (See Remark 4.3.)

Let us give an idea of our enhancement of the multiscale analysis. Roughly, the usual MSA works as follows: fix $p > 0$ and a mass $m > 0$. In a way, the parameter $p$ determines how good the final result will be. A box $\Lambda_L(x)$ is said to be regular at an energy $E$ if the resolvent on that box, $R_{\Lambda_L(x)}$, sandwiched between the center of the box and its boundary, is smaller than $e^{-mL/2}$ (see Sect. 2.1 and Definition 3.2):
\[ \bigl\| \Gamma_{x,L} R_{x,L}(E) \chi_{x,L/3} \bigr\|_{x,L} \le e^{-m\frac{L}{2}}. \tag{1.11} \]
The basic result of the usual MSA is to provide an energy interval $I(p)$ and a sequence of scales $L_k \nearrow \infty$, such that the probability of getting regular boxes at the scale $L_k$ at energies $E \in I(p)$ is greater than $1 - L_k^{-p}$ (for precise statements we refer the reader
to Sects. 3 and 5.1). But this process can only take place once the first step (starting hypothesis) is proven to hold. To achieve the first step, typically one has to work at either the edge of a gap in the spectrum, or at high disorder, or at low energy. The point is that the parameters of the operator (disorder, energy, . . . ) are fixed depending on $p$ to satisfy this starting hypothesis. As a consequence, in the usual MSA, as $p \to \infty$, either:
(i) at the edge of a gap in the spectrum, the interval $I(p)$ where localization holds will shrink to nothing;
(ii) to obtain localization in a specified interval, the required disorder $\gamma = \gamma(p)$ increases to $\infty$;
(iii) at low energy, the energy at which we see localization diverges.
If one is interested in the decay of the kernel of the semigroup $e^{-iH_\omega t}$ as in (1.6), then this link between the rate of decay of the probability and the region in the diagram energy × disorder where the conclusions hold is unfortunate, and limits the scope of the result that can be obtained. More precisely, in that context the usual MSA can only provide results of the type:
(i) at the edge of a gap in the spectrum (fixed disorder $\gamma$), there exists an interval $I(p)$, shrinking to nothing as $p \to \infty$, and a finite constant $C_p$, so that
\[ \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(p)) \chi_y \bigr\| \Bigr) \le \frac{C_p}{1 + |x - y|^p}; \tag{1.12} \]
(ii) in a pre-specified interval $I$, there exist $\gamma(p) \to \infty$ as $p \to \infty$, so that if $\gamma \ge \gamma(p)$, (1.12) holds on $I$;
(iii) at low energy, there exists $E_p \to \infty$ as $p \to \infty$, such that (1.12) holds on compact intervals $I \subset (-\infty, -E_p)$, and, in the discrete case, also if $I \subset (E_p, \infty)$.
These should be compared to (1.6) above, where the desired decay does not affect the starting hypothesis.

In this article we show that once the MSA is performed for one $p$, then by a bootstrap argument it can be done for any $p'$, on the same interval $I(p)$ and with the same disorder (and of course for the same starting hypothesis). Since this in turn means that the starting hypothesis does not affect the strength of the conclusions, another way to take advantage of this new possibility is to start with the weakest possible starting hypothesis. In a companion paper [31] we shall explore this fact in more detail and, in particular, propose a finite volume criterion that may be implemented numerically. We notice that this type of finite volume criterion is fairly close to results obtained recently in [3], a paper that deals with the discrete setting. Moreover, in [3] polynomial decay (of the averaged fractional resolvent) is shown to imply exponential decay. Here, if polynomial decay holds, then any sub-exponential decay follows.

To perform the bootstrap MSA we take advantage of two kinds of multiscale analyses: one where length scales grow by multiplication by a fixed factor: $L_{k+1} = Y L_k$, $Y > 1$, and another with exponentially growing length scales: $L_{k+1} = L_k^\alpha$, $\alpha > 1$. Previous proofs yielded only a pre-fixed polynomial decay of the probabilities of bad events (i.e., $\frac{1}{L_k^p}$ for a pre-fixed $p > 0$). In this context, the MSA with exponential growth of length scales is well known; it was put in the present form by von Dreifus [19] and von Dreifus and Klein [20], simplifying the work of Fröhlich and Spencer [26] and Fröhlich, Martinelli, Scoppola and Spencer [27]. The MSA with multiplicative growth of length
scales is less well known and was developed by Figotin and Klein in [24, 25], using ideas of Spencer [51], to improve the starting hypothesis of the MSA, and thereby to weaken the hypotheses of their theorems. We extend both types of multiscale analyses to obtain sub-exponential decay of the probabilities of bad events (i.e., $e^{-L_k^\zeta}$ for all $0 < \zeta < 1$). A new ingredient in our extension of the MSA with exponentially growing length scales is that we allow the number of bad boxes to grow with the scale. (The bad boxes are controlled by a Wegner estimate in the usual way.) All these multiscale analyses have differing starting hypotheses, the weakest belonging to the Figotin and Klein MSA, which only requires that at some sufficiently large scale we can verify polynomial decay of the finite volume resolvent with some minimal probability, independent of the scale. (In this article we show that a probability bigger than $1 - \frac{1}{841^d}$ suffices.) It is by successively performing the four multiscale analyses, feeding the results of one into the next, thus doing a bootstrap multiscale analysis, that we are able to go from the weakest starting hypothesis to the strongest conclusions. Combining the results of this bootstrap multiscale analysis with the generalized eigenfunction expansion leads to (1.6).

The paper is organized as follows. In Sect. 2 we present, as assumptions, the properties of the random operator $H_\omega$ that are required for the multiscale analysis and its applications. In Sect. 3 we state the main results of this paper, namely the bootstrap multiscale analysis (Theorem 3.4), and its application to various manifestations of localization: sub-exponential decay of the kernel of $f(H_\omega)$ (Theorem 3.8), strong HS-dynamical localization (Corollary 3.10), and a SULE property (Theorem 3.11). In Sect. 4 we assume Theorem 3.4 and prove Theorem 3.8, Corollary 3.10 and Theorem 3.11. In Sect. 5 we discuss the four multiscale analyses (Theorems 5.1, 5.2, 5.6, and 5.7) that are used in the bootstrap multiscale analysis. In Sect. 6 we prove Theorem 3.4.

2. Requirements of the Multiscale Analysis

2.1. Finite volume. Throughout this paper we use the sup norm in $\mathbb{R}^d$:
\[ |x| = \max\{|x_i|,\ i = 1, \dots, d\}. \tag{2.1} \]
By $\Lambda_L(x)$ we denote the open box (or cube) of side $L > 0$:
\[ \Lambda_L(x) = \{ y \in \mathbb{R}^d ;\ |y - x| < L/2 \}, \tag{2.2} \]
and by $\overline{\Lambda}_L(x)$ the closed box. In this article we will always take boxes centered at sites $x \in \mathbb{Z}^d$ with side $L \in 2\mathbb{N}$. Very often we will require $L \in 6\mathbb{N}$; given $K \ge 6$, we set
\[ [K]_{6\mathbb{N}} = \max\{ L \in 6\mathbb{N} ;\ L \le K \}. \tag{2.3} \]
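The conventions (2.1)–(2.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sup_norm`, `in_open_box` and `floor_6N` are hypothetical and do not appear in the paper.

```python
from typing import Sequence


def sup_norm(x: Sequence[float]) -> float:
    """Sup norm |x| = max_i |x_i| on R^d, as in (2.1)."""
    return max(abs(c) for c in x)


def in_open_box(y: Sequence[float], x: Sequence[float], L: float) -> bool:
    """True if y lies in the open box of side L centered at x, as in (2.2)."""
    return sup_norm([yi - xi for yi, xi in zip(y, x)]) < L / 2


def floor_6N(K: float) -> int:
    """Largest multiple of 6 not exceeding K, i.e. [K]_{6N} in (2.3); needs K >= 6."""
    if K < 6:
        raise ValueError("K must be at least 6")
    return 6 * int(K // 6)
```

For instance, `floor_6N(17)` returns 12, and the point (2, 0) is not in the open box of side 4 centered at the origin, since its sup norm equals L/2.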
The operator $H_{x,L}$ is defined as the restriction of $H$, either to the open box $\Lambda_L(x)$ with Dirichlet boundary condition, or to the closed box $\overline{\Lambda}_L(x)$ with periodic boundary condition. (We consistently work with either Dirichlet or periodic boundary condition.) We write $R_{x,L} = (H_{x,L} - z)^{-1}$ for its resolvent. By $\| \cdot \|_{x,L}$ we denote the norm or the operator norm on $L^2(\Lambda_L(x), dx; \mathbb{C}^n)$. The characteristic function of a set $\Lambda \subset \mathbb{R}^d$ is denoted by $\chi_\Lambda$. If $x \in \mathbb{R}^d$ and $\ell > 0$, we let
\[ \chi_{x,\ell} = \chi_{\Lambda_\ell(x)}, \qquad \chi_x = \chi_{x,1} = \chi_{\Lambda_1(x)}. \tag{2.4} \]
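Given a box $\Lambda_L(x)$, we set
\[ \Upsilon_L(x) = \Bigl\{ y \in \mathbb{Z}^d ;\ |y - x| = \frac{L}{2} - 1 \Bigr\}, \tag{2.5} \]
and define its (boundary) belt by
\[ \tilde{\Upsilon}_L(x) = \overline{\Lambda}_{L-1}(x) \setminus \Lambda_{L-3}(x) = \bigcup_{y \in \Upsilon_L(x)} \overline{\Lambda}_1(y) ; \tag{2.6} \]
it has the characteristic function
\[ \Gamma_{x,L} = \chi_{\tilde{\Upsilon}_L(x)} = \sum_{y \in \Upsilon_L(x)} \chi_y \quad \text{a.e.} \tag{2.7} \]
Note that
\[ |\Upsilon_L(x)| = (L - 1)^d - (L - 2)^d = d \int_{L-2}^{L-1} x^{d-1}\, dx \le d (L - 1)^{d-1}. \tag{2.8} \]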
We shall suppress the dependency of a box on its center when not necessary. When using boxes $\Lambda_\ell$ contained in bigger boxes $\Lambda_L$, we shall need to know that the small box is inside the belt $\tilde{\Upsilon}_L$ of the bigger one. We thus introduce the following definition.

Definition 2.1. Let $L > \ell + 3$ and $x \in \mathbb{Z}^d$. We say that $\Lambda_\ell \sqsubset \Lambda_L(x)$ if $\Lambda_\ell \subset \Lambda_{L-3}(x)$.

2.2. Properties of random operators. We present here, as assumptions, the properties of the random operator $H_\omega$ that are required for the multiscale analysis and its applications. These properties are routinely verified for the random operators of interest [26, 27, 20, 33, 9, 24, 25, 42, 43, 52]. The bootstrap multiscale analysis has the same requirements as the usual one. We fix a compact interval $I_0$ and an open interval $\tilde{I}_0 \supset I_0$ (we always take $\tilde{I}_0 \subset (0, \infty)$ for Eq. (1.2)).

2.2.1. Deterministic assumptions. The deterministic assumptions are supposed to hold almost surely, with non-random constants. We omit $\omega$ from the notation. The first assumption is reminiscent of the Simon–Lieb inequality (SLI) in Classical Statistical Mechanics. It relates resolvents in different scales. In the discrete case it is an immediate consequence of the resolvent identity; in this context it was originally used in [26]. In the continuum, its proof requires internal estimates. For Schrödinger operators it was proved in [9]; it was adapted to classical wave operators in [24]. We state it as in [42, Lemma 3.8]. In the continuum, the constant $\gamma_{I_0}$ below is typically of the form $\gamma_{I_0} = \sup_{E \in I_0} \gamma_E$, with $\gamma_E = \mathrm{const}\,(1 + |E|)^{1/2}$, where the nonrandom constant depends only on nonrandom parameters of the operator (dimension $d$, bounds on coefficients or potential – see [42, Eq. (3.80)] for an explicit expression for classical wave operators; a similar expression holds for Schrödinger operators).
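Assumption SLI. There exists a finite constant $\gamma_{I_0}$ such that, given $L, \ell', \ell'' \in 2\mathbb{N}$, $x, y, y' \in \mathbb{Z}^d$ with $\Lambda_{\ell''}(y) \sqsubset \Lambda_{\ell'}(y') \sqsubset \Lambda_L(x)$, then for any $E \in I_0$ with $E \notin \sigma(H_{x,L}) \cup \sigma(H_{y',\ell'})$ we have
\[ \| \Gamma_{x,L} R_{x,L}(E) \chi_{y,\ell''} \|_{x,L} \le \gamma_{I_0}\, \| \Gamma_{y',\ell'} R_{y',\ell'}(E) \chi_{y,\ell''} \|_{y',\ell'}\, \| \Gamma_{x,L} R_{x,L}(E) \Gamma_{y',\ell'} \|_{x,L}. \tag{2.9} \]

Assumption SLI will be used in the following way: We will take $\ell'' = \ell/3$ with $\ell \in 6\mathbb{N}$, and $\ell' = k\ell/3$ with $3 \le k \in \mathbb{N}$. By a cell we will mean a closed box $\overline{\Lambda}_{\ell/3}(y'')$, with $y'' \in \frac{\ell}{6}\mathbb{Z}^d$. We define $\mathbb{Z}_{even}$ and $\mathbb{Z}_{odd}$ to be the sets of even and odd integers. We take $y \in \frac{\ell}{6}\mathbb{Z}^d$, so $\chi_{y,\ell/3}$ is the characteristic function of a cell. We want the closed box $\overline{\Lambda}_{\ell'}(y')$ to be exactly covered by cells (in effect, by $k^d$ cells); thus we specify $y' \in \frac{\ell}{3}\mathbb{Z}^d = \frac{\ell}{6}\mathbb{Z}^d_{even}$ if $k$ is odd, and $y' \in \frac{\ell}{3}\mathbb{Z}^d + \frac{\ell}{6}(1, 1, \dots, 1) = \frac{\ell}{6}\mathbb{Z}^d_{odd}$ if $k$ is even. We then replace the boundary belt $\tilde{\Upsilon}_{\ell'}(y')$ (of width 1) by a thicker belt $\tilde{\Upsilon}_{\ell',\ell}(y')$ of width $\ell/3$. To do so, we set
\[ \Upsilon_{\ell',\ell}(y') = \Bigl\{ y'' \in \frac{\ell}{3}\mathbb{Z}^d ;\ |y'' - y'| = \frac{\ell'}{2} - \frac{\ell}{6} \Bigr\}, \tag{2.10} \]
and define the boundary $\ell$-belt of $\Lambda_{\ell'}(y')$ by
\[ \tilde{\Upsilon}_{\ell',\ell}(y') = \overline{\Lambda}_{\ell'}(y') \setminus \Lambda_{\ell' - 2\ell/3}(y') = \bigcup_{y'' \in \Upsilon_{\ell',\ell}(y')} \overline{\Lambda}_{\ell/3}(y''), \tag{2.11} \]
with characteristic function
\[ \Gamma_{y',\ell',\ell} = \chi_{\tilde{\Upsilon}_{\ell',\ell}(y')} = \sum_{y'' \in \Upsilon_{\ell',\ell}(y')} \chi_{y'',\ell/3} \quad \text{a.e.} \tag{2.12} \]
Note that
\[ |\Upsilon_{\ell',\ell}(y')| = k^d - (k - 2)^d \le k^d. \tag{2.13} \]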
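The count in (2.13) can be checked by brute force. The sketch below is an illustration only: it assumes the $k^d$ cells covering the box are labeled by integer multi-indices in $\{0, \dots, k-1\}^d$ (a labeling introduced here for convenience, not used in the paper), and `boundary_cells` is a hypothetical helper name. A cell belongs to the thick belt exactly when at least one of its indices is extremal.

```python
from itertools import product


def boundary_cells(k: int, d: int) -> list:
    """Cells of a k^d grid that touch the boundary of the big box.

    Cells are labeled by multi-indices in {0, ..., k-1}^d (an assumed
    labeling); a cell is in the belt if some index equals 0 or k-1.
    """
    return [
        cell
        for cell in product(range(k), repeat=d)
        if any(i in (0, k - 1) for i in cell)
    ]


# Compare the brute-force count with k^d - (k-2)^d for a few small cases.
for k, d in [(3, 1), (3, 2), (4, 2), (5, 3)]:
    assert len(boundary_cells(k, d)) == k**d - (k - 2) ** d
```

For $k = 3$ every cell except the central one lies in the belt, so the count is $3^d - 1$.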
Since $\Gamma_{y',\ell',\ell}\, \Gamma_{y',\ell'} = \Gamma_{y',\ell'}$, the projection $\Gamma_{\ell'}$ on the belt of $\Lambda_{\ell'}$ can be replaced by the projection over the thicker belt of width $\ell/3$, which can be decomposed in boxes of side $\ell/3$. Thus (2.9) yields
\[ \| \Gamma_{x,L} R_{x,L}(E) \chi_{y,\ell/3} \|_{x,L} \le \gamma_{I_0}\, k^d\, \| \Gamma_{y',\ell'} R_{y',\ell'}(E) \chi_{y,\ell/3} \|_{y',\ell'}\, \| \Gamma_{x,L} R_{x,L}(E) \chi_{y'',\ell/3} \|_{x,L}, \tag{2.14} \]
for some $y'' \in \Upsilon_{\ell',\ell}(y')$. We will say that, after performing the SLI, i.e., using the estimate (2.14), we moved from the cell $\overline{\Lambda}_{\ell/3}(y)$ to the cell $\overline{\Lambda}_{\ell/3}(y'')$.

Remark 2.2. While performing a multiscale analysis we will use (2.14) with either $\ell' = \ell$ (for good boxes), or some $\ell' = k\ell/3$, $k > 3$, which will be the side of a bad box. Note that in the first case, $k = 3$, and the geometric factor is $3^d - 1 \le 3^d$. In that case note also that we must have $y' = y$ and $|y'' - y| = \ell/3$, so after performing the SLI we moved to an adjacent cell, i.e., by $\ell/3$ in the sup norm. (Recall that we are using the sup norm in $\mathbb{R}^d$, so we may move both sidewise and along the diagonals.)
The second assumption estimates generalized eigenfunctions (see Sect. 2.3 for a precise definition) in terms of finite volume resolvents. It is not needed for the multiscale analysis, but it plays an important role in obtaining localization from the multiscale analysis [27, 20]. We call it an eigenfunction decay inequality (EDI), since it translates decay of finite volume resolvents into decay of generalized eigenfunctions ; we present it as in [42, Lemma 3.9]. It is closely related to the SLI, the proofs being very similar. Assumption EDI. There exists a finite constant γ˜I0 such that, given a generalized eigenfunction ψ of H with generalized eigenvalue E ∈ I0 , we have for any x ∈ Zd and L ∈ 2N with E ∈ / σ (Hx,L ) that χx ψ ≤ γ˜I0 3x,L Rx,L (E)χx x,L 3x,L ψ.
(2.15)
Typically we have $\tilde\gamma_{I_0} = \gamma_{I_0}$, with $\gamma_{I_0}$ as in (2.9). We will use the following consequence of (2.15):
\[
  \|\chi_x \psi\| \le \gamma'_{I_0}\, L^{d-1}\, \|\Gamma_{x,L} R_{x,L}(E) \chi_x\|_{x,L}\, \|\chi_y \psi\|
  \tag{2.16}
\]
for some $y \in \Upsilon_L(x)$, with $\gamma'_{I_0} = d\,\tilde\gamma_{I_0}$.

2.2.2. Probabilistic assumptions. The first probabilistic assumption is independence at a distance (IAD). It says that if boxes are far apart, events related to the restrictions of the random operator $H_\omega$ to these boxes are independent. We say that an event is based on the box $\Lambda_L(x)$ if it is determined by conditions on the restriction $H_{\omega,x,L}$. Given $\varrho > 0$, we say that two boxes $\Lambda_L(x)$ and $\Lambda_{L'}(x')$ are $\varrho$-nonoverlapping if $|x - x'| > \frac{L + L'}{2} + \varrho$ (i.e., if $d(\Lambda_L(x), \Lambda_{L'}(x')) > \varrho$).

Assumption IAD. There exists $\varrho > 0$ such that events based on $\varrho$-nonoverlapping boxes are independent.

The second probabilistic assumption is an “a priori” estimate on the average number of eigenvalues (NE) of finite volume random operators in a fixed, bounded interval. It is usually proved by a deterministic argument, using the well known bound for the Laplacian [9, 24, 25, 42]. It is, of course, entirely obvious in the discrete case.

Assumption NE. There exists a finite constant $C_{I_0}$ such that
\[
  \mathbb{E}\bigl( \operatorname{tr}_{\mathcal{H}} E_{H_{\omega,x,L}}(\tilde I_0) \bigr) \le C_{I_0} L^d
  \tag{2.17}
\]
for all $x \in \mathbb{Z}^d$ and $L \in 2\mathbb{N}$.

The final probabilistic assumption is a form of Wegner’s estimate (W), a probabilistic estimate on the size of the resolvent. It is a crucial ingredient for the MSA, where it is used to control the bad regions.

Assumption W. For some $b \ge 1$ there exists a constant $Q_{I_0} < \infty$, such that
\[
  \mathbb{P}\bigl\{ \operatorname{dist}(\sigma(H_{\omega,x,L}), E) \le \eta \bigr\} \le Q_{I_0}\, \eta\, L^{bd}
  \tag{2.18}
\]
for all $E \in \tilde I_0$, $\eta > 0$, $x \in \mathbb{Z}^d$, and $L \in 2\mathbb{N}$.
F. Germinet, A. Klein
Remark 2.3. In the continuum one usually proves the stronger estimate [33, 9, 10, 4, 24, 25, 43, 11]:
\[
  \mathbb{E}\bigl( \operatorname{tr}_{\mathcal{H}} E_{H_{\omega,x,L}}([E - \eta, E + \eta]) \bigr) \le Q_{I_0}\, \eta\, L^{bd},
  \tag{2.19}
\]
from which (2.18) follows by Chebychev’s inequality. The estimate (2.17) is used as an “a priori” estimate in the proof of (2.19).

Remark 2.4. In practice we have either $b = 1$ or $b = 2$ in the Wegner estimate (2.18). For random Schrödinger operators with Anderson potential we may have $b = 1$ [9, 45, 4] (including the Landau Hamiltonian). For classical waves in random media, (2.18) has been proven with $b = 2$ [54, 24, 25, 43]. Very recently the correct volume dependency (i.e., $b = 1$) in gaps of the unperturbed operator was obtained in [11], at the price of losing a bit in the $\eta$ dependency. In this paper, we shall use (2.18) as stated, the modifications in our methods required for the other forms of (2.18) being obvious. Our methods may also accommodate Assumptions NE and W being valid only for large $L$, and/or Assumption W being valid only for $\eta < \eta_L$ for some appropriate $\eta_L$, say $\eta_L = L^{-r}$, some $r > 0$, or $\eta_L = e^{-L^{\beta}}$ for some $0 < \beta < 1$. The latter is of importance if one wants to deal with singular probability measures like Bernoulli [7, 36, 16].

2.3. Generalized eigenfunction expansion. Generalized eigenfunction expansions were originally developed for elliptic partial differential operators with smooth coefficients (we refer to Berezanskii’s book [6]). These expansions were extended to Schrödinger operators with singular potentials (Simon [50] and references therein), and to classical wave operators with nonsmooth coefficients by Klein, Koines and Seifert [44]. These expansions construct polynomially bounded generalized eigenfunctions for a set of generalized eigenvalues with full spectral measure. These generalized eigenfunctions were used by Pastur [48] and by Martinelli and Scoppola [47] to prove that certain Schrödinger operators with random potentials have no absolutely continuous spectrum.
They played a crucial role in the work by Fröhlich, Martinelli, Spencer and Scoppola [27] and by von Dreifus and Klein [20] on Anderson localization of random Schrödinger operators, providing the crucial link between the multiscale analysis and pure point spectrum: the exponential decay of finite volume Green’s functions (obtained by a multiscale analysis) forces polynomially bounded generalized eigenfunctions to be bona fide eigenfunctions, so the spectrum is at most countable and hence pure point. In this article we go further and, as in [28, 30], use the generalized eigenfunction expansion itself (not just the existence of polynomially bounded generalized eigenfunctions) to provide the link between the multiscale analysis and strong HS-dynamical localization (and hence pure point spectrum).

We will state the existence of a generalized eigenfunction expansion as an assumption. Proofs of such an assumption are provided in [50, 44] for Schrödinger operators and classical wave operators. We follow the presentation in [44].

Let $T$ be the operator on $\mathcal{H}$ given by multiplication by the function $(1 + |x|^2)^{\nu}$, where $\nu > d/4$. We define the weighted spaces $\mathcal{H}_\pm$ as follows:
\[
  \mathcal{H}_\pm = L^2\bigl(\mathbb{R}^d, (1 + |x|^2)^{\pm 2\nu}\, dx;\ \mathbb{C}^n\bigr).
  \tag{2.20}
\]
$\mathcal{H}_-$ is a space of polynomially $L^2$-bounded functions. The sesquilinear form
\[
  \langle \phi_1, \phi_2 \rangle_{\mathcal{H}_+, \mathcal{H}_-} = \int \overline{\phi_1(x)} \cdot \phi_2(x)\, dx,
  \tag{2.21}
\]
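where $\phi_1 \in \mathcal{H}_+$ and $\phi_2 \in \mathcal{H}_-$, makes $\mathcal{H}_+$ and $\mathcal{H}_-$ conjugate duals to each other. By $O^\dagger$ we will denote the adjoint of an operator $O$ with respect to this duality. By construction, $\mathcal{H}_+ \subset \mathcal{H} \subset \mathcal{H}_-$, the natural injections $\imath_+ : \mathcal{H}_+ \to \mathcal{H}$ and $\imath_- : \mathcal{H} \to \mathcal{H}_-$ being continuous with dense range, with $\imath_+^\dagger = \imath_-$. The operators $T_+ : \mathcal{H}_+ \to \mathcal{H}$ and $T_- : \mathcal{H} \to \mathcal{H}_-$, defined by $T_+ = T \imath_+$, $T_- = \imath_- T$ on $D(T)$, are unitary with $T_- = T_+^\dagger$. The map $\tau : B(\mathcal{H}) \to B(\mathcal{H}_+, \mathcal{H}_-)$, with $\tau(C) = T_- C T_+$, is a Banach space isomorphism, as $T_\pm$ are unitary operators. ($B(\mathcal{H}_1, \mathcal{H}_2)$ denotes the Banach space of bounded operators from $\mathcal{H}_1$ to $\mathcal{H}_2$, $B(\mathcal{H}) = B(\mathcal{H}, \mathcal{H})$.) If $1 \le q < \infty$, we define $\mathcal{T}_q(\mathcal{H}_+, \mathcal{H}_-) = \tau(\mathcal{T}_q(\mathcal{H}))$, where $\mathcal{T}_q(\mathcal{H})$ denotes the Banach space of bounded operators $S$ on $\mathcal{H}$ with $\|S\|_q = (\operatorname{tr} |S|^q)^{1/q} < \infty$. By construction, $\mathcal{T}_q(\mathcal{H}_+, \mathcal{H}_-)$, equipped with the norm $\|B\|_q = \|\tau^{-1}(B)\|_q$, is a Banach space isomorphic to $\mathcal{T}_q(\mathcal{H})$, with $\mathcal{T}_2(\mathcal{H}_+, \mathcal{H}_-)$ being the usual Hilbert space of Hilbert–Schmidt operators from $\mathcal{H}_+$ to $\mathcal{H}_-$. Note that
\[
  \|\chi_{x,L}\|_{\mathcal{H}, \mathcal{H}_+} = \|\chi_{x,L}\|_{\mathcal{H}_-, \mathcal{H}} \le C_{L,\nu}\, (1 + |x|^2)^{\nu}
  \tag{2.22}
\]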
for all $x \in \mathbb{R}^d$ and $L > 0$, with $C_{L,\nu}$ a finite constant depending only on $L$ and $\nu$. (Given an operator $B : \mathcal{H}_1 \to \mathcal{H}_2$, $\|B\|_{\mathcal{H}_1, \mathcal{H}_2}$ will denote its operator norm.)

The following assumption guarantees the existence of a generalized eigenfunction expansion (GEE) with the right properties (see [44] for details). Recall that $P_H^\perp$ is the orthogonal projection on the orthogonal complement of the kernel of $H$ in the case of classical waves; for convenience we let it be the identity operator in the case of the Schrödinger equation. Note also that we fix $\nu > d/4$ and use the corresponding operator $T$ and weighted spaces $\mathcal{H}_\pm$ as in (2.20).

Assumption GEE. Fix $\nu > d/4$. The set $D_+^\omega := \{\phi \in D(H_\omega) \cap \mathcal{H}_+,\ H_\omega \phi \in \mathcal{H}_+\}$ is dense in $\mathcal{H}_+$ and an operator core for $H_\omega$, with probability one. There exists a bounded, continuous function $f$, strictly positive on the spectrum of $H_\omega$, such that
\[
  \operatorname{tr}_{\mathcal{H}}\bigl( T^{-1} f(H_\omega) P_{H_\omega}^\perp T^{-1} \bigr) < \infty
  \tag{2.23}
\]
with probability one.

A measurable function $\psi : \mathbb{R}^d \to \mathbb{C}^n$ is said to be a generalized eigenfunction of $H_\omega$ with generalized eigenvalue $\lambda$, if $\psi \in \mathcal{H}_-$ and
\[
  \langle H_\omega \phi, \psi \rangle_{\mathcal{H}_+, \mathcal{H}_-} = \lambda \langle \phi, \psi \rangle_{\mathcal{H}_+, \mathcal{H}_-} \quad \text{for all } \phi \in D_+^\omega .
\]
It follows from GEE that if a generalized eigenfunction is in $\mathcal{H}$, then it is a bona fide eigenfunction. If GEE holds, for almost every $\omega$ we have
\[
  \operatorname{tr}_{\mathcal{H}}\bigl( T^{-1} E_{H_\omega}(J) P_{H_\omega}^\perp T^{-1} \bigr) < +\infty
  \tag{2.24}
\]
for all bounded Borel sets $J$. Thus, with probability one,
\[
  \mu_\omega(J) = \operatorname{tr}_{\mathcal{H}}\bigl( T^{-1} E_{H_\omega}(J) P_{H_\omega}^\perp T^{-1} \bigr)
  \tag{2.25}
\]
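is a spectral measure for the restriction of $H_\omega$ to the Hilbert space $P_{H_\omega}^\perp \mathcal{H}$, with
\[
  \mu_\omega(J) < \infty \quad \text{for } J \text{ bounded}.
  \tag{2.26}
\]
In particular, we have a generalized eigenfunction expansion for $H_\omega$: with probability one, there exists a $\mu_\omega$-locally integrable function $P_\omega(\lambda)$ from the real line into $\mathcal{T}_1(\mathcal{H}_+, \mathcal{H}_-)$, with
\[
  P_\omega(\lambda) = P_\omega(\lambda)^\dagger
  \tag{2.27}
\]
and
\[
  \operatorname{tr}_{\mathcal{H}}\bigl( T_-^{-1} P_\omega(\lambda) T_+^{-1} \bigr) = 1 \quad \text{for } \mu_\omega\text{-a.e. } \lambda,
  \tag{2.28}
\]
such that
\[
  \imath_- E_{H_\omega}(J) P_{H_\omega}^\perp \imath_+ = \int_J P_\omega(\lambda)\, d\mu_\omega(\lambda) \quad \text{for bounded Borel sets } J,
  \tag{2.29}
\]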
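where the integral is the Bochner integral of $\mathcal{T}_1(\mathcal{H}_+, \mathcal{H}_-)$-valued functions. Moreover, if $\phi \in \mathcal{H}_+$, then $P_\omega(\lambda)\phi \in \mathcal{H}_-$ is a generalized eigenfunction of $H_\omega$ with generalized eigenvalue $\lambda$, for $\mu_\omega$-almost every $\lambda$.

The following lemma will play an important role in our proof of strong HS-dynamical localization. Note that the constant $C$ in (2.30) is independent of $\lambda$, and that $\|\cdot\|_1$ denotes the trace norm in $\mathcal{H}$.

Lemma 2.5. Under Assumption GEE, we have, with probability one, that for $\mu_\omega$-almost every $\lambda$,
\[
  \|\chi_x P_\omega(\lambda) \chi_y\|_1 \le C\, (1 + |x|^2)^{\nu} (1 + |y|^2)^{\nu}
  \tag{2.30}
\]
for all $x, y \in \mathbb{R}^d$, with $C$ a finite constant independent of $\lambda$ and $\omega$.

Proof. Since
\[
  \|\chi_x P_\omega(\lambda) \chi_y\|_1 \le \|\chi_x\|_{\mathcal{H}_-, \mathcal{H}}\, \|P_\omega(\lambda)\|_{\mathcal{T}_1(\mathcal{H}_+, \mathcal{H}_-)}\, \|\chi_y\|_{\mathcal{H}, \mathcal{H}_+},
  \tag{2.31}
\]
(2.30) follows from (2.22) and (2.28). □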
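Assumption GEE suffices for proofs of localization [27, 20] and (almost sure) dynamical localization [29, 28]. But for strong HS-dynamical localization we need to strengthen (2.23), as we will use (2.34) below.

Assumption SGEE. Assumption GEE holds with
\[
  \mathbb{E}\Bigl\{ \bigl[ \operatorname{tr}_{\mathcal{H}}\bigl( T^{-1} f(H_\omega) P_{H_\omega}^\perp T^{-1} \bigr) \bigr]^2 \Bigr\} < \infty.
  \tag{2.32}
\]
It follows that
\[
  \mathbb{E}\Bigl\{ \bigl[ \operatorname{tr}_{\mathcal{H}}\bigl( T^{-1} E_{H_\omega}(J) P_{H_\omega}^\perp T^{-1} \bigr) \bigr]^2 \Bigr\} < +\infty
  \tag{2.33}
\]
for all bounded Borel sets $J$, so we have a stronger version of (2.26):
\[
  \mathbb{E}\bigl\{ [\mu_\omega(J)]^2 \bigr\} < \infty \quad \text{for } J \text{ bounded}.
  \tag{2.34}
\]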
Remark 2.6. Estimate (2.32) is true for the usual random operators. We could have required either the weaker
\[
  \mathbb{E}\bigl\{ \operatorname{tr}_{\mathcal{H}}\bigl( T^{-1} f(H_\omega) P_{H_\omega}^\perp T^{-1} \bigr) \bigr\} < \infty,
  \tag{2.35}
\]
or the stronger
\[
  \bigl\| \operatorname{tr}_{\mathcal{H}}\bigl( T^{-1} f(H_\omega) P_{H_\omega}^\perp T^{-1} \bigr) \bigr\|_{\infty} < \infty.
  \tag{2.36}
\]
If we assume (2.35) instead of (2.32), Theorem 3.8 yields strong operator dynamical localization instead of strong HS-dynamical localization. One usually proves the stronger (2.36) (e.g., [44, 43]), which was one of the assumptions in [14].

3. Statement of the Main Results

In order to state our results we need first to characterize good boxes for random operators. We start with two definitions of good boxes. Note that these are deterministic; we omit $\omega$ from the notation when not necessary.

Definition 3.1. Given $\theta > 0$, $E \in \mathbb{R}$, $x \in \mathbb{Z}^d$, and $L \in 6\mathbb{N}$, we say that the box $\Lambda_L(x)$ is $(\theta, E)$-suitable if $E \notin \sigma(H_{x,L})$ and
\[
  \|\Gamma_{x,L} R_{x,L}(E) \chi_{x,L/3}\|_{x,L} \le \frac{1}{L^{\theta}} .
  \tag{3.1}
\]
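Definition 3.2. Given $m > 0$, $E \in \mathbb{R}$, $x \in \mathbb{Z}^d$, and $L \in 6\mathbb{N}$, we say that the box $\Lambda_L(x)$ is $(m, E)$-regular if $E \notin \sigma(H_{x,L})$ and
\[
  \|\Gamma_{x,L} R_{x,L}(E) \chi_{x,L/3}\|_{x,L} \le e^{-m \frac{L}{2}} .
  \tag{3.2}
\]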
Remark 3.3. Note that a box $\Lambda_L(x)$ is $(\theta, E)$-suitable if and only if it is $(m, E)$-regular, with $m = 2\theta \frac{\log L}{L}$. The difference between the two definitions is the point of view. In Definition 3.1 we require only polynomial decay in the scale $L$, while in Definition 3.2 we want exponential decay in $L$.

We fix a compact interval $I_0$ and an open interval $\tilde I_0 \supset I_0$ ($\tilde I_0 \subset (0, \infty)$ for Eq. (1.2)). Throughout this paper, by $C = C(a, b, \ldots)$ we mean a positive finite constant $C$, depending only on the parameters $a, b, \ldots$. The following theorem provides our enhancement of the multiscale analysis.

Theorem 3.4 (Bootstrap Multiscale Analysis). Let $H_\omega$ be a random operator satisfying assumptions SLI, IAD, NE and W in the compact interval $I_0$. Given $\theta > bd$, there exists a finite scale $\mathcal{L} = \mathcal{L}(d, \varrho, Q_{I_0}, \gamma_{I_0}, b, \theta)$, such that, if for some $E_0 \in I_0$ we can verify at some finite scale $L > \mathcal{L}$ that
\[
  \mathbb{P}\{\Lambda_L(0) \text{ is } (\theta, E_0)\text{-suitable}\} > 1 - \frac{1}{841^d},
  \tag{3.3}
\]
then there exists $\delta_0 = \delta_0(d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, \theta, L) > 0$, such that, given any $\zeta$, $0 < \zeta < 1$, and $\alpha$, $1 < \alpha < \zeta^{-1}$, there is a length scale $L_0 = L_0(d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, \theta, L, \zeta, \alpha) < \infty$,
and a mass $m_\zeta = m(\zeta, L_0) > 0$, so if we set $L_{k+1} = [L_k^{\alpha}]_{6\mathbb{N}}$, $k = 0, 1, \ldots$, we have
\[
  \mathbb{P}\bigl\{ R\bigl(m_\zeta, L_k, I(\delta_0), x, y\bigr) \bigr\} \ge 1 - e^{-L_k^{\zeta}}
  \tag{3.4}
\]
for all $k = 0, 1, \ldots$, and $x, y \in \mathbb{Z}^d$ with $|x - y| > L_k + \varrho$, where $I(\delta_0) = [E_0 - \delta_0, E_0 + \delta_0] \cap I_0$, and
\[
  R(m, L, I, x, y) = \{\text{for every } E \in I, \text{ either } \Lambda_L(x) \text{ or } \Lambda_L(y) \text{ is } (m, E)\text{-regular}\}.
  \tag{3.5}
\]

Remark 3.5. If we have the expected volume factor in (2.18), i.e., $b = 1$, we need only $\theta > d$, hence (3.3) is an estimate on the probability that the finite volume resolvent decays faster than the inverse of the volume.

Remark 3.6. The initial probability $1 - \frac{1}{841^d}$ in the starting hypothesis (3.3) of Theorem 3.4 does not depend on the initial scale $L$. It suffices to verify (3.3) for some $L > \mathcal{L}$, with $\mathcal{L}$ large enough depending on $d, \varrho, Q_{I_0}, \gamma_{I_0}, b, \theta$. This is not the case in the usual MSA where the required initial probability behaves like $1 - L^{-p}$ [20]. Estimates on $\mathcal{L}$, as well as better numbers for the required initial probability, will be given in [31].
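As a quick arithmetic sanity check on the threshold in (3.3), the following Python sketch compares $1/841^d$ with the quantity $(3Y - 4)^{-2d}$ that appears in the starting hypothesis (5.1) of Theorem 5.1 when $Y = 11$, the smallest odd integer allowed there. The function name `bad_box_threshold` and the choice of dimensions are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction


def bad_box_threshold(Y: int, d: int) -> Fraction:
    """Allowed probability of a bad box, (3Y - 4)^(-2d), as in (5.1)."""
    return Fraction(1, (3 * Y - 4) ** (2 * d))


Y_MIN = 11  # smallest odd integer permitted in Theorem 5.1

for d in (1, 2, 3):
    threshold = bad_box_threshold(Y_MIN, d)
    # (3 * 11 - 4)^2 = 29^2 = 841, so this agrees with 1 / 841^d in (3.3).
    assert threshold == Fraction(1, 841 ** d)
    print(f"d = {d}: need P(suitable) > 1 - 1/{threshold.denominator}")
```

Exact rational arithmetic is used so that the comparison is not affected by floating-point rounding.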
Remark 3.7. In some cases one may verify the starting hypothesis (3.3) by proving the stronger condition:
\[
  \limsup_{L \to \infty} \mathbb{P}\{\Lambda_L(0) \text{ is } (\theta, E_0)\text{-suitable}\} > 1 - \frac{1}{841^d} \quad \text{for some } \theta > bd.
  \tag{3.6}
\]
In such cases one usually shows that the lim sup is actually equal to one (e.g., [24, 25]).

The following result combines Theorem 3.4 and the generalized eigenfunction expansion presented in Sect. 2.3. Under the hypotheses of Theorem 3.4, we show that one can get any sub-exponential decay of the (averaged) “kernel” of a bounded function of $H_\omega$.

Theorem 3.8 (Decay of the Kernel). Let $H_\omega$ be a random operator satisfying assumptions SLI, IAD, NE and W in the compact interval $I_0$ as in Theorem 3.4, plus assumptions EDI and SGEE. Suppose (3.3) holds at $E_0 \in I_0$ for some $\theta > bd$, and let $\delta_0$ and $I(\delta_0)$ be as in Theorem 3.4. Then for any $0 < \zeta < 1$ there exists a finite constant $C_\zeta = C(\zeta, d, \varrho, Q_{I_0}, \gamma_{I_0}, \tilde\gamma_{I_0}, \theta, \nu)$, such that
\[
  \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_y \bigr\|_2^2 \Bigr) \le C_\zeta\, e^{-|x - y|^{\zeta}}
  \tag{3.7}
\]
for all $x, y \in \mathbb{Z}^d$. (The supremum is taken over bounded Borel functions $f$ of a real variable, with $\|f\| = \sup_{t \in \mathbb{R}} |f(t)|$.)

Remark 3.9. The initial probabilistic estimate (3.3) or (3.6) may be shown to be satisfied either at the edge of a gap in the spectrum, or at low energy, or for sufficiently high disorder in a pre-specified energy interval, e.g., [20, 9] in the Schrödinger case, [24, 25, 43] for classical waves. But, in contrast with the results from the usual MSA, the region where Theorems 3.4 and 3.8 apply (in the diagram energy × disorder) is not conditioned to the final estimate of the probability of bad events; one always gets any sub-exponential decay on a fixed interval $I(\delta_0)$, as shown in (3.7).
An important application of Theorem 3.8 concerns strong HS-dynamical localization, as defined by (1.7).

Corollary 3.10 (Strong HS-Dynamical Localization). Consider the wave equation (either (1.1) or (1.2)) in a random medium, and assume that the corresponding random operator $H_\omega$ satisfies the hypotheses of Theorem 3.8 in the compact interval $I_0$. Suppose (3.3) holds at $E_0 \in I_0$ for some $\theta > bd$, and let $\delta_0$ and $I(\delta_0)$ be as in Theorem 3.4. Then the wave equation exhibits strong HS-dynamical localization in the energy interval $I(\delta_0)$.

Related results have been obtained for the almost Mathieu model, a one-dimensional quasi-periodic model: dynamical localization [34, 28], and more recently strong dynamical localization [30] (for the optimal set of coupling constants).

Another measure of localization is “how localized are the eigenfunctions around their center of localization”. The criterion SULE [17, 18] deals with this question. Thanks to the sub-exponential decay of the probability in Theorem 3.4, we are able to improve the control on the behavior of the eigenfunctions given in [29].

Theorem 3.11 (SULE). Let $H_\omega$ be a random operator satisfying assumptions SLI, EDI, GEE, IAD, NE and W in the compact interval $I_0$. Suppose (3.3) holds at $E_0 \in I_0$ for some $\theta > bd$, and let $\delta_0$ and $I(\delta_0)$ be as in Theorem 3.4. Then $H_\omega$ exhibits Anderson localization (pure point spectrum) in the interval $I(\delta_0)$. In addition, one gets the following form of SULE: for any $\varepsilon > 0$, there exists a mass $m_\varepsilon > 0$, and for a.e. $\omega$ there is a constant $C_{\varepsilon,\omega} < \infty$, such that, if we let $\{\phi_{n,\omega}\}_{n \in \mathbb{N}}$ be the normalized eigenfunctions of $H_\omega$ with energy $E_{n,\omega}$ in $I(\delta_0)$, there exist $\{x_{n,\omega}\}_{n \in \mathbb{N}}$, so for any $n \in \mathbb{N}$ and $x \in \mathbb{Z}^d$, we have
\[
  \|\chi_x \phi_{n,\omega}\| \le C_{\varepsilon,\omega}\, e^{m_\varepsilon (\log |x_{n,\omega}|)^{1+\varepsilon}}\, e^{-m_\varepsilon |x - x_{n,\omega}|} .
  \tag{3.8}
\]
Moreover, the centers of localization $x_{n,\omega}$ can be reordered in such a way that $|x_{n,\omega}|$ increases with $n$, and
\[
  |x_{n,\omega}| \ge \tilde C_\omega\, n^{\frac{1}{4\nu}}
  \tag{3.9}
\]
for some finite constant $\tilde C_\omega > 0$ for a.e. $\omega$, where $\nu > \frac{d}{4}$ is as in GEE.
This improves the result obtained in [29]. First, because the interval $I(\delta_0)$ does not depend anymore on the chosen $\varepsilon > 0$, and second, because the control of the eigenfunctions in terms of the centers of localization $\{x_{n,\omega}\}_{n \in \mathbb{N}}$, given in (3.8), is almost polynomial (we get $e^{m_\varepsilon (\log |x_{n,\omega}|)^{1+\varepsilon}}$ instead of $e^{m_\varepsilon |x_{n,\omega}|^{\varepsilon}}$ as in [29]). Note that exponential decay of the probability of bad events in Theorem 3.4 (i.e., in (3.4)) would provide right away polynomial behavior in $|x_{n,\omega}|$, as expected. In the discrete case, the Aizenman–Molchanov [1, 3] approach supplies that polynomial behavior [18].

If one is interested in proving localization in a specified interval, then sometimes it suffices to take sufficiently large disorder to satisfy the starting hypothesis (3.3) for every energy in the interval. The following corollary re-states Theorem 3.4, Theorem 3.8, Corollary 3.10 and Theorem 3.11 in this case. The proof is a simple compactness argument. Here again, as in Remark 3.9, we improve on former results, since how large the disorder has to be is not anymore conditioned by how good one wants the final probabilistic estimates to be.
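To get a feel for how much slower the prefactor exponent $(\log r)^{1+\varepsilon}$ of (3.8) grows than the exponent $r^{\varepsilon}$ of the earlier bound, the Python sketch below tabulates both next to the exponent $k \log r$ of a polynomial prefactor $r^k$. The mass $m_\varepsilon$ is set to 1, and the values of $\varepsilon$, $k$ and $r$ are arbitrary illustrative choices rather than quantities from the paper.

```python
import math


def log_power_exponent(r: float, eps: float) -> float:
    """Exponent (log r)^(1 + eps) of the prefactor in (3.8), with m_eps = 1."""
    return math.log(r) ** (1.0 + eps)


def power_exponent(r: float, eps: float) -> float:
    """Exponent r^eps of the sub-exponential prefactor, with m_eps = 1."""
    return r ** eps


def polynomial_exponent(r: float, k: float) -> float:
    """Exponent k * log r of a polynomial prefactor r^k."""
    return k * math.log(r)


eps, k = 0.5, 2.0  # illustrative values only
for r in (1e2, 1e4, 1e8, 1e16):
    print(
        f"r = {r:.0e}: polynomial {polynomial_exponent(r, k):10.2f}, "
        f"log-power {log_power_exponent(r, eps):10.2f}, "
        f"power {power_exponent(r, eps):14.2f}"
    )
```

For large $r$ the log-power exponent sits between the polynomial and the power exponents, which is the sense in which the control in (3.8) is "almost polynomial".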
Corollary 3.12. If for some $\theta > bd$ we have (3.3) for every energy $E$ in the compact interval $I_0$, then Theorem 3.4, Theorem 3.8, Corollary 3.10, and Theorem 3.11 are valid with the whole interval $I_0$ substituted for $I(\delta_0)$ in the conclusions.

Remark 3.13. We note that our results apply to the one-dimensional, discrete Anderson model with a singular potential, like a Bernoulli or alloy potential [7, 36] (for the one-dimensional continuous case see the very recent work [15]). The Wegner estimate proved in [7, 36] for this case is slightly weaker than our Assumption W, since it holds only for sub-exponentially small distances to the spectrum rather than for any $\eta > 0$. (The fact that it only holds for scales $L$ large enough, uniformly in the interval $I_0$, does not affect the results.) But in this case one can also prove a starting hypothesis with sub-exponentially decaying probabilities of bad events [7, 36], i.e., the starting hypothesis (5.9) of Theorem 5.7. The proof of this theorem only requires this weaker Assumption W, so our results are also valid for Bernoulli or alloy potentials. Another application of this work leads to strong dynamical localization for the random dimer model [16].

4. Decay of the Kernel and Dynamical Localization

In this section we assume Theorem 3.4 and prove Theorem 3.8, Corollary 3.10, and Theorem 3.11. We start with a preliminary lemma which translates the exponential decay of the resolvent of finite boxes at energy $\lambda$, as given by the multiscale analysis, in terms of an exponential decay of the kernel of the “generalized eigenprojector” $P_\omega(\lambda)$ defined before (2.27). We note that Lemma 2.5, with the uniform polynomial bound (2.30) that it provides, is a crucial tool for Lemma 4.1 below.

Lemma 4.1. Let $H_\omega$ be a random operator satisfying assumptions EDI and GEE in some compact interval $I_0$. Given $I \subset I_0$, $m > 0$, $L \in 6\mathbb{N}$, and $x, y \in \mathbb{Z}^d$, let $R(m, L, I, x, y)$ be as in (3.5).
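If $\omega \in R(m, L, I, x, y)$, we have
\[
  \|\chi_x P_\omega(\lambda) \chi_y\|_2 \le C\, e^{-mL/4}\, (1 + |x|^2)^{\nu} (1 + |y|^2)^{\nu},
  \tag{4.1}
\]
for $\mu_\omega$-almost all $\lambda \in I$, with $C = C(m, d, \nu, \tilde\gamma_{I_0}) < +\infty$.

Proof. It follows from (2.27) that $\|\chi_x P_\omega(\lambda) \chi_y\|_2 = \|\chi_y P_\omega(\lambda) \chi_x\|_2$ for $\mu_\omega$-almost every $\lambda$, so the roles played by $x$ and $y$ are symmetric. Let $\omega \in R(m, L, I, x, y)$. Then for any $\lambda \in I$, either $\Lambda_L(x)$ or $\Lambda_L(y)$ is $(m, \lambda)$-regular for $H_\omega$, let’s say $\Lambda_L(x)$. Let $\phi \in \mathcal{H}$. Since for $\mu_\omega$-almost all $\lambda$ and all $y \in \mathbb{Z}^d$, the vector $P_\omega(\lambda) \chi_y \phi$ is a generalized eigenfunction of $H_\omega$ with generalized eigenvalue $\lambda$, it follows from the EDI (see (2.15)), using $\chi_x = \chi_{x,L/3} \chi_x$, that
\[
  \|\chi_x P_\omega(\lambda) \chi_y \phi\| \le \tilde\gamma_{I_0}\, \|\Gamma_{x,L} R_{x,L}(\lambda) \chi_{x,L/3}\|_{x,L}\, \|\Gamma_{x,L} P_\omega(\lambda) \chi_y \phi\| .
  \tag{4.2}
\]
Since $\Lambda_L(x)$ is $(m, \lambda)$-regular, we have, using also Lemma 2.5 and the definition of the HS norm, that
\begin{align}
  \|\chi_x P_\omega(\lambda) \chi_y\|_2
    &\le \tilde\gamma_{I_0}\, e^{-mL/2}\, \|\Gamma_{x,L} P_\omega(\lambda) \chi_y\|_2 \tag{4.3} \\
    &\le C(\nu)\, \tilde\gamma_{I_0}\, d\, L^{d-1} e^{-mL/2} \bigl(1 + (|x| + \tfrac{L}{2})^2\bigr)^{\nu} (1 + |y|^2)^{\nu} \tag{4.4} \\
    &\le C(m, d, \nu, \tilde\gamma_{I_0})\, e^{-mL/4}\, (1 + |x|^2)^{\nu} (1 + |y|^2)^{\nu} . \tag{4.5}
\end{align}
□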
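Remark 4.2. The estimate (4.1) may be compared to the criterion WULE introduced in [28]. Indeed $P_\omega(\lambda)$ can be seen as the projection operator on the set of the generalized eigenfunctions $\varphi_\lambda^\omega$ in $\mathcal{H}_-$ with energy $\lambda$. Hence (4.1) above provides, at a finite scale $L$, the exponential decay of the key quantity $|\varphi_\lambda^\omega(x)\, \varphi_\lambda^\omega(y)|$. As in [28, 30], the fact that the eigenfunctions $\varphi_\lambda^\omega$ are uniformly polynomially bounded ($\|\varphi_\lambda^\omega\|_{\mathcal{H}_-} \le 1$) is crucial for our approach.

We are now in position to prove Theorem 3.8.

Proof of Theorem 3.8. Let $0 < \xi < 1$. We will apply Theorem 3.4 together with the generalized eigenfunction expansion (2.29) to show that
\[
  \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_0 \bigr\|_2^2 \Bigr) \le C_\xi\, e^{-|x|^{\xi}},
  \tag{4.6}
\]
for all $x \in \mathbb{Z}^d$, where $I(\delta_0) \subset I_0$ is given by Theorem 3.4. Since our random operator is $\mathbb{Z}^d$-ergodic, probabilities are translation invariant, so there is no loss of generality in taking $y = 0$.

Given $0 < \xi < 1$, we pick $\zeta$ such that $\zeta^2 < \xi < \zeta < 1$ (always possible) and set $\alpha = \frac{\zeta}{\xi}$; note $\alpha < \zeta^{-1}$. Theorem 3.4 then provides us with a scale $L_0$ and a mass $m_\zeta > 0$, such that, if we set $L_{k+1} = [L_k^{\alpha}]_{6\mathbb{N}}$, $k = 0, 1, \ldots$, then for each $k$ we have the estimate (3.4) with $y = 0$ and $x \in \mathbb{Z}^d$ such that $|x| > L_k + \varrho$.

Let us now fix $x \in \mathbb{Z}^d$ and $k$ such that $L_{k+1} + \varrho \ge |x| > L_k + \varrho$. In this case Lemma 4.1 asserts that if $\omega \in R(m_\zeta, L_k, I(\delta_0), x, 0)$, then
\[
  \sup_{\lambda \in I(\delta_0)} \|\chi_x P_\omega(\lambda) \chi_0\|_2 \le C_1\, e^{-m_\zeta L_k / 4} (1 + |x|^2)^{\nu} \le C_1 C_2\, e^{-L_k^{\zeta}},
  \tag{4.7}
\]
with $C_1 = C_1(m_\zeta, d, \nu, \tilde\gamma_{I_0})$, $C_2 = C_2(\nu, \varrho, \zeta, \xi, m_\zeta)$. We split the expectation in (4.6) in two pieces: where (4.7) holds, and over the complementary event, which has probability less than $e^{-L_k^{\zeta}}$ by (3.4). From (2.29) we have (note $E_{H_\omega}(I(\delta_0)) P_{H_\omega}^\perp = E_{H_\omega}(I(\delta_0))$ in the case of Eq. (1.2), since in this case $I_0 \subset (0, \infty)$),
\begin{align}
  \sup_{\|f\| \le 1} \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_0 \bigr\|_2
    &\le \sup_{\|f\| \le 1} \int_{I(\delta_0)} |f(\lambda)|\, \|\chi_x P_\omega(\lambda) \chi_0\|_2\, d\mu_\omega(\lambda) \tag{4.8} \\
    &\le \int_{I(\delta_0)} \|\chi_x P_\omega(\lambda) \chi_0\|_2\, d\mu_\omega(\lambda). \tag{4.9}
\end{align}
Thus, it follows from (4.7) that [with $\mathbb{E}(F(\omega); A) \equiv \mathbb{E}(F(\omega) \chi_A(\omega))$]
\[
  \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_0 \bigr\|_2^2 ;\ R(m_\zeta, L_k, I(\delta_0), x, 0) \Bigr)
  \le C_1^2 C_2^2\, \mathbb{E}\bigl( (\mu_\omega(I(\delta_0)))^2 \bigr)\, e^{-2 L_k^{\zeta}} .
  \tag{4.10}
\]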
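To estimate the second term, note that using (2.25) we have
\[
  \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_0 \bigr\|_2^2 \le \|f\|^2\, \bigl\| E_{H_\omega}(I(\delta_0)) \chi_0 \bigr\|_2^2 \le 4^{\nu}\, \|f\|^2\, \mu_\omega(I(\delta_0)),
  \tag{4.11}
\]
so, using the Schwarz inequality and (3.4),
\[
  \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_0 \bigr\|_2^2 ;\ \omega \notin R(m_\zeta, L_k, I(\delta_0), x, 0) \Bigr)
  \le 4^{\nu} \bigl[ \mathbb{E}\bigl( (\mu_\omega(I(\delta_0)))^2 \bigr) \bigr]^{\frac{1}{2}} e^{-\frac{1}{2} L_k^{\zeta}} .
  \tag{4.12}
\]
Since $C_3 = C_1^2 C_2^2\, \mathbb{E}((\mu_\omega(I(\delta_0)))^2) + 4^{\nu} [\mathbb{E}((\mu_\omega(I(\delta_0)))^2)]^{\frac{1}{2}} < \infty$ in view of (2.34), we conclude from (4.10) and (4.12) that (recall $\alpha = \frac{\zeta}{\xi}$)
\[
  \mathbb{E}\Bigl( \sup_{\|f\| \le 1} \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_0 \bigr\|_2^2 \Bigr)
  \le C_3\, e^{-\frac{1}{2} L_k^{\zeta}} \le C_3\, e^{-\frac{1}{2} L_{k+1}^{\xi}} \le C_3\, e^{-\frac{1}{2} (|x| - \varrho)^{\xi}} \le C_3\, e^{\frac{1}{2} \varrho^{\xi}} e^{-\frac{1}{2} |x|^{\xi}}
  \tag{4.13}
\]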
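for all $|x| \ge L_0 + \varrho$. Thus (4.6) follows (for a slightly smaller $\xi$), and Theorem 3.8 is proved. □

Proof of Corollary 3.10. Let $q > 0$, $y \in \mathbb{Z}^d$. We have
\begin{align}
  \bigl\| |X|^{\frac{q}{2}} f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_y \bigr\|_2^2
    &= \operatorname{tr}\bigl( \chi_y \bar f(H_\omega) E_{H_\omega}(I(\delta_0))\, |X|^{q} f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_y \bigr) \tag{4.14} \\
    &\le \sum_{x \in \mathbb{Z}^d} (|x| + 1)^{q} \operatorname{tr}\bigl( \chi_y \bar f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_y \bigr) \notag \\
    &= \sum_{x \in \mathbb{Z}^d} (|x| + 1)^{q}\, \bigl\| \chi_x f(H_\omega) E_{H_\omega}(I(\delta_0)) \chi_y \bigr\|_2^2 . \tag{4.15}
\end{align}
Corollary 3.10 now follows from (3.7). □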
Remark 4.3. Note that we proved strong HS-dynamical localization, and hence strong dynamical localization, without first proving Anderson localization or resorting to centers of localization (required in [29, 14]). This is because of our better use of Assumption GEE (with Lemma 2.5), as in [28, 30], and Assumption SGEE. However, once Anderson localization is proven, one can use more refined properties of orthonormal sets of eigenfunctions as in [53], and bypass the explicit use of the generalized eigenfunctions as well as the discussion in [14] with the centers of localization. Nevertheless we point out that: 1) Assumption GEE is needed anyway to establish Anderson localization; 2) the analysis in this paper (and [28, 30]) shows that $P_\omega(\lambda)$ enters the game in a natural way.
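The final step in the proof of Corollary 3.10 relies on the fact that a sum of the form $\sum_x (|x| + 1)^q e^{-|x|^{\zeta}}$ is finite for every $q > 0$ and $0 < \zeta < 1$. The Python sketch below illustrates this numerically in dimension one only; the function name `weighted_tail_sum` and the parameter values are illustrative assumptions, and the computation is a plausibility check, not a proof.

```python
import math


def weighted_tail_sum(q: float, zeta: float, radius: int) -> float:
    """Partial sum over x in Z, |x| <= radius, of (|x| + 1)^q * exp(-|x|^zeta)."""
    total = 1.0  # contribution of x = 0
    for r in range(1, radius + 1):
        # The points x = r and x = -r contribute equally.
        total += 2.0 * (r + 1) ** q * math.exp(-(r ** zeta))
    return total


q, zeta = 2.0, 0.5  # illustrative values only
for radius in (100, 1_000, 10_000, 20_000):
    print(radius, weighted_tail_sum(q, zeta, radius))
```

The partial sums stabilize once the radius is large, because the sub-exponential factor eventually dominates any fixed power of $|x|$.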
Proof of Theorem 3.11. The proof mimics the one in [29], taking into account the sub-exponential decay of the probabilities of bad events. Given $\varepsilon > 0$, we pick $\zeta$, such that $\zeta^2 < (1 + \varepsilon)^{-1} < \zeta < 1$ (always possible), and choose $\alpha$, $1 < \alpha < \zeta (1 + \varepsilon)$. (Note $\alpha < \zeta^{-1}$.) Applying Theorem 3.4 yields a sequence of scales with $L_{k+1} = [L_k^{\alpha}]_{6\mathbb{N}}$, $k = 0, 1, \ldots$, such that (3.4) holds. Following [29] and the notations therein, one defines
\[
  F_k = \bigcup_{x_0;\ |x_0| \le \exp\bigl( L_{k+1}^{(1+\varepsilon)^{-1}} \bigr)} E_k(x_0),
  \tag{4.16}
\]
where $E_k(x_0)$ is the complement of the event $\bigcap_{x \in A_{L_{k+1}}(x_0)} R(m_\zeta, L_k, I(\delta_0), x_0, x)$, and $A_{L_{k+1}}(x_0)$ is an annulus as in [20, 29] ($A_{L_{k+1}}(x_0) \approx \Lambda_{L_{k+1}}(x_0) \setminus \Lambda_{L_k}(x_0)$). Using (3.4), one estimates the probability of $F_k$ as follows:
\[
  \mathbb{P}(F_k) \le C\, L_k^{\alpha d} \exp\bigl( -L_k^{\zeta} + d\, L_k^{\frac{\alpha}{1+\varepsilon}} \bigr),
  \tag{4.17}
\]
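where $C$ is a finite constant. Since $\alpha / (1 + \varepsilon) < \zeta$, the Borel-Cantelli Lemma applies, and proceeding as in [29], it follows that for any $\varepsilon > 0$, there exists a mass $m_\varepsilon > 0$, and for a.e. $\omega$ there is a constant $C_{\varepsilon,\omega} < \infty$, such that, if we let $\{\phi_{n,\omega}\}_{n \in \mathbb{N}}$ be the normalized eigenfunctions of $H_\omega$ with energies $\{E_{n,\omega}\}_{n \in \mathbb{N}}$ in $I(\delta_0)$, there exist $\{x_{n,\omega}\}_{n \in \mathbb{N}}$, so for any $n \in \mathbb{N}$ and $x \in \mathbb{Z}^d$, we have (3.8).

For the sake of completeness we now show that it follows from GEE and (3.8) that the centers of localization $\{x_{n,\omega}\}_{n \in \mathbb{N}}$ can be reordered in such a way that $|x_{n,\omega}|$ increases with $n$ and we have the lower bound (3.9). The spirit of the proof goes back to [18] (see also [29, 14]). Given $L > 0$, if $|x_{n,\omega}| \le L$ and $|x| \ge 2L$, we have $|x - x_{n,\omega}| \ge \frac{|x|}{3} + \frac{L}{3}$, and it follows from (3.8) that for a.e. $\omega$,
\[
  \|\chi_x \phi_{n,\omega}\| \le C_{\varepsilon,\omega}\, e^{-m_\varepsilon \left( \frac{L}{3} - (\log L)^{1+\varepsilon} \right)} e^{-m_\varepsilon \frac{|x|}{3}} \le C_{\varepsilon,\omega}\, e^{-m_\varepsilon \frac{|x|}{3}}
  \tag{4.18}
\]
if $L \ge 3 (\log L)^{1+\varepsilon}$. Thus for $L$ sufficiently large (depending on $\varepsilon$ and $\omega$), if $|x_{n,\omega}| \le L$ we have $\|\chi_{0,2L} \phi_{n,\omega}\|^2 \ge \frac{1}{2}$, so if $N(L)$ is the cardinal of the set $\{n,\ E_{n,\omega} \in I(\delta_0),\ |x_{n,\omega}| \le L\}$, we conclude that
\begin{align}
  \tfrac{1}{2} N(L)
    &\le \sum_{n,\ E_{n,\omega} \in I(\delta_0)} \|\chi_{0,2L} \phi_{n,\omega}\|^2 = \bigl\| \chi_{0,2L} E_{H_\omega}(I(\delta_0)) \bigr\|_2^2 \notag \\
    &\le (1 + 4L^2)^{2\nu} \bigl\| T^{-1} E_{H_\omega}(I(\delta_0)) \bigr\|_2^2 \tag{4.19} \\
    &= (1 + 4L^2)^{2\nu} \operatorname{tr}_{\mathcal{H}}\bigl( T^{-1} E_{H_\omega}(I(\delta_0)) T^{-1} \bigr) = (1 + 4L^2)^{2\nu} \mu_\omega(I(\delta_0)), \notag
\end{align}
where $\mu_\omega(I(\delta_0)) < \infty$ for a.e. $\omega$ by (2.26) (recall $E_{H_\omega}(I(\delta_0)) = E_{H_\omega}(I(\delta_0)) P_{H_\omega}^\perp$). It follows that $N(L) < \infty$ for all $L > 0$, so we may reorder the centers $x_{n,\omega}$ in such a way that $|x_{n,\omega}|$ increases with $n$, so we have $N(|x_{n,\omega}|) \ge n$. Thus, with probability one, for $n$ large enough, depending on $\omega$ (so that $|x_{n,\omega}| \ge 1$), we have
\[
  n \le N(|x_{n,\omega}|) \le 2 \mu_\omega(I(\delta_0))\, (1 + 4 |x_{n,\omega}|^2)^{2\nu} \le 2 \cdot 5^{2\nu} \mu_\omega(I(\delta_0))\, |x_{n,\omega}|^{4\nu},
  \tag{4.20}
\]
and the lower bound (3.9) follows. □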
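The passage from (4.20) to (3.9) can be unpacked as follows. This is only a rearrangement of (4.20); the explicit constant is an illustration and not a value stated in the text. The last inequality in (4.20) uses $1 + 4t^2 \le 5t^2$ for $t \ge 1$. Solving (4.20) for $|x_{n,\omega}|$ then gives, for $n$ large enough (depending on $\omega$),

```latex
\[
  |x_{n,\omega}| \;\ge\;
  \bigl( 2 \cdot 5^{2\nu}\, \mu_\omega(I(\delta_0)) \bigr)^{-\frac{1}{4\nu}}\, n^{\frac{1}{4\nu}} .
\]
```

So, for such $n$, one admissible choice of the constant in (3.9) is $\tilde C_\omega = \bigl( 2 \cdot 5^{2\nu} \mu_\omega(I(\delta_0)) \bigr)^{-1/(4\nu)}$, which is positive and finite whenever $0 < \mu_\omega(I(\delta_0)) < \infty$.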
5. Multiscale Analyses

In this section we discuss the four multiscale analyses we will need for the bootstrap multiscale analysis (i.e., for the proof of Theorem 3.4). They can be classified by either the resulting estimate on the probabilities of bad events, or by the type of growth of length scales. We will state them according to the first classification, and then present the proofs conforming to the second.

5.1. Polynomially decaying probabilities. We use two multiscale analyses that yield polynomially decaying probabilities for bad events.

Theorem 5.1 ([24, Lemma 36]). Let $H_\omega$ be a random operator satisfying assumptions SLI, IAD and W in some compact interval $I_0$. Let $E_0 \in I_0$ and $\theta > bd$. Given an odd integer $Y \ge 11$, for any $p$ with $0 < p < \theta - bd$ we can find $Z = Z(d, \varrho, Q_{I_0}, \gamma_{I_0}, b, \theta, p, Y)$, such that if for some $L_0 > Z$, $L_0 \in 6\mathbb{N}$, we have
\[
  \mathbb{P}\{\Lambda_{L_0}(0) \text{ is } (\theta, E_0)\text{-suitable}\} > 1 - (3Y - 4)^{-2d},
  \tag{5.1}
\]
then, setting $L_{k+1} = Y L_k$, $k = 0, 1, 2, \ldots$, we have that
\[
  \mathbb{P}\{\Lambda_{L_k}(0) \text{ is } (\theta, E_0)\text{-suitable}\} \ge 1 - \frac{1}{L_k^{p}}
  \tag{5.2}
\]
for all $k \ge K$, where $K = K(p, Y, L_0) < \infty$.

The value of Theorem 5.1 is that it requires a very weak starting hypothesis, in which the probability of the bad event is independent of the scale, and its conclusion, in view of Remark 3.3, gives the starting hypothesis of a modified form of the usual multiscale analysis, as given in the following theorem. We stated Theorem 5.1 in a slightly different form than in [24, Lemma 36]; it is adapted to our assumptions and definitions.

Theorem 5.2 ([24, Theorem 32]). Let $H_\omega$ be a random operator satisfying assumptions SLI, IAD, NE and W in some compact interval $I_0$. Let $E_0 \in I_0$, $\theta > bd$ and $0 < p' < p < \theta - bd$. Then for $1 < \alpha < 1 + \frac{p'}{p' + 2d}$, there is $B = B(d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, b, \theta, p, p', \alpha)$, such that, if at some finite scale $L_0 \ge B$ we verify that
\[
  \mathbb{P}\Bigl\{\Lambda_{L_0}(0) \text{ is } \Bigl(2\theta \frac{\log L_0}{L_0}, E_0\Bigr)\text{-regular}\Bigr\} \ge 1 - \frac{1}{L_0^{p}},
  \tag{5.3}
\]
then there exists $\delta_1 = \delta_1(d, \varrho, Q_{I_0}, \gamma_{I_0}, \theta, p, p', \alpha, L_0) > 0$, such that if we set $I(\delta_1) = [E_0 - \delta_1, E_0 + \delta_1] \cap I_0$, $m_0 = 2\theta \frac{\log L_0}{L_0}$, and $L_{k+1} = [L_k^{\alpha}]_{6\mathbb{N}}$, $k = 0, 1, \ldots$, we have
\[
  \mathbb{P}\Bigl\{\Lambda_{L_k}(0) \text{ is } \Bigl(\frac{m_0}{2}, E\Bigr)\text{-regular}\Bigr\} \ge 1 - \frac{1}{L_k^{p'}} \quad \text{for all } E \in I(\delta_1),
  \tag{5.4}
\]
and
\[
  \mathbb{P}\Bigl\{ R\Bigl(\frac{m_0}{2}, L_k, I(\delta_1), x, y\Bigr) \Bigr\} \ge 1 - \frac{1}{L_k^{2p'}} \quad \text{for } x, y \in \mathbb{Z}^d,\ |x - y| > L_k + \varrho,
  \tag{5.5}
\]
for all $k = 0, 1, \ldots$.
Theorem 5.2 is quite close to the usual multiscale analysis result [20]. The crucial difference is that Theorem 5.2 allows the mass to go to zero as the initial scale $L_0$ goes to infinity, which may seem very surprising at first sight. Indeed, in the usual versions of the MSA (e.g., [26, 27, 19, 20, 9, 45, 38, 29, 52]), the mass has to be fixed first in order to know how large $L_0$ has to be chosen. It turns out that one can handle a mass depending on the scale, as in (5.3) above, i.e., a mass proportional to $\log L_0 / L_0$ [24, Theorem 32]. Thus the starting hypothesis (5.3) only requires the decay of the resolvent on finite boxes to be polynomially small in the scale (see Remark 3.3), not exponentially small. Note also that by using the SLI as in (2.14), so we only move between cells, we only need to require $p > 0$ as in [38], not $p > d$ as in [20] (we need to consider only the $\bigl(3 \frac{L}{\ell}\bigr)^d$ cells that are cores of boxes of side $\ell$ inside the bigger box of side $L$, instead of $L^d$ boxes as in [20]). We will only need the weaker conclusion (5.4) for the bootstrap multiscale analysis; we also stated (5.5) because it is the usual conclusion of this multiscale analysis.

Remark 5.3. In Theorem 5.2 the length scale $B$ is increasing in $p' \in (0, p)$, and the half-interval length $\delta_1$ depends on $p$ and $p'$. This should be compared with Theorem 3.4, where the length scale $\mathcal{L}$ and the half-interval length $\delta_0$, while depending on $\theta$, are independent of the parameters in the conclusion (3.4).
5.2. Sub-exponentially decaying probabilities. Previous multiscale analyses only yielded polynomially decaying probabilities for bad events. We now provide new versions of two multiscale analyses that give sub-exponential decay for the probabilities of bad events. We believe our method can yield any decay strictly slower than exponential.

Definition 5.4. Given $\zeta \in (0, 1)$, $E \in \mathbb{R}$, $x \in \mathbb{Z}^d$, and $L \in 6\mathbb{N}$, we say that the box $\Lambda_L(x)$ is $(\zeta, E)$-sub-exponentially-suitable, if $E \notin \sigma(H_{x,L})$ and
\[
  \|\Gamma_{x,L} R_{x,L}(E) \chi_{x,L/3}\|_{x,L} \le e^{-L^{\zeta}} .
  \tag{5.6}
\]
Remark 5.5. A box $\Lambda_L(x)$ is $(\zeta, E)$-sub-exponentially-suitable if and only if it is $(2L^{\zeta - 1}, E)$-regular.

The multiscale analysis with multiplicative growth of length scales has the following sub-exponential version (compare with Theorem 5.1).

Theorem 5.6. Let $H_\omega$ be a random operator satisfying assumptions SLI, IAD and W in some compact interval $I_0$. Let $E_0 \in I_0$ and $\zeta_0 \in (0, 1)$. Given an odd integer $Y \ge 11^{\frac{1}{1 - \zeta_0}}$, for any $\zeta_1$ with $0 < \zeta_1 < \zeta_0$ we can find $Z = Z(d, \varrho, Q_{I_0}, \gamma_{I_0}, b, \zeta_0, \zeta_1, Y)$, such that if for some $L_0 > Z$, $L_0 \in 6\mathbb{N}$, we have
\[
  \mathbb{P}\{\Lambda_{L_0}(0) \text{ is } (\zeta_0, E_0)\text{-sub-exponentially-suitable}\} > 1 - (3Y - 4)^{-2d},
  \tag{5.7}
\]
then, setting $L_{k+1} = Y L_k$, $k = 0, 1, 2, \ldots$, we have that
\[
  \mathbb{P}\{\Lambda_{L_k}(0) \text{ is } (\zeta_0, E_0)\text{-sub-exponentially-suitable}\} \ge 1 - e^{-L_k^{\zeta_1}}
  \tag{5.8}
\]
for all $k \ge K$, where $K = K(\zeta_0, \zeta_1, Y, L_0) < \infty$.
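The two kinds of length-scale growth used in this section can be compared with a short Python sketch: the multiplicative rule $L_{k+1} = Y L_k$ of Theorems 5.1 and 5.6, and the power rule $L_{k+1} = [L_k^{\alpha}]_{6\mathbb{N}}$ of Theorems 3.4 and 5.2. The sketch assumes that $[K]_{6\mathbb{N}}$ means the largest multiple of 6 not exceeding $K$; this reading, the function names, and the sample parameters are assumptions made for illustration.

```python
import math


def round_down_to_6N(value: float) -> int:
    """Largest multiple of 6 not exceeding value (assumed meaning of [.]_{6N})."""
    return 6 * int(math.floor(value / 6.0))


def multiplicative_scales(L0: int, Y: int, steps: int) -> list:
    """Scales L_{k+1} = Y * L_k, starting from L0."""
    scales = [L0]
    for _ in range(steps):
        scales.append(Y * scales[-1])
    return scales


def power_scales(L0: int, alpha: float, steps: int) -> list:
    """Scales L_{k+1} = [L_k^alpha]_{6N}, starting from L0."""
    scales = [L0]
    for _ in range(steps):
        scales.append(round_down_to_6N(scales[-1] ** alpha))
    return scales


print(multiplicative_scales(6, 11, 3))
print(power_scales(60, 1.5, 3))
```

Both sequences stay in $6\mathbb{N}$ when $L_0 \in 6\mathbb{N}$, which is what the definitions of suitable and regular boxes require.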
The well known multiscale analysis with exponential growth of length scales has the following sub-exponential version (compare with Theorem 5.2). In order to get sub-exponential decay of probabilities, our proof allows the number of bad boxes to grow with the scale.

Theorem 5.7. Let $H_\omega$ be a random operator satisfying assumptions SLI, IAD, NE and W in some compact interval $I_0$. Let $E_0 \in I_0$, $0 < \zeta_2 < \zeta_1 < \zeta_0 < 1$. Then for $1 < \alpha < \zeta_0 / \zeta_1$, there is $C = C(d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, b, \zeta_0, \zeta_1, \zeta_2, \alpha)$, such that, if at some finite scale $L_0 \ge C$, $L_0 \in 6\mathbb{N}$, we verify that
\[
  \mathbb{P}\bigl\{\Lambda_{L_0}(0) \text{ is } \bigl(2 L_0^{\zeta_0 - 1}, E_0\bigr)\text{-regular}\bigr\} \ge 1 - e^{-L_0^{\zeta_1}},
  \tag{5.9}
\]
then there exists $\delta_2 = \delta_2(d, \varrho, Q_{I_0}, \gamma_{I_0}, \zeta_0, \zeta_1, \zeta_2, \alpha, L_0) > 0$ such that, if we set $I(\delta_2) = [E_0 - \delta_2, E_0 + \delta_2] \cap I_0$, $m_0 = 2 L_0^{\zeta_0 - 1}$, and $L_{k+1} = [L_k^{\alpha}]_{6\mathbb{N}}$, $k = 0, 1, \ldots$, we have
\[
  \mathbb{P}\Bigl\{ R\Bigl(\frac{m_0}{4}, L_k, I(\delta_2), x, y\Bigr) \Bigr\} \ge 1 - e^{-L_k^{\zeta_2}}
  \tag{5.10}
\]
for all $k = 0, 1, 2, \ldots$ and $x, y \in \mathbb{Z}^d$ with $|x - y| > L_k + \varrho$.

We took $\frac{m_0}{4}$ in (5.10) for convenience; we may take any mass $m = \beta m_0$ with $0 < \beta < 1$, but $C$ and $\delta_2$ will also depend on $\beta$. Note also that we allow the mass $m_0$ in the starting hypothesis (5.9) to decay with the scale $L_0$. The equivalent to (5.4) holds in the context of Theorem 5.7, but it will not be needed.

5.3. Multiplicative growth of length scales: Proofs. We now prove Theorems 5.1 and 5.6, along the lines of [24, Proof of Lemma 36]. We start by introducing some notations to facilitate the simultaneous proof of both theorems.

For Theorem 5.1, we will say that a box is good if it is $(\theta, E_0)$-suitable. Pick $s$ such that
\[
  p + bd < s < \theta,
  \tag{5.11}
\]
and set qL = L−p ,
tL = L−s ,
uL = L−θ .
(5.12)
For Theorem 5.6, we say that a box is good if it is $(\zeta_0, E_0)$-sub-exponentially-suitable. Pick $\xi$ such that
\[ \zeta_1 < \xi < \zeta_0, \tag{5.13} \]
and set
\[ q_L = e^{-L^{\zeta_1}}, \qquad t_L = e^{-L^{\xi}}, \qquad u_L = e^{-L^{\zeta_0}}. \tag{5.14} \]
In both cases, $u_L$ is the decay of the finite volume resolvent we want to propagate, $t_L$ or $t_L^{-1}$ is a control term to be used with Assumption W (the Wegner estimate), and $q_L$ is the probability decay of a bad event we want to end up with. A box is bad if it is not good. We set $p_L$ to be the probability of a bad box at scale $L$, i.e.,
\[ p_L = \mathbb{P}\{\Lambda_L(0) \text{ is bad}\}. \tag{5.15} \]
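Note that the conclusions (5.2) and (5.8) may now be restated as $p_{L_k} \le q_{L_k}$ for $k \ge K$. The proof will proceed by induction. For the induction step, let $\ell \in 6\mathbb{N}$, $\ell > 3\varrho$, $Y \in \mathbb{N}$ odd, and $L = Y\ell$. Knowing $p_\ell$, we will estimate $p_L$. We set
\begin{align}
M_{L,\ell}(x) &= \Lambda_L(x) \cap \tfrac{\ell}{3}\mathbb{Z}^d \subset \mathbb{Z}^d, & M_{L,\ell} &= M_{L,\ell}(0), \tag{5.16} \\
\mathcal{C}_{L,\ell}(x) &= \{\Lambda_\ell(y);\ y \in M_{L,\ell}(x),\ \Lambda_\ell(y) \sqsubset \Lambda_L(x)\}, & \mathcal{C}_{L,\ell} &= \mathcal{C}_{L,\ell}(0), \tag{5.17} \\
M'_{L,\ell}(x) &= \Lambda_L(x) \cap \tfrac{\ell}{6}\mathbb{Z}^d \subset \mathbb{Z}^d, & M'_{L,\ell} &= M'_{L,\ell}(0). \tag{5.18}
\end{align}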
Note $|M_{L,\ell}| = (3Y)^d$, $|M'_{L,\ell}| = (6Y)^d$, and $M_{L,\ell} \subset M'_{L,\ell}$. By a cell we will now mean a closed box $\Lambda_{\ell/3}(y)$ with $y \in M_{L,\ell}$, the core of the box $\Lambda_\ell(y)$. Thus $\mathcal{C}_{L,\ell}(x)$ is the collection of boxes of side $\ell$ whose core is a cell and which are inside the boundary belt $\tilde{\Upsilon}_L(x)$ of the big box $\Lambda_L(x)$; we have $|\mathcal{C}_{L,\ell}| = (3Y-4)^d$. Note that the big box is divided into cells: $\Lambda_L(0) = \bigcup_{x \in M_{L,\ell}} \Lambda_{\ell/3}(x)$.

The induction step proceeds as in [24, Lemma 36]; it is based on the SLI, but only boxes in $\mathcal{C}_{L,\ell}$ are allowed. The basic idea is that, if all boxes in $\mathcal{C}_{L,\ell}$ were good in scale $\ell$, then it would follow from applying the SLI (2.14) repeatedly that the big box is also good in scale $L$. To obtain an improvement in the probability of having a good box, we need to admit the possibility of the existence of bad boxes, to be controlled by Assumption W, but we only need to allow for a fixed number of bad boxes [19, 20]. One starts in a cell inside the core of the big box $\Lambda_L(0)$, i.e., in $\Lambda_{L/3}(0)$, applies the SLI (2.14) repeatedly, and stops just before hitting the boundary belt. Each time the SLI is performed with a good box of size $\ell$, one gains the small factor $u_\ell$, and moves to an adjacent cell (see Remark 2.2). Each time we must perform the SLI with a bad box, we enlarge the box slightly, so the SLI moves us to the core of a good box, where we also perform the SLI. The small factor from the latter SLI is used to control the bad factor (estimated by (2.18)) coming from the former SLI (see (5.24) below).

To make this discussion rigorous, let $S$ denote the maximum number of $\varrho$-nonoverlapping bad boxes in $\mathcal{C}_{L,\ell}$ that we shall allow. Thus at most $S$ boxes, which must be $\varrho$-nonoverlapping, may be bad, out of a total of $(3Y-4)^d$ boxes, and we will control the probability of such an event. The bad boxes produce bad regions, such that any box in $\mathcal{C}_{L,\ell}$ outside these bad regions must be good.
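The counts quoted above can be checked by direct enumeration in one dimension (the $d$-dimensional counts are the $d$-th powers). The sketch below is only illustrative: it normalizes $\ell = 1$, and it assumes that boxes are half-open intervals $[-L/2, L/2)$ and that lying inside the boundary belt can be modeled as strict inclusion in the open big box. Both conventions and all function names are assumptions made for this example, not definitions taken from the text.

```python
from fractions import Fraction

def lattice_points_1d(Y, step_div):
    """Points k/step_div in the half-open interval [-L/2, L/2), with L = Y and ell = 1."""
    L = Fraction(Y)
    step = Fraction(1, step_div)
    k_max = int(L / 2 / step) + 1
    return [k * step for k in range(-k_max, k_max + 1) if -L / 2 <= k * step < L / 2]

def admissible_centers_1d(Y):
    """Centers y in (1/3)Z whose box [y - 1/2, y + 1/2] lies strictly inside (-L/2, L/2)."""
    L, half = Fraction(Y), Fraction(1, 2)
    return [y for y in lattice_points_1d(Y, 3) if -L / 2 < y - half and y + half < L / 2]

Y = 11
print(len(lattice_points_1d(Y, 3)), 3 * Y)        # lattice of mesh ell/3
print(len(lattice_points_1d(Y, 6)), 6 * Y)        # lattice of mesh ell/6
print(len(admissible_centers_1d(Y)), 3 * Y - 4)   # boxes of side ell away from the boundary
```

Exact rational arithmetic is used so that points sitting on the boundary of the box are classified without rounding errors.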
If one has one bad box, then to be sure that another box of size $\ell$ is $\varrho$-nonoverlapping, it suffices to add to the bad box an exterior belt of size $2\ell/3$ (recall $\ell > 3\varrho$), and consider boxes of size $\ell$ with cores outside this region. So the bad region will have size $\ell + 4\ell/3 = 7\ell/3$. If one has $j$, $j \le S$, bad boxes which are clustered in the worst possible way (their exterior belts of size $2\ell/3$ just touch), then the size of the bad region will be $2(2\ell/3) + j\ell + (j-1)4\ell/3 = 7j\ell/3$. Note that the bad region has center either in $M_{L,\ell}$, if $j$ is odd, or in $M'_{L,\ell}$, if $j$ is even. Since after using the SLI with a box $\Lambda_{\ell'}$ one ends up inside this box, on its boundary belt, we shall use slightly bigger boxes of size $\ell' = 7j\ell/3 + 2\ell/3 = (7j+2)\ell/3$, $j \le S$, so one gets out of the bad region after executing this procedure. The bad regions are inside the big box, so we require $(7S+2)\ell/3 < L$, i.e.,
\[ Y > (7S+2)/3. \tag{5.19} \]
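The arithmetic behind the sizes $7j\ell/3$ and $(7j+2)\ell/3$ and behind condition (5.19) can be replayed exactly with rational numbers. This is a small sketch; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def bad_region_size(j, ell):
    """Size of the region produced by j bad boxes of side ell clustered in the worst way."""
    ell = Fraction(ell)
    # two outer belts of width 2*ell/3, j boxes of side ell, (j - 1) gaps of width 4*ell/3
    return 2 * (2 * ell / 3) + j * ell + (j - 1) * (4 * ell / 3)

def enlarged_box_size(j, ell):
    """Side of the slightly bigger box used to step out of the bad region."""
    return bad_region_size(j, ell) + 2 * Fraction(ell) / 3

def fits_in_big_box(S, Y):
    """Condition (5.19): (7S + 2) * ell / 3 < L = Y * ell, i.e. 3Y > 7S + 2."""
    return 3 * Y > 7 * S + 2

ell = 6
for j in range(1, 6):
    assert bad_region_size(j, ell) == Fraction(7 * j * ell, 3)
    assert enlarged_box_size(j, ell) == Fraction((7 * j + 2) * ell, 3)
print(fits_in_big_box(S=1, Y=11))
```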
Now let $F_{L,\ell}$ denote the event that either there are at least $S+1$ $\varrho$-nonoverlapping bad boxes in $\mathcal{C}_{L,\ell}$, or $\operatorname{dist}(\sigma(H_{x,\ell'}), E_0) \le t_L$ for some $x \in M'_{L,\ell}$ and $\ell'$ of the form $(7j+2)\ell/3$, with $j = 1, 2, \dots, S$, or $\operatorname{dist}(\sigma(H_{0,L}), E_0) \le t_L$. If $\beta(Y, S+1)$ denotes the number of possible choices of $S+1$ $\varrho$-nonoverlapping boxes in $\mathcal{C}_{L,\ell}$, and $S \ge 1$, we have
\[ \beta(Y, S+1) \le \frac{1}{(S+1)!}(3Y-4)^{d(S+1)} \le \frac{1}{2}(3Y-4)^{d(S+1)}. \tag{5.20} \]
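The bound (5.20) is a plain counting estimate: dropping the nonoverlap constraint can only increase the number of choices, and a binomial coefficient $\binom{n}{S+1}$ is at most $n^{S+1}/(S+1)!$. A short sketch of that comparison with exact integers follows; the helper names are hypothetical.

```python
from math import comb, factorial

def num_boxes(Y, d):
    """Number of boxes in the collection, (3Y - 4)^d, as stated in the text."""
    return (3 * Y - 4) ** d

def choices_without_constraint(Y, d, S):
    """Ways to pick S + 1 distinct boxes, ignoring the nonoverlap requirement."""
    return comb(num_boxes(Y, d), S + 1)

def bound_520_holds(Y, d, S):
    """Check C(n, S+1) <= n^(S+1) / (S+1)! and 1/(S+1)! <= 1/2 for S >= 1."""
    n = num_boxes(Y, d)
    picks = choices_without_constraint(Y, d, S)
    return picks * factorial(S + 1) <= n ** (S + 1) and factorial(S + 1) >= 2

print(num_boxes(11, 1), choices_without_constraint(11, 1, 1), bound_520_holds(11, 1, 1))
```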
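As in [24, Lemma 36], we will show that, for $L$ and $Y$ large (in a sense to be specified later),
\[ \{\Lambda_L(0) \text{ is bad}\} \subset F_{L,\ell}, \tag{5.21} \]
so
\begin{align}
p_L \le \mathbb{P}(F_{L,\ell}) &\le \frac{1}{2}(3Y-4)^{d(S+1)} p_\ell^{S+1} + \left(S(6Y)^d + 1\right) Q_{I_0} L^{bd}\, t_L \tag{5.22} \\
&\le \frac{1}{2}(3Y-4)^{d(S+1)} p_\ell^{S+1} + \frac{1}{2} q_L, \tag{5.23}
\end{align}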
where we used (2.18) to obtain the last term in (5.22). To obtain (5.23), we take $L$ large enough: $L > Z_1$, for some $Z_1 = Z_1(d, Q_{I_0}, b, S, Y, p, s)$ for Theorem 5.1, and $Z_1 = Z_1(d, Q_{I_0}, b, S, Y, \zeta_1, \xi)$ for Theorem 5.6.

To prove (5.21), note that if the event $F_{L,\ell}$ does not happen (i.e., $\omega \notin F_{L,\ell}$), we can find $\varrho$-nonoverlapping boxes $\Lambda_{\ell_i}$, $i = 1, \dots, r \le S$, with $\ell_i \in \{7j\ell/3;\ j = 1, 2, \dots, S\}$, and $\sum_{i=1}^r \ell_i \le 7S\ell/3$, such that if $x \in M_{L,\ell}$, $x \notin \bigcup_{i=1}^r \Lambda_{\ell_i}$, the box $\Lambda_\ell(x)$ is good. We control the "bad region" $\Lambda_{\ell_i}$ as follows: we apply the SLI (2.14) twice as in [20, Lemma 4.2], first with the extended box $\Lambda_{\ell'_i}$ ($\ell'_i = \ell_i + 2\ell/3$), followed by a good box in $\mathcal{C}_{L,\ell}$. We require that the product of these two factors gives rise to a number strictly smaller than one, so that if one keeps visiting a bad region infinitely often, it yields zero. In other words, taking into account that $\omega \notin F_{L,\ell}$, we require
\[ \left[\gamma_{I_0}(7S+2)^d\, t_L^{-1}\right]\left[\gamma_{I_0} 3^d u_\ell\right] < 1, \tag{5.24} \]
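which is true for $\ell$ large enough (how large depending on $\gamma_{I_0}$, $Y$, $S$, and on either $\theta, s$ or $\zeta_0, \xi$). Thus repeated use of the SLI (2.14) yields
\begin{align}
\|\Gamma_{0,L} R_{0,L}(E_0) \chi_{0,L/3}\|_{0,L} &\le \sum_{x \in M_{L/3,\ell}} \|\Gamma_{0,L} R_{0,L}(E_0) \chi_{x,\ell/3}\|_{0,L} \notag \\
&\le \left(\frac{L}{\ell}\right)^d \sup_{x \in M_{L/3,\ell}} \|\Gamma_{0,L} R_{0,L}(E_0) \chi_{x,\ell/3}\|_{0,L} \notag \\
&\le Y^d \left(\gamma_{I_0} 3^d u_\ell\right)^{N(Y)} t_L^{-1}, \tag{5.25}
\end{align}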
where $N(Y)$ is the number of times we are able to perform the SLI on good boxes, without using the result for the control of a "bad region" as in (5.24). (We cannot get trapped in the bad regions; if we keep getting back to a bad region after performing the SLI to control a bad region, the estimate (5.24) would drive the left-hand side of (5.25) to zero.) To estimate $N(Y)$, note that one goes from a cell inside the core of the big box $\Lambda_L(0)$ to its boundary. Each time we perform the SLI on a good box in $\mathcal{C}_{L,\ell}$, one moves to an adjacent cell. The last good box that can be used has its core cell two cells away from the boundary of $\Lambda_L(0)$ (because of the boundary belt of size 3/2 of the latter); the shortest (thus the worst for our purposes) possible way is then made of $(L/3)/(\ell/3) - 1 = Y - 1$ cells. In addition to that one has to subtract the number of cells where one did not gain anything due to the bad regions, which is, in the worst case, $(7+1)S = 8S$ cells. We thus have
\[ N(Y) \ge Y - 8S - 1. \tag{5.26} \]
Thus for $\Lambda_L(0)$ to be good, it suffices, in view of (5.25), to require
\[ Y^d \left(\gamma_{I_0} 3^d u_\ell\right)^{Y-8S-1} t_L^{-1} \le u_L, \tag{5.27} \]
which is true if we fix $Y$ such that
\[ \begin{aligned} Y - 8S - 1 &\ge 2 && \text{for Theorem 5.1,} \\ Y - 8S - 1 &\ge 2Y^{\zeta_0} && \text{for Theorem 5.6,} \end{aligned} \tag{5.28} \]
and then take $\ell$ large enough, how large depending on $\gamma_{I_0}$, $Y$ and on either $\theta, s$ or $\zeta_0, \xi$. Thus (5.21) is proven.

So far we did not specify the value of $S$. Roughly, $S$ has to be large enough so that the term $p_\ell^{S+1}$ in (5.23) can be converted into $p_L$. It turns out that $S = 1$ is sufficient for Theorem 5.1, as in [24, Proof of Lemma 36]. For Theorem 5.6 we take $S = [Y^{\zeta_0}]$, where $[M]$ denotes the largest integer $\le M$. We now set
\[ \begin{aligned} \mathcal{Y} &= 11 && \text{for Theorem 5.1,} \\ \mathcal{Y} &= 11^{\frac{1}{1-\zeta_0}} && \text{for Theorem 5.6.} \end{aligned} \tag{5.29} \]
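Two numerical sanity checks may help here. The first verifies that an odd integer $Y$ at or above the threshold in (5.29), with $S = 1$ or $S = [Y^{\zeta_0}]$, satisfies (5.19) and (5.28). The second iterates (5.23) along $L_{k+1} = Y L_k$ in the polynomial case $q_L = L^{-p}$, taking equality in the bound and working with logarithms to avoid underflow. The sample values ($d = 1$, $p = 1$, $L_0 = 12000$, starting probability equal to half of $(3Y-4)^{-2d}$) and the function names are assumptions chosen only for illustration.

```python
import math

def first_odd_at_least(x):
    """Smallest odd integer >= x."""
    n = math.ceil(x)
    return n if n % 2 == 1 else n + 1

def conditions_hold(Y, zeta0=None):
    """Check (5.19) and (5.28); zeta0=None selects the Theorem 5.1 case (S = 1)."""
    if zeta0 is None:
        S, needed = 1, 2
    else:
        S, needed = math.floor(Y ** zeta0), 2 * Y ** zeta0
    return 3 * Y > 7 * S + 2 and Y - 8 * S - 1 >= needed

def log_add(a, b):
    """log(exp(a) + exp(b)), computed stably."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))

def first_good_scale(d=1, Y=11, S=1, p=1.0, L0=12000, steps=30):
    """Iterate (5.23) with equality for q_L = L**(-p); return the first k with p_k < q_k."""
    log_A = d * math.log(3 * Y - 4)

    def log_q(k):
        return -p * (math.log(L0) + k * math.log(Y))   # L_k = Y**k * L0

    log_p = math.log(0.5) - 2 * log_A                   # hypothetical start below (3Y-4)^(-2d)
    history = [(log_p, log_q(0))]
    for k in range(steps):
        log_X = (S + 1) * (log_A + log_p)               # log of ((3Y-4)^d p_k)^(S+1)
        log_p = math.log(0.5) + log_add(log_X, log_q(k + 1))
        history.append((log_p, log_q(k + 1)))
    K = next(k for k, (lp, lq) in enumerate(history) if lp < lq)
    return K, history

print(conditions_hold(11))
K, history = first_good_scale()
print(K, all(lp <= lq for lp, lq in history[K:]))
```

In this toy run the bad-box probability first drops below $q_{L_k}$ after a few scales and stays below it afterwards, which is the qualitative behaviour the induction is designed to produce.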
We now require $Y$ to be an odd integer such that $Y \ge \mathcal{Y}$, with $\mathcal{Y}$ the threshold set in (5.29), so with our choice of $S$ conditions (5.19) and (5.28) are satisfied, and we require that $\ell$ is large enough to obtain (5.27). So if we pick $L_0 \ge Z_2$, where $Z_2$ is chosen so $Z_2 \ge Z_1$ and is large enough so (5.27), and hence (5.21), holds, and set $L_{k+1} = Y L_k$, $k = 0, 1, 2, \dots$, $p_k = p_{L_k}$ and $q_k = q_{L_k}$, it follows from (5.23) that
\[ p_{k+1} \le \frac{1}{2}\left((3Y-4)^d p_k\right)^{S+1} + \frac{1}{2} q_{k+1} \quad \text{for } k = 0, 1, 2, \dots. \tag{5.30} \]

We are now in position to finish the argument. Notice first that if $p_k < q_k$, then $\left((3Y-4)^d p_k\right)^{S+1} \le q_{k+1}$ for $Z_2$ large enough (depending on the constants $Y$ and on $p$ for Theorem 5.1, $\zeta_0, \zeta_1$ for Theorem 5.6). This is clear in the first case. In the second, it comes from the choice of $S$, which satisfies $S + 1 \ge Y^{\zeta_0} > Y^{\zeta_1}$. With this choice of $Z_2$, $p_k < q_k$ leads to
\[ p_{k+1} \le \frac{1}{2} q_{k+1} + \frac{1}{2} q_{k+1} = q_{k+1}. \tag{5.31} \]
It thus suffices to show that $p_k < q_k$ must occur at some scale. Suppose $p_{k+1} \ge q_{k+1}$ for $k = 0, 1, 2, \dots, n-1$. It then follows from (5.30) that we have $\left((3Y-4)^d p_k\right)^{S+1} \ge q_{k+1}$ for $k = 0, 1, 2, \dots, n-1$, so, using again (5.30), we conclude that $p_{k+1} \le \left((3Y-4)^d p_k\right)^{S+1}$ for $k = 0, 1, \dots, n-1$, obtaining
\[ q_n \le p_n \le (3Y-4)^{-\frac{d(S+1)}{S}} \left((3Y-4)^{\frac{d(S+1)}{S}} p_0\right)^{(S+1)^n}. \tag{5.32} \]
We now require $p_0$ to be such that $(3Y-4)^{\frac{d(S+1)}{S}} p_0 < 1$. Note that in both cases
\[ (3Y-4)^{\frac{d(S+1)}{S}} \le (3Y-4)^{2d}. \tag{5.33} \]
Thus taking $p_0$ so that
\[ p_0 < (3Y-4)^{-2d}, \tag{5.34} \]
the right-hand side of (5.32) decays much faster than $q_n$, so we get a contradiction. This is clear in the case of Theorem 5.1, where $q_n$ decays exponentially in $n$. For Theorem 5.6, we have
\[ q_n = \exp\left(-(Y^{\zeta_1})^n L_0^{\zeta_1}\right), \tag{5.35} \]
and the contradiction comes from having chosen $S = [Y^{\zeta_0}]$ and $\zeta_0 > \zeta_1$. Thus there must be $K$ depending on $Y$, $L_0$, $d$, and on either $p$ or $\zeta_0, \zeta_1$, so $p_k \le q_k$ for all $k \ge K$. Theorems 5.1 and 5.6 are proven. □

5.4. Exponential growth of length scales: Proofs. The proofs of Theorems 5.2 and 5.7 may be done simultaneously, as we did for Theorems 5.1 and 5.6. For simplicity, we will only give the proof of Theorem 5.7, adapting the methods of [20] to get sub-exponential decay of the probabilities of bad events, rather than the usual polynomial decay. The modifications in the proof to obtain Theorem 5.2 will be apparent to the reader. (We refer to [20] and [24, Theorem 32] for the proof of Theorem 5.2.)

We start by deriving from (5.9) the initial step of the inductive process, i.e., (5.10) with $k = 0$, but with $\frac{m_0}{2}$ substituted for $\frac{m_0}{4}$. We recall $0 < \zeta_2 < \zeta_1 < \zeta_0 < 1$, $m_0 = 2L_0^{\zeta_0-1}$, and pick $\zeta_2 < \xi_1 < \zeta_1$. As in [20, p. 287], if $\Lambda_{L_0}(0)$ is $(m_0, E_0)$-regular, $\operatorname{dist}(\sigma(H_{0,L_0}), E_0) > e^{-L_0^{\xi_1}}$, and we set
\[ \delta_2 = \delta_2(m_0, L_0, \xi_1) = \frac{1}{2}\, e^{-2L_0^{\xi_1}} \left( e^{-\frac{m_0}{2}\frac{L_0}{2}} - e^{-m_0 \frac{L_0}{2}} \right), \tag{5.36} \]
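it follows from the resolvent identity that $\Lambda_{L_0}(0)$ is $(\frac{m_0}{2}, E)$-regular for all $E \in I(\delta_2) = [E_0 - \delta_2, E_0 + \delta_2] \cap I_0$. Thus, it is a consequence of (5.9) and (2.18) that
\begin{align}
\mathbb{P}\left\{\text{for every } E \in I(\delta_2),\ \Lambda_{L_0}(0) \text{ is } \left(\tfrac{m_0}{2}, E\right)\text{-regular}\right\} &\ge 1 - e^{-L_0^{\zeta_1}} - Q_{I_0} L_0^{bd}\, e^{-L_0^{\xi_1}} \notag \\
&\ge 1 - \frac{1}{2} e^{-L_0^{\zeta_2}}, \tag{5.37}
\end{align}
provided $L_0$ is large enough (depending only on the parameters $d, Q_{I_0}, b, \zeta_1, \zeta_2, \xi_1$). Combining with Assumption IAD, we get that
\[ \mathbb{P}\left\{R\left(\tfrac{m_0}{2}, L_0, I(\delta_2), x, y\right)\right\} \ge 1 - e^{-L_0^{\zeta_2}} \tag{5.38} \]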
for all $x, y \in \mathbb{Z}^d$ with $|x - y| > L_0 + \varrho$.

The theorem is now proven by induction. The induction step goes from scale $\ell \ge L_0$ to scale $L = [\ell^\alpha]_{6\mathbb{N}}$, with $1 < \alpha < \zeta_0/\zeta_1$. We assume that for some mass $m$, $\frac{m_0}{4} < m \le \frac{m_0}{2}$, we have
\[ \mathbb{P}\left[R(m, \ell, I(\delta_2), x, y)\right] \ge 1 - e^{-\ell^{\zeta_2}} \tag{5.39} \]
for all $x, y \in \mathbb{Z}^d$ with $|x - y| > \ell + \varrho$. We will show that, if $\ell$ is large enough (in a sense to be specified), the same statement holds at scale $L$ with a new mass $m'$, and we will estimate $m - m'$.

We proceed as in the proof of Theorems 5.1 and 5.6; the basic idea is the same. But in order to propagate such a strong decay of the bad probabilities as in (5.39), it does not suffice to allow for a fixed number of bad (i.e., non-regular) boxes of size $\ell$ inside a bigger box $\Lambda_L(x)$ of size $L$. We must allow the number of bad boxes to grow with the scale. We fix $\zeta_2 < \zeta < \zeta_1$, and allow at most
\[ S_\ell = 2\left[\ell^{(\alpha-1)\zeta}\right] - 1 \tag{5.40} \]
$\varrho$-nonoverlapping bad boxes. That will produce, as in the proof of Theorems 5.1 and 5.6, bad regions $\Lambda_{\ell_i}$, $i = 1, \dots, r \le S_\ell$, with $\ell_i \in \{7j\ell/3,\ j = 1, 2, \dots, S_\ell\}$, with centers in $M'_{L,\ell}(x)$ (see (5.18)). Note that
\[ \sum_{i=1}^r \ell_i \le \frac{14}{3}\, \ell^{1+(\alpha-1)\zeta_1} < 5\, \ell^{\alpha-(\alpha-1)(1-\zeta_1)}, \tag{5.41} \]
and $(\alpha-1)(1-\zeta_1) > 0$, so the sum of the sizes of the bad regions grows slower than $\ell^\alpha$.

The effect of the bad regions will be controlled as follows. We pick $\xi$, $\zeta_2 < \xi < \zeta_1$, and require that in a bad region of size $\ell'_i = \ell_i + 2\ell/3$ we have $\operatorname{dist}(\sigma(H_{\ell'_i}), E) > e^{-L^\xi}$, so the corresponding resolvent will be estimated by a factor $e^{L^\xi}$. (The price we will have to pay, in terms of probabilities, is then given by (2.18) in Assumption W, with $\eta = e^{-L^\xi}$.) By the same reasoning that led us to (5.24) (we also specify $\ell > 3\varrho$), we require
\[ \left[\gamma_{I_0}(7S_\ell+2)^d e^{L^\xi}\right]\left[\gamma_{I_0} 3^d e^{-\frac{m}{2}\ell}\right] < 1. \tag{5.42} \]
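To get a feeling for the sizes involved, the number of allowed bad boxes in (5.40) and the resulting budget for the bad regions in (5.41) can be evaluated numerically. The following sketch uses the bound $\sum_i \ell_i \le 7 S_\ell\, \ell/3$ (the analogue of the bound used for fixed $S$); the sample parameters and the function names are assumptions made for illustration.

```python
import math

def S_ell(ell, alpha, zeta):
    """Number of allowed bad boxes from (5.40): 2 * [ell**((alpha - 1) * zeta)] - 1."""
    return 2 * math.floor(ell ** ((alpha - 1) * zeta)) - 1

def bad_region_budget(ell, alpha, zeta):
    """Upper bound 7 * S_ell * ell / 3 on the total size of the bad regions."""
    return 7 * S_ell(ell, alpha, zeta) * ell / 3

def within_541(ell, alpha, zeta, zeta1):
    """Compare the budget with the right-hand side of (5.41)."""
    assert zeta < zeta1 < 1 < alpha
    return bad_region_budget(ell, alpha, zeta) < 5 * ell ** (alpha - (alpha - 1) * (1 - zeta1))

ell, alpha, zeta, zeta1 = 10**6, 1.5, 0.5, 0.6
print(S_ell(ell, alpha, zeta), within_541(ell, alpha, zeta, zeta1))
```

Note that $S_\ell$ is odd by construction, a fact used later in the probability estimates.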
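We have
\[ \left[\gamma_{I_0}(7S_\ell+2)^d e^{L^\xi}\right]\left[\gamma_{I_0} 3^d e^{-\frac{m}{2}\ell}\right] \le \gamma_{I_0}^2 \left[3(7S_\ell+2)\right]^d e^{\ell^{\alpha\xi}} e^{-\frac{m}{2}\ell} \le e^{\frac{1}{4}\ell^{\zeta_0} - \frac{m}{2}\ell} \tag{5.43} \]
for $\ell$ (and thus $L_0$) large enough, depending on $\gamma_{I_0}$, $d$, $\alpha$, $\xi$ and $\zeta_0$ (but not on $m_0$), provided $\alpha\xi < \zeta_0$, which is true since $\xi < \zeta_1$ and we picked
\[ \alpha < \frac{\zeta_0}{\zeta_1}. \tag{5.44} \]
Moreover, recalling $m_0 = 2L_0^{\zeta_0-1}$, we have
\[ m > \frac{m_0}{4} = \frac{1}{2} L_0^{\zeta_0-1} \ge \frac{1}{2} \ell^{\zeta_0-1}, \tag{5.45} \]
so (5.42) follows from (5.43).

Once we have (5.42), and assume $\operatorname{dist}(\sigma(H_{0,L}), E) > e^{-L^\xi}$, the same argument used to derive (5.25) leads to
\[ \|\Gamma_{0,L} R_{0,L}(E) \chi_{0,L/3}\|_{0,L} \le \ell^{d(\alpha-1)} \left(3^d \gamma_{I_0} e^{-\frac{m}{2}\ell}\right)^{N_\ell} e^{L^\xi}, \tag{5.46} \]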
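where $N_\ell$ is the number of times we are guaranteed to be able to perform the SLI on good boxes, without using the result for the control of a "bad region" as in (5.42). Similarly to (5.26), we have
\[ N_\ell \ge \frac{L}{\ell} - 8S_\ell - 1 \ge \frac{L}{\ell}\left(1 - 32\,\ell^{(\alpha-1)(\zeta-1)}\right), \tag{5.47} \]
if, say, $\ell \ge 12$ (so $1 - 6\ell^{-\alpha} \ge 1/2$; recall $L = [\ell^\alpha]_{6\mathbb{N}} \ge \ell^\alpha - 6$). Thus
\[ \|\Gamma_{0,L} R_{0,L}(E) \chi_{0,L/3}\|_{0,L} \le e^{-m' \frac{L}{2}}, \tag{5.48} \]
with, using $L \ge \ell^\alpha - 6$ and (5.45),
\begin{align}
m' &\ge m\left(1 - 32\,\ell^{(\alpha-1)(\zeta-1)}\right) - 2\left[\frac{(\alpha-1)d \log \ell}{\ell^\alpha - 6} + \frac{\log(3^d \gamma_{I_0})}{\ell} + \frac{1}{(\ell^\alpha - 6)^{1-\xi}}\right] \notag \\
&\ge m\left\{1 - 32\,\ell^{(\alpha-1)(\zeta-1)} - \frac{4}{\ell^{\zeta_0}}\left[\frac{(\alpha-1)d\, \ell \log \ell}{\ell^\alpha - 6} + \log(3^d \gamma_{I_0}) + \frac{\ell}{(\ell^\alpha - 6)^{1-\xi}}\right]\right\} \notag \\
&\ge m\left(1 - \frac{C}{\ell^\tau}\right), \tag{5.49}
\end{align}
for some finite constant $C = C(d, \gamma_{I_0}, \alpha) > 0$ and
\[ \tau = \min\{\zeta_0,\ (\alpha-1)(1-\zeta),\ \zeta_0 - (\alpha(\xi-1)+1)\} > 0; \tag{5.50} \]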
note $\zeta_0 - (\alpha(\xi-1)+1) = \alpha - 1 + \zeta_0 - \alpha\xi > 0$ by (5.44).

We still need to assure that $\frac{m_0}{4} < m' \le \frac{m_0}{2}$. This cannot be done in a single induction step, because we would need to take $\ell$ large depending on $m$. But we can do it in a way that applies to all inductive steps. Given $L_0$ large enough for the inductive step, $1 < \alpha < \zeta_0/\zeta_1$, we construct the sequence of length scales $L_{k+1} = [L_k^\alpha]_{6\mathbb{N}}$, $k = 0, 1, \dots$. Applying the inductive step from scale $L_k$ to scale $L_{k+1}$, we obtain a decreasing sequence of masses $\mathfrak{m}_k$, with $\mathfrak{m}_0 = \frac{m_0}{2}$, satisfying (5.48) and (5.49) at scale $L_k$. We then have
\[ 0 \le \sum_{k=0}^{+\infty} (\mathfrak{m}_k - \mathfrak{m}_{k+1}) \le \frac{m_0}{2} \sum_{k=0}^{+\infty} \frac{C}{L_k^\tau} < \frac{m_0}{4}, \tag{5.51} \]
provided $L_0$ is large enough, depending on $d, \gamma_{I_0}, \alpha, \zeta_0, \zeta, \xi$, but not on $m_0$. (Note that the fact that how large $L_0$ has to be is independent of $m_0$, possible in view of (5.45) and (5.49), is quite important, since in (5.9) we have $m_0 = 2L_0^{\zeta_0-1}$.) It follows that
\[ \frac{m_0}{4} < \mathfrak{m}_k \le \frac{m_0}{2}, \qquad k = 0, 1, 2, \dots. \tag{5.52} \]
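The role of "$L_0$ large enough" in (5.51) and (5.52) can be illustrated numerically. The sketch below computes the exponent $\tau$ of (5.50), builds the scales $L_{k+1} = [L_k^\alpha]_{6\mathbb{N}}$ (reading $[x]_{6\mathbb{N}}$ as the largest multiple of 6 not exceeding $x$, consistent with the remark $L \ge \ell^\alpha - 6$), and iterates the lower bound from (5.49) with masses measured in units of the starting mass $m_0/2$. The constant $C = 1$, the sample exponents and all function names are hypothetical values chosen for the example.

```python
import math

def tau(zeta0, zeta, xi, alpha):
    """Exponent from (5.50)."""
    return min(zeta0, (alpha - 1) * (1 - zeta), zeta0 - (alpha * (xi - 1) + 1))

def floor_6N(x):
    """Largest multiple of 6 not exceeding x (assumed meaning of the bracket notation)."""
    return 6 * math.floor(x / 6)

def scales(L0, alpha, n):
    """The scales L_0, ..., L_n with L_{k+1} = floor_6N(L_k**alpha)."""
    Ls = [L0]
    for _ in range(n):
        Ls.append(floor_6N(Ls[-1] ** alpha))
    return Ls

def mass_fractions(L0, alpha, t, C, n):
    """Iterate m_{k+1} = m_k * (1 - C / L_k**t), in units of the starting mass m_0/2."""
    m, out = 1.0, [1.0]
    for L in scales(L0, alpha, n)[:-1]:
        m *= 1.0 - C / L ** t
        out.append(m)
    return out

t = tau(zeta0=0.9, zeta=0.55, xi=0.58, alpha=1.4)
print(t)
print(min(mass_fractions(600000, 1.4, t, 1.0, 8)) > 0.5)   # stays above m_0/4
print(min(mass_fractions(600, 1.4, t, 1.0, 8)) > 0.5)      # too small a starting scale
```

With these sample values the masses stay above half of the starting mass (that is, above $m_0/4$) for the larger initial scale but not for the smaller one, which is the reason the initial scale must be taken large.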
We now turn to the probability estimates of the inductive step; here we follow [20, Lemma 4.1]. To apply the just discussed deterministic argument in a given box $\Lambda_L(x)$, for a fixed energy $E \in I_0$, it suffices to require:
(i) $\operatorname{dist}(\sigma(H_{x,L}), E) > e^{-L^\xi}$.
(ii) $\operatorname{dist}(\sigma(H_{y,\ell'}), E) > e^{-L^\xi}$ for all $\ell' \in \{(7j+2)\ell/3,\ j = 1, 2, \dots, S_\ell\}$ and $y \in M'_{L,\ell}(x)$.
(iii) There are at most $S_\ell$ $\varrho$-nonoverlapping bad boxes in $\mathcal{C}_{L,\ell}(x)$.
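It follows that
\[ \mathbb{P}\left\{R(m', L, I(\delta_2), x, y)\right\} \ge \mathbb{P}\left\{\text{for any } E \in I(\delta_2),\ \text{(i), (ii) and (iii) hold for either } \Lambda_L(x) \text{ or } \Lambda_L(y)\right\}. \tag{5.53} \]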
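Thus, to complete the inductive step, it suffices to show that the right-hand side of (5.53) is bigger than $1 - e^{-L^{\zeta_2}}$ for any $x, y \in \mathbb{Z}^d$ with $|x - y| > L + \varrho$.

Let $\tilde{I}_0 \supset I_0$ be the open interval in Assumptions NE and W, and let $\tilde{\sigma}(A) = \sigma(A) \cap \tilde{I}_0$ for any operator $A$. If $\Lambda_{\ell_1}(x_1)$ and $\Lambda_{\ell_2}(x_2)$ are $\varrho$-nonoverlapping boxes, then it follows from Assumptions IAD, NE and W that
\[ \mathbb{P}\left\{\operatorname{dist}\left(\tilde{\sigma}(H_{x_1,\ell_1}), \tilde{\sigma}(H_{x_2,\ell_2})\right) \le \eta\right\} \le C_{I_0} Q_{I_0}\, \eta\, \ell_1^d \ell_2^{bd}, \tag{5.54} \]
by the same argument as in [20, p. 293], using (2.18) and (2.17) (see [29] for an argument using (2.19)). Thus, if we fix $x, y \in \mathbb{Z}^d$ with $|x - y| > L + \varrho$, we have
\begin{align}
&\mathbb{P}\Big\{\operatorname{dist}\left(\tilde{\sigma}(H_{x_1,\ell_1}), \tilde{\sigma}(H_{x_2,\ell_2})\right) \le 2e^{-L^\xi} \text{ for some } x_1 \in M'_{L,\ell}(x),\ x_2 \in M'_{L,\ell}(y), \notag \\
&\qquad\qquad \ell_1, \ell_2 \in \{L, (7j+2)\ell/3,\ j = 1, 2, \dots, S_\ell\}\Big\} \notag \\
&\quad \le 2 C_{I_0} Q_{I_0} (S_\ell + 1)^2 \left(\frac{6L}{\ell}\right)^{2d} L^{(b+1)d} e^{-L^\xi} \notag \\
&\quad \le 8 \cdot 36^d\, C_{I_0} Q_{I_0}\, \ell^{(\alpha-1)(2\zeta+2d)+(b+1)\alpha d} e^{-L^\xi} \le \frac{1}{2} e^{-L^{\zeta_2}}, \tag{5.55}
\end{align}
for $\ell$ sufficiently large, depending on $d, C_{I_0}, Q_{I_0}, b, \zeta_2, \alpha, \zeta, \xi$.

Now, let $E \in I(\delta_2)$, and suppose there exist $x_1 \in M'_{L,\ell}(x)$, $\ell_1 \in \{L, (7j+2)\ell/3,\ j = 1, 2, \dots, S_\ell\}$, with $\operatorname{dist}(\sigma(H_{x_1,\ell_1}), E) \le e^{-L^\xi}$. If $L$ is large enough, depending only on $\tilde{I}_0 \setminus I_0$, we must also have $\operatorname{dist}(\tilde{\sigma}(H_{x_1,\ell_1}), E) \le e^{-L^\xi}$. If the event whose probability is estimated in (5.55) does not occur, we must have $\operatorname{dist}(\tilde{\sigma}(H_{x_2,\ell_2}), E) > e^{-L^\xi}$, and hence also $\operatorname{dist}(\sigma(H_{x_2,\ell_2}), E) > e^{-L^\xi}$, for all $x_2 \in M'_{L,\ell}(y)$ and $\ell_2 \in \{L, (7j+2)\ell/3,\ j = 1, 2, \dots, S_\ell\}$. Since we can interchange $x$ and $y$ in this argument, we can conclude that if $\ell$ is large enough,
\[ \mathbb{P}\left\{\text{for any } E \in I(\delta_2),\ \text{(i) and (ii) hold for either } \Lambda_L(x) \text{ or } \Lambda_L(y)\right\} \ge 1 - \frac{1}{2} e^{-L^{\zeta_2}}. \tag{5.56} \]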
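On the other hand, since we chose $S_\ell$ to be an odd integer, and using Assumption IAD, we have
\begin{align}
&\mathbb{P}\left\{\text{for some } E \in I(\delta_2) \text{ there are at least } S_\ell + 1\ \varrho\text{-nonoverlapping bad boxes in } \mathcal{C}_{L,\ell}(0)\right\} \notag \\
&\quad \le \mathbb{P}\left\{\text{for some } E \in I(\delta_2) \text{ there are at least two } \varrho\text{-nonoverlapping bad boxes in } \mathcal{C}_{L,\ell}(0)\right\}^{\frac{S_\ell+1}{2}} \notag \\
&\quad \le \left[\left(\frac{3L}{\ell}\right)^{2d} e^{-\ell^{\zeta_2}}\right]^{\frac{S_\ell+1}{2}} \le \left[9^d\, \ell^{2(\alpha-1)d} e^{-\ell^{\zeta_2}}\right]^{[\ell^{(\alpha-1)\zeta}]} \tag{5.57} \\
&\quad \le \frac{1}{4} e^{-L^{\zeta_2}}, \tag{5.58}
\end{align}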
where we used the induction hypothesis (5.39) to get (5.57). The final estimate (5.58) holds for $\ell$ sufficiently large, depending on $d, \zeta_2, \alpha, \zeta$, since $\zeta_2 + (\alpha-1)\zeta > \alpha\zeta_2$ as $\zeta > \zeta_2$. We can thus conclude that
\[ \mathbb{P}\left[\text{for some } E \in I(\delta_2),\ \text{(iii) does not hold for either } \Lambda_L(x) \text{ or } \Lambda_L(y)\right] \le \frac{1}{2} e^{-L^{\zeta_2}}. \tag{5.59} \]
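The passage from (5.57) to (5.58) rests on the exponent comparison $\zeta_2 + (\alpha-1)\zeta > \alpha\zeta_2$, and it only holds once $\ell$ is large. A quick numerical comparison of logarithms shows this; the sample values $d = 1$, $\alpha = 1.4$, $\zeta = 0.55$, $\zeta_2 = 0.5$ and the function names are assumptions for this sketch, and $[x]_{6\mathbb{N}}$ is read as the largest multiple of 6 not exceeding $x$.

```python
import math

def floor_6N(x):
    """Largest multiple of 6 not exceeding x (assumed meaning of the bracket notation)."""
    return 6 * math.floor(x / 6)

def log_bound_557(ell, d, alpha, zeta, zeta2):
    """Logarithm of the last expression in (5.57)."""
    exponent = math.floor(ell ** ((alpha - 1) * zeta))            # equals (S_ell + 1) / 2
    log_base = d * math.log(9) + 2 * (alpha - 1) * d * math.log(ell) - ell ** zeta2
    return exponent * log_base

def step_558_holds(ell, d, alpha, zeta, zeta2):
    """Check whether the bound (5.57) is at most exp(-L**zeta2) / 4 with L = floor_6N(ell**alpha)."""
    L = floor_6N(ell ** alpha)
    return log_bound_557(ell, d, alpha, zeta, zeta2) <= math.log(0.25) - L ** zeta2

print(step_558_holds(6 * 10**6, 1, 1.4, 0.55, 0.5))   # large scale
print(step_558_holds(12, 1, 1.4, 0.55, 0.5))          # scale too small
```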
Combining (5.53), (5.56) and (5.59), we get that
\[ \mathbb{P}\left\{R(m', L, I(\delta_2), x, y)\right\} \ge 1 - e^{-L^{\zeta_2}}, \tag{5.60} \]
for $\ell$ sufficiently large, the desired result.

Thus, if $L_0$ is large enough, how large depending only on the parameters $d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, b, \zeta_0, \zeta_1, \zeta_2, \alpha$, we construct the sequence of length scales $L_{k+1} = [L_k^\alpha]_{6\mathbb{N}}$, $k = 0, 1, \dots$, and we may apply the inductive step from scale $L_k$ to scale $L_{k+1}$, starting from (5.38) for $k = 0$, obtaining (5.60) with $L = L_k$ and $m'$ equal to the mass $\mathfrak{m}_k$ at scale $L_k$, and hence, using (5.52), the conclusion (5.10) for all $k = 0, 1, 2, \dots$. This finishes the proof of Theorem 5.7. □

6. Bootstrap Multiscale Analysis

We now prove Theorem 3.4. This will be done by a bootstrapping argument, making successive use of Theorems 5.1, 5.2, 5.6, and 5.7. We start by giving an outline of the proof:

Prologue: Under the hypotheses of Theorem 3.4, we note that hypothesis (5.1) of Theorem 5.1 is the same as hypothesis (3.3) for appropriate choices of the parameters.

Act 1: We apply Theorem 5.1, obtaining a sequence of length scales satisfying conclusion (5.2), with its polynomial decay estimate of the probability of bad events.

Act 2: In view of Remark 3.3, it follows that hypothesis (5.3) of Theorem 5.2 is now satisfied at suitably large scale. (We have bootstrapped from hypothesis (3.3) to hypothesis (5.3)!) Thus we can apply Theorem 5.2 with appropriate parameters, getting $\delta_1 > 0$ and a sequence of length scales satisfying conclusion (5.4) for all $E \in I(\delta_1)$. We set $\delta_0 = \delta_1$.

Act 3: We fix $\zeta$ and $\alpha$ as in Theorem 3.4, and pick $\zeta_0, \zeta_1, \zeta_2$ such that $0 < \zeta < \zeta_2 < \zeta_1 < \zeta_0 < 1 < \alpha < \zeta_0\zeta_1^{-1} < \zeta_2^{-1} < \zeta^{-1}$. We note that we have bootstrapped again: hypothesis (5.7) of Theorem 5.6 is satisfied at all energies $E \in I(\delta_0)$ at appropriately large scale (the same for all $E$). Applying Theorem 5.6, we obtain a sequence of length scales for which conclusion (5.8) holds for all $E \in I(\delta_0)$, with its sub-exponential decay estimate of the probability of bad events.
Act 4: Using Remark 5.5, we can see that we have bootstrapped to Theorem 5.7: for any $0 < \zeta_2 < \zeta_1 < \zeta_0 < 1$, hypothesis (5.9) is satisfied at all energies $E \in I(\delta_1)$ at sufficiently large scale (depending on $\zeta_0, \zeta_1, \zeta_2$ but independent of $E$). We apply Theorem 5.7, obtaining $\delta_2 > 0$ and an exponentially growing sequence of length scales, depending on $\zeta_0, \zeta_1, \zeta_2$, but independent of $E$, such that conclusion (5.10) holds for all $E \in I(\delta_1)$.

Epilogue: We have constructed in Act 4 a sequence of length scales for which (5.10) holds for all $E \in I(\delta_0)$. Since the interval $I(\delta_0)$ (which is independent of $\zeta$) can be covered by $[\frac{\delta_0}{\delta_2}] + 1$ closed intervals of length $\delta_2$, we note that the desired conclusion (3.4) now follows from (5.10), at the energies that are the centers of the $[\frac{\delta_0}{\delta_2}] + 1$ covering intervals, if we take $L_0$ appropriately large.

We now give the detailed proof of Theorem 3.4: Given $\theta > bd$, we pick $p$, $0 < p < \theta - bd$; to fix ideas we take $p = \frac{\theta - bd}{2}$. We choose $Y = 11$, and let $Z = Z(d, \varrho, Q_{I_0}, \gamma_{I_0}, b, \theta, p, Y = 11)$ be as in Theorem 5.1. We take $\mathcal{L} = \mathcal{L}(d, \varrho, Q_{I_0}, \gamma_{I_0}, b, \theta) = Z$, and note that hypothesis (5.1) of Theorem 5.1 is now the same as hypothesis (3.3) with $L_0 = L$ and $(3Y-4)^{2d} = 841^d$.

We now fix $E_0 \in I_0$ and assume (3.3) for this $E_0$ with $L > \mathcal{L}$. We set $L_0^{(1)} = L$, and define a sequence of length scales $L_k^{(1)}$ by $L_{k+1}^{(1)} = Y L_k^{(1)}$, $k = 0, 1, 2, \dots$. We apply Theorem 5.1, and conclude that (5.2) holds for these length scales for all $k \ge K^{(1)} = K^{(1)}(d, \gamma_{I_0}, b, \theta, L)$. In view of Remark 3.3, we have that
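\[ \mathbb{P}\left\{\Lambda_{L_k^{(1)}}(0) \text{ is } \left(2\theta\, \frac{\log L_k^{(1)}}{L_k^{(1)}}, E_0\right)\text{-regular}\right\} \ge 1 - \frac{1}{\left(L_k^{(1)}\right)^p} \tag{6.1} \]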
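for all $k \ge K^{(1)}$.

Note that we have bootstrapped to hypothesis (5.3) of Theorem 5.2, since (6.1) is the same as (5.3) at scale $L_k^{(1)}$. We take $p' = \frac{\theta - bd}{4}$ and $\alpha_1 = 1 + \frac{p'}{2(p'+2d)}$, and take $B = B(d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, b, \theta, p, p', \alpha_1)$ as in Theorem 5.2. Letting $k_1$ be the smallest $k \ge K^{(1)}$ such that $L_k^{(1)} > B$, we define length scales $L_0^{(2)} = L_{k_1}^{(1)}$, $L_{k+1}^{(2)} = \left[\left(L_k^{(2)}\right)^{\alpha_1}\right]_{6\mathbb{N}}$ for $k = 0, 1, 2, \dots$. We apply Theorem 5.2 with $L_0 = L_0^{(2)}$ in (5.3), and conclude that (5.4) holds for these length scales with $\delta_1 = \delta_1(d, \varrho, Q_{I_0}, \gamma_{I_0}, \theta, p, p', \alpha_1, L_0^{(2)}) > 0$. Letting
\[ \delta_0 = \delta_0(d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, \theta, L) = \delta_1(d, \varrho, Q_{I_0}, \gamma_{I_0}, \theta, p, p', \alpha_1, L_0^{(2)}) > 0, \tag{6.2} \]
we proved that for all $k = 0, 1, 2, \dots$ we have
\[ \mathbb{P}\left\{\Lambda_{L_k^{(2)}}(0) \text{ is } (m_1, E)\text{-regular}\right\} \ge 1 - \frac{1}{\left(L_k^{(2)}\right)^{p'}} \quad \text{for all } E \in I(\delta_0), \tag{6.3} \]
with $I(\delta_0) = (E_0 - \delta_0, E_0 + \delta_0) \cap I_0$ and $m_1 = \theta\, \frac{\log L_0^{(2)}}{L_0^{(2)}}$.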
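Now let us fix $\zeta$ and $\alpha$ as in Theorem 3.4, so $0 < \zeta < 1 < \alpha < \zeta^{-1}$. To be definite, we take
\[ \zeta_2 = \sqrt{\zeta\alpha^{-1}}, \qquad \zeta_1 = \sqrt{\zeta_2\alpha^{-1}}, \qquad \zeta_0 = \sqrt{\zeta_1\alpha}, \tag{6.4} \]
so we have
\[ 0 < \zeta < \zeta_2 < \zeta_1 < \zeta_0 < 1 < \alpha < \zeta_0\zeta_1^{-1} < \zeta_2^{-1} < \zeta^{-1}. \tag{6.5} \]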
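The chain (6.5) and the absorption of the covering prefactor described in the Epilogue of the outline can both be checked numerically. The sketch below assumes that (6.4) is read with square roots, i.e. as geometric means, so that $\zeta_2$ lies strictly between $\zeta$ and $\alpha^{-1}$; this reading, the sample values and the function names are assumptions of the example rather than statements taken from the text.

```python
import math

def zetas(zeta, alpha):
    """Geometric-mean reading of (6.4); returns (zeta0, zeta1, zeta2)."""
    zeta2 = math.sqrt(zeta / alpha)
    zeta1 = math.sqrt(zeta2 / alpha)
    zeta0 = math.sqrt(zeta1 * alpha)
    return zeta0, zeta1, zeta2

def chain_65_holds(zeta, alpha):
    """Check the chain of inequalities (6.5)."""
    z0, z1, z2 = zetas(zeta, alpha)
    return 0 < zeta < z2 < z1 < z0 < 1 < alpha < z0 / z1 < 1 / z2 < 1 / zeta

def prefactor_absorbed(L, n_intervals, zeta, zeta2):
    """True when n_intervals * exp(-L**zeta2) <= exp(-L**zeta), compared via logarithms."""
    return math.log(n_intervals) - L ** zeta2 <= -(L ** zeta)

zeta, alpha = 0.5, 1.5
z0, z1, z2 = zetas(zeta, alpha)
print(chain_65_holds(zeta, alpha))
print(prefactor_absorbed(600, 1001, zeta, z2), prefactor_absorbed(12, 1001, zeta, z2))
```

Because $\zeta < \zeta_2$, any fixed number of covering intervals is eventually absorbed by the gap between the two exponents, which is why the final estimate holds from some scale on.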
Next, we apply Theorem 5.6. To do so, let $Y_1$ be the first odd integer bigger than $11^{\frac{1}{1-\zeta_0}}$ and let $Z_1 = Z(d, \varrho, Q_{I_0}, \gamma_{I_0}, b, \zeta_0, \zeta_1, Y_1)$ be as in Theorem 5.6. Let $L_0^{(3)} = L_{k_2}^{(2)}$, where $k_2$ is the smallest integer $k$ such that:
\[ L_k^{(2)} > Z_1, \qquad \left(L_k^{(2)}\right)^{p'} > (3Y_1 - 4)^{2d}, \qquad 2\left(L_k^{(2)}\right)^{\zeta_0 - 1} < m_1. \tag{6.6} \]
Then, recalling Remark 5.5, it follows from (6.3) that for all $E \in I(\delta_0)$ we have
\[ \mathbb{P}\left\{\Lambda_{L_0^{(3)}}(0) \text{ is } (\zeta_0, E)\text{-sub-exponentially-suitable}\right\} > 1 - (3Y_1 - 4)^{-2d}, \tag{6.7} \]
and we have bootstrapped to hypothesis (5.7) of Theorem 5.6 for all $E \in I(\delta_0)$, uniformly in $E \in I(\delta_0)$. We now set $L_{k+1}^{(3)} = Y_1 L_k^{(3)}$, $k = 0, 1, 2, \dots$, so it follows from Theorem 5.6 that for all $E \in I(\delta_0)$,
\[ \mathbb{P}\left\{\Lambda_{L_k^{(3)}}(0) \text{ is } (\zeta_0, E)\text{-sub-exponentially-suitable}\right\} \ge 1 - e^{-\left(L_k^{(3)}\right)^{\zeta_1}} \tag{6.8} \]
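for all $k \ge K^{(3)}$, where $K^{(3)} = K(\zeta_1, Y_1, L_0^{(3)}) < \infty$.

To complete our final bootstrap, we use Remark 5.5 to rewrite (6.8) as
\[ \mathbb{P}\left\{\Lambda_{L_k^{(3)}}(0) \text{ is } \left(2\left(L_k^{(3)}\right)^{\zeta_0 - 1}, E\right)\text{-regular}\right\} \ge 1 - e^{-\left(L_k^{(3)}\right)^{\zeta_1}} \tag{6.9} \]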
for all $E \in I(\delta_0)$ and $k \ge K^{(3)}$. Note that (6.9) is just hypothesis (5.9) of Theorem 5.7 at scale $L_k^{(3)}$ for each $E \in I(\delta_0)$. Thus we set $L_0^{(4)} = L_{k_3}^{(3)}$, where $k_3$ is the smallest integer $k \ge K^{(3)}$ such that $L_k^{(3)} > C$, where the constant
\[ C = C(d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, b, \zeta_0, \zeta_1, \zeta_2, \alpha) \]
is as in Theorem 5.7, with the parameters from (6.4). Note the crucial fact that $L_0^{(4)}$ is the same for all $E \in I(\delta_0)$. Theorem 5.7 provides us with
\[ \delta_2 = \delta_2(d, \varrho, Q_{I_0}, \gamma_{I_0}, \zeta_0, \zeta_1, \zeta_2, \alpha, L_0^{(4)}) > 0, \]
so, setting $I(\delta_2, E) = [E - \delta_2, E + \delta_2] \cap I_0$, $m_\zeta = \frac{1}{2}\left(L_0^{(4)}\right)^{\zeta_0 - 1}$, and $L_{k+1}^{(4)} = \left[\left(L_k^{(4)}\right)^\alpha\right]_{6\mathbb{N}}$, $k = 0, 1, \dots$, we have
\[ \mathbb{P}\left\{R\left(m_\zeta, L_k^{(4)}, I(\delta_2, E), x, y\right)\right\} \ge 1 - e^{-\left(L_k^{(4)}\right)^{\zeta_2}} \tag{6.10} \]
for all $E \in I(\delta_0)$, $k = 0, 1, 2, \dots$, and $x, y \in \mathbb{Z}^d$ with $|x - y| > L_k^{(4)} + \varrho$. Since $I(\delta_0)$ can be covered by intervals $I(\delta_2, E_i)$, $i = 1, 2, \dots, [\frac{\delta_0}{\delta_2}] + 1$, with each $E_i \in I(\delta_0)$, we can conclude from (6.10) that
\[ \mathbb{P}\left\{R\left(m_\zeta, L_k^{(4)}, I(\delta_0), x, y\right)\right\} \ge 1 - \left(\left[\tfrac{\delta_0}{\delta_2}\right] + 1\right) e^{-\left(L_k^{(4)}\right)^{\zeta_2}} \ge 1 - e^{-\left(L_k^{(4)}\right)^{\zeta}}, \tag{6.11} \]
for all $x, y \in \mathbb{Z}^d$ with $|x - y| > L_k^{(4)} + \varrho$, and $k \ge k_4$, where $k_4$ is the smallest $k$ such that the last inequality in (6.11) holds. Note $L_{k_4}^{(4)}$ depends only on $\delta_0, \delta_2, \alpha, L_0^{(4)}, \zeta, \zeta_2$, and hence only on $d, \varrho, Q_{I_0}, C_{I_0}, \gamma_{I_0}, \theta, \zeta, \alpha, L$. To conclude the proof of Theorem 3.4, we set $L_0 = L_{k_4}^{(4)}$, so $L_k = L_{k_4+k}^{(4)}$, $k = 0, 1, \dots$, and note that (3.4) now follows from (6.11). The proof of Theorem 3.4 is complete. □

Acknowledgement. F. G. thanks Peter Stollman for enjoyable discussions. F. G. also thanks the University of California, Irvine for its hospitality. A. K. thanks Alex Figotin and Svetlana Jitomirskaya for enjoyable discussions.
References
1. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
2. Aizenman, M., Molchanov, S.: Localization at large disorder and extreme energies: An elementary derivation. Commun. Math. Phys. 157, 245–278 (1993)
3. Aizenman, M., Schenker, J., Friedrich, R., Hundertmark, D.: Finite-volume criteria for Anderson localization. Commun. Math. Phys., to appear
4. Barbaroux, J.M., Combes, J.M., Hislop, P.D.: Localization near band edges for random Schrödinger operators. Helv. Phys. Acta 70, 16–43 (1997)
5. Barbaroux, J.M., Fischer, W., Müller, P.: Dynamical properties of random Schrödinger operators. Preprint (1999)
6. Berezanskii, Ju.M.: Expansions in eigenfunctions of selfadjoint operators. Providence, RI: Am. Math. Soc., 1968
7. Carmona, R., Klein, A., Martinelli, F.: Anderson localization for Bernoulli and other singular potentials. Commun. Math. Phys. 108, 41–66 (1987)
8. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. Boston: Birkhäuser, 1990
9. Combes, J.M., Hislop, P.D.: Localization for some continuous, random Hamiltonian in d-dimension. J. Funct. Anal. 124, 149–180 (1994)
10. Combes, J.M., Hislop, P.D.: Landau Hamiltonians with random potentials: Localization and the density of states. Commun. Math. Phys. 177, 603–629 (1996)
11. Combes, J.M., Hislop, P.D., Nakamura, S.: The $L^p$-theory of the spectral shift function, the Wegner estimate and the integrated density of states for some random operators. Commun. Math. Phys. 218, 113–130 (2001)
12. Combes, J.M., Hislop, P.D., Tip, A.: Band edge localization and the density of states for acoustic and electromagnetic waves in random media. Ann. Inst. H. Poincare Phys. Theor. 70, 381–428 (1999)
13. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger operators. Heidelberg: Springer-Verlag, 1987
14. Damanik, D., Stollman, P.: Multi-scale analysis implies strong dynamical localization. Geom. Funct. Anal. 11, 11–29 (2001)
15. Damanik, D., Sims, R., Stolz, G.: Localization for one dimensional, continuum, Bernoulli–Anderson models. Duke Math. J., to appear
16. De Bièvre, S., Germinet, F.: Dynamical localization for random dimer Schrödinger operator. J. Stat. Phys. 98, 1135–1147 (2000)
17. Del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: What is Localization? Phys. Rev. Lett. 75, 117–119 (1995)
18. Del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum IV: Hausdorff dimensions, rank one perturbations and localization. J. d'Analyse Math. 69, 153–200 (1996)
19. von Dreifus, H.: On the effects of randomness in ferromagnetic models and Schrödinger operators. Ph.D. thesis, New York University (1987)
20. von Dreifus, H., Klein, A.: A new proof of localization in the Anderson tight binding model. Commun. Math. Phys. 124, 285–299 (1989)
21. von Dreifus, H., Klein, A.: Localization for random Schrödinger operators with correlated potentials. Commun. Math. Phys. 140, 133–147 (1991)
22. Figotin, A., Klein, A.: Localization phenomenon in gaps of the spectrum of random lattice operators. J. Stat. Phys. 75, 997–1021 (1994)
23. Figotin, A., Klein, A.: Localization of electromagnetic and acoustic waves in random media. Lattice model. J. Stat. Phys. 76, 985–1003 (1994)
24. Figotin, A., Klein, A.: Localization of classical waves I: Acoustic waves. Commun. Math. Phys. 180, 439–482 (1996)
25. Figotin, A., Klein, A.: Localization of classical waves II: Electromagnetic waves. Commun. Math. Phys. 184, 411–441 (1997)
26. Fröhlich, J., Spencer, T.: Absence of diffusion with Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 151–184 (1983)
27. Fröhlich, J., Martinelli, F., Scoppola, E., Spencer, T.: Constructive proof of localization in the Anderson tight binding model. Commun. Math. Phys. 101, 21–46 (1985)
28. Germinet, F.: Dynamical localization II with an application to the almost Mathieu operator. J. Stat. Phys. 95, 273–286 (1999)
29. Germinet, F., De Bièvre, S.: Dynamical localization for discrete and continuous random Schrödinger operators. Commun. Math. Phys. 194, 323–341 (1998)
30. Germinet, F., Jitomirskaya, S.: Strong dynamical localization for the almost Mathieu model. Rev. Math. Phys., to appear
31. Germinet, F., Klein, A.: Finite volume criterium for localization in random media. In preparation
32. Germinet, F., Klein, A.: Order parameter for the Anderson metal-insulator transport transition. In preparation
33. Holden, H., Martinelli, F.: On absence of diffusion near the bottom of the spectrum for a random Schrödinger operator. Commun. Math. Phys. 93, 197–217 (1984)
34. Jitomirskaya, S., Last, Y.: Anderson localization for the almost Mathieu equation, III. Semi-uniform localization, continuity of gaps, and measure of the spectrum. Commun. Math. Phys. 195, 1–14 (1998)
35. Jona-Lasinio, G., Martinelli, F., Scoppola, E.: Multiple tunnelings in d-dimensions: A quantum particle in a hierarchical potential. Ann. Inst. Henri Poincaré 42, 73–108 (1985)
36. Lacroix, J., Klein, A., Speis, A.: Localization for the Anderson model on a strip with singular potentials. J. Funct. Anal. 94, 135–155 (1990)
37. Kirsch, W., Martinelli, F.: On the ergodic properties of the spectrum of general random operators. J. Reine Angew. Math. 334, 141–156 (1982)
38. Kirsch, W., Stolz, G., Stollman, P.: Localization for random perturbations of periodic Schrödinger operators. Random Oper. Stochastic Equations 6, 241–268 (1998)
39. Kirsch, W., Stolz, G., Stollman, P.: Anderson localization for random Schrödinger operators with long range interactions. Commun. Math. Phys. 195, 495–507 (1998)
40. Klein, A.: Localization in the Anderson model with long range hopping. Brazilian J. Phys. 23, 363–371 (1993)
41. Klein, A.: Extended states in the Anderson model on the Bethe lattice. Adv. in Math. 133, 163–184 (1998)
42. Klein, A., Koines, A.: A general framework for localization of classical waves: I. Inhomogeneous media and defect eigenmodes. Math. Phys. Anal. Geom., to appear
43. Klein, A., Koines, A.: A general framework for localization of classical waves: II. Random media. In preparation
44. Klein, A., Koines, A., Seifert, M.: Generalized eigenfunctions for waves in inhomogeneous media. J. Funct. Anal., to appear
45. Klopp, F.: Localization for continuous random Schrödinger operators. Commun. Math. Phys. 167, 553–569 (1995)
46. Kunz, H., Souillard, B.: Sur le spectre des operateurs aux differences finies aleatoires. Commun. Math. Phys. 78, 201–246 (1980)
47. Martinelli, F., Scoppola, E.: Introduction to the mathematical theory of Anderson localization. Riv. Nuovo Cimento 10, 1–90 (1987)
48. Pastur, L.: Spectral properties of disordered systems in one-body approximation. Commun. Math. Phys. 75, 179–196 (1980)
49. Pastur, L., Figotin, A.: Spectra of Random and Almost-Periodic Operators. Heidelberg: Springer-Verlag, 1992
50. Simon, B.: Schrödinger semi-groups. Bull. Am. Math. Soc. 7, 447–526 (1982)
51. Spencer, T.: Localization for random and quasiperiodic potentials. J. Stat. Phys. 51, 1009–1019 (1988)
52. Stollmann, P.: Caught by disorder. To appear
53. Tcheremchantsev, S.: How to prove dynamical localization. Commun. Math. Phys., to appear
54. Wang, W.-M.: Microlocalization, percolation, and Anderson localization for the magnetic Schrödinger operator with a random potential. J. Funct. Anal. 146, 1–26 (1997)
55. Wang, W.-M.: Localization and universality of Poisson statistics for the multidimensional Anderson model at weak disorder. Invent. Math., to appear

Communicated by M. Aizenman
Commun. Math. Phys. 222, 449 – 474 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Large-Time Behaviors of Solutions to an Inflow Problem in the Half Space for a One-Dimensional System of Compressible Viscous Gas

Akitaka Matsumura1, Kenji Nishihara2

1 Department of Mathematics, Osaka University, Osaka 560-0045, Japan
2 School of Political Science and Economics, Waseda University, Tokyo 169-8050, Japan.
E-mail: [email protected] Received: 25 April 2000 / Accepted: 20 April 2001
Dedicated to Professors Takaaki Nishida and Masayasu Mimura on the occasion of their sixtieth birthdays

Abstract: The “inflow problem” for a one-dimensional compressible barotropic flow on the half-line $R_+ = (0, +\infty)$ is investigated. Not only classical waves but also a new wave, called the “boundary layer solution”, arises. The expected large-time behaviors of the solutions have been classified in terms of the boundary values by [A. Matsumura, Inflow and outflow problems in the half space for a one-dimensional isentropic model system of compressible viscous gas, to appear in Proceedings of IMS Conference on Differential Equations from Mechanics, Hong Kong, 1999]. In this paper we give rigorous proofs of the stability theorems for both the boundary layer solution and a superposition of the boundary layer solution and the rarefaction wave.

1. Introduction

In this paper we consider the “inflow problem” recently proposed by the first author [6] for a one-dimensional compressible barotropic flow on the half-line $R_+ = (0, \infty)$, which is an initial-boundary value problem in the Eulerian coordinate $(\tilde x, t)$:
$$
\begin{cases}
\tilde\rho_t + (\tilde\rho\tilde u)_{\tilde x} = 0, \quad (\tilde x, t) \in R_+ \times R_+,\\
(\tilde\rho\tilde u)_t + (\tilde\rho\tilde u^2 + \tilde p)_{\tilde x} = \mu\tilde u_{\tilde x\tilde x},\\
(\tilde\rho, \tilde u)|_{\tilde x=0} = (\rho_-, u_-) \quad \text{with } u_- > 0,\\
(\tilde\rho, \tilde u)|_{t=0} = (\tilde\rho_0, \tilde u_0)(\tilde x) \to (\rho_+, u_+) \quad \text{as } \tilde x \to +\infty.
\end{cases}
\tag{1.1}
$$
Here, $\tilde\rho\ (> 0)$ is the density, $\tilde u$ is the velocity, $\tilde p = \tilde p(\tilde\rho) = \tilde\rho^{\gamma}$ (the adiabatic constant $\gamma \ge 1$) is the pressure, and $\mu$ is the viscosity constant. The condition
$$
\tilde\rho_0(\tilde x) > 0, \qquad \rho_\pm > 0 \tag{1.2}
$$
This work was supported in part by a Grant-in-Aid for Scientific Research c(2) 10640216 of the Ministry
of Education, Science, Sports and Culture.
is assumed, and so the flow does not include the vacuum state at the initial time. The compatibility condition is
$$
(\tilde\rho_0, \tilde u_0)(0) = (\rho_-, u_-). \tag{1.3}
$$
The assumption $u_- > 0$ implies that, through the boundary $\tilde x = 0$, the fluid with density $\rho_-$ flows into the region under consideration with speed $u_- > 0$, and hence the problem is called the inflow problem. In the case of $u_- < 0$ the problem is called the outflow problem.

When $u_- = 0$, and hence the condition $\tilde\rho|_{\tilde x=0} = \rho_-$ is removed, the problem becomes an initial-boundary value problem with fixed boundary:
$$
\begin{cases}
\tilde\rho_t + (\tilde\rho\tilde u)_{\tilde x} = 0, \quad (\tilde x, t) \in R_+ \times R_+,\\
(\tilde\rho\tilde u)_t + (\tilde\rho\tilde u^2 + \tilde p)_{\tilde x} = \mu\tilde u_{\tilde x\tilde x},\\
\tilde u|_{\tilde x=0} = 0,\\
(\tilde\rho, \tilde u)|_{t=0} = (\tilde\rho_0, \tilde u_0)(\tilde x) \to (\rho_+, u_+) \quad \text{as } \tilde x \to +\infty.
\end{cases}
$$
This is changed to the problem in the Lagrangian coordinate $(x, t)$:
$$
\begin{cases}
v_t - u_x = 0, \quad (x, t) \in R_+ \times R_+,\\
u_t + p(v)_x = \mu\left(\dfrac{u_x}{v}\right)_x,\\
u|_{x=0} = 0\ (:= u_-),\\
(v, u)|_{t=0} = (v_0, u_0)(x) \to (v_+, u_+) := (1/\rho_+, u_+) \quad \text{as } x \to +\infty.
\end{cases}
\tag{1.4}
$$
Here, $v = 1/\rho$, $u$ and $p = p(v) = v^{-\gamma}$ $(\gamma \ge 1)$ are, respectively, the specific volume, velocity and pressure in the Lagrangian coordinate. Matsumura and Nishihara [11] and Matsumura and Mei [7] have shown that the solution $(v, u)$ to (1.4), roughly speaking, tends to the rarefaction wave as $t$ tends to infinity when $u_+ > u_- = 0$, and to the viscous shock wave when $u_+ < u_- = 0$.

We now concentrate on the case $u_- > 0$, the inflow problem (1.1). For the case $u_- < 0$, see [6] and also [3]. In [6] the problem (1.1) is treated, but here we transform (1.1) to the problem in the Lagrangian coordinate:
$$
(P)\quad
\begin{cases}
v_t - u_x = 0, \quad x > s_- t,\ t > 0,\\
u_t + p(v)_x = \mu\left(\dfrac{u_x}{v}\right)_x,\\
(v, u)|_{x=s_- t} = (v_-, u_-), \quad v_- = 1/\rho_-,\ u_- > 0,\\
(v, u)|_{t=0} = (v_0, u_0)(x) \to (v_+, u_+) = (1/\rho_+, u_+) \quad \text{as } x \to \infty,
\end{cases}
$$
where
$$
s_- = -\frac{u_-}{v_-} < 0.
$$
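See Fig. 1.1. The change $(\tilde x, t) \to (x, t)$ is given by
$$
\begin{cases}
\dfrac{\partial}{\partial t}\tilde x(x, t) = \tilde u(\tilde x(x, t), t), \quad t > 0,\ x > 0,\\
\tilde x(x, 0) = \tilde x_0(x),
\end{cases}
\tag{1.5}
$$
with
$$
\int_0^{\tilde x_0(x)} \tilde\rho(y, 0)\,dy = x,
$$
where $(\tilde x, t) \in \Omega_1 = \{(\tilde x, t);\ \tilde x > \tilde x(0, t)\}$, and by
$$
\begin{cases}
\dfrac{\partial}{\partial t}\tilde x(x, t) = \tilde u(\tilde x(x, t), t), \quad t > t_0(x),\ x < 0,\\
\tilde x(x, t_0(x)) = 0,
\end{cases}
$$
with
$$
-x = \int_0^{t_0(x)} (\tilde\rho\tilde u)(0, \tau)\,d\tau = (\rho_- u_-)\cdot t_0(x),
$$
where $(\tilde x, t) \in \Omega_2 = \{(\tilde x, t);\ 0 < \tilde x < \tilde x(0, t)\}$. See Fig. 1.2. From the definition it follows that
$$
x \ge -\rho_- u_- t = -\frac{u_-}{v_-}t = s_- t,
$$
and that
$$
\int_{\tilde x(0,t)}^{\tilde x(x,t)} \tilde\rho(y, t)\,dy = x \quad \text{for } (\tilde x, t) \in \Omega_i\ (i = 1, 2).
$$

Fig. 1.1. (γ = 1) [The $(x, t)$-plane with the lines $x = s_- t$, $x = x_0 + \lambda_1 t$ and $x = x_0 + \lambda_2 t$.]

Fig. 1.2. (γ > 1) [The $(\tilde x, t)$-plane with the curves $\tilde x(0, t)$ and $\tilde x(x, t)$, the regions $\Omega_1$ and $\Omega_2$, and the points $t_0(x)$ and $\tilde x_0(x)$.]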
Fig. 1.3. Left: (γ = 1); right: (γ > 1) [The $(v, u)$-plane divided by the curve $u = c(v)$ into the regions $\Omega_{sub}$, $\Omega_{trans}$ and $\Omega_{super}$.]
Hence, for $f(x, t) = \tilde f(\tilde x(x, t), t)$,
$$
\frac{\partial}{\partial t} f(x, t) = \left(\frac{\partial}{\partial t} + \tilde u\frac{\partial}{\partial\tilde x}\right)\tilde f(\tilde x, t), \qquad
\frac{\partial}{\partial x} f(x, t) = v\frac{\partial}{\partial\tilde x}\tilde f(\tilde x, t),
$$
which yields (P) with (1.5).

We now consider the inflow problem (P) described in the Lagrangian coordinate. The characteristic speeds of the corresponding hyperbolic system without viscosity are $\lambda_i(v) = (-1)^i\sqrt{-p'(v)}$ $(i = 1, 2)$. Compare them with the speed $s_- = -u_-/v_-$ of the moving boundary. Since the sound speed $c(v)$ is defined by
$$
c(v) = v\sqrt{-p'(v)} = \sqrt{\gamma}\,v^{-(\gamma-1)/2} \tag{1.6}
$$
(note that $v\sqrt{-p'(v)} = \sqrt{\tilde p'(\tilde\rho)}$), comparing $|u|$ with $c(v)$ instead of $|u|/v$ with $|\lambda_i(v)|$, we divide the $(v, u)$-space into three regions:
$$
\begin{aligned}
\Omega_{sub} &= \{(v, u);\ |u| < c(v),\ v > 0,\ u > 0\},\\
\Omega_{trans} &= \{(v, u);\ |u| = c(v),\ v > 0,\ u > 0\},\\
\Omega_{super} &= \{(v, u);\ |u| > c(v),\ v > 0,\ u > 0\}.
\end{aligned}
\tag{1.7}
$$
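The division (1.7) can be illustrated numerically. The following is a minimal sketch, assuming $p(v) = v^{-\gamma}$ as in the text; the function names and the tolerance `tol` used to detect the transonic curve are illustrative choices, not part of the paper.

```python
import math

def sound_speed(v, gamma):
    """c(v) = v*sqrt(-p'(v)) = sqrt(gamma) * v**(-(gamma-1)/2) for p(v) = v**(-gamma), cf. (1.6)."""
    return math.sqrt(gamma) * v ** (-(gamma - 1.0) / 2.0)

def char_speed(i, v, gamma):
    """lambda_i(v) = (-1)**i * sqrt(-p'(v)), with -p'(v) = gamma * v**(-gamma-1)."""
    return (-1) ** i * math.sqrt(gamma * v ** (-gamma - 1.0))

def boundary_speed(v_minus, u_minus):
    """Speed s_- = -u_-/v_- of the moving boundary in the Lagrangian coordinate."""
    return -u_minus / v_minus

def classify(v, u, gamma, tol=1e-12):
    """Return 'sub', 'trans' or 'super' according to (1.7); tol is a hypothetical tolerance."""
    if v <= 0 or u <= 0:
        raise ValueError("the regions in (1.7) are defined for v > 0 and u > 0")
    c = sound_speed(v, gamma)
    if abs(u - c) <= tol * max(1.0, c):
        return "trans"
    return "sub" if u < c else "super"
```

For a subsonic boundary state the sketch also lets one confirm the ordering $\lambda_1(v_-) < s_- < 0$ by comparing `char_speed(1, v_minus, gamma)` with `boundary_speed(v_minus, u_minus)`.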
Call them the subsonic, transonic, and supersonic regions, respectively. See Fig. 1.3. When $(v_-, u_-) \in \Omega_{sub}$, we have $\lambda_1(v_-) < s_-\ (< 0)$, and hence the existence of a traveling wave solution $(V, U)(x - s_- t)$ with $(V, U)(0) = (v_-, u_-)$, $(V, U)(+\infty) = (v_+, u_+)$ is expected. Substituting this into (P)$_{1,2}$ (this means the first and second equations in (P)), we have
$$
\begin{cases}
-s_- V' - U' = 0, \quad {}' = d/d\xi,\ \xi = x - s_- t > 0,\\
-s_- U' + p(V)' = \mu\left(\dfrac{U'}{V}\right)',\\
(V, U)(0) = (v_-, u_-), \quad (V, U)(+\infty) = (v_+, u_+).
\end{cases}
\tag{1.8}
$$
We call the solution $(V, U)$ the boundary layer solution, or simply the BL-solution. We seek the condition for the existence of the BL-solution. When $(V, U)$ exists, the integration of (1.8) over $(0, \infty)$ and $(\xi, \infty)$ yields
$$
\begin{cases}
-s_-(v_+ - v_-) - (u_+ - u_-) = 0,\\
-s_-(u_+ - u_-) + p(v_+) - p(v_-) = -\mu\dfrac{U'(0)}{v_-}
\end{cases}
\tag{1.9}
$$
and
$$
\begin{cases}
-s_-(V - v_+) - (U - u_+) = 0,\\
-s_-(U - u_+) + p(V) - p(v_+) = \mu\dfrac{U'}{V}.
\end{cases}
\tag{1.10}
$$

Fig. 1.4. Left: ($v_+ < v_-$); right: ($v_+ > v_-$) [For each case, three panels plotted against $v$ with $v_-$, $v_+$ and $v_*$ marked: the curve $u = c(v)$, the graph of $p(v)$ with a line of slope $-s^2$, and the graph of $h(v)$.]

From (1.9)$_1$ and (1.10)$_1$,
$$
s_- = -\frac{u_-}{v_-} = -\frac{U(\xi)}{V(\xi)} = -\frac{u_+}{v_+}, \tag{1.11}
$$
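and hence we define the BL-line through $(v_-, u_-) \in \Omega_{sub}$ by
$$
BL(v_-, u_-) = \left\{(v, u) \in \Omega_{sub} \cup \Omega_{trans};\ \frac{u}{v} = \frac{u_-}{v_-} = -s_-\right\}.
$$
Denote $\Omega_{trans} \cap BL(v_-, u_-) = \{(v_*, u_*)\}$. By (1.10) we have the ordinary differential equation for $V$:
$$
\begin{cases}
\mu\dfrac{dV}{d\xi} = \dfrac{V}{s_-}\left\{-s_-^2(V - v_+) - (p(V) - p(v_+))\right\} =: \dfrac{V}{s_-}h(V),\\
V(0) = v_-, \quad V(+\infty) = v_+.
\end{cases}
\tag{1.12}
$$
Conversely, for $(v_+, u_+) \in BL(v_-, u_-)$ there exists a solution $(V, U)$ to (1.12), because $h(v_+) = 0$ and, since $s_- < 0$, the right-hand side of (1.12) has the sign required for $V$ to move monotonically from $v_-$ to $v_+$: $h(v) > 0$ for $v_+ < v < v_-$ if $v_+ < v_-$, and $h(v) < 0$ for $v_- < v < v_+\ (\le v_*)$ if $v_- < v_+$. See Fig. 1.4. Noting that $h'(v_*) = 0$ and $h''(v_*) \ne 0$, we have the following lemma.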
Fig. 1.5. [The $(v, u)$-plane with the curve $u = c(v)$, the segments $BL_+$ and $BL_-$ through $(v_-, u_-)$, and the curves $R_2(v_*, u_*)$, $R_2(v_-, u_-)$, $R_1(v_*, u_*)$, $S_2(v_*, u_*)$ and $S_2(v_-, u_-)$.]
Lemma 1.1 (Boundary Layer Solution). Let $(v_-, u_-) \in \Omega_{sub}$ and $(v_+, u_+) \in BL(v_-, u_-)$. Then, there exists a unique solution $(V, U)(\xi)$ to (1.8), which satisfies
$$
\begin{aligned}
|(V(\xi) - v_+, U(\xi) - u_+)| &\le C\exp(-c|\xi|) && \text{if } v_+ < v_*,\\
|(V(\xi) - v_+, U(\xi) - u_+)| &\le C|\xi|^{-1} && \text{if } v_+ = v_*.
\end{aligned}
$$
On the other hand, since $0 > \lambda_1(v) > s_-$ in $\Omega_{super}$, the 1-characteristic field is away from the moving boundary. The 2-characteristic field is, of course, away from the boundary. Hence, the behaviors of solutions are expected to be the same as those for the Cauchy problem. By noting that $c'(v_*) > -\lambda_2(v_*)$ for $1 < \gamma < 3$, the large time behaviors to be expected divide the $(v, u)$-space as in Fig. 1.5. Here,
$$
\begin{aligned}
BL_+(v_-, u_-) &= \{(v, u) \in BL(v_-, u_-);\ v_- < v \le v_*\},\\
BL_-(v_-, u_-) &= \{(v, u) \in BL(v_-, u_-);\ 0 < v < v_-\},\\
R_1(v_*, u_*) &= \Big\{(v, u);\ u = u_* - \int_{v_*}^{v}\lambda_1(s)\,ds,\ v > v_*\Big\},\\
R_2(v_-, u_-) &= \Big\{(v, u);\ u = u_- - \int_{v_-}^{v}\lambda_2(s)\,ds,\ v < v_*\Big\},\\
R_2(v_*, u_*) &= \Big\{(v, u);\ u = u_* - \int_{v_*}^{v}\lambda_2(s)\,ds,\ v < v_*\Big\},\\
S_2(v_-, u_-) &= \{(v, u);\ u = u_- - s_2(v - v_-),\ v > v_-\},\\
S_2(v_*, u_*) &= \{(v, u);\ u = u_* - s_*(v - v_*),\ v > v_*\},
\end{aligned}
$$
together with
$$
s_2 = \sqrt{-(p(v) - p(v_-))/(v - v_-)}, \qquad s_* = \sqrt{-(p(v) - p(v_*))/(v - v_*)}.
$$
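The wave curves above can be evaluated in closed form for $p(v) = v^{-\gamma}$. The following sketch is an illustration only; the helper names are hypothetical, and the closed-form antiderivative of $\lambda_2(s) = \sqrt{\gamma}\,s^{-(\gamma+1)/2}$ is the only ingredient beyond the definitions in the text.

```python
import math

def lam2(v, gamma):
    """lambda_2(v) = sqrt(-p'(v)) = sqrt(gamma) * v**(-(gamma+1)/2)."""
    return math.sqrt(gamma) * v ** (-(gamma + 1.0) / 2.0)

def int_lam2(a, b, gamma):
    """Closed form of the integral of lambda_2 from a to b (logarithm when gamma = 1)."""
    if gamma == 1.0:
        return math.log(b / a)
    e = (1.0 - gamma) / 2.0
    return math.sqrt(gamma) * (b ** e - a ** e) / e

def u_on_R2(v, v0, u0, gamma):
    """u-coordinate on the 2-rarefaction curve through (v0, u0): u = u0 - int_{v0}^{v} lambda_2."""
    return u0 - int_lam2(v0, v, gamma)

def u_on_S2(v, v0, u0, gamma):
    """u-coordinate on the 2-shock curve through (v0, u0), for v > v0."""
    s = math.sqrt(-(v ** (-gamma) - v0 ** (-gamma)) / (v - v0))
    return u0 - s * (v - v0)
```

For a base state $(v_0, u_0)$, points with $v < v_0$ on the rarefaction curve have $u > u_0$, and points with $v > v_0$ on the shock curve have $u < u_0$, which matches the arrangement of the curves in Fig. 1.5.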
Our aim is to investigate the stability of the BL-solution or of a superposition of the BL-solution and nonlinear waves. Our results are, roughly speaking, as follows:

(I) If $(v_+, u_+) \in BL_+(v_-, u_-)$, then the BL-solution is stable.

(II) If $(v_+, u_+) \in BL_-(v_-, u_-)$, then the BL-solution is stable provided that $|(v_+ - v_-, u_+ - u_-)|$ is small. That is, the BL-solution is required to be weak.

(III) If $(v_+, u_+) \in BL_+R_2(v_-, u_-)$, then there exists $(\bar v, \bar u) \in BL_+(v_-, u_-)$ such that $(v_+, u_+) \in R_2(\bar v, \bar u)$, and the superposition of the BL-solution connecting $(v_-, u_-)$ with $(\bar v, \bar u)$ and the 2-rarefaction wave connecting $(\bar v, \bar u)$ with $(v_+, u_+)$ is stable provided that $|(v_+ - \bar v, u_+ - \bar u)|$ is small, where
$$
BL_+R_2(v_-, u_-) = \Big\{(v, u);\ u > -s_- v,\ u > u_- - \int_{v_-}^{v}\lambda_2(s)\,ds,\ u \le u_* - \int_{v_*}^{v}\lambda_2(s)\,ds\Big\}.
$$
That is, the BL-solution is not necessarily weak and the rarefaction wave is weak.

(IV) If $(v_+, u_+) \in BL_-R_2(v_-, u_-)$, then the superposition of the BL-solution and the 2-rarefaction wave is stable provided that $|(v_+ - v_-, u_+ - u_-)|$ is small, where
$$
BL_-R_2(v_-, u_-) = \Big\{(v, u);\ u > -s_- v,\ u < u_- - \int_{v_-}^{v}\lambda_2(s)\,ds\Big\}.
$$
In this case, both the BL-solution and the rarefaction wave are weak.

(V) If $(v_*, u_*) \in BL_+(v_-, u_-)$, $(v_+, u_+) \in R_1R_2(v_*, u_*)$ and $|(v_+ - v_*, u_+ - u_*)|$ is small, then the superposition of the BL-solution, the 1-rarefaction wave and the 2-rarefaction wave is stable. Here,
$$
R_1R_2(v_*, u_*) = \Big\{(v, u);\ u > u_* - \int_{v_*}^{v}\lambda_i(s)\,ds,\ i = 1, 2\Big\}.
$$
Similar to (III), the BL-solution is not necessarily weak.

In later sections we will give the proofs of (I)–(V), which are interesting to compare with those of the results [8,9,10] on the Cauchy problem for the viscous p-system:
$$
\begin{cases}
v_t - u_x = 0, \quad (x, t) \in R \times R_+,\\
u_t + p(v)_x = \mu\left(\dfrac{u_x}{v}\right)_x,\\
(v, u)|_{t=0} = (v_0, u_0)(x) \to (v_\pm, u_\pm) \quad \text{as } x \to \pm\infty.
\end{cases}
\tag{1.13}
$$
(For more general systems see [1, 4, 12, 13] etc. and references therein.) In these papers the signs of the first order derivatives of rarefaction waves and viscous shock waves are crucial. The cases (I) and (III) are, respectively, similar to the cases $(v_+, u_+) \in R_1(v_-, u_-)$ and $(v_+, u_+) \in R_1R_2(v_-, u_-)$. Hence, global results on the present problem are expected, but we could not control the values from the boundary for large data. In the case of (II) it seems possible to take the perturbation in the integrated form $(\phi, \psi)(x, t) = -\int_\xi^\infty (v - V, u - U)\,dy$. However, in general, $\int_0^\infty (v_0 - V, u_0 - U)(y)\,dy \ne (0, 0)$. Even if we assume that $\int_0^\infty (v_0 - V, u_0 - U)(y)\,dy = (0, 0)$, we could not control the values from the boundary even for small data. So, we put the perturbation $(\phi, \psi) = (v - V, u - U)$. This is not an integrated form, and the sign of $(V_\xi, U_\xi)$ is not good. However, we can overcome this for the weak BL-solution, applying the discussion by Kawashima and Nikkuni [2]. Related to this case, when
$$
(v_+, u_+) \in BL_-S_2(v_-, u_-) = \{(v, u);\ u < -s_- v,\ u < u_- - s_2(v - v_-)\},
$$
the asymptotic state is conjectured to be $(V, U)(x - s_- t) + (V_2^S, U_2^S)(x - s_2 t + \alpha) - (\bar v, \bar u)$ together with a suitable shift $\alpha$, where $(\bar v, \bar u) \in BL_-(v_-, u_-)$ is such that $(v_+, u_+) \in S_2(\bar v, \bar u)$, $(V, U)$ is the BL-solution connecting $(v_-, u_-)$ with $(\bar v, \bar u)$, and $(V_2^S, U_2^S)$ is the 2-viscous shock wave connecting $(\bar v, \bar u)$ with $(v_+, u_+)$. In the final section we will discuss how to determine the shift $\alpha$.

The plan of this paper is as follows. After stating the notations, in Sect. 2 we show the cases (I), (II). In Sect. 3 the cases (III)–(V) will be treated. In the final section we will present the concluding remarks.

Notations. Throughout this paper several positive generic constants are denoted by $c_i(a, b, \cdots)$, $C_i(a, b, \cdots)$ $(i = 0, 1, 2, \cdots)$ depending on $a, b, \cdots$, or simply by $c_i$, $C_i$, $c$, $C$ without confusion. Denote $f(x) \sim g(x)$ as $x \to a$ when $C^{-1}g < f < Cg$ in a neighborhood of $a$. For function spaces, $L^p(\Omega)$, $1 \le p \le \infty$, is a usual Lebesgue space on $\Omega \subset R = (-\infty, \infty)$ with its norm
$$
\|f\|_{L^p(\Omega)} = \Big(\int_\Omega |f(x)|^p\,dx\Big)^{1/p}, \quad 1 \le p < \infty, \qquad \|f\|_{L^\infty(\Omega)} = \sup_\Omega |f(x)|.
$$
$H^l(\Omega)$ denotes the $l$th order Sobolev space with its norm
$$
\|f\|_l = \Big(\sum_{j=0}^{l}\|\partial_x^j f\|^2\Big)^{1/2}, \quad \text{where } \|\cdot\| := \|\cdot\|_{L^2(\Omega)}.
$$
$H_0^1(\Omega)$ is the closure of $C_0^\infty(\Omega)$ with respect to the $H^1$-norm, so that $f \in H_0^1(\Omega)$ satisfies $f(\partial\Omega) = 0$. The domain $\Omega$ will often be omitted when no confusion arises.

2. Stability of the Boundary Layer Solution

2.1. The case $(v_+, u_+) \in BL_+(v_-, u_-)$. Assume that
$$
(v_-, u_-) \in \Omega_{sub} \quad \text{and} \quad (v_+, u_+) \in BL_+(v_-, u_-), \tag{2.1}
$$
then Lemma 1.1 gives a unique boundary layer solution $(V, U)(\xi)$, $\xi = x - s_- t \ge 0$, $s_- = -u_-/v_-$, satisfying (1.8) or (1.12). Note that
$$
V_\xi = \frac{V}{\mu s_-}h(V) > 0, \quad h(V) < 0, \quad h'(V) > 0 \quad \text{for } v_- < V < v_+. \tag{2.2}
$$
We put the perturbation $(\phi, \psi)(\xi, t)$ by
$$
(v, u)(x, t) = (V, U)(\xi) + (\phi, \psi)(\xi, t), \tag{2.3}
$$
so that the reformulated problem is
$$
\begin{cases}
\phi_t - s_-\phi_\xi - \psi_\xi = 0, \quad \xi > 0,\ t > 0,\\
\psi_t - s_-\psi_\xi + (p(V + \phi) - p(V))_\xi = \mu\left(\dfrac{U_\xi + \psi_\xi}{V + \phi} - \dfrac{U_\xi}{V}\right)_\xi,\\
(\phi, \psi)|_{\xi=0} = (0, 0),\\
(\phi, \psi)|_{t=0} = (\phi_0, \psi_0)(\xi) := (v_0 - V, u_0 - U)(\xi),
\end{cases}
\tag{2.4}
$$
from (P) and (1.8). The solution space is
$$
X_{m,M}(0, T) = \Big\{(\phi, \psi) \in C([0, T]; H_0^1)\ \Big|\ \phi_\xi \in L^2(0, T; L^2),\ \psi_\xi \in L^2(0, T; H^1),\ \text{with } \sup_{[0,T]}\|(\phi, \psi)(t)\|_1 \le M,\ \inf_{R_+\times[0,T]}(V + \phi)(\xi, t) \ge m\Big\},
$$
for positive constants $m$, $M$. To obtain the stability theorem, we combine the time-local existence of the solution $(\phi, \psi)$ to (2.4) with the a priori estimates. Those are given as follows.

Proposition 2.1 (Local existence). Let $(\phi_0, \psi_0)$ be in $H_0^1(R_+)$. If $\|\phi_0, \psi_0\|_1 \le M$ and $\inf_{R_+\times[0,T]}(V + \phi)(\xi, t) \ge m$, then there exists $t_0 = t_0(m, M) > 0$ such that (2.4) has a unique solution $(\phi, \psi) \in X_{\frac12 m, 2M}(0, t_0)$.

Proposition 2.2 (A priori estimates). Let $(\phi, \psi)$ be in $X_{\frac12 m, \varepsilon}(0, T)$. Then, for a suitably small $\varepsilon > 0$, there exists a constant $C_0 > 0$ such that
$$
\|(\phi, \psi)(t)\|_1^2 + \int_0^t \big(\psi_\xi(0, \tau)^2 + \|\sqrt{V_\xi}\,\phi(\tau)\|^2 + \|\phi_\xi(\tau)\|^2 + \|\psi_{\xi\xi}(\tau)\|^2\big)\,d\tau \le C_0\|\phi_0, \psi_0\|_1^2.
$$

Remark 2.1. If $\varepsilon$ is suitably small, then $\inf_{R_+\times[0,T]}(V + \phi)(\xi, t) \ge m/2$ is automatically satisfied by the Sobolev inequality. Hence we denote $X_{m,\varepsilon}(0, T)$ simply by $X_\varepsilon(0, T)$.

The following stability theorem follows from these two propositions, along the same lines as in [7–11].

Theorem 2.1 (Stability of BL-solution in case of $(v_+, u_+) \in BL_+(v_-, u_-)$). If $\|v_0 - V, u_0 - U\|_1$ is suitably small together with the compatibility condition $(v_0 - V, u_0 - U)(0) = (0, 0)$, then there exists a unique solution $(v, u)$ to (P), which satisfies $(v - V, u - U) \in C([0, \infty); H_0^1)$ and
$$
\sup_{\xi\ge0}|(\phi, \psi)(\xi, t)| = \sup_{x\ge s_- t}|(v, u)(x, t) - (V, U)(x - s_- t)| \to 0 \quad \text{as } t \to \infty.
$$
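We first devote ourselves to the proof of Proposition 2.2, which will be done by a series of lemmas. At the end of this section we will mention the local existence theorem. Multiply (2.4)$_1$ (the first equation of (2.4)) and (2.4)$_2$ by $-(p(V + \phi) - p(V))$ and $\psi$, respectively, and add these two equations to obtain the divergence form
$$
\Big(\frac12\psi^2 + \Phi(v, V)\Big)_t + \Big\{-s_-\Phi(v, V) - \frac{s_-}{2}\psi^2 + (p(v) - p(V))\psi - \mu\Big(\frac{u_\xi}{v} - \frac{U_\xi}{V}\Big)\psi\Big\}_\xi + \mu\frac{\psi_\xi^2}{v} + \mu s_-\frac{V_\xi\phi\psi_\xi}{vV} - s_- V_\xi\big(p(V + \phi) - p(V) - p'(V)\phi\big) = 0, \tag{2.5}
$$
where
$$
\Phi(v, V) = p(V)\phi - \int_V^{V+\phi}p(\eta)\,d\eta. \tag{2.6}
$$
Here the cross term comes from $u_\xi/v - U_\xi/V = \psi_\xi/v - U_\xi\phi/(vV)$ and $U_\xi = -s_- V_\xi$. Here and hereafter we will often use the notation $(v, u) = (V + \phi, U + \psi)$, though the unknown functions are $\phi$ and $\psi$. Since $p''(V) > 0$, put
$$
p(V + \phi) - p(V) - p'(V)\phi = f(v, V)\phi^2, \tag{2.7}
$$
then $f(v, V) \ge 0$. Noting that $-s_- V_\xi > 0$, we regard the last three terms in (2.5) as the quadratic form
$$
\begin{aligned}
Q &:= \mu\frac{\psi_\xi^2}{v} + \mu s_-\frac{V_\xi\phi\psi_\xi}{vV} - s_- V_\xi f(v, V)\phi^2\\
&= \Big(\sqrt{\mu}\frac{\psi_\xi}{\sqrt v}\Big)^2 - \frac{\sqrt{-\mu s_- V_\xi}}{V\sqrt{v f(v, V)}}\cdot\sqrt{\mu}\frac{\psi_\xi}{\sqrt v}\cdot\sqrt{-s_- V_\xi f(v, V)}\,\phi + \Big(\sqrt{-s_- V_\xi f(v, V)}\,\phi\Big)^2.
\end{aligned}
$$
The discriminant of $Q$ is
$$
D = \frac{-\mu s_- V_\xi}{V^2 v f(v, V)} - 4 = \frac{-h(V)}{V v f(v, V)} - 4. \tag{2.8}
$$
Since $v_+ > v_-$,
$$
-h(V) = s_-^2(V - v_+) + p(V) - p(v_+) < p(V) = V^{-\gamma}. \tag{2.9}
$$
Moreover, by putting $X = V/v$,
$$
V v f(v, V) = \frac{V v\big(v^{-\gamma} - V^{-\gamma} + \gamma V^{-\gamma-1}(v - V)\big)}{(v - V)^2} = V^{-\gamma}\cdot\frac{X^{\gamma+1} - (\gamma + 1)X + \gamma}{(X - 1)^2} \ge \gamma V^{-\gamma}, \tag{2.10}
$$
because $X^{\gamma+1} - (\gamma + 1)X + \gamma \ge \gamma(X - 1)^2$ for $X \ge 0$. By (2.8)–(2.10),
$$
D \le \frac{V^{-\gamma}}{\gamma V^{-\gamma}} - 4 = \frac1\gamma - 4 \le -3. \tag{2.11}
$$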
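The elementary inequality behind (2.10), $X^{\gamma+1} - (\gamma+1)X + \gamma \ge \gamma(X-1)^2$ for $X \ge 0$, and the resulting bound (2.11) can be spot-checked numerically. The sketch below samples the ratio on a grid; the grid size, the range $[0, 5]$ and the exclusion of a small neighbourhood of $X = 1$ (where the quotient is numerically ill-conditioned) are illustrative assumptions, and a grid check is of course not a proof.

```python
def ratio(X, gamma):
    """(X**(gamma+1) - (gamma+1)*X + gamma) / (X-1)**2, the factor appearing in (2.10)."""
    return (X ** (gamma + 1.0) - (gamma + 1.0) * X + gamma) / (X - 1.0) ** 2

def min_ratio(gamma, n=2000, x_max=5.0):
    """Smallest sampled value of ratio on [0, x_max], skipping points close to X = 1."""
    xs = [x_max * k / n for k in range(n + 1)]
    return min(ratio(x, gamma) for x in xs if abs(x - 1.0) > 1e-3)

def discriminant_bound(gamma):
    """Upper bound 1/gamma - 4 for the discriminant D, cf. (2.11)."""
    return 1.0 / gamma - 4.0
```

For $\gamma = 2$ the ratio reduces to $X + 2$, whose minimum over $X \ge 0$ is $2 = \gamma$, attained at $X = 0$; this is the borderline behaviour that the sampled minimum reproduces for other values of $\gamma \ge 1$ as well.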
Thus, integrating (2.5) over $(0, \infty) \times (0, t)$, we have the following basic lemma.

Lemma 2.1 (Basic estimate). For the solution $(\phi, \psi) \in X_{2\varepsilon}(0, T)$, it holds that
$$
\frac12\|\psi(t)\|^2 + \int_0^\infty\Phi(v, V)(\xi, t)\,d\xi + C^{-1}\int_0^t\!\!\int_0^\infty\Big(\frac{\psi_\xi^2}{v} + \Big|\frac{V_\xi\phi\psi_\xi}{vV}\Big| + \big(p(V + \phi) - p(V) - p'(V)\phi\big)V_\xi\Big)\,d\xi\,d\tau \le \frac12\|\psi_0\|^2 + \int_0^\infty\Phi(v_0, V)(\xi)\,d\xi \le C\|\phi_0, \psi_0\|^2.
$$

Remark 2.2. The method used here is similar to that in [11]. But (2.11) is sharper than the corresponding one in [11]. Note that the basic estimate is obtained without a smallness condition on the data.
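Next, following [11], change $\phi$ to $\tilde v := v/V$. Since
$$
p(V + \phi) - p(V) - p'(V)\phi = V^{-\gamma}\big(\tilde v^{-\gamma} - 1 + \gamma(\tilde v - 1)\big)
$$
and
$$
\Phi(v, V) = V^{-\gamma+1}\tilde\Phi(\tilde v), \tag{2.12}
$$
where
$$
\tilde\Phi(\tilde v) =
\begin{cases}
\tilde v - 1 - \ln\tilde v & (\gamma = 1),\\
\tilde v - 1 + \dfrac{1}{\gamma - 1}\big(\tilde v^{-\gamma+1} - 1\big) & (\gamma > 1),
\end{cases}
$$
Lemma 2.1 is rewritten as follows.

Lemma 2.2. It follows that
$$
\frac12\|\psi(t)\|^2 + \int_0^\infty V^{-\gamma+1}\tilde\Phi(\tilde v(\xi, t))\,d\xi + C^{-1}\int_0^t\!\!\int_0^\infty\Big(\frac{\psi_\xi^2}{v} + \Big|\frac{V_\xi\phi\psi_\xi}{vV}\Big| + \frac{V_\xi}{V^\gamma}\big(\tilde v^{-\gamma} - 1 + \gamma(\tilde v - 1)\big)\Big)\,d\xi\,d\tau \le C\|\phi_0, \psi_0\|^2.
$$

Equation (2.4)$_2$ is also written as
$$
\Big(\mu\frac{\tilde v_\xi}{\tilde v} - \psi\Big)_t - s_-\Big(\mu\frac{\tilde v_\xi}{\tilde v} - \psi\Big)_\xi + \frac{\gamma\tilde v_\xi}{\tilde v^{\gamma+1}V^\gamma} + \frac{\gamma V_\xi}{V^{\gamma+1}}\big(\tilde v^{-\gamma} - 1\big) = 0. \tag{2.13}
$$
Multiplying (2.13) by $\tilde v_\xi/\tilde v$, we have the divergence form
$$
\begin{aligned}
&\Big(\frac\mu2\Big(\frac{\tilde v_\xi}{\tilde v}\Big)^2 - \psi\frac{\tilde v_\xi}{\tilde v}\Big)_t + \frac{\gamma\tilde v_\xi^2}{V^\gamma\tilde v^{\gamma+2}} + \Big\{\psi\frac{v_t}{v} - \frac{\gamma h(V)}{s_-\mu V^\gamma}\Big(\frac{\tilde v^{-\gamma} - 1}{\gamma} + \ln\tilde v\Big) - \frac{\mu s_-}{2}\Big(\frac{\tilde v_\xi}{\tilde v}\Big)^2\Big\}_\xi\\
&\qquad = \frac{\psi_\xi^2}{v} + \frac{s_-\phi\psi_\xi V_\xi}{vV} - \frac{\gamma V_\xi}{s_-\mu}\cdot\frac{h'(V)V^\gamma - h(V)\gamma V^{\gamma-1}}{V^{2\gamma}}\Big(\frac{\tilde v^{-\gamma} - 1}{\gamma} + \ln\tilde v\Big).
\end{aligned}
\tag{2.14}
$$
By (2.2),
$$
|\text{the final term of (2.14)}| \le C\frac{V_\xi}{V^\gamma}\big(\tilde v^{-\gamma} - 1 + \gamma(\tilde v - 1)\big).
$$
Hence, the right-hand side of (2.14) is controllable by Lemma 2.2. Thus, integrating (2.14) over $(0, \infty) \times (0, t)$ yields the following lemma.

Lemma 2.3. It holds that
$$
\Big\|\frac{\tilde v_\xi}{\tilde v}(t)\Big\|^2 + \int_0^t\!\!\int_0^\infty\frac{\tilde v_\xi^2}{\tilde v^{\gamma+2}}\,d\xi\,d\tau \le C\big(\|\phi_0\|_1^2 + \|\psi_0\|^2\big) + C\int_0^t\Big(\frac{\tilde v_\xi}{\tilde v}\Big)^2(0, \tau)\,d\tau. \tag{2.15}
$$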
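We have tried to control the final term of (2.15), $C\int_0^t(\tilde v_\xi/\tilde v)^2(0, \tau)\,d\tau$, without a smallness condition, in a similar fashion to [11], but we could not succeed. However, we can control it provided that the initial data is small. Since
$$
\Big(\frac{\tilde v_\xi}{\tilde v}\Big)^2(0, \tau) = \frac{1}{v_-^2}\phi_\xi^2(0, \tau) = \frac{1}{u_-^2}\psi_\xi^2(0, \tau) \tag{2.16}
$$
(the validity of this equation will be stated later), it is necessary to estimate $\int_0^t\|\psi_{\xi\xi}(\tau)\|^2\,d\tau$, which is controllable for small initial data. We now assume that
$$
N(T) := \sup_{0\le t\le T}\|(\phi, \psi)(t)\|_1 \le 2\varepsilon \ll 1.
$$
Multiplying (2.4)$_2$ by $-\psi_{\xi\xi}$, we have
$$
\Big(\frac12\psi_\xi^2\Big)_t + \Big(-\psi_t\psi_\xi + \frac{s_-}{2}\psi_\xi^2\Big)_\xi + \mu\frac{\psi_{\xi\xi}^2}{v} = \Big\{-\mu\frac{\psi_\xi(V_\xi + \phi_\xi)}{(V + \phi)^2} + \mu\Big(\frac{U_\xi}{V + \phi} - \frac{U_\xi}{V}\Big)_\xi - (p(V + \phi) - p(V))_\xi\Big\}(-\psi_{\xi\xi})
$$
and, after integrating the resultant equation over $(0, \infty) \times (0, t)$,
$$
\|\psi_\xi(t)\|^2 + \int_0^t\big(\psi_\xi(0, \tau)^2 + \|\psi_{\xi\xi}(\tau)\|^2\big)\,d\tau \le C\|\psi_{0\xi}\|^2 + C\int_0^t\!\!\int_0^\infty\big(\phi_\xi^2 + V_\xi\phi^2 + \psi_\xi^2\big)\,d\xi\,d\tau. \tag{2.17}
$$
Here, we have estimated the term $(\phi_\xi\psi_\xi)^2$ as
$$
\int_0^t\!\!\int_0^\infty(\phi_\xi\psi_\xi)^2\,d\xi\,d\tau \le \int_0^t\|\psi_\xi\|\,\|\psi_{\xi\xi}\|\,\|\phi_\xi\|^2\,d\tau \le \nu\int_0^t\|\psi_{\xi\xi}\|^2\,d\tau + C_\nu N(T)^2\int_0^t\|\phi_\xi(\tau)\|^2\,d\tau
$$
for a small constant $\nu > 0$. By Lemma 2.1, (2.17) is reduced to
$$
\|\psi_\xi(t)\|^2 + \int_0^t\big(\psi_\xi(0, \tau)^2 + \|\psi_{\xi\xi}(\tau)\|^2\big)\,d\tau \le C\big(\|\phi_0\|^2 + \|\psi_0\|_1^2\big) + C\int_0^t\|\phi_\xi(\tau)\|^2\,d\tau. \tag{2.18}
$$
For a small constant $\lambda > 0$, (2.15) + (2.18) $\cdot\,\lambda$ together with (2.16) yields
$$
\begin{aligned}
&\Big\|\frac{\tilde v_\xi}{\tilde v}(t)\Big\|^2 + \lambda\|\psi_\xi(t)\|^2 + \int_0^t\big(\|\tilde v_\xi(\tau)\|^2 + \lambda\psi_\xi(0, \tau)^2 + \lambda\|\psi_{\xi\xi}(\tau)\|^2\big)\,d\tau\\
&\qquad \le C\|\phi_0, \psi_0\|_1^2 + C\int_0^t\Big(\Big(\frac{\tilde v_\xi}{\tilde v}\Big)^2(0, \tau) + \lambda\|\phi_\xi(\tau)\|^2\Big)\,d\tau\\
&\qquad \le C\|\phi_0, \psi_0\|_1^2 + \int_0^t\big(\nu\|\psi_{\xi\xi}(\tau)\|^2 + C_\nu\|\psi_\xi(\tau)\|^2 + C\lambda\|\phi_\xi(\tau)\|^2\big)\,d\tau.
\end{aligned}
\tag{2.19}
$$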
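Since
$$
\|\tilde v_\xi(t)\|^2 = \int_0^\infty\Big(\frac{\phi_\xi}{V} - \frac{V_\xi\phi}{V^2}\Big)^2 d\xi \ge \int_0^\infty\Big(\frac{\phi_\xi^2}{2V^2} - \frac{(V_\xi\phi)^2}{V^4}\Big)\,d\xi \ge c_0\|\phi_\xi(t)\|^2 - C\|\sqrt{V_\xi}\,\phi(t)\|^2
$$
and
$$
\int_0^t\|\tilde v_\xi(\tau)\|^2\,d\tau \ge c_0\int_0^t\|\phi_\xi(\tau)\|^2\,d\tau - C\int_0^t\!\!\int_0^\infty V_\xi\phi^2\,d\xi\,d\tau,
$$
we fix $\lambda$ such that $C\lambda \le c_0/2$ and $\nu$ such that $\nu \le \lambda/2$. Then, the following lemma holds.

Lemma 2.4. If $N(T) = \sup_{0\le t\le T}\|(\phi, \psi)(t)\|_1$ is suitably small, then
$$
\|(\phi_\xi, \psi_\xi)(t)\|^2 + \int_0^t\big(\psi_\xi(0, \tau)^2 + \|(\phi_\xi, \psi_{\xi\xi})(\tau)\|^2\big)\,d\tau \le C\|\phi_0, \psi_0\|_1^2.
$$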
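Combining Lemmas 2.1–2.4 completes the proof of Proposition 2.2.

We now briefly mention the unique existence of the local solution to (2.4), that is, the proof of Proposition 2.1. By (2.4)$_1$, $\phi$ has the explicit form
$$
\phi(\xi, t) =
\begin{cases}
\displaystyle\int_{t+\xi/s_-}^{t}\psi_\xi(\xi + s_-(t - \tau), \tau)\,d\tau, & 0 \le \xi \le -s_- t,\\[2mm]
\displaystyle\phi_0(\xi + s_- t) + \int_0^t\psi_\xi(\xi + s_-(t - \tau), \tau)\,d\tau, & \xi \ge -s_- t.
\end{cases}
\tag{2.20}$_{\phi_0}$
$$
Equation (2.4)$_2$ is regarded as the initial-boundary value problem for the linear parabolic equation of $\psi$:
$$
\begin{cases}
\psi_t - \mu\left(\dfrac{\psi_\xi}{V + \phi}\right)_\xi = g := g(\psi_\xi, \phi, \phi_\xi),\\
\psi(0, t) = 0,\\
\psi(\xi, 0) = \psi_0(\xi),
\end{cases}
\tag{2.21}$_{\psi_0}$
$$
where
$$
g(\psi_\xi, \phi, \phi_\xi) = s_-\psi_\xi - (p(V + \phi) - p(V))_\xi + \mu\Big(\frac{U_\xi}{V + \phi} - \frac{U_\xi}{V}\Big)_\xi. \tag{2.22}
$$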
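The explicit form of $\phi$ is obtained by integrating $\phi_t - s_-\phi_\xi = \psi_\xi$ along the characteristics $\xi + s_-(t - \tau)$, which either reach the initial line $\tau = 0$ or leave through the boundary $\xi = 0$ at $\tau = t + \xi/s_-$. The following sketch evaluates that representation with a midpoint rule for a given source term; the function name, the quadrature and the sample data used to exercise it are illustrative assumptions.

```python
import math

def phi_formula(xi, t, s, psi_xi, phi0, n=2000):
    """Evaluate the characteristic representation of phi for the transport equation
    phi_t - s*phi_xi = psi_xi with s < 0, phi(0, t) = 0 and phi(., 0) = phi0."""
    if xi <= -s * t:
        # The backward characteristic hits the boundary xi = 0 at time t + xi/s.
        a, base = t + xi / s, 0.0
    else:
        # The backward characteristic reaches the initial line at xi + s*t > 0.
        a, base = 0.0, phi0(xi + s * t)
    h = (t - a) / n
    total = 0.0
    for k in range(n):                      # midpoint rule in tau
        tau = a + (k + 0.5) * h
        total += psi_xi(xi + s * (t - tau), tau)
    return base + total * h
```

One way to use the sketch is to pick a smooth source, for instance $\psi_\xi(\xi, \tau) = e^{-\xi}\cos\tau$ with $\phi_0(\xi) = \xi e^{-\xi}$ (hypothetical data), and to confirm by finite differences that the returned values satisfy the transport equation in both regions.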
To use the iteration method, we approximate $(\phi_0, \psi_0) \in H_0^1$ by $(\phi_{0k}, \psi_{0k}) \in H^3 \cap H_0^1$ such that $(\phi_{0k}, \psi_{0k}) \to (\phi_0, \psi_0)$ strongly in $H^1$ as $k \to \infty$. We may assume
$$
\|\phi_{0k}, \psi_{0k}\|_1 \le \frac32 M, \qquad \inf_{R_+}(V + \phi_{0k}) \ge \frac23 m
$$
for any $k \ge 1$. We first define the sequence $\{(\phi^{(n)}, \psi^{(n)})\} := \{(\phi_k^{(n)}, \psi_k^{(n)})\}$ for each $k$ so that
$$
(\phi^{(0)}, \psi^{(0)})(\xi, t) = (\phi_{0k}, \psi_{0k})(\xi), \tag{2.23}
$$
and, for a given $(\phi^{(n-1)}, \psi^{(n-1)})(\xi, t)$, $\psi^{(n)}$ is a solution to
$$
\begin{cases}
\psi_t^{(n)} - \mu\left(\dfrac{\psi_\xi^{(n)}}{V + \phi^{(n-1)}}\right)_\xi = g^{(n-1)} := g\big(\psi_\xi^{(n-1)}, \phi^{(n-1)}, \phi_\xi^{(n-1)}\big),\\
\psi^{(n)}(0, t) = 0,\\
\psi^{(n)}(\xi, 0) = \psi_{0k}(\xi),
\end{cases}
\tag{2.21}'
$$
and
$$
\phi^{(n)}(\xi, t) =
\begin{cases}
\displaystyle\int_{t+\xi/s_-}^{t}\psi_\xi^{(n)}(\xi + s_-(t - \tau), \tau)\,d\tau, & 0 \le \xi \le -s_- t,\\[2mm]
\displaystyle\phi_{0k}(\xi + s_- t) + \int_0^t\psi_\xi^{(n)}(\xi + s_-(t - \tau), \tau)\,d\tau, & \xi \ge -s_- t.
\end{cases}
\tag{2.20}'
$$
From the linear theory, if $g \in C([0, T]; H^2)$ and $\psi_0 \in H^3 \cap H_0^1$, then there exists a unique solution $\psi$ to (2.21)$_{\psi_0}$ satisfying
$$
\psi \in C([0, T]; H^3 \cap H_0^1) \cap C^1([0, T]; H^1) \cap L^2(0, T; H^4).
$$
Using this, if $(\phi^{(n-1)}, \psi^{(n-1)}) \in X_{\frac12 m, 2M}$, then we have
$$
\|(\phi^{(n)}, \psi^{(n)})(t)\|^2 \le \Big(\frac32 M\Big)^2 + C(m, M)t_0\exp\big(C(m, M)t_0\big) \le (2M)^2 \quad \text{if } 0 < t_0 := t_0(m, M) \ll 1 \tag{2.24}
$$
and also
$$
\int_0^{t_0}\|\psi_\xi^{(n)}(\tau)\|_1^2\,d\tau \le C(m, M)M^2.
$$
Hence, direct estimates on (2.20)$'$ give
$$
\Big\|\int_{t+\xi/s_-}^{t}\psi_\xi^{(n)}(\xi + s_-(t - \tau), \tau)\,d\tau\Big\|_1 \le C\sqrt{t_0}\,M
$$
and
$$
\Big\|\int_0^t\psi_\xi^{(n)}(\xi + s_-(t - \tau), \tau)\,d\tau\Big\|_1 \le C\sqrt{t_0}\,M.
$$
Hence, for a suitably small $t_0$ we have
$$
\sup_{0\le t\le t_0}\|\phi^{(n)}(t)\|_1 \le 2M \quad \text{and} \quad \inf_{R_+\times[0,t_0]}(V + \phi)(\xi, t) \ge \frac12 m. \tag{2.25}
$$
By (2.24)–(2.25), $(\phi^{(n)}, \psi^{(n)}) \in X_{\frac12 m, 2M}(0, t_0)$. Since $\|\phi_{0k}, \psi_{0k}\|_3 \le C_k$, $(\phi^{(n)}, \psi^{(n)})$ can be shown to be a Cauchy sequence in $C([0, t_0]; H^2)$ in a standard way. Thus we have a solution $(\phi_k, \psi_k) \in X_{\frac12 m, 2M}(0, t_0) \cap C([0, t_0]; H^2)$ to (2.20)$_{\phi_{0k}}$ and (2.21)$_{\psi_{0k}}$ by $\lim_{n\to\infty}(\phi^{(n)}, \psi^{(n)}) = \lim_{n\to\infty}(\phi_k^{(n)}, \psi_k^{(n)})$. Here, we note that
$$
\psi_k \in C^1([0, t_0]; L^2) \cap L^2(0, t_0; H^3), \tag{2.26}
$$
since $g((\psi_k)_\xi, \phi_k, (\phi_k)_\xi) \in C([0, t_0]; H^1)$ and $(\phi_{0k}, \psi_{0k}) \in H^2 \cap H_0^1$. Again, showing that $(\phi_k, \psi_k)$ is a Cauchy sequence in $C([0, t_0]; H^1)$ (taking $t_0$ smaller than the previous one if necessary), we obtain the desired unique local solution $(\phi, \psi) \in X_{\frac12 m, 2M}(0, t_0)$. We omit the details.

Here we state the validity of (2.16). As we see above, $(\phi_k, \psi_k) \to (\phi, \psi)$ as $k \to \infty$, and $(\phi_k, \psi_k)$ is a solution to (2.4) with its initial data $(\phi_{0k}, \psi_{0k})$. Hence, by (2.20)$_{\phi_{0k}}$,
$$
\phi_k(\xi, t) = \int_{t+\xi/s_-}^{t}(\psi_k)_\xi(\xi + s_-(t - \tau), \tau)\,d\tau, \quad 0 \le \xi \le -s_- t.
$$
Since
$$
(\phi_k)_t(\xi, t) = (\psi_k)_\xi(\xi, t) - (\psi_k)_\xi\Big(0, t + \frac{\xi}{s_-}\Big) + s_-\int_{t+\xi/s_-}^{t}(\psi_k)_{\xi\xi}(\xi + s_-(t - \tau), \tau)\,d\tau,
$$
(2.26) shows that $(\phi_k)_t(\xi, t) \to 0$ as $\xi \to 0$. Hence, by (2.4)$_1$,
$$
-s_-(\phi_k)_\xi(0, t) - (\psi_k)_\xi(0, t) = 0.
$$
The solution $\psi$ is in $L^2(0, t_0; H^2)$ and so $\psi_\xi(0, t)$ has a meaning for almost all $t \ge 0$ and
$$
\psi_\xi(0, t) = \lim_{k\to\infty}(\psi_k)_\xi(0, t).
$$
Thus, there exists $\lim_{k\to\infty}(\phi_k)_\xi(0, t)$. Therefore, Lemma 2.3 is first obtained for $(\phi_k, \psi_k)$ or $\tilde v_k = (V + \phi_k)/V$, and then for $(\phi, \psi)$ after letting $k$ tend to infinity.

2.2. The case $(v_+, u_+) \in BL_-(v_-, u_-)$. In this section we assume that
$$
(v_-, u_-) \in \Omega_{sub} \quad \text{and} \quad (v_+, u_+) \in BL_-(v_-, u_-), \tag{2.27}
$$
then Lemma 1.1 gives a unique BL-solution $(V, U)(\xi)$ satisfying (1.8) or (1.12) together with
$$
V_\xi = \frac{V}{\mu s_-}h(V) < 0, \quad h(V) > 0, \quad h'(V) > 0 \quad \text{for } v_+ < V < v_-. \tag{2.28}
$$
We put the perturbation $(\phi, \psi)(\xi, t)$ by
$$
(v, u)(x, t) = (V, U)(\xi) + (\phi, \psi)(\xi, t), \quad \xi = x - s_- t, \tag{2.29}
$$
so that the reformulated problem is
$$
\begin{cases}
\phi_t - s_-\phi_\xi - \psi_\xi = 0, \quad \xi > 0,\ t > 0,\\
\psi_t - s_-\psi_\xi + (p(V + \phi) - p(V))_\xi = \mu\left(\dfrac{U_\xi + \psi_\xi}{V + \phi} - \dfrac{U_\xi}{V}\right)_\xi,\\
(\phi, \psi)|_{\xi=0} = (0, 0),\\
(\phi, \psi)|_{t=0} = (\phi_0, \psi_0)(\xi) := (v_0 - V, u_0 - U)(\xi),
\end{cases}
\tag{2.30}
$$
which is formally the same as (2.4). However, the sign of $V_\xi$ is negative, and so Lemma 2.1 does not hold. Nevertheless, we seek the solution $(\phi, \psi)$ to (2.30) in the same solution space
$$
X_\varepsilon(0, T) = \Big\{(\phi, \psi) \in C([0, T]; H^1)\ \Big|\ \phi_\xi \in L^2(0, T; L^2),\ \psi_\xi \in L^2(0, T; H^1),\ \text{with } \sup_{[0,T]}\|(\phi, \psi)(t)\|_1 \le \varepsilon\Big\},
$$
for a suitably small $\varepsilon > 0$ (cf. Remark 2.1). Then, the a priori estimates are obtained as follows.
Proposition 2.3 (A priori estimates). Let $\delta := |v_+ - v_-, u_+ - u_-|$ be suitably small and $(\phi, \psi)$ be a solution to (2.30) in $X_\varepsilon(0, T)$ for a suitably small $\varepsilon > 0$. Then, there exists a constant $C_1$ such that
$$
\|(\phi, \psi)(t)\|_1^2 + \int_0^t\big(\psi_\xi(0, \tau)^2 + \|\phi_\xi(\tau)\|^2 + \|\psi_\xi(\tau)\|_1^2\big)\,d\tau \le C_1\|\phi_0, \psi_0\|_1^2.
$$

Combining the local existence theorem with Proposition 2.3 we have the stability theorem.

Theorem 2.2 (Stability of BL-solution in case of $(v_+, u_+) \in BL_-(v_-, u_-)$). If $|v_+ - v_-, u_+ - u_-| + \|v_0 - V, u_0 - U\|_1$ is suitably small with the compatibility condition $(v_0 - V, u_0 - U)(0) = (0, 0)$, then there exists a unique solution $(v, u)$ to (P), which satisfies $(v - V, u - U) \in C([0, \infty); H_0^1)$ and
$$
\sup_{x\ge s_- t}|(v, u)(x, t) - (V, U)(x - s_- t)| \to 0 \quad \text{as } t \to \infty.
$$

We only show the a priori estimates. Assume that
$$
N(T) = \sup_{0\le t\le T}\|(\phi, \psi)(t)\|_1 \le \varepsilon \le \varepsilon_0\ (\ll 1),
$$
where $\varepsilon_0$ is chosen as
$$
\sup_{R}|f(x)| \le C\|f\|^{1/2}\|f_x\|^{1/2} \le C\varepsilon_0 < \frac{v_+}{2}
$$
so that $V + \phi \ge v_+/2$. Multiplying (2.30)$_2$ by $\psi$ and (2.30)$_1$ by $-(p(V + \phi) - p(V))$ and adding those equations, we have
$$
\frac{d}{dt}\int_0^\infty\Big(\frac12\psi^2 + \Phi(v, V)\Big)\,d\xi + \mu\int_0^\infty\frac{\psi_\xi^2}{v}\,d\xi \le C\int_0^\infty|V_\xi|\phi^2\,d\xi + \nu\int_0^\infty\psi_\xi^2\,d\xi + C_\nu\int_0^\infty|V_\xi|^2\phi^2\,d\xi
$$
for a small constant $\nu > 0$, and hence
$$
\|\phi(t)\|^2 + \|\psi(t)\|^2 + \int_0^t\|\psi_\xi(\tau)\|^2\,d\tau \le C\|\phi_0, \psi_0\|^2 + C\int_0^t\!\!\int_0^\infty|V_\xi|\phi(\xi, \tau)^2\,d\xi\,d\tau. \tag{2.31}
$$
Here, we estimate the last term using the idea by Kawashima and Nikkuni [2]. Since
$$
\phi(\xi, t) = \phi(0, t) + \int_0^\xi\phi_\xi(\eta, t)\,d\eta \le \xi^{1/2}\|\phi_\xi(t)\|,
$$
the last term of (2.31) is estimated as
$$
|\text{the last term}| \le C\int_0^t\|\phi_\xi(\tau)\|^2\int_0^\infty\xi(-V_\xi(\xi))\,d\xi\,d\tau \le C\delta\int_0^t\|\phi_\xi(\tau)\|^2\,d\tau.
$$
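The weighted estimate above rests on the pointwise bound $\phi(\xi)^2 \le \xi\|\phi_\xi\|^2$ for functions vanishing at $\xi = 0$, which turns $\int_0^\infty|V_\xi|\phi^2\,d\xi$ into $\|\phi_\xi\|^2\int_0^\infty\xi|V_\xi|\,d\xi$. The sketch below compares the two sides by trapezoidal quadrature on a truncated interval; the truncation length, the quadrature and any sample functions passed to it are assumptions made only for illustration.

```python
import math

def trapz(f, a, b, n=20000):
    """Composite trapezoidal rule for f on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def weighted_bound(phi, dphi, weight, L=40.0):
    """Return (lhs, rhs) with lhs = int weight*phi^2 and rhs = ||phi'||^2 * int xi*weight,
    both integrals taken over the truncated interval [0, L]."""
    lhs = trapz(lambda x: weight(x) * phi(x) ** 2, 0.0, L)
    norm2 = trapz(lambda x: dphi(x) ** 2, 0.0, L)
    moment = trapz(lambda x: x * weight(x), 0.0, L)
    return lhs, norm2 * moment
```

For example, with the hypothetical choices $\phi(\xi) = \xi e^{-\xi}$ and weight $\delta e^{-\xi}$, the left side equals $2\delta/27$ and the right side $\delta/4$, so the bound holds with room to spare and both sides scale linearly in $\delta$, as in the estimate of the text.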
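Hence, (2.31) yields
$$
\|(\phi, \psi)(t)\|^2 + \int_0^t\|\psi_\xi(\tau)\|^2\,d\tau \le C\Big(\|\phi_0, \psi_0\|^2 + \delta\int_0^t\|\phi_\xi(\tau)\|^2\,d\tau\Big). \tag{2.32}
$$
An argument similar to that for (2.14) yields
$$
\|\phi_\xi(t)\|^2 + \int_0^t\|\phi_\xi(\tau)\|^2\,d\tau \le C\Big(\|\phi_0, \psi_0\|_1^2 + \int_0^t\phi_\xi(0, \tau)^2\,d\tau + \int_0^t\!\!\int_0^\infty|V_\xi(\xi)|\phi^2\,d\xi\,d\tau\Big).
$$
Noting that
$$
C\int_0^t\phi_\xi(0, \tau)^2\,d\tau = \frac{C}{s_-^2}\int_0^t\psi_\xi(0, \tau)^2\,d\tau \le \nu\int_0^t\|\psi_{\xi\xi}(\tau)\|^2\,d\tau + C_\nu\int_0^t\|\psi_\xi(\tau)\|^2\,d\tau,
$$
we have
$$
\|\phi_\xi(t)\|^2 + \int_0^t\|\phi_\xi(\tau)\|^2\,d\tau \le C\|\phi_0, \psi_0\|_1^2 + \int_0^t\big\{\nu\|\psi_{\xi\xi}(\tau)\|^2 + C\delta\|\phi_\xi(\tau)\|^2 + C_\nu\|\psi_\xi(\tau)\|^2\big\}\,d\tau. \tag{2.33}
$$
The same estimate as (2.17) gives
$$
\|\psi_\xi(t)\|^2 + \int_0^t\big(\psi_\xi(0, \tau)^2 + \|\psi_{\xi\xi}(\tau)\|^2\big)\,d\tau \le C\|\phi_0, \psi_0\|_1^2 + C\int_0^t\big(\|\phi_\xi(\tau)\|^2 + \|\psi_\xi(\tau)\|^2\big)\,d\tau. \tag{2.34}
$$
By (2.33) and (2.34), for a fixed number $\lambda > 0$ such that $1 - C\lambda \ge 1/2$ and $\nu = \lambda/2$,
$$
\|(\phi_\xi, \psi_\xi)(t)\|^2 + \int_0^t\big(\psi_\xi(0, \tau)^2 + \|\phi_\xi(\tau)\|^2 + \|\psi_{\xi\xi}(\tau)\|^2\big)\,d\tau \le C\Big(\|\phi_0, \psi_0\|_1^2 + \int_0^t\|\psi_\xi(\tau)\|^2\,d\tau\Big). \tag{2.35}
$$
Again, adding (2.35) $\cdot\,\lambda$ $(\lambda > 0)$ to (2.32), we have
$$
\|(\phi, \psi)(t)\|^2 + \lambda\|(\phi_\xi, \psi_\xi)(t)\|^2 + \int_0^t\big\{(1 - C\lambda)\|\psi_\xi(\tau)\|^2 + \lambda\big(\psi_\xi(0, \tau)^2 + \|\phi_\xi(\tau)\|^2 + \|\psi_{\xi\xi}(\tau)\|^2\big)\big\}\,d\tau \le C\|\phi_0, \psi_0\|_1^2 + C\delta\int_0^t\|\phi_\xi(\tau)\|^2\,d\tau.
$$
Taking $\lambda$ such that $1 - C\lambda \ge 1/2$ and restricting $\delta$ so that $\lambda - C\delta \ge \lambda/2$, we obtain the desired a priori estimate, which completes the proof of Proposition 2.3.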
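Fig. 3.1. [Left: the $(v, u)$-plane with the states $(v_-, u_-)$, $(v_*, u_*)$, $(\bar v, \bar u)$ and $(v_+, u_+)$, the set $\Omega_{trans}$, and the curves $R_1(v_*, u_*)$, $R_2(v_*, u_*)$ and $R_2(\bar v, \bar u)$. Right: the $(x, t)$-plane with the lines $x = \lambda_1(v_*)t = s_- t$, $x = \lambda_1(\bar v)t$, $x = \lambda_2(\bar v)t$ and $x = \lambda(v_+)t$ separating the states $(\bar v, \bar u)$ and $(v_+, u_+)$.]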
3. Superposition of BL-Solution and Rarefaction Wave

In this section we investigate the case
$$
(v_-, u_-) \in \Omega_{sub}, \quad (v_*, u_*) \in BL_+(v_-, u_-) \cap \Omega_{trans} \quad \text{and} \quad (v_+, u_+) \in R_1R_2(v_*, u_*). \tag{3.1}
$$
That is, we show (V) in Sect. 1. The cases (III) and (IV) are similar to (V). In the case of (3.1), there is $(\bar v, \bar u) \in R_1(v_*, u_*)$ such that $(v_+, u_+) \in R_2(\bar v, \bar u)$, and there are the 1-rarefaction wave $(v_1^R, u_1^R)(x/t)$ connecting $(v_*, u_*)$ with $(\bar v, \bar u)$ and the 2-rarefaction wave $(v_2^R, u_2^R)(x/t)$ connecting $(\bar v, \bar u)$ with $(v_+, u_+)$, which are weak solutions to
$$
\begin{cases}
v_t - u_x = 0,\\
u_t + p(v)_x = 0,
\end{cases}
\quad (x, t) \in R \times (0, \infty) \tag{3.2}
$$
with the Riemann initial data
$$
(v, u)|_{t=0} = (v_{01}^R, u_{01}^R)(x) =
\begin{cases}
(v_*, u_*), & x < 0,\\
(\bar v, \bar u), & x > 0
\end{cases}
\tag{3.3}$_1$
$$
and
$$
(v, u)|_{t=0} = (v_{02}^R, u_{02}^R)(x) =
\begin{cases}
(\bar v, \bar u), & x < 0,\\
(v_+, u_+), & x > 0.
\end{cases}
\tag{3.3}$_2$
$$
See Fig. 3.1. To construct the smooth approximate rarefaction wave $(\tilde V_i, \tilde U_i)(x, t)$, $x \in R$, and its restriction $(V_i, U_i)(\xi, t) := (\tilde V_i, \tilde U_i)(x, t)|_{x\ge s_- t}$, we prepare the following lemma.

Lemma 3.1 ([9]). Let $w_+ > w_-$ and $\tilde w = w_+ - w_-$. Then the Cauchy problem
$$
\begin{cases}
w_t + ww_x = 0, \quad x \in R,\ t > 0,\\
w|_{t=0} = w_0(x) := \dfrac{w_+ + w_-}{2} + \dfrac{w_+ - w_-}{2}\tanh x
\end{cases}
\tag{3.4}
$$
has a unique, smooth and global solution $w(x, t) = w(x, t; w_-, w_+)$, which satisfies the following:

(i) $w_- < w(x, t) < w_+$, $w_x > 0$,

(ii) $\|w_x(t)\|_{L^p(R)} \le C_p\min(\tilde w, \tilde w^{1/p}t^{-1+1/p})$, $\|w_{xx}(t)\|_{L^p(R)} \le C_p\min(\tilde w, t^{-1})$,

(iii) if $0 < w_-\ (< w_+)$, then, for any $x \le 0$,
$|w(x, t) - w_-| \le \tilde w\exp\{-2(|x| + |w_-|t)\}$, $|w_x(x, t)| \le 2\tilde w\exp\{-2(|x| + |w_-|t)\}$,

(iv) if $0 > w_+\ (> w_-)$, then, for any $x \ge 0$,
$|w(x, t) - w_+| \le \tilde w\exp\{-2(|x| + |w_+|t)\}$, $|w_x(x, t)| \le 2\tilde w\exp\{-2(|x| + |w_+|t)\}$,

(v) $\lim_{t\to\infty}\sup_R|w(x, t) - w^R(x/t)| = 0$, where
$$
w^R(x/t) =
\begin{cases}
w_-, & x \le w_- t,\\
x/t, & w_- t < x < w_+ t,\\
w_+, & x \ge w_+ t.
\end{cases}
$$
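Since $w_0$ is increasing, the solution of (3.4) can be evaluated by the method of characteristics: $w(x, t) = w_0(x_0)$, where $x_0$ is the unique root of $x_0 + w_0(x_0)t = x$. The following sketch finds $x_0$ by bisection and is meant only as an illustration of properties (i), (iii) and (v); the bisection bracket and iteration count are implementation choices, not part of Lemma 3.1.

```python
import math

def w0(x, w_minus, w_plus):
    """Initial data of (3.4)."""
    return 0.5 * (w_plus + w_minus) + 0.5 * (w_plus - w_minus) * math.tanh(x)

def burgers(x, t, w_minus, w_plus, iters=100):
    """Solution w(x, t) of w_t + w*w_x = 0 with data w0, for t >= 0, via characteristics.
    The foot point x0 lies in [x - w_plus*t, x - w_minus*t], where x0 + w0(x0)*t is increasing."""
    lo, hi = x - w_plus * t, x - w_minus * t
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + w0(mid, w_minus, w_plus) * t < x:
            lo = mid
        else:
            hi = mid
    return w0(0.5 * (lo + hi), w_minus, w_plus)

def rarefaction(x, t, w_minus, w_plus):
    """Centered rarefaction wave w^R(x/t) from item (v), for t > 0."""
    return min(max(x / t, w_minus), w_plus)
```

Evaluating `burgers` and `rarefaction` on a grid in $x/t$ for increasing $t$ shows the sup-norm difference shrinking, which is the convergence stated in (v); the largest deviation sits near the edges $x = w_\pm t$ of the fan.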
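Remark 3.1. If
$$
w_0(x) = \frac{w_+ + w_-}{2} + \frac{w_+ - w_-}{2}\kappa_q\int_0^x(1 + y^2)^{-q}\,dy, \qquad \kappa_q\int_0^\infty(1 + y^2)^{-q}\,dy = 1,
$$
instead of (3.4)$_2$, then
$$
\|w_{xx}(t)\|_{L^p(R)} \le C_{pq}\,\tilde w^{-\frac{p-1}{2pq}}\,t^{-1-\frac{p-1}{2pq}}
$$
([10, 11]). This was available for the strong rarefaction wave. However, the result in this paper concerns the weak rarefaction wave, and (3.4)$_2$ is chosen.

The characteristic speeds $\lambda_i(v)$, $i = 1, 2$, of the hyperbolic system (3.2) are $\lambda_i(v) = (-1)^i\sqrt{-p'(v)}$, and hence the smooth approximations $(\tilde V_i, \tilde U_i)(x, t)$ to $(v_i^R, u_i^R)(x/t)$ are given by
$$
\begin{cases}
\tilde V_1(x, t) = \lambda_1^{-1}\big(w(x, t; 2\lambda_1(v_*) - \lambda_1(\bar v), \lambda_1(\bar v))\big),\\
\tilde U_1(x, t) = u_* - \displaystyle\int_{v_*}^{\tilde V_1(x,t)}\lambda_1(s)\,ds
\end{cases}
\tag{3.5}$_{i=1}$
$$
and
$$
\begin{cases}
\tilde V_2(x, t) = \lambda_2^{-1}\big(w(x, t; \lambda_2(\bar v), \lambda_2(v_+))\big),\\
\tilde U_2(x, t) = \bar u - \displaystyle\int_{\bar v}^{\tilde V_2(x,t)}\lambda_2(s)\,ds,
\end{cases}
\tag{3.5}$_{i=2}$
$$
which satisfy, for $i = 1, 2$,
$$
\begin{cases}
\tilde V_{it} - \tilde U_{ix} = 0,\\
\tilde U_{it} + p(\tilde V_i)_x = 0,
\end{cases}
\quad x \in R,\ t > 0. \tag{3.6}
$$
Then, define, for $i = 1, 2$,
$$
(V_i, U_i)(\xi, t) = (\tilde V_i, \tilde U_i)(x, t)|_{x\ge s_- t}, \quad \xi = x - s_- t \ge 0, \tag{3.7}
$$
and the following lemma holds.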
Lemma 3.2. Let $\delta_1 = |v_+ - v_*, u_+ - u_*|$. Then $(V_i, U_i)(\xi, t)$ defined by (3.7) satisfy

(i) $U_{i\xi} > 0$, $|V_{i\xi}| \le CU_{i\xi} \le C\delta_1$,

(ii) $\|(V_{i\xi}, U_{i\xi})(t)\|_{L^p(R_+)} \le C_p\delta_1^{1/p}(1 + t)^{-1+1/p}$, $\|(V_{i\xi\xi}, U_{i\xi\xi})(t)\|_{L^p(R_+)} \le C_p\min(\delta_1, (1 + t)^{-1})$,

(iii) $|(V_1 - \bar v, U_1 - \bar u)(\xi, t)| + |(V_{1\xi}, U_{1\xi})(\xi, t)| \le C\delta_1\exp\{-c(|\xi + s_- t| + t)\}$ for $\xi \ge -s_- t$,
$|(V_2 - \bar v, U_2 - \bar u)(\xi, t)| + |(V_{2\xi}, U_{2\xi})(\xi, t)| \le C\delta_1\exp\{-c(|\xi + s_- t| + t)\}$ for $0 \le \xi \le -s_- t$,

(iv) $\displaystyle\lim_{t\to\infty}\sup_{R_+}\Big|(V_i, U_i)(\xi, t) - (v_i^R, u_i^R)\Big(\frac{\xi + s_- t}{t}\Big)\Big| = 0$, $i = 1, 2$,

and also
$$
\begin{cases}
V_{it} - s_- V_{i\xi} - U_{i\xi} = 0,\\
U_{it} - s_- U_{i\xi} + p(V_i)_\xi = 0,
\end{cases}
\quad \xi \in R_+,\ t > 0 \tag{3.8}
$$
with
$$
(V_1, U_1)(0, t) = (v_*, u_*), \qquad (V_1, U_1)(\infty, t) = (\bar v, \bar u) \tag{3.9}
$$
and
$$
|(V_2 - \bar v, U_2 - \bar u)(0, t)| \le C\delta_1\exp(-ct), \qquad (V_2, U_2)(\infty, t) = (v_+, u_+). \tag{3.10}
$$
All results except for (3.9) are direct consequences of (3.5)–(3.7) and Lemma 3.1. Equation (3.9) follows from the choice of $w_- = 2\lambda_1(v_*) - \lambda_1(\bar v)$ in (3.5)$_{i=1}$. It would be natural to take $w_- = \lambda_1(v_*)$. However, with our choice,
\[
\frac{w_+ + w_-}{2} = \frac{\lambda_1(\bar v) + 2\lambda_1(v_*) - \lambda_1(\bar v)}{2} = s_-
\]
(note that $\lambda_1(v_*) = s_-$). Hence, $w_0(0) = s_-$ in (3.4), which means $w(x,t)|_{x = s_- t} = s_-$. By the definition (3.5)$_{i=1}$, $(\tilde V_1, \tilde U_1)(s_- t, t) = (v_*, u_*)$ and hence (3.9) holds.

On the other hand, the BL-solution $(V_0, U_0)(\xi)$ connecting $(v_-, u_-)$ and $(v_*, u_*)$ is given by Lemma 1.1:
\[
\begin{cases}
-s_- V_{0\xi} - U_{0\xi} = 0, \\
-s_- U_{0\xi} + p(V_0)_\xi = \mu \left(\dfrac{U_{0\xi}}{V_0}\right)_\xi, \qquad \xi \in \mathbb{R}_+, \\
(V_0, U_0)(0) = (v_-, u_-), \\
|(V_0 - v_*, U_0 - u_*)(\xi)| \le C\delta_0 (1+\xi)^{-1},
\end{cases} \tag{3.11}
\]
where $\delta_0 = |(v_* - v_-, u_* - u_-)|$.
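Putting
\[
\begin{pmatrix} V \\ U \end{pmatrix}(\xi, t) =
\begin{pmatrix} V_0(\xi) + V_1(\xi,t) + V_2(\xi,t) - v_* - \bar v \\ U_0(\xi) + U_1(\xi,t) + U_2(\xi,t) - u_* - \bar u \end{pmatrix}, \tag{3.12}
\]
we set the perturbation $(\phi, \psi)$ by $(v,u)(x,t) = (V + \phi, U + \psi)(\xi, t)$. Then the reformulated problem is, from (1.5), (3.8)–(3.12),
\[
\begin{cases}
\phi_t - s_- \phi_\xi - \psi_\xi = 0, \qquad \xi \in \mathbb{R}_+,\ t > 0, \\
\psi_t - s_- \psi_\xi + (p(V+\phi) - p(V))_\xi = \mu \left(\dfrac{U_\xi + \psi_\xi}{V + \phi} - \dfrac{U_\xi}{V}\right)_\xi + G_\xi, \\
(\phi,\psi)(0,t) = (\bar v - V_2(0,t), \bar u - U_2(0,t)) =: (b_V, b_U)(t), \\
(\phi,\psi)(\xi,0) = (v_0(\xi) - V(\xi,0), u_0(\xi) - U(\xi,0)) =: (\phi_0, \psi_0)(\xi),
\end{cases} \tag{3.13}
\]
where
\[
|(b_V, b_U)(t)| \le C\delta_1 \exp(-ct), \tag{3.14}
\]
\[
(\phi_0, \psi_0)(0) = (b_V, b_U)(0) \tag{3.15}
\]
and
\[
G = -\big(p(V) - p(V_0) - p(V_1) - p(V_2) + p(v_*) + p(\bar v)\big) + \mu\left(\frac{U_\xi}{V} - \frac{U_{0\xi}}{V_0}\right) =: -G_1 + G_2. \tag{3.16}
\]
Equation (3.13) is almost the same as (2.4) together with $U_\xi > 0$ (note that $U_\xi = -V_\xi/s_- > 0$ in (2.4)) except for the term $G_\xi$. Therefore, for the solution $(\phi,\psi)(\xi,t)$, $0 \le t \le T$, to (3.13) with
\[
\sup_{0 \le t \le T} \|(\phi,\psi)(t)\|_1 \le \varepsilon \le \varepsilon_0 \ll 1,
\]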
we have
\[
\|(\phi,\psi)(t)\|_1^2 + \int_0^t \big(\|\sqrt{U_\xi}\,\phi(\tau)\|^2 + \|\psi_\xi(\tau)\|^2\big)\,d\tau
\le C(\|\phi_0, \psi_0\|^2 + \delta_1) + C\delta_1 \int_0^t \psi_\xi(0,\tau)^2\,d\tau + C \int_0^t\!\!\int_0^\infty G_\xi \psi\,d\xi\,d\tau \tag{3.17}
\]
in a way similar to that in Subsect. 2.1. The second to last term comes from the boundary value and is controllable by combining the estimates of higher order derivatives, provided $\delta_1$ is small. We must control the last term in (3.17).
Since
\[
\begin{aligned}
G_{1\xi} &= p'(V)(V_{0\xi} + V_{1\xi} + V_{2\xi}) - p'(V_0)V_{0\xi} - p'(V_1)V_{1\xi} - p'(V_2)V_{2\xi} \\
&= (p'(V) - p'(V_0))V_{0\xi} + (p'(V) - p'(V_1))V_{1\xi} + (p'(V) - p'(V_2))V_{2\xi},
\end{aligned}
\]
by noting the signs of $V_1 - v_*$, etc.,
\[
|G_{1\xi}| \le C\{(V_1 - v_*)V_{0\xi} + (v_* - V_0)V_{1\xi}\} + C\{|V_2 - \bar v|(|V_{0\xi}| + |V_{1\xi}|) + |V_{2\xi}|(|V_0 - v_*| + |V_1 - \bar v|)\} =: G_{11\xi} + |G_{12\xi}|. \tag{3.18}
\]
The second term $|G_{12\xi}|$ is easily controlled, because the wave $V_2$ is away from the waves $V_0$ and $V_1$. In fact,
\[
\begin{aligned}
\|G_{12\xi}(t)\|^2 &= \left(\int_0^{-s_- t} + \int_{-s_- t}^\infty\right) |G_{12\xi}|^2(\xi,t)\,d\xi \\
&\le C \sup_{0 \le \xi \le -s_- t} \{(|V_{0\xi}(\xi)|^2 + |V_{1\xi}(\xi,t)|^2)|V_2(\xi,t) - \bar v|\} \cdot \int_0^{-s_- t} |V_2(\xi,t) - \bar v|\,d\xi \\
&\quad + C \sup_{0 \le \xi \le -s_- t} \{(|V_0 - v_*|^2 + |V_1 - \bar v|^2)|V_{2\xi}(\xi,t)|\} \cdot \int_0^{-s_- t} |V_{2\xi}(\xi,t)|\,d\xi \\
&\quad + C \sup_{\xi \ge -s_- t} \{|V_2 - \bar v|^2 (|V_{0\xi}(\xi)| + |V_{1\xi}(\xi,t)|)\} \cdot \int_{-s_- t}^\infty (|V_{0\xi}(\xi)| + |V_{1\xi}(\xi,t)|)\,d\xi \\
&\quad + C \sup_{\xi \ge -s_- t} \{(|V_0 - v_*|^2 + |V_1 - \bar v|^2)|V_{2\xi}(\xi,t)|\} \cdot \int_{-s_- t}^\infty |V_{2\xi}(\xi,t)|\,d\xi \\
&\le C(\delta_1 + \delta_0)\delta_1 (1+t)^{-3}.
\end{aligned}
\]
Here we have used $V_{0\xi} = \frac{V_0}{\mu s_-} h(V_0) \sim -\frac{V_0}{\mu s_-}(V_0 - v_*)^2$ and $|V_0(\xi) - v_*| \le C\delta_0 |\xi|^{-1}$ as $\xi \to \infty$. Hence
\[
C \int_0^t\!\!\int_0^\infty G_{12\xi}\psi\,d\xi\,d\tau \le C(\delta_0^{1/2} + \delta_1^{1/2})\delta_1^{1/2} \int_0^t (1+\tau)^{-3/2} \|\psi(\tau)\|\,d\tau,
\]
which is controllable by the Gronwall inequality. The term $G_{11\xi}$ is rather difficult to treat, because the waves $V_0$ and $V_1$ are in contact with each other for all time. See Fig. 3.1. However, this situation is similar to that in [5, Sect. 4] and the method used there is applicable to our case.
Putting a := −s− + λ1 (v) ¯ > 0, we have t ∞ C G11ξ ψdξ dτ 0
0
t
≤C
sup |ψ|
0 R+
≤C
t
at
ψ
0
{(V1 − v∗ )V0ξ + (v∗ − V0 )V1ξ }dξ dτ
at
0 1/2
∞
+
ψξ
1/2
[(V1 − v∗ )(V0 − v∗ )]at ξ =0
+ [(v∗ − V0 )(V1 − v∗ )]∞ ξ =at + 2
t
≤C ≤ν
0 t 0
≤ν
0
t
ψ
1/2
ψξ
1/2
∞ at
t 0
at
−2
V1ξ (V0 − v∗ )dξ
0
V0ξ (V1 − v∗ )dξ dτ
1/8 δ1 (1 + τ )−7/8 δ0
ψξ (τ )2 dτ + Cν
aτ
+(1 + τ )
−1
dτ
0
δ1 δ0 ψ(τ )2/3 (1 + τ )−7/6 (ln(2 + τ ))4/3 dτ 1/6 3/4
4/3 1/6
ψξ (τ )2 dτ + Cδ0 δ1 .
Here we have used, by Lemma 3.2 (ii), 1
δ0 ,
+7
V1ξ (·, t)L∞ ≤ V1ξ (·, t)L8 ∞ 8 ≤ δ1 (1 + t)−7/8 . t ∞ Estimating C 0 0 G2ξ ψdξ dτ by a similar way to the above, we have, for any fixed
(φ, ψ)(t)21
t
+ 0
1/8
( Uξ φ(τ )2 + ψξ (τ )2 )dτ ≤ C(φ0 , ψ0
2
1/6 + δ1 ) + Cδ1
t 0
ψξ (0, τ )2 dτ.
The estimates of higher order derivatives are also obtained, though the calculations are rather tedious. Thus, we have the following theorem.

Theorem 3.1 (The case of $(v_+, u_+) \in BL_+R_1R_2(v_-, u_-)$). Define $(V, U)(\xi, t)$ by (3.12). If both $\|\phi_0, \psi_0\|_1 = \|v_0 - V(\cdot, 0), u_0 - U(\cdot, 0)\|_1$ and $\delta_1 = |(v_+ - v_*, u_+ - u_*)|$ are suitably small, then there exists a unique solution $(\phi, \psi) \in C([0,\infty); H^1)$ to (3.13), and hence a solution $(v, u)$ to (P), which satisfies
\[
\sup_{\xi \ge 0} |(\phi,\psi)(\xi,t)| = \sup_{x \ge s_- t} |(v,u)(x,t) - (V,U)(x - s_- t, t)| \to 0 \quad (t \to \infty).
\]

Remark 3.2. This result implies that the BL-solution need not be weak, though the rarefaction waves are necessarily weak. We do not know whether the weakness assumption on the rarefaction waves can be removed.

Remark 3.3. The case of $(v_+, u_+) \in BL_+R_2(v_-, u_-)$ is treated in a similar fashion to the above case, and the assertion (III) holds. However, in the case of $(v_+, u_+) \in BL_-R_2(v_-, u_-)$ the situation is similar to that in Subsect. 2.2, and hence the BL-solution must be weak, and (IV) holds.
4. Concluding Remarks

Except for the cases treated in Sects. 2 and 3, all other cases are open. In this section we discuss the cases concerning the viscous shock wave, which are not yet solved either. Compared to the case of the corresponding Cauchy problem, we consider the case
\[
(v_-, u_-) \in \Omega_{\mathrm{sub}}, \qquad (v_+, u_+) \in BL_-S_2(v_-, u_-), \tag{4.1}
\]
where a superposition of the BL-solution and the viscous shock wave is expected to be an asymptotic state of the solution to (P). In this case, there is $(\bar v, \bar u) \in BL_-(v_-, u_-)$ such that $(v_+, u_+) \in S_2(\bar v, \bar u)$, and there are the BL-solution $(V_0, U_0)(\xi)$ satisfying
\[
\begin{cases}
-s_- V_{0\xi} - U_{0\xi} = 0, \\
-s_- U_{0\xi} + p(V_0)_\xi = \mu \left(\dfrac{U_{0\xi}}{V_0}\right)_\xi, \qquad \xi \in \mathbb{R}_+, \\
(V_0, U_0)(0) = (v_-, u_-), \qquad (V_0, U_0)(\infty) = (\bar v, \bar u)
\end{cases} \tag{4.2}
\]
with $s_- = -u_-/v_-$, and the 2-viscous shock wave $(\tilde V_2, \tilde U_2)(x - s_2 t + \alpha)$ connecting $(\bar v, \bar u)$ with $(v_+, u_+)$, and its restriction
\[
(V_2, U_2)(\xi, t; \alpha) := (\tilde V_2, \tilde U_2)(x - s_2 t + \alpha)|_{x \ge s_- t} = (\tilde V_2, \tilde U_2)(\xi - (s_2 - s_-)t + \alpha)|_{\xi \ge 0},
\]
which satisfies
\[
\begin{cases}
V_{2t} - s_- V_{2\xi} - U_{2\xi} = 0, \qquad \xi \in \mathbb{R}_+,\ t > 0, \\
U_{2t} - s_- U_{2\xi} + p(V_2)_\xi = \mu \left(\dfrac{U_{2\xi}}{V_2}\right)_\xi, \\
(V_2 - \bar v, U_2 - \bar u)|_{\xi = 0} =: (b_V, -b_U)(t), \qquad |(b_V, -b_U)(t)| \le C\delta_1 \exp(-ct), \\
(V_2, U_2)|_{\xi = \infty} = (v_+, u_+),
\end{cases} \tag{4.3}
\]
where $\alpha$ is a shift and $s_2 = \sqrt{-\dfrac{p(v_+) - p(\bar v)}{v_+ - \bar v}} > 0$ with $\delta_1 = |(v_+ - \bar v, u_+ - \bar u)|$. Hence the solution $(v, u)$ to (P) is expected to tend to
\[
(V, U)(\xi, t; \alpha) = (V_0(\xi) + V_2(\xi, t; \alpha) - \bar v,\ U_0(\xi) + U_2(\xi, t; \alpha) - \bar u). \tag{4.4}
\]
The key point is how to determine $\alpha$, which is suggested by the method in [4]. The perturbation $(v - V, u - U)$ satisfies
\[
\begin{cases}
(v - V)_t - s_-(v - V)_\xi - (u - U)_\xi = 0, \\
(u - U)_t - s_-(u - U)_\xi + (p(v) - p(V_0) - p(V_2) + p(\bar v))_\xi = \mu \left(\dfrac{u_\xi}{v} - \dfrac{U_{0\xi}}{V_0} - \dfrac{U_{2\xi}}{V_2}\right)_\xi, \\
(v - V, u - U)|_{\xi = 0} = (-b_V, b_U)(t), \\
(v - V, u - U)|_{t = 0} = (v_0 - V_0 - V_2(\cdot, 0; \alpha) + \bar v,\ u_0 - U_0 - U_2(\cdot, 0; \alpha) + \bar u)(\xi).
\end{cases} \tag{4.5}
\]
Integrating (4.5)$_1$ in $\xi$ over $\mathbb{R}_+$, we have
\[
\frac{d}{dt} \int_0^\infty (v - V)(\xi, t)\,d\xi = s_- b_V(t) - b_U(t). \tag{4.6}
\]
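Expecting $\int_0^\infty (v - V)(\xi, t)\,d\xi\,|_{t = \infty} = 0$ yields
\[
-\int_0^\infty (v_0(\xi) - V(\xi, 0; \alpha))\,d\xi = \int_0^\infty (s_- b_V(t) - b_U(t))\,dt = -(s_2 - s_-) \int_0^\infty (V_2(\alpha - (s_2 - s_-)t) - \bar v)\,dt,
\]
and hence
\[
I(\alpha) := \int_0^\infty (v_0(\xi) - V_0(\xi) - V_2(\xi + \alpha) + \bar v)\,d\xi - (s_2 - s_-) \int_0^\infty (V_2(\alpha - (s_2 - s_-)t) - \bar v)\,dt = 0. \tag{4.7}
\]
Since
\[
I'(\alpha) = \int_0^\infty -V_2'(\xi + \alpha)\,d\xi - (s_2 - s_-) \int_0^\infty V_2'(\alpha - (s_2 - s_-)t)\,dt = -(v_+ - V_2(\alpha)) + \bar v - V_2(\alpha) = \bar v - v_+,
\]
the equality $0 = I(\alpha) = I(0) + (\bar v - v_+)\alpha$ determines $\alpha$ by
\[
\alpha = -\frac{1}{\bar v - v_+} \left\{ \int_0^\infty (v_0(\xi) - V_0(\xi) - V_2(\xi) + \bar v)\,d\xi - (s_2 - s_-) \int_0^\infty (V_2(-(s_2 - s_-)t) - \bar v)\,dt \right\}. \tag{4.8}
\]
Again, integrating (4.6) over $(0, t)$, we have
\[
\begin{aligned}
\int_0^\infty (v(\xi, t) - V(\xi, t; \alpha))\,d\xi
&= \int_0^\infty (v_0(\xi) - V(\xi, 0; \alpha))\,d\xi + \int_0^t (s_- b_V(\tau) - b_U(\tau))\,d\tau \\
&= (s_2 - s_-) \int_t^\infty (V_2(\alpha - (s_2 - s_-)\tau) - \bar v)\,d\tau \quad (= -\hat b_V(t)) \\
&\to 0 \quad \text{exponentially as } t \to \infty.
\end{aligned} \tag{4.9}
\]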
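Thus, putting the perturbation in the integrated form by
\[
(\phi, \psi)(\xi, t) = -\int_\xi^\infty (v(\eta, t) - V(\eta, t; \alpha),\ u(\eta, t) - U(\eta, t; \alpha))\,d\eta,
\]
we reach the reformulated problem
\[
\begin{cases}
\phi_t - s_- \phi_\xi - \psi_\xi = 0, \qquad \xi \in \mathbb{R}_+,\ t > 0, \\
\psi_t - s_- \psi_\xi + p(V + \phi_\xi) - p(V_0) - p(V_2) + p(\bar v) = \mu \left(\dfrac{U_\xi + \psi_{\xi\xi}}{V + \phi_\xi} - \dfrac{U_{0\xi}}{V_0} - \dfrac{U_{2\xi}}{V_2}\right), \\
\phi|_{\xi = 0} = \hat b_V(t), \qquad \psi_\xi|_{\xi = 0} = -b_U(t), \\
(\phi, \psi)|_{t = 0} = (\phi_0, \psi_0)(\xi) := -\displaystyle\int_\xi^\infty (v_0 - V(\cdot, 0; \alpha),\ u_0 - U(\cdot, 0; \alpha))(\eta)\,d\eta.
\end{cases} \tag{4.10}
\]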
This setting seems to be reasonable. However, we could not prove the global existence theorem on (4.10) even if both $|(v_+ - v_-, u_+ - u_-)|$ and $\|\phi_0, \psi_0\|_2$ were small. The difficulty was to control the value $\psi(0, t)$ from the boundary.

References

1. Goodman, J.: Nonlinear asymptotic stability of viscous shock profiles for conservation laws. Arch. Rat. Mech. Anal. 95, 325–344 (1986)
2. Kawashima, S. and Nikkuni, Y.: Stability of stationary solutions to the half-space problem for the discrete Boltzmann equation with multiple collisions. Kyushu J. Math. to appear
3. Kawashima, S. and Nishibata, S.: Stability of stationary waves for compressible Navier–Stokes equations in the half space. In preparation
4. Liu, T.-P.: Nonlinear stability of shock waves for viscous conservation laws. Mem. Am. Math. Soc. 56, 1985
5. Liu, T.-P., Matsumura, A. and Nishihara, K.: Behaviors of solutions for the Burgers equation with boundary corresponding to rarefaction waves. SIAM J. Math. Anal. 29, 293–308 (1998)
6. Matsumura, A.: Inflow and outflow problems in the half space for a one-dimensional isentropic model system of compressible viscous gas. In: Proceedings of IMS Conference on Differential Equations from Mechanics, Hong Kong, 1999, to appear
7. Matsumura, A. and Mei, M.: Convergence to traveling fronts of solutions of the p-system with viscosity in the presence of a boundary. Arch. Rat. Mech. Anal. 146, 1–22 (1999)
8. Matsumura, A. and Nishihara, K.: On the stability of traveling wave solutions of a one-dimensional model system for compressible viscous gas. Japan J. Appl. Math. 2, 17–25 (1985)
9. Matsumura, A. and Nishihara, K.: Asymptotics toward the rarefaction waves of the solutions of a one-dimensional model system for compressible viscous gas. Jpn. J. Appl. Math. 3, 1–13 (1986)
10. Matsumura, A. and Nishihara, K.: Global stability of the rarefaction wave of a one-dimensional model system for compressible viscous gas. Commun. Math. Phys. 144, 325–335 (1992)
11. Matsumura, A. and Nishihara, K.: Global asymptotics toward the rarefaction wave for solutions of viscous p-system with boundary effect. Q. Appl. Math. 58, 69–83 (2000)
12. Szepessy, A. and Xin, Z.: Nonlinear stability of viscous shock waves. Arch. Rat. Mech. Anal. 122, 53–103 (1993)
13. Xin, Z.: Asymptotic stability of rarefaction waves for 2 × 2 viscous hyperbolic conservation laws – two-mode case. J. Differ. Eqs. 78, 191–219 (1989)

Communicated by P. Constantin
Commun. Math. Phys. 222, 475 – 493 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Asymptotic Statistics of Zeroes for the Lamé Ensemble
Alain Bourget, John A. Toth
Department of Mathematics and Statistics, McGill University, Montreal, Canada
Received: 17 January 2001 / Accepted: 14 May 2001
Abstract: The Lamé polynomials naturally arise when separating variables in Laplace's equation in elliptic coordinates. The products of these polynomials form a class of spherical harmonics, which are joint eigenfunctions of a quantum completely integrable (QCI) system of commuting, second-order differential operators $P_0 = \Delta, P_1, \ldots, P_{N-1}$ acting on $C^\infty(S^N)$. These operators naturally depend on parameters and thus constitute an ensemble. In this paper, we compute the limiting level-spacings distributions for the zeroes of the Lamé polynomials in various thermodynamic, asymptotic regimes. We give results both in the mean and pointwise, for an asymptotically full set of values of the parameters.

1. Introduction

Fix $N + 1$ distinct, positive real numbers $0 < \alpha_0 < \cdots < \alpha_N$. Given Cartesian coordinates $(z_1, \ldots, z_{N+1}) \in \mathbb{R}^{N+1}$, consider the partial differential operators
\[
P_k := \sum_{i<j} S_k^{ij}(\alpha_0, \ldots, \alpha_N) \left( z_i \frac{\partial}{\partial z_j} - z_j \frac{\partial}{\partial z_i} \right)^2; \qquad k = 0, \ldots, N-1 \tag{1}
\]
acting on $C^\infty(S^N)$. Here, $S_k^{ij}$ denotes the $k$th elementary symmetric polynomial in the $\alpha$ parameters with $\alpha_i$ and $\alpha_j$ deleted. It is easy to check that
\[
[P_i, P_j] = 0 \qquad \text{for all } i, j = 0, \ldots, N-1.
\]
Consequently, since the $P_j$'s are jointly elliptic, they possess a Hilbert basis of joint eigenfunctions. Since $P_0$ is just the constant curvature spherical Laplacian, these eigenfunctions form a class of spherical harmonics, the so-called generalized Lamé harmonics [T1].

Supported in part by an Alfred P. Sloan Research Fellowship and NSERC grant OGP0170280.

These systems constitute important examples of quantum completely integrable systems and they have as complex analogues the Gaudin spin-chains of various types [HW, K]. The purpose of this note is to derive asymptotic formulas for the level-spacings distributions of the zeroes of these spherical harmonics. To describe our results in more detail, we begin by noting that in terms of appropriate (elliptic-spherical [HW, T1, T2]) parametrizing coordinates $(u_1, \ldots, u_N) \in (\alpha_0, \alpha_1) \times \cdots \times (\alpha_{N-1}, \alpha_N)$ on $S^N$, and up to constant multiples, the joint eigenfunctions of $P_0, \ldots, P_{N-1}$ can be written in the form:
\[
\psi(u_1, \ldots, u_N) = \prod_{j=1}^N S_m\big(\sqrt{u_j - \alpha_0}, \ldots, \sqrt{u_j - \alpha_N}\big) \cdot \phi(u_j).
\]
Here, $\phi$ is a polynomial, and $S_m(x_0, \ldots, x_N)$; $m = 0, \ldots, N+1$ denotes the $m$th elementary symmetric function on $N + 1$ variables. Furthermore, the function $\psi(x) := S_m(\sqrt{x - \alpha_0}, \ldots, \sqrt{x - \alpha_N}) \cdot \phi(x)$ is a solution of the ODE:
\[
\prod_{\nu=0}^N (x - \alpha_\nu)\, \frac{d^2\psi}{dx^2} + \frac{1}{2} \sum_{\nu=0}^N \prod_{\lambda \ne \nu} (x - \alpha_\lambda)\, \frac{d\psi}{dx} + C(x)\psi = 0, \tag{2}
\]
where $C(x)$ is a polynomial of order $N - 1$ depending linearly on the joint eigenvalues $(\lambda_0, \ldots, \lambda_{N-1}) \in \mathrm{Spec}(P_0, \ldots, P_{N-1})$. The different species [WW] of harmonics are indexed by $m = 0, \ldots, N+1$. Although for simplicity, we consider here the case where $m = 0$, our main result (Theorem 1.1) can be proved for the other cases corresponding to $m = 1, \ldots, N+1$ in a similar fashion. When $m = 0$, the solutions $\psi(x)$ are called generalized Lamé polynomials. Consider
\[
E(k) := \{\phi_1^{(k)}, \ldots, \phi_{j(k)}^{(k)}\},
\]
the set of Lamé polynomials of degree $k$. By the standard theory of spherical harmonics [WW] and the fact that the corresponding Lamé harmonics form a Hilbert basis, we know that $j(k) = \sigma(N, k) := \frac{(N+k-1)!}{k!\,(N-1)!}$. Let $\theta_{i,1}^{(k)} \le \cdots \le \theta_{i,k}^{(k)}$ denote the (real) zeroes of the polynomial $\phi_i^{(k)}$, where $i = 1, \ldots, \sigma(N, k)$. In our main result (Theorem 1.1), we compute the asymptotic weak limit for the level spacings distribution averaged over the set, $E(k)$, of $k$th order Lamé polynomials. More precisely, consider
\[
d\rho_{LS}^{AV}(x; N, k, \alpha) := \frac{1}{\sigma(N,k)} \sum_{l=1}^{\sigma(N,k)} \frac{1}{k-1} \sum_{j=1}^{k-1} \delta\big(x - k(\theta_{l,j+1}^{(k)}(\alpha) - \theta_{l,j}^{(k)}(\alpha))\big), \tag{3}
\]
where $\alpha \in \Lambda^N$ and
\[
\Lambda^N := \{(\alpha_0, \ldots, \alpha_N) \in [0,1]^{N+1};\ \alpha_0 < \alpha_1 < \cdots < \alpha_{N-1} < \alpha_N\}. \tag{4}
\]
We henceforth put the normalized Lebesgue measure $\bar d\alpha := (N+1)!\,d\alpha$ on $\Lambda^N$, so that $\mathrm{meas}(\Lambda^N) = 1$. In order to state our first result, we will also need to introduce the integrated, averaged level-spacings distribution:
\[
d\mu_{LS}(x; N, k) := \int_{\Lambda^N} d\rho_{LS}^{AV}(x; N, k, \alpha)\,\bar d\alpha. \tag{5}
\]
Theorem 1.1. (i) Fix $0 < \varepsilon < 1$ and assume that $k \sim N^{1-\varepsilon}$ as $N \to \infty$. Then,
\[
\operatorname*{w-lim}_{N \to \infty} d\mu_{LS}(x; N, k) = e^{-x}\,dx.
\]
(ii) Suppose that $k(N)$ satisfies the hypotheses of part (i). Then, for any $0 < \delta < \varepsilon$ there exists a measurable subset $J^N \subset \Lambda^N$ with $\mathrm{meas}(J^N) \ge 1 - N^{-\delta}$, such that for any $\alpha \in J^N$,
\[
\operatorname*{w-lim}_{N \to \infty} d\rho_{LS}^{AV}(x; N, k, \alpha) = e^{-x}\,dx.
\]
In both (i) and (ii), the weak limit is taken in the dual space to $C_0^0([a, b])$, where $0 \le a < b < \infty$.

Remarks. (i) In recent work, Bleher, Shiffman and Zelditch [BSZ 1,2] have determined the asymptotics of various measures associated with the distribution of zeroes of eigensections of Toeplitz operators. The Lamé ensemble together with its complex analogues (the Gaudin spin chains) can be described in the Toeplitz framework [K]. However, in [BSZ 1,2] the averaging is carried out over a much larger ensemble: namely, all suitably normalized bases of Toeplitz eigensections. Most such bases are not quantum completely integrable and consequently, the situation considered in this paper is quite different from that in [BSZ 1,2]. The main point here is that we are really averaging over a comparatively small ensemble indexed by the parameters $\alpha \in \Lambda^N$ and the elements of which are all quantum completely integrable.

(ii) There are two asymptotic parameters that enter into our analysis here: $k$, the degree of the joint eigenfunctions, and $N$, the dimension of the base space, which in this case is just $S^N$. So, Theorem 1.1 above is really a hybrid asymptotic result about the zeroes of the joint eigenfunctions of the $P_j$'s on spheres of increasing dimension where we assume that the number of zeroes, $k$, satisfies $k(N) \sim N^{1-\varepsilon}$ as $N \to \infty$. This asymptotic regime can be thought of as a kind of thermodynamic limit. It would also be of interest to determine what happens in other asymptotic ranges where the number of zeroes is permitted to grow at faster rates as $N \to \infty$. In particular, one would like to know what happens in the purely semiclassical regime, where $N$ is fixed and $k \to \infty$. We hope to address these points in future work.

(iii) As the referee has pointed out, it would be of considerable interest to determine how the actual zeroes of the Lamé harmonics are distributed in the sense of a Riemann measure on $S^N$ itself. A natural starting point would be to look at the density of states measures (see [ShZ, NV]). In light of our results in this paper, the zeroes should, at least on average, behave like random variables in the asymptotic regime where $k(N) \sim N^{1-\varepsilon}$. Consequently, we believe that the density of states should on average tend to uniform measure on $S^N$, but at present we do not know how to prove this. We plan to address this question for the Lamé harmonics as well as the more general complex XXX Gaudin spin chains in an upcoming paper.

2. The Lamé Differential Equation

We now give a brief introduction to the Lamé equation following the classical presentation in Whittaker and Watson [WW], where this equation is introduced via the theory of ellipsoidal harmonics. In his treatise on heat conduction in an ellipsoidal body, G. Lamé was led to consider the class of homogeneous, harmonic polynomials on $\mathbb{R}^{N+1}$ that
vanish on a family of confocal quadrics. There is an analogous construction of spherical harmonics which we will now describe. Pick a set $\{\alpha_0, \ldots, \alpha_N\}$ of positive real constants, all distinct, and ordered in increasing order. Define, for some real parameter $\theta$, the diagonal matrix
\[
A_\theta = \mathrm{diag}\big((\theta - \alpha_0)^{-1}, \ldots, (\theta - \alpha_N)^{-1}\big).
\]
The problem then reduces to finding, for any positive integer $k$ and multi-index $\beta = (\beta_0, \ldots, \beta_N) \in \{0, 1\}^{N+1}$, $k$ real numbers $\theta_1, \ldots, \theta_k$ for which the Niven's functions
\[
f_\beta(X) = X^\beta \prod_{j=1}^k (A_{\theta_j} X, X), \qquad X \in \mathbb{R}^{N+1}, \tag{6}
\]
are solutions of Laplace's equation $\Delta(f_\beta) = 0$. The restrictions of the $f_\beta$'s to $S^N$ yield an important class of spherical harmonics: the generalized Lamé harmonics. Clearly, the functions $f_\beta(X)$ vanish on a family of confocal cones. Moreover, after substitution of the ansatz into Laplace's equation, a straightforward computation shows that the relevant $\theta_j$ are obtained as solutions of the equations
\[
\sum_{\nu=0}^N \frac{1}{\theta_j(\alpha; l) - \alpha_\nu} + \sum_{\nu=0}^N \frac{2\beta_\nu}{\theta_j(\alpha; l) - \alpha_\nu} + \sum_{i \ne j} \frac{4}{\theta_j(\alpha; l) - \theta_i(\alpha; l)} = 0 \qquad \text{for } j = 1, \ldots, k. \tag{7}
\]
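In the simplest case $k = 1$, $\beta = 0$, the system (7) collapses to the single equation $\sum_{\nu} 1/(\theta - \alpha_\nu) = 0$, whose left-hand side decreases strictly from $+\infty$ to $-\infty$ on each gap $(\alpha_{\nu-1}, \alpha_\nu)$. The sketch below is our own illustration under that assumption (the names `bethe_residual` and `single_zero_solutions` are hypothetical); it locates the unique root in each gap by bisection, giving $N$ solutions in total, in agreement with $\sigma(N, 1) = N$.

```python
def bethe_residual(theta, alphas):
    """Left-hand side of (7) for k = 1 and beta = 0."""
    return sum(1.0 / (theta - a) for a in alphas)


def single_zero_solutions(alphas, tol=1e-12):
    """Return one root per gap (alpha_{nu-1}, alpha_nu), found by bisection.

    `alphas` must be strictly increasing. On each gap the residual is
    strictly decreasing, positive near the left end and negative near the
    right end, so bisection on the sign is safe.
    """
    roots = []
    for left, right in zip(alphas, alphas[1:]):
        lo, hi = left, right
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if bethe_residual(mid, alphas) > 0:
                lo = mid  # root lies to the right of mid
            else:
                hi = mid  # root lies to the left of (or at) mid
        roots.append(0.5 * (lo + hi))
    return roots
```

For two parameters the single root is the midpoint $(\alpha_0 + \alpha_1)/2$, which gives a quick sanity check of the routine.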
In the literature, these equations are commonly referred to as the “Bethe Ansatz” equations. Consequently, if we denote the solutions of (7) by $\theta_1, \ldots, \theta_k$, it is not hard to see that the functions
\[
\psi(x) = \prod_{\nu=0}^N (x - \alpha_\nu)^{\beta_\nu/2} \prod_{j=1}^k (x - \theta_j), \qquad \beta_\nu \in \{0, 1\} \tag{8}
\]
satisfy the second order differential equation given by
\[
\prod_{\nu=0}^N (x - \alpha_\nu)\, \frac{d^2\psi}{dx^2} + \frac{1}{2} \sum_{\nu=0}^N \prod_{\mu \ne \nu} (x - \alpha_\mu)\, \frac{d\psi}{dx} + C(x)\psi = 0, \tag{9}
\]
where $C(x)$ is a polynomial of degree $N - 1$ that can be computed explicitly. This is exactly Eq. (2) of the introduction, and is known as the generalized Lamé differential equation. In the special case where the multi-index $\beta = 0$, it follows that the $k$th degree polynomial $\phi(x) = \prod_{j=1}^k (x - \theta_j)$ is a solution of the Lamé equation. Following the terminology adopted previously, we refer to these as Lamé polynomials. Restricting our attention to the $N$-sphere $S^N$, we can use elliptic-spherical [HW, T1, T2] coordinates $u = (u_1, \ldots, u_N) \in (\alpha_0, \alpha_1) \times \cdots \times (\alpha_{N-1}, \alpha_N)$ to rewrite Niven's function in the form
\[
f_\beta(u_1, \ldots, u_N) = c \prod_{j=1}^N \Big[ \prod_{\nu=0}^N (u_j - \alpha_\nu)^{\beta_\nu/2} \Big] \phi(u_j), \tag{10}
\]
where $c$ is some constant depending only on $\alpha_0, \ldots, \alpha_N$ and $\phi(u) = \prod_{j=1}^k (u - \theta_j)$. The functions $\psi(u_1, \ldots, u_N)$ in the introduction are simply linear combinations of the
$f_\beta$'s and consequently, they are solutions of the eigenvalue problem for the Laplace operator on $S^N$ written in terms of elliptic-spherical coordinates; that is
\[
\sum_{j=1}^N \frac{4 \sqrt{U(u_j)}}{\prod_{i \ne j} (u_j - u_i)} \frac{\partial}{\partial u_j} \left( \sqrt{U(u_j)}\, \frac{\partial \psi}{\partial u_j} \right) = -\lambda_0 \psi,
\]
where $U(x) = \prod_{\nu=0}^N (x - \alpha_\nu)$. The separated equations have the form
\[
\prod_{\nu=0}^N (x - \alpha_\nu)\, \frac{d^2\psi}{dx^2} + \frac{1}{2} \sum_{\nu=0}^N \prod_{\mu \ne \nu} (x - \alpha_\mu)\, \frac{d\psi}{dx} + \frac{1}{4} \sum_{j=0}^{N-1} \lambda_{N-j-1}\, x^j\, \psi = 0, \tag{11}
\]
where the separation constants $\lambda_0, \ldots, \lambda_{N-1}$ are the joint eigenvalues of the partial differential operators $P_0, \ldots, P_{N-1}$ defined in (1). Equation (11) is exactly the Lamé differential equation (2) considered in the introduction.

3. The Heine–Stieltjes Theorem

One of the key steps in the proof of Theorem 1.1 is based on a result originally obtained by M. Heine [H]. Shortly afterwards, Stieltjes [S] improved Heine's result in the special case of differential equations of Lamé's type, the case that we consider here. For a complete proof, we refer the reader to Szegö [Sz].

Theorem 3.1 (Heine–Stieltjes). Let $A(x)$ be the polynomial of degree $N + 1$ given by $A(x) = (x - \alpha_0) \cdot (x - \alpha_1) \cdots (x - \alpha_N)$, where $0 < \alpha_0 < \alpha_1 < \cdots < \alpha_N$, and let $B(x)$ be a polynomial of degree $N$ satisfying the condition
\[
\frac{B(x)}{A(x)} = \frac{\rho_0}{x - \alpha_0} + \cdots + \frac{\rho_N}{x - \alpha_N},
\]
for given numbers $\rho_\nu > 0$, $\nu = 0, \ldots, N$. Then, there are exactly $\sigma(N, k) = \frac{(N+k-1)!}{k!\,(N-1)!}$ polynomials $C(x)$ of degree $N - 1$ for which the differential equation
\[
A(x) \frac{d^2\phi}{dx^2} + 2B(x) \frac{d\phi}{dx} + C(x)\phi = 0 \tag{12}
\]
has a polynomial solution of degree $k > 0$. In addition, for each of the $\sigma(N, k)$ solutions, $\phi(x)$, the zeroes are simple and uniquely determined by their distribution in the intervals $(\alpha_0, \alpha_1), \ldots, (\alpha_{N-1}, \alpha_N)$.

Note that the Lamé equation (11) is a particular case of the differential equation (12) appearing in the statement of the theorem: indeed, in the Lamé case, we have that $\rho_\nu = 1/4$ for $\nu = 0, \ldots, N$. Taking into account the Heine–Stieltjes result, we denote the zeroes of $\phi(x)$ by $\theta_1(\alpha; l) \le \cdots \le \theta_k(\alpha; l)$, where $\alpha := (\alpha_0, \ldots, \alpha_N)$, whereas $l = (l_1, \ldots, l_k)$; $1 \le l_1 \le \cdots \le l_k \le N$ denotes the configuration of the zeroes. By this we mean that $\theta_1(\alpha; l)$ is the smallest zero lying in the interval $(\alpha_{l_1 - 1}, \alpha_{l_1})$, the next zero $\theta_2(\alpha; l)$ is contained in the interval $(\alpha_{l_2 - 1}, \alpha_{l_2})$ and so on. Although we will not explicitly use the following result in this paper, it is of independent interest and so we include it here for future reference:
Lemma 3.2. For any given configuration $l = (l_1, \ldots, l_k)$, the zeroes $\theta_1(\alpha; l), \ldots, \theta_k(\alpha; l)$ are differentiable functions of $\alpha \in \Lambda^N$.

Proof. The proof is based on the argument given in [Sh]. Differentiating the Bethe Ansatz equations (7) with respect to the $\theta$ variables, we form the Jacobian matrix $B = (b_{ij})$ given by
\[
b_{ij} = \begin{cases}
-\displaystyle\sum_{\nu=0}^N \frac{\rho_\nu}{(\theta_j - \alpha_\nu)^2} - \sum_{m \ne j} \frac{1}{(\theta_j - \theta_m)^2} & \text{if } i = j, \\[2ex]
\dfrac{1}{(\theta_j - \theta_i)^2} & \text{if } i \ne j.
\end{cases}
\]
By a standard result in matrix theory (Geršgorin's Theorem) it follows that all the eigenvalues of $B$ are strictly negative, since if $\lambda$ is an eigenvalue of $B$, then for some $j \in \{1, 2, \ldots, k\}$,
\[
\lambda \le b_{jj} + \sum_{i \ne j} |b_{ij}| = -\sum_{\nu=0}^N \frac{\rho_\nu}{(\theta_j - \alpha_\nu)^2} < 0.
\]
Therefore, the determinant of $B$ is nonzero, so we can apply the implicit function theorem to conclude the proof.

4. Level Spacings Distribution

Consider the Lamé system consisting of $N + 1$ particles located at $\alpha_0, \ldots, \alpha_N$ with $0 < \alpha_0 < \cdots < \alpha_N < 1$ together with the $k$ zeroes $\theta_1(\alpha; l) \le \cdots \le \theta_k(\alpha; l)$ of the polynomial solution $\phi(x)$ of the differential equation (12). Recall that we use the notation $\theta_j(\alpha; l)$ to designate the $j$th zero in configuration $l = (l_1, \ldots, l_k)$ with respect to parameters $\alpha = (\alpha_0, \ldots, \alpha_N)$. As a consequence of the Heine–Stieltjes Theorem, we have that:
\[
d\rho_{LS}^{AV}(x; N, k, \alpha) = \frac{1}{\sigma(N,k)} \sum_{1 \le l_1 \le \cdots \le l_k \le N} \frac{1}{k-1} \sum_{j=1}^{k-1} \delta\big(x - k(\theta_{j+1}(\alpha; l) - \theta_j(\alpha; l))\big), \tag{13}
\]
and so,
\[
d\mu_{LS}(x; N, k) := \int_{\Lambda^N} d\rho_{LS}^{AV}(x; N, k, \alpha)\,\bar d\alpha
= \frac{1}{\sigma(N,k)} \sum_{1 \le l_1 \le \cdots \le l_k \le N} \frac{1}{k-1} \sum_{j=1}^{k-1} \int_{\Lambda^N} \delta\big(x - k(\theta_{j+1}(\alpha; l) - \theta_j(\alpha; l))\big)\,\bar d\alpha. \tag{14}
\]
4.1. Proof of part (i) of Theorem 1.1. The proof is somewhat long and computational, so we divide it into several steps. For notational simplicity, we assume here that [a, b] = [0, 1] and φ ∈ C01 ([0, 1]). The argument for more general non-negative intervals [a, b] follows in the same way.
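Step 1. We first show that $d\mu_{LS}(x; N, k)$ can be asymptotically approximated by fairly simple integrals that do not depend on the explicit formulas for the zeroes $\theta_1(\alpha; l), \ldots, \theta_k(\alpha; l)$. More precisely, we claim that:
\[
d\mu_{LS}(x; N, k)(\phi) = \frac{1}{\sigma(N,k)} \sum_{1 \le l_1 \le \cdots \le l_k \le N} \frac{1}{k-1} \sum_{j=1}^{k-1} \int_{\Lambda^N} \phi\big(k(\alpha_{l_{j+1}} - \alpha_{l_j})\big)\,\bar d\alpha + O\!\left(\frac{k}{N}\right). \tag{15}
\]
To obtain the estimate in (15), we make a first-order Taylor expansion in (14) around $k(\alpha_{l_{j+1}} - \alpha_{l_j})$ to get:
\[
d\mu_{LS}(x; N, k)(\phi) = \frac{1}{\sigma(N,k)} \sum_{1 \le l_1 \le \cdots \le l_k \le N} \frac{1}{k-1} \sum_{j=1}^{k-1} \int_{\Lambda^N} \phi\big(k(\alpha_{l_{j+1}} - \alpha_{l_j})\big)\,\bar d\alpha + E_1(N, k, \phi), \tag{16}
\]
where the error term $E_1(N, k, \phi)$ is given by
\[
E_1(N, k, \phi) = \frac{1}{\sigma(N,k)} \sum_{1 \le l_1 \le \cdots \le l_k \le N} \frac{k}{k-1} \sum_{j=1}^{k-1} \int_{\Lambda^N} \phi'(k\xi_j(\alpha)) \Big[ \big(\theta_{j+1}(\alpha; l) - \alpha_{l_{j+1}}\big) - \big(\theta_j(\alpha; l) - \alpha_{l_j}\big) \Big]\,\bar d\alpha,
\]
with $\xi_j(\alpha) \in (0, 1)$. We need to show that $E_1(N, k, \phi) = O\!\left(\frac{k}{N}\right)$. First, we start with a simple calculus lemma: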
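Lemma 4.1. For any $0 \le i \le j \le N$ and multi-indices $\beta = (\beta_1, \beta_2) \in \mathbb{N}^2 \setminus \{(0, 0)\}$, we have
\[
\int_{\Lambda^N} \alpha_i^{\beta_1} \alpha_j^{\beta_2}\,\bar d\alpha = \frac{\prod_{l=1}^{\beta_1} (i + l)\ \prod_{l=1}^{\beta_2} (\beta_1 + j + l)}{\prod_{l=1}^{|\beta|} (N + 1 + l)},
\]
where we define products of the form $\prod_{l=1}^{0}$ to be equal to 1 and $|\beta| := \beta_1 + \beta_2$.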
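Proof of Lemma 4.1. A direct computation with iterated integrals gives
\[
\begin{aligned}
\int_{\Lambda^N} \alpha_i^{\beta_1} \alpha_j^{\beta_2}\,\bar d\alpha
&= \int_{0 < \alpha_0 < \cdots < \alpha_N < 1} \alpha_i^{\beta_1} \alpha_j^{\beta_2}\,\bar d\alpha \\
&= (N+1)! \int_0^1 \int_0^{\alpha_N} \cdots \int_0^{\alpha_1} \alpha_j^{\beta_2} \alpha_i^{\beta_1}\,d\alpha_0 \cdots d\alpha_N \\
&= \frac{(N+1)!}{i!} \int_0^1 \int_0^{\alpha_N} \cdots \int_0^{\alpha_{i+1}} \alpha_j^{\beta_2} \alpha_i^{\beta_1 + i}\,d\alpha_i \cdots d\alpha_N \\
&= \frac{(N+1)!}{i!\,(\beta_1 + i + 1) \cdots (\beta_1 + j)} \int_0^1 \int_0^{\alpha_N} \cdots \int_0^{\alpha_{j+1}} \alpha_j^{\beta_2 + \beta_1 + j}\,d\alpha_j \cdots d\alpha_N \\
&= \frac{(N+1)!}{i!\,(\beta_1 + i + 1) \cdots (\beta_1 + j)\,(\beta_1 + \beta_2 + j + 1) \cdots (\beta_1 + \beta_2 + N + 1)} \\
&= \frac{\prod_{l=1}^{\beta_1} (i + l)\ \prod_{l=1}^{\beta_2} (\beta_1 + j + l)}{\prod_{l=1}^{|\beta|} (N + 1 + l)}.
\end{aligned}
\]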
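The iterated integration in this proof can be replayed exactly in rational arithmetic, since at every stage the integrand is a single monomial $c\,x^e$ in the next variable. The sketch below is our own check, not part of the paper; the function names are hypothetical. It compares the step-by-step integration with the closed form of Lemma 4.1 over the ordered simplex $0 < \alpha_0 < \cdots < \alpha_N < 1$ with the normalized measure $(N+1)!\,d\alpha$.

```python
from fractions import Fraction
from math import factorial, prod


def simplex_moment(N, i, j, b1, b2):
    """Integrate alpha_i^b1 * alpha_j^b2 over 0 < alpha_0 < ... < alpha_N < 1
    against (N+1)! d(alpha), one variable at a time (innermost first)."""
    coeff, expo = Fraction(1), 0  # current integrand is coeff * x**expo
    for m in range(N + 1):
        if m == i:
            expo += b1  # factor alpha_i^b1 enters at variable alpha_i
        if m == j:
            expo += b2  # factor alpha_j^b2 enters at variable alpha_j
        # integrate coeff * x**expo from 0 up to the next variable (or to 1)
        coeff /= expo + 1
        expo += 1
    return factorial(N + 1) * coeff


def closed_form(N, i, j, b1, b2):
    """Right-hand side of Lemma 4.1 (empty products equal 1)."""
    num = prod(i + l for l in range(1, b1 + 1))
    num *= prod(b1 + j + l for l in range(1, b2 + 1))
    den = prod(N + 1 + l for l in range(1, b1 + b2 + 1))
    return Fraction(num, den)
```

With $\beta = (1, 0)$ both functions return $(i+1)/(N+2)$, the mean of the $(i+1)$-st order statistic of $N+1$ independent uniform points, which is the value used in the proof of Corollary 4.2.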
As a consequence of Lemma 4.1, we see that the integrals of consecutive monomials over the truncated positive Weyl chamber $\Lambda^N$ are asymptotically equal as $N \to \infty$. Combined with the Heine–Stieltjes result, this fact leads to the following simple corollary of Lemma 4.1:

Corollary 4.2. For any configuration $l = (l_1, \ldots, l_k)$ and integer $j$ satisfying $1 \le j \le k$, we have that
\[
\int_{\Lambda^N} \big|\theta_j(\alpha; l) - \alpha_{l_j}\big|\,\bar d\alpha = O(N^{-1}) \tag{17}
\]
uniformly in $k$.

Proof of Corollary 4.2. As a consequence of the Heine–Stieltjes theorem, we know that given a configuration $l$, the $j$th zero necessarily lies in the interval $(\alpha_{l_j - 1}, \alpha_{l_j})$; that is, $\alpha_{l_j - 1} \le \theta_j(\alpha; l) \le \alpha_{l_j}$. On the other hand, by Lemma 4.1, $\int_{\Lambda^N} \alpha_j\,\bar d\alpha = \frac{j+1}{N+2}$. Thus,
\[
\int_{\Lambda^N} \big|\theta_j(\alpha; l) - \alpha_{l_j}\big|\,\bar d\alpha \le \int_{\Lambda^N} \big(\alpha_{l_j} - \alpha_{l_j - 1}\big)\,\bar d\alpha = \frac{l_j + 1}{N + 2} - \frac{l_j}{N + 2} = O(N^{-1}).
\]
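Given Corollary 4.2, we can now estimate the error term, $E_1(N, k, \phi)$, in (16) as follows:
\[
|E_1(N, k, \phi)| \le \frac{1}{\sigma(N,k)} \sum_{1 \le l_1 \le \cdots \le l_k \le N} \frac{k}{k-1} \sum_{j=1}^{k-1} \left( \int_{\Lambda^N} \big|\theta_{j+1}(\alpha; l) - \alpha_{l_{j+1}}\big|\,\bar d\alpha + \int_{\Lambda^N} \big|\theta_j(\alpha; l) - \alpha_{l_j}\big|\,\bar d\alpha \right) \|\phi\|_{C^1} = O\!\left(\frac{k}{N}\right).
\]
This proves the identity in (15) and so Step 1 is complete.

Step 2. The next step involves computing the first term on the RHS of (15) explicitly. We claim that:
\[
\begin{aligned}
&\frac{1}{\sigma(N,k)} \sum_{1 \le l_1 \le \cdots \le l_k \le N} \frac{1}{k-1} \sum_{j=1}^{k-1} \int_{\Lambda^N} \phi\big(k(\alpha_{l_{j+1}} - \alpha_{l_j})\big)\,\bar d\alpha \\
&\qquad = \frac{k}{N + k - 1}\,\phi(0) + \frac{N + 1}{\sigma(N,k)} \sum_{m=0}^{N-2} \sigma(N - m - 1, k - 1) \int_0^1 \phi(kx)\,\mathrm{binom}(N, m; x)\,dx,
\end{aligned} \tag{18}
\]
where $\mathrm{binom}(N, m; x) := \frac{N!}{m!\,(N-m)!}\, x^m (1 - x)^{N-m}$ for $x \in [0, 1]$. In order to prove the identity in (18), we start with a simple lemma which involves a successive application of the Fubini theorem.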
Lemma 4.3. For any integers $i, j$ with $0 \le i < j \le N$, we have that
\[
\int_{\Lambda^N} \phi\big(k(\alpha_j - \alpha_i)\big)\,\bar d\alpha = (N + 1) \int_0^1 \phi(kx)\,\mathrm{binom}(N, j - i - 1; x)\,dx. \tag{19}
\]
φ k αj − αi d−α = (N + 1)! ··· φ k αj − αi dα0 · · · dαN . "N
0
0
0
By repeated application of Fubini’s theorem, we can ensure that the iterated integrals with respect to αi and αj are carried out last. More precisely, we apply Fubini’s theorem on the double integral with respect to αj and αj +1 to reverse the order of integration. We then repeat the same procedure for the double integral with respect to αj and αj +2 and so on, until we bring the last integration with respect to αj . This gives "N
1 1 αN
dα = 0
αj
αj
···
αj +2
αj
αj 0
··· 0
α1
dα0 . . . dαj −1 dαj +1 . . . dαN−1 dαN dαj .
484
A. Bourget, J. A. Toth
We proceed in a similar manner for αi to finally obtain: 1 1 αj +2 αj αj αi+2 dα = ··· ··· "N
αj
αj
0
αi
0
αi
αi
···
0
α1
dα0 . . .
0
. . . dαi−1 dαi+1 . . . dαj −1 dαi dαj +1 . . . dαN dαj 1 αj
=
1 αj
0
0
···
αj +2
αj
αj αi
···
αi+2
αi
αi
···
0
α1
0
i . . . dα0 . . . dα
j . . . dαN dαi dαj , . . . dα j means that these variables are omitted in the product measure i , dα where dα dα0 · · · dαN . We then carry out the iterated integration over the first N − 2 variables α0 < α1 < · · · < αi−1 < αi+1 < · · · < αj −1 < αj +1 < · · · < αN to get 1 αj i αi (αj − αi )j −i−1 (1 − αj )N−j − d α = (N + 1)! dαi dαj . i! (j − i − 1)! (N − j )! "N 0 0 Finally, we make the change of variables x = αj − αi , y = αj and integrate by parts i times with respect to y. It follows that 1
− x j −i−1 (1 − x)N−(j −i−1) φ k αj − αi d α =(N + 1)! φ(kx) dx (j − i − 1)! (N − (j − i − 1))! "N 0 1 φ(kx)binom(N, j − i − 1; x)dx. = (N + 1) 0
This completes the proof of Lemma 4.3. need to compute the asymptotic averages (18) of the integrals To complete Step 2, we −α. First, we start with a simple combinatorial lemma. In order φ(k(α − α )) d N l l j +1 j " to state the lemma, it is useful to introduce some notation at this point: We denote by Sj (m) the set of all configurations l = (l1 , . . . , lk ) for which lj +1 − lj = m. As the following lemma shows, the cardinality of Sj (m) is independant of j . Lemma 4.4. For each m = 0, . . . , N −1 and each j = 1, . . . , k, the number of k-tuples (l1 , . . . , lk ) with 1 ≤ l1 ≤ · · · ≤ lk ≤ N and satisfying lj +1 − lj = m is given by σ (N − m, k − 1) =
(N − m + k − 2)! . (k − 1)! · (N − m − 1)!
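For small $N$ and $k$ the count in Lemma 4.4 can be confirmed by brute force. The sketch below is our own check (the function names are hypothetical): it enumerates all nondecreasing $k$-tuples with entries in $\{1, \ldots, N\}$ and counts those with a prescribed gap between the $j$th and $(j+1)$st entries.

```python
from itertools import combinations_with_replacement
from math import comb


def sigma(N, k):
    """sigma(N, k) = (N + k - 1)! / (k! (N - 1)!), a binomial coefficient."""
    return comb(N + k - 1, k)


def count_with_gap(N, k, j, m):
    """Number of configurations 1 <= l_1 <= ... <= l_k <= N with
    l_{j+1} - l_j = m, where j is 1-based and 1 <= j <= k - 1."""
    return sum(
        1
        for l in combinations_with_replacement(range(1, N + 1), k)
        if l[j] - l[j - 1] == m  # l[j-1] is l_j and l[j] is l_{j+1}
    )
```

The enumeration also confirms that the total number of configurations is $\sigma(N, k)$, matching the count of Lamé polynomials in the Heine–Stieltjes theorem.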
Proof of Lemma 4.4. By identifying the j th and the j + 1st zeroes, we are reduced to the problem of distributing k − 1 zeroes amongst the remaining N − m slots. Clearly, this number is given by σ (N − m, k − 1). As a consequence of Lemma 4.4, 1 σ (N, k)
k−1
1≤l1 ≤···≤lk ≤N
=
1 k−1
N j =1 "
φ k αlj +1 − αlj d−α
N−1 k−1
1 1 φ k αlj +1 − αlj d−α. (k − 1) σ (N, k) "N m=0 j =1 l∈Sj (m)
Asymptotic Statistics of Zeroes for the Lamé Ensemble
485
In order to apply Lemma 4.3, we need to treat the two cases where m = 0 and m > 0 separately. Since by Lemma 4.4, we know that Sj (0) = σ (N, k − 1), it is clear that 1 σ (N, k) =
k−1
1≤l1 ≤···≤lk ≤N
1 k−1 j =1
"N
φ k αlj +1 − αlj d−α
(20)
N−1 k−1
σ (N, k − 1) 1 1 φ(0) + φ k αlj +1 − αlj d−α. σ (N, k) (k − 1) σ (N, k) "N m=1 j =1 l∈Sj (m)
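On one hand, we can apply Lemma 4.3 to the RHS of (20) to get
$$\frac{\sigma(N,k-1)}{\sigma(N,k)}\,\phi(0)+\frac{1}{(k-1)\,\sigma(N,k)}\sum_{m=1}^{N-1}\sum_{j=1}^{k-1}\sum_{l\in S_j(m)}\int_{\Sigma_N}\phi\big(k(\alpha_{l_{j+1}}-\alpha_{l_j})\big)\,\bar{d}\alpha$$
$$=\frac{k}{N+k-1}\,\phi(0)+\frac{N+1}{(k-1)\,\sigma(N,k)}\sum_{m=1}^{N-1}\sum_{j=1}^{k-1}\sum_{l\in S_j(m)}\int_0^1\phi(kx)\,\mathrm{binom}(N,l_{j+1}-l_j-1;x)\,dx.\tag{21}$$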
On the other hand, an application of Lemma 4.4 allows us to remove the summations over $j$ and $l$ in (21), so that
$$\frac{N+1}{(k-1)\,\sigma(N,k)}\sum_{m=1}^{N-1}\sum_{j=1}^{k-1}\sum_{l\in S_j(m)}\int_0^1\phi(kx)\,\mathrm{binom}(N,l_{j+1}-l_j-1;x)\,dx=\frac{N+1}{\sigma(N,k)}\sum_{m=0}^{N-2}\sigma(N-m-1,k-1)\int_0^1\phi(kx)\,\mathrm{binom}(N,m;x)\,dx.\tag{22}$$
The identity in (18) follows from (21) and (22) and so, Step 2 is complete.
Step 3. Summing up, as a result of Steps 1 and 2 we have shown that:
$$d\mu_{LS}(x;N,K)(\phi)=\frac{N+1}{\sigma(N,k)}\sum_{m=0}^{N-2}\sigma(N-m-1,k-1)\int_0^1\phi(kx)\,\mathrm{binom}(N,m;x)\,dx+O\Big(\frac{k}{N}\Big).\tag{23}$$
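The two σ-quotients that drive the estimates, σ(N, k − 1)/σ(N, k) = k/(N + k − 1) and the product form of σ(N − m − 1, k − 1)/σ(N, k), can be verified in exact rational arithmetic. This is a minimal sketch with hypothetical helper names; it assumes σ(n, r) = (n + r − 1)!/(r!(n − 1)!), k ≥ 2 and 0 ≤ m ≤ N − 2.

```python
from fractions import Fraction
from math import comb


def sigma(n: int, r: int) -> int:
    """sigma(n, r) = (n + r - 1)! / (r! (n - 1)!)."""
    return comb(n + r - 1, r)


def quotient(N: int, k: int, m: int) -> Fraction:
    """Exact value of sigma(N - m - 1, k - 1) / sigma(N, k)."""
    return Fraction(sigma(N - m - 1, k - 1), sigma(N, k))


def product_form(N: int, k: int, m: int) -> Fraction:
    """k / (N + k - 1) times the product over j = 0, ..., k - 2 of (1 - (m + 1) / (N + j))."""
    value = Fraction(k, N + k - 1)
    for j in range(k - 1):
        value *= 1 - Fraction(m + 1, N + j)
    return value


if __name__ == "__main__":
    N, k = 12, 4
    for m in range(N - 1):
        print(m, quotient(N, k, m), product_form(N, k, m))
```

Using `Fraction` avoids any rounding, so agreement of the two columns is an exact identity check rather than a numerical one.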
Our next task is to further simplify the expression on the RHS of (23) by appealing to the theory of Bernstein approximations [D]. First, we need to estimate the quotient $\sigma(N-m-1,k-1)/\sigma(N,k)$ appearing on the RHS of (23). For this, it is convenient to consider two cases:
Case 1. ($m\ll(N/k)^{1+\beta}$); $0<\beta<1$. Under this assumption,
$$\frac{\sigma(N-m-1,k-1)}{\sigma(N,k)}=\frac{k}{N+k-1}\prod_{j=0}^{k-2}\Big(1-\frac{m+1}{N+j}\Big)=\frac{k}{N+k-1}\exp\Big(\sum_{j=0}^{k-2}\log\Big(1-\frac{m+1}{N+j}\Big)\Big)$$
$$=\frac{k}{N+k-1}\exp\Big(\sum_{j=0}^{k-2}\log\Big(1-\frac{m+1}{N}\cdot\frac{1}{1+\frac{j}{N}}\Big)\Big)=\frac{k}{N+k-1}\exp\Big(-\sum_{j=0}^{k-2}\frac{m+1}{N}\cdot\frac{1}{1+\frac{j}{N}}+O\Big(\frac{km^2}{N^2}\Big)\Big).$$
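(24)
After some further simplification involving Taylor expansions, we get that
$$\frac{\sigma(N-m-1,k-1)}{\sigma(N,k)}=\frac{k}{N+k-1}\exp\Big(-\sum_{j=0}^{k-2}\frac{m+1}{N}\Big(1+O\Big(\frac{j}{N}\Big)\Big)+O\Big(\frac{km^2}{N^2}\Big)\Big)$$
$$=\frac{k}{N+k-1}\exp\Big(-\frac{(k-2)(m+1)}{N}+O\Big(\frac{mk^2}{N^2}\Big)+O\Big(\frac{km^2}{N^2}\Big)\Big)$$
$$=\frac{k}{N+k-1}\exp\Big(\frac{-mk}{N}\Big)\Big(1+O\Big(\frac{mk^2}{N^2}\Big)+O\Big(\frac{km^2}{N^2}\Big)\Big).$$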
Finally, using the fact that x p e−x = Op (1); for all x ≥ 0, we get 2 σ (N − m − 1, k − 1) 1 k −mk k +O = exp +O . σ (N, k) N +k−1 N N2 N Case 2. (m >> (N/k)1+β ). In this case, we can choose 0 < β < appropriate constants C1 , C2 > 0, k−2 σ (N − m − 1, k − 1) m+1 k 1− = σ (N, k) N +k−1 N +j
1−$ $
(25)
so that with
j =0
k−2 k Nβ ≤ 1 − C1 1+β N +k−1 k j =0 β = O e−C2 (N/k) . Substituting the estimates (25) and (26) into (23) and using the fact that 1 N binom(N, m; x) = 1 and φ(kx)dx = O(k −1 ) m=0
0
(26)
gives:
$$d\mu_{LS}(x;N,K)(\phi)=\frac{k(N+1)}{N+k-1}\sum_{m=0}^{N-2}e^{-\frac{mk}{N}}\int_0^1\phi(kx)\,\mathrm{binom}(N,m;x)\,dx+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big)$$
$$=\frac{k(N+1)}{N+k-1}\sum_{m=0}^{N}e^{-\frac{mk}{N}}\int_0^1\phi(kx)\,\mathrm{binom}(N,m;x)\,dx+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big)\tag{27}$$
since the terms for $m=N-1$ and $m=N$ are bounded by $1/N$. Recall that for a function $f(x)$ defined on $[0,1]$, the $N$th degree Bernstein polynomial of $f(x)$ is defined to be [D]:
$$B_N(f;x)=\sum_{m=0}^{N}f\Big(\frac{m}{N}\Big)\,\mathrm{binom}(N,m;x).$$
It is easy to see that in the special case where $\exp_{-k}(x):=e^{-kx}$, there is a concise closed-form expression for $B_N(\exp_{-k};x)$; indeed,
$$B_N(\exp_{-k};x)=\big(xe^{-\frac{k}{N}}+(1-x)\big)^N.\tag{28}$$
From the identity in (28) we easily derive the following:
Lemma 4.5. For $x\ge 0$, we have that
$$B_N(\exp_{-k};x)=e^{-kx}+O\Big(\frac{k}{N}\Big).\tag{29}$$
Proof of Lemma 4.5. Expand $e^{-\frac{k}{N}}$ in a second-order Taylor series and use the identity (28) directly to get
$$B_N(\exp_{-k};x)=\Big(1+x\big(e^{-\frac{k}{N}}-1\big)\Big)^N=\Big(1-\frac{kx}{N}\Big(1+O\Big(\frac{k}{N}\Big)\Big)\Big)^N.$$
From the inequality
$$0\le e^{-x}-\Big(1-\frac{x}{N}\Big)^N\le\frac{x^2e^{-x}}{N},$$
and the fact that $x^pe^{-x}=O_p(1)$ for all $x\ge 0$, it follows that
$$B_N(e^{-kx};x)=e^{-kx}\Big(1+O\Big(\frac{k^2x}{N}\Big)\Big)+O(N^{-1})=e^{-kx}+O\Big(\frac{k}{N}\Big).$$
This completes the proof of the lemma.
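Both the closed form (28) and the size of the error in (29) are easy to probe numerically. The following is a rough sketch only: the function names are hypothetical, binom(N, m; x) is assumed to denote the Bernstein weight C(N, m) x^m (1 − x)^(N − m), and the parameters N = 200, k = 4 are arbitrary illustrative choices.

```python
from math import comb, exp


def binom_weight(N: int, m: int, x: float) -> float:
    """Bernstein basis weight C(N, m) x^m (1 - x)^(N - m)."""
    return comb(N, m) * x**m * (1.0 - x) ** (N - m)


def bernstein(f, N: int, x: float) -> float:
    """N-th degree Bernstein polynomial of f evaluated at x."""
    return sum(f(m / N) * binom_weight(N, m, x) for m in range(N + 1))


def bernstein_exp_closed_form(k: float, N: int, x: float) -> float:
    """Closed form (28) for the Bernstein polynomial of t -> exp(-k t)."""
    return (x * exp(-k / N) + (1.0 - x)) ** N


if __name__ == "__main__":
    N, k = 200, 4.0
    for i in range(11):
        x = i / 10
        direct = bernstein(lambda t: exp(-k * t), N, x)
        closed = bernstein_exp_closed_form(k, N, x)
        # Columns: x, direct sum, closed form, deviation from exp(-k x).
        print(x, direct, closed, closed - exp(-k * x))
```

The last column stays well below k/N on this grid, in line with (29); this is a sanity check for one parameter choice, not a proof of the uniform bound.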
Substituting (29) into (27), we finally obtain
$$d\mu_{LS}(x;N,K)(\phi)=\frac{k(N+1)}{N+k-1}\int_0^1\phi(kx)\,B_N(\exp_{-k};x)\,dx+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big)$$
$$=\frac{k(N+1)}{N+k-1}\int_0^1\phi(kx)\,e^{-kx}\,dx+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big)$$
$$=\int_0^1\phi(x)\,e^{-x}\,dx+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big).\tag{30}$$
By noting that $C_0^1([0,1])$ is dense in $C_0^0([0,1])$, this completes the proof of Theorem 1.1 (i).
4.2. Proof of part (ii) of Theorem 1.1.
For convenience, we henceforth denote by $\sum_{l,l'}$ the double sum over the indices $1\le l_1\le\cdots\le l_k\le N$ and $1\le l'_1\le\cdots\le l'_k\le N$. We also define $\phi_k(x):=\phi(kx)$. First, we claim that
$$\int_{\Sigma_N}\big(d\rho^{AV}_{LS}(x;N,k,\alpha)(\phi)\big)^2\,\bar{d}\alpha=\frac{1}{\sigma^2(N,k)}\,\frac{1}{(k-1)^2}\sum_{l,l'}\sum_{i,j=1}^{k-1}\int_{\Sigma_N}\phi_k(\alpha_{l_{j+1}}-\alpha_{l_j})\,\phi_k(\alpha_{l'_{i+1}}-\alpha_{l'_i})\,\bar{d}\alpha+O\Big(\frac{k}{N}\Big).\tag{31}$$
To obtain (31), we essentially repeat the argument of Step 1 in Sect. 4.1. That is, we expand each of the functions $\phi_k(\theta_{j+1}(\alpha;l)-\theta_j(\alpha;l))$ and $\phi_k(\theta_{i+1}(\alpha;l')-\theta_i(\alpha;l'))$ in a first-order Taylor series around the points $(\alpha_{l_{j+1}}-\alpha_{l_j})$ and $(\alpha_{l'_{i+1}}-\alpha_{l'_i})$ respectively.
First, we claim that the terms involving the derivative of $\phi_k$ are all $O\big(\frac{k}{N}\big)$. Indeed, the Heine–Stieltjes Theorem and Lemma 4.1 imply that
$$\int_{\Sigma_N}\big(\theta_j(\alpha,l)-\alpha_{l_j}\big)\big(\theta_i(\alpha,l')-\alpha_{l'_i}\big)\,\bar{d}\alpha\le\int_{\Sigma_N}\big(\alpha_{l_j+1}-\alpha_{l_j}\big)\big(\alpha_{l'_i+1}-\alpha_{l'_i}\big)\,\bar{d}\alpha$$
$$=\frac{(l'_i+2)(l_j+3)-(l'_i+1)(l_j+3)-(l'_i+2)(l_j+2)+(l'_i+1)(l_j+2)}{(N+2)(N+3)}=\frac{1}{(N+2)(N+3)}.$$
Thus, it follows that
$$\int_{\Sigma_N}\big(\theta_j(\alpha,l)-\alpha_{l_j}\big)\big(\theta_i(\alpha,l')-\alpha_{l'_i}\big)\,\bar{d}\alpha=O(N^{-2}),\tag{32}$$
uniformly for all $l'_i,l_j\in\{1,\dots,N\}$. Consequently, the terms involving the derivative of $\phi$ are all $O\big(\frac{k}{N}\big)$ as desired. In the integral (32) we have assumed without loss of generality that $l'_i<l_j$. The other cases $l'_i=l_j$ and $l'_i>l_j$ can be treated in a similar fashion.
We next prove an $L^2$ estimate for $d\rho^{AV}_{LS}(x;N,k,\alpha)(\phi)$ and then derive as an immediate corollary an estimate for the variance of $d\rho^{AV}_{LS}(x;N,k,\alpha)(\phi)$. In order to simplify the writing in the next proposition, we introduce the following notation for the multinomial coefficient:
$$\mathrm{multi}(N-1,m,m';x,y):=\frac{N!}{m!\,m'!\,(N-m-m')!}\,x^m\,y^{m'}\,(1-x-y)^{N-m-m'},$$
where $m,m'$ are positive integers satisfying $m+m'\le N$ and $x,y\in[0,1]$.
Lemma 4.6. (i) For any $x,y\in[0,1]$,
$$\sum_{m'=0}^{N-2}\sum_{m=0}^{N-2-m'}e^{-\frac{km}{N}}e^{-\frac{km'}{N}}\,\mathrm{multi}(N-1,m',m;x,y)=e^{-kx-ky}+O\Big(\frac{k}{N}\Big).\tag{33}$$
(ii) Also, for $0\le x\le y\le\frac{1}{k}$,
$$\sum_{m=0}^{N-2}\sum_{m'=0}^{m}e^{-\frac{km}{N}}e^{-\frac{km'}{N}}\,\mathrm{multi}(N-1,m',N-m;x,1-y)=e^{-kx-ky}+O\Big(\frac{k}{N}\Big).\tag{34}$$
Proof of Lemma 4.6. (i) As in the proof of the first part of the theorem, modulo $O(N^{-1})$ errors, we can replace $N-2$ by $N-1$ in the upper limit of both summations. Define $\exp_{-k}(x):=\exp(-kx)$. Then, as a consequence of Lemma 4.5, we have that
$$\sum_{m'=0}^{N-2}\sum_{m=0}^{N-2-m'}e^{-\frac{km}{N}}e^{-\frac{km'}{N}}\,\mathrm{multi}(N-1,m',m;x,y)=\sum_{m'=0}^{N-1}\sum_{m=0}^{N-1-m'}\mathrm{multi}\big(N-1,m',m;xe^{-\frac{k}{N}},ye^{-\frac{k}{N}}\big)+O(N^{-1})$$
$$=\Big(1-(x+y)+(x+y)e^{-\frac{k}{N}}\Big)^{N-1}+O(N^{-1})=B_N(\exp_{-k};x+y)+O\Big(\frac{k}{N}\Big)=\exp(-kx-ky)+O\Big(\frac{k}{N}\Big).$$
(ii) Once again, we can replace N − 2 by N − 1 in the upper limit of the first sum. We make successive applications of the binomial theorem to get: N−2
m
m=0
m =0
km
e− N e−
k
multi(N − 1, m , N − m; x, 1 − y)
N−1
k
x − y + ye− N
m−1
(1 − x)N−m + O(N −1 ) (m − 1)! (N − m)! m=0 k k k N−1 + O(N −1 ) 1 − (x + ye− N ) + (x + ye− N )e− N
= (N − 1)! = e− N
km N
e
− km N
k
k
= e− N BN−1 (exp−k ; x + e− N y) + O(N −1 ) k −kx−ky . +O =e N
We now use the combinatorial identities in Lemma 4.6 to estimate the variance of the averaged level-spacings measures:
Proposition 4.7. For any $\phi\in C_0^1([0,1])$, we have that
$$\int_{\Sigma_N}\big(d\rho^{AV}_{LS}(x;N,k,\alpha)(\phi)\big)^2\,\bar{d}\alpha=\Big(\int_0^1e^{-x}\phi(x)\,dx\Big)^2+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big).\tag{35}$$
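Proof of Proposition 4.7. As a consequence of the estimate in (31), it suffices to show that
$$\frac{1}{\sigma^2(N,k)}\,\frac{1}{(k-1)^2}\sum_{l,l'}\sum_{i,j=1}^{k-1}\int_{\Sigma_N}\phi_k(\alpha_{l_{j+1}}-\alpha_{l_j})\,\phi_k(\alpha_{l'_{i+1}}-\alpha_{l'_i})\,\bar{d}\alpha=\Big(\int_0^1e^{-x}\phi(x)\,dx\Big)^2+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big).$$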
In order to show this, we need to distinguish three different cases corresponding to the various relative configurations of αli , αli+1 , αlj and αlj +1 : Case 1. αli < αli+1 < αlj < αlj +1 (or equivalently, αlj < αlj +1 < αli < αli+1 ). The argument is essentially the same as in Lemma 4.3. The only difference is that instead of getting a simple integral, we obtain a double integral at the end of the iterated integration. More precisely, just as in Step 2, we make repeated applications of Fubini’s Theorem to ensure that the last four iterated integrals are with respect to αli , αli+1 , αlj and αlj +1 variables. We then integrate by parts in the first N − 4 integrals with respect to the remaining α’s. As before, we make the change of variables x = αli+1 − αli and y = αlj +1 − αlj and then integrate by parts with respect to αli+1 and αlj +1 . The end
result is that:
$$\int_{\Sigma_N}\phi_k(\alpha_{l_{j+1}}-\alpha_{l_j})\,\phi_k(\alpha_{l'_{i+1}}-\alpha_{l'_i})\,\bar{d}\alpha=(N+1)N\int_0^1\int_0^{1-y}\phi_k(x)\phi_k(y)\cdot\mathrm{multi}(N-1,l'_{i+1}-l'_i-1,l_{j+1}-l_j-1;x,y)\,dx\,dy$$
$$=(N+1)N\int_0^1\int_0^1\phi_k(x)\phi_k(y)\cdot\mathrm{multi}(N-1,l'_{i+1}-l'_i-1,l_{j+1}-l_j-1;x,y)\,dx\,dy+O(k^{-1}),$$
since 0 ≤ y ≤ 1/k in supp φk (y). Case 2. αl i < αlj < αl i+1 < αlj +1 (or equivalently αlj < αl i < αlj +1 < αl i+1 ). When compared with all possible relative configurations, the proportion of configurations satisfying the assumptions of Case 2 are asymptotically small. Indeed, the proportion of such relative configurations is O(k −1 ). One can see this as follows: Given N + 1 positive real numbers 0 < α0 < . . . < αN < 1, we consider the following two subsets of k elements given by: αl1 ≤ · · · ≤ αlk and αl 1 ≤ · · · ≤ αl k .
(36)
For each of the subsets above, there are k−1 pairs of the form (αlj , αlj +1 ) and (αl i , αl i+1 ). From (36) it follows that for any fixed pair (αlj , αlj +1 ), there is at most one pair (αl i , αl i+1 ) for which Case 2 is possible. Case 3. αlj < αl i < αl i+1 < αlj +1 (or equivalently, αl i < αlj < αlj +1 < αl i+1 ). This case can be dealt with in a similar fashion to Case 1. That is, we apply the Fubini Theorem repeatedly to ensure that the last four iterated integrals involve αl i , αlj , αlj +1 and αl i+1 . Then, we integrate by parts with respect to the remaining α’s. Finally, we make the change of variables x = αl i+1 − αl i and y = αlj +1 − αlj and integrate by parts again with respect to αl i+1 and αlj +1 to get: "N
φk (αlj +1 − αlj )φk (αl i+1 − αl i ) d−α 1 1
= (N + 1)N 0
x
0
0
φk (x)φk (y)
· multi(N − 1, l i+1 − l i − 1, N − lj +1 − lj + 1; x, 1 − y) dydx 1 1 = (N + 1)N φk (x)φk (y) · multi(N − 1, l i+1 − l i − 1, N − lj +1 − lj + 1; x, 1 − y) dydx + O(k −1 ). As in the proof of part (i) of Theorem 1.1, we make the substitution m = lj +1 − lj − 1 and m = l i+1 − l i − 1 in order to apply Lemma 4.6. From the estimate in (31) and the
analysis of Cases 1-3 above, we deduce that 2 AV dρLS (x; N, k, α)(φ) d−α "N
=
N−2 N−2−m k 2 (N + 1)N − mk − m k 1 1 N e N e φk (x)φk (y) (N + k − 1)2 0 0 m=0
m =0
· multi(N − 1, m , m; x, y) dxdy 1 1 N−2 m − mk − mNk N e e φk (x)φk (y) + 0
m =0 m=0
0
· multi(N − 1, m , N − m; x, 1 − y) dydx k 1 +O +O . N k
!
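By Lemma 4.6, we finally conclude that
$$\int_{\Sigma_N}\big(d\rho^{AV}_{LS}(x;N,k,\alpha)(\phi)\big)^2\,\bar{d}\alpha=\frac{k^2(N+1)N}{(N+k-1)^2}\Big(\int_0^1\phi_k(x)e^{-kx}\,dx\Big)^2+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big)=\Big(\int_0^1\phi(x)e^{-x}\,dx\Big)^2+O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big).$$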
Theorem 1.1 (ii) is then an immediate consequence of the Chebyshev inequality and the following corollary of Proposition 4.7:
Corollary 4.8. For any $\phi\in C_0^1([0,1])$, we have
$$\int_{\Sigma_N}\Big(d\rho^{AV}_{LS}(x;N,k,\alpha)(\phi)-\int_0^1e^{-x}\phi(x)\,dx\Big)^2\,\bar{d}\alpha=O\Big(\frac{k}{N}\Big)+O\Big(\frac{1}{k}\Big).\tag{37}$$
Proof of Corollary 4.8. The corollary follows directly from Proposition 4.7 and the estimate for the convergence of the integrated, averaged level-spacings measure in (30).
Remark. We should point out that one can also quite easily determine the weak limit of the level-spacings measures before “unfolding the zeroes” (i.e. rescaling to unit mean level-spacing). Indeed, by carrying out exactly the same analysis as above, one can show that
$$\operatorname*{w-lim}_{N\to\infty}\frac{1}{\sigma(N,k)}\sum_{l=1}^{\sigma(N,k)}\frac{1}{k-1}\sum_{j=1}^{k-1}\delta\Big(x-\big(\theta^{(k)}_{l,j+1}(\alpha)-\theta^{(k)}_{l,j}(\alpha)\big)\Big)=\delta_0(x),\tag{38}$$
both in the mean, and pointwise for an asymptotically full measure of $\alpha\in\Sigma_N$. Indeed, the computation of the weak limit in (38) turns out to be a simpler problem than the corresponding one after “unfolding”, since the error terms are much easier to control.
Acknowledgements. We thank Dmitry Jakobson, Maung Min-Oo and Steve Zelditch for many helpful discussions regarding the paper. We would also like to thank the referee for his insightful comments.
References
[BSZ1] Bleher, P., Shiffman, B. and Zelditch, S.: Poincaré-Lelong approach to universality and scaling of correlations between zeros. Commun. Math. Phys. 208, 771–785 (2000)
[BSZ2] Bleher, P., Shiffman, B. and Zelditch, S.: Universality and scaling of correlations between zeros on complex manifolds. Invent. Math. 142, 351–395 (2000)
[D] Davis, P.J.: Interpolation and Approximation. London: Dover, reprint, 1975
[H] Heine, M.: Handbuch der Kugelfunctionen, Bd. I. (2nd ed.), Berlin: G. Reimer, 1878
[HW] Harnad, J. and Winternitz, P.: Harmonics on hyperspheres, separation of variables and the Bethe ansatz. Lett. Math. Phys. 33, 61–74 (1995)
[K] Kuznetsov, V.B.: Quadrics on real Riemannian spaces of constant curvature: Separation of variables and connection with Gaudin magnet. J. Math. Phys. 33 (9), 3240–3254 (1992)
[NV] Nonnenmacher, S. and Voros, A.: Chaotic eigenfunctions in phase space. J. Statist. Phys. 92, no. 3–4, 431–518 (1998)
[S] Stieltjes, T.J.: Sur certains polynômes que vérifient une équation différentielle linéaire du second ordre et sur la théorie des fonctions de Lamé. Acta Math. 6, 321–326 (1885)
[Sh] Shah, G.M.: Monotonic variation of zeros of Stieltjes and Van Vleck polynomials. J. Ind. Math. Soc. 33, 85–92 (1969)
[ShZ] Shiffman, B. and Zelditch, S.: Distribution of zeros of random and quantum chaotic sections of positive line bundles. Commun. Math. Phys. 200, 661–683 (1999)
[Sz] Szegö, G.: Orthogonal Polynomials, Vol. 23. Third edition, Providence, RI: Am. Math. Soc., 1967
[T1] Toth, J.A.: The quantum C. Neumann problem. Internat. Math. Res. Notices 5, 137–139 (1993)
[T2] Toth, J.A.: Various quantum mechanical aspects of quadratic forms. J. Funct. Anal. 130, 1–42 (1995)
[WW] Whittaker, E.T. and Watson, G.N.: A Course of Modern Analysis. Fourth edition, Cambridge U.K.: Camb. Univ. Press, 1963
Communicated by P. Sarnak
Commun. Math. Phys. 222, 495 – 501 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
No or Infinitely Many A.C.I.P. for Piecewise Expanding C r Maps in Higher Dimensions Jérôme Buzzi Centre de Mathématiques de l’Ecole Polytechnique U.M.R. 7640 du C.N.R.S., 91128 Palaiseau Cedex, France. E-mail: [email protected] Received: 6 September 2000 / Accepted: 15 May 2001
Abstract: We modify Tsujii’s example [9] to show that in contrast with the one-dimensional case, piecewise uniformly expanding and $C^r$ maps of the plane may: (1) either have no absolutely continuous invariant probability measures (a.c.i.p. for short) and be such that every point is statistically attracted to a fixed repelling point; (2) or have infinitely many ergodic a.c.i.p.
1. Introduction
A piecewise expanding $C^2$ map on the interval is a map $F:[0,1]\to[0,1]$ such that $[0,1]$ is the union of finitely many intervals on the interior of each of which $F$ is $C^2$ and satisfies $\inf|F'|>1$. It is well-known (Lasota–Yorke [4]) that such a map admits a finite, non-zero number of a.c.i.p. (ergodic absolutely continuous invariant probability measures) $\mu_1,\dots,\mu_r$, such that for Lebesgue almost every point $x\in[0,1]$,
$$\frac{1}{n}\sum_{k=0}^{n}\delta_{F^k(x)}\to\mu_i\quad\text{as } n\to\infty$$
in the weak star topology, for some $i$ depending on $x$ (one says that $x$ is statistically attracted to $\mu_i$). On the interval, very little smoothness is required (say $C^1$ plus summability of the modulus of continuity of $F'$ (see [7, 1])). In dimension 2, the corresponding result [2, 8] is known only assuming real-analyticity. M. Tsujii [9] proved that, at any rate, finite differentiability is not enough: for arbitrary $r<\infty$, there exist piecewise $C^r$ and expanding maps of the plane such that for a non-empty open set of points $x$, $\frac{1}{n}\sum_{k=0}^{n}\delta_{F^k(x)}\to\delta_p$ as $n\to\infty$, for some $p\in X$ independent of $x$. Thus there is a large set of points not statistically attracted to an a.c.i.p., in contrast with the Lasota–Yorke situation recalled above.
Question: Can one arrange Tsujii’s example so as to have no a.c.i.p.? infinitely many a.c.i.p.?
In this note, we answer both these questions positively. Let us recall that a piecewise $C^r$ map is $(X,\mathcal{Z},F)$ such that: (1) $X$ is a compact subset of $\mathbb{R}^d$, (2) $\mathcal{Z}$ is a finite collection of pairwise disjoint open subsets of $X$, the union of which is dense in $X$. Moreover the subsets in $\mathcal{Z}$ have $C^r$ boundaries, i.e., the boundaries are finite unions of compact subsets of open $C^r$ submanifolds, (3) $F$ restricted to each subset in $\mathcal{Z}$ extends to a $C^r$ diffeomorphism between a neighborhood of the closure of the subset and some open set in $\mathbb{R}^d$. $F$ is expanding if $\|F'(x).v\|\ge\lambda\|v\|$ for all $x\in\bigcup_{Z\in\mathcal{Z}}Z$, $v\in\mathbb{R}^d$ and some constant $\lambda>1$.
Theorem. 1. For any $r<\infty$, there exists a piecewise $C^r$ expanding map of the square with no a.c.i.p. 2. For any $r<\infty$, there exists a piecewise $C^r$ expanding map of the square with infinitely many a.c.i.p.
Remark. 1. Example 1 above is dissipative: it admits infinitely many σ-finite absolutely continuous invariant measures. 2. It was already known [3] that one can have any (finite, non-zero) number of ergodic a.c.i.p. for maps with 10 pieces (i.e., elements of the partition $\mathcal{Z}$). 3. The point of our examples is that they have finitely many pieces. Otherwise counterexamples can already be found in dimension 1 (see the last paragraph of [5]).
We begin by recalling Tsujii’s example. Then we build our example with no a.c.i.p. before turning to the one with infinitely many a.c.i.p. To finish, we explain why it is impossible to build a $C^\infty$ example along the same lines: the $C^\infty$ case is therefore still open.
2. Tsujii’s Counter-Example
We recall the basic structure of Tsujii’s construction for the convenience of the reader. Choose the differentiability $r<\infty$. Then set $\gamma>r$, $0<\varepsilon<\gamma^{-3}(1-\gamma^{-1})\log 2$ and $M$ very large. $\lambda=e^{\varepsilon}$ will be the expansion constant of our map. The map $F$ is defined on $[0,1]\times[-1,1]$. It is made of 8 nearly rectangular pieces (see Fig. 1), only three of which are important (they are labelled A, B and C on the figure). Observe the role of the functions $f$ (defining the discontinuity line near $y=0$) and $\eta$ (defining the downward push in piece B). The key property of $F$ is the existence of a countable collection of disjoint rectangles $R(k,m)$, for $m>M$ and $0\le k<\gamma^m$. Each rectangle is mapped into the next one: $F(R(k,m))=R(k+1,m)$ if $k<\gamma^m-1$, and $F(R(\gamma^m-1,m))=R(0,m+1)$. The rectangles are organized into lines $R(0,m),R(1,m),\dots,R(\gamma^m-1,m)$ and into columns: $R(k,m)\subset J_{\gamma^m-k}\times[0,1]$ with $J_\ell=[e^{-\varepsilon\ell},e^{-\varepsilon\ell}+(e^{-\varepsilon(\ell-1)}-e^{-\varepsilon\ell})/6]$. The proportion of rectangles in the $m$th line which are contained in $B((0,0),\delta)$ is larger than $1-|\log\delta|/(\varepsilon\gamma^m)$. Thus, for any $x\in\bigcup_{m,k}R(k,m)$,
$$\frac{1}{n}\sum_{k=0}^{n-1}\delta_{F^kx}\to\delta_{(0,0)}\quad\text{as } n\to\infty.$$
Fig. 1. ($\lambda=e^{\varepsilon}$) Labels in the figure: piece A: $F(x,y)=(\lambda x,\lambda y)$; piece B: $F(x,y)=(\lambda x,\lambda(y-\eta(x)))$; piece C: $F(x,y)=(\lambda y,\lambda x-1)$; discontinuity line $y=f(x)$; axis marks $0$, $e^{-\varepsilon}$, $1$ and $-e^{-\varepsilon}$.
To each column $J_\ell\times[0,1]$ with $\ell>M$ corresponds a “bump” in the functions $f$ and $\eta$. These functions are constant on each $J_\ell$ with values such that the discontinuity line cuts in the middle the highest rectangle in the column (see footnote 1) (i.e., $f$ on $J_\ell$ is the y-coordinate of the middle of the rectangle) and $F$ maps the higher half of the rectangle to the same place as the lower half (i.e., $\eta$ on $J_\ell$ is half the height of the rectangle).
Remark. The smooth bumps in the discontinuity line $y=f(x)$ are drawn as rectangles.
3. The Difficulties of the $C^\infty$ Case
We explain why one cannot build $C^\infty$ counter-examples along these lines. Assume that the beginning of the $m$th line, $R(0,m)$, is in the column $J_{c(m)}\times[0,1]$, for some $c(m)$. To have the first $r(m)$ derivatives going to zero, this rectangle must be below $y\approx e^{-\varepsilon c(m)r(m)}$. Therefore, the last rectangle in the line is below $e^{-\varepsilon c(m)r(m)}\cdot e^{\varepsilon c(m)}=e^{-\varepsilon c(m)(r(m)-1)}$. But the y-coordinate of this rectangle is mapped into the x-coordinate of the rectangle $R(0,m+1)$. Therefore $c(m+1)\approx(r(m)-1)c(m)$. To ensure $C^\infty$ smoothness, $r(m)$ must tend to infinity and $c(m)$ must increase super-exponentially. This implies that the size of the rectangles decreases super-exponentially fast. But in our scheme, this decrease can only be exponential in the number of the rectangles in the cycle, a contradiction.
Footnote 1. The highest rectangle is $R(\gamma^{m(\ell)}-\ell,m(\ell))$ with $m(\ell)$ minimum such that $\gamma^m>\ell$.
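The growth claim behind the recursion c(m + 1) ≈ (r(m) − 1) c(m) can be made concrete with a few lines of arithmetic. The sketch below is purely illustrative: the function name and the particular choices r(m) = 3 (fixed finite smoothness) and r(m) = m + 3 (smoothness tending to infinity) are assumptions made here and do not come from the construction itself.

```python
def column_indices(c0: int, r, steps: int) -> list:
    """Iterate c(m + 1) = (r(m) - 1) * c(m), starting from c(0) = c0."""
    cs = [c0]
    for m in range(steps):
        cs.append((r(m) - 1) * cs[-1])
    return cs


if __name__ == "__main__":
    # Fixed smoothness r = 3: the column index doubles at each step (exponential growth).
    print(column_indices(1, lambda m: 3, 8))
    # Smoothness r(m) = m + 3 tending to infinity: factorial-type (super-exponential) growth.
    print(column_indices(1, lambda m: m + 3, 8))
```

With a constant r the successive ratios stay bounded, while any r(m) tending to infinity makes the ratios themselves unbounded, which is the super-exponential growth invoked in the argument.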
4. Example with No A.C.I.P.
We start with Tsujii’s map $F:[0,1]\times[-1,1]\to[0,1]\times[-1,1]$ and modify it by carving some new pieces. The idea is to capture in these new pieces every orbit which would not eventually enter the rectangles $R(k,m)$ under the iteration of $F$ so that the modification of $F$ sends it into some rectangle. To do this we are going to specify a subset $E\subset[0,1]\times[-1,1]$ with $C^r$ boundary such that every $F$-orbit which does not eventually enter the rectangles eventually enters $E$. This will solve our problem: we shall define the modified map on $[0,1]\times[-1,1]$ to coincide with $F$ outside of $E$ and to be a piecewise affine and expanding map on $E$ with values inside $R(0,M+1)$. Thus, every orbit of the modified map will enter the rectangles and therefore be statistically attracted to $p=(0,0)$, excluding the existence of any a.c.i.p. (or indeed of any invariant probability measure).
Let $E_0=[0,1]\times[-1,1]\setminus([0,e^{-\varepsilon}]\times[0,e^{-\varepsilon}])$. Consider the pre-images in $[e^{-\varepsilon},1]\times[0,e^{-\varepsilon}]$ of the highest rectangles. Make these rectangles into a strip $S$, as pictured in Fig. 2, taking vertically some margin (i.e., enlarge the rectangles by replacing the horizontal side of $R(\gamma^{m(\ell)}-\ell,m(\ell))$ by $J_\ell$). Each of the two lateral boundaries of the strip can be assumed to be a $C^r$ curve: it has essentially the same shape as the discontinuity line $y=f(x)$. Define $E$ to be $E_0$ minus the strip $S$. The boundary of $E$ is $C^r$. Observe that the strip contains all the rectangles in $E_0$ (i.e., $R(\gamma^m-1,m)$, $m>M$). Hence $E$ cannot meet any rectangle.
It remains to see that any orbit of $F$ eventually enters $E$. First, observe that on the complement of $E_0$, $F$ multiplies the y-coordinate by $e^{\varepsilon}$. Thus, any point eventually enters $E_0$. Consider an orbit which never enters the rectangles. We have just seen that it enters $E_0$. If it does not enter $E$, this means that it enters the strip $S$. At this time, the point of the orbit must be in the strip between two highest rectangles. Consider the next point of the orbit. There are two cases.
First case. This point is far from the rectangles contained in $F(S)$ in the sense that it does not belong to a column of the form $J_\ell\times[0,1]$. In this case the orbit will eventually enter $[e^{-\varepsilon},1]\times[-1,1]\setminus(J_1\times[-1,1])$. But this is a subset of $E$, contradicting our assumption. Thus, this case is impossible.
Second case. The point is near the highest rectangle $R(\gamma^m-\ell,m)$, i.e., in the column $J_\ell\times[0,1]$. Observe then that $F$ restricted to any rectangle $J_q\times[0,1]$, $q\ge 2$, maps horizontal lines into horizontal lines. Therefore the orbit will remain on the side of some rectangle until it comes back to column $J_1\times[0,1]$. At this time, it will be on the side of some rectangle $R(k,m)$. But any rectangle in $E_0$ is the pre-image of some highest rectangle. Hence the point is on the side of strip $S$: it is outside the strip $S$ and therefore in $E$. This contradiction proves that no orbit avoids $E$ forever and concludes the construction of the map with no a.c.i.p.
5. Example with Infinitely Many A.C.I.P.
We start again from Tsujii’s map $F:[0,1]\times[-1,1]\to[0,1]\times[-1,1]$. This time we modify the cutting so that, instead of having a single infinite sequence of rectangles, one mapped into the next, we shall have infinitely many cycles of rectangles, each supporting an a.c.i.p. Fix $\gamma>r$, $0<\varepsilon<(1-\gamma^{-1})\log 2/((1+\gamma)\gamma)$. Set $M$ very large.
Fig. 2. The strip S enclosing the pre-images of the highest rectangles. Labels in the figure: vertical margin; pre-image of highest rectangle; boundary of the strip S; axis marks $0$, $e^{-\varepsilon}$ and $1$.
For each even integer $m\ge M$, we shall have the following cycle: the rectangles $R(0,m),R(1,m),\dots,R(\gamma^m-1,m),R(0,m+1),\dots,R(\gamma^{m+1}-1,m+1)$ are each mapped into the next one (the last one being mapped into the first one).
We compute the position of the bottom left corner $(x,y)$ of $R(0,m)$ by writing that it comes back to itself under $F^{\gamma^m+\gamma^{m+1}}$:
$$x=e^{\varepsilon(\gamma^m+\gamma^{m+1})}x-e^{\varepsilon\gamma^{m+1}}\quad\text{and}\quad y=e^{\varepsilon(\gamma^m+\gamma^{m+1})}y-1.$$
Thus:
$$x=\frac{1}{e^{\varepsilon\gamma^m}-e^{-\varepsilon\gamma^{m+1}}}\quad\text{and}\quad y=\frac{1}{e^{\varepsilon(\gamma^m+\gamma^{m+1})}-1}.$$
Observe that the rectangles $R(0,m+1),\dots,R(\gamma^{m+1}-1,m+1)$ lie on a line which is above the rectangles $R(0,m),\dots,R(\gamma^m-1,m)$ (see Fig. 3 below). We define the height and width of $R(0,m)$ to be $e^{-\gamma^{m+10}}$ (any sufficiently small number would do). One easily checks that the rectangles so defined remain inside the columns $J_\ell\times[0,1]$ and are disjoint.
Observe in Fig. 3 that, instead of a single line of discontinuity $y=f(x)$, we have introduced two of them: $y=f_i(x)$, $i=1,2$, $f_2$ being above $f_1$. The map $F$ coincides with $(x,y)\mapsto(\lambda x,\lambda y)$ in $[0,e^{-\varepsilon}]\times[-e^{-\varepsilon},e^{-\varepsilon}]$ outside of the strip between the two discontinuity lines. In this strip the map $F$ coincides with $(x,y)\mapsto(\lambda x,\lambda(y-\eta(x)))$.
The functions $f_1,f_2,\eta$ are chosen to have the following situation. The rectangles $R(k,m)$ for $0\le k<\gamma^m-\gamma^{m-1}$ and the rectangles $R(k,m+1)$ for $\gamma^{m+1}-\gamma^{m-1}\le k<\gamma^{m+1}-\gamma^{m-2}$ are cut in the middle by a horizontal segment which is part of the discontinuity line $y=f_1(x)$. What is between the two discontinuity lines is pushed down by $\eta(x)$, equal to half the height of the rectangle. In this way both upper and lower halves of these rectangles have the same image: the next rectangle in the cycle. Observe that in each column $J_\ell\times[0,1]$ we do this at most once.
Fig. 3. Labels in the figure: the discontinuity lines $f_1(x)$ and $f_2(x)$; the rectangles $R(0,m)$ and $R(0,m+1)$; the columns $J_{\gamma^m}$, $J_{\gamma^{m+1}}$, $J_{\gamma^{m-1}}$ and $J_1$.
The discontinuity line $y=f_2(x)$ protects the rectangles $R(k,m+1)$ which are above the rectangles $R(l,m)$. In particular, for $x\in J_\ell$, $f_2(x)$ can be chosen constant, with value slightly above the height of the top of the rectangle $R(\gamma^m-\ell,m)$ for $\gamma^{m-1}<\ell\le\gamma^m$. We realize this cutting in the same way as Tsujii by making bumps in the functions $f_1,f_2,\eta$. To ensure that we get globally $C^r$ functions it will be enough to check that the height $h$ and the width $w$ of these bumps satisfy: $h/w^r\to 0$ as $m\to\infty$.
Going through the cycle, the height of $R(0,m)$ is multiplied by:
$$e^{\varepsilon(\gamma^m+\gamma^{m+1})-\log 2\cdot(\gamma^m-\gamma^{m-1})}=e^{\gamma^m(\varepsilon(1+\gamma)-(1-\gamma^{-1})\log 2)}<1,$$
and its width is multiplied by:
$$e^{\varepsilon(\gamma^m+\gamma^{m+1})-\log 2\cdot(\gamma^{m-1}-\gamma^{m-2})}=e^{\gamma^{m-1}(\varepsilon\gamma(1+\gamma)-(1-\gamma^{-1})\log 2)}<1.$$
Thus, $F^{\gamma^m+\gamma^{m+1}}(R(0,m))\subset R(0,m)$.
Finally we check the condition for $C^r$ smoothness. Observe that the bumps associated to the rectangles $R(k,m)$, resp. $R(k,m+1)$, are magnified versions of the one associated to $R(0,m)$, resp. $R(\gamma^{m+1}-\gamma^{m-1},m+1)$. Hence it is enough to check the condition for these two rectangles. The width of the bump associated to $R(0,m)\subset J_{\gamma^m}\times[0,1]$ is about $e^{-\varepsilon\gamma^m}$, its height is $y$ which is approximately $e^{-\varepsilon(\gamma^m+\gamma^{m+1})}$, the relevant quotient is therefore $e^{-\varepsilon\gamma^m(\gamma+1-r)}$. As $\gamma>r-1$, this goes to zero, as required.
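The two contraction factors above can be evaluated numerically. In this sketch the symbol ε is read as the small constant with λ = e^ε (an assumption about a symbol that is not legible in every formula), the function names are hypothetical, and γ = 4 with ε equal to half of the admissible bound is just one illustrative choice.

```python
from math import isclose, log


def epsilon_bound(gamma: float) -> float:
    """Upper bound (1 - 1/gamma) log 2 / ((1 + gamma) gamma) imposed on epsilon in Section 5."""
    return (1 - 1 / gamma) * log(2) / ((1 + gamma) * gamma)


def height_log_factor(eps: float, gamma: float, m: int) -> float:
    """Logarithm of the factor multiplying the height of R(0, m) over one cycle."""
    return eps * (gamma**m + gamma ** (m + 1)) - log(2) * (gamma**m - gamma ** (m - 1))


def width_log_factor(eps: float, gamma: float, m: int) -> float:
    """Logarithm of the factor multiplying the width of R(0, m) over one cycle."""
    return eps * (gamma**m + gamma ** (m + 1)) - log(2) * (gamma ** (m - 1) - gamma ** (m - 2))


if __name__ == "__main__":
    gamma = 4.0
    eps = 0.5 * epsilon_bound(gamma)
    for m in range(2, 7):
        # Both logarithms are negative, so both factors are < 1.
        print(m, height_log_factor(eps, gamma, m), width_log_factor(eps, gamma, m))
```

The width factor is the binding one: it is negative exactly when ε γ (1 + γ) < (1 − 1/γ) log 2, which is the condition fixed at the start of the section.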
For the second bump associated to $R(\gamma^{m+1}-\gamma^{m-1},m+1)\subset J_{\gamma^{m-1}}\times[0,1]$, the width is about $e^{-\varepsilon\gamma^{m-1}}$ and the height is $e^{\varepsilon(\gamma^{m+1}-\gamma^{m-1})}(e^{\varepsilon\gamma^m}x-1)$, which is approximately $e^{-\varepsilon(\gamma^m+\gamma^{m-1})}$. Thus, the relevant quotient is about
$$e^{-\varepsilon(\gamma^m+\gamma^{m-1})}/\big(e^{-\varepsilon\gamma^{m-1}}\big)^r,$$
which is $e^{-\varepsilon\gamma^{m-1}(\gamma+1-r)}$. This is also fine as $\gamma>r-1$.
$H:=F^{\gamma^m+\gamma^{m+1}}:R(0,m)\to R(0,m)$ is a piecewise expanding and affine map preserving the two axis directions together and made of rectangular pieces with sides aligned with the axis. It is well-known that such a map has an a.c.i.p. (see, for instance [6], using that any iterate is piecewise affine on rectangles aligned with the axis and therefore at most 4 rectangles can touch a given point: $Y(H^n)\le 4$ in the language of that paper). We therefore have built a piecewise expanding and $C^r$ map with infinitely many a.c.i.p.
References
1. Buzzi, J.: a.c.i.m.’s as equilibrium states for piecewise invertible dynamical systems. In: Proc. Sympos. Pure Math. (Seattle, 1999), Providence, RI: Am. Math. Soc., to appear
2. Buzzi, J.: a.c.i.m. for arbitrary piecewise expanding and real-analytic maps on the plane. Ergod. Th. & Dynam. Syst. 20, 697–708 (2000)
3. Góra, P., Boyarsky, A., Proppe H.: On the number of invariant measures for higher-dimensional chaotic transformations. J. Statist. Phys. 62, 709–728 (1991)
4. Yorke, J. and Lasota, A.: On the existence of invariant measures for piecewise monotonic transformations. Trans. Am. Math. Soc. 186, 481–488 (1973)
5. Rychlik, M.: Bounded variation and invariant measures. Studia Math. 76, 69–80 (1983)
6. Saussol, B.: Absolutely continuous invariant measures for multidimensional expandings maps. Israel J. Math. 116, 223–248 (2000)
7. Schmitt, B.: Ergodic theory and thermodynamic of one-dimensional Markov expanding endomorphisms. In: Dynamical systems (Temuco, 1991/1992), Travaux en Cours, 52, Paris: Hermann, 1996
8. Tsujii, M.: Absolutely continuous invariant measures for piecewise real-analytic expanding maps on the plane. Comm. Math. Phys. 208, 605–622 (2000)
9. Tsujii, M.: Piecewise expanding maps on the plane with singular ergodic properties. Ergod. Th. & Dynam. Syst. 20, 1851–1857 (2000)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 222, 503 – 531 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Global Uniqueness in the Inverse Scattering Problem for the Schrödinger Operator with External Yang–Mills Potentials G. Eskin Department of Mathematics, UCLA, Los Angeles, CA 90095-1555, USA. E-mail: [email protected] Received: 11 August 2000 / Accepted: 24 May 2001
Abstract: We consider the Schrödinger equation in $\mathbb{R}^n$, $n\ge 3$, with external Yang–Mills potentials having compact supports. We prove the uniqueness modulo a gauge transformation of the solution of the inverse boundary value problem in a bounded convex domain. A similar uniqueness result holds for the inverse scattering problem at a fixed energy.
1. Introduction
Consider the following Schrödinger equation in $\Omega\subset\mathbb{R}^n$, $n\ge 3$:
$$\Big(-i\frac{\partial}{\partial x}+A(x)\Big)^2u+V(x)u-k^2u=0,\tag{1.1}$$
where $A(x)=(A_1(x),\dots,A_n(x))$, $A_j(x)$, $V(x)$ are $m\times m$ matrices, $\Omega$ is a bounded domain with a smooth boundary $\partial\Omega$. We assume that $V(x)\in L^\infty(\Omega)$ and $A(x)$ is continuous in $\bar\Omega$ with continuous partial derivatives up to order $n_0\ge n+3$. Note that (1.1) can be written in the form:
$$-\Delta u-2i\sum_{j=1}^{n}A_j(x)\frac{\partial u}{\partial x_j}+(q(x)-k^2)u=0,\tag{1.2}$$
where
$$q(x)=\sum_{j=1}^{n}\Big(A_j^2-i\frac{\partial A_j}{\partial x_j}\Big)+V(x).\tag{1.3}$$
Partially supported by NSF Grant DMS-9970565.
Vice versa, any system of the form (1.2) can be rewritten in the form (1.1) with ∂A V = q(x) − jn=1 A2j − i ∂xjj . Let u(x) be any solution of (1.1) in H1 () and f = u(x)|∂ . The operator f = ( ∂u ∂n + i(A · n)u)|∂ is called the Dirichlet-to-Neumann operator. Here n is the outer unit normal to ∂. We assume that k 2 is such that for any f ∈ H 1 (∂) there exists a 2 solution of the Dirichlet problem in H1 (). We say that A (x) = (A 1 (x), . . . , A n (x)), V (x) are gauge equivalent to A(x) = (A1 (x), . . . , An (x)), V (x) if there exists a smooth m × m matrix g(x), det g(x) = 0 ¯ g(x) = I on ∂ such that in , A (x) = g −1 (x)A(x)g(x) − ig −1 (x)
∂g , ∂x
V (x) = g −1 (x)V (x)g(x). We shall prove the following theorem: Theorem 1.1. If the Dirichlet-to-Neumann operator corresponding to the Schrödinger equation (L − k 2 )u = 0 with potentials A (x), V (x) is equal to the Dirichlet¯ is convex then A (x), V (x) are gauge equivalent to to-Neumann operator and A(x), V (x). Here k is fixed. Since the recovery of A(x), V (x) from the Dirichlet-to-Neumann operator in a domain containing supports of A(x) and V (x) is equivalent to the recovery of A(x), V (x) from the scattering amplitude at the fixed energy k 2 > 0 (see, for example, [S] and [ER2]), Theorem 1.1 implies the following result: Theorem 1.2. Let (L − k 2 )u = 0 be Eq. (1.1) in Rn , where A(x), V (x) have compact supports. If the scattering amplitude a (θ, ω, k) for the Schrödinger equation (L − k 2 )u = 0 is equal to the scattering amplitude a(θ, ω, k) for (L − k 2 )u = 0 for k > 0 fixed and all |θ | = 1, |ω| = 1 and A (x), V (x) also have compact supports then A (x), V (x) are gauge equivalent to A(x), V (x). The Schrödinger equation with external Yang–Mills potentials was considered in [ST]. The inverse scattering problem for such Schrödinger equations was studied in [HN2] and [ER3] under the assumption that A(x) is small . For other inverse scattering and inverse boundary value problems for physically relevant systems of differential equations see [Iso, NT, NU1, NU2, OPS]. In this paper we prove the global uniqueness theorem without any smallness conditions on A(x) and V (x) except that they have a compact support and n ≥ 3. Note that a global uniqueness result for large energies was announced in [No]. To prove the uniqueness in the inverse boundary problem one need to construct a rich set of solutions of the system (1.2). This is done in Sect. 3. The crucial tool in construction of solution in Sect. 3 is the matrix solutions of the first order system (2.1). The problem is that there may be no solutions of (2.1) tending to I on ∞. 
We overcome this problem by showing that there are solutions of (2.1) that grow polynomially at infinity. Such growing solutions are not unique and their use considerably complicates the analysis of the Green's identity that follows from the equality of the Dirichlet-to-Neumann operators for two Schrödinger operators of form (1.1). We conclude the proof of global uniqueness by using a simplified version of the $\bar\partial$-approach of G. Henkin and R. Novikov ([HN1, HN2, HN3]).
2. Solution of the Transport Equation

As in [ER1, ER3], the crucial role in the solution of the inverse problem is played by the matrix solution of the following equation:
\[ i\,\frac{\partial c_0(x,\mu+i\nu)}{\partial x}\cdot(\mu+i\nu) = A(x)\cdot(\mu+i\nu)\,c_0(x,\mu+i\nu). \tag{2.1} \]
Here $c_0(x,\mu+i\nu)$ is an $m\times m$ matrix smooth in $x\in\mathbb{R}^n$, $\mu+i\nu$, $\det c_0\neq 0$ for all $x$, $\mu+i\nu$, and $\mu$, $\nu$ are orthogonal vectors in $\mathbb{R}^n$ of equal length: $\mu\cdot\nu=0$, $|\mu|^2=|\nu|^2$. In this section we assume that $A(x)$ is defined in $\mathbb{R}^n$, has $n_0$ continuous derivatives and $\operatorname{supp}A\subset B_R=\{x:|x|<R\}$. If $\mu$ and $\nu$ are fixed then any vector $x\in\mathbb{R}^n$ can be written in the form
\[ x = x_1\frac{\mu}{|\mu|} + x_2\frac{\nu}{|\nu|} + x_\perp, \]
where $x_\perp\cdot\mu=0$, $x_\perp\cdot\nu=0$. Denote $z=x_1+ix_2$, $\frac{\partial}{\partial\bar z}=\frac12\bigl(\frac{\partial}{\partial x_1}+i\frac{\partial}{\partial x_2}\bigr)$. Then $(\mu+i\nu)\cdot\frac{\partial}{\partial x}=|\mu|\bigl(\frac{\partial}{\partial x_1}+i\frac{\partial}{\partial x_2}\bigr)=2|\mu|\frac{\partial}{\partial\bar z}$.

Let $\pi$ be the two-dimensional subspace in $\mathbb{R}^n$ spanned by $\mu$ and $\nu$. We shall assign an orientation to $\pi$ by fixing the order $(\mu,\nu)$. Thus the two-dimensional subspace $(\nu,\mu)$ has the opposite orientation. We shall often write $\pi=\mu+i\nu$, where $(\mu,\nu)$ is any orthogonal basis in $\pi$, $|\mu|=|\nu|$. Denote by $G_n$ the manifold of all oriented two-dimensional subspaces in $\mathbb{R}^n$ and denote by $Y$ the vector bundle of all two-dimensional planes in $\mathbb{R}^n$. Any $y\in Y$ can be represented in the form $y=(\pi,x_\perp)=(\mu+i\nu,x_\perp)$. We shall denote by $Y_R$ the subset of $Y$ where $|x_\perp|\le R$. We can rewrite Eq. (2.1) in the form
\[ \frac{\partial c(y,z)}{\partial\bar z} - B(y,z)c(y,z) = 0,\qquad z\in\mathbb{C}, \tag{2.2} \]
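where $B(y,z) = -\frac{i}{2}A(x)\cdot\bigl(\frac{\mu}{|\mu|}+i\frac{\nu}{|\nu|}\bigr)$, $c(y,z)=c_0(x,\mu+i\nu)$. Therefore (2.2) is an equation in $\mathbb{R}^2$ depending on $y\in Y$ as a parameter. Denote by $\Pi f$ the Cauchy operator in $\mathbb{R}^2$,
\[ \Pi f = \frac{1}{\pi}\int_{\mathbb{R}^2}\frac{f(w)\,dy_1\,dy_2}{z-w}, \tag{2.3} \]
where $w=y_1+iy_2$. Note that $\Pi$ is a bounded operator from $L_{2,\rho}$ to $L_{2,-\rho}$, $\rho>\frac12+\varepsilon$, where $L_{2,\rho}$ is the weighted $L_2$ space with the weight $(1+|z|)^{\rho}$, and
\[ \frac{\partial}{\partial\bar z}\Pi f = f,\ \ \forall f\in L_{2,\rho},\qquad \Pi\frac{\partial g}{\partial\bar z} = g,\ \ \forall g\in L_{2,-\rho},\ \frac{\partial g}{\partial\bar z}\in L_{2,\rho}. \]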
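For orientation, the passage from (2.1) to (2.2) can be written out term by term. The following is only a sketch; it assumes nothing beyond the decomposition of $x$ along $\mu/|\mu|$, $\nu/|\nu|$ and the normalization $|\mu|=|\nu|$ stated above.

```latex
% Sketch: reduction of (2.1) to (2.2), using |\mu| = |\nu|.
\begin{align*}
(\mu+i\nu)\cdot\frac{\partial}{\partial x}
  &= |\mu|\Bigl(\frac{\partial}{\partial x_1}+i\frac{\partial}{\partial x_2}\Bigr)
   = 2|\mu|\frac{\partial}{\partial\bar z},\\
A(x)\cdot(\mu+i\nu)
  &= |\mu|\,A(x)\cdot\Bigl(\frac{\mu}{|\mu|}+i\frac{\nu}{|\nu|}\Bigr),\\
2i|\mu|\frac{\partial c}{\partial\bar z}
  &= |\mu|\,A(x)\cdot\Bigl(\frac{\mu}{|\mu|}+i\frac{\nu}{|\nu|}\Bigr)c
  \quad\Longrightarrow\quad
\frac{\partial c}{\partial\bar z}
   = -\frac{i}{2}\,A(x)\cdot\Bigl(\frac{\mu}{|\mu|}+i\frac{\nu}{|\nu|}\Bigr)c .
\end{align*}
```

The coefficient on the right of the last line is the matrix $B(y,z)$ of (2.2); the factor $|\mu|$ cancels, which is why (2.2) depends only on the oriented plane and on $x_\perp$.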
To solve the nonhomogeneous equation
\[ \frac{\partial v}{\partial\bar z} - B(y,z)v = f, \tag{2.4} \]
where $f\in L_{2,\rho}$, $v\in L_{2,-\rho}$, we apply $\Pi$ and get
\[ v - \Pi Bv = \Pi f. \tag{2.5} \]
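The two directions of the passage between (2.4) and (2.5) use only the two inversion identities for the Cauchy operator of (2.3), written $\Pi$ here. As a sketch, and noting that $Bv\in L_{2,\rho}$ because $B$ has compact support in $z$:

```latex
% Sketch: (2.4) <=> (2.5) for f in L_{2,\rho}, v in L_{2,-\rho}.
\begin{align*}
\text{(2.4)}\Rightarrow\text{(2.5)}:\quad
 &\Pi\Bigl(\frac{\partial v}{\partial\bar z}-Bv\Bigr)=\Pi f
 \ \Longrightarrow\ v-\Pi Bv=\Pi f
 &&\Bigl(\Pi\frac{\partial v}{\partial\bar z}=v,\ \ \frac{\partial v}{\partial\bar z}=Bv+f\in L_{2,\rho}\Bigr),\\
\text{(2.5)}\Rightarrow\text{(2.4)}:\quad
 &\frac{\partial}{\partial\bar z}\bigl(v-\Pi Bv\bigr)=\frac{\partial}{\partial\bar z}\Pi f
 \ \Longrightarrow\ \frac{\partial v}{\partial\bar z}-Bv=f
 &&\Bigl(\frac{\partial}{\partial\bar z}\Pi g=g,\ \ g\in L_{2,\rho}\Bigr).
\end{align*}
```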
Note that the operator $I-\Pi B$ is Fredholm in $L_{2,-\rho}$ and Eqs. (2.4) and (2.5) are equivalent. We shall say that condition (A) is satisfied if $\ker(I-\Pi B)=0$ for all $y\in Y$. If condition (A) is satisfied then there is a unique solution of (2.1) (or (2.2)) such that $c-I=O(\frac1z)$, $\det c\neq 0$, $\forall (y,z)$, and $c$ is smooth in $(y,z)$. We look for $c$ in the form $c=c_1+I$. Then $c_1$ satisfies $c_1-\Pi Bc_1=\Pi B$. Since $\ker(I-\Pi B)=0$ this equation has a unique solution $c_1=O(\frac1z)$. Therefore $c=I+O(\frac1z)$. Note that $\det c$ satisfies the equation $\frac{\partial}{\partial\bar z}(\det c)=\operatorname{Tr}B(y,z)\det c$. Therefore $\det c=f(z)e^{\Pi\operatorname{Tr}B}$, where $f(z)$ is analytic in $\mathbb{C}$. Since $\Pi\operatorname{Tr}B=O(\frac1z)$ and $\det c=1+O(\frac1z)$ we have that $f(z)$ is bounded and therefore $f(z)=\mathrm{const}\neq 0$ by the Liouville theorem. Therefore $\det c\neq 0$ for all $z$. Vice versa, if there exists a smooth solution $c(y,z)$ of (2.2) such that $c=I+O(\frac1z)$, then $\ker(\frac{\partial}{\partial\bar z}-B)=0$. Indeed if $\varphi\in\ker(\frac{\partial}{\partial\bar z}-B)$, i.e. $\varphi=O(\frac1z)$ and $\frac{\partial\varphi}{\partial\bar z}-B\varphi=0$, then $c^{-1}\varphi$ satisfies
\[ \frac{\partial}{\partial\bar z}(c^{-1}\varphi) = -c^{-1}\frac{\partial c}{\partial\bar z}c^{-1}\varphi + c^{-1}\frac{\partial\varphi}{\partial\bar z} = -c^{-1}B\varphi + c^{-1}B\varphi = 0. \]
Therefore $c^{-1}\varphi$ is analytic in $\mathbb{C}$ and $c^{-1}\varphi=O(\frac1z)$. Then by the Liouville theorem $c^{-1}\varphi=0$, i.e. $\varphi=0$.

Remark 2.1. For any $y_0\in Y$, $z_0\in\mathbb{C}$ there exists a neighborhood $U(y_0,z_0)\subset Y\times\mathbb{C}$ such that there is a solution $d(y,z)$ of (2.2) in $U(y_0,z_0)$ with $\det d(y,z)\neq 0$. To prove this denote by $B_0(y,z)$ a matrix equal to $B(y,z)$ in $U(y_0,z_0)$ and equal to 0 outside a slightly larger neighborhood. Then condition (A) is satisfied for $B_0$ since the operator $\Pi B_0$ has a small norm. Therefore there exists $d(y,z)=I+O(\frac1z)$ such that $d(y,z)$ satisfies (2.2) in $U(y_0,z_0)$ and $\det d(y,z)\neq 0$.

Condition (A) is certainly satisfied if $A$ is small. This was the case considered in [ER3]. However if $A$ is not small one may have that $\ker(I-\Pi B)\neq 0$ for some $y$. The purpose of this section is to show that even in the case when condition (A) is not satisfied one still may construct solutions $c(y,z)$ of (2.2).
However these solutions will grow as a polynomial when $|z|\to\infty$, $y$ is fixed. Consider the adjoint equation
\[ -\frac{\partial\varphi_*}{\partial z} - B^*\varphi_* = 0, \tag{2.6} \]
where $B^*$ is the adjoint matrix to $B(y,z)$.

Lemma 2.1. Let vector-functions $\varphi_{*1},\dots,\varphi_{*r}$ be a basis of $\ker(-\frac{\partial}{\partial z}-B^*)$, $y=y_0$ is fixed. Denote
\[ \gamma_{jp} = \frac{1}{\pi}\int_{\mathbb{R}^2}\bar w^{\,p}B^*(y_0,w)\varphi_{*j}(w)\,dy_1\,dy_2, \tag{2.7} \]
$w=y_1+iy_2$. Then vectors $\gamma_j=(\gamma_{j0},\gamma_{j1},\dots,\gamma_{jn},\dots)$, $1\le j\le r$, are linearly independent.
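Proof. If $\varphi_{*j}\in\ker(-\frac{\partial}{\partial z}-B^*)$, then $\varphi_{*j}=O(\frac1z)$ and
\[ \varphi_{*j} = -\frac{1}{\pi}\int_{\mathbb{R}^2}\frac{B^*\varphi_{*j}(w)\,dy_1\,dy_2}{\bar z-\bar w}. \]
Since $B^*(y_0,w)=0$ for $|w|>R$ and since $\frac{1}{\bar z-\bar w}=\sum_{p=0}^{\infty}\frac{\bar w^{\,p}}{\bar z^{\,p+1}}$ for $|z|>|w|$, we get a Laurent expansion of $\varphi_{*j}$ for $|z|>R$:
\[ \varphi_{*j} = -\sum_{p=0}^{\infty}\gamma_{jp}\,\bar z^{-p-1},\qquad 1\le j\le r. \]
Suppose that $\gamma_1,\dots,\gamma_r$ are linearly dependent. Then there exists $(\alpha_1,\dots,\alpha_r)\neq 0$ such that $\sum_{j=1}^{r}\alpha_j\gamma_{jp}=0$ for all $p=0,1,\dots$. Then $\varphi_*=\sum_{j=1}^{r}\alpha_j\varphi_{*j}\not\equiv 0$ and has the following property:
\[ \int_{\mathbb{R}^2}\bar w^{\,p}B^*\varphi_*\,dy_1\,dy_2 = 0,\qquad\forall p\ge 0. \]
Therefore $\varphi_* = -\frac{1}{\pi}\sum_{p=0}^{\infty}\frac{1}{\bar z^{\,p+1}}\int_{\mathbb{R}^2}\bar w^{\,p}B^*\varphi_*\,dy_1\,dy_2 = 0$ for $|z|>R$. Since $-\frac{\partial\varphi_*}{\partial z}-B^*\varphi_*=0$ in $\mathbb{R}^2$ and $\varphi_*=0$ for $|z|>R$ we get that $\varphi_*\equiv 0$ by the uniqueness of the Cauchy problem. And this is a contradiction.

Lemma 2.2. There exists a matrix solution $c(y,z)$ of (2.2), $y=y_0$ is fixed, such that $c(y_0,z)=O((1+|z|)^M)$ for some $M>0$ and $\det c(y_0,z)\not\equiv 0$.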
Proof. It follows from Lemma 2.1 that if $N$ is large enough then vectors $\gamma_j^{(N)}=(\gamma_{j0},\gamma_{j1},\dots,\gamma_{jN})\in\mathbb{C}^{m(N+1)}$, $1\le j\le r$, are linearly independent. We are looking for vector solutions of (2.2) in the form:
\[ c_i = z^{N+1}e_i + \sum_{j=1}^{r}\alpha_{ji}P_j(z) + c_{i1}, \tag{2.8} \]
where $e_i$, $1\le i\le m$, is the standard basis in $\mathbb{C}^m$, $c_{i1}=O(\frac1z)$, $P_j(z)=\sum_{p=0}^{N}\beta_{jp}z^p$. Vectors $\beta_{jp}\in\mathbb{C}^m$ will be chosen later. Substituting (2.8) into (2.2) we get:
\[ \frac{\partial c_{i1}}{\partial\bar z} - B(y_0,z)c_{i1} = z^{N+1}Be_i + \sum_{j=1}^{r}\alpha_{ji}BP_j. \tag{2.9} \]
Equation (2.9) has a solution iff the right-hand side of (2.9) is orthogonal to $\varphi_{*k}$, $1\le k\le r$. It follows from (2.7) that $(z^pB\beta_{jp},\varphi_{*k})=(\beta_{jp},\bar z^{\,p}B^*\varphi_{*k})=\pi(\beta_{jp},\gamma_{kp})$. Therefore the solvability conditions take the form
\[ \sum_{j=1}^{r}\alpha_{ji}\,(\beta_j,\gamma_k^{(N)}) = -(e_i,\gamma_{k,N+1}),\qquad 1\le i\le m,\ 1\le k\le r, \tag{2.10} \]
where $\beta_j=(\beta_{j0},\beta_{j1},\dots,\beta_{jN})\in\mathbb{C}^{m(N+1)}$, $(\beta_j,\gamma_k^{(N)})=\sum_{p=0}^{N}(\beta_{jp},\gamma_{kp})$ and $(\beta_{jp},\gamma_{kp})$ is the inner product in $\mathbb{C}^m$. Since $\gamma_j^{(N)}$, $1\le j\le r$, are linearly independent there exist $\beta_j$, $1\le j\le r$, such that
\[ \det\bigl((\beta_j,\gamma_k^{(N)})\bigr)_{j,k=1}^{r}\neq 0. \tag{2.11} \]
In particular one can choose $\beta_j=\gamma_j^{(N)}$. Note that there is an open set of $\beta_j$, $1\le j\le r$, satisfying (2.11). If (2.11) is satisfied we can find $\alpha_{ji}$, $1\le j\le r$, solving (2.10) and
therefore there exists $c_{i1}=O(\frac1z)$ such that $c_i$ of the form (2.8) solves (2.2). Denote by $c(y_0,z)$ the matrix with columns $c_1,\dots,c_m$. Then $c(y_0,z)$ is a matrix solution of (2.2) and
\[ c = z^{N+1}I + Q(z) + c_1, \tag{2.12} \]
where $c_1=O(\frac1z)$ and $Q(z)$ is a polynomial matrix in $z$ of degree $N$. Therefore $\det c = z^{(N+1)m}+O(z^{(N+1)m-1})\not\equiv 0$.

Lemma 2.3. There exists a matrix solution $c(y_0,z)$ of (2.2) for $y=y_0$ such that $c(y_0,z)=O((1+|z|)^M)$, $M>0$, $\det c(y_0,z)\neq 0$ for all $z\in\mathbb{C}$.

Proof. Let $c(y_0,z)$ be a matrix solution of (2.2) such that $c=O((1+|z|)^M)$ and $\det c(y_0,z)\not\equiv 0$. Since $\det c=f(z)e^{\Pi\operatorname{Tr}B}$, where $f(z)$ is a polynomial, the equation $\det c=0$ is equivalent to the polynomial equation $f(z)=0$. Suppose $z_0$ is a zero of $f(z)$. It follows from Remark 2.1 that there exists a matrix solution $d(y_0,z)$ of (2.2) in a neighborhood $U(z_0)$ such that $\det d(y_0,z)\neq 0$. We have
\[ \frac{\partial}{\partial\bar z}(d^{-1}c) = -d^{-1}\frac{\partial d}{\partial\bar z}d^{-1}c + d^{-1}\frac{\partial c}{\partial\bar z} = 0, \]
i.e. $d^{-1}c=P(z)$, where $P(z)$ is analytic in $U(z_0)$. Thus $c=dP$ and $\det P(z_0)=0$. Adding to a column of $P(z)$ a linear combination of other columns (that is equivalent to multiplication of $P(z)$ from the right by a nonsingular constant matrix $Q$) we get that $cQ=dPQ$ and a column of $PQ$ has a zero at $z=z_0$. Dividing this column by $z-z_0$ (that is equivalent to the multiplication by some matrix $Q_1(z)$) we get that $P_1(z)=P(z)QQ_1(z)$ is analytic in $U(z_0)$. Therefore $c_1(y_0,z)=cQQ_1$ satisfies (2.2) in $\mathbb{C}$ and its determinant has fewer zeros than that of $c(y_0,z)$. After a finite number of such steps we get a matrix $c_N(y_0,z)$ satisfying (2.2) and such that $\det c_N(y_0,z)\neq 0$, $\forall z$. Since $\det c_N=f_N(z)e^{\Pi\operatorname{Tr}B}$ and $f_N(z)$ is a polynomial having no zeros we get that $f_N(z)=\mathrm{const}$. Therefore $\det c_N\to\mathrm{const}$ when $|z|\to\infty$.

Lemma 2.4. For any $y_0\in Y$ there exists a neighborhood $U(y_0)\subset Y$ such that there is a matrix solution $c_1(y,z)$ of (2.2) smooth in $(y,z)$, $c_1(y,z)=O(1+|z|^M)$, $\det c_1(y,z)\neq 0$ for $y\in U(y_0)$, $|z|\le 2R$.

Proof. Let $P_j(z)$, $1\le j\le r$, be the same as in the proof of Lemma 2.2. Denote by $A(y)$ the following operator:
\[ A(y)(u,\alpha_1,\dots,\alpha_r) = \frac{\partial u}{\partial\bar z} - B(y,z)u - \sum_{j=1}^{r}\alpha_jB(y,z)P_j(z). \tag{2.13} \]
Note that $\Pi A(y)$ is a family of Fredholm operators from $L_{2,-\rho}\times\mathbb{C}^r$ to $L_{2,-\rho}$ depending continuously on $y\in Y$ in the operator norm. If condition (2.11) is satisfied then $\dim\ker\Pi A(y_0)=r$, $\operatorname{coker}\Pi A(y_0)=0$, i.e. the range of $\Pi A(y_0)$ is the entire space $L_{2,-\rho}$. Therefore by continuity $\operatorname{coker}\Pi A(y)=0$ for $y\in U(y_0)$, where $U(y_0)$ is some neighborhood of $y_0\in Y$. Let $f_i(y_0,z)$, $1\le i\le m$, be the columns of the matrix $c(y_0,z)$ with $\det c(y_0,z)\neq 0$, $\forall z$, constructed in Lemma 2.3. We have $f_i(y_0,z)=f_{i1}(y_0,z)+f_{i2}(y_0,z)$, where $f_{i1}(y_0,z)$ is a polynomial vector and $f_{i2}=O(\frac1z)$. The equation $A(y_0)(g,\alpha_1,\dots,\alpha_r)=B(y_0,z)f_{i1}(y_0,z)$ has a solution $(f_{i2},0,\dots,0)$. Denote by $N_i(B(y,z)f_{i1}(y_0,z))$ the hyperplane of solutions of $A(y)(g,\alpha_1,\dots,\alpha_r)=$
$B(y,z)f_{i1}(y_0,z)$. Since $\operatorname{coker}\Pi A(y)=0$ we have that $N_i(B(y,z)f_{i1}(y_0,z))$ has constant dimension $r$ for all $y\in U(y_0)$. Therefore there exists a solution $(g_{i2}(y,z),\alpha_{1i}(y),\dots,\alpha_{ri}(y))\in N_i(B(y,z)f_{i1}(y_0,z))$, smooth in $y\in U(y_0)$, such that $(g_{i2}(y_0,z),\alpha_{1i}(y_0),\dots,\alpha_{ri}(y_0))=(f_{i2}(y_0,z),0,\dots,0)$. Denote $g_i(y,z)=g_{i2}(y,z)+\sum_{j=1}^{r}\alpha_{ji}(y)P_j(z)+f_{i1}(y_0,z)$. Then $g_i(y,z)$ are smooth in $(y,z)$, satisfy $(\frac{\partial}{\partial\bar z}-B(y,z))g_i=0$, $y\in U(y_0)$, $z\in\mathbb{C}$, and $g_i(y_0,z)=f_i(y_0,z)$, $1\le i\le m$. Denote by $c_1(y,z)$ a matrix with columns $g_1(y,z),\dots,g_m(y,z)$. Then $\det c_1(y_0,z)=\det c(y_0,z)\neq 0$, $\forall z$. Therefore by continuity $\det c_1(y,z)\neq 0$ for $|z|\le 2R$ if $y\in U(y_0)$ and $U(y_0)$ is small.

Now we shall prove the global existence of a solution of Eq. (2.1).

Theorem 2.1. There exists a matrix solution $c(x,\mu+i\nu)$ of Eq. (2.1) smooth for $|x|\le 2R$, $\mu+i\nu\in G_n$ and such that $\det c(x,\mu+i\nu)\neq 0$ for $|x|\le 2R$, $\mu+i\nu\in G_n$.

Proof. It is enough to prove that there exists a smooth solution of (2.2) for all $y\in Y_{2R}$, $|z|\le 2R$. Consider the case $n=3$. Let $l_0$ be the unit vector in $\mathbb{R}^3$ orthogonal to $\mu$ and $\nu$ and such that $\{\frac{\mu}{|\mu|},\frac{\nu}{|\nu|},l_0\}$ form a right-hand orthogonal basis in $\mathbb{R}^3$. We can identify the oriented two-dimensional subspace $\mu+i\nu$ with $l_0$. Therefore $G_3\approx S^2$, $x_\perp=x_3l_0$, where $x_3\in\mathbb{R}^1$ and $y\approx(l_0,x_3l_0)$. Let $D_+$ be the upper half-sphere and $D_-$ be the lower half-sphere, $\bar D_+\cap\bar D_-=S^1$. Denote $Y_+=\bar D_+\times B_1$, $Y_-=\bar D_-\times B_1$, where $B_1$ is the closed interval $[-2R,2R]$. In any small neighborhood $U$ in $Y$ we have a matrix solution $c(y,z)$ of (2.2) given by Lemma 2.4 such that $\det c\neq 0$ for $|z|\le 2R$. If $c_1$ is defined in $U_1$ and $c_2$ is defined in $U_2$ then $c_1^{-1}c_2$ will be analytic in $z$ in $U_1\cap U_2$. Since $Y_+$ and $Y_-$ are homotopic to a point there exist global solutions $c^+(y,z)$ on $Y_+$ and $c^-(y,z)$ on $Y_-$. Fix $x_\perp^{(0)}$ such that $|x_\perp^{(0)}|=2R$. Then for any $y=(l_0,x_\perp^{(0)})$ we have $B(y,z)=0$ and therefore $c^+(l_0,x_\perp^{(0)},z)$ is analytic in $z$ for all $l_0\in\bar D_+$ and $c^-(l_0,x_\perp^{(0)},z)$ is analytic in $z$ for all $l_0\in\bar D_-$. Replace $c^+(y,z)$ by $c_1^+(l_0,x_\perp,z)=c^+(l_0,x_\perp,z)\,(c^+)^{-1}(l_0,x_\perp^{(0)},z)$ and $c^-(y,z)$ by $c_1^-(l_0,x_\perp,z)=c^-(l_0,x_\perp,z)\,(c^-)^{-1}(l_0,x_\perp^{(0)},z)$. Then $c_1^+(y,z)=c_1^-(y,z)=I$ for $x_\perp=x_\perp^{(0)}$ and $l'\in S^1=\bar D_+\cap\bar D_-$.
Introduce in $\bar D_+$ coordinates $(t,l')$ where $l'\in S^1$, $0\le t\le 1$, $t=1$ corresponds to $S^1$ and $t=0$ corresponds to the north pole of $\bar D_+$. Denote $h_1(l',x_\perp,z)=(c_1^+(l',x_\perp,z))^{-1}c_1^-(l',x_\perp,z)$. Note that $h_1(l',x_\perp^{(0)},z)=I$. We shall extend $h_1$ from $S^1\times B_1$ to $\bar D_+\times B_1$ by defining $h_1^+(t,l',x_\perp,z)=h_1(l',x_\perp^{(0)}+t(x_\perp-x_\perp^{(0)}),z)$. Then $h_1^+(0,l',x_\perp,z)=h_1(l',x_\perp^{(0)},z)=I$, $h_1^+$ is continuous on $Y_+=\bar D_+\times B_1$ and analytic in $z$. Define $c_2^+=c_1^+h_1^+$. Then $c_2^+=c_1^+h_1=c_1^-$ on $S^1\times B_1$. Therefore $c=c_1^-$ on $Y_-$ and $c=c_2^+$ on $Y_+$ is a global solution of (2.2) on $Y_{2R}$, $|z|\le 2R$. The proof in the case $n\ge 4$ is similar.

3. Construction of Special Solutions of the Schrödinger Equation

Let $R>0$ be such that the ball $B_R=\{x:|x|<R\}$ contains $\bar\Omega$. We extend $A(x)$ from $\bar\Omega$ to $\mathbb{R}^n$ as a continuous function having $n_0$ continuous derivatives and such that $\operatorname{supp}A(x)\subset B_R$. We extend also $V(x)$ as zero outside of $\bar\Omega$. Let $u$ be a solution of (1.2) in $B_R$. Denote $u_1=e^{-ix\cdot(\zeta+i\tau\nu_0)}u$, where $|\nu_0|=1$, $\tau>0$ is large, $\zeta\cdot\nu_0=0$ and $|\zeta|=\sqrt{\tau^2+k^2}$. Then $u_1$ satisfies the following equation:
\[ \Bigl(-i\frac{\partial}{\partial x}+\zeta+i\tau\nu_0\Bigr)^2u_1 + 2A(x)\cdot\Bigl(-i\frac{\partial}{\partial x}+\zeta+i\tau\nu_0\Bigr)u_1 + (q(x)-k^2)u_1 = 0. \tag{3.1} \]
We shall look for the solution of (3.1) in the form $u_1=u_0+v$, where $u_0$ is a tempered solution of
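\[ \Bigl[\Bigl(-i\frac{\partial}{\partial x}+\zeta+i\tau\nu_0\Bigr)^2-k^2\Bigr]u_0 = 0 \tag{3.2} \]
in $\mathbb{R}^n$. Then $v$ satisfies the following equation:
\[ \Bigl(-i\frac{\partial}{\partial x}+\zeta+i\tau\nu_0\Bigr)^2v + 2A(x)\cdot\Bigl(-i\frac{\partial}{\partial x}+\zeta+i\tau\nu_0\Bigr)v + (q(x)-k^2)v = -q_0(x,\zeta+i\tau\nu_0), \tag{3.3} \]
where
\[ q_0(x,\zeta+i\tau\nu_0) = 2A(x)\cdot\Bigl(-i\frac{\partial}{\partial x}+\zeta+i\tau\nu_0\Bigr)u_0 + q(x)u_0. \tag{3.4} \]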
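The shifted operators in (3.1)–(3.3) come from conjugating by the exponential factor. The lines below sketch that computation; the shorthand $\theta=\zeta+i\tau\nu_0$ is introduced here only for brevity, and the form $(-\Delta-2iA\cdot\frac{\partial}{\partial x}+q-k^2)u=0$ of (1.2) is an assumption read off from (3.10) and (4.1), since (1.2) itself is stated earlier in the paper.

```latex
% Sketch: conjugation by e^{i x\cdot\theta}, with \theta = \zeta + i\tau\nu_0 (shorthand, not in the original).
\begin{align*}
-i\frac{\partial}{\partial x}\bigl(e^{ix\cdot\theta}u_1\bigr)
  &= e^{ix\cdot\theta}\Bigl(-i\frac{\partial}{\partial x}+\theta\Bigr)u_1,\\
-\Delta\bigl(e^{ix\cdot\theta}u_1\bigr)
  &= e^{ix\cdot\theta}\Bigl(-i\frac{\partial}{\partial x}+\theta\Bigr)^{2}u_1,\\
\theta\cdot\theta
  &= |\zeta|^2-\tau^2+2i\tau\,\zeta\cdot\nu_0 = (\tau^2+k^2)-\tau^2 = k^2 .
\end{align*}
```

The last line shows why the normalization $|\zeta|^2=\tau^2+k^2$, $\zeta\cdot\nu_0=0$ is imposed: with it, every constant matrix already solves (3.2), and the polynomial solutions $u_0$ constructed later are perturbations of this trivial case.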
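For any $\eta\in\mathbb{R}^n$ denote $\eta_\nu=\eta\cdot\nu_0$, $\eta'=\eta-\eta_\nu\nu_0$. Let $c(x,\eta,\nu_0,\tau)$ be a solution of the equation
\[ i\,\frac{\partial c}{\partial x}\cdot\Bigl(\frac{\eta'}{|\eta'|}+i\nu_0\Bigr) = A(x)\cdot\Bigl(\frac{\eta'}{|\eta'|}+i\nu_0\Bigr)\chi_1(\eta_\nu,|\eta'|,\tau)\,c(x,\eta,\nu_0,\tau), \tag{3.5} \]
where $\chi_1(\eta_\nu,|\eta'|,\tau)=\chi\bigl(\frac{\eta_\nu^2}{\tau}\bigr)\chi\bigl(\frac{|\eta'|^2-\tau^2}{\tau^{3/2}}\bigr)$, $\chi(t)\in C_0^\infty(\mathbb{R}^1)$, $\chi(t)=1$ for $|t|<\frac12$, $\chi(t)=0$ for $|t|>1$. Note that
\[ \Bigl|\frac{\partial^p}{\partial\eta^p}\chi_1(\eta_\nu,|\eta'|,\tau)\Bigr| \le C_p\,\tau^{-\frac{|p|}{2}},\qquad\forall p. \tag{3.6} \]
It follows from Theorem 2.1 that there exists a solution of (3.5) having $n_0$ continuous derivatives in $x$, $n_0\ge n+3$, $C^\infty$ in $\frac{\eta'}{|\eta'|}$, $\nu_0$, $\chi_1$ and such that $\det c\neq 0$ for all $|x|\le 2R$, $\frac{\eta'}{|\eta'|}$, $\nu_0$, $0\le\chi_1\le 1$. Denote by $c_0$ the matrix $c$ when $\chi_1=0$. It is analytic in $z=x\cdot\bigl(\frac{\eta'}{|\eta'|}+i\nu_0\bigr)$. Replacing $c$ by $cc_0^{-1}$ we may assume that $c=I$ when $\chi_1=0$. Taking into account (3.6) we get
\[ \Bigl|\frac{\partial^{p+r}c(x,\eta,\nu_0,\tau)}{\partial x^p\,\partial\eta^r}\Bigr| \le C_{pr}\,\tau^{-\frac{|r|}{2}},\qquad 0\le|p|\le n_0,\ \forall r. \tag{3.7} \]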
Remark 3.1. We shall need in this section $c(x,\eta,\nu_0,\tau)$ defined only in a ball $|x|<R+\delta$, $\delta>0$.

Denote by $c(x,D,\nu_0,\tau)$ the pseudo-differential operator with symbol $c(x,\eta,\nu_0,\tau)$. Take $\psi_j(x)\in C_0^\infty(\mathbb{R}^n)$, $1\le j\le 3$, $\psi_1=1$ for $|x|\le R$, $\psi_2=1$ in a neighborhood of $\operatorname{supp}\psi_1$, $\psi_3=1$ in a neighborhood of $\operatorname{supp}\psi_2$, $\operatorname{supp}\psi_3\subset B_{R+\delta}$. We are looking for the solution of (3.3) in the form $v=\psi_2c(x,D+\zeta,\nu_0,\tau)Eg$, where $E$ is the following operator:
\[ Eg = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\frac{e^{ix\cdot\eta}\,\tilde g(\eta)\,d\eta}{(\eta+\zeta+i\tau\nu_0)^2-k^2}, \]
where $\tilde g(\eta)$ is the Fourier transform of $g$. Note that $E$ is a bounded operator from $L_{2,\rho}$ to $L_{2,-\rho}$, $\rho>\frac12$ (see [A, W], see also [ER4]), and
\[ \|Eg\|_{L_{2,-\rho}} \le \frac{C}{\tau}\|g\|_{L_{2,\rho}},\qquad\tau\ge 1. \tag{3.8} \]
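Substituting $v=\psi_2cEg$ into (3.3) we get
\[ \psi_2cg + \psi_2T_1g + \psi_2T_2g + A_2g = -q_0(x,\zeta+i\tau\nu_0), \tag{3.9} \]
where $A_2g$ denotes terms containing derivatives of $\psi_2$ in the front,
\[ T_1g = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\frac{\bigl[\bigl(-\Delta-2iA(x)\cdot\frac{\partial}{\partial x}+q(x)\bigr)c(x,\eta+\zeta,\nu_0,\tau)\bigr]\tilde g(\eta)e^{ix\cdot\eta}\,d\eta}{(\eta+\zeta+i\tau\nu_0)^2-k^2}, \tag{3.10} \]
\[ T_2g = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\frac{2(\eta+\zeta+i\tau\nu_0)\cdot\bigl(A(x)c-i\frac{\partial c}{\partial x}\bigr)\tilde g(\eta)e^{ix\cdot\eta}\,d\eta}{(\eta+\zeta+i\tau\nu_0)^2-k^2}. \tag{3.11} \]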
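In the numerator of (3.10) we have a symbol of a pseudodifferential operator that satisfies (3.7) with $n_0$ replaced by $n_0-2$. Therefore using (3.8) we get
\[ \|\psi_3T_1g\|_{L_{2,-\rho}} \le \frac{C}{\tau}\|g\|_{L_{2,\rho}}. \tag{3.12} \]
Write $T_2g=T_2\chi_1g+T_2(1-\chi_1)g$, where $\chi_1$ is the pseudodifferential operator with symbol $\chi_1(\eta_\nu,|\eta'+\zeta|,\tau)$. On $\operatorname{supp}(1-\chi_1)$ we have, taking into account that $|\zeta|\sim\tau$:
\[ \bigl|(\eta+\zeta+i\tau\nu_0)^2-k^2\bigr| = \bigl||\eta|^2+2(\zeta+i\tau\nu_0)\cdot\eta\bigr| = \bigl|\eta_\nu^2+|\eta'+\zeta|^2-|\zeta|^2+2i\tau\eta_\nu\bigr| \ge C\tau^{\frac32},\qquad\tau\text{ is large}. \tag{3.13} \]
Therefore (3.13) implies
\[ \|\psi_3T_2(1-\chi_1)g\|_{L_2} \le \frac{C}{\sqrt\tau}\|g\|_{L_2}. \tag{3.14} \]
On $\operatorname{supp}\chi_1$ we have
\[ \Bigl|\eta'+\zeta-\tau\frac{\eta'+\zeta}{|\eta'+\zeta|}\Bigr| \le \sqrt\tau. \tag{3.15} \]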
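Using (3.5) we get
\[ T_2\chi_1g = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\frac{2\tau\bigl(\frac{\eta'+\zeta}{|\eta'+\zeta|}+i\nu_0\bigr)\cdot A(x)\bigl(1-\chi_1(\eta_\nu,|\eta'+\zeta|,\tau)\bigr)c\,\chi_1\tilde g(\eta)e^{ix\cdot\eta}\,d\eta}{(\eta+\zeta+i\tau\nu_0)^2-k^2} + \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\frac{2\bigl(\eta+\zeta-\tau\frac{\eta'+\zeta}{|\eta'+\zeta|}\bigr)\cdot\bigl(A(x)c-i\frac{\partial c}{\partial x}\bigr)\chi_1\tilde g(\eta)e^{ix\cdot\eta}\,d\eta}{(\eta+\zeta+i\tau\nu_0)^2-k^2}. \tag{3.16} \]
Using (3.15) and (3.8) to estimate the first integral in (3.16) and using (3.13) to estimate the second we obtain
\[ \|\psi_3T_2\chi_1g\|_{L_{2,-\rho}} \le \frac{C}{\sqrt\tau}\|g\|_{L_{2,\rho}}. \tag{3.17} \]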
Take
\[ g = \psi_3c^{(-1)}g_1, \tag{3.18} \]
where $c^{(-1)}$ is the pseudo-differential operator with symbol $c^{-1}(x,\eta+\zeta,\nu_0,\tau)$. Substituting (3.18) in (3.9) we get
\[ \psi_2g_1 + \psi_2T_3g_1 + \psi_2(T_1+T_2)\psi_3c^{(-1)}g_1 + A_2\psi_3c^{(-1)}g_1 = -q_0(x,\zeta+i\tau\nu_0). \tag{3.19} \]
It follows from (3.7) that
\[ \|\psi_3T_3g_1\|_{L_2} \le \frac{C}{\sqrt\tau}\|g_1\|_{L_2}. \tag{3.20} \]
Denote $T_4=\psi_3T_3+\psi_3(T_1+T_2)\psi_3c^{(-1)}$. Then (3.9) can be rewritten in the form: $\psi_2(g_1+T_4g_1)+A_2\psi_3c^{(-1)}g_1=-q_0$. Since $\|T_4g_1\|_{L_2}\le\frac{C}{\sqrt\tau}\|g_1\|_{L_2}$ the operator $I+T_4$ is invertible in $L_2(\mathbb{R}^n)$ when $\tau$ is large. Take $g_1=-(I+T_4)^{-1}q_0$. Then
\[ (L-k^2)v = -\psi_2q_0 - A_2\psi_3c^{(-1)}(I+T_4)^{-1}q_0, \tag{3.21} \]
where $L-k^2$ is the left-hand side of (3.3) and
\[ v = -\psi_2cE\psi_3c^{(-1)}(I+T_4)^{-1}q_0. \tag{3.22} \]
Since $\psi_1A_2\equiv 0$ and $\psi_1=1$ in $B_R$ we get that (3.22) is a solution of (3.3) in $B_R$. Since $\Omega\subset B_R$ we get that (3.22) is also a solution of (3.3) in $\Omega$.
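Choose $\mu_0\cdot\nu_0=0$, $|\mu_0|=1$ and $l$ orthogonal to $\mu_0$ and $\nu_0$. Take $\zeta=\frac{l}{2}+\sqrt{\tau^2+k^2-\frac{|l|^2}{4}}\,\mu_0$, $\tau$ is large. We shall find the limit of (3.22) when $\tau\to\infty$. We shall choose a solution $u_0$ of (3.2) depending on $\tau$ and such that $\lim_{\tau\to\infty}u_0=[(\mu_0+i\nu_0)\cdot x]^pB$, where $p\ge 0$ is arbitrary and $B$ is a constant matrix. Denote $x_1=x\cdot\mu_0$, $x_2=x\cdot\nu_0$, $x_\perp=x-x_1\mu_0-x_2\nu_0$. We shall look for $u_0$ independent of $x_\perp$. Then (3.2) takes the form:
\[ \Bigl(-\frac{\partial^2}{\partial x_1^2}-\frac{\partial^2}{\partial x_2^2}-2i\sqrt{\tau^2+k_1}\,\frac{\partial}{\partial x_1}+2\tau\frac{\partial}{\partial x_2}\Bigr)u_0 = 0, \tag{3.23} \]
where $k_1=k^2-\frac{|l|^2}{4}$. Denote $z=x_1+ix_2$, $\bar z=x_1-ix_2$. Then we can rewrite (3.23) in the following form:
\[ \Bigl(-4\frac{\partial^2}{\partial\bar z\,\partial z}-2i\sqrt{\tau^2+k_1}\Bigl(\frac{\partial}{\partial z}+\frac{\partial}{\partial\bar z}\Bigr)-2i\tau\Bigl(\frac{\partial}{\partial\bar z}-\frac{\partial}{\partial z}\Bigr)\Bigr)u_0 = 0 \]
or
\[ \Bigl(\frac{\partial}{\partial\bar z}-\frac{2i}{\tau+\sqrt{\tau^2+k_1}}\frac{\partial^2}{\partial z\,\partial\bar z}-\frac{\tau-\sqrt{\tau^2+k_1}}{\tau+\sqrt{\tau^2+k_1}}\frac{\partial}{\partial z}\Bigr)u_0 = 0. \]
Therefore
\[ \frac{\partial u_0}{\partial\bar z} = G\Bigl(\frac{\partial}{\partial z}\Bigr)u_0, \tag{3.24} \]
where
\[ G\Bigl(\frac{\partial}{\partial z}\Bigr) = \frac{\tau-\sqrt{\tau^2+k_1}}{\tau+\sqrt{\tau^2+k_1}}\frac{\partial}{\partial z}\Bigl(I-\frac{2i}{\tau+\sqrt{\tau^2+k_1}}\frac{\partial}{\partial z}\Bigr)^{-1} = \frac{\tau-\sqrt{\tau^2+k_1}}{\tau+\sqrt{\tau^2+k_1}}\frac{\partial}{\partial z}+\frac{2i\bigl(\tau-\sqrt{\tau^2+k_1}\bigr)}{\bigl(\tau+\sqrt{\tau^2+k_1}\bigr)^2}\frac{\partial^2}{\partial z^2}+\cdots. \tag{3.25} \]
We will look for $u_0$ in the form:
\[ u_0 = \bigl(z^p+Q_1(z)\bar z+Q_2(z)\bar z^2+\cdots+Q_p(z)\bar z^p\bigr)B, \tag{3.26} \]
where $Q_j(z)$ are polynomials, $\deg Q_j=p-j$. Substituting (3.26) into (3.24) we get
\[ Q_1(z)+2Q_2(z)\bar z+\cdots+pQ_p(z)\bar z^{p-1} = G\Bigl(\frac{\partial}{\partial z}\Bigr)z^p+G\Bigl(\frac{\partial}{\partial z}\Bigr)Q_1(z)\bar z+\cdots+G\Bigl(\frac{\partial}{\partial z}\Bigr)Q_{p-1}\bar z^{p-1}. \tag{3.27} \]
Therefore take
\[ Q_1(z)=G\Bigl(\frac{\partial}{\partial z}\Bigr)z^p,\quad 2Q_2(z)=G\Bigl(\frac{\partial}{\partial z}\Bigr)Q_1,\quad\dots,\quad pQ_p(z)=G\Bigl(\frac{\partial}{\partial z}\Bigr)Q_{p-1}. \]
Since the coefficients of $Q_j$ tend to zero when $\tau\to\infty$ we get that $u_0\to z^pB$ uniformly in $\bar B_R$ and any derivative of $u_0$ converges uniformly in $\bar B_R$ too.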
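As an illustration of the recursion, the case $p=1$ can be carried out by hand. The abbreviations $s$ and $a$ below are introduced only for this sketch and do not appear in the original text.

```latex
% Worked case p = 1 of (3.26)-(3.27); s and a are shorthand introduced for this sketch.
\begin{align*}
s &= \sqrt{\tau^2+k_1},\qquad a = \frac{\tau-s}{\tau+s} = -\frac{k_1}{(\tau+s)^2},\\
Q_1 &= G\Bigl(\frac{\partial}{\partial z}\Bigr)z = a
   \quad(\text{all higher } z\text{-derivatives of } z \text{ vanish}),\\
u_0 &= (z+a\bar z)B = \bigl((1+a)x_1+i(1-a)x_2\bigr)B,\\
\text{check in (3.23)}:&\quad
 -2is(1+a)+2\tau\, i(1-a) = 2i\bigl[(\tau-s)-a(\tau+s)\bigr] = 0 .
\end{align*}
```

Since $a=O(\tau^{-2})$, this $u_0$ converges to $zB$ together with its derivatives on any bounded set, in line with the statement for general $p$.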
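With this choice of $u_0$ we shall pass to the limit in (3.22). Note that $|\eta'+\zeta|^2-\tau^2=|\eta'|^2+2\eta'\cdot\zeta+|\zeta|^2-\tau^2$ and $|\zeta|^2=\tau^2+k^2$. Therefore
\[ \chi_1(\eta_\nu,|\eta'+\zeta|,\tau) = \chi\Bigl(\frac{\eta_\nu^2}{\tau}\Bigr)\chi\Bigl(\frac{|\eta'+\zeta|^2-\tau^2}{\tau^{3/2}}\Bigr)\to\chi(0)\chi(0)=1, \]
when $\tau\to\infty$ and $\eta$ is fixed.

Lemma 3.1. Let $c_0$ be the operator of multiplication by $c(x,\mu_0+i\nu_0)$. Then
\[ \|\psi_3(c-c_0)f\|_{L_2}\to 0\quad\text{when }\tau\to\infty \tag{3.28} \]
for all $f\in L_{2,-\rho}$.

Proof. It follows from (3.7) that $\|\psi_3(c-c_0)f\|_{L_2}\le C_1\|f\|_{L_{2,-\rho}}$, where $C_1$ is independent of $\tau$. For every $\varepsilon>0$ there exists $f_\varepsilon(x)\in C_0^\infty(\mathbb{R}^n)$ such that $\|f-f_\varepsilon\|_{L_{2,-\rho}}<\varepsilon$. Since $c(x,\eta,\nu_0,\tau)$ depends smoothly on $\frac{\eta'}{|\eta'|}$ and $\chi_1$ and since $\chi_1\to 1$, $\frac{\eta'+\zeta}{|\eta'+\zeta|}\to\mu_0$ when $\tau\to\infty$ and $\eta$ is fixed, we have that $c(x,\eta+\zeta,\nu_0,\tau)\to c(x,\mu_0+i\nu_0)$ when $\tau\to\infty$ and $\eta$ is fixed. By the Lebesgue dominated convergence theorem
\[ \int_{\mathbb{R}^n}\bigl(c(x,\eta+\zeta,\nu_0,\tau)-c(x,\mu_0+i\nu_0)\bigr)\tilde f_\varepsilon(\eta)e^{ix\cdot\eta}\,d\eta\to 0 \]
when $\tau\to\infty$ uniformly on $\bar B_{R+\delta}$. This proves (3.28).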
Remark 3.2. Since $c(x,\eta,\nu_0,\tau)$ is $C^\infty$ in $\eta$ and (3.7) holds we get, commuting $(1+|x|^2)^{\frac{s}{2}}$ and $c(x,D+\zeta,\nu_0,\tau)$, that $\psi_3c(x,D+\zeta,\nu_0,\tau)$ is bounded in $L_{2,s}(\mathbb{R}^n)$ for any $s\in\mathbb{R}$. Note that the boundedness in $(x,\eta)$ of $\frac{\partial^pc(x,\eta+\zeta,\nu_0,\tau)}{\partial x^p}$ for $0\le|p|\le n+1$ is a simple sufficient condition for the boundedness of the operator $c(x,D+\zeta,\nu_0,\tau)$ in $L_2$. Therefore (3.28) still holds when $c-c_0$ is replaced by $\frac{\partial^p}{\partial x^p}(c-c_0)$, $|p|\le n_0-n-1$.

Using the Fourier transform we can rewrite (2.3) in the following form
\[ \Pi f = \frac{2}{i}\,\frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\frac{\tilde f(\eta)e^{ix\cdot\eta}\,d\eta}{\eta\cdot(\mu_0+i\nu_0)}. \]
Since $\frac{1}{\tau}(\eta\cdot\eta+2\eta\cdot(\zeta+i\tau\nu_0))\to 2\eta\cdot(\mu_0+i\nu_0)$ when $\tau\to\infty$ we have the following lemma (cf. [Iso]):

Lemma 3.2.
\[ \Bigl\|\Bigl(\tau E-\frac{i}{4}\Pi\Bigr)f\Bigr\|_{L_{2,-\rho}}\to 0\quad\text{when }\tau\to\infty \]
for any $f\in L_{2,\rho}$, $\rho>\frac12$.
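Proof. It follows from (3.8) that $\|(\tau E-\frac{i}{4}\Pi)f\|_{L_{2,-\rho}}\le C_1\|f\|_{L_{2,\rho}}$, where $C_1$ is independent of $\tau$. For any $\varepsilon>0$ choose $f_\varepsilon(x)\in C_0^\infty(\mathbb{R}^n)$ such that $\|f-f_\varepsilon\|_{L_{2,\rho}}<\varepsilon$. Taking $\delta_1$ sufficiently small we get, choosing $\frac12<\rho_1<\rho$:
\[ \Bigl\|(1-\chi(\delta_1|x|))\Bigl(\tau E-\frac{i}{4}\Pi\Bigr)f_\varepsilon\Bigr\|_{L_{2,-\rho}} \le \Bigl\|(1+|x|)^{-(\rho-\rho_1)}(1-\chi(\delta_1|x|))\Bigl(\tau E-\frac{i}{4}\Pi\Bigr)f_\varepsilon\Bigr\|_{L_{2,-\rho_1}}\le\varepsilon. \]
Also we have
\[ \Bigl\|\Bigl(\tau E-\frac{i}{4}\Pi\Bigr)(1-\chi(\delta_1|D|))f_\varepsilon\Bigr\|_{L_{2,-\rho}} \le C_1\|(1-\chi(\delta_1|D|))f_\varepsilon\|_{L_{2,\rho}}\le\varepsilon \]
when $\delta_1$ is small. Here $\chi(\delta_1|D|)$ is a pseudodifferential operator with symbol $\chi(\delta_1|\eta|)$. Finally making a change of variables
\[ t_1 = \frac{1}{2\tau}\Bigl(\eta\cdot\eta+2\eta_1\sqrt{\tau^2+k_1}+\eta_\perp\cdot l\Bigr),\qquad t_2=\eta_2,\qquad t_\perp=\eta_\perp \tag{3.29} \]
and using the Lebesgue dominated convergence theorem we get that
\[ \chi(\delta_1|x|)\Bigl(\tau E-\frac{i}{4}\Pi\Bigr)(1-\chi(\delta_1|D|))f_\varepsilon = \frac{\chi(\delta_1|x|)}{(2\pi)^n}\int_{\mathbb{R}^n}\Bigl[\frac{\tilde f_\varepsilon(\eta(t))e^{ix\cdot\eta(t)}(1-\chi(\delta_1|\eta(t)|))a(t)}{2(t_1+it_2)}-\frac{\tilde f_\varepsilon(t)e^{ix\cdot t}(1-\chi(\delta_1|t|))}{2(t_1+it_2)}\Bigr]dt\to 0 \]
when $\tau\to\infty$ uniformly in $x\in\mathbb{R}^n$. Here $\eta=\eta(t)$ is the inverse change of variables to (3.29) and $a(t)$ is its Jacobian.

Denote as before $y=(\mu_0+i\nu_0,x_\perp)$. For simplicity of notation we denote by $c(y,z)$ the matrix $c(x,\mu_0+i\nu_0)$ in coordinates $(y,z)$. Using Lemmas 3.1 and 3.2 we get the following limit in $L_2(B_{R+\delta})$:
\[ \lim_{\tau\to\infty}v = -\psi_2(x)\,c(x,\mu_0+i\nu_0)\,\frac{i}{2}\,\Pi\,\psi_3\,c^{-1}(x,\mu_0+i\nu_0)A(x)\cdot(\mu_0+i\nu_0)z^pE_{ij}, \tag{3.30} \]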
where $\psi_2=\psi_3=1$ when $x\in\bar B_R$, $B=E_{ij}$ is the matrix whose elements $e^{ij}_{pk}$ are 1 when $i=p$, $j=k$ and 0 otherwise. We shall simplify (3.30). Note that
\[ c^{-1}A(x)\cdot(\mu_0+i\nu_0)z^pE_{ij} = -i\,\frac{\partial c^{-1}}{\partial x}z^p\cdot(\mu_0+i\nu_0)E_{ij} = -2i\frac{\partial}{\partial\bar z}(c^{-1}z^pE_{ij}). \]
Denote $\theta(R-|z|)=1$ when $|z|<R$, $\theta(R-|z|)=0$ when $|z|>R$. Since $\operatorname{supp}A\subset B_R$ we have that $\theta c^{-1}A\cdot(\mu_0+i\nu_0)z^p=c^{-1}A\cdot(\mu_0+i\nu_0)z^p$.
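Note that
\[ -2i\frac{\partial}{\partial\bar z}(\theta c^{-1}z^p) = \theta\Bigl(-2i\frac{\partial}{\partial\bar z}(c^{-1}z^p)\Bigr)-2i\frac{\partial\theta}{\partial\bar z}(c^{-1}z^p) = c^{-1}A\cdot(\mu_0+i\nu_0)z^p-2i\frac{\partial\theta}{\partial\bar z}(c^{-1}z^p). \]
We have
\[ 2\frac{\partial\theta}{\partial\bar z} = \delta(R-|z|)\Bigl(-\frac{x_1}{|z|}-\frac{ix_2}{|z|}\Bigr) = -\delta(R-|z|)e^{i\varphi}, \]
where $z=|z|e^{i\varphi}$ and by definition
\[ (\delta(R-|z|),\psi) = \int_{|z|=R}\psi(Re^{i\varphi})\,ds,\qquad ds=R\,d\varphi. \]
Taking into account that
\[ \Pi\frac{\partial}{\partial\bar z}(\theta c^{-1}z^p) = \theta c^{-1}z^p, \]
since $\theta c^{-1}z^p$ has a compact support, we obtain for $|z|<R$,
\[ \Pi c^{-1}A(x)\cdot(\mu_0+i\nu_0)z^pE_{ij} = -2i\theta c^{-1}z^pE_{ij}-\frac{1}{\pi}\int_{|w|=R}\frac{1}{z-w}c^{-1}(y_0,w)w^p\,dw\,E_{ij}. \tag{3.31} \]
Denote for $|z|<R$
\[ \Pi^+(c^{-1}z^p) = \frac{1}{2\pi i}\int_{|w|=R}\frac{c^{-1}(y_0,w)w^p\,dw}{w-z}. \tag{3.32} \]
Since $A(x)=0$ for $|x|>R-\delta$ we have that $c^{-1}(y_0,w)$ is analytic when $R-\delta<|w|\le R+\delta$. By the Cauchy theorem
\[ \Pi^+(c^{-1}z^p) = \frac{1}{2\pi i}\int_{|w|=R+\delta}\frac{c^{-1}(y_0,w)w^p\,dw}{w-z}. \]
Therefore $\Pi^+(c^{-1}z^p)$ is analytic when $|z|<R+\delta$. The limiting value of (3.32) on $|z|=R$ is called the Toeplitz projection. Therefore for $u=u_0+v$ and $x\in B_R$ we get
\[ \lim_{\tau\to\infty}u = \lim_{\tau\to\infty}(u_0+v) = z^pE_{ij}-\frac{i}{2}c\,\Pi c^{-1}A\cdot(\mu_0+i\nu_0)z^pE_{ij} = z^pE_{ij}-c\bigl(\theta c^{-1}z^pE_{ij}-\Pi^+c^{-1}z^pE_{ij}\bigr) = c\,\Pi^+c^{-1}z^pE_{ij}, \tag{3.33} \]
since $\theta=1$ in $B_R$.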
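A quick consistency check of the limit formula is the trivial case $A\equiv 0$, which is not treated separately in the text and is offered here only as a sanity check. Then $c\equiv I$ solves (2.1), and the Toeplitz projection of a polynomial is the polynomial itself by Cauchy's integral formula:

```latex
% Sanity check of (3.32)-(3.33) for A = 0, c = I (illustrative special case).
\begin{align*}
\Pi^+(z^p) &= \frac{1}{2\pi i}\int_{|w|=R}\frac{w^p\,dw}{w-z} = z^p,\qquad |z|<R,\\
\lim_{\tau\to\infty}u &= c\,\Pi^+c^{-1}z^pE_{ij} = z^pE_{ij} = \lim_{\tau\to\infty}u_0 .
\end{align*}
```

This agrees with the direct argument: for $A\equiv 0$ the source term reduces to $q_0=q\,u_0$, which stays bounded, so $v=O(\tau^{-1})$ by (3.8) and only $u_0$ survives in the limit.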
Remark 3.3. We derived (3.32) and (3.33) assuming only that $c(x,\mu_0+i\nu_0)$ is defined when $x\in B_{R+\delta}$. When $c(x,\mu_0+i\nu_0)$ is a solution of (2.1) in $\mathbb{R}^n$ with $\det c\neq 0$, $\forall x$, then $c^{-1}(y_0,z)$ has a Laurent expansion when $|z|>R$:
\[ c^{-1}(y_0,z) = \sum_{k=-r}^{\infty}c_k(y_0)z^{-k}. \]
Denote by $\Pi^\infty c^{-1}z^p$ the polynomial part of the Laurent expansion, i.e.
\[ \Pi^\infty c^{-1}z^p = \sum_{k=-r}^{p}c_k(y_0)z^{-k+p}. \]
Then, obviously, $\Pi^+c^{-1}z^p=\Pi^\infty c^{-1}z^p$. We have $c^{-1}z^p-\Pi^\infty c^{-1}z^p=O(\frac1z)$ when $|z|\to\infty$. Therefore
\[ \Pi c^{-1}A\cdot(\mu_0+i\nu_0)z^p = -2i\,\Pi\frac{\partial}{\partial\bar z}(c^{-1}z^p) = -2i\,\Pi\frac{\partial}{\partial\bar z}(c^{-1}z^p-\Pi^\infty c^{-1}z^p) = -2i(c^{-1}z^p-\Pi^\infty c^{-1}z^p) = -2i(c^{-1}z^p-\Pi^+c^{-1}z^p), \]
since $\Pi\frac{\partial}{\partial\bar z}f=f$ when $f=O(\frac1z)$. Therefore as before we get
\[ \lim_{\tau\to\infty}u = c\,\Pi^+c^{-1}z^pE_{ij}. \]
4. Use of the Green Formula

Let $L_j$, $j=1,2$, be two Schrödinger operators,
\[ L_j = -\Delta-2iA_j\cdot\frac{\partial}{\partial x}+q_j(x)-k^2 = \Bigl(-i\frac{\partial}{\partial x}+A_j\Bigr)^2+V_j(x)-k^2. \tag{4.1} \]
Here $A_j=(A_{j1},\dots,A_{jn})$ and
\[ q_j = -i\frac{\partial}{\partial x}\cdot A_j+A_j^2+V_j(x). \tag{4.2} \]
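The relation (4.2) between $q_j$ and $V_j$ follows by expanding the square of the covariant derivative. As a sketch (the index $j$ is dropped, $A^2$ stands for $\sum_kA_k^2$, and the $A_k$ are matrices, so the order of factors is kept):

```latex
% Sketch: expansion behind (4.1)-(4.2); A_k are m x m matrices, u is a matrix or vector function.
\begin{align*}
\Bigl(-i\frac{\partial}{\partial x}+A\Bigr)^2u
 &= \sum_{k=1}^{n}\Bigl(-i\frac{\partial}{\partial x_k}+A_k\Bigr)
    \Bigl(-i\frac{\partial u}{\partial x_k}+A_ku\Bigr)\\
 &= -\Delta u-i\sum_{k}\frac{\partial(A_ku)}{\partial x_k}
    -i\sum_{k}A_k\frac{\partial u}{\partial x_k}+\sum_{k}A_k^2u\\
 &= -\Delta u-2iA\cdot\frac{\partial u}{\partial x}
    +\Bigl(-i\frac{\partial}{\partial x}\cdot A+A^2\Bigr)u .
\end{align*}
```

Adding $V u$ to both sides identifies the zero-order coefficient as $q=-i\frac{\partial}{\partial x}\cdot A+A^2+V$.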
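As in Sect. 3 we take $R>0$ such that $\bar\Omega\subset B_R$ and extend $A_j$, $V_j$ to have compact support in $B_R$. For the matrix functions $u(x)$, $v(x)$ denote
\[ (u,v) = \int_{\Omega}\operatorname{Tr}v^*u\,dx. \tag{4.3} \]
If $f$, $g$ are matrix functions on $\partial\Omega$ then denote
\[ [f,g] = \int_{\partial\Omega}\operatorname{Tr}g^*f\,ds. \tag{4.4} \]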
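Let $L_j^*$, $j=1,2$, be the formally adjoint operators to $L_j$, i.e.
\[ L_j^* = \Bigl(-i\frac{\partial}{\partial x}+A_j^*\Bigr)^2+V_j^*-k^2 = -\Delta-2iA_j^*\cdot\frac{\partial}{\partial x}+q_{*j}-k^2, \tag{4.5} \]
where
\[ q_{*j} = -i\frac{\partial}{\partial x}\cdot A_j^*+(A_j^*)^2+V_j^*. \tag{4.6} \]
Here $A_j^*$, $V_j^*$ are the adjoint matrices to $A_j$, $V_j$. Let $u_j$ and $u_{*j}$ be solutions in $\Omega$:
\[ (L_j-k^2)u_j=0,\quad x\in\Omega,\ j=1,2,\qquad (L_j^*-k^2)u_{*j}=0,\quad x\in\Omega,\ j=1,2. \]
We have, integrating twice by parts:
\[ 0 = \bigl((L_j-k^2)u_j,u_{*j}\bigr) = -\Bigl[\frac{\partial u_j}{\partial n}+i(A_j\cdot n)u_j,\ u_{*j}\Bigr]+\Bigl[u_j,\ \frac{\partial u_{*j}}{\partial n}+i(A_j^*\cdot n)u_{*j}\Bigr]+\bigl(u_j,(L_j^*-k^2)u_{*j}\bigr) = -[\Lambda_jf_j,g_j]+[f_j,\Lambda_j^*g_j], \tag{4.7} \]
where $f_j=u_j|_{\partial\Omega}$, $g_j=u_{*j}|_{\partial\Omega}$, and $\Lambda_j$, $\Lambda_j^*$ are the Dirichlet-to-Neumann operators for $(L_j-k^2)$, $(L_j^*-k^2)$ respectively. It follows from (4.7) that
\[ [\Lambda_jf_j,g_j] = [f_j,\Lambda_j^*g_j],\qquad j=1,2, \tag{4.8} \]
i.e. the Dirichlet-to-Neumann operator for $L_j^*$ is adjoint to the Dirichlet-to-Neumann operator for $L_j$. Analogously to (4.7) we have
\[ 0 = \bigl((L_1-k^2)u_1,u_{*2}\bigr) = -[\Lambda_1f_1,g_2]+\Bigl(\Bigl(-i\frac{\partial}{\partial x}+A_1\Bigr)u_1,\Bigl(-i\frac{\partial}{\partial x}+A_1^*\Bigr)u_{*2}\Bigr)+\bigl((V_1-k^2)u_1,u_{*2}\bigr) \tag{4.9} \]
and
\[ 0 = \bigl(u_1,(L_2^*-k^2)u_{*2}\bigr) = -[f_1,\Lambda_2^*g_2]+\Bigl(\Bigl(-i\frac{\partial}{\partial x}+A_2\Bigr)u_1,\Bigl(-i\frac{\partial}{\partial x}+A_2^*\Bigr)u_{*2}\Bigr)+\bigl(u_1,(V_2^*-k^2)u_{*2}\bigr). \tag{4.10} \]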
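Suppose $\Lambda_1=\Lambda_2$. Then
\[ [f_1,\Lambda_2^*g_2] = [\Lambda_2f_1,g_2] = [\Lambda_1f_1,g_2]. \tag{4.11} \]
Therefore, subtracting (4.9) from (4.10), we have:
\[ \Bigl(-i\frac{\partial u_1}{\partial x},A_2^*u_{*2}\Bigr)+\Bigl(A_2u_1,-i\frac{\partial u_{*2}}{\partial x}\Bigr)-\Bigl(-i\frac{\partial u_1}{\partial x},A_1^*u_{*2}\Bigr)-\Bigl(A_1u_1,-i\frac{\partial u_{*2}}{\partial x}\Bigr)+\bigl((A_2^2-A_1^2+V_2-V_1)u_1,u_{*2}\bigr) = 0. \tag{4.12} \]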
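We substitute in (4.12) instead of $u_1$ the solution $e^{ix\cdot(\zeta+i\tau\nu_0)}(u_0+v)$ constructed in Sect. 3, where $u_0\to z^kE_{ii}$, $\zeta=\frac{l}{2}+l_1\mu_0$, $\nu_0\cdot\mu_0=0$, $l\cdot\nu_0=l\cdot\mu_0=0$, $l_1=\sqrt{\tau^2+k^2-\frac{|l|^2}{4}}$. Instead of $u_{*2}$ we take a solution $e^{ix\cdot(\zeta_1-i\tau\nu_0)}(u_{*0}+v_*)$, where $\zeta_1=-\frac{l}{2}+l_1\mu_0$, $u_{*0}$ is a solution of $[(-i\frac{\partial}{\partial x}+\zeta_1-i\tau\nu_0)^2-k^2]u_{*0}=0$ such that
\[ \lim_{\tau\to\infty}u_{*0} = \bar z^{\,p}E_{jj} \tag{4.13} \]
and $v_*$ is the solution of
\[ \Bigl(-i\frac{\partial}{\partial x}+\zeta_1-i\tau\nu_0\Bigr)^2v_*+2A_2^*(x)\cdot\Bigl(-i\frac{\partial}{\partial x}+\zeta_1-i\tau\nu_0\Bigr)v_*+(q_{*2}(x)-k^2)v_* = -2A_2^*(x)\cdot\Bigl(-i\frac{\partial}{\partial x}+\zeta_1-i\tau\nu_0\Bigr)u_{*0}-q_{*2}(x)u_{*0}. \tag{4.14} \]
As in Sect. 3 we get
\[ \lim_{\tau\to\infty}(u_{*0}+v_*) = c_{*2}\,\Pi^+_*(c_{*2})^{-1}\bar z^{\,p}E_{jj}, \tag{4.15} \]
where
\[ i\,\frac{\partial c_{*2}}{\partial x}\cdot(\mu_0-i\nu_0) = A_2^*\cdot(\mu_0-i\nu_0)\,c_{*2}, \tag{4.16} \]
$\bar z=x\cdot(\mu_0-i\nu_0)$, $\Pi^+_*c_{*2}^{-1}\bar z^{\,p}=-\frac{1}{2\pi i}\int_{|w|=R}\frac{c_{*2}^{-1}(y_0,w)\bar w^{\,p}\,d\bar w}{\bar z-\bar w}$. Note that $e^{ix\cdot(\zeta+i\tau\nu_0)}\cdot\overline{e^{ix\cdot(\zeta_1-i\tau\nu_0)}}=e^{ix\cdot l}$. Taking into account that $\frac{\partial}{\partial x}(u_0+v)$ and $\frac{\partial}{\partial x}(u_{*0}+v_*)$ are bounded when $\tau\to\infty$ (see Remark 3.2) we get, dividing (4.12) by $\tau$ and passing to the limit in $L_2$ when $\tau\to\infty$:
\[ 2\bigl(e^{ix\cdot l}c_1\Pi^+c_1^{-1}z^kE_{ii},\ A_2^*\cdot(\mu_0-i\nu_0)\,c_{*2}\Pi^+_*(c_{*2})^{-1}\bar z^{\,p}E_{jj}\bigr)-2\bigl(e^{ix\cdot l}A_1\cdot(\mu_0+i\nu_0)\,c_1\Pi^+c_1^{-1}z^kE_{ii},\ c_{*2}\Pi^+_*(c_{*2})^{-1}\bar z^{\,p}E_{jj}\bigr) = 0. \tag{4.17} \]
Since $l$ is any vector orthogonal to $\mu_0$ and $\nu_0$ and the integration in (4.17) is over $\Omega$ we get, taking the inverse Fourier transform in $l$:
\[ \int_{\Omega_y}\operatorname{Tr}\Bigl(\bigl(\Pi^+z^pE_{jj}(c_{*2}^*)^{-1}\bigr)c_{*2}^*A_2\cdot(\mu_0+i\nu_0)\,c_1\Pi^+c_1^{-1}z^kE_{ii}\Bigr)dx_1\,dx_2-\int_{\Omega_y}\operatorname{Tr}\Bigl(\bigl(\Pi^+z^pE_{jj}(c_{*2}^*)^{-1}\bigr)c_{*2}^*A_1\cdot(\mu_0+i\nu_0)\,c_1\Pi^+c_1^{-1}z^kE_{ii}\Bigr)dx_1\,dx_2 = 0. \tag{4.18} \]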
Here $\Omega_y$ is the intersection of $\Omega$ with the two-dimensional plane $x=x_1\mu_0+x_2\nu_0+x_\perp$, where $y=(\mu_0+i\nu_0,x_\perp)$ is fixed and $z=x_1+ix_2$ varies. Note that (4.18) holds for any $y\in Y$. We used in (4.18) that $(\Pi^+_*f)^*=\Pi^+f^*$. We have from (4.16),
\[ -i\,\frac{\partial c_{*2}^*}{\partial x}\cdot(\mu_0+i\nu_0) = c_{*2}^*A_2\cdot(\mu_0+i\nu_0). \tag{4.19} \]
Therefore $c_2=(c_{*2}^*)^{-1}$ satisfies
\[ i\,\frac{\partial c_2}{\partial x}\cdot(\mu_0+i\nu_0) = A_2\cdot(\mu_0+i\nu_0)\,c_2. \tag{4.20} \]
Since (4.18) holds for all $E_{ii}$ and $E_{jj}$ we get, taking into account (4.19) and (4.20):
\[ 2\int_{\Omega_y}\Bigl[(\Pi^+z^pc_2)(-i)\Bigl(\frac{\partial}{\partial\bar z}c_2^{-1}\Bigr)c_1\Pi^+c_1^{-1}z^k-(\Pi^+z^pc_2)\,c_2^{-1}\,i\,\frac{\partial c_1}{\partial\bar z}\,\Pi^+c_1^{-1}z^k\Bigr]dx_1\,dx_2 = 0. \tag{4.21} \]
Note that $\Pi^+z^pc_2$, $\Pi^+z^kc_1^{-1}$, given by an expression of the form (3.32), are analytic in $\Omega_y$ since $\Omega_y\subset\{z:|z|<R\}$. Therefore (4.21) can be rewritten in the following form:
\[ \int_{\Omega_y}\frac{\partial}{\partial\bar z}\bigl[(\Pi^+z^pc_2)\,c_2^{-1}c_1\,(\Pi^+z^kc_1^{-1})\bigr]dx_1\,dx_2 = 0. \tag{4.22} \]
Using the Green formula we get that (4.22) is equivalent to
\[ \int_{L_y}(\Pi^+z^pc_2)\,c_2^{-1}c_1\,(\Pi^+z^kc_1^{-1})\,dz = 0, \tag{4.23} \]
where $L_y$ is the boundary of $\Omega_y$.

5. Recovery of A(x) Modulo Gauge Transformation

Denote $C_R=\{z:|z|=R\}$ and $D_+=\{z:|z|\le R\}$, $D_-=\{z:|z|\ge R\}$. Let $H^+(D_+)$ be the space of functions analytic for $|z|<R$ and having a limiting value on $|z|=R$ that belongs to $L_2(C_R)$. Let $L_2^+(C_R)\subset L_2(C_R)$ be the subspace of the limiting values of $f(z)\in H^+(D_+)$ on $C_R$. Note that $\Pi^+f$ given by (3.32) belongs to $H^+(D_+)$ for any $f\in L_2(C_R)$ and $\Pi^+f\in L_2^+(C_R)$ when $z\in C_R$. We shall keep the notation $\Pi^+f$ both for the analytic function on $D_+$ and for its limiting values on $C_R$. Since $\bar\Omega_y$ is contained in the interior of $D_+$ we have that $\Pi^+f$ is analytic in $\Omega_y$ and on $L_y=\partial\Omega_y$. Since polynomials are dense in $L_2^+(C_R)$, (4.23) implies that
\[ \int_{L_y}(\Pi^+z^pc_2)\,c_2^{-1}c_1\,\Pi^+(c_1^{-1}b_+)\,dz = 0 \tag{5.1} \]
for any $b_+\in L_2^+(C_R)$. The operator $\Pi^+c_1^{-1}$ is a Fredholm operator in $L_2^+(C_R)$ with index $\kappa=\frac{1}{2\pi}\Delta\arg\det c_1^{-1}\big|_{|z|=R}$. Note that $\kappa=0$ since $\det c_1^{-1}\neq 0$ for $|z|\le R$. If $\ker\Pi^+c_1^{-1}=0$ then $\Pi^+c_1^{-1}$ is invertible in $L_2^+(C_R)$. Therefore for any $b_+^{(1)}\in L_2^+(C_R)$ we have
\[ \int_{L_y}(\Pi^+z^pc_2)\,c_2^{-1}c_1\,b_+^{(1)}\,dz = 0. \tag{5.2} \]
In particular we have
\[ \int_{L_y}(\Pi^+z^pc_2)\,c_2^{-1}c_1z^k\,dz = 0,\qquad\forall k\ge 0. \tag{5.3} \]
We shall show that (5.3) always holds.

Lemma 5.1. There exists $Q_+(z)$ analytic for $|z|\le R+\delta$, $\det Q_+\neq 0$ for $|z|\le R+\delta$, such that the operator $\Pi^+Q_+c_1^{-1}$ is invertible in $L_2^+(C_R)$. Here $y=y_0$ is fixed.

Proof. Consider the Fredholm operator $\Pi^+c_1$. The index of $\Pi^+c_1$ is equal to zero. It was proven by Gohberg and M. Krein ([GK]) that in any neighborhood of the matrix $c_1(y_0,z)$ there exists a matrix $a(z)$ such that $\Pi^+a$ is invertible. If $\Pi^+a$ is invertible then the matrix $a(z)$ admits the following factorization on $C_R$ (see [GK]):
\[ a(z) = Q_-(z)Q_+(z), \tag{5.4} \]
where $Q_+(z)$ is analytic for $|z|<R$ and continuous for $|z|\le R$, $Q_-(z)$ is analytic when $|z|>R$, continuous when $|z|\ge R$ and $Q_-(\infty)=I$. Moreover $\det Q_+\neq 0$ when $|z|\le R$ and $\det Q_-\neq 0$ when $|z|\ge R$. We can replace $Q_+(z)$ by its polynomial approximation. This will lead to a slight perturbation of $a(z)$ and we will have that $\det Q_+\neq 0$ for $|z|\le R+\delta$. We claim that the operator $\Pi^+Q_+c_1^{-1}$ is invertible in $L_2^+(C_R)$. It is enough to prove that $\ker\Pi^+Q_+c_1^{-1}=0$. Suppose $\Pi^+Q_+c_1^{-1}w_+=0$ for some $w_+\in L_2^+(C_R)$. Applying $\Pi^+Q_-$ we get
\[ \Pi^+Q_-\Pi^+Q_+c_1^{-1}w_+ = 0. \tag{5.5} \]
Denote $\Pi^-f=f-\Pi^+f$. Note that $\Pi^-f$ is analytic in $D_-$ and $(\Pi^-f)(\infty)=0$. Since $\Pi^+Q_-\Pi^-Q_+c_1^{-1}w_+=0$ we obtain
\[ \Pi^+ac_1^{-1}w_+ = 0. \tag{5.6} \]
Since $a$ is close to $c_1$ the matrix $ac_1^{-1}$ is close to the identity. Therefore the operator $\Pi^+ac_1^{-1}$ is invertible and therefore $w_+=0$.

Replace the matrix $c_1(y,z)$ by $c_1'(y,z)=c_1(y,z)Q_+^{-1}(z)$ for all $y\in Y$. Then repeating the constructions of Sect. 3 and Sect. 4 with $c_1'(y,z)$ instead of $c_1$ we get
\[ \int_{L_y}(\Pi^+z^pc_2)\,c_2^{-1}c_1Q_+^{-1}\,(\Pi^+Q_+c_1^{-1}z^k)\,dz = 0. \tag{5.7} \]
Since )+ Q+ c1−1 is invertible in L+ 2 (CR ) we get as in (5.2) Ly
(1)
()+ zp c2 )c2−1 c1 Q−1 + b+ dz = 0
(5.8)
(1)
−1 for any b+ ∈ L+ 2 (CR ) and y = y0 . Since multiplication by Q+ is an isomorphism in + L2 (CR ) we get from (5.8) that
Ly
()+ zp c2 )c2−1 c1 zk dz = 0,
∀ k ≥ 0.
Analogously, starting from (5.3) we get that for $y = y_0$
$$\int_{L_y} z^p\, c_2^{-1} c_1\, z^k\, dz = 0, \quad \forall\, p \ge 0,\ \forall\, k \ge 0. \tag{5.9}$$
It follows from (5.9) that
$$c_2^{-1}(y_0, z)\, c_1(y_0, z) = P(y_0, z) \in L_2^+(L_{y_0}), \tag{5.10}$$
i.e. $P(y_0, z)$ extends as an analytic matrix from $L_{y_0}$ to $\Omega_{y_0}$. Note that (5.10) holds for all $y \in Y$. We shall use a rich enough subset $Y^{(0)}$ of $Y$ to conclude that if (5.10) holds for any $y \in Y^{(0)}$ then $A_1$ and $A_2$ are gauge equivalent.

Denote $\xi = \mu + i\nu \in \mathbb{C}^n$, $\xi = (\xi_1, \ldots, \xi_n)$. We have $\xi \cdot \xi = \xi_1^2 + \cdots + \xi_n^2 = (\mu + i\nu) \cdot (\mu + i\nu) = 0$, since $\mu \cdot \nu = 0$ and $|\mu|^2 = |\nu|^2$. If $\alpha \in \mathbb{C}$ then $(\alpha\xi) \cdot (\alpha\xi) = 0$, $\alpha\xi = \mu' + i\nu'$, and $\mu' + i\nu'$ defines the same oriented two-dimensional subspace $\pi$ as $\mu + i\nu$. Denote by $G_j^\pm$, $3 \le j \le n$, the following two-dimensional submanifolds of $G_n$: $\xi = \xi_j^\pm(t)$, where
$$\xi_{j1}^\pm(t) = \tfrac{1}{2}\Bigl(t - \tfrac{1}{t}\Bigr), \quad \xi_{j2}^\pm(t) = \tfrac{i}{2}\Bigl(t + \tfrac{1}{t}\Bigr), \quad t \in \mathbb{C} \setminus \{0\}, \quad \xi_{jk}^\pm(t) = \pm\delta_{jk}, \quad 3 \le k \le n.$$
Note that $(\xi_{j1}^\pm(t))^2 + \cdots + (\xi_{jn}^\pm(t))^2 = 0$, i.e. $G_j^\pm \subset G_n$. When $t \to \infty$ we have
$$\Bigl(\tfrac{\xi_{j1}^\pm(t)}{t}, \tfrac{\xi_{j2}^\pm(t)}{t}, \ldots, \tfrac{\xi_{jn}^\pm(t)}{t}\Bigr) \to \bigl(\tfrac{1}{2}, \tfrac{i}{2}, 0, \ldots, 0\bigr) = \mu_1 + i\nu_1 \in G_n,$$
and when $t \to 0$ we have
$$\bigl(t\xi_{j1}^\pm(t), t\xi_{j2}^\pm(t), \ldots, t\xi_{jn}^\pm(t)\bigr) \to \bigl(-\tfrac{1}{2}, \tfrac{i}{2}, 0, \ldots, 0\bigr) = \mu_2 + i\nu_2 \in G_n.$$
We add $\mu_1 + i\nu_1$ and $\mu_2 + i\nu_2$ to $G_j^\pm$. Therefore the Riemann sphere $\mathbb{C}^*$ gives the parameterization of $G_j^\pm$. Note that $\mu_1 + i\nu_1$ and $\mu_2 + i\nu_2$ are the only common points of $G_j^\pm$, $3 \le j \le n$. Denote by $G^{(0)}$ the union $(\cup_{j=3}^n G_j^+) \cup (\cup_{j=3}^n G_j^-)$. The set $G^{(0)}$ has the following property:

Lemma 5.2. If $(\mu + i\nu) \cdot a = b$ for all $\mu + i\nu \in G^{(0)}$ then $a = 0$ and $b = 0$. Here $a = (a_1, \ldots, a_n) \in \mathbb{C}^n$, $b \in \mathbb{C}$.

Proof. If $\mu_j^+ + i\nu_j^+ \in G_j^+$ and $\mu_j^- + i\nu_j^- \in G_j^-$ are the points of $G^{(0)}$ corresponding to the same value of $t \in \mathbb{C} \setminus \{0\}$ then $(\mu_j^+ + i\nu_j^+) - (\mu_j^- + i\nu_j^-) = (0, \ldots, 2, \ldots, 0)$. Therefore we have $2a_j = 0$, $3 \le j \le n$. Since $a_j = 0$ for $j \ge 3$ we get, changing $t$ to $-t$: $-(\mu + i\nu) \cdot a = b$. Therefore $b = 0$. Finally, $\xi_1(t)a_1 + \xi_2(t)a_2 = 0$ for all $t \in \mathbb{C} \setminus \{0\}$ implies $a_1 = a_2 = 0$.
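The two facts about the curves $\xi_j^\pm(t)$ used above (that they lie in $G_n$, and that they suffice to separate the components of $a$ in Lemma 5.2) can be checked by direct computation. The following is a short worked verification under the definitions given before Lemma 5.2; the symbol $e_j$ for the $j$-th standard basis vector of $\mathbb{C}^n$ is notation introduced here for illustration.

```latex
% (1) Null condition \xi_j^\pm(t)\cdot\xi_j^\pm(t) = 0.
\begin{align*}
\bigl(\xi_{j1}^\pm\bigr)^2 + \bigl(\xi_{j2}^\pm\bigr)^2
  &= \tfrac14\Bigl(t-\tfrac1t\Bigr)^2 - \tfrac14\Bigl(t+\tfrac1t\Bigr)^2 = -1, \\
\sum_{k=3}^{n}\bigl(\xi_{jk}^\pm\bigr)^2 &= (\pm 1)^2 = 1
  \quad\Longrightarrow\quad \xi_j^\pm(t)\cdot\xi_j^\pm(t) = -1 + 1 = 0 .
\end{align*}
% (2) Same t on G_j^+ and G_j^-: only the j-th component differs.
\begin{align*}
\xi_j^+(t) - \xi_j^-(t) = 2e_j
  \quad\Longrightarrow\quad 2a_j = b - b = 0, \qquad 3 \le j \le n .
\end{align*}
% (3) With a_j = 0 for j >= 3, replacing t by -t flips the sign of the
%     first two components, so -b = b and hence b = 0.
% (4) The remaining identity, expanded in powers of t:
\begin{align*}
\tfrac12\Bigl(t-\tfrac1t\Bigr)a_1 + \tfrac{i}{2}\Bigl(t+\tfrac1t\Bigr)a_2
  = \tfrac{t}{2}\,(a_1 + i a_2) + \tfrac{1}{2t}\,(-a_1 + i a_2) = 0
  \quad \forall\, t \neq 0
\end{align*}
% forces a_1 + i a_2 = 0 and -a_1 + i a_2 = 0, i.e. a_1 = a_2 = 0.
```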
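Denote
$$c_2'(y_0, z) = c_2(y_0, z)\,P(y_0, z), \tag{5.11}$$
where $P(y_0, z)$ is the same as in (5.10). Note that $c_2'(y_0, z)$ satisfies (4.20) in $\Omega_{y_0}$ since $P(y_0, z)$ is analytic in $\Omega_{y_0}$. We have:
$$c_1(y, z) = c_2'(y, z) \tag{5.12}$$
for all $z \in L_y$, $y \in Y$. Therefore
$$c_1(x, \mu + i\nu) = c_2'(x, \mu + i\nu) \tag{5.13}$$
for all $\mu + i\nu \in G_n$, $x \in \partial\Omega$. Consider $\mu + i\nu \in G_j^+$, i.e. $\mu + i\nu = \xi_j^+(t)$, $t \in \mathbb{C} \setminus \{0\}$, $3 \le j \le n$. Since $c_1(x, \xi_j^+(t))$ and $c_2'(x, \xi_j^+(t))$ satisfy
$$i\,\frac{\partial c_1(x, \xi_j^+(t))}{\partial x} \cdot \xi_j^+(t) = \bigl(A_1(x) \cdot \xi_j^+(t)\bigr)\, c_1(x, \xi_j^+(t)), \quad x \in \overline{\Omega}, \tag{5.14}$$
$$i\,\frac{\partial c_2'(x, \xi_j^+(t))}{\partial x} \cdot \xi_j^+(t) = \bigl(A_2(x) \cdot \xi_j^+(t)\bigr)\, c_2'(x, \xi_j^+(t)), \quad x \in \overline{\Omega}, \tag{5.15}$$
respectively, we get, applying $\frac{\partial}{\partial \bar t}$ to (5.14), (5.15),
$$i\,\frac{\partial c_{1\bar t}}{\partial x} \cdot \xi_j^+(t) = \bigl(A_1(x) \cdot \xi_j^+(t)\bigr)\, c_{1\bar t}(x, \xi_j^+(t)), \tag{5.16}$$
$$i\,\frac{\partial c_{2\bar t}'}{\partial x} \cdot \xi_j^+(t) = \bigl(A_2(x) \cdot \xi_j^+(t)\bigr)\, c_{2\bar t}'(x, \xi_j^+(t)), \tag{5.17}$$
where $c_{\bar t}$ means $\frac{\partial c}{\partial \bar t}$. Denote
$$M_{j1}^+(x, t) = c_1^{-1}(x, \xi_j^+(t))\, c_{1\bar t}(x, \xi_j^+(t)), \quad x \in \overline{\Omega}, \tag{5.18}$$
$$M_{j2}^+(x, t) = \bigl(c_2'(x, \xi_j^+(t))\bigr)^{-1} c_{2\bar t}'(x, \xi_j^+(t)), \quad x \in \overline{\Omega}. \tag{5.19}$$
Then
$$c_{1\bar t}(x, \xi_j^+(t)) = c_1(x, \xi_j^+(t))\, M_{j1}^+, \tag{5.20}$$
$$c_{2\bar t}'(x, \xi_j^+(t)) = c_2'(x, \xi_j^+(t))\, M_{j2}^+. \tag{5.21}$$
Differentiating (5.18) and (5.19) in $x$ and using (5.14), (5.16) and (5.15), (5.17) we get that $M_{j1}^+$, $M_{j2}^+$ satisfy
$$\frac{\partial M_{j1}^+}{\partial x} \cdot \xi_j^+(t) = 0, \quad \frac{\partial M_{j2}^+}{\partial x} \cdot \xi_j^+(t) = 0, \quad x \in \overline{\Omega}. \tag{5.22}$$
It follows from (5.13) that
$$\frac{\partial c_1(x, \xi_j^+(t))}{\partial \bar t} = \frac{\partial c_2'(x, \xi_j^+(t))}{\partial \bar t} \quad \text{for } x \in \partial\Omega. \tag{5.23}$$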
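Therefore
$$M_{j1}^+(x, t) = M_{j2}^+(x, t) \tag{5.24}$$
for $t \in \mathbb{C} \setminus \{0\}$, $x \in \partial\Omega$. Since $M_{j1}^+$ and $M_{j2}^+$ are analytic in $z = \frac{1}{|\mu|}(x \cdot \xi_j^+(t))$ for each $y = (\xi_j^+(t), x_\perp)$, $z \in \Omega_y$, and (5.24) holds when $z \in L_y$, we get that (5.24) holds for all $x \in \overline{\Omega}$. Fix arbitrary $x \in \overline{\Omega}$ and differentiate $c_2'(x, \xi_j^+(t))\, c_1^{-1}(x, \xi_j^+(t))$ in $\bar t$. We get
$$\frac{\partial}{\partial \bar t}\bigl(c_2' c_1^{-1}\bigr) = \frac{\partial c_2'}{\partial \bar t}\, c_1^{-1} - c_2' c_1^{-1}\, \frac{\partial c_1}{\partial \bar t}\, c_1^{-1} = c_2' M_{j2}^+ c_1^{-1} - c_2' M_{j1}^+ c_1^{-1} = 0, \tag{5.25}$$
i.e. $c_2' c_1^{-1}$ is analytic in $t$. Since $c_2' c_1^{-1}$ is bounded on $G_j^+$ we get by the Liouville theorem that $c_2' c_1^{-1}$ is constant on $G_j^+$. Analogously, $c_2' c_1^{-1}$ is constant on each $G_j^+$ and $G_j^-$, $3 \le j \le n$. Since the $G_j^\pm$ have the common points $\mu_1 + i\nu_1$ and $\mu_2 + i\nu_2$ we get that $c_2' c_1^{-1} = g(x)$ is constant on $G^{(0)}$. Thus $c_2'(x, \mu + i\nu) = g(x)\, c_1(x, \mu + i\nu)$ for all $x \in \overline{\Omega}$ and $\mu + i\nu \in G^{(0)}$. We have
$$A_2(x) \cdot (\mu + i\nu) = i\,\frac{\partial c_2'}{\partial x} \cdot (\mu + i\nu)\,(c_2')^{-1} = i\Bigl(\frac{\partial g}{\partial x}\, c_1 + g\,\frac{\partial c_1}{\partial x}\Bigr) \cdot (\mu + i\nu)\, c_1^{-1} g^{-1} = i\,\frac{\partial g}{\partial x} \cdot (\mu + i\nu)\, g^{-1} + g\, A_1(x) \cdot (\mu + i\nu)\, g^{-1}.$$
Therefore
$$\Bigl(A_2(x) - g(x) A_1(x) g^{-1}(x) - i\,\frac{\partial g(x)}{\partial x}\, g^{-1}(x)\Bigr) \cdot (\mu + i\nu) = 0$$
for all $\mu + i\nu \in G^{(0)}$. It follows from Lemma 5.2 that
$$A_2(x) = g A_1 g^{-1} + i\,\frac{\partial g}{\partial x}\, g^{-1},$$
i.e. $A_2(x)$ is gauge equivalent to $A_1(x)$. Since $c_2' = c_1$ for $x \in \partial\Omega$ we have that $g(x) = I$ for $x \in \partial\Omega$.

6. Recovery of V(x) Modulo the Gauge Transformation

After we have proved that $A_1$ is gauge equivalent to $A_2(x)$ we can fix the gauge $g(x)$ such that
$$A_1 = A_2. \tag{6.1}$$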
It follows from (4.12) that if $A_1 = A_2$ then
$$\bigl((V_2(x) - V_1(x))\, u_1, u_2^*\bigr) = 0. \tag{6.2}$$
Therefore, substituting in (6.2) the same solutions that we substituted in (4.12) and taking the limit when $\tau \to \infty$ we get, analogously to (4.18),
$$\int_{\Omega_y} (\Pi^+ z^p c_2)\, c_2^{-1} (V_2 - V_1)\, c_1\, (\Pi^+ c_1^{-1} z^k)\, dx_1 dx_2 = 0 \tag{6.3}$$
for all $y = (\mu_0 + i\nu_0, x_\perp)$ and all $p \ge 0$, $k \ge 0$. Note that we extended $V_1$, $V_2$ by zero outside of $\Omega$. If the Toeplitz operators $\Pi^+ c_1^{-1}$ and $\Pi^+ c_2$ are invertible in $L_2^+(C_R)$ then, as in Sect. 5, we get:
$$\int_{\mathbb{R}^2} z^p\, c_2^{-1} (V_2 - V_1)\, c_1\, z^k\, dx_1 dx_2 = 0, \quad \forall\, p \ge 0,\ \forall\, k \ge 0. \tag{6.4}$$
If those operators are not invertible we can use Lemma 5.1 to show that (6.4) still holds.

Let $\xi(t)$ be analytic in $t$, $\xi(t) = \mu + i\nu$, $\mu \in \mathbb{R}^n$, $\nu \in \mathbb{R}^n$, $\mu \cdot \nu = 0$, $|\mu| = |\nu| = \frac{1}{\sqrt{2}}|\xi(t)|$. Denote by $P$ the following operator in $\mathbb{R}^n$:
$$Pf = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \frac{\tilde f(\eta)\, e^{ix \cdot \eta}\, d\eta}{i\eta \cdot \xi(t)}. \tag{6.5}$$
We have
$$\frac{\partial Pf}{\partial x} \cdot (\mu + i\nu) = f, \quad \forall\, f. \tag{6.6}$$
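The identity (6.6) can be seen directly on the Fourier side. The following is a formal sketch, assuming the convention $\tilde f(\eta) = \int_{\mathbb{R}^n} f(x)\, e^{-ix\cdot\eta}\, dx$ (which matches the prefactor in (6.5)) and assuming $f$ is regular enough to differentiate under the integral sign; neither assumption is spelled out at this point of the text.

```latex
\begin{align*}
\frac{\partial}{\partial x}\bigl(e^{ix\cdot\eta}\bigr)\cdot\xi(t)
  &= \sum_{k=1}^{n} \xi_k(t)\,\frac{\partial}{\partial x_k}\, e^{ix\cdot\eta}
   = i\,\bigl(\eta\cdot\xi(t)\bigr)\, e^{ix\cdot\eta}, \\
\frac{\partial Pf}{\partial x}\cdot\xi(t)
  &= \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}
     \frac{\tilde f(\eta)\; i\,\bigl(\eta\cdot\xi(t)\bigr)\, e^{ix\cdot\eta}}
          {i\,\eta\cdot\xi(t)}\, d\eta
   = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} \tilde f(\eta)\, e^{ix\cdot\eta}\, d\eta
   = f(x).
\end{align*}
% The symbol i\,\eta\cdot\xi(t) of the operator (\partial/\partial x)\cdot\xi(t)
% cancels the denominator in (6.5); the last step is Fourier inversion.
```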
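Let $(x_1, x_2, x_\perp)$ be coordinates in $\mathbb{R}^n$, where $x_1 = x \cdot \frac{\mu}{|\mu|}$, $x_2 = x \cdot \frac{\nu}{|\nu|}$, $x_\perp = x - x_1 \frac{\mu}{|\mu|} - x_2 \frac{\nu}{|\nu|}$. Denote by $f_1(x_1, x_2, x_\perp)$ the function $f(x)$ in the new coordinates, i.e. $f_1(x_1, x_2, x_\perp) = f\bigl(x_1 \frac{\mu}{|\mu|} + x_2 \frac{\nu}{|\nu|} + x_\perp\bigr)$. Then we have
$$(Pf)_1(x_1, x_2, x_\perp) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \frac{\tilde f_1(\eta_1, \eta_2, \eta_\perp)\, e^{i(x_1\eta_1 + x_2\eta_2 + x_\perp \cdot \eta_\perp)}}{i|\mu|(\eta_1 + i\eta_2)}\, d\eta_1 d\eta_2 d\eta_\perp = \frac{1}{2|\mu|}\, \Pi f_1, \tag{6.7}$$
where $\Pi$ is the Cauchy operator
$$(\Pi f_1)(z, x_\perp) = \frac{1}{\pi} \int_{\mathbb{R}^2} \frac{f_1(w, x_\perp)\, dy_1 dy_2}{z - w}, \tag{6.8}$$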
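$z = x_1 + ix_2$, $w = y_1 + iy_2$.

Lemma 6.1. The following formula holds:
$$\frac{\partial}{\partial \bar t}\, Pf = \frac{1}{(2\pi)^{n-2}} \int_{\mathbb{R}^{n-2}} \tilde f_1(0, 0, \eta_\perp)\, \alpha(\eta_\perp, t)\, e^{ix_\perp \cdot \eta_\perp}\, d\eta_\perp,$$
where $\alpha(\eta_\perp, t) = \frac{1}{4\pi i}\bigl(\eta_\perp \cdot \frac{\partial \bar\xi(t)}{\partial \bar t}\bigr)$.

Proof. We have $Pf = I_1 + I_2$, where
$$I_1 = \frac{1}{(2\pi)^n} \int_{|\eta \cdot \xi(t)| < \varepsilon} \frac{\tilde f(\eta)\, e^{ix \cdot \eta}\, d\eta}{i\eta \cdot \xi(t)}, \quad I_2 = Pf - I_1. \tag{6.9}$$
Denote $g(\eta) = \tilde f(\eta)\, e^{ix \cdot \eta}$. We have, by denoting $\eta_1' = |\mu|\eta_1$, $\eta_2' = |\nu|\eta_2$,
$$I_1 = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^{n-2}} \int_{(\eta_1')^2 + (\eta_2')^2 \le \varepsilon^2} \frac{g\bigl(\eta_1' \frac{\mu}{|\mu|^2} + \eta_2' \frac{\nu}{|\nu|^2} + \eta_\perp\bigr)\, d\eta_1' d\eta_2' d\eta_\perp}{i(\eta_1' + i\eta_2')\,|\mu|^2}. \tag{6.10}$$
Taking the derivative in $\bar t$ and then the limit when $\varepsilon \to 0$ we get $\lim_{\varepsilon \to 0} \frac{\partial}{\partial \bar t} I_1 = 0$. Since
$$I_2 = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \frac{g(\eta)\, \theta(|\xi(t) \cdot \eta| - \varepsilon)\, d\eta}{i\eta \cdot \xi(t)} \tag{6.11}$$
and since $\frac{\partial \xi(t)}{\partial \bar t} = 0$ we obtain
$$\frac{\partial I_2}{\partial \bar t} = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \frac{\partial \theta(|\xi(t) \cdot \eta| - \varepsilon)}{\partial \bar t}\, \frac{g(\eta)\, d\eta}{i\eta \cdot \xi(t)}. \tag{6.12}$$
Note that
$$\frac{\partial \theta(|\xi(t) \cdot \eta| - \varepsilon)}{\partial \bar t} = \delta(|\xi(t) \cdot \eta| - \varepsilon) \Bigl[(\eta \cdot \mu)\Bigl(\eta \cdot \frac{\partial \mu}{\partial \bar t}\Bigr) + (\eta \cdot \nu)\Bigl(\eta \cdot \frac{\partial \nu}{\partial \bar t}\Bigr)\Bigr]\, |\xi(t) \cdot \eta|^{-1}. \tag{6.13}$$
Therefore, introducing coordinates $\hat\eta_1 = \varepsilon^{-1}|\mu|\eta_1$, $\hat\eta_2 = \varepsilon^{-1}|\nu|\eta_2$, $\eta_\perp = \eta_\perp$ we get
$$\frac{\partial I_2}{\partial \bar t} = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^{n-2}} \int_{\hat\eta_1^2 + \hat\eta_2^2 = 1} \frac{g_1\bigl(\varepsilon|\mu|^{-1}\hat\eta_1, \varepsilon|\mu|^{-1}\hat\eta_2, \eta_\perp\bigr)}{i\varepsilon(\hat\eta_1 + i\hat\eta_2)} \cdot \Bigl(\hat\eta_1\, \eta \cdot \frac{\partial \mu}{\partial \bar t} + \hat\eta_2\, \eta \cdot \frac{\partial \nu}{\partial \bar t}\Bigr)\, \varepsilon\, d\varphi\, d\eta_\perp, \tag{6.14}$$
where $d\varphi$ is the arc length element of the unit circle. Since $\xi(t) = \mu + i\nu$ we have $0 = \frac{\partial \mu}{\partial \bar t} + i\,\frac{\partial \nu}{\partial \bar t}$. Therefore
$$\lim_{\varepsilon \to 0} \frac{\partial I_2}{\partial \bar t} = \frac{1}{(2\pi)^{n-2}} \int_{\mathbb{R}^{n-2}} \tilde f_1(0, 0, \eta_\perp)\, \alpha(\eta_\perp, t)\, e^{ix_\perp \cdot \eta_\perp}\, d\eta_\perp, \tag{6.15}$$
where
$$\alpha(\eta_\perp, t) = \frac{1}{2\pi i}\Bigl(\eta_\perp \cdot \frac{\partial \mu}{\partial \bar t}\Bigr) = \frac{1}{4\pi i}\Bigl(\eta_\perp \cdot \frac{\partial \bar\xi(t)}{\partial \bar t}\Bigr). \tag{6.16}$$
We used that $\mu = \frac{1}{2}(\xi(t) + \bar\xi(t))$ and $\frac{\partial \mu}{\partial \bar t} = \frac{1}{2}\frac{\partial \bar\xi}{\partial \bar t}$.

Denote
$$B(x, \xi_j^\pm(t)) = P\bigl(c_2^{-1}(V_1 - V_2)\, c_1\bigr). \tag{6.17}$$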
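It follows from Lemma 6.1, (6.4) with $k = p = 0$ and (5.20), (5.21) that
$$\frac{\partial B}{\partial \bar t} = P\Bigl(\frac{\partial c_2^{-1}}{\partial \bar t}(V_1 - V_2)\, c_1 + c_2^{-1}(V_1 - V_2)\frac{\partial c_1}{\partial \bar t}\Bigr) = \frac{1}{2|\mu|}\, \Pi\Bigl(-M_{j2}^+ c_2^{-1}(V_1 - V_2)\, c_1 + c_2^{-1}(V_1 - V_2)\, c_1 M_{j1}^+\Bigr). \tag{6.18}$$
We used in (6.18) that when $f = c_2^{-1}(V_1 - V_2)\, c_1$ we have
$$\tilde f_1(0, 0, \eta_\perp) = \int_{\mathbb{R}^{n-2}} \Bigl(\int_{\mathbb{R}^2} c_2^{-1}(V_1 - V_2)\, c_1\, dx_1 dx_2\Bigr) e^{-ix_\perp \cdot \eta_\perp}\, dx_\perp = 0.$$
It follows from (5.22) that $M_{j2}^+$ and $M_{j1}^+$ are analytic in $\xi(t) \cdot x$. We shall prove that (6.4) implies that
$$\Pi\bigl(M_{j2}^+ c_2^{-1}(V_1 - V_2)\, c_1\bigr) = M_{j2}^+\, \Pi\bigl(c_2^{-1}(V_1 - V_2)\, c_1\bigr) \tag{6.19}$$
and
$$\Pi\bigl(c_2^{-1}(V_1 - V_2)\, c_1 M_{j1}^+\bigr) = \Pi\bigl(c_2^{-1}(V_1 - V_2)\, c_1\bigr)\, M_{j1}^+. \tag{6.20}$$
Indeed, denote $c_2^{-1}(V_1 - V_2)\, c_1 = b$. Then
$$M_{j2}^+\, \Pi b - \Pi M_{j2}^+ b = \frac{1}{\pi} \int_{\mathbb{R}^2} \frac{M_{j2}^+(y, z)\, b(y, w) - M_{j2}^+(y, w)\, b(y, w)}{z - w}\, dy_1 dy_2.$$
Since $M_{j2}^+$ is analytic in $z$ when $|z| < 2R$,
$$M_{j2}^+(y, z) - M_{j2}^+(y, w) = (z - w) \sum_{p=0}^{\infty} M_{pj2}(y, z)\, w^p$$
when $|w| \le R$, $|z| < R$. Therefore
$$M_{j2}^+\, \Pi b - \Pi M_{j2}^+ b = \frac{1}{\pi} \sum_{p=0}^{\infty} M_{pj2}(y, z) \int_{|w| \le R} w^p\, b(y, w)\, dy_1 dy_2 = 0,$$
because of (6.4). Analogously we can prove (6.20). Therefore, combining (6.18), (6.19) and (6.20) we get
$$\frac{\partial B}{\partial \bar t} = -M_{j2}^+ B + B M_{j1}^+. \tag{6.21}$$
Denote
$$B_1 = c_2 B c_1^{-1}. \tag{6.22}$$
Then
$$\frac{\partial B_1}{\partial \bar t} = \frac{\partial c_2}{\partial \bar t}\, B c_1^{-1} + c_2\, \frac{\partial B}{\partial \bar t}\, c_1^{-1} + c_2 B\, \frac{\partial c_1^{-1}}{\partial \bar t} = c_2 M_{j2}^+ B c_1^{-1} + c_2\bigl(-M_{j2}^+ B + B M_{j1}^+\bigr) c_1^{-1} + c_2 B\bigl(-M_{j1}^+ c_1^{-1}\bigr) = 0.$$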
Therefore $B_1$ is analytic in $\mathbb{C} \setminus \{0\}$ for arbitrary $x \in \Omega$. It follows from (5.14) that $c_1(x, \xi_j^\pm(t))$ depends on $\frac{\xi_j^+(t)}{|\xi_j^+(t)|}$ only. Therefore $c_1(x, \xi_j^+(t))$ and $c_1^{-1}(x, \xi_j^+(t))$ are bounded on $\mathbb{C} \setminus \{0\}$, since $|\xi_j^+(t)| \to \infty$ iff $|t| \to 0$ or $|t| \to \infty$. The same is true for $c_2^{\pm 1}(x, \xi_j^+(t))$. It follows from (6.7) that $Pf \to 0$ when $|\mu| = \frac{1}{\sqrt{2}}|\xi_j^+(t)| \to \infty$, assuming that $f$ is bounded with respect to $|\xi_j^+(t)|$. Therefore $B_1 = c_2\bigl(P(c_2^{-1}(V_1 - V_2)\, c_1)\bigr) c_1^{-1}$ tends to 0 when $|t| \to \infty$ and when $|t| \to 0$. Since $B_1$ is analytic in $\mathbb{C} \setminus \{0\}$ we have by the Liouville theorem that $B = P(c_2^{-1}(V_1 - V_2)\, c_1) = 0$. From $\frac{\partial B}{\partial x} \cdot \xi_j^+(t) = c_2^{-1}(V_1 - V_2)\, c_1$ we conclude that $c_2^{-1}(V_1 - V_2)\, c_1 = 0$. Therefore $V_1 - V_2 = 0$.

Remark 6.1. In this remark we shall describe the simplifications in the proofs of Theorem 1.2 when Condition (A) is satisfied. Note that Condition (A) allows one to treat the inverse scattering problem in the case of $A(x)$ and $V(x)$ with noncompact supports, provided that $A(x)$, $V(x)$ decay exponentially.

Assume that there exists a solution $c(x, \mu + i\nu)$ of Eq. (2.1) in $\mathbb{R}^n$ for all $\mu + i\nu \in G_n$ and such that $c - I = O(\frac{1}{|x|})$. It was proven in [ER3] that the scattering amplitude $a(\theta, \omega, k)$ given for fixed $k > 0$ and all $|\theta| = |\omega| = 1$ determines
$$I(\eta_\perp) = \int_{\mathbb{R}^n} c^{-1}(x, \mu + i\nu)\, A(x) \cdot (\mu + i\nu)\, e^{-ix \cdot \eta_\perp}\, dx \tag{6.23}$$
for all $\eta_\perp$ orthogonal to $\mu$ and $\nu$ (see formula (2.29) in [ER3]). The limit (6.23) was obtained in [ER3] using basically the Lebesgue dominated convergence theorem (cf. the proof of Lemma 3.1). We can reprove (6.23) using a refinement of Lemma 3.1, which allows us to weaken the requirements on $A(x)$. It follows from Lemma 3.1, Remark 3.2 and the fact that $c - I = O(\frac{1}{|z|})$ that
$$\bigl\|(1 + |x|^2)^N (c - c_0) f\bigr\|_{L_{2,\rho}} \to 0 \tag{6.24}$$
when $\tau \to \infty$, $\forall N$, assuming that $f \in L_{2,2N}$. When $N > \frac{n+1}{4}$ the Cauchy–Schwarz inequality implies that $\|(c - c_0) f\|_{L_1(\mathbb{R}^n)} \to 0$ when $\tau \to \infty$. Therefore the Fourier transform $F((c - c_0)f)(\xi) \to 0$ uniformly in $\xi \in \mathbb{R}^n$. It follows from [ER1, ER3] that F g(η)|τ , where τ is the surface η · η + 2η · (ζ + iτ ν0 ) = 0, is determined by the scattering amplitude at fixed energy $k^2$. Here $g$ is the same as in (3.9) with $A_2 = 0$, $\psi_2 = \psi_3 \equiv 1$. Applying $F c^{-1}$ to (3.9), restricting to τ and taking the limit we get (6.23). Note that τ → {η : η1 = η2 = 0} when τ → ∞. Since $c^{-1}$ satisfies
$$i\,\frac{\partial c^{-1}}{\partial x} \cdot (\mu + i\nu) = -c^{-1}(x, \mu + i\nu)\, A(x) \cdot (\mu + i\nu) \tag{6.25}$$
and since $c^{-1} - I = O(\frac{1}{|z|})$ we get, applying $P$ to (6.25):
$$c^{-1}(x, \mu + i\nu) - iP\bigl(c^{-1} A(x) \cdot (\mu + i\nu)\bigr) = I, \tag{6.26}$$
since $P\bigl(\frac{\partial c^{-1}}{\partial x} \cdot (\mu + i\nu)\bigr) = P\bigl(\frac{\partial}{\partial x}(c^{-1} - I) \cdot (\mu + i\nu)\bigr) = c^{-1} - I$. Substituting $\mu + i\nu = \xi_j^+(t)$, $3 \le j \le n$, where $\xi_j^+(t)$ are defined in Sect. 5, and differentiating in $\bar t$ we get, using Lemma 6.1:
$$\frac{\partial c^{-1}}{\partial \bar t} - iP\Bigl(\frac{\partial c^{-1}}{\partial \bar t}\, A(x) \cdot \xi_j^+(t)\Bigr) = M(x_\perp), \tag{6.27}$$
where
$$M(x_\perp) = \frac{i}{(2\pi)^{n-2}} \int_{\mathbb{R}^{n-2}} I(\eta_\perp)\, \alpha(\eta_\perp, t)\, e^{ix_\perp \cdot \eta_\perp}\, d\eta_\perp. \tag{6.28}$$
Multiplying (6.26) by $M(x_\perp)$ from the left and using the uniqueness of the solution of the integral equation (6.27) (this is a consequence of Condition (A)) we get
$$\frac{\partial c^{-1}}{\partial \bar t} = M(x_\perp)\, c^{-1}(x, \xi_j^+(t)). \tag{6.29}$$
Note that $M(x_\perp)$ is independent of $x \cdot \xi_j^+(t)$ and it is determined by the scattering amplitude. If we have another Schrödinger equation corresponding to potentials $A'(x)$, $V'(x)$ that has the same scattering amplitude for given $k$, and if we assume that Condition (A) is satisfied for $A'$ as well, then the corresponding solution $(c'(x, \mu + i\nu))^{-1}$ of the equation of the form (6.25) will satisfy an equation of the form (6.29) with the same $M(x_\perp)$. Consider $c' c^{-1}$ and take the derivative in $\bar t$. We get $\frac{\partial (c' c^{-1})}{\partial \bar t} = 0$. The remaining proof is the same as in the end of Sect. 5, starting from (5.25).

Now we consider the recovery of $V(x)$. It was proven in [ER3] (see [ER3], formula (3.43)) that the scattering amplitude determines
$$\int_{\mathbb{R}^n} c^{-1}(x, \mu + i\nu)\, V(x)\, c(x, \mu + i\nu)\, e^{ix \cdot \eta_\perp}\, dx \tag{6.30}$$
for all $\eta_\perp$ orthogonal to $\mu$ and $\nu$. We shall outline the derivation of (6.30). It was shown in [ER1] that the scattering amplitude determines the restriction to τ of the Fourier transform of (79) in [ER1] (see [ER1], p. 222). Applying Lemma 3.2 we recover $J_1 - J_2$, where
$$J_1 = \int_{\mathbb{R}^n} c^{-1}(x, \mu + i\nu)\, V(x)\, e^{-ix \cdot \eta_\perp}\, dx$$
and
$$J_2 = \int_{\mathbb{R}^n} c^{-1}(x, \mu + i\nu)\, V(x)\, c(x, \mu + i\nu) \cdot \frac{i}{2}\, \Pi\bigl(c^{-1}(x, \mu + i\nu)\, A(x) \cdot (\mu + i\nu)\bigr)\, e^{-ix \cdot \eta_\perp}\, dx.$$
As in Remark 3.3 we have $\Pi\bigl(c^{-1} A(x) \cdot (\mu + i\nu)\bigr) = -2i\,\Pi \frac{\partial}{\partial \bar z} c^{-1} = -2i\,\Pi \frac{\partial}{\partial \bar z}(c^{-1} - I) = -2i(c^{-1} - I)$. Therefore $J_1 - J_2$ is equal to (6.30). Therefore if $x = x_1\mu + x_2\nu + x_\perp$, $z = x_1 + ix_2$, $y = (\mu + i\nu, x_\perp)$ we know the integrals
$$\int_{\mathbb{R}^2} c^{-1}(y, z)\, V(x_\perp, z)\, c(y, z)\, dx_1 dx_2.$$
If the Schrödinger operator corresponding to $A'(x)$, $V'(x)$ has the same scattering amplitude for given $k$ then we can fix the gauge such that $A' = A$ and therefore $c' = c$. The scattering amplitude determines
$$\int_{\mathbb{R}^2} c^{-1}(y, z)\, V'(x_\perp, z)\, c(y, z)\, dx_1 dx_2.$$
Therefore we have
$$\int_{\mathbb{R}^2} c^{-1}(y, z)\bigl(V(x_\perp, z) - V'(x_\perp, z)\bigr)\, c(y, z)\, dx_1 dx_2 = 0. \tag{6.31}$$
Denote $B = P(c^{-1}(V - V')c)$. Taking $\mu + i\nu = \xi_j^+(t)$ and differentiating in $\bar t$ we get, using (6.29) and (6.31):
$$\frac{\partial B}{\partial \bar t} = -M(x_\perp)B + BM(x_\perp).$$
We used that $M(x_\perp)$ commutes with $\Pi$ since $M(x_\perp)$ is independent of $z$. Now the continuation of the proof is the same as in the end of Sect. 6, starting from (6.22).

Acknowledgement. The author is grateful to Jim Ralston for many discussions that shaped and clarified this paper.
References

[A] Agmon, S.: Spectral properties of Schrödinger operators and scattering theory. Annali de Pisa, Serie IV 2, 151–218 (1975)
[ER1] Eskin, G. and Ralston, J.: Inverse scattering problem for the Schrödinger equation with magnetic potential at a fixed energy. Commun. Math. Phys. 173, 199–224 (1995)
[ER2] Eskin, G. and Ralston, J.: Inverse scattering problems for Schrödinger operators with magnetic and electric potentials. The IMA Volumes in Mathematics and its Applications, Vol. 90, 147–166 (1997)
[ER3] Eskin, G. and Ralston, J.: Inverse scattering problems for the Schrödinger operators with external Yang–Mills potentials. CRM Proceedings and Lecture Notes, Vol. 12, 91–106 (1997)
[ER4] Eskin, G. and Ralston, J.: Inverse coefficient problems in perturbed half-spaces. Inverse Problems 15, 683–699 (1999)
[HN1] Henkin, G. and Novikov, R.: The ∂̄-equation in the multidimensional inverse scattering problem. Russ. Math. Survey 42, no. 3, 109–180 (1987)
[HN2] Henkin, G. and Novikov, R.: The Yang–Mills fields, the Radon–Penrose transform and the Cauchy–Riemann equations. Several Complex Variables V. Encyclopaedia Math. Sci., Vol. 54, Berlin: Springer-Verlag, 1993, pp. 109–193
[HN3] Henkin, G. and Novikov, R.: A multidimensional inverse problem in quantum and acoustic scattering. Inverse Problems 4, 103–121 (1993)
[Iso] Isozaki, H.: Inverse scattering theory for Dirac operators. Ann. Inst. Henri Poincare, Phys. Theor. 66, 237–270 (1997)
[No] Novikov, R.: On determination of a gauge field on R^d from its non-abelian Radon transform along oriented straight lines. Preprint, Universite de Nantes (1999)
[NT] Nakamura, G., Tsuchida, T.: Uniqueness for an inverse boundary value problem for Dirac operators. Commun. in PDE 25 (7 and 8), 1327–1369 (2000)
[NU1] Nakamura, G., Uhlmann, G.: Identification of Lamé parameters by boundary measurements. Am. J. Math. 115, 1161–1187 (1993)
[NU2] Nakamura, G. and Uhlmann, G.: Global uniqueness for an inverse boundary problem arising in elasticity. Invent. Math. 118, 457–474 (1994)
[OPS] Ola, P., Päivärinta, L. and Somersalo, E.: An inverse boundary value problem in electro-dynamics. Duke Math. J. 70, 617–653 (1993)
[ST] Schrader, R. and Taylor, M.: Small h-asymptotics for quantum partition functions associated to the particles in external Yang–Mills potentials. Commun. Math. Phys. 92, no. 4, 555–594 (1984)
[S] Sylvester, J.: The Cauchy data and the scattering amplitude. Commun. in PDE 19, 1735–1741 (1994)
[W] Weder, R.: Generalized limiting absorption method and multidimensional inverse scattering theory. Math. Meth. in Appl. Sci. 14, 509–524 (1991)
Communicated by B. Simon
Commun. Math. Phys. 222, 533 – 567 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On Long-Time Evolution in General Relativity and Geometrization of 3-Manifolds

Michael T. Anderson
Department of Mathematics, S.U.N.Y. at Stony Brook, Stony Brook, NY 11794-3651, USA. E-mail: [email protected]
Received: 21 August 2000 / Accepted: 28 May 2001

Abstract: This paper introduces relations between the long-time asymptotic behavior of Einstein vacuum space-times and the geometrization of 3-manifolds envisioned by Thurston. The relations are obtained by analysing the asymptotic behavior of a CMC foliation by compact Cauchy surfaces and the induced curve of 3-manifold geometries. The Cheeger–Gromov theory is introduced in this context, and a number of open problems are considered from this viewpoint.

0. Introduction

This paper describes certain relations between the vacuum Einstein evolution equations in general relativity and the geometrization of 3-manifolds. In its simplest terms, these relations arise by analysing the long-time asymptotic behavior of natural space-like hypersurfaces $\Sigma_\tau$, diffeomorphic to a fixed $\Sigma$, in a vacuum space-time. In the best circumstances, the induced asymptotic geometry on $\Sigma_\tau$ would implement the geometrization of the 3-manifold $\Sigma$. We present several results illustrating this relationship, some of which require however rather strong hypotheses. Thus, besides proving these results, a second purpose of this paper is to clarify and make precise what some of the major difficulties are in carrying this program out to a deeper level. In this setting, a number of open problems and conjectures are discussed, some of which are well-known, and are of interest both in general relativity and in 3-manifold geometry.

The idea of geometrizing 3-manifolds, and its proof in many important cases, is due to Thurston. We refer to [40] for an introduction from the point of view of hyperbolic geometry. A survey of geometrization from the point of view of general Riemannian geometries is given in [1], where some early relations with issues in general relativity were also presented. We also refer to [6] and [36] for surveys on topics in general relativity related to this paper.

Partially supported by NSF Grants DMS 9802722 and 0072591
Let $(M, g)$ be a vacuum space-time, i.e. a 4-manifold $M$ with smooth ($C^\infty$) Lorentz metric $g$, of signature $(-, +, +, +)$, satisfying the Einstein vacuum equations
$$\mathrm{Ric}_g = 0. \tag{0.1}$$
We assume that $(M, g)$ is a cosmological space-time in the sense that $M$ admits a compact Cauchy surface $\Sigma$. In particular, $(M, g)$ is then globally hyperbolic and $M$ is diffeomorphic to $\Sigma \times \mathbb{R}$. The space-time $(M, g)$ is the maximal $C^\infty$ vacuum Cauchy development of initial data on $\Sigma$, cf. [31, Ch. 7], [20]. By a result of [15] any compact space-like hypersurface is then a Cauchy surface for $(M, g)$. It is thus natural to seek a preferred Cauchy surface. We assume throughout the paper that $(M, g)$ has a compact Cauchy surface $\Sigma$ with constant mean curvature
$$H = \mathrm{tr}\, K = \mathrm{const} < 0. \tag{0.2}$$
The sign of $H$ is chosen so that $H < 0$ corresponds to expansion of $\Sigma$ in the future direction. It is a well-known conjecture that any vacuum cosmological space-time admits a CMC Cauchy surface, and this is certainly the case for all known examples; cf. [11] for further discussion. We will call a cosmological space-time satisfying (0.2) a CMC cosmological space-time.

One reason for preferring CMC surfaces is that any such surface is unique, for a given value of $H$. Further, any such surface embeds to the past and future in a unique smooth foliation $\mathcal{F} = \{\Sigma_\tau, \tau \in I\}$ of a domain in $(M, g)$ by CMC Cauchy surfaces. Each leaf $\Sigma_\tau$ is diffeomorphic to $\Sigma$ and the function $H : \mathcal{F} \to \mathbb{R}$ is monotone decreasing w.r.t. future proper time, cf. [33]. Let
$$M_{\mathcal{F}} \subset M \tag{0.3}$$
be the maximal domain in $M$ foliated in this way. Thus, the function
$$\tau = H \tag{0.4}$$
defines a natural time function on $M_{\mathcal{F}}$. The range of $\tau : M_{\mathcal{F}} \to \mathbb{R}$ is a connected open interval $I \subset \mathbb{R}$, cf. [26], and hence the subset $M_{\mathcal{F}}$ is a domain, i.e. a connected open subset, of $M$.

The Hawking–Penrose singularity theorem, cf. [31, Ch. 8.2], implies that if $(M, g)$ is a CMC cosmological space-time, then there is a singularity in the finite proper time past of $\Sigma$. A singularity to the past (or future) of $\Sigma$ merely means that $(M, g)$ is not time-like geodesically complete to the past (or future) of $\Sigma$. Unfortunately, comparatively little is known in general about the structure of such singularities, cf. [31, 41].

In this paper we mainly, although not only, consider issues concerning the future development of $(M, g)$ from the Cauchy surface $\Sigma$. Perhaps the first natural issue to consider is to characterize the situation when $(M, g)$ is future geodesically complete, i.e. time-like geodesically complete to the future of $\Sigma$. This depends, at the very minimum, on topological properties of $\Sigma$. The basic reason derives from one of the constraint equations for the geometry of CMC surfaces in $(M, g)$; namely on any leaf $(\Sigma_\tau, g_\tau) \subset M_{\mathcal{F}}$, one has
$$s = |K|^2 - H^2, \tag{0.5}$$
where $s$ is the scalar curvature of $(\Sigma_\tau, g_\tau)$ and $K$ is the second fundamental form of $(\Sigma_\tau, g_\tau) \subset (M, g)$. Now the scalar curvature plays an important role in understanding
the geometry and topology of 3-manifolds, just as it does in dimension 2. Thus, define the Sigma constant $\sigma(\Sigma)$ of a closed oriented 3-manifold $\Sigma$ to be the supremum of the scalar curvatures of unit volume Yamabe metrics on $\Sigma$, cf. [1, 38]. This is a topological invariant, which divides the family of closed 3-manifolds naturally into three classes according to
$$\sigma(\Sigma) < 0, \quad \sigma(\Sigma) = 0, \quad \sigma(\Sigma) > 0. \tag{0.6}$$
By the solution to the Yamabe problem, $\sigma(\Sigma) > 0$ if and only if $\Sigma$ admits a metric of positive scalar curvature. In fact, $\sigma(\Sigma) > 0$ if $\Sigma$ admits a non-flat metric with $s \ge 0$, cf. [14]. If $M_{\mathcal{F}}$ contains a (non-flat) leaf $\Sigma_o$ with $H = 0$, then (0.5) implies that $s \ge 0$, so that $\sigma(\Sigma_o) > 0$. Since $M_{\mathcal{F}}$ is open, $(M, g)$ admits CMC Cauchy surfaces with $H > 0$ and $H < 0$ and hence the Hawking–Penrose singularity theorem implies that there exist singularities to the finite proper time future and past of $\Sigma_o$. This observation, together with the behavior in certain model cases, namely the space homogeneous Bianchi geometries on $S^3$ and Kantowski–Sachs geometries on $S^2 \times S^1$, has led to the following conjecture, cf. [9, 36].

Recollapse Conjecture. If $\sigma(\Sigma) > 0$, then $M = M_{\mathcal{F}}$, and the leaves $\Sigma_\tau$ of $\mathcal{F}$ have mean curvature taking on all values monotonically in $(-\infty, +\infty)$. The space-time $(M, g)$ recollapses in that there is a singularity to the finite proper time future of $\Sigma$.

This conjecture basically says that if $\Sigma \subset M$ topologically can admit a metric of positive scalar curvature, then under the CMC time evolution, it will evolve in finite proper time to a maximal surface $(\Sigma_o, g_o)$ with $H = 0$, where it realizes positive scalar curvature, and then recollapses to the future in finite time. Perhaps the strongest general result on this conjecture is the work in [26], which resolves it in case $(M, g)$ has a past and a future crushing singularity [22], i.e. there exists a sequence of compact space-like hypersurfaces $P_i$ with $H(P_i) \to -\infty$ uniformly, and similarly such a sequence $F_i$ with $H(F_i) \to +\infty$ uniformly.

We mention that the only known examples of 3-manifolds with $\sigma(\Sigma) > 0$ are spherical space forms $S^3/\Gamma$, $S^2 \times S^1$, and connected sums of such manifolds. Most 3-manifolds satisfy $\sigma(\Sigma) \le 0$, i.e. admit no metric of positive scalar curvature. This is the case for instance if $\Sigma$ has a $K(\pi, 1)$ factor in its prime decomposition, cf. [28].
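The way the constraint (0.5) controls the sign of the scalar curvature can be made explicit with a short worked step. The sketch below uses the trace-free part $K_0$ of $K$, a standard decomposition that is notation introduced here and not taken from the text.

```latex
% Trace-free decomposition of K on a 3-dimensional leaf (\Sigma_\tau, g_\tau):
%   K = K_0 + (H/3) g_\tau,  tr K_0 = 0,  so  |K|^2 = |K_0|^2 + H^2/3.
\begin{align*}
s = |K|^2 - H^2 = |K_0|^2 - \tfrac{2}{3}H^2 \;\ge\; -\tfrac{2}{3}H^2 .
\end{align*}
% On a maximal leaf (H = 0) this reduces to
\begin{align*}
s = |K|^2 \;\ge\; 0 ,
\end{align*}
% which is the inequality s >= 0 used above for a leaf \Sigma_o with H = 0.
```

In particular, a compact non-flat leaf with $H = 0$ carries a metric with $s \ge 0$, which by the criterion of [14] quoted above is only possible when $\sigma > 0$.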
For Cauchy surfaces $\Sigma$ with $\sigma(\Sigma) \le 0$, the space-time $(M, g)$ has no compact maximal hypersurfaces, and so the CMC time evolution is restricted to a subset of $(-\infty, 0)$. For much of the paper, we assume then that $\sigma(\Sigma) \le 0$. The following generalization of the recollapse conjecture to all signs of $\sigma(\Sigma)$ was raised in [34, 22].

Global existence problem in CMC time. The CMC foliation $\mathcal{F} = \{\Sigma_\tau\}$ in $(M, g)$ exists for all allowable CMC time $\tau$, i.e. $\forall \tau \in (-\infty, 0)$ if $\sigma(\Sigma) \le 0$ and $\forall \tau \in (-\infty, +\infty)$ if $\sigma(\Sigma) > 0$.

This problem arises from a general belief that a CMC foliation should avoid any singularities at the boundary of $(M, g)$, cf. also [33]. The following result is of some relevance to this problem.

Theorem 0.1. Suppose $\sigma(\Sigma) \le 0$. Then either the CMC foliation $\mathcal{F}$ exists for all allowable CMC time or there exists a sequence of points $x_i \in \Sigma_{\tau_i}$, with $\tau_i \to \tau_o < 0$, such
that $\{x_i\}$ approaches a point of $\partial M$. Further, this is the case only if the curvature $R$ of $(M, g)$ blows up on $\{x_i\}$, i.e. $|R|(x_i) \to \infty$.

Here $|R|$ is measured in terms of the electric-magnetic decomposition of $R$ with respect to the future unit normal $T$ of the leaves $\Sigma_\tau$ of $\mathcal{F}$, in that
$$|R|^2 = |E|^2 + |B|^2, \tag{0.7}$$
with $E(X, Y) = \langle R(X, T)T, Y\rangle$, $B(X, Y) = \langle (*R)(X, T)T, Y\rangle$, cf. [19, §0].

Theorem 0.1 implies that global existence in CMC time fails (when $\sigma(\Sigma) \le 0$) only when a sequence of leaves $\Sigma_{\tau_i}$ of $\mathcal{F}$ approaches $\partial M$ somewhere. This behavior is special to CMC foliations, and is not shared by other geometrically defined foliations, i.e. 3+1 decompositions of $M$. For instance the Gauss or proper time-equidistant foliation from a given space-like hypersurface will typically break down before reaching $\partial M$. The assumption $\sigma(\Sigma) \le 0$ in Theorem 0.1 is not necessary and an analogous result is valid for any value of $\sigma(\Sigma)$, cf. Remark 2.6 and Sect. 4.

Theorem 0.1 also gives information on the initial singularity or singularities to the finite past of $\Sigma$. Thus, it implies that either the curvature in the sense of (0.7) blows up in the finite proper-time past of $\Sigma$, or the initial singularity is a crushing singularity.

In contrast to the positive case, when $\sigma(\Sigma) \le 0$ one does not expect, even if one has global existence in CMC time, that in general $M_{\mathcal{F}} = M$, i.e. the space-time $(M, g)$ may well extend past the CMC foliated region. Assuming global CMC time existence, we prove in Theorem 4.1 that the future boundary of $M_{\mathcal{F}}$ in $M$, i.e. the boundary $\partial_o M_{\mathcal{F}} \subset M$ to the future of the initial surface $\Sigma$ in (0.2), is a countable union of smooth, connected, complete maximal hypersurfaces $\Sigma_o \subset M$. Each component $\Sigma_o$ of $\partial_o M_{\mathcal{F}}$ is non-compact, and so not diffeomorphic to $\Sigma$. However, any smooth compact domain in $\Sigma_o$ does embed in $\Sigma$. The hypersurface $\partial_o M_{\mathcal{F}}$ is a Cauchy surface for the region $M^+ = M \setminus M_{\mathcal{F}}$ to the future of $M_{\mathcal{F}}$ in $M$. Of course any causal curve starting in $M_{\mathcal{F}}$ which enters $M^+$ can never reenter $M_{\mathcal{F}}$. In analogy to the Hawking–Penrose singularity theorem (which only applies to compact CMC hypersurfaces), we have:

Singularity Formation Conjecture. If $\partial_o M_{\mathcal{F}} \neq \emptyset$ in $(M, g)$, then $(M, g)$ possesses a singularity to the finite proper-time future in each component of $M^+$.

It is natural to conjecture that $M^+$ itself has a foliation $M_{\mathcal{F}}^+$ by non-compact CMC surfaces limiting on crushing singularities to the finite future, cf. [22] and Sect. 4. The region $M^+$, when non-empty, may be expected to be related to the formation of black holes in $(M, g)$.

The global existence problem in CMC time and the singularity formation conjecture thus imply that $(M, g)$ is future geodesically complete if and only if $\sigma(\Sigma) \le 0$ and $M = M_{\mathcal{F}}$.

Next, suppose $(M, g)$ is a space-time which is future geodesically complete. It is then natural to consider the possible asymptotic behavior of the space-time as the proper time $t$ approaches infinity. This brings us to the relations with the geometrization of the 3-manifold $\Sigma$.
Definition 0.2. Let $\Sigma$ be a closed, oriented and connected 3-manifold, with $\sigma(\Sigma) \le 0$. A weak geometrization of $\Sigma$ is a decomposition of $\Sigma$,
$$\Sigma = H \cup G, \tag{0.8}$$
where $H$ is a finite collection of complete connected hyperbolic manifolds of finite volume embedded in $\Sigma$ and $G$ is a finite collection of connected graph manifolds embedded in $\Sigma$. The union is along a finite collection of embedded tori $\mathcal{T} = \cup T_i$, $\mathcal{T} = \partial H = \partial G$. A strong geometrization of $\Sigma$ is a weak geometrization as above, for which each torus $T_i \in \mathcal{T}$ is incompressible in $\Sigma$, i.e. the inclusion of $T_i$ into $\Sigma$ induces an injection of fundamental groups.

Of course, it is possible that the collection $\mathcal{T}$ of tori dividing $H$ and $G$ in (0.8) is empty, in which case weak and strong geometrizations are the same. In such a situation, $\Sigma$ is then either a closed hyperbolic manifold or a closed graph manifold. For a strong geometrization, the decomposition (0.8) is unique up to isotopy, cf. [2, 32], but this is far from being the case for a weak geometrization. Any 3-manifold, for instance $S^3$ or $T^3$, has numerous knots or links whose complement admits a complete hyperbolic metric of finite volume, and hence gives rise to a weak geometrization.

We recall that a graph manifold is a union of $S^1$ fibrations over surfaces, i.e. Seifert fibered spaces, glued together along toral boundary components. More precisely, a graph manifold $G$ has a decomposition into Seifert fibered spaces $S_j$, with $\partial S_j$ given by a union of tori $\{T_k\}$. The manifold $G$ is then assembled by glueing (some of) the boundary tori together by toral automorphisms. In case $\partial G = \emptyset$, or $\partial G$ consists of incompressible tori, there exists such a decomposition into Seifert fibered pieces along tori $\{T_k\}$, each of which is incompressible in $G$. Such an incompressible decomposition, called the JSJ decomposition, is unique up to isotopy, except in a few, well understood special cases. Typical exceptional cases are the Sol manifolds, i.e. $T^2$ bundles over $S^1$ or over an interval, or finite quotients of such bundles. The topology of graph manifolds is completely classified, cf. [42].

The Seifert fibered spaces above each admit a geometric structure in the sense of Thurston, i.e. a complete locally homogeneous Riemannian metric, cf. [40, 39]. Thus if $\Sigma$ has a strong geometrization as above, then $\Sigma$ admits a further decomposition by incompressible tori into domains, each of which has a complete geometric structure. Such a structure is called the geometrization of $\Sigma$ in the sense of Thurston. Not all 3-manifolds have such a geometric decomposition however. The essential 2-spheres $S^2 \subset \Sigma$ are obstructions. This may be related to the fact that $M_{\mathcal{F}} \neq M$ in general, and is discussed further in Remark 4.3.

Let $t : M \to \mathbb{R}$ be the proper time to a fixed (initial) CMC Cauchy surface $\Sigma$ as in (0.2); $t(x)$ is the maximal length of time-like curves from $x$ to $\Sigma$. The maximum is achieved by a time-like geodesic, since $(M, g)$ is globally hyperbolic. Similarly, let $t_\tau$ be the proper time of the CMC surface $\Sigma_\tau \in M_{\mathcal{F}}$ to an initial surface $\Sigma$, i.e.
$$t_\tau = t_{\max}(\Sigma_\tau) = \max\{t(x) : x \in \Sigma_\tau\} = \mathrm{dist}_g(\Sigma_\tau, \Sigma). \tag{0.9}$$
If $(M, g)$ is future geodesically complete and $\sigma(\Sigma) \le 0$, then $t_\tau \to \infty$ as $\tau \to 0$. In this situation, the volume of the slices $(\Sigma_\tau, g_\tau)$ typically diverges to $+\infty$, and the metrics $g_\tau$ typically become more and more flat, as $t_\tau \to \infty$. Hence it is natural to consider the rescaled metrics
$$\bar g_\tau = t_\tau^{-2} \cdot g_\tau, \tag{0.10}$$
which at least give $\Sigma_\tau$ uniformly bounded volume, cf. (3.14). We raise the following:

Weak Global Asymptotics Problem. Suppose $\Sigma$ is closed, oriented, and $\sigma(\Sigma) \le 0$, as above. Suppose further that $(M, g)$ is future geodesically complete and $M_{\mathcal{F}} = M$. Then for any sequence $\tau_i \to 0$, the slices $(\Sigma_{\tau_i}, \bar g_{\tau_i})$ have a subsequence asymptotic to a weak geometrization of $\Sigma$.

More precisely, the problem states that on the region $H \subset \Sigma$ from (0.8), the metrics $\bar g_{\tau_i}$ in (0.10) converge to the complete hyperbolic metric of finite volume on $H$, while on the region $G \subset \Sigma$, the metrics $\bar g_{\tau_i}$ collapse the graph manifold $G$ with bounded curvature in the sense of Cheeger–Gromov to a lower dimensional space. Note that although (0.10) is a fixed time rescaling of the metric $g_\tau$ for each $\tau = \tau_i$, i.e. the scale factor is independent of any base point in $\Sigma_\tau$, the spatial behavior of the metrics $\bar g_{\tau_i}$ depends strongly on the choice of base points. Some sequences of base points $x_i \in \Sigma_{\tau_i}$ give rise to hyperbolic limits, while others give rise to a collapse along a graph manifold structure.

Of course, one may also formulate the corresponding strong problem, that the slices $(\Sigma_\tau, \bar g_\tau)$ are asymptotic to a strong geometrization, or even the geometrization of $\Sigma$, as $\tau \to 0$. We refer to Sects. 1 and 3 for a more detailed discussion of this problem, in particular regarding the collapse behavior.

The conclusion of the strong version of this problem is essentially the same as the conclusions of Conjectures I and II of [1, 2] on the behavior of maximizing sequences of Yamabe metrics or minimizing sequences for the $L^2$ norm of scalar curvature on a 3-manifold $\Sigma$. This problem, as well as the previous conjectures and problems, still appears to be very difficult to resolve. Some evidence for its validity is given by the result below.

Suppose $(M, g)$ is a CMC cosmological space-time satisfying the following curvature assumption: there is a constant $C < \infty$ such that for $x \in (M, g)$,
$$|R|(x) + t(x)\,|\nabla R|(x) \le C\, t^{-2}(x). \tag{0.11}$$
Here, in analogy to (0.7), $|\nabla R|^2 = |\nabla E|^2 + |\nabla B|^2$. Observe that the bound (0.11) is scale-invariant.

Theorem 0.3. Let $(M, g)$ be a cosmological CMC space-time, with $\sigma(\Sigma) \le 0$. Suppose that the curvature assumption (0.11) holds, and $M_{\mathcal F} = M$, to the future of an initial CMC surface $\Sigma$. Then $(M, g)$ is future geodesically complete and, for any sequence $\tau_i \to 0$, the slices $(\Sigma_{\tau_i}, \bar g_{\tau_i})$ have a subsequence asymptotic to a weak geometrization of $\Sigma$.

Theorem 0.3 is proved in Sect. 3, which also provides more details on the asymptotic behavior of the space-time $(M, g)$. The assumptions of Theorem 0.3 hold for Bianchi space-times of non-positive type, cf. Sect. 3.3(I). It is reasonable to conjecture that they also hold at least for perturbations of Bianchi initial data sets. An interesting open problem is whether any such weak geometrization is in fact a strong geometrization. In fact, this may be closely related to the recollapse conjecture, cf. Sect. 3.3(III). Note that since the weak decomposition (0.8) is not unique, different sequences $\tau_i \to 0$ may possibly give rise to distinct decompositions (0.8). The bound on the derivative of the curvature $R$ in (0.11) is needed for purely technical reasons, related to the use of the stability theorem for the Cauchy initial value problem. Of course one would like to remove this dependence on $\nabla R$. In Theorem 3.1, a version
of Theorem 0.3 is proved without the bound on $\nabla R$, but requiring an extra hypothesis on the decay of the mean curvature $H$ as $t_\tau \to \infty$. A more difficult problem is whether the decay assumption on $R$ in (0.11) is really necessary, see Sect. 5 for further discussion. The basic reason that one may expect hyperbolic manifolds to arise from the long-time behavior of $\Sigma_\tau$ is the simple fact that the Lorentzian cone on a hyperbolic 3-manifold, i.e. $g_o = -dt^2 + t^2 g_{-1}$
(0.12)
is a flat Lorentzian space-time, and so in particular vacuum; here $g_{-1}$ is any complete metric of curvature $-1$ on a 3-manifold $\Sigma$ and we have suppressed the shift diffeomorphisms. In fact, the metric $g_o$ in (0.12) is just a quotient of the future light cone about a point $\{0\}$ in flat Minkowski space $(\mathbb{R}^4, \eta)$. In this respect, it has recently been proved by Andersson–Moncrief [8] that if Cauchy data on a given compact CMC surface $\Sigma$ are a sufficiently small perturbation of hyperbolic data as in (0.12), and the hyperbolic metric on $\Sigma$ is rigid among conformally flat metrics on $\Sigma$, then the resulting space-time $(M, g)$ is future geodesically complete, and asymptotic to the flat Lorentzian metric (0.12) as $\tau \to 0$. We also refer to the interesting work of Fischer–Moncrief [25], where the long-term behavior of a naturally defined Hamiltonian for the space-time evolution is related to the Sigma constant (0.6) of $\Sigma$.

1. Background and Preliminary Results

Let $(M, g)$ be a CMC cosmological space-time, with $M_{\mathcal F} \subset M$ as in (0.3), and with time function $\tau = H$ on $M_{\mathcal F}$. The 4-metric $g$ may then be decomposed into a 3+1 split as $g = -\alpha^2 d\tau^2 + g_\tau$,
(1.1)
where $g_\tau$ is a Riemannian metric on $\Sigma_\tau$ and $\alpha$ is the lapse function. Hence $g_\tau$ may be viewed as a curve of metrics on a fixed slice $\Sigma$, by means of diffeomorphisms $\psi_\tau : \Sigma \to \Sigma_\tau$. The vacuum Einstein equations $\mathrm{Ric} = 0$ imply, via the Gauss and Gauss–Codazzi equations, constraints on the geometry of $(\Sigma_\tau, g_\tau) \subset (M, g)$. The (CMC) constraint equations are: $s = |K|^2 - H^2$, $\delta K = 0$.
(1.2)
Here $K$ is the 2nd fundamental form or extrinsic curvature, given by $K(X, Y) = g(\nabla_X Y, T)$, where $T$ is the future unit normal, $H = \mathrm{tr}_g K$, and $\delta$ is the divergence operator. The sign convention on $K$ agrees with the convention following (0.2). The vacuum Einstein evolution equations are $\mathcal{L}_{\partial_\tau} g = -2\alpha K$,
(1.3)
$\mathcal{L}_{\partial_\tau} K = -D^2\alpha + \alpha(\mathrm{Ric} + H \cdot K - 2K^2)$,
(1.4)
where $\mathcal{L}$ denotes the Lie derivative, $D^2$ the Hessian and $K^2(X, Y) = \langle K(X), K(Y)\rangle$.
The choice of time function $\tau = H$ determines the lapse $\alpha$, which satisfies the following lapse equation, derived from the 2nd variational formula for volume: $-\Delta\alpha + |A|^2\alpha = \frac{dH}{d\tau} = 1$,
(1.5)
where $\Delta = \mathrm{tr}\, D^2$. The equations (1.2)–(1.5) on the surfaces $(\Sigma_\tau, g_\tau)$ are equivalent to the vacuum Einstein equations on $(M, g)$. Finally, the Gauss–Codazzi equations (in general) are $dK = R_T$,
(1.6)
i.e. $dK(X, Y, Z) = \nabla_X K(Y, Z) - \nabla_Y K(X, Z) = R(T, X, Y, Z)$. The operator $d$ is the exterior derivative w.r.t. the Levi–Civita connection $\nabla$ of $g_\tau$, when $K$ is viewed as a $T\Sigma_\tau$-valued 1-form. The maximum principle applied to the lapse equation (1.5) gives rise to the following elementary and well-known, but important, estimates.

Lemma 1.1. On any leaf $\Sigma_\tau$, the lapse $\alpha$ satisfies the following bounds: $\sup\alpha \le (\inf|K|^2)^{-1} \le 3H^{-2}$, and $\inf\alpha \ge (\sup|K|^2)^{-1} > 0$.
(1.7)
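The maximum principle argument behind (1.7) can be sketched as follows. This is only an outline, under the assumption that the coefficient $|A|^2$ in (1.5) is the squared norm $|K|^2$ of the second fundamental form (which is how (1.7) is stated); $p$, $q$ and the eigenvalues $\lambda_i$ of $K$ are notation introduced here for illustration.

```latex
% Sketch of Lemma 1.1 from the lapse equation (1.5), assuming |A|^2 = |K|^2.
% p: a maximum point of \alpha on the compact leaf, q: a minimum point.
\begin{align*}
  \Delta\alpha(p) \le 0
    &\;\Longrightarrow\; |K|^2(p)\,\alpha(p) = 1 + \Delta\alpha(p) \le 1
     \;\Longrightarrow\; \sup\alpha \le (\inf |K|^2)^{-1}, \\
  \Delta\alpha(q) \ge 0
    &\;\Longrightarrow\; |K|^2(q)\,\alpha(q) = 1 + \Delta\alpha(q) \ge 1
     \;\Longrightarrow\; \inf\alpha \ge (\sup |K|^2)^{-1}, \\
  H^2 = (\lambda_1 + \lambda_2 + \lambda_3)^2 \le 3(\lambda_1^2 + \lambda_2^2 + \lambda_3^2) = 3|K|^2
    &\;\Longrightarrow\; (\inf |K|^2)^{-1} \le 3H^{-2}.
\end{align*}
```

The last line is the Cauchy–Schwarz inequality; it is the only place where the dimension 3 of the leaves enters, and it requires $H \neq 0$.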
Note that $\alpha$ is not scale-invariant: it scales as the square of the distance, and so inversely to the norm of the curvature. Namely, if $g' = \lambda^{-2} g$ is a rescaling of $g$, then (1.1) becomes $g' = -\alpha^2\lambda^{-2} d\tau^2 + \lambda^{-2} g_\tau$.
(1.8)
The space metric $g'_\tau = \lambda^{-2} \cdot g_\tau$ is of course just a rescaling of $g_\tau$. Now $\tau$ is the mean curvature in a given metric and so $\tau' = \lambda\tau$. Thus, (1.8) becomes $g' = -(\lambda^{-2}\alpha)^2 d(\tau')^2 + g'_\tau = -(\alpha')^2 d(\tau')^2 + g'_\tau$,
(1.9)
where $\alpha' = \lambda^{-2}\alpha$. In particular, under this rescaling, the lapse equation is scale-invariant, as are the other equations (1.2)–(1.6), since the vacuum Einstein equations are scale-invariant. It will be useful to also represent the metric w.r.t. the parameter $t_\tau$ defined as in (0.9) in place of $\tau$. Thus, $g = -\alpha^2 \left(\frac{d\tau}{dt_\tau}\right)^2 dt_\tau^2 + g_\tau$,
(1.10)
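where now the lapse $\alpha\frac{d\tau}{dt_\tau}$ is scale-invariant. In (1.10), the slices $\Sigma_\tau$ are parametrized by the distance function $t_\tau$; since $t_\tau$ is the maximal proper time between $\Sigma_\tau$ and $\Sigma_{\tau_o}$, we have, $\forall x \in \Sigma_\tau$, $\alpha(x)\frac{d\tau}{dt_\tau} \le 1$.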
(1.11)
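The scaling statements around (1.8)–(1.10) can be checked directly. The following is a sketch, writing primes for quantities of the rescaled metric $g' = \lambda^{-2} g$ and again reading the coefficient in (1.5) as $|K|^2$ (an assumption); it uses only that $\Delta$ and $|K|^2$ scale as inverse distance squared.

```latex
% Scaling check for g' = \lambda^{-2} g, \tau' = \lambda\tau, \alpha' = \lambda^{-2}\alpha.
\begin{align*}
  (\alpha')^2\, d(\tau')^2
    &= \lambda^{-4}\alpha^2 \cdot \lambda^{2}\, d\tau^2
     = \lambda^{-2}\alpha^2\, d\tau^2
     && \text{(agrees with (1.8))}, \\
  -\Delta'\alpha' + |K'|^2_{g'}\,\alpha'
    &= \lambda^{2}\bigl(-\Delta + |K|^2\bigr)(\lambda^{-2}\alpha)
     = -\Delta\alpha + |K|^2\alpha = 1
     && \text{(lapse equation (1.5))}, \\
  \alpha'\,\frac{d\tau'}{dt'_{\tau'}}
    &= \lambda^{-2}\alpha \cdot \frac{\lambda\, d\tau}{\lambda^{-1}\, dt_\tau}
     = \alpha\,\frac{d\tau}{dt_\tau}
     && \text{(the lapse in (1.10) is scale-invariant)}.
\end{align*}
```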
Next we prove a natural comparison result for cosmological CMC space-times which will be important in the proof of Theorem 0.3. It is essentially a standard argument in Lorentzian comparison geometry.
Proposition 1.2. Let $\Sigma_{\tau_o}$ be any initial compact CMC surface, with $H_o = \tau_o < 0$ in $(M, g)$. Then
$-H(\Sigma_\tau) \cdot t_\tau \le 3\left(1 - \frac{\tau}{\tau_o}\right)$, (1.12)
where $t_\tau = \mathrm{dist}_g(\Sigma_\tau, \Sigma_{\tau_o}) \ge 0$. Equality holds in (1.12) for some $\tau = \tau_1 > \tau_o$ if and only if the domain $M_{[\tau_o,\tau_1]} = \cup_{\tau \in [\tau_o,\tau_1]} \Sigma_\tau \subset (M, g)$ is flat and is isometric to a time-annulus in a flat cone as in (0.12). In particular, all the leaves $\Sigma_\tau$, $\tau \in [\tau_o, \tau_1]$, are of constant negative curvature.

Proof. Given the initial surface $\Sigma_{\tau_o}$, let $S(r) = \{x \in M : t(x, \Sigma_{\tau_o}) = r\} = t^{-1}(r)$, where $t$ is the Lorentzian distance as preceding (0.9). Let $\theta$ be the mean curvature of $S(r)$, with the sign convention as following (0.2) or (1.2). Since the space-time is vacuum, the Raychaudhuri equation in standard notation, cf. [31, §4.4], gives $-\theta' = -\frac{1}{3}\theta^2 - 2\sigma^2 \le -\frac{1}{3}\theta^2$.
(1.13)
Thus by integration, one immediately obtains, as long as $\theta \le 0$, $-(\theta(s))^{-1} - \frac{1}{3}s \ge -(H(\Sigma_{\tau_o}))^{-1}$,
(1.14)
for $s \ge 0$. Now on $\Sigma_\tau$, there is a point $x_\tau$ realizing $t_\tau$, the maximal value of $t$ on $\Sigma_\tau$. Hence $S(t_\tau)$ lies to the future of $\Sigma_\tau$, but touches $\Sigma_\tau$ at the point $x_\tau$. By a standard geometric maximum principle, cf. [7, Thm. 3.6] for example, we then have $-\theta(x_\tau) \ge -H(\Sigma_\tau)$, again with the sign convention above understood. Substituting this in (1.14), and using the fact that $s = t_\tau$ at $x_\tau$, gives (1.12). It is obvious that (1.12) holds in any region where $H = \tau \ge 0$. If equality holds in (1.12) at some $\tau_1 > \tau_o$, then equality holds in (1.14) for all $\tau \in [\tau_o, \tau_1]$. Hence by (1.13), $\sigma = 0$ everywhere in the time annulus $t^{-1}[\tau_o, \tau_1]$ and so the geodesic spheres $S(r)$ are everywhere umbilic. Also, the maximum principle again implies that $\Sigma_\tau = S(t_\tau)$, so the surfaces $\Sigma_\tau$ are all equidistants. These facts imply that $(M_{[\tau_o,\tau_1]}, g)$ is a Lorentzian cone of the form $g = -dt^2 + t^2 g_{\tau_o}$, where $g_{\tau_o}$ is the metric on $\Sigma_{\tau_o}$, up to rescaling. The vacuum equations (0.1) then imply that $g$ must be flat and the metrics $g_\tau$ on $\Sigma_\tau$ are of constant curvature.

Observe that (1.12) holds if $(M, g)$ satisfies the strong energy condition $\mathrm{Ric}_g \ge 0$ in place of the vacuum equations (0.1). However, the rigidity statement in the second part of Proposition 1.2 requires the vacuum equations. Proposition 1.2 of course implies that if $M$ is geodesically complete to the future of $\Sigma_{\tau_o}$, then $\limsup_{t_\tau \to \infty} t_\tau(-H) \le 3$.
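The two integration steps in the proof of Proposition 1.2 can be spelled out as follows. This is a sketch under the proof's sign conventions, assuming $\theta < 0$ along the geodesic through $x_\tau$ and $H = \tau < 0$, and writing $H_o$ for the mean curvature $\tau_o$ of the initial surface.

```latex
% From (1.13) to (1.14): -\theta' \le -\theta^2/3 means \theta' \ge \theta^2/3, so
\begin{align*}
  \frac{d}{ds}\Bigl(-\frac{1}{\theta}\Bigr) = \frac{\theta'}{\theta^2} \ge \frac{1}{3}
  &\;\Longrightarrow\;
  -\frac{1}{\theta(s)} - \frac{s}{3} \ge -\frac{1}{\theta(0)} = -\frac{1}{H_o}. \\
% From (1.14) to (1.12): at x_\tau we have s = t_\tau and -\theta \ge -H > 0, so
  \frac{1}{-H} \ge \frac{1}{-\theta(x_\tau)} \ge \frac{t_\tau}{3} + \frac{1}{-H_o}
  &\;\Longrightarrow\;
  1 \ge \frac{(-H)\,t_\tau}{3} + \frac{H}{H_o}
  \;\Longrightarrow\;
  -H \cdot t_\tau \le 3\Bigl(1 - \frac{\tau}{\tau_o}\Bigr).
\end{align*}
```

Since $0 < \tau/\tau_o \le 1$ for $\tau_o \le \tau < 0$, the right-hand side of (1.12) is at most 3, which gives the stated $\limsup$ bound.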
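Since $H$ is the mean curvature, the first variational formula for volume gives $\frac{d}{ds}\mathrm{vol}\,\Sigma_s = -\int_{\Sigma_s} H \cdot f\, dV$, where $s$ is any parametrization for the leaves $\Sigma_\tau$ and $f > 0$ is the length of the associated variation field. For example, if $s = \tau$, then $f = \alpha$ as in (1.1). Consider the family $\Sigma_\tau$ parametrized w.r.t. $t_\tau$, as in (1.10). Then $f = \alpha\frac{d\tau}{dt_\tau}$ and hence by (1.11),
$\frac{d}{dt_\tau}\mathrm{vol}\,\Sigma_\tau \le -\int_{\Sigma_\tau} H\, dV = -H\,\mathrm{vol}\,\Sigma_\tau \le \frac{3}{t_\tau}\mathrm{vol}\,\Sigma_\tau$. (1.15)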
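The differential inequality (1.15) integrates to a monotonicity statement for the volume ratio. A brief sketch, where $V(t_\tau)$ is shorthand introduced here for $\mathrm{vol}\,\Sigma_\tau$ as a function of $t_\tau > 0$:

```latex
% Integrating (1.15): V' \le (3/t_\tau) V.
\begin{align*}
  \frac{d}{dt_\tau}\log\frac{V}{t_\tau^{3}}
    = \frac{V'}{V} - \frac{3}{t_\tau} \le 0
  \quad\Longrightarrow\quad
  \frac{V(t_2)}{t_2^{3}} \le \frac{V(t_1)}{t_1^{3}}
  \quad\text{for } 0 < t_1 \le t_2 .
\end{align*}
```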
This discussion gives the following monotonicity result on the volume growth of the leaves $\Sigma_\tau$.

Corollary 1.3. The volume of the leaves $\Sigma_\tau$ in $(M, g)$ satisfies $\mathrm{vol}\,\Sigma_\tau / t_\tau^3 \downarrow$,
(1.16)
i.e. the ratio is monotone non-increasing in $t_\tau$ or $\tau$. The volume ratio is constant in $[\tau_o, \tau_1]$ if and only if the corresponding time-annulus in $(M, g)$ is a flat Lorentzian cone.

Proof. This follows immediately from (1.15) by integration and the corresponding rigidity statement in Proposition 1.2.

We will need several results from the Cheeger–Gromov theory on convergence and collapse of Riemannian manifolds, cf. [16, 17, 4]. For simplicity, we consider only the 3-dimensional case, where the Ricci curvature determines the full curvature tensor.

Proposition 1.4. Let $(D_i, g_i, x_i)$ be Riemannian metrics on 3-manifolds $D_i$, satisfying
$|\mathrm{Ric}_{g_i}| \le \Lambda$, $\mathrm{diam}_{g_i} D_i \le D$, $\mathrm{dist}_{g_i}(x_i, \partial D_i) \ge d_o$, (1.17)
$\mathrm{vol}_{g_i} B_{x_i}(1) \ge v_o$, (1.18)
for some constants $\Lambda, D < \infty$, and $v_o, d_o > 0$. Then for any $\varepsilon > 0$ with $\varepsilon < d_o$, there are domains $U_i = U_i(\varepsilon) \subset D_i$, with $\mathrm{dist}_{g_i}(\partial D_i, \partial U_i) < \varepsilon$, and diffeomorphisms $\psi_i$ of $U_i$, s.t. $\{\psi_i^* g_i\}$ has a subsequence converging in the $C^{1,\beta'}$ topology to a limit $C^{1,\beta}$ Riemannian manifold $(U_\infty, g_\infty, x_\infty)$, $x_\infty = \lim x_i$, for any $\beta' < \beta < 1$. In particular, $U_\infty$ is diffeomorphic to $U_i$, for $i$ sufficiently large.

Proposition 1.5. Let $(D_i, g_i, x_i)$ be Riemannian metrics on 3-manifolds $D_i$, satisfying (1.17) and, in place of (1.18), $\mathrm{vol}_{g_i} B_{x_i}(1) \to 0$,
(1.19)
for some constants $\Lambda, D < \infty$ and $d_o > 0$. Then there are domains $U_i \subset D_i$ as above, such that $U_i$ is either a Seifert fibered space or a torus bundle over an interval. In both cases, the $g_i$-diameter of any fiber $F$ (a circle $S^1$ or torus $T^2$) goes to 0 as $i \to \infty$. Further, suppose $U_i$ is not diffeomorphic to a closed spherical space-form $S^3/\Gamma$. Then for any $i$ sufficiently large, $\pi_1(F)$ injects in $\pi_1(U_i)$ and there is a finite cover $\bar U_i$ of $U_i$ such that the sequence $(\bar U_i, g_i, x_i)$ does not collapse, i.e. satisfies (1.18); hence a subsequence converges in the $C^{1,\beta}$ topology as above to a $C^{1,\beta}$ Riemannian manifold $(\bar U_\infty, g_\infty, x_\infty)$. The limit $(\bar U_\infty, g_\infty)$ admits a locally free isometric action by one of the following Lie groups: $S^1$, $S^1 \times S^1$, $T^3$, Nil.

Both of these results essentially remain valid if $\mathrm{diam}_{g_i} D_i \to \infty$ as $i \to \infty$, but now both behaviors, convergence and collapse, are possible, depending on the choice of base points $x_i$. Thus, suppose $(\Sigma, g_i)$ are complete Riemannian 3-manifolds with $|\mathrm{Ric}_{g_i}| \le \Lambda$,
(1.20)
for some constant $\Lambda < \infty$. For a given sequence of base points $x_i \in \Sigma$, suppose (1.18) holds. Then a subsequence of $(\Sigma_i, g_i, x_i)$ converges, modulo diffeomorphisms of $\Sigma_i$,
to a complete $C^{1,\beta}$ limit Riemannian manifold $(\Sigma_\infty, g_\infty, x_\infty)$. The limit $\Sigma_\infty$ embeds weakly in $\Sigma$, denoted as $\Sigma_\infty \subset\subset \Sigma$,
(1.21)
in the sense that any domain with smooth and compact closure in $\Sigma_\infty$ embeds smoothly in $\Sigma$. On the other hand, if (1.19) holds at $x_i \in \Sigma$, then the sequence collapses uniformly on compact sets, in the sense that $\mathrm{vol}_{g_i} B_{y_i}(1) \to 0$, for all $y_i \in B_{x_i}(R)$, for any fixed $R < \infty$. In this case, the collapse may be unwrapped as above and one obtains a complete limit $C^{1,\beta}$ Riemannian manifold $(\bar\Sigma_\infty, g_\infty, x_\infty)$, which admits a locally free isometric action by one of the groups $S^1$, $S^1 \times S^1$, $T^3$, Nil. The convergence in Propositions 1.4 and 1.5 above is actually in the weak $L^{2,p}$ topology, for any $p < \infty$, and the limits are $L^{2,p}$ smooth.

Remark 1.6. (i) If the bound $|\mathrm{Ric}_{g_i}| \le \Lambda$ in (1.20) is strengthened to a bound on the derivatives of the curvature, i.e. $|\nabla^j \mathrm{Ric}_{g_i}| \le \Lambda_j < \infty$, then one has convergence to a $L^{j+2,p}$ limit in the weak $L^{j+2,p}$ topology, $p < \infty$. In particular, if this holds for all $j$, then one has $C^\infty$ convergence.

(ii) By standard results in Riemannian geometry, a lower volume bound as in (1.18) under the curvature bound (1.17) is equivalent to a lower bound on the injectivity radius at $x_i$, cf. [35]. The collapse situation in Proposition 1.5 corresponds to the formation of families of (arbitrarily) short geodesic loops in $(D_i, g_i)$ within bounded distance to $x_i$, and the unwrapping of the collapse corresponds to the unwrapping of these short loops to loops of length about 1 in covering spaces. In the single exceptional case of $S^3/\Gamma$, the collapse is along inessential loops, as for instance in the Berger collapse, cf. [16], of $S^3$ to $S^2$ by shrinking the $S^1$ fibers of the Hopf fibration. (Exactly this behavior occurs at the Cauchy horizon of the Taub-NUT metric, cf. [31, §5.8].) In all other cases, the short loops are essential in bounded domains about $x_i$ (with diameter depending on the degree of collapse in (1.19)), although they are not necessarily essential globally, i.e. in all of $D_i$, when $\mathrm{diam}\, D_i$ becomes unbounded.
The dimension of the groups $S^1$, $S^1 \times S^1$, $T^3$ or Nil corresponds to the dimension of this family of short loops. For instance, in the latter two cases of $T^3$ or Nil, one has a 3-dimensional family of short loops, and so all of $(D_i, g_i)$ is collapsing, and in fact $\mathrm{diam}_{g_i} D_i \to 0$; cf. §3.3(I) for an example. Of course, by (1.17), this is possible only if $D_i$ is closed. In view of Proposition 1.5, this means that the base spaces of the $S^1$ or $S^1 \times S^1$ bundle are collapsing, so that the finite covers also unwrap the base space collapse.

(iii) Define the $C^{1,\beta}$ harmonic radius $r_h(x) = r_h^{1,\beta}(x)$ of a Riemannian 3-manifold $(\Sigma, g)$ at $x$ to be the largest radius of the geodesic ball about $x$ on which there exists a harmonic coordinate chart in which the metric components are controlled in the $C^{1,\beta}$ norm, i.e. as matrices, $C^{-1}\delta_{ij} \le g_{ij} \le C\delta_{ij}$, and, for $p$ given by $\beta = 1 - \frac{3}{p}$, $r_h(x)^{p}\,\|\partial g_{ij}\|_{C^{\beta}(B_x(r_h(x)))} \le C$. The power of $r_h$ is chosen so that $r_h$ scales as a radius, i.e. as a distance. The assumptions (1.17)–(1.18) imply a lower bound on $r_h$, $r_h(x_i) \ge r_o$, where $r_o$ depends only on $\Lambda$, $v_o$ (and $d_o$), cf. [4] and references therein.
2. Curvature Estimates on CMC Surfaces

In this section, we prove Theorem 0.1 and several related results. To begin, we claim that for any given compact subset $\Omega \subset M$, there is a constant $\Lambda_o = \Lambda_o(\Omega) < \infty$ such that, for all $x \in \Omega$, $|R|(x) \le \Lambda_o$,
(2.1)
where $|R|$ is measured as in (0.7). Since $(M, g)$ is smooth, (2.1) holds provided the future unit normal $T$ to the leaves $\Sigma_\tau$ of $\mathcal F$ remains a uniformly bounded distance away from the null-cones in $\Omega$ (so that $T$ remains in a compact set in the tangent bundle $T\Omega$). Since $(M, g)$ is globally hyperbolic and $\Omega$ is compact, this follows directly from the a priori estimate of Bartnik [10, Thm. 3.1]. Of course, on approach to $\partial M$, the bound (2.1) may no longer hold. In the results to follow, we use this bound to control the intrinsic and extrinsic geometry of the leaves $\Sigma_\tau$ of $M_{\mathcal F}$. First, consider the symmetric bilinear form $K$ as a 1-form on $\Sigma_\tau$ with values in $T\Sigma_\tau$, as in (1.6). We have the following standard Weitzenböck formula, cf. [13] for example: $\delta dK + d\delta K = 2D^*DK + R(K)$,
(2.2)
on $(\Sigma_\tau, g_\tau)$, where the curvature term is given by $R(K) = \mathrm{Ric} \circ K + K \circ \mathrm{Ric} - 2R \circ K$. Here $R \circ K$ is the action of the curvature tensor $R$ on symmetric bilinear forms given by $(R \circ K)(X, Y) = \sum_i \langle R(X, e_i)K(e_i), Y\rangle$, with $\{e_i\}$ a local orthonormal framing for $\Sigma_\tau$. The sign of the curvature tensor is such that $R_{ijji}$ denotes the sectional curvature in the $(ij)$ direction. By (1.2) and (1.6), $\delta K = 0$ and $dK = R_T$. Pairing (2.2) with $K$ thus implies $\Delta|K|^2 = 2|DK|^2 - \langle\delta R_T, K\rangle + \langle R(K), K\rangle$.
(2.3)
The key point is to prove that the algebraic curvature term in (2.3) is positive, on the order of $|K|^4$; this was verified when $(M, g)$ is Minkowski space-time in [18].

Lemma 2.1. On a leaf $\Sigma_\tau$, with mean curvature $\tau = H$, the curvature term in (2.3) satisfies $\langle R(K), K\rangle \ge |K|^4 - c(|H||K|^3 + \Lambda_o|K|^2 + H^2|K|^2)$,
(2.4)
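where $c$ is a numerical constant, independent of $(M, g)$ and $\Sigma_\tau$.

Proof. Let $\lambda_i$, $i = 1, 2, 3$, be the eigenvalues of $K$. Writing $\mathbf{R}$ for the curvature of $(M, g)$ and $R$ for the curvature of $(\Sigma_\tau, g_\tau)$, the Gauss equation gives, for $j \neq k$,
$R_{jkkj} = \mathbf{R}_{jkkj} - \lambda_j\lambda_k$.
(2.5)
The minus sign in (2.5), which is crucial in the following, uses the fact that $(M, g)$ is Lorentzian; in the case of Riemannian geometry, one has a plus sign (which would give the wrong sign to the dominant term in (2.4)). We also have, since $\Sigma$ is a 3-manifold, $\mathrm{Ric}_{ii} = \frac{s}{2} - R_{jkkj}$, with $\{i, j, k\} = \{1, 2, 3\}$, and so
$\langle \mathrm{Ric} \circ K, K\rangle = \langle K \circ \mathrm{Ric}, K\rangle = \sum_i \mathrm{Ric}_{ii}\cdot\lambda_i^2 = \frac{s}{2}\sum_i \lambda_i^2 - \sum_i \lambda_i^2 R_{jkkj} = \frac{s}{2}|K|^2 + \sum_i \lambda_i^2\lambda_j\lambda_k - \sum_i \lambda_i^2 \mathbf{R}_{jkkj}$.
Observe that $\sum_i \lambda_i^2\lambda_j\lambda_k = H\lambda_1\lambda_2\lambda_3 = H\det K \le c \cdot |H||K|^3$ and $s = |K|^2 - H^2$ by the constraint equation (1.2). Similarly, from the definition, one computes
$\langle R \circ K, K\rangle = \sum_{i \neq j} \lambda_i\lambda_j R_{ijji} = \sum_{i \neq j} \lambda_i\lambda_j(\mathbf{R}_{ijji} - \lambda_i\lambda_j) = \sum_{i \neq j} \lambda_i\lambda_j \mathbf{R}_{ijji} - \sum_{i \neq j} \lambda_i^2\lambda_j^2$.
Combining these estimates and using the bound (2.1) easily leads to (2.4).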
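The final combination step in the proof of Lemma 2.1 can be sketched as follows. The notation is an assumption made here for readability: $\mathbf{R}$ denotes the curvature of the space-time $(M, g)$, bounded by $\Lambda_o$ as in (2.1), $\lambda_i$ are the eigenvalues of $K$, and $c$ stands for numerical constants that may change from line to line.

```latex
% Assembling (2.4) from R(K) = Ric∘K + K∘Ric - 2 R∘K and the Gauss equation (2.5).
\begin{align*}
  \langle R(K), K\rangle
    &= 2\langle \mathrm{Ric}\circ K, K\rangle - 2\langle R\circ K, K\rangle \\
    &= s|K|^2 + 2H\det K
       - 2\sum_i \lambda_i^2\,\mathbf{R}_{jkkj}
       - 2\sum_{i\neq j}\lambda_i\lambda_j\,\mathbf{R}_{ijji}
       + 2\sum_{i\neq j}\lambda_i^2\lambda_j^2 \\
    &\ge (|K|^2 - H^2)|K|^2 - 2|H||K|^3 - c\,\Lambda_o|K|^2 \\
    &\ge |K|^4 - c\bigl(|H||K|^3 + \Lambda_o|K|^2 + H^2|K|^2\bigr).
\end{align*}
% Used: s = |K|^2 - H^2 from (1.2), |\det K| \le |K|^3,
% |\mathbf{R}| \le \Lambda_o from (2.1), and the last sum being non-negative.
```

In the Riemannian case the sign in (2.5) flips, so the non-negative quartic sum and the $|K|^4$ term would enter with the opposite sign, which is the point made after (2.5).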
Proposition 2.2. Let $\Sigma_\tau$ be any compact leaf of $M_{\mathcal F}$, $\Sigma_\tau \subset \Omega$, with $|H| \le H_o < \infty$,
(2.6)
for some constant $H_o > 0$. Then there is a constant $\Lambda_1 = \Lambda_1(\Lambda_o, H_o)$ such that, on $(\Sigma_\tau, g_\tau)$, $|K|_{L^\infty} \le \Lambda_1$, $|\mathrm{Ric}|_{L^\infty} \le \Lambda_1$.
(2.7)
Proof. By the Gauss equation (2.5) and the bound (2.1), the two estimates in (2.7) are equivalent, so it suffices to prove the first estimate. Since $\Sigma_\tau$ is compact, we may choose $x \in \Sigma_\tau$ realizing the maximal value of $|K|$ on $\Sigma_\tau$, i.e. $|K|(y) \le |K|(x) \equiv \bar K$,
(2.8)
for all $y \in \Sigma_\tau$. We will prove (2.7) by contradiction, although with some further work it is possible to give a direct argument (without passing to the limit below), in which case the dependence of $\Lambda_1$ on $(\Lambda_o, H_o)$ is more explicit. Thus, if (2.7) is false, there exist compact CMC surfaces $(\Sigma_i, g_i)$ in space-times $(M_i, g_i)$ satisfying (2.1) and (2.6), but for which $\bar K_i = |K_i|(x_i) = \sup_{\Sigma_i}|K_i| \to \infty$. Consider the rescaled metric $\tilde g_i = (\bar K_i)^2 \cdot g_i$ on $\Sigma_i$, and the corresponding space-time $(M_i, \tilde g_i)$. By scaling, (2.8) becomes $|\tilde K_i|(y_i) \le |\tilde K_i|(x_i) = 1$
(2.9)
on $(\Sigma_i, \tilde g_i)$. Similarly, the scaling properties of the curvature $R$ and mean curvature $H$ imply
$|\tilde R_i| \le \Lambda_o \cdot (\bar K_i)^{-2} \to 0$, and $|\tilde H_i| \le H_o(\bar K_i)^{-1} \to 0$, as $i \to \infty$, (2.10)
on $(M_i, \tilde g_i)$ and $(\Sigma_i, \tilde g_i)$ respectively. Thus, the Gauss equation (2.5) implies, $\forall y_i \in (\Sigma_i, \tilde g_i)$,
$|\mathrm{Ric}_{\tilde g_i}|(x_i) \ge \frac{1}{2}$, while $|\mathrm{Ric}_{\tilde g_i}|(y_i) \le 2$.
(2.11)
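One way to see the scaling relations (2.9)–(2.10) and the resulting bounds (2.11) is sketched below. It is an illustration rather than the argument of the text: $\tilde\lambda_j$ denotes the eigenvalues of $\tilde K_i$, the $o(1)$ terms are those controlled by (2.10), and the inequalities in the last two lines are meant for $i$ sufficiently large.

```latex
% Rescaling \tilde g_i = (\bar K_i)^2 g_i: norms scale by inverse powers of \bar K_i.
\begin{align*}
  |\tilde K_i| = \frac{|K_i|}{\bar K_i}, \qquad
  |\tilde H_i| = \frac{|H_i|}{\bar K_i} \le \frac{H_o}{\bar K_i}, \qquad
  |\tilde R_i| = \frac{|R_i|}{\bar K_i^{2}} \le \frac{\Lambda_o}{\bar K_i^{2}} .
\end{align*}
% Tracing the Gauss equation (2.5) in an eigenframe of \tilde K_i:
\begin{align*}
  \widetilde{\mathrm{Ric}}_{jj}
    &= \tilde\lambda_j^{2} - \tilde H_i\,\tilde\lambda_j + O(|\tilde R_i|)
     = \tilde\lambda_j^{2} + o(1), \\
  \text{at } x_i:\quad
  |\widetilde{\mathrm{Ric}}|^{2}
    &= \sum_j \tilde\lambda_j^{4} + o(1)
     \ge \tfrac{1}{3}\Bigl(\sum_j \tilde\lambda_j^{2}\Bigr)^{2} + o(1)
     = \tfrac{1}{3} + o(1) \ge \tfrac{1}{4}, \\
  \text{at any } y_i:\quad
  |\widetilde{\mathrm{Ric}}|^{2}
    &\le \Bigl(\sum_j \tilde\lambda_j^{2}\Bigr)^{2} + o(1) \le 1 + o(1) \le 4 .
\end{align*}
```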
By Proposition 1.5, if the geodesic ball $(\tilde B_{x_i}(10), \tilde g_i)$ in $\Sigma_i$ is sufficiently collapsed, it may be unwrapped by passing to a sufficiently large finite cover, so that $\mathrm{vol}\,\tilde B_{x_i}(1) \ge 10^{-1}$. We assume this has been done, without change of the notation. It follows from Proposition 1.4 and Remark 1.6 (iii) that the local $C^{1,\beta}$ geometry of $(\Sigma_i, \tilde g_i)$ is uniformly controlled (independent of $(\Sigma_i, \tilde g_i)$) in the ball $(\tilde B_{x_i}(10), \tilde g_i)$. Thus, there are harmonic coordinate charts on balls $\tilde B(r_o)$ of a uniform size $r_o > 0$, within $\tilde B_{x_i}(10)$, in which the metric components are uniformly controlled in the $C^{1,\beta}$ norm.
Next, the Gauss–Codazzi equation (1.6) and the constraint equation (1.2) give $d\tilde K = \tilde R_T$, $\delta\tilde K = 0$
(2.12)
on $(\Sigma_i, \tilde g_i)$, where $\tilde K = \tilde K_i$. The system (2.12) is a first order uniformly elliptic system for $\tilde K$. In local harmonic coordinates, the coefficients of this system are uniformly bounded in $C^{1,\beta}$. Hence, from the bounds (2.9) and (2.10), standard elliptic interior regularity estimates imply a uniform bound $|\tilde K|_{L^{1,p}} \le C$,
(2.13)
where $p < \infty$, $C$ depends only on $p$, and the $L^{1,p}$ norm is taken over any ball $\tilde B(r_o)$ in $\tilde B_{x_i}(5)$. In particular, Sobolev embedding gives a uniform local $C^{\alpha}$ bound on $\tilde K$. By Proposition 1.4, the manifolds $(\Sigma_i, \tilde g_i, x_i)$ have a subsequence converging in the $C^{1,\beta}$ topology to a $C^{1,\beta}$ limit $(\Sigma_\infty, \tilde g_\infty, x_\infty)$. Equation (2.2) holds on each $\Sigma_i$ and so it holds weakly on the limit, i.e. when paired by integration with any $L^{1,p}$ test form of compact support. Hence, since $\tilde R_i \to 0$ in $L^\infty$ by (2.10), so that the term $\delta\tilde R_T$ in (2.2) vanishes in the limit, on the limit $(\Sigma_\infty, \tilde g_\infty)$ the limit form $K = \tilde K_\infty$ satisfies $2D^*DK = -R(K)$
(2.14)
weakly in $L^{1,p}$. This is an elliptic equation with $C^{1,\beta}$ coefficients, so elliptic regularity, cf. [27, Ch. 9.6], implies that $K$ is $L^{3,p}$ smooth, for any $p < \infty$, and hence by Sobolev embedding, $K$ is $C^{2,\beta}$ smooth. In particular, as in (2.3), we then have $\Delta|K|^2 = 2|DK|^2 + \langle R(K), K\rangle$.
(2.15)
However, at the limit base point $x_\infty$, $|K|$ is maximal, so $\Delta|K|^2 \le 0$. Further, the estimates (2.10)–(2.11) together with (2.4) imply that the curvature term in (2.15) is non-negative. Hence, at $x_\infty$, we must have $|K| = 0$. However, this contradicts the estimate (2.9), which, by (2.13), passes to the limit. This contradiction thus establishes (2.7).

By Remark 1.6 (iii), the estimate (2.7) on the Ricci curvature gives a priori local $C^{1,\beta}$ control on the metric $g_\tau$ in harmonic coordinates, i.e. an a priori lower bound on the $C^{1,\beta}$ harmonic radius $r_h(x) \ge r_o = r_o(\Lambda_o, H_o) > 0$,
(2.16)
at any $x \in \Sigma_\tau$, provided one has a lower bound on the volume $\mathrm{vol}_{g_\tau} B_x(1)$. This will not be the case when $\mathrm{vol}_{g_\tau} B_x(1)$ is too small, but in that case $B_x(1)$ may be unwrapped by passing to covering spaces as in Proposition 1.5 (assuming $B_x(1) \neq S^3/\Gamma$). Thus, one obtains such a lower bound on $r_h(x)$ in the covering. Given such a lower bound on $r_h$, one may apply standard elliptic estimates, cf. [27, §8.8], to the lapse equation (1.5) to control the behavior of $\alpha$, since the coefficients of $\Delta$ are controlled in $C^{1,\beta}$ in harmonic coordinates on $B_x(r_h(x))$. (Note that the lapse equation is invariant under coverings.) Thus, together with the uniform bound on $|K|$ in (2.7), it follows that $\sup\alpha / \inf\alpha \le A_o < \infty$,
(2.17)
where the sup and inf are taken over any $B_x(r_o) \subset \Sigma_\tau$, and $A_o$ depends only on $\Lambda_o$, $H_o$. Similarly, given any fixed $x \in \Sigma_\tau$, if $\alpha$ is renormalized so that $\bar\alpha = \alpha/\alpha(x)$, then elliptic
regularity [27, §9.5] applied to the lapse equation (1.5) divided by $\alpha(x)$ gives a uniform bound $|D^2\bar\alpha|_{L^p(B_x(r_o))} \le A_1$,
(2.18)
where $A_1 = A_1(\Lambda_o, H_o, p)$; (2.18) is assumed to be taken over the local unwrapping covering if $\Sigma_\tau$ is very collapsed at $x$. In particular, by Sobolev embedding, there is a constant $A_2 = A_2(\Lambda_o, H_o)$ such that $|\nabla\bar\alpha|_{L^\infty} \le A_2$. Similarly, consider Eq. (2.2) again, i.e. $2D^*DK = \delta R_T - R(K)$. This is a second order elliptic system in $K$, and the terms $R_T$ and $R(K)$ are bounded in $L^\infty$ by (2.1) and (2.7). Thus, consider the elliptic operator $D^*D$ as a mapping $D^*D : L^{1,p} \to L^{-1,p}$. In local harmonic coordinates, i.e. within the harmonic radius, the coefficients of $D^*D$ are controlled in $C^{1,\beta}$. It follows from elliptic regularity and the bounds above that $K$ is controlled in $L^{1,p}$, for any $p < \infty$, i.e. there is a $K_1 = K_1(\Lambda_o, H_o, p)$ s.t. $\|K\|_{L^{1,p}(B(r_o))} \le K_1$.
(2.19)
There is of course no general a priori lower bound on the volumes $\mathrm{vol}\,B_x(1)$ (cf. also the collapse discussion in Sect. 3). However, if some initial CMC surface $\Sigma_{\tau_o}$ as in (0.2) is given which has a fixed lower bound on $\mathrm{vol}\,B_p(1)$, $\forall p \in \Sigma_{\tau_o}$, then the following result shows that one has a lower bound on $\mathrm{vol}\,B_x(1)$ at all points $x \in \Sigma_\tau$ within bounded proper distance to $\Sigma_{\tau_o}$.

Lemma 2.3. Let $\Sigma_{\tau_o}$ be a CMC surface in the space-time $(M, g)$ satisfying (2.1), and suppose $\mathrm{vol}\,B_p(1) \ge \nu_o > 0$,
(2.20)
for all $p \in \Sigma_{\tau_o}$. Let $\Sigma_\tau$ be any other compact CMC surface in $(M, g)$ with mean curvature $\tau = H$ and suppose, for $x \in \Sigma_\tau$, $t(x) = \mathrm{dist}_g(x, \Sigma_{\tau_o}) \le T_o$. Then there are positive constants $\nu_1$ and $r_1$, depending only on $(H, H_o, \Lambda_o, \nu_o, T_o)$, s.t. $\mathrm{vol}\,B_x(1) \ge \nu_1$, $r_h^{1,\beta}(x) \ge r_1$.
(2.21)
Proof. Consider neighborhoods $U$ in $\Sigma_\tau$ as graphs over $\Sigma_{\tau_o}$. Thus, for any $p \in \Sigma_{\tau_o}$, the normal exponential map to $\Sigma_{\tau_o}$ induces a continuous map $F : B_p(1) \to U \subset \Sigma_\tau$, $F(q) = \exp(\varphi(q) \cdot T)$, where $T$ is the time-like unit normal to $\Sigma_{\tau_o}$ and $\varphi$ is a positive (or negative) function. We claim that $F$ is a quasi-isometry, with distortion factor depending only on the bounds $(H_o, \Lambda_o, T_o)$. To see this, consider the 1-parameter interpolation $F_s(q) = \exp(s\varphi(q) \cdot T)$. The a priori bounds on $|K|$ from (2.7), which controls the infinitesimal distortion of $g_\tau$ in the unit normal direction, and on $\alpha$ from (2.17), imply that for $s$ small, $F_s$ has metric distortion at most $C(H_o, \Lambda_o) \cdot s$ onto its image. Iterating this control inductively to $s = 1$ then gives the claim. Since $F$ is a quasi-isometry, it is clear that the estimate (2.20) implies the 1st estimate in (2.21), while the 2nd then follows from the arguments preceding the lemma.
Observe that if $\tau$ is bounded away from 0, i.e. $|\tau| \ge \bar\tau > 0$, then (1.7) implies that $\alpha$ is bounded above, depending only on $\bar\tau$. Hence, one has a global estimate for the proper time, i.e. $t(x) \le T_o(\bar\tau)$,
(2.22)
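for all $x \in \Sigma_\tau$. We need one further a priori estimate for the proof of Theorem 0.1.

Lemma 2.4. For $\Sigma_\tau \subset \Omega$, as in (2.1), there is a constant $\Lambda_1 < \infty$, depending only on $H_o$, $\Lambda_o$, $\mathrm{vol}\,\Sigma_\tau$ and an initial surface $\Sigma_{\tau_o}$ as in (0.2), such that $\|\nabla\mathrm{Ric}_{\Sigma_\tau}\|_{L^2} \le \Lambda_1$, and $\|\nabla^2 K_{\Sigma_\tau}\|_{L^2} \le \Lambda_1$.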
(2.23)
Proof. Given the estimates above from Proposition 2.2 through Lemma 2.3, (2.23) follows from estimates for the 1st order Bel–Robinson energy and we refer to [19, 6, 8] for some further details. Thus, let $Q_1$ be the 1st order Bel–Robinson tensor associated to the Weyl-type field $W^1 = \nabla_T W$, where $W$ is the Weyl tensor of $(M, g)$ and $T$ is the unit normal to the foliation $\mathcal F$. Then there are numerical constants $c$, $C$ such that, for
$Q_1(\Sigma_\tau) = \int_{\Sigma_\tau} Q_1(T, T, T, T)$, (2.24)
one has $cQ_1(\Sigma_\tau) \le \|\nabla\mathrm{Ric}_{\Sigma_\tau}\|^2_{L^2} + \|\nabla^2 K_{\Sigma_\tau}\|^2_{L^2} \le CQ_1(\Sigma_\tau)$. Further, $Q_1$ obeys the differential inequality $|\frac{d}{d\tau}Q_1(\Sigma_\tau)| \le M(1 + Q_1(\Sigma_\tau) + F(\Sigma_\tau))$, where $M$ depends only on the $L^\infty$ norm of $K$ and $|\nabla\log\alpha|$ on $\Sigma_\tau$, and $F(\Sigma_\tau) = \int_{\Sigma_\tau}(|\mathrm{Ric}|^4 + |DK|^4)$. By (2.7) and (2.18 ff), $M$ is uniformly bounded in terms of $\Lambda_o$, $H_o$, while (2.7) and (2.1) give, via the Hölder inequality, a uniform bound on $F(\Sigma_\tau)$ depending only on $\Lambda_o$, $H_o$ and an upper bound for $\mathrm{vol}\,\Sigma_\tau$. This gives control on $M$ and $F(\Sigma_\tau)$ and hence (2.23) follows by integration w.r.t. $\tau$.

Remark 2.5. In Sect. 4, we will also use the analogous 0th order Bel–Robinson estimate. Thus, the Bel–Robinson energy $Q_o(\Sigma_\tau)$, given by
$Q_o(\Sigma_\tau) = \int_{\Sigma_\tau} |\mathrm{Ric}|^2 + |dK|^2$, (2.25)
satisfies $|\frac{d}{d\tau}Q_o(\Sigma_\tau)| \le cMQ_o(\Sigma_\tau)$, where $M = |K|_{L^\infty} + |\nabla\log\alpha|_{L^\infty}$ on $\Sigma_\tau$, and $c$ is numerical.
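The integration step that closes the proof of Lemma 2.4, and its analogue for Remark 2.5, is a Gronwall-type argument. A sketch, in which $Q_1(\tau)$ and $Q_o(\tau)$ abbreviate the energies of the leaf with mean curvature $\tau$, and $\bar M$, $\bar F$ are hypothetical names for uniform upper bounds on $M$ and $F$ over the $\tau$-interval in question:

```latex
% Gronwall integration of |dQ_1/d\tau| \le M (1 + Q_1 + F), with M \le \bar M, F \le \bar F.
\begin{align*}
  u(\tau) := 1 + Q_1(\tau) + \bar F
  &\;\Longrightarrow\; |u'(\tau)| \le \bar M\, u(\tau)
   \;\Longrightarrow\; u(\tau) \le u(\tau_o)\, e^{\bar M |\tau - \tau_o|}, \\
  \text{i.e.}\quad
  Q_1(\tau) &\le \bigl(1 + Q_1(\tau_o) + \bar F\bigr)\, e^{\bar M|\tau - \tau_o|} - 1 - \bar F .
\end{align*}
% The 0th order estimate of Remark 2.5 integrates in the same way:
\begin{align*}
  Q_o(\tau_o)\, e^{-c\bar M|\tau - \tau_o|} \;\le\; Q_o(\tau) \;\le\; Q_o(\tau_o)\, e^{c\bar M|\tau - \tau_o|} .
\end{align*}
```

This is why the constant in (2.23) depends on an initial surface: the bound is only as good as the initial energy $Q_1(\tau_o)$ and the length of the $\tau$-interval.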
This discussion now easily leads to the proof of Theorem 0.1.

Proof of Theorem 0.1. Suppose $\sigma(\Sigma) \le 0$ and let $\Sigma_{\tau_i}$ be a sequence of compact leaves in $M_{\mathcal F}$, with $\tau_i \to \bar\tau < 0$. We may assume that $\{\tau_i\}$ is either increasing or decreasing. Suppose $\Sigma_{\tau_i}$ does not approach any point of $\partial M$, so that by the discussion at the beginning of Sect. 2, the estimate (2.1) holds, for some constant $\Lambda_o$, uniformly on $\{\Sigma_{\tau_i}\}$. Since $\bar\tau < 0$, the estimate (2.22) implies that the proper time function $t$ is uniformly bounded above on $\{\Sigma_{\tau_i}\}$. Thus, the proof of Lemma 2.3 implies that the surfaces $\{\Sigma_{\tau_i}\}$ are all uniformly quasi-isometric. Together with the bound on $|\mathrm{Ric}|$ from (2.7), we see that the hypotheses (1.17)–(1.18) of Proposition 1.4 hold on $\{\Sigma_{\tau_i}\}$. Proposition 1.4, Remark 1.6(i) and Lemma 2.4 thus imply that a subsequence of $\{\Sigma_{\tau_i}\}$ converges in the weak $L^{3,2}$ topology to a limit $L^{3,2}$ CMC surface $\Sigma_{\bar\tau} \subset (M, g)$
with $L^{2,2}$ extrinsic curvature $K_{\bar\tau}$. Similarly, the extrinsic curvatures $K_i$ of $\Sigma_{\tau_i}$ converge in weak $L^{2,2}$ to the limit $K_{\bar\tau}$ on $\Sigma_{\bar\tau}$. Now the Cauchy problem on $\Sigma_{\bar\tau}$ with data $(g_{\bar\tau}, K_{\bar\tau})$ in $(L^{3,2}, L^{2,2})$ is uniquely locally solvable, cf. [24], and hence there exist unique local developments of this data for a short time into the past and future of $\Sigma_{\bar\tau}$. By the uniqueness, such a development must lie in a region of the $C^\infty$ smooth space-time $(M, g)$. Hence, by elliptic regularity applied to Eq. (2.2), it follows that $\Sigma_{\bar\tau}$ is $C^\infty$ smooth, with $C^\infty$ extrinsic curvature. The limit CMC surface $\Sigma_{\bar\tau}$ is the unique surface in $(M, g)$ with mean curvature $\bar\tau$, [33], and hence the sequence $\{\Sigma_{\tau_i}\}$ (and not just a subsequence) converges smoothly to the limit $\Sigma_{\bar\tau}$. Thus, the smooth foliation $M_{\mathcal F}$ extends a definite amount past $\Sigma_{\bar\tau}$. This completes the proof.

Remark 2.6. The only place the assumption $\sigma(\Sigma) \le 0$ was used in the proof of Theorem 0.1 was to obtain a uniform upper bound on the lapse function $\alpha$ from (1.7), which in turn was used only to obtain an upper bound on the proper time to $\{\Sigma_{\tau_i}\}$ via (2.22). Hence, Theorem 0.1 remains true for any value of $\sigma(\Sigma)$ whenever there is an upper bound for $t_\tau$ on $\{\Sigma_{\tau_i}\}$. More generally, since Lemma 2.3 and the results preceding it are all local, one needs only an upper bound on $t(x)$ on domains in $\{\Sigma_{\tau_i}\}$. This will be discussed further in Sect. 4.

Remark 2.7. As noted in the Introduction, Theorem 0.1 may be applied to the past of the initial surface $\Sigma$, for any value of $\sigma(\Sigma)$ by Remark 2.6. Hence either the curvature in the sense of (0.7) blows up in the finite proper and CMC past of $\Sigma$, or one has global CMC time existence to the past, and in the limit $\tau \to -\infty$, the space-time $(M, g)$ approaches a crushing singularity.
It is well-known (by a standard argument as in the Hawking singularity theorem via the Raychaudhuri equation (1.13)) that a globally hyperbolic space-time $(M, g)$ does not extend, as a globally hyperbolic space-time, to the past of a (compact) crushing singularity. It follows that in any extension $(M', g')$ of $(M, g)$, the boundary $\partial_P M_{\mathcal F}$ of $M_{\mathcal F}$ to the past of $\Sigma_{\tau_o}$ is the past Cauchy horizon of $\Sigma$. Examples with this behavior of course do occur, most notably in the Taub-NUT metric, cf. [31, Ch. 5.8].

3. Asymptotics of Future Complete Space-Times

This section is mainly concerned with the proof of Theorem 0.3. In Sect. 3.1 we prove the result under an extra hypothesis concerning the asymptotic behavior of the mean curvature $H = \tau$, but without the bound on $\nabla R$ in (0.11). This hypothesis is then removed, i.e. proved to always hold, in Sect. 3.2. In Sect. 3.3 we make a number of remarks on the collapse situation. First however we note that a space-time satisfying the assumptions of Theorem 0.3 must be geodesically future complete. In fact, the much weaker assumption of a uniform curvature bound $|R| \le \Lambda_o < \infty$
(3.1)
to the future of an initial surface $\Sigma_{\tau_o}$ implies by Theorem 0.1 that the CMC foliation $\mathcal F$ exists for all CMC time $\tau \in [\tau_o, 0)$. By the assumption in Theorem 0.3 that $\partial_o M_{\mathcal F} = \emptyset$ in $M$, it follows that $(M, g)$ is future geodesically complete. The remainder of the proof is thus concerned only with the weak geometrization of $\Sigma$.
3.1. We first prove Theorem 0.3 in a special case. Thus, in contrast to (0.9), define $t_{\min}(\tau) = \min\{t(x) : x \in \Sigma_\tau\}$.
(3.2)
As in (1.12), the product $H \cdot t_{\min}$ is scale invariant.

Theorem 3.1. Suppose $(M, g)$ is a CMC cosmological space-time with $M = M_{\mathcal F}$ (to the future of $\Sigma_{\tau_o}$), and that $(M, g)$ satisfies the curvature assumption $|R|(x) \le C/t^2(x)$,
(3.3)
for some $C < \infty$, cf. (0.11). Further, given a sequence $\{\tau_i\} \to 0$, suppose there exists $\delta > 0$ s.t. $|H_{\tau_i}| \ge \delta/t_{\min}(\tau_i)$.
(3.4)
Then there is a subsequence of $\{\tau_i\}$, denoted also $\tau_i$, such that the rescaled metrics $(\Sigma_{\tau_i}, \bar g_{\tau_i})$ as in (0.10) converge to a weak geometrization of $\Sigma$.

Proof. Given a sequence of surfaces $\Sigma_{\tau_i}$, $\tau_i \to 0$, to simplify notation, we let $\Sigma_i = \Sigma_{\tau_i}$. Then combining the assumption (3.4) with the general bound (1.12) gives $\delta/t_{\min}(\tau_i) \le |H_i| \le 3/t_{\max}(\tau_i) \equiv 3/t_{\tau_i}$,
(3.5)
and hence on each $\Sigma_i$, $t_{\tau_i} = t_{\max}(\tau_i) \le \frac{3}{\delta}t_{\min}(\tau_i)$. This means that the surfaces $\Sigma_i$ lie in time-annuli of uniformly bounded ratios, i.e. $\Sigma_i \subset A(\delta t_{\tau_i}/3, t_{\tau_i})$,
(3.6)
where $A(r, s) = t^{-1}(r, s)$. Consider then the rescaled space-time metric $\bar g_i = t_{\tau_i}^{-2} \cdot g$, $\bar g_i = -\bar\alpha_i^2 d\bar\tau_i^2 + \bar g_{\bar\tau_i}$.
(3.7)
Note that the vacuum Einstein equations are of course invariant under rescaling. Here, the mean curvature parameter $\tau$ is replaced by $\bar\tau_i = \tau \cdot t_{\tau_i}$ and, as in (1.9), $\bar\alpha_i = \alpha/t_{\tau_i}^2$. Hence the surfaces $\Sigma_i$, now denoted by $\bar\Sigma_i$ in this rescaled metric, have mean curvature $\bar H_i = \bar H(\bar\Sigma_i)$ satisfying $\delta \le |\bar H_i| \le 3$.
(3.8)
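The passage from (3.5) to (3.8) is a short scaling computation. The sketch below spells it out, assuming only the standard fact that mean curvature scales like an inverse length under a constant rescaling of the metric; no new hypotheses are introduced.

```latex
% Rescaling g -> t_{\tau_i}^{-2} g multiplies lengths by t_{\tau_i}^{-1},
% so the mean curvature is multiplied by t_{\tau_i} = t_{\max}(\tau_i).
\begin{align*}
\bar H_i &= t_{\tau_i}\, H_i, \\
|\bar H_i| &= t_{\max}(\tau_i)\,|H_i| \le 3
  && \text{(upper bound in (3.5))}, \\
|\bar H_i| &= t_{\max}(\tau_i)\,|H_i|
  \ge \delta\,\frac{t_{\max}(\tau_i)}{t_{\min}(\tau_i)} \ge \delta
  && \text{(lower bound in (3.5) and } t_{\max} \ge t_{\min}\text{)}.
\end{align*}
```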
Let $\bar t_i$ be the proper time function w.r.t. $\bar g_i$, so that $\bar t_i = t/t_{\tau_i}$. Thus, (3.6) translates to
\[ \bar\Sigma_i \subset \bar A(\delta/3, 1), \tag{3.9} \]
where $\bar A(r, s) = \bar t_i^{-1}(r, s)$, so that the surfaces $\bar\Sigma_i$ lie in time-annuli of uniformly bounded inner and outer radii. Of course, the leaves $\Sigma_\tau$ of the foliation $\mathcal{F}$ are now considered as leaves $(\bar\Sigma_{\bar\tau_i}, \bar g_{\bar\tau_i})$ of the foliation $\bar{\mathcal{F}}_i$ of the rescaled space-time $(\bar M_i, \bar g_i)$.

By scaling properties of curvature and the curvature assumption (3.3), we have
\[ |\bar R_i|(x) = t_{\tau_i}^2 |R|(x) \le C \cdot t_{\tau_i}^2 \cdot t(x)^{-2} = C \cdot \bar t_i(x)^{-2}, \tag{3.10} \]
and so, within any time-annulus $\bar A(\kappa, \kappa^{-1})$,
\[ |\bar R_i| \le C = C(\kappa). \tag{3.11} \]
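For concreteness, the dependence of the constant in (3.11) on $\kappa$ can be made explicit. The following is only a sketch of one admissible choice, assuming $0 < \kappa \le 1$ and writing $C$ for the constant of (3.3).

```latex
% On \bar A(\kappa, \kappa^{-1}) the rescaled proper time satisfies
% \kappa < \bar t_i(x) < \kappa^{-1}, so (3.10) gives
\[
|\bar R_i|(x) \;\le\; C\,\bar t_i(x)^{-2} \;\le\; C\,\kappa^{-2},
\]
% i.e. C(\kappa) = C\kappa^{-2} works, independently of i.
```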
General Relativity and Geometrization
Thus, by Proposition 2.2 the leaves $(\bar\Sigma_{\bar\tau_i}, \bar g_{\bar\tau_i})$ of $\bar{\mathcal{F}}_i$ within $\bar A(\kappa, \kappa^{-1})$, cf. (3.9), have uniformly bounded intrinsic curvature and extrinsic curvature $\bar K$. Observe also that by (3.8), within $\bar A(\kappa, \kappa^{-1})$,
\[ |\bar K|^2 \ge |\bar H|^2/3 \ge \delta^2/3. \tag{3.12} \]
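The first inequality in (3.12) is the Cauchy–Schwarz inequality for the eigenvalues of a symmetric 2-tensor in dimension 3. A minimal derivation, with the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ introduced here only for illustration:

```latex
% At a point, diagonalize K in an orthonormal frame of the leaf,
% with eigenvalues \lambda_1, \lambda_2, \lambda_3, so H = tr K.
\[
H^2 = (\lambda_1 + \lambda_2 + \lambda_3)^2
    \;\le\; 3\,(\lambda_1^2 + \lambda_2^2 + \lambda_3^2) = 3\,|K|^2 ,
\]
% with equality if and only if \lambda_1 = \lambda_2 = \lambda_3,
% i.e. if and only if K is pure trace.
```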
These bounds on $\bar K$ applied to the lapse estimates (1.7) in the $\bar g_i$ scale immediately give uniform bounds for the lapse function $\bar\alpha_i$ on the leaves $\bar\Sigma_{\bar\tau_i}$ within $\bar A(\kappa, \kappa^{-1})$, i.e.
\[ 1 \le \sup \bar\alpha_i / \inf \bar\alpha_i \le C, \tag{3.13} \]
where sup and inf are taken over $\bar\Sigma_{\bar\tau_i}$ and $C = C(\kappa)$. At least when (3.4) holds for all $\tau \ge -1$, (3.13) implies that the original lapse function $\alpha$ on $(M, g)$ satisfies $\alpha \sim t^2$, i.e. the ratio $\alpha/t^2$ is bounded above and below away from 0 within $A(\kappa t_{\tau_i}, \kappa^{-1} t_{\tau_i})$ as $t_{\tau_i} \to \infty$.

The volume estimate (1.16) is also scale-invariant, and hence we also have a uniform upper bound
\[ \mathrm{vol}_{\bar g_{\tau_i}} \bar\Sigma_i \le V_o, \tag{3.14} \]
for some $V_o < \infty$. However, the diameter of $(\bar\Sigma_i, \bar g_{\tau_i})$ may or may not be uniformly bounded. (We remark that this is in contrast to the Riemannian situation, where one can obtain a priori diameter bounds, in addition to volume bounds, from (0.1), or just the assumption Ric ≥ 0.) Hence, in passing to limits, one must choose base points, and the form of the limits will then depend on the choice of the base points. Again, however, note that all base points on $\bar\Sigma_i$ lie in fixed time annuli as in (3.9).

Thus, let $x_i \in \bar\Sigma_i$ be any sequence of base points and consider the behavior of the pointed sequence $(\bar\Sigma_i, \bar g_{\tau_i}, x_i)$. By Propositions 1.4 and 1.5, we have two possible cases.

Case A (Non-Collapse). Suppose the sequence $(\bar\Sigma_i, \bar g_{\tau_i}, x_i)$ is non-collapsing, in that (1.18) holds with $\bar g_{\tau_i}$ in place of $g_i$. This lower volume bound implies that there exists a constant $\nu_o > 0$ such that
\[ \mathrm{vol}_{\bar g_{\tau_i}} \bar\Sigma_i = \mathrm{vol}\, \Sigma_i / t_{\tau_i}^3 \ge \nu_o, \tag{3.15} \]
for $t_{\tau_i} \to \infty$. By the monotonicity property (1.16), it follows that (3.15) holds for all $t_\tau$, and we may assume that $\nu_o$ is the limiting minimal value.

As noted above, all leaves $\bar\Sigma_{\bar\tau_i}$ in annuli $\bar A(\kappa, \kappa^{-1})$ have uniform $L^\infty$ bounds on the intrinsic curvature Ric and so the spatial metrics $\bar g_{\bar\tau_i}$ are uniformly controlled locally in $L_x^{2,p}$, the spatial Sobolev space along the leaves $\bar\Sigma_{\bar\tau_i}$. By (2.19), there is a uniform local $L_x^{1,p}$ bound on $K$, while by (2.18), $\bar\alpha_i$ is also uniformly bounded locally in $L_x^{2,p}$ in $\bar A(\kappa, \kappa^{-1})$.

Similarly, we claim that the time derivative $\partial_\tau^2 \bar g_i$ is uniformly bounded locally in $L^p$ in $\bar A(\kappa, \kappa^{-1})$, where $\tau = \bar\tau_i$ is the time parameter. For by (1.3) and the estimates above, $\partial_\tau \bar g_i = -2\bar\alpha_i \bar K_i$ is uniformly bounded in $L_x^{1,p}$. We have $\partial_\tau^2 \bar g_i = -2(\partial_\tau \bar\alpha_i)\bar K_i - 2\bar\alpha_i\, \partial_\tau \bar K_i$. The second term is uniformly bounded locally in $L^p$ by the estimates above applied to the evolution equation (1.4). Further, by differentiating the lapse equation (1.5) w.r.t. $\tau$ and using standard formulas for the derivative of $\Delta$ (cf. [14, 1.184]), one finds that $(-\Delta + |K|^2)\,\partial_\tau \bar\alpha_i$ is uniformly bounded locally in $L_x^p$. Hence by elliptic regularity, $\partial_\tau \bar\alpha_i$ is uniformly bounded locally in $L_x^{2,p}$, which proves the claim.
It follows that the sequence of vacuum space-times in (3.7) has a subsequence converging in the weak $L^{2,p}$ topology to a maximal limit $C^{1,\beta} \cap L^{2,p}$ vacuum space-time $(\bar M_\infty, \bar g_\infty)$, with
\[ \bar g_\infty = -\bar\alpha_\infty^2\, d\bar\tau_\infty^2 + \bar g_{\bar\tau_\infty}. \tag{3.16} \]
(More precisely, one should first take a limit within the time-annuli $\bar A(\kappa, \kappa^{-1})$ and then let $\kappa = \kappa_j$, $j = j(i)$, tend to 0 in a suitable diagonal subsequence.) Here $\bar\tau_\infty$ is the mean curvature parametrization given by the limit of the parametrizations $\bar\tau_i$ from (3.7) and $\bar\alpha_\infty$ is the limit of the lapse functions $\bar\alpha_i$. The metric (3.16) is a weak solution of the vacuum equations (0.1).

By the volume monotonicity (1.16) used above, this limit space-time is a "volume cone", i.e.
\[ \mathrm{vol}\, \bar\Sigma_{\bar\tau_\infty} / (\bar t_\infty)^3 \equiv \nu_1 > 0, \tag{3.17} \]
where $\bar t_\infty$ is the limit of the renormalized proper time functions $\bar t_i$ following (3.8) and $\nu_1 \le \nu_o$. By Corollary 1.3, it thus follows that $\bar g_\infty$ is flat, and $\bar g_{\bar\tau_\infty}$ is complete and hyperbolic (i.e. of constant negative curvature). Thus, the limit is a flat Lorentzian cone on a hyperbolic manifold, as in (0.12).

Note that the limit surface $\bar\Sigma_\infty = \lim \bar\Sigma_i$ may be compact, in which case it is diffeomorphic to $\Sigma$, or non-compact, corresponding to the possibility $\mathrm{diam}_{\bar g_i} \bar\Sigma_i \to \infty$. The limit lapse function $\bar\alpha_\infty \equiv 1$, so that the leaves $\bar\Sigma_{\bar\tau_\infty}$ are level sets of $\bar t_\infty$, while the limit extrinsic curvature $\bar K_\infty$ is pure trace, with $|\bar K_\infty|^2 = 3$ on $\bar\Sigma_\infty$. Observe that consequently, not only does (3.4) hold, but in fact
\[ H(\Sigma_i) \cdot t(y_i) \to 3, \tag{3.18} \]
for all $y_i \in \Sigma_i$ such that $\mathrm{dist}_{\bar g_{\bar\tau_i}}(x_i, y_i) \le D$, for any fixed $D < \infty$. Thus, we see that the only limit geometry on the CMC leaves arising in the non-collapse situation is the hyperbolic geometry.

Case B (Collapse). Suppose the sequence $(\bar\Sigma_i, \bar g_{\tau_i}, x_i)$ is collapsing, so that (1.19) holds, again with $\bar g_{\bar\tau_i}$ in place of $g_i$. By (3.10), it is then collapsing with bounded curvature, and thus, for any $R < \infty$, the geodesic $R$-balls about $x_i$ in $(\bar\Sigma_i, \bar g_{\bar\tau_i})$ are Seifert fibered spaces or torus bundles over an interval $I$, with collapsing fibers. In particular, this part of $\bar\Sigma_i$ is an (elementary) graph manifold, and corresponds to a piece in the decomposition of the graph manifold $G$, cf. the discussion in Sect. 0.

As discussed following Proposition 1.5, we may choose a sequence $R = R_j \to \infty$ and pass to a diagonal subsequence, which, after unwrapping the collapse, converges in $C^{1,\beta} \cap L^{2,p}$ to a complete limit Riemannian manifold $(\bar\Sigma_\infty, \bar g_\infty, x_\infty)$; (we use here the assumption $\sigma(\Sigma) \le 0$, so that $\Sigma \ne S^3/\Gamma$.) The limit is either a Seifert fibered space or torus bundle over an interval (i.e. a Sol manifold), and has either a locally free isometric $S^1$, $S^1 \times S^1$, $T^3$ or Nil action.

For the same reasons as above in Case A, preceding (3.16), the unwrapping of the collapse also gives rise to a limit space-time $(\bar M_\infty, \bar g_\infty, x_\infty)$ and a corresponding maximal CMC foliation $\bar M_{\bar{\mathcal{F}}_\infty}$ with $\bar\Sigma_\infty$ as a leaf. Thus the limit metric $\bar g_\infty$ has the form (3.16), although here of course $\bar g_{\bar\tau_\infty}$ is not hyperbolic.

Thus, combining the discussion in Cases A and B, we see that all based limits of $(\bar\Sigma_i, \bar g_{\bar\tau_i}, x_i)$ are either complete hyperbolic manifolds, complete Seifert fibered spaces or complete Sol manifolds, with a corresponding non-trivial group of isometries.
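As a consistency check on Case A, the limiting quantities can be computed by hand on the model space-time. The sketch below assumes that the flat Lorentz cone of (0.12) is written as $-dt^2 + t^2 g_{-1}$, with $g_{-1}$ a hyperbolic metric of curvature $-1$; this normalization is an assumption about notation that is not restated in this section, and signs of $K$ and $H$ are suppressed since they depend on the convention for the unit normal.

```latex
% Model: flat Lorentz cone over a hyperbolic 3-manifold (assumed form of (0.12)).
\begin{align*}
g &= -dt^2 + t^2 g_{-1}, \qquad g_t = t^2 g_{-1}, \\
K &= \pm\tfrac12\,\partial_t g_t = \pm\, t\, g_{-1}, \\
|H| &= |\mathrm{tr}_{g_t} K| = t^{-2}\cdot t \cdot 3 = \frac{3}{t}, \\
|K|^2_{g_t} &= t^{-4}\cdot t^{2}\cdot |g_{-1}|^2_{g_{-1}} = \frac{3}{t^2}, \\
\mathrm{vol}_{g_t} &= t^3\, \mathrm{vol}_{g_{-1}} .
\end{align*}
% Hence |H| t = 3, K is pure trace with |K|^2 = 3 on the leaf t = 1,
% and vol / t^3 is constant in t, matching (3.17) and (3.18).
```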
To obtain the decomposition (0.8), any Riemannian 3-manifold $(\Sigma, g)$ has a thick-thin decomposition, i.e. for any fixed $\varepsilon > 0$, we may write
\[ \Sigma = \Sigma^\varepsilon \cup \Sigma_\varepsilon, \tag{3.19} \]
where $\Sigma^\varepsilon = \{x \in \Sigma : \mathrm{vol}_g B_x(1) \ge \varepsilon\}$, $\Sigma_\varepsilon = \{x \in \Sigma : \mathrm{vol}_g B_x(1) \le \varepsilon\}$. Of course these domains need not be connected. Under a fixed curvature bound as in (3.11), this corresponds to a decomposition according to the size of the injectivity radius, as in Remark 1.6(ii). Now apply the decomposition (3.19) to each $(\bar\Sigma_i, \bar g_{\bar\tau_i})$. By the discussion in Case A, for any fixed $\varepsilon > 0$, the domains $(\bar\Sigma_i^\varepsilon, \bar g_{\bar\tau_i}) \subset (\bar\Sigma_i, \bar g_{\bar\tau_i})$, when based at a base point sequence $x_i \in \bar\Sigma_i^\varepsilon$, (sub)-converge to domains $(\bar\Sigma_\infty^\varepsilon, \bar g_\infty)$. Next choose a sequence $\varepsilon = \varepsilon_j \to 0$, giving a double sequence in $(i, j)$. Taking a suitable diagonal subsequence $j = j(i)$, the domains $\bar\Sigma_i^{\varepsilon_j}$ converge to the complete limit $(\bar\Sigma_\infty, \bar g_\infty) \subset H$, and this defines the complete hyperbolic part $H \subset \Sigma$. On the other hand, for $\varepsilon = \varepsilon_j$ sufficiently small, $(\bar\Sigma_i)_{\varepsilon_j}$ is a graph manifold, which collapses everywhere as $\varepsilon_j \to 0$. This gives the decomposition (0.8). Of course when $G$ is empty, then Case A gives convergence to the hyperbolic metric on $H = \Sigma$, while if $H$ is empty, then $G = \Sigma$ and Case B implies that the metrics $\bar g_i$ collapse everywhere, cf. also Remark 3.3 below.

We note that the transition from the $\varepsilon_j$-thick part $\bar\Sigma_i^{\varepsilon_j}$ to the $\varepsilon_j$-thin part $(\bar\Sigma_i)_{\varepsilon_j}$ takes larger and larger diameter as $i$ and $j = j(i) \to \infty$, so that the distance between these regions diverges to $\infty$. Choosing different base points gives different limits only when the distance between base points goes to $\infty$. Hence for instance if the diameter of $(\bar\Sigma_i, \bar g_{\tau_i})$ happens to remain uniformly bounded, then any limit has a unique geometry, i.e. independent of the base point.

In fact, we see that the decomposition (3.19) is naturally refined into a further decomposition of $\Sigma_\varepsilon$ according to the rank of the collapse, i.e. according to the type of group action of the limits described in Proposition 1.5; this is part of the general theory of collapse along F-structures, cf. [16, 17] and also [2] for further details. Thus, based limits of $(\bar\Sigma_i)_{\varepsilon_j}$ which have a locally free isometric $S^1$ action become infinitely distant (in space) from based limits which have a locally free isometric $S^1 \times S^1$ action. Hence from the near limiting geometry of $(\bar\Sigma_i, \bar g_i)$, one also obtains a decomposition of the graph manifold $G \subset \Sigma$ into Seifert fibered components, as discussed in Sect. 0.

The decomposition (3.19) and hence (0.8) could change with different choices of sequences $\tau_i \to 0$, as noted in Sect. 0; see also Sect. 3.3 for further discussion. This completes the proof of Theorem 3.1.

Remark 3.2. An alternate argument to the use of the volume comparison result (Corollary 1.3) in Case A can be given based on the monotonicity of the reduced Hamiltonian of Fischer–Moncrief [25]. Thus, it is proved in [25] that the function
\[ \int_{\Sigma_\tau} H^3\, dV = \tau^3\, \mathrm{vol}_{g_\tau} \Sigma_\tau \]
is monotone non-increasing in $\tau$, and is constant if and only if $\Sigma$ is hyperbolic. Of course the monotonicity of this Hamiltonian and that of the volume ratio (1.16) are closely related via (1.12).

Remark 3.3. Suppose (3.18) is strengthened to the statement that $|H(\Sigma_{\tau_i})| \cdot t_{\min}(\tau_i) \to 3$. Then together with (1.12) and the fact that $H$ is constant, it follows that $t_{\max}/t_{\min} \to 1$ on $\Sigma_i = \Sigma_{\tau_i}$. This means that the corresponding rescaled proper time $\bar t_i$ following (3.8) converges to 1 everywhere on $\bar\Sigma_i$ and hence the lapse $\bar\alpha_i$ on $\bar\Sigma_i$ also approaches a constant function everywhere on $\bar\Sigma_i$. It follows that the limit $\bar\Sigma_\infty$ of $\bar\Sigma_i$ is hyperbolic everywhere, so that $(\bar\Sigma_\infty, \bar g_\infty)$ is a compact hyperbolic manifold, diffeomorphic to $\Sigma$, i.e. $G = \emptyset$.
We claim that in this situation, the rescalings $(\bar\Sigma_{\bar\tau}, \bar g_{\bar\tau})$ converge to the hyperbolic limit $(\Sigma, \bar g_\infty)$ for all sequences $\tau \to 0$, i.e. by Mostow rigidity, the limit is unique. For the monotonicity of (3.15) implies that any limit of a sequence gives rise to a complete hyperbolic manifold $H$ embedded in $\Sigma$, as in (0.8). Now Thurston's cusp closing theorem, cf. [40], implies that any hyperbolic cusp $H$ embedded in $\Sigma$ has volume strictly larger than that of the compact hyperbolic metric $(\Sigma, \bar g_\infty)$. Since the graph manifold part $G$ in (0.8) has non-negative volume in the limit, the claim follows from the volume monotonicity.

Similarly, if the ratio in (3.15) converges to 0 on some sequence $\tau_i \to 0$, then the monotonicity implies it converges to 0 on all such sequences. Hence, by Case B above, $H = \emptyset$, $\Sigma = G$ always, and all sequences $(\bar\Sigma_i, \bar g_i)$ collapse as $\tau_i \to 0$.

3.2. In this section, we prove that the assumption (3.4) always holds, at least when the curvature assumption (3.3) is strengthened to (0.11).

Theorem 3.4. Let $(M, g)$ be a cosmological CMC space-time satisfying the curvature assumption (0.11) and $M = M_{\mathcal{F}}$. Then there exists a constant $\delta = \delta(M, g) > 0$ such that
\[ |H| \ge \delta/t_{\min}. \tag{3.20} \]
Proof. The proof will proceed in several steps, but overall it proceeds by contradiction, and so we suppose there exists a sequence $\tau_i \to 0$, and so $t_{\min}(\tau_i) \to \infty$, such that
\[ t_{\min}(\tau_i) \cdot |H_{\tau_i}| \to 0, \quad \text{as } i \to \infty. \tag{3.21} \]
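The contradiction hypothesis (3.21) is a statement about a scale-invariant quantity, which is what allows it to pass to rescaled limits. A brief check, with the constant $\lambda > 0$ introduced here only for illustration:

```latex
% Constant rescaling g -> \lambda^2 g of the space-time metric:
\begin{align*}
t &\mapsto \lambda\, t, &
H &\mapsto \lambda^{-1} H, &
t_{\min}\,|H| &\mapsto t_{\min}\,|H| .
\end{align*}
% With \lambda = t_{\min}(\tau_i)^{-1}, the base points realizing t_{\min}
% have rescaled proper time 1, while the rescaled mean curvature of the
% leaf equals t_{\min}(\tau_i)\,|H_{\tau_i}|, which tends to 0 by (3.21).
```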
Let $x_i \in \Sigma_i$ be base points realizing $t_{\min}$, so $t(x_i) = t_{\min}(\tau_i)$. In contrast to (3.7), throughout this section we consider the rescaled space-time metrics
\[ \bar g_i = t_{\min}(\tau_i)^{-2} \cdot g, \tag{3.22} \]
with renormalized proper time $\bar t_i = t/t_{\min}(\tau_i)$ and CMC time $\bar\tau_i = \tau \cdot t_{\min}(\tau_i)$. As in the proof of Theorem 3.1, the argument in (3.10) shows that the curvature $\bar R_i$ of $\bar g_i$ is uniformly bounded in the region where $\bar t_i \ge t_o$, for any given $t_o > 0$. It then follows from the arguments in Sect. 3.1, cf. the discussion preceding (3.16), that the space-times $(M, \bar g_i, x_i)$ have a subsequence converging in the weak $L^{2,p}$ topology to a limit $L^{2,p} \cap C^{1,\beta}$ maximal vacuum space-time $(\bar M_\infty, \bar g_\infty, x_\infty)$, where one must pass to covers in the case of collapse. In this latter case, $(\bar M_\infty, \bar g_\infty)$ has at least a free isometric space-like $S^1$ action. The parameters $\bar t_i$ and $\bar\tau_i$ converge to limit parameters $\bar t_\infty$ and $\bar\tau_\infty$.

The CMC surfaces $\Sigma_{\tau_i}$, now labeled as $\bar\Sigma_i$, are such that a subsequence of $(\bar\Sigma_i, \bar g_{\tau_i}, x_i)$ also converges in the weak $L^{2,p}$ topology, and uniformly on compact subsets, to a limit $L^{2,p} \cap C^{1,\beta}$ CMC hypersurface $(\bar\Sigma_\infty, \bar g_\infty, x_\infty)$. Of course, by construction, $\bar t_\infty \ge 1$ on $\bar\Sigma_\infty$. Similarly, the CMC foliation $M_{\bar{\mathcal{F}}_i}$ in the scale (3.22) converges to a limit $L^{2,p} \cap C^{1,\beta}$ CMC foliation $M_{\bar{\mathcal{F}}_\infty}$ of $(M_\infty, \bar g_\infty)$ with leaves $\bar\Sigma_{\bar\tau_\infty}$. Again, in the case of collapse, the leaves are unwrapped and so have at least a locally free isometric $S^1$ action.

The assumption on $|\nabla R|$ in (0.11) implies in the same way that $|\nabla R_{\bar g_i}|$ is uniformly bounded where $\bar t_i \ge t_o > 0$. By elliptic regularity applied to Eq. (2.2), it follows that the intrinsic curvature $\mathrm{Ric}_{\bar g_i}$ and extrinsic curvature $K_{\bar g_i}$ of the leaves are uniformly bounded in $L^{1,p}$ and $L^{2,p}$ respectively (compare with the proof of Theorem 0.1). Thus,
the convergence above is everywhere in the weak $L^{3,p}$ topology and the limits are in $L^{3,p} \cap C^{2,\beta}$.

Now the scale-invariance of (3.21) and the smoothness of the convergence imply that the limit $\bar\Sigma_\infty$ is a complete maximal hypersurface, i.e. $H = 0$. In fact, since $\tau = H$ is monotone on $(M, g)$, all the leaves $\bar\Sigma_{\bar\tau_\infty}$ of $\bar M_{\bar{\mathcal{F}}_\infty}$ are complete maximal hypersurfaces, and so the parameter $\bar\tau_\infty \equiv 0$. This means that the lapse function $\alpha$ on $(M, g)$ satisfies
\[ \alpha(x_i) \gg t_{\min}(\tau_i)^2, \tag{3.23} \]
so that the corresponding renormalized lapse functions $\alpha/t_{\min}(\tau_i)^2$ on $(M, \bar g_i, x_i)$ are $\gg 1$ at $x_i$. To remedy this situation, renormalize the lapse by setting
\[ \bar\alpha_i = \alpha/\alpha(x_i). \tag{3.24} \]
The estimates (2.17) imply that $\bar\alpha_i$ is uniformly bounded within $\bar g_i$-bounded distance to $x_i$ on $\bar\Sigma_i$. Similarly, the arguments preceding (3.16) imply that $\bar\alpha_i$ is uniformly bounded on bounded domains on leaves within bounded proper time distance to $x_i$. Hence $\bar\alpha_i$ converges to the limit lapse function $\bar\alpha_\infty$ on $(\bar M_\infty, \bar g_\infty)$. Similarly, we redefine the CMC time parameter by
\[ \bar\tau_i = (\alpha(x_i))^{1/2} \cdot \tau. \tag{3.25} \]
Then $\bar\tau_i$ converges smoothly to a parametrization, again denoted $\bar\tau_\infty$, of the leaves of the limit foliation. Hence the limit $\alpha = \bar\alpha_\infty$ satisfies the following lapse equation on each leaf $\bar\Sigma_{\bar\tau_\infty}$:
\[ -\Delta\alpha + |K|^2 \alpha = 0. \tag{3.26} \]
Thus the variation vector field $\alpha T$ ($T$ the unit normal) is a Jacobi field for the volume functional. The limit space-time $(\bar M_\infty, \bar g_\infty)$ has the form (3.16) w.r.t. these redefinitions of $\bar\alpha_\infty$ and $\bar\tau_\infty$.

The task is now to rule out this limiting or near limiting behavior of the space-time $(M, g)$. The key to this is the following result.

Lemma 3.5. The limit maximal hypersurface $\bar\Sigma_\infty$ cannot be flat, i.e. for some $y \in \bar\Sigma_\infty$,
\[ |\mathrm{Ric}|_{\bar g_\infty}(y) > 0. \tag{3.27} \]
Proof. Assuming $(\Sigma_\infty, g_\infty)$ is flat, we will obtain a contradiction by the Cauchy stability theorem. Thus, suppose $(\Sigma_\infty, g_\infty)$ is flat. By the constraint equation (1.2), since $\bar s_\infty = 0$ and $H = 0$, we have $\bar K_\infty = 0$, so that $\bar\Sigma_\infty$ is totally geodesic (i.e. time-symmetric). It follows that for $i$ sufficiently large, the CMC surfaces $(\bar\Sigma_i, \bar g_i, x_i)$ are almost flat and totally geodesic, i.e.
\[ |\mathrm{Ric}|_{\bar g_i}(y) < \varepsilon, \quad \text{and} \quad |K|_{\bar g_i}(y) < \varepsilon, \tag{3.28} \]
for all $y \in B_{x_i}(R)$, where $R$ may be made arbitrarily large and $\varepsilon > 0$ arbitrarily small by choosing $i$ sufficiently large; here $B_{x_i}(R)$ is the geodesic $R$-ball in $(\bar\Sigma_i, \bar g_i)$ about $x_i$. In fact, as discussed above following (3.22), by Remark 1.6(i), the curvature bounds (0.11)
imply that $(B_{x_i}(R), \bar g_i)$ is $\varepsilon$-close in the weak $L^{3,p}$ topology to the flat metric, and $K$ is $L^{2,p}$ close to the 0-form.

It follows that the Cauchy data $(\bar g_i, K_i)$ on the CMC surfaces $\bar\Sigma_i$ in the region $(B_{x_i}(R), \bar g_i)$ are $\varepsilon$-close in weak $(L^{3,p}, L^{2,p})$ to trivial Cauchy data. By Sobolev embedding, $L^{3,p}$ is compactly contained in $H^s$, while $L^{2,p}$ is compactly contained in $H^{s-1}$, for any $s < 3$, where $H^s$ is the Sobolev space with $s$ derivatives in $L^2$. Hence the Cauchy data $(\bar g_i, K_i)$ are strongly close to trivial data in $(H^s, H^{s-1})$, $s < 3$.

Choosing $s > 2.5$, the Cauchy stability theorem, cf. [24], then implies that the maximal Cauchy development of $B_{x_i}(R/2) \subset \bar\Sigma_i$ to the past exists for a proper time $T = T(\varepsilon)$, where $T$ may be made arbitrarily large if $\varepsilon$ is chosen sufficiently small. However, $\bar t_i(x_i) = 1$ and by the remarks in Sect. 0, the space-time $(M, g)$ has a singularity, i.e. fails to be globally hyperbolic, within $g$-bounded proper time to the past of $\Sigma_{\tau_o}$. Hence $(M, \bar g_i)$ has a singularity to the past of $\bar\Sigma_i$ within proper time at most 2, for $i$ large. This contradiction proves the result.

Remark 3.6. The following generalization of Lemma 3.5 will be needed in the work to follow. Namely, the proof above leads to the same contradiction if there is an $\varepsilon > 0$ sufficiently small, a number $D = D(\varepsilon)$ sufficiently large, and a point $y \in \bar\Sigma_\infty$ such that the metric
\[ g' = \bar t_\infty(y)^{-2} \cdot \bar g_\infty \tag{3.29} \]
is $\varepsilon$-close to the flat metric in the $H^s$ topology, $s > 2.5$, in the ball $(B_y(D), g')$. We remark that this use of the Cauchy stability theorem is the only place in the proof that the assumption on the derivative of the curvature in (0.11) is needed.

Thus, to prove Theorem 3.4, it suffices to prove the limit maximal surface $(\bar\Sigma_\infty, \bar g_\infty)$ must either be flat or have a point $y$ satisfying the weaker assumption in Remark 3.6. The proof of this needs to be divided into non-collapse and collapse cases, as in Sect. 3.1.

Case A (Non-Collapse). Since the space-time $(\bar M_\infty, \bar g_\infty)$ is a rescaled limit at infinity of $(M, g)$, there exists a future directed time-like geodesic ray $\gamma$ in $(M, g)$ whose rescalings in the metrics (3.22) converge to a geodesic ray $\bar\gamma_\infty$ in $(\bar M_\infty, \bar g_\infty)$. As in the proof of Proposition 1.2, let $S(s) = t^{-1}(s)$ be the time $s$ geodesic sphere about $\Sigma_{\tau_o}$ in $(M, g)$ and let $\bar S_i(s) = \bar t_i^{-1}(s)$ be its rescaling in $(\bar M_i, \bar g_i)$ from (3.22). Let $z_i = \gamma(t_i) \in S(t_i)$, where $t_i = t(x_i)$. Thus $z_i \in \bar S_i(1)$ and $x_i \in \bar S_i(1)$.

The geodesic spheres $(\bar S_i(1), z_i)$ converge (in subsequences) to a Lipschitz, achronal surface $\bar S_\infty(1) = \bar t_\infty^{-1}(1) \subset (\bar M_\infty, \bar g_\infty)$ containing the base point $x_\infty$. The limit proper time function $\bar t_\infty$ on $(\bar M_\infty, \bar g_\infty)$ induces a Lipschitz 3+1 splitting of the limit space-time metric $\bar g_\infty$ (which of course is not well-defined on the cut locus of $\bar t_\infty$).

For any fixed $R < \infty$, let $D_{z_i}(R \cdot t_i)$ be the geodesic ball of radius $R \cdot t_i$ about $z_i$ in $S(t_i)$, and $\bar D_{z_i}(R)$ its rescaling in $\bar S_i(1)$. In this case, we assume that there exists $\nu_o > 0$ such that
\[ \mathrm{vol}\, \bar D_{z_i}(1) \ge \nu_o > 0, \quad \text{as } i \to \infty, \tag{3.30} \]
so that the domains $\bar D_{z_i}(1)$ are non-collapsing and hence converge (in the Hausdorff topology) to the limit domain $\bar D_{z_\infty}(1) \subset \bar S_\infty(1)$.

Now as in (1.13), the expansion $\theta$ of the congruence formed by time-like geodesics normal to the spheres $S(s) \subset (M, g)$ satisfies (1.12) (see the proof of Proposition 1.2).
Hence, if $dV_\sigma(s)$ denotes the infinitesimal volume of the family $S(s)$ along any geodesic $\sigma$ in this congruence, then $dV_\sigma(s)/s^3$ is monotone non-increasing as $s \to \infty$. It then follows from (3.30) as with (3.17) that the domain in $(\bar M_\infty, \bar g_\infty)$ formed by the geodesics normal to $\bar D_{z_\infty}(1) \subset \bar S_\infty(1)$ is a volume cone, and hence this domain is contained in a flat Lorentz cone as in (0.12). This flat structure extends past the cone on $\bar D_{z_i}(1)$ and implies that all of $(\bar M_\infty, \bar g_\infty)$ is a flat Lorentz cone.

Now the limit surface $(\Sigma_\infty, g_\infty)$ is a complete maximal hypersurface in $(\bar M_\infty, \bar g_\infty)$, which lifts to a complete maximal hypersurface in $(\mathbb{R}^4, \eta)$. It is an easy consequence of the maximum principle applied to (2.3), as in the proof of Proposition 2.2, that the only complete maximal hypersurfaces in $(\mathbb{R}^4, \eta)$ are flat and totally geodesic; this is also proved in [18]. Hence $\bar\Sigma_\infty \subset (\bar M_\infty, \bar g_\infty)$ is flat. Lemma 3.5 thus rules out the possibility of this case.

Case B (Collapse). In this case, we suppose (3.30) does not hold, so that $\mathrm{vol}\, \bar D_{z_i}(1) \to 0$, as $i \to \infty$. It follows from the monotonicity used above that for any fixed $s_o > 0$, and $s \in [s_o, s_o^{-1}]$, the domains $\bar D_{z_i}(s) \subset \bar S_i(s)$ also volume collapse.

This implies that the CMC leaf $\bar\Sigma_i'$ containing $z_i$ is also collapsing when based at $z_i$ as $i \to \infty$. By the uniform control on the geometry following (3.22), it follows that all CMC leaves within $\bar g_i$-bounded distance, in space and proper time, to $(\bar\Sigma_i', z_i)$ are collapsing as $i \to \infty$. In particular, the leaves $\bar\Sigma_i$ containing the base points $x_i$ are collapsing everywhere within bounded distance to $x_i$.

Thus, as described in the beginning of the proof, we unwrap the collapse of $\bar\Sigma_i$ and the space-times to obtain the limit $\bar\Sigma_\infty$ and the limit space-time $(\bar M_\infty, \bar g_\infty, x_\infty)$. These limits have a non-trivial group of isometries, and $\bar M_\infty$ has a foliation by maximal hypersurfaces.

Now unfortunately, we need to divide this situation into two further subcases, according to the size of the isometry group.

Case B(I) ($S^1 \times S^1$). Suppose the limit $(\bar\Sigma_\infty, \bar g_\infty)$ has a locally free isometric $S^1 \times S^1$ action, so that $\bar\Sigma_\infty$ is a $T^2$-bundle over $\mathbb{R}$. By the constraint equation (1.2), the metric $\bar g_\infty$ is a complete metric of non-negative scalar curvature on $\bar\Sigma_\infty$. Further, any orbit of the $S^1 \times S^1$ action on $\bar\Sigma_\infty$ is an incompressible torus in $\bar\Sigma_\infty$. However, by a result of [28], any complete 3-manifold of non-negative scalar curvature which has an incompressible torus must be flat. Hence we see that $(\bar\Sigma_\infty, \bar g_\infty)$ is in fact flat. Lemma 3.5 again rules out this possibility. Of course the same arguments apply (and are even easier) if the limit $\bar\Sigma_\infty$ has a locally free $T^3$ or Nil action.

Case B(II) ($S^1$). Suppose the limit $(\bar\Sigma_\infty, \bar g_\infty)$ has a locally free isometric $S^1$ action. Thus $\bar\Sigma_\infty$ is an $S^1$ bundle over a surface $V$, with induced complete Riemannian metric $g_V$. By [28] as above, $V$ must be simply connected, and hence topologically $\bar\Sigma_\infty$ is a solid torus $D^2 \times S^1$.

In this case, we are not able to prove directly that $(\bar\Sigma_\infty, \bar g_\infty)$ is flat, since a solid torus carries many complete metrics of positive scalar curvature. Instead, we will prove that the hypothesis concerning (3.29) in Remark 3.6 holds, which again gives a contradiction.

To do this, we prove that the maximal hypersurface $(\bar\Sigma_\infty, \bar g_\infty)$ is asymptotically flat in a weak sense. To simplify, we drop the bar and subscript everywhere from the notation for the rest of Case B(II); thus let $(\Sigma, g)$ denote $(\bar\Sigma_\infty, \bar g_\infty)$, $(M, g)$ denote $(\bar M_\infty, \bar g_\infty)$, $t(x)$ denote $\bar t_\infty(x)$, etc. The scale-invariant bound (0.11) gives
\[ |R| + t|\nabla R| \le C t^{-2}, \tag{3.31} \]
on $(\Sigma, g)$. Since by construction $t \ge 1$ on $\Sigma$, the curvature $R$ and its derivative $\nabla R$ are thus uniformly bounded on $\Sigma$. By Proposition 2.2 it follows that
\[ |\mathrm{Ric}| \le C t^{-2}, \qquad |K| \le C t^{-1}, \tag{3.32} \]
on $(\Sigma, g)$. One derives (3.32) most easily from (3.31) by using scale-invariance and working in the scale where $t = 1$. In the same way, applying elliptic regularity to Eq. (2.2) gives the scale-invariant bounds over geodesic balls $B_t = B_q(\tfrac12 t(q))$ about any $q \in (\Sigma, g)$:
\[ t^{-3/p}\, |t^3 \nabla \mathrm{Ric}|_{L^p(B_t)} \le C, \qquad t^{-3/p}\, |t^3 \nabla^2 K|_{L^p(B_t)} \le C. \tag{3.33} \]
In the case of collapse, these norms are taken in the corresponding covering spaces. Let $r : \Sigma \to \mathbb{R}$ be the (Riemannian) distance function on $(\Sigma, g)$ from the given base point $x_\infty \in \Sigma$.

Lemma 3.7. On any sequence $y_i$ with $r(y_i) \to \infty$, the functions $t$ and $r$ on $(\Sigma, g)$ satisfy
\[ t \ll r. \tag{3.34} \]
Proof. If $t(y_i)$ is bounded, then (3.34) is obvious, so suppose $t(y_i) \to \infty$ as $r(y_i) \to \infty$. Then by (3.31)–(3.32) the metric $g$ on $\Sigma$ becomes flat everywhere as $t \to \infty$ and $(M, g)$ thus approaches empty Minkowski space $(\mathbb{R}^4, \eta)$ (or a discrete quotient of it). The maximal foliation of $(M, g)$ approaches the constant foliation of $(\mathbb{R}^4, \eta)$ where all leaves are parallel, i.e. time-equidistant, since this is the only maximal foliation of $(\mathbb{R}^4, \eta)$, cf. [18]. Hence, for $i$ large, the lapse function $\alpha = \bar\alpha_\infty$ (cf. (3.26)) tends to a constant function on domains of bounded diameter about the base points $y_i$, while the light cones of $(M, g)$ have axis becoming perpendicular to the leaves. This means that the time function $t$ also tends to a constant on such domains. Since $r$ obviously continues to increase linearly, (3.34) follows.

We are now ready to prove that $(\Sigma, g)$ is weakly asymptotically flat. Thus, for any divergent sequence $y_i$ in $\Sigma$ consider the rescaled metrics
\[ g_i = t(y_i)^{-2} \cdot g. \tag{3.35} \]
Hence, $t_i(y) = t(y)/t(y_i)$ satisfies $t_i(y_i) = 1$, while by Lemma 3.7, $r_i(y) = r(y)/t(y_i)$ satisfies $r_i(y_i) \to \infty$. We may alter the choice of $y_i$ if necessary so that it satisfies these estimates and in addition
\[ t_i(z_i) \ge \tfrac12, \tag{3.36} \]
for all $z_i \in \Sigma$ such that $\mathrm{dist}_{g_i}(y_i, z_i) \le C$, where $C$ may be made arbitrarily large if $i$ is sufficiently large. To see this, for any sequence $q_i$ with $r(q_i) \to \infty$, let $\rho_i = r(q_i)$ and let $A_i = A(\tfrac12 \rho_i, 2\rho_i)$ be the $g$-geodesic annulus centered at $x_\infty$ of inner and outer radii $\tfrac12 \rho_i$, $2\rho_i$. Choose the points $y_i \in A_i$ to realize the minimal value of the scale-invariant ratio $t(y)/\mathrm{dist}_g(y, \partial A_i)$. By (3.34), this minimal value converges to 0 as $i \to \infty$. Since $t_i(y_i) = 1$, we must have $\mathrm{dist}_{g_i}(y_i, \partial A_i) \to \infty$ and the estimate (3.36) then follows easily from the minimality of the choice of $y_i$.

Thus, by (3.33), the metrics $(\Sigma, g_i)$ based at $y_i$ have a subsequence converging in the weak $L^{3,p}$ topology, uniformly on compact subsets, to a complete $L^{3,p}$ limit $(\Sigma'_\infty, g'_\infty)$, unwrapping in the case of collapse. We claim that any such limit must be flat.
The proof of this is essentially the same as in [5, Prop. 4.1–4.3]; for brevity, we refer to these results for some further details. Suppose first the metrics $(g_i)_V$ on the base space $V$ of $(\Sigma, g_i, y_i)$ are collapsing at $y_i$. By unwrapping this collapse (and any possible collapse in the invariant $S^1$ fiber direction), the limit $(\Sigma'_\infty, g'_\infty)$ is thus a complete metric of non-negative scalar curvature on the limit manifold $\Sigma'_\infty = T^2 \times \mathbb{R}$, i.e. one obtains a second $S^1$ factor from the collapse of the base metric at infinity. As above in Case B(I), by [28], the metric $g'_\infty$ must be flat; this is the same argument as in [5, Prop. 4.3, Case II].

If the base metrics $(g_i)_V$ are not collapsing, then the limit $(\Sigma'_\infty, g'_\infty)$ is complete and of non-negative scalar curvature on $\mathbb{R}^2 \times S^1$. In this case, the proof in [5, Prop. 4.3, Case I], together with the control given by (3.33), proves again that $(\Sigma'_\infty, g'_\infty)$ is flat; (although [5, Prop. 4.3] applies to complete $S^1$ invariant metrics on $\mathbb{R}^2 \times S^1$ of non-negative scalar curvature which satisfy certain elliptic equations (the $\mathcal{Z}_c^2$ equations), the proof only needs these equations for regularity estimates, provided here by (3.33)).

By Sobolev embedding, the convergence to any such flat limit $(\Sigma'_\infty, g'_\infty)$ is in the strong $H^s$ topology, for any $s < 3$. Hence there are arbitrarily large domains in $(\Sigma, g_i, y_i)$ on which $g_i$ is $\varepsilon$-close to a flat metric in the $H^s$ topology. Thus, the hypothesis of Remark 3.6 is satisfied, and one obtains a contradiction from the Cauchy stability theorem as in Lemma 3.5. This completes the proof of Case B(II), which thus also completes the proof of Theorem 3.4. Together with Theorem 3.1, this also completes the proof of Theorem 0.3.

3.3. Discussion on collapse behavior.

(I). The collapse behavior arising in Theorems 0.3 and 3.1, cf. Sect. 3.1 Case B, does actually occur in numerous examples. In fact, let $(M, g)$ be a Bianchi space-time, i.e. $(M, g)$ locally has a 3-dimensional local isometry group whose action is free and space-like. All such space-times admit a global CMC foliation $\mathcal{F}$ whose leaves $\Sigma_\tau$ are invariant under the action, i.e. each leaf $\Sigma_\tau$ is locally homogeneous. Further, the foliation $\mathcal{F}$ exists for all allowable CMC time, for any value of $\sigma(\Sigma)$, and $M = M_{\mathcal{F}}$, cf. [21].

The Bianchi space-times correspond very closely to the eight Thurston geometries arising in the geometrization program. In particular, all eight of the Thurston geometries occur as space-like Cauchy surfaces in a corresponding Bianchi space-time, with the interesting exception of $S^2 \times \mathbb{R}$, which corresponds to a Kantowski–Sachs space-time. We refer to [6] for a dictionary of this Bianchi–Thurston correspondence and for further references.

The five geometries $\mathbb{R}^3$, Nil, $H^2 \times \mathbb{R}$, $\widetilde{SL(2, \mathbb{R})}$ and Sol all give 3-manifolds with $\sigma(\Sigma) = 0$ and the corresponding Bianchi space-times are future geodesically complete. The curvature assumption (0.11) is satisfied and the space-times collapse in the sense of Sect. 3.1, Case B, as $\tau \to 0$ or $t_\tau \to \infty$.

The simplest example is the Kasner metric, given by $g = -dt^2 + \sum_{i=1}^3 t^{2p_i}\, d\theta_i \otimes d\theta_i$, where $\{p_i\}$ are any constants satisfying $\sum p_i = 1$, $\sum p_i^2 = 1$ and $\theta_i$ are parameters for $T^3$. This has $\mathbb{R}^3$ geometry, with a flat 3-torus, $H = \tau = -t^{-1}$. In this case, for any sequence $\tau_i \to 0$, the rescalings $\bar\Sigma_i$ of $\Sigma_i$ as in (3.7) collapse with bounded curvature and diameter converging to 0. Hence, the unwrapping in covers gives rise to a $T^3$ action on the limit surface, i.e. one obtains the Kasner metric again (with the same exponents).

For the remaining 3 geometries, the $S^3$ and $S^2 \times \mathbb{R}$ geometries have $\sigma(\Sigma) > 0$ and give Bianchi and Kantowski–Sachs space-times which recollapse, as in the recollapse
conjecture, while the hyperbolic geometry $H^3(-1)$ gives the flat Lorentz cone (0.12) and so evolves trivially, i.e. just by rescalings.

(II). As discussed in Sect. 3.1 Case B, if the rescalings (3.7) collapse at $x_i$, then one may unwrap the collapse of both the space and space-time, and pass to a limit space-time $(\bar M_\infty, \bar g_\infty, x_\infty)$, with complete CMC Cauchy surface $(\bar\Sigma_\infty, \bar g_\infty, x_\infty)$ and a corresponding limit CMC foliation $\bar M_{\bar{\mathcal{F}}_\infty}$. The leaves $\bar\Sigma_{\bar\tau_\infty}$ of $\bar M_{\bar{\mathcal{F}}_\infty}$ have at least a locally free isometric $S^1$ action, and possibly a locally free isometric action of a 2- or 3-dimensional group, and may be either compact, or complete and non-compact. The limit space-time $(\bar M_\infty, \bar g_\infty)$ is time-like geodesically complete to the future of $\bar\Sigma_\infty$.

In the case of an $S^1 \times S^1$ action, these limit space-times $(\bar M_\infty, \bar g_\infty)$ include the Gowdy space-times and have been quite well studied, at least when $\bar\Sigma_\infty$ is compact, cf. [6] and references therein.

One can now study (again) the further long-time evolution of these limit space-times. An interesting open question is if these limits $(\bar M_\infty, \bar g_\infty)$ evolve as $\bar t_\infty \to \infty$ to a Bianchi space-time as in (I), and if so whether in fact the space-times $(\bar M_\infty, \bar g_\infty)$ themselves are already Bianchi. To prove this, it seems likely one needs a monotonicity structure to replace the volume monotonicity of Corollary 1.3 (or the Hamiltonian monotonicity [25]), used in the proof of Theorem 0.3.

(III). Finally, we discuss the relation between weak and strong geometrizations. Let $\Sigma = H \cup G$ be a weak geometric decomposition as given by Theorem 0.3 or Theorem 3.1. The manifolds $G$ and $H$ have toral boundary $T = \cup T_i$ and the decomposition (0.8) is strong, and hence unique, if each torus $T_i$ is incompressible in $\Sigma$. Obviously since the hyperbolic part $H$ has no incompressible tori, $T_i$ is incompressible in $\Sigma$ if and only if $T_i$ is incompressible in $G$. A standard result in 3-manifold topology, cf. [32, Ch. II.2; 37], implies that each boundary component $T_i$ is incompressible in $G$ if and only if $G$ has no solid tori factors, i.e. in the decomposition of $G$ into Seifert fibered spaces $S_j$ discussed in Sect. 0, no $S_j$ is a solid torus $D^2 \times S^1$. Thus, the question is whether solid torus factors $D^2 \times S^1$ can arise asymptotically from the collapse behavior (compare with Case B(II) in the proof of Theorem 3.4).

At least in a special case, this question is closely related to the recollapse conjecture. Suppose one has a $D^2 \times S^1$ factor in $G$ arising from the asymptotic collapsing geometry of $(M, g)$ as discussed in the proof of Theorem 3.1. The limiting metric $\bar g_\infty$ on $D^2 \times S^1$ is complete and invariant under a free isometric $S^1$ action.

Suppose, for some $t_{\tau_i}$ sufficiently large, i.e. $\tau_i$ sufficiently small, there is a $D^2 \times S^1$ region contained in $(\bar\Sigma_i, \bar g_{\tau_i})$ with a totally geodesic boundary $T^2$ in the $\bar g_i$ metric. One can then double or reflect across the boundary to obtain a closed 3-manifold $S^2 \times S^1$ with an isometric $\mathbb{Z}_2$ action, fixing the boundary. Of course in this process we have created a different space-time. Suppose further the extrinsic curvature $\bar K$ of $\bar\Sigma_i$ at the boundary $T^2$ is invariant under this $\mathbb{Z}_2$ action so that the Cauchy data on $\bar\Sigma_i$ are $\mathbb{Z}_2$ invariant. Now $\sigma(S^2 \times S^1) > 0$ and so the recollapse conjecture, if true, implies that this $S^2 \times S^1$ must evolve into a singularity in finite future time. Since the evolution of Cauchy data preserves isometries, this implies that the $D^2 \times S^1$ factor in $(\bar\Sigma_i, \bar g_{\tau_i})$ also evolves to a singularity in finite future time. This is of course a contradiction.

It would obviously be interesting if this argument could be strengthened to prove in general that the recollapse conjecture implies that a weak decomposition of $\Sigma$ must necessarily be a strong decomposition.
4. The Future Boundary of $M_F$ in $M$

In this section, we discuss the situation where $M_F$ is strictly contained in $M$, cf. Sect. 0. In this case, Theorem 0.1 implies that $M$ has global existence in CMC time both to the past $\tau \to -\infty$ and to the future $\tau \to 0$. In particular, by the discussion at the end of Sect. 2, the past singularity is a crushing singularity. Thus, we only consider here the boundary
$$\partial_o M_F \subset M \tag{4.1}$$
of $M_F$ formed when $\tau \to 0$, i.e. the boundary to the future of the initial surface $\Sigma_{\tau_o}$ from (0.2). The condition (4.1) means that the space-time (M, g), i.e. the maximal Cauchy development of initial data on $\Sigma$, extends a definite amount locally, i.e. in a neighborhood of any point in $\partial_o M_F$, to the future of the CMC foliation $F$, so that $\Sigma_\tau$ does not approach anywhere $\partial M$ as $\tau \to 0$. We begin with the following general result.

Theorem 4.1. Suppose $\sigma(\Sigma) \le 0$ and (4.1) holds. Then each component $\Sigma_o$ of $\partial_o M_F$ is a smooth, complete non-compact maximal hypersurface $\Sigma_o \subset M$, weakly embedded in $\Sigma$, cf. (1.21).

Proof. Referring to the estimates in Sect. 2, by Proposition 2.2, $|K|^2$ is uniformly bounded on any compact subset $\Omega$ of $M$ and hence by (1.7),
$$\inf_{\Omega} \alpha \ge \alpha_o = \alpha_o(\Lambda_o, H_o) > 0. \tag{4.2}$$
Further, we recall that $\alpha$ satisfies the Harnack inequality (2.17), i.e. $\sup_{B(r_o)} \alpha / \inf_{B(r_o)} \alpha \le C$, where $B(r_o)$ is any geodesic $r_o$-ball in $(\Sigma_\tau, g_\tau)$ and $C$ is independent of $\tau$, for $\tau \in [-\tau_o, 0]$. The estimate (1.7) also gives an upper bound on $\alpha$ when $H$ is bounded away from 0 (as in (2.22)). However, when $H = \tau \to 0$, $\alpha$ may diverge to $+\infty$. Applying the Harnack inequality iteratively to a collection of balls covering a domain of bounded diameter, it follows that $\alpha$ either remains uniformly bounded on uniformly bounded domains within $\Sigma_\tau$, or $\alpha$ diverges uniformly to $+\infty$ on such domains, as $\tau \to 0$.

Given these preliminary remarks, let $\tau_i$ be any sequence with $\tau_i \to 0$ monotonically and consider the behavior of $\{\Sigma_{\tau_i}\}$. We divide the discussion into three cases. Suppose first that there is a constant $T_o < \infty$ such that
$$\sup_{\Sigma_{\tau_i}} t = t_{\tau_i} \le T_o, \quad \text{as } i \to \infty. \tag{4.3}$$
As in the proof of Theorem 0.1, cf. also Remark 2.6, it then follows that $\Sigma_{\tau_i}$ converges smoothly to a compact maximal hypersurface $\Sigma_o$, and $\Sigma_o$ is diffeomorphic to $\Sigma$. Of course, this situation implies $\sigma(\Sigma) > 0$, contradicting the assumption (cf. Remark 4.2). Next suppose
$$\inf_{\Sigma_{\tau_i}} t = t_{\min}(\tau_i) \to \infty, \quad \text{as } i \to \infty, \tag{4.4}$$
so that $t$ tends uniformly to $+\infty$ on $\Sigma_{\tau_i}$ as $i \to \infty$. Thus, the surfaces $\Sigma_{\tau_i}$ diverge uniformly to $\infty$ in infinite proper time. In this case, we then clearly have $M = M_F$ to the future of $\Sigma_{\tau_o}$, and so $\partial_o M_F = \emptyset$ again.
In the third and remaining case, neither (4.3) nor (4.4) holds. It follows that there are base points $y_i \in \Sigma_{\tau_i}$ such that $\limsup_{i\to\infty} t(y_i) < \infty$, and other points $z_i \in \Sigma_{\tau_i}$ where $\liminf_{i\to\infty} t(z_i) = \infty$. Analogous to (3.19), for any $T_o < \infty$ and any $i$, write
$$\Sigma = \Sigma_{\tau_i} = \Sigma^{T_o} \cup \Sigma_{T_o}, \tag{4.5}$$
where $\Sigma^{T_o} = \{x \in \Sigma : t(x) \ge T_o\}$, $\Sigma_{T_o} = \{x \in \Sigma : t(x) \le T_o\}$. Hence for all $i$ and $T_o$ sufficiently large, both $\Sigma^{T_o}$ and $\Sigma_{T_o}$ are non-empty. Note that these domains need not be connected.

Lemma 2.3 implies that the domains $\Sigma_{T_o} = (\Sigma_i)_{T_o} \subset (\Sigma_{\tau_i}, g_{\tau_i})$ do not collapse anywhere, i.e. at any sequence of base points. Further, as in the proof of Theorem 0.1, elliptic regularity applied to (2.2) gives uniform local control over the derivatives of Ric and $K$ on $(\Sigma_i)_{T_o}$. It then follows exactly as in the proof of Theorem 0.1, that for any fixed $T_o$, and at any sequence of base points, the domains $(\Sigma_i)_{T_o}$ have a subsequence converging smoothly and uniformly on compact subsets to a limit surface $(\Sigma_o)_{T_o}$; the limit obtained of course depends on the choice of base points.

As following (3.19), choose then a sequence $T_j \to \infty$ and consider the double sequence $(i, j)$. A suitable diagonal subsequence of $(\Sigma_i)_{T_j}$ then converges smoothly and uniformly on compact subsets to a limit $(\partial_o M_F, g_o)$. The limit $\partial_o M_F$ is a collection of maximal hypersurfaces in (M, g) and forms the boundary of $M_F$ (to the future of $\Sigma_{\tau_o}$). Since each $\Sigma_{\tau_{i+1}}$ lies to the future of $\Sigma_{\tau_i}$, the limit $\partial_o M_F$ is unique. By construction, the components $\Sigma_o$ of $\partial_o M_F$ are weakly embedded in $\Sigma$. Of course the complementary domains $\Sigma_i^{T_j}$ diverge to infinity and hence have no limit in (M, g) itself.

Remark 4.2. Theorem 4.1 extends to the case $\sigma(\Sigma) > 0$ with only a minor modification, but for clarity we have separated the cases. Thus, suppose $\sigma(\Sigma) > 0$. The first case above, where (4.3) holds, gives a limit $\Sigma_o$ which is compact, and so diffeomorphic to $\Sigma$. By Theorem 0.1, the foliation $M_F$ then extends a definite amount to the future in CMC time, i.e. the range of $\tau$ extends a definite amount past 0 into $\mathbb{R}^+$. Thus, as in Remark 2.7, either the leaves of $M_F$ approach $\partial M$ somewhere in the finite proper and CMC future of $\Sigma_o$ or the CMC foliation $\Sigma_\tau$ extends to all values of $\tau \in [0, \infty)$.
For the second (4.4) and third cases (4.5), the proof of Theorem 4.1 proceeds in the same way. Of course the recollapse conjecture would imply that neither of these last two cases actually occurs.

Remark 4.3. It is not asserted that the lapse functions $\alpha_i$ on $\Sigma_i$ converge to the lapse function $\alpha$ on $\Sigma_o$ as $\tau_i \to 0$ in Theorem 4.1 or Remark 4.2. If $\alpha_i(x_i)$ remains uniformly bounded as $i \to \infty$, then $\alpha_i$ converge smoothly to the limit $\alpha$ on $\Sigma_o$, and $\alpha$ satisfies the lapse equation (1.5) on $\Sigma_o$. However, it is possible (although probably only in very special situations) that $\alpha_i \to \infty$ uniformly as $\tau_i \to 0$ even under the bound (4.3). If $\alpha_i(x_i) \to \infty$ with $t(x_i) \le T_o$, $x_i \in \Sigma_{\tau_i}$, then renormalize $\alpha_i$ by setting $\bar\alpha_i = \alpha_i / \alpha_i(x_i)$, as in (3.24). By (4.3) ff, the functions $\bar\alpha_i$ then converge smoothly to a limit lapse $\alpha \equiv \bar\alpha_\infty$ satisfying (3.26). This “degenerate” situation occurs when the limit maximal surface $\Sigma_o$ is only a weak maximum for the volume functional. If $\Sigma_o$ is compact (as in the case (4.3) when $\sigma(\Sigma) > 0$), then the maximum principle applied to (3.26) shows that $K = 0$ and $\bar\alpha_\infty = \mathrm{const}$ on $\Sigma_o$. This situation does actually occur, for example in the Taub-NUT metric with $m = 0$, cf. [31, §5.8]. However, if $\Sigma_o$ is non-compact (corresponding to the situation (4.5)), it is no longer clear that necessarily $K = 0$.

We make some further observations on the geometry of $\partial_o M_F$ in (M, g). By Remark 4.2, we may assume that each component $\Sigma_o$ is non-compact. First, the constraint
equation (1.2) shows that $s \ge 0$ everywhere on $\Sigma_o$, so that $g$ is a complete metric of non-negative scalar curvature on $\Sigma_o$. Each component $\Sigma_o$ is a partial Cauchy surface for the region $M^+$ in $M$ to the future of $\Sigma_o$ and $\partial_o M_F$ is a Cauchy surface for $M^+$. The proper time $t(x)$ is a proper, unbounded exhaustion function on $\Sigma_o$, as is the lapse function $\alpha$ (or its renormalization $\bar\alpha_\infty$ in the degenerate case).

Now suppose the curvature bound (2.1), i.e.
$$|R| \le \Lambda_o, \tag{4.6}$$
holds globally on $M_F$ to the future of $\Sigma_{\tau_o}$ and hence also on $\partial_o M_F$. (This assumption, which is automatic on compact subsets $\Omega$ of $M$, has not been used in the discussion above in Sect. 4.) The estimates (2.7) and (2.18 ff) thus give a uniform bound on $|K|$ and $|\nabla \log \alpha|$ on $M_F$. Hence, by integration of the 0th order Bel–Robinson estimate (2.25 ff) from $\tau_o$ to 0, we obtain the bound
$$\int_{\partial_o M_F} |\mathrm{Ric}|^2 + |dK|^2 < \infty. \tag{4.7}$$
The integrand $|\mathrm{Ric}|^2 + |dK|^2$ is of course pointwise bounded on $\partial_o M_F$ by (2.7) and (1.6), (4.6). Further, one certainly expects that each component $\Sigma_o$ is not volume collapsing, i.e. $\mathrm{vol}\, B_x(1) \ge v_o > 0$, for all balls $B_x(1) \subset \Sigma_o$ and some $v_o > 0$. If this does hold, then (4.7) implies that
$$|\mathrm{Ric}| \to 0, \quad |K| \to 0, \tag{4.8}$$
uniformly at infinity in the components $\Sigma_o$. Hence the surfaces $\Sigma_o$ become flat and totally geodesic at infinity in this case. Thus, in a certain sense, $\Sigma_o$ is weakly asymptotically flat and time-symmetric.

Remark 4.4. There are numerous interesting open questions concerning the structure of the components $\Sigma_o$. For example, does $\Sigma_o$ admit a complete metric of uniformly positive scalar curvature? (Cf. [28] for results on the structure of such manifolds.) Do the components $\Sigma_o$ have at least two ends, and is there an essential 2-sphere $S^2 \subset \Sigma_o$? Are the ends of $\Sigma_o$ asymptotically flat? It would also be interesting to understand any relations of the structure of the manifolds $\Sigma_o$ weakly embedded in $\Sigma$ with the sphere or prime decomposition of $\Sigma$, since essential 2-spheres are obstructions to the geometrization of any 3-manifold.

We also make the following conjecture on the structure of the region $M^+$ to the future of $\partial_o M_F$.

Non-Compact CMC Conjecture. If $\partial_o M_F \subset M$ is non-empty, then either there exists a CMC foliation $F^+$ of $M^+$ by non-compact leaves $\Sigma_\tau$, defined for all $\tau \in (0, \infty)$, and satisfying
$$M_{F^+} = \cup\, \Sigma_\tau = M^+, \tag{4.9}$$
or there are leaves $\Sigma_{\tau_i} \in M_{F^+}$ which approach $\partial M$ in finite proper and CMC time.

This conjecture is a non-compact analogue of Theorem 0.1. Using the estimates in Sect. 2, together with the results and methods of [26], the conjecture can be proved provided the existence of suitable barrier surfaces $S_i$ can be established, i.e. surfaces in $M^+$ whose mean curvature $H(S_i)$ diverges everywhere to $+\infty$ as $i \to \infty$. Of course, one could consider a strengthening of this conjecture, asserting that $\partial M$ is not reachable
in finite CMC time, so that $F^+$ covers all of $M^+$; this is the analogue of the CMC global existence problem in $M^+$. Note also that this conjecture implies the singularity formation conjecture of Sect. 0, since if the CMC foliation $F^+$ is defined for all $\tau \in [0, \infty)$ then the leaves $\Sigma_\tau$ limit on a crushing singularity as $\tau \to \infty$. As in Remark 2.7, this behavior must occur at a finite proper future time from $\partial_o M_F$, and so (M, g) cannot be future geodesically complete.

5. Remarks on the Curvature Assumption

In this section, we discuss the curvature assumption (0.11), which is the strongest assumption in Theorem 0.3. Of course one would like to replace this by the assumption that (M, g) is merely future geodesically complete and derive (0.11) as a consequence. We will ignore the assumption on $|\nabla R|$ in (0.11), since it is of lesser importance.

Let $q \in \partial M$ be any point at the (finite) boundary of the maximal Cauchy development. Thus, near $q$, either the space-time (M, g) is inextendible, for example because the curvature blows up in a parallel propagated frame, or (M, g) is extendible as a space-time (possibly not vacuum), and so (M, g) has a non-empty Cauchy horizon through $q$. For any $x \in M$, let
$$t_{\partial M}(x) = \sup\{t : B_x(t(x)) \subset M\}. \tag{5.1}$$
The condition (5.1) means that any past or future directed time-like geodesic starting at $x$, which has proper length less than $t_{\partial M}(x)$, has its endpoint in $M$.

Curvature Blow-up Problem. Let (M, g) be a CMC cosmological vacuum space-time. Does there exist a constant $C = C(M, g) < \infty$, such that
$$|R| \le C / t_{\partial M}^2\ ? \tag{5.2}$$
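The quantity bounded in (5.2) is invariant under constant rescalings of the metric, which is what makes the two readings of the estimate (blow-up rate near $\partial M$, decay rate far from $\partial M$) comparable. The following short computation is only a sketch: it assumes that $|R|$ is a pointwise norm scaling like a sectional curvature, and the scale factor $\lambda > 0$ is notation introduced here, not taken from the text.

```latex
% Constant rescaling of the space-time metric; \lambda > 0 is an assumed symbol.
g' = \lambda^{2} g
\quad\Longrightarrow\quad
t'_{\partial M} = \lambda\, t_{\partial M},
\qquad
|R'| = \lambda^{-2}\, |R| ,
% hence the product appearing in (5.2) is unchanged:
\qquad
|R'|\,\bigl(t'_{\partial M}\bigr)^{2} = |R|\, t_{\partial M}^{2} .
```

In particular, a constant $C$ for which (5.2) holds for $g$ also works for every rescaled metric $\lambda^2 g$.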
The estimate (5.2) can be viewed as an upper bound on the rate of curvature blow-up on approach to a singularity, i.e. a point in $\partial M$, or as an upper bound on the decay of the curvature in (large) distances from $\partial M$. By scale invariance, these statements are equivalent. Of course the estimate (5.2) implies the curvature assumption (0.11) without the $|\nabla R|$ bound, when (M, g) is geodesically complete to the future of $\Sigma_{\tau_o}$. The bound (5.2) has been proved for time-independent, i.e. stationary space-times, in [3], where $\partial M$ is interpreted as the region where the associated Killing field becomes null.

We first observe that (5.2) is in fact false for general globally hyperbolic space-times, i.e. without compact CMC Cauchy hypersurfaces. For suppose $(M', g')$ is a geodesically complete globally hyperbolic vacuum space-time. For any such space-time, $\partial M' = \emptyset$, so that $t_{\partial M'} = \infty$. Hence, (5.2) would imply that $R' = 0$, i.e. $(M', g')$ is flat. However, there exist such vacuum space-times which are not flat, namely the Christodoulou–Klainerman (CK) space-times arising as global perturbations of Minkowski space [19], as well as the $G_2$-invariant space-times constructed in [12]. Note that the CK space-times in fact have a global foliation by maximal hypersurfaces.

Now for cosmological space-times, $\partial M$ must of course be non-empty by the Hawking–Penrose singularity theorem. However, the failure of the estimate (5.2) is closely related to the existence of the CK and analogous space-times. Namely, if (5.2) were false, then there must exist a sequence of globally hyperbolic cosmological space-times $(M_i, g_i)$,
(possibly identical), and points $x_i \in M_i$ with $t_i(x_i) \to 0$, $t_i(\cdot) = t_{\partial M_i}(\cdot)$ (this may be arranged by a rescaling if necessary), such that
$$|R_i|(x_i) \cdot t_i(x_i)^2 \to \infty. \tag{5.3}$$
We may assume without loss of generality that the points $x_i$ realize the maximal value of $|R_i|$ on the level set $L_i = t_i^{-1}(x_i)$. By a standard argument (the same as that establishing (3.36) but with time in place of space), we may choose the points $x_i$ to realize an approximate maximal value of $|R_i| t_i^2$ in small time-like annuli about $L_i$. Next, rescale the metrics $g_i$ by setting $g_i' = |R_i|(x_i) \cdot g_i$, so that
$$|R_i'|(x_i) = 1, \tag{5.4}$$
and consequently $|R_i'| \le C < \infty$ within $g_i'$ bounded time distance to $x_i$. This has the effect of making $t_i'$, the proper distance to the boundary w.r.t. $g_i'$, tend to $\infty$.

As discussed in Sect. 3, a subsequence then converges weakly in $L^{2,p}$ to an $L^{2,p}$ limit $(M_\infty, g_\infty, x_\infty)$ (unwrapping in the case of collapse). The limit is necessarily a geodesically complete globally hyperbolic space-time. Further, if the condition (5.4), or some weakening of it, passes to the limit, then $(M_\infty, g_\infty)$ is not flat. (This is not at all obvious however, and may be difficult to prove.) In addition, if $(M_i, g_i)$ are CMC cosmological space-times and $x_i \in (M_F)_i$, then the a priori bound (1.12) and the arguments from Sect. 3 show that the limit $(M_\infty, g_\infty)$ has a global foliation by maximal hypersurfaces, as in the CK space-times.

Thus, the problem is essentially whether such non-flat geodesically complete, globally hyperbolic space-times can actually arise in this way, or more precisely whether the evolution from smooth Cauchy data can lead to singularities which have this blow-up character.

We conclude this paper with the following remark or caveat. The discussion above presents relations between the geometrization of 3-manifolds (at least in case $\sigma(\Sigma) \le 0$) and the Einstein evolution equations. Thus, one may ask if a detailed study of the long-time behavior of the hyperbolic Einstein equations is a possible avenue of approach to the solution of the geometrization conjecture when $\sigma(\Sigma) \le 0$. Conversely, one may ask if the resolution of the geometrization conjecture has any direct bearing on the resolution of some of the fundamental problems on global existence and asymptotics for the Einstein evolution.

With regard to the first problem, this seems to be an extremely difficult approach. Approaches to the solution of the geometrization conjecture by study of a suitable sequence of elliptic PDEs are given in [1, 2] for example, or by study of a parabolic PDE, the Ricci flow, in [29] for example.
Both the elliptic and the parabolic approaches appear to be much simpler than the hyperbolic approach via the Einstein flow and stand a much better chance of resolving geometrization. For instance, an exact analogue of Theorem 0.3 has been proved in both of these approaches, cf. [1, 2] and [30], with only the bound on R in (0.11), i.e. without any bound on |∇R|. Further, strong geometrization in the sense of Definition 0.2 has been proved in this case in these approaches. Similarly, it is not at all clear that the (affirmative) resolution of the geometrization conjecture would have any concrete implications on the resolution of any of the global existence, asymptotics, or singularity formation problems in general relativity. Nevertheless, it is clear that there are many interesting relations between the subjects. Acknowledgements. I would like to thank Lars Andersson and Alan Rendall for interesting and informative discussions on topics related to the paper.
References

1. Anderson, M.: Scalar curvature and geometrization conjectures for 3-manifolds. In: Comparison Geometry (’93 MSRI Workshop), MSRI Publications. Camb. Univ. Press 30, 49–82 (1997)
2. Anderson, M.: Scalar curvature and the existence of geometric structures on 3-manifolds, I. Preprint, Stony Brook, 1999, www.math.sunysb.edu/~anderson
3. Anderson, M.: On stationary vacuum solutions to the Einstein equations. Annales Henri Poincaré 1, 977–994 (2000), gr-qc/0001091
4. Anderson, M.: Extrema of curvature functionals on the space of metrics on 3-manifolds. Calc. Var. & P.D.E. 5, 199–269 (1997)
5. Anderson, M.: Extrema of curvature functionals on the space of metrics on 3-manifolds, II. Calc. Var. & P.D.E. 12, 1–58 (2001), math.DG/9912177
6. Andersson, L.: The global existence problem in general relativity. Preprint 1999, gr-qc/9911032
7. Andersson, L., Galloway, G. and Howard, R.: A strong maximum principle for weak solutions of quasilinear elliptic equations with applications to Lorentzian and Riemannian geometry. Comm. Pure Appl. Math. 51, 581–624 (1998)
8. Andersson, L. and Moncrief, V.: On the global evolution problem in 3+1 dimensional general relativity. To appear
9. Barrow, D., Galloway, G. and Tipler, F.: The closed universe recollapse conjecture. Mon. Not. Royal Astron. Soc. 223, 835–844 (1986)
10. Bartnik, R.: Existence of maximal surfaces in asymptotically flat spacetimes. Commun. Math. Phys. 94, 155–175 (1984)
11. Bartnik, R.: Remarks on cosmological spacetimes and constant mean curvature surfaces. Commun. Math. Phys. 117, 615–624 (1988)
12. Berger, B., Chruściel, P. and Moncrief, V.: On “asymptotically flat” space-times with $G_2$-invariant Cauchy surfaces. Ann. of Physics 237, 322–354 (1995)
13. Berger, M. and Ebin, D.: Some decompositions of the space of symmetric tensors on a Riemannian manifold. Jour. Diff. Geom. 3, 379–392 (1969)
14. Besse, A.: Einstein Manifolds. New York: Springer Verlag, 1987
15. Budic, R., Isenberg, J., Lindblom, L., Yasskin, P.: On the determination of Cauchy surfaces from intrinsic properties. Commun. Math. Phys. 61, 87–95 (1978)
16. Cheeger, J. and Gromov, M.: Collapsing Riemannian manifolds while keeping their curvature bounded, I. Jour. Diff. Geom. 23, 309–346 (1986)
17. Cheeger, J. and Gromov, M.: Collapsing Riemannian manifolds while keeping their curvature bounded, II. Jour. Diff. Geom. 32, 269–298 (1990)
18. Cheng, S.Y. and Yau, S.T.: Maximal space-like hypersurfaces in the Lorentz-Minkowski spaces. Annals of Math. 104, 407–419 (1976)
19. Christodoulou, D. and Klainerman, S.: The Global Nonlinear Stability of the Minkowski Space. Princeton: Princeton Univ. Press, 1993
20. Chruściel, P.: On uniqueness in the large of solutions of Einstein’s equations (“strong cosmic censorship”). Math. aspects of classical field theory. Providence, RI: Amer. Math. Soc. (1992), pp. 235–273
21. Chruściel, P. and Rendall, A.: Strong cosmic censorship in vacuum space-times with compact, locally homogeneous Cauchy surfaces. Ann. of Physics 242, 349–385 (1995)
22. Eardley, D. and Smarr, L.: Time functions in numerical relativity: marginally bound dust collapse. Phys. Rev. D 19:8, 2239–2259 (1979)
23. Ellis, G.F.R. and Schmidt, B.G.: Singular space-times. Gen. Rel. and Grav. 8:11, 915–953 (1977)
24. Fischer, A. and Marsden, J.: The initial value problem and the dynamical formulation of general relativity. In: General Relativity. S.W. Hawking and W. Israel (eds.), Cambridge: Cambridge University Press, 1979, pp. 138–211
25. Fischer, A. and Moncrief, V.: The reduced Hamiltonian of general relativity and the σ-constant of conformal geometry. In: Math. and Quantum Aspects of Relativity and Cosmology, S. Cotsakis and G.W. Gibbons (eds.). Berlin–Heidelberg–New York: Springer Lecture Notes in Physics 537, (2000), pp. 70–101
26. Gerhardt, C.: H-surfaces in Lorentzian manifolds. Commun. Math. Phys. 89, 523–553 (1983)
27. Gilbarg, D. and Trudinger, N.: Elliptic Partial Differential Equations of Second Order. New York: Springer Verlag, 1983
28. Gromov, M. and Lawson, H.B.: Positive scalar curvature and the Dirac operator on complete Riemannian manifolds. Publ. Math. I.H.E.S. 58, 83–196 (1983)
29. Hamilton, R.: The formation of singularities in the Ricci flow. In: Surveys in Differential Geometry, Vol. II, C.C. Hsiung and S.T. Yau (eds.). Cambridge, MA: International Press, 1995, pp. 7–136
30. Hamilton, R.: Non-singular solutions of the Ricci flow on three-manifolds. Comm. in Geom. & Anal. 7, 695–729 (1999)
31. Hawking, S. and Ellis, G.: The Large Scale Structure of Space-Time. London: Cambridge Univ. Press, 1973
32. Jaco, W. and Shalen, P.: Seifert fibered spaces in 3-manifolds. Memoirs Amer. Math. Soc. 220 (1979)
33. Marsden, J. and Tipler, F.: Maximal hypersurfaces and foliations of constant mean curvature in general relativity. Physics Reports 66:3, 109–139 (1980)
34. Moncrief, V. and Eardley, D.: The global existence problem and cosmic censorship in general relativity. Gen. Rel. and Grav. 13:9, 887–892 (1981)
35. Petersen, P.: Riemannian Geometry. New York: Springer Verlag, 1998
36. Rendall, A.: Constant mean curvature foliations in cosmological space-times. Helv. Phys. Acta 69:4, 490–500 (1996)
37. Rong, X.: The limiting eta invariant of collapsed 3-manifolds. Jour. Diff. Geom. 37, 535–568 (1993)
38. Schoen, R.: Variational theory for the total scalar curvature functional for Riemannian metrics and related topics. Springer Lect. Notes in Math. 1365, (1987), pp. 120–154
39. Scott, P.: The geometries of 3-manifolds. Bull. London Math. Soc. 15, 401–487 (1983)
40. Thurston, W.: Three dimensional manifolds, Kleinian groups and hyperbolic geometry. Bull. Amer. Math. Soc. 6, 357–381 (1982)
41. Tipler, F., Clarke, C. and Ellis, G.: Singularities and horizons: a review article. In: Gen. Rel. and Gravitation, Vol. 2. A. Held, ed. New York: Plenum Press, 1980, pp. 87–206
42. Waldhausen, F.: Eine Klasse von 3-dimensionalen Mannigfaltigkeiten, I, II. Invent. Math. 3, 308–333 (1967) and 4, 87–117 (1967)

Communicated by H. Nicolai
Commun. Math. Phys. 222, 569 – 609 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Asymptotically Flat Initial Data with Prescribed Regularity at Infinity Sergio Dain, Helmut Friedrich Max-Planck-Institut für Gravitationsphysik, Am Mühlenberg 1, 14476 Golm, Germany. E-mail: [email protected] Received: 12 February 2001 / Accepted: 28 May 2001
Abstract: We prove the existence of a large class of asymptotically flat initial data with non-vanishing mass and angular momentum for which the metric and the extrinsic curvature have asymptotic expansions at space-like infinity in terms of powers of a radial coordinate.
1. Introduction

An initial data set for the Einstein vacuum equations is given by a triple $(\tilde S, \tilde h_{ab}, \tilde\Psi_{ab})$, where $\tilde S$ is a connected 3-dimensional manifold, $\tilde h_{ab}$ a (positive definite) Riemannian metric, and $\tilde\Psi_{ab}$ a symmetric tensor field on $\tilde S$. The data will be called “asymptotically flat”, if the complement of a compact set in $\tilde S$ can be mapped by a coordinate system $\tilde x^j$ diffeomorphically onto the complement of a closed ball in $\mathbb{R}^3$ such that we have in these coordinates
$$\tilde h_{ij} = \left(1 + \frac{2m}{\tilde r}\right)\delta_{ij} + O(\tilde r^{-2}), \tag{1}$$
$$\tilde\Psi_{ij} = O(\tilde r^{-2}), \tag{2}$$
as $\tilde r = \bigl(\sum_{j=1}^{3} (\tilde x^j)^2\bigr)^{1/2} \to \infty$. Here the constant $m$ denotes the mass of the data, $a, b, c, \ldots$ denote abstract indices, $i, j, k, \ldots$, which take values 1, 2, 3, denote coordinate indices while $\delta_{ij}$ denotes the flat metric with respect to the given coordinate system $\tilde x^j$. Tensor indices will be moved with the metric $h_{ab}$ and its inverse $h^{ab}$. We set $x_i = x^i$ and $\partial^i = \partial_i$. Our conditions guarantee that the mass, the momentum, and the angular momentum of the initial data set are well defined.

There exist weaker notions of asymptotic flatness (cf. [14]) but they are not useful for our present purpose. In this article we show the existence of a class of asymptotically
flat initial data which have a more controlled asymptotic behavior than (1), (2) in the sense that they admit near space-like infinity asymptotic expansions of the form
$$\tilde h_{ij} \sim \left(1 + \frac{2m}{\tilde r}\right)\delta_{ij} + \sum_{k \ge 2} \frac{\tilde h^k_{ij}}{\tilde r^k}, \tag{3}$$
$$\tilde\Psi_{ij} \sim \sum_{k \ge 2} \frac{\tilde\Psi^k_{ij}}{\tilde r^k}, \tag{4}$$
where $\tilde h^k_{ij}$ and $\tilde\Psi^k_{ij}$ are smooth functions on the unit 2-sphere (thought of as being pulled back to the spheres $\tilde r = \mathrm{const.}$ under the map $\tilde x^j \to \tilde x^j / \tilde r$).

We are interested in such data for two reasons. The evolution of asymptotically flat initial data near space-like and null infinity has been studied in considerable detail in [23]. In particular, in that article a certain “regularity condition” has been derived on the data near space-like infinity, which is expected to provide a criterion for the existence of a smooth asymptotic structure at null infinity. To simplify the lengthy calculations, the data considered in [23] have been assumed to be time-symmetric and to admit a smooth conformal compactification. With these assumptions the regularity condition is given by a surprisingly succinct expression. With the present work we want to provide data which will allow us to perform the analysis of [23] without the assumption of time symmetry but which are still “simple” enough to simplify the work of generalizing the regularity condition to the case of a non-trivial second fundamental form. Thus we will insist in the present paper on the smooth conformal compactification of the metric but drop the time symmetry. A subsequent article will be devoted to the analysis of a class of more general data which will include in particular stationary asymptotically flat data.

The “regular finite initial value problem near space-like infinity”, formulated and analyzed in [23], suggests how to calculate numerically entire asymptotically flat solutions to Einstein’s vacuum field equations on finite grids. In the present article we provide data for such numerical calculations which should allow us to study interesting situations while keeping a certain simplicity in the handling of the initial data.

The difficulty of constructing data with the asymptotic behavior (3), (4) arises from the fact that the fields need to satisfy the constraint equations
$$\tilde D^b \tilde\Psi_{ab} - \tilde D_a \tilde\Psi = 0, \qquad \tilde R + \tilde\Psi^2 - \tilde\Psi_{ab}\tilde\Psi^{ab} = 0,$$
on $\tilde S$, where $\tilde D_a$ is the covariant derivative, $\tilde R$ is the trace of the corresponding Ricci tensor, and $\tilde\Psi = \tilde h^{ab}\tilde\Psi_{ab}$. Part of the data, the “free data”, can be given such that they are compatible with (3), (4). However, the remaining data are governed by elliptic equations and we have to show that (3), (4) are in fact a consequence of the equations and the way the free data have been prescribed.

To employ the standard techniques to provide solutions to the constraints, we assume
$$\tilde\Psi = 0, \tag{5}$$
such that the data correspond to a hypersurface which is maximal in the solution space-time.
We give an outline of our results. Because of the applications indicated above, we wish to control in detail the conformal structure of the data near space-like infinity. Therefore we shall analyze the data in terms of the conformal compactification $(S, h_{ab}, \Psi_{ab})$ of the “physical” asymptotically flat data. Here $S$ denotes a smooth, connected, orientable, compact 3-manifold. It contains a point $i$ such that we can write $\tilde S = S \setminus \{i\}$. The point $i$ will represent, in a sense described in detail below, space-like infinity for the physical initial data.

By singling out more points in $S$ and by treating the fields near these points in the same way as near $i$ we could construct data with several asymptotically flat ends, since all the following arguments equally apply to such situations. However, for convenience we restrict ourselves to the case of a single asymptotically flat end.

We assume that $h_{ab}$ is a positive definite metric on $S$ with covariant derivative $D_a$ and $\Psi_{ab}$ is a symmetric tensor field which is smooth on $\tilde S$. In agreement with (5) we shall assume that $\Psi_{ab}$ is trace free, $h^{ab}\Psi_{ab} = 0$. The fields above are related to the physical fields by rescaling
$$\tilde h_{ab} = \theta^4 h_{ab}, \qquad \tilde\Psi_{ab} = \theta^{-2}\Psi_{ab}, \tag{6}$$
with a conformal factor $\theta$ which is positive on $\tilde S$. For the physical fields to satisfy the vacuum constraints we need to assume that
$$D^a \Psi_{ab} = 0 \quad \text{on } \tilde S, \tag{7}$$
$$\left(D_b D^b - \tfrac{1}{8} R\right)\theta = -\tfrac{1}{8}\, \Psi_{ab}\Psi^{ab}\, \theta^{-7} \quad \text{on } \tilde S. \tag{8}$$
Equation (8) for the conformal factor $\theta$ is the Lichnerowicz equation, transferred to our context.

Let $x^j$ be $h$-normal coordinates centered at $i$ such that $h_{kl} = \delta_{kl}$ at $i$ and set $r = \bigl(\sum_{j=1}^{3} (x^j)^2\bigr)^{1/2}$. To ensure asymptotic flatness of the data (6) we require
$$\Psi_{ab} = O(r^{-4}) \quad \text{as } r \to 0, \tag{9}$$
$$\lim_{r \to 0} r\theta = 1. \tag{10}$$
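The role of conditions (9) and (10) can be illustrated in the simplest situation. The following sketch assumes, purely for illustration, that $h_{kl} = \delta_{kl}$ and $\theta = 1/r$ hold exactly near $i$, and it uses the inversion $\tilde x^j = x^j / r^2$; the unit vector $n^j = x^j / r$ is notation introduced for this computation.

```latex
% Inversion and its differential; (\delta - 2 n n) is a reflection, hence orthogonal.
\tilde x^{j} = \frac{x^{j}}{r^{2}}, \qquad \tilde r = \frac{1}{r}, \qquad
d\tilde x^{j} = \frac{1}{r^{2}}\bigl(\delta^{j}{}_{k} - 2\, n^{j} n_{k}\bigr)\, dx^{k},
\\[4pt]
% The flat metric in the coordinates \tilde x^j is \theta^4 times the flat metric in x^j:
\delta_{ij}\, d\tilde x^{i} d\tilde x^{j}
 = \frac{1}{r^{4}}\, \delta_{kl}\, dx^{k} dx^{l}
 = \theta^{4}\, \delta_{kl}\, dx^{k} dx^{l},
\\[4pt]
% Components of \tilde\Psi_{ab} = \theta^{-2}\Psi_{ab} in the coordinates \tilde x^j, using (9):
\tilde\Psi_{ij}
 = \theta^{-2}\,
   \frac{\partial x^{k}}{\partial \tilde x^{i}}
   \frac{\partial x^{l}}{\partial \tilde x^{j}}\, \Psi_{kl}
 = O\bigl(r^{2} \cdot r^{4} \cdot r^{-4}\bigr)
 = O\bigl(\tilde r^{-2}\bigr).
```

In this model case the rescaled metric is exactly flat in the coordinates $\tilde x^j$, and the fall-off (2) of the second fundamental form is seen to correspond to the blow-up rate allowed in (9).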
In the coordinates $\tilde x^j = x^j / r^2$ the fields (6) will then satisfy (1), (2) (cf. [22, 23] for this procedure). Not all data as given by (6), which are derived from data $h_{ab}$, $\Psi_{ab}$ as described above, will satisfy conditions (3), (4). We will have to impose extra conditions and we want to keep these conditions as simple as possible.

Since we assume the metric $h_{ab}$ to be smooth on $S$, it will only depend on the behavior of $\theta$ near $i$ whether condition (3) will be satisfied. Via Eq. (8) this behavior depends on $\Psi_{ab}$. What kind of condition do we have to impose on $\Psi_{ab}$ in order to achieve (3)?

The following space of functions will play an important role in our discussion. Denote by $B_a$ the open ball with center $i$ and radius $r = a > 0$, where $a$ is chosen small enough such that $B_a$ is a convex normal neighborhood of $i$. A function $f \in C^\infty(\tilde S)$ is said to be in $E^\infty(B_a)$ if on $B_a$ we can write $f = f_1 + r f_2$ with $f_1, f_2 \in C^\infty(B_a)$ (cf. Definition 1).

An answer to our question is given by the following theorem:
Theorem 1. Let $h_{ab}$ be a smooth metric on $S$ with positive Ricci scalar $R$. Assume that $\Psi_{ab}$ is smooth in $\tilde S$ and satisfies on $B_a$,
$$r^8\, \Psi_{ab}\Psi^{ab} \in E^\infty(B_a). \tag{11}$$
Then there exists on $\tilde S$ a unique solution $\theta$ of Eq. (8), which is positive, satisfies (10), and has in $B_a$ the form
$$\theta = \frac{\hat\theta}{r}, \qquad \hat\theta \in E^\infty(B_a), \qquad \hat\theta(i) = 1. \tag{12}$$

In fact, we will get slightly more detailed information. We find that $\hat\theta = u_1 + r u_2$ on $B_a$ with $u_2 \in E^\infty(B_a)$ and a function $u_1 \in C^\infty(B_a)$ which satisfies $u_1 = 1 + O(r^2)$ and
$$\left(D_b D^b - \tfrac{1}{8} R\right) \frac{u_1}{r} = \theta_R,$$
in $B_a \setminus \{i\}$, where $\theta_R$ is in $C^\infty(B_a)$ and vanishes at any order at $i$.

If $\theta$ has the form (12) then (3) will be satisfied due to our assumptions on $h_{ab}$. Note the simplicity of condition (11). To allow for later generalizations, we shall discuss below the existence of the solution $\theta$ under weaker assumptions on the smoothness of the metric $h_{ab}$ and the smoothness and asymptotic behavior of $\Psi_{ab}$ (cf. Theorem 12). In fact, already the methods used in this article would allow us to deduce analogues of all our results under weaker differentiability assumptions; however, we are particularly interested in the $C^\infty$ case because it will be convenient in our intended applications. If the metric is analytic on $B_a$ it can be arranged that $\theta_R = 0$ and $u_1$ is analytic on $B_a$ (and unique with this property, see [24] and the remark after Theorem 2). We finally note that the requirement $R > 0$, which ensures the solvability of the Lichnerowicz equation, could be reformulated in terms of a condition on the Yamabe number (cf. [29]).

It remains to be shown that condition (11) can be satisfied by tensor fields $\Psi_{ab}$ which satisfy (7), (9). A special class of such solutions, namely those which extend smoothly to all of $S$, can easily be obtained by known techniques (cf. [16]). However, in that case the initial data will have vanishing momentum and angular momentum. To obtain data without this restriction, we have to consider fields $\Psi_{ab} \in C^\infty(\tilde S)$ which are singular at $i$ in the sense that they admit, in accordance with (2), (6), (10), at $i = \{r = 0\}$ asymptotic expansions of the form
$$\Psi_{ij} \sim \sum_{k \ge -4} \Psi^k_{ij}\, r^k \quad \text{with} \quad \Psi^k_{ij} \in C^\infty(S^2). \tag{13}$$

It turns out that condition (11) excludes data with non-vanishing linear momentum, which requires a non-vanishing leading order term of the form $O(r^{-4})$. In Sect. 3.4 we will show that such terms imply terms of the form $\log r$ in $\theta$ and thus do not admit expansion of the form (3). However, this does not necessarily indicate that condition (11) is overly restrictive. In the case where the metric $h_{ab}$ is smooth it will be shown in Sect. 3.4 that a non-vanishing linear momentum always comes with logarithmic terms, irrespective of whether condition (11) is imposed or not.
There remains the question whether there exist fields $\Psi_{ab}$ which satisfy (11) and have non-trivial angular momentum. The latter requires a term of the form $O(r^{-3})$ in (13). It turns out that condition (11) fixes this term to be of the form
$$\Psi^{AJ}_{ij} = \frac{A}{r^3} (3 n_i n_j - \delta_{ij}) + \frac{3}{r^3} \left( n_j \epsilon_{kil} J^l n^k + n_i \epsilon_{ljk} J^k n^l \right), \tag{14}$$
where $n^i = x^i / r$ is the radial unit normal vector field near $i$ and $J^k$, $A$ are constants, the three constants $J^k$ specifying the angular momentum of the data. The spherically symmetric tensor which appears here with the factor $A$ agrees with the extrinsic curvature for a maximal (non-time symmetric) slice in the Schwarzschild solution (see for example [10]). Note that the tensor $\Psi^{AJ}_{ij}$ satisfies condition (11) and the equation $\partial^i \Psi^{AJ}_{ij} = 0$ on $\tilde{S}$ for the flat metric. In the next theorem we prove an analogous result for general smooth metrics.

Theorem 2. Let $h_{ab}$ be a smooth metric in $S$. There exist trace-free tensor fields $\Psi_{ab} \in C^\infty(S \setminus \{i\})$ satisfying (13) with the following properties:
(i) $\Psi_{ab} = \Psi^{AJ}_{ab} + \hat{\Psi}_{ab}$, where $\Psi^{AJ}_{ab}$ is given by (14) and $\hat{\Psi}_{ab} = O(r^{-2})$.
(ii) $D^a \Psi_{ab} = 0$ on $\tilde{S}$.
(iii) $r^8 \Psi_{ab} \Psi^{ab}$ satisfies condition (11).
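The flat-space properties claimed for the tensor (14) can be probed numerically. The following is a minimal sketch, assuming the flat metric in Cartesian coordinates; the function names, the sample point, and the closed form $6A^2 r^2 + 18\,(J^2 r^2 - (J \cdot x)^2)$ used for $r^8 \Psi^{AJ}_{ij} \Psi^{AJ\,ij}$ are our own illustration (the closed form follows from a short contraction and is not quoted from the text). Since it is a polynomial in $x$, it is consistent with condition (11).

```python
import math

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def psi_aj(x, A, J):
    """Components of the tensor (14) at a point x != 0 (flat metric)."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    # v_i = eps_{kil} J^l n^k, the vector appearing in the J-part of (14)
    v = [sum(levi_civita(k, i, l) * J[l] * n[k]
             for k in range(3) for l in range(3)) for i in range(3)]
    return [[(A * (3 * n[i] * n[j] - (1.0 if i == j else 0.0))
              + 3 * (n[j] * v[i] + n[i] * v[j])) / r ** 3
             for j in range(3)] for i in range(3)]

def divergence(x, A, J, h=1e-5):
    """Central-difference approximation of d^i Psi_ij (a numerical check only)."""
    div = [0.0, 0.0, 0.0]
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        Pp, Pm = psi_aj(xp, A, J), psi_aj(xm, A, J)
        for j in range(3):
            div[j] += (Pp[i][j] - Pm[i][j]) / (2 * h)
    return div

def r8_psi_squared(x, A, J):
    """r^8 Psi_ij Psi^ij, the quantity entering condition (11)."""
    r2 = sum(c * c for c in x)
    P = psi_aj(x, A, J)
    return r2 ** 4 * sum(P[i][j] ** 2 for i in range(3) for j in range(3))

def r8_psi_squared_polynomial(x, A, J):
    """Closed form (our own computation): 6 A^2 r^2 + 18 (J^2 r^2 - (J.x)^2)."""
    r2 = sum(c * c for c in x)
    J2 = sum(c * c for c in J)
    Jx = sum(a * b for a, b in zip(J, x))
    return 6 * A ** 2 * r2 + 18 * (J2 * r2 - Jx ** 2)
```

The finite-difference divergence is only approximate, so it should be compared with zero up to a tolerance rather than exactly.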
We prove a more detailed version of this theorem in Sect. 4.3. There it will be shown how to construct such solutions from free data by using the York splitting technique ([35]). In Sect. 4.1 the case where $h_{ab}$ is conformal to the Euclidean metric is studied in all generality.

2. Preliminaries

In this section we collect some known facts from functional analysis and the theory of linear elliptic partial differential equations. Let $\mathbb{Z}$ be the set of integers and $\mathbb{N}_0$ the set of non-negative integers. We use multi-indices $\beta = (\beta_1, \beta_2, \ldots, \beta_n) \in \mathbb{N}_0^n$ and set $|\beta| = \sum_{i=1}^n \beta_i$, $\beta! = \beta_1! \beta_2! \cdots \beta_n!$, $x^\beta = (x^1)^{\beta_1} (x^2)^{\beta_2} \cdots (x^n)^{\beta_n}$, $\partial^\beta u = \partial_{x^1}^{\beta_1} \partial_{x^2}^{\beta_2} \cdots \partial_{x^n}^{\beta_n} u$, $D^\beta u = D_1^{\beta_1} D_2^{\beta_2} \cdots D_n^{\beta_n} u$, and, for $\beta, \gamma \in \mathbb{N}_0^n$, $\beta + \gamma = (\beta_1 + \gamma_1, \ldots, \beta_n + \gamma_n)$ and $\beta \le \gamma$ if $\beta_i \le \gamma_i$. We denote by $\Omega$ an open domain in $\mathbb{R}^3$ (resp. in $S$; quite often we will then choose $\Omega = B_a$).

We shall use the following function spaces (see [1, 25] for definitions, notations, and results): the set of $m$ times continuously differentiable functions $C^m(\Omega)$, the Hölder space $C^{m,\alpha}(\Omega)$, where $0 < \alpha < 1$, the corresponding spaces $C^m(\bar{\Omega})$, $C^{m,\alpha}(\bar{\Omega})$, the space $C_0^\infty(\Omega)$ of smooth functions with compact support in $\Omega$, the Lebesgue space $L^p(\Omega)$, the Sobolev space $W^{m,p}(\Omega)$, and the local Sobolev space $W^{m,p}_{loc}(\Omega)$. For a compact manifold $S$ we can also define analogous spaces $L^p(S)$, $C^{m,\alpha}(S)$, $W^{m,p}(S)$ (cf. [6]). We shall need the following relations between these spaces.

Theorem 3 (Sobolev imbedding). Let $\Omega$ be a $C^{0,1}$ domain in $\mathbb{R}^3$, let $k, m, j$ be non-negative integers and $1 \le p, q < \infty$. Then there exist the following imbeddings:
(i) If $mp < 3$, then $W^{j+m,p}(\Omega) \subset W^{j,q}(\Omega)$, $p \le q \le 3p/(3 - mp)$.
(ii) If $(m-1)p < 3 < mp$, then $W^{j+m,p}(\Omega) \subset C^{j,\alpha}(\bar{\Omega})$, $\alpha = m - 3/p$.
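The exponent bookkeeping in Theorem 3 is easy to get wrong by hand. The following is a small sketch, assuming only the two cases as stated above in dimension 3; the function names are hypothetical helpers, not notation from the text, and the borderline cases ($mp = 3$ or $(m-1)p = 3$) are deliberately reported as not covered.

```python
from fractions import Fraction

DIM = 3  # Theorem 3 is stated for domains in R^3

def lebesgue_exponent(m, p):
    """Case (i): largest q with W^{j+m,p} in W^{j,q}, or None if mp >= 3."""
    p = Fraction(p)
    if m * p < DIM:
        return DIM * p / (DIM - m * p)
    return None

def holder_embedding(order, p):
    """Case (ii): for W^{order,p}, find m with (m-1)p < 3 < mp.

    Returns (j, alpha) with j = order - m and alpha = m - 3/p,
    or None if no such m in 1..order exists (e.g. borderline p).
    """
    p = Fraction(p)
    for m in range(1, order + 1):
        if (m - 1) * p < DIM < m * p:
            return order - m, m - DIM / p
    return None
```

For instance, `holder_embedding(4, 2)` gives `(2, Fraction(1, 2))`, i.e. $W^{4,2} \subset C^{2,1/2}$, which is the kind of statement used later for the metric.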
Theorem 4. Let $u \in W^{1,1}(\Omega)$, and suppose there exist positive constants $\alpha \le 1$ and $K$ such that
$$\int_{B_R} |\partial u| \, d\mu \le K R^{2+\alpha} \quad \text{for all balls } B_R \subset \Omega \text{ of radius } R > 0.$$
Then $u \in C^\alpha(\Omega)$.

Our existence proof for the non-linear equations relies on the following version of the compact imbedding for compact manifolds [6]:

Theorem 5 (Rellich–Kondrakov). The following imbeddings are compact:
(i) $W^{m,q}(S) \subset L^p(S)$ if $1 \ge 1/p > 1/q - m/3 > 0$.
(ii) $W^{m,q}(S) \subset C^\alpha(S)$ if $m - \alpha > 3/q$, $0 \le \alpha < 1$.

A further essential tool for the existence proof is the Schauder fixed point theorem [25]:

Theorem 6 (Schauder fixed point). Let $B$ be a closed convex set in a Banach space $V$ and let $T$ be a continuous mapping of $B$ into itself such that the image $T(B)$ is precompact, i.e. has compact closure in $B$. Then $T$ has a fixed point.

We turn now to the theory of elliptic partial differential equations (see [12, 14, 25, 31]). Let $L$ be a linear differential operator of order $m$ on the compact manifold $S$ which acts on tensor fields $u$. In the case where $u \sim u^{a_1 \ldots a_{m_1}}$ is a contravariant tensor field of rank $m_1$, $L$ has in local coordinates the form
$$Lu = \sum_{|\beta|=0}^{m} a^{j_1 \ldots j_{m_2}}{}_{i_1 \ldots i_{m_1} \beta} \, D^\beta u^{i_1 \ldots i_{m_1}} \equiv \sum_{|\beta|=0}^{m} a_\beta D^\beta u, \tag{15}$$
where the coefficients $a^{j_1 \ldots j_{m_2}}{}_{i_1 \ldots i_{m_1} \beta} = a(x)^{j_1 \ldots j_{m_2}}{}_{i_1 \ldots i_{m_1} \beta}$ are tensor fields of a certain smoothness, and $D$ denotes the Levi–Civita connection with respect to the metric $h$. In the expression on the right-hand side we suppressed the indices belonging to the unknown and the target space. Assuming the same coordinates as above, we write for a given covector $\xi_i$ at a point $x \in \Omega$ and multi-index $\beta$ as usual $\xi^\beta = \xi_1^{\beta_1} \cdots \xi_n^{\beta_n}$ and define a linear map $A(x, \xi) : \mathbb{R}^{m_1} \to \mathbb{R}^{m_2}$ by setting
$$(A(x, \xi)\, u)^{j_1 \ldots j_{m_2}} = \sum_{|\beta|=m} a(x)^{j_1 \ldots j_{m_2}}{}_{i_1 \ldots i_{m_1} \beta} \, \xi^\beta u^{i_1 \ldots i_{m_1}}.$$
The operator $L$ is elliptic at $x$ if for any $\xi \ne 0$ the map $A(x, \xi)$ is an isomorphism; $L$ is elliptic on $S$ if it is elliptic at all points of $S$. We have the following $L^p$ regularity result [2, 3, 14, 31].

Theorem 7 ($L^p$ regularity). Let $L$ be an elliptic operator of order $m$ on $\Omega$ (resp. $S$) with coefficients $a_\beta \in W^{s_{|\beta|},p}(S)$, where $s_k > 3/p + k - m + 1$, and $p > 1$. Let $s$ be a natural number such that $s_k \ge s - m \ge 0$. Let $u \in W^{m,p}_{loc}(\Omega)$ (resp. $W^{m,p}_{loc}(S)$), with $p > 1$, be a solution of the elliptic equation $Lu = f$.
(i) If $f \in W^{s-m,q}_{loc}(\Omega)$, $q \ge p$, then $u \in W^{s,q}_{loc}(\Omega)$.
(ii) If $f \in W^{s-m,q}(S)$, $q \ge p$, then $u \in W^{s,q}(S)$.

Furthermore, we have the Schauder interior elliptic regularity [3, 19, 25, 31].
Theorem 8 (Schauder elliptic regularity). Let $L$ be an elliptic operator of order $m$ on $\Omega$ with coefficients $a_\beta \in C^{k,\alpha}(\bar{\Omega})$. Let $u \in W^{m,p}(\Omega)$, with $p > 1$, be a solution of the elliptic equation $Lu = f$, with $f \in C^{k,\alpha}(\bar{\Omega})$. Then $u \in C^{k+m,\alpha}(\Omega')$, for all $\Omega' \subset\subset \Omega$.

For linear elliptic equations we have the Fredholm alternative for elliptic operators on compact manifolds [12].

Theorem 9 (Fredholm alternative). Let $L$ be an elliptic operator of order $m$ on $S$ whose coefficients satisfy the hypothesis of Theorem 7. Let $s$ be some natural number such that $s_k \ge s - m \ge 0$ and $f \in L^p(S)$, $p > 1$. Then the equation $Lu = f$ has a solution $u \in W^{m,p}(S)$ iff
$$\int_S \langle v, f \rangle_h \, d\mu = 0 \quad \text{for all } v \in \ker(L^*).$$
Here $d\mu$ denotes the volume element determined by $h$ and $L^*$ the formal adjoint of $L$, which for the operator (15) is given by
$$L^* u = \sum_{|\beta|=0}^{m} (-1)^{|\beta|} D^\beta (a_\beta u). \tag{16}$$
Furthermore $\langle \cdot, \cdot \rangle_h$ denotes the appropriate inner product induced by the metric $h_{ab}$. In our case, where $u$ and $f$ will be vector fields $f^a$ and $u^a$, we have $\langle u, f \rangle_h = f^a u_a$.

Let
$$Lu = \partial_i (a^{ij} \partial_j u + b^i u) + c^i \partial_i u + d u, \tag{17}$$
be a linear elliptic operator of second order with principal part in divergence form on $\Omega$ which acts on scalar functions. An operator of the form (17) may be written in the form (15) provided its principal coefficients $a^{ij}$ are differentiable. We shall assume that $L$ is strictly elliptic in $\Omega$; that is, there exists $\lambda > 0$ such that
$$a^{ij}(x) \xi_i \xi_j \ge \lambda |\xi|^2, \qquad \forall x \in \Omega, \quad \xi \in \mathbb{R}^n. \tag{18}$$
We also assume that $L$ has bounded coefficients; that is, for some constants $\Lambda$ and $\nu \ge 0$ we have for all $x \in \Omega$,
$$\sum |a^{ij}|^2 \le \Lambda^2, \qquad \lambda^{-2} \sum \left( |b^i|^2 + |c^i|^2 \right) + \lambda^{-1} |d| \le \nu^2. \tag{19}$$
In order to formulate the maximum principle, we have to impose that the coefficient of $u$ satisfy the non-positivity condition
$$\int_\Omega (d\, v - b^i \partial_i v) \, dx \le 0 \qquad \forall v \ge 0, \quad v \in C_0^1(\Omega). \tag{20}$$
We have the following versions of the maximum principle [25].

Theorem 10 (Weak Maximum Principle). Assume that $L$ given by (17) satisfies conditions (18), (19) and (20). Let $u \in W^{1,2}(\Omega)$ satisfy $Lu \ge 0$ ($\le 0$) in $\Omega$. Then
$$\sup_\Omega u \le \sup_{\partial\Omega} u^+ \qquad \left( \inf_\Omega u \ge \inf_{\partial\Omega} u^- \right),$$
where $u^+(x) = \max\{u(x), 0\}$, $u^-(x) = \min\{u(x), 0\}$.
Theorem 11 (Strong Maximum Principle). Assume that $L$ given by (17) satisfies conditions (18), (19) and (20). Let $u \in W^{1,2}(\Omega)$ satisfy $Lu \ge 0$ in $\Omega$. Then, if for some ball $B \subset\subset \Omega$ we have
$$\sup_B u = \sup_\Omega u \ge 0,$$
the function $u$ must be constant in $\Omega$.

Because $u$ is assumed to be only in $W^{1,2}$ the inequality $Lu \ge 0$ has to be understood in the weak sense (see [25] for details).
3. The Hamiltonian Constraint In this section we will prove Theorem 1.
3.1. Existence. The existence of solutions to the Lichnerowicz equation has been studied under various assumptions (cf. [15, 16, 28] and the references given there). The setting outlined above, where we have to solve (8), (10) on the compact manifold $S$, has been studied in [8, 22, 23]. In general the "physical" metric provided by an asymptotically flat initial data set will not admit a smooth conformal compactification at space-like infinity. Explicit examples for such situations can be obtained by studying space-like slices of stationary solutions like the Kerr solution. To allow for later generalizations of the present work which would admit also stationary solutions we shall prove the existence result of Theorem 1 for metrics $h_{ab}$ which are not necessarily smooth.

In the proof we will employ Sobolev spaces $W^{m,p}(S)$ and the corresponding imbeddings and elliptic estimates (in particular, there will be no need for us to employ weighted Sobolev spaces with weights involving the distance to the point $i$). With these spaces and standard $L^p$ elliptic theory we will also be able to handle the mild $r^{-1}$-type singularity at $i$ which occurs on the right-hand side of Eq. (8). The conformal Laplacian or Yamabe operator
$$L_h = h^{ab} D_a D_b - \frac{1}{8} R,$$
which appears on the left-hand side of (8), is a linear elliptic operator of second order whose coefficients depend on the derivatives of the metric $h$ up to second order. The smoothness to be required of the metric $h$ is determined by the following considerations. In the existence proof we need:
(i) The existence of normal coordinates. This suggests that we assume $h \in C^{1,1}(S)$.
(ii) The maximum principle, Theorems 10 and 11. The required boundedness of the Ricci scalar $R$ imposes restrictions on the second derivative of $h$.
(iii) The elliptic $L^p$ estimate, Theorem 7. This requires that $h \in W^{3,p}(S)$ for $p > 3/2$.

Since the right-hand side of Eq. (8) is in $L^2$ the assumption that $h \in W^{3,p}(S)$, $p > 3$, would be sufficient to handle Eq. (8). However, when we will discuss the momentum constraint in Sect. 4.2, we will wish to be able to handle cases where $p < 3/2$. In these cases the conditions of Theorem 7 suggest that we assume that $h \in W^{4,p}(S)$. In order to simplify our hypothesis we shall assume in the following that
$$h_{ab} \in W^{4,p}(S), \qquad p > 3/2. \tag{21}$$
The imbedding theorems then imply that $h_{ab} \in C^{2,\alpha}(S)$, $0 < \alpha < 1$, whence $R \in C^\alpha(S)$. We note that (21) is not the weakest possible assumption but it will be sufficient for our future applications.

Lemma 1. Assume that $h$ satisfies (21) and $R > 0$. Then:
(i) $L_h : W^{2,q}(S) \to L^q(S)$, $q > 1$, defines an isomorphism.
(ii) If $u \in W^{1,2}(S)$ and $L_h u \le 0$, then $u \ge 0$; if, moreover, $L_h u \ne 0 \in L^q(S)$, then $u > 0$.

Proof. (i) To show injectivity, assume that $L_h u = 0$. By elliptic regularity $u$ is smooth enough such that we can multiply this equation with $u$ and integrate by parts to obtain
$$\int_S \left( D^a u D_a u + \frac{1}{8} R u^2 \right) d\mu_h = 0.$$
Since $R > 0$ it follows that $u = 0$. Surjectivity follows then by Theorem 9 since $L_h = L_h^*$. Boundedness of $L_h$ is immediately implied by the assumptions while the inequality
$$\|u\|_{W^{2,p}(S)} \le C \|L_h u\|_{L^p(S)},$$
which follows from the elliptic estimates underlying Theorem 7 and the injectivity of $L_h$ (see e.g. [12] for this well known result), implies the boundedness of $L_h^{-1}$.

(ii) If we have $u \le 0$ in some region of $S$, it follows that $\sup_S (-u) \ge 0$. Then there is a region in $S$ in which we can apply the maximum principle to the function $-u$ to conclude that $u$ must be a non-positive constant whence $L_h u = -R u/8 \ge 0$ in that region. In the case where $L_h u < 0$ we would arrive at a contradiction. In the case where $L_h u \le 0$ we conclude that $u = 0$ in the given region and a repetition of the argument gives the desired result.

To construct an approximate solution we choose normal coordinates $x^j$ centered at $i$ such that (after a suitable choice of $a > 0$) we have in the open ball $B_a$ in these coordinates
$$h_{ij} = \delta_{ij} + \hat{h}_{ij}, \qquad h^{ij} = \delta^{ij} + \hat{h}^{ij}, \tag{22}$$
with
$$\hat{h}_{ij} = O(r^2), \qquad \hat{h}^{ij} = O(r^2), \qquad x^i \hat{h}_{ij} = 0, \qquad x_i \hat{h}^{ij} = 0.$$
Notice that $\hat{h}_{ij}$, $\hat{h}^{ij}$, defined by the equations above, are not necessarily related to each other by the usual process of raising indices. Denoting by $\Delta$ the flat Laplacian with respect to the coordinates $x^j$, we write on $B_a$
$$L_h = \Delta + \hat{L}_h,$$
with
$$\hat{L}_h = \hat{h}^{ij} \partial_i \partial_j + b^i \partial_i - \frac{1}{8} R. \tag{23}$$
We note that $b^i = O(r)$. Choose a function $\chi_a \in C^\infty(S)$ which is non-negative and such that $\chi_a = 1$ in $B_{a/2}$ and $\chi_a = 0$ in $S \setminus B_a$. Denote by $\delta_i$ the Dirac delta distribution with source at $i$.

Lemma 2. Assume that $h$ satisfies (21) and $R > 0$. Then, there exists a unique solution $\theta_0$ of the equation $L_h \theta_0 = -4\pi \delta_i$. Moreover $\theta_0 > 0$ in $\tilde{S}$ and we can write $\theta_0 = \chi_a / r + g$ with $g \in C^\alpha(S)$, $0 < \alpha < 1$.

Proof. Observing that $1/r$ defines a fundamental solution to the flat Laplacian, we obtain
$$\Delta \left( \frac{\chi_a}{r} \right) = -4\pi \delta_i + \hat{\chi}, \tag{24}$$
where $\hat{\chi}$ is a smooth function on $S$ with support in $B_a \setminus B_{a/2}$. The ansatz $\theta_0 = \chi_a / r + g$ translates the original equation into an equation for $g$,
$$L_h g = -\hat{L}_h \left( \frac{\chi_a}{r} \right) - \hat{\chi}.$$
A direct calculation shows that $\hat{L}_h (\chi_a r^{-1}) \in L^q(S)$, $q < 3$. By Lemma 1 there exists a unique solution $g \in W^{2,q}(S)$ to this equation which by the imbedding theorem is in $C^\alpha(S)$. To show that $\theta_0$ is strictly positive, we observe that it is positive near $i$ (because $r^{-1}$ is positive and $g$ is bounded) and apply the strong maximum principle to $-\theta_0$.

We use the conformal covariance of the equation to strengthen the result on the differentiability of the function $g$. Consider a conformal factor $\omega_0 = e^{f_0}$ with $f_0 \in C^\infty(S)$ such that
$$f_0 = \frac{1}{2} x^j x^k L_{jk}(i) \quad \text{on } B_a, \tag{25}$$
where we use the normal coordinates $x^k$ and the value of the tensor
$$L_{ab} \equiv R_{ab} - \frac{1}{4} R h_{ab}, \tag{26}$$
at $i$. Then the Ricci tensor of the metric
$$h'_{ab} = \omega_0^4 h_{ab} \tag{27}$$
vanishes at the point $i$ and, since we are in three dimensions, the Riemann tensor vanishes there too. Hence the connection and metric coefficients satisfy in the coordinates $x^k$
$$\Gamma'_i{}^j{}_k = O(r^2), \qquad h'_{ij} = \delta_{ij} + O(r^3). \tag{28}$$

Corollary 1. The function $g$ found in Lemma 2 is in $C^{1,\alpha}(S)$, $0 < \alpha < 1$.
Proof. With $\omega_0 = e^{f_0}$ and $h'_{ab}$ as above, we note that
$$L_{h'}(\theta'_0) = \omega_0^{-5} L_h(\theta_0), \tag{29}$$
where
$$\theta'_0 = \omega_0^{-1} \theta_0. \tag{30}$$
We apply now the argument of the proof of Lemma 2 to the function $\theta'_0$. Since we have by Eq. (28) that $\hat{L}_{h'}(\chi_a r^{-1}) \in L^\infty(S)$, it follows that $\theta'_0 = \chi_a / r + g'$, where $g' \in C^{1,\alpha}(S)$. We use Eq. (30), the fact that $\omega_0 = 1 + O(r^2)$, and Lemma 4 to obtain the desired result.

We note that the function
$$\theta_0^{-1} = \frac{r}{\chi_a + r g}, \tag{31}$$
is in $C^\alpha(S)$, it is non-negative and vanishes only at $i$. To obtain $\theta$, we write $\theta = \theta_0 + u$ and solve on $S$ the following equation for $u$:
$$L_h u = -\frac{1}{8} \theta_0^{-7} \Psi_{ab} \Psi^{ab} \left( 1 + \theta_0^{-1} u \right)^{-7}. \tag{32}$$
Theorem 12. Assume that $h_{ab} \in W^{4,p}(S)$ with $p > 3/2$, that $R > 0$ on $S$, and that $\theta_0^{-7} \Psi_{ab} \Psi^{ab} \in L^q(S)$, $q \ge 2$. Then there exists a unique non-negative solution $u \in W^{2,q}(S)$ of Eq. (32). We have $u > 0$ on $S$ unless $\Psi_{ab} \Psi^{ab} = 0 \in L^q(S)$.

We note that our assumptions on $\Psi_{ab}$ impose rather mild restrictions, which are, in particular, compatible with the fall off requirement (9). By the imbedding Theorem 3 we will have $u \in C^\alpha(S)$, $\alpha = 2 - 3/q$, for $q < 3$; and $u \in C^{1,\alpha}(S)$, for $q > 3$.

Proof. The proof is similar to that given in [8], with the difference that we impose weaker smoothness requirements. Making use of Lemma 1, we define a non-linear operator $T : B \to C^0(S)$, with a subset $B$ of $C^0(S)$ which will be specified below, by setting
$$T(u) = L_h^{-1} f(x, u),$$
where
$$f(x, u) = -\frac{1}{8} \theta_0^{-7} \Psi_{ab} \Psi^{ab} \, g(x, u) \tag{33}$$
with
$$g(x, u) = \left( 1 + \theta_0^{-1} u \right)^{-7}.$$
In the following we will suppress the dependence of $f$ and $g$ on $x$. Let $\psi \in W^{2,q}(S) \subset C^\alpha(S)$ be the function satisfying $\psi = T(0)$ and set $B = \{u \in C^0(S) : 0 \le u \le \psi\}$, which is clearly a closed, convex subset of the Banach space $C^0(S)$. We want to use the Schauder theorem to show the existence of a point $u \in B$ satisfying $u = T(u)$. This will be the solution to our equation.
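To see how the order interval $B = \{0 \le u \le \psi\}$ enters, it may help to look at a zero-dimensional caricature. The sketch below is an assumption-laden toy, not the argument of the proof: the operator $L_h$ is replaced by multiplication with $-1$, the source $\frac{1}{8}\theta_0^{-7}\Psi_{ab}\Psi^{ab}$ by a constant `w >= 0`, and $\theta_0^{-1}$ by a constant `a >= 0`, so that $T(u) = w (1 + a u)^{-7}$. The fixed point is then located by bisection rather than by Schauder's theorem; all names are hypothetical.

```python
def T(u, w, a):
    """Toy analogue of the map T: order-reversing, with T(0) = w playing the role of psi."""
    return w * (1.0 + a * u) ** -7

def fixed_point(w, a, tol=1e-14):
    """Find u in [0, T(0)] with u = T(u) by bisection on F(u) = u - T(u).

    F(0) = -w <= 0 and F(w) = w - T(w) >= 0, and F is strictly increasing,
    so the fixed point in the interval exists and is unique.
    """
    lo, hi = 0.0, T(0.0, w, a)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - T(mid, w, a) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In this toy, $T$ maps $[0, \psi]$ into itself for the same reason as in the text: $u \ge 0$ gives $T(u) \ge 0$, and $T$ is order-reversing, so $T(u) \le T(0) = \psi$.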
We show that $T$ is continuous. Observing the properties of $\theta_0^{-1}$ noted above, we see that $g$ defines a continuous map $g : B \to L^2$. Using the Cauchy–Schwarz inequality, we get
$$\|f(u_1) - f(u_2)\|_{L^2} \le \frac{1}{8} \left\| \theta_0^{-7} \Psi_{ab} \Psi^{ab} \right\|_{L^2} \|g(u_1) - g(u_2)\|_{L^2}.$$
By Lemma 1 and Theorem 3 we know that the map $L^2(S) \to W^{2,2}(S) \to C^0(S)$, where the first arrow denotes the map $L_h^{-1}$ and the second arrow the natural injection, is continuous. Together these observations give the desired result.

We show that $T$ maps $B$ into itself. If $u \ge 0$ we have $f \le 0$ whence, by Lemma 1, $T(u) = L_h^{-1}(f(u)) \ge 0$. If $u_1 \ge u_2$ it follows that $f(u_1) \ge f(u_2)$ whence, again by Lemma 1, $T(u_2) - T(u_1) = L_h^{-1}(f(u_2) - f(u_1)) \ge 0$. We conclude from this that for $u \in B$ we have $0 \le T(u) \le T(0) = \psi$.

Finally, $T(B)$ is precompact because $W^{2,2}(S)$ is compactly embedded in $C^0(S)$ by Theorem 5. Thus the hypotheses of Theorem 6 are satisfied and there exists a fixed point $u$ of $T$ in $B$. By its construction we have $u \in L^q(S)$ whence, by elliptic regularity, $u \in W^{2,q}(S)$.

To show its uniqueness, assume that $u_1$ and $u_2$ are solutions to (32). Observing the special structure of $g$ and the identity
$$a^{-7} - c^{-7} = (c - a) \sum_{j=0}^{6} a^{j-7} c^{-1-j},$$
which holds for positive numbers $a$ and $c$, we find that we can write $L_h(u_1 - u_2) = c\,(u_1 - u_2)$ with some function $c \ge 0$. The maximum principle thus allows us to conclude that $u_1 = u_2$. The last statement about the positivity of $u$ follows from Lemma 1, (ii).

3.2. Asymptotic expansions near $i$ of solutions to the Lichnerowicz equation. The aim of this section is to introduce the function spaces $E^{m,\alpha}$, to point out the simple consequences listed in Lemma 6, and to prove Theorem 13. These are the tools needed to prove Theorem 1.

Let $m \in \mathbb{N}_0$, and denote by $P_m$ the space of homogeneous polynomials of degree $m$ in the variables $x^j$. The elements of $P_m$ are of the form $\sum_{|\beta|=m} C_\beta x^\beta$ with constant coefficients $C_\beta$. Note that $r^2$ is in $P_2$ but $r$ is not in $P_1$. We denote by $H_m$ the set of homogeneous harmonic polynomials of degree $m$, i.e. the set of $p \in P_m$ such that $\Delta p = 0$. For $s \in \mathbb{Z}$, we define the vector space $r^s P_m$ as the set of functions of the form $r^s p$ with $p \in P_m$.

Lemma 3. Assume $s \in \mathbb{Z}$. The Laplacian defines a bijective linear map
$$\Delta : r^s P_m \to r^{s-2} P_m,$$
in either of the following cases:
(i) $s > 0$,
(ii) $s < 0$, $|s|$ is odd and $m + s \ge 0$.
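The case distinction in Lemma 3 can be explored with exact integer arithmetic. The sketch below assumes the fact, established in the proof that follows, that on the harmonic component $r^{s+2k} h_{m-2k}$ the Laplacian acts (up to a power of $r$) by the scalar factor $(s + 2k)(s + 1 + 2(m - k))$, so the kernel on $r^s P_m$ is trivial exactly when none of these factors vanishes for $0 \le k \le m/2$. The function names are hypothetical.

```python
def kernel_factors(s, m):
    """Scalar factors (s + 2k)(s + 1 + 2(m - k)) for 0 <= k <= m/2."""
    return [(s + 2 * k) * (s + 1 + 2 * (m - k)) for k in range(m // 2 + 1)]

def lemma3_hypothesis(s, m):
    """Cases (i) and (ii) of Lemma 3."""
    return s > 0 or (s < 0 and abs(s) % 2 == 1 and m + s >= 0)

def has_trivial_kernel(s, m):
    """True when no factor vanishes, i.e. the Laplacian is injective on r^s P_m."""
    return all(factor != 0 for factor in kernel_factors(s, m))
```

Two excluded cases illustrate why the hypotheses are needed: for $s = -1$, $m = 0$ the factor $s(s+1)$ vanishes (the function $1/r$ is harmonic away from the origin), and for $s = -2$, $m = 2$ the factor with $k = 1$ vanishes.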
Note that the assumptions on $m$ and $s$ imply that the function $\Delta(r^s p_m) \in C^\infty(\mathbb{R}^3 \setminus \{0\})$ defines a function in $L^1_{loc}(\mathbb{R}^3)$ which represents $\Delta(r^s p_m)$ in the distributional sense.

Proof. Since $\Delta$ maps $P_m$ into $P_{m-2}$, we find from
$$\Delta(r^s p) = r^{s-2} \left( s(s + 1 + 2m)\, p + r^2 \Delta p \right), \tag{34}$$
that $\Delta(r^s p) \in r^{s-2} P_m$. We show now that the map is bijective for certain values of $s$ and $m$. Because $r^s P_m$ and $r^{s-2} P_m$ have the same finite dimension, we need only show that the kernel is trivial for some $s$ and $m$. The vector space $P_m$ can be written as a direct sum
$$P_m = H_m \oplus r^2 H_{m-2} \oplus r^4 H_{m-4} \cdots, \tag{35}$$
(cf. [20]). If $\Delta(r^s p) = 0$, we get from (35) that
$$0 = \Delta(r^s p) = \sum_{0 \le k \le m/2} \Delta\left( r^{s+2k} h_{m-2k} \right),$$
with $h_{m-2k} \in H_{m-2k}$. Applying (34), we obtain
$$0 = \sum_{0 \le k \le m/2} r^{2k} (s + 2k)(s + 1 + 2(m - k))\, h_{m-2k},$$
which allows us to conclude by (35) that $(s + 2k)(s + 1 + 2(m - k))\, h_{m-2k} = 0$. Since by our assumptions $(s + 2k)(s + 1 + 2(m - k)) \ne 0$, it follows that the polynomials $h_{m-2k}$ vanish, whence $r^s p = 0$.

We will need the following technical lemma regarding Hölder functions:

Lemma 4. Suppose $m \in \mathbb{N}$, $0 < \alpha < 1$, $f \in C^{m,\alpha}(B_a)$, and $T_m$ denotes the Taylor polynomial of order $m$ of $f$. Then $f_R \equiv f - T_m$ is in $C^{m,\alpha}(B_a)$ and satisfies, if $|\beta| \le m$,
$$\partial^\beta f_R = O\left( r^{m - |\beta| + \alpha} \right) \quad \text{as } r \to 0.$$
Moreover, let $s$ be an integer such that $s \le 1$ and $m + s - 1 \ge 0$. Then $f_R$ satisfies:
(i) $r^{s-2} f_R \in W^{m+s-1,p}(B_\epsilon)$, for $p < 3/(1 - \alpha)$, $0 < \epsilon < a$.
(ii) If $m + s - 1 \ge 1$ then $r^{s-2} f_R \in C^{m+s-2,\alpha}(B_\epsilon)$.
(iii) $r f_R \in C^{m,\alpha}(B_\epsilon)$.

Proof. The relation
$$|\partial^\beta f_R| \le C |x|^{m - |\beta| + \alpha}, \qquad x \in \bar{B}_\epsilon, \tag{36}$$
is a consequence of Lemma 14.

(i) We have
$$\partial^\beta (r^{s-2} f_R) = \sum_{\beta' + \gamma' = \beta} C_{\beta'} \, \partial^{\beta'} f_R \, \partial^{\gamma'} (r^{s-2}),$$
with certain constants $C_{\beta'}$, and the derivatives of $r^{s-2}$ are bounded for $x \in \bar{B}_\epsilon$ by
$$|\partial^\gamma r^{s-2}| \le C r^{s-2-|\gamma|}.$$
Observing (36), we obtain
$$|\partial^\beta (r^{s-2} f_R)| \le C r^{m - |\beta| + s - 2 + \alpha},$$
for $x \in \bar{B}_\epsilon$, whence, by our hypothesis $m + s - 1 \ge 0$,
$$|\partial^\beta (r^{s-2} f_R)| \le C r^{-1+\alpha}, \tag{37}$$
for $|\beta| \le m + s - 1$. Using that $r^{-1+\alpha}$ is in $L^p(B_\epsilon)$ for $p < 3/(1 - \alpha)$, we conclude that $\partial^\beta (r^{s-2} f_R) \in L^p(B_\epsilon)$ whence $r^{s-2} f_R \in W^{m+s-1,p}(B_\epsilon)$.

(ii) From the relation above and Theorem 3 we conclude for $m + s - 1 \ge 1$ that $r^{s-2} f_R \in C^{m+s-2,\alpha'}(B_\epsilon)$ with $\alpha' = 1 - 3/p < \alpha$. To show that $\alpha'$ can in fact be chosen equal to $\alpha$, we use the sharp result of Theorem 4. Set $g = \partial^\beta (r^{s-2} f_R)$ for some $|\beta| \le m + s - 2$. Let $z$ be an arbitrary point of $B_\epsilon$ and $B_R(z)$ be a ball with center $z$ and radius $R$ such that $B_R \subset B_a$. Using the inequality (37), we obtain
$$\int_{B_R(z)} |\partial g| \, d\mu \le C \int_{B_R(z)} r^{\alpha-1} \, d\mu \le C \int_{B_R(0)} r^{\alpha-1} \, d\mu = C' R^{2+\alpha}$$
(cf. [25] p. 159 for the second estimate). Applying now Theorem 4, we conclude that $g \in C^\alpha(B_\epsilon)$, whence $r^{s-2} f_R \in C^{m+s-2,\alpha}(B_\epsilon)$.

(iii) We have $\partial^\beta (r f_R) = r \partial^\beta f_R + f_1$, where, with certain constants $C_{\beta',\gamma'}$,
$$f_1 = \sum_{\beta' + \gamma' = \beta,\ \beta' < \beta} C_{\beta',\gamma'} \, \partial^{\beta'} f_R \, \partial^{\gamma'} r.$$
Note that $r \partial^\beta f_R \in C^\alpha(B_a)$, since $r$ is Lipschitz continuous. Using the bound
$$|\partial^\gamma r| \le C r^{-|\gamma|+1},$$
and the bound (36) we obtain $|\partial f_1| \le C r^\alpha$. Thus $f_1 \in W^{1,p}$ for all $p$, whence, by Theorem 3, $f_1 \in C^{\alpha'}$ for $\alpha' < 1$.
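The two integrability facts used in parts (i) and (ii) of this proof can be made explicit. The following worked computation is a sketch in polar coordinates, assuming the Euclidean volume element on the ball (the metric volume element $d\mu$ differs from it by a bounded factor, which only changes the constants):

```latex
% Part (i): r^{-1+\alpha} \in L^p(B_\epsilon) exactly when p < 3/(1-\alpha).
\int_{B_\epsilon} r^{(\alpha-1)p}\, d^3x
  = 4\pi \int_0^\epsilon r^{(\alpha-1)p+2}\, dr
  = \frac{4\pi\, \epsilon^{(\alpha-1)p+3}}{(\alpha-1)p+3},
\qquad \text{finite iff } (\alpha-1)p+3 > 0 \iff p < \frac{3}{1-\alpha}.

% Part (ii): the growth rate required by Theorem 4.
\int_{B_R(0)} r^{\alpha-1}\, d^3x
  = 4\pi \int_0^R r^{\alpha+1}\, dr
  = \frac{4\pi}{2+\alpha}\, R^{2+\alpha}.
```

The second line shows where the exponent $2 + \alpha$ in the hypothesis of Theorem 4 comes from.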
The following function spaces will be important for us. Definition 1. For m ∈ N0 and 0 < α < 1, we define the space E m,α (Ba ) as the set E m,α (Ba ) = {f = f1 + rf2 : f1 , f2 ∈ C m,α (Ba )}. Furthermore we set E ∞ (Ba ) = {f = f1 + rf2 : f1 , f2 ∈ C ∞ (Ba )}.
Note that the decompositions above are not unique. If $f = f_1 + r f_2$, $f_1, f_2 \in C^{m,\alpha}(B_a)$, then also $f = f_1 + r f_R + r(f_2 - f_R)$ with $f_1 + r f_R$, $f_2 - f_R \in C^{m,\alpha}(B_a)$ if $f_R$ is given as in Lemma 4. Obviously, $E^\infty(B_a) \subset E^{m,\alpha}(B_a)$ for all $m \in \mathbb{N}_0$. The converse is not quite immediate.

Lemma 5. If $f \in E^{m,\alpha}(B_a)$ for all $m \in \mathbb{N}_0$, then $f \in E^\infty(B_a)$.

Proof. Assume that $f \in E^{m,\alpha}(B_a)$ for all $m$. Take an arbitrary $m$ and write $f = f_1 + r f_2$ with $f_1, f_2 \in C^{m,\alpha}(B_a)$. To obtain a unique representation, we write $f_1$ and $f_2$ as the sum of their Taylor polynomials of order $m$ and their remainders,
$$f_1 = \sum_{j=0}^{m} p_j^1 + f_R^1, \qquad f_2 = \sum_{j=0}^{m} p_j^2 + f_R^2, \tag{38}$$
with $p_j^1, p_j^2 \in P_j$ and $f_R^1, f_R^2 = O(r^{m+\alpha})$. From this we get the representation
$$f = \left( \sum_{j=0}^{m} p_j^1 + r \sum_{j=0}^{m-1} p_j^2 \right) + f_R, \tag{39}$$
where $f_R \equiv f_R^1 + r(f_R^2 + p_m^2) \in C^{m,\alpha}(B_a)$ and $f_R = O(r^{m+\alpha})$ by Lemma 4. This decomposition is unique: if we had $f = 0$, the fast fall-off of $f_R$ at the origin would imply that the term in brackets, whence also each of the polynomials and $f_R$, must vanish. Since $m$ was arbitrary, we conclude that the function $f$ determines a unique sequence of polynomials $p_j^2$, $j \in \mathbb{N}_0$, as above. By Borel's theorem (cf. [18]) there exists a function $v_2 \in C^\infty(B_a)$ (not unique) such that
$$v_2 - \sum_{j=0}^{m} p_j^2 = O(r^{m+1}), \qquad m \in \mathbb{N}_0. \tag{40}$$
We show that the function $v_1 \equiv f - r v_2$ is $C^{m-1}(B_a)$ for arbitrary $m$, i.e. $v_1 \in C^\infty(B_a)$. Using (39), we obtain
$$v_1 = \sum_{j=0}^{m} p_j^1 + f_R + r \left( \sum_{j=0}^{m-1} p_j^2 - v_2 \right). \tag{41}$$
The first term is in $C^{m,\alpha}(B_a)$ by the observations above, the second term is in $C^{m,\alpha}(B_a)$ by (40) and Lemma 4.

While we cannot directly apply elliptic regularity results to these spaces, they are nevertheless appropriate for our purposes. This follows from the following observation, which will be extended to more general elliptic equations and more general smoothness assumptions in Theorem 13 and in Appendix B. If $u$ is a solution to the Poisson equation
$$\Delta u = \frac{f}{r},$$
with $f \in E^{m,\alpha}(B_a)$, $m \ge 1$, then $u \in E^{m+1,\alpha}(B_a)$. This can be seen as follows. If we write $f = f_1 + r f_2 \in E^{m,\alpha}(B_a)$ in the form
$$\frac{f}{r} = \frac{T_m}{r} + f_R,$$
where $T_m$ is the Taylor polynomial of order $m$ of $f_1$, the remainder $f_R$ is seen to be in $C^{m-1,\alpha}(B_a)$ by Lemma 4. By Lemma 3, there exists a polynomial $\hat{T}_m$ such that
$$\Delta(r \hat{T}_m) = \frac{T_m}{r}.$$
Then $u_R \equiv u - r \hat{T}_m$ satisfies $\Delta u_R = f_R$ and Theorem 8 implies that $u_R \in C^{m+1,\alpha}(B_a)$, whence $u \in E^{m+1,\alpha}(B_a)$.

To generalize these arguments to equations with non-constant coefficients and to non-linear equations we note the following observation.

Lemma 6. For $f, g \in E^{m,\alpha}(B_a)$ we have
(i) $f + g \in E^{m,\alpha}(B_a)$.
(ii) $f g \in E^{m,\alpha}(B_a)$.
(iii) If $f \ne 0$ in $B_a$, then $1/f \in E^{m,\alpha}(B_a)$.
Analogous results hold for functions in $E^\infty(B_a)$.

Proof. The first two assertions are obvious, for (iii) we need only consider a small ball $B_\epsilon$ centered at the origin because $r$ is smooth elsewhere. If $f = f_1 + r f_2$, $f_1, f_2 \in C^{m,\alpha}(B_a)$, we have $1/f = v_1 + r v_2$ with
$$v_1 = \frac{f_1}{(f_1)^2 - r^2 (f_2)^2}, \qquad v_2 = \frac{-f_2}{(f_1)^2 - r^2 (f_2)^2}.$$
These functions are in $C^{m,\alpha}(B_\epsilon)$ for sufficiently small $\epsilon > 0$ because our assumptions imply that $f_1(0) \ne 0$. The $E^\infty(B_a)$ case is similar.
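The algebra behind Lemma 6 is that of pairs $(f_1, f_2)$ with the rule "$r \cdot r = r^2$ is smooth". The following pointwise sketch works with the values of $f_1$, $f_2$ and $r$ at a single point; the helper names are hypothetical, and the code checks only the algebraic identities, not the Hölder regularity of the components.

```python
def value(pair, r):
    """Value of f = f1 + r*f2 at a point where the radius equals r."""
    f1, f2 = pair
    return f1 + r * f2

def product(f, g, r):
    """Components of f*g: (f1*g1 + r^2*f2*g2, f1*g2 + f2*g1)."""
    f1, f2 = f
    g1, g2 = g
    return (f1 * g1 + r * r * f2 * g2, f1 * g2 + f2 * g1)

def inverse(f, r):
    """Components (v1, v2) of 1/f as in the proof of Lemma 6 (iii)."""
    f1, f2 = f
    denom = f1 * f1 - r * r * f2 * f2  # non-zero near r = 0 when f1(0) != 0
    return (f1 / denom, -f2 / denom)
```

Only even powers of $r$ enter the components, which is why products and inverses stay in the same class.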
We consider now a general linear elliptic differential operator $L$ of second order
$$L = a^{ij} \partial_i \partial_j + b^i \partial_i + c. \tag{42}$$
It will be assumed in this section that
$$a^{ij}, \; b^i, \; c \in C^\infty(\bar{B}_a). \tag{43}$$
We express the operator in normal geodesic coordinates $x^i$ with respect to $a^{ij}$, centered at the origin of $B_a$, such that
$$a^{ij}(x) = \delta^{ij} + \hat{a}^{ij}, \quad \text{with } \hat{a}^{ij} = O(r^2), \tag{44}$$
and
$$x_j \hat{a}^{ij} = 0, \qquad x \in B_a. \tag{45}$$
For the differential operator $\hat{L}$, given by
$$\hat{L} u = \hat{a}^{ij}(x) \partial_i \partial_j u + b^i(x) \partial_i u + c(x) u,$$
we find

Lemma 7. Suppose $p \in P_m$. Then the function $U$ defined by $\hat{L}(r^s p) = r^{s-2} U$ is $C^\infty$ and satisfies $U = O(r^{m+1})$. If in addition $b^i = O(r)$ (as in the case of the Yamabe operator $L_h$), then $U = O(r^{m+2})$.

Proof. A direct calculation, observing (45), gives
$$U = \hat{a}^{ij} \left( s \delta_{ij}\, p + r^2 \partial_j \partial_i p \right) + b^i \left( s x_i\, p + r^2 \partial_i p \right) + c\, r^2 p,$$
which guarantees our result.

In the following we shall use the splitting $L = \Delta + \hat{L}$, where $\Delta$ is the flat Laplacian in the normal coordinates $x^i$.
Theorem 13. Let $u \in W^{2,p}_{loc}(B_a)$ be a solution of
$$Lu = r^{s-2} f,$$
where $L$ is given by (42) with (43) and $s \in \mathbb{Z}$, $p > 1$.

(i) Assume $s = 1$ and $f \in E^{m,\alpha}(B_a)$. Then $u \in E^{1,\alpha'}(B_a)$, $0 < \alpha' < \alpha$, if $m = 0$ and $u \in E^{m+1,\alpha}(B_a)$ if $m \ge 1$. If $f \in E^\infty(B_a)$, then $u \in E^\infty(B_a)$.

(ii) If $s < 0$, $|s|$ is odd, $f \in C^{m,\alpha}(B_a)$ with $m + s - 1 \ge 0$, and $f = O(r^{s_0})$, with $s_0 + s \ge 0$; then $u$ has the form
$$u = r^s \sum_{k=s_0}^{m} u_k + u_R,$$
where $u_k \in P_k$, $L u_R = O(r^{m+\alpha+s-2})$, and $u_R \in C^{1,\alpha'}(B_a)$, $0 < \alpha' < \alpha$, if $m + s - 1 = 0$, $u_R \in C^{m+s,\alpha}(B_a)$ if $m + s - 1 \ge 1$. If $f \in C^\infty(B_a)$, then $u$ can be written in the form $u = r^s v_1 + v_2$ with $v_1, v_2 \in C^\infty(B_a)$, $v_1 = O(r^{s_0})$, and
$$L(r^s v_1) = r^{s-2} f + \theta_R, \qquad L(v_2) = -\theta_R,$$
where $\theta_R \in C^\infty(B_a)$ and all its derivatives vanish at the origin.

Proof. In both cases we write $f = T_m + f_R$ with a polynomial $T_m$ of order $m$,
$$T_m = \sum_{k=m_0}^{m} t_k, \qquad t_k \in P_k.$$
Case (i): $m_0 = 0$ and $f = f_1 + r f_2$, where $f_1, f_2$ are in $C^{m,\alpha}(B_a)$. We define $T_m$ to be the Taylor polynomial of order $m$ of $f_1$.
Case (ii): $m_0 = s_0$ and we define $T_m$ to be the Taylor polynomial of order $m$ of $f$.
We show that $u$ has in both cases the form
$$u = r^s \sum_{k=m_0}^{m} u_k + u_R, \tag{46}$$
with $u_k \in P_k$ and $u_R \in C^{m+s,\alpha}(B_a)$. For this purpose Lemma 3 will be used to determine the polynomials $u_k$ in terms of $t_k$ by a recurrence relation. The differentiability of $u_R$ follows then from Lemma 4, elliptic regularity, and the elliptic equation satisfied by $u_R$. The recurrence relation is defined by
$$\Delta(r^s u_{m_0}) = r^{s-2} t_{m_0}, \qquad \Delta(r^s u_k) = r^{s-2} \left( t_k - U_k^{(k)} \right), \quad k = m_0 + 1, \ldots, m, \tag{47}$$
where, given $u_{m_0}, \ldots, u_{k-1}$, we define $U_k^{(k)}$ as follows. By Lemma 7 the functions
$$U^{(k)} = r^{-s+2} \hat{L} \left( r^s \sum_{j=m_0}^{k-1} u_j \right), \tag{48}$$
which will be defined successively for $k = m_0 + 1, \ldots, m + 1$, are $C^\infty$ and $U^{(k)} = O(r^{m_0+1})$. Thus we can write by Lemma 14
$$U^{(k)} = \sum_{j=m_0+1}^{m} U_j^{(k)} + U_R^{(k)},$$
where $U_R^{(k)} = O(r^{m+\alpha})$ and $U_j^{(k)} \in P_j$ denotes the homogeneous polynomial of order $j$ in the Taylor expansion of $U^{(k)}$. By Lemma 3 the recurrence relation (47) is well defined for the cases (i) and (ii). Note that
$$U_j^{(k')} = U_j^{(k)}, \quad m_0 + 1 \le j \le k \quad \text{if} \quad k < k' \le m + 1, \tag{49}$$
because we have by Lemma 7,
$$U^{(k')} - U^{(k)} = r^{-s+2} \hat{L} \left( r^s \sum_{j=k}^{k'-1} u_j \right) = O(r^{k+1}).$$
With the definitions above and the identity (49), which allows us to replace $U_k^{(k)}$ by $U_k^{(m+1)}$, the original equation for $u$ implies for the function $u_R$ defined by (46) the equation
$$L u_R = r^{s-2} \left( -U_R^{(m+1)} + f_R \right). \tag{50}$$
Case (i): We use Lemma 4, (i) to conclude that the right-hand side of Eq. (50) is in $L^p(B_\epsilon)$ if $m = 0$ and in $C^{m-1,\alpha}(B_\epsilon)$ if $m \ge 1$. Now Theorems 7 and 8 imply that $u_R \in C^{1,\alpha'}(B_a)$, $\alpha' < \alpha$, if $m = 0$ and $u_R \in C^{m+1,\alpha}(B_a)$ if $m \ge 1$. For the $E^\infty$ case we use Lemma 5.
Case (ii): By our procedure we have $L u_R = O(r^{m+\alpha+s-2})$. If $m + s - 1 = 0$ we use part (i) of Lemma 4 to conclude that the right-hand side of Eq. (50) is in $L^p(B_a)$ and Theorems 7, 3 to conclude that $u_R \in C^{1,\alpha'}(B_a)$. If $m + s - 1 \ge 1$ we use part (ii) of Lemma 4 to conclude that the right-hand side of Eq. (50) is in $C^{m+s-2,\alpha}(B_a)$. Elliptic regularity then implies that $u_R \in C^{m+s,\alpha}(B_a)$. The $C^\infty$ case follows by an analogous argument as in the proof of Lemma 5, since the polynomials $u_{m_0}, \ldots, u_m$ obtained for an integer $m'$ with $m' > m$ coincide with those obtained for $m$, i.e. the procedure provides a unique sequence of polynomials $u_k$, $k = m_0, \ldots$.

More general expansions, which include logarithmic terms, have been studied (in a somewhat different setting) in [30], where results similar to those given in Theorem 13 have been derived. Definition 1 is tailored to the case in which no logarithmic terms appear and leads to a considerable simplification of the proofs as well as to a more concise statement of the results as compared with those given in [30].

Corollary 2. Assume that the hypotheses on $u$ of Theorem 13 are satisfied. Let $\theta_0$ be a distributional solution of $L \theta_0 = -4\pi \delta_i$ in $B_a$. Then we can write
$$\theta_0 = r^{-1} u_1 + u_2, \tag{51}$$
with $u_1, u_2 \in C^\infty(B_a)$, $u_1(0) = 1$, $L(r^{-1} u_1) = -4\pi \delta_i + \theta_R$, where $\theta_R \in C^\infty(B_a)$ and all its derivatives vanish at $i$. In the particular case of the Yamabe operator $L_h$ with respect to a smooth metric $h$ we have $u_1 = 1 + O(r^2)$.

Proof. Using that $\Delta(r^{-1}) = -4\pi \delta_i$ in $B_a$, we obtain for $u = \theta_0 - 1/r$,
$$L u = -\hat{L}(r^{-1}). \tag{52}$$
By Lemma 7 we have $\hat{L}(r^{-1}) = r^{-3} U$, with $U \in C^\infty(B_a)$ and $U = O(r)$. Our assertion now follows from Theorem 13. For the last assertion we use that in the case of $L_h$ we have $U = O(r^2)$.

We note that the functions $u_1, u_2$ are in fact analytic and $\theta_R \equiv 0$ if the coefficients of $L$ are analytic in $B_a$ (cf. [24]).

3.3. Proof of Theorem 1. There exists a unique solution $\theta = \theta_0 + u$ of Eq. (8) with $\theta_0$ as in Lemma 2 and $u$ as in Theorem 12. Since the operator $L_h$ satisfies the hypothesis of Corollary 2 we can write on $B_a$,
$$\theta_0 = \frac{u_1}{r} + w, \tag{53}$$
where $u_1, w$ are smooth functions and $L_h(r^{-1} u_1) = \theta_R$ on $B_a \setminus \{i\}$, with $\theta_R$ as described in Corollary 2. Given the solution $u = u(x)$, we can read Eq. (32) in $B_a$ as an equation for $u$,
$$L_h u = \frac{f(x)}{r}, \tag{54}$$
with $f$ considered as a given function of $x$,
$$f(x) = -\frac{r^8 \Psi_{ab} \Psi^{ab}}{8 (r \theta_0 + r u)^7}.$$
By hypothesis r 8 ab ab ∈ E ∞ (Ba ), by Corollary 2 we have r θ0 ∈ E ∞ (Ba ), and by Theorem 12 the solution u is in C α (Ba ) ⊂ E 0,α (Ba ). By Lemma 6 we thus have f ∈ E 0,α (Ba ). By Theorem 13 Eq. (54) implies that u ∈ E 1,α (Ba ), 0 < α < α which implies in turn that f ∈ E 1,α (Ba ). Repeating the argument, we show inductively that u, whence f is in E m,α (Ba ) for all m ≥ 0. Lemma 5 now implies that u ∈ E ∞ (Ba ). 3.4. Solution of the Hamiltonian constraint with logarithmic terms. The example 9(log rhm ) = r −2 hm (2m + 1),
(55)
$h_m \in H_m$, shows that logarithmic terms can occur in solutions to the Poisson equation even if the source has only terms of the form $r^s p$ with $p \in P_m$. This happens in the cases where the Laplacian does not define a bijection between $r^{s+2}P_m$ and $r^sP_m$, cases which are excluded in Lemma 3. We shall use this to show that logarithmic terms can occur in the solution to the Lichnerowicz equation if the condition $r^8\Psi_{ab}\Psi^{ab} \in E^\infty(B_a)$ is not satisfied.

Our example will be concerned with initial data with non-vanishing linear momentum. We assume that in a small ball $B_a$ centered at $i$ the tensor $\Psi^{ab}$ has the form
\[ \Psi^{ab} = \Psi_P^{ab} + \Psi_R^{ab}, \tag{56} \]
where $\Psi_P^{ab}$ is given in normal coordinates by (76) and $\Psi_R^{ab} = O(r^{-3})$ is a tensor field such that $\Psi^{ab}$ satisfies Eq. (7). The existence of such tensors, which satisfy also
\[ \theta_0^{-7}\,\Psi_{ab}\Psi^{ab} \in L^q(S), \qquad q \geq 2, \tag{57} \]
and, by Lemma 13,
\[ r^8\,\Psi_{ab}\Psi^{ab} = \psi + r\psi^R \quad \text{in } B_a, \tag{58} \]
will be shown in Sect. 4.3. Here the function $\psi^R$ is in $C^\alpha(B_a)$ and $\psi$ is given explicitly by
\[ \psi \equiv r^8\,\Psi_P^{ij}\Psi^P_{ij} = c + r^{-2}h_2, \quad \text{with } c = \frac{15}{16}P^2, \quad P^2 = P^iP_i, \tag{59} \]
and
\[ h_2 = \frac{3}{8}\,r^2\left(3(P^in_i)^2 - P^2\right), \tag{60} \]
where, in accordance with the calculations in Sect. 4.3, the latin indices in the expressions above are moved with the flat metric. We note that $h_2 \in H_2$ and $\psi$ is not continuous. The tensor $\Psi^{ab}$ satisfies condition (9) and the three constants $P^i$ given by (76) represent the momentum of the initial data.

Since $\Psi_P^{ab}$ is trace-free and divergence-free with respect to the flat metric, we could, of course, choose $h_{ab}$ to be the flat metric and $\Psi_R^{ab} = 0$. This would provide one of the conformally flat initial data sets discussed in [11]. We are interested in a more general situation.
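As a numerical sanity check of Eq. (55) for the polynomial $h_2$ of Eq. (60), the following sketch evaluates both sides with central finite differences. It is only an illustration: the helper `laplacian`, the sample momentum `P`, the evaluation point `x0` and the step size are assumptions made here and do not appear in the paper.

```python
import math

def laplacian(f, x, h=1e-3):
    """Second-order central-difference Laplacian of f at the point x in R^3."""
    total = 0.0
    for k in range(3):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        total += (f(xp) - 2.0 * f(x) + f(xm)) / h**2
    return total

# Arbitrary sample momentum (hypothetical values, for illustration only).
P = (0.3, -1.2, 0.7)

def h2(x):
    """h_2 = (3/8) r^2 (3 (P.n)^2 - P^2) = (3/8) (3 (P.x)^2 - P^2 r^2)."""
    Px = sum(p * c for p, c in zip(P, x))
    P2 = sum(p * p for p in P)
    r2 = sum(c * c for c in x)
    return 0.375 * (3.0 * Px**2 - P2 * r2)

def log_r_h2(x):
    """The function log(r) * h_2 appearing on the left-hand side of (55)."""
    r = math.sqrt(sum(c * c for c in x))
    return math.log(r) * h2(x)

x0 = [0.4, 0.5, -0.6]                  # sample point away from the origin
r2_0 = sum(c * c for c in x0)

lap_h2 = laplacian(h2, x0)             # should vanish: h_2 is harmonic
lhs = laplacian(log_r_h2, x0)          # Laplacian of log(r) h_2
rhs = (2 * 2 + 1) * h2(x0) / r2_0      # r^{-2} h_m (2m + 1) with m = 2

print(lap_h2, lhs, rhs)
```

The first printed number should be zero up to rounding, reflecting $h_2 \in H_2$, and the last two should agree up to discretisation error.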
Asymptotically Flat Initial Data with Prescribed Regularity at Infinity
Lemma 8. Let $h_{ab}$ be a smooth metric, and let $\Psi^{ab}$ be given by (56) such that conditions (57) and (58) hold. Then, there exists a unique, positive, solution to the Hamiltonian constraint (8). In $B_a$ it has the form
\[ \theta = \frac{w_1}{r} + \frac{1}{2}m + \frac{1}{32}\,r\left(9(P^in_i)^2 - 33P^2\right) + \frac{7}{16}\,m\,r^2\left(\frac{5}{4}P^2 + \frac{3}{5}\left(3(P^in_i)^2 - P^2\right)\log r\right) + u_R, \tag{61} \]
where the constant $m$ is the total mass of the initial data, $w_1$ is a smooth function with $w_1 = 1 + O(r^2)$, and $u_R \in C^{2,\alpha}(B_a)$ with $u_R(0) = 0$.

Since $w_1$ is smooth and $u_R$ is in $C^{2,\alpha}(B_a)$, there cannot occur a cancellation of logarithmic terms. For non-trivial data, for which $m \neq 0$, the logarithmic term will always appear. In the case where $h_{ab}$ is flat and $\Psi_R^{ab} = 0$ an expansion similar to (61) has been calculated in [26].

Proof. The existence and uniqueness of the solution has been shown in Sect. 3.1. To derive (61) we shall try to calculate each term of the expansion and to control the remainder as we did in the proof of Theorem 13. However, Lemma 3 will not suffice here; we will have to use Eq. (55). By Corollary 2 we have
\[ \theta_0 = \frac{w_1}{r} + w, \]
with $w, w_1 \in C^\infty(B_a)$ and $w_1 = 1 + O(r^2)$. By Theorem 12 the unique solution $u$ of Eq. (54) is in $C^\alpha(B_a)$. Equation (54) has the form
\[ L_h u = \left(\frac{\psi}{r} + \psi^R\right) f(x,u) \]
with
\[ f(x,u) = -\frac{1}{(w_1 + r(u + w))^7}. \]
The mass of the initial data is given by $m \equiv 2\,(u(0) + w(0))$. Since $u \in C^\alpha(B_a)$, we find
\[ f = -1 + \frac{7}{2}\,m\,r + f_R, \qquad f_R = O(r^{1+\alpha}). \tag{62} \]
If we set
\[ u_1 = -\frac{c}{2}\,r + \frac{1}{4r}\,h_2 + \frac{7}{2}\,m\left(\frac{c}{6}\,r^2 + \frac{1}{5}\log r\; h_2\right), \tag{63} \]
we find from (34), (55) that
\[ \Delta u_1 = \frac{\psi}{r}\left(-1 + \frac{7}{2}\,m\,r\right), \tag{64} \]
and that $v = u - u_1$ satisfies
\[ L_h v = \psi^R\left(-1 + \frac{7}{2}\,m\,r\right) - \hat L_h u_1 + \frac{\psi f_R}{r}. \tag{65} \]
We shall show that this equation implies that the function $v$ is in $C^{2,\alpha}(S)$.
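The choice of $u_1$ in (63) can be checked term by term. The short computation below is a sketch under one assumption: that Eq. (34) is the identity $\Delta(r^sp_m) = r^{s-2}\left(s(s+1+2m)\,p_m + r^2\Delta p_m\right)$ for $p_m \in P_m$, which is the form in which it is used in the proof of Lemma 11. With $h_2 \in H_2$ harmonic and $\psi = c + r^{-2}h_2$ one finds

\[
\begin{aligned}
\Delta\!\left(-\tfrac{c}{2}\,r\right) &= -\frac{c}{r}, &
\Delta\!\left(\tfrac{1}{4}\,r^{-1}h_2\right) &= \tfrac{1}{4}\,(-1)(-1+1+4)\,r^{-3}h_2 = -r^{-3}h_2, \\
\Delta\!\left(\tfrac{c}{6}\,r^2\right) &= c, &
\Delta\!\left(\tfrac{1}{5}\log r\; h_2\right) &= r^{-2}h_2 \quad \text{by (55) with } m = 2,
\end{aligned}
\]

so that

\[
\Delta u_1 = -\left(\frac{c}{r} + r^{-3}h_2\right) + \frac{7}{2}\,m\left(c + r^{-2}h_2\right) = \frac{\psi}{r}\left(-1 + \frac{7}{2}\,m\,r\right).
\]

The logarithm enters only through the last term, which is why it is multiplied by the mass $m$ in the expansion (61).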
Since $\psi^R \in C^\alpha(B_a)$, the first term on the right-hand side of (65) is in $C^\alpha(B_a)$. By a direct calculation (using that the coefficient $b^i$ of $\hat L_h$ is $O(r)$) we find that $\hat L_h u_1 \in W^{2,p}(B_a)$, $p < 3$, which implies by the Sobolev imbedding theorem that $\hat L_h u_1 \in C^\alpha(B_a)$. The third term on the right-hand side of (65) is more delicate, because it depends on the solution $v$. From Theorem 12 and Eq. (63) we find that $v \in C^\alpha(B_a) \cap W^{2,p}(B_a)$, where, due to (63), we need to assume $p < 3$. However, since $f_R = O(r^{1+\alpha})$ and $\psi$ is bounded, the function $\psi f_R/r$ is bounded and the right-hand side of (65) is in $L^\infty(B_a)$. Thus Theorem 7 implies that $v \in W^{2,p}(B_a)$ for $p > 3$, whence $v \in C^{1,\alpha}(B_a)$ by the Sobolev imbedding theorem. It follows that we can differentiate $f_R$, considered as a function of $x$, to find that $\partial_i f_R = O(r)$. It follows that $\psi f_R/r$ is in $W^{1,p}(B_a)$, $p > 3$, and thus in $C^\alpha(B_a)$. Since the right-hand side of (65) is in $C^\alpha(B_a)$, it follows that $v \in C^{2,\alpha}(B_a)$.

4. The Momentum Constraint

4.1. The momentum constraint on Euclidean space. In the following we shall give an explicit construction of the smooth solutions to the equation $\partial_a\Psi^{ab} = 0$ on the 3-dimensional Euclidean space $E^3$ (in suitable coordinates $R^3$ endowed with the flat standard metric) or open subsets of it. Another method to obtain such solutions has been described in [7]; multipole expansions of such tensors have been studied in [9].

Let $i$ be a point of $E^3$ and $x^k$ a Cartesian coordinate system with origin $i$ such that in these coordinates the metric of $E^3$, denoted by $\delta_{ab}$, is given by the standard form $\delta_{kl}$. We denote by $n^a$ the vector field on $E^3 \setminus \{i\}$ which is given in these coordinates by $x^k/|x|$. Denote by $m^a$ and its complex conjugate $\bar m^a$ complex vector fields, defined on $E^3$ outside a lower dimensional subset and independent of $r = |x|$, such that
\[ m_am^a = \bar m_a\bar m^a = n_am^a = n_a\bar m^a = 0, \qquad m_a\bar m^a = 1. \tag{66} \]
There remains the freedom to perform rotations $m^a \to e^{if}m^a$ with functions $f$ which are independent of $r$. The metric has the expansion
\[ \delta_{ab} = n_an_b + \bar m_am_b + m_a\bar m_b, \]
while an arbitrary symmetric, trace-free tensor $\Psi_{ab}$ can be expanded in the form
\[ r^3\,\Psi_{ab} = \xi\,(3n_an_b - \delta_{ab}) + \sqrt{2}\,\eta_1\,n_{(a}\bar m_{b)} + \sqrt{2}\,\bar\eta_1\,n_{(a}m_{b)} + \bar\mu_2\,m_am_b + \mu_2\,\bar m_a\bar m_b, \tag{67} \]
with
\[ \xi = \frac{1}{2}\,r^3\,\Psi_{ab}n^an^b, \qquad \eta_1 = \sqrt{2}\,r^3\,\Psi_{ab}n^am^b, \qquad \mu_2 = r^3\,\Psi_{ab}m^am^b. \]
Since $\Psi_{ab}$ is real, the function $\xi$ is real while $\eta_1$, $\mu_2$ are complex functions of spin weight 1 and 2 respectively. Using in the equation
\[ \partial^a\Psi_{ab} = 0, \tag{68} \]
the expansion (67) and contracting suitably with $n^a$ and $m^a$, we obtain the following representation of (68):
\[ 4r\partial_r\xi + \bar\eth\eta_1 + \eth\bar\eta_1 = 0, \tag{69} \]
\[ r\partial_r\eta_1 + \bar\eth\mu_2 - \eth\xi = 0. \tag{70} \]
Here $\partial_r$ denotes the radial derivative and $\eth$ the edth operator of the unit two-sphere (cf. [32] for definition and properties). By our assumptions the differential operator $\eth$ commutes with $\partial_r$.

Let ${}_sY_{lm}$ denote the spin weighted spherical harmonics, which coincide with the standard spherical harmonics $Y_{lm}$ for $s = 0$. The ${}_sY_{lm}$ are eigenfunctions of the operator $\bar\eth\eth$ for each spin weight $s$. More generally, we have
\[ \bar\eth^p\eth^p\,({}_sY_{lm}) = (-1)^p\,\frac{(l-s)!}{(l-s-p)!}\,\frac{(l+s+p)!}{(l+s)!}\;{}_sY_{lm}. \tag{71} \]
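The coefficient in (71) is easy to tabulate. The following minimal sketch does so with exact integer arithmetic; the function name `eth_coefficient` and the convention of returning 0 when $p > l - s$ (where $\eth^p$ annihilates ${}_sY_{lm}$) are choices made here for illustration, not notation from the paper.

```python
from math import factorial

def eth_coefficient(l, s, p):
    """Coefficient c with  bar-eth^p eth^p (sY_lm) = c * sY_lm, following (71).

    Returns 0 when p > l - s, because eth^p then annihilates sY_lm.
    """
    if l < abs(s):
        raise ValueError("spin weighted harmonics require l >= |s|")
    if p < 0:
        raise ValueError("p must be a non-negative integer")
    if p > l - s:
        return 0
    # Both ratios of factorials are integers, so integer division is exact.
    lower = factorial(l - s) // factorial(l - s - p)
    upper = factorial(l + s + p) // factorial(l + s)
    return (-1) ** p * lower * upper

# s = 0, p = 1 reproduces the eigenvalue -l(l+1) of the Laplacian on the unit sphere.
print([eth_coefficient(l, 0, 1) for l in range(4)])
# s = 0, p = 2 gives (l-1) l (l+1) (l+2), which vanishes for l = 0 and l = 1.
print([eth_coefficient(l, 0, 2) for l in range(4)])
```

The vanishing of the $p = 2$ coefficient for $l = 0, 1$ is the reason why the right-hand sides of (73), (74) only contain harmonics with $l \geq 2$.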
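If $\eta_s$ denotes a smooth function on the two-sphere of integral spin weight $s$, there exists a function $\eta$ of spin weight zero such that $\eta_s = \eth^s\eta$. We set $\eta^R = \mathrm{Re}(\eta)$ and $\eta^I = i\,\mathrm{Im}(\eta)$, such that $\eta = \eta^R + \eta^I$, and define $\eta_s^R = \eth^s\eta^R$, $\eta_s^I = \eth^s\eta^I$, such that $\eta_s = \eta_s^R + \eta_s^I$. We have $\overline{\eth^s\eta^R} = \bar\eth^s\eta^R$, $\overline{\eth^s\eta^I} = -\bar\eth^s\eta^I$. Using these decompositions now for $\eta_1$ and $\mu_2$, we obtain Eq. (69) in the form
\[ 2r\partial_r\xi = -\bar\eth\eth\,\eta^R. \tag{72} \]
Applying $\bar\eth$ to both sides of Eq. (70) and decomposing into real and imaginary part yields
\[ r\partial_r\,\bar\eth\eth\,\eta^I = -\bar\eth^2\eth^2\mu^I, \tag{73} \]
\[ 2r\partial_r(r\partial_r\xi) + \bar\eth\eth\,\xi = \bar\eth^2\eth^2\mu^R. \tag{74} \]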
Since the right-hand side of (72) has an expansion in spherical harmonics with $l \geq 1$ and the right-hand sides of (73), (74) have expansions with $l \geq 2$, we can determine the expansion coefficients of the unknowns for $l = 0, 1$. They can be given in the form
\[ \xi = A + rQ + \frac{1}{r}P, \qquad \eta^I = iJ + \text{const.}, \qquad \eta^R = rQ - \frac{1}{r}P + \text{const.}, \]
with
\[ P = \frac{3}{2}\,P^an_a, \qquad Q = \frac{3}{2}\,Q^an_a, \qquad J = 3J^an_a, \tag{75} \]
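where $A$, $P^a$, $Q^a$, $J^a$ are arbitrary constants. Using (67), we obtain the corresponding tensors in the form (cf. [11])
\[ \Psi_P^{ab} = \frac{3}{2r^4}\left(-P^an^b - P^bn^a - (\delta^{ab} - 5n^an^b)\,P^cn_c\right), \tag{76} \]
\[ \Psi_J^{ab} = \frac{3}{r^3}\left(n^a\epsilon^{bcd}J_cn_d + n^b\epsilon^{acd}J_cn_d\right), \tag{77} \]
\[ \Psi_A^{ab} = \frac{A}{r^3}\left(3n^an^b - \delta^{ab}\right), \tag{78} \]
\[ \Psi_Q^{ab} = \frac{3}{2r^2}\left(Q^an^b + Q^bn^a - (\delta^{ab} - n^an^b)\,Q^cn_c\right). \tag{79} \]
We assume now that $\xi$ and $\eta^I$ have expansions in terms of spherical harmonics with $l \geq 2$. Then there exists a smooth function $\lambda_2$ of spin weight 2 such that
\[ \xi = \bar\eth^2\lambda_2^R, \qquad \eta_1^I = \bar\eth\lambda_2^I. \]
Using these expressions in Eqs. (72)–(74) and observing that for smooth spin weighted functions $\mu_s$ with $s > 0$ we can have $\bar\eth\mu_s = 0$ only if $\mu_s = 0$, we obtain
\[ \eth\eta^R = -2r\partial_r\bar\eth\lambda_2^R, \qquad \eth^2\mu^I = -r\partial_r\lambda_2^I, \qquad \eth^2\mu^R = 2r\partial_r(r\partial_r\lambda_2^R) - 2\lambda_2^R + \eth\bar\eth\lambda_2^R. \]
We are thus in a position to describe the general form of the coefficients in the expression (67):
\[ \xi = \bar\eth^2\lambda_2^R + A + rQ + \frac{1}{r}P, \tag{80} \]
\[ \eta_1 = -2r\partial_r\bar\eth\lambda_2^R + \bar\eth\lambda_2^I + r\,\eth Q - \frac{1}{r}\,\eth P + i\,\eth J, \tag{81} \]
\[ \mu_2 = 2r\partial_r(r\partial_r\lambda_2^R) - 2\lambda_2^R + \eth\bar\eth\lambda_2^R - r\partial_r\lambda_2^I. \tag{82} \]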
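That the tensor (76) is trace-free and divergence-free with respect to the flat metric away from the origin can be confirmed numerically. The sketch below uses central finite differences; the function names `psi_P` and `divergence`, the sample momentum and the sample point are illustrative assumptions and not part of the paper.

```python
import math

def psi_P(x, P):
    """Components of the tensor (76) at the point x != 0 for the momentum P."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    Pn = sum(P[k] * n[k] for k in range(3))        # P^c n_c
    pref = 3.0 / (2.0 * r**4)
    return [[pref * (-P[a] * n[b] - P[b] * n[a]
                     - ((1.0 if a == b else 0.0) - 5.0 * n[a] * n[b]) * Pn)
             for b in range(3)] for a in range(3)]

def divergence(x, P, h=1e-5):
    """Central-difference approximation of d_a Psi^{ab} for b = 0, 1, 2."""
    div = [0.0, 0.0, 0.0]
    for a in range(3):
        xp = list(x)
        xm = list(x)
        xp[a] += h
        xm[a] -= h
        Tp = psi_P(xp, P)
        Tm = psi_P(xm, P)
        for b in range(3):
            div[b] += (Tp[a][b] - Tm[a][b]) / (2.0 * h)
    return div

P = [0.3, -1.2, 0.7]      # hypothetical sample momentum
x0 = [0.4, 0.5, -0.6]     # sample point away from the origin

T = psi_P(x0, P)
trace = T[0][0] + T[1][1] + T[2][2]
div = divergence(x0, P)

print(trace, div)
```

Both the trace and the three components of the divergence should vanish up to rounding and discretisation error, independently of the chosen $P^a$ and of the point.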
Theorem 14. Let $\lambda$ be an arbitrary complex $C^\infty$ function in $B_a \setminus \{i\} \subset E^3$ with $0 < a \leq \infty$, and set $\lambda_2 = \eth^2\lambda$. Then the tensor field
\[ \Psi^{ab} = \Psi_P^{ab} + \Psi_J^{ab} + \Psi_A^{ab} + \Psi_Q^{ab} + \Psi_\lambda^{ab}, \tag{83} \]
where the first four terms on the right-hand side are given by (76)–(79) while $\Psi_\lambda^{ab}$ is obtained by using in (67) only the part of the coefficients (80)–(82) which depends on $\lambda_2$, satisfies the equation $D^a\Psi_{ab} = 0$ in $B_a \setminus \{i\}$. Conversely, any smooth solution in $B_a \setminus \{i\}$ of this equation is of the form (83).

Obviously, the smoothness requirement on $\lambda$ can be relaxed since $\Psi_\lambda^{ab} \in C^1(B_a \setminus \{i\})$ if $\lambda \in C^5(B_a \setminus \{i\})$. Notice that no fall-off behaviour has been imposed on $\lambda$ at $i$ and that it can show all kinds of bad behaviour as $r \to 0$.

Since we are free to choose the radius $a$, we also obtain an expression for the general smooth solution on $E^3 \setminus \{i\}$. By suitable choices of $\lambda$ we can construct solutions $\Psi_\lambda^{ab}$ which are smooth on $E^3$ or which are smooth with compact support.

Given a subset $S$ of $R^3$ which is compact with boundary, we can use the representation (83) to construct hyperboloidal initial data ([21]) on $S$ with a metric $h$ which is Euclidean
on all of $S$ or on a subset $U$ of $S$. In the latter case we would require $\Psi_\lambda^{ab}$ to vanish on $S \setminus U$. In the case where the trace-free part of the second fundamental form implied by $h$ on $\partial S$ vanishes and the support of $\Psi^{ab}$ has empty intersection with $\partial S$, the smoothness of the corresponding hyperboloidal initial data near the boundary follows from the discussion in ([5]). Appropriate requirements on $h$ and $\Psi^{ab}$ near $\partial S$ which ensure the smoothness of the hyperboloidal data under more general assumptions can be found in ([4]).

There exists a 10-dimensional space of conformal Killing vector fields on $E^3$. In the Cartesian coordinates $x^i$ a generic conformal Killing vector field $\xi_0^a$ has components
\[ \xi_0^i = k^j\left(2x_jx^i - \delta_j{}^i\,x_lx^l\right) + \epsilon^i{}_{jk}S^jx^k + a\,x^i + q^i, \tag{84} \]
where $k^i$, $S^i$, $q^i$ are arbitrary constant vectors and $a$ an arbitrary number. In terms of the “physical coordinates” $y^i = x^i/|x|^2$, with respect to which $i$ represents infinity, we see that $k^i$, $S^i$, $q^i$, and $a$ generate translations, rotations, “special conformal transformations”, and dilatations respectively. For $0 < \epsilon < a$ we set $S_\epsilon = \{|x| = \epsilon\}$ and denote by $dS_\epsilon$ the surface element on it. For the tensor field $\Psi^{ab}$ of (83) we obtain
\[ \frac{1}{8\pi}\int_{S_\epsilon}\Psi_{ab}\,n^a\xi_0^b\,dS_\epsilon = P^ak_a + J^aS_a + A\,a + Q^aq_a. \tag{85} \]
We note that the integral is independent of $\epsilon$ and, more importantly, independent of the choice of $\lambda$. Thus the function $\lambda$ neither contributes to the momentum
\[ P^a = \frac{1}{8\pi}\lim_{\epsilon\to 0}\int_{S_\epsilon} r^2\,\Psi_{bc}\,n^b\left(2n^cn^a - \delta^{ca}\right)dS_\epsilon, \tag{86} \]
nor to the angular momentum
\[ J^a = \frac{1}{8\pi}\lim_{\epsilon\to 0}\int_{S_\epsilon} r\,\Psi_{bc}\,n^b\,\epsilon^{cad}n_d\,dS_\epsilon, \tag{87} \]
of the data.

If we use the coordinates $x^i$ to identify $E^3$ with $R^3$ and map the unit sphere $S^3 \subset R^4$ by stereographic projection through the south pole onto $R^3$, the point $i$, i.e. $x^i = 0$, will correspond to the north pole, which will be denoted in the following again by $i$. The south pole, denoted in the following by $i'$, will represent infinity with respect to the coordinates $x^i$ and the origin with respect to the coordinates $y^i$. We use $x^i$ and $y^i$ as coordinates on $S^3 \setminus \{i'\}$ and $S^3 \setminus \{i\}$ respectively. If $h^0$ is the standard metric on $S^3$, we have in the coordinates $x^i$
\[ h^0_{kl} = \theta^{-2}\,\delta_{kl}, \qquad \theta = \left(\frac{1 + r^2}{2}\right)^{1/2}. \tag{88} \]
We assume that the function $\lambda$ is smooth in $E^3 \setminus \{i\}$ and set $\tilde\Psi_{ab} = \theta^{10}\,\Psi_{ab}$ with $\Psi_{ab}$ as in (83). Then, we have by general rescaling laws (cf. (111), (112)),
\[ \tilde D^a\tilde\Psi_{ab} = 0 \quad \text{in } S^3 \setminus \{i, i'\}, \tag{89} \]
where $\tilde D_a$ denotes the connection corresponding to $h^0$. The smoothness of $\tilde\Psi_{ab}$ near $i$ can be read off from Eqs. (67) and (80)–(82). In order to study its smoothness near $i'$
we perform the inversion to obtain the tensor in the coordinates $y^i$. It turns out that we obtain the same expressions as before if we make the replacements
\[ n^i \to -n^i, \quad m^i \to m^i, \quad r \to 1/r, \quad \xi \to \xi, \quad \eta_1 \to -\eta_1, \quad \mu \to \mu, \]
\[ P^a \to Q^a, \quad J^a \to -J^a, \quad A \to A, \quad Q^a \to P^a. \]
Thus the tensors (76)–(78), the first two of which are the only ones which contribute to the momentum and angular momentum, are singular in $i'$ as well as in $i$. Observing again the conformal covariance, we can state the following result.

Corollary 3. The general smooth solution of the equation $\tilde D^a\tilde\Psi_{ab} = 0$ on $S^3 \setminus \{i, i'\}$ with respect to the metric $\omega^4h^0$, where $h^0$ denotes the standard metric on the unit 3-sphere and $\omega \in C^\infty(S^3)$, $\omega > 0$, is given by $\tilde\Psi_{ab} = (\omega^{-1}\theta)^{10}\,\Psi_{ab}$ with $\Psi_{ab}$ as in (83) and $\lambda \in C^\infty(E^3 \setminus \{i\})$. If we require the solution to be bounded near $i'$ (in particular, if we construct solutions with only one asymptotically flat end), the quantities $P^a$, $J^a$, $A$, $Q^a$ must vanish.

We can now provide tensor fields which satisfy condition (11) and thus prove a special case of Theorem 2.

Theorem 15. Denote by $\Psi_{ab}$ a tensor field of the type (83). If $r\lambda \in E^\infty(B_a)$ and $P^a = 0$, then $r^8\,\Psi_{ab}\Psi^{ab} \in E^\infty(B_a)$.

Only the part of $\lambda$ which in an expansion in terms of spherical harmonics is of order $l \geq 2$ contributes to $\Psi_{ab}$. We note that the condition $r\lambda \in E^\infty(B_a)$ entails that this part is of order $r$. The singular parts of $\Psi_{ab}$ are of the form (77)–(78).

For the proof we need certain properties of the $\eth$ operator. If $m^a$ is suitably adapted to standard spherical coordinates, the $\eth$ operator, acting on functions of spin weight $s$, acquires the form
\[ \eth\eta_s = -(\sin\theta)^s\left(\frac{\partial}{\partial\theta} + \frac{i}{\sin\theta}\frac{\partial}{\partial\phi}\right)\left((\sin\theta)^{-s}\,\eta_s\right). \tag{90} \]
The operator $\bar\eth\eth$ acting on functions with $s = 0$ is the Laplace operator on the unit sphere, i.e. we have the identity
\[ \bar\eth\eth f = r^2\,\Delta f - x^ix^j\,\partial_i\partial_jf - 2x^i\,\partial_if, \tag{91} \]
where $\Delta$ denotes the Laplacian on $R^3$. The commutator is given by
\[ (\bar\eth\eth - \eth\bar\eth)\,\eta_s = 2s\,\eta_s. \tag{92} \]
From this formula we obtain for $q \in N$ by induction the relations
\[ \left(\bar\eth\,\eth^q - \eth^q\,\bar\eth\right)\eta_s = \left(2qs + (q-1)q\right)\eth^{q-1}\eta_s, \tag{93} \]
\[ \left(\eth\,\bar\eth^q - \bar\eth^q\,\eth\right)\eta_s = \left(-2qs + (q-1)q\right)\bar\eth^{q-1}\eta_s. \tag{94} \]
In particular, it follows that the functions $\eta$ and $\mu$, which satisfy $\eth\eta = \eta_1$ and $\eth^2\mu = \mu_2$, are given by
\[ \eta = -2r\partial_r\left(\eth\bar\eth\lambda^R + 2\lambda^R\right) + \eth\bar\eth\lambda^I + 2\lambda^I + rQ - \frac{P}{r} + iJ, \tag{95} \]
\[ \mu = 2r\partial_r(r\partial_r\lambda^R) + \eth\bar\eth\lambda^R - 2r\partial_r\lambda^I. \tag{96} \]

Lemma 9. Let $f, g \in E^\infty$ be two complex functions (spin weight zero). Then, the functions $\bar\eth^qf\,\eth^qg$, $q \in N$, of spin weight zero are also in $E^\infty$.

Remarkably, the statement is wrong if we replace $E^\infty$ by $C^\infty$: the calculation in polar coordinates, using (90), gives
\[ \eth x^1\,\bar\eth x^3 = -x^1x^3 + i\,r\,x^2. \]

Proof. For $q = 1$ the proof follows from two identities. The first one is a simple consequence of the Leibniz rule:
\[ \eth f\,\bar\eth g + \bar\eth f\,\eth g = \eth\bar\eth(fg) - f\,\eth\bar\eth g - g\,\eth\bar\eth f. \tag{97} \]
Since $\eth\bar\eth$ maps by (91) smooth functions into smooth functions, it follows that $(\eth f\,\bar\eth g + \bar\eth f\,\eth g) \in E^\infty$ if $f, g \in E^\infty$ (here we can replace $E^\infty$ by $C^\infty$). The other identity reads
\[ \eth f\,\bar\eth g - \bar\eth f\,\eth g = 2\,i\,r\,\epsilon_l{}^{jk}\,x^l\,\partial_jf\,\partial_kg. \tag{98} \]
It is obtained by expressing (90) in the Cartesian coordinates $x^l$. Important for us is the appearance of the factor $r$. A particular case of this relation has been derived in [27]. It follows that $(\eth f\,\bar\eth g - \bar\eth f\,\eth g) \in E^\infty$ if $f, g \in E^\infty$. Taking the difference of (97) and (98) gives the desired result.

To obtain the result for arbitrary $q$, we proceed by induction. The Leibniz rule gives
\[ \eth^{q+1}f\,\bar\eth^{q+1}g = \eth\bar\eth\left(\eth^qf\,\bar\eth^qg\right) - \bar\eth\eth^qf\;\eth\bar\eth^qg - \eth^qf\;\eth\bar\eth^{q+1}g - \bar\eth^qg\;\eth\bar\eth\eth^qf. \]
The induction hypothesis for $q$ and (91) imply that the first term on the right-hand side is in $E^\infty$. The factors appearing in the following terms can be written by (93) and (94) in the form:
\[ \bar\eth\eth^qf = \eth^{q-1}\left(\eth\bar\eth f + q(q-1)f\right), \qquad \eth\bar\eth^qg = \bar\eth^{q-1}\left(\eth\bar\eth g + q(q-1)g\right), \]
\[ \eth\bar\eth^{q+1}g = \bar\eth^q\left(\eth\bar\eth g + q(q-1)g\right), \qquad \eth\bar\eth\eth^qf = \eth^q\left(\eth\bar\eth f + q(q-1)f\right). \]
Since the functions in parentheses are, by (91), in $E^\infty$, the induction hypothesis implies that each of the products is in $E^\infty$.

Proof of Theorem 15. In terms of the coefficients (80)–(82) we have
\[ r^8\,\Psi_{ab}\Psi^{ab} = r^2\left(2\mu_2\bar\mu_2 + 2\eta_1\bar\eta_1 + 3\xi^2\right). \tag{99} \]
Since $r\lambda$ is in $E^\infty(B_a)$, Eqs. (95) and (96) imply that $r\xi$, $r\eta$, $r\mu \in E^\infty$. The conclusion now follows from Lemma 9.
4.2. The general case: Existence. The existence of solutions to the momentum constraint for asymptotically flat initial data has been proved in weighted Sobolev spaces (cf. [12, 15–17] and the references given there) and in weighted Hölder spaces [13]. The existence of initial data with non-trivial momentum and angular momentum and the role of conformal symmetries have been analysed in some detail in [9]. In this section we will prove existence of solutions to the momentum constraint with non-trivial momentum and angular momentum following the approach of [9]. We generalize some of the results shown in [9] to metrics in the class (21). The results of the previous section will be important for the analysis of the general case.

We use the York splitting methods to reduce the problem of solving the momentum constraint to solving a linear elliptic system of equations. Let the conformal metric $h$ on the initial hypersurface $S$ be given. We use it to define the overdetermined elliptic conformal Killing operator $\mathcal{L}_h$, which maps vector fields $v^a$ onto symmetric $h$-trace-free tensor fields according to
\[ (\mathcal{L}_hv)^{ab} = D^av^b + D^bv^a - \frac{2}{3}\,h^{ab}\,D_cv^c, \tag{100} \]
and the underdetermined elliptic divergence operator $\delta_h$ which maps symmetric $h$-trace-free tensor fields $L^{ab}$ onto vector fields according to
\[ (\delta_hL)^a = D_bL^{ba}. \tag{101} \]
Let $L^{ab}$ be a symmetric $h$-trace-free tensor field and set
\[ \Psi^{ab} = L^{ab} - (\mathcal{L}_hv)^{ab}. \tag{102} \]
Then $\Psi^{ab}$ will satisfy the equation $D_a\Psi^{ab} = 0$ if the vector field $v^a$ satisfies
\[ \mathbf{L}_hv^a = D_bL^{ab}, \tag{103} \]
where the operator $\mathbf{L}_h$ is given by
\[ \mathbf{L}_hv^a = D_b(\mathcal{L}_hv)^{ab} = D_bD^bv^a + \frac{1}{3}\,D^aD_bv^b + R^a{}_b\,v^b. \tag{104} \]
Since $\Psi_{ab}(\mathcal{L}_hv)^{ab} = 2\,D_a(\Psi^{ab}v_b) - 2\,(\delta_h\Psi)^av_a$ for arbitrary vector fields $v^a$ and symmetric $h$-trace-free tensor fields $\Psi^{ab}$, $\mathcal{L}_h$ has formal adjoint $\mathcal{L}_h^* = -2\,\delta_h$. Thus
\[ \mathbf{L}_h = -\frac{1}{2}\,\mathcal{L}_h^* \circ \mathcal{L}_h, \tag{105} \]
and the operator is seen to be elliptic. Provided the given data are sufficiently smooth, we can use Theorem 9 to show the existence of solutions to (103).

Lemma 10 (Regular case). Assume that the metric satisfies (21) and $L^{ab}$ is an $h$-trace-free symmetric tensor field in $W^{1,p}(S)$, $p > 1$. Then there exists a unique vector field $v^a \in W^{2,p}(S)$ such that the tensor field $\Psi^{ab} = L^{ab} - (\mathcal{L}_hv)^{ab}$ satisfies the equation $D_a\Psi^{ab} = 0$ in $S$.
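The identity behind the adjoint relation $\mathcal{L}_h^* = -2\,\delta_h$ used before Lemma 10 is a one-line computation. As a sketch, using only that $\Psi^{ab}$ is symmetric and $h$-trace-free together with the definition (100):

\[
\begin{aligned}
\Psi_{ab}\,(\mathcal{L}_hv)^{ab}
&= \Psi_{ab}\left(D^av^b + D^bv^a\right) - \frac{2}{3}\,\Psi_{ab}h^{ab}\,D_cv^c
 = 2\,\Psi_{ab}\,D^av^b \\
&= 2\,D^a\!\left(\Psi_{ab}v^b\right) - 2\,v^b\,D^a\Psi_{ab}
 = 2\,D_a\!\left(\Psi^{ab}v_b\right) - 2\,(\delta_h\Psi)^av_a .
\end{aligned}
\]

Integrating over the compact manifold $S$, and assuming enough regularity for the divergence term to drop out, gives $(\Psi, \mathcal{L}_hv)_{L^2} = -2\,(\delta_h\Psi, v)_{L^2}$, which is the statement about the formal adjoint.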
Proof. The well known argument that the condition of Theorem 9 will be satisfied extends to our case. Assume that the vector field $\xi^a$ is in the kernel of $\mathbf{L}_h$, i.e.
\[ \mathbf{L}_h\xi^a = 0 \quad \text{in } S. \tag{106} \]
Since the metric satisfies (21), elliptic regularity gives
\[ \xi^a \in C^{2,\alpha}(S). \tag{107} \]
This smoothness suffices to conclude from (105), (106) that $0 = -2\,(\xi, \mathbf{L}_h\xi)_{L^2} = (\mathcal{L}_h\xi, \mathcal{L}_h\xi)_{L^2}$, whence
\[ (\mathcal{L}_h\xi)^{ab} = 0. \tag{108} \]
This implies for an arbitrary symmetric, $h$-trace-free tensor field $L^{ab} \in W^{1,p}(S)$, $p > 1$, the relation
\[ 0 = (\mathcal{L}_h\xi, L)_{L^2} = -2\,(\xi, \delta_hL)_{L^2}, \]
which shows that the Fredholm condition will be satisfied for any choice of $L^{ab}$ in (103).

We call the case above the “regular case” because the solution still satisfies the condition $\Psi^{ab} \in W^{1,p}(S)$. While this allows us to have solutions diverging like $O(r^{-1})$ at given points, it excludes solutions with non-vanishing momentum or angular momentum.

We note that by (108) the kernel of $\mathbf{L}_h$ consists of conformal Killing fields. Let $\xi^a$ be such a vector field. Using (107) and Lemma 14 we find that we can write in normal coordinates centered at the point $i$ of $S$
\[ \xi^k = \xi_0^k + O(r^{2+\alpha}), \tag{109} \]
where $\xi_0^k$ is the “flat” conformal Killing field (84) with coefficients given by
\[ k^a = \frac{1}{6}\,D^aD_b\xi^b(i), \qquad S^a = \epsilon^a{}_{bc}\,D^b\xi^c(i), \qquad q^a = \xi^a(i), \qquad a = \frac{1}{3}\,D_a\xi^a(i). \tag{110} \]
Since $S$ is connected, the integrability conditions for conformal Killing fields (cf. [34]) entail that these ten “conformal Killing data at $i$” determine the field $\xi^a$ uniquely on $S$.

With a conformal rescaling of the metric with a smooth, positive, conformal factor $\omega$
\[ h_{ab} \to h'_{ab} = \omega^4\,h_{ab}, \tag{111} \]
which implies a corresponding change of the connection $D_a \to D'_a$, we associate the rescalings
\[ \Psi^{ab} \to \Psi'^{ab} = \omega^{-10}\,\Psi^{ab}, \qquad \xi^a \to \xi'^a = \xi^a, \tag{112} \]
for $h$-trace free, symmetric tensor fields $\Psi^{ab}$ and Killing fields $\xi^a$. Then the conformal Killing operator and the divergence operator satisfy
\[ (\mathcal{L}_{h'}v)^{ab} = \omega^{-4}\,(\mathcal{L}_hv)^{ab}, \qquad D'_a\Psi'^{ab} = \omega^{-10}\,D_a\Psi^{ab}. \tag{113} \]
If we write $\omega = e^f$, the conformal Killing data transform as
\[ k'^a = k^a + 2a\,D^af(i) + \epsilon^{abc}S_b\,D_cf(i) + q_c\,D^aD^cf(i), \tag{114} \]
\[ S'^a = S^a + 2\,\epsilon^{abc}\,q_b\,D_cf(i), \tag{115} \]
\[ a' = a + q^a\,D_af(i), \tag{116} \]
\[ q'^a = q^a. \tag{117} \]
A vector field $\xi$ on $S$ will be called an “asymptotically conformal Killing field” at the point $i$ if it satisfies in normal coordinates $x^k$ centered at $i$ Eq. (109) with (84). While any conformal Killing field is an asymptotically conformal Killing field, the converse need not be true.

Let $\xi^a$ be an asymptotically conformal Killing field and consider the integral
\[ I_\xi = \frac{1}{8\pi}\lim_{\epsilon\to 0}\int_{\partial B_\epsilon}\Psi_{ab}\,n^a\xi^b\,dS_\epsilon, \tag{118} \]
where in the coordinates $x^k$ the unit normal $n^k$ to the sphere $\partial B_\epsilon$ around $i$ approaches $x^k/|x|$ for small $\epsilon$. As shown in the previous section, the integral vanishes if the tensor $\Psi^{ab}$ is of order $r^{-1}$ at $i$. If, however, $\Psi^{ab}$ is of order $r^{-n}$, $n = 2, 3, 4$, at $i$, the integral gives non-trivial results and can be understood in particular as a linear form on the momentum and angular momentum of the data.

We recall two important properties of the integral $I_\xi$. The first is the fact that the integral $I_\xi$ is invariant under the rescalings (111), (112). The second property is concerned with the presence of conformal symmetries. Let $L^{ab}$ be an arbitrary $h$-trace-free tensor field and $v^a$ an arbitrary vector field. Using Gauss’ theorem, we obtain
\[ 2\int_{\partial B_\epsilon}L^{ab}\,n_av_b\,dS_\epsilon = -2\int_{S - B_\epsilon}v_a\,D_bL^{ab}\,d\mu - \int_{S - B_\epsilon}(\mathcal{L}_hv)_{ab}\,L^{ab}\,d\mu, \tag{119} \]
with the orientation of $n^k$ as above. If we apply this equation to (118), the first term on the right-hand side will vanish if $D_a\Psi^{ab} = 0$ in $S \setminus \{i\}$, while the second term on the right-hand side need not vanish if $\xi^a$ is only an asymptotically conformal Killing vector field. However, if $\xi$ is a conformal Killing field, we get $I_\xi = 0$. Thus the presence of Killing fields entails restrictions on the values allowed for the momentum and angular momentum of the data. For this reason the presence of conformal symmetries complicates the existence proof.

Note that we are dealing with vector fields $\xi^a$ which satisfy the conformal Killing equation (108) everywhere in $S$; in general, a small local perturbation of the metric will destroy this conformal symmetry.

Observing the conformal covariance (113) of the divergence equation, we perform a rescaling of the form (25), (27) such that the metric $h'$ has in $h'$-normal coordinates $x^k$ in $B_a$ centered at $i$ local expression $h'_{kl}$ with
\[ h'_{kl} - \delta_{kl} = O(r^3), \qquad \partial_jh'_{kl} = O(r^2). \tag{120} \]
In these coordinates let $\Psi_{\mathrm{flat}}^{ik} \in C^\infty(B_a \setminus \{i\})$ be a trace free symmetric and divergence free tensor with respect to the flat metric $\delta_{kl}$,
\[ \delta_{ik}\,\Psi_{\mathrm{flat}}^{ik} = 0, \qquad \partial_i\Psi_{\mathrm{flat}}^{ik} = 0 \quad \text{in } B_a \setminus \{i\}, \tag{121} \]
with
\[ \Psi_{\mathrm{flat}}^{ik} = O(r^{-4}), \qquad \partial_j\Psi_{\mathrm{flat}}^{ik} = O(r^{-5}) \quad \text{as } r \to 0. \tag{122} \]
All these tensors have been described in Theorem 14. Note that the conditions (122) are essentially conditions on the function $\lambda$ which characterizes the part $\Psi_\lambda^{ab}$ of (83). Denote by $L_{\mathrm{sing}}^{ab} \in C^\infty(S \setminus \{i\})$ the $h'$-trace free tensor which is given on $B_a \setminus \{i\}$ by
\[ L_{\mathrm{sing}}^{ab} = \chi\left(\Psi_{\mathrm{flat}}^{ab} - \frac{1}{3}\,h'^{ab}\,h'_{cd}\,\Psi_{\mathrm{flat}}^{cd}\right), \tag{123} \]
and vanishes elsewhere. Here $\chi$ denotes a smooth function of compact support in $B_a$ equal to 1 on $B_{a/2}$. By our assumptions we have then
\[ D'_aL_{\mathrm{sing}}^{ab} = O(r^{-2}) \quad \text{as } r \to 0. \tag{124} \]

Theorem 16 (Singular case). Assume that the metric $h$ satisfies (21), $\omega_0$ denotes the conformal factor (25), $h' = \omega_0^4\,h$, $L_{\mathrm{sing}}^{ab}$ is the tensor field defined above, and $L_{\mathrm{reg}}^{ab}$ is a symmetric $h'$-trace free tensor field in $W^{1,p}(S)$, $p > 1$.

i) If the metric $h$ admits no conformal Killing fields on $S$, then there exists a unique vector field $v^a \in W^{2,q}(S)$, with $q = p$ if $p < 3/2$ and $1 < q < 3/2$ if $p \geq 3/2$, such that the tensor field
\[ \Psi^{ab} = \omega_0^{10}\left(L_{\mathrm{sing}}^{ab} + L_{\mathrm{reg}}^{ab} + (\mathcal{L}_{h'}v)^{ab}\right) \tag{125} \]
satisfies the equation $D_a\Psi^{ab} = 0$ in $S \setminus \{i\}$.

ii) If the metric $h$ admits conformal Killing fields $\xi^a$ on $S$, a vector field $v^a$ as specified above exists if and only if the constants $P^a$, $J^a$, $A$, $Q^a$ (partly) characterizing the tensor field $L_{\mathrm{sing}}^{ab}$ (cf. (83), (76)–(79)) satisfy the equation
\[ P^ak_a + J^aS_a + A\,a + \left(P^cL_c{}^a(i) + Q^a\right)q_a = 0, \tag{126} \]
for any conformal Killing field $\xi^a$ of $h$, where the constants $k_a$, $S_a$, $a$, $q_a$ characterizing $\xi^a$ are given by (110).

In both cases the momentum and angular momentum (cf. (86), (87)) of $\Psi^{ab}$ agree with those of the tensor $L_{\mathrm{sing}}^{ab}$. These quantities can thus be prescribed freely in case (i).

Proof. Because of (124) we can consider $D'_a(L_{\mathrm{sing}}^{ab} + L_{\mathrm{reg}}^{ab})$ as a function in $L^p(S)$, $1 < p < 3/2$. In case (i) the kernel of the operator $\mathbf{L}_{h'}$ appearing in the equation
\[ D'_a\left(L_{\mathrm{sing}}^{ab} + L_{\mathrm{reg}}^{ab} + (\mathcal{L}_{h'}v)^{ab}\right) = 0 \]
is trivial and we can apply Theorem 9 to show that the equation above determines a unique vector field $v^a$ with the properties specified above. After the rescaling we have $D_a\Psi^{ab} = 0$ by (113).

In case (ii) the kernel of $\mathbf{L}_{h'}$ is generated by the conformal Killing fields $\xi'^a = \xi^a$ of $h'$. If we express (119) in terms of the metric $h'$, the tensor field $L_{\mathrm{sing}}^{ab} + L_{\mathrm{reg}}^{ab}$, and the vector field $\xi'^a$, take the limit $\epsilon \to 0$ and use Eq. (85), we find that the Fredholm condition of Theorem 9 is satisfied if and only if for every conformal Killing field $\xi'^a$ of $h'$ we have $P^ak'_a + J^aS'_a + A\,a' + Q^aq'_a = 0$, where the constants $k'_a$, $S'_a$, $a'$, $q'_a$ are given by Eq. (110), expressed in terms of $\xi'^a$ and $h'$. By (114)–(117) this condition is identical with (126).

As shown in the previous section, we can choose $L_{\mathrm{sing}}^{ab}$ such that the corresponding momentum and angular momentum integrals take preassigned values, which can be chosen freely in case (i) and need to satisfy (126) in case (ii). These values will agree with those obtained for $\omega_0^{-10}\,\Psi^{ab}$ due to the regularity properties of $v^a$. After the rescaling the values of the momentum and the angular momentum remain unchanged because $\omega_0 = 1 + O(r^2)$.
We note that it is the presence of the 10-dimensional space of conformal Killing fields on the standard 3-sphere which led to the observation made in Corollary 3. The latter generalizes as follows.

Corollary 4. If $S$ is an arbitrary compact manifold, $h$ satisfies (21), and we allow for $p \geq 2$ asymptotic ends $i_k$, $1 \leq k \leq p$, we can choose the sets of constants $(P_k^a, J_k^a, A_k, Q_k^a)$ arbitrarily in the ends $i_k$, $1 \leq k \leq p - 1$. Which constants can be chosen at the end $i_p$ depends on the conformal Killing fields admitted by $h$.

Proof. This follows from the observation that in the case of $p$ ends Eq. (126) generalizes to an equation of the form
\[ \sum_{l=1}^{p}\left(P_l^a\,k_a^l + J_l^a\,S_a^l + A_l\,a^l + \left(P_l^c\,L_c{}^a(i_l) + Q_l^a\right)q_a^l\right) = 0, \]
where the constants bear for given $l$ the same meaning with respect to the point $i_l$ as the constants in (126) with respect to $i$.

The case of spaces conformal to the unit 3-sphere $(S^3, h^0)$ is very exceptional. A result of Obata [33], discussed in [9] in the context of the constraint equations, says that unless the manifold $(S, h)$ is conformal to $(S^3, h^0)$ there exists a smooth conformal factor such that in the rescaled metric $h'$ every conformal Killing field is in fact a Killing field. Thus the dimension of the space of conformal Killing fields cannot exceed 6. In fact, it has been shown in [9] that in that case $h'$ can admit at most four independent Killing fields and only one of them can be a rotation. In this situation Eq. (126), written in terms of the metric $h'$, reduces by (110) to
\[ J^aS_a + \left(Q^a + P^bL^a{}_b(i)\right)q_a = 0, \]
since $D_a\xi^a = 0$ for a Killing field. The constants $P^a$ and $A$ can be prescribed arbitrarily. If there does exist a rotation among the Killing fields, the equation above implies
\[ J^aS_a = 0, \qquad \left(Q^a + P^bL^a{}_b(i)\right)q_a = 0. \]

4.3. Asymptotic expansions near i of solutions to the momentum constraint. In this section we shall prove an analogue of Theorem 13 for the operator $\mathbf{L}_h$ defined in (104). It will be used to analyse the behaviour of the solutions to the momentum constraint considered in Theorem 16 near $i$ and to show the existence of a general class of solutions which satisfy condition (11). Our result rests on the close relation between the operator $\mathbf{L}_h$ and the Laplace operator.

We begin with a discussion on $R^3$ and write $x_i = x^i$, $\partial^i = \partial_i$. The flat space analogue of $\mathbf{L}_h$ on $R^3$ is given by
\[ \mathbf{L}_0v^k = \Delta v^k + \frac{1}{3}\,\partial^k\partial_lv^l, \tag{127} \]
where $\Delta$ denotes the flat space Laplacian and $v^k$ a vector field on some neighbourhood of the origin in $R^3$. The following spaces of vector fields whose components are homogeneous polynomials of degree $m$ and smooth functions respectively will be important for us.
Definition 2. Let $m \in N$, $m \geq 1$. We define the real vector spaces $Q_m$, $Q_\infty(B_a)$ by
\[ Q_m = \{v \in C^\infty(R^3, R^3)\,|\; v^i \in P_m, \; v^ix_i = r^2v \text{ with } v \in P_{m-1}\}, \]
\[ Q_\infty(B_a) = \{v \in C^\infty(B_a, R^3)\,|\; v^ix_i = r^2v \text{ with } v \in C^\infty(B_a)\}. \]
The following lemma, an analogue of Lemma 3, rests on the conditions imposed on the vector fields above.

Lemma 11. Suppose $s \in Z$. Then the operator $\mathbf{L}_0$ defines a linear, bijective mapping of vector spaces
\[ \mathbf{L}_0 : r^sQ_m \to r^{s-2}Q_m, \]
in the following cases: (i) $s > 0$, (ii) $s < 0$, $|s|$ is odd and $m + s \geq 0$.

Note that the assumptions on $m$ and $s$ imply that the vector field $\mathbf{L}_0(r^sp^i) \in C^\infty(R^3 \setminus \{0\}, R^3)$ defines a vector field in $L^1_{loc}(R^3, R^3)$ which represents $\mathbf{L}_0(r^sp^i)$ in the distributional sense.

Proof. For $s$ as above and $p^i \in Q_m$ there exists some $p_{m-1} \in P_{m-1}$ with
\[ \partial_k(r^sp^k) = r^sq_{m-1} \quad \text{with} \quad q_{m-1} = s\,p_{m-1} + \partial_kp^k \in P_{m-1}. \tag{128} \]
With Eq. (34) it follows that $\mathbf{L}_0(r^sp^i) = r^{s-2}\hat p^i$ with
\[ \hat p^i = s\,(s + 1 + 2m)\,p^i + r^2\,\Delta p^i + \frac{1}{3}\left(s\,x^iq_{m-1} + r^2\,\partial^iq_{m-1}\right) \in P_m. \]
Moreover, $\hat p^i \in Q_m$ because $\hat p^ix_i = r^2\hat p_{m-1}$ with
\[ \hat p_{m-1} = s\,(s + 1 + 2m)\,p_{m-1} + x_i\,\Delta p^i + \frac{1}{3}\left(s\,q_{m-1} + x_i\,\partial^iq_{m-1}\right) \in P_{m-1}. \]
To show that the kernel of the map is trivial, assume that $\mathbf{L}_0(r^sp^i) = 0 \in L^1_{loc}(R^3, R^3)$. Taking a (distributional) derivative we obtain
\[ 0 = \partial_i\mathbf{L}_0(r^sp^i) = \frac{4}{3}\,\Delta\!\left(\partial_i(r^sp^i)\right) = \frac{4}{3}\,\Delta(r^sq_{m-1}). \tag{129} \]
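The factor $4/3$ in (129) and the formula for $\hat p^i$ can be illustrated on the simplest element of $Q_1$. The following is only a worked sketch; the choice $p^i = x^i$ (for which $p^ix_i = r^2$, so $p_0 = 1$) is an example introduced here. First, for any vector field $v^k$, the definition (127) gives

\[
\partial_k\mathbf{L}_0v^k = \Delta\,\partial_kv^k + \frac{1}{3}\,\Delta\,\partial_lv^l = \frac{4}{3}\,\Delta\!\left(\partial_kv^k\right).
\]

For $p^i = x^i$ and $m = 1$ one has $\partial_kp^k = 3$, so that (128) gives $q_0 = s + 3$, and with $\Delta p^i = 0$, $\partial^iq_0 = 0$ the formula for $\hat p^i$ reduces to

\[
\hat p^i = s\,(s+3)\,x^i + \frac{1}{3}\,s\,(s+3)\,x^i = \frac{4}{3}\,s\,(s+3)\,x^i .
\]

This agrees with the direct computation: away from the origin and for $s \neq -2$ we have $r^sx^i = \partial^i\!\left(r^{s+2}\right)/(s+2)$ and $\Delta r^{s+2} = (s+2)(s+3)\,r^s$, hence

\[
\mathbf{L}_0(r^sx^i) = \frac{4}{3}\,\partial^i\!\left(\frac{\Delta r^{s+2}}{s+2}\right) = \frac{4}{3}\,(s+3)\,\partial^ir^s = \frac{4}{3}\,s\,(s+3)\,r^{s-2}x^i .
\]

The coefficient vanishes only for $s = 0$ and $s = -3$, and the latter value is excluded in Lemma 11 by the condition $m + s \geq 0$ with $m = 1$.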
When $s > 0$ or $|s|$ odd and $m - 1 + s \geq 0$ we use Lemma 3 to conclude that $q_{m-1} = 0$. We insert this in the equation $\mathbf{L}_0(r^sp^i) = 0$ to obtain $\Delta(r^sp^i) = 0$ and conclude again by Lemma 3 that $p^i = 0$. There remains the case $s + m = 0$ with $|s|$ odd. Expanding $q_{m-1}$ in Eq. (129) in harmonic polynomials (cf. (35)), we get
\[ 0 = \Delta(r^{-m}q_{m-1}) = \sum_{0 \leq k \leq (m-1)/2}\Delta\!\left(r^{2k-m}\,h_{m-1-2k}\right), \]
whence, by (34),
r 2k−m−2 (m − 2 k) (m − 2 k − 1) hm−1−2k = 0.
0≤k≤(m−1)/2
Since this sum is direct each summand must vanish separately. Since m is odd, the only factor (m − 2 k − 1) which vanishes occurs when 2k = m − 1, from which we conclude that qm−1 = r m−1 h0 with a constant h0 . Since Eq. (129), which reads now h0 9(r −1 ) = 0, holds in the distributional sense, it follows that h0 = 0. Unless noted otherwise we shall assume in the following that the metric h is of class C ∞ and that it is chosen in its conformal class such that its Ricci tensor vanishes at i (cf. (25), (27)). By x i will always be denoted a system of h-normal coordinates centered at i and all our calculations will be done in these coordinates. Thus we have hkl = δkl + O(r 3 ),
∂j hkl = O(r 2 ).
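The radial identity used repeatedly in the proof of Lemma 11 can be checked numerically. The sketch below assumes that Eq. (34), which is not reproduced in this section, reduces for a harmonic homogeneous polynomial $p$ of degree $m$ to $\Delta(r^s p) = s(s+1+2m)\, r^{s-2} p$; the test polynomial $p = xyz$, the sample point and the finite-difference step are illustrative choices only.

```python
import math


def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference Laplacian of f at (x, y, z)."""
    c = f(x, y, z)
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6.0 * c) / h**2


def radial_identity(s, point=(0.7, -0.4, 1.1)):
    """Return (numerical Laplacian, closed form) for f = r^s * p, p = x*y*z."""
    m = 3  # degree of the harmonic polynomial p = x*y*z (assumed example)

    def r(x, y, z):
        return math.sqrt(x * x + y * y + z * z)

    def p(x, y, z):
        return x * y * z

    def f(x, y, z):
        return r(x, y, z) ** s * p(x, y, z)

    lhs = laplacian(f, *point)
    rhs = s * (s + 1 + 2 * m) * r(*point) ** (s - 2) * p(*point)
    return lhs, rhs


if __name__ == "__main__":
    for s in (-7, -3, 2):
        lhs, rhs = radial_identity(s)
        print(s, lhs, rhs)
```

The factor $s(s+1+2m)$ vanishes for $s = -(2m+1)$, which lies outside the range $m + s \ge 0$ considered in Lemma 11.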
We write the operator $L_h$ in the form $L_h = L_0 + \hat L_h$, where, with the notation of (22),
$(\hat L_h v)_i = \hat h^{jk} \partial_j \partial_k v_i + \tfrac{1}{3} \hat h^{jk} \partial_i \partial_k v_j + B^{jk}{}_i\, \partial_j v_k + A^j{}_i\, v_j$, (130)
with
$B^{kj}{}_i = -2\, h^{jl}\, \Gamma_l{}^k{}_i - \tfrac{4}{3}\, h^{lf}\, \Gamma_l{}^j{}_f\, h^k{}_i + \tfrac{1}{3}\, \partial_i h^{jk}$,
and $A^j{}_i$ is a function of the metric coefficients and their first and second derivatives. The fields $A^j{}_i$, $B^{kj}{}_i$ are smooth and satisfy
$A^j{}_i = O(r)$, $B^{kj}{}_i = O(r^2)$, (131)
and, because $x_k x^i\, \Gamma_i{}^k{}_j = 0$ at the point with normal coordinates $x^k$,
$x_k x^i B^{kj}{}_i = -\tfrac{4}{3}\, r^2\, h^{lf}\, \Gamma_l{}^j{}_f$. (132)
Similarly, we write the operator $L_h$ in the form
Lh = L0 + Lˆ h . Lemma 12. Suppose pi ∈ Qm . Then Lˆ h (r s p i ) = r s−2 U i with some U i ∈ Q∞ (Ba ) which satisfies U i = O(r m+3 ). Proof. Using (130), we calculate r −s+2 Lˆ h (r s p i ) and find 1 Ui = hˆ kj sδkj p i + r 2 ∂k ∂j pi + hˆ j k sxi ∂k pj + r 2 ∂k ∂i pj + δki pj 3 + B kj
i
sxk pj + r 2 ∂k pi + r 2 Aj i pj .
(133)
Thus U i is smooth. Using (131) we obtain that Ui = O(r m+3 ), using (132) we find x i Ui = r 2 f with some smooth function f .
We are in a position now to prove for $m = \infty$ the analogue of point (ii) of Theorem 13.

Theorem 17. Assume that $h$ is smooth, $s \in \mathbb{Z}$, $s < 0$, $|s|$ odd, $F^i \in C^\infty(B_a)$, and $J^i \in Q_\infty(B_a)$ with $J^i = O(r^{s_0})$ for some $s_0 \ge |s|$. Then, if $v^i \in W^{2,p}_{loc}(B_a)$ solves $(L_h v)^i = r^{s-2} J^i + F^i$, it can be written in the form
$v^i = r^s v_1^i + v_2^i$, (134)
with $v_1^i \in Q_\infty(B_a)$, $v_1^i = O(r^{s_0})$, $v_2^i \in C^\infty(B_a)$.

Proof. The proof is similar to that of Theorem 13. For given $m \in \mathbb{N}$ we can write by our assumptions $J^i = T_m^i + J_R^i$, where $J_R^i = O(r^{m+1})$ and $T_m^i$ denotes the Taylor polynomial of $J^i$ of order $m$. Because $J^i \in Q_\infty(B_a)$, its Taylor polynomial can be written in the form
$T_m^i = \sum_{k=s_0}^{m} t_k^i$ with $t_k^i \in Q_k$.
We define now a function $v_R^i$ (depending on $m$) by
$v^i = r^s \sum_{k=s_0}^{m} v_k^i + v_R^i$.
The quantities $v_k^i \in Q_k$ are determined by the recurrence relation
$L_0(r^s v_{s_0}^i) = r^{s-2}\, t_{s_0}^i$, $L_0(r^s v_k^i) = r^{s-2} (t_k^i - U_k^{(k)i})$,
where, for given $k$, the quantity $U_k^{(k)i} \in Q_k$ is obtained as follows. The function
$U^{(k)i} \equiv r^{-s+2}\, \hat L \big( r^s \sum_{j=s_0}^{k-1} v_j^i \big)$
has by Lemma 12 an expansion
$U^{(k)i} = \sum_{j=s_0+2}^{m} U_j^{(k)i} + U_R^{(k)i}$ with $U_j^{(k)i} \in Q_j$,
from which we read off $U_k^{(k)i}$. By Lemma 11 the recurrence relation is well defined. With these definitions, the remainder $v_R^i$ satisfies the equation
$L v_R^i = r^{s-2}\, U_R^{(m+1)i} + J_R^i + F^i$.
By Lemma 4 the right-hand side of this equation is in $C^{m+s-2,\alpha}(B_\epsilon)$. By elliptic regularity we have $v_R^i \in C^{m+s,\alpha}(B_a)$. Since $m$ was arbitrary, the conclusion follows now by an argument similar to the one used in the proof of Lemma 5.
Theorem 17 will allow us to prove that the solutions of the momentum constraint obtained in Sect. 4.2 have an expansion of the form (13), if we impose near $i$ certain conditions on the data which can be prescribed freely.

In definition (123) of the field $L^{ab}_{sing}$, which enters Theorem 16, we assume that $\Psi^{ab}_{flat}$ is of the form (83) with $\lambda \equiv 0$, i.e. it is given by the tensor fields (76)–(79). In order to write $\Psi^{ab}_{flat}$ in a convenient form, we introduce vector fields which are given in normal coordinates by
$v_P^i = -\tfrac{1}{4}\, P^k \partial_k \partial^i r^{-1} = r^{-5} p_P^i$, $p_P^i = \tfrac{1}{4} (r^2 P^i - 3\, x^i P^k x_k) \in Q_2$, (135)
$v_J^i = \epsilon^{ijk} J_j \partial_k r^{-1} = r^{-3} p_J^i$, $p_J^i = -\epsilon^{ijk} J_j x_k \in Q_1$, (136)
$v_A^i = \tfrac{1}{2}\, A\, \partial^i r^{-1} = r^{-3} p_A^i$, $p_A^i = -\tfrac{1}{2}\, A\, x^i \in Q_1$, (137)
$v_Q^i = -2\, Q^i r^{-1} + \tfrac{1}{4}\, Q^k \partial_k \partial^i r = r^{-3} p_Q^i$, $p_Q^i = -\tfrac{7}{4}\, r^2 Q^i - \tfrac{1}{4}\, x^i Q^k x_k \in Q_2$, (138)
where $P^i$, $J^i$, $A$, $Q^i$ are chosen such that the vector fields satisfy
$(L_0 v_P)^{ab} = \Psi_P^{ab}$, $(L_0 v_J)^{ab} = \Psi_J^{ab}$, $(L_0 v_A)^{ab} = \Psi_A^{ab}$, $(L_0 v_Q)^{ab} = \Psi_Q^{ab}$, (139)
with $\Psi_P^{ab}$, $\Psi_J^{ab}$, $\Psi_A^{ab}$, $\Psi_Q^{ab}$ as given by (76)–(79). We have on $B_a \setminus \{i\}$,
(L0 vP )a = 0,
(L0 vJ )a = 0,
(L0 vA )a = 0,
(L0 vQ )a = 0,
(140)
and can thus write on $B_{a/2} \setminus \{i\}$,
$L^{ij}_{sing} = L_0(v_P + v_J + v_A + v_Q)^{ij} - \tfrac{1}{3}\, h^{ij} h_{kl}\, L_0(v_P + v_J + v_A + v_Q)^{kl}$. (141)
Of the field $L^{ab}_{reg} \in W^{1,p}(S)$, $p > 1$, entering Theorem 16 we assume that it can be written near $i$ in the form
$L^{ab}_{reg} = r^s L^{ab}_{1reg} + L^{ab}_{2reg}$, (142)
where $s \le -1$ is some integer which will be fixed later on, $L^{ab}_{1reg}$, $L^{ab}_{2reg}$ are smooth in $B_a$ and such that $L^{ij}_{1reg} = O(r^{-s-1})$, and $x_i x_j L^{ij}_{1reg} = r^2 L$ with some $L \in C^\infty(B_a)$. Then
$J^i_{reg} \equiv D_j L^{ij}_{reg} = r^{s-2} \hat J^i + D_j L^{ij}_{2reg}$, (143)
with $\hat J^i = r^2 D_j L^{ij}_{1reg} + s\, x_j L^{ij}_{1reg} \in Q_\infty$ and $\hat J^i = O(r^{-s})$. Using Theorem 17, we obtain for the solutions of Theorem 16 (where we can set by our present assumptions ω0 ≡ 1, h ≡ h ) the following result.
Corollary 5. With the tensor fields $L^{ab}_{sing}$, $L^{ab}_{reg}$ given by (141), (142) respectively, let the vector field $v^a$ be such that
$\Psi^{ab} = L^{ab}_{sing} + L^{ab}_{reg} + (L v)^{ab}$ (144)
satisfies $D_a \Psi^{ab} = 0$ in $B_a \setminus \{i\}$.
(i) If $P^a = 0$ in (141) and $s = -3$ in (142), the vector field $v^a$ can be written in the form
$v^i = r^{-3} v_1^i + v_2^i$, with $v_1^i \in Q_\infty(B_a)$, $v_1^i = O(r^3)$, $v_2^i \in C^\infty(B_a)$.
(ii) If $J^a = 0$, $A = 0$, $Q^a = 0$ in (141) and $s = -5$ in (142), the vector field $v^a$ can be written in the form
$v^i = r^{-5} v_1^i + v_2^i$, with $v_1^i \in Q_\infty(B_a)$, $v_1^i = O(r^5)$, $v_2^i \in C^\infty(B_a)$.

Proof. In both cases the vector field $v^a$ satisfies $L_h v^a = -J^a_{sing} - J^a_{reg}$ with $J^a_{reg}$ given by (143) and $J^a_{sing} = D_b L^{ab}_{sing}$. By Eq. (140) we have in case (i) $J^a_{sing} = (L_h(v_J + v_A + v_Q))^a = (\hat L_h(v_J + v_A + v_Q))^a$, and in case (ii) $J^a_{sing} = (L_h v_P)^a = (\hat L_h v_P)^a$ on $B_{a/2} \setminus \{i\}$. The results now follow from Eqs. (135)–(138), Lemma 12, and Theorem 17.
We are in a position now to describe the behaviour of the scalar field $\Psi_{ab} \Psi^{ab}$ near $i$.

Lemma 13. The tensor field (144) satisfies in case (i) $r^8\, \Psi_{ab} \Psi^{ab} \in E^\infty(B_a)$, and in case (ii) $r^8\, \Psi_{ab} \Psi^{ab} = \psi + r\, \psi^R$, where $\psi^R \in C^\alpha(B_a)$ and
$\psi = \tfrac{15}{16}\, P_i P^i + r^{-2} h_2$
with harmonic polynomial $h_2 = \tfrac{3}{8}\, r^2 (3\, (P^i n_i)^2 - P_i P^i)$.

Proof. For $w^i \in Q_\infty(B_a)$ with $x_i w^i = r^2 \hat w$, $\hat w \in C^\infty(B_a)$, we have
$(L_h(r^s w))^{ij} = r^s \big( (L_h w)^{ij} - \tfrac{2}{3}\, s\, h^{ij} \hat w \big) + r^{s-2}\, 2 s\, x^{(i} w^{j)}$. (145)
We set $s = -3$ and $w^i = p_J^i + p_A^i + p_Q^i$ in case (i) and $s = -5$ and $w^i = p_P^i$ in case (ii), and we write $x_i v_1^i = r^2 \hat v_1$ with $\hat v_1 \in C^\infty(B_a)$. Observing the equation above we get on $B_{a/2} \setminus \{i\}$ a representation
ij = r s H ij + r s−2 K ij + Lij with fields
2 ij 2 1 h hkl (L0 w)kl − s wˆ δ ij + 2 hij (1 − hkl δ kl ) 3 3 3 2 ij + (Lh v1 )kl − s hij vˆ1 + L1reg , 3 j) ij = 2 s (x (i w j ) + x (i v1 ), Lij = L2reg + (Lh v2 )ij ,
H ij = (L0 w)ij −
K ij
which are in C ∞ (Ba/2 ). Since a direct calculation gives Kij K ij = r 2 K with K ∈ C ∞ (Ba/2 ), we get ij ij = r 2s−2 (2 Hij K ij − K) + r 2s Hij H ij + r s−2 2 Kij Lij + r s 2 Hij Lij + Lij Lij , from which we can immediately read off the desired result in case (i). In case (ii) it is obtained from our assumptions by a detailed calculation of r 2s−2 (2 Hij K ij − K) + r 2s Hij H ij . Combining the results above and observing the conformal invariance of the equations involved, we obtain the following detailed version of Theorem 2. We use here the notation of Theorem 16. Theorem 18. Assume that the metric h is smooth and ab is the solution of the momentum constraint determined in Theorem 16. If 2 ab ab ab ab cd cd cd (i) Lab sing = J + A + Q − 3 h hcd (J + A + Q ) in Ba/2 , −3 Lab + Lab with Lab , Lab ∈ C ∞ (B ) such that Lab = O(r 2 ), (ii) Lab a reg = r 1reg 2reg 1reg 2reg 1reg 2 L with some L ∈ C ∞ (B ), and xa xa Lab = r a 1reg
then $\Psi^{ab}$ satisfies condition (11).

A. On Hölder Functions

In this section we want to prove an estimate concerning Hölder continuous functions. Let $U$ be an open ball in $\mathbb{R}^n$, $n \ge 1$, centered at the origin. Suppose $f \in C^k(U)$ for some $k \ge 0$ and $m$ is a non-negative integer with $m \le k$. Then we can write
$f = \sum_{|\beta| < m} \frac{1}{\beta!}\, \partial^\beta f(0)\, x^\beta + m \sum_{|\beta| = m} \frac{1}{\beta!} \int_0^1 (1 - t)^{m-1}\, \partial^\beta f(t x)\, x^\beta\, dt$
$\phantom{f} = \sum_{|\beta| \le m} \frac{1}{\beta!}\, \partial^\beta f(0)\, x^\beta + m \sum_{|\beta| = m} \frac{1}{\beta!} \int_0^1 (1 - t)^{m-1}\, (\partial^\beta f(t x) - \partial^\beta f(0))\, x^\beta\, dt$,
where the first line is a standard form of Taylor's formula and the second line a slight modification thereof. We denote by $T_m(f)$ the Taylor polynomial of order $m$ and by $R_m(f)$ the modified remainder, i.e. the first and the second term of the second line respectively.

Lemma 14. Suppose $f \in C^{m,\alpha}(U)$. Then $f - T_m(f) \in C^{m,\alpha}(U)$ and we have for $\beta \in \mathbb{N}_0^n$, $|\beta| \le m$,
$|\partial^\beta (f - T_m(f))(x)| \le |x|^{m + \alpha - |\beta|} \sum_{|\gamma| = m - |\beta|} \frac{1}{\gamma!}\, c_{\gamma + \beta}$ on $U$, (146)
where the constants $c_\delta$ denote the Hölder coefficients satisfying $|\partial^\delta f(x) - \partial^\delta f(0)| \le c_\delta |x|^\alpha$ in $U$ for $\delta \in \mathbb{N}_0^n$, $|\delta| = m$.
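The modified Taylor formula above can be illustrated in one variable. The following is only a sketch under stated assumptions: it takes $n = 1$ and $f = \exp$ (so that every derivative is again $\exp$), and approximates the remainder integral with a midpoint rule; the function name and the step count are hypothetical choices, not part of the original argument.

```python
import math


def modified_taylor_exp(x, m=2, n_steps=2000):
    """Return (T_m(f)(x), R_m(f)(x)) for f = exp in one variable.

    T_m is the Taylor polynomial of order m at 0 and R_m is the modified
    remainder m/m! * x^m * int_0^1 (1-t)^(m-1) (f^(m)(t x) - f^(m)(0)) dt,
    approximated here by a midpoint rule.
    """
    taylor = sum(x**k / math.factorial(k) for k in range(m + 1))
    dt = 1.0 / n_steps
    integral = 0.0
    for j in range(n_steps):
        t = (j + 0.5) * dt
        integral += (1.0 - t) ** (m - 1) * (math.exp(t * x) - 1.0) * dt
    remainder = m * integral * x**m / math.factorial(m)
    return taylor, remainder


if __name__ == "__main__":
    for x in (0.5, -1.0):
        t_m, r_m = modified_taylor_exp(x)
        print(x, math.exp(x), t_m + r_m)
```

In this smooth example the remainder is of order $|x|^{m+1}$, which is stronger than the $|x|^{m+\alpha}$ bound of Lemma 14 for $\beta = 0$.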
Proof. Applying the modified Taylor formula to $f$ and then to its derivatives, we get
$\partial^\beta (f - T_m(f)) = T_{m-|\beta|}(\partial^\beta f) + R_{m-|\beta|}(\partial^\beta f) - \partial^\beta T_m(f)$.
We show that
$T_{m-|\beta|}(\partial^\beta f) - \partial^\beta T_m(f) = 0$. (147)
To prove this equation we use induction on $n$. For $n = 1$ the result follows by a direct calculation. To perform the induction step we assume $n \ge 2$ and show that the statement for $n - 1$ implies that for $n$. We write $x = (x', x^n)$ for $x \in \mathbb{R}^n$ and $\beta = (\beta', \beta_n)$ for $\beta \in \mathbb{N}_0^n$, etc. Then we find the equalities
$\partial^\beta T_m(f) = \partial^{\beta'} \partial^{\beta_n} \sum_{\gamma_n = 0}^{m} \frac{1}{\gamma_n!}\, T'_{m-\gamma_n}(\partial^{\gamma_n} f)\, (x^n)^{\gamma_n}$
$\phantom{\partial^\beta T_m(f)} = \sum_{\gamma_n = \beta_n}^{m - |\beta| + \beta_n} T'_{m - |\beta'| - \gamma_n}(\partial^{\beta'} \partial^{\gamma_n} f)\, \frac{1}{(\gamma_n - \beta_n)!}\, (x^n)^{\gamma_n - \beta_n}$
$\phantom{\partial^\beta T_m(f)} = \sum_{\gamma_n = 0}^{m - |\beta|} \frac{1}{\gamma_n!}\, T'_{m - |\beta| - \gamma_n}(\partial^{\beta'} \partial^{\gamma_n + \beta_n} f)\, (x^n)^{\gamma_n} = T_{m-|\beta|}(\partial^\beta f)$.
Here the first line is a simple rewriting where we denote by $T'_{m-\gamma_n}(\partial^{\gamma_n} f)$ the Taylor polynomial of order $m - \gamma_n$ of the function $\partial^{\gamma_n} f(x', 0)$ of $n - 1$ variables. In the second line the derivatives are taken and the induction hypothesis is used. The third line is obtained by redefining the index $\gamma_n$ and using a similar rewriting as in the first line. With (147) the estimate (146) follows immediately by estimating the integral defining $R_{m-|\beta|}(\partial^\beta f)$.

B. An Additional Result

In this section we prove a certain extension of Theorem 13.

Theorem 19. Let $u$ be a distribution satisfying $L u = f$, where $f \in E^{m,\alpha}(B_a)$, and the coefficients of the elliptic operator $L$ are in $C^{m,\alpha}(B_a)$. Then
$u = r^3 \sum_{k=0}^{m} u_k + u_R \in E^{m+2,\alpha}(B_a)$, (148)
with $u_k \in P_k$ and $u_R \in C^{m+2,\alpha}(B_a)$.

Proof. We follow the proof of Theorem 13 using Schauder instead of $L^p$ estimates. Since $f = f_1 + r f_2 \in E^{m,\alpha}(B_a)$ we have
$f = r\, T_m + f_R$ with $T_m = \sum_{k=0}^{m} t_k$,
where $T_m$ is the Taylor polynomial of order $m$ of $f_2$ and $t_k \in P_k$.
Consider the recurrence relation
$\Delta(r^3 u_0) = r\, t_0$, $\Delta(r^3 u_k) = r\, (t_k - U_k^{(k)})$, $1 \le k \le m$,
which is obtained by defining $U^{(k)}$, $k = 1, \dots, m$, by
$\hat L \big( r^3 \sum_{j=0}^{k-1} u_j \big) = r\, U^{(k)}$,
and defining $U_j^{(k)}$ and $U_j^{(m+1)}$ as in the proof of Theorem 13. The equation for $u$ and (148) then imply for $u_R$ Eq. (50) with $s = 3$. Since by our assumptions and Lemma 4 the right-hand side of this equation is in $C^{m,\alpha}(B_a)$, the interior Schauder estimates of Theorem 8 imply that $u_R \in C^{m+2,\alpha}(B_a)$.

Acknowledgement. We would like to thank N. O'Murchadha for a careful reading of the manuscript.
Communicated by H. Nicolai
Commun. Math. Phys. 222, 611 – 661 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Representations of the Exceptional Lie Superalgebra E(3, 6) II: Four Series of Degenerate Modules Victor G. Kac1, , Alexei Rudakov2, 1 Department of Mathematics, MIT, Cambridge, MA 02139, USA. E-mail: [email protected] 2 Department of Mathematics, NTNU, Gløshaugen, 7491 Trondheim, Norway.
E-mail: [email protected] Received: 22 January 2001 / Accepted: 8 June 2001
Abstract: Four $\mathbb{Z}_+$-bigraded complexes with the action of the exceptional infinite-dimensional Lie superalgebra E(3, 6) are constructed. We show that all the images and cokernels and all but three kernels of the differentials are irreducible E(3, 6)-modules. This is based on the list of singular vectors and the calculation of homology of these complexes. As a result, we obtain an explicit construction of all degenerate irreducible E(3, 6)-modules and compute their characters and sizes. Since the group of symmetries of the Standard Model SU(3) × SU(2) × U(1) (divided by a central subgroup of order six) is a maximal compact subgroup of the group of automorphisms of E(3, 6), our results may have applications to particle physics.

0. Introduction

It was established by A. Rudakov [R] some 25 years ago that all degenerate irreducible continuous modules over the Lie algebra $W_n$ of all formal vector fields in $n$ indeterminates occur as kernels and as cokernels of the differential of the $\mathbb{Z}_+$-graded de Rham complex $\Omega_n$ of all formal differential forms in $n$ indeterminates. The main objective of the present paper is to obtain a similar result for the exceptional infinite-dimensional Lie superalgebra E(3, 6) from the list of simple linearly compact Lie superalgebras classified by V. Kac [K]. It turned out that the situation is much more interesting: we have constructed four $\mathbb{Z}_+$-bigraded complexes with the action of E(3, 6) and certain connecting homomorphisms between these complexes. All images and cokernels and all but three kernels of differentials turn out to be irreducible, and all degenerate irreducible E(3, 6)-modules occur among them. The failure of irreducibility is connected to non-triviality of homology of these complexes, which we compute as well. At the end of the paper we compute the characters and sizes of all degenerate irreducible E(3, 6)-modules and speculate on their relation to the Standard Model.

Supported in part by NSF grant DMS-9970007.
Research was partially conducted by Alexei Rudakov for the Clay Mathematics Institute.
In our two other papers [KR1] and [KR2] on the subject we show that all irreducible E(3, 6)-modules that are locally finite with respect to any non-trivial open subalgebra and do not occur as subquotients in our four complexes are non-degenerate (and therefore induced).

1. A Reminder on E(3, 6) and Induced Modules

Recall that we view E(3, 6) as a subalgebra of the exceptional Lie superalgebra E(5, 10). The latter is constructed as follows (see [CK] for details). The even part, $E(5,10)_{\bar 0}$, is isomorphic to the Lie algebra $S_5$ of divergenceless formal vector fields in the indeterminates $x_1, \dots, x_5$, while the odd part, $E(5,10)_{\bar 1}$, is the space $d\Omega^1_5$ of closed (≡ exact) 2-forms, and the remaining brackets are defined by:
$[X, w] = L_X w$, $[w, w'] = w \wedge w'$, $X \in S_5$, $w, w' \in d\Omega^1_5$.
In the second formula a closed 4-form $w \wedge w'$ is identified with the vector field whose contraction with $dx_1 \wedge \dots \wedge dx_5$ produces this 4-form. As in [KR1], we use the notation:
$d_{jk} = dx_j \wedge dx_k$, $\partial_i = \partial/\partial x_i$.
Elements from $E(5,10)_{\bar 0}$ are of the form
$\sum_i a_i \partial_i$, where $a_i \in \mathbb{C}[[x_1, \dots, x_5]]$, $\sum_i \partial_i a_i = 0$,
and elements from $E(5,10)_{\bar 1}$ are of the form
$w = \sum_{j,k} b_{jk}\, d_{jk}$, where $b_{jk} \in \mathbb{C}[[x_1, \dots, x_5]]$, $dw = 0$.
In particular, the commutator of two odd elements can be computed using bilinearity and the rule ($a, b \in \mathbb{C}[[x_1, \dots, x_5]]$)
$[a\, d_{jk}, b\, d_{\ell m}] = \epsilon_{ijk\ell m}\, ab\, \partial_i$,
where $\epsilon_{ijk\ell m}$ is the sign of the permutation $(ijk\ell m)$ if all indices are distinct and 0 otherwise.

The Lie superalgebra E(5, 10) carries a unique consistent irreducible $\mathbb{Z}$-grading defined by
$\deg x_i = 2 = -\deg \partial_i$, $\deg dx_i = -\tfrac{1}{2}$.
In order to define E(3, 6) as a subalgebra of E(5, 10), let $z_+ = x_4$, $z_- = x_5$, $\partial_+ = \partial_4$, $\partial_- = \partial_5$ and define the secondary $\mathbb{Z}$-grading by:
$\deg x_i = 0 = \deg \partial_i$, $\deg z_\pm = 1 = -\deg \partial_\pm$, $\deg d = -\tfrac{1}{2}$.
Then E(3, 6) is the 0th piece of the secondary grading. The consistent $\mathbb{Z}$-grading of E(5, 10) induces the consistent $\mathbb{Z}$-grading $E(3,6) = L = \oplus_{j \ge -2}\, \mathfrak{g}_j$, where
$\mathfrak{g}_{-2} = \langle \partial_i \mid i = 1, 2, 3 \rangle$, $\mathfrak{g}_{-1} = \langle d_i^+ := d_{i4},\ d_i^- := d_{i5} \mid i = 1, 2, 3 \rangle$.
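Furthermore, $\mathfrak{g}_0 \cong s\ell(3) \oplus s\ell(2) \oplus g\ell(1)$, where
$s\ell(3) = \langle h_1 = x_1\partial_1 - x_2\partial_2,\ h_2 = x_2\partial_2 - x_3\partial_3,\ e_1 = x_1\partial_2,\ e_2 = x_2\partial_3,\ e_{12} = x_1\partial_3,\ f_1 = x_2\partial_1,\ f_2 = x_3\partial_2,\ f_{12} = x_3\partial_1 \rangle$, (1.1)
$s\ell(2) = \langle h_3 = z_+\partial_+ - z_-\partial_-,\ e_3 = z_+\partial_-,\ f_3 = z_-\partial_+ \rangle$, (1.2)
$g\ell(1) = \langle Y = \tfrac{2}{3} \sum_i x_i\partial_i - (z_+\partial_+ + z_-\partial_-) \rangle$. (1.3)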
The eigenspace decomposition of ad(3Y) coincides with the consistent $\mathbb{Z}$-grading of E(3, 6). Below we often mark the elements $\partial_i \in \mathfrak{g}_{-2}$ by a hat, writing $\hat\partial_i$ whenever we need to distinguish them from $\partial_i$ in another role. As has been mentioned in [CK] and [KR1], the even part $E(3,6)_{\bar 0}$ of E(3, 6) contains a subalgebra $W$ isomorphic to $W_3$ with the isomorphism given by the formula
$D \mapsto D - \tfrac{1}{2}\, \mathrm{div}\, D\, (z_+\partial_+ + z_-\partial_-)$. (1.4)
We fix the Cartan subalgebra $H = \langle h_1, h_2, h_3, Y \rangle$ and the Borel subalgebra $B = H + \langle e_i\ (i = 1, 2, 3),\ e_{12} \rangle$ of $\mathfrak{g}_0$. Then $f_0 := d_1^+$ is the highest weight vector of the (irreducible) $\mathfrak{g}_0$-module $\mathfrak{g}_{-1}$, $e_0' := x_3 d_3^-$ and $e_0 := x_3 d_2^- - x_2 d_3^- + 2 x_5 d_{23}$ are the lowest weight vectors of the $\mathfrak{g}_0$-module $\mathfrak{g}_1$, and one has:
$[e_0', f_0] = f_2$, (1.5)
$[e_0, f_0] = \tfrac{2}{3} h_1 + \tfrac{1}{3} h_2 - h_3 - Y =: h_0$, (1.6)
so that
$h_0 = -x_2\partial_2 - x_3\partial_3 + 2 z_-\partial_-$. (1.7)
The following relations are also important to keep in mind:
$[e_0', d_1^+] = f_2$, $[e_0', d_2^+] = -f_{12}$, $[e_0', d_3^+] = 0$, $[e_0', d_i^-] = 0$, (1.8)
$[d_i^\pm, d_j^\pm] = 0$, $[d_i^+, d_j^-] + [d_j^+, d_i^-] = 0$. (1.9)
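Relations (1.8) follow from the sign rule for the bracket of odd elements, $[a\, d_{jk}, b\, d_{\ell m}] = \epsilon_{ijk\ell m}\, ab\, \partial_i$. The sketch below evaluates that rule on index pairs, assuming $e_0' = x_3 d_{35}$, $d_i^+ = d_{i4}$ and $d_i^- = d_{i5}$ as in the text; the function names are hypothetical, and only the index $i$ and the sign are tracked, since the coefficient $ab = x_3$ is the same in every case.

```python
def perm_sign(seq):
    """Sign of the permutation seq of distinct integers, or 0 if an entry repeats."""
    if len(set(seq)) != len(seq):
        return 0
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1


def odd_bracket(jk, lm):
    """Return {i: sign} with [a d_jk, b d_lm] = sum_i sign * ab * d/dx_i."""
    j, k = jk
    l, m = lm
    result = {}
    for i in range(1, 6):
        sign = perm_sign((i, j, k, l, m))
        if sign:
            result[i] = sign
    return result


if __name__ == "__main__":
    # e_0' = x3 d_35 against d_1^+ = d_14, d_2^+ = d_24, d_3^+ = d_34.
    for i in (1, 2, 3):
        print(i, odd_bracket((3, 5), (i, 4)))
```

With the coefficient $x_3$ restored, the first two brackets read $x_3\partial_2 = f_2$ and $-x_3\partial_1 = -f_{12}$, and the remaining ones vanish because an index repeats.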
Recall that $\mathfrak{g}_0$ along with the elements $f_0$, $e_0$, $e_0'$ generate the Lie superalgebra E(3, 6). Sometimes we use the following shorthand notation for the elements of the universal enveloping algebra of E(3, 6):
$d_{ijk}^\pm = d_i^\pm d_j^\pm d_k^\pm$.
We have the triangular decomposition:
$L := E(3,6) = L_- + \mathfrak{g}_0 + L_+$, where $L_- = \oplus_{j<0}\, \mathfrak{g}_j$, $L_+ = \oplus_{j>0}\, \mathfrak{g}_j$.
Given a $\mathfrak{g}_0$-module $V$, we extend it to $L_0 := \mathfrak{g}_0 + L_+$ by letting $L_+$ act trivially and consider the induced $L$-module
$M(V) = U(L) \otimes_{U(L_0)} V \cong U(L_-) \otimes_{\mathbb{C}} V$, (1.10)
the latter being a $\mathfrak{g}_0$-isomorphism. If $V$ is a finite-dimensional irreducible $\mathfrak{g}_0$-module, the $L$-module $M(V)$ is called a generalized Verma module. Any quotient of a generalized Verma module is called a highest weight module.

Recall that all finite-dimensional irreducible $\mathfrak{g}_0$-modules are modules with highest weight $\lambda$ such that
$p = \lambda(h_1)$, $q = \lambda(h_2)$, $r = \lambda(h_3) \in \mathbb{Z}_+$, $y = \lambda(Y) \in \mathbb{C}$.
Such a module is denoted by $F(p, q; r; y)$ and the corresponding generalized Verma $L$-module is denoted by $M(p, q; r; y)$. A highest weight vector of the $\mathfrak{g}_0$-module $F(p, q; r; y)$ is called a highest weight vector of any corresponding highest weight $L$-module. Note that the hypercharge operator $Y$ acts diagonally on $M(p, q; r; y)$ with eigenvalues from the set $y - \tfrac{1}{3}\mathbb{Z}_+$, its eigenspaces are $\mathfrak{g}_0$-invariant and finite-dimensional, and the $y$-eigenspace is isomorphic to the $\mathfrak{g}_0$-module $F(p, q; r; y)$.

A highest weight vector $s$ of an irreducible finite-dimensional $\mathfrak{g}_0$-submodule of an $L$-module is called singular if
$e_0' s = 0$, $e_0 s = 0$ (equivalently: $L_+ s = 0$). (1.11)
For example, the highest weight vector of $F(p, q; r; y) \subset M(p, q; r; y)$ is a singular vector, called a trivial singular vector.

Recall that the $L$-module $M(p, q; r; y)$ has a unique irreducible quotient denoted by $I(p, q; r; y)$. The irreducible $L$-module $I(p, q; r; y)$ is called non-degenerate if it coincides with $M(p, q; r; y)$; the module $I(p, q; r; y)$ and the module $M(p, q; r; y)$ are called degenerate if $M(p, q; r; y) \ne I(p, q; r; y)$. It follows that $M(p, q; r; y)$ is a degenerate $L$-module iff it has a non-trivial singular vector.

The main result of [KR1] is the following.

Theorem 1.1. If $pq \ne 0$, then the E(3, 6)-module $I(p, q; r; y)$ is non-degenerate.

The corollary of the main result of [KR2] is the following.

Theorem 1.2. The following list consists of all degenerate E(3, 6)-modules $I(p, q; r; y)$ ($p, q, r \in \mathbb{Z}_+$):
type A: $I(p, 0; r; y_A)$, where $y_A = \tfrac{2}{3} p - r$,
type B: $I(p, 0; r; y_B)$, where $y_B = \tfrac{2}{3} p + r + 2$,
type C: $I(0, q; r; y_C)$, where $y_C = -\tfrac{2}{3} q - r - 2$,
type D: $I(0, q; r; y_D)$, where $y_D = -\tfrac{2}{3} q + r$.

In the present paper we construct four complexes of degenerate generalized Verma modules, which allows us to construct the four series of degenerate modules given by Theorem 1.2.
2. The Four Bigraded Complexes

Given a $\mathfrak{g}_0$-module $V$ and a number $a \in \mathbb{C}$, we can define a new $\mathfrak{g}_0$-module, denoted by $V_{[a]}$, with the same action of $s\ell(3) \oplus s\ell(2)$ but the changed action of $Y$ by adding to it $a I_V$. Consider the following $\mathfrak{g}_0$-modules:
$V_A = \mathbb{C}[x_1, x_2, x_3, z_+, z_-]$,
$V_B = \mathbb{C}[x_1, x_2, x_3, \partial_+, \partial_-]_{[2]}$,
$V_C = \mathbb{C}[\partial_1, \partial_2, \partial_3, z_+, z_-]_{[-2]}$,
$V_D = \mathbb{C}[\partial_1, \partial_2, \partial_3, \partial_+, \partial_-]$,
with the action of $\mathfrak{g}_0$ given by (1.1)–(1.3). The decomposition into a sum of irreducible $\mathfrak{g}_0$-modules is given by the bigrading ($X = A, B, C, D$)
$V_X = \oplus_{m,n \in \mathbb{Z}}\, V_X^{m,n}$,
where
$V_X^{m,n} = \{ f \in V_X \mid (\sum_i x_i\partial_i) f = m f,\ (\sum_\epsilon z_\epsilon\partial_\epsilon) f = n f \}$.
Let $M_X = M(V_X)$ be the corresponding induced $L$-module. The bigrading of $V_X$ gives rise to a bigrading of $M_X$ by generalized Verma modules:
$M_X = \oplus_{m,n \in \mathbb{Z}}\, M(V_X^{m,n})$. (2.1)
Note that we have the following isomorphisms of $\mathfrak{g}_0$-modules (see (1.1)):
$V_A^{p,r} \cong F(p, 0; r; y_A)$,
$V_B^{p,-r} \cong F(p, 0; r; y_B)$,
$V_C^{-q,r} \cong F(0, q; r; y_C)$, (2.2)
$V_D^{-q,-r} \cong F(0, q; r; y_D)$.
Consequently, the bigrading (2.1) takes the form:
$M_X = \oplus_{p,r \in \mathbb{Z}_+} M(p, 0; r; y_X)$, for $X = A$ or $B$,
$M_X = \oplus_{q,r \in \mathbb{Z}_+} M(0, q; r; y_X)$, for $X = C$ or $D$.
We let the algebra $U(L_-) \otimes \mathrm{End}\, V_X$ act on $M_X = U(L_-) \otimes V_X$ by the formula:
$(u \otimes \varphi)(u' \otimes v) = u' u \otimes \varphi(v)$, $u, u' \in U(L_-)$, $\varphi \in \mathrm{End}\, V_X$, $v \in V_X$. (2.3)
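The hypercharges appearing in (2.2) can be cross-checked against Theorem 1.2 with a few lines of bookkeeping. The sketch below assumes that $Y$ acts on the bigraded piece $V_X^{m,n}$ by $\tfrac{2}{3} m - n$ plus the twist $[a]$ in the definition of $V_X$ (this reading of (1.3) on monomials is an assumption), and all function names are hypothetical.

```python
from fractions import Fraction as F

# Twist a in V_[a] for each of the four modules V_A, V_B, V_C, V_D.
TWIST = {"A": 0, "B": 2, "C": -2, "D": 0}


def hypercharge(X, m, n):
    """Eigenvalue of Y on V_X^{m,n}: (2/3) m - n plus the twist."""
    return F(2, 3) * m - n + TWIST[X]


def bidegree(X, a, r):
    """Bidegree (m, n) of the piece listed in (2.2); a is p (A, B) or q (C, D)."""
    return {"A": (a, r), "B": (a, -r), "C": (-a, r), "D": (-a, -r)}[X]


def y_theorem(X, a, r):
    """y_X as stated in Theorem 1.2; a is p (A, B) or q (C, D)."""
    return {
        "A": F(2, 3) * a - r,
        "B": F(2, 3) * a + r + 2,
        "C": -F(2, 3) * a - r - 2,
        "D": -F(2, 3) * a + r,
    }[X]


if __name__ == "__main__":
    for X in "ABCD":
        print(X, hypercharge(X, *bidegree(X, 3, 1)), y_theorem(X, 3, 1))
```

Under these assumptions the two columns printed for each type agree, which is the consistency expressed by (2.2).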
We shall often drop the $\otimes$ sign, e.g. we shall write $\partial_i$ in place of $1 \otimes \partial_i$, etc. Introduce the following operators on $M_X$ (acting by (2.3)):
$\Delta^\pm = \sum_{i=1}^{3} d_i^\pm \otimes \partial_i$, (2.4)
$\delta_i = d_i^+ \otimes \partial_+ + d_i^- \otimes \partial_-$, $i = 1, 2, 3$. (2.5)
Here $\partial_i$ acts as $d/dx_i$ (resp. as multiplication by $\partial_i$) in the cases $X = A, B$ (resp. $X = C, D$) and similarly for $\partial_\pm$.
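Lemma 2.1. (a) The action of $s\ell(3)$ (resp. $s\ell(2)$) $\subset \mathfrak{g}_0 \subset E(3,6)$ on the E(3, 6)-module $M_X$ commutes with the operators $\Delta^\pm$ and $\partial_\pm$ (resp. $\delta_i$ and $\partial_i$).
(b) One has the following commutation relations for $Y$:
$[Y, \partial_\pm] = \partial_\pm$, $[Y, \partial_i] = -\tfrac{2}{3}\partial_i$, $[Y, d_i^\pm] = -\tfrac{1}{3} d_i^\pm$, $[Y, \Delta^\pm] = -\Delta^\pm$, $[Y, \delta_i] = \tfrac{2}{3}\delta_i$.
(c) $(\Delta^\pm)^2 = 0$, $\Delta^+\Delta^- + \Delta^-\Delta^+ = 0$, $\delta_i\delta_j + \delta_j\delta_i = 0$.

Proof. The proof of (a) and (b) is straightforward, (c) follows from (1.9).

Now we introduce our basic operator
$\nabla := \Delta^+\partial_+ + \Delta^-\partial_- = \delta_1\partial_1 + \delta_2\partial_2 + \delta_3\partial_3$.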
Proposition 2.2. (a) $\nabla^2 = 0$. (b) The operator $\nabla$ commutes with the action of E(3, 6) on $M_X$.

Proof. (a) is immediate by Lemma 2.1(c). In order to prove (b), notice that the action (2.3) commutes with the left multiplication action of $L_-$ on $M_X$, hence, in particular, $\nabla$ commutes with $L_-$. Furthermore, it follows from Lemma 2.1(a) (resp. 2.1(b)) that $\nabla$ commutes with the action of $s\ell(3)$ and $s\ell(2)$ (resp. $Y$), hence $\nabla$ commutes with $\mathfrak{g}_0$. Now the proof that the operator $\nabla$ commutes with $L$ is based on the following lemma.

Lemma 2.3. Let $L = \oplus_j\, \mathfrak{g}_j$ be a $\mathbb{Z}$-graded Lie superalgebra and let $M$ be an $L$-module induced from an $L_0$-module $V$ such that $\mathfrak{g}_j|_V = 0$ for $j > 0$. Let $\nabla$ be an operator acting on $M$ that commutes with $L_-$. Suppose that $\nabla$ commutes with $\mathfrak{g}_0$ and
$g \nabla(v) = 0$ for all $g \in L_+$, $v \in 1 \otimes V$. (2.6)
Then $\nabla$ commutes with $L$.

Proof. It is sufficient to show that $\nabla$ commutes with $L_+$. For $g \in L_+$, $u \in L_-$ we can write:
$g u = \sum_{n \ge 0} u_n g_n$, where $u_n \in U(L_-)$, $g_n \in \mathfrak{g}_n$.
Then we have:
$g \nabla(u \otimes v) = g u \nabla(v) = \sum_{n \ge 0} u_n g_n \nabla(v) = u_0 g_0 \nabla(v) = u_0 \nabla g_0(v) = \nabla \sum_{n \ge 0} u_n g_n(v) = \nabla g(u \otimes v)$.
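We shall establish that the conditions of Lemma 2.3 are valid in our situation. Since $\{ g \in L_+ \mid g \nabla(v) = 0 \text{ for all } v \in 1 \otimes V_X \}$ is a $\mathfrak{g}_0$-invariant subalgebra of $L_+$ (because $\nabla$ commutes with $\mathfrak{g}_0$), in order to establish (2.6), it suffices to check only the following:
$e_0' \nabla(v) = 0$, $v \in 1 \otimes V_X$, (2.7)
$e_0 \nabla(v) = 0$, $v \in 1 \otimes V_X$. (2.8)
(Recall that $\mathfrak{g}_1$ generates $L_+$ and $e_0$, $e_0'$ generate the $\mathfrak{g}_0$-module $\mathfrak{g}_1$.) Using (1.8), we get (2.7):
$e_0' \nabla(v) = \sum_{\epsilon = +,-} (f_2 \partial_1 \partial_\epsilon v - f_{12} \partial_2 \partial_\epsilon v) = \sum_{\epsilon = +,-} ((x_3\partial_2)\partial_1 - (x_3\partial_1)\partial_2)\, \partial_\epsilon v = 0$.
Furthermore, we have:
$e_0 \nabla(v) = h_0(\partial_1 \partial_+ v) + f_1(\partial_2 \partial_+ v) + f_{12}(\partial_3 \partial_+ v) - 2 f_3(\partial_1 \partial_- v)$. (2.9)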
In the $X = A$ case, $h_0$ acts on $V_X$ by (1.7) (where $x_4 = z_+$, $x_5 = z_-$), $f_1 = x_2\partial_1$, $f_{12} = x_3\partial_1$ and $f_3 = z_-\partial_+$, hence (2.9) becomes:
$e_0 \nabla(v) = (h_0 + x_2\partial_2 + x_3\partial_3 - 2 z_-\partial_-)\, \partial_1 \partial_+ v = 0$.
In the $X = B$ case, $h_0$ acts on $V_X$ by "(1.7) minus $2I$" due to the twist for $Y$, which is necessary to include in order to compensate for the additional term $2 \partial_1 \partial_+ v$ occurring from the last term on the right of (2.9). Similar calculations show that (2.8) holds in the $X = C$ and $D$ cases as well.

Remark 2.4. (a) Since $\nabla$ commutes with the representation of E(3, 6) in $M_X$, the nonzero image under $\nabla$ of a singular vector is a singular vector.
(b) Let $V$ be a direct sum of finite-dimensional irreducible $\mathfrak{g}_0$-modules, $M(V)$ be the corresponding induced E(3, 6)-module, and let $\nabla$ be an operator on $M(V)$ defined via (2.3), such that $\nabla v$ is a singular vector for each of the highest weight vectors $v$ of $\mathfrak{g}_0$ in $V$. Suppose that $\nabla$ commutes with $Y$ and all $f_i$, $i = 1, 2, 3$; then $\nabla$ commutes with E(3, 6). This follows from Lemma 2.3.
(c) Suppose that for the same module $M(V)$ there is a linear map $\varphi : M(V) \to N$ to another $L$-module $N$. If $\varphi$ commutes with the action of $L$, then the non-zero image of a singular vector is a singular vector. If $\varphi$ commutes with the action of $L_-$, the image $\varphi(v)$ is a singular vector for each of the highest weight vectors $v$ of $\mathfrak{g}_0$ in $V$, and $\varphi$ commutes with $\mathfrak{g}_0$, then $\varphi$ is a morphism of $L$-modules. The arguments are just the same as for (a), (b).
(d) Lemma 2.3 and the above remark (c) could be generalized to the case when $M$ is a highest weight $L$-module, or a sum of such modules.

Proposition 2.2 implies immediately the following corollary about singular vectors.

Corollary 2.5. (a) The E(3, 6)-module $M(p, 0; r; y_A)$ has a non-trivial singular vector
$\nabla(x_1^{p+1} z_+^{r+1}) = (p + 1)(r + 1)\, d_1^+ x_1^p z_+^r$
for all $p, r \in \mathbb{Z}_+$.
(b) The E(3, 6)-module $M(p, 0; r; y_B)$ has a non-trivial singular vector
$\nabla(x_1^{p+1} \partial_-^{r-1}) = (p + 1)\, \delta_1 x_1^p \partial_-^{r-1}$
for all $p, r \in \mathbb{Z}_+$, $r > 0$.
(c) The E(3, 6)-module $M(0, q; r; y_C)$ has a non-trivial singular vector
$\nabla(\partial_3^{q-1} z_+^{r+1}) = (r + 1)\, \Delta^+ \partial_3^{q-1} z_+^r$
for all $q, r \in \mathbb{Z}_+$, $q > 0$.
(d) The E(3, 6)-module $M(0, q; r; y_D)$ has a non-trivial singular vector
$\nabla(\partial_3^{q-1} \partial_-^{r-1})$
for all $q, r \in \mathbb{Z}_+$, $q, r > 0$.

In order to construct more singular vectors, consider smaller $\mathfrak{g}_0$-modules $V_{X'} \subset V_X$ and the corresponding induced $L$-modules $M_{X'} \subset M_X$:
$V_{A'} = \mathbb{C}[x_1, x_2, x_3]$ and $M_{A'} = M(V_{A'})$,
$V_{B'} = \mathbb{C}[x_1, x_2, x_3]_{[2]}$ and $M_{B'} = M(V_{B'})$,
$V_{C'} = \mathbb{C}[\partial_1, \partial_2, \partial_3]_{[-2]}$ and $M_{C'} = M(V_{C'})$,
$V_{D'} = \mathbb{C}[\partial_1, \partial_2, \partial_3]$ and $M_{D'} = M(V_{D'})$.
Let $\tau_1 : V_{A'} \to V_{B'}$ and $\tau_2 : V_{C'} \to V_{D'}$ be the identity isomorphisms. Then we have:
$[Y, \tau_i] = 2\tau_i$. (2.10)
Let 2 = 6+ 6− τ1 : MA → MB , 2 = 6+ 6− τ2 : MC → MD act via (2.3). Proposition 2.6. (a) 2 = 2 = 0. (b) 2 is an E(3, 6)-module isomorphism: MX − ∼ MY , where X = A (resp. C ) and → Y = B (resp., D ). Proof. (a) is clear from Lemma 2.1(c). Next, L− commutes with 2 by definition, s(3) commutes due to Lemma 2.1, s(2) commutes for trivial reasons and Y commutes due to Lemma 2.1(b) and (2.10). As before, due to Lemma 2.3, in order to establish (b), it suffices to show that e0 2 v = 0 and e0 2 v = 0 for v ∈ 1 ⊗ VX . Let X = A . We have: e0 2 v = e0 (−6− 6+ τ1 )(v) = 6− (−f2 ∂1 + f12 ∂2 )τ1 (v) = 6− (−(x3 ∂2 )∂1 + (x3 ∂1 )∂2 )τ1 (v) = 0.
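Next, we have:
\[ e_0\nabla_2 v = \bigl((2f_3\partial_1)\Delta^+ + \Delta^-(h_0\partial_1 + f_1\partial_2 + f_{12}\partial_3)\bigr)\tau_i(v) = \Delta^-\bigl((2 + h_0)\partial_1 + (x_2\partial_1)\partial_2 + (x_3\partial_1)\partial_3\bigr)\tau_i(v), \]
because $f_3\tau_i(v) = 0$, since $f_3$ annihilates $V_{B'}$ and $V_{D'}$. Notice that for $v = x_1^{a_1}x_2^{a_2}x_3^{a_3}$:
\[ (x_j\partial_1)\partial_j\tau_1(v) = (x_j\partial_j)\partial_1\tau_1(v), \qquad j = 2, 3, \]
but for $v = \partial_1^{a_1}\partial_2^{a_2}\partial_3^{a_3}$:
\[ (x_j\partial_1)\partial_j\tau_2(v) = (x_j\partial_j - 1)\partial_1\tau_2(v), \qquad j = 2, 3. \]
At the same time $h_0|_{B'} = -x_2\partial_2 - x_3\partial_3 + 2z_-\partial_- - 2$ and $h_0|_{D'} = -x_2\partial_2 - x_3\partial_3 + 2z_-\partial_-$. Therefore
\[ e_0\nabla_2 v = \Delta^-\bigl((2 + h_0)\partial_1 + (x_2\partial_1)\partial_2 + (x_3\partial_1)\partial_3\bigr)\tau_i(v) = \Delta^-(2z_-\partial_-)\partial_1\tau_i(v) = 0. \]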
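Finally, in a similar fashion, consider the $\mathfrak{g}_0$-modules $V_{X''} \subset V_X$ and the corresponding induced L-modules $M_{X''} \subset M_X$:
\[
\begin{array}{ll}
V_{A''} = \mathbb{C}[z_+, z_-] & \text{and } M_{A''} = M(V_{A''}), \\
V_{B''} = \mathbb{C}[\partial_+, \partial_-][2] & \text{and } M_{B''} = M(V_{B''}), \\
V_{C''} = \mathbb{C}[z_+, z_-][-2] & \text{and } M_{C''} = M(V_{C''}), \\
V_{D''} = \mathbb{C}[\partial_+, \partial_-] & \text{and } M_{D''} = M(V_{D''}).
\end{array}
\]
We let $\rho_1 : V_{A''} \xrightarrow{\sim} V_{C''}$ and $\rho_2 : V_{B''} \xrightarrow{\sim} V_{D''}$ be the identity isomorphisms, so that
\[ [Y, \rho_i] = -2\rho_i. \qquad (2.11) \]
Let $\nabla_3 = \delta_1\delta_2\delta_3\rho_1 : M_{A''} \to M_{C''}$ and $\nabla_3 = \delta_1\delta_2\delta_3\rho_2 : M_{B''} \to M_{D''}$ act via (2.3).

Proposition 2.7. (a) $\nabla\nabla_3 = \nabla_3\nabla = 0$.
(b) $\nabla_3$ is an E(3, 6)-module isomorphism $M_X \to M_Y$, where X = A″ (resp. B″) and Y = C″ (resp. D″).

Proof. As in the proof of Proposition 2.6, due to Lemma 2.1, (a) is clear and (b) reduces to checking the relations
\[ e_0\nabla_3 v = 0 \quad\text{and}\quad e_0'\nabla_3 v = 0 \quad\text{for } v \in 1 \otimes V_X. \]
We have
\[ e_0'\nabla_3 v = (f_2\partial_+\delta_2\delta_3 + \delta_1 f_{12}\partial_+\delta_3)\rho_1 v = \partial_+\delta_3^2\rho_1 v = 0 \]
if X = A″ because $f_2\rho_1 v = f_{12}\rho_1 v = 0$, and similarly for X = B″.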
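Next, we have for $v \in V_X$:
\[ [e_0, \delta_1\delta_2\delta_3]v = \bigl((h_0\partial_+ - 2f_3\partial_-)\delta_2\delta_3 - \delta_1 f_1\partial_+\delta_3 + \delta_1\delta_2 f_{12}\partial_+\bigr)v = (h_0\partial_+ - 2f_3\partial_-)\delta_2\delta_3(v). \]
Since $[h_0, \delta_i] = -\delta_i$ for i = 2, 3 and $[f_3, \delta_i] = 0$, we obtain for $v \in V_X$:
\[ [e_0, \delta_1\delta_2\delta_3]v = \delta_2\delta_3\bigl((h_0 - 2)\partial_+ v - 2(z_-\partial_+)\partial_- v\bigr). \]
Again $(z_-\partial_+)\partial_- v = (z_-\partial_-)\partial_+ v$ for $v \in V_{C''}$ and $(z_-\partial_+)\partial_- v = (z_-\partial_- - 1)\partial_+ v$ for $v \in V_{D''}$. In the case X = A″ we have:
\[ e_0\delta_1\delta_2\delta_3\rho_1 v = \delta_2\delta_3(h_0 - 2z_-\partial_- - 2)\partial_+\rho_1 v = 0. \]
In the case X = B″, we similarly deduce:
\[ e_0\delta_1\delta_2\delta_3\rho_2 v = \delta_2\delta_3(h_0 - 2z_-\partial_-)\partial_+\rho_2 v = 0. \]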
Propositions 2.6 and 2.7 lead to the next corollary about singular vectors.

Corollary 2.8. (a) The E(3, 6)-module $M(p, 0; 0; y_B)$ has a non-trivial singular vector
\[ \nabla_2(x_1^{p+2}) = (p+2)(p+1)\, d_1^+ d_1^- x_1^{p} \]
for all p ∈ Z+.
(b) The E(3, 6)-module $M(0, q; 0; y_D)$ has a non-trivial singular vector
\[ \nabla_2(\partial_3^{q-2}) = \Delta^+\Delta^-\partial_3^{q-2} \]
for all q ∈ Z, q ≥ 2.
(c) The E(3, 6)-module $M(0, 0; r; y_C)$ has a non-trivial singular vector
\[ \nabla_3(z_+^{r+3}) = (r+3)(r+2)(r+1)\, d_{123}^+ z_+^{r} \]
for all r ∈ Z+.
(d) The E(3, 6)-module $M(0, 0; r; y_D)$ has a non-trivial singular vector $\nabla_3(\partial_-^{r-3})$ for all r ∈ Z+, r ≥ 3.

There are a few more singular vectors. One is the vector
\[ w_1 = d_{123}^+\Delta^- 1 \qquad (2.12) \]
of the E(3, 6)-module $M(0, 1; 0; y_D)$. It is straightforward to show this by checking that $e_0' w_1 = 0$ and $e_i w_1 = 0$ for i = 0, . . . , 3. This singular vector generates a non-zero homomorphism of E(3, 6)-modules $\nabla_4' : M(0, 0; 2; y_A) \to M(0, 1; 0; y_D)$ such that for the highest weight vector $z_+^2$ of $M(0, 0; 2; y_A)$ we have
\[ \nabla_4'(z_+^2) = d_{123}^+\Delta^- 1. \qquad (2.13) \]
We shall be looking for $\nabla_4'$ in the form:
\[ \nabla_4' = a\Delta^-\partial_+^2 + b\Delta^-\partial_+\partial_- + c\Delta^-\partial_-^2, \qquad a = d_1^+ d_2^+ d_3^+, \qquad (2.14) \]
where $b, c \in U(L_-)$. Here all three operators $\partial_+^2$, $\partial_+\partial_-$ and $\partial_-^2$ map $V_A^{0;2}$, which is the space of quadratic polynomials in $z_\pm$, to $\mathbb{C}$, hence operator (2.14) may be viewed as a linear map from $M(0, 0; 2; y_A)$ to $M(0, 1; 0; y_D)$. It is clear that, applied to the highest weight vector $\tfrac{1}{2}z_+^2$ of $M(0, 0; 2; y_A)$, the operator (2.14) gives the singular vector $w_1$. Hence, due to Remark 2.4b, it suffices to check that $\nabla_4'$ commutes with $f_1, f_2, f_3$. Commuting with $f_3$ gives the relations $[f_3, a] = b$, $[f_3, b] = 2c$, which give:
\[
\begin{aligned}
b &= d_1^- d_2^+ d_3^+ + d_1^+ d_2^- d_3^+ + d_1^+ d_2^+ d_3^-, \\
c &= d_1^- d_2^- d_3^+ + d_1^- d_2^+ d_3^- + d_1^+ d_2^- d_3^-.
\end{aligned}
\qquad (2.15)
\]
With these b and c, the operator $\nabla_4'$ given by (2.14) clearly commutes with $s\ell(3)$. Thus, $\nabla_4' : M(0, 0; 2; -2) \to M(0, 1; 0; -\tfrac{2}{3})$ defined by (2.14) and (2.15) is a non-zero homomorphism of E(3, 6)-modules.

Another singular vector is the vector
\[ w_2 = d_1^-(a\partial_+^2 + b\partial_+\partial_- + c\partial_-^2), \qquad (2.16) \]
where a, b, c are the same as in (2.14), (2.15), and $w_2 \in M(0, 0; 2; y_D)$. Here too the check is straightforward, although somewhat tedious. Again this singular vector generates a non-zero homomorphism of E(3, 6)-modules $\nabla_4'' : M(1, 0; 0; y_A) \to M(0, 0; 2; y_D)$ such that for the highest weight vector $x_1$ of $M(1, 0; 0; y_A)$ we have $\nabla_4'' x_1 = w_2$. This allows us to find that
\[
\begin{aligned}
\nabla_4'' ={}& d_1^-(a\partial_+^2 + b\partial_+\partial_- + c\partial_-^2)\partial_1 \\
 &+ d_2^-(a\partial_+^2 + b\partial_+\partial_- + c\partial_-^2)\partial_2 \\
 &+ d_3^-(a\partial_+^2 + b\partial_+\partial_- + c\partial_-^2)\partial_3.
\end{aligned}
\qquad (2.17)
\]
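Clearly $\nabla_4''$ commutes with $f_1, f_2, f_3$ and Remark 2.4b shows that $\nabla_4''$ determines the morphism of E(3, 6)-modules $\nabla_4'' : M(1, 0; 0; \tfrac{2}{3}) \to M(0, 0; 2; 2)$.

The most complicated singular vector is $w_3 \in M(0, 0; 1; y_D)$, given by
\[
\begin{aligned}
w_3 ={}& \Bigl( d_{123}^- d_{123}^+ + (d_{13}^- d_{12}^+ - d_{12}^- d_{13}^+)\hat\partial_1 + (d_{21}^- d_{23}^+ - d_{23}^- d_{21}^+)\hat\partial_2 + (d_{32}^- d_{31}^+ - d_{31}^- d_{32}^+)\hat\partial_3 \\
 &\quad - d_1^- d_1^+\hat\partial_1^2 - d_2^- d_2^+\hat\partial_2^2 - d_3^- d_3^+\hat\partial_3^2 - (d_1^- d_2^+ + d_2^- d_1^+)\hat\partial_1\hat\partial_2 \\
 &\quad - (d_1^- d_3^+ + d_3^- d_1^+)\hat\partial_1\hat\partial_3 - (d_2^- d_3^+ + d_3^- d_2^+)\hat\partial_2\hat\partial_3 \Bigr)\partial_- \\
 &+ \bigl( d_1^- d_{123}^+\hat\partial_1 + d_2^- d_{123}^+\hat\partial_2 + d_3^- d_{123}^+\hat\partial_3 \bigr)\partial_+ \\
 ={}& d_{123}^-\, a\,\partial_- + \bigl(d_1^-\hat\partial_1 + d_2^-\hat\partial_2 + d_3^-\hat\partial_3\bigr)(b\partial_- + a\partial_+).
\end{aligned}
\qquad (2.18)
\]
This singular vector generates a homomorphism of E(3, 6)-modules $\nabla_6 : M(0, 0; 1; y_A) \to M(0, 0; 1; y_D)$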
such that $\nabla_6 z_+ = w_3$. In (2.18) we keep the notations from (2.14), (2.15) and denote by $\hat\partial_i$ the elements $\partial_i \in \mathfrak{g}_{-2}$ (i = 1, 2, 3). It is straightforward to check that $\nabla w_3 = 0$, hence
\[ \nabla\cdot\nabla_6 = 0. \qquad (2.19) \]
In Proposition 5.27 of Sect. 5 we will write $\nabla_6$ explicitly and show that $\nabla_6\cdot\nabla = 0$ as well.

Remark 2.9. The operators $\nabla$, $\nabla_2$, $\nabla_3$, $\nabla_4'$, $\nabla_4''$ and $\nabla_6$ have degree 1, 2, 3, 4, 4 and 6 respectively, with respect to the Z-gradation of $U(L_-)$ induced by the consistent gradation of E(3, 6).

The maps $\nabla$, $\nabla_2$, $\nabla_3$, $\nabla_4'$, $\nabla_4''$ and $\nabla_6$ are illustrated by Fig. 1. The black nodes in quadrants A, B, C, D represent generalized Verma modules $M(p, 0; r; y_X)$, X = A, B or $M(0, q; r; y_X)$, X = C, D. The plain arrows represent $\nabla$, the dotted arrows represent $\nabla_2$, the interrupted arrows represent $\nabla_3$ and the bold arrows represent $\nabla_4'$, $\nabla_4''$ and $\nabla_6$.
[Figure: quadrants A, B, C and D with axes p, q and r; nodes joined by arrows.]
Fig. 1.
There is still one more singular vector, the vector
\[ w_4 = d_1^+(\Delta^+\partial_+ + \Delta^-\partial_-) \qquad (2.20) \]
of the E(3, 6)-module $M(0, 1; 1; y_D)$. It occurs as the image of a non-trivial singular vector of $M(0, 0; 0; 0)$ under the map $\nabla : M(0, 0; 0; 0) \to M(0, 1; 1; y_D)$ (cf. Corollary 2.5a for p = r = 0 and 2.5d for q = r = 1). The reason for its existence is the fact that the composition of the following maps is non-zero:
\[ M(1, 0; 1; y_A) \xrightarrow{\ \nabla\ } M(0, 0; 0; 0) \xrightarrow{\ \nabla\ } M(0, 1; 1; y_D). \]
The main result of [KR2] is the following theorem.

Theorem 2.10. A complete list of non-trivial singular vectors (up to a constant factor) of the E(3, 6)-modules M(p, q; r; y) is given by Corollary 2.5, Corollary 2.8, (2.12), (2.16), (2.18) and (2.20).

Corollary 2.11. All degenerate modules M(p, q; r; y), except for $M(0, 1; 1; \tfrac{2}{3})$ (of type D), have a unique (up to a constant factor) non-trivial singular vector. The module $M(0, 1; 1; \tfrac{2}{3})$ has two such vectors.

3. Homology of Complexes $(M_X, \nabla)$

Recall that the complexes $M_X = U(L_-) \otimes V_X$ are Z-bigraded:
\[ M_X = \bigoplus_{m,n\in\mathbb{Z}} M_X^{m,n}, \qquad M_X^{m,n} = U(L_-) \otimes V_X^{m,n}, \qquad (3.1) \]
where $V_X^{m,n} = 0$ if X = A, m, n < 0; if X = B, m < 0, n > 0; if X = C, m > 0, n < 0; and if X = D, m > 0, n > 0, and $V_X^{m,n}$ are irreducible $\mathfrak{g}_0$-modules described by (2.2) otherwise. Note that
\[ \nabla : M_X^{m,n} \to M_X^{m-1,n-1}. \qquad (3.2) \]
Due to (3.2) the Z-bigrading (3.1) of $M_X$ induces a Z-bigrading on its homology:
\[ H(M_X) = \bigoplus_{m,n} H^{m,n}(M_X). \qquad (3.3) \]
Note also that $M_X$ is a free $S(\mathfrak{g}_{-2})$-module: $M_X \cong S(\mathfrak{g}_{-2}) \otimes (\Lambda(\mathfrak{g}_{-1}) \otimes V_X)$, and that $\nabla$ commutes with $S(\mathfrak{g}_{-2})$. Hence $H(M_X)$ and each $H^{m,n}(M_X)$ are $S(\mathfrak{g}_{-2})$-modules as well.

The canonical filtration of $U(L_-)$:
\[ \mathbb{C} = F_0 U(L_-) \subset \dots \subset F_i U(L_-) = L_- F_{i-1}U(L_-) + F_{i-1}U(L_-) \subset \dots \]
induces a filtration of $M_X$ by letting
\[ F_i M_X = F_i U(L_-) \otimes V_X. \qquad (3.4) \]
Moreover $\nabla(F_i M_X) \subset F_{i+1}M_X$, so that $M_X$ becomes a filtered complex with the differential $\nabla$ and the bigrading (3.1), and the filtration is bounded below. As we discuss in the Appendix, one can use a spectral sequence to study the homology of such a complex, and the spectral sequence converges when the filtration is bounded below. Applied to $(M_X, \nabla)$, this produces a sequence of complexes $\{(E^i, \nabla^i)\}$ such that $E^{i+1}$ is the homology of $(E^i, \nabla^i)$, $\lim_{i\to\infty} E^i = \operatorname{Gr}(H(M_X))$, and $E^0 = H(\operatorname{Gr} M_X)$.

Remark 3.1. In the subalgebra $W \cong W_3$ described in (1.4) consider the standard filtration of $W_3$: $W = L^W_{-1} \supset L^W_0 \supset \dots$; then $L^W_j \cdot F_i M_X \subset F_{i-j}M_X$. Therefore the action of W on $M_X$ descends to the action of W on $\operatorname{Gr} M_X$, because $W \cong \operatorname{Gr} W$. The actions commute with $\nabla$, and thus W acts also on the spectral sequence and homologies.

To get hold of $H(\operatorname{Gr} M_X)$ we notice that, by the PBW theorem, the associated graded algebra is $\operatorname{Gr} U(L_-) \cong S(\mathfrak{g}_{-2}) \otimes \Lambda(\mathfrak{g}_{-1})$ (tensor product of associative algebras). Then we get the associated graded complex
\[ \operatorname{Gr} M_X = \operatorname{Gr} U(L_-) \otimes V_X \cong (S(\mathfrak{g}_{-2}) \otimes \Lambda(\mathfrak{g}_{-1})) \otimes V_X. \]
Clearly $L^W_1$ annihilates $\Lambda(\mathfrak{g}_{-1}) \otimes V_X$ and the above isomorphism can be interpreted as the following isomorphism of W-modules:
\[ \operatorname{Gr} M_X \cong S(\mathfrak{g}_{-2}) \otimes (\Lambda(\mathfrak{g}_{-1}) \otimes V_X) \cong \operatorname{Ind}^{W}_{L^W_0}(\Lambda(\mathfrak{g}_{-1}) \otimes V_X). \qquad (3.5) \]
The differential of the complex $\operatorname{Gr} M_X$ we denote again by $\nabla$, because it is given by the same formula as for $M_X$, except that the multiplication by $d_i^\pm$ is to be taken in $\operatorname{Gr} U(L_-)$ instead of in $U(L_-)$. It follows that $G_X = \Lambda(\mathfrak{g}_{-1}) \otimes V_X$ is a subcomplex of the complex $(\operatorname{Gr} M_X, \nabla)$, and that the latter is obtained from the former by extending coefficients from $\mathbb{C}$ to $S = S(\mathfrak{g}_{-2})$. The homologies $H^{m,n}(G_X)$, computed with the differential restricted from $\operatorname{Gr} M_X$, are also annihilated by $L^W_1$, so we have an isomorphism of W-modules (and $\mathfrak{g}_0$-modules):
\[ H^{m,n}(\operatorname{Gr} M_X) \cong S \otimes_{\mathbb{C}} H^{m,n}(G_X) \cong \operatorname{Ind}^{W}_{L^W_0}(H^{m,n}(G_X)). \qquad (3.6) \]
Thus, (3.6) and the theory of spectral sequences give us the following result:

Proposition 3.2. $H^{m,n}(G_X) = 0 \Rightarrow H^{m,n}(\operatorname{Gr} M_X) = 0 \Rightarrow H^{m,n}(M_X) = 0$, and
\[ \operatorname{rank}_S H^{m,n}(M_X) \le \operatorname{rank}_S H^{m,n}(\operatorname{Gr} M_X) = \dim_{\mathbb{C}} H^{m,n}(G_X). \]
Figure 1 and Propositions 2.6, 2.7 make it reasonable to consider complexes not restricted to the quadrants of the figure. In the following we make the first step in this direction (see Fig. 2 several pages below).

Let $M_{AB}$ be the module $M_A \oplus M_B$ with the bigrading and the filtration induced from the summands, provided with the differential that coincides with $\nabla_2$ on $M_{A'} \subset M_A \oplus M_B$ and with $\nabla$ on the bigraded components of $M_A \oplus M_B$ that do not belong to $M_{A'}$. Let us define $M_{CD}$ and also $G_{AB}$, $G_{CD}$ in the same way, as the sum of the spaces with the differential constructed from $\nabla_2$ and $\nabla$.
Clearly $M_{AB}$ and $M_{CD}$ are filtered modules with the differential and the bigrading introduced above, and we can use the spectral sequence to study their homology. Moreover the isomorphism (3.6) holds for X = AB, CD and Proposition 3.2 remains valid for X = AB, CD. This and the calculations of the homology $H^{m,n}(G_X)$ made in the next section provide us with the following proposition.

Proposition 3.3.
$H^{m,n}(M_{AB}) = H^{m,n}(G_{AB}) = 0$ when m ≥ 2 or m = 1, n = 0, 1, 2;
$H^{m,n}(M_{CD}) = H^{m,n}(G_{CD}) = 0$ when m ≤ −2 or m = −1, n = 0, −1, −2.

4. Homology of $G_X$

Let us notice that along with the natural inclusions $V_{X'} \subset V_X$ there are natural projections $V_X \to V_{X'}$ defined by substituting zero for $z_\pm$ and $\partial_\pm$. One has the corresponding projections $G_X \to G_{X'}$. We will consider the compositions $G_X \to G_{X'} \to G_{Y'}$, where the projection is the first map and $\nabla_2$ is the second one, as well as the compositions $G_{X'} \to G_{Y'} \to G_Y$, where the first map is $\nabla_2$ and the second is the natural inclusion. These compositions are morphisms of E(3, 6)-modules, and we allow ourselves to denote them by $\nabla_2$ as well. Let
\[
\begin{aligned}
G_{A^o} &= \operatorname{Ker}(\nabla_2 : G_A \to G_{B'}), & G_{B^o} &= \operatorname{Coker}(\nabla_2 : G_{A'} \to G_B), && (4.1) \\
G_{C^o} &= \operatorname{Ker}(\nabla_2 : G_C \to G_{D'}), & G_{D^o} &= \operatorname{Coker}(\nabla_2 : G_{C'} \to G_D). && (4.2)
\end{aligned}
\]
It follows from Proposition 2.6a that the differential $\nabla$ is defined for $G_{X^o}$. It is clear from the above definitions that
\[
H^{m,n}(G_{AB}) =
\begin{cases}
H^{m,n}(G_{A^o}) & \text{for } n > 0, \\
H^{m,n}(G_{B^o}) & \text{for } n < 0, \\
H^{m,0}(G_{A^o}) \oplus H^{m,0}(G_{B^o}) & \text{for } n = 0;
\end{cases}
\qquad (4.3)
\]
\[
H^{m,n}(G_{CD}) =
\begin{cases}
H^{m,n}(G_{C^o}) & \text{for } n > 0, \\
H^{m,n}(G_{D^o}) & \text{for } n < 0, \\
H^{m,0}(G_{C^o}) \oplus H^{m,0}(G_{D^o}) & \text{for } n = 0.
\end{cases}
\qquad (4.4)
\]
This means that the computation of the homologies $H^{m,n}(G_{AB})$, $H^{m,n}(G_{CD})$ reduces to finding $H^{m,n}(G_{X^o})$. In the computations we need to go from $G_X$ to $G_{X^o}$, and the following lemma is helpful.

Lemma 4.1. Let (M, d) be a differential complex $M_0 \leftarrow M_1 \leftarrow M_2 \leftarrow \cdots$, let (N, d) be another differential complex of the same type and let $\alpha : M \to N$ be a morphism of complexes.
1. Suppose $M^o = \operatorname{Ker}\alpha$ and N is concentrated at 0 (i.e., $N_i = 0$ for $i \ne 0$). Then $H_n(M) \cong H_n(M^o)$ for $n \ne 0$ and there is an exact sequence:
\[ 0 \to H_0(M^o) \to H_0(M) \to H_0(N). \]
2. Suppose $N^o = \operatorname{Coker}\alpha$ and M is concentrated at 0. Then $H_n(N) \cong H_n(N^o)$ for $n \ne 0$ and there is an exact sequence:
\[ H_n(M) \to H_n(N) \to H_n(N^o) \to 0. \]
The statement follows immediately from the definitions. We can apply this lemma to connect the $\nabla$-homology of $G_{X^o}$ and $G_X$, since one can check, using the definitions (4.1), (4.2), that the conditions of the lemma are satisfied.

In the rest of the section we also consider the following Z-bigrading of the modules $G_X$. For X = A, B, C, D, let
\[
\begin{aligned}
(V_X)_{[p,q]} &= \{ f \in V_X \mid (z_+\partial_+)f = pf,\ (z_-\partial_-)f = qf \}, \\
(G_X)_{[p,q]} &= \Lambda(\mathfrak{g}_{-1}) \otimes (V_X)_{[p,q]}.
\end{aligned}
\qquad (4.5)
\]
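Returning to Lemma 4.1, part 1 can be unwound as follows. This is only a sketch of the routine verification; it uses nothing beyond the hypotheses that $\alpha$ is a morphism of complexes and that $N_i = 0$ for $i \ne 0$.

```latex
% Here M^o_i = M_i for i \neq 0 and M^o_0 = \operatorname{Ker}(\alpha_0 : M_0 \to N_0).
\begin{align*}
 &\alpha_0 \circ d = d \circ \alpha_1 = 0 \ \text{on } M_1
   \quad (\text{since } N_1 = 0)
   \;\Longrightarrow\; d(M_1) \subset \operatorname{Ker}\alpha_0 = M^o_0, \\
 &\text{so cycles and boundaries of } M \text{ and } M^o \text{ agree in degrees } n \ge 1:
   \qquad H_n(M^o) = H_n(M) \quad (n \neq 0), \\
 &H_0(M^o) = \operatorname{Ker}\alpha_0 / d(M_1), \qquad
  H_0(M) = M_0 / d(M_1), \qquad H_0(N) = N_0, \\
 &0 \longrightarrow \operatorname{Ker}\alpha_0 / d(M_1)
   \longrightarrow M_0 / d(M_1)
   \xrightarrow{\ \alpha_0\ } N_0
   \quad \text{is exact.}
\end{align*}
```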
It is important to mention that $(G_X^{m,n})_{[p,q]} \ne 0$ only if p + q = n. We hope that the notations allow one to distinguish which grading is referred to. The new bigrading naturally descends to $G_{X^o}$.

The definitions of $\Delta^\pm$, Lemma 2.1c and the formula $\nabla = \Delta^+\partial_+ + \Delta^-\partial_-$ allow us to conclude that the bigraded modules $G_X$, $G_{X^o}$ provided with the differentials $d' = \Delta^+\partial_+$, $d'' = \Delta^-\partial_-$ become bicomplexes, and their homologies with respect to $\nabla$ (the ones we are interested in) are nothing but the total homologies of the bicomplexes. So the classical theory of spectral sequences of a bicomplex is relevant here ([M, Chapter XI, Sect. 6]), and the following lemma contains the well-known statements about the two spectral sequences of a bicomplex that we will use.

Lemma 4.2. Let $(K, d', d'')$ be a bicomplex, $K = \bigoplus_{p,q} K_{[p,q]}$, and $d = d' + d''$ the total differential of K. The first spectral sequence of the bicomplex $E' = \{(E'^r, d'^r)\}$, $E'^r = \bigoplus_{p,q} E'^r_{[p,q]}$, has the property:
\[ (E'^0, d'^0) \cong (K, d''), \qquad (E'^1, d'^1) \cong (H(K, d''), d'), \]
so that $E'^2_{[p,q]} \cong H_p(H_q(K, d''), d')$.
For the second spectral sequence $E'' = \{(E''^r, d''^r)\}$, $E''^r = \bigoplus_{p,q} E''^r_{[p,q]}$, the roles of $d'$, $d''$ are reversed:
\[ (E''^0, d''^0) \cong (K, d'), \qquad (E''^1, d''^1) \cong (H(K, d'), d''), \]
so that $E''^2_{[p,q]} \cong H_p(H_q(K, d'), d'')$.
The spectral sequences are functors on the bicomplexes. Either of the above spectral sequences E converges to the homology of K with respect to the total differential d whenever for every n the set
\[ \{ (p, q) \mid p + q = n,\ E^2_{[p,q]} \ne 0 \} \]
is finite.
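The above condition for convergence could be relaxed, but this form is enough for our purposes. It is fairly traditional, although it differs slightly from the one in Theorem 6.1 of [M, Chapter XI]. The arguments of Proposition 3.2 of [M, Chapter XI] are easily modified to prove the convergence under our condition. We leave the details to the reader.

Now we decompose the bicomplexes $G_X$ (and their sub- or quotient-bicomplexes $G_{X^o}$) into a sum of smaller bicomplexes. Introduce the notations (0 ≤ i, j ≤ 3):
\[
\Lambda^\pm_i = \Lambda^i\langle d_1^\pm, d_2^\pm, d_3^\pm \rangle, \qquad
\Lambda^+_i\Lambda^-_j[x] = \Lambda^+_i\Lambda^-_j \otimes_{\mathbb{C}} \mathbb{C}[x_1, x_2, x_3], \qquad
\Lambda^+_i\Lambda^-_j[\partial] = \Lambda^+_i\Lambda^-_j \otimes_{\mathbb{C}} \mathbb{C}[\partial_1, \partial_2, \partial_3].
\]
Let
\[
\begin{aligned}
G_A(a, b)_{[p,q]} &= \Lambda^+_{a-p}\Lambda^-_{b-q}[x]\, z_+^{p} z_-^{q}, &
G_B(a, b)_{[p,q]} &= \Lambda^+_{a-p}\Lambda^-_{b-q}[x]\, \partial_+^{-p}\partial_-^{-q}, \\
G_C(a, b)_{[p,q]} &= \Lambda^+_{a-p}\Lambda^-_{b-q}[\partial]\, z_+^{p} z_-^{q}, &
G_D(a, b)_{[p,q]} &= \Lambda^+_{a-p}\Lambda^-_{b-q}[\partial]\, \partial_+^{-p}\partial_-^{-q}.
\end{aligned}
\qquad (4.6)
\]
Then $G_X$ decomposes into a direct sum of subcomplexes:
\[ G_X = \bigoplus_{a,b} G_X(a, b), \quad\text{where}\quad G_X(a, b) = \bigoplus_{p,q} G_X(a, b)_{[p,q]}. \qquad (4.7) \]
We have the induced decomposition $G_{X^o} = \bigoplus_{a,b} G_{X^o}(a, b)$. Note that the above equalities are isomorphisms of $s\ell(3)$-modules but not of $s\ell(2)$-modules, since
\[ G_X(a, b) = \{ f \in G_X \mid (z_+\partial_+)f = af,\ (z_-\partial_-)f = bf \} \]
with the obvious action of $z_\pm\partial_\pm$ on $G_X$.

Thus, in order to know $H^{m,n}(G_{X^o})$ it is enough to compute the homology $H^{m,n}(G_{X^o}(a, b))$, and this is our further goal. We will use formulae (4.6), (4.7) in these computations. Let us consider the $s\ell(3)$-modules $\Lambda^i = \Lambda^i\langle x_1, x_2, x_3 \rangle$ for i ≥ 0, and let $\Lambda^i = 0$ for i < 0. Of course $\Lambda^i = 0$ for i > 3 too.

Proposition 4.3. If a, b > 3, then (as $s\ell(3)$-modules)
\[
H^{m,n}(G_{A^o}(a, b), \nabla) = H^{m,n}(G_A(a, b), \nabla) \cong
\begin{cases} 0 & \text{for } m > 0, \\ \Lambda^{a+b-n} & \text{for } m = 0; \end{cases}
\]
\[
H^{m,n}(G_{C^o}(a, b), \nabla) = H^{m,n}(G_C(a, b), \nabla) \cong
\begin{cases} 0 & \text{for } m < 0, \\ \Lambda^{a+b-n-3} & \text{for } m = 0. \end{cases}
\]
If a, b < 0, then (as $s\ell(3)$-modules)
\[
H^{m,n}(G_{B^o}(a, b), \nabla) = H^{m,n}(G_B(a, b), \nabla) \cong
\begin{cases} 0 & \text{for } m > 0, \\ \Lambda^{a+b-n} & \text{for } m = 0; \end{cases}
\]
\[
H^{m,n}(G_{D^o}(a, b), \nabla) = H^{m,n}(G_D(a, b), \nabla) \cong
\begin{cases} 0 & \text{for } m < 0, \\ \Lambda^{a+b-n-3} & \text{for } m = 0. \end{cases}
\]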
(Note that for X = B or D the above formulae show that the homology $H^{0,n}(G_{X^o}(a, b), \nabla)$ can be non-zero only for negative n, namely when −3 + a + b ≤ n ≤ a + b < 0 for B and −6 + a + b ≤ n ≤ −3 + a + b < 0 for D.)

Proof. First of all it follows from (4.6) that, under the restrictions of the proposition, $G_X = G_{X^o}$, because the two can differ only if $(G_X)_{[0,0]} \ne 0$, which is not the case here. We use the spectral sequences of Lemma 4.2 for the evaluation of $H(G_X)$, and the first spectral sequence happens to be sufficient for the proof. As we compute the $E'^2$-term, we notice that we get a one-row spectral sequence, which necessarily degenerates (i.e. all the higher differentials are zero); thus $E'^2 \cong E'^\infty \cong H(G_X, \nabla)$ as $s\ell(3)$-modules.

Let us start with the A-case. By Lemma 4.2,
\[ E'^2_{[p,q]}(G_A(a, b)) \cong H_p(H_q(G_A(a, b), d''), d'), \]
thus we begin by considering $G_A(a, b)$ as a complex with the differential $d'' = \Delta^-\partial_-$. We see from (4.6) that it splits into a sum of subcomplexes
\[ \cdots \leftarrow \Lambda^+_{a-p}\Lambda^-_{b-q+1}[x]\, z_+^{p} z_-^{q-1} \xleftarrow{\ d''\ } \Lambda^+_{a-p}\Lambda^-_{b-q}[x]\, z_+^{p} z_-^{q} \leftarrow \cdots. \qquad (4.8) \]
Observing how the differential acts, we conclude that the complex (4.8) is isomorphic to the tensor product of the following complex
\[ 0 \leftarrow \Lambda^-_3[x]\, z_-^{b-3} \leftarrow \cdots \leftarrow \Lambda^-_{b-q+1}[x]\, z_-^{q-1} \leftarrow \Lambda^-_{b-q}[x]\, z_-^{q} \leftarrow \cdots \leftarrow \Lambda^-_0[x]\, z_-^{b} \leftarrow 0 \qquad (4.9) \]
(note that b > 3) with the vector space $\Lambda^+_{a-p} z_+^{p}$, which is not affected by the differential. The complex (4.9) is nothing but a De Rham complex with the "grading variable" $z_-$ added. This implies that it has non-zero homologies only at its right end (with our direction of arrows), and those are isomorphic to $\mathbb{C} z_-^{b}$.

Therefore the complex (4.8) also has its non-zero homologies only at one place, and those are isomorphic to $\Lambda^+_{a-p} z_+^{p} z_-^{b}$. This shows that all non-zero terms in $E'^1_{[p,q]}(G_A(a, b))$ are confined to the one row q = b and that $d'$ is zero on this row. Thus $E'^2 = E'^1$, and for a one-row spectral sequence $E'^2 = \dots = E'^\infty$, hence we arrive at the following answer:
\[
E'^\infty_{[p,q]}(G_A(a, b)) =
\begin{cases} 0 & \text{for } q \ne b, \\ \Lambda^+_{a-p}\, z_+^{p} z_-^{b} & \text{for } q = b. \end{cases}
\qquad (4.10)
\]
At the same time
\[
\bigoplus_m H^{m,n}(G_A(a, b)) \cong \bigoplus_{p+q=n} E'^\infty_{[p,q]}(G_A(a, b)) \cong E'^\infty_{[n-b,b]}(G_A(a, b)) \cong \Lambda^+_{a+b-n}\, z_+^{n-b} z_-^{b}.
\qquad (4.11)
\]
Now (4.11) shows that $H^{m,n}(G_A(a, b)) = 0$ for $m \ne 0$ and that it has the stated value for m = 0. This proves the proposition for X = A.
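The key input above is that the polynomial De Rham complex in three variables has homology only in form-degree 0 and polynomial degree 0. The following is a small numerical sanity check in low degrees, not part of the original argument. It assumes the identification of $d_i^-$ with $dx_i$ and of $\Delta^-$ with the exterior derivative $\sum_i dx_i\,\partial/\partial x_i$; the function names (`monomials`, `basis`, `d_matrix`, `rank`, `homology_dim`) are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement

N_VARS = 3  # the variables x1, x2, x3


def monomials(deg):
    """Exponent tuples of all monomials of total degree `deg` (none if deg < 0)."""
    if deg < 0:
        return []
    result = []
    for combo in combinations_with_replacement(range(N_VARS), deg):
        exps = [0] * N_VARS
        for i in combo:
            exps[i] += 1
        result.append(tuple(exps))
    return result


def basis(k, deg):
    """Basis elements x^m dx_I: k-forms with coefficients of degree `deg`."""
    return [(m, I) for m in monomials(deg)
            for I in combinations(range(N_VARS), k)]


def d_matrix(k, deg):
    """Matrix of the exterior derivative on k-forms of coefficient degree `deg`."""
    source = basis(k, deg)
    target = {b: row for row, b in enumerate(basis(k + 1, deg - 1))}
    mat = [[Fraction(0)] * len(source) for _ in target]
    for col, (m, I) in enumerate(source):
        for j in range(N_VARS):
            if m[j] == 0 or j in I:
                continue  # the derivative vanishes, or dx_j ^ dx_j = 0
            new_m = tuple(e - (i == j) for i, e in enumerate(m))
            new_I = tuple(sorted(I + (j,)))
            # sign from moving dx_j past the dx_i with i < j
            sign = (-1) ** sum(1 for i in I if i < j)
            mat[target[(new_m, new_I)]][col] += sign * m[j]
    return mat


def rank(mat):
    """Rank over the rationals by Gaussian elimination."""
    rows = [r[:] for r in mat]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for c in range(ncols):
        piv = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][c] != 0:
                f = rows[r][c] / rows[rk][c]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def homology_dim(k, deg):
    """Dimension of the homology at k-forms with coefficients of degree `deg`."""
    dim = len(basis(k, deg))
    out_rank = rank(d_matrix(k, deg))
    in_rank = rank(d_matrix(k - 1, deg + 1)) if k > 0 else 0
    return dim - out_rank - in_rank
```

In this sketch, `homology_dim(0, 0)` corresponds to the constants in $\Lambda^-_0[x]$, i.e. to the right end of (4.9), while every other pair `(k, deg)` should give zero.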
The proof for X = B is quite similar; we deal with the complexes
\[ \cdots \leftarrow \Lambda^+_{a-p}\Lambda^-_{b-q+1}[x]\, \partial_+^{-p}\partial_-^{1-q} \xleftarrow{\ d''\ } \Lambda^+_{a-p}\Lambda^-_{b-q}[x]\, \partial_+^{-p}\partial_-^{-q} \leftarrow \cdots \qquad (4.12) \]
isomorphic to the tensor product of a De Rham complex
\[ 0 \leftarrow \Lambda^-_3[x]\, \partial_-^{3-b} \leftarrow \cdots \leftarrow \Lambda^-_{b-q+1}[x]\, \partial_-^{1-q} \leftarrow \Lambda^-_{b-q}[x]\, \partial_-^{-q} \leftarrow \cdots \leftarrow \Lambda^-_0[x]\, \partial_-^{-b} \leftarrow 0 \qquad (4.13) \]
(b < 0), and a vector space $\Lambda^+_{a-p}\partial_+^{-p}$ which is not affected by the differential. It gives
\[
E'^2_{[p,q]}(G_B(a, b)) =
\begin{cases} 0 & \text{for } q \ne b, \\ \Lambda^+_{a-p}\, \partial_+^{-p}\partial_-^{-b} & \text{for } q = b. \end{cases}
\qquad (4.14)
\]
We get the same configuration with one non-zero row, hence $E'^2 = \dots = E'^\infty$. At the same time
\[
\bigoplus_m H^{m,n}(G_B(a, b)) \cong \bigoplus_{p+q=n} E'^\infty_{[p,q]}(G_B(a, b)) \cong E'^\infty_{[n-b,b]}(G_B(a, b)) \cong \Lambda^+_{a+b-n}\, \partial_+^{b-n}\partial_-^{-b},
\qquad (4.15)
\]
and this immediately implies the proposition for X = B.

Going to X = C, we represent $G_C(a, b)$ as the sum of subcomplexes of the form
\[ \cdots \leftarrow \Lambda^+_{a-p}\Lambda^-_{b-q+1}[\partial]\, z_+^{p} z_-^{q-1} \xleftarrow{\ d''\ } \Lambda^+_{a-p}\Lambda^-_{b-q}[\partial]\, z_+^{p} z_-^{q} \leftarrow \cdots, \qquad (4.16) \]
which differs from (4.8) in that here we have modules over $\mathbb{C}[\partial]$ instead of $\mathbb{C}[x]$. Again the complex (4.16) is isomorphic to the tensor product of
\[ 0 \leftarrow \Lambda^-_3[\partial]\, z_-^{b-3} \leftarrow \cdots \leftarrow \Lambda^-_{b-q+1}[\partial]\, z_-^{q-1} \leftarrow \Lambda^-_{b-q}[\partial]\, z_-^{q} \leftarrow \cdots \leftarrow \Lambda^-_0[\partial]\, z_-^{b} \leftarrow 0 \qquad (4.17) \]
(b > 3), and a vector space $\Lambda^+_{a-p} z_+^{p}$ which is not affected by the differential.

Now, with its differential, the complex (4.17) is essentially a Koszul complex (the dual of the De Rham complex), and this implies that it has non-zero homologies only at its very left end (with our direction of arrows), and those are isomorphic to $\Lambda^-_3 z_-^{b-3}$. Thus the complex (4.16) also has its non-zero homologies only at its left end, and those are isomorphic to $\Lambda^+_{a-p}\Lambda^-_3\, z_+^{p} z_-^{b-3}$. We get a one-row spectral sequence again, but with the row q = b − 3, and again $d'$ is zero on this row. Hence
\[
E'^2_{[p,q]}(G_C(a, b)) =
\begin{cases} 0 & \text{for } q \ne b - 3, \\ \Lambda^+_{a-p}\Lambda^-_3\, z_+^{p} z_-^{b-3} & \text{for } q = b - 3. \end{cases}
\qquad (4.18)
\]
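Now $E'^2_{[p,q]} = E'^\infty_{[p,q]}$,
\[
\bigoplus_m H^{m,n}(G_C(a, b)) \cong E'^\infty_{[n-b+3,\,b-3]}(G_C(a, b)) \cong \Lambda^+_{a+b-n-3}\Lambda^-_3\, z_+^{n-b+3} z_-^{b-3},
\qquad (4.19)
\]
and, taking into account the isomorphism $\Lambda^-_3 \cong \mathbb{C}$, we conclude that the proposition is proved for X = C.

For X = D we arrive similarly at the complex
\[ \cdots \leftarrow \Lambda^+_{a-p}\Lambda^-_{b-q+1}[\partial]\, \partial_+^{-p}\partial_-^{1-q} \xleftarrow{\ d''\ } \Lambda^+_{a-p}\Lambda^-_{b-q}[\partial]\, \partial_+^{-p}\partial_-^{-q} \leftarrow \cdots, \qquad (4.20) \]
which is the tensor product of a Koszul complex
\[ 0 \leftarrow \Lambda^-_3[\partial]\, \partial_-^{3-b} \leftarrow \cdots \leftarrow \Lambda^-_{b-q+1}[\partial]\, \partial_-^{1-q} \leftarrow \Lambda^-_{b-q}[\partial]\, \partial_-^{-q} \leftarrow \cdots \leftarrow \Lambda^-_0[\partial]\, \partial_-^{-b} \leftarrow 0 \qquad (4.21) \]
(b < 0), and a vector space $\Lambda^+_{a-p}\partial_+^{-p}$. So we get the formula
\[
E'^2_{[p,q]}(G_D(a, b)) =
\begin{cases} 0 & \text{for } q \ne b - 3, \\ \Lambda^+_{a-p}\Lambda^-_3\, \partial_+^{-p}\partial_-^{3-b} & \text{for } q = b - 3. \end{cases}
\qquad (4.22)
\]
Again $E'^2_{[p,q]} = E'^\infty_{[p,q]}$, and
\[
\bigoplus_m H^{m,n}(G_D(a, b)) \cong E'^\infty_{[n-b+3,\,b-3]}(G_D(a, b)) \cong \Lambda^+_{a+b-n-3}\Lambda^-_3\, \partial_+^{b-n-3}\partial_-^{3-b}.
\qquad (4.23)
\]
This is enough to prove the proposition for X = D.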
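The degeneration step used in all four cases can be spelled out in one line. The sketch below assumes that the r-th differential of the first spectral sequence shifts the bidegree by $[p,q] \mapsto [p-r,\,q+r-1]$, which is the convention consistent with the map in (4.25); $q_0$ denotes the single non-zero row, so $q_0 = b$ for X = A, B and $q_0 = b-3$ for X = C, D.

```latex
\begin{gather*}
 d^{\,r} : E^{r}_{[p,q]} \longrightarrow E^{r}_{[p-r,\;q+r-1]}, \qquad r \ge 2, \\
 E^{2}_{[p,q]} = 0 \ \text{for } q \neq q_0
 \;\Longrightarrow\;
 \text{source row } q \text{ and target row } q+r-1 \text{ cannot both equal } q_0
 \;\Longrightarrow\; d^{\,r} = 0 \quad (r \ge 2), \\
 E^{2} = E^{3} = \dots = E^{\infty}, \qquad
 \bigoplus_{m} H^{m,n} \;\cong\; \bigoplus_{p+q=n} E^{\infty}_{[p,q]}
 \;=\; E^{2}_{[\,n-q_0,\;q_0]} .
\end{gather*}
```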
Looking back at the proof, we notice that the roles of a and b are not symmetric. The restrictions on b matter because they guarantee that the complexes (4.9), (4.13), (4.17), (4.21) are not cut short and remain full-length De Rham or Koszul complexes. But the restrictions on a have not been used. Therefore, leaving the proof the same, we can relax the restrictions on a, arriving at the following corollary.

Corollary 4.4. If b > 3, then
\[
H^{m,n}(G_{A^o}(a, b), \nabla) = H^{m,n}(G_A(a, b), \nabla) \cong
\begin{cases} \Lambda^{a+b-n} & \text{for } m = 0,\ n \ge b, \\ 0 & \text{otherwise}; \end{cases}
\]
\[
H^{m,n}(G_{C^o}(a, b), \nabla) = H^{m,n}(G_C(a, b), \nabla) \cong
\begin{cases} \Lambda^{a+b-n-3} & \text{for } m = 0,\ n \ge b - 3, \\ 0 & \text{otherwise}. \end{cases}
\]
If b < 0, then
\[
H^{m,n}(G_{B^o}(a, b), \nabla) = H^{m,n}(G_B(a, b), \nabla) \cong
\begin{cases} \Lambda^{a+b-n} & \text{for } m = 0,\ n \le b, \\ 0 & \text{otherwise}; \end{cases}
\]
\[
H^{m,n}(G_{D^o}(a, b), \nabla) = H^{m,n}(G_D(a, b), \nabla) \cong
\begin{cases} \Lambda^{a+b-n-3} & \text{for } m = 0,\ n \le b - 3, \\ 0 & \text{otherwise}. \end{cases}
\]
Corollary 4.5. By interchanging a and b in Corollary 4.4 we get valid statements as well.

This is because we can use the second spectral sequence of Lemma 4.2 in the proof instead of the first one.

We also have to remember that for some values of a the complexes $G_X(a, b)$ are entirely zero. Thus we can assume that a ≥ 0 for X = A, A^o, C, C^o and a ≤ 3 for X = B, B^o, D, D^o, and we are left with the cases when 0 ≤ a, b ≤ 3. Here the answer becomes somewhat different and the proof demands more elaborate arguments, but it goes along the same lines of computing the homology via a spectral sequence.

Proposition 4.6. Let 0 ≤ a ≤ b ≤ 3; then
\[
H^{m,n}(G_{A^o}(a, b), \nabla) \cong
\begin{cases} \Lambda^{a+b-n} & \text{for } m = 0,\ n \ge b, \\ \Lambda^{1+a+b-n} & \text{for } m = 1,\ 0 \le n \le a, \\ 0 & \text{otherwise}, \end{cases}
\]
\[
H^{m,n}(G_{C^o}(a, b), \nabla) \cong
\begin{cases} \Lambda^{a+b-n-3} & \text{for } m = 0,\ n \ge 0, \\ 0 & \text{otherwise}, \end{cases}
\]
\[
H^{m,n}(G_{B^o}(a, b), \nabla) \cong
\begin{cases} \Lambda^{a+b-n} & \text{for } m = 0,\ n \le 0, \\ 0 & \text{otherwise}. \end{cases}
\]
Let 0 ≤ b ≤ a ≤ 3; then
\[
H^{m,n}(G_{D^o}(a, b), \nabla) \cong
\begin{cases} \Lambda^{a+b-n-3} & \text{for } m = 0,\ n \le b - 3, \\ \Lambda^{-1+a+b-n-3} & \text{for } m = -1,\ a - 3 \le n \le 0, \\ 0 & \text{otherwise}. \end{cases}
\]
Proof. We see immediately that the statement is true for the following complexes:
\[ G_{A^o}(0, 0) = \mathbb{C} + \langle x_1, x_2, x_3 \rangle, \quad G_{C^o}(0, 0) = 0, \quad G_{B^o}(3, 3) = 0, \quad G_{D^o}(3, 3) = \mathbb{C} + \langle \partial_1, \partial_2, \partial_3 \rangle, \]
with trivial differentials. We exclude these cases from further consideration.

We already noticed that the variables $z_\pm$ (resp. $\partial_\pm$) play only the role of grading variables, so we can eliminate them as is done below. Let us define a bicomplex
\[
(\tilde G_A(a, b))_{[p,q]} =
\begin{cases} \Lambda^+_{a-p}\Lambda^-_{b-q}[x] & \text{for } p \ge 0,\ q \ge 0, \\ 0 & \text{otherwise}, \end{cases}
\]
with the differentials $d' = \Delta^+$, $d'' = \Delta^-$. Comparing with (4.6) we conclude that there exists an isomorphism of bicomplexes $\alpha : G_A(a, b) \to \tilde G_A(a, b)$. Let
\[
(\tilde G_{B'}(a, b))_{[p,q]} =
\begin{cases} \Lambda^+_{a+1}\Lambda^-_{b+1}[x] & \text{for } p = 0,\ q = 0, \\ 0 & \text{otherwise}. \end{cases}
\]
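Following (4.1), the bicomplex $G_{A^o}(a, b)$ is the kernel of the morphism $\nabla_2 : G_A(a, b) \to G_{B'}(a, b)$, and the morphism can be included into a commutative diagram
\[
\begin{array}{ccc}
G_A(a, b) & \longrightarrow & G_{B'}(a, b) \\
\downarrow & & \downarrow \\
\tilde G_A(a, b) & \longrightarrow & \tilde G_{B'}(a, b)
\end{array}
\]
where the morphism in the lower row is equal to $\Delta^+\Delta^-$. This shows that $\alpha$ induces an isomorphism between $G_{A^o}(a, b)$ and the bicomplex
\[ \tilde G_{A^o}(a, b) = \operatorname{Ker}(\Delta^+\Delta^- : \tilde G_A(a, b) \to \tilde G_{B'}(a, b)), \]
thus it also induces an isomorphism of their homologies.

The bicomplex $\tilde G_A(a, b)$ can be represented by the following diagram (we allow ourselves to omit the zero components and morphisms with a zero source or target):
\[
\begin{array}{ccccccc}
\Lambda^+_a\Lambda^-_0[x] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_0[x] & \leftarrow & \Lambda^+_k\Lambda^-_0[x] & \leftarrow\cdots\leftarrow & \Lambda^+_0\Lambda^-_0[x] \\
\downarrow & & \downarrow & & \downarrow & & \downarrow \\
\vdots & & \vdots & & \vdots & & \vdots \\
\Lambda^+_a\Lambda^-_j[x] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_j[x] & \leftarrow & \Lambda^+_k\Lambda^-_j[x] & \leftarrow\cdots\leftarrow & \Lambda^+_0\Lambda^-_j[x] \\
\downarrow & & \downarrow & & \downarrow & & \downarrow \\
\vdots & & \vdots & & \vdots & & \vdots \\
\Lambda^+_a\Lambda^-_b[x] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_b[x] & \leftarrow & \Lambda^+_k\Lambda^-_b[x] & \leftarrow\cdots\leftarrow & \Lambda^+_0\Lambda^-_b[x]
\end{array}
\]
where the row (resp. column) maps are the De Rham differentials $\Delta^+$ (resp. $\Delta^-$). To represent $\tilde G_{A^o}(a, b)$ by a similar diagram we need only change it in the lower-left corner and put $\operatorname{Ker}(\Delta^+\Delta^-)$ there.

It follows that the $E'^1$-term of the spectral sequence of the bicomplex $\tilde G_{A^o}(a, b)$ is represented by the following diagram:
\[
\begin{array}{ccccccc}
\Lambda^+_a & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1} & \leftarrow & \Lambda^+_k & \leftarrow\cdots\leftarrow & \Lambda^+_0 \\
0 & & 0 & & 0 & & 0 \\
\vdots & & \vdots & & \vdots & & \vdots \\
0 & & 0 & & 0 & & 0 \\
\dfrac{\operatorname{Ker}\Delta^+\Delta^-}{\operatorname{Im}\Delta^-} & \leftarrow\cdots\leftarrow & \dfrac{\Lambda^+_{k+1}\Lambda^-_b[x]}{\operatorname{Im}\Delta^-} & \leftarrow & \dfrac{\Lambda^+_k\Lambda^-_b[x]}{\operatorname{Im}\Delta^-} & \leftarrow\cdots\leftarrow & \dfrac{\Lambda^+_0\Lambda^-_b[x]}{\operatorname{Im}\Delta^-}
\end{array}
\]
with only two non-zero rows, those where q = 0 and q = b. Here $b \ne 0$, as 0 ≤ a ≤ b and we have excluded a = b = 0.

The $E'^2$-term will have a similar "two rows" structure, and this together with a ≤ b implies that for every differential $d^r_{[p,q]}$, r ≥ 2, either its source or its target is zero. Thus all the differentials are trivial and therefore $E'^2 = \dots = E'^\infty$.

Let us compute $E'^2$. First of all the differential $d'$ is induced by $\Delta^+$ and evidently it is trivial on the upper row, hence the row descends to $E'^2$ unchanged. The following lemma helps us to compute the terms in the lower row.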
Lemma 4.7. Let R(a, b) be the complex
\[
\frac{\Lambda^+_0\Lambda^-_b[x]}{\operatorname{Im}\Delta^-} \xrightarrow{\ \Delta^+\ } \frac{\Lambda^+_1\Lambda^-_b[x]}{\operatorname{Im}\Delta^-} \xrightarrow{\ \Delta^+\ } \cdots \xrightarrow{\ \Delta^+\ } \frac{\Lambda^+_k\Lambda^-_b[x]}{\operatorname{Im}\Delta^-} \xrightarrow{\ \Delta^+\ } \cdots \xrightarrow{\ \Delta^+\ } \frac{\operatorname{Ker}\Delta^+\Delta^-}{\operatorname{Im}\Delta^-}
\]
and S(a, b) be the complex
\[
\Delta^-(\Lambda^+_0\Lambda^-_b[x]) \xrightarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_1\Lambda^-_b[x]) \xrightarrow{\ \Delta^+\ } \cdots \xrightarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_k\Lambda^-_b[x]) \xrightarrow{\ \Delta^+\ } \cdots \xrightarrow{\ \Delta^+\ } S(a, b)_b,
\]
where the last term is $S(a, b)_b = \operatorname{Ker}\bigl(\Delta^-(\Lambda^+_a\Lambda^-_b[x]) \xrightarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_{a+1}\Lambda^-_b[x])\bigr)$. Then
(a) $\Delta^-$ induces an isomorphism of complexes R(a, b) → S(a, b) for b > 0,
(b) the homologies of R(a, b) (and S(a, b)) for b > 0 are isomorphic to:
\[ \Lambda^{b+1},\ \Lambda^{b+2},\ \dots,\ \Lambda^{b+k+1},\ \dots,\ \Lambda^{a+b+1}. \]
In the notations of Lemma 4.7 we write the terms of the lower non-trivial row of the $E'^2$-term of the spectral sequence as follows:
\[ E'^2_{[p,0]}(\tilde G_{A^o}(a, b)) = H_{a-p}(R(a, b)) \qquad (4.24) \]
(in part (b) of the lemma these terms are explicitly calculated).

Proof of Lemma 4.7. The statement (a) is obvious. Going to (b), let us first of all notice that the second complex has a nice property: $H_i(S(a, b)) = H_i(S(a + 1, b))$ for 0 ≤ i ≤ a. Thus it is sufficient to compute its homologies for large a, and because of (4.24) it is enough to compute $E'^2(\tilde G_{A^o}(a, b))$ when a is large.

Let a > 3 (and thus a > b). Evidently $E'^2(\tilde G_{A^o}(a, b))$ has two non-zero rows, for q = 0 and q = b, and therefore among the differentials $d^r_{[p,q]}$, r ≥ 2, all but those for r = b + 1, q = 0, b < p ≤ a are equal to zero.

On the other hand the total homologies of $\tilde G_{A^o}(a, b)$ are isomorphic to the total homologies of $G_{A^o}(a, b)$, and their description is given in Corollary 4.5. It follows from Corollary 4.5 that
\[
\bigoplus_{p+q=n} E'^\infty_{[p,q]}(\tilde G_{A^o}(a, b)) = 0 \ \text{ for } n < a, \quad\text{and}\quad \bigoplus_{p+q=a} E'^\infty_{[p,q]}(\tilde G_{A^o}(a, b)) = \Lambda^+_b.
\]
All this forces us to conclude that the differential $d^{(b+1)}_{[p,0]}$, b < p ≤ a, determines an isomorphism
\[
H_{a-p}(R(a, b)) = E'^{\,b+1}_{[p,0]}(\tilde G_{A^o}(a, b)) \longrightarrow E'^{\,b+1}_{[p-b-1,b]}(\tilde G_{A^o}(a, b)) = \Lambda^+_{a+b-p+1}.
\qquad (4.25)
\]
This proves the lemma.

It is important to notice that the above morphism $d^{(b+1)}_{[p,0]}$ is induced by $\nabla$, hence it diminishes the degree of x's by 1. Therefore the space $E'^{\,b+1}_{[p,0]}(\tilde G_{A^o}(a, b))$ above is
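represented by elements of x-degree 1 because the elements in $E'^{\,b+1}_{[p-b-1,b]}(\tilde G_{A^o}(a, b))$ have x-degree zero. From (4.24) and (4.25) we get
\[ E'^2_{[p,0]}(\tilde G_{A^o}(a, b)) = H_{a-p}(R(a, b)) = \Lambda^{a+b-p+1} \quad (\text{in } x\text{-degree } 1). \qquad (4.26) \]
To continue proving Proposition 4.6, we return to our previous data 0 ≤ a ≤ b ≤ 3 and conclude that if 0 ≤ n ≤ a then
\[
\bigoplus_m H^{m,n}(\tilde G_{A^o}(a, b)) \cong E'^\infty_{[n,0]}(\tilde G_{A^o}(a, b)) \cong E'^2_{[n,0]}(\tilde G_{A^o}(a, b)) \cong \Lambda^{a+b-n+1} \quad (\text{in } x\text{-degree } 1),
\]
hence $H(\tilde G_{A^o}(a, b))^{1,n} \cong \Lambda^{a+b-n+1}$, and $H(\tilde G_{A^o}(a, b))^{m,n} = 0$ for $m \ne 1$. With this we have the rest of the statement for X = A^o.

Similarly, for X = B we define a "$\partial_\pm$-free version", a bicomplex
\[
(\tilde G_B(a, b))_{[p,q]} =
\begin{cases} \Lambda^+_{a-p}\Lambda^-_{b-q}[x] & \text{for } p \le 0,\ q \le 0, \\ 0 & \text{otherwise}, \end{cases}
\]
with the differentials $d' = \Delta^+$, $d'' = \Delta^-$, and looking at (4.6) we see that there is a natural isomorphism of bicomplexes $\alpha : G_B(a, b) \to \tilde G_B(a, b)$. Let
\[
(\tilde G_{A'}(a, b))_{[p,q]} =
\begin{cases} \Lambda^+_{a-1}\Lambda^-_{b-1}[x] & \text{for } p = 0,\ q = 0, \\ 0 & \text{otherwise}. \end{cases}
\]
It is clear that $\nabla_2$ maps $G_{A'}(a, b)$ into $G_B(a, b)$ with $G_{B^o}(a, b)$ as the cokernel, and that $\alpha$ induces an isomorphism between $G_{B^o}(a, b)$ and the cokernel of the morphism $\Delta^+\Delta^-$ from $\tilde G_{A'}(a, b)$ into $\tilde G_B(a, b)$; the latter cokernel we will denote by $\tilde G_{B^o}(a, b)$.

Notice that for a = b = 0 we have $\tilde G_{A'}(0, 0) = 0$, so $G_{B^o}(0, 0) = G_B(0, 0)$, and it is easy to check that the proof of Proposition 4.3 works and provides the result.

Let us suppose b > 0. Again we represent $\tilde G_B(a, b)$ by the following diagram:
\[
\begin{array}{ccccccc}
\Lambda^+_3\Lambda^-_b[x] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_b[x] & \leftarrow & \Lambda^+_k\Lambda^-_b[x] & \leftarrow\cdots\leftarrow & \Lambda^+_a\Lambda^-_b[x] \\
\downarrow & & \downarrow & & \downarrow & & \downarrow \\
\vdots & & \vdots & & \vdots & & \vdots \\
\Lambda^+_3\Lambda^-_j[x] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_j[x] & \leftarrow & \Lambda^+_k\Lambda^-_j[x] & \leftarrow\cdots\leftarrow & \Lambda^+_a\Lambda^-_j[x] \\
\downarrow & & \downarrow & & \downarrow & & \downarrow \\
\vdots & & \vdots & & \vdots & & \vdots \\
\Lambda^+_3\Lambda^-_3[x] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_3[x] & \leftarrow & \Lambda^+_k\Lambda^-_3[x] & \leftarrow\cdots\leftarrow & \Lambda^+_a\Lambda^-_3[x]
\end{array}
\]
where the row (resp. column) maps are the De Rham differentials $\Delta^+$ (resp. $\Delta^-$) and the upper-right corner corresponds to [p, q] = [0, 0], so the diagram is situated in the third quarter. For $\tilde G_{B^o}(a, b)$ we are to change the upper-right corner to $\operatorname{Coker}(\Delta^+\Delta^-)$.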
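To compute the $E'^1$-term of the spectral sequence we consider the column complexes, and evidently they have non-zero homology only at the upper (q = 0) row. Bearing in mind that b > 0, one easily checks that the $E'^1$-term of the spectral sequence of the bicomplex $\tilde G_{B^o}(a, b)$ is represented by the following diagram:
\[
\begin{array}{ccccccc}
\Delta^-(\Lambda^+_3\Lambda^-_{b-1}[x]) & \leftarrow\cdots\leftarrow & \Delta^-(\Lambda^+_{k+1}\Lambda^-_{b-1}[x]) & \leftarrow & \Delta^-(\Lambda^+_k\Lambda^-_{b-1}[x]) & \leftarrow\cdots\leftarrow & \dfrac{\operatorname{Ker}\Delta^-}{\operatorname{Im}\Delta^+\Delta^-} \\
0 & & 0 & & 0 & & 0 \\
\vdots & & \vdots & & \vdots & & \vdots \\
0 & & 0 & & 0 & & 0
\end{array}
\]
Now notice that
\[
\frac{\operatorname{Ker}\Delta^-}{\operatorname{Im}\Delta^+\Delta^-} \cong \frac{\Delta^-(\Lambda^+_a\Lambda^-_{b-1}[x])}{\Delta^+\Delta^-(\Lambda^+_{a-1}\Lambda^-_{b-1}[x])} \cong \operatorname{Coker}\Bigl(\Delta^-(\Lambda^+_a\Lambda^-_{b-1}[x]) \xleftarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_{a-1}\Lambda^-_{b-1}[x])\Bigr).
\]
Now, comparing the upper row with the complex S(3, b − 1) of Lemma 4.7, we have no difficulty determining its homology. The upper row of the diagram for $E'^2$ becomes
\[ \Lambda^{b+3},\ \dots,\ \Lambda^{k+b+1},\ \Lambda^{k+b},\ \dots,\ \Lambda^{a+b}. \]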
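There are no more non-zero differentials, and we obtain the result in this case.

The reasoning for X = D goes along the same lines, and the situation is in some sense dual to that in the case X = A. We will stress the main points, leaving it to the reader to fill in the details.

The diagram representing $\tilde G_{D^o}(a, b)$ is the following (we skip the evident definition of $\tilde G_{D^o}(a, b)$):
\[
\begin{array}{ccccccc}
\Lambda^+_3\Lambda^-_b[\partial] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_b[\partial] & \leftarrow & \Lambda^+_k\Lambda^-_b[\partial] & \leftarrow\cdots\leftarrow & \operatorname{Coker}(\Delta^+\Delta^-) \\
\downarrow & & \downarrow & & \downarrow & & \downarrow \\
\vdots & & \vdots & & \vdots & & \vdots \\
\Lambda^+_3\Lambda^-_j[\partial] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_j[\partial] & \leftarrow & \Lambda^+_k\Lambda^-_j[\partial] & \leftarrow\cdots\leftarrow & \Lambda^+_a\Lambda^-_j[\partial] \\
\downarrow & & \downarrow & & \downarrow & & \downarrow \\
\vdots & & \vdots & & \vdots & & \vdots \\
\Lambda^+_3\Lambda^-_3[\partial] & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_3[\partial] & \leftarrow & \Lambda^+_k\Lambda^-_3[\partial] & \leftarrow\cdots\leftarrow & \Lambda^+_a\Lambda^-_3[\partial]
\end{array}
\]
where the row (resp. column) maps $\Delta^+$ (resp. $\Delta^-$) are Koszul differentials, the upper-right corner corresponds to [p, q] = [0, 0], the diagram is situated in the third quarter and its other corners are [0, b − 3], [a − 3, 0], [a − 3, b − 3]. Calculating the homologies of the vertical complexes we get (provided b > 0) the following diagram:
\[
\begin{array}{ccccccc}
\Delta^-(\Lambda^+_3\Lambda^-_{b-1}[\partial]) & \leftarrow\cdots\leftarrow & \Delta^-(\Lambda^+_{k+1}\Lambda^-_{b-1}[\partial]) & \leftarrow & \Delta^-(\Lambda^+_k\Lambda^-_{b-1}[\partial]) & \leftarrow\cdots\leftarrow & \dfrac{\operatorname{Ker}\Delta^-}{\operatorname{Im}\Delta^+\Delta^-} \\
0 & & 0 & & 0 & & 0 \\
\vdots & & \vdots & & \vdots & & \vdots \\
0 & & 0 & & 0 & & 0 \\
\Lambda^+_3\Lambda^-_3 & \leftarrow\cdots\leftarrow & \Lambda^+_{k+1}\Lambda^-_3 & \leftarrow & \Lambda^+_k\Lambda^-_3 & \leftarrow\cdots\leftarrow & \Lambda^+_a\Lambda^-_3
\end{array}
\]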
V. G. Kac, A. Rudakov
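Clearly the term at the upper-right corner is isomorphic to
\[
\frac{\operatorname{Ker}\Delta^-}{\operatorname{Im}\Delta^+\Delta^-}
= \frac{\operatorname{Im}\Delta^-}{\operatorname{Im}\Delta^+\Delta^-}
= \operatorname{Coker}\Bigl(\Delta^-(\Lambda^+_a\Lambda^-_{b-1}[\partial]) \xleftarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_{a-1}\Lambda^-_{b-1}[\partial])\Bigr)
\]
and we evaluate the homologies of the complex in the upper row by the following lemma (which is analogous to Lemma 4.7).

Lemma 4.8. Let $b > 0$ and $T(a,b)$ be the complex
\[
\Delta^-(\Lambda^+_3\Lambda^-_{b-1}[\partial]) \xleftarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_2\Lambda^-_{b-1}[\partial]) \xleftarrow{\ \Delta^+\ } \cdots \xleftarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_k\Lambda^-_{b-1}[\partial]) \xleftarrow{\ \Delta^+\ } \cdots \xleftarrow{\ \Delta^+\ } T(a,b)_a,
\]
where
\[
T(a,b)_a = \operatorname{Coker}\Bigl(\Delta^-(\Lambda^+_a\Lambda^-_{b-1}[\partial]) \xleftarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_{a-1}\Lambda^-_{b-1}[\partial])\Bigr).
\]
Then homologies of $T(a,b)$ are isomorphic to:
\[
\Lambda^{b-1},\ \Lambda^{b-2},\ \ldots,\ \Lambda^{b-k},\ \ldots,\ \Lambda^{b-1+a-3},
\]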
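and are represented by the elements linear in $\partial_i$, $i = 1, 2, 3$.

We leave it to the reader to prove the lemma by the same trick of changing $a$ and comparing the first and the second spectral sequences of the bicomplex in question. The condition $0 \le b \le a$ implies that no further differentials of the spectral sequence are non-zero. Thus $E^2 = E^\infty$ and this gives us the result in the D-case.

For the C-case the diagram representing $E^1(\widetilde G_{C^o}(a,b))$ is situated again in the first quarter and it is
\[
\begin{array}{ccccccc}
0 & \leftarrow\cdots\leftarrow & 0 & \leftarrow & 0 & \leftarrow\cdots\leftarrow & 0 \\
\vdots & \cdots & \vdots & & \vdots & \cdots & \vdots \\
0 & \leftarrow\cdots\leftarrow & 0 & \leftarrow & 0 & \leftarrow\cdots\leftarrow & 0 \\
\dfrac{\operatorname{Ker}(\Delta^+\Delta^-)}{\operatorname{Im}\Delta^-} & \leftarrow\cdots\leftarrow & \Delta^-(\Lambda^+_{k+1}\Lambda^-_b[\partial]) & \leftarrow & \Delta^-(\Lambda^+_k\Lambda^-_b[\partial]) & \leftarrow\cdots\leftarrow & \Delta^-(\Lambda^+_0\Lambda^-_b[\partial]).
\end{array}
\]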
Clearly the term at the lower-left corner is isomorphic to
\[
\frac{\operatorname{Ker}(\Delta^+\Delta^-)}{\operatorname{Im}\Delta^-}
= \operatorname{Ker}\Bigl(\Delta^-(\Lambda^+_a\Lambda^-_{b-1}[\partial]) \xleftarrow{\ \Delta^+\ } \Delta^-(\Lambda^+_{a-1}\Lambda^-_{b-1}[\partial])\Bigr)
\]
and the values of the horizontal homologies could be found from those of $T(0,b)$ given by Lemma 4.8. This closes the last case and completes the proof of Proposition 4.6.

Corollary 4.9. By interchanging $a$, $b$ in Proposition 4.6 we get a valid statement as well.

We are to change rows into columns and use the other spectral sequence of a bicomplex.

Given $n \in \mathbb{Z}$, $y \in \mathbb{C}$, let $P(n,y)$ be an irreducible $s\ell(2) \oplus g\ell(1)$-module with highest weight $(n,y)$ when $n \ge 0$, and $P(n,y) = 0$ when $n < 0$.

Theorem 4.10. There are the following isomorphisms of $\mathfrak g_0$-modules (sums below run over $0 \le i \le 3$):
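(a)
\[
H^{m,n}(G_{A^o}) =
\begin{cases}
\bigoplus_i \Lambda^i \otimes P(n-i,\ -\tfrac13 i - n) & \text{for } m = 0,\ n \ge 0,\\
\bigoplus_i \Lambda^i \otimes P(i-n-1,\ -\tfrac13 i - n + 1) & \text{for } m = 1,\ 0 \le n \le 3,\\
0 & \text{otherwise.}
\end{cases}
\]
(b)
\[
H^{m,n}(G_{B^o}) =
\begin{cases}
\bigoplus_i \Lambda^i \otimes P(-n+i,\ -\tfrac13 i - n + 2) & \text{for } m = 0,\ n \le 0,\\
0 & \text{otherwise.}
\end{cases}
\]
(c)
\[
H^{m,n}(G_{C^o}) =
\begin{cases}
\bigoplus_i \Lambda^i \otimes P(n+3-i,\ -\tfrac13 i - n - 3) & \text{for } m = 0,\ n \ge 0,\\
0 & \text{otherwise.}
\end{cases}
\]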
(d)
\[
H^{m,n}(G_{D^o}) =
\begin{cases}
\bigoplus_i \Lambda^i \otimes P(-n-3+i,\ -\tfrac13 i - n - 1) & \text{for } m = 0,\ n \le 0,\\
\bigoplus_i \Lambda^i \otimes P(n-i+2,\ -\tfrac13 i - n - 2) & \text{for } m = -1,\ -2 \le n \le 0,\\
0 & \text{otherwise.}
\end{cases}
\]
Proof. To prove the statements we use the decomposition (4.7) and then collect the information about homologies for various values of $a$, $b$ from Propositions 4.3, 4.6 and Corollaries 4.4, 4.5, 4.9, having in mind that $h_3$ acts on $G_{X^o}(a,b)$ as multiplication by $a - b$. The action of $Y$ has to be computed too. It is easy to compute it for $E^2$ by choosing a representative for a homology class. Then it descends to $E^\infty$, and because of the convergence of the spectral sequences we immediately determine the action on homologies. We leave the details to the reader.

We see that Theorem 4.10 provides us with the information about the homologies of the complexes $\operatorname{Gr} M_{X^o}$ (and hence $M_X$ for $X = AB, CD$). In the following Fig. 2 this information is presented graphically. The white circles mark those places $(m,n)$ where the homologies $H^{m,n}(\operatorname{Gr} M_{X^o})$ are zero, and black nodes mark those positions where they may be non-zero.

Proof of Proposition 3.3. If either $m \ge 2$ or $m = 1$, $n \ne 0, 1, 2$, then, by Theorem 4.10, $H^{m,n}(G_X) = 0$ for $X = A^o, B^o$. Equations (4.3) show that $H^{m,n}(G_{AB}) = 0$ under the same conditions on $m$, $n$, and hence $H^{m,n}(M_{AB}) = 0$ by Proposition 3.2. In the same way, using Theorem 4.10 and Eqs. (4.4), we prove the second part of the proposition.

For a linear map $\nabla$ we shall use the notations:
\[
\operatorname{Ker}(\nabla|M) = \operatorname{Ker}(\nabla : M \to M''),\quad
\operatorname{Im}(\nabla|M) = \operatorname{Im}(\nabla : M' \to M),\quad
\operatorname{Coker}(\nabla|M) = \operatorname{Coker}(\nabla : M' \to M).
\tag{4.27}
\]
[Figure: lattice of positions (m, n) arranged in four quadrants labelled A, B, C and D, with axes p, q and r; white circles and black nodes are as explained in the text.]
Fig. 2.
Proof. To prove that a morphism of filtered modules with a differential induces an isomorphism on homologies it is sufficient to show that it induces an isomorphism of the initial terms of their spectral sequences, and the initial terms are of the form $H(\operatorname{Gr} M_X)$. Because of (3.6) and (4.1), (4.2) it is enough to establish that $\nabla_3$ induces isomorphisms:
\[
H^{0,n}(G_{A^o}) \to H^{0,n-3}(G_{C^o}) \ \text{ for } n \ge 3,
\qquad
H^{0,n}(G_{B^o}) \to H^{0,n-3}(G_{D^o}) \ \text{ for } n \le 0.
\]
Theorem 4.10 shows that the corresponding homologies are indeed isomorphic $\mathfrak g_0$-modules, hence we should check only that $\nabla_3$ maps each of the highest weight vectors in $H^{0,n}(G_{A^o})$ (resp. $H^{0,n}(G_{B^o})$) to a non-zero element of $H^{0,n-3}(G_{C^o})$ (resp. $H^{0,n-3}(G_{D^o})$). The representatives for highest weight vectors are
\[
z_+^{n},\quad d_1^+ z_+^{n-1} z_-,\quad d_1^+ d_2^+ z_+^{n-2} z_-^{2},\quad d_1^+ d_2^+ d_3^+ z_+^{n-3} z_-^{3}
\]
for $H^{0,n}(G_{A^o})$, and
\[
\partial_-^{-n},\quad d_1^+ \partial_-^{-n-1}\partial_+,\quad d_1^+ d_2^+ \partial_-^{-n-2}\partial_+^{2},\quad d_1^+ d_2^+ d_3^+ \partial_-^{-n-3}\partial_+^{3}
\]
for $H^{0,n}(G_{B^o})$. Now it is immediate to see that the images are non-zero.
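Proposition 4.11. Morphisms $\nabla_4'$, $\nabla_4''$ induce isomorphisms:
\[
\nabla_4' : \operatorname{Coker}(\nabla|M_A^{0,2}) \longrightarrow \operatorname{Ker}(\nabla|M_D^{-1,0}),
\qquad
\nabla_4'' : \operatorname{Coker}(\nabla|M_A^{1,0}) \longrightarrow \operatorname{Ker}(\nabla|M_D^{0,-2}).
\]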
Proof. Again it is sufficient to establish that the maps induce isomorphisms of the initial terms of the spectral sequences, which are isomorphic $\mathfrak g_0$-modules because of Theorem 4.10. Thus again we have to check the images of the highest weight vectors of $H^{0,2}(G_{A^o})$ and $H^{1,0}(G_{A^o})$. For $H^{0,2}(G_{A^o})$ the representatives of the highest weight vectors are $z_+^2$, $d_1^+ z_+ z_-$, $d_1^+ d_2^+ z_-^2$,
and their images + + − + − + 6− , d1+ d1− d23 6 , d12 (d12 d3 + d1− d2+ d3− )6− , d123
are indeed non-zero in $H^{0,-2}(G_{D^o})$. Similarly for $H^{1,0}(G_{A^o})$ the representatives are $x_1$, $d_1^+ x_2 - d_2^+ x_1$, $d_1^+ d_2^+ x_3 + d_2^+ d_3^+ x_1 + d_3^+ d_1^+ x_2$. We use (2.17) and calculate in $G_D = \Lambda(\mathfrak g_{-1}) \otimes V_D$. We leave it to the reader to make the calculations and see that the results are indeed non-zero.

5. Homology of $M_X$ and Secondary Singular Vectors

We return to the spectral sequence for $M_X$, described in Sect. 3. We are particularly interested in eight cases when $X = A$ or $D$, which have been left unfinished in the previous section. The description of the initial term $E^0 = H(\operatorname{Gr} M_X)$ of the spectral sequences in these cases follows from Theorem 4.10 and was presented in Fig. 2. In this section we compute the subsequent terms of these spectral sequences and hence the homology.

Theorem 5.1. There are the following isomorphisms of $E(3,6)$-modules:
(i) $H^{0,0}(M_A) = \mathbb{C}$, $H^{1,2}(M_A) = 0$,
(ii) $H^{0,1}(M_A) \cong H^{1,1}(M_A) \cong I(0,0;1;-1)$,
(iii) $H^{0,0}(M_D) = 0$, $H^{-1,-2}(M_D) = \mathbb{C}$,
(iv) $H^{0,-1}(M_D) \cong H^{-1,-1}(M_D) \cong I(0,0;1;-1)$.
[Figure: the complex M drawn as a lattice of nodes in four quadrants labelled A, B, C and D, with axes p, q and r; a few nodes carry the special marks (star, diamond, spade) explained in the text.]
Fig. 3.
In the following we denote by $\nabla$ any of the morphisms $\nabla, \nabla_1, \ldots, \nabla_6$ (see Fig. 1), assuming that it is clear from the context which map is considered. From now on we combine the differential $E(3,6)$-modules $M_{AB}$ and $M_{CD}$ in one differential $E(3,6)$-module $M$ represented by Fig. 3. The $E(3,6)$-module $M$ equals the direct sum
\[
M = \bigoplus_{m,n} M_A^{m,n} \ \oplus\ \bigoplus_{m,n} M_B^{m,n} \ \oplus \bigoplus_{(m,n)\ne(0,0)} M_C^{m,n} \ \oplus \bigoplus_{(m,n)\ne(0,0)} M_D^{m,n},
\]
and the maps are the same as in Fig. 1 except for the map $\nabla : M_A^{1,1} \longrightarrow M_D^{-1,-1}$, which is defined to be equal to the composition:
\[
M_A^{1,1} \xrightarrow{\ \nabla\ } M_A^{0,0} = M_D^{0,0} \xrightarrow{\ \nabla\ } M_D^{-1,-1}.
\]
White nodes in Fig. 3 mark the positions where the kernel of the outgoing map equals the image of the incoming one (i.e. the corresponding homology of $M$ is zero). The black marks denote the places with non-zero homology. Namely, the star refers to the trivial module $\mathbb{C}$, the diamond to $I(0,0;1;-1)$ and the spade to the module $P = I(0,0;1;-1) \oplus \mathbb{C}$. The above description of the homologies follows from Theorem 5.1, Propositions 3.3, 4.11, and 5.27. Saying that $P$ is a direct sum we have taken into account that an extension $0 \longrightarrow \mathbb{C} \longrightarrow P \longrightarrow I(0,0;1;-1) \longrightarrow 0$ is necessarily a direct sum, because all eigenvalues of $Y$ on $I(0,0;1;-1)$ are strictly negative, but $Y$ acts on $\mathbb{C}$ with eigenvalue zero.

Corollary 5.2. The complex $M$ decomposes into a direct sum of $\mathbb{Z}$-graded complexes that are infinite in both directions and consist of generalized Verma modules $M_X^{m,n}$. Homologies of these complexes are all zero except those that appear at the terms $M_A^{1,1} \cong M(1,0;1;-\tfrac23)$, $M_D^{-1,-1} \cong M(0,1;1;\tfrac23)$ and $M_D^{-1,-2} \cong M(0,1;2;\tfrac53)$.

We use below the notations (4.27).

Corollary 5.3. For the $E(3,6)$-modules $M(p,q;r;y_X)$ corresponding to the white nodes of Fig. 3 (and for $M(0,0;0;0)$), $\operatorname{Ker}\nabla = \operatorname{Im}\nabla$ is a unique non-trivial submodule; it is isomorphic to the irreducible quotient of the previous module of the complex, and the quotient by it is isomorphic to $I(p,q;r;y_X)$. Equivalently, $\operatorname{Im}(\nabla|M_X^{m,n})$ is a unique non-trivial submodule of $M_X^{m,n}$ except when $M_X^{m,n} = M_A^{1,1}$, $M_D^{-1,-1}$, $M_D^{-1,-2}$. Thus, all degenerate generalized Verma modules $M(p,q;r;y_X)$, except for $M(1,0;1;y_A)$, $M(0,1;1;y_D)$ and $M(0,1;2;y_D)$, have a unique non-trivial submodule.

Corollary 5.4.
The only nontrivial submodules of the excluded above modules M(1, 0; 1; yA ), M(0, 1; 2; yD ) and M(0, 1; 1; yD ) are the following (i) Im ⊂ Ker for M(1, 0; 1; yA ), with subquotients respectively isomorphic to I (2, 0; 2; yA ), I (0, 0; 1; yA ), I (1, 0; 1; yA ), (ii) Im ⊂ Ker for M(0, 1; 2; yD ), with subquotients I (0, 0; 1; yD ), C, I (0, 1; 2; yD ), (iii) Si , i = 0, 1, 2, 3 for M(0, 1; 1; yD ), where , S0 = Im
S1 = Im ,
S2 = U (L)q+ + S0 ,
S3 = Ker ,
and $q_+$ is defined before Lemma 5.24. One has: $S_0 = S_1 \cap S_2$, $S_3 = S_1 + S_2$, $S_0 \cong I(1,0;1;y_A)$, $S_3/S_0 \cong \mathbb{C} \oplus I(0,0;1;y_A)$, $M(0,1;1;y_D)/S_3 \cong I(0,1;1;y_D)$.
The results of Corollary 5.4 will be established while proving Theorem 5.1. To prove Corollary 5.3 we need the following lemma.
Lemma 5.5. Let M ← M ← M be an exact sequence of highest weight modules over L with highest weight vectors m , m and m respectively. Suppose that (m) is a unique (up to a constant factor) non-trivial singular vector of M . Then Im = Ker is an irreducible L-submodule of M . Suppose also that (m ) is a unique (up to a constant factor) non-trivial singular vector of M . Then Im Coker = M / Im is irreducible too, thus Im is a unique non-trivial submodule in M . Proof. Due to exactness, the module Ker is a quotient of the highest weight module M . Hence, having by the conditions only trivial singular vectors, it is irreducible. Furthermore, by exactness we have the isomorphism of Coker and Im . Due to uniqueness of a non-trivial singular vector in M , we conclude, as above, that Im is irreducible. Proof of Corollary 5.3. We use Corollary 5.2 to provide exact sequences and Corollary 2.11 to ensure the uniqueness of non-trivial singular vectors. We also get the fact that Im(|MA0,0 ) is irreducible applying the first part of Lemma 5.5 to the sequence C ← MA0,0 ← MA1,1 . Then MA0,0 / Im(|MA0,0 ) = H 0,0 (MA ) C by 0,0 Theorem 5.1, so we get the statement for MA0,0 = MD . To prove Theorem 5.1 we are to compute the differentials in the spectral sequence {E i (MX ), (i) } for the homology H (MX ), X = A, D, and for this we need to consider representatives of various cycles. Definition 5.6. We say that an element s ∈ MXm,n represents the highest weight vector of H m,n (GX ) if 1. s is a non-zero g0 -highest weight vector (in MXm,n ), 2. the image [s] of s in Gr MX is a cycle, 3. [s] belongs to GX ⊂ Gr MX . It follows that the homology class of [s] in H m,n (GX ) is a g0 -highest weight vector. We say that s0 , . . . , sk ∈ MXm,n represent a basis of H m,n (GX ) if 1 . the span of s0 , . . . , sk is a g0 -submodule in MXm,n , 2 . all [si ] are cycles and belong to GX ⊂ Gr MX , 3 . { [si ] } give us a basis for H m,n (GX ). Definition 5.7. 
We say that s ∈ MXm,n represents a singular vector of H m,n (MX ) if 1. s is a non-zero g0 -highest weight vector, 2. s is a cycle in MX , 3. e0 s ∈ Im , e0 s ∈ Im .
Whenever c1 , c2 are cycles, the notation c1 ≡ c2 means c1 and c2 belong to the same homology class. Let us notice that due to Remark 3.1 the terms E i (MX )m,n are W -modules and clearly they are g0 -modules as well. Thus we may consider the action on these terms of the following subalgebra W of E(3, 6): W = W + g0 W ⊕ s(2).
(5.1)
W Given a g0 -module V with the trivial action of LW 1 = L1 we get an isomorphism:
S(g−2 ) ⊗C V IndW (V ). LW
(5.2)
0
We conclude that (3.5), (3.6) lead us to the following isomorphisms of W-modules: Gr MX IndW (=(g−1 ) ⊗ VX ), LW 0
H m,n (Gr MX ) IndW (H m,n (GX )). LW
(5.3)
0
Therefore we can speak of W-singular vectors in Gr MX or in H m,n (Gr MX ) (see [KR1] for general properties of singular vectors). Remark 5.8. Formulae (1.3) and (1.4) show that the central element xi ∂i ∈ g(3) ⊂ W3 corresponds to 23 Y ∈ g(3) ⊂ W ⊂ W. Because of Remark 5.8 we notice that the following g(3) ⊕ s(2)-modules are isomorphic to each other V = F (0, 1; 0; − 23 ) =2 ⊗ P (0, − 23 ) g−2 (C3 )∗ ⊗ C VD−1,0 = ∂1 , ∂2 , ∂3 . (5.4) Let T k = IndWW (=k V ) with V defined by the isomorphisms (5.4) above. It is clear L0
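that with respect to its $W_3$-module structure the module $T^k$ is the dual to the differential forms module $\Omega^k_3$. Hence dualizing the De Rham complex $\Omega^\bullet_3$ we get an exact sequence of W-modules:
\[
0 \leftarrow \mathbb{C} \xleftarrow{\ \sigma_0\ } T^0 \xleftarrow{\ \sigma_1\ } T^1 \xleftarrow{\ \sigma_2\ } T^2 \xleftarrow{\ \sigma_3\ } T^3 \leftarrow 0.
\tag{5.5}
\]
Denote $R = \operatorname{Ker}(\sigma_1) = \operatorname{Coker}(\sigma_3)$ and $Q = \operatorname{Ind}^{W}_{L_0^{W}}(\Lambda^0 \otimes P(1,-1)) = \operatorname{Ind}^{W}_{L_0^{W}} V_A^{0,1}$.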
Lemma 5.9. There are the following isomorphisms of W-modules:
(1) $E^0(M_A)^{0,0} \cong T^0$, $E^0(M_A)^{1,1} \cong T^1 \oplus Q$,
(2) $E^0(M_A)^{0,1} \cong T^2 \oplus Q$, $E^0(M_A)^{1,2} \cong T^3$,
(3) $E^0(M_D)^{-1,-2} \cong T^0$, $E^0(M_D)^{0,-1} \cong T^1 \oplus Q$,
(4) $E^0(M_D)^{-1,-1} \cong T^2 \oplus Q$, $E^0(M_D)^{0,0} \cong T^3$.

Statements (1) and (2) (resp. Statements (3), (4)) follow from Theorem 4.10 as soon as we take into account Eqs. (5.3) and accommodate to the new notations.

Proposition 5.10. There are the following isomorphisms of W-modules:
(1) $E^1(M_A)^{0,0} \cong \mathbb{C}$, $E^1(M_A)^{1,1} \cong R \oplus Q$,
(2) $E^1(M_A)^{0,1} \cong R \oplus Q$, $E^1(M_A)^{1,2} \cong 0$,
(3) the spectral sequence $\{E^i(M_A)\}$ degenerates at $E^1$, that is $E^1(M_A)^{m,n} \cong E^\infty(M_A)^{m,n}$.

To prove the proposition we shall calculate the differentials $\nabla^{(i)}$ starting with $\nabla^{(0)}$. We need first to find the representatives of the bases of $H^{m,n}(G_A) \subset E^0(M_A)$ for $(m,n) = (1,2), (0,1), (1,1), (0,0)$. This is done in the following lemmae.

Let $\zeta_i = d_i^- z_+ - d_i^+ z_-$.

Lemma 5.11. The element $s = \tfrac13(\alpha_- z_+^2 - \alpha_0 z_+ z_- + \alpha_+ z_-^2) \in M_A^{1,2}$, where
\[
\alpha_\pm = d_1^\pm d_2^\pm x_3 + d_2^\pm d_3^\pm x_1 + d_3^\pm d_1^\pm x_2,
\qquad
\alpha_0 = f_3\alpha_+ \ (= e_3\alpha_-),
\]
represents the unique highest weight vector of $H^{1,2}(G_A)$, and $\operatorname{Ind}^{W}_{L_0^{W}}\langle s\rangle \cong T^3$.
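Proof. Clearly $e_3\alpha_- = \alpha_0$, $e_3\alpha_0 = 2\alpha_+$, $e_3\alpha_+ = 0$, hence
\[
e_3 s = \tfrac13\bigl(\alpha_0 z_+^2 - \alpha_0 z_+^2 - 2\alpha_+ z_+ z_- + 2\alpha_+ z_+ z_-\bigr) = 0.
\]
Similarly $f_3 s = 0$ and $s\ell(3)s = 0$. Also $Y \cdot s = (-2)s$ and
\[
\nabla s = \hat\partial_1\zeta_1 + \hat\partial_2\zeta_2 + \hat\partial_3\zeta_3.
\tag{5.6}
\]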
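The vanishing of $e_3 s$ and $f_3 s$ can be re-checked mechanically. The following is a minimal sketch, not the authors' computation: it assumes that $\alpha_-, \alpha_0, \alpha_+, z_+, z_-$ may be treated as commuting symbols, that $e_3$ and $f_3$ act as derivations, that $e_3 z_- = z_+$, $e_3 z_+ = 0$, $f_3 z_+ = z_-$, $f_3 z_- = 0$, and that $f_3\alpha_0 = 2\alpha_-$ (the last relation is assumed by symmetry with $e_3\alpha_0 = 2\alpha_+$). The names `E3`, `F3`, `mono` and `derive` are hypothetical helpers.

```python
from fractions import Fraction

# Assumed action of e3 and f3 on the generators (see the lead-in):
# each generator maps to a list of (coefficient, generator) pairs.
E3 = {"am": [(1, "a0")], "a0": [(2, "ap")], "ap": [], "zm": [(1, "zp")], "zp": []}
F3 = {"ap": [(1, "a0")], "a0": [(2, "am")], "am": [], "zp": [(1, "zm")], "zm": []}


def mono(*generators):
    """A commutative monomial, stored as a sorted tuple of generator names."""
    return tuple(sorted(generators))


def derive(poly, rule):
    """Apply a derivation (Leibniz rule) to a polynomial {monomial: coefficient}."""
    result = {}
    for monomial, coeff in poly.items():
        for pos, gen in enumerate(monomial):
            for c, image in rule[gen]:
                new = mono(*monomial[:pos], image, *monomial[pos + 1:])
                result[new] = result.get(new, 0) + coeff * c
    # Drop monomials whose coefficients cancelled.
    return {m: c for m, c in result.items() if c != 0}


third = Fraction(1, 3)
# s = (1/3) (alpha_- z_+^2 - alpha_0 z_+ z_- + alpha_+ z_-^2)
s = {
    mono("am", "zp", "zp"): third,
    mono("a0", "zp", "zm"): -third,
    mono("ap", "zm", "zm"): third,
}

print(derive(s, E3), derive(s, F3))
```

Both results are empty dictionaries, which is the statement $e_3 s = f_3 s = 0$ under the stated assumptions.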
We see that s ∈ F2 MA . On the other hand s ∈ F2 MA , therefore [s] = 0 by the definition of the differential of Gr MA . That means [s] is a cycle and it evidently generates a one-dimensional subspace C s = H 1,2 (GA ) =0 ⊗ P (0, −2) in H 1,2 (Gr MA ). It follows from the lemma that the image of [s] in H 1,2 (Gr MA ) = E 0 (MA )1,2 is the trivial W-singular vector generating E 0 (MA )1,2 T 3 . Lemma 5.12. Vectors z+ , ζ1 ∈ MA0,1 represent the highest weight vectors of H 0,1 (GA ), and {z± , ζi } represent the basis of H 0,1 (GA ). Also IndWW z+ , z− Q and IndWW ζi , i L0
= 1, 2, 3
L0
T 2.
This is easy to check, and we leave it to the reader.

Remark 5.13. Considering $E^0(M_A)^{0,1} = H^{0,1}(M_A)$ as a W-module, we conclude that $[z_+]$ is a W-singular vector that generates a submodule $Q$ and $[\zeta_1]$ is a W-singular vector that generates a W-submodule isomorphic to $T^2$. Of course $z_+$ also represents an L-singular vector of $H^{0,1}(M_A)$.

Let $C_e(u_\cdot v_\cdot w_\cdot) = \sum_{\text{even}} u_i v_j w_k$ and $C_o(u_\cdot v_\cdot w_\cdot) = \sum_{\text{odd}} u_i v_j w_k$ be the sums over all even and odd permutations respectively, where we keep the order of the letters but permute the indices, for example,
\[
C_e(d_\cdot^+ d_\cdot^- d_\cdot^+) = d_1^+ d_2^- d_3^+ + d_2^+ d_3^- d_1^+ + d_3^+ d_1^- d_2^+,
\qquad
C_e(d_\cdot^+ d_\cdot^- x_\cdot) = d_1^+ d_2^- x_3 + d_2^+ d_3^- x_1 + d_3^+ d_1^- x_2.
\]
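To make the index convention concrete, here is a small sketch that lists the monomials entering the even and odd sums. The function names `C_e`, `C_o` and the string encoding of the letters are illustrative choices, not notation from the paper.

```python
from itertools import permutations


def parity(perm):
    """Return 0 for an even permutation of (1, 2, 3), 1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2


def permutation_terms(letters, want_even):
    """Monomials u_i v_j w_k with (i, j, k) running over permutations of the given parity."""
    terms = []
    for perm in permutations((1, 2, 3)):
        if (parity(perm) == 0) == want_even:
            # Keep the order of the letters, permute only the indices.
            terms.append(" ".join(f"{letter}{index}" for letter, index in zip(letters, perm)))
    return terms


def C_e(letters):
    return permutation_terms(letters, True)


def C_o(letters):
    return permutation_terms(letters, False)


print(" + ".join(C_e(("d+", "d-", "d+"))))
print(" + ".join(C_o(("d+", "d-", "d+"))))
```

The first printed line reproduces the three terms of the example for the even sum (indices 123, 231, 312); the second lists the three odd terms (indices 132, 213, 321).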
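By the definition of $\nabla$ one has: $\nabla C_e(d_\cdot^+ d_\cdot^- x_\cdot) z_\pm = C_e(d_\cdot^+ d_\cdot^- d_\cdot^\pm)$ and $\nabla C_o(d_\cdot^+ d_\cdot^- x_\cdot) z_\pm = C_o(d_\cdot^+ d_\cdot^- d_\cdot^\pm)$. Also
\[
C_e(d_\cdot^+ d_\cdot^+ d_\cdot^-) + C_o(d_\cdot^+ d_\cdot^+ d_\cdot^-) = 0
\quad\text{and}\quad
C_e(d_\cdot^+ d_\cdot^+ x_\cdot) + C_o(d_\cdot^+ d_\cdot^+ x_\cdot) = 0
\]
because $[d_i^+, d_j^+] = 0$, and similarly for the pair of minuses. The relation $[d_i^+, d_j^-] + [d_j^+, d_i^-] = 0$ implies
\[
C_e(d_\cdot^+ d_\cdot^- x_\cdot) + C_o(d_\cdot^- d_\cdot^+ x_\cdot) + C_o(d_\cdot^+ d_\cdot^- x_\cdot) + C_e(d_\cdot^- d_\cdot^+ x_\cdot) = 0,
\tag{5.7}
\]
\[
C_e(d_\cdot^+ d_\cdot^- d_\cdot^\pm) + C_o(d_\cdot^- d_\cdot^+ d_\cdot^\pm) + C_o(d_\cdot^+ d_\cdot^- d_\cdot^\pm) + C_e(d_\cdot^- d_\cdot^+ d_\cdot^\pm) = 0,
\]
hence
\[
C_o(d_\cdot^+ d_\cdot^- d_\cdot^+) + C_e(d_\cdot^+ d_\cdot^- d_\cdot^+) = 0.
\tag{5.8}
\]
One easily checks that
\[
C_e(d_\cdot^+ d_\cdot^+ d_\cdot^-) = -C_o(d_\cdot^+ d_\cdot^- d_\cdot^+) - (\hat\partial_1 d_1^+ + \hat\partial_2 d_2^+ + \hat\partial_3 d_3^+),
\tag{5.9}
\]
\[
C_e(d_\cdot^- d_\cdot^+ d_\cdot^+) = -C_o(d_\cdot^+ d_\cdot^- d_\cdot^+) + (\hat\partial_1 d_1^+ + \hat\partial_2 d_2^+ + \hat\partial_3 d_3^+),
\tag{5.10}
\]
hence
\[
C_e(d_\cdot^- d_\cdot^+ d_\cdot^+) + C_e(d_\cdot^+ d_\cdot^+ d_\cdot^-) = -2C_o(d_\cdot^+ d_\cdot^- d_\cdot^+) = 2C_e(d_\cdot^+ d_\cdot^- d_\cdot^+).
\tag{5.11}
\]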
Lemma 5.14. The following elements t± , τ1 , τ2 , τ3 ∈ MA1,1 represent a basis of H 1,1 (GA ): t+ = Ce (d·+ d·+ x· ) z− − Ce (d·+ d·− x· )z+ + Co (d·+ d·− x· )z+ − Co (d·− d·+ x· )z+ ,
t− = −Ce (d·− d·− x· ) z+ + Ce (d·− d·+ x· )z− − Co (d·− d·+ x· )z− + Co (d·+ d·− x· )z− , τ1 = d2+ x3 z− + d3− x2 z+ ,
τ2 = d3+ x1 z− + d1− x3 z+ , τ3 = d1+ x2 z− + d2− x1 z+ .
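Here we have isomorphisms
\[
\operatorname{Ind}^{W}_{L_0^{W}}\langle t_+, t_-\rangle \cong Q,
\qquad
\operatorname{Ind}^{W}_{L_0^{W}}\langle \tau_i,\ i = 1,2,3\rangle \cong T^1.
\]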
This is also not difficult to prove. Theorem 4.10 shows us the weight subspaces where the representatives should be, and we only have to compute the cycles and boundaries in these subspaces. We leave it to the reader to fill in the details.

Remark 5.15. The elements $[t_+]$, $[\tau_3]$ of $E^0(M_A)^{1,1} = H^{1,1}(M_A)$ are W-singular vectors; the first one generates the W-submodule isomorphic to $Q$, the second generates the submodule isomorphic to $T^1$. At the same time $t_+$ represents a singular vector of $H^{1,1}(M_A)$, which is not a singular but a secondary singular vector of $M_A^{1,1} \cong M(1,0;1;y_A)$.

Proof of Proposition 5.10. We are now ready to calculate the differential $\nabla^{(0)} : E^0(M_A)^{1,2} \to E^0(M_A)^{0,1}$. From (5.6) it follows that
\[
\nabla^{(0)}[s] = \hat\partial_1[\zeta_1] + \hat\partial_2[\zeta_2] + \hat\partial_3[\zeta_3].
\tag{5.12}
\]
Using Lemmae 5.11, 5.12 and comparing with (5.5) we see immediately that
\[
\operatorname{Ker}(\nabla^{(0)}|E^0(M_{AB})^{1,2}) = 0
\quad\text{and}\quad
\operatorname{Coker}(\nabla^{(0)}|E^0(M_{AB})^{0,1}) = Q \oplus R.
\]
This proves (2). For the differential $\nabla^{(0)} : E^0(M_A)^{1,1} \to E^0(M_A)^{0,0}$ we need to know $\nabla t_\pm$, $\nabla\tau_i$. Clearly
\[
\nabla t_+ = C_e(d_\cdot^+ d_\cdot^+ d_\cdot^-) - C_e(d_\cdot^+ d_\cdot^- d_\cdot^+) + C_o(d_\cdot^+ d_\cdot^- d_\cdot^+) - C_o(d_\cdot^- d_\cdot^+ d_\cdot^+) = 0
\tag{5.13}
\]
because of (5.8) and (5.11). Now $\nabla t_- = \nabla(f_3 t_+) = f_3(\nabla t_+) = 0$. And $\nabla\tau_3 = [d_1^+, d_2^-] = -\hat\partial_3$. Similarly $\nabla\tau_i = -\hat\partial_i$ for the other values of $i$. Keeping in mind that $E^0(M_A)^{0,0} = H^{0,0}(M_A) = T^0$ and Lemma 5.14 we conclude that
\[
\operatorname{Ker}(\nabla^{(0)}|E^0(M_A)^{1,1}) = Q \oplus R
\quad\text{and}\quad
\operatorname{Coker}(\nabla^{(0)}|E^0(M_A)^{0,0}) = \mathbb{C},
\]
which gives us (1). It is clear that $\nabla^{(i)} : E^i(M_A)^{1,2} \to E^i(M_A)^{0,1}$ is zero for any $i \ge 1$. Also we see that if any of the differentials $\nabla^{(i)} : E^i(M_A)^{1,1} \to E^i(M_A)^{0,0}$ happens to be non-zero then $E^\infty(M_A)^{0,0} = 0$, but this is impossible because $\operatorname{Im}(\nabla|M_A^{0,0}) \ne M_A^{0,0}$. This proves (3).

Remark 5.16. Proposition 5.10 shows that the W-modules $E^\infty(M_A)^{0,1}$ and $E^\infty(M_A)^{1,1}$ are isomorphic. We would like to have an isomorphism $\varphi : H^{0,1}(M_A) \to H^{1,1}(M_A)$ that would be not only an isomorphism of W-modules but also a morphism (hence isomorphism) of L-modules. Let us notice that the elements
\[
\theta_1 = \hat\partial_3\tau_2 - \hat\partial_2\tau_3,\qquad
\theta_2 = \hat\partial_1\tau_3 - \hat\partial_3\tau_1,\qquad
\theta_3 = \hat\partial_2\tau_1 - \hat\partial_1\tau_2
\tag{5.14}
\]
represent a basis for the subspace of trivial singular vectors in the W-submodule $\operatorname{Ker}\nabla^{(0)} \cap (U(W)[\tau_i]) \subset E^0(M_A)^{1,1}$, which is isomorphic to $R$. Also the elements $\zeta_i$ represent the basis of the subspace of trivial singular vectors of the W-submodule $\operatorname{Coker}(\nabla^{(0)}|E^0(M_A)^{0,1})$, which is isomorphic to $R$ as well. A similar pair of bases of singular vectors for W-modules which are isomorphic to $Q$ is $[t_\pm]$ and $[z_\pm]$. It holds in $H^{0,1}(M_A)$ that $d_1^-[z_+] - d_1^+[z_-] = [\zeta_1]$. On the other hand it holds "in $M_A^{1,1}$ modulo boundaries" that
\[
d_1^-[t_+] - d_1^+[t_-] \equiv 4(\hat\partial_3[\tau_2] - \hat\partial_2[\tau_3]) \equiv 4[\theta_1]
\tag{5.15}
\]
(which could be obtained by a straightforward but unfortunately hard computation). Thus if we define a W-isomorphism $\varphi : H^{0,1}(M_A) \to H^{1,1}(M_A)$ by the conditions $\varphi([z_\pm]) = \tfrac14[t_\pm]$, $\varphi([\zeta_i]) = [\theta_i]$, we get a map which is $L_-$-linear. As it also maps the singular vector $[z_+]$ to the singular vector $\tfrac14[t_+]$, it is an L-homomorphism by the arguments of Remark 2.4(b). Clearly we got the needed isomorphism of the $E(3,6)$-modules.
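Proposition 5.17. There are the following isomorphisms of W-modules:
(1) $E^1(M_D)^{-1,-2} \cong \mathbb{C}$, $E^1(M_D)^{0,-1} \cong R \oplus Q$,
(2) $E^1(M_D)^{-1,-1} \cong R \oplus Q$, $E^1(M_D)^{0,0} \cong 0$.
(3) The spectral sequence $\{E^i(M_D)\}$ degenerates at $E^1$, that is $E^1(M_D)^{m,n} \cong E^\infty(M_D)^{m,n}$.

In order to prove this proposition, we shall use representatives for the bases of $H^{m,n}(G_D)$ and calculate the differentials $\nabla^{(i)}$, but first we have to introduce some notations and prove several lemmas.

Consider the associative algebra $U(L_-) \otimes_{\mathbb{C}} V_D \cong M_D$. Allowing ourselves a slight abuse of notations we denote $\hat\partial_i := \partial_i \otimes 1$, $d_i^\pm := d_i^\pm \otimes 1$, and $\partial_i := 1 \otimes \partial_i$. Furthermore we consider $\Delta^\pm = d_1^\pm\partial_1 + d_2^\pm\partial_2 + d_3^\pm\partial_3$ and define
\[
\hat\Delta^\pm = \hat\partial_1 d_1^\pm + \hat\partial_2 d_2^\pm + \hat\partial_3 d_3^\pm.
\]
The following two lemmae are straightforward to check. (For odd elements $x$, $y$ from $M_D$ we mean $[x,y] = xy + yx$ as usual.)

Lemma 5.18.
\[
[d_i^\pm, \Delta^\pm] = 0,
\]
\[
[d_1^+, \Delta^-] = -[d_1^-, \Delta^+] = \hat\partial_2\partial_3 - \hat\partial_3\partial_2,\quad
[d_2^+, \Delta^-] = -[d_2^-, \Delta^+] = \hat\partial_3\partial_1 - \hat\partial_1\partial_3,\quad
[d_3^+, \Delta^-] = -[d_3^-, \Delta^+] = \hat\partial_1\partial_2 - \hat\partial_2\partial_1.
\]
Lemma 5.19. $[d_i^\pm, \hat\Delta^\varepsilon] = 0$ for $\varepsilon = -$ or $+$.

It follows from these two lemmae that, whatever $\varepsilon$, one has
\[
[\Delta^\pm, \hat\Delta^\varepsilon] = 0,\qquad
[\hat\Delta^\pm, \hat\Delta^\varepsilon] = 0,\qquad
[\Delta^\pm, \Delta^\varepsilon] = 0.
\tag{5.16}
\]
Let us supplement the notations introduced in (2.14), (2.15) defining $d := d_1^- d_2^- d_3^-$. Clearly $f_3 a = b$, $f_3 b = 2c$, $f_3 c = 3d$, $f_3 d = 0$, $e_3 d = c$, $e_3 c = 2b$, $e_3 b = 3a$, $e_3 a = 0$. It follows from Lemma 5.19 that
\[
[a, \hat\Delta^\pm] = 0,\qquad [b, \hat\Delta^\pm] = 0,\qquad [c, \hat\Delta^\pm] = 0,\qquad [d, \hat\Delta^\pm] = 0.
\]
Lemma 5.20.
(1) $[a, \Delta^-] = \hat\Delta^+\Delta^+ = -\Delta^+\hat\Delta^+$,
 $[b, \Delta^-] = \hat\Delta^-\Delta^+ + \hat\Delta^+\Delta^- = \hat\Delta^+\Delta^- - \Delta^+\hat\Delta^-$,
 $[c, \Delta^-] = \hat\Delta^-\Delta^- = -\Delta^-\hat\Delta^-$,
 $d\Delta^- = 0$, $\Delta^- d = 0$,
(2) $a\Delta^+ = 0$, $\Delta^+ a = 0$,
 $[b, \Delta^+] = \Delta^+\hat\Delta^+ = -\hat\Delta^+\Delta^+$,
 $[c, \Delta^+] = \Delta^-\hat\Delta^+ + \Delta^+\hat\Delta^- = \Delta^+\hat\Delta^- - \hat\Delta^+\Delta^-$,
 $[d, \Delta^+] = \Delta^-\hat\Delta^- = -\hat\Delta^-\Delta^-$,
(3) $0 = \Delta^+ a = \Delta^- a + \Delta^+ b = \Delta^- b + \Delta^+ c = \Delta^- c + \Delta^+ d = \Delta^- d$,
 $0 = a\Delta^+ = a\Delta^- + b\Delta^+ = b\Delta^- + c\Delta^+ = c\Delta^- + d\Delta^+ = d\Delta^-$,
 and exactly the same for $\hat\Delta^\pm$ (with a hat).

Proof. We can check that
\[
[a, d_i^-] = \hat\Delta^+ d_i^+,\qquad [d, d_i^+] = -\hat\Delta^- d_i^-,
\tag{5.17}
\]
and $[a, \Delta^-] = \hat\Delta^+\Delta^+$, $[d, \Delta^+] = \Delta^-\hat\Delta^-$ follows. Now we apply $f_3$ and get the rest in (1), or apply $e_3$ and get the rest in (2). Similarly $\Delta^+ a = a\Delta^+ = 0$, $\hat\Delta^+ a = a\hat\Delta^+ = 0$, then we apply $f_3$ and get (3).

Lemma 5.21. Consider the element
\[
\xi = a\Delta^-\partial_+^2 + b\Delta^-\partial_+\partial_- + c\Delta^-\partial_-^2 \in M(0,1;2;y_D).
\]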
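It has the following properties: (a) $\mathfrak g_0 \cdot \xi = 0$, (b) $e_0' \cdot \xi = 0$, (c) $e_0 \cdot \xi \in \operatorname{Im}\nabla$, but $e_0 \cdot \xi \ne 0$, and $\nabla\xi = 0$, (d) $\xi$ represents a basis for $H^{-1,-2}(G_{CD})$. Here $\operatorname{Ind}^{W}_{L_0^{W}}\langle\xi\rangle \cong T^0$.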
Proof. The proof of (a) is immediate. We get (b) by a straightforward calculation. To get (c) it is enough to notice that e0 ξ = −2(d2− d3+ + d3+ d3− )∂+ − 4d2− d3− ∂− = −2(d2− d3+ + d3+ d3− )∂+ − 4d2− d3− ∂− , and that 3 2 ξ = a6− 6+ ∂+ + (a6− 6− + b6− 6+ )∂+ ∂−
2 3 + (b6− 6− + c6− 6+ )∂+ ∂− + c6− ∂− = 0,
as it follows from (5.16) and Lemma 5.20(3). For (d) we are to prove that ξ ∈ Im so [ξ ] = 0 in H −1,−2 (GD o ) C. If ξ ∈ Im then H −1,−2 (MD ) = 0, and this would imply the existence of a non-zero differential S(g−2 ) ⊗ H 0,−1 (GD ) −→ S(g−2 ) ⊗ H −1,−2 (GD ) in the spectral sequence for H (MD ) with its image containing [ξ ]. This is impossible because Y -eigenvalue of ξ is 0, but all eigenvalues of Y on S(g−2 ) ⊗ H 0,−1 (GD ) are negative as it follows from the description of the homologies in Theorem 4.10. Corollary 5.22. ξ is a secondary singular vector of the E(3, 6)-module M(0, 1; 2; yD ).
Lemma 5.23.
(1) $ac = ca = a\hat\Delta^- = -\hat\Delta^- a$, $ab = ba = 0$,
(2) $db = bd = -d\hat\Delta^+ = \hat\Delta^+ d$, $dc = cd = 0$,
(3) $ad + da = b\hat\Delta^- = \hat\Delta^+ c$, $2ad + bc = da$.

Proof. It is not difficult to check that $ac = a\hat\Delta^-$ and $ca = -\hat\Delta^- a$. Then $ab = e_3\,ac = e_3\,a\hat\Delta^- = a\hat\Delta^+ = 0$, and we get (1). Now let
\[
\lambda = 2ad + c\hat\Delta^+ = 2ad - b\hat\Delta^-.
\]
Then $e_3\lambda = 2ac + 2b\hat\Delta^+ = 0$ because of (1) and Lemma 5.20(3). As the $s\ell(2)$-weight of $\lambda$ is zero, so $f_3\lambda = 0$, which gives us $2bd - 2c\hat\Delta^- = 0$. The latter means $bd = c\hat\Delta^- = -d\hat\Delta^+$. At the same time $\lambda = -2da + b\hat\Delta^-$ because $\lambda$ is invariant under the Weyl reflection in $SL(2)$, and this implies the first part of (3). We now get the rest of (2) from $e_3\lambda = 0$ and the rest of (3) applying $f_3$ to $ac = a\hat\Delta^-$.

Let
\[
q_+ = \tfrac13(-2a\Delta^-\partial_+ - b\Delta^-\partial_-),\qquad
q_- = \tfrac13(c\Delta^+\partial_+ + 2d\Delta^+\partial_-),\qquad
\kappa_i = d_i^- q_+ - d_i^+ q_-.
\]
Lemma 5.24.
(1) $e_3 q_+ = 0$, $e_0' q_+ = 0$, $\nabla q_+ = 0$, $e_0 q_+ = \nabla(-2(d_2^+ d_3^- + d_2^- d_3^+)) \ne 0$,
(2) vectors $q_+$, $\kappa_1$ represent the highest weight vectors of $H^{-1,-1}(G_D)$, and $q_+$, $q_-$, $\kappa_1$, $\kappa_2$, $\kappa_3$ represent a basis of $H^{-1,-1}(G_D)$,
(3) $q_+$ represents the (highest weight) singular vector of $H^{-1,-1}(M_D)$,
(4) $q_+$ is the secondary (highest weight) singular vector in $M(0,1;1;y_D)$.
We have isomorphisms
\[
\operatorname{Ind}^{W}_{L_0^{W}}\langle q_+, q_-\rangle \cong Q,
\qquad
\operatorname{Ind}^{W}_{L_0^{W}}\langle \kappa_i,\ i = 1,2,3\rangle \cong T^2.
\]
We leave it to the reader to check the calculations. The rest follows.

Lemma 5.25. The element $\lambda = 2ad + c\hat\Delta^+ = -2da + b\hat\Delta^- \in M_D^{0,0}$ represents a basis for $H^{0,0}(G_D)$, the singular vector of $H^{0,0}(M_D)$, $\operatorname{Ind}^{W}_{L_0^{W}}\langle\lambda\rangle \cong T^3$, and
− q+ − 6 + q− = ∂ˆ1 κ1 + ∂ˆ2 κ2 + ∂ˆ3 κ3 . λ = 6 Proof. Let us start with checking the last statement: + 6 + ∂+ + c 6 + 6 − ∂− λ = 2ad6+ ∂+ + c6 − ∂+ − b 6 − 6 + ∂+ + c 6 + 6− ∂ − = 2a6− 6 − ∂+ + c 6 + 6 − ∂− = a6− 6 − ∂+ + d 6 + 6+ ∂ − . = a6− 6
On the other hand + q− = − q+ − 6 6 = =
1 3 1 3 1 3
− (−2a6− ∂+ − b6− ∂− ) − 6 + (c6+ ∂+ + 2d6+ ∂− ) 6 + c6+ )∂+ + (−6 − b6− + 2d 6 + 6+ )∂− − a6− + 6 (−26 + 6 + ∂− . − 6− ∂+ + 3d 6 3a 6
Hence we get the equality. It implies that $\nabla[\lambda] = 0$ for $[\lambda] \in \operatorname{Gr} M_D$, so $[\lambda]$ is a cycle. The rest is immediate.

Let
\[
r_+ = a(d\,\partial_- + c\,\partial_+),\qquad r_- = d(b\,\partial_- + a\,\partial_+),
\]
and
\[
\rho_1 = (d_2^+ d_3^+ d\,\partial_- + d_2^- d_3^- a\,\partial_+),\quad
\rho_2 = (d_3^+ d_1^+ d\,\partial_- + d_3^- d_1^- a\,\partial_+),\quad
\rho_3 = (d_1^+ d_2^+ d\,\partial_- + d_1^- d_2^- a\,\partial_+).
\]
Lemma 5.26. Elements $r_\pm$, $\rho_i \in M_D^{0,-1}$ have the properties:
(1) the vector $r_+$ coincides (up to a sign) with the non-trivial singular vector of $M(0,0;1;y_D)$ defined in (2.18): $r_+ = -w_3$,
(2) vectors $r_+$, $\rho_1$ represent the highest weight vectors of $H^{0,-1}(G_D)$, and $r_+$, $r_-$, $\rho_1$, $\rho_2$, $\rho_3$ represent the basis of $H^{0,-1}(G_D)$,
(3) $d_1^- r_+ - d_1^+ r_- = \hat\partial_3\rho_2 - \hat\partial_2\rho_3$,
(4) $\nabla\rho_3 \equiv -\tfrac13\hat\partial_3\xi$ modulo boundaries in $M_D^{-1,-2}$.
Here
\[
\operatorname{Ind}^{W}_{L_0^{W}}\langle r_+, r_-\rangle \cong Q
\quad\text{and}\quad
\operatorname{Ind}^{W}_{L_0^{W}}\langle \rho_i,\ i = 1,2,3\rangle \cong T^1.
\]
Equation (2.18) written in our notations is $w_3 = (da + \hat\Delta^- b)\partial_- + \hat\Delta^- a\,\partial_+$. (1) now follows from Lemma 5.23. The calculations needed to check the rest of the statements are quite straightforward, and we leave them to the reader.

Proof of Proposition 5.17. As we compare Eqs. (5.6) with Lemma 5.25 and Eqs. (5.15) with Lemma 5.26(3), we notice that the differential $\nabla^{(i)}$ for $E^i(M_D)$ is given by formulae similar to those for $\nabla^{(i)}$ on $E^i(M_A)$. Thus it is possible to repeat the arguments of the proof of Proposition 5.10 and conclude that the values of homologies are as stated.

Proposition 5.27.
(1) If $f \in M_A^{0,1}$ and $f = f'z_+ + f''z_-$, where $f', f'' \in U(L_-)$, then $\nabla_6 f = -(f'r_+ + f''r_-)$,
(2) $\nabla_6 \cdot \nabla = 0$,
(3) the morphism $\operatorname{Coker}(\nabla|M_A^{0,1}) \to \operatorname{Ker}(\nabla|M_D^{0,-1})$ defined by $\nabla_6$ is an isomorphism.

Proof. Evidently the morphism given by the formula in (1) maps $z_+$ to $w_3$, and commutes with $\mathfrak g_0$ and $L_-$, hence it is a morphism of L-modules, thus equal to $\nabla_6$. To show that $\nabla_6 \cdot \nabla = 0$ it is enough to check that $\nabla_6$ annihilates the vector $d_i^+ z_+$, which generates $\operatorname{Im}\nabla$ over $U(L)$. For it we have $\nabla_6\, d_i^+ z_+ = d_i^+ r_+ = d_i^+ a(d\,\partial_- + c\,\partial_+) = 0$ because $d_i^+ a = 0$. The last statement follows.
Corollary 5.28. $H^{0,1}(M_A) \cong \operatorname{Ker}(\nabla_6|M_D^{0,-1})$ is an irreducible module, isomorphic to $I(0,0;1;-1)$.

Proof. The submodule $\operatorname{Ker}(\nabla|M_D^{0,-1})$ is generated by the singular vector $w_3$ and, by Theorem 2.10, has no other singular vectors, hence (by Proposition 1.3h from [KR1]) it is irreducible. As it is isomorphic to $H^{0,1}(M_A)$, it has to be a quotient of $M(0,0;1;-1)$, thus $I(0,0;1;-1)$.

Proof of Theorem 5.1. The statement (i) of the theorem follows from Proposition 5.10 and (iii) follows from Proposition 5.17. From Remark 5.16 we know that $H^{0,1}(M_A) \cong H^{1,1}(M_A)$, and Corollary 5.28 shows that $H^{0,1}(M_A) \cong I(0,0;1;-1)$. Again, as we compare Eqs. (5.15) with Lemma 5.26(3), we see that the arguments of Remark 5.16 can be applied to $M_D^{-1,-1}$, $M_D^{0,-1}$ as well. This shows us that $H^{0,-1}(M_D) \cong H^{-1,-1}(M_D) \cong H^{0,1}(M_A)$. That is all we need.
6. The Size of $I(p,q;r;y)$

We define the character of an $E(3,6)$-module $V$ by the usual formula: $\operatorname{ch} V = \operatorname{tr}_V t^{-3Y}$. If $V$ is a degenerate highest weight module, then $\operatorname{ch} V$ is a Laurent series in $t$, whose coefficients are dimensions of eigenspaces of the operator $-3Y$. Since, as a $\mathfrak g_0$-module,
\[
M(p,q;r;y) = S(\mathfrak g_{-2}) \otimes \Lambda(\mathfrak g_{-1}) \otimes F(p,q;r;y)
\tag{6.1}
\]
and the eigenvalues of $-3Y$ on $\mathfrak g_{-1}$ and $\mathfrak g_{-2}$ are 1 and 2 respectively, we obtain:
\[
\operatorname{ch} M(p,q;r;y) = t^{-3y}\,\dim F(p,q;r;y)\,R(t),
\tag{6.2}
\]
where $R(t) = (1+t)^6/(1-t^2)^3$.

Next, let us compute the characters of the modules $I(p,0;r;y_A)$. By Corollary 5.2, we have an exact sequence
\[
0 \leftarrow I(p,0;r;\tfrac23 p - r) \leftarrow M(p,0;r;\tfrac23 p - r) \leftarrow M(p+1,0;r+1;\tfrac23 p - r - \tfrac13) \leftarrow \cdots,
\]
unless $p = r = 0$ or $1$. Hence, using (6.2), we obtain:
\[
\operatorname{ch} I(p,0;r;\tfrac23 p - r) = t^{3r-2p} R(t) \sum_{j=0}^{\infty} (-1)^j\,\frac{(j+p+2)(j+p+1)(j+r+1)}{2}\,t^j
\tag{6.3}
\]
unless $p = r = 0$ or $1$. In order to sum this series for $|t| < 1$, we use the identity
\[
\sum_{j=0}^{\infty} (-1)^j \binom{j+m}{m} t^j = \frac{1}{(1+t)^{m+1}}.
\tag{6.4}
\]
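Identity (6.4) is the standard binomial series, and it can be sanity-checked on truncated power series. The sketch below (the function name `check_identity` and the truncation order are illustrative choices) multiplies the truncated left-hand side by $(1+t)^{m+1}$ and confirms that the product is $1$ up to the chosen order.

```python
from math import comb


def check_identity(m, order=12):
    """Check sum_j (-1)^j C(j+m, m) t^j * (1+t)^(m+1) == 1 up to t^(order-1)."""
    series = [(-1) ** j * comb(j + m, m) for j in range(order)]
    factor = [comb(m + 1, k) for k in range(m + 2)]  # coefficients of (1+t)^(m+1)
    product = [
        sum(factor[k] * series[n - k] for k in range(min(n, m + 1) + 1))
        for n in range(order)
    ]
    return product == [1] + [0] * (order - 1)


print(all(check_identity(m) for m in range(6)))
```

The coefficient of $t^n$ in the product only involves series terms of degree at most $n$, so the truncated comparison is exact for every $n$ below the chosen order.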
We get (unless $p = r = 0$ or $1$):
\[
\operatorname{ch} I(p,0;r;\tfrac23 p - r) = t^{3r-2p} R(t)\left(\frac{3}{(1+t)^4} + \frac{2p+r-2}{(1+t)^3} + \frac{p^2+2pr-p}{2(1+t)^2} + \frac{p^2 r+pr}{2(1+t)}\right).
\tag{6.5}
\]
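The passage from (6.3) to (6.5) amounts to expanding the cubic polynomial in $j$ in the basis $\binom{j+m}{m}$, $m = 0,\dots,3$, and applying (6.4) termwise. A minimal sketch of a coefficient-by-coefficient check follows; the function names are illustrative, and the common factor $t^{3r-2p}R(t)$ is omitted since it appears on both sides.

```python
from fractions import Fraction
from math import comb


def series_coeff(p, r, j):
    """Coefficient of t^j in the sum on the right-hand side of (6.3)."""
    return Fraction((-1) ** j * (j + p + 2) * (j + p + 1) * (j + r + 1), 2)


def closed_form_coeff(p, r, j):
    """Coefficient of t^j in the bracket of (6.5), expanded with identity (6.4)."""

    def inverse_power(k):
        # coefficient of t^j in (1+t)^(-k), i.e. (6.4) with m = k - 1
        return (-1) ** j * comb(j + k - 1, k - 1)

    return (
        3 * inverse_power(4)
        + (2 * p + r - 2) * inverse_power(3)
        + Fraction(p * p + 2 * p * r - p, 2) * inverse_power(2)
        + Fraction(p * p * r + p * r, 2) * inverse_power(1)
    )


agree = all(
    series_coeff(p, r, j) == closed_form_coeff(p, r, j)
    for p in range(5)
    for r in range(5)
    for j in range(10)
)
print(agree)
```

The comparison is a polynomial identity in $p$, $r$, $j$, so it also holds at the excluded values $p = r = 0$ or $1$; the exclusion in the text concerns the exactness of the sequence, not the summation.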
In particular, for $r > 0$ we have:
\[
\operatorname{ch} I(0,0;r;-r) = t^{3r} R(t)\,\frac{(r+1) + (r-2)t}{(1+t)^4}.
\tag{6.6}
\]
Next, we compute the characters of the modules $I(0,q;r;y_D)$. By Corollary 5.2, we have an exact sequence
\[
0 \to I(0,q;r;r - \tfrac23 q) \to M(0,q+1;r+1;r - \tfrac23 q - \tfrac13) \to \cdots
\]
unless $(q,r) = (0,0)$, $(1,1)$, $(0,1)$ and $(1,2)$, in the last case the exactness being broken by a 1-dimensional space. Hence, except for these four cases, we have:
\[
\operatorname{ch} I(0,q;r;r - \tfrac23 q) = t^{4q-6r}\operatorname{ch} I(q+1,0;r+1;r - \tfrac23 q - \tfrac13).
\tag{6.7}
\]
One computes similarly the characters of B and C type, but they are more cumbersome and we omit them.

Definition 6.1. Given an $S(\mathfrak g_{-2})$-module $V$, we define
\[
\operatorname{size} V = \tfrac14 \operatorname{rank}_{S(\mathfrak g_{-2})} V.
\]
It is clear that $\operatorname{size} V$ can be expressed via $\operatorname{ch} V$ as follows:
\[
\operatorname{size} V = \tfrac14 \lim_{t\to1}\,(1-t^2)^3 \operatorname{ch} V.
\tag{6.8}
\]
In particular, we obtain from (6.2):
\[
\operatorname{size} M(p,q;r;y) = 16 \dim F(p,q;r;y).
\tag{6.9}
\]
One computes immediately the sizes of modules I (p, 0; r; yA ) and I (p, 0; r; yD ) using (6.5), (6.7) and (6.8). The sizes of I (p, 0; r; yB ) and I (p, 0; r; yC ) are then computed by adding a finite number of terms using exact sequences from Fig. 3; it follows that in both cases the size is a polynomial of degree ≤ 2 in p and of degree ≤ 1 in r. The final result of the calculations is given by the following. Theorem 6.2. (A) If (p, r) = (0, 0) or (1, 1), then size I (p, 0; r; 23 p − r) = 2r(2p 2 + 4p + 1) + (2p 2 + 2p − 1); size I (1, 0; 1; − 13 ) = 16. (B) size I (p, 0; r; 23 p + r + 2) = 2r(2p 2 + 4p + 1) + (6p 2 + 14p + 5). (C) size I (0, q; r; − 23 q − r − 2) = 2r(2q 2 + 8q + 7) + (2q 2 + 10q + 11). (D) If (q, r) = (0, 0) or (1, 1), then size I (0, q; r; r − 23 q) = 2r(2q 2 + 6q + 7) + (6q 2 + 22q + 17); size I (0, 1; 1; 13 ) = 74. Remark 6.3. It follows from (6.1) and Fig. 3 that the sizes of the even and odd parts of all modules I (p, q; r; y) are equal.
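As an illustration of how (6.8) turns a character into a size, the sketch below evaluates the limit for the type A character (6.5) by hand-substituting $t = 1$ and compares the result with the closed formula of Theorem 6.2(A). It assumes only what is stated above: $(1-t^2)^3 R(t) = (1+t)^6 \to 64$ and $t^{3r-2p} \to 1$ as $t \to 1$. The function names are hypothetical.

```python
from fractions import Fraction


def size_from_character_A(p, r):
    """Apply (6.8) to (6.5): size = (1/4) * 64 * (bracket of (6.5) at t = 1)."""
    bracket_at_one = (
        Fraction(3, 2 ** 4)
        + Fraction(2 * p + r - 2, 2 ** 3)
        + Fraction(p * p + 2 * p * r - p, 2 * 2 ** 2)
        + Fraction(p * p * r + p * r, 2 * 2)
    )
    return Fraction(1, 4) * 64 * bracket_at_one


def size_theorem_A(p, r):
    """Closed formula of Theorem 6.2(A)."""
    return 2 * r * (2 * p * p + 4 * p + 1) + (2 * p * p + 2 * p - 1)


def formulas_agree(max_p=10, max_r=10):
    """Compare the two expressions on a finite grid of (p, r)."""
    return all(
        size_from_character_A(p, r) == size_theorem_A(p, r)
        for p in range(max_p + 1)
        for r in range(max_r + 1)
    )
```

The agreement is a polynomial identity in $p$ and $r$. The excluded pairs in Theorem 6.2(A) concern the validity of (6.5) itself: at $(p,r) = (1,1)$ the formula would give 17, whereas the stated size is 16.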
7. The Secondary $\mathbb{Z}$-Grading of $E(5,10)$ as an $E(3,6)$-Module

Recall that in Sect. 1 we defined the secondary $\mathbb{Z}$-grading
\[ E(5,10) = \prod_{j \ge -1} U^j \]
by letting $\deg x_i = 0 = \deg \partial_i$ for $i = 1,2,3$; $\deg x_j = 1 = -\deg \partial_j$ for $j = 4,5$; $\deg d = -\tfrac12$, so that $U^0$ is isomorphic to $E(3,6)$, and each $U^j$ is an $E(3,6)$-module. Each of the subspaces $U^j$ carries a $\mathbb{Z}$-grading by finite-dimensional subspaces induced by the consistent grading of $E(5,10)$ defined in Sect. 1:
\[ E(5,10)^j = \prod_{i \in \mathbb{Z}} U^j_i. \]
Each of the $E(3,6)$-modules $U^j$ is a linearly compact space, hence the dual modules are discrete spaces:
\[ U^{j*} = \bigoplus_{i \in \mathbb{Z}} U^{j*}_i. \tag{7.1} \]
Note that all $U^{j*}_i$ (and $U^j_i$) are $\mathfrak{g}_0$-modules.

Since $E(5,10)^j_i = 0$ for $i < -2$, it follows that all $E(3,6)$-modules $U^{j*}$ are $L_0$-locally finite, hence they are objects from the category $P(L, L_0)$ considered in [KR1].
Theorem 7.1. (a) The $E(3,6)$-module $U^{j*}$ has size $2j+3$, $j = -1, 0, 1, \ldots$.
(b) One has the following isomorphisms of $E(3,6)$-modules:
\[ U^{-1*} \simeq I(0,0;1;-1), \qquad U^{0*} \simeq I(1,0;0;\tfrac23), \qquad U^{j*} \simeq I(0,0;j-1;j+1) \ \text{ for } j \ge 1. \]

Proof. It is straightforward to check (a) using the construction of $U^j$. It follows from (7.1) that $U^{j*}$ has a maximal submodule $U^{j*}_{(0)}$ such that $U^{j*}/U^{j*}_{(0)}$ is an irreducible module from $P(L, L_0)$. It is easy to see that the lowest non-zero space in (7.1) is isomorphic to the $\mathfrak{g}_0$-modules $F(0,0;1;-1)$, $F(1,0;0;\tfrac23)$ and $F(0,0;j-1;j+1)$ for $j = -1$, $j = 0$ and $j \ge 1$, respectively. Hence the $E(3,6)$-module $U^{j*}/U^{j*}_{(0)}$ is isomorphic to $I(0,0;1;-1)$, $I(1,0;0;\tfrac23)$ and $I(0,0;j-1;j+1)$, respectively. It follows from Theorem 6.2 that these $E(3,6)$-modules have the same size as $U^{j*}$ (given by (a)). Since the $U^{j*}$, as $S(\mathfrak{g}_{-2})$-modules, have no torsion, it follows that $U^{j*}_{(0)} = 0$, hence (b).

We conclude this section by an explicit description of the module $U^{-1*} = I(0,0;1;-1)$, which is a (non-trivial) $E(3,6)$-module of minimal possible size (= 1). Denote by $\Omega^k_3$ the space of differential $k$-forms over $\mathbb{C}[[x_1,x_2,x_3]]$ and by $\Omega^k_{3c}$ the subspace of closed forms. The Lie algebra $W_3$ of all formal vector fields acts on $\Omega^k_3$
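via Lie derivative, leaving $\Omega^k_{3c}$ invariant. For $\lambda \in \mathbb{C}$, define a $\lambda$-twisted $W_3$-module structure on $\Omega^k_3$, which we denote by $(\Omega^k_3)^\lambda$, by letting
\[ D\omega = L_D \omega + \lambda (\operatorname{div} D)\,\omega, \qquad D \in W_3,\ \omega \in \Omega^k_3. \]
Note that we have a canonical isomorphism of $W_3$-modules:
\[ (\Omega^3_3)^\lambda \simeq (\Omega^0_3)^{1+\lambda}. \tag{7.2} \]
Recall the following explicit construction of $E(3,6)$ [CK]:
\[ E(3,6)_{\bar 0} = W_3 + \Omega^0_3 \otimes sl(2), \qquad E(3,6)_{\bar 1} = (\Omega^1_3)^{-\frac12} \otimes \mathbb{C}^2, \tag{7.3} \]
with the obvious bracket on $E(3,6)_{\bar 0}$ and between $E(3,6)_{\bar 0}$ and $E(3,6)_{\bar 1}$, and the following bracket on $E(3,6)_{\bar 1}$:
\[ [\omega \otimes v, \omega' \otimes v'] = (\omega \wedge \omega') \otimes (v \wedge v') + (d\omega \wedge \omega' + \omega \wedge d\omega') \otimes v \cdot v', \]
where $v \cdot v' \in S^2\mathbb{C}^2 \simeq sl(2)$, $v \wedge v' \in \Lambda^2\mathbb{C}^2 \simeq \mathbb{C}$. We have identified here $\Omega^0_3$ with $(\Omega^3_3)^{-1}$ (see (7.2)), and also use the identification
\[ W_3 \simeq (\Omega^2_3)^{-1}. \tag{7.4} \]
(We fix a volume form in $\mathbb{C}^3$ when we define the twisted action.)

Proposition 7.2. The $E(3,6)$-module $I = I(0,0;1;-1)$ is constructed explicitly as follows:
\[ I_{\bar 0} = (\Omega^0_3)^{\frac12} \otimes \mathbb{C}^2, \qquad I_{\bar 1} = \Omega^2_{3c}. \]
$E(3,6)_{\bar 0}$ acts in the obvious way ($\Omega^0_3 \otimes sl(2)$ acts trivially on $I_{\bar 1}$), and $E(3,6)_{\bar 1}$ acts as follows ($\omega_i \in \Omega^i_3$, $u, v \in \mathbb{C}^2$):
\[ (\omega_1 \otimes u)(\omega_0 \otimes v) = d(\omega_0 \omega_1) \otimes (u \wedge v), \qquad (\omega_1 \otimes u)\,\omega_2 = (\omega_1 \wedge \omega_2) \otimes u, \]
where we use identifications (7.3), $u \wedge v \in \Lambda^2\mathbb{C}^2 \simeq \mathbb{C}$ and $\omega_1 \wedge \omega_2 \in (\Omega^3_3)^{-\frac12} \simeq (\Omega^0_3)^{\frac12}$.

Proof. It is straightforward from definitions.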
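The size comparison used in the proof of Theorem 7.1 can be checked directly from the formulas of Theorem 6.2. The sketch below assumes, as the values of $y$ suggest, that $I(0,0;1;-1)$ and $I(1,0;0;\tfrac23)$ are of type A ($y = \tfrac23 p - r$) and that $I(0,0;j-1;j+1)$ is of type B ($y = \tfrac23 p + r + 2$); the function names are hypothetical.

```python
def size_A(p, r):
    """Theorem 6.2(A): size of I(p, 0; r; 2p/3 - r)."""
    return 2 * r * (2 * p * p + 4 * p + 1) + (2 * p * p + 2 * p - 1)


def size_B(p, r):
    """Theorem 6.2(B): size of I(p, 0; r; 2p/3 + r + 2)."""
    return 2 * r * (2 * p * p + 4 * p + 1) + (6 * p * p + 14 * p + 5)


def size_of_quotient(j):
    """Size of the irreducible quotient named in Theorem 7.1(b)."""
    if j == -1:
        return size_A(0, 1)      # I(0,0;1;-1)
    if j == 0:
        return size_A(1, 0)      # I(1,0;0;2/3)
    return size_B(0, j - 1)      # I(0,0;j-1;j+1), j >= 1


def sizes_match(max_j=50):
    """Compare with the size 2j + 3 of U^{j*} from Theorem 7.1(a)."""
    return all(size_of_quotient(j) == 2 * j + 3 for j in range(-1, max_j + 1))
```

For $j \ge 1$ the type B formula with $p = 0$, $r = j-1$ reduces to $2(j-1) + 5 = 2j + 3$, which is the equality of sizes that forces $U^{j*}_{(0)} = 0$.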
8. A Relation to the Standard Model

Recall that the $\mathfrak{g}_0$-module $\mathfrak{g}_{-1}$ is isomorphic to $F(1,0;1;-\tfrac13)$. The action of the compact form $\mathfrak{k} = su(3) + su(2) + i\mathbb{R}Y$ of $\mathfrak{g}_0$ on $\mathfrak{g}_{-1}$ exponentiates to a faithful representation of the compact group $K = (SU(3) \times SU(2) \times U(1))/C$, where $C$ is a central subgroup of order 6. Recall that the group $K$ is the group of symmetries of the Standard Model. It is straightforward to check the following.
Lemma 8.1. The $\mathfrak{g}_0$-module $F(p,q;r;y)$ exponentiates to $K$ iff the following two conditions hold:
\[ y \in \tfrac13 \mathbb{Z}, \tag{8.1} \]
\[ 2(p-q) + 3r - 3y \in 6\mathbb{Z}. \tag{8.2} \]
Since the action of $\mathfrak{g}_0$ on $\mathfrak{g}_{-1}$ and hence on $\mathfrak{g}_{-2}$ exponentiates to $K$ (this, in fact, is true for $\mathfrak{g}_1$ and hence for all $\mathfrak{g}_j$ as well), we obtain from the isomorphism (1.10) that an $E(3,6)$-module $M(p,q;r;y)$ restricted to $\mathfrak{g}_0$ exponentiates to $K$ iff (8.1) and (8.2) hold. In particular, we obtain the following corollary of Lemma 8.1 and Theorem 1.2.

Corollary 8.2. All degenerate $E(3,6)$-modules $I(p,q;r;y_X)$ exponentiate to $K$.

Definition 8.3. A $\mathfrak{g}_0$-module $F(p,q;r;y)$ is called a fundamental particle multiplet if the following three properties hold:

this module occurs among degenerate irreducible $E(3,6)$-modules restricted to $\mathfrak{g}_0$, (8.3)

when restricted to $sl(3)$, only the 1-dimensional, the two fundamental and the adjoint representations occur, (8.4)

\[ \tfrac12 |y + h| \le 1, \ \text{ where } h \text{ is any eigenvalue of } H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \in sl(2). \tag{8.5} \]
Condition (8.5) means that the absolute values of the charges (given by the Gell-Mann–Nishijima formula) of all particles in the multiplet do not exceed 1. It is immediate to see that all $K$-modules $F(p,q;r;y)$ for which (8.4) and (8.5) hold are listed in the left half of Table 1. Hence, by Corollary 8.2, the left half of Table 1 consists of all fundamental particle multiplets. The right half contains all the fundamental particles of the Standard Model: the upper part consists of three generations of quarks and the middle part of three generations of leptons (these are all fundamental fermions from which the matter is built), and the lower part consists of the fundamental bosons (which mediate the strong and electroweak interactions). The last row of Table 1 corresponds to charged gluons, which do not occur in the Standard Model.

One can show that the direct sum of degenerate $E(3,6)$-modules
\[ I(0,0;1;-1) \oplus I(1,0;0;\tfrac23) \oplus I(0,0;0;2) \oplus I(0,0;0;-2) \]
contains all the fundamental particle multiplets once, except for $(01, 1, \tfrac13)$, which is contained twice.

9. Appendix: A Spectral Sequence for a Filtered Module with a Differential That Does not Preserve the Filtration

Usually a spectral sequence is constructed for a differential filtered module. We relax this condition and consider a module with a differential and a filtration where the differential does not preserve the filtration and only condition (9.1) below holds. We show that the construction still works with minor alterations.
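Table 1.

multiplets        charges        particles
(01, 1, 1/3)      2/3, −1/3      $(u_L, d_L)$, $(c_L, s_L)$, $(t_L, b_L)$
(10, 1, −1/3)     −2/3, 1/3      $(\bar u_R, \bar d_R)$, $(\bar c_R, \bar s_R)$, $(\bar t_R, \bar b_R)$
(10, 0, −4/3)     −2/3           $\bar u_L$, $\bar c_L$, $\bar t_L$
(01, 0, 4/3)      2/3            $u_R$, $c_R$, $t_R$
(01, 0, −2/3)     −1/3           $d_R$, $s_R$, $b_R$
(10, 0, 2/3)      1/3            $\bar d_L$, $\bar s_L$, $\bar b_L$

(00, 1, −1)       0, −1          $(\nu_L, e_L)$, $(\nu_{\mu L}, \mu_L)$, $(\nu_{\tau L}, \tau_L)$
(00, 1, 1)        0, 1           $(\bar\nu_R, \bar e_R)$, $(\bar\nu_{\mu R}, \bar\mu_R)$, $(\bar\nu_{\tau R}, \bar\tau_R)$
(00, 0, 2)        1              $\bar e_L$, $\bar\mu_L$, $\bar\tau_L$
(00, 0, −2)       −1             $e_R$, $\mu_R$, $\tau_R$

(11, 0, 0)        0              gluons
(00, 2, 0)        1, −1, 0       $W^+$, $W^-$, $Z$ (gauge bosons)
(00, 0, 0)        0              $\gamma$ (photon)

(11, 0, ±2)       ±1             –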
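The charges column of Table 1 and the conditions of Lemma 8.1 can be recomputed from the multiplet labels. The sketch below assumes that a label $(pq, r, y)$ in Table 1 stands for $F(p,q;r;y)$, that the eigenvalues of $H$ on the $(r+1)$-dimensional $sl(2)$-module are $r, r-2, \ldots, -r$, and that the charge is $\tfrac12(y+h)$ as in (8.5). The list `MULTIPLETS` is transcribed from the left half of Table 1, and all names are hypothetical.

```python
from fractions import Fraction as Fr

# (p, q, r, y), transcribed from the left half of Table 1
MULTIPLETS = [
    (0, 1, 1, Fr(1, 3)), (1, 0, 1, Fr(-1, 3)),
    (1, 0, 0, Fr(-4, 3)), (0, 1, 0, Fr(4, 3)),
    (0, 1, 0, Fr(-2, 3)), (1, 0, 0, Fr(2, 3)),
    (0, 0, 1, Fr(-1)), (0, 0, 1, Fr(1)),
    (0, 0, 0, Fr(2)), (0, 0, 0, Fr(-2)),
    (1, 1, 0, Fr(0)), (0, 0, 2, Fr(0)), (0, 0, 0, Fr(0)),
    (1, 1, 0, Fr(2)), (1, 1, 0, Fr(-2)),
]


def exponentiates_to_K(p, q, r, y):
    """Conditions (8.1) and (8.2) of Lemma 8.1."""
    in_third_integers = (3 * y).denominator == 1
    return in_third_integers and (2 * (p - q) + 3 * r - 3 * y) % 6 == 0


def charges(r, y):
    """Charges (y + h) / 2 over the eigenvalues h = r, r - 2, ..., -r of H."""
    return [(y + h) / 2 for h in range(r, -r - 1, -2)]


def satisfies_8_5(r, y):
    """Condition (8.5): every charge has absolute value at most 1."""
    return all(abs(c) <= 1 for c in charges(r, y))
```

For example, `charges(1, Fr(1, 3))` returns the pair $2/3, -1/3$ listed in the first row, and `charges(2, Fr(0))` returns $1, 0, -1$, the charges of $W^+$, $Z$, $W^-$.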
Let $A$ be a module with a filtration (we follow more or less the notations of [M, Ch. XI, Sect. 3]):
\[ \cdots \subset F_{p-1}A \subset F_pA \subset F_{p+1}A \subset \cdots \qquad (p \in \mathbb{Z}) \]
and with a differential $d : A \to A$, $d^2 = 0$, such that
\[ d(F_pA) \subset F_{p-s+1}A \tag{9.1} \]
for some fixed $s$ and every $p \in \mathbb{Z}$. The usual case of a differential module corresponds to $s = 1$, but for our main application $s = 0$.

We claim that there is a spectral sequence $E = \{E^r, d^r\}_{r \in \mathbb{Z}}$, which, as usual, is a sequence of $\mathbb{Z}$-graded modules $E^r = \bigoplus_{p \in \mathbb{Z}} E^r_p$, each with a differential
\[ d^r : E^r_p \to E^r_{p-r} \]
and with isomorphisms
\[ H(E^r, d^r) \simeq E^{r+1}, \tag{9.2} \]
and for this spectral sequence there are natural isomorphisms:
\[ E^r_p \simeq F_pA/F_{p-1}A \ \text{ for } r \le s-1 \]
and
\[ d^r = 0 \ \text{ if } r < s-1, \qquad d^{s-1} = \operatorname{gr} d, \tag{9.3} \]
hence
\[ E^s_p \simeq H(F_pA/F_{p-1}A). \tag{9.4} \]
In other words, $E^s$ is isomorphic to the homology of the module $\operatorname{Gr} A$ with respect to the induced differential
\[ \operatorname{gr} d : F_pA/F_{p-1}A \to F_{p-s+1}A/F_{p-s}A. \tag{9.5} \]
Let us mention that if $A$ is a graded filtered module with a grading of some kind (bigrading, etc.) then the $E^r$ inherit similar gradings.

To construct the spectral sequence we follow the usual construction of the spectral sequence for a differential module. Introduce submodules
\[ Z^r_p = \{a \mid a \in F_pA,\ da \in F_{p-r}A\}, \tag{9.6} \]
define subquotients
\[ E^r_p = (Z^r_p + F_{p-1}A)/(dZ^{r-1}_{p+r-1} + F_{p-1}A), \tag{9.7} \]
and differentials $d^r : E^r_p \to E^r_{p-r}$ as the homomorphisms induced on the subquotients by the differential $d$ of $A$.

In the case $r \le s-1$ one has $Z^r_p = F_pA$, and clearly
\[ dZ^r_{p+r-1} = d(F_{p+r-1}A) \subset F_{p+r-s}A \subset F_{p-1}A. \tag{9.8} \]
Therefore
\[ E^r_p = F_pA/F_{p-1}A \ \text{ for } r \le s-1. \]
On the other hand (9.8) shows that
\[ dZ^r_{p+r} \subset F_{p+1+r-s}A \ \text{ if } r < s-1. \]
Hence
\[ dZ^r_{p+r} \subset F_{p-1}A \ \text{ and so } d^r \equiv 0 \ \text{ for } r < s-1. \]
Thus $E^{s-1}$ coincides with $\operatorname{Gr} A$ and $d^{s-1}$ coincides with $\operatorname{gr} d$, and (9.2) for $r \le s-1$ implies (9.3) and (9.4).

To prove (9.2) we notice first that $Z^r_p \cap F_{p-1}A = Z^{r-1}_{p-1}$, hence (9.7) implies
\[ E^r_p \simeq Z^r_p/(dZ^{r-1}_{p+r-1} + Z^{r-1}_{p-1}). \tag{9.9} \]
Here we use the standard module isomorphism $(U+W)/(V+W) \simeq U/(V + U \cap W)$ for submodules $U, V, W$ of a module $A$ such that $U \supset V$. We write in the same way
\[ E^r_{p-r} \simeq Z^r_{p-r}/(dZ^{r-1}_{p-1} + Z^{r-1}_{p-r-1}). \]
But
\[ \{x \mid x \in Z^r_p,\ dx \in Z^{r-1}_{p-r-1}\} = \{x \mid x \in Z^r_p,\ dx \in F_{p-r-1}A\} = Z^{r+1}_p. \]
This forces us to conclude that $\ker(d^r : E^r_p \to E^r_{p-r})$ coincides with the image in $E^r_p$ of $Z^{r+1}_p \subset Z^r_p$.
Thus, taking into account that $Z^{r+1}_p \supset dZ^{r-1}_{p+r-1}$ and that $Z^{r+1}_p \cap Z^{r-1}_{p-1} = Z^r_{p-1}$, we use (9.9) to establish the following isomorphism:
\[ \ker(d^r : E^r_p \to E^r_{p-r}) \simeq Z^{r+1}_p/(dZ^{r-1}_{p+r-1} + Z^r_{p-1}). \tag{9.10} \]
Now, because of (9.9),
\[ E^r_{p+r} \simeq Z^r_{p+r}/(dZ^{r-1}_{p+2r-1} + Z^{r-1}_{p+r-1}), \]
hence, as $dZ^r_{p+r} \subset Z^r_p$, we see that
\[ \operatorname{Im}(d^r : E^r_{p+r} \to E^r_p) \simeq (dZ^r_{p+r} + Z^r_{p-1})/(dZ^{r-1}_{p+r-1} + Z^r_{p-1}). \]
This, together with (9.10), gives
\[ H(E^r_p) \simeq Z^{r+1}_p/(dZ^r_{p+r} + Z^r_{p-1}) \simeq E^{r+1}_p. \]
So (9.2) is established.

Quite similarly to the filtered differential module situation ([M, Ch. XI, Prop. 3.2]), we get the convergence of the spectral sequence under additional conditions on the filtration.

Proposition 9.1. If $\cup_p F_pA = A$ and for some $N$, $F_{-N}A = 0$, then the spectral sequence converges.

The latter means that for every $p$ and large enough $r$ we get a commutative diagram of natural morphisms
\[ \begin{array}{ccccc} E^r_p & \to & E^{r+1}_p & \to & \cdots \\ & \searrow & \downarrow & \swarrow & \\ & & F_p(H(A))/F_{p-1}(H(A)) & = & \operatorname{Gr}_p(H(A)) \end{array} \]
that identifies $\operatorname{Gr}_p(H(A)) \simeq \varinjlim_r E^r_p$, or $\operatorname{Gr}_p(H(A)) \simeq E^\infty_p$.

The morphisms
\[ E^r_p \to E^{r+1}_p \to \cdots \]
are defined because $Z^r_{-N} = 0$, so $E^r_{-N} = 0$, therefore $\ker(d^r : E^r_p \to E^r_{p-r}) = E^r_p$ for given $p$ and $r$ large enough.

Moreover $E^r_p = F_pA/(dZ^{r-1}_{p+r-1} + F_{p-1}A)$ and $E^r_p \to E^{r+1}_p$ are surjective for large $r$. Also for large $r$, $Z^r_p = (\ker d) \cap F_pA$, and then (9.7) shows that the inductive limit $E^\infty_p$ of the system $\{E^r_p \to E^{r+1}_p \to \cdots\}$ is
\[ E^\infty_p = \varinjlim_r E^r_p = ((\ker d) \cap F_pA + F_{p-1}A)/(\cup_r\, dZ^{r-1}_{p+r-1} + F_{p-1}A). \]
But $\cup_r\, dZ^{r-1}_{p+r-1} = (dA) \cap F_pA$, hence
\[ E^\infty_p \simeq ((\ker d) \cap F_pA + F_{p-1}A)/(dA \cap F_pA + F_{p-1}A) = \operatorname{Gr}_p(H(A)), \]
as stated above.
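The construction (9.6) and (9.7) can be traced on a small example. The sketch below is a toy illustration under assumptions that are not in the paper: $A = \mathbb{Q}^3$ with basis $e_0, e_1, e_2$, the filtration $F_pA = \operatorname{span}(e_0, \ldots, e_p)$, and the differential $d e_0 = e_1$, $d e_1 = d e_2 = 0$, which satisfies (9.1) with $s = 0$. All function names are hypothetical, and linear algebra is done by exact Gaussian elimination.

```python
from fractions import Fraction


def rref(rows, ncols):
    """Row-reduce over Q; return (reduced rows, list of pivot columns)."""
    rows = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    for col in range(ncols):
        k = len(pivots)
        piv = next((i for i in range(k, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[k], rows[piv] = rows[piv], rows[k]
        rows[k] = [x / rows[k][col] for x in rows[k]]
        for i in range(len(rows)):
            if i != k and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[k])]
        pivots.append(col)
    return rows, pivots


def rank(vectors, n):
    """Dimension of the span of the given length-n vectors."""
    return len(rref(vectors, n)[1])


def nullspace(matrix, ncols):
    """Basis of the solutions c of matrix * c = 0."""
    rows, pivots = rref(matrix, ncols)
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for i, pc in enumerate(pivots):
            v[pc] = -rows[i][free]
        basis.append(v)
    return basis


N = 3
# Toy differential (assumption): column k holds d(e_k); here d e_0 = e_1.
D = [[0, 0, 0],
     [1, 0, 0],
     [0, 0, 0]]


def F(p):
    """Basis of F_p A: empty for p < 0, all of A for p >= N - 1."""
    return [[int(i == k) for i in range(N)] for k in range(min(p, N - 1) + 1)]


def d(vectors):
    """Apply the differential to each vector."""
    return [[sum(D[i][k] * v[k] for k in range(N)) for i in range(N)] for v in vectors]


def Z(r, p):
    """Basis of Z^r_p = {a in F_p A : da in F_{p-r} A}, as in (9.6)."""
    m = min(p, N - 1) + 1
    if m <= 0:
        return []
    # da must vanish in every coordinate with index above p - r
    constraints = [[D[i][k] for k in range(m)] for i in range(max(p - r + 1, 0), N)]
    return [c + [Fraction(0)] * (N - m) for c in nullspace(constraints, m)]


def dim_E(r, p):
    """dim E^r_p from (9.7)."""
    top = rank(Z(r, p) + F(p - 1), N)
    bottom = rank(d(Z(r - 1, p + r - 1)) + F(p - 1), N)
    return top - bottom


def page(r):
    """Dimensions of E^r_p for p = 0, ..., N - 1."""
    return [dim_E(r, p) for p in range(N)]
```

In this example `page(-1)` gives the dimensions of $\operatorname{Gr} A$, in line with (9.3) for $r = s-1 = -1$, while `page(0)` gives the dimensions of the homology of $\operatorname{gr} d$, in line with (9.4). From $r = 0$ on the pages no longer change, and their total dimension equals $\dim H(A) = 1$.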
Postscript, added June 6, 2001. Joris Van der Jeugt has made (at our request) extensive computer calculations of the multiplicities of $\mathfrak{g}_0$-submodules from Table 1 in the modules $M(p,q;r;y_X)$ from Fig. 3. Using Corollaries 5.2–5.4, it is easy to derive from his calculations the multiplicities of these $\mathfrak{g}_0$-submodules in the corresponding irreducible $E(3,6)$-modules $I(p,q;r;y_X)$. We number the complexes in Fig. 3 by $s = r - q + 3$ in the D sector, and replace in them the induced modules $M$ by their irreducible quotients $I$. Then we find that fundamental particle multiplets appear in the $s$-th sequence if and only if $s \ge 1$. Furthermore, for $1 \le s \le 7$ we get sequences with various particle contents, but for $s \ge 8$ the particle contents remain unchanged, and they are invariant under the CPT symmetry (though for $s \le 7$ they are not). Remarkably, precisely three generations of leptons occur in the stable region ($s \ge 8$), but the situation with quarks is more complicated: this model predicts a complete fourth generation of quarks and an incomplete fifth generation (with missing down-type triplets).

All the sequences of irreducible $E(3,6)$-modules that contain fundamental particle multiplets are listed below along with the fundamental particle multiplets they contain. For the sake of simplicity, we omitted the charged gluons (they occur too iff $s \ge 1$, and the stabilization takes place from $s = 9$ on as follows: 3(11, 0, 2) 5(11, 0, 2) ∅ 5(11, 0, −2) 3(11, 0, −2)).

Sequence 1:
I(00, 0, −2)
(00, 0, −2)

Sequence 2:
I(01, 0, −2/3) 2(10, 0, −4/3) (01, 0, −2/3) (00, 0, −2) (00, 1, −1)

Sequence 3:
I(01, 1, 1/3) I(10, 1, −1/3) I(20, 2, −2/3) 2(10, 0, −4/3)
(10, 0, −4/3)
(10, 0, −4/3)
2(01, 0, −2/3)
(01, 0, −2/3)
(00, 0, −2)
2(10, 1, −1/3) (01, 1, 1/3)
(10, 1, −1/3)
(00, 1, −1) (11, 0, 0) (00, 2, 0)
2(00, 0, −2) (00, 1, −1)
Sequence 4:
I(01, 2, 4/3)
I(00, 1, 1) I(00, 1, −1) I(10, 2, −4/3)
2(10, 1, −1/3) 3(10, 1, −1/3) 2(01, 1, 1/3)
(11, 0, 0) (00, 2, 0)
2(01, 1, 1/3) (10, 0, −4/3) (10, 0, 2/3) 3(01, 0, −2/3)
(10, 0, −4/3)
(00, 0, −2)
(00, 1, −1)
2(00, 1, −1) (00, 1, 1) 2(11, 0, 0) 2(00, 2, 0) (00, 0, 0)

Sequence 5:
I(20, 1, 1/3)
I(30, 2, 0)
(10, 1, −1/3)
(10, 1, −1/3) 2(10, 1, −1/3)
(10, 0, −4/3)
3(01, 1, 1/3)
(01, 1, 1/3) 4(10, 0, −4/3)
(00, 0, −2)
I(00, 2, 2)
(01, 0, 4/3) (10, 0, 2/3)
2(00, 1, 1) (11, 0, 0) 2(00, 2, 0)

Sequence 6:
(01, 0, −2/3) (10, 0, 2/3)
2(01, 0, −2/3) 3(00, 1, −1)
(11, 0, 0) 2(00, 0, −2) (00, 2, 0) (11, 0, 0) (00, 0, 0) I(30, 1, 1)
I(40, 2, 2/3)
2(01, 1, 1/3) 3(10, 1, −1/3)
(10, 0, −4/3)
(01, 0, 4/3) 2(10, 1, −1/3) 4(10, 0, −4/3)
(00, 0, −2)
I(00, 0, 2) (01, 1, 1/3)
I(20, 0, 4/3)
2(01, 0, −2/3)
(00, 0, 2)
2(01, 0, −2/3) (10, 0, 2/3)
I(10, 0, 8/3)
I(30, 0, 2)
I(40, 1, 5/3)
I(50, 2, 4/3)
3(01, 1, 1/3) 2(01, 1, −1/3) 3(10, 1, −1/3)
(10, 0, −4/3)
3(01, 0, 4/3) 2(10, 1, −1/3) 4(10, 0, −4/3)
(00, 0, −2)
(00, 1, 1)

Sequence 7:
I(10, 0, 2/3)
3(00, 1, −1) 2(11, 0, 0) 2(00, 0, −2) (00, 2, 0) 2(11, 0, 0) 2(00, 0, 0)
2(10, 0, 2/3) 2(01, 0, −2/3) 2(01, 0, −2/3) 3(00, 1, 1)
2(10, 0, 2/3) 3(00, 1, −1)
(00, 0, 2) 2(11, 0, 0)
3(11, 0, 0) 2(00, 0, −2) (00, 2, 0) 2(11, 0, 0) 2(00, 0, 0)
Sequence s ≥ 8:
A    B    C    D    E
(01, 0, 4/3) 3(01, 1, 1/3)
2(01, 1, 1/3) 3(10, 1, −1/3) (10, 0, −4/3)
(00, 0, 2) 4(01, 0, 4/3) 2(10, 1, −1/3) 4(10, 0, −4/3) (00, 0, −2) 2(10, 0, 2/3) 2(01, 0, −2/3) 2(01, 0, −2/3)
3(00, 1, 1)
2(10, 0, 2/3) 3(00, 1, −1)
2(00, 0, 2) 2(11, 0, 0)
3(11, 0, 0) 2(00, 0, −2) (00, 2, 0) 2(11, 0, 0) 2(00, 0, 0)

Here
\[ A = I\left(s-7, 0; 1; \tfrac{2s-5}{3}\right), \quad B = I\left(s-6, 0; 0; \tfrac{2s-6}{3}\right), \quad C = I\left(s-4, 0; 0; \tfrac{2s-8}{3}\right), \]
\[ D = I\left(s-3, 0; 1; \tfrac{2s-9}{3}\right), \quad E = I\left(s-2, 0; 2; \tfrac{2s-10}{3}\right). \]
References

[CK] Cheng, S.-J., Kac, V.: Structure of some Z-graded Lie superalgebras of vector fields. Transformation Groups 4, 219–272 (1999)
[K] Kac, V.: Classification of infinite-dimensional simple linearly compact Lie superalgebras. Adv. in Math. 139, 1–55 (1998)
[KR1] Kac, V., Rudakov, A.: Representations of the exceptional Lie superalgebra E(3,6) I: Degeneracy conditions. ESI preprint no. 921, 2000; to appear in Transformation Groups
[KR2] Kac, V., Rudakov, A.: Representations of the exceptional Lie superalgebra E(3,6) III: Classification of singular vectors. In preparation
[M] Mac Lane, S.: Homology. (3rd corr. printing) Berlin–Heidelberg–New York: Springer-Verlag, 1975
[R] Rudakov, A.: Irreducible representations of infinite-dimensional Lie algebras of Cartan type. Math. USSR-Izvestia 8, 836–866 (1974)

Communicated by R. H. Dijkgraaf